A Solaris patch a while back broke rsh on 2.6 and caused
this same issue on many of our systems: Bug ID 4242754,
caused by jumbo kernel patch 105181-13.
Check whether you have the updated 105181 patch.
Might be a good place to start.
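One way to check the installed revision is to pull it out of the
"showrev -p" patch listing. This is a rough sketch, not a tested
procedure: patch_rev is a made-up helper name, and it assumes each
line of the listing starts with "Patch: <id>-<rev>". On Solaris
itself, substitute nawk (or /usr/xpg4/bin/awk) for awk, since the
stock /usr/bin/awk has no -v option.

```shell
# patch_rev (hypothetical helper): read `showrev -p` output on stdin and
# print the highest installed revision of the patch ID given as $1.
# Exits non-zero if the patch ID does not appear at all.
# Assumes lines of the form: "Patch: 105181-13 Obsoletes: ..."
patch_rev() {
    awk -v id="$1" '
        BEGIN { max = -1 }
        $1 == "Patch:" {
            split($2, p, "-")
            if (p[1] == id && p[2] + 0 > max) max = p[2] + 0
        }
        END { if (max >= 0) print max; else exit 1 }'
}

# Intended use on the affected host (assumption, adjust to taste):
#   showrev -p | patch_rev 105181
```

If that prints 13, the box is on the revision named in the bug. Which
later revision carries the fix isn't stated here, so check the patch
README before pulling one down.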
-b
-----Original Message-----
From: Hal Haygood [mailto:[EMAIL PROTECTED]]
Sent: Tuesday, January 09, 2001 4:06 AM
To: [EMAIL PROTECTED]
Subject: Rsync 2.4.6, Solaris 2.6 hang, w/info
I'm experiencing a hang with rsync 2.4.6 on Solaris. Initiating and target
hosts are both Solaris 2.6. It looks like there might be some network
latency issues, but the parent rsh process has been blocking on the same
write() for several hours now, so I don't think that's quite it. It also
looks like something's quite hung up, because the 15-minute timeout isn't
timing out.
This is for an rsync push of a large directory tree. The command is:
/usr/local/bin/rsync \
-avzHlW \
--rsync-path=/usr/local/bin/rsync \
--timeout=900 \
--delete \
--exclude (some excludes here) \
/local/directory/name/* \
remotehost:/remote/directory/name
The TCP queue on the sending host looks like this:
Local Address        Remote Address       Swind Send-Q Rwind Recv-Q State
-------------------- -------------------- ----- ------ ----- ------ -------
thishost.1018        remotehost.shell      8760      0     0      0 ESTABLISHED
thishost.1017        remotehost.1022       8760      0  8760      0 ESTABLISHED
The TCP queue on the receiving host looks like this:
Local Address        Remote Address       Swind Send-Q Rwind Recv-Q State
-------------------- -------------------- ----- ------ ----- ------ -------
remotehost.shell     thishost.1018            1      0  8760      0 ESTABLISHED
remotehost.1022      thishost.1017         8760      0  8760      0 ESTABLISHED
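For what it's worth, the connection that stands out can be picked
out mechanically. The sketch below flags established connections
with a zero send or receive window. zero_windows is a made-up helper
name, and it assumes raw netstat output with one connection per line
in the column order shown above (State as the seventh field), not
the mail-wrapped form.

```shell
# zero_windows (hypothetical helper): read netstat TCP output on stdin and
# print established connections whose Swind or Rwind column is 0.
# Assumed columns: local, remote, Swind, Send-Q, Rwind, Recv-Q, State.
zero_windows() {
    awk '$7 == "ESTABLISHED" && ($3 == 0 || $5 == 0) {
        print $1, $2, "Swind=" $3, "Rwind=" $5
    }'
}

# Intended use (assumption):
#   netstat -n | zero_windows
```

Fed the sending host's two lines, this reports only thishost.1018 to
remotehost.shell, with Rwind=0. That would be consistent with the
local end of the rsh connection no longer draining incoming data,
though it doesn't say why.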
The "rsync -avzHlW" process on the sending host is looping on something
like this:
poll(0xEFFFD580, 0, 20) = 0
poll(0xEFFFD580, 0, 1) = 0
waitid(P_PID, 3019, 0xEFFFF588, WEXITED|WTRAPPED|WNOHANG) = 0
poll(0xEFFFD580, 0, 20) = 0
waitid(P_PID, 3019, 0xEFFFF588, WEXITED|WTRAPPED|WNOHANG) = 0
poll(0xEFFFD580, 0, 20) = 0
poll(0xEFFFD580, 0, 1) = 0
waitid(P_PID, 3019, 0xEFFFF588, WEXITED|WTRAPPED|WNOHANG) = 0
poll(0xEFFFD580, 0, 20) = 0
waitid(P_PID, 3019, 0xEFFFF588, WEXITED|WTRAPPED|WNOHANG) = 0
poll(0xEFFFD580, 0, 20) = 0
poll(0xEFFFD580, 0, 9) = 0
waitid(P_PID, 3019, 0xEFFFF588, WEXITED|WTRAPPED|WNOHANG) = 0
poll(0xEFFFD580, 0, 20) = 0
poll(0xEFFFD580, 0, 1) = 0
The parent rsh process on the sending host is stuck in:
write(1, " p a r t o f a f i l e n a m e".., 285) (sleeping...)
The child rsh process on the sending host is stuck in:
read(0, 0xEFFFF410, 1024) (sleeping...)
The "rsync --server" process on the receiving host is stuck in:
poll(0xEFFFC110, 1, 60000) (sleeping...)
The "csh -c /usr/local/bin/rsync" process on the receiving host is
stuck in:
sigsuspend(0xEFFFF938) (sleeping...)
The "in.rshd" process on the receiving host is stuck in:
poll(0xEFFFD7F8, 2, -1) (sleeping...)
So, any ideas? Like I said, it looks like write() is blocking for no
particular reason, and that's causing us to sit and spin.
Thoughts?
Hal