Subject: Re: What could cause rsync to kill ssh?

2023-06-12 Thread Madhu via rsync
* Albert Croft via rsync wrote on Sat, 3 Jun 2023 11:52:56 -0500:

> You say, "knocking my ssh session offline on all terminals and it
> blocks ssh from being able to connect again. Even restarting sshd
> doesn't help".
>
> Questions:
> * Is the network stack on the affected machine still active? (Can it
>   reach other services or systems on the network?)
> * If the network is NOT reachable, does restarting the network stack
>   make a difference?

I think I've seen this problem.  Last month I was transferring the root
filesystem (live, with a script which excludes config files and var) from
one Gentoo machine, A (with ZFS), to another, B (with ext4), and "ssh
stopped working."

Both kernels were 5.10.x, and both machines were running rsync-3.2.3 at
the time.

- sshd on B did not crash. I was able to walk over to the other
  machine and restart `sshd -D' by hand, and I could watch tcpdump on
  both boxes.

- The network still worked. Only SSH to machine B stopped working. I
  cleared all IPTABLES/NFT rules to make sure those were not the
  problem.

- I could connect via ssh from A to hosts on the internet. I could ssh
  from A to localhost, or from A to another of A's local interfaces.
  What I could not do was connect via ssh from machine A to the sshd
  process running on B. Only the SYN packet went out, and there was no
  response from B.
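
A minimal sketch of how the symptom above (a SYN goes out and nothing comes back) could be told apart from an ordinary refused connection. It assumes bash (for /dev/tcp) and GNU coreutils `timeout` are installed; the function names `classify_probe` and `tcp_probe` are made up for illustration, and the address in the usage note is a placeholder.

```shell
# Map the exit status of a timed connection attempt to a short label.
# 0   -> the TCP handshake completed
# 124 -> GNU timeout gave up: no answer at all (packets silently dropped)
# other -> the attempt failed quickly (e.g. connection refused, no route)
classify_probe() {
    case "$1" in
        0)   echo "open" ;;
        124) echo "no-reply" ;;
        *)   echo "refused-or-unreachable" ;;
    esac
}

# Try to open a TCP connection to host:port, waiting at most `secs` seconds.
tcp_probe() {
    host=$1
    port=$2
    secs=${3:-3}
    rc=0
    timeout "$secs" bash -c 'exec 3<>"/dev/tcp/$1/$2"' _ "$host" "$port" \
        2>/dev/null || rc=$?
    classify_probe "$rc"
}
```

Run on A as, say, `tcp_probe 192.0.2.10 22`, this would be expected to print `no-reply` in the situation described, whereas a stopped sshd behind a healthy network path would normally give `refused-or-unreachable`.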

on A ip r g 192.168.1.
 192.168.1. dev wlan0 src 192.168.1. uid 0

(Luckily I was able to start rsyncd on B and finish the transfer of /
without ssh and without a catastrophe.)

After rebooting machine B, it came up and ssh to it worked as it had
been working for 3 years under the same setup: machine A is on wifi
and talks to machine B, which is wired, through a D-Link wifi router,
after adding an IP address on B's wired network to A's wlan interface.
I couldn't figure out what was happening, so I put it down to
NSA-backdoor-level stuff in the kernel which had cut me off.

Earlier, during "heavy" rsync over ssh, I would see "stalls" that were
not explained by stracing the rsync processes on either end; these
would resume. But in this instance all port 22 packets from A to B got
dropped without either kernel indicating why.


> I ask because I intermittently see what seems to be a similar
> behavior--rsync client (3.2.7) to a remote system with rsync (3.2.3)
> and a 5.11.x linux kernel that occasionally terminates with the linux
> system losing access to the network where restarting the network stack
> doesn't seem to restore access and requires a reboot of the linux
> system in question.
>
> On 6/2/23 10:44 PM, Maurice R Volaski via rsync wrote:
>> I have an rsync script that it is copying one computer (over ssh) to
>> a shared CIFS mount on Gentoo Linux, kernel 6.3.4. The script runs
>> for a while and then at some point quits knocking my ssh session
>> offline on all terminals and it blocks ssh from being able to
>> connect again. Even restarting sshd doesn’t help. Rsync has
>> apparently killed it. I have to reboot.

-- 
Please use reply-all for most replies to avoid omitting the mailing list.
To unsubscribe or change options: https://lists.samba.org/mailman/listinfo/rsync
Before posting, read: http://www.catb.org/~esr/faqs/smart-questions.html


Re: What could cause rsync to kill ssh?

2023-06-09 Thread Maurice R Volaski via rsync
This Gentoo VM is ZFS and BIOS based. I have a pretty similar Gentoo VM that is
btrfs/UEFI-based. I am able to run rsync on it with no trouble, so I have just
switched to that. The reason rsync is bringing down sshd will likely remain a
mystery.

Thanks for your input.

On 6/3/23, 04:55, "Perry Hutchison" <pl...@agora.rdrop.com> wrote:




Maurice R Volaski <maurice.vola...@einsteinmed.edu> wrote:


> Rsync 3.2.7 is running on the Gentoo computer, which doesn't have
> a version, other than it's "current". I'm running the script from
> this computer.
>
> Rsync 3.1.2 is on the source computer, where the files come from,
> which is Ubuntu 18.0.4.6.
>
> I'm copying to a CIFS share mounted on the Gentoo computer.
>
> The rsync scripts are all similar to this one:
>
> /usr/bin/rsync -v -a --progress --exclude-from=${exclude} --safe-links \
> --itemize-changes --no-perms --no-owner --progress --stats \
> al...@labadmin-precision-tower-3620.montefiore.org:/home/alexa/ \
> /mnt/data.einstein/luke/all_but_dat/alexa/desktop_bkup/profile \
> >> /home/maurice/logs/rsync-client-alexa.log
>
> I re-ran the scripts skipping this one. The next one was running
> and during that period, ssh stopped responding to new connections,
> so it may be the case that the failure is taking place across
> time, and it doesn't fail wholesale immediately.
>
> However, I have other scripts like these copying from other
> sources (not Ubuntu) and they are not causing these failures.


You have several moving parts, which complicates figuring out which
of the various interactions is contributing to the problem.


BTW anyone else on the list is more than welcome to weigh in. I am
hardly an expert on rsync, and not at all familiar with the ins and
outs of either Gentoo or CIFS.


One thing which I think is most likely _not_ involved in the problem is
sshd on the Gentoo system, and this is consistent with the observation
that restarting sshd did not help. (If I'm reading the rsync command
correctly, rsync on the Gentoo system is establishing the ssh
connection and transferring the files over it. Gentoo's sshd would
be involved only if the client were initiating the connection.)


Is there anything interesting in the rsync logfile, especially near the
end, or in the any of the involved machines' system logs (including the
CIFS host) around the time of the hang?


Does the Gentoo system have enough space for rsync to copy the files
to a local drive, so that rsync and Samba are not both working on the
same transfer at the same time? (The files can then be copied to the
CIFS share in a separate step, using "cp -r" or some such.) If rsync
still fails when arranged that way it would tend to eliminate CIFS as
a factor (and it will simplify the environment); OTOH if that "solves"
the problem you'll at least have a workaround.


Totally separate from that, is this Ubuntu system the only client using
Ubuntu 18.0.4.6 and/or rsync 3.1.2?





Re: What could cause rsync to kill ssh?

2023-06-03 Thread Albert Croft via rsync

Maurice,

You say, "knocking my ssh session offline on all terminals and it blocks 
ssh from being able to connect again. Even restarting sshd doesn't help".


Questions:
* Is the network stack on the affected machine still active? (Can it 
reach other services or systems on the network?)
* If the network is NOT reachable, does restarting the network stack 
make a difference?


I ask because I intermittently see what seems to be a similar 
behavior--rsync client (3.2.7) to a remote system with rsync (3.2.3) and 
a 5.11.x linux kernel that occasionally terminates with the linux system 
losing access to the network where restarting the network stack doesn't 
seem to restore access and requires a reboot of the linux system in 
question.


On 6/2/23 10:44 PM, Maurice R Volaski via rsync wrote:
> I have an rsync script that it is copying one computer (over ssh) to a
> shared CIFS mount on Gentoo Linux, kernel 6.3.4. The script runs for a
> while and then at some point quits knocking my ssh session offline on
> all terminals and it blocks ssh from being able to connect again. Even
> restarting sshd doesn’t help. Rsync has apparently killed it. I have to
> reboot.







Re: What could cause rsync to kill ssh?

2023-06-03 Thread Roger via rsync
Nice job on converting each switch to its equivalent human-readable format!

I used Gentoo for two decades or so.  Now using Void Linux as I have
little time for compiling.

One item that might be noteworthy for those running Gentoo, or any
compiled-from-source distribution, is to report the CFLAGS/LDFLAGS
compiler options used for compiling and linking the code.

Sometimes optimizations cause self-compiled programs to surface bugs
not commonly seen with the more commonly used CFLAGS/LDFLAGS compiler
options.  When software gets flaky like this, e.g. just disappearing
or mysteriously quitting, this is usually the main cause.
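
A small sketch of how those flags could be collected for a bug report, assuming a Gentoo system where `emerge --info` prints settings as VAR="value" lines. The function name `extract_build_flags` is made up for illustration.

```shell
# Keep only the compiler and linker settings from `emerge --info`-style
# output read on stdin (one VAR="value" assignment per line).
extract_build_flags() {
    grep -E '^(CHOST|CFLAGS|CXXFLAGS|LDFLAGS)='
}
```

Typical use would be `emerge --info | extract_build_flags`, with the result pasted into the report next to the rsync and kernel versions.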

Strace will usually catch the problem, while gdb/debugger might make
things amazingly stable.

Also, try a different file system, to rule out the compiled (CIFS)
kernel drivers.

Some thoughts, but since you're in the edu domain, you've likely
thought of all this already!

Roger


On 6/3/23, Maurice R Volaski via rsync  wrote:
> Rsync 3.2.7 is running on the Gentoo computer, which doesn't have a version,
> other than it's "current". I'm running the script from this computer.
>
> Rsync 3.1.2 is on the source computer, where the files come from, which is
> Ubuntu 18.0.4.6.
>
> I'm copying to a CIFS share mounted on the Gentoo computer.
>
> The rsync scripts are all similar to this one:
>
> /usr/bin/rsync -v -a --progress --exclude-from=${exclude} --safe-links \
> --itemize-changes --no-perms --no-owner --progress --stats \
> al...@labadmin-precision-tower-3620.montefiore.org:/home/alexa/ \
> /mnt/data.einstein/luke/all_but_dat/alexa/desktop_bkup/profile \
> >> /home/maurice/logs/rsync-client-alexa.log



Re: What could cause rsync to kill ssh?

2023-06-03 Thread Paul Slootman via rsync
On Sat 03 Jun 2023, Maurice R Volaski via rsync wrote:

> I have an rsync script that it is copying one computer (over ssh) to a shared 
> CIFS mount on Gentoo Linux, kernel 6.3.4. The script runs for a while and 
> then at some point quits knocking my ssh session offline on all terminals and 
> it blocks ssh from being able to connect again. Even restarting sshd doesn’t 
> help. Rsync has apparently killed it. I have to reboot.

Note there's no such thing as an rsync script. You probably mean you
have a shell script that runs rsync at some point.

Is the script copying from the system it's running on, to the Gentoo
Linux system? Is the CIFS mount actually mounted on the Gentoo Linux, or
is the Gentoo Linux system serving the CIFS mount which actually is
mounted on the "one computer"? In that case it would be much better to
directly rsync to the filesystem on the Gentoo system.

Re: the ssh stopping working:
To me this would suggest that there's an out-of-memory situation going
on, and sshd is being killed because of this. However that would not
explain why restarting it doesn't work.
What exactly do you mean when you say restarting sshd doesn't help?
Does it not stay running, or is the daemon in fact running but not
accepting connections?
It's the age-old question: "it doesn't work" -- "_how_ is it not working?"
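
One way to answer the "running but not accepting connections?" question is to check the listening sockets separately from the process list. This is only a sketch: it assumes the iproute2 `ss` tool and its usual `ss -tln` column layout (State first, Local Address:Port fourth), and the function name `listening_on_port` is made up for illustration.

```shell
# Read `ss -tln` output on stdin; succeed if any LISTEN socket is bound
# to the given TCP port, fail otherwise.
listening_on_port() {
    port=$1
    awk -v p="$port" '
        $1 == "LISTEN" && $4 ~ (":" p "$") { found = 1 }
        END { exit !found }
    '
}
```

For example, `ss -tln | listening_on_port 22 && echo "something is listening on port 22"`, alongside `pgrep -a sshd` to see whether the daemon process itself is alive.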

Does dmesg give any useful information? Or perhaps journalctl?
Usually the clues are in plain sight if you check logs.
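
A rough sketch of scanning those logs for signs of the out-of-memory killer. The patterns are common kernel message fragments, not an exhaustive list, and the function name `find_oom_events` is made up for illustration.

```shell
# Filter kernel-log text on stdin for lines that typically accompany an
# OOM kill ("invoked oom-killer", "Out of memory: Killed process ...",
# "oom_reaper: reaped process ...").
find_oom_events() {
    grep -Ei 'out of memory|oom-kill|oom_reaper'
}
```

It could be fed from `dmesg -T | find_oom_events` while the machine is still up, or from `journalctl -k -b -1 | find_oom_events` after the reboot, assuming the journal is persistent so the previous boot is still available.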


Paul



Re: What could cause rsync to kill ssh?

2023-06-03 Thread Perry Hutchison via rsync
Maurice R Volaski  wrote:

> Rsync 3.2.7 is running on the Gentoo computer, which doesn't have
> a version, other than it's "current". I'm running the script from
> this computer.
>
> Rsync 3.1.2 is on the source computer, where the files come from,
> which is Ubuntu 18.0.4.6.
>
> I'm copying to a CIFS share mounted on the Gentoo computer.
>
> The rsync scripts are all similar to this one:
>
> /usr/bin/rsync -v -a --progress --exclude-from=${exclude} --safe-links \
> --itemize-changes --no-perms --no-owner --progress --stats \
> al...@labadmin-precision-tower-3620.montefiore.org:/home/alexa/ \
> /mnt/data.einstein/luke/all_but_dat/alexa/desktop_bkup/profile \
> >> /home/maurice/logs/rsync-client-alexa.log
>
> I re-ran the scripts skipping this one. The next one was running
> and during that period, ssh stopped responding to new connections,
> so it may be the case that the failure is taking place across
> time, and it doesn't fail wholesale immediately.
>
> However, I have other scripts like these copying from other
> sources (not Ubuntu) and they are not causing these failures.

You have several moving parts, which complicates figuring out which
of the various interactions is contributing to the problem.

BTW anyone else on the list is more than welcome to weigh in.  I am
hardly an expert on rsync, and not at all familiar with the ins and
outs of either Gentoo or CIFS.

One thing which I think is most likely _not_ involved in the problem is
sshd on the Gentoo system, and this is consistent with the observation
that restarting sshd did not help.  (If I'm reading the rsync command
correctly, rsync on the Gentoo system is establishing the ssh
connection and transferring the files over it.  Gentoo's sshd would
be involved only if the client were initiating the connection.)

Is there anything interesting in the rsync logfile, especially near the
end, or in the any of the involved machines' system logs (including the
CIFS host) around the time of the hang?

Does the Gentoo system have enough space for rsync to copy the files
to a local drive, so that rsync and Samba are not both working on the
same transfer at the same time?  (The files can then be copied to the
CIFS share in a separate step, using "cp -r" or some such.)  If rsync
still fails when arranged that way it would tend to eliminate CIFS as
a factor (and it will simplify the environment); OTOH if that "solves"
the problem you'll at least have a workaround.
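
The two-step arrangement suggested above could be sketched roughly as follows. All paths and the remote spec are hypothetical placeholders, the helper names `run_or_print` and `stage_then_copy` are made up, and the rsync options are trimmed to a bare minimum rather than copied from the failing command. Setting DRY_RUN=1 only prints the commands.

```shell
# Run a command, or just print it when DRY_RUN=1.
run_or_print() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# Step 1: pull the files over ssh onto a local disk.
# Step 2: copy the staged tree onto the CIFS mount, with rsync no longer involved.
stage_then_copy() {
    src=$1      # remote source, e.g. user@host:/home/user/
    stage=$2    # local staging directory with enough free space
    dest=$3     # target directory on the CIFS mount
    run_or_print rsync -a --stats "$src" "$stage/" &&
    run_or_print cp -r "$stage/." "$dest/"
}
```

If ssh still dies during step 1, CIFS is unlikely to be the trigger; if it only dies during step 2, rsync itself is probably not the culprit.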

Totally separate from that, is this Ubuntu system the only client using
Ubuntu 18.0.4.6 and/or rsync 3.1.2?



Re: What could cause rsync to kill ssh?

2023-06-02 Thread Maurice R Volaski via rsync
Rsync 3.2.7 is running on the Gentoo computer, which doesn't have a version, 
other than it's "current". I'm running the script from this computer.

Rsync 3.1.2 is on the source computer, where the files come from, which is
Ubuntu 18.04.6.

I'm copying to a CIFS share mounted on the Gentoo computer.

The rsync scripts are all similar to this one:

/usr/bin/rsync -v -a --progress --exclude-from=${exclude} --safe-links \
--itemize-changes --no-perms --no-owner --progress --stats \
al...@labadmin-precision-tower-3620.montefiore.org:/home/alexa/ \
/mnt/data.einstein/luke/all_but_dat/alexa/desktop_bkup/profile \
>> /home/maurice/logs/rsync-client-alexa.log

I re-ran the scripts skipping this one. The next one was running and during 
that period, ssh stopped responding to new connections, so it may be the case
that the failure is taking place across time, and it doesn't fail wholesale 
immediately.

However, I have other scripts like these copying from other sources (not 
Ubuntu) and they are not causing these failures. 


On 6/3/23, 12:40 AM, "Perry Hutchison" <pl...@agora.rdrop.com> wrote:








Maurice R Volaski via rsync <maurice.vola...@lists.samba.org> wrote:




> I have an rsync script that it is copying one computer (over ssh)
> to a shared CIFS mount on Gentoo Linux, kernel 6.3.4. The script
> runs for a while and then at some point quits knocking my ssh
> session offline on all terminals and it blocks ssh from being able
> to connect again. Even restarting sshd doesn't help. Rsync has
> apparently killed it. I have to reboot.




For starters:

What OS and version is the rsync script running on?

Which end do you have to reboot? The machine running the script,
or the Gentoo Linux?

What versions of rsync are running on each end?

Can you show the command line that fails?

Based on the mention of multiple terminals, it sounds as if you
have a fairly complex ssh environment. Can you get it to fail in
a simpler environment, ideally with only one terminal?











Re: What could cause rsync to kill ssh?

2023-06-02 Thread Perry Hutchison via rsync
Maurice R Volaski via rsync  wrote:

> I have an rsync script that it is copying one computer (over ssh)
> to a shared CIFS mount on Gentoo Linux, kernel 6.3.4. The script
> runs for a while and then at some point quits knocking my ssh
> session offline on all terminals and it blocks ssh from being able
> to connect again. Even restarting sshd doesn't help. Rsync has
> apparently killed it. I have to reboot.

For starters:

  What OS and version is the rsync script running on?

  Which end do you have to reboot?  The machine running the script,
  or the Gentoo Linux?

  What versions of rsync are running on each end?

  Can you show the command line that fails?

  Based on the mention of multiple terminals, it sounds as if you
  have a fairly complex ssh environment.  Can you get it to fail in
  a simpler environment, ideally with only one terminal?
