Holger Parplies wrote:
>> But lots of other people including myself run rsync without errors so it
>> has to be something unique to your situation.
>
> well, no. You don't rule out bugs by "it works for me", not even by "it
> works for everyone I know". I'm sure you know that.
Anything is possible I suppose, but if I know something works for
everyone else I move it to the bottom of the list of things to test.

> We don't know much about the "lots of other people", do we? We know there
> have been no further *reports* of it on this list, but I don't remember
> hundreds of people reporting success with rsync on RHEL4 either. You might
> know about other lists, I don't.

I know enough about mailing lists to expect a ton of matches on a Google
search for the 'no route to host' problem, but I don't see much relating
to a local LAN or RHEL there. And I'm sure I'd have seen mentions on the
CentOS or Fedora lists if it affected those very similar kernels.

>> Maybe cables from a different vendor would help.
>
> I doubt it, because other applications are doing well. It doesn't seem to be
> hardware related to me. I suspect the kernel on the host side (backup client)
> or its configuration. Of course, it may be hardware specific in that
> different hardware does not trigger whatever is happening (and that could
> include the switch, maybe, perhaps), but the cables? It's not the hardware
> where I would start looking, especially after Tim *has* tested quite a lot
> of different setups.

TCP retries can cover a lot of errors. A bad crimp on a patch cable or
an extra half-inch untwisted on the wall punch-downs can cause exactly
this sort of thing.

> It could be stupid things like arp poisoning, a misbehaving machine on the
> local network or whatever. Remains the question what communication
> characteristics rsync has and SMB doesn't (hmm, SMB is UDP, isn't it?) that
> make the problem appear.

ARP poisoning is possible - maybe just someone plugging/unplugging a
machine with the same IP somewhere else. A badly configured NAT gateway
on the network doing proxy-ARP in the wrong direction could do it. If
there are multiple interconnected switches involved, it could even be a
loop with a spanning-tree problem.
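To make the duplicate-IP case concrete, here is a minimal Python sketch
of the check I have in mind: given (IP, MAC) pairs noted from ARP
traffic, report any IP that more than one MAC address has claimed. The
function name and the sample addresses are made up for illustration;
they are not from Tim's network, and collecting the pairs (from a
capture or from the ARP tables) is left out.

```python
from collections import defaultdict


def find_ip_conflicts(observations):
    """Return {ip: sorted list of MACs} for each IP claimed by >1 MAC.

    `observations` is an iterable of (ip, mac) pairs, for example noted
    from ARP replies seen on the wire. MACs are compared case-insensitively.
    """
    macs_by_ip = defaultdict(set)
    for ip, mac in observations:
        macs_by_ip[ip].add(mac.lower())
    # Keep only the addresses that more than one MAC has answered for.
    return {ip: sorted(macs) for ip, macs in macs_by_ip.items() if len(macs) > 1}


# Hypothetical sample data, invented for this example.
sample = [
    ("192.168.1.10", "00:11:22:33:44:55"),
    ("192.168.1.20", "00:11:22:33:44:66"),
    ("192.168.1.10", "00:AA:BB:CC:DD:EE"),  # a second machine claiming .10
    ("192.168.1.20", "00:11:22:33:44:66"),  # repeat of a known-good pair
]

conflicts = find_ip_conflicts(sample)
print(conflicts)
```

An empty result only says that no conflict showed up in the pairs you
collected, not that the network is clean; a machine that is plugged in
only now and then will be missed unless the capture covers that window.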
> Tim sent me his /etc/sysctl.conf off-list, and I find it harmless (that
> referred to "kernel configuration" before I added the previous paragraph). As I
> understand him, he's about to try out different kernels (2.4.x ?), now that
> he has a test setup available. Swapping kernels is *not* something I'd happily
> do without further thought on a production server either, and I'm sure you
> agree.

If the machine can be taken down for testing, I'd boot a Knoppix or
Ubuntu CD instead of installing something different.

> May I summarize a few points I believe we all agree on?
>
> 1.) It's a client side problem, i.e. the backed up client seems to be the
>     cause, not the BackupPC server machine.

I'm guessing it's a network problem, with the NIC driver as the only
likely software involvement.

> 2.) It is thus not a BackupPC problem. On the client only stock RHEL4
>     software is in use (on the test setup anyway).
> 3.) It is still on-topic in that it happens using BackupPC and only then.
>     Other users of BackupPC may run into similar problems and be glad to
>     find a solution in the archives once we find one.
> 4.) It's an obscure and unnerving problem. There are many things to try out,
>     nothing obvious springing to mind, and each of us has different thoughts
>     on what to try in which order :).
>
> My bet stays the kernel. Craig has a point with the isolated network. Either
> one might fix it, without leading to a definitive diagnosis. Running on an
> isolated network as a workaround is not an option :-), but it's the easier
> thing to try out, and *reproducing* the problem on an isolated network would
> rule out quite a lot of causes.

I think the kernel is the least likely thing to be involved. My test
procedure would be to build a 'known working' pair of machines, perhaps
as simple as booting 2 boxes with Knoppix or Ubuntu, connected by a
crossover cable with IP addresses assigned between them.
Once you get a set that can rsync without errors (which can't be that
hard - it works for everyone else), start introducing the
cable/switch/destination where you've seen the problem, one piece at a
time. Intel 100M NICs would probably be a safe bet for ruling out
obscure driver issues. If the switches are managed, I'd check what they
report for interface errors and confirm that there is no duplex
mismatch on the connections. A tcpdump of broadcasts to see the ARP
traffic might show something interesting.

-- 
Les Mikesell
[EMAIL PROTECTED]

_______________________________________________
BackupPC-users mailing list
BackupPC-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/backuppc-users
http://backuppc.sourceforge.net/