Hello,

after one month we have found the solution to our problem :D

Problem (short): after some time, clients lose their NBD mounts
(server log: "Read failed: Connection reset by peer"). It is similar to the
problem described in
https://bugs.launchpad.net/ubuntu/+source/nbd/+bug/113617

Solution: tuning the TCP keepalive parameters (see
http://tldp.org/HOWTO/TCP-Keepalive-HOWTO/usingkeepalive.html).

We suppose that our network closes the mount connections. We use mostly
enterprise-class network components (Cisco 6500 Series).

Our LTSP system runs well. We wanted to share our experience.

Greetings,

Wojtek

Wojtek Polcwiartek wrote:
> Hello,
>
> we think that the problem is the load balancer (Cisco ACE). Most of the
> traffic on the servers goes through it. Sniffing showed some strange
> TCP RST packets.
> We found a way to work around this problem. It is not the most beautiful
> solution, but it works for now.
> We used a script in /etc/rc2.d/ with the following lines:
> nbd-client -d /dev/nbd0
> nbd-client <IP-Address> 2000 /dev/nbd0 -persist
>
> It disconnects the nbd-client and connects it again with the "persist"
> option.
>
> Is there any reason why the "persist" option isn't used by default? For
> me the connection seems to be more robust than without it.
>
> Is there a clean way to change the parameters of the default nbd connection?
>
>
> Thanks for help!
>
> Wojtek
>
>
> Gideon Romm wrote:
>> The only other thing I can think of is your switch.
>>
>> Is it a managed switch? Some switches will not allow a connection to be
>> active and idle for an extended period of time.
>>
>> To test this, connect a single client to the LTSP server via crossover
>> cable and let it sit for a day, and see if it disconnects, too. If it
>> does not, then the problem is the switch, and you should figure out what
>> setting in the switch needs to be changed, or use a dumber switch. :)
>>
>> -Gideon
>>
>>
>> On Tue, 2008-09-30 at 08:39 +0200, Wojtek Polcwiartek wrote:
>>> Hello,
>>>
>>> yes, we do have this line in /etc/hosts.allow
>>> We are still working on this (wireshark etc.) :/
>>> Are TCP/UDP ports other than 69 and 2000 needed?
>>> Any other ideas?
>>>
>>> Greetings,
>>>
>>> Wojtek
>>>
>>>
>>> Gideon Romm wrote:
>>>> Do you have the following line in /etc/hosts.allow:
>>>>
>>>> nbdrootd: ALL: keepalive
>>>>
>>>> -Gadi
>>>>
>>>> On Fri, 2008-09-26 at 12:04 +0200, Wojtek Polcwiartek wrote:
>>>>> Hello,
>>>>>
>>>>> we are trying to implement LTSP in a PC pool (about 200 thin clients) for
>>>>> students at Tech. Univ. of Berlin (we are students too). The work is
>>>>> almost done. We are now in the test phase. Here we got an error which
>>>>> could stop our project :/ We use [EMAIL PROTECTED]
>>>>> Our problem: the connection between nbd-client and nbd-server breaks.
>>>>>
>>>>> The message on the clients says (after switching to another terminal):
>>>>> "nbd0: Attempted to send on closed socket"
>>>>>
>>>>> The logs on the server:
>>>>> - Connection
>>>>> ./syslog:Sep 24 16:43:14 lts02 nbdrootd[11882]: connect from
>>>>> 130.149.10.132 (130.149.10.132)
>>>>> ./syslog:Sep 24 16:43:14 lts02 nbd_server[11883]: connect from
>>>>> 130.149.10.132, assigned file is /opt/ltsp/images/i386.img
>>>>> ./syslog:Sep 24 16:43:14 lts02 nbd_server[11883]: Size of exported
>>>>> file/device is 228229120
>>>>> ./syslog:Sep 24 16:43:16 lts02 nbdrootd[11903]: connect from
>>>>> 130.149.10.131 (130.149.10.131)
>>>>> ./syslog:Sep 24 16:43:16 lts02 nbd_server[11904]: connect from
>>>>> 130.149.10.131, assigned file is /opt/ltsp/images/i386.img
>>>>> ./syslog:Sep 24 16:43:16 lts02 nbd_server[11904]: Size of exported
>>>>> file/device is 228229120
>>>>>
>>>>> - Connection lost
>>>>> Sep 24 17:56:08 lts02 nbd_server[11883]: Read failed: Connection reset
>>>>> by peer
>>>>> Sep 24 17:56:08 lts02 nbd_server[11904]: Read failed: Connection reset
>>>>> by peer
>>>>>
>>>>>
>>>>> Do you have any idea why this could happen?
>>>>>
>>>>> Which TCP ports are needed for a well-working LTSP? We opened 69 (tftp) and
>>>>> 2000 (nbd-server). Our network infrastructure works well: we couldn't
>>>>> notice any high-traffic time periods.
>>>>>
>>>>> Our H/W configuration:
>>>>> 2x servers (4x3GHz, 4GB RAM), H/W load balancer
>>>>> about 200x HP t5725, t5735 and t5525
>>>>>
>>>>>
>>>>> I already wrote an email about this error, but now I am providing some
>>>>> details.
>>>>>
>>>>>
>>>>> Thanks in advance!
>>>>>
>>>>>

--
Wojtek Polcwiartek
------
tubIT TU-Berlin
Web   : www.tubit.tu-berlin.de
Email : [EMAIL PROTECTED]
Tel   : +49.30.314.28000

_____________________________________________________________________
Ltsp-discuss mailing list. To un-subscribe, or change prefs, goto:
https://lists.sourceforge.net/lists/listinfo/ltsp-discuss
For additional LTSP help, try #ltsp channel on irc.freenode.net
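
For anyone who wants to experiment with the TCP keepalive tuning mentioned at
the top of this message, here is a minimal sketch in Python, assuming Linux.
It only illustrates the per-socket options that the TCP-Keepalive-HOWTO
describes; it is not what nbd-client or nbd-server do internally. The function
name enable_keepalive and the timer values are made up for illustration, since
the message does not say which values were used. The system-wide equivalents
are the sysctls net.ipv4.tcp_keepalive_time, net.ipv4.tcp_keepalive_intvl and
net.ipv4.tcp_keepalive_probes, whose Linux defaults are 7200 s, 75 s and
9 probes.

```python
import socket

# Illustrative values only (assumption): the original message does not
# state which keepalive values were actually used.
KEEPALIVE_IDLE_S = 60      # idle seconds before the first probe is sent
KEEPALIVE_INTERVAL_S = 10  # seconds between unanswered probes
KEEPALIVE_PROBES = 5       # unanswered probes before the connection is dropped


def enable_keepalive(sock, idle=KEEPALIVE_IDLE_S,
                     interval=KEEPALIVE_INTERVAL_S,
                     probes=KEEPALIVE_PROBES):
    """Enable TCP keepalive on one socket and tune its timers where supported."""
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
    # The three timer options below are Linux-specific; skip any that this
    # platform's socket module does not expose.
    for name, value in (("TCP_KEEPIDLE", idle),
                        ("TCP_KEEPINTVL", interval),
                        ("TCP_KEEPCNT", probes)):
        option = getattr(socket, name, None)
        if option is not None:
            sock.setsockopt(socket.IPPROTO_TCP, option, value)
    return sock


if __name__ == "__main__":
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    enable_keepalive(s)
    print("SO_KEEPALIVE enabled:",
          s.getsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE) != 0)
    s.close()
```

With the default idle time of two hours, a connection that a switch or load
balancer silently drops sooner would never be probed in time, which may be why
shortening these timers helps in a setup like the one described here.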
