Hello,

After a month we found the solution to our problem :D

Problem (short):
After some time, clients lose their NBD mounts (server log: "Read failed:
Connection reset by peer"). It is similar to the problem described in
https://bugs.launchpad.net/ubuntu/+source/nbd/+bug/113617

Solution:
Tuning the TCP keepalive parameters (see
http://tldp.org/HOWTO/TCP-Keepalive-HOWTO/usingkeepalive.html).
We suspect that our network closes the mount connections. We mostly use
enterprise-class network components (Cisco 6500 Series).
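
For anyone who wants to try the same thing, here is a minimal sketch of what such tuning can look like on Linux. The three sysctl names are the standard kernel keepalive settings described in the HOWTO above. The numeric values and the helper name keepalive_fragment are only illustrative assumptions, not the values we actually used. Note that these settings only affect sockets that have keepalive enabled (which is what the "keepalive" option in /etc/hosts.allow quoted below is for), and that the Linux default is to wait 7200 seconds before the first probe.

```shell
#!/bin/sh
# Illustrative values only (assumptions, not taken from this thread).
KEEPALIVE_TIME=600    # idle seconds before the first keepalive probe
KEEPALIVE_INTVL=60    # seconds between two probes
KEEPALIVE_PROBES=5    # unanswered probes before the connection is dropped

# Hypothetical helper: prints a sysctl.conf-style fragment, changes nothing.
keepalive_fragment() {
    printf 'net.ipv4.tcp_keepalive_time = %s\n' "$KEEPALIVE_TIME"
    printf 'net.ipv4.tcp_keepalive_intvl = %s\n' "$KEEPALIVE_INTVL"
    printf 'net.ipv4.tcp_keepalive_probes = %s\n' "$KEEPALIVE_PROBES"
}

# Show the values the running kernel currently uses, where readable.
for p in time intvl probes; do
    f="/proc/sys/net/ipv4/tcp_keepalive_$p"
    if [ -r "$f" ]; then
        printf 'current tcp_keepalive_%s = %s\n' "$p" "$(cat "$f")"
    fi
done

# Print the proposed settings.
keepalive_fragment
```

The script only prints. To apply the settings, the fragment could be written to a file and loaded as root with "sysctl -p <file>", or each value could be set directly with "sysctl -w". The idle time should be chosen shorter than whatever idle timeout the network device in between enforces.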

Our LTSP system now runs well, and we wanted to share our experience.

Greetings,
Wojtek

Wojtek Polcwiartek wrote:
> Hello,
> 
> We think that the problem is the load balancer (Cisco ACE). Most of the
> traffic to the servers goes through it. Sniffing showed some strange
> TCP RST packets.
> We found a way to work around this problem. It is not the most elegant
> solution, but it works for now.
> We used a script in /etc/rc2.d/ with the following lines:
> nbd-client -d /dev/nbd0
> nbd-client <IP-Address> 2000 /dev/nbd0 -persist
> 
> It disconnects the nbd-client and connects it again with the "persist"
> option.
> 
> Is there any reason why the "persist" option isn't used by default? To
> me the connection seems more robust with it than without it.
> 
> Is there a clean way to change the parameters of the default NBD connection?
> 
> 
> Thanks for help!
> 
> Wojtek
> 
> 
> 
> 
> Gideon Romm wrote:
>> The only other thing I can think of is your switch.
>>
>> Is it a managed switch?  Some switches will not allow a connection to be
>> active and idle for an extended period of time.
>>
>> To test this, connect a single client to the LTSP server via crossover
>> cable and let it sit for a day, and see if it disconnects, too.  If it
>> does not, then the problem is the switch, and you should figure out what
>> setting in the switch needs to be changed, or use a dumber switch.  :)
>>
>> -Gideon
>>
>>
>> On Tue, 2008-09-30 at 08:39 +0200, Wojtek Polcwiartek wrote:
>>> Hello,
>>>
>>> Yes, we do have this line in /etc/hosts.allow.
>>> We are still working on this (wireshark etc.) :/
>>> Are any TCP/UDP ports other than 69 and 2000 needed?
>>> Any other ideas?
>>>
>>> Greetings,
>>>
>>> Wojtek
>>>
>>>
>>>
>>> Gideon Romm wrote:
>>>> Do you have the following line in /etc/hosts.allow:
>>>>
>>>> nbdrootd: ALL: keepalive
>>>>
>>>> -Gadi
>>>>
>>>> On Fri, 2008-09-26 at 12:04 +0200, Wojtek Polcwiartek wrote:
>>>>> Hello,
>>>>>
>>>>> We are trying to deploy LTSP in a PC pool (about 200 thin clients) for
>>>>> students at the Tech. Univ. of Berlin (we are students too). The work is
>>>>> almost done and we are now in the test phase. Here we hit an error which
>>>>> could stop our project :/ We use [EMAIL PROTECTED]
>>>>> Our problem: the connection between nbd-client and nbd-server breaks.
>>>>>
>>>>> The message on the clients (after switching to another terminal) says:
>>>>> "nbd0: Attempted to send on closed socket"
>>>>>
>>>>> The logs at the server:
>>>>> - Connection
>>>>> ./syslog:Sep 24 16:43:14 lts02 nbdrootd[11882]: connect from 
>>>>> 130.149.10.132 (130.149.10.132)
>>>>> ./syslog:Sep 24 16:43:14 lts02 nbd_server[11883]: connect from 
>>>>> 130.149.10.132, assigned file is /opt/ltsp/images/i386.img
>>>>> ./syslog:Sep 24 16:43:14 lts02 nbd_server[11883]: Size of exported 
>>>>> file/device is 228229120
>>>>> ./syslog:Sep 24 16:43:16 lts02 nbdrootd[11903]: connect from 
>>>>> 130.149.10.131 (130.149.10.131)
>>>>> ./syslog:Sep 24 16:43:16 lts02 nbd_server[11904]: connect from 
>>>>> 130.149.10.131, assigned file is /opt/ltsp/images/i386.img
>>>>> ./syslog:Sep 24 16:43:16 lts02 nbd_server[11904]: Size of exported 
>>>>> file/device is 228229120
>>>>>
>>>>> - Connection lost
>>>>> Sep 24 17:56:08 lts02 nbd_server[11883]: Read failed: Connection reset 
>>>>> by peer
>>>>> Sep 24 17:56:08 lts02 nbd_server[11904]: Read failed: Connection reset 
>>>>> by peer
>>>>>
>>>>>
>>>>> Do you have any idea why this could happen?
>>>>>
>>>>> Which TCP ports are needed for LTSP to work properly? We opened 69 (tftp)
>>>>> and 2000 (nbd-server). Our network infrastructure works well: we did not
>>>>> notice any periods of high traffic.
>>>>>
>>>>> Our hardware configuration:
>>>>> 2x servers (4x 3 GHz, 4 GB RAM), hardware load balancer
>>>>> about 200x HP t5725, t5735 and t5525
>>>>>
>>>>>
>>>>> I already wrote an email about this error, but now I can provide some details.
>>>>>
>>>>>
>>>>> Thanks in advance!
>>>>>
>>>>>
> 
> 


-- 
Wojtek Polcwiartek

------
tubIT
TU-Berlin
Web   : www.tubit.tu-berlin.de
Email : [EMAIL PROTECTED]
Tel   : +49.30.314.28000

_____________________________________________________________________
Ltsp-discuss mailing list.   To un-subscribe, or change prefs, goto:
      https://lists.sourceforge.net/lists/listinfo/ltsp-discuss
For additional LTSP help,   try #ltsp channel on irc.freenode.net
