What do the servers have in common? Do they have the same kind of network cards? Are they on the same switch? Look for what is shared between the two of them and it might point you to the source of the problem.
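
As a rough sketch of that comparison, assuming you have saved a hardware listing from each server as text (for example the output of `lspci`, one device per line in the form `<slot> <description>`), something like the following would show which devices both machines have in common. The function name and the assumed line format are only illustrative.

```python
def shared_devices(listing_a: str, listing_b: str) -> list[str]:
    """Return device descriptions that appear in both hardware listings."""

    def descriptions(listing: str) -> set[str]:
        found = set()
        for line in listing.splitlines():
            line = line.strip()
            if not line:
                continue
            # Assumption: the first field is a bus/slot address that can
            # differ between machines, so compare only what follows it.
            _, _, desc = line.partition(" ")
            found.add(desc.strip() or line)
        return found

    # Devices present in both listings, sorted for stable output.
    return sorted(descriptions(listing_a) & descriptions(listing_b))
```

Call it with the contents of the two saved listings; anything it returns (a network card model, a disk controller) is a candidate for the common cause.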


Ricardo Araújo wrote:

Well, a little update on the problem.

I haven't yet been able to run memtest all the way through; I had it run
once and no problems showed. I have also experimented with the memory
modules (I have 2x512MB) by running with only 512MB at a time, but the
problem remains (random lock down).


The situation is now as follows: I decided to split the machines among two
servers. What was once a backup server is now operating as a server for 4
clients that were previously on the main server. I was hoping that that
would put less pressure on servers and things would run smoothly. But now
I have two servers locking down, exactly the same way, but not
concurrently.

So I'd better give an update on the topology of the network, as that might
be important information after all. All 10 LTSP clients are connected
through 100Mb/s switches. On the same network there are about 7 machines
still running Windows, no LTSP. A router handles DHCP for the Windows
machines and Internet for those and also for the servers. Two servers run
LTSP, one providing access to 6 clients and the other to 4. Both provide
DHCP for their own clients and get IP from the router (all fixed). One
server also provides an intranet web interface.

Probably the main concern is the fact that we have 3 DHCP servers
running. The servers get their IP from the router and LTSP clients get IP
from the servers. No special configuration was made to restrict which
machines each DHCP server responds to: the router responds to everyone,
and each LTSP server only to the machines it serves.
At first I expected that to be a problem, since maybe sometimes a client
would get IP from the router and sometimes from the server, but
somehow all clients get IP only from the server, as it is supposed to be.
Even if that did happen, it should only cause clients to malfunction, not
servers to lock down.

Anyway, the problem now is that BOTH servers lock down randomly. It is
never concurrently, which must say something about the nature of the
problem. Also, I don't believe it is a hacker problem: I tried switching
off the internet and made sure no spurious connections were being made to
the network (it is easy, since it is possible to have a global view from
the room where all clients are). Running "top" before a server locks down
shows no difference in server load from what would be expected.

It is indeed a tricky problem. If both servers locked concurrently, things
would be easier. But the fact that they don't, and that the lock downs seem
completely random, is quite mysterious to me.

Thanks for the help so far. I hope we can find a solution and this problem
doesn't remain one of those great LTSP mysteries...

[]s
Ricardo.




------------------------------------------------------- This SF.Net email is sponsored by: YOU BE THE JUDGE. Be one of 170 Project Admins to receive an Apple iPod Mini FREE for your judgement on who ports your project to Linux PPC the best. Sponsored by IBM. Deadline: Sept. 24. Go here: http://sf.net/ppc_contest.php _____________________________________________________________________ Ltsp-discuss mailing list. To un-subscribe, or change prefs, goto: https://lists.sourceforge.net/lists/listinfo/ltsp-discuss For additional LTSP help, try #ltsp channel on irc.freenode.net

--
Brian Payst, MS
Director of Technology & Systems Support
Division of Student Affairs
The University of North Carolina at Chapel Hill
voice: (919) 962-1469
fax: (919) 962-5241

