Re: [Ltsp-discuss] [ltsp-discuss] Network clients stop responding for a while after 4 or 5 units boot up

2008-11-24 Thread Mark David Dumlao
On Thu, Nov 20, 2008 at 7:36 AM, jam [EMAIL PROTECTED] wrote:

  Previously I configured my DNS under bind. However, I noticed that after
  booting 4 or 5 thin clients, bind9 would magically stop replying to
  queries. At this point, sudo would become very slow and few of my clients
  would be able to boot or even get IP addresses from the server.
 
  Neither dhcp3-server nor sudo should be making DNS lookups. My hosts file
  looks like so:
 
  ===
  127.0.0.1    localhost

  127.0.1.1    mars.schoolsite.local    mars

  192.168.1.8    mars.schoolsite.local    mars
  192.168.11.254    mars.schoolsite.local    mars

 [snip]
 So who is mars: 127.0.1.1, 192.168.1.8 or 192.168.11.254?
 Don't know if THIS causes your problem, I've never tried something as umm
 not-clever as this. But this is NOT good.

Wait, what?

/etc/hosts entries shouldn't kill bind, no matter what. And my nsswitch.conf
clearly picks files before dns.

===
[EMAIL PROTECTED]:~$ cat /etc/nsswitch.conf
# /etc/nsswitch.conf
#
# Example configuration of GNU Name Service Switch functionality.
# If you have the `glibc-doc-reference' and `info' packages installed, try:
# `info libc Name Service Switch' for information about this file.

passwd:         compat
group:          compat
shadow:         compat

hosts:          files mdns4_minimal [NOTFOUND=return] dns mdns4
networks:       files

protocols:      db files
services:       db files
ethers:         db files
rpc:            db files

netgroup:       nis
===
mars is the name of the server.
mars.schoolsite.local is the fqdn.

127.0.1.1 is a localhost alias. It's a default fill-me-in entry in Ubuntu.
The two other entries map the hostname to the addresses of the different
interfaces.

 Now if you have DNS, why is there an entry for mars in /etc/hosts? That is
 not BAD but certainly not GOOD. See /etc/nsswitch.conf

I have no idea why it is not good to have a host entry in case your dns goes
down.
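
For anyone who wants to check their own hosts file for the thing being argued
about here (one name listed under several addresses), this is a minimal
sketch. The HOSTS_TEXT sample is copied from the hosts file quoted in this
thread; the function names are made up for illustration. It only reports
which names appear under more than one address. It does not say which
address the resolver will actually hand back.

```python
from collections import defaultdict

# Sample taken from the hosts file quoted in this thread.
HOSTS_TEXT = """\
127.0.0.1       localhost
127.0.1.1       mars.schoolsite.local   mars
192.168.1.8     mars.schoolsite.local   mars
192.168.11.254  mars.schoolsite.local   mars
"""


def addresses_by_name(hosts_text):
    """Map each hostname or alias to the addresses it is listed under, in file order."""
    table = defaultdict(list)
    for line in hosts_text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and surrounding blanks
        if not line:
            continue
        address, *names = line.split()
        for name in names:
            table[name].append(address)
    return dict(table)


def ambiguous_names(hosts_text):
    """Return only the names that are listed under more than one address."""
    return {
        name: addrs
        for name, addrs in addresses_by_name(hosts_text).items()
        if len(addrs) > 1
    }


if __name__ == "__main__":
    for name, addrs in ambiguous_names(HOSTS_TEXT).items():
        print(name, "->", ", ".join(addrs))
```

To run it against a real machine, read /etc/hosts into a string and pass that
in instead of HOSTS_TEXT.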
-
This SF.Net email is sponsored by the Moblin Your Move Developer's challenge
Build the coolest Linux based applications with Moblin SDK  win great prizes
Grand prize is a trip for two to an Open Source event anywhere in the world
http://moblin-contest.org/redirect.php?banner_id=100url=/_
Ltsp-discuss mailing list.   To un-subscribe, or change prefs, goto:
  https://lists.sourceforge.net/lists/listinfo/ltsp-discuss
For additional LTSP help,   try #ltsp channel on irc.freenode.net


Re: [Ltsp-discuss] [ltsp-discuss] Network clients stop responding for a while after 4 or 5 units boot up.

2008-11-19 Thread Vagrant Cascadian
On Mon, Nov 17, 2008 at 05:19:36PM +0800, Mark David Dumlao wrote:
 Thanks Steve, I didn't know gPXE was universal because I only downloaded
 images from rom-o-matic.

you can download a universal gPXE iso from rom-o-matic.net, just select the
gpxe:all-drivers driver. why it isn't the default is beyond me.

i haven't had luck with the gpxe:all-drivers floppy disk images from
rom-o-matic.net, though. when building from source, i had luck with the padded
floppy disk images (.pdsk), which aren't built by default.

i've also got .deb packages of gPXE available at:

  http://llama.freegeek.org/~vagrant/debian/UNRELEASED/

that has various images in /usr/share/gpxe/

live well,
  vagrant



Re: [Ltsp-discuss] [ltsp-discuss] Network clients stop responding for a while after 4 or 5 units boot up

2008-11-19 Thread jam
On Thursday 20 November 2008 04:27:31 ltsp-discuss-
[EMAIL PROTECTED] wrote:
   Will do, but Im just annoyed by the individual ROM generation from
   ROM-o-matic. My refurbished units all have different lan cards and I
   was hoping for a universal diskette bootrom.
  
   Any help? :)
 
  I think if you download the gPXE source you can build yourself a
  universal loader. I forget the details, but I think it's possible.

 Thanks Steve, I didn't know gPXE was universal because I only downloaded
 images from rom-o-matic.

 I was just about to pat myself on the back and congratulate myself after
 gPXE was able to boot the whole lab in about 10-15 minutes. However, after
 I rebooted and tried again later, my problem with the inability to get
 addresses returned. It's apparently intermittent, and I think it is highly
 related to either a huge system slowdown or dns. I don't know why it would
 be related to DNS after all.

 Previously I configured my DNS under bind. However, I noticed that after
 booting 4 or 5 thin clients, bind9 would magically stop replying to
 queries. At this point, sudo would become very slow and few of my clients
 would be able to boot or even get IP addresses from the server.

 Neither dhcp3-server nor sudo should be making DNS lookups. My hosts file
 looks like so:

 ===
 127.0.0.1    localhost

 127.0.1.1    mars.schoolsite.local    mars

 192.168.1.8    mars.schoolsite.local    mars
 192.168.11.254    mars.schoolsite.local    mars

[snip]
So who is mars: 127.0.1.1, 192.168.1.8 or 192.168.11.254?
Don't know if THIS causes your problem, I've never tried something as umm
not-clever as this. But this is NOT good.

Now if you have DNS, why is there an entry for mars in /etc/hosts? That is not
BAD but certainly not GOOD. See /etc/nsswitch.conf

To get your system running I'd start with bind DNS and an /etc/hosts
containing only:
127.0.0.1   localhost

Then you can start to troubleshoot
James





Re: [Ltsp-discuss] [ltsp-discuss] Network clients stop responding for a while after 4 or 5 units boot up.

2008-11-18 Thread Mark David Dumlao
On Mon, Nov 17, 2008 at 11:06 AM, Steve Cayford [EMAIL PROTECTED] wrote:

 Mark David Dumlao wrote:
  On Sat, Nov 15, 2008 at 7:47 AM, JF Straeten [EMAIL PROTECTED]
  mailto:[EMAIL PROTECTED] wrote:
 
  Perhaps could you give gPXE a try, instead of etherboot, and see if it
  change anything ?
 
  Will do, but Im just annoyed by the individual ROM generation from
  ROM-o-matic. My refurbished units all have different lan cards and I was
  hoping for a universal diskette bootrom.
 
  Any help? :)

 I think if you download the gPXE source you can build yourself a
 universal loader. I forget the details, but I think it's possible.

Thanks Steve, I didn't know gPXE was universal because I only downloaded
images from rom-o-matic.

I was just about to pat myself on the back and congratulate myself after
gPXE was able to boot the whole lab in about 10-15 minutes. However, after I
rebooted and tried again later, my problem with the inability to get
addresses returned. It's apparently intermittent, and I think it is highly
related to either a huge system slowdown or dns. I don't know why it would
be related to DNS after all.

Previously I configured my DNS under bind. However, I noticed that after
booting 4 or 5 thin clients, bind9 would magically stop replying to queries.
At this point, sudo would become very slow and few of my clients would be
able to boot or even get IP addresses from the server.

Neither dhcp3-server nor sudo should be making DNS lookups. My hosts file
looks like so:

===
127.0.0.1    localhost
127.0.1.1    mars.schoolsite.local    mars

192.168.1.8    mars.schoolsite.local    mars
192.168.11.254    mars.schoolsite.local    mars

# The following lines are desirable for IPv6 capable hosts
::1 ip6-localhost ip6-loopback
fe00::0 ip6-localnet
ff00::0 ip6-mcastprefix
ff02::1 ip6-allnodes
ff02::2 ip6-allrouters
ff02::3 ip6-allhosts
===

Nevertheless, when my bind9 server dies, my hosts are unable to get IPs from
the dhcp3-server, and sudo takes forever. What the human is going on, I
wonder.

Thinking it was a problem with bind9, I replaced it with pdns. PDNS has a
bind backend, which basically just reads your bind zones and makes it easy
to transfer. But no joy. After 4 or 5 boots, dhcp3 server goes to hell and
stops responding.

I tried all sorts of network dumping and monitoring. After a while I noticed
that there were reverse IP queries somewhere, and not bothering to find out
why, I gave the entire subnet reverse entries in my PDNS.
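
To give an idea of what "reverse entries for the entire subnet" amounts to,
here is a minimal sketch that lists the PTR names a reverse zone would need.
It assumes the lab network is 192.168.11.0/24 (only the server address
192.168.11.254 is given in this thread, so the /24 mask is a guess), and the
ws<N>.schoolsite.local client names are invented for illustration.

```python
import ipaddress


def ptr_names(network):
    """Return the in-addr.arpa name for every usable host address in the network."""
    net = ipaddress.ip_network(network)
    return [ip.reverse_pointer for ip in net.hosts()]


def ptr_records(network, domain="schoolsite.local"):
    """Build zone-file style PTR lines; the ws<N> hostnames are hypothetical."""
    lines = []
    for ip in ipaddress.ip_network(network).hosts():
        last_octet = str(ip).rsplit(".", 1)[1]
        lines.append(f"{ip.reverse_pointer}.\tIN\tPTR\tws{last_octet}.{domain}.")
    return lines


if __name__ == "__main__":
    # Assumed lab subnet; adjust to the real network and mask.
    for line in ptr_records("192.168.11.0/24")[:3]:
        print(line)
```

Whatever names are used, the forward zone should agree with them, otherwise
services that cross-check forward and reverse lookups may still stall.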

Amazingly, I was able to boot the whole lab. In under 10 minutes. It brought
tears to my eyes. However the problem came back, and checking my logs:

Nov 17 16:53:40 mars pdns[2981]: Scheduling exit on remote request

And I was like, what the hell.

I'm not 100% sure, but it would seem like my bot-infested waters (worms
everywhere in the network, it wasn't maintained well) are trying to remotely
control my bind or something like that, and maybe all sorts of poisoning tricks.
Since my pdns server shared my bind rndc key, if the key was weak, then it's
conceivable that bots are shutting it down.

Anyways, it would seem that I have two problems that ask for a solution:
1) Are my local services performing DNS lookups? Why? How do I get them to
stop doing that?
2) Is my pdns / bind / whatever dns under attack by bots using rndc? How do
I stop them?


Re: [Ltsp-discuss] [ltsp-discuss] Network clients stop responding for a while after 4 or 5 units boot up.

2008-11-18 Thread David Burgess
On Mon, Nov 17, 2008 at 2:19 AM, Mark David Dumlao [EMAIL PROTECTED] wrote:

 1) Are my local services performing DNS lookups? Why? How do I get them to
 stop doing that?

I've screwed up my host name in the past and, lo and behold, sudo, ssh and
probably some others became really slow or unusable. It doesn't
surprise me that some of your services stop responding when DNS goes
down. Sorry, I have no idea why or how to fix it.

My personal instinct would be not to change the behaviour of said
services, but to figure out how to protect the DNS service from
whatever is bringing it down.

db



Re: [Ltsp-discuss] [ltsp-discuss] Network clients stop responding for a while after 4 or 5 units boot up.

2008-11-18 Thread Steve Cayford
Mark David Dumlao wrote:
 [...]
 Anyways, it would seem that I have two problems that ask for a solution:
 1) Are my local services performing DNS lookups? Why? How do I get them
 to stop doing that?
 2) Is my pdns / bind / whatever dns under attack by bots using rndc? How
 do I stop them?

This sounds like you need to break out wireshark (formerly ethereal).
Watch for everything in and out of your server on port 53 and you should
start to get a better idea of what's going on.

-Steve



Re: [Ltsp-discuss] [ltsp-discuss] Network clients stop responding for a while after 4 or 5 units boot up.

2008-11-17 Thread JF Straeten
On Sun, Nov 16, 2008 at 09:06:34PM -0600, Steve Cayford wrote:
 Mark David Dumlao wrote:
  On Sat, Nov 15, 2008 at 7:47 AM, JF Straeten [EMAIL PROTECTED]
  mailto:[EMAIL PROTECTED] wrote:
  
  Perhaps could you give gPXE a try, instead of etherboot, and see if it
  change anything ?
  
  Will do, but Im just annoyed by the individual ROM generation from
  ROM-o-matic. My refurbished units all have different lan cards and I was
  hoping for a universal diskette bootrom.
  
  Any help? :)
 
 I think if you download the gPXE source you can build yourself a
 universal loader. I forget the details, but I think it's possible.

Yes. With gPXE, you can use the same diskette for all clients and they
will behave as if they were PXE-enabled.

A+

-- 

JFS.



Re: [Ltsp-discuss] [ltsp-discuss] Network clients stop responding for a while after 4 or 5 units boot up.

2008-11-17 Thread Rob Owens


Mark David Dumlao wrote:
 On Sat, Nov 15, 2008 at 7:47 AM, JF Straeten [EMAIL PROTECTED]
 mailto:[EMAIL PROTECTED] wrote:
 
 Perhaps could you give gPXE a try, instead of etherboot, and see if it
 change anything ?
 
 Will do, but Im just annoyed by the individual ROM generation from
 ROM-o-matic. My refurbished units all have different lan cards and I was
 hoping for a universal diskette bootrom.
 
 Any help? :)

I've used the thinstation universal boot cd cited on this page:
http://www.ltsp.org/twiki/bin/view/Ltsp/Etherboot

-Rob








Re: [Ltsp-discuss] [ltsp-discuss] Network clients stop responding for a while after 4 or 5 units boot up.

2008-11-17 Thread Rob Owens
I think that server should be able to handle 10 clients or more, based
on my experience.  It'll depend somewhat on what applications you're
running, but for basic office stuff 10 clients should be no problem.

Check /etc/ltsp/dhcpd.conf to make sure that you have more than 5
addresses in your dynamic pool.  Look for the line that starts with "range".
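
As a rough way to do that check, here is a minimal sketch that adds up the
addresses covered by the range statements in a dhcpd.conf. The sample config
below is made up (the thread never shows the real file), and the parser only
understands plain "range low high;" lines with an optional dynamic-bootp
keyword. It does not detect overlapping ranges.

```python
import ipaddress
import re

# Hypothetical config for illustration; substitute the contents of the real file.
SAMPLE_DHCPD_CONF = """\
subnet 192.168.11.0 netmask 255.255.255.0 {
    range 192.168.11.20 192.168.11.250;
    option routers 192.168.11.254;
}
"""

# Matches "range [dynamic-bootp] low [high];" at the start of a line.
RANGE_RE = re.compile(
    r"^\s*range\s+(?:dynamic-bootp\s+)?([\d.]+)(?:\s+([\d.]+))?\s*;",
    re.MULTILINE,
)


def pool_size(conf_text):
    """Count the addresses covered by all range statements in the config text."""
    total = 0
    for low, high in RANGE_RE.findall(conf_text):
        first = ipaddress.ip_address(low)
        last = ipaddress.ip_address(high) if high else first  # single-address range
        total += int(last) - int(first) + 1
    return total


if __name__ == "__main__":
    print("addresses in dynamic pool:", pool_size(SAMPLE_DHCPD_CONF))
```

If the number that comes out is close to the count of clients that manage to
boot, the pool is the first thing to widen.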

Lastly, are you sure you're using a switch and not a hub?  A hub can
cause the kind of problems you're seeing.  I'd bet money that this is
your problem.

-Rob

Mark David Dumlao wrote:
 Hello ltsp-discuss list!
 
 I'm setting up a thin client laboratory for a computer school in the
 Philippines, in the Visayas region, and I'm encountering some
 difficulties with getting the clients to boot up.
 
 Problem statement:
 I have a lab of 20 units booting off of a lower-middle class desktop
 converted into an Ubuntu LTSP server with increased RAM and disk space.
 The server is able to boot all of the clients individually without
 hitches, recording individual boot times of 1-3 minutes in isolation.
 However, when 4 or 5 clients start up, the rest of the lab is unable to
 start up or even get IP addresses. Also, when booting several units at a
 time, the boot process sometimes hangs for 3-5 minutes before entering
 the Login screen.
 
 Details:
 I am running Ubuntu 8.04 LTS (Hardy Heron), and using it to boot up an
 entire lab of second hand, beat down Pentium III units. My server specs
 are as follows:
 
 Server:
 processor: Pentium 4, 2.4GHz, 512 kb cache
 RAM: 2GB DDR SDRAM
 disks: 2x 80GB PATA Hard Disks - software RAID
 md0 = 10gigs root partition
 2 gigs swap per hard disk
 md1 = 55 gigs home partition
 NICs: 2x 100 Mb/s ethernet
 eth1 - 192.168.1.8 (facing the school network)
 eth2 - 192.168.11.254 (facing the lab network [thin clients])
 
 As you may have noticed, it's not a brand new server, but rather, one of
 the units which were lying around that I happened to notice and say
 hey, that could totally work. The only thing new about it is the RAM
 and the hard disks, which are second hand. The disks are used to run a
 software RAID1 setup, because I was under the impression that RAID1
 could theoretically increase my disk read speeds to about double.
 
 I did a fresh install of Ubuntu 8.04.1 on the server. Ubuntu's installer
 has an LTSP server install mode. It puts LTSP5 on the server, and
 created boot disks for the clients. The clients are using a universal
 multidriver etherboot boot diskette image which I downloaded from the
 net, called eb_on_hd. The diskette is available from this site:
 http://etherboot.anadex.de/
 
 It's really neat. It works for unattended (the network operating system
 below).
 
 As for the clients, this laboratory consists of 20 refurbished Pentium 3
 units. The highest RAM they have is 128MB. Some of them have 96MB only.
 About 19 of them are fast ethernet (100mbps) although one of them has an
 old 10BaseT card. Their existing hard disks have Windows XP installed,
 and when troubles happen, I also use my server to reinstall their XP
 using the unattended network install method from this site:
 http://unattended.sourceforge.net/
 
 None of the units have PXE boot ROM builtin on their boards or NICs,
 which is why I use the etherboot image from above.
 
 Here's a quick ASCII diagram of the network setup:
 
 { Internet } -- [school router] -- [main switch] -- [mars server]
 -- [lab switch] -- {lab units}
 
 The server, which we called mars, functions not only as an LTSP server,
 but also as a stand-in firewall router for the units. That way, while
 I'm still playing around with the LTSP stuff, there is no service
 interruption in the labs, which are still using virus-laden XPs for class :(
 
 In theory, all of the units seem to be working fine. Individually, I can
 get the units to boot into LTSP, with boot times from 1-3 minutes, and I
 can prove that their performance on the server is noticeably better than
 their local performance (It's a miracle to me that we could get that
 happening on our junktop, but that's what my few beta testing users
 say). However, A problem occurs when I try to boot multiple units. What
 happens is that the first four or five units boot up fine, but the sixth
 unit and so on seem to be unable to even get an IP address. What happens
 is that after etherboot loads and gets to the part where it's asking for
 an IP, and it keeps returning No IP address after that point. Under
 windows, they are normally able to get an IP address from the server,
 which is why I suspect the problem only arises after LTSP starts to
 boot. Anyways, I have to cover all points when troubleshooting this,
 because the owner of the school wants me to replicate the setup to
 another lab and another branch with 4 labs. So here's what I've covered
 so far:
 
 Unless I mention otherwise, testing commands are run by booting one lab
 unit into thin client mode, and running local tests on the server.

Re: [Ltsp-discuss] [ltsp-discuss] Network clients stop responding for a while after 4 or 5 units boot up.

2008-11-16 Thread Mark David Dumlao
On Sat, Nov 15, 2008 at 7:47 AM, JF Straeten [EMAIL PROTECTED] wrote:

 Perhaps could you give gPXE a try, instead of etherboot, and see if it
 change anything ?

Will do, but I'm just annoyed by the individual ROM generation from
ROM-o-matic. My refurbished units all have different lan cards and I was
hoping for a universal diskette bootrom.

Any help? :)


Re: [Ltsp-discuss] [ltsp-discuss] Network clients stop responding for a while after 4 or 5 units boot up.

2008-11-16 Thread Mark David Dumlao
On Sat, Nov 15, 2008 at 1:35 AM, Mark David Dumlao [EMAIL PROTECTED] wrote:

 5 - disk channel
 I'll make sure tomorrow that my RAID disks are on different channels. But I
 think they already are, and I think that even if they were on the same
 channel, the performance wouldn't be THAT bad.

Yipes. My assistant placed both disks on the same channel. I'll reboot and
reconfig, but as I said, I don't really think that's the issue.


[Ltsp-discuss] [ltsp-discuss] Network clients stop responding for a while after 4 or 5 units boot up.

2008-11-14 Thread Mark David Dumlao
Hello ltsp-discuss list!

I'm setting up a thin client laboratory for a computer school in the
Philippines, in the Visayas region, and I'm encountering some difficulties
with getting the clients to boot up.

Problem statement:
I have a lab of 20 units booting off of a lower-middle class desktop
converted into an Ubuntu LTSP server with increased RAM and disk space. The
server is able to boot all of the clients individually without hitches,
recording individual boot times of 1-3 minutes in isolation. However, when 4
or 5 clients start up, the rest of the lab is unable to start up or even get
IP addresses. Also, when booting several units at a time, the boot process
sometimes hangs for 3-5 minutes before entering the Login screen.

Details:
I am running Ubuntu 8.04 LTS (Hardy Heron), and using it to boot up an
entire lab of second hand, beat down Pentium III units. My server specs are
as follows:

Server:
processor: Pentium 4, 2.4GHz, 512 KB cache
RAM: 2GB DDR SDRAM
disks: 2x 80GB PATA Hard Disks - software RAID
md0 = 10gigs root partition
2 gigs swap per hard disk
md1 = 55 gigs home partition
NICs: 2x 100 Mb/s ethernet
eth1 - 192.168.1.8 (facing the school network)
eth2 - 192.168.11.254 (facing the lab network [thin clients])

As you may have noticed, it's not a brand new server, but rather, one of the
units which were lying around that I happened to notice and say "hey, that
could totally work". The only thing new about it is the RAM and the hard
disks, which are second hand. The disks are used to run a software RAID1
setup, because I was under the impression that RAID1 could theoretically
increase my disk read speeds to about double.

I did a fresh install of Ubuntu 8.04.1 on the server. Ubuntu's installer has
an LTSP server install mode. It puts LTSP5 on the server, and creates boot
disks for the clients. The clients are using a universal multidriver
etherboot boot diskette image which I downloaded from the net, called
eb_on_hd. The diskette is available from this site:
http://etherboot.anadex.de/

It's really neat. It also works for Unattended (the network install system
mentioned below).

As for the clients, this laboratory consists of 20 refurbished Pentium 3
units. The highest RAM they have is 128MB. Some of them have 96MB only.
About 19 of them are fast ethernet (100 Mbps) although one of them has an old
10BaseT card. Their existing hard disks have Windows XP installed, and when
troubles happen, I also use my server to reinstall their XP using the
unattended network install method from this site:
http://unattended.sourceforge.net/

None of the units have PXE boot ROM builtin on their boards or NICs, which
is why I use the etherboot image from above.

Here's a quick ASCII diagram of the network setup:

{ Internet } -- [school router] -- [main switch] -- [mars server] --
[lab switch] -- {lab units}

The server, which we called mars, functions not only as an LTSP server, but
also as a stand-in firewall router for the units. That way, while I'm still
playing around with the LTSP stuff, there is no service interruption in the
labs, which are still using virus-laden XPs for class :(

In theory, all of the units seem to be working fine. Individually, I can get
the units to boot into LTSP, with boot times from 1-3 minutes, and I can
prove that their performance on the server is noticeably better than their
local performance (It's a miracle to me that we could get that happening on
our junktop, but that's what my few beta testing users say). However, a
problem occurs when I try to boot multiple units. The first four or five
units boot up fine, but the sixth unit and so on seem to be unable to even
get an IP address. After etherboot loads and gets to the part where it asks
for an IP, it keeps returning "No IP address" from that point on. Under
Windows, they are normally able to get an IP address from the server, which
is why I suspect the problem only arises after LTSP starts to boot. Anyways,
I have to cover all
points when troubleshooting this, because the owner of the school wants me
to replicate the setup to another lab and another branch with 4 labs. So
here's what I've covered so far:

Unless I mention otherwise, testing commands are run by booting one lab unit
into thin client mode, and running local tests on the server.

0) server capacity
My initial reaction was that the server is unable to handle the load.
Because of that, I turned on the GNOME system monitor application to track
system stats while I boot up. Everything goes as expected: the RAM usage
prior to logins is negligible. Ditto for CPU usage. The network spikes up
every time etherboot starts downloading the file - also expected. However,
after I boot up five units, I get _NO_ signs from the server that it is
hitting peak usage - CPU, memory, hard disk use, network are all under 20% -
and I highly suspect that the server is in fact able to handle more than 5
units.

At one point, to make sure that everything was 

Re: [Ltsp-discuss] [ltsp-discuss] Network clients stop responding for a while after 4 or 5 units boot up.

2008-11-14 Thread JF Straeten

Mark,

On Sat, Nov 15, 2008 at 01:35:38AM +0800, Mark David Dumlao wrote:


 3 - Etherboot is dumb
 maybe etherboot is just discarding DHCP requests because it isn't very
 smart. How much intelligence can you fit on a floppy disk anyhow?
 But this is suspect, because other clients besides etherboot are also unable
 to get IP addresses.

Perhaps you could give gPXE a try instead of etherboot, and see if it
changes anything?

http://www.etherboot.org/wiki/start

I find it easier than etherboot to set up, since you just need a
PXE-enabled server, and gPXE on the floppy magically does its work
booting against it ;)

Just an idea...


-- 
JFS
