Hi Larry, (sorry for the late reply)
first of all thank you very much for the feedback!
Larry Stewart wrote:
I was going to say "how often do you really deal with the A.B.C.D rather
than DNS names anyway?" but I've
just spent a couple of weeks doing just that and it really is convenient
when you are in the weeds.
That was our thought as well, thus the "idea".
One comment is that nearly all software that deals with dotted quads
prints in decimal, which makes
binary encodings of the meaning awkward. So using 4 bit fields for the
X and Y coordinates is hard
to translate in your head. Instead, making the third octet be
(row*20)+column would be a lot easier
on the brain and supports 12 rows. This is why we do things like
A.B.200+<module ID>.100+<node ID>/18.
It's a little awkward to get started, but then it is trivial to map in
your brain from IP to function
and position.
Right now the current plan allows up to 10 rows, thus 20 seems to be a
good number here as well :)
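To make the (row*20)+column idea concrete, here is a minimal Python sketch of the encoding and its inverse. The function names and the 0-based row/column numbering are assumptions for illustration only, not something either of us has implemented.

```python
COLUMNS_PER_ROW = 20  # the multiplier from the suggestion above


def third_octet(row: int, column: int) -> int:
    """Encode a rack position as (row * 20) + column (0-based, assumed)."""
    if not 0 <= column < COLUMNS_PER_ROW:
        raise ValueError("column must be in 0..19")
    octet = row * COLUMNS_PER_ROW + column
    if not 0 <= octet <= 255:
        raise ValueError("row/column does not fit into one octet")
    return octet


def position(octet: int) -> tuple[int, int]:
    """Decode a third octet back into (row, column)."""
    return divmod(octet, COLUMNS_PER_ROW)


print(third_octet(3, 7))  # 67
print(position(67))       # (3, 7)
```

With 0-based numbering, rows 0 to 11 fit completely into one octet (the largest value is 11*20+19 = 239), which matches the 12 rows mentioned above.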
The next issue is how all this gets initialized. Pretty much the only
way to do it is to have the DHCP
servers configured to map MAC addresses to IP addresses in a stable
way. We don't really have that
problem because pretty much the only interfaces that have random MAC
addresses are the module
service processors. The MAC address maps to the manufacturing serial
number, which is essential
for tracking faults, but the position (slot ID/module ID) is reported in
the DHCP request in a <vendor>
field and the DHCP server knows what to do.
It seems like when you install something, you will have to enter its MAC
addresses into the DHCP
server database and map to a stable IP address given database knowledge
of the position and function
of the device.
Yes, we will require our vendor to hand over a list (text file) of all
MAC addresses of the cluster, i.e. for each node the two on-board NICs
plus the MAC of the IPMI card.
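As a rough sketch of how such a vendor list could be turned into static DHCP entries: the input format (row, column, slot, MAC per line), the "10.1" prefix, the host naming and the use of the slot as fourth octet are all hypothetical, and the output assumes an ISC dhcpd style configuration, which may not be what ends up being used.

```python
def host_entry(row, column, slot, mac, prefix="10.1"):
    """Build one ISC dhcpd 'host' declaration for a node.

    The address layout prefix.(row*20+column).slot is an assumption
    made for this example only.
    """
    ip = f"{prefix}.{row * 20 + column}.{slot}"
    name = f"n-r{row:02d}c{column:02d}s{slot:02d}"  # hypothetical naming
    return (
        f"host {name} {{\n"
        f"  hardware ethernet {mac.lower()};\n"
        f"  fixed-address {ip};\n"
        f"}}"
    )


def entries_from_list(text):
    """Parse lines of 'row column slot mac' and return dhcpd entries."""
    entries = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        row, column, slot, mac = line.split()
        entries.append(host_entry(int(row), int(column), int(slot), mac))
    return entries


sample = """\
# row column slot mac   (made-up example data)
0 1 5 00:16:3E:00:00:01
3 7 12 00:16:3E:00:00:02
"""
print("\n".join(entries_from_list(sample)))
```

The point is only that once position and MAC are in one text file, the DHCP configuration can be generated rather than typed by hand.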
For us, there were a number of benefits in going to "IP address maps to
function":
* Humans can debug given the IP addresses alone
* No DNS lookups required in performance critical paths
* Higher level configuration files for things like SLURM can be nearly
static
So far so good.
Nevertheless, is the benefit of mapping IP to physical location really
valuable? Trying to
maintain this given the probable frequency of swapping out boxes will
cause trouble with
DHCP and ARP. Either you make the leases short and wait for them to
expire before
powering on a replacement, or you have to go around manually flushing
leases and arp
tables. Ugh. Instead, it may make more sense to give a type of device
a stable IP address
without regard to position, and to maintain a database mapping MAC/IP to
location
separately. For a few 1000's of devices, grepping the location file
will be faster than
walking over to the right rack anyway. We have this problem with
modules. The service
guys want to swap modules in the backplane to see if a problem follows
it and it has
cost us some DHCP hackery to let the addressing respond smoothly.
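The alternative described here, a separate file mapping MAC/IP to location, could look roughly like the following Python sketch. The file format (IP, MAC, free-text location per line) and the sample entries are invented for illustration.

```python
def load_locations(text):
    """Index a location file by both IP and MAC address.

    Assumed line format: '<ip> <mac> <free-text location>'.
    """
    table = {}
    for line in text.splitlines():
        fields = line.split(None, 2)
        if len(fields) != 3 or fields[0].startswith("#"):
            continue  # skip comments and malformed lines
        ip, mac, location = fields
        table[ip] = location.strip()
        table[mac.lower()] = location.strip()
    return table


sample = """\
# ip mac location   (made-up example data)
10.1.0.17 00:16:3E:00:00:01 rack 4, slot 12
10.1.0.18 00:16:3E:00:00:02 rack 4, slot 13
"""
locations = load_locations(sample)
print(locations["10.1.0.17"])          # rack 4, slot 12
print(locations["00:16:3e:00:00:02"])  # rack 4, slot 13
```

When a box is swapped, only its line in this file changes; the IP address of the device stays as it was.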
So far our experience with slightly smaller clusters suggests that the
DHCP problem *might* occur, but usually we have a few "spare nodes"
which are switched off during regular operations (at least officially
;)). If a node dies and is sent back for service we will simply leave
the "hole" in the rack and switch on the spare node at its own position -
again at least officially. After the box returns we can simply reinstall
it in its original place. Thus lease times should not be an issue.
So far it seems we will have enough spare room to house all real and
spare nodes, thus it should not be a problem (keeping my fingers crossed).
Anyone else seeing a big problem in this idea?
Cheers
Carsten
_______________________________________________
Beowulf mailing list, [email protected]
To change your subscription (digest mode or unsubscribe) visit
http://www.beowulf.org/mailman/listinfo/beowulf