automatic node discovery

Petru Dimulescu Tue, 18 Oct 2011 02:48:42 -0700

Hello,

I wonder how you guys see the problem of automatic node discovery: having, for instance, a couple of Hadoop instances, with no configuration explicitly set whatsoever, simply discover each other and work together, like GridGain does. Just fire up two instances of the product, on the same machine or on different machines in the same LAN, and they will use multicast or whatever to discover each other and become part of a self-discovered topology.

Of course, if you have special network requirements you should be able to specify undiscoverable nodes by IP or name, but grids are often installed on LANs and it should really be simpler.
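
To make the idea concrete, here is a rough Python sketch of LAN discovery with a static fallback. It is only an illustration under my own assumptions, not how GridGain or Hadoop actually work: the multicast group, port, JSON heartbeat format and the names Membership, encode_announce, decode_announce and run are all made up.

```python
import json
import socket
import time

GROUP = "239.255.42.99"   # hypothetical administratively scoped multicast group
PORT = 45454              # hypothetical port
TTL_SECONDS = 10.0        # forget a node if no heartbeat arrives in this window


def encode_announce(node_id, role="datanode"):
    """Serialize a heartbeat announcing this node to the LAN."""
    return json.dumps({"id": node_id, "role": role}).encode("utf-8")


def decode_announce(payload):
    """Parse a heartbeat; return None for anything malformed."""
    try:
        msg = json.loads(payload.decode("utf-8"))
        return msg["id"], msg["role"]
    except (ValueError, KeyError, TypeError):
        return None


class Membership:
    """Tracks peers heard from recently, plus statically configured ones."""

    def __init__(self, static_peers=(), ttl=TTL_SECONDS):
        self.ttl = ttl
        self.static = set(static_peers)  # undiscoverable nodes given by IP or name
        self.seen = {}                   # address -> (node_id, role, last_seen)

    def observe(self, address, payload, now):
        """Record a heartbeat received from address at time now."""
        parsed = decode_announce(payload)
        if parsed is not None:
            self.seen[address] = (parsed[0], parsed[1], now)

    def alive(self, now):
        """Return static peers plus discovered peers whose heartbeat is fresh."""
        live = {a for a, (_, _, t) in self.seen.items() if now - t <= self.ttl}
        return sorted(live | self.static)


def open_sender():
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM, socket.IPPROTO_UDP)
    # A multicast TTL of 1 keeps announcements on the local network segment.
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_MULTICAST_TTL, 1)
    return sock


def open_receiver():
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM, socket.IPPROTO_UDP)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    sock.bind(("", PORT))
    # Join the group on the default interface (group address + 0.0.0.0).
    mreq = socket.inet_aton(GROUP) + socket.inet_aton("0.0.0.0")
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_ADD_MEMBERSHIP, mreq)
    return sock


def run(node_id, static_peers=()):
    """Announce once per second and print the current view of the grid."""
    members = Membership(static_peers)
    tx, rx = open_sender(), open_receiver()
    rx.settimeout(1.0)
    next_send = time.monotonic()
    while True:
        now = time.monotonic()
        if now >= next_send:
            tx.sendto(encode_announce(node_id), (GROUP, PORT))
            next_send = now + 1.0
        try:
            payload, (addr, _port) = rx.recvfrom(1024)
            members.observe(addr, payload, time.monotonic())
        except socket.timeout:
            pass
        print(members.alive(time.monotonic()))
```

Starting run("node-a") on one machine and run("node-b") on another in the same LAN should, if multicast is allowed on that network, make each show up in the other's list. Passing static_peers covers the nodes that multicast cannot reach.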

Namenodes are a bit different, since they should run on safer machines; I'm basically talking about datanodes here. Still, I wonder how hard it can be to have self-assigned namenodes, maybe replicated automatically on several machines, unless one specific namenode is explicitly set via XML configuration.

Also, the passwordless ssh thing is so awkward. If you have a network of Hadoop nodes that mutually discover each other, there is really no need for this passwordless ssh requirement. It is more of a system administration aspect: if sysadmins want to automatically deploy or start a program on 5000 machines, they often have the tools and skills to do that. It should not be a requirement.


What do you people think about this?

Best
Petru
