On Mon, Oct 19, 2009 at 7:17 PM, Allen Wittenauer <[email protected]> wrote:
>
> On 10/19/09 11:46 AM, "Edward Capriolo" <[email protected]> wrote:
>
>> I am interested in your post. What has caused you to run caching DNS
>> servers on each of your nodes? Is this a hadoop specific problem or a
>> problem specific to your implementation?
>
> Hadoop does a -tremendous- amount of hostname lookups. If you don't have
> either nscd or a local DNS caching server, you are likely throwing what
> could be some significant performance gains away.
>
>> My assumption here is that a hadoop cluster of say 1000 nodes would
>> repeatedly talk to the same 1000 nodes.
>
> ... and that's the catch! Every node running the DFSClient code or being
> called out from a map/reduce task is a potential hostname that would need be
> resolved. Just think about something like distcp.
>
> Also note that this is before we talk about monitoring, any other naming
> services, CNAMEs, multi-As, etc, that get built as a normal part of running
> an infrastructure.
>
>> Are you saying that nscd is
>> inadequacy to handle the size of the cache, or nscd is not very
>> efficient? What exactly is the reason you are running a caching DNS
>> server on each node?
>
> In the case of Yahoo!, we had (or, at least, a perception) that we had or
> were going to have jobs that did a lot of direct DNS lookups and/or
> accessed/referenced things outside of the local grid. Also note that a DNS
> caching server is going to store more information about hostnames than a
> simple host to IP service like nscd.
>
> Hypothetical: Let's say I'm building rules for a spam filter and part of my
> process is to look up the MX record for a given host. nscd isn't going to
> help you there.
>
> In the case of LinkedIn, the jury is still out. I suspect we don't have
> nscd.conf tuned correctly. Our grid isn't that big, our connections in/out
> are fairly small, etc. It has been one of the things on my todo list since I
> got hired here 2 months ago. :)
>
> [For the record, I'm not one of those crazy people who turns off nscd
> because I had a bad experience with a broken version five years ago. In
> the case of Yahoo!, I was the crazy person who started insisting we turn it
> on, albeit not for hosts.]
>

Cool thanks for the info.
I have found nscd to be absolutely essential in most/all situations. Whenever I would truss processes on OSes without nscd (say FreeBSD 6.2) I would see numerous repeated 'stat' calls against /etc/passwd and /etc/group. If you are doing users and groups through LDAP, nscd is super important as well. You're not going to want to make a series of lookups on each stat. I would think the most efficient implementation would be nscd and a local caching server in that case. nscd should be very efficient since it is done through libraries, whereas DNS lookups have to open sockets (overhead). However, I can see your point that nscd cannot cache other types of records.
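
For anyone who wants to check whether a cache is actually answering on a given node, here is a rough sketch in Python. The helper name time_lookups is made up for illustration and is not part of Hadoop or nscd. By default it calls socket.gethostbyname, which should go through the system's normal resolver path, so a first lookup that is much slower than the rest suggests some cache (nscd, a local caching server, or something else) is serving the repeats. It cannot tell you which one.

```python
import socket
import time


def time_lookups(hostname, repeats=5, resolver=socket.gethostbyname):
    """Resolve `hostname` `repeats` times and return the elapsed seconds per call.

    `resolver` is any callable taking a hostname; it is a parameter only so
    that a stand-in can be swapped in for the real system resolver.
    """
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        resolver(hostname)  # result is discarded; only the latency matters here
        timings.append(time.perf_counter() - start)
    return timings
```

Something like time_lookups("some-node-in-your-grid", repeats=10) run once with nscd up and once with it stopped would give a crude before/after comparison. It only covers host-to-IP lookups, so it says nothing about MX or other record types.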
