Doug Cutting wrote:
Apache Wiki wrote:
+ Sort performances on 1400 nodes and 2000 nodes are pretty good too -
sorting 14TB of data on a 1400-node cluster takes 2.2 hours; sorting
20TB on a 2000-node cluster takes 2.5 hours. The updates to the above
configuration being:
+ * `mapred.job.tracker.handler.count = 60`
+ * `mapred.reduce.parallel.copies = 50`
+ * `tasktracker.http.threads = 50`
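[Aside, not from the original mail: these three values live in the cluster's
hadoop-site.xml, and a quick way to confirm what a node actually picked up is
to read them back through the Configuration API. A minimal, untested sketch --
the fallback values passed to getInt() are only placeholders for the sketch,
not a claim about the shipped defaults:]

import org.apache.hadoop.conf.Configuration;

public class CheckSortTuning {
    public static void main(String[] args) {
        // hadoop-site.xml on the classpath is loaded automatically by Configuration.
        Configuration conf = new Configuration();
        System.out.println("mapred.job.tracker.handler.count = "
            + conf.getInt("mapred.job.tracker.handler.count", -1)); // JobTracker RPC handler threads
        System.out.println("mapred.reduce.parallel.copies    = "
            + conf.getInt("mapred.reduce.parallel.copies", -1));    // parallel map-output fetches per reduce
        System.out.println("tasktracker.http.threads         = "
            + conf.getInt("tasktracker.http.threads", -1));         // HTTP threads serving map outputs
    }
}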
This is a pretty good indication of stuff that we might better specify
as proportional to cluster size. For example, we might replace the
first with something like mapred.jobtracker.tasks.per.handler=30. To
determine the number of handlers, we'd compute the number of task slots
(#nodes * mapred.tasktracker.tasks.maximum) and divide that by
tasks.per.handler. Then folks wouldn't need to alter these settings as
their cluster grows.
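[A rough sketch of that derivation, not an actual patch:
mapred.jobtracker.tasks.per.handler is the hypothetical property proposed
above, numTaskTrackers would come from however the JobTracker counts its live
trackers, and the floor of 10 handlers is an arbitrary choice for the sketch.]

import org.apache.hadoop.mapred.JobConf;

public class HandlerCountSketch {
    /** Derive the JobTracker handler count from cluster size at startup. */
    static int deriveHandlerCount(JobConf conf, int numTaskTrackers) {
        int slotsPerNode = conf.getInt("mapred.tasktracker.tasks.maximum", 2);
        int tasksPerHandler = conf.getInt("mapred.jobtracker.tasks.per.handler", 30); // proposed key, not an existing one
        int totalSlots = numTaskTrackers * slotsPerNode;
        // Keep a small fixed floor so tiny clusters still get a few handlers.
        return Math.max(10, totalSlots / tasksPerHandler);
    }
}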
It's best if folks don't have to change defaults for good performance.
Not only does that simplify configuration, but it means we can more
easily change implementations. For example, if we switch to async RPC
responses, then the handler count may change significantly, and we'll
probably change the default, and it would be nice if most folks were not
overriding the default.
Thoughts? Should we file an issue?
I don't think there is an explanation of why increasing the handlers
proportionally helps (it does help, but it might be a big-hammer
approach). I think IPC queue length and queue management also matter a
lot. I will open a Jira with a couple of thoughts/explanations/improvements
regarding queue management in our IPC.
Raghu.