Hi Alan,

On Thu, Jun 13, 2013 at 8:54 AM, Alan Gardner <[email protected]> wrote:
> The weirdest behaviour I'm seeing is that the multithreaded training Map
> task only utilizes one core on an eight core node. I'm not sure if this is
> configurable in the JVM parameters or the job config. In the meantime I've
> set the input split very small, so that I can run 8 parallel 1-thread
> training mappers per node. Should I be configuring this differently?
>

At my office it's generally frowned upon to run MR tasks that try to use
many cores on a multicore system, because the cluster is configured so that
the number of map and reduce slots per node sums to the number of cores. If
multiple multithreaded task attempts land on the same node, CPU load can
spike and hurt the performance of every task attempt on that node.

> I also wanted to check in and verify that the performance I'm seeing is
> typical:
>
> - on a six-node cluster (48 map slots, 8 cores per node) running full tilt,
> each iteration takes about 7 hours. I assume the problem is just that our
> cluster is far too small, and that the performance will scale if I make the
> splits even smaller and distribute the job across more nodes.
>

How many input splits are generated for your input doc-term matrix, and how
many rows does each task attempt process? Make sure the input is balanced
across all of the map tasks.

> - with an 8GB heap size I can't exceed about 200 topics before running out
> of heap space. I tried making the Map input smaller, but that didn't seem
> to help. Can someone describe how memory usage scales per mapper in terms
> of topics, documents and terms?
>

The tasks need memory proportional to num topics x num terms (rough numbers
below). Do you have a full 8 GB heap available to each task slot?

Cheers,
Andy
Twitter, Inc.
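P.S. To make the topics x terms point concrete: if the per-task model is
held as a dense matrix of doubles, then 200 topics over a 1,000,000-term
vocabulary is already 200 x 1,000,000 x 8 bytes, roughly 1.6 GB per copy,
before any per-document state or a second copy for accumulating updates.
The dense-doubles layout and the 1M vocabulary are assumptions on my part
(I don't know your vocabulary size), but they make a ceiling of around 200
topics on an 8 GB heap look plausible.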
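While I'm at it, here's a rough sketch of pinning the per-job heap and the
split size explicitly with the Hadoop 1.x (pre-YARN) APIs, so you can check
what each task attempt actually gets. The class name, job name, and the
128 MB split cap are placeholders, not recommendations:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;

public class TrainingJobConfig {

  // Returns a Job with the child heap and split size set explicitly,
  // so what each task attempt gets is visible in one place.
  public static Job configure() throws Exception {
    Configuration conf = new Configuration();

    // Per-job child JVM options (Hadoop 1.x property). Whatever you set
    // here should also show up in the task attempt logs on the workers.
    conf.set("mapred.child.java.opts", "-Xmx8192m");

    Job job = new Job(conf, "lda-training");

    // Cap the split size so more map tasks share the doc-term matrix;
    // smaller splits mean more parallelism. 128 MB is only a placeholder.
    FileInputFormat.setMaxInputSplitSize(job, 128L * 1024 * 1024);

    return job;
  }
}

Note that the number of map slots per node (mapred.tasktracker.map.tasks.maximum)
is a TaskTracker-side setting in the cluster's mapred-site.xml, so the
8-slots-per-node layout has to come from your ops config rather than the job.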
