Dear nutchers,

My setup: Nutch 2.2.1, HBase 0.90.6, Hadoop 1.1.2, 6 mappers, 6 reducers, Core i7-3770, 32 GB RAM (no swap), 2x3 TB disks.
When I parse (in the mapper, with 6 map tasks running simultaneously), it is very slow. Max load is ~1.5, max iowait is 5%, max CPU per task is only 30%, and max CPU for the HMaster is about 30%. Consequently, iotop also shows low numbers. Since parsing is a CPU-intensive job and all I/O is at a very low level, I wonder why parsing does not run faster and with full CPU usage. It really takes a long time to finish. Where might the bottleneck be?

Thanks for any advice,
Martin

