Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Lucene-hadoop Wiki" for change notification.
The following page has been changed by udanax:
http://wiki.apache.org/lucene-hadoop/DataProcessingBenchmarks

The comment on the change is:
Page is too long.

------------------------------------------------------------------------------
   * Map was used for extract the IP address of the client requesting the web page.
   * Reduce was used for summation.
+  * 1 more Map/Reduce was used for sort by count.
- {{{
- bash# ./bin/hadoop jar ./log_examples.jar loganalysis -m 1000 -r 1000 udanax/logfiles udanax/rank 100
-
-  * Connecting Time Count Map/Reduce Job start.
- ----------------------------------------------------------------------
- 08/01/08 16:13:46 INFO mapred.FileInputFormat: Total input paths to process : 18
- 08/01/08 16:13:47 INFO mapred.JobClient: Running job: job_200801081529_0005
- 08/01/08 16:13:48 INFO mapred.JobClient: map 0% reduce 0%
- 08/01/08 16:13:54 INFO mapred.JobClient: map 1% reduce 0%
- ...
- 08/01/08 16:15:34 INFO mapred.JobClient: map 100% reduce 100%
- 08/01/08 16:15:35 INFO mapred.JobClient: Job complete: job_200801081529_0005
- 08/01/08 16:15:35 INFO mapred.JobClient: Counters: 11
- 08/01/08 16:15:35 INFO mapred.JobClient: Job Counters
- 08/01/08 16:15:35 INFO mapred.JobClient: Launched map tasks=1355
- 08/01/08 16:15:35 INFO mapred.JobClient: Launched reduce tasks=1457
- 08/01/08 16:15:35 INFO mapred.JobClient: Map-Reduce Framework
- 08/01/08 16:15:35 INFO mapred.JobClient: Map input records=49324734
- 08/01/08 16:15:35 INFO mapred.JobClient: Map output records=49324721
- 08/01/08 16:15:35 INFO mapred.JobClient: Map input bytes=8551673779
- 08/01/08 16:15:35 INFO mapred.JobClient: Map output bytes=790763358
- 08/01/08 16:15:35 INFO mapred.JobClient: Combine input records=49324734
- 08/01/08 16:15:35 INFO mapred.JobClient: Combine output records=705771
- 08/01/08 16:15:35 INFO mapred.JobClient: Reduce input groups=201330
- 08/01/08 16:15:35 INFO mapred.JobClient: Reduce input records=705771
- 08/01/08 16:15:35 INFO mapred.JobClient: Reduce output records=201330
- }}}
-
-  * Map/Reduce was used for sort by count.
-
- {{{
-  * Sort by Connection Time Count Map/Reduce Job start.
- ----------------------------------------------------------------------
- 08/01/08 16:15:35 INFO mapred.FileInputFormat: Total input paths to process : 100
- 08/01/08 16:15:36 INFO mapred.JobClient: Running job: job_200801081529_0006
- 08/01/08 16:15:37 INFO mapred.JobClient: map 0% reduce 0%
- ...
- 08/01/08 16:33:54 INFO mapred.JobClient: map 100% reduce 100%
- 08/01/08 16:33:55 INFO mapred.JobClient: Job complete: job_200801081529_0006
- 08/01/08 16:33:11 INFO mapred.JobClient: Counters: 11
- 08/01/08 16:33:55 INFO mapred.JobClient: Job Counters
- 08/01/08 16:33:55 INFO mapred.JobClient: Launched map tasks=1080
- 08/01/08 16:33:55 INFO mapred.JobClient: Launched reduce tasks=1
- 08/01/08 16:33:55 INFO mapred.JobClient: Map-Reduce Framework
- 08/01/08 16:33:55 INFO mapred.JobClient: Map input records=201330
- 08/01/08 16:33:55 INFO mapred.JobClient: Map output records=201330
- 08/01/08 16:33:55 INFO mapred.JobClient: Map input bytes=5080608
- 08/01/08 16:33:55 INFO mapred.JobClient: Map output bytes=5108994
- 08/01/08 16:33:55 INFO mapred.JobClient: Combine input records=201330
- 08/01/08 16:33:55 INFO mapred.JobClient: Combine output records=8406
- 08/01/08 16:33:55 INFO mapred.JobClient: Reduce input groups=19270
- 08/01/08 16:33:55 INFO mapred.JobClient: Reduce input records=84069
- 08/01/08 16:33:55 INFO mapred.JobClient: Reduce output records=200
- }}}

  ==== Processing Results ====
  {{{
  ------------------------------------