Open the HDFS web interface, browse to the file, and check whether all of its blocks are listed.
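As a rough cross-check before looking at the web interface, here is a minimal Python sketch that estimates how many blocks the file should occupy and what share of it the reported "Map input bytes" counter covers. It assumes the 64 MB block size mentioned in the quoted message; the function name `expected_blocks` and the constant names are hypothetical, chosen only for illustration.

```python
# Byte counts copied from the quoted `hadoop dfs -ls` and job counter output.
FILE_SIZE = 2270607035       # size of /clickdir/cf709.txt in bytes
MAP_INPUT_BYTES = 67239230   # "Map input bytes" counter after the job

# Assumption: blocks are 64 MB, as the quoted message suggests.
BLOCK_SIZE = 64 * 1024 * 1024


def expected_blocks(file_size, block_size=BLOCK_SIZE):
    """Return the number of blocks needed to hold file_size bytes."""
    # Integer ceiling division avoids floating-point rounding.
    return -(-file_size // block_size)


blocks = expected_blocks(FILE_SIZE)
fraction = MAP_INPUT_BYTES / FILE_SIZE

print(f"expected blocks: {blocks}")
print(f"share of file covered by Map input bytes: {fraction:.1%}")
```

If the web interface lists roughly that many blocks for the file, the data is stored correctly and the question shifts to how the job splits its input or how the counter is reported.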

On 1/18/08 7:46 AM, "Matt Herndon" <[EMAIL PROTECTED]> wrote:

> Hello,
>
> I'm trying to get Hadoop to process a 2 gig file but it seems to only be
> processing the first block. I'm running the exact Hadoop vmware image
> that is available here http://dl.google.com/edutools/hadoop-vmware.zip
> without any tweaks or modifications to it. I think my file has been
> properly loaded into HDFS (hdfs reports it as having 2270607035 bytes)
> but when I run the example wordcount task it only seems to operate on
> the first 64 meg chunk (Map input bytes is reported as 67239230 when the
> job completes). Is the image set up to only run the first block, and if
> so how do I change this so it runs over the whole file? Any help would
> be greatly appreciated.
>
> Thanks,
>
> --Matt
>
> P.S. Here are the commands I've actually run to verify that the file is
> in the hdfs and to run the wordcount example along with their output:
>
> hadoop dfs -ls /clickdir
> Found 1 items
> /clickdir/cf709.txt <r 1> 2270607035
>
> hadoop jar hadoop-examples.jar wordcount /clickdir /wordTEST3
> 08/01/18 00:18:59 INFO mapred.FileInputFormat: Total input paths to process : 1
> 08/01/18 00:19:00 INFO mapred.JobClient: Running job: job_0023
> 08/01/18 00:19:01 INFO mapred.JobClient: map 0% reduce 0%
> 08/01/18 00:19:28 INFO mapred.JobClient: map 2% reduce 0%
> 08/01/18 00:19:34 INFO mapred.JobClient: map 3% reduce 0%
> 08/01/18 00:19:37 INFO mapred.JobClient: map 5% reduce 0%
> 08/01/18 00:19:43 INFO mapred.JobClient: map 6% reduce 1%
> 08/01/18 00:19:45 INFO mapred.JobClient: map 9% reduce 1%
> 08/01/18 00:19:54 INFO mapred.JobClient: map 12% reduce 2%
> 08/01/18 00:20:02 INFO mapred.JobClient: map 15% reduce 3%
> 08/01/18 00:20:11 INFO mapred.JobClient: map 18% reduce 4%
> 08/01/18 00:20:19 INFO mapred.JobClient: map 21% reduce 4%
> 08/01/18 00:20:25 INFO mapred.JobClient: map 21% reduce 6%
> 08/01/18 00:20:26 INFO mapred.JobClient: map 24% reduce 6%
> 08/01/18 00:20:34 INFO mapred.JobClient: map 27% reduce 7%
> 08/01/18 00:20:45 INFO mapred.JobClient: map 27% reduce 8%
> 08/01/18 00:20:46 INFO mapred.JobClient: map 30% reduce 8%
> 08/01/18 00:20:54 INFO mapred.JobClient: map 33% reduce 8%
> 08/01/18 00:20:56 INFO mapred.JobClient: map 33% reduce 9%
> 08/01/18 00:21:03 INFO mapred.JobClient: map 36% reduce 10%
> 08/01/18 00:21:11 INFO mapred.JobClient: map 39% reduce 11%
> 08/01/18 00:21:19 INFO mapred.JobClient: map 41% reduce 12%
> 08/01/18 00:21:25 INFO mapred.JobClient: map 44% reduce 13%
> 08/01/18 00:21:31 INFO mapred.JobClient: map 47% reduce 13%
> 08/01/18 00:21:36 INFO mapred.JobClient: map 50% reduce 14%
> 08/01/18 00:21:42 INFO mapred.JobClient: map 53% reduce 16%
> 08/01/18 00:21:47 INFO mapred.JobClient: map 56% reduce 16%
> 08/01/18 00:21:52 INFO mapred.JobClient: map 59% reduce 17%
> 08/01/18 00:21:56 INFO mapred.JobClient: map 62% reduce 18%
> 08/01/18 00:22:01 INFO mapred.JobClient: map 65% reduce 19%
> 08/01/18 00:22:06 INFO mapred.JobClient: map 68% reduce 20%
> 08/01/18 00:22:11 INFO mapred.JobClient: map 71% reduce 20%
> 08/01/18 00:22:15 INFO mapred.JobClient: map 74% reduce 22%
> 08/01/18 00:22:20 INFO mapred.JobClient: map 77% reduce 24%
> 08/01/18 00:22:25 INFO mapred.JobClient: map 80% reduce 24%
> 08/01/18 00:22:30 INFO mapred.JobClient: map 83% reduce 25%
> 08/01/18 00:22:35 INFO mapred.JobClient: map 86% reduce 27%
> 08/01/18 00:22:40 INFO mapred.JobClient: map 89% reduce 28%
> 08/01/18 00:22:45 INFO mapred.JobClient: map 89% reduce 29%
> 08/01/18 00:22:46 INFO mapred.JobClient: map 91% reduce 29%
> 08/01/18 00:22:51 INFO mapred.JobClient: map 94% reduce 30%
> 08/01/18 00:22:56 INFO mapred.JobClient: map 97% reduce 30%
> 08/01/18 00:23:06 INFO mapred.JobClient: map 98% reduce 32%
> 08/01/18 00:25:06 INFO mapred.JobClient: map 99% reduce 32%
> 08/01/18 00:26:16 INFO mapred.JobClient: map 100% reduce 32%
> 08/01/18 00:27:08 INFO mapred.JobClient: map 100% reduce 66%
> 08/01/18 00:27:16 INFO mapred.JobClient: map 100% reduce 71%
> 08/01/18 00:27:27 INFO mapred.JobClient: map 100% reduce 77%
> 08/01/18 00:27:28 INFO mapred.JobClient: map 100% reduce 78%
> 08/01/18 00:27:37 INFO mapred.JobClient: map 100% reduce 100%
> 08/01/18 00:27:38 INFO mapred.JobClient: Job complete: job_0023
> 08/01/18 00:27:38 INFO mapred.JobClient: Counters: 11
> 08/01/18 00:27:38 INFO mapred.JobClient: org.apache.hadoop.examples.WordCount$Counter
> 08/01/18 00:27:38 INFO mapred.JobClient: WORDS=13050362
> 08/01/18 00:27:38 INFO mapred.JobClient: VALUES=13976767
> 08/01/18 00:27:38 INFO mapred.JobClient: Map-Reduce Framework
> 08/01/18 00:27:38 INFO mapred.JobClient: Map input records=277434
> 08/01/18 00:27:38 INFO mapred.JobClient: Map output records=13050362
> 08/01/18 00:27:38 INFO mapred.JobClient: Map input bytes=67239230
> 08/01/18 00:27:38 INFO mapred.JobClient: Map output bytes=118620427
> 08/01/18 00:27:38 INFO mapred.JobClient: Combine input records=13050362
> 08/01/18 00:27:38 INFO mapred.JobClient: Combine output records=926405
> 08/01/18 00:27:38 INFO mapred.JobClient: Reduce input groups=709097
> 08/01/18 00:27:38 INFO mapred.JobClient: Reduce input records=926405
> 08/01/18 00:27:38 INFO mapred.JobClient: Reduce output records=709097