I have a problem running the injector in nutch-1.4 on Hadoop; the same command works fine with nutch-1.3. As you can see, the list of URLs is loaded from HDFS correctly (Map input records=66906), but there are no records in the map output (Map output records=0). Could this be caused by broken URL filtering?

ponto:(crawler)runtime/deploy>bin/nutch inject /czcrawl/db /czcrawl/seeds
11/10/13 17:56:25 INFO crawl.Injector: Injector: starting at 2011-10-13 17:56:25
11/10/13 17:56:25 INFO crawl.Injector: Injector: crawlDb: /czcrawl/db
11/10/13 17:56:25 INFO crawl.Injector: Injector: urlDir: /czcrawl/seeds
11/10/13 17:56:25 INFO crawl.Injector: Injector: Converting injected urls to crawl db entries.
11/10/13 17:56:28 INFO mapred.FileInputFormat: Total input paths to process : 1
11/10/13 17:56:29 INFO mapred.JobClient: Running job: job_201110091645_0032
11/10/13 17:56:30 INFO mapred.JobClient:  map 0% reduce 0%
11/10/13 17:56:52 INFO mapred.JobClient:  map 50% reduce 0%
11/10/13 17:56:53 INFO mapred.JobClient:  map 100% reduce 0%
11/10/13 17:57:05 INFO mapred.JobClient:  map 100% reduce 100%
11/10/13 17:57:10 INFO mapred.JobClient: Job complete: job_201110091645_0032
11/10/13 17:57:10 INFO mapred.JobClient: Counters: 27
11/10/13 17:57:10 INFO mapred.JobClient:   Job Counters
11/10/13 17:57:10 INFO mapred.JobClient:     Launched reduce tasks=1
11/10/13 17:57:10 INFO mapred.JobClient:     SLOTS_MILLIS_MAPS=20455
11/10/13 17:57:10 INFO mapred.JobClient:     Total time spent by all reduces waiting after reserving slots (ms)=0
11/10/13 17:57:10 INFO mapred.JobClient:     Total time spent by all maps waiting after reserving slots (ms)=0
11/10/13 17:57:10 INFO mapred.JobClient:     Rack-local map tasks=1
11/10/13 17:57:10 INFO mapred.JobClient:     Launched map tasks=2
11/10/13 17:57:10 INFO mapred.JobClient:     Data-local map tasks=1
11/10/13 17:57:10 INFO mapred.JobClient:     SLOTS_MILLIS_REDUCES=10356
11/10/13 17:57:10 INFO mapred.JobClient:   File Input Format Counters
11/10/13 17:57:10 INFO mapred.JobClient:     Bytes Read=1283144
11/10/13 17:57:10 INFO mapred.JobClient:   File Output Format Counters
11/10/13 17:57:10 INFO mapred.JobClient:     Bytes Written=86
11/10/13 17:57:10 INFO mapred.JobClient:   FileSystemCounters
11/10/13 17:57:10 INFO mapred.JobClient:     FILE_BYTES_READ=6
11/10/13 17:57:10 INFO mapred.JobClient:     HDFS_BYTES_READ=1283358
11/10/13 17:57:10 INFO mapred.JobClient:     FILE_BYTES_WRITTEN=89486
11/10/13 17:57:10 INFO mapred.JobClient:     HDFS_BYTES_WRITTEN=86
11/10/13 17:57:10 INFO mapred.JobClient:   Map-Reduce Framework
11/10/13 17:57:10 INFO mapred.JobClient:     Map output materialized bytes=12
11/10/13 17:57:10 INFO mapred.JobClient:     Map input records=66906
11/10/13 17:57:10 INFO mapred.JobClient:     Reduce shuffle bytes=6
11/10/13 17:57:10 INFO mapred.JobClient:     Spilled Records=0
11/10/13 17:57:10 INFO mapred.JobClient:     Map output bytes=0
11/10/13 17:57:10 INFO mapred.JobClient:     Map input bytes=1280141
11/10/13 17:57:10 INFO mapred.JobClient:     Combine input records=0
11/10/13 17:57:10 INFO mapred.JobClient:     SPLIT_RAW_BYTES=214
11/10/13 17:57:10 INFO mapred.JobClient:     Reduce input records=0
11/10/13 17:57:10 INFO mapred.JobClient:     Reduce input groups=0
11/10/13 17:57:10 INFO mapred.JobClient:     Combine output records=0
11/10/13 17:57:10 INFO mapred.JobClient:     Reduce output records=0
11/10/13 17:57:10 INFO mapred.JobClient:     Map output records=0
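
To make the discrepancy easier to see, the counters can be pulled out of the JobClient output with a small script. This is only a sketch: the names `COUNTER_RE` and `parse_counters` are made up for illustration, the sample lines are copied from the log above, and it only shows that the mapper dropped every record. It does not tell me why.

```python
import re

# Hypothetical helper pattern: matches a "name=value" counter at the end
# of a JobClient log line.
COUNTER_RE = re.compile(
    r"INFO mapred\.JobClient:\s+(?P<name>[^=]+?)=(?P<value>\d+)\s*$"
)


def parse_counters(log_text):
    """Return a dict mapping counter name to its integer value."""
    counters = {}
    for line in log_text.splitlines():
        match = COUNTER_RE.search(line)
        if match:
            counters[match.group("name").strip()] = int(match.group("value"))
    return counters


# Three lines taken verbatim from the job output above.
SAMPLE = """\
11/10/13 17:57:10 INFO mapred.JobClient:     Map input records=66906
11/10/13 17:57:10 INFO mapred.JobClient:     Map output records=0
11/10/13 17:57:10 INFO mapred.JobClient:     Reduce output records=0
"""

counters = parse_counters(SAMPLE)
dropped = counters["Map input records"] - counters["Map output records"]
print(f"mapper dropped {dropped} of {counters['Map input records']} records")
```

With the sample lines this prints `mapper dropped 66906 of 66906 records`, i.e. every seed URL was read but none was emitted by the map phase.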
