I'm running Nutch 1.3 on 64-bit Ubuntu. The commands I ran and the relevant output follow.
----------------------------------
llist@LeosLinux:~$ /usr/share/nutch/runtime/local/bin/nutch inject /home/llist/nutchData/crawl/crawldb /home/llist/nutchData/seed
Injector: starting at 2011-07-15 18:32:10
Injector: crawlDb: /home/llist/nutchData/crawl/crawldb
Injector: urlDir: /home/llist/nutchData/seed
Injector: Converting injected urls to crawl db entries.
Injector: Merging injected urls into crawl db.
Injector: finished at 2011-07-15 18:32:13, elapsed: 00:00:02
=================
llist@LeosLinux:~$ /usr/share/nutch/runtime/local/bin/nutch generate /home/llist/nutchData/crawl/crawldb /home/llist/nutchData/crawl/segments
Generator: starting at 2011-07-15 18:32:41
Generator: Selecting best-scoring urls due for fetch.
Generator: filtering: true
Generator: normalizing: true
Generator: jobtracker is 'local', generating exactly one partition.
Generator: Partitioning selected urls for politeness.
Generator: segment: /home/llist/nutchData/crawl/segments/20110715183244
Generator: finished at 2011-07-15 18:32:45, elapsed: 00:00:03
==================
llist@LeosLinux:~$ /usr/share/nutch/runtime/local/bin/nutch fetch /home/llist/nutchData/crawl/segments/20110715183244
Fetcher: Your 'http.agent.name' value should be listed first in 'http.robots.agents' property.
Fetcher: starting at 2011-07-15 18:34:55
Fetcher: segment: /home/llist/nutchData/crawl/segments/20110715183244
Fetcher: threads: 10
QueueFeeder finished: total 1 records + hit by time limit :0
fetching http://www.seek.com.au/
-finishing thread FetcherThread, activeThreads=1
-finishing thread FetcherThread, activeThreads=1
-finishing thread FetcherThread, activeThreads=1
-finishing thread FetcherThread, activeThreads=1
-finishing thread FetcherThread, activeThreads=2
-finishing thread FetcherThread, activeThreads=1
-finishing thread FetcherThread, activeThreads=1
-finishing thread FetcherThread, activeThreads=1
-finishing thread FetcherThread, activeThreads=1
-activeThreads=1, spinWaiting=0, fetchQueues.totalSize=0
-finishing thread FetcherThread, activeThreads=0
-activeThreads=0, spinWaiting=0, fetchQueues.totalSize=0
-activeThreads=0
Fetcher: finished at 2011-07-15 18:34:59, elapsed: 00:00:03
=================
llist@LeosLinux:~$ /usr/share/nutch/runtime/local/bin/nutch updatedb /home/llist/nutchData/crawl/crawldb -dir /home/llist/nutchData/crawl/segments/20110715183244
CrawlDb update: starting at 2011-07-15 18:36:00
CrawlDb update: db: /home/llist/nutchData/crawl/crawldb
CrawlDb update: segments: [file:/home/llist/nutchData/crawl/segments/20110715183244/crawl_fetch, file:/home/llist/nutchData/crawl/segments/20110715183244/crawl_generate, file:/home/llist/nutchData/crawl/segments/20110715183244/content]
CrawlDb update: additions allowed: true
CrawlDb update: URL normalizing: false
CrawlDb update: URL filtering: false
 - skipping invalid segment file:/home/llist/nutchData/crawl/segments/20110715183244/crawl_fetch
 - skipping invalid segment file:/home/llist/nutchData/crawl/segments/20110715183244/crawl_generate
 - skipping invalid segment file:/home/llist/nutchData/crawl/segments/20110715183244/content
CrawlDb update: Merging segment data into db.
CrawlDb update: finished at 2011-07-15 18:36:01, elapsed: 00:00:01
-----------------------------------
Appreciate any hints on what I'm missing.

