Perhaps I have missed something important, but I cannot find a way to build an index in Nutch 1.3, because the index command no longer seems to exist.
Is there a new way to do this?

I tried to run:

root@hrz-vm180:/home/nutchServer/nutch/runtime/local/bin# ./nutch index crawl/indexes crawl/crawldb/ crawl/linkdb/ crawl/segments/*
Exception in thread "main" java.lang.NoClassDefFoundError: index
Caused by: java.lang.ClassNotFoundException: index
        at java.net.URLClassLoader$1.run(URLClassLoader.java:202)
        at java.security.AccessController.doPrivileged(Native Method)
        at java.net.URLClassLoader.findClass(URLClassLoader.java:190)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:307)
        at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:301)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:248)
Could not find the main class: index.  Program will exit.

Running nutch without any arguments then showed that the index command is missing from the list of commands:

root@hrz-vm180:/home/nutchServer/nutch/runtime/local/bin# ./nutch
Usage: nutch [-core] COMMAND
where COMMAND is one of:
  crawl             one-step crawler for intranets
  readdb            read / dump crawl db
  convdb            convert crawl db from pre-0.9 format
  mergedb           merge crawldb-s, with optional filtering
  readlinkdb        read / dump link db
  inject            inject new urls into the database
  generate          generate new segments to fetch from crawl db
  freegen           generate new segments to fetch from text files
  fetch             fetch a segment's pages
  parse             parse a segment's pages
  readseg           read / dump segment data
  mergesegs         merge several segments, with optional filtering and slicing
  updatedb          update crawl db from segments after fetching
  invertlinks       create a linkdb from parsed segments
  mergelinkdb       merge linkdb-s, with optional filtering
  solrindex         run the solr indexer on parsed segments and linkdb
  solrdedup         remove duplicates from solr
  solrclean         remove HTTP 301 and 404 documents from solr
  plugin            load a plugin and run one of its classes main()
 or
  CLASSNAME         run the class named CLASSNAME
Most commands print help when invoked w/o parameters.

Expert: -core option is for developers only. It avoids building the job jar,
        instead it simply includes classes compiled with ant compile-core.
        NOTE: this works only for jobs executed in 'local' mode
