Dear Wiki user, You have subscribed to a wiki page or wiki category on "Nutch Wiki" for change notification.
The "JavaDemoApplication" page has been changed by Cristian Vulpe.
http://wiki.apache.org/nutch/JavaDemoApplication?action=diff&rev1=8&rev2=9

--------------------------------------------------

  }}}
  Place this copy of nutch-site.xml and a copy of common-terms.utf8 (from the conf directory in the Nutch distribution) in the WEB-INF/classes directory of the web application that you're deploying. For a standalone application, the mentioned files have to be available in the classpath.
  
- You also need to make sure that the following jars are placed in WEB-INF/lib:
+ You also need to make sure that the following jars are placed in WEB-INF/lib (this assumes usage of Nutch 0.9):
  
  {{{
  commons-cli-2.0-SNAPSHOT.jar
@@ -42, +42 @@

  lucene-misc-2.2.0.jar
  nutch-0.9.jar
  }}}
+ 
+ For a standalone application, one might want to use Apache maven (this configuration assumes Nutch 1.1). At the moment of writing this note, Nutch does not publish its artifacts to maven. However we (members of community) hope that maven support will be added soon. In the meantime, just install the nutch-1.1.jar to your maven repository. Here is a snippet that will manage the dependencies that you need to run this example (note that the 1.1-XXX version of Nutch marks the fact that the artifact cannot be found in any public repository yet):
+ 
+ {{{
+ <dependency>
+   <groupId>org.apache.nutch</groupId>
+   <artifactId>nutch</artifactId>
+   <version>1.1-XXX</version>
+ </dependency>
+ 
+ <dependency>
+   <groupId>org.apache.hadoop</groupId>
+   <artifactId>hadoop-core</artifactId>
+   <version>0.20.2</version>
+ </dependency>
+ 
+ <dependency>
+   <groupId>org.apache.lucene</groupId>
+   <artifactId>lucene-core</artifactId>
+   <version>3.0.1</version>
+   <scope>runtime</scope>
+ </dependency>
+ 
+ <dependency>
+   <groupId>org.apache.lucene</groupId>
+   <artifactId>lucene-misc</artifactId>
+   <version>3.0.1</version>
+   <scope>runtime</scope>
+ </dependency>
+ 
+ <dependency>
+   <groupId>commons-lang</groupId>
+   <artifactId>commons-lang</artifactId>
+   <version>2.1</version>
+   <scope>runtime</scope>
+ </dependency>
+ }}}
+ 
  == Sample code ==
  With that, all is ready and we can now write some simple code to search. A quick example in Java to search the crawl index and return the number of hits found is:

