My original post was to the Nutch mailing list, but since the error was reported from a Hadoop class, I thought I might get some advice here.
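
For anyone who wants to double-check which classpath entries really contain the class, a minimal sketch could look like the following. It is in Python, and the function name `entries_containing_class` is made up for illustration; it is not part of Nutch or Hadoop. It only inspects jars and class directories on disk, and it says nothing about which classpath the JVM started by `hfs -text` actually uses.

```python
import os
import zipfile


def entries_containing_class(classpath, class_name):
    """Return the classpath entries (jars or directories) that contain class_name."""
    # A class such as org.apache.nutch.parse.Parse is stored as
    # org/apache/nutch/parse/Parse.class inside a jar or a class directory.
    resource = class_name.replace(".", "/") + ".class"
    hits = []
    for entry in classpath.split(os.pathsep):
        if os.path.isdir(entry):
            # Plain directory of compiled classes.
            if os.path.isfile(os.path.join(entry, *resource.split("/"))):
                hits.append(entry)
        elif os.path.isfile(entry) and zipfile.is_zipfile(entry):
            # .jar and .job files are zip archives.
            with zipfile.ZipFile(entry) as jar:
                if resource in jar.namelist():
                    hits.append(entry)
        # Missing files and unexpanded wildcards such as nutch-*.job are skipped.
    return hits
```

Calling `entries_containing_class(cp, "org.apache.nutch.parse.Parse")`, with `cp` set to the `-classpath` value quoted below, would list every entry that ships the class, which also shows whether more than one copy is present.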
Thanks

On Saturday, February 27, 2010, Ted Yu <[email protected]> wrote:
> Please disregard my previous email - the command was launched from incorrect
> directory.
>
> I don't see improvement for my latest run:
> [r...@snv-qa-lin-domain-crawler1 software]# hfs -text
> /user/tomcatadmin/lpm/15-100226111258118-tomcatadmin/parse/0/part-m-00000
> 10/02/27 07:36:28 INFO util.NativeCodeLoader: Loaded the native-hadoop library
> 10/02/27 07:36:28 INFO zlib.ZlibFactory: Successfully loaded & initialized
> native-zlib library
> 10/02/27 07:36:28 INFO compress.CodecPool: Got brand-new decompressor
> text: java.io.IOException: WritableName can't load class:
> org.apache.nutch.parse.Parse
>
> Here is the command line (see bold):
> 510 1255 38.3 0.1 1441444 62660 ? Sl 07:23 0:02
> /usr/local/jdk1.6.0_14/bin/java -Xmx1000m
> -Dhadoop.log.dir=/opt/kindsight/nutchbase/logs
> -Djavax.xml.parsers.DocumentBuilderFactory=com.sun.org.apache.xerces.internal.jaxp.DocumentBuilderFactoryImpl
> -Dhadoop.log.file=hadoop.log
> -Djava.library.path=/opt/kindsight/nutchbase/lib/native/Linux-amd64-64
> -classpath
>
/opt/kindsight/nutchbase:/opt/kindsight/nutchbase/conf:/opt/kindsight/nutchbase/conf/batchclient:/opt/kindsight/nutchbase/lib/batchplatform.jar:/opt/kindsight/nutchbase/lib/colo_common.jar:/opt/kindsight/nutchbase/lib/csreader.jar:/opt/kindsight/nutchbase/lib/pr_common.jar:/opt/kindsight/nutchbase/lib/nutch-1.0.job:/opt/kindsight/nutchbase/lib/3rdparty/commons-collections-3.1.jar:/opt/kindsight/nutchbase/lib/3rdparty/servlet-api.jar:/opt/kindsight/nutchbase/lib/3rdparty/lucene-misc-2.4.0.jar:/opt/kindsight/nutchbase/lib/3rdparty/tika-0.1-incubating.jar:/opt/kindsight/nutchbase/lib/3rdparty/junit-3.8.1.jar:/opt/kindsight/nutchbase/lib/3rdparty/oozie-core-0.20.0.o0.1-SNAPSHOT.jar:/opt/kindsight/nutchbase/lib/3rdparty/hbase-0.20.1.jar:/opt/kindsight/nutchbase/lib/3rdparty/lucene-core-2.4.0.jar:/opt/kindsight/nutchbase/lib/3rdparty/apache-solr-solrj-1.3.0.jar:/opt/kindsight/nutchbase/lib/3rdparty/json_simple-1.1.jar:/opt/kindsight/nutchbase/lib/3rdparty/xerces-2_6_2.jar:/opt/kindsight/nutchbase/lib/3rdparty/jetty-5.1.4.jar:/opt/kindsight/nutchbase/lib/3rdparty/jets3t-0.6.1.jar:/opt/kindsight/nutchbase/lib/3rdparty/commons-lang-2.1.jar:/opt/kindsight/nutchbase/lib/3rdparty/xerces-2_6_2-apis.jar:/opt/kindsight/nutchbase/lib/3rdparty/apache-solr-common-1.3.0.jar:/opt/kindsight/nutchbase/lib/3rdparty/jdom-1.0.jar:/opt/kindsight/nutchbase/lib/3rdparty/commons-fileupload-1.3-SNAPSHOT.jar:/opt/kindsight/nutchbase/lib/3rdparty/commons-httpclient-3.1.jar:/opt/kindsight/nutchbase/lib/3rdparty/commons-logging-1.0.4.jar:/opt/kindsight/nutchbase/lib/3rdparty/commons-logging-api-1.0.4.jar:/opt/kindsight/nutchbase/lib/3rdparty/commons-beanutils-1.8.0.jar:/opt/kindsight/nutchbase/lib/3rdparty/nutch-1.0.jar:/opt/kindsight/nutchbase/lib/3rdparty/commons-io-1.3.2.jar:/opt/kindsight/nutchbase/lib/3rdparty/icu4j-4_0_1.jar:/opt/kindsight/nutchbase/lib/3rdparty/log4j-1.2.15.jar:/opt/kindsight/nutchbase/lib/3rdparty/oozie-client-0.20.0.o0.1-SNAPSHOT.jar:/opt/kindsight/nutchbase/lib/3rdparty/ba
tch/hbase-0.20.1.jar:/opt/kindsight/nutchbase/lib/3rdparty/batch/nutch-1.0.jar:/opt/kindsight/nutchbase/lib/3rdparty/batch/hadoop-core.jar:/opt/kindsight/nutchbase/lib/3rdparty/commons-logging-1.1.1.jar:/opt/kindsight/nutchbase/lib/3rdparty/commons-codec-1.3.jar:/opt/kindsight/nutchbase/lib/3rdparty/jakarta-oro-2.0.8.jar:/opt/kindsight/nutchbase/lib/3rdparty/hsqldb-1.8.0.7.jar:/opt/kindsight/nutchbase/lib/3rdparty/commons-pool-1.4.jar:/opt/kindsight/nutchbase/lib/3rdparty/commons-dbcp-1.2.2.jar:/opt/kindsight/nutchbase/lib/3rdparty/mysql-connector-java-5.1.10-bin.jar:/opt/kindsight/nutchbase/lib/3rdparty/commons-httpclient-3.0.1.jar:/opt/kindsight/nutchbase/lib/3rdparty/zookeeper-3.2.1.jar:/opt/kindsight/nutchbase/lib/3rdparty/taglibs-i18n.jar:/opt/kindsight/nutchbase/lib/3rdparty/commons-collections-3.2.1.jar:/opt/kindsight/nutchbase/lib/3rdparty/commons-cli-1.2.jar:/usr/local/jdk1.6.0_14/lib/tools.jar:/opt/kindsight/nutchbase/build/nutch-*.job:/opt/kindsight/nutchbase/nutch-*.job:/opt/kindsight/nutchbase/lib/batchplatform.jar:/opt/kindsight/nutchbase/lib/colo_common.jar:/opt/kindsight/nutchbase/lib/csreader.jar:/opt/kindsight/nutchbase/lib/pr_common.jar:/opt/kindsight/nutchbase/lib/jetty-ext/*.jar
> com.rialto.nutchbase.fetcher.Fetcher -libjars
> /opt/kindsight/nutchbase/lib/3rdparty/nutch-1.0.jar,/opt/kindsight/nutchbase/lib/3rdparty/hbase-0.20.1.jar
> -D db.max.outlinks.per.page=1000 domaincrawltable
> lpm/15-100226111258118-tomcatadmin/generate/0
> lpm/15-100226111258118-tomcatadmin/parse/0 -threads 10 -actionid
> 15-100226111258118-tomcatad...@domain_crawl
>
> On Sat, Feb 27, 2010 at 7:29 AM, Ted Yu <[email protected]> wrote:
>
> Now I see this in the log:
> [r...@snv-qa-lin-domain-crawler1 webmap_workflow]# hfs -text
> /user/tomcatadmin/lpm/15-100226111258118-tomcatadmin/generate/0/part-r-00000
> 2010-02-27 07:25:08,062 WARN [main] conf.Configuration DEPRECATED:
> hadoop-site.xml found in the classpath. Usage of hadoop-site.xml is
> deprecated. Instead use core-site.xml, mapred-site.xml and hdfs-site.xml to
> override properties of core-default.xml, mapred-default.xml and
> hdfs-default.xml respectively
> 2010-02-27 07:25:08,062 WARN [main] conf.Configuration DEPRECATED:
> hadoop-site.xml found in the classpath. Usage of hadoop-site.xml is
> deprecated. Instead use core-site.xml, mapred-site.xml and hdfs-site.xml to
> override properties of core-default.xml, mapred-default.xml and
> hdfs-default.xml respectively
> 2010-02-27 07:25:08,342 WARN [main] util.NativeCodeLoader Unable to load
> native-hadoop library for your platform... using builtin-java classes where
> applicable
> 2010-02-27 07:25:08,342 WARN [main] util.NativeCodeLoader Unable to load
> native-hadoop library for your platform... using builtin-java classes where
> applicable
> 2010-02-27 07:25:08,342 INFO [main] compress.CodecPool Got brand-new
> decompressor
> 2010-02-27 07:25:08,342 INFO [main] compress.CodecPool Got brand-new
> decompressor
> text: null
>
> But we do have native library as specified by
> -Djava.library.path=/opt/kindsight/nutchbase/lib/native/Linux-amd64-64:
>
> [r...@snv-qa-lin-domain-crawler1 webmap_workflow]# ls
> /opt/kindsight/nutchbase/lib/native/Linux-amd64-64/
> libhadoop.a libhadoop.la libhadoop.so libhadoop.so.1 libhadoop.so.1.0.0
>
> On Sat, Feb 27, 2010 at 5:52 AM, Julien Nioche
> <[email protected]> wrote:
>
> Look at the Hadoop option -libjars and use it to point to the nutch-1.0.jar,
> that should work
> J.
>
> On 27 February 2010 13:08, Ted Yu <[email protected]> wrote:
>
>> Hi,
>> We use nutch to perform domain crawl but I see strange 'can't load class'
>> error:
>>
>> [r...@snv-qa-lin-domain-crawler1 software]# hfs -text
>> /user/tomcatadmin/lpm/12-100226111258118-tomcatadmin/parse/0/part-m-00000
>> 10/02/27 04:45:10 INFO util.NativeCodeLoader: Loaded the native-hadoop
>> library
>> 10/02/27 04:45:10 INFO zlib.ZlibFactory: Successfully loaded & initialized
>> native-zlib library
>> 10/02/27 04:45:10 INFO compress.CodecPool: Got brand-new decompressor
>> text: java.io.IOException: WritableName can't load class:
>> org.apache.nutch.parse.Parse
>>
>> Here is the commandline which includes nutch-1.0.jar that contains
>> org.apache.nutch.parse.Parse (see bold):
>>
>> 510 32488 1.3 0.1 1370060 53264 ? Sl 04:35 0:02
>> /usr/local/jdk1.6.0_14/bin/java -Xmx1000m
>> -Dhadoop.log.dir=/opt/kindsight/nutchbase/logs
>> -Djavax.xml.parsers.DocumentBuilderFactory=com.sun.org.apache.xerces.internal.jaxp.DocumentBuilderFactoryImpl
>> -Dhadoop.log.file=hadoop.log
>> -Djava.library.path=/opt/kindsight/nutchbase/lib/native/Linux-amd64-64
>> -classpath
>>
/opt/kindsight/nutchbase:/opt/kindsight/nutchbase/conf:/opt/kindsight/nutchbase/conf/batchclient:/opt/kindsight/nutchbase/lib/batchplatform.jar:/opt/kindsight/nutchbase/lib/colo_common.jar:/opt/kindsight/nutchbase/lib/csreader.jar:/opt/kindsight/nutchbase/lib/pr_common.jar:/opt/kindsight/nutchbase/lib/nutch-1.0.job:/opt/kindsight/nutchbase/lib/3rdparty/commons-collections-3.1.jar:/opt/kindsight/nutchbase/lib/3rdparty/servlet-api.jar:/opt/kindsight/nutchbase/lib/3rdparty/lucene-misc-2.4.0.jar:/opt/kindsight/nutchbase/lib/3rdparty/tika-0.1-incubating.jar:/opt/kindsight/nutchbase/lib/3rdparty/junit-3.8.1.jar:/opt/kindsight/nutchbase/lib/3rdparty/oozie-core-0.20.0.o0.1-SNAPSHOT.jar:/opt/kindsight/nutchbase/lib/3rdparty/hbase-0.20.1.jar:/opt/kindsight/nutchbase/lib/3rdparty/lucene-core-2.4.0.jar:/opt/kindsight/nutchbase/lib/3rdparty/apache-solr-solrj-1.3.0.jar:/opt/kindsight/nutchbase/lib/3rdparty/json_simple-1.1.jar:/opt/kindsight/nutchbase/lib/3rdparty/xerces-2_6_2.jar:/opt/kindsight/nutchbase/lib/3rdparty/jetty-5.1.4.jar:/opt/kindsight/nutchbase/lib/3rdparty/jets3t-0.6.1.jar:/opt/kindsight/nutchbase/lib/3rdparty/commons-lang-2.1.jar:/opt/kindsight/nutchbase/lib/3rdparty/xerces-2_6_2-apis.jar:/opt/kindsight/nutchbase/lib/3rdparty/apache-solr-common-1.3.0.jar:/opt/kindsight/nutchbase/lib/3rdparty/jdom-1.0.jar:/opt/kindsight/nutchbase/lib/3rdparty/commons-fileupload-1.3-SNAPSHOT.jar:/opt/kindsight/nutchbase/lib/3rdparty/commons-httpclient-3.1.jar:/opt/kindsight/nutchbase/lib/3rdparty/commons-logging-1.0.4.jar:/opt/kindsight/nutchbase/lib/3rdparty/commons-logging-api-1.0.4.jar:/opt/kindsight/nutchbase/lib/3rdparty/commons-beanutils-1.8.0.jar: >> >> 
*/opt/kindsight/nutchbase/lib/3rdparty/nutch-1.0.jar*:/opt/kindsight/nutchbase/lib/3rdparty/commons-io-1.3.2.jar:/opt/kindsight/nutchbase/lib/3rdparty/icu4j-4_0_1.jar:/opt/kindsight/nutchbase/lib/3rdparty/log4j-1.2.15.jar:/opt/kindsight/nutchbase/lib/3rdparty/oozie-client-0.20.0.o0.1-SNAPSHOT.jar:/opt/kindsight/nutchbase/lib/3rdparty/batch/hadoop-core.jar:/opt/kindsight/nutchbase/lib/3rdparty/commons-logging-1.1.1.jar:/opt/kindsight/nutchbase/lib/3rdparty/commons-codec-1.3.jar:/opt/kindsight/nutchbase/lib/3rdparty/jakarta-oro-2.0.8.jar:/opt/kindsight/nutchbase/lib/3rdparty/hsqldb-1.8.0.7.jar:/opt/kindsight/nutchbase/lib/3rdparty/commons-pool-1.4.jar:/opt/kindsight/nutchbase/lib/3rdparty/commons-dbcp-1.2.2.jar:/opt/kindsight/nutchbase/lib/3rdparty/mysql-connector-java-5.1.10-bin.jar:/opt/kindsight/nutchbase/lib/3rdparty/commons-httpclient-3.0.1.jar:/opt/kindsight/nutchbase/lib/3rdparty/zo >
