Please disregard my previous email - the command was launched from the wrong
directory.

I don't see any improvement in my latest run:
[r...@snv-qa-lin-domain-crawler1 software]# hfs -text
/user/tomcatadmin/lpm/15-100226111258118-tomcatadmin/parse/0/part-m-00000
10/02/27 07:36:28 INFO util.NativeCodeLoader: Loaded the native-hadoop
library
10/02/27 07:36:28 INFO zlib.ZlibFactory: Successfully loaded & initialized
native-zlib library
10/02/27 07:36:28 INFO compress.CodecPool: Got brand-new decompressor
text: java.io.IOException: WritableName can't load class:
org.apache.nutch.parse.Parse

Here is the command line (see bold):
510       1255 38.3  0.1 1441444 62660 ?       Sl   07:23   0:02
/usr/local/jdk1.6.0_14/bin/java -Xmx1000m
-Dhadoop.log.dir=/opt/kindsight/nutchbase/logs
-Djavax.xml.parsers.DocumentBuilderFactory=com.sun.org.apache.xerces.internal.jaxp.DocumentBuilderFactoryImpl
-Dhadoop.log.file=hadoop.log
-Djava.library.path=/opt/kindsight/nutchbase/lib/native/Linux-amd64-64
-classpath
/opt/kindsight/nutchbase:/opt/kindsight/nutchbase/conf:/opt/kindsight/nutchbase/conf/batchclient:/opt/kindsight/nutchbase/lib/batchplatform.jar:/opt/kindsight/nutchbase/lib/colo_common.jar:/opt/kindsight/nutchbase/lib/csreader.jar:/opt/kindsight/nutchbase/lib/pr_common.jar:/opt/kindsight/nutchbase/lib/nutch-1.0.job:/opt/kindsight/nutchbase/lib/3rdparty/commons-collections-3.1.jar:/opt/kindsight/nutchbase/lib/3rdparty/servlet-api.jar:/opt/kindsight/nutchbase/lib/3rdparty/lucene-misc-2.4.0.jar:/opt/kindsight/nutchbase/lib/3rdparty/tika-0.1-incubating.jar:/opt/kindsight/nutchbase/lib/3rdparty/junit-3.8.1.jar:/opt/kindsight/nutchbase/lib/3rdparty/oozie-core-0.20.0.o0.1-SNAPSHOT.jar:/opt/kindsight/nutchbase/lib/3rdparty/hbase-0.20.1.jar:/opt/kindsight/nutchbase/lib/3rdparty/lucene-core-2.4.0.jar:/opt/kindsight/nutchbase/lib/3rdparty/apache-solr-solrj-1.3.0.jar:/opt/kindsight/nutchbase/lib/3rdparty/json_simple-1.1.jar:/opt/kindsight/nutchbase/lib/3rdparty/xerces-2_6_2.jar:/opt/kindsight/nutchbase/lib/3rdparty/jetty-5.1.4.jar:/opt/kindsight/nutchbase/lib/3rdparty/jets3t-0.6.1.jar:/opt/kindsight/nutchbase/lib/3rdparty/commons-lang-2.1.jar:/opt/kindsight/nutchbase/lib/3rdparty/xerces-2_6_2-apis.jar:/opt/kindsight/nutchbase/lib/3rdparty/apache-solr-common-1.3.0.jar:/opt/kindsight/nutchbase/lib/3rdparty/jdom-1.0.jar:/opt/kindsight/nutchbase/lib/3rdparty/commons-fileupload-1.3-SNAPSHOT.jar:/opt/kindsight/nutchbase/lib/3rdparty/commons-httpclient-3.1.jar:/opt/kindsight/nutchbase/lib/3rdparty/commons-logging-1.0.4.jar:/opt/kindsight/nutchbase/lib/3rdparty/commons-logging-api-1.0.4.jar:/opt/kindsight/nutchbase/lib/3rdparty/commons-beanutils-1.8.0.jar:/opt/kindsight/nutchbase/lib/3rdparty/nutch-1.0.jar:/opt/kindsight/nutchbase/lib/3rdparty/commons-io-1.3.2.jar:/opt/kindsight/nutchbase/lib/3rdparty/icu4j-4_0_1.jar:/opt/kindsight/nutchbase/lib/3rdparty/log4j-1.2.15.jar:/opt/kindsight/nutchbase/lib/3rdparty/oozie-client-0.20.0.o0.1-SNAPSHOT.jar:/opt/kindsight/nutchbase/lib/3rdparty/ba
tch/hbase-0.20.1.jar:/opt/kindsight/nutchbase/lib/3rdparty/batch/nutch-1.0.jar:/opt/kindsight/nutchbase/lib/3rdparty/batch/hadoop-core.jar:/opt/kindsight/nutchbase/lib/3rdparty/commons-logging-1.1.1.jar:/opt/kindsight/nutchbase/lib/3rdparty/commons-codec-1.3.jar:/opt/kindsight/nutchbase/lib/3rdparty/jakarta-oro-2.0.8.jar:/opt/kindsight/nutchbase/lib/3rdparty/hsqldb-1.8.0.7.jar:/opt/kindsight/nutchbase/lib/3rdparty/commons-pool-1.4.jar:/opt/kindsight/nutchbase/lib/3rdparty/commons-dbcp-1.2.2.jar:/opt/kindsight/nutchbase/lib/3rdparty/mysql-connector-java-5.1.10-bin.jar:/opt/kindsight/nutchbase/lib/3rdparty/commons-httpclient-3.0.1.jar:/opt/kindsight/nutchbase/lib/3rdparty/zookeeper-3.2.1.jar:/opt/kindsight/nutchbase/lib/3rdparty/taglibs-i18n.jar:/opt/kindsight/nutchbase/lib/3rdparty/commons-collections-3.2.1.jar:/opt/kindsight/nutchbase/lib/3rdparty/commons-cli-1.2.jar:/usr/local/jdk1.6.0_14/lib/tools.jar:/opt/kindsight/nutchbase/build/nutch-*.job:/opt/kindsight/nutchbase/nutch-*.job:/opt/kindsight/nutchbase/lib/batchplatform.jar:/opt/kindsight/nutchbase/lib/colo_common.jar:/opt/kindsight/nutchbase/lib/csreader.jar:/opt/kindsight/nutchbase/lib/pr_common.jar:/opt/kindsight/nutchbase/lib/jetty-ext/*.jar
com.rialto.nutchbase.fetcher.Fetcher *-libjars
/opt/kindsight/nutchbase/lib/3rdparty/nutch-1.0.jar,/opt/kindsight/nutchbase/lib/3rdparty/hbase-0.20.1.jar
* -D db.max.outlinks.per.page=1000 domaincrawltable
lpm/15-100226111258118-tomcatadmin/generate/0
lpm/15-100226111258118-tomcatadmin/parse/0 -threads 10 -actionid
15-100226111258118-tomcatad...@domain_crawl
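
The "can't load class" message above is printed by the JVM that runs hfs -text, which is a separate process from the Fetcher, so it may be worth checking which classpath can actually see org.apache.nutch.parse.Parse. Below is a minimal sketch of such a check in plain Java; the class name ClassLocator and the method locate are hypothetical helpers and are not part of Hadoop or Nutch.

```java
import java.security.CodeSource;

// Hypothetical diagnostic helper (not part of Hadoop or Nutch).
final class ClassLocator {

    /** Returns where the named class was loaded from, or null if it is not visible. */
    static String locate(String className) {
        try {
            // initialize=false: resolve the class without running its static initializers
            Class<?> c = Class.forName(className, false,
                    Thread.currentThread().getContextClassLoader());
            CodeSource src = c.getProtectionDomain().getCodeSource();
            if (src == null || src.getLocation() == null) {
                // JDK core classes typically report no code source
                return "bootstrap class path";
            }
            return src.getLocation().toString();
        } catch (ClassNotFoundException e) {
            return null;
        } catch (LinkageError e) {
            // class file found, but one of its dependencies could not be linked
            return null;
        }
    }

    public static void main(String[] args) {
        // Defaults to the class named in the error message above
        String name = args.length > 0 ? args[0] : "org.apache.nutch.parse.Parse";
        String where = locate(name);
        System.out.println(where == null
                ? name + " is NOT visible on this classpath"
                : name + " loaded from " + where);
    }
}
```

Running this once with the -classpath shown above and once with whatever classpath the hfs wrapper builds (an assumption, since that wrapper is not shown here) would indicate whether nutch-1.0.jar is missing only on the reading side.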

On Sat, Feb 27, 2010 at 7:29 AM, Ted Yu <yuzhih...@gmail.com> wrote:

> Now I see this in the log:
> [r...@snv-qa-lin-domain-crawler1 webmap_workflow]# hfs -text
> /user/tomcatadmin/lpm/15-100226111258118-tomcatadmin/generate/0/part-r-00000
> 2010-02-27 07:25:08,062 WARN  [main] conf.Configuration DEPRECATED:
> hadoop-site.xml found in the classpath. Usage of hadoop-site.xml is
> deprecated. Instead use core-site.xml, mapred-site.xml and hdfs-site.xml to
> override properties of core-default.xml, mapred-default.xml and
> hdfs-default.xml respectively
> 2010-02-27 07:25:08,062 WARN  [main] conf.Configuration DEPRECATED:
> hadoop-site.xml found in the classpath. Usage of hadoop-site.xml is
> deprecated. Instead use core-site.xml, mapred-site.xml and hdfs-site.xml to
> override properties of core-default.xml, mapred-default.xml and
> hdfs-default.xml respectively
> *2010-02-27 07:25:08,342 WARN  [main] util.NativeCodeLoader Unable to load
> native-hadoop library for your platform... using builtin-java classes where
> applicable
> *2010-02-27 07:25:08,342 WARN  [main] util.NativeCodeLoader Unable to load
> native-hadoop library for your platform... using builtin-java classes where
> applicable
> 2010-02-27 07:25:08,342 INFO  [main] compress.CodecPool Got brand-new
> decompressor
> 2010-02-27 07:25:08,342 INFO  [main] compress.CodecPool Got brand-new
> decompressor
> text: null
>
> But we do have the native library in the directory specified by
> -Djava.library.path=/opt/kindsight/nutchbase/lib/native/Linux-amd64-64:
>
> [r...@snv-qa-lin-domain-crawler1 webmap_workflow]# ls
> /opt/kindsight/nutchbase/lib/native/Linux-amd64-64/
> libhadoop.a  libhadoop.la  libhadoop.so  libhadoop.so.1
> libhadoop.so.1.0.0
>
>
>
> On Sat, Feb 27, 2010 at 5:52 AM, Julien Nioche <
> lists.digitalpeb...@gmail.com> wrote:
>
>> Look at the Hadoop option -libjars and use it to point to
>> nutch-1.0.jar; that should work.
>> J.
>>
>> On 27 February 2010 13:08, Ted Yu <yuzhih...@gmail.com> wrote:
>>
>> > Hi,
>> > We use Nutch to perform a domain crawl, but I see a strange
>> > 'can't load class' error:
>> >
>> > [r...@snv-qa-lin-domain-crawler1 software]# hfs -text
>> > /user/tomcatadmin/lpm/12-100226111258118-tomcatadmin/parse/0/part-m-00000
>> > 10/02/27 04:45:10 INFO util.NativeCodeLoader: Loaded the native-hadoop
>> > library
>> > 10/02/27 04:45:10 INFO zlib.ZlibFactory: Successfully loaded & initialized
>> > native-zlib library
>> > 10/02/27 04:45:10 INFO compress.CodecPool: Got brand-new decompressor
>> > text: java.io.IOException: WritableName can't load class:
>> > org.apache.nutch.parse.Parse
>> >
>> > Here is the command line, which includes nutch-1.0.jar, the jar that
>> > contains org.apache.nutch.parse.Parse (see bold):
>> >
>> > 510      32488  1.3  0.1 1370060 53264 ?       Sl   04:35   0:02
>> > /usr/local/jdk1.6.0_14/bin/java -Xmx1000m
>> >  -Dhadoop.log.dir=/opt/kindsight/nutchbase/logs
>> >
>> >
>> -Djavax.xml.parsers.DocumentBuilderFactory=com.sun.org.apache.xerces.internal.jaxp.DocumentBuilderFactoryImpl
>> > -Dhadoop.log.file=hadoop.log
>> > -Djava.library.path=/opt/kindsight/nutchbase/lib/native/Linux-amd64-64
>> > -classpath
>> >
>> >
>> /opt/kindsight/nutchbase:/opt/kindsight/nutchbase/conf:/opt/kindsight/nutchbase/conf/batchclient:/opt/kindsight/nutchbase/lib/batchplatform.jar:/opt/kindsight/nutchbase/lib/colo_common.jar:/opt/kindsight/nutchbase/lib/csreader.jar:/opt/kindsight/nutchbase/lib/pr_common.jar:/opt/kindsight/nutchbase/lib/nutch-1.0.job:/opt/kindsight/nutchbase/lib/3rdparty/commons-collections-3.1.jar:/opt/kindsight/nutchbase/lib/3rdparty/servlet-api.jar:/opt/kindsight/nutchbase/lib/3rdparty/lucene-misc-2.4.0.jar:/opt/kindsight/nutchbase/lib/3rdparty/tika-0.1-incubating.jar:/opt/kindsight/nutchbase/lib/3rdparty/junit-3.8.1.jar:/opt/kindsight/nutchbase/lib/3rdparty/oozie-core-0.20.0.o0.1-SNAPSHOT.jar:/opt/kindsight/nutchbase/lib/3rdparty/hbase-0.20.1.jar:/opt/kindsight/nutchbase/lib/3rdparty/lucene-core-2.4.0.jar:/opt/kindsight/nutchbase/lib/3rdparty/apache-solr-solrj-1.3.0.jar:/opt/kindsight/nutchbase/lib/3rdparty/json_simple-1.1.jar:/opt/kindsight/nutchbase/lib/3rdparty/xerces-2_6_2.jar:/opt/kindsight/nutchbase/lib/3rdparty/jetty-5.1.4.jar:/opt/kindsight/nutchbase/lib/3rdparty/jets3t-0.6.1.jar:/opt/kindsight/nutchbase/lib/3rdparty/commons-lang-2.1.jar:/opt/kindsight/nutchbase/lib/3rdparty/xerces-2_6_2-apis.jar:/opt/kindsight/nutchbase/lib/3rdparty/apache-solr-common-1.3.0.jar:/opt/kindsight/nutchbase/lib/3rdparty/jdom-1.0.jar:/opt/kindsight/nutchbase/lib/3rdparty/commons-fileupload-1.3-SNAPSHOT.jar:/opt/kindsight/nutchbase/lib/3rdparty/commons-httpclient-3.1.jar:/opt/kindsight/nutchbase/lib/3rdparty/commons-logging-1.0.4.jar:/opt/kindsight/nutchbase/lib/3rdparty/commons-logging-api-1.0.4.jar:/opt/kindsight/nutchbase/lib/3rdparty/commons-beanutils-1.8.0.jar:
>> >
>> >
>> */opt/kindsight/nutchbase/lib/3rdparty/nutch-1.0.jar*:/opt/kindsight/nutchbase/lib/3rdparty/commons-io-1.3.2.jar:/opt/kindsight/nutchbase/lib/3rdparty/icu4j-4_0_1.jar:/opt/kindsight/nutchbase/lib/3rdparty/log4j-1.2.15.jar:/opt/kindsight/nutchbase/lib/3rdparty/oozie-client-0.20.0.o0.1-SNAPSHOT.jar:/opt/kindsight/nutchbase/lib/3rdparty/batch/hadoop-core.jar:/opt/kindsight/nutchbase/lib/3rdparty/commons-logging-1.1.1.jar:/opt/kindsight/nutchbase/lib/3rdparty/commons-codec-1.3.jar:/opt/kindsight/nutchbase/lib/3rdparty/jakarta-oro-2.0.8.jar:/opt/kindsight/nutchbase/lib/3rdparty/hsqldb-1.8.0.7.jar:/opt/kindsight/nutchbase/lib/3rdparty/commons-pool-1.4.jar:/opt/kindsight/nutchbase/lib/3rdparty/commons-dbcp-1.2.2.jar:/opt/kindsight/nutchbase/lib/3rdparty/mysql-connector-java-5.1.10-bin.jar:/opt/kindsight/nutchbase/lib/3rdparty/commons-httpclient-3.0.1.jar:/opt/kindsight/nutchbase/lib/3rdparty/zookeeper-3.2.1.jar:/opt/kindsight/nutchbase/lib/3rdparty/taglibs-i18n.jar:/opt/kindsight/nutchbase/lib/3rdparty/commons-collections-3.2.1.jar:/opt/kindsight/nutchbase/lib/3rdparty/commons-cli-1.2.jar:/usr/local/jdk1.6.0_14/lib/tools.jar:/opt/kindsight/nutchbase/build/nutch-*.job:/opt/kindsight/nutchbase/nutch-*.job:/opt/kindsight/nutchbase/lib/batchplatform.jar:/opt/kindsight/nutchbase/lib/colo_common.jar:/opt/kindsight/nutchbase/lib/csreader.jar:/opt/kindsight/nutchbase/lib/pr_common.jar:/opt/kindsight/nutchbase/lib/jetty-ext/*.jar
>> > com.rialto.nutchbase.fetcher.Fetcher -D db.max.outlinks.per.page=1000
>> > domaincrawltable lpm/12-100226111258118-tomcatadmin/generate/2
>> > lpm/12-100226111258118-tomcatadmin/parse/2 -threads 10 -actionid
>> > 12-100226111258118-tomcatad...@domain_crawl
>> >
>> > Please shed some light on the above error.
>> >
>> > Thanks
>> >
>>
>>
>>
>> --
>> DigitalPebble Ltd
>> http://www.digitalpebble.com
>>
>
>
