Suneel,

Hi. Yes, you were right: it had nothing to do with version matching between Hadoop and Mahout. Even a simple Hadoop example failed (the Pi calculation from the examples jar), and a machine reboot resolved the Hadoop issue. The machine had been running without a reboot for over two months, and I believe it had run short of some system resource.

Regards,
Y.Mandai
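[Editorial note: the "simple Hadoop example" mentioned above is the Pi estimator shipped with Hadoop, which makes a handy Mahout-independent sanity check for a cluster. A minimal sketch, assuming Hadoop 0.20.203 (the examples-jar name varies by release):]

    # Run the bundled Pi estimator: 10 map tasks, 100 samples per map.
    # If job submission fails here too, the problem is in the Hadoop setup
    # itself rather than in Mahout.
    $HADOOP_HOME/bin/hadoop jar $HADOOP_HOME/hadoop-examples-0.20.203.0.jar pi 10 100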
2013/5/12 Suneel Marthi <[email protected]>

> It's definitely not a Mahout-Hadoop compatibility issue; it has more to do
> with your Hadoop setup.
>
> Check this link:
> http://stackoverflow.com/questions/15585630/file-jobtracker-info-could-only-be-replicated-to-0-nodes-instead-of-1
>
> ________________________________
> From: 万代豊 <[email protected]>
> To: "[email protected]" <[email protected]>
> Sent: Saturday, May 11, 2013 1:14 PM
> Subject: Re: Class Not Found from 0.8-SNAPSHOT for
> org.apache.lucene.analysis.WhitespaceAnalyzer
>
> Well, my Mahout-0.8-SNAPSHOT is now fine with the analyzer option
> "org.apache.lucene.analysis.core.WhitespaceAnalyzer", but there are still
> some steps to get over.
> Could this be a Hadoop version incompatibility issue? If so, what should
> the right/minimum Hadoop version be? (At least "clusterdump" with
> Mahout-0.8-SNAPSHOT worked fine against an existing k-means result
> previously produced under 0.7.)
> I had been on Hadoop-0.20.203 (pseudo-distributed) with Mahout-0.7 for
> some time and only recently upgraded the Mahout side to 0.8-SNAPSHOT.
>
> $MAHOUT_HOME/bin/mahout seq2sparse --namedVector -i NHTSA-seqfile01/ -o
> NHTSA-namedVector -ow -a org.apache.lucene.analysis.core.WhitespaceAnalyzer
> -chunk 200 -wt tfidf -s 5 -md 3 -x 90 -ng 2 -ml 50 -seq -n 2
> Running on hadoop, using /usr/local/hadoop/bin/hadoop and HADOOP_CONF_DIR=
> MAHOUT-JOB: /usr/local/trunk/examples/target/mahout-examples-0.8-SNAPSHOT-job.jar
> 13/05/12 01:45:48 INFO vectorizer.SparseVectorsFromSequenceFiles: Maximum n-gram size is: 2
> 13/05/12 01:45:48 INFO vectorizer.SparseVectorsFromSequenceFiles: Minimum LLR value: 50.0
> 13/05/12 01:45:48 INFO vectorizer.SparseVectorsFromSequenceFiles: Number of reduce tasks: 1
> 13/05/12 01:45:48 WARN hdfs.DFSClient: DataStreamer Exception:
> org.apache.hadoop.ipc.RemoteException: java.io.IOException: File
> /home/hadoop/mapred/staging/hadoop/.staging/job_201305120144_0001/job.jar
> could only be replicated to 0 nodes, instead of 1
>         at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:1417)
>         at org.apache.hadoop.hdfs.server.namenode.NameNode.addBlock(NameNode.java:596)
>         at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>         at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>         at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>         at java.lang.reflect.Method.invoke(Method.java:597)
>         at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:523)
>         at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1383)
>         at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1379)
>         at java.security.AccessController.doPrivileged(Native Method)
>         at javax.security.auth.Subject.doAs(Subject.java:396)
>         at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1059)
>         at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1377)
>
>         at org.apache.hadoop.ipc.Client.call(Client.java:1030)
>         at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:224)
>         at $Proxy1.addBlock(Unknown Source)
>         at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>         at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>         at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>         at java.lang.reflect.Method.invoke(Method.java:597)
>         at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:82)
>         at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:59)
>         at $Proxy1.addBlock(Unknown Source)
>         at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.locateFollowingBlock(DFSClient.java:3104)
>         at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.nextBlockOutputStream(DFSClient.java:2975)
>         at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.access$2000(DFSClient.java:2255)
>         at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream$DataStreamer.run(DFSClient.java:2446)
>
> 13/05/12 01:45:48 WARN hdfs.DFSClient: Error Recovery for block null bad datanode[0] nodes == null
> 13/05/12 01:45:48 WARN hdfs.DFSClient: Could not get block locations. Source file
> "/home/hadoop/mapred/staging/hadoop/.staging/job_201305120144_0001/job.jar" - Aborting...
> 13/05/12 01:45:48 INFO mapred.JobClient: Cleaning up the staging area
> hdfs://localhost:9000/home/hadoop/mapred/staging/hadoop/.staging/job_201305120144_0001
> Exception in thread "main" org.apache.hadoop.ipc.RemoteException: java.io.IOException: File
> /home/hadoop/mapred/staging/hadoop/.staging/job_201305120144_0001/job.jar
> could only be replicated to 0 nodes, instead of 1
>         [same stack trace as above]
> 13/05/12 01:45:48 ERROR hdfs.DFSClient: Exception closing file
> /home/hadoop/mapred/staging/hadoop/.staging/job_201305120144_0001/job.jar :
> org.apache.hadoop.ipc.RemoteException: java.io.IOException: File
> /home/hadoop/mapred/staging/hadoop/.staging/job_201305120144_0001/job.jar
> could only be replicated to 0 nodes, instead of 1
>         [same stack trace repeated twice more]
>
> Sorry for the long error log.
> I believe my Hadoop-0.20.203 is up and running correctly:
>
> $JAVA_HOME/bin/jps
> 13322 TaskTracker
> 12985 DataNode
> 12890 NameNode
> 13937 Jps
> 13080 SecondaryNameNode
> 13219 JobTracker
>
> Hope someone can help sort this out.
> Regards,
> Y.Mandai
>
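[Editorial note: "could only be replicated to 0 nodes, instead of 1" means the NameNode could not find a live DataNode to accept the block, even though a DataNode process may be running, as the jps output above shows. Typical causes are a DataNode that is out of disk space, wedged, or registered with a stale namespace ID. A quick diagnostic sketch using standard Hadoop 0.20.x commands:]

    # Ask the NameNode how many live DataNodes it sees and what capacity
    # each one reports; "Datanodes available: 0" would confirm the problem.
    $HADOOP_HOME/bin/hadoop dfsadmin -report

    # Check overall HDFS health, including missing or under-replicated blocks.
    $HADOOP_HOME/bin/hadoop fsck /

    # Check free space on the local disks backing dfs.data.dir; a full disk
    # makes the DataNode refuse new blocks.
    df -h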
> 2013/5/9 Yutaka Mandai <[email protected]>
>
> > Suneel,
> > Great to know.
> > Thanks!
> > Y.Mandai
> >
> > Sent from my iPhone
> >
> > On 2013/05/07, at 22:24, Suneel Marthi <[email protected]> wrote:
> >
> > > It should be
> > > org.apache.lucene.analysis.core.WhitespaceAnalyzer (you were missing the 'core').
> > >
> > > Mahout trunk is presently at Lucene 4.2.1. Lucene has gone through a
> > > major refactoring in 4.x.
> > > Check the Lucene 4.2.1 docs for the correct package name.
> > >
> > > ________________________________
> > > From: 万代豊 <[email protected]>
> > > To: "[email protected]" <[email protected]>
> > > Sent: Tuesday, May 7, 2013 3:20 AM
> > > Subject: Class Not Found from 0.8-SNAPSHOT for
> > > org.apache.lucene.analysis.WhitespaceAnalyzer
> > >
> > > Hi all,
> > > I guess I've seen very similar topics somewhere about class-name changes
> > > in Mahout-0.8-SNAPSHOT for some of the Lucene analyzers, and here is
> > > another one that needs solving.
> > > Mahout gave me an error for seq2sparse with the Lucene analyzer option
> > > as follows, which of course had been working in at least Mahout 0.7.
> > >
> > > $MAHOUT_HOME/bin/mahout seq2sparse --namedVector -i NHTSA-seqfile01/ -o
> > > NHTSA-namedVector -ow -a org.apache.lucene.analysis.WhitespaceAnalyzer
> > > -chunk 200 -wt tfidf -s 5 -md 3 -x 90 -ng 2 -ml 50 -seq -n 2
> > > Running on hadoop, using /usr/local/hadoop/bin/hadoop and HADOOP_CONF_DIR=
> > > MAHOUT-JOB: /usr/local/trunk/examples/target/mahout-examples-0.8-SNAPSHOT-job.jar
> > > 13/05/07 15:41:12 INFO vectorizer.SparseVectorsFromSequenceFiles: Maximum n-gram size is: 2
> > > 13/05/07 15:41:18 INFO vectorizer.SparseVectorsFromSequenceFiles: Minimum LLR value: 50.0
> > > 13/05/07 15:41:18 INFO vectorizer.SparseVectorsFromSequenceFiles: Number of reduce tasks: 1
> > > Exception in thread "main" java.lang.ClassNotFoundException:
> > > org.apache.lucene.analysis.WhitespaceAnalyzer
> > >
> > > I have confirmed which classpath Mahout is referring to with
> > > $ $MAHOUT_HOME/bin/mahout classpath
> > > and obtained the Lucene-related classpath entries below.
> > >
> > > /usr/local/trunk/examples/target/dependency/lucene-analyzers-common-4.2.1.jar
> > > /usr/local/trunk/examples/target/dependency/lucene-benchmark-4.2.1.jar
> > > /usr/local/trunk/examples/target/dependency/lucene-core-4.2.1.jar
> > > /usr/local/trunk/examples/target/dependency/lucene-facet-4.2.1.jar
> > > /usr/local/trunk/examples/target/dependency/lucene-highlighter-4.2.1.jar
> > > /usr/local/trunk/examples/target/dependency/lucene-memory-4.2.1.jar
> > > /usr/local/trunk/examples/target/dependency/lucene-queries-4.2.1.jar
> > > /usr/local/trunk/examples/target/dependency/lucene-queryparser-4.2.1.jar
> > > /usr/local/trunk/examples/target/dependency/lucene-sandbox-4.2.1.jar
> > >
> > > I want to believe this is a simple class-name-change issue.
> > > Please advise.
> > > Regards,
> > > Y.Mandai
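[Editorial note: in Lucene 4.x, WhitespaceAnalyzer moved from the org.apache.lucene.analysis package in lucene-core to org.apache.lucene.analysis.core in the lucene-analyzers-common module, which is why the old name stopped resolving. A sketch of one way to confirm the fully qualified name against the jar already on Mahout's classpath (path taken from the listing above):]

    # List the jar contents and look for the analyzer class; the output
    # should include org/apache/lucene/analysis/core/WhitespaceAnalyzer.class.
    jar tf /usr/local/trunk/examples/target/dependency/lucene-analyzers-common-4.2.1.jar \
        | grep WhitespaceAnalyzer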
