Hi, libthrift is a dependency of cassandra-thrift, as listed here:
http://mvnrepository.com/artifact/org.apache.cassandra/cassandra-thrift/0.8.1

When building Nutch, you have to manually tweak the Ivy configuration
depending on which Gora store you choose, in this case Cassandra.
Basically you need to add all the dependencies listed here:
http://svn.apache.org/viewvc/incubator/gora/trunk/gora-cassandra/ivy/ivy.xml?view=markup

Try adding the following dependencies to $NUTCH_HOME/ivy/ivy.xml
and then rebuild Nutch (see attached patch):
        <dependency org="org.apache.gora" name="gora-cassandra"
                    rev="0.2-incubating" conf="*->compile"/>
        <dependency org="org.apache.cassandra" name="cassandra-thrift"
                    rev="0.8.1"/>
        <dependency org="com.ecyrd.speed4j" name="speed4j" rev="0.9"
                    conf="*->*,!javadoc,!sources"/>
        <dependency org="com.github.stephenc.high-scale-lib"
                    name="high-scale-lib" rev="1.1.2" conf="*->*,!javadoc,!sources"/>
        <dependency org="com.google.collections" name="google-collections"
                    rev="1.0" conf="*->*,!javadoc,!sources"/>
        <dependency org="com.google.guava" name="guava" rev="r09"
                    conf="*->*,!javadoc,!sources"/>

$ ant clean
$ ant

In your case libthrift should now be downloaded by Ivy and then
bundled into the nutch-2.0-dev.job file. I'm not sure how
apache-cassandra and hector got included in your classpath...
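
To confirm what actually ended up in the job file after the rebuild, a
small check like the one below may help. It is only a sketch: the .job
file is an ordinary zip/jar archive, so the program lists the entries
whose names contain "libthrift", "hector" or "cassandra". The class name
JobFileCheck and the default path are my own assumptions, so pass the
real location of your job file as the first argument.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Enumeration;
import java.util.List;
import java.util.Locale;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;

// Hypothetical helper, not part of Nutch or Gora.
public class JobFileCheck {

    /** True if the entry name contains any of the given substrings (case-insensitive). */
    static boolean matches(String entryName, String... needles) {
        String lower = entryName.toLowerCase(Locale.ROOT);
        for (String needle : needles) {
            if (lower.contains(needle.toLowerCase(Locale.ROOT))) {
                return true;
            }
        }
        return false;
    }

    /** Returns the sorted names of all archive entries that match one of the substrings. */
    static List<String> findEntries(String archivePath, String... needles) throws IOException {
        List<String> hits = new ArrayList<>();
        try (ZipFile zip = new ZipFile(archivePath)) {
            Enumeration<? extends ZipEntry> entries = zip.entries();
            while (entries.hasMoreElements()) {
                String name = entries.nextElement().getName();
                if (matches(name, needles)) {
                    hits.add(name);
                }
            }
        }
        Collections.sort(hits);
        return hits;
    }

    public static void main(String[] args) throws IOException {
        // Assumed default location; override with the first command-line argument.
        String path = args.length > 0 ? args[0] : "runtime/deploy/nutch-2.0-dev.job";
        for (String name : findEntries(path, "libthrift", "hector", "cassandra")) {
            System.out.println(name);
        }
    }
}
```

If more than one libthrift jar shows up, or none at all, that would be
the first thing to look at.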

Somehow we also need to resolve:
        <dependency org="org.apache.cassandra" name="apache-cassandra"
                    rev="0.8.1"/>
        <dependency org="me.prettyprint" name="hector" rev="0.8.0-1"/>

I don't think these two jars are in the default Maven repository,
so they won't be downloaded; that's why they were commented out in the
Gora Cassandra Ivy config (gora/trunk/gora-cassandra/ivy/ivy.xml).


Since the hector jar is not found in my case, I get:
~/java/workspace/Nutch/trunk/runtime/deploy$ bin/nutch inject ~/java/workspace/Nutch/seeds
11/08/01 14:18:42 INFO crawl.InjectorJob: InjectorJob: starting
11/08/01 14:18:42 INFO crawl.InjectorJob: InjectorJob: urlDir: /home/alex/java/workspace/Nutch/seeds
11/08/01 14:18:42 INFO security.Groups: Group mapping impl=org.apache.hadoop.security.ShellBasedUnixGroupsMapping; cacheTimeout=300000
11/08/01 14:18:42 INFO jvm.JvmMetrics: Initializing JVM Metrics with processName=JobTracker, sessionId=
11/08/01 14:18:42 ERROR crawl.InjectorJob: InjectorJob: org.apache.gora.util.GoraException: java.lang.reflect.InvocationTargetException
        at org.apache.gora.store.DataStoreFactory.createDataStore(DataStoreFactory.java:110)
        at org.apache.gora.store.DataStoreFactory.createDataStore(DataStoreFactory.java:93)
        at org.apache.nutch.storage.StorageUtils.createWebStore(StorageUtils.java:59)
        at org.apache.nutch.crawl.InjectorJob.run(InjectorJob.java:243)
        at org.apache.nutch.crawl.InjectorJob.inject(InjectorJob.java:268)
        at org.apache.nutch.crawl.InjectorJob.run(InjectorJob.java:282)
        at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:69)
        at org.apache.nutch.crawl.InjectorJob.main(InjectorJob.java:292)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:597)
        at org.apache.hadoop.util.RunJar.main(RunJar.java:192)
Caused by: java.lang.reflect.InvocationTargetException
        at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
        at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:39)
        at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:27)
        at java.lang.reflect.Constructor.newInstance(Constructor.java:513)
        at org.apache.gora.util.ReflectionUtils.newInstance(ReflectionUtils.java:76)
        at org.apache.gora.store.DataStoreFactory.createDataStore(DataStoreFactory.java:102)
        ... 12 more
Caused by: java.lang.NoClassDefFoundError: me/prettyprint/hector/api/Serializer
        at org.apache.gora.cassandra.store.CassandraStore.<init>(CassandraStore.java:60)
        ... 18 more
Caused by: java.lang.ClassNotFoundException: me.prettyprint.hector.api.Serializer
        at java.net.URLClassLoader$1.run(URLClassLoader.java:202)
        at java.security.AccessController.doPrivileged(Native Method)
        at java.net.URLClassLoader.findClass(URLClassLoader.java:190)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:306)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:247)
        ... 19 more
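
Both errors in this thread (my ClassNotFoundException and your
NoSuchMethodError) come down to which jar, if any, a class is loaded
from. A quick way to find out is a tiny program like the following
sketch. The class name WhichJar is hypothetical, and it only tells you
something useful if you run it with the same classpath as the failing
job, which I am assuming you can reproduce (for example by unpacking
the job file and pointing -cp at its lib directory).

```java
import java.security.CodeSource;

// Hypothetical diagnostic helper, not part of Nutch, Gora or Hadoop.
public class WhichJar {

    /** Describes where the named class is loaded from, or reports that it is missing. */
    static String locate(String className) {
        try {
            // initialize=false: only locate the class, do not run its static initializers.
            Class<?> c = Class.forName(className, false, WhichJar.class.getClassLoader());
            CodeSource src = c.getProtectionDomain().getCodeSource();
            // Classes from the JDK itself normally have no code source.
            return src == null ? "bootstrap/JDK class" : String.valueOf(src.getLocation());
        } catch (ClassNotFoundException | LinkageError e) {
            return "NOT FOUND (" + e + ")";
        }
    }

    public static void main(String[] args) {
        // Default to the two classes named in the stack traces of this thread.
        String[] names = args.length > 0 ? args : new String[] {
            "me.prettyprint.hector.api.Serializer",
            "org.apache.thrift.meta_data.FieldValueMetaData"
        };
        for (String name : names) {
            System.out.println(name + " -> " + locate(name));
        }
    }
}
```

If FieldValueMetaData resolves to a jar other than libthrift-0.6.1.jar,
an older Thrift on the classpath would explain the missing constructor.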




On Mon, Aug 1, 2011 at 11:59 AM, Tom Davidson <tdavid...@covario.com> wrote:
> Hi All,
>
> I am kind of at my wit’s end here, so I am hoping someone here can help. I
> am trying to use Nutch2 and Cassandra and I have been successful using the
> runtime/local build. I am using the Cloudera CDH3 on CentOS 5 and I do not
> want to contaminate my hadoop install by dropping in a bunch of Nutch jars,
> etc. So I am trying to use the nutch-2-dev.job jar. When I try to use the
> nutch2-dev.job jar, I get the error below. I have double and triple checked
> the classpath and the included jars and the only jar that contains
> FieldValueMetaData is the libthrift-0.6.1.jar which has the method that is
> claimed to be missing. Any ideas?
>
> Thanks,
> Tom
>
> [tdavidson@nadevsan06 ~]$ bin/nutch inject urls
>
> /opt/jdk1.6.0_21/bin/java -Dproc_jar -Xmx1000m
> -Dhadoop.log.dir=/usr/lib/hadoop-0.20/logs -Dhadoop.log.file=hadoop.log
> -Dhadoop.home.dir=/usr/lib/hadoop-0.20 -Dhadoop.id.str=tdavidson
> -Dhadoop.root.logger=INFO,console
> -Djava.library.path=/usr/lib/hadoop-0.20/lib/native/Linux-amd64-64
> -Dhadoop.policy.file=hadoop-policy.xml -classpath
> /usr/lib/hadoop-0.20/conf:/opt/jdk1.6.0_21/lib/tools.jar:/usr/lib/hadoop-0.20:/usr/lib/hadoop-0.20/hadoop-core-0.20.2-cdh3u1.jar:/usr/lib/hadoop-0.20/lib/ant-contrib-1.0b3.jar:/usr/lib/hadoop-0.20/lib/aspectjrt-1.6.5.jar:/usr/lib/hadoop-0.20/lib/aspectjtools-1.6.5.jar:/usr/lib/hadoop-0.20/lib/commons-cli-1.2.jar:/usr/lib/hadoop-0.20/lib/commons-codec-1.4.jar:/usr/lib/hadoop-0.20/lib/commons-daemon-1.0.1.jar:/usr/lib/hadoop-0.20/lib/commons-el-1.0.jar:/usr/lib/hadoop-0.20/lib/commons-httpclient-3.0.1.jar:/usr/lib/hadoop-0.20/lib/commons-logging-1.0.4.jar:/usr/lib/hadoop-0.20/lib/commons-logging-api-1.0.4.jar:/usr/lib/hadoop-0.20/lib/commons-net-1.4.1.jar:/usr/lib/hadoop-0.20/lib/core-3.1.1.jar:/usr/lib/hadoop-0.20/lib/hadoop-fairscheduler-0.20.2-cdh3u1.jar:/usr/lib/hadoop-0.20/lib/hsqldb-1.8.0.10.jar:/usr/lib/hadoop-0.20/lib/hue-plugins-1.2.0-cdh3u1.jar:/usr/lib/hadoop-0.20/lib/jackson-core-asl-1.5.2.jar:/usr/lib/hadoop-0.20/lib/jackson-mapper-asl-1.5.2.jar:/usr/lib/hadoop-0.20/lib/jasper-compiler-5.5.12.jar:/usr/lib/hadoop-0.20/lib/jasper-runtime-5.5.12.jar:/usr/lib/hadoop-0.20/lib/jets3t-0.6.1.jar:/usr/lib/hadoop-0.20/lib/jetty-6.1.26.jar:/usr/lib/hadoop-0.20/lib/jetty-servlet-tester-6.1.26.jar:/usr/lib/hadoop-0.20/lib/jetty-util-6.1.26.jar:/usr/lib/hadoop-0.20/lib/jsch-0.1.42.jar:/usr/lib/hadoop-0.20/lib/junit-4.5.jar:/usr/lib/hadoop-0.20/lib/kfs-0.2.2.jar:/usr/lib/hadoop-0.20/lib/log4j-1.2.15.jar:/usr/lib/hadoop-0.20/lib/mockito-all-1.8.2.jar:/usr/lib/hadoop-0.20/lib/oro-2.0.8.jar:/usr/lib/hadoop-0.20/lib/servlet-api-2.5-20081211.jar:/usr/lib/hadoop-0.20/lib/servlet-api-2.5-6.1.14.jar:/usr/lib/hadoop-0.20/lib/slf4j-api-1.4.3.jar:/usr/lib/hadoop-0.20/lib/slf4j-log4j12-1.4.3.jar:/usr/lib/hadoop-0.20/lib/xmlenc-0.52.jar:/usr/lib/hadoop-0.20/lib/jsp-2.1/jsp-2.1.jar:/usr/lib/hadoop-0.20/lib/jsp-2.1/jsp-api-2.1.jar
> org.apache.hadoop.util.RunJar /home/SEMDIRECTOR/tdavidson/nutch-2.job
> org.apache.nutch.crawl.InjectorJob urls
>
> 11/08/01 11:51:54 INFO crawl.InjectorJob: InjectorJob: starting
> 11/08/01 11:51:54 INFO crawl.InjectorJob: InjectorJob: urlDir: urls
> 11/08/01 11:51:55 INFO connection.CassandraHostRetryService: Downed Host Retry service started with queue size -1 and retry delay 10s
> 11/08/01 11:51:55 INFO service.JmxMonitor: Registering JMX me.prettyprint.cassandra.service_Test Cluster:ServiceType=hector,MonitorType=hector
> 11/08/01 11:51:55 ERROR crawl.InjectorJob: InjectorJob: org.apache.gora.util.GoraException: java.lang.reflect.InvocationTargetException
>         at org.apache.gora.store.DataStoreFactory.createDataStore(DataStoreFactory.java:110)
>         at org.apache.gora.store.DataStoreFactory.createDataStore(DataStoreFactory.java:93)
>         at org.apache.nutch.storage.StorageUtils.createWebStore(StorageUtils.java:59)
>         at org.apache.nutch.crawl.InjectorJob.run(InjectorJob.java:243)
>         at org.apache.nutch.crawl.InjectorJob.inject(InjectorJob.java:268)
>         at org.apache.nutch.crawl.InjectorJob.run(InjectorJob.java:282)
>         at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
>         at org.apache.nutch.crawl.InjectorJob.main(InjectorJob.java:292)
>         at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>         at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>         at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>         at java.lang.reflect.Method.invoke(Method.java:597)
>         at org.apache.hadoop.util.RunJar.main(RunJar.java:186)
> Caused by: java.lang.reflect.InvocationTargetException
>         at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
>         at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:39)
>         at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:27)
>         at java.lang.reflect.Constructor.newInstance(Constructor.java:513)
>         at org.apache.gora.util.ReflectionUtils.newInstance(ReflectionUtils.java:76)
>         at org.apache.gora.store.DataStoreFactory.createDataStore(DataStoreFactory.java:102)
>         ... 12 more
> Caused by: java.lang.NoSuchMethodError: org.apache.thrift.meta_data.FieldValueMetaData.<init>(BZ)V
>         at org.apache.cassandra.thrift.CfDef.<clinit>(CfDef.java:299)
>         at org.apache.cassandra.thrift.KsDef.read(KsDef.java:753)
>         at org.apache.cassandra.thrift.Cassandra$describe_keyspace_result.read(Cassandra.java:24338)
>         at org.apache.cassandra.thrift.Cassandra$Client.recv_describe_keyspace(Cassandra.java:1371)
>         at org.apache.cassandra.thrift.Cassandra$Client.describe_keyspace(Cassandra.java:1346)
>         at me.prettyprint.cassandra.service.AbstractCluster$4.execute(AbstractCluster.java:192)
>         at me.prettyprint.cassandra.service.AbstractCluster$4.execute(AbstractCluster.java:187)
>         at me.prettyprint.cassandra.service.Operation.executeAndSetResult(Operation.java:101)
>         at me.prettyprint.cassandra.connection.HConnectionManager.operateWithFailover(HConnectionManager.java:232)
>         at me.prettyprint.cassandra.service.AbstractCluster.describeKeyspace(AbstractCluster.java:201)
>         at org.apache.gora.cassandra.store.CassandraClient.checkKeyspace(CassandraClient.java:82)
>         at org.apache.gora.cassandra.store.CassandraClient.init(CassandraClient.java:69)
>         at org.apache.gora.cassandra.store.CassandraStore.<init>(CassandraStore.java:68)
>         ... 18 more
Index: ivy/ivy.xml
===================================================================
--- ivy/ivy.xml	(revision 1145734)
+++ ivy/ivy.xml	(working copy)
@@ -32,7 +32,7 @@
 	<dependencies>
 		<dependency org="org.apache.solr" name="solr-solrj" rev="3.1.0"
 			conf="*->default" />
-		<dependency org="org.slf4j" name="slf4j-log4j12" rev="1.5.5" conf="*->master" />
+		<dependency org="org.slf4j" name="slf4j-log4j12" rev="1.6.1" conf="*->master" />
 
 		<dependency org="commons-lang" name="commons-lang" rev="2.4"
 			conf="*->default" />
@@ -93,18 +93,13 @@
 		<dependency org="org.hsqldb" name="hsqldb" rev="2.0.0" conf="*->default"/>
 		<dependency org="org.jdom" name="jdom" rev="1.1" conf="test->default"/>
 
-		<dependency org="org.apache.gora" name="gora-sql" rev="0.2-incubating" conf="*->compile"/>				
+<!--
+		<dependency org="org.apache.gora" name="gora-sql" rev="0.2-incubating" conf="*->compile"/>
+-->				
                 <dependency org="org.restlet.jse" name="org.restlet" rev="2.0.5" conf="*->default"/>
                 <dependency org="org.restlet.jse" name="org.restlet.ext.jackson" rev="2.0.5" conf="*->default"/>
 
 <!--
-       Uncomment this to use MySQL as database with SQL as Gora store.
--->
-<!--
-       <dependency org="mysql" name="mysql-connector-java" rev="5.1.13" conf="*->default"/>
--->
-
-<!--
        Uncomment this to use HBase as Gora backend. Then manually add hbase-0.20.6 jar to the lib directory.
 -->
 <!--
@@ -114,6 +109,14 @@
        </dependency>
 -->
 
+	<dependency org="org.apache.gora" name="gora-cassandra" rev="0.2-incubating" conf="*->compile"/>
+	<dependency org="org.apache.cassandra" name="cassandra-thrift" rev="0.8.1"/>
+	<dependency org="com.ecyrd.speed4j" name="speed4j" rev="0.9" conf="*->*,!javadoc,!sources"/>
+	<dependency org="com.github.stephenc.high-scale-lib" name="high-scale-lib" rev="1.1.2" conf="*->*,!javadoc,!sources"/>
+	<dependency org="com.google.collections" name="google-collections" rev="1.0" conf="*->*,!javadoc,!sources"/>
+	<dependency org="com.google.guava" name="guava" rev="r09" conf="*->*,!javadoc,!sources"/>
+
+
                 <!--global exclusion-->
              	<exclude module="ant" />
              	<exclude module="jmxtools" />
