Hi,

I’m running Nutch 2.2.1 on a 3-node Hadoop 1.2.1 cluster and using Gora to 
store the crawl data in Cassandra. Since Gora 0.3 does not support unions of 
string and null in the Avro schema, I was advised to use the Gora 0.4 SNAPSHOT 
and bundle it with Nutch when creating the job file.

However, when I run “ant job” in the NUTCH_HOME directory, the 0.3 version is 
bundled in the job file and not the 0.4 snapshot. I suppose this is because 
ant does a full cleanup and copy of the libs, and also because in ivy/ivy.xml 
the gora-cassandra dependency is declared with rev="0.3".

I changed that to "0.4-SNAPSHOT" and copied the snapshot artefacts to 
/home/hduser/.ivy2/local/org.apache.gora/gora-cassandra/0.4-SNAPSHOT/jars 
(since this is where Ivy looks for local jars before falling back to the 
online Maven repository). With that in place, the build succeeds.
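
For reference, the edited dependency entry in ivy/ivy.xml looks roughly like 
the sketch below. Only the rev value is the change I actually made; the org 
and name are taken from the local repository path above, and the conf mapping 
is an assumption about how the entry is written, so it may differ from the 
stock file.

```xml
<!-- Sketch of the gora-cassandra entry in ivy/ivy.xml after the change. -->
<!-- rev was "0.3" before; conf="*->default" is assumed, not verified. -->
<dependency org="org.apache.gora" name="gora-cassandra"
            rev="0.4-SNAPSHOT" conf="*->default" />
```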

With this, the job file is built with the 0.4 snapshot bundled, but the other 
dependencies it needs, such as Thrift, are not pulled in. Kindly help me with 
a permanent solution to this problem.

-- 
Manikandan Saravanan
Architect - Technology
TheSocialPeople
