RE: InvocationTargetException with Nutch 2.0 Gora 0.2 and Cassandra 0.8.4

2011-08-29 Thread Tom Davidson
I had similar classpath issues. Are there any versions of Hector in your classpath (in your Hadoop lib folder?) that are not the same as in your nutch deployment jar? From: lewis john mcgibbney [mailto:lewis.mcgibb...@gmail.com] Sent: Monday, August 29, 2011 1:57 PM To: dev@nutch.apache.org Subj

RE: Gora CassandraStore is not thread safe?

2011-08-29 Thread Tom Davidson
gotten any further with this? Lewis On Wed, Aug 10, 2011 at 8:43 PM, Tom Davidson wrote: > Has anyone tested the CassandraStore in gora 0.2 using multiple threads? > The nutch 2 fetcher architecture has many threads writing to one > GoraRecordWriter and I am getting concurrent modif
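
One possible stopgap for the symptom described above (many fetcher threads sharing one writer) is to serialize every write behind a single lock. The sketch below is only an illustration under stated assumptions: `RecordSink`, `SynchronizedSink`, and `SinkDemo` are hypothetical names, not Gora or Nutch classes, and the thread does not confirm that this resolves the CassandraStore problem.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for a writer that is not thread safe (not a Gora type). */
interface RecordSink<K, V> {
    void write(K key, V value);
}

/** Wraps any sink so that only one thread at a time can be inside write(). */
final class SynchronizedSink<K, V> implements RecordSink<K, V> {
    private final RecordSink<K, V> delegate;
    private final Object lock = new Object();

    SynchronizedSink(RecordSink<K, V> delegate) {
        this.delegate = delegate;
    }

    @Override
    public void write(K key, V value) {
        synchronized (lock) {
            delegate.write(key, value);
        }
    }
}

final class SinkDemo {
    /** Starts several writer threads against one shared sink and returns the record count. */
    static int runDemo(int threads, int writesPerThread) {
        List<String> store = new ArrayList<>(); // deliberately not thread safe
        RecordSink<String, String> sink =
                new SynchronizedSink<>((k, v) -> store.add(k + "=" + v));

        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            final int id = t;
            workers[t] = new Thread(() -> {
                for (int i = 0; i < writesPerThread; i++) {
                    sink.write("t" + id, "v" + i);
                }
            });
            workers[t].start();
        }
        try {
            for (Thread w : workers) {
                w.join(); // join() makes the workers' writes visible to this thread
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for writers", e);
        }
        return store.size();
    }
}
```

The trade-off is that all writes become sequential, so this hides the race rather than making the underlying store safe for concurrent use.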

[jira] [Created] (NUTCH-1077) Nutch 2 DbUpdateMapper throws ArrayOutOfBoundsException when running update

2011-08-09 Thread Tom Davidson (JIRA)
Issue Type: Bug Components: fetcher Affects Versions: 2.0 Environment: CentOS 5 Linux with CDH3 Hadoop. Reporter: Tom Davidson I got this error when running a simple nutch update after doing a small fetch and parse. java.lang.ArrayIndexOutOfBoundsException: 0

RE: Future of Nutch 2.0 [Was: Unresolved dependencies org.apache.gora#gora-hbase;0.1: not found in Nutch trunk]

2011-08-09 Thread Tom Davidson
Hi All, I have been using Nutch 1.x for the last 9 months or so and it works well for large scale crawls up to around a billion pages. However, the inherent lack of random access in HDFS really starts to become a burden on our hadoop cluster when going through the whole generate/update/fetch cy

Nutch 2 LinkAnalysisScoringFilter

2011-08-02 Thread Tom Davidson
Does the LinkAnalysisScoringFilter in Nutch 2 work?

Getting this exception when running Nutch2 updatedb after fetch

2011-08-02 Thread Tom Davidson
java.lang.ArrayIndexOutOfBoundsException: 0 at org.apache.nutch.util.TableUtil.reverseAppendSplits(TableUtil.java:126) at org.apache.nutch.util.TableUtil.reverseUrl(TableUtil.java:66) at org.apache.nutch.util.TableUtil.reverseUrl(TableUtil.java:43) at org.apache.nut

RE: Nutch 2 and Cassandra

2011-08-02 Thread Tom Davidson
u have provided, for example some latent info that has been assumed or not been explained. Thank you [1] http://wiki.apache.org/nutch/ErrorMessagesInNutch2 On Tue, Aug 2, 2011 at 6:32 PM, Tom Davidson <tdavid...@covario.com> wrote: I found the problem. I am using Cloudera CDH3 and

[jira] [Commented] (NUTCH-937) When nutch is run on hadoop > 0.20.2 (or cdh) it will not find plugins because MapReduce will not unpack plugin/ directory from the job's pack (due to MAPREDUCE-967)

2011-08-02 Thread Tom Davidson (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-937?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13078341#comment-13078341 ] Tom Davidson commented on NUTCH-937: I had the same problem in my build of nutch 2

RE: Nutch 2 and Cassandra

2011-08-02 Thread Tom Davidson
I found the problem. I am using Cloudera CDH3 and it has a hue plugins jar with an older thrift library in it. I removed the jar from my classpath and all is good. Thanks for your help. -----Original Message----- From: Tom Davidson [mailto:tdavid...@covario.com] Sent: Monday, August 01, 2011 3
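
When a stale library on the classpath is suspected, as with the older thrift jar described above, it can help to print where the JVM actually loaded a class from. The following is a minimal sketch using only standard JDK calls; `WhichJar` and `locationOf` are hypothetical names chosen for illustration, and the class names to inspect are supplied by the caller.

```java
import java.security.CodeSource;

final class WhichJar {
    /** Returns the jar or directory a class was loaded from, or a note when it is unknown. */
    static String locationOf(Class<?> cls) {
        CodeSource src = cls.getProtectionDomain().getCodeSource();
        // Classes loaded by the bootstrap loader (for example java.lang.String) have no code source.
        return src == null ? "(bootstrap or unknown)" : String.valueOf(src.getLocation());
    }

    public static void main(String[] args) throws ClassNotFoundException {
        // Each argument is a fully qualified class name to look up on the current classpath.
        for (String name : args) {
            System.out.println(name + " -> " + locationOf(Class.forName(name)));
        }
    }
}
```

Running it with the same classpath as the failing job and passing a class from the suspect library shows which jar wins; a path outside the deployment jar points to a conflicting copy.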

RE: Nutch 2 and Cassandra

2011-08-01 Thread Tom Davidson
On Mon, Aug 1, 2011 at 2:55 PM, Tom Davidson wrote: > I did something similar to below to add the Cassandra dependencies. Note that > I am getting NoSuchMethodErrors not ClassNotFoundExceptions. Can you add the > hector jars to your nutch job

RE: Nutch 2 and Cassandra

2011-08-01 Thread Tom Davidson
ava.lang.ClassLoader.loadClass(ClassLoader.java:247) ... 19 more On Mon, Aug 1, 2011 at 11:59 AM, Tom Davidson wrote: > Hi All, > > > > I am kind of at my wit's end here, so I am hoping someone here can > help. I am trying to use Nutch2 and Cassandra and I have been >

Nutch 2 and Cassandra

2011-08-01 Thread Tom Davidson
Hi All, I am kind of at my wit's end here, so I am hoping someone here can help. I am trying to use Nutch2 and Cassandra and I have been successful using the runtime/local build. I am using the Cloudera CDH3 on CentOS 5 and I do not want to contaminate my hadoop install by dropping in a bunch

subscribe

2011-03-23 Thread Tom Davidson