Hi Lewis,

I was running Nutch deployed, with a dedicated Cassandra cluster. Frankly, I 
have given up on using Nutch 2 for now, as it seems highly unstable and not 
really in active development. Your effort to address this is encouraging. 
Because Nutch uses multiple threads in the fetchers, I was regularly getting 
ConcurrentModificationException and OutOfMemoryError failures in the 
CassandraStore. As far as I recall, the caching/flushing implementation is just 
not thread safe. If the CassandraStore caching were removed completely it might 
work, but it would probably not be very efficient. If I were to fix this class, 
I would try to rewrite it to use Hector batched mutations instead.
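
To illustrate the kind of fix I have in mind, here is a minimal sketch, not the
actual Gora or Hector code. The class BufferedStoreSketch and all of its
methods are hypothetical names made up for this example, and the "backend" is
just an in-memory list standing in for Cassandra. The idea is that writers add
to a buffer under a lock, and flush() swaps the buffer out under the same lock
before iterating it, so no thread ever iterates a map that another thread is
still modifying.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical stand-in for a store that buffers writes and flushes them in batches. */
class BufferedStoreSketch {
    private final Object lock = new Object();
    // Pending writes; only touched while holding 'lock'.
    private Map<String, String> buffer = new HashMap<>();
    // Stand-in for the backend: every flush is recorded here as one batch.
    private final List<Map<String, String>> flushedBatches = new ArrayList<>();

    /** May be called concurrently by many fetcher threads. */
    void put(String key, String value) {
        synchronized (lock) {
            buffer.put(key, value);
        }
    }

    /** Swap the buffer out under the lock, then work on the private snapshot. */
    void flush() {
        Map<String, String> snapshot;
        synchronized (lock) {
            if (buffer.isEmpty()) {
                return;
            }
            snapshot = buffer;
            buffer = new HashMap<>();
        }
        // No other thread can reach 'snapshot' any more, so it is safe to iterate.
        // A real store would send it to the backend as one batched mutation here.
        synchronized (flushedBatches) {
            flushedBatches.add(snapshot);
        }
    }

    /** Total number of records across all flushed batches. */
    int flushedRecordCount() {
        synchronized (flushedBatches) {
            int n = 0;
            for (Map<String, String> batch : flushedBatches) {
                n += batch.size();
            }
            return n;
        }
    }

    /** Runs several writer threads against one store and returns the flushed record count. */
    static int runDemo(int threads, int perThread) {
        BufferedStoreSketch store = new BufferedStoreSketch();
        List<Thread> workers = new ArrayList<>();
        for (int t = 0; t < threads; t++) {
            final int id = t;
            Thread w = new Thread(() -> {
                for (int i = 0; i < perThread; i++) {
                    store.put("url-" + id + "-" + i, "content");
                    if (i % 100 == 0) {
                        store.flush(); // flush periodically while others keep writing
                    }
                }
            });
            workers.add(w);
            w.start();
        }
        try {
            for (Thread w : workers) {
                w.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new RuntimeException(e);
        }
        store.flush(); // pick up whatever is still buffered
        return store.flushedRecordCount();
    }
}
```

Because every key in the demo is unique and each buffer is handed off exactly
once, runDemo(threads, perThread) should account for every record written.
Bounding the buffer size before each flush would also help with the memory
growth, though I have not measured that.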

Tom

-----Original Message-----
From: lewis john mcgibbney [mailto:lewis.mcgibb...@gmail.com] 
Sent: Monday, August 29, 2011 1:41 PM
To: gora-dev@incubator.apache.org; d...@nutch.apache.org
Subject: Re: Gora CassandraStore is not thread safe?

Hi Tom,

Apologies for cross posting, this would not usually be the case but I'm
hoping that if any results come from the thread then both communities can
benefit.

I'm in the process of getting Cassandra 0.8.4 working with Nutch 2.0 and
Gora 0.2 myself and seem to be having some nasty problems.

Some questions for you

1) How are you running Nutch, local or deploy?
2) How are you running Cassandra, local or deployed in a cluster?

The obvious thought is that this is a bug and that there are
method(s)/object(s) which are not thread safe.

Have you gotten any further with this?

Lewis


On Wed, Aug 10, 2011 at 8:43 PM, Tom Davidson <tdavid...@covario.com> wrote:

> Has anyone tested the CassandraStore in Gora 0.2 using multiple threads?
> The Nutch 2 fetcher architecture has many threads writing to one
> GoraRecordWriter and I am getting concurrent modification errors like below.
>
> Caused by: java.util.ConcurrentModificationException
>               at java.util.HashMap$HashIterator.nextEntry(HashMap.java:793)
>               at java.util.HashMap$KeyIterator.next(HashMap.java:828)
>               at org.apache.gora.cassandra.store.CassandraStore.flush(CassandraStore.java:192)
>               at org.apache.gora.mapreduce.GoraRecordWriter.write(GoraRecordWriter.java:65)
>


-- 
*Lewis*
