[
https://issues.apache.org/jira/browse/GORA-211?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13596071#comment-13596071
]
Renato Javier Marroquín Mogrovejo commented on GORA-211:
--------------------------------------------------------
Hi Roland,
From what I am seeing, most methods that use the Hector client will need
synchronization, right? What do you think should be done: synchronize methods
as problems show up, or synchronize them all once and for all?
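To make the "synchronize them all" option concrete, here is a minimal sketch in plain Java. It does not use Hector at all: UnsafeWriter is a hypothetical stand-in for any client object that keeps unsynchronized shared state, and SynchronizedWriter and SyncSketch are names made up for this example. The idea is only that every call into the shared object goes through one lock.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a client object that is not thread-safe.
// It is NOT Hector's Mutator; it only mimics unsynchronized shared state.
class UnsafeWriter {
    private final List<String> pending = new ArrayList<>();

    void insert(String column) { pending.add(column); }

    int size() { return pending.size(); }
}

// Wrapper that funnels every call to the delegate through a single lock.
class SynchronizedWriter {
    private final UnsafeWriter delegate = new UnsafeWriter();
    private final Object lock = new Object();

    void insert(String column) {
        synchronized (lock) { delegate.insert(column); }
    }

    int size() {
        synchronized (lock) { return delegate.size(); }
    }
}

class SyncSketch {
    // Starts `threads` workers that each insert `perThread` columns,
    // waits for them to finish, and returns the number of stored columns.
    static int run(int threads, int perThread) {
        SynchronizedWriter writer = new SynchronizedWriter();
        List<Thread> workers = new ArrayList<>();
        for (int t = 0; t < threads; t++) {
            final int id = t;
            Thread worker = new Thread(() -> {
                for (int i = 0; i < perThread; i++) {
                    writer.insert("col-" + id + "-" + i);
                }
            });
            workers.add(worker);
            worker.start();
        }
        try {
            for (Thread worker : workers) {
                worker.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        }
        return writer.size();
    }
}
```

With the lock in place no insert is lost, whatever the thread count. The trade-off is that a single coarse lock serializes all writers, which is the cost of synchronizing everything up front rather than only where failures were observed.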
> thread safety: java.lang.NullPointerException
> ---------------------------------------------
>
> Key: GORA-211
> URL: https://issues.apache.org/jira/browse/GORA-211
> Project: Apache Gora
> Issue Type: Bug
> Components: storage-cassandra
> Affects Versions: 0.2
> Environment: nutch 2.1 / cassandra 1.2.1 / gora-cassandra 0.2 /
> gora-core 0.2.1
> running fetch with parse=true
> fetcher.threads.per.queue=2
> nutch on a 16 core AMD Opteron 2GHz
> Cassandra on 8 core Intel Xeon 3.3 GHz
> Reporter: Roland
> Assignee: Roland
> Priority: Critical
> Attachments: GORA-211-0.2.patch, GORA-211-trunk.patch,
> GORA-211-trunk-v2.patch
>
>
> This is the result of debugging one of my issues described in NUTCH-1534.
> example trace:
> java.lang.NullPointerException
>   at me.prettyprint.cassandra.model.MutatorImpl.execute(MutatorImpl.java:243)
>   at me.prettyprint.cassandra.model.MutatorImpl.insert(MutatorImpl.java:71)
>   at org.apache.gora.cassandra.store.CassandraClient.addColumn(CassandraClient.java:139)
>   at org.apache.gora.cassandra.store.CassandraStore.addOrUpdateField(CassandraStore.java:307)
>   at org.apache.gora.cassandra.store.CassandraStore.flush(CassandraStore.java:212)
>   at org.apache.gora.mapreduce.GoraRecordWriter.write(GoraRecordWriter.java:65)
>   at org.apache.hadoop.mapred.ReduceTask$NewTrackingRecordWriter.write(ReduceTask.java:587)
>   at org.apache.hadoop.mapreduce.TaskInputOutputContext.write(TaskInputOutputContext.java:80)
>   at org.apache.nutch.fetcher.FetcherReducer$FetcherThread.output(FetcherReducer.java:664)
>   at org.apache.nutch.fetcher.FetcherReducer$FetcherThread.run(FetcherReducer.java:534)
> I suspect CassandraStore.put() is not taking enough precautions to copy all
> objects safely to its buffer.
> {code}
> switch(type) {
>   case RECORD:
>     Persistent persistent = (Persistent) fieldValue;
>     Persistent newRecord = persistent.newInstance(new StateManagerImpl());
>     for (Field member: fieldSchema.getFields()) {
>       newRecord.put(member.pos(), persistent.get(member.pos()));
>     }
>     fieldValue = newRecord;
>     break;
>   case MAP:
>     StatefulHashMap<?, ?> map = (StatefulHashMap<?, ?>) fieldValue;
>     StatefulHashMap<?, ?> newMap = new StatefulHashMap(map);
>     fieldValue = newMap;
>     break;
> }
> {code}
> case RECORD - do we not need to duplicate the object returned by
> "persistent.get(member.pos())":
> newRecord.put(member.pos(), persistent.get(member.pos()))
> case MAP - do we not need to duplicate all value-objects of the map?
> I have not had time to write a patch or test this, so please comment :)
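The MAP question above comes down to shallow versus deep copying. The following is a small sketch of that difference using only java.util collections, not Gora's StatefulHashMap; CopySketch and its methods are hypothetical names for illustration, and the list values stand in for whatever mutable objects the real map holds. A copy constructor such as new HashMap<>(source) creates a new map but keeps pointing at the same value objects, so a later change made by the producing thread is visible through the buffered copy.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class CopySketch {
    // Shallow copy: a new map whose values are the same objects as in the source.
    static Map<String, List<String>> shallowCopy(Map<String, List<String>> source) {
        return new HashMap<>(source);
    }

    // Deep copy: a new map holding a fresh copy of every value object.
    static Map<String, List<String>> deepCopy(Map<String, List<String>> source) {
        Map<String, List<String>> copy = new HashMap<>();
        for (Map.Entry<String, List<String>> entry : source.entrySet()) {
            copy.put(entry.getKey(), new ArrayList<>(entry.getValue()));
        }
        return copy;
    }

    // Buffers a copy of a map, then mutates a value in the source.
    // Returns true when that later change is visible through the buffered copy.
    static boolean leaks(boolean deep) {
        Map<String, List<String>> source = new HashMap<>();
        List<String> links = new ArrayList<>();
        links.add("a");
        source.put("outlinks", links);

        Map<String, List<String>> buffered = deep ? deepCopy(source) : shallowCopy(source);

        // Simulates the caller reusing its object after put() has returned.
        source.get("outlinks").add("b");

        return buffered.get("outlinks").size() != 1;
    }
}
```

Here leaks(false) reports the shared value and leaks(true) does not. Whether the same applies to the values Gora actually stores depends on whether those value types are mutable, which this sketch does not establish.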
--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira