[ https://issues.apache.org/jira/browse/GORA-182?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13498930#comment-13498930 ]
Lewis John McGibbney commented on GORA-182:
-------------------------------------------
Hi Kaz, I checked out gora-core and gora-cassandra 0.2.1, built the modules
locally, then manually copied them over to my Nutch installation. Upon injecting
URLs into Cassandra, I get the following:
{code}
me.prettyprint.hector.api.exceptions.HInvalidRequestException: InvalidRequestException(why:(String didn't validate.) [webpage][f][ts] failed validation)
    at me.prettyprint.cassandra.service.ExceptionsTranslatorImpl.translate(ExceptionsTranslatorImpl.java:52)
    at me.prettyprint.cassandra.service.KeyspaceServiceImpl$1.execute(KeyspaceServiceImpl.java:97)
    at me.prettyprint.cassandra.service.KeyspaceServiceImpl$1.execute(KeyspaceServiceImpl.java:90)
    at me.prettyprint.cassandra.service.Operation.executeAndSetResult(Operation.java:101)
    at me.prettyprint.cassandra.connection.HConnectionManager.operateWithFailover(HConnectionManager.java:233)
    at me.prettyprint.cassandra.service.KeyspaceServiceImpl.operateWithFailover(KeyspaceServiceImpl.java:131)
    at me.prettyprint.cassandra.service.KeyspaceServiceImpl.batchMutate(KeyspaceServiceImpl.java:102)
    at me.prettyprint.cassandra.service.KeyspaceServiceImpl.batchMutate(KeyspaceServiceImpl.java:108)
    at me.prettyprint.cassandra.model.MutatorImpl$3.doInKeyspace(MutatorImpl.java:248)
    at me.prettyprint.cassandra.model.MutatorImpl$3.doInKeyspace(MutatorImpl.java:245)
    at me.prettyprint.cassandra.model.KeyspaceOperationCallback.doInKeyspaceAndMeasure(KeyspaceOperationCallback.java:20)
    at me.prettyprint.cassandra.model.ExecutingKeyspace.doExecute(ExecutingKeyspace.java:85)
    at me.prettyprint.cassandra.model.MutatorImpl.execute(MutatorImpl.java:245)
    at me.prettyprint.cassandra.model.MutatorImpl.insert(MutatorImpl.java:71)
    at org.apache.gora.cassandra.store.HectorUtils.insertColumn(HectorUtils.java:47)
    at org.apache.gora.cassandra.store.CassandraClient.addColumn(CassandraClient.java:169)
    at org.apache.gora.cassandra.store.CassandraStore.addOrUpdateField(CassandraStore.java:347)
    at org.apache.gora.cassandra.store.CassandraStore.flush(CassandraStore.java:228)
    at org.apache.gora.cassandra.store.CassandraStore.close(CassandraStore.java:95)
    at org.apache.gora.mapreduce.GoraRecordWriter.close(GoraRecordWriter.java:55)
    at org.apache.hadoop.mapred.MapTask$NewDirectOutputCollector.close(MapTask.java:651)
    at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:766)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:370)
    at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:212)
Caused by: InvalidRequestException(why:(String didn't validate.) [webpage][f][ts] failed validation)
    at org.apache.cassandra.thrift.Cassandra$batch_mutate_result.read(Cassandra.java:19479)
    at org.apache.cassandra.thrift.Cassandra$Client.recv_batch_mutate(Cassandra.java:1035)
    at org.apache.cassandra.thrift.Cassandra$Client.batch_mutate(Cassandra.java:1009)
    at me.prettyprint.cassandra.service.KeyspaceServiceImpl$1.execute(KeyspaceServiceImpl.java:95)
    ... 22 more
{code}
The offending fetchTime field in Nutch WebPage [0], which is mapped in
gora-cassandra-mapping.xml, is of the long data type. Initially I thought to add
appropriate methods using Hector's LongSerializer for the creation and insertion
of columnNames in o.a.g.c.store.HectorUtils; however, once I repackage and
attempt to inject, I get the above trace again.
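To illustrate one plausible reading of the trace, here is a minimal sketch using only the JDK. It assumes two things the trace does not confirm: that the ts column is validated as UTF-8 text, and that the long value arrives as 8 raw big-endian bytes. The class name LongVsUtf8 and the sample value are hypothetical, and nothing here calls Hector or Gora.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

// Hypothetical illustration only; not part of Gora, Hector, or Nutch.
class LongVsUtf8 {

    // Encode a long as 8 big-endian bytes (ByteBuffer's default byte order).
    static byte[] encodeLong(long value) {
        return ByteBuffer.allocate(Long.BYTES).putLong(value).array();
    }

    // Strict UTF-8 check: malformed input is reported instead of replaced.
    static boolean isValidUtf8(byte[] bytes) {
        try {
            StandardCharsets.UTF_8.newDecoder()
                    .onMalformedInput(CodingErrorAction.REPORT)
                    .onUnmappableCharacter(CodingErrorAction.REPORT)
                    .decode(ByteBuffer.wrap(bytes));
            return true;
        } catch (CharacterCodingException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Assumed sample: a millisecond-style timestamp written in hex so the
        // bytes are visible: 00 00 01 3B 06 83 D0 00. The lone 0x83 is a
        // continuation byte with no lead byte, which is malformed UTF-8.
        long fetchTime = 0x0000013B0683D000L;

        System.out.println(isValidUtf8(encodeLong(fetchTime)));                 // prints "false"
        System.out.println(isValidUtf8("ts".getBytes(StandardCharsets.UTF_8))); // prints "true"
    }
}
```

If those assumptions hold, the value written for ts would need to match whatever validator the column is declared with, which may be why changing only the column name serialization still produces the same trace.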
Any ideas off the top of your head Kaz? Did you test this with Nutch 2.x head
or 2.1?
[0]
http://svn.apache.org/repos/asf/nutch/branches/2.x/src/java/org/apache/nutch/storage/WebPage.java
> Nutch 2.1 does not work with gora-cassandra 0.2.1
> -------------------------------------------------
>
> Key: GORA-182
> URL: https://issues.apache.org/jira/browse/GORA-182
> Project: Apache Gora
> Issue Type: Bug
> Components: storage-cassandra
> Affects Versions: 0.2.1
> Reporter: Kazuomi Kashii
> Attachments: GORA-182.patch
>
>
> Nutch 2.1 does not work with gora-cassandra 0.2.1.
> Especially, "outlinks" field is not written.
> I have confirmed this issue on Mac OS X and CentOS.