[jira] [Commented] (GORA-184) Gora with Hadoop 1.0.3 + Hbase 0.92.0 + Avro 1.5.3
[ https://issues.apache.org/jira/browse/GORA-184?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13498842#comment-13498842 ]

Lewis John McGibbney commented on GORA-184:
---
Hi Alfonso, this sounds like a reasonable feature request; however, there are some issues as well, and I therefore propose the following:
- Hadoop 1.0.3/4: the upgrade should be addressed... a patch would be very welcome :)
- Avro 1.5.3: Much more recent versions of the library are available... the aim would be to address Ed's work over in GORA-94
- HBase 0.92.0: Currently the Gora community is working on and supporting HBase version 0.90.X... AFAIK this is going to continue unless someone proposes justification behind a strategic switch...

Gora with Hadoop 1.0.3 + Hbase 0.92.0 + Avro 1.5.3
--
Key: GORA-184
URL: https://issues.apache.org/jira/browse/GORA-184
Project: Apache Gora
Issue Type: Improvement
Affects Versions: 0.2.1
Reporter: Alfonso Nishikawa

I have seen the upgrade to Hadoop 1.0.1 [#GORA-76], but I ask for Hadoop 1.0.3 because it is the specific version I use, although Hadoop 1.0.4 was released recently.

-- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (GORA-182) Nutch 2.1 does not work with gora-cassandra 0.2.1
[ https://issues.apache.org/jira/browse/GORA-182?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13498930#comment-13498930 ]

Lewis John McGibbney commented on GORA-182:
---
Hi Kaz, I checked out gora-core and gora-cassandra 0.2.1, built the modules locally, then manually copied them over to my Nutch installation. Upon injecting URLs into Cassandra, I get the following.
{code}
me.prettyprint.hector.api.exceptions.HInvalidRequestException: InvalidRequestException(why:(String didn't validate.) [webpage][f][ts] failed validation)
    at me.prettyprint.cassandra.service.ExceptionsTranslatorImpl.translate(ExceptionsTranslatorImpl.java:52)
    at me.prettyprint.cassandra.service.KeyspaceServiceImpl$1.execute(KeyspaceServiceImpl.java:97)
    at me.prettyprint.cassandra.service.KeyspaceServiceImpl$1.execute(KeyspaceServiceImpl.java:90)
    at me.prettyprint.cassandra.service.Operation.executeAndSetResult(Operation.java:101)
    at me.prettyprint.cassandra.connection.HConnectionManager.operateWithFailover(HConnectionManager.java:233)
    at me.prettyprint.cassandra.service.KeyspaceServiceImpl.operateWithFailover(KeyspaceServiceImpl.java:131)
    at me.prettyprint.cassandra.service.KeyspaceServiceImpl.batchMutate(KeyspaceServiceImpl.java:102)
    at me.prettyprint.cassandra.service.KeyspaceServiceImpl.batchMutate(KeyspaceServiceImpl.java:108)
    at me.prettyprint.cassandra.model.MutatorImpl$3.doInKeyspace(MutatorImpl.java:248)
    at me.prettyprint.cassandra.model.MutatorImpl$3.doInKeyspace(MutatorImpl.java:245)
    at me.prettyprint.cassandra.model.KeyspaceOperationCallback.doInKeyspaceAndMeasure(KeyspaceOperationCallback.java:20)
    at me.prettyprint.cassandra.model.ExecutingKeyspace.doExecute(ExecutingKeyspace.java:85)
    at me.prettyprint.cassandra.model.MutatorImpl.execute(MutatorImpl.java:245)
    at me.prettyprint.cassandra.model.MutatorImpl.insert(MutatorImpl.java:71)
    at org.apache.gora.cassandra.store.HectorUtils.insertColumn(HectorUtils.java:47)
    at org.apache.gora.cassandra.store.CassandraClient.addColumn(CassandraClient.java:169)
    at org.apache.gora.cassandra.store.CassandraStore.addOrUpdateField(CassandraStore.java:347)
    at org.apache.gora.cassandra.store.CassandraStore.flush(CassandraStore.java:228)
    at org.apache.gora.cassandra.store.CassandraStore.close(CassandraStore.java:95)
    at org.apache.gora.mapreduce.GoraRecordWriter.close(GoraRecordWriter.java:55)
    at org.apache.hadoop.mapred.MapTask$NewDirectOutputCollector.close(MapTask.java:651)
    at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:766)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:370)
    at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:212)
Caused by: InvalidRequestException(why:(String didn't validate.) [webpage][f][ts] failed validation)
    at org.apache.cassandra.thrift.Cassandra$batch_mutate_result.read(Cassandra.java:19479)
    at org.apache.cassandra.thrift.Cassandra$Client.recv_batch_mutate(Cassandra.java:1035)
    at org.apache.cassandra.thrift.Cassandra$Client.batch_mutate(Cassandra.java:1009)
    at me.prettyprint.cassandra.service.KeyspaceServiceImpl$1.execute(KeyspaceServiceImpl.java:95)
    ... 22 more
{code}
The offending fetchTime field in Nutch WebPage [0], which is mapped in gora-cassandra-mapping.xml, is of long data type. Initially I thought to add appropriate methods using Hector's LongSerializer for the creation and insertion of columnNames in o.a.g.c.store.HectorUtils; however, once I repackage and attempt to inject, I get the above trace again. Any ideas off the top of your head, Kaz? Did you test this with Nutch 2.x head or 2.1?

[0] http://svn.apache.org/repos/asf/nutch/branches/2.x/src/java/org/apache/nutch/storage/WebPage.java

Nutch 2.1 does not work with gora-cassandra 0.2.1
-
Key: GORA-182
URL: https://issues.apache.org/jira/browse/GORA-182
Project: Apache Gora
Issue Type: Bug
Components: storage-cassandra
Affects Versions: 0.2.1
Reporter: Kazuomi Kashii
Attachments: GORA-182.patch

Nutch 2.1 does not work with gora-cassandra 0.2.1. In particular, the outlinks field is not written. I have confirmed this issue on Mac OS X and CentOS.
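One possible reading of the "String didn't validate" message, offered only as an assumption and not confirmed anywhere in this thread, is that the `ts` column is being validated as UTF-8 text while the value written is the raw 8-byte encoding of a long. The sketch below uses only the JDK (no Hector or Cassandra classes); `LongVsUtf8`, `encodeLong` and `isValidUtf8` are hypothetical names chosen for illustration. It checks whether the big-endian bytes of a long happen to be well-formed UTF-8.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

// Hypothetical helper for illustration; not part of Gora, Hector or Cassandra.
class LongVsUtf8 {

    // Encode a long as 8 big-endian bytes (ByteBuffer's default byte order).
    static byte[] encodeLong(long value) {
        return ByteBuffer.allocate(Long.BYTES).putLong(value).array();
    }

    // Strict check: true only if the bytes are well-formed UTF-8.
    static boolean isValidUtf8(byte[] bytes) {
        try {
            StandardCharsets.UTF_8.newDecoder()
                    .onMalformedInput(CodingErrorAction.REPORT)
                    .onUnmappableCharacter(CodingErrorAction.REPORT)
                    .decode(ByteBuffer.wrap(bytes));
            return true;
        } catch (CharacterCodingException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        long fetchTime = 1353000000000L; // example epoch-millisecond timestamp (assumed value)
        // Raw long bytes versus the same number written as decimal text.
        System.out.println(isValidUtf8(encodeLong(fetchTime)));
        System.out.println(isValidUtf8("1353000000000".getBytes(StandardCharsets.UTF_8)));
    }
}
```

If this reading is right, the serializer used for the value and the validator declared for the column would need to agree; whether that is actually what happens in gora-cassandra 0.2.1 would need to be confirmed against the keyspace definition.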
Re: Build failed in Jenkins: goraamazon_branch #136
Hi Henry,

I've been trying to narrow this down. It's to do with deploying the snapshots to repository.apache.org. Basically this is what is happening. Currently the gora-core-0.3-SNAPSHOT-tests.jar [0] and accompanying signatures are all dated Tue Jul 24 05:48:55 UTC 2012, i.e. they have not been updated. As the WebServiceTestBase class (I think) is not included in this older SNAPSHOT, the build fails. There is no *problem* with the code; this is purely a Maven logistical pain in the neck. I've been trying to resolve it by deploying a non-unique tests snapshot for gora-core, but I think there is a bug in the deploy plugin, as I am only able to deploy unique snapshots. I'll keep working on this, Henry.

Lewis

[0] https://repository.apache.org/content/repositories/snapshots/org/apache/gora/gora-core/0.3-SNAPSHOT/

On Fri, Nov 16, 2012 at 4:59 PM, Henry Saputra henry.sapu...@gmail.com wrote:
Hi Guys, I am still seeing the "cannot find symbol" error in the gora-dynamo module when building from trunk. Is there a bug to trace this issue? - Henry

On Wed, Oct 31, 2012 at 10:37 AM, Lewis John Mcgibbney lewis.mcgibb...@gmail.com wrote:
Hi Renato, It's OK. I can confirm that things work fine when I run locally as well. The problem relates to the restructuring of the core modules and the fact that the goraamazon build pulls the 0.3-SNAPSHOT dependency (generated from trunk), which doesn't contain the new restructuring. You can see these errors here https://builds.apache.org/view/G-L/view/Gora/job/goraamazon_branch/136/org.apache.gora$gora-dynamodb/console Please check them out and we can either discuss here or offline to confirm that they would be resolved once the changes are ported to trunk.

On Wed, Oct 31, 2012 at 5:27 PM, Renato Marroquín Mogrovejo renatoj.marroq...@gmail.com wrote:
Hi, I don't know what breaks things in here. I tested locally and it worked fine. Lewis, I remember you talking about this a while ago; do you have any clue on this?
Or a place where I could start digging? If anybody has an idea of where to start on digging please let me know. Thanks in advance! Renato M. 2012/10/31 Apache Jenkins Server jenk...@builds.apache.org: See https://builds.apache.org/job/goraamazon_branch/136/changes Changes: [rmarroquin] Committing new patch for changes in the way exception were being handled. -- [...truncated 19900 lines...] [INFO] Installing https://builds.apache.org/job/goraamazon_branch/ws/branches/goraamazon/gora-cassandra/pom.xml to /home/jenkins/jenkins-slave/maven-repositories/0/org/apache/gora/gora-cassandra/0.3-SNAPSHOT/gora-cassandra-0.3-SNAPSHOT.pom mojoSucceeded org.apache.maven.plugins:maven-install-plugin:2.3.1(default-install) mojoStarted org.apache.felix:maven-bundle-plugin:2.3.7(default-install) [INFO] [INFO] --- maven-bundle-plugin:2.3.7:install (default-install) @ gora-cassandra --- [INFO] Installing org/apache/gora/gora-cassandra/0.3-SNAPSHOT/gora-cassandra-0.3-SNAPSHOT.jar [INFO] Writing OBR metadata mojoSucceeded org.apache.felix:maven-bundle-plugin:2.3.7(default-install) projectSucceeded org.apache.gora:gora-cassandra:0.3-SNAPSHOT projectStarted org.apache.gora:gora-dynamodb:0.3-SNAPSHOT [INFO] [INFO] [INFO] Building Apache Gora :: Dynamodb 0.3-SNAPSHOT [INFO] [INFO] Source directory: https://builds.apache.org/job/goraamazon_branch/ws/branches/goraamazon/gora-dynamodb/src/examples/java added. 
mojoStarted org.codehaus.mojo:build-helper-maven-plugin:1.7(default) [INFO] [INFO] --- build-helper-maven-plugin:1.7:add-source (default) @ gora-dynamodb --- mojoSucceeded org.codehaus.mojo:build-helper-maven-plugin:1.7(default) mojoStarted org.apache.maven.plugins:maven-remote-resources-plugin:1.2.1(default) [INFO] [INFO] --- maven-remote-resources-plugin:1.2.1:process (default) @ gora-dynamodb --- mojoSucceeded org.apache.maven.plugins:maven-remote-resources-plugin:1.2.1(default) [debug] execute contextualize mojoStarted org.apache.maven.plugins:maven-resources-plugin:2.5(default-resources)[INFO] Using 'UTF-8' encoding to copy filtered resources. [INFO] skip non existing resourceDirectory https://builds.apache.org/job/goraamazon_branch/ws/branches/goraamazon/gora-dynamodb/src/main/resources [INFO] Copying 0 resource [INFO] Copying 3 resources [INFO] [INFO] --- maven-resources-plugin:2.5:resources (default-resources) @ gora-dynamodb --- mojoSucceeded org.apache.maven.plugins:maven-resources-plugin:2.5(default-resources) mojoStarted org.apache.maven.plugins:maven-compiler-plugin:2.3.2(default-compile) [INFO] [INFO] --- maven-compiler-plugin:2.3.2:compile (default-compile) @ gora-dynamodb --- [INFO] Compiling 7 source files to https
Re: Build failed in Jenkins: goraamazon_branch #136
This is also an excellent reminder to complete the wiki entry and this very subject... nice one Henry :0)
[jira] [Updated] (GORA-186) Show better errors when a field is missing in HBase mapping
[ https://issues.apache.org/jira/browse/GORA-186?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Lewis John McGibbney updated GORA-186:
--
Fix Version/s: 0.3

Show better errors when a field is missing in HBase mapping
---
Key: GORA-186
URL: https://issues.apache.org/jira/browse/GORA-186
Project: Apache Gora
Issue Type: Improvement
Components: storage-hbase
Affects Versions: 0.2, 0.2.1
Environment: Ubuntu 12.04, avro 1.3.2, hbase 0.92.0, gora 0.2.1
Reporter: Alfonso Nishikawa
Assignee: Alfonso Nishikawa
Priority: Trivial
Fix For: 0.3
Attachments: GORA-186.patch, GORA-186v2.patch

When a field is wrongly typed or missing in gora-hbase-mapping.xml, a NullPointerException is raised in org.apache.gora.hbase.store.HBaseStore:235. Checking for this case would make it possible to report which field is missing or wrong.
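The request above amounts to replacing a late NullPointerException with an early, descriptive failure. A minimal sketch of that idea follows; `MappingLookup`, `addField` and `columnFor` are hypothetical names, and this is not the code in the attached patches or in HBaseStore.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for a field-to-column mapping; not Gora's actual classes.
class MappingLookup {
    private final Map<String, String> columnsByField = new HashMap<>();

    void addField(String fieldName, String column) {
        columnsByField.put(fieldName, column);
    }

    // Fail with the offending field name instead of letting a null propagate
    // and surface later as a bare NullPointerException.
    String columnFor(String fieldName) {
        String column = columnsByField.get(fieldName);
        if (column == null) {
            throw new IllegalStateException("Field '" + fieldName
                    + "' has no mapping; check gora-hbase-mapping.xml");
        }
        return column;
    }
}
```

The exception type is a design choice made for the sketch; the point is only that the message names the field.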
[jira] [Commented] (GORA-183) dataStore.put() -org.apache.gora.hbase.util.HBaseInterface#toBytes(). Unknown type: UNION
[ https://issues.apache.org/jira/browse/GORA-183?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13501227#comment-13501227 ]

Lewis John McGibbney commented on GORA-183:
---
{bq}Uhm noticed now that this issue is related to HBase. Does anyone know if it affects Cassandra, SQL,... too? {bq}
Honest answer is no. I will try to write the schema and mapping implementations for gora-cassandra and get back to you here.

dataStore.put() -org.apache.gora.hbase.util.HBaseInterface#toBytes(). Unknown type: UNION
--
Key: GORA-183
URL: https://issues.apache.org/jira/browse/GORA-183
Project: Apache Gora
Issue Type: Bug
Components: storage-hbase
Affects Versions: 0.2, 0.2.1
Environment: Ubuntu 12.04, HBase 0.92.0, Gora 0.2.1, Avro 1.3.2
Reporter: Alfonso Nishikawa
Assignee: Alfonso Nishikawa

Summary: HBase does not handle the Avro UNION type (in the schema, like ["string","null"]). When trying to write a row I get the RuntimeException "Unknown type: UNION". My .avsc is the following:
{code}
{"name": "TestRow",
 "type": "record",
 "namespace": "es.foo.tests.storage",
 "fields": [
   {"name": "columnLong", "type": "long", "default": 0},
   {"name": "unionRecursive", "type": ["TestRow", "null"]},
   {"name": "unionString", "type": ["string", "null"]},
   {"name": "family2", "type": {"type": "map", "values": "string"}}
 ]
}
{code}
My mapping is:
{code}
<?xml version="1.0" encoding="UTF-8"?>
<gora-orm>
  <table name="test">
    <!-- Family configuration -->
    <family name="family1" maxVersions="1" compression="SNAPPY" />
    <family name="family2" maxVersions="1" compression="SNAPPY" />
  </table>
  <class table="test" keyClass="java.lang.String" name="es.foo.tests.storage.TestRow">
    <field name="unionString" family="family1" qualifier="unionString"/>
    <field name="unionRecursive" family="family1" qualifier="unionRecursive" />
    <field name="columnLong" family="family1" qualifier="colInteger" />
    <field name="family2" family="family2" />
  </class>
</gora-orm>
{code}
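To make the union problem concrete: a value of a two-branch union such as the string/null one in the schema above has to be stored together with an indication of which branch it belongs to. The sketch below shows one simplified way to do that with a leading tag byte. It is an assumption-laden illustration only: `NullableStringCodec` and its methods are hypothetical, the tag byte is not Avro's actual binary encoding, and this is not the fix applied in GORA-183.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical codec for a [string, null] union; not HBaseInterface#toBytes().
class NullableStringCodec {
    static final byte STRING_BRANCH = 0; // index of "string" in the union
    static final byte NULL_BRANCH = 1;   // index of "null" in the union

    // Prefix the payload with the branch index so the reader knows which type follows.
    static byte[] toBytes(String value) {
        if (value == null) {
            return new byte[] { NULL_BRANCH };
        }
        byte[] text = value.getBytes(StandardCharsets.UTF_8);
        byte[] out = new byte[text.length + 1];
        out[0] = STRING_BRANCH;
        System.arraycopy(text, 0, out, 1, text.length);
        return out;
    }

    // Read the branch index first, then decode the rest accordingly.
    static String fromBytes(byte[] bytes) {
        if (bytes.length == 0) {
            throw new IllegalArgumentException("empty value");
        }
        if (bytes[0] == NULL_BRANCH) {
            return null;
        }
        if (bytes[0] != STRING_BRANCH) {
            throw new IllegalArgumentException("unknown union branch " + bytes[0]);
        }
        return new String(bytes, 1, bytes.length - 1, StandardCharsets.UTF_8);
    }
}
```

The tag makes an empty string distinguishable from null, which a plain "write the UTF-8 bytes" approach cannot do.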
[jira] [Commented] (GORA-174) GORA compiler does not handle [string, null] unions in the AVRO schema
[ https://issues.apache.org/jira/browse/GORA-174?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13501229#comment-13501229 ]

Lewis John McGibbney commented on GORA-174:
---
Hi, I'll try the new patch both with gora trunk and with the Nutch 2.x InjectorJob.

GORA compiler does not handle [string, null] unions in the AVRO schema
--
Key: GORA-174
URL: https://issues.apache.org/jira/browse/GORA-174
Project: Apache Gora
Issue Type: Bug
Components: schema
Affects Versions: 0.2.1
Reporter: Julien Nioche
Assignee: Alfonso Nishikawa
Fix For: 0.3
Attachments: GORA-174-test.patch, GORA-174v2.patch

See NUTCH-1477 for description. We are getting an NPE when using the DataFileAvroStore. To avoid that, I modified the schema to allow null values on some fields, e.g. {"name": "baseUrl", "type": ["string", "null"]}. However, when generating the code for the schema, the accessors are not generated by GORA, which prevents Nutch from compiling.
[jira] [Updated] (GORA-189) String parameters in generated Persistent subclasses by Compiler -not only Utf8-
[ https://issues.apache.org/jira/browse/GORA-189?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Lewis John McGibbney updated GORA-189:
--
Fix Version/s: 0.3

String parameters in generated Persistent subclasses by Compiler -not only Utf8-
Key: GORA-189
URL: https://issues.apache.org/jira/browse/GORA-189
Project: Apache Gora
Issue Type: Improvement
Components: gora-core
Affects Versions: 0.2.1
Reporter: Alfonso Nishikawa
Assignee: Alfonso Nishikawa
Priority: Trivial
Fix For: 0.3
Attachments: GORA-189-code.patch

It would be very useful if the Gora compiler generated methods taking Strings as parameters (creating the Utf8 internally). Code that populates those classes would be much clearer and simpler.
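The kind of overload being requested could look roughly like the following. This is a sketch under stated assumptions: `Utf8Stub` is a stand-in for Avro's Utf8 so that the example compiles on its own, and `WebPageSketch` is a hypothetical shape for a generated class, not the output of the attached patch.

```java
import java.nio.charset.StandardCharsets;

// Stand-in for Avro's Utf8 so the sketch compiles without Avro on the classpath.
final class Utf8Stub {
    private final byte[] bytes;

    Utf8Stub(String text) {
        this.bytes = text.getBytes(StandardCharsets.UTF_8);
    }

    @Override
    public String toString() {
        return new String(bytes, StandardCharsets.UTF_8);
    }
}

// Hypothetical shape of a generated Persistent subclass.
class WebPageSketch {
    private Utf8Stub url;

    // Setter of the kind generated today: the caller must build the Utf8 value.
    void setUrl(Utf8Stub value) {
        this.url = value;
    }

    // Requested convenience overload: wrap the String internally.
    void setUrl(String value) {
        setUrl(value == null ? (Utf8Stub) null : new Utf8Stub(value));
    }

    Utf8Stub getUrl() {
        return url;
    }
}
```

With the overload in place a caller can write `page.setUrl("http://bar.com/")` directly. One trade-off of overloading on two reference types is that a literal `setUrl(null)` becomes ambiguous and needs a cast.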
[jira] [Updated] (GORA-182) Nutch 2.1 does not work with gora-cassandra 0.2.1
[ https://issues.apache.org/jira/browse/GORA-182?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Lewis John McGibbney updated GORA-182:
--
Fix Version/s: 0.3

Nutch 2.1 does not work with gora-cassandra 0.2.1
-
Key: GORA-182
URL: https://issues.apache.org/jira/browse/GORA-182
Project: Apache Gora
Issue Type: Bug
Components: storage-cassandra
Affects Versions: 0.2.1
Reporter: Kazuomi Kashii
Fix For: 0.3
Attachments: GORA-182.patch

Nutch 2.1 does not work with gora-cassandra 0.2.1. In particular, the outlinks field is not written. I have confirmed this issue on Mac OS X and CentOS.
[jira] [Updated] (GORA-184) Gora with Hadoop 1.0.3 + Hbase 0.92.0 + Avro 1.5.3
[ https://issues.apache.org/jira/browse/GORA-184?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Lewis John McGibbney updated GORA-184:
--
Fix Version/s: 0.4

Gora with Hadoop 1.0.3 + Hbase 0.92.0 + Avro 1.5.3
--
Key: GORA-184
URL: https://issues.apache.org/jira/browse/GORA-184
Project: Apache Gora
Issue Type: Improvement
Affects Versions: 0.2.1
Reporter: Alfonso Nishikawa
Fix For: 0.4

I have seen the upgrade to Hadoop 1.0.1 [#GORA-76], but I ask for Hadoop 1.0.3 because it is the specific version I use, although Hadoop 1.0.4 was released recently.
[jira] [Resolved] (GORA-176) GoraCI
[ https://issues.apache.org/jira/browse/GORA-176?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Lewis John McGibbney resolved GORA-176.
---
Resolution: Duplicate

Closing as duplicate of GORA-73

GoraCI
--
Key: GORA-176
URL: https://issues.apache.org/jira/browse/GORA-176
Project: Apache Gora
Issue Type: Umbrella
Components: testing
Affects Versions: 0.3
Reporter: Renato Javier Marroquín Mogrovejo
[jira] [Updated] (GORA-183) dataStore.put() -org.apache.gora.hbase.util.HBaseInterface#toBytes(). Unknown type: UNION
[ https://issues.apache.org/jira/browse/GORA-183?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Lewis John McGibbney updated GORA-183:
--
Fix Version/s: 0.3

dataStore.put() -org.apache.gora.hbase.util.HBaseInterface#toBytes(). Unknown type: UNION
--
Key: GORA-183
URL: https://issues.apache.org/jira/browse/GORA-183
Project: Apache Gora
Issue Type: Bug
Components: storage-hbase
Affects Versions: 0.2, 0.2.1
Environment: Ubuntu 12.04, HBase 0.92.0, Gora 0.2.1, Avro 1.3.2
Reporter: Alfonso Nishikawa
Assignee: Alfonso Nishikawa
Fix For: 0.3

Summary: HBase does not handle the Avro UNION type (in the schema, like ["string","null"]). When trying to write a row I get the RuntimeException "Unknown type: UNION". My .avsc is the following:
{code}
{"name": "TestRow",
 "type": "record",
 "namespace": "es.foo.tests.storage",
 "fields": [
   {"name": "columnLong", "type": "long", "default": 0},
   {"name": "unionRecursive", "type": ["TestRow", "null"]},
   {"name": "unionString", "type": ["string", "null"]},
   {"name": "family2", "type": {"type": "map", "values": "string"}}
 ]
}
{code}
My mapping is:
{code}
<?xml version="1.0" encoding="UTF-8"?>
<gora-orm>
  <table name="test">
    <!-- Family configuration -->
    <family name="family1" maxVersions="1" compression="SNAPPY" />
    <family name="family2" maxVersions="1" compression="SNAPPY" />
  </table>
  <class table="test" keyClass="java.lang.String" name="es.foo.tests.storage.TestRow">
    <field name="unionString" family="family1" qualifier="unionString"/>
    <field name="unionRecursive" family="family1" qualifier="unionRecursive" />
    <field name="columnLong" family="family1" qualifier="colInteger" />
    <field name="family2" family="family2" />
  </class>
</gora-orm>
{code}
[jira] [Updated] (GORA-187) gora-hbase always writing column when dirty, even if value is default or null
[ https://issues.apache.org/jira/browse/GORA-187?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Lewis John McGibbney updated GORA-187:
--
Fix Version/s: 0.3

gora-hbase always writing column when dirty, even if value is default or null
-
Key: GORA-187
URL: https://issues.apache.org/jira/browse/GORA-187
Project: Apache Gora
Issue Type: Improvement
Components: storage-hbase
Affects Versions: 0.2.1
Environment: Ubuntu 12.04, HBase 0.92.0
Reporter: Alfonso Nishikawa
Priority: Minor
Fix For: 0.3

When writing a field (tested with a 'long' whose default is '0'), the column is not written if the field is not dirty when saving. If the field is set to 1 and then back to 0, saving writes that default value. With strings, after fixing [GORA-183], I noticed that null values are written too (whether they are the default or not).
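The behaviour described above can be modelled with a small dirty-flag sketch. Everything here is hypothetical (`DirtyField` and its method names are not Gora classes); it only contrasts "write whenever dirty" with "write when dirty and the value is neither null nor the default".

```java
import java.util.Objects;

// Hypothetical model of a persistent field with a dirty flag; not Gora's implementation.
class DirtyField<T> {
    private final T defaultValue;
    private T value;
    private boolean dirty;

    DirtyField(T defaultValue) {
        this.defaultValue = defaultValue;
        this.value = defaultValue;
    }

    // Any call to set() marks the field dirty, even if the value ends up at the default.
    void set(T newValue) {
        this.value = newValue;
        this.dirty = true;
    }

    // Behaviour as reported: the column is written whenever the field is dirty.
    boolean writtenWhenDirty() {
        return dirty;
    }

    // Alternative raised by the issue: skip null and default values.
    boolean writtenWhenDirtyAndNonDefault() {
        return dirty && value != null && !Objects.equals(value, defaultValue);
    }
}
```

Note the trade-off: if a non-default value was persisted earlier, skipping the write of the default would leave the old value in the store, so the second policy is not automatically safe.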
[jira] [Updated] (GORA-188) testSerdeWebPage failure - PersistentBase#equals() fails with map fields
[ https://issues.apache.org/jira/browse/GORA-188?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Lewis John McGibbney updated GORA-188:
--
Fix Version/s: 0.3

testSerdeWebPage failure - PersistentBase#equals() fails with map fields
Key: GORA-188
URL: https://issues.apache.org/jira/browse/GORA-188
Project: Apache Gora
Issue Type: Bug
Components: gora-core
Affects Versions: 0.2.1
Reporter: Alfonso Nishikawa
Priority: Minor
Fix For: 0.3

As shown here:
{code}
junit.framework.AssertionFailedError: expected:<org.apache.gora.examples.generated.WebPage@4b49ab6f { url:http://bar.com/; content:java.nio.HeapByteBuffer[pos=1 lim=1 cap=1] parsedContent:[1] outlinks:{http://bazbar.com=a8, http://baz.com/1.jspq=barbazp=foo=a6, http://baz.com/1.jspq=barbaz=a5, http://bar.com/3.jsp=a3, http://bar.com/1.html=a4, http://foo.com/1.html=a1, http://foo.com/2.html=a2, http://baz.com/1.jspq=foo=a7}; metadata:org.apache.gora.examples.generated.Metadata@51a { version:1 data:{metakey=metavalue} } }> but was:<org.apache.gora.examples.generated.WebPage@4b6d94c0 { url:http://bar.com/; content:java.nio.HeapByteBuffer[pos=0 lim=1 cap=1] parsedContent:[1] outlinks:{http://baz.com/1.jspq=barbaz=a5, http://baz.com/1.jspq=barbazp=foo=a6, http://bazbar.com=a8, http://bar.com/3.jsp=a3, http://foo.com/1.html=a1, http://bar.com/1.html=a4, http://foo.com/2.html=a2, http://baz.com/1.jspq=foo=a7}; metadata:org.apache.gora.examples.generated.Metadata@51a { version:1 data:{metakey=metavalue} } }>
    at junit.framework.Assert.fail(Assert.java:50)
    at junit.framework.Assert.failNotEquals(Assert.java:287)
    at junit.framework.Assert.assertEquals(Assert.java:67)
    at junit.framework.Assert.assertEquals(Assert.java:74)
    at org.apache.gora.util.TestIOUtils.testSerializeDeserialize(TestIOUtils.java:125)
    at org.apache.gora.mapreduce.TestPersistentSerialization.testSerdeWebPage(TestPersistentSerialization.java:85)
{code}
The difference is the order of the outlinks. I guess they should be considered equal. Am I wrong?
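Two JDK facts may help when narrowing this down, sketched below with plain collections and buffers rather than the generated WebPage class. `Map.equals` ignores entry order, while `ByteBuffer.equals` compares only the remaining bytes, so two buffers over identical data can differ if their positions differ; the trace above prints `pos=1` for one content buffer and `pos=0` for the other. Whether PersistentBase#equals() delegates to either of these is an assumption that would need checking; `EqualityCheck` and its methods are hypothetical names.

```java
import java.nio.ByteBuffer;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical demonstration class; uses only JDK types, not Gora's generated classes.
class EqualityCheck {

    // Same entries inserted in a different order: Map.equals compares mappings, not order.
    static boolean mapsEqualDespiteOrder() {
        Map<String, String> first = new LinkedHashMap<>();
        first.put("http://foo.com/1.html", "a1");
        first.put("http://foo.com/2.html", "a2");

        Map<String, String> second = new LinkedHashMap<>();
        second.put("http://foo.com/2.html", "a2");
        second.put("http://foo.com/1.html", "a1");

        return first.equals(second);
    }

    // Same backing data but different positions: ByteBuffer.equals compares remaining bytes only.
    static boolean buffersEqualDespitePosition() {
        ByteBuffer consumed = ByteBuffer.wrap(new byte[] { 1 });
        consumed.get(); // now pos=1 lim=1, nothing remaining

        ByteBuffer fresh = ByteBuffer.wrap(new byte[] { 1 }); // pos=0 lim=1, one byte remaining

        return consumed.equals(fresh);
    }

    public static void main(String[] args) {
        System.out.println(mapsEqualDespiteOrder());
        System.out.println(buffersEqualDespitePosition());
    }
}
```

If the map comparison turns out not to be the culprit, the differing buffer positions would be a reasonable next thing to examine.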
[jira] [Commented] (GORA-75) Improve documentation for DataStoreTestUtil
[ https://issues.apache.org/jira/browse/GORA-75?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13502983#comment-13502983 ]

Lewis John McGibbney commented on GORA-75:
--
Does anyone have a perspective on this one? I suggest closing it unless folks want more method annotation.

Improve documentation for DataStoreTestUtil
---
Key: GORA-75
URL: https://issues.apache.org/jira/browse/GORA-75
Project: Apache Gora
Issue Type: Improvement
Components: documentation
Affects Versions: 0.1.1-incubating
Reporter: Lewis John McGibbney
Fix For: 0.3

As there are half a dozen or so tests within the above class, it is essential that each is thoroughly documented so that there is no doubt over how data stores treat various query and delete operations. To date it is causing some confusion and is hindering development.
[jira] [Updated] (GORA-185) Remove ANT scripts and IVY confs
[ https://issues.apache.org/jira/browse/GORA-185?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Lewis John McGibbney updated GORA-185:
--
Attachment: GORA-185.patch

patch for trunk

Remove ANT scripts and IVY confs
Key: GORA-185
URL: https://issues.apache.org/jira/browse/GORA-185
Project: Apache Gora
Issue Type: Task
Components: build process
Affects Versions: 0.2.1
Reporter: Julien Nioche
Fix For: 0.3
Attachments: GORA-185.patch

There are currently build resources for ANT+IVY as well as Maven. According to Lewis, only the latter is now used, in which case it would be better to remove all the ANT+IVY stuff to avoid any confusion.
[jira] [Resolved] (GORA-75) Improve documentation for DataStoreTestUtil
[ https://issues.apache.org/jira/browse/GORA-75?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Lewis John McGibbney resolved GORA-75.
--
Resolution: Fixed

This issue related mostly to my unfamiliarity with our testing suite. Hopefully now it is clear and can be communicated to others. Thanks, Henry.

Improve documentation for DataStoreTestUtil
---
Key: GORA-75
URL: https://issues.apache.org/jira/browse/GORA-75
Project: Apache Gora
Issue Type: Improvement
Components: documentation
Affects Versions: 0.1.1-incubating
Reporter: Lewis John McGibbney
Fix For: 0.3

As there are half a dozen or so tests within the above class, it is essential that each is thoroughly documented so that there is no doubt over how data stores treat various query and delete operations. To date it is causing some confusion and is hindering development.
Re: Test errors java.lang.NoSuchMethodError: org.apache.gora.store.DataStore.setConf
Hi Henry & Others,

We are close to sorting this one out, I promise. Basically, I pruned out all of the old SNAPSHOT jar and tests.jar dependencies from repository.apache.org and adapted the Jenkins build to clean and deploy only successful SNAPSHOTs after every CI build. This has now resolved the problem we were having with the compilation failures in the dynamodb module. Now we are left with a scenario where many (74) tests fail with the following exception [0]. I looked at the change log for DataStoreTestUtil [1] and see that we've missed the additional IOExceptions which were added during the dynamodb module development.

@Renato, is it possible for you to have a look at trunk and see if the removal/correct implementation of such exceptions is necessary... and if so, where? This would be excellent and would also allow us to move towards addressing the other Dynamodb issues currently open on Jira. I think getting these tests passing is top priority just now, considering the changes which have been made further to the integration of the dynamodb module.

Thanks all,
Lewis

[0] https://builds.apache.org/job/gora-trunk/org.apache.gora$gora-cassandra/532/testReport/org.apache.gora.cassandra.store/TestCassandraStore/testNewInstance/
[1] http://svn.apache.org/viewvc/gora/trunk/gora-core/src/test/java/org/apache/gora/store/DataStoreTestUtil.java?r1=1363659&r2=1405417&diff_format=h

On Wed, Nov 28, 2012 at 7:09 PM, Lewis John Mcgibbney lewis.mcgibb...@gmail.com wrote:
Re: Test errors java.lang.NoSuchMethodError: org.apache.gora.store.DataStore.setConf
Hi, It looks like the core DataStore class also needs to be tidied up a bit: some methods still throw IOExceptions, whereas others don't, as the functionality has been moved further upstream. The Javadoc annotations for each method also need to be right; this is essential as DataStore is one of the key classes in Gora. We will get to the bottom of it soon :0) Best Lewis On Thu, Nov 29, 2012 at 2:48 PM, Lewis John Mcgibbney lewis.mcgibb...@gmail.com wrote:
[jira] [Created] (GORA-190) Add version switch to bin/gora script
Lewis John McGibbney created GORA-190: - Summary: Add version switch to bin/gora script Key: GORA-190 URL: https://issues.apache.org/jira/browse/GORA-190 Project: Apache Gora Issue Type: Improvement Reporter: Lewis John McGibbney Assignee: Lewis John McGibbney Priority: Minor Fix For: 0.3 This should act as a sure means of ensuring that Gora is properly installed on the target operating system. I have never used Gora on anything other than Ubuntu, so this will help us in the future to identify interoperability with other operating systems.
[jira] [Created] (GORA-191) Add a constructor to GoraCompiler so it can be used outside of Gora.
Lewis John McGibbney created GORA-191: - Summary: Add a constructor to GoraCompiler so it can be used outside of Gora. Key: GORA-191 URL: https://issues.apache.org/jira/browse/GORA-191 Project: Apache Gora Issue Type: Improvement Components: gora-core, schema Reporter: Lewis John McGibbney Priority: Critical Fix For: 0.3 We need to automate the compiling of various .avsc files over in Nutch. We should add a constructor to GoraCompiler so it can be used more widely.
[jira] [Commented] (GORA-174) GORA compiler does not handle [string, null] unions in the AVRO schema
[ https://issues.apache.org/jira/browse/GORA-174?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13509812#comment-13509812 ] Lewis John McGibbney commented on GORA-174: --- Hi Alfonso, at the beginning of the weekend I spent time working between the patched GoraCompiler and Nutch 2.x. As Julien described in his initial problem description, when generating the code for the schema the accessors are not generated by Gora, which prevents Nutch from compiling. This still seems to be the case even when using GORA-174v3.patch, so I am not sure that the patch properly fixes this issue. GORA compiler does not handle [string, null] unions in the AVRO schema -- Key: GORA-174 URL: https://issues.apache.org/jira/browse/GORA-174 Project: Apache Gora Issue Type: Bug Components: schema Affects Versions: 0.2.1 Reporter: Julien Nioche Assignee: Alfonso Nishikawa Fix For: 0.3 Attachments: failed_tests_after_v3.tar.gz, GORA-174-test.patch, GORA-174v2.patch, GORA-174v3.patch See NUTCH-1477 for description. We are getting an NPE when using the DataFileAvroStore. In order to avoid that I modified the schema to allow for null values on some fields, e.g. {"name": "baseUrl", "type": ["string", "null"]}, however when generating the code for the schema the accessors are not generated by GORA, which prevents Nutch from compiling.
Re: Build failed in Jenkins: Nutch-nutchgora #420
Hi All, On Sat, Dec 1, 2012 at 4:16 AM, Apache Jenkins Server jenk...@builds.apache.org wrote:
See https://builds.apache.org/job/Nutch-nutchgora/420/
[junit] Running org.apache.nutch.crawl.TestGenerator
[junit] Tests run: 4, Failures: 0, Errors: 1, Time elapsed: 63.594 sec
[junit] Test org.apache.nutch.crawl.TestGenerator FAILED
BUILD FAILED
/zonestorage/hudson_solaris/home/hudson/hudson-slave/workspace/Nutch-nutchgora/2.x/build.xml:423: Tests failed!
Total time: 6 minutes 18 seconds
Build step 'Invoke Ant' marked build as failure
Publishing Javadoc
Below are the relevant parts from the test report...
2012-12-01 04:12:20,948 WARN mapred.FileOutputCommitter (FileOutputCommitter.java:cleanupJob(100)) - Output path is null in cleanup
2012-12-01 04:12:20,948 WARN mapred.LocalJobRunner (LocalJobRunner.java:run(298)) - job_local_0003
java.lang.ClassCastException: org.apache.gora.mapreduce.GoraInputSplit cannot be cast to org.apache.hadoop.mapred.InputSplit
  at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:412)
  at org.apache.hadoop.mapred.MapTask.run(MapTask.java:372)
  at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:212)
2012-12-01 04:12:21,941 INFO mapred.JobClient (JobClient.java:monitorAndPrintJob(1301)) - map 0% reduce 0%
2012-12-01 04:12:21,941 INFO mapred.JobClient (JobClient.java:monitorAndPrintJob(1356)) - Job complete: job_local_0003
Testcase: testGenerateHostLimit took 8.548 sec
Caused an ERROR
job failed: name=generate: 1354335140-1018707298, jobid=job_local_0003
java.lang.RuntimeException: job failed: name=generate: 1354335140-1018707298, jobid=job_local_0003
  at org.apache.nutch.util.NutchJob.waitForCompletion(NutchJob.java:54)
  at org.apache.nutch.crawl.GeneratorJob.run(GeneratorJob.java:191)
  at org.apache.nutch.crawl.GeneratorJob.generate(GeneratorJob.java:213)
  at org.apache.nutch.crawl.TestGenerator.generateFetchlist(TestGenerator.java:258)
  at org.apache.nutch.crawl.TestGenerator.testGenerateHostLimit(TestGenerator.java:138)
I am not sure where in Nutch 2.x the GoraInputSplit is being incorrectly cast, however I'll try to find it. Anyone have any ideas? Best Lewis -- Lewis
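For anyone reading the trace: the failing frame is MapTask.runOldMapper, i.e. the old-API (org.apache.hadoop.mapred) code path, while GoraInputSplit, as far as I can tell, only extends the new-API org.apache.hadoop.mapreduce.InputSplit. The two InputSplit types are unrelated, so the cast has to fail. The stdlib-only sketch below mimics that mismatch with stand-in types. OldApi, NewApi, SketchGoraSplit and SplitCastSketch are hypothetical names, not Hadoop or Gora classes, and the sketch says nothing about where in Nutch 2.x the job ends up on the old-API path.

```java
/** Stand-ins for the two unrelated Hadoop split types; all names here are hypothetical. */
class OldApi {
  /** Plays the role of org.apache.hadoop.mapred.InputSplit (an interface). */
  interface InputSplit {}
}

class NewApi {
  /** Plays the role of org.apache.hadoop.mapreduce.InputSplit (an abstract class). */
  abstract static class InputSplit {}
}

/** Plays the role of GoraInputSplit: it only extends the new-API type. */
class SketchGoraSplit extends NewApi.InputSplit {}

class SplitCastSketch {
  /** Mimics an old-API runner that assumes every split is an old-API split. */
  static String runOldStyle(Object split) {
    try {
      OldApi.InputSplit old = (OldApi.InputSplit) split; // fails for new-API splits
      return "ran with " + old.getClass().getSimpleName();
    } catch (ClassCastException e) {
      return "ClassCastException";
    }
  }

  public static void main(String[] args) {
    // The new-API split cannot be treated as an old-API split.
    System.out.println(runOldStyle(new SketchGoraSplit()));
  }
}
```

If this reading is right, the thing to look for is whichever job configuration causes the runner to pick the old mapper path for a job whose input format hands out new-API splits.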
[jira] [Commented] (GORA-174) GORA compiler does not handle [string, null] unions in the AVRO schema
[ https://issues.apache.org/jira/browse/GORA-174?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13509916#comment-13509916 ] Lewis John McGibbney commented on GORA-174: --- I can confirm that after applying the patch to gora-core and then using this dependency with Nutch 2.x, the Exception is identical to the one documented in NUTCH-1477. When I compile the webpage avsc's here [0], the webpage class is generated but with no accessors for the fields with the union case we are concerned with. I am also still learning the GoraCompiler (and Avro stuff), so I do not have a definitive solution to hand just now. I think the serializing and deserializing problems which seem to be a result of introducing union support should be addressed as we encounter them. [0] http://svn.apache.org/repos/asf/nutch/branches/2.x/src/gora/ GORA compiler does not handle [string, null] unions in the AVRO schema -- Key: GORA-174 URL: https://issues.apache.org/jira/browse/GORA-174 Project: Apache Gora Issue Type: Bug Components: schema Affects Versions: 0.2.1 Reporter: Julien Nioche Assignee: Alfonso Nishikawa Fix For: 0.3 Attachments: failed_tests_after_v3.tar.gz, GORA-174-test.patch, GORA-174v2.patch, GORA-174v3.patch See NUTCH-1477 for description. We are getting an NPE when using the DataFileAvroStore. In order to avoid that I modified the schema to allow for null values on some fields, e.g. {"name": "baseUrl", "type": ["string", "null"]}, however when generating the code for the schema the accessors are not generated by GORA, which prevents Nutch from compiling.
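To make the missing-accessor symptom concrete, here is a rough sketch of what "handling" a two-branch nullable union could look like when emitting accessors: resolve the union to its single non-null branch and generate the getter and setter for that type. This is an illustration only and not GoraCompiler or Avro code. NullableUnionAccessorSketch, nonNullBranch, javaType and accessors are hypothetical names, and the Avro-to-Java type mapping is an assumption made for the sketch (Utf8 for string follows what Gora-generated classes use).

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical illustration only; this is not GoraCompiler code. */
class NullableUnionAccessorSketch {

  /** Returns the single non-null branch of a union such as ["string", "null"]. */
  static String nonNullBranch(List<String> unionBranches) {
    String found = null;
    for (String branch : unionBranches) {
      if (!"null".equals(branch)) {
        if (found != null) {
          throw new IllegalArgumentException("more than one non-null branch: " + unionBranches);
        }
        found = branch;
      }
    }
    if (found == null) {
      throw new IllegalArgumentException("no non-null branch: " + unionBranches);
    }
    return found;
  }

  /** Maps an Avro primitive type name to a Java type name (assumed mapping for the sketch). */
  static String javaType(String avroType) {
    switch (avroType) {
      case "string": return "Utf8";
      case "long": return "Long";
      case "int": return "Integer";
      default: throw new IllegalArgumentException("unsupported in this sketch: " + avroType);
    }
  }

  /** Emits getter and setter source text for a field declared as a nullable union. */
  static String accessors(String field, List<String> unionBranches) {
    String type = javaType(nonNullBranch(unionBranches));
    String cap = Character.toUpperCase(field.charAt(0)) + field.substring(1);
    return "public " + type + " get" + cap + "() { return " + field + "; }\n"
        + "public void set" + cap + "(" + type + " value) { this." + field + " = value; }\n";
  }

  public static void main(String[] args) {
    // The baseUrl field from the issue description, declared as ["string", "null"].
    System.out.print(accessors("baseUrl", Arrays.asList("string", "null")));
  }
}
```

Unions with more than one non-null branch are rejected here on purpose; how those should map to Java accessors is a separate question that this sketch does not try to answer.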
[jira] [Created] (GORA-192) Tests for GoraCompiler
Lewis John McGibbney created GORA-192: - Summary: Tests for GoraCompiler Key: GORA-192 URL: https://issues.apache.org/jira/browse/GORA-192 Project: Apache Gora Issue Type: Improvement Components: avro, gora-core, testing Reporter: Lewis John McGibbney Fix For: 0.4 The recent issues surrounding the GoraCompiler have clearly made a case for establishing some testing criteria for this important class.
[jira] [Commented] (GORA-192) Tests for GoraCompiler
[ https://issues.apache.org/jira/browse/GORA-192?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13509932#comment-13509932 ] Lewis John McGibbney commented on GORA-192: --- As GoraCompiler is largely based on Avro's SpecificCompiler, some basic tests to use as a guide can be seen here: http://svn.apache.org/repos/asf/avro/trunk/lang/java/compiler/src/test/java/org/apache/avro/compiler/TestSpecificCompiler.java Tests for GoraCompiler -- Key: GORA-192 URL: https://issues.apache.org/jira/browse/GORA-192 Project: Apache Gora Issue Type: Improvement Components: avro, gora-core, testing Reporter: Lewis John McGibbney Fix For: 0.4 The recent issues surrounding the GoraCompiler have clearly made a case for establishing some testing criteria for this important class.
[jira] [Resolved] (GORA-193) Make sure gora-core test dependency is always generated when packaging.
[ https://issues.apache.org/jira/browse/GORA-193?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lewis John McGibbney resolved GORA-193. --- Resolution: Fixed Assignee: Lewis John McGibbney Committed @revision 1417112 in trunk Make sure gora-core test dependency is always generated when packaging. --- Key: GORA-193 URL: https://issues.apache.org/jira/browse/GORA-193 Project: Apache Gora Issue Type: Improvement Components: maven Affects Versions: 0.2.1 Reporter: Lewis John McGibbney Assignee: Lewis John McGibbney Priority: Trivial Fix For: 0.3 The trivial addition of the maven-jar-plugin test-jar goal will ensure that the test dependency is always produced for gora-core. This is important.
[jira] [Commented] (GORA-191) Add a constructor to GoraCompiler so it can be used outside of Gora.
[ https://issues.apache.org/jira/browse/GORA-191?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13509942#comment-13509942 ] Lewis John McGibbney commented on GORA-191: --- This issue should also incorporate the addition of functionality to allow GoraCompiler to accept a List[] of input schemas as is done in Avro's SpecificCompiler. Add a constructor to GoraCompiler so it can be used outside of Gora. Key: GORA-191 URL: https://issues.apache.org/jira/browse/GORA-191 Project: Apache Gora Issue Type: Improvement Components: gora-core, schema Reporter: Lewis John McGibbney Priority: Critical Fix For: 0.3 We need to automate the compiling of various .avsc files over in Nutch. We should add a constructor to GoraCompiler so it can be used more widely.
Re: Gora Site Docs
Hi Henry, When I started on this my opinion changed somewhat. The investment required is as follows: 1) Maven (svnpubsub): I can grab the Maven Fluido skin [0] (which looks OK) and have it up and running reasonably shortly. 2) Apache CMS: this requires someone to write the site, however the publishing workflow is so much less hassle. What do you think? Lewis [0] http://maven.apache.org/skins/maven-fluido-skin/ On Wed, Dec 5, 2012 at 7:12 PM, Henry Saputra henry.sapu...@gmail.com wrote: Lewis, With Maven site, does it mean we are still using svnpubsub, or could we move to ASF CMS for publishing it? - Henry On Fri, Nov 30, 2012 at 3:47 AM, Lewis John Mcgibbney lewis.mcgibb...@gmail.com wrote: Hi All, I'm currently setting about the transition from Forrest to Maven for the site docs. The current two-tier structure which we maintain for the site docs complicates things. I therefore propose to just have docs. Enis, you had reasons and justification behind the legacy Gora documentation structure; if you could remind us again it would be excellent. I am working on this today regardless and will hopefully have a new proposal for the Mavenized site prepared shortly. Thanks, everyone. Happy St Andrew's Day Lewis -- Lewis
Re: Gora Site Docs
I started the Maven transition and can complete it tomorrow. Is everyone happy with the Fluido skin that I mentioned? If so then I will work to get it sorted out tomorrow. Best Lewis On Wed, Dec 5, 2012 at 7:46 PM, Enis Söztutar enis@gmail.com wrote: Sorry, meant to reply to this, but it totally fell off my radar. The reason why we are doing per-release and release-independent docs is that there are some docs that document the code (tutorial, javadoc, etc.), and some docs that don't (the main site). Having said that, I don't think keeping the docs separated is a blocker for going Maven. We can merge these, and if it still makes sense to separate the two, we can do it later. Enis On Wed, Dec 5, 2012 at 11:25 AM, Lewis John Mcgibbney lewis.mcgibb...@gmail.com wrote:
Re: Gora Site Docs
Hi, AFAIK the site docs (in their current form) are in the xdocs format, however it seems that they are also marked up with some html/xhtml/xdoc, meaning that automating the transformation into the apt format is becoming a hellishly tedious and extremely time-consuming task. I'm using the Doxia converter tool to do this but every document seems to have numerous problems and the Doxia stack traces make me want to smash the place up. I'll persist and see where I get. @Henry, if I can I'll get the new site sorted for all of the site docs, so that we can use Maven for publishing from now on. Thanks Lewis On Wed, Dec 5, 2012 at 9:07 PM, Henry Saputra henry.sapu...@gmail.com wrote: +1 for the Fluido skin. But this is just for the release-independent site, right? - Henry On Wed, Dec 5, 2012 at 12:59 PM, Lewis John Mcgibbney lewis.mcgibb...@gmail.com wrote:
Re: Test errors java.lang.NoSuchMethodError: org.apache.gora.store.DataStore.setConf
No hassle. There is still work to be done here Henry. It is certainly something on the radar and we can work on GORA-89 over this week/weekend hopefully. On Thu, Dec 6, 2012 at 11:23 PM, Henry Saputra henry.sapu...@gmail.com wrote: Ah mea culpa, I was replying to the wrong thread. I was trying to reply about the initial work you did for GORA-89 https://issues.apache.org/jira/browse/GORA-89 =( My bad, still recovering from the cold. - Henry On Thu, Dec 6, 2012 at 2:45 PM, Lewis John Mcgibbney lewis.mcgibb...@gmail.com wrote: Hi Henry, On Mon, Dec 3, 2012 at 7:06 PM, Henry Saputra henry.sapu...@gmail.com wrote: Unfortunately I am pretty much out this weekend due to a cold and probably for some part of this week. Hopefully you are getting better. You want to see it here in Scotland... pure baltic is the understatement of 2012. Lewis, you can continue working on this or I will take a look at this once I get my health back. I committed GORA-193 and the tests passed on build #539 https://builds.apache.org/view/G-L/view/Gora/job/gora-trunk/539/ #540 also passed, so (fingers crossed) it seems like we've finally narrowed the dependency problem down and resolved it. I've re-enabled the email notifications on the trunk build, so from now on we will get the outcome of the builds sent to dev@ and the culprits will also receive a notification. If these get out of hand again we can disable them until we get the situation sorted. I think the best thing is to monitor the builds from now on; hopefully things are back to normal now. All the best for now Lewis -- Lewis
[jira] [Commented] (GORA-174) GORA compiler does not handle [string, null] unions in the AVRO schema
[ https://issues.apache.org/jira/browse/GORA-174?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13526497#comment-13526497 ] Lewis John McGibbney commented on GORA-174: --- I think the fix for this can be committed. Although I am running into different issues (as commented over in NUTCH-1477), GoraCompiler with the GORA-174v3.patch certainly generates the Java classes properly now, with getters and setters. We have a problem with the tests though, is this correct? I see your attachment Alfonso. GORA compiler does not handle [string, null] unions in the AVRO schema -- Key: GORA-174 URL: https://issues.apache.org/jira/browse/GORA-174 Project: Apache Gora Issue Type: Bug Components: schema Affects Versions: 0.2.1 Reporter: Julien Nioche Assignee: Alfonso Nishikawa Fix For: 0.3 Attachments: failed_tests_after_v3.tar.gz, GORA-174-test.patch, GORA-174v2.patch, GORA-174v3.patch See NUTCH-1477 for description. We are getting an NPE when using the DataFileAvroStore. In order to avoid that I modified the schema to allow for null values on some fields, e.g. {"name": "baseUrl", "type": ["string", "null"]}, however when generating the code for the schema the accessors are not generated by GORA, which prevents Nutch from compiling.
[jira] [Commented] (GORA-182) Nutch 2.1 does not work with gora-cassandra 0.2.1
[ https://issues.apache.org/jira/browse/GORA-182?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13529416#comment-13529416 ] Lewis John McGibbney commented on GORA-182: --- I am going to start work on this Kaz as it is a complete blocker IMO. Are you suggesting that we add support for Hector's LongSerializer to accommodate the nature of data and type(s) produced by the InjectorJob? Once I have a better understanding of this, I'm going to head over to hector users@. Nutch 2.1 does not work with gora-cassandra 0.2.1 - Key: GORA-182 URL: https://issues.apache.org/jira/browse/GORA-182 Project: Apache Gora Issue Type: Bug Components: storage-cassandra Affects Versions: 0.2.1 Reporter: Kazuomi Kashii Fix For: 0.3 Attachments: GORA-182.patch Nutch 2.1 does not work with gora-cassandra 0.2.1. In particular, the outlinks field is not written. I have confirmed this issue on Mac OS X and CentOS.
[jira] [Commented] (GORA-27) Optionally add license headers to generated files
[ https://issues.apache.org/jira/browse/GORA-27?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13531128#comment-13531128 ] Lewis John McGibbney commented on GORA-27: -- +1 for new patch Optionally add license headers to generated files - Key: GORA-27 URL: https://issues.apache.org/jira/browse/GORA-27 Project: Apache Gora Issue Type: Improvement Components: schema Affects Versions: 0.1-incubating, 0.2 Reporter: Andrzej Bialecki Fix For: 0.3 Attachments: GORA-27.patch, GORA-27-v2.patch, GORA-27v4.1.patch, GORA-27v4.patch Gora compiler should allow adding license headers to generated files.
Re: Gora and MongoDB
Hi, AFAIK, there is no work currently being undertaken to build a datastore for 10gen's MongoDB. If you would like to open an issue, please do. If you begin submitting patches, I'm sure that the community could and will test the code in an attempt to get a MongoDB module for Gora. This would be very much welcomed. Please keep us posted as to how you are getting on. Best Lewis On Tue, Dec 18, 2012 at 7:23 PM, Poulard, Fabien fpoul...@dictanova.com wrote: Hi all, I'm Fabien, I co-founded a company specialized in opinion mining on the Web. We use Nutch 2.x for our crawling needs... and therefore Apache Gora as an abstraction layer between our NoSQL datastore and Nutch results. We've been using HBase so far. But we'd like to give 10gen MongoDB a shot. I've started working on a Gora datastore for MongoDB. I've searched in the archives and in Jira but did not find anything related to MongoDB. Before going any further I'd like to check if anyone else is working on such a thing and if I may find myself stuck on some difficulties I did not anticipate. Any hint would help ;) -- Fabien Poulard Associé-Fondateur Dictanova Tél. 02 51 12 59 68 / 06 65 58 94 77 Dictanova 2, rue de la Houssinière - BP 92208 44322 Nantes Cedex 03 -- Lewis
[jira] [Created] (GORA-194) Upgrade to Hadoop 1.1.1
Lewis John McGibbney created GORA-194: - Summary: Upgrade to Hadoop 1.1.1 Key: GORA-194 URL: https://issues.apache.org/jira/browse/GORA-194 Project: Apache Gora Issue Type: Improvement Components: build process, maven Affects Versions: 0.2.1 Reporter: Lewis John McGibbney Fix For: 0.3 Over in Nutchland, Markus recently committed NUTCH-1510, which covered an upgrade of the underlying Hadoop dependency to 1.1.1 with significant performance improvements. It would be excellent to upgrade and see if we can identify any performance gains in Gora as well.
[DISCUSS] Timeline for Gora 0.3 Release Thoughts
Hi All, Firstly, Happy New Year everyone. I really hope that 2013 is a good year for everyone. It would be excellent to get a Gora 0.3 release done, however there are a couple of blocking issues. As I see it we have the following:
- GORA-182 https://issues.apache.org/jira/browse/GORA-182 Nutch 2.1 does not work with gora-cassandra 0.2.1
- GORA-170 https://issues.apache.org/jira/browse/GORA-170 Getting a BufferUnderflowException in class CassandraColumn, method fromByteBuffer()
- GORA-188 https://issues.apache.org/jira/browse/GORA-188 testSerdeWebPage failure - PersistentBase#equals() fails with map fields
- GORA-189 https://issues.apache.org/jira/browse/GORA-189 String parameters in generated Persistent subclasses by Compiler -not only Utf8-
The thing is that some of these are linked, and I also anticipate that we may run into other problems once some/all have been resolved so to speak. The purpose of this thread is to attempt to draw up some roadmap for releasing, and of course to understand what is required in the development drive for us to reach this target. Any input would be excellent. Best Lewis -- Lewis
[jira] [Commented] (GORA-89) Avoid HBase MiniCluster restarts to shorten gora-hbase tests
[ https://issues.apache.org/jira/browse/GORA-89?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13542939#comment-13542939 ] Lewis John McGibbney commented on GORA-89: -- Hi Henry, apologies for the time away from this issue. I will be travelling and will try out the patch ASAP. Sorry again and thanks for taking the time + uploading your work. Avoid HBase MiniCluster restarts to shorten gora-hbase tests Key: GORA-89 URL: https://issues.apache.org/jira/browse/GORA-89 Project: Apache Gora Issue Type: Improvement Components: storage-hbase Affects Versions: 0.2 Reporter: Lewis John McGibbney Priority: Critical Fix For: 0.3 Attachments: GORA-89-hsaputra.patch, GORA-89.patch Currently our hbase tests are taking forever and a day. We should shorten the time by avoiding MiniCluster restarts. Just implement the cluster as a singleton and clean up the tables between tests by doing a scan and deletes for all rows. It's much faster than restarting the cluster. For a code reference please see the implementation here [1]. The class is HBaseClusterSingleton. It needs some refactoring but I think it's enough to speed up your tests. Thanks Ioan for the heads up. [1] http://svn.apache.org/repos/asf/james/mailbox/trunk/hbase/src/test/java/org/apache/james/mailbox/hbase/
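The idea in the issue description (start the cluster once, then clean the tables between tests instead of restarting) can be sketched without any HBase dependency as below. This is only a shape for discussion, not the James HBaseClusterSingleton or any HBase testing API. FakeMiniCluster and ClusterSingletonSketch are hypothetical stand-ins, and an in-memory map plays the role of a table.

```java
import java.util.ArrayList;
import java.util.Map;
import java.util.TreeMap;

/** Stand-in for an expensive test cluster; hypothetical, not the HBase testing API. */
class FakeMiniCluster {
  static int startCount = 0;                          // how many times a cluster was "started"
  final Map<String, String> rows = new TreeMap<>();   // plays the role of one table

  FakeMiniCluster() {
    startCount++;                                     // pretend this step takes minutes
  }
}

class ClusterSingletonSketch {
  private static FakeMiniCluster instance;

  /** Starts the cluster on first use and reuses the same instance afterwards. */
  static synchronized FakeMiniCluster get() {
    if (instance == null) {
      instance = new FakeMiniCluster();
    }
    return instance;
  }

  /** Cleans up between tests: scan all row keys, then delete each row. */
  static void clearAllRows() {
    Map<String, String> rows = get().rows;
    for (String key : new ArrayList<>(rows.keySet())) { // "scan" into a copy of the keys
      rows.remove(key);                                  // "delete" row by row
    }
  }

  public static void main(String[] args) {
    get().rows.put("row1", "a");   // first "test" writes a row
    clearAllRows();                // cleanup instead of a restart
    get().rows.put("row2", "b");   // second "test" starts from an empty table
    System.out.println("starts=" + FakeMiniCluster.startCount + " rows=" + get().rows.keySet());
  }
}
```

In a real test suite the cleanup call would presumably sit in a per-test teardown hook, with the cluster shut down once when the JVM exits; both of those details are assumptions here rather than anything the patch is known to do.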
Re: [DISCUSS] Timeline for Gora 0.3 Release Thoughts
Hi Henry, Is it a serious proposal to bump all of the issues quoted thus far to blocker for the 0.3 release (the remaining issues can be bumped to 0.4) so we have a clear vision for the 0.3 development drive? Thanks Lewis On Thu, Jan 3, 2013 at 7:34 PM, Henry Saputra henry.sapu...@gmail.com wrote: Hi Lewis, Thanks for starting this discussion. At least a couple of the Jira issues you mentioned are blocked by GORA-174. I think this is a good list of blockers that must be fixed and resolved before we start preparing for the 0.3 release. - Henry On Thu, Jan 3, 2013 at 5:42 AM, Lewis John Mcgibbney lewis.mcgibb...@gmail.com wrote:
Re: [DISCUSS] Timeline for Gora 0.3 Release Thoughts
Hi Alfonso, Thanks for this. On Sun, Jan 6, 2013 at 1:19 PM, Alfonso Nishikawa alfonso.nishik...@gmail.com wrote: By the way... github repo is not in sync with official subversion repo. M... The official GitHub mirror 'should' technically be up-to-date (give or take the latency between updates)... Lewis
Re: [DISCUSS] Timeline for Gora 0.3 Release Thoughts
Hi All, Another note on this: I've edited the Jira instance now so that we have a clear strategy for the 0.3 development drive. The remaining 0.3 issues can be viewed at http://s.apache.org/0Z Thanks and here's to 0.3 :0) Lewis On Mon, Jan 7, 2013 at 1:57 PM, Lewis John Mcgibbney lewis.mcgibb...@gmail.com wrote:
[jira] [Commented] (GORA-24) Throwing EOFException with MEDIUMBLOB type for inlinks column
[ https://issues.apache.org/jira/browse/GORA-24?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13546402#comment-13546402 ]
Lewis John McGibbney commented on GORA-24: -- The gora-sql module is now deprecated (due to licensing issues). Please correct me if I am wrong, but my outlook on this one is as follows:
- write support for MEDIUMBLOB into the new gora-sql module
- accompany this with better error handling/message logging and some additional guidance in the gora-sql-mapping.xml file
There is little we can do about this in Gora until the new gora-sql module is written, therefore any problems which are experienced using gora-sql with Nutch 2.x (or any other client application for that matter) will need to be addressed at that level, not within Gora.
Throwing EOFException with MEDIUMBLOB type for inlinks column - Key: GORA-24 URL: https://issues.apache.org/jira/browse/GORA-24 Project: Apache Gora Issue Type: Bug Components: storage-sql Environment: MySQL Reporter: Alexis Fix For: 0.4
I had an exception with DbUpdaterJob complaining that the inlinks column of type BLOB in the webpage table was not big enough to store all the incoming links. So I changed the column definition in gora-sql-mapping.xml from BLOB to MEDIUMBLOB: <field name="inlinks" column="inlinks" jdbc-type="MEDIUMBLOB"/> Now I systematically get an exception in the update step:
java.io.IOException: java.sql.BatchUpdateException: Error reading from InputStream java.io.EOFException
 at org.apache.gora.sql.store.SqlStore.flush(SqlStore.java:341)
 at org.apache.gora.sql.store.SqlStore.close(SqlStore.java:185)
 at org.apache.gora.mapreduce.GoraRecordWriter.close(GoraRecordWriter.java:55)
 at org.apache.hadoop.mapred.ReduceTask.runNewReducer(ReduceTask.java:567)
 at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:408)
 at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:216)
Caused by: java.sql.BatchUpdateException: Error reading from InputStream java.io.EOFException
 at com.mysql.jdbc.PreparedStatement.executeBatchSerially(PreparedStatement.java:2020)
 at com.mysql.jdbc.PreparedStatement.executeBatch(PreparedStatement.java:1451)
 at org.apache.gora.sql.store.SqlStore.flush(SqlStore.java:329)
 ... 5 more
-- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
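For context on why the reporter moved from BLOB to MEDIUMBLOB in the first place: MySQL's BLOB family has fixed maximum payload sizes, and a plain BLOB tops out at 65,535 bytes. The sketch below is only an illustration of those limits; the helper name `smallest_blob_type` is hypothetical and not part of Gora or Nutch, and it says nothing about the EOFException itself.

```python
# Maximum payload sizes (in bytes) of the MySQL BLOB family.
MYSQL_BLOB_LIMITS = [
    ("TINYBLOB", 2**8 - 1),
    ("BLOB", 2**16 - 1),
    ("MEDIUMBLOB", 2**24 - 1),
    ("LONGBLOB", 2**32 - 1),
]


def smallest_blob_type(payload_size: int) -> str:
    """Return the smallest MySQL BLOB type able to hold payload_size bytes.

    Hypothetical helper for illustration only; not a Gora API.
    """
    if payload_size < 0:
        raise ValueError("payload size must be non-negative")
    for name, limit in MYSQL_BLOB_LIMITS:
        if payload_size <= limit:
            return name
    raise ValueError("payload exceeds LONGBLOB capacity")


# A serialized inlinks value just past the BLOB limit needs MEDIUMBLOB.
print(smallest_blob_type(65_536))
```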
[jira] [Assigned] (GORA-24) Throwing EOFException with MEDIUMBLOB type for inlinks column
[ https://issues.apache.org/jira/browse/GORA-24?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lewis John McGibbney reassigned GORA-24: Assignee: Lewis John McGibbney Throwing EOFException with MEDIUMBLOB type for inlinks column - Key: GORA-24 URL: https://issues.apache.org/jira/browse/GORA-24 Project: Apache Gora Issue Type: Bug Components: storage-sql Environment: MySQL Reporter: Alexis Assignee: Lewis John McGibbney Fix For: 0.4
[jira] [Commented] (GORA-24) Throwing EOFException with MEDIUMBLOB type for inlinks column
[ https://issues.apache.org/jira/browse/GORA-24?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13546406#comment-13546406 ]
Lewis John McGibbney commented on GORA-24: -- Hi Henry, yes, the idea is to use JOOQ as it provides support for a wide variety of SQL stores out of the box... something which would be very appealing to users for obvious reasons. There is a separate Jira issue on this topic altogether: GORA-86.
Throwing EOFException with MEDIUMBLOB type for inlinks column - Key: GORA-24 URL: https://issues.apache.org/jira/browse/GORA-24 Project: Apache Gora Issue Type: Bug Components: storage-sql Environment: MySQL Reporter: Alexis Assignee: Lewis John McGibbney Fix For: 0.4
[jira] [Updated] (GORA-195) [gora-hbase] Allow mapping of an array to a single column
[ https://issues.apache.org/jira/browse/GORA-195?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Lewis John McGibbney updated GORA-195: -- Fix Version/s: 0.4
[gora-hbase] Allow mapping of an array to a single column - Key: GORA-195 URL: https://issues.apache.org/jira/browse/GORA-195 Project: Apache Gora Issue Type: Improvement Components: storage-hbase Affects Versions: 0.2.1 Environment: HBase 0.90.4 backend, Hadoop 1.0.1 Reporter: Alfonso Nishikawa Priority: Trivial Fix For: 0.4
At this time, defining a mapping in HBase for an array field to a family:column like this:
{code}
{"name": "A", "fields": [ {"name": "field", "type": {"type": "array", "items": "string"}} ] }
<class name="A" ...> <field name="field" family="r" qualifier="c"/> </class>
{code}
is discouraged, since it leads to unexpected behavior when loading parts of the rest of the record. So, for now, arrays (and maps) may only be mapped to families.
Workaround: enclose the array inside an inner optional record like this:
{code}
{"name": "A", "fields": [ {"name": "holder", "type": ["null", { "name": "holderRecord", "type": "record", "fields": [ {"name": "field", "type": {"type": "array", "items": "string"}} ] }]} ] }
{code}
The need arises partly if you don't want to create a family for each array in your HBase database (which is advised against), or if you just want to map to a column when your array is read-only.
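The GORA-195 workaround can also be expressed as a schema transformation. What follows is a minimal sketch, assuming the schema is available as plain JSON; the function `wrap_array_fields` and the `Holder`/`HolderRecord` naming scheme are hypothetical and are not part of Gora or Avro.

```python
import copy
import json


def wrap_array_fields(schema: dict) -> dict:
    """Return a copy of a record schema in which every top-level array
    field is moved into an optional ("null" or record) holder field.

    Hypothetical helper mirroring the GORA-195 workaround; not a Gora API.
    """
    wrapped = copy.deepcopy(schema)
    new_fields = []
    for field in wrapped["fields"]:
        ftype = field["type"]
        if isinstance(ftype, dict) and ftype.get("type") == "array":
            # Enclose the array field in an inner optional record.
            new_fields.append({
                "name": field["name"] + "Holder",
                "type": ["null", {
                    "name": field["name"] + "HolderRecord",
                    "type": "record",
                    "fields": [field],
                }],
            })
        else:
            new_fields.append(field)
    wrapped["fields"] = new_fields
    return wrapped


schema = json.loads(
    '{"name": "A", "type": "record", "fields": ['
    '{"name": "field", "type": {"type": "array", "items": "string"}}]}'
)
result = wrap_array_fields(schema)
print(json.dumps(result, indent=2))
```

The holder field, rather than the array itself, is then what gets mapped to a single family:qualifier column.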
[jira] [Commented] (GORA-182) Nutch 2.1 does not work with gora-cassandra 0.2.1
[ https://issues.apache.org/jira/browse/GORA-182?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13549317#comment-13549317 ]
Lewis John McGibbney commented on GORA-182: --- This thread [0] is relevant to our issue. As you mentioned Kaz (and as confirmed by Nate), the stack trace I've been getting usually happens when trying to insert the raw byte form of, say, an integer into a column expecting a string. Although your patch may address the core issue here, I am sure there is still work to be done to avoid the stack trace, however I don't know whether this should be done in Gora or at client level. Are you in a position to comment, Kaz?
[0] https://groups.google.com/forum/?fromgroups=#!topic/hector-users/y2G7VFajHK8
Nutch 2.1 does not work with gora-cassandra 0.2.1 - Key: GORA-182 URL: https://issues.apache.org/jira/browse/GORA-182 Project: Apache Gora Issue Type: Bug Components: storage-cassandra Affects Versions: 0.2.1 Reporter: Kazuomi Kashii Fix For: 0.3 Attachments: GORA-182.patch
Nutch 2.1 does not work with gora-cassandra 0.2.1. Especially, outlinks field is not written. I have confirmed this issue on Mac OS X and CentOS.
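The failure mode described in the GORA-182 comment (raw integer bytes written to a column that expects a string) can be illustrated outside Cassandra. The sketch below is only a stand-in: it assumes a string validator amounts to checking that the bytes are well-formed UTF-8, and `passes_utf8_validation` is a hypothetical name, not Hector or Cassandra code.

```python
import struct


def passes_utf8_validation(value: bytes) -> bool:
    """Rough stand-in for a string column validator: accept only
    byte sequences that decode as well-formed UTF-8."""
    try:
        value.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


# The raw 4-byte big-endian form of the integer 255 ends in 0xFF,
# which can never appear in valid UTF-8.
raw_int = struct.pack(">i", 255)
# The same number written as text is ordinary ASCII.
as_text = "255".encode("utf-8")

print(passes_utf8_validation(raw_int), passes_utf8_validation(as_text))
```

Not every integer fails this check (small values encode to bytes that happen to be valid UTF-8), which is one reason such errors can appear intermittent.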
Re: Attendance @ ApacheCon NA 2013 Portland
Duh... I know you guys will be present and raring to go. That's a given! Any excuse eh ;) On Wed, Jan 9, 2013 at 6:58 PM, Mattmann, Chris A (388J) chris.a.mattm...@jpl.nasa.gov wrote: I'll be there (and so will Paul + Cam + Andrew + the rest of the OODT/Gora/etc. peeps that you know and love from JPL) Cheers! Chris On 1/9/13 9:52 PM, Lewis John Mcgibbney lewis.mcgibb...@gmail.com wrote: Hi All, This thread speaks for itself. Who is going, who is not. I'm just in a new position so I don't know if it is appropriate, convenient for me to take the short trip up to Portland, however Gora PMC/community members would surely build the case for me going. Any takers? Best Lewis -- *Lewis* -- *Lewis*
[jira] [Commented] (GORA-197) gora-cassandra requires BytesType for Cassandra column family validator
[ https://issues.apache.org/jira/browse/GORA-197?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13550503#comment-13550503 ] Lewis John McGibbney commented on GORA-197: --- Hi Kaz, which version do you wish to set this for? I say 0.3 if possible as the maven artifact (with this fix) would be really valuable elsewhere. gora-cassandra requires BytesType for Cassandra column family validator --- Key: GORA-197 URL: https://issues.apache.org/jira/browse/GORA-197 Project: Apache Gora Issue Type: Task Components: storage-cassandra Reporter: Kazuomi Kashii gora-cassandra requires BytesType for Cassandra column family validator in order to support Avro complex data type. If a user manually creates a column family with other type of validator, gora-cassandra cannot do anything but throw an exception. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
Document about GORA-174
Hi Alfonso, When you say that "...the first element in the union is considered as the default element, at this moment it is not implemented nor planned", does this refer to Avro?
On Sunday, January 13, 2013, Alfonso Nishikawa alfonso.nishik...@gmail.com wrote:
Hello everybody. I wrote an article [0] regarding GORA-174 where I try to explain a compatibility issue with old data in HBase. I really don't know how it affects other backends. I need some info if anyone knows. (@Renato: maybe you can tell me something about how it is in Cassandra :) ) I will appreciate your thoughts :) Thank you very much! Alfonso Nishikawa
[0] - http://people.apache.org/~alfonsonishikawa/gora-174.html
-- *Lewis*
[jira] [Updated] (GORA-199) Support MongoDB in GORA
[ https://issues.apache.org/jira/browse/GORA-199?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lewis John McGibbney updated GORA-199: -- Fix Version/s: 0.4 Support MongoDB in GORA --- Key: GORA-199 URL: https://issues.apache.org/jira/browse/GORA-199 Project: Apache Gora Issue Type: New Feature Components: storage Reporter: Fabien Poulard Priority: Minor Fix For: 0.4 Support 10gen MongoDB datastore in GORA. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (GORA-199) Support MongoDB in GORA
[ https://issues.apache.org/jira/browse/GORA-199?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13564760#comment-13564760 ] Lewis John McGibbney commented on GORA-199: --- Hi Fabien, out of curiosity, have you been working on this at all? Is there any code written? Support MongoDB in GORA --- Key: GORA-199 URL: https://issues.apache.org/jira/browse/GORA-199 Project: Apache Gora Issue Type: New Feature Components: storage Reporter: Fabien Poulard Priority: Minor Fix For: 0.4 Support 10gen MongoDB datastore in GORA. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (GORA-199) Support MongoDB in GORA
[ https://issues.apache.org/jira/browse/GORA-199?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13565795#comment-13565795 ]
Lewis John McGibbney commented on GORA-199: --- @Fabien, I totally missed these issues as I have recently moved to batch digests for all of the mailing lists I subscribe to. Please upload your patch here and we can begin to review. I understand from your threads elsewhere that you do not have the tests working with the TestDriver scenario, however this is not a huge problem. FYI we currently use Maven for the build lifecycle so maybe we can add that functionality as well. Finally, please feel free to explain a bit about your avro-gradle-plugin and how, if possible, we can use this within Gora. My understanding of Groovy is limited and Rails even less so, please be gentle ;) Thank you and apologies again as this one seems to have slipped right through the net.
Support MongoDB in GORA --- Key: GORA-199 URL: https://issues.apache.org/jira/browse/GORA-199 Project: Apache Gora Issue Type: New Feature Components: storage Reporter: Fabien Poulard Priority: Minor Fix For: 0.4
Support 10gen MongoDB datastore in GORA.
Re: dev Digest 30 Jan 2013 09:38:18 -0000 Issue 313
Hi Alfonso,
On Wed, Jan 30, 2013 at 1:38 AM, dev-digest-h...@gora.apache.org wrote:
Greetings, Gora+MongoDB is happy news. Good to know about that feature. Maybe someone should make a little document listing the classes involved in adding a new datastore (maybe I will try someday).
This would be really welcomed actually. It is something which we need to improve upon most certainly. Currently the process of contributing documentation to Gora is bloody difficult and this needs to change.
About compiling schemas, in my opinion, someday someone should do something for maven (plugin:) . But for now anything automated is welcome.
Well, some good news. Within Ed's patch and proposal for the Avro upgrade he re-factored the compiler (splitting it out into its own module) and the idea is to publish this and Renato's DynamoDB compiler as maven plugins which you can simply call from within your pom. This is something for the future though.
Thanks
Lewis
Cheers. Alfonso Nishikawa
On 28/01/2013 07:37, Poulard, Fabien fpoul...@dictanova.com wrote:
[jira] [Updated] (GORA-197) gora-cassandra requires BytesType for Cassandra column family validator
[ https://issues.apache.org/jira/browse/GORA-197?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lewis John McGibbney updated GORA-197: -- Fix Version/s: 0.3 gora-cassandra requires BytesType for Cassandra column family validator --- Key: GORA-197 URL: https://issues.apache.org/jira/browse/GORA-197 Project: Apache Gora Issue Type: Task Components: storage-cassandra Reporter: Kazuomi Kashii Fix For: 0.3 Attachments: GORA-197.patch gora-cassandra requires BytesType for Cassandra column family validator in order to support Avro complex data type. If a user manually creates a column family with other type of validator, gora-cassandra cannot do anything but throw an exception. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Created] (GORA-201) Upgrade HBase API Usage in Gora
Lewis John McGibbney created GORA-201: - Summary: Upgrade HBase API Usage in Gora Key: GORA-201 URL: https://issues.apache.org/jira/browse/GORA-201 Project: Apache Gora Issue Type: Bug Components: storage-hbase Affects Versions: 0.3 Reporter: Lewis John McGibbney Fix For: 0.4
We haven't touched the HBase versioning in a good while. When a new user heads over to the HBase site, they are directed to the 'stable' release which is currently sitting at 0.94.4. I realise that we have (legacy) support for the 0.90.X branch of HBase, but from what I can see, there is no current justification for this decision and it is also not within any strategic short/medium/long term objectives of Gora. This issue should
* Enable us to discuss what HBase branch we wish to support moving forward
* Actually implement the upgrade which gathers most consensus.
[jira] [Commented] (GORA-182) Nutch 2.1 does not work with gora-cassandra 0.2.1
[ https://issues.apache.org/jira/browse/GORA-182?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13568113#comment-13568113 ] Lewis John McGibbney commented on GORA-182: --- So this can now be closed Kaz? Nutch 2.1 does not work with gora-cassandra 0.2.1 - Key: GORA-182 URL: https://issues.apache.org/jira/browse/GORA-182 Project: Apache Gora Issue Type: Bug Components: storage-cassandra Affects Versions: 0.2.1 Reporter: Kazuomi Kashii Assignee: Kazuomi Kashii Fix For: 0.3 Attachments: GORA-182.patch Nutch 2.1 does not work with gora-cassandra 0.2.1. Especially, outlinks field is not written. I have confirmed this issue on Mac OS X and CentOS. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (GORA-196) OSX JDK7 failed to load snappy native library from snappy-java-1.0.4.1.jar.
[ https://issues.apache.org/jira/browse/GORA-196?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13568115#comment-13568115 ] Lewis John McGibbney commented on GORA-196: --- Hi Kaz. Can you please commit this when you get a chance? OSX JDK7 failed to load snappy native library from snappy-java-1.0.4.1.jar. --- Key: GORA-196 URL: https://issues.apache.org/jira/browse/GORA-196 Project: Apache Gora Issue Type: Test Components: storage-cassandra Environment: OSX JDK7 Reporter: Kazuomi Kashii Priority: Minor Fix For: 0.3 Attachments: GORA-196.patch OSX JDK7 failed to load snappy native library from snappy-java-1.0.4.1.jar which is currently specified in Cassandra, so gora-cassandra test failed. This is a known issue, and snappy 1.0.5 (currently M3) should fix this : https://github.com/xerial/snappy-java/issues/6 -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (GORA-196) OSX JDK7 failed to load snappy native library from snappy-java-1.0.4.1.jar.
[ https://issues.apache.org/jira/browse/GORA-196?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13568116#comment-13568116 ] Lewis John McGibbney commented on GORA-196: --- Maybe add it to parent pom and inherit it through the gora-cassandra project pom? OSX JDK7 failed to load snappy native library from snappy-java-1.0.4.1.jar. --- Key: GORA-196 URL: https://issues.apache.org/jira/browse/GORA-196 Project: Apache Gora Issue Type: Test Components: storage-cassandra Environment: OSX JDK7 Reporter: Kazuomi Kashii Priority: Minor Fix For: 0.3 Attachments: GORA-196.patch OSX JDK7 failed to load snappy native library from snappy-java-1.0.4.1.jar which is currently specified in Cassandra, so gora-cassandra test failed. This is a known issue, and snappy 1.0.5 (currently M3) should fix this : https://github.com/xerial/snappy-java/issues/6 -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (GORA-174) GORA compiler does not handle [string, null] unions in the AVRO schema
[ https://issues.apache.org/jira/browse/GORA-174?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Lewis John McGibbney updated GORA-174: -- Priority: Blocker (was: Major)
GORA compiler does not handle [string, null] unions in the AVRO schema -- Key: GORA-174 URL: https://issues.apache.org/jira/browse/GORA-174 Project: Apache Gora Issue Type: Bug Components: schema Affects Versions: 0.2.1 Reporter: Julien Nioche Assignee: Alfonso Nishikawa Priority: Blocker Fix For: 0.3 Attachments: failed_tests_after_v3.tar.gz, GORA-174-test.patch, GORA-174v2.patch, GORA-174v3.patch
See NUTCH-1477 for description. We are getting an NPE when using the DataFileAvroStore. To avoid that, I modified the schema to allow for null values on some fields, e.g. {"name": "baseUrl", "type": ["string", "null"]}, however when generating the code for the schema the accessors are not generated by GORA, which prevents Nutch from compiling.
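To make the union in GORA-174 concrete: a field type written as a JSON list is an Avro union, the field is nullable when "null" is one of its branches, and the Avro specification ties a field's default value to the first branch of the union. The following is a small sketch over plain JSON-style dictionaries; the helper names are hypothetical and this is not the Gora compiler's logic.

```python
def union_branches(field: dict) -> list:
    """Return the branches of a field's type; a non-union type is
    treated as a single-branch list."""
    ftype = field["type"]
    return list(ftype) if isinstance(ftype, list) else [ftype]


def is_nullable(field: dict) -> bool:
    """A field accepts null when "null" is one of its union branches."""
    return "null" in union_branches(field)


def default_branch(field: dict):
    """The branch a default value must match: the first one."""
    return union_branches(field)[0]


base_url = {"name": "baseUrl", "type": ["string", "null"]}
print(is_nullable(base_url), default_branch(base_url))
```

With the ordering used in the issue, the default branch is the string type; a schema author who wants a null default would list "null" first.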
[jira] [Commented] (GORA-202) gora-tutorial does not work with Cassandra
[ https://issues.apache.org/jira/browse/GORA-202?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13568243#comment-13568243 ] Lewis John McGibbney commented on GORA-202: --- Can you please commit this Kaz? gora-tutorial does not work with Cassandra -- Key: GORA-202 URL: https://issues.apache.org/jira/browse/GORA-202 Project: Apache Gora Issue Type: Bug Components: storage-cassandra Affects Versions: 0.2.1 Reporter: Kazuomi Kashii Assignee: Kazuomi Kashii Priority: Minor Fix For: 0.3 Attachments: GORA-202.patch gora-cassandra fails to initialize with gora-cassandra-mapping.xml of gora-tutorial. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Resolved] (GORA-196) OSX JDK7 failed to load snappy native library from snappy-java-1.0.4.1.jar.
[ https://issues.apache.org/jira/browse/GORA-196?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lewis John McGibbney resolved GORA-196. --- Resolution: Fixed OSX JDK7 failed to load snappy native library from snappy-java-1.0.4.1.jar. --- Key: GORA-196 URL: https://issues.apache.org/jira/browse/GORA-196 Project: Apache Gora Issue Type: Test Components: storage-cassandra Environment: OSX JDK7 Reporter: Kazuomi Kashii Priority: Minor Fix For: 0.3 Attachments: GORA-196.patch OSX JDK7 failed to load snappy native library from snappy-java-1.0.4.1.jar which is currently specified in Cassandra, so gora-cassandra test failed. This is a known issue, and snappy 1.0.5 (currently M3) should fix this : https://github.com/xerial/snappy-java/issues/6 -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Resolved] (GORA-182) Nutch 2.1 does not work with gora-cassandra 0.2.1
[ https://issues.apache.org/jira/browse/GORA-182?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lewis John McGibbney resolved GORA-182. --- Resolution: Fixed Nutch 2.1 does not work with gora-cassandra 0.2.1 - Key: GORA-182 URL: https://issues.apache.org/jira/browse/GORA-182 Project: Apache Gora Issue Type: Bug Components: storage-cassandra Affects Versions: 0.2.1 Reporter: Kazuomi Kashii Assignee: Kazuomi Kashii Fix For: 0.3 Attachments: GORA-182.patch Nutch 2.1 does not work with gora-cassandra 0.2.1. Especially, outlinks field is not written. I have confirmed this issue on Mac OS X and CentOS. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Resolved] (GORA-197) gora-cassandra requires BytesType for Cassandra column family validator
[ https://issues.apache.org/jira/browse/GORA-197?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lewis John McGibbney resolved GORA-197. --- Resolution: Fixed gora-cassandra requires BytesType for Cassandra column family validator --- Key: GORA-197 URL: https://issues.apache.org/jira/browse/GORA-197 Project: Apache Gora Issue Type: Task Components: storage-cassandra Reporter: Kazuomi Kashii Fix For: 0.3 Attachments: GORA-197.patch gora-cassandra requires BytesType for Cassandra column family validator in order to support Avro complex data type. If a user manually creates a column family with other type of validator, gora-cassandra cannot do anything but throw an exception. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (GORA-202) gora-tutorial does not work with Cassandra
[ https://issues.apache.org/jira/browse/GORA-202?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13568262#comment-13568262 ] Lewis John McGibbney commented on GORA-202: --- Hey Kaz. Can you close this one off please? Thank you gora-tutorial does not work with Cassandra -- Key: GORA-202 URL: https://issues.apache.org/jira/browse/GORA-202 Project: Apache Gora Issue Type: Bug Components: storage-cassandra Affects Versions: 0.2.1 Reporter: Kazuomi Kashii Assignee: Kazuomi Kashii Priority: Minor Fix For: 0.3 Attachments: GORA-202a.patch, GORA-202.patch gora-cassandra fails to initialize with gora-cassandra-mapping.xml of gora-tutorial. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[DRAFT] Gora Report
Hi All, We need to report again this month. Please see below for the report and please add/remove content where you see appropriate. I'll get this committed when we are done. Thanks Lewis
-
The Apache Gora open source framework provides an in-memory data model and persistence for big data. Gora supports persisting to column stores, key value stores, document stores and RDBMSs, and analyzing the data with extensive Apache Hadoop MapReduce support.
Project Releases
The Apache Gora team was happy to announce the release of Gora 0.2.1 on 7th August 2012. No releases have been made since, however a clear strategy has been established for the 0.3 release.
Overall Project Activity since last report
Since last reporting, the PMC has geared the development drive towards the 0.3 release. We have addressed and resolved 28 of 33 issues, meaning that the progression towards an RC for 0.3 is well on the way. We currently have two blockers which need to be addressed before we can consider the 0.3 RC.
How has the community developed since the last report?
Activity on the user@ list has been very slow since last reporting. It was envisaged that after ApacheCon EU user interest might pick up slightly, however this has not materialized as we hoped. Activity on dev@ has developed in line with our expectations as we move towards more regular Gora releases. Generally speaking, more work needs to be done to make it easier for people to use Gora. This is something which the PMC needs to work on.
Changes to PMC and Committers
The Gora PMC were very pleased to invite and have Alfonso Nishikawa join our ranks in early December. After working with the PMC to ensure a smooth transition into the Apache community, Alfonso is now contributing to Gora and making a real impact. Alfonso also joined the Gora PMC.
PMC and Committer diversity
We currently have committers from a wide variety of Apache projects including Nutch, Tika, OODT, Camel, Solr, Accumulo, Whirr and Hadoop (this is not an exhaustive list). We are still actively seeking one or more members to join the team from the Avro community, so this will be a main target for us in the future post 0.3 release.
Project Branding or Naming issues
NONE
Legal issues
NONE
-- Lewis
[jira] [Updated] (GORA-203) Bug in setting column field attribute qualifier in CassandraMapping
[ https://issues.apache.org/jira/browse/GORA-203?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lewis John McGibbney updated GORA-203: -- Description: Currently, we are absolutely required to set a value for a column field attribute qualifier; however, there are no checks to determine whether this is actually present or not, and this is a bug. Renato pointed this out and hopefully he can upload some stack traces to show the kind of issues one faces when qualifier attributes and their values are not present when mapping columns to Cassandra. As far as we know, column field attributes are supported in the most recent Cassandra data model (and this is not due to change), therefore we should also support them in Gora; however, it is a matter of opinion (please comment here) whether they should be optional or not. was: Currently, we are absolutely required to set a column field value attribute qualifier, however there are no checks to determine whether this is actually present or not. Renato pointed this out and hopefully he can upload some stack traces relating to the issue. As far as we know, column field attributes are supported in the most recent Cassandra data model therefore we should also support them in Gora, however it is my opinion (please comment) on whether they should be optional or not. Bug in setting column field attribute qualifier in CassandraMapping -- Key: GORA-203 URL: https://issues.apache.org/jira/browse/GORA-203 Project: Apache Gora Issue Type: Bug Components: storage-cassandra Affects Versions: 0.2.1 Reporter: Lewis John McGibbney Fix For: 0.3
-- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators. For more information on JIRA, see: http://www.atlassian.com/software/jira
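A minimal sketch of the kind of check the description says is missing, assuming the mapping file contains field elements that carry name, family and qualifier attributes. It uses the JDK's DOM parser rather than the JDOM calls in CassandraMapping so that it runs on its own; the class and method names are hypothetical and are not part of Gora.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Hypothetical helper, not part of Gora: reports mapped fields that lack a qualifier.
class QualifierCheck {

  /** Returns the names of all field elements whose qualifier attribute is missing or blank. */
  static List<String> fieldsMissingQualifier(String mappingXml) {
    Document doc;
    try {
      doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
          .parse(new ByteArrayInputStream(mappingXml.getBytes(StandardCharsets.UTF_8)));
    } catch (Exception e) {
      throw new IllegalArgumentException("Invalid mapping XML", e);
    }
    NodeList fields = doc.getElementsByTagName("field");
    List<String> missing = new ArrayList<>();
    for (int i = 0; i < fields.getLength(); i++) {
      Element field = (Element) fields.item(i);
      // DOM's getAttribute returns an empty string when the attribute is absent.
      if (field.getAttribute("qualifier").trim().isEmpty()) {
        missing.add(field.getAttribute("name"));
      }
    }
    return missing;
  }

  public static void main(String[] args) {
    String xml = "<gora-orm><class name=\"WebPage\" keyspace=\"WebPage\">"
        + "<field name=\"url\" family=\"p\" qualifier=\"c:u\"/>"
        + "<field name=\"content\" family=\"p\"/>"
        + "</class></gora-orm>";
    // Fields listed here would need either an error, a warning or a default qualifier.
    System.out.println(fieldsMissingQualifier(xml));
  }
}
```

Whether a missing qualifier should then be rejected, logged, or defaulted (for example to the field name) is exactly the open question in this issue; the sketch only makes the condition detectable.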
[jira] [Commented] (GORA-203) Bug in setting column field attribute qualifier in CassandraMapping
[ https://issues.apache.org/jira/browse/GORA-203?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13569368#comment-13569368 ] Lewis John McGibbney commented on GORA-203: --- OK Kaz, thanks. What is your opinion about the status quo, which is that we are making qualifier attributes mandatory in Cassandra column mappings? I do not think they are mandatory attributes within the Cassandra data model, so personally I do not think it is appropriate for us to enforce them within Gora. Bug in setting column field attribute qualifier in CassandraMapping -- Key: GORA-203 URL: https://issues.apache.org/jira/browse/GORA-203 Project: Apache Gora Issue Type: Bug Components: storage-cassandra Affects Versions: 0.2.1 Reporter: Lewis John McGibbney Fix For: 0.3
[jira] [Commented] (GORA-203) Bug in setting column field attribute qualifier in CassandraMapping
[ https://issues.apache.org/jira/browse/GORA-203?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13569369#comment-13569369 ] Lewis John McGibbney commented on GORA-203: --- I also think that we could do with much more verbose logging here, but dropped to DEBUG level. What do you think about this as well? Bug in setting column field attribute qualifier in CassandraMapping -- Key: GORA-203 URL: https://issues.apache.org/jira/browse/GORA-203 Project: Apache Gora Issue Type: Bug Components: storage-cassandra Affects Versions: 0.2.1 Reporter: Lewis John McGibbney Fix For: 0.3
[jira] [Commented] (GORA-121) Enhance CassandraMapping to support additional Column Definitions
[ https://issues.apache.org/jira/browse/GORA-121?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13569380#comment-13569380 ] Lewis John McGibbney commented on GORA-121: --- Can anyone see or suggest any additional column attributes which Cassandra currently supports? If not then we can close this issue as Won't Fix, as it seems to have been addressed. Enhance CassandraMapping to support additional Column Definitions -- Key: GORA-121 URL: https://issues.apache.org/jira/browse/GORA-121 Project: Apache Gora Issue Type: New Feature Components: storage-cassandra Affects Versions: 0.2 Reporter: Lewis John McGibbney Assignee: Lewis John McGibbney Fix For: 0.4 There are 2 parts to this issue. 1) CassandraMapping#loadConfiguration currently loads definitions for keyspaces, column families and columns; however, the support for the latter is limited. The following is a mapping example. Say we have the keyspace mapping configuration:
<keyspace name="WebPage" cluster="Test Cluster" host="localhost">
  <family name="p"/>
  <family name="f"/>
  <family name="sc" type="super"/>
</keyspace>
and the column mapping configuration:
<class name="org.apache.gora.examples.generated.WebPage" keyClass="java.lang.String" keyspace="WebPage">
  <field name="url" family="p" path="c:u"/>
  <field name="content" family="p" path="p:cnt:c"/>
  <field name="parsedContent" family="p" path="p:parsedContent"/>
  <field name="outlinks" family="p" path="p:outlinks"/>
  <field name="metadata" family="p" path="c:mt"/>
</class>
Currently we don't support keyClass attributes or field path attributes. 2) Additionally, we declare private static final String COLUMN_ATTRIBUTE = "qualifier"; however, this resource is neither loaded nor requested at any stage during the process of ascertaining Cassandra mappings. This should also be supported; if not, then it should be removed.
[jira] [Resolved] (GORA-121) Enhance CassandraMapping to support additional Column Definitions
[ https://issues.apache.org/jira/browse/GORA-121?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lewis John McGibbney resolved GORA-121. --- Resolution: Won't Fix This has either been fixed elsewhere or is now not relevant. Enhance CassandraMapping to support additional Column Definitions -- Key: GORA-121 URL: https://issues.apache.org/jira/browse/GORA-121 Project: Apache Gora Issue Type: New Feature Components: storage-cassandra Affects Versions: 0.2 Reporter: Lewis John McGibbney Assignee: Lewis John McGibbney Fix For: 0.4
[jira] [Commented] (GORA-201) Upgrade HBase API Usage in Gora
[ https://issues.apache.org/jira/browse/GORA-201?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13573148#comment-13573148 ] Lewis John McGibbney commented on GORA-201: --- One more reason for us to upgrade HBase API usage/dependency in Gora: http://www.mail-archive.com/user%40nutch.apache.org/msg08700.html Upgrade HBase API Usage in Gora --- Key: GORA-201 URL: https://issues.apache.org/jira/browse/GORA-201 Project: Apache Gora Issue Type: Bug Components: storage-hbase Affects Versions: 0.3 Reporter: Lewis John McGibbney Fix For: 0.4 We haven't touched the HBase versioning in a good while. When a new user heads over to the HBase site, they are directed to the 'stable' release, which is currently sitting at 0.94.4. I realise that we have (legacy) support for the 0.90.X branch of HBase, but from what I can see there is no current justification for this decision, and it is also not within any strategic short/medium/long term objectives of Gora. This issue should
* Enable us to discuss what HBase branch we wish to support moving forward
* Actually implement the upgrade which gathers most consensus.
Re: Making Sense of NoSQL
It might also be relevant to note that Dan is looking for good quality use cases for Big Data, as this is one of the aspects of the book. Lewis On Thu, Feb 7, 2013 at 12:40 PM, Lewis John Mcgibbney lewis.mcgibb...@gmail.com wrote: Hi, I recently spoke with Dan McCreary, the (co)author of the soon to be published Making Sense of NoSQL http://www.manning.com/mccreary/ Thought that it may be a link a few of us would be interested in :) Best Lewis -- *Lewis* -- *Lewis*
Re: dev Digest 8 Feb 2013 22:46:15 -0000 Issue 317
Hi Alfonso, On Fri, Feb 8, 2013 at 2:46 PM, dev-digest-h...@gora.apache.org wrote: Hi all, I updated GORA-174 issue info about HBase backend at [0]. Any thoughts? I think it is now better expressed. This is much clearer for me at least. We are always going to have certain problems (when developing Gora) when intricacies associated with (and which affect all) datastores are encountered. GORA-174 is a perfect example. There is no workaround and it is essential to have a thorough understanding of the problem at individual datastore level. Thanks for the documentation, it is really driving this issue forward! If no one thinks it is wrong, I will implement solution-1 and solution-2 (this may mean quite a lot of work, so do we maintain it? I vote yes). I think the proposed resolutions are certainly attractive and that we should progress on this basis. When we get to a 1.0 Gora release (please excuse my wishful long-term thinking) we can act on completely removing the deprecated methods from Gora. For the time being I see no problem with methods being deprecated in favour of more appropriate mechanisms for data persistence (and I would certainly back this with my +1). I've been talking this issue through with Renato offline and am glad to observe that the HBase and Cassandra stuff seems to be coming along nicely. Is anyone in a position to address this with Accumulo? What about DynamoDB? Do DataFile/AvroStore(s) support this in their current form? Thanks Lewis
[jira] [Created] (GORA-204) Don't store empty arrays in CassandraClient#addGenericArray() addStatefulHashMap()
Lewis John McGibbney created GORA-204: - Summary: Don't store empty arrays in CassandraClient#addGenericArray() addStatefulHashMap() Key: GORA-204 URL: https://issues.apache.org/jira/browse/GORA-204 Project: Apache Gora Issue Type: Improvement Components: avro, storage-cassandra Affects Versions: 0.2.1 Reporter: Lewis John McGibbney Priority: Minor Fix For: 0.4 We have two TODOs in this issue. Namely
{code}
// TODO: hack, do not store empty arrays
if (itemValue instanceof GenericArray<?>) {
  if (((GenericArray)itemValue).size() == 0) {
    continue;
  }
} else if (itemValue instanceof StatefulHashMap<?,?>) {
  if (((StatefulHashMap)itemValue).size() == 0) {
    continue;
  }
}
{code}
and
{code}
// TODO: hack, do not store empty arrays
Object mapValue = map.get(mapKey);
if (mapValue instanceof GenericArray<?>) {
  if (((GenericArray)mapValue).size() == 0) {
    continue;
  }
} else if (mapValue instanceof StatefulHashMap<?,?>) {
  if (((StatefulHashMap)mapValue).size() == 0) {
    continue;
  }
}
{code}
in addGenericArray and addStatefulHashMap respectively.
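One way to retire the repeated hack would be a single shared emptiness check. The sketch below is only an illustration: java.util.Collection and java.util.Map stand in for GenericArray and StatefulHashMap so that it runs without Avro or Gora on the classpath, and the class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper: Collection and Map stand in for GenericArray and StatefulHashMap.
class EmptyContainers {

  /** True when the value is an array-like or map-like container holding no entries. */
  static boolean isEmptyContainer(Object value) {
    if (value instanceof Collection<?>) {
      return ((Collection<?>) value).isEmpty();
    }
    if (value instanceof Map<?, ?>) {
      return ((Map<?, ?>) value).isEmpty();
    }
    return false; // scalars and null are not containers
  }

  /** Keeps only the values worth writing, mirroring the 'continue' in the TODO blocks. */
  static List<Object> valuesToStore(List<Object> values) {
    List<Object> kept = new ArrayList<>();
    for (Object value : values) {
      if (isEmptyContainer(value)) {
        continue; // skip empty arrays and maps
      }
      kept.add(value);
    }
    return kept;
  }

  public static void main(String[] args) {
    List<Object> values = new ArrayList<>();
    values.add("text");
    values.add(new ArrayList<String>());
    values.add(new HashMap<String, String>());
    // Only the non-empty value survives the filter.
    System.out.println(valuesToStore(values).size());
  }
}
```

With a helper of this shape, each call site would reduce to one condition, and the decision about whether empty containers should be stored at all would live in one place.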
[jira] [Resolved] (GORA-169) Implement correct logging for KeySpaces and attributes in CassandraMappingManager
[ https://issues.apache.org/jira/browse/GORA-169?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lewis John McGibbney resolved GORA-169. --- Resolution: Fixed Fix Version/s: (was: 0.4) 0.3 Assignee: Lewis John McGibbney Committed @revision 135 in trunk Implement correct logging for KeySpaces and attributes in CassandraMappingManager - Key: GORA-169 URL: https://issues.apache.org/jira/browse/GORA-169 Project: Apache Gora Issue Type: Improvement Components: storage-cassandra Affects Versions: 0.2.1 Reporter: Lewis John McGibbney Assignee: Lewis John McGibbney Fix For: 0.3 Attachments: GORA-169.patch Currently the logging in CassandraMappingManager#loadConfiguration() fails to pick up a wealth of information from the keyspace definitions. An example is below: {code} 2012-09-20 23:47:05,469 INFO store.CassandraMappingManager - Located Cassandra Keyspace: 'keyspace' 2012-09-20 23:47:05,476 INFO store.CassandraMappingManager - Located Cassandra Keyspace name: 'name' 2012-09-20 23:47:05,476 INFO store.CassandraMappingManager - Located Cassandra Mapping: 'class' 2012-09-20 23:47:05,476 INFO store.CassandraMappingManager - Located Cassandra Mapping class name: 'name' {code} As the logging incorrectly uses the jdom methods, keyspace names and additional logging is incorrect and not nearly enough of what should be present. It should be changed to reflect below: {code} 2012-09-20 23:47:05,476 INFO store.CassandraMappingManager - Located Cassandra Keyspace name: '$nameOfKeySpace' 2012-09-20 23:47:05,476 INFO store.CassandraMappingManager - Located Cassandra Mapping for class: '$nameOfMappingClass' ... etc {code} right now this is very misleading and needs to be sorted out with much more verbose logging for keyspace attribute recognition. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
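The symptom in the log sample above (printing 'name' instead of the keyspace's actual name) is what one gets when an attribute's own name is logged rather than its value. The sketch below reproduces that with the JDK's DOM API; whether the original JDOM calls failed in exactly this way is an assumption, and the class and method names are hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;

// Hypothetical illustration, not Gora code: attribute name versus attribute value in log output.
class KeyspaceLogging {

  /** Parses a small XML snippet and returns its root element. */
  static Element parseRoot(String xml) {
    try {
      return DocumentBuilderFactory.newInstance().newDocumentBuilder()
          .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)))
          .getDocumentElement();
    } catch (Exception e) {
      throw new IllegalArgumentException("Invalid mapping XML", e);
    }
  }

  /** Reproduces the misleading line: it prints the attribute's name, which is always "name". */
  static String misleadingMessage(Element keyspace) {
    return "Located Cassandra Keyspace name: '"
        + keyspace.getAttributeNode("name").getName() + "'";
  }

  /** The intended line: it prints the value configured for the name attribute. */
  static String intendedMessage(Element keyspace) {
    return "Located Cassandra Keyspace name: '" + keyspace.getAttribute("name") + "'";
  }

  public static void main(String[] args) {
    Element keyspace = parseRoot("<keyspace name=\"WebPage\" cluster=\"Test Cluster\"/>");
    System.out.println(misleadingMessage(keyspace));
    System.out.println(intendedMessage(keyspace));
  }
}
```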
[jira] [Created] (GORA-205) Dedup CassandraMapping and CassandraMappingManager
Lewis John McGibbney created GORA-205: - Summary: Dedup CassandraMapping and CassandraMappingManager Key: GORA-205 URL: https://issues.apache.org/jira/browse/GORA-205 Project: Apache Gora Issue Type: Improvement Components: storage-cassandra Affects Versions: 0.2.1 Reporter: Lewis John McGibbney Priority: Minor Fix For: 0.4 We have a pile of what looks like duplication between these two classes. We should determine what is required and then document it within the appropriate class. This will enable easy navigation of keyspace definitions etc. from within gora-cassandra.
[jira] [Commented] (GORA-203) Bug in setting column field attribute qualifier in CassandraMapping
[ https://issues.apache.org/jira/browse/GORA-203?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13575268#comment-13575268 ] Lewis John McGibbney commented on GORA-203: --- Hi Kaz, I just committed GORA-169. From what is left, please commit your fix when you have time. Thank you so much. Bug in setting column field attribute qualifier in CassandraMapping -- Key: GORA-203 URL: https://issues.apache.org/jira/browse/GORA-203 Project: Apache Gora Issue Type: Bug Components: storage-cassandra Affects Versions: 0.2.1 Reporter: Lewis John McGibbney Fix For: 0.3 Attachments: GORA-203.patch
[jira] [Updated] (GORA-204) Don't store empty arrays in CassandraClient#addGenericArray(), addStatefulHashMap() and CassandraStore#addOrUpdateField()
[ https://issues.apache.org/jira/browse/GORA-204?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lewis John McGibbney updated GORA-204: -- Summary: Don't store empty arrays in CassandraClient#addGenericArray(), addStatefulHashMap() and CassandraStore#addOrUpdateField() (was: Don't store empty arrays in CassandraClient#addGenericArray() addStatefulHashMap()) Don't store empty arrays in CassandraClient#addGenericArray(), addStatefulHashMap() and CassandraStore#addOrUpdateField() - Key: GORA-204 URL: https://issues.apache.org/jira/browse/GORA-204 Project: Apache Gora Issue Type: Improvement Components: avro, storage-cassandra Affects Versions: 0.2.1 Reporter: Lewis John McGibbney Priority: Minor Fix For: 0.4
[jira] [Updated] (GORA-204) Don't store empty arrays in CassandraClient#addGenericArray(), addStatefulHashMap() and CassandraStore#addOrUpdateField()
[ https://issues.apache.org/jira/browse/GORA-204?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lewis John McGibbney updated GORA-204: -- Description: We have three TODOs in this issue. Namely
{code}
// TODO: hack, do not store empty arrays
if (itemValue instanceof GenericArray<?>) {
  if (((GenericArray)itemValue).size() == 0) {
    continue;
  }
} else if (itemValue instanceof StatefulHashMap<?,?>) {
  if (((StatefulHashMap)itemValue).size() == 0) {
    continue;
  }
}
{code}
{code}
// TODO: hack, do not store empty arrays
Object mapValue = map.get(mapKey);
if (mapValue instanceof GenericArray<?>) {
  if (((GenericArray)mapValue).size() == 0) {
    continue;
  }
} else if (mapValue instanceof StatefulHashMap<?,?>) {
  if (((StatefulHashMap)mapValue).size() == 0) {
    continue;
  }
}
{code}
and
{code}
case RECORD:
  if (value != null) {
    if (value instanceof PersistentBase) {
      PersistentBase persistentBase = (PersistentBase) value;
      for (Field member: schema.getFields()) {
        // TODO: hack, do not store empty arrays
        Object memberValue = persistentBase.get(member.pos());
        if (memberValue instanceof GenericArray<?>) {
          if (((GenericArray)memberValue).size() == 0) {
            continue;
          }
        } else if (memberValue instanceof StatefulHashMap<?,?>) {
          if (((StatefulHashMap)memberValue).size() == 0) {
            continue;
          }
        }
        this.cassandraClient.addSubColumn(key, field.name(), member.name(), memberValue);
      }
    } else {
      LOG.info("Record not supported: " + value.toString());
    }
  }
  break;
{code}
in addGenericArray and addStatefulHashMap in CassandraClient and CassandraStore#addOrUpdateField respectively. was: We have two TODO's in this issue. Namely
{code}
// TODO: hack, do not store empty arrays
if (itemValue instanceof GenericArray<?>) {
  if (((GenericArray)itemValue).size() == 0) {
    continue;
  }
} else if (itemValue instanceof StatefulHashMap<?,?>) {
  if (((StatefulHashMap)itemValue).size() == 0) {
    continue;
  }
}
{code}
and
{code}
// TODO: hack, do not store empty arrays
Object mapValue = map.get(mapKey);
if (mapValue instanceof GenericArray<?>) {
  if (((GenericArray)mapValue).size() == 0) {
    continue;
  }
} else if (mapValue instanceof StatefulHashMap<?,?>) {
  if (((StatefulHashMap)mapValue).size() == 0) {
    continue;
  }
}
{code}
in addGenericArray and addStatefulHashMap respectively. Don't store empty arrays in CassandraClient#addGenericArray(), addStatefulHashMap() and CassandraStore#addOrUpdateField() - Key: GORA-204 URL: https://issues.apache.org/jira/browse/GORA-204 Project: Apache Gora Issue Type: Improvement Components: avro, storage-cassandra Affects Versions: 0.2.1 Reporter: Lewis John McGibbney Priority: Minor Fix For: 0.4
[jira] [Updated] (GORA-167) Make Cassandra keyspace consistency configurable within gora.properties
[ https://issues.apache.org/jira/browse/GORA-167?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lewis John McGibbney updated GORA-167: -- Fix Version/s: (was: 0.4) 0.3 Make Cassandra keyspace consistency configurable within gora.properties --- Key: GORA-167 URL: https://issues.apache.org/jira/browse/GORA-167 Project: Apache Gora Issue Type: Improvement Components: storage-cassandra Affects Versions: 0.2.1 Reporter: Lewis John McGibbney Assignee: Lewis John McGibbney Priority: Minor Fix For: 0.3 Currently in CassandraClient#checkKeyspace() consistency is hard coded such that the consistency level is ONE, which waits until one replica has responded. This could be improved to enable users to specify other consistency profiles, e.g.
ANY: Wait until some replica has responded.
ONE: Wait until one replica has responded.
TWO: Wait until two replicas have responded.
THREE: Wait until three replicas have responded.
LOCAL_QUORUM: Wait for quorum on the datacenter the connection was established.
EACH_QUORUM: Wait for quorum on each datacenter.
QUORUM: Wait for a quorum of replicas (no matter which datacenter).
ALL: Blocks for all the replicas before returning to the client.
Configuration should be made available through gora.properties
[jira] [Updated] (GORA-167) Make Cassandra keyspace consistency configurable within gora.properties
[ https://issues.apache.org/jira/browse/GORA-167?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lewis John McGibbney updated GORA-167: -- Attachment: GORA-167.patch Patch for trunk. Can someone please check the Properties param for the checkKeyspace() method, which now accepts the consistency level property from gora.properties? Is specifying this solely in gora.properties the best way to go, or should this be configurable programmatically as well? Thanks for any feedback. Let's put this one to bed. Make Cassandra keyspace consistency configurable within gora.properties --- Key: GORA-167 URL: https://issues.apache.org/jira/browse/GORA-167 Project: Apache Gora Issue Type: Improvement Components: storage-cassandra Affects Versions: 0.2.1 Reporter: Lewis John McGibbney Assignee: Lewis John McGibbney Priority: Minor Fix For: 0.3 Attachments: GORA-167.patch
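A minimal sketch of resolving a consistency level from a Properties object, with ONE as the fallback to match the currently hard-coded behaviour. The property key and the local enum are assumptions made for illustration; the key used in GORA-167.patch may differ, and Hector ships its own enum for these levels.

```java
import java.util.Locale;
import java.util.Properties;

// Hypothetical sketch, not the code in GORA-167.patch.
class ConsistencyConfig {

  // Mirrors the levels listed in the issue description.
  enum Level { ANY, ONE, TWO, THREE, LOCAL_QUORUM, EACH_QUORUM, QUORUM, ALL }

  // Assumed property key; the real name in gora.properties may differ.
  static final String KEY = "gora.cassandrastore.consistency.level";

  /** Resolves the configured level, defaulting to ONE when the property is absent or blank. */
  static Level fromProperties(Properties props) {
    String raw = props.getProperty(KEY);
    if (raw == null || raw.trim().isEmpty()) {
      return Level.ONE; // matches the currently hard-coded behaviour
    }
    try {
      return Level.valueOf(raw.trim().toUpperCase(Locale.ROOT));
    } catch (IllegalArgumentException e) {
      throw new IllegalArgumentException(
          "Unknown consistency level '" + raw + "' for property " + KEY, e);
    }
  }

  public static void main(String[] args) {
    Properties props = new Properties();
    props.setProperty(KEY, "local_quorum");
    System.out.println(fromProperties(props));
  }
}
```

Because the method takes a Properties argument, a caller can build that object in code instead of loading gora.properties, which would cover the programmatic case raised in the comment without a second configuration path.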
[jira] [Updated] (GORA-190) Add version switch to bin/gora script
[ https://issues.apache.org/jira/browse/GORA-190?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lewis John McGibbney updated GORA-190: -- Fix Version/s: (was: 0.4) 0.3 Add version switch to bin/gora script Key: GORA-190 URL: https://issues.apache.org/jira/browse/GORA-190 Project: Apache Gora Issue Type: Improvement Reporter: Lewis John McGibbney Assignee: Lewis John McGibbney Priority: Minor Fix For: 0.3 Attachments: GORA-190.patch This should act as a sure means of ensuring that Gora is properly installed in the target operating system. I have never used Gora on anything other than Ubuntu, so this will help us in the future to identify interoperability with other OS. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (GORA-190) Add version switch to bin/gora script
[ https://issues.apache.org/jira/browse/GORA-190?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lewis John McGibbney updated GORA-190: -- Attachment: GORA-190.patch Add version switch to bin/gora script Key: GORA-190 URL: https://issues.apache.org/jira/browse/GORA-190 Project: Apache Gora Issue Type: Improvement Reporter: Lewis John McGibbney Assignee: Lewis John McGibbney Priority: Minor Fix For: 0.3 Attachments: GORA-190.patch This should act as a sure means of ensuring that Gora is properly installed in the target operating system. I have never used Gora on anything other than Ubuntu, so this will help us in the future to identify interoperability with other OS. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (GORA-201) Upgrade HBase API Usage in Gora
[ https://issues.apache.org/jira/browse/GORA-201?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13575476#comment-13575476 ] Lewis John McGibbney commented on GORA-201: --- The HBase dependency in Gora seems to have stagnated somewhat. I am not keeping up with HBase but do know that we use an old API. Ed's Avro patch is another kettle of post-0.3 fish, but it is indeed linked to this. Upgrade HBase API Usage in Gora --- Key: GORA-201 URL: https://issues.apache.org/jira/browse/GORA-201 Project: Apache Gora Issue Type: Bug Components: storage-hbase Affects Versions: 0.3 Reporter: Lewis John McGibbney Fix For: 0.4
[jira] [Created] (GORA-208) Implement consistent use of DataStoreFactory across Gora modules
Lewis John McGibbney created GORA-208: - Summary: Implement consistent use of DataStoreFactory across Gora modules Key: GORA-208 URL: https://issues.apache.org/jira/browse/GORA-208 Project: Apache Gora Issue Type: Bug Components: gora-core, storage-accumulo, storage-cassandra, storage-dynamodb, storage-hbase Affects Versions: 0.2.1 Reporter: Lewis John McGibbney Fix For: 0.4 Currently, usage of DataStoreFactory (for initializing datastores, mappings and datastore configuration properties) is inconsistent across datastore modules. If we are to lower the barrier to datastore contributions and implementations then we need to make the approach consistent. This should also be documented thoroughly as it is a key part of the Gora architecture.
Re: Support for NoSQL databases
Hi Apostolis, On Fri, Feb 15, 2013 at 7:10 PM, dev-digest-h...@gora.apache.org wrote: Hello, Could you please provide me a list of all the NoSQL databases that Gora supports at the moment We currently support Apache Accumulo, Avro, Cassandra and HBase. We also have a WebServices API and support Amazon's DynamoDB. and what NoSQL databases are planned to be supported in the near future? We have a number of tickets open for planned implementations. I've separated them into patches available and no patches available.
Patches available:
Solr 4.X - https://issues.apache.org/jira/browse/GORA-9
MongoDB - https://issues.apache.org/jira/browse/GORA-199
Ehcache - https://issues.apache.org/jira/browse/GORA-13
JDBM2 - https://issues.apache.org/jira/browse/GORA-14
No patch:
File-based store - https://issues.apache.org/jira/browse/GORA-8
Also, do you have an estimate on how long would it take for someone to develop a Gora module to support a new NoSQL database? A good benchmark was last year's Google Summer of Code project. Writing a new compiler, restructuring the core Gora API, adding a WebServices API and writing the gora-dynamodb module were all achieved within the project. I do not, however, have a definitive duration for this. I suppose it really depends on what you want to do and how much time you are prepared to allocate to the task. Take into consideration that a lot of your 'thinking' can be done out loud on the developer or user list. We would welcome such dialogue. The reason for asking is because I am interested in implementing such a module myself as a final year MSc project. Sounds excellent. You've certainly come to the right place. If you are serious about engaging in some work within Gora then please tell us more and we can begin to plan ahead. Best Lewis -- *Lewis*
Re: Updated GORA-174 HBase information - unions
Hi Renato On Fri, Feb 15, 2013 at 7:10 PM, dev-digest-h...@gora.apache.org wrote: This is a part I am not understanding very well. You guys are saying that legacy data is a problem, but why is this a problem if we haven't been supporting Avro Union in the past? This is a new feature, not an upgrade. And from what I understand, the second issue was about marking the support for Union data types as deprecated. But then again, if we are able to support Union data types, this would be the first time. Am I understanding things correctly here? Lewis? Alfonso? Anyone else? If we have previously defined the JSON Avro schemas without unions (which is current practice), then new schemas supporting Avro unions will not be compatible with the legacy data. This is the problem, right? Ok, I see. But what about unions with more than one type? Shouldn't we think about solving this once and for all? We also have to keep in mind that the same solution might not be applicable to all data stores, but we should be able to provide the same features across all the supported data stores. This is very well put. It is clear that the implementations will differ considerably. We are moving in the right direction for Cassandra and HBase solutions, but currently lack Accumulo. Please see my other most recent thread on GORA-174. Thanks troops. Have a great weekend. Lewis
[jira] [Created] (GORA-209) Specify query timeout for Hector usage in gora-cassandra
Lewis John McGibbney created GORA-209: - Summary: Specify query timeout for Hector usage in gora-cassandra Key: GORA-209 URL: https://issues.apache.org/jira/browse/GORA-209 Project: Apache Gora Issue Type: Bug Components: storage-cassandra Affects Versions: 0.2.1 Reporter: Lewis John McGibbney Priority: Minor Fix For: 0.4 There is an interesting discussion going on over at the Hector dev list regarding improving Hector to support timeouts for queries running for over X seconds. https://groups.google.com/forum/?fromgroups=#!topic/hector-dev/9a0-u9oXjk4 Once something results from this, we should improve gora-cassandra to also leverage query timeouts.
[jira] [Updated] (GORA-210) thread safety: java.util.ConcurrentModificationException
[ https://issues.apache.org/jira/browse/GORA-210?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lewis John McGibbney updated GORA-210: -- Fix Version/s: 0.3 thread safety: java.util.ConcurrentModificationException Key: GORA-210 URL: https://issues.apache.org/jira/browse/GORA-210 Project: Apache Gora Issue Type: Bug Components: storage-cassandra Affects Versions: 0.2 Environment: nutch 2.1 / cassandra 1.2.1 / gora-cassandra 0.2 / gora-core 0.2.1 running fetch with parse=true fetcher.threads.per.queue1 Reporter: Roland Priority: Critical Labels: patch Fix For: 0.3 Attachments: GORA-210.patch This is the result of debugging one of my issues described in NUTCH-1534. I think there is a wrong assumption about the thread safety of LinkedHashMap; it is not enough to not iterate over the buffer (which is a LinkedHashMap). My patch fixes this error for me:
java.util.ConcurrentModificationException
at java.util.LinkedHashMap$LinkedHashIterator.nextEntry(LinkedHashMap.java:394)
at java.util.LinkedHashMap$KeyIterator.next(LinkedHashMap.java:405)
at java.util.AbstractCollection.toArray(AbstractCollection.java:141)
at org.apache.gora.cassandra.store.CassandraStore.flush(CassandraStore.java:200)
at org.apache.gora.mapreduce.GoraRecordWriter.write(GoraRecordWriter.java:65)
at org.apache.hadoop.mapred.ReduceTask$NewTrackingRecordWriter.write(ReduceTask.java:587)
at org.apache.hadoop.mapreduce.TaskInputOutputContext.write(TaskInputOutputContext.java:80)
at org.apache.nutch.fetcher.FetcherReducer$FetcherThread.output(FetcherReducer.java:664)
at org.apache.nutch.fetcher.FetcherReducer$FetcherThread.run(FetcherReducer.java:534)
It may not be perfect from a performance point of view...
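For readers following the GORA-210 report, here is a minimal sketch of the general remedy for this class of error. It is not the attached GORA-210.patch and not the real CassandraStore code; `GuardedBuffer`, `drain` and `GuardedBufferDemo` are hypothetical names made up for illustration. The trace shows `toArray` iterating the key set internally, so a concurrent `put` from a fetcher thread can invalidate that iterator. The sketch assumes the fix is to take one lock for every access and to hand the flush a private snapshot.

```java
import java.util.AbstractMap;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for a store's write buffer (not Gora code).
final class GuardedBuffer<K, V> {
    private final Map<K, V> buffer = new LinkedHashMap<>();

    // Writers take the same lock as the flush path.
    void put(K key, V value) {
        synchronized (buffer) {
            buffer.put(key, value);
        }
    }

    // Copy all entries and clear the buffer in one critical section, so the
    // caller can process the snapshot while other threads keep writing.
    List<Map.Entry<K, V>> drain() {
        synchronized (buffer) {
            List<Map.Entry<K, V>> snapshot = new ArrayList<>(buffer.size());
            for (Map.Entry<K, V> e : buffer.entrySet()) {
                snapshot.add(new AbstractMap.SimpleImmutableEntry<>(e));
            }
            buffer.clear();
            return snapshot;
        }
    }
}

final class GuardedBufferDemo {
    // Starts several writer threads with unique keys and drains concurrently.
    // Returns the total number of entries seen by all drain() calls.
    static int runWriters(int threads, int perThread) {
        GuardedBuffer<String, Integer> buf = new GuardedBuffer<>();
        List<Thread> workers = new ArrayList<>();
        for (int t = 0; t < threads; t++) {
            final int id = t;
            Thread w = new Thread(() -> {
                for (int i = 0; i < perThread; i++) {
                    buf.put(id + "-" + i, i);
                }
            });
            workers.add(w);
            w.start();
        }
        int drained = 0;
        boolean running = true;
        while (running) {
            drained += buf.drain().size();
            running = false;
            for (Thread w : workers) {
                if (w.isAlive()) {
                    running = true;
                }
            }
        }
        try {
            for (Thread w : workers) {
                w.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        drained += buf.drain().size(); // pick up anything written after the last loop pass
        return drained;
    }
}
```

As the reporter notes, a single lock may not be ideal for performance. The copy inside `drain` keeps the critical section short, and the slower work of writing to the backing store would happen outside the lock.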
[jira] [Commented] (GORA-210) thread safety: java.util.ConcurrentModificationException
[ https://issues.apache.org/jira/browse/GORA-210?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13589816#comment-13589816 ] Lewis John McGibbney commented on GORA-210: --- How can I reproduce the exception you guys are talking about? Can we test for it... easily? thread safety: java.util.ConcurrentModificationException Key: GORA-210 URL: https://issues.apache.org/jira/browse/GORA-210 Project: Apache Gora Issue Type: Bug Components: storage-cassandra Affects Versions: 0.2 Environment: nutch 2.1 / cassandra 1.2.1 / gora-cassandra 0.2 / gora-core 0.2.1 running fetch with parse=true fetcher.threads.per.queue1 Reporter: Roland Priority: Critical Labels: patch Fix For: 0.3 Attachments: GORA-210.patch This is the result of debugging one of my issues described in NUTCH-1534. I think there is a wrong assumption about the thread safety of LinkedHashMap; it is not enough to not iterate over the buffer (which is a LinkedHashMap). My patch fixes this error for me:
java.util.ConcurrentModificationException
at java.util.LinkedHashMap$LinkedHashIterator.nextEntry(LinkedHashMap.java:394)
at java.util.LinkedHashMap$KeyIterator.next(LinkedHashMap.java:405)
at java.util.AbstractCollection.toArray(AbstractCollection.java:141)
at org.apache.gora.cassandra.store.CassandraStore.flush(CassandraStore.java:200)
at org.apache.gora.mapreduce.GoraRecordWriter.write(GoraRecordWriter.java:65)
at org.apache.hadoop.mapred.ReduceTask$NewTrackingRecordWriter.write(ReduceTask.java:587)
at org.apache.hadoop.mapreduce.TaskInputOutputContext.write(TaskInputOutputContext.java:80)
at org.apache.nutch.fetcher.FetcherReducer$FetcherThread.output(FetcherReducer.java:664)
at org.apache.nutch.fetcher.FetcherReducer$FetcherThread.run(FetcherReducer.java:534)
It may not be perfect from a performance point of view...
[jira] [Commented] (GORA-206) Verify storage and retrieval of Avro null-single-type Union data type within Gora-Cassandra
[ https://issues.apache.org/jira/browse/GORA-206?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13590087#comment-13590087 ] Lewis John McGibbney commented on GORA-206: --- This is a large patch. Can you please describe it to us, Renato? It is difficult for me personally to digest. Verify storage and retrieval of Avro null-single-type Union data type within Gora-Cassandra --- Key: GORA-206 URL: https://issues.apache.org/jira/browse/GORA-206 Project: Apache Gora Issue Type: Sub-task Components: storage-cassandra Affects Versions: 0.3 Reporter: Renato Javier Marroquín Mogrovejo Assignee: Renato Javier Marroquín Mogrovejo Labels: gora-cassandra, gora-core Fix For: 0.3 Attachments: GORA-206.v1.patch The necessary features should be added to confirm that we are able to support Avro Union data types. This refers specifically to null-single-type unions. We will open another issue to address the multi-type unions.
[jira] [Commented] (GORA-206) Verify storage and retrieval of Avro null-single-type Union data type within Gora-Cassandra
[ https://issues.apache.org/jira/browse/GORA-206?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13591556#comment-13591556 ] Lewis John McGibbney commented on GORA-206: --- OK, so first things first: what issues do we have with GORA-174? If you mention that the patch here includes all of the stuff Alfonso implemented in GORA-174 then I will apply your patch to trunk and run the tests. Let's keep the conversation either here or else on GORA-174 as it is becoming difficult to track now. Is there anything on the mailing list (regarding conversation which has not been tied off) which you want to clarify or iron out? Thanks for the work Renato. Great help that you guys are pushing this on. Verify storage and retrieval of Avro null-single-type Union data type within Gora-Cassandra --- Key: GORA-206 URL: https://issues.apache.org/jira/browse/GORA-206 Project: Apache Gora Issue Type: Sub-task Components: storage-cassandra Affects Versions: 0.3 Reporter: Renato Javier Marroquín Mogrovejo Assignee: Renato Javier Marroquín Mogrovejo Labels: gora-cassandra, gora-core Fix For: 0.3 Attachments: GORA-206.v1.patch The necessary features should be added to confirm that we are able to support Avro Union data types. This refers specifically to null-single-type unions. We will open another issue to address the multi-type unions.
[jira] [Commented] (GORA-206) Verify storage and retrieval of Avro null-single-type Union data type within Gora-Cassandra
[ https://issues.apache.org/jira/browse/GORA-206?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13591600#comment-13591600 ] Lewis John McGibbney commented on GORA-206: --- I've still not tracked this one down but I am getting closer. When we use a patched version of GoraCompiler to compile the latest webpage.json schema available on NUTCH-1477 (which for the record IS syntactically fine) we get the following generated into the WebPage.java class. {code} public static final Schema _SCHEMA = Schema.parse({\type\:\record\,\name\:\WebPage\,\namespace\:\org.apache.nutch.storage\,\fields\:[{\name\:\baseurl\,\type\:[\null\,\string\]}},{\name\:\status\,\type\:\int\},{\name\:\fetchtime\,\type\:\long\},{\name\:\prevfetchtime\,\type\:\long\},{\name\:\fetchinterval\,\type\:\int\},{\name\:\retriessincefetch\,\type\:\int\},{\name\:\modifiedtime\,\type\:\long\},{\name\:\protocolstatus\,\type\:[\null\,\protocolstatus\]}},{\name\:\content\,\type\:[\null\,\bytes\]}},{\name\:\contenttype\,\type\:[\null\,\string\]}},{\name\:\prevsignature\,\type\:[\null\,\bytes\]}},{\name\:\signature\,\type\:[\null\,\bytes\]}},{\name\:\title\,\type\:[\null\,\string\]}},{\name\:\text\,\type\:[\null\,\string\]}},{\name\:\parsestatus\,\type\:[\null\,\parsestatus\]}},{\name\:\score\,\type\:\float\},{\name\:\reprurl\,\type\:[\null\,\string\]}},{\name\:\headers\,\type\:\map\},{\name\:\outlinks\,\type\:\map\},{\name\:\inlinks\,\type\:\map\},{\name\:\markers\,\type\:\map\},{\name\:\metadata\,\type\:\map\}]}); {code} This does not look good when I do some simple bracket matching: every field with a union type is closed with one curly brace too many. I think we've introduced a bug in GoraCompiler which needs to be ironed out.
Verify storage and retrieval of Avro null-single-type Union data type within Gora-Cassandra --- Key: GORA-206 URL: https://issues.apache.org/jira/browse/GORA-206 Project: Apache Gora Issue Type: Sub-task Components: storage-cassandra Affects Versions: 0.3 Reporter: Renato Javier Marroquín Mogrovejo Assignee: Renato Javier Marroquín Mogrovejo Labels: gora-cassandra, gora-core Fix For: 0.3 Attachments: GORA-206.v1.patch, GORA-206.v2.patch The necessary features should be added to confirm that we are able to support Avro Union data types. This refers specifically to null-single-type unions. We will open another issue to address the multi-type unions.
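The "simple bracket matching" mentioned in the comment above can be automated. The following is a rough sketch only: `BracketCheck` and `firstMismatch` are hypothetical names, not part of Gora or Avro, and the checker assumes the input is a JSON string with its double quotes intact (the archived copy of the generated schema has lost them). It walks the text with a stack and reports the position of the first closing bracket that has no matching opener.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical helper for illustration; not part of GoraCompiler.
final class BracketCheck {

    // Returns -1 if all {} and [] are balanced, the index of the first
    // offending closing bracket otherwise, or text.length() if the text ends
    // with brackets still open. Brackets inside JSON strings are ignored.
    static int firstMismatch(String text) {
        Deque<Character> open = new ArrayDeque<>();
        boolean inString = false;
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (inString) {
                if (c == '\\') {
                    i++; // skip the escaped character
                } else if (c == '"') {
                    inString = false;
                }
                continue;
            }
            if (c == '"') {
                inString = true;
            } else if (c == '{' || c == '[') {
                open.push(c);
            } else if (c == '}' || c == ']') {
                char expected = (c == '}') ? '{' : '[';
                if (open.isEmpty() || open.pop() != expected) {
                    return i;
                }
            }
        }
        return open.isEmpty() ? -1 : text.length();
    }
}
```

Run against a single field written the way the generated class writes union fields, for example `{"name":"baseurl","type":["null","string"]}}`, the function points at the trailing extra `}`. A check like this could sit in a unit test over the compiler output, although parsing the generated schema with Avro itself would be the stricter test.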
[jira] [Assigned] (GORA-191) Add a constructor to GoraCompiler so it can be used outside of Gora.
[ https://issues.apache.org/jira/browse/GORA-191?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lewis John McGibbney reassigned GORA-191: - Assignee: Apostolos Giannakidis Done. Thanks for any contributions. Add a constructor to GoraCompiler so it can be used outside of Gora. Key: GORA-191 URL: https://issues.apache.org/jira/browse/GORA-191 Project: Apache Gora Issue Type: Improvement Components: gora-core, schema Reporter: Lewis John McGibbney Assignee: Apostolos Giannakidis Priority: Critical Fix For: 0.4 We need to automate the compiling of various .avsc files over in Nutch. We should add a constructor to GoraCompiler so it can be used more widely. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (GORA-206) Verify storage and retrieval of Avro null-single-type Union data type within Gora-Cassandra
[ https://issues.apache.org/jira/browse/GORA-206?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13591901#comment-13591901 ] Lewis John McGibbney commented on GORA-206: --- Looking at this, this morning I see that we've introduced a number of bugs such as
* fields for embedded records are now not handled correctly in the new GoraCompiler
* fields containing array and map values are now not handled correctly in GoraCompiler
I'll work on these two issues and hopefully attach a working v3 patch. Verify storage and retrieval of Avro null-single-type Union data type within Gora-Cassandra --- Key: GORA-206 URL: https://issues.apache.org/jira/browse/GORA-206 Project: Apache Gora Issue Type: Sub-task Components: storage-cassandra Affects Versions: 0.3 Reporter: Renato Javier Marroquín Mogrovejo Assignee: Renato Javier Marroquín Mogrovejo Labels: gora-cassandra, gora-core Fix For: 0.3 Attachments: GORA-206.v1.patch, GORA-206.v2.patch The necessary features should be added to confirm that we are able to support Avro Union data types. This refers specifically to null-single-type unions. We will open another issue to address the multi-type unions.
[jira] [Created] (GORA-213) Move out StringUtil-capable methods from GoraCompiler to StringUtils
Lewis John McGibbney created GORA-213: - Summary: Move out StringUtil-capable methods from GoraCompiler to StringUtils Key: GORA-213 URL: https://issues.apache.org/jira/browse/GORA-213 Project: Apache Gora Issue Type: Bug Components: documentation, gora-core Affects Versions: 0.2.1 Reporter: Lewis John McGibbney Priority: Trivial Fix For: 0.4 This is a rather trivial affair, but concerns an attempt to clean up GoraCompiler. I know I for one have struggled in the past to get to grips with GoraCompiler, and honestly, additional class-specific string-utility-like methods really do not help in the slightest.
[jira] [Comment Edited] (GORA-206) Verify storage and retrieval of Avro null-single-type Union data type within Gora-Cassandra
[ https://issues.apache.org/jira/browse/GORA-206?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13591901#comment-13591901 ] Lewis John McGibbney edited comment on GORA-206 at 3/4/13 12:37 AM: Looking at this, this morning I see that we've introduced a number of bugs such as * fields for embedded records are now not handled correctly in new GoraCompiler e.g {code} {name: protocolStatus, type: [null, { name: ProtocolStatus, type: record, namespace: org.apache.nutch.storage, fields: [ {name: code, type: int}, {name: args, type: {type: array, items: string}}, {name: lastModified, type: long} ] }]} {code} is simply compiled down into {code} {\name\:\protocolStatus\,\type\:[\null\,\protocolstatus\]}, {code} * additionally fields containing array and map values are now not handled correctly in GoraCompiler e.g. {code} {name: headers, type: {type: map, values: string}} {code} is incorrectly compiled down into {code} {\name\:\metadata\,\type\:\map\} {code} We need to sort these two cases at a minimum. These are blockers. I'll work on these two issues and hopefully attach a working v3 patch. was (Author: lewismc): Looking at this, this morning I see that we've introduced a number of bugs such as * fields for embedded records are now not handled correctly in new GoraComiler * fields containing array and map values are now not handled correctly in GoraCompiler I'll work on these two issues and hopefully attach a working v3 patch. 
Verify storage and retrieval of Avro null-single-type Union data type within Gora-Cassandra --- Key: GORA-206 URL: https://issues.apache.org/jira/browse/GORA-206 Project: Apache Gora Issue Type: Sub-task Components: storage-cassandra Affects Versions: 0.3 Reporter: Renato Javier Marroquín Mogrovejo Assignee: Renato Javier Marroquín Mogrovejo Labels: gora-cassandra, gora-core Fix For: 0.3 Attachments: GORA-206.v1.patch, GORA-206.v2.patch The necessary features should be added to confirm that we are able to support Avro Union data types. This refers specifically to null-single-type unions. We will open another issue to address the multi-type unions.
[jira] [Updated] (GORA-211) thread safety: java.lang.NullPointerException
[ https://issues.apache.org/jira/browse/GORA-211?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lewis John McGibbney updated GORA-211: -- Assignee: Roland thread safety: java.lang.NullPointerException - Key: GORA-211 URL: https://issues.apache.org/jira/browse/GORA-211 Project: Apache Gora Issue Type: Bug Components: storage-cassandra Affects Versions: 0.2 Environment: nutch 2.1 / cassandra 1.2.1 / gora-cassandra 0.2 / gora-core 0.2.1 running fetch with parse=true fetcher.threads.per.queue=2 nutch on a 16 core AMD Opteron 2GHz Cassandra on 8 core Intel Xeon 3.3 GHz Reporter: Roland Assignee: Roland Priority: Critical Attachments: GORA-211-0.2.patch, GORA-211-trunk.patch, GORA-211-trunk-v2.patch This is the result of debugging one of my issues described in NUTCH-1534. Example trace:
java.lang.NullPointerException
at me.prettyprint.cassandra.model.MutatorImpl.execute(MutatorImpl.java:243)
at me.prettyprint.cassandra.model.MutatorImpl.insert(MutatorImpl.java:71)
at org.apache.gora.cassandra.store.CassandraClient.addColumn(CassandraClient.java:139)
at org.apache.gora.cassandra.store.CassandraStore.addOrUpdateField(CassandraStore.java:307)
at org.apache.gora.cassandra.store.CassandraStore.flush(CassandraStore.java:212)
at org.apache.gora.mapreduce.GoraRecordWriter.write(GoraRecordWriter.java:65)
at org.apache.hadoop.mapred.ReduceTask$NewTrackingRecordWriter.write(ReduceTask.java:587)
at org.apache.hadoop.mapreduce.TaskInputOutputContext.write(TaskInputOutputContext.java:80)
at org.apache.nutch.fetcher.FetcherReducer$FetcherThread.output(FetcherReducer.java:664)
at org.apache.nutch.fetcher.FetcherReducer$FetcherThread.run(FetcherReducer.java:534)
I suspect that CassandraStore.put() is not taking enough precautions to copy all objects safely to its buffer.
{code}
switch(type) {
  case RECORD:
    Persistent persistent = (Persistent) fieldValue;
    Persistent newRecord = persistent.newInstance(new StateManagerImpl());
    for (Field member: fieldSchema.getFields()) {
      newRecord.put(member.pos(), persistent.get(member.pos()));
    }
    fieldValue = newRecord;
    break;
  case MAP:
    StatefulHashMap<?, ?> map = (StatefulHashMap<?, ?>) fieldValue;
    StatefulHashMap<?, ?> newMap = new StatefulHashMap(map);
    fieldValue = newMap;
    break;
}
{code}
case RECORD - do we not need to duplicate the object returned by persistent.get(member.pos()) in newRecord.put(member.pos(), persistent.get(member.pos()))?
case MAP - do we not need to duplicate all value objects of the map?
I have not had time to write a patch or test this, so, please comment :)
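The two questions at the end of the GORA-211 report come down to shallow versus deep copying. The sketch below illustrates the difference with plain JDK collections; it does not use Gora's Persistent or StatefulHashMap types, and `CopyDepthDemo` with its two methods is a hypothetical example, not the proposed patch. It assumes map values that are themselves mutable (lists here), which is the situation in which a copy of only the outer container still shares state with the caller.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Illustration only; plain collections stand in for Gora's stateful types.
final class CopyDepthDemo {

    // Shallow copy: a new map object whose values are the same list objects
    // as in the source. Later changes to those lists are visible in the copy.
    static Map<String, List<String>> shallowCopy(Map<String, List<String>> src) {
        return new LinkedHashMap<>(src);
    }

    // Deep copy (one level): a new map whose values are fresh lists, so the
    // copy is isolated from later changes made through the source.
    static Map<String, List<String>> deepCopy(Map<String, List<String>> src) {
        Map<String, List<String>> copy = new LinkedHashMap<>();
        for (Map.Entry<String, List<String>> e : src.entrySet()) {
            copy.put(e.getKey(), new ArrayList<>(e.getValue()));
        }
        return copy;
    }
}
```

If a caller keeps mutating a value object after handing it to the store, only the deep copy leaves the buffered data unchanged. Whether the values stored through Gora are mutable in this way, and therefore need duplicating, is exactly what the report asks the developers to confirm.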
[jira] [Updated] (GORA-210) thread safety: java.util.ConcurrentModificationException
[ https://issues.apache.org/jira/browse/GORA-210?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lewis John McGibbney updated GORA-210: -- Assignee: Roland thread safety: java.util.ConcurrentModificationException Key: GORA-210 URL: https://issues.apache.org/jira/browse/GORA-210 Project: Apache Gora Issue Type: Bug Components: storage-cassandra Affects Versions: 0.2 Environment: nutch 2.1 / cassandra 1.2.1 / gora-cassandra 0.2 / gora-core 0.2.1 running fetch with parse=true fetcher.threads.per.queue=2 / about 1 Exception per 100k URLs fetched nutch on a 16 core AMD Opteron 2GHz. Cassandra on 8 core Intel Xeon 3.3 GHz Reporter: Roland Assignee: Roland Priority: Critical Labels: patch Fix For: 0.3 Attachments: GORA-210.patch, GORA-210-trunk.patch This is the result of debugging one of my issues described in NUTCH-1534. I think there is a wrong assumption about the thread safety of LinkedHashMap; it is not enough to not iterate over the buffer (which is a LinkedHashMap). My patch fixes this error for me:
java.util.ConcurrentModificationException
at java.util.LinkedHashMap$LinkedHashIterator.nextEntry(LinkedHashMap.java:394)
at java.util.LinkedHashMap$KeyIterator.next(LinkedHashMap.java:405)
at java.util.AbstractCollection.toArray(AbstractCollection.java:141)
at org.apache.gora.cassandra.store.CassandraStore.flush(CassandraStore.java:200)
at org.apache.gora.mapreduce.GoraRecordWriter.write(GoraRecordWriter.java:65)
at org.apache.hadoop.mapred.ReduceTask$NewTrackingRecordWriter.write(ReduceTask.java:587)
at org.apache.hadoop.mapreduce.TaskInputOutputContext.write(TaskInputOutputContext.java:80)
at org.apache.nutch.fetcher.FetcherReducer$FetcherThread.output(FetcherReducer.java:664)
at org.apache.nutch.fetcher.FetcherReducer$FetcherThread.run(FetcherReducer.java:534)
It may not be perfect from a performance point of view...