[jira] [Commented] (GORA-184) Gora with Hadoop 1.0.3 + Hbase 0.92.0 + Avro 1.5.3

2012-11-16 Thread Lewis John McGibbney (JIRA)

[ 
https://issues.apache.org/jira/browse/GORA-184?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13498842#comment-13498842
 ] 

Lewis John McGibbney commented on GORA-184:
---

Hi Alfonso, this sounds like a reasonable feature request. However, there are 
some issues as well, so I propose the following:

- Hadoop 1.0.3/4: the upgrade should be addressed... a patch would be very 
welcome :)
- Avro 1.5.3: much more recent versions of the library are available... the aim 
would be to address Ed's work over in GORA-94
- HBase 0.92.0: currently the Gora community is working on and supporting HBase 
version 0.90.X... AFAIK this is going to continue unless someone proposes a 
justification for a strategic switch...

 Gora with Hadoop 1.0.3 + Hbase 0.92.0 + Avro 1.5.3
 --

 Key: GORA-184
 URL: https://issues.apache.org/jira/browse/GORA-184
 Project: Apache Gora
  Issue Type: Improvement
Affects Versions: 0.2.1
Reporter: Alfonso Nishikawa

 I have seen the upgrade to Hadoop 1.0.1 [#GORA-76], but I ask for Hadoop 
 1.0.3 because it is the specific version I use, although Hadoop 1.0.4 was 
 released recently.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (GORA-182) Nutch 2.1 does not work with gora-cassandra 0.2.1

2012-11-16 Thread Lewis John McGibbney (JIRA)

[ 
https://issues.apache.org/jira/browse/GORA-182?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13498930#comment-13498930
 ] 

Lewis John McGibbney commented on GORA-182:
---

Hi Kaz, I checked out gora-core and gora-cassandra 0.2.1, built the modules 
locally then manually copied them over to my Nutch installation. Upon injecting 
URLs into Cassandra, I get the following.

{code}
me.prettyprint.hector.api.exceptions.HInvalidRequestException: InvalidRequestException(why:(String didn't validate.) [webpage][f][ts] failed validation)
    at me.prettyprint.cassandra.service.ExceptionsTranslatorImpl.translate(ExceptionsTranslatorImpl.java:52)
    at me.prettyprint.cassandra.service.KeyspaceServiceImpl$1.execute(KeyspaceServiceImpl.java:97)
    at me.prettyprint.cassandra.service.KeyspaceServiceImpl$1.execute(KeyspaceServiceImpl.java:90)
    at me.prettyprint.cassandra.service.Operation.executeAndSetResult(Operation.java:101)
    at me.prettyprint.cassandra.connection.HConnectionManager.operateWithFailover(HConnectionManager.java:233)
    at me.prettyprint.cassandra.service.KeyspaceServiceImpl.operateWithFailover(KeyspaceServiceImpl.java:131)
    at me.prettyprint.cassandra.service.KeyspaceServiceImpl.batchMutate(KeyspaceServiceImpl.java:102)
    at me.prettyprint.cassandra.service.KeyspaceServiceImpl.batchMutate(KeyspaceServiceImpl.java:108)
    at me.prettyprint.cassandra.model.MutatorImpl$3.doInKeyspace(MutatorImpl.java:248)
    at me.prettyprint.cassandra.model.MutatorImpl$3.doInKeyspace(MutatorImpl.java:245)
    at me.prettyprint.cassandra.model.KeyspaceOperationCallback.doInKeyspaceAndMeasure(KeyspaceOperationCallback.java:20)
    at me.prettyprint.cassandra.model.ExecutingKeyspace.doExecute(ExecutingKeyspace.java:85)
    at me.prettyprint.cassandra.model.MutatorImpl.execute(MutatorImpl.java:245)
    at me.prettyprint.cassandra.model.MutatorImpl.insert(MutatorImpl.java:71)
    at org.apache.gora.cassandra.store.HectorUtils.insertColumn(HectorUtils.java:47)
    at org.apache.gora.cassandra.store.CassandraClient.addColumn(CassandraClient.java:169)
    at org.apache.gora.cassandra.store.CassandraStore.addOrUpdateField(CassandraStore.java:347)
    at org.apache.gora.cassandra.store.CassandraStore.flush(CassandraStore.java:228)
    at org.apache.gora.cassandra.store.CassandraStore.close(CassandraStore.java:95)
    at org.apache.gora.mapreduce.GoraRecordWriter.close(GoraRecordWriter.java:55)
    at org.apache.hadoop.mapred.MapTask$NewDirectOutputCollector.close(MapTask.java:651)
    at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:766)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:370)
    at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:212)
Caused by: InvalidRequestException(why:(String didn't validate.) [webpage][f][ts] failed validation)
    at org.apache.cassandra.thrift.Cassandra$batch_mutate_result.read(Cassandra.java:19479)
    at org.apache.cassandra.thrift.Cassandra$Client.recv_batch_mutate(Cassandra.java:1035)
    at org.apache.cassandra.thrift.Cassandra$Client.batch_mutate(Cassandra.java:1009)
    at me.prettyprint.cassandra.service.KeyspaceServiceImpl$1.execute(KeyspaceServiceImpl.java:95)
    ... 22 more
{code} 

The offending fetchTime field in the Nutch WebPage class [0], which is mapped in 
gora-cassandra-mapping.xml, is of the long data type. Initially I thought to add 
appropriate methods using Hector's LongSerializer for the creation and insertion 
of columnNames in o.a.g.c.store.HectorUtils; however, once I repackage and 
attempt to inject, I get the above trace again.
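
One possible reading of the trace, sketched with the JDK only (no Hector or Gora 
classes): if the [webpage][f][ts] column is validated as UTF-8 text on the 
Cassandra side, which is only an assumption based on the "String didn't validate" 
message, then the raw 8 bytes of a long can be rejected whatever serializer 
produced them. LongAsUtf8, toBytes and isValidUtf8 are hypothetical names used 
for illustration, and the big-endian layout is assumed.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

// Hypothetical helper for illustration; not part of Gora, Hector or Cassandra.
class LongAsUtf8 {

  // Big-endian 8-byte encoding of a long (assumed layout of a serialized long).
  static byte[] toBytes(long value) {
    return ByteBuffer.allocate(Long.BYTES).putLong(value).array();
  }

  // Strict UTF-8 check: malformed input is reported instead of being replaced.
  static boolean isValidUtf8(byte[] bytes) {
    try {
      StandardCharsets.UTF_8.newDecoder()
          .onMalformedInput(CodingErrorAction.REPORT)
          .onUnmappableCharacter(CodingErrorAction.REPORT)
          .decode(ByteBuffer.wrap(bytes));
      return true;
    } catch (CharacterCodingException e) {
      return false;
    }
  }

  public static void main(String[] args) {
    long fetchTime = System.currentTimeMillis();
    // Raw long bytes versus the same value written as decimal text.
    System.out.println("raw long valid as UTF-8? " + isValidUtf8(toBytes(fetchTime)));
    System.out.println("decimal text valid?      "
        + isValidUtf8(Long.toString(fetchTime).getBytes(StandardCharsets.UTF_8)));
  }
}
```

If that assumption holds, the validator configured for the column would be worth 
checking alongside the serializer used on the Gora side.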

Any ideas off the top of your head Kaz? Did you test this with Nutch 2.x head 
or 2.1? 

[0] 
http://svn.apache.org/repos/asf/nutch/branches/2.x/src/java/org/apache/nutch/storage/WebPage.java

 Nutch 2.1 does not work with gora-cassandra 0.2.1
 -

 Key: GORA-182
 URL: https://issues.apache.org/jira/browse/GORA-182
 Project: Apache Gora
  Issue Type: Bug
  Components: storage-cassandra
Affects Versions: 0.2.1
Reporter: Kazuomi Kashii
 Attachments: GORA-182.patch


 Nutch 2.1 does not work with gora-cassandra 0.2.1.
 In particular, the outlinks field is not written.
 I have confirmed this issue on Mac OS X and CentOS.



Re: Build failed in Jenkins: goraamazon_branch #136

2012-11-16 Thread Lewis John Mcgibbney
Hi Henry,

I've been trying to narrow this down. It's to do with deploying the
snapshots to repository.apache.org.

Basically this is what is happening.

Currently the gora-core-0.3-SNAPSHOT-tests.jar [0] and accompanying
signatures are all dated Tue Jul 24 05:48:55 UTC 2012, i.e. they have
not been updated. As the WebServiceTestBase (I think) class is not
included in this older SNAPSHOT, the build fails. There is no *problem*
with the code; this is purely a Maven logistical pain in the neck.
I've been trying to resolve it by deploying a non-unique tests snapshot
for gora-core, but I think there is a bug in the deploy plugin, as I
am only able to deploy unique snapshots...

I'll keep working on this, Henry.

Lewis

[0] 
https://repository.apache.org/content/repositories/snapshots/org/apache/gora/gora-core/0.3-SNAPSHOT/

On Fri, Nov 16, 2012 at 4:59 PM, Henry Saputra henry.sapu...@gmail.com wrote:
 Hi Guys,

 I am still seeing the cannot find symbol error in the gora-dynamo module
 when building from trunk.

 Is there a bug to trace this issue?

 - Henry


 On Wed, Oct 31, 2012 at 10:37 AM, Lewis John Mcgibbney 
 lewis.mcgibb...@gmail.com wrote:

 Hi Renato,

 It's OK. I can confirm that things work fine when I run locally as well.
 The problem relates to the restructuring of the core modules and the
 fact that the goraamazon build pulls the 0.3-SNAPSHOT dependency
 (generated from trunk) which doesn't contain the new restructuring.

 You can see these errors here


 https://builds.apache.org/view/G-L/view/Gora/job/goraamazon_branch/136/org.apache.gora$gora-dynamodb/console

 Please check them out and we can either discuss here or offline to
 confirm that they would be resolved once the changes are ported to
 trunk.

 On Wed, Oct 31, 2012 at 5:27 PM, Renato Marroquín Mogrovejo
 renatoj.marroq...@gmail.com wrote:
  Hi,
 
  I don't know what breaks things in here. I tested locally and it worked
 fine.
  Lewis I remember you talking about this a while ago, do you have any
  clue on this? Or a place where I could start digging? If anybody has
  an idea of where to start on digging please let me know.
  Thanks in advance!
 
 
  Renato M.
 
  2012/10/31 Apache Jenkins Server jenk...@builds.apache.org:
  See https://builds.apache.org/job/goraamazon_branch/136/changes
 
  Changes:
 
  [rmarroquin] Committing new patch for changes in the way exception were
 being handled.
 
  --
  [...truncated 19900 lines...]
  [INFO] Installing 
 https://builds.apache.org/job/goraamazon_branch/ws/branches/goraamazon/gora-cassandra/pom.xml
 to
 /home/jenkins/jenkins-slave/maven-repositories/0/org/apache/gora/gora-cassandra/0.3-SNAPSHOT/gora-cassandra-0.3-SNAPSHOT.pom
  mojoSucceeded
 org.apache.maven.plugins:maven-install-plugin:2.3.1(default-install)
  mojoStarted org.apache.felix:maven-bundle-plugin:2.3.7(default-install)
  [INFO]
  [INFO] --- maven-bundle-plugin:2.3.7:install (default-install) @
 gora-cassandra ---
  [INFO] Installing
 org/apache/gora/gora-cassandra/0.3-SNAPSHOT/gora-cassandra-0.3-SNAPSHOT.jar
  [INFO] Writing OBR metadata
  mojoSucceeded
 org.apache.felix:maven-bundle-plugin:2.3.7(default-install)
  projectSucceeded org.apache.gora:gora-cassandra:0.3-SNAPSHOT
  projectStarted org.apache.gora:gora-dynamodb:0.3-SNAPSHOT
  [INFO]
  [INFO]
 
  [INFO] Building Apache Gora :: Dynamodb 0.3-SNAPSHOT
  [INFO]
 
  [INFO] Source directory: 
 https://builds.apache.org/job/goraamazon_branch/ws/branches/goraamazon/gora-dynamodb/src/examples/java
 added.
  mojoStarted org.codehaus.mojo:build-helper-maven-plugin:1.7(default)
  [INFO]
  [INFO] --- build-helper-maven-plugin:1.7:add-source (default) @
 gora-dynamodb ---
  mojoSucceeded org.codehaus.mojo:build-helper-maven-plugin:1.7(default)
  mojoStarted
 org.apache.maven.plugins:maven-remote-resources-plugin:1.2.1(default)
  [INFO]
  [INFO] --- maven-remote-resources-plugin:1.2.1:process (default) @
 gora-dynamodb ---
  mojoSucceeded
 org.apache.maven.plugins:maven-remote-resources-plugin:1.2.1(default)
  [debug] execute contextualize
  mojoStarted
 org.apache.maven.plugins:maven-resources-plugin:2.5(default-resources)[INFO]
 Using 'UTF-8' encoding to copy filtered resources.
  [INFO] skip non existing resourceDirectory 
 https://builds.apache.org/job/goraamazon_branch/ws/branches/goraamazon/gora-dynamodb/src/main/resources
 
  [INFO] Copying 0 resource
  [INFO] Copying 3 resources
 
  [INFO]
  [INFO] --- maven-resources-plugin:2.5:resources (default-resources) @
 gora-dynamodb ---
  mojoSucceeded
 org.apache.maven.plugins:maven-resources-plugin:2.5(default-resources)
  mojoStarted
 org.apache.maven.plugins:maven-compiler-plugin:2.3.2(default-compile)
  [INFO]
  [INFO] --- maven-compiler-plugin:2.3.2:compile (default-compile) @
 gora-dynamodb ---
  [INFO] Compiling 7 source files to 
 https

Re: Build failed in Jenkins: goraamazon_branch #136

2012-11-16 Thread Lewis John Mcgibbney
This is also an excellent reminder to complete the wiki entry on this
very subject... nice one Henry :0)


[jira] [Updated] (GORA-186) Show better errors when a field is missing in HBase mapping

2012-11-20 Thread Lewis John McGibbney (JIRA)

 [ 
https://issues.apache.org/jira/browse/GORA-186?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lewis John McGibbney updated GORA-186:
--

Fix Version/s: 0.3

 Show better errors when a field is missing in HBase mapping
 ---

 Key: GORA-186
 URL: https://issues.apache.org/jira/browse/GORA-186
 Project: Apache Gora
  Issue Type: Improvement
  Components: storage-hbase
Affects Versions: 0.2, 0.2.1
 Environment: Ubuntu 12.04, avro 1.3.2, hbase 0.92.0, gora 0.2.1
Reporter: Alfonso Nishikawa
Assignee: Alfonso Nishikawa
Priority: Trivial
 Fix For: 0.3

 Attachments: GORA-186.patch, GORA-186v2.patch


 When a field is mistyped or missing in gora-hbase-mapping.xml, a 
 NullPointerException is raised at org.apache.gora.hbase.store.HBaseStore:235.
 Just check for this case so the error says which field is missing/wrong.
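
 A minimal sketch of the kind of check being asked for, assuming the store 
 looks up a column per field name. MappingLookup, addField and columnFor are 
 hypothetical names, and this is not the actual HBaseStore code.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for a field-to-column lookup; not the real HBaseStore.
class MappingLookup {
  private final Map<String, String> fieldToColumn = new HashMap<>();

  void addField(String fieldName, String column) {
    fieldToColumn.put(fieldName, column);
  }

  // Fails early with a message naming the field, instead of letting a null
  // column surface later as a bare NullPointerException.
  String columnFor(String fieldName) {
    String column = fieldToColumn.get(fieldName);
    if (column == null) {
      throw new IllegalStateException("Field '" + fieldName
          + "' has no entry in gora-hbase-mapping.xml; "
          + "check for a missing or misspelled <field> element.");
    }
    return column;
  }
}
```

 The point is only that the exception message carries the field name; the 
 exception type and wording are a matter of taste.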



[jira] [Commented] (GORA-183) dataStore.put() -org.apache.gora.hbase.util.HBaseInterface#toBytes(). Unknown type: UNION

2012-11-20 Thread Lewis John McGibbney (JIRA)

[ 
https://issues.apache.org/jira/browse/GORA-183?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13501227#comment-13501227
 ] 

Lewis John McGibbney commented on GORA-183:
---

{bq}Uhm noticed now that this issue is related to HBase. Does anyone know 
if it affects Cassandra, SQL,... too? {bq}

Honest answer is no. I will try to write the schema and mapping implementations 
for gora-cassandra and get back to you here.

 dataStore.put() -org.apache.gora.hbase.util.HBaseInterface#toBytes(). 
 Unknown type: UNION
 --

 Key: GORA-183
 URL: https://issues.apache.org/jira/browse/GORA-183
 Project: Apache Gora
  Issue Type: Bug
  Components: storage-hbase
Affects Versions: 0.2, 0.2.1
 Environment: Ubuntu 12.04, HBase 0.92.0, Gora 0.2.1, Avro 1.3.2
Reporter: Alfonso Nishikawa
Assignee: Alfonso Nishikawa

 Summary: HBase does not handle the Avro UNION type (in the schema, e.g. 
 ["string","null"]).
 When trying to write a row I get the RuntimeException "Unknown type: UNION".
 My .avsc is the following:
 {code}
 {"name": "TestRow",
  "type": "record",
  "namespace": "es.foo.tests.storage",
  "fields": [
     {"name": "columnLong", "type": "long", "default": 0},
     {"name": "unionRecursive", "type": ["TestRow","null"]},
     {"name": "unionString", "type": ["string","null"]},
 
     {"name": "family2", "type": {"type": "map", "values": "string"}}
   ]
 }
 {code}
 My mapping is:
 {code}
 <?xml version="1.0" encoding="UTF-8"?>
 <gora-orm>
   <table name="test"> <!-- Family configuration -->
     <family name="family1" maxVersions="1" compression="SNAPPY" />
     <family name="family2" maxVersions="1" compression="SNAPPY" />
   </table>
   <class table="test" keyClass="java.lang.String" 
          name="es.foo.tests.storage.TestRow">
     <field name="unionString" family="family1" qualifier="unionString" />
     <field name="unionRecursive" family="family1" qualifier="unionRecursive" />
     <field name="columnLong" family="family1" qualifier="colInteger" />
     <field name="family2" family="family2" />
   </class>
 </gora-orm>
 {code}
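
 To make the request concrete, here is one way a ["string","null"] union could 
 be turned into bytes: a leading byte records which branch is present, followed 
 by the payload. This is a sketch under my own assumptions (NullableStringCodec 
 and the branch numbering are hypothetical), not how Gora or Avro actually 
 encode unions.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical codec for a ["string","null"] union; illustration only.
class NullableStringCodec {
  private static final byte STRING_BRANCH = 0;
  private static final byte NULL_BRANCH = 1;

  // First byte says which union branch is present; the rest is the UTF-8 payload.
  static byte[] toBytes(String value) {
    if (value == null) {
      return new byte[] {NULL_BRANCH};
    }
    byte[] payload = value.getBytes(StandardCharsets.UTF_8);
    byte[] out = new byte[payload.length + 1];
    out[0] = STRING_BRANCH;
    System.arraycopy(payload, 0, out, 1, payload.length);
    return out;
  }

  // Reverses toBytes; rejects empty input and unknown branch markers.
  static String fromBytes(byte[] bytes) {
    if (bytes.length == 0) {
      throw new IllegalArgumentException("Empty value: no union branch marker");
    }
    switch (bytes[0]) {
      case NULL_BRANCH:
        return null;
      case STRING_BRANCH:
        return new String(bytes, 1, bytes.length - 1, StandardCharsets.UTF_8);
      default:
        throw new IllegalArgumentException("Unknown union branch: " + bytes[0]);
    }
  }
}
```

 The branch marker is what lets a reader tell an empty string from null, which 
 a plain string encoding cannot do.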



[jira] [Commented] (GORA-174) GORA compiler does not handle [string, null] unions in the AVRO schema

2012-11-20 Thread Lewis John McGibbney (JIRA)

[ 
https://issues.apache.org/jira/browse/GORA-174?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13501229#comment-13501229
 ] 

Lewis John McGibbney commented on GORA-174:
---

Hi, I'll try the new patch both with gora trunk and with the Nutch 2.x 
InjectorJob.

 GORA compiler does not handle [string, null] unions in the AVRO schema
 --

 Key: GORA-174
 URL: https://issues.apache.org/jira/browse/GORA-174
 Project: Apache Gora
  Issue Type: Bug
  Components: schema
Affects Versions: 0.2.1
Reporter: Julien Nioche
Assignee: Alfonso Nishikawa
 Fix For: 0.3

 Attachments: GORA-174-test.patch, GORA-174v2.patch


 See NUTCH-1477 for a description. 
 We are getting an NPE when using the DataFileAvroStore. To avoid that, I 
 modified the schema to allow null values on some fields, e.g. 
 {"name": "baseUrl", "type": ["string", "null"]}.
 However, when generating the code for the schema, the accessors are not 
 generated by Gora, which prevents Nutch from compiling.



[jira] [Updated] (GORA-189) String parameters in generated Persistent subclasses by Compiler -not only Utf8-

2012-11-20 Thread Lewis John McGibbney (JIRA)

 [ 
https://issues.apache.org/jira/browse/GORA-189?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lewis John McGibbney updated GORA-189:
--

Fix Version/s: 0.3

 String parameters in generated Persistent subclasses by Compiler -not only 
 Utf8-
 

 Key: GORA-189
 URL: https://issues.apache.org/jira/browse/GORA-189
 Project: Apache Gora
  Issue Type: Improvement
  Components: gora-core
Affects Versions: 0.2.1
Reporter: Alfonso Nishikawa
Assignee: Alfonso Nishikawa
Priority: Trivial
 Fix For: 0.3

 Attachments: GORA-189-code.patch


 It would be very useful if the Gora compiler generated methods taking Strings as 
 parameters (and creating the Utf8 internally). Code that populates those 
 classes would be much clearer and simpler.
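
 A sketch of what such a generated overload might look like. WebPageSketch is a 
 hypothetical class, and the nested Utf8 is only a stand-in for 
 org.apache.avro.util.Utf8 so the example compiles without Avro on the 
 classpath.

```java
// Hypothetical generated-style class; illustration only.
class WebPageSketch {

  // Stand-in for org.apache.avro.util.Utf8 (assumption: a String-wrapping type).
  static final class Utf8 {
    private final String value;

    Utf8(String value) {
      this.value = value;
    }

    @Override
    public String toString() {
      return value;
    }
  }

  private Utf8 url;

  // Utf8-only setter: callers have to wrap the value themselves.
  void setUrl(Utf8 url) {
    this.url = url;
  }

  // Proposed convenience overload: wraps the String internally.
  void setUrl(String url) {
    setUrl(url == null ? null : new Utf8(url));
  }

  Utf8 getUrl() {
    return url;
  }
}
```

 With the overload, page.setUrl("http://bar.com/") replaces 
 page.setUrl(new Utf8("http://bar.com/")). One trade-off: a bare setUrl(null) 
 becomes ambiguous between the two overloads and needs a cast.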



[jira] [Updated] (GORA-182) Nutch 2.1 does not work with gora-cassandra 0.2.1

2012-11-21 Thread Lewis John McGibbney (JIRA)

 [ 
https://issues.apache.org/jira/browse/GORA-182?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lewis John McGibbney updated GORA-182:
--

Fix Version/s: 0.3

 Nutch 2.1 does not work with gora-cassandra 0.2.1
 -

 Key: GORA-182
 URL: https://issues.apache.org/jira/browse/GORA-182
 Project: Apache Gora
  Issue Type: Bug
  Components: storage-cassandra
Affects Versions: 0.2.1
Reporter: Kazuomi Kashii
 Fix For: 0.3

 Attachments: GORA-182.patch




[jira] [Updated] (GORA-184) Gora with Hadoop 1.0.3 + Hbase 0.92.0 + Avro 1.5.3

2012-11-21 Thread Lewis John McGibbney (JIRA)

 [ 
https://issues.apache.org/jira/browse/GORA-184?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lewis John McGibbney updated GORA-184:
--

Fix Version/s: 0.4

 Gora with Hadoop 1.0.3 + Hbase 0.92.0 + Avro 1.5.3
 --

 Key: GORA-184
 URL: https://issues.apache.org/jira/browse/GORA-184
 Project: Apache Gora
  Issue Type: Improvement
Affects Versions: 0.2.1
Reporter: Alfonso Nishikawa
 Fix For: 0.4




[jira] [Resolved] (GORA-176) GoraCI

2012-11-21 Thread Lewis John McGibbney (JIRA)

 [ 
https://issues.apache.org/jira/browse/GORA-176?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lewis John McGibbney resolved GORA-176.
---

Resolution: Duplicate

Closing as duplicate of GORA-73

 GoraCI
 --

 Key: GORA-176
 URL: https://issues.apache.org/jira/browse/GORA-176
 Project: Apache Gora
  Issue Type: Umbrella
  Components: testing
Affects Versions: 0.3
Reporter: Renato Javier Marroquín Mogrovejo





[jira] [Updated] (GORA-183) dataStore.put() -org.apache.gora.hbase.util.HBaseInterface#toBytes(). Unknown type: UNION

2012-11-21 Thread Lewis John McGibbney (JIRA)

 [ 
https://issues.apache.org/jira/browse/GORA-183?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lewis John McGibbney updated GORA-183:
--

Fix Version/s: 0.3

 dataStore.put() -org.apache.gora.hbase.util.HBaseInterface#toBytes(). 
 Unknown type: UNION
 --

 Key: GORA-183
 URL: https://issues.apache.org/jira/browse/GORA-183
 Project: Apache Gora
  Issue Type: Bug
  Components: storage-hbase
Affects Versions: 0.2, 0.2.1
 Environment: Ubuntu 12.04, HBase 0.92.0, Gora 0.2.1, Avro 1.3.2
Reporter: Alfonso Nishikawa
Assignee: Alfonso Nishikawa
 Fix For: 0.3




[jira] [Updated] (GORA-187) gora-hbase always writing column when dirty, even if value is default or null

2012-11-21 Thread Lewis John McGibbney (JIRA)

 [ 
https://issues.apache.org/jira/browse/GORA-187?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lewis John McGibbney updated GORA-187:
--

Fix Version/s: 0.3

 gora-hbase always writing column when dirty, even if value is default or null
 -

 Key: GORA-187
 URL: https://issues.apache.org/jira/browse/GORA-187
 Project: Apache Gora
  Issue Type: Improvement
  Components: storage-hbase
Affects Versions: 0.2.1
 Environment: Ubuntu 12.04, HBase 0.92.0
Reporter: Alfonso Nishikawa
Priority: Minor
 Fix For: 0.3


 When a field (tested with a 'long' with default '0') is not dirty at save 
 time, the column is not written. If it is set to 1 and then back to 0, saving 
 writes that default value.
 With strings, after fixing [GORA-183], I noticed that null values are also 
 written (whether they are the default or not).
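
 The behaviour asked for can be sketched as a single predicate. 
 DirtyFieldWriter and shouldWrite are hypothetical names, and the rule shown 
 (skip null and default values even when dirty) is my reading of the issue 
 title rather than an agreed design.

```java
import java.util.Objects;

// Hypothetical helper for illustration; not the gora-hbase implementation.
class DirtyFieldWriter {

  // Write only when the field was touched and its value is neither null
  // nor equal to the schema default.
  static boolean shouldWrite(boolean dirty, Object value, Object defaultValue) {
    if (!dirty) {
      return false;
    }
    if (value == null) {
      return false;
    }
    return !Objects.equals(value, defaultValue);
  }
}
```

 One caveat with this rule: if a non-default value was persisted earlier and 
 the field is then set back to the default, skipping the write leaves the old 
 value in the store, so a real implementation may need to delete the column 
 instead.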



[jira] [Updated] (GORA-188) testSerdeWebPage failure - PersistentBase#equals() fails with map fields

2012-11-21 Thread Lewis John McGibbney (JIRA)

 [ 
https://issues.apache.org/jira/browse/GORA-188?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lewis John McGibbney updated GORA-188:
--

Fix Version/s: 0.3

 testSerdeWebPage failure - PersistentBase#equals() fails with map fields
 

 Key: GORA-188
 URL: https://issues.apache.org/jira/browse/GORA-188
 Project: Apache Gora
  Issue Type: Bug
  Components: gora-core
Affects Versions: 0.2.1
Reporter: Alfonso Nishikawa
Priority: Minor
 Fix For: 0.3


 As shown here:
 {code}
 junit.framework.AssertionFailedError: 
 expected:org.apache.gora.examples.generated.WebPage@4b49ab6f {
   url:http://bar.com/;
   content:java.nio.HeapByteBuffer[pos=1 lim=1 cap=1]
   parsedContent:[1]
   outlinks:{http://bazbar.com=a8, http://baz.com/1.jspq=barbazp=foo=a6, 
 http://baz.com/1.jspq=barbaz=a5, http://bar.com/3.jsp=a3, 
 http://bar.com/1.html=a4, http://foo.com/1.html=a1, http://foo.com/2.html=a2, 
 http://baz.com/1.jspq=foo=a7};
   metadata:org.apache.gora.examples.generated.Metadata@51a {
   version:1
   data:{metakey=metavalue}
 }
 } but was:org.apache.gora.examples.generated.WebPage@4b6d94c0 {
   url:http://bar.com/;
   content:java.nio.HeapByteBuffer[pos=0 lim=1 cap=1]
   parsedContent:[1]
   outlinks:{http://baz.com/1.jspq=barbaz=a5, 
 http://baz.com/1.jspq=barbazp=foo=a6, http://bazbar.com=a8, 
 http://bar.com/3.jsp=a3, http://foo.com/1.html=a1, http://bar.com/1.html=a4, 
 http://foo.com/2.html=a2, http://baz.com/1.jspq=foo=a7};
   metadata:org.apache.gora.examples.generated.Metadata@51a {
   version:1
   data:{metakey=metavalue}
 }
 }
   at junit.framework.Assert.fail(Assert.java:50)
   at junit.framework.Assert.failNotEquals(Assert.java:287)
   at junit.framework.Assert.assertEquals(Assert.java:67)
   at junit.framework.Assert.assertEquals(Assert.java:74)
   at 
 org.apache.gora.util.TestIOUtils.testSerializeDeserialize(TestIOUtils.java:125)
   at 
 org.apache.gora.mapreduce.TestPersistentSerialization.testSerdeWebPage(TestPersistentSerialization.java:85)
 {code}
 the difference is the order of the outlinks. I guess they should be 
 considered equal. Am I wrong?
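
 For reference, a JDK-only check of the two kinds of difference visible in the 
 output above: entry order in a map, and the buffer position of content 
 (pos=1 versus pos=0). EqualityCheck is a hypothetical class; it uses plain 
 java.util maps, and whether Gora's own map type follows the same equals 
 contract is an assumption.

```java
import java.nio.ByteBuffer;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical demo class; standard library behaviour only.
class EqualityCheck {

  // Two maps with the same entries inserted in a different order.
  static boolean mapsEqualRegardlessOfOrder() {
    Map<String, String> a = new LinkedHashMap<>();
    a.put("http://foo.com/1.html", "a1");
    a.put("http://foo.com/2.html", "a2");
    Map<String, String> b = new LinkedHashMap<>();
    b.put("http://foo.com/2.html", "a2");
    b.put("http://foo.com/1.html", "a1");
    return a.equals(b);
  }

  // Two buffers over the same bytes, one of which has already been read.
  static boolean buffersEqualAtDifferentPositions() {
    ByteBuffer fresh = ByteBuffer.wrap(new byte[] {1});    // pos=0 lim=1
    ByteBuffer consumed = ByteBuffer.wrap(new byte[] {1});
    consumed.get();                                        // pos=1 lim=1
    return fresh.equals(consumed);
  }
}
```

 java.util.Map equality ignores entry order, while ByteBuffer.equals compares 
 only the remaining bytes, so the differing content positions may be worth 
 ruling out as well as the outlinks order.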



[jira] [Commented] (GORA-75) Improve documentation for DataStoreTestUtil

2012-11-22 Thread Lewis John McGibbney (JIRA)

[ 
https://issues.apache.org/jira/browse/GORA-75?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13502983#comment-13502983
 ] 

Lewis John McGibbney commented on GORA-75:
--

Does anyone have a perspective on this one? I suggest we close it unless folks 
want more method annotation.

 Improve documentation for DataStoreTestUtil
 ---

 Key: GORA-75
 URL: https://issues.apache.org/jira/browse/GORA-75
 Project: Apache Gora
  Issue Type: Improvement
  Components: documentation
Affects Versions: 0.1.1-incubating
Reporter: Lewis John McGibbney
 Fix For: 0.3


 As there are half a dozen or so tests within the above class, it is essential 
 that each is thoroughly documented so that there is no doubt over how data 
 stores treat various query and delete operations. To date it is causing some 
 confusion and is hindering development.



[jira] [Updated] (GORA-185) Remove ANT scripts and IVY confs

2012-11-22 Thread Lewis John McGibbney (JIRA)

 [ 
https://issues.apache.org/jira/browse/GORA-185?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lewis John McGibbney updated GORA-185:
--

Attachment: GORA-185.patch

patch for trunk

 Remove ANT scripts and IVY confs
 

 Key: GORA-185
 URL: https://issues.apache.org/jira/browse/GORA-185
 Project: Apache Gora
  Issue Type: Task
  Components: build process
Affects Versions: 0.2.1
Reporter: Julien Nioche
 Fix For: 0.3

 Attachments: GORA-185.patch


 There are currently build resources for ANT+IVY as well as Maven. According 
 to Lewis, only the latter is now used, in which case it would be better to 
 remove all the ANT+IVY stuff to avoid any confusion.



[jira] [Resolved] (GORA-75) Improve documentation for DataStoreTestUtil

2012-11-23 Thread Lewis John McGibbney (JIRA)

 [ 
https://issues.apache.org/jira/browse/GORA-75?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lewis John McGibbney resolved GORA-75.
--

Resolution: Fixed

This issue related mostly to my unfamiliarity with our testing suite. Hopefully 
now it is clear and can be communicated to others. Thanks Henry



Re: Test errors java.lang.NoSuchMethodError: org.apache.gora.store.DataStore.setConf

2012-11-29 Thread Lewis John Mcgibbney
Hi Henry & Others,

We are close to sorting this one out, I promise.

Basically, I pruned out all of the old SNAPSHOT jar and tests.jar
dependencies from repository.apache.org and adapted the Jenkins build
to clean and deploy only successful SNAPSHOTs after every CI build.
This has now resolved the problem we were having with the compilation
failures in the dynamodb module.

Now we are left with a scenario where many (74) tests fail with the
following Exception [0].

I looked at the change log for DataStoreTestUtil [1] and see that
we've missed the additional IOExceptions which were added in during
the dynamodb module development.

@Renato,
Is it possible for you to have a look at trunk, and see if the
removal/correct implementation of such Exceptions is necessary... if
so then where? This would be excellent and would also allow us to move
towards addressing the other Dynamodb issues currently open on Jira.

I think getting these tests passing is top priority just now, considering
the changes which have been made further to the integration of the
dynamodb module.

Thanks all,

Lewis

[0] 
https://builds.apache.org/job/gora-trunk/org.apache.gora$gora-cassandra/532/testReport/org.apache.gora.cassandra.store/TestCassandraStore/testNewInstance/
[1] 
http://svn.apache.org/viewvc/gora/trunk/gora-core/src/test/java/org/apache/gora/store/DataStoreTestUtil.java?r1=1363659&r2=1405417&diff_format=h



Re: Test errors java.lang.NoSuchMethodError: org.apache.gora.store.DataStore.setConf

2012-11-29 Thread Lewis John Mcgibbney
Hi,

It looks like the core DataStore class also needs tidying up a bit. Some
methods still throw IOExceptions, whereas others simply don't, as the
functionality has been moved further upstream. The Javadoc annotations
for each method also need to be right; this is essential as DataStore
is one of the key classes in Gora.
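
To make the tidy-up above a little more concrete, here is a rough sketch of
what "consistent exceptions plus accurate Javadoc" could look like. This is
not the real DataStore interface: KeyValueStore, StoreException and
InMemoryStore are hypothetical names invented for illustration, and the real
class has many more methods.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical unchecked exception, used here instead of a checked IOException. */
class StoreException extends RuntimeException {
  StoreException(String message) {
    super(message);
  }
}

/**
 * Hypothetical, heavily simplified stand-in for a data store interface.
 * Every method documents its failure mode in the same way.
 */
interface KeyValueStore<K, V> {
  /**
   * Returns the value stored under the key, or null if there is none.
   * @throws StoreException if the store has been closed
   */
  V get(K key);

  /**
   * Stores the value under the key, replacing any previous value.
   * @throws StoreException if the store has been closed
   */
  void put(K key, V value);

  /**
   * Deletes the entry stored under the key.
   * @return true if an entry was present and has been removed
   * @throws StoreException if the store has been closed
   */
  boolean delete(K key);

  /** Releases resources; later calls to the other methods throw StoreException. */
  void close();
}

/** In-memory implementation, present only to make the sketch runnable. */
class InMemoryStore<K, V> implements KeyValueStore<K, V> {
  private final Map<K, V> data = new HashMap<>();
  private boolean closed;

  private void ensureOpen() {
    if (closed) {
      throw new StoreException("store is closed");
    }
  }

  @Override
  public V get(K key) {
    ensureOpen();
    return data.get(key);
  }

  @Override
  public void put(K key, V value) {
    ensureOpen();
    data.put(key, value);
  }

  @Override
  public boolean delete(K key) {
    ensureOpen();
    boolean present = data.containsKey(key);
    data.remove(key);
    return present;
  }

  @Override
  public void close() {
    closed = true;
  }
}
```

Whether Gora should settle on checked or unchecked exceptions is a separate
decision; the point of the sketch is only that the contract is the same on
every method and is stated in the Javadoc.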

We will get to the bottom of it soon :0)

Best

Lewis




-- 
Lewis


[jira] [Created] (GORA-190) Add version switch to bin/gora script

2012-11-30 Thread Lewis John McGibbney (JIRA)
Lewis John McGibbney created GORA-190:
-

 Summary: Add version switch to bin/gora script 
 Key: GORA-190
 URL: https://issues.apache.org/jira/browse/GORA-190
 Project: Apache Gora
  Issue Type: Improvement
Reporter: Lewis John McGibbney
Assignee: Lewis John McGibbney
Priority: Minor
 Fix For: 0.3


This should act as a sure means of ensuring that Gora is properly installed in 
the target operating system. I have never used Gora on anything other than 
Ubuntu, so this will help us in the future to identify interoperability with 
other OS.
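
One possible shape for this, sketched in Java rather than in the shell script
itself: bin/gora could dispatch a "version" argument to a small class that
reads the Implementation-Version from the jar manifest. The class name
GoraVersionSketch and the "version" argument are assumptions made for
illustration; nothing here is existing Gora code.

```java
class GoraVersionSketch {
  /** Returns the Implementation-Version from the jar manifest, or "unknown". */
  static String versionOf(Class<?> anchor) {
    Package pkg = anchor.getPackage();
    String version = (pkg == null) ? null : pkg.getImplementationVersion();
    return (version == null) ? "unknown" : version;
  }

  /** Handles a hypothetical "version" argument passed on by the launcher script. */
  static String handle(String[] args) {
    if (args.length > 0 && "version".equals(args[0])) {
      return "Gora " + versionOf(GoraVersionSketch.class);
    }
    return "Usage: gora version";
  }

  public static void main(String[] args) {
    System.out.println(handle(args));
  }
}
```

When run from a plain class directory rather than a packaged jar there is no
manifest to read, so the sketch falls back to "unknown" instead of failing.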



[jira] [Created] (GORA-191) Add a constructor to GoraCompiler so it can be used outside of Gora.

2012-11-30 Thread Lewis John McGibbney (JIRA)
Lewis John McGibbney created GORA-191:
-

 Summary: Add a constructor to GoraCompiler so it can be used 
outside of Gora.
 Key: GORA-191
 URL: https://issues.apache.org/jira/browse/GORA-191
 Project: Apache Gora
  Issue Type: Improvement
  Components: gora-core, schema
Reporter: Lewis John McGibbney
Priority: Critical
 Fix For: 0.3


We need to automate the compiling of various .avsc files over in Nutch. We 
should add a constructor to GoraCompiler so it can be used more widely. 



[jira] [Commented] (GORA-174) GORA compiler does not handle [string, null] unions in the AVRO schema

2012-12-04 Thread Lewis John McGibbney (JIRA)

[ 
https://issues.apache.org/jira/browse/GORA-174?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13509812#comment-13509812
 ] 

Lewis John McGibbney commented on GORA-174:
---

Hi Alfonso, at the beginning of the weekend I spent time working between the 
patched GoraCompiler and Nutch 2.x. As Julien described in his initial problem 
description, when generating the code for the schema the accessors are not 
generated by GORA, which prevents Nutch from compiling. This still seems to 
be the case even when using the GORA-174v3.patch, so I am not sure that the 
patch properly fixes this issue. 

 GORA compiler does not handle [string, null] unions in the AVRO schema
 --

 Key: GORA-174
 URL: https://issues.apache.org/jira/browse/GORA-174
 Project: Apache Gora
  Issue Type: Bug
  Components: schema
Affects Versions: 0.2.1
Reporter: Julien Nioche
Assignee: Alfonso Nishikawa
 Fix For: 0.3

 Attachments: failed_tests_after_v3.tar.gz, GORA-174-test.patch, 
 GORA-174v2.patch, GORA-174v3.patch


 See NUTCH-1477 for description. 
 We are getting NPE when using the DataFileAvroStore, in order to avoid that I 
 modified the schema to allow for null values on some fields e.g. {"name": 
 "baseUrl", "type": ["string", "null"] }
 however when generating the code for the schema the accessors are not 
 generated by GORA which prevents Nutch from compiling 
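
 To illustrate the behaviour being asked for, here is a minimal sketch of how
 a code generator might treat a two-branch union with "null" as an optional
 field and still emit accessors for it. It does not use Avro or the real
 GoraCompiler; NullableUnionSketch, its type table and the emitted method
 shapes are all assumptions made for illustration.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class NullableUnionSketch {
  /** Tiny, incomplete mapping from schema type names to Java type names (assumed). */
  private static final Map<String, String> JAVA_TYPES = new HashMap<>();
  static {
    JAVA_TYPES.put("string", "Utf8");
    JAVA_TYPES.put("int", "Integer");
    JAVA_TYPES.put("long", "Long");
  }

  /**
   * Returns the non-null branch of a union such as ["string", "null"],
   * or null when the union is not a simple optional-value union.
   */
  static String nonNullBranch(List<String> branches) {
    if (branches.size() != 2) {
      return null;
    }
    String first = branches.get(0);
    String second = branches.get(1);
    if ("null".equals(first) && !"null".equals(second)) {
      return second;
    }
    if ("null".equals(second) && !"null".equals(first)) {
      return first;
    }
    return null;
  }

  /** Emits getter and setter source text for the field, or "" if unsupported. */
  static String accessorsFor(String field, List<String> union) {
    String branch = nonNullBranch(union);
    String javaType = (branch == null) ? null : JAVA_TYPES.get(branch);
    if (javaType == null) {
      return "";
    }
    String cap = Character.toUpperCase(field.charAt(0)) + field.substring(1);
    return "public " + javaType + " get" + cap + "() { return " + field + "; }\n"
        + "public void set" + cap + "(" + javaType + " value) { this." + field
        + " = value; }\n";
  }

  public static void main(String[] args) {
    System.out.print(accessorsFor("baseUrl", Arrays.asList("string", "null")));
  }
}
```

 Unions with more than one non-null branch are deliberately left unsupported
 in the sketch, since they need a real design decision about the Java type.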



Re: Build failed in Jenkins: Nutch-nutchgora #420

2012-12-04 Thread Lewis John Mcgibbney
Hi All,

On Sat, Dec 1, 2012 at 4:16 AM, Apache Jenkins Server
jenk...@builds.apache.org wrote:
 See https://builds.apache.org/job/Nutch-nutchgora/420/

 [junit] Running org.apache.nutch.crawl.TestGenerator
 [junit] Tests run: 4, Failures: 0, Errors: 1, Time elapsed: 63.594 sec
 [junit] Test org.apache.nutch.crawl.TestGenerator FAILED

 BUILD FAILED
 /zonestorage/hudson_solaris/home/hudson/hudson-slave/workspace/Nutch-nutchgora/2.x/build.xml:423:
  Tests failed!

 Total time: 6 minutes 18 seconds
 Build step 'Invoke Ant' marked build as failure
 Publishing Javadoc

Below are the relevant parts from the test report...

2012-12-01 04:12:20,948 WARN  mapred.FileOutputCommitter (FileOutputCommitter.java:cleanupJob(100)) - Output path is null in cleanup
2012-12-01 04:12:20,948 WARN  mapred.LocalJobRunner (LocalJobRunner.java:run(298)) - job_local_0003
java.lang.ClassCastException: org.apache.gora.mapreduce.GoraInputSplit cannot be cast to org.apache.hadoop.mapred.InputSplit
    at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:412)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:372)
    at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:212)
2012-12-01 04:12:21,941 INFO  mapred.JobClient (JobClient.java:monitorAndPrintJob(1301)) -  map 0% reduce 0%
2012-12-01 04:12:21,941 INFO  mapred.JobClient (JobClient.java:monitorAndPrintJob(1356)) - Job complete: job_local_0003

Testcase: testGenerateHostLimit took 8.548 sec
Caused an ERROR
job failed: name=generate: 1354335140-1018707298, jobid=job_local_0003
java.lang.RuntimeException: job failed: name=generate: 1354335140-1018707298, jobid=job_local_0003
    at org.apache.nutch.util.NutchJob.waitForCompletion(NutchJob.java:54)
    at org.apache.nutch.crawl.GeneratorJob.run(GeneratorJob.java:191)
    at org.apache.nutch.crawl.GeneratorJob.generate(GeneratorJob.java:213)
    at org.apache.nutch.crawl.TestGenerator.generateFetchlist(TestGenerator.java:258)
    at org.apache.nutch.crawl.TestGenerator.testGenerateHostLimit(TestGenerator.java:138)


I am not sure where in Nutch 2.x the GoraInputSplit is being
incorrectly cast, however I'll try to find it. Anyone have any ideas?
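
For anyone following along, the trace goes through MapTask.runOldMapper, and
Hadoop 1.x ships two unrelated split types with the same simple name: the
interface org.apache.hadoop.mapred.InputSplit and the abstract class
org.apache.hadoop.mapreduce.InputSplit. The sketch below reproduces that kind
of failure with stand-in types only; it does not depend on Hadoop, and all
names in it are hypothetical. It is a guess at the mechanism, not a diagnosis
of where Nutch goes wrong.

```java
class SplitCastSketch {
  /** Stand-in for the old-API split type (an interface in org.apache.hadoop.mapred). */
  interface OldApiInputSplit {
    long getLength();
  }

  /** Stand-in for the new-API split type (an abstract class in org.apache.hadoop.mapreduce). */
  abstract static class NewApiInputSplit {
    abstract long getLength();
  }

  /** Stand-in for a split that was written against the new API only. */
  static class NewApiOnlySplit extends NewApiInputSplit {
    @Override
    long getLength() {
      return 42L;
    }
  }

  /** Mimics an old-API code path that blindly casts whatever split it is handed. */
  static long runOldPath(Object split) {
    return ((OldApiInputSplit) split).getLength();
  }

  /** Returns true if handing the split to the old path fails with a ClassCastException. */
  static boolean failsOnOldPath(Object split) {
    try {
      runOldPath(split);
      return false;
    } catch (ClassCastException e) {
      return true;
    }
  }

  public static void main(String[] args) {
    System.out.println(failsOnOldPath(new NewApiOnlySplit()));
  }
}
```

If this is what is happening, the cast itself would be inside Hadoop, and the
thing to look for in Nutch 2.x would be whatever makes the job run down the
old-API mapper path in the first place.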

Best
Lewis

-- 
Lewis


[jira] [Commented] (GORA-174) GORA compiler does not handle [string, null] unions in the AVRO schema

2012-12-04 Thread Lewis John McGibbney (JIRA)

[ 
https://issues.apache.org/jira/browse/GORA-174?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13509916#comment-13509916
 ] 

Lewis John McGibbney commented on GORA-174:
---

I can confirm that after applying the patch to gora-core then using this 
dependency with Nutch 2.x, the Exception is identical to the one documented in 
NUTCH-1477. When I compile the webpage .avsc files here [0], the webpage class is 
generated but with no accessors for the fields with the union case we are 
concerned with. I am also still learning the GoraCompiler (and Avro stuff) so I 
do not have a definitive solution to hand just now. I think the serializing and 
deserializing problems which seem to be a result of introducing union support 
should be addressed as we encounter them.

[0] http://svn.apache.org/repos/asf/nutch/branches/2.x/src/gora/ 



[jira] [Created] (GORA-192) Tests for GoraCompiler

2012-12-04 Thread Lewis John McGibbney (JIRA)
Lewis John McGibbney created GORA-192:
-

 Summary: Tests for GoraCompiler
 Key: GORA-192
 URL: https://issues.apache.org/jira/browse/GORA-192
 Project: Apache Gora
  Issue Type: Improvement
  Components: avro, gora-core, testing
Reporter: Lewis John McGibbney
 Fix For: 0.4


The recent issues surrounding the GoraCompiler have clearly made a case for 
establishing some testing criteria for this important class.



[jira] [Commented] (GORA-192) Tests for GoraCompiler

2012-12-04 Thread Lewis John McGibbney (JIRA)

[ 
https://issues.apache.org/jira/browse/GORA-192?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13509932#comment-13509932
 ] 

Lewis John McGibbney commented on GORA-192:
---

As GoraCompiler is largely based on Avro's SpecificCompiler some basic guide 
tests can be seen here  
http://svn.apache.org/repos/asf/avro/trunk/lang/java/compiler/src/test/java/org/apache/avro/compiler/TestSpecificCompiler.java



[jira] [Resolved] (GORA-193) Make sure gora-core test dependency is always generated when packaging.

2012-12-04 Thread Lewis John McGibbney (JIRA)

 [ 
https://issues.apache.org/jira/browse/GORA-193?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lewis John McGibbney resolved GORA-193.
---

Resolution: Fixed
  Assignee: Lewis John McGibbney

Committed @revision 1417112 in trunk 

 Make sure gora-core test dependency is always generated when packaging.
 ---

 Key: GORA-193
 URL: https://issues.apache.org/jira/browse/GORA-193
 Project: Apache Gora
  Issue Type: Improvement
  Components: maven
Affects Versions: 0.2.1
Reporter: Lewis John McGibbney
Assignee: Lewis John McGibbney
Priority: Trivial
 Fix For: 0.3


 The trivial addition of maven jar plugin testing goal will ensure that the 
 test dependency is always produced for gora-core. This is important.



[jira] [Commented] (GORA-191) Add a constructor to GoraCompiler so it can be used outside of Gora.

2012-12-04 Thread Lewis John McGibbney (JIRA)

[ 
https://issues.apache.org/jira/browse/GORA-191?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13509942#comment-13509942
 ] 

Lewis John McGibbney commented on GORA-191:
---

This issue should also incorporate the addition of functionality to allow 
GoraCompiler to accept a List[] of input schemas as is done in Avro's 
SpecificCompiler.
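
As a very rough sketch of the entry point being proposed, something like the
following would let an external build (for example Nutch's) hand over several
.avsc files in one go. CompilerEntryPointSketch is a hypothetical class and is
not the real GoraCompiler or Avro's SpecificCompiler; it only plans output
file names and does no parsing or code generation.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

class CompilerEntryPointSketch {
  private final List<File> schemaFiles;
  private final File outputDir;

  /** Public constructor so that callers outside the package can drive compilation. */
  public CompilerEntryPointSketch(List<File> schemaFiles, File outputDir) {
    if (schemaFiles == null || schemaFiles.isEmpty()) {
      throw new IllegalArgumentException("at least one schema file is required");
    }
    if (outputDir == null) {
      throw new IllegalArgumentException("an output directory is required");
    }
    this.schemaFiles = Collections.unmodifiableList(new ArrayList<>(schemaFiles));
    this.outputDir = outputDir;
  }

  /**
   * Plans one output file per input schema. Output names are derived from the
   * schema file name purely for illustration; a real compiler would parse each
   * schema and name the classes after the records it declares.
   */
  public List<File> plannedOutputs() {
    List<File> outputs = new ArrayList<>();
    for (File schema : schemaFiles) {
      String name = schema.getName();
      String base = name.endsWith(".avsc")
          ? name.substring(0, name.length() - ".avsc".length())
          : name;
      outputs.add(new File(outputDir, base + ".java"));
    }
    return outputs;
  }
}
```

Taking the whole list in the constructor keeps the calling code to a single
call per build, which is the convenience being asked for here.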



Re: Gora Site Docs

2012-12-05 Thread Lewis John Mcgibbney
Hi Henry,

When I started on this my opinion changed somewhat.
The investment required is as follows:

1) Maven (svnpubsub): I can grab the maven fluido skin [0] (which looks
OK) and have it up and running reasonably shortly.
2) Apache CMS: this requires someone to write the site, however the
publishing workflow is so much less hassle.

What do you think?

Lewis

[0] http://maven.apache.org/skins/maven-fluido-skin/

On Wed, Dec 5, 2012 at 7:12 PM, Henry Saputra henry.sapu...@gmail.com wrote:
 Lewis,

 With Maven site, does it mean we are still using the svn pubsub or we could
 move to ASF CMS for publishing it?

 - Henry


 On Fri, Nov 30, 2012 at 3:47 AM, Lewis John Mcgibbney 
 lewis.mcgibb...@gmail.com wrote:

 Hi All,

 I'm currently setting about the transition from Forrest to Maven for
 the site docs.
 It complicates things by having the current two tier structure which
 we maintain for the site docs. I therefore propose to just have docs.
 Enis, you had reasons and justification behind the legacy Gora
 documentation structure, if you could remind us again it would be
 excellent.
 I am working on this today regardless and will hopefully have a new
 proposal for the Mavenized site prepared shortly.

 Thanks, everyone.

 Happy St Andrews Day

 Lewis

 --
 Lewis




-- 
Lewis


Re: Gora Site Docs

2012-12-05 Thread Lewis John Mcgibbney
I started the Maven transition and can complete tomorrow.

Is everyone happy with the fluido skin that I mentioned?

If so then I will work to get it sorted out tomorrow.

Best

Lewis

On Wed, Dec 5, 2012 at 7:46 PM, Enis Söztutar enis@gmail.com wrote:
 Sorry, meant to reply to this, but it totally fell off my radar.

 The reason why we are doing per-release and release-independent docs is
 that there are some docs that document the code (tutorial, javadoc, etc),
 and some docs that don't (the main site).

 Having said that, I don't think keeping the docs separated is a blocker for
 going maven. We can merge these, and if it still makes sense to separate
 the two, we can do it later.

 Enis






-- 
Lewis


Re: Gora Site Docs

2012-12-06 Thread Lewis John Mcgibbney
Hi,

AFAIK the site docs (in their current form) are in the xdocs format,
however it seems that they are also marked up with some
html/xhtml/xdoc, meaning that automating the transformation into the
apt format is becoming a hellishly tedious and extremely time
consuming task.

I'm using the doxia converter tool to do this but every document seems
to have numerous problems and the doxia stack traces make me want to
smash the place up.

I'll persist and see where I get.

@Henry,
If I can I'll get the new site sorted for all of the site docs, so
that we can use maven for publishing from now on.

Thanks

Lewis

On Wed, Dec 5, 2012 at 9:07 PM, Henry Saputra henry.sapu...@gmail.com wrote:
 +1 for the Fluido skin. But this is just for the release-independent site,
 right?

 - Henry





-- 
Lewis


Re: Test errors java.lang.NoSuchMethodError: org.apache.gora.store.DataStore.setConf

2012-12-06 Thread Lewis John Mcgibbney
No hassle.

There is still work to be done here Henry. It is certainly something
on the radar and we can work on GORA-89 over this week/weekend
hopefully.

On Thu, Dec 6, 2012 at 11:23 PM, Henry Saputra henry.sapu...@gmail.com wrote:
 Ah mea culpa, I was replying the wrong thread. I was trying to reply about
 the initial work you did for
 GORA-89 (https://issues.apache.org/jira/browse/GORA-89)
  =(

 My bad, still recovering from the cold.

 - Henry


 On Thu, Dec 6, 2012 at 2:45 PM, Lewis John Mcgibbney 
 lewis.mcgibb...@gmail.com wrote:

 Hi Henry,

 On Mon, Dec 3, 2012 at 7:06 PM, Henry Saputra henry.sapu...@gmail.com
 wrote:
  Unfortunately I am pretty much out this weekend due to cold and probably
  for some part of this week.

 Hopefully you are getting better. You want to see it here in
 Scotland... pure baltic is the understatement of 2012.

  Lewis, you can continue working on this or I will take a look at this
 once
  I get my health back.

 I committed GORA-193 and the tests passed on build #539

 https://builds.apache.org/view/G-L/view/Gora/job/gora-trunk/539/

 #540 also passed, so eventually (fingers crossed) it seems like we've
 narrowed the dependency problem down and resolved it.

 I've re-initiated the email notifications on the trunk build, so from
 now on we will get the outcome of the builds sent to dev@ and the
 culprits will also receive a notification. If these get out of hand
 again we can disable them until we get the situation sorted.

 I think the best thing is to monitor the builds from now on however
 hopefully things are back to normal now.

 All the best for now

 Lewis


 --
 Lewis




-- 
Lewis


[jira] [Commented] (GORA-174) GORA compiler does not handle [string, null] unions in the AVRO schema

2012-12-07 Thread Lewis John McGibbney (JIRA)

[ 
https://issues.apache.org/jira/browse/GORA-174?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13526497#comment-13526497
 ] 

Lewis John McGibbney commented on GORA-174:
---

I think the fix for this can be committed. Although I am running into different 
issues (as commented over in NUTCH-1477) GoraCompiler with the GORA-174v3.patch 
certainly generates the Java classes properly now with getters and setters.
We have a problem with the tests though, is this correct? I see your attachment 
Alfonso.



[jira] [Commented] (GORA-182) Nutch 2.1 does not work with gora-cassandra 0.2.1

2012-12-11 Thread Lewis John McGibbney (JIRA)

[ 
https://issues.apache.org/jira/browse/GORA-182?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13529416#comment-13529416
 ] 

Lewis John McGibbney commented on GORA-182:
---

I am going to start work on this Kaz as it is a complete blocker IMO.
Are you suggesting that we add support for Hector's LongSerializer to 
accommodate the nature of data and type(s) produced by the InjectorJob?
Once I have a better understanding of this, I'm going to head over to hector 
users@  
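
For context while this is being worked out, the sketch below shows what a
long serializer does in general, using only java.nio.ByteBuffer. It is not
Hector's LongSerializer and makes no claim about how Hector implements it;
LongBytesSketch is a hypothetical name. It also shows how decoding a buffer
that holds fewer than 8 bytes as a long ends in a BufferUnderflowException.
Whether that is the actual cause of the problem here has not been
established.

```java
import java.nio.BufferUnderflowException;
import java.nio.ByteBuffer;

class LongBytesSketch {
  /** Encodes a long as 8 big-endian bytes, ready for reading. */
  static ByteBuffer toByteBuffer(long value) {
    ByteBuffer buffer = ByteBuffer.allocate(Long.BYTES);
    buffer.putLong(value);
    buffer.flip();
    return buffer;
  }

  /** Decodes a long from the buffer without moving the caller's position. */
  static long fromByteBuffer(ByteBuffer buffer) {
    return buffer.duplicate().getLong();
  }

  /** Returns true if decoding the buffer as a long underflows (fewer than 8 bytes left). */
  static boolean underflows(ByteBuffer buffer) {
    try {
      fromByteBuffer(buffer);
      return false;
    } catch (BufferUnderflowException e) {
      return true;
    }
  }

  public static void main(String[] args) {
    System.out.println(fromByteBuffer(toByteBuffer(1354335140L)));
    System.out.println(underflows(ByteBuffer.allocate(4)));
  }
}
```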

 Nutch 2.1 does not work with gora-cassandra 0.2.1
 -

 Key: GORA-182
 URL: https://issues.apache.org/jira/browse/GORA-182
 Project: Apache Gora
  Issue Type: Bug
  Components: storage-cassandra
Affects Versions: 0.2.1
Reporter: Kazuomi Kashii
 Fix For: 0.3

 Attachments: GORA-182.patch


 Nutch 2.1 does not work with gora-cassandra 0.2.1.
 Especially, outlinks field is not written.
 I have confirmed this issue on Mac OS X and CentOS.



[jira] [Commented] (GORA-27) Optionally add license headers to generated files

2012-12-13 Thread Lewis John McGibbney (JIRA)

[ 
https://issues.apache.org/jira/browse/GORA-27?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13531128#comment-13531128
 ] 

Lewis John McGibbney commented on GORA-27:
--

+1 for new patch

 Optionally add license headers to generated files
 -

 Key: GORA-27
 URL: https://issues.apache.org/jira/browse/GORA-27
 Project: Apache Gora
  Issue Type: Improvement
  Components: schema
Affects Versions: 0.1-incubating, 0.2
Reporter: Andrzej Bialecki 
 Fix For: 0.3

 Attachments: GORA-27.patch, GORA-27-v2.patch, GORA-27v4.1.patch, 
 GORA-27v4.patch


 Gora compiler should allow adding license headers to generated files.
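
 A minimal sketch of the idea, assuming the header is optional and is simply
 prepended as a block comment to the generated source text.
 LicenseHeaderSketch is a hypothetical helper and is not taken from any of
 the attached patches.

```java
class LicenseHeaderSketch {
  /**
   * Wraps the license text in a Java block comment and prepends it to the
   * generated source. A null or empty license leaves the source unchanged.
   */
  static String withHeader(String licenseText, String generatedSource) {
    if (licenseText == null || licenseText.isEmpty()) {
      return generatedSource;
    }
    StringBuilder out = new StringBuilder("/**\n");
    for (String line : licenseText.split("\n")) {
      out.append(line.isEmpty() ? " *" : " * " + line).append('\n');
    }
    out.append(" */\n");
    return out.append(generatedSource).toString();
  }

  public static void main(String[] args) {
    System.out.print(withHeader("Licensed under an example license.", "class A {}\n"));
  }
}
```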



Re: Gora and MongoDB

2012-12-18 Thread Lewis John Mcgibbney
Hi,

AFAIK, there is no work currently being undertaken to build a datastore for
10gen's MongoDB.

If you would like to open an issue, please do. If you begin submitting
patches, I'm sure that the community could and will test the code in an
attempt to get a MongoDB module for Gora. This would be very much welcomed.

Please keep us posted as to how you are getting on.

Best

Lewis

On Tue, Dec 18, 2012 at 7:23 PM, Poulard, Fabien fpoul...@dictanova.com wrote:

 Hi all,

 I'm Fabien, I co-founded a company specialized in opinion mining on the Web.
 We use Nutch 2.x for our crawling needs... and therefore Apache Gora as an
 abstraction layer between our NoSQL datastore and Nutch results.

 We've been using HBase so far. But we'd like to give 10gen MongoDB a shot.
 I've started working on a gora datastore for MongoDB. I've searched in the
 archives and in Jira but did not find anything related to MongoDB. Before
 going anywhere further I'd like to check if anyone else is working on such
 a thing and if I may find myself stuck on some difficulties I did not
 anticipate.

 Any hint would help ;)

 --
 *Fabien Poulard*
 Associé-Fondateur Dictanova
 Tél. 02 51 12 59 68 / 06 65 58 94 77

 *Dictanova*
 2, rue de la Houssinière - BP 92208
 44322 Nantes Cedex 03




-- 
*Lewis*


[jira] [Created] (GORA-194) Upgrade to Hadoop 1.1.1

2012-12-27 Thread Lewis John McGibbney (JIRA)
Lewis John McGibbney created GORA-194:
-

 Summary: Upgrade to Hadoop 1.1.1
 Key: GORA-194
 URL: https://issues.apache.org/jira/browse/GORA-194
 Project: Apache Gora
  Issue Type: Improvement
  Components: build process, maven
Affects Versions: 0.2.1
Reporter: Lewis John McGibbney
 Fix For: 0.3


Over in Nutchland, Markus recently committed NUTCH-1510, which covered an 
upgrade of the underlying Hadoop dependency to 1.1.1 with significant 
performance improvements. It would be excellent to upgrade and see if we can 
identify any performance gains in Gora as well. 



[DISCUSS] Timeline for Gora 0.3 Release Thoughts

2013-01-03 Thread Lewis John Mcgibbney
Hi All,

Firstly, Happy New Year everyone. I really hope that 2013 is a good year
for everyone.

It would be excellent to get a Gora 0.3 release done, however there are a
couple of blocking issues. As I see it we have the following

GORA-182 (https://issues.apache.org/jira/browse/GORA-182): Nutch 2.1 does
not work with gora-cassandra 0.2.1

GORA-170 (https://issues.apache.org/jira/browse/GORA-170): Getting a
BufferUnderflowException in class CassandraColumn, method fromByteBuffer()

GORA-188 (https://issues.apache.org/jira/browse/GORA-188): testSerdeWebPage
failure - PersistentBase#equals() fails with map fields

GORA-189 (https://issues.apache.org/jira/browse/GORA-189): String parameters
in generated Persistent subclasses by Compiler -not only Utf8-


The thing is that some of these are linked, and I also anticipate that we
may run into other problems once some/all have been resolved so to speak.
The purpose of this thread is to attempt to draw up some roadmap for
releasing, and of course to understand what is required in the development
drive for us to reach this target.

Any input would be excellent.

Best

Lewis

-- 
*Lewis*


[jira] [Commented] (GORA-89) Avoid HBase MiniCluster restarts to shorten gora-hbase tests

2013-01-03 Thread Lewis John McGibbney (JIRA)

[ 
https://issues.apache.org/jira/browse/GORA-89?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13542939#comment-13542939
 ] 

Lewis John McGibbney commented on GORA-89:
--

Hi Henry, apologies for the time away from this issue. I will be travelling and 
will try out the patch ASAP. Sorry again and thanks for taking the time + 
uploading your work.

 Avoid HBase MiniCluster restarts to shorten gora-hbase tests
 

 Key: GORA-89
 URL: https://issues.apache.org/jira/browse/GORA-89
 Project: Apache Gora
  Issue Type: Improvement
  Components: storage-hbase
Affects Versions: 0.2
Reporter: Lewis John McGibbney
Priority: Critical
 Fix For: 0.3

 Attachments: GORA-89-hsaputra.patch, GORA-89.patch


 Currently our hbase tests are taking forever and a day. We should shorten the 
 time by avoiding MiniCluster restarts.
 Just implement the cluster as a singleton and clean up the tables in
 between tests by doing a scan and deletes for all rows. It's much
 faster than restarting the cluster.
 For a code reference please see the implementation here [1]. The class is
 HBaseClusterSingleton. It needs some refactoring but I think it's
 enough to speed up your tests.
 Thanks Ioan for the heads up.
 [1] 
 http://svn.apache.org/repos/asf/james/mailbox/trunk/hbase/src/test/java/org/apache/james/mailbox/hbase/
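
 The singleton-plus-cleanup idea described above could look roughly like the
 sketch below. This is only an illustration under stated assumptions: the class
 TestClusterSingleton and its in-memory tables are hypothetical stand-ins, not
 the HBase MiniCluster API and not the HBaseClusterSingleton class from James.
 The point is that the expensive start happens once, and each test only pays
 for deleting rows.

```java
import java.util.Map;
import java.util.TreeMap;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical stand-in for an expensive-to-start test cluster (not the HBase API). */
final class TestClusterSingleton {
  private static TestClusterSingleton instance;
  private static int startCount = 0;

  // table name -> (row key -> value); a toy replacement for real tables
  private final Map<String, Map<String, String>> tables = new ConcurrentHashMap<>();

  private TestClusterSingleton() {
    startCount++; // in a real setup the slow cluster start would happen here
  }

  /** Returns the single shared instance, starting it on first use only. */
  static synchronized TestClusterSingleton get() {
    if (instance == null) {
      instance = new TestClusterSingleton();
    }
    return instance;
  }

  /** Number of times the "cluster" has been started. */
  static synchronized int startCount() {
    return startCount;
  }

  void put(String table, String row, String value) {
    tables.computeIfAbsent(table, t -> new TreeMap<>()).put(row, value);
  }

  int rowCount(String table) {
    Map<String, String> rows = tables.get(table);
    return rows == null ? 0 : rows.size();
  }

  /** Between tests: walk every table and delete all rows, keeping the tables. */
  void truncateAll() {
    for (Map<String, String> rows : tables.values()) {
      rows.clear();
    }
  }
}
```

 A test base class would call TestClusterSingleton.get() in its setup and
 truncateAll() in its teardown, so no test sees rows left over from another.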



Re: [DISCUSS] Timeline for Gora 0.3 Release Thoughts

2013-01-03 Thread Lewis John Mcgibbney
Hi Henry
Is it a serious proposal to bump all of the issues quoted thus far to
blocker for 0.3 release (the remaining issues can be bumped to 0.4) so we
have a clear vision for the 0.3 development drive?
Thanks
Lewis

On Thu, Jan 3, 2013 at 7:34 PM, Henry Saputra henry.sapu...@gmail.com wrote:

 Hi Lewis,

 Thanks for starting this discussion.

 At least a couple of the Jira issues you mentioned before are blocked
 by GORA-174.

 I think this is a good list of blockers that must be fixed and should be
 resolved before we start preparing for the 0.3 release.


 - Henry






-- 
*Lewis*


Re: [DISCUSS] Timeline for Gora 0.3 Release Thoughts

2013-01-07 Thread Lewis John Mcgibbney
Hi Alfonso,

Thanks for this.

On Sun, Jan 6, 2013 at 1:19 PM, Alfonso Nishikawa 
alfonso.nishik...@gmail.com wrote:


 By the way... github repo is not in sync with official subversion repo.


M...
The official Github mirror 'should' technically be up-to-date (give or
take the latency between updates)...

Lewis


Re: [DISCUSS] Timeline for Gora 0.3 Release Thoughts

2013-01-07 Thread Lewis John Mcgibbney
Hi All,

Another note on this, I've edited the Jira instance now so that we have a
clear strategy for the 0.3 development drive. The remaining 0.3 issues can
be viewed @ http://s.apache.org/0Z

Thanks and here's to 0.3 :0)

Lewis





-- 
*Lewis*


[jira] [Commented] (GORA-24) Throwing EOFException with MEDIUMBLOB type for inlinks column

2013-01-07 Thread Lewis John McGibbney (JIRA)

[ 
https://issues.apache.org/jira/browse/GORA-24?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13546402#comment-13546402
 ] 

Lewis John McGibbney commented on GORA-24:
--

The gora-sql module is now deprecated (due to licensing issues).
Please correct me if I am wrong, but my outlook on this one is as follows:
- write support for MEDIUMBLOB into a new gora-sql module
- accompany this with better error handling/message logging and some 
additional guidance in the gora-sql-mapping.xml file

There is little we can do about this in Gora until the new gora-sql module is 
written. Any problems experienced using gora-sql with Nutch 2.x (or any other 
client application, for that matter) will therefore need to be addressed at 
that level, not within Gora.

 Throwing EOFException with MEDIUMBLOB type for inlinks column
 -

 Key: GORA-24
 URL: https://issues.apache.org/jira/browse/GORA-24
 Project: Apache Gora
  Issue Type: Bug
  Components: storage-sql
 Environment: MySQL
Reporter: Alexis
 Fix For: 0.4


 I had an exception with DbUpdaterJob complaining that inlinks column of type 
 BLOB in webpage table was not big enough to store all the incoming links. So 
 I changed the column definition in gora-sql-mapping.xml from BLOB to 
 MEDIUMBLOB:
 <field name="inlinks" column="inlinks" jdbc-type="MEDIUMBLOB"/>
 Now I systematically get an exception in the update step:
 java.io.IOException: java.sql.BatchUpdateException: Error reading from InputStream java.io.EOFException
   at org.apache.gora.sql.store.SqlStore.flush(SqlStore.java:341)
   at org.apache.gora.sql.store.SqlStore.close(SqlStore.java:185)
   at org.apache.gora.mapreduce.GoraRecordWriter.close(GoraRecordWriter.java:55)
   at org.apache.hadoop.mapred.ReduceTask.runNewReducer(ReduceTask.java:567)
   at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:408)
   at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:216)
 Caused by: java.sql.BatchUpdateException: Error reading from InputStream java.io.EOFException
   at com.mysql.jdbc.PreparedStatement.executeBatchSerially(PreparedStatement.java:2020)
   at com.mysql.jdbc.PreparedStatement.executeBatch(PreparedStatement.java:1451)
   at org.apache.gora.sql.store.SqlStore.flush(SqlStore.java:329)
   ... 5 more
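
 For context on the BLOB to MEDIUMBLOB switch described above, the sketch below
 lists the maximum lengths MySQL documents for its binary column types and
 picks the smallest type that fits a payload. The enum MySqlBlobType and the
 method smallestFor are hypothetical names made up for this illustration and
 are not part of gora-sql. Note that this only covers the original "column too
 small" complaint; it does not explain the EOFException in the trace.

```java
/** MySQL binary column types with their documented maximum lengths in bytes. */
enum MySqlBlobType {
  TINYBLOB(255L),               // 2^8 - 1
  BLOB(65_535L),                // 2^16 - 1
  MEDIUMBLOB(16_777_215L),      // 2^24 - 1
  LONGBLOB(4_294_967_295L);     // 2^32 - 1

  final long maxBytes;

  MySqlBlobType(long maxBytes) {
    this.maxBytes = maxBytes;
  }

  /** Returns the smallest type able to hold a payload of the given size. */
  static MySqlBlobType smallestFor(long payloadBytes) {
    if (payloadBytes < 0) {
      throw new IllegalArgumentException("negative size: " + payloadBytes);
    }
    for (MySqlBlobType type : values()) { // values() is in declaration order
      if (payloadBytes <= type.maxBytes) {
        return type;
      }
    }
    throw new IllegalArgumentException("payload too large for any BLOB type");
  }
}
```

 So a serialized inlinks value of 65,536 bytes or more no longer fits a plain
 BLOB, which matches the reason given for changing the mapping.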



[jira] [Assigned] (GORA-24) Throwing EOFException with MEDIUMBLOB type for inlinks column

2013-01-07 Thread Lewis John McGibbney (JIRA)

 [ 
https://issues.apache.org/jira/browse/GORA-24?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lewis John McGibbney reassigned GORA-24:


Assignee: Lewis John McGibbney

 Throwing EOFException with MEDIUMBLOB type for inlinks column
 -

 Key: GORA-24
 URL: https://issues.apache.org/jira/browse/GORA-24
 Project: Apache Gora
  Issue Type: Bug
  Components: storage-sql
 Environment: MySQL
Reporter: Alexis
Assignee: Lewis John McGibbney
 Fix For: 0.4





[jira] [Commented] (GORA-24) Throwing EOFException with MEDIUMBLOB type for inlinks column

2013-01-07 Thread Lewis John McGibbney (JIRA)

[ 
https://issues.apache.org/jira/browse/GORA-24?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13546406#comment-13546406
 ] 

Lewis John McGibbney commented on GORA-24:
--

Hi Henry, yes, the idea is to use JOOQ as it provides support for a wide variety 
of SQL stores out of the box... something which would be very appealing to 
users for obvious reasons.
There is a separate Jira issue on this topic altogether: GORA-86.

 Throwing EOFException with MEDIUMBLOB type for inlinks column
 -

 Key: GORA-24
 URL: https://issues.apache.org/jira/browse/GORA-24
 Project: Apache Gora
  Issue Type: Bug
  Components: storage-sql
 Environment: MySQL
Reporter: Alexis
Assignee: Lewis John McGibbney
 Fix For: 0.4





[jira] [Updated] (GORA-195) [gora-hbase] Allow mapping of an array to a single column

2013-01-09 Thread Lewis John McGibbney (JIRA)

 [ 
https://issues.apache.org/jira/browse/GORA-195?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lewis John McGibbney updated GORA-195:
--

Fix Version/s: 0.4

 [gora-hbase] Allow mapping of an array to a single column
 -

 Key: GORA-195
 URL: https://issues.apache.org/jira/browse/GORA-195
 Project: Apache Gora
  Issue Type: Improvement
  Components: storage-hbase
Affects Versions: 0.2.1
 Environment: HBase 0.90.4 backend, Hadoop 1.0.1
Reporter: Alfonso Nishikawa
Priority: Trivial
 Fix For: 0.4


 At this time, defining a mapping in HBase for an array field to a 
 family:column like this:
 {code}
 {"name": "A",
  "fields": [
    {"name": "field", "type": {"type": "array", "items": "string"}}
  ]
 }
 <class name="A" ...>
   <field name="field" family="r" qualifier="c"/>
 </class>
 {code}
 is discouraged, since it leads to unexpected behavior when loading parts 
 of the rest of the record.
 So, for now, arrays (and maps) can only be mapped to families.
 Workaround: enclose the array inside an inner optional record like this:
 {code}
 {"name": "A",
  "fields": [
    {"name": "holder", "type": ["null", {
      "name": "holderRecord",
      "type": "record",
      "fields": [
        {"name": "field", "type": {"type": "array", "items": "string"}}
      ]
    }]}
  ]
 }
 {code}
 The need arises partly if you don't want to create a family for each 
 array in your HBase database (which is advised against), or if you just want 
 to map to a column when your array is read-only.
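
 To make the request concrete: mapping an array to a single column means the
 whole array has to travel as one cell value. The sketch below shows one way
 that could be done, with a length-prefixed encoding of a list of strings into
 a single byte array and back. It is an assumption-laden illustration only:
 ArrayColumnCodec is a hypothetical class, and this is not the serialization
 Gora or Avro actually uses.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical codec: packs a list of strings into one cell value and back. */
final class ArrayColumnCodec {

  /** Layout: [count][len][utf8 bytes][len][utf8 bytes]... with 4-byte ints. */
  static byte[] encode(List<String> items) {
    List<byte[]> encoded = new ArrayList<>();
    int size = Integer.BYTES; // room for the element count
    for (String item : items) {
      byte[] utf8 = item.getBytes(StandardCharsets.UTF_8);
      encoded.add(utf8);
      size += Integer.BYTES + utf8.length;
    }
    ByteBuffer buf = ByteBuffer.allocate(size);
    buf.putInt(encoded.size());
    for (byte[] utf8 : encoded) {
      buf.putInt(utf8.length);
      buf.put(utf8);
    }
    return buf.array();
  }

  /** Reads back exactly what encode() wrote. */
  static List<String> decode(byte[] cell) {
    ByteBuffer buf = ByteBuffer.wrap(cell);
    int count = buf.getInt();
    List<String> items = new ArrayList<>();
    for (int i = 0; i < count; i++) {
      byte[] utf8 = new byte[buf.getInt()];
      buf.get(utf8);
      items.add(new String(utf8, StandardCharsets.UTF_8));
    }
    return items;
  }
}
```

 The trade-off is the one hinted at in the description: a single cell is cheap
 to read as a whole, but any change to one element rewrites the entire value,
 which is why it suits read-only arrays best.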



[jira] [Commented] (GORA-182) Nutch 2.1 does not work with gora-cassandra 0.2.1

2013-01-09 Thread Lewis John McGibbney (JIRA)

[ 
https://issues.apache.org/jira/browse/GORA-182?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13549317#comment-13549317
 ] 

Lewis John McGibbney commented on GORA-182:
---

This thread [0] is relevant to our issue. As you mentioned, Kaz (and as 
confirmed by Nate), the stack trace I've been getting usually happens when 
trying to insert the raw byte form of, say, an integer into a column expecting 
a string.

Although your patch may address the core issue here, I am sure there is still 
work to be done to avoid the stack trace. However, I don't know whether this 
should be done in Gora or at the client level?

Are you in a position to comment Kaz?

[0] https://groups.google.com/forum/?fromgroups=#!topic/hector-users/y2G7VFajHK8
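
The mismatch described above can be reproduced without Cassandra at all. The
sketch below is only an illustration with hypothetical names (RawBytesDemo,
intBytes, isValidUtf8): it takes the 4 raw bytes of an int and checks whether
they form valid UTF-8, which is roughly the kind of check a string-typed
column would be expected to apply.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.StandardCharsets;

/** Hypothetical demo: raw int bytes are not, in general, a valid UTF-8 string. */
final class RawBytesDemo {

  /** The 4-byte big-endian form of an int, as a serializer would write it. */
  static byte[] intBytes(int value) {
    return ByteBuffer.allocate(Integer.BYTES).putInt(value).array();
  }

  /** True if the bytes decode as UTF-8 with no malformed input. */
  static boolean isValidUtf8(byte[] raw) {
    try {
      // A fresh decoder reports malformed input instead of replacing it.
      StandardCharsets.UTF_8.newDecoder().decode(ByteBuffer.wrap(raw));
      return true;
    } catch (CharacterCodingException e) {
      return false;
    }
  }
}
```

The int 200 is written as 00 00 00 C8, and C8 is the start of a two-byte
sequence with nothing after it, so decoding fails. Some small values happen to
decode cleanly into control characters, which is arguably worse since the
mistake then goes unnoticed.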

 Nutch 2.1 does not work with gora-cassandra 0.2.1
 -

 Key: GORA-182
 URL: https://issues.apache.org/jira/browse/GORA-182
 Project: Apache Gora
  Issue Type: Bug
  Components: storage-cassandra
Affects Versions: 0.2.1
Reporter: Kazuomi Kashii
 Fix For: 0.3

 Attachments: GORA-182.patch


 Nutch 2.1 does not work with gora-cassandra 0.2.1.
 Especially, outlinks field is not written.
 I have confirmed this issue on Mac OS X and CentOS.



Re: Attendance @ ApacheCon NA 2013 Portland

2013-01-09 Thread Lewis John Mcgibbney
Duh... I know you guys will be present and raring to go. That's a given!
Any excuse eh ;)

On Wed, Jan 9, 2013 at 6:58 PM, Mattmann, Chris A (388J) 
chris.a.mattm...@jpl.nasa.gov wrote:

 I'll be there (and so will Paul + Cam + Andrew + the rest of the
 OODT/Gora/etc. peeps that you know and love from JPL)

 Cheers!

 Chris


 On 1/9/13 9:52 PM, Lewis John Mcgibbney lewis.mcgibb...@gmail.com
 wrote:

 Hi All,
 This thread speaks for itself.
 Who is going, who is not.
 I'm just in a new position so I don't know if it is appropriate, convenient
 for me to take the short trip up to Portland, however Gora PMC/community
 members would surely build the case for me going.
 Any takers?
 Best
 Lewis
 
 --
 *Lewis*




-- 
*Lewis*


[jira] [Commented] (GORA-197) gora-cassandra requires BytesType for Cassandra column family validator

2013-01-10 Thread Lewis John McGibbney (JIRA)

[ 
https://issues.apache.org/jira/browse/GORA-197?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13550503#comment-13550503
 ] 

Lewis John McGibbney commented on GORA-197:
---

Hi Kaz, which version do you wish to set this for? I say 0.3 if possible as the 
maven artifact (with this fix) would be really valuable elsewhere.

 gora-cassandra requires BytesType for Cassandra column family validator
 ---

 Key: GORA-197
 URL: https://issues.apache.org/jira/browse/GORA-197
 Project: Apache Gora
  Issue Type: Task
  Components: storage-cassandra
Reporter: Kazuomi Kashii

 gora-cassandra requires BytesType for Cassandra column family validator in 
 order to support Avro complex data type.
 If a user manually creates a column family with other type of validator, 
 gora-cassandra cannot do anything but throw an exception.



Document about GORA-174

2013-01-16 Thread Lewis John Mcgibbney
Hi Alfonso,
When you say that "...the first element in the union is considered as the
default element, at this moment it is not implemented nor planned", does
this refer to Avro?



On Sunday, January 13, 2013, Alfonso Nishikawa alfonso.nishik...@gmail.com
wrote:
 Hello everybody.

 I wrote an article [0] regarding GORA-174 where I try to explain a
 compatibility issue with old data in HBase.
 I really don't know how it affects other backends. Need some info if anyone
 knows. (@Renato: maybe you can tell me something about how it is in
 Cassandra :)
 I will appreciate your thoughts :)

 Thank you very much!

 Alfonso Nishikawa

 [0] - http://people.apache.org/~alfonsonishikawa/gora-174.html


-- 
*Lewis*


[jira] [Updated] (GORA-199) Support MongoDB in GORA

2013-01-28 Thread Lewis John McGibbney (JIRA)

 [ 
https://issues.apache.org/jira/browse/GORA-199?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lewis John McGibbney updated GORA-199:
--

Fix Version/s: 0.4

 Support MongoDB in GORA
 ---

 Key: GORA-199
 URL: https://issues.apache.org/jira/browse/GORA-199
 Project: Apache Gora
  Issue Type: New Feature
  Components: storage
Reporter: Fabien Poulard
Priority: Minor
 Fix For: 0.4


 Support 10gen MongoDB datastore in GORA.



[jira] [Commented] (GORA-199) Support MongoDB in GORA

2013-01-28 Thread Lewis John McGibbney (JIRA)

[ 
https://issues.apache.org/jira/browse/GORA-199?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13564760#comment-13564760
 ] 

Lewis John McGibbney commented on GORA-199:
---

Hi Fabien, out of curiosity, have you been working on this at all? Is there any 
code written?

 Support MongoDB in GORA
 ---

 Key: GORA-199
 URL: https://issues.apache.org/jira/browse/GORA-199
 Project: Apache Gora
  Issue Type: New Feature
  Components: storage
Reporter: Fabien Poulard
Priority: Minor
 Fix For: 0.4


 Support 10gen MongoDB datastore in GORA.



[jira] [Commented] (GORA-199) Support MongoDB in GORA

2013-01-29 Thread Lewis John McGibbney (JIRA)

[ 
https://issues.apache.org/jira/browse/GORA-199?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13565795#comment-13565795
 ] 

Lewis John McGibbney commented on GORA-199:
---

@Fabien, I totally missed these issues as I have recently moved to batch 
digests for all of the mailing lists I subscribe to. Please upload your patch 
here and we can begin to review. I understand from your threads elsewhere that 
you do not have the tests working with the TestDriver scenario, however this is 
not a huge problem. 
FYI we currently use Maven for the build lifecycle so maybe we can add that 
functionality as well.
Finally, please feel free to explain a bit about your avro-gradle-plugin and 
how if possible we can use this within Gora. My understanding of Groovy is 
limited and Rails even less so please be gentle ;)
Thank you and apologies again as this one seems to have slipped right 
through the net. 

 Support MongoDB in GORA
 ---

 Key: GORA-199
 URL: https://issues.apache.org/jira/browse/GORA-199
 Project: Apache Gora
  Issue Type: New Feature
  Components: storage
Reporter: Fabien Poulard
Priority: Minor
 Fix For: 0.4


 Support 10gen MongoDB datastore in GORA.



Re: dev Digest 30 Jan 2013 09:38:18 -0000 Issue 313

2013-01-30 Thread Lewis John Mcgibbney
Hi Alfonso,

On Wed, Jan 30, 2013 at 1:38 AM, dev-digest-h...@gora.apache.org wrote:


 Greetings,

 Gora+MongoDB is happy news. Good to know about that feature.
 Maybe someone should write a short document describing the classes involved
 in adding a new datastore (maybe I will try someday).


This would be really welcomed actually. It is something which we need to
improve upon most certainly. Currently the process of contributing
documentation to Gora is bloody difficult and this needs to change.



 About compiling schemas, in my opinion, someday someone should do something
 for Maven (plugin :). But for now anything automated is welcome.


Well, some good news. Within Ed's patch and proposal for the Avro upgrade
he re-factored the compiler (splitting it out into its own module) and the
idea is to publish this and Renato's DynamoDB compiler as maven plugins
which you can simply call from within your pom. This is something for the
future though.
Thanks
Lewis


 Cheers.

 Alfonso Nishikawa
 On 28/01/2013 07:37, Poulard, Fabien fpoul...@dictanova.com wrote:



[jira] [Updated] (GORA-197) gora-cassandra requires BytesType for Cassandra column family validator

2013-01-30 Thread Lewis John McGibbney (JIRA)

 [ 
https://issues.apache.org/jira/browse/GORA-197?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lewis John McGibbney updated GORA-197:
--

Fix Version/s: 0.3

 gora-cassandra requires BytesType for Cassandra column family validator
 ---

 Key: GORA-197
 URL: https://issues.apache.org/jira/browse/GORA-197
 Project: Apache Gora
  Issue Type: Task
  Components: storage-cassandra
Reporter: Kazuomi Kashii
 Fix For: 0.3

 Attachments: GORA-197.patch


 gora-cassandra requires BytesType for Cassandra column family validator in 
 order to support Avro complex data type.
 If a user manually creates a column family with other type of validator, 
 gora-cassandra cannot do anything but throw an exception.



[jira] [Created] (GORA-201) Upgrade HBase API Usage in Gora

2013-01-31 Thread Lewis John McGibbney (JIRA)
Lewis John McGibbney created GORA-201:
-

 Summary: Upgrade HBase API Usage in Gora
 Key: GORA-201
 URL: https://issues.apache.org/jira/browse/GORA-201
 Project: Apache Gora
  Issue Type: Bug
  Components: storage-hbase
Affects Versions: 0.3
Reporter: Lewis John McGibbney
 Fix For: 0.4


We haven't touched the HBase versioning in a good while. When a new user heads 
over to the HBase site, they are directed to the 'stable' release which is 
currently sitting at 0.94.4.

I realise that we have (legacy) support for the 0.90.X branch of HBase, but 
from what I can see, there is no current justification for this decision and it 
is also not within any strategic short/medium/long term objectives of Gora.

This issue should:

* Enable us to discuss which HBase branch we wish to support moving forward
* Actually implement the upgrade which gathers the most consensus



[jira] [Commented] (GORA-182) Nutch 2.1 does not work with gora-cassandra 0.2.1

2013-01-31 Thread Lewis John McGibbney (JIRA)

[ 
https://issues.apache.org/jira/browse/GORA-182?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13568113#comment-13568113
 ] 

Lewis John McGibbney commented on GORA-182:
---

So this can now be closed Kaz?

 Nutch 2.1 does not work with gora-cassandra 0.2.1
 -

 Key: GORA-182
 URL: https://issues.apache.org/jira/browse/GORA-182
 Project: Apache Gora
  Issue Type: Bug
  Components: storage-cassandra
Affects Versions: 0.2.1
Reporter: Kazuomi Kashii
Assignee: Kazuomi Kashii
 Fix For: 0.3

 Attachments: GORA-182.patch


 Nutch 2.1 does not work with gora-cassandra 0.2.1.
 Especially, outlinks field is not written.
 I have confirmed this issue on Mac OS X and CentOS.



[jira] [Commented] (GORA-196) OSX JDK7 failed to load snappy native library from snappy-java-1.0.4.1.jar.

2013-01-31 Thread Lewis John McGibbney (JIRA)

[ 
https://issues.apache.org/jira/browse/GORA-196?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13568115#comment-13568115
 ] 

Lewis John McGibbney commented on GORA-196:
---

Hi Kaz. Can you please commit this when you get a chance?

 OSX JDK7 failed to load snappy native library from snappy-java-1.0.4.1.jar.
 ---

 Key: GORA-196
 URL: https://issues.apache.org/jira/browse/GORA-196
 Project: Apache Gora
  Issue Type: Test
  Components: storage-cassandra
 Environment: OSX JDK7
Reporter: Kazuomi Kashii
Priority: Minor
 Fix For: 0.3

 Attachments: GORA-196.patch


 OSX JDK7 failed to load snappy native library from snappy-java-1.0.4.1.jar 
 which is currently specified in Cassandra, so gora-cassandra test failed.
 This is a known issue, and snappy 1.0.5 (currently M3) should fix this :
 https://github.com/xerial/snappy-java/issues/6



[jira] [Commented] (GORA-196) OSX JDK7 failed to load snappy native library from snappy-java-1.0.4.1.jar.

2013-01-31 Thread Lewis John McGibbney (JIRA)

[ 
https://issues.apache.org/jira/browse/GORA-196?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13568116#comment-13568116
 ] 

Lewis John McGibbney commented on GORA-196:
---

Maybe add it to parent pom and inherit it through the gora-cassandra project 
pom?

 OSX JDK7 failed to load snappy native library from snappy-java-1.0.4.1.jar.
 ---

 Key: GORA-196
 URL: https://issues.apache.org/jira/browse/GORA-196
 Project: Apache Gora
  Issue Type: Test
  Components: storage-cassandra
 Environment: OSX JDK7
Reporter: Kazuomi Kashii
Priority: Minor
 Fix For: 0.3

 Attachments: GORA-196.patch


 OSX JDK7 failed to load snappy native library from snappy-java-1.0.4.1.jar 
 which is currently specified in Cassandra, so gora-cassandra test failed.
 This is a known issue, and snappy 1.0.5 (currently M3) should fix this :
 https://github.com/xerial/snappy-java/issues/6



[jira] [Updated] (GORA-174) GORA compiler does not handle [string, null] unions in the AVRO schema

2013-01-31 Thread Lewis John McGibbney (JIRA)

 [ 
https://issues.apache.org/jira/browse/GORA-174?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lewis John McGibbney updated GORA-174:
--

Priority: Blocker  (was: Major)

 GORA compiler does not handle [string, null] unions in the AVRO schema
 --

 Key: GORA-174
 URL: https://issues.apache.org/jira/browse/GORA-174
 Project: Apache Gora
  Issue Type: Bug
  Components: schema
Affects Versions: 0.2.1
Reporter: Julien Nioche
Assignee: Alfonso Nishikawa
Priority: Blocker
 Fix For: 0.3

 Attachments: failed_tests_after_v3.tar.gz, GORA-174-test.patch, 
 GORA-174v2.patch, GORA-174v3.patch


 See NUTCH-1477 for description. 
 We are getting an NPE when using the DataFileAvroStore. In order to avoid that, I 
 modified the schema to allow for null values on some fields, e.g.
 {"name": "baseUrl", "type": ["string", "null"]}
 However, when generating the code for the schema, the accessors are not 
 generated by GORA, which prevents Nutch from compiling.
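
 As a rough picture of what handling a ["string", "null"] union involves, the
 sketch below picks the branch of a union that a given value belongs to, which
 is the decision code that reads or writes such a field has to make. It is an
 illustration only: UnionBranches and resolve are hypothetical names, the
 union is modelled as a plain list of type names, and none of this is the Gora
 compiler's or Avro's actual code.

```java
import java.util.List;

/** Hypothetical helper: choose the union branch that matches a value. */
final class UnionBranches {

  /**
   * Returns the index of the first branch that accepts the value.
   * Only the "string" and "null" branch names are understood here.
   */
  static int resolve(List<String> branches, Object value) {
    for (int i = 0; i < branches.size(); i++) {
      String branch = branches.get(i);
      if (value == null && branch.equals("null")) {
        return i;
      }
      if (value instanceof CharSequence && branch.equals("string")) {
        return i;
      }
    }
    throw new IllegalArgumentException(
        "value does not match any branch of " + branches);
  }
}
```

 With the schema above, a populated baseUrl resolves to branch 0 and a missing
 one to branch 1, whereas a plain "string" field has no branch for null at
 all, which fits the NPE described in the report.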



[jira] [Commented] (GORA-202) gora-tutorial does not work with Cassandra

2013-01-31 Thread Lewis John McGibbney (JIRA)

[ 
https://issues.apache.org/jira/browse/GORA-202?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13568243#comment-13568243
 ] 

Lewis John McGibbney commented on GORA-202:
---

Can you please commit this Kaz?

 gora-tutorial does not work with Cassandra
 --

 Key: GORA-202
 URL: https://issues.apache.org/jira/browse/GORA-202
 Project: Apache Gora
  Issue Type: Bug
  Components: storage-cassandra
Affects Versions: 0.2.1
Reporter: Kazuomi Kashii
Assignee: Kazuomi Kashii
Priority: Minor
 Fix For: 0.3

 Attachments: GORA-202.patch


 gora-cassandra fails to initialize with gora-cassandra-mapping.xml of 
 gora-tutorial.



[jira] [Resolved] (GORA-196) OSX JDK7 failed to load snappy native library from snappy-java-1.0.4.1.jar.

2013-01-31 Thread Lewis John McGibbney (JIRA)

 [ 
https://issues.apache.org/jira/browse/GORA-196?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lewis John McGibbney resolved GORA-196.
---

Resolution: Fixed

 OSX JDK7 failed to load snappy native library from snappy-java-1.0.4.1.jar.
 ---

 Key: GORA-196
 URL: https://issues.apache.org/jira/browse/GORA-196
 Project: Apache Gora
  Issue Type: Test
  Components: storage-cassandra
 Environment: OSX JDK7
Reporter: Kazuomi Kashii
Priority: Minor
 Fix For: 0.3

 Attachments: GORA-196.patch


 OSX JDK7 failed to load snappy native library from snappy-java-1.0.4.1.jar 
 which is currently specified in Cassandra, so gora-cassandra test failed.
 This is a known issue, and snappy 1.0.5 (currently M3) should fix this :
 https://github.com/xerial/snappy-java/issues/6



[jira] [Resolved] (GORA-182) Nutch 2.1 does not work with gora-cassandra 0.2.1

2013-01-31 Thread Lewis John McGibbney (JIRA)

 [ 
https://issues.apache.org/jira/browse/GORA-182?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lewis John McGibbney resolved GORA-182.
---

Resolution: Fixed

 Nutch 2.1 does not work with gora-cassandra 0.2.1
 -

 Key: GORA-182
 URL: https://issues.apache.org/jira/browse/GORA-182
 Project: Apache Gora
  Issue Type: Bug
  Components: storage-cassandra
Affects Versions: 0.2.1
Reporter: Kazuomi Kashii
Assignee: Kazuomi Kashii
 Fix For: 0.3

 Attachments: GORA-182.patch


 Nutch 2.1 does not work with gora-cassandra 0.2.1.
 In particular, the outlinks field is not written.
 I have confirmed this issue on Mac OS X and CentOS.



[jira] [Resolved] (GORA-197) gora-cassandra requires BytesType for Cassandra column family validator

2013-01-31 Thread Lewis John McGibbney (JIRA)

 [ 
https://issues.apache.org/jira/browse/GORA-197?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lewis John McGibbney resolved GORA-197.
---

Resolution: Fixed

 gora-cassandra requires BytesType for Cassandra column family validator
 ---

 Key: GORA-197
 URL: https://issues.apache.org/jira/browse/GORA-197
 Project: Apache Gora
  Issue Type: Task
  Components: storage-cassandra
Reporter: Kazuomi Kashii
 Fix For: 0.3

 Attachments: GORA-197.patch


 gora-cassandra requires BytesType for the Cassandra column family validator in 
 order to support Avro complex data types.
 If a user manually creates a column family with another type of validator, 
 gora-cassandra cannot do anything but throw an exception.



[jira] [Commented] (GORA-202) gora-tutorial does not work with Cassandra

2013-01-31 Thread Lewis John McGibbney (JIRA)

[ 
https://issues.apache.org/jira/browse/GORA-202?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13568262#comment-13568262
 ] 

Lewis John McGibbney commented on GORA-202:
---

Hey Kaz. Can you close this one off please? Thank you

 gora-tutorial does not work with Cassandra
 --

 Key: GORA-202
 URL: https://issues.apache.org/jira/browse/GORA-202
 Project: Apache Gora
  Issue Type: Bug
  Components: storage-cassandra
Affects Versions: 0.2.1
Reporter: Kazuomi Kashii
Assignee: Kazuomi Kashii
Priority: Minor
 Fix For: 0.3

 Attachments: GORA-202a.patch, GORA-202.patch


 gora-cassandra fails to initialize with gora-cassandra-mapping.xml of 
 gora-tutorial.



[DRAFT] Gora Report

2013-01-31 Thread Lewis John Mcgibbney
Hi All,
We need to report again this month. Please see below for the report
and please add/remove content where you see appropriate. I'll get this
committed when we are done.
Thanks
Lewis

-

The Apache Gora open source framework provides an in-memory data model and
persistence for big data. Gora supports persisting to column stores, key
value stores, document stores and RDBMSs, and analyzing the data with
extensive Apache Hadoop MapReduce support.

Project Releases

The Apache Gora team was happy to announce the release of Gora 0.2.1 on
7th August 2012. No releases have been made since; however, a clear strategy
has been established for the 0.3 release.

Overall Project Activity since last report

Since last reporting, the PMC has geared the development drive towards the 0.3
release. We have addressed and resolved 28 of 33 issues meaning that the
progression towards an RC for 0.3 is well on the way. We currently have two
blockers which need to be addressed before we can consider the 0.3 RC.

How has the community developed since the last report?

Activity on the user@ list has been very slow since last reporting. It was
envisaged that after ApacheConEU user interest might pick up slightly, however
this has not materialized as we hoped.
Activity on dev@ has developed in line with our expectations as we move towards
more regular Gora releases. Generally speaking more work needs to be done in an
attempt to make it easier for people to use Gora. This is something which the
PMC need to work on.

Changes to PMC & Committers

The Gora PMC were very pleased to invite Alfonso Nishikawa to join our ranks in
early December. After working with the PMC to ensure a smooth transition into
the Apache community, Alfonso is now contributing to Gora and making a real
impact. Alfonso also joined the Gora PMC.

PMC and Committer diversity

We currently have committers from a wide variety of Apache projects including
Nutch, Tika, OODT, Camel, Solr, Accumulo, Whirr & Hadoop (this is not an
exhaustive list). We are still actively seeking one or more members to join the
team from the Avro community so this will be a main target for us in the future
post 0.3 release.

Project Branding or Naming issues

NONE

Legal issues

NONE

-- 
Lewis


[jira] [Updated] (GORA-203) Bug in setting column field attribute qualifier in CassandraMapping

2013-02-01 Thread Lewis John McGibbney (JIRA)

 [ 
https://issues.apache.org/jira/browse/GORA-203?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lewis John McGibbney updated GORA-203:
--

Description: 
Currently, we are absolutely required to set a value for a column field 
attribute qualifier, however there are no checks to determine whether this is 
actually present or not, therefore this is a bug.

Renato pointed this out and hopefully he can upload some stack traces relating 
to the issue to display the kind of issues one faces when qualifier attributes 
and their values are not present when mapping columns to Cassandra.

As far as we know, column field attributes are supported in the most recent 
Cassandra data model (and this is not due to change) therefore we should also 
support them in Gora; however, I would welcome opinions (please comment here) 
on whether they should be optional or not.

  was:
Currently, we are absolutely required to set a column field value attribute 
qualifier, however there are no checks to determine whether this is actually 
present or not. 

Renato pointed this out and hopefully he can upload some stack traces relating 
to the issue.

As far aw we know, column field attributes are supported in the most recent 
Cassandra data model therefore we should also support them in Gora, however it 
is my opinion (please comment) on whether they should be optional or not.


 Bug in setting column field attribute qualifier in CassandraMapping 
 --

 Key: GORA-203
 URL: https://issues.apache.org/jira/browse/GORA-203
 Project: Apache Gora
  Issue Type: Bug
  Components: storage-cassandra
Affects Versions: 0.2.1
Reporter: Lewis John McGibbney
 Fix For: 0.3


 Currently, we are absolutely required to set a value for a column field 
 attribute qualifier, however there are no checks to determine whether this 
 is actually present or not, therefore this is a bug.
 Renato pointed this out and hopefully he can upload some stack traces 
 relating to the issue to display the kind of issues one faces when qualifier 
 attributes and their values are not present when mapping columns to Cassandra.
 As far as we know, column field attributes are supported in the most recent 
 Cassandra data model (and this is not due to change) therefore we should also 
 support them in Gora; however, I would welcome opinions (please comment here) 
 on whether they should be optional or not.



[jira] [Commented] (GORA-203) Bug in setting column field attribute qualifier in CassandraMapping

2013-02-01 Thread Lewis John McGibbney (JIRA)

[ 
https://issues.apache.org/jira/browse/GORA-203?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13569368#comment-13569368
 ] 

Lewis John McGibbney commented on GORA-203:
---

OK Kaz, thanks.
What is your opinion of the status quo, in which we make the qualifier 
attribute mandatory in Cassandra column mappings?
I do not think qualifiers are mandatory attributes within the Cassandra data 
model, so personally I do not think it is appropriate for us to enforce them 
within Gora.
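
If the qualifier were made optional, the mapping loader would need a defined fallback. The following is only a minimal sketch, assuming the attribute values have already been read into a map and that falling back to the field name is acceptable. The class, the method and the fallback rule are all hypothetical and do not describe current gora-cassandra behaviour.

```java
import java.util.Map;

// Hypothetical sketch; gora-cassandra does not contain this class.
final class QualifierResolver {

  private QualifierResolver() {
  }

  // Returns the configured qualifier, or the field name when the
  // qualifier attribute is missing or blank (an assumed fallback rule).
  static String resolve(String fieldName, Map<String, String> attributes) {
    if (fieldName == null || fieldName.isEmpty()) {
      throw new IllegalArgumentException("field name is required");
    }
    String qualifier = attributes.get("qualifier");
    if (qualifier == null || qualifier.trim().isEmpty()) {
      return fieldName;
    }
    return qualifier;
  }
}
```

With a rule like this, a missing qualifier would no longer fail silently; the alternative of keeping it mandatory would replace the fallback branch with an exception that names the offending field.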

 Bug in setting column field attribute qualifier in CassandraMapping 
 --

 Key: GORA-203
 URL: https://issues.apache.org/jira/browse/GORA-203
 Project: Apache Gora
  Issue Type: Bug
  Components: storage-cassandra
Affects Versions: 0.2.1
Reporter: Lewis John McGibbney
 Fix For: 0.3


 Currently, we are absolutely required to set a value for a column field 
 attribute qualifier, however there are no checks to determine whether this 
 is actually present or not, therefore this is a bug.
 Renato pointed this out and hopefully he can upload some stack traces 
 relating to the issue to display the kind of issues one faces when qualifier 
 attributes and their values are not present when mapping columns to Cassandra.
 As far as we know, column field attributes are supported in the most recent 
 Cassandra data model (and this is not due to change) therefore we should also 
 support them in Gora; however, I would welcome opinions (please comment here) 
 on whether they should be optional or not.



[jira] [Commented] (GORA-203) Bug in setting column field attribute qualifier in CassandraMapping

2013-02-01 Thread Lewis John McGibbney (JIRA)

[ 
https://issues.apache.org/jira/browse/GORA-203?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13569369#comment-13569369
 ] 

Lewis John McGibbney commented on GORA-203:
---

I also think that we could do with much more verbose logging, but emitted at 
DEBUG level. What do you think about this as well?

 Bug in setting column field attribute qualifier in CassandraMapping 
 --

 Key: GORA-203
 URL: https://issues.apache.org/jira/browse/GORA-203
 Project: Apache Gora
  Issue Type: Bug
  Components: storage-cassandra
Affects Versions: 0.2.1
Reporter: Lewis John McGibbney
 Fix For: 0.3


 Currently, we are absolutely required to set a value for a column field 
 attribute qualifier, however there are no checks to determine whether this 
 is actually present or not, therefore this is a bug.
 Renato pointed this out and hopefully he can upload some stack traces 
 relating to the issue to display the kind of issues one faces when qualifier 
 attributes and their values are not present when mapping columns to Cassandra.
 As far as we know, column field attributes are supported in the most recent 
 Cassandra data model (and this is not due to change) therefore we should also 
 support them in Gora; however, I would welcome opinions (please comment here) 
 on whether they should be optional or not.



[jira] [Commented] (GORA-121) Enhance CassandraMapping to support additional Column Definitions

2013-02-01 Thread Lewis John McGibbney (JIRA)

[ 
https://issues.apache.org/jira/browse/GORA-121?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13569380#comment-13569380
 ] 

Lewis John McGibbney commented on GORA-121:
---

Can anyone see or suggest any additional column attributes which Cassandra 
currently supports? If not then we can close this issue as won't fix as it 
seems to have been addressed. 

 Enhance CassandraMapping to support additional Column Definitions 
 --

 Key: GORA-121
 URL: https://issues.apache.org/jira/browse/GORA-121
 Project: Apache Gora
  Issue Type: New Feature
  Components: storage-cassandra
Affects Versions: 0.2
Reporter: Lewis John McGibbney
Assignee: Lewis John McGibbney
 Fix For: 0.4


 There are 2 parts to this issue
 1) CassandraMapping#loadConfiguration currently loads definitions for 
 keyspaces, column families and columns however the support for the latter is 
 limited.
 The following is a mapping example
 Say we have the keyspace mapping configuration:
 <keyspace name="WebPage" cluster="Test Cluster" host="localhost">
   <family name="p"/>
   <family name="f"/>
   <family name="sc" type="super"/>
 </keyspace>
 and the column mapping configuration:
 <class name="org.apache.gora.examples.generated.WebPage" 
   keyClass="java.lang.String" keyspace="WebPage">
   <field name="url" family="p" path="c:u"/>
   <field name="content" family="p" path="p:cnt:c"/>
   <field name="parsedContent" family="p" path="p:parsedContent"/>
   <field name="outlinks" family="p" path="p:outlinks"/>
   <field name="metadata" family="p" path="c:mt"/>
 </class>
 Currently we don't support keyClass attributes or field path attributes.
 2) Additionally, we mention
 private static final String COLUMN_ATTRIBUTE = "qualifier";
 however this resource is neither loaded nor requested at any stage during the 
 process of ascertaining Cassandra mappings. This should also be supported; if 
 not then it should be removed.



[jira] [Resolved] (GORA-121) Enhance CassandraMapping to support additional Column Definitions

2013-02-05 Thread Lewis John McGibbney (JIRA)

 [ 
https://issues.apache.org/jira/browse/GORA-121?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lewis John McGibbney resolved GORA-121.
---

Resolution: Won't Fix

This has either been fixed elsewhere or is now not relevant.

 Enhance CassandraMapping to support additional Column Definitions 
 --

 Key: GORA-121
 URL: https://issues.apache.org/jira/browse/GORA-121
 Project: Apache Gora
  Issue Type: New Feature
  Components: storage-cassandra
Affects Versions: 0.2
Reporter: Lewis John McGibbney
Assignee: Lewis John McGibbney
 Fix For: 0.4


 There are 2 parts to this issue
 1) CassandraMapping#loadConfiguration currently loads definitions for 
 keyspaces, column families and columns however the support for the latter is 
 limited.
 The following is a mapping example
 Say we have the keyspace mapping configuration:
 <keyspace name="WebPage" cluster="Test Cluster" host="localhost">
   <family name="p"/>
   <family name="f"/>
   <family name="sc" type="super"/>
 </keyspace>
 and the column mapping configuration:
 <class name="org.apache.gora.examples.generated.WebPage" 
   keyClass="java.lang.String" keyspace="WebPage">
   <field name="url" family="p" path="c:u"/>
   <field name="content" family="p" path="p:cnt:c"/>
   <field name="parsedContent" family="p" path="p:parsedContent"/>
   <field name="outlinks" family="p" path="p:outlinks"/>
   <field name="metadata" family="p" path="c:mt"/>
 </class>
 Currently we don't support keyClass attributes or field path attributes.
 2) Additionally, we mention
 private static final String COLUMN_ATTRIBUTE = "qualifier";
 however this resource is neither loaded nor requested at any stage during the 
 process of ascertaining Cassandra mappings. This should also be supported; if 
 not then it should be removed.



[jira] [Commented] (GORA-201) Upgrade HBase API Usage in Gora

2013-02-06 Thread Lewis John McGibbney (JIRA)

[ 
https://issues.apache.org/jira/browse/GORA-201?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13573148#comment-13573148
 ] 

Lewis John McGibbney commented on GORA-201:
---

One more reason for us to upgrade HBase API usage/dependency in Gora

 http://www.mail-archive.com/user%40nutch.apache.org/msg08700.html

 Upgrade HBase API Usage in Gora
 ---

 Key: GORA-201
 URL: https://issues.apache.org/jira/browse/GORA-201
 Project: Apache Gora
  Issue Type: Bug
  Components: storage-hbase
Affects Versions: 0.3
Reporter: Lewis John McGibbney
 Fix For: 0.4


 We haven't touched the HBase versioning in a good while. When a new user 
 heads over to the HBase site, they are directed to the 'stable' release which 
 is currently sitting at 0.94.4.
 I realise that we have (legacy) support for the 0.90.X branch of HBase, but 
 from what I can see, there is no current justification for this decision and 
 it is also not within any strategic short/medium/long term objectives of Gora.
 This issue should 
 *Enable us to discuss what Hbase branch we wish to support moving forward
 *Actually implement the upgrade which gathers most consensus.   



Re: Making Sense of NoSQL

2013-02-07 Thread Lewis John Mcgibbney
It might also be relevant to note that Dan is looking for good quality use
cases for Big Data, as this is one of the aspects of the book.
Lewis

On Thu, Feb 7, 2013 at 12:40 PM, Lewis John Mcgibbney 
lewis.mcgibb...@gmail.com wrote:

 Hi,
 I recently spoke with Dan McCreary, the (co)author of the soon to be
 published Making Sense of NoSQL
 http://www.manning.com/mccreary/
 Thought that it may be a link a few of us would be interested in :)
 Best
 Lewis

 --
 *Lewis*




-- 
*Lewis*


Re: dev Digest 8 Feb 2013 22:46:15 -0000 Issue 317

2013-02-08 Thread Lewis John Mcgibbney
Hi Alfonso,

On Fri, Feb 8, 2013 at 2:46 PM, dev-digest-h...@gora.apache.org wrote:



 Hi all,

 I updated the GORA-174 issue with info about the HBase backend at [0]. Any
 thoughts? I think it is now better expressed.


This is much clearer for me at least. We are always going to have certain
problems (when developing Gora) when intricacies associated with (and which
affect all) datastores are encountered. GORA-174 is a perfect example.
There is no workaround and it is essential to have a thorough understanding
of the problem at individual datastore level. Thanks for the
documentation, it is really driving this issue forward!


 If no one thinks it is wrong, I will implement solution-1 and solution-2 (this
 may mean quite a lot of work, so do we maintain it? I vote yes).


I think the proposed resolutions are certainly attractive and that we
should progress on this basis. When we get to a 1.0 Gora release (please
excuse my wishful long-term thinking) then we can act on completely
removing the deprecated methods from Gora. For the time being I see no
problem with (and would certainly back with my +1) methods being deprecated
in favour of more appropriate mechanisms for data persistence.

I've been talking this issue through with Renato offline and glad to
observe that the HBase and Cassandra stuff seems to be coming along nicely.

Is anyone in a position to address this with Accumulo?
What about DynamoDB?
Do DataFile/AvroStore(s) support this in their current form?
Thanks
Lewis


[jira] [Created] (GORA-204) Don't store empty arrays in CassandraClient#addGenericArray() & addStatefulHashMap()

2013-02-09 Thread Lewis John McGibbney (JIRA)
Lewis John McGibbney created GORA-204:
-

 Summary: Don't store empty arrays in 
CassandraClient#addGenericArray() & addStatefulHashMap()
 Key: GORA-204
 URL: https://issues.apache.org/jira/browse/GORA-204
 Project: Apache Gora
  Issue Type: Improvement
  Components: avro, storage-cassandra
Affects Versions: 0.2.1
Reporter: Lewis John McGibbney
Priority: Minor
 Fix For: 0.4


We have two TODO's in this issue.
Namely 

{code}
// TODO: hack, do not store empty arrays
if (itemValue instanceof GenericArray<?>) {
  if (((GenericArray)itemValue).size() == 0) {
    continue;
  }
} else if (itemValue instanceof StatefulHashMap<?,?>) {
  if (((StatefulHashMap)itemValue).size() == 0) {
    continue;
  }
}
{code}

and 

{code}
// TODO: hack, do not store empty arrays
Object mapValue = map.get(mapKey);
if (mapValue instanceof GenericArray<?>) {
  if (((GenericArray)mapValue).size() == 0) {
    continue;
  }
} else if (mapValue instanceof StatefulHashMap<?,?>) {
  if (((StatefulHashMap)mapValue).size() == 0) {
    continue;
  }
}
{code}

in addGenericArray and addStatefulHashMap respectively.
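
Both checks apply the same test to different variables, so one way to retire the TODO might be a shared helper. The following is a minimal sketch, assuming plain java.util.Collection and java.util.Map as stand-ins for GenericArray and StatefulHashMap. The class and method names are hypothetical and are not part of gora-cassandra.

```java
import java.util.Collection;
import java.util.Map;

// Hypothetical helper, not part of gora-cassandra.
final class EmptyValueCheck {

  private EmptyValueCheck() {
  }

  // Returns true when the value is a collection or map with no entries.
  // Collection and Map stand in for GenericArray and StatefulHashMap here.
  static boolean isEmptyContainer(Object value) {
    if (value instanceof Collection<?>) {
      return ((Collection<?>) value).isEmpty();
    }
    if (value instanceof Map<?, ?>) {
      return ((Map<?, ?>) value).isEmpty();
    }
    // Anything else, including null, is not treated as an empty container.
    return false;
  }
}
```

Each call site would then reduce to a single guard such as `if (EmptyValueCheck.isEmptyContainer(itemValue)) { continue; }`.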



[jira] [Resolved] (GORA-169) Implement correct logging for KeySpaces and attributes in CassandraMappingManager

2013-02-09 Thread Lewis John McGibbney (JIRA)

 [ 
https://issues.apache.org/jira/browse/GORA-169?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lewis John McGibbney resolved GORA-169.
---

   Resolution: Fixed
Fix Version/s: (was: 0.4)
   0.3
 Assignee: Lewis John McGibbney

Committed @revision 135 in trunk

 Implement correct logging for KeySpaces and attributes in 
 CassandraMappingManager
 -

 Key: GORA-169
 URL: https://issues.apache.org/jira/browse/GORA-169
 Project: Apache Gora
  Issue Type: Improvement
  Components: storage-cassandra
Affects Versions: 0.2.1
Reporter: Lewis John McGibbney
Assignee: Lewis John McGibbney
 Fix For: 0.3

 Attachments: GORA-169.patch


 Currently the logging in CassandraMappingManager#loadConfiguration() fails to 
 pick up a wealth of information from the keyspace definitions. An example is 
 below:
 {code}
 2012-09-20 23:47:05,469 INFO  store.CassandraMappingManager - Located 
 Cassandra Keyspace: 'keyspace'
 2012-09-20 23:47:05,476 INFO  store.CassandraMappingManager - Located 
 Cassandra Keyspace name: 'name'
 2012-09-20 23:47:05,476 INFO  store.CassandraMappingManager - Located 
 Cassandra Mapping: 'class'
 2012-09-20 23:47:05,476 INFO  store.CassandraMappingManager - Located 
 Cassandra Mapping class name: 'name'
 {code}
 Because the logging uses the jdom methods incorrectly, keyspace names and 
 other logged values are wrong and far less informative than they should be. 
 The output should instead look like the following: 
 {code}
 2012-09-20 23:47:05,476 INFO  store.CassandraMappingManager - Located 
 Cassandra Keyspace name: '$nameOfKeySpace'
 2012-09-20 23:47:05,476 INFO  store.CassandraMappingManager - Located 
 Cassandra Mapping for class: '$nameOfMappingClass'
 ...
 etc
 {code}
 Right now this is very misleading and needs to be sorted out with much more 
 verbose logging for keyspace & attribute recognition. 
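
The symptom in the first log excerpt (printing 'keyspace' and 'name' instead of the configured values) is what one would expect from logging an element's or attribute's own name rather than the attribute's value. The following sketch illustrates that difference. It deliberately uses the JDK's org.w3c.dom API rather than jdom so that it is self-contained, and the class and method names are hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.xml.sax.SAXException;

// Illustration only: uses the JDK DOM API, not the jdom calls in Gora.
final class KeyspaceLogDemo {

  private KeyspaceLogDemo() {
  }

  // Parses an XML snippet and returns its root element.
  static Element parseRoot(String xml) {
    try {
      Document doc = DocumentBuilderFactory.newInstance()
          .newDocumentBuilder()
          .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
      return doc.getDocumentElement();
    } catch (ParserConfigurationException | SAXException | IOException e) {
      throw new IllegalArgumentException("could not parse mapping XML", e);
    }
  }

  // The element's own name: always "keyspace" for a keyspace element.
  static String elementName(Element keyspace) {
    return keyspace.getNodeName();
  }

  // The value of the name attribute: what the log line should show.
  static String keyspaceName(Element keyspace) {
    return keyspace.getAttribute("name");
  }
}
```

Logging the result of `keyspaceName` rather than `elementName` would produce the keyspace's configured name in place of the literal word 'keyspace'.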



[jira] [Created] (GORA-205) Dedup CassandraMapping and CassandraMappingManager

2013-02-09 Thread Lewis John McGibbney (JIRA)
Lewis John McGibbney created GORA-205:
-

 Summary: Dedup CassandraMapping and CassandraMappingManager
 Key: GORA-205
 URL: https://issues.apache.org/jira/browse/GORA-205
 Project: Apache Gora
  Issue Type: Improvement
  Components: storage-cassandra
Affects Versions: 0.2.1
Reporter: Lewis John McGibbney
Priority: Minor
 Fix For: 0.4


We have a pile of what looks like duplication between these two classes.
We should make a determination of what is required and then document it within 
the appropriate class.
This will enable easy navigation of keyspace definition etc. from within 
gora-cassandra. 



[jira] [Commented] (GORA-203) Bug in setting column field attribute qualifier in CassandraMapping

2013-02-09 Thread Lewis John McGibbney (JIRA)

[ 
https://issues.apache.org/jira/browse/GORA-203?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13575268#comment-13575268
 ] 

Lewis John McGibbney commented on GORA-203:
---

Hi Kaz, I just committed GORA-169. From what is left, please commit your fix 
when you have time. Thank you so much.

 Bug in setting column field attribute qualifier in CassandraMapping 
 --

 Key: GORA-203
 URL: https://issues.apache.org/jira/browse/GORA-203
 Project: Apache Gora
  Issue Type: Bug
  Components: storage-cassandra
Affects Versions: 0.2.1
Reporter: Lewis John McGibbney
 Fix For: 0.3

 Attachments: GORA-203.patch


 Currently, we are absolutely required to set a value for a column field 
 attribute qualifier, however there are no checks to determine whether this 
 is actually present or not, therefore this is a bug.
 Renato pointed this out and hopefully he can upload some stack traces 
 relating to the issue to display the kind of issues one faces when qualifier 
 attributes and their values are not present when mapping columns to Cassandra.
 As far as we know, column field attributes are supported in the most recent 
 Cassandra data model (and this is not due to change) therefore we should also 
 support them in Gora; however, I would welcome opinions (please comment here) 
 on whether they should be optional or not.



[jira] [Updated] (GORA-204) Don't store empty arrays in CassandraClient#addGenericArray(), addStatefulHashMap() and CassandraStore#addOrUpdateField()

2013-02-09 Thread Lewis John McGibbney (JIRA)

 [ 
https://issues.apache.org/jira/browse/GORA-204?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lewis John McGibbney updated GORA-204:
--

Summary: Don't store empty arrays in CassandraClient#addGenericArray(), 
addStatefulHashMap() and CassandraStore#addOrUpdateField()  (was: Don't store 
empty arrays in CassandraClient#addGenericArray() & addStatefulHashMap())

 Don't store empty arrays in CassandraClient#addGenericArray(), 
 addStatefulHashMap() and CassandraStore#addOrUpdateField()
 -

 Key: GORA-204
 URL: https://issues.apache.org/jira/browse/GORA-204
 Project: Apache Gora
  Issue Type: Improvement
  Components: avro, storage-cassandra
Affects Versions: 0.2.1
Reporter: Lewis John McGibbney
Priority: Minor
 Fix For: 0.4


 We have two TODO's in this issue.
 Namely 
 {code}
 // TODO: hack, do not store empty arrays
 if (itemValue instanceof GenericArray<?>) {
   if (((GenericArray)itemValue).size() == 0) {
     continue;
   }
 } else if (itemValue instanceof StatefulHashMap<?,?>) {
   if (((StatefulHashMap)itemValue).size() == 0) {
     continue;
   }
 }
 {code}
 and 
 {code}
 // TODO: hack, do not store empty arrays
 Object mapValue = map.get(mapKey);
 if (mapValue instanceof GenericArray<?>) {
   if (((GenericArray)mapValue).size() == 0) {
     continue;
   }
 } else if (mapValue instanceof StatefulHashMap<?,?>) {
   if (((StatefulHashMap)mapValue).size() == 0) {
     continue;
   }
 }
 {code}
 in addGenericArray and addStatefulHashMap respectively.



[jira] [Updated] (GORA-204) Don't store empty arrays in CassandraClient#addGenericArray(), addStatefulHashMap() and CassandraStore#addOrUpdateField()

2013-02-09 Thread Lewis John McGibbney (JIRA)

 [ 
https://issues.apache.org/jira/browse/GORA-204?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lewis John McGibbney updated GORA-204:
--

Description: 
We have three TODO's in this issue.
Namely 

{code}
// TODO: hack, do not store empty arrays
if (itemValue instanceof GenericArray<?>) {
  if (((GenericArray)itemValue).size() == 0) {
    continue;
  }
} else if (itemValue instanceof StatefulHashMap<?,?>) {
  if (((StatefulHashMap)itemValue).size() == 0) {
    continue;
  }
}
{code}

{code}
// TODO: hack, do not store empty arrays
Object mapValue = map.get(mapKey);
if (mapValue instanceof GenericArray<?>) {
  if (((GenericArray)mapValue).size() == 0) {
    continue;
  }
} else if (mapValue instanceof StatefulHashMap<?,?>) {
  if (((StatefulHashMap)mapValue).size() == 0) {
    continue;
  }
}
{code}

and

{code}
  case RECORD:
    if (value != null) {
      if (value instanceof PersistentBase) {
        PersistentBase persistentBase = (PersistentBase) value;
        for (Field member: schema.getFields()) {

          // TODO: hack, do not store empty arrays
          Object memberValue = persistentBase.get(member.pos());
          if (memberValue instanceof GenericArray<?>) {
            if (((GenericArray)memberValue).size() == 0) {
              continue;
            }
          } else if (memberValue instanceof StatefulHashMap<?,?>) {
            if (((StatefulHashMap)memberValue).size() == 0) {
              continue;
            }
          }

          this.cassandraClient.addSubColumn(key, field.name(), 
              member.name(), memberValue);
        }
      } else {
        LOG.info("Record not supported: " + value.toString());
      }
    }
    break;
{code}

in addGenericArray and addStatefulHashMap in CassandraClient, and in 
CassandraStore#addOrUpdateField, respectively.

  was:
We have two TODO's in this issue.
Namely 

{code}
// TODO: hack, do not store empty arrays
if (itemValue instanceof GenericArray<?>) {
  if (((GenericArray)itemValue).size() == 0) {
    continue;
  }
} else if (itemValue instanceof StatefulHashMap<?,?>) {
  if (((StatefulHashMap)itemValue).size() == 0) {
    continue;
  }
}
{code}

and 

{code}
// TODO: hack, do not store empty arrays
Object mapValue = map.get(mapKey);
if (mapValue instanceof GenericArray<?>) {
  if (((GenericArray)mapValue).size() == 0) {
    continue;
  }
} else if (mapValue instanceof StatefulHashMap<?,?>) {
  if (((StatefulHashMap)mapValue).size() == 0) {
    continue;
  }
}
{code}

in assGenericArray and addStateulHashMap respectively.


 Don't store empty arrays in CassandraClient#addGenericArray(), 
 addStatefulHashMap() and CassandraStore#addOrUpdateField()
 -

 Key: GORA-204
 URL: https://issues.apache.org/jira/browse/GORA-204
 Project: Apache Gora
  Issue Type: Improvement
  Components: avro, storage-cassandra
Affects Versions: 0.2.1
Reporter: Lewis John McGibbney
Priority: Minor
 Fix For: 0.4


 We have three TODO's in this issue.
 Namely 
 {code}
 // TODO: hack, do not store empty arrays
 if (itemValue instanceof GenericArray<?>) {
   if (((GenericArray)itemValue).size() == 0) {
     continue;
   }
 } else if (itemValue instanceof StatefulHashMap<?,?>) {
   if (((StatefulHashMap)itemValue).size() == 0) {
     continue;
   }
 }
 {code}
 {code}
 // TODO: hack, do not store empty arrays
 Object mapValue = map.get(mapKey);
 if (mapValue instanceof GenericArray<?>) {
   if (((GenericArray)mapValue).size() == 0) {
     continue;
   }
 } else if (mapValue instanceof StatefulHashMap<?,?>) {
   if (((StatefulHashMap)mapValue).size() == 0) {
     continue;
   }
 }
 {code}
 and
 {code}
   case RECORD:
     if (value != null) {
       if (value instanceof PersistentBase) {
         PersistentBase persistentBase = (PersistentBase) value;
         for (Field member: schema.getFields()) {

           // TODO: hack, do not store empty arrays
           Object memberValue = persistentBase.get(member.pos());
           if (memberValue instanceof GenericArray<?>) {
             if (((GenericArray)memberValue).size() == 0) {
               continue

[jira] [Updated] (GORA-167) Make Cassandra keyspace consistency configurable within gora.properties

2013-02-09 Thread Lewis John McGibbney (JIRA)

 [ 
https://issues.apache.org/jira/browse/GORA-167?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lewis John McGibbney updated GORA-167:
--

Fix Version/s: (was: 0.4)
   0.3

 Make Cassandra keyspace consistency configurable within gora.properties
 ---

 Key: GORA-167
 URL: https://issues.apache.org/jira/browse/GORA-167
 Project: Apache Gora
  Issue Type: Improvement
  Components: storage-cassandra
Affects Versions: 0.2.1
Reporter: Lewis John McGibbney
Assignee: Lewis John McGibbney
Priority: Minor
 Fix For: 0.3


 Currently, in CassandraClient#checkKeyspace() the consistency level is hard 
 coded to ONE, which waits until one replica has responded. This could be 
 improved to enable users to specify other consistency levels, e.g. 
 ANY: Wait until some replica has responded.
 ONE: Wait until one replica has responded.
 TWO: Wait until two replicas have responded.
 THREE: Wait until three replicas have responded.
 LOCAL_QUORUM: Wait for quorum on the datacenter the connection was 
 established.
 EACH_QUORUM: Wait for quorum on each datacenter.
 QUORUM: Wait for a quorum of replicas (no matter which datacenter).
 ALL: Blocks for all the replicas before returning to the client.
 Configuration should be made available through gora.properties

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (GORA-167) Make Cassandra keyspace consistency configurable within gora.properties

2013-02-09 Thread Lewis John McGibbney (JIRA)

 [ 
https://issues.apache.org/jira/browse/GORA-167?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lewis John McGibbney updated GORA-167:
--

Attachment: GORA-167.patch

Patch for trunk.
Can someone please check the Properties param for the checkKeyspace() method, 
which now accepts the consistency level property from gora.properties.

Is specifying this solely in gora.properties the best way to go, or should this 
be configurable programmatically as well?  
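
One way to picture supporting both routes is a resolver in which a programmatic value wins and gora.properties is the fallback. This is a sketch under stated assumptions, not the content of GORA-167.patch: the enum simply mirrors the level names listed in the issue, and the property key and the ConsistencyConfig class are hypothetical names.

```java
import java.util.Locale;
import java.util.Properties;

// Hypothetical sketch; neither the class nor the property key is taken
// from the attached patch.
final class ConsistencyConfig {

  // Mirrors the consistency level names listed in the issue description.
  enum Level { ANY, ONE, TWO, THREE, LOCAL_QUORUM, EACH_QUORUM, QUORUM, ALL }

  static final String KEY = "gora.cassandrastore.consistency.level"; // assumed key
  static final Level DEFAULT = Level.ONE; // matches the hard-coded behaviour

  private ConsistencyConfig() {}

  // A programmatic override wins; otherwise fall back to the properties
  // file, then to the default.
  static Level resolve(Properties props, Level override) {
    if (override != null) {
      return override;
    }
    String raw = props.getProperty(KEY);
    if (raw == null || raw.trim().isEmpty()) {
      return DEFAULT;
    }
    try {
      return Level.valueOf(raw.trim().toUpperCase(Locale.ROOT));
    } catch (IllegalArgumentException e) {
      return DEFAULT; // unknown value: keep the existing behaviour
    }
  }
}
```

Falling back to ONE for a missing or unrecognised value keeps existing deployments unchanged; whether a bad value should fail loudly instead is a separate design question.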

Thanks for any feedback. 
Let's put this one to bed. 

 Make Cassandra keyspace consistency configurable within gora.properties
 ---

 Key: GORA-167
 URL: https://issues.apache.org/jira/browse/GORA-167
 Project: Apache Gora
  Issue Type: Improvement
  Components: storage-cassandra
Affects Versions: 0.2.1
Reporter: Lewis John McGibbney
Assignee: Lewis John McGibbney
Priority: Minor
 Fix For: 0.3

 Attachments: GORA-167.patch


 Currently, in CassandraClient#checkKeyspace() the consistency level is hard 
 coded to ONE, which waits until one replica has responded. This could be 
 improved to enable users to specify other consistency levels, e.g. 
 ANY: Wait until some replica has responded.
 ONE: Wait until one replica has responded.
 TWO: Wait until two replicas have responded.
 THREE: Wait until three replicas have responded.
 LOCAL_QUORUM: Wait for quorum on the datacenter the connection was 
 established.
 EACH_QUORUM: Wait for quorum on each datacenter.
 QUORUM: Wait for a quorum of replicas (no matter which datacenter).
 ALL: Blocks for all the replicas before returning to the client.
 Configuration should be made available through gora.properties



[jira] [Updated] (GORA-190) Add version switch to bin/gora script

2013-02-09 Thread Lewis John McGibbney (JIRA)

 [ 
https://issues.apache.org/jira/browse/GORA-190?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lewis John McGibbney updated GORA-190:
--

Fix Version/s: (was: 0.4)
   0.3

 Add version switch to bin/gora script 
 

 Key: GORA-190
 URL: https://issues.apache.org/jira/browse/GORA-190
 Project: Apache Gora
  Issue Type: Improvement
Reporter: Lewis John McGibbney
Assignee: Lewis John McGibbney
Priority: Minor
 Fix For: 0.3

 Attachments: GORA-190.patch


 This should act as a sure means of ensuring that Gora is properly installed 
 in the target operating system. I have never used Gora on anything other than 
 Ubuntu, so this will help us in the future to identify interoperability with 
 other OS.



[jira] [Updated] (GORA-190) Add version switch to bin/gora script

2013-02-09 Thread Lewis John McGibbney (JIRA)

 [ 
https://issues.apache.org/jira/browse/GORA-190?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lewis John McGibbney updated GORA-190:
--

Attachment: GORA-190.patch

 Add version switch to bin/gora script 
 

 Key: GORA-190
 URL: https://issues.apache.org/jira/browse/GORA-190
 Project: Apache Gora
  Issue Type: Improvement
Reporter: Lewis John McGibbney
Assignee: Lewis John McGibbney
Priority: Minor
 Fix For: 0.3

 Attachments: GORA-190.patch


 This should act as a sure means of ensuring that Gora is properly installed 
 in the target operating system. I have never used Gora on anything other than 
 Ubuntu, so this will help us in the future to identify interoperability with 
 other OS.



[jira] [Commented] (GORA-201) Upgrade HBase API Usage in Gora

2013-02-10 Thread Lewis John McGibbney (JIRA)

[ 
https://issues.apache.org/jira/browse/GORA-201?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13575476#comment-13575476
 ] 

Lewis John McGibbney commented on GORA-201:
---

The HBase dependency in Gora seems to have stagnated somewhat.
I am not keeping up with HBase but do know that we use an old API.
Ed's Avro patch is another kettle of post-0.3 fish, but it is indeed linked to this.

 Upgrade HBase API Usage in Gora
 ---

 Key: GORA-201
 URL: https://issues.apache.org/jira/browse/GORA-201
 Project: Apache Gora
  Issue Type: Bug
  Components: storage-hbase
Affects Versions: 0.3
Reporter: Lewis John McGibbney
 Fix For: 0.4


 We haven't touched the HBase versioning in a good while. When a new user 
 heads over to the HBase site, they are directed to the 'stable' release which 
 is currently sitting at 0.94.4.
 I realise that we have (legacy) support for the 0.90.X branch of HBase, but 
 from what I can see, there is no current justification for this decision and 
 it is also not within any strategic short/medium/long term objectives of Gora.
 This issue should 
 * Enable us to discuss what HBase branch we wish to support moving forward
 * Actually implement the upgrade which gathers most consensus.



[jira] [Created] (GORA-208) Implement consistent use of DataStoreFactory across Gora modules

2013-02-11 Thread Lewis John McGibbney (JIRA)
Lewis John McGibbney created GORA-208:
-

 Summary: Implement consistent use of DataStoreFactory across Gora 
modules
 Key: GORA-208
 URL: https://issues.apache.org/jira/browse/GORA-208
 Project: Apache Gora
  Issue Type: Bug
  Components: gora-core, storage-accumulo, storage-cassandra, 
storage-dynamodb, storage-hbase
Affects Versions: 0.2.1
Reporter: Lewis John McGibbney
 Fix For: 0.4


Currently usage of DataStoreFactory (for initializing datastores, mappings and 
datastore configuration properties) is inconsistent across datastore modules. 
If we are to lower the barrier to datastore contributions and implementations 
then we need to make the approach consistent.
This should also be documented thoroughly as it is a key part of the Gora 
architecture.



Re: Support for NoSQL databases

2013-02-15 Thread Lewis John Mcgibbney
Hi Apostolis,

On Fri, Feb 15, 2013 at 7:10 PM, dev-digest-h...@gora.apache.org wrote:



 Hello,

 Could you please provide me a list of all the NoSQL databases that Gora
 supports at the moment


We currently support Apache Accumulo, Avro, Cassandra and HBase. We also
have a WebServices API and support Amazon's DynamoDB.


 and what NoSQL databases are planned to be supported
 in the near future?


We have a number of tickets open for planned implementations. I've
separated them into patches available and no patches available

Patches available:
Solr 4.X - https://issues.apache.org/jira/browse/GORA-9
MongoDB - https://issues.apache.org/jira/browse/GORA-199
Ehcache - https://issues.apache.org/jira/browse/GORA-13
JDBM2 - https://issues.apache.org/jira/browse/GORA-14

No patch
File-based store - https://issues.apache.org/jira/browse/GORA-8


 Also, do you have an estimate on how long would it take
 for someone to develop a Gora module to support a new NoSQL database?


A good benchmark was last year's Google Summer of Code project. Writing a
new compiler, restructuring the core Gora API, adding a WebServices API and
writing the gora-dynamodb module were all achieved within the project. I do not
however have a definitive duration of time for this. I suppose it really
depends on what you want to do and how much time you are prepared to
allocate to the task.
Take into consideration that a lot of your 'thinking' can be done out loud
on the developer or user list. We would welcome such dialogue.


 The
 reason for asking is because I am interested in implementing such a module
 myself as a final year MSc project.


Sounds excellent. You've certainly come to the right place. If you are
serious about engaging in some work within Gora then please tell us more
and we can begin to plan ahead.

Best
Lewis


-- 
*Lewis*


Re: Updated GORA-174 HBase information - unions

2013-02-15 Thread Lewis John Mcgibbney
Hi Renato

On Fri, Feb 15, 2013 at 7:10 PM, dev-digest-h...@gora.apache.org wrote:


 This is a part I am not understanding very well. You guys are saying
 that legacy data is a problem, but why is this a problem if we haven't
 been supporting Avro Union in the past? This is a new feature, not an
 upgrade. And for what I am understanding, the second issue was on
 marking as deprecated the support for Union data types. But then
 again, if we are able to support Union data types, this would be the
 first time.
 Am I understanding things correctly here? Lewis? Alfonso? anyone else?


If we have previously defined the JSON Avro schemas without unions
(which is current practice), then new schemas supporting Avro unions will
not be compatible with the legacy data. This is the problem, right?

 Ok, I see. But what about unions with more than one type? Shouldn't we
 think about solving this once and for all?
 We also have to keep in mind that the same solution might not be
 applicable to all data stores, but we should be able to provide the
 same features across all the supported data stores.


This is very well put. It is clear that the implementations will differ
considerably. We are moving in the right direction for Cassandra and HBase
solutions, but currently lack Accumulo. Please see my other most recent
thread on GORA-174.

Thanks troops. Have a great weekend.
Lewis


[jira] [Created] (GORA-209) Specify query timeout for Hector usage in gora-cassandra

2013-02-18 Thread Lewis John McGibbney (JIRA)
Lewis John McGibbney created GORA-209:
-

 Summary: Specify query timeout for Hector usage in gora-cassandra
 Key: GORA-209
 URL: https://issues.apache.org/jira/browse/GORA-209
 Project: Apache Gora
  Issue Type: Bug
  Components: storage-cassandra
Affects Versions: 0.2.1
Reporter: Lewis John McGibbney
Priority: Minor
 Fix For: 0.4


There is an interesting discussion going on over at the Hector dev list regarding 
improving Hector to support timeouts for queries running for over X seconds.
https://groups.google.com/forum/?fromgroups=#!topic/hector-dev/9a0-u9oXjk4
Once something results from this, we should improve gora-cassandra to also 
leverage these query timeouts.
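
Until Hector offers something native, the general idea of bounding a query can be sketched with plain java.util.concurrent. This is not Hector's API and not a proposal for how gora-cassandra should do it; TimedQuery and runWithTimeout are hypothetical names, and the query is modelled as an ordinary Callable.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical sketch using only the JDK; the Callable stands in for a
// datastore query.
final class TimedQuery {

  private TimedQuery() {}

  // Runs the query on a worker thread and gives up after timeoutMillis.
  static <T> T runWithTimeout(Callable<T> query, long timeoutMillis)
      throws TimeoutException, ExecutionException, InterruptedException {
    ExecutorService executor = Executors.newSingleThreadExecutor();
    try {
      Future<T> future = executor.submit(query);
      try {
        return future.get(timeoutMillis, TimeUnit.MILLISECONDS);
      } catch (TimeoutException e) {
        future.cancel(true); // interrupt the worker thread
        throw e;
      }
    } finally {
      executor.shutdownNow();
    }
  }
}
```

Creating an executor per call keeps the sketch short; a real store would more likely share one, and a client-side timeout like this does not by itself stop work already running on the server.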



[jira] [Updated] (GORA-210) thread safety: java.util.ConcurrentModificationException

2013-02-28 Thread Lewis John McGibbney (JIRA)

 [ 
https://issues.apache.org/jira/browse/GORA-210?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lewis John McGibbney updated GORA-210:
--

Fix Version/s: 0.3

 thread safety: java.util.ConcurrentModificationException
 

 Key: GORA-210
 URL: https://issues.apache.org/jira/browse/GORA-210
 Project: Apache Gora
  Issue Type: Bug
  Components: storage-cassandra
Affects Versions: 0.2
 Environment: nutch 2.1 / cassandra 1.2.1 / gora-cassandra 0.2 / 
 gora-core 0.2.1
 running fetch with parse=true
 fetcher.threads.per.queue1
Reporter: Roland
Priority: Critical
  Labels: patch
 Fix For: 0.3

 Attachments: GORA-210.patch


 This is the result of debugging one of my issues described in NUTCH-1534.
 I think there is a wrong assumption about thread safety of LinkedHashMap; it 
 is not enough to not iterate over the buffer (which is a LinkedHashMap).
 My patch fixes this error for me:
 java.util.ConcurrentModificationException
 at java.util.LinkedHashMap$LinkedHashIterator.nextEntry(LinkedHashMap.java:394)
 at java.util.LinkedHashMap$KeyIterator.next(LinkedHashMap.java:405)
 at java.util.AbstractCollection.toArray(AbstractCollection.java:141)
 at org.apache.gora.cassandra.store.CassandraStore.flush(CassandraStore.java:200)
 at org.apache.gora.mapreduce.GoraRecordWriter.write(GoraRecordWriter.java:65)
 at org.apache.hadoop.mapred.ReduceTask$NewTrackingRecordWriter.write(ReduceTask.java:587)
 at org.apache.hadoop.mapreduce.TaskInputOutputContext.write(TaskInputOutputContext.java:80)
 at org.apache.nutch.fetcher.FetcherReducer$FetcherThread.output(FetcherReducer.java:664)
 at org.apache.nutch.fetcher.FetcherReducer$FetcherThread.run(FetcherReducer.java:534)
 It may not be perfect from a performance point of view...



[jira] [Commented] (GORA-210) thread safety: java.util.ConcurrentModificationException

2013-02-28 Thread Lewis John McGibbney (JIRA)

[ 
https://issues.apache.org/jira/browse/GORA-210?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13589816#comment-13589816
 ] 

Lewis John McGibbney commented on GORA-210:
---

How can I reproduce the Exception you guys are talking about? Can we test for 
it?... easily?
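
As a starting point for a test, the failure mode can be shown deterministically in one thread, since a structural change during a key-set walk is what a concurrent put() causes while flush() copies the keys. The sketch below is illustrative only and is not Roland's patch: BufferSketch and GuardedBuffer are hypothetical names, and the guarded variant is just one possible approach (a shared lock plus a snapshot).

```java
import java.util.ConcurrentModificationException;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch; not the code in GORA-210.patch.
final class BufferSketch {

  private BufferSketch() {}

  // Single-threaded stand-in for the race: the map is structurally modified
  // while its key set is being iterated, so the fail-fast iterator throws.
  static boolean modificationDuringIterationFails() {
    Map<String, String> buffer = new LinkedHashMap<>();
    buffer.put("a", "1");
    buffer.put("b", "2");
    try {
      for (String key : buffer.keySet()) {
        buffer.put(key + "-new", "x"); // structural modification mid-iteration
      }
      return false;
    } catch (ConcurrentModificationException expected) {
      return true;
    }
  }

  // One possible guard: every access takes the same lock, and the flush
  // side works on a snapshot taken under that lock.
  static final class GuardedBuffer<K, V> {
    private final Map<K, V> buffer = new LinkedHashMap<>();

    void put(K key, V value) {
      synchronized (buffer) {
        buffer.put(key, value);
      }
    }

    // Returns the buffered entries in insertion order and empties the buffer.
    Map<K, V> drain() {
      synchronized (buffer) {
        Map<K, V> snapshot = new LinkedHashMap<>(buffer);
        buffer.clear();
        return snapshot;
      }
    }
  }
}
```

A multi-threaded test can then have several writers call put() while another thread calls drain() and check that no entry is lost; the real race is timing dependent, so the single-threaded check is the more reliable regression guard.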

 thread safety: java.util.ConcurrentModificationException
 

 Key: GORA-210
 URL: https://issues.apache.org/jira/browse/GORA-210
 Project: Apache Gora
  Issue Type: Bug
  Components: storage-cassandra
Affects Versions: 0.2
 Environment: nutch 2.1 / cassandra 1.2.1 / gora-cassandra 0.2 / 
 gora-core 0.2.1
 running fetch with parse=true
 fetcher.threads.per.queue1
Reporter: Roland
Priority: Critical
  Labels: patch
 Fix For: 0.3

 Attachments: GORA-210.patch


 This is the result of debugging one of my issues described in NUTCH-1534.
 I think there is a wrong assumption about thread safety of LinkedHashMap; it 
 is not enough to not iterate over the buffer (which is a LinkedHashMap).
 My patch fixes this error for me:
 java.util.ConcurrentModificationException
 at java.util.LinkedHashMap$LinkedHashIterator.nextEntry(LinkedHashMap.java:394)
 at java.util.LinkedHashMap$KeyIterator.next(LinkedHashMap.java:405)
 at java.util.AbstractCollection.toArray(AbstractCollection.java:141)
 at org.apache.gora.cassandra.store.CassandraStore.flush(CassandraStore.java:200)
 at org.apache.gora.mapreduce.GoraRecordWriter.write(GoraRecordWriter.java:65)
 at org.apache.hadoop.mapred.ReduceTask$NewTrackingRecordWriter.write(ReduceTask.java:587)
 at org.apache.hadoop.mapreduce.TaskInputOutputContext.write(TaskInputOutputContext.java:80)
 at org.apache.nutch.fetcher.FetcherReducer$FetcherThread.output(FetcherReducer.java:664)
 at org.apache.nutch.fetcher.FetcherReducer$FetcherThread.run(FetcherReducer.java:534)
 It may not be perfect from a performance point of view...



[jira] [Commented] (GORA-206) Verify storage and retrieval of Avro null-single-type Union data type within Gora-Cassandra

2013-02-28 Thread Lewis John McGibbney (JIRA)

[ 
https://issues.apache.org/jira/browse/GORA-206?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13590087#comment-13590087
 ] 

Lewis John McGibbney commented on GORA-206:
---

This is a large patch. Renato, can you please describe it to us? It is 
difficult for me personally to digest.

 Verify storage and retrieval of Avro null-single-type Union data type within 
 Gora-Cassandra
 ---

 Key: GORA-206
 URL: https://issues.apache.org/jira/browse/GORA-206
 Project: Apache Gora
  Issue Type: Sub-task
  Components: storage-cassandra
Affects Versions: 0.3
Reporter: Renato Javier Marroquín Mogrovejo
Assignee: Renato Javier Marroquín Mogrovejo
  Labels: gora-cassandra, gora-core
 Fix For: 0.3

 Attachments: GORA-206.v1.patch


 The necessary features should be added to confirm that we are able to support 
 Avro Union data types.
 This refers specifically to null-single-type unions. We will open another 
 issue to address the multi-type unions.



[jira] [Commented] (GORA-206) Verify storage and retrieval of Avro null-single-type Union data type within Gora-Cassandra

2013-03-02 Thread Lewis John McGibbney (JIRA)

[ 
https://issues.apache.org/jira/browse/GORA-206?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13591556#comment-13591556
 ] 

Lewis John McGibbney commented on GORA-206:
---

OK, so first things first: what issues do we have with GORA-174?
If you mention that the patch here includes all of the stuff Alfonso 
implemented in GORA-174 then I will apply your patch to trunk and run the tests.
Let's keep the conversation either here or else on GORA-174 as it is becoming 
difficult to track now.
Is there anything on the mailing list (regarding conversation which has not 
been tied off) which you want to clarify or iron out?
Thanks for the work Renato. Great help that you guys are pushing this on.

 Verify storage and retrieval of Avro null-single-type Union data type within 
 Gora-Cassandra
 ---

 Key: GORA-206
 URL: https://issues.apache.org/jira/browse/GORA-206
 Project: Apache Gora
  Issue Type: Sub-task
  Components: storage-cassandra
Affects Versions: 0.3
Reporter: Renato Javier Marroquín Mogrovejo
Assignee: Renato Javier Marroquín Mogrovejo
  Labels: gora-cassandra, gora-core
 Fix For: 0.3

 Attachments: GORA-206.v1.patch


 The necessary features should be added to confirm that we are able to support 
 Avro Union data types.
 This refers specifically to null-single-type unions. We will open another 
 issue to address the multi-type unions.



[jira] [Commented] (GORA-206) Verify storage and retrieval of Avro null-single-type Union data type within Gora-Cassandra

2013-03-02 Thread Lewis John McGibbney (JIRA)

[ 
https://issues.apache.org/jira/browse/GORA-206?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13591600#comment-13591600
 ] 

Lewis John McGibbney commented on GORA-206:
---

I've still not tracked this one down but I am getting closer.
When we use a patched version of GoraCompiler to compile the latest 
webpage.json schema available on NUTCH-1477 (which for the record IS 
syntactically fine) we get the following generated into the WebPage.java class.
{code}
public static final Schema _SCHEMA = 
Schema.parse("{\"type\":\"record\",\"name\":\"WebPage\",\"namespace\":\"org.apache.nutch.storage\",\"fields\":[{\"name\":\"baseurl\",\"type\":[\"null\",\"string\"]}},{\"name\":\"status\",\"type\":\"int\"},{\"name\":\"fetchtime\",\"type\":\"long\"},{\"name\":\"prevfetchtime\",\"type\":\"long\"},{\"name\":\"fetchinterval\",\"type\":\"int\"},{\"name\":\"retriessincefetch\",\"type\":\"int\"},{\"name\":\"modifiedtime\",\"type\":\"long\"},{\"name\":\"protocolstatus\",\"type\":[\"null\",\"protocolstatus\"]}},{\"name\":\"content\",\"type\":[\"null\",\"bytes\"]}},{\"name\":\"contenttype\",\"type\":[\"null\",\"string\"]}},{\"name\":\"prevsignature\",\"type\":[\"null\",\"bytes\"]}},{\"name\":\"signature\",\"type\":[\"null\",\"bytes\"]}},{\"name\":\"title\",\"type\":[\"null\",\"string\"]}},{\"name\":\"text\",\"type\":[\"null\",\"string\"]}},{\"name\":\"parsestatus\",\"type\":[\"null\",\"parsestatus\"]}},{\"name\":\"score\",\"type\":\"float\"},{\"name\":\"reprurl\",\"type\":[\"null\",\"string\"]}},{\"name\":\"headers\",\"type\":\"map\"},{\"name\":\"outlinks\",\"type\":\"map\"},{\"name\":\"inlinks\",\"type\":\"map\"},{\"name\":\"markers\",\"type\":\"map\"},{\"name\":\"metadata\",\"type\":\"map\"}]}");
{code}

This does not look good when I do some simple bracket matching. I think we've 
introduced a bug in GoraCompiler which needs to be ironed out.
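
The bracket matching mentioned above can also be automated so that generated schema strings are checked in a unit test. This is a minimal sketch and not part of GoraCompiler; BracketCheck and isBalanced are hypothetical names, and the checker only verifies that braces and square brackets outside string literals pair up, not that the JSON is otherwise valid.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical helper; checks bracket pairing only, not full JSON validity.
final class BracketCheck {

  private BracketCheck() {}

  // Returns true when every '{' and '[' outside a string literal is closed
  // by the matching '}' or ']' in the right order.
  static boolean isBalanced(String json) {
    Deque<Character> open = new ArrayDeque<>();
    boolean inString = false;
    for (int i = 0; i < json.length(); i++) {
      char c = json.charAt(i);
      if (inString) {
        if (c == '\\') {
          i++; // skip the escaped character
        } else if (c == '"') {
          inString = false;
        }
        continue;
      }
      switch (c) {
        case '"':
          inString = true;
          break;
        case '{':
        case '[':
          open.push(c);
          break;
        case '}':
          if (open.isEmpty() || open.pop() != '{') return false;
          break;
        case ']':
          if (open.isEmpty() || open.pop() != '[') return false;
          break;
        default:
          break;
      }
    }
    return open.isEmpty() && !inString;
  }
}
```

Fed the unescaped text of a union field as generated above, for example `{"name":"baseurl","type":["null","string"]}}`, the check returns false because of the extra closing brace, while a plain field such as `{"name":"status","type":"int"}` passes.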

 Verify storage and retrieval of Avro null-single-type Union data type within 
 Gora-Cassandra
 ---

 Key: GORA-206
 URL: https://issues.apache.org/jira/browse/GORA-206
 Project: Apache Gora
  Issue Type: Sub-task
  Components: storage-cassandra
Affects Versions: 0.3
Reporter: Renato Javier Marroquín Mogrovejo
Assignee: Renato Javier Marroquín Mogrovejo
  Labels: gora-cassandra, gora-core
 Fix For: 0.3

 Attachments: GORA-206.v1.patch, GORA-206.v2.patch


 The necessary features should be added to confirm that we are able to support 
 Avro Union data types.
 This refers specifically to null-single-type unions. We will open another 
 issue to address the multi-type unions.



[jira] [Assigned] (GORA-191) Add a constructor to GoraCompiler so it can be used outside of Gora.

2013-03-02 Thread Lewis John McGibbney (JIRA)

 [ 
https://issues.apache.org/jira/browse/GORA-191?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lewis John McGibbney reassigned GORA-191:
-

Assignee: Apostolos Giannakidis

Done. Thanks for any contributions.

 Add a constructor to GoraCompiler so it can be used outside of Gora.
 

 Key: GORA-191
 URL: https://issues.apache.org/jira/browse/GORA-191
 Project: Apache Gora
  Issue Type: Improvement
  Components: gora-core, schema
Reporter: Lewis John McGibbney
Assignee: Apostolos Giannakidis
Priority: Critical
 Fix For: 0.4


 We need to automate the compiling of various .avsc files over in Nutch. We 
 should add a constructor to GoraCompiler so it can be used more widely. 



[jira] [Commented] (GORA-206) Verify storage and retrieval of Avro null-single-type Union data type within Gora-Cassandra

2013-03-03 Thread Lewis John McGibbney (JIRA)

[ 
https://issues.apache.org/jira/browse/GORA-206?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13591901#comment-13591901
 ] 

Lewis John McGibbney commented on GORA-206:
---

Looking at this this morning, I see that we've introduced a number of bugs such 
as

* fields for embedded records are now not handled correctly in the new GoraCompiler 
* fields containing array and map values are now not handled correctly in 
GoraCompiler

I'll work on these two issues and hopefully attach a working v3 patch.

 Verify storage and retrieval of Avro null-single-type Union data type within 
 Gora-Cassandra
 ---

 Key: GORA-206
 URL: https://issues.apache.org/jira/browse/GORA-206
 Project: Apache Gora
  Issue Type: Sub-task
  Components: storage-cassandra
Affects Versions: 0.3
Reporter: Renato Javier Marroquín Mogrovejo
Assignee: Renato Javier Marroquín Mogrovejo
  Labels: gora-cassandra, gora-core
 Fix For: 0.3

 Attachments: GORA-206.v1.patch, GORA-206.v2.patch


 The necessary features should be added to confirm that we are able to support 
 Avro Union data types.
 This refers specifically to null-single-type unions. We will open another 
 issue to address the multi-type unions.



[jira] [Created] (GORA-213) Move out StringUtil-capable methods from GoraCompiler to StringUtils

2013-03-03 Thread Lewis John McGibbney (JIRA)
Lewis John McGibbney created GORA-213:
-

 Summary: Move out StringUtil-capable methods from GoraCompiler to 
StringUtils
 Key: GORA-213
 URL: https://issues.apache.org/jira/browse/GORA-213
 Project: Apache Gora
  Issue Type: Bug
  Components: documentation, gora-core
Affects Versions: 0.2.1
Reporter: Lewis John McGibbney
Priority: Trivial
 Fix For: 0.4


This is a rather trivial affair, but concerns an attempt to clean up 
GoraCompiler.
I know I for one have struggled in the past to get to grips with GoraCompiler, 
and honestly, additional class-specific, string-utility-like methods really do 
not help in the slightest. 



[jira] [Comment Edited] (GORA-206) Verify storage and retrieval of Avro null-single-type Union data type within Gora-Cassandra

2013-03-03 Thread Lewis John McGibbney (JIRA)

[ 
https://issues.apache.org/jira/browse/GORA-206?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13591901#comment-13591901
 ] 

Lewis John McGibbney edited comment on GORA-206 at 3/4/13 12:37 AM:


Looking at this this morning, I see that we've introduced a number of bugs such 
as

* fields for embedded records are now not handled correctly in the new GoraCompiler, 
e.g.
{code}
{"name": "protocolStatus", "type": ["null", {
  "name": "ProtocolStatus",
  "type": "record",
  "namespace": "org.apache.nutch.storage",
  "fields": [
    {"name": "code", "type": "int"},
    {"name": "args", "type": {"type": "array", "items": "string"}},
    {"name": "lastModified", "type": "long"}
  ]
}]}
{code}
is simply compiled down into
{code}
{\"name\":\"protocolStatus\",\"type\":[\"null\",\"protocolstatus\"]},
{code} 

* additionally fields containing array and map values are now not handled 
correctly in GoraCompiler e.g.
{code}
{"name": "headers", "type": {"type": "map", "values": "string"}}
{code}
is incorrectly compiled down into
{code}
{\"name\":\"metadata\",\"type\":\"map\"}
{code}

We need to sort these two cases at a minimum. These are blockers.

I'll work on these two issues and hopefully attach a working v3 patch.

  was (Author: lewismc):
Looking at this, this morning I see that we've introduced a number of bugs 
such as

* fields for embedded records are now not handled correctly in new GoraCompiler 
* fields containing array and map values are now not handled correctly in 
GoraCompiler

I'll work on these two issues and hopefully attach a working v3 patch.
  
 Verify storage and retrieval of Avro null-single-type Union data type within 
 Gora-Cassandra
 ---

 Key: GORA-206
 URL: https://issues.apache.org/jira/browse/GORA-206
 Project: Apache Gora
  Issue Type: Sub-task
  Components: storage-cassandra
Affects Versions: 0.3
Reporter: Renato Javier Marroquín Mogrovejo
Assignee: Renato Javier Marroquín Mogrovejo
  Labels: gora-cassandra, gora-core
 Fix For: 0.3

 Attachments: GORA-206.v1.patch, GORA-206.v2.patch


 The necessary features should be added to confirm that we are able to support 
 Avro Union data types.
 This refers specifically to null-single-type unions. We will open another 
 issue to address the multi-type unions.



[jira] [Updated] (GORA-211) thread safety: java.lang.NullPointerException

2013-03-03 Thread Lewis John McGibbney (JIRA)

 [ 
https://issues.apache.org/jira/browse/GORA-211?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lewis John McGibbney updated GORA-211:
--

Assignee: Roland

 thread safety: java.lang.NullPointerException
 -

 Key: GORA-211
 URL: https://issues.apache.org/jira/browse/GORA-211
 Project: Apache Gora
  Issue Type: Bug
  Components: storage-cassandra
Affects Versions: 0.2
 Environment: nutch 2.1 / cassandra 1.2.1 / gora-cassandra 0.2 / 
 gora-core 0.2.1 
 running fetch with parse=true 
 fetcher.threads.per.queue=2
 nutch on a 16 core AMD  Opteron 2GHz
 Cassandra on 8 core Intel Xeon 3.3 GHz
Reporter: Roland
Assignee: Roland
Priority: Critical
 Attachments: GORA-211-0.2.patch, GORA-211-trunk.patch, 
 GORA-211-trunk-v2.patch


 This is the result of debugging one of my issues described in NUTCH-1534. 
 example trace:
 java.lang.NullPointerException
 at me.prettyprint.cassandra.model.MutatorImpl.execute(MutatorImpl.java:243)
 at me.prettyprint.cassandra.model.MutatorImpl.insert(MutatorImpl.java:71)
 at org.apache.gora.cassandra.store.CassandraClient.addColumn(CassandraClient.java:139)
 at org.apache.gora.cassandra.store.CassandraStore.addOrUpdateField(CassandraStore.java:307)
 at org.apache.gora.cassandra.store.CassandraStore.flush(CassandraStore.java:212)
 at org.apache.gora.mapreduce.GoraRecordWriter.write(GoraRecordWriter.java:65)
 at org.apache.hadoop.mapred.ReduceTask$NewTrackingRecordWriter.write(ReduceTask.java:587)
 at org.apache.hadoop.mapreduce.TaskInputOutputContext.write(TaskInputOutputContext.java:80)
 at org.apache.nutch.fetcher.FetcherReducer$FetcherThread.output(FetcherReducer.java:664)
 at org.apache.nutch.fetcher.FetcherReducer$FetcherThread.run(FetcherReducer.java:534)
 I suspect CassandraStore.put() is not taking enough precautions to copy all 
 objects safely to its buffer.
 {code}
 switch(type) {
   case RECORD:
     Persistent persistent = (Persistent) fieldValue;
     Persistent newRecord = persistent.newInstance(new StateManagerImpl());
     for (Field member: fieldSchema.getFields()) {
       newRecord.put(member.pos(), persistent.get(member.pos()));
     }
     fieldValue = newRecord;
     break;
   case MAP:
     StatefulHashMap<?, ?> map = (StatefulHashMap<?, ?>) fieldValue;
     StatefulHashMap<?, ?> newMap = new StatefulHashMap(map);
     fieldValue = newMap;
     break;
 }
 {code}
 case RECORD - do we not also need to duplicate the object returned by 
 persistent.get(member.pos())?
   newRecord.put(member.pos(), persistent.get(member.pos()))
 case MAP - do we not need to duplicate all value objects of the map?
 I have not had time to write a patch or test this, so please comment :)
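
 The difference between copying only the container and copying its values 
 as well can be sketched with plain java.util collections. This is an 
 illustration under assumptions, not Gora code: the class CopyDepthSketch 
 and its methods are hypothetical, and ordinary HashMap and ArrayList stand 
 in for StatefulHashMap and the Persistent field values.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only; not part of Gora.
public class CopyDepthSketch {

    // Shallow copy: a new map, but every value is the same object as in the source.
    static Map<String, List<String>> shallowCopy(Map<String, List<String>> source) {
        return new HashMap<>(source);
    }

    // Deep copy (one level): a new map holding a fresh copy of every value.
    static Map<String, List<String>> deepCopy(Map<String, List<String>> source) {
        Map<String, List<String>> copy = new HashMap<>();
        for (Map.Entry<String, List<String>> entry : source.entrySet()) {
            copy.put(entry.getKey(), new ArrayList<>(entry.getValue()));
        }
        return copy;
    }

    public static void main(String[] args) {
        Map<String, List<String>> original = new HashMap<>();
        original.put("outlinks", new ArrayList<>(Arrays.asList("a")));

        Map<String, List<String>> shallow = shallowCopy(original);
        Map<String, List<String>> deep = deepCopy(original);

        // The caller keeps mutating its value object after the copy was taken.
        original.get("outlinks").add("b");

        // The shallow copy sees the later mutation, the deep copy does not.
        System.out.println("shallow: " + shallow.get("outlinks"));
        System.out.println("deep:    " + deep.get("outlinks"));
    }
}
```

 If the buffer only holds a shallow copy, another thread that keeps 
 modifying the original values can change buffered data before it is 
 flushed, which is the kind of hazard the questions above are asking about.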



[jira] [Updated] (GORA-210) thread safety: java.util.ConcurrentModificationException

2013-03-03 Thread Lewis John McGibbney (JIRA)

 [ 
https://issues.apache.org/jira/browse/GORA-210?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lewis John McGibbney updated GORA-210:
--

Assignee: Roland

 thread safety: java.util.ConcurrentModificationException
 

 Key: GORA-210
 URL: https://issues.apache.org/jira/browse/GORA-210
 Project: Apache Gora
  Issue Type: Bug
  Components: storage-cassandra
Affects Versions: 0.2
 Environment: nutch 2.1 / cassandra 1.2.1 / gora-cassandra 0.2 / 
 gora-core 0.2.1
 running fetch with parse=true
 fetcher.threads.per.queue=2 / about 1 Exception per 100k URLs fetched
 nutch on a 16 core AMD  Opteron 2GHz.
 Cassandra on 8 core Intel Xeon 3.3 GHz
Reporter: Roland
Assignee: Roland
Priority: Critical
  Labels: patch
 Fix For: 0.3

 Attachments: GORA-210.patch, GORA-210-trunk.patch


 This is the result of debugging one of my issues described in NUTCH-1534.
 I think there is a wrong assumption about the thread safety of 
 LinkedHashMap: it is not enough to merely avoid iterating over the buffer 
 (which is a LinkedHashMap).
 My patch fixes this error for me:
 java.util.ConcurrentModificationException
     at java.util.LinkedHashMap$LinkedHashIterator.nextEntry(LinkedHashMap.java:394)
     at java.util.LinkedHashMap$KeyIterator.next(LinkedHashMap.java:405)
     at java.util.AbstractCollection.toArray(AbstractCollection.java:141)
     at org.apache.gora.cassandra.store.CassandraStore.flush(CassandraStore.java:200)
     at org.apache.gora.mapreduce.GoraRecordWriter.write(GoraRecordWriter.java:65)
     at org.apache.hadoop.mapred.ReduceTask$NewTrackingRecordWriter.write(ReduceTask.java:587)
     at org.apache.hadoop.mapreduce.TaskInputOutputContext.write(TaskInputOutputContext.java:80)
     at org.apache.nutch.fetcher.FetcherReducer$FetcherThread.output(FetcherReducer.java:664)
     at org.apache.nutch.fetcher.FetcherReducer$FetcherThread.run(FetcherReducer.java:534)
 It may not be perfect from a performance point of view...
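
 The trace shows that keySet().toArray() iterates internally, so a 
 concurrent put() can still trigger the exception even when the calling 
 code contains no explicit loop. One common way to guard against this is 
 to take every access to the map under the same lock. The sketch below 
 assumes that approach; it is not the attached patch, and the class 
 SynchronizedBufferSketch and its methods are hypothetical names.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only; not the GORA-210 patch.
public class SynchronizedBufferSketch {

    // LinkedHashMap itself is not thread-safe, so every access goes through one lock.
    private final Map<String, String> buffer = new LinkedHashMap<>();

    public void put(String key, String value) {
        synchronized (buffer) {
            buffer.put(key, value);
        }
    }

    // Copies the keys while holding the lock, so the internal iteration
    // performed by the copy cannot overlap with a put() from another thread.
    public List<String> snapshotKeys() {
        synchronized (buffer) {
            return new ArrayList<>(buffer.keySet());
        }
    }

    public static void main(String[] args) throws InterruptedException {
        final SynchronizedBufferSketch sketch = new SynchronizedBufferSketch();

        // One thread keeps writing while the main thread keeps taking snapshots.
        Thread writer = new Thread(() -> {
            for (int i = 0; i < 10000; i++) {
                sketch.put("key" + i, "value" + i);
            }
        });
        writer.start();
        while (writer.isAlive()) {
            sketch.snapshotKeys();
        }
        writer.join();

        System.out.println("keys buffered: " + sketch.snapshotKeys().size());
    }
}
```

 Holding a single lock for every read and write serializes all access to 
 the buffer, which matches the performance caveat above.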


