[jira] [Comment Edited] (GORA-411) Add exists(key) to DataStore interface
[ https://issues.apache.org/jira/browse/GORA-411?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16795698#comment-16795698 ]

John Mora edited comment on GORA-411 at 3/19/19 5:24 AM:

Hi [~alfonso.nishikawa], [~lewismc].

I would like to work on this issue as a warm-up task for my GSoC 2019 application. I added the method _*public boolean exists(K key) throws GoraException*_ to the *DataStore* interface and implemented a default behavior in the *DataStoreBase* class as follows.

{code:java}
@Override
public boolean exists(K key) throws GoraException {
  return get(key, new String[0]) != null;
}
{code}

For testing, I added the following case:

{code:java}
public static void testExistsEmployee(DataStore dataStore) throws Exception {
  dataStore.createSchema();
  Employee employee = DataStoreTestUtil.createEmployee();
  String ssn = employee.getSsn().toString();
  dataStore.put(ssn, employee);
  dataStore.flush();
  assertTrue(dataStore.exists(ssn));
  dataStore.delete(ssn);
  dataStore.flush();
  assertFalse(dataStore.exists(ssn));
}
{code}

This naive approach seems to work (the tests are passing), so I think I could analyze every backend to find a more suitable custom implementation for each one. However, I would like to know whether the test case above is enough for this new method. Do you know of other edge cases that should also be checked?

Cheers,
John

> Add exists(key) to DataStore interface
> --
>
> Key: GORA-411
> URL: https://issues.apache.org/jira/browse/GORA-411
> Project: Apache Gora
> Issue Type: Improvement
> Components: gora-core, storage
> Reporter: Alfonso Nishikawa
> Priority: Minor
> Fix For: 0.9
>
> NUTCH-1679 needs to check whether some rows exist, and they are proposing to
> use {{store.get(TableUtil.reverseUrl(url))}}.
> This will have a considerable impact on performance since every column will
> be fetched.
> Some datastores implement a call that just checks whether a row exists (like HBase),
> so no data is transferred over the network.
> If a datastore can't handle an "exists" call, it can default to a get.

--
This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (GORA-411) Add exists(key) to DataStore interface
[ https://issues.apache.org/jira/browse/GORA-411?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16795698#comment-16795698 ]

John Mora commented on GORA-411:

Hi [~alfonso.nishikawa], [~lewismc].

I would like to work on this issue as a warm-up task for my GSoC 2019 application. I added the method _*public boolean exists(K key) throws GoraException*_ to the *DataStore* interface and implemented a default behavior in the *DataStoreBase* class as follows.

{code:java}
@Override
public boolean exists(K key) throws GoraException {
  return get(key, new String[0]) != null;
}
{code}

For testing, I added the following case:

{code:java}
public static void testExistsEmployee(DataStore dataStore) throws Exception {
  dataStore.createSchema();
  Employee employee = DataStoreTestUtil.createEmployee();
  String ssn = employee.getSsn().toString();
  dataStore.put(ssn, employee);
  dataStore.flush();
  assertTrue(dataStore.exists(ssn));
  dataStore.delete(ssn);
  dataStore.flush();
  assertFalse(dataStore.exists(ssn));
}
{code}

This naive approach seems to work (the tests are passing), so I think I could analyze every backend to find a more suitable custom implementation for each one. However, I would like to know whether the test case above is enough for this new method. Do you know of other edge cases that should also be checked?

Cheers,
John

> Add exists(key) to DataStore interface
> --
>
> Key: GORA-411
> URL: https://issues.apache.org/jira/browse/GORA-411
> Project: Apache Gora
> Issue Type: Improvement
> Components: gora-core, storage
> Reporter: Alfonso Nishikawa
> Priority: Minor
> Fix For: 0.9
>
> NUTCH-1679 needs to check whether some rows exist, and they are proposing to
> use {{store.get(TableUtil.reverseUrl(url))}}.
> This will have a considerable impact on performance since every column will
> be fetched.
> Some datastores implement a call that just checks whether a row exists (like HBase),
> so no data is transferred over the network.
> If a datastore can't handle an "exists" call, it can default to a get.
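The fallback discussed in GORA-411 can be sketched without any Gora dependency. The block below is only an illustration under stated assumptions: `SimpleStore`, `MapStore`, and `NativeExistsStore` are hypothetical names invented here, not Gora classes, and the map-backed storage merely stands in for a real backend. It shows a default `exists` that falls back to `get`, a store that overrides it with a cheaper native check, and a few edge cases (a key that was never written, a key re-inserted after a delete) that a test might cover in addition to the put/delete case above.

```java
import java.util.HashMap;
import java.util.Map;

public class ExistsSketch {

    /** Hypothetical, cut-down stand-in for a data store; NOT the real Gora DataStore. */
    interface SimpleStore<K, V> {
        V get(K key);
        void put(K key, V value);
        boolean delete(K key);

        /** Fallback: fetch the record and test for null, as in the comment above. */
        default boolean exists(K key) {
            return get(key) != null;
        }
    }

    /** Map-backed store that relies on the default exists(); counts get() calls. */
    static class MapStore<K, V> implements SimpleStore<K, V> {
        protected final Map<K, V> rows = new HashMap<>();
        int getCalls = 0;

        @Override public V get(K key) { getCalls++; return rows.get(key); }
        @Override public void put(K key, V value) { rows.put(key, value); }
        @Override public boolean delete(K key) { return rows.remove(key) != null; }
    }

    /** Store with a native existence check, standing in for a backend such as HBase. */
    static class NativeExistsStore<K, V> extends MapStore<K, V> {
        @Override public boolean exists(K key) {
            // No record is fetched here; only the key is looked up.
            return rows.containsKey(key);
        }
    }

    public static void main(String[] args) {
        SimpleStore<String, String> store = new NativeExistsStore<>();
        System.out.println("never written: " + store.exists("000"));
        store.put("123", "employee");
        System.out.println("after put:     " + store.exists("123"));
        store.delete("123");
        System.out.println("after delete:  " + store.exists("123"));
        store.put("123", "employee again"); // re-insert after delete
        System.out.println("after re-put:  " + store.exists("123"));
    }
}
```

The `getCalls` counter is there only to make the difference observable: the default path goes through `get`, while the overriding store answers without fetching the record.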
[GitHub] [gora] alfonsonishikawa commented on a change in pull request #135: Goraexplorer needed changes
alfonsonishikawa commented on a change in pull request #135: Goraexplorer needed changes
URL: https://github.com/apache/gora/pull/135#discussion_r266680620

## File path: gora-pig/src/test/java/org/apache/gora/pig/GoraStorageTest.java-disabled ##
@@ -0,0 +1,352 @@
+package org.apache.gora.pig;

Review comment:
> Were you able to try the same with HBase 2 upgrade?

Hi! No. I will try, though. Thanks!

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

With regards,
Apache Git Services
[GitHub] [gora] djkevincr commented on issue #135: Goraexplorer needed changes
djkevincr commented on issue #135: Goraexplorer needed changes
URL: https://github.com/apache/gora/pull/135#issuecomment-474034153

@alfonsonishikawa One concern I have: I noticed changes to the record.vm Velocity template. I hope you have regenerated all the Avro data bean classes to avoid inconsistencies caused by multiple updates to the template.
This is really great :) as a first step. We can continue this work with the improvements you suggested to me offline.
@lewismc Do you have any concerns from your review of this PR?
[GitHub] [gora] djkevincr commented on a change in pull request #135: Goraexplorer needed changes
djkevincr commented on a change in pull request #135: Goraexplorer needed changes
URL: https://github.com/apache/gora/pull/135#discussion_r266567600

## File path: gora-pig/src/test/java/org/apache/gora/pig/GoraStorageTest.java-disabled ##
@@ -0,0 +1,352 @@
+package org.apache.gora.pig;

Review comment:
Were you able to try the same with the HBase 2 upgrade?
[GitHub] [gora] djkevincr commented on issue #135: Goraexplorer needed changes
djkevincr commented on issue #135: Goraexplorer needed changes
URL: https://github.com/apache/gora/pull/135#issuecomment-474029378

Locally tested the PR; the build passes without any test failures.

[INFO]
[INFO] Reactor Summary:
[INFO]
[INFO] Apache Gora ........................................ SUCCESS [  2.381 s]
[INFO] Apache Gora :: Compiler ............................ SUCCESS [  3.077 s]
[INFO] Apache Gora :: Compiler-CLI ........................ SUCCESS [  1.385 s]
[INFO] Apache Gora :: Core ................................ SUCCESS [01:56 min]
[INFO] Apache Gora :: Pig ................................. SUCCESS [  3.235 s]
[INFO] Apache Gora :: Accumulo ............................ SUCCESS [08:07 min]
[INFO] Apache Gora :: HBase ............................... SUCCESS [03:24 min]
[INFO] Apache Gora :: Cassandra - CQL ..................... SUCCESS [01:53 min]
[INFO] Apache Gora :: GoraCI .............................. SUCCESS [  3.998 s]
[INFO] Apache Gora :: Infinispan .......................... SUCCESS [01:22 min]
[INFO] Apache Gora :: JCache .............................. SUCCESS [01:21 min]
[INFO] Apache Gora :: OrientDB ............................ SUCCESS [01:48 min]
[INFO] Apache Gora :: Dynamodb ............................ SUCCESS [  4.441 s]
[INFO] Apache Gora :: CouchDB ............................. SUCCESS [  4.872 s]
[INFO] Apache Gora :: Maven Plugin ........................ SUCCESS [  3.021 s]
[INFO] Apache Gora :: MongoDB ............................. SUCCESS [02:04 min]
[INFO] Apache Gora :: Solr ................................ SUCCESS [02:59 min]
[INFO] Apache Gora :: Aerospike ........................... SUCCESS [  2.849 s]
[INFO] Apache Gora :: Ignite .............................. SUCCESS [02:56 min]
[INFO] Apache Gora :: Tutorial ............................ SUCCESS [  6.486 s]
[INFO] Apache Gora :: Sources-Dist ........................ SUCCESS [  0.364 s]
[INFO]
[INFO] BUILD SUCCESS
[INFO]
[INFO] Total time: 28:32 min
[INFO] Finished at: 2019-03-18T23:14:41+05:30
[INFO] Final Memory: 101M/1679M
[INFO]
[jira] [Created] (GORA-555) Improve Lucene query implementation with NumericRangeQuery
Kevin Ratnasekera created GORA-555:
--

Summary: Improve Lucene query implementation with NumericRangeQuery
Key: GORA-555
URL: https://issues.apache.org/jira/browse/GORA-555
Project: Apache Gora
Issue Type: Improvement
Reporter: Kevin Ratnasekera

There performance benefits around NumericRangeQuery. Please notice comment on LuceneQuery implementation.

```
//TODO: Change this to a NumericRangeQuery when necessary (it's faster)
String lower = null;
String upper = null;
if (getStartKey() != null) {
  //Do we need to escape the term?
  lower = getStartKey().toString();
}
if (getEndKey() != null) {
  upper = getEndKey().toString();
}
if (upper == null && lower == null) {
  q = new MatchAllDocsQuery();
} else {
  q = TermRangeQuery.newStringRange(pk, lower, upper, true, true);
}
```
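Besides the speed mentioned in the TODO, a string range and a numeric range can select different keys, because a term range over strings orders keys lexicographically. The plain-Java sketch below illustrates only that ordering difference and does not use Lucene at all; `inStringRange` and `inNumericRange` are hypothetical helpers written for this note, and the assumption that keys are decimal digit strings is mine, not something stated in the issue.

```java
public class RangeOrderSketch {

    /** Inclusive range test on the string form (lexicographic order). */
    static boolean inStringRange(String key, String lower, String upper) {
        return key.compareTo(lower) >= 0 && key.compareTo(upper) <= 0;
    }

    /** Inclusive range test on the numeric value. */
    static boolean inNumericRange(long key, long lower, long upper) {
        return key >= lower && key <= upper;
    }

    public static void main(String[] args) {
        // "5" sorts between "10" and "90" as a string, but 5 is outside [10, 90].
        System.out.println(inStringRange("5", "10", "90"));
        System.out.println(inNumericRange(5, 10, 90));

        // "100" sorts before "2" as a string, but 100 is inside [2, 200].
        System.out.println(inStringRange("100", "2", "200"));
        System.out.println(inNumericRange(100, 2, 200));
    }
}
```

If the keys are not numeric, the string range in the quoted code is the natural choice and this difference does not arise.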
[jira] [Updated] (GORA-555) Improve Lucene query implementation with NumericRangeQuery
[ https://issues.apache.org/jira/browse/GORA-555?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Kevin Ratnasekera updated GORA-555:
---

Description:
There are performance benefits around NumericRangeQuery. Please notice comment on LuceneQuery implementation.

{code}
//TODO: Change this to a NumericRangeQuery when necessary (it's faster)
String lower = null;
String upper = null;
if (getStartKey() != null) {
  //Do we need to escape the term?
  lower = getStartKey().toString();
}
if (getEndKey() != null) {
  upper = getEndKey().toString();
}
if (upper == null && lower == null) {
  q = new MatchAllDocsQuery();
} else {
  q = TermRangeQuery.newStringRange(pk, lower, upper, true, true);
}
{code}

was:
There performance benefits around NumericRangeQuery. Please notice comment on LuceneQuery implementation.

{code}
//TODO: Change this to a NumericRangeQuery when necessary (it's faster)
String lower = null;
String upper = null;
if (getStartKey() != null) {
  //Do we need to escape the term?
  lower = getStartKey().toString();
}
if (getEndKey() != null) {
  upper = getEndKey().toString();
}
if (upper == null && lower == null) {
  q = new MatchAllDocsQuery();
} else {
  q = TermRangeQuery.newStringRange(pk, lower, upper, true, true);
}
{code}

> Improve Lucene query implementation with NumericRangeQuery
> ---
>
> Key: GORA-555
> URL: https://issues.apache.org/jira/browse/GORA-555
> Project: Apache Gora
> Issue Type: Improvement
> Reporter: Kevin Ratnasekera
> Priority: Major
[GitHub] [gora] djkevincr edited a comment on issue #152: GORA-266 Lucene datastore for Gora - lewismc
djkevincr edited a comment on issue #152: GORA-266 Lucene datastore for Gora - lewismc
URL: https://github.com/apache/gora/pull/152#issuecomment-473925982

@lewismc I have created issue [1] on NumericRangeQuery. Let's address this separately. I think this PR is in good shape to be merged. All major test cases pass without any issues.

[1] https://issues.apache.org/jira/browse/GORA-555
[GitHub] [gora] djkevincr commented on issue #152: GORA-266 Lucene datastore for Gora - lewismc
djkevincr commented on issue #152: GORA-266 Lucene datastore for Gora - lewismc
URL: https://github.com/apache/gora/pull/152#issuecomment-473925982

@lewismc I have created issue [1] on NumericRangeQuery. I think this PR is in good shape to be merged. All major test cases pass without any issues.

[1] https://issues.apache.org/jira/browse/GORA-555
[jira] [Updated] (GORA-555) Improve Lucene query implementation with NumericRangeQuery
[ https://issues.apache.org/jira/browse/GORA-555?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Kevin Ratnasekera updated GORA-555:
---

Description:
There performance benefits around NumericRangeQuery. Please notice comment on LuceneQuery implementation.

{code}
//TODO: Change this to a NumericRangeQuery when necessary (it's faster)
String lower = null;
String upper = null;
if (getStartKey() != null) {
  //Do we need to escape the term?
  lower = getStartKey().toString();
}
if (getEndKey() != null) {
  upper = getEndKey().toString();
}
if (upper == null && lower == null) {
  q = new MatchAllDocsQuery();
} else {
  q = TermRangeQuery.newStringRange(pk, lower, upper, true, true);
}
{code}

was:
There performance benefits around NumericRangeQuery. Please notice comment on LuceneQuery implementation.

```
//TODO: Change this to a NumericRangeQuery when necessary (it's faster)
String lower = null;
String upper = null;
if (getStartKey() != null) {
  //Do we need to escape the term?
  lower = getStartKey().toString();
}
if (getEndKey() != null) {
  upper = getEndKey().toString();
}
if (upper == null && lower == null) {
  q = new MatchAllDocsQuery();
} else {
  q = TermRangeQuery.newStringRange(pk, lower, upper, true, true);
}
```

> Improve Lucene query implementation with NumericRangeQuery
> ---
>
> Key: GORA-555
> URL: https://issues.apache.org/jira/browse/GORA-555
> Project: Apache Gora
> Issue Type: Improvement
> Reporter: Kevin Ratnasekera
> Priority: Major
[jira] [Created] (GORA-554) Upgrade Solr dependency to latest
Kevin Ratnasekera created GORA-554:
--

Summary: Upgrade Solr dependency to latest
Key: GORA-554
URL: https://issues.apache.org/jira/browse/GORA-554
Project: Apache Gora
Issue Type: Improvement
Reporter: Kevin Ratnasekera

The current stable version is 8.0.0.
[GitHub] [gora] djkevincr closed pull request #131: GORA-266 Lucene datastore for Gora
djkevincr closed pull request #131: GORA-266 Lucene datastore for Gora
URL: https://github.com/apache/gora/pull/131
[GitHub] [gora] djkevincr commented on issue #131: GORA-266 Lucene datastore for Gora
djkevincr commented on issue #131: GORA-266 Lucene datastore for Gora
URL: https://github.com/apache/gora/pull/131#issuecomment-473918146

Closing this PR after submitting a more up-to-date version of the same work in https://github.com/apache/gora/pull/152