[GitHub] [gora] mgov88 opened a new pull request #235: [GORA-663] Add datastore for Neo4j

2021-03-01 Thread GitBox


mgov88 opened a new pull request #235:
URL: https://github.com/apache/gora/pull/235


   This pull request contains the implementation of the Neo4j Datastore for 
Apache Gora.
   
   Please let me know if you have feedback.
   
   Intern: Gaby Ortiz
   Project: Add datastore for Neo4j
   Outreachy: 2020 Winter
   
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




Add datastore for Elasticsearch. Outreachy Week 13 Report

2021-03-01 Thread Maria Podorvanova
Hi,

Report #13
Week 13: February 28 - March 2
Activities:
- Submitted final feedback
- Posted last blog post
- Created a separate ticket[1] for Elasticsearch documentation for Apache
Gora website and attached a patch with my documentation
- Made a PR[2] with my code

Question:
The CI build failed with errors I do not recognize, and I am not sure what
causes them. Should I do something about it?

[1] https://issues.apache.org/jira/browse/GORA-670
[2] https://github.com/apache/gora/pull/234

Regards,
Maria


[GitHub] [gora] podorvanova opened a new pull request #234: GORA-664 Add datastore for Elasticsearch

2021-03-01 Thread GitBox


podorvanova opened a new pull request #234:
URL: https://github.com/apache/gora/pull/234


   [Outreachy Winter 2020-2021]
   This PR implements an Apache Elasticsearch datastore for Apache Gora.
   
   Your feedback would be much appreciated.







[jira] [Created] (GORA-670) Add documentation for the Elasticsearch DataStore

2021-03-01 Thread Mariia Podorvanova (Jira)
Mariia Podorvanova created GORA-670:
---

 Summary: Add documentation for the Elasticsearch DataStore
 Key: GORA-670
 URL: https://issues.apache.org/jira/browse/GORA-670
 Project: Apache Gora
  Issue Type: New Feature
  Components: documentation
Affects Versions: 1.0
Reporter: Mariia Podorvanova
 Attachments: elasticsearch-backend.patch

Documentation for the Elasticsearch backend, as a patch for the gora website.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


Re: Add datastore for Elasticsearch. Outreachy Week 12 Report

2021-03-01 Thread Maria Podorvanova
Hi John,

Thank you for your response.

1) I tried executing a refresh call in the flush method and it is
working now. Thank you very much!

3) I see. I will leave it out for now then.

I will send a PR by the end of today.

Regards,
Maria

On Tue, 2 Mar 2021 at 09:33, John Mora wrote:

> Hi Maria.
>
>
> Thanks for your update.
>
> 1) I ran some experiments and I think you have to execute a refresh call
> in the flush() method.
> "An elasticsearch refresh makes your documents available for search"
>
>
> https://www.elastic.co/guide/en/elasticsearch/reference/current/indices-refresh.html
>
> Also, if you have problems with the order of the results check out the
> preference parameter
>
>
> https://www.elastic.co/guide/en/elasticsearch/reference/current/search-search.html#search-preference
>
> 3) Since the end of the internship is close and the Gora Explorer is an
> independent project (I am not sure if Alfonso has free time), I think we
> can skip that task, but it would be a nice post-Outreachy contribution if
> you want.
>
> Please send a PR with your code for review.
>
> Thanks,
> John
>
> On Mon, 1 Mar 2021 at 7:23, Maria Podorvanova (<
> podorvanova.ma...@gmail.com>) wrote:
>
>> Hi Madhawa,
>>
>> Thank you for your response. I will do that.
>>
>> Regards,
>> Maria
>>
>> On Mon, 1 Mar 2021 at 22:51, Madhawa Gunasekara wrote:
>>
>>> Hi Maria,
>>>
>>> 2) Documentation looks fine to me. Please also refer to these documentation
>>> Jira tickets. Let's stick to the same format.
>>> [1] https://issues.apache.org/jira/browse/GORA-625
>>> [2] https://issues.apache.org/jira/browse/GORA-338
>>>
>>> Please create a separate ticket for this documentation.
>>>
>>> Thanks,
>>> Madhawa
>>>
>>>
>>> On Sat, Feb 27, 2021 at 9:10 AM Maria Podorvanova <
>>> podorvanova.ma...@gmail.com> wrote:
>>>
>>>> Hi,
>>>>
>>>> Report #12
>>>> Week 12: February 21 - February 27
>>>> Activities:
>>>> - Fixed the execute method by adding a special "gora_id" field [1]
>>>> - Implemented deleting specific fields of the records in the deleteByQuery
>>>> method [2]
>>>> - Implemented MapReduce test [3]
>>>> - Added Thread.sleep in order to synchronize Elasticsearch replicas [4]
>>>> - All tests in TestElasticsearchStore are passing now
>>>> - I also had an informal chat with 2 people this week
>>>>
>>>> Questions:
>>>>
>>>> 1. The last commit [4] gives Elasticsearch some time to synchronize
>>>>    all its replicas. Without Thread.sleep, 10 tests (testQuery,
>>>>    testQueryStartKey, testDeleteByQuery etc.) fail and return a different
>>>>    number of hits every time I run them. I did not find a better solution,
>>>>    but committed it anyway. Do you have any suggestions?
>>>> 2. I did not get feedback about the Elasticsearch documentation for the
>>>>    Apache Gora website I sent last week. Do I need to fix something in it?
>>>> 3. One of the last goals of my internship is to add the new
>>>>    datastore to the GoraExplorer project. Could you tell me if there is any
>>>>    guide on how to do it?
>>>>
>>>>
>>>> [1]
>>>> https://github.com/apache/gora/commit/f100b317a6dd3c98875f92de776e9b1e476e5425
>>>> [2]
>>>> https://github.com/apache/gora/commit/91fb2f83f7b4b682898b1cffe73eb8bebeb8ed83
>>>> [3]
>>>> https://github.com/apache/gora/commit/28b2dee779fa428f51f54585dbfb88638f9bc1de
>>>> [4]
>>>> https://github.com/apache/gora/commit/d7955f74821fad063da3dd9f1988f59aadbf7cca
>>>>
>>>> Regards,
>>>> Maria
>>>>
>>>


Re: Outreachy 2020-2021 - Neo4j - Weekly reports.

2021-03-01 Thread John Mora
Hi Gaby

Thanks for your update.

Please send a PR with your code for review.

Open a new ticket in Jira for the documentation; you have to send a patch
for the website.

Example: https://issues.apache.org/jira/browse/GORA-625

Best,
John

On Mon, 1 Mar 2021 at 2:11, gabriela ortiz wrote:

> Hi all.
>
> I wanted to report the tasks I worked on this week: Feb 20 - Feb 26.
>
> * Develop deleteByQuery method.
> * Develop the Metadata Analyzer classes for Neo4j
> * Enable Map Reduce tests
> * Enable all tests of Neo4jStoreTest.
> * Upload the Neo4j documentation -
> https://issues.apache.org/jira/browse/GORA-663?focusedCommentId=17292671&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-17292671
> * Worked on my final blog.
>
> My code is here: https://github.com/mgov88/gora/tree/GORA-663
>
> Regards,
> Gaby
>
> On Wed, 24 Feb 2021 at 11:36, John Mora (jhnmora...@gmail.com)
> wrote:
>
>> Hi Gaby
>>
>> Thanks for your report
>>
>> The delete method should return true when a record is actually deleted.
>> If the key does not exist, it should return false.
>>
>> Try to use the executeUpdate method of PreparedStatement. It returns the
>> number of affected rows; if this number is greater than zero you can return
>> true, and false otherwise.
>>
>> Best,
>> John
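
The advice above about the return value of delete can be sketched as follows. This is only an illustration, not code from Neo4jStore: the class name `DeleteResult`, the helper `stub`, and the assumption that the record key is the statement's first parameter are all hypothetical. The stub stands in for a real JDBC statement so the sketch runs without a database.

```java
import java.lang.reflect.Proxy;
import java.sql.PreparedStatement;
import java.sql.SQLException;

class DeleteResult {
    /**
     * Binds the key, runs the statement, and reports whether anything was deleted.
     * Assumption: the first parameter of the statement is the record key.
     */
    static boolean delete(PreparedStatement statement, String key) throws SQLException {
        statement.setString(1, key);
        int affected = statement.executeUpdate(); // number of affected rows
        return affected > 0;
    }

    /** Hypothetical stub: a PreparedStatement whose executeUpdate() returns a fixed count. */
    static PreparedStatement stub(int affectedRows) {
        return (PreparedStatement) Proxy.newProxyInstance(
                DeleteResult.class.getClassLoader(),
                new Class<?>[] {PreparedStatement.class},
                (proxy, method, methodArgs) ->
                        method.getName().equals("executeUpdate") ? (Object) affectedRows : null);
    }

    public static void main(String[] args) throws SQLException {
        System.out.println(delete(stub(1), "existing-key")); // true
        System.out.println(delete(stub(0), "missing-key"));  // false
    }
}
```

With this shape, a test that deletes a missing key can assert false instead of relying on delete always returning true.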
>>
>> On Mon, 22 Feb 2021 at 1:09, gabriela ortiz wrote:
>>
>>> Hi all.
>>>
>>> I wanted to report the tasks I worked on this week: Feb 13 - Feb 19.
>>>
>>> * Develop method newQuery.
>>> * Develop classes Neo4jResult and Neo4jQuery.
>>> * Develop method execute(query).
>>> * Activate tests:
>>> testTruncateSchema
>>> testDeleteSchema
>>> testQuery
>>> testQueryStartKey
>>> testQueryEndKey
>>> testQueryKeyRange
>>> testQueryWebPageSingleKey
>>> testQueryWebPageSingleKeyDefaultFields
>>> testQueryWebPageQueryEmptyResults
>>> testDelete
>>> testGetPartitions
>>> testResultSize
>>> testResultSizeStartKey
>>> testResultSizeEndKey
>>> testResultSizeKeyRange
>>> testResultSizeWithLimit
>>> testResultSizeStartKeyWithLimit
>>> testResultSizeEndKeyWithLimit
>>> testResultSizeKeyRangeWithLimit
>>>
>>> Also, I have a question: when should the method Neo4jStore#delete(key)
>>> return true or false? I found that I had to always return true in
>>> order to pass the tests. Is that correct?
>>>
>>> My code is here: https://github.com/mgov88/gora/tree/GORA-663
>>>
>>> Regards,
>>> Gaby
>>>
>>> On Mon, 15 Feb 2021 at 19:01, John Mora (
>>> jhnmora...@gmail.com) wrote:
>>>
>>>> Hi Gaby
>>>>
>>>> Thanks for the update.
>>>>
>>>> Overall the code looks good; I do not have specific feedback for you
>>>> this week.
>>>>
>>>> According to your proposed timeline you should start working on the
>>>> Query features; let's do it. Let me know if you have questions.
>>>>
>>>>
>>>> Thanks,
>>>> John
>>>>
>>>> On Sat, 13 Feb 2021 at 0:57, gabriela ortiz wrote:

> Hi all.
>
> I wanted to report the tasks I worked on this week: Feb 06 - Feb 12.
>
> * Enhance variable names.
> * Add enum for neo4j protocols.
> * Enhance getUnionSchema method for Maps.
> * Implement partitions.
> * Activate tests:
>   testUpdate
>   testGetRecursive
>   testGetDoubleRecursive
>   testGetWebPage
>   testGetWebPageDefaultFields
>
> Also, I started working on my C.V.
>
> My code is here: https://github.com/mgov88/gora/tree/GORA-663
>
> Regards,
> Gaby
>
> On Wed, 10 Feb 2021 at 21:33, gabriela ortiz (
> arqgabyor...@gmail.com) wrote:
>
>> Hi John.
>>
>> Thanks for the feedback. I will work on your comments.
>>
>> Regards,
>> Gaby
>>
>>
>> On Wed, 10 Feb 2021 at 12:04, John Mora (
>> jhnmora...@gmail.com) wrote:
>>
>>> Hi Gaby
>>>
>>> Thanks for the update.
>>>
>>> BTW, I am sorry that I did not provide feedback on your code last
>>> week, I have been busy.
>>>
>>> Some comments:
>>>
>>> Please use more descriptive variable names:
>>>
>>>
>>> https://github.com/mgov88/gora/blob/GORA-663/gora-neo4j/src/main/java/org/apache/gora/neo4j/store/Neo4jStore.java#L368
>>>
>>> https://github.com/mgov88/gora/blob/GORA-663/gora-neo4j/src/main/java/org/apache/gora/neo4j/store/Neo4jStore.java#L165
>>>
>>> https://github.com/mgov88/gora/blob/GORA-663/gora-neo4j/src/main/java/org/apache/gora/neo4j/store/Neo4jStore.java#L171
>>>
>>> https://github.com/mgov88/gora/blob/GORA-663/gora-neo4j/src/main/java/org/apache/gora/neo4j/store/Neo4jStore.java#L193
>>>
>>> https://github.com/mgov88/gora/blob/GORA-663/gora-neo4j/src/main/java/org/apache/gora/neo4j/store/Neo4jStore.java#L194
>>>
>>> https://github.com/mgov88/gora/blob/GORA-663/gora-neo4j/src/main/java/org/apache/gora/neo4j/store/Neo4jStore.java#L200
>>>
>>> 

Re: Add datastore for Elasticsearch. Outreachy Week 12 Report

2021-03-01 Thread John Mora
Hi Maria.


Thanks for your update.

1) I ran some experiments and I think you have to execute a refresh call
in the flush() method.
"An elasticsearch refresh makes your documents available for search"

https://www.elastic.co/guide/en/elasticsearch/reference/current/indices-refresh.html
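
The effect described above can be illustrated with a toy model. This is not the Elasticsearch client API: `ToyIndex` and its methods are hypothetical names, and the model only shows why documents written just before a search may be missing from the hits until a refresh happens. In a real datastore, flush() would presumably send an index refresh request through whichever Elasticsearch client the store uses.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model of near-real-time search: writes are buffered until refresh(). */
class ToyIndex {
    private final List<String> buffered = new ArrayList<>();   // written, not yet searchable
    private final List<String> searchable = new ArrayList<>(); // visible to search()

    /** Accepts a document; it is not visible to search() yet. */
    void index(String doc) {
        buffered.add(doc);
    }

    /** Makes every buffered document visible to search(). */
    void refresh() {
        searchable.addAll(buffered);
        buffered.clear();
    }

    /** Returns a snapshot of the documents that are currently searchable. */
    List<String> search() {
        return new ArrayList<>(searchable);
    }

    /** Stand-in for a datastore flush(): refresh so later queries see all writes. */
    void flush() {
        refresh();
    }

    public static void main(String[] args) {
        ToyIndex idx = new ToyIndex();
        idx.index("doc-1");
        idx.index("doc-2");
        System.out.println("hits before flush: " + idx.search().size()); // 0
        idx.flush();
        System.out.println("hits after flush: " + idx.search().size());  // 2
    }
}
```

Compared with a fixed Thread.sleep, an explicit refresh makes visibility deterministic, which is why the hit counts stop varying between test runs.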

Also, if you have problems with the order of the results check out the
preference parameter

https://www.elastic.co/guide/en/elasticsearch/reference/current/search-search.html#search-preference

3) Since the end of the internship is close and the Gora Explorer is an
independent project (I am not sure if Alfonso has free time), I think we
can skip that task, but it would be a nice post-Outreachy contribution if
you want.

Please send a PR with your code for review.

Thanks,
John

On Mon, 1 Mar 2021 at 7:23, Maria Podorvanova (<
podorvanova.ma...@gmail.com>) wrote:

> Hi Madhawa,
>
> Thank you for your response. I will do that.
>
> Regards,
> Maria
>
> On Mon, 1 Mar 2021 at 22:51, Madhawa Gunasekara wrote:
>
>> Hi Maria,
>>
>> 2) Documentation looks fine to me. Please also refer to these documentation
>> Jira tickets. Let's stick to the same format.
>> [1] https://issues.apache.org/jira/browse/GORA-625
>> [2] https://issues.apache.org/jira/browse/GORA-338
>>
>> Please create a separate ticket for this documentation.
>>
>> Thanks,
>> Madhawa
>>
>>
>> On Sat, Feb 27, 2021 at 9:10 AM Maria Podorvanova <
>> podorvanova.ma...@gmail.com> wrote:
>>
>>> Hi,
>>>
>>> Report #12
>>> Week 12: February 21 - February 27
>>> Activities:
>>> - Fixed the execute method by adding a special "gora_id" field [1]
>>> - Implemented deleting specific fields of the records in the deleteByQuery
>>> method [2]
>>> - Implemented MapReduce test [3]
>>> - Added Thread.sleep in order to synchronize Elasticsearch replicas [4]
>>> - All tests in TestElasticsearchStore are passing now
>>> - I also had an informal chat with 2 people this week
>>>
>>> Questions:
>>>
>>> 1. The last commit [4] gives Elasticsearch some time to synchronize
>>>    all its replicas. Without Thread.sleep, 10 tests (testQuery,
>>>    testQueryStartKey, testDeleteByQuery etc.) fail and return a different
>>>    number of hits every time I run them. I did not find a better solution,
>>>    but committed it anyway. Do you have any suggestions?
>>> 2. I did not get feedback about the Elasticsearch documentation for the
>>>    Apache Gora website I sent last week. Do I need to fix something in it?
>>> 3. One of the last goals of my internship is to add the new
>>>    datastore to the GoraExplorer project. Could you tell me if there is any
>>>    guide on how to do it?
>>>
>>>
>>> [1]
>>> https://github.com/apache/gora/commit/f100b317a6dd3c98875f92de776e9b1e476e5425
>>> [2]
>>> https://github.com/apache/gora/commit/91fb2f83f7b4b682898b1cffe73eb8bebeb8ed83
>>> [3]
>>> https://github.com/apache/gora/commit/28b2dee779fa428f51f54585dbfb88638f9bc1de
>>> [4]
>>> https://github.com/apache/gora/commit/d7955f74821fad063da3dd9f1988f59aadbf7cca
>>>
>>> Regards,
>>> Maria
>>>
>>


Re: Add datastore for Elasticsearch. Outreachy Week 12 Report

2021-03-01 Thread Maria Podorvanova
Hi Madhawa,

Thank you for your response. I will do that.

Regards,
Maria

On Mon, 1 Mar 2021 at 22:51, Madhawa Gunasekara wrote:

> Hi Maria,
>
> 2) Documentation looks fine to me. Please also refer to these documentation
> Jira tickets. Let's stick to the same format.
> [1] https://issues.apache.org/jira/browse/GORA-625
> [2] https://issues.apache.org/jira/browse/GORA-338
>
> Please create a separate ticket for this documentation.
>
> Thanks,
> Madhawa
>
>
> On Sat, Feb 27, 2021 at 9:10 AM Maria Podorvanova <
> podorvanova.ma...@gmail.com> wrote:
>
>> Hi,
>>
>> Report #12
>> Week 12: February 21 - February 27
>> Activities:
>> - Fixed the execute method by adding a special "gora_id" field [1]
>> - Implemented deleting specific fields of the records in the deleteByQuery
>> method [2]
>> - Implemented MapReduce test [3]
>> - Added Thread.sleep in order to synchronize Elasticsearch replicas [4]
>> - All tests in TestElasticsearchStore are passing now
>> - I also had an informal chat with 2 people this week
>>
>> Questions:
>>
>> 1. The last commit [4] gives Elasticsearch some time to synchronize
>>    all its replicas. Without Thread.sleep, 10 tests (testQuery,
>>    testQueryStartKey, testDeleteByQuery etc.) fail and return a different
>>    number of hits every time I run them. I did not find a better solution,
>>    but committed it anyway. Do you have any suggestions?
>> 2. I did not get feedback about the Elasticsearch documentation for the
>>    Apache Gora website I sent last week. Do I need to fix something in it?
>> 3. One of the last goals of my internship is to add the new datastore
>>    to the GoraExplorer project. Could you tell me if there is any guide on
>>    how to do it?
>>
>>
>> [1]
>> https://github.com/apache/gora/commit/f100b317a6dd3c98875f92de776e9b1e476e5425
>> [2]
>> https://github.com/apache/gora/commit/91fb2f83f7b4b682898b1cffe73eb8bebeb8ed83
>> [3]
>> https://github.com/apache/gora/commit/28b2dee779fa428f51f54585dbfb88638f9bc1de
>> [4]
>> https://github.com/apache/gora/commit/d7955f74821fad063da3dd9f1988f59aadbf7cca
>>
>> Regards,
>> Maria
>>
>


Re: Add datastore for Elasticsearch. Outreachy Week 12 Report

2021-03-01 Thread Madhawa Gunasekara
Hi Maria,

2) Documentation looks fine to me. Please also refer to these documentation
Jira tickets. Let's stick to the same format.
[1] https://issues.apache.org/jira/browse/GORA-625
[2] https://issues.apache.org/jira/browse/GORA-338

Please create a separate ticket for this documentation.

Thanks,
Madhawa


On Sat, Feb 27, 2021 at 9:10 AM Maria Podorvanova <
podorvanova.ma...@gmail.com> wrote:

> Hi,
>
> Report #12
> Week 12: February 21 - February 27
> Activities:
> - Fixed the execute method by adding a special "gora_id" field [1]
> - Implemented deleting specific fields of the records in the deleteByQuery
> method [2]
> - Implemented MapReduce test [3]
> - Added Thread.sleep in order to synchronize Elasticsearch replicas [4]
> - All tests in TestElasticsearchStore are passing now
> - I also had an informal chat with 2 people this week
>
> Questions:
>
> 1. The last commit [4] gives Elasticsearch some time to synchronize
>    all its replicas. Without Thread.sleep, 10 tests (testQuery,
>    testQueryStartKey, testDeleteByQuery etc.) fail and return a different
>    number of hits every time I run them. I did not find a better solution,
>    but committed it anyway. Do you have any suggestions?
> 2. I did not get feedback about the Elasticsearch documentation for the
>    Apache Gora website I sent last week. Do I need to fix something in it?
> 3. One of the last goals of my internship is to add the new datastore
>    to the GoraExplorer project. Could you tell me if there is any guide on
>    how to do it?
>
>
> [1]
> https://github.com/apache/gora/commit/f100b317a6dd3c98875f92de776e9b1e476e5425
> [2]
> https://github.com/apache/gora/commit/91fb2f83f7b4b682898b1cffe73eb8bebeb8ed83
> [3]
> https://github.com/apache/gora/commit/28b2dee779fa428f51f54585dbfb88638f9bc1de
> [4]
> https://github.com/apache/gora/commit/d7955f74821fad063da3dd9f1988f59aadbf7cca
>
> Regards,
> Maria
>