Re: Add datastore for Elasticsearch. Outreachy Week 11 Report

2021-02-25 Thread Maria Podorvanova
Hi Kevin,

Yes, I will make a PR, once I fix some issues.

Regards,
Maria

On Thu, 25 Feb 2021 at 15:49, Kevin Ratnasekera 
wrote:

> Hi Maria,
>
> Thank you for your hard work, Maria. Can you raise a PR once you are
> comfortable with the changes?
>
> Regards
> Kevin
>
> On Thu, Feb 25, 2021 at 10:06 AM Maria Podorvanova <
> podorvanova.ma...@gmail.com> wrote:
>
>> Hi John,
>>
>> Thanks for your comment. I am working on it.
>>
>> Regards,
>> Maria
>>
>> On Wed, 24 Feb 2021 at 17:50, John Mora  wrote:
>>
>>> Hi Maria.
>>>
>>> Thanks for the update.
>>>
>>> Unfortunately, looping through all possible values in the range is not a
>>> practical solution.
>>>
>>> You should use the range query feature for this:
>>>
>>> https://www.elastic.co/guide/en/elasticsearch/reference/current/query-dsl-range-query.html
>>>
>>> I think you should manually add a special field in the elasticsearch
>>> record that you can range query (you can add it to the mapping file as a
>>> 'mock' primary key field). It will be basically a copy of the '_id' field.
>>>
>>> Here, you can find a similar workaround in the Redis DataStore where
>>> Sorted Sets were used as secondary indexes for range queries.
>>>
>>>
>>> https://github.com/apache/gora/blob/master/gora-redis/src/main/java/org/apache/gora/redis/store/RedisStore.java#L299
>>>
>>> Best,
>>> John
>>>
>>> On Sat, 20 Feb 2021 at 3:01, Maria Podorvanova (<
>>> podorvanova.ma...@gmail.com>) wrote:
>>>
>>>> Hi,
>>>>
>>>> Report #11
>>>> Week 11: February, 14 - February, 20
>>>> Activities:
>>>> - Added scaling_factor support [1]
>>>> - Removed unsupported Elasticsearch data types [2]
>>>> - Implemented Metadata Analyzer for Elasticsearch Store [3]
>>>> - Tried to fix range query by “_id” field [4]
>>>> - Wrote documentation for Apache Gora website [5]
>>>> - Polished and sent my CV for reviewing
>>>>
>>>> Question:
>>>>
>>>>    1. I tried to fix the issue, where Elasticsearch "_id" field does
>>>>    not support range queries. I've tried treating "_id" as a number, but
>>>>    one of the test "_id" field values is "http://foo.com/". So
>>>>    my approach did not work, but I decided to commit[4] my work on this
>>>>    issue in order to show you what I tried to do.
>>>>
>>>>
>>>> [1]
>>>> https://github.com/apache/gora/commit/670a04c51f4a6d169df319ed5fd3d1d0abd81870
>>>> [2]
>>>> https://github.com/apache/gora/commit/55020d722f9424021fefe8d94b6bf3ece213226d
>>>> [3]
>>>> https://github.com/apache/gora/commit/c491a6447d197b0509473294ee844834b1623a63
>>>> [4]
>>>> https://github.com/apache/gora/commit/a870ca8a2075af7cbab75b9341a94de4966fbf7a
>>>> [5]
>>>> https://docs.google.com/document/d/1AF6MG3pqe6A5Z0KtLooEKQlipuyeObbYlAa4O7rFXqM/edit?usp=sharing
>>>>
>>>> Regards,
>>>> Maria
>>>>
>>>


Re: Add datastore for Elasticsearch. Outreachy Week 11 Report

2021-02-24 Thread Kevin Ratnasekera
Hi Maria,

Thank you for your hard work, Maria. Can you raise a PR once you are
comfortable with the changes?

Regards
Kevin

On Thu, Feb 25, 2021 at 10:06 AM Maria Podorvanova <
podorvanova.ma...@gmail.com> wrote:

> Hi John,
>
> Thanks for your comment. I am working on it.
>
> Regards,
> Maria
>
> On Wed, 24 Feb 2021 at 17:50, John Mora  wrote:
>
>> Hi Maria.
>>
>> Thanks for the update.
>>
>> Unfortunately, looping through all possible values in the range is not a
>> practical solution.
>>
>> You should use the range query feature for this:
>>
>> https://www.elastic.co/guide/en/elasticsearch/reference/current/query-dsl-range-query.html
>>
>> I think you should manually add a special field in the elasticsearch
>> record that you can range query (you can add it to the mapping file as a
>> 'mock' primary key field). It will be basically a copy of the '_id' field.
>>
>> Here, you can find a similar workaround in the Redis DataStore where
>> Sorted Sets were used as secondary indexes for range queries.
>>
>>
>> https://github.com/apache/gora/blob/master/gora-redis/src/main/java/org/apache/gora/redis/store/RedisStore.java#L299
>>
>> Best,
>> John
>>
>> On Sat, 20 Feb 2021 at 3:01, Maria Podorvanova (<
>> podorvanova.ma...@gmail.com>) wrote:
>>
>>> Hi,
>>>
>>> Report #11
>>> Week 11: February, 14 - February, 20
>>> Activities:
>>> - Added scaling_factor support [1]
>>> - Removed unsupported Elasticsearch data types [2]
>>> - Implemented Metadata Analyzer for Elasticsearch Store [3]
>>> - Tried to fix range query by “_id” field [4]
>>> - Wrote documentation for Apache Gora website [5]
>>> - Polished and sent my CV for reviewing
>>>
>>> Question:
>>>
>>>    1. I tried to fix the issue, where Elasticsearch "_id" field does
>>>    not support range queries. I've tried treating "_id" as a number, but one
>>>    of the test "_id" field values is "http://foo.com/". So
>>>    my approach did not work, but I decided to commit[4] my work on this
>>>    issue in order to show you what I tried to do.
>>>
>>>
>>> [1]
>>> https://github.com/apache/gora/commit/670a04c51f4a6d169df319ed5fd3d1d0abd81870
>>> [2]
>>> https://github.com/apache/gora/commit/55020d722f9424021fefe8d94b6bf3ece213226d
>>> [3]
>>> https://github.com/apache/gora/commit/c491a6447d197b0509473294ee844834b1623a63
>>> [4]
>>> https://github.com/apache/gora/commit/a870ca8a2075af7cbab75b9341a94de4966fbf7a
>>> [5]
>>> https://docs.google.com/document/d/1AF6MG3pqe6A5Z0KtLooEKQlipuyeObbYlAa4O7rFXqM/edit?usp=sharing
>>>
>>> Regards,
>>> Maria
>>>
>>


Re: Add datastore for Elasticsearch. Outreachy Week 11 Report

2021-02-24 Thread Maria Podorvanova
Hi John,

Thanks for your comment. I am working on it.

Regards,
Maria

On Wed, 24 Feb 2021 at 17:50, John Mora  wrote:

> Hi Maria.
>
> Thanks for the update.
>
> Unfortunately, looping through all possible values in the range is not a
> practical solution.
>
> You should use the range query feature for this:
>
> https://www.elastic.co/guide/en/elasticsearch/reference/current/query-dsl-range-query.html
>
> I think you should manually add a special field in the elasticsearch
> record that you can range query (you can add it to the mapping file as a
> 'mock' primary key field). It will be basically a copy of the '_id' field.
>
> Here, you can find a similar workaround in the Redis DataStore where
> Sorted Sets were used as secondary indexes for range queries.
>
>
> https://github.com/apache/gora/blob/master/gora-redis/src/main/java/org/apache/gora/redis/store/RedisStore.java#L299
>
> Best,
> John
>
> On Sat, 20 Feb 2021 at 3:01, Maria Podorvanova (<
> podorvanova.ma...@gmail.com>) wrote:
>
>> Hi,
>>
>> Report #11
>> Week 11: February, 14 - February, 20
>> Activities:
>> - Added scaling_factor support [1]
>> - Removed unsupported Elasticsearch data types [2]
>> - Implemented Metadata Analyzer for Elasticsearch Store [3]
>> - Tried to fix range query by “_id” field [4]
>> - Wrote documentation for Apache Gora website [5]
>> - Polished and sent my CV for reviewing
>>
>> Question:
>>
>>    1. I tried to fix the issue, where Elasticsearch "_id" field does not
>>    support range queries. I've tried treating "_id" as a number, but one of
>>    the test "_id" field values is "http://foo.com/". So my approach did
>>    not work, but I decided to commit[4] my work on this issue in order to
>>    show you what I tried to do.
>>
>>
>> [1]
>> https://github.com/apache/gora/commit/670a04c51f4a6d169df319ed5fd3d1d0abd81870
>> [2]
>> https://github.com/apache/gora/commit/55020d722f9424021fefe8d94b6bf3ece213226d
>> [3]
>> https://github.com/apache/gora/commit/c491a6447d197b0509473294ee844834b1623a63
>> [4]
>> https://github.com/apache/gora/commit/a870ca8a2075af7cbab75b9341a94de4966fbf7a
>> [5]
>> https://docs.google.com/document/d/1AF6MG3pqe6A5Z0KtLooEKQlipuyeObbYlAa4O7rFXqM/edit?usp=sharing
>>
>> Regards,
>> Maria
>>
>


Re: Add datastore for Elasticsearch. Outreachy Week 11 Report

2021-02-23 Thread John Mora
Hi Maria.

Thanks for the update.

Unfortunately, looping through all possible values in the range is not a
practical solution.

You should use the range query feature for this:
https://www.elastic.co/guide/en/elasticsearch/reference/current/query-dsl-range-query.html

I think you should manually add a special field in the elasticsearch record
that you can range query (you can add it to the mapping file as a 'mock'
primary key field). It will be basically a copy of the '_id' field.

Here, you can find a similar workaround in the Redis DataStore where Sorted
Sets were used as secondary indexes for range queries.

https://github.com/apache/gora/blob/master/gora-redis/src/main/java/org/apache/gora/redis/store/RedisStore.java#L299
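
As a rough illustration of the suggestion above (a sketch, not code from the
Gora repository): assuming the copy of the key is stored in a keyword field,
here hypothetically named gora_id, the body of such a range query could be
assembled as below. The class, method, and field names are made up for this
example. Range queries on keyword fields compare terms lexicographically, so
string keys such as "http://foo.com/" should need no numeric conversion.

```java
/** Sketch: build the JSON body of a range query over a copy of the record key. */
final class MockKeyRangeQuery {

    // Hypothetical field name; the thread does not name the mock key field.
    static final String MOCK_KEY_FIELD = "gora_id";

    /**
     * Wraps a value in double quotes, escaping backslashes and quotes.
     * Enough for keys without control characters; a real implementation
     * would use a JSON library or the Elasticsearch client's query builders.
     */
    static String quote(String value) {
        return "\"" + value.replace("\\", "\\\\").replace("\"", "\\\"") + "\"";
    }

    /** Returns a search body matching documents with startKey <= key <= endKey. */
    static String rangeQueryBody(String startKey, String endKey) {
        return "{\"query\":{\"range\":{" + quote(MOCK_KEY_FIELD) + ":{"
                + "\"gte\":" + quote(startKey) + ","
                + "\"lte\":" + quote(endKey) + "}}}}";
    }

    public static void main(String[] args) {
        // Prints the request body for an inclusive key range.
        System.out.println(rangeQueryBody("http://bar.com/", "http://foo.com/"));
    }
}
```

The mock field would have to be written alongside every record (with the same
value as '_id') for this query to return anything.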

Best,
John

On Sat, 20 Feb 2021 at 3:01, Maria Podorvanova (<
podorvanova.ma...@gmail.com>) wrote:

> Hi,
>
> Report #11
> Week 11: February, 14 - February, 20
> Activities:
> - Added scaling_factor support [1]
> - Removed unsupported Elasticsearch data types [2]
> - Implemented Metadata Analyzer for Elasticsearch Store [3]
> - Tried to fix range query by “_id” field [4]
> - Wrote documentation for Apache Gora website [5]
> - Polished and sent my CV for reviewing
>
> Question:
>
>    1. I tried to fix the issue, where Elasticsearch "_id" field does not
>    support range queries. I've tried treating "_id" as a number, but one of
>    the test "_id" field values is "http://foo.com/". So my approach did
>    not work, but I decided to commit[4] my work on this issue in order to show
>    you what I tried to do.
>
>
> [1]
> https://github.com/apache/gora/commit/670a04c51f4a6d169df319ed5fd3d1d0abd81870
> [2]
> https://github.com/apache/gora/commit/55020d722f9424021fefe8d94b6bf3ece213226d
> [3]
> https://github.com/apache/gora/commit/c491a6447d197b0509473294ee844834b1623a63
> [4]
> https://github.com/apache/gora/commit/a870ca8a2075af7cbab75b9341a94de4966fbf7a
> [5]
> https://docs.google.com/document/d/1AF6MG3pqe6A5Z0KtLooEKQlipuyeObbYlAa4O7rFXqM/edit?usp=sharing
>
> Regards,
> Maria
>


Add datastore for Elasticsearch. Outreachy Week 11 Report

2021-02-20 Thread Maria Podorvanova
Hi,

Report #11
Week 11: February, 14 - February, 20
Activities:
- Added scaling_factor support [1]
- Removed unsupported Elasticsearch data types [2]
- Implemented Metadata Analyzer for Elasticsearch Store [3]
- Tried to fix range query by “_id” field [4]
- Wrote documentation for Apache Gora website [5]
- Polished and sent my CV for reviewing

Question:

   1. I tried to fix the issue, where Elasticsearch "_id" field does not
   support range queries. I've tried treating "_id" as a number, but one of
   the test "_id" field values is "http://foo.com/". So my approach did not
   work, but I decided to commit[4] my work on this issue in order to show you
   what I tried to do.


[1]
https://github.com/apache/gora/commit/670a04c51f4a6d169df319ed5fd3d1d0abd81870
[2]
https://github.com/apache/gora/commit/55020d722f9424021fefe8d94b6bf3ece213226d
[3]
https://github.com/apache/gora/commit/c491a6447d197b0509473294ee844834b1623a63
[4]
https://github.com/apache/gora/commit/a870ca8a2075af7cbab75b9341a94de4966fbf7a
[5]
https://docs.google.com/document/d/1AF6MG3pqe6A5Z0KtLooEKQlipuyeObbYlAa4O7rFXqM/edit?usp=sharing

Regards,
Maria