Re: Add datastore for Elasticsearch. Outreachy Week 11 Report
Hi Kevin,

Yes, I will make a PR once I fix some issues.

Regards,
Maria

On Thu, 25 Feb 2021 at 15:49, Kevin Ratnasekera wrote:

> Can you raise a PR once you are comfortable with the changes?
Re: Add datastore for Elasticsearch. Outreachy Week 11 Report
Hi Maria,

Thank you for your hard work, Maria. Can you raise a PR once you are comfortable with the changes?

Regards,
Kevin

On Thu, Feb 25, 2021 at 10:06 AM Maria Podorvanova <podorvanova.ma...@gmail.com> wrote:

> Thanks for your comment. I am working on it.
Re: Add datastore for Elasticsearch. Outreachy Week 11 Report
Hi John,

Thanks for your comment. I am working on it.

Regards,
Maria

On Wed, 24 Feb 2021 at 17:50, John Mora wrote:

> Unfortunately, looping through all possible values in the range is not a
> practical solution.
>
> You should use the range query feature for this.
Re: Add datastore for Elasticsearch. Outreachy Week 11 Report
Hi Maria.

Thanks for the update.

Unfortunately, looping through all possible values in the range is not a practical solution.

You should use the range query feature for this:

https://www.elastic.co/guide/en/elasticsearch/reference/current/query-dsl-range-query.html

I think you should manually add a special field to the Elasticsearch record that you can range query (you can add it to the mapping file as a 'mock' primary key field). It will basically be a copy of the '_id' field.

Here you can find a similar workaround in the Redis DataStore, where Sorted Sets were used as secondary indexes for range queries:

https://github.com/apache/gora/blob/master/gora-redis/src/main/java/org/apache/gora/redis/store/RedisStore.java#L299

Best,
John

On Sat, 20 Feb 2021 at 3:01, Maria Podorvanova (<podorvanova.ma...@gmail.com>) wrote:

> Hi,
>
> Report #11
> Week 11: February 14 - February 20
> Activities:
> - Added scaling_factor support [1]
> - Removed unsupported Elasticsearch data types [2]
> - Implemented Metadata Analyzer for Elasticsearch Store [3]
> - Tried to fix range query by “_id” field [4]
> - Wrote documentation for Apache Gora website [5]
> - Polished and sent my CV for review
>
> Question:
>
> 1. I tried to fix the issue where the Elasticsearch "_id" field does not
>    support range queries. I tried treating "_id" as a number, but one of
>    the test "_id" field values is "http://foo.com/", so that approach did
>    not work. I decided to commit my work on this issue anyway [4] in
>    order to show you what I tried to do.
>
> [1] https://github.com/apache/gora/commit/670a04c51f4a6d169df319ed5fd3d1d0abd81870
> [2] https://github.com/apache/gora/commit/55020d722f9424021fefe8d94b6bf3ece213226d
> [3] https://github.com/apache/gora/commit/c491a6447d197b0509473294ee844834b1623a63
> [4] https://github.com/apache/gora/commit/a870ca8a2075af7cbab75b9341a94de4966fbf7a
> [5] https://docs.google.com/document/d/1AF6MG3pqe6A5Z0KtLooEKQlipuyeObbYlAa4O7rFXqM/edit?usp=sharing
>
> Regards,
> Maria
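
For illustration, John's suggestion (copy the key into an ordinary field and range-query that field instead of "_id") could look roughly like the sketch below. It is not the code that ended up in Gora. The field name gora_id, the class name and the helper methods are hypothetical, and the sketch assumes the copy is mapped as a keyword field so that string keys such as "http://foo.com/" are compared lexicographically rather than as numbers. It only builds the document source and the Query DSL request body with the standard library; sending them to Elasticsearch is left out.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class IdRangeQuerySketch {

    // Hypothetical name for the 'mock' primary key field; not taken from the Gora code base.
    static final String KEY_FIELD = "gora_id";

    /** Minimal JSON string escaping for quotes, backslashes and control characters. */
    static String quote(String s) {
        StringBuilder sb = new StringBuilder("\"");
        for (char c : s.toCharArray()) {
            if (c == '"' || c == '\\') {
                sb.append('\\').append(c);
            } else if (c < 0x20) {
                sb.append(String.format("\\u%04x", (int) c));
            } else {
                sb.append(c);
            }
        }
        return sb.append('"').toString();
    }

    /** Document source: the record's fields plus a copy of the key stored in KEY_FIELD. */
    static Map<String, Object> withKeyCopy(String key, Map<String, Object> fields) {
        Map<String, Object> source = new LinkedHashMap<>(fields);
        source.put(KEY_FIELD, key);
        return source;
    }

    /** Request body for an inclusive range query on KEY_FIELD; a null bound is left open. */
    static String rangeQuery(String startKey, String endKey) {
        StringBuilder bounds = new StringBuilder();
        if (startKey != null) {
            bounds.append("\"gte\":").append(quote(startKey));
        }
        if (endKey != null) {
            if (bounds.length() > 0) {
                bounds.append(',');
            }
            bounds.append("\"lte\":").append(quote(endKey));
        }
        return "{\"query\":{\"range\":{" + quote(KEY_FIELD) + ":{" + bounds + "}}}}";
    }

    public static void main(String[] args) {
        // The key is written twice: once as "_id" by the store, once as a queryable field.
        System.out.println(withKeyCopy("http://foo.com/", Map.of("title", "Foo")));
        System.out.println(rangeQuery("http://bar.com/", "http://foo.com/"));
    }
}
```

With a start key and an end key, rangeQuery produces a body of the form {"query":{"range":{"gora_id":{"gte":...,"lte":...}}}}, which matches the gte/lte parameters described on the range query page linked above. Passing null for either bound gives an open-ended range.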