Re: scrolling in ElasticSearch adapter

Julian Hyde Thu, 25 Oct 2018 10:31:25 -0700

Do you need to generate a different plan (i.e. a different tree of RelNodes) 
for scrolling vs non-scrolling? If so, it’s certainly inconvenient that you 
don’t know until execute time whether they want scrolling. A possible solution 
would be to generate TWO plans - one scrolling, one non-scrolling inside the 
prepared statement - and pick which one based on the runtime context.


Julian

> On Oct 25, 2018, at 1:42 AM, Christian Beikov <[email protected]> 
> wrote:
> 
> Hey Andrei,
> 
> I don't have an answer for how you can access these settings from within the 
> adapter nor how one could do that via RelNodes but the suggestion to use 
> DataContext for that purpose sounds reasonable. Maybe someone else has an 
> idea?
> 
> Anyway, since these are settings that don't affect the general semantics of 
> the query/statement and also usually require a special API to be used, I'd 
> rather see these aspects not end up in the query string.
> 
> Am 25.10.2018 um 02:15 schrieb Andrei Sereda:
>> Christian,
>> 
>> I like TYPE_SCROLL_INSENSITIVE / fetchSize in PreparedStatement
>> generally but have some reservations (questions) :
>> 
>> How to pass resultSetType / fetchSize from PreparedStatement to RelNodes ?
>> What if user doesn’t use JDBC (eg. RelBuilders) ?
>> On Wed, Oct 24, 2018 at 6:28 PM Christian Beikov
>> <[email protected]> wrote:
>>> In JDBC one can configure a fetch size which would reflect the amount of
>>> rows to be fetched initially, but also subsequently.
>>> https://docs.oracle.com/javase/7/docs/api/java/sql/Statement.html#setFetchSize(int)
>>> 
>>> According to what you are writing, ES behvior is what
>>> TYPE_SCROLL_INSENSITIVE would do i.e. provide a snapshot view that isn't
>>> affected by changes.
>>> 
>>> IMO TYPE_SCROLL_SENSITIVE means that if you have rows R1, R2, R3, R4,
>>> ... and view R1, R2, then R3 is deleted and you fetch the next rows, you
>>> wouldn't see R3.
>>> 
>>> According to the JDBC spec
>>> (https://docs.oracle.com/javase/7/docs/api/java/sql/Connection.html#prepareStatement(java.lang.String,%20int,%20int)
>>> ) you don't have to support all modes. Usually, user code doesn't use
>>> scrolling that much, but at least forward scrolling makes sense.
>>> 
>>> Am 24.10.2018 um 21:38 schrieb Andrei Sereda:
>>>> Hi Julian,
>>>> 
>>>> Scrolling (in elastic) does not only mean “open a cursor” but also iterate
>>>> over consistent snapshot. From docs:
>>>> 
>>>> The results that are returned from a scroll request reflect the state of
>>>> the index at the time that the initial search request was made, like a
>>>> snapshot in time. Subsequent changes to documents (index, update or delete)
>>>> will only affect later search requests.
>>>> 
>>>> So pagination (fetch / offset) can’t exactly replicate this functionality.
>>>> 
>>>> The problem with scrolling (in elastic) is that it is expensive and can’t
>>>> (shouldn’t) be enabled it by default.
>>>> 
>>>> There is one more “issue”. Currently select * from elastic returns at most
>>>> 10 rows (in calcite). This is consistent with elastic behaviour which
>>>> limits result set to 10 documents (unless size is specified). When
>>>> returning a cursor (eg. using JDBC TYPE_SCROLL_SENSITIVE
>>>> <https://docs.oracle.com/javase/7/docs/api/java/sql/ResultSet.html#TYPE_SCROLL_SENSITIVE>
>>>> or SQL hint) does it mean return whole elastic index ? I’m not at ease with
>>>> returning different results based on hints or cursor settings.
>>>> 
>>>> Andrei.
>>>> 
>>>> On Wed, Oct 24, 2018 at 3:02 PM Julian Hyde <[email protected]> wrote:
>>>> 
>>>>> It seems to me that Elasticsearch scroll means return a cursor - a
>>>>> collection of rows that you iterate over, and you may not read all of 
>>>>> them.
>>>>> This is the default operation of JDBC.
>>>>> 
>>>>> So, I guess we need to give the user a way to signal their intent to read
>>>>> all rows versus only the first few. Oracle’s FIRST_ROWS and ALL_ROWS
>>>>> hints[1] seem close to this. We would want the hints to be acted upon by
>>>>> both the optimizer and the JDBC transport.
>>>>> 
>>>>> Related is pagination. SQL has FETCH and OFFSET, which allow you to
>>>>> retrieve different pieces of a large result set in separate statements or
>>>>> (using query parameters) executions. It would be useful if the server 
>>>>> could
>>>>> be given a hint to cache a statement across page requests.
>>>>> 
>>>>> Julian
>>>>> 
>>>>> [1]
>>>>> https://docs.oracle.com/cd/B10500_01/server.920/a96533/hintsref.htm#4924
>>>>> 
>>>>>> On Oct 24, 2018, at 11:19 AM, Christian Beikov <
>>>>> [email protected]> wrote:
>>>>>> Hey,
>>>>>> 
>>>>>> not sure if this should be an SQL keyword. JDBC specifies various
>>>>> constants that can be used at statement creation time:
>>>>> https://docs.oracle.com/javase/7/docs/api/java/sql/ResultSet.html
>>>>>> Not sure though if or how these configurations are accessible for data
>>>>> stores or dialects, but IMO using these would be the proper way.
>>>>>> Regards
>>>>>> 
>>>>>> Christian
>>>>>> 
>>>>>>> Am 24.10.2018 um 18:44 schrieb Andrei Sereda:
>>>>>>> Hello,
>>>>>>> 
>>>>>>> 
>>>>>>> I was thinking about adding [scrolling functionality](
>>>>>>> 
>>>>> https://www.elastic.co/guide/en/elasticsearch/reference/current/search-request-scroll.html
>>>>> )
>>>>>>> to elastic search adapter. Since scrolling has non-negligible effect on
>>>>> the
>>>>>>> cluster it should be selectively enabled on per query basis. So, likely,
>>>>>>> user has to explicitly set "scroll flag" somewhere.
>>>>>>> 
>>>>>>> Most natural way seems in SQL. [Calcite sql grammar](
>>>>>>> https://calcite.apache.org/docs/reference.html) has `SCROLL` keyword
>>>>>>> (unused to my knowledge). There were also discussions about adding
>>>>> hints to
>>>>>>> Calcite.
>>>>>>> 
>>>>>>> ### Examples
>>>>>>> ```sql
>>>>>>> -- special sql keyword ?
>>>>>>> SCROLL select * from elastic;
>>>>>>> 
>>>>>>> -- assuming hints are available in calcite
>>>>>>> /* HINT: scroll */ select * from elastic;
>>>>>>> ```
>>>>>>> 
>>>>>>> What people think about this use-case ? Are there better ideas ?
>>>>>>> 
>>>>>>> Regards,
>>>>>>> Andrei.
>>>>>>>

Re: scrolling in ElasticSearch adapter

Reply via email to