Re: scrolling in ElasticSearch adapter

2018-11-14 Thread Julian Hyde
You’re proposing lazy consensus. We don’t do that for commits. (Yeah, we use a mixture of CTR and RTC. But adding lazy consensus muddies things too much, I think.) For this change, I think you should do RTC, and require a +1 from at least one committer. Therefore, someone please review

Re: scrolling in ElasticSearch adapter

2018-11-14 Thread Andrei Sereda
Thanks, all, for your input. Following this discussion, I have prepared a PR (https://github.com/apache/calcite/pull/919). Let me know if you agree with the approach. If there are no objections or comments, I will commit it in a couple of days. Propagation of Statement.setFetchSize(int) into actual

Re: scrolling in ElasticSearch adapter

2018-10-26 Thread Kevin Risden
#1 sounds reasonable. #2 will have to see what that means for actual aggregations. Sounds like it could work. #3 not sure what is meant here. I think fetch size is a hint and not required. I wouldn't default to 10k though. Kevin Risden On Thu, Oct 25, 2018 at 4:25 PM Andrei Sereda wrote: >

Re: scrolling in ElasticSearch adapter

2018-10-25 Thread Andrei Sereda
There is a new Composite Aggregation (still in beta) which allows pagination with aggregates. It is available on versions >= 5.6 (unfortunately we have to support 2.6+). To sum

Re: scrolling in ElasticSearch adapter

2018-10-25 Thread Kevin Risden
What I was saying is that scrolling is the only way to ensure you get correct results coming back from Elasticsearch if you are going to do more processing. https://www.elastic.co/guide/en/elasticsearch/reference/current/search-request-scroll.html As long as you delete the scroll at the end its
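
The point about deleting the scroll at the end can be sketched roughly as follows. This is only an illustration: ScrollClient, ScrollPage and ScrollReader are hypothetical names made up for this sketch, not the Elasticsearch client API and not the Calcite adapter code. It assumes a client that can open a scroll, fetch the next page, and clear the scroll context.

```java
import java.util.List;
import java.util.function.Consumer;

/** Hypothetical, minimal view of a scroll API (assumption, not a real client). */
interface ScrollClient {
  ScrollPage open(int batchSize);    // initial search: first page plus a scroll id
  ScrollPage next(String scrollId);  // following page for an open scroll
  void clear(String scrollId);       // releases the server-side scroll context
}

/** One page of hits together with the scroll id to use for the next request. */
final class ScrollPage {
  final String scrollId;
  final List<String> hits;

  ScrollPage(String scrollId, List<String> hits) {
    this.scrollId = scrollId;
    this.hits = hits;
  }
}

final class ScrollReader {
  private ScrollReader() {}

  /** Reads every hit in batches and always clears the scroll, even on failure. */
  static int readAll(ScrollClient client, int batchSize, Consumer<String> sink) {
    ScrollPage page = client.open(batchSize);
    String scrollId = page.scrollId;
    int count = 0;
    try {
      while (!page.hits.isEmpty()) {
        for (String hit : page.hits) {
          sink.accept(hit);
          count++;
        }
        page = client.next(scrollId);
        scrollId = page.scrollId; // the id may change between requests
      }
    } finally {
      client.clear(scrollId); // delete the scroll at the end
    }
    return count;
  }
}
```

The try/finally is the part that matters here: the scroll context is released whether the consumer reads everything or fails half way.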

Re: scrolling in ElasticSearch adapter

2018-10-25 Thread Andrei Sereda
Julian, > Do you need to generate a different plan (i.e. a different tree of RelNodes) > for scrolling vs non-scrolling? The plan is the same. I just need to construct a different ES query and a batching Enumerator. On Thu, Oct 25, 2018 at 1:31 PM Julian Hyde wrote: > > Do you need to generate a

Re: scrolling in ElasticSearch adapter

2018-10-25 Thread Andrei Sereda
Hi Kevin, Are you suggesting using scrolling for all elastic queries? Even when there are predicates? Some questions: 1) Scrolling has a runtime overhead for the elastic cluster. Having it enabled by default (against vendor recommendation) is risky. Does it (not) cause issues in Solr? 2) Scrolling

Re: scrolling in ElasticSearch adapter

2018-10-25 Thread Kevin Risden
> > There is one more “issue”. Currently select * from elastic returns at most > 10 rows (in calcite). This is consistent with elastic behaviour which > limits result set to 10 documents (unless size is specified). In Solr land for the Calcite integration it uses the /export handler or streaming

Re: scrolling in ElasticSearch adapter

2018-10-25 Thread Julian Hyde
Do you need to generate a different plan (i.e. a different tree of RelNodes) for scrolling vs non-scrolling? If so, it’s certainly inconvenient that you don’t know until execute time whether they want scrolling. A possible solution would be to generate TWO plans - one scrolling, one

Re: scrolling in ElasticSearch adapter

2018-10-25 Thread Christian Beikov
Hey Andrei, I don't have an answer for how you can access these settings from within the adapter nor how one could do that via RelNodes but the suggestion to use DataContext for that purpose sounds reasonable. Maybe someone else has an idea? Anyway, since these are settings that don't

Re: scrolling in ElasticSearch adapter

2018-10-25 Thread Stamatis Zampetakis
For the sake of the discussion, I outline below a few ideas which may be of some use. Andrei>..elastic doesn't natively support transactions. Scrollable results are "serializable" only for a single query, not for multiple (which might confuse users using transactions on the client side). One

Re: scrolling in ElasticSearch adapter

2018-10-24 Thread Andrei Sereda
Christian, I like TYPE_SCROLL_INSENSITIVE / fetchSize in PreparedStatement generally but have some reservations (questions): How to pass resultSetType / fetchSize from PreparedStatement to RelNodes? What if the user doesn’t use JDBC (e.g. RelBuilders)? On Wed, Oct 24, 2018 at 6:28 PM Christian

Re: scrolling in ElasticSearch adapter

2018-10-24 Thread Andrei Sereda
> It seems to me that this behavior is tied to the notion of a transaction > and the various isolation levels defined by the standard. If the transaction > isolation level is for example "serializable" then executing the same query > in the same transaction multiple times should not return

Re: scrolling in ElasticSearch adapter

2018-10-24 Thread Christian Beikov
In JDBC one can configure a fetch size which would reflect the number of rows to be fetched initially, but also subsequently. https://docs.oracle.com/javase/7/docs/api/java/sql/Statement.html#setFetchSize(int) According to what you are writing, ES behavior is what TYPE_SCROLL_INSENSITIVE would
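
For reference, the two JDBC knobs mentioned here look roughly like this on the client side. It is a minimal sketch using only standard java.sql calls; FetchSizeHint and scrollingStatement are names invented for the illustration. Whether a given driver (Calcite's included) honours the result set type or the fetch size is an assumption, since JDBC treats the fetch size as a hint only.

```java
import java.sql.Connection;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;

final class FetchSizeHint {
  private FetchSizeHint() {}

  /**
   * Creates a statement that asks for an insensitive, read-only result set
   * and passes a fetch-size hint. The driver is free to ignore the hint.
   */
  static Statement scrollingStatement(Connection connection, int fetchSize)
      throws SQLException {
    Statement statement = connection.createStatement(
        ResultSet.TYPE_SCROLL_INSENSITIVE, ResultSet.CONCUR_READ_ONLY);
    statement.setFetchSize(fetchSize); // hint: rows to fetch per round trip
    return statement;
  }
}
```

A caller would then run the query with statement.executeQuery(...) as usual; how these settings would reach the adapter is exactly the open question in this thread.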

Re: scrolling in ElasticSearch adapter

2018-10-24 Thread Stamatis Zampetakis
Hi Andrei, Andrei>Scrolling (in elastic) does not only mean “open a cursor” but also iterate over consistent snapshot... It seems to me that this behavior is tied to the notion of a transaction and the various isolation levels defined by the standard. If the transaction isolation level is

Re: scrolling in ElasticSearch adapter

2018-10-24 Thread Andrei Sereda
Hi Julian, Scrolling (in elastic) does not only mean “open a cursor” but also iterating over a consistent snapshot. From the docs: The results that are returned from a scroll request reflect the state of the index at the time that the initial search request was made, like a snapshot in time. Subsequent

Re: scrolling in ElasticSearch adapter

2018-10-24 Thread Julian Hyde
It seems to me that Elasticsearch scroll means return a cursor - a collection of rows that you iterate over, and you may not read all of them. This is the default operation of JDBC. So, I guess we need to give the user a way to signal their intent to read all rows versus only the first few.

Re: scrolling in ElasticSearch adapter

2018-10-24 Thread Andrei Sereda
Hi Christian, Agree that a (scroll) SQL keyword is overkill. Need to check how the JDBC context can be accessed in RelNode(s). If I remember correctly, relational algebra can also be evaluated outside JDBC using RelBuilders (can't find an example right now). Andrei. On Wed, Oct 24, 2018 at 2:20 PM

Re: scrolling in ElasticSearch adapter

2018-10-24 Thread Christian Beikov
Hey, not sure if this should be an SQL keyword. JDBC specifies various constants that can be used at statement creation time: https://docs.oracle.com/javase/7/docs/api/java/sql/ResultSet.html Not sure though if or how these configurations are accessible for data stores or dialects, but IMO

scrolling in ElasticSearch adapter

2018-10-24 Thread Andrei Sereda
Hello, I was thinking about adding [scrolling functionality](https://www.elastic.co/guide/en/elasticsearch/reference/current/search-request-scroll.html) to the Elasticsearch adapter. Since scrolling has a non-negligible effect on the cluster, it should be selectively enabled on a per-query basis. So,