Re: Scroll Questions

mooky Wed, 18 Jun 2014 03:20:08 -0700

A further point on using hits.length == 0 as the end-of-scroll check:
shard failure(s) can also produce hits.length == 0, so an empty page does not necessarily mean the end of the scroll.
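
To make the distinction concrete, here is a minimal sketch of a scroll loop that checks for shard failures before it treats an empty page as the end. Page, ScrollClient, FakeClient and drain are hypothetical stand-ins invented for this example, not Elasticsearch classes. With the real Java client the same values would presumably come from SearchResponse (getHits().getHits().length, getFailedShards(), getScrollId()). The loop also carries forward the scroll id from each response and clears the scroll explicitly when it finishes.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Iterator;
import java.util.List;

public class ScrollSketch {

    /** Hypothetical stand-in for one scroll response (not the real SearchResponse). */
    static final class Page {
        final String scrollId;
        final List<String> hits;
        final int failedShards;

        Page(String scrollId, List<String> hits, int failedShards) {
            this.scrollId = scrollId;
            this.hits = hits;
            this.failedShards = failedShards;
        }
    }

    /** Hypothetical stand-in for the two client calls a scroll loop needs. */
    interface ScrollClient {
        Page scroll(String scrollId);
        void clearScroll(String scrollId);
    }

    /** In-memory client that replays canned pages; used only for the demo below. */
    static final class FakeClient implements ScrollClient {
        private final Iterator<Page> pages;
        boolean cleared = false;

        FakeClient(List<Page> pages) {
            this.pages = pages.iterator();
        }

        public Page scroll(String scrollId) {
            return pages.next();
        }

        public void clearScroll(String scrollId) {
            cleared = true;
        }
    }

    /** Collects all hits, failing loudly instead of silently stopping on shard failures. */
    static List<String> drain(ScrollClient client, String firstScrollId) {
        List<String> all = new ArrayList<String>();
        String scrollId = firstScrollId;
        try {
            while (true) {
                Page page = client.scroll(scrollId);
                // Use the id from the latest response for the next request.
                scrollId = page.scrollId;
                // Check failures first: an empty page with failed shards is not "the end".
                if (page.failedShards > 0) {
                    throw new IllegalStateException(
                            page.failedShards + " shard(s) failed; results are incomplete");
                }
                if (page.hits.isEmpty()) {
                    return all; // genuine end of the scroll
                }
                all.addAll(page.hits);
            }
        } finally {
            // Release the scroll explicitly rather than waiting for the keep-alive to expire.
            client.clearScroll(scrollId);
        }
    }

    public static void main(String[] args) {
        FakeClient healthy = new FakeClient(Arrays.asList(
                new Page("s1", Arrays.asList("doc1", "doc2"), 0),
                new Page("s1", Arrays.asList("doc3"), 0),
                new Page("s1", Collections.<String>emptyList(), 0)));
        System.out.println(drain(healthy, "s0"));

        FakeClient broken = new FakeClient(Arrays.asList(
                new Page("s1", Collections.<String>emptyList(), 1)));
        try {
            drain(broken, "s0");
        } catch (IllegalStateException e) {
            System.out.println("stopped early: " + e.getMessage());
        }
    }
}
```

Whether to throw, retry, or merely log on a shard failure is a design choice. The sketch throws so that a partial export or re-index cannot be mistaken for a complete one.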

On Tuesday, 17 June 2014 18:46:07 UTC+1, mooky wrote:
>
> Having hit a bunch of issues using scroll, I thought I had better improve my 
> understanding of how scroll is supposed to be used (and how it's not 
> supposed to be used).
>
>
>    1. Does it make sense to execute a search request with scroll, but 
>    SearchType != SCAN?
>    2. Does it make sense to execute a search request with scroll, and 
>    also with facet/aggregations?
>    3. What is the difference between scrolling to the end of the results 
>    (ie calling until hits.length ==0) and issuing a specific 
>    ClearScrollRequest? It appears to me that the ClearScrollRequest 
>    immediately clears the scroll - whereas there is some time delay before a 
>    scroll is cleaned up after reaching the end of the results. (I can see 
>    this in my tests because the ElasticsearchIntegrationTest fails on 
>    teardown unless I perform an explicit ClearScrollRequest or I put a 
>    delay of some number of seconds). From reading the docs, I am not sure 
>    if this is a bug or expected behaviour.
>    4. Does the scrollId represent the cursor, or the cursor 
>    page/iteration state? I have read documentation/mailing list explanations 
>    that have words to the effect "you must pass the scrollId from the 
>    previous response into the subsequent request" - which suggests the id 
>    represents 
>    some cursor state - ie performing a scroll request with a given scrollId 
>    will always return the same results. My observation, however, is that the 
>    scrollId does not change (ie I get back the same scrollId I passed in) so 
>    each scroll request with the same scrollId advances the 'cursor' until no 
>    results are returned. I have also read stuff on the mailing list that 
>    implied multiple calls could be made in parallel with the same scrollId to 
>    load all the results faster (which would imply the scrollId is *not* 
>    expected to change). So which is correct? :)
>
>
> To explain the background for my questions: I have two requirements :
> 1) I get an update event that leads me to go find items in the index that 
> need re-indexing. I perform a search on the index, I get the IDs and I 
> load the original data from the source system(s) to reconstruct the 
> document and index it. This seems to be exactly what SCAN and SCROLL are 
> meant for. (However, the SCAN search type is different in that it always 
> returns zero hits from the original search request - only the scroll 
> requests seem to 
>
> 2) The user normally performs a search, and naturally we limit how many 
> results we serve to the client. However, occasionally, the user wants to 
> return all the data for a given search/filter (say, to export to Excel or 
> whatever), so it seems like a good idea to use the scroll rather than 
> paging through the results using from&size, as we know we will get 
> consistent results even if documents are being added/removed/updated on the 
> server.
> From a functionality perspective, I want to make sure the scrolling search 
> request is the same as the non-scrolling search request so the user gets 
> the same results - so from a code perspective, ideally I really want to 
> make the codepath the same (save for adding the scroll keepAlive param). 
> However, perhaps there are things I perform with my normal search (e.g. 
> aggregations, SearchType.DEFAULT, etc) that just don't make sense when 
> scrolling?
>
> Many thanks.
>
