[ 
https://issues.apache.org/jira/browse/SOLR-5463?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hoss Man updated SOLR-5463:
---------------------------

    Attachment: SOLR-5463__straw_man.patch

Patch update...

* additional tests that mix deletes & updates with walking a cursor
* more randomization of the types of queries being run
* more hardening of the SearchAfterTotem class (it should be useful beyond the 
strawman)
* more tests for the SearchAfterTotem serialization
* more tests of bad input/usage
* hook strawman component into example to try it out

...at this point i _was_ going to pursue tweaking the user facing API a bit, so 
that the "next" totem is never null: it always corresponds to the last doc 
returned, and clients check for 0 docs coming back to know when they are 
"done", in which case the "next" totem returned is the same as the one they sent.  
If we do this, use cases like "i want every doc matching this query, and if 
there are no more, i want to remember where i left off and continue again later" 
would be possible if the client uses a compatible sort (ie: a timestamp field)
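
A minimal sketch of that client contract, written as an in-memory simulation 
rather than against the patch itself. The names fetch_page and drain, the 
(timestamp, id) sort key, and the use of None for "start of results" are all 
assumptions made for illustration; none of them come from the strawman API.

```python
from typing import List, Optional, Tuple

# Hypothetical sort key: (timestamp, id), ascending on both levels.
Doc = Tuple[int, str]


def fetch_page(index: List[Doc], totem: Optional[Doc],
               rows: int) -> Tuple[List[Doc], Optional[Doc]]:
    """Return up to `rows` docs sorting strictly after `totem`, plus the next totem.

    The next totem is the sort key of the last doc returned. When no docs
    match, the totem that was sent is handed back unchanged.
    """
    ordered = sorted(index)
    remaining = [d for d in ordered if totem is None or d > totem]
    page = remaining[:rows]
    next_totem = page[-1] if page else totem
    return page, next_totem


def drain(index: List[Doc], totem: Optional[Doc],
          rows: int = 2) -> Tuple[List[Doc], Optional[Doc]]:
    """Walk the cursor until a page comes back empty; return (docs seen, totem)."""
    seen: List[Doc] = []
    while True:
        page, totem = fetch_page(index, totem, rows)
        if not page:  # 0 docs coming back means "done" for now
            return seen, totem
        seen.extend(page)


index = [(100, "a"), (101, "b"), (102, "c")]
first, totem = drain(index, None)   # walks all three docs
index.append((103, "d"))            # a doc that arrives later
later, totem = drain(index, totem)  # resumes where the client left off
```

Because the totem survives an empty page, the client can store it and pick up 
again later; this only works when new docs sort after old ones, which is why 
the sketch sorts on a timestamp.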

However.... when manually using this with the example configs in order to 
sanity check that that was going to feel right as an API, i discovered problems 
where the queryResultCache was coming into play and never getting past "page" 
3.  I felt stupid for not thinking to test this earlier, and updated the test 
configs to include queryResultCache, but i can't reproduce the problem as a 
test failure ... still not sure why.

Need to investigate this further before doing anything else.

> Provide cursor/token based "searchAfter" support that works with arbitrary 
> sorting (ie: "deep paging")
> ------------------------------------------------------------------------------------------------------
>
>                 Key: SOLR-5463
>                 URL: https://issues.apache.org/jira/browse/SOLR-5463
>             Project: Solr
>          Issue Type: New Feature
>            Reporter: Hoss Man
>            Assignee: Hoss Man
>         Attachments: SOLR-5463__straw_man.patch, SOLR-5463__straw_man.patch, 
> SOLR-5463__straw_man.patch, SOLR-5463__straw_man.patch, 
> SOLR-5463__straw_man.patch, SOLR-5463__straw_man.patch
>
>
> I'd like to revisit a solution to the problem of "deep paging" in Solr, 
> leveraging an HTTP based API similar to how IndexSearcher.searchAfter works 
> at the lucene level: require the clients to provide back a token indicating 
> the sort values of the last document seen on the previous "page".  This is 
> similar to the "cursor" model I've seen in several other REST APIs that 
> support "pagination" over large sets of results (notably the twitter API and 
> its "since_id" param) except that we'll want something that works with 
> arbitrary multi-level sort criteria that can be either ascending or descending.
> SOLR-1726 laid some initial ground work here and was committed quite a while 
> ago, but the key bit of argument parsing to leverage it was commented out due 
> to some problems (see comments in that issue).  It's also somewhat out of 
> date at this point: at the time it was committed, IndexSearcher only supported 
> searchAfter for simple scores, not arbitrary field sorts; and the params 
> added in SOLR-1726 suffer from this limitation as well.
> ---
> I think it would make sense to start fresh with a new issue with a focus on 
> ensuring that we have deep paging which:
> * supports arbitrary field sorts in addition to sorting by score
> * works in distributed mode
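
The token described in the issue (the sort values of the last document seen) 
can be illustrated with a small in-memory sketch of "search after" over a 
multi-level sort with mixed directions. This is not Solr or Lucene code: the 
function names, the (field, descending) sort spec, and the sample fields are 
hypothetical, and the sketch assumes the last sort level is a unique id so 
that a token identifies exactly one position.

```python
import functools
from typing import Any, Dict, List, Optional, Sequence, Tuple

# Hypothetical sort spec: one (field name, descending?) pair per sort level.
SortSpec = Sequence[Tuple[str, bool]]


def compare(a: Sequence[Any], b: Sequence[Any], spec: SortSpec) -> int:
    """Compare two tuples of sort values; negative means `a` sorts first."""
    for x, y, (_, descending) in zip(a, b, spec):
        if x != y:
            result = -1 if x < y else 1
            return -result if descending else result
    return 0


def sort_values(doc: Dict[str, Any], spec: SortSpec) -> Tuple[Any, ...]:
    """Extract the values a token would carry for this doc."""
    return tuple(doc[field] for field, _ in spec)


def search_after(docs: List[Dict[str, Any]], spec: SortSpec,
                 after: Optional[Tuple[Any, ...]], rows: int) -> List[Dict[str, Any]]:
    """Return up to `rows` docs that sort strictly after the `after` values."""
    def cmp_docs(d1: Dict[str, Any], d2: Dict[str, Any]) -> int:
        return compare(sort_values(d1, spec), sort_values(d2, spec), spec)

    ordered = sorted(docs, key=functools.cmp_to_key(cmp_docs))
    hits = [d for d in ordered
            if after is None or compare(sort_values(d, spec), after, spec) > 0]
    return hits[:rows]


docs = [
    {"id": "a", "price": 10},
    {"id": "b", "price": 30},
    {"id": "c", "price": 10},
    {"id": "d", "price": 20},
]
spec = [("price", True), ("id", False)]  # price descending, then id ascending

page1 = search_after(docs, spec, None, 2)
token = sort_values(page1[-1], spec)     # sort values of the last doc seen
page2 = search_after(docs, spec, token, 2)
```

Each request filters on the token instead of skipping an offset, so the cost 
of a page does not depend on how deep into the results the client already is.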



--
This message was sent by Atlassian JIRA
(v6.1#6144)

