fsparv commented on code in PR #4053:
URL: https://github.com/apache/solr/pull/4053#discussion_r2723685924
##########
solr/solr-ref-guide/modules/query-guide/pages/exporting-result-sets.adoc:
##########
@@ -23,6 +23,24 @@ This feature uses a stream sorting technique that begins to send records within
 The cases where this functionality may be useful include: session analysis, distributed merge joins, time series roll-ups, aggregations on high cardinality fields, fully distributed field collapsing, and sort-based stats.
+== Comparison with Cursors
+
+The `/export` handler offers several advantages over xref:pagination-of-results.adoc#fetching-a-large-number-of-sorted-results-cursors[cursor-based pagination] for streaming large result sets.
+
+With cursors, the query is re-executed for each page of results.
+In contrast, `/export` runs the filter query once and the resulting segment-level bitmasks are applied once per segment, after which the documents are simply iterated over.
+Additionally, the segments that existed when the stream was opened are held open for the duration of the export, eliminating the disappearing or duplicate document issues that can occur with cursors.
+The trade-off is that IndexReaders are kept around for longer periods of time.

Review Comment:
> I feel we're potentially suggesting the contributor here put more work into this than he bargained for. Any documentation he's comfortable writing is encouraged... and beyond that, well let's just get this merged and have real users kick the tires and we'll see.

I added comments since he asked for a second set of eyes on the docs, and didn't mark it changes requested. Feel free to take em or leave em. These are the things I suspect a reader might wonder. I have a notion of some of the answers, though it's quite likely @kotman12 has thought more carefully about it more recently than I.

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.
