On Thu, Nov 17, 2016 at 8:12 PM, Erick Erickson <[email protected]> wrote:
> Yonik:
>
> Hmmm, we may be closer to that than it might appear. I happened to
> need to do some verification yesterday to determine whether I could
> limit the number of rows returned with TupleStream variants. /export
> of course doesn't do that, the close on a TupleStream waits until the
> entire stream is exhausted and throws the bits on the floor.
>
> Anyway, I was playing around with returning 10M rows with the /query
> and /export handlers and found out that I could indeed use /query and
> limit the rows. Fine so far.
>
> Then just for yucks I decided to try to use the /query handler with
> rows=100M and... the total processing time was virtually identical to
> /export. These weren't very sophisticated tests mind you; they did
> lend evidence that your idea is probably the way to go though.

When I did some ad-hoc tests a long time ago, /select was inexplicably
much slower (even when retrieving all docvalues and discounting sorting
time). Some of the issue was probably a bug, fixed recently in
SolrIndexSearcher.decorateSomethingOrOther, that was creating a
top-level DV view.

Some other changes off the top of my head:

- If the number of docs being retrieved is very large (or all, via
  rows=-1), and if no other components (like highlighting) need the
  top-N docs (needDocList), then defer sorting of the matches until
  later.
- Keep track of the DocSet on the ResponseBuilder (this is already done
  when we facet via needDocSet?).
- If sorting was deferred, then sort in the most efficient way we know
  how (i.e. don't always use a priority queue), or we can just do it
  like export writer currently does.
- Invert the logic that writes DV fields so that we figure out the
  fields once, look up the docvalues once, and then efficiently write
  them out per document (Noble's addition of PushWriter is the right
  direction here).

In the long run, this should be simpler to deal with from both a "fl"
point of view, as well as augmenters, pseudo-fields, and security.

But again, feel free to add whatever to /export in the meantime... I'm
just laying out a bigger picture in case anyone also wants to work
toward that as well.

-Yonik
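
For illustration only, the "defer sorting" idea in the list above could be sketched roughly as follows. This is not Solr code: the class and method names (DeferredSortSketch, topN, sortAll) are hypothetical, a java.util.BitSet stands in for the DocSet, and a plain long[] stands in for a single numeric docvalues sort field. It only contrasts a bounded priority queue (reasonable for a small top-N) with collecting every match first and sorting once afterwards (one possible approach when most or all docs are returned).

```java
import java.util.Arrays;
import java.util.BitSet;
import java.util.Comparator;
import java.util.PriorityQueue;

// Hypothetical sketch; none of these names exist in Solr.
class DeferredSortSketch {

    // Ascending by sort value, ties broken by doc id.
    static Comparator<Integer> bySortValue(long[] sortValues) {
        return (a, b) -> {
            int c = Long.compare(sortValues[a], sortValues[b]);
            return c != 0 ? c : Integer.compare(a, b);
        };
    }

    // Top-N path: a bounded heap whose head is the worst doc kept so far.
    static int[] topN(BitSet matches, long[] sortValues, int n) {
        PriorityQueue<Integer> heap =
            new PriorityQueue<>(bySortValue(sortValues).reversed());
        for (int doc = matches.nextSetBit(0); doc >= 0; doc = matches.nextSetBit(doc + 1)) {
            heap.add(doc);
            if (heap.size() > n) {
                heap.poll(); // evict the current worst doc
            }
        }
        int[] out = new int[heap.size()];
        for (int i = out.length - 1; i >= 0; i--) {
            out[i] = heap.poll(); // worst first, so fill from the end
        }
        return out;
    }

    // Deferred path: matches were only collected as a set; sort them all once.
    static int[] sortAll(BitSet matches, long[] sortValues) {
        return matches.stream()
                .boxed()
                .sorted(bySortValue(sortValues))
                .mapToInt(Integer::intValue)
                .toArray();
    }

    public static void main(String[] args) {
        BitSet matches = new BitSet();
        for (int doc : new int[] {0, 2, 3, 5}) {
            matches.set(doc);
        }
        long[] sortValues = {50, 10, 30, 20, 99, 30}; // indexed by doc id
        System.out.println(Arrays.toString(topN(matches, sortValues, 2)));
        System.out.println(Arrays.toString(sortAll(matches, sortValues)));
    }
}
```

Both paths return doc ids in the same order; the difference is only when the ordering work happens. A real implementation would avoid the boxing used here and would have to handle multiple sort fields and per-segment docvalues.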
