Re: Why are cursor mark queries recommended over regular start, rows combination?

2018-03-26 Thread Webster Homer
Shawn, Thanks. It's been a while now, but we did find issues with both cursorMark AND start/rows. The effect was much more obvious with cursorMark. We were able to address this by switching to TLOG replicas. These give consistent results. It's nice to know that the cursorMark problems were rela

Re: Why are cursor mark queries recommended over regular start, rows combination?

2018-03-23 Thread Shawn Heisey
On 3/23/2018 3:47 PM, Webster Homer wrote:
> Just FYI I had a project recently where I tried to use cursorMark in
> SolrCloud and Solr 7.2.0 and it was very unreliable. It couldn't even
> return consistent numberFound values. I posted about it in this forum.
> Using the start and rows arguments in

Re: Why are cursor mark queries recommended over regular start, rows combination?

2018-03-23 Thread Webster Homer
Just FYI I had a project recently where I tried to use cursorMark in SolrCloud and Solr 7.2.0 and it was very unreliable. It couldn't even return consistent numberFound values. I posted about it in this forum. Using the start and rows arguments in SolrQuery did work reliably so I abandoned cursorMa

Re: Why are cursor mark queries recommended over regular start, rows combination?

2018-03-20 Thread Jason Gerlowski
> I can take a stab at this if someone can point me how to update the
> documentation.

Hey SG,

Please do, that'd be awesome. Thanks to some work done by Cassandra Targett a release or two ago, the Solr Ref Guide documentation now lives in the same codebase as the Solr/Lucene code itself, and t

Re: Why are cursor mark queries recommended over regular start, rows combination?

2018-03-14 Thread Erick Erickson
I'm pretty sure you can use Streaming Expressions to get all the rows back from a sharded collection without chewing up lots of memory.

Try:
search(collection, q="id:*", fl="id", sort="id asc", qt="/export")

on a sharded SolrCloud installation, I
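For readers who want to try the expression quoted above, here is a minimal sketch of how it could be sent to the /stream handler as an `expr` request parameter. The host `http://localhost:8983/solr`, the collection name `collection`, and the helper `stream_request_url` are assumptions for illustration only; nothing here comes from the thread itself beyond the expression text.

```python
from urllib.parse import urlencode


def stream_request_url(base_url, collection, expr):
    """Build a URL that passes a streaming expression to the /stream handler.

    base_url and collection are placeholders (assumptions), not values
    taken from the thread.
    """
    # urlencode percent-escapes the quotes, parentheses and spaces in expr.
    return f"{base_url}/{collection}/stream?" + urlencode({"expr": expr})


# The expression exactly as quoted in the message above.
expr = 'search(collection, q="id:*", fl="id", sort="id asc", qt="/export")'
url = stream_request_url("http://localhost:8983/solr", "collection", expr)
print(url)
```

The resulting URL can be fetched with any HTTP client. Note that the /export handler only works with fields that have docValues enabled, as mentioned elsewhere in this thread.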

Re: Why are cursor mark queries recommended over regular start, rows combination?

2018-03-14 Thread S G
Thanks everybody. This is a lot of good information. And we should try to update this in the documentation too, to help users make the right choice. I can take a stab at this if someone can point me to how to update the documentation.

Thanks
SG

On Tue, Mar 13, 2018 at 2:04 PM, Chris Hostetter wrote:

Re: Why are cursor mark queries recommended over regular start, rows combination?

2018-03-13 Thread Chris Hostetter
: > 3) Lastly, it is not clear the role of export handler. It seems that the
: > export handler would also have to do exactly the same kind of thing as
: > start=0 and rows=1000,000. And that again means bad performance.
: <3> First, streaming requests can only return docValues="true"
: f

Re: Why are cursor mark queries recommended over regular start, rows combination?

2018-03-12 Thread Shawn Heisey
On 3/12/2018 6:18 PM, S G wrote:
> We have use-cases where some queries will return about 100k to 500k records.
> As per https://lucene.apache.org/solr/guide/7_2/pagination-of-results.html,
> it seems that using start=x, rows=y is a bad combination performance wise.
>
> 1) However, it is not clear

Re: Why are cursor mark queries recommended over regular start, rows combination?

2018-03-12 Thread Erick Erickson
<1> consider start=100&rows=10. In the absence of cursorMark, Solr has to sort the top 110 documents in order to throw away the first 100 since the last document scored could be in the top 110 and there's no way to know that ahead of time. For 110 that's not very expensive, but when the list is in
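The cost difference described in <1> can be illustrated with a toy sketch. This is not Solr's implementation: it is plain Python over a list of distinct sort values, and the function names `page_with_offset` and `page_with_cursor` are made up for the example. It only shows that offset paging has to keep start+rows entries ranked, while a cursor-style query that knows where the previous page ended only has to keep rows entries.

```python
import heapq


def page_with_offset(values, start, rows):
    """Offset paging: rank the top start+rows values, then discard `start`.

    Returns the page and the number of entries that had to be kept ranked.
    """
    top = heapq.nlargest(start + rows, values)
    return top[start:start + rows], len(top)


def page_with_cursor(values, rows, after=None):
    """Cursor-style paging: only values sorting after the previous page's
    last value compete, so just `rows` entries are kept ranked.

    Assumes distinct sort values (a real cursor needs a unique tie-breaker).
    """
    candidates = values if after is None else [v for v in values if v < after]
    top = heapq.nlargest(rows, candidates)
    return top, len(top)


values = list(range(1000))
offset_page, offset_tracked = page_with_offset(values, start=100, rows=10)
# 900 is the last value of the previous page (positions 90-99 of the ranking).
cursor_page, cursor_tracked = page_with_cursor(values, rows=10, after=900)
print(offset_tracked, cursor_tracked)
```

Both calls return the same ten values, but the offset version ranks 110 entries to do so and the cursor version ranks 10. The gap grows with `start`, which is why deep pages get expensive.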

Why are cursor mark queries recommended over regular start, rows combination?

2018-03-12 Thread S G
Hi, We have use-cases where some queries will return about 100k to 500k records. As per https://lucene.apache.org/solr/guide/7_2/pagination-of-results.html, it seems that using start=x, rows=y is a bad combination performance wise. 1) However, it is not clear to me why the alternative: "cursor-qu
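For context on what a cursor query looks like from the client side, here is a hedged sketch of the usual loop: start with `cursorMark=*`, sort on a clause that includes the unique key, pass each response's `nextCursorMark` into the next request, and stop when it comes back unchanged. The callable `query_page` is hypothetical and stands in for whatever HTTP client or Solr library is used; the unique key field name `id` and the query `*:*` are assumptions too.

```python
def fetch_all(query_page, rows=1000):
    """Collect every document by following cursor marks.

    query_page is a hypothetical callable: it takes a dict of request
    parameters and returns a dict shaped like a Solr JSON response.
    """
    cursor = "*"  # the documented starting value for a new cursor
    docs = []
    while True:
        resp = query_page({
            "q": "*:*",
            "sort": "id asc",  # assumes the unique key field is named "id"
            "rows": rows,
            "cursorMark": cursor,
        })
        docs.extend(resp["response"]["docs"])
        next_cursor = resp["nextCursorMark"]
        if next_cursor == cursor:
            # An unchanged cursor mark means there are no more results.
            break
        cursor = next_cursor
    return docs
```

Each request only asks for `rows` documents, so no single request has to rank a deep offset the way start=x, rows=y does.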