Re: solr as nosql - pulling all docs vs deep paging limitations

Chris Hostetter Wed, 18 Dec 2013 08:12:50 -0800

: 
: What about SELECT * FROM WHERE ... like misusing Solr? I'm sure you've been
: asked many times for that.
: What if a client doesn't need to rank results at all, but is just requesting
: an unordered filtered result like they are used to in an RDBMS?
: Do you feel it will never be considered a reasonable use case for Solr? Or
: is there a well-known approach for dealing with it?


If you don't care about ordering, then the approach I described (either 
using SOLR-5463, or just using a sort by uniqueKey with increasing 
range filters on the id) should work fine -- the fact that they come back 
sorted by id is just an implementation detail that makes it possible to 
batch the records (the same way most SQL databases will likely give you 
back the rows based on whatever primary key index you have).
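
A rough sketch of the "sort by uniqueKey with increasing range filters" 
loop, in Python. Everything named here is an assumption for illustration: 
the uniqueKey field is taken to be "id", the ids are taken to be strings, 
and fetch() is a hypothetical callable that sends the params to your 
/select handler and returns the list of docs. The filter uses an exclusive 
lower bound so each batch starts just past the last id already seen.

```python
from typing import Callable, Dict, Iterator, List, Optional

Doc = Dict[str, str]
# Hypothetical transport: takes Solr request params, returns the docs list.
Fetch = Callable[[Dict[str, str]], List[Doc]]


def build_params(query: str, last_id: Optional[str], rows: int,
                 key: str = "id") -> Dict[str, str]:
    """Build the params for one batch, sorted by the uniqueKey field."""
    params = {"q": query, "sort": f"{key} asc", "rows": str(rows)}
    if last_id is not None:
        # Quote the id so special characters are not parsed as syntax.
        escaped = last_id.replace("\\", "\\\\").replace('"', '\\"')
        # Exclusive lower bound: only ids strictly after the last one seen.
        params["fq"] = f'{key}:{{"{escaped}" TO *]'
    return params


def iter_docs(fetch: Fetch, query: str, rows: int = 100,
              key: str = "id") -> Iterator[Doc]:
    """Yield every matching doc, one batch of `rows` at a time."""
    last_id: Optional[str] = None
    while True:
        batch = fetch(build_params(query, last_id, rows, key))
        if not batch:
            return  # an empty batch means nothing is left past last_id
        yield from batch
        last_id = batch[-1][key]
```

Since every request starts at rows 0..N of a filtered result, the server 
never has to collect and skip a large offset, which is what makes deep 
paging with start=N expensive.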

I think the key difference between approaches like SOLR-5244 vs the cursor 
work in SOLR-5463 is that SOLR-5244 is really targeted at dumping all 
data about all docs from a core (matching the query) in a single 
request/response -- for something like SolrCloud, the client would 
manually need to hit each shard (but as I understand it from the 
description, that's kind of the point, it's aiming to be a very low level 
bulk export).  With the cursor approach in SOLR-5463, we do 
aggregation across all shards, we support arbitrary sorts, and you can 
control the batch size from the client and iterate over multiple 
request/responses of that size.  If there are any network hiccups, you can 
re-do a request.  If you process half the docs that match (in a 
particular order) and then decide "I've got all the docs I need for my 
purposes", you can stop requesting the continuation of that cursor.
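
To make the client side of the cursor approach concrete, here is a 
hedged sketch in Python. It assumes the request parameter is called 
cursorMark (starting at "*") and the response carries nextCursorMark, 
which are the names the SOLR-5463 work was later released with; the sort 
is assumed to include the uniqueKey field ("id" here) as a tie-breaker. 
fetch() is a hypothetical callable that sends the params and returns the 
parsed JSON response, and it is assumed to raise OSError on a network 
failure.

```python
from typing import Any, Callable, Dict, Iterator, Optional

# Hypothetical transport: takes request params, returns the parsed response.
Fetch = Callable[[Dict[str, str]], Dict[str, Any]]


def fetch_with_retry(fetch: Fetch, params: Dict[str, str],
                     attempts: int = 3) -> Dict[str, Any]:
    """Re-send the same request after a network error.

    The client holds the cursor value, so repeating a request with the
    same params asks for the same batch again.
    """
    for attempt in range(attempts):
        try:
            return fetch(params)
        except OSError:
            if attempt == attempts - 1:
                raise
    raise ValueError("attempts must be at least 1")


def iter_cursor(fetch: Fetch, query: str, sort: str = "id asc",
                rows: int = 100,
                max_docs: Optional[int] = None) -> Iterator[Dict[str, Any]]:
    """Yield docs batch by batch until the cursor stops advancing."""
    cursor = "*"  # the starting cursor value
    seen = 0
    while True:
        params = {"q": query, "sort": sort, "rows": str(rows),
                  "cursorMark": cursor}
        response = fetch_with_retry(fetch, params)
        for doc in response["response"]["docs"]:
            yield doc
            seen += 1
            if max_docs is not None and seen >= max_docs:
                return  # caller has enough; just stop asking for more
        next_cursor = response["nextCursorMark"]
        if next_cursor == cursor:
            return  # an unchanged cursor means the result set is exhausted
        cursor = next_cursor
```

Stopping early needs no cleanup call in this sketch: the loop simply 
stops sending requests, which matches the "stop requesting the 
continuation" behaviour described above.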



-Hoss
http://www.lucidworks.com/
