[jira] Commented: (SOLR-659) Explicitly set start and rows per shard for more efficient bulk queries across distributed Solr

Shalin Shekhar Mangar (JIRA) Fri, 20 Mar 2009 02:03:18 -0700

    [ https://issues.apache.org/jira/browse/SOLR-659?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12683803#action_12683803 ]


Shalin Shekhar Mangar commented on SOLR-659:
--------------------------------------------

If I understand this correctly, the patch makes bulk queries cheaper at the
expense of less precise scoring. But if I'm paging through some results and you
modify shards.start and shards.rows between requests, then I'll get
inconsistent results. Is that correct?

bq. The client will receive up to shards.rows * nShards results and should set 
rows accordingly. This makes bulk queries across distributed solr possible.

I do not understand that. Why will the client get more than rows results? Or by
client, did you mean the Solr server to which the initial request is sent?

> Explicitly set start and rows per shard for more efficient bulk queries 
> across distributed Solr
> -----------------------------------------------------------------------------------------------
>
>                 Key: SOLR-659
>                 URL: https://issues.apache.org/jira/browse/SOLR-659
>             Project: Solr
>          Issue Type: Improvement
>          Components: search
>    Affects Versions: 1.3
>            Reporter: Brian Whitman
>            Priority: Minor
>             Fix For: 1.4
>
>         Attachments: shards.start_rows.patch, SOLR-659.patch
>
>
> The default behavior of setting start and rows on distributed solr (SOLR-303) 
> is to set start at 0 across all shards and set rows to start+rows across each 
> shard. This ensures all results are returned for any arbitrary start and rows 
> setting, but during "bulk queries" (where start is incrementally increased 
> and rows is kept constant) the client needs finer control of the 
> per-shard start and rows parameters, because retrieving many thousands of 
> documents becomes intractable as start grows.
> Attaching a patch that creates a &shards.start and &shards.rows parameter. If 
> used, the logic that sets rows to start+rows per shard is overridden and each 
> shard gets the exact start and rows set in shards.start and shards.rows. The 
> client will receive up to shards.rows * nShards results and should set rows 
> accordingly. This makes bulk queries across distributed solr possible.
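
To make the arithmetic in the description concrete, here is a rough sketch of
the per-shard window computation as the issue text describes it. This is not
the patch's code: the function names and the Python form are hypothetical, and
it assumes shards.start and shards.rows can each be overridden independently,
which the description does not state.

```python
from typing import Optional, Tuple


def per_shard_window(start: int, rows: int,
                     shards_start: Optional[int] = None,
                     shards_rows: Optional[int] = None) -> Tuple[int, int]:
    """Return the (start, rows) pair sent to every shard.

    Hypothetical helper, not Solr's API. Default behavior per the issue
    description: start at 0 and rows = start + rows on each shard.
    If shards.start / shards.rows are given, they are used as-is.
    """
    shard_start = 0 if shards_start is None else shards_start
    shard_rows = start + rows if shards_rows is None else shards_rows
    return shard_start, shard_rows


def max_results(n_shards: int, shard_rows: int) -> int:
    """Upper bound on documents coming back from all shards combined."""
    return n_shards * shard_rows


# Default: deep page, every shard must return start + rows documents.
default_window = per_shard_window(start=10000, rows=100)

# Override: each shard returns exactly its own slice.
bulk_window = per_shard_window(start=0, rows=400,
                               shards_start=10000, shards_rows=100)

print(default_window, max_results(4, default_window[1]))
print(bulk_window, max_results(4, bulk_window[1]))
```

With four shards, the default window asks each shard for 10100 documents at
start=10000, while the override asks each for 100, giving at most
shards.rows * nShards = 400 results, which is why the description says the
caller should set rows accordingly.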

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
