[ 
https://issues.apache.org/jira/browse/SOLR-6121?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14073401#comment-14073401
 ] 

Hoss Man commented on SOLR-6121:
--------------------------------

bq. I don't think there's much to gain from forcing a user to specify a 
tiebreaker.

A good reason for forcing the user to explicitly include the uniqueKey in the 
sort string in order to use cursors occurred to me this morning while replying 
to a solr-user thread ("Subject: How to migrate content of a collection to a 
new collection"):

{panel}
It forces the user to be aware that Solr will be sorting on the uniqueKey. It's 
not hidden from them, and they are given an opportunity to consider the 
resulting RAM impacts.
{panel}

Since the memory usage of the FieldCache is certainly non-trivial on really 
large collections, making users explicitly aware that their uniqueKey needs to 
be in the sort puts them in a position to consider their memory needs and/or 
make an informed decision about enabling docValues on their uniqueKey field 
(something most people probably wouldn't ever need to do unless they normally 
have reasons to do range queries or sort on their uniqueKey field).

As things stand right now, if someone switches their usage from 
{{sort=foo+asc&start=0&rows=100}} to 
{{sort=foo+asc&start=0&rows=100&cursorMark=*}} they get a clear error that they 
need to consider how they want their uniqueKey field to be included in the 
sort.  If we change the code to silently rewrite their sort param to be 
{{sort=foo+asc,+id+asc}} internally w/o any obvious feedback to them, that 
could chew up a lot of RAM on them unexpectedly, and they could see really 
weird performance changes and/or OOM and think it's some inherent 
behavior/problem with using cursors.
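
To make the difference concrete, here is a minimal client-side sketch of the 
"fail loudly" behavior described above. It is not Solr's implementation: the 
helper names ({{sort_fields}}, {{cursor_query}}) are hypothetical, the 
uniqueKey is assumed to be named {{id}} as in the example sort string, and the 
parsing only handles simple "field direction" clauses (no function sorts).

```python
from urllib.parse import urlencode

UNIQUE_KEY = "id"  # assumption: the schema's uniqueKey field is named "id"


def sort_fields(sort):
    """Return the field names of a simple 'field dir, field dir' sort string."""
    return [clause.split()[0] for clause in sort.split(",") if clause.strip()]


def cursor_query(sort, rows=100, cursor_mark="*"):
    """Build the query string for a cursor request.

    Refuses any sort that lacks the uniqueKey instead of silently appending
    it, so the caller has to decide on the tiebreaker (asc vs desc) and is
    aware that Solr will be sorting on that field.
    """
    if UNIQUE_KEY not in sort_fields(sort):
        raise ValueError(
            f"cursor sort must include the uniqueKey field {UNIQUE_KEY!r}: {sort!r}"
        )
    return urlencode(
        {"sort": sort, "start": 0, "rows": rows, "cursorMark": cursor_mark}
    )


if __name__ == "__main__":
    # The caller spells out the tiebreaker, so the sort on "id" is visible.
    print(cursor_query("foo asc, id asc"))
    try:
        cursor_query("foo asc")
    except ValueError as err:
        print("rejected:", err)
```

The point of raising rather than rewriting is the same one made above: the 
person writing the request sees that the uniqueKey is part of the sort before 
any memory is spent on it.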


So i think i take back my "i've got no serious objections" comment, and instead 
vote "-1" to the idea.

I'd rather users get an error the first time they try and have to manually 
change things, than be frustrated/confused about why cursors (seemingly) 
degrade their overall system performance so much.

> cursorMark should accept sort without the uniqueKey
> ---------------------------------------------------
>
>                 Key: SOLR-6121
>                 URL: https://issues.apache.org/jira/browse/SOLR-6121
>             Project: Solr
>          Issue Type: Improvement
>            Reporter: David Smiley
>            Priority: Minor
>
> If you are using the cursorMark (deep paging) feature, you shouldn't *have* 
> to add the uniqueKey to the sort parameter.  If the user doesn't do it, the 
> user obviously doesn't care about the uniqueKey order relative to whatever 
> other sort parameters they may or may not have provided.  So if sort doesn't 
> have it, then Solr should simply tack it on at the end instead of providing 
> an error and potentially confusing the user.  This would be more user 
> friendly.
> Quoting Hoss from 
> [SOLR-5463|https://issues.apache.org/jira/browse/SOLR-5463?focusedCommentId=14011384&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14011384]:
> {quote}
> The reason the code currently throws an error was because i figured it was 
> better to force the user to choose which tie breaker they wanted (asc vs 
> desc) than to just magically pick one arbitrarily.
> If folks think a magic default is better i've got no serious objections – 
> just open a new issue.
> {quote}



--
This message was sent by Atlassian JIRA
(v6.2#6252)
