davisp commented on a change in pull request #1376: Add fields to sort based on
selector
URL: https://github.com/apache/couchdb/pull/1376#discussion_r196461467
##########
File path: src/mango/src/mango_idx_view.erl
##########
@@ -511,3 +512,45 @@ range_pos(Low, Arg, High) ->
max
end
end.
+
+
+can_use_sort(_Cols, [], _Selector) ->
+ true;
+can_use_sort(Cols, SortFields, Selector) ->
+ case lists:subtract(SortFields, Cols) of
+ [] ->
+ ConstantSort = add_constant_sort_fields(Cols, SortFields,
Selector),
+ lists:prefix(ConstantSort, Cols);
+ _ ->
+ false
+ end.
+
+
+% This is an user experience improvement. If a selector has a sort field set
+% then an index is only valid if the sort fields match the
+% prefix of the index fields.
+
+% e.g Index = [A, B, C] with Sort = [A, B] is a valid sort
+% but if Sort = [B, C] then it is not valid for this index.
+
+% If an indexed field in the selector is constant, eg {A: {$eq: 21}} then
+% we can add it to the sort list because it won't affect sorting and the
+% original sort will still be valid.
+
+% e.g Index = [A, B] with Sort = [B] and selector has {A: 1}.
+% Then we can make the Sort = [A, B].
+% The sort will work as expected and this will increase the possibility
+% of the index being choosen. It also helps a user where they might not have
+% put correct initial fields in the sort.
+
+% Currently we only add constant fields that are prefixes to the sort fields
+% set by the user. We considered adding in constant fields after sort fields
+% but were not 100% sure that it would not affect the sorting of the query.
+add_constant_sort_fields(Cols, SortFields, Selector) ->
Review comment:
This function still seems overly complicated for what we're checking.
Specifically how we're foldr'ing and relying on the subtlety that if a field
isn't constant that its dropped which should theoretically guarantee that the
prefix check will always fail. Also the fact that can_use_sort never checks
directly and instead is applying the optimization logic rather than only
checking if we can optimize once we find a need to.
Playing around a bit I came up with this recursive approach that I think at
least seems easier to grok the behavior.
```erlang
can_use_sort(_Cols, [], _Selector) ->
true;
can_use_sort([], SortFields, Selector) ->
false;
can_use_sort([Col | _] = Cols, [Col | _] = SortFields, _Selector) ->
lists:prefix(SortFields, Cols);
can_use_sort([Col | RestCols], SortFields, Selector) ->
case mango_selector:is_constant_field(Selector, Col) of
true -> can_use_sort(RestCols, SortFields, Selector);
false -> false
end.
```
Which translates to a fairly straight forward reasoning:
1. If no sort, we can use this
2. If we run out of index columns we can't use this index
3. If current column is the start of the sort, return if sort is a prefix
4. If the current column is constant, drop it and continue, else return false
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
[email protected]
With regards,
Apache Git Services