[
https://issues.apache.org/jira/browse/COUCHDB-495?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Damien Katz updated COUCHDB-495:
--------------------------------
Attachment: couch_perf.py
> Make views twice as fast
> ------------------------
>
> Key: COUCHDB-495
> URL: https://issues.apache.org/jira/browse/COUCHDB-495
> Project: CouchDB
> Issue Type: Improvement
> Components: JavaScript View Server
> Reporter: Chris Anderson
> Fix For: 0.11
>
> Attachments: binary_collate.diff, couch_perf.py, term_collate.diff
>
>
> Devs,
> Damien's identified view collation as the most significant bottleneck for
> view generation. We've done some testing, and some preliminary patches, and
> the upshot seems to be that even removing ICU from the collator is not a
> significant boost. What does speed things up greatly is using raw Erlang term
> comparison. E.g., using fun(A,B) -> A < B end instead of
> couch_view:less_json provides a roughly 2x speedup.
> However, the patch is challenging for a few reasons: Making the collation
> strategy switchable at all is tough. It's actually quite easy to get an
> alternate less function into the btree writer (all you've got to do is set it
> in couch_view_group:init_group). The hard part is propagating the same less
> function to the PassedEndFun. There's a secondary problem that when you use
> raw term comparison, a lot of terms turn out to come before nil, and after
> {}, which we use as artificial first and last terms in the less_json
> function. So just switching to raw collation alone will leave you with a view
> with unreachable rows.
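
To illustrate the sentinel problem described above, here is a minimal Python sketch that models Erlang's documented cross-type term order (number < atom < tuple < list < binary) for a handful of key types. The Atom class, type_rank, and raw_compare are hypothetical helpers written for this illustration and are not CouchDB code; Python tuples stand in for Erlang tuples, lists for lists, and plain strings for binaries.

```python
class Atom(str):
    """Stand-in for an Erlang atom (hypothetical helper for this sketch)."""


def type_rank(term):
    # Subset of Erlang's cross-type order: number < atom < tuple < list < binary.
    if isinstance(term, Atom):  # check before str, since Atom subclasses str
        return 1
    if isinstance(term, (int, float)):
        return 0
    if isinstance(term, tuple):
        return 2
    if isinstance(term, list):
        return 3
    if isinstance(term, str):  # plain str models an Erlang binary
        return 4
    raise TypeError(f"unsupported term: {term!r}")


def raw_compare(a, b):
    """Return -1, 0 or 1, approximating raw Erlang term comparison."""
    ra, rb = type_rank(a), type_rank(b)
    if ra != rb:
        return -1 if ra < rb else 1
    # Tuples are ordered by size first, then element by element.
    if isinstance(a, tuple) and len(a) != len(b):
        return -1 if len(a) < len(b) else 1
    if isinstance(a, (tuple, list)):
        for x, y in zip(a, b):
            c = raw_compare(x, y)
            if c:
                return c
        return (len(a) > len(b)) - (len(a) < len(b))
    return (a > b) - (a < b)


# The artificial first and last terms used by less_json: the atom nil and {}.
FIRST, LAST = Atom("nil"), ()

keys = [1, Atom("false"), Atom("null"), "abc", [1, 2], ("obj",)]

# Under raw ordering, only keys between the two sentinels are reachable.
reachable = [k for k in keys
             if raw_compare(FIRST, k) <= 0 and raw_compare(k, LAST) <= 0]
unreachable = [k for k in keys if k not in reachable]
```

In this model only the atom null falls between the two sentinels; numbers and the atom false sort before nil, while non-empty tuples, lists, and binaries sort after {}.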
> I tried two different approaches to the problem last night, and both of them
> led to (instructive) dead ends. I'll attach them for illustration purposes.
> The next line of attack we think should be tried is this:
> First - remove _all_docs_by_seq, as it is just adding complexity to the
> problem, and has been deprecated by _changes anyway. Along the same lines,
> _all_docs should no longer use couch_httpd_view:make_view_fold_fun, as it has
> completely different collation needs than regular views do. We'll end up
> duplicating a little code in the _all_docs implementation, but it should be
> worth it because it will make the other work much simpler.
> Once those changes have laid the groundwork, the next step is to change
> make_view_fold_fun and couch_view:fold, so that the btree, rather than
> make_view_fold_fun, is responsible for detecting when we've passed the
> endkey. That means make_passed_end_fun and all references to PassedEnd and
> PassedEndFun will be stripped from couch_httpd_view and moved to couch_btree.
> couch_view:fold (and the underlying btree) will need to accept not just a
> start, but also an endkey. This will make it much easier to use the less fun
> that is stored on View#view.btree#btree.less to determine PassedEnd funs.
> This will move some complexity to the btree code from the view code, but will
> keep the concerns more aligned. This also means that the btree will need to
> accept not only an endkey for folds, but also an inclusive_end parameter.
> Once we have all these refactorings done, it will be easy to make the less
> fun for an index configurable, as both the index writer and the index reader
> will look for it in the same place (on the #btree record).
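
The proposed shape can be sketched as follows. This is a rough Python model under stated assumptions, not the couch_btree API: the Index class, its insert and fold methods, and the collect helper are hypothetical names, and a sorted list stands in for the btree. The point it shows is that the writer and the reader both consult the same stored less function, and that the fold itself decides when the end key has been passed, honouring inclusive_end.

```python
class Index:
    """Toy ordered index; a sorted list stands in for the btree."""

    def __init__(self, less):
        self.less = less  # single source of truth for collation
        self.rows = []

    def insert(self, key, value):
        # Writer side: position the row using the stored less function.
        i = 0
        while i < len(self.rows) and not self.less(key, self.rows[i][0]):
            i += 1
        self.rows.insert(i, (key, value))

    def fold(self, fun, acc, start_key=None, end_key=None, inclusive_end=True):
        # Reader side: the same less function decides where to start and
        # when the end key has been passed.
        for key, value in self.rows:
            if start_key is not None and self.less(key, start_key):
                continue
            if end_key is not None:
                if inclusive_end:
                    passed_end = self.less(end_key, key)
                else:
                    passed_end = not self.less(key, end_key)
                if passed_end:
                    break
            acc = fun(key, value, acc)
        return acc


def collect(key, value, acc):
    # Fold function that accumulates the visited keys.
    return acc + [key]


def build(less):
    index = Index(less)
    for k in ["b", "A", "c", "D"]:
        index.insert(k, None)
    return index


raw = build(lambda a, b: a < b)                         # plain comparison
nocase = build(lambda a, b: a.lower() < b.lower())      # alternate collation

raw_range = raw.fold(collect, [], start_key="b", end_key="d")
nocase_inclusive = nocase.fold(collect, [], start_key="b", end_key="d")
nocase_exclusive = nocase.fold(collect, [], start_key="b", end_key="d",
                               inclusive_end=False)
```

Swapping the collation here only requires passing a different less function to the constructor; neither insert nor fold needs to know which one is in use.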
> My aim is to start a discussion and get someone excited to work on this
> patch. Think of all the fast-views glory you'll get! Please ask questions and
> otherwise force me to clarify the above discussion.