Stefan du Fresne created COUCHDB-3021:
-----------------------------------------

             Summary: Erlang filter performance is dependent on the 
string-length of the filter as opposed to code flow
                 Key: COUCHDB-3021
                 URL: https://issues.apache.org/jira/browse/COUCHDB-3021
             Project: CouchDB
          Issue Type: Bug
          Components: Database Core, Replication
            Reporter: Stefan du Fresne


It seems that filtered replication speed when using native Erlang filters is 
bound more to the byte length of the filter's source code than to the actual 
logic that is executed. This could (hopefully) be fixed by caching the result 
of evaluating the code instead of re-evaluating it for every document.

We have an Erlang filter used for replication that determines whether a 
document should replicate to a user based on some logic. This was originally 
written in JS ^[#1]^ but was converted to Erlang ^[#2]^ for performance. For 
reference, the JavaScript version took ~54 seconds, while the Erlang version 
took ~28 ^[#3]^.

In an attempt to make the Erlang even faster I 'compiled' the filter into a 
static list of users / metausers, so that the filter could effectively just 
check "is this user's id in the list of allowed users" instead of running 
recursive logic. To be safe, I kept the old code in the filter as a fallback 
in case the document didn't have the compiled permission list ^[#4]^.

Confusingly, even though this filter would have executed far less code, 
*filter performance remained the same*. It was only when I removed the fallback 
^[#5]^ that the filter got any faster: three times faster, at ~10 seconds.

To me this suggests that a large amount of the time is spent evaluating the 
string that represents the Erlang code in the engine, and that this eval 
parsing happens over and over again for each document.
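The hypothesis above (a per-document parse/eval whose cost scales with source length, independently of which path actually executes) can be illustrated in a language-agnostic way. This Python sketch is only an analogy for the suspected behaviour, not CouchDB's actual filter-server code path; the padded "dead code" plays the role of the unused fallback:

```python
import time

# A tiny "filter" plus padding that is never executed, mimicking
# a dead fallback branch that still has to be parsed every time.
small_src = "result = doc['user'] in allowed"
large_src = small_src + "\n" + "\n".join(
    f"def _unused_{i}():\n    pass" for i in range(300)
)

def run_filter(src, docs, allowed):
    out = []
    for doc in docs:
        # Re-parse/compile the source for EVERY document, as the
        # filter server appears to do: the parse cost is paid per doc.
        code = compile(src, "<filter>", "exec")
        env = {"doc": doc, "allowed": allowed}
        exec(code, env)
        out.append(env["result"])
    return out

docs = [{"user": f"u{i}"} for i in range(100)]
allowed = {"u1", "u2"}

t0 = time.perf_counter(); run_filter(small_src, docs, allowed)
t_small = time.perf_counter() - t0
t0 = time.perf_counter(); run_filter(large_src, docs, allowed)
t_large = time.perf_counter() - t0
# Both variants execute the same single line per document, yet the
# longer source is measurably slower because parsing dominates.
```

Both variants return identical results for every document; only the amount of text handed to the parser differs, which matches the ~80s vs ~9.5s spread observed above.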

To prove this I created a version of the compiled-with-fallback ^[#4]^ filter 
padded with a bunch of extra Erlang that I knew would never actually get 
executed ^[#6]^, as well as a version with all the comments etc. removed to 
make it as small as possible ^[#7]^. As expected, the long version was much 
slower (~80 seconds) and the smaller version slightly faster (~9.5 seconds), 
even though their code execution paths had not changed.

To me it seems like CouchDB should cache the process of loading this code as a 
string from the ddoc and injecting it into the Erlang process, so that you 
only take this hit once; the hit appears to be very significant (i.e. it 
dominates the cost of the filter logic itself).
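A minimal sketch of the suggested fix: key a cache of compiled filter code on a hash of its source, so the parse cost is paid once per filter rather than once per document. Again a Python analogy, not CouchDB internals; the cache name and helper are hypothetical:

```python
import hashlib

# Hypothetical cache: source hash -> compiled code object.
FILTER_CACHE = {}

def get_compiled(src):
    key = hashlib.sha256(src.encode()).hexdigest()
    code = FILTER_CACHE.get(key)
    if code is None:
        # Parse/compile once; every later document reuses the result.
        code = compile(src, "<filter>", "exec")
        FILTER_CACHE[key] = code
    return code

def run_filter_cached(src, docs, allowed):
    results = []
    code = get_compiled(src)  # hoisted out of the per-document loop
    for doc in docs:
        env = {"doc": doc, "allowed": allowed}
        exec(code, env)
        results.append(env["result"])
    return results
```

With this shape, filter cost tracks the logic actually executed per document, and source length (comments, dead fallback branches) only affects the first invocation.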

{anchor:1} ^1^ 
https://github.com/medic/medic-webapp/blob/filter_experiments_wip/lib/filters.js#L16
 -- it is not necessary to understand what this and the subsequent code links 
do, only how they differ with respect to performance
{anchor:2} ^2^ 
https://github.com/medic/medic-webapp/blob/filter_experiments_wip/ddocs/erlang_filters/filters/doc_by_place_live.erl
{anchor:3} ^3^ All performance measurements are over the same 10k documents
{anchor:4} ^4^ 
https://github.com/medic/medic-webapp/blob/filter_experiments_wip/ddocs/erlang_filters/filters/doc_by_place_backup.erl
{anchor:5} ^5^ 
https://github.com/medic/medic-webapp/blob/filter_experiments_wip/ddocs/erlang_filters/filters/doc_by_place.erl
{anchor:6} ^6^ 
https://github.com/medic/medic-webapp/blob/filter_experiments_wip/ddocs/erlang_filters/filters/doc_by_place_backup_extra_long.erl
{anchor:7} ^7^ 
https://github.com/medic/medic-webapp/blob/filter_experiments_wip/ddocs/erlang_filters/filters/doc_by_place_fast.erl



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
