Hi CouchDB users,
Here is my situation:
I'm using CouchDB 1.3.1 for my web app. Each user does a long-poll on
_changes to exchange messages and other information on a shared database. A
user can load a lot of docs at once using a server-side file loader. I set up
a filter on _changes to include only "message" docs. It works, but it's
much slower than expected. All my views / filters are written in Erlang.
For a simple database with *only* 56K documents, here is the performance I see
on localhost:
  Build a view indexing all docs (from scratch): ~6 s
  Query _changes with no filter, include_docs=true, since=0: ~10 s
    (it's basically downloading all the docs from the database)
  Query _changes with a simple filter, include_docs=true, since=0: ~47 s
    (and it returns no docs)
It grows roughly linearly with doc count. With 112K docs:
  Build a view indexing all docs (from scratch): ~10 s
  Query _changes with no filter, include_docs=true, since=0: ~22 s
  Query _changes with a simple filter, include_docs=true, since=0: ~95 s
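To make the scaling easier to compare, here is a small sketch that turns the timings above into docs-per-second rates. It assumes "56K" and "112K" mean 56,000 and 112,000 documents; the function and variable names are only illustrative.

```python
# Timings copied from the measurements above, in seconds.
# Assumption: "56K" and "112K" are read as 56,000 and 112,000 documents.
TIMINGS = {
    56_000: {"view_build": 6, "changes_unfiltered": 10, "changes_filtered": 47},
    112_000: {"view_build": 10, "changes_unfiltered": 22, "changes_filtered": 95},
}


def docs_per_second(doc_count, seconds):
    """Average number of documents processed per second."""
    return doc_count / seconds


def throughput_table(timings):
    """Map each doc count to the docs/s rate of every measured operation."""
    return {
        count: {op: docs_per_second(count, secs) for op, secs in ops.items()}
        for count, ops in timings.items()
    }


if __name__ == "__main__":
    for count, rates in throughput_table(TIMINGS).items():
        for op, rate in rates.items():
            print(f"{count:>7} docs  {op:<20} {rate:8.0f} docs/s")
```

With these numbers the filtered feed stays at roughly 1,200 docs/s for both database sizes, which is consistent with linear growth, and it is about 4 to 5 times slower than the unfiltered feed.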
My filter is fairly simple:
"fun({Doc}, {Req}) -> DocType = proplists:get_value(<<\"type__\">>, Doc), case
DocType of <<\"message\">> -> true; _ -> false end end."
The view is a bit more complicated:
"fun({Doc}) -> DocTrashed = proplists:get_value(<<\"trashed__\">>, Doc), case
DocTrashed of true -> ok; _ -> Emit([proplists:get_value(<<\"created_at__\">>,
Doc)], null) end end."
My filter used to be written in JS, and I did see a big improvement after
switching to Erlang: for 56K docs, the JS version took more than 2 minutes.
I do take advantage of since=the_last_seq to minimize the wait. But there are
situations where, say, 2 users are connected (both listening to _changes on
the same db) and one of them loads a file. The other user will then "hang"
for a while, unable to receive messages.
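For reference, the client-side pattern described above looks roughly like the sketch below: build a filtered long-poll URL starting at the last seen sequence, then read last_seq from the response for the next request. The host, the database name "shared" and the filter name "app/messages" are placeholders, not the real names from my setup.

```python
import json
from urllib.parse import urlencode


def changes_url(base_url, db, since, filter_name=None, include_docs=True):
    """Build a long-poll _changes URL that resumes from `since`."""
    params = {"feed": "longpoll", "since": since}
    if filter_name is not None:
        # Filter is referenced as "<design doc name>/<filter name>".
        params["filter"] = filter_name
    if include_docs:
        params["include_docs"] = "true"
    query = urlencode(params, safe="/")
    return f"{base_url}/{db}/_changes?{query}"


def next_since(response_body):
    """Extract last_seq from a _changes response to use as the next `since`."""
    return json.loads(response_body)["last_seq"]


if __name__ == "__main__":
    # Hypothetical names: adjust host, db and filter to your own setup.
    print(changes_url("http://localhost:5984", "shared", 0, "app/messages"))
```

Even with since set to the latest sequence, the filter still has to run over every doc written after that point, so a bulk load by one user delays the next response for everyone else polling with that filter.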
Is this the expected performance? Are there any performance tweaks or config
settings I should try?
Thanks.