Hi all,
I've been using CouchDB to track the results of testing Firefox and have
found that, as the database and view sizes have increased, CouchDB is
becoming less and less viable as a solution going forward. I don't wish
to switch to a different database at this time but may not have a choice.
Let me say that I have looked at Jira and found others with similar
issues, although those issues have mostly been resolved as invalid or
already fixed. I do admit that I have a hard time navigating Jira, so it
is entirely possible I've missed issues that have already been filed. I
am not sending this email in the threatening fashion I've seen many
times in Bugzilla, where a user says "Fix this or I'm leaving!", but as a
plea for some help in finding, filing or fixing the appropriate Jira
issues which need attention.
My database currently has a compacted size of about 37G and contains a
bit over 9 million documents. You can see examples of the view documents
in the error log I attached to
<https://issues.apache.org/jira/browse/COUCHDB-970>.
I am currently using CouchDB 1.0.1 on a 64-bit CentOS 5 VM with 2 CPUs
and 4G RAM, running Erlang R14B and configured to use the 64-bit js-devel
libraries. I temporarily tried the CouchDB 1.0.x branch to pick up the
fix for <https://issues.apache.org/jira/browse/COUCHDB-926>, which was
causing me problems, but had to revert to 1.0.1 due to crashes upon view
compaction completion.
Currently, my main issues are:
Slow View generation: Recreating views from scratch is very slow. It can
take me over 24 hours to recreate some of the larger views. Combined
with the need to immediately compact them (see Large Initial View sizes),
recreating views can take my application offline for users for more than
a day. Switching to 1.0.x and back, and having to regenerate views after
out-of-space conditions, has left my application unavailable for most of
a week.
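
For anyone wanting to reproduce the compaction step, here is a minimal
sketch of how a view compaction request can be built in Python. CouchDB
compacts the views of one design document via a POST to
/<db>/_compact/<design-doc-name>. The database name "results" and the
design document name "summary" are hypothetical placeholders, not the
names used in my deployment, and the request is only constructed here,
not sent.

```python
import urllib.parse
import urllib.request


def view_compaction_request(base_url, db_name, design_doc):
    """Build (but do not send) a POST request that asks CouchDB to
    compact the view index of one design document."""
    path = "/{}/_compact/{}".format(
        urllib.parse.quote(db_name, safe=""),
        urllib.parse.quote(design_doc, safe=""),
    )
    return urllib.request.Request(
        base_url.rstrip("/") + path,
        data=b"",
        # CouchDB expects a JSON content type on this POST.
        headers={"Content-Type": "application/json"},
        method="POST",
    )


if __name__ == "__main__":
    # "results" and "summary" are hypothetical names for illustration.
    req = view_compaction_request("http://localhost:5984", "results", "summary")
    print(req.get_method(), req.full_url)
```

Passing the returned object to urllib.request.urlopen() would actually
submit the request to a running server.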
Large Initial View sizes: Several of my views are initially created with
sizes which are 10 to 20 times the size of the compacted view. For
example, I have one view which can take 95G when initially created but
uses less than 5G when compacted. This has caused several
out-of-disk-space conditions when I've had to regenerate views for the
database. I know commodity disks are relatively cheap these days, but
due to my current hosting environment I am using relatively expensive
networked storage. Asking for sufficient storage for my expected
database size was difficult enough, but asking for 10 or more times that
amount just to deal with temporary explosive view sizes is probably a
non-starter.
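
To put rough numbers on the storage problem, here is a small
back-of-the-envelope sketch in Python using the figures above (37G
database, 95G initial view, roughly 5G compacted view). It assumes,
without my having verified it against the CouchDB source, that the
uncompacted index and its compacted copy both sit on disk until
compaction completes. The function names are made up for illustration.

```python
def bloat_factor(initial_gb, compacted_gb):
    """Ratio of a freshly built view's size to its compacted size."""
    if compacted_gb <= 0:
        raise ValueError("compacted size must be positive")
    return initial_gb / compacted_gb


def peak_disk_gb(db_gb, initial_view_gb, compacted_view_gb):
    """Rough peak disk use while one view is rebuilt and then compacted.

    Assumption: the uncompacted index and the compacted copy coexist
    on disk until compaction finishes.
    """
    return db_gb + initial_view_gb + compacted_view_gb


if __name__ == "__main__":
    print(bloat_factor(95, 5))      # initial vs. compacted view size
    print(peak_disk_gb(37, 95, 5))  # database + both copies of the view
```

With these inputs the single large view alone pushes peak usage to
several times the compacted database size, and that is before counting
any other views being rebuilt at the same time.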
CouchDB 1.0.x: My attempt to use the 1.0.x branch failed because CouchDB
crashed immediately upon view compaction completion, which caused the
views to begin indexing from scratch.
I would appreciate it if you would let me know whether some of these are
known issues which have already been filed in Jira, or whether it would
be helpful to file new issues, and what additional information I can
provide to help get these issues resolved.
I can also help make newer releases of SpiderMonkey 1.7 available and
help get SpiderMonkey 1.8 and later released, if that would help with
the JavaScript performance issues CouchDB may be facing.
bc