On Tue, Dec 21, 2010 at 4:55 AM, Bob Clary <[email protected]> wrote:
> Hi all,
>
> I've been using CouchDB to track the results of testing Firefox and have
> found that as the database and view sizes have increased CouchDB is becoming
> less and less viable as a solution going forward. I don't wish to switch to
> a different database at this time but may not have a choice.
>
> Let me say that I have looked at Jira and found others with similar issues
> although issues have mostly been resolved as invalid or already fixed. I do
> admit that I have a hard time navigating Jira, so it is entirely possible
> I've missed already filed issues. I am not sending this email in a
> threatening fashion that I've seen many times in bugzilla where a user says
> "Fix this or I'm leaving!", but in a plea for some help in finding, filing
> or fixing the appropriate Jira issues which need attention.
>
> My database currently has a compacted size of about 37G and contains a bit
> over 9 million documents. You can see examples of the view documents in the
> error log I attached to <https://issues.apache.org/jira/browse/COUCHDB-970>.
>

The immediate thing you could do would be to use BigCouch. Even if you're using multiple BigCouch nodes on a single machine it should still help you with initial file sizes and view indexing times.

> I am currently using CouchDB 1.0.1 on Centos5 64bit vm with 2CPU and 4G RAM
> running Erlang R14B and configured to use the 64bit js-devel libraries. I
> temporarily tried to use CouchDB 1.0.x to pick up the fix for
> <https://issues.apache.org/jira/browse/COUCHDB-926> which was causing me
> problems but had to revert to 1.0.1 due to crashes upon view compaction
> completion.
>
> Currently, my main issues are:
>
> Slow View generation: Recreating views from scratch is very slow. It can
> take me over 24 hours to recreate some of the larger views. Combined with
> the need to immediately compact them (see Large Initial View sizes)
> recreating views can take my application offline for users for more than a
> day. Trying to switch to 1.0.x and back and having to regenerate views after
> out of space conditions has led to my application being unavailable for most
> of a week.
>

View generation is definitely slower than I'd like. Again, in the immediate short term, a switch to BigCouch will help you here because you can rebuild parts of a view independently, which will help with time and disk space.

> Large Initial View sizes: Several of my views are initially created with
> sizes which are 10-20 times the size of the compacted view. For example, I
> have one view which when initially created can take 95G but when compacted
> uses less than 5G. This has caused several out of disk space conditions when
> I've had to regenerate views for the database. I know commodity disks are
> relatively cheap these days, but due to my current hosting environment I am
> using relatively expensive networked storage.
> Asking for sufficient storage
> for my expected database size was difficult enough, but asking for 10 or
> more times that amount just to deal with temporary explosive view sizes is
> probably a non-starter.
>

How do you have your views laid out? Remember that a design document is indexed all at once in a single file, so it's possible you could get speedups and smaller files by splitting your views across multiple design docs.

Also, in 1.0.1 you should have the ability to create a view before using it. I.e., you create the _design doc with a random id, build its views, then rename it to its final destination.

Also, depending on your reductions, it's best to use the built-in reductions if you can.

> CouchDB 1.0.x: My experience with attempting to use the 1.0.x branch was a
> failure due to the crashing immediately upon view compaction completion
> which caused the views to begin indexing from scratch.
>

This is a serious unreported bug. Please add any crash logs to Jira so we can figure out what's going on here.

> I would appreciate it if you would let me know if some of these are known
> issues which have already been filed in Jira or if it would be helpful to
> file new issues and what additional information I can provide to help get
> these issues resolved.
>
> I can also help in making newer releases of SpiderMonkey 1.7 available and
> to help get SpiderMonkey 1.8 and later released if that will help the
> JavaScript performance issues CouchDB may be facing.
>

I think you'll definitely notice a change with that upgrade. The more complicated your views are, the more of an impact it should have.

> bc
>
>

HTH,
Paul Davis
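
A minimal sketch of the design-doc split and built-in reductions suggested above, written in Python using only the standard library. It only builds the JSON documents and does not talk to a CouchDB server. The design doc id `_design/results`, the view names, the document fields (`branch`, `test`, `duration`) and the helper `split_design_doc` are all hypothetical and not taken from the thread; `_count` and `_sum` are CouchDB's built-in reduce functions.

```python
import json


def split_design_doc(ddoc):
    """Return one design document per view found in `ddoc`.

    Because each design document is indexed into its own file, giving
    every view its own design doc lets the views be built and compacted
    independently of one another.
    """
    base_id = ddoc["_id"]  # e.g. "_design/results"
    language = ddoc.get("language", "javascript")
    split = []
    # Sort by view name so the output order is deterministic.
    for name, view in sorted(ddoc.get("views", {}).items()):
        split.append({
            "_id": f"{base_id}_{name}",
            "language": language,
            "views": {name: dict(view)},
        })
    return split


# Hypothetical combined design document (names and fields are assumptions).
combined = {
    "_id": "_design/results",
    "language": "javascript",
    "views": {
        "by_branch": {
            "map": "function(doc) { if (doc.branch) emit(doc.branch, 1); }",
            # Built-in reduce: counts rows without a JavaScript reduce function.
            "reduce": "_count",
        },
        "by_test": {
            "map": "function(doc) { if (doc.test) emit(doc.test, doc.duration); }",
            # Built-in reduce: sums the emitted numeric values.
            "reduce": "_sum",
        },
    },
}

if __name__ == "__main__":
    for ddoc in split_design_doc(combined):
        print(json.dumps(ddoc, indent=2))
```

Each resulting document would be saved to the database separately. Clients would then query the new design doc paths, so this is a trade-off between URL stability and independent indexing.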
