Hi,

We're running CouchDB in production and currently store around 800K records in it. Lately view performance has become a limiting factor, especially when creating new views or changing existing ones (which essentially means building a new view).

However, we are currently using 56-byte _id fields, which I've come to realize was a bad choice. So I've run a few tests with smaller _id fields and decided to post the results here.

Unfortunately we cannot use the UUIDs assigned by CouchDB, as we rely on the _id field to detect duplicate records (this is somewhat inherent in the way we collect distributed information; duplicates don't occur particularly often, but the check is definitely needed). Our data is also somewhat heterogeneous, and we often generate view keys based on different data items in the records, including the actual data values (so a relational database is a somewhat poor fit for us).

I've done tests with 56-, 22-, and 12-byte _id fields. The initial tests were done with CouchDB 0.10.0 on Karmic. I've tried 0.11 as well (more on that later in this mail). 4-byte _id fields are not really possible for us, as we would have a significant chance of getting different records with the same _id. 8 bytes should be possible, but wasn't tested.

Test 1: Insert 70k records into the database (inserted in the same order), in chunks of 100, and measure the db size.

Results:
  56 bytes   207.0 MB
  22 bytes   175.6 MB
  12 bytes   165.8 MB

After compaction:
  56 bytes   146.7 MB
  22 bytes   125.8 MB
  12 bytes   120.0 MB

Test 2: Construct a simple view over the data.

Results:
  56 bytes   73 MB
  22 bytes   54 MB
  12 bytes   47 MB

After compaction:
  56 bytes   19 MB
  22 bytes   14 MB
  12 bytes   12 MB

Test 3: Time for constructing a temporary view.

  56 bytes   70 seconds
  22 bytes   57 seconds
  12 bytes   53 seconds

In short, smaller _id fields provide a nice space reduction and save a bit of time, but don't make view building significantly faster.

I built the current branch of 0.11 on Karmic, as collation performance should have improved with that. I only redid the 12-byte _id tests.
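
As a side note on the 0.10 numbers above: the relative savings can be worked out with a few lines of Python. This is only a sketch for the arithmetic, not part of the measurements. The names `MEASUREMENTS` and `reduction` are made up for illustration, and the figures are copied from the 0.10.0 results in this mail.

```python
# Figures copied from the 0.10.0 tests in this mail, keyed by _id length in bytes.
# MEASUREMENTS and reduction() are illustrative names, not anything from CouchDB.
MEASUREMENTS = {
    "db size (MB)": {56: 207.0, 22: 175.6, 12: 165.8},
    "db size after compaction (MB)": {56: 146.7, 22: 125.8, 12: 120.0},
    "view size (MB)": {56: 73.0, 22: 54.0, 12: 47.0},
    "view size after compaction (MB)": {56: 19.0, 22: 14.0, 12: 12.0},
    "temporary view build time (s)": {56: 70.0, 22: 57.0, 12: 53.0},
}


def reduction(baseline, value):
    """Return the saving of `value` relative to `baseline`, in percent."""
    return 100.0 * (baseline - value) / baseline


# Compare the 22- and 12-byte runs against the 56-byte baseline.
for name, by_id_len in MEASUREMENTS.items():
    baseline = by_id_len[56]
    for id_len in (22, 12):
        saved = reduction(baseline, by_id_len[id_len])
        print(f"{name}: {id_len}-byte _id saves {saved:.1f}% vs 56-byte")
```

On these numbers, going from 56 to 12 bytes saves roughly 18-20% of the database file, about 36-37% of the view index, and about 24% of the temporary view build time.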

0.11 results (12-byte _id fields only):

Test 1:
  After initial insert:   151.3 MB (a bit smaller than 0.10)
  After compaction:       120.0 MB (same as 0.10)

Test 2:
  Initial view build size: 153 MB (quite a lot more than 0.10)
  After compaction:        12 MB (same as 0.10)

Test 3:
  Time for constructing a temporary view: 121 seconds (more than twice that of 0.10)

Does anyone have an idea of what could be wrong? The increased view build time worries me in particular, as I was hoping 0.11 could provide a much-needed performance boost for us.

Please CC any replies, as I am not subscribed.

--
- Henrik
