Following up on my own email, this seems to be an issue with snappy on Windows Server 2008. When I changed the file_compression setting to deflate_6, the "large" databases went down from 7 GB to 1 GB after compaction. I'm not entirely sure whether this counts as a bug, so I won't raise an issue for it.

By the way: kudos to whoever wrote the code that deals with file_compression. When I changed file_compression to deflate_6, the system happily worked with the existing, supposedly snappy-compressed databases, and converted them to the new format on the next compaction. That could have gone wrong in several ways, but didn't, so thank you.
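For anyone wanting to try the same change without editing the ini file by hand, here is a minimal sketch using CouchDB 1.x's HTTP config and compaction endpoints. The base URL and the helper names (config_request, compact_request, send) are assumptions for illustration, and a server with admin users configured would also need credentials, which are not shown.

```python
import json
import urllib.request

# Assumption: CouchDB is listening on its default local address.
BASE_URL = "http://127.0.0.1:5984"


def config_request(value, base_url=BASE_URL):
    """Build a PUT that sets [couchdb] file_compression, e.g. "deflate_6"."""
    return urllib.request.Request(
        f"{base_url}/_config/couchdb/file_compression",
        data=json.dumps(value).encode("utf-8"),  # body is a JSON string
        method="PUT",
    )


def compact_request(db_name, base_url=BASE_URL):
    """Build a POST that asks CouchDB to compact the named database."""
    return urllib.request.Request(
        f"{base_url}/{db_name}/_compact",
        data=b"",
        headers={"Content-Type": "application/json"},  # required by _compact
        method="POST",
    )


def send(request):
    """Send a prepared request and return the HTTP status code."""
    with urllib.request.urlopen(request) as response:
        return response.status
```

Calling send(config_request("deflate_6")) followed by send(compact_request("hydra")) would mirror the steps described above. Compaction runs in the background, so the size only drops once compact_running goes back to false.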
Nick

On 15 May 2012 13:55, Nick North <[email protected]> wrote:
> I'm curious about the size of replicated CouchDB databases in comparison
> to each other. I have four databases, each with pull replications from the
> other three, but they report quite different data sizes. Two of them say:
>
> {"db_name":"hydra","doc_count":1489060,"doc_del_count":2754893,"update_seq":6998882,"purge_seq":0,"compact_running":false,"disk_size":3213656193,"data_size":1395943755,"instance_start_time":"1337067567481841","disk_format_version":6,"committed_update_seq":6998882}
>
> While the other two say this - note the difference in data_size:
>
> {"db_name":"hydra","doc_count":1489441,"doc_del_count":2755302,"update_seq":4375865,"purge_seq":0,"compact_running":false,"disk_size":7599413027,"data_size":7265993199,"instance_start_time":"1337014746154865","disk_format_version":6,"committed_update_seq":4375865}
>
> (There is some discrepancy in the doc_count because new documents are being
> posted continuously, and some arrived between fetching stats for the
> various instances.) Other possibly relevant information:
>
> - All the replications appear to be in working order, so I don't believe
>   there is a backlog of documents waiting to be replicated.
> - The database has just one design view, and whether or not it has been
>   queried does not seem to make any difference to whether the database is
>   "large" or "small".
> - Compaction makes little difference, in that the "large" instances always
>   remain much larger than the "small" ones.
> - Everything is running CouchDB 1.2 on Windows: the "small" instances on
>   Windows 7 and Windows Vista, and the "large" ones on Windows Server 2008.
> - file_compression is set to "snappy" in all cases and there are no
>   attachments anywhere.
>
> Can anyone suggest what might be going on here? My best guess is that it's to
> do with file compression on Windows Server, but that is a guess, so I'm
> intending to do some experimentation with the other file compression options.
> I'd be grateful for any thoughts, as I'm planning out disk requirements for a
> system with ten times the capacity of the current one, and would very much
> like to do that with some certainty about file sizes. Thanks in advance
> for any help,
>
> Nick North
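To put numbers on the difference between the two kinds of instance, here is a small sketch that compares the disk_size and data_size figures from the two stats documents quoted above. The stats are trimmed to the fields used, and the names SMALL, LARGE and summarize are illustrative only.

```python
import json

# Trimmed copies of the two stats documents quoted above.
SMALL = json.loads(
    '{"db_name":"hydra","doc_count":1489060,'
    '"disk_size":3213656193,"data_size":1395943755}'
)
LARGE = json.loads(
    '{"db_name":"hydra","doc_count":1489441,'
    '"disk_size":7599413027,"data_size":7265993199}'
)


def summarize(info):
    """Return sizes in GiB plus the share of the file that is live data."""
    gib = 2 ** 30
    return {
        "data_gib": info["data_size"] / gib,
        "disk_gib": info["disk_size"] / gib,
        # Fraction of the file on disk that is live data; the rest is
        # space that compaction could reclaim.
        "live_fraction": info["data_size"] / info["disk_size"],
    }


small, large = summarize(SMALL), summarize(LARGE)
# How much more space the same documents take on a "large" instance.
data_ratio = LARGE["data_size"] / SMALL["data_size"]

print(f"small: {small}")
print(f"large: {large}")
print(f"data_size ratio (large/small): {data_ratio:.2f}")
```

The ratio of the two data_size values, rather than disk_size, is the useful figure for disk planning here, since data_size is what remains after compaction.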
