> In a recent benchmark: Inserting 100b docs into an empty database:
> ~6200 docs/s. Inserting the same docs into a 90,000,000-doc db:
> 6000 docs/s (with sequential ids). Data scales.
This is very interesting - one of the applications I'm thinking of has a
profile just like this (a warehouse for RADIUS accounting records).
I have a couple of questions relating to this.
- To get such high performance, is it necessary to use _bulk_docs, or was
it achieved with regular PUT operations?
- Does CouchDB commit its data to stable storage *before* returning an HTTP
response? That is, once you receive an HTTP success response, can you be
sure that the data has already hit the disk?
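For concreteness, here is a sketch of the two write shapes I'm asking about: one document per PUT versus many documents per _bulk_docs POST. The payload shapes follow CouchDB's HTTP API as I understand it; the database name and base URL are made up, and the functions only build the requests rather than sending them:

```python
import json

# Hypothetical server and database name, purely for illustration.
BASE = "http://localhost:5984/radius"

def single_put(doc_id, doc):
    """One HTTP request per document: PUT /db/{id} with the doc as the body."""
    return ("PUT", f"{BASE}/{doc_id}", json.dumps(doc))

def bulk_post(docs):
    """One HTTP request for many documents: POST /db/_bulk_docs with a
    {"docs": [...]} wrapper around the list of documents."""
    return ("POST", f"{BASE}/_bulk_docs", json.dumps({"docs": docs}))

record = {"framed_ip_address": "192.168.1.1", "acct_session_id": "abc123"}
print(single_put("rec-0001", record))
print(bulk_post([record] * 3))
```

The question is whether the quoted 6000 docs/s needed the second shape, or whether the first shape achieves it too.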
If CouchDB can handle 6,000 individual PUT requests per second *and* only
responds after they are committed to stable storage, then I think it must be
batching the writes and hence delaying the responses somewhat. (By how much?
I couldn't see a tunable parameter for this.)
However, if this performance is only achievable using _bulk_docs, I'll have
to write my RADIUS server / CouchDB client to perform its own local
batching and POST those batches a few times per second.
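If that local batching does turn out to be necessary, the accumulator can be quite small. A minimal sketch, with invented names and thresholds; in a real client the flush callback would POST the list to _bulk_docs:

```python
import time

class Batcher:
    """Accumulate records and flush when either a size or an age
    threshold is reached -- e.g. a few hundred docs or 250 ms."""

    def __init__(self, flush, max_docs=500, max_age=0.25, clock=time.monotonic):
        self.flush_fn = flush      # called with the list of pending docs
        self.max_docs = max_docs
        self.max_age = max_age
        self.clock = clock         # injectable, which makes testing easy
        self.pending = []
        self.oldest = None         # arrival time of the oldest pending doc

    def add(self, doc):
        if not self.pending:
            self.oldest = self.clock()
        self.pending.append(doc)
        if (len(self.pending) >= self.max_docs
                or self.clock() - self.oldest >= self.max_age):
            self.flush()

    def flush(self):
        if self.pending:
            self.flush_fn(self.pending)
            self.pending = []
            self.oldest = None
```

On top of this, a periodic timer would call flush() so that a trickle of records never waits longer than max_age before being written.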
I presume that batching also affects disk space used (before compaction) - I
wouldn't want each 200-byte RADIUS record taking up 4 KB :-)
Final question: does CouchDB perform any gzip-like compression when writing
the JSON to disk? These 200-byte RADIUS records will become a lot larger
when expanded into verbose JSON:
{
  "framed_ip_address": "192.168.1.1",   // 6 bytes in original packet
  ... etc
}
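To get a feel for how much of that JSON expansion is recoverable, the standard zlib module gives a quick estimate (the field names below are illustrative, not a real record layout):

```python
import json
import zlib

# A verbose JSON rendering of a small accounting record (made-up fields).
record = {
    "framed_ip_address": "192.168.1.1",
    "acct_session_id": "0123456789abcdef",
    "acct_input_octets": 123456,
    "acct_output_octets": 654321,
    "nas_ip_address": "10.0.0.1",
}
raw = json.dumps(record).encode()
packed = zlib.compress(raw)
# The repetitive key names compress well even within one record; across
# millions of near-identical records a shared dictionary would do better.
print(len(raw), "->", len(packed))
```

Of course, whether anything like this happens server-side is exactly my question.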
Regards,
Brian.