> In a recent benchmark: inserting 100b docs into an empty database:
> ~6200 docs/s. Inserting the same docs in a 90 000 000 doc db:
> 6000 docs/s (with sequential ids). Data scales.

This is very interesting - one of the applications I'm thinking of has a
profile just like this (a warehouse for RADIUS accounting records).

I have a couple of questions relating to this.

- To get such high performance, is it necessary to use _bulk_docs, or was
  it achieved with regular PUT operations? (The two paths are sketched
  after this list.)

- Does CouchDB commit its data to stable storage *before* returning an HTTP
  response? That is, once you receive an HTTP success response, can you be
  sure that the data has already hit the disk?
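
For concreteness, here's the distinction I'm drawing in the first question,
as a rough Python sketch (the "radius" database name and the document
fields are invented; PUT /db/docid and POST /db/_bulk_docs are the standard
CouchDB endpoints):

  import requests

  COUCH = "http://localhost:5984/radius"

  doc = {"framed_ip_address": "192.168.1.1", "acct_session_id": "abc123"}

  # Path 1: one document per HTTP request.
  resp = requests.put(f"{COUCH}/session-0001", json=doc)
  print(resp.status_code)  # 201 on success

  # Path 2: many documents in a single _bulk_docs request.
  batch = [dict(doc, _id=f"session-{i:04d}") for i in range(2, 100)]
  resp = requests.post(f"{COUCH}/_bulk_docs", json={"docs": batch})
  print(resp.status_code, len(resp.json()))  # 201, one result per doc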

If CouchDB can handle 6,000 individual PUT requests per second *and* only
responds after they are committed to stable storage, then I think it must be
batching the writes and hence delaying the responses somewhat (by how much?
I couldn't see a tunable parameter for this).
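
For what it's worth, one way I could probe this from the client side is to
time a run of strictly sequential PUTs: if each response waits for a disk
sync, per-request latency should be dominated by sync time rather than
network time. A rough sketch (same invented "radius" database as above):

  import time
  import requests

  COUCH = "http://localhost:5984/radius"
  N = 1000

  start = time.monotonic()
  for i in range(N):
      requests.put(f"{COUCH}/probe-{i:06d}",
                   json={"framed_ip_address": "192.168.1.1"})
  elapsed = time.monotonic() - start

  # 6,000 docs/s sequentially would mean ~0.17 ms per request, which is
  # hard to square with one disk sync per document on ordinary hardware.
  print(f"{N / elapsed:.0f} docs/s, {1000 * elapsed / N:.2f} ms per PUT")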

However, if this performance is only achievable using _bulk_docs, I'll have
to write my RADIUS server / CouchDB client to perform its own local
batching and POST these batches a few times per second.
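
The batching logic I have in mind is simple enough; a minimal sketch,
assuming a flush interval and batch size I'd still have to tune (error
handling and retries omitted):

  import threading
  import time
  import requests

  COUCH = "http://localhost:5984/radius"
  FLUSH_INTERVAL = 0.25   # seconds, i.e. four bulk POSTs per second
  MAX_BATCH = 2000        # cap on docs per _bulk_docs request

  buffer = []
  lock = threading.Lock()

  def enqueue(record):
      """Called by the RADIUS server for each accounting record."""
      with lock:
          buffer.append(record)

  def flush_loop():
      while True:
          time.sleep(FLUSH_INTERVAL)
          with lock:
              batch, rest = buffer[:MAX_BATCH], buffer[MAX_BATCH:]
              buffer[:] = rest
          if batch:
              requests.post(f"{COUCH}/_bulk_docs", json={"docs": batch})

  threading.Thread(target=flush_loop, daemon=True).start()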

I presume that batching also affects disk space used (before compaction) - I
wouldn't want each 200-byte RADIUS record taking up 4 KB :-)

Final question: does CouchDB perform any gzip-like compression when writing
the JSON to disk? These 200-byte RADIUS records will become a lot larger
when expanded into verbose JSON, e.g.:

  {
    "framed_ip_address":"192.168.1.1",  // 6 bytes in original packet
    ... etc
  }
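
A quick back-of-envelope check of what a gzip-like pass could claw back
(field names and values invented for illustration):

  import gzip
  import json

  record = {
      "framed_ip_address": "192.168.1.1",
      "acct_session_id": "0123456789abcdef",
      "acct_input_octets": 123456,
      "acct_output_octets": 654321,
      "nas_ip_address": "10.0.0.1",
  }

  raw = json.dumps(record).encode()
  print(len(raw), "bytes as JSON,", len(gzip.compress(raw)), "gzipped")

  # A single small record barely compresses (gzip's header/trailer alone
  # is ~18 bytes); the repeated field names only pay off when many
  # records are compressed together.
  many = b"".join(json.dumps(record).encode() for _ in range(1000))
  print(len(gzip.compress(many)), "bytes gzipped for 1000 records")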

Regards,

Brian.
