[ 
https://issues.apache.org/jira/browse/SAMZA-543?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14299608#comment-14299608
 ] 

Jay Kreps commented on SAMZA-543:
---------------------------------

Oh, wow, yeah, well that may explain the performance. The batch write was about 
a 5x perf improvement for write-intensive workloads in LevelDB. Did you guys perf 
test the RocksDB store when it was added?

But actually what I was describing wasn't batch write but rather disabling 
compaction during the load. RocksDB has some way to do "batched compaction". If 
you check out their benchmark on batch load perf they describe this. Basically, 
instead of doing a ton of compaction online during the bootstrap load, you just 
read in all the data (which is mostly deduplicated anyway) and then, when the 
bootstrap is complete, do one global sort. This should be a lot faster because 
otherwise you potentially compact each record many times for no reason.
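
To make the "compact each record many times" point concrete, here is a 
deliberately crude toy model in Java. It is not RocksDB's actual compaction 
algorithm: the class name, the method names, and the "rewrite the whole sorted 
run on every flush" policy are all assumptions made for illustration. It only 
counts how many entries get written under each strategy.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Toy model only: compares how many entries are written when data is
// re-sorted after every flush versus sorted once at the end of the load.
public class BulkLoadSketch {

    // "Online" strategy (hypothetical policy): every time flushSize entries
    // have been buffered, merge them into the single sorted run, which
    // rewrites every entry stored so far.
    static long onlineCompactionWrites(List<Integer> keys, int flushSize) {
        List<Integer> run = new ArrayList<>();
        List<Integer> buffer = new ArrayList<>();
        long entriesWritten = 0;
        for (int key : keys) {
            buffer.add(key);
            if (buffer.size() == flushSize) {
                entriesWritten += mergeIntoRun(run, buffer);
            }
        }
        if (!buffer.isEmpty()) {
            entriesWritten += mergeIntoRun(run, buffer);
        }
        return entriesWritten;
    }

    // Merge the buffer into the run and return the number of entries rewritten.
    private static long mergeIntoRun(List<Integer> run, List<Integer> buffer) {
        run.addAll(buffer);
        Collections.sort(run);
        buffer.clear();
        return run.size();
    }

    // "Deferred" strategy: load everything unsorted, then sort once, so each
    // entry is written to the sorted run exactly one time.
    static long deferredSortWrites(List<Integer> keys) {
        List<Integer> run = new ArrayList<>(keys);
        Collections.sort(run);
        return run.size();
    }

    public static void main(String[] args) {
        List<Integer> keys = new ArrayList<>();
        for (int i = 0; i < 1000; i++) {
            keys.add(999 - i); // reverse order, so every merge has work to do
        }
        System.out.println("online:   " + onlineCompactionWrites(keys, 100));
        System.out.println("deferred: " + deferredSortWrites(keys));
    }
}
```

In this model, loading 1,000 keys with a flush every 100 entries writes 5,500 
entries online (100 + 200 + ... + 1,000) versus 1,000 for the single sort at 
the end. Real leveled compaction is far less wasteful than this model, so the 
ratio is only meant to show the direction of the effect.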

> Disable WAL in RocksDB KV store
> -------------------------------
>
>                 Key: SAMZA-543
>                 URL: https://issues.apache.org/jira/browse/SAMZA-543
>             Project: Samza
>          Issue Type: Bug
>          Components: kv
>    Affects Versions: 0.9.0
>            Reporter: Chris Riccomini
>             Fix For: 0.9.0
>
>
> RocksDB uses a write-ahead log by default. This is unnecessary in Samza, 
> since we have full durability from a state store's changelog topic. We should 
> [disable the 
> WAL|https://github.com/facebook/rocksdb/wiki/Basic-Operations#asynchronous-writes]
>  in the RocksDB KV store.
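
For reference, a minimal sketch of what disabling the WAL on writes might look 
like with the RocksDB Java bindings (org.rocksdb). This is not Samza's actual 
store code: the class name, the database path, and the key and value are 
placeholders, and it assumes a rocksdbjni version recent enough that Options, 
WriteOptions, and RocksDB are AutoCloseable.

```java
import org.rocksdb.Options;
import org.rocksdb.RocksDB;
import org.rocksdb.RocksDBException;
import org.rocksdb.WriteOptions;

public class DisableWalSketch {
    public static void main(String[] args) throws RocksDBException {
        RocksDB.loadLibrary();
        try (Options options = new Options().setCreateIfMissing(true);
             // Writes issued with these options skip the write-ahead log.
             WriteOptions writeOptions = new WriteOptions().setDisableWAL(true);
             // Placeholder path, chosen only for this sketch.
             RocksDB db = RocksDB.open(options, "/tmp/rocksdb-wal-sketch")) {
            db.put(writeOptions, "key".getBytes(), "value".getBytes());
        }
    }
}
```

With the WAL off, writes that have not yet been flushed to disk are lost if 
the process crashes, which is acceptable here only because the changelog topic 
can be replayed to rebuild the store.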



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
