Hello Jean-Daniel Cryans, Kudu Jenkins,
I'd like you to reexamine a change. Please visit
to look at the new patch set (#2).
Change subject: tablet: change default bloom filter FP rate to 0.01%
tablet: change default bloom filter FP rate to 0.01%
The old default of 1% was high enough that, in a uniform random write workload,
we ended up reading most of the key blocks even with bloom filters enabled.
On a 5-node cluster, after inserting a few billion rows, write throughput
dropped dramatically because every batch of writes was seeking and reading
keys off disk.
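To illustrate why a 1% rate hurts here, the following is a rough sketch, not a model of Kudu's actual read path. It assumes each inserted key is checked against one independent bloom filter per overlapping rowset; the rowset count of 100 is a hypothetical figure chosen for illustration, and the function name is made up.

```python
def prob_any_false_positive(fp_rate: float, num_filters: int) -> float:
    """Chance that at least one of num_filters independent bloom filters
    reports a false positive for a key that is absent from all of them."""
    return 1.0 - (1.0 - fp_rate) ** num_filters


# Hypothetical: 100 overlapping rowsets consulted per inserted key.
NUM_ROWSETS = 100

for fp_rate in (0.01, 0.0001):
    p = prob_any_false_positive(fp_rate, NUM_ROWSETS)
    print(f"FP rate {fp_rate:.2%}: {p:.1%} of inserts hit at least one false positive")
```

Under these assumptions, roughly two thirds of inserts trigger at least one unnecessary key lookup at 1%, versus about 1% of inserts at 0.01%.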
In testing on the same cluster, changing the FP rate to 0.01% more than
doubled throughput by reducing the random reads coming off disk. The cost is
a 2x increase in bloom filter size (about 20 bits per key vs. 10), but 20 bits
is still small compared to typical row key sizes in target applications.
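The 10 vs. 20 bits-per-key figures match the textbook sizing formula for an optimally configured bloom filter, bits/key = -ln(p) / (ln 2)^2. The sketch below just evaluates that formula; it is not taken from the patch, the function name is made up, and Kudu's actual sizing code may round differently.

```python
import math


def bloom_bits_per_key(fp_rate: float) -> float:
    """Bits per key for an optimally sized bloom filter with the given
    target false-positive rate: -ln(p) / (ln 2)^2."""
    return -math.log(fp_rate) / (math.log(2) ** 2)


for fp_rate in (0.01, 0.0001):
    bits = bloom_bits_per_key(fp_rate)
    print(f"FP rate {fp_rate:.2%}: {bits:.1f} bits/key")
# FP rate 1.00%: 9.6 bits/key
# FP rate 0.01%: 19.2 bits/key
```

Because the size scales with ln(1/p), squaring the FP rate (1% to 0.01%) exactly doubles the bits per key.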
Of course, if an application has no random write characteristics and really
cares about disk space, the rate can always be flipped back to 1%.
Screenshots of the inserts/second graph (1hr rolling average) for these tests
are at: https://gist.github.com/toddlipcon/1ab9b36b7fbae10b635d3a905e1fe55a
2 files changed, 10 insertions(+), 1 deletion(-)
git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/17/3517/2
To view, visit http://gerrit.cloudera.org:8080/3517
Gerrit-Owner: Todd Lipcon <t...@apache.org>
Gerrit-Reviewer: David Ribeiro Alves <dral...@apache.org>
Gerrit-Reviewer: Jean-Daniel Cryans <jdcry...@apache.org>
Gerrit-Reviewer: Kudu Jenkins