This is an automated email from the ASF dual-hosted git repository.

granthenke pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/kudu.git

commit c2150b836818209646f6ff27a03010079bbd50fa
Author: Mitch Barnett <[email protected]>
AuthorDate: Thu May 2 11:43:45 2019 -0500

    [docs] KUDU-2808: Correct doc for default compression

    Our documentation currently states:
    "By default, columns are stored uncompressed."

    However, all columns (excluding bool, string and binary typed columns)
    are encoded using Bitshuffle by default. Bitshuffle uses LZ4 compression,
    thus these columns are compressed using the LZ4 compression codec by
    default.

    I've updated the documentation to call this out, and specifically mention
    that the bool, string and binary types are not compressed by default.

    Change-Id: Ia85d294210e1e6bc9f085652dc6f6a46b3c6462f
    Reviewed-on: http://gerrit.cloudera.org:8080/13217
    Reviewed-by: Grant Henke <[email protected]>
    Tested-by: Grant Henke <[email protected]>
    Reviewed-by: Andrew Wong <[email protected]>
---
 docs/schema_design.adoc | 7 ++++---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/docs/schema_design.adoc b/docs/schema_design.adoc
index 2d71c4e..305a4e3 100644
--- a/docs/schema_design.adoc
+++ b/docs/schema_design.adoc
@@ -186,9 +186,10 @@ tablets.
 === Column Compression
 
 Kudu allows per-column compression using the `LZ4`, `Snappy`, or `zlib`
-compression codecs. By default, columns are stored uncompressed. Consider using
-compression if reducing storage space is more important than raw scan
-performance.
+compression codecs. By default, columns that are Bitshuffle-encoded are
+inherently compressed with LZ4 compression. Otherwise, columns are stored
+uncompressed. Consider using compression if reducing storage space is more
+important than raw scan performance.
 
 Every data set will compress differently, but in general LZ4 is the most
 performant codec, while `zlib` will compress to the smallest data sizes.
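
The default described in the commit message can be modeled as a small lookup. This is only an illustrative sketch: it does not use the Kudu client API, and the function name, the lowercase type labels, and the returned strings are all assumptions made for the example.

```python
# Illustrative model only; this is NOT the Kudu client API.
# The type labels are simplified, hypothetical names for Kudu column types.
UNCOMPRESSED_BY_DEFAULT = {"bool", "string", "binary"}


def default_compression(column_type: str) -> str:
    """Return the effective default compression for a column type.

    Per the commit message: every type except bool, string and binary is
    Bitshuffle-encoded by default, and Bitshuffle applies LZ4.
    """
    if column_type.lower() in UNCOMPRESSED_BY_DEFAULT:
        return "none"
    return "LZ4 (via Bitshuffle encoding)"


if __name__ == "__main__":
    for t in ("int32", "double", "bool", "string", "binary"):
        print(f"{t}: {default_compression(t)}")
```

The sketch covers defaults only. A column whose encoding is explicitly changed away from Bitshuffle would, per the updated doc text, be stored uncompressed unless a codec is set on it.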
