[ https://issues.apache.org/jira/browse/HBASE-11400?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14057299#comment-14057299 ]
Jonathan Hsieh commented on HBASE-11400:
----------------------------------------

This is a good improvement. I think more can be done -- here are suggestions:

- Explain that there are tradeoffs for compression and encoding in the first section (where you are talking about how they go on a column family). Maybe say something about how compression codecs take big opaque byte arrays, while encodings take advantage of some of the structure that HBase knows about its own data formats.
- Data Block encoding types section: Consider asking [~mbertozzi] if you can use the images from here: http://blog.cloudera.com/blog/2012/06/hbase-io-hfile-input-output/
- Which compression or codec to use
-- It would be good to explain why gzip should be used for cold data and snappy and lzo for hot data: lzo and snappy favor low CPU usage at the cost of a poorer compression ratio, while gzip favors a higher compression ratio at the cost of more CPU usage.
-- The codec part is pretty weak here. Maybe use examples from Matteo's blog post, or drop it since there is only one line. Also consider mentioning why you'd want to use an encoder in this section, and just describe what the different types mean in the previous section.

It is probably worth noting here that when settings are changed on an existing column family, the new encoding and compression are applied on compaction.

> Edit, consolidate, and update Compression and data encoding docs
> ----------------------------------------------------------------
>
>                 Key: HBASE-11400
>                 URL: https://issues.apache.org/jira/browse/HBASE-11400
>             Project: HBase
>          Issue Type: Improvement
>          Components: documentation
>            Reporter: Misty Stanley-Jones
>            Assignee: Misty Stanley-Jones
>            Priority: Minor
>         Attachments: HBASE-11400-1.patch, HBASE-11400.patch
>
>
> Current docs are here: http://hbase.apache.org/book.html#compression.test
> It could use some editing and expansion.

--
This message was sent by Atlassian JIRA
(v6.2#6252)
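
For what it's worth, the "opaque byte arrays vs. known structure" point in the first suggestion could be illustrated with something like the sketch below. It is only a toy, not HBase's actual on-disk PREFIX format: the function names and sample keys are made up for illustration, and Python's zlib stands in for a general-purpose codec. The encoder relies on the keys being sorted, so each key can be stored as a shared-prefix length plus a suffix, whereas the codec just sees one blob of bytes.

```python
import zlib


def prefix_encode(sorted_keys):
    """Toy prefix encoding: store each key as (shared_prefix_len, suffix)
    relative to the previous key. Hypothetical helper, not HBase code."""
    encoded = []
    prev = b""
    for key in sorted_keys:
        # Count the leading bytes this key shares with the previous key.
        shared = 0
        limit = min(len(prev), len(key))
        while shared < limit and prev[shared] == key[shared]:
            shared += 1
        encoded.append((shared, key[shared:]))
        prev = key
    return encoded


def prefix_decode(encoded):
    """Rebuild the original keys from (shared_prefix_len, suffix) pairs."""
    keys = []
    prev = b""
    for shared, suffix in encoded:
        key = prev[:shared] + suffix
        keys.append(key)
        prev = key
    return keys


# Made-up sample keys in sorted order, loosely shaped like row/column keys.
keys = [b"row0001/cf:a", b"row0001/cf:b", b"row0002/cf:a"]

# Encoding: uses knowledge of the key layout (sorted, long shared prefixes).
encoded = prefix_encode(keys)

# Compression: the codec is handed one opaque blob and knows nothing about keys.
blob = b"\n".join(keys)
compressed = zlib.compress(blob)

print(encoded)
print(len(blob), len(compressed))
```

The two are complementary rather than alternatives: an encoded block can still be handed to a codec afterwards.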