[
https://issues.apache.org/jira/browse/HBASE-4218?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13179153#comment-13179153
]
Mikhail Bautin commented on HBASE-4218:
---------------------------------------
@Matt: what do you mean by "the settings UI"? I thought HColumnDescriptor was part
of the user-visible API, and if we allowed more flexible options there, we
would have to fully support them everywhere.
On the performance issue: HBase is IO-bound for most production workloads, so
if we can fit more data into cache, we should get a performance win. Jacek
reported that encoded scanners were faster in his experiments, and if they are
not, we should optimize them or disable prefix compression for that particular
workload. In a CPU-bound situation, one reason encoded scanners could be slower
is that the data does not compress well, so delta encoding introduces an
unnecessary CPU overhead and does not really save any space in cache. For that
type of workload, using prefix compression is probably not the right thing to
do.
Could you please share some more details about the workload in your test? Is it
CPU-bound or IO-bound? Is it similar to your envisioned use case for data block
encoding? Are you planning to use the PREFIX algorithm or your trie
implementation? Does the trie algorithm have the same encoded scanner
performance problem?
@Lars, Matt:
"We have all the framework in place" and "features or already working code" are
relative concepts. The framework still needs to be tweaked to (1) support all
real use cases people have in mind; and (2) allow us to solidify the existing
implementation and test it really well. Jacek's original patch did not handle
switching data block encoding settings in the column family, and I am in the
process of modifying the patch to support that. The more flexibility we allow
for column family encoding configuration, the more cases we have to test, and
the more exotic edge cases we get.
A couple more notes on supporting changes to a column family's data block
encoding setting. Kannan and I discussed this, and we came up with a plan for allowing
a seamless migration to a new data block encoding. Blocks read from existing
HFiles will still be brought into cache using their original encoding, and we
will allow storing a mixture of different data block encodings in the cache.
The new encoding configuration will only be applied on flushes and compactions.
This is similar to the seamless HFile format upgrade that we have already done
successfully.
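To make the mixed-encoding cache idea concrete, here is a minimal standalone sketch (illustrative only, not actual HBase code; all class and method names here are made up): each cached block remembers the encoding it was written with, so blocks with different encodings can coexist in the cache, and only flushes and compactions consult the column family's current setting.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class MixedEncodingCacheSketch {
    // Hypothetical encodings, standing in for the real DataBlockEncoding options.
    enum Encoding { NONE, PREFIX, DIFF }

    // Each cached block carries the encoding it was written with.
    static final class CachedBlock {
        final Encoding encoding;
        final byte[] data;
        CachedBlock(Encoding encoding, byte[] data) {
            this.encoding = encoding;
            this.data = data;
        }
    }

    private final Map<String, CachedBlock> cache = new ConcurrentHashMap<>();

    // The column family's current setting; applied only on new writes.
    private final Encoding currentFamilySetting = Encoding.PREFIX;

    // Blocks read from existing HFiles keep their original on-disk encoding,
    // even if it differs from the family's current setting.
    void cacheBlockFromHFile(String blockKey, Encoding onDiskEncoding, byte[] data) {
        cache.put(blockKey, new CachedBlock(onDiskEncoding, data));
    }

    CachedBlock get(String blockKey) {
        return cache.get(blockKey);
    }

    // Flushes and compactions are the only writers that apply the new setting.
    Encoding encodingForFlushOrCompaction() {
        return currentFamilySetting;
    }
}
```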
Another possible way to simplify things even further could be to get rid of the
ENCODE_IN_CACHE_ONLY option completely. We introduced it for testing, but it
seems to be causing more trouble than it is worth, and actually slows down
patch stabilization and testing. Such "test-mode" encoding would require extra
care to avoid using encoding during compactions, because that could actually
corrupt on-disk data. I think a better way would be to add more unit tests for
various edge cases and transitions for simplified configuration options, and do
more synthetic load testing with those. For a dark launch cluster it is always
possible to take a backup and roll back if data corruption happens. I still
need to discuss that option with Kannan and the rest of our team, but please
let me know what you think.
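For reference, the core prefix-compression idea from the issue description quoted below can be sketched as follows (an illustrative toy, not the patch's actual encoding format): since keys in an HFile are sorted, each key is stored as the length of the prefix it shares with the previous key plus its remaining suffix.

```java
import java.util.ArrayList;
import java.util.List;

public class PrefixCompressionSketch {

    // One encoded entry: how many leading characters to reuse from the
    // previous key, plus the remaining suffix.
    static final class Entry {
        final int prefixLen;
        final String suffix;
        Entry(int prefixLen, String suffix) {
            this.prefixLen = prefixLen;
            this.suffix = suffix;
        }
    }

    // Encode a sorted key list; similar adjacent keys yield short suffixes.
    static List<Entry> encode(List<String> sortedKeys) {
        List<Entry> out = new ArrayList<>();
        String prev = "";
        for (String key : sortedKeys) {
            int p = 0;
            int max = Math.min(prev.length(), key.length());
            while (p < max && prev.charAt(p) == key.charAt(p)) {
                p++;
            }
            out.add(new Entry(p, key.substring(p)));
            prev = key;
        }
        return out;
    }

    // Reverse the encoding by rebuilding each key from the previous one.
    static List<String> decode(List<Entry> entries) {
        List<String> out = new ArrayList<>();
        String prev = "";
        for (Entry e : entries) {
            String key = prev.substring(0, e.prefixLen) + e.suffix;
            out.add(key);
            prev = key;
        }
        return out;
    }
}
```

The real patch works on byte-level KeyValues and adds further tricks (timestamp diffs, bitfields), but the shared-prefix step above is where most of the savings come from.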
> Data Block Encoding of KeyValues (aka delta encoding / prefix compression)
> ---------------------------------------------------------------------------
>
> Key: HBASE-4218
> URL: https://issues.apache.org/jira/browse/HBASE-4218
> Project: HBase
> Issue Type: Improvement
> Components: io
> Affects Versions: 0.94.0
> Reporter: Jacek Migdal
> Assignee: Mikhail Bautin
> Labels: compression
> Fix For: 0.94.0
>
> Attachments: 0001-Delta-encoding-fixed-encoded-scanners.patch,
> 0001-Delta-encoding.patch, 4218-v16.txt, D447.1.patch, D447.10.patch,
> D447.11.patch, D447.12.patch, D447.13.patch, D447.14.patch, D447.15.patch,
> D447.16.patch, D447.17.patch, D447.2.patch, D447.3.patch, D447.4.patch,
> D447.5.patch, D447.6.patch, D447.7.patch, D447.8.patch, D447.9.patch,
> Data-block-encoding-2011-12-23.patch,
> Delta-encoding.patch-2011-12-22_11_52_07.patch,
> Delta_encoding_with_memstore_TS.patch, open-source.diff
>
>
> A compression scheme for keys. Keys are sorted in HFiles and are usually very
> similar, so it is possible to design better compression than general-purpose
> algorithms provide.
> It is an additional step designed to be used in memory. It aims to save
> memory in cache as well as to speed up seeks within HFileBlocks. It should
> improve performance a lot if key lengths are larger than value lengths; for
> example, it makes a lot of sense to use it when the value is a counter.
> Initial tests on real data (key length ~90 bytes, value length = 8 bytes)
> show that I could achieve a decent level of compression:
> key compression ratio: 92%
> total compression ratio: 85%
> LZO on the same data: 85%
> LZO after delta encoding: 91%
> At the same time the performance is much better: decompression is 20-80%
> faster than with LZO. Moreover, it should allow far more efficient seeking,
> which should improve performance a bit.
> It seems that simple compression algorithms are good enough. Most of the
> savings are due to prefix compression, int128 encoding, timestamp diffs, and
> bitfields that avoid duplication. That way, comparisons of compressed data
> can be much faster than a byte comparator (thanks to prefix compression and
> the bitfields).
> In order to implement it in HBase, two important changes in design will be
> needed:
> -solidify the interface to the HFileBlock / HFileReader scanner to provide
> seeking and iterating; accessing the uncompressed buffer in HFileBlock would
> have bad performance
> -extend comparators to support comparison assuming that the first N bytes
> are equal (or that some fields are equal)
> Link to a discussion about something similar:
> http://search-hadoop.com/m/5aqGXJEnaD1/hbase+windows&subj=Re+prefix+compression
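The comparator extension mentioned in the second design point above can be sketched like this (an illustrative standalone example, not the patch's actual comparator; the method name is made up): when the encoding already guarantees that the first N bytes of two keys match, the comparison can start at byte N instead of byte 0.

```java
public class PrefixAwareComparatorSketch {

    // Compare two keys assuming their first `commonPrefix` bytes are known to
    // be equal, so those bytes are skipped. Bytes are compared as unsigned,
    // matching lexicographic byte ordering.
    static int compareIgnoringCommonPrefix(byte[] left, byte[] right, int commonPrefix) {
        int minLen = Math.min(left.length, right.length);
        for (int i = commonPrefix; i < minLen; i++) {
            int diff = (left[i] & 0xff) - (right[i] & 0xff);
            if (diff != 0) {
                return diff;
            }
        }
        // All compared bytes equal: the shorter key sorts first.
        return left.length - right.length;
    }
}
```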