[
https://issues.apache.org/jira/browse/HBASE-3691?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13030661#comment-13030661
]
Nicholas Telford commented on HBASE-3691:
-----------------------------------------
Thanks Nichole. Without your patch to HColumnDescriptor it wasn't possible to
use snappy. I'd only tested it with CompressionTest, which I now see is not a
complete test: it only checks that compression works on an HFile, not
that column families can use it.
One thing does concern me: in your patch, the Algorithm implementation for
SNAPPY seems to have moved position in the enum. From the comments, it
sounds like it should be added as the _last_ implementation to avoid breaking
HFiles compressed with the other implementations. This looks like it may just
be a merge glitch from when you first applied my patch.
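To illustrate why the position matters, here is a minimal sketch. It assumes, as the comments suggest, that a file records its compression algorithm by the enum's ordinal. The three enums and the decode helper below are hypothetical stand-ins, not the real Compression.Algorithm.

```java
public class EnumOrdinalSketch {
    // Hypothetical enum as it looked when existing files were written.
    enum Before { LZO, GZ, NONE }

    // SNAPPY appended last: existing ordinals are unchanged.
    enum AppendedLast { LZO, GZ, NONE, SNAPPY }

    // SNAPPY inserted earlier: NONE shifts from ordinal 2 to 3.
    enum InsertedEarlier { LZO, GZ, SNAPPY, NONE }

    // Resolve a stored ordinal against a given version of the enum.
    static <E extends Enum<E>> String decode(Class<E> type, int storedOrdinal) {
        return type.getEnumConstants()[storedOrdinal].name();
    }

    public static void main(String[] args) {
        // An "old" file written with NONE stores ordinal 2.
        int stored = Before.NONE.ordinal();

        System.out.println(decode(AppendedLast.class, stored));    // prints "NONE"
        System.out.println(decode(InsertedEarlier.class, stored)); // prints "SNAPPY"
    }
}
```

If the ordinal really is what gets persisted, the second case is the breakage: a file written with NONE would be read back as SNAPPY.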
Using Nichole's patch, the steps to get Snappy working are currently:
# Install hadoop-snappy using these instructions:
http://code.google.com/p/hadoop-snappy/
# You need to ensure the hadoop-snappy libs (incl. the native libs) are in the
HBase classpath. Unless there are any other recommendations, I just symlinked
the libs from HADOOP_HOME/lib to HBASE_HOME/lib. This needs to be done on all
HBase nodes, as with LZO.
# Use CompressionTest to verify snappy support is enabled and the libs can be
loaded:
bq. $ hbase org.apache.hadoop.hbase.util.CompressionTest hdfs://host/path/to/hbase snappy
# Create a column family with snappy compression and verify it:
{quote}$ hbase shell
> create 't1', { NAME => 'cf1', COMPRESSION => 'snappy' }
> describe 't1'{quote}
In the output of the "describe" command, make sure it lists
"COMPRESSION => 'snappy'".
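The symlinking in step 2 could be scripted on each node along these lines. This is only a sketch under a few assumptions: the class and method names are hypothetical, it links every entry in HADOOP_HOME/lib whose file name contains "snappy", and it does not descend into subdirectories, so native libs kept in a subdirectory would need the same treatment separately.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.LinkOption;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

public class SnappyLibLinker {
    // Symlink each entry of srcLib whose name contains "snappy" into dstLib.
    // Entries that already exist in dstLib are left alone.
    static List<Path> linkSnappyLibs(Path srcLib, Path dstLib) throws IOException {
        List<Path> created = new ArrayList<>();
        try (DirectoryStream<Path> entries = Files.newDirectoryStream(srcLib, "*snappy*")) {
            for (Path src : entries) {
                Path link = dstLib.resolve(src.getFileName());
                if (Files.notExists(link, LinkOption.NOFOLLOW_LINKS)) {
                    Files.createSymbolicLink(link, src.toAbsolutePath());
                    created.add(link);
                }
            }
        }
        return created;
    }

    public static void main(String[] args) throws IOException {
        String hadoopHome = System.getenv("HADOOP_HOME");
        String hbaseHome = System.getenv("HBASE_HOME");
        if (hadoopHome == null || hbaseHome == null) {
            System.err.println("HADOOP_HOME and HBASE_HOME must both be set");
            return;
        }
        Path hadoopLib = Paths.get(hadoopHome, "lib");
        Path hbaseLib = Paths.get(hbaseHome, "lib");
        for (Path link : linkSnappyLibs(hadoopLib, hbaseLib)) {
            System.out.println("linked " + link);
        }
    }
}
```

Running it a second time should create nothing new, since existing entries in HBASE_HOME/lib are skipped.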
> Add compressor support for 'snappy', google's compressor
> --------------------------------------------------------
>
> Key: HBASE-3691
> URL: https://issues.apache.org/jira/browse/HBASE-3691
> Project: HBase
> Issue Type: Task
> Reporter: stack
> Priority: Critical
> Fix For: 0.92.0
>
> Attachments: hbase-snappy-3691-trunk-002.patch,
> hbase-snappy-3691-trunk-003.patch, hbase-snappy-3691-trunk.patch
>
>
> http://code.google.com/p/snappy/ is Apache licensed.
> bq. Snappy is a compression/decompression library. It does not aim for
> maximum compression, or compatibility with any other compression library;
> instead, it aims for very high speeds and reasonable compression. For
> instance, compared to the fastest mode of zlib, Snappy is an order of
> magnitude faster for most inputs, but the resulting compressed files are
> anywhere from 20% to 100% bigger. On a single core of a Core i7 processor in
> 64-bit mode, Snappy compresses at about 250 MB/sec or more and decompresses
> at about 500 MB/sec or more.
> bq. Snappy is widely used inside Google, in everything from BigTable and
> MapReduce to our internal RPC systems. (Snappy has previously been referred
> to as "Zippy" in some presentations and the likes.)
> Let's get it in.
--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira