[jira] [Commented] (HADOOP-7206) Integrate Snappy compression

Taro L. Saito (JIRA) Tue, 21 Jun 2011 20:27:53 -0700

    [ 
https://issues.apache.org/jira/browse/HADOOP-7206?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13053019#comment-13053019
 ]


Taro L. Saito commented on HADOOP-7206:
---------------------------------------

Let me clarify some differences between Issay's hadoop-snappy and my 
snappy-java:

hadoop-snappy
 * Uses libsnappy.so (available in recent Linux distributions) and 
libhadoopsnappy.so (JNI code compiled for the target platform)

snappy-java
 * Uses libsnappyjava.so (which bundles the original snappy code together 
with the JNI code), or snappyjava.dll (for Windows), libsnappyjava.jnilib 
(for Mac OS X)
 * It copies the native library for the current platform to the directory 
specified in the org.xerial.snappy.tempdir or java.io.tmpdir system property.
 * If the dependencies on glibc (on Linux, GLIBC 2.3 or higher is required 
for now) or on dylib (on Mac OS X) cause problems, you can re-compile 
snappy-java's native library for your own platform only (with make 
clean-native native). There is no need to build native libraries for other 
platforms if you never use them.
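
As a rough illustration of the temp-directory lookup described above, here is 
a minimal sketch. The class and method names are hypothetical, and the 
precedence (org.xerial.snappy.tempdir first, java.io.tmpdir as the fallback) 
is my assumption rather than a quote of snappy-java's actual code:

```java
import java.io.File;

public class TempDirSketch {
    // Hypothetical helper: picks the directory the native library would be
    // copied to. Assumes org.xerial.snappy.tempdir takes precedence and
    // java.io.tmpdir is used only when the former is not set.
    public static File resolveTempDir() {
        String fallback = System.getProperty("java.io.tmpdir");
        return new File(System.getProperty("org.xerial.snappy.tempdir", fallback));
    }

    public static void main(String[] args) {
        System.out.println("native library would be extracted to: " + resolveTempDir());
    }
}
```

With this kind of lookup, running the JVM with 
-Dorg.xerial.snappy.tempdir=/some/dir would redirect the extraction without 
any code change.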

What hadoop-snappy and snappy-java have in common:
 * Both approaches need to compile the native code (libhadoopsnappy.so or 
libsnappyjava.so) somewhere. My snappy-java simply provides pre-compiled 
libsnappyjava.so for various platforms.

One of the design goals of snappy-java is to avoid trouble when linking 
against native libraries (e.g., libsnappy.so), such as crashes due to 
libstdc++ incompatibility, missing libraries, etc. But as Alejandro suggested 
in my discussion group, using separate libsnappy.so and libsnappyjava.so is 
technically possible even in snappy-java:
 * First, try to load the pre-installed libsnappy.so and libsnappyjava.so (the 
version not containing libsnappy.so).
 * If they are not found, extract the copies of these libraries embedded in 
the JAR to some directory.
 * Load both native libraries.
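
The steps above could be sketched roughly as follows. This is only an 
illustration under stated assumptions, not snappy-java's implementation: the 
class name, the /native/ resource prefix, and the method names are all 
hypothetical, and it uses only standard JDK calls (System.loadLibrary, 
System.mapLibraryName, System.load).

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

public class NativeLoaderSketch {
    // Hypothetical location of the embedded libraries inside the JAR.
    static final String RESOURCE_DIR = "/native/";

    // Tries the pre-installed library first, then falls back to the copy
    // embedded in the JAR. Returns true if the library was loaded.
    public static boolean load(String name) {
        try {
            System.loadLibrary(name); // searches java.library.path
            return true;
        } catch (UnsatisfiedLinkError e) {
            // Not pre-installed: fall through to the embedded copy.
        }
        String fileName = System.mapLibraryName(name); // e.g. libsnappy.so on Linux
        try (InputStream in =
                 NativeLoaderSketch.class.getResourceAsStream(RESOURCE_DIR + fileName)) {
            if (in == null) {
                return false; // nothing embedded for this platform
            }
            Path dir = Files.createTempDirectory("snappy-native");
            Path target = dir.resolve(fileName);
            Files.copy(in, target, StandardCopyOption.REPLACE_EXISTING);
            target.toFile().deleteOnExit();
            System.load(target.toAbsolutePath().toString());
            return true;
        } catch (IOException | UnsatisfiedLinkError e) {
            return false;
        }
    }

    // libsnappy is loaded before the JNI library that depends on it.
    public static boolean loadBoth() {
        return load("snappy") && load("snappyjava");
    }
}
```

Even in this small form, the loader has to deal with two libraries, two 
lookup locations, and a load-order dependency between them.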

I am not sure that supporting such a loading mechanism is the right way to go.


> Integrate Snappy compression
> ----------------------------
>
>                 Key: HADOOP-7206
>                 URL: https://issues.apache.org/jira/browse/HADOOP-7206
>             Project: Hadoop Common
>          Issue Type: New Feature
>    Affects Versions: 0.21.0
>            Reporter: Eli Collins
>            Assignee: T Jake Luciani
>             Fix For: 0.23.0
>
>         Attachments: HADOOP-7206-002.patch, HADOOP-7206.patch, 
> v2-HADOOP-7206-snappy-codec-using-snappy-java.txt, 
> v3-HADOOP-7206-snappy-codec-using-snappy-java.txt, 
> v4-HADOOP-7206-snappy-codec-using-snappy-java.txt, 
> v5-HADOOP-7206-snappy-codec-using-snappy-java.txt
>
>
> Google released Zippy as an open source (APLv2) project called Snappy 
> (http://code.google.com/p/snappy). This tracks integrating it into Hadoop.
> {quote}
> Snappy is a compression/decompression library. It does not aim for maximum 
> compression, or compatibility with any other compression library; instead, it 
> aims for very high speeds and reasonable compression. For instance, compared 
> to the fastest mode of zlib, Snappy is an order of magnitude faster for most 
> inputs, but the resulting compressed files are anywhere from 20% to 100% 
> bigger. On a single core of a Core i7 processor in 64-bit mode, Snappy 
> compresses at about 250 MB/sec or more and decompresses at about 500 MB/sec 
> or more.
> {quote}

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
