[
https://issues.apache.org/jira/browse/HADOOP-7206?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13053019#comment-13053019
]
Taro L. Saito commented on HADOOP-7206:
---------------------------------------
Let me clarify some differences between Issay's hadoop-snappy and my
snappy-java:
hadoop-snappy
* Uses libsnappy.so (available in recent Linux distributions) and
libhadoopsnappy.so (JNI code compiled for the target platform)
snappy-java
* Uses libsnappyjava.so (which bundles the original snappy and the JNI code), or
snappyjava.dll (for Windows), libsnappyjava.jnilib (for Mac OS X)
* It copies one of these native libraries to the directory specified in the
org.xerial.snappy.tempdir or java.io.tmpdir system property.
* If the dependencies on glibc (on Linux, GLIBC 2.3 or higher is required
for now) or dylib (on Mac OS X) cause problems, you can re-compile
snappy-java's native library for your own platform only (with make clean-native
native). There is no need to build native libraries for the other
platforms if you never use them.
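As a rough illustration of the temporary-directory lookup described above, here
is a minimal sketch. The class and method names (NativeTempDir, resolve) are
hypothetical and are not part of snappy-java's API; the sketch only assumes
that the library-specific property takes precedence over the JVM's standard
temporary directory property (java.io.tmpdir).

```java
import java.io.File;

// Hypothetical helper for illustration only; not snappy-java's actual code.
final class NativeTempDir {
    // Library-specific override, as named in the comment above.
    static final String SNAPPY_TEMPDIR_KEY = "org.xerial.snappy.tempdir";
    // Standard JVM property for the default temporary directory.
    static final String JAVA_TMPDIR_KEY = "java.io.tmpdir";

    private NativeTempDir() {}

    // Returns the directory into which a native library would be copied.
    // Assumption: the snappy-specific property wins when it is set.
    static File resolve() {
        String dir = System.getProperty(SNAPPY_TEMPDIR_KEY);
        if (dir == null || dir.isEmpty()) {
            dir = System.getProperty(JAVA_TMPDIR_KEY);
        }
        if (dir == null || dir.isEmpty()) {
            dir = "."; // last resort: current working directory
        }
        return new File(dir);
    }
}
```

With this sketch, starting the JVM with -Dorg.xerial.snappy.tempdir=/some/dir
would make resolve() return that directory instead of the default one.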
What hadoop-snappy and snappy-java have in common:
* Both approaches need to compile the native code (libhadoopsnappy.so or
libsnappyjava.so) somewhere. My snappy-java simply provides pre-compiled
libsnappyjava.so for various platforms.
One of the design goals of snappy-java is to avoid trouble when linking against
native libraries (e.g., libsnappy.so), such as crashes due to libstdc++
incompatibility, missing libraries, etc. But as Alejandro suggested in my
discussion group, using separate libsnappy.so and libsnappyjava.so is
technically possible even in snappy-java:
* First, try to load the pre-installed libsnappy.so and libsnappyjava.so (the
version not containing libsnappy.so)
* If they are not found, extract the copies embedded in the JAR to some directory.
* Load both native libraries.
I am not sure whether supporting such a loading mechanism is the right way to go.
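For discussion, the fallback loading steps listed above could be sketched
roughly as follows. This is only a sketch under stated assumptions, not
snappy-java's implementation: the class name FallbackNativeLoader and the
"/native/" resource layout inside the JAR are hypothetical. It uses only
standard JDK calls (System.loadLibrary, System.mapLibraryName, System.load).

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

// Hypothetical loader for illustration only.
final class FallbackNativeLoader {
    private FallbackNativeLoader() {}

    // Tries the pre-installed library first, then a copy bundled in the JAR.
    // Returns "system" or "bundled" to report which path succeeded.
    static String load(String libName, Path extractDir) throws IOException {
        try {
            // Step 1: look for a pre-installed library on java.library.path.
            System.loadLibrary(libName);
            return "system";
        } catch (UnsatisfiedLinkError notInstalled) {
            // Not installed; fall through to the bundled copy.
        }
        // Platform-specific file name, e.g. "libsnappy.so" on Linux.
        String fileName = System.mapLibraryName(libName);
        // Assumed resource layout inside the JAR (hypothetical).
        String resource = "/native/" + fileName;
        try (InputStream in = FallbackNativeLoader.class.getResourceAsStream(resource)) {
            if (in == null) {
                throw new UnsatisfiedLinkError("No bundled copy of " + fileName);
            }
            // Step 2: extract the bundled library to the target directory.
            Path target = extractDir.resolve(fileName);
            Files.copy(in, target, StandardCopyOption.REPLACE_EXISTING);
            // Step 3: load the extracted file by absolute path.
            System.load(target.toAbsolutePath().toString());
            return "bundled";
        }
    }
}
```

In the two-library scheme, load would be called once for "snappy" and then once
for "snappyjava", in that order, so that the JNI library can resolve its
dependency on the already-loaded snappy library.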
> Integrate Snappy compression
> ----------------------------
>
> Key: HADOOP-7206
> URL: https://issues.apache.org/jira/browse/HADOOP-7206
> Project: Hadoop Common
> Issue Type: New Feature
> Affects Versions: 0.21.0
> Reporter: Eli Collins
> Assignee: T Jake Luciani
> Fix For: 0.23.0
>
> Attachments: HADOOP-7206-002.patch, HADOOP-7206.patch,
> v2-HADOOP-7206-snappy-codec-using-snappy-java.txt,
> v3-HADOOP-7206-snappy-codec-using-snappy-java.txt,
> v4-HADOOP-7206-snappy-codec-using-snappy-java.txt,
> v5-HADOOP-7206-snappy-codec-using-snappy-java.txt
>
>
> Google released Zippy as an open source (APLv2) project called Snappy
> (http://code.google.com/p/snappy). This tracks integrating it into Hadoop.
> {quote}
> Snappy is a compression/decompression library. It does not aim for maximum
> compression, or compatibility with any other compression library; instead, it
> aims for very high speeds and reasonable compression. For instance, compared
> to the fastest mode of zlib, Snappy is an order of magnitude faster for most
> inputs, but the resulting compressed files are anywhere from 20% to 100%
> bigger. On a single core of a Core i7 processor in 64-bit mode, Snappy
> compresses at about 250 MB/sec or more and decompresses at about 500 MB/sec
> or more.
> {quote}