[jira] [Commented] (HADOOP-7206) Integrate Snappy compression

Taro L. Saito (JIRA) Thu, 23 Jun 2011 19:20:14 -0700

    [ 
https://issues.apache.org/jira/browse/HADOOP-7206?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13054215#comment-13054215
 ]


Taro L. Saito commented on HADOOP-7206:
---------------------------------------

@Issei @Alejandro
Great. That means that as long as the same classloader is used (as Hadoop 
seems to do), sharing libsnappy.so between hadoop-snappy and snappy-java is no 
problem. So whether or not to use libsnappy.so is now a question for 
snappy-java alone, and I prefer to use libsnappyjava.so (snappy statically 
linked with the JNI code and built with the -fvisibility=hidden option), which 
avoids potential API conflicts and missing-library problems (on some OSes). 

In my experience developing sqlite-jdbc 
(http://sqlite-jdbc.googlecode.com/), which uses the same technique of 
extracting the .so file at runtime, many people seem to be satisfied with this 
approach. Runtime library extraction avoids failures caused by 
misconfiguration (e.g., LD_LIBRARY_PATH) and by differences in the library 
build process (gcc, linker options, etc.) on each OS. For example, I often 
develop code on Windows but run the production code on Linux; not having to 
switch library files helps me a lot. However, judging from HADOOP-7405, 
Hadoop's current native libraries are not very portable across OSes. Given 
that, the motivation for using a portable library such as snappy-java might 
be low.
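
The runtime extraction described above can be sketched roughly as follows. 
This is only an illustration and not the actual snappy-java or sqlite-jdbc 
code: the class name, method names, and the resource path passed in are 
hypothetical, and it assumes Java 7 or later for java.nio.file.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

// Hypothetical sketch: copy a native library bundled in the jar to a
// temporary file, then load it by absolute path so that LD_LIBRARY_PATH
// (or java.library.path) does not need to be configured.
public class NativeExtractSketch {

    /** Copies a classpath resource to a temp file and returns its path. */
    static Path extract(String resourcePath, String suffix) throws IOException {
        try (InputStream in =
                 NativeExtractSketch.class.getResourceAsStream(resourcePath)) {
            if (in == null) {
                // No library was bundled for this platform.
                throw new IOException("resource not found: " + resourcePath);
            }
            Path out = Files.createTempFile("native-", suffix);
            out.toFile().deleteOnExit();
            Files.copy(in, out, StandardCopyOption.REPLACE_EXISTING);
            return out;
        }
    }

    /** Extracts the bundled library and loads it into the JVM. */
    static void loadBundled(String resourcePath) throws IOException {
        Path lib = extract(resourcePath, ".so");
        // System.load takes an absolute file name, unlike System.loadLibrary.
        System.load(lib.toAbsolutePath().toString());
    }
}
```

A caller would invoke loadBundled once, before the first JNI call, with the 
resource path of the library built for the current platform.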

I don't care which one is used in Hadoop, but the discussion in this thread has 
been useful for me to improve snappy-java. Thanks!

@Allen
a) OS X (32-bit/64-bit) are already supported.  
b) I need to know the values of the os.name and os.arch system properties 
that the IBM JVM reports. 
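
To show why those two property values matter, here is a minimal sketch of 
mapping os.name and os.arch to the location of a bundled library. The 
"/native/<os>/<arch>" layout and all names below are assumptions made for 
illustration, not the layout snappy-java actually uses; unrecognized values 
(such as whatever an IBM JVM reports) simply fall through with non-word 
characters removed.

```java
import java.util.Locale;

// Hypothetical sketch: derive a resource directory from the os.name and
// os.arch system properties.
public class PlatformKeySketch {

    static String normalizeOs(String osName) {
        String n = osName.toLowerCase(Locale.ROOT);
        if (n.contains("windows")) return "Windows";
        if (n.contains("mac")) return "Mac";
        if (n.contains("linux")) return "Linux";
        // Unknown OS: keep the reported name, minus non-word characters.
        return osName.replaceAll("\\W", "");
    }

    static String normalizeArch(String osArch) {
        String a = osArch.toLowerCase(Locale.ROOT);
        if (a.equals("amd64") || a.equals("x86_64")) return "x86_64";
        if (a.equals("x86") || a.equals("i386") || a.equals("i686")) return "x86";
        // Unknown architecture: keep the reported name as-is.
        return osArch.replaceAll("\\W", "");
    }

    /** Returns the assumed resource directory for the given platform. */
    static String resourceDir(String osName, String osArch) {
        return "/native/" + normalizeOs(osName) + "/" + normalizeArch(osArch);
    }
}
```

At runtime the two arguments would come from System.getProperty("os.name") 
and System.getProperty("os.arch").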

Building a native library that is not yet bundled and embedding it into 
snappy-java is simple; just run "make".  
As a matter of fact, I do this for 6 OS and CPU combinations. Also, by using 
VMware Fusion on a Mac, all of the currently supported native libraries can be 
compiled on a single machine. Some 64-bit OSes (e.g., Windows and Linux) can 
also be used to build 32-bit native libraries. 



> Integrate Snappy compression
> ----------------------------
>
>                 Key: HADOOP-7206
>                 URL: https://issues.apache.org/jira/browse/HADOOP-7206
>             Project: Hadoop Common
>          Issue Type: New Feature
>    Affects Versions: 0.21.0
>            Reporter: Eli Collins
>            Assignee: Alejandro Abdelnur
>             Fix For: 0.23.0
>
>         Attachments: HADOOP-7206-002.patch, HADOOP-7206.patch, 
> v2-HADOOP-7206-snappy-codec-using-snappy-java.txt, 
> v3-HADOOP-7206-snappy-codec-using-snappy-java.txt, 
> v4-HADOOP-7206-snappy-codec-using-snappy-java.txt, 
> v5-HADOOP-7206-snappy-codec-using-snappy-java.txt
>
>
> Google released Zippy as an open source (APLv2) project called Snappy 
> (http://code.google.com/p/snappy). This tracks integrating it into Hadoop.
> {quote}
> Snappy is a compression/decompression library. It does not aim for maximum 
> compression, or compatibility with any other compression library; instead, it 
> aims for very high speeds and reasonable compression. For instance, compared 
> to the fastest mode of zlib, Snappy is an order of magnitude faster for most 
> inputs, but the resulting compressed files are anywhere from 20% to 100% 
> bigger. On a single core of a Core i7 processor in 64-bit mode, Snappy 
> compresses at about 250 MB/sec or more and decompresses at about 500 MB/sec 
> or more.
> {quote}

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
