[jira] [Created] (HADOOP-19162) Add LzoCodec implementation based on aircompressor

2024-05-02 Thread L. C. Hsieh (Jira)
L. C. Hsieh created HADOOP-19162:


 Summary: Add LzoCodec implementation based on aircompressor
 Key: HADOOP-19162
 URL: https://issues.apache.org/jira/browse/HADOOP-19162
 Project: Hadoop Common
  Issue Type: Improvement
Reporter: L. C. Hsieh


As I recall, Hadoop does not ship a built-in LzoCodec because of a licensing 
issue. Users can build and install an LZO codec such as hadoop-lzo manually, 
and some implement LzoCodec on top of other open source libraries such as 
aircompressor. Maintaining such a codec separately is inconvenient, though.

I'm wondering whether we can add an aircompressor-based LzoCodec 
implementation to Hadoop as the default LzoCodec.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-dev-h...@hadoop.apache.org



[jira] [Created] (HADOOP-17891) Lz4 should be excluded from relocation in shaded Hadoop libraries

2021-09-04 Thread L. C. Hsieh (Jira)
L. C. Hsieh created HADOOP-17891:


 Summary: Lz4 should be excluded from relocation in shaded Hadoop 
libraries
 Key: HADOOP-17891
 URL: https://issues.apache.org/jira/browse/HADOOP-17891
 Project: Hadoop Common
  Issue Type: Bug
Reporter: L. C. Hsieh


Lz4 is a provided dependency. In the shaded Hadoop libraries, e.g. 
hadoop-client-api, if we don't exclude lz4 from relocation, downstream 
projects still see the following exception even if they include the lz4 
dependency themselves:
{code:java}
[info]   Cause: java.lang.ClassNotFoundException: org.apache.hadoop.shaded.net.jpountz.lz4.LZ4Factory
[info]   at java.net.URLClassLoader.findClass(URLClassLoader.java:382)
[info]   at java.lang.ClassLoader.loadClass(ClassLoader.java:418)
[info]   at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:352)
[info]   at java.lang.ClassLoader.loadClass(ClassLoader.java:351)
[info]   at org.apache.hadoop.io.compress.lz4.Lz4Compressor.<init>(Lz4Compressor.java:66)
[info]   at org.apache.hadoop.io.compress.Lz4Codec.createCompressor(Lz4Codec.java:119)
[info]   at org.apache.hadoop.io.compress.CodecPool.getCompressor(CodecPool.java:152)
[info]   at org.apache.hadoop.io.compress.CodecPool.getCompressor(CodecPool.java:168)
{code}
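
The trace above shows Hadoop code asking for the relocated class name, while a downstream project that adds lz4-java only provides the original `net.jpountz.lz4` package. A minimal sketch of a probe that makes this mismatch visible follows; `RelocationProbe` and `isLoadable` are hypothetical names used for illustration and are not part of Hadoop.

```java
/** Hypothetical helper for illustration; not part of Hadoop. */
final class RelocationProbe {
  private RelocationProbe() {}

  /** Returns true if the named class can be found without initializing it. */
  static boolean isLoadable(String className) {
    try {
      Class.forName(className, false, RelocationProbe.class.getClassLoader());
      return true;
    } catch (ClassNotFoundException | LinkageError e) {
      // Not on the classpath under this name (or it failed to link).
      return false;
    }
  }

  public static void main(String[] args) {
    // Name provided by the lz4-java jar that a downstream project adds.
    System.out.println(isLoadable("net.jpountz.lz4.LZ4Factory"));
    // Name that relocated Hadoop bytecode asks for, per the trace above.
    System.out.println(
        isLoadable("org.apache.hadoop.shaded.net.jpountz.lz4.LZ4Factory"));
  }
}
```

On a classpath that contains lz4-java but no relocated copy, the first probe would be expected to succeed and the second to fail, which is the situation that excluding lz4 from relocation avoids.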







[jira] [Created] (HADOOP-17887) Remove GzipOutputStream

2021-09-02 Thread L. C. Hsieh (Jira)
L. C. Hsieh created HADOOP-17887:


 Summary: Remove GzipOutputStream
 Key: HADOOP-17887
 URL: https://issues.apache.org/jira/browse/HADOOP-17887
 Project: Hadoop Common
  Issue Type: Improvement
Reporter: L. C. Hsieh


Now that we provide a built-in gzip compressor, we can use it in the 
compressor stream. The wrapper GzipOutputStream can be removed.






[jira] [Created] (HADOOP-17877) BuiltInGzipCompressor header and trailer should not be static variables

2021-08-27 Thread L. C. Hsieh (Jira)
L. C. Hsieh created HADOOP-17877:


 Summary: BuiltInGzipCompressor header and trailer should not be 
static variables
 Key: HADOOP-17877
 URL: https://issues.apache.org/jira/browse/HADOOP-17877
 Project: Hadoop Common
  Issue Type: Bug
Reporter: L. C. Hsieh


In the newly added BuiltInGzipCompressor, the header and trailer should not be 
static variables, because each compressor instance needs its own.
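
The issue does not show the affected code, so the following is only a sketch of the general problem, assuming the trailer carries a running CRC-32 of the data written so far. `TrailerStateSketch` and its nested classes are hypothetical and are not Hadoop's implementation.

```java
import java.util.zip.CRC32;

/** Hypothetical classes for illustration; not Hadoop's BuiltInGzipCompressor. */
final class TrailerStateSketch {
  private TrailerStateSketch() {}

  /** Problematic variant: every instance updates the same static checksum. */
  static final class SharedTrailer {
    private static final CRC32 CRC = new CRC32();

    void update(byte[] data) {
      CRC.update(data, 0, data.length);
    }

    long checksum() {
      return CRC.getValue();
    }
  }

  /** Fixed variant: each instance owns its checksum. */
  static final class PerInstanceTrailer {
    private final CRC32 crc = new CRC32();

    void update(byte[] data) {
      crc.update(data, 0, data.length);
    }

    long checksum() {
      return crc.getValue();
    }
  }

  /** Reference CRC-32 of a single buffer. */
  static long crcOf(byte[] data) {
    CRC32 c = new CRC32();
    c.update(data, 0, data.length);
    return c.getValue();
  }
}
```

With the static field, two compressors fed different data always report the same checksum, because both feed one shared accumulator. With the instance field, each checksum matches the data that instance actually saw.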






[jira] [Created] (HADOOP-17868) Add more test for the BuiltInGzipCompressor

2021-08-26 Thread L. C. Hsieh (Jira)
L. C. Hsieh created HADOOP-17868:


 Summary: Add more test for the BuiltInGzipCompressor
 Key: HADOOP-17868
 URL: https://issues.apache.org/jira/browse/HADOOP-17868
 Project: Hadoop Common
  Issue Type: Test
Reporter: L. C. Hsieh


We added BuiltInGzipCompressor recently. It would be good to add more 
compatibility tests for the compressor.






[jira] [Resolved] (HADOOP-17832) Replacing native lib with their Java wrappers

2021-08-02 Thread L. C. Hsieh (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-17832?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

L. C. Hsieh resolved HADOOP-17832.
--
Resolution: Later

> Replacing native lib with their Java wrappers
> -
>
> Key: HADOOP-17832
> URL: https://issues.apache.org/jira/browse/HADOOP-17832
> Project: Hadoop Common
>  Issue Type: Improvement
>Reporter: L. C. Hsieh
>Priority: Major
>
> This is an umbrella ticket covering all work for replacing native libs with 
> their Java wrappers.






[jira] [Created] (HADOOP-17832) Replacing native lib with their Java wrappers

2021-08-02 Thread L. C. Hsieh (Jira)
L. C. Hsieh created HADOOP-17832:


 Summary: Replacing native lib with their Java wrappers
 Key: HADOOP-17832
 URL: https://issues.apache.org/jira/browse/HADOOP-17832
 Project: Hadoop Common
  Issue Type: Improvement
Reporter: L. C. Hsieh


This is an umbrella ticket covering all work for replacing native libs with 
their Java wrappers.






[jira] [Created] (HADOOP-17825) Add BuiltInGzipCompressor

2021-07-30 Thread L. C. Hsieh (Jira)
L. C. Hsieh created HADOOP-17825:


 Summary: Add BuiltInGzipCompressor
 Key: HADOOP-17825
 URL: https://issues.apache.org/jira/browse/HADOOP-17825
 Project: Hadoop Common
  Issue Type: Improvement
Reporter: L. C. Hsieh


Currently, when native zlib is not loaded, GzipCodec supports only 
BuiltInGzipDecompressor. So, without the Hadoop native codec installed, saving 
a SequenceFile with GzipCodec throws an exception like "SequenceFile doesn't 
work with GzipCodec without native-hadoop code!"

As with the other codecs that we migrated to prepackaged libraries (lz4, 
snappy), it would be better to support GzipCodec generally, without the Hadoop 
native codec installed. Similar to BuiltInGzipDecompressor, we can use the Java 
Deflater to implement BuiltInGzipCompressor.
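
To illustrate the idea, here is a minimal sketch of producing a gzip stream with only `java.util.zip.Deflater`: a fixed header, a raw deflate body, and a CRC-32 plus length trailer, following the gzip format (RFC 1952). `DeflaterGzipSketch` is a hypothetical class, works on whole byte arrays rather than Hadoop's streaming `Compressor` interface, and is not the code proposed for Hadoop.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.zip.CRC32;
import java.util.zip.Deflater;
import java.util.zip.GZIPInputStream;

/** Hypothetical sketch; not Hadoop's BuiltInGzipCompressor. */
final class DeflaterGzipSketch {
  private DeflaterGzipSketch() {}

  // 10-byte gzip header: magic (0x1f 0x8b), CM=8 (deflate), no flags,
  // zero mtime (4 bytes), no extra flags, OS=0.
  private static final byte[] HEADER = {0x1f, (byte) 0x8b, 8, 0, 0, 0, 0, 0, 0, 0};

  static byte[] gzip(byte[] input) {
    ByteArrayOutputStream out = new ByteArrayOutputStream();
    out.write(HEADER, 0, HEADER.length);

    // nowrap=true emits a raw deflate stream without zlib framing,
    // which is what the gzip container wraps.
    Deflater deflater = new Deflater(Deflater.DEFAULT_COMPRESSION, true);
    try {
      deflater.setInput(input);
      deflater.finish();
      byte[] buf = new byte[4096];
      while (!deflater.finished()) {
        int n = deflater.deflate(buf);
        out.write(buf, 0, n);
      }
    } finally {
      deflater.end();
    }

    // Trailer: CRC-32 of the uncompressed data, then its length mod 2^32,
    // both little-endian.
    CRC32 crc = new CRC32();
    crc.update(input, 0, input.length);
    writeIntLE(out, (int) crc.getValue());
    writeIntLE(out, input.length);
    return out.toByteArray();
  }

  private static void writeIntLE(ByteArrayOutputStream out, int v) {
    out.write(v & 0xff);
    out.write((v >>> 8) & 0xff);
    out.write((v >>> 16) & 0xff);
    out.write((v >>> 24) & 0xff);
  }

  /** Decompresses with the JDK's gzip reader, which also checks the trailer. */
  static byte[] gunzip(byte[] compressed) {
    try (GZIPInputStream in =
        new GZIPInputStream(new ByteArrayInputStream(compressed))) {
      ByteArrayOutputStream out = new ByteArrayOutputStream();
      byte[] buf = new byte[4096];
      int n;
      while ((n = in.read(buf)) != -1) {
        out.write(buf, 0, n);
      }
      return out.toByteArray();
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }
}
```

Reading the result back through `GZIPInputStream` serves as a basic compatibility check, since that class validates the header and the CRC-32 and length in the trailer.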








[jira] [Created] (HADOOP-17472) Fix hadoop.tools.dynamometer.TestDynamometerInfra

2021-01-14 Thread L. C. Hsieh (Jira)
L. C. Hsieh created HADOOP-17472:


 Summary: Fix hadoop.tools.dynamometer.TestDynamometerInfra
 Key: HADOOP-17472
 URL: https://issues.apache.org/jira/browse/HADOOP-17472
 Project: Hadoop Common
  Issue Type: Test
Reporter: L. C. Hsieh


Currently hadoop.tools.dynamometer.TestDynamometerInfra is failing in trunk.

{code}
[INFO] ---
[INFO]  T E S T S
[INFO] ---
[INFO] Running org.apache.hadoop.tools.dynamometer.TestDynamometerInfra
[ERROR] Tests run: 1, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 0.531 s <<< FAILURE! - in org.apache.hadoop.tools.dynamometer.TestDynamometerInfra
[ERROR] org.apache.hadoop.tools.dynamometer.TestDynamometerInfra  Time elapsed: 0.53 s  <<< ERROR!
java.io.FileNotFoundException: http://mirrors.ocf.berkeley.edu/apache/hadoop/common/hadoop-3.1.3/hadoop-3.1.3.tar.gz
at java.base/sun.net.www.protocol.http.HttpURLConnection.getInputStream0(HttpURLConnection.java:1923)
at java.base/sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1523)
at org.apache.commons.io.FileUtils.copyURLToFile(FileUtils.java:1506)
at org.apache.hadoop.tools.dynamometer.DynoInfraUtils.fetchHadoopTarball(DynoInfraUtils.java:151)
at org.apache.hadoop.tools.dynamometer.TestDynamometerInfra.setupClass(TestDynamometerInfra.java:176)
{code}






[jira] [Created] (HADOOP-17464) Create hadoop-compression module

2021-01-08 Thread L. C. Hsieh (Jira)
L. C. Hsieh created HADOOP-17464:


 Summary: Create hadoop-compression module
 Key: HADOOP-17464
 URL: https://issues.apache.org/jira/browse/HADOOP-17464
 Project: Hadoop Common
  Issue Type: Improvement
Reporter: L. C. Hsieh


We added the lz4-java and snappy-java dependencies to replace native libs. As 
suggested in the review comments, we should add a Hadoop module to hold these 
extra dependencies, to avoid messing up the dependencies of user applications.








[jira] [Created] (HADOOP-17425) Bump up snappy-java to 1.1.8.2

2020-12-09 Thread L. C. Hsieh (Jira)
L. C. Hsieh created HADOOP-17425:


 Summary: Bump up snappy-java to 1.1.8.2
 Key: HADOOP-17425
 URL: https://issues.apache.org/jira/browse/HADOOP-17425
 Project: Hadoop Common
  Issue Type: Improvement
Reporter: L. C. Hsieh


1.1.8.2 includes:

* Support Apple Silicon (M1, Mac-aarch64)
* Fixed the pure-java Snappy fallback logic when no native library for your 
platform is found.






[jira] [Created] (HADOOP-17391) Add lz4-java as hadoop-hdfs test dependency

2020-11-21 Thread L. C. Hsieh (Jira)
L. C. Hsieh created HADOOP-17391:


 Summary: Add lz4-java as hadoop-hdfs test dependency
 Key: HADOOP-17391
 URL: https://issues.apache.org/jira/browse/HADOOP-17391
 Project: Hadoop Common
  Issue Type: Test
Reporter: L. C. Hsieh









[jira] [Created] (HADOOP-17292) Using lz4-java in Lz4Codec

2020-09-29 Thread L. C. Hsieh (Jira)
L. C. Hsieh created HADOOP-17292:


 Summary: Using lz4-java in Lz4Codec
 Key: HADOOP-17292
 URL: https://issues.apache.org/jira/browse/HADOOP-17292
 Project: Hadoop Common
  Issue Type: New Feature
  Components: common
Affects Versions: 3.3.0
Reporter: L. C. Hsieh


In Hadoop, we use native libs for the lz4 codec, which has several 
disadvantages:

* It requires native libhadoop to be installed in the system LD_LIBRARY_PATH, 
and it has to be installed separately on each cluster node, container image, or 
local test environment, which adds a lot of deployment complexity. In some 
environments the natives must be compiled from source, which is non-trivial. 
This approach is also platform dependent: a binary may not work on a different 
platform, so recompilation is required.
* It requires extra configuration of java.library.path to load the natives, 
which results in higher application deployment and maintenance costs for users.

Projects such as Spark use [lz4-java|https://github.com/lz4/lz4-java], which is 
a JNI-based implementation. It bundles native binaries for Linux, Mac, and IBM 
in its jar file and can load them into the JVM automatically from the jar 
without any setup. If no native implementation can be found for a platform, it 
can fall back to a pure-Java implementation of lz4.
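
The load-native-then-fall-back behaviour described above can be sketched in general terms with the JDK alone. This is not lz4-java's actual loading code: `NativeFallbackSketch`, `Codec`, and the library name passed in are hypothetical, and the sketch only shows the pattern of trying a native library and degrading to a pure-Java path when it is unavailable.

```java
/** Hypothetical sketch of a native-with-fallback loader; not lz4-java code. */
final class NativeFallbackSketch {
  private NativeFallbackSketch() {}

  /** Minimal stand-in for a codec implementation. */
  interface Codec {
    String name();
  }

  /** Tries the named native library first, then falls back to pure Java. */
  static Codec load(String libraryName) {
    try {
      System.loadLibrary(libraryName);
      return () -> "native:" + libraryName;
    } catch (UnsatisfiedLinkError | SecurityException e) {
      // No usable native binary for this platform: use the Java path.
      return () -> "pure-java";
    }
  }

  public static void main(String[] args) {
    // The library name is made up, so this is expected to fall back.
    System.out.println(load("no_such_native_codec_library").name());
  }
}
```

The point of the design is that callers never have to configure java.library.path or know which path was taken; they only see a working codec.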



