[jira] [Work logged] (HADOOP-17292) Using lz4-java in Lz4Codec
[ https://issues.apache.org/jira/browse/HADOOP-17292?focusedWorklogId=529266&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-529266 ]

ASF GitHub Bot logged work on HADOOP-17292:
---
Author: ASF GitHub Bot
Created on: 29/Dec/20 21:31
Start Date: 29/Dec/20 21:31
Worklog Time Spent: 10m
Work Description: sunchao commented on pull request #2576:
URL: https://github.com/apache/hadoop/pull/2576#issuecomment-752248558

All tests passed except `TestCompressorDecompressor.testCompressorDecompressorWithExeedBufferLimit`, which is fixed by backporting [HADOOP-17270](https://issues.apache.org/jira/browse/HADOOP-17270) to branch-3.3. Closing this.

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

Issue Time Tracking
---
Worklog Id: (was: 529266)
Time Spent: 12h (was: 11h 50m)

> Using lz4-java in Lz4Codec
> --
>
> Key: HADOOP-17292
> URL: https://issues.apache.org/jira/browse/HADOOP-17292
> Project: Hadoop Common
> Issue Type: New Feature
> Components: common
> Affects Versions: 3.3.0
> Reporter: L. C. Hsieh
> Assignee: L. C. Hsieh
> Priority: Major
> Labels: pull-request-available
> Fix For: 3.4.0
>
> Time Spent: 12h
> Remaining Estimate: 0h
>
> In Hadoop, we use native libs for the lz4 codec, which has several disadvantages:
>
> It requires native libhadoop to be installed in the system LD_LIBRARY_PATH, and
> it has to be installed separately on each cluster node, container image, or
> local test environment, which adds significant complexity from a deployment
> point of view. In some environments, it requires compiling the natives from
> source, which is non-trivial. Also, this approach is platform dependent; the
> binary may not work on a different platform, so it requires recompilation.
>
> It requires extra configuration of java.library.path to load the natives, and
> it results in higher application deployment and maintenance costs for users.
>
> Projects such as Spark use [lz4-java|https://github.com/lz4/lz4-java], which
> is a JNI-based implementation. It contains native binaries in the jar file and
> can automatically load them into the JVM from the jar without any setup. If a
> native implementation cannot be found for a platform, it can fall back to a
> pure-Java implementation of lz4.

--
This message was sent by Atlassian Jira
(v8.3.4#803005)

---
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org
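
The behaviour the issue description attributes to lz4-java (use a native binding when one can be loaded, otherwise fall back to pure Java) can be illustrated with a minimal, standard-library-only sketch. This is not the code from the pull request or from lz4-java: `CodecLoader`, `BlockCodec`, and the library name are hypothetical and exist only to show the pattern. Unlike lz4-java, which according to the description ships its native binaries inside the jar, the sketch simply looks the library up on `java.library.path`.

```java
/**
 * Illustrative sketch only: CodecLoader, BlockCodec and the library name
 * are hypothetical and are not part of Hadoop or lz4-java.
 */
final class CodecLoader {

    /** Minimal stand-in for a codec; it only reports which backend was chosen. */
    interface BlockCodec {
        String backend();
    }

    /**
     * Tries to load a native library by name and falls back to a
     * pure-Java backend when the JVM cannot find or load it.
     */
    static BlockCodec load(String nativeLibraryName) {
        try {
            // Searches java.library.path; throws UnsatisfiedLinkError if absent.
            System.loadLibrary(nativeLibraryName);
            return () -> "native";
        } catch (UnsatisfiedLinkError | SecurityException e) {
            // No usable native binary on this platform: degrade instead of failing.
            return () -> "pure-java";
        }
    }

    public static void main(String[] args) {
        BlockCodec codec = load("hypothetical_lz4_binding");
        System.out.println("Selected backend: " + codec.backend());
    }
}
```

On a machine where no library of that name is installed, the sketch selects the pure-Java backend without any extra configuration, which is the deployment property the issue is after. In lz4-java itself this kind of selection is, as far as I know, exposed through `LZ4Factory.fastestInstance()`.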
[jira] [Work logged] (HADOOP-17292) Using lz4-java in Lz4Codec
[ https://issues.apache.org/jira/browse/HADOOP-17292?focusedWorklogId=529265&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-529265 ]

ASF GitHub Bot logged work on HADOOP-17292:
---
Author: ASF GitHub Bot
Created on: 29/Dec/20 21:31
Start Date: 29/Dec/20 21:31
Worklog Time Spent: 10m
Work Description: sunchao closed pull request #2576:
URL: https://github.com/apache/hadoop/pull/2576

Issue Time Tracking
---
Worklog Id: (was: 529265)
Time Spent: 11h 50m (was: 11h 40m)
[jira] [Work logged] (HADOOP-17292) Using lz4-java in Lz4Codec
[ https://issues.apache.org/jira/browse/HADOOP-17292?focusedWorklogId=529260&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-529260 ]

ASF GitHub Bot logged work on HADOOP-17292:
---
Author: ASF GitHub Bot
Created on: 29/Dec/20 20:54
Start Date: 29/Dec/20 20:54
Worklog Time Spent: 10m
Work Description: hadoop-yetus commented on pull request #2576:
URL: https://github.com/apache/hadoop/pull/2576#issuecomment-752237975

:broken_heart: **-1 overall**

| Vote | Subsystem | Runtime | Comment |
|:----:|----------:|--------:|:--------|
| +0 :ok: | reexec | 24m 38s | Docker mode activated. |
||| _ Prechecks _ |
| +1 :green_heart: | dupname | 0m 1s | No case conflicting files found. |
| +1 :green_heart: | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 :green_heart: | test4tests | 0m 0s | The patch appears to include 6 new or modified test files. |
||| _ branch-3.3 Compile Tests _ |
| +0 :ok: | mvndep | 4m 1s | Maven dependency ordering for branch |
| +1 :green_heart: | mvninstall | 28m 54s | branch-3.3 passed |
| +1 :green_heart: | compile | 16m 4s | branch-3.3 passed |
| +1 :green_heart: | checkstyle | 2m 44s | branch-3.3 passed |
| +1 :green_heart: | mvnsite | 3m 16s | branch-3.3 passed |
| +1 :green_heart: | shadedclient | 17m 3s | branch has no errors when building and testing our client artifacts. |
| +1 :green_heart: | javadoc | 2m 34s | branch-3.3 passed |
| +0 :ok: | spotbugs | 0m 57s | Used deprecated FindBugs config; considering switching to SpotBugs. |
| +0 :ok: | findbugs | 0m 39s | branch/hadoop-project no findbugs output file (findbugsXml.xml) |
||| _ Patch Compile Tests _ |
| +0 :ok: | mvndep | 0m 34s | Maven dependency ordering for patch |
| +1 :green_heart: | mvninstall | 1m 40s | the patch passed |
| +1 :green_heart: | compile | 15m 40s | the patch passed |
| +1 :green_heart: | cc | 15m 40s | the patch passed |
| +1 :green_heart: | golang | 15m 40s | the patch passed |
| -1 :x: | javac | 15m 40s | root generated 1 new + 1872 unchanged - 0 fixed = 1873 total (was 1872) |
| -0 :warning: | checkstyle | 2m 49s | root: The patch generated 2 new + 132 unchanged - 1 fixed = 134 total (was 133) |
| +1 :green_heart: | mvnsite | 3m 12s | the patch passed |
| +1 :green_heart: | shellcheck | 0m 0s | There were no new shellcheck issues. |
| +1 :green_heart: | shelldocs | 0m 44s | There were no new shelldocs issues. |
| +1 :green_heart: | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 :green_heart: | xml | 0m 4s | The patch has no ill-formed XML file. |
| +1 :green_heart: | shadedclient | 15m 51s | patch has no errors when building and testing our client artifacts. |
| +1 :green_heart: | javadoc | 2m 41s | the patch passed |
| +0 :ok: | findbugs | 0m 43s | hadoop-project has no data from findbugs |
||| _ Other Tests _ |
| +1 :green_heart: | unit | 0m 42s | hadoop-project in the patch passed. |
| -1 :x: | unit | 11m 58s | hadoop-common in the patch passed. |
| +1 :green_heart: | unit | 10m 4s | hadoop-mapreduce-client-nativetask in the patch passed. |
| -1 :x: | asflicense | 1m 6s | The patch generated 2 ASF License warnings. |
| | | 181m 22s | |

| Reason | Tests |
|-------:|:------|
| Failed junit tests | hadoop.io.compress.TestCompressorDecompressor |

| Subsystem | Report/Notes |
|----------:|:-------------|
| Docker | ClientAPI=1.41 ServerAPI=1.41 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2576/1/artifact/out/Dockerfile |
| GITHUB PR | https://github.com/apache/hadoop/pull/2576 |
| Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient xml cc findbugs checkstyle golang shellcheck shelldocs |
| uname | Linux 73661674f7bb 4.15.0-58-generic #64-Ubuntu SMP Tue Aug 6 11:12:41 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | dev-support/bin/hadoop.sh |
| git revision | branch-3.3 / 3736f6e |
| Default Java | Private Build-1.8.0_275-8u275-b01-0ubuntu1~16.04-b01 |
| javac | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2576/1/artifact/out/diff-compile-javac-root.txt |
| checkstyle | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2576/1/artifact/out/diff-checkstyle-root.txt |
| unit | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2576/1/artifact/out/patch-unit-hadoop-common-project_hadoop-common.txt |
| Test Results | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2576/1/testRepo
[jira] [Work logged] (HADOOP-17292) Using lz4-java in Lz4Codec
[ https://issues.apache.org/jira/browse/HADOOP-17292?focusedWorklogId=529200&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-529200 ]

ASF GitHub Bot logged work on HADOOP-17292:
---
Author: ASF GitHub Bot
Created on: 29/Dec/20 17:53
Start Date: 29/Dec/20 17:53
Worklog Time Spent: 10m
Work Description: sunchao commented on pull request #2576:
URL: https://github.com/apache/hadoop/pull/2576#issuecomment-752182613

The purpose of this PR is to run all the tests (there were some conflicts during backporting). If everything looks good, I'll cherry-pick it directly to branch-3.3.

Issue Time Tracking
---
Worklog Id: (was: 529200)
Time Spent: 11.5h (was: 11h 20m)
[jira] [Work logged] (HADOOP-17292) Using lz4-java in Lz4Codec
[ https://issues.apache.org/jira/browse/HADOOP-17292?focusedWorklogId=529199&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-529199 ]

ASF GitHub Bot logged work on HADOOP-17292:
---
Author: ASF GitHub Bot
Created on: 29/Dec/20 17:52
Start Date: 29/Dec/20 17:52
Worklog Time Spent: 10m
Work Description: sunchao opened a new pull request #2576:
URL: https://github.com/apache/hadoop/pull/2576

Backporting #2350 to branch-3.3

Issue Time Tracking
---
Worklog Id: (was: 529199)
Time Spent: 11h 20m (was: 11h 10m)
[jira] [Work logged] (HADOOP-17292) Using lz4-java in Lz4Codec
[ https://issues.apache.org/jira/browse/HADOOP-17292?focusedWorklogId=514149&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-514149 ]

ASF GitHub Bot logged work on HADOOP-17292:
---
Author: ASF GitHub Bot
Created on: 19/Nov/20 15:17
Start Date: 19/Nov/20 15:17
Worklog Time Spent: 10m
Work Description: steveloughran commented on pull request #2350:
URL: https://github.com/apache/hadoop/pull/2350#issuecomment-730443651

Ok. Let's leave it to simmer in trunk for the weekend, and then we can worry about backporting to branch-3.3.

Issue Time Tracking
---
Worklog Id: (was: 514149)
Time Spent: 11h 10m (was: 11h)
[jira] [Work logged] (HADOOP-17292) Using lz4-java in Lz4Codec
[ https://issues.apache.org/jira/browse/HADOOP-17292?focusedWorklogId=513770&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-513770 ]

ASF GitHub Bot logged work on HADOOP-17292:
---
Author: ASF GitHub Bot
Created on: 18/Nov/20 20:29
Start Date: 18/Nov/20 20:29
Worklog Time Spent: 10m
Work Description: viirya commented on pull request #2350:
URL: https://github.com/apache/hadoop/pull/2350#issuecomment-729935534

Thanks all!

Issue Time Tracking
---
Worklog Id: (was: 513770)
Time Spent: 11h (was: 10h 50m)
[jira] [Work logged] (HADOOP-17292) Using lz4-java in Lz4Codec
[ https://issues.apache.org/jira/browse/HADOOP-17292?focusedWorklogId=513754&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-513754 ]

ASF GitHub Bot logged work on HADOOP-17292:
---
Author: ASF GitHub Bot
Created on: 18/Nov/20 20:04
Start Date: 18/Nov/20 20:04
Worklog Time Spent: 10m
Work Description: sunchao commented on pull request #2350:
URL: https://github.com/apache/hadoop/pull/2350#issuecomment-729922347

Thanks @steveloughran for the LGTM. The last run looks good and I just merged this into trunk. Thanks @viirya for the contribution and everyone for reviewing!

Issue Time Tracking
---
Worklog Id: (was: 513754)
Time Spent: 10h 50m (was: 10h 40m)
[jira] [Work logged] (HADOOP-17292) Using lz4-java in Lz4Codec
[ https://issues.apache.org/jira/browse/HADOOP-17292?focusedWorklogId=513753&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-513753 ]

ASF GitHub Bot logged work on HADOOP-17292:
---
Author: ASF GitHub Bot
Created on: 18/Nov/20 20:03
Start Date: 18/Nov/20 20:03
Worklog Time Spent: 10m
Work Description: sunchao merged pull request #2350:
URL: https://github.com/apache/hadoop/pull/2350

Issue Time Tracking
---
Worklog Id: (was: 513753)
Time Spent: 10h 40m (was: 10.5h)
[jira] [Work logged] (HADOOP-17292) Using lz4-java in Lz4Codec
[ https://issues.apache.org/jira/browse/HADOOP-17292?focusedWorklogId=51&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-51 ]

ASF GitHub Bot logged work on HADOOP-17292:
---
Author: ASF GitHub Bot
Created on: 18/Nov/20 04:14
Start Date: 18/Nov/20 04:14
Worklog Time Spent: 10m
Work Description: hadoop-yetus commented on pull request #2350:
URL: https://github.com/apache/hadoop/pull/2350#issuecomment-729393651

:broken_heart: **-1 overall**

| Vote | Subsystem | Runtime | Logfile | Comment |
|:----:|----------:|--------:|:--------:|:-------:|
| +0 :ok: | reexec | 1m 10s | | Docker mode activated. |
|||| _ Prechecks _ |
| +1 :green_heart: | dupname | 0m 0s | | No case conflicting files found. |
| +1 :green_heart: | @author | 0m 0s | | The patch does not contain any @author tags. |
| +1 :green_heart: | | 0m 0s | [test4tests](test4tests) | The patch appears to include 6 new or modified test files. |
|||| _ trunk Compile Tests _ |
| +0 :ok: | mvndep | 14m 54s | | Maven dependency ordering for branch |
| +1 :green_heart: | mvninstall | 23m 9s | | trunk passed |
| +1 :green_heart: | compile | 21m 22s | | trunk passed with JDK Ubuntu-11.0.9.1+1-Ubuntu-0ubuntu1.18.04 |
| +1 :green_heart: | compile | 18m 24s | | trunk passed with JDK Private Build-1.8.0_275-8u275-b01-0ubuntu1~18.04-b01 |
| +1 :green_heart: | checkstyle | 2m 58s | | trunk passed |
| +1 :green_heart: | mvnsite | 2m 47s | | trunk passed |
| +1 :green_heart: | shadedclient | 17m 52s | | branch has no errors when building and testing our client artifacts. |
| +1 :green_heart: | javadoc | 2m 13s | | trunk passed with JDK Ubuntu-11.0.9.1+1-Ubuntu-0ubuntu1.18.04 |
| +1 :green_heart: | javadoc | 2m 41s | | trunk passed with JDK Private Build-1.8.0_275-8u275-b01-0ubuntu1~18.04-b01 |
| +0 :ok: | spotbugs | 0m 50s | | Used deprecated FindBugs config; considering switching to SpotBugs. |
| +0 :ok: | findbugs | 0m 34s | | branch/hadoop-project no findbugs output file (findbugsXml.xml) |
|||| _ Patch Compile Tests _ |
| +0 :ok: | mvndep | 0m 32s | | Maven dependency ordering for patch |
| +1 :green_heart: | mvninstall | 1m 30s | | the patch passed |
| +1 :green_heart: | compile | 20m 35s | | the patch passed with JDK Ubuntu-11.0.9.1+1-Ubuntu-0ubuntu1.18.04 |
| -1 :x: | cc | 20m 35s | [/diff-compile-cc-root-jdkUbuntu-11.0.9.1+1-Ubuntu-0ubuntu1.18.04.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2350/14/artifact/out/diff-compile-cc-root-jdkUbuntu-11.0.9.1+1-Ubuntu-0ubuntu1.18.04.txt) | root-jdkUbuntu-11.0.9.1+1-Ubuntu-0ubuntu1.18.04 with JDK Ubuntu-11.0.9.1+1-Ubuntu-0ubuntu1.18.04 generated 37 new + 134 unchanged - 37 fixed = 171 total (was 171) |
| +1 :green_heart: | golang | 20m 35s | | the patch passed |
| -1 :x: | javac | 20m 35s | [/diff-compile-javac-root-jdkUbuntu-11.0.9.1+1-Ubuntu-0ubuntu1.18.04.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2350/14/artifact/out/diff-compile-javac-root-jdkUbuntu-11.0.9.1+1-Ubuntu-0ubuntu1.18.04.txt) | root-jdkUbuntu-11.0.9.1+1-Ubuntu-0ubuntu1.18.04 with JDK Ubuntu-11.0.9.1+1-Ubuntu-0ubuntu1.18.04 generated 2 new + 2051 unchanged - 1 fixed = 2053 total (was 2052) |
| +1 :green_heart: | compile | 18m 8s | | the patch passed with JDK Private Build-1.8.0_275-8u275-b01-0ubuntu1~18.04-b01 |
| -1 :x: | cc | 18m 8s | [/diff-compile-cc-root-jdkPrivateBuild-1.8.0_275-8u275-b01-0ubuntu1~18.04-b01.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2350/14/artifact/out/diff-compile-cc-root-jdkPrivateBuild-1.8.0_275-8u275-b01-0ubuntu1~18.04-b01.txt) | root-jdkPrivateBuild-1.8.0_275-8u275-b01-0ubuntu1~18.04-b01 with JDK Private Build-1.8.0_275-8u275-b01-0ubuntu1~18.04-b01 generated 27 new + 144 unchanged - 27 fixed = 171 total (was 171) |
| +1 :green_heart: | golang | 18m 8s | | the patch passed |
| -1 :x: | javac | 18m 8s | [/diff-compile-javac-root-jdkPrivateBuild-1.8.0_275-8u275-b01-0ubuntu1~18.04-b01.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2350/14/artifact/out/diff-compile-javac-root-jdkPrivateBuild-1.8.0_275-8u275-b01-0ubuntu1~18.04-b01.txt) | root-jdkPrivateBuild-1.8.0_275-8u275-b01-0ubuntu1~18.04-b01 with JDK Private Build-1.8.0_275-8u275-b01-0ubuntu1~18.04-b01 generated 1 new + 1947 unchanged - 0 fixed = 1948 total (was 1947) |
| -0 :warning: | checkstyle | 2m 53s | [/diff-checkstyle-root.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2350/14/artifact/out/diff-checkstyle-root.txt) | root: The patch generated 2 new + 132 unchanged - 1 fixed = 134 total (was 133) |
| +1 :green_heart: | mvn
[jira] [Work logged] (HADOOP-17292) Using lz4-java in Lz4Codec
[ https://issues.apache.org/jira/browse/HADOOP-17292?focusedWorklogId=513317&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-513317 ]

ASF GitHub Bot logged work on HADOOP-17292:
---
Author: ASF GitHub Bot
Created on: 18/Nov/20 03:35
Start Date: 18/Nov/20 03:35
Worklog Time Spent: 10m
Work Description: hadoop-yetus commented on pull request #2350:
URL: https://github.com/apache/hadoop/pull/2350#issuecomment-729368867

:broken_heart: **-1 overall**

| Vote | Subsystem | Runtime | Logfile | Comment |
|:----:|----------:|--------:|:--------:|:-------:|
| +0 :ok: | reexec | 1m 24s | | Docker mode activated. |
|||| _ Prechecks _ |
| +1 :green_heart: | dupname | 0m 0s | | No case conflicting files found. |
| +1 :green_heart: | @author | 0m 0s | | The patch does not contain any @author tags. |
| +1 :green_heart: | | 0m 0s | [test4tests](test4tests) | The patch appears to include 6 new or modified test files. |
|||| _ trunk Compile Tests _ |
| +0 :ok: | mvndep | 14m 53s | | Maven dependency ordering for branch |
| +1 :green_heart: | mvninstall | 24m 27s | | trunk passed |
| +1 :green_heart: | compile | 22m 26s | | trunk passed with JDK Ubuntu-11.0.9.1+1-Ubuntu-0ubuntu1.18.04 |
| +1 :green_heart: | compile | 19m 18s | | trunk passed with JDK Private Build-1.8.0_275-8u275-b01-0ubuntu1~18.04-b01 |
| +1 :green_heart: | checkstyle | 3m 6s | | trunk passed |
| +1 :green_heart: | mvnsite | 2m 49s | | trunk passed |
| +1 :green_heart: | shadedclient | 18m 25s | | branch has no errors when building and testing our client artifacts. |
| +1 :green_heart: | javadoc | 2m 19s | | trunk passed with JDK Ubuntu-11.0.9.1+1-Ubuntu-0ubuntu1.18.04 |
| +1 :green_heart: | javadoc | 2m 42s | | trunk passed with JDK Private Build-1.8.0_275-8u275-b01-0ubuntu1~18.04-b01 |
| +0 :ok: | spotbugs | 0m 49s | | Used deprecated FindBugs config; considering switching to SpotBugs. |
| +0 :ok: | findbugs | 0m 35s | | branch/hadoop-project no findbugs output file (findbugsXml.xml) |
|||| _ Patch Compile Tests _ |
| +0 :ok: | mvndep | 0m 35s | | Maven dependency ordering for patch |
| +1 :green_heart: | mvninstall | 1m 39s | | the patch passed |
| +1 :green_heart: | compile | 22m 3s | | the patch passed with JDK Ubuntu-11.0.9.1+1-Ubuntu-0ubuntu1.18.04 |
| -1 :x: | cc | 22m 3s | [/diff-compile-cc-root-jdkUbuntu-11.0.9.1+1-Ubuntu-0ubuntu1.18.04.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2350/13/artifact/out/diff-compile-cc-root-jdkUbuntu-11.0.9.1+1-Ubuntu-0ubuntu1.18.04.txt) | root-jdkUbuntu-11.0.9.1+1-Ubuntu-0ubuntu1.18.04 with JDK Ubuntu-11.0.9.1+1-Ubuntu-0ubuntu1.18.04 generated 27 new + 144 unchanged - 27 fixed = 171 total (was 171) |
| +1 :green_heart: | golang | 22m 3s | | the patch passed |
| -1 :x: | javac | 22m 3s | [/diff-compile-javac-root-jdkUbuntu-11.0.9.1+1-Ubuntu-0ubuntu1.18.04.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2350/13/artifact/out/diff-compile-javac-root-jdkUbuntu-11.0.9.1+1-Ubuntu-0ubuntu1.18.04.txt) | root-jdkUbuntu-11.0.9.1+1-Ubuntu-0ubuntu1.18.04 with JDK Ubuntu-11.0.9.1+1-Ubuntu-0ubuntu1.18.04 generated 2 new + 2051 unchanged - 1 fixed = 2053 total (was 2052) |
| +1 :green_heart: | compile | 19m 26s | | the patch passed with JDK Private Build-1.8.0_275-8u275-b01-0ubuntu1~18.04-b01 |
| -1 :x: | cc | 19m 26s | [/diff-compile-cc-root-jdkPrivateBuild-1.8.0_275-8u275-b01-0ubuntu1~18.04-b01.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2350/13/artifact/out/diff-compile-cc-root-jdkPrivateBuild-1.8.0_275-8u275-b01-0ubuntu1~18.04-b01.txt) | root-jdkPrivateBuild-1.8.0_275-8u275-b01-0ubuntu1~18.04-b01 with JDK Private Build-1.8.0_275-8u275-b01-0ubuntu1~18.04-b01 generated 23 new + 148 unchanged - 23 fixed = 171 total (was 171) |
| +1 :green_heart: | golang | 19m 26s | | the patch passed |
| -1 :x: | javac | 19m 26s | [/diff-compile-javac-root-jdkPrivateBuild-1.8.0_275-8u275-b01-0ubuntu1~18.04-b01.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2350/13/artifact/out/diff-compile-javac-root-jdkPrivateBuild-1.8.0_275-8u275-b01-0ubuntu1~18.04-b01.txt) | root-jdkPrivateBuild-1.8.0_275-8u275-b01-0ubuntu1~18.04-b01 with JDK Private Build-1.8.0_275-8u275-b01-0ubuntu1~18.04-b01 generated 1 new + 1947 unchanged - 0 fixed = 1948 total (was 1947) |
| -0 :warning: | checkstyle | 3m 2s | [/diff-checkstyle-root.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2350/13/artifact/out/diff-checkstyle-root.txt) | root: The patch generated 2 new + 132 unchanged - 1 fixed = 134 total (was 133) |
| +1 :green_heart: | mvn
[jira] [Work logged] (HADOOP-17292) Using lz4-java in Lz4Codec
[ https://issues.apache.org/jira/browse/HADOOP-17292?focusedWorklogId=513250&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-513250 ]

ASF GitHub Bot logged work on HADOOP-17292:
-------------------------------------------

Author: ASF GitHub Bot
Created on: 18/Nov/20 00:40
Start Date: 18/Nov/20 00:40
Worklog Time Spent: 10m
Work Description: viirya commented on pull request #2350:
URL: https://github.com/apache/hadoop/pull/2350#issuecomment-729301320

Thanks @steveloughran!

Issue Time Tracking
-------------------
Worklog Id: (was: 513250)
Time Spent: 10h 10m (was: 10h)
[jira] [Work logged] (HADOOP-17292) Using lz4-java in Lz4Codec
[ https://issues.apache.org/jira/browse/HADOOP-17292?focusedWorklogId=513248&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-513248 ]

ASF GitHub Bot logged work on HADOOP-17292:
-------------------------------------------

Author: ASF GitHub Bot
Created on: 18/Nov/20 00:38
Start Date: 18/Nov/20 00:38
Worklog Time Spent: 10m
Work Description: viirya commented on a change in pull request #2350:
URL: https://github.com/apache/hadoop/pull/2350#discussion_r525618405

## File path: hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/io/compress/lz4/Lz4Decompressor.java ##
@@ -67,6 +55,15 @@ public Lz4Decompressor(int directBufferSize) {
     this.directBufferSize = directBufferSize;
+    try {
+      LZ4Factory lz4Factory = LZ4Factory.fastestInstance();
+      lz4Decompressor = lz4Factory.safeDecompressor();
+    } catch (Throwable t) {

Review comment:
Let me just catch `AssertionError` and wrap it.

Issue Time Tracking
-------------------
Worklog Id: (was: 513248)
Time Spent: 10h (was: 9h 50m)
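The narrower catch viirya proposes here can be sketched roughly as follows. This is an illustration under stated assumptions, not the code from the pull request: `lookupFactory` is a hypothetical stand-in for `LZ4Factory.fastestInstance()` so that the example compiles with the JDK alone, and the choice of `RuntimeException` as the wrapper, along with its message, is assumed.

```java
/** Sketch only: catch AssertionError from a factory lookup and wrap it. */
final class CatchAssertionErrorSketch {

  /** Hypothetical stand-in for LZ4Factory.fastestInstance(); not the lz4-java API. */
  static Object lookupFactory(boolean libraryUsable) {
    if (!libraryUsable) {
      // Simulates the failure mode discussed in this thread.
      throw new AssertionError("no usable lz4 implementation");
    }
    return new Object();
  }

  /** Catches only AssertionError; any other Error still propagates unchanged. */
  static Object createDecompressor(boolean libraryUsable) {
    try {
      return lookupFactory(libraryUsable);
    } catch (AssertionError e) {
      // The wrapper type and message are assumptions made for this sketch.
      throw new RuntimeException("lz4-java library is not available", e);
    }
  }

  public static void main(String[] args) {
    System.out.println("created: " + (createDecompressor(true) != null));
    try {
      createDecompressor(false);
    } catch (RuntimeException e) {
      System.out.println("wrapped cause: " + e.getCause());
    }
  }
}
```

Keeping the original `AssertionError` as the cause preserves the underlying stack trace for whoever has to debug a missing library.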
[jira] [Work logged] (HADOOP-17292) Using lz4-java in Lz4Codec
[ https://issues.apache.org/jira/browse/HADOOP-17292?focusedWorklogId=513077&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-513077 ]

ASF GitHub Bot logged work on HADOOP-17292:
-------------------------------------------

Author: ASF GitHub Bot
Created on: 17/Nov/20 18:09
Start Date: 17/Nov/20 18:09
Worklog Time Spent: 10m
Work Description: steveloughran commented on pull request #2350:
URL: https://github.com/apache/hadoop/pull/2350#issuecomment-729106816

LGTM. The only issue is whether the catch should be for all of Throwable or a subset. I think I'll be happy to go with it as is.

Two checkstyles to deal with; the deprecation warning is nothing to worry about.

```
./hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/io/compress/CompressDecompressTester.java:476: if (compressor.getClass().isAssignableFrom(Lz4Compressor.class)): 'if' construct must use '{}'s. [NeedBraces]
./hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/io/compress/TestCodec.java:79:import org.junit.Assert;:8: Unused import - org.junit.Assert. [UnusedImports]
```

Issue Time Tracking
-------------------
Worklog Id: (was: 513077)
Time Spent: 9h 50m (was: 9h 40m)
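For readers unfamiliar with the two Checkstyle rules quoted in this message: `UnusedImports` is resolved by deleting the flagged import, and `NeedBraces` by putting the body of the `if` inside braces. The sketch below shows only the second fix. The nested `Compressor`, `Lz4Compressor` and `OtherCompressor` types are hypothetical stand-ins so the example compiles without Hadoop, and the body of the `if` is invented, since the report quotes only the condition.

```java
/** Sketch only: the brace style that Checkstyle's NeedBraces rule asks for. */
final class NeedBracesSketch {

  /** Hypothetical stand-ins for the Hadoop compressor types. */
  interface Compressor {}
  static final class Lz4Compressor implements Compressor {}
  static final class OtherCompressor implements Compressor {}

  static String label(Compressor compressor) {
    // A brace-less form such as `if (cond) return "lz4";` would be flagged.
    if (compressor.getClass().isAssignableFrom(Lz4Compressor.class)) {
      return "lz4";
    }
    return "other";
  }

  public static void main(String[] args) {
    System.out.println(label(new Lz4Compressor()));
    System.out.println(label(new OtherCompressor()));
  }
}
```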
[jira] [Work logged] (HADOOP-17292) Using lz4-java in Lz4Codec
[ https://issues.apache.org/jira/browse/HADOOP-17292?focusedWorklogId=513073&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-513073 ]

ASF GitHub Bot logged work on HADOOP-17292:
-------------------------------------------

Author: ASF GitHub Bot
Created on: 17/Nov/20 18:06
Start Date: 17/Nov/20 18:06
Worklog Time Spent: 10m
Work Description: hadoop-yetus removed a comment on pull request #2350:
URL: https://github.com/apache/hadoop/pull/2350#issuecomment-723558273

:broken_heart: **-1 overall**

| Vote | Subsystem | Runtime | Logfile | Comment |
|:----:|----------:|--------:|:--------:|:-------:|
| +0 :ok: | reexec | 1m 19s | | Docker mode activated. |
|||| _ Prechecks _ |
| +1 :green_heart: | dupname | 0m 0s | | No case conflicting files found. |
| +1 :green_heart: | @author | 0m 0s | | The patch does not contain any @author tags. |
| +1 :green_heart: | | 0m 0s | [test4tests](test4tests) | The patch appears to include 6 new or modified test files. |
|||| _ trunk Compile Tests _ |
| +0 :ok: | mvndep | 15m 2s | | Maven dependency ordering for branch |
| +1 :green_heart: | mvninstall | 23m 27s | | trunk passed |
| +1 :green_heart: | compile | 21m 43s | | trunk passed with JDK Ubuntu-11.0.9+11-Ubuntu-0ubuntu1.18.04.1 |
| +1 :green_heart: | compile | 18m 55s | | trunk passed with JDK Private Build-1.8.0_272-8u272-b10-0ubuntu1~18.04-b10 |
| +1 :green_heart: | checkstyle | 3m 0s | | trunk passed |
| +1 :green_heart: | mvnsite | 2m 44s | | trunk passed |
| +1 :green_heart: | shadedclient | 17m 16s | | branch has no errors when building and testing our client artifacts. |
| +1 :green_heart: | javadoc | 2m 15s | | trunk passed with JDK Ubuntu-11.0.9+11-Ubuntu-0ubuntu1.18.04.1 |
| +1 :green_heart: | javadoc | 2m 40s | | trunk passed with JDK Private Build-1.8.0_272-8u272-b10-0ubuntu1~18.04-b10 |
| +0 :ok: | spotbugs | 0m 51s | | Used deprecated FindBugs config; considering switching to SpotBugs. |
| +0 :ok: | findbugs | 0m 35s | | branch/hadoop-project no findbugs output file (findbugsXml.xml) |
|||| _ Patch Compile Tests _ |
| +0 :ok: | mvndep | 0m 32s | | Maven dependency ordering for patch |
| +1 :green_heart: | mvninstall | 1m 33s | | the patch passed |
| +1 :green_heart: | compile | 20m 49s | | the patch passed with JDK Ubuntu-11.0.9+11-Ubuntu-0ubuntu1.18.04.1 |
| -1 :x: | cc | 20m 49s | [/diff-compile-cc-root-jdkUbuntu-11.0.9+11-Ubuntu-0ubuntu1.18.04.1.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2350/11/artifact/out/diff-compile-cc-root-jdkUbuntu-11.0.9+11-Ubuntu-0ubuntu1.18.04.1.txt) | root-jdkUbuntu-11.0.9+11-Ubuntu-0ubuntu1.18.04.1 with JDK Ubuntu-11.0.9+11-Ubuntu-0ubuntu1.18.04.1 generated 26 new + 145 unchanged - 26 fixed = 171 total (was 171) |
| +1 :green_heart: | golang | 20m 49s | | the patch passed |
| -1 :x: | javac | 20m 49s | [/diff-compile-javac-root-jdkUbuntu-11.0.9+11-Ubuntu-0ubuntu1.18.04.1.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2350/11/artifact/out/diff-compile-javac-root-jdkUbuntu-11.0.9+11-Ubuntu-0ubuntu1.18.04.1.txt) | root-jdkUbuntu-11.0.9+11-Ubuntu-0ubuntu1.18.04.1 with JDK Ubuntu-11.0.9+11-Ubuntu-0ubuntu1.18.04.1 generated 2 new + 2051 unchanged - 1 fixed = 2053 total (was 2052) |
| +1 :green_heart: | compile | 18m 15s | | the patch passed with JDK Private Build-1.8.0_272-8u272-b10-0ubuntu1~18.04-b10 |
| -1 :x: | cc | 18m 15s | [/diff-compile-cc-root-jdkPrivateBuild-1.8.0_272-8u272-b10-0ubuntu1~18.04-b10.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2350/11/artifact/out/diff-compile-cc-root-jdkPrivateBuild-1.8.0_272-8u272-b10-0ubuntu1~18.04-b10.txt) | root-jdkPrivateBuild-1.8.0_272-8u272-b10-0ubuntu1~18.04-b10 with JDK Private Build-1.8.0_272-8u272-b10-0ubuntu1~18.04-b10 generated 31 new + 140 unchanged - 31 fixed = 171 total (was 171) |
| +1 :green_heart: | golang | 18m 15s | | the patch passed |
| -1 :x: | javac | 18m 15s | [/diff-compile-javac-root-jdkPrivateBuild-1.8.0_272-8u272-b10-0ubuntu1~18.04-b10.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2350/11/artifact/out/diff-compile-javac-root-jdkPrivateBuild-1.8.0_272-8u272-b10-0ubuntu1~18.04-b10.txt) | root-jdkPrivateBuild-1.8.0_272-8u272-b10-0ubuntu1~18.04-b10 with JDK Private Build-1.8.0_272-8u272-b10-0ubuntu1~18.04-b10 generated 1 new + 1947 unchanged - 0 fixed = 1948 total (was 1947) |
| -0 :warning: | checkstyle | 2m 57s | [/diff-checkstyle-root.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2350/11/artifact/out/diff-checkstyle-root.txt) | root: The patch generated 2 new + 132 unchanged - 1 fixed = 134 total (was 133) |
| +1 :
[jira] [Work logged] (HADOOP-17292) Using lz4-java in Lz4Codec
[ https://issues.apache.org/jira/browse/HADOOP-17292?focusedWorklogId=513066&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-513066 ]

ASF GitHub Bot logged work on HADOOP-17292:
-------------------------------------------

Author: ASF GitHub Bot
Created on: 17/Nov/20 18:01
Start Date: 17/Nov/20 18:01
Worklog Time Spent: 10m
Work Description: steveloughran commented on a change in pull request #2350:
URL: https://github.com/apache/hadoop/pull/2350#discussion_r525373324

## File path: hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/io/compress/lz4/Lz4Decompressor.java ##
@@ -67,6 +55,15 @@ public Lz4Decompressor(int directBufferSize) {
     this.directBufferSize = directBufferSize;
+    try {
+      LZ4Factory lz4Factory = LZ4Factory.fastestInstance();
+      lz4Decompressor = lz4Factory.safeDecompressor();
+    } catch (Throwable t) {

Review comment:
ok, so what's best here?
1. Catch AssertionError and wrap
2. Catch throwable, but with something ahead of it which will catch and rethrow Error without wrapping. Because we shouldn't really be wrapping those high-priority problems that aren't generally things to ignore

Issue Time Tracking
-------------------
Worklog Id: (was: 513066)
Time Spent: 9.5h (was: 9h 20m)
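Option 2 in the review comment above (rethrow `Error` untouched, wrap everything else) can be sketched as below. This is a reading of the comment, not code from the pull request: the `Lookup` interface and `create` method are hypothetical names, and `RuntimeException` as the wrapper is an assumption. One consequence worth noting is that `java.lang.AssertionError` is a subclass of `Error`, so under this option it would pass through unwrapped.

```java
/** Sketch only: rethrow Error as-is, wrap any other Throwable. */
final class RethrowErrorSketch {

  /** Hypothetical stand-in for the factory lookup being guarded. */
  interface Lookup {
    Object get() throws Exception;
  }

  static Object create(Lookup lookup) {
    try {
      return lookup.get();
    } catch (Error e) {
      // Errors (including AssertionError) are rethrown without wrapping.
      throw e;
    } catch (Throwable t) {
      // Everything else is wrapped; the wrapper type is an assumption.
      throw new RuntimeException("lz4-java library is not available", t);
    }
  }

  public static void main(String[] args) {
    try {
      create(() -> { throw new java.io.IOException("simulated failure"); });
    } catch (RuntimeException e) {
      System.out.println("wrapped cause: " + e.getCause());
    }
    try {
      create(() -> { throw new AssertionError("simulated failure"); });
    } catch (AssertionError e) {
      System.out.println("passed through: " + e);
    }
  }
}
```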
[jira] [Work logged] (HADOOP-17292) Using lz4-java in Lz4Codec
[ https://issues.apache.org/jira/browse/HADOOP-17292?focusedWorklogId=513062&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-513062 ]

ASF GitHub Bot logged work on HADOOP-17292:
-------------------------------------------

Author: ASF GitHub Bot
Created on: 17/Nov/20 17:57
Start Date: 17/Nov/20 17:57
Worklog Time Spent: 10m
Work Description: steveloughran commented on a change in pull request #2350:
URL: https://github.com/apache/hadoop/pull/2350#discussion_r525370313

## File path: hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/io/compress/lz4/Lz4Decompressor.java ##
@@ -22,8 +22,9 @@
 import java.nio.Buffer;
 import java.nio.ByteBuffer;
+import net.jpountz.lz4.LZ4Factory;
+import net.jpountz.lz4.LZ4SafeDecompressor;

Review comment:
yeah, if things are mixed up, best to leave alone -at least for those files which get lots of changes. For something which rarely sees maintenance, you can make a stronger case for cleanup. I do it sometimes, but as I also get to field cherrypick merge pain, I don't go wild on it

Issue Time Tracking
-------------------
Worklog Id: (was: 513062)
Time Spent: 9h 20m (was: 9h 10m)
[jira] [Work logged] (HADOOP-17292) Using lz4-java in Lz4Codec
[ https://issues.apache.org/jira/browse/HADOOP-17292?focusedWorklogId=512659&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-512659 ]

ASF GitHub Bot logged work on HADOOP-17292:
-------------------------------------------

Author: ASF GitHub Bot
Created on: 16/Nov/20 23:34
Start Date: 16/Nov/20 23:34
Worklog Time Spent: 10m
Work Description: dbtsai edited a comment on pull request #2350:
URL: https://github.com/apache/hadoop/pull/2350#issuecomment-728409452

Gently ping @steveloughran and @sunchao.

@viirya can you update the pom again due to merge conflict? Thanks!

Issue Time Tracking
-------------------
Worklog Id: (was: 512659)
Time Spent: 9h 10m (was: 9h)
[jira] [Work logged] (HADOOP-17292) Using lz4-java in Lz4Codec
[ https://issues.apache.org/jira/browse/HADOOP-17292?focusedWorklogId=512657&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-512657 ]

ASF GitHub Bot logged work on HADOOP-17292:
-------------------------------------------

Author: ASF GitHub Bot
Created on: 16/Nov/20 23:33
Start Date: 16/Nov/20 23:33
Worklog Time Spent: 10m
Work Description: dbtsai commented on pull request #2350:
URL: https://github.com/apache/hadoop/pull/2350#issuecomment-728409452

Gently ping @steveloughran and @sunchao

Issue Time Tracking
-------------------
Worklog Id: (was: 512657)
Time Spent: 9h (was: 8h 50m)
[jira] [Work logged] (HADOOP-17292) Using lz4-java in Lz4Codec
[ https://issues.apache.org/jira/browse/HADOOP-17292?focusedWorklogId=509583&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-509583 ]

ASF GitHub Bot logged work on HADOOP-17292:
-------------------------------------------

Author: ASF GitHub Bot
Created on: 10/Nov/20 08:26
Start Date: 10/Nov/20 08:26
Worklog Time Spent: 10m
Work Description: hadoop-yetus commented on pull request #2350:
URL: https://github.com/apache/hadoop/pull/2350#issuecomment-724544790

:broken_heart: **-1 overall**

| Vote | Subsystem | Runtime | Logfile | Comment |
|:----:|----------:|--------:|:--------:|:-------:|
| +0 :ok: | reexec | 1m 29s | | Docker mode activated. |
|||| _ Prechecks _ |
| +1 :green_heart: | dupname | 0m 0s | | No case conflicting files found. |
| +1 :green_heart: | @author | 0m 0s | | The patch does not contain any @author tags. |
| +1 :green_heart: | | 0m 0s | [test4tests](test4tests) | The patch appears to include 6 new or modified test files. |
|||| _ trunk Compile Tests _ |
| +0 :ok: | mvndep | 14m 56s | | Maven dependency ordering for branch |
| +1 :green_heart: | mvninstall | 24m 10s | | trunk passed |
| +1 :green_heart: | compile | 22m 46s | | trunk passed with JDK Ubuntu-11.0.9+11-Ubuntu-0ubuntu1.18.04.1 |
| +1 :green_heart: | compile | 19m 29s | | trunk passed with JDK Private Build-1.8.0_272-8u272-b10-0ubuntu1~18.04-b10 |
| +1 :green_heart: | checkstyle | 2m 57s | | trunk passed |
| +1 :green_heart: | mvnsite | 2m 48s | | trunk passed |
| +1 :green_heart: | shadedclient | 17m 2s | | branch has no errors when building and testing our client artifacts. |
| +1 :green_heart: | javadoc | 2m 18s | | trunk passed with JDK Ubuntu-11.0.9+11-Ubuntu-0ubuntu1.18.04.1 |
| +1 :green_heart: | javadoc | 2m 44s | | trunk passed with JDK Private Build-1.8.0_272-8u272-b10-0ubuntu1~18.04-b10 |
| +0 :ok: | spotbugs | 0m 50s | | Used deprecated FindBugs config; considering switching to SpotBugs. |
| +0 :ok: | findbugs | 0m 35s | | branch/hadoop-project no findbugs output file (findbugsXml.xml) |
|||| _ Patch Compile Tests _ |
| +0 :ok: | mvndep | 0m 33s | | Maven dependency ordering for patch |
| +1 :green_heart: | mvninstall | 1m 35s | | the patch passed |
| +1 :green_heart: | compile | 22m 17s | | the patch passed with JDK Ubuntu-11.0.9+11-Ubuntu-0ubuntu1.18.04.1 |
| -1 :x: | cc | 22m 17s | [/diff-compile-cc-root-jdkUbuntu-11.0.9+11-Ubuntu-0ubuntu1.18.04.1.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2350/12/artifact/out/diff-compile-cc-root-jdkUbuntu-11.0.9+11-Ubuntu-0ubuntu1.18.04.1.txt) | root-jdkUbuntu-11.0.9+11-Ubuntu-0ubuntu1.18.04.1 with JDK Ubuntu-11.0.9+11-Ubuntu-0ubuntu1.18.04.1 generated 14 new + 157 unchanged - 14 fixed = 171 total (was 171) |
| +1 :green_heart: | golang | 22m 17s | | the patch passed |
| -1 :x: | javac | 22m 17s | [/diff-compile-javac-root-jdkUbuntu-11.0.9+11-Ubuntu-0ubuntu1.18.04.1.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2350/12/artifact/out/diff-compile-javac-root-jdkUbuntu-11.0.9+11-Ubuntu-0ubuntu1.18.04.1.txt) | root-jdkUbuntu-11.0.9+11-Ubuntu-0ubuntu1.18.04.1 with JDK Ubuntu-11.0.9+11-Ubuntu-0ubuntu1.18.04.1 generated 2 new + 2051 unchanged - 1 fixed = 2053 total (was 2052) |
| +1 :green_heart: | compile | 18m 57s | | the patch passed with JDK Private Build-1.8.0_272-8u272-b10-0ubuntu1~18.04-b10 |
| -1 :x: | cc | 18m 57s | [/diff-compile-cc-root-jdkPrivateBuild-1.8.0_272-8u272-b10-0ubuntu1~18.04-b10.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2350/12/artifact/out/diff-compile-cc-root-jdkPrivateBuild-1.8.0_272-8u272-b10-0ubuntu1~18.04-b10.txt) | root-jdkPrivateBuild-1.8.0_272-8u272-b10-0ubuntu1~18.04-b10 with JDK Private Build-1.8.0_272-8u272-b10-0ubuntu1~18.04-b10 generated 38 new + 133 unchanged - 38 fixed = 171 total (was 171) |
| +1 :green_heart: | golang | 18m 57s | | the patch passed |
| -1 :x: | javac | 18m 57s | [/diff-compile-javac-root-jdkPrivateBuild-1.8.0_272-8u272-b10-0ubuntu1~18.04-b10.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2350/12/artifact/out/diff-compile-javac-root-jdkPrivateBuild-1.8.0_272-8u272-b10-0ubuntu1~18.04-b10.txt) | root-jdkPrivateBuild-1.8.0_272-8u272-b10-0ubuntu1~18.04-b10 with JDK Private Build-1.8.0_272-8u272-b10-0ubuntu1~18.04-b10 generated 1 new + 1947 unchanged - 0 fixed = 1948 total (was 1947) |
| -0 :warning: | checkstyle | 2m 59s | [/diff-checkstyle-root.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2350/12/artifact/out/diff-checkstyle-root.txt) | root: The patch generated 2 new + 132 unchanged - 1 fixed = 134 total (was 133) |
| +1 :green_he
[jira] [Work logged] (HADOOP-17292) Using lz4-java in Lz4Codec
[ https://issues.apache.org/jira/browse/HADOOP-17292?focusedWorklogId=509469&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-509469 ]

ASF GitHub Bot logged work on HADOOP-17292:
-------------------------------------------

Author: ASF GitHub Bot
Created on: 10/Nov/20 02:41
Start Date: 10/Nov/20 02:41
Worklog Time Spent: 10m
Work Description: iwasakims commented on a change in pull request #2350:
URL: https://github.com/apache/hadoop/pull/2350#discussion_r520249492

## File path: hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/io/compress/lz4/Lz4Decompressor.java ##
@@ -67,6 +55,15 @@ public Lz4Decompressor(int directBufferSize) {
     this.directBufferSize = directBufferSize;
+    try {
+      LZ4Factory lz4Factory = LZ4Factory.fastestInstance();
+      lz4Decompressor = lz4Factory.safeDecompressor();
+    } catch (Throwable t) {

Review comment:
LZ4Factory seems to throw java.lang.AssertionError after catching unrecoverable exception.

Issue Time Tracking
-------------------
Worklog Id: (was: 509469)
Time Spent: 8h 40m (was: 8.5h)
[jira] [Work logged] (HADOOP-17292) Using lz4-java in Lz4Codec
[ https://issues.apache.org/jira/browse/HADOOP-17292?focusedWorklogId=509467&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-509467 ]

ASF GitHub Bot logged work on HADOOP-17292:
---
Author: ASF GitHub Bot
Created on: 10/Nov/20 02:29
Start Date: 10/Nov/20 02:29
Worklog Time Spent: 10m
Work Description: iwasakims commented on a change in pull request #2350:
URL: https://github.com/apache/hadoop/pull/2350#discussion_r520245741

## File path: hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/io/compress/lz4/Lz4Decompressor.java ##
@@ -22,8 +22,9 @@
 import java.nio.Buffer;
 import java.nio.ByteBuffer;
+import net.jpountz.lz4.LZ4Factory;
+import net.jpountz.lz4.LZ4SafeDecompressor;

Review comment:
@viirya I think steveloughran is referring to the order of imports and the blank line between import blocks.
https://github.com/steveloughran/formality/blob/master/styleguide/styleguide.md#imports
While the original file violates the coding standard and we usually avoid large diffs just to fix formatting issues, it would be OK to fix a few lines here.

Issue Time Tracking
---
Worklog Id: (was: 509467)
Time Spent: 8.5h (was: 8h 20m)
[jira] [Work logged] (HADOOP-17292) Using lz4-java in Lz4Codec
[ https://issues.apache.org/jira/browse/HADOOP-17292?focusedWorklogId=509421&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-509421 ]

ASF GitHub Bot logged work on HADOOP-17292:
---
Author: ASF GitHub Bot
Created on: 09/Nov/20 23:48
Start Date: 09/Nov/20 23:48
Worklog Time Spent: 10m
Work Description: viirya commented on pull request #2350:
URL: https://github.com/apache/hadoop/pull/2350#issuecomment-724352438

Thanks @iwasakims

Issue Time Tracking
---
Worklog Id: (was: 509421)
Time Spent: 8h 20m (was: 8h 10m)
[jira] [Work logged] (HADOOP-17292) Using lz4-java in Lz4Codec
[ https://issues.apache.org/jira/browse/HADOOP-17292?focusedWorklogId=509418&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-509418 ]

ASF GitHub Bot logged work on HADOOP-17292:
---
Author: ASF GitHub Bot
Created on: 09/Nov/20 23:43
Start Date: 09/Nov/20 23:43
Worklog Time Spent: 10m
Work Description: iwasakims commented on pull request #2350:
URL: https://github.com/apache/hadoop/pull/2350#issuecomment-724350332

@dbtsai Let me post a patch on [HADOOP-17369](https://issues.apache.org/jira/browse/HADOOP-17369).

Issue Time Tracking
---
Worklog Id: (was: 509418)
Time Spent: 8h 10m (was: 8h)
[jira] [Work logged] (HADOOP-17292) Using lz4-java in Lz4Codec
[ https://issues.apache.org/jira/browse/HADOOP-17292?focusedWorklogId=509408&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-509408 ]

ASF GitHub Bot logged work on HADOOP-17292:
---
Author: ASF GitHub Bot
Created on: 09/Nov/20 23:04
Start Date: 09/Nov/20 23:04
Worklog Time Spent: 10m
Work Description: dbtsai commented on pull request #2350:
URL: https://github.com/apache/hadoop/pull/2350#issuecomment-724332569

@iwasakims are we able to fix the libstdc++ issue in snappy-java, and bump the version in hadoop before the release?

Issue Time Tracking
---
Worklog Id: (was: 509408)
Time Spent: 8h (was: 7h 50m)
[jira] [Work logged] (HADOOP-17292) Using lz4-java in Lz4Codec
[ https://issues.apache.org/jira/browse/HADOOP-17292?focusedWorklogId=509032&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-509032 ]

ASF GitHub Bot logged work on HADOOP-17292:
---
Author: ASF GitHub Bot
Created on: 09/Nov/20 08:07
Start Date: 09/Nov/20 08:07
Worklog Time Spent: 10m
Work Description: iwasakims commented on pull request #2350:
URL: https://github.com/apache/hadoop/pull/2350#issuecomment-723841060

> BTW, for snappy-java and lz4-java, if no native library for your platform is found, it will fall back to the pure-java implementation.

Yeah. In the snappy-java case, the native library was provided but the version of its dependency (libstdc++) did not match. Falling back to the pure-java implementation did not kick in. I think lz4-java has no such dependency.

Issue Time Tracking
---
Worklog Id: (was: 509032)
Time Spent: 7h 50m (was: 7h 40m)
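The distinction drawn in the comment above (a bundled binary exists, yet it fails to link) can be illustrated with a simulated sketch. This is not the loading logic of snappy-java or lz4-java, which is not shown in this thread; `NativeFallbackSketch`, `chooseByPresence`, and `chooseByLinkResult` are hypothetical names, and the link step is a plain `Runnable` standing in for a real `System.load` call, which in Java reports link failures as `UnsatisfiedLinkError`.

```java
final class NativeFallbackSketch {

    /**
     * Hypothetical policy A: fall back only when no binary is bundled.
     * A link failure (for example an unmet shared-library dependency)
     * propagates to the caller instead of triggering the fallback.
     */
    static String chooseByPresence(boolean binaryBundled, Runnable linkStep) {
        if (!binaryBundled) {
            return "pure-java";
        }
        linkStep.run(); // may throw UnsatisfiedLinkError; not handled here
        return "native";
    }

    /**
     * Hypothetical policy B: also fall back when the bundled binary
     * cannot be linked.
     */
    static String chooseByLinkResult(boolean binaryBundled, Runnable linkStep) {
        if (!binaryBundled) {
            return "pure-java";
        }
        try {
            linkStep.run();
            return "native";
        } catch (UnsatisfiedLinkError e) {
            return "pure-java";
        }
    }

    public static void main(String[] args) {
        // Simulated link failure standing in for a dependency version mismatch.
        Runnable badLink = () -> {
            throw new UnsatisfiedLinkError("simulated dependency mismatch");
        };
        System.out.println(chooseByLinkResult(true, badLink));
        try {
            chooseByPresence(true, badLink);
        } catch (UnsatisfiedLinkError e) {
            System.out.println("policy A did not fall back: " + e.getMessage());
        }
    }
}
```

Under these assumptions, policy A matches the symptom described for snappy-java (a binary is present, so no fallback happens), while policy B would have degraded to the pure-Java path.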
[jira] [Work logged] (HADOOP-17292) Using lz4-java in Lz4Codec
[ https://issues.apache.org/jira/browse/HADOOP-17292?focusedWorklogId=508988&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-508988 ]

ASF GitHub Bot logged work on HADOOP-17292:
---
Author: ASF GitHub Bot
Created on: 09/Nov/20 06:17
Start Date: 09/Nov/20 06:17
Worklog Time Spent: 10m
Work Description: viirya commented on pull request #2350:
URL: https://github.com/apache/hadoop/pull/2350#issuecomment-723786079

Regarding `whitespace | 0m 0s | /whitespace-tabs.txt`:

> hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-nativetask/src/main/native/lz4/lz4.c:1705: /* Currently the fast loop shows a regression on qualcomm arm chips. */

This `lz4.c` is copied from `hadoop-common-project/hadoop-common/src/main/native/src/org/apache/hadoop/io/compress/lz4/lz4.c` without change.

Issue Time Tracking
---
Worklog Id: (was: 508988)
Time Spent: 7h 40m (was: 7.5h)
[jira] [Work logged] (HADOOP-17292) Using lz4-java in Lz4Codec
[ https://issues.apache.org/jira/browse/HADOOP-17292?focusedWorklogId=508876&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-508876 ]

ASF GitHub Bot logged work on HADOOP-17292:
---
Author: ASF GitHub Bot
Created on: 08/Nov/20 10:33
Start Date: 08/Nov/20 10:33
Worklog Time Spent: 10m
Work Description: hadoop-yetus commented on pull request #2350:
URL: https://github.com/apache/hadoop/pull/2350#issuecomment-723558273

:broken_heart: **-1 overall**

| Vote | Subsystem | Runtime | Logfile | Comment |
|:----:|----------:|--------:|:--------:|:-------:|
| +0 :ok: | reexec | 1m 19s | | Docker mode activated. |
|||| _ Prechecks _ |
| +1 :green_heart: | dupname | 0m 0s | | No case conflicting files found. |
| +1 :green_heart: | @author | 0m 0s | | The patch does not contain any @author tags. |
| +1 :green_heart: | | 0m 0s | [test4tests](test4tests) | The patch appears to include 6 new or modified test files. |
|||| _ trunk Compile Tests _ |
| +0 :ok: | mvndep | 15m 2s | | Maven dependency ordering for branch |
| +1 :green_heart: | mvninstall | 23m 27s | | trunk passed |
| +1 :green_heart: | compile | 21m 43s | | trunk passed with JDK Ubuntu-11.0.9+11-Ubuntu-0ubuntu1.18.04.1 |
| +1 :green_heart: | compile | 18m 55s | | trunk passed with JDK Private Build-1.8.0_272-8u272-b10-0ubuntu1~18.04-b10 |
| +1 :green_heart: | checkstyle | 3m 0s | | trunk passed |
| +1 :green_heart: | mvnsite | 2m 44s | | trunk passed |
| +1 :green_heart: | shadedclient | 17m 16s | | branch has no errors when building and testing our client artifacts. |
| +1 :green_heart: | javadoc | 2m 15s | | trunk passed with JDK Ubuntu-11.0.9+11-Ubuntu-0ubuntu1.18.04.1 |
| +1 :green_heart: | javadoc | 2m 40s | | trunk passed with JDK Private Build-1.8.0_272-8u272-b10-0ubuntu1~18.04-b10 |
| +0 :ok: | spotbugs | 0m 51s | | Used deprecated FindBugs config; considering switching to SpotBugs. |
| +0 :ok: | findbugs | 0m 35s | | branch/hadoop-project no findbugs output file (findbugsXml.xml) |
|||| _ Patch Compile Tests _ |
| +0 :ok: | mvndep | 0m 32s | | Maven dependency ordering for patch |
| +1 :green_heart: | mvninstall | 1m 33s | | the patch passed |
| +1 :green_heart: | compile | 20m 49s | | the patch passed with JDK Ubuntu-11.0.9+11-Ubuntu-0ubuntu1.18.04.1 |
| -1 :x: | cc | 20m 49s | [/diff-compile-cc-root-jdkUbuntu-11.0.9+11-Ubuntu-0ubuntu1.18.04.1.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2350/11/artifact/out/diff-compile-cc-root-jdkUbuntu-11.0.9+11-Ubuntu-0ubuntu1.18.04.1.txt) | root-jdkUbuntu-11.0.9+11-Ubuntu-0ubuntu1.18.04.1 with JDK Ubuntu-11.0.9+11-Ubuntu-0ubuntu1.18.04.1 generated 26 new + 145 unchanged - 26 fixed = 171 total (was 171) |
| +1 :green_heart: | golang | 20m 49s | | the patch passed |
| -1 :x: | javac | 20m 49s | [/diff-compile-javac-root-jdkUbuntu-11.0.9+11-Ubuntu-0ubuntu1.18.04.1.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2350/11/artifact/out/diff-compile-javac-root-jdkUbuntu-11.0.9+11-Ubuntu-0ubuntu1.18.04.1.txt) | root-jdkUbuntu-11.0.9+11-Ubuntu-0ubuntu1.18.04.1 with JDK Ubuntu-11.0.9+11-Ubuntu-0ubuntu1.18.04.1 generated 2 new + 2051 unchanged - 1 fixed = 2053 total (was 2052) |
| +1 :green_heart: | compile | 18m 15s | | the patch passed with JDK Private Build-1.8.0_272-8u272-b10-0ubuntu1~18.04-b10 |
| -1 :x: | cc | 18m 15s | [/diff-compile-cc-root-jdkPrivateBuild-1.8.0_272-8u272-b10-0ubuntu1~18.04-b10.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2350/11/artifact/out/diff-compile-cc-root-jdkPrivateBuild-1.8.0_272-8u272-b10-0ubuntu1~18.04-b10.txt) | root-jdkPrivateBuild-1.8.0_272-8u272-b10-0ubuntu1~18.04-b10 with JDK Private Build-1.8.0_272-8u272-b10-0ubuntu1~18.04-b10 generated 31 new + 140 unchanged - 31 fixed = 171 total (was 171) |
| +1 :green_heart: | golang | 18m 15s | | the patch passed |
| -1 :x: | javac | 18m 15s | [/diff-compile-javac-root-jdkPrivateBuild-1.8.0_272-8u272-b10-0ubuntu1~18.04-b10.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2350/11/artifact/out/diff-compile-javac-root-jdkPrivateBuild-1.8.0_272-8u272-b10-0ubuntu1~18.04-b10.txt) | root-jdkPrivateBuild-1.8.0_272-8u272-b10-0ubuntu1~18.04-b10 with JDK Private Build-1.8.0_272-8u272-b10-0ubuntu1~18.04-b10 generated 1 new + 1947 unchanged - 0 fixed = 1948 total (was 1947) |
| -0 :warning: | checkstyle | 2m 57s | [/diff-checkstyle-root.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2350/11/artifact/out/diff-checkstyle-root.txt) | root: The patch generated 2 new + 132 unchanged - 1 fixed = 134 total (was 133) |
| +1 :green_he
[jira] [Work logged] (HADOOP-17292) Using lz4-java in Lz4Codec
[ https://issues.apache.org/jira/browse/HADOOP-17292?focusedWorklogId=508787&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-508787 ]

ASF GitHub Bot logged work on HADOOP-17292:
---
Author: ASF GitHub Bot
Created on: 07/Nov/20 18:01
Start Date: 07/Nov/20 18:01
Worklog Time Spent: 10m
Work Description: viirya commented on pull request #2350:
URL: https://github.com/apache/hadoop/pull/2350#issuecomment-723475628

> I hit a portability issue with the bundled snappy .so of snappy-java in [xerial/snappy-java#256](https://github.com/xerial/snappy-java/pull/256). It could be a downside that we cannot directly control the portability. While I have not yet found how the bundled lz4 is built, is it portable enough, given that it has no dependency on other libraries?

Thanks for asking. The library underlying lz4-java is lz4 (https://lz4.github.io/lz4/), which I think is the same as Hadoop's current lz4 library.

```
LZ4 - Fast LZ compression algorithm
Copyright (C) 2011-present, Yann Collet.
...
You can contact the author at :
- LZ4 homepage : http://www.lz4.org
- LZ4 source repository : https://github.com/lz4/lz4
```

So I think it should be as portable as before?

BTW, for snappy-java and lz4-java, if no native library for your platform is found, it will fall back to the pure-java implementation.

Issue Time Tracking
---
Worklog Id: (was: 508787)
Time Spent: 7h 20m (was: 7h 10m)
[jira] [Work logged] (HADOOP-17292) Using lz4-java in Lz4Codec
[ https://issues.apache.org/jira/browse/HADOOP-17292?focusedWorklogId=508738&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-508738 ]

ASF GitHub Bot logged work on HADOOP-17292:
---
Author: ASF GitHub Bot
Created on: 07/Nov/20 11:07
Start Date: 07/Nov/20 11:07
Worklog Time Spent: 10m
Work Description: iwasakims commented on pull request #2350:
URL: https://github.com/apache/hadoop/pull/2350#issuecomment-723431811

I hit a portability issue with the bundled snappy .so of snappy-java in https://github.com/xerial/snappy-java/pull/256. It could be a downside that we cannot directly control the portability. While I have not yet found how the bundled lz4 is built, is it portable enough, given that it has no dependency on other libraries?

Issue Time Tracking
---
Worklog Id: (was: 508738)
Time Spent: 7h 10m (was: 7h)
[jira] [Work logged] (HADOOP-17292) Using lz4-java in Lz4Codec
[ https://issues.apache.org/jira/browse/HADOOP-17292?focusedWorklogId=508651&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-508651 ]

ASF GitHub Bot logged work on HADOOP-17292:
---
Author: ASF GitHub Bot
Created on: 06/Nov/20 23:18
Start Date: 06/Nov/20 23:18
Worklog Time Spent: 10m
Work Description: sunchao commented on pull request #2350:
URL: https://github.com/apache/hadoop/pull/2350#issuecomment-723342394

Another kind ping, @steveloughran. I think @viirya has resolved all your comments, so please take a look. Otherwise I'll merge this early next week. Thanks.

Issue Time Tracking
---
Worklog Id: (was: 508651)
Time Spent: 6h 50m (was: 6h 40m)
[jira] [Work logged] (HADOOP-17292) Using lz4-java in Lz4Codec
[ https://issues.apache.org/jira/browse/HADOOP-17292?focusedWorklogId=508652&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-508652 ]

ASF GitHub Bot logged work on HADOOP-17292:
---
Author: ASF GitHub Bot
Created on: 06/Nov/20 23:19
Start Date: 06/Nov/20 23:19
Worklog Time Spent: 10m
Work Description: sunchao edited a comment on pull request #2350:
URL: https://github.com/apache/hadoop/pull/2350#issuecomment-723342394

Another kind ping, @steveloughran. I think @viirya has resolved all your comments, so please take a look. Otherwise I'll merge this next week. Thanks.

Issue Time Tracking
---
Worklog Id: (was: 508652)
Time Spent: 7h (was: 6h 50m)
[jira] [Work logged] (HADOOP-17292) Using lz4-java in Lz4Codec
[ https://issues.apache.org/jira/browse/HADOOP-17292?focusedWorklogId=503468&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-503468 ]

ASF GitHub Bot logged work on HADOOP-17292:
---
Author: ASF GitHub Bot
Created on: 22/Oct/20 01:18
Start Date: 22/Oct/20 01:18
Worklog Time Spent: 10m
Work Description: hadoop-yetus commented on pull request #2350:
URL: https://github.com/apache/hadoop/pull/2350#issuecomment-714127360

:broken_heart: **-1 overall**

| Vote | Subsystem | Runtime | Logfile | Comment |
|:----:|----------:|--------:|:-------:|:-------:|
| +0 :ok: | reexec | 0m 33s | | Docker mode activated. |
|||| _ Prechecks _ |
| +1 :green_heart: | dupname | 0m 1s | | No case conflicting files found. |
| +1 :green_heart: | @author | 0m 0s | | The patch does not contain any @author tags. |
| +1 :green_heart: | | 0m 0s | [test4tests](test4tests) | The patch appears to include 6 new or modified test files. |
|||| _ trunk Compile Tests _ |
| +0 :ok: | mvndep | 11m 50s | | Maven dependency ordering for branch |
| +1 :green_heart: | mvninstall | 21m 21s | | trunk passed |
| +1 :green_heart: | compile | 19m 54s | | trunk passed with JDK Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1 |
| +1 :green_heart: | compile | 17m 25s | | trunk passed with JDK Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 |
| +1 :green_heart: | checkstyle | 2m 51s | | trunk passed |
| +1 :green_heart: | mvnsite | 3m 18s | | trunk passed |
| +1 :green_heart: | shadedclient | 15m 45s | | branch has no errors when building and testing our client artifacts. |
| +1 :green_heart: | javadoc | 2m 26s | | trunk passed with JDK Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1 |
| +1 :green_heart: | javadoc | 3m 21s | | trunk passed with JDK Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 |
| +0 :ok: | spotbugs | 1m 2s | | Used deprecated FindBugs config; considering switching to SpotBugs. |
| +0 :ok: | findbugs | 0m 47s | | branch/hadoop-project no findbugs output file (findbugsXml.xml) |
|||| _ Patch Compile Tests _ |
| +0 :ok: | mvndep | 0m 36s | | Maven dependency ordering for patch |
| +1 :green_heart: | mvninstall | 1m 37s | | the patch passed |
| +1 :green_heart: | compile | 19m 14s | | the patch passed with JDK Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1 |
| -1 :x: | cc | 19m 14s | [/diff-compile-cc-root-jdkUbuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2350/10/artifact/out/diff-compile-cc-root-jdkUbuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1.txt) | root-jdkUbuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1 with JDK Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1 generated 36 new + 136 unchanged - 36 fixed = 172 total (was 172) |
| +1 :green_heart: | golang | 19m 14s | | the patch passed |
| -1 :x: | javac | 19m 14s | [/diff-compile-javac-root-jdkUbuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2350/10/artifact/out/diff-compile-javac-root-jdkUbuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1.txt) | root-jdkUbuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1 with JDK Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1 generated 2 new + 2051 unchanged - 1 fixed = 2053 total (was 2052) |
| +1 :green_heart: | compile | 17m 28s | | the patch passed with JDK Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 |
| -1 :x: | cc | 17m 28s | [/diff-compile-cc-root-jdkPrivateBuild-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2350/10/artifact/out/diff-compile-cc-root-jdkPrivateBuild-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01.txt) | root-jdkPrivateBuild-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 with JDK Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 generated 11 new + 161 unchanged - 11 fixed = 172 total (was 172) |
| +1 :green_heart: | golang | 17m 28s | | the patch passed |
| -1 :x: | javac | 17m 28s | [/diff-compile-javac-root-jdkPrivateBuild-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2350/10/artifact/out/diff-compile-javac-root-jdkPrivateBuild-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01.txt) | root-jdkPrivateBuild-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 with JDK Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 generated 1 new + 1947 unchanged - 0 fixed = 1948 total (was 1947) |
| -0 :warning: | checkstyle | 3m 17s | [/diff-checkstyle-root.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2350/10/artifact/out/diff-checkstyle-root.txt) | root: The patch generated 2 new + 132 unchanged - 1 fixed
[jira] [Work logged] (HADOOP-17292) Using lz4-java in Lz4Codec
[ https://issues.apache.org/jira/browse/HADOOP-17292?focusedWorklogId=503467&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-503467 ]

ASF GitHub Bot logged work on HADOOP-17292:
---
Author: ASF GitHub Bot
Created on: 22/Oct/20 01:17
Start Date: 22/Oct/20 01:17
Worklog Time Spent: 10m
Work Description: hadoop-yetus commented on pull request #2350:
URL: https://github.com/apache/hadoop/pull/2350#issuecomment-714125530

:broken_heart: **-1 overall**

| Vote | Subsystem | Runtime | Logfile | Comment |
|:----:|----------:|--------:|:-------:|:-------:|
| +0 :ok: | reexec | 1m 13s | | Docker mode activated. |
|||| _ Prechecks _ |
| +1 :green_heart: | dupname | 0m 0s | | No case conflicting files found. |
| +1 :green_heart: | @author | 0m 0s | | The patch does not contain any @author tags. |
| +1 :green_heart: | | 0m 0s | [test4tests](test4tests) | The patch appears to include 6 new or modified test files. |
|||| _ trunk Compile Tests _ |
| +0 :ok: | mvndep | 11m 37s | | Maven dependency ordering for branch |
| +1 :green_heart: | mvninstall | 23m 6s | | trunk passed |
| +1 :green_heart: | compile | 22m 3s | | trunk passed with JDK Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1 |
| +1 :green_heart: | compile | 18m 27s | | trunk passed with JDK Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 |
| +1 :green_heart: | checkstyle | 2m 57s | | trunk passed |
| +1 :green_heart: | mvnsite | 2m 45s | | trunk passed |
| +1 :green_heart: | shadedclient | 17m 14s | | branch has no errors when building and testing our client artifacts. |
| +1 :green_heart: | javadoc | 1m 46s | | trunk passed with JDK Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1 |
| +1 :green_heart: | javadoc | 2m 39s | | trunk passed with JDK Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 |
| +0 :ok: | spotbugs | 0m 52s | | Used deprecated FindBugs config; considering switching to SpotBugs. |
| +0 :ok: | findbugs | 0m 36s | | branch/hadoop-project no findbugs output file (findbugsXml.xml) |
|||| _ Patch Compile Tests _ |
| +0 :ok: | mvndep | 0m 32s | | Maven dependency ordering for patch |
| +1 :green_heart: | mvninstall | 1m 30s | | the patch passed |
| +1 :green_heart: | compile | 20m 45s | | the patch passed with JDK Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1 |
| -1 :x: | cc | 20m 45s | [/diff-compile-cc-root-jdkUbuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2350/9/artifact/out/diff-compile-cc-root-jdkUbuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1.txt) | root-jdkUbuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1 with JDK Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1 generated 24 new + 148 unchanged - 24 fixed = 172 total (was 172) |
| +1 :green_heart: | golang | 20m 45s | | the patch passed |
| -1 :x: | javac | 20m 45s | [/diff-compile-javac-root-jdkUbuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2350/9/artifact/out/diff-compile-javac-root-jdkUbuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1.txt) | root-jdkUbuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1 with JDK Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1 generated 2 new + 2051 unchanged - 1 fixed = 2053 total (was 2052) |
| +1 :green_heart: | compile | 18m 23s | | the patch passed with JDK Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 |
| -1 :x: | cc | 18m 23s | [/diff-compile-cc-root-jdkPrivateBuild-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2350/9/artifact/out/diff-compile-cc-root-jdkPrivateBuild-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01.txt) | root-jdkPrivateBuild-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 with JDK Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 generated 29 new + 143 unchanged - 29 fixed = 172 total (was 172) |
| +1 :green_heart: | golang | 18m 23s | | the patch passed |
| -1 :x: | javac | 18m 23s | [/diff-compile-javac-root-jdkPrivateBuild-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2350/9/artifact/out/diff-compile-javac-root-jdkPrivateBuild-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01.txt) | root-jdkPrivateBuild-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 with JDK Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 generated 1 new + 1947 unchanged - 0 fixed = 1948 total (was 1947) |
| -0 :warning: | checkstyle | 3m 21s | [/diff-checkstyle-root.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2350/9/artifact/out/diff-checkstyle-root.txt) | root: The patch generated 2 new + 132 unchanged - 1 fixed = 13
[jira] [Work logged] (HADOOP-17292) Using lz4-java in Lz4Codec
[ https://issues.apache.org/jira/browse/HADOOP-17292?focusedWorklogId=503420&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-503420 ] ASF GitHub Bot logged work on HADOOP-17292: --- Author: ASF GitHub Bot Created on: 21/Oct/20 21:56 Start Date: 21/Oct/20 21:56 Worklog Time Spent: 10m Work Description: sunchao commented on pull request #2350: URL: https://github.com/apache/hadoop/pull/2350#issuecomment-713900131 kindly ping @steveloughran Issue Time Tracking --- Worklog Id: (was: 503420) Time Spent: 6h 20m (was: 6h 10m)
[jira] [Work logged] (HADOOP-17292) Using lz4-java in Lz4Codec
[ https://issues.apache.org/jira/browse/HADOOP-17292?focusedWorklogId=503417&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-503417 ] ASF GitHub Bot logged work on HADOOP-17292: --- Author: ASF GitHub Bot Created on: 21/Oct/20 21:52 Start Date: 21/Oct/20 21:52 Worklog Time Spent: 10m Work Description: viirya commented on pull request #2350: URL: https://github.com/apache/hadoop/pull/2350#issuecomment-713898244 @dbtsai Resolved. Thanks. Issue Time Tracking --- Worklog Id: (was: 503417) Time Spent: 6h 10m (was: 6h)
[jira] [Work logged] (HADOOP-17292) Using lz4-java in Lz4Codec
[ https://issues.apache.org/jira/browse/HADOOP-17292?focusedWorklogId=503388&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-503388 ] ASF GitHub Bot logged work on HADOOP-17292: --- Author: ASF GitHub Bot Created on: 21/Oct/20 20:39 Start Date: 21/Oct/20 20:39 Worklog Time Spent: 10m Work Description: dbtsai commented on pull request #2350: URL: https://github.com/apache/hadoop/pull/2350#issuecomment-713864391 @viirya can you resolve the conflict? Thanks. Issue Time Tracking --- Worklog Id: (was: 503388) Time Spent: 6h (was: 5h 50m)
[jira] [Work logged] (HADOOP-17292) Using lz4-java in Lz4Codec
[ https://issues.apache.org/jira/browse/HADOOP-17292?focusedWorklogId=501163&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-501163 ]

ASF GitHub Bot logged work on HADOOP-17292:
---
Author: ASF GitHub Bot
Created on: 15/Oct/20 16:46
Start Date: 15/Oct/20 16:46
Worklog Time Spent: 10m
Work Description: sunchao commented on a change in pull request #2350:
URL: https://github.com/apache/hadoop/pull/2350#discussion_r505690346

## File path: hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/io/compress/lz4/Lz4Compressor.java ##
@@ -302,11 +303,20 @@ public synchronized long getBytesWritten() {
   public synchronized void end() {
   }

-  private native static void initIDs();
-
-  private native int compressBytesDirect();
-
-  private native int compressBytesDirectHC();
-
-  public native static String getLibraryName();
+  private int compressDirectBuf() {
+    if (uncompressedDirectBufLen == 0) {
+      return 0;
+    } else {
+      // Set the position and limit of `uncompressedDirectBuf` for reading
+      uncompressedDirectBuf.limit(uncompressedDirectBufLen).position(0);
+      compressedDirectBuf.clear();

Review comment:
Okay makes sense.

Issue Time Tracking
---
Worklog Id: (was: 501163)
Time Spent: 5h 50m (was: 5h 40m)
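The buffer handling in the `compressDirectBuf()` hunk above can be tried in isolation. The following is a minimal sketch using only `java.nio`; the class `DirectBufferWindowSketch` and the method `copyWindow` are hypothetical names, and a plain copy stands in for the lz4-java compression call, so this is not the code from the pull request.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Hypothetical illustration class; not part of Hadoop or lz4-java.
final class DirectBufferWindowSketch {

  // Stand-in for a compressor: copies the first srcLen bytes of src into dst
  // and returns the number of bytes written.
  static int copyWindow(ByteBuffer src, int srcLen, ByteBuffer dst) {
    if (srcLen == 0) {
      return 0;
    }
    // Expose exactly the bytes [0, srcLen) of src for reading.
    src.limit(srcLen).position(0);
    // Reset dst (position = 0, limit = capacity) before writing into it.
    dst.clear();
    dst.put(src);
    int written = dst.position();
    // Flip dst so the caller can read back what was just written.
    dst.flip();
    return written;
  }

  public static void main(String[] args) {
    byte[] input = "hello lz4".getBytes(StandardCharsets.US_ASCII);
    ByteBuffer src = ByteBuffer.allocateDirect(64);
    ByteBuffer dst = ByteBuffer.allocateDirect(64);
    src.put(input);
    int n = copyWindow(src, input.length, dst);
    System.out.println(n + " bytes copied, " + dst.remaining() + " readable");
  }
}
```

In the actual patch the copy step would be a call into lz4-java; only the `limit`/`position`/`clear` bookkeeping is what this sketch is meant to show.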
[jira] [Work logged] (HADOOP-17292) Using lz4-java in Lz4Codec
[ https://issues.apache.org/jira/browse/HADOOP-17292?focusedWorklogId=500482&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-500482 ] ASF GitHub Bot logged work on HADOOP-17292: --- Author: ASF GitHub Bot Created on: 14/Oct/20 07:06 Start Date: 14/Oct/20 07:06 Worklog Time Spent: 10m Work Description: viirya commented on pull request #2350: URL: https://github.com/apache/hadoop/pull/2350#issuecomment-708206042 @sunchao Thanks for review. I addressed your comments, please let me know if you have more comments. Issue Time Tracking --- Worklog Id: (was: 500482) Time Spent: 5h 40m (was: 5.5h)
[jira] [Work logged] (HADOOP-17292) Using lz4-java in Lz4Codec
[ https://issues.apache.org/jira/browse/HADOOP-17292?focusedWorklogId=500481&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-500481 ]

ASF GitHub Bot logged work on HADOOP-17292:
---
Author: ASF GitHub Bot
Created on: 14/Oct/20 07:05
Start Date: 14/Oct/20 07:05
Worklog Time Spent: 10m
Work Description: viirya edited a comment on pull request #2350:
URL: https://github.com/apache/hadoop/pull/2350#issuecomment-708205091

@steveloughran Thanks for reviewing this change.

> this may be time for that hadoop-compression library which, even if the codec stays in hadoop-common, can declare the new JAR (and snappy java) as explicit dependencies. That way things including mapreduce client can just ask for hadoop-compression and get all the latest set of JARs as well as codecs, and critically, get the JAR versions consistent with what hadoop was built with

Yes, we will create `hadoop-compression` after this change. Because this work switches the snappy and lz4 codecs to Java libraries, preparing `hadoop-compression` becomes easier: if we want to move a codec, we no longer need to move the native code or the related native build setup.

> will need extra docs (where?) to cover the change

We are not sure yet where the extra docs belong; we will look for a proper place to put them.

Issue Time Tracking
---
Worklog Id: (was: 500481)
Time Spent: 5.5h (was: 5h 20m)
[jira] [Work logged] (HADOOP-17292) Using lz4-java in Lz4Codec
[ https://issues.apache.org/jira/browse/HADOOP-17292?focusedWorklogId=500480&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-500480 ]

ASF GitHub Bot logged work on HADOOP-17292:
---
Author: ASF GitHub Bot
Created on: 14/Oct/20 07:04
Start Date: 14/Oct/20 07:04
Worklog Time Spent: 10m
Work Description: viirya commented on pull request #2350:
URL: https://github.com/apache/hadoop/pull/2350#issuecomment-708205091

@steveloughran Thanks for reviewing this change.

> this may be time for that hadoop-compression library which, even if the codec stays in hadoop-common, can declare the new JAR (and snappy java) as explicit dependencies. That way things including mapreduce client can just ask for hadoop-compression and get all the latest set of JARs as well as codecs, and critically, get the JAR versions consistent with what hadoop was built with

Yes, we will create `hadoop-compression` after this change. Because this work switches the snappy and lz4 codecs to Java libraries, preparing `hadoop-compression` becomes easier: if we want to move a codec, we no longer need to move the native code or the related native build setup.

Issue Time Tracking
---
Worklog Id: (was: 500480)
Time Spent: 5h 20m (was: 5h 10m)
[jira] [Work logged] (HADOOP-17292) Using lz4-java in Lz4Codec
[ https://issues.apache.org/jira/browse/HADOOP-17292?focusedWorklogId=500478&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-500478 ]

ASF GitHub Bot logged work on HADOOP-17292:
---
Author: ASF GitHub Bot
Created on: 14/Oct/20 07:00
Start Date: 14/Oct/20 07:00
Worklog Time Spent: 10m
Work Description: viirya commented on a change in pull request #2350:
URL: https://github.com/apache/hadoop/pull/2350#discussion_r50909

## File path: hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/io/compress/lz4/Lz4Compressor.java ##
@@ -50,20 +51,7 @@
   private final boolean useLz4HC;

-  static {
-    if (NativeCodeLoader.isNativeCodeLoaded()) {
-      // Initialize the native library
-      try {
-        initIDs();
-      } catch (Throwable t) {
-        // Ignore failure to load/initialize lz4
-        LOG.warn(t.toString());
-      }
-    } else {
-      LOG.error("Cannot load " + Lz4Compressor.class.getName() +
-          " without native hadoop library!");
-    }
-  }
+  private LZ4Compressor lz4Compressor;

Review comment:
thanks. added `final`.

Issue Time Tracking
---
Worklog Id: (was: 500478)
Time Spent: 5h 10m (was: 5h)
[jira] [Work logged] (HADOOP-17292) Using lz4-java in Lz4Codec
[ https://issues.apache.org/jira/browse/HADOOP-17292?focusedWorklogId=500477&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-500477 ]

ASF GitHub Bot logged work on HADOOP-17292:
---
Author: ASF GitHub Bot
Created on: 14/Oct/20 06:59
Start Date: 14/Oct/20 06:59
Worklog Time Spent: 10m
Work Description: viirya commented on a change in pull request #2350:
URL: https://github.com/apache/hadoop/pull/2350#discussion_r50782

## File path: hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/io/compress/lz4/Lz4Decompressor.java ##
@@ -67,6 +55,15 @@ public Lz4Decompressor(int directBufferSize) {
     this.directBufferSize = directBufferSize;

+    try {
+      LZ4Factory lz4Factory = LZ4Factory.fastestInstance();
+      lz4Decompressor = lz4Factory.safeDecompressor();
+    } catch (Throwable t) {
+      throw new RuntimeException("lz4-java library is not available: " +
+          "Lz4Decompressor has not been loaded. You need to add " +
+          "lz4-java.jar to your CLASSPATH", t);

Review comment:
thanks. added it.

Issue Time Tracking
---
Worklog Id: (was: 500477)
Time Spent: 5h (was: 4h 50m)
[jira] [Work logged] (HADOOP-17292) Using lz4-java in Lz4Codec
[ https://issues.apache.org/jira/browse/HADOOP-17292?focusedWorklogId=500476&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-500476 ]

ASF GitHub Bot logged work on HADOOP-17292:
---
    Author: ASF GitHub Bot
    Created on: 14/Oct/20 06:59
    Start Date: 14/Oct/20 06:59
    Worklog Time Spent: 10m
    Work Description: viirya commented on a change in pull request #2350:
URL: https://github.com/apache/hadoop/pull/2350#discussion_r50526

## File path: hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/io/compress/lz4/TestLz4CompressorDecompressor.java ##
@@ -330,4 +328,33 @@ public void doWork() throws Exception {
     ctx.waitFor(6);
   }
+
+  @Test
+  public void testLz4Compatibility() throws Exception {
+    Path filePath = new Path(TestLz4CompressorDecompressor.class

Review comment:
ok, added some comment.

Issue Time Tracking
---
    Worklog Id: (was: 500476)
    Time Spent: 4h 50m (was: 4h 40m)
[jira] [Work logged] (HADOOP-17292) Using lz4-java in Lz4Codec
[ https://issues.apache.org/jira/browse/HADOOP-17292?focusedWorklogId=500473&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-500473 ]

ASF GitHub Bot logged work on HADOOP-17292:
---
    Author: ASF GitHub Bot
    Created on: 14/Oct/20 06:54
    Start Date: 14/Oct/20 06:54
    Worklog Time Spent: 10m
    Work Description: viirya commented on a change in pull request #2350:
URL: https://github.com/apache/hadoop/pull/2350#discussion_r504442381

## File path: hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-nativetask/pom.xml ##
@@ -71,6 +71,11 @@ assertj-core test +

Review comment:
One test in `hadoop-mapreduce-client-nativetask` uses `Lz4Codec`. Because `lz4-java` now has `provided` scope in `hadoop-common`, that test fails with a class-not-found exception unless we also add `lz4-java` as a test dependency here.

Issue Time Tracking
---
    Worklog Id: (was: 500473)
    Time Spent: 4h 40m (was: 4.5h)
[jira] [Work logged] (HADOOP-17292) Using lz4-java in Lz4Codec
[ https://issues.apache.org/jira/browse/HADOOP-17292?focusedWorklogId=500472&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-500472 ]

ASF GitHub Bot logged work on HADOOP-17292:
---
    Author: ASF GitHub Bot
    Created on: 14/Oct/20 06:53
    Start Date: 14/Oct/20 06:53
    Worklog Time Spent: 10m
    Work Description: viirya commented on a change in pull request #2350:
URL: https://github.com/apache/hadoop/pull/2350#discussion_r504441816

## File path: hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/io/compress/lz4/Lz4Decompressor.java ##
@@ -272,7 +269,19 @@ public synchronized void end() {
     // do nothing
   }
 
-  private native static void initIDs();
-
-  private native int decompressBytesDirect();
+  private int decompressDirectBuf() {
+    if (compressedDirectBufLen == 0) {
+      return 0;
+    } else {
+      // Set the position and limit of `compressedDirectBuf` for reading
+      compressedDirectBuf.limit(compressedDirectBufLen).position(0);
+      lz4Decompressor.decompress((ByteBuffer) compressedDirectBuf,
+          (ByteBuffer) uncompressedDirectBuf);
+      compressedDirectBufLen = 0;
+      compressedDirectBuf.limit(compressedDirectBuf.capacity()).position(0);

Review comment:
sure.

Issue Time Tracking
---
    Worklog Id: (was: 500472)
    Time Spent: 4.5h (was: 4h 20m)
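The position/limit handling in the hunk above can be tried in isolation with plain `java.nio`. The following is only a sketch: the class name `ReadWindowSketch`, the `drain` helper, and the use of `dst.put(src)` as a stand-in for the real `decompress` call are assumptions made for illustration, not code from the pull request.

```java
import java.nio.ByteBuffer;

public class ReadWindowSketch {

  /**
   * Hypothetical helper: hands the first {@code len} bytes of {@code src}
   * to a consumer, then resets {@code src} so it can be refilled.
   */
  static int drain(ByteBuffer src, int len, ByteBuffer dst) {
    if (len == 0) {
      return 0;
    }
    // Expose exactly the filled bytes [0, len) to the reader.
    src.limit(len).position(0);
    int before = dst.position();
    // Stand-in for a decompress(src, dst) call: a plain copy.
    dst.put(src);
    // Make the whole buffer writable again for the next fill.
    src.limit(src.capacity()).position(0);
    return dst.position() - before;
  }

  public static void main(String[] args) {
    ByteBuffer src = ByteBuffer.allocateDirect(16);
    ByteBuffer dst = ByteBuffer.allocateDirect(16);
    src.put(new byte[] {1, 2, 3});
    int consumed = drain(src, 3, dst);
    System.out.println(consumed + " bytes consumed, src remaining = "
        + src.remaining());
  }
}
```

Setting `limit` before `position` matters here: after filling, the position sits at the end of the data, so the limit can be lowered to it safely before the position is rewound.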
[jira] [Work logged] (HADOOP-17292) Using lz4-java in Lz4Codec
[ https://issues.apache.org/jira/browse/HADOOP-17292?focusedWorklogId=500471&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-500471 ]

ASF GitHub Bot logged work on HADOOP-17292:
---
    Author: ASF GitHub Bot
    Created on: 14/Oct/20 06:53
    Start Date: 14/Oct/20 06:53
    Worklog Time Spent: 10m
    Work Description: viirya commented on a change in pull request #2350:
URL: https://github.com/apache/hadoop/pull/2350#discussion_r504441522

## File path: hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/io/compress/lz4/Lz4Decompressor.java ##
@@ -272,7 +269,19 @@ public synchronized void end() {
     // do nothing
   }
 
-  private native static void initIDs();
-
-  private native int decompressBytesDirect();
+  private int decompressDirectBuf() {
+    if (compressedDirectBufLen == 0) {
+      return 0;
+    } else {
+      // Set the position and limit of `compressedDirectBuf` for reading

Review comment:
ok, removed.

Issue Time Tracking
---
    Worklog Id: (was: 500471)
    Time Spent: 4h 20m (was: 4h 10m)
[jira] [Work logged] (HADOOP-17292) Using lz4-java in Lz4Codec
[ https://issues.apache.org/jira/browse/HADOOP-17292?focusedWorklogId=500470&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-500470 ]

ASF GitHub Bot logged work on HADOOP-17292:
---
    Author: ASF GitHub Bot
    Created on: 14/Oct/20 06:51
    Start Date: 14/Oct/20 06:51
    Worklog Time Spent: 10m
    Work Description: viirya commented on a change in pull request #2350:
URL: https://github.com/apache/hadoop/pull/2350#discussion_r504441080

## File path: hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/io/compress/lz4/Lz4Compressor.java ##
@@ -302,11 +303,20 @@ public synchronized long getBytesWritten() {
   public synchronized void end() {
   }
 
-  private native static void initIDs();
-
-  private native int compressBytesDirect();
-
-  private native int compressBytesDirectHC();
-
-  public native static String getLibraryName();
+  private int compressDirectBuf() {
+    if (uncompressedDirectBufLen == 0) {
+      return 0;
+    } else {
+      // Set the position and limit of `uncompressedDirectBuf` for reading
+      uncompressedDirectBuf.limit(uncompressedDirectBufLen).position(0);
+      compressedDirectBuf.clear();

Review comment:
We can remove the `clear` and `limit` calls at the call site.

```java
// Re-initialize the lz4's output direct-buffer
compressedDirectBuf.clear();
compressedDirectBuf.limit(0);
```

But we cannot remove `clear` in the `compressDirectBuf` method. First, lz4-java cannot accept `compressedDirectBuf` if it is passed in after `limit(0)`.

```
[ERROR] Failures:
[ERROR] TestCompressorDecompressor.testCompressorDecompressor:69 Expected to find 'testCompressorDecompressor error !!!' but got unexpected exception: net.jpountz.lz4.LZ4Exception: maxDestLen is too small
  at net.jpountz.lz4.LZ4JNICompressor.compress(LZ4JNICompressor.java:69)
  at net.jpountz.lz4.LZ4Compressor.compress(LZ4Compressor.java:158)
  at org.apache.hadoop.io.compress.lz4.Lz4Compressor.compressDirectBuf(Lz4Compressor.java:310)
  at org.apache.hadoop.io.compress.lz4.Lz4Compressor.compress(Lz4Compressor.java:237)
  at org.apache.hadoop.io.compress.CompressDecompressTester$CompressionTestStrategy$2.assertCompression(CompressDecompressTester.java:286)
  at org.apache.hadoop.io.compress.CompressDecompressTester.test(CompressDecompressTester.java:114)
  at org.apache.hadoop.io.compress.TestCompressorDecompressor.testCompressorDecompressor(TestCompressorDecompressor.java:66)
```

Second, even if we remove `limit` at the call site, one test still fails:

```
[ERROR] testCompressorDecompressor(org.apache.hadoop.io.compress.TestCompressorDecompressor) Time elapsed: 0.539 s <<< FAILURE!
java.lang.AssertionError: org.apache.hadoop.io.compress.lz4.Lz4Compressor_org.apache.hadoop.io.compress.lz4.Lz4Decompressor- empty stream compressed output size != 4 expected:<4> but was:<65796>
  at org.junit.Assert.fail(Assert.java:88)
  at org.junit.Assert.failNotEquals(Assert.java:834)
  at org.junit.Assert.assertEquals(Assert.java:645)
  at org.apache.hadoop.io.compress.CompressDecompressTester$CompressionTestStrategy$3.assertCompression(CompressDecompressTester.java:33
```
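The first failure above follows from how `java.nio` buffers report free space. A minimal sketch, using only the standard library: `writeInto` is a hypothetical stand-in for a compressor that checks the destination size (the real lz4-java call throws `LZ4Exception`; an `IllegalArgumentException` is used here so the example needs no extra jar).

```java
import java.nio.ByteBuffer;

public class OutputBufferSketch {

  /** Hypothetical stand-in for a compressor that rejects a too-small destination. */
  static int writeInto(ByteBuffer dst, int needed) {
    if (dst.remaining() < needed) {
      throw new IllegalArgumentException("maxDestLen is too small");
    }
    for (int i = 0; i < needed; i++) {
      dst.put((byte) i);
    }
    return needed;
  }

  /** Returns true if the destination currently has room for {@code needed} bytes. */
  static boolean accepts(ByteBuffer dst, int needed) {
    try {
      writeInto(dst, needed);
      return true;
    } catch (IllegalArgumentException e) {
      return false;
    }
  }

  public static void main(String[] args) {
    ByteBuffer out = ByteBuffer.allocateDirect(64);
    out.clear();
    out.limit(0); // "nothing to read yet" state, as set at the call site
    System.out.println("after limit(0): accepted = " + accepts(out, 4));
    out.clear(); // position = 0, limit = capacity: writable again
    System.out.println("after clear():  accepted = " + accepts(out, 4));
  }
}
```

With `limit(0)` the buffer has zero `remaining()` bytes, so any writer that sizes its output from the buffer sees no room; `clear()` restores the limit to the capacity without touching the contents.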
[jira] [Work logged] (HADOOP-17292) Using lz4-java in Lz4Codec
[ https://issues.apache.org/jira/browse/HADOOP-17292?focusedWorklogId=500463&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-500463 ]

ASF GitHub Bot logged work on HADOOP-17292:
---
    Author: ASF GitHub Bot
    Created on: 14/Oct/20 06:36
    Start Date: 14/Oct/20 06:36
    Worklog Time Spent: 10m
    Work Description: viirya commented on a change in pull request #2350:
URL: https://github.com/apache/hadoop/pull/2350#discussion_r504434296

## File path: hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/io/compress/lz4/Lz4Decompressor.java ##
@@ -67,6 +55,15 @@ public Lz4Decompressor(int directBufferSize) {
     this.directBufferSize = directBufferSize;
 
+    try {
+      LZ4Factory lz4Factory = LZ4Factory.fastestInstance();
+      lz4Decompressor = lz4Factory.safeDecompressor();
+    } catch (Throwable t) {

Review comment:
Hmm, I tried to explicitly catch `ClassNotFoundException`, but the java compiler complains...

```
exception java.lang.ClassNotFoundException is never thrown in body of corresponding try statement
```

Issue Time Tracking
---
    Worklog Id: (was: 500463)
    Time Spent: 4h (was: 3h 50m)
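The compiler error is expected: `ClassNotFoundException` is a checked exception that only reflective calls such as `Class.forName` declare, whereas a class that was present at compile time but is missing at run time surfaces as `NoClassDefFoundError`, an unchecked `LinkageError`. The sketch below shows one possible narrower alternative to `catch (Throwable t)`. It is an illustration only, not what the pull request does; `LoadGuardSketch` and `loadOrExplain` are hypothetical names, and the missing class is simulated so the example runs without lz4-java.

```java
import java.util.function.Supplier;

public class LoadGuardSketch {

  /**
   * Runs a factory and converts linkage failures (NoClassDefFoundError,
   * UnsatisfiedLinkError, ExceptionInInitializerError) into a
   * RuntimeException that carries a readable hint.
   */
  static <T> T loadOrExplain(String what, Supplier<T> factory) {
    try {
      return factory.get();
    } catch (LinkageError e) {
      throw new RuntimeException(
          what + " is not available: add it to your CLASSPATH", e);
    }
  }

  public static void main(String[] args) {
    System.out.println(loadOrExplain("demo", () -> "loaded"));
    try {
      // Simulate a jar that is absent at run time.
      loadOrExplain("lz4-java", () -> {
        throw new NoClassDefFoundError("net/jpountz/lz4/LZ4Factory");
      });
    } catch (RuntimeException e) {
      System.out.println(e.getMessage() + " (cause: "
          + e.getCause().getClass().getSimpleName() + ")");
    }
  }
}
```

Catching `LinkageError` keeps unrelated failures such as `OutOfMemoryError` from being rewrapped, at the cost of missing any non-linkage exception the library's own loading logic might throw; whether that trade-off suits `Lz4Decompressor` is a judgment call for the reviewers.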
[jira] [Work logged] (HADOOP-17292) Using lz4-java in Lz4Codec
[ https://issues.apache.org/jira/browse/HADOOP-17292?focusedWorklogId=500460&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-500460 ]

ASF GitHub Bot logged work on HADOOP-17292:
---
    Author: ASF GitHub Bot
    Created on: 14/Oct/20 06:32
    Start Date: 14/Oct/20 06:32
    Worklog Time Spent: 10m
    Work Description: viirya commented on a change in pull request #2350:
URL: https://github.com/apache/hadoop/pull/2350#discussion_r504432504

## File path: hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/io/compress/lz4/Lz4Compressor.java ##
@@ -302,11 +303,20 @@ public synchronized long getBytesWritten() {
   public synchronized void end() {
   }
 
-  private native static void initIDs();
-
-  private native int compressBytesDirect();
-
-  private native int compressBytesDirectHC();
-
-  public native static String getLibraryName();
+  private int compressDirectBuf() {

Review comment:
Ok, currently this PR focuses on switching to lz4-java. We can refactor the compressor/decompressor classes later.

Issue Time Tracking
---
    Worklog Id: (was: 500460)
    Time Spent: 3h 50m (was: 3h 40m)
[jira] [Work logged] (HADOOP-17292) Using lz4-java in Lz4Codec
[ https://issues.apache.org/jira/browse/HADOOP-17292?focusedWorklogId=500459&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-500459 ]

ASF GitHub Bot logged work on HADOOP-17292:
---
    Author: ASF GitHub Bot
    Created on: 14/Oct/20 06:31
    Start Date: 14/Oct/20 06:31
    Worklog Time Spent: 10m
    Work Description: viirya commented on a change in pull request #2350:
URL: https://github.com/apache/hadoop/pull/2350#discussion_r504431944

## File path: hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/io/compress/lz4/Lz4Compressor.java ##
@@ -76,6 +64,19 @@ public Lz4Compressor(int directBufferSize, boolean useLz4HC) {
     this.useLz4HC = useLz4HC;
     this.directBufferSize = directBufferSize;
 
+    try {
+      LZ4Factory lz4Factory = LZ4Factory.fastestInstance();
+      if (useLz4HC) {

Review comment:
Oh, yes.

Issue Time Tracking
---
    Worklog Id: (was: 500459)
    Time Spent: 3h 40m (was: 3.5h)
[jira] [Work logged] (HADOOP-17292) Using lz4-java in Lz4Codec
[ https://issues.apache.org/jira/browse/HADOOP-17292?focusedWorklogId=500458&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-500458 ]

ASF GitHub Bot logged work on HADOOP-17292:
---
    Author: ASF GitHub Bot
    Created on: 14/Oct/20 06:30
    Start Date: 14/Oct/20 06:30
    Worklog Time Spent: 10m
    Work Description: viirya commented on a change in pull request #2350:
URL: https://github.com/apache/hadoop/pull/2350#discussion_r504431628

## File path: hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/io/compress/lz4/Lz4Compressor.java ##
@@ -76,6 +64,19 @@ public Lz4Compressor(int directBufferSize, boolean useLz4HC) {
     this.useLz4HC = useLz4HC;
     this.directBufferSize = directBufferSize;
 
+    try {
+      LZ4Factory lz4Factory = LZ4Factory.fastestInstance();
+      if (useLz4HC) {
+        lz4Compressor = lz4Factory.highCompressor();

Review comment:
Yeah, sounds good. We can add it later if it is necessary.

Issue Time Tracking
---
    Worklog Id: (was: 500458)
    Time Spent: 3.5h (was: 3h 20m)
[jira] [Work logged] (HADOOP-17292) Using lz4-java in Lz4Codec
[ https://issues.apache.org/jira/browse/HADOOP-17292?focusedWorklogId=500453&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-500453 ]

ASF GitHub Bot logged work on HADOOP-17292:
---
    Author: ASF GitHub Bot
    Created on: 14/Oct/20 06:23
    Start Date: 14/Oct/20 06:23
    Worklog Time Spent: 10m
    Work Description: viirya commented on a change in pull request #2350:
URL: https://github.com/apache/hadoop/pull/2350#discussion_r504428520

## File path: hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/io/compress/lz4/Lz4Decompressor.java ##
@@ -22,8 +22,9 @@
 import java.nio.Buffer;
 import java.nio.ByteBuffer;
 
+import net.jpountz.lz4.LZ4Factory;
+import net.jpountz.lz4.LZ4SafeDecompressor;

Review comment:
Already in same block?

```java
import net.jpountz.lz4.LZ4Factory;
import net.jpountz.lz4.LZ4SafeDecompressor;
import org.apache.hadoop.io.compress.Decompressor;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
```

Issue Time Tracking
---
    Worklog Id: (was: 500453)
    Time Spent: 3h 20m (was: 3h 10m)
[jira] [Work logged] (HADOOP-17292) Using lz4-java in Lz4Codec
[ https://issues.apache.org/jira/browse/HADOOP-17292?focusedWorklogId=500450&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-500450 ]

ASF GitHub Bot logged work on HADOOP-17292:
---
    Author: ASF GitHub Bot
    Created on: 14/Oct/20 06:22
    Start Date: 14/Oct/20 06:22
    Worklog Time Spent: 10m
    Work Description: viirya commented on a change in pull request #2350:
URL: https://github.com/apache/hadoop/pull/2350#discussion_r504428324

## File path: hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/io/compress/lz4/Lz4Compressor.java ##
@@ -22,9 +22,10 @@
 import java.nio.Buffer;
 import java.nio.ByteBuffer;
 
+import net.jpountz.lz4.LZ4Factory;
+import net.jpountz.lz4.LZ4Compressor;

Review comment:
They are already in same block, no?

```java
import net.jpountz.lz4.LZ4Factory;
import net.jpountz.lz4.LZ4Compressor;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.io.compress.Compressor;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
```

Issue Time Tracking
---
    Worklog Id: (was: 500450)
    Time Spent: 3h (was: 2h 50m)
[jira] [Work logged] (HADOOP-17292) Using lz4-java in Lz4Codec
[ https://issues.apache.org/jira/browse/HADOOP-17292?focusedWorklogId=500451&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-500451 ]

ASF GitHub Bot logged work on HADOOP-17292:
---
Author: ASF GitHub Bot
Created on: 14/Oct/20 06:22
Start Date: 14/Oct/20 06:22
Worklog Time Spent: 10m
Work Description: viirya commented on a change in pull request #2350:
URL: https://github.com/apache/hadoop/pull/2350#discussion_r504428520

## File path: hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/io/compress/lz4/Lz4Decompressor.java
## @@ -22,8 +22,9 @@
 import java.nio.Buffer;
 import java.nio.ByteBuffer;
+import net.jpountz.lz4.LZ4Factory;
+import net.jpountz.lz4.LZ4SafeDecompressor;

Review comment:

```
import net.jpountz.lz4.LZ4Factory;
import net.jpountz.lz4.LZ4SafeDecompressor;
import org.apache.hadoop.io.compress.Decompressor;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
```

Issue Time Tracking
---
Worklog Id: (was: 500451)
Time Spent: 3h 10m (was: 3h)
[jira] [Work logged] (HADOOP-17292) Using lz4-java in Lz4Codec
[ https://issues.apache.org/jira/browse/HADOOP-17292?focusedWorklogId=498683&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-498683 ]

ASF GitHub Bot logged work on HADOOP-17292:
---
Author: ASF GitHub Bot
Created on: 09/Oct/20 17:52
Start Date: 09/Oct/20 17:52
Worklog Time Spent: 10m
Work Description: sunchao commented on a change in pull request #2350:
URL: https://github.com/apache/hadoop/pull/2350#discussion_r502564166

## File path: hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/io/compress/lz4/Lz4Compressor.java
## @@ -76,6 +64,19 @@ public Lz4Compressor(int directBufferSize, boolean useLz4HC) {
     this.useLz4HC = useLz4HC;
     this.directBufferSize = directBufferSize;
+    try {
+      LZ4Factory lz4Factory = LZ4Factory.fastestInstance();
+      if (useLz4HC) {
+        lz4Compressor = lz4Factory.highCompressor();

Review comment: The library also allows configuring the compression level; perhaps we can add a Hadoop option to enable that later. This just uses the default compression level.

## File path: hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/io/compress/lz4/Lz4Compressor.java
## @@ -76,6 +64,19 @@ public Lz4Compressor(int directBufferSize, boolean useLz4HC) {
     this.useLz4HC = useLz4HC;
     this.directBufferSize = directBufferSize;
+    try {
+      LZ4Factory lz4Factory = LZ4Factory.fastestInstance();
+      if (useLz4HC) {

Review comment: nit: it seems we no longer need the field `useLz4HC` with this.

## File path: hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/io/compress/lz4/Lz4Decompressor.java
## @@ -272,7 +269,19 @@ public synchronized void end() {
     // do nothing
   }
-  private native static void initIDs();
-
-  private native int decompressBytesDirect();
+  private int decompressDirectBuf() {
+    if (compressedDirectBufLen == 0) {

Review comment: I don't think this will ever happen, but it's not a big deal.

## File path: hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/io/compress/lz4/Lz4Decompressor.java
## @@ -272,7 +269,19 @@ public synchronized void end() {
     // do nothing
   }
-  private native static void initIDs();
-
-  private native int decompressBytesDirect();
+  private int decompressDirectBuf() {
+    if (compressedDirectBufLen == 0) {
+      return 0;
+    } else {
+      // Set the position and limit of `compressedDirectBuf` for reading
+      compressedDirectBuf.limit(compressedDirectBufLen).position(0);
+      lz4Decompressor.decompress((ByteBuffer) compressedDirectBuf,
+          (ByteBuffer) uncompressedDirectBuf);
+      compressedDirectBufLen = 0;
+      compressedDirectBuf.limit(compressedDirectBuf.capacity()).position(0);

Review comment: You can just call `clear`?

## File path: hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-nativetask/pom.xml
## @@ -71,6 +71,11 @@ assertj-core test +

Review comment: Why is this needed?

## File path: hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/io/compress/lz4/Lz4Compressor.java
## @@ -302,11 +303,20 @@ public synchronized long getBytesWritten() {
   public synchronized void end() {
   }
-  private native static void initIDs();
-
-  private native int compressBytesDirect();
-
-  private native int compressBytesDirectHC();
-
-  public native static String getLibraryName();
+  private int compressDirectBuf() {

Review comment: Some of the methods in this class look exactly the same as in `SnappyCompressor`; perhaps we can do some refactoring later.

## File path: hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/io/compress/lz4/Lz4Compressor.java
## @@ -302,11 +303,20 @@ public synchronized long getBytesWritten() {
   public synchronized void end() {
   }
-  private native static void initIDs();
-
-  private native int compressBytesDirect();
-
-  private native int compressBytesDirectHC();
-
-  public native static String getLibraryName();
+  private int compressDirectBuf() {
+    if (uncompressedDirectBufLen == 0) {
+      return 0;
+    } else {
+      // Set the position and limit of `uncompressedDirectBuf` for reading
+      uncompressedDirectBuf.limit(uncompressedDirectBufLen).position(0);
+      compressedDirectBuf.clear();

Review comment: I think this isn't necessary since it's called right before the call site?

## File path: hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/io/compress/lz4/Lz4Decompressor.java
## @@ -272,7 +269,19 @@ public synchronized void end() {
     // do nothing
   }
-  private native static void initIDs();
-
-  private native int decompressByte
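The `clear` suggestion in the review above rests on a documented `java.nio.Buffer` behavior: `clear()` sets the position to zero and the limit to the capacity (and discards the mark), which is the same position and limit that the explicit `limit(capacity()).position(0)` chain in the patch produces. The sketch below uses only the JDK; the class `BufferResetDemo` and its method are hypothetical names for illustration and are not part of Hadoop or lz4-java. It prepares a read window the way the patch does, drains it, resets the buffer both ways, and reports the resulting state.

```java
import java.nio.ByteBuffer;
import java.util.Arrays;

// Hypothetical demo class; not part of Hadoop or lz4-java.
final class BufferResetDemo {

  /** Returns {position, limit} after a read window is consumed and the buffer is reset. */
  static int[] stateAfterReset(boolean useClear) {
    ByteBuffer buf = ByteBuffer.allocateDirect(16);
    buf.put(new byte[] {1, 2, 3, 4, 5});
    int len = buf.position();

    // Expose [0, len) for reading, mirroring the limit(...).position(0) call in the patch.
    buf.limit(len).position(0);
    while (buf.hasRemaining()) {
      buf.get();
    }

    if (useClear) {
      // Position becomes 0, limit becomes capacity, mark is discarded.
      buf.clear();
    } else {
      // The explicit form used in the patch under review.
      buf.limit(buf.capacity()).position(0);
    }
    return new int[] {buf.position(), buf.limit()};
  }

  public static void main(String[] args) {
    System.out.println("explicit: " + Arrays.toString(stateAfterReset(false)));
    System.out.println("clear():  " + Arrays.toString(stateAfterReset(true)));
  }
}
```

Neither call erases the buffer's bytes; both only move the indices, so the choice between them is one of readability rather than behavior for this use.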
[jira] [Work logged] (HADOOP-17292) Using lz4-java in Lz4Codec
[ https://issues.apache.org/jira/browse/HADOOP-17292?focusedWorklogId=498655&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-498655 ]

ASF GitHub Bot logged work on HADOOP-17292:
---
Author: ASF GitHub Bot
Created on: 09/Oct/20 16:54
Start Date: 09/Oct/20 16:54
Worklog Time Spent: 10m
Work Description: steveloughran commented on a change in pull request #2350:
URL: https://github.com/apache/hadoop/pull/2350#discussion_r502553101

## File path: hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/io/compress/lz4/Lz4Compressor.java
## @@ -50,20 +51,7 @@
   private final boolean useLz4HC;
-  static {
-    if (NativeCodeLoader.isNativeCodeLoaded()) {
-      // Initialize the native library
-      try {
-        initIDs();
-      } catch (Throwable t) {
-        // Ignore failure to load/initialize lz4
-        LOG.warn(t.toString());
-      }
-    } else {
-      LOG.error("Cannot load " + Lz4Compressor.class.getName() +
-          " without native hadoop library!");
-    }
-  }
+  private LZ4Compressor lz4Compressor;

Review comment: final?

## File path: hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/io/compress/lz4/Lz4Compressor.java
## @@ -22,9 +22,10 @@
 import java.nio.Buffer;
 import java.nio.ByteBuffer;
+import net.jpountz.lz4.LZ4Factory;
+import net.jpountz.lz4.LZ4Compressor;

Review comment: Put these into the same import block as org.slf4j.

## File path: hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/io/compress/lz4/Lz4Decompressor.java
## @@ -22,8 +22,9 @@
 import java.nio.Buffer;
 import java.nio.ByteBuffer;
+import net.jpountz.lz4.LZ4Factory;
+import net.jpountz.lz4.LZ4SafeDecompressor;

Review comment: Again, these are not in the same import block as org.apache.

## File path: hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/io/compress/lz4/Lz4Decompressor.java
## @@ -67,6 +55,15 @@ public Lz4Decompressor(int directBufferSize) {
     this.directBufferSize = directBufferSize;
+    try {
+      LZ4Factory lz4Factory = LZ4Factory.fastestInstance();
+      lz4Decompressor = lz4Factory.safeDecompressor();
+    } catch (Throwable t) {
+      throw new RuntimeException("lz4-java library is not available: " +
+          "Lz4Decompressor has not been loaded. You need to add " +
+          "lz4-java.jar to your CLASSPATH", t);

Review comment: Add `+ t` to the end of the string, so the specific error text isn't lost.

## File path: hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/io/compress/lz4/Lz4Decompressor.java
## @@ -67,6 +55,15 @@ public Lz4Decompressor(int directBufferSize) {
     this.directBufferSize = directBufferSize;
+    try {
+      LZ4Factory lz4Factory = LZ4Factory.fastestInstance();
+      lz4Decompressor = lz4Factory.safeDecompressor();
+    } catch (Throwable t) {

Review comment: Shame the constructor can't throw exceptions directly. Catch only the exceptions known to be raised, and don't convert RuntimeExceptions or, especially, Errors.

Issue Time Tracking
---
Worklog Id: (was: 498655)
Time Spent: 2h 40m (was: 2.5h)
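The review comments above ask for three things in the constructor: a `final` field, a `catch` limited to the exceptions known to be raised, and an error message that keeps the original failure text. The following is a minimal sketch of that pattern under stated assumptions. `NarrowCatchDemo` is a hypothetical class, and `Class.forName` with its checked `ClassNotFoundException` merely stands in for the lz4-java factory call, whose actual failure modes are not established in this thread.

```java
// Hypothetical illustration of the reviewer's suggestions; not Hadoop's Lz4Decompressor.
final class NarrowCatchDemo {

  // Final, as suggested: assigned exactly once, inside the constructor.
  private final Class<?> codecClass;

  NarrowCatchDemo(String className) {
    try {
      // Stand-in for the library lookup; declared to throw ClassNotFoundException.
      codecClass = Class.forName(className);
    } catch (ClassNotFoundException e) {
      // Only the known checked exception is converted. RuntimeExceptions and
      // Errors propagate unchanged. Appending e keeps the specific error text
      // in the message, and passing e as the cause keeps the stack trace.
      throw new RuntimeException("codec class is not available: " + e, e);
    }
  }

  Class<?> codecClass() {
    return codecClass;
  }

  public static void main(String[] args) {
    System.out.println(new NarrowCatchDemo("java.nio.ByteBuffer").codecClass().getName());
    try {
      new NarrowCatchDemo("no.such.Codec");
    } catch (RuntimeException e) {
      System.out.println(e.getMessage());
    }
  }
}
```

Because the `catch` block always throws, the compiler accepts the single assignment to the `final` field inside the `try` block.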
[jira] [Work logged] (HADOOP-17292) Using lz4-java in Lz4Codec
[ https://issues.apache.org/jira/browse/HADOOP-17292?focusedWorklogId=498519&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-498519 ]

ASF GitHub Bot logged work on HADOOP-17292:
---
Author: ASF GitHub Bot
Created on: 09/Oct/20 14:13
Start Date: 09/Oct/20 14:13
Worklog Time Spent: 10m
Work Description: hadoop-yetus commented on pull request #2350:
URL: https://github.com/apache/hadoop/pull/2350#issuecomment-705330740

:broken_heart: **-1 overall**

| Vote | Subsystem | Runtime | Logfile | Comment |
|:----:|----------:|--------:|:--------:|:-------:|
| +0 :ok: | reexec | 2m 10s | | Docker mode activated. |
|||| _ Prechecks _ |
| +1 :green_heart: | dupname | 0m 1s | | No case conflicting files found. |
| +1 :green_heart: | @author | 0m 0s | | The patch does not contain any @author tags. |
| +1 :green_heart: | | 0m 0s | [test4tests](test4tests) | The patch appears to include 6 new or modified test files. |
|||| _ trunk Compile Tests _ |
| +0 :ok: | mvndep | 6m 18s | | Maven dependency ordering for branch |
| +1 :green_heart: | mvninstall | 27m 49s | | trunk passed |
| +1 :green_heart: | compile | 25m 54s | | trunk passed with JDK Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1 |
| +1 :green_heart: | compile | 21m 50s | | trunk passed with JDK Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 |
| +1 :green_heart: | checkstyle | 3m 25s | | trunk passed |
| +1 :green_heart: | mvnsite | 3m 13s | | trunk passed |
| +1 :green_heart: | shadedclient | 17m 32s | | branch has no errors when building and testing our client artifacts. |
| +1 :green_heart: | javadoc | 2m 3s | | trunk passed with JDK Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1 |
| +1 :green_heart: | javadoc | 2m 44s | | trunk passed with JDK Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 |
| +0 :ok: | spotbugs | 0m 58s | | Used deprecated FindBugs config; considering switching to SpotBugs. |
| +0 :ok: | findbugs | 0m 36s | | branch/hadoop-project no findbugs output file (findbugsXml.xml) |
|||| _ Patch Compile Tests _ |
| +0 :ok: | mvndep | 0m 33s | | Maven dependency ordering for patch |
| +1 :green_heart: | mvninstall | 1m 47s | | the patch passed |
| +1 :green_heart: | compile | 23m 56s | | the patch passed with JDK Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1 |
| -1 :x: | cc | 23m 56s | [/diff-compile-cc-root-jdkUbuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2350/8/artifact/out/diff-compile-cc-root-jdkUbuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1.txt) | root-jdkUbuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1 with JDK Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1 generated 8 new + 155 unchanged - 8 fixed = 163 total (was 163) |
| +1 :green_heart: | golang | 23m 56s | | the patch passed |
| -1 :x: | javac | 23m 56s | [/diff-compile-javac-root-jdkUbuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2350/8/artifact/out/diff-compile-javac-root-jdkUbuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1.txt) | root-jdkUbuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1 with JDK Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1 generated 2 new + 2051 unchanged - 1 fixed = 2053 total (was 2052) |
| +1 :green_heart: | compile | 21m 13s | | the patch passed with JDK Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 |
| -1 :x: | cc | 21m 13s | [/diff-compile-cc-root-jdkPrivateBuild-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2350/8/artifact/out/diff-compile-cc-root-jdkPrivateBuild-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01.txt) | root-jdkPrivateBuild-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 with JDK Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 generated 13 new + 150 unchanged - 13 fixed = 163 total (was 163) |
| +1 :green_heart: | golang | 21m 13s | | the patch passed |
| -1 :x: | javac | 21m 13s | [/diff-compile-javac-root-jdkPrivateBuild-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2350/8/artifact/out/diff-compile-javac-root-jdkPrivateBuild-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01.txt) | root-jdkPrivateBuild-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 with JDK Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 generated 1 new + 1947 unchanged - 0 fixed = 1948 total (was 1947) |
| -0 :warning: | checkstyle | 3m 26s | [/diff-checkstyle-root.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2350/8/artifact/out/diff-checkstyle-root.txt) | root: The patch generated 2 new + 132 unchanged - 1 fixed = 134
[jira] [Work logged] (HADOOP-17292) Using lz4-java in Lz4Codec
[ https://issues.apache.org/jira/browse/HADOOP-17292?focusedWorklogId=497091&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-497091 ]

ASF GitHub Bot logged work on HADOOP-17292:
---
Author: ASF GitHub Bot
Created on: 08/Oct/20 05:02
Start Date: 08/Oct/20 05:02
Worklog Time Spent: 10m
Work Description: hadoop-yetus commented on pull request #2350:
URL: https://github.com/apache/hadoop/pull/2350#issuecomment-705330740

:broken_heart: **-1 overall**

| Vote | Subsystem | Runtime | Logfile | Comment |
|:----:|----------:|--------:|:--------:|:-------:|
| +0 :ok: | reexec | 2m 10s | | Docker mode activated. |
|||| _ Prechecks _ |
| +1 :green_heart: | dupname | 0m 1s | | No case conflicting files found. |
| +1 :green_heart: | @author | 0m 0s | | The patch does not contain any @author tags. |
| +1 :green_heart: | | 0m 0s | [test4tests](test4tests) | The patch appears to include 6 new or modified test files. |
|||| _ trunk Compile Tests _ |
| +0 :ok: | mvndep | 6m 18s | | Maven dependency ordering for branch |
| +1 :green_heart: | mvninstall | 27m 49s | | trunk passed |
| +1 :green_heart: | compile | 25m 54s | | trunk passed with JDK Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1 |
| +1 :green_heart: | compile | 21m 50s | | trunk passed with JDK Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 |
| +1 :green_heart: | checkstyle | 3m 25s | | trunk passed |
| +1 :green_heart: | mvnsite | 3m 13s | | trunk passed |
| +1 :green_heart: | shadedclient | 17m 32s | | branch has no errors when building and testing our client artifacts. |
| +1 :green_heart: | javadoc | 2m 3s | | trunk passed with JDK Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1 |
| +1 :green_heart: | javadoc | 2m 44s | | trunk passed with JDK Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 |
| +0 :ok: | spotbugs | 0m 58s | | Used deprecated FindBugs config; considering switching to SpotBugs. |
| +0 :ok: | findbugs | 0m 36s | | branch/hadoop-project no findbugs output file (findbugsXml.xml) |
|||| _ Patch Compile Tests _ |
| +0 :ok: | mvndep | 0m 33s | | Maven dependency ordering for patch |
| +1 :green_heart: | mvninstall | 1m 47s | | the patch passed |
| +1 :green_heart: | compile | 23m 56s | | the patch passed with JDK Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1 |
| -1 :x: | cc | 23m 56s | [/diff-compile-cc-root-jdkUbuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2350/8/artifact/out/diff-compile-cc-root-jdkUbuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1.txt) | root-jdkUbuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1 with JDK Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1 generated 8 new + 155 unchanged - 8 fixed = 163 total (was 163) |
| +1 :green_heart: | golang | 23m 56s | | the patch passed |
| -1 :x: | javac | 23m 56s | [/diff-compile-javac-root-jdkUbuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2350/8/artifact/out/diff-compile-javac-root-jdkUbuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1.txt) | root-jdkUbuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1 with JDK Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1 generated 2 new + 2051 unchanged - 1 fixed = 2053 total (was 2052) |
| +1 :green_heart: | compile | 21m 13s | | the patch passed with JDK Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 |
| -1 :x: | cc | 21m 13s | [/diff-compile-cc-root-jdkPrivateBuild-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2350/8/artifact/out/diff-compile-cc-root-jdkPrivateBuild-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01.txt) | root-jdkPrivateBuild-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 with JDK Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 generated 13 new + 150 unchanged - 13 fixed = 163 total (was 163) |
| +1 :green_heart: | golang | 21m 13s | | the patch passed |
| -1 :x: | javac | 21m 13s | [/diff-compile-javac-root-jdkPrivateBuild-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2350/8/artifact/out/diff-compile-javac-root-jdkPrivateBuild-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01.txt) | root-jdkPrivateBuild-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 with JDK Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 generated 1 new + 1947 unchanged - 0 fixed = 1948 total (was 1947) |
| -0 :warning: | checkstyle | 3m 26s | [/diff-checkstyle-root.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2350/8/artifact/out/diff-checkstyle-root.txt) | root: The patch generated 2 new + 132 unchanged - 1 fixed = 134
[jira] [Work logged] (HADOOP-17292) Using lz4-java in Lz4Codec
[ https://issues.apache.org/jira/browse/HADOOP-17292?focusedWorklogId=496826&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-496826 ]

ASF GitHub Bot logged work on HADOOP-17292:
---
Author: ASF GitHub Bot
Created on: 07/Oct/20 18:25
Start Date: 07/Oct/20 18:25
Worklog Time Spent: 10m
Work Description: dbtsai commented on pull request #2350:
URL: https://github.com/apache/hadoop/pull/2350#issuecomment-705114513

Gentle ping @steveloughran. This is almost identical to the SnappyCodec change you merged. Could you help review it? Thanks.

Issue Time Tracking
---
Worklog Id: (was: 496826)
Time Spent: 2h 10m (was: 2h)
[jira] [Work logged] (HADOOP-17292) Using lz4-java in Lz4Codec
[ https://issues.apache.org/jira/browse/HADOOP-17292?focusedWorklogId=496793&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-496793 ]

ASF GitHub Bot logged work on HADOOP-17292:
---
Author: ASF GitHub Bot
Created on: 07/Oct/20 17:42
Start Date: 07/Oct/20 17:42
Worklog Time Spent: 10m
Work Description: viirya commented on pull request #2350:
URL: https://github.com/apache/hadoop/pull/2350#issuecomment-705091425

Rebased. Thanks.

Issue Time Tracking
---
Worklog Id: (was: 496793)
Time Spent: 2h (was: 1h 50m)
[jira] [Work logged] (HADOOP-17292) Using lz4-java in Lz4Codec
[ https://issues.apache.org/jira/browse/HADOOP-17292?focusedWorklogId=496770&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-496770 ]

ASF GitHub Bot logged work on HADOOP-17292:
---
Author: ASF GitHub Bot
Created on: 07/Oct/20 17:19
Start Date: 07/Oct/20 17:19
Worklog Time Spent: 10m
Work Description: dbtsai commented on pull request #2350:
URL: https://github.com/apache/hadoop/pull/2350#issuecomment-705079529

@viirya can you rebase on master?

Issue Time Tracking
---
Worklog Id: (was: 496770)
Time Spent: 1h 50m (was: 1h 40m)
[jira] [Work logged] (HADOOP-17292) Using lz4-java in Lz4Codec
[ https://issues.apache.org/jira/browse/HADOOP-17292?focusedWorklogId=495049&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-495049 ]

ASF GitHub Bot logged work on HADOOP-17292:
---
Author: ASF GitHub Bot
Created on: 04/Oct/20 18:49
Start Date: 04/Oct/20 18:49
Worklog Time Spent: 10m
Work Description: viirya commented on pull request #2350:
URL: https://github.com/apache/hadoop/pull/2350#issuecomment-703298858

All tests pass now.

Issue Time Tracking
---
Worklog Id: (was: 495049)
Time Spent: 1.5h (was: 1h 20m)
[jira] [Work logged] (HADOOP-17292) Using lz4-java in Lz4Codec
[ https://issues.apache.org/jira/browse/HADOOP-17292?focusedWorklogId=495050&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-495050 ]

ASF GitHub Bot logged work on HADOOP-17292:
---

Author: ASF GitHub Bot
Created on: 04/Oct/20 18:49
Start Date: 04/Oct/20 18:49
Worklog Time Spent: 10m
Work Description: viirya edited a comment on pull request #2350:
URL: https://github.com/apache/hadoop/pull/2350#issuecomment-703298858

All tests pass now.
https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2350/7/testReport/

Issue Time Tracking
---

Worklog Id: (was: 495050)
Time Spent: 1h 40m (was: 1.5h)
[jira] [Work logged] (HADOOP-17292) Using lz4-java in Lz4Codec
[ https://issues.apache.org/jira/browse/HADOOP-17292?focusedWorklogId=494095&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-494095 ]

ASF GitHub Bot logged work on HADOOP-17292:
---

Author: ASF GitHub Bot
Created on: 02/Oct/20 18:51
Start Date: 02/Oct/20 18:51
Worklog Time Spent: 10m
Work Description: dbtsai commented on pull request #2350:
URL: https://github.com/apache/hadoop/pull/2350#issuecomment-702900730

cc @steveloughran this is almost identical to https://github.com/apache/hadoop/pull/2297/
Thanks!

Issue Time Tracking
---

Worklog Id: (was: 494095)
Time Spent: 1h 20m (was: 1h 10m)
[jira] [Work logged] (HADOOP-17292) Using lz4-java in Lz4Codec
[ https://issues.apache.org/jira/browse/HADOOP-17292?focusedWorklogId=493820&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-493820 ]

ASF GitHub Bot logged work on HADOOP-17292:
---

Author: ASF GitHub Bot
Created on: 02/Oct/20 05:19
Start Date: 02/Oct/20 05:19
Worklog Time Spent: 10m
Work Description: viirya commented on a change in pull request #2350:
URL: https://github.com/apache/hadoop/pull/2350#discussion_r498622029

## File path: hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/io/compress/lz4/Lz4Compressor.java ##
@@ -236,7 +237,7 @@ public synchronized int compress(byte[] b, int off, int len)
     }

     // Compress data
-    n = useLz4HC ? compressBytesDirectHC() : compressBytesDirect();
+    n = compressBytesDirect();

Review comment:
fixed.

Issue Time Tracking
---

Worklog Id: (was: 493820)
Time Spent: 1h 10m (was: 1h)
[jira] [Work logged] (HADOOP-17292) Using lz4-java in Lz4Codec
[ https://issues.apache.org/jira/browse/HADOOP-17292?focusedWorklogId=493819&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-493819 ]

ASF GitHub Bot logged work on HADOOP-17292:
---

Author: ASF GitHub Bot
Created on: 02/Oct/20 05:15
Start Date: 02/Oct/20 05:15
Worklog Time Spent: 10m
Work Description: viirya commented on a change in pull request #2350:
URL: https://github.com/apache/hadoop/pull/2350#discussion_r498621489

## File path: hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/io/compress/TestCodec.java ##
@@ -143,22 +143,16 @@ public void testSnappyCodec() throws IOException {
   @Test
   public void testLz4Codec() throws IOException {
-    if (NativeCodeLoader.isNativeCodeLoaded()) {
-      if (Lz4Codec.isNativeCodeLoaded()) {
-        conf.setBoolean(
+    conf.setBoolean(
             CommonConfigurationKeys.IO_COMPRESSION_CODEC_LZ4_USELZ4HC_KEY,

Review comment:
fixed.

Issue Time Tracking
---

Worklog Id: (was: 493819)
Time Spent: 1h (was: 50m)
[jira] [Work logged] (HADOOP-17292) Using lz4-java in Lz4Codec
[ https://issues.apache.org/jira/browse/HADOOP-17292?focusedWorklogId=493818&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-493818 ]

ASF GitHub Bot logged work on HADOOP-17292:
---

Author: ASF GitHub Bot
Created on: 02/Oct/20 05:13
Start Date: 02/Oct/20 05:13
Worklog Time Spent: 10m
Work Description: viirya commented on a change in pull request #2350:
URL: https://github.com/apache/hadoop/pull/2350#discussion_r498621151

## File path: hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/io/compress/CompressDecompressTester.java ##
@@ -494,8 +494,7 @@ public String getName() {
   private static boolean isAvailable(TesterPair pair) {
     Compressor compressor = pair.compressor;

-    if (compressor.getClass().isAssignableFrom(Lz4Compressor.class)
-        && (NativeCodeLoader.isNativeCodeLoaded()))
+    if (compressor.getClass().isAssignableFrom(Lz4Compressor.class))

Review comment:
Added compatibility test.

Issue Time Tracking
---

Worklog Id: (was: 493818)
Time Spent: 50m (was: 40m)
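The `isAvailable` condition in the hunk above relies on `Class.isAssignableFrom`, whose direction is easy to misread. The following is a minimal sketch, not the Hadoop test code: `Compressor`, `Lz4Compressor`, and `ZlibCompressor` here are hypothetical local stand-ins for the Hadoop types. It shows that the check is true only when the compressor's runtime class is `Lz4Compressor` itself (or a supertype of it), and false for an unrelated compressor.

```java
public class AssignableCheckSketch {
    // Hypothetical stand-ins for the Hadoop types named in the diff.
    public interface Compressor {}
    public static class Lz4Compressor implements Compressor {}
    public static class ZlibCompressor implements Compressor {}

    // Same shape as the condition in the hunk: asks whether a reference of the
    // compressor's runtime class could hold an Lz4Compressor instance.
    public static boolean isLz4(Compressor compressor) {
        return compressor.getClass().isAssignableFrom(Lz4Compressor.class);
    }

    public static void main(String[] args) {
        System.out.println(isLz4(new Lz4Compressor()));  // prints "true"
        System.out.println(isLz4(new ZlibCompressor())); // prints "false"
    }
}
```

With the `NativeCodeLoader.isNativeCodeLoaded()` clause removed, a check of this shape depends only on the compressor's class and no longer on whether libhadoop was loaded.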
[jira] [Work logged] (HADOOP-17292) Using lz4-java in Lz4Codec
[ https://issues.apache.org/jira/browse/HADOOP-17292?focusedWorklogId=492638&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-492638 ]

ASF GitHub Bot logged work on HADOOP-17292:
---

Author: ASF GitHub Bot
Created on: 29/Sep/20 19:51
Start Date: 29/Sep/20 19:51
Worklog Time Spent: 10m
Work Description: viirya commented on a change in pull request #2350:
URL: https://github.com/apache/hadoop/pull/2350#discussion_r497004500

## File path: hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/io/compress/CompressDecompressTester.java ##
@@ -494,8 +494,7 @@ public String getName() {
   private static boolean isAvailable(TesterPair pair) {
     Compressor compressor = pair.compressor;

-    if (compressor.getClass().isAssignableFrom(Lz4Compressor.class)
-        && (NativeCodeLoader.isNativeCodeLoaded()))
+    if (compressor.getClass().isAssignableFrom(Lz4Compressor.class))

Review comment:
Sure. Will add it later.

Issue Time Tracking
---

Worklog Id: (was: 492638)
Time Spent: 40m (was: 0.5h)
[jira] [Work logged] (HADOOP-17292) Using lz4-java in Lz4Codec
[ https://issues.apache.org/jira/browse/HADOOP-17292?focusedWorklogId=492637&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-492637 ]

ASF GitHub Bot logged work on HADOOP-17292:
---

Author: ASF GitHub Bot
Created on: 29/Sep/20 19:46
Start Date: 29/Sep/20 19:46
Worklog Time Spent: 10m
Work Description: dbtsai commented on a change in pull request #2350:
URL: https://github.com/apache/hadoop/pull/2350#discussion_r49626

## File path: hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/io/compress/lz4/Lz4Compressor.java ##
@@ -236,7 +237,7 @@ public synchronized int compress(byte[] b, int off, int len)
     }

     // Compress data
-    n = useLz4HC ? compressBytesDirectHC() : compressBytesDirect();
+    n = compressBytesDirect();

Review comment:
`compressDirectBuf`?

## File path: hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/io/compress/CompressDecompressTester.java ##
@@ -494,8 +494,7 @@ public String getName() {
   private static boolean isAvailable(TesterPair pair) {
     Compressor compressor = pair.compressor;

-    if (compressor.getClass().isAssignableFrom(Lz4Compressor.class)
-        && (NativeCodeLoader.isNativeCodeLoaded()))
+    if (compressor.getClass().isAssignableFrom(Lz4Compressor.class))

Review comment:
Can we add a compatibility test like snappy?

## File path: hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/io/compress/TestCodec.java ##
@@ -143,22 +143,16 @@ public void testSnappyCodec() throws IOException {
   @Test
   public void testLz4Codec() throws IOException {
-    if (NativeCodeLoader.isNativeCodeLoaded()) {
-      if (Lz4Codec.isNativeCodeLoaded()) {
-        conf.setBoolean(
+    conf.setBoolean(
             CommonConfigurationKeys.IO_COMPRESSION_CODEC_LZ4_USELZ4HC_KEY,

Review comment:
indentation as you remove the `{ }`

Issue Time Tracking
---

Worklog Id: (was: 492637)
Time Spent: 0.5h (was: 20m)
[jira] [Work logged] (HADOOP-17292) Using lz4-java in Lz4Codec
[ https://issues.apache.org/jira/browse/HADOOP-17292?focusedWorklogId=492632&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-492632 ]

ASF GitHub Bot logged work on HADOOP-17292:
---

Author: ASF GitHub Bot
Created on: 29/Sep/20 19:31
Start Date: 29/Sep/20 19:31
Worklog Time Spent: 10m
Work Description: viirya commented on pull request #2350:
URL: https://github.com/apache/hadoop/pull/2350#issuecomment-700934519

cc @dbtsai @sunchao

Issue Time Tracking
---

Worklog Id: (was: 492632)
Time Spent: 20m (was: 10m)
[jira] [Work logged] (HADOOP-17292) Using lz4-java in Lz4Codec
[ https://issues.apache.org/jira/browse/HADOOP-17292?focusedWorklogId=492628&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-492628 ]

ASF GitHub Bot logged work on HADOOP-17292:
---

Author: ASF GitHub Bot
Created on: 29/Sep/20 19:24
Start Date: 29/Sep/20 19:24
Worklog Time Spent: 10m
Work Description: viirya opened a new pull request #2350:
URL: https://github.com/apache/hadoop/pull/2350

See https://issues.apache.org/jira/browse/HADOOP-17292 for details.

Issue Time Tracking
---

Worklog Id: (was: 492628)
Remaining Estimate: 0h
Time Spent: 10m
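The HADOOP-17292 description says that lz4-java loads a native binary from the jar when one is available and otherwise falls back to a pure-Java implementation. The sketch below illustrates only that general "native first, pure Java as fallback" pattern using the JDK alone. It is not lz4-java's or Hadoop's actual loading code; `BlockCodec`, `load`, and the loader suppliers are hypothetical names introduced for illustration.

```java
import java.util.function.Supplier;

public class CodecFallbackSketch {
    /** Hypothetical minimal codec interface; not part of lz4-java or Hadoop. */
    public interface BlockCodec {
        String implementationName();
    }

    /**
     * Tries the native-backed loader first. If it fails (for example because no
     * native binary exists for the platform), returns the pure-Java codec instead.
     */
    public static BlockCodec load(Supplier<BlockCodec> nativeLoader,
                                  Supplier<BlockCodec> pureJavaLoader) {
        try {
            return nativeLoader.get();
        } catch (UnsatisfiedLinkError | RuntimeException e) {
            // Assumption: a failed native load surfaces as one of these throwables.
            return pureJavaLoader.get();
        }
    }

    public static void main(String[] args) {
        // Simulate a platform with no native binary.
        BlockCodec codec = load(
            () -> { throw new UnsatisfiedLinkError("no native binary for this platform"); },
            () -> () -> "pure-java");
        System.out.println(codec.implementationName()); // prints "pure-java"
    }
}
```

The point of the pattern is that callers never configure `java.library.path` or `LD_LIBRARY_PATH` themselves; the choice between implementations is made once, at load time, inside the library.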