[jira] [Work logged] (HADOOP-17125) Using snappy-java in SnappyCodec
[ https://issues.apache.org/jira/browse/HADOOP-17125?focusedWorklogId=499515&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-499515 ]

ASF GitHub Bot logged work on HADOOP-17125:
Author: ASF GitHub Bot
Created on: 12/Oct/20 16:51
Start Date: 12/Oct/20 16:51
Worklog Time Spent: 10m

Work Description: steveloughran commented on pull request #2297:
URL: https://github.com/apache/hadoop/pull/2297#issuecomment-707232854

You needed to be listed in the project settings as someone with the right permissions. It's done now.

This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org

Issue Time Tracking
---
Worklog Id: (was: 499515)
Time Spent: 25h 50m (was: 25h 40m)

> Using snappy-java in SnappyCodec
>
> Key: HADOOP-17125
> URL: https://issues.apache.org/jira/browse/HADOOP-17125
> Project: Hadoop Common
> Issue Type: New Feature
> Components: common
> Affects Versions: 3.3.0
> Reporter: DB Tsai
> Assignee: L. C. Hsieh
> Priority: Major
> Labels: pull-request-available
> Fix For: 3.3.1, 3.4.0
> Time Spent: 25h 50m
> Remaining Estimate: 0h
>
> In Hadoop, we use native libraries for the Snappy codec, which has several disadvantages:
> * It requires native *libhadoop* and *libsnappy* to be installed on the system *LD_LIBRARY_PATH*, and they have to be installed separately on each node of the cluster, in container images, and in local test environments, which adds huge complexity from a deployment point of view. In some environments, it requires compiling the natives from source, which is non-trivial. This approach is also platform-dependent: the binary may not work on a different platform, so it requires recompilation.
> * It requires extra configuration of *java.library.path* to load the natives, which results in higher application deployment and maintenance cost for users.
> Projects such as *Spark* and *Parquet* use [snappy-java|https://github.com/xerial/snappy-java], which is a JNI-based implementation. It bundles native binaries for Linux, Mac, and IBM platforms in its jar file and can automatically load them into the JVM from the jar without any setup. If a native implementation cannot be found for a platform, it can fall back to a pure-Java implementation of Snappy based on [aircompressor|https://github.com/airlift/aircompressor/tree/master/src/main/java/io/airlift/compress/snappy].

--
This message was sent by Atlassian Jira (v8.3.4#803005)
-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org
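The "load the bundled native library, else fall back to pure Java" behaviour the issue description attributes to snappy-java can be sketched as follows. This is a minimal illustrative sketch of the loading pattern only, not the real snappy-java API; all class and method names here are hypothetical, and the "native" path is simulated as unavailable.

```java
// Hypothetical sketch of a native-first, pure-Java-fallback loader.
// snappy-java's real loader extracts a platform-specific binary from its
// jar and calls System.load(); here we simulate a platform where that fails.
public class NativeOrJavaLoader {

    interface Compressor {
        String name();
    }

    // Stand-in for the native loading step; simulates a platform
    // with no usable bundled binary.
    static Compressor loadNative() {
        throw new UnsatisfiedLinkError("no native snappy binary for this platform");
    }

    // Stand-in for an aircompressor-style pure-Java implementation.
    static Compressor loadPureJava() {
        return () -> "pure-java";
    }

    // Try the native path first; fall back to pure Java if loading fails.
    public static Compressor load() {
        try {
            return loadNative();
        } catch (UnsatisfiedLinkError e) {
            return loadPureJava();
        }
    }

    public static void main(String[] args) {
        // On this simulated platform the fallback is selected.
        System.out.println(NativeOrJavaLoader.load().name()); // prints "pure-java"
    }
}
```

Because the fallback is chosen at runtime inside the JVM, no per-node installation or `java.library.path` configuration is needed, which is the deployment advantage argued for in the ticket.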
[jira] [Work logged] (HADOOP-17125) Using snappy-java in SnappyCodec
[ https://issues.apache.org/jira/browse/HADOOP-17125?focusedWorklogId=499484&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-499484 ]

ASF GitHub Bot logged work on HADOOP-17125:
Author: ASF GitHub Bot
Created on: 12/Oct/20 16:02
Worklog Time Spent: 10m

Work Description: viirya commented on pull request #2297:
URL: https://github.com/apache/hadoop/pull/2297#issuecomment-707207886

@steveloughran Thank you! I tried to assign this ticket, but it seems I cannot do it.

Worklog Id: (was: 499484)
Time Spent: 25h 40m (was: 25.5h)
[jira] [Work logged] (HADOOP-17125) Using snappy-java in SnappyCodec
[ https://issues.apache.org/jira/browse/HADOOP-17125?focusedWorklogId=499311&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-499311 ]

ASF GitHub Bot logged work on HADOOP-17125:
Author: ASF GitHub Bot
Created on: 12/Oct/20 10:01
Worklog Time Spent: 10m

Work Description: steveloughran commented on pull request #2297:
URL: https://github.com/apache/hadoop/pull/2297#issuecomment-707019757

@viirya assigned JIRA to you. You are also free to assign any other Hadoop JIRAs to yourself...

Worklog Id: (was: 499311)
Time Spent: 25.5h (was: 25h 20m)
[jira] [Work logged] (HADOOP-17125) Using snappy-java in SnappyCodec
[ https://issues.apache.org/jira/browse/HADOOP-17125?focusedWorklogId=498660&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-498660 ]

ASF GitHub Bot logged work on HADOOP-17125:
Author: ASF GitHub Bot
Created on: 09/Oct/20 16:57
Worklog Time Spent: 10m

Work Description: viirya commented on pull request #2297:
URL: https://github.com/apache/hadoop/pull/2297#issuecomment-706292851

@steveloughran Username is viirya too. Thanks.

Worklog Id: (was: 498660)
Time Spent: 25h 20m (was: 25h 10m)
[jira] [Work logged] (HADOOP-17125) Using snappy-java in SnappyCodec
[ https://issues.apache.org/jira/browse/HADOOP-17125?focusedWorklogId=498647&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-498647 ]

ASF GitHub Bot logged work on HADOOP-17125:
Author: ASF GitHub Bot
Created on: 09/Oct/20 16:42
Worklog Time Spent: 10m

Work Description: steveloughran commented on pull request #2297:
URL: https://github.com/apache/hadoop/pull/2297#issuecomment-706286111

@viirya...what's your JIRA username?

Worklog Id: (was: 498647)
Time Spent: 25h 10m (was: 25h)
[jira] [Work logged] (HADOOP-17125) Using snappy-java in SnappyCodec
[ https://issues.apache.org/jira/browse/HADOOP-17125?focusedWorklogId=496735&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-496735 ]

ASF GitHub Bot logged work on HADOOP-17125:
Author: ASF GitHub Bot
Created on: 07/Oct/20 16:36
Worklog Time Spent: 10m

Work Description: sunchao commented on pull request #2297:
URL: https://github.com/apache/hadoop/pull/2297#issuecomment-705055574

Thanks @steveloughran - could you assign the JIRA to @viirya?

Worklog Id: (was: 496735)
Time Spent: 25h (was: 24h 50m)
[jira] [Work logged] (HADOOP-17125) Using snappy-java in SnappyCodec
[ https://issues.apache.org/jira/browse/HADOOP-17125?focusedWorklogId=496571&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-496571 ]

ASF GitHub Bot logged work on HADOOP-17125:
Author: ASF GitHub Bot
Created on: 07/Oct/20 12:56
Worklog Time Spent: 10m

Work Description: steveloughran commented on pull request #2297:
URL: https://github.com/apache/hadoop/pull/2297#issuecomment-704916354

JIRA closed, added a release note.

Worklog Id: (was: 496571)
Time Spent: 24h 50m (was: 24h 40m)
[jira] [Work logged] (HADOOP-17125) Using snappy-java in SnappyCodec
[ https://issues.apache.org/jira/browse/HADOOP-17125?focusedWorklogId=496065&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-496065 ]

ASF GitHub Bot logged work on HADOOP-17125:
Author: ASF GitHub Bot
Created on: 06/Oct/20 17:51
Worklog Time Spent: 10m

Work Description: viirya edited a comment on pull request #2297:
URL: https://github.com/apache/hadoop/pull/2297#issuecomment-704433770

Ok, got it. I will update release notes once it is back. It seems I cannot update Hadoop JIRA.

Worklog Id: (was: 496065)
Time Spent: 24h 40m (was: 24.5h)
[jira] [Work logged] (HADOOP-17125) Using snappy-java in SnappyCodec
[ https://issues.apache.org/jira/browse/HADOOP-17125?focusedWorklogId=496050&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-496050 ]

ASF GitHub Bot logged work on HADOOP-17125:
Author: ASF GitHub Bot
Created on: 06/Oct/20 17:34
Worklog Time Spent: 10m

Work Description: viirya commented on pull request #2297:
URL: https://github.com/apache/hadoop/pull/2297#issuecomment-704434440

Looks like the JIRA is back now? https://issues.apache.org/jira/browse/HADOOP-17125

Worklog Id: (was: 496050)
Time Spent: 24.5h (was: 24h 20m)
[jira] [Work logged] (HADOOP-17125) Using snappy-java in SnappyCodec
[ https://issues.apache.org/jira/browse/HADOOP-17125?focusedWorklogId=496049&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-496049 ]

ASF GitHub Bot logged work on HADOOP-17125:
Author: ASF GitHub Bot
Created on: 06/Oct/20 17:32
Worklog Time Spent: 10m

Work Description: viirya commented on pull request #2297:
URL: https://github.com/apache/hadoop/pull/2297#issuecomment-704433770

Ok, got it. I will update release notes once it is back.

Worklog Id: (was: 496049)
Time Spent: 24h 20m (was: 24h 10m)
[jira] [Work logged] (HADOOP-17125) Using snappy-java in SnappyCodec
[ https://issues.apache.org/jira/browse/HADOOP-17125?focusedWorklogId=495984&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-495984 ]

ASF GitHub Bot logged work on HADOOP-17125:
Author: ASF GitHub Bot
Created on: 06/Oct/20 16:00
Worklog Time Spent: 10m

Work Description: steveloughran commented on pull request #2297:
URL: https://github.com/apache/hadoop/pull/2297#issuecomment-704380621

Ok, I'm happy too. +1, merging to trunk and branch-3.3.

Worklog Id: (was: 495984)
Time Spent: 24h 10m (was: 24h)
[jira] [Work logged] (HADOOP-17125) Using snappy-java in SnappyCodec
[ https://issues.apache.org/jira/browse/HADOOP-17125?focusedWorklogId=495982&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-495982 ]

ASF GitHub Bot logged work on HADOOP-17125:
-------------------------------------------
            Author: ASF GitHub Bot
        Created on: 06/Oct/20 15:58
        Start Date: 06/Oct/20 15:58
Worklog Time Spent: 10m

Work Description: steveloughran commented on pull request #2297:
URL: https://github.com/apache/hadoop/pull/2297#issuecomment-704379411

> No harm fixing it as part of this patch... add the 'jobTokenPassword' from below in ./hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/pom.xml

I think it's actually some test-runner bug; really it should be cleaned up. But we can pull in the patch to shut it up.

This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org

Issue Time Tracking
-------------------
Worklog Id: (was: 495982)
Time Spent: 24h (was: 23h 50m)

> Using snappy-java in SnappyCodec
>
>                 Key: HADOOP-17125
>                 URL: https://issues.apache.org/jira/browse/HADOOP-17125
>             Project: Hadoop Common
>          Issue Type: New Feature
>          Components: common
>    Affects Versions: 3.3.0
>            Reporter: DB Tsai
>            Priority: Major
>              Labels: pull-request-available
>          Time Spent: 24h
>  Remaining Estimate: 0h
>
> In Hadoop, we use native libs for the snappy codec, which has several disadvantages:
> * It requires native *libhadoop* and *libsnappy* to be installed in the system *LD_LIBRARY_PATH*, and they have to be installed separately on each node of the clusters, container images, or local test environments, which adds huge complexity from a deployment point of view. In some environments, it requires compiling the natives from source, which is non-trivial. Also, this approach is platform dependent; the binary may not work on a different platform, so it requires recompilation.
> * It requires extra configuration of *java.library.path* to load the natives, and it results in higher application deployment and maintenance cost for users.
> Projects such as *Spark* and *Parquet* use [snappy-java|https://github.com/xerial/snappy-java], which is a JNI-based implementation. It contains native binaries for Linux, Mac, and IBM platforms in the jar file, and it can automatically load them into the JVM from the jar without any setup. If a native implementation cannot be found for a platform, it can fall back to a pure-Java implementation of snappy based on [aircompressor|https://github.com/airlift/aircompressor/tree/master/src/main/java/io/airlift/compress/snappy].

--
This message was sent by Atlassian Jira (v8.3.4#803005)
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org
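The "try native, fall back to pure Java" loading strategy described in the issue can be sketched in plain Java. This is an illustrative stand-alone sketch, not Hadoop's or snappy-java's actual loader; the class name and the library name are hypothetical (the library name is deliberately nonexistent so the fallback branch runs):

```java
// Sketch of the fallback pattern described above: attempt to load a native
// library from java.library.path, and report failure instead of crashing so
// the caller can switch to a pure-Java implementation.
public class NativeFallbackSketch {

    /** Attempt to load a native library; return false rather than fail. */
    static boolean tryLoadNative(String libName) {
        try {
            System.loadLibrary(libName); // searches java.library.path
            return true;
        } catch (UnsatisfiedLinkError e) {
            return false; // e.g. libsnappy not installed on this node
        }
    }

    public static void main(String[] args) {
        // "no-such-snappy-lib" is a placeholder, so the fallback branch runs.
        boolean nativeLoaded = tryLoadNative("no-such-snappy-lib");
        String impl = nativeLoaded ? "native" : "pure-java";
        System.out.println("using " + impl + " snappy implementation");
    }
}
```

snappy-java goes one step further than this sketch: it extracts the bundled binary from its own jar to a temporary file before loading, which is why no `java.library.path` setup is needed.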
[jira] [Work logged] (HADOOP-17125) Using snappy-java in SnappyCodec
[ https://issues.apache.org/jira/browse/HADOOP-17125?focusedWorklogId=495800&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-495800 ]

ASF GitHub Bot logged work on HADOOP-17125:
-------------------------------------------
            Author: ASF GitHub Bot
        Created on: 06/Oct/20 09:06
        Start Date: 06/Oct/20 09:06
Worklog Time Spent: 10m

Work Description: hadoop-yetus commented on pull request #2297:
URL: https://github.com/apache/hadoop/pull/2297#issuecomment-704136004

:broken_heart: **-1 overall**

| Vote | Subsystem | Runtime | Logfile | Comment |
|::|--:|:|::|:---:|
| +0 :ok: | reexec | 0m 29s | | Docker mode activated. |
|||| _ Prechecks _ |
| +1 :green_heart: | dupname | 0m 0s | | No case conflicting files found. |
| +1 :green_heart: | @author | 0m 0s | | The patch does not contain any @author tags. |
| +1 :green_heart: | test4tests | 0m 0s | | The patch appears to include 4 new or modified test files. |
|||| _ trunk Compile Tests _ |
| +0 :ok: | mvndep | 5m 36s | | Maven dependency ordering for branch |
| +1 :green_heart: | mvninstall | 24m 2s | | trunk passed |
| +1 :green_heart: | compile | 19m 49s | | trunk passed with JDK Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1 |
| +1 :green_heart: | compile | 17m 5s | | trunk passed with JDK Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 |
| +1 :green_heart: | checkstyle | 2m 56s | | trunk passed |
| +1 :green_heart: | mvnsite | 20m 56s | | trunk passed |
| +1 :green_heart: | shadedclient | 14m 21s | | branch has no errors when building and testing our client artifacts. |
| +1 :green_heart: | javadoc | 6m 29s | | trunk passed with JDK Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1 |
| +1 :green_heart: | javadoc | 7m 8s | | trunk passed with JDK Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 |
| +0 :ok: | spotbugs | 0m 45s | | Used deprecated FindBugs config; considering switching to SpotBugs. |
| +0 :ok: | findbugs | 0m 25s | | branch/hadoop-project no findbugs output file (findbugsXml.xml) |
| +0 :ok: | findbugs | 0m 23s | | branch/hadoop-project-dist no findbugs output file (findbugsXml.xml) |
| -0 :warning: | patch | 1m 7s | | Used diff version of patch file. Binary files and potentially other changes not applied. Please rebase and squash commits if necessary. |
|||| _ Patch Compile Tests _ |
| +0 :ok: | mvndep | 0m 35s | | Maven dependency ordering for patch |
| +1 :green_heart: | mvninstall | 21m 15s | | the patch passed |
| +1 :green_heart: | compile | 19m 18s | | the patch passed with JDK Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1 |
| -1 :x: | cc | 19m 18s | [/diff-compile-cc-root-jdkUbuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2297/22/artifact/out/diff-compile-cc-root-jdkUbuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1.txt) | root-jdkUbuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1 with JDK Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1 generated 40 new + 123 unchanged - 40 fixed = 163 total (was 163) |
| +1 :green_heart: | golang | 19m 18s | | the patch passed |
| +1 :green_heart: | javac | 19m 18s | | the patch passed |
| +1 :green_heart: | compile | 17m 9s | | the patch passed with JDK Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 |
| -1 :x: | cc | 17m 9s | [/diff-compile-cc-root-jdkPrivateBuild-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2297/22/artifact/out/diff-compile-cc-root-jdkPrivateBuild-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01.txt) | root-jdkPrivateBuild-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 with JDK Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 generated 36 new + 127 unchanged - 36 fixed = 163 total (was 163) |
| +1 :green_heart: | golang | 17m 9s | | the patch passed |
| +1 :green_heart: | javac | 17m 9s | | the patch passed |
| +1 :green_heart: | checkstyle | 2m 50s | | root: The patch generated 0 new + 140 unchanged - 3 fixed = 140 total (was 143) |
| +1 :green_heart: | mvnsite | 17m 36s | | the patch passed |
| +1 :green_heart: | shellcheck | 0m 0s | | There were no new shellcheck issues. |
| +1 :green_heart: | shelldocs | 0m 18s | | There were no new shelldocs issues. |
| +1 :green_heart: | whitespace | 0m 0s | | The patch has no whitespace issues. |
| +1 :green_heart: | xml | 0m 5s | | The patch has no ill-formed XML file. |
| +1 :green_heart: | shadedclient | 14m 11s | | patch has no errors when building and testing our client artifacts. |
| +1 :green_heart: | javadoc |
[jira] [Work logged] (HADOOP-17125) Using snappy-java in SnappyCodec
[ https://issues.apache.org/jira/browse/HADOOP-17125?focusedWorklogId=495486&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-495486 ]

ASF GitHub Bot logged work on HADOOP-17125:
-------------------------------------------
            Author: ASF GitHub Bot
        Created on: 05/Oct/20 18:14
        Start Date: 05/Oct/20 18:14
Worklog Time Spent: 10m

Work Description: viirya commented on pull request #2297:
URL: https://github.com/apache/hadoop/pull/2297#issuecomment-703801190

Thanks @steveloughran and @saintstack. Updated the diff based on your suggestions.

Issue Time Tracking
-------------------
Worklog Id: (was: 495486)
Time Spent: 23h 40m (was: 23.5h)
[jira] [Work logged] (HADOOP-17125) Using snappy-java in SnappyCodec
[ https://issues.apache.org/jira/browse/HADOOP-17125?focusedWorklogId=495480&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-495480 ]

ASF GitHub Bot logged work on HADOOP-17125:
-------------------------------------------
            Author: ASF GitHub Bot
        Created on: 05/Oct/20 17:59
        Start Date: 05/Oct/20 17:59
Worklog Time Spent: 10m

Work Description: saintstack commented on pull request #2297:
URL: https://github.com/apache/hadoop/pull/2297#issuecomment-703792707

The native compile complaints seem unrelated...

Issue Time Tracking
-------------------
Worklog Id: (was: 495480)
Time Spent: 23.5h (was: 23h 20m)
[jira] [Work logged] (HADOOP-17125) Using snappy-java in SnappyCodec
[ https://issues.apache.org/jira/browse/HADOOP-17125?focusedWorklogId=495478&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-495478 ]

ASF GitHub Bot logged work on HADOOP-17125:
-------------------------------------------
            Author: ASF GitHub Bot
        Created on: 05/Oct/20 17:53
        Start Date: 05/Oct/20 17:53
Worklog Time Spent: 10m

Work Description: saintstack commented on pull request #2297:
URL: https://github.com/apache/hadoop/pull/2297#issuecomment-703789784

If making a new PR, the 'compile' is redundant given it's the Maven default? The license failure is:

```
Lines that start with ? in the ASF License report indicate files that do not have an Apache license header:
 !? /home/jenkins/jenkins-home/workspace/hadoop-multibranch_PR-2297/src/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/jobTokenPassword
```

No harm fixing it as part of this patch... add the '**/jobTokenPassword' exclude from below to the apache-rat-plugin configuration in ./hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/pom.xml:

```
<plugin>
  <groupId>org.apache.rat</groupId>
  <artifactId>apache-rat-plugin</artifactId>
  <configuration>
    <excludes>
      <exclude>src/test/java/org/apache/hadoop/cli/data60bytes</exclude>
      <exclude>src/test/resources/job_1329348432655_0001-10.jhist</exclude>
      <exclude>**/jobTokenPassword</exclude>
    </excludes>
  </configuration>
</plugin>
```

Otherwise the patch is looking good to me.

Issue Time Tracking
-------------------
Worklog Id: (was: 495478)
Time Spent: 23h 20m (was: 23h 10m)
[jira] [Work logged] (HADOOP-17125) Using snappy-java in SnappyCodec
[ https://issues.apache.org/jira/browse/HADOOP-17125?focusedWorklogId=495443&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-495443 ]

ASF GitHub Bot logged work on HADOOP-17125:
-------------------------------------------
            Author: ASF GitHub Bot
        Created on: 05/Oct/20 16:51
        Start Date: 05/Oct/20 16:51
Worklog Time Spent: 10m

Work Description: steveloughran commented on a change in pull request #2297:
URL: https://github.com/apache/hadoop/pull/2297#discussion_r499736294

## File path: hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/io/compress/CompressDecompressTester.java
##
@@ -432,7 +412,11 @@ public void assertCompression(String name, Compressor compressor,
         joiner.join(name, "byte arrays not equals error !!!"),
         originalRawData, decompressOut.toByteArray());
     } catch (Exception ex) {
-      fail(joiner.join(name, ex.getMessage()));
+      if (ex.getMessage() != null) {
+        fail(joiner.join(name, ex.getMessage()));
+      } else {
+        fail(joiner.join(name, ExceptionUtils.getStackTrace(ex)));
+      }

Review comment:
The NPE is why toString() is what new code should do. Why don't we just `throw new AssertionError(name + ex, ex)`? That way the stack trace doesn't get lost, which is something we never want to happen.

Issue Time Tracking
-------------------
Worklog Id: (was: 495443)
Time Spent: 23h 10m (was: 23h)
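The `AssertionError(message, cause)` idiom suggested in the review keeps the original stack trace that `fail(message)` throws away. A minimal stand-alone sketch of the idea; the class and method names here are illustrative, not the actual test code:

```java
// Sketch of the review suggestion: prefer `throw new AssertionError(msg, cause)`
// over JUnit's fail(msg), so the failing exception's stack trace travels with
// the assertion failure instead of being discarded.
public class AssertionCauseSketch {

    /** Wrap a caught exception so its stack trace is not lost. */
    static AssertionError wrap(String name, Exception ex) {
        // String concatenation with `ex` is null-safe, unlike ex.getMessage().
        return new AssertionError(name + ": " + ex, ex);
    }

    public static void main(String[] args) {
        try {
            throw wrap("SnappyCodec", new IllegalStateException("decompression mismatch"));
        } catch (AssertionError ae) {
            // The cause, and hence the full stack trace, is preserved.
            System.out.println(ae.getCause().getClass().getSimpleName()); // prints "IllegalStateException"
        }
    }
}
```

The two-argument `AssertionError(String, Throwable)` constructor has been available since Java 7, so this works on every JDK Hadoop supports.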
[jira] [Work logged] (HADOOP-17125) Using snappy-java in SnappyCodec
[ https://issues.apache.org/jira/browse/HADOOP-17125?focusedWorklogId=494119&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-494119 ]

ASF GitHub Bot logged work on HADOOP-17125:
-------------------------------------------
            Author: ASF GitHub Bot
        Created on: 02/Oct/20 20:05
        Start Date: 02/Oct/20 20:05
Worklog Time Spent: 10m

Work Description: viirya commented on pull request #2297:
URL: https://github.com/apache/hadoop/pull/2297#issuecomment-702933677

Fixed another (and last) style issue. Checked with `mvn checkstyle:check` locally.

Issue Time Tracking
-------------------
Worklog Id: (was: 494119)
Time Spent: 23h (was: 22h 50m)
[jira] [Work logged] (HADOOP-17125) Using snappy-java in SnappyCodec
[ https://issues.apache.org/jira/browse/HADOOP-17125?focusedWorklogId=494094&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-494094 ]

ASF GitHub Bot logged work on HADOOP-17125:
-------------------------------------------
            Author: ASF GitHub Bot
        Created on: 02/Oct/20 18:50
        Start Date: 02/Oct/20 18:50
Worklog Time Spent: 10m

Work Description: sunchao commented on pull request #2297:
URL: https://github.com/apache/hadoop/pull/2297#issuecomment-702900335

The style issue was fixed in the last run. The CI failed because of [unit tests](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2297/20/testReport/) and [ASF license](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2297/20/artifact/out/patch-asflicense-problems.txt) (I don't really see the file `jobTokenPassword`). Neither seems related to this PR.

Issue Time Tracking
-------------------
Worklog Id: (was: 494094)
Time Spent: 22h 50m (was: 22h 40m)
[jira] [Work logged] (HADOOP-17125) Using snappy-java in SnappyCodec
[ https://issues.apache.org/jira/browse/HADOOP-17125?focusedWorklogId=493595&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-493595 ]

ASF GitHub Bot logged work on HADOOP-17125:
-------------------------------------------
            Author: ASF GitHub Bot
        Created on: 01/Oct/20 17:07
        Start Date: 01/Oct/20 17:07
Worklog Time Spent: 10m

Work Description: viirya commented on pull request #2297:
URL: https://github.com/apache/hadoop/pull/2297#issuecomment-702273856

Hmm, for CompressDecompressTester.java, it seems to me that it is from the original code?

```java
 else if (compressor.getClass().isAssignableFrom(ZlibCompressor.class)) {
   return ZlibFactory.isNativeZlibLoaded(new Configuration());
-}
-else if (compressor.getClass().isAssignableFrom(SnappyCompressor.class)
-    && isNativeSnappyLoadable())
+}
+else if (compressor.getClass().isAssignableFrom(SnappyCompressor.class))
```

Anyway, I can fix it here if you think it is ok.

Issue Time Tracking
-------------------
Worklog Id: (was: 493595)
Time Spent: 22h 40m (was: 22.5h)
[jira] [Work logged] (HADOOP-17125) Using snappy-java in SnappyCodec
[ https://issues.apache.org/jira/browse/HADOOP-17125?focusedWorklogId=493505&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-493505 ]

ASF GitHub Bot logged work on HADOOP-17125:
-------------------------------------------
            Author: ASF GitHub Bot
        Created on: 01/Oct/20 14:04
        Start Date: 01/Oct/20 14:04
Worklog Time Spent: 10m

Work Description: steveloughran commented on pull request #2297:
URL: https://github.com/apache/hadoop/pull/2297#issuecomment-702159088

OK, Yetus is running; it's just that the reporting isn't quite there... if you follow the link you see the results.

Test failures in HDFS: unrelated. ASF licence warning: unrelated. Checkstyles are, sadly, related:

```
./hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/io/compress/CompressDecompressTester.java:491: }:5: '}' at column 5 should be on the same line as the next part of a multi-block statement (one that directly contains multiple blocks: if/else-if/else, do/while or try/catch/finally). [RightCurly]
./hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/io/compress/CompressDecompressTester.java:492: else if (compressor.getClass().isAssignableFrom(SnappyCompressor.class)): 'if' construct must use '{}'s. [NeedBraces]
./hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/io/compress/snappy/TestSnappyCompressorDecompressor.java:356: int[] size = { 4 * 1024, 64 * 1024, 128 * 1024, 1024 * 1024 };:18: '{' is followed by whitespace. [NoWhitespaceAfter]
```

Issue Time Tracking
-------------------
Worklog Id: (was: 493505)
Time Spent: 22.5h (was: 22h 20m)
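For reference, here is a shape that satisfies the `RightCurly` and `NeedBraces` rules quoted above: the closing `}` shares a line with the following `else if`, and every branch body is wrapped in braces. The class and method are illustrative, not the patched code:

```java
// Illustrative snippet of the brace style checkstyle asks for in the
// RightCurly and NeedBraces messages above.
public class BraceStyleSketch {

    static String classify(int code) {
        if (code == 0) {
            return "zlib";
        } else if (code == 1) {   // RightCurly: `}` on the same line as `else if`
            return "snappy";      // NeedBraces: every branch wrapped in `{}`
        } else {
            return "unknown";
        }
    }

    public static void main(String[] args) {
        System.out.println(classify(1)); // prints "snappy"
    }
}
```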
[jira] [Work logged] (HADOOP-17125) Using snappy-java in SnappyCodec
[ https://issues.apache.org/jira/browse/HADOOP-17125?focusedWorklogId=493504&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-493504 ]

ASF GitHub Bot logged work on HADOOP-17125:
---
Author: ASF GitHub Bot
Created on: 01/Oct/20 14:03
Start Date: 01/Oct/20 14:03
Worklog Time Spent: 10m

Work Description: hadoop-yetus removed a comment on pull request #2297:
URL: https://github.com/apache/hadoop/pull/2297#issuecomment-690909475

:broken_heart: **-1 overall**

| Vote | Subsystem | Runtime | Comment |
|::|--:|:|:|
| +0 :ok: | reexec | 0m 30s | Docker mode activated. |
||| _ Prechecks _ |
| +1 :green_heart: | dupname | 0m 0s | No case conflicting files found. |
| +1 :green_heart: | @author | 0m 1s | The patch does not contain any @author tags. |
| +1 :green_heart: | test4tests | 0m 0s | The patch appears to include 5 new or modified test files. |
||| _ trunk Compile Tests _ |
| +0 :ok: | mvndep | 3m 35s | Maven dependency ordering for branch |
| +1 :green_heart: | mvninstall | 27m 4s | trunk passed |
| +1 :green_heart: | compile | 21m 25s | trunk passed with JDK Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1 |
| +1 :green_heart: | compile | 19m 19s | trunk passed with JDK Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 |
| +1 :green_heart: | checkstyle | 3m 2s | trunk passed |
| +1 :green_heart: | mvnsite | 2m 7s | trunk passed |
| +1 :green_heart: | shadedclient | 21m 6s | branch has no errors when building and testing our client artifacts. |
| +1 :green_heart: | javadoc | 1m 13s | trunk passed with JDK Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1 |
| +1 :green_heart: | javadoc | 2m 7s | trunk passed with JDK Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 |
| +0 :ok: | spotbugs | 2m 12s | Used deprecated FindBugs config; considering switching to SpotBugs. |
| +0 :ok: | findbugs | 0m 38s | branch/hadoop-project no findbugs output file (findbugsXml.xml) |
||| _ Patch Compile Tests _ |
| +0 :ok: | mvndep | 0m 28s | Maven dependency ordering for patch |
| +1 :green_heart: | mvninstall | 1m 2s | the patch passed |
| +1 :green_heart: | compile | 18m 38s | the patch passed with JDK Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1 |
| -1 :x: | cc | 18m 38s | root-jdkUbuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1 with JDK Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1 generated 36 new + 127 unchanged - 36 fixed = 163 total (was 163) |
| +1 :green_heart: | golang | 18m 38s | the patch passed |
| +1 :green_heart: | javac | 18m 38s | the patch passed |
| +1 :green_heart: | compile | 16m 51s | the patch passed with JDK Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 |
| -1 :x: | cc | 16m 51s | root-jdkPrivateBuild-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 with JDK Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 generated 29 new + 134 unchanged - 29 fixed = 163 total (was 163) |
| +1 :green_heart: | golang | 16m 51s | the patch passed |
| +1 :green_heart: | javac | 16m 51s | the patch passed |
| -0 :warning: | checkstyle | 2m 41s | root: The patch generated 1 new + 151 unchanged - 5 fixed = 152 total (was 156) |
| +1 :green_heart: | mvnsite | 2m 1s | the patch passed |
| +1 :green_heart: | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 :green_heart: | xml | 0m 2s | The patch has no ill-formed XML file. |
| +1 :green_heart: | shadedclient | 14m 25s | patch has no errors when building and testing our client artifacts. |
| +1 :green_heart: | javadoc | 1m 12s | the patch passed with JDK Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1 |
| +1 :green_heart: | javadoc | 2m 9s | the patch passed with JDK Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 |
| +0 :ok: | findbugs | 0m 32s | hadoop-project has no data from findbugs |
||| _ Other Tests _ |
| +1 :green_heart: | unit | 0m 32s | hadoop-project in the patch passed. |
| -1 :x: | unit | 9m 32s | hadoop-common in the patch passed. |
| +1 :green_heart: | asflicense | 0m 55s | The patch does not generate ASF License warnings. |
| | | 177m 51s | |

| Reason | Tests |
|---:|:--|
| Failed junit tests | hadoop.io.compress.snappy.TestSnappyCompressorDecompressor |

| Subsystem | Report/Notes |
|--:|:-|
| Docker | ClientAPI=1.40 ServerAPI=1.40 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2297/3/artifact/out/Dockerfile |
| GITHUB PR | https://github.com/apache/hadoop/pull/2297 |
| Optional Tests | dupname asflicense compile
[jira] [Work logged] (HADOOP-17125) Using snappy-java in SnappyCodec
[ https://issues.apache.org/jira/browse/HADOOP-17125?focusedWorklogId=492716&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-492716 ]

ASF GitHub Bot logged work on HADOOP-17125:
---
Author: ASF GitHub Bot
Created on: 29/Sep/20 23:17
Start Date: 29/Sep/20 23:17
Worklog Time Spent: 10m

Work Description: viirya commented on pull request #2297:
URL: https://github.com/apache/hadoop/pull/2297#issuecomment-701041484

@sunchao Thanks for the review. I addressed your comments. Please let me know if you have more comments.

Issue Time Tracking
---
Worklog Id: (was: 492716)
Time Spent: 22h 10m (was: 22h)
[jira] [Work logged] (HADOOP-17125) Using snappy-java in SnappyCodec
[ https://issues.apache.org/jira/browse/HADOOP-17125?focusedWorklogId=492701&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-492701 ]

ASF GitHub Bot logged work on HADOOP-17125:
---
Author: ASF GitHub Bot
Created on: 29/Sep/20 22:35
Start Date: 29/Sep/20 22:35
Worklog Time Spent: 10m

Work Description: viirya commented on a change in pull request #2297:
URL: https://github.com/apache/hadoop/pull/2297#discussion_r497099812

File path: hadoop-project-dist/pom.xml

@@ -341,7 +340,6 @@
     --openssllib=${openssl.lib}
     --opensslbinbundle=${bundle.openssl.in.bin}
     --openssllibbundle=${bundle.openssl}
-    --snappybinbundle=${bundle.snappy.in.bin}
     --snappylib=${snappy.lib}
     --snappylibbundle=${bundle.snappy}

Review comment: Re-checked. hadoop-mapreduce-client-nativetask doesn't use `bundle.snappy.in.bin`. Only the native-win profile of hadoop-common and the native-win profile in hadoop-project use `bundle.snappy.in.bin`. They were added by HADOOP-9802 to support SnappyCodec on Windows. So it looks like it is safe to remove.

Issue Time Tracking
---
Worklog Id: (was: 492701)
Time Spent: 22h (was: 21h 50m)
[jira] [Work logged] (HADOOP-17125) Using snappy-java in SnappyCodec
[ https://issues.apache.org/jira/browse/HADOOP-17125?focusedWorklogId=492699&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-492699 ]

ASF GitHub Bot logged work on HADOOP-17125:
---
Author: ASF GitHub Bot
Created on: 29/Sep/20 22:23
Start Date: 29/Sep/20 22:23
Worklog Time Spent: 10m

Work Description: viirya commented on a change in pull request #2297:
URL: https://github.com/apache/hadoop/pull/2297#discussion_r497093683

File path: hadoop-project-dist/pom.xml (same hunk as quoted above)

Review comment: Hmm, I am not sure about this now. I think it is safer to revert this back? I am not sure if hadoop-mapreduce-client-nativetask uses this.

Issue Time Tracking
---
Worklog Id: (was: 492699)
Time Spent: 21h 50m (was: 21h 40m)
[jira] [Work logged] (HADOOP-17125) Using snappy-java in SnappyCodec
[ https://issues.apache.org/jira/browse/HADOOP-17125?focusedWorklogId=492697&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-492697 ]

ASF GitHub Bot logged work on HADOOP-17125:
---
Author: ASF GitHub Bot
Created on: 29/Sep/20 22:19
Start Date: 29/Sep/20 22:19
Worklog Time Spent: 10m

Work Description: viirya commented on a change in pull request #2297:
URL: https://github.com/apache/hadoop/pull/2297#discussion_r497093683

File path: hadoop-project-dist/pom.xml (same hunk as quoted above)

Review comment: Hmm, I am not sure about this now. I think it is safer to revert this back, I am not sure if hadoop-mapreduce-client-nativetask uses this.

Issue Time Tracking
---
Worklog Id: (was: 492697)
Time Spent: 21h 40m (was: 21.5h)
[jira] [Work logged] (HADOOP-17125) Using snappy-java in SnappyCodec
[ https://issues.apache.org/jira/browse/HADOOP-17125?focusedWorklogId=492689&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-492689 ]

ASF GitHub Bot logged work on HADOOP-17125:
---
Author: ASF GitHub Bot
Created on: 29/Sep/20 21:44
Start Date: 29/Sep/20 21:44
Worklog Time Spent: 10m

Work Description: viirya commented on a change in pull request #2297:
URL: https://github.com/apache/hadoop/pull/2297#discussion_r497078103

File path: hadoop-common-project/hadoop-common/src/main/native/native.vcxproj

@@ -68,30 +68,13 @@
     ..\..\..\target\native\$(Configuration)\
     hadoop
-
-    $(CustomSnappyPrefix)
-    $(CustomSnappyPrefix)\lib
-    $(CustomSnappyPrefix)\bin
-    $(CustomSnappyLib)
-    $(CustomSnappyPrefix)
-    $(CustomSnappyPrefix)\include
-    $(CustomSnappyInclude)
-    true
-    $(SnappyInclude);$(IncludePath)
-    $(ZLIB_HOME);$(IncludePath)

Review comment: Oh, yeah, will add it back.

Issue Time Tracking
---
Worklog Id: (was: 492689)
Time Spent: 21.5h (was: 21h 20m)
[jira] [Work logged] (HADOOP-17125) Using snappy-java in SnappyCodec
[ https://issues.apache.org/jira/browse/HADOOP-17125?focusedWorklogId=492687&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-492687 ]

ASF GitHub Bot logged work on HADOOP-17125:
---
Author: ASF GitHub Bot
Created on: 29/Sep/20 21:33
Start Date: 29/Sep/20 21:33
Worklog Time Spent: 10m

Work Description: sunchao commented on a change in pull request #2297:
URL: https://github.com/apache/hadoop/pull/2297#discussion_r497073278

File path: hadoop-common-project/hadoop-common/src/main/native/native.vcxproj (same hunk as quoted above)

Review comment: I think this is for the Zlib compressor (see https://issues.apache.org/jira/browse/HADOOP-10450). Yeah, it is a bit confusing that it's defined in the same group.

Issue Time Tracking
---
Worklog Id: (was: 492687)
Time Spent: 21h 20m (was: 21h 10m)
[jira] [Work logged] (HADOOP-17125) Using snappy-java in SnappyCodec
[ https://issues.apache.org/jira/browse/HADOOP-17125?focusedWorklogId=492682&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-492682 ]

ASF GitHub Bot logged work on HADOOP-17125:
---
Author: ASF GitHub Bot
Created on: 29/Sep/20 21:24
Start Date: 29/Sep/20 21:24
Worklog Time Spent: 10m

Work Description: viirya commented on a change in pull request #2297:
URL: https://github.com/apache/hadoop/pull/2297#discussion_r497068326

File path: hadoop-common-project/hadoop-common/src/main/native/native.vcxproj (same hunk as quoted above)

Review comment: I think it is together with the snappy stuff here, no? They are in the same `PropertyGroup`. I think it is used to add the zlib home into the include paths in the property group.

Issue Time Tracking
---
Worklog Id: (was: 492682)
Time Spent: 21h 10m (was: 21h)
[jira] [Work logged] (HADOOP-17125) Using snappy-java in SnappyCodec
[ https://issues.apache.org/jira/browse/HADOOP-17125?focusedWorklogId=492648&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-492648 ]

ASF GitHub Bot logged work on HADOOP-17125:
---
Author: ASF GitHub Bot
Created on: 29/Sep/20 20:13
Start Date: 29/Sep/20 20:13
Worklog Time Spent: 10m

Work Description: saintstack commented on pull request #2297:
URL: https://github.com/apache/hadoop/pull/2297#issuecomment-700960375

> happy with that?

Thanks @steveloughran If I understand you correctly, because you figured we actually already include snappy-java and that snappy is already a hadoop-common dependency, then making snappy-java 'compile' rather than 'provided' is ok by you. If so, yeah, I think this is better. Operators don't have to make sure native snappy is present everywhere, since snappy-java provides it. (Looks like @viirya has gone ahead and made snappy-java compile scope...)

Issue Time Tracking
---
Worklog Id: (was: 492648)
Time Spent: 21h (was: 20h 50m)
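The 'compile' vs 'provided' choice discussed above comes down to a Maven dependency declaration along these lines. This is an illustrative sketch only; the version number and exact placement are assumptions, not what hadoop-common's pom.xml actually specifies.

```xml
<!-- Illustrative sketch; version and placement are assumptions. -->
<dependency>
  <groupId>org.xerial.snappy</groupId>
  <artifactId>snappy-java</artifactId>
  <version>1.1.7.7</version>
  <!-- compile (the default scope) ships the jar with Hadoop, so operators
       need no system libsnappy; provided would push that burden back onto
       deployers, who would have to supply the jar at runtime themselves. -->
  <scope>compile</scope>
</dependency>
```

With compile scope the jar also flows transitively to downstream consumers of hadoop-common, which is exactly why the reviewers check that client modules do not shade it away.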
[jira] [Work logged] (HADOOP-17125) Using snappy-java in SnappyCodec
[ https://issues.apache.org/jira/browse/HADOOP-17125?focusedWorklogId=492645&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-492645 ]

ASF GitHub Bot logged work on HADOOP-17125:
---
Author: ASF GitHub Bot
Created on: 29/Sep/20 20:00
Start Date: 29/Sep/20 20:00
Worklog Time Spent: 10m

Work Description: sunchao commented on a change in pull request #2297:
URL: https://github.com/apache/hadoop/pull/2297#discussion_r497009027

File path: hadoop-common-project/hadoop-common/src/main/native/native.vcxproj (same hunk as quoted above)

Review comment: I mean the last line, which is about ZLIB.

Issue Time Tracking
---
Worklog Id: (was: 492645)
Time Spent: 20h 50m (was: 20h 40m)
[jira] [Work logged] (HADOOP-17125) Using snappy-java in SnappyCodec
[ https://issues.apache.org/jira/browse/HADOOP-17125?focusedWorklogId=492635&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-492635 ]

ASF GitHub Bot logged work on HADOOP-17125:
---
Author: ASF GitHub Bot
Created on: 29/Sep/20 19:39
Start Date: 29/Sep/20 19:39
Worklog Time Spent: 10m

Work Description: viirya commented on a change in pull request #2297:
URL: https://github.com/apache/hadoop/pull/2297#discussion_r496997774

File path: hadoop-common-project/hadoop-common/src/main/native/native.vcxproj (same hunk as quoted above)

Review comment: Since we removed the snappy native code, why do we need to keep this on Windows?

Issue Time Tracking
---
Worklog Id: (was: 492635)
Time Spent: 20h 40m (was: 20.5h)
[jira] [Work logged] (HADOOP-17125) Using snappy-java in SnappyCodec
[ https://issues.apache.org/jira/browse/HADOOP-17125?focusedWorklogId=492626=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-492626 ]

ASF GitHub Bot logged work on HADOOP-17125:
---
Author: ASF GitHub Bot
Created on: 29/Sep/20 19:17
Start Date: 29/Sep/20 19:17
Worklog Time Spent: 10m

Work Description: sunchao commented on pull request #2297:
URL: https://github.com/apache/hadoop/pull/2297#issuecomment-700927518

> bad: doesn't work so well if things (hadoop-client?) shade the snappy jar references.

Since this dependency is new, as long as we make sure it is not shaded in the client modules (e.g., hadoop-client-api) it should be fine. I checked there and don't see it being shaded right now.

Issue Time Tracking
---
Worklog Id: (was: 492626)
Time Spent: 20.5h (was: 20h 20m)
[jira] [Work logged] (HADOOP-17125) Using snappy-java in SnappyCodec
[ https://issues.apache.org/jira/browse/HADOOP-17125?focusedWorklogId=492625=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-492625 ]

ASF GitHub Bot logged work on HADOOP-17125:
---
Author: ASF GitHub Bot
Created on: 29/Sep/20 19:15
Start Date: 29/Sep/20 19:15
Worklog Time Spent: 10m

Work Description: sunchao commented on a change in pull request #2297:
URL: https://github.com/apache/hadoop/pull/2297#discussion_r496960563

## File path: hadoop-common-project/hadoop-common/src/main/native/native.vcxproj

@@ -68,30 +68,13 @@
 ..\..\..\target\native\$(Configuration)\ hadoop
-$(CustomSnappyPrefix)
-$(CustomSnappyPrefix)\lib
-$(CustomSnappyPrefix)\bin
-$(CustomSnappyLib)
-$(CustomSnappyPrefix)
-$(CustomSnappyPrefix)\include
-$(CustomSnappyInclude)
-true
-$(SnappyInclude);$(IncludePath)
-$(ZLIB_HOME);$(IncludePath)

Review comment: We shouldn't remove this.

## File path: hadoop-project-dist/pom.xml

@@ -341,7 +340,6 @@
 --openssllib=${openssl.lib}
 --opensslbinbundle=${bundle.openssl.in.bin}
 --openssllibbundle=${bundle.openssl}
---snappybinbundle=${bundle.snappy.in.bin}
 --snappylib=${snappy.lib}
 --snappylibbundle=${bundle.snappy}

Review comment: Does this mean we don't need the option in dev-support/bin/dist-copynativelibs for snappy?

Issue Time Tracking
---
Worklog Id: (was: 492625)
Time Spent: 20h 20m (was: 20h 10m)
[jira] [Work logged] (HADOOP-17125) Using snappy-java in SnappyCodec
[ https://issues.apache.org/jira/browse/HADOOP-17125?focusedWorklogId=492567=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-492567 ]

ASF GitHub Bot logged work on HADOOP-17125:
---
Author: ASF GitHub Bot
Created on: 29/Sep/20 17:20
Start Date: 29/Sep/20 17:20
Worklog Time Spent: 10m

Work Description: steveloughran commented on pull request #2297:
URL: https://github.com/apache/hadoop/pull/2297#issuecomment-700859880

Having checked up on those dependencies myself: yes, we are already shipping snappy-java 1.0.5 as a dependency of hadoop-common, by way of avro. That makes a strong case for keeping the snappy codec in hadoop-common, declaring the snappy dependency as a compile-time dependency, with the version we choose. This ensures hbase picks it up, and, because it is there, we aren't creating any more complications for people downstream than they get today. @saintstack happy with that?

In which case, we actually go to the earlier patches, which just switch the codec to using the native one, and the pom changes to import it. Plus a release note: "you don't need native snappy no more".

We could stay with the current code, which is resilient to someone removing the JAR.
Good: resilient.
Bad: doesn't work so well if things (hadoop-client?) shade the snappy jar references. I don't know what to do there.

Given the code is written, I'd go for "merge it as is and see what happens", but the "go back to the minimal binding" could well have lower maintenance costs down the line.

Issue Time Tracking
---
Worklog Id: (was: 492567)
Time Spent: 20h 10m (was: 20h)
[jira] [Work logged] (HADOOP-17125) Using snappy-java in SnappyCodec
[ https://issues.apache.org/jira/browse/HADOOP-17125?focusedWorklogId=492164=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-492164 ]

ASF GitHub Bot logged work on HADOOP-17125:
---
Author: ASF GitHub Bot
Created on: 28/Sep/20 21:33
Start Date: 28/Sep/20 21:33
Worklog Time Spent: 10m

Work Description: saintstack commented on pull request #2297:
URL: https://github.com/apache/hadoop/pull/2297#issuecomment-700293030

> @steveloughran if snappy native was a dependency for hbase-common, do you think that argues snappy-java could be (having it as 'provided' undoes the main benefit of using the snappy-java jar instead of the native snappy libs). Thanks Steve.

Any luck on an opinion on the above, @steveloughran? Thanks boss.

Issue Time Tracking
---
Worklog Id: (was: 492164)
Time Spent: 20h (was: 19h 50m)
[jira] [Work logged] (HADOOP-17125) Using snappy-java in SnappyCodec
[ https://issues.apache.org/jira/browse/HADOOP-17125?focusedWorklogId=491427=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-491427 ]

ASF GitHub Bot logged work on HADOOP-17125:
---
Author: ASF GitHub Bot
Created on: 25/Sep/20 20:59
Start Date: 25/Sep/20 20:59
Worklog Time Spent: 10m

Work Description: viirya commented on a change in pull request #2297:
URL: https://github.com/apache/hadoop/pull/2297#discussion_r495182332

## File path: hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/io/compress/snappy/SnappyDecompressor.java

@@ -45,30 +46,21 @@
   private int userBufOff = 0, userBufLen = 0;
   private boolean finished;
-  private static boolean nativeSnappyLoaded = false;
-
-  static {
-    if (NativeCodeLoader.isNativeCodeLoaded() &&
-        NativeCodeLoader.buildSupportsSnappy()) {
-      try {
-        initIDs();
-        nativeSnappyLoaded = true;
-      } catch (Throwable t) {
-        LOG.error("failed to load SnappyDecompressor", t);
-      }
-    }
-  }
-
-  public static boolean isNativeCodeLoaded() {
-    return nativeSnappyLoaded;
-  }
-
   /**
    * Creates a new compressor.
    *
    * @param directBufferSize size of the direct buffer to be used.
    */
   public SnappyDecompressor(int directBufferSize) {
+    // `snappy-java` is provided scope. We need to check if it is available.
+    try {
+      SnappyLoader.getVersion();

Review comment: The "internal user-only" note on `SnappyLoader`, based on its comment, seems more related to native library loading. `getVersion` is a static method and it doesn't involve the loading of the native library described in `SnappyLoader`, so I guess it is fine? Otherwise, I don't find another proper method to check.

Issue Time Tracking
---
Worklog Id: (was: 491427)
Time Spent: 19h 50m (was: 19h 40m)
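The availability check discussed above (calling a harmless static method such as `SnappyLoader.getVersion()` inside a try/catch) is one way to probe for an optional, provided-scope dependency. A stdlib-only sketch of the same idea follows; the `ClasspathProbe` class and `isClassPresent` helper are hypothetical illustrations, not Hadoop's actual code, though `org.xerial.snappy.SnappyLoader` is the real snappy-java loader class.

```java
// Sketch (assumed helper, not Hadoop code): probe whether an optional
// dependency such as snappy-java is actually on the runtime classpath
// before trying to use it.
public class ClasspathProbe {

    /** Returns true if the named class can be loaded, false otherwise. */
    public static boolean isClassPresent(String className) {
        try {
            // `initialize=false`: just resolve the class, don't run static
            // initializers (which for snappy-java would load native code).
            Class.forName(className, false, ClasspathProbe.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            // Missing jar (ClassNotFoundException) or a jar whose own
            // linking step failed (LinkageError): treat as absent.
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println("snappy-java available: "
                + isClassPresent("org.xerial.snappy.SnappyLoader"));
    }
}
```

Hadoop's patch instead invokes `SnappyLoader.getVersion()` directly, which additionally verifies that the class links and initializes, not merely that it resolves.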
[jira] [Work logged] (HADOOP-17125) Using snappy-java in SnappyCodec
[ https://issues.apache.org/jira/browse/HADOOP-17125?focusedWorklogId=491426=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-491426 ]

ASF GitHub Bot logged work on HADOOP-17125:
---
Author: ASF GitHub Bot
Created on: 25/Sep/20 20:58
Start Date: 25/Sep/20 20:58
Worklog Time Spent: 10m

Work Description: viirya commented on a change in pull request #2297:
URL: https://github.com/apache/hadoop/pull/2297#discussion_r495183191

## File path: hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/io/compress/snappy/SnappyCompressor.java

@@ -291,9 +283,17 @@ public long getBytesWritten() {
   public void end() {
   }

-  private native static void initIDs();
-
-  private native int compressBytesDirect();
-
-  public native static String getLibraryName();
+  private int compressDirectBuf() throws IOException {
+    if (uncompressedDirectBufLen == 0) {
+      return 0;
+    } else {
+      // Set the position and limit of `uncompressedDirectBuf` for reading
+      uncompressedDirectBuf.limit(uncompressedDirectBufLen).position(0);
+      int size = Snappy.compress((ByteBuffer) uncompressedDirectBuf,
+          (ByteBuffer) compressedDirectBuf);
+      uncompressedDirectBufLen = 0;
+      uncompressedDirectBuf.limit(uncompressedDirectBuf.capacity()).position(0);

Review comment: Seems so; I remember I added this to fix a test failure. It might have been `SnappyDecompressor`, I think, and then I copied it to `SnappyCompressor`. Deleted this; let's see what Jenkins tells us.

Issue Time Tracking
---
Worklog Id: (was: 491426)
Time Spent: 19h 40m (was: 19.5h)
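The limit/position bookkeeping under review above follows a standard direct-buffer pattern: set the source buffer up for reading before handing it to the codec, then restore it for writing afterwards. A stdlib-only sketch of that pattern, with a plain byte copy standing in for the `Snappy.compress(src, dst)` call (which needs the snappy-java jar and is not reproduced here):

```java
import java.nio.ByteBuffer;

public class DirectBufDemo {

    /**
     * Mimics the buffer handling in compressDirectBuf(): prepare `src` for
     * reading, "compress" (here: just copy) into `dst`, then reset `src` so
     * the caller can refill it. Returns the number of bytes written to dst.
     */
    public static int drain(ByteBuffer src, int srcLen, ByteBuffer dst) {
        if (srcLen == 0) {
            return 0;
        }
        // Set the position and limit of `src` for reading, as the reviewed
        // code does before calling Snappy.compress.
        src.limit(srcLen).position(0);
        int start = dst.position();
        dst.put(src);                        // stand-in for Snappy.compress(src, dst)
        // Reset `src` for the next round of writes (limit=capacity, position=0).
        src.limit(src.capacity()).position(0);
        return dst.position() - start;
    }

    public static void main(String[] args) {
        ByteBuffer src = ByteBuffer.allocateDirect(64);
        ByteBuffer dst = ByteBuffer.allocateDirect(64);
        src.put(new byte[] {1, 2, 3});
        System.out.println(drain(src, 3, dst));  // prints 3
    }
}
```

The review question about the final `limit(capacity()).position(0)` line is whether the reset belongs here at all; the sketch keeps it to show where it sits in the pattern.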
[jira] [Work logged] (HADOOP-17125) Using snappy-java in SnappyCodec
[ https://issues.apache.org/jira/browse/HADOOP-17125?focusedWorklogId=491379=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-491379 ]

ASF GitHub Bot logged work on HADOOP-17125:
---
Author: ASF GitHub Bot
Created on: 25/Sep/20 19:36
Start Date: 25/Sep/20 19:36
Worklog Time Spent: 10m

Work Description: viirya commented on a change in pull request #2297:
URL: https://github.com/apache/hadoop/pull/2297#discussion_r495193338

## File path: hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/io/compress/CompressDecompressTester.java

@@ -495,19 +479,16 @@ public String getName() {
   Compressor compressor = pair.compressor;
   if (compressor.getClass().isAssignableFrom(Lz4Compressor.class)
-      && (NativeCodeLoader.isNativeCodeLoaded())

Review comment: Ok. Reverted the change.

Issue Time Tracking
---
Worklog Id: (was: 491379)
Time Spent: 19.5h (was: 19h 20m)
[jira] [Work logged] (HADOOP-17125) Using snappy-java in SnappyCodec
[ https://issues.apache.org/jira/browse/HADOOP-17125?focusedWorklogId=491377=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-491377 ]

ASF GitHub Bot logged work on HADOOP-17125:
---
Author: ASF GitHub Bot
Created on: 25/Sep/20 19:34
Start Date: 25/Sep/20 19:34
Worklog Time Spent: 10m

Work Description: viirya commented on pull request #2297:
URL: https://github.com/apache/hadoop/pull/2297#issuecomment-699115827

Thanks @sunchao for review. Addressed your comments.

Issue Time Tracking
---
Worklog Id: (was: 491377)
Time Spent: 19h 20m (was: 19h 10m)
[jira] [Work logged] (HADOOP-17125) Using snappy-java in SnappyCodec
[ https://issues.apache.org/jira/browse/HADOOP-17125?focusedWorklogId=491367=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-491367 ]

ASF GitHub Bot logged work on HADOOP-17125:
---
Author: ASF GitHub Bot
Created on: 25/Sep/20 19:14
Start Date: 25/Sep/20 19:14
Worklog Time Spent: 10m

Work Description: viirya commented on a change in pull request #2297:
URL: https://github.com/apache/hadoop/pull/2297#discussion_r495183191

## File path: hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/io/compress/snappy/SnappyCompressor.java

Review comment: Seems so; I remember I added this to fix a test failure. It might have been `SnappyDecompressor`, I think, and then I copied it to `SnappyCompressor`. Deleted this; let's see what Jenkins tells us.

Issue Time Tracking
---
Worklog Id: (was: 491367)
Time Spent: 19h (was: 18h 50m)
[jira] [Work logged] (HADOOP-17125) Using snappy-java in SnappyCodec
[ https://issues.apache.org/jira/browse/HADOOP-17125?focusedWorklogId=491368=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-491368 ] ASF GitHub Bot logged work on HADOOP-17125: --- Author: ASF GitHub Bot Created on: 25/Sep/20 19:14 Start Date: 25/Sep/20 19:14 Worklog Time Spent: 10m Work Description: viirya commented on a change in pull request #2297: URL: https://github.com/apache/hadoop/pull/2297#discussion_r495183290 ## File path: hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/io/compress/snappy/SnappyDecompressor.java ## @@ -276,10 +268,20 @@ public void end() { // do nothing } - private native static void initIDs(); + private int decompressDirectBuf() throws IOException { +if (compressedDirectBufLen == 0) { + return 0; +} else { + // Set the position and limit of `compressedDirectBuf` for reading + compressedDirectBuf.limit(compressedDirectBufLen).position(0); + int size = Snappy.uncompress((ByteBuffer) compressedDirectBuf, + (ByteBuffer) uncompressedDirectBuf); + compressedDirectBufLen = 0; + compressedDirectBuf.limit(compressedDirectBuf.capacity()).position(0); Review comment: yap This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. 
Issue Time Tracking
---
Worklog Id: (was: 491368)
Time Spent: 19h 10m (was: 19h)
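The hunk quoted above prepares `compressedDirectBuf` for reading (limit = number of valid bytes, position = 0), hands it to `Snappy.uncompress`, then restores it to a writable state. The same read-then-reset cycle can be sketched with plain `java.nio`; the `BufferCycle` class and its names are illustrative, not part of the Hadoop source:

```java
import java.nio.ByteBuffer;
import java.util.Arrays;

public class BufferCycle {

  // Reads the first validLen bytes out of buf, then resets buf to a fully
  // writable state, mirroring the limit/position discipline in the patch.
  static byte[] readCycle(ByteBuffer buf, int validLen) {
    buf.limit(validLen).position(0);          // expose only the valid region
    byte[] out = new byte[buf.remaining()];
    buf.get(out);                             // consumer drains it
    buf.limit(buf.capacity()).position(0);    // reset for the next write
    return out;
  }

  public static void main(String[] args) {
    ByteBuffer buf = ByteBuffer.allocateDirect(16);
    buf.put(new byte[] {1, 2, 3, 4});
    System.out.println(Arrays.toString(readCycle(buf, 4))); // [1, 2, 3, 4]
    System.out.println(buf.position() + "/" + buf.limit()); // 0/16
  }
}
```

In the real code the consumer is snappy-java rather than `ByteBuffer.get`, but the before/after buffer state is the same.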
[jira] [Work logged] (HADOOP-17125) Using snappy-java in SnappyCodec
[ https://issues.apache.org/jira/browse/HADOOP-17125?focusedWorklogId=491366&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-491366 ]

ASF GitHub Bot logged work on HADOOP-17125:
---
Author: ASF GitHub Bot
Created on: 25/Sep/20 19:12
Start Date: 25/Sep/20 19:12
Worklog Time Spent: 10m

Work Description: viirya commented on a change in pull request #2297:
URL: https://github.com/apache/hadoop/pull/2297#discussion_r495182332

## File path: hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/io/compress/snappy/SnappyDecompressor.java
## @@ -45,30 +46,21 @@
   private int userBufOff = 0, userBufLen = 0;
   private boolean finished;

-  private static boolean nativeSnappyLoaded = false;
-
-  static {
-    if (NativeCodeLoader.isNativeCodeLoaded() &&
-        NativeCodeLoader.buildSupportsSnappy()) {
-      try {
-        initIDs();
-        nativeSnappyLoaded = true;
-      } catch (Throwable t) {
-        LOG.error("failed to load SnappyDecompressor", t);
-      }
-    }
-  }
-
-  public static boolean isNativeCodeLoaded() {
-    return nativeSnappyLoaded;
-  }
-
   /**
    * Creates a new compressor.
    *
    * @param directBufferSize size of the direct buffer to be used.
    */
   public SnappyDecompressor(int directBufferSize) {
+    // `snappy-java` is provided scope. We need to check if it is available.
+    try {
+      SnappyLoader.getVersion();

Review comment: `getVersion` is a static method and it doesn't involve loading of the native library described in `SnappyLoader`, so I guess it is fine? Otherwise, I don't find another proper one to check.
Issue Time Tracking
---
Worklog Id: (was: 491366)
Time Spent: 18h 50m (was: 18h 40m)
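Because snappy-java is a provided-scope dependency, the constructor above probes `SnappyLoader.getVersion()` and treats a failure as "the jar is not on the classpath". The same availability check can be done with the JDK alone via reflection; this `isAvailable` helper is an illustrative alternative to the `getVersion()` probe, not the code in the PR:

```java
public class OptionalDep {

  // Returns true if the named class can be resolved on the classpath.
  // initialize=false avoids running static initializers during the probe.
  static boolean isAvailable(String className) {
    try {
      Class.forName(className, false, OptionalDep.class.getClassLoader());
      return true;
    } catch (ClassNotFoundException | LinkageError e) {
      return false;
    }
  }

  public static void main(String[] args) {
    // The real probe would pass "org.xerial.snappy.Snappy" here.
    System.out.println(isAvailable("java.util.List"));       // true
    System.out.println(isAvailable("org.example.NoSuchLib")); // false
  }
}
```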
[jira] [Work logged] (HADOOP-17125) Using snappy-java in SnappyCodec
[ https://issues.apache.org/jira/browse/HADOOP-17125?focusedWorklogId=491365&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-491365 ]

ASF GitHub Bot logged work on HADOOP-17125:
---
Author: ASF GitHub Bot
Created on: 25/Sep/20 19:00
Start Date: 25/Sep/20 19:00
Worklog Time Spent: 10m

Work Description: sunchao commented on a change in pull request #2297:
URL: https://github.com/apache/hadoop/pull/2297#discussion_r495176908

## File path: hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/io/compress/CompressDecompressTester.java
## @@ -495,19 +479,16 @@ public String getName() {
       Compressor compressor = pair.compressor;
       if (compressor.getClass().isAssignableFrom(Lz4Compressor.class)
-          && (NativeCodeLoader.isNativeCodeLoaded()))

Review comment: Yeah, usually it's not recommended to include unrelated changes in a Hadoop patch; we may add another refactoring PR later if this is absolutely necessary.

Issue Time Tracking
---
Worklog Id: (was: 491365)
Time Spent: 18h 40m (was: 18.5h)
[jira] [Work logged] (HADOOP-17125) Using snappy-java in SnappyCodec
[ https://issues.apache.org/jira/browse/HADOOP-17125?focusedWorklogId=491364&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-491364 ]

ASF GitHub Bot logged work on HADOOP-17125:
---
Author: ASF GitHub Bot
Created on: 25/Sep/20 18:58
Start Date: 25/Sep/20 18:58
Worklog Time Spent: 10m

Work Description: viirya commented on a change in pull request #2297:
URL: https://github.com/apache/hadoop/pull/2297#discussion_r495175908

## File path: hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/io/compress/CompressDecompressTester.java
## @@ -495,19 +479,16 @@ public String getName() {
       Compressor compressor = pair.compressor;
       if (compressor.getClass().isAssignableFrom(Lz4Compressor.class)
-          && (NativeCodeLoader.isNativeCodeLoaded()))

Review comment: Oh, this is from @dbtsai's original change. I think adding curly brackets is better? I can revert this if you think it is necessary.

Issue Time Tracking
---
Worklog Id: (was: 491364)
Time Spent: 18.5h (was: 18h 20m)
[jira] [Work logged] (HADOOP-17125) Using snappy-java in SnappyCodec
[ https://issues.apache.org/jira/browse/HADOOP-17125?focusedWorklogId=491362&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-491362 ]

ASF GitHub Bot logged work on HADOOP-17125:
---
Author: ASF GitHub Bot
Created on: 25/Sep/20 18:55
Start Date: 25/Sep/20 18:55
Worklog Time Spent: 10m

Work Description: sunchao commented on pull request #2297:
URL: https://github.com/apache/hadoop/pull/2297#issuecomment-699097890

(nit nit) we may need to update the [NativeLibraries guide](https://github.com/apache/hadoop/blob/trunk/hadoop-common-project/hadoop-common/src/site/markdown/NativeLibraries.md.vm) as well since `NativeLibraryChecker` no longer checks snappy.

Issue Time Tracking
---
Worklog Id: (was: 491362)
Time Spent: 18h 20m (was: 18h 10m)
[jira] [Work logged] (HADOOP-17125) Using snappy-java in SnappyCodec
[ https://issues.apache.org/jira/browse/HADOOP-17125?focusedWorklogId=491359&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-491359 ]

ASF GitHub Bot logged work on HADOOP-17125:
---
Author: ASF GitHub Bot
Created on: 25/Sep/20 18:51
Start Date: 25/Sep/20 18:51
Worklog Time Spent: 10m

Work Description: sunchao commented on a change in pull request #2297:
URL: https://github.com/apache/hadoop/pull/2297#discussion_r495158362

## File path: hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/io/compress/snappy/SnappyCompressor.java
## @@ -291,9 +283,17 @@ public long getBytesWritten() {
   public void end() {
   }

-  private native static void initIDs();
-
-  private native int compressBytesDirect();
-
-  public native static String getLibraryName();
+  private int compressDirectBuf() throws IOException {
+    if (uncompressedDirectBufLen == 0) {
+      return 0;
+    } else {
+      // Set the position and limit of `uncompressedDirectBuf` for reading
+      uncompressedDirectBuf.limit(uncompressedDirectBufLen).position(0);
+      int size = Snappy.compress((ByteBuffer) uncompressedDirectBuf,
+          (ByteBuffer) compressedDirectBuf);
+      uncompressedDirectBufLen = 0;
+      uncompressedDirectBuf.limit(uncompressedDirectBuf.capacity()).position(0);

Review comment: nit: this seems unnecessary as `clear` is called shortly after at the call site?

## File path: hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/io/compress/snappy/SnappyDecompressor.java
## @@ -276,10 +268,20 @@ public void end() {
     // do nothing
   }

-  private native static void initIDs();
+  private int decompressDirectBuf() throws IOException {
+    if (compressedDirectBufLen == 0) {
+      return 0;
+    } else {
+      // Set the position and limit of `compressedDirectBuf` for reading
+      compressedDirectBuf.limit(compressedDirectBufLen).position(0);
+      int size = Snappy.uncompress((ByteBuffer) compressedDirectBuf,
+          (ByteBuffer) uncompressedDirectBuf);
+      compressedDirectBufLen = 0;
+      compressedDirectBuf.limit(compressedDirectBuf.capacity()).position(0);

Review comment: nit: can we just call `compressedDirectBuf.clear()`?

## File path: hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/io/compress/snappy/TestSnappyCompressorDecompressor.java
## @@ -446,4 +442,49 @@ public void doWork() throws Exception {
     ctx.waitFor(6);
   }
+
+  @Test
+  public void testSnappyCompatibility() throws Exception {
+    // HADOOP-17125. Using snappy-java in SnappyCodec. These strings are raw data and compressed data
+    // using previous native Snappy codec. We use updated Snappy codec to decode it and check if it
+    // matches.
+    String rawData = "010a06030a040a0c0109020c0a010204020d02000b010701080605080b090902060a08050206" +
+        "0a0d06070908080a0c0105030904090d05090800040c090c0d0d0804000d00040b0b0d010d060907020a0" +
+        "30a0c0900040905080107040d0c01060a0b09070a04000b01040b09000e0e00020b06050b060e030e0a07" +
+        "050d06050d";
+    String compressed = "8001f07f010a06030a040a0c0109020c0a010204020d02000b010701080605080b0909020" +
+        "60a080502060a0d06070908080a0c0105030904090d05090800040c090c0d0d0804000d00040b0b0d010d" +
+        "060907020a030a0c0900040905080107040d0c01060a0b09070a04000b01040b09000e0e00020b06050b0" +
+        "60e030e0a07050d06050d";
+
+    byte[] rawDataBytes = Hex.decodeHex(rawData);
+    byte[] compressedBytes = Hex.decodeHex(compressed);
+
+    ByteBuffer inBuf = ByteBuffer.allocateDirect(compressedBytes.length);
+    inBuf.put(compressedBytes, 0, compressedBytes.length);
+    inBuf.flip();
+
+    ByteBuffer outBuf = ByteBuffer.allocateDirect(rawDataBytes.length);
+    ByteBuffer expected = ByteBuffer.wrap(rawDataBytes);
+
+    SnappyDecompressor.SnappyDirectDecompressor decompressor = new SnappyDecompressor.SnappyDirectDecompressor();

Review comment: nit: long lines (80 chars).

## File path: hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/io/compress/CompressDecompressTester.java
## @@ -495,19 +479,16 @@ public String getName() {
       Compressor compressor = pair.compressor;
       if (compressor.getClass().isAssignableFrom(Lz4Compressor.class)
-          && (NativeCodeLoader.isNativeCodeLoaded()))

Review comment: nit: unrelated changes :)

## File path: hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/io/compress/snappy/SnappyDecompressor.java
## @@ -45,30 +46,21 @@
   private int userBufOff = 0, userBufLen = 0;
   private boolean finished;

-  private static boolean nativeSnappyLoaded = false;
-
-  static {
-    if (NativeCodeLoader.isNativeCodeLoaded() &&
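Two of the review nits above ask whether the manual `limit(capacity()).position(0)` reset can simply be `clear()`. For a `ByteBuffer`, `clear()` resets position to 0 and limit to capacity without touching the contents (it additionally discards the mark), so the two idioms leave the buffer in the same readable/writable state. A small check of that equivalence, with illustrative names:

```java
import java.nio.ByteBuffer;

public class ClearEquivalence {

  // Returns true if the two reset idioms leave equally-sized buffers
  // in the same position/limit state after `used` bytes were written.
  static boolean sameReset(int capacity, int used) {
    ByteBuffer a = ByteBuffer.allocate(capacity);
    ByteBuffer b = ByteBuffer.allocate(capacity);
    a.position(used);                    // simulate prior writes
    b.position(used);
    a.limit(a.capacity()).position(0);   // idiom used in the patch
    b.clear();                           // reviewer's suggested clear()
    return a.position() == b.position() && a.limit() == b.limit();
  }

  public static void main(String[] args) {
    System.out.println(sameReset(32, 10)); // true
  }
}
```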
[jira] [Work logged] (HADOOP-17125) Using snappy-java in SnappyCodec
[ https://issues.apache.org/jira/browse/HADOOP-17125?focusedWorklogId=491318&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-491318 ]

ASF GitHub Bot logged work on HADOOP-17125:
---
Author: ASF GitHub Bot
Created on: 25/Sep/20 16:54
Start Date: 25/Sep/20 16:54
Worklog Time Spent: 10m

Work Description: viirya commented on pull request #2297:
URL: https://github.com/apache/hadoop/pull/2297#issuecomment-699039044

@saintstack @steveloughran Does this change look good to you now? Based on this change, we will work on the compression module as https://github.com/apache/hadoop/pull/2159#issuecomment-698212693 suggests. Thanks.

Issue Time Tracking
---
Worklog Id: (was: 491318)
Time Spent: 18h (was: 17h 50m)
[jira] [Work logged] (HADOOP-17125) Using snappy-java in SnappyCodec
[ https://issues.apache.org/jira/browse/HADOOP-17125?focusedWorklogId=491132&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-491132 ]

ASF GitHub Bot logged work on HADOOP-17125:
---
Author: ASF GitHub Bot
Created on: 25/Sep/20 13:45
Start Date: 25/Sep/20 13:45
Worklog Time Spent: 10m

Work Description: dbtsai commented on a change in pull request #2297:
URL: https://github.com/apache/hadoop/pull/2297#discussion_r494537597

## File path: hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/io/compress/snappy/SnappyCompressor.java
## @@ -291,9 +283,17 @@ public long getBytesWritten() {
   public void end() {
   }

-  private native static void initIDs();
-
-  private native int compressBytesDirect();
-
-  public native static String getLibraryName();
+  private int compressDirectBuf() throws IOException {
+    if (uncompressedDirectBufLen == 0) {
+      return 0;
+    } else {
+      // Set the position and limit of `uncompressedDirectBuf` for reading
+      uncompressedDirectBuf.limit(uncompressedDirectBufLen).position(0);
+      int size = Snappy.compress((ByteBuffer) uncompressedDirectBuf,
+          (ByteBuffer) compressedDirectBuf);
+      uncompressedDirectBufLen = 0;
+      uncompressedDirectBuf.limit(directBufferSize).position(0);

Review comment: nit, `uncompressedDirectBuf.limit(uncompressedDirectBuf.capacity()).position(0);` for safety.

## File path: hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/io/compress/CompressDecompressTester.java
## @@ -432,7 +412,11 @@ public void assertCompression(String name, Compressor compressor,
         joiner.join(name, "byte arrays not equals error !!!"),
         originalRawData, decompressOut.toByteArray());
     } catch (Exception ex) {
-      fail(joiner.join(name, ex.getMessage()));
+      if (ex.getMessage() != null) {
+        fail(joiner.join(name, ex.getMessage()));
+      } else {
+        fail(joiner.join(name, ExceptionUtils.getStackTrace(ex)));

Review comment: Why is this change needed?

Issue Time Tracking
---
Worklog Id: (was: 491132)
Time Spent: 17h 50m (was: 17h 40m)
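The test change dbtsai questions above falls back to the full stack trace when `ex.getMessage()` is null, which happens for exceptions constructed without a message (a no-arg `NullPointerException`, for instance). What commons-lang's `ExceptionUtils.getStackTrace` does can be approximated with the JDK alone; this `stackTraceOf` helper is an illustrative stand-in, not the Hadoop code:

```java
import java.io.PrintWriter;
import java.io.StringWriter;

public class StackTraceText {

  // Renders a throwable's full stack trace as a String, like
  // commons-lang's ExceptionUtils.getStackTrace(t).
  static String stackTraceOf(Throwable t) {
    StringWriter sw = new StringWriter();
    t.printStackTrace(new PrintWriter(sw, true));
    return sw.toString();
  }

  public static void main(String[] args) {
    Exception noMessage = new IllegalStateException(); // getMessage() == null
    String detail = noMessage.getMessage() != null
        ? noMessage.getMessage()
        : stackTraceOf(noMessage);
    System.out.println(detail.contains("IllegalStateException")); // true
  }
}
```

This is why the fallback improves the test failure report: without it, `fail(joiner.join(name, null))` would print "null" and hide the actual cause.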
[jira] [Work logged] (HADOOP-17125) Using snappy-java in SnappyCodec
[ https://issues.apache.org/jira/browse/HADOOP-17125?focusedWorklogId=490944&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-490944 ]

ASF GitHub Bot logged work on HADOOP-17125:
---
Author: ASF GitHub Bot
Created on: 25/Sep/20 13:28
Start Date: 25/Sep/20 13:28
Worklog Time Spent: 10m

Work Description: steveloughran commented on pull request #2297:
URL: https://github.com/apache/hadoop/pull/2297#issuecomment-698298972

So actually snappy was already a dependency of hadoop-common? Interesting

Issue Time Tracking
---
Worklog Id: (was: 490944)
Time Spent: 17h 40m (was: 17.5h)
[jira] [Work logged] (HADOOP-17125) Using snappy-java in SnappyCodec
[ https://issues.apache.org/jira/browse/HADOOP-17125?focusedWorklogId=490867&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-490867 ]

ASF GitHub Bot logged work on HADOOP-17125:
---
Author: ASF GitHub Bot
Created on: 25/Sep/20 13:21
Start Date: 25/Sep/20 13:21
Worklog Time Spent: 10m

Work Description: saintstack commented on pull request #2297:
URL: https://github.com/apache/hadoop/pull/2297#issuecomment-698423282

@steveloughran if snappy native was a dependency for hbase-common, do you think that argues snappy-java could be (having it as 'provided' undoes the main benefit of using the snappy-java jar instead of the native snappy libs). Thanks Steve.

Issue Time Tracking
---
Worklog Id: (was: 490867)
Time Spent: 17h 10m (was: 17h)
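The 'provided' scope saintstack refers to means hadoop-common compiles against snappy-java but does not pull the jar onto downstream runtime classpaths; users would have to ship it themselves, which is the benefit he argues is lost. A minimal sketch of the two options in Maven terms (the coordinates are the real snappy-java ones; the version number is illustrative):

```xml
<!-- Option discussed in the thread: provided scope. Available at compile
     time only; downstream users must supply snappy-java themselves. -->
<dependency>
  <groupId>org.xerial.snappy</groupId>
  <artifactId>snappy-java</artifactId>
  <version>1.1.7.7</version> <!-- illustrative version -->
  <scope>provided</scope>
</dependency>

<!-- Alternative: default (compile) scope, so the jar travels with
     hadoop-common transitively and no extra setup is needed. -->
```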
[jira] [Work logged] (HADOOP-17125) Using snappy-java in SnappyCodec
[ https://issues.apache.org/jira/browse/HADOOP-17125?focusedWorklogId=490922=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-490922 ]

ASF GitHub Bot logged work on HADOOP-17125:
Author: ASF GitHub Bot
Created on: 25/Sep/20 13:26
Start Date: 25/Sep/20 13:26
Worklog Time Spent: 10m

Work Description: hadoop-yetus removed a comment on pull request #2297:
URL: https://github.com/apache/hadoop/pull/2297#issuecomment-690765839

Issue Time Tracking --- Worklog Id: (was: 490922) Time Spent: 17.5h (was: 17h 20m)
[jira] [Work logged] (HADOOP-17125) Using snappy-java in SnappyCodec
[ https://issues.apache.org/jira/browse/HADOOP-17125?focusedWorklogId=490911=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-490911 ]

ASF GitHub Bot logged work on HADOOP-17125:
Author: ASF GitHub Bot
Created on: 25/Sep/20 13:25
Start Date: 25/Sep/20 13:25
Worklog Time Spent: 10m

Work Description: viirya commented on a change in pull request #2297:
URL: https://github.com/apache/hadoop/pull/2297#discussion_r494069586

## File path: hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/io/compress/snappy/TestSnappyCompressorDecompressor.java
## @@ -446,4 +442,43 @@ public void doWork() throws Exception { ctx.waitFor(6); } + + @Test + public void testSnappyCompatibility() throws Exception { +// HADOOP-17125. Using snappy-java in SnappyCodec. These strings are raw data and compressed data +// using previous native Snappy codec. We use updated Snappy codec to decode it and check if it +// matches. +String rawData = "010a06030a040a0c0109020c0a010204020d02000b010701080605080b090902060a080502060a0d06070908080a0c0105030904090d05090800040c090c0d0d0804000d00040b0b0d010d060907020a030a0c0900040905080107040d0c01060a0b09070a04000b01040b09000e0e00020b06050b060e030e0a07050d06050d";
Review comment: String is to make the test as simple as possible. Maybe further shorten the string?

## File path: hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/io/compress/snappy/SnappyCompressor.java
## @@ -291,9 +282,17 @@ public long getBytesWritten() { public void end() { } - private native static void initIDs(); - - private native int compressBytesDirect(); - - public native static String getLibraryName(); + private int compressBytesDirect() throws IOException {
Review comment: This `compressBytesDirect` and `decompressBytesDirect` basically are copied from original method names. `compressDirectBuf` and `decompressDirectBuf` looks good to me.

## File path: hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/io/compress/snappy/SnappyCompressor.java
## @@ -48,30 +49,20 @@ private long bytesRead = 0L; private long bytesWritten = 0L; - private static boolean nativeSnappyLoaded = false; - - static { -if (NativeCodeLoader.isNativeCodeLoaded() && -NativeCodeLoader.buildSupportsSnappy()) { - try { -initIDs(); -nativeSnappyLoaded = true; - } catch (Throwable t) { -LOG.error("failed to load SnappyCompressor", t); - } -} - } - - public static boolean isNativeCodeLoaded() { -return nativeSnappyLoaded; - } - /** * Creates a new compressor. * * @param directBufferSize size of the direct buffer to be used. */ public SnappyCompressor(int directBufferSize) { +// `snappy-java` is provided scope. We need to check if its availability. +try { + SnappyLoader.getVersion(); +} catch (Throwable t) { + throw new RuntimeException("native snappy library not available: " +
Review comment: It is java-snappy jar, yeah, I will review the message.

## File path: hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/io/compress/snappy/SnappyCompressor.java
## @@ -48,30 +49,20 @@ public SnappyCompressor(int directBufferSize) { +// `snappy-java` is provided scope. We need to check if its availability.
Review comment: Oops, thanks.
[jira] [Work logged] (HADOOP-17125) Using snappy-java in SnappyCodec
[ https://issues.apache.org/jira/browse/HADOOP-17125?focusedWorklogId=490810=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-490810 ]

ASF GitHub Bot logged work on HADOOP-17125:
Author: ASF GitHub Bot
Created on: 25/Sep/20 13:15
Start Date: 25/Sep/20 13:15
Worklog Time Spent: 10m

Work Description: saintstack commented on a change in pull request #2297:
URL: https://github.com/apache/hadoop/pull/2297#discussion_r494054389

## File path: hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/io/compress/snappy/SnappyCompressor.java
## @@ -48,30 +49,20 @@ private long bytesRead = 0L; private long bytesWritten = 0L; - private static boolean nativeSnappyLoaded = false; - - static { -if (NativeCodeLoader.isNativeCodeLoaded() && -NativeCodeLoader.buildSupportsSnappy()) { - try { -initIDs(); -nativeSnappyLoaded = true; - } catch (Throwable t) { -LOG.error("failed to load SnappyCompressor", t); - } -} - } - - public static boolean isNativeCodeLoaded() { -return nativeSnappyLoaded; - } - /** * Creates a new compressor. * * @param directBufferSize size of the direct buffer to be used. */ public SnappyCompressor(int directBufferSize) { +// `snappy-java` is provided scope. We need to check if its availability. +try { + SnappyLoader.getVersion(); +} catch (Throwable t) { + throw new RuntimeException("native snappy library not available: " +
Review comment: Is it the 'native snappy library' that is missing or the java-snappy jar?

## File path: hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/io/compress/snappy/SnappyCompressor.java
## @@ -48,30 +49,20 @@ public SnappyCompressor(int directBufferSize) { +// `snappy-java` is provided scope. We need to check if its availability.
Review comment: Fix this last sentence if you make a new PR

## File path: hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/io/compress/snappy/TestSnappyCompressorDecompressor.java
## @@ -446,4 +442,43 @@ public void doWork() throws Exception { ctx.waitFor(6); } + + @Test + public void testSnappyCompatibility() throws Exception { +// HADOOP-17125. Using snappy-java in SnappyCodec. These strings are raw data and compressed data +// using previous native Snappy codec. We use updated Snappy codec to decode it and check if it +// matches. +String rawData = "010a06030a040a0c0109020c0a010204020d02000b010701080605080b090902060a080502060a0d06070908080a0c0105030904090d05090800040c090c0d0d0804000d00040b0b0d010d060907020a030a0c0900040905080107040d0c01060a0b09070a04000b01040b09000e0e00020b06050b060e030e0a07050d06050d";
Review comment: hmm... this is a little anemic. Have you considered adding a data file that is a little more interesting than this?

## File path: hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/io/compress/snappy/SnappyDecompressor.java
## @@ -276,10 +267,20 @@ public void end() { // do nothing } - private native static void initIDs(); + private int decompressBytesDirect() throws IOException {
Review comment: ditto

## File path: hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/io/compress/snappy/SnappyDecompressor.java
## @@ -45,30 +46,20 @@ private int userBufOff = 0, userBufLen = 0; private boolean finished; - private static boolean nativeSnappyLoaded = false; - - static { -if (NativeCodeLoader.isNativeCodeLoaded() && -NativeCodeLoader.buildSupportsSnappy()) { - try { -initIDs(); -nativeSnappyLoaded = true; - } catch (Throwable t) { -LOG.error("failed to load SnappyDecompressor", t); - } -} - } - - public static boolean isNativeCodeLoaded() { -return nativeSnappyLoaded; - } - /** * Creates a new compressor. * * @param directBufferSize size of the direct buffer to be used. */ public SnappyDecompressor(int directBufferSize) { +// `snappy-java` is provided scope. We need to check if its availability. +try { +
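The review question above (is it the native snappy library or the snappy-java jar that is missing?) exists because snappy-java is a "provided"-scope dependency: it may simply be absent at runtime. A minimal stdlib sketch of the fail-fast probe pattern under discussion; the snappy-java class name is the library's real entry point, but the surrounding class and method names are illustrative, not Hadoop's actual code:

```java
public class ProvidedDependencyCheck {
  // Probe for a class on the classpath without initializing it. With
  // snappy-java in "provided" scope, the jar may be missing at runtime,
  // so a codec constructor can fail fast with a clear message instead
  // of a NoClassDefFoundError deep inside a compress() call.
  static boolean isAvailable(String className) {
    try {
      Class.forName(className, /* initialize = */ false,
          ProvidedDependencyCheck.class.getClassLoader());
      return true;
    } catch (Throwable t) {
      return false;
    }
  }

  static void requireSnappyJava() {
    if (!isAvailable("org.xerial.snappy.Snappy")) {
      // Message names the jar, not the native library, per the review.
      throw new RuntimeException(
          "snappy-java jar is not available on the classpath");
    }
  }
}
```

The distinction matters for the error message: snappy-java bundles its own natives, so when loading fails at this point it is the jar that is absent, not a system libsnappy.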
[jira] [Work logged] (HADOOP-17125) Using snappy-java in SnappyCodec
[ https://issues.apache.org/jira/browse/HADOOP-17125?focusedWorklogId=490786=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-490786 ]

ASF GitHub Bot logged work on HADOOP-17125:
Author: ASF GitHub Bot
Created on: 25/Sep/20 13:12
Start Date: 25/Sep/20 13:12
Worklog Time Spent: 10m

Work Description: steveloughran commented on a change in pull request #2297:
URL: https://github.com/apache/hadoop/pull/2297#discussion_r494256866

## File path: hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/io/compress/snappy/TestSnappyCompressorDecompressor.java
## @@ -446,4 +442,43 @@ public void doWork() throws Exception { ctx.waitFor(6); } + + @Test + public void testSnappyCompatibility() throws Exception { +// HADOOP-17125. Using snappy-java in SnappyCodec. These strings are raw data and compressed data +// using previous native Snappy codec. We use updated Snappy codec to decode it and check if it +// matches. +String rawData = "010a06030a040a0c0109020c0a010204020d02000b010701080605080b090902060a080502060a0d06070908080a0c0105030904090d05090800040c090c0d0d0804000d00040b0b0d010d060907020a030a0c0900040905080107040d0c01060a0b09070a04000b01040b09000e0e00020b06050b060e030e0a07050d06050d";
Review comment: should be split across lines, but otherwise fine inline - simpler for tests

Issue Time Tracking --- Worklog Id: (was: 490786) Time Spent: 16h 50m (was: 16h 40m)
[jira] [Work logged] (HADOOP-17125) Using snappy-java in SnappyCodec
[ https://issues.apache.org/jira/browse/HADOOP-17125?focusedWorklogId=490732=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-490732 ]

ASF GitHub Bot logged work on HADOOP-17125:
Author: ASF GitHub Bot
Created on: 25/Sep/20 13:07
Start Date: 25/Sep/20 13:07
Worklog Time Spent: 10m

Work Description: viirya commented on pull request #2297:
URL: https://github.com/apache/hadoop/pull/2297#issuecomment-698516626

@saintstack @steveloughran Thanks for the reviews. I think I addressed the latest comments.

Issue Time Tracking --- Worklog Id: (was: 490732) Time Spent: 16h 40m (was: 16.5h)
[jira] [Work logged] (HADOOP-17125) Using snappy-java in SnappyCodec
[ https://issues.apache.org/jira/browse/HADOOP-17125?focusedWorklogId=490387=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-490387 ]

ASF GitHub Bot logged work on HADOOP-17125:
Author: ASF GitHub Bot
Created on: 24/Sep/20 19:10
Start Date: 24/Sep/20 19:10
Worklog Time Spent: 10m

Work Description: viirya commented on a change in pull request #2297:
URL: https://github.com/apache/hadoop/pull/2297#discussion_r494550809

## File path: hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/io/compress/snappy/SnappyCompressor.java
## @@ -291,9 +283,17 @@ public long getBytesWritten() { public void end() { } - private native static void initIDs(); - - private native int compressBytesDirect(); - - public native static String getLibraryName(); + private int compressDirectBuf() throws IOException { +if (uncompressedDirectBufLen == 0) { + return 0; +} else { + // Set the position and limit of `uncompressedDirectBuf` for reading + uncompressedDirectBuf.limit(uncompressedDirectBufLen).position(0); + int size = Snappy.compress((ByteBuffer) uncompressedDirectBuf, + (ByteBuffer) compressedDirectBuf); + uncompressedDirectBufLen = 0; + uncompressedDirectBuf.limit(directBufferSize).position(0);
Review comment: done. thanks.

Issue Time Tracking --- Worklog Id: (was: 490387) Time Spent: 16.5h (was: 16h 20m)
[jira] [Work logged] (HADOOP-17125) Using snappy-java in SnappyCodec
[ https://issues.apache.org/jira/browse/HADOOP-17125?focusedWorklogId=490375=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-490375 ]

ASF GitHub Bot logged work on HADOOP-17125:
Author: ASF GitHub Bot
Created on: 24/Sep/20 18:54
Start Date: 24/Sep/20 18:54
Worklog Time Spent: 10m

Work Description: viirya commented on a change in pull request #2297:
URL: https://github.com/apache/hadoop/pull/2297#discussion_r494542249

## File path: hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/io/compress/CompressDecompressTester.java
## @@ -432,7 +412,11 @@ public void assertCompression(String name, Compressor compressor, joiner.join(name, "byte arrays not equals error !!!"), originalRawData, decompressOut.toByteArray()); } catch (Exception ex) { - fail(joiner.join(name, ex.getMessage())); + if (ex.getMessage() != null) { +fail(joiner.join(name, ex.getMessage())); + } else { +fail(joiner.join(name, ExceptionUtils.getStackTrace(ex)));
Review comment: When I first took over this change, the test failed with NPE without any details. It is because the exception thrown returns null from `getMessage()`. `joiner.join(name, null)` causes the NPE, so I changed it to print stack trace once `getMessage()` returns null. It's better for debugging.

Issue Time Tracking --- Worklog Id: (was: 490375) Time Spent: 16h 20m (was: 16h 10m)
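The failure mode viirya describes above (an exception whose `getMessage()` is null makes the Guava joiner throw its own NPE and swallow the real failure) can be sketched with just the JDK. The class and method names here are illustrative, not the actual test-harness code:

```java
import java.io.PrintWriter;
import java.io.StringWriter;

public class FailureMessage {
  // Build a failure message for a test assertion. Some exceptions
  // (e.g. a bare NullPointerException) return null from getMessage();
  // joining that null blindly loses all diagnostics, so fall back to
  // the full stack trace, mirroring the fix in the PR.
  static String describe(String name, Exception ex) {
    String detail = ex.getMessage();
    if (detail == null) {
      StringWriter sw = new StringWriter();
      ex.printStackTrace(new PrintWriter(sw));
      detail = sw.toString(); // full stack trace instead of "null"
    }
    return name + ": " + detail;
  }
}
```

With a message, `describe("snappy", new Exception("boom"))` yields "snappy: boom"; with a bare NPE, the result contains the exception class and stack frames, which is exactly what makes the failure debuggable.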
[jira] [Work logged] (HADOOP-17125) Using snappy-java in SnappyCodec
[ https://issues.apache.org/jira/browse/HADOOP-17125?focusedWorklogId=490372=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-490372 ]

ASF GitHub Bot logged work on HADOOP-17125:
Author: ASF GitHub Bot
Created on: 24/Sep/20 18:48
Start Date: 24/Sep/20 18:48
Worklog Time Spent: 10m

Work Description: dbtsai commented on a change in pull request #2297:
URL: https://github.com/apache/hadoop/pull/2297#discussion_r494539014

## File path: hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/io/compress/CompressDecompressTester.java
## @@ -432,7 +412,11 @@ public void assertCompression(String name, Compressor compressor, joiner.join(name, "byte arrays not equals error !!!"), originalRawData, decompressOut.toByteArray()); } catch (Exception ex) { - fail(joiner.join(name, ex.getMessage())); + if (ex.getMessage() != null) { +fail(joiner.join(name, ex.getMessage())); + } else { +fail(joiner.join(name, ExceptionUtils.getStackTrace(ex)));
Review comment: Why is this change needed?

Issue Time Tracking --- Worklog Id: (was: 490372) Time Spent: 16h 10m (was: 16h)
[jira] [Work logged] (HADOOP-17125) Using snappy-java in SnappyCodec
[ https://issues.apache.org/jira/browse/HADOOP-17125?focusedWorklogId=490370=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-490370 ]

ASF GitHub Bot logged work on HADOOP-17125:
Author: ASF GitHub Bot
Created on: 24/Sep/20 18:46
Start Date: 24/Sep/20 18:46
Worklog Time Spent: 10m

Work Description: dbtsai commented on a change in pull request #2297:
URL: https://github.com/apache/hadoop/pull/2297#discussion_r494537597

## File path: hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/io/compress/snappy/SnappyCompressor.java
## @@ -291,9 +283,17 @@ public long getBytesWritten() { public void end() { } - private native static void initIDs(); - - private native int compressBytesDirect(); - - public native static String getLibraryName(); + private int compressDirectBuf() throws IOException { +if (uncompressedDirectBufLen == 0) { + return 0; +} else { + // Set the position and limit of `uncompressedDirectBuf` for reading + uncompressedDirectBuf.limit(uncompressedDirectBufLen).position(0); + int size = Snappy.compress((ByteBuffer) uncompressedDirectBuf, + (ByteBuffer) compressedDirectBuf); + uncompressedDirectBufLen = 0; + uncompressedDirectBuf.limit(directBufferSize).position(0);
Review comment: nit, `uncompressedDirectBuf.limit(uncompressedDirectBuf.capacity()).position(0);` for safety.

Issue Time Tracking --- Worklog Id: (was: 490370) Time Spent: 16h (was: 15h 50m)
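The position/limit handling under review can be isolated into a small stdlib sketch: open a reading window of `len` bytes over a direct buffer, then reset it for the next fill. The class and method names are illustrative; only the buffer mechanics match the PR's `compressDirectBuf`:

```java
import java.nio.ByteBuffer;

public class DirectBufWindow {
  // Expose the first `len` bytes of `buf` for reading, then reset the
  // buffer for the next write. Resetting with capacity() (dbtsai's
  // suggestion above) rather than a stored size field is safer: it
  // cannot drift from the size of the buffer actually allocated.
  static int readableBytes(ByteBuffer buf, int len) {
    buf.limit(len).position(0);            // reading window: [0, len)
    int remaining = buf.remaining();       // == len
    buf.limit(buf.capacity()).position(0); // ready for the next fill
    return remaining;
  }
}
```

This is the same dance `Snappy.compress(ByteBuffer, ByteBuffer)` relies on: the library reads between position and limit, so the window must be set before the call and restored afterwards.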
[jira] [Work logged] (HADOOP-17125) Using snappy-java in SnappyCodec
[ https://issues.apache.org/jira/browse/HADOOP-17125?focusedWorklogId=490362=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-490362 ] ASF GitHub Bot logged work on HADOOP-17125: --- Author: ASF GitHub Bot Created on: 24/Sep/20 18:36 Start Date: 24/Sep/20 18:36 Worklog Time Spent: 10m Work Description: viirya commented on pull request #2297: URL: https://github.com/apache/hadoop/pull/2297#issuecomment-698516626 @saintstack @steveloughran Thanks for the reviews. I think I addressed the latest comments. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 490362) Time Spent: 15h 40m (was: 15.5h) > Using snappy-java in SnappyCodec > > > Key: HADOOP-17125 > URL: https://issues.apache.org/jira/browse/HADOOP-17125 > Project: Hadoop Common > Issue Type: New Feature > Components: common >Affects Versions: 3.3.0 >Reporter: DB Tsai >Priority: Major > Labels: pull-request-available > Time Spent: 15h 40m > Remaining Estimate: 0h > > In Hadoop, we use native libs for snappy codec which has several > disadvantages: > * It requires native *libhadoop* and *libsnappy* to be installed in system > *LD_LIBRARY_PATH*, and they have to be installed separately on each node of > the clusters, container images, or local test environments which adds huge > complexities from deployment point of view. In some environments, it requires > compiling the natives from sources which is non-trivial. Also, this approach > is platform dependent; the binary may not work in different platform, so it > requires recompilation. > * It requires extra configuration of *java.library.path* to load the > natives, and it results higher application deployment and maintenance cost > for users. 
> Projects such as *Spark* and *Parquet* use [snappy-java|https://github.com/xerial/snappy-java], which is a JNI-based implementation. It bundles native binaries for Linux, Mac, and IBM platforms inside the jar file and can automatically load them into the JVM from the jar without any setup. If a native implementation cannot be found for a platform, it can fall back to a pure-Java implementation of Snappy based on [aircompressor|https://github.com/airlift/aircompressor/tree/master/src/main/java/io/airlift/compress/snappy].

--
This message was sent by Atlassian Jira (v8.3.4#803005)
-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Work logged] (HADOOP-17125) Using snappy-java in SnappyCodec
[ https://issues.apache.org/jira/browse/HADOOP-17125?focusedWorklogId=490363&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-490363 ]

ASF GitHub Bot logged work on HADOOP-17125:
---
Author: ASF GitHub Bot
Created on: 24/Sep/20 18:36
Start Date: 24/Sep/20 18:36
Worklog Time Spent: 10m
Work Description: viirya commented on a change in pull request #2297:
URL: https://github.com/apache/hadoop/pull/2297#discussion_r494532226

## File path: hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/io/compress/snappy/TestSnappyCompressorDecompressor.java ##

@@ -446,4 +442,43 @@ public void doWork() throws Exception {
     ctx.waitFor(6);
   }
+
+  @Test
+  public void testSnappyCompatibility() throws Exception {
+    // HADOOP-17125. Using snappy-java in SnappyCodec. These strings are raw data and compressed data
+    // using previous native Snappy codec. We use updated Snappy codec to decode it and check if it
+    // matches.
+    String rawData = "010a06030a040a0c0109020c0a010204020d02000b010701080605080b090902060a080502060a0d06070908080a0c0105030904090d05090800040c090c0d0d0804000d00040b0b0d010d060907020a030a0c0900040905080107040d0c01060a0b09070a04000b01040b09000e0e00020b06050b060e030e0a07050d06050d";

Review comment: Ok, I split the long string. Thanks.
Issue Time Tracking --- Worklog Id: (was: 490363) Time Spent: 15h 50m (was: 15h 40m)
[jira] [Work logged] (HADOOP-17125) Using snappy-java in SnappyCodec
[ https://issues.apache.org/jira/browse/HADOOP-17125?focusedWorklogId=490289&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-490289 ]

ASF GitHub Bot logged work on HADOOP-17125:
---
Author: ASF GitHub Bot
Created on: 24/Sep/20 15:44
Start Date: 24/Sep/20 15:44
Worklog Time Spent: 10m
Work Description: viirya commented on a change in pull request #2297:
URL: https://github.com/apache/hadoop/pull/2297#discussion_r494070442

## File path: hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/io/compress/snappy/SnappyCompressor.java ##

@@ -48,30 +49,20 @@
   private long bytesRead = 0L;
   private long bytesWritten = 0L;

-  private static boolean nativeSnappyLoaded = false;
-
-  static {
-    if (NativeCodeLoader.isNativeCodeLoaded() &&
-        NativeCodeLoader.buildSupportsSnappy()) {
-      try {
-        initIDs();
-        nativeSnappyLoaded = true;
-      } catch (Throwable t) {
-        LOG.error("failed to load SnappyCompressor", t);
-      }
-    }
-  }
-
-  public static boolean isNativeCodeLoaded() {
-    return nativeSnappyLoaded;
-  }
-
   /**
    * Creates a new compressor.
    *
    * @param directBufferSize size of the direct buffer to be used.
    */
   public SnappyCompressor(int directBufferSize) {
+    // `snappy-java` is provided scope. We need to check if its availability.
+    try {
+      SnappyLoader.getVersion();
+    } catch (Throwable t) {
+      throw new RuntimeException("native snappy library not available: " +

Review comment: It is java-snappy jar, yeah, I will revise the message.
Issue Time Tracking --- Worklog Id: (was: 490289) Time Spent: 15.5h (was: 15h 20m)
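The availability check discussed in the review above (the constructor probing `SnappyLoader.getVersion()` because snappy-java is provided scope) boils down to asking whether a class can be loaded at runtime. A minimal, hypothetical sketch of that pattern follows; the class and method names here are illustrative, not Hadoop's actual code, and the class to probe is a parameter so the sketch runs without snappy-java on the classpath.

```java
// Sketch: probe for an optional, provided-scope dependency before using it.
// Hadoop's SnappyCompressor does this by calling SnappyLoader.getVersion();
// here we generalize to "can this class be loaded at all?".
public class ClasspathProbe {

    /** Returns true if the named class can be loaded from the classpath. */
    public static boolean isAvailable(String className) {
        try {
            Class.forName(className);
            return true;
        } catch (Throwable t) {
            // Covers ClassNotFoundException as well as NoClassDefFoundError
            // and native-link failures, mirroring the catch (Throwable t)
            // in the reviewed constructor.
            return false;
        }
    }

    public static void main(String[] args) {
        // Always present in the JDK, so always true:
        System.out.println(isAvailable("java.lang.String"));
        // The real codec would probe "org.xerial.snappy.SnappyLoader",
        // whose presence depends on the deployment:
        System.out.println(isAvailable("org.xerial.snappy.SnappyLoader"));
    }
}
```

Failing fast in the constructor, as the patch does, turns a missing jar into an immediate, explainable error instead of a later `NoClassDefFoundError` deep inside compression calls.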
[jira] [Work logged] (HADOOP-17125) Using snappy-java in SnappyCodec
[ https://issues.apache.org/jira/browse/HADOOP-17125?focusedWorklogId=490285&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-490285 ]

ASF GitHub Bot logged work on HADOOP-17125:
---
Author: ASF GitHub Bot
Created on: 24/Sep/20 15:37
Start Date: 24/Sep/20 15:37
Worklog Time Spent: 10m
Work Description: saintstack commented on pull request #2297:
URL: https://github.com/apache/hadoop/pull/2297#issuecomment-698423282

@steveloughran if snappy native was a dependency for hbase-common, do you think that argues snappy-java could be too? (Having it as 'provided' undoes the main benefit of using the snappy-java jar instead of the native snappy libs.) Thanks Steve.

Issue Time Tracking --- Worklog Id: (was: 490285) Time Spent: 15h 20m (was: 15h 10m)
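The scope trade-off saintstack raises above can be made concrete with a Maven fragment. This is an illustrative sketch, not Hadoop's actual pom (the version element is omitted); the coordinates are snappy-java's published Maven coordinates.

```xml
<!-- Illustrative pom.xml fragment, not Hadoop's actual build file. -->
<dependency>
  <groupId>org.xerial.snappy</groupId>
  <artifactId>snappy-java</artifactId>
  <!-- "provided": the jar is on the compile classpath but is NOT propagated
       to downstream consumers, so deployers must supply it themselves.
       The default "compile" scope would ship it transitively, preserving
       snappy-java's no-setup benefit discussed in this thread. -->
  <scope>provided</scope>
</dependency>
```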
[jira] [Work logged] (HADOOP-17125) Using snappy-java in SnappyCodec
[ https://issues.apache.org/jira/browse/HADOOP-17125?focusedWorklogId=490163&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-490163 ]

ASF GitHub Bot logged work on HADOOP-17125:
---
Author: ASF GitHub Bot
Created on: 24/Sep/20 12:00
Start Date: 24/Sep/20 12:00
Worklog Time Spent: 10m
Work Description: steveloughran commented on pull request #2297:
URL: https://github.com/apache/hadoop/pull/2297#issuecomment-698298972

So actually snappy was already a dependency of hadoop-common? Interesting.

Issue Time Tracking --- Worklog Id: (was: 490163) Time Spent: 15h 10m (was: 15h)
[jira] [Work logged] (HADOOP-17125) Using snappy-java in SnappyCodec
[ https://issues.apache.org/jira/browse/HADOOP-17125?focusedWorklogId=490160&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-490160 ]

ASF GitHub Bot logged work on HADOOP-17125:
---
Author: ASF GitHub Bot
Created on: 24/Sep/20 11:59
Start Date: 24/Sep/20 11:59
Worklog Time Spent: 10m
Work Description: hadoop-yetus removed a comment on pull request #2297:
URL: https://github.com/apache/hadoop/pull/2297#issuecomment-693868067

:broken_heart: **-1 overall**

| Vote | Subsystem | Runtime | Comment |
|::|--:|:|:|
| +0 :ok: | reexec | 29m 41s | Docker mode activated. |
||| _ Prechecks _ |
| +1 :green_heart: | dupname | 0m 0s | No case conflicting files found. |
| +1 :green_heart: | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 :green_heart: | test4tests | 0m 0s | The patch appears to include 5 new or modified test files. |
||| _ trunk Compile Tests _ |
| +0 :ok: | mvndep | 3m 20s | Maven dependency ordering for branch |
| +1 :green_heart: | mvninstall | 27m 3s | trunk passed |
| +1 :green_heart: | compile | 21m 43s | trunk passed with JDK Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1 |
| +1 :green_heart: | compile | 18m 24s | trunk passed with JDK Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 |
| +1 :green_heart: | checkstyle | 2m 50s | trunk passed |
| +1 :green_heart: | mvnsite | 2m 41s | trunk passed |
| +1 :green_heart: | shadedclient | 20m 17s | branch has no errors when building and testing our client artifacts. |
| +1 :green_heart: | javadoc | 1m 40s | trunk passed with JDK Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1 |
| +1 :green_heart: | javadoc | 2m 37s | trunk passed with JDK Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 |
| +0 :ok: | spotbugs | 0m 29s | Used deprecated FindBugs config; considering switching to SpotBugs. |
| +0 :ok: | findbugs | 0m 40s | branch/hadoop-project no findbugs output file (findbugsXml.xml) |
| +0 :ok: | findbugs | 0m 29s | branch/hadoop-project-dist no findbugs output file (findbugsXml.xml) |
||| _ Patch Compile Tests _ |
| +0 :ok: | mvndep | 0m 32s | Maven dependency ordering for patch |
| +1 :green_heart: | mvninstall | 1m 17s | the patch passed |
| +1 :green_heart: | compile | 20m 57s | the patch passed with JDK Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1 |
| -1 :x: | cc | 20m 57s | root-jdkUbuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1 with JDK Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1 generated 18 new + 145 unchanged - 18 fixed = 163 total (was 163) |
| +1 :green_heart: | golang | 20m 57s | the patch passed |
| +1 :green_heart: | javac | 20m 57s | the patch passed |
| +1 :green_heart: | compile | 18m 29s | the patch passed with JDK Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 |
| -1 :x: | cc | 18m 29s | root-jdkPrivateBuild-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 with JDK Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 generated 40 new + 123 unchanged - 40 fixed = 163 total (was 163) |
| +1 :green_heart: | golang | 18m 29s | the patch passed |
| +1 :green_heart: | javac | 18m 29s | the patch passed |
| -0 :warning: | checkstyle | 2m 54s | root: The patch generated 6 new + 151 unchanged - 5 fixed = 157 total (was 156) |
| +1 :green_heart: | mvnsite | 2m 39s | the patch passed |
| +1 :green_heart: | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 :green_heart: | xml | 0m 4s | The patch has no ill-formed XML file. |
| +1 :green_heart: | shadedclient | 14m 9s | patch has no errors when building and testing our client artifacts. |
| +1 :green_heart: | javadoc | 1m 37s | the patch passed with JDK Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1 |
| +1 :green_heart: | javadoc | 2m 39s | the patch passed with JDK Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 |
| +0 :ok: | findbugs | 0m 29s | hadoop-project has no data from findbugs |
| +0 :ok: | findbugs | 0m 34s | hadoop-project-dist has no data from findbugs |
||| _ Other Tests _ |
| +1 :green_heart: | unit | 0m 30s | hadoop-project in the patch passed. |
| +1 :green_heart: | unit | 0m 30s | hadoop-project-dist in the patch passed. |
| -1 :x: | unit | 10m 18s | hadoop-common in the patch passed. |
| +1 :green_heart: | asflicense | 0m 57s | The patch does not generate ASF License warnings. |
| | | 214m 44s | |

| Reason | Tests |
|---:|:--|
| Failed junit tests | hadoop.ha.TestZKFailoverController |

| Subsystem | Report/Notes |
[jira] [Work logged] (HADOOP-17125) Using snappy-java in SnappyCodec
[ https://issues.apache.org/jira/browse/HADOOP-17125?focusedWorklogId=490162&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-490162 ]

ASF GitHub Bot logged work on HADOOP-17125:
---
Author: ASF GitHub Bot
Created on: 24/Sep/20 12:00
Start Date: 24/Sep/20 12:00
Worklog Time Spent: 10m
Work Description: steveloughran commented on a change in pull request #2297:
URL: https://github.com/apache/hadoop/pull/2297#discussion_r494256866

## File path: hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/io/compress/snappy/TestSnappyCompressorDecompressor.java ##

@@ -446,4 +442,43 @@ public void doWork() throws Exception {
     ctx.waitFor(6);
   }
+
+  @Test
+  public void testSnappyCompatibility() throws Exception {
+    // HADOOP-17125. Using snappy-java in SnappyCodec. These strings are raw data and compressed data
+    // using previous native Snappy codec. We use updated Snappy codec to decode it and check if it
+    // matches.
+    String rawData = "010a06030a040a0c0109020c0a010204020d02000b010701080605080b090902060a080502060a0d06070908080a0c0105030904090d05090800040c090c0d0d0804000d00040b0b0d010d060907020a030a0c0900040905080107040d0c01060a0b09070a04000b01040b09000e0e00020b06050b060e030e0a07050d06050d";

Review comment: should be split across lines, but otherwise fine inline - simpler for tests
Issue Time Tracking --- Worklog Id: (was: 490162) Time Spent: 15h (was: 14h 50m)
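Steve's suggestion above (keep the fixture data inline but split the long hex literal across lines) might look like the following sketch. The helper and class name are hypothetical, not the actual Hadoop test code, and only the first bytes of the quoted fixture are shown.

```java
// Sketch of the reviewer's suggestion: split a long inline hex fixture
// across lines with string concatenation, then decode it to bytes.
public class HexFixture {

    // First 28 bytes of the raw-data fixture quoted in the review diff,
    // split across lines as the reviewer suggests.
    static final String RAW_DATA_HEX =
        "010a06030a040a0c0109020c0a01"
      + "0204020d02000b01070108060508";

    /** Decodes a lowercase hex string into its byte values. */
    public static byte[] fromHex(String hex) {
        byte[] out = new byte[hex.length() / 2];
        for (int i = 0; i < out.length; i++) {
            out[i] = (byte) Integer.parseInt(hex.substring(2 * i, 2 * i + 2), 16);
        }
        return out;
    }

    public static void main(String[] args) {
        byte[] raw = fromHex(RAW_DATA_HEX);
        System.out.println(raw.length); // 56 hex digits -> 28 bytes
        System.out.println(raw[0]);     // leading "01" -> 1
    }
}
```

The compiler folds the concatenation of string literals into one constant, so splitting the literal costs nothing at runtime while keeping the test self-contained.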
[jira] [Work logged] (HADOOP-17125) Using snappy-java in SnappyCodec
[ https://issues.apache.org/jira/browse/HADOOP-17125?focusedWorklogId=490157&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-490157 ]

ASF GitHub Bot logged work on HADOOP-17125:
---
Author: ASF GitHub Bot
Created on: 24/Sep/20 11:57
Start Date: 24/Sep/20 11:57
Worklog Time Spent: 10m
Work Description: hadoop-yetus removed a comment on pull request #2297:
URL: https://github.com/apache/hadoop/pull/2297#issuecomment-690765839

:broken_heart: **-1 overall**

| Vote | Subsystem | Runtime | Comment |
|::|--:|:|:|
| +0 :ok: | reexec | 0m 41s | Docker mode activated. |
||| _ Prechecks _ |
| +1 :green_heart: | dupname | 0m 0s | No case conflicting files found. |
| +1 :green_heart: | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 :green_heart: | test4tests | 0m 0s | The patch appears to include 5 new or modified test files. |
||| _ trunk Compile Tests _ |
| +0 :ok: | mvndep | 3m 34s | Maven dependency ordering for branch |
| +1 :green_heart: | mvninstall | 33m 4s | trunk passed |
| +1 :green_heart: | compile | 25m 59s | trunk passed with JDK Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1 |
| +1 :green_heart: | compile | 21m 36s | trunk passed with JDK Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 |
| +1 :green_heart: | checkstyle | 3m 14s | trunk passed |
| +1 :green_heart: | mvnsite | 2m 10s | trunk passed |
| +1 :green_heart: | shadedclient | 23m 22s | branch has no errors when building and testing our client artifacts. |
| +1 :green_heart: | javadoc | 1m 14s | trunk passed with JDK Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1 |
| +1 :green_heart: | javadoc | 2m 9s | trunk passed with JDK Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 |
| +0 :ok: | spotbugs | 2m 16s | Used deprecated FindBugs config; considering switching to SpotBugs. |
| +0 :ok: | findbugs | 0m 38s | branch/hadoop-project no findbugs output file (findbugsXml.xml) |
||| _ Patch Compile Tests _ |
| +0 :ok: | mvndep | 0m 29s | Maven dependency ordering for patch |
| +1 :green_heart: | mvninstall | 1m 6s | the patch passed |
| -1 :x: | compile | 1m 6s | root in the patch failed with JDK Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1. |
| -1 :x: | javac | 1m 6s | root in the patch failed with JDK Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1. |
| -1 :x: | compile | 1m 0s | root in the patch failed with JDK Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01. |
| -1 :x: | javac | 1m 0s | root in the patch failed with JDK Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01. |
| -0 :warning: | checkstyle | 2m 20s | root: The patch generated 1 new + 151 unchanged - 5 fixed = 152 total (was 156) |
| +1 :green_heart: | mvnsite | 1m 24s | the patch passed |
| +1 :green_heart: | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 :green_heart: | xml | 0m 3s | The patch has no ill-formed XML file. |
| +1 :green_heart: | shadedclient | 14m 20s | patch has no errors when building and testing our client artifacts. |
| +1 :green_heart: | javadoc | 0m 41s | the patch passed with JDK Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1 |
| +1 :green_heart: | javadoc | 1m 35s | the patch passed with JDK Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 |
| +0 :ok: | findbugs | 0m 19s | hadoop-project has no data from findbugs |
||| _ Other Tests _ |
| +1 :green_heart: | unit | 0m 18s | hadoop-project in the patch passed. |
| -1 :x: | unit | 0m 42s | hadoop-common in the patch failed. |
| +1 :green_heart: | asflicense | 0m 33s | The patch does not generate ASF License warnings. |
| | | 146m 43s | |

| Subsystem | Report/Notes |
|--:|:-|
| Docker | ClientAPI=1.40 ServerAPI=1.40 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2297/1/artifact/out/Dockerfile |
| GITHUB PR | https://github.com/apache/hadoop/pull/2297 |
| Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient xml findbugs checkstyle |
| uname | Linux 205df60c0f1e 4.15.0-58-generic #64-Ubuntu SMP Tue Aug 6 11:12:41 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | dev-support/bin/hadoop.sh |
| git revision | trunk / 9960c01a25c |
| Default Java | Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 |
| Multi-JDK versions | /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 |
[jira] [Work logged] (HADOOP-17125) Using snappy-java in SnappyCodec
[ https://issues.apache.org/jira/browse/HADOOP-17125?focusedWorklogId=490158&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-490158 ]

ASF GitHub Bot logged work on HADOOP-17125:
---
Author: ASF GitHub Bot
Created on: 24/Sep/20 11:57
Start Date: 24/Sep/20 11:57
Worklog Time Spent: 10m
Work Description: hadoop-yetus removed a comment on pull request #2297:
URL: https://github.com/apache/hadoop/pull/2297#issuecomment-690812236

:broken_heart: **-1 overall**

| Vote | Subsystem | Runtime | Comment |
|::|--:|:|:|
| +0 :ok: | reexec | 0m 29s | Docker mode activated. |
||| _ Prechecks _ |
| +1 :green_heart: | dupname | 0m 0s | No case conflicting files found. |
| +1 :green_heart: | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 :green_heart: | test4tests | 0m 0s | The patch appears to include 5 new or modified test files. |
||| _ trunk Compile Tests _ |
| +0 :ok: | mvndep | 3m 36s | Maven dependency ordering for branch |
| +1 :green_heart: | mvninstall | 26m 46s | trunk passed |
| +1 :green_heart: | compile | 19m 28s | trunk passed with JDK Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1 |
| +1 :green_heart: | compile | 17m 8s | trunk passed with JDK Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 |
| +1 :green_heart: | checkstyle | 3m 41s | trunk passed |
| +1 :green_heart: | mvnsite | 2m 1s | trunk passed |
| +1 :green_heart: | shadedclient | 20m 48s | branch has no errors when building and testing our client artifacts. |
| +1 :green_heart: | javadoc | 1m 14s | trunk passed with JDK Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1 |
| +1 :green_heart: | javadoc | 2m 10s | trunk passed with JDK Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 |
| +0 :ok: | spotbugs | 2m 15s | Used deprecated FindBugs config; considering switching to SpotBugs. |
| +0 :ok: | findbugs | 0m 37s | branch/hadoop-project no findbugs output file (findbugsXml.xml) |
||| _ Patch Compile Tests _ |
| +0 :ok: | mvndep | 0m 29s | Maven dependency ordering for patch |
| +1 :green_heart: | mvninstall | 1m 4s | the patch passed |
| -1 :x: | compile | 1m 4s | root in the patch failed with JDK Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1. |
| -1 :x: | cc | 1m 4s | root in the patch failed with JDK Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1. |
| -1 :x: | golang | 1m 4s | root in the patch failed with JDK Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1. |
| -1 :x: | javac | 1m 4s | root in the patch failed with JDK Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1. |
| -1 :x: | compile | 0m 56s | root in the patch failed with JDK Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01. |
| -1 :x: | cc | 0m 56s | root in the patch failed with JDK Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01. |
| -1 :x: | golang | 0m 56s | root in the patch failed with JDK Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01. |
| -1 :x: | javac | 0m 56s | root in the patch failed with JDK Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01. |
| -0 :warning: | checkstyle | 2m 21s | root: The patch generated 1 new + 151 unchanged - 5 fixed = 152 total (was 156) |
| +1 :green_heart: | mvnsite | 1m 24s | the patch passed |
| +1 :green_heart: | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 :green_heart: | xml | 0m 3s | The patch has no ill-formed XML file. |
| +1 :green_heart: | shadedclient | 13m 48s | patch has no errors when building and testing our client artifacts. |
| +1 :green_heart: | javadoc | 0m 41s | the patch passed with JDK Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1 |
| +1 :green_heart: | javadoc | 1m 36s | the patch passed with JDK Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 |
| +0 :ok: | findbugs | 0m 19s | hadoop-project has no data from findbugs |
||| _ Other Tests _ |
| +1 :green_heart: | unit | 0m 18s | hadoop-project in the patch passed. |
| -1 :x: | unit | 0m 40s | hadoop-common in the patch failed. |
| +1 :green_heart: | asflicense | 0m 34s | The patch does not generate ASF License warnings. |
| | | 125m 50s | |

| Subsystem | Report/Notes |
|--:|:-|
| Docker | ClientAPI=1.40 ServerAPI=1.40 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2297/2/artifact/out/Dockerfile |
| GITHUB PR | https://github.com/apache/hadoop/pull/2297 |
| Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient xml findbugs checkstyle cc golang |
| uname | Linux
[jira] [Work logged] (HADOOP-17125) Using snappy-java in SnappyCodec
[ https://issues.apache.org/jira/browse/HADOOP-17125?focusedWorklogId=490009&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-490009 ]

ASF GitHub Bot logged work on HADOOP-17125:
---
Author: ASF GitHub Bot
Created on: 24/Sep/20 06:37
Start Date: 24/Sep/20 06:37
Worklog Time Spent: 10m
Work Description: viirya commented on a change in pull request #2297:
URL: https://github.com/apache/hadoop/pull/2297#discussion_r494070826

## File path: hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/io/compress/snappy/SnappyCompressor.java ##

@@ -48,30 +49,20 @@
   private long bytesRead = 0L;
   private long bytesWritten = 0L;

-  private static boolean nativeSnappyLoaded = false;
-
-  static {
-    if (NativeCodeLoader.isNativeCodeLoaded() &&
-        NativeCodeLoader.buildSupportsSnappy()) {
-      try {
-        initIDs();
-        nativeSnappyLoaded = true;
-      } catch (Throwable t) {
-        LOG.error("failed to load SnappyCompressor", t);
-      }
-    }
-  }
-
-  public static boolean isNativeCodeLoaded() {
-    return nativeSnappyLoaded;
-  }
-
   /**
    * Creates a new compressor.
    *
    * @param directBufferSize size of the direct buffer to be used.
    */
   public SnappyCompressor(int directBufferSize) {
+    // `snappy-java` is provided scope. We need to check if its availability.

Review comment: Oops, thanks.
Issue Time Tracking --- Worklog Id: (was: 490009) Time Spent: 14h 20m (was: 14h 10m)
[jira] [Work logged] (HADOOP-17125) Using snappy-java in SnappyCodec
[ https://issues.apache.org/jira/browse/HADOOP-17125?focusedWorklogId=490008=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-490008 ]

ASF GitHub Bot logged work on HADOOP-17125:
Author: ASF GitHub Bot
Created on: 24/Sep/20 06:36
Worklog Time Spent: 10m

Work Description: viirya commented on a change in pull request #2297:
URL: https://github.com/apache/hadoop/pull/2297#discussion_r494070442

File path: hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/io/compress/snappy/SnappyCompressor.java

@@ -48,30 +49,20 @@
   private long bytesRead = 0L;
   private long bytesWritten = 0L;

-  private static boolean nativeSnappyLoaded = false;
-
-  static {
-    if (NativeCodeLoader.isNativeCodeLoaded() &&
-        NativeCodeLoader.buildSupportsSnappy()) {
-      try {
-        initIDs();
-        nativeSnappyLoaded = true;
-      } catch (Throwable t) {
-        LOG.error("failed to load SnappyCompressor", t);
-      }
-    }
-  }
-
-  public static boolean isNativeCodeLoaded() {
-    return nativeSnappyLoaded;
-  }
-
   /**
    * Creates a new compressor.
    *
    * @param directBufferSize size of the direct buffer to be used.
    */
   public SnappyCompressor(int directBufferSize) {
+    // `snappy-java` is provided scope. We need to check if its availability.
+    try {
+      SnappyLoader.getVersion();
+    } catch (Throwable t) {
+      throw new RuntimeException("native snappy library not available: " +

Review comment: It is the snappy-java jar, yeah; I will revise the message.

Issue Time Tracking: Worklog Id: (was: 490008), Time Spent: 14h 10m (was: 14h)
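The constructor-time availability check being reviewed above can be sketched in isolation. This is a hypothetical illustration of the pattern, not the PR's code: it probes for snappy-java's loader class reflectively, so the sketch compiles and runs even when the jar (which is `provided` scope) is absent, and it fails fast with a `RuntimeException` the way the reviewed constructor does.

```java
public class SnappyAvailability {
    // snappy-java's loader class; "provided" scope means it may be
    // missing from the runtime classpath.
    static final String LOADER_CLASS = "org.xerial.snappy.SnappyLoader";

    /** Returns true iff a class with the given name can be loaded. */
    static boolean classPresent(String className) {
        try {
            Class.forName(className);
            return true;
        } catch (Throwable t) {
            // ClassNotFoundException, LinkageError, etc. all mean "unusable".
            return false;
        }
    }

    /** Fails fast, mirroring the behavior of the reviewed constructor. */
    static void checkSnappyJavaAvailable() {
        if (!classPresent(LOADER_CLASS)) {
            throw new RuntimeException(
                "snappy-java jar not available on the classpath");
        }
    }

    public static void main(String[] args) {
        System.out.println("snappy-java present: " + classPresent(LOADER_CLASS));
    }
}
```

Failing fast here, rather than logging and limping on, is the same design choice debated for `SnappyDecompressor` later in this thread: a codec that cannot load its backing library should refuse construction instead of failing on first use.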
[jira] [Work logged] (HADOOP-17125) Using snappy-java in SnappyCodec
[ https://issues.apache.org/jira/browse/HADOOP-17125?focusedWorklogId=490007=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-490007 ]

ASF GitHub Bot logged work on HADOOP-17125:
Author: ASF GitHub Bot
Created on: 24/Sep/20 06:35
Worklog Time Spent: 10m

Work Description: viirya commented on a change in pull request #2297:
URL: https://github.com/apache/hadoop/pull/2297#discussion_r494070189

File path: hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/io/compress/snappy/SnappyCompressor.java

@@ -291,9 +282,17 @@ public long getBytesWritten() {
   public void end() {
   }

-  private native static void initIDs();
-
-  private native int compressBytesDirect();
-
-  public native static String getLibraryName();
+  private int compressBytesDirect() throws IOException {

Review comment: The names `compressBytesDirect` and `decompressBytesDirect` are basically copied from the original method names. `compressDirectBuf` and `decompressDirectBuf` look good to me.

Issue Time Tracking: Worklog Id: (was: 490007), Time Spent: 14h (was: 13h 50m)
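The naming discussion above concerns the plain-Java helpers that replaced JNI entry points such as `compressBytesDirect()`. As a rough, hypothetical sketch of that shape (using `java.util.zip.Deflater` as a stand-in, since snappy-java may not be on the classpath here): a private helper consumes an input buffer, returns the number of bytes produced, and reports failure via `IOException` rather than a native error.

```java
import java.io.IOException;
import java.util.zip.Deflater;

public class DirectBufCompressor {
    private final Deflater deflater = new Deflater();

    /**
     * Compresses src into dst and returns the number of bytes written.
     * Mirrors the shape of the compressDirectBuf/compressBytesDirect helper
     * under discussion, but backed by Deflater instead of snappy-java.
     */
    public int compressDirectBuf(byte[] src, byte[] dst) throws IOException {
        deflater.reset();
        deflater.setInput(src);
        deflater.finish();
        int n = deflater.deflate(dst);
        if (!deflater.finished()) {
            // dst was too small to hold the whole compressed stream.
            throw new IOException("destination buffer too small");
        }
        return n;
    }
}
```

The point of the shape, not the backing codec: a checked `IOException` surfaces buffer-sizing errors to the caller the same way the replaced native methods signaled failure through JNI.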
[jira] [Work logged] (HADOOP-17125) Using snappy-java in SnappyCodec
[ https://issues.apache.org/jira/browse/HADOOP-17125?focusedWorklogId=490006=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-490006 ]

ASF GitHub Bot logged work on HADOOP-17125:
Author: ASF GitHub Bot
Created on: 24/Sep/20 06:34
Worklog Time Spent: 10m

Work Description: viirya commented on a change in pull request #2297:
URL: https://github.com/apache/hadoop/pull/2297#discussion_r494069586

File path: hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/io/compress/snappy/TestSnappyCompressorDecompressor.java

@@ -446,4 +442,43 @@ public void doWork() throws Exception {
     ctx.waitFor(6);
   }
+
+  @Test
+  public void testSnappyCompatibility() throws Exception {
+    // HADOOP-17125. Using snappy-java in SnappyCodec. These strings are raw data and compressed data
+    // using previous native Snappy codec. We use updated Snappy codec to decode it and check if it
+    // matches.
+    String rawData = "010a06030a040a0c0109020c0a010204020d02000b010701080605080b090902060a080502060a0d06070908080a0c0105030904090d05090800040c090c0d0d0804000d00040b0b0d010d060907020a030a0c0900040905080107040d0c01060a0b09070a04000b01040b09000e0e00020b06050b060e030e0a07050d06050d";

Review comment: The string is there to keep the test as simple as possible. Maybe further shorten it?

Issue Time Tracking: Worklog Id: (was: 490006), Time Spent: 13h 50m (was: 13h 40m)
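The compatibility test above embeds its fixtures (raw bytes, and bytes compressed by the previous native codec) as hex strings. A small decoder like the following (a hypothetical helper, not part of the PR) turns such a string back into the `byte[]` handed to the decompressor under test:

```java
public class HexData {
    /** Decodes a hex string such as "010a06" into the bytes it encodes. */
    public static byte[] fromHex(String hex) {
        if (hex.length() % 2 != 0) {
            throw new IllegalArgumentException("odd-length hex string");
        }
        byte[] out = new byte[hex.length() / 2];
        for (int i = 0; i < out.length; i++) {
            // Each pair of hex digits encodes one byte.
            out[i] = (byte) Integer.parseInt(hex.substring(2 * i, 2 * i + 2), 16);
        }
        return out;
    }
}
```

Decoding inline hex keeps the fixture self-contained in the test source; the alternative raised in the review below is to check in a binary data file with more interesting content.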
[jira] [Work logged] (HADOOP-17125) Using snappy-java in SnappyCodec
[ https://issues.apache.org/jira/browse/HADOOP-17125?focusedWorklogId=489995=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-489995 ]

ASF GitHub Bot logged work on HADOOP-17125:
Author: ASF GitHub Bot
Created on: 24/Sep/20 06:03
Worklog Time Spent: 10m

Work Description: saintstack commented on a change in pull request #2297:
URL: https://github.com/apache/hadoop/pull/2297#discussion_r494054389

File path: hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/io/compress/snappy/SnappyCompressor.java

@@ -48,30 +49,20 @@
   private long bytesRead = 0L;
   private long bytesWritten = 0L;

-  private static boolean nativeSnappyLoaded = false;
-
-  static {
-    if (NativeCodeLoader.isNativeCodeLoaded() &&
-        NativeCodeLoader.buildSupportsSnappy()) {
-      try {
-        initIDs();
-        nativeSnappyLoaded = true;
-      } catch (Throwable t) {
-        LOG.error("failed to load SnappyCompressor", t);
-      }
-    }
-  }
-
-  public static boolean isNativeCodeLoaded() {
-    return nativeSnappyLoaded;
-  }
-
   /**
    * Creates a new compressor.
    *
    * @param directBufferSize size of the direct buffer to be used.
    */
   public SnappyCompressor(int directBufferSize) {
+    // `snappy-java` is provided scope. We need to check if its availability.
+    try {
+      SnappyLoader.getVersion();
+    } catch (Throwable t) {
+      throw new RuntimeException("native snappy library not available: " +

Review comment: Is it the 'native snappy library' that is missing or the snappy-java jar?

File path: hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/io/compress/snappy/SnappyCompressor.java

   public SnappyCompressor(int directBufferSize) {
+    // `snappy-java` is provided scope. We need to check if its availability.

Review comment: Fix this last sentence if you make a new PR.

File path: hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/io/compress/snappy/TestSnappyCompressorDecompressor.java

@@ -446,4 +442,43 @@ public void doWork() throws Exception {
     ctx.waitFor(6);
   }
+
+  @Test
+  public void testSnappyCompatibility() throws Exception {
+    // HADOOP-17125. Using snappy-java in SnappyCodec. These strings are raw data and compressed data
+    // using previous native Snappy codec. We use updated Snappy codec to decode it and check if it
+    // matches.
+    String rawData = "010a06030a040a0c0109020c0a010204020d02000b010701080605080b090902060a080502060a0d06070908080a0c0105030904090d05090800040c090c0d0d0804000d00040b0b0d010d060907020a030a0c0900040905080107040d0c01060a0b09070a04000b01040b09000e0e00020b06050b060e030e0a07050d06050d";

Review comment: hmm... this is a little anemic. Have you considered adding a data file that is a little more interesting than this?

File path: hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/io/compress/snappy/SnappyDecompressor.java

@@ -276,10 +267,20 @@ public void end() {
     // do nothing
   }

-  private native static void initIDs();
+  private int decompressBytesDirect() throws IOException {

Review comment: ditto

File path: hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/io/compress/snappy/SnappyDecompressor.java

@@ -45,30 +46,20 @@
   private int userBufOff = 0, userBufLen = 0;
   private boolean finished;

-  private static boolean nativeSnappyLoaded = false;
-
-  static {
-    if (NativeCodeLoader.isNativeCodeLoaded() &&
-        NativeCodeLoader.buildSupportsSnappy()) {
-      try {
-        initIDs();
-        nativeSnappyLoaded = true;
-      } catch (Throwable t) {
-        LOG.error("failed to load SnappyDecompressor", t);
-      }
-    }
-  }
-
-  public static boolean isNativeCodeLoaded() {
-    return nativeSnappyLoaded;
-  }
-
   /**
    * Creates a new compressor.
    *
    * @param directBufferSize size of the direct buffer to be used.
    */
   public SnappyDecompressor(int directBufferSize) {
+    // `snappy-java` is provided scope. We need to check if its availability.
+    try {
+
[jira] [Work logged] (HADOOP-17125) Using snappy-java in SnappyCodec
[ https://issues.apache.org/jira/browse/HADOOP-17125?focusedWorklogId=488822=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-488822 ]

ASF GitHub Bot logged work on HADOOP-17125:
Author: ASF GitHub Bot
Created on: 23/Sep/20 04:12
Worklog Time Spent: 10m

Work Description: dbtsai commented on pull request #2297:
URL: https://github.com/apache/hadoop/pull/2297#issuecomment-697008848

cc @iwasakims @aajisaka, who were involved with the codecs in Hadoop.

Issue Time Tracking: Worklog Id: (was: 488822), Time Spent: 13.5h (was: 13h 20m)
[jira] [Work logged] (HADOOP-17125) Using snappy-java in SnappyCodec
[ https://issues.apache.org/jira/browse/HADOOP-17125?focusedWorklogId=488627=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-488627 ]

ASF GitHub Bot logged work on HADOOP-17125:
Author: ASF GitHub Bot
Created on: 22/Sep/20 22:09
Worklog Time Spent: 10m

Work Description: dbtsai commented on pull request #2297:
URL: https://github.com/apache/hadoop/pull/2297#issuecomment-697008848

cc @iwasakims @aajisaka, who were involved with the codecs in Hadoop.

Issue Time Tracking: Worklog Id: (was: 488627), Time Spent: 13h 20m (was: 13h 10m)
[jira] [Work logged] (HADOOP-17125) Using snappy-java in SnappyCodec
[ https://issues.apache.org/jira/browse/HADOOP-17125?focusedWorklogId=487464=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-487464 ]

ASF GitHub Bot logged work on HADOOP-17125:
Author: ASF GitHub Bot
Created on: 22/Sep/20 03:03
Worklog Time Spent: 10m

Work Description: dbtsai commented on pull request #2297:
URL: https://github.com/apache/hadoop/pull/2297#issuecomment-696356461

@steveloughran we addressed your comments. Please take a look again when you have time. Thanks!

Issue Time Tracking: Worklog Id: (was: 487464), Time Spent: 13h 10m (was: 13h)
[jira] [Work logged] (HADOOP-17125) Using snappy-java in SnappyCodec
[ https://issues.apache.org/jira/browse/HADOOP-17125?focusedWorklogId=487247=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-487247 ]

ASF GitHub Bot logged work on HADOOP-17125:
Author: ASF GitHub Bot
Created on: 21/Sep/20 20:29
Worklog Time Spent: 10m

Work Description: dbtsai commented on pull request #2297:
URL: https://github.com/apache/hadoop/pull/2297#issuecomment-696356461

@steveloughran we addressed your comments. Please take a look again when you have time. Thanks!

Issue Time Tracking: Worklog Id: (was: 487247), Time Spent: 13h (was: 12h 50m)
[jira] [Work logged] (HADOOP-17125) Using snappy-java in SnappyCodec
[ https://issues.apache.org/jira/browse/HADOOP-17125?focusedWorklogId=486064=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-486064 ]

ASF GitHub Bot logged work on HADOOP-17125:
Author: ASF GitHub Bot
Created on: 18/Sep/20 05:27
Worklog Time Spent: 10m

Work Description: viirya commented on pull request #2297:
URL: https://github.com/apache/hadoop/pull/2297#issuecomment-694659679

Regarding the change to the mvn dependencies, for Apache Hadoop Common the diff is:

Before:
```
[INFO] +- org.apache.avro:avro:jar:1.7.7:compile
[INFO] | +- com.thoughtworks.paranamer:paranamer:jar:2.3:compile
[INFO] | \- org.xerial.snappy:snappy-java:jar:1.0.5:compile
```

After:
```
[INFO] +- org.apache.avro:avro:jar:1.7.7:compile
[INFO] | \- com.thoughtworks.paranamer:paranamer:jar:2.3:compile
...
[INFO] \- org.xerial.snappy:snappy-java:jar:1.1.7.7:provided
```

For other modules the change is similar. For example, Apache Hadoop NFS:

Before:
```
[INFO] | +- org.apache.avro:avro:jar:1.7.7:provided
[INFO] | | +- com.thoughtworks.paranamer:paranamer:jar:2.3:provided
[INFO] | | \- org.xerial.snappy:snappy-java:jar:1.0.5:provided
```

After:
```
[INFO] | +- org.apache.avro:avro:jar:1.7.7:provided
[INFO] | | +- com.thoughtworks.paranamer:paranamer:jar:2.3:provided
[INFO] | | \- org.xerial.snappy:snappy-java:jar:1.1.7.7:provided
```

Or Apache Hadoop KMS:

Before:
```
[INFO] | +- org.apache.avro:avro:jar:1.7.7:compile
[INFO] | | +- com.thoughtworks.paranamer:paranamer:jar:2.3:compile
[INFO] | | \- org.xerial.snappy:snappy-java:jar:1.0.5:compile
```

After:
```
[INFO] | +- org.apache.avro:avro:jar:1.7.7:compile
[INFO] | | +- com.thoughtworks.paranamer:paranamer:jar:2.3:compile
[INFO] | | \- org.xerial.snappy:snappy-java:jar:1.1.7.7:compile
```

Issue Time Tracking: Worklog Id: (was: 486064), Time Spent: 12h 50m (was: 12h 40m)
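The `provided` scope shown for hadoop-common above corresponds to a dependency declaration along these lines in the module's pom.xml (the coordinates match the trees above; the exact placement and any dependencyManagement entry in hadoop-project/pom.xml are assumptions here):

```xml
<dependency>
  <groupId>org.xerial.snappy</groupId>
  <artifactId>snappy-java</artifactId>
  <version>1.1.7.7</version>
  <!-- provided: available at compile time, but not forced onto the
       runtime classpath of downstream consumers -->
  <scope>provided</scope>
</dependency>
```

With `provided` scope, downstream users supply their own snappy-java at runtime, which is why the constructor-time availability check discussed earlier in this thread is needed at all.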
[jira] [Work logged] (HADOOP-17125) Using snappy-java in SnappyCodec
[ https://issues.apache.org/jira/browse/HADOOP-17125?focusedWorklogId=486023=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-486023 ]

ASF GitHub Bot logged work on HADOOP-17125:
Author: ASF GitHub Bot
Created on: 18/Sep/20 01:21
Worklog Time Spent: 10m

Work Description: hadoop-yetus commented on pull request #2297:
URL: https://github.com/apache/hadoop/pull/2297#issuecomment-694592777

:broken_heart: **-1 overall**

| Vote | Subsystem | Runtime | Comment |
|::|--:|:|:|
| +0 :ok: | reexec | 0m 29s | Docker mode activated. |
||| _ Prechecks _ |
| +1 :green_heart: | dupname | 0m 0s | No case conflicting files found. |
| +1 :green_heart: | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 :green_heart: | test4tests | 0m 0s | The patch appears to include 5 new or modified test files. |
||| _ trunk Compile Tests _ |
| +0 :ok: | mvndep | 3m 18s | Maven dependency ordering for branch |
| +1 :green_heart: | mvninstall | 25m 58s | trunk passed |
| +1 :green_heart: | compile | 19m 24s | trunk passed with JDK Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1 |
| +1 :green_heart: | compile | 16m 49s | trunk passed with JDK Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 |
| +1 :green_heart: | checkstyle | 2m 44s | trunk passed |
| +1 :green_heart: | mvnsite | 2m 41s | trunk passed |
| +1 :green_heart: | shadedclient | 20m 10s | branch has no errors when building and testing our client artifacts. |
| +1 :green_heart: | javadoc | 1m 49s | trunk passed with JDK Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1 |
| +1 :green_heart: | javadoc | 2m 41s | trunk passed with JDK Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 |
| +0 :ok: | spotbugs | 0m 35s | Used deprecated FindBugs config; considering switching to SpotBugs. |
| +0 :ok: | findbugs | 0m 37s | branch/hadoop-project no findbugs output file (findbugsXml.xml) |
| +0 :ok: | findbugs | 0m 35s | branch/hadoop-project-dist no findbugs output file (findbugsXml.xml) |
||| _ Patch Compile Tests _ |
| +0 :ok: | mvndep | 1m 1s | Maven dependency ordering for patch |
| +1 :green_heart: | mvninstall | 1m 21s | the patch passed |
| +1 :green_heart: | compile | 19m 33s | the patch passed with JDK Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1 |
| -1 :x: | cc | 19m 33s | root-jdkUbuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1 with JDK Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1 generated 38 new + 125 unchanged - 38 fixed = 163 total (was 163) |
| +1 :green_heart: | golang | 19m 33s | the patch passed |
| +1 :green_heart: | javac | 19m 33s | the patch passed |
| +1 :green_heart: | compile | 16m 48s | the patch passed with JDK Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 |
| -1 :x: | cc | 16m 48s | root-jdkPrivateBuild-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 with JDK Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 generated 43 new + 120 unchanged - 43 fixed = 163 total (was 163) |
| +1 :green_heart: | golang | 16m 48s | the patch passed |
| +1 :green_heart: | javac | 16m 48s | the patch passed |
| -0 :warning: | checkstyle | 2m 48s | root: The patch generated 6 new + 151 unchanged - 5 fixed = 157 total (was 156) |
| +1 :green_heart: | mvnsite | 2m 40s | the patch passed |
| +1 :green_heart: | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 :green_heart: | xml | 0m 4s | The patch has no ill-formed XML file. |
| +1 :green_heart: | shadedclient | 14m 6s | patch has no errors when building and testing our client artifacts. |
| +1 :green_heart: | javadoc | 1m 49s | the patch passed with JDK Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1 |
| +1 :green_heart: | javadoc | 2m 39s | the patch passed with JDK Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 |
| +0 :ok: | findbugs | 0m 36s | hadoop-project has no data from findbugs |
| +0 :ok: | findbugs | 0m 35s | hadoop-project-dist has no data from findbugs |
||| _ Other Tests _ |
| +1 :green_heart: | unit | 0m 34s | hadoop-project in the patch passed. |
| +1 :green_heart: | unit | 0m 35s | hadoop-project-dist in the patch passed. |
| +1 :green_heart: | unit | 9m 24s | hadoop-common in the patch passed. |
| +1 :green_heart: | asflicense | 0m 53s | The patch does not generate ASF License warnings. |
| | | 177m 15s | |

| Subsystem | Report/Notes |
|--:|:-|
| Docker | ClientAPI=1.40 ServerAPI=1.40 base:
[jira] [Work logged] (HADOOP-17125) Using snappy-java in SnappyCodec
[ https://issues.apache.org/jira/browse/HADOOP-17125?focusedWorklogId=485981=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-485981 ]

ASF GitHub Bot logged work on HADOOP-17125:
---
Author: ASF GitHub Bot
Created on: 17/Sep/20 22:29
Start Date: 17/Sep/20 22:29
Worklog Time Spent: 10m

Work Description: viirya commented on a change in pull request #2297:
URL: https://github.com/apache/hadoop/pull/2297#discussion_r490596868

## File path: hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/io/compress/snappy/SnappyDecompressor.java

@@ -45,30 +46,19 @@
   private int userBufOff = 0, userBufLen = 0;
   private boolean finished;
-  private static boolean nativeSnappyLoaded = false;
-
-  static {
-    if (NativeCodeLoader.isNativeCodeLoaded() &&
-        NativeCodeLoader.buildSupportsSnappy()) {
-      try {
-        initIDs();
-        nativeSnappyLoaded = true;
-      } catch (Throwable t) {
-        LOG.error("failed to load SnappyDecompressor", t);
-      }
-    }
-  }
-
-  public static boolean isNativeCodeLoaded() {
-    return nativeSnappyLoaded;
-  }
-
   /**
    * Creates a new compressor.
    *
    * @param directBufferSize size of the direct buffer to be used.
    */
   public SnappyDecompressor(int directBufferSize) {
+    // `snappy-java` is provided scope. We need to check its availability.
+    try {
+      SnappyLoader.getVersion();
+    } catch (Throwable t) {
+      LOG.warn("Error loading snappy libraries: " + t);

Review comment: ok, changed to throw `RuntimeException`.

This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at: us...@infra.apache.org

Issue Time Tracking
---
Worklog Id: (was: 485981)
Time Spent: 12.5h (was: 12h 20m)

> Using snappy-java in SnappyCodec
>
> Key: HADOOP-17125
> URL: https://issues.apache.org/jira/browse/HADOOP-17125
> Project: Hadoop Common
> Issue Type: New Feature
> Components: common
> Affects Versions: 3.3.0
> Reporter: DB Tsai
> Priority: Major
> Labels: pull-request-available
> Time Spent: 12.5h
> Remaining Estimate: 0h
>
> In Hadoop, we use native libs for the snappy codec, which has several disadvantages:
> * It requires native *libhadoop* and *libsnappy* to be installed in the system *LD_LIBRARY_PATH*, and they have to be installed separately on each node of the clusters, container images, or local test environments, which adds huge complexity from a deployment point of view. In some environments, it requires compiling the natives from source, which is non-trivial. Also, this approach is platform dependent; the binary may not work on a different platform, so it requires recompilation.
> * It requires extra configuration of *java.library.path* to load the natives, and it results in higher application deployment and maintenance cost for users.
> Projects such as *Spark* and *Parquet* use [snappy-java|https://github.com/xerial/snappy-java], which is a JNI-based implementation. It contains native binaries for Linux, Mac, and IBM in the jar file, and it can automatically load the native binaries into the JVM from the jar without any setup. If a native implementation cannot be found for a platform, it can fall back to a pure-Java implementation of snappy based on [aircompressor|https://github.com/airlift/aircompressor/tree/master/src/main/java/io/airlift/compress/snappy].

--
This message was sent by Atlassian Jira (v8.3.4#803005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org
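The review thread above settles on throwing a `RuntimeException` from the constructor when snappy-java, a provided-scope dependency, cannot be loaded. The following is a minimal, hypothetical sketch of that fail-fast pattern; `AvailabilityCheck` is not part of the Hadoop patch, and `Class.forName` stands in for the `SnappyLoader.getVersion()` probe the actual code uses.

```java
// Hypothetical sketch of the constructor-time availability check discussed in
// the review. The real patch calls org.xerial.snappy.SnappyLoader.getVersion().
class AvailabilityCheck {
    /** Returns true when the named class can be loaded by the current loader. */
    static boolean isAvailable(String className) {
        try {
            Class.forName(className);
            return true;
        } catch (Throwable t) {
            // Catch Throwable, not just ClassNotFoundException: a failed
            // native-library load surfaces as UnsatisfiedLinkError (an Error).
            return false;
        }
    }

    /** Fail fast, mirroring the RuntimeException agreed on in the review. */
    static void requireAvailable(String className) {
        if (!isAvailable(className)) {
            throw new RuntimeException("native snappy library not available: "
                + className + " has not been loaded.");
        }
    }
}
```

Probing once in the constructor, rather than in a static initializer, means a missing library fails the codec that needs it instead of poisoning the class for the whole JVM.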
[jira] [Work logged] (HADOOP-17125) Using snappy-java in SnappyCodec
[ https://issues.apache.org/jira/browse/HADOOP-17125?focusedWorklogId=485980=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-485980 ]

ASF GitHub Bot logged work on HADOOP-17125:
---
Author: ASF GitHub Bot
Created on: 17/Sep/20 22:28
Start Date: 17/Sep/20 22:28
Worklog Time Spent: 10m

Work Description: viirya commented on a change in pull request #2297:
URL: https://github.com/apache/hadoop/pull/2297#discussion_r490596761

## File path: hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/io/compress/snappy/SnappyCompressor.java

@@ -48,24 +48,6 @@
   private long bytesRead = 0L;
   private long bytesWritten = 0L;
-  private static boolean nativeSnappyLoaded = false;
-
-  static {
-    if (NativeCodeLoader.isNativeCodeLoaded() &&
-        NativeCodeLoader.buildSupportsSnappy()) {
-      try {
-        initIDs();
-        nativeSnappyLoaded = true;
-      } catch (Throwable t) {
-        LOG.error("failed to load SnappyCompressor", t);
-      }
-    }
-  }
-
-  public static boolean isNativeCodeLoaded() {
-    return nativeSnappyLoaded;

Review comment: added.
Issue Time Tracking
---
Worklog Id: (was: 485980)
Time Spent: 12h 20m (was: 12h 10m)
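The review comments note that snappy-java is brought in with provided scope, which is why both `SnappyCompressor` and `SnappyDecompressor` must probe for it at construction time. A provided-scope declaration looks roughly like the following sketch; the version property is a placeholder, not the version the PR actually pins.

```xml
<!-- Sketch of a provided-scope snappy-java dependency; version is illustrative. -->
<dependency>
  <groupId>org.xerial.snappy</groupId>
  <artifactId>snappy-java</artifactId>
  <version>${snappy-java.version}</version>
  <scope>provided</scope>
</dependency>
```

With provided scope the jar is present at compile time but not guaranteed on the runtime classpath, so the class may legitimately be absent when the codec is instantiated.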
[jira] [Work logged] (HADOOP-17125) Using snappy-java in SnappyCodec
[ https://issues.apache.org/jira/browse/HADOOP-17125?focusedWorklogId=485722=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-485722 ]

ASF GitHub Bot logged work on HADOOP-17125:
---
Author: ASF GitHub Bot
Created on: 17/Sep/20 13:05
Start Date: 17/Sep/20 13:05
Worklog Time Spent: 10m

Work Description: hadoop-yetus removed a comment on pull request #2297:
URL: https://github.com/apache/hadoop/pull/2297#issuecomment-693863169

:broken_heart: **-1 overall**

| Vote | Subsystem | Runtime | Comment |
|::|--:|:|:|
| +0 :ok: | reexec | 29m 58s | Docker mode activated. |
||| _ Prechecks _ |
| +1 :green_heart: | dupname | 0m 1s | No case conflicting files found. |
| +1 :green_heart: | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 :green_heart: | test4tests | 0m 0s | The patch appears to include 5 new or modified test files. |
||| _ trunk Compile Tests _ |
| +0 :ok: | mvndep | 3m 21s | Maven dependency ordering for branch |
| +1 :green_heart: | mvninstall | 27m 10s | trunk passed |
| +1 :green_heart: | compile | 21m 26s | trunk passed with JDK Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1 |
| +1 :green_heart: | compile | 18m 44s | trunk passed with JDK Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 |
| +1 :green_heart: | checkstyle | 2m 54s | trunk passed |
| +1 :green_heart: | mvnsite | 2m 27s | trunk passed |
| +1 :green_heart: | shadedclient | 19m 56s | branch has no errors when building and testing our client artifacts. |
| +1 :green_heart: | javadoc | 1m 38s | trunk passed with JDK Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1 |
| +1 :green_heart: | javadoc | 2m 32s | trunk passed with JDK Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 |
| +0 :ok: | spotbugs | 0m 38s | Used deprecated FindBugs config; considering switching to SpotBugs. |
| +0 :ok: | findbugs | 0m 36s | branch/hadoop-project no findbugs output file (findbugsXml.xml) |
| +0 :ok: | findbugs | 0m 37s | branch/hadoop-project-dist no findbugs output file (findbugsXml.xml) |
||| _ Patch Compile Tests _ |
| +0 :ok: | mvndep | 0m 30s | Maven dependency ordering for patch |
| +1 :green_heart: | mvninstall | 1m 16s | the patch passed |
| +1 :green_heart: | compile | 20m 49s | the patch passed with JDK Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1 |
| -1 :x: | cc | 20m 49s | root-jdkUbuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1 with JDK Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1 generated 17 new + 146 unchanged - 17 fixed = 163 total (was 163) |
| +1 :green_heart: | golang | 20m 49s | the patch passed |
| +1 :green_heart: | javac | 20m 49s | the patch passed |
| +1 :green_heart: | compile | 18m 31s | the patch passed with JDK Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 |
| -1 :x: | cc | 18m 31s | root-jdkPrivateBuild-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 with JDK Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 generated 11 new + 152 unchanged - 11 fixed = 163 total (was 163) |
| +1 :green_heart: | golang | 18m 31s | the patch passed |
| +1 :green_heart: | javac | 18m 31s | the patch passed |
| -0 :warning: | checkstyle | 2m 49s | root: The patch generated 6 new + 151 unchanged - 5 fixed = 157 total (was 156) |
| +1 :green_heart: | mvnsite | 2m 32s | the patch passed |
| +1 :green_heart: | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 :green_heart: | xml | 0m 5s | The patch has no ill-formed XML file. |
| +1 :green_heart: | shadedclient | 14m 3s | patch has no errors when building and testing our client artifacts. |
| +1 :green_heart: | javadoc | 1m 38s | the patch passed with JDK Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1 |
| +1 :green_heart: | javadoc | 2m 42s | the patch passed with JDK Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 |
| +0 :ok: | findbugs | 0m 34s | hadoop-project has no data from findbugs |
| +0 :ok: | findbugs | 0m 35s | hadoop-project-dist has no data from findbugs |
||| _ Other Tests _ |
| +1 :green_heart: | unit | 0m 33s | hadoop-project in the patch passed. |
| +1 :green_heart: | unit | 0m 34s | hadoop-project-dist in the patch passed. |
| +1 :green_heart: | unit | 9m 49s | hadoop-common in the patch passed. |
| +1 :green_heart: | asflicense | 0m 50s | The patch does not generate ASF License warnings. |
| | | 213m 57s | |

| Subsystem | Report/Notes |
|--:|:-|
| Docker | ClientAPI=1.40 ServerAPI=1.40 base:
[jira] [Work logged] (HADOOP-17125) Using snappy-java in SnappyCodec
[ https://issues.apache.org/jira/browse/HADOOP-17125?focusedWorklogId=485533=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-485533 ]

ASF GitHub Bot logged work on HADOOP-17125:
---
Author: ASF GitHub Bot
Created on: 17/Sep/20 05:11
Start Date: 17/Sep/20 05:11
Worklog Time Spent: 10m

Work Description: hadoop-yetus commented on pull request #2297:
URL: https://github.com/apache/hadoop/pull/2297#issuecomment-693868067

:broken_heart: **-1 overall**

| Vote | Subsystem | Runtime | Comment |
|::|--:|:|:|
| +0 :ok: | reexec | 29m 41s | Docker mode activated. |
||| _ Prechecks _ |
| +1 :green_heart: | dupname | 0m 0s | No case conflicting files found. |
| +1 :green_heart: | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 :green_heart: | test4tests | 0m 0s | The patch appears to include 5 new or modified test files. |
||| _ trunk Compile Tests _ |
| +0 :ok: | mvndep | 3m 20s | Maven dependency ordering for branch |
| +1 :green_heart: | mvninstall | 27m 3s | trunk passed |
| +1 :green_heart: | compile | 21m 43s | trunk passed with JDK Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1 |
| +1 :green_heart: | compile | 18m 24s | trunk passed with JDK Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 |
| +1 :green_heart: | checkstyle | 2m 50s | trunk passed |
| +1 :green_heart: | mvnsite | 2m 41s | trunk passed |
| +1 :green_heart: | shadedclient | 20m 17s | branch has no errors when building and testing our client artifacts. |
| +1 :green_heart: | javadoc | 1m 40s | trunk passed with JDK Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1 |
| +1 :green_heart: | javadoc | 2m 37s | trunk passed with JDK Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 |
| +0 :ok: | spotbugs | 0m 29s | Used deprecated FindBugs config; considering switching to SpotBugs. |
| +0 :ok: | findbugs | 0m 40s | branch/hadoop-project no findbugs output file (findbugsXml.xml) |
| +0 :ok: | findbugs | 0m 29s | branch/hadoop-project-dist no findbugs output file (findbugsXml.xml) |
||| _ Patch Compile Tests _ |
| +0 :ok: | mvndep | 0m 32s | Maven dependency ordering for patch |
| +1 :green_heart: | mvninstall | 1m 17s | the patch passed |
| +1 :green_heart: | compile | 20m 57s | the patch passed with JDK Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1 |
| -1 :x: | cc | 20m 57s | root-jdkUbuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1 with JDK Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1 generated 18 new + 145 unchanged - 18 fixed = 163 total (was 163) |
| +1 :green_heart: | golang | 20m 57s | the patch passed |
| +1 :green_heart: | javac | 20m 57s | the patch passed |
| +1 :green_heart: | compile | 18m 29s | the patch passed with JDK Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 |
| -1 :x: | cc | 18m 29s | root-jdkPrivateBuild-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 with JDK Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 generated 40 new + 123 unchanged - 40 fixed = 163 total (was 163) |
| +1 :green_heart: | golang | 18m 29s | the patch passed |
| +1 :green_heart: | javac | 18m 29s | the patch passed |
| -0 :warning: | checkstyle | 2m 54s | root: The patch generated 6 new + 151 unchanged - 5 fixed = 157 total (was 156) |
| +1 :green_heart: | mvnsite | 2m 39s | the patch passed |
| +1 :green_heart: | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 :green_heart: | xml | 0m 4s | The patch has no ill-formed XML file. |
| +1 :green_heart: | shadedclient | 14m 9s | patch has no errors when building and testing our client artifacts. |
| +1 :green_heart: | javadoc | 1m 37s | the patch passed with JDK Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1 |
| +1 :green_heart: | javadoc | 2m 39s | the patch passed with JDK Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 |
| +0 :ok: | findbugs | 0m 29s | hadoop-project has no data from findbugs |
| +0 :ok: | findbugs | 0m 34s | hadoop-project-dist has no data from findbugs |
||| _ Other Tests _ |
| +1 :green_heart: | unit | 0m 30s | hadoop-project in the patch passed. |
| +1 :green_heart: | unit | 0m 30s | hadoop-project-dist in the patch passed. |
| -1 :x: | unit | 10m 18s | hadoop-common in the patch passed. |
| +1 :green_heart: | asflicense | 0m 57s | The patch does not generate ASF License warnings. |
| | | 214m 44s | |

| Reason | Tests |
|---:|:--|
| Failed junit tests | hadoop.ha.TestZKFailoverController |

| Subsystem | Report/Notes |
|--:|:-|
[jira] [Work logged] (HADOOP-17125) Using snappy-java in SnappyCodec
[ https://issues.apache.org/jira/browse/HADOOP-17125?focusedWorklogId=485532=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-485532 ]

ASF GitHub Bot logged work on HADOOP-17125:
---
Author: ASF GitHub Bot
Created on: 17/Sep/20 05:07
Start Date: 17/Sep/20 05:07
Worklog Time Spent: 10m

Work Description: hadoop-yetus commented on pull request #2297:
URL: https://github.com/apache/hadoop/pull/2297#issuecomment-693863169

:broken_heart: **-1 overall**

| Vote | Subsystem | Runtime | Comment |
|::|--:|:|:|
| +0 :ok: | reexec | 29m 58s | Docker mode activated. |
||| _ Prechecks _ |
| +1 :green_heart: | dupname | 0m 1s | No case conflicting files found. |
| +1 :green_heart: | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 :green_heart: | test4tests | 0m 0s | The patch appears to include 5 new or modified test files. |
||| _ trunk Compile Tests _ |
| +0 :ok: | mvndep | 3m 21s | Maven dependency ordering for branch |
| +1 :green_heart: | mvninstall | 27m 10s | trunk passed |
| +1 :green_heart: | compile | 21m 26s | trunk passed with JDK Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1 |
| +1 :green_heart: | compile | 18m 44s | trunk passed with JDK Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 |
| +1 :green_heart: | checkstyle | 2m 54s | trunk passed |
| +1 :green_heart: | mvnsite | 2m 27s | trunk passed |
| +1 :green_heart: | shadedclient | 19m 56s | branch has no errors when building and testing our client artifacts. |
| +1 :green_heart: | javadoc | 1m 38s | trunk passed with JDK Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1 |
| +1 :green_heart: | javadoc | 2m 32s | trunk passed with JDK Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 |
| +0 :ok: | spotbugs | 0m 38s | Used deprecated FindBugs config; considering switching to SpotBugs. |
| +0 :ok: | findbugs | 0m 36s | branch/hadoop-project no findbugs output file (findbugsXml.xml) |
| +0 :ok: | findbugs | 0m 37s | branch/hadoop-project-dist no findbugs output file (findbugsXml.xml) |
||| _ Patch Compile Tests _ |
| +0 :ok: | mvndep | 0m 30s | Maven dependency ordering for patch |
| +1 :green_heart: | mvninstall | 1m 16s | the patch passed |
| +1 :green_heart: | compile | 20m 49s | the patch passed with JDK Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1 |
| -1 :x: | cc | 20m 49s | root-jdkUbuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1 with JDK Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1 generated 17 new + 146 unchanged - 17 fixed = 163 total (was 163) |
| +1 :green_heart: | golang | 20m 49s | the patch passed |
| +1 :green_heart: | javac | 20m 49s | the patch passed |
| +1 :green_heart: | compile | 18m 31s | the patch passed with JDK Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 |
| -1 :x: | cc | 18m 31s | root-jdkPrivateBuild-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 with JDK Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 generated 11 new + 152 unchanged - 11 fixed = 163 total (was 163) |
| +1 :green_heart: | golang | 18m 31s | the patch passed |
| +1 :green_heart: | javac | 18m 31s | the patch passed |
| -0 :warning: | checkstyle | 2m 49s | root: The patch generated 6 new + 151 unchanged - 5 fixed = 157 total (was 156) |
| +1 :green_heart: | mvnsite | 2m 32s | the patch passed |
| +1 :green_heart: | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 :green_heart: | xml | 0m 5s | The patch has no ill-formed XML file. |
| +1 :green_heart: | shadedclient | 14m 3s | patch has no errors when building and testing our client artifacts. |
| +1 :green_heart: | javadoc | 1m 38s | the patch passed with JDK Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1 |
| +1 :green_heart: | javadoc | 2m 42s | the patch passed with JDK Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 |
| +0 :ok: | findbugs | 0m 34s | hadoop-project has no data from findbugs |
| +0 :ok: | findbugs | 0m 35s | hadoop-project-dist has no data from findbugs |
||| _ Other Tests _ |
| +1 :green_heart: | unit | 0m 33s | hadoop-project in the patch passed. |
| +1 :green_heart: | unit | 0m 34s | hadoop-project-dist in the patch passed. |
| +1 :green_heart: | unit | 9m 49s | hadoop-common in the patch passed. |
| +1 :green_heart: | asflicense | 0m 50s | The patch does not generate ASF License warnings. |
| | | 213m 57s | |

| Subsystem | Report/Notes |
|--:|:-|
| Docker | ClientAPI=1.40 ServerAPI=1.40 base:
[jira] [Work logged] (HADOOP-17125) Using snappy-java in SnappyCodec
[ https://issues.apache.org/jira/browse/HADOOP-17125?focusedWorklogId=485517=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-485517 ]

ASF GitHub Bot logged work on HADOOP-17125:
---
Author: ASF GitHub Bot
Created on: 17/Sep/20 03:17
Start Date: 17/Sep/20 03:17
Worklog Time Spent: 10m

Work Description: dbtsai commented on a change in pull request #2297:
URL: https://github.com/apache/hadoop/pull/2297#discussion_r489922981

## File path: hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/io/compress/snappy/SnappyCompressor.java

@@ -48,24 +48,6 @@
   private long bytesRead = 0L;
   private long bytesWritten = 0L;
-  private static boolean nativeSnappyLoaded = false;
-
-  static {
-    if (NativeCodeLoader.isNativeCodeLoaded() &&
-        NativeCodeLoader.buildSupportsSnappy()) {
-      try {
-        initIDs();
-        nativeSnappyLoaded = true;
-      } catch (Throwable t) {
-        LOG.error("failed to load SnappyCompressor", t);
-      }
-    }
-  }
-
-  public static boolean isNativeCodeLoaded() {
-    return nativeSnappyLoaded;

Review comment: we need to check if the snappy class is available for SnappyCompressor too.
Issue Time Tracking
---
Worklog Id: (was: 485517)
Time Spent: 11h 40m (was: 11.5h)
[jira] [Work logged] (HADOOP-17125) Using snappy-java in SnappyCodec
[ https://issues.apache.org/jira/browse/HADOOP-17125?focusedWorklogId=485516=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-485516 ]

ASF GitHub Bot logged work on HADOOP-17125:
---
Author: ASF GitHub Bot
Created on: 17/Sep/20 03:15
Start Date: 17/Sep/20 03:15
Worklog Time Spent: 10m

Work Description: dbtsai commented on a change in pull request #2297:
URL: https://github.com/apache/hadoop/pull/2297#discussion_r489921893

## File path: hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/io/compress/snappy/SnappyDecompressor.java

@@ -45,30 +46,19 @@
   private int userBufOff = 0, userBufLen = 0;
   private boolean finished;
-  private static boolean nativeSnappyLoaded = false;
-
-  static {
-    if (NativeCodeLoader.isNativeCodeLoaded() &&
-        NativeCodeLoader.buildSupportsSnappy()) {
-      try {
-        initIDs();
-        nativeSnappyLoaded = true;
-      } catch (Throwable t) {
-        LOG.error("failed to load SnappyDecompressor", t);
-      }
-    }
-  }
-
-  public static boolean isNativeCodeLoaded() {
-    return nativeSnappyLoaded;
-  }
-
   /**
    * Creates a new compressor.
    *
    * @param directBufferSize size of the direct buffer to be used.
    */
   public SnappyDecompressor(int directBufferSize) {
+    // `snappy-java` is provided scope. We need to check its availability.
+    try {
+      SnappyLoader.getVersion();
+    } catch (Throwable t) {
+      LOG.warn("Error loading snappy libraries: " + t);

Review comment: In the original code, we throw a runtime exception if the native snappy is not found. Should we follow?

```
throw new RuntimeException("native snappy library not available: " +
    "SnappyCompressor has not been loaded.");
```
Issue Time Tracking
---
Worklog Id: (was: 485516)
Time Spent: 11.5h (was: 11h 20m)
[jira] [Work logged] (HADOOP-17125) Using snappy-java in SnappyCodec
[ https://issues.apache.org/jira/browse/HADOOP-17125?focusedWorklogId=485489=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-485489 ]

ASF GitHub Bot logged work on HADOOP-17125:
---
Author: ASF GitHub Bot
Created on: 17/Sep/20 01:36
Start Date: 17/Sep/20 01:36
Worklog Time Spent: 10m

Work Description: viirya commented on a change in pull request #2297:
URL: https://github.com/apache/hadoop/pull/2297#discussion_r489860460

## File path: hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/io/compress/snappy/SnappyDecompressor.java

@@ -276,13 +258,29 @@
   public void end() {
     // do nothing
   }
-  private native static void initIDs();
+  private int decompressBytesDirect() throws IOException {
+    if (compressedDirectBufLen == 0) {
+      return 0;
+    } else {
+      // Set the position and limit of `compressedDirectBuf` for reading
+      compressedDirectBuf.limit(compressedDirectBufLen).position(0);
+      // There is compressed input, decompress it now.
+      int size = Snappy.uncompressedLength((ByteBuffer) compressedDirectBuf);
+      if (size > uncompressedDirectBuf.remaining()) {
+        throw new IOException("Could not decompress data. " +
+            "uncompressedDirectBuf length is too small.");

Review comment: I found this check is not needed. Removed.
Issue Time Tracking
---
Worklog Id: (was: 485489)
Time Spent: 11h 20m (was: 11h 10m)
[jira] [Work logged] (HADOOP-17125) Using snappy-java in SnappyCodec
[ https://issues.apache.org/jira/browse/HADOOP-17125?focusedWorklogId=485487=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-485487 ]

ASF GitHub Bot logged work on HADOOP-17125:
---
Author: ASF GitHub Bot
Created on: 17/Sep/20 01:32
Start Date: 17/Sep/20 01:32
Worklog Time Spent: 10m

Work Description: viirya commented on a change in pull request #2297:
URL: https://github.com/apache/hadoop/pull/2297#discussion_r489858089

## File path: hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/io/compress/snappy/SnappyDecompressor.java

@@ -276,13 +258,29 @@
   public void end() {
     // do nothing
   }
-  private native static void initIDs();
+  private int decompressBytesDirect() throws IOException {
+    if (compressedDirectBufLen == 0) {
+      return 0;
+    } else {
+      // Set the position and limit of `compressedDirectBuf` for reading
+      compressedDirectBuf.limit(compressedDirectBufLen).position(0);
+      // There is compressed input, decompress it now.
+      int size = Snappy.uncompressedLength((ByteBuffer) compressedDirectBuf);
+      if (size > uncompressedDirectBuf.remaining()) {
+        throw new IOException("Could not decompress data. " +
+            "uncompressedDirectBuf length is too small.");
+      }
+      size = Snappy.uncompress((ByteBuffer) compressedDirectBuf,
+          (ByteBuffer) uncompressedDirectBuf);
+      compressedDirectBufLen = 0;
+      compressedDirectBuf.limit(compressedDirectBuf.capacity()).position(0);
+      return size;
+    }
+  }
-  private native int decompressBytesDirect();
   int decompressDirect(ByteBuffer src, ByteBuffer dst) throws IOException {
     assert (this instanceof SnappyDirectDecompressor);
-
+

Review comment: ok, reverted trailing whitespace.
For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 485487) Time Spent: 11h 10m (was: 11h) > Using snappy-java in SnappyCodec > > > Key: HADOOP-17125 > URL: https://issues.apache.org/jira/browse/HADOOP-17125 > Project: Hadoop Common > Issue Type: New Feature > Components: common >Affects Versions: 3.3.0 >Reporter: DB Tsai >Priority: Major > Labels: pull-request-available > Time Spent: 11h 10m > Remaining Estimate: 0h > > In Hadoop, we use native libs for snappy codec which has several > disadvantages: > * It requires native *libhadoop* and *libsnappy* to be installed in system > *LD_LIBRARY_PATH*, and they have to be installed separately on each node of > the clusters, container images, or local test environments which adds huge > complexities from deployment point of view. In some environments, it requires > compiling the natives from sources which is non-trivial. Also, this approach > is platform dependent; the binary may not work in different platform, so it > requires recompilation. > * It requires extra configuration of *java.library.path* to load the > natives, and it results higher application deployment and maintenance cost > for users. > Projects such as *Spark* and *Parquet* use > [snappy-java|[https://github.com/xerial/snappy-java]] which is JNI-based > implementation. It contains native binaries for Linux, Mac, and IBM in jar > file, and it can automatically load the native binaries into JVM from jar > without any setup. If a native implementation can not be found for a > platform, it can fallback to pure-java implementation of snappy based on > [aircompressor|[https://github.com/airlift/aircompressor/tree/master/src/main/java/io/airlift/compress/snappy]]. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
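The buffer handling in the `decompressBytesDirect` rewrite above follows a standard direct-`ByteBuffer` discipline: set `limit` to the valid data length and `position` to zero before reading, then restore `limit(capacity()).position(0)` once the data is consumed. A stdlib-only illustration of that pattern (the buffer name and sizes here are made up for the demo):

```java
import java.nio.ByteBuffer;

public class BufferWindowDemo {

    /** Expose exactly len readable bytes, like limit(compressedDirectBufLen).position(0). */
    static int readableWindow(ByteBuffer buf, int len) {
        buf.limit(len).position(0);
        return buf.remaining();
    }

    /** Reset the full capacity for the next fill, like limit(capacity()).position(0). */
    static int resetWindow(ByteBuffer buf) {
        buf.limit(buf.capacity()).position(0);
        return buf.remaining();
    }

    public static void main(String[] args) {
        ByteBuffer compressedDirectBuf = ByteBuffer.allocateDirect(64);
        compressedDirectBuf.put(new byte[10]);  // pretend 10 compressed bytes arrived

        System.out.println(readableWindow(compressedDirectBuf, 10)); // 10
        System.out.println(resetWindow(compressedDirectBuf));        // 64
    }
}
```

Skipping the reset step is a classic source of "stale limit" bugs: the next fill would silently stop at the previous compressed length instead of the buffer's full capacity, which is exactly why the patched method restores the window before returning.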
[jira] [Work logged] (HADOOP-17125) Using snappy-java in SnappyCodec
[ https://issues.apache.org/jira/browse/HADOOP-17125?focusedWorklogId=485476=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-485476 ]
ASF GitHub Bot logged work on HADOOP-17125:
Author: ASF GitHub Bot
Created on: 17/Sep/20 00:44
Worklog Time Spent: 10m
Work Description: viirya commented on a change in pull request #2297:
URL: https://github.com/apache/hadoop/pull/2297#discussion_r489829411

File path: hadoop-project-dist/pom.xml

@@ -341,7 +340,6 @@
     --openssllib=${openssl.lib}
     --opensslbinbundle=${bundle.openssl.in.bin}
     --openssllibbundle=${bundle.openssl}
-    --snappybinbundle=${bundle.snappy.in.bin}
     --snappylib=${snappy.lib}
     --snappylibbundle=${bundle.snappy}

Review comment: we don't touch hadoop-mapreduce-client-nativetask, which still needs the snappy lib, per https://github.com/apache/hadoop/pull/2201#issuecomment-681687572

Issue Time Tracking --- Worklog Id: (was: 485476) Time Spent: 11h (was: 10h 50m)
[jira] [Work logged] (HADOOP-17125) Using snappy-java in SnappyCodec
[ https://issues.apache.org/jira/browse/HADOOP-17125?focusedWorklogId=485414=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-485414 ]
ASF GitHub Bot logged work on HADOOP-17125:
Author: ASF GitHub Bot
Created on: 16/Sep/20 21:51
Worklog Time Spent: 10m
Work Description: dbtsai edited a comment on pull request #2297:
URL: https://github.com/apache/hadoop/pull/2297#issuecomment-693683581

cc @jlowe @liuml07 @tgravescs who recently worked on compression codecs for more feedback.

Issue Time Tracking --- Worklog Id: (was: 485414) Time Spent: 10h 50m (was: 10h 40m)
[jira] [Work logged] (HADOOP-17125) Using snappy-java in SnappyCodec
[ https://issues.apache.org/jira/browse/HADOOP-17125?focusedWorklogId=485413=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-485413 ]
ASF GitHub Bot logged work on HADOOP-17125:
Author: ASF GitHub Bot
Created on: 16/Sep/20 21:48
Worklog Time Spent: 10m
Work Description: dbtsai commented on pull request #2297:
URL: https://github.com/apache/hadoop/pull/2297#issuecomment-693683581

cc @jlowe and @liuml07 who recently worked on compression codecs for more feedback.

Issue Time Tracking --- Worklog Id: (was: 485413) Time Spent: 10h 40m (was: 10.5h)
[jira] [Work logged] (HADOOP-17125) Using snappy-java in SnappyCodec
[ https://issues.apache.org/jira/browse/HADOOP-17125?focusedWorklogId=485313=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-485313 ]
ASF GitHub Bot logged work on HADOOP-17125:
Author: ASF GitHub Bot
Created on: 16/Sep/20 18:50
Worklog Time Spent: 10m
Work Description: dbtsai commented on a change in pull request #2297:
URL: https://github.com/apache/hadoop/pull/2297#discussion_r489663248

File path: hadoop-common-project/hadoop-common/pom.xml

@@ -363,6 +363,10 @@
       wildfly-openssl-java
       provided
+
+      org.xerial.snappy

Review comment: We can make it provided, and once we create a `hadoop-compression` module, we can add back the jar. @viirya since the jar will be provided, we need to check if the class exists so we can log it with the right message.

Issue Time Tracking --- Worklog Id: (was: 485313) Time Spent: 10.5h (was: 10h 20m)
[jira] [Work logged] (HADOOP-17125) Using snappy-java in SnappyCodec
[ https://issues.apache.org/jira/browse/HADOOP-17125?focusedWorklogId=485137=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-485137 ]
ASF GitHub Bot logged work on HADOOP-17125:
Author: ASF GitHub Bot
Created on: 16/Sep/20 13:19
Worklog Time Spent: 10m
Work Description: steveloughran commented on a change in pull request #2297:
URL: https://github.com/apache/hadoop/pull/2297#discussion_r489422661

File path: hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/io/compress/snappy/SnappyDecompressor.java

@@ -276,13 +258,29 @@ public void end() {
     // do nothing
   }

-  private native static void initIDs();
+  private int decompressBytesDirect() throws IOException {
+    if (compressedDirectBufLen == 0) {
+      return 0;
+    } else {
+      // Set the position and limit of `compressedDirectBuf` for reading
+      compressedDirectBuf.limit(compressedDirectBufLen).position(0);
+      // There is compressed input, decompress it now.
+      int size = Snappy.uncompressedLength((ByteBuffer) compressedDirectBuf);
+      if (size > uncompressedDirectBuf.remaining()) {
+        throw new IOException("Could not decompress data. "
+            + "uncompressedDirectBuf length is too small.");
+      }
+      size = Snappy.uncompress((ByteBuffer) compressedDirectBuf,
+          (ByteBuffer) uncompressedDirectBuf);
+      compressedDirectBufLen = 0;
+      compressedDirectBuf.limit(compressedDirectBuf.capacity()).position(0);
+      return size;
+    }
+  }

-  private native int decompressBytesDirect();
-
   int decompressDirect(ByteBuffer src, ByteBuffer dst) throws IOException {
     assert (this instanceof SnappyDirectDecompressor);

Review comment: please stop the IDE removing trailing whitespace on lines which haven't been edited; complicates life

File path: hadoop-common-project/hadoop-common/pom.xml

@@ -363,6 +363,10 @@
       wildfly-openssl-java
       provided
+
+      org.xerial.snappy

Review comment: provided

File path: hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/io/compress/snappy/SnappyDecompressor.java (same hunk, on the lines)

+      if (size > uncompressedDirectBuf.remaining()) {
+        throw new IOException("Could not decompress data. "
+            + "uncompressedDirectBuf length is too small.");

Review comment: use the name of the config option which users can tune

File path: hadoop-project-dist/pom.xml

@@ -341,7 +340,6 @@
     --openssllib=${openssl.lib}
     --opensslbinbundle=${bundle.openssl.in.bin}
     --openssllibbundle=${bundle.openssl}
-    --snappybinbundle=${bundle.snappy.in.bin}
     --snappylib=${snappy.lib}
     --snappylibbundle=${bundle.snappy}

Review comment: what about the other snappy libs?

Issue Time Tracking --- Worklog Id: (was: 485137) Time Spent: 10h 20m (was: 10h 10m)
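The review above asks that the "uncompressedDirectBuf length is too small" error name the config option users can tune. A minimal sketch of such a message, assuming the key is Hadoop's `io.compression.codec.snappy.buffersize` (verify against the actual `CommonConfigurationKeys` constant before relying on it; the helper name `checkCapacity` is hypothetical):

```java
import java.io.IOException;

public class DecompressErrorDemo {

    // Assumed to match Hadoop's snappy buffer-size key; check the real constant.
    static final String SNAPPY_BUFFER_SIZE_KEY =
        "io.compression.codec.snappy.buffersize";

    /** Fails with a message that names the tunable option, per the review. */
    static void checkCapacity(int uncompressedSize, int remaining)
            throws IOException {
        if (uncompressedSize > remaining) {
            throw new IOException("Could not decompress data: uncompressed size "
                + uncompressedSize + " exceeds remaining buffer space " + remaining
                + "; consider increasing " + SNAPPY_BUFFER_SIZE_KEY);
        }
    }

    public static void main(String[] args) throws IOException {
        checkCapacity(100, 256);   // fits: no exception
        try {
            checkCapacity(1024, 256);
        } catch (IOException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

An actionable message like this turns a dead-end stack trace into a one-line config fix for operators.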
[jira] [Work logged] (HADOOP-17125) Using snappy-java in SnappyCodec
[ https://issues.apache.org/jira/browse/HADOOP-17125?focusedWorklogId=485129=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-485129 ]
ASF GitHub Bot logged work on HADOOP-17125:
Author: ASF GitHub Bot
Created on: 16/Sep/20 13:00
Worklog Time Spent: 10m
Work Description: hadoop-yetus removed a comment on pull request #2297:
URL: https://github.com/apache/hadoop/pull/2297#issuecomment-693125895

Issue Time Tracking --- Worklog Id: (was: 485129) Time Spent: 10h 10m (was: 10h)
[jira] [Work logged] (HADOOP-17125) Using snappy-java in SnappyCodec
[ https://issues.apache.org/jira/browse/HADOOP-17125?focusedWorklogId=485126=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-485126 ]
ASF GitHub Bot logged work on HADOOP-17125:
Author: ASF GitHub Bot
Created on: 16/Sep/20 12:55
Worklog Time Spent: 10m
Work Description: hadoop-yetus removed a comment on pull request #2297:
URL: https://github.com/apache/hadoop/pull/2297#issuecomment-692976368

:broken_heart: **-1 overall**

| Vote | Subsystem | Runtime | Comment |
|::|--:|:|:|
| +0 :ok: | reexec | 0m 32s | Docker mode activated. |
||| _ Prechecks _ |
| +1 :green_heart: | dupname | 0m 0s | No case conflicting files found. |
| +1 :green_heart: | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 :green_heart: | test4tests | 0m 0s | The patch appears to include 5 new or modified test files. |
||| _ trunk Compile Tests _ |
| +0 :ok: | mvndep | 3m 22s | Maven dependency ordering for branch |
| +1 :green_heart: | mvninstall | 26m 9s | trunk passed |
| +1 :green_heart: | compile | 19m 26s | trunk passed with JDK Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1 |
| +1 :green_heart: | compile | 16m 58s | trunk passed with JDK Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 |
| +1 :green_heart: | checkstyle | 2m 44s | trunk passed |
| +1 :green_heart: | mvnsite | 2m 44s | trunk passed |
| +1 :green_heart: | shadedclient | 20m 27s | branch has no errors when building and testing our client artifacts. |
| +1 :green_heart: | javadoc | 1m 38s | trunk passed with JDK Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1 |
| +1 :green_heart: | javadoc | 2m 34s | trunk passed with JDK Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 |
| +0 :ok: | spotbugs | 0m 29s | Used deprecated FindBugs config; considering switching to SpotBugs. |
| +0 :ok: | findbugs | 0m 34s | branch/hadoop-project no findbugs output file (findbugsXml.xml) |
| +0 :ok: | findbugs | 0m 29s | branch/hadoop-project-dist no findbugs output file (findbugsXml.xml) |
||| _ Patch Compile Tests _ |
| +0 :ok: | mvndep | 0m 33s | Maven dependency ordering for patch |
| +1 :green_heart: | mvninstall | 1m 24s | the patch passed |
| +1 :green_heart: | compile | 20m 15s | the patch passed with JDK Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1 |
| -1 :x: | cc | 20m 15s | root-jdkUbuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1 with JDK Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1 generated 36 new + 127 unchanged - 36 fixed = 163 total (was 163) |
| +1 :green_heart: | golang | 20m 15s | the patch passed |
| +1 :green_heart: | javac | 20m 15s | the patch passed |
| +1 :green_heart: | compile | 19m 48s | the patch passed with JDK Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 |
| -1 :x: | cc | 19m 48s | root-jdkPrivateBuild-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 with JDK Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 generated 36 new + 127 unchanged - 36 fixed = 163 total (was 163) |
| +1 :green_heart: | golang | 19m 48s | the patch passed |
| +1 :green_heart: | javac | 19m 48s | the patch passed |
| -0 :warning: | checkstyle | 3m 3s | root: The patch generated 1 new + 151 unchanged - 5 fixed = 152 total (was 156) |
| +1 :green_heart: | mvnsite | 2m 29s | the patch passed |
| +1 :green_heart: | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 :green_heart: | xml | 0m 4s | The patch has no ill-formed XML file. |
| +1 :green_heart: | shadedclient | 16m 27s | patch has no errors when building and testing our client artifacts. |
| +1 :green_heart: | javadoc | 1m 27s | the patch passed with JDK Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1 |
| +1 :green_heart: | javadoc | 2m 33s | the patch passed with JDK Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 |
| +0 :ok: | findbugs | 0m 27s | hadoop-project has no data from findbugs |
| +0 :ok: | findbugs | 0m 24s | hadoop-project-dist has no data from findbugs |
||| _ Other Tests _ |
| +1 :green_heart: | unit | 0m 24s | hadoop-project in the patch passed. |
| +1 :green_heart: | unit | 0m 26s | hadoop-project-dist in the patch passed. |
| +1 :green_heart: | unit | 10m 18s | hadoop-common in the patch passed. |
| +1 :green_heart: | asflicense | 0m 53s | The patch does not generate ASF License warnings. |
| | | 182m 50s | |

| Subsystem | Report/Notes |
|--:|:-|
| Docker | ClientAPI=1.40 ServerAPI=1.40 base:
[jira] [Work logged] (HADOOP-17125) Using snappy-java in SnappyCodec
[ https://issues.apache.org/jira/browse/HADOOP-17125?focusedWorklogId=484870=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-484870 ]
ASF GitHub Bot logged work on HADOOP-17125:
Author: ASF GitHub Bot
Created on: 16/Sep/20 02:09
Worklog Time Spent: 10m
Work Description: hadoop-yetus commented on pull request #2297:
URL: https://github.com/apache/hadoop/pull/2297#issuecomment-693125895

:broken_heart: **-1 overall**

| Vote | Subsystem | Runtime | Comment |
|::|--:|:|:|
| +0 :ok: | reexec | 39m 42s | Docker mode activated. |
||| _ Prechecks _ |
| +1 :green_heart: | dupname | 0m 0s | No case conflicting files found. |
| +1 :green_heart: | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 :green_heart: | test4tests | 0m 0s | The patch appears to include 5 new or modified test files. |
||| _ trunk Compile Tests _ |
| +0 :ok: | mvndep | 3m 29s | Maven dependency ordering for branch |
| +1 :green_heart: | mvninstall | 35m 48s | trunk passed |
| +1 :green_heart: | compile | 29m 25s | trunk passed with JDK Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1 |
| +1 :green_heart: | compile | 24m 38s | trunk passed with JDK Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 |
| +1 :green_heart: | checkstyle | 3m 51s | trunk passed |
| +1 :green_heart: | mvnsite | 3m 2s | trunk passed |
| +1 :green_heart: | shadedclient | 26m 19s | branch has no errors when building and testing our client artifacts. |
| +1 :green_heart: | javadoc | 1m 44s | trunk passed with JDK Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1 |
| +1 :green_heart: | javadoc | 2m 53s | trunk passed with JDK Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 |
| +0 :ok: | spotbugs | 0m 30s | Used deprecated FindBugs config; considering switching to SpotBugs. |
| +0 :ok: | findbugs | 0m 36s | branch/hadoop-project no findbugs output file (findbugsXml.xml) |
| +0 :ok: | findbugs | 0m 30s | branch/hadoop-project-dist no findbugs output file (findbugsXml.xml) |
||| _ Patch Compile Tests _ |
| +0 :ok: | mvndep | 1m 6s | Maven dependency ordering for patch |
| +1 :green_heart: | mvninstall | 1m 43s | the patch passed |
| +1 :green_heart: | compile | 28m 55s | the patch passed with JDK Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1 |
| -1 :x: | cc | 28m 55s | root-jdkUbuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1 with JDK Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1 generated 14 new + 149 unchanged - 14 fixed = 163 total (was 163) |
| +1 :green_heart: | golang | 28m 55s | the patch passed |
| +1 :green_heart: | javac | 28m 55s | the patch passed |
| +1 :green_heart: | compile | 24m 49s | the patch passed with JDK Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 |
| -1 :x: | cc | 24m 49s | root-jdkPrivateBuild-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 with JDK Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 generated 21 new + 142 unchanged - 21 fixed = 163 total (was 163) |
| +1 :green_heart: | golang | 24m 49s | the patch passed |
| +1 :green_heart: | javac | 24m 49s | the patch passed |
| -0 :warning: | checkstyle | 3m 42s | root: The patch generated 6 new + 151 unchanged - 5 fixed = 157 total (was 156) |
| +1 :green_heart: | mvnsite | 3m 5s | the patch passed |
| +1 :green_heart: | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 :green_heart: | xml | 0m 5s | The patch has no ill-formed XML file. |
| +1 :green_heart: | shadedclient | 17m 48s | patch has no errors when building and testing our client artifacts. |
| +1 :green_heart: | javadoc | 1m 48s | the patch passed with JDK Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1 |
| +1 :green_heart: | javadoc | 2m 52s | the patch passed with JDK Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 |
| +0 :ok: | findbugs | 0m 33s | hadoop-project has no data from findbugs |
| +0 :ok: | findbugs | 0m 33s | hadoop-project-dist has no data from findbugs |
||| _ Other Tests _ |
| +1 :green_heart: | unit | 0m 31s | hadoop-project in the patch passed. |
| +1 :green_heart: | unit | 0m 31s | hadoop-project-dist in the patch passed. |
| +1 :green_heart: | unit | 12m 25s | hadoop-common in the patch passed. |
| +1 :green_heart: | asflicense | 1m 2s | The patch does not generate ASF License warnings. |
| | | 278m 27s | |

| Subsystem | Report/Notes |
|--:|:-|
| Docker | ClientAPI=1.40 ServerAPI=1.40 base:
[jira] [Work logged] (HADOOP-17125) Using snappy-java in SnappyCodec
[ https://issues.apache.org/jira/browse/HADOOP-17125?focusedWorklogId=484809=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-484809 ]
ASF GitHub Bot logged work on HADOOP-17125:
Author: ASF GitHub Bot
Created on: 15/Sep/20 22:53
Worklog Time Spent: 10m
Work Description: dbtsai edited a comment on pull request #2297:
URL: https://github.com/apache/hadoop/pull/2297#issuecomment-693003698

cc @xerial @steveloughran and @jojochuang for review.

Issue Time Tracking --- Worklog Id: (was: 484809) Time Spent: 9h 40m (was: 9.5h)
[jira] [Work logged] (HADOOP-17125) Using snappy-java in SnappyCodec
[ https://issues.apache.org/jira/browse/HADOOP-17125?focusedWorklogId=484805&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-484805 ]

ASF GitHub Bot logged work on HADOOP-17125:
---
Author: ASF GitHub Bot
Created on: 15/Sep/20 22:48
Start Date: 15/Sep/20 22:48
Worklog Time Spent: 10m

Work Description: dbtsai commented on a change in pull request #2297:
URL: https://github.com/apache/hadoop/pull/2297#discussion_r489045005

## File path: hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/io/compress/snappy/TestSnappyCompressorDecompressor.java

## @@ -441,4 +442,43 @@ public void doWork() throws Exception {
     ctx.waitFor(6);
   }
+
+  @Test
+  public void testSnappyCompatibility() throws Exception {
+    // HADOOP-17125. Using snappy-java in SnappyCodec. These strings are raw data
+    // and data compressed by the previous native Snappy codec. We use the updated
+    // Snappy codec to decode it and check that it matches.
+    String rawData = "010a06030a040a0c0109020c0a010204020d02000b010701080605080b090902060a080502060a0d06070908080a0c0105030904090d05090800040c090c0d0d0804000d00040b0b0d010d060907020a030a0c0900040905080107040d0c01060a0b09070a04000b01040b09000e0e00020b06050b060e030e0a07050d06050d";
+    String compressed = "8001f07f010a06030a040a0c0109020c0a010204020d02000b010701080605080b090902060a080502060a0d06070908080a0c0105030904090d05090800040c090c0d0d0804000d00040b0b0d010d060907020a030a0c0900040905080107040d0c01060a0b09070a04000b01040b09000e0e00020b06050b060e030e0a07050d06050d";

Review comment: Maybe you can add a comment that the compressed data was generated by the current native Snappy codec?
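The two hex strings in the test above follow the raw Snappy stream layout: a little-endian varint preamble giving the uncompressed length (0x80 0x01 decodes to 128 bytes), then a single literal element (tag byte 0xf0 means a literal whose length is carried in one extra byte; 0x7f encodes 127 + 1 = 128 literal bytes). Because the sample data is incompressible, the "compressed" form is just the raw bytes behind that four-byte header. A standalone sketch that decodes this literal-only case (not a full Snappy decompressor, and not the Hadoop codec itself) could look like:

```java
import java.io.ByteArrayOutputStream;

// Minimal decoder for literal-only raw Snappy streams, enough to decode the
// test vector above. Illustrative sketch only: copy elements, which real
// Snappy streams also contain, are deliberately unsupported here.
public class SnappyLiteralDecoder {

    static byte[] hexToBytes(String hex) {
        byte[] out = new byte[hex.length() / 2];
        for (int i = 0; i < out.length; i++) {
            out[i] = (byte) Integer.parseInt(hex.substring(2 * i, 2 * i + 2), 16);
        }
        return out;
    }

    static String bytesToHex(byte[] data) {
        StringBuilder sb = new StringBuilder();
        for (byte b : data) {
            sb.append(String.format("%02x", b));
        }
        return sb.toString();
    }

    /** Decodes a raw Snappy stream that contains only literal elements. */
    public static byte[] decode(byte[] in) {
        int pos = 0;
        // Preamble: uncompressed length as a little-endian varint.
        long uncompressedLen = 0;
        int shift = 0;
        while ((in[pos] & 0x80) != 0) {
            uncompressedLen |= (long) (in[pos++] & 0x7f) << shift;
            shift += 7;
        }
        uncompressedLen |= (long) (in[pos++] & 0x7f) << shift;

        ByteArrayOutputStream out = new ByteArrayOutputStream();
        while (pos < in.length) {
            int tag = in[pos++] & 0xff;
            if ((tag & 0x03) != 0) {
                throw new UnsupportedOperationException("copy elements not handled");
            }
            // Upper 6 bits: 0..59 encode length - 1 directly; 60..63 mean that
            // (value - 59) extra little-endian length bytes follow.
            int len = tag >>> 2;
            if (len >= 60) {
                int extraBytes = len - 59;
                len = 0;
                for (int i = 0; i < extraBytes; i++) {
                    len |= (in[pos++] & 0xff) << (8 * i);
                }
            }
            len += 1;
            out.write(in, pos, len);
            pos += len;
        }
        if (out.size() != uncompressedLen) {
            throw new IllegalStateException("length mismatch");
        }
        return out.toByteArray();
    }

    public static void main(String[] args) {
        String raw = "010a06030a040a0c0109020c0a010204020d02000b010701080605080b090902060a080502060a0d06070908080a0c0105030904090d05090800040c090c0d0d0804000d00040b0b0d010d060907020a030a0c0900040905080107040d0c01060a0b09070a04000b01040b09000e0e00020b06050b060e030e0a07050d06050d";
        byte[] decoded = decode(hexToBytes("8001f07f" + raw));
        // Prints "true" if the decoded bytes match the raw test data.
        System.out.println(bytesToHex(decoded).equals(raw));
    }
}
```

Walking the test vector through this decoder makes the compatibility claim concrete: any correct Snappy implementation, native or pure-Java, must produce the same 128 raw bytes from the same compressed stream.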
Issue Time Tracking
---
Worklog Id: (was: 484805) Time Spent: 9.5h (was: 9h 20m)
[jira] [Work logged] (HADOOP-17125) Using snappy-java in SnappyCodec
[ https://issues.apache.org/jira/browse/HADOOP-17125?focusedWorklogId=484787&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-484787 ]

ASF GitHub Bot logged work on HADOOP-17125:
---
Author: ASF GitHub Bot
Created on: 15/Sep/20 22:03
Start Date: 15/Sep/20 22:03
Worklog Time Spent: 10m

Work Description: dbtsai commented on pull request #2297:
URL: https://github.com/apache/hadoop/pull/2297#issuecomment-693003698

cc @steveloughran and @jojochuang for review.

Issue Time Tracking
---
Worklog Id: (was: 484787) Time Spent: 9h 20m (was: 9h 10m)

--
This message was sent by Atlassian Jira
(v8.3.4#803005)