subject:"\[jira\] \[Work logged\] \(HADOOP\-17125\) Using snappy\-java in SnappyCodec"

[jira] [Work logged] (HADOOP-17125) Using snappy-java in SnappyCodec

2020-10-12 Thread ASF GitHub Bot (Jira)



 [ 
https://issues.apache.org/jira/browse/HADOOP-17125?focusedWorklogId=499515=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-499515
 ]

ASF GitHub Bot logged work on HADOOP-17125:
---

Author: ASF GitHub Bot
Created on: 12/Oct/20 16:51
Start Date: 12/Oct/20 16:51
Worklog Time Spent: 10m 
  Work Description: steveloughran commented on pull request #2297:
URL: https://github.com/apache/hadoop/pull/2297#issuecomment-707232854


   you needed to be listed in the project settings as someone with the right 
permissions. its done now



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 499515)
Time Spent: 25h 50m  (was: 25h 40m)

> Using snappy-java in SnappyCodec
> 
>
> Key: HADOOP-17125
> URL: https://issues.apache.org/jira/browse/HADOOP-17125
> Project: Hadoop Common
>  Issue Type: New Feature
>  Components: common
>Affects Versions: 3.3.0
>Reporter: DB Tsai
>Assignee: L. C. Hsieh
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.3.1, 3.4.0
>
>  Time Spent: 25h 50m
>  Remaining Estimate: 0h
>
> In Hadoop, we use native libs for snappy codec which has several 
> disadvantages:
>  * It requires native *libhadoop* and *libsnappy* to be installed in system 
> *LD_LIBRARY_PATH*, and they have to be installed separately on each node of 
> the clusters, container images, or local test environments which adds huge 
> complexities from deployment point of view. In some environments, it requires 
> compiling the natives from sources which is non-trivial. Also, this approach 
> is platform dependent; the binary may not work in different platform, so it 
> requires recompilation.
>  * It requires extra configuration of *java.library.path* to load the 
> natives, and it results higher application deployment and maintenance cost 
> for users.
> Projects such as *Spark* and *Parquet* use 
> [snappy-java|[https://github.com/xerial/snappy-java]] which is JNI-based 
> implementation. It contains native binaries for Linux, Mac, and IBM in jar 
> file, and it can automatically load the native binaries into JVM from jar 
> without any setup. If a native implementation can not be found for a 
> platform, it can fallback to pure-java implementation of snappy based on 
> [aircompressor|[https://github.com/airlift/aircompressor/tree/master/src/main/java/io/airlift/compress/snappy]].



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org

[jira] [Work logged] (HADOOP-17125) Using snappy-java in SnappyCodec

2020-10-12 Thread ASF GitHub Bot (Jira)



 [ 
https://issues.apache.org/jira/browse/HADOOP-17125?focusedWorklogId=499484=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-499484
 ]

ASF GitHub Bot logged work on HADOOP-17125:
---

Author: ASF GitHub Bot
Created on: 12/Oct/20 16:02
Start Date: 12/Oct/20 16:02
Worklog Time Spent: 10m 
  Work Description: viirya commented on pull request #2297:
URL: https://github.com/apache/hadoop/pull/2297#issuecomment-707207886


   @steveloughran Thank you! I tried to assign this ticket, but seems cannot do 
it.



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 499484)
Time Spent: 25h 40m  (was: 25.5h)

> Using snappy-java in SnappyCodec
> 
>
> Key: HADOOP-17125
> URL: https://issues.apache.org/jira/browse/HADOOP-17125
> Project: Hadoop Common
>  Issue Type: New Feature
>  Components: common
>Affects Versions: 3.3.0
>Reporter: DB Tsai
>Assignee: L. C. Hsieh
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.3.1, 3.4.0
>
>  Time Spent: 25h 40m
>  Remaining Estimate: 0h
>
> In Hadoop, we use native libs for snappy codec which has several 
> disadvantages:
>  * It requires native *libhadoop* and *libsnappy* to be installed in system 
> *LD_LIBRARY_PATH*, and they have to be installed separately on each node of 
> the clusters, container images, or local test environments which adds huge 
> complexities from deployment point of view. In some environments, it requires 
> compiling the natives from sources which is non-trivial. Also, this approach 
> is platform dependent; the binary may not work in different platform, so it 
> requires recompilation.
>  * It requires extra configuration of *java.library.path* to load the 
> natives, and it results higher application deployment and maintenance cost 
> for users.
> Projects such as *Spark* and *Parquet* use 
> [snappy-java|[https://github.com/xerial/snappy-java]] which is JNI-based 
> implementation. It contains native binaries for Linux, Mac, and IBM in jar 
> file, and it can automatically load the native binaries into JVM from jar 
> without any setup. If a native implementation can not be found for a 
> platform, it can fallback to pure-java implementation of snappy based on 
> [aircompressor|[https://github.com/airlift/aircompressor/tree/master/src/main/java/io/airlift/compress/snappy]].



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org

[jira] [Work logged] (HADOOP-17125) Using snappy-java in SnappyCodec

2020-10-12 Thread ASF GitHub Bot (Jira)



 [ 
https://issues.apache.org/jira/browse/HADOOP-17125?focusedWorklogId=499311=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-499311
 ]

ASF GitHub Bot logged work on HADOOP-17125:
---

Author: ASF GitHub Bot
Created on: 12/Oct/20 10:01
Start Date: 12/Oct/20 10:01
Worklog Time Spent: 10m 
  Work Description: steveloughran commented on pull request #2297:
URL: https://github.com/apache/hadoop/pull/2297#issuecomment-707019757


   @viirya assigned JIRA to you. you are also free to assign any other Hadoop 
JIRAs to yourself...



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 499311)
Time Spent: 25.5h  (was: 25h 20m)

> Using snappy-java in SnappyCodec
> 
>
> Key: HADOOP-17125
> URL: https://issues.apache.org/jira/browse/HADOOP-17125
> Project: Hadoop Common
>  Issue Type: New Feature
>  Components: common
>Affects Versions: 3.3.0
>Reporter: DB Tsai
>Assignee: L. C. Hsieh
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.3.1, 3.4.0
>
>  Time Spent: 25.5h
>  Remaining Estimate: 0h
>
> In Hadoop, we use native libs for snappy codec which has several 
> disadvantages:
>  * It requires native *libhadoop* and *libsnappy* to be installed in system 
> *LD_LIBRARY_PATH*, and they have to be installed separately on each node of 
> the clusters, container images, or local test environments which adds huge 
> complexities from deployment point of view. In some environments, it requires 
> compiling the natives from sources which is non-trivial. Also, this approach 
> is platform dependent; the binary may not work in different platform, so it 
> requires recompilation.
>  * It requires extra configuration of *java.library.path* to load the 
> natives, and it results higher application deployment and maintenance cost 
> for users.
> Projects such as *Spark* and *Parquet* use 
> [snappy-java|[https://github.com/xerial/snappy-java]] which is JNI-based 
> implementation. It contains native binaries for Linux, Mac, and IBM in jar 
> file, and it can automatically load the native binaries into JVM from jar 
> without any setup. If a native implementation can not be found for a 
> platform, it can fallback to pure-java implementation of snappy based on 
> [aircompressor|[https://github.com/airlift/aircompressor/tree/master/src/main/java/io/airlift/compress/snappy]].



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org

[jira] [Work logged] (HADOOP-17125) Using snappy-java in SnappyCodec

2020-10-09 Thread ASF GitHub Bot (Jira)



 [ 
https://issues.apache.org/jira/browse/HADOOP-17125?focusedWorklogId=498660=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-498660
 ]

ASF GitHub Bot logged work on HADOOP-17125:
---

Author: ASF GitHub Bot
Created on: 09/Oct/20 16:57
Start Date: 09/Oct/20 16:57
Worklog Time Spent: 10m 
  Work Description: viirya commented on pull request #2297:
URL: https://github.com/apache/hadoop/pull/2297#issuecomment-706292851


   @steveloughran Username is viirya too. Thanks.



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 498660)
Time Spent: 25h 20m  (was: 25h 10m)

> Using snappy-java in SnappyCodec
> 
>
> Key: HADOOP-17125
> URL: https://issues.apache.org/jira/browse/HADOOP-17125
> Project: Hadoop Common
>  Issue Type: New Feature
>  Components: common
>Affects Versions: 3.3.0
>Reporter: DB Tsai
>Assignee: DB Tsai
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.3.1, 3.4.0
>
>  Time Spent: 25h 20m
>  Remaining Estimate: 0h
>
> In Hadoop, we use native libs for snappy codec which has several 
> disadvantages:
>  * It requires native *libhadoop* and *libsnappy* to be installed in system 
> *LD_LIBRARY_PATH*, and they have to be installed separately on each node of 
> the clusters, container images, or local test environments which adds huge 
> complexities from deployment point of view. In some environments, it requires 
> compiling the natives from sources which is non-trivial. Also, this approach 
> is platform dependent; the binary may not work in different platform, so it 
> requires recompilation.
>  * It requires extra configuration of *java.library.path* to load the 
> natives, and it results higher application deployment and maintenance cost 
> for users.
> Projects such as *Spark* and *Parquet* use 
> [snappy-java|[https://github.com/xerial/snappy-java]] which is JNI-based 
> implementation. It contains native binaries for Linux, Mac, and IBM in jar 
> file, and it can automatically load the native binaries into JVM from jar 
> without any setup. If a native implementation can not be found for a 
> platform, it can fallback to pure-java implementation of snappy based on 
> [aircompressor|[https://github.com/airlift/aircompressor/tree/master/src/main/java/io/airlift/compress/snappy]].



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org

[jira] [Work logged] (HADOOP-17125) Using snappy-java in SnappyCodec

2020-10-09 Thread ASF GitHub Bot (Jira)



 [ 
https://issues.apache.org/jira/browse/HADOOP-17125?focusedWorklogId=498647=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-498647
 ]

ASF GitHub Bot logged work on HADOOP-17125:
---

Author: ASF GitHub Bot
Created on: 09/Oct/20 16:42
Start Date: 09/Oct/20 16:42
Worklog Time Spent: 10m 
  Work Description: steveloughran commented on pull request #2297:
URL: https://github.com/apache/hadoop/pull/2297#issuecomment-706286111


   @viirya...what's your JIRA username?



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 498647)
Time Spent: 25h 10m  (was: 25h)

> Using snappy-java in SnappyCodec
> 
>
> Key: HADOOP-17125
> URL: https://issues.apache.org/jira/browse/HADOOP-17125
> Project: Hadoop Common
>  Issue Type: New Feature
>  Components: common
>Affects Versions: 3.3.0
>Reporter: DB Tsai
>Assignee: DB Tsai
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.3.1, 3.4.0
>
>  Time Spent: 25h 10m
>  Remaining Estimate: 0h
>
> In Hadoop, we use native libs for snappy codec which has several 
> disadvantages:
>  * It requires native *libhadoop* and *libsnappy* to be installed in system 
> *LD_LIBRARY_PATH*, and they have to be installed separately on each node of 
> the clusters, container images, or local test environments which adds huge 
> complexities from deployment point of view. In some environments, it requires 
> compiling the natives from sources which is non-trivial. Also, this approach 
> is platform dependent; the binary may not work in different platform, so it 
> requires recompilation.
>  * It requires extra configuration of *java.library.path* to load the 
> natives, and it results higher application deployment and maintenance cost 
> for users.
> Projects such as *Spark* and *Parquet* use 
> [snappy-java|[https://github.com/xerial/snappy-java]] which is JNI-based 
> implementation. It contains native binaries for Linux, Mac, and IBM in jar 
> file, and it can automatically load the native binaries into JVM from jar 
> without any setup. If a native implementation can not be found for a 
> platform, it can fallback to pure-java implementation of snappy based on 
> [aircompressor|[https://github.com/airlift/aircompressor/tree/master/src/main/java/io/airlift/compress/snappy]].



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org

[jira] [Work logged] (HADOOP-17125) Using snappy-java in SnappyCodec

2020-10-07 Thread ASF GitHub Bot (Jira)



 [ 
https://issues.apache.org/jira/browse/HADOOP-17125?focusedWorklogId=496735=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-496735
 ]

ASF GitHub Bot logged work on HADOOP-17125:
---

Author: ASF GitHub Bot
Created on: 07/Oct/20 16:36
Start Date: 07/Oct/20 16:36
Worklog Time Spent: 10m 
  Work Description: sunchao commented on pull request #2297:
URL: https://github.com/apache/hadoop/pull/2297#issuecomment-705055574


   Thanks @steveloughran - could you assign the JIRA to @viirya ? 



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 496735)
Time Spent: 25h  (was: 24h 50m)

> Using snappy-java in SnappyCodec
> 
>
> Key: HADOOP-17125
> URL: https://issues.apache.org/jira/browse/HADOOP-17125
> Project: Hadoop Common
>  Issue Type: New Feature
>  Components: common
>Affects Versions: 3.3.0
>Reporter: DB Tsai
>Assignee: DB Tsai
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.3.1, 3.4.0
>
>  Time Spent: 25h
>  Remaining Estimate: 0h
>
> In Hadoop, we use native libs for snappy codec which has several 
> disadvantages:
>  * It requires native *libhadoop* and *libsnappy* to be installed in system 
> *LD_LIBRARY_PATH*, and they have to be installed separately on each node of 
> the clusters, container images, or local test environments which adds huge 
> complexities from deployment point of view. In some environments, it requires 
> compiling the natives from sources which is non-trivial. Also, this approach 
> is platform dependent; the binary may not work in different platform, so it 
> requires recompilation.
>  * It requires extra configuration of *java.library.path* to load the 
> natives, and it results higher application deployment and maintenance cost 
> for users.
> Projects such as *Spark* and *Parquet* use 
> [snappy-java|[https://github.com/xerial/snappy-java]] which is JNI-based 
> implementation. It contains native binaries for Linux, Mac, and IBM in jar 
> file, and it can automatically load the native binaries into JVM from jar 
> without any setup. If a native implementation can not be found for a 
> platform, it can fallback to pure-java implementation of snappy based on 
> [aircompressor|[https://github.com/airlift/aircompressor/tree/master/src/main/java/io/airlift/compress/snappy]].



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org

[jira] [Work logged] (HADOOP-17125) Using snappy-java in SnappyCodec

2020-10-07 Thread ASF GitHub Bot (Jira)



 [ 
https://issues.apache.org/jira/browse/HADOOP-17125?focusedWorklogId=496571=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-496571
 ]

ASF GitHub Bot logged work on HADOOP-17125:
---

Author: ASF GitHub Bot
Created on: 07/Oct/20 12:56
Start Date: 07/Oct/20 12:56
Worklog Time Spent: 10m 
  Work Description: steveloughran commented on pull request #2297:
URL: https://github.com/apache/hadoop/pull/2297#issuecomment-704916354


   JIRA closed, added a release note.



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 496571)
Time Spent: 24h 50m  (was: 24h 40m)

> Using snappy-java in SnappyCodec
> 
>
> Key: HADOOP-17125
> URL: https://issues.apache.org/jira/browse/HADOOP-17125
> Project: Hadoop Common
>  Issue Type: New Feature
>  Components: common
>Affects Versions: 3.3.0
>Reporter: DB Tsai
>Assignee: DB Tsai
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.3.1
>
>  Time Spent: 24h 50m
>  Remaining Estimate: 0h
>
> In Hadoop, we use native libs for snappy codec which has several 
> disadvantages:
>  * It requires native *libhadoop* and *libsnappy* to be installed in system 
> *LD_LIBRARY_PATH*, and they have to be installed separately on each node of 
> the clusters, container images, or local test environments which adds huge 
> complexities from deployment point of view. In some environments, it requires 
> compiling the natives from sources which is non-trivial. Also, this approach 
> is platform dependent; the binary may not work in different platform, so it 
> requires recompilation.
>  * It requires extra configuration of *java.library.path* to load the 
> natives, and it results higher application deployment and maintenance cost 
> for users.
> Projects such as *Spark* and *Parquet* use 
> [snappy-java|[https://github.com/xerial/snappy-java]] which is JNI-based 
> implementation. It contains native binaries for Linux, Mac, and IBM in jar 
> file, and it can automatically load the native binaries into JVM from jar 
> without any setup. If a native implementation can not be found for a 
> platform, it can fallback to pure-java implementation of snappy based on 
> [aircompressor|[https://github.com/airlift/aircompressor/tree/master/src/main/java/io/airlift/compress/snappy]].



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org

[jira] [Work logged] (HADOOP-17125) Using snappy-java in SnappyCodec

2020-10-06 Thread ASF GitHub Bot (Jira)



 [ 
https://issues.apache.org/jira/browse/HADOOP-17125?focusedWorklogId=496065=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-496065
 ]

ASF GitHub Bot logged work on HADOOP-17125:
---

Author: ASF GitHub Bot
Created on: 06/Oct/20 17:51
Start Date: 06/Oct/20 17:51
Worklog Time Spent: 10m 
  Work Description: viirya edited a comment on pull request #2297:
URL: https://github.com/apache/hadoop/pull/2297#issuecomment-704433770


   Ok, got it. I will update release notes once it is back. Seems I 
cannot update Hadoop JIRA.



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 496065)
Time Spent: 24h 40m  (was: 24.5h)

> Using snappy-java in SnappyCodec
> 
>
> Key: HADOOP-17125
> URL: https://issues.apache.org/jira/browse/HADOOP-17125
> Project: Hadoop Common
>  Issue Type: New Feature
>  Components: common
>Affects Versions: 3.3.0
>Reporter: DB Tsai
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 24h 40m
>  Remaining Estimate: 0h
>
> In Hadoop, we use native libs for snappy codec which has several 
> disadvantages:
>  * It requires native *libhadoop* and *libsnappy* to be installed in system 
> *LD_LIBRARY_PATH*, and they have to be installed separately on each node of 
> the clusters, container images, or local test environments which adds huge 
> complexities from deployment point of view. In some environments, it requires 
> compiling the natives from sources which is non-trivial. Also, this approach 
> is platform dependent; the binary may not work in different platform, so it 
> requires recompilation.
>  * It requires extra configuration of *java.library.path* to load the 
> natives, and it results higher application deployment and maintenance cost 
> for users.
> Projects such as *Spark* and *Parquet* use 
> [snappy-java|[https://github.com/xerial/snappy-java]] which is JNI-based 
> implementation. It contains native binaries for Linux, Mac, and IBM in jar 
> file, and it can automatically load the native binaries into JVM from jar 
> without any setup. If a native implementation can not be found for a 
> platform, it can fallback to pure-java implementation of snappy based on 
> [aircompressor|[https://github.com/airlift/aircompressor/tree/master/src/main/java/io/airlift/compress/snappy]].



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org

[jira] [Work logged] (HADOOP-17125) Using snappy-java in SnappyCodec

2020-10-06 Thread ASF GitHub Bot (Jira)



 [ 
https://issues.apache.org/jira/browse/HADOOP-17125?focusedWorklogId=496050=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-496050
 ]

ASF GitHub Bot logged work on HADOOP-17125:
---

Author: ASF GitHub Bot
Created on: 06/Oct/20 17:34
Start Date: 06/Oct/20 17:34
Worklog Time Spent: 10m 
  Work Description: viirya commented on pull request #2297:
URL: https://github.com/apache/hadoop/pull/2297#issuecomment-704434440


   Looks like the JIRA is back now? 
https://issues.apache.org/jira/browse/HADOOP-17125



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 496050)
Time Spent: 24.5h  (was: 24h 20m)

> Using snappy-java in SnappyCodec
> 
>
> Key: HADOOP-17125
> URL: https://issues.apache.org/jira/browse/HADOOP-17125
> Project: Hadoop Common
>  Issue Type: New Feature
>  Components: common
>Affects Versions: 3.3.0
>Reporter: DB Tsai
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 24.5h
>  Remaining Estimate: 0h
>
> In Hadoop, we use native libs for snappy codec which has several 
> disadvantages:
>  * It requires native *libhadoop* and *libsnappy* to be installed in system 
> *LD_LIBRARY_PATH*, and they have to be installed separately on each node of 
> the clusters, container images, or local test environments which adds huge 
> complexities from deployment point of view. In some environments, it requires 
> compiling the natives from sources which is non-trivial. Also, this approach 
> is platform dependent; the binary may not work in different platform, so it 
> requires recompilation.
>  * It requires extra configuration of *java.library.path* to load the 
> natives, and it results higher application deployment and maintenance cost 
> for users.
> Projects such as *Spark* and *Parquet* use 
> [snappy-java|[https://github.com/xerial/snappy-java]] which is JNI-based 
> implementation. It contains native binaries for Linux, Mac, and IBM in jar 
> file, and it can automatically load the native binaries into JVM from jar 
> without any setup. If a native implementation can not be found for a 
> platform, it can fallback to pure-java implementation of snappy based on 
> [aircompressor|[https://github.com/airlift/aircompressor/tree/master/src/main/java/io/airlift/compress/snappy]].



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org

[jira] [Work logged] (HADOOP-17125) Using snappy-java in SnappyCodec

2020-10-06 Thread ASF GitHub Bot (Jira)



 [ 
https://issues.apache.org/jira/browse/HADOOP-17125?focusedWorklogId=496049=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-496049
 ]

ASF GitHub Bot logged work on HADOOP-17125:
---

Author: ASF GitHub Bot
Created on: 06/Oct/20 17:32
Start Date: 06/Oct/20 17:32
Worklog Time Spent: 10m 
  Work Description: viirya commented on pull request #2297:
URL: https://github.com/apache/hadoop/pull/2297#issuecomment-704433770


   Ok, got it. I will update release notes once it is back.



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 496049)
Time Spent: 24h 20m  (was: 24h 10m)

> Using snappy-java in SnappyCodec
> 
>
> Key: HADOOP-17125
> URL: https://issues.apache.org/jira/browse/HADOOP-17125
> Project: Hadoop Common
>  Issue Type: New Feature
>  Components: common
>Affects Versions: 3.3.0
>Reporter: DB Tsai
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 24h 20m
>  Remaining Estimate: 0h
>
> In Hadoop, we use native libs for snappy codec which has several 
> disadvantages:
>  * It requires native *libhadoop* and *libsnappy* to be installed in system 
> *LD_LIBRARY_PATH*, and they have to be installed separately on each node of 
> the clusters, container images, or local test environments which adds huge 
> complexities from deployment point of view. In some environments, it requires 
> compiling the natives from sources which is non-trivial. Also, this approach 
> is platform dependent; the binary may not work in different platform, so it 
> requires recompilation.
>  * It requires extra configuration of *java.library.path* to load the 
> natives, and it results higher application deployment and maintenance cost 
> for users.
> Projects such as *Spark* and *Parquet* use 
> [snappy-java|[https://github.com/xerial/snappy-java]] which is JNI-based 
> implementation. It contains native binaries for Linux, Mac, and IBM in jar 
> file, and it can automatically load the native binaries into JVM from jar 
> without any setup. If a native implementation can not be found for a 
> platform, it can fallback to pure-java implementation of snappy based on 
> [aircompressor|[https://github.com/airlift/aircompressor/tree/master/src/main/java/io/airlift/compress/snappy]].



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org

[jira] [Work logged] (HADOOP-17125) Using snappy-java in SnappyCodec

2020-10-06 Thread ASF GitHub Bot (Jira)



 [ 
https://issues.apache.org/jira/browse/HADOOP-17125?focusedWorklogId=495984=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-495984
 ]

ASF GitHub Bot logged work on HADOOP-17125:
---

Author: ASF GitHub Bot
Created on: 06/Oct/20 16:00
Start Date: 06/Oct/20 16:00
Worklog Time Spent: 10m 
  Work Description: steveloughran commented on pull request #2297:
URL: https://github.com/apache/hadoop/pull/2297#issuecomment-704380621


   Ok, I'm happy too
   
   +1, merging to trunk and branch-3.3



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 495984)
Time Spent: 24h 10m  (was: 24h)

> Using snappy-java in SnappyCodec
> 
>
> Key: HADOOP-17125
> URL: https://issues.apache.org/jira/browse/HADOOP-17125
> Project: Hadoop Common
>  Issue Type: New Feature
>  Components: common
>Affects Versions: 3.3.0
>Reporter: DB Tsai
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 24h 10m
>  Remaining Estimate: 0h
>
> In Hadoop, we use native libs for snappy codec which has several 
> disadvantages:
>  * It requires native *libhadoop* and *libsnappy* to be installed in system 
> *LD_LIBRARY_PATH*, and they have to be installed separately on each node of 
> the clusters, container images, or local test environments which adds huge 
> complexities from deployment point of view. In some environments, it requires 
> compiling the natives from sources which is non-trivial. Also, this approach 
> is platform dependent; the binary may not work in different platform, so it 
> requires recompilation.
>  * It requires extra configuration of *java.library.path* to load the 
> natives, and it results higher application deployment and maintenance cost 
> for users.
> Projects such as *Spark* and *Parquet* use 
> [snappy-java|[https://github.com/xerial/snappy-java]] which is JNI-based 
> implementation. It contains native binaries for Linux, Mac, and IBM in jar 
> file, and it can automatically load the native binaries into JVM from jar 
> without any setup. If a native implementation can not be found for a 
> platform, it can fallback to pure-java implementation of snappy based on 
> [aircompressor|[https://github.com/airlift/aircompressor/tree/master/src/main/java/io/airlift/compress/snappy]].



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org

[jira] [Work logged] (HADOOP-17125) Using snappy-java in SnappyCodec

2020-10-06 Thread ASF GitHub Bot (Jira)



 [ 
https://issues.apache.org/jira/browse/HADOOP-17125?focusedWorklogId=495982=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-495982
 ]

ASF GitHub Bot logged work on HADOOP-17125:
---

Author: ASF GitHub Bot
Created on: 06/Oct/20 15:58
Start Date: 06/Oct/20 15:58
Worklog Time Spent: 10m 
  Work Description: steveloughran commented on pull request #2297:
URL: https://github.com/apache/hadoop/pull/2297#issuecomment-704379411


   > No harm fixing it as part of this patch... add the 'jobTokenPassword' from 
below in 
./hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/pom.xml
   
   I think it's actually some test runner bug, really it should be cleaned up. 
But we can pull in the patch to shut it up.



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 495982)
Time Spent: 24h  (was: 23h 50m)

> Using snappy-java in SnappyCodec
> 
>
> Key: HADOOP-17125
> URL: https://issues.apache.org/jira/browse/HADOOP-17125
> Project: Hadoop Common
>  Issue Type: New Feature
>  Components: common
>Affects Versions: 3.3.0
>Reporter: DB Tsai
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 24h
>  Remaining Estimate: 0h
>
> In Hadoop, we use native libs for snappy codec which has several 
> disadvantages:
>  * It requires native *libhadoop* and *libsnappy* to be installed in system 
> *LD_LIBRARY_PATH*, and they have to be installed separately on each node of 
> the clusters, container images, or local test environments which adds huge 
> complexities from deployment point of view. In some environments, it requires 
> compiling the natives from sources which is non-trivial. Also, this approach 
> is platform dependent; the binary may not work in different platform, so it 
> requires recompilation.
>  * It requires extra configuration of *java.library.path* to load the 
> natives, and it results higher application deployment and maintenance cost 
> for users.
> Projects such as *Spark* and *Parquet* use 
> [snappy-java|[https://github.com/xerial/snappy-java]] which is JNI-based 
> implementation. It contains native binaries for Linux, Mac, and IBM in jar 
> file, and it can automatically load the native binaries into JVM from jar 
> without any setup. If a native implementation can not be found for a 
> platform, it can fallback to pure-java implementation of snappy based on 
> [aircompressor|[https://github.com/airlift/aircompressor/tree/master/src/main/java/io/airlift/compress/snappy]].



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org

[jira] [Work logged] (HADOOP-17125) Using snappy-java in SnappyCodec

2020-10-06 Thread ASF GitHub Bot (Jira)



 [ 
https://issues.apache.org/jira/browse/HADOOP-17125?focusedWorklogId=495800=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-495800
 ]

ASF GitHub Bot logged work on HADOOP-17125:
---

Author: ASF GitHub Bot
Created on: 06/Oct/20 09:06
Start Date: 06/Oct/20 09:06
Worklog Time Spent: 10m 
  Work Description: hadoop-yetus commented on pull request #2297:
URL: https://github.com/apache/hadoop/pull/2297#issuecomment-704136004


   :broken_heart: **-1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |::|--:|:|::|:---:|
   | +0 :ok: |  reexec  |   0m 29s |  |  Docker mode activated.  |
    _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  0s |  |  No case conflicting files 
found.  |
   | +1 :green_heart: |  @author  |   0m  0s |  |  The patch does not contain 
any @author tags.  |
   | +1 :green_heart: |   |   0m  0s | [test4tests](test4tests) |  The patch 
appears to include 4 new or modified test files.  |
    _ trunk Compile Tests _ |
   | +0 :ok: |  mvndep  |   5m 36s |  |  Maven dependency ordering for branch  |
   | +1 :green_heart: |  mvninstall  |  24m  2s |  |  trunk passed  |
   | +1 :green_heart: |  compile  |  19m 49s |  |  trunk passed with JDK 
Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1  |
   | +1 :green_heart: |  compile  |  17m  5s |  |  trunk passed with JDK 
Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01  |
   | +1 :green_heart: |  checkstyle  |   2m 56s |  |  trunk passed  |
   | +1 :green_heart: |  mvnsite  |  20m 56s |  |  trunk passed  |
   | +1 :green_heart: |  shadedclient  |  14m 21s |  |  branch has no errors 
when building and testing our client artifacts.  |
   | +1 :green_heart: |  javadoc  |   6m 29s |  |  trunk passed with JDK 
Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1  |
   | +1 :green_heart: |  javadoc  |   7m  8s |  |  trunk passed with JDK 
Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01  |
   | +0 :ok: |  spotbugs  |   0m 45s |  |  Used deprecated FindBugs config; 
considering switching to SpotBugs.  |
   | +0 :ok: |  findbugs  |   0m 25s |  |  branch/hadoop-project no findbugs 
output file (findbugsXml.xml)  |
   | +0 :ok: |  findbugs  |   0m 23s |  |  branch/hadoop-project-dist no 
findbugs output file (findbugsXml.xml)  |
   | -0 :warning: |  patch  |   1m  7s |  |  Used diff version of patch file. 
Binary files and potentially other changes not applied. Please rebase and 
squash commits if necessary.  |
    _ Patch Compile Tests _ |
   | +0 :ok: |  mvndep  |   0m 35s |  |  Maven dependency ordering for patch  |
   | +1 :green_heart: |  mvninstall  |  21m 15s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |  19m 18s |  |  the patch passed with JDK 
Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1  |
   | -1 :x: |  cc  |  19m 18s | 
[/diff-compile-cc-root-jdkUbuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2297/22/artifact/out/diff-compile-cc-root-jdkUbuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1.txt)
 |  root-jdkUbuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1 with JDK 
Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1 generated 40 new + 123 unchanged - 
40 fixed = 163 total (was 163)  |
   | +1 :green_heart: |  golang  |  19m 18s |  |  the patch passed  |
   | +1 :green_heart: |  javac  |  19m 18s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |  17m  9s |  |  the patch passed with JDK 
Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01  |
   | -1 :x: |  cc  |  17m  9s | 
[/diff-compile-cc-root-jdkPrivateBuild-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2297/22/artifact/out/diff-compile-cc-root-jdkPrivateBuild-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01.txt)
 |  root-jdkPrivateBuild-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 with JDK 
Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 generated 36 new + 127 
unchanged - 36 fixed = 163 total (was 163)  |
   | +1 :green_heart: |  golang  |  17m  9s |  |  the patch passed  |
   | +1 :green_heart: |  javac  |  17m  9s |  |  the patch passed  |
   | +1 :green_heart: |  checkstyle  |   2m 50s |  |  root: The patch generated 
0 new + 140 unchanged - 3 fixed = 140 total (was 143)  |
   | +1 :green_heart: |  mvnsite  |  17m 36s |  |  the patch passed  |
   | +1 :green_heart: |  shellcheck  |   0m  0s |  |  There were no new 
shellcheck issues.  |
   | +1 :green_heart: |  shelldocs  |   0m 18s |  |  There were no new 
shelldocs issues.  |
   | +1 :green_heart: |  whitespace  |   0m  0s |  |  The patch has no 
whitespace issues.  |
   | +1 :green_heart: |  xml  |   0m  5s |  |  The patch has no ill-formed XML 
file.  |
   | +1 :green_heart: |  shadedclient  |  14m 11s |  |  patch has no errors 
when building and testing our client artifacts.  |
   | +1 :green_heart: |  javadoc  |

[jira] [Work logged] (HADOOP-17125) Using snappy-java in SnappyCodec

2020-10-05 Thread ASF GitHub Bot (Jira)



 [ 
https://issues.apache.org/jira/browse/HADOOP-17125?focusedWorklogId=495486=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-495486
 ]

ASF GitHub Bot logged work on HADOOP-17125:
---

Author: ASF GitHub Bot
Created on: 05/Oct/20 18:14
Start Date: 05/Oct/20 18:14
Worklog Time Spent: 10m 
  Work Description: viirya commented on pull request #2297:
URL: https://github.com/apache/hadoop/pull/2297#issuecomment-703801190


   Thanks @steveloughran and @saintstack. Updated the diff based on your 
suggestions.



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 495486)
Time Spent: 23h 40m  (was: 23.5h)

> Using snappy-java in SnappyCodec
> 
>
> Key: HADOOP-17125
> URL: https://issues.apache.org/jira/browse/HADOOP-17125
> Project: Hadoop Common
>  Issue Type: New Feature
>  Components: common
>Affects Versions: 3.3.0
>Reporter: DB Tsai
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 23h 40m
>  Remaining Estimate: 0h
>
> In Hadoop, we use native libs for snappy codec which has several 
> disadvantages:
>  * It requires native *libhadoop* and *libsnappy* to be installed in system 
> *LD_LIBRARY_PATH*, and they have to be installed separately on each node of 
> the clusters, container images, or local test environments which adds huge 
> complexities from deployment point of view. In some environments, it requires 
> compiling the natives from sources which is non-trivial. Also, this approach 
> is platform dependent; the binary may not work in different platform, so it 
> requires recompilation.
>  * It requires extra configuration of *java.library.path* to load the 
> natives, and it results higher application deployment and maintenance cost 
> for users.
> Projects such as *Spark* and *Parquet* use 
> [snappy-java|[https://github.com/xerial/snappy-java]] which is JNI-based 
> implementation. It contains native binaries for Linux, Mac, and IBM in jar 
> file, and it can automatically load the native binaries into JVM from jar 
> without any setup. If a native implementation can not be found for a 
> platform, it can fallback to pure-java implementation of snappy based on 
> [aircompressor|[https://github.com/airlift/aircompressor/tree/master/src/main/java/io/airlift/compress/snappy]].



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org

[jira] [Work logged] (HADOOP-17125) Using snappy-java in SnappyCodec

2020-10-05 Thread ASF GitHub Bot (Jira)



 [ 
https://issues.apache.org/jira/browse/HADOOP-17125?focusedWorklogId=495480=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-495480
 ]

ASF GitHub Bot logged work on HADOOP-17125:
---

Author: ASF GitHub Bot
Created on: 05/Oct/20 17:59
Start Date: 05/Oct/20 17:59
Worklog Time Spent: 10m 
  Work Description: saintstack commented on pull request #2297:
URL: https://github.com/apache/hadoop/pull/2297#issuecomment-703792707


   The native compile complaints seem unrelated... 



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 495480)
Time Spent: 23.5h  (was: 23h 20m)

> Using snappy-java in SnappyCodec
> 
>
> Key: HADOOP-17125
> URL: https://issues.apache.org/jira/browse/HADOOP-17125
> Project: Hadoop Common
>  Issue Type: New Feature
>  Components: common
>Affects Versions: 3.3.0
>Reporter: DB Tsai
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 23.5h
>  Remaining Estimate: 0h
>
> In Hadoop, we use native libs for snappy codec which has several 
> disadvantages:
>  * It requires native *libhadoop* and *libsnappy* to be installed in system 
> *LD_LIBRARY_PATH*, and they have to be installed separately on each node of 
> the clusters, container images, or local test environments which adds huge 
> complexities from deployment point of view. In some environments, it requires 
> compiling the natives from sources which is non-trivial. Also, this approach 
> is platform dependent; the binary may not work in different platform, so it 
> requires recompilation.
>  * It requires extra configuration of *java.library.path* to load the 
> natives, and it results higher application deployment and maintenance cost 
> for users.
> Projects such as *Spark* and *Parquet* use 
> [snappy-java|[https://github.com/xerial/snappy-java]] which is JNI-based 
> implementation. It contains native binaries for Linux, Mac, and IBM in jar 
> file, and it can automatically load the native binaries into JVM from jar 
> without any setup. If a native implementation can not be found for a 
> platform, it can fallback to pure-java implementation of snappy based on 
> [aircompressor|[https://github.com/airlift/aircompressor/tree/master/src/main/java/io/airlift/compress/snappy]].



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org

[jira] [Work logged] (HADOOP-17125) Using snappy-java in SnappyCodec

2020-10-05 Thread ASF GitHub Bot (Jira)



 [ 
https://issues.apache.org/jira/browse/HADOOP-17125?focusedWorklogId=495478=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-495478
 ]

ASF GitHub Bot logged work on HADOOP-17125:
---

Author: ASF GitHub Bot
Created on: 05/Oct/20 17:53
Start Date: 05/Oct/20 17:53
Worklog Time Spent: 10m 
  Work Description: saintstack commented on pull request #2297:
URL: https://github.com/apache/hadoop/pull/2297#issuecomment-703789784


   If making a new PR, the '   compile' is redundant given its 
maven default?
   
   The license failure is:
   
   ```
   Lines that start with ? in the ASF License  report indicate files that 
do not have an Apache license header:
!? 
/home/jenkins/jenkins-home/workspace/hadoop-multibranch_PR-2297/src/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/jobTokenPassword
   ```
   
   No harm fixing it as part of this patch... add the 'jobTokenPassword' from 
below in 
./hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/pom.xml
   
   ```
  
org.apache.rat
apache-rat-plugin

  

src/test/java/org/apache/hadoop/cli/data60bytes

src/test/resources/job_1329348432655_0001-10.jhist
**/jobTokenPassword
  

  
   ```
   Otherwise patch is looking good to me.



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 495478)
Time Spent: 23h 20m  (was: 23h 10m)

> Using snappy-java in SnappyCodec
> 
>
> Key: HADOOP-17125
> URL: https://issues.apache.org/jira/browse/HADOOP-17125
> Project: Hadoop Common
>  Issue Type: New Feature
>  Components: common
>Affects Versions: 3.3.0
>Reporter: DB Tsai
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 23h 20m
>  Remaining Estimate: 0h
>
> In Hadoop, we use native libs for snappy codec which has several 
> disadvantages:
>  * It requires native *libhadoop* and *libsnappy* to be installed in system 
> *LD_LIBRARY_PATH*, and they have to be installed separately on each node of 
> the clusters, container images, or local test environments which adds huge 
> complexities from deployment point of view. In some environments, it requires 
> compiling the natives from sources which is non-trivial. Also, this approach 
> is platform dependent; the binary may not work in different platform, so it 
> requires recompilation.
>  * It requires extra configuration of *java.library.path* to load the 
> natives, and it results higher application deployment and maintenance cost 
> for users.
> Projects such as *Spark* and *Parquet* use 
> [snappy-java|[https://github.com/xerial/snappy-java]] which is JNI-based 
> implementation. It contains native binaries for Linux, Mac, and IBM in jar 
> file, and it can automatically load the native binaries into JVM from jar 
> without any setup. If a native implementation can not be found for a 
> platform, it can fallback to pure-java implementation of snappy based on 
> [aircompressor|[https://github.com/airlift/aircompressor/tree/master/src/main/java/io/airlift/compress/snappy]].



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org

[jira] [Work logged] (HADOOP-17125) Using snappy-java in SnappyCodec

2020-10-05 Thread ASF GitHub Bot (Jira)



 [ 
https://issues.apache.org/jira/browse/HADOOP-17125?focusedWorklogId=495443=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-495443
 ]

ASF GitHub Bot logged work on HADOOP-17125:
---

Author: ASF GitHub Bot
Created on: 05/Oct/20 16:51
Start Date: 05/Oct/20 16:51
Worklog Time Spent: 10m 
  Work Description: steveloughran commented on a change in pull request 
#2297:
URL: https://github.com/apache/hadoop/pull/2297#discussion_r499736294



##
File path: 
hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/io/compress/CompressDecompressTester.java
##
@@ -432,7 +412,11 @@ public void assertCompression(String name, Compressor 
compressor,
   joiner.join(name, "byte arrays not equals error !!!"),
   originalRawData, decompressOut.toByteArray());
 } catch (Exception ex) {
-  fail(joiner.join(name, ex.getMessage()));
+  if (ex.getMessage() != null) {
+fail(joiner.join(name, ex.getMessage()));
+  } else {
+fail(joiner.join(name, ExceptionUtils.getStackTrace(ex)));

Review comment:
   NPE is why toString() is what new code should do.
   Why don't we just `throw new AssertionError(name +ex, ex)`. That way, the 
stack trace doesn't get lost, which is something we never want to have happen, 





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 495443)
Time Spent: 23h 10m  (was: 23h)

> Using snappy-java in SnappyCodec
> 
>
> Key: HADOOP-17125
> URL: https://issues.apache.org/jira/browse/HADOOP-17125
> Project: Hadoop Common
>  Issue Type: New Feature
>  Components: common
>Affects Versions: 3.3.0
>Reporter: DB Tsai
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 23h 10m
>  Remaining Estimate: 0h
>
> In Hadoop, we use native libs for snappy codec which has several 
> disadvantages:
>  * It requires native *libhadoop* and *libsnappy* to be installed in system 
> *LD_LIBRARY_PATH*, and they have to be installed separately on each node of 
> the clusters, container images, or local test environments which adds huge 
> complexities from deployment point of view. In some environments, it requires 
> compiling the natives from sources which is non-trivial. Also, this approach 
> is platform dependent; the binary may not work in different platform, so it 
> requires recompilation.
>  * It requires extra configuration of *java.library.path* to load the 
> natives, and it results higher application deployment and maintenance cost 
> for users.
> Projects such as *Spark* and *Parquet* use 
> [snappy-java|[https://github.com/xerial/snappy-java]] which is JNI-based 
> implementation. It contains native binaries for Linux, Mac, and IBM in jar 
> file, and it can automatically load the native binaries into JVM from jar 
> without any setup. If a native implementation can not be found for a 
> platform, it can fallback to pure-java implementation of snappy based on 
> [aircompressor|[https://github.com/airlift/aircompressor/tree/master/src/main/java/io/airlift/compress/snappy]].



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org

[jira] [Work logged] (HADOOP-17125) Using snappy-java in SnappyCodec

2020-10-02 Thread ASF GitHub Bot (Jira)



 [ 
https://issues.apache.org/jira/browse/HADOOP-17125?focusedWorklogId=494119=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-494119
 ]

ASF GitHub Bot logged work on HADOOP-17125:
---

Author: ASF GitHub Bot
Created on: 02/Oct/20 20:05
Start Date: 02/Oct/20 20:05
Worklog Time Spent: 10m 
  Work Description: viirya commented on pull request #2297:
URL: https://github.com/apache/hadoop/pull/2297#issuecomment-702933677


   Fixed another and last style issue. Checked with `mvn checkstyle:check` 
locally.



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 494119)
Time Spent: 23h  (was: 22h 50m)

> Using snappy-java in SnappyCodec
> 
>
> Key: HADOOP-17125
> URL: https://issues.apache.org/jira/browse/HADOOP-17125
> Project: Hadoop Common
>  Issue Type: New Feature
>  Components: common
>Affects Versions: 3.3.0
>Reporter: DB Tsai
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 23h
>  Remaining Estimate: 0h
>
> In Hadoop, we use native libs for snappy codec which has several 
> disadvantages:
>  * It requires native *libhadoop* and *libsnappy* to be installed in system 
> *LD_LIBRARY_PATH*, and they have to be installed separately on each node of 
> the clusters, container images, or local test environments which adds huge 
> complexities from deployment point of view. In some environments, it requires 
> compiling the natives from sources which is non-trivial. Also, this approach 
> is platform dependent; the binary may not work in different platform, so it 
> requires recompilation.
>  * It requires extra configuration of *java.library.path* to load the 
> natives, and it results higher application deployment and maintenance cost 
> for users.
> Projects such as *Spark* and *Parquet* use 
> [snappy-java|[https://github.com/xerial/snappy-java]] which is JNI-based 
> implementation. It contains native binaries for Linux, Mac, and IBM in jar 
> file, and it can automatically load the native binaries into JVM from jar 
> without any setup. If a native implementation can not be found for a 
> platform, it can fallback to pure-java implementation of snappy based on 
> [aircompressor|[https://github.com/airlift/aircompressor/tree/master/src/main/java/io/airlift/compress/snappy]].



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org

[jira] [Work logged] (HADOOP-17125) Using snappy-java in SnappyCodec

2020-10-02 Thread ASF GitHub Bot (Jira)



 [ 
https://issues.apache.org/jira/browse/HADOOP-17125?focusedWorklogId=494094=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-494094
 ]

ASF GitHub Bot logged work on HADOOP-17125:
---

Author: ASF GitHub Bot
Created on: 02/Oct/20 18:50
Start Date: 02/Oct/20 18:50
Worklog Time Spent: 10m 
  Work Description: sunchao commented on pull request #2297:
URL: https://github.com/apache/hadoop/pull/2297#issuecomment-702900335


   The style issue was fixed in the last run. The CI failed because of [unit 
tests](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2297/20/testReport/)
 and [ASF 
license](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2297/20/artifact/out/patch-asflicense-problems.txt)
 (I don't really see the file `jobTokenPassword`). Seems neither is related to 
this PR.



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 494094)
Time Spent: 22h 50m  (was: 22h 40m)

> Using snappy-java in SnappyCodec
> 
>
> Key: HADOOP-17125
> URL: https://issues.apache.org/jira/browse/HADOOP-17125
> Project: Hadoop Common
>  Issue Type: New Feature
>  Components: common
>Affects Versions: 3.3.0
>Reporter: DB Tsai
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 22h 50m
>  Remaining Estimate: 0h
>
> In Hadoop, we use native libs for snappy codec which has several 
> disadvantages:
>  * It requires native *libhadoop* and *libsnappy* to be installed in system 
> *LD_LIBRARY_PATH*, and they have to be installed separately on each node of 
> the clusters, container images, or local test environments which adds huge 
> complexities from deployment point of view. In some environments, it requires 
> compiling the natives from sources which is non-trivial. Also, this approach 
> is platform dependent; the binary may not work in different platform, so it 
> requires recompilation.
>  * It requires extra configuration of *java.library.path* to load the 
> natives, and it results higher application deployment and maintenance cost 
> for users.
> Projects such as *Spark* and *Parquet* use 
> [snappy-java|[https://github.com/xerial/snappy-java]] which is JNI-based 
> implementation. It contains native binaries for Linux, Mac, and IBM in jar 
> file, and it can automatically load the native binaries into JVM from jar 
> without any setup. If a native implementation can not be found for a 
> platform, it can fallback to pure-java implementation of snappy based on 
> [aircompressor|[https://github.com/airlift/aircompressor/tree/master/src/main/java/io/airlift/compress/snappy]].



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org

[jira] [Work logged] (HADOOP-17125) Using snappy-java in SnappyCodec

2020-10-01 Thread ASF GitHub Bot (Jira)



 [ 
https://issues.apache.org/jira/browse/HADOOP-17125?focusedWorklogId=493595=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-493595
 ]

ASF GitHub Bot logged work on HADOOP-17125:
---

Author: ASF GitHub Bot
Created on: 01/Oct/20 17:07
Start Date: 01/Oct/20 17:07
Worklog Time Spent: 10m 
  Work Description: viirya commented on pull request #2297:
URL: https://github.com/apache/hadoop/pull/2297#issuecomment-702273856


   Hmm, for CompressDecompressTester.java, it seems to me that it is from 
original code?
   
   ```java
   else if (compressor.getClass().isAssignableFrom(ZlibCompressor.class)) {
 return ZlibFactory.isNativeZlibLoaded(new Configuration());
   -}  
   -else if (compressor.getClass().isAssignableFrom(SnappyCompressor.class)
   -&& isNativeSnappyLoadable())
   +}
   +else if (compressor.getClass().isAssignableFrom(SnappyCompressor.class))
   ```
   
   Anyway, I can fix it here if you think it is ok.
   
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 493595)
Time Spent: 22h 40m  (was: 22.5h)

> Using snappy-java in SnappyCodec
> 
>
> Key: HADOOP-17125
> URL: https://issues.apache.org/jira/browse/HADOOP-17125
> Project: Hadoop Common
>  Issue Type: New Feature
>  Components: common
>Affects Versions: 3.3.0
>Reporter: DB Tsai
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 22h 40m
>  Remaining Estimate: 0h
>
> In Hadoop, we use native libs for snappy codec which has several 
> disadvantages:
>  * It requires native *libhadoop* and *libsnappy* to be installed in system 
> *LD_LIBRARY_PATH*, and they have to be installed separately on each node of 
> the clusters, container images, or local test environments which adds huge 
> complexities from deployment point of view. In some environments, it requires 
> compiling the natives from sources which is non-trivial. Also, this approach 
> is platform dependent; the binary may not work in different platform, so it 
> requires recompilation.
>  * It requires extra configuration of *java.library.path* to load the 
> natives, and it results higher application deployment and maintenance cost 
> for users.
> Projects such as *Spark* and *Parquet* use 
> [snappy-java|[https://github.com/xerial/snappy-java]] which is JNI-based 
> implementation. It contains native binaries for Linux, Mac, and IBM in jar 
> file, and it can automatically load the native binaries into JVM from jar 
> without any setup. If a native implementation can not be found for a 
> platform, it can fallback to pure-java implementation of snappy based on 
> [aircompressor|[https://github.com/airlift/aircompressor/tree/master/src/main/java/io/airlift/compress/snappy]].



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org

[jira] [Work logged] (HADOOP-17125) Using snappy-java in SnappyCodec

2020-10-01 Thread ASF GitHub Bot (Jira)



 [ 
https://issues.apache.org/jira/browse/HADOOP-17125?focusedWorklogId=493505=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-493505
 ]

ASF GitHub Bot logged work on HADOOP-17125:
---

Author: ASF GitHub Bot
Created on: 01/Oct/20 14:04
Start Date: 01/Oct/20 14:04
Worklog Time Spent: 10m 
  Work Description: steveloughran commented on pull request #2297:
URL: https://github.com/apache/hadoop/pull/2297#issuecomment-702159088


   ok, yetus is running, it's just reporting isn't quite there...if you follow 
the link you see the results. Test failures in hdfs: unrelated. ASF licence 
warning: unrelated. 
   
   Checkstyles are, sadly, related.
   ```
   
./hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/io/compress/CompressDecompressTester.java:491:
}:5: '}' at column 5 should be on the same line as the next part of a 
multi-block statement (one that directly contains multiple blocks: 
if/else-if/else, do/while or try/catch/finally). [RightCurly]
   
./hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/io/compress/CompressDecompressTester.java:492:
else if (compressor.getClass().isAssignableFrom(SnappyCompressor.class)): 
'if' construct must use '{}'s. [NeedBraces]
   
./hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/io/compress/snappy/TestSnappyCompressorDecompressor.java:356:
int[] size = { 4 * 1024, 64 * 1024, 128 * 1024, 1024 * 1024 };:18: '{' is 
followed by whitespace. [NoWhitespaceAfter]
   ```



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 493505)
Time Spent: 22.5h  (was: 22h 20m)

> Using snappy-java in SnappyCodec
> 
>
> Key: HADOOP-17125
> URL: https://issues.apache.org/jira/browse/HADOOP-17125
> Project: Hadoop Common
>  Issue Type: New Feature
>  Components: common
>Affects Versions: 3.3.0
>Reporter: DB Tsai
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 22.5h
>  Remaining Estimate: 0h
>
> In Hadoop, we use native libs for snappy codec which has several 
> disadvantages:
>  * It requires native *libhadoop* and *libsnappy* to be installed in system 
> *LD_LIBRARY_PATH*, and they have to be installed separately on each node of 
> the clusters, container images, or local test environments which adds huge 
> complexities from deployment point of view. In some environments, it requires 
> compiling the natives from sources which is non-trivial. Also, this approach 
> is platform dependent; the binary may not work in different platform, so it 
> requires recompilation.
>  * It requires extra configuration of *java.library.path* to load the 
> natives, and it results higher application deployment and maintenance cost 
> for users.
> Projects such as *Spark* and *Parquet* use 
> [snappy-java|[https://github.com/xerial/snappy-java]] which is JNI-based 
> implementation. It contains native binaries for Linux, Mac, and IBM in jar 
> file, and it can automatically load the native binaries into JVM from jar 
> without any setup. If a native implementation can not be found for a 
> platform, it can fallback to pure-java implementation of snappy based on 
> [aircompressor|[https://github.com/airlift/aircompressor/tree/master/src/main/java/io/airlift/compress/snappy]].



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org

[jira] [Work logged] (HADOOP-17125) Using snappy-java in SnappyCodec

2020-10-01 Thread ASF GitHub Bot (Jira)



 [ 
https://issues.apache.org/jira/browse/HADOOP-17125?focusedWorklogId=493504=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-493504
 ]

ASF GitHub Bot logged work on HADOOP-17125:
---

Author: ASF GitHub Bot
Created on: 01/Oct/20 14:03
Start Date: 01/Oct/20 14:03
Worklog Time Spent: 10m 
  Work Description: hadoop-yetus removed a comment on pull request #2297:
URL: https://github.com/apache/hadoop/pull/2297#issuecomment-690909475


   :broken_heart: **-1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime | Comment |
   |::|--:|:|:|
   | +0 :ok: |  reexec  |   0m 30s |  Docker mode activated.  |
   ||| _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  0s |  No case conflicting files 
found.  |
   | +1 :green_heart: |  @author  |   0m  1s |  The patch does not contain any 
@author tags.  |
   | +1 :green_heart: |  test4tests  |   0m  0s |  The patch appears to include 
5 new or modified test files.  |
   ||| _ trunk Compile Tests _ |
   | +0 :ok: |  mvndep  |   3m 35s |  Maven dependency ordering for branch  |
   | +1 :green_heart: |  mvninstall  |  27m  4s |  trunk passed  |
   | +1 :green_heart: |  compile  |  21m 25s |  trunk passed with JDK 
Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1  |
   | +1 :green_heart: |  compile  |  19m 19s |  trunk passed with JDK Private 
Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01  |
   | +1 :green_heart: |  checkstyle  |   3m  2s |  trunk passed  |
   | +1 :green_heart: |  mvnsite  |   2m  7s |  trunk passed  |
   | +1 :green_heart: |  shadedclient  |  21m  6s |  branch has no errors when 
building and testing our client artifacts.  |
   | +1 :green_heart: |  javadoc  |   1m 13s |  trunk passed with JDK 
Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1  |
   | +1 :green_heart: |  javadoc  |   2m  7s |  trunk passed with JDK Private 
Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01  |
   | +0 :ok: |  spotbugs  |   2m 12s |  Used deprecated FindBugs config; 
considering switching to SpotBugs.  |
   | +0 :ok: |  findbugs  |   0m 38s |  branch/hadoop-project no findbugs 
output file (findbugsXml.xml)  |
   ||| _ Patch Compile Tests _ |
   | +0 :ok: |  mvndep  |   0m 28s |  Maven dependency ordering for patch  |
   | +1 :green_heart: |  mvninstall  |   1m  2s |  the patch passed  |
   | +1 :green_heart: |  compile  |  18m 38s |  the patch passed with JDK 
Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1  |
   | -1 :x: |  cc  |  18m 38s |  
root-jdkUbuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1 with JDK 
Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1 generated 36 new + 127 unchanged - 
36 fixed = 163 total (was 163)  |
   | +1 :green_heart: |  golang  |  18m 38s |  the patch passed  |
   | +1 :green_heart: |  javac  |  18m 38s |  the patch passed  |
   | +1 :green_heart: |  compile  |  16m 51s |  the patch passed with JDK 
Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01  |
   | -1 :x: |  cc  |  16m 51s |  
root-jdkPrivateBuild-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 with JDK Private 
Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 generated 29 new + 134 unchanged - 
29 fixed = 163 total (was 163)  |
   | +1 :green_heart: |  golang  |  16m 51s |  the patch passed  |
   | +1 :green_heart: |  javac  |  16m 51s |  the patch passed  |
   | -0 :warning: |  checkstyle  |   2m 41s |  root: The patch generated 1 new 
+ 151 unchanged - 5 fixed = 152 total (was 156)  |
   | +1 :green_heart: |  mvnsite  |   2m  1s |  the patch passed  |
   | +1 :green_heart: |  whitespace  |   0m  0s |  The patch has no whitespace 
issues.  |
   | +1 :green_heart: |  xml  |   0m  2s |  The patch has no ill-formed XML 
file.  |
   | +1 :green_heart: |  shadedclient  |  14m 25s |  patch has no errors when 
building and testing our client artifacts.  |
   | +1 :green_heart: |  javadoc  |   1m 12s |  the patch passed with JDK 
Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1  |
   | +1 :green_heart: |  javadoc  |   2m  9s |  the patch passed with JDK 
Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01  |
   | +0 :ok: |  findbugs  |   0m 32s |  hadoop-project has no data from 
findbugs  |
   ||| _ Other Tests _ |
   | +1 :green_heart: |  unit  |   0m 32s |  hadoop-project in the patch 
passed.  |
   | -1 :x: |  unit  |   9m 32s |  hadoop-common in the patch passed.  |
   | +1 :green_heart: |  asflicense  |   0m 55s |  The patch does not generate 
ASF License warnings.  |
   |  |   | 177m 51s |   |
   
   
   | Reason | Tests |
   |---:|:--|
   | Failed junit tests | 
hadoop.io.compress.snappy.TestSnappyCompressorDecompressor |
   
   
   | Subsystem | Report/Notes |
   |--:|:-|
   | Docker | ClientAPI=1.40 ServerAPI=1.40 base: 
https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2297/3/artifact/out/Dockerfile
 |
   | GITHUB PR | https://github.com/apache/hadoop/pull/2297 |
   | Optional Tests | dupname asflicense compile

[jira] [Work logged] (HADOOP-17125) Using snappy-java in SnappyCodec

2020-09-29 Thread ASF GitHub Bot (Jira)



 [ 
https://issues.apache.org/jira/browse/HADOOP-17125?focusedWorklogId=492716=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-492716
 ]

ASF GitHub Bot logged work on HADOOP-17125:
---

Author: ASF GitHub Bot
Created on: 29/Sep/20 23:17
Start Date: 29/Sep/20 23:17
Worklog Time Spent: 10m 
  Work Description: viirya commented on pull request #2297:
URL: https://github.com/apache/hadoop/pull/2297#issuecomment-701041484


   @sunchao Thanks for review. I addressed your comments. Please let me know if 
you have more comments.



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 492716)
Time Spent: 22h 10m  (was: 22h)

> Using snappy-java in SnappyCodec
> 
>
> Key: HADOOP-17125
> URL: https://issues.apache.org/jira/browse/HADOOP-17125
> Project: Hadoop Common
>  Issue Type: New Feature
>  Components: common
>Affects Versions: 3.3.0
>Reporter: DB Tsai
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 22h 10m
>  Remaining Estimate: 0h
>
> In Hadoop, we use native libs for snappy codec which has several 
> disadvantages:
>  * It requires native *libhadoop* and *libsnappy* to be installed in system 
> *LD_LIBRARY_PATH*, and they have to be installed separately on each node of 
> the clusters, container images, or local test environments which adds huge 
> complexities from deployment point of view. In some environments, it requires 
> compiling the natives from sources which is non-trivial. Also, this approach 
> is platform dependent; the binary may not work in different platform, so it 
> requires recompilation.
>  * It requires extra configuration of *java.library.path* to load the 
> natives, and it results higher application deployment and maintenance cost 
> for users.
> Projects such as *Spark* and *Parquet* use 
> [snappy-java|[https://github.com/xerial/snappy-java]] which is JNI-based 
> implementation. It contains native binaries for Linux, Mac, and IBM in jar 
> file, and it can automatically load the native binaries into JVM from jar 
> without any setup. If a native implementation can not be found for a 
> platform, it can fallback to pure-java implementation of snappy based on 
> [aircompressor|[https://github.com/airlift/aircompressor/tree/master/src/main/java/io/airlift/compress/snappy]].



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org

[jira] [Work logged] (HADOOP-17125) Using snappy-java in SnappyCodec

2020-09-29 Thread ASF GitHub Bot (Jira)



 [ 
https://issues.apache.org/jira/browse/HADOOP-17125?focusedWorklogId=492701=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-492701
 ]

ASF GitHub Bot logged work on HADOOP-17125:
---

Author: ASF GitHub Bot
Created on: 29/Sep/20 22:35
Start Date: 29/Sep/20 22:35
Worklog Time Spent: 10m 
  Work Description: viirya commented on a change in pull request #2297:
URL: https://github.com/apache/hadoop/pull/2297#discussion_r497099812



##
File path: hadoop-project-dist/pom.xml
##
@@ -341,7 +340,6 @@
 --openssllib=${openssl.lib}
 
--opensslbinbundle=${bundle.openssl.in.bin}
 --openssllibbundle=${bundle.openssl}
-
--snappybinbundle=${bundle.snappy.in.bin}
 --snappylib=${snappy.lib}
 --snappylibbundle=${bundle.snappy}

Review comment:
   Re-checked. hadoop-mapreduce-client-nativetask doesn't use 
`bundle.snappy.in.bin`. Only native-win profile of hadoop-common and native-win 
profile in hadoop-project use `bundle.snappy.in.bin`. They are added by 
HADOOP-9802 to support SnappyCodec on Windows. So looks like it is safe to 
remove.





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 492701)
Time Spent: 22h  (was: 21h 50m)

> Using snappy-java in SnappyCodec
> 
>
> Key: HADOOP-17125
> URL: https://issues.apache.org/jira/browse/HADOOP-17125
> Project: Hadoop Common
>  Issue Type: New Feature
>  Components: common
>Affects Versions: 3.3.0
>Reporter: DB Tsai
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 22h
>  Remaining Estimate: 0h
>
> In Hadoop, we use native libs for snappy codec which has several 
> disadvantages:
>  * It requires native *libhadoop* and *libsnappy* to be installed in system 
> *LD_LIBRARY_PATH*, and they have to be installed separately on each node of 
> the clusters, container images, or local test environments which adds huge 
> complexities from deployment point of view. In some environments, it requires 
> compiling the natives from sources which is non-trivial. Also, this approach 
> is platform dependent; the binary may not work in different platform, so it 
> requires recompilation.
>  * It requires extra configuration of *java.library.path* to load the 
> natives, and it results higher application deployment and maintenance cost 
> for users.
> Projects such as *Spark* and *Parquet* use 
> [snappy-java|[https://github.com/xerial/snappy-java]] which is JNI-based 
> implementation. It contains native binaries for Linux, Mac, and IBM in jar 
> file, and it can automatically load the native binaries into JVM from jar 
> without any setup. If a native implementation can not be found for a 
> platform, it can fallback to pure-java implementation of snappy based on 
> [aircompressor|[https://github.com/airlift/aircompressor/tree/master/src/main/java/io/airlift/compress/snappy]].



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org

[jira] [Work logged] (HADOOP-17125) Using snappy-java in SnappyCodec

2020-09-29 Thread ASF GitHub Bot (Jira)



 [ 
https://issues.apache.org/jira/browse/HADOOP-17125?focusedWorklogId=492699=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-492699
 ]

ASF GitHub Bot logged work on HADOOP-17125:
---

Author: ASF GitHub Bot
Created on: 29/Sep/20 22:23
Start Date: 29/Sep/20 22:23
Worklog Time Spent: 10m 
  Work Description: viirya commented on a change in pull request #2297:
URL: https://github.com/apache/hadoop/pull/2297#discussion_r497093683



##
File path: hadoop-project-dist/pom.xml
##
@@ -341,7 +340,6 @@
 --openssllib=${openssl.lib}
 
--opensslbinbundle=${bundle.openssl.in.bin}
 --openssllibbundle=${bundle.openssl}
-
--snappybinbundle=${bundle.snappy.in.bin}
 --snappylib=${snappy.lib}
 --snappylibbundle=${bundle.snappy}

Review comment:
   Hmm, I am not sure about this now. I think it is safer to revert this 
back? I am not sure if hadoop-mapreduce-client-nativetask uses this.





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 492699)
Time Spent: 21h 50m  (was: 21h 40m)

> Using snappy-java in SnappyCodec
> 
>
> Key: HADOOP-17125
> URL: https://issues.apache.org/jira/browse/HADOOP-17125
> Project: Hadoop Common
>  Issue Type: New Feature
>  Components: common
>Affects Versions: 3.3.0
>Reporter: DB Tsai
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 21h 50m
>  Remaining Estimate: 0h
>
> In Hadoop, we use native libs for snappy codec which has several 
> disadvantages:
>  * It requires native *libhadoop* and *libsnappy* to be installed in system 
> *LD_LIBRARY_PATH*, and they have to be installed separately on each node of 
> the clusters, container images, or local test environments which adds huge 
> complexities from deployment point of view. In some environments, it requires 
> compiling the natives from sources which is non-trivial. Also, this approach 
> is platform dependent; the binary may not work in different platform, so it 
> requires recompilation.
>  * It requires extra configuration of *java.library.path* to load the 
> natives, and it results higher application deployment and maintenance cost 
> for users.
> Projects such as *Spark* and *Parquet* use 
> [snappy-java|[https://github.com/xerial/snappy-java]] which is JNI-based 
> implementation. It contains native binaries for Linux, Mac, and IBM in jar 
> file, and it can automatically load the native binaries into JVM from jar 
> without any setup. If a native implementation can not be found for a 
> platform, it can fallback to pure-java implementation of snappy based on 
> [aircompressor|[https://github.com/airlift/aircompressor/tree/master/src/main/java/io/airlift/compress/snappy]].



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org

[jira] [Work logged] (HADOOP-17125) Using snappy-java in SnappyCodec

2020-09-29 Thread ASF GitHub Bot (Jira)



 [ 
https://issues.apache.org/jira/browse/HADOOP-17125?focusedWorklogId=492697=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-492697
 ]

ASF GitHub Bot logged work on HADOOP-17125:
---

Author: ASF GitHub Bot
Created on: 29/Sep/20 22:19
Start Date: 29/Sep/20 22:19
Worklog Time Spent: 10m 
  Work Description: viirya commented on a change in pull request #2297:
URL: https://github.com/apache/hadoop/pull/2297#discussion_r497093683



##
File path: hadoop-project-dist/pom.xml
##
@@ -341,7 +340,6 @@
 --openssllib=${openssl.lib}
 
--opensslbinbundle=${bundle.openssl.in.bin}
 --openssllibbundle=${bundle.openssl}
-
--snappybinbundle=${bundle.snappy.in.bin}
 --snappylib=${snappy.lib}
 --snappylibbundle=${bundle.snappy}

Review comment:
   Hmm, I am not sure about this now. I think it is safer to revert this 
back, I am not sure if hadoop-mapreduce-client-nativetask uses this.





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 492697)
Time Spent: 21h 40m  (was: 21.5h)

> Using snappy-java in SnappyCodec
> 
>
> Key: HADOOP-17125
> URL: https://issues.apache.org/jira/browse/HADOOP-17125
> Project: Hadoop Common
>  Issue Type: New Feature
>  Components: common
>Affects Versions: 3.3.0
>Reporter: DB Tsai
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 21h 40m
>  Remaining Estimate: 0h
>
> In Hadoop, we use native libs for snappy codec which has several 
> disadvantages:
>  * It requires native *libhadoop* and *libsnappy* to be installed in system 
> *LD_LIBRARY_PATH*, and they have to be installed separately on each node of 
> the clusters, container images, or local test environments which adds huge 
> complexities from deployment point of view. In some environments, it requires 
> compiling the natives from sources which is non-trivial. Also, this approach 
> is platform dependent; the binary may not work in different platform, so it 
> requires recompilation.
>  * It requires extra configuration of *java.library.path* to load the 
> natives, and it results higher application deployment and maintenance cost 
> for users.
> Projects such as *Spark* and *Parquet* use 
> [snappy-java|[https://github.com/xerial/snappy-java]] which is JNI-based 
> implementation. It contains native binaries for Linux, Mac, and IBM in jar 
> file, and it can automatically load the native binaries into JVM from jar 
> without any setup. If a native implementation can not be found for a 
> platform, it can fallback to pure-java implementation of snappy based on 
> [aircompressor|[https://github.com/airlift/aircompressor/tree/master/src/main/java/io/airlift/compress/snappy]].



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org

[jira] [Work logged] (HADOOP-17125) Using snappy-java in SnappyCodec

2020-09-29 Thread ASF GitHub Bot (Jira)



 [ 
https://issues.apache.org/jira/browse/HADOOP-17125?focusedWorklogId=492689=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-492689
 ]

ASF GitHub Bot logged work on HADOOP-17125:
---

Author: ASF GitHub Bot
Created on: 29/Sep/20 21:44
Start Date: 29/Sep/20 21:44
Worklog Time Spent: 10m 
  Work Description: viirya commented on a change in pull request #2297:
URL: https://github.com/apache/hadoop/pull/2297#discussion_r497078103



##
File path: hadoop-common-project/hadoop-common/src/main/native/native.vcxproj
##
@@ -68,30 +68,13 @@
 ..\..\..\target\native\$(Configuration)\
 hadoop
   
-  
-$(CustomSnappyPrefix)
-$(CustomSnappyPrefix)\lib
-$(CustomSnappyPrefix)\bin
-$(CustomSnappyLib)
-$(CustomSnappyPrefix)
-$(CustomSnappyPrefix)\include
-$(CustomSnappyInclude)
-true
-$(SnappyInclude);$(IncludePath)
-$(ZLIB_HOME);$(IncludePath)

Review comment:
   Oh, yeah, will add it back.





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 492689)
Time Spent: 21.5h  (was: 21h 20m)

> Using snappy-java in SnappyCodec
> 
>
> Key: HADOOP-17125
> URL: https://issues.apache.org/jira/browse/HADOOP-17125
> Project: Hadoop Common
>  Issue Type: New Feature
>  Components: common
>Affects Versions: 3.3.0
>Reporter: DB Tsai
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 21.5h
>  Remaining Estimate: 0h
>
> In Hadoop, we use native libs for snappy codec which has several 
> disadvantages:
>  * It requires native *libhadoop* and *libsnappy* to be installed in system 
> *LD_LIBRARY_PATH*, and they have to be installed separately on each node of 
> the clusters, container images, or local test environments which adds huge 
> complexities from deployment point of view. In some environments, it requires 
> compiling the natives from sources which is non-trivial. Also, this approach 
> is platform dependent; the binary may not work in different platform, so it 
> requires recompilation.
>  * It requires extra configuration of *java.library.path* to load the 
> natives, and it results higher application deployment and maintenance cost 
> for users.
> Projects such as *Spark* and *Parquet* use 
> [snappy-java|[https://github.com/xerial/snappy-java]] which is JNI-based 
> implementation. It contains native binaries for Linux, Mac, and IBM in jar 
> file, and it can automatically load the native binaries into JVM from jar 
> without any setup. If a native implementation can not be found for a 
> platform, it can fallback to pure-java implementation of snappy based on 
> [aircompressor|[https://github.com/airlift/aircompressor/tree/master/src/main/java/io/airlift/compress/snappy]].



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org

[jira] [Work logged] (HADOOP-17125) Using snappy-java in SnappyCodec

2020-09-29 Thread ASF GitHub Bot (Jira)



 [ 
https://issues.apache.org/jira/browse/HADOOP-17125?focusedWorklogId=492687=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-492687
 ]

ASF GitHub Bot logged work on HADOOP-17125:
---

Author: ASF GitHub Bot
Created on: 29/Sep/20 21:33
Start Date: 29/Sep/20 21:33
Worklog Time Spent: 10m 
  Work Description: sunchao commented on a change in pull request #2297:
URL: https://github.com/apache/hadoop/pull/2297#discussion_r497073278



##
File path: hadoop-common-project/hadoop-common/src/main/native/native.vcxproj
##
@@ -68,30 +68,13 @@
 ..\..\..\target\native\$(Configuration)\
 hadoop
   
-  
-$(CustomSnappyPrefix)
-$(CustomSnappyPrefix)\lib
-$(CustomSnappyPrefix)\bin
-$(CustomSnappyLib)
-$(CustomSnappyPrefix)
-$(CustomSnappyPrefix)\include
-$(CustomSnappyInclude)
-true
-$(SnappyInclude);$(IncludePath)
-$(ZLIB_HOME);$(IncludePath)

Review comment:
   I think this is for Zlib compressor (see 
https://issues.apache.org/jira/browse/HADOOP-10450). Yeah it is a bit confusing 
that it's defined in the same group.





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 492687)
Time Spent: 21h 20m  (was: 21h 10m)

> Using snappy-java in SnappyCodec
> 
>
> Key: HADOOP-17125
> URL: https://issues.apache.org/jira/browse/HADOOP-17125
> Project: Hadoop Common
>  Issue Type: New Feature
>  Components: common
>Affects Versions: 3.3.0
>Reporter: DB Tsai
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 21h 20m
>  Remaining Estimate: 0h
>
> In Hadoop, we use native libs for snappy codec which has several 
> disadvantages:
>  * It requires native *libhadoop* and *libsnappy* to be installed in system 
> *LD_LIBRARY_PATH*, and they have to be installed separately on each node of 
> the clusters, container images, or local test environments which adds huge 
> complexities from deployment point of view. In some environments, it requires 
> compiling the natives from sources which is non-trivial. Also, this approach 
> is platform dependent; the binary may not work in different platform, so it 
> requires recompilation.
>  * It requires extra configuration of *java.library.path* to load the 
> natives, and it results higher application deployment and maintenance cost 
> for users.
> Projects such as *Spark* and *Parquet* use 
> [snappy-java|[https://github.com/xerial/snappy-java]] which is JNI-based 
> implementation. It contains native binaries for Linux, Mac, and IBM in jar 
> file, and it can automatically load the native binaries into JVM from jar 
> without any setup. If a native implementation can not be found for a 
> platform, it can fallback to pure-java implementation of snappy based on 
> [aircompressor|[https://github.com/airlift/aircompressor/tree/master/src/main/java/io/airlift/compress/snappy]].



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org

[jira] [Work logged] (HADOOP-17125) Using snappy-java in SnappyCodec

2020-09-29 Thread ASF GitHub Bot (Jira)



 [ 
https://issues.apache.org/jira/browse/HADOOP-17125?focusedWorklogId=492682=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-492682
 ]

ASF GitHub Bot logged work on HADOOP-17125:
---

Author: ASF GitHub Bot
Created on: 29/Sep/20 21:24
Start Date: 29/Sep/20 21:24
Worklog Time Spent: 10m 
  Work Description: viirya commented on a change in pull request #2297:
URL: https://github.com/apache/hadoop/pull/2297#discussion_r497068326



##
File path: hadoop-common-project/hadoop-common/src/main/native/native.vcxproj
##
@@ -68,30 +68,13 @@
 ..\..\..\target\native\$(Configuration)\
 hadoop
   
-  
-$(CustomSnappyPrefix)
-$(CustomSnappyPrefix)\lib
-$(CustomSnappyPrefix)\bin
-$(CustomSnappyLib)
-$(CustomSnappyPrefix)
-$(CustomSnappyPrefix)\include
-$(CustomSnappyInclude)
-true
-$(SnappyInclude);$(IncludePath)
-$(ZLIB_HOME);$(IncludePath)

Review comment:
   I think it is together with snappy stuffs here, no? They are in same 
`PropertyGroup`. I think it is used to add zlib home into include paths in the 
property group.





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 492682)
Time Spent: 21h 10m  (was: 21h)

> Using snappy-java in SnappyCodec
> 
>
> Key: HADOOP-17125
> URL: https://issues.apache.org/jira/browse/HADOOP-17125
> Project: Hadoop Common
>  Issue Type: New Feature
>  Components: common
>Affects Versions: 3.3.0
>Reporter: DB Tsai
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 21h 10m
>  Remaining Estimate: 0h
>
> In Hadoop, we use native libs for snappy codec which has several 
> disadvantages:
>  * It requires native *libhadoop* and *libsnappy* to be installed in system 
> *LD_LIBRARY_PATH*, and they have to be installed separately on each node of 
> the clusters, container images, or local test environments which adds huge 
> complexities from deployment point of view. In some environments, it requires 
> compiling the natives from sources which is non-trivial. Also, this approach 
> is platform dependent; the binary may not work in different platform, so it 
> requires recompilation.
>  * It requires extra configuration of *java.library.path* to load the 
> natives, and it results higher application deployment and maintenance cost 
> for users.
> Projects such as *Spark* and *Parquet* use 
> [snappy-java|[https://github.com/xerial/snappy-java]] which is JNI-based 
> implementation. It contains native binaries for Linux, Mac, and IBM in jar 
> file, and it can automatically load the native binaries into JVM from jar 
> without any setup. If a native implementation can not be found for a 
> platform, it can fallback to pure-java implementation of snappy based on 
> [aircompressor|[https://github.com/airlift/aircompressor/tree/master/src/main/java/io/airlift/compress/snappy]].



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org

[jira] [Work logged] (HADOOP-17125) Using snappy-java in SnappyCodec

2020-09-29 Thread ASF GitHub Bot (Jira)



 [ 
https://issues.apache.org/jira/browse/HADOOP-17125?focusedWorklogId=492648=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-492648
 ]

ASF GitHub Bot logged work on HADOOP-17125:
---

Author: ASF GitHub Bot
Created on: 29/Sep/20 20:13
Start Date: 29/Sep/20 20:13
Worklog Time Spent: 10m 
  Work Description: saintstack commented on pull request #2297:
URL: https://github.com/apache/hadoop/pull/2297#issuecomment-700960375


   > happy with that?
   
   Thanks @steveloughran If I understand you correctly, because you figured we 
actually already include snappy-java and that snappy is already a hadoop-common 
dependency, then making it so snappy-java is 'compile' rather than 'provided' 
is ok by you.
   
   If so, yeah, I think this better. Operators don't have to make sure of 
native snappy everywhere since snappy-java provides it.
   
   (Looks like @viirya has gone ahead and made snappy-java compile scope...)



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 492648)
Time Spent: 21h  (was: 20h 50m)

> Using snappy-java in SnappyCodec
> 
>
> Key: HADOOP-17125
> URL: https://issues.apache.org/jira/browse/HADOOP-17125
> Project: Hadoop Common
>  Issue Type: New Feature
>  Components: common
>Affects Versions: 3.3.0
>Reporter: DB Tsai
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 21h
>  Remaining Estimate: 0h
>
> In Hadoop, we use native libs for snappy codec which has several 
> disadvantages:
>  * It requires native *libhadoop* and *libsnappy* to be installed in system 
> *LD_LIBRARY_PATH*, and they have to be installed separately on each node of 
> the clusters, container images, or local test environments which adds huge 
> complexities from deployment point of view. In some environments, it requires 
> compiling the natives from sources which is non-trivial. Also, this approach 
> is platform dependent; the binary may not work in different platform, so it 
> requires recompilation.
>  * It requires extra configuration of *java.library.path* to load the 
> natives, and it results higher application deployment and maintenance cost 
> for users.
> Projects such as *Spark* and *Parquet* use 
> [snappy-java|[https://github.com/xerial/snappy-java]] which is JNI-based 
> implementation. It contains native binaries for Linux, Mac, and IBM in jar 
> file, and it can automatically load the native binaries into JVM from jar 
> without any setup. If a native implementation can not be found for a 
> platform, it can fallback to pure-java implementation of snappy based on 
> [aircompressor|[https://github.com/airlift/aircompressor/tree/master/src/main/java/io/airlift/compress/snappy]].



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org

[jira] [Work logged] (HADOOP-17125) Using snappy-java in SnappyCodec

2020-09-29 Thread ASF GitHub Bot (Jira)



 [ 
https://issues.apache.org/jira/browse/HADOOP-17125?focusedWorklogId=492645=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-492645
 ]

ASF GitHub Bot logged work on HADOOP-17125:
---

Author: ASF GitHub Bot
Created on: 29/Sep/20 20:00
Start Date: 29/Sep/20 20:00
Worklog Time Spent: 10m 
  Work Description: sunchao commented on a change in pull request #2297:
URL: https://github.com/apache/hadoop/pull/2297#discussion_r497009027



##
File path: hadoop-common-project/hadoop-common/src/main/native/native.vcxproj
##
@@ -68,30 +68,13 @@
 ..\..\..\target\native\$(Configuration)\
 hadoop
   
-  
-$(CustomSnappyPrefix)
-$(CustomSnappyPrefix)\lib
-$(CustomSnappyPrefix)\bin
-$(CustomSnappyLib)
-$(CustomSnappyPrefix)
-$(CustomSnappyPrefix)\include
-$(CustomSnappyInclude)
-true
-$(SnappyInclude);$(IncludePath)
-$(ZLIB_HOME);$(IncludePath)

Review comment:
   I mean the last line, which is about ZLIB.





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 492645)
Time Spent: 20h 50m  (was: 20h 40m)

> Using snappy-java in SnappyCodec
> 
>
> Key: HADOOP-17125
> URL: https://issues.apache.org/jira/browse/HADOOP-17125
> Project: Hadoop Common
>  Issue Type: New Feature
>  Components: common
>Affects Versions: 3.3.0
>Reporter: DB Tsai
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 20h 50m
>  Remaining Estimate: 0h
>
> In Hadoop, we use native libs for snappy codec which has several 
> disadvantages:
>  * It requires native *libhadoop* and *libsnappy* to be installed in system 
> *LD_LIBRARY_PATH*, and they have to be installed separately on each node of 
> the clusters, container images, or local test environments which adds huge 
> complexities from deployment point of view. In some environments, it requires 
> compiling the natives from sources which is non-trivial. Also, this approach 
> is platform dependent; the binary may not work in different platform, so it 
> requires recompilation.
>  * It requires extra configuration of *java.library.path* to load the 
> natives, and it results higher application deployment and maintenance cost 
> for users.
> Projects such as *Spark* and *Parquet* use 
> [snappy-java|[https://github.com/xerial/snappy-java]] which is JNI-based 
> implementation. It contains native binaries for Linux, Mac, and IBM in jar 
> file, and it can automatically load the native binaries into JVM from jar 
> without any setup. If a native implementation can not be found for a 
> platform, it can fallback to pure-java implementation of snappy based on 
> [aircompressor|[https://github.com/airlift/aircompressor/tree/master/src/main/java/io/airlift/compress/snappy]].



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org

[jira] [Work logged] (HADOOP-17125) Using snappy-java in SnappyCodec

2020-09-29 Thread ASF GitHub Bot (Jira)



 [ 
https://issues.apache.org/jira/browse/HADOOP-17125?focusedWorklogId=492635=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-492635
 ]

ASF GitHub Bot logged work on HADOOP-17125:
---

Author: ASF GitHub Bot
Created on: 29/Sep/20 19:39
Start Date: 29/Sep/20 19:39
Worklog Time Spent: 10m 
  Work Description: viirya commented on a change in pull request #2297:
URL: https://github.com/apache/hadoop/pull/2297#discussion_r496997774



##
File path: hadoop-common-project/hadoop-common/src/main/native/native.vcxproj
##
@@ -68,30 +68,13 @@
 ..\..\..\target\native\$(Configuration)\
 hadoop
   
-  
-$(CustomSnappyPrefix)
-$(CustomSnappyPrefix)\lib
-$(CustomSnappyPrefix)\bin
-$(CustomSnappyLib)
-$(CustomSnappyPrefix)
-$(CustomSnappyPrefix)\include
-$(CustomSnappyInclude)
-true
-$(SnappyInclude);$(IncludePath)
-$(ZLIB_HOME);$(IncludePath)

Review comment:
   Since we remove snappy native code, why do we need to keep this on 
windows?





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 492635)
Time Spent: 20h 40m  (was: 20.5h)

> Using snappy-java in SnappyCodec
> 
>
> Key: HADOOP-17125
> URL: https://issues.apache.org/jira/browse/HADOOP-17125
> Project: Hadoop Common
>  Issue Type: New Feature
>  Components: common
>Affects Versions: 3.3.0
>Reporter: DB Tsai
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 20h 40m
>  Remaining Estimate: 0h
>
> In Hadoop, we use native libs for snappy codec which has several 
> disadvantages:
>  * It requires native *libhadoop* and *libsnappy* to be installed in system 
> *LD_LIBRARY_PATH*, and they have to be installed separately on each node of 
> the clusters, container images, or local test environments which adds huge 
> complexities from deployment point of view. In some environments, it requires 
> compiling the natives from sources which is non-trivial. Also, this approach 
> is platform dependent; the binary may not work in different platform, so it 
> requires recompilation.
>  * It requires extra configuration of *java.library.path* to load the 
> natives, and it results higher application deployment and maintenance cost 
> for users.
> Projects such as *Spark* and *Parquet* use 
> [snappy-java|[https://github.com/xerial/snappy-java]] which is JNI-based 
> implementation. It contains native binaries for Linux, Mac, and IBM in jar 
> file, and it can automatically load the native binaries into JVM from jar 
> without any setup. If a native implementation can not be found for a 
> platform, it can fallback to pure-java implementation of snappy based on 
> [aircompressor|[https://github.com/airlift/aircompressor/tree/master/src/main/java/io/airlift/compress/snappy]].



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org

[jira] [Work logged] (HADOOP-17125) Using snappy-java in SnappyCodec

2020-09-29 Thread ASF GitHub Bot (Jira)



 [ 
https://issues.apache.org/jira/browse/HADOOP-17125?focusedWorklogId=492626=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-492626
 ]

ASF GitHub Bot logged work on HADOOP-17125:
---

Author: ASF GitHub Bot
Created on: 29/Sep/20 19:17
Start Date: 29/Sep/20 19:17
Worklog Time Spent: 10m 
  Work Description: sunchao commented on pull request #2297:
URL: https://github.com/apache/hadoop/pull/2297#issuecomment-700927518


   > bad: doesn't work so well if things (hadoop-client?) shade the snappy jar 
references.
   
   Since this dependency is new, as long we make sure it is not shaded in the 
client modules (e.g., hadoop-client-api) it should be fine. I checked there and 
don't see it is shaded right now.



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 492626)
Time Spent: 20.5h  (was: 20h 20m)

> Using snappy-java in SnappyCodec
> 
>
> Key: HADOOP-17125
> URL: https://issues.apache.org/jira/browse/HADOOP-17125
> Project: Hadoop Common
>  Issue Type: New Feature
>  Components: common
>Affects Versions: 3.3.0
>Reporter: DB Tsai
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 20.5h
>  Remaining Estimate: 0h
>
> In Hadoop, we use native libs for snappy codec which has several 
> disadvantages:
>  * It requires native *libhadoop* and *libsnappy* to be installed in system 
> *LD_LIBRARY_PATH*, and they have to be installed separately on each node of 
> the clusters, container images, or local test environments which adds huge 
> complexities from deployment point of view. In some environments, it requires 
> compiling the natives from sources which is non-trivial. Also, this approach 
> is platform dependent; the binary may not work in different platform, so it 
> requires recompilation.
>  * It requires extra configuration of *java.library.path* to load the 
> natives, and it results higher application deployment and maintenance cost 
> for users.
> Projects such as *Spark* and *Parquet* use 
> [snappy-java|[https://github.com/xerial/snappy-java]] which is JNI-based 
> implementation. It contains native binaries for Linux, Mac, and IBM in jar 
> file, and it can automatically load the native binaries into JVM from jar 
> without any setup. If a native implementation can not be found for a 
> platform, it can fallback to pure-java implementation of snappy based on 
> [aircompressor|[https://github.com/airlift/aircompressor/tree/master/src/main/java/io/airlift/compress/snappy]].



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org

[jira] [Work logged] (HADOOP-17125) Using snappy-java in SnappyCodec

2020-09-29 Thread ASF GitHub Bot (Jira)



 [ 
https://issues.apache.org/jira/browse/HADOOP-17125?focusedWorklogId=492625=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-492625
 ]

ASF GitHub Bot logged work on HADOOP-17125:
---

Author: ASF GitHub Bot
Created on: 29/Sep/20 19:15
Start Date: 29/Sep/20 19:15
Worklog Time Spent: 10m 
  Work Description: sunchao commented on a change in pull request #2297:
URL: https://github.com/apache/hadoop/pull/2297#discussion_r496960563



##
File path: hadoop-common-project/hadoop-common/src/main/native/native.vcxproj
##
@@ -68,30 +68,13 @@
 ..\..\..\target\native\$(Configuration)\
 hadoop
   
-  
-$(CustomSnappyPrefix)
-$(CustomSnappyPrefix)\lib
-$(CustomSnappyPrefix)\bin
-$(CustomSnappyLib)
-$(CustomSnappyPrefix)
-$(CustomSnappyPrefix)\include
-$(CustomSnappyInclude)
-true
-$(SnappyInclude);$(IncludePath)
-$(ZLIB_HOME);$(IncludePath)

Review comment:
   We shouldn't remove this

##
File path: hadoop-project-dist/pom.xml
##
@@ -341,7 +340,6 @@
 --openssllib=${openssl.lib}
 
--opensslbinbundle=${bundle.openssl.in.bin}
 --openssllibbundle=${bundle.openssl}
-
--snappybinbundle=${bundle.snappy.in.bin}
 --snappylib=${snappy.lib}
 --snappylibbundle=${bundle.snappy}

Review comment:
   does this mean we don't need the option in 
dev-support/bin/dist-copynativelibs for snappy?





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 492625)
Time Spent: 20h 20m  (was: 20h 10m)

> Using snappy-java in SnappyCodec
> 
>
> Key: HADOOP-17125
> URL: https://issues.apache.org/jira/browse/HADOOP-17125
> Project: Hadoop Common
>  Issue Type: New Feature
>  Components: common
>Affects Versions: 3.3.0
>Reporter: DB Tsai
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 20h 20m
>  Remaining Estimate: 0h
>
> In Hadoop, we use native libs for snappy codec which has several 
> disadvantages:
>  * It requires native *libhadoop* and *libsnappy* to be installed in system 
> *LD_LIBRARY_PATH*, and they have to be installed separately on each node of 
> the clusters, container images, or local test environments which adds huge 
> complexities from deployment point of view. In some environments, it requires 
> compiling the natives from sources which is non-trivial. Also, this approach 
> is platform dependent; the binary may not work in different platform, so it 
> requires recompilation.
>  * It requires extra configuration of *java.library.path* to load the 
> natives, and it results higher application deployment and maintenance cost 
> for users.
> Projects such as *Spark* and *Parquet* use 
> [snappy-java|[https://github.com/xerial/snappy-java]] which is JNI-based 
> implementation. It contains native binaries for Linux, Mac, and IBM in jar 
> file, and it can automatically load the native binaries into JVM from jar 
> without any setup. If a native implementation can not be found for a 
> platform, it can fallback to pure-java implementation of snappy based on 
> [aircompressor|[https://github.com/airlift/aircompressor/tree/master/src/main/java/io/airlift/compress/snappy]].



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org

[jira] [Work logged] (HADOOP-17125) Using snappy-java in SnappyCodec

2020-09-29 Thread ASF GitHub Bot (Jira)



 [ 
https://issues.apache.org/jira/browse/HADOOP-17125?focusedWorklogId=492567=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-492567
 ]

ASF GitHub Bot logged work on HADOOP-17125:
---

Author: ASF GitHub Bot
Created on: 29/Sep/20 17:20
Start Date: 29/Sep/20 17:20
Worklog Time Spent: 10m 
  Work Description: steveloughran commented on pull request #2297:
URL: https://github.com/apache/hadoop/pull/2297#issuecomment-700859880


   Having checked up on those dependencies myself, yes, we are already shipping 
snappy-java 1.0.5 as a dependency of hadoop-common, by way of avro. Which makes 
a strong case for keeping the snappy codec in hadoop-common, declaring the 
snappy dependency as a compile time dependency, with the version we choose. 
This ensures hbase  pick it up, and, because it is there: we aren't creating 
any more complications for people downstream than they get today.
   
   @saintstack happy with that?
   
   In which case, we actually go to the earlier patches, which just switch the 
codec to using the native one, and the pom changes to import it. Plus a release 
note 'you don't need native snappy no more'.
   
   We could stay with the current code, which is resilient to someone removing 
the JAR. 
   Good: resilient
   bad: doesn't work so well if things (hadoop-client?) shade the snappy jar 
references. 
   
   I don't know what to do there. Given the code is written, I'd go for "merge 
it as is and see what happens", but the "go back to the minimal binding" could 
well have lower maintenance costs down the line



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 492567)
Time Spent: 20h 10m  (was: 20h)

> Using snappy-java in SnappyCodec
> 
>
> Key: HADOOP-17125
> URL: https://issues.apache.org/jira/browse/HADOOP-17125
> Project: Hadoop Common
>  Issue Type: New Feature
>  Components: common
>Affects Versions: 3.3.0
>Reporter: DB Tsai
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 20h 10m
>  Remaining Estimate: 0h
>
> In Hadoop, we use native libs for snappy codec which has several 
> disadvantages:
>  * It requires native *libhadoop* and *libsnappy* to be installed in system 
> *LD_LIBRARY_PATH*, and they have to be installed separately on each node of 
> the clusters, container images, or local test environments which adds huge 
> complexities from deployment point of view. In some environments, it requires 
> compiling the natives from sources which is non-trivial. Also, this approach 
> is platform dependent; the binary may not work in different platform, so it 
> requires recompilation.
>  * It requires extra configuration of *java.library.path* to load the 
> natives, and it results higher application deployment and maintenance cost 
> for users.
> Projects such as *Spark* and *Parquet* use 
> [snappy-java|[https://github.com/xerial/snappy-java]] which is JNI-based 
> implementation. It contains native binaries for Linux, Mac, and IBM in jar 
> file, and it can automatically load the native binaries into JVM from jar 
> without any setup. If a native implementation can not be found for a 
> platform, it can fallback to pure-java implementation of snappy based on 
> [aircompressor|[https://github.com/airlift/aircompressor/tree/master/src/main/java/io/airlift/compress/snappy]].



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org

[jira] [Work logged] (HADOOP-17125) Using snappy-java in SnappyCodec

2020-09-28 Thread ASF GitHub Bot (Jira)



 [ 
https://issues.apache.org/jira/browse/HADOOP-17125?focusedWorklogId=492164=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-492164
 ]

ASF GitHub Bot logged work on HADOOP-17125:
---

Author: ASF GitHub Bot
Created on: 28/Sep/20 21:33
Start Date: 28/Sep/20 21:33
Worklog Time Spent: 10m 
  Work Description: saintstack commented on pull request #2297:
URL: https://github.com/apache/hadoop/pull/2297#issuecomment-700293030


   > @steveloughran if snappy native was a dependency for hbase-common, do you 
think that argues snappy-java could be (having it as 'provided' undoes the main 
benefit of using snappy-java jar instead of the native snappy libs). Thanks 
Steve.
   
   Any luck on opinion on above @steveloughran ? Thanks boss.



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 492164)
Time Spent: 20h  (was: 19h 50m)

> Using snappy-java in SnappyCodec
> 
>
> Key: HADOOP-17125
> URL: https://issues.apache.org/jira/browse/HADOOP-17125
> Project: Hadoop Common
>  Issue Type: New Feature
>  Components: common
>Affects Versions: 3.3.0
>Reporter: DB Tsai
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 20h
>  Remaining Estimate: 0h
>
> In Hadoop, we use native libs for snappy codec which has several 
> disadvantages:
>  * It requires native *libhadoop* and *libsnappy* to be installed in system 
> *LD_LIBRARY_PATH*, and they have to be installed separately on each node of 
> the clusters, container images, or local test environments which adds huge 
> complexities from deployment point of view. In some environments, it requires 
> compiling the natives from sources which is non-trivial. Also, this approach 
> is platform dependent; the binary may not work in different platform, so it 
> requires recompilation.
>  * It requires extra configuration of *java.library.path* to load the 
> natives, and it results higher application deployment and maintenance cost 
> for users.
> Projects such as *Spark* and *Parquet* use 
> [snappy-java|[https://github.com/xerial/snappy-java]] which is JNI-based 
> implementation. It contains native binaries for Linux, Mac, and IBM in jar 
> file, and it can automatically load the native binaries into JVM from jar 
> without any setup. If a native implementation can not be found for a 
> platform, it can fallback to pure-java implementation of snappy based on 
> [aircompressor|[https://github.com/airlift/aircompressor/tree/master/src/main/java/io/airlift/compress/snappy]].



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org

[jira] [Work logged] (HADOOP-17125) Using snappy-java in SnappyCodec

2020-09-25 Thread ASF GitHub Bot (Jira)



 [ 
https://issues.apache.org/jira/browse/HADOOP-17125?focusedWorklogId=491427=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-491427
 ]

ASF GitHub Bot logged work on HADOOP-17125:
---

Author: ASF GitHub Bot
Created on: 25/Sep/20 20:59
Start Date: 25/Sep/20 20:59
Worklog Time Spent: 10m 
  Work Description: viirya commented on a change in pull request #2297:
URL: https://github.com/apache/hadoop/pull/2297#discussion_r495182332



##
File path: 
hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/io/compress/snappy/SnappyDecompressor.java
##
@@ -45,30 +46,21 @@
   private int userBufOff = 0, userBufLen = 0;
   private boolean finished;
 
-  private static boolean nativeSnappyLoaded = false;
-
-  static {
-if (NativeCodeLoader.isNativeCodeLoaded() &&
-NativeCodeLoader.buildSupportsSnappy()) {
-  try {
-initIDs();
-nativeSnappyLoaded = true;
-  } catch (Throwable t) {
-LOG.error("failed to load SnappyDecompressor", t);
-  }
-}
-  }
-  
-  public static boolean isNativeCodeLoaded() {
-return nativeSnappyLoaded;
-  }
-  
   /**
* Creates a new compressor.
*
* @param directBufferSize size of the direct buffer to be used.
*/
   public SnappyDecompressor(int directBufferSize) {
+// `snappy-java` is provided scope. We need to check if it is available.
+try {
+  SnappyLoader.getVersion();

Review comment:
   The "internal user-only" of `SnappyLoader`, based on its comment, seems 
more related to native library loading stuff.
   
   `getVersion` is static method and it doesn't involve loading of native 
library described in `SnappyLoader`, so I guess it is fine? Otherwise, I don't 
find other proper one to check.





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 491427)
Time Spent: 19h 50m  (was: 19h 40m)

> Using snappy-java in SnappyCodec
> 
>
> Key: HADOOP-17125
> URL: https://issues.apache.org/jira/browse/HADOOP-17125
> Project: Hadoop Common
>  Issue Type: New Feature
>  Components: common
>Affects Versions: 3.3.0
>Reporter: DB Tsai
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 19h 50m
>  Remaining Estimate: 0h
>
> In Hadoop, we use native libs for snappy codec which has several 
> disadvantages:
>  * It requires native *libhadoop* and *libsnappy* to be installed in system 
> *LD_LIBRARY_PATH*, and they have to be installed separately on each node of 
> the clusters, container images, or local test environments which adds huge 
> complexities from deployment point of view. In some environments, it requires 
> compiling the natives from sources which is non-trivial. Also, this approach 
> is platform dependent; the binary may not work in different platform, so it 
> requires recompilation.
>  * It requires extra configuration of *java.library.path* to load the 
> natives, and it results higher application deployment and maintenance cost 
> for users.
> Projects such as *Spark* and *Parquet* use 
> [snappy-java|[https://github.com/xerial/snappy-java]] which is JNI-based 
> implementation. It contains native binaries for Linux, Mac, and IBM in jar 
> file, and it can automatically load the native binaries into JVM from jar 
> without any setup. If a native implementation can not be found for a 
> platform, it can fallback to pure-java implementation of snappy based on 
> [aircompressor|[https://github.com/airlift/aircompressor/tree/master/src/main/java/io/airlift/compress/snappy]].



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org

[jira] [Work logged] (HADOOP-17125) Using snappy-java in SnappyCodec

2020-09-25 Thread ASF GitHub Bot (Jira)



 [ 
https://issues.apache.org/jira/browse/HADOOP-17125?focusedWorklogId=491426=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-491426
 ]

ASF GitHub Bot logged work on HADOOP-17125:
---

Author: ASF GitHub Bot
Created on: 25/Sep/20 20:58
Start Date: 25/Sep/20 20:58
Worklog Time Spent: 10m 
  Work Description: viirya commented on a change in pull request #2297:
URL: https://github.com/apache/hadoop/pull/2297#discussion_r495183191



##
File path: 
hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/io/compress/snappy/SnappyCompressor.java
##
@@ -291,9 +283,17 @@ public long getBytesWritten() {
   public void end() {
   }
 
-  private native static void initIDs();
-
-  private native int compressBytesDirect();
-
-  public native static String getLibraryName();
+  private int compressDirectBuf() throws IOException {
+if (uncompressedDirectBufLen == 0) {
+  return 0;
+} else {
+  // Set the position and limit of `uncompressedDirectBuf` for reading
+  uncompressedDirectBuf.limit(uncompressedDirectBufLen).position(0);
+  int size = Snappy.compress((ByteBuffer) uncompressedDirectBuf,
+  (ByteBuffer) compressedDirectBuf);
+  uncompressedDirectBufLen = 0;
+  
uncompressedDirectBuf.limit(uncompressedDirectBuf.capacity()).position(0);

Review comment:
   Seems so, I remember I added this to fix test failure. It might be 
`SnappyDecompressor`, I think, then I copied to `SnappyCompressor`. Deleted 
this and see what Jenkins tells.





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 491426)
Time Spent: 19h 40m  (was: 19.5h)

> Using snappy-java in SnappyCodec
> 
>
> Key: HADOOP-17125
> URL: https://issues.apache.org/jira/browse/HADOOP-17125
> Project: Hadoop Common
>  Issue Type: New Feature
>  Components: common
>Affects Versions: 3.3.0
>Reporter: DB Tsai
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 19h 40m
>  Remaining Estimate: 0h
>
> In Hadoop, we use native libs for snappy codec which has several 
> disadvantages:
>  * It requires native *libhadoop* and *libsnappy* to be installed in system 
> *LD_LIBRARY_PATH*, and they have to be installed separately on each node of 
> the clusters, container images, or local test environments which adds huge 
> complexities from deployment point of view. In some environments, it requires 
> compiling the natives from sources which is non-trivial. Also, this approach 
> is platform dependent; the binary may not work in different platform, so it 
> requires recompilation.
>  * It requires extra configuration of *java.library.path* to load the 
> natives, and it results higher application deployment and maintenance cost 
> for users.
> Projects such as *Spark* and *Parquet* use 
> [snappy-java|[https://github.com/xerial/snappy-java]] which is JNI-based 
> implementation. It contains native binaries for Linux, Mac, and IBM in jar 
> file, and it can automatically load the native binaries into JVM from jar 
> without any setup. If a native implementation can not be found for a 
> platform, it can fallback to pure-java implementation of snappy based on 
> [aircompressor|[https://github.com/airlift/aircompressor/tree/master/src/main/java/io/airlift/compress/snappy]].



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org

[jira] [Work logged] (HADOOP-17125) Using snappy-java in SnappyCodec

2020-09-25 Thread ASF GitHub Bot (Jira)



 [ 
https://issues.apache.org/jira/browse/HADOOP-17125?focusedWorklogId=491379=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-491379
 ]

ASF GitHub Bot logged work on HADOOP-17125:
---

Author: ASF GitHub Bot
Created on: 25/Sep/20 19:36
Start Date: 25/Sep/20 19:36
Worklog Time Spent: 10m 
  Work Description: viirya commented on a change in pull request #2297:
URL: https://github.com/apache/hadoop/pull/2297#discussion_r495193338



##
File path: 
hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/io/compress/CompressDecompressTester.java
##
@@ -495,19 +479,16 @@ public String getName() {
 Compressor compressor = pair.compressor;
 
 if (compressor.getClass().isAssignableFrom(Lz4Compressor.class)
-&& (NativeCodeLoader.isNativeCodeLoaded()))

Review comment:
   Ok. Reverted the change.





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 491379)
Time Spent: 19.5h  (was: 19h 20m)

> Using snappy-java in SnappyCodec
> 
>
> Key: HADOOP-17125
> URL: https://issues.apache.org/jira/browse/HADOOP-17125
> Project: Hadoop Common
>  Issue Type: New Feature
>  Components: common
>Affects Versions: 3.3.0
>Reporter: DB Tsai
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 19.5h
>  Remaining Estimate: 0h
>
> In Hadoop, we use native libs for snappy codec which has several 
> disadvantages:
>  * It requires native *libhadoop* and *libsnappy* to be installed in system 
> *LD_LIBRARY_PATH*, and they have to be installed separately on each node of 
> the clusters, container images, or local test environments which adds huge 
> complexities from deployment point of view. In some environments, it requires 
> compiling the natives from sources which is non-trivial. Also, this approach 
> is platform dependent; the binary may not work in different platform, so it 
> requires recompilation.
>  * It requires extra configuration of *java.library.path* to load the 
> natives, and it results higher application deployment and maintenance cost 
> for users.
> Projects such as *Spark* and *Parquet* use 
> [snappy-java|[https://github.com/xerial/snappy-java]] which is JNI-based 
> implementation. It contains native binaries for Linux, Mac, and IBM in jar 
> file, and it can automatically load the native binaries into JVM from jar 
> without any setup. If a native implementation can not be found for a 
> platform, it can fallback to pure-java implementation of snappy based on 
> [aircompressor|[https://github.com/airlift/aircompressor/tree/master/src/main/java/io/airlift/compress/snappy]].



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org

[jira] [Work logged] (HADOOP-17125) Using snappy-java in SnappyCodec

2020-09-25 Thread ASF GitHub Bot (Jira)



 [ 
https://issues.apache.org/jira/browse/HADOOP-17125?focusedWorklogId=491377=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-491377
 ]

ASF GitHub Bot logged work on HADOOP-17125:
---

Author: ASF GitHub Bot
Created on: 25/Sep/20 19:34
Start Date: 25/Sep/20 19:34
Worklog Time Spent: 10m 
  Work Description: viirya commented on pull request #2297:
URL: https://github.com/apache/hadoop/pull/2297#issuecomment-699115827


   Thanks @sunchao for review. Addressed your comments.



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 491377)
Time Spent: 19h 20m  (was: 19h 10m)

> Using snappy-java in SnappyCodec
> 
>
> Key: HADOOP-17125
> URL: https://issues.apache.org/jira/browse/HADOOP-17125
> Project: Hadoop Common
>  Issue Type: New Feature
>  Components: common
>Affects Versions: 3.3.0
>Reporter: DB Tsai
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 19h 20m
>  Remaining Estimate: 0h
>
> In Hadoop, we use native libs for snappy codec which has several 
> disadvantages:
>  * It requires native *libhadoop* and *libsnappy* to be installed in system 
> *LD_LIBRARY_PATH*, and they have to be installed separately on each node of 
> the clusters, container images, or local test environments which adds huge 
> complexities from deployment point of view. In some environments, it requires 
> compiling the natives from sources which is non-trivial. Also, this approach 
> is platform dependent; the binary may not work in different platform, so it 
> requires recompilation.
>  * It requires extra configuration of *java.library.path* to load the 
> natives, and it results higher application deployment and maintenance cost 
> for users.
> Projects such as *Spark* and *Parquet* use 
> [snappy-java|[https://github.com/xerial/snappy-java]] which is JNI-based 
> implementation. It contains native binaries for Linux, Mac, and IBM in jar 
> file, and it can automatically load the native binaries into JVM from jar 
> without any setup. If a native implementation can not be found for a 
> platform, it can fallback to pure-java implementation of snappy based on 
> [aircompressor|[https://github.com/airlift/aircompressor/tree/master/src/main/java/io/airlift/compress/snappy]].



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org

[jira] [Work logged] (HADOOP-17125) Using snappy-java in SnappyCodec

2020-09-25 Thread ASF GitHub Bot (Jira)



 [ 
https://issues.apache.org/jira/browse/HADOOP-17125?focusedWorklogId=491367=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-491367
 ]

ASF GitHub Bot logged work on HADOOP-17125:
---

Author: ASF GitHub Bot
Created on: 25/Sep/20 19:14
Start Date: 25/Sep/20 19:14
Worklog Time Spent: 10m 
  Work Description: viirya commented on a change in pull request #2297:
URL: https://github.com/apache/hadoop/pull/2297#discussion_r495183191



##
File path: 
hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/io/compress/snappy/SnappyCompressor.java
##
@@ -291,9 +283,17 @@ public long getBytesWritten() {
   public void end() {
   }
 
-  private native static void initIDs();
-
-  private native int compressBytesDirect();
-
-  public native static String getLibraryName();
+  private int compressDirectBuf() throws IOException {
+if (uncompressedDirectBufLen == 0) {
+  return 0;
+} else {
+  // Set the position and limit of `uncompressedDirectBuf` for reading
+  uncompressedDirectBuf.limit(uncompressedDirectBufLen).position(0);
+  int size = Snappy.compress((ByteBuffer) uncompressedDirectBuf,
+  (ByteBuffer) compressedDirectBuf);
+  uncompressedDirectBufLen = 0;
+  
uncompressedDirectBuf.limit(uncompressedDirectBuf.capacity()).position(0);

Review comment:
   Seems so, I remember I added this to fix test failure. t might be 
`SnappyDecompressor`, I think, then I copied to `SnappyCompressor`. Deleted 
this and see what Jenkins tells.





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 491367)
Time Spent: 19h  (was: 18h 50m)

> Using snappy-java in SnappyCodec
> 
>
> Key: HADOOP-17125
> URL: https://issues.apache.org/jira/browse/HADOOP-17125
> Project: Hadoop Common
>  Issue Type: New Feature
>  Components: common
>Affects Versions: 3.3.0
>Reporter: DB Tsai
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 19h
>  Remaining Estimate: 0h
>
> In Hadoop, we use native libs for snappy codec which has several 
> disadvantages:
>  * It requires native *libhadoop* and *libsnappy* to be installed in system 
> *LD_LIBRARY_PATH*, and they have to be installed separately on each node of 
> the clusters, container images, or local test environments which adds huge 
> complexities from deployment point of view. In some environments, it requires 
> compiling the natives from sources which is non-trivial. Also, this approach 
> is platform dependent; the binary may not work in different platform, so it 
> requires recompilation.
>  * It requires extra configuration of *java.library.path* to load the 
> natives, and it results higher application deployment and maintenance cost 
> for users.
> Projects such as *Spark* and *Parquet* use 
> [snappy-java|[https://github.com/xerial/snappy-java]] which is JNI-based 
> implementation. It contains native binaries for Linux, Mac, and IBM in jar 
> file, and it can automatically load the native binaries into JVM from jar 
> without any setup. If a native implementation can not be found for a 
> platform, it can fallback to pure-java implementation of snappy based on 
> [aircompressor|[https://github.com/airlift/aircompressor/tree/master/src/main/java/io/airlift/compress/snappy]].



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org

[jira] [Work logged] (HADOOP-17125) Using snappy-java in SnappyCodec

2020-09-25 Thread ASF GitHub Bot (Jira)



 [ 
https://issues.apache.org/jira/browse/HADOOP-17125?focusedWorklogId=491368=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-491368
 ]

ASF GitHub Bot logged work on HADOOP-17125:
---

Author: ASF GitHub Bot
Created on: 25/Sep/20 19:14
Start Date: 25/Sep/20 19:14
Worklog Time Spent: 10m 
  Work Description: viirya commented on a change in pull request #2297:
URL: https://github.com/apache/hadoop/pull/2297#discussion_r495183290



##
File path: 
hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/io/compress/snappy/SnappyDecompressor.java
##
@@ -276,10 +268,20 @@ public void end() {
 // do nothing
   }
 
-  private native static void initIDs();
+  private int decompressDirectBuf() throws IOException {
+if (compressedDirectBufLen == 0) {
+  return 0;
+} else {
+  // Set the position and limit of `compressedDirectBuf` for reading
+  compressedDirectBuf.limit(compressedDirectBufLen).position(0);
+  int size = Snappy.uncompress((ByteBuffer) compressedDirectBuf,
+  (ByteBuffer) uncompressedDirectBuf);
+  compressedDirectBufLen = 0;
+  compressedDirectBuf.limit(compressedDirectBuf.capacity()).position(0);

Review comment:
   yap





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 491368)
Time Spent: 19h 10m  (was: 19h)

> Using snappy-java in SnappyCodec
> 
>
> Key: HADOOP-17125
> URL: https://issues.apache.org/jira/browse/HADOOP-17125
> Project: Hadoop Common
>  Issue Type: New Feature
>  Components: common
>Affects Versions: 3.3.0
>Reporter: DB Tsai
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 19h 10m
>  Remaining Estimate: 0h
>
> In Hadoop, we use native libs for snappy codec which has several 
> disadvantages:
>  * It requires native *libhadoop* and *libsnappy* to be installed in system 
> *LD_LIBRARY_PATH*, and they have to be installed separately on each node of 
> the clusters, container images, or local test environments which adds huge 
> complexities from deployment point of view. In some environments, it requires 
> compiling the natives from sources which is non-trivial. Also, this approach 
> is platform dependent; the binary may not work in different platform, so it 
> requires recompilation.
>  * It requires extra configuration of *java.library.path* to load the 
> natives, and it results higher application deployment and maintenance cost 
> for users.
> Projects such as *Spark* and *Parquet* use 
> [snappy-java|[https://github.com/xerial/snappy-java]] which is JNI-based 
> implementation. It contains native binaries for Linux, Mac, and IBM in jar 
> file, and it can automatically load the native binaries into JVM from jar 
> without any setup. If a native implementation can not be found for a 
> platform, it can fallback to pure-java implementation of snappy based on 
> [aircompressor|[https://github.com/airlift/aircompressor/tree/master/src/main/java/io/airlift/compress/snappy]].



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org

[jira] [Work logged] (HADOOP-17125) Using snappy-java in SnappyCodec

2020-09-25 Thread ASF GitHub Bot (Jira)



 [ 
https://issues.apache.org/jira/browse/HADOOP-17125?focusedWorklogId=491366=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-491366
 ]

ASF GitHub Bot logged work on HADOOP-17125:
---

Author: ASF GitHub Bot
Created on: 25/Sep/20 19:12
Start Date: 25/Sep/20 19:12
Worklog Time Spent: 10m 
  Work Description: viirya commented on a change in pull request #2297:
URL: https://github.com/apache/hadoop/pull/2297#discussion_r495182332



##
File path: 
hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/io/compress/snappy/SnappyDecompressor.java
##
@@ -45,30 +46,21 @@
   private int userBufOff = 0, userBufLen = 0;
   private boolean finished;
 
-  private static boolean nativeSnappyLoaded = false;
-
-  static {
-if (NativeCodeLoader.isNativeCodeLoaded() &&
-NativeCodeLoader.buildSupportsSnappy()) {
-  try {
-initIDs();
-nativeSnappyLoaded = true;
-  } catch (Throwable t) {
-LOG.error("failed to load SnappyDecompressor", t);
-  }
-}
-  }
-  
-  public static boolean isNativeCodeLoaded() {
-return nativeSnappyLoaded;
-  }
-  
   /**
* Creates a new compressor.
*
* @param directBufferSize size of the direct buffer to be used.
*/
   public SnappyDecompressor(int directBufferSize) {
+// `snappy-java` is provided scope. We need to check if it is available.
+try {
+  SnappyLoader.getVersion();

Review comment:
   `getVersion` is static method and it doesn't involve loading of native 
library described in `SnappyLoader`, so I guess it is fine? Otherwise, I don't 
find other proper one to check.





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 491366)
Time Spent: 18h 50m  (was: 18h 40m)

> Using snappy-java in SnappyCodec
> 
>
> Key: HADOOP-17125
> URL: https://issues.apache.org/jira/browse/HADOOP-17125
> Project: Hadoop Common
>  Issue Type: New Feature
>  Components: common
>Affects Versions: 3.3.0
>Reporter: DB Tsai
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 18h 50m
>  Remaining Estimate: 0h
>
> In Hadoop, we use native libs for snappy codec which has several 
> disadvantages:
>  * It requires native *libhadoop* and *libsnappy* to be installed in system 
> *LD_LIBRARY_PATH*, and they have to be installed separately on each node of 
> the clusters, container images, or local test environments which adds huge 
> complexities from deployment point of view. In some environments, it requires 
> compiling the natives from sources which is non-trivial. Also, this approach 
> is platform dependent; the binary may not work in different platform, so it 
> requires recompilation.
>  * It requires extra configuration of *java.library.path* to load the 
> natives, and it results higher application deployment and maintenance cost 
> for users.
> Projects such as *Spark* and *Parquet* use 
> [snappy-java|[https://github.com/xerial/snappy-java]] which is JNI-based 
> implementation. It contains native binaries for Linux, Mac, and IBM in jar 
> file, and it can automatically load the native binaries into JVM from jar 
> without any setup. If a native implementation can not be found for a 
> platform, it can fallback to pure-java implementation of snappy based on 
> [aircompressor|[https://github.com/airlift/aircompressor/tree/master/src/main/java/io/airlift/compress/snappy]].



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org

[jira] [Work logged] (HADOOP-17125) Using snappy-java in SnappyCodec

2020-09-25 Thread ASF GitHub Bot (Jira)



 [ 
https://issues.apache.org/jira/browse/HADOOP-17125?focusedWorklogId=491365=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-491365
 ]

ASF GitHub Bot logged work on HADOOP-17125:
---

Author: ASF GitHub Bot
Created on: 25/Sep/20 19:00
Start Date: 25/Sep/20 19:00
Worklog Time Spent: 10m 
  Work Description: sunchao commented on a change in pull request #2297:
URL: https://github.com/apache/hadoop/pull/2297#discussion_r495176908



##
File path: 
hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/io/compress/CompressDecompressTester.java
##
@@ -495,19 +479,16 @@ public String getName() {
 Compressor compressor = pair.compressor;
 
 if (compressor.getClass().isAssignableFrom(Lz4Compressor.class)
-&& (NativeCodeLoader.isNativeCodeLoaded()))

Review comment:
   Yeah usually it's not recommended to include unrelated changes in Hadoop 
patch, we may add another refactoring PR later if this is absolutely necessary.





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 491365)
Time Spent: 18h 40m  (was: 18.5h)

> Using snappy-java in SnappyCodec
> 
>
> Key: HADOOP-17125
> URL: https://issues.apache.org/jira/browse/HADOOP-17125
> Project: Hadoop Common
>  Issue Type: New Feature
>  Components: common
>Affects Versions: 3.3.0
>Reporter: DB Tsai
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 18h 40m
>  Remaining Estimate: 0h
>
> In Hadoop, we use native libs for snappy codec which has several 
> disadvantages:
>  * It requires native *libhadoop* and *libsnappy* to be installed in system 
> *LD_LIBRARY_PATH*, and they have to be installed separately on each node of 
> the clusters, container images, or local test environments which adds huge 
> complexities from deployment point of view. In some environments, it requires 
> compiling the natives from sources which is non-trivial. Also, this approach 
> is platform dependent; the binary may not work in different platform, so it 
> requires recompilation.
>  * It requires extra configuration of *java.library.path* to load the 
> natives, and it results higher application deployment and maintenance cost 
> for users.
> Projects such as *Spark* and *Parquet* use 
> [snappy-java|[https://github.com/xerial/snappy-java]] which is JNI-based 
> implementation. It contains native binaries for Linux, Mac, and IBM in jar 
> file, and it can automatically load the native binaries into JVM from jar 
> without any setup. If a native implementation can not be found for a 
> platform, it can fallback to pure-java implementation of snappy based on 
> [aircompressor|[https://github.com/airlift/aircompressor/tree/master/src/main/java/io/airlift/compress/snappy]].



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org

[jira] [Work logged] (HADOOP-17125) Using snappy-java in SnappyCodec

2020-09-25 Thread ASF GitHub Bot (Jira)



 [ 
https://issues.apache.org/jira/browse/HADOOP-17125?focusedWorklogId=491364=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-491364
 ]

ASF GitHub Bot logged work on HADOOP-17125:
---

Author: ASF GitHub Bot
Created on: 25/Sep/20 18:58
Start Date: 25/Sep/20 18:58
Worklog Time Spent: 10m 
  Work Description: viirya commented on a change in pull request #2297:
URL: https://github.com/apache/hadoop/pull/2297#discussion_r495175908



##
File path: 
hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/io/compress/CompressDecompressTester.java
##
@@ -495,19 +479,16 @@ public String getName() {
 Compressor compressor = pair.compressor;
 
 if (compressor.getClass().isAssignableFrom(Lz4Compressor.class)
-&& (NativeCodeLoader.isNativeCodeLoaded()))

Review comment:
   Oh, this is from @dbtsai's original change. I think adding curly 
brackets is better? I can revert this if you think it is necessary.





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 491364)
Time Spent: 18.5h  (was: 18h 20m)

> Using snappy-java in SnappyCodec
> 
>
> Key: HADOOP-17125
> URL: https://issues.apache.org/jira/browse/HADOOP-17125
> Project: Hadoop Common
>  Issue Type: New Feature
>  Components: common
>Affects Versions: 3.3.0
>Reporter: DB Tsai
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 18.5h
>  Remaining Estimate: 0h
>
> In Hadoop, we use native libs for snappy codec which has several 
> disadvantages:
>  * It requires native *libhadoop* and *libsnappy* to be installed in system 
> *LD_LIBRARY_PATH*, and they have to be installed separately on each node of 
> the clusters, container images, or local test environments which adds huge 
> complexities from deployment point of view. In some environments, it requires 
> compiling the natives from sources which is non-trivial. Also, this approach 
> is platform dependent; the binary may not work in different platform, so it 
> requires recompilation.
>  * It requires extra configuration of *java.library.path* to load the 
> natives, and it results higher application deployment and maintenance cost 
> for users.
> Projects such as *Spark* and *Parquet* use 
> [snappy-java|[https://github.com/xerial/snappy-java]] which is JNI-based 
> implementation. It contains native binaries for Linux, Mac, and IBM in jar 
> file, and it can automatically load the native binaries into JVM from jar 
> without any setup. If a native implementation can not be found for a 
> platform, it can fallback to pure-java implementation of snappy based on 
> [aircompressor|[https://github.com/airlift/aircompressor/tree/master/src/main/java/io/airlift/compress/snappy]].



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org

[jira] [Work logged] (HADOOP-17125) Using snappy-java in SnappyCodec

2020-09-25 Thread ASF GitHub Bot (Jira)



 [ 
https://issues.apache.org/jira/browse/HADOOP-17125?focusedWorklogId=491362=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-491362
 ]

ASF GitHub Bot logged work on HADOOP-17125:
---

Author: ASF GitHub Bot
Created on: 25/Sep/20 18:55
Start Date: 25/Sep/20 18:55
Worklog Time Spent: 10m 
  Work Description: sunchao commented on pull request #2297:
URL: https://github.com/apache/hadoop/pull/2297#issuecomment-699097890


   (nit nit) we may need to update [NativeLibraries 
guide](https://github.com/apache/hadoop/blob/trunk/hadoop-common-project/hadoop-common/src/site/markdown/NativeLibraries.md.vm)
 as well since `NativeLibraryChecker` no longer checks snappy.



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 491362)
Time Spent: 18h 20m  (was: 18h 10m)

> Using snappy-java in SnappyCodec
> 
>
> Key: HADOOP-17125
> URL: https://issues.apache.org/jira/browse/HADOOP-17125
> Project: Hadoop Common
>  Issue Type: New Feature
>  Components: common
>Affects Versions: 3.3.0
>Reporter: DB Tsai
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 18h 20m
>  Remaining Estimate: 0h
>
> In Hadoop, we use native libs for snappy codec which has several 
> disadvantages:
>  * It requires native *libhadoop* and *libsnappy* to be installed in system 
> *LD_LIBRARY_PATH*, and they have to be installed separately on each node of 
> the clusters, container images, or local test environments which adds huge 
> complexities from deployment point of view. In some environments, it requires 
> compiling the natives from sources which is non-trivial. Also, this approach 
> is platform dependent; the binary may not work in different platform, so it 
> requires recompilation.
>  * It requires extra configuration of *java.library.path* to load the 
> natives, and it results higher application deployment and maintenance cost 
> for users.
> Projects such as *Spark* and *Parquet* use 
> [snappy-java|[https://github.com/xerial/snappy-java]] which is JNI-based 
> implementation. It contains native binaries for Linux, Mac, and IBM in jar 
> file, and it can automatically load the native binaries into JVM from jar 
> without any setup. If a native implementation can not be found for a 
> platform, it can fallback to pure-java implementation of snappy based on 
> [aircompressor|[https://github.com/airlift/aircompressor/tree/master/src/main/java/io/airlift/compress/snappy]].



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org

[jira] [Work logged] (HADOOP-17125) Using snappy-java in SnappyCodec

2020-09-25 Thread ASF GitHub Bot (Jira)



 [ 
https://issues.apache.org/jira/browse/HADOOP-17125?focusedWorklogId=491359=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-491359
 ]

ASF GitHub Bot logged work on HADOOP-17125:
---

Author: ASF GitHub Bot
Created on: 25/Sep/20 18:51
Start Date: 25/Sep/20 18:51
Worklog Time Spent: 10m 
  Work Description: sunchao commented on a change in pull request #2297:
URL: https://github.com/apache/hadoop/pull/2297#discussion_r495158362



##
File path: 
hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/io/compress/snappy/SnappyCompressor.java
##
@@ -291,9 +283,17 @@ public long getBytesWritten() {
   public void end() {
   }
 
-  private native static void initIDs();
-
-  private native int compressBytesDirect();
-
-  public native static String getLibraryName();
+  private int compressDirectBuf() throws IOException {
+if (uncompressedDirectBufLen == 0) {
+  return 0;
+} else {
+  // Set the position and limit of `uncompressedDirectBuf` for reading
+  uncompressedDirectBuf.limit(uncompressedDirectBufLen).position(0);
+  int size = Snappy.compress((ByteBuffer) uncompressedDirectBuf,
+  (ByteBuffer) compressedDirectBuf);
+  uncompressedDirectBufLen = 0;
+  
uncompressedDirectBuf.limit(uncompressedDirectBuf.capacity()).position(0);

Review comment:
   nit: this seems unnecessary as `clear` is called shortly after at the 
call site? 

##
File path: 
hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/io/compress/snappy/SnappyDecompressor.java
##
@@ -276,10 +268,20 @@ public void end() {
 // do nothing
   }
 
-  private native static void initIDs();
+  private int decompressDirectBuf() throws IOException {
+if (compressedDirectBufLen == 0) {
+  return 0;
+} else {
+  // Set the position and limit of `compressedDirectBuf` for reading
+  compressedDirectBuf.limit(compressedDirectBufLen).position(0);
+  int size = Snappy.uncompress((ByteBuffer) compressedDirectBuf,
+  (ByteBuffer) uncompressedDirectBuf);
+  compressedDirectBufLen = 0;
+  compressedDirectBuf.limit(compressedDirectBuf.capacity()).position(0);

Review comment:
   nit: can we just call `compressedDirectBuf.clear()`?

##
File path: 
hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/io/compress/snappy/TestSnappyCompressorDecompressor.java
##
@@ -446,4 +442,49 @@ public void doWork() throws Exception {
 
 ctx.waitFor(6);
   }
+
+  @Test
+  public void testSnappyCompatibility() throws Exception {
+// HADOOP-17125. Using snappy-java in SnappyCodec. These strings are raw 
data and compressed data
+// using previous native Snappy codec. We use updated Snappy codec to 
decode it and check if it
+// matches.
+String rawData = 
"010a06030a040a0c0109020c0a010204020d02000b010701080605080b090902060a08050206" +
+
"0a0d06070908080a0c0105030904090d05090800040c090c0d0d0804000d00040b0b0d010d060907020a0"
 +
+
"30a0c0900040905080107040d0c01060a0b09070a04000b01040b09000e0e00020b06050b060e030e0a07"
 +
+"050d06050d";
+String compressed = 
"8001f07f010a06030a040a0c0109020c0a010204020d02000b010701080605080b0909020" +
+
"60a080502060a0d06070908080a0c0105030904090d05090800040c090c0d0d0804000d00040b0b0d010d"
 +
+
"060907020a030a0c0900040905080107040d0c01060a0b09070a04000b01040b09000e0e00020b06050b0"
 +
+"60e030e0a07050d06050d";
+
+byte[] rawDataBytes = Hex.decodeHex(rawData);
+byte[] compressedBytes = Hex.decodeHex(compressed);
+
+ByteBuffer inBuf = ByteBuffer.allocateDirect(compressedBytes.length);
+inBuf.put(compressedBytes, 0, compressedBytes.length);
+inBuf.flip();
+
+ByteBuffer outBuf = ByteBuffer.allocateDirect(rawDataBytes.length);
+ByteBuffer expected = ByteBuffer.wrap(rawDataBytes);
+
+SnappyDecompressor.SnappyDirectDecompressor decompressor = new 
SnappyDecompressor.SnappyDirectDecompressor();

Review comment:
   nit: long lines (80 chars).

##
File path: 
hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/io/compress/CompressDecompressTester.java
##
@@ -495,19 +479,16 @@ public String getName() {
 Compressor compressor = pair.compressor;
 
 if (compressor.getClass().isAssignableFrom(Lz4Compressor.class)
-&& (NativeCodeLoader.isNativeCodeLoaded()))

Review comment:
   nit: unrelated changes :)

##
File path: 
hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/io/compress/snappy/SnappyDecompressor.java
##
@@ -45,30 +46,21 @@
   private int userBufOff = 0, userBufLen = 0;
   private boolean finished;
 
-  private static boolean nativeSnappyLoaded = false;
-
-  static {
-if (NativeCodeLoader.isNativeCodeLoaded() &&
-

[jira] [Work logged] (HADOOP-17125) Using snappy-java in SnappyCodec

2020-09-25 Thread ASF GitHub Bot (Jira)



 [ 
https://issues.apache.org/jira/browse/HADOOP-17125?focusedWorklogId=491318=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-491318
 ]

ASF GitHub Bot logged work on HADOOP-17125:
---

Author: ASF GitHub Bot
Created on: 25/Sep/20 16:54
Start Date: 25/Sep/20 16:54
Worklog Time Spent: 10m 
  Work Description: viirya commented on pull request #2297:
URL: https://github.com/apache/hadoop/pull/2297#issuecomment-699039044


   @saintstack @steveloughran Is this change looking good for you now? Based on 
this change, we will work on the compression module per 
https://github.com/apache/hadoop/pull/2159#issuecomment-698212693 suggests. 
Thanks.



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 491318)
Time Spent: 18h  (was: 17h 50m)

> Using snappy-java in SnappyCodec
> 
>
> Key: HADOOP-17125
> URL: https://issues.apache.org/jira/browse/HADOOP-17125
> Project: Hadoop Common
>  Issue Type: New Feature
>  Components: common
>Affects Versions: 3.3.0
>Reporter: DB Tsai
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 18h
>  Remaining Estimate: 0h
>
> In Hadoop, we use native libs for snappy codec which has several 
> disadvantages:
>  * It requires native *libhadoop* and *libsnappy* to be installed in system 
> *LD_LIBRARY_PATH*, and they have to be installed separately on each node of 
> the clusters, container images, or local test environments which adds huge 
> complexities from deployment point of view. In some environments, it requires 
> compiling the natives from sources which is non-trivial. Also, this approach 
> is platform dependent; the binary may not work in different platform, so it 
> requires recompilation.
>  * It requires extra configuration of *java.library.path* to load the 
> natives, and it results higher application deployment and maintenance cost 
> for users.
> Projects such as *Spark* and *Parquet* use 
> [snappy-java|[https://github.com/xerial/snappy-java]] which is JNI-based 
> implementation. It contains native binaries for Linux, Mac, and IBM in jar 
> file, and it can automatically load the native binaries into JVM from jar 
> without any setup. If a native implementation can not be found for a 
> platform, it can fallback to pure-java implementation of snappy based on 
> [aircompressor|[https://github.com/airlift/aircompressor/tree/master/src/main/java/io/airlift/compress/snappy]].



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org

[jira] [Work logged] (HADOOP-17125) Using snappy-java in SnappyCodec

2020-09-25 Thread ASF GitHub Bot (Jira)



 [ 
https://issues.apache.org/jira/browse/HADOOP-17125?focusedWorklogId=491132=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-491132
 ]

ASF GitHub Bot logged work on HADOOP-17125:
---

Author: ASF GitHub Bot
Created on: 25/Sep/20 13:45
Start Date: 25/Sep/20 13:45
Worklog Time Spent: 10m 
  Work Description: dbtsai commented on a change in pull request #2297:
URL: https://github.com/apache/hadoop/pull/2297#discussion_r494537597



##
File path: 
hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/io/compress/snappy/SnappyCompressor.java
##
@@ -291,9 +283,17 @@ public long getBytesWritten() {
   public void end() {
   }
 
-  private native static void initIDs();
-
-  private native int compressBytesDirect();
-
-  public native static String getLibraryName();
+  private int compressDirectBuf() throws IOException {
+if (uncompressedDirectBufLen == 0) {
+  return 0;
+} else {
+  // Set the position and limit of `uncompressedDirectBuf` for reading
+  uncompressedDirectBuf.limit(uncompressedDirectBufLen).position(0);
+  int size = Snappy.compress((ByteBuffer) uncompressedDirectBuf,
+  (ByteBuffer) compressedDirectBuf);
+  uncompressedDirectBufLen = 0;
+  uncompressedDirectBuf.limit(directBufferSize).position(0);

Review comment:
   nit, 
`uncompressedDirectBuf.limit(uncompressedDirectBuf.capacity()).position(0);` 
for safety.
   

##
File path: 
hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/io/compress/CompressDecompressTester.java
##
@@ -432,7 +412,11 @@ public void assertCompression(String name, Compressor 
compressor,
   joiner.join(name, "byte arrays not equals error !!!"),
   originalRawData, decompressOut.toByteArray());
 } catch (Exception ex) {
-  fail(joiner.join(name, ex.getMessage()));
+  if (ex.getMessage() != null) {
+fail(joiner.join(name, ex.getMessage()));
+  } else {
+fail(joiner.join(name, ExceptionUtils.getStackTrace(ex)));

Review comment:
   Why is this change needed?





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 491132)
Time Spent: 17h 50m  (was: 17h 40m)

> Using snappy-java in SnappyCodec
> 
>
> Key: HADOOP-17125
> URL: https://issues.apache.org/jira/browse/HADOOP-17125
> Project: Hadoop Common
>  Issue Type: New Feature
>  Components: common
>Affects Versions: 3.3.0
>Reporter: DB Tsai
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 17h 50m
>  Remaining Estimate: 0h
>
> In Hadoop, we use native libs for snappy codec which has several 
> disadvantages:
>  * It requires native *libhadoop* and *libsnappy* to be installed in system 
> *LD_LIBRARY_PATH*, and they have to be installed separately on each node of 
> the clusters, container images, or local test environments which adds huge 
> complexities from deployment point of view. In some environments, it requires 
> compiling the natives from sources which is non-trivial. Also, this approach 
> is platform dependent; the binary may not work in different platform, so it 
> requires recompilation.
>  * It requires extra configuration of *java.library.path* to load the 
> natives, and it results higher application deployment and maintenance cost 
> for users.
> Projects such as *Spark* and *Parquet* use 
> [snappy-java|[https://github.com/xerial/snappy-java]] which is JNI-based 
> implementation. It contains native binaries for Linux, Mac, and IBM in jar 
> file, and it can automatically load the native binaries into JVM from jar 
> without any setup. If a native implementation can not be found for a 
> platform, it can fallback to pure-java implementation of snappy based on 
> [aircompressor|[https://github.com/airlift/aircompressor/tree/master/src/main/java/io/airlift/compress/snappy]].



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org

[jira] [Work logged] (HADOOP-17125) Using snappy-java in SnappyCodec

2020-09-25 Thread ASF GitHub Bot (Jira)



 [ 
https://issues.apache.org/jira/browse/HADOOP-17125?focusedWorklogId=490944=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-490944
 ]

ASF GitHub Bot logged work on HADOOP-17125:
---

Author: ASF GitHub Bot
Created on: 25/Sep/20 13:28
Start Date: 25/Sep/20 13:28
Worklog Time Spent: 10m 
  Work Description: steveloughran commented on pull request #2297:
URL: https://github.com/apache/hadoop/pull/2297#issuecomment-698298972


   So actually snappy was already a dependency of hadoop-common? Interesting



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 490944)
Time Spent: 17h 40m  (was: 17.5h)

> Using snappy-java in SnappyCodec
> 
>
> Key: HADOOP-17125
> URL: https://issues.apache.org/jira/browse/HADOOP-17125
> Project: Hadoop Common
>  Issue Type: New Feature
>  Components: common
>Affects Versions: 3.3.0
>Reporter: DB Tsai
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 17h 40m
>  Remaining Estimate: 0h
>
> In Hadoop, we use native libs for snappy codec which has several 
> disadvantages:
>  * It requires native *libhadoop* and *libsnappy* to be installed in system 
> *LD_LIBRARY_PATH*, and they have to be installed separately on each node of 
> the clusters, container images, or local test environments which adds huge 
> complexities from deployment point of view. In some environments, it requires 
> compiling the natives from sources which is non-trivial. Also, this approach 
> is platform dependent; the binary may not work in different platform, so it 
> requires recompilation.
>  * It requires extra configuration of *java.library.path* to load the 
> natives, and it results higher application deployment and maintenance cost 
> for users.
> Projects such as *Spark* and *Parquet* use 
> [snappy-java|[https://github.com/xerial/snappy-java]] which is JNI-based 
> implementation. It contains native binaries for Linux, Mac, and IBM in jar 
> file, and it can automatically load the native binaries into JVM from jar 
> without any setup. If a native implementation can not be found for a 
> platform, it can fallback to pure-java implementation of snappy based on 
> [aircompressor|[https://github.com/airlift/aircompressor/tree/master/src/main/java/io/airlift/compress/snappy]].



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org

[jira] [Work logged] (HADOOP-17125) Using snappy-java in SnappyCodec

2020-09-25 Thread ASF GitHub Bot (Jira)



 [ 
https://issues.apache.org/jira/browse/HADOOP-17125?focusedWorklogId=490867=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-490867
 ]

ASF GitHub Bot logged work on HADOOP-17125:
---

Author: ASF GitHub Bot
Created on: 25/Sep/20 13:21
Start Date: 25/Sep/20 13:21
Worklog Time Spent: 10m 
  Work Description: saintstack commented on pull request #2297:
URL: https://github.com/apache/hadoop/pull/2297#issuecomment-698423282


   @steveloughran if snappy native was a dependency for hbase-common, do you 
think that argues snappy-java could be (having it as 'provided' undoes the main 
benefit of using snappy-java jar instead of the native snappy libs). Thanks 
Steve.



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 490867)
Time Spent: 17h 10m  (was: 17h)

> Using snappy-java in SnappyCodec
> 
>
> Key: HADOOP-17125
> URL: https://issues.apache.org/jira/browse/HADOOP-17125
> Project: Hadoop Common
>  Issue Type: New Feature
>  Components: common
>Affects Versions: 3.3.0
>Reporter: DB Tsai
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 17h 10m
>  Remaining Estimate: 0h
>
> In Hadoop, we use native libs for snappy codec which has several 
> disadvantages:
>  * It requires native *libhadoop* and *libsnappy* to be installed in system 
> *LD_LIBRARY_PATH*, and they have to be installed separately on each node of 
> the clusters, container images, or local test environments which adds huge 
> complexities from deployment point of view. In some environments, it requires 
> compiling the natives from sources which is non-trivial. Also, this approach 
> is platform dependent; the binary may not work in different platform, so it 
> requires recompilation.
>  * It requires extra configuration of *java.library.path* to load the 
> natives, and it results higher application deployment and maintenance cost 
> for users.
> Projects such as *Spark* and *Parquet* use 
> [snappy-java|[https://github.com/xerial/snappy-java]] which is JNI-based 
> implementation. It contains native binaries for Linux, Mac, and IBM in jar 
> file, and it can automatically load the native binaries into JVM from jar 
> without any setup. If a native implementation can not be found for a 
> platform, it can fallback to pure-java implementation of snappy based on 
> [aircompressor|[https://github.com/airlift/aircompressor/tree/master/src/main/java/io/airlift/compress/snappy]].



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org

[jira] [Work logged] (HADOOP-17125) Using snappy-java in SnappyCodec

2020-09-25 Thread ASF GitHub Bot (Jira)



 [ 
https://issues.apache.org/jira/browse/HADOOP-17125?focusedWorklogId=490922=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-490922
 ]

ASF GitHub Bot logged work on HADOOP-17125:
---

Author: ASF GitHub Bot
Created on: 25/Sep/20 13:26
Start Date: 25/Sep/20 13:26
Worklog Time Spent: 10m 
  Work Description: hadoop-yetus removed a comment on pull request #2297:
URL: https://github.com/apache/hadoop/pull/2297#issuecomment-690765839







This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 490922)
Time Spent: 17.5h  (was: 17h 20m)

> Using snappy-java in SnappyCodec
> 
>
> Key: HADOOP-17125
> URL: https://issues.apache.org/jira/browse/HADOOP-17125
> Project: Hadoop Common
>  Issue Type: New Feature
>  Components: common
>Affects Versions: 3.3.0
>Reporter: DB Tsai
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 17.5h
>  Remaining Estimate: 0h
>
> In Hadoop, we use native libs for snappy codec which has several 
> disadvantages:
>  * It requires native *libhadoop* and *libsnappy* to be installed in system 
> *LD_LIBRARY_PATH*, and they have to be installed separately on each node of 
> the clusters, container images, or local test environments which adds huge 
> complexities from deployment point of view. In some environments, it requires 
> compiling the natives from sources which is non-trivial. Also, this approach 
> is platform dependent; the binary may not work in different platform, so it 
> requires recompilation.
>  * It requires extra configuration of *java.library.path* to load the 
> natives, and it results higher application deployment and maintenance cost 
> for users.
> Projects such as *Spark* and *Parquet* use 
> [snappy-java|[https://github.com/xerial/snappy-java]] which is JNI-based 
> implementation. It contains native binaries for Linux, Mac, and IBM in jar 
> file, and it can automatically load the native binaries into JVM from jar 
> without any setup. If a native implementation can not be found for a 
> platform, it can fallback to pure-java implementation of snappy based on 
> [aircompressor|[https://github.com/airlift/aircompressor/tree/master/src/main/java/io/airlift/compress/snappy]].



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org

[jira] [Work logged] (HADOOP-17125) Using snappy-java in SnappyCodec

2020-09-25 Thread ASF GitHub Bot (Jira)



 [ 
https://issues.apache.org/jira/browse/HADOOP-17125?focusedWorklogId=490911=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-490911
 ]

ASF GitHub Bot logged work on HADOOP-17125:
---

Author: ASF GitHub Bot
Created on: 25/Sep/20 13:25
Start Date: 25/Sep/20 13:25
Worklog Time Spent: 10m 
  Work Description: viirya commented on a change in pull request #2297:
URL: https://github.com/apache/hadoop/pull/2297#discussion_r494069586



##
File path: 
hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/io/compress/snappy/TestSnappyCompressorDecompressor.java
##
@@ -446,4 +442,43 @@ public void doWork() throws Exception {
 
 ctx.waitFor(6);
   }
+
+  @Test
+  public void testSnappyCompatibility() throws Exception {
+// HADOOP-17125. Using snappy-java in SnappyCodec. These strings are raw 
data and compressed data
+// using previous native Snappy codec. We use updated Snappy codec to 
decode it and check if it
+// matches.
+String rawData = 
"010a06030a040a0c0109020c0a010204020d02000b010701080605080b090902060a080502060a0d06070908080a0c0105030904090d05090800040c090c0d0d0804000d00040b0b0d010d060907020a030a0c0900040905080107040d0c01060a0b09070a04000b01040b09000e0e00020b06050b060e030e0a07050d06050d";

Review comment:
   String is to make the test as simple as possible. Maybe further shorten 
the string?

##
File path: 
hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/io/compress/snappy/SnappyCompressor.java
##
@@ -291,9 +282,17 @@ public long getBytesWritten() {
   public void end() {
   }
 
-  private native static void initIDs();
-
-  private native int compressBytesDirect();
-
-  public native static String getLibraryName();
+  private int compressBytesDirect() throws IOException {

Review comment:
   This `compressBytesDirect` and `decompressBytesDirect` basically are 
copied from original method names. `compressDirectBuf` and 
`decompressDirectBuf` looks good to me.

##
File path: 
hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/io/compress/snappy/SnappyCompressor.java
##
@@ -48,30 +49,20 @@
   private long bytesRead = 0L;
   private long bytesWritten = 0L;
 
-  private static boolean nativeSnappyLoaded = false;
-  
-  static {
-if (NativeCodeLoader.isNativeCodeLoaded() &&
-NativeCodeLoader.buildSupportsSnappy()) {
-  try {
-initIDs();
-nativeSnappyLoaded = true;
-  } catch (Throwable t) {
-LOG.error("failed to load SnappyCompressor", t);
-  }
-}
-  }
-  
-  public static boolean isNativeCodeLoaded() {
-return nativeSnappyLoaded;
-  }
-  
   /**
* Creates a new compressor.
*
* @param directBufferSize size of the direct buffer to be used.
*/
   public SnappyCompressor(int directBufferSize) {
+// `snappy-java` is provided scope. We need to check if its availability.
+try {
+  SnappyLoader.getVersion();
+} catch (Throwable t) {
+  throw new RuntimeException("native snappy library not available: " +

Review comment:
   It is java-snappy jar, yeah, I will review the message.

##
File path: 
hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/io/compress/snappy/SnappyCompressor.java
##
@@ -48,30 +49,20 @@
   private long bytesRead = 0L;
   private long bytesWritten = 0L;
 
-  private static boolean nativeSnappyLoaded = false;
-  
-  static {
-if (NativeCodeLoader.isNativeCodeLoaded() &&
-NativeCodeLoader.buildSupportsSnappy()) {
-  try {
-initIDs();
-nativeSnappyLoaded = true;
-  } catch (Throwable t) {
-LOG.error("failed to load SnappyCompressor", t);
-  }
-}
-  }
-  
-  public static boolean isNativeCodeLoaded() {
-return nativeSnappyLoaded;
-  }
-  
   /**
* Creates a new compressor.
*
* @param directBufferSize size of the direct buffer to be used.
*/
   public SnappyCompressor(int directBufferSize) {
+// `snappy-java` is provided scope. We need to check if its availability.

Review comment:
   Oops, thanks. 

##
File path: 
hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/io/compress/snappy/SnappyCompressor.java
##
@@ -48,30 +49,20 @@
   private long bytesRead = 0L;
   private long bytesWritten = 0L;
 
-  private static boolean nativeSnappyLoaded = false;
-  
-  static {
-if (NativeCodeLoader.isNativeCodeLoaded() &&
-NativeCodeLoader.buildSupportsSnappy()) {
-  try {
-initIDs();
-nativeSnappyLoaded = true;
-  } catch (Throwable t) {
-LOG.error("failed to load SnappyCompressor", t);
-  }
-}
-  }
-  
-  public static boolean isNativeCodeLoaded() {
-return nativeSnappyLoaded;
-  }
-  
   /**
* Creates a new compressor.
*
* @param directBufferSize size of the

[jira] [Work logged] (HADOOP-17125) Using snappy-java in SnappyCodec

2020-09-25 Thread ASF GitHub Bot (Jira)



 [ 
https://issues.apache.org/jira/browse/HADOOP-17125?focusedWorklogId=490810=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-490810
 ]

ASF GitHub Bot logged work on HADOOP-17125:
---

Author: ASF GitHub Bot
Created on: 25/Sep/20 13:15
Start Date: 25/Sep/20 13:15
Worklog Time Spent: 10m 
  Work Description: saintstack commented on a change in pull request #2297:
URL: https://github.com/apache/hadoop/pull/2297#discussion_r494054389



##
File path: 
hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/io/compress/snappy/SnappyCompressor.java
##
@@ -48,30 +49,20 @@
   private long bytesRead = 0L;
   private long bytesWritten = 0L;
 
-  private static boolean nativeSnappyLoaded = false;
-  
-  static {
-if (NativeCodeLoader.isNativeCodeLoaded() &&
-NativeCodeLoader.buildSupportsSnappy()) {
-  try {
-initIDs();
-nativeSnappyLoaded = true;
-  } catch (Throwable t) {
-LOG.error("failed to load SnappyCompressor", t);
-  }
-}
-  }
-  
-  public static boolean isNativeCodeLoaded() {
-return nativeSnappyLoaded;
-  }
-  
   /**
* Creates a new compressor.
*
* @param directBufferSize size of the direct buffer to be used.
*/
   public SnappyCompressor(int directBufferSize) {
+// `snappy-java` is provided scope. We need to check if its availability.
+try {
+  SnappyLoader.getVersion();
+} catch (Throwable t) {
+  throw new RuntimeException("native snappy library not available: " +

Review comment:
   Is it the 'native snappy library' that is missing or the java-snappy jar?

##
File path: 
hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/io/compress/snappy/SnappyCompressor.java
##
@@ -48,30 +49,20 @@
   private long bytesRead = 0L;
   private long bytesWritten = 0L;
 
-  private static boolean nativeSnappyLoaded = false;
-  
-  static {
-if (NativeCodeLoader.isNativeCodeLoaded() &&
-NativeCodeLoader.buildSupportsSnappy()) {
-  try {
-initIDs();
-nativeSnappyLoaded = true;
-  } catch (Throwable t) {
-LOG.error("failed to load SnappyCompressor", t);
-  }
-}
-  }
-  
-  public static boolean isNativeCodeLoaded() {
-return nativeSnappyLoaded;
-  }
-  
   /**
* Creates a new compressor.
*
* @param directBufferSize size of the direct buffer to be used.
*/
   public SnappyCompressor(int directBufferSize) {
+// `snappy-java` is provided scope. We need to check if its availability.

Review comment:
   Fix this last sentence if you make a new PR

##
File path: 
hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/io/compress/snappy/TestSnappyCompressorDecompressor.java
##
@@ -446,4 +442,43 @@ public void doWork() throws Exception {
 
 ctx.waitFor(6);
   }
+
+  @Test
+  public void testSnappyCompatibility() throws Exception {
+// HADOOP-17125. Using snappy-java in SnappyCodec. These strings are raw 
data and compressed data
+// using previous native Snappy codec. We use updated Snappy codec to 
decode it and check if it
+// matches.
+String rawData = 
"010a06030a040a0c0109020c0a010204020d02000b010701080605080b090902060a080502060a0d06070908080a0c0105030904090d05090800040c090c0d0d0804000d00040b0b0d010d060907020a030a0c0900040905080107040d0c01060a0b09070a04000b01040b09000e0e00020b06050b060e030e0a07050d06050d";

Review comment:
   hmm... this is a little anemic. Have you considered adding a data file 
that is a little more interesting than this?

##
File path: 
hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/io/compress/snappy/SnappyDecompressor.java
##
@@ -276,10 +267,20 @@ public void end() {
 // do nothing
   }
 
-  private native static void initIDs();
+  private int decompressBytesDirect() throws IOException {

Review comment:
   ditto

##
File path: 
hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/io/compress/snappy/SnappyDecompressor.java
##
@@ -45,30 +46,20 @@
   private int userBufOff = 0, userBufLen = 0;
   private boolean finished;
 
-  private static boolean nativeSnappyLoaded = false;
-
-  static {
-if (NativeCodeLoader.isNativeCodeLoaded() &&
-NativeCodeLoader.buildSupportsSnappy()) {
-  try {
-initIDs();
-nativeSnappyLoaded = true;
-  } catch (Throwable t) {
-LOG.error("failed to load SnappyDecompressor", t);
-  }
-}
-  }
-  
-  public static boolean isNativeCodeLoaded() {
-return nativeSnappyLoaded;
-  }
-  
   /**
* Creates a new compressor.
*
* @param directBufferSize size of the direct buffer to be used.
*/
   public SnappyDecompressor(int directBufferSize) {
+// `snappy-java` is provided scope. We need to check if its availability.
+try {
+

[jira] [Work logged] (HADOOP-17125) Using snappy-java in SnappyCodec

2020-09-25 Thread ASF GitHub Bot (Jira)



 [ 
https://issues.apache.org/jira/browse/HADOOP-17125?focusedWorklogId=490786=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-490786
 ]

ASF GitHub Bot logged work on HADOOP-17125:
---

Author: ASF GitHub Bot
Created on: 25/Sep/20 13:12
Start Date: 25/Sep/20 13:12
Worklog Time Spent: 10m 
  Work Description: steveloughran commented on a change in pull request 
#2297:
URL: https://github.com/apache/hadoop/pull/2297#discussion_r494256866



##
File path: 
hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/io/compress/snappy/TestSnappyCompressorDecompressor.java
##
@@ -446,4 +442,43 @@ public void doWork() throws Exception {
 
 ctx.waitFor(6);
   }
+
+  @Test
+  public void testSnappyCompatibility() throws Exception {
+// HADOOP-17125. Using snappy-java in SnappyCodec. These strings are raw 
data and compressed data
+// using previous native Snappy codec. We use updated Snappy codec to 
decode it and check if it
+// matches.
+String rawData = 
"010a06030a040a0c0109020c0a010204020d02000b010701080605080b090902060a080502060a0d06070908080a0c0105030904090d05090800040c090c0d0d0804000d00040b0b0d010d060907020a030a0c0900040905080107040d0c01060a0b09070a04000b01040b09000e0e00020b06050b060e030e0a07050d06050d";

Review comment:
   should be split across lines, but otherwise fine inline -simpler for 
tests





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 490786)
Time Spent: 16h 50m  (was: 16h 40m)

> Using snappy-java in SnappyCodec
> 
>
> Key: HADOOP-17125
> URL: https://issues.apache.org/jira/browse/HADOOP-17125
> Project: Hadoop Common
>  Issue Type: New Feature
>  Components: common
>Affects Versions: 3.3.0
>Reporter: DB Tsai
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 16h 50m
>  Remaining Estimate: 0h
>
> In Hadoop, we use native libs for snappy codec which has several 
> disadvantages:
>  * It requires native *libhadoop* and *libsnappy* to be installed in system 
> *LD_LIBRARY_PATH*, and they have to be installed separately on each node of 
> the clusters, container images, or local test environments which adds huge 
> complexities from deployment point of view. In some environments, it requires 
> compiling the natives from sources which is non-trivial. Also, this approach 
> is platform dependent; the binary may not work in different platform, so it 
> requires recompilation.
>  * It requires extra configuration of *java.library.path* to load the 
> natives, and it results higher application deployment and maintenance cost 
> for users.
> Projects such as *Spark* and *Parquet* use 
> [snappy-java|[https://github.com/xerial/snappy-java]] which is JNI-based 
> implementation. It contains native binaries for Linux, Mac, and IBM in jar 
> file, and it can automatically load the native binaries into JVM from jar 
> without any setup. If a native implementation can not be found for a 
> platform, it can fallback to pure-java implementation of snappy based on 
> [aircompressor|[https://github.com/airlift/aircompressor/tree/master/src/main/java/io/airlift/compress/snappy]].



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org

[jira] [Work logged] (HADOOP-17125) Using snappy-java in SnappyCodec

2020-09-25 Thread ASF GitHub Bot (Jira)



 [ 
https://issues.apache.org/jira/browse/HADOOP-17125?focusedWorklogId=490732=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-490732
 ]

ASF GitHub Bot logged work on HADOOP-17125:
---

Author: ASF GitHub Bot
Created on: 25/Sep/20 13:07
Start Date: 25/Sep/20 13:07
Worklog Time Spent: 10m 
  Work Description: viirya commented on pull request #2297:
URL: https://github.com/apache/hadoop/pull/2297#issuecomment-698516626


   @saintstack @steveloughran Thanks for the reviews. I think I addressed the 
latest comments.



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 490732)
Time Spent: 16h 40m  (was: 16.5h)

> Using snappy-java in SnappyCodec
> 
>
> Key: HADOOP-17125
> URL: https://issues.apache.org/jira/browse/HADOOP-17125
> Project: Hadoop Common
>  Issue Type: New Feature
>  Components: common
>Affects Versions: 3.3.0
>Reporter: DB Tsai
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 16h 40m
>  Remaining Estimate: 0h
>
> In Hadoop, we use native libs for snappy codec which has several 
> disadvantages:
>  * It requires native *libhadoop* and *libsnappy* to be installed in system 
> *LD_LIBRARY_PATH*, and they have to be installed separately on each node of 
> the clusters, container images, or local test environments which adds huge 
> complexities from deployment point of view. In some environments, it requires 
> compiling the natives from sources which is non-trivial. Also, this approach 
> is platform dependent; the binary may not work in different platform, so it 
> requires recompilation.
>  * It requires extra configuration of *java.library.path* to load the 
> natives, and it results higher application deployment and maintenance cost 
> for users.
> Projects such as *Spark* and *Parquet* use 
> [snappy-java|[https://github.com/xerial/snappy-java]] which is JNI-based 
> implementation. It contains native binaries for Linux, Mac, and IBM in jar 
> file, and it can automatically load the native binaries into JVM from jar 
> without any setup. If a native implementation can not be found for a 
> platform, it can fallback to pure-java implementation of snappy based on 
> [aircompressor|[https://github.com/airlift/aircompressor/tree/master/src/main/java/io/airlift/compress/snappy]].



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org

[jira] [Work logged] (HADOOP-17125) Using snappy-java in SnappyCodec

2020-09-24 Thread ASF GitHub Bot (Jira)



 [ 
https://issues.apache.org/jira/browse/HADOOP-17125?focusedWorklogId=490387=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-490387
 ]

ASF GitHub Bot logged work on HADOOP-17125:
---

Author: ASF GitHub Bot
Created on: 24/Sep/20 19:10
Start Date: 24/Sep/20 19:10
Worklog Time Spent: 10m 
  Work Description: viirya commented on a change in pull request #2297:
URL: https://github.com/apache/hadoop/pull/2297#discussion_r494550809



##
File path: 
hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/io/compress/snappy/SnappyCompressor.java
##
@@ -291,9 +283,17 @@ public long getBytesWritten() {
   public void end() {
   }
 
-  private native static void initIDs();
-
-  private native int compressBytesDirect();
-
-  public native static String getLibraryName();
+  private int compressDirectBuf() throws IOException {
+if (uncompressedDirectBufLen == 0) {
+  return 0;
+} else {
+  // Set the position and limit of `uncompressedDirectBuf` for reading
+  uncompressedDirectBuf.limit(uncompressedDirectBufLen).position(0);
+  int size = Snappy.compress((ByteBuffer) uncompressedDirectBuf,
+  (ByteBuffer) compressedDirectBuf);
+  uncompressedDirectBufLen = 0;
+  uncompressedDirectBuf.limit(directBufferSize).position(0);

Review comment:
   done. thanks.





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 490387)
Time Spent: 16.5h  (was: 16h 20m)

> Using snappy-java in SnappyCodec
> 
>
> Key: HADOOP-17125
> URL: https://issues.apache.org/jira/browse/HADOOP-17125
> Project: Hadoop Common
>  Issue Type: New Feature
>  Components: common
>Affects Versions: 3.3.0
>Reporter: DB Tsai
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 16.5h
>  Remaining Estimate: 0h
>
> In Hadoop, we use native libs for snappy codec which has several 
> disadvantages:
>  * It requires native *libhadoop* and *libsnappy* to be installed in system 
> *LD_LIBRARY_PATH*, and they have to be installed separately on each node of 
> the clusters, container images, or local test environments which adds huge 
> complexities from deployment point of view. In some environments, it requires 
> compiling the natives from sources which is non-trivial. Also, this approach 
> is platform dependent; the binary may not work in different platform, so it 
> requires recompilation.
>  * It requires extra configuration of *java.library.path* to load the 
> natives, and it results higher application deployment and maintenance cost 
> for users.
> Projects such as *Spark* and *Parquet* use 
> [snappy-java|[https://github.com/xerial/snappy-java]] which is JNI-based 
> implementation. It contains native binaries for Linux, Mac, and IBM in jar 
> file, and it can automatically load the native binaries into JVM from jar 
> without any setup. If a native implementation can not be found for a 
> platform, it can fallback to pure-java implementation of snappy based on 
> [aircompressor|[https://github.com/airlift/aircompressor/tree/master/src/main/java/io/airlift/compress/snappy]].



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org

[jira] [Work logged] (HADOOP-17125) Using snappy-java in SnappyCodec

2020-09-24 Thread ASF GitHub Bot (Jira)



 [ 
https://issues.apache.org/jira/browse/HADOOP-17125?focusedWorklogId=490375=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-490375
 ]

ASF GitHub Bot logged work on HADOOP-17125:
---

Author: ASF GitHub Bot
Created on: 24/Sep/20 18:54
Start Date: 24/Sep/20 18:54
Worklog Time Spent: 10m 
  Work Description: viirya commented on a change in pull request #2297:
URL: https://github.com/apache/hadoop/pull/2297#discussion_r494542249



##
File path: 
hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/io/compress/CompressDecompressTester.java
##
@@ -432,7 +412,11 @@ public void assertCompression(String name, Compressor 
compressor,
   joiner.join(name, "byte arrays not equals error !!!"),
   originalRawData, decompressOut.toByteArray());
 } catch (Exception ex) {
-  fail(joiner.join(name, ex.getMessage()));
+  if (ex.getMessage() != null) {
+fail(joiner.join(name, ex.getMessage()));
+  } else {
+fail(joiner.join(name, ExceptionUtils.getStackTrace(ex)));

Review comment:
   When I first took over this change, the test failed with NPE without any 
details. It is because the exception thrown returns null from `getMessage()`. 
`joiner.join(name, null)` causes the NPE, so I changed it to print stack trace 
once `getMessage()` returns null. It's better for debugging.





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 490375)
Time Spent: 16h 20m  (was: 16h 10m)

> Using snappy-java in SnappyCodec
> 
>
> Key: HADOOP-17125
> URL: https://issues.apache.org/jira/browse/HADOOP-17125
> Project: Hadoop Common
>  Issue Type: New Feature
>  Components: common
>Affects Versions: 3.3.0
>Reporter: DB Tsai
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 16h 20m
>  Remaining Estimate: 0h
>
> In Hadoop, we use native libs for snappy codec which has several 
> disadvantages:
>  * It requires native *libhadoop* and *libsnappy* to be installed in system 
> *LD_LIBRARY_PATH*, and they have to be installed separately on each node of 
> the clusters, container images, or local test environments which adds huge 
> complexities from deployment point of view. In some environments, it requires 
> compiling the natives from sources which is non-trivial. Also, this approach 
> is platform dependent; the binary may not work in different platform, so it 
> requires recompilation.
>  * It requires extra configuration of *java.library.path* to load the 
> natives, and it results higher application deployment and maintenance cost 
> for users.
> Projects such as *Spark* and *Parquet* use 
> [snappy-java|[https://github.com/xerial/snappy-java]] which is JNI-based 
> implementation. It contains native binaries for Linux, Mac, and IBM in jar 
> file, and it can automatically load the native binaries into JVM from jar 
> without any setup. If a native implementation can not be found for a 
> platform, it can fallback to pure-java implementation of snappy based on 
> [aircompressor|[https://github.com/airlift/aircompressor/tree/master/src/main/java/io/airlift/compress/snappy]].



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org

[jira] [Work logged] (HADOOP-17125) Using snappy-java in SnappyCodec

2020-09-24 Thread ASF GitHub Bot (Jira)



 [ 
https://issues.apache.org/jira/browse/HADOOP-17125?focusedWorklogId=490372=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-490372
 ]

ASF GitHub Bot logged work on HADOOP-17125:
---

Author: ASF GitHub Bot
Created on: 24/Sep/20 18:48
Start Date: 24/Sep/20 18:48
Worklog Time Spent: 10m 
  Work Description: dbtsai commented on a change in pull request #2297:
URL: https://github.com/apache/hadoop/pull/2297#discussion_r494539014



##
File path: 
hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/io/compress/CompressDecompressTester.java
##
@@ -432,7 +412,11 @@ public void assertCompression(String name, Compressor 
compressor,
   joiner.join(name, "byte arrays not equals error !!!"),
   originalRawData, decompressOut.toByteArray());
 } catch (Exception ex) {
-  fail(joiner.join(name, ex.getMessage()));
+  if (ex.getMessage() != null) {
+fail(joiner.join(name, ex.getMessage()));
+  } else {
+fail(joiner.join(name, ExceptionUtils.getStackTrace(ex)));

Review comment:
   Why is this change needed?





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 490372)
Time Spent: 16h 10m  (was: 16h)

> Using snappy-java in SnappyCodec
> 
>
> Key: HADOOP-17125
> URL: https://issues.apache.org/jira/browse/HADOOP-17125
> Project: Hadoop Common
>  Issue Type: New Feature
>  Components: common
>Affects Versions: 3.3.0
>Reporter: DB Tsai
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 16h 10m
>  Remaining Estimate: 0h
>
> In Hadoop, we use native libs for snappy codec which has several 
> disadvantages:
>  * It requires native *libhadoop* and *libsnappy* to be installed in system 
> *LD_LIBRARY_PATH*, and they have to be installed separately on each node of 
> the clusters, container images, or local test environments which adds huge 
> complexities from deployment point of view. In some environments, it requires 
> compiling the natives from sources which is non-trivial. Also, this approach 
> is platform dependent; the binary may not work in different platform, so it 
> requires recompilation.
>  * It requires extra configuration of *java.library.path* to load the 
> natives, and it results higher application deployment and maintenance cost 
> for users.
> Projects such as *Spark* and *Parquet* use 
> [snappy-java|[https://github.com/xerial/snappy-java]] which is JNI-based 
> implementation. It contains native binaries for Linux, Mac, and IBM in jar 
> file, and it can automatically load the native binaries into JVM from jar 
> without any setup. If a native implementation can not be found for a 
> platform, it can fallback to pure-java implementation of snappy based on 
> [aircompressor|[https://github.com/airlift/aircompressor/tree/master/src/main/java/io/airlift/compress/snappy]].



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org

[jira] [Work logged] (HADOOP-17125) Using snappy-java in SnappyCodec

2020-09-24 Thread ASF GitHub Bot (Jira)



 [ 
https://issues.apache.org/jira/browse/HADOOP-17125?focusedWorklogId=490370=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-490370
 ]

ASF GitHub Bot logged work on HADOOP-17125:
---

Author: ASF GitHub Bot
Created on: 24/Sep/20 18:46
Start Date: 24/Sep/20 18:46
Worklog Time Spent: 10m 
  Work Description: dbtsai commented on a change in pull request #2297:
URL: https://github.com/apache/hadoop/pull/2297#discussion_r494537597



##
File path: 
hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/io/compress/snappy/SnappyCompressor.java
##
@@ -291,9 +283,17 @@ public long getBytesWritten() {
   public void end() {
   }
 
-  private native static void initIDs();
-
-  private native int compressBytesDirect();
-
-  public native static String getLibraryName();
+  private int compressDirectBuf() throws IOException {
+if (uncompressedDirectBufLen == 0) {
+  return 0;
+} else {
+  // Set the position and limit of `uncompressedDirectBuf` for reading
+  uncompressedDirectBuf.limit(uncompressedDirectBufLen).position(0);
+  int size = Snappy.compress((ByteBuffer) uncompressedDirectBuf,
+  (ByteBuffer) compressedDirectBuf);
+  uncompressedDirectBufLen = 0;
+  uncompressedDirectBuf.limit(directBufferSize).position(0);

Review comment:
   nit, 
`uncompressedDirectBuf.limit(uncompressedDirectBuf.capacity()).position(0);` 
for safety.
   





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 490370)
Time Spent: 16h  (was: 15h 50m)

> Using snappy-java in SnappyCodec
> 
>
> Key: HADOOP-17125
> URL: https://issues.apache.org/jira/browse/HADOOP-17125
> Project: Hadoop Common
>  Issue Type: New Feature
>  Components: common
>Affects Versions: 3.3.0
>Reporter: DB Tsai
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 16h
>  Remaining Estimate: 0h
>
> In Hadoop, we use native libs for snappy codec which has several 
> disadvantages:
>  * It requires native *libhadoop* and *libsnappy* to be installed in system 
> *LD_LIBRARY_PATH*, and they have to be installed separately on each node of 
> the clusters, container images, or local test environments which adds huge 
> complexities from deployment point of view. In some environments, it requires 
> compiling the natives from sources which is non-trivial. Also, this approach 
> is platform dependent; the binary may not work in different platform, so it 
> requires recompilation.
>  * It requires extra configuration of *java.library.path* to load the 
> natives, and it results higher application deployment and maintenance cost 
> for users.
> Projects such as *Spark* and *Parquet* use 
> [snappy-java|[https://github.com/xerial/snappy-java]] which is JNI-based 
> implementation. It contains native binaries for Linux, Mac, and IBM in jar 
> file, and it can automatically load the native binaries into JVM from jar 
> without any setup. If a native implementation can not be found for a 
> platform, it can fallback to pure-java implementation of snappy based on 
> [aircompressor|[https://github.com/airlift/aircompressor/tree/master/src/main/java/io/airlift/compress/snappy]].



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org

[jira] [Work logged] (HADOOP-17125) Using snappy-java in SnappyCodec

2020-09-24 Thread ASF GitHub Bot (Jira)



 [ 
https://issues.apache.org/jira/browse/HADOOP-17125?focusedWorklogId=490362=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-490362
 ]

ASF GitHub Bot logged work on HADOOP-17125:
---

Author: ASF GitHub Bot
Created on: 24/Sep/20 18:36
Start Date: 24/Sep/20 18:36
Worklog Time Spent: 10m 
  Work Description: viirya commented on pull request #2297:
URL: https://github.com/apache/hadoop/pull/2297#issuecomment-698516626


   @saintstack @steveloughran Thanks for the reviews. I think I addressed the 
latest comments.



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 490362)
Time Spent: 15h 40m  (was: 15.5h)

> Using snappy-java in SnappyCodec
> 
>
> Key: HADOOP-17125
> URL: https://issues.apache.org/jira/browse/HADOOP-17125
> Project: Hadoop Common
>  Issue Type: New Feature
>  Components: common
>Affects Versions: 3.3.0
>Reporter: DB Tsai
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 15h 40m
>  Remaining Estimate: 0h
>
> In Hadoop, we use native libs for snappy codec which has several 
> disadvantages:
>  * It requires native *libhadoop* and *libsnappy* to be installed in system 
> *LD_LIBRARY_PATH*, and they have to be installed separately on each node of 
> the clusters, container images, or local test environments which adds huge 
> complexities from deployment point of view. In some environments, it requires 
> compiling the natives from sources which is non-trivial. Also, this approach 
> is platform dependent; the binary may not work in different platform, so it 
> requires recompilation.
>  * It requires extra configuration of *java.library.path* to load the 
> natives, and it results higher application deployment and maintenance cost 
> for users.
> Projects such as *Spark* and *Parquet* use 
> [snappy-java|[https://github.com/xerial/snappy-java]] which is JNI-based 
> implementation. It contains native binaries for Linux, Mac, and IBM in jar 
> file, and it can automatically load the native binaries into JVM from jar 
> without any setup. If a native implementation can not be found for a 
> platform, it can fallback to pure-java implementation of snappy based on 
> [aircompressor|[https://github.com/airlift/aircompressor/tree/master/src/main/java/io/airlift/compress/snappy]].



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org

[jira] [Work logged] (HADOOP-17125) Using snappy-java in SnappyCodec

2020-09-24 Thread ASF GitHub Bot (Jira)



 [ 
https://issues.apache.org/jira/browse/HADOOP-17125?focusedWorklogId=490363=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-490363
 ]

ASF GitHub Bot logged work on HADOOP-17125:
---

Author: ASF GitHub Bot
Created on: 24/Sep/20 18:36
Start Date: 24/Sep/20 18:36
Worklog Time Spent: 10m 
  Work Description: viirya commented on a change in pull request #2297:
URL: https://github.com/apache/hadoop/pull/2297#discussion_r494532226



##
File path: 
hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/io/compress/snappy/TestSnappyCompressorDecompressor.java
##
@@ -446,4 +442,43 @@ public void doWork() throws Exception {
 
 ctx.waitFor(6);
   }
+
+  @Test
+  public void testSnappyCompatibility() throws Exception {
+// HADOOP-17125. Using snappy-java in SnappyCodec. These strings are raw 
data and compressed data
+// using previous native Snappy codec. We use updated Snappy codec to 
decode it and check if it
+// matches.
+String rawData = 
"010a06030a040a0c0109020c0a010204020d02000b010701080605080b090902060a080502060a0d06070908080a0c0105030904090d05090800040c090c0d0d0804000d00040b0b0d010d060907020a030a0c0900040905080107040d0c01060a0b09070a04000b01040b09000e0e00020b06050b060e030e0a07050d06050d";

Review comment:
   Ok, I split the long string. Thanks.





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 490363)
Time Spent: 15h 50m  (was: 15h 40m)

> Using snappy-java in SnappyCodec
> 
>
> Key: HADOOP-17125
> URL: https://issues.apache.org/jira/browse/HADOOP-17125
> Project: Hadoop Common
>  Issue Type: New Feature
>  Components: common
>Affects Versions: 3.3.0
>Reporter: DB Tsai
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 15h 50m
>  Remaining Estimate: 0h
>
> In Hadoop, we use native libs for snappy codec which has several 
> disadvantages:
>  * It requires native *libhadoop* and *libsnappy* to be installed in system 
> *LD_LIBRARY_PATH*, and they have to be installed separately on each node of 
> the clusters, container images, or local test environments which adds huge 
> complexities from deployment point of view. In some environments, it requires 
> compiling the natives from sources which is non-trivial. Also, this approach 
> is platform dependent; the binary may not work in different platform, so it 
> requires recompilation.
>  * It requires extra configuration of *java.library.path* to load the 
> natives, and it results higher application deployment and maintenance cost 
> for users.
> Projects such as *Spark* and *Parquet* use 
> [snappy-java|[https://github.com/xerial/snappy-java]] which is JNI-based 
> implementation. It contains native binaries for Linux, Mac, and IBM in jar 
> file, and it can automatically load the native binaries into JVM from jar 
> without any setup. If a native implementation can not be found for a 
> platform, it can fallback to pure-java implementation of snappy based on 
> [aircompressor|[https://github.com/airlift/aircompressor/tree/master/src/main/java/io/airlift/compress/snappy]].



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org

[jira] [Work logged] (HADOOP-17125) Using snappy-java in SnappyCodec

2020-09-24 Thread ASF GitHub Bot (Jira)



 [ 
https://issues.apache.org/jira/browse/HADOOP-17125?focusedWorklogId=490289=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-490289
 ]

ASF GitHub Bot logged work on HADOOP-17125:
---

Author: ASF GitHub Bot
Created on: 24/Sep/20 15:44
Start Date: 24/Sep/20 15:44
Worklog Time Spent: 10m 
  Work Description: viirya commented on a change in pull request #2297:
URL: https://github.com/apache/hadoop/pull/2297#discussion_r494070442



##
File path: 
hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/io/compress/snappy/SnappyCompressor.java
##
@@ -48,30 +49,20 @@
   private long bytesRead = 0L;
   private long bytesWritten = 0L;
 
-  private static boolean nativeSnappyLoaded = false;
-  
-  static {
-if (NativeCodeLoader.isNativeCodeLoaded() &&
-NativeCodeLoader.buildSupportsSnappy()) {
-  try {
-initIDs();
-nativeSnappyLoaded = true;
-  } catch (Throwable t) {
-LOG.error("failed to load SnappyCompressor", t);
-  }
-}
-  }
-  
-  public static boolean isNativeCodeLoaded() {
-return nativeSnappyLoaded;
-  }
-  
   /**
* Creates a new compressor.
*
* @param directBufferSize size of the direct buffer to be used.
*/
   public SnappyCompressor(int directBufferSize) {
+// `snappy-java` is provided scope. We need to check if its availability.
+try {
+  SnappyLoader.getVersion();
+} catch (Throwable t) {
+  throw new RuntimeException("native snappy library not available: " +

Review comment:
   It is java-snappy jar, yeah, I will revise the message.





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 490289)
Time Spent: 15.5h  (was: 15h 20m)

> Using snappy-java in SnappyCodec
> 
>
> Key: HADOOP-17125
> URL: https://issues.apache.org/jira/browse/HADOOP-17125
> Project: Hadoop Common
>  Issue Type: New Feature
>  Components: common
>Affects Versions: 3.3.0
>Reporter: DB Tsai
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 15.5h
>  Remaining Estimate: 0h
>
> In Hadoop, we use native libs for snappy codec which has several 
> disadvantages:
>  * It requires native *libhadoop* and *libsnappy* to be installed in system 
> *LD_LIBRARY_PATH*, and they have to be installed separately on each node of 
> the clusters, container images, or local test environments which adds huge 
> complexities from deployment point of view. In some environments, it requires 
> compiling the natives from sources which is non-trivial. Also, this approach 
> is platform dependent; the binary may not work in different platform, so it 
> requires recompilation.
>  * It requires extra configuration of *java.library.path* to load the 
> natives, and it results higher application deployment and maintenance cost 
> for users.
> Projects such as *Spark* and *Parquet* use 
> [snappy-java|[https://github.com/xerial/snappy-java]] which is JNI-based 
> implementation. It contains native binaries for Linux, Mac, and IBM in jar 
> file, and it can automatically load the native binaries into JVM from jar 
> without any setup. If a native implementation can not be found for a 
> platform, it can fallback to pure-java implementation of snappy based on 
> [aircompressor|[https://github.com/airlift/aircompressor/tree/master/src/main/java/io/airlift/compress/snappy]].



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org

[jira] [Work logged] (HADOOP-17125) Using snappy-java in SnappyCodec

2020-09-24 Thread ASF GitHub Bot (Jira)



 [ 
https://issues.apache.org/jira/browse/HADOOP-17125?focusedWorklogId=490285=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-490285
 ]

ASF GitHub Bot logged work on HADOOP-17125:
---

Author: ASF GitHub Bot
Created on: 24/Sep/20 15:37
Start Date: 24/Sep/20 15:37
Worklog Time Spent: 10m 
  Work Description: saintstack commented on pull request #2297:
URL: https://github.com/apache/hadoop/pull/2297#issuecomment-698423282


   @steveloughran if snappy native was a dependency for hbase-common, do you 
think that argues snappy-java could be (having it as 'provided' undoes the main 
benefit of using snappy-java jar instead of the native snappy libs). Thanks 
Steve.



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 490285)
Time Spent: 15h 20m  (was: 15h 10m)

> Using snappy-java in SnappyCodec
> 
>
> Key: HADOOP-17125
> URL: https://issues.apache.org/jira/browse/HADOOP-17125
> Project: Hadoop Common
>  Issue Type: New Feature
>  Components: common
>Affects Versions: 3.3.0
>Reporter: DB Tsai
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 15h 20m
>  Remaining Estimate: 0h
>
> In Hadoop, we use native libs for snappy codec which has several 
> disadvantages:
>  * It requires native *libhadoop* and *libsnappy* to be installed in system 
> *LD_LIBRARY_PATH*, and they have to be installed separately on each node of 
> the clusters, container images, or local test environments which adds huge 
> complexities from deployment point of view. In some environments, it requires 
> compiling the natives from sources which is non-trivial. Also, this approach 
> is platform dependent; the binary may not work in different platform, so it 
> requires recompilation.
>  * It requires extra configuration of *java.library.path* to load the 
> natives, and it results higher application deployment and maintenance cost 
> for users.
> Projects such as *Spark* and *Parquet* use 
> [snappy-java|[https://github.com/xerial/snappy-java]] which is JNI-based 
> implementation. It contains native binaries for Linux, Mac, and IBM in jar 
> file, and it can automatically load the native binaries into JVM from jar 
> without any setup. If a native implementation can not be found for a 
> platform, it can fallback to pure-java implementation of snappy based on 
> [aircompressor|[https://github.com/airlift/aircompressor/tree/master/src/main/java/io/airlift/compress/snappy]].



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org

[jira] [Work logged] (HADOOP-17125) Using snappy-java in SnappyCodec

2020-09-24 Thread ASF GitHub Bot (Jira)



 [ 
https://issues.apache.org/jira/browse/HADOOP-17125?focusedWorklogId=490163=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-490163
 ]

ASF GitHub Bot logged work on HADOOP-17125:
---

Author: ASF GitHub Bot
Created on: 24/Sep/20 12:00
Start Date: 24/Sep/20 12:00
Worklog Time Spent: 10m 
  Work Description: steveloughran commented on pull request #2297:
URL: https://github.com/apache/hadoop/pull/2297#issuecomment-698298972


   So actually snappy was already a dependency of hadoop-common? Interesting



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 490163)
Time Spent: 15h 10m  (was: 15h)

> Using snappy-java in SnappyCodec
> 
>
> Key: HADOOP-17125
> URL: https://issues.apache.org/jira/browse/HADOOP-17125
> Project: Hadoop Common
>  Issue Type: New Feature
>  Components: common
>Affects Versions: 3.3.0
>Reporter: DB Tsai
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 15h 10m
>  Remaining Estimate: 0h
>
> In Hadoop, we use native libs for snappy codec which has several 
> disadvantages:
>  * It requires native *libhadoop* and *libsnappy* to be installed in system 
> *LD_LIBRARY_PATH*, and they have to be installed separately on each node of 
> the clusters, container images, or local test environments which adds huge 
> complexities from deployment point of view. In some environments, it requires 
> compiling the natives from sources which is non-trivial. Also, this approach 
> is platform dependent; the binary may not work in different platform, so it 
> requires recompilation.
>  * It requires extra configuration of *java.library.path* to load the 
> natives, and it results higher application deployment and maintenance cost 
> for users.
> Projects such as *Spark* and *Parquet* use 
> [snappy-java|[https://github.com/xerial/snappy-java]] which is JNI-based 
> implementation. It contains native binaries for Linux, Mac, and IBM in jar 
> file, and it can automatically load the native binaries into JVM from jar 
> without any setup. If a native implementation can not be found for a 
> platform, it can fallback to pure-java implementation of snappy based on 
> [aircompressor|[https://github.com/airlift/aircompressor/tree/master/src/main/java/io/airlift/compress/snappy]].



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org

[jira] [Work logged] (HADOOP-17125) Using snappy-java in SnappyCodec

2020-09-24 Thread ASF GitHub Bot (Jira)



 [ 
https://issues.apache.org/jira/browse/HADOOP-17125?focusedWorklogId=490160=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-490160
 ]

ASF GitHub Bot logged work on HADOOP-17125:
---

Author: ASF GitHub Bot
Created on: 24/Sep/20 11:59
Start Date: 24/Sep/20 11:59
Worklog Time Spent: 10m 
  Work Description: hadoop-yetus removed a comment on pull request #2297:
URL: https://github.com/apache/hadoop/pull/2297#issuecomment-693868067


   :broken_heart: **-1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime | Comment |
   |::|--:|:|:|
   | +0 :ok: |  reexec  |  29m 41s |  Docker mode activated.  |
   ||| _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  0s |  No case conflicting files 
found.  |
   | +1 :green_heart: |  @author  |   0m  0s |  The patch does not contain any 
@author tags.  |
   | +1 :green_heart: |  test4tests  |   0m  0s |  The patch appears to include 
5 new or modified test files.  |
   ||| _ trunk Compile Tests _ |
   | +0 :ok: |  mvndep  |   3m 20s |  Maven dependency ordering for branch  |
   | +1 :green_heart: |  mvninstall  |  27m  3s |  trunk passed  |
   | +1 :green_heart: |  compile  |  21m 43s |  trunk passed with JDK 
Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1  |
   | +1 :green_heart: |  compile  |  18m 24s |  trunk passed with JDK Private 
Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01  |
   | +1 :green_heart: |  checkstyle  |   2m 50s |  trunk passed  |
   | +1 :green_heart: |  mvnsite  |   2m 41s |  trunk passed  |
   | +1 :green_heart: |  shadedclient  |  20m 17s |  branch has no errors when 
building and testing our client artifacts.  |
   | +1 :green_heart: |  javadoc  |   1m 40s |  trunk passed with JDK 
Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1  |
   | +1 :green_heart: |  javadoc  |   2m 37s |  trunk passed with JDK Private 
Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01  |
   | +0 :ok: |  spotbugs  |   0m 29s |  Used deprecated FindBugs config; 
considering switching to SpotBugs.  |
   | +0 :ok: |  findbugs  |   0m 40s |  branch/hadoop-project no findbugs 
output file (findbugsXml.xml)  |
   | +0 :ok: |  findbugs  |   0m 29s |  branch/hadoop-project-dist no findbugs 
output file (findbugsXml.xml)  |
   ||| _ Patch Compile Tests _ |
   | +0 :ok: |  mvndep  |   0m 32s |  Maven dependency ordering for patch  |
   | +1 :green_heart: |  mvninstall  |   1m 17s |  the patch passed  |
   | +1 :green_heart: |  compile  |  20m 57s |  the patch passed with JDK 
Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1  |
   | -1 :x: |  cc  |  20m 57s |  
root-jdkUbuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1 with JDK 
Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1 generated 18 new + 145 unchanged - 
18 fixed = 163 total (was 163)  |
   | +1 :green_heart: |  golang  |  20m 57s |  the patch passed  |
   | +1 :green_heart: |  javac  |  20m 57s |  the patch passed  |
   | +1 :green_heart: |  compile  |  18m 29s |  the patch passed with JDK 
Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01  |
   | -1 :x: |  cc  |  18m 29s |  
root-jdkPrivateBuild-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 with JDK Private 
Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 generated 40 new + 123 unchanged - 
40 fixed = 163 total (was 163)  |
   | +1 :green_heart: |  golang  |  18m 29s |  the patch passed  |
   | +1 :green_heart: |  javac  |  18m 29s |  the patch passed  |
   | -0 :warning: |  checkstyle  |   2m 54s |  root: The patch generated 6 new 
+ 151 unchanged - 5 fixed = 157 total (was 156)  |
   | +1 :green_heart: |  mvnsite  |   2m 39s |  the patch passed  |
   | +1 :green_heart: |  whitespace  |   0m  0s |  The patch has no whitespace 
issues.  |
   | +1 :green_heart: |  xml  |   0m  4s |  The patch has no ill-formed XML 
file.  |
   | +1 :green_heart: |  shadedclient  |  14m  9s |  patch has no errors when 
building and testing our client artifacts.  |
   | +1 :green_heart: |  javadoc  |   1m 37s |  the patch passed with JDK 
Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1  |
   | +1 :green_heart: |  javadoc  |   2m 39s |  the patch passed with JDK 
Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01  |
   | +0 :ok: |  findbugs  |   0m 29s |  hadoop-project has no data from 
findbugs  |
   | +0 :ok: |  findbugs  |   0m 34s |  hadoop-project-dist has no data from 
findbugs  |
   ||| _ Other Tests _ |
   | +1 :green_heart: |  unit  |   0m 30s |  hadoop-project in the patch 
passed.  |
   | +1 :green_heart: |  unit  |   0m 30s |  hadoop-project-dist in the patch 
passed.  |
   | -1 :x: |  unit  |  10m 18s |  hadoop-common in the patch passed.  |
   | +1 :green_heart: |  asflicense  |   0m 57s |  The patch does not generate 
ASF License warnings.  |
   |  |   | 214m 44s |   |
   
   
   | Reason | Tests |
   |---:|:--|
   | Failed junit tests | hadoop.ha.TestZKFailoverController |
   
   
   | Subsystem | Report/Notes |

[jira] [Work logged] (HADOOP-17125) Using snappy-java in SnappyCodec

2020-09-24 Thread ASF GitHub Bot (Jira)



 [ 
https://issues.apache.org/jira/browse/HADOOP-17125?focusedWorklogId=490162=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-490162
 ]

ASF GitHub Bot logged work on HADOOP-17125:
---

Author: ASF GitHub Bot
Created on: 24/Sep/20 12:00
Start Date: 24/Sep/20 12:00
Worklog Time Spent: 10m 
  Work Description: steveloughran commented on a change in pull request 
#2297:
URL: https://github.com/apache/hadoop/pull/2297#discussion_r494256866



##
File path: 
hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/io/compress/snappy/TestSnappyCompressorDecompressor.java
##
@@ -446,4 +442,43 @@ public void doWork() throws Exception {
 
 ctx.waitFor(6);
   }
+
+  @Test
+  public void testSnappyCompatibility() throws Exception {
+// HADOOP-17125. Using snappy-java in SnappyCodec. These strings are raw 
data and compressed data
+// using previous native Snappy codec. We use updated Snappy codec to 
decode it and check if it
+// matches.
+String rawData = 
"010a06030a040a0c0109020c0a010204020d02000b010701080605080b090902060a080502060a0d06070908080a0c0105030904090d05090800040c090c0d0d0804000d00040b0b0d010d060907020a030a0c0900040905080107040d0c01060a0b09070a04000b01040b09000e0e00020b06050b060e030e0a07050d06050d";

Review comment:
   should be split across lines, but otherwise fine inline -simpler for 
tests





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 490162)
Time Spent: 15h  (was: 14h 50m)

> Using snappy-java in SnappyCodec
> 
>
> Key: HADOOP-17125
> URL: https://issues.apache.org/jira/browse/HADOOP-17125
> Project: Hadoop Common
>  Issue Type: New Feature
>  Components: common
>Affects Versions: 3.3.0
>Reporter: DB Tsai
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 15h
>  Remaining Estimate: 0h
>
> In Hadoop, we use native libs for snappy codec which has several 
> disadvantages:
>  * It requires native *libhadoop* and *libsnappy* to be installed in system 
> *LD_LIBRARY_PATH*, and they have to be installed separately on each node of 
> the clusters, container images, or local test environments which adds huge 
> complexities from deployment point of view. In some environments, it requires 
> compiling the natives from sources which is non-trivial. Also, this approach 
> is platform dependent; the binary may not work in different platform, so it 
> requires recompilation.
>  * It requires extra configuration of *java.library.path* to load the 
> natives, and it results higher application deployment and maintenance cost 
> for users.
> Projects such as *Spark* and *Parquet* use 
> [snappy-java|[https://github.com/xerial/snappy-java]] which is JNI-based 
> implementation. It contains native binaries for Linux, Mac, and IBM in jar 
> file, and it can automatically load the native binaries into JVM from jar 
> without any setup. If a native implementation can not be found for a 
> platform, it can fallback to pure-java implementation of snappy based on 
> [aircompressor|[https://github.com/airlift/aircompressor/tree/master/src/main/java/io/airlift/compress/snappy]].



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org

[jira] [Work logged] (HADOOP-17125) Using snappy-java in SnappyCodec

2020-09-24 Thread ASF GitHub Bot (Jira)



 [ 
https://issues.apache.org/jira/browse/HADOOP-17125?focusedWorklogId=490157=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-490157
 ]

ASF GitHub Bot logged work on HADOOP-17125:
---

Author: ASF GitHub Bot
Created on: 24/Sep/20 11:57
Start Date: 24/Sep/20 11:57
Worklog Time Spent: 10m 
  Work Description: hadoop-yetus removed a comment on pull request #2297:
URL: https://github.com/apache/hadoop/pull/2297#issuecomment-690765839


   :broken_heart: **-1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime | Comment |
   |::|--:|:|:|
   | +0 :ok: |  reexec  |   0m 41s |  Docker mode activated.  |
   ||| _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  0s |  No case conflicting files 
found.  |
   | +1 :green_heart: |  @author  |   0m  0s |  The patch does not contain any 
@author tags.  |
   | +1 :green_heart: |  test4tests  |   0m  0s |  The patch appears to include 
5 new or modified test files.  |
   ||| _ trunk Compile Tests _ |
   | +0 :ok: |  mvndep  |   3m 34s |  Maven dependency ordering for branch  |
   | +1 :green_heart: |  mvninstall  |  33m  4s |  trunk passed  |
   | +1 :green_heart: |  compile  |  25m 59s |  trunk passed with JDK 
Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1  |
   | +1 :green_heart: |  compile  |  21m 36s |  trunk passed with JDK Private 
Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01  |
   | +1 :green_heart: |  checkstyle  |   3m 14s |  trunk passed  |
   | +1 :green_heart: |  mvnsite  |   2m 10s |  trunk passed  |
   | +1 :green_heart: |  shadedclient  |  23m 22s |  branch has no errors when 
building and testing our client artifacts.  |
   | +1 :green_heart: |  javadoc  |   1m 14s |  trunk passed with JDK 
Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1  |
   | +1 :green_heart: |  javadoc  |   2m  9s |  trunk passed with JDK Private 
Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01  |
   | +0 :ok: |  spotbugs  |   2m 16s |  Used deprecated FindBugs config; 
considering switching to SpotBugs.  |
   | +0 :ok: |  findbugs  |   0m 38s |  branch/hadoop-project no findbugs 
output file (findbugsXml.xml)  |
   ||| _ Patch Compile Tests _ |
   | +0 :ok: |  mvndep  |   0m 29s |  Maven dependency ordering for patch  |
   | +1 :green_heart: |  mvninstall  |   1m  6s |  the patch passed  |
   | -1 :x: |  compile  |   1m  6s |  root in the patch failed with JDK 
Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1.  |
   | -1 :x: |  javac  |   1m  6s |  root in the patch failed with JDK 
Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1.  |
   | -1 :x: |  compile  |   1m  0s |  root in the patch failed with JDK Private 
Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01.  |
   | -1 :x: |  javac  |   1m  0s |  root in the patch failed with JDK Private 
Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01.  |
   | -0 :warning: |  checkstyle  |   2m 20s |  root: The patch generated 1 new 
+ 151 unchanged - 5 fixed = 152 total (was 156)  |
   | +1 :green_heart: |  mvnsite  |   1m 24s |  the patch passed  |
   | +1 :green_heart: |  whitespace  |   0m  0s |  The patch has no whitespace 
issues.  |
   | +1 :green_heart: |  xml  |   0m  3s |  The patch has no ill-formed XML 
file.  |
   | +1 :green_heart: |  shadedclient  |  14m 20s |  patch has no errors when 
building and testing our client artifacts.  |
   | +1 :green_heart: |  javadoc  |   0m 41s |  the patch passed with JDK 
Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1  |
   | +1 :green_heart: |  javadoc  |   1m 35s |  the patch passed with JDK 
Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01  |
   | +0 :ok: |  findbugs  |   0m 19s |  hadoop-project has no data from 
findbugs  |
   ||| _ Other Tests _ |
   | +1 :green_heart: |  unit  |   0m 18s |  hadoop-project in the patch 
passed.  |
   | -1 :x: |  unit  |   0m 42s |  hadoop-common in the patch failed.  |
   | +1 :green_heart: |  asflicense  |   0m 33s |  The patch does not generate 
ASF License warnings.  |
   |  |   | 146m 43s |   |
   
   
   | Subsystem | Report/Notes |
   |--:|:-|
   | Docker | ClientAPI=1.40 ServerAPI=1.40 base: 
https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2297/1/artifact/out/Dockerfile
 |
   | GITHUB PR | https://github.com/apache/hadoop/pull/2297 |
   | Optional Tests | dupname asflicense compile javac javadoc mvninstall 
mvnsite unit shadedclient xml findbugs checkstyle |
   | uname | Linux 205df60c0f1e 4.15.0-58-generic #64-Ubuntu SMP Tue Aug 6 
11:12:41 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | dev-support/bin/hadoop.sh |
   | git revision | trunk / 9960c01a25c |
   | Default Java | Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 |
   | Multi-JDK versions | 
/usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1 
/usr/lib/jvm/java-8-openjdk-amd64:Private 
Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 |
   |

[jira] [Work logged] (HADOOP-17125) Using snappy-java in SnappyCodec

2020-09-24 Thread ASF GitHub Bot (Jira)



 [ 
https://issues.apache.org/jira/browse/HADOOP-17125?focusedWorklogId=490158=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-490158
 ]

ASF GitHub Bot logged work on HADOOP-17125:
---

Author: ASF GitHub Bot
Created on: 24/Sep/20 11:57
Start Date: 24/Sep/20 11:57
Worklog Time Spent: 10m 
  Work Description: hadoop-yetus removed a comment on pull request #2297:
URL: https://github.com/apache/hadoop/pull/2297#issuecomment-690812236


   :broken_heart: **-1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime | Comment |
   |::|--:|:|:|
   | +0 :ok: |  reexec  |   0m 29s |  Docker mode activated.  |
   ||| _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  0s |  No case conflicting files 
found.  |
   | +1 :green_heart: |  @author  |   0m  0s |  The patch does not contain any 
@author tags.  |
   | +1 :green_heart: |  test4tests  |   0m  0s |  The patch appears to include 
5 new or modified test files.  |
   ||| _ trunk Compile Tests _ |
   | +0 :ok: |  mvndep  |   3m 36s |  Maven dependency ordering for branch  |
   | +1 :green_heart: |  mvninstall  |  26m 46s |  trunk passed  |
   | +1 :green_heart: |  compile  |  19m 28s |  trunk passed with JDK 
Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1  |
   | +1 :green_heart: |  compile  |  17m  8s |  trunk passed with JDK Private 
Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01  |
   | +1 :green_heart: |  checkstyle  |   3m 41s |  trunk passed  |
   | +1 :green_heart: |  mvnsite  |   2m  1s |  trunk passed  |
   | +1 :green_heart: |  shadedclient  |  20m 48s |  branch has no errors when 
building and testing our client artifacts.  |
   | +1 :green_heart: |  javadoc  |   1m 14s |  trunk passed with JDK 
Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1  |
   | +1 :green_heart: |  javadoc  |   2m 10s |  trunk passed with JDK Private 
Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01  |
   | +0 :ok: |  spotbugs  |   2m 15s |  Used deprecated FindBugs config; 
considering switching to SpotBugs.  |
   | +0 :ok: |  findbugs  |   0m 37s |  branch/hadoop-project no findbugs 
output file (findbugsXml.xml)  |
   ||| _ Patch Compile Tests _ |
   | +0 :ok: |  mvndep  |   0m 29s |  Maven dependency ordering for patch  |
   | +1 :green_heart: |  mvninstall  |   1m  4s |  the patch passed  |
   | -1 :x: |  compile  |   1m  4s |  root in the patch failed with JDK 
Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1.  |
   | -1 :x: |  cc  |   1m  4s |  root in the patch failed with JDK 
Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1.  |
   | -1 :x: |  golang  |   1m  4s |  root in the patch failed with JDK 
Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1.  |
   | -1 :x: |  javac  |   1m  4s |  root in the patch failed with JDK 
Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1.  |
   | -1 :x: |  compile  |   0m 56s |  root in the patch failed with JDK Private 
Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01.  |
   | -1 :x: |  cc  |   0m 56s |  root in the patch failed with JDK Private 
Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01.  |
   | -1 :x: |  golang  |   0m 56s |  root in the patch failed with JDK Private 
Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01.  |
   | -1 :x: |  javac  |   0m 56s |  root in the patch failed with JDK Private 
Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01.  |
   | -0 :warning: |  checkstyle  |   2m 21s |  root: The patch generated 1 new 
+ 151 unchanged - 5 fixed = 152 total (was 156)  |
   | +1 :green_heart: |  mvnsite  |   1m 24s |  the patch passed  |
   | +1 :green_heart: |  whitespace  |   0m  0s |  The patch has no whitespace 
issues.  |
   | +1 :green_heart: |  xml  |   0m  3s |  The patch has no ill-formed XML 
file.  |
   | +1 :green_heart: |  shadedclient  |  13m 48s |  patch has no errors when 
building and testing our client artifacts.  |
   | +1 :green_heart: |  javadoc  |   0m 41s |  the patch passed with JDK 
Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1  |
   | +1 :green_heart: |  javadoc  |   1m 36s |  the patch passed with JDK 
Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01  |
   | +0 :ok: |  findbugs  |   0m 19s |  hadoop-project has no data from 
findbugs  |
   ||| _ Other Tests _ |
   | +1 :green_heart: |  unit  |   0m 18s |  hadoop-project in the patch 
passed.  |
   | -1 :x: |  unit  |   0m 40s |  hadoop-common in the patch failed.  |
   | +1 :green_heart: |  asflicense  |   0m 34s |  The patch does not generate 
ASF License warnings.  |
   |  |   | 125m 50s |   |
   
   
   | Subsystem | Report/Notes |
   |--:|:-|
   | Docker | ClientAPI=1.40 ServerAPI=1.40 base: 
https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2297/2/artifact/out/Dockerfile
 |
   | GITHUB PR | https://github.com/apache/hadoop/pull/2297 |
   | Optional Tests | dupname asflicense compile javac javadoc mvninstall 
mvnsite unit shadedclient xml findbugs checkstyle cc golang |
   | uname | Linux

[jira] [Work logged] (HADOOP-17125) Using snappy-java in SnappyCodec

2020-09-24 Thread ASF GitHub Bot (Jira)



 [ 
https://issues.apache.org/jira/browse/HADOOP-17125?focusedWorklogId=490009=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-490009
 ]

ASF GitHub Bot logged work on HADOOP-17125:
---

Author: ASF GitHub Bot
Created on: 24/Sep/20 06:37
Start Date: 24/Sep/20 06:37
Worklog Time Spent: 10m 
  Work Description: viirya commented on a change in pull request #2297:
URL: https://github.com/apache/hadoop/pull/2297#discussion_r494070826



##
File path: 
hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/io/compress/snappy/SnappyCompressor.java
##
@@ -48,30 +49,20 @@
   private long bytesRead = 0L;
   private long bytesWritten = 0L;
 
-  private static boolean nativeSnappyLoaded = false;
-  
-  static {
-if (NativeCodeLoader.isNativeCodeLoaded() &&
-NativeCodeLoader.buildSupportsSnappy()) {
-  try {
-initIDs();
-nativeSnappyLoaded = true;
-  } catch (Throwable t) {
-LOG.error("failed to load SnappyCompressor", t);
-  }
-}
-  }
-  
-  public static boolean isNativeCodeLoaded() {
-return nativeSnappyLoaded;
-  }
-  
   /**
* Creates a new compressor.
*
* @param directBufferSize size of the direct buffer to be used.
*/
   public SnappyCompressor(int directBufferSize) {
+// `snappy-java` is provided scope. We need to check if its availability.

Review comment:
   Oops, thanks. 





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 490009)
Time Spent: 14h 20m  (was: 14h 10m)

> Using snappy-java in SnappyCodec
> 
>
> Key: HADOOP-17125
> URL: https://issues.apache.org/jira/browse/HADOOP-17125
> Project: Hadoop Common
>  Issue Type: New Feature
>  Components: common
>Affects Versions: 3.3.0
>Reporter: DB Tsai
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 14h 20m
>  Remaining Estimate: 0h
>
> In Hadoop, we use native libs for snappy codec which has several 
> disadvantages:
>  * It requires native *libhadoop* and *libsnappy* to be installed in system 
> *LD_LIBRARY_PATH*, and they have to be installed separately on each node of 
> the clusters, container images, or local test environments which adds huge 
> complexities from deployment point of view. In some environments, it requires 
> compiling the natives from sources which is non-trivial. Also, this approach 
> is platform dependent; the binary may not work in different platform, so it 
> requires recompilation.
>  * It requires extra configuration of *java.library.path* to load the 
> natives, and it results higher application deployment and maintenance cost 
> for users.
> Projects such as *Spark* and *Parquet* use 
> [snappy-java|[https://github.com/xerial/snappy-java]] which is JNI-based 
> implementation. It contains native binaries for Linux, Mac, and IBM in jar 
> file, and it can automatically load the native binaries into JVM from jar 
> without any setup. If a native implementation can not be found for a 
> platform, it can fallback to pure-java implementation of snappy based on 
> [aircompressor|[https://github.com/airlift/aircompressor/tree/master/src/main/java/io/airlift/compress/snappy]].



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org

[jira] [Work logged] (HADOOP-17125) Using snappy-java in SnappyCodec

2020-09-24 Thread ASF GitHub Bot (Jira)



 [ 
https://issues.apache.org/jira/browse/HADOOP-17125?focusedWorklogId=490008=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-490008
 ]

ASF GitHub Bot logged work on HADOOP-17125:
---

Author: ASF GitHub Bot
Created on: 24/Sep/20 06:36
Start Date: 24/Sep/20 06:36
Worklog Time Spent: 10m 
  Work Description: viirya commented on a change in pull request #2297:
URL: https://github.com/apache/hadoop/pull/2297#discussion_r494070442



##
File path: 
hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/io/compress/snappy/SnappyCompressor.java
##
@@ -48,30 +49,20 @@
   private long bytesRead = 0L;
   private long bytesWritten = 0L;
 
-  private static boolean nativeSnappyLoaded = false;
-  
-  static {
-if (NativeCodeLoader.isNativeCodeLoaded() &&
-NativeCodeLoader.buildSupportsSnappy()) {
-  try {
-initIDs();
-nativeSnappyLoaded = true;
-  } catch (Throwable t) {
-LOG.error("failed to load SnappyCompressor", t);
-  }
-}
-  }
-  
-  public static boolean isNativeCodeLoaded() {
-return nativeSnappyLoaded;
-  }
-  
   /**
* Creates a new compressor.
*
* @param directBufferSize size of the direct buffer to be used.
*/
   public SnappyCompressor(int directBufferSize) {
+// `snappy-java` is provided scope. We need to check if its availability.
+try {
+  SnappyLoader.getVersion();
+} catch (Throwable t) {
+  throw new RuntimeException("native snappy library not available: " +

Review comment:
   It is java-snappy jar, yeah, I will review the message.





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 490008)
Time Spent: 14h 10m  (was: 14h)

> Using snappy-java in SnappyCodec
> 
>
> Key: HADOOP-17125
> URL: https://issues.apache.org/jira/browse/HADOOP-17125
> Project: Hadoop Common
>  Issue Type: New Feature
>  Components: common
>Affects Versions: 3.3.0
>Reporter: DB Tsai
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 14h 10m
>  Remaining Estimate: 0h
>
> In Hadoop, we use native libs for snappy codec which has several 
> disadvantages:
>  * It requires native *libhadoop* and *libsnappy* to be installed in system 
> *LD_LIBRARY_PATH*, and they have to be installed separately on each node of 
> the clusters, container images, or local test environments which adds huge 
> complexities from deployment point of view. In some environments, it requires 
> compiling the natives from sources which is non-trivial. Also, this approach 
> is platform dependent; the binary may not work in different platform, so it 
> requires recompilation.
>  * It requires extra configuration of *java.library.path* to load the 
> natives, and it results higher application deployment and maintenance cost 
> for users.
> Projects such as *Spark* and *Parquet* use 
> [snappy-java|[https://github.com/xerial/snappy-java]] which is JNI-based 
> implementation. It contains native binaries for Linux, Mac, and IBM in jar 
> file, and it can automatically load the native binaries into JVM from jar 
> without any setup. If a native implementation can not be found for a 
> platform, it can fallback to pure-java implementation of snappy based on 
> [aircompressor|[https://github.com/airlift/aircompressor/tree/master/src/main/java/io/airlift/compress/snappy]].



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org

[jira] [Work logged] (HADOOP-17125) Using snappy-java in SnappyCodec

2020-09-24 Thread ASF GitHub Bot (Jira)



 [ 
https://issues.apache.org/jira/browse/HADOOP-17125?focusedWorklogId=490007=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-490007
 ]

ASF GitHub Bot logged work on HADOOP-17125:
---

Author: ASF GitHub Bot
Created on: 24/Sep/20 06:35
Start Date: 24/Sep/20 06:35
Worklog Time Spent: 10m 
  Work Description: viirya commented on a change in pull request #2297:
URL: https://github.com/apache/hadoop/pull/2297#discussion_r494070189



##
File path: 
hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/io/compress/snappy/SnappyCompressor.java
##
@@ -291,9 +282,17 @@ public long getBytesWritten() {
   public void end() {
   }
 
-  private native static void initIDs();
-
-  private native int compressBytesDirect();
-
-  public native static String getLibraryName();
+  private int compressBytesDirect() throws IOException {

Review comment:
   This `compressBytesDirect` and `decompressBytesDirect` basically are 
copied from original method names. `compressDirectBuf` and 
`decompressDirectBuf` looks good to me.





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 490007)
Time Spent: 14h  (was: 13h 50m)

> Using snappy-java in SnappyCodec
> 
>
> Key: HADOOP-17125
> URL: https://issues.apache.org/jira/browse/HADOOP-17125
> Project: Hadoop Common
>  Issue Type: New Feature
>  Components: common
>Affects Versions: 3.3.0
>Reporter: DB Tsai
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 14h
>  Remaining Estimate: 0h
>
> In Hadoop, we use native libs for snappy codec which has several 
> disadvantages:
>  * It requires native *libhadoop* and *libsnappy* to be installed in system 
> *LD_LIBRARY_PATH*, and they have to be installed separately on each node of 
> the clusters, container images, or local test environments which adds huge 
> complexities from deployment point of view. In some environments, it requires 
> compiling the natives from sources which is non-trivial. Also, this approach 
> is platform dependent; the binary may not work in different platform, so it 
> requires recompilation.
>  * It requires extra configuration of *java.library.path* to load the 
> natives, and it results higher application deployment and maintenance cost 
> for users.
> Projects such as *Spark* and *Parquet* use 
> [snappy-java|[https://github.com/xerial/snappy-java]] which is JNI-based 
> implementation. It contains native binaries for Linux, Mac, and IBM in jar 
> file, and it can automatically load the native binaries into JVM from jar 
> without any setup. If a native implementation can not be found for a 
> platform, it can fallback to pure-java implementation of snappy based on 
> [aircompressor|[https://github.com/airlift/aircompressor/tree/master/src/main/java/io/airlift/compress/snappy]].



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org

[jira] [Work logged] (HADOOP-17125) Using snappy-java in SnappyCodec

2020-09-24 Thread ASF GitHub Bot (Jira)



 [ 
https://issues.apache.org/jira/browse/HADOOP-17125?focusedWorklogId=490006=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-490006
 ]

ASF GitHub Bot logged work on HADOOP-17125:
---

Author: ASF GitHub Bot
Created on: 24/Sep/20 06:34
Start Date: 24/Sep/20 06:34
Worklog Time Spent: 10m 
  Work Description: viirya commented on a change in pull request #2297:
URL: https://github.com/apache/hadoop/pull/2297#discussion_r494069586



##
File path: 
hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/io/compress/snappy/TestSnappyCompressorDecompressor.java
##
@@ -446,4 +442,43 @@ public void doWork() throws Exception {
 
 ctx.waitFor(6);
   }
+
+  @Test
+  public void testSnappyCompatibility() throws Exception {
+// HADOOP-17125. Using snappy-java in SnappyCodec. These strings are raw 
data and compressed data
+// using previous native Snappy codec. We use updated Snappy codec to 
decode it and check if it
+// matches.
+String rawData = 
"010a06030a040a0c0109020c0a010204020d02000b010701080605080b090902060a080502060a0d06070908080a0c0105030904090d05090800040c090c0d0d0804000d00040b0b0d010d060907020a030a0c0900040905080107040d0c01060a0b09070a04000b01040b09000e0e00020b06050b060e030e0a07050d06050d";

Review comment:
   String is to make the test as simple as possible. Maybe further shorten 
the string?





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 490006)
Time Spent: 13h 50m  (was: 13h 40m)

> Using snappy-java in SnappyCodec
> 
>
> Key: HADOOP-17125
> URL: https://issues.apache.org/jira/browse/HADOOP-17125
> Project: Hadoop Common
>  Issue Type: New Feature
>  Components: common
>Affects Versions: 3.3.0
>Reporter: DB Tsai
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 13h 50m
>  Remaining Estimate: 0h
>
> In Hadoop, we use native libs for snappy codec which has several 
> disadvantages:
>  * It requires native *libhadoop* and *libsnappy* to be installed in system 
> *LD_LIBRARY_PATH*, and they have to be installed separately on each node of 
> the clusters, container images, or local test environments which adds huge 
> complexities from deployment point of view. In some environments, it requires 
> compiling the natives from sources which is non-trivial. Also, this approach 
> is platform dependent; the binary may not work in different platform, so it 
> requires recompilation.
>  * It requires extra configuration of *java.library.path* to load the 
> natives, and it results higher application deployment and maintenance cost 
> for users.
> Projects such as *Spark* and *Parquet* use 
> [snappy-java|[https://github.com/xerial/snappy-java]] which is JNI-based 
> implementation. It contains native binaries for Linux, Mac, and IBM in jar 
> file, and it can automatically load the native binaries into JVM from jar 
> without any setup. If a native implementation can not be found for a 
> platform, it can fallback to pure-java implementation of snappy based on 
> [aircompressor|[https://github.com/airlift/aircompressor/tree/master/src/main/java/io/airlift/compress/snappy]].



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org

[jira] [Work logged] (HADOOP-17125) Using snappy-java in SnappyCodec

2020-09-24 Thread ASF GitHub Bot (Jira)



 [ 
https://issues.apache.org/jira/browse/HADOOP-17125?focusedWorklogId=489995=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-489995
 ]

ASF GitHub Bot logged work on HADOOP-17125:
---

Author: ASF GitHub Bot
Created on: 24/Sep/20 06:03
Start Date: 24/Sep/20 06:03
Worklog Time Spent: 10m 
  Work Description: saintstack commented on a change in pull request #2297:
URL: https://github.com/apache/hadoop/pull/2297#discussion_r494054389



##
File path: 
hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/io/compress/snappy/SnappyCompressor.java
##
@@ -48,30 +49,20 @@
   private long bytesRead = 0L;
   private long bytesWritten = 0L;
 
-  private static boolean nativeSnappyLoaded = false;
-  
-  static {
-if (NativeCodeLoader.isNativeCodeLoaded() &&
-NativeCodeLoader.buildSupportsSnappy()) {
-  try {
-initIDs();
-nativeSnappyLoaded = true;
-  } catch (Throwable t) {
-LOG.error("failed to load SnappyCompressor", t);
-  }
-}
-  }
-  
-  public static boolean isNativeCodeLoaded() {
-return nativeSnappyLoaded;
-  }
-  
   /**
* Creates a new compressor.
*
* @param directBufferSize size of the direct buffer to be used.
*/
   public SnappyCompressor(int directBufferSize) {
+// `snappy-java` is provided scope. We need to check if its availability.
+try {
+  SnappyLoader.getVersion();
+} catch (Throwable t) {
+  throw new RuntimeException("native snappy library not available: " +

Review comment:
   Is it the 'native snappy library' that is missing or the java-snappy jar?

##
File path: 
hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/io/compress/snappy/SnappyCompressor.java
##
@@ -48,30 +49,20 @@
   private long bytesRead = 0L;
   private long bytesWritten = 0L;
 
-  private static boolean nativeSnappyLoaded = false;
-  
-  static {
-if (NativeCodeLoader.isNativeCodeLoaded() &&
-NativeCodeLoader.buildSupportsSnappy()) {
-  try {
-initIDs();
-nativeSnappyLoaded = true;
-  } catch (Throwable t) {
-LOG.error("failed to load SnappyCompressor", t);
-  }
-}
-  }
-  
-  public static boolean isNativeCodeLoaded() {
-return nativeSnappyLoaded;
-  }
-  
   /**
* Creates a new compressor.
*
* @param directBufferSize size of the direct buffer to be used.
*/
   public SnappyCompressor(int directBufferSize) {
+// `snappy-java` is provided scope. We need to check if its availability.

Review comment:
   Fix this last sentence if you make a new PR

##
File path: 
hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/io/compress/snappy/TestSnappyCompressorDecompressor.java
##
@@ -446,4 +442,43 @@ public void doWork() throws Exception {
 
 ctx.waitFor(6);
   }
+
+  @Test
+  public void testSnappyCompatibility() throws Exception {
+// HADOOP-17125. Using snappy-java in SnappyCodec. These strings are raw 
data and compressed data
+// using previous native Snappy codec. We use updated Snappy codec to 
decode it and check if it
+// matches.
+String rawData = 
"010a06030a040a0c0109020c0a010204020d02000b010701080605080b090902060a080502060a0d06070908080a0c0105030904090d05090800040c090c0d0d0804000d00040b0b0d010d060907020a030a0c0900040905080107040d0c01060a0b09070a04000b01040b09000e0e00020b06050b060e030e0a07050d06050d";

Review comment:
   hmm... this is a little anemic. Have you considered adding a data file 
that is a little more interesting than this?

##
File path: 
hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/io/compress/snappy/SnappyDecompressor.java
##
@@ -276,10 +267,20 @@ public void end() {
 // do nothing
   }
 
-  private native static void initIDs();
+  private int decompressBytesDirect() throws IOException {

Review comment:
   ditto

##
File path: 
hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/io/compress/snappy/SnappyDecompressor.java
##
@@ -45,30 +46,20 @@
   private int userBufOff = 0, userBufLen = 0;
   private boolean finished;
 
-  private static boolean nativeSnappyLoaded = false;
-
-  static {
-if (NativeCodeLoader.isNativeCodeLoaded() &&
-NativeCodeLoader.buildSupportsSnappy()) {
-  try {
-initIDs();
-nativeSnappyLoaded = true;
-  } catch (Throwable t) {
-LOG.error("failed to load SnappyDecompressor", t);
-  }
-}
-  }
-  
-  public static boolean isNativeCodeLoaded() {
-return nativeSnappyLoaded;
-  }
-  
   /**
* Creates a new compressor.
*
* @param directBufferSize size of the direct buffer to be used.
*/
   public SnappyDecompressor(int directBufferSize) {
+// `snappy-java` is provided scope. We need to check if its availability.
+try {
+

[jira] [Work logged] (HADOOP-17125) Using snappy-java in SnappyCodec

2020-09-22 Thread ASF GitHub Bot (Jira)



 [ 
https://issues.apache.org/jira/browse/HADOOP-17125?focusedWorklogId=488822=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-488822
 ]

ASF GitHub Bot logged work on HADOOP-17125:
---

Author: ASF GitHub Bot
Created on: 23/Sep/20 04:12
Start Date: 23/Sep/20 04:12
Worklog Time Spent: 10m 
  Work Description: dbtsai commented on pull request #2297:
URL: https://github.com/apache/hadoop/pull/2297#issuecomment-697008848


   cc @iwasakims @aajisaka who involved with codec in Hadoop.



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 488822)
Time Spent: 13.5h  (was: 13h 20m)

> Using snappy-java in SnappyCodec
> 
>
> Key: HADOOP-17125
> URL: https://issues.apache.org/jira/browse/HADOOP-17125
> Project: Hadoop Common
>  Issue Type: New Feature
>  Components: common
>Affects Versions: 3.3.0
>Reporter: DB Tsai
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 13.5h
>  Remaining Estimate: 0h
>
> In Hadoop, we use native libs for snappy codec which has several 
> disadvantages:
>  * It requires native *libhadoop* and *libsnappy* to be installed in system 
> *LD_LIBRARY_PATH*, and they have to be installed separately on each node of 
> the clusters, container images, or local test environments which adds huge 
> complexities from deployment point of view. In some environments, it requires 
> compiling the natives from sources which is non-trivial. Also, this approach 
> is platform dependent; the binary may not work in different platform, so it 
> requires recompilation.
>  * It requires extra configuration of *java.library.path* to load the 
> natives, and it results higher application deployment and maintenance cost 
> for users.
> Projects such as *Spark* and *Parquet* use 
> [snappy-java|[https://github.com/xerial/snappy-java]] which is JNI-based 
> implementation. It contains native binaries for Linux, Mac, and IBM in jar 
> file, and it can automatically load the native binaries into JVM from jar 
> without any setup. If a native implementation can not be found for a 
> platform, it can fallback to pure-java implementation of snappy based on 
> [aircompressor|[https://github.com/airlift/aircompressor/tree/master/src/main/java/io/airlift/compress/snappy]].



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org

[jira] [Work logged] (HADOOP-17125) Using snappy-java in SnappyCodec

2020-09-22 Thread ASF GitHub Bot (Jira)



 [ 
https://issues.apache.org/jira/browse/HADOOP-17125?focusedWorklogId=488627=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-488627
 ]

ASF GitHub Bot logged work on HADOOP-17125:
---

Author: ASF GitHub Bot
Created on: 22/Sep/20 22:09
Start Date: 22/Sep/20 22:09
Worklog Time Spent: 10m 
  Work Description: dbtsai commented on pull request #2297:
URL: https://github.com/apache/hadoop/pull/2297#issuecomment-697008848


   cc @iwasakims @aajisaka who involved with codec in Hadoop.



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 488627)
Time Spent: 13h 20m  (was: 13h 10m)

> Using snappy-java in SnappyCodec
> 
>
> Key: HADOOP-17125
> URL: https://issues.apache.org/jira/browse/HADOOP-17125
> Project: Hadoop Common
>  Issue Type: New Feature
>  Components: common
>Affects Versions: 3.3.0
>Reporter: DB Tsai
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 13h 20m
>  Remaining Estimate: 0h
>
> In Hadoop, we use native libs for snappy codec which has several 
> disadvantages:
>  * It requires native *libhadoop* and *libsnappy* to be installed in system 
> *LD_LIBRARY_PATH*, and they have to be installed separately on each node of 
> the clusters, container images, or local test environments which adds huge 
> complexities from deployment point of view. In some environments, it requires 
> compiling the natives from sources which is non-trivial. Also, this approach 
> is platform dependent; the binary may not work in different platform, so it 
> requires recompilation.
>  * It requires extra configuration of *java.library.path* to load the 
> natives, and it results higher application deployment and maintenance cost 
> for users.
> Projects such as *Spark* and *Parquet* use 
> [snappy-java|[https://github.com/xerial/snappy-java]] which is JNI-based 
> implementation. It contains native binaries for Linux, Mac, and IBM in jar 
> file, and it can automatically load the native binaries into JVM from jar 
> without any setup. If a native implementation can not be found for a 
> platform, it can fallback to pure-java implementation of snappy based on 
> [aircompressor|[https://github.com/airlift/aircompressor/tree/master/src/main/java/io/airlift/compress/snappy]].



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org

[jira] [Work logged] (HADOOP-17125) Using snappy-java in SnappyCodec

2020-09-21 Thread ASF GitHub Bot (Jira)



 [ 
https://issues.apache.org/jira/browse/HADOOP-17125?focusedWorklogId=487464=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-487464
 ]

ASF GitHub Bot logged work on HADOOP-17125:
---

Author: ASF GitHub Bot
Created on: 22/Sep/20 03:03
Start Date: 22/Sep/20 03:03
Worklog Time Spent: 10m 
  Work Description: dbtsai commented on pull request #2297:
URL: https://github.com/apache/hadoop/pull/2297#issuecomment-696356461


   @steveloughran we addressed your comments. Please take a look again when you 
have time. Thanks!



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 487464)
Time Spent: 13h 10m  (was: 13h)

> Using snappy-java in SnappyCodec
> 
>
> Key: HADOOP-17125
> URL: https://issues.apache.org/jira/browse/HADOOP-17125
> Project: Hadoop Common
>  Issue Type: New Feature
>  Components: common
>Affects Versions: 3.3.0
>Reporter: DB Tsai
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 13h 10m
>  Remaining Estimate: 0h
>
> In Hadoop, we use native libs for snappy codec which has several 
> disadvantages:
>  * It requires native *libhadoop* and *libsnappy* to be installed in system 
> *LD_LIBRARY_PATH*, and they have to be installed separately on each node of 
> the clusters, container images, or local test environments which adds huge 
> complexities from deployment point of view. In some environments, it requires 
> compiling the natives from sources which is non-trivial. Also, this approach 
> is platform dependent; the binary may not work in different platform, so it 
> requires recompilation.
>  * It requires extra configuration of *java.library.path* to load the 
> natives, and it results higher application deployment and maintenance cost 
> for users.
> Projects such as *Spark* and *Parquet* use 
> [snappy-java|[https://github.com/xerial/snappy-java]] which is JNI-based 
> implementation. It contains native binaries for Linux, Mac, and IBM in jar 
> file, and it can automatically load the native binaries into JVM from jar 
> without any setup. If a native implementation can not be found for a 
> platform, it can fallback to pure-java implementation of snappy based on 
> [aircompressor|[https://github.com/airlift/aircompressor/tree/master/src/main/java/io/airlift/compress/snappy]].



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org

[jira] [Work logged] (HADOOP-17125) Using snappy-java in SnappyCodec

2020-09-21 Thread ASF GitHub Bot (Jira)



 [ 
https://issues.apache.org/jira/browse/HADOOP-17125?focusedWorklogId=487247=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-487247
 ]

ASF GitHub Bot logged work on HADOOP-17125:
---

Author: ASF GitHub Bot
Created on: 21/Sep/20 20:29
Start Date: 21/Sep/20 20:29
Worklog Time Spent: 10m 
  Work Description: dbtsai commented on pull request #2297:
URL: https://github.com/apache/hadoop/pull/2297#issuecomment-696356461


   @steveloughran we addressed your comments. Please take a look again when you 
have time. Thanks!



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 487247)
Time Spent: 13h  (was: 12h 50m)

> Using snappy-java in SnappyCodec
> 
>
> Key: HADOOP-17125
> URL: https://issues.apache.org/jira/browse/HADOOP-17125
> Project: Hadoop Common
>  Issue Type: New Feature
>  Components: common
>Affects Versions: 3.3.0
>Reporter: DB Tsai
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 13h
>  Remaining Estimate: 0h
>
> In Hadoop, we use native libs for snappy codec which has several 
> disadvantages:
>  * It requires native *libhadoop* and *libsnappy* to be installed in system 
> *LD_LIBRARY_PATH*, and they have to be installed separately on each node of 
> the clusters, container images, or local test environments which adds huge 
> complexities from deployment point of view. In some environments, it requires 
> compiling the natives from sources which is non-trivial. Also, this approach 
> is platform dependent; the binary may not work in different platform, so it 
> requires recompilation.
>  * It requires extra configuration of *java.library.path* to load the 
> natives, and it results higher application deployment and maintenance cost 
> for users.
> Projects such as *Spark* and *Parquet* use 
> [snappy-java|[https://github.com/xerial/snappy-java]] which is JNI-based 
> implementation. It contains native binaries for Linux, Mac, and IBM in jar 
> file, and it can automatically load the native binaries into JVM from jar 
> without any setup. If a native implementation can not be found for a 
> platform, it can fallback to pure-java implementation of snappy based on 
> [aircompressor|[https://github.com/airlift/aircompressor/tree/master/src/main/java/io/airlift/compress/snappy]].



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org

[jira] [Work logged] (HADOOP-17125) Using snappy-java in SnappyCodec

2020-09-17 Thread ASF GitHub Bot (Jira)



 [ 
https://issues.apache.org/jira/browse/HADOOP-17125?focusedWorklogId=486064=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-486064
 ]

ASF GitHub Bot logged work on HADOOP-17125:
---

Author: ASF GitHub Bot
Created on: 18/Sep/20 05:27
Start Date: 18/Sep/20 05:27
Worklog Time Spent: 10m 
  Work Description: viirya commented on pull request #2297:
URL: https://github.com/apache/hadoop/pull/2297#issuecomment-694659679


   Regarding the change of mvn dependency.
   
   For Apache Hadoop Common, the diff is:
   
   Before:
   ```
   [INFO] +- org.apache.avro:avro:jar:1.7.7:compile
   [INFO] |  +- com.thoughtworks.paranamer:paranamer:jar:2.3:compile
   [INFO] |  \- org.xerial.snappy:snappy-java:jar:1.0.5:compile
   ```
   
   After:
   ```
   [INFO] +- org.apache.avro:avro:jar:1.7.7:compile
   [INFO] |  \- com.thoughtworks.paranamer:paranamer:jar:2.3:compile
   ...
   [INFO] \- org.xerial.snappy:snappy-java:jar:1.1.7.7:provided
   ```
   
   For other modules, the change is the same.
   
   Like Apache Hadoop NFS:
   
   Before:
   ```
   [INFO] |  +- org.apache.avro:avro:jar:1.7.7:provided
   [INFO] |  |  +- com.thoughtworks.paranamer:paranamer:jar:2.3:provided
   [INFO] |  |  \- org.xerial.snappy:snappy-java:jar:1.0.5:provided
   ```
   
   After:
   ```
   [INFO] |  +- org.apache.avro:avro:jar:1.7.7:provided
   [INFO] |  |  +- com.thoughtworks.paranamer:paranamer:jar:2.3:provided
   [INFO] |  |  \- org.xerial.snappy:snappy-java:jar:1.1.7.7:provided
   ```
   
   Or like Apache Hadoop KMS:
   
   Before:
   ```
   [INFO] |  +- org.apache.avro:avro:jar:1.7.7:compile
   [INFO] |  |  +- com.thoughtworks.paranamer:paranamer:jar:2.3:compile
   [INFO] |  |  \- org.xerial.snappy:snappy-java:jar:1.0.5:compile
   ```
   
   After:
   ```
   [INFO] |  +- org.apache.avro:avro:jar:1.7.7:compile
   [INFO] |  |  +- com.thoughtworks.paranamer:paranamer:jar:2.3:compile
   [INFO] |  |  \- org.xerial.snappy:snappy-java:jar:1.1.7.7:compile
   ```
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 486064)
Time Spent: 12h 50m  (was: 12h 40m)

> Using snappy-java in SnappyCodec
> 
>
> Key: HADOOP-17125
> URL: https://issues.apache.org/jira/browse/HADOOP-17125
> Project: Hadoop Common
>  Issue Type: New Feature
>  Components: common
>Affects Versions: 3.3.0
>Reporter: DB Tsai
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 12h 50m
>  Remaining Estimate: 0h
>
> In Hadoop, we use native libs for snappy codec which has several 
> disadvantages:
>  * It requires native *libhadoop* and *libsnappy* to be installed in system 
> *LD_LIBRARY_PATH*, and they have to be installed separately on each node of 
> the clusters, container images, or local test environments which adds huge 
> complexities from deployment point of view. In some environments, it requires 
> compiling the natives from sources which is non-trivial. Also, this approach 
> is platform dependent; the binary may not work in different platform, so it 
> requires recompilation.
>  * It requires extra configuration of *java.library.path* to load the 
> natives, and it results higher application deployment and maintenance cost 
> for users.
> Projects such as *Spark* and *Parquet* use 
> [snappy-java|[https://github.com/xerial/snappy-java]] which is JNI-based 
> implementation. It contains native binaries for Linux, Mac, and IBM in jar 
> file, and it can automatically load the native binaries into JVM from jar 
> without any setup. If a native implementation can not be found for a 
> platform, it can fallback to pure-java implementation of snappy based on 
> [aircompressor|[https://github.com/airlift/aircompressor/tree/master/src/main/java/io/airlift/compress/snappy]].



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org

[jira] [Work logged] (HADOOP-17125) Using snappy-java in SnappyCodec

2020-09-17 Thread ASF GitHub Bot (Jira)



 [ 
https://issues.apache.org/jira/browse/HADOOP-17125?focusedWorklogId=486023=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-486023
 ]

ASF GitHub Bot logged work on HADOOP-17125:
---

Author: ASF GitHub Bot
Created on: 18/Sep/20 01:21
Start Date: 18/Sep/20 01:21
Worklog Time Spent: 10m 
  Work Description: hadoop-yetus commented on pull request #2297:
URL: https://github.com/apache/hadoop/pull/2297#issuecomment-694592777


   :broken_heart: **-1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime | Comment |
   |::|--:|:|:|
   | +0 :ok: |  reexec  |   0m 29s |  Docker mode activated.  |
   ||| _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  0s |  No case conflicting files 
found.  |
   | +1 :green_heart: |  @author  |   0m  0s |  The patch does not contain any 
@author tags.  |
   | +1 :green_heart: |  test4tests  |   0m  0s |  The patch appears to include 
5 new or modified test files.  |
   ||| _ trunk Compile Tests _ |
   | +0 :ok: |  mvndep  |   3m 18s |  Maven dependency ordering for branch  |
   | +1 :green_heart: |  mvninstall  |  25m 58s |  trunk passed  |
   | +1 :green_heart: |  compile  |  19m 24s |  trunk passed with JDK 
Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1  |
   | +1 :green_heart: |  compile  |  16m 49s |  trunk passed with JDK Private 
Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01  |
   | +1 :green_heart: |  checkstyle  |   2m 44s |  trunk passed  |
   | +1 :green_heart: |  mvnsite  |   2m 41s |  trunk passed  |
   | +1 :green_heart: |  shadedclient  |  20m 10s |  branch has no errors when 
building and testing our client artifacts.  |
   | +1 :green_heart: |  javadoc  |   1m 49s |  trunk passed with JDK 
Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1  |
   | +1 :green_heart: |  javadoc  |   2m 41s |  trunk passed with JDK Private 
Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01  |
   | +0 :ok: |  spotbugs  |   0m 35s |  Used deprecated FindBugs config; 
considering switching to SpotBugs.  |
   | +0 :ok: |  findbugs  |   0m 37s |  branch/hadoop-project no findbugs 
output file (findbugsXml.xml)  |
   | +0 :ok: |  findbugs  |   0m 35s |  branch/hadoop-project-dist no findbugs 
output file (findbugsXml.xml)  |
   ||| _ Patch Compile Tests _ |
   | +0 :ok: |  mvndep  |   1m  1s |  Maven dependency ordering for patch  |
   | +1 :green_heart: |  mvninstall  |   1m 21s |  the patch passed  |
   | +1 :green_heart: |  compile  |  19m 33s |  the patch passed with JDK 
Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1  |
   | -1 :x: |  cc  |  19m 33s |  
root-jdkUbuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1 with JDK 
Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1 generated 38 new + 125 unchanged - 
38 fixed = 163 total (was 163)  |
   | +1 :green_heart: |  golang  |  19m 33s |  the patch passed  |
   | +1 :green_heart: |  javac  |  19m 33s |  the patch passed  |
   | +1 :green_heart: |  compile  |  16m 48s |  the patch passed with JDK 
Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01  |
   | -1 :x: |  cc  |  16m 48s |  
root-jdkPrivateBuild-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 with JDK Private 
Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 generated 43 new + 120 unchanged - 
43 fixed = 163 total (was 163)  |
   | +1 :green_heart: |  golang  |  16m 48s |  the patch passed  |
   | +1 :green_heart: |  javac  |  16m 48s |  the patch passed  |
   | -0 :warning: |  checkstyle  |   2m 48s |  root: The patch generated 6 new 
+ 151 unchanged - 5 fixed = 157 total (was 156)  |
   | +1 :green_heart: |  mvnsite  |   2m 40s |  the patch passed  |
   | +1 :green_heart: |  whitespace  |   0m  0s |  The patch has no whitespace 
issues.  |
   | +1 :green_heart: |  xml  |   0m  4s |  The patch has no ill-formed XML 
file.  |
   | +1 :green_heart: |  shadedclient  |  14m  6s |  patch has no errors when 
building and testing our client artifacts.  |
   | +1 :green_heart: |  javadoc  |   1m 49s |  the patch passed with JDK 
Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1  |
   | +1 :green_heart: |  javadoc  |   2m 39s |  the patch passed with JDK 
Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01  |
   | +0 :ok: |  findbugs  |   0m 36s |  hadoop-project has no data from 
findbugs  |
   | +0 :ok: |  findbugs  |   0m 35s |  hadoop-project-dist has no data from 
findbugs  |
   ||| _ Other Tests _ |
   | +1 :green_heart: |  unit  |   0m 34s |  hadoop-project in the patch 
passed.  |
   | +1 :green_heart: |  unit  |   0m 35s |  hadoop-project-dist in the patch 
passed.  |
   | +1 :green_heart: |  unit  |   9m 24s |  hadoop-common in the patch passed. 
 |
   | +1 :green_heart: |  asflicense  |   0m 53s |  The patch does not generate 
ASF License warnings.  |
   |  |   | 177m 15s |   |
   
   
   | Subsystem | Report/Notes |
   |--:|:-|
   | Docker | ClientAPI=1.40 ServerAPI=1.40 base:

[jira] [Work logged] (HADOOP-17125) Using snappy-java in SnappyCodec

2020-09-17 Thread ASF GitHub Bot (Jira)



 [ 
https://issues.apache.org/jira/browse/HADOOP-17125?focusedWorklogId=485981=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-485981
 ]

ASF GitHub Bot logged work on HADOOP-17125:
---

Author: ASF GitHub Bot
Created on: 17/Sep/20 22:29
Start Date: 17/Sep/20 22:29
Worklog Time Spent: 10m 
  Work Description: viirya commented on a change in pull request #2297:
URL: https://github.com/apache/hadoop/pull/2297#discussion_r490596868



##
File path: 
hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/io/compress/snappy/SnappyDecompressor.java
##
@@ -45,30 +46,19 @@
   private int userBufOff = 0, userBufLen = 0;
   private boolean finished;
 
-  private static boolean nativeSnappyLoaded = false;
-
-  static {
-if (NativeCodeLoader.isNativeCodeLoaded() &&
-NativeCodeLoader.buildSupportsSnappy()) {
-  try {
-initIDs();
-nativeSnappyLoaded = true;
-  } catch (Throwable t) {
-LOG.error("failed to load SnappyDecompressor", t);
-  }
-}
-  }
-  
-  public static boolean isNativeCodeLoaded() {
-return nativeSnappyLoaded;
-  }
-  
   /**
* Creates a new compressor.
*
* @param directBufferSize size of the direct buffer to be used.
*/
   public SnappyDecompressor(int directBufferSize) {
+// `snappy-java` is provided scope. We need to check if its availability.
+try {
+  SnappyLoader.getVersion();
+} catch (Throwable t) {
+  LOG.warn("Error loading snappy libraries: " + t);

Review comment:
   ok, changed to throw `RuntimeException`.





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 485981)
Time Spent: 12.5h  (was: 12h 20m)

> Using snappy-java in SnappyCodec
> 
>
> Key: HADOOP-17125
> URL: https://issues.apache.org/jira/browse/HADOOP-17125
> Project: Hadoop Common
>  Issue Type: New Feature
>  Components: common
>Affects Versions: 3.3.0
>Reporter: DB Tsai
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 12.5h
>  Remaining Estimate: 0h
>
> In Hadoop, we use native libs for snappy codec which has several 
> disadvantages:
>  * It requires native *libhadoop* and *libsnappy* to be installed in system 
> *LD_LIBRARY_PATH*, and they have to be installed separately on each node of 
> the clusters, container images, or local test environments which adds huge 
> complexities from deployment point of view. In some environments, it requires 
> compiling the natives from sources which is non-trivial. Also, this approach 
> is platform dependent; the binary may not work in different platform, so it 
> requires recompilation.
>  * It requires extra configuration of *java.library.path* to load the 
> natives, and it results higher application deployment and maintenance cost 
> for users.
> Projects such as *Spark* and *Parquet* use 
> [snappy-java|[https://github.com/xerial/snappy-java]] which is JNI-based 
> implementation. It contains native binaries for Linux, Mac, and IBM in jar 
> file, and it can automatically load the native binaries into JVM from jar 
> without any setup. If a native implementation can not be found for a 
> platform, it can fallback to pure-java implementation of snappy based on 
> [aircompressor|[https://github.com/airlift/aircompressor/tree/master/src/main/java/io/airlift/compress/snappy]].



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org

[jira] [Work logged] (HADOOP-17125) Using snappy-java in SnappyCodec

2020-09-17 Thread ASF GitHub Bot (Jira)



 [ 
https://issues.apache.org/jira/browse/HADOOP-17125?focusedWorklogId=485980=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-485980
 ]

ASF GitHub Bot logged work on HADOOP-17125:
---

Author: ASF GitHub Bot
Created on: 17/Sep/20 22:28
Start Date: 17/Sep/20 22:28
Worklog Time Spent: 10m 
  Work Description: viirya commented on a change in pull request #2297:
URL: https://github.com/apache/hadoop/pull/2297#discussion_r490596761



##
File path: 
hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/io/compress/snappy/SnappyCompressor.java
##
@@ -48,24 +48,6 @@
   private long bytesRead = 0L;
   private long bytesWritten = 0L;
 
-  private static boolean nativeSnappyLoaded = false;
-  
-  static {
-if (NativeCodeLoader.isNativeCodeLoaded() &&
-NativeCodeLoader.buildSupportsSnappy()) {
-  try {
-initIDs();
-nativeSnappyLoaded = true;
-  } catch (Throwable t) {
-LOG.error("failed to load SnappyCompressor", t);
-  }
-}
-  }
-  
-  public static boolean isNativeCodeLoaded() {
-return nativeSnappyLoaded;

Review comment:
   added.





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 485980)
Time Spent: 12h 20m  (was: 12h 10m)

> Using snappy-java in SnappyCodec
> 
>
> Key: HADOOP-17125
> URL: https://issues.apache.org/jira/browse/HADOOP-17125
> Project: Hadoop Common
>  Issue Type: New Feature
>  Components: common
>Affects Versions: 3.3.0
>Reporter: DB Tsai
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 12h 20m
>  Remaining Estimate: 0h
>
> In Hadoop, we use native libs for snappy codec which has several 
> disadvantages:
>  * It requires native *libhadoop* and *libsnappy* to be installed in system 
> *LD_LIBRARY_PATH*, and they have to be installed separately on each node of 
> the clusters, container images, or local test environments which adds huge 
> complexities from deployment point of view. In some environments, it requires 
> compiling the natives from sources which is non-trivial. Also, this approach 
> is platform dependent; the binary may not work in different platform, so it 
> requires recompilation.
>  * It requires extra configuration of *java.library.path* to load the 
> natives, and it results higher application deployment and maintenance cost 
> for users.
> Projects such as *Spark* and *Parquet* use 
> [snappy-java|[https://github.com/xerial/snappy-java]] which is JNI-based 
> implementation. It contains native binaries for Linux, Mac, and IBM in jar 
> file, and it can automatically load the native binaries into JVM from jar 
> without any setup. If a native implementation can not be found for a 
> platform, it can fallback to pure-java implementation of snappy based on 
> [aircompressor|[https://github.com/airlift/aircompressor/tree/master/src/main/java/io/airlift/compress/snappy]].



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org

[jira] [Work logged] (HADOOP-17125) Using snappy-java in SnappyCodec

2020-09-17 Thread ASF GitHub Bot (Jira)



 [ 
https://issues.apache.org/jira/browse/HADOOP-17125?focusedWorklogId=485722=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-485722
 ]

ASF GitHub Bot logged work on HADOOP-17125:
---

Author: ASF GitHub Bot
Created on: 17/Sep/20 13:05
Start Date: 17/Sep/20 13:05
Worklog Time Spent: 10m 
  Work Description: hadoop-yetus removed a comment on pull request #2297:
URL: https://github.com/apache/hadoop/pull/2297#issuecomment-693863169


   :broken_heart: **-1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime | Comment |
   |::|--:|:|:|
   | +0 :ok: |  reexec  |  29m 58s |  Docker mode activated.  |
   ||| _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  1s |  No case conflicting files 
found.  |
   | +1 :green_heart: |  @author  |   0m  0s |  The patch does not contain any 
@author tags.  |
   | +1 :green_heart: |  test4tests  |   0m  0s |  The patch appears to include 
5 new or modified test files.  |
   ||| _ trunk Compile Tests _ |
   | +0 :ok: |  mvndep  |   3m 21s |  Maven dependency ordering for branch  |
   | +1 :green_heart: |  mvninstall  |  27m 10s |  trunk passed  |
   | +1 :green_heart: |  compile  |  21m 26s |  trunk passed with JDK 
Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1  |
   | +1 :green_heart: |  compile  |  18m 44s |  trunk passed with JDK Private 
Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01  |
   | +1 :green_heart: |  checkstyle  |   2m 54s |  trunk passed  |
   | +1 :green_heart: |  mvnsite  |   2m 27s |  trunk passed  |
   | +1 :green_heart: |  shadedclient  |  19m 56s |  branch has no errors when 
building and testing our client artifacts.  |
   | +1 :green_heart: |  javadoc  |   1m 38s |  trunk passed with JDK 
Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1  |
   | +1 :green_heart: |  javadoc  |   2m 32s |  trunk passed with JDK Private 
Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01  |
   | +0 :ok: |  spotbugs  |   0m 38s |  Used deprecated FindBugs config; 
considering switching to SpotBugs.  |
   | +0 :ok: |  findbugs  |   0m 36s |  branch/hadoop-project no findbugs 
output file (findbugsXml.xml)  |
   | +0 :ok: |  findbugs  |   0m 37s |  branch/hadoop-project-dist no findbugs 
output file (findbugsXml.xml)  |
   ||| _ Patch Compile Tests _ |
   | +0 :ok: |  mvndep  |   0m 30s |  Maven dependency ordering for patch  |
   | +1 :green_heart: |  mvninstall  |   1m 16s |  the patch passed  |
   | +1 :green_heart: |  compile  |  20m 49s |  the patch passed with JDK 
Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1  |
   | -1 :x: |  cc  |  20m 49s |  
root-jdkUbuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1 with JDK 
Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1 generated 17 new + 146 unchanged - 
17 fixed = 163 total (was 163)  |
   | +1 :green_heart: |  golang  |  20m 49s |  the patch passed  |
   | +1 :green_heart: |  javac  |  20m 49s |  the patch passed  |
   | +1 :green_heart: |  compile  |  18m 31s |  the patch passed with JDK 
Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01  |
   | -1 :x: |  cc  |  18m 31s |  
root-jdkPrivateBuild-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 with JDK Private 
Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 generated 11 new + 152 unchanged - 
11 fixed = 163 total (was 163)  |
   | +1 :green_heart: |  golang  |  18m 31s |  the patch passed  |
   | +1 :green_heart: |  javac  |  18m 31s |  the patch passed  |
   | -0 :warning: |  checkstyle  |   2m 49s |  root: The patch generated 6 new 
+ 151 unchanged - 5 fixed = 157 total (was 156)  |
   | +1 :green_heart: |  mvnsite  |   2m 32s |  the patch passed  |
   | +1 :green_heart: |  whitespace  |   0m  0s |  The patch has no whitespace 
issues.  |
   | +1 :green_heart: |  xml  |   0m  5s |  The patch has no ill-formed XML 
file.  |
   | +1 :green_heart: |  shadedclient  |  14m  3s |  patch has no errors when 
building and testing our client artifacts.  |
   | +1 :green_heart: |  javadoc  |   1m 38s |  the patch passed with JDK 
Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1  |
   | +1 :green_heart: |  javadoc  |   2m 42s |  the patch passed with JDK 
Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01  |
   | +0 :ok: |  findbugs  |   0m 34s |  hadoop-project has no data from 
findbugs  |
   | +0 :ok: |  findbugs  |   0m 35s |  hadoop-project-dist has no data from 
findbugs  |
   ||| _ Other Tests _ |
   | +1 :green_heart: |  unit  |   0m 33s |  hadoop-project in the patch 
passed.  |
   | +1 :green_heart: |  unit  |   0m 34s |  hadoop-project-dist in the patch 
passed.  |
   | +1 :green_heart: |  unit  |   9m 49s |  hadoop-common in the patch passed. 
 |
   | +1 :green_heart: |  asflicense  |   0m 50s |  The patch does not generate 
ASF License warnings.  |
   |  |   | 213m 57s |   |
   
   
   | Subsystem | Report/Notes |
   |--:|:-|
   | Docker | ClientAPI=1.40 ServerAPI=1.40 base:

[jira] [Work logged] (HADOOP-17125) Using snappy-java in SnappyCodec

2020-09-16 Thread ASF GitHub Bot (Jira)



 [ 
https://issues.apache.org/jira/browse/HADOOP-17125?focusedWorklogId=485533=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-485533
 ]

ASF GitHub Bot logged work on HADOOP-17125:
---

Author: ASF GitHub Bot
Created on: 17/Sep/20 05:11
Start Date: 17/Sep/20 05:11
Worklog Time Spent: 10m 
  Work Description: hadoop-yetus commented on pull request #2297:
URL: https://github.com/apache/hadoop/pull/2297#issuecomment-693868067


   :broken_heart: **-1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime | Comment |
   |::|--:|:|:|
   | +0 :ok: |  reexec  |  29m 41s |  Docker mode activated.  |
   ||| _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  0s |  No case conflicting files 
found.  |
   | +1 :green_heart: |  @author  |   0m  0s |  The patch does not contain any 
@author tags.  |
   | +1 :green_heart: |  test4tests  |   0m  0s |  The patch appears to include 
5 new or modified test files.  |
   ||| _ trunk Compile Tests _ |
   | +0 :ok: |  mvndep  |   3m 20s |  Maven dependency ordering for branch  |
   | +1 :green_heart: |  mvninstall  |  27m  3s |  trunk passed  |
   | +1 :green_heart: |  compile  |  21m 43s |  trunk passed with JDK 
Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1  |
   | +1 :green_heart: |  compile  |  18m 24s |  trunk passed with JDK Private 
Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01  |
   | +1 :green_heart: |  checkstyle  |   2m 50s |  trunk passed  |
   | +1 :green_heart: |  mvnsite  |   2m 41s |  trunk passed  |
   | +1 :green_heart: |  shadedclient  |  20m 17s |  branch has no errors when 
building and testing our client artifacts.  |
   | +1 :green_heart: |  javadoc  |   1m 40s |  trunk passed with JDK 
Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1  |
   | +1 :green_heart: |  javadoc  |   2m 37s |  trunk passed with JDK Private 
Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01  |
   | +0 :ok: |  spotbugs  |   0m 29s |  Used deprecated FindBugs config; 
considering switching to SpotBugs.  |
   | +0 :ok: |  findbugs  |   0m 40s |  branch/hadoop-project no findbugs 
output file (findbugsXml.xml)  |
   | +0 :ok: |  findbugs  |   0m 29s |  branch/hadoop-project-dist no findbugs 
output file (findbugsXml.xml)  |
   ||| _ Patch Compile Tests _ |
   | +0 :ok: |  mvndep  |   0m 32s |  Maven dependency ordering for patch  |
   | +1 :green_heart: |  mvninstall  |   1m 17s |  the patch passed  |
   | +1 :green_heart: |  compile  |  20m 57s |  the patch passed with JDK 
Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1  |
   | -1 :x: |  cc  |  20m 57s |  
root-jdkUbuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1 with JDK 
Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1 generated 18 new + 145 unchanged - 
18 fixed = 163 total (was 163)  |
   | +1 :green_heart: |  golang  |  20m 57s |  the patch passed  |
   | +1 :green_heart: |  javac  |  20m 57s |  the patch passed  |
   | +1 :green_heart: |  compile  |  18m 29s |  the patch passed with JDK 
Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01  |
   | -1 :x: |  cc  |  18m 29s |  
root-jdkPrivateBuild-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 with JDK Private 
Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 generated 40 new + 123 unchanged - 
40 fixed = 163 total (was 163)  |
   | +1 :green_heart: |  golang  |  18m 29s |  the patch passed  |
   | +1 :green_heart: |  javac  |  18m 29s |  the patch passed  |
   | -0 :warning: |  checkstyle  |   2m 54s |  root: The patch generated 6 new 
+ 151 unchanged - 5 fixed = 157 total (was 156)  |
   | +1 :green_heart: |  mvnsite  |   2m 39s |  the patch passed  |
   | +1 :green_heart: |  whitespace  |   0m  0s |  The patch has no whitespace 
issues.  |
   | +1 :green_heart: |  xml  |   0m  4s |  The patch has no ill-formed XML 
file.  |
   | +1 :green_heart: |  shadedclient  |  14m  9s |  patch has no errors when 
building and testing our client artifacts.  |
   | +1 :green_heart: |  javadoc  |   1m 37s |  the patch passed with JDK 
Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1  |
   | +1 :green_heart: |  javadoc  |   2m 39s |  the patch passed with JDK 
Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01  |
   | +0 :ok: |  findbugs  |   0m 29s |  hadoop-project has no data from 
findbugs  |
   | +0 :ok: |  findbugs  |   0m 34s |  hadoop-project-dist has no data from 
findbugs  |
   ||| _ Other Tests _ |
   | +1 :green_heart: |  unit  |   0m 30s |  hadoop-project in the patch 
passed.  |
   | +1 :green_heart: |  unit  |   0m 30s |  hadoop-project-dist in the patch 
passed.  |
   | -1 :x: |  unit  |  10m 18s |  hadoop-common in the patch passed.  |
   | +1 :green_heart: |  asflicense  |   0m 57s |  The patch does not generate 
ASF License warnings.  |
   |  |   | 214m 44s |   |
   
   
   | Reason | Tests |
   |---:|:--|
   | Failed junit tests | hadoop.ha.TestZKFailoverController |
   
   
   | Subsystem | Report/Notes |
   |--:|:-|

[jira] [Work logged] (HADOOP-17125) Using snappy-java in SnappyCodec

2020-09-16 Thread ASF GitHub Bot (Jira)



 [ 
https://issues.apache.org/jira/browse/HADOOP-17125?focusedWorklogId=485532=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-485532
 ]

ASF GitHub Bot logged work on HADOOP-17125:
---

Author: ASF GitHub Bot
Created on: 17/Sep/20 05:07
Start Date: 17/Sep/20 05:07
Worklog Time Spent: 10m 
  Work Description: hadoop-yetus commented on pull request #2297:
URL: https://github.com/apache/hadoop/pull/2297#issuecomment-693863169


   :broken_heart: **-1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime | Comment |
   |::|--:|:|:|
   | +0 :ok: |  reexec  |  29m 58s |  Docker mode activated.  |
   ||| _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  1s |  No case conflicting files 
found.  |
   | +1 :green_heart: |  @author  |   0m  0s |  The patch does not contain any 
@author tags.  |
   | +1 :green_heart: |  test4tests  |   0m  0s |  The patch appears to include 
5 new or modified test files.  |
   ||| _ trunk Compile Tests _ |
   | +0 :ok: |  mvndep  |   3m 21s |  Maven dependency ordering for branch  |
   | +1 :green_heart: |  mvninstall  |  27m 10s |  trunk passed  |
   | +1 :green_heart: |  compile  |  21m 26s |  trunk passed with JDK 
Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1  |
   | +1 :green_heart: |  compile  |  18m 44s |  trunk passed with JDK Private 
Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01  |
   | +1 :green_heart: |  checkstyle  |   2m 54s |  trunk passed  |
   | +1 :green_heart: |  mvnsite  |   2m 27s |  trunk passed  |
   | +1 :green_heart: |  shadedclient  |  19m 56s |  branch has no errors when 
building and testing our client artifacts.  |
   | +1 :green_heart: |  javadoc  |   1m 38s |  trunk passed with JDK 
Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1  |
   | +1 :green_heart: |  javadoc  |   2m 32s |  trunk passed with JDK Private 
Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01  |
   | +0 :ok: |  spotbugs  |   0m 38s |  Used deprecated FindBugs config; 
considering switching to SpotBugs.  |
   | +0 :ok: |  findbugs  |   0m 36s |  branch/hadoop-project no findbugs 
output file (findbugsXml.xml)  |
   | +0 :ok: |  findbugs  |   0m 37s |  branch/hadoop-project-dist no findbugs 
output file (findbugsXml.xml)  |
   ||| _ Patch Compile Tests _ |
   | +0 :ok: |  mvndep  |   0m 30s |  Maven dependency ordering for patch  |
   | +1 :green_heart: |  mvninstall  |   1m 16s |  the patch passed  |
   | +1 :green_heart: |  compile  |  20m 49s |  the patch passed with JDK 
Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1  |
   | -1 :x: |  cc  |  20m 49s |  
root-jdkUbuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1 with JDK 
Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1 generated 17 new + 146 unchanged - 
17 fixed = 163 total (was 163)  |
   | +1 :green_heart: |  golang  |  20m 49s |  the patch passed  |
   | +1 :green_heart: |  javac  |  20m 49s |  the patch passed  |
   | +1 :green_heart: |  compile  |  18m 31s |  the patch passed with JDK 
Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01  |
   | -1 :x: |  cc  |  18m 31s |  
root-jdkPrivateBuild-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 with JDK Private 
Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 generated 11 new + 152 unchanged - 
11 fixed = 163 total (was 163)  |
   | +1 :green_heart: |  golang  |  18m 31s |  the patch passed  |
   | +1 :green_heart: |  javac  |  18m 31s |  the patch passed  |
   | -0 :warning: |  checkstyle  |   2m 49s |  root: The patch generated 6 new 
+ 151 unchanged - 5 fixed = 157 total (was 156)  |
   | +1 :green_heart: |  mvnsite  |   2m 32s |  the patch passed  |
   | +1 :green_heart: |  whitespace  |   0m  0s |  The patch has no whitespace 
issues.  |
   | +1 :green_heart: |  xml  |   0m  5s |  The patch has no ill-formed XML 
file.  |
   | +1 :green_heart: |  shadedclient  |  14m  3s |  patch has no errors when 
building and testing our client artifacts.  |
   | +1 :green_heart: |  javadoc  |   1m 38s |  the patch passed with JDK 
Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1  |
   | +1 :green_heart: |  javadoc  |   2m 42s |  the patch passed with JDK 
Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01  |
   | +0 :ok: |  findbugs  |   0m 34s |  hadoop-project has no data from 
findbugs  |
   | +0 :ok: |  findbugs  |   0m 35s |  hadoop-project-dist has no data from 
findbugs  |
   ||| _ Other Tests _ |
   | +1 :green_heart: |  unit  |   0m 33s |  hadoop-project in the patch 
passed.  |
   | +1 :green_heart: |  unit  |   0m 34s |  hadoop-project-dist in the patch 
passed.  |
   | +1 :green_heart: |  unit  |   9m 49s |  hadoop-common in the patch passed. 
 |
   | +1 :green_heart: |  asflicense  |   0m 50s |  The patch does not generate 
ASF License warnings.  |
   |  |   | 213m 57s |   |
   
   
   | Subsystem | Report/Notes |
   |--:|:-|
   | Docker | ClientAPI=1.40 ServerAPI=1.40 base:

[jira] [Work logged] (HADOOP-17125) Using snappy-java in SnappyCodec

2020-09-16 Thread ASF GitHub Bot (Jira)



 [ 
https://issues.apache.org/jira/browse/HADOOP-17125?focusedWorklogId=485517=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-485517
 ]

ASF GitHub Bot logged work on HADOOP-17125:
---

Author: ASF GitHub Bot
Created on: 17/Sep/20 03:17
Start Date: 17/Sep/20 03:17
Worklog Time Spent: 10m 
  Work Description: dbtsai commented on a change in pull request #2297:
URL: https://github.com/apache/hadoop/pull/2297#discussion_r489922981



##
File path: 
hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/io/compress/snappy/SnappyCompressor.java
##
@@ -48,24 +48,6 @@
   private long bytesRead = 0L;
   private long bytesWritten = 0L;
 
-  private static boolean nativeSnappyLoaded = false;
-  
-  static {
-if (NativeCodeLoader.isNativeCodeLoaded() &&
-NativeCodeLoader.buildSupportsSnappy()) {
-  try {
-initIDs();
-nativeSnappyLoaded = true;
-  } catch (Throwable t) {
-LOG.error("failed to load SnappyCompressor", t);
-  }
-}
-  }
-  
-  public static boolean isNativeCodeLoaded() {
-return nativeSnappyLoaded;

Review comment:
   we need to check if the snappy class is available for SnappyCompressor 
too. 





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 485517)
Time Spent: 11h 40m  (was: 11.5h)

> Using snappy-java in SnappyCodec
> 
>
> Key: HADOOP-17125
> URL: https://issues.apache.org/jira/browse/HADOOP-17125
> Project: Hadoop Common
>  Issue Type: New Feature
>  Components: common
>Affects Versions: 3.3.0
>Reporter: DB Tsai
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 11h 40m
>  Remaining Estimate: 0h
>
> In Hadoop, we use native libs for snappy codec which has several 
> disadvantages:
>  * It requires native *libhadoop* and *libsnappy* to be installed in system 
> *LD_LIBRARY_PATH*, and they have to be installed separately on each node of 
> the clusters, container images, or local test environments which adds huge 
> complexities from deployment point of view. In some environments, it requires 
> compiling the natives from sources which is non-trivial. Also, this approach 
> is platform dependent; the binary may not work in different platform, so it 
> requires recompilation.
>  * It requires extra configuration of *java.library.path* to load the 
> natives, and it results higher application deployment and maintenance cost 
> for users.
> Projects such as *Spark* and *Parquet* use 
> [snappy-java|[https://github.com/xerial/snappy-java]] which is JNI-based 
> implementation. It contains native binaries for Linux, Mac, and IBM in jar 
> file, and it can automatically load the native binaries into JVM from jar 
> without any setup. If a native implementation can not be found for a 
> platform, it can fallback to pure-java implementation of snappy based on 
> [aircompressor|[https://github.com/airlift/aircompressor/tree/master/src/main/java/io/airlift/compress/snappy]].



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org

[jira] [Work logged] (HADOOP-17125) Using snappy-java in SnappyCodec

2020-09-16 Thread ASF GitHub Bot (Jira)



 [ 
https://issues.apache.org/jira/browse/HADOOP-17125?focusedWorklogId=485516=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-485516
 ]

ASF GitHub Bot logged work on HADOOP-17125:
---

Author: ASF GitHub Bot
Created on: 17/Sep/20 03:15
Start Date: 17/Sep/20 03:15
Worklog Time Spent: 10m 
  Work Description: dbtsai commented on a change in pull request #2297:
URL: https://github.com/apache/hadoop/pull/2297#discussion_r489921893



##
File path: 
hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/io/compress/snappy/SnappyDecompressor.java
##
@@ -45,30 +46,19 @@
   private int userBufOff = 0, userBufLen = 0;
   private boolean finished;
 
-  private static boolean nativeSnappyLoaded = false;
-
-  static {
-if (NativeCodeLoader.isNativeCodeLoaded() &&
-NativeCodeLoader.buildSupportsSnappy()) {
-  try {
-initIDs();
-nativeSnappyLoaded = true;
-  } catch (Throwable t) {
-LOG.error("failed to load SnappyDecompressor", t);
-  }
-}
-  }
-  
-  public static boolean isNativeCodeLoaded() {
-return nativeSnappyLoaded;
-  }
-  
   /**
* Creates a new compressor.
*
* @param directBufferSize size of the direct buffer to be used.
*/
   public SnappyDecompressor(int directBufferSize) {
+// `snappy-java` is provided scope. We need to check if its availability.
+try {
+  SnappyLoader.getVersion();
+} catch (Throwable t) {
+  LOG.warn("Error loading snappy libraries: " + t);

Review comment:
   In the original code, we throw a runtime exception if the native snappy 
is not found. Should we follow?
   
   ```
 throw new RuntimeException("native snappy library not available: " +
 "SnappyCompressor has not been loaded.");
   ```





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 485516)
Time Spent: 11.5h  (was: 11h 20m)

> Using snappy-java in SnappyCodec
> 
>
> Key: HADOOP-17125
> URL: https://issues.apache.org/jira/browse/HADOOP-17125
> Project: Hadoop Common
>  Issue Type: New Feature
>  Components: common
>Affects Versions: 3.3.0
>Reporter: DB Tsai
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 11.5h
>  Remaining Estimate: 0h
>
> In Hadoop, we use native libs for snappy codec which has several 
> disadvantages:
>  * It requires native *libhadoop* and *libsnappy* to be installed in system 
> *LD_LIBRARY_PATH*, and they have to be installed separately on each node of 
> the clusters, container images, or local test environments which adds huge 
> complexities from deployment point of view. In some environments, it requires 
> compiling the natives from sources which is non-trivial. Also, this approach 
> is platform dependent; the binary may not work in different platform, so it 
> requires recompilation.
>  * It requires extra configuration of *java.library.path* to load the 
> natives, and it results higher application deployment and maintenance cost 
> for users.
> Projects such as *Spark* and *Parquet* use 
> [snappy-java|[https://github.com/xerial/snappy-java]] which is JNI-based 
> implementation. It contains native binaries for Linux, Mac, and IBM in jar 
> file, and it can automatically load the native binaries into JVM from jar 
> without any setup. If a native implementation can not be found for a 
> platform, it can fallback to pure-java implementation of snappy based on 
> [aircompressor|[https://github.com/airlift/aircompressor/tree/master/src/main/java/io/airlift/compress/snappy]].



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org

[jira] [Work logged] (HADOOP-17125) Using snappy-java in SnappyCodec

2020-09-16 Thread ASF GitHub Bot (Jira)



 [ 
https://issues.apache.org/jira/browse/HADOOP-17125?focusedWorklogId=485489=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-485489
 ]

ASF GitHub Bot logged work on HADOOP-17125:
---

Author: ASF GitHub Bot
Created on: 17/Sep/20 01:36
Start Date: 17/Sep/20 01:36
Worklog Time Spent: 10m 
  Work Description: viirya commented on a change in pull request #2297:
URL: https://github.com/apache/hadoop/pull/2297#discussion_r489860460



##
File path: 
hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/io/compress/snappy/SnappyDecompressor.java
##
@@ -276,13 +258,29 @@ public void end() {
 // do nothing
   }
 
-  private native static void initIDs();
+  private int decompressBytesDirect() throws IOException {
+if (compressedDirectBufLen == 0) {
+  return 0;
+} else {
+  // Set the position and limit of `compressedDirectBuf` for reading
+  compressedDirectBuf.limit(compressedDirectBufLen).position(0);
+  // There is compressed input, decompress it now.
+  int size = Snappy.uncompressedLength((ByteBuffer) compressedDirectBuf);
+  if (size > uncompressedDirectBuf.remaining()) {
+throw new IOException("Could not decompress data. " +
+  "uncompressedDirectBuf length is too small.");

Review comment:
   I found this check is not needed. Removed.





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 485489)
Time Spent: 11h 20m  (was: 11h 10m)

> Using snappy-java in SnappyCodec
> 
>
> Key: HADOOP-17125
> URL: https://issues.apache.org/jira/browse/HADOOP-17125
> Project: Hadoop Common
>  Issue Type: New Feature
>  Components: common
>Affects Versions: 3.3.0
>Reporter: DB Tsai
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 11h 20m
>  Remaining Estimate: 0h
>
> In Hadoop, we use native libs for snappy codec which has several 
> disadvantages:
>  * It requires native *libhadoop* and *libsnappy* to be installed in system 
> *LD_LIBRARY_PATH*, and they have to be installed separately on each node of 
> the clusters, container images, or local test environments which adds huge 
> complexities from deployment point of view. In some environments, it requires 
> compiling the natives from sources which is non-trivial. Also, this approach 
> is platform dependent; the binary may not work in different platform, so it 
> requires recompilation.
>  * It requires extra configuration of *java.library.path* to load the 
> natives, and it results higher application deployment and maintenance cost 
> for users.
> Projects such as *Spark* and *Parquet* use 
> [snappy-java|[https://github.com/xerial/snappy-java]] which is JNI-based 
> implementation. It contains native binaries for Linux, Mac, and IBM in jar 
> file, and it can automatically load the native binaries into JVM from jar 
> without any setup. If a native implementation can not be found for a 
> platform, it can fallback to pure-java implementation of snappy based on 
> [aircompressor|[https://github.com/airlift/aircompressor/tree/master/src/main/java/io/airlift/compress/snappy]].



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org

[jira] [Work logged] (HADOOP-17125) Using snappy-java in SnappyCodec

2020-09-16 Thread ASF GitHub Bot (Jira)



 [ 
https://issues.apache.org/jira/browse/HADOOP-17125?focusedWorklogId=485487=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-485487
 ]

ASF GitHub Bot logged work on HADOOP-17125:
---

Author: ASF GitHub Bot
Created on: 17/Sep/20 01:32
Start Date: 17/Sep/20 01:32
Worklog Time Spent: 10m 
  Work Description: viirya commented on a change in pull request #2297:
URL: https://github.com/apache/hadoop/pull/2297#discussion_r489858089



##
File path: 
hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/io/compress/snappy/SnappyDecompressor.java
##
@@ -276,13 +258,29 @@ public void end() {
 // do nothing
   }
 
-  private native static void initIDs();
+  private int decompressBytesDirect() throws IOException {
+if (compressedDirectBufLen == 0) {
+  return 0;
+} else {
+  // Set the position and limit of `compressedDirectBuf` for reading
+  compressedDirectBuf.limit(compressedDirectBufLen).position(0);
+  // There is compressed input, decompress it now.
+  int size = Snappy.uncompressedLength((ByteBuffer) compressedDirectBuf);
+  if (size > uncompressedDirectBuf.remaining()) {
+throw new IOException("Could not decompress data. " +
+  "uncompressedDirectBuf length is too small.");
+  }
+  size = Snappy.uncompress((ByteBuffer) compressedDirectBuf,
+  (ByteBuffer) uncompressedDirectBuf);
+  compressedDirectBufLen = 0;
+  compressedDirectBuf.limit(compressedDirectBuf.capacity()).position(0);
+  return size;
+}
+  }
 
-  private native int decompressBytesDirect();
-  
   int decompressDirect(ByteBuffer src, ByteBuffer dst) throws IOException {
 assert (this instanceof SnappyDirectDecompressor);
-
+

Review comment:
   ok, reverted tailing whitespace.





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 485487)
Time Spent: 11h 10m  (was: 11h)

> Using snappy-java in SnappyCodec
> 
>
> Key: HADOOP-17125
> URL: https://issues.apache.org/jira/browse/HADOOP-17125
> Project: Hadoop Common
>  Issue Type: New Feature
>  Components: common
>Affects Versions: 3.3.0
>Reporter: DB Tsai
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 11h 10m
>  Remaining Estimate: 0h
>
> In Hadoop, we use native libs for snappy codec which has several 
> disadvantages:
>  * It requires native *libhadoop* and *libsnappy* to be installed in system 
> *LD_LIBRARY_PATH*, and they have to be installed separately on each node of 
> the clusters, container images, or local test environments which adds huge 
> complexities from deployment point of view. In some environments, it requires 
> compiling the natives from sources which is non-trivial. Also, this approach 
> is platform dependent; the binary may not work in different platform, so it 
> requires recompilation.
>  * It requires extra configuration of *java.library.path* to load the 
> natives, and it results higher application deployment and maintenance cost 
> for users.
> Projects such as *Spark* and *Parquet* use 
> [snappy-java|[https://github.com/xerial/snappy-java]] which is JNI-based 
> implementation. It contains native binaries for Linux, Mac, and IBM in jar 
> file, and it can automatically load the native binaries into JVM from jar 
> without any setup. If a native implementation can not be found for a 
> platform, it can fallback to pure-java implementation of snappy based on 
> [aircompressor|[https://github.com/airlift/aircompressor/tree/master/src/main/java/io/airlift/compress/snappy]].



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org

[jira] [Work logged] (HADOOP-17125) Using snappy-java in SnappyCodec

2020-09-16 Thread ASF GitHub Bot (Jira)



 [ 
https://issues.apache.org/jira/browse/HADOOP-17125?focusedWorklogId=485476=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-485476
 ]

ASF GitHub Bot logged work on HADOOP-17125:
---

Author: ASF GitHub Bot
Created on: 17/Sep/20 00:44
Start Date: 17/Sep/20 00:44
Worklog Time Spent: 10m 
  Work Description: viirya commented on a change in pull request #2297:
URL: https://github.com/apache/hadoop/pull/2297#discussion_r489829411



##
File path: hadoop-project-dist/pom.xml
##
@@ -341,7 +340,6 @@
 --openssllib=${openssl.lib}
 
--opensslbinbundle=${bundle.openssl.in.bin}
 --openssllibbundle=${bundle.openssl}
-
--snappybinbundle=${bundle.snappy.in.bin}
 --snappylib=${snappy.lib}
 --snappylibbundle=${bundle.snappy}

Review comment:
   we don't touch hadoop-mapreduce-client-nativetask that needs snappy lib 
still, per https://github.com/apache/hadoop/pull/2201#issuecomment-681687572





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 485476)
Time Spent: 11h  (was: 10h 50m)

> Using snappy-java in SnappyCodec
> 
>
> Key: HADOOP-17125
> URL: https://issues.apache.org/jira/browse/HADOOP-17125
> Project: Hadoop Common
>  Issue Type: New Feature
>  Components: common
>Affects Versions: 3.3.0
>Reporter: DB Tsai
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 11h
>  Remaining Estimate: 0h
>
> In Hadoop, we use native libs for snappy codec which has several 
> disadvantages:
>  * It requires native *libhadoop* and *libsnappy* to be installed in system 
> *LD_LIBRARY_PATH*, and they have to be installed separately on each node of 
> the clusters, container images, or local test environments which adds huge 
> complexities from deployment point of view. In some environments, it requires 
> compiling the natives from sources which is non-trivial. Also, this approach 
> is platform dependent; the binary may not work in different platform, so it 
> requires recompilation.
>  * It requires extra configuration of *java.library.path* to load the 
> natives, and it results higher application deployment and maintenance cost 
> for users.
> Projects such as *Spark* and *Parquet* use 
> [snappy-java|[https://github.com/xerial/snappy-java]] which is JNI-based 
> implementation. It contains native binaries for Linux, Mac, and IBM in jar 
> file, and it can automatically load the native binaries into JVM from jar 
> without any setup. If a native implementation can not be found for a 
> platform, it can fallback to pure-java implementation of snappy based on 
> [aircompressor|[https://github.com/airlift/aircompressor/tree/master/src/main/java/io/airlift/compress/snappy]].



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org

[jira] [Work logged] (HADOOP-17125) Using snappy-java in SnappyCodec

2020-09-16 Thread ASF GitHub Bot (Jira)



 [ 
https://issues.apache.org/jira/browse/HADOOP-17125?focusedWorklogId=485414=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-485414
 ]

ASF GitHub Bot logged work on HADOOP-17125:
---

Author: ASF GitHub Bot
Created on: 16/Sep/20 21:51
Start Date: 16/Sep/20 21:51
Worklog Time Spent: 10m 
  Work Description: dbtsai edited a comment on pull request #2297:
URL: https://github.com/apache/hadoop/pull/2297#issuecomment-693683581


   cc @jlowe @liuml07 @tgravescs  who recently worked on compression codecs for 
more feedback.



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 485414)
Time Spent: 10h 50m  (was: 10h 40m)

> Using snappy-java in SnappyCodec
> 
>
> Key: HADOOP-17125
> URL: https://issues.apache.org/jira/browse/HADOOP-17125
> Project: Hadoop Common
>  Issue Type: New Feature
>  Components: common
>Affects Versions: 3.3.0
>Reporter: DB Tsai
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 10h 50m
>  Remaining Estimate: 0h
>
> In Hadoop, we use native libs for snappy codec which has several 
> disadvantages:
>  * It requires native *libhadoop* and *libsnappy* to be installed in system 
> *LD_LIBRARY_PATH*, and they have to be installed separately on each node of 
> the clusters, container images, or local test environments which adds huge 
> complexities from deployment point of view. In some environments, it requires 
> compiling the natives from sources which is non-trivial. Also, this approach 
> is platform dependent; the binary may not work in different platform, so it 
> requires recompilation.
>  * It requires extra configuration of *java.library.path* to load the 
> natives, and it results higher application deployment and maintenance cost 
> for users.
> Projects such as *Spark* and *Parquet* use 
> [snappy-java|[https://github.com/xerial/snappy-java]] which is JNI-based 
> implementation. It contains native binaries for Linux, Mac, and IBM in jar 
> file, and it can automatically load the native binaries into JVM from jar 
> without any setup. If a native implementation can not be found for a 
> platform, it can fallback to pure-java implementation of snappy based on 
> [aircompressor|[https://github.com/airlift/aircompressor/tree/master/src/main/java/io/airlift/compress/snappy]].



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org

[jira] [Work logged] (HADOOP-17125) Using snappy-java in SnappyCodec

2020-09-16 Thread ASF GitHub Bot (Jira)



 [ 
https://issues.apache.org/jira/browse/HADOOP-17125?focusedWorklogId=485413=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-485413
 ]

ASF GitHub Bot logged work on HADOOP-17125:
---

Author: ASF GitHub Bot
Created on: 16/Sep/20 21:48
Start Date: 16/Sep/20 21:48
Worklog Time Spent: 10m 
  Work Description: dbtsai commented on pull request #2297:
URL: https://github.com/apache/hadoop/pull/2297#issuecomment-693683581


   cc @jlowe and @liuml07 who recently worked on compression codecs for more 
feedback.



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 485413)
Time Spent: 10h 40m  (was: 10.5h)

> Using snappy-java in SnappyCodec
> 
>
> Key: HADOOP-17125
> URL: https://issues.apache.org/jira/browse/HADOOP-17125
> Project: Hadoop Common
>  Issue Type: New Feature
>  Components: common
>Affects Versions: 3.3.0
>Reporter: DB Tsai
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 10h 40m
>  Remaining Estimate: 0h
>
> In Hadoop, we use native libs for snappy codec which has several 
> disadvantages:
>  * It requires native *libhadoop* and *libsnappy* to be installed in system 
> *LD_LIBRARY_PATH*, and they have to be installed separately on each node of 
> the clusters, container images, or local test environments which adds huge 
> complexities from deployment point of view. In some environments, it requires 
> compiling the natives from sources which is non-trivial. Also, this approach 
> is platform dependent; the binary may not work in different platform, so it 
> requires recompilation.
>  * It requires extra configuration of *java.library.path* to load the 
> natives, and it results higher application deployment and maintenance cost 
> for users.
> Projects such as *Spark* and *Parquet* use 
> [snappy-java|[https://github.com/xerial/snappy-java]] which is JNI-based 
> implementation. It contains native binaries for Linux, Mac, and IBM in jar 
> file, and it can automatically load the native binaries into JVM from jar 
> without any setup. If a native implementation can not be found for a 
> platform, it can fallback to pure-java implementation of snappy based on 
> [aircompressor|[https://github.com/airlift/aircompressor/tree/master/src/main/java/io/airlift/compress/snappy]].



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org

[jira] [Work logged] (HADOOP-17125) Using snappy-java in SnappyCodec

2020-09-16 Thread ASF GitHub Bot (Jira)



 [ 
https://issues.apache.org/jira/browse/HADOOP-17125?focusedWorklogId=485313=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-485313
 ]

ASF GitHub Bot logged work on HADOOP-17125:
---

Author: ASF GitHub Bot
Created on: 16/Sep/20 18:50
Start Date: 16/Sep/20 18:50
Worklog Time Spent: 10m 
  Work Description: dbtsai commented on a change in pull request #2297:
URL: https://github.com/apache/hadoop/pull/2297#discussion_r489663248



##
File path: hadoop-common-project/hadoop-common/pom.xml
##
@@ -363,6 +363,10 @@
   wildfly-openssl-java
   provided
 
+
+  org.xerial.snappy

Review comment:
   We can make it provided, and once we create a `hadoop-compression` 
module, we can add back the jar. @viirya since the jar will be provided, we 
need to check if the class exists so we can log it with right message.





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 485313)
Time Spent: 10.5h  (was: 10h 20m)

> Using snappy-java in SnappyCodec
> 
>
> Key: HADOOP-17125
> URL: https://issues.apache.org/jira/browse/HADOOP-17125
> Project: Hadoop Common
>  Issue Type: New Feature
>  Components: common
>Affects Versions: 3.3.0
>Reporter: DB Tsai
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 10.5h
>  Remaining Estimate: 0h
>
> In Hadoop, we use native libs for snappy codec which has several 
> disadvantages:
>  * It requires native *libhadoop* and *libsnappy* to be installed in system 
> *LD_LIBRARY_PATH*, and they have to be installed separately on each node of 
> the clusters, container images, or local test environments which adds huge 
> complexities from deployment point of view. In some environments, it requires 
> compiling the natives from sources which is non-trivial. Also, this approach 
> is platform dependent; the binary may not work in different platform, so it 
> requires recompilation.
>  * It requires extra configuration of *java.library.path* to load the 
> natives, and it results higher application deployment and maintenance cost 
> for users.
> Projects such as *Spark* and *Parquet* use 
> [snappy-java|[https://github.com/xerial/snappy-java]] which is JNI-based 
> implementation. It contains native binaries for Linux, Mac, and IBM in jar 
> file, and it can automatically load the native binaries into JVM from jar 
> without any setup. If a native implementation can not be found for a 
> platform, it can fallback to pure-java implementation of snappy based on 
> [aircompressor|[https://github.com/airlift/aircompressor/tree/master/src/main/java/io/airlift/compress/snappy]].



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org

[jira] [Work logged] (HADOOP-17125) Using snappy-java in SnappyCodec

2020-09-16 Thread ASF GitHub Bot (Jira)



 [ 
https://issues.apache.org/jira/browse/HADOOP-17125?focusedWorklogId=485137=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-485137
 ]

ASF GitHub Bot logged work on HADOOP-17125:
---

Author: ASF GitHub Bot
Created on: 16/Sep/20 13:19
Start Date: 16/Sep/20 13:19
Worklog Time Spent: 10m 
  Work Description: steveloughran commented on a change in pull request 
#2297:
URL: https://github.com/apache/hadoop/pull/2297#discussion_r489422661



##
File path: 
hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/io/compress/snappy/SnappyDecompressor.java
##
@@ -276,13 +258,29 @@ public void end() {
 // do nothing
   }
 
-  private native static void initIDs();
+  private int decompressBytesDirect() throws IOException {
+if (compressedDirectBufLen == 0) {
+  return 0;
+} else {
+  // Set the position and limit of `compressedDirectBuf` for reading
+  compressedDirectBuf.limit(compressedDirectBufLen).position(0);
+  // There is compressed input, decompress it now.
+  int size = Snappy.uncompressedLength((ByteBuffer) compressedDirectBuf);
+  if (size > uncompressedDirectBuf.remaining()) {
+throw new IOException("Could not decompress data. " +
+  "uncompressedDirectBuf length is too small.");
+  }
+  size = Snappy.uncompress((ByteBuffer) compressedDirectBuf,
+  (ByteBuffer) uncompressedDirectBuf);
+  compressedDirectBufLen = 0;
+  compressedDirectBuf.limit(compressedDirectBuf.capacity()).position(0);
+  return size;
+}
+  }
 
-  private native int decompressBytesDirect();
-  
   int decompressDirect(ByteBuffer src, ByteBuffer dst) throws IOException {
 assert (this instanceof SnappyDirectDecompressor);
-
+

Review comment:
   please stop the IDE removing trailing whitespace on lines which haven't 
been edited; complicates life

##
File path: hadoop-common-project/hadoop-common/pom.xml
##
@@ -363,6 +363,10 @@
   wildfly-openssl-java
   provided
 
+
+  org.xerial.snappy

Review comment:
   provided

##
File path: 
hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/io/compress/snappy/SnappyDecompressor.java
##
@@ -276,13 +258,29 @@ public void end() {
 // do nothing
   }
 
-  private native static void initIDs();
+  private int decompressBytesDirect() throws IOException {
+if (compressedDirectBufLen == 0) {
+  return 0;
+} else {
+  // Set the position and limit of `compressedDirectBuf` for reading
+  compressedDirectBuf.limit(compressedDirectBufLen).position(0);
+  // There is compressed input, decompress it now.
+  int size = Snappy.uncompressedLength((ByteBuffer) compressedDirectBuf);
+  if (size > uncompressedDirectBuf.remaining()) {
+throw new IOException("Could not decompress data. " +
+  "uncompressedDirectBuf length is too small.");

Review comment:
   use name of config option which users can tun

##
File path: hadoop-project-dist/pom.xml
##
@@ -341,7 +340,6 @@
 --openssllib=${openssl.lib}
 
--opensslbinbundle=${bundle.openssl.in.bin}
 --openssllibbundle=${bundle.openssl}
-
--snappybinbundle=${bundle.snappy.in.bin}
 --snappylib=${snappy.lib}
 --snappylibbundle=${bundle.snappy}

Review comment:
   what about the others snappylibs?





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 485137)
Time Spent: 10h 20m  (was: 10h 10m)

> Using snappy-java in SnappyCodec
> 
>
> Key: HADOOP-17125
> URL: https://issues.apache.org/jira/browse/HADOOP-17125
> Project: Hadoop Common
>  Issue Type: New Feature
>  Components: common
>Affects Versions: 3.3.0
>Reporter: DB Tsai
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 10h 20m
>  Remaining Estimate: 0h
>
> In Hadoop, we use native libs for snappy codec which has several 
> disadvantages:
>  * It requires native *libhadoop* and *libsnappy* to be installed in system 
> *LD_LIBRARY_PATH*, and they have to be installed separately on each node of 
> the clusters, container images, or local test environments which adds huge 
> complexities from deployment point of view. In some environments, it requires 
> compiling the natives from sources which is non-trivial. Also, this approach

[jira] [Work logged] (HADOOP-17125) Using snappy-java in SnappyCodec

2020-09-16 Thread ASF GitHub Bot (Jira)



 [ 
https://issues.apache.org/jira/browse/HADOOP-17125?focusedWorklogId=485129=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-485129
 ]

ASF GitHub Bot logged work on HADOOP-17125:
---

Author: ASF GitHub Bot
Created on: 16/Sep/20 13:00
Start Date: 16/Sep/20 13:00
Worklog Time Spent: 10m 
  Work Description: hadoop-yetus removed a comment on pull request #2297:
URL: https://github.com/apache/hadoop/pull/2297#issuecomment-693125895







This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 485129)
Time Spent: 10h 10m  (was: 10h)

> Using snappy-java in SnappyCodec
> 
>
> Key: HADOOP-17125
> URL: https://issues.apache.org/jira/browse/HADOOP-17125
> Project: Hadoop Common
>  Issue Type: New Feature
>  Components: common
>Affects Versions: 3.3.0
>Reporter: DB Tsai
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 10h 10m
>  Remaining Estimate: 0h
>
> In Hadoop, we use native libs for snappy codec which has several 
> disadvantages:
>  * It requires native *libhadoop* and *libsnappy* to be installed in system 
> *LD_LIBRARY_PATH*, and they have to be installed separately on each node of 
> the clusters, container images, or local test environments which adds huge 
> complexities from deployment point of view. In some environments, it requires 
> compiling the natives from sources which is non-trivial. Also, this approach 
> is platform dependent; the binary may not work in different platform, so it 
> requires recompilation.
>  * It requires extra configuration of *java.library.path* to load the 
> natives, and it results higher application deployment and maintenance cost 
> for users.
> Projects such as *Spark* and *Parquet* use 
> [snappy-java|[https://github.com/xerial/snappy-java]] which is JNI-based 
> implementation. It contains native binaries for Linux, Mac, and IBM in jar 
> file, and it can automatically load the native binaries into JVM from jar 
> without any setup. If a native implementation can not be found for a 
> platform, it can fallback to pure-java implementation of snappy based on 
> [aircompressor|[https://github.com/airlift/aircompressor/tree/master/src/main/java/io/airlift/compress/snappy]].



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org

[jira] [Work logged] (HADOOP-17125) Using snappy-java in SnappyCodec

2020-09-16 Thread ASF GitHub Bot (Jira)



 [ 
https://issues.apache.org/jira/browse/HADOOP-17125?focusedWorklogId=485126=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-485126
 ]

ASF GitHub Bot logged work on HADOOP-17125:
---

Author: ASF GitHub Bot
Created on: 16/Sep/20 12:55
Start Date: 16/Sep/20 12:55
Worklog Time Spent: 10m 
  Work Description: hadoop-yetus removed a comment on pull request #2297:
URL: https://github.com/apache/hadoop/pull/2297#issuecomment-692976368


   :broken_heart: **-1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime | Comment |
   |::|--:|:|:|
   | +0 :ok: |  reexec  |   0m 32s |  Docker mode activated.  |
   ||| _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  0s |  No case conflicting files 
found.  |
   | +1 :green_heart: |  @author  |   0m  0s |  The patch does not contain any 
@author tags.  |
   | +1 :green_heart: |  test4tests  |   0m  0s |  The patch appears to include 
5 new or modified test files.  |
   ||| _ trunk Compile Tests _ |
   | +0 :ok: |  mvndep  |   3m 22s |  Maven dependency ordering for branch  |
   | +1 :green_heart: |  mvninstall  |  26m  9s |  trunk passed  |
   | +1 :green_heart: |  compile  |  19m 26s |  trunk passed with JDK 
Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1  |
   | +1 :green_heart: |  compile  |  16m 58s |  trunk passed with JDK Private 
Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01  |
   | +1 :green_heart: |  checkstyle  |   2m 44s |  trunk passed  |
   | +1 :green_heart: |  mvnsite  |   2m 44s |  trunk passed  |
   | +1 :green_heart: |  shadedclient  |  20m 27s |  branch has no errors when 
building and testing our client artifacts.  |
   | +1 :green_heart: |  javadoc  |   1m 38s |  trunk passed with JDK 
Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1  |
   | +1 :green_heart: |  javadoc  |   2m 34s |  trunk passed with JDK Private 
Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01  |
   | +0 :ok: |  spotbugs  |   0m 29s |  Used deprecated FindBugs config; 
considering switching to SpotBugs.  |
   | +0 :ok: |  findbugs  |   0m 34s |  branch/hadoop-project no findbugs 
output file (findbugsXml.xml)  |
   | +0 :ok: |  findbugs  |   0m 29s |  branch/hadoop-project-dist no findbugs 
output file (findbugsXml.xml)  |
   ||| _ Patch Compile Tests _ |
   | +0 :ok: |  mvndep  |   0m 33s |  Maven dependency ordering for patch  |
   | +1 :green_heart: |  mvninstall  |   1m 24s |  the patch passed  |
   | +1 :green_heart: |  compile  |  20m 15s |  the patch passed with JDK 
Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1  |
   | -1 :x: |  cc  |  20m 15s |  
root-jdkUbuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1 with JDK 
Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1 generated 36 new + 127 unchanged - 
36 fixed = 163 total (was 163)  |
   | +1 :green_heart: |  golang  |  20m 15s |  the patch passed  |
   | +1 :green_heart: |  javac  |  20m 15s |  the patch passed  |
   | +1 :green_heart: |  compile  |  19m 48s |  the patch passed with JDK 
Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01  |
   | -1 :x: |  cc  |  19m 48s |  
root-jdkPrivateBuild-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 with JDK Private 
Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 generated 36 new + 127 unchanged - 
36 fixed = 163 total (was 163)  |
   | +1 :green_heart: |  golang  |  19m 48s |  the patch passed  |
   | +1 :green_heart: |  javac  |  19m 48s |  the patch passed  |
   | -0 :warning: |  checkstyle  |   3m  3s |  root: The patch generated 1 new 
+ 151 unchanged - 5 fixed = 152 total (was 156)  |
   | +1 :green_heart: |  mvnsite  |   2m 29s |  the patch passed  |
   | +1 :green_heart: |  whitespace  |   0m  0s |  The patch has no whitespace 
issues.  |
   | +1 :green_heart: |  xml  |   0m  4s |  The patch has no ill-formed XML 
file.  |
   | +1 :green_heart: |  shadedclient  |  16m 27s |  patch has no errors when 
building and testing our client artifacts.  |
   | +1 :green_heart: |  javadoc  |   1m 27s |  the patch passed with JDK 
Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1  |
   | +1 :green_heart: |  javadoc  |   2m 33s |  the patch passed with JDK 
Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01  |
   | +0 :ok: |  findbugs  |   0m 27s |  hadoop-project has no data from 
findbugs  |
   | +0 :ok: |  findbugs  |   0m 24s |  hadoop-project-dist has no data from 
findbugs  |
   ||| _ Other Tests _ |
   | +1 :green_heart: |  unit  |   0m 24s |  hadoop-project in the patch 
passed.  |
   | +1 :green_heart: |  unit  |   0m 26s |  hadoop-project-dist in the patch 
passed.  |
   | +1 :green_heart: |  unit  |  10m 18s |  hadoop-common in the patch passed. 
 |
   | +1 :green_heart: |  asflicense  |   0m 53s |  The patch does not generate 
ASF License warnings.  |
   |  |   | 182m 50s |   |
   
   
   | Subsystem | Report/Notes |
   |--:|:-|
   | Docker | ClientAPI=1.40 ServerAPI=1.40 base:

[jira] [Work logged] (HADOOP-17125) Using snappy-java in SnappyCodec

2020-09-15 Thread ASF GitHub Bot (Jira)



 [ 
https://issues.apache.org/jira/browse/HADOOP-17125?focusedWorklogId=484870=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-484870
 ]

ASF GitHub Bot logged work on HADOOP-17125:
---

Author: ASF GitHub Bot
Created on: 16/Sep/20 02:09
Start Date: 16/Sep/20 02:09
Worklog Time Spent: 10m 
  Work Description: hadoop-yetus commented on pull request #2297:
URL: https://github.com/apache/hadoop/pull/2297#issuecomment-693125895


   :broken_heart: **-1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime | Comment |
   |::|--:|:|:|
   | +0 :ok: |  reexec  |  39m 42s |  Docker mode activated.  |
   ||| _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  0s |  No case conflicting files 
found.  |
   | +1 :green_heart: |  @author  |   0m  0s |  The patch does not contain any 
@author tags.  |
   | +1 :green_heart: |  test4tests  |   0m  0s |  The patch appears to include 
5 new or modified test files.  |
   ||| _ trunk Compile Tests _ |
   | +0 :ok: |  mvndep  |   3m 29s |  Maven dependency ordering for branch  |
   | +1 :green_heart: |  mvninstall  |  35m 48s |  trunk passed  |
   | +1 :green_heart: |  compile  |  29m 25s |  trunk passed with JDK 
Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1  |
   | +1 :green_heart: |  compile  |  24m 38s |  trunk passed with JDK Private 
Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01  |
   | +1 :green_heart: |  checkstyle  |   3m 51s |  trunk passed  |
   | +1 :green_heart: |  mvnsite  |   3m  2s |  trunk passed  |
   | +1 :green_heart: |  shadedclient  |  26m 19s |  branch has no errors when 
building and testing our client artifacts.  |
   | +1 :green_heart: |  javadoc  |   1m 44s |  trunk passed with JDK 
Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1  |
   | +1 :green_heart: |  javadoc  |   2m 53s |  trunk passed with JDK Private 
Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01  |
   | +0 :ok: |  spotbugs  |   0m 30s |  Used deprecated FindBugs config; 
considering switching to SpotBugs.  |
   | +0 :ok: |  findbugs  |   0m 36s |  branch/hadoop-project no findbugs 
output file (findbugsXml.xml)  |
   | +0 :ok: |  findbugs  |   0m 30s |  branch/hadoop-project-dist no findbugs 
output file (findbugsXml.xml)  |
   ||| _ Patch Compile Tests _ |
   | +0 :ok: |  mvndep  |   1m  6s |  Maven dependency ordering for patch  |
   | +1 :green_heart: |  mvninstall  |   1m 43s |  the patch passed  |
   | +1 :green_heart: |  compile  |  28m 55s |  the patch passed with JDK 
Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1  |
   | -1 :x: |  cc  |  28m 55s |  
root-jdkUbuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1 with JDK 
Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1 generated 14 new + 149 unchanged - 
14 fixed = 163 total (was 163)  |
   | +1 :green_heart: |  golang  |  28m 55s |  the patch passed  |
   | +1 :green_heart: |  javac  |  28m 55s |  the patch passed  |
   | +1 :green_heart: |  compile  |  24m 49s |  the patch passed with JDK 
Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01  |
   | -1 :x: |  cc  |  24m 49s |  
root-jdkPrivateBuild-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 with JDK Private 
Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 generated 21 new + 142 unchanged - 
21 fixed = 163 total (was 163)  |
   | +1 :green_heart: |  golang  |  24m 49s |  the patch passed  |
   | +1 :green_heart: |  javac  |  24m 49s |  the patch passed  |
   | -0 :warning: |  checkstyle  |   3m 42s |  root: The patch generated 6 new 
+ 151 unchanged - 5 fixed = 157 total (was 156)  |
   | +1 :green_heart: |  mvnsite  |   3m  5s |  the patch passed  |
   | +1 :green_heart: |  whitespace  |   0m  0s |  The patch has no whitespace 
issues.  |
   | +1 :green_heart: |  xml  |   0m  5s |  The patch has no ill-formed XML 
file.  |
   | +1 :green_heart: |  shadedclient  |  17m 48s |  patch has no errors when 
building and testing our client artifacts.  |
   | +1 :green_heart: |  javadoc  |   1m 48s |  the patch passed with JDK 
Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1  |
   | +1 :green_heart: |  javadoc  |   2m 52s |  the patch passed with JDK 
Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01  |
   | +0 :ok: |  findbugs  |   0m 33s |  hadoop-project has no data from 
findbugs  |
   | +0 :ok: |  findbugs  |   0m 33s |  hadoop-project-dist has no data from 
findbugs  |
   ||| _ Other Tests _ |
   | +1 :green_heart: |  unit  |   0m 31s |  hadoop-project in the patch 
passed.  |
   | +1 :green_heart: |  unit  |   0m 31s |  hadoop-project-dist in the patch 
passed.  |
   | +1 :green_heart: |  unit  |  12m 25s |  hadoop-common in the patch passed. 
 |
   | +1 :green_heart: |  asflicense  |   1m  2s |  The patch does not generate 
ASF License warnings.  |
   |  |   | 278m 27s |   |
   
   
   | Subsystem | Report/Notes |
   |--:|:-|
   | Docker | ClientAPI=1.40 ServerAPI=1.40 base:

[jira] [Work logged] (HADOOP-17125) Using snappy-java in SnappyCodec

2020-09-15 Thread ASF GitHub Bot (Jira)



 [ 
https://issues.apache.org/jira/browse/HADOOP-17125?focusedWorklogId=484809=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-484809
 ]

ASF GitHub Bot logged work on HADOOP-17125:
---

Author: ASF GitHub Bot
Created on: 15/Sep/20 22:53
Start Date: 15/Sep/20 22:53
Worklog Time Spent: 10m 
  Work Description: dbtsai edited a comment on pull request #2297:
URL: https://github.com/apache/hadoop/pull/2297#issuecomment-693003698


   cc @xerial @steveloughran and @jojochuang for review.



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 484809)
Time Spent: 9h 40m  (was: 9.5h)

> Using snappy-java in SnappyCodec
> 
>
> Key: HADOOP-17125
> URL: https://issues.apache.org/jira/browse/HADOOP-17125
> Project: Hadoop Common
>  Issue Type: New Feature
>  Components: common
>Affects Versions: 3.3.0
>Reporter: DB Tsai
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 9h 40m
>  Remaining Estimate: 0h
>
> In Hadoop, we use native libs for snappy codec which has several 
> disadvantages:
>  * It requires native *libhadoop* and *libsnappy* to be installed in system 
> *LD_LIBRARY_PATH*, and they have to be installed separately on each node of 
> the clusters, container images, or local test environments which adds huge 
> complexities from deployment point of view. In some environments, it requires 
> compiling the natives from sources which is non-trivial. Also, this approach 
> is platform dependent; the binary may not work in different platform, so it 
> requires recompilation.
>  * It requires extra configuration of *java.library.path* to load the 
> natives, and it results higher application deployment and maintenance cost 
> for users.
> Projects such as *Spark* and *Parquet* use 
> [snappy-java|[https://github.com/xerial/snappy-java]] which is JNI-based 
> implementation. It contains native binaries for Linux, Mac, and IBM in jar 
> file, and it can automatically load the native binaries into JVM from jar 
> without any setup. If a native implementation can not be found for a 
> platform, it can fallback to pure-java implementation of snappy based on 
> [aircompressor|[https://github.com/airlift/aircompressor/tree/master/src/main/java/io/airlift/compress/snappy]].



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org

[jira] [Work logged] (HADOOP-17125) Using snappy-java in SnappyCodec

2020-09-15 Thread ASF GitHub Bot (Jira)



 [ 
https://issues.apache.org/jira/browse/HADOOP-17125?focusedWorklogId=484805=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-484805
 ]

ASF GitHub Bot logged work on HADOOP-17125:
---

Author: ASF GitHub Bot
Created on: 15/Sep/20 22:48
Start Date: 15/Sep/20 22:48
Worklog Time Spent: 10m 
  Work Description: dbtsai commented on a change in pull request #2297:
URL: https://github.com/apache/hadoop/pull/2297#discussion_r489045005



##
File path: 
hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/io/compress/snappy/TestSnappyCompressorDecompressor.java
##
@@ -441,4 +442,43 @@ public void doWork() throws Exception {
 
 ctx.waitFor(6);
   }
+
+  @Test
+  public void testSnappyCompatibility() throws Exception {
+// HADOOP-17125. Using snappy-java in SnappyCodec. These strings are raw 
data and compressed data
+// using previous native Snappy codec. We use updated Snappy codec to 
decode it and check if it
+// matches.
+String rawData = 
"010a06030a040a0c0109020c0a010204020d02000b010701080605080b090902060a080502060a0d06070908080a0c0105030904090d05090800040c090c0d0d0804000d00040b0b0d010d060907020a030a0c0900040905080107040d0c01060a0b09070a04000b01040b09000e0e00020b06050b060e030e0a07050d06050d";
+String compressed = 
"8001f07f010a06030a040a0c0109020c0a010204020d02000b010701080605080b090902060a080502060a0d06070908080a0c0105030904090d05090800040c090c0d0d0804000d00040b0b0d010d060907020a030a0c0900040905080107040d0c01060a0b09070a04000b01040b09000e0e00020b06050b060e030e0a07050d06050d";

Review comment:
   Maybe you can comment the compressed data is generated by current native 
snappy codec?





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 484805)
Time Spent: 9.5h  (was: 9h 20m)

> Using snappy-java in SnappyCodec
> 
>
> Key: HADOOP-17125
> URL: https://issues.apache.org/jira/browse/HADOOP-17125
> Project: Hadoop Common
>  Issue Type: New Feature
>  Components: common
>Affects Versions: 3.3.0
>Reporter: DB Tsai
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 9.5h
>  Remaining Estimate: 0h
>
> In Hadoop, we use native libs for snappy codec which has several 
> disadvantages:
>  * It requires native *libhadoop* and *libsnappy* to be installed in system 
> *LD_LIBRARY_PATH*, and they have to be installed separately on each node of 
> the clusters, container images, or local test environments which adds huge 
> complexities from deployment point of view. In some environments, it requires 
> compiling the natives from sources which is non-trivial. Also, this approach 
> is platform dependent; the binary may not work in different platform, so it 
> requires recompilation.
>  * It requires extra configuration of *java.library.path* to load the 
> natives, and it results higher application deployment and maintenance cost 
> for users.
> Projects such as *Spark* and *Parquet* use 
> [snappy-java|[https://github.com/xerial/snappy-java]] which is JNI-based 
> implementation. It contains native binaries for Linux, Mac, and IBM in jar 
> file, and it can automatically load the native binaries into JVM from jar 
> without any setup. If a native implementation can not be found for a 
> platform, it can fallback to pure-java implementation of snappy based on 
> [aircompressor|[https://github.com/airlift/aircompressor/tree/master/src/main/java/io/airlift/compress/snappy]].



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org

[jira] [Work logged] (HADOOP-17125) Using snappy-java in SnappyCodec

2020-09-15 Thread ASF GitHub Bot (Jira)



 [ 
https://issues.apache.org/jira/browse/HADOOP-17125?focusedWorklogId=484787=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-484787
 ]

ASF GitHub Bot logged work on HADOOP-17125:
---

Author: ASF GitHub Bot
Created on: 15/Sep/20 22:03
Start Date: 15/Sep/20 22:03
Worklog Time Spent: 10m 
  Work Description: dbtsai commented on pull request #2297:
URL: https://github.com/apache/hadoop/pull/2297#issuecomment-693003698


   cc @steveloughran and @jojochuang for review.



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 484787)
Time Spent: 9h 20m  (was: 9h 10m)

> Using snappy-java in SnappyCodec
> 
>
> Key: HADOOP-17125
> URL: https://issues.apache.org/jira/browse/HADOOP-17125
> Project: Hadoop Common
>  Issue Type: New Feature
>  Components: common
>Affects Versions: 3.3.0
>Reporter: DB Tsai
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 9h 20m
>  Remaining Estimate: 0h
>
> In Hadoop, we use native libs for snappy codec which has several 
> disadvantages:
>  * It requires native *libhadoop* and *libsnappy* to be installed in system 
> *LD_LIBRARY_PATH*, and they have to be installed separately on each node of 
> the clusters, container images, or local test environments which adds huge 
> complexities from deployment point of view. In some environments, it requires 
> compiling the natives from sources which is non-trivial. Also, this approach 
> is platform dependent; the binary may not work in different platform, so it 
> requires recompilation.
>  * It requires extra configuration of *java.library.path* to load the 
> natives, and it results higher application deployment and maintenance cost 
> for users.
> Projects such as *Spark* and *Parquet* use 
> [snappy-java|[https://github.com/xerial/snappy-java]] which is JNI-based 
> implementation. It contains native binaries for Linux, Mac, and IBM in jar 
> file, and it can automatically load the native binaries into JVM from jar 
> without any setup. If a native implementation can not be found for a 
> platform, it can fallback to pure-java implementation of snappy based on 
> [aircompressor|[https://github.com/airlift/aircompressor/tree/master/src/main/java/io/airlift/compress/snappy]].



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org

1 2 >

1 - 100 of 155 matches

Mail list logo