[jira] [Comment Edited] (HADOOP-15129) Datanode caches namenode DNS lookup failure and cannot startup

2018-12-12 Thread Ravi Prakash (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-15129?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16718798#comment-16718798
 ] 

Ravi Prakash edited comment on HADOOP-15129 at 12/12/18 11:10 AM:
--

Hi Karthik! Thanks for your contribution. Could you please rebase the patch to 
the latest trunk? I usually apply patches using
{code:java}
$ git apply {code}
A few suggestions:
 # Could you please use short descriptions in JIRA? [I was told a long time 
ago|https://issues.apache.org/jira/browse/HDFS-2011?focusedCommentId=13041707&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13041707].
 :)
 # When using JIRA numbers, could you please write HDFS-8068 (instead of just 
8068) because issues often cut across several different projects, and this way 
JIRA creates nice links for viewers to click on?

Patches are usually committed to trunk *first* and then a (possibly) different 
version of the patch may be committed to earlier branches like branch-2. So 
technically you could have used neat Lambdas in the trunk patch (see the sketch 
below). ;) It's a nit though.
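
(A hedged illustration of the Lambda point, not code from the patch: trunk 
builds with Java 8, so a callback there could be a lambda, while a Java 7 
branch-2 backport would need the anonymous-class form. The {{server}} value is 
just an example taken from the logs below.)
{code:java}
String server = "cluster-32f5-m:8020"; // example value from the logs below

// Trunk (Java 8): a lambda is enough.
Runnable java8Style = () ->
    System.out.println("Problem connecting to server: " + server);

// branch-2 (Java 7): the same logic in anonymous-class form.
Runnable java7Style = new Runnable() {
  @Override
  public void run() {
    System.out.println("Problem connecting to server: " + server);
  }
};
{code}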

I'm trying to find the wiki page that tried to explain certain errors. I'm 
afraid I rarely found those pages useful (probably because we never really 
expanded on them), so I'm fine with a more helpful error in the logs.

 


was (Author: raviprak):
Hi Karthik! Thanks for your contribution. Could you please rebase the patch to 
the latest trunk? I usually apply patches using
{code:java}
$ git apply {code}
A few suggestions:
 # Could you please use short descriptions in JIRA? I was told a long time ago. 
:)
 # When using JIRA numbers, could you please write HDFS-8068 (instead of just 
8068) because issues often cut across several different projects, and this way 
JIRA creates nice links for viewers to click on?

Patches are usually committed to trunk *first* and then a (possibly) different 
version of the patch may be committed to earlier branches like branch-2. So 
technically you could have used neat Lambdas in the trunk patch. ;) It's a nit 
though.

I'm trying to find the wiki page that tried to explain certain errors. I'm 
afraid I rarely found those pages useful (probably because we never really 
expanded on them), so I'm fine with a more helpful error in the logs.

 

> Datanode caches namenode DNS lookup failure and cannot startup
> --
>
> Key: HADOOP-15129
> URL: https://issues.apache.org/jira/browse/HADOOP-15129
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: ipc
>Affects Versions: 2.8.2
> Environment: Google Compute Engine.
> I'm using Java 8, Debian 8, Hadoop 2.8.2.
>Reporter: Karthik Palaniappan
>Assignee: Karthik Palaniappan
>Priority: Minor
> Attachments: HADOOP-15129.001.patch, HADOOP-15129.002.patch
>
>
> On startup, the Datanode creates an InetSocketAddress to register with each 
> namenode. Though there are retries on connection failure throughout the 
> stack, the same InetSocketAddress is reused.
> InetSocketAddress is an interesting class, because it resolves DNS names to 
> IP addresses on construction, and it is never refreshed. Hadoop re-creates an 
> InetSocketAddress in some cases just in case the remote IP has changed for a 
> particular DNS name: https://issues.apache.org/jira/browse/HADOOP-7472.
> Anyway, on startup, you can see the Datanode log: "Namenode...remains 
> unresolved" -- referring to the fact that DNS lookup failed.
> {code:java}
> 2017-11-02 16:01:55,115 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: 
> Refresh request received for nameservices: null
> 2017-11-02 16:01:55,153 WARN org.apache.hadoop.hdfs.DFSUtilClient: Namenode 
> for null remains unresolved for ID null. Check your hdfs-site.xml file to 
> ensure namenodes are configured properly.
> 2017-11-02 16:01:55,156 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: 
> Starting BPOfferServices for nameservices: 
> 2017-11-02 16:01:55,169 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: 
> Block pool  (Datanode Uuid unassigned) service to 
> cluster-32f5-m:8020 starting to offer service
> {code}
> The Datanode then proceeds to use this unresolved address, as it may work if 
> the DN is configured to use a proxy. Since I'm not using a proxy, it forever 
> prints out this message:
> {code:java}
> 2017-12-15 00:13:40,712 WARN org.apache.hadoop.hdfs.server.datanode.DataNode: 
> Problem connecting to server: cluster-32f5-m:8020
> 2017-12-15 00:13:45,712 WARN org.apache.hadoop.hdfs.server.datanode.DataNode: 
> Problem connecting to server: cluster-32f5-m:8020
> 2017-12-15 00:13:50,712 WARN org.apache.hadoop.hdfs.server.datanode.DataNode: 
> Problem connecting to server: cluster-32f5-m:8020
> 2017-12-15 00:13:55,713 WARN 

[jira] [Commented] (HADOOP-15129) Datanode caches namenode DNS lookup failure and cannot startup

2018-12-12 Thread Ravi Prakash (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-15129?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16718798#comment-16718798
 ] 

Ravi Prakash commented on HADOOP-15129:
---

Hi Karthik! Thanks for your contribution. Could you please rebase the patch to 
the latest trunk? I usually apply patches using
{code:java}
$ git apply {code}
A few suggestions:
 # Could you please use short descriptions in JIRA? I was told a long time ago. 
:)
 # When using JIRA numbers, could you please write HDFS-8068 (instead of just 
8068) because issues often cut across several different projects, and this way 
JIRA creates nice links for viewers to click on?

Patches are usually committed to trunk *first* and then a (possibly) different 
version of the patch may be committed to earlier branches like branch-2. So 
technically you could have used neat Lambdas in the trunk patch. ;) It's a nit 
though.

I'm trying to find the wiki page that tried to explain certain errors. I'm 
afraid I rarely found those pages useful (probably because we never really 
expanded on them), so I'm fine with a more helpful error in the logs.

 

> Datanode caches namenode DNS lookup failure and cannot startup
> --
>
> Key: HADOOP-15129
> URL: https://issues.apache.org/jira/browse/HADOOP-15129
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: ipc
>Affects Versions: 2.8.2
> Environment: Google Compute Engine.
> I'm using Java 8, Debian 8, Hadoop 2.8.2.
>Reporter: Karthik Palaniappan
>Assignee: Karthik Palaniappan
>Priority: Minor
> Attachments: HADOOP-15129.001.patch, HADOOP-15129.002.patch
>
>
> On startup, the Datanode creates an InetSocketAddress to register with each 
> namenode. Though there are retries on connection failure throughout the 
> stack, the same InetSocketAddress is reused.
> InetSocketAddress is an interesting class, because it resolves DNS names to 
> IP addresses on construction, and it is never refreshed. Hadoop re-creates an 
> InetSocketAddress in some cases just in case the remote IP has changed for a 
> particular DNS name: https://issues.apache.org/jira/browse/HADOOP-7472.
> Anyway, on startup, you can see the Datanode log: "Namenode...remains 
> unresolved" -- referring to the fact that DNS lookup failed.
> {code:java}
> 2017-11-02 16:01:55,115 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: 
> Refresh request received for nameservices: null
> 2017-11-02 16:01:55,153 WARN org.apache.hadoop.hdfs.DFSUtilClient: Namenode 
> for null remains unresolved for ID null. Check your hdfs-site.xml file to 
> ensure namenodes are configured properly.
> 2017-11-02 16:01:55,156 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: 
> Starting BPOfferServices for nameservices: 
> 2017-11-02 16:01:55,169 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: 
> Block pool  (Datanode Uuid unassigned) service to 
> cluster-32f5-m:8020 starting to offer service
> {code}
> The Datanode then proceeds to use this unresolved address, as it may work if 
> the DN is configured to use a proxy. Since I'm not using a proxy, it forever 
> prints out this message:
> {code:java}
> 2017-12-15 00:13:40,712 WARN org.apache.hadoop.hdfs.server.datanode.DataNode: 
> Problem connecting to server: cluster-32f5-m:8020
> 2017-12-15 00:13:45,712 WARN org.apache.hadoop.hdfs.server.datanode.DataNode: 
> Problem connecting to server: cluster-32f5-m:8020
> 2017-12-15 00:13:50,712 WARN org.apache.hadoop.hdfs.server.datanode.DataNode: 
> Problem connecting to server: cluster-32f5-m:8020
> 2017-12-15 00:13:55,713 WARN org.apache.hadoop.hdfs.server.datanode.DataNode: 
> Problem connecting to server: cluster-32f5-m:8020
> 2017-12-15 00:14:00,713 WARN org.apache.hadoop.hdfs.server.datanode.DataNode: 
> Problem connecting to server: cluster-32f5-m:8020
> {code}
> Unfortunately, the log doesn't contain the exception that triggered it, but 
> the culprit is actually in IPC Client: 
> https://github.com/apache/hadoop/blob/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/ipc/Client.java#L444.
> This line was introduced in https://issues.apache.org/jira/browse/HADOOP-487 
> to give a clear error message when somebody misspells an address.
> However, the fix in HADOOP-7472 doesn't apply here, because that code happens 
> in Client#getConnection after the Connection is constructed.
> My proposed fix (will attach a patch) is to move this exception out of the 
> constructor and into a place that will trigger HADOOP-7472's logic to 
> re-resolve addresses. If the DNS failure was temporary, this will allow the 
> connection to succeed. If not, the connection will fail after ipc client 
> retries (default 10 seconds worth of retries).
> I want to fix this in ipc client rather than just in Datanode 
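
(A minimal sketch of the re-resolution idea described above; illustrative 
only. The {{server}} field and method names are assumptions, not taken from 
the actual patch.)
{code:java}
import java.net.InetSocketAddress;

class AddressHolder {
  private InetSocketAddress server; // assumed field holding the NN address

  // InetSocketAddress resolves DNS once, at construction, so a failed lookup
  // is cached for the object's lifetime. Building a fresh instance retries
  // resolution before the next connection attempt.
  synchronized void reResolveIfNeeded() {
    if (server.isUnresolved()) {
      InetSocketAddress fresh =
          new InetSocketAddress(server.getHostName(), server.getPort());
      if (!fresh.isUnresolved()) {
        server = fresh; // DNS works again; later retries use the new address
      }
    }
  }
}
{code}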

[jira] [Comment Edited] (HADOOP-15129) Datanode caches namenode DNS lookup failure and cannot startup

2018-12-12 Thread Ravi Prakash (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-15129?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16718798#comment-16718798
 ] 

Ravi Prakash edited comment on HADOOP-15129 at 12/12/18 11:13 AM:
--

Hi Karthik! Thanks for your contribution. Could you please rebase the patch to 
the latest trunk? I usually apply patches using
{code:java}
$ git apply {code}
A few suggestions:
 # Could you please use short descriptions in JIRA? I was told a long time ago. 
:)
 # When using JIRA numbers, could you please write HDFS-8068 (instead of just 
8068) because issues often cut across several different projects, and this way 
JIRA creates nice links for viewers to click on?

Patches are usually committed to trunk *first* and then a (possibly) different 
version of the patch may be committed to earlier branches like branch-2. So 
technically you could have used neat Lambdas in the trunk patch. ;) It's a nit 
though.

I'm trying to find the wiki page that tried to explain certain errors. I'm 
afraid I rarely found those pages useful (probably because we never really 
expanded on them), so I'm fine with a more helpful error in the logs.

Could you please also comment on whether you have been running with this patch 
in production for any amount of time and seen / not seen any issues with it?

I concur that this is extremely important code, so it behooves us to tread very 
carefully. 


was (Author: raviprak):
Hi Karthik! Thanks for your contribution. Could you please rebase the patch to 
the latest trunk? I usually apply patches using
{code:java}
$ git apply {code}
A few suggestions:
 # Could you please use short descriptions in JIRA? [I was told a long time 
ago|https://issues.apache.org/jira/browse/HDFS-2011?focusedCommentId=13041707&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13041707].
 :)
 # When using JIRA numbers, could you please write HDFS-8068 (instead of just 
8068) because issues often cut across several different projects, and this way 
JIRA creates nice links for viewers to click on?

Patches are usually committed to trunk *first* and then a (possibly) different 
version of the patch may be committed to earlier branches like branch-2. So 
technically you could have used neat Lambdas in the trunk patch. ;) It's a nit 
though.

I'm trying to find the wiki page that tried to explain certain errors. I'm 
afraid I rarely found those pages useful (probably because we never really 
expanded on them), so I'm fine with a more helpful error in the logs.

 

> Datanode caches namenode DNS lookup failure and cannot startup
> --
>
> Key: HADOOP-15129
> URL: https://issues.apache.org/jira/browse/HADOOP-15129
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: ipc
>Affects Versions: 2.8.2
> Environment: Google Compute Engine.
> I'm using Java 8, Debian 8, Hadoop 2.8.2.
>Reporter: Karthik Palaniappan
>Assignee: Karthik Palaniappan
>Priority: Minor
> Attachments: HADOOP-15129.001.patch, HADOOP-15129.002.patch
>
>
> On startup, the Datanode creates an InetSocketAddress to register with each 
> namenode. Though there are retries on connection failure throughout the 
> stack, the same InetSocketAddress is reused.
> InetSocketAddress is an interesting class, because it resolves DNS names to 
> IP addresses on construction, and it is never refreshed. Hadoop re-creates an 
> InetSocketAddress in some cases just in case the remote IP has changed for a 
> particular DNS name: https://issues.apache.org/jira/browse/HADOOP-7472.
> Anyway, on startup, you can see the Datanode log: "Namenode...remains 
> unresolved" -- referring to the fact that DNS lookup failed.
> {code:java}
> 2017-11-02 16:01:55,115 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: 
> Refresh request received for nameservices: null
> 2017-11-02 16:01:55,153 WARN org.apache.hadoop.hdfs.DFSUtilClient: Namenode 
> for null remains unresolved for ID null. Check your hdfs-site.xml file to 
> ensure namenodes are configured properly.
> 2017-11-02 16:01:55,156 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: 
> Starting BPOfferServices for nameservices: 
> 2017-11-02 16:01:55,169 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: 
> Block pool  (Datanode Uuid unassigned) service to 
> cluster-32f5-m:8020 starting to offer service
> {code}
> The Datanode then proceeds to use this unresolved address, as it may work if 
> the DN is configured to use a proxy. Since I'm not using a proxy, it forever 
> prints out this message:
> {code:java}
> 2017-12-15 00:13:40,712 WARN org.apache.hadoop.hdfs.server.datanode.DataNode: 
> Problem connecting to server: cluster-32f5-m:8020
> 2017-12-15 00:13:45,712 WARN 

[jira] [Commented] (HADOOP-14597) Native compilation broken with OpenSSL-1.1.0 because EVP_CIPHER_CTX has been made opaque

2018-01-11 Thread Ravi Prakash (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-14597?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16322808#comment-16322808
 ] 

Ravi Prakash commented on HADOOP-14597:
---

Hi Miklos! Sorry about the late reply. I don't have a test setup with 
encryption turned on. Even though the tests pass, I don't know how good the 
coverage is. Hence my hesitation to port it to branch-2.

> Native compilation broken with OpenSSL-1.1.0 because EVP_CIPHER_CTX has been 
> made opaque
> 
>
> Key: HADOOP-14597
> URL: https://issues.apache.org/jira/browse/HADOOP-14597
> Project: Hadoop Common
>  Issue Type: Improvement
>Affects Versions: 3.0.0-alpha4
> Environment: openssl-1.1.0
>Reporter: Ravi Prakash
>Assignee: Ravi Prakash
> Fix For: 3.0.0-beta1
>
> Attachments: HADOOP-14597.00.patch, HADOOP-14597.01.patch, 
> HADOOP-14597.02.patch, HADOOP-14597.03.patch, HADOOP-14597.04.patch
>
>
> Trying to build Hadoop trunk on Fedora 26 which has openssl-devel-1.1.0 fails 
> with this error
> {code}[WARNING] 
> /home/raviprak/Code/hadoop/trunk/hadoop-common-project/hadoop-common/src/main/native/src/org/apache/hadoop/crypto/OpensslCipher.c:
>  In function ‘check_update_max_output_len’:
> [WARNING] 
> /home/raviprak/Code/hadoop/trunk/hadoop-common-project/hadoop-common/src/main/native/src/org/apache/hadoop/crypto/OpensslCipher.c:256:14:
>  error: dereferencing pointer to incomplete type ‘EVP_CIPHER_CTX {aka struct 
> evp_cipher_ctx_st}’
> [WARNING]if (context->flags & EVP_CIPH_NO_PADDING) {
> [WARNING]   ^~
> {code}
> https://github.com/openssl/openssl/issues/962 mattcaswell says
> {quote}
> One of the primary differences between master (OpenSSL 1.1.0) and the 1.0.2 
> version is that many types have been made opaque, i.e. applications are no 
> longer allowed to look inside the internals of the structures
> {quote}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-15033) Use java.util.zip.CRC32C for Java 9 and above

2017-11-13 Thread Ravi Prakash (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-15033?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16249865#comment-16249865
 ] 

Ravi Prakash commented on HADOOP-15033:
---

Hi Dmitry! Thank you for running the tests. I'm sorry I'm not too familiar with 
this area. Do you know if the results for the checksum were the same? I 
remember there were some inconsistencies between Hadoop's implementation of 
CRC32C and the system libraries. If that were to happen, perfectly good 
blocks might be marked as corrupt on an upgrade. Have you taken a look at 
https://issues.apache.org/jira/browse/HDFS-3528 ?

> Use java.util.zip.CRC32C for Java 9 and above
> -
>
> Key: HADOOP-15033
> URL: https://issues.apache.org/jira/browse/HADOOP-15033
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: performance, util
>Affects Versions: 3.0.0
>Reporter: Dmitry Chuyko
>
> java.util.zip.CRC32C implementation is available since Java 9.
> https://docs.oracle.com/javase/9/docs/api/java/util/zip/CRC32C.html
> Platform-specific assembler intrinsics make it more efficient than any pure 
> Java implementation.
> Hadoop is compiled against Java 8, but on Java 9 the class constructor can 
> be reached with a method handle to create instances implementing Checksum 
> at runtime.
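
(A hedged sketch of the method-handle approach the description alludes to; 
the class name {{Crc32cFactory}} and its shape are illustrative assumptions, 
not Hadoop's actual implementation.)
{code:java}
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;
import java.util.zip.Checksum;

public final class Crc32cFactory {
  // Looked up once; null when running on Java 8, where CRC32C is absent.
  private static final MethodHandle CTOR = lookupCtor();

  private static MethodHandle lookupCtor() {
    try {
      Class<?> clazz = Class.forName("java.util.zip.CRC32C");
      return MethodHandles.publicLookup()
          .findConstructor(clazz, MethodType.methodType(void.class));
    } catch (ReflectiveOperationException e) {
      return null;
    }
  }

  public static Checksum newChecksum() {
    if (CTOR == null) {
      throw new UnsupportedOperationException("CRC32C requires Java 9+");
    }
    try {
      return (Checksum) CTOR.invoke(); // CRC32C implements Checksum
    } catch (Throwable t) {
      throw new IllegalStateException(t);
    }
  }
}
{code}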



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Updated] (HADOOP-13091) DistCp masks potential CRC check failures

2017-10-03 Thread Ravi Prakash (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-13091?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravi Prakash updated HADOOP-13091:
--
Target Version/s: 2.10.0  (was: 2.9.0)

> DistCp masks potential CRC check failures
> -
>
> Key: HADOOP-13091
> URL: https://issues.apache.org/jira/browse/HADOOP-13091
> Project: Hadoop Common
>  Issue Type: Bug
>Affects Versions: 2.7.1
>Reporter: Elliot West
>Assignee: Yiqun Lin
> Attachments: HADOOP-13091.003.patch, HADOOP-13091.004.patch, 
> HDFS-10338.001.patch, HDFS-10338.002.patch
>
>
> There appear to be edge cases whereby CRC checks may be circumvented when 
> requests for checksums from the source or target file system fail. In this 
> event CRCs could differ between the source and target and yet the DistCp copy 
> would succeed, even when the 'skip CRC check' option is not being used.
> The code in question is contained in the method 
> [{{org.apache.hadoop.tools.util.DistCpUtils#checksumsAreEqual(...)}}|https://github.com/apache/hadoop/blob/release-2.7.1/hadoop-tools/hadoop-distcp/src/main/java/org/apache/hadoop/tools/util/DistCpUtils.java#L457]
> Specifically this code block suggests that if there is a failure when trying 
> to read the source or target checksum then the method will return {{true}} 
> (i.e.  the checksums are equal), implying that the check succeeded. In actual 
> fact we just failed to obtain the checksum and could not perform the check.
> {code}
> try {
>   sourceChecksum = sourceChecksum != null ? sourceChecksum : 
> sourceFS.getFileChecksum(source);
>   targetChecksum = targetFS.getFileChecksum(target);
> } catch (IOException e) {
>   LOG.error("Unable to retrieve checksum for " + source + " or "
> + target, e);
> }
> return (sourceChecksum == null || targetChecksum == null ||
>   sourceChecksum.equals(targetChecksum));
> {code}
> I believe that at the very least the caught {{IOException}} should be 
> re-thrown. If this is not deemed desirable then I believe an option 
> ({{--strictCrc}}?) should be added to enforce a strict check where we require 
> that both the source and target CRCs are retrieved, are not null, and are 
> then compared for equality. If for any reason either of the CRCs retrievals 
> fail then an exception is thrown.
> Clearly some {{FileSystems}} do not support CRCs and invocations to 
> {{FileSystem.getFileChecksum(...)}} return {{null}} in these instances. I 
> would suggest that these should fail a strict CRC check to prevent users 
> developing a false sense of security in their copy pipeline.
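
(A minimal sketch of the stricter behavior proposed above, based on the 
quoted block; illustrative only, not an actual patch.)
{code:java}
try {
  sourceChecksum = sourceChecksum != null ? sourceChecksum
      : sourceFS.getFileChecksum(source);
  targetChecksum = targetFS.getFileChecksum(target);
} catch (IOException e) {
  // Re-throw instead of swallowing: a failed retrieval is not a match.
  throw new IOException("Unable to retrieve checksum for " + source
      + " or " + target, e);
}
if (sourceChecksum == null || targetChecksum == null) {
  // A FileSystem without checksum support fails a strict check.
  throw new IOException("Checksum unavailable for " + source + " or " + target);
}
return sourceChecksum.equals(targetChecksum);
{code}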



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Updated] (HADOOP-14176) distcp reports beyond physical memory limits on 2.X

2017-10-03 Thread Ravi Prakash (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-14176?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravi Prakash updated HADOOP-14176:
--
Target Version/s: 2.10.0  (was: 2.9.0)

> distcp reports beyond physical memory limits on 2.X
> ---
>
> Key: HADOOP-14176
> URL: https://issues.apache.org/jira/browse/HADOOP-14176
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: tools/distcp
>Affects Versions: 2.9.0
>Reporter: Fei Hui
>Assignee: Fei Hui
> Attachments: HADOOP-14176-branch-2.001.patch, 
> HADOOP-14176-branch-2.002.patch, HADOOP-14176-branch-2.003.patch, 
> HADOOP-14176-branch-2.004.patch
>
>
> When I run distcp, I get some errors as follows
> {quote}
> 17/02/21 15:31:18 INFO mapreduce.Job: Task Id : 
> attempt_1487645941615_0037_m_03_0, Status : FAILED
> Container [pid=24661,containerID=container_1487645941615_0037_01_05] is 
> running beyond physical memory limits. Current usage: 1.1 GB of 1 GB physical 
> memory used; 4.0 GB of 5 GB virtual memory used. Killing container.
> Dump of the process-tree for container_1487645941615_0037_01_05 :
> |- PID PPID PGRPID SESSID CMD_NAME USER_MODE_TIME(MILLIS) 
> SYSTEM_TIME(MILLIS) VMEM_USAGE(BYTES) RSSMEM_USAGE(PAGES) FULL_CMD_LINE
> |- 24661 24659 24661 24661 (bash) 0 0 108650496 301 /bin/bash -c 
> /usr/lib/jvm/java/bin/java -Djava.net.preferIPv4Stack=true 
> -Dhadoop.metrics.log.level=WARN  -Xmx2120m 
> -Djava.io.tmpdir=/mnt/disk4/yarn/usercache/hadoop/appcache/application_1487645941615_0037/container_1487645941615_0037_01_05/tmp
>  -Dlog4j.configuration=container-log4j.properties 
> -Dyarn.app.container.log.dir=/mnt/disk2/log/hadoop-yarn/containers/application_1487645941615_0037/container_1487645941615_0037_01_05
>  -Dyarn.app.container.log.filesize=0 -Dhadoop.root.logger=INFO,CLA 
> -Dhadoop.root.logfile=syslog org.apache.hadoop.mapred.YarnChild 192.168.1.208 
> 44048 attempt_1487645941615_0037_m_03_0 5 
> 1>/mnt/disk2/log/hadoop-yarn/containers/application_1487645941615_0037/container_1487645941615_0037_01_05/stdout
>  
> 2>/mnt/disk2/log/hadoop-yarn/containers/application_1487645941615_0037/container_1487645941615_0037_01_05/stderr
> |- 24665 24661 24661 24661 (java) 1766 336 4235558912 280699 
> /usr/lib/jvm/java/bin/java -Djava.net.preferIPv4Stack=true 
> -Dhadoop.metrics.log.level=WARN -Xmx2120m 
> -Djava.io.tmpdir=/mnt/disk4/yarn/usercache/hadoop/appcache/application_1487645941615_0037/container_1487645941615_0037_01_05/tmp
>  -Dlog4j.configuration=container-log4j.properties 
> -Dyarn.app.container.log.dir=/mnt/disk2/log/hadoop-yarn/containers/application_1487645941615_0037/container_1487645941615_0037_01_05
>  -Dyarn.app.container.log.filesize=0 -Dhadoop.root.logger=INFO,CLA 
> -Dhadoop.root.logfile=syslog org.apache.hadoop.mapred.YarnChild 192.168.1.208 
> 44048 attempt_1487645941615_0037_m_03_0 5
> Container killed on request. Exit code is 143
> Container exited with a non-zero exit code 143
> {quote}
> Digging into the code, I find that it is because the distcp configuration 
> overrides mapred-site.xml
> {code}
> <property>
> <name>mapred.job.map.memory.mb</name>
> <value>1024</value>
> </property>
> <property>
> <name>mapred.job.reduce.memory.mb</name>
> <value>1024</value>
> </property>
> {code}
> When mapreduce.map.java.opts and mapreduce.map.memory.mb are set in 
> mapred-default.xml, and the values are larger than those set in 
> distcp-default.xml, this error may occur.
> We should remove those two configurations from distcp-default.xml.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Updated] (HADOOP-10738) Dynamically adjust distcp configuration by adding distcp-site.xml into code base

2017-10-03 Thread Ravi Prakash (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-10738?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravi Prakash updated HADOOP-10738:
--
Target Version/s: 2.10.0  (was: 2.9.0)

> Dynamically adjust distcp configuration by adding distcp-site.xml into code 
> base
> 
>
> Key: HADOOP-10738
> URL: https://issues.apache.org/jira/browse/HADOOP-10738
> Project: Hadoop Common
>  Issue Type: Bug
>Affects Versions: 2.4.0
>Reporter: Siqi Li
>  Labels: BB2015-05-TBR
> Attachments: HADOOP-10738-branch-2.001.patch, HADOOP-10738.v1.patch, 
> HADOOP-10738.v2.patch
>
>
> For now, the configuration of distcp resides in hadoop-distcp.jar. This makes 
> it difficult to adjust the configuration dynamically.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-14822) hadoop-project/pom.xml is executable

2017-08-31 Thread Ravi Prakash (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-14822?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16149326#comment-16149326
 ] 

Ravi Prakash commented on HADOOP-14822:
---

{code}trunk $ find . -name pom.xml -perm 755
trunk $ git status
On branch trunk
Your branch is up-to-date with 'origin/trunk'.
nothing to commit, working tree clean
{code}

Not sure where you saw this, but +1 for fixing it in case it is.

> hadoop-project/pom.xml is executable
> 
>
> Key: HADOOP-14822
> URL: https://issues.apache.org/jira/browse/HADOOP-14822
> Project: Hadoop Common
>  Issue Type: Bug
>Reporter: Akira Ajisaka
>Priority: Minor
>  Labels: newbie
>
> No need for pom.xml to be executable.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-14439) regression: secret stripping from S3x URIs breaks some downstream code

2017-08-03 Thread Ravi Prakash (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-14439?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16113214#comment-16113214
 ] 

Ravi Prakash commented on HADOOP-14439:
---

Could you please also set the Target version on the JIRA?

> regression: secret stripping from S3x URIs breaks some downstream code
> --
>
> Key: HADOOP-14439
> URL: https://issues.apache.org/jira/browse/HADOOP-14439
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: fs/s3
>Affects Versions: 2.8.0
> Environment: Spark 2.1
>Reporter: Steve Loughran
>Assignee: Vinayakumar B
>Priority: Minor
> Attachments: HADOOP-14439-01.patch, HADOOP-14439-02.patch
>
>
> Surfaced in SPARK-20799
> Spark is listing the contents of a path with getFileStatus(path), then 
> looking up the path value doing a lookup of the contents.
> Apparently the lookup is failing to find files if you have a secret in the 
> key, {{s3a://key:secret@bucket/path}}. 
> Presumably this is because the stripped values aren't matching.
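
(A small illustration of the suspected mismatch, using 
org.apache.hadoop.fs.Path; hypothetical, not taken from the Spark code.)
{code:java}
import org.apache.hadoop.fs.Path;

public class StrippedPathCheck {
  public static void main(String[] args) {
    // A Path carrying inline credentials does not equal the secret-stripped
    // form, so a lookup keyed on one misses entries stored under the other.
    Path withSecret = new Path("s3a://key:secret@bucket/path");
    Path stripped = new Path("s3a://bucket/path");
    System.out.println(withSecret.equals(stripped)); // false
  }
}
{code}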



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-14643) Clean up Test(HDFS|LocalFS)FileContextMainOperations and FileContextMainOperationsBaseTest

2017-07-28 Thread Ravi Prakash (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-14643?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16105678#comment-16105678
 ] 

Ravi Prakash commented on HADOOP-14643:
---

You can remove {{unwrapException}} now. Other than that, patch looks mostly 
good. Can you please try getting a +1 from HadoopQA?

> Clean up Test(HDFS|LocalFS)FileContextMainOperations and 
> FileContextMainOperationsBaseTest
> --
>
> Key: HADOOP-14643
> URL: https://issues.apache.org/jira/browse/HADOOP-14643
> Project: Hadoop Common
>  Issue Type: Improvement
>Reporter: Andras Bokor
>Assignee: Andras Bokor
>Priority: Minor
> Attachments: HADOOP-14643.01.patch, HADOOP-14643.02.patch
>
>
> I was working with the classes in the summary. It's a good time to clean 
> them up, per the "Boy Scout Rule".



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-14644) javadoc:test-javadoc fails with OutOfMemoryError

2017-07-28 Thread Ravi Prakash (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-14644?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16105570#comment-16105570
 ] 

Ravi Prakash commented on HADOOP-14644:
---

FWIW, it worked without any changes for me on Linux. I'll let some Mac users 
comment.

> javadoc:test-javadoc fails with OutOfMemoryError
> 
>
> Key: HADOOP-14644
> URL: https://issues.apache.org/jira/browse/HADOOP-14644
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: test
>Reporter: Andras Bokor
>Assignee: Andras Bokor
> Attachments: HADOOP-14644.01.patch
>
>
> It's easy to reproduce:
> {code}cd hadoop-common-project/hadoop-common/
> mvn javadoc:test-javadoc{code}
> The build will fail. I tried it on OSX (2 different ones).
> {code}[ERROR] javadoc: error - Java heap space
> [ERROR] 
> [ERROR] Command line was: 
> /Library/Java/JavaVirtualMachines/jdk1.8.0_74.jdk/Contents/Home/jre/../bin/javadoc
>  -J-Xmx512m @options @packages{code}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-7002) Wrong description of copyFromLocal and copyToLocal in documentation

2017-07-28 Thread Ravi Prakash (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-7002?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16105556#comment-16105556
 ] 

Ravi Prakash commented on HADOOP-7002:
--

CopyFromLocal seems to have a thread count option, which Put does not: 
https://github.com/apache/hadoop/blob/746189ad8cdf90ab35baec9364b2e02956a1e70c/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/shell/CopyCommands.java#L311

> Wrong description of copyFromLocal and copyToLocal in documentation
> ---
>
> Key: HADOOP-7002
> URL: https://issues.apache.org/jira/browse/HADOOP-7002
> Project: Hadoop Common
>  Issue Type: Bug
>Reporter: Jingguo Yao
>Assignee: Andras Bokor
>Priority: Minor
> Attachments: HADOOP-7002.01.patch
>
>   Original Estimate: 20m
>  Remaining Estimate: 20m
>
> The descriptions of copyFromLocal and copyToLocal are wrong. 
> For copyFromLocal, the documentation says "Similar to put command, except 
> that the source is restricted to a local file reference." But from the source 
> code of FsShell.java, I can see that copyFromLocal is the same as put. 
> For copyToLocal, the documentation says "Similar to get command, except that 
> the destination is restricted to a local file reference.". But from the 
> source code of FsShell.java, I can see that copyToLocal is the same as get.
> And this problem exists in both the English and Chinese documentation.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Closed] (HADOOP-14229) hadoop.security.auth_to_local example is incorrect in the documentation

2017-07-28 Thread Ravi Prakash (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-14229?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravi Prakash closed HADOOP-14229.
-

> hadoop.security.auth_to_local example is incorrect in the documentation
> ---
>
> Key: HADOOP-14229
> URL: https://issues.apache.org/jira/browse/HADOOP-14229
> Project: Hadoop Common
>  Issue Type: Bug
>Reporter: Andras Bokor
>Assignee: Andras Bokor
> Fix For: 3.0.0-beta1
>
> Attachments: HADOOP-14229.01.patch, HADOOP-14229.02.patch, 
> HADOOP-14229.03.patch
>
>
> Let's see jhs as example:
> {code}RULE:[2:$1@$0](jhs/.*@.*REALM.TLD)s/.*/mapred/{code}
> That means the principal has 2 components (jhs/myhost@REALM).
> The second column converts this to jhs@REALM, so the regex will not match 
> since the regex expects a / in the principal.
> My suggestion is
> {code}RULE:[2:$1](jhs)s/.*/mapred/{code}
> https://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-common/SecureMode.html
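
(A hedged way to sanity-check the suggested rule, assuming Hadoop's 
KerberosName utility; illustrative, not part of the issue.)
{code:java}
import java.io.IOException;
import org.apache.hadoop.security.authentication.util.KerberosName;

public class AuthToLocalCheck {
  public static void main(String[] args) throws IOException {
    // Install the proposed rule, then translate the two-component principal.
    KerberosName.setRules("RULE:[2:$1](jhs)s/.*/mapred/\nDEFAULT");
    KerberosName name = new KerberosName("jhs/myhost@REALM.TLD");
    System.out.println(name.getShortName()); // expected: mapred
  }
}
{code}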



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Updated] (HADOOP-14229) hadoop.security.auth_to_local example is incorrect in the documentation

2017-07-28 Thread Ravi Prakash (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-14229?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravi Prakash updated HADOOP-14229:
--
   Resolution: Fixed
Fix Version/s: 3.0.0-beta1
   Status: Resolved  (was: Patch Available)

Committed to trunk.

> hadoop.security.auth_to_local example is incorrect in the documentation
> ---
>
> Key: HADOOP-14229
> URL: https://issues.apache.org/jira/browse/HADOOP-14229
> Project: Hadoop Common
>  Issue Type: Bug
>Reporter: Andras Bokor
>Assignee: Andras Bokor
> Fix For: 3.0.0-beta1
>
> Attachments: HADOOP-14229.01.patch, HADOOP-14229.02.patch, 
> HADOOP-14229.03.patch
>
>
> Let's see jhs as example:
> {code}RULE:[2:$1@$0](jhs/.*@.*REALM.TLD)s/.*/mapred/{code}
> That means the principal has 2 components (jhs/myhost@REALM).
> The second column converts this to jhs@REALM, so the regex will not match 
> since the regex expects a / in the principal.
> My suggestion is
> {code}RULE:[2:$1](jhs)s/.*/mapred/{code}
> https://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-common/SecureMode.html



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-14229) hadoop.security.auth_to_local example is incorrect in the documentation

2017-07-28 Thread Ravi Prakash (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-14229?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16105477#comment-16105477
 ] 

Ravi Prakash commented on HADOOP-14229:
---

Looks good to me. +1. Committing shortly. Thank you for the contribution Andras!

> hadoop.security.auth_to_local example is incorrect in the documentation
> ---
>
> Key: HADOOP-14229
> URL: https://issues.apache.org/jira/browse/HADOOP-14229
> Project: Hadoop Common
>  Issue Type: Bug
>Reporter: Andras Bokor
>Assignee: Andras Bokor
> Attachments: HADOOP-14229.01.patch, HADOOP-14229.02.patch, 
> HADOOP-14229.03.patch
>
>
> Let's see jhs as example:
> {code}RULE:[2:$1@$0](jhs/.*@.*REALM.TLD)s/.*/mapred/{code}
> That means the principal has 2 components (jhs/myhost@REALM).
> The second column converts this to jhs@REALM, so the regex will not match 
> since the regex expects a / in the principal.
> My suggestion is
> {code}RULE:[2:$1](jhs)s/.*/mapred/{code}
> https://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-common/SecureMode.html



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-14682) cmake Makefiles in hadoop-common don't properly respect -Dopenssl.prefix

2017-07-24 Thread Ravi Prakash (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-14682?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16099258#comment-16099258
 ] 

Ravi Prakash commented on HADOOP-14682:
---

I briefly tried stuff around 
https://github.com/apache/hadoop/blob/94ca52ae9ec0ae04854d726bf2ac1bc457b96a9c/hadoop-common-project/hadoop-common/src/CMakeLists.txt#L171
 but still failed. 

> cmake Makefiles in hadoop-common don't properly respect -Dopenssl.prefix
> 
>
> Key: HADOOP-14682
> URL: https://issues.apache.org/jira/browse/HADOOP-14682
> Project: Hadoop Common
>  Issue Type: Bug
>Reporter: Ravi Prakash
>
> Allen reported that while running tests, cmake didn't properly respect 
> -Dopenssl.prefix that would allow us to build and run the tests with 
> different versions of OpenSSL.
https://issues.apache.org/jira/browse/HADOOP-14597?focusedCommentId=16092114&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-16092114
> I too encountered some funny stuff while trying to build with a non-default 
> OpenSSL library. 



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-14597) Native compilation broken with OpenSSL-1.1.0 because EVP_CIPHER_CTX has been made opaque

2017-07-24 Thread Ravi Prakash (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-14597?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16099254#comment-16099254
 ] 

Ravi Prakash commented on HADOOP-14597:
---

I've filed https://issues.apache.org/jira/browse/HADOOP-14682 to document the 
issue in the CMake files.

> Native compilation broken with OpenSSL-1.1.0 because EVP_CIPHER_CTX has been 
> made opaque
> 
>
> Key: HADOOP-14597
> URL: https://issues.apache.org/jira/browse/HADOOP-14597
> Project: Hadoop Common
>  Issue Type: Improvement
>Affects Versions: 3.0.0-alpha4
> Environment: openssl-1.1.0
>Reporter: Ravi Prakash
>Assignee: Ravi Prakash
> Fix For: 3.0.0-beta1
>
> Attachments: HADOOP-14597.00.patch, HADOOP-14597.01.patch, 
> HADOOP-14597.02.patch, HADOOP-14597.03.patch, HADOOP-14597.04.patch
>
>
> Trying to build Hadoop trunk on Fedora 26 which has openssl-devel-1.1.0 fails 
> with this error
> {code}[WARNING] 
> /home/raviprak/Code/hadoop/trunk/hadoop-common-project/hadoop-common/src/main/native/src/org/apache/hadoop/crypto/OpensslCipher.c:
>  In function ‘check_update_max_output_len’:
> [WARNING] 
> /home/raviprak/Code/hadoop/trunk/hadoop-common-project/hadoop-common/src/main/native/src/org/apache/hadoop/crypto/OpensslCipher.c:256:14:
>  error: dereferencing pointer to incomplete type ‘EVP_CIPHER_CTX {aka struct 
> evp_cipher_ctx_st}’
> [WARNING]if (context->flags & EVP_CIPH_NO_PADDING) {
> [WARNING]   ^~
> {code}
> https://github.com/openssl/openssl/issues/962 mattcaswell says
> {quote}
> One of the primary differences between master (OpenSSL 1.1.0) and the 1.0.2 
> version is that many types have been made opaque, i.e. applications are no 
> longer allowed to look inside the internals of the structures
> {quote}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Created] (HADOOP-14682) cmake Makefiles in hadoop-common don't properly respect -Dopenssl.prefix

2017-07-24 Thread Ravi Prakash (JIRA)
Ravi Prakash created HADOOP-14682:
-

 Summary: cmake Makefiles in hadoop-common don't properly respect 
-Dopenssl.prefix
 Key: HADOOP-14682
 URL: https://issues.apache.org/jira/browse/HADOOP-14682
 Project: Hadoop Common
  Issue Type: Bug
Reporter: Ravi Prakash


Allen reported that while running tests, cmake didn't properly respect 
-Dopenssl.prefix that would allow us to build and run the tests with different 
versions of OpenSSL.
https://issues.apache.org/jira/browse/HADOOP-14597?focusedCommentId=16092114&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-16092114

I too encountered some funny stuff while trying to build with a non-default 
OpenSSL library. 



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Closed] (HADOOP-14597) Native compilation broken with OpenSSL-1.1.0 because EVP_CIPHER_CTX has been made opaque

2017-07-24 Thread Ravi Prakash (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-14597?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravi Prakash closed HADOOP-14597.
-

> Native compilation broken with OpenSSL-1.1.0 because EVP_CIPHER_CTX has been 
> made opaque
> 
>
> Key: HADOOP-14597
> URL: https://issues.apache.org/jira/browse/HADOOP-14597
> Project: Hadoop Common
>  Issue Type: Improvement
>Affects Versions: 3.0.0-alpha4
> Environment: openssl-1.1.0
>Reporter: Ravi Prakash
>Assignee: Ravi Prakash
> Fix For: 3.0.0-beta1
>
> Attachments: HADOOP-14597.00.patch, HADOOP-14597.01.patch, 
> HADOOP-14597.02.patch, HADOOP-14597.03.patch, HADOOP-14597.04.patch
>
>
> Trying to build Hadoop trunk on Fedora 26 which has openssl-devel-1.1.0 fails 
> with this error
> {code}[WARNING] 
> /home/raviprak/Code/hadoop/trunk/hadoop-common-project/hadoop-common/src/main/native/src/org/apache/hadoop/crypto/OpensslCipher.c:
>  In function ‘check_update_max_output_len’:
> [WARNING] 
> /home/raviprak/Code/hadoop/trunk/hadoop-common-project/hadoop-common/src/main/native/src/org/apache/hadoop/crypto/OpensslCipher.c:256:14:
>  error: dereferencing pointer to incomplete type ‘EVP_CIPHER_CTX {aka struct 
> evp_cipher_ctx_st}’
> [WARNING]if (context->flags & EVP_CIPH_NO_PADDING) {
> [WARNING]   ^~
> {code}
> https://github.com/openssl/openssl/issues/962 mattcaswell says
> {quote}
> One of the primary differences between master (OpenSSL 1.1.0) and the 1.0.2 
> version is that many types have been made opaque, i.e. applications are no 
> longer allowed to look inside the internals of the structures
> {quote}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Updated] (HADOOP-14597) Native compilation broken with OpenSSL-1.1.0 because EVP_CIPHER_CTX has been made opaque

2017-07-24 Thread Ravi Prakash (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-14597?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravi Prakash updated HADOOP-14597:
--
   Resolution: Fixed
Fix Version/s: 3.0.0-beta1
   Status: Resolved  (was: Patch Available)

> Native compilation broken with OpenSSL-1.1.0 because EVP_CIPHER_CTX has been 
> made opaque
> 
>
> Key: HADOOP-14597
> URL: https://issues.apache.org/jira/browse/HADOOP-14597
> Project: Hadoop Common
>  Issue Type: Improvement
>Affects Versions: 3.0.0-alpha4
> Environment: openssl-1.1.0
>Reporter: Ravi Prakash
>Assignee: Ravi Prakash
> Fix For: 3.0.0-beta1
>
> Attachments: HADOOP-14597.00.patch, HADOOP-14597.01.patch, 
> HADOOP-14597.02.patch, HADOOP-14597.03.patch, HADOOP-14597.04.patch
>
>
> Trying to build Hadoop trunk on Fedora 26 which has openssl-devel-1.1.0 fails 
> with this error
> {code}[WARNING] 
> /home/raviprak/Code/hadoop/trunk/hadoop-common-project/hadoop-common/src/main/native/src/org/apache/hadoop/crypto/OpensslCipher.c:
>  In function ‘check_update_max_output_len’:
> [WARNING] 
> /home/raviprak/Code/hadoop/trunk/hadoop-common-project/hadoop-common/src/main/native/src/org/apache/hadoop/crypto/OpensslCipher.c:256:14:
>  error: dereferencing pointer to incomplete type ‘EVP_CIPHER_CTX {aka struct 
> evp_cipher_ctx_st}’
> [WARNING]if (context->flags & EVP_CIPH_NO_PADDING) {
> [WARNING]   ^~
> {code}
> https://github.com/openssl/openssl/issues/962 mattcaswell says
> {quote}
> One of the primary differences between master (OpenSSL 1.1.0) and the 1.0.2 
> version is that many types have been made opaque, i.e. applications are no 
> longer allowed to look inside the internals of the structures
> {quote}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-14597) Native compilation broken with OpenSSL-1.1.0 because EVP_CIPHER_CTX has been made opaque

2017-07-24 Thread Ravi Prakash (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-14597?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16099227#comment-16099227
 ] 

Ravi Prakash commented on HADOOP-14597:
---

Thank you for the review and comments Allen. Committing shortly. 

> Native compilation broken with OpenSSL-1.1.0 because EVP_CIPHER_CTX has been 
> made opaque
> 
>
> Key: HADOOP-14597
> URL: https://issues.apache.org/jira/browse/HADOOP-14597
> Project: Hadoop Common
>  Issue Type: Improvement
>Affects Versions: 3.0.0-alpha4
> Environment: openssl-1.1.0
>Reporter: Ravi Prakash
>Assignee: Ravi Prakash
> Attachments: HADOOP-14597.00.patch, HADOOP-14597.01.patch, 
> HADOOP-14597.02.patch, HADOOP-14597.03.patch, HADOOP-14597.04.patch
>
>
> Trying to build Hadoop trunk on Fedora 26 which has openssl-devel-1.1.0 fails 
> with this error
> {code}[WARNING] 
> /home/raviprak/Code/hadoop/trunk/hadoop-common-project/hadoop-common/src/main/native/src/org/apache/hadoop/crypto/OpensslCipher.c:
>  In function ‘check_update_max_output_len’:
> [WARNING] 
> /home/raviprak/Code/hadoop/trunk/hadoop-common-project/hadoop-common/src/main/native/src/org/apache/hadoop/crypto/OpensslCipher.c:256:14:
>  error: dereferencing pointer to incomplete type ‘EVP_CIPHER_CTX {aka struct 
> evp_cipher_ctx_st}’
> [WARNING]if (context->flags & EVP_CIPH_NO_PADDING) {
> [WARNING]   ^~
> {code}
> https://github.com/openssl/openssl/issues/962 mattcaswell says
> {quote}
> One of the primary differences between master (OpenSSL 1.1.0) and the 1.0.2 
> version is that many types have been made opaque, i.e. applications are no 
> longer allowed to look inside the internals of the structures
> {quote}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Updated] (HADOOP-14597) Native compilation broken with OpenSSL-1.1.0 because EVP_CIPHER_CTX has been made opaque

2017-06-29 Thread Ravi Prakash (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-14597?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravi Prakash updated HADOOP-14597:
--
Attachment: HADOOP-14597.04.patch

Aah! Good point. Thank you Allen! Done.

Are you aware of any tests which I can run to check functionality? I don't see 
anything in the test directory for native encryption.

Good point about {{EVP_CIPHER_CTX_encrypting}}. The function doesn't exist in 
OpenSSL-1.0.2. I've put an ugly ifdef for that too now. I'm not sure what else 
I could do, except dilute the semantics (really it's just a one-byte 
difference in checking buffer bounds depending on whether it's encrypting or 
decrypting). Or we could set the minimum version of OpenSSL to 1.1.0, which 
I'm sure would make a lot of people unhappy.

[~hitliuyi] Could you also please review this?

> Native compilation broken with OpenSSL-1.1.0 because EVP_CIPHER_CTX has been 
> made opaque
> 
>
> Key: HADOOP-14597
> URL: https://issues.apache.org/jira/browse/HADOOP-14597
> Project: Hadoop Common
>  Issue Type: Improvement
>Affects Versions: 3.0.0-alpha4
> Environment: openssl-1.1.0
>Reporter: Ravi Prakash
>Assignee: Ravi Prakash
> Attachments: HADOOP-14597.00.patch, HADOOP-14597.01.patch, 
> HADOOP-14597.02.patch, HADOOP-14597.03.patch, HADOOP-14597.04.patch
>
>
> Trying to build Hadoop trunk on Fedora 26 which has openssl-devel-1.1.0 fails 
> with this error
> {code}[WARNING] 
> /home/raviprak/Code/hadoop/trunk/hadoop-common-project/hadoop-common/src/main/native/src/org/apache/hadoop/crypto/OpensslCipher.c:
>  In function ‘check_update_max_output_len’:
> [WARNING] 
> /home/raviprak/Code/hadoop/trunk/hadoop-common-project/hadoop-common/src/main/native/src/org/apache/hadoop/crypto/OpensslCipher.c:256:14:
>  error: dereferencing pointer to incomplete type ‘EVP_CIPHER_CTX {aka struct 
> evp_cipher_ctx_st}’
> [WARNING]if (context->flags & EVP_CIPH_NO_PADDING) {
> [WARNING]   ^~
> {code}
> https://github.com/openssl/openssl/issues/962 mattcaswell says
> {quote}
> One of the primary differences between master (OpenSSL 1.1.0) and the 1.0.2 
> version is that many types have been made opaque, i.e. applications are no 
> longer allowed to look inside the internals of the structures
> {quote}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Updated] (HADOOP-14597) Native compilation broken with OpenSSL-1.1.0 because EVP_CIPHER_CTX has been made opaque

2017-06-28 Thread Ravi Prakash (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-14597?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravi Prakash updated HADOOP-14597:
--
Attachment: HADOOP-14597.03.patch

Thanks Allen! I tested this one with OpenSSL-1.0.2. I had to pull in 
MAPREDUCE-6536 too. 

FWIW I built OpenSSL-1.0.2 from source using {code} ./config 
--prefix=/some/directory; make; make install {code}
To let common build using {{-Dopenssl.prefix=/some/directory}}, I had to 
{code}diff --git a/hadoop-common-project/hadoop-common/src/CMakeLists.txt 
b/hadoop-common-project/hadoop-common/src/CMakeLists.txt
index 10b0f23ddf..01aadd546a 100644
--- a/hadoop-common-project/hadoop-common/src/CMakeLists.txt
+++ b/hadoop-common-project/hadoop-common/src/CMakeLists.txt
@@ -176,6 +176,7 @@ if(${CMAKE_SYSTEM_NAME} MATCHES "Windows")
 SET(OPENSSL_NAME "eay32")
 endif()
 message("CUSTOM_OPENSSL_PREFIX = ${CUSTOM_OPENSSL_PREFIX}")
+SET(CMAKE_FIND_LIBRARY_SUFFIXES ".so" ".a")
 find_library(OPENSSL_LIBRARY
 NAMES ${OPENSSL_NAME}
 PATHS ${CUSTOM_OPENSSL_PREFIX} ${CUSTOM_OPENSSL_PREFIX}/lib
{code}
Seems like we are playing games to be better at detecting OpenSSL than CMake 
:-) . I know I'm way off on a tangent now ;-) Hahaha


> Native compilation broken with OpenSSL-1.1.0 because EVP_CIPHER_CTX has been 
> made opaque
> 
>
> Key: HADOOP-14597
> URL: https://issues.apache.org/jira/browse/HADOOP-14597
> Project: Hadoop Common
>  Issue Type: Improvement
>Affects Versions: 3.0.0-alpha4
> Environment: openssl-1.1.0
>Reporter: Ravi Prakash
>Assignee: Ravi Prakash
> Attachments: HADOOP-14597.00.patch, HADOOP-14597.01.patch, 
> HADOOP-14597.02.patch, HADOOP-14597.03.patch
>
>
> Trying to build Hadoop trunk on Fedora 26 which has openssl-devel-1.1.0 fails 
> with this error
> {code}[WARNING] 
> /home/raviprak/Code/hadoop/trunk/hadoop-common-project/hadoop-common/src/main/native/src/org/apache/hadoop/crypto/OpensslCipher.c:
>  In function ‘check_update_max_output_len’:
> [WARNING] 
> /home/raviprak/Code/hadoop/trunk/hadoop-common-project/hadoop-common/src/main/native/src/org/apache/hadoop/crypto/OpensslCipher.c:256:14:
>  error: dereferencing pointer to incomplete type ‘EVP_CIPHER_CTX {aka struct 
> evp_cipher_ctx_st}’
> [WARNING]if (context->flags & EVP_CIPH_NO_PADDING) {
> [WARNING]   ^~
> {code}
> https://github.com/openssl/openssl/issues/962 mattcaswell says
> {quote}
> One of the primary differences between master (OpenSSL 1.1.0) and the 1.0.2 
> version is that many types have been made opaque, i.e. applications are no 
> longer allowed to look inside the internals of the structures
> {quote}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Comment Edited] (HADOOP-14597) Native compilation broken with OpenSSL-1.1.0 because EVP_CIPHER_CTX has been made opaque

2017-06-27 Thread Ravi Prakash (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-14597?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16065657#comment-16065657
 ] 

Ravi Prakash edited comment on HADOOP-14597 at 6/27/17 11:27 PM:
-

HADOOP-14597.00.patch and HADOOP-14597.01.patch are bad. They are passing a 
[{{EVP_CIPHER_CTX}}|https://github.com/openssl/openssl/blob/master/crypto/evp/evp_locl.h#L24]
 to 
[{{EVP_CIPHER_flags}}|https://github.com/openssl/openssl/blob/master/crypto/evp/evp_lib.c#L196]
 which actually expects 
[{{EVP_CIPHER}}|https://github.com/openssl/openssl/blob/master/crypto/include/internal/evp_int.h#L115]
 . These are two different structs (neither inheriting from the other).

I am now using the proper method and I'm guessing the openssl devs intended for 
us to use it this way. 

Can someone please review and commit?


was (Author: raviprak):
HADOOP-14597.00.patch and HADOOP-14597.01.patch are bad. They are passing a 
[{{EVP_CIPHER_CTX}}|https://github.com/openssl/openssl/blob/master/crypto/evp/evp_locl.h#L24]
 to 
[{{EVP_CIPHER_flags}}|https://github.com/openssl/openssl/blob/master/crypto/evp/evp_lib.c#L196]
 which actually expects 
[{{EVP_CIPHER}}|https://github.com/openssl/openssl/blob/master/crypto/include/internal/evp_int.h#L115]
 . These are two different structs (neither inheriting from the other).
I am not using the proper method and I'm guessing the openssl devs intended for 
us to use it this way. 

Can someone please review and commit?

> Native compilation broken with OpenSSL-1.1.0 because EVP_CIPHER_CTX has been 
> made opaque
> 
>
> Key: HADOOP-14597
> URL: https://issues.apache.org/jira/browse/HADOOP-14597
> Project: Hadoop Common
>  Issue Type: Improvement
>Affects Versions: 3.0.0-alpha4
> Environment: openssl-1.1.0
>Reporter: Ravi Prakash
>Assignee: Ravi Prakash
> Attachments: HADOOP-14597.00.patch, HADOOP-14597.01.patch, 
> HADOOP-14597.02.patch
>
>
> Trying to build Hadoop trunk on Fedora 26 which has openssl-devel-1.1.0 fails 
> with this error
> {code}[WARNING] 
> /home/raviprak/Code/hadoop/trunk/hadoop-common-project/hadoop-common/src/main/native/src/org/apache/hadoop/crypto/OpensslCipher.c:
>  In function ‘check_update_max_output_len’:
> [WARNING] 
> /home/raviprak/Code/hadoop/trunk/hadoop-common-project/hadoop-common/src/main/native/src/org/apache/hadoop/crypto/OpensslCipher.c:256:14:
>  error: dereferencing pointer to incomplete type ‘EVP_CIPHER_CTX {aka struct 
> evp_cipher_ctx_st}’
> [WARNING]if (context->flags & EVP_CIPH_NO_PADDING) {
> [WARNING]   ^~
> {code}
> https://github.com/openssl/openssl/issues/962 mattcaswell says
> {quote}
> One of the primary differences between master (OpenSSL 1.1.0) and the 1.0.2 
> version is that many types have been made opaque, i.e. applications are no 
> longer allowed to look inside the internals of the structures
> {quote}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Updated] (HADOOP-14597) Native compilation broken with OpenSSL-1.1.0 because EVP_CIPHER_CTX has been made opaque

2017-06-27 Thread Ravi Prakash (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-14597?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravi Prakash updated HADOOP-14597:
--
Attachment: HADOOP-14597.02.patch

HADOOP-14597.00.patch and HADOOP-14597.01.patch are bad. They are passing a 
[{{EVP_CIPHER_CTX}}|https://github.com/openssl/openssl/blob/master/crypto/evp/evp_locl.h#L24]
 to 
[{{EVP_CIPHER_flags}}|https://github.com/openssl/openssl/blob/master/crypto/evp/evp_lib.c#L196]
 which actually expects 
[{{EVP_CIPHER}}|https://github.com/openssl/openssl/blob/master/crypto/include/internal/evp_int.h#L115]
. These are two different structs (neither inherits from the other).
I am now using the proper method and I'm guessing the openssl devs intended for 
us to use it this way. 

Can someone please review and commit?

> Native compilation broken with OpenSSL-1.1.0 because EVP_CIPHER_CTX has been 
> made opaque
> 
>
> Key: HADOOP-14597
> URL: https://issues.apache.org/jira/browse/HADOOP-14597
> Project: Hadoop Common
>  Issue Type: Improvement
>Affects Versions: 3.0.0-alpha4
> Environment: openssl-1.1.0
>Reporter: Ravi Prakash
>Assignee: Ravi Prakash
> Attachments: HADOOP-14597.00.patch, HADOOP-14597.01.patch, 
> HADOOP-14597.02.patch
>
>
> Trying to build Hadoop trunk on Fedora 26 which has openssl-devel-1.1.0 fails 
> with this error
> {code}[WARNING] 
> /home/raviprak/Code/hadoop/trunk/hadoop-common-project/hadoop-common/src/main/native/src/org/apache/hadoop/crypto/OpensslCipher.c:
>  In function ‘check_update_max_output_len’:
> [WARNING] 
> /home/raviprak/Code/hadoop/trunk/hadoop-common-project/hadoop-common/src/main/native/src/org/apache/hadoop/crypto/OpensslCipher.c:256:14:
>  error: dereferencing pointer to incomplete type ‘EVP_CIPHER_CTX {aka struct 
> evp_cipher_ctx_st}’
> [WARNING]if (context->flags & EVP_CIPH_NO_PADDING) {
> [WARNING]   ^~
> {code}
> https://github.com/openssl/openssl/issues/962 mattcaswell says
> {quote}
> One of the primary differences between master (OpenSSL 1.1.0) and the 1.0.2 
> version is that many types have been made opaque, i.e. applications are no 
> longer allowed to look inside the internals of the structures
> {quote}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Updated] (HADOOP-14597) Native compilation broken with OpenSSL-1.1.0 because EVP_CIPHER_CTX has been made opaque

2017-06-27 Thread Ravi Prakash (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-14597?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravi Prakash updated HADOOP-14597:
--
Target Version/s: 3.0.0-alpha4
  Status: Patch Available  (was: Open)

> Native compilation broken with OpenSSL-1.1.0 because EVP_CIPHER_CTX has been 
> made opaque
> 
>
> Key: HADOOP-14597
> URL: https://issues.apache.org/jira/browse/HADOOP-14597
> Project: Hadoop Common
>  Issue Type: Improvement
>Affects Versions: 3.0.0-alpha4
> Environment: openssl-1.1.0
>Reporter: Ravi Prakash
>Assignee: Ravi Prakash
> Attachments: HADOOP-14597.00.patch, HADOOP-14597.01.patch, 
> HADOOP-14597.02.patch
>
>
> Trying to build Hadoop trunk on Fedora 26 which has openssl-devel-1.1.0 fails 
> with this error
> {code}[WARNING] 
> /home/raviprak/Code/hadoop/trunk/hadoop-common-project/hadoop-common/src/main/native/src/org/apache/hadoop/crypto/OpensslCipher.c:
>  In function ‘check_update_max_output_len’:
> [WARNING] 
> /home/raviprak/Code/hadoop/trunk/hadoop-common-project/hadoop-common/src/main/native/src/org/apache/hadoop/crypto/OpensslCipher.c:256:14:
>  error: dereferencing pointer to incomplete type ‘EVP_CIPHER_CTX {aka struct 
> evp_cipher_ctx_st}’
> [WARNING]if (context->flags & EVP_CIPH_NO_PADDING) {
> [WARNING]   ^~
> {code}
> https://github.com/openssl/openssl/issues/962 mattcaswell says
> {quote}
> One of the primary differences between master (OpenSSL 1.1.0) and the 1.0.2 
> version is that many types have been made opaque, i.e. applications are no 
> longer allowed to look inside the internals of the structures
> {quote}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Assigned] (HADOOP-14597) Native compilation broken with OpenSSL-1.1.0 because EVP_CIPHER_CTX has been made opaque

2017-06-27 Thread Ravi Prakash (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-14597?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravi Prakash reassigned HADOOP-14597:
-

Assignee: Ravi Prakash

> Native compilation broken with OpenSSL-1.1.0 because EVP_CIPHER_CTX has been 
> made opaque
> 
>
> Key: HADOOP-14597
> URL: https://issues.apache.org/jira/browse/HADOOP-14597
> Project: Hadoop Common
>  Issue Type: Improvement
>Affects Versions: 3.0.0-alpha4
> Environment: openssl-1.1.0
>Reporter: Ravi Prakash
>Assignee: Ravi Prakash
> Attachments: HADOOP-14597.00.patch, HADOOP-14597.01.patch
>
>
> Trying to build Hadoop trunk on Fedora 26 which has openssl-devel-1.1.0 fails 
> with this error
> {code}[WARNING] 
> /home/raviprak/Code/hadoop/trunk/hadoop-common-project/hadoop-common/src/main/native/src/org/apache/hadoop/crypto/OpensslCipher.c:
>  In function ‘check_update_max_output_len’:
> [WARNING] 
> /home/raviprak/Code/hadoop/trunk/hadoop-common-project/hadoop-common/src/main/native/src/org/apache/hadoop/crypto/OpensslCipher.c:256:14:
>  error: dereferencing pointer to incomplete type ‘EVP_CIPHER_CTX {aka struct 
> evp_cipher_ctx_st}’
> [WARNING]if (context->flags & EVP_CIPH_NO_PADDING) {
> [WARNING]   ^~
> {code}
> https://github.com/openssl/openssl/issues/962 mattcaswell says
> {quote}
> One of the primary differences between master (OpenSSL 1.1.0) and the 1.0.2 
> version is that many types have been made opaque, i.e. applications are no 
> longer allowed to look inside the internals of the structures
> {quote}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Updated] (HADOOP-14597) Native compilation broken with OpenSSL-1.1.0 because EVP_CIPHER_CTX has been made opaque

2017-06-27 Thread Ravi Prakash (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-14597?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravi Prakash updated HADOOP-14597:
--
Attachment: HADOOP-14597.01.patch

This patch lets hadoop-pipes compile too.

> Native compilation broken with OpenSSL-1.1.0 because EVP_CIPHER_CTX has been 
> made opaque
> 
>
> Key: HADOOP-14597
> URL: https://issues.apache.org/jira/browse/HADOOP-14597
> Project: Hadoop Common
>  Issue Type: Improvement
>Affects Versions: 3.0.0-alpha4
> Environment: openssl-1.1.0
>Reporter: Ravi Prakash
> Attachments: HADOOP-14597.00.patch, HADOOP-14597.01.patch
>
>
> Trying to build Hadoop trunk on Fedora 26 which has openssl-devel-1.1.0 fails 
> with this error
> {code}[WARNING] 
> /home/raviprak/Code/hadoop/trunk/hadoop-common-project/hadoop-common/src/main/native/src/org/apache/hadoop/crypto/OpensslCipher.c:
>  In function ‘check_update_max_output_len’:
> [WARNING] 
> /home/raviprak/Code/hadoop/trunk/hadoop-common-project/hadoop-common/src/main/native/src/org/apache/hadoop/crypto/OpensslCipher.c:256:14:
>  error: dereferencing pointer to incomplete type ‘EVP_CIPHER_CTX {aka struct 
> evp_cipher_ctx_st}’
> [WARNING]if (context->flags & EVP_CIPH_NO_PADDING) {
> [WARNING]   ^~
> {code}
> https://github.com/openssl/openssl/issues/962 mattcaswell says
> {quote}
> One of the primary differences between master (OpenSSL 1.1.0) and the 1.0.2 
> version is that many types have been made opaque, i.e. applications are no 
> longer allowed to look inside the internals of the structures
> {quote}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Updated] (HADOOP-14597) Native compilation broken with OpenSSL-1.1.0 because EVP_CIPHER_CTX has been made opaque

2017-06-27 Thread Ravi Prakash (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-14597?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravi Prakash updated HADOOP-14597:
--
Attachment: HADOOP-14597.00.patch

I'm not sure this is the right patch, but it lets the compilation continue. I'm 
using the functions defined at 
https://github.com/openssl/openssl/blob/master/crypto/evp/evp_lib.c#L196. I 
don't know what the intent of making the internals opaque is if I can still get 
to the same flags through this method, which leads me to believe this patch is 
probably not what the openssl devs intended. Does anyone know anything about this?
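
To make the mix-up concrete, here is a small sketch of the two types and their 
separate flag accessors in 1.1.0 (illustrative only, not part of any patch):
{code}
#include <openssl/evp.h>

/* EVP_CIPHER describes an algorithm; EVP_CIPHER_CTX is a live context.
 * Each carries its own flags, with its own accessor. */
static void show_flag_accessors(const EVP_CIPHER_CTX *context)
{
  const EVP_CIPHER *cipher = EVP_CIPHER_CTX_cipher(context);
  unsigned long alg_flags = EVP_CIPHER_flags(cipher);  /* algorithm-level flags */
  int no_pad = EVP_CIPHER_CTX_test_flags(context,      /* context-level flags   */
                                         EVP_CIPH_NO_PADDING);
  (void) alg_flags;
  (void) no_pad;
}
{code}
Passing the context where the cipher is expected is only an 
incompatible-pointer-type warning in C, which is presumably how the earlier 
patches slipped through.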

> Native compilation broken with OpenSSL-1.1.0 because EVP_CIPHER_CTX has been 
> made opaque
> 
>
> Key: HADOOP-14597
> URL: https://issues.apache.org/jira/browse/HADOOP-14597
> Project: Hadoop Common
>  Issue Type: Improvement
>Affects Versions: 3.0.0-alpha4
> Environment: openssl-1.1.0
>Reporter: Ravi Prakash
> Attachments: HADOOP-14597.00.patch
>
>
> Trying to build Hadoop trunk on Fedora 26 which has openssl-devel-1.1.0 fails 
> with this error
> {code}[WARNING] 
> /home/raviprak/Code/hadoop/trunk/hadoop-common-project/hadoop-common/src/main/native/src/org/apache/hadoop/crypto/OpensslCipher.c:
>  In function ‘check_update_max_output_len’:
> [WARNING] 
> /home/raviprak/Code/hadoop/trunk/hadoop-common-project/hadoop-common/src/main/native/src/org/apache/hadoop/crypto/OpensslCipher.c:256:14:
>  error: dereferencing pointer to incomplete type ‘EVP_CIPHER_CTX {aka struct 
> evp_cipher_ctx_st}’
> [WARNING]if (context->flags & EVP_CIPH_NO_PADDING) {
> [WARNING]   ^~
> {code}
> https://github.com/openssl/openssl/issues/962 mattcaswell says
> {quote}
> One of the primary differences between master (OpenSSL 1.1.0) and the 1.0.2 
> version is that many types have been made opaque, i.e. applications are no 
> longer allowed to look inside the internals of the structures
> {quote}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Created] (HADOOP-14597) Native compilation broken with OpenSSL-1.1.0 because EVP_CIPHER_CTX has been made opaque

2017-06-27 Thread Ravi Prakash (JIRA)
Ravi Prakash created HADOOP-14597:
-

 Summary: Native compilation broken with OpenSSL-1.1.0 because 
EVP_CIPHER_CTX has been made opaque
 Key: HADOOP-14597
 URL: https://issues.apache.org/jira/browse/HADOOP-14597
 Project: Hadoop Common
  Issue Type: Improvement
Affects Versions: 3.0.0-alpha4
 Environment: openssl-1.1.0
Reporter: Ravi Prakash


Trying to build Hadoop trunk on Fedora 26 which has openssl-devel-1.1.0 fails 
with this error
{code}[WARNING] 
/home/raviprak/Code/hadoop/trunk/hadoop-common-project/hadoop-common/src/main/native/src/org/apache/hadoop/crypto/OpensslCipher.c:
 In function ‘check_update_max_output_len’:
[WARNING] 
/home/raviprak/Code/hadoop/trunk/hadoop-common-project/hadoop-common/src/main/native/src/org/apache/hadoop/crypto/OpensslCipher.c:256:14:
 error: dereferencing pointer to incomplete type ‘EVP_CIPHER_CTX {aka struct 
evp_cipher_ctx_st}’
[WARNING]if (context->flags & EVP_CIPH_NO_PADDING) {
[WARNING]   ^~
{code}

https://github.com/openssl/openssl/issues/962 mattcaswell says
{quote}
One of the primary differences between master (OpenSSL 1.1.0) and the 1.0.2 
version is that many types have been made opaque, i.e. applications are no 
longer allowed to look inside the internals of the structures
{quote}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Closed] (HADOOP-14513) A little performance improvement of HarFileSystem

2017-06-13 Thread Ravi Prakash (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-14513?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravi Prakash closed HADOOP-14513.
-

> A little performance improvement of HarFileSystem
> -
>
> Key: HADOOP-14513
> URL: https://issues.apache.org/jira/browse/HADOOP-14513
> Project: Hadoop Common
>  Issue Type: Improvement
>Affects Versions: 3.0.0-alpha3
>Reporter: hu xiaodong
>Assignee: hu xiaodong
>Priority: Trivial
> Attachments: HADOOP-14513.001.patch
>
>
> In the Java source of HarFileSystem.java:
> {code:title=HarFileSystem.java|borderStyle=solid}
> ...
> ...
> private Path archivePath(Path p) {
> Path retPath = null;
> Path tmp = p;
> 
> // I think p.depth() need not be evaluated on every loop iteration; depth() 
> is an expensive calculation
> for (int i=0; i< p.depth(); i++) {
>   if (tmp.toString().endsWith(".har")) {
> retPath = tmp;
> break;
>   }
>   tmp = tmp.getParent();
> }
> return retPath;
>   }
> ...
> ...
> {code}
>  
> I think the following is more suitable:
> {code:title=HarFileSystem.java|borderStyle=solid}
> ...
> ...
> private Path archivePath(Path p) {
> Path retPath = null;
> Path tmp = p;
> 
> // just loop once
> for (int i=0,depth=p.depth(); i< depth; i++) {
>   if (tmp.toString().endsWith(".har")) {
> retPath = tmp;
> break;
>   }
>   tmp = tmp.getParent();
> }
> return retPath;
>   }
> ...
> ...
> {code}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Resolved] (HADOOP-14513) A little performance improvement of HarFileSystem

2017-06-13 Thread Ravi Prakash (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-14513?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravi Prakash resolved HADOOP-14513.
---
Resolution: Not A Problem

> A little performance improvement of HarFileSystem
> -
>
> Key: HADOOP-14513
> URL: https://issues.apache.org/jira/browse/HADOOP-14513
> Project: Hadoop Common
>  Issue Type: Improvement
>Affects Versions: 3.0.0-alpha3
>Reporter: hu xiaodong
>Assignee: hu xiaodong
>Priority: Trivial
> Attachments: HADOOP-14513.001.patch
>
>
> In the Java source of HarFileSystem.java:
> {code:title=HarFileSystem.java|borderStyle=solid}
> ...
> ...
> private Path archivePath(Path p) {
> Path retPath = null;
> Path tmp = p;
> 
> // I think p.depth() need not be evaluated on every loop iteration; depth() 
> is an expensive calculation
> for (int i=0; i< p.depth(); i++) {
>   if (tmp.toString().endsWith(".har")) {
> retPath = tmp;
> break;
>   }
>   tmp = tmp.getParent();
> }
> return retPath;
>   }
> ...
> ...
> {code}
>  
> I think the following is more suitable:
> {code:title=HarFileSystem.java|borderStyle=solid}
> ...
> ...
> private Path archivePath(Path p) {
> Path retPath = null;
> Path tmp = p;
> 
> // just loop once
> for (int i=0,depth=p.depth(); i< depth; i++) {
>   if (tmp.toString().endsWith(".har")) {
> retPath = tmp;
> break;
>   }
>   tmp = tmp.getParent();
> }
> return retPath;
>   }
> ...
> ...
> {code}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-14513) A little performance improvement of HarFileSystem

2017-06-09 Thread Ravi Prakash (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-14513?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16044742#comment-16044742
 ] 

Ravi Prakash commented on HADOOP-14513:
---

Hi Hu!

Thanks for your contribution. Do you have any benchmarks / profiles which show 
the improvement? Do you know whether the JVM already optimizes this? See 
https://en.wikipedia.org/wiki/Loop-invariant_code_motion


> A little performance improvement of HarFileSystem
> -
>
> Key: HADOOP-14513
> URL: https://issues.apache.org/jira/browse/HADOOP-14513
> Project: Hadoop Common
>  Issue Type: Improvement
>Affects Versions: 3.0.0-alpha3
>Reporter: hu xiaodong
>Assignee: hu xiaodong
>Priority: Trivial
> Attachments: HADOOP-14513.001.patch
>
>
> In the Java source of HarFileSystem.java:
> {code:title=HarFileSystem.java|borderStyle=solid}
> ...
> ...
> private Path archivePath(Path p) {
> Path retPath = null;
> Path tmp = p;
> 
> // I think p.depth() need not be evaluated on every loop iteration; depth() 
> is an expensive calculation
> for (int i=0; i< p.depth(); i++) {
>   if (tmp.toString().endsWith(".har")) {
> retPath = tmp;
> break;
>   }
>   tmp = tmp.getParent();
> }
> return retPath;
>   }
> ...
> ...
> {code}
>  
> I think the following is more suitable:
> {code:title=HarFileSystem.java|borderStyle=solid}
> ...
> ...
> private Path archivePath(Path p) {
> Path retPath = null;
> Path tmp = p;
> 
> // just loop once
> for (int i=0,depth=p.depth(); i< depth; i++) {
>   if (tmp.toString().endsWith(".har")) {
> retPath = tmp;
> break;
>   }
>   tmp = tmp.getParent();
> }
> return retPath;
>   }
> ...
> ...
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Resolved] (HADOOP-14319) Under replicated blocks are not getting re-replicated

2017-04-18 Thread Ravi Prakash (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-14319?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravi Prakash resolved HADOOP-14319.
---
Resolution: Invalid

Please send your queries to the hdfs-user mailing list: 
https://hadoop.apache.org/mailing_lists.html
To answer your query, please look at dfs.namenode.replication.max-streams, 
dfs.namenode.replication.max-streams-hard-limit, and 
dfs.namenode.replication.work.multiplier.per.iteration, among others.
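
For reference, these knobs live in hdfs-site.xml on the NameNode. A sketch 
with purely illustrative values (not recommendations; tune for your cluster):
{code}
<!-- Illustrative values only -->
<property>
  <name>dfs.namenode.replication.max-streams</name>
  <value>4</value>
</property>
<property>
  <name>dfs.namenode.replication.max-streams-hard-limit</name>
  <value>8</value>
</property>
<property>
  <name>dfs.namenode.replication.work.multiplier.per.iteration</name>
  <value>4</value>
</property>
{code}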


> Under replicated blocks are not getting re-replicated
> -
>
> Key: HADOOP-14319
> URL: https://issues.apache.org/jira/browse/HADOOP-14319
> Project: Hadoop Common
>  Issue Type: Bug
>Affects Versions: 2.7.2
>Reporter: Anil
>
> Under replicated blocks are not getting re-replicated
> In a production Hadoop cluster of 5 Management + 5 Data Nodes, under 
> replicated blocks are not re-replicated even after 2 days. 
> Here is quick view of required configurations;
>  Default replication factor:  3
>  Average block replication:   3.0
>  Corrupt blocks:  0
>  Missing replicas:0 (0.0 %)
>  Number of data-nodes:5
>  Number of racks: 1
> After bringing one of the DataNodes down, the replication factor for the 
> blocks allocated on the Data Node became 2. It is observed that, even after 2 
> days the replication factor remains as 2. Under replicated blocks are not 
> getting re-replicated to another DataNodes in the cluster. 
> If a Data Node goes down, HDFS will try to replicate the blocks from the dead 
> DN to other nodes based on priority. Are there any configuration changes to speed 
> up the re-replication process for the under replicated blocks? 
> When tested for blocks with replication factor 1, the re-replication happened 
> to 2 overnight in around 10 hours of time. But blocks with 2 replication 
> factor are not being re-replicated to default replication factor 3. 



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Closed] (HADOOP-14319) Under replicated blocks are not getting re-replicated

2017-04-18 Thread Ravi Prakash (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-14319?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravi Prakash closed HADOOP-14319.
-

> Under replicated blocks are not getting re-replicated
> -
>
> Key: HADOOP-14319
> URL: https://issues.apache.org/jira/browse/HADOOP-14319
> Project: Hadoop Common
>  Issue Type: Bug
>Affects Versions: 2.7.2
>Reporter: Anil
>
> Under replicated blocks are not getting re-replicated
> In a production Hadoop cluster of 5 Management + 5 Data Nodes, under 
> replicated blocks are not re-replicated even after 2 days. 
> Here is quick view of required configurations;
>  Default replication factor:  3
>  Average block replication:   3.0
>  Corrupt blocks:  0
>  Missing replicas:0 (0.0 %)
>  Number of data-nodes:5
>  Number of racks: 1
> After bringing one of the DataNodes down, the replication factor for the 
> blocks allocated on the Data Node became 2. It is observed that, even after 2 
> days the replication factor remains as 2. Under replicated blocks are not 
> getting re-replicated to another DataNodes in the cluster. 
> If a Data Node goes down, HDFS will try to replicate the blocks from the dead 
> DN to other nodes based on priority. Are there any configuration changes to speed 
> up the re-replication process for the under replicated blocks? 
> When tested for blocks with replication factor 1, the re-replication happened 
> to 2 overnight in around 10 hours of time. But blocks with 2 replication 
> factor are not being re-replicated to default replication factor 3. 



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-14163) Refactor existing hadoop site to use more usable static website generator

2017-03-24 Thread Ravi Prakash (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-14163?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15940698#comment-15940698
 ] 

Ravi Prakash commented on HADOOP-14163:
---

Thanks Marton for your attention to this often neglected but very important 
facet of Hadoop.
I'd also like to draw your attention to 
https://issues.apache.org/jira/browse/HADOOP-8039. In my experience, whenever 
I have tried to build the documentation and stage it, the staged files are 
replete with broken links. Is your JIRA going to fix this? Perhaps I'm using 
the wrong command?

> Refactor existing hadoop site to use more usable static website generator
> -
>
> Key: HADOOP-14163
> URL: https://issues.apache.org/jira/browse/HADOOP-14163
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: site
>Reporter: Elek, Marton
>Assignee: Elek, Marton
> Attachments: hadoop-site.tar.gz, hadop-site-rendered.tar.gz
>
>
> From the dev mailing list:
> "Publishing can be attacked via a mix of scripting and revamping the darned 
> website. Forrest is pretty bad compared to the newer static site generators 
> out there (e.g. need to write XML instead of markdown, it's hard to review a 
> staging site because of all the absolute links, hard to customize, did I 
> mention XML?), and the look and feel of the site is from the 00s. We don't 
> actually have that much site content, so it should be possible to migrate to 
> a new system."
> This issue is to find a solution for migrating the old site to a modern 
> static site generator using a more contemporary theme.
> Goals: 
>  * existing links should work (or at least redirect)
>  * It should be easy to add more content required by a release automatically 
> (most probably by creating separate markdown files)



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Updated] (HADOOP-14213) Move Configuration runtime check for hadoop-site.xml to initialization

2017-03-23 Thread Ravi Prakash (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-14213?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravi Prakash updated HADOOP-14213:
--
  Resolution: Fixed
Release Note: Move the check for hadoop-site.xml to static initialization 
of the Configuration class.
  Status: Resolved  (was: Patch Available)

> Move Configuration runtime check for hadoop-site.xml to initialization
> --
>
> Key: HADOOP-14213
> URL: https://issues.apache.org/jira/browse/HADOOP-14213
> Project: Hadoop Common
>  Issue Type: Improvement
>Affects Versions: 2.8.0
>Reporter: Jonathan Eagles
>Assignee: Jonathan Eagles
> Fix For: 2.9.0, 3.0.0-beta1
>
> Attachments: HADOOP-14213.1.patch, HADOOP-14213.2.patch
>
>
> Each Configuration object that loads defaults checks for hadoop-site.xml. It 
> has been long deprecated and is not present in most, if not all, 
> installations. The getResource check for hadoop-site.xml has to check the 
> entire classpath since it is not found. This jira proposes to 1) either 
> remove hadoop-site.xml as a default resource or 2) move the check to static 
> initialization of the class so the performance hit is only taken once.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Updated] (HADOOP-14213) Move Configuration runtime check for hadoop-site.xml to initialization

2017-03-23 Thread Ravi Prakash (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-14213?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravi Prakash updated HADOOP-14213:
--
Affects Version/s: 2.8.0
 Target Version/s: 2.9.0, 3.0.0-beta1
Fix Version/s: 3.0.0-beta1
   2.9.0

> Move Configuration runtime check for hadoop-site.xml to initialization
> --
>
> Key: HADOOP-14213
> URL: https://issues.apache.org/jira/browse/HADOOP-14213
> Project: Hadoop Common
>  Issue Type: Improvement
>Affects Versions: 2.8.0
>Reporter: Jonathan Eagles
>Assignee: Jonathan Eagles
> Fix For: 2.9.0, 3.0.0-beta1
>
> Attachments: HADOOP-14213.1.patch, HADOOP-14213.2.patch
>
>
> Each Configuration object that loads defaults checks for hadoop-site.xml. It 
> has been long deprecated and is not present in most, if not all, 
> installations. The getResource check for hadoop-site.xml has to check the 
> entire classpath since it is not found. This jira proposes to 1) either 
> remove hadoop-site.xml as a default resource or 2) move the check to static 
> initialization of the class so the performance hit is only taken once.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-14213) Move Configuration runtime check for hadoop-site.xml to initialization

2017-03-23 Thread Ravi Prakash (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-14213?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15938700#comment-15938700
 ] 

Ravi Prakash commented on HADOOP-14213:
---

Thanks for the clarification Jon! LGTM. +1. 

Since you have not removed hadoop-site from default resources, my understanding 
is that this is *not* a backward incompatible change. 

Will commit shortly to trunk and branch-2.

> Move Configuration runtime check for hadoop-site.xml to initialization
> --
>
> Key: HADOOP-14213
> URL: https://issues.apache.org/jira/browse/HADOOP-14213
> Project: Hadoop Common
>  Issue Type: Improvement
>Reporter: Jonathan Eagles
>Assignee: Jonathan Eagles
> Attachments: HADOOP-14213.1.patch, HADOOP-14213.2.patch
>
>
> Each Configuration object that loads defaults checks for hadoop-site.xml. It 
> has been long deprecated and is not present in most, if not all, 
> installations. The getResource check for hadoop-site.xml has to check the 
> entire classpath since it is not found. This jira proposes to 1) either 
> remove hadoop-site.xml as a default resource or 2) move the check to static 
> initialization of the class so the performance hit is only taken once.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-14213) Move Configuration runtime check for hadoop-site.xml to initialization

2017-03-22 Thread Ravi Prakash (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-14213?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15937750#comment-15937750
 ] 

Ravi Prakash commented on HADOOP-14213:
---

Thanks Jeagles!

# typo : "hadoop-site.xml.xml" -> "hadoop-site.xml" ??
# I'm sorry! Could you please elaborate how core-site would take precedence 
over hadoop-site still? It seems like hadoop-site would override the attributes 
in core-site, right?

> Move Configuration runtime check for hadoop-site.xml to initialization
> --
>
> Key: HADOOP-14213
> URL: https://issues.apache.org/jira/browse/HADOOP-14213
> Project: Hadoop Common
>  Issue Type: Improvement
>Reporter: Jonathan Eagles
>Assignee: Jonathan Eagles
> Attachments: HADOOP-14213.1.patch
>
>
> Each Configuration object that loads defaults checks for hadoop-site.xml. It 
> has been long deprecated and is not present in most if not nearly all 
> installations. The getResource check for hadoop-site.xml has to check the 
> entire classpath since it is not found. This jira proposes to 1) either 
> remove hadoop-site.xml as a default resource or 2) move the check to static 
> initialization of the class so the performance hit is only taken once.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-10738) Dynamically adjust distcp configuration by adding distcp-site.xml into code base

2017-03-20 Thread Ravi Prakash (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-10738?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15933578#comment-15933578
 ] 

Ravi Prakash commented on HADOOP-10738:
---

I'm not sure Siqi Li is active anymore Arpit. I suspect the way most people use 
distcp is in an [oozie 
action|https://oozie.apache.org/docs/4.0.0/DG_DistCpActionExtension.html]. I 
suspect they want to specify a single xml file where they can have all the 
configuration for a source-destination pair. The alternative is to have the 
exact set of parameters copy-pasted in multiple workflows (which if changed 
then has to be updated in all workflows). Please correct me if I'm wrong 
[~ferhui] . Other users?

> Dynamically adjust distcp configuration by adding distcp-site.xml into code 
> base
> 
>
> Key: HADOOP-10738
> URL: https://issues.apache.org/jira/browse/HADOOP-10738
> Project: Hadoop Common
>  Issue Type: Bug
>Affects Versions: 2.4.0
>Reporter: Siqi Li
>  Labels: BB2015-05-TBR
> Attachments: HADOOP-10738.v1.patch, HADOOP-10738.v2.patch
>
>
> For now, the configuration of distcp resides in hadoop-distcp.jar. This makes 
> it difficult to adjust the configuration dynamically.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-14176) distcp reports beyond physical memory limits on 2.X

2017-03-20 Thread Ravi Prakash (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-14176?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15933552#comment-15933552
 ] 

Ravi Prakash commented on HADOOP-14176:
---

Hi Fei Hui!

What I meant was that I'd lean towards setting {{mapreduce.map.memory.mb}} to 
1280 and {{mapreduce.map.java.opts}} to {{-Xmx1024m}}. That way no jobs which 
used to work would suddenly fail (if {{mapred.job.map.memory.mb}} was 1024 in 
the past). I would like to hear other people's opinions though, since you want 
this in branch-2.
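
Until something is committed, a hedged workaround is to force the values per 
invocation (the paths here are hypothetical; -D generic options should take 
precedence over distcp-default.xml):
{code}
hadoop distcp \
  -Dmapreduce.map.memory.mb=1280 \
  -Dmapreduce.map.java.opts=-Xmx1024m \
  hdfs://src-nn/path hdfs://dst-nn/path
{code}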

> distcp reports beyond physical memory limits on 2.X
> ---
>
> Key: HADOOP-14176
> URL: https://issues.apache.org/jira/browse/HADOOP-14176
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: tools/distcp
>Affects Versions: 2.9.0
>Reporter: Fei Hui
>Assignee: Fei Hui
> Attachments: HADOOP-14176-branch-2.001.patch, 
> HADOOP-14176-branch-2.002.patch, HADOOP-14176-branch-2.003.patch, 
> HADOOP-14176-branch-2.004.patch
>
>
> When I run distcp, I get errors like the following: 
> {quote}
> 17/02/21 15:31:18 INFO mapreduce.Job: Task Id : 
> attempt_1487645941615_0037_m_03_0, Status : FAILED
> Container [pid=24661,containerID=container_1487645941615_0037_01_05] is 
> running beyond physical memory limits. Current usage: 1.1 GB of 1 GB physical 
> memory used; 4.0 GB of 5 GB virtual memory used. Killing container.
> Dump of the process-tree for container_1487645941615_0037_01_05 :
> |- PID PPID PGRPID SESSID CMD_NAME USER_MODE_TIME(MILLIS) 
> SYSTEM_TIME(MILLIS) VMEM_USAGE(BYTES) RSSMEM_USAGE(PAGES) FULL_CMD_LINE
> |- 24661 24659 24661 24661 (bash) 0 0 108650496 301 /bin/bash -c 
> /usr/lib/jvm/java/bin/java -Djava.net.preferIPv4Stack=true 
> -Dhadoop.metrics.log.level=WARN  -Xmx2120m 
> -Djava.io.tmpdir=/mnt/disk4/yarn/usercache/hadoop/appcache/application_1487645941615_0037/container_1487645941615_0037_01_05/tmp
>  -Dlog4j.configuration=container-log4j.properties 
> -Dyarn.app.container.log.dir=/mnt/disk2/log/hadoop-yarn/containers/application_1487645941615_0037/container_1487645941615_0037_01_05
>  -Dyarn.app.container.log.filesize=0 -Dhadoop.root.logger=INFO,CLA 
> -Dhadoop.root.logfile=syslog org.apache.hadoop.mapred.YarnChild 192.168.1.208 
> 44048 attempt_1487645941615_0037_m_03_0 5 
> 1>/mnt/disk2/log/hadoop-yarn/containers/application_1487645941615_0037/container_1487645941615_0037_01_05/stdout
>  
> 2>/mnt/disk2/log/hadoop-yarn/containers/application_1487645941615_0037/container_1487645941615_0037_01_05/stderr
> |- 24665 24661 24661 24661 (java) 1766 336 4235558912 280699 
> /usr/lib/jvm/java/bin/java -Djava.net.preferIPv4Stack=true 
> -Dhadoop.metrics.log.level=WARN -Xmx2120m 
> -Djava.io.tmpdir=/mnt/disk4/yarn/usercache/hadoop/appcache/application_1487645941615_0037/container_1487645941615_0037_01_05/tmp
>  -Dlog4j.configuration=container-log4j.properties 
> -Dyarn.app.container.log.dir=/mnt/disk2/log/hadoop-yarn/containers/application_1487645941615_0037/container_1487645941615_0037_01_05
>  -Dyarn.app.container.log.filesize=0 -Dhadoop.root.logger=INFO,CLA 
> -Dhadoop.root.logfile=syslog org.apache.hadoop.mapred.YarnChild 192.168.1.208 
> 44048 attempt_1487645941615_0037_m_03_0 5
> Container killed on request. Exit code is 143
> Container exited with a non-zero exit code 143
> {quote}
> Digging into the code, I find that this is because the distcp configuration 
> overrides mapred-site.xml:
> {code}
> 
> mapred.job.map.memory.mb
> 1024
> 
> 
> mapred.job.reduce.memory.mb
> 1024
> 
> {code}
> When mapreduce.map.java.opts and mapreduce.map.memory.mb are set in 
> mapred-default.xml, and the values are larger than those set in 
> distcp-default.xml, this error may occur.
> We should remove those two configurations from distcp-default.xml. 



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-14189) add distcp-site.xml for distcp on branch-2

2017-03-20 Thread Ravi Prakash (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-14189?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15933140#comment-15933140
 ] 

Ravi Prakash commented on HADOOP-14189:
---

bq. You should be able to set them in core-site.xml.
Do we really want core-site.xml to be the catch-all for all configuration for 
all tools? See how that atomic blaster shoots both ways ;-)

Let's continue on HADOOP-10738

> add distcp-site.xml for distcp on branch-2
> --
>
> Key: HADOOP-14189
> URL: https://issues.apache.org/jira/browse/HADOOP-14189
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: tools/distcp
>Reporter: Fei Hui
>Assignee: Fei Hui
> Attachments: HADOOP-14189-branch-2.001.patch
>
>
> On hadoop 2.x, we cannot configure hadoop parameters for distcp. It only 
> uses distcp-default.xml.
> We should add distcp-site.xml to override hadoop parameters.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-14176) distcp reports beyond physical memory limits on 2.X

2017-03-16 Thread Ravi Prakash (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-14176?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15928858#comment-15928858
 ] 

Ravi Prakash commented on HADOOP-14176:
---

Just FYI! [~yzhangal] is working on HADOOP-11794 and I think it's orthogonal, 
but it's always good to know (since that's a much bigger change). Maybe Yongjun 
can also chime in on this change.

From my reading of the patch:
1. {{mapred.job.map.memory.mb}} has been changed to 
{{mapreduce.map.memory.mb}}. This will likely remove a "deprecated 
configuration parameter" warning. Good!
2. Thanks to Joep for pointing out that there are 0 reducers; removing 
{{mapred.job.reduce.memory.mb}} and {{mapreduce.reduce.class}} should have no 
effect. Great!
3. I'm having a tougher time tracing the effects of removing 
{{mapred.reducer.new-api}}. Is it being passed to {{JobImpl.transition}}?
4. Unfortunately I don't think setting {{mapreduce.map.java.opts}} to 
{{-Xmx768m}} is the right thing to do. This may cause lots of jobs which used 
to run (e.g. those using >= 769MB heaps) to now fail with an OOMException. In 
the past when we have been faced with this choice, I have preferred to increase 
resource usage (leading to under-utilization of cluster resources) rather than 
risk failing jobs which currently work fine.

Opinions?

> distcp reports beyond physical memory limits on 2.X
> ---
>
> Key: HADOOP-14176
> URL: https://issues.apache.org/jira/browse/HADOOP-14176
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: tools/distcp
>Affects Versions: 2.9.0
>Reporter: Fei Hui
>Assignee: Fei Hui
> Attachments: HADOOP-14176-branch-2.001.patch, 
> HADOOP-14176-branch-2.002.patch, HADOOP-14176-branch-2.003.patch
>
>
> When I run distcp, I get errors like the following: 
> {quote}
> 17/02/21 15:31:18 INFO mapreduce.Job: Task Id : 
> attempt_1487645941615_0037_m_03_0, Status : FAILED
> Container [pid=24661,containerID=container_1487645941615_0037_01_05] is 
> running beyond physical memory limits. Current usage: 1.1 GB of 1 GB physical 
> memory used; 4.0 GB of 5 GB virtual memory used. Killing container.
> Dump of the process-tree for container_1487645941615_0037_01_05 :
> |- PID PPID PGRPID SESSID CMD_NAME USER_MODE_TIME(MILLIS) 
> SYSTEM_TIME(MILLIS) VMEM_USAGE(BYTES) RSSMEM_USAGE(PAGES) FULL_CMD_LINE
> |- 24661 24659 24661 24661 (bash) 0 0 108650496 301 /bin/bash -c 
> /usr/lib/jvm/java/bin/java -Djava.net.preferIPv4Stack=true 
> -Dhadoop.metrics.log.level=WARN  -Xmx2120m 
> -Djava.io.tmpdir=/mnt/disk4/yarn/usercache/hadoop/appcache/application_1487645941615_0037/container_1487645941615_0037_01_05/tmp
>  -Dlog4j.configuration=container-log4j.properties 
> -Dyarn.app.container.log.dir=/mnt/disk2/log/hadoop-yarn/containers/application_1487645941615_0037/container_1487645941615_0037_01_05
>  -Dyarn.app.container.log.filesize=0 -Dhadoop.root.logger=INFO,CLA 
> -Dhadoop.root.logfile=syslog org.apache.hadoop.mapred.YarnChild 192.168.1.208 
> 44048 attempt_1487645941615_0037_m_03_0 5 
> 1>/mnt/disk2/log/hadoop-yarn/containers/application_1487645941615_0037/container_1487645941615_0037_01_05/stdout
>  
> 2>/mnt/disk2/log/hadoop-yarn/containers/application_1487645941615_0037/container_1487645941615_0037_01_05/stderr
> |- 24665 24661 24661 24661 (java) 1766 336 4235558912 280699 
> /usr/lib/jvm/java/bin/java -Djava.net.preferIPv4Stack=true 
> -Dhadoop.metrics.log.level=WARN -Xmx2120m 
> -Djava.io.tmpdir=/mnt/disk4/yarn/usercache/hadoop/appcache/application_1487645941615_0037/container_1487645941615_0037_01_05/tmp
>  -Dlog4j.configuration=container-log4j.properties 
> -Dyarn.app.container.log.dir=/mnt/disk2/log/hadoop-yarn/containers/application_1487645941615_0037/container_1487645941615_0037_01_05
>  -Dyarn.app.container.log.filesize=0 -Dhadoop.root.logger=INFO,CLA 
> -Dhadoop.root.logfile=syslog org.apache.hadoop.mapred.YarnChild 192.168.1.208 
> 44048 attempt_1487645941615_0037_m_03_0 5
> Container killed on request. Exit code is 143
> Container exited with a non-zero exit code 143
> {quote}
> Digging into the code, I find that this is because the distcp configuration 
> overrides mapred-site.xml:
> {code}
> 
> mapred.job.map.memory.mb
> 1024
> 
> 
> mapred.job.reduce.memory.mb
> 1024
> 
> {code}
> When mapreduce.map.java.opts and mapreduce.map.memory.mb are set in 
> mapred-default.xml, and the values are larger than those set in 
> distcp-default.xml, this error may occur.
> We should remove those two configurations from distcp-default.xml. 



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org

[jira] [Commented] (HADOOP-14189) add distcp-site.xml for distcp on branch-2

2017-03-16 Thread Ravi Prakash (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-14189?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15928536#comment-15928536
 ] 

Ravi Prakash commented on HADOOP-14189:
---

Thanks Allen for your comment! I think the problem users face is that most of 
them want a common set of distcp options to be used all of the time. A 
distcp-site.xml (or other configuration place) would have been a natural place 
to put them. IMHO data transfer is important enough to merit this extra 
configuration file. Although I do get your point about having configuration 
strewn all over in multiple files. I'm not sure there is a better alternative 
though :(. Could we please continue on HADOOP-10738?

> add distcp-site.xml for distcp on branch-2
> --
>
> Key: HADOOP-14189
> URL: https://issues.apache.org/jira/browse/HADOOP-14189
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: tools/distcp
>Reporter: Fei Hui
>Assignee: Fei Hui
> Attachments: HADOOP-14189-branch-2.001.patch
>
>
> On hadoop 2.x, we cannot configure hadoop parameters for distcp. It only 
> uses distcp-default.xml.
> We should add distcp-site.xml to override hadoop parameters.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-14189) add distcp-site.xml for distcp on branch-2

2017-03-16 Thread Ravi Prakash (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-14189?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15928532#comment-15928532
 ] 

Ravi Prakash commented on HADOOP-14189:
---

Thanks Fei Hui for your contribution! Isn't this the same as HADOOP-10738? 
Could you please take a look and close this as a duplicate if it is?

> add distcp-site.xml for distcp on branch-2
> --
>
> Key: HADOOP-14189
> URL: https://issues.apache.org/jira/browse/HADOOP-14189
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: tools/distcp
>Reporter: Fei Hui
>Assignee: Fei Hui
> Attachments: HADOOP-14189-branch-2.001.patch
>
>
> On hadoop 2.x, we cannot configure hadoop parameters for distcp. It only 
> uses distcp-default.xml.
> We should add distcp-site.xml to override hadoop parameters.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-8039) mvn site:stage-deploy should not have broken links.

2017-03-09 Thread Ravi Prakash (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-8039?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15903952#comment-15903952
 ] 

Ravi Prakash commented on HADOOP-8039:
--

This problem still exists. To recreate it, here's what I did:
{code}
$ mvn site
$ mvn site:stage-deploy -DstagingSiteURL=file:///home/raviprak/stag
{code}

> mvn site:stage-deploy should not have broken links.
> ---
>
> Key: HADOOP-8039
> URL: https://issues.apache.org/jira/browse/HADOOP-8039
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: build, documentation
>Affects Versions: 0.23.1
>Reporter: Ravi Prakash
>
> The stage-deployed site has a lot of broken links / missing pages. We should 
> fix that.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Resolved] (HADOOP-11232) jersey-core-1.9 has a faulty glassfish-repo setting

2017-03-09 Thread Ravi Prakash (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-11232?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravi Prakash resolved HADOOP-11232.
---
Resolution: Duplicate

HADOOP-9613 seems to have upgraded jersey to 1.19. Please reopen if I'm 
mistaken.

> jersey-core-1.9 has a faulty glassfish-repo setting
> ---
>
> Key: HADOOP-11232
> URL: https://issues.apache.org/jira/browse/HADOOP-11232
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: build
>Reporter: Tsz Wo Nicholas Sze
>Assignee: Sushanth Sowmyan
>
> The following was reported by [~sushanth].
> hadoop-common brings in jersey-core-1.9 as a dependency by default.
> This is problematic, since the pom file for jersey 1.9 hardcode-specifies 
> glassfish-repo as the place to get further transitive dependencies, which 
> leads to a site that serves a static "this has moved" page instead of a 404. 
> This results in faulty parent resolutions, which when asked for a pom file, 
> get erroneous results.
> The only way around this seems to be to add a series of exclusions for 
> jersey-core, jersey-json, jersey-server and a bunch of others to 
> hadoop-common, then to hadoop-hdfs, then to hadoop-mapreduce-client-core. I 
> don't know how many more excludes are necessary before I can get this to work.
> If you update your jersey.version to 1.14, this faulty pom goes away. Please 
> either update that, or work with build infra to update our nexus pom for 
> jersey-1.9 so that it does not include the faulty glassfish repo.
> Another interesting note about this is that something changed yesterday 
> evening to cause this break in behaviour. We have not had this particular 
> problem in about 9+ months.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-13673) Update scripts to be smarter when running with privilege

2017-01-17 Thread Ravi Prakash (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13673?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15827031#comment-15827031
 ] 

Ravi Prakash commented on HADOOP-13673:
---

LGTM too! Thanks Allen and Andrew! +1

> Update scripts to be smarter when running with privilege
> 
>
> Key: HADOOP-13673
> URL: https://issues.apache.org/jira/browse/HADOOP-13673
> Project: Hadoop Common
>  Issue Type: New Feature
>  Components: scripts
>Affects Versions: 3.0.0-alpha1, 3.0.0-alpha2
>Reporter: Allen Wittenauer
>Assignee: Allen Wittenauer
>  Labels: security
> Attachments: HADOOP-13673.00.patch, HADOOP-13673.01.patch, 
> HADOOP-13673.02.patch, HADOOP-13673.03.patch, HADOOP-13673.04.patch
>
>
> As work continues on HADOOP-13397, it's become evident that we need better 
> hooks to start daemons as specifically configured users.  Via the 
> (command)_(subcommand)_USER environment variables in 3.x, we actually have a 
> standardized way to do that.  This in turn means we can make the sbin scripts 
> super functional with a bit of updating:
> * Consolidate start-dfs.sh and start-secure-dns.sh into one script
> * Make start-\*.sh and stop-\*.sh know how to switch users when run as root
> * Undeprecate start/stop-all.sh so that it could be used as root for 
> production purposes and as a single user for non-production users



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-13673) Update scripts to be smarter when running with privilege

2017-01-13 Thread Ravi Prakash (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13673?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15822113#comment-15822113
 ] 

Ravi Prakash commented on HADOOP-13673:
---

Hi Allen!

Thanks for the patch! It looks good. I could only find these nits:

# "Atempting" -> "Attempting"
# Remove "${EUID} comes from the shell itself!" in hadoop-functions.sh
# I'm not exactly sure how HADOOP_REEXECED_CMD is being used to prevent a fork 
bomb, but could a script set it to false explicitly as part of itself? i.e. 
what's preventing access to that variable from a user script? (See the guard 
sketch after this list.)
#pwd
# Is hadoop_abs supposed to resolve links? If yes, in hadoop_abs.bats could you 
please add a test for links?
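
Re nit 3, here is my understanding of the usual guard pattern as a hedged 
sketch (not the actual hadoop-functions.sh logic):
{code}
# A guard variable stops a script from re-exec'ing itself more than once.
if [[ "${HADOOP_REEXECED_CMD}" != "true" ]]; then
  export HADOOP_REEXECED_CMD=true
  # the real scripts re-exec as a different user; a plain re-exec is shown here
  exec "$0" "$@"
fi
{code}
The concern above still applies: anything that can overwrite the variable 
before the check defeats the guard.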

> Update scripts to be smarter when running with privilege
> 
>
> Key: HADOOP-13673
> URL: https://issues.apache.org/jira/browse/HADOOP-13673
> Project: Hadoop Common
>  Issue Type: New Feature
>  Components: scripts
>Affects Versions: 3.0.0-alpha1, 3.0.0-alpha2
>Reporter: Allen Wittenauer
>Assignee: Allen Wittenauer
>  Labels: security
> Attachments: HADOOP-13673.00.patch, HADOOP-13673.01.patch, 
> HADOOP-13673.02.patch, HADOOP-13673.03.patch
>
>
> As work continues on HADOOP-13397, it's become evident that we need better 
> hooks to start daemons as specifically configured users.  Via the 
> (command)_(subcommand)_USER environment variables in 3.x, we actually have a 
> standardized way to do that.  This in turn means we can make the sbin scripts 
> super functional with a bit of updating:
> * Consolidate start-dfs.sh and start-secure-dns.sh into one script
> * Make start-\*.sh and stop-\*.sh know how to switch users when run as root
> * Undeprecate start/stop-all.sh so that it could be used as root for 
> production purposes and as a single user for non-production users



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-13114) DistCp should have option to compress data on write

2017-01-10 Thread Ravi Prakash (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13114?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15816356#comment-15816356
 ] 

Ravi Prakash commented on HADOOP-13114:
---

Thanks Koji! I was under the impression that even binary files could be 
compressed quite well. For example, if I compress /usr/bin/xsane (a binary file):
{code}
[raviprak@ravi ~]$ ls -alh xsane.gz 
-rwxr-xr-x 1 raviprak raviprak 298K Jan 10 11:06 xsane.gz
[raviprak@ravi ~]$ ls -alh /usr/bin/xsane
-rwxr-xr-x 1 root root 744K Feb  5  2016 /usr/bin/xsane
{code}
The question is how many "binary" files we expect to be on HDFS, but that would 
mean making assumptions about Hadoop's use cases, and I'm not sure I want to 
hazard that. I'm sorry if I've misunderstood you. Could you please elucidate 
your concern if it's not that?

Thanks Nathan! I am ambivalent about this myself. Ideally we'd want to compress 
during transit (like {{rsync -z}}), but this JIRA was split out of that desire 
(from HADOOP-8065). For a variety of reasons HADOOP-8065 has been requested by 
a lot of _our_ customers (in addition to the hadoop users you can see in the 
voters and watchers list). Also, a few first-time contributors went above and 
beyond on this JIRA.

bq. What happens if we run the command with compression twice? distcp a->b, 
then b->c? I'm assuming c is a compressed version of b which is a compressed 
version of a. In order to read we'd have to unwind both layers of compression. 
Seems strange and really easy to accidentally have this happen.
You are right that compressed files would be nested, one inside the other. 
Compression tools would do similar nesting, wouldn't they? So I'm not sure it 
can be helped. And if I had checked the compression status, I'm sure someone 
would pipe up and say that I should have been nesting ;-) Perhaps yet another flag?

bq. Obvious question is: "if it's valuable to compress, why wasn't it 
compressed in the first place?"
In my experience, sometimes the source hadoop cluster is not in the control of 
the copier, or has a lot more capacity (so compression there is not a concern). 
Sometimes the source is written by IoT devices into a staging area, and rather 
than have a separate job that compresses the data, it'd be helpful to combine 
the copy with the compression. 

bq. Just the name bothers me a bit. copy commands don't normally transform 
data, but this one would.
Having said that, I do feel this argument is particularly compelling. I am not 
sure if this would be breaking precedent considering there is {{--append}} 
which is not exactly a "copy" either, but I do agree with your concern.

For now I will stop work on this JIRA unless I hear from a few more diverse 
viewpoints.

> DistCp should have option to compress data on write
> ---
>
> Key: HADOOP-13114
> URL: https://issues.apache.org/jira/browse/HADOOP-13114
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: tools/distcp
>Affects Versions: 2.8.0, 2.7.3, 3.0.0-alpha1
>Reporter: Suraj Nayak
>Assignee: Suraj Nayak
>Priority: Minor
>  Labels: distcp
> Attachments: HADOOP-13114-trunk_2016-05-07-1.patch, 
> HADOOP-13114-trunk_2016-05-08-1.patch, HADOOP-13114-trunk_2016-05-10-1.patch, 
> HADOOP-13114-trunk_2016-05-12-1.patch, HADOOP-13114.05.patch, 
> HADOOP-13114.06.patch
>
>   Original Estimate: 48h
>  Remaining Estimate: 48h
>
> The DistCp utility should have the capability to store data in a 
> user-specified compression format. This avoids one hop of compressing data 
> after transfer. Backup strategies to a different cluster also benefit from 
> saving one IO operation to and from HDFS, thus saving resources, time and effort.
> * Create an option -compressOutput defaulting to 
> {{org.apache.hadoop.io.compress.BZip2Codec}}. 
> * Users will be able to change codec with {{-D 
> mapreduce.output.fileoutputformat.compress.codec=org.apache.hadoop.io.compress.GzipCodec}}
> * If distcp compression is enabled, suffix the filenames with default codec 
> extension to indicate the file is compressed. Thus users can be aware of what 
> codec was used to compress the data.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Comment Edited] (HADOOP-13114) DistCp should have option to compress data on write

2017-01-09 Thread Ravi Prakash (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13114?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15813083#comment-15813083
 ] 

Ravi Prakash edited comment on HADOOP-13114 at 1/9/17 11:20 PM:


Here's a rebase of the patch from Suraj and Yongjun. To try it out, you could 
use this command: 
{code}
hadoop distcp 
-Ddistcp.output.fileoutputformat.compress.codec=org.apache.hadoop.io.compress.BZip2Codec
 --compressoutput /input /output
{code}


was (Author: raviprak):
Here's rebase of the patch from Suraj and Yongjun. To try it out, you could use 
this command: 
{code}
hadoop distcp 
-Ddistcp.output.fileoutputformat.compress.codec=org.apache.hadoop.io.compress.BZip2Codec
 --compressoutput /input /output
{code}

> DistCp should have option to compress data on write
> ---
>
> Key: HADOOP-13114
> URL: https://issues.apache.org/jira/browse/HADOOP-13114
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: tools/distcp
>Affects Versions: 2.8.0, 2.7.3, 3.0.0-alpha1
>Reporter: Suraj Nayak
>Assignee: Suraj Nayak
>Priority: Minor
>  Labels: distcp
> Attachments: HADOOP-13114-trunk_2016-05-07-1.patch, 
> HADOOP-13114-trunk_2016-05-08-1.patch, HADOOP-13114-trunk_2016-05-10-1.patch, 
> HADOOP-13114-trunk_2016-05-12-1.patch, HADOOP-13114.05.patch, 
> HADOOP-13114.06.patch
>
>   Original Estimate: 48h
>  Remaining Estimate: 48h
>
> The DistCp utility should have the capability to store data in a 
> user-specified compression format. This avoids one hop of compressing data 
> after transfer. Backup strategies to a different cluster also benefit from 
> saving one IO operation to and from HDFS, thus saving resources, time and effort.
> * Create an option -compressOutput defaulting to 
> {{org.apache.hadoop.io.compress.BZip2Codec}}. 
> * Users will be able to change codec with {{-D 
> mapreduce.output.fileoutputformat.compress.codec=org.apache.hadoop.io.compress.GzipCodec}}
> * If distcp compression is enabled, suffix the filenames with default codec 
> extension to indicate the file is compressed. Thus users can be aware of what 
> codec was used to compress the data.






[jira] [Commented] (HADOOP-13114) DistCp should have option to compress data on write

2017-01-09 Thread Ravi Prakash (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13114?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15813177#comment-15813177
 ] 

Ravi Prakash commented on HADOOP-13114:
---

Thanks for your comment Koji!

I guess it'd be useful for any files which are compressible, right? It would 
also let the target HDFS get by with less free space. Are you thinking there 
may be downsides?


> DistCp should have option to compress data on write
> ---
>
> Key: HADOOP-13114
> URL: https://issues.apache.org/jira/browse/HADOOP-13114
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: tools/distcp
>Affects Versions: 2.8.0, 2.7.3, 3.0.0-alpha1
>Reporter: Suraj Nayak
>Assignee: Suraj Nayak
>Priority: Minor
>  Labels: distcp
> Attachments: HADOOP-13114-trunk_2016-05-07-1.patch, 
> HADOOP-13114-trunk_2016-05-08-1.patch, HADOOP-13114-trunk_2016-05-10-1.patch, 
> HADOOP-13114-trunk_2016-05-12-1.patch, HADOOP-13114.05.patch, 
> HADOOP-13114.06.patch
>
>   Original Estimate: 48h
>  Remaining Estimate: 48h
>
> The DistCp utility should have the capability to store data in a user-specified 
> compression format. This avoids one extra hop of compressing the data after the 
> transfer. Backup strategies to a different cluster also benefit by saving one 
> I/O operation to and from HDFS, thus saving resources, time, and effort.
> * Create an option -compressOutput defaulting to 
> {{org.apache.hadoop.io.compress.BZip2Codec}}.
> * Users will be able to change the codec with {{-D 
> mapreduce.output.fileoutputformat.compress.codec=org.apache.hadoop.io.compress.GzipCodec}}
> * If distcp compression is enabled, suffix the filenames with the default codec 
> extension to indicate that the file is compressed, so users know which codec 
> was used to compress the data.






[jira] [Updated] (HADOOP-13114) DistCp should have option to compress data on write

2017-01-09 Thread Ravi Prakash (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-13114?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravi Prakash updated HADOOP-13114:
--
Attachment: HADOOP-13114.06.patch

Here's a rebase of the patch from Suraj and Yongjun. To try it out, you could use 
this command: 
{code}
hadoop distcp 
-Ddistcp.output.fileoutputformat.compress.codec=org.apache.hadoop.io.compress.BZip2Codec
 --compressoutput /input /output
{code}

> DistCp should have option to compress data on write
> ---
>
> Key: HADOOP-13114
> URL: https://issues.apache.org/jira/browse/HADOOP-13114
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: tools/distcp
>Affects Versions: 2.8.0, 2.7.3, 3.0.0-alpha1
>Reporter: Suraj Nayak
>Assignee: Suraj Nayak
>Priority: Minor
>  Labels: distcp
> Attachments: HADOOP-13114-trunk_2016-05-07-1.patch, 
> HADOOP-13114-trunk_2016-05-08-1.patch, HADOOP-13114-trunk_2016-05-10-1.patch, 
> HADOOP-13114-trunk_2016-05-12-1.patch, HADOOP-13114.05.patch, 
> HADOOP-13114.06.patch
>
>   Original Estimate: 48h
>  Remaining Estimate: 48h
>
> The DistCp utility should have the capability to store data in a user-specified 
> compression format. This avoids one extra hop of compressing the data after the 
> transfer. Backup strategies to a different cluster also benefit by saving one 
> I/O operation to and from HDFS, thus saving resources, time, and effort.
> * Create an option -compressOutput defaulting to 
> {{org.apache.hadoop.io.compress.BZip2Codec}}.
> * Users will be able to change the codec with {{-D 
> mapreduce.output.fileoutputformat.compress.codec=org.apache.hadoop.io.compress.GzipCodec}}
> * If distcp compression is enabled, suffix the filenames with the default codec 
> extension to indicate that the file is compressed, so users know which codec 
> was used to compress the data.






[jira] [Updated] (HADOOP-13898) should set HADOOP_JOB_HISTORYSERVER_HEAPSIZE only if it's empty on branch2

2016-12-13 Thread Ravi Prakash (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-13898?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravi Prakash updated HADOOP-13898:
--
   Resolution: Fixed
Fix Version/s: 2.9.0
   Status: Resolved  (was: Patch Available)

Thanks for your contribution Fei Hui! I've committed this to branch-2. It 
should be released in 2.9.0.

> should set HADOOP_JOB_HISTORYSERVER_HEAPSIZE only if it's empty on branch2
> --
>
> Key: HADOOP-13898
> URL: https://issues.apache.org/jira/browse/HADOOP-13898
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: scripts
>Affects Versions: 2.9.0
>Reporter: Fei Hui
>Assignee: Fei Hui
> Fix For: 2.9.0
>
> Attachments: HADOOP-13898-branch-2.001.patch, 
> HADOOP-13898-branch-2.002.patch
>
>
> In mapred-env.sh, HADOOP_JOB_HISTORYSERVER_HEAPSIZE is set to 1000 
> unconditionally. That is incorrect.
> We should set it to 1000 by default only if it's empty, because if you run 
> 'HADOOP_JOB_HISTORYSERVER_HEAPSIZE=512 
> $HADOOP_HOME/sbin/mr-jobhistory-daemon.sh start historyserver', 
> HADOOP_JOB_HISTORYSERVER_HEAPSIZE will be set to 1000 rather than 512.






[jira] [Commented] (HADOOP-13898) should set HADOOP_JOB_HISTORYSERVER_HEAPSIZE only if it's empty on branch2

2016-12-13 Thread Ravi Prakash (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13898?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15746274#comment-15746274
 ] 

Ravi Prakash commented on HADOOP-13898:
---

Looks good to me. +1. Committing to branch-2 shortly.

> should set HADOOP_JOB_HISTORYSERVER_HEAPSIZE only if it's empty on branch2
> --
>
> Key: HADOOP-13898
> URL: https://issues.apache.org/jira/browse/HADOOP-13898
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: scripts
>Affects Versions: 2.9.0
>Reporter: Fei Hui
>Assignee: Fei Hui
> Attachments: HADOOP-13898-branch-2.001.patch, 
> HADOOP-13898-branch-2.002.patch
>
>
> In mapred-env.sh, HADOOP_JOB_HISTORYSERVER_HEAPSIZE is set to 1000 
> unconditionally. That is incorrect.
> We should set it to 1000 by default only if it's empty, because if you run 
> 'HADOOP_JOB_HISTORYSERVER_HEAPSIZE=512 
> $HADOOP_HOME/sbin/mr-jobhistory-daemon.sh start historyserver', 
> HADOOP_JOB_HISTORYSERVER_HEAPSIZE will be set to 1000 rather than 512.






[jira] [Commented] (HADOOP-13849) Bzip2 java-builtin and system-native have almost the same compress speed

2016-12-01 Thread Ravi Prakash (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13849?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15712862#comment-15712862
 ] 

Ravi Prakash commented on HADOOP-13849:
---

That makes sense Steve and Tao Li! Thanks for your efforts. Please keep us 
updated if you find any bottlenecks. 
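
If it helps isolate the codecs from the MapReduce machinery, here is a minimal 
micro-benchmark sketch (the class name, paths, and buffer size are hypothetical; 
this is not code from the report below):
{code:java}
import java.io.InputStream;
import java.io.OutputStream;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IOUtils;
import org.apache.hadoop.io.compress.BZip2Codec;

public class Bzip2Bench {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    // flip between "system-native" and "java-builtin" to compare
    conf.set("io.compression.codec.bzip2.library", "system-native");
    FileSystem fs = FileSystem.get(conf);
    BZip2Codec codec = new BZip2Codec();
    codec.setConf(conf); // picks up the library selection above
    long start = System.nanoTime();
    try (InputStream in = fs.open(new Path("/tmp/input.txt"));
         OutputStream out =
             codec.createOutputStream(fs.create(new Path("/tmp/out.bz2")))) {
      IOUtils.copyBytes(in, out, 64 * 1024); // stream through the codec
    }
    System.out.println("compressed in "
        + (System.nanoTime() - start) / 1000000 + " ms");
  }
}
{code}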

> Bzip2 java-builtin and system-native have almost the same compress speed
> 
>
> Key: HADOOP-13849
> URL: https://issues.apache.org/jira/browse/HADOOP-13849
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: common
>Affects Versions: 2.6.0
> Environment: os version: redhat6
> hadoop version: 2.6.0
> native bzip2 version: bzip2-devel-1.0.5-7.el6_0.x86_64
>Reporter: Tao Li
>
> I tested bzip2 java-builtin and system-native compression, and I found the 
> compression speed is almost the same. (I would expect system-native to 
> compress faster than java-builtin.)
> My test case:
> 1. input file: 2.7GB text file without compression
> 2. after bzip2 java-builtin compress: 457MB, 12min 4sec
> 3. after bzip2 system-native compress: 457MB, 12min 19sec
> My MapReduce Config:
> conf.set("mapreduce.fileoutputcommitter.marksuccessfuljobs", "false");
> conf.set("mapreduce.output.fileoutputformat.compress", "true");
> conf.set("mapreduce.output.fileoutputformat.compress.type", "BLOCK");
> conf.set("mapreduce.output.fileoutputformat.compress.codec", 
> "org.apache.hadoop.io.compress.BZip2Codec");
> conf.set("io.compression.codec.bzip2.library", "java-builtin"); // for 
> java-builtin
> conf.set("io.compression.codec.bzip2.library", "system-native"); // for 
> system-native
> And I am sure I have enabled the bzip2 native library; the output of the 
> command "hadoop checknative -a" is as follows:
> Native library checking:
> hadoop:  true /usr/lib/hadoop/lib/native/libhadoop.so.1.0.0
> zlib:true /lib64/libz.so.1
> snappy:  true /usr/lib/hadoop/lib/native/libsnappy.so.1
> lz4: true revision:99
> bzip2:   true /lib64/libbz2.so.1
> openssl: true /usr/lib64/libcrypto.so






[jira] [Commented] (HADOOP-13849) Bzip2 java-builtin and system-native have almost the same compress speed

2016-11-30 Thread Ravi Prakash (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13849?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15709305#comment-15709305
 ] 

Ravi Prakash commented on HADOOP-13849:
---

Hi Tao Li!

Thanks for your effort to benchmark the two implementations. Are you proposing 
to make one faster than the other?

> Bzip2 java-builtin and system-native have almost the same compress speed
> 
>
> Key: HADOOP-13849
> URL: https://issues.apache.org/jira/browse/HADOOP-13849
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: common
>Affects Versions: 2.6.0
> Environment: os version: redhat6
> hadoop version: 2.6.0
> native bzip2 version: bzip2-devel-1.0.5-7.el6_0.x86_64
>Reporter: Tao Li
>
> I tested bzip2 java-builtin and system-native compression, and I found the 
> compression speed is almost the same. (I would expect system-native to 
> compress faster than java-builtin.)
> My test case:
> 1. input file: 2.7GB text file without compression
> 2. after bzip2 java-builtin compress: 457MB, 12min 4sec
> 3. after bzip2 system-native compress: 457MB, 12min 19sec
> My MapReduce Config:
> conf.set("mapreduce.fileoutputcommitter.marksuccessfuljobs", "false");
> conf.set("mapreduce.output.fileoutputformat.compress", "true");
> conf.set("mapreduce.output.fileoutputformat.compress.type", "BLOCK");
> conf.set("mapreduce.output.fileoutputformat.compress.codec", 
> "org.apache.hadoop.io.compress.BZip2Codec");
> conf.set("io.compression.codec.bzip2.library", "java-builtin"); // for 
> java-builtin
> conf.set("io.compression.codec.bzip2.library", "system-native"); // for 
> system-native
> And I am sure I have enabled the bzip2 native library; the output of the 
> command "hadoop checknative -a" is as follows:
> Native library checking:
> hadoop:  true /usr/lib/hadoop/lib/native/libhadoop.so.1.0.0
> zlib:true /lib64/libz.so.1
> snappy:  true /usr/lib/hadoop/lib/native/libsnappy.so.1
> lz4: true revision:99
> bzip2:   true /lib64/libbz2.so.1
> openssl: true /usr/lib64/libcrypto.so






[jira] [Updated] (HADOOP-13114) DistCp should have option to compress data on write

2016-11-17 Thread Ravi Prakash (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-13114?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravi Prakash updated HADOOP-13114:
--
Attachment: HADOOP-13114.05.patch

Hi Suraj!

Thanks a lot for all your efforts to improve DistCp. My sincere apologies for 
not paying attention to this issue; I was a bit busy when you asked and then 
never got back to it. Yongjun seems to want this in, so we'll make another 
push for it.
Here's a rebase for the latest trunk. I'll try to review and test it in the 
coming days.

> DistCp should have option to compress data on write
> ---
>
> Key: HADOOP-13114
> URL: https://issues.apache.org/jira/browse/HADOOP-13114
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: tools/distcp
>Affects Versions: 2.8.0, 2.7.3, 3.0.0-alpha1
>Reporter: Suraj Nayak
>Assignee: Suraj Nayak
>Priority: Minor
>  Labels: distcp
> Attachments: HADOOP-13114-trunk_2016-05-07-1.patch, 
> HADOOP-13114-trunk_2016-05-08-1.patch, HADOOP-13114-trunk_2016-05-10-1.patch, 
> HADOOP-13114-trunk_2016-05-12-1.patch, HADOOP-13114.05.patch
>
>   Original Estimate: 48h
>  Remaining Estimate: 48h
>
> The DistCp utility should have the capability to store data in a user-specified 
> compression format. This avoids one extra hop of compressing the data after the 
> transfer. Backup strategies to a different cluster also benefit by saving one 
> I/O operation to and from HDFS, thus saving resources, time, and effort.
> * Create an option -compressOutput defaulting to 
> {{org.apache.hadoop.io.compress.BZip2Codec}}.
> * Users will be able to change the codec with {{-D 
> mapreduce.output.fileoutputformat.compress.codec=org.apache.hadoop.io.compress.GzipCodec}}
> * If distcp compression is enabled, suffix the filenames with the default codec 
> extension to indicate that the file is compressed, so users know which codec 
> was used to compress the data.






[jira] [Commented] (HADOOP-8065) distcp should have an option to compress data while copying.

2016-11-17 Thread Ravi Prakash (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-8065?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15675308#comment-15675308
 ] 

Ravi Prakash commented on HADOOP-8065:
--

Hi Yongjun!

Thanks for rebasing the patch and for your polishing touches. I think HADOOP-13114 
might be the more appropriate JIRA for these changes (which Suraj kindly filed 
at my request earlier), since this patch does not compress *during* transfer, 
only after the transfer and before writing to HDFS.
- {{getCompressionCodcec}} has the same typo I pointed out to Suraj. He did 
post updated patches on HADOOP-13114. I apologize for neglecting to review 
those patches despite Suraj's requests.
- {{getCompressionCodcec}} also uses ReflectionUtils. I don't know if it'd be 
better to use [this 
pattern|https://github.com/apache/hadoop/blob/b4f1971ff1dd578353036d7a123fe83c27c1e803/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapred/lib/CombineFileInputFormat.java#L159]
 instead?
- We're still not using a CodecPool as I suggested earlier; the patch in 
HADOOP-13114 actually is. Let me rebase and upload that. Could you please take 
a look? (A rough sketch of the CodecPool pattern follows below.)
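
For reference, the CodecPool pattern looks roughly like this (a hedged sketch, 
not the patch itself):
{code:java}
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.io.IOUtils;
import org.apache.hadoop.io.compress.CodecPool;
import org.apache.hadoop.io.compress.CompressionCodec;
import org.apache.hadoop.io.compress.Compressor;

public class CodecPoolExample {
  static void copyCompressed(CompressionCodec codec, Configuration conf,
      InputStream in, OutputStream rawOut) throws IOException {
    // borrow a (possibly native) compressor from the shared pool
    Compressor compressor = CodecPool.getCompressor(codec, conf);
    try (OutputStream out = codec.createOutputStream(rawOut, compressor)) {
      IOUtils.copyBytes(in, out, 64 * 1024);
    } finally {
      CodecPool.returnCompressor(compressor); // recycle instead of leaking
    }
  }
}
{code}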

> distcp should have an option to compress data while copying.
> 
>
> Key: HADOOP-8065
> URL: https://issues.apache.org/jira/browse/HADOOP-8065
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: fs
>Affects Versions: 0.20.2
>Reporter: Suresh Antony
>Assignee: Suraj Nayak
>Priority: Minor
>  Labels: distcp
> Fix For: 0.20.2
>
> Attachments: HADOOP-8065-trunk_2015-11-03.patch, 
> HADOOP-8065-trunk_2015-11-04.patch, HADOOP-8065-trunk_2016-04-29-4.patch, 
> HADOOP-8065.005.patch, HADOOP-8065.006.patch, patch.distcp.2012-02-10
>
>
> We would like to compress the data while transferring it from our source 
> system to the target system. One way to do this is to write a map/reduce job 
> that compresses the data before/after the transfer, but that looks 
> inefficient.
> Since distcp is already reading and writing the data, it would be better if 
> it could compress while doing so.
> The flip side is that the distcp -update option cannot check the file size 
> before copying the data; it can only check for the existence of the file.
> So I propose that if the -compress option is given, the file size is not 
> checked.
> Also, when we copy a file, the appropriate extension needs to be added to it 
> depending on the compression type.






[jira] [Commented] (HADOOP-8065) distcp should have an option to compress data while copying.

2016-11-15 Thread Ravi Prakash (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-8065?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15668015#comment-15668015
 ] 

Ravi Prakash commented on HADOOP-8065:
--

Thanks for rebasing Yongjun! I'll take a look. Does it look good to you?

> distcp should have an option to compress data while copying.
> 
>
> Key: HADOOP-8065
> URL: https://issues.apache.org/jira/browse/HADOOP-8065
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: fs
>Affects Versions: 0.20.2
>Reporter: Suresh Antony
>Assignee: Suraj Nayak
>Priority: Minor
>  Labels: distcp
> Fix For: 0.20.2
>
> Attachments: HADOOP-8065-trunk_2015-11-03.patch, 
> HADOOP-8065-trunk_2015-11-04.patch, HADOOP-8065-trunk_2016-04-29-4.patch, 
> HADOOP-8065.005.patch, patch.distcp.2012-02-10
>
>
> We would like to compress the data while transferring it from our source 
> system to the target system. One way to do this is to write a map/reduce job 
> that compresses the data before/after the transfer, but that looks 
> inefficient.
> Since distcp is already reading and writing the data, it would be better if 
> it could compress while doing so.
> The flip side is that the distcp -update option cannot check the file size 
> before copying the data; it can only check for the existence of the file.
> So I propose that if the -compress option is given, the file size is not 
> checked.
> Also, when we copy a file, the appropriate extension needs to be added to it 
> depending on the compression type.






[jira] [Commented] (HADOOP-13773) wrong HADOOP_CLIENT_OPTS in hadoop-env on branch-2

2016-11-01 Thread Ravi Prakash (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13773?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15626888#comment-15626888
 ] 

Ravi Prakash commented on HADOOP-13773:
---

Awesome! Sounds good, Andrew! Thanks. I'll be more careful in the future.

> wrong HADOOP_CLIENT_OPTS in hadoop-env  on branch-2
> ---
>
> Key: HADOOP-13773
> URL: https://issues.apache.org/jira/browse/HADOOP-13773
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: conf
>Affects Versions: 2.6.1, 2.7.3
>Reporter: Fei Hui
>Assignee: Fei Hui
> Fix For: 2.8.0
>
> Attachments: HADOOP-13773.patch
>
>
> In conf/hadoop-env.sh:
> export HADOOP_CLIENT_OPTS="-Xmx512m $HADOOP_CLIENT_OPTS"
> When I set HADOOP_HEAPSIZE and run 'hadoop jar ...', the JVM heap argument 
> does not take effect.
> In bin/hadoop:
> exec "$JAVA" $JAVA_HEAP_MAX $HADOOP_OPTS $CLASS "$@"
> HADOOP_OPTS comes after JAVA_HEAP_MAX, so HADOOP_HEAPSIZE does not work.
> For example, if I run 'HADOOP_HEAPSIZE=1024 hadoop jar ...', the java process 
> is 'java -Xmx1024m ... -Xmx512m ...'; the later -Xmx512m takes effect and 
> -Xmx1024m is ignored.






[jira] [Commented] (HADOOP-13773) wrong HADOOP_CLIENT_OPTS in hadoop-env on branch-2

2016-11-01 Thread Ravi Prakash (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13773?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15626805#comment-15626805
 ] 

Ravi Prakash commented on HADOOP-13773:
---

Andrew! The precommit was known to fail because the patch applies only to 
branch-2 (not trunk).

> wrong HADOOP_CLIENT_OPTS in hadoop-env  on branch-2
> ---
>
> Key: HADOOP-13773
> URL: https://issues.apache.org/jira/browse/HADOOP-13773
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: conf
>Affects Versions: 2.6.1, 2.7.3
>Reporter: Fei Hui
>Assignee: Fei Hui
> Fix For: 2.8.0
>
> Attachments: HADOOP-13773.patch
>
>
> In conf/hadoop-env.sh:
> export HADOOP_CLIENT_OPTS="-Xmx512m $HADOOP_CLIENT_OPTS"
> When I set HADOOP_HEAPSIZE and run 'hadoop jar ...', the JVM heap argument 
> does not take effect.
> In bin/hadoop:
> exec "$JAVA" $JAVA_HEAP_MAX $HADOOP_OPTS $CLASS "$@"
> HADOOP_OPTS comes after JAVA_HEAP_MAX, so HADOOP_HEAPSIZE does not work.
> For example, if I run 'HADOOP_HEAPSIZE=1024 hadoop jar ...', the java process 
> is 'java -Xmx1024m ... -Xmx512m ...'; the later -Xmx512m takes effect and 
> -Xmx1024m is ignored.






[jira] [Commented] (HADOOP-13773) wrong HADOOP_CLIENT_OPTS in hadoop-env on branch-2

2016-11-01 Thread Ravi Prakash (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13773?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15626155#comment-15626155
 ] 

Ravi Prakash commented on HADOOP-13773:
---

Just for future reference: I've added you to the Contributors1 role, which 
allows you to assign issues to yourself. Also, please follow the patch naming 
scheme: all patch files should have a version, and if the patch is not for 
trunk, the name should contain the branch. So the file name for your patch 
should be HADOOP-13773.branch-2.01.patch.

Thanks for your contribution and we look forward to many more from you!

> wrong HADOOP_CLIENT_OPTS in hadoop-env  on branch-2
> ---
>
> Key: HADOOP-13773
> URL: https://issues.apache.org/jira/browse/HADOOP-13773
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: conf
>Affects Versions: 2.6.1, 2.7.3
>Reporter: Fei Hui
>Assignee: Fei Hui
> Attachments: HADOOP-13773.patch
>
>
> In conf/hadoop-env.sh:
> export HADOOP_CLIENT_OPTS="-Xmx512m $HADOOP_CLIENT_OPTS"
> When I set HADOOP_HEAPSIZE and run 'hadoop jar ...', the JVM heap argument 
> does not take effect.
> In bin/hadoop:
> exec "$JAVA" $JAVA_HEAP_MAX $HADOOP_OPTS $CLASS "$@"
> HADOOP_OPTS comes after JAVA_HEAP_MAX, so HADOOP_HEAPSIZE does not work.
> For example, if I run 'HADOOP_HEAPSIZE=1024 hadoop jar ...', the java process 
> is 'java -Xmx1024m ... -Xmx512m ...'; the later -Xmx512m takes effect and 
> -Xmx1024m is ignored.






[jira] [Updated] (HADOOP-13773) wrong HADOOP_CLIENT_OPTS in hadoop-env on branch-2

2016-11-01 Thread Ravi Prakash (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-13773?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravi Prakash updated HADOOP-13773:
--
Assignee: Fei Hui

> wrong HADOOP_CLIENT_OPTS in hadoop-env  on branch-2
> ---
>
> Key: HADOOP-13773
> URL: https://issues.apache.org/jira/browse/HADOOP-13773
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: conf
>Affects Versions: 2.6.1, 2.7.3
>Reporter: Fei Hui
>Assignee: Fei Hui
> Attachments: HADOOP-13773.patch
>
>
> In conf/hadoop-env.sh:
> export HADOOP_CLIENT_OPTS="-Xmx512m $HADOOP_CLIENT_OPTS"
> When I set HADOOP_HEAPSIZE and run 'hadoop jar ...', the JVM heap argument 
> does not take effect.
> In bin/hadoop:
> exec "$JAVA" $JAVA_HEAP_MAX $HADOOP_OPTS $CLASS "$@"
> HADOOP_OPTS comes after JAVA_HEAP_MAX, so HADOOP_HEAPSIZE does not work.
> For example, if I run 'HADOOP_HEAPSIZE=1024 hadoop jar ...', the java process 
> is 'java -Xmx1024m ... -Xmx512m ...'; the later -Xmx512m takes effect and 
> -Xmx1024m is ignored.






[jira] [Commented] (HADOOP-13773) wrong HADOOP_CLIENT_OPTS in hadoop-env on branch-2

2016-11-01 Thread Ravi Prakash (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13773?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15626142#comment-15626142
 ] 

Ravi Prakash commented on HADOOP-13773:
---

Thanks for your contribution Fei Hui and for your careful review Yuanbo Liu!

I've committed this to branch-2 and branch-2.8. 

> wrong HADOOP_CLIENT_OPTS in hadoop-env  on branch-2
> ---
>
> Key: HADOOP-13773
> URL: https://issues.apache.org/jira/browse/HADOOP-13773
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: conf
>Affects Versions: 2.6.1, 2.7.3
>Reporter: Fei Hui
> Attachments: HADOOP-13773.patch
>
>
> In conf/hadoop-env.sh:
> export HADOOP_CLIENT_OPTS="-Xmx512m $HADOOP_CLIENT_OPTS"
> When I set HADOOP_HEAPSIZE and run 'hadoop jar ...', the JVM heap argument 
> does not take effect.
> In bin/hadoop:
> exec "$JAVA" $JAVA_HEAP_MAX $HADOOP_OPTS $CLASS "$@"
> HADOOP_OPTS comes after JAVA_HEAP_MAX, so HADOOP_HEAPSIZE does not work.
> For example, if I run 'HADOOP_HEAPSIZE=1024 hadoop jar ...', the java process 
> is 'java -Xmx1024m ... -Xmx512m ...'; the later -Xmx512m takes effect and 
> -Xmx1024m is ignored.






[jira] [Commented] (HADOOP-13773) wrong HADOOP_CLIENT_OPTS in hadoop-env on branch-2

2016-10-31 Thread Ravi Prakash (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13773?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15624384#comment-15624384
 ] 

Ravi Prakash commented on HADOOP-13773:
---

Hi Fei Hui!

Welcome to the community and thanks for your contribution! I've taken the 
liberty of editing the JIRA fields to the values we're used to. (Fix Version is 
set when the patch is merged; Target Version is set to the next expected 
release that would contain the fix; Description contains the problem.)

https://wiki.apache.org/hadoop/HowToContribute is a fairly verbose guide on how 
to contribute. Instead of making you read it in its entirety, I'd suggest 
uploading a patch file instead of a GitHub pull request (because what happens 
to the review discussion if GitHub.com were to fold tomorrow?).

Just FYI, Allen rewrote the shell scripts for trunk 
(https://issues.apache.org/jira/browse/HADOOP-9902) and he's the reigning 
expert in that area. I'll defer to his better judgement. Unfortunately those 
improvements were not fully backported into branch-2 (which explains any 
discrepancy you may be seeing).

It seems that in trunk the problem of multiple -Xmx values is handled much more 
elegantly with 
[hadoop_add_param|https://github.com/apache/hadoop/blob/trunk/hadoop-common-project/hadoop-common/src/main/bin/hadoop-functions.sh#L830].
 However, your fix makes sense to me for branch-2; we shouldn't be appending 
{{-Xmx512m}} indiscriminately. Could you please upload the patch file and I'll 
be happy to commit it.



> wrong HADOOP_CLIENT_OPTS in hadoop-env  on branch-2
> ---
>
> Key: HADOOP-13773
> URL: https://issues.apache.org/jira/browse/HADOOP-13773
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: conf
>Affects Versions: 2.6.1, 2.7.3
>Reporter: Fei Hui
>
> In conf/hadoop-env.sh:
> export HADOOP_CLIENT_OPTS="-Xmx512m $HADOOP_CLIENT_OPTS"
> When I set HADOOP_HEAPSIZE and run 'hadoop jar ...', the JVM heap argument 
> does not take effect.
> In bin/hadoop:
> exec "$JAVA" $JAVA_HEAP_MAX $HADOOP_OPTS $CLASS "$@"
> HADOOP_OPTS comes after JAVA_HEAP_MAX, so HADOOP_HEAPSIZE does not work.
> For example, if I run 'HADOOP_HEAPSIZE=1024 hadoop jar ...', the java process 
> is 'java -Xmx1024m ... -Xmx512m ...'; the later -Xmx512m takes effect and 
> -Xmx1024m is ignored.






[jira] [Updated] (HADOOP-13773) wrong HADOOP_CLIENT_OPTS in hadoop-env on branch-2

2016-10-31 Thread Ravi Prakash (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-13773?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravi Prakash updated HADOOP-13773:
--
Target Version/s: 2.8.0  (was: 2.7.3)

> wrong HADOOP_CLIENT_OPTS in hadoop-env  on branch-2
> ---
>
> Key: HADOOP-13773
> URL: https://issues.apache.org/jira/browse/HADOOP-13773
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: conf
>Affects Versions: 2.6.1, 2.7.3
>Reporter: Fei Hui
>







[jira] [Updated] (HADOOP-13773) wrong HADOOP_CLIENT_OPTS in hadoop-env on branch-2

2016-10-31 Thread Ravi Prakash (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-13773?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravi Prakash updated HADOOP-13773:
--
Description: 
In conf/hadoop-env.sh:
export HADOOP_CLIENT_OPTS="-Xmx512m $HADOOP_CLIENT_OPTS"

When I set HADOOP_HEAPSIZE and run 'hadoop jar ...', the JVM heap argument does 
not take effect.

In bin/hadoop:
exec "$JAVA" $JAVA_HEAP_MAX $HADOOP_OPTS $CLASS "$@"

HADOOP_OPTS comes after JAVA_HEAP_MAX, so HADOOP_HEAPSIZE does not work.

For example, if I run 'HADOOP_HEAPSIZE=1024 hadoop jar ...', the java process is 
'java -Xmx1024m ... -Xmx512m ...'; the later -Xmx512m takes effect and -Xmx1024m 
is ignored.


> wrong HADOOP_CLIENT_OPTS in hadoop-env  on branch-2
> ---
>
> Key: HADOOP-13773
> URL: https://issues.apache.org/jira/browse/HADOOP-13773
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: conf
>Affects Versions: 2.6.1, 2.7.3
>Reporter: Fei Hui
>
> In conf/hadoop-env.sh:
> export HADOOP_CLIENT_OPTS="-Xmx512m $HADOOP_CLIENT_OPTS"
> When I set HADOOP_HEAPSIZE and run 'hadoop jar ...', the JVM heap argument 
> does not take effect.
> In bin/hadoop:
> exec "$JAVA" $JAVA_HEAP_MAX $HADOOP_OPTS $CLASS "$@"
> HADOOP_OPTS comes after JAVA_HEAP_MAX, so HADOOP_HEAPSIZE does not work.
> For example, if I run 'HADOOP_HEAPSIZE=1024 hadoop jar ...', the java process 
> is 'java -Xmx1024m ... -Xmx512m ...'; the later -Xmx512m takes effect and 
> -Xmx1024m is ignored.






[jira] [Updated] (HADOOP-13773) wrong HADOOP_CLIENT_OPTS in hadoop-env on branch-2

2016-10-31 Thread Ravi Prakash (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-13773?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravi Prakash updated HADOOP-13773:
--
Fix Version/s: (was: 2.7.4)
   (was: 2.9.0)
   (was: 2.8.0)

> wrong HADOOP_CLIENT_OPTS in hadoop-env  on branch-2
> ---
>
> Key: HADOOP-13773
> URL: https://issues.apache.org/jira/browse/HADOOP-13773
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: conf
>Affects Versions: 2.6.1, 2.7.3
>Reporter: Fei Hui
>







[jira] [Commented] (HADOOP-10075) Update jetty dependency to version 9

2016-10-27 Thread Ravi Prakash (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-10075?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15613462#comment-15613462
 ] 

Ravi Prakash commented on HADOOP-10075:
---

All the changes in java and pom files look good to me. I ran all the unit tests 
on trunk with and without patch, and the same unit tests fail, so I'm crossing 
my fingers that the patch doesn't introduce any new unit test failures. I also 
started all daemons and clicked around and saw nothing unusual. I checked that 
the /conf, /jmx and REST URIs still work.

I can't submit jobs on unpatched trunk right now (it complains {{Could not find 
or load main class org.apache.hadoop.mapreduce.v2.app.MRAppMaster}}) but it's an 
orthogonal issue and not affected by your patch.

Thanks for the massive amount of effort. The 011 patch looks good to me. +1.

Please feel free to commit it yourself. Otherwise I'm happy to do it by the end 
of the day.

> Update jetty dependency to version 9
> 
>
> Key: HADOOP-10075
> URL: https://issues.apache.org/jira/browse/HADOOP-10075
> Project: Hadoop Common
>  Issue Type: Improvement
>Affects Versions: 2.2.0, 2.6.0
>Reporter: Robert Rati
>Assignee: Robert Kanter
>Priority: Critical
> Attachments: HADOOP-10075-002-wip.patch, HADOOP-10075.003.patch, 
> HADOOP-10075.004.patch, HADOOP-10075.005.patch, HADOOP-10075.006.patch, 
> HADOOP-10075.007.patch, HADOOP-10075.008.patch, HADOOP-10075.009.patch, 
> HADOOP-10075.010.patch, HADOOP-10075.011.patch, HADOOP-10075.patch
>
>
> Jetty6 is no longer maintained.  Update the dependency to jetty9.






[jira] [Commented] (HADOOP-10075) Update jetty dependency to version 9

2016-10-27 Thread Ravi Prakash (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-10075?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15612868#comment-15612868
 ] 

Ravi Prakash commented on HADOOP-10075:
---

Still working on it, Robert! I'll try to finish by today.

> Update jetty dependency to version 9
> 
>
> Key: HADOOP-10075
> URL: https://issues.apache.org/jira/browse/HADOOP-10075
> Project: Hadoop Common
>  Issue Type: Improvement
>Affects Versions: 2.2.0, 2.6.0
>Reporter: Robert Rati
>Assignee: Robert Kanter
>Priority: Critical
> Attachments: HADOOP-10075-002-wip.patch, HADOOP-10075.003.patch, 
> HADOOP-10075.004.patch, HADOOP-10075.005.patch, HADOOP-10075.006.patch, 
> HADOOP-10075.007.patch, HADOOP-10075.008.patch, HADOOP-10075.009.patch, 
> HADOOP-10075.010.patch, HADOOP-10075.011.patch, HADOOP-10075.patch
>
>
> Jetty6 is no longer maintained.  Update the dependency to jetty9.






[jira] [Commented] (HADOOP-10075) Update jetty dependency to version 9

2016-10-24 Thread Ravi Prakash (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-10075?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15603395#comment-15603395
 ] 

Ravi Prakash commented on HADOOP-10075:
---

Thanks, Robert, for your work. I'm trying to run all the unit tests, which 
takes 10 hours on the computer I can spare. If the Jenkins bot came back with a 
+1, it'd be much easier for me to +1 the patch too.

> Update jetty dependency to version 9
> 
>
> Key: HADOOP-10075
> URL: https://issues.apache.org/jira/browse/HADOOP-10075
> Project: Hadoop Common
>  Issue Type: Improvement
>Affects Versions: 2.2.0, 2.6.0
>Reporter: Robert Rati
>Assignee: Robert Kanter
>Priority: Critical
> Attachments: HADOOP-10075-002-wip.patch, HADOOP-10075.003.patch, 
> HADOOP-10075.004.patch, HADOOP-10075.005.patch, HADOOP-10075.006.patch, 
> HADOOP-10075.007.patch, HADOOP-10075.008.patch, HADOOP-10075.009.patch, 
> HADOOP-10075.010.patch, HADOOP-10075.011.patch, HADOOP-10075.patch
>
>
> Jetty6 is no longer maintained.  Update the dependency to jetty9.






[jira] [Commented] (HADOOP-10075) Update jetty dependency to version 9

2016-10-19 Thread Ravi Prakash (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-10075?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15589256#comment-15589256
 ] 

Ravi Prakash commented on HADOOP-10075:
---

Thanks Robert! I think I am done with my feedback. If all the tests that passed 
earlier still pass with the new patch, I'm happy to +1 it.


> Update jetty dependency to version 9
> 
>
> Key: HADOOP-10075
> URL: https://issues.apache.org/jira/browse/HADOOP-10075
> Project: Hadoop Common
>  Issue Type: Improvement
>Affects Versions: 2.2.0, 2.6.0
>Reporter: Robert Rati
>Assignee: Robert Kanter
>Priority: Critical
> Attachments: HADOOP-10075-002-wip.patch, HADOOP-10075.003.patch, 
> HADOOP-10075.004.patch, HADOOP-10075.005.patch, HADOOP-10075.006.patch, 
> HADOOP-10075.007.patch, HADOOP-10075.008.patch, HADOOP-10075.009.patch, 
> HADOOP-10075.patch
>
>
> Jetty6 is no longer maintained.  Update the dependency to jetty9.






[jira] [Commented] (HADOOP-10075) Update jetty dependency to version 9

2016-10-17 Thread Ravi Prakash (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-10075?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15584205#comment-15584205
 ] 

Ravi Prakash commented on HADOOP-10075:
---

Thanks Robert! 

* Why did you have to add {{ep.getPort() == -2}} in 
{{HttpServer2.Builder.build()}}? The javadoc for URI never claims to return -2. 
Is it bleeding in from {{ServerConnector.getLocalPort()}} somehow?
* {{HttpServer2.createDefaultChannelConnector()}} has started taking a server 
as an argument. Is that really necessary? Or are we always passing in an 
instance of {{Server}} that we already have access to?

My notes (and not necessarily questions, but if you have more info, it'd be 
great to document here):
* 
http://www.eclipse.org/jetty/documentation/current/architecture.html#basic-architecture
 and http://www.eclipse.org/jetty/documentation/current/embedding-jetty.html 
are good pages to read up on (a rough connector sketch follows after these 
notes).
* {code}
conn.addFirstConnectionFactory(new SslConnectionFactory(sslContextFactory,
    HttpVersion.HTTP_1_1.asString()));
{code}
http://archive.eclipse.org/jetty/9.0.0.M3/apidocs/org/eclipse/jetty/server/ssl/SslSelectChannelConnector.html
 suggests using SelectChannelConnector, but I couldn't find any documentation 
on how to do that.
* {{c.setLowResourceMaxIdleTime(1);}} has been changed to 
{{c.setIdleTimeout(1);}}. This just means that the timeout will occur after 10s 
whether or not there were more than LowResourcesConnections. I'm fine with this 
change.
* We haven't set {{c.setResolveNames(false);}}. [This 
thread|https://dev.eclipse.org/mhonarc/lists/jetty-users/msg07284.html] claims 
that by default names will not be resolved, so I'm fine with this change. 
Tagging [~mshen] who had put in the original arguments.
* {{addNoCacheFilter(webAppContext);}} was changed to 
{{addNoCacheFilter(logContext);}}. Was this a bug earlier?
* {{addInternalServlet()}} removes existing path bindings now. I don't see what 
else we could do about it. Fine by me.
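
For readers following along, a Jetty 9.x TLS connector is assembled roughly 
like this (a hedged sketch against the Jetty 9 API, not the patch; the class 
name, keystore path, and password are placeholders):
{code:java}
import org.eclipse.jetty.http.HttpVersion;
import org.eclipse.jetty.server.HttpConfiguration;
import org.eclipse.jetty.server.HttpConnectionFactory;
import org.eclipse.jetty.server.SecureRequestCustomizer;
import org.eclipse.jetty.server.Server;
import org.eclipse.jetty.server.ServerConnector;
import org.eclipse.jetty.server.SslConnectionFactory;
import org.eclipse.jetty.util.ssl.SslContextFactory;

public class Jetty9Tls {
  public static void main(String[] args) throws Exception {
    Server server = new Server();
    SslContextFactory ssl = new SslContextFactory();
    ssl.setKeyStorePath("/path/to/keystore.jks"); // placeholder
    ssl.setKeyStorePassword("changeit");          // placeholder
    HttpConfiguration httpConfig = new HttpConfiguration();
    httpConfig.addCustomizer(new SecureRequestCustomizer());
    // SslConnectionFactory wraps the plain HTTP factory, replacing Jetty 6's
    // SslSelectChannelConnector
    ServerConnector conn = new ServerConnector(server,
        new SslConnectionFactory(ssl, HttpVersion.HTTP_1_1.asString()),
        new HttpConnectionFactory(httpConfig));
    conn.setIdleTimeout(10000); // milliseconds
    server.addConnector(conn);
    server.start();
    server.join();
  }
}
{code}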


> Update jetty dependency to version 9
> 
>
> Key: HADOOP-10075
> URL: https://issues.apache.org/jira/browse/HADOOP-10075
> Project: Hadoop Common
>  Issue Type: Improvement
>Affects Versions: 2.2.0, 2.6.0
>Reporter: Robert Rati
>Assignee: Robert Kanter
>Priority: Critical
> Attachments: HADOOP-10075-002-wip.patch, HADOOP-10075.003.patch, 
> HADOOP-10075.004.patch, HADOOP-10075.005.patch, HADOOP-10075.006.patch, 
> HADOOP-10075.007.patch, HADOOP-10075.008.patch, HADOOP-10075.009.patch, 
> HADOOP-10075.patch
>
>
> Jetty6 is no longer maintained.  Update the dependency to jetty9.






[jira] [Commented] (HADOOP-10075) Update jetty dependency to version 9

2016-10-14 Thread Ravi Prakash (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-10075?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15576071#comment-15576071
 ] 

Ravi Prakash commented on HADOOP-10075:
---

Thanks a lot for the massive amount of effort, Robert and others.
The patch is deceptively big. A lot of the bulk comes from renamed classes, 
added charsets, or just the changed logging method. Thanks also for writing 
the Maven plugin, Robert.

Is there a reason you changed {{httpServer.addContext(uiWebAppContext, true);}} 
to {{httpServer.addHandlerAtFront(uiWebAppContext);}} in 
ApplicationHistoryServer?

The main changes are in HttpServer2 and I am going through them right now. Will 
hopefully get done soon.

> Update jetty dependency to version 9
> 
>
> Key: HADOOP-10075
> URL: https://issues.apache.org/jira/browse/HADOOP-10075
> Project: Hadoop Common
>  Issue Type: Improvement
>Affects Versions: 2.2.0, 2.6.0
>Reporter: Robert Rati
>Assignee: Robert Kanter
> Attachments: HADOOP-10075-002-wip.patch, HADOOP-10075.003.patch, 
> HADOOP-10075.004.patch, HADOOP-10075.005.patch, HADOOP-10075.006.patch, 
> HADOOP-10075.007.patch, HADOOP-10075.008.patch, HADOOP-10075.patch
>
>
> Jetty6 is no longer maintained.  Update the dependency to jetty9.






[jira] [Commented] (HADOOP-10075) Update jetty dependency to version 9

2016-09-30 Thread Ravi Prakash (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-10075?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15537166#comment-15537166
 ] 

Ravi Prakash commented on HADOOP-10075:
---

Thanks for all your work Robert.

* Why did you have to deprecate {{RequestLoggerFilter.setStatus}}?
* Is there precedent for trusting external plugins in Maven? Even though I'm 
sure com.github.phuonghuynh is a perfectly trustworthy repo, is there something 
else we can do? I would almost rather keep the two files (compressed and 
uncompressed).


> Update jetty dependency to version 9
> 
>
> Key: HADOOP-10075
> URL: https://issues.apache.org/jira/browse/HADOOP-10075
> Project: Hadoop Common
>  Issue Type: Improvement
>Affects Versions: 2.2.0, 2.6.0
>Reporter: Robert Rati
>Assignee: Robert Kanter
> Attachments: HADOOP-10075-002-wip.patch, HADOOP-10075.003.patch, 
> HADOOP-10075.004.patch, HADOOP-10075.005.patch, HADOOP-10075.006.patch, 
> HADOOP-10075.patch
>
>
> Jetty6 is no longer maintained.  Update the dependency to jetty9.






[jira] [Commented] (HADOOP-10075) Update jetty dependency to version 9

2016-09-12 Thread Ravi Prakash (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-10075?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15485046#comment-15485046
 ] 

Ravi Prakash commented on HADOOP-10075:
---

bq. IMO we'd be better off moving out of Jetty and into jersey as the server; 
this would eliminate jersey version problems altogether, and more importantly, 
jersey "quirks"
For those of us less familiar with Jersey, could you please elaborate on this, 
Steve? Or did you mean "this would eliminate *jetty* version problems 
altogether"? Or does Jersey promise never to change its API ever?

In any case we can always make that happen later, so we shouldn't block the 
upgrade of an old and crufty Jetty if someone wants to do it.

> Update jetty dependency to version 9
> 
>
> Key: HADOOP-10075
> URL: https://issues.apache.org/jira/browse/HADOOP-10075
> Project: Hadoop Common
>  Issue Type: Improvement
>Affects Versions: 2.2.0, 2.6.0
>Reporter: Robert Rati
>Assignee: Robert Kanter
> Attachments: HADOOP-10075-002-wip.patch, HADOOP-10075.patch
>
>
> Jetty6 is no longer maintained.  Update the dependency to jetty9.






[jira] [Updated] (HADOOP-13587) distcp.map.bandwidth.mb is overwritten even when -bandwidth flag isn't set

2016-09-12 Thread Ravi Prakash (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-13587?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravi Prakash updated HADOOP-13587:
--
  Resolution: Fixed
   Fix Version/s: 3.0.0-alpha2
Target Version/s: 3.0.0-alpha2  (was: 3.0.0-alpha1)
  Status: Resolved  (was: Patch Available)

Committed to trunk! Thanks for the contribution Zoran!

> distcp.map.bandwidth.mb is overwritten even when -bandwidth flag isn't set
> --
>
> Key: HADOOP-13587
> URL: https://issues.apache.org/jira/browse/HADOOP-13587
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: tools/distcp
>Affects Versions: 3.0.0-alpha1
>Reporter: Zoran Dimitrijevic
>Assignee: Zoran Dimitrijevic
>Priority: Minor
> Fix For: 3.0.0-alpha2
>
> Attachments: HADOOP-13587-01.patch, HADOOP-13587-02.patch
>
>   Original Estimate: 24h
>  Remaining Estimate: 24h
>
> distcp.map.bandwidth.mb exists in the distcp-defaults.xml config file, but it 
> is not honored even when it is set. The current code always overwrites it with 
> either the default value (a Java constant) or the -bandwidth command-line 
> option.
> The expected behavior (at least as I would expect it) is to honor the value 
> set in distcp-defaults.xml unless the user explicitly specifies the -bandwidth 
> command-line flag. If no value is set in the .xml file or as a command-line 
> flag, then the constant from the Java code should be used.
> Additionally, I would expect that we also try to get values from 
> distcp-site.xml, similar to other Hadoop systems.
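
For illustration, the precedence described above could be resolved roughly as 
follows (a hedged sketch; the class and method names are invented, and 100 
stands in for the built-in constant):
{code:java}
import org.apache.hadoop.conf.Configuration;

public class BandwidthPrecedence {
  // cliValue is null unless the user explicitly passed -bandwidth
  static int resolveBandwidthMb(Configuration conf, String cliValue) {
    if (cliValue != null) {
      return Integer.parseInt(cliValue); // an explicit flag wins
    }
    // otherwise honor distcp-defaults.xml / distcp-site.xml, and only then
    // fall back to the built-in constant
    return conf.getInt("distcp.map.bandwidth.mb", 100);
  }
}
{code}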






[jira] [Commented] (HADOOP-13587) distcp.map.bandwidth.mb is overwritten even when -bandwidth flag isn't set

2016-09-12 Thread Ravi Prakash (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13587?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15484408#comment-15484408
 ] 

Ravi Prakash commented on HADOOP-13587:
---

+1. LGTM. Thanks a lot for your contribution Zoran. Committing shortly!

> distcp.map.bandwidth.mb is overwritten even when -bandwidth flag isn't set
> --
>
> Key: HADOOP-13587
> URL: https://issues.apache.org/jira/browse/HADOOP-13587
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: tools/distcp
>Affects Versions: 3.0.0-alpha1
>Reporter: Zoran Dimitrijevic
>Assignee: Zoran Dimitrijevic
>Priority: Minor
> Attachments: HADOOP-13587-01.patch, HADOOP-13587-02.patch
>
>   Original Estimate: 24h
>  Remaining Estimate: 24h
>
> distcp.map.bandwidth.mb exists in the distcp-defaults.xml config file, but it 
> is not honored even when it is set. The current code always overwrites it with 
> either the default value (a Java constant) or the -bandwidth command-line 
> option.
> The expected behavior (at least as I would expect it) is to honor the value 
> set in distcp-defaults.xml unless the user explicitly specifies the -bandwidth 
> command-line flag. If no value is set in the .xml file or as a command-line 
> flag, then the constant from the Java code should be used.
> Additionally, I would expect that we also try to get values from 
> distcp-site.xml, similar to other Hadoop systems.






[jira] [Commented] (HADOOP-13410) RunJar adds the content of the jar twice to the classpath

2016-08-11 Thread Ravi Prakash (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13410?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15418147#comment-15418147
 ] 

Ravi Prakash commented on HADOOP-13410:
---

Sure! Here's my +1 too for trunk

> RunJar adds the content of the jar twice to the classpath
> -
>
> Key: HADOOP-13410
> URL: https://issues.apache.org/jira/browse/HADOOP-13410
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: util
>Reporter: Sangjin Lee
>Assignee: Yuanbo Liu
> Attachments: HADOOP-13410.001.patch
>
>
> Today when you run a "hadoop jar" command, the jar is unzipped to a temporary 
> location and gets added to the classloader.
> However, the original jar itself is still added to the classpath.
> {code}
>   List<URL> classPath = new ArrayList<>();
>   classPath.add(new File(workDir + "/").toURI().toURL());
>   classPath.add(file.toURI().toURL());
>   classPath.add(new File(workDir, "classes/").toURI().toURL());
>   File[] libs = new File(workDir, "lib").listFiles();
>   if (libs != null) {
> for (File lib : libs) {
>   classPath.add(lib.toURI().toURL());
> }
>   }
> {code}
> As a result, the contents of the jar are present in the classpath *twice* and 
> are completely redundant. Although this does not necessarily cause 
> correctness issues, some stricter code written to require a single presence 
> of files may fail.
> I cannot think of a good reason why the jar should be added to the classpath 
> if the unjarred content was added to it. I think we should remove the jar 
> from the classpath.
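
One possible fix, sketched below on the assumption that the expanded {{workDir}} 
contents are sufficient ({{workDir}} as in the quoted snippet; the class and 
method names are invented), is simply to stop adding the jar itself:
{code:java}
import java.io.File;
import java.net.MalformedURLException;
import java.net.URL;
import java.util.ArrayList;
import java.util.List;

public class RunJarClasspath {
  static List<URL> buildClassPath(File workDir) throws MalformedURLException {
    List<URL> classPath = new ArrayList<>();
    classPath.add(new File(workDir + "/").toURI().toURL());
    // intentionally NOT adding the jar itself: its contents are already
    // on the classpath via the expanded workDir
    classPath.add(new File(workDir, "classes/").toURI().toURL());
    File[] libs = new File(workDir, "lib").listFiles();
    if (libs != null) {
      for (File lib : libs) {
        classPath.add(lib.toURI().toURL());
      }
    }
    return classPath;
  }
}
{code}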






[jira] [Updated] (HADOOP-10738) Dynamically adjust distcp configuration by adding distcp-site.xml into code base

2016-08-11 Thread Ravi Prakash (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-10738?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravi Prakash updated HADOOP-10738:
--
Target Version/s: 2.9.0  (was: 2.8.0)

> Dynamically adjust distcp configuration by adding distcp-site.xml into code 
> base
> 
>
> Key: HADOOP-10738
> URL: https://issues.apache.org/jira/browse/HADOOP-10738
> Project: Hadoop Common
>  Issue Type: Bug
>Affects Versions: 2.4.0
>Reporter: Siqi Li
>  Labels: BB2015-05-TBR
> Attachments: HADOOP-10738.v1.patch, HADOOP-10738.v2.patch
>
>
> For now, the configuration of distcp resides in hadoop-distcp.jar. This makes 
> it difficult to adjust the configuration dynamically.
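
A hedged sketch of the idea: let an admin-managed distcp-site.xml on the 
classpath override the bundled defaults (the resource and property names here 
are illustrative, following the proposal rather than shipped behavior):
{code:java}
import org.apache.hadoop.conf.Configuration;

public class DistCpConf {
  public static void main(String[] args) {
    Configuration conf = new Configuration();
    conf.addResource("distcp-default.xml"); // bundled inside hadoop-distcp.jar
    conf.addResource("distcp-site.xml");    // proposed override; later resources win
    System.out.println(conf.getInt("distcp.map.bandwidth.mb", 100));
  }
}
{code}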






[jira] [Updated] (HADOOP-10738) Dynamically adjust distcp configuration by adding distcp-site.xml into code base

2016-08-11 Thread Ravi Prakash (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-10738?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravi Prakash updated HADOOP-10738:
--
Attachment: HADOOP-10738.v2.patch

Upmerging to trunk

> Dynamically adjust distcp configuration by adding distcp-site.xml into code 
> base
> 
>
> Key: HADOOP-10738
> URL: https://issues.apache.org/jira/browse/HADOOP-10738
> Project: Hadoop Common
>  Issue Type: Bug
>Affects Versions: 2.4.0
>Reporter: Siqi Li
>  Labels: BB2015-05-TBR
> Attachments: HADOOP-10738.v1.patch, HADOOP-10738.v2.patch
>
>
> For now, the configuration of distcp resides in hadoop-distcp.jar. This makes 
> it difficult to adjust the configuration dynamically.






[jira] [Updated] (HADOOP-10738) Dynamically adjust distcp configuration by adding distcp-site.xml into code base

2016-08-11 Thread Ravi Prakash (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-10738?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravi Prakash updated HADOOP-10738:
--
Status: Patch Available  (was: Open)

> Dynamically adjust distcp configuration by adding distcp-site.xml into code 
> base
> 
>
> Key: HADOOP-10738
> URL: https://issues.apache.org/jira/browse/HADOOP-10738
> Project: Hadoop Common
>  Issue Type: Bug
>Affects Versions: 2.4.0
>Reporter: Siqi Li
>  Labels: BB2015-05-TBR
> Attachments: HADOOP-10738.v1.patch, HADOOP-10738.v2.patch
>
>
> For now, the configuration of distcp resides in hadoop-distcp.jar. This makes 
> it difficult to adjust the configuration dynamically.






[jira] [Updated] (HADOOP-10738) Dynamically adjust distcp configuration by adding distcp-site.xml into code base

2016-08-11 Thread Ravi Prakash (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-10738?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravi Prakash updated HADOOP-10738:
--
Target Version/s: 2.8.0
  Status: Open  (was: Patch Available)

Let's see if we can get this in now.

> Dynamically adjust distcp configuration by adding distcp-site.xml into code 
> base
> 
>
> Key: HADOOP-10738
> URL: https://issues.apache.org/jira/browse/HADOOP-10738
> Project: Hadoop Common
>  Issue Type: Bug
>Affects Versions: 2.4.0
>Reporter: Siqi Li
>  Labels: BB2015-05-TBR
> Attachments: HADOOP-10738.v1.patch
>
>
> For now, the configuration of distcp resides in hadoop-distcp.jar. This makes 
> it difficult to adjust the configuration dynamically.






[jira] [Commented] (HADOOP-13410) RunJar adds the content of the jar twice to the classpath

2016-08-09 Thread Ravi Prakash (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13410?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15414531#comment-15414531
 ] 

Ravi Prakash commented on HADOOP-13410:
---

Hi Sangjin! Thanks for your contribution. Is there a reason you are choosing to 
remove the jar instead of the expanded classes? I'm obviously +1 for keeping 
just one of those around; I'm wondering which one we should keep and which one 
we should let go.

> RunJar adds the content of the jar twice to the classpath
> -
>
> Key: HADOOP-13410
> URL: https://issues.apache.org/jira/browse/HADOOP-13410
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: util
>Reporter: Sangjin Lee
>Assignee: Yuanbo Liu
> Attachments: HADOOP-13410.001.patch
>
>
> Today when you run a "hadoop jar" command, the jar is unzipped to a temporary 
> location and gets added to the classloader.
> However, the original jar itself is still added to the classpath.
> {code}
>   List<URL> classPath = new ArrayList<>();
>   classPath.add(new File(workDir + "/").toURI().toURL());
>   classPath.add(file.toURI().toURL());
>   classPath.add(new File(workDir, "classes/").toURI().toURL());
>   File[] libs = new File(workDir, "lib").listFiles();
>   if (libs != null) {
> for (File lib : libs) {
>   classPath.add(lib.toURI().toURL());
> }
>   }
> {code}
> As a result, the contents of the jar are present in the classpath *twice* and 
> are completely redundant. Although this does not necessarily cause 
> correctness issues, some stricter code written to require a single presence 
> of files may fail.
> I cannot think of a good reason why the jar should be added to the classpath 
> if the unjarred content was added to it. I think we should remove the jar 
> from the classpath.






[jira] [Commented] (HADOOP-13018) Make Kdiag check whether hadoop.token.files points to existent and valid files

2016-07-28 Thread Ravi Prakash (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13018?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15398340#comment-15398340
 ] 

Ravi Prakash commented on HADOOP-13018:
---

Hi Steve!

I'm not sure what to add to the document. The various options have been 
enumerated and briefly described. However, I haven't added a new option. It's 
just another check, and I don't think the details of the report have been (or 
need to be) documented.
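
For the record, the check is roughly of this shape (a sketch only, assuming the hadoop.token.files key and KDiag's fail-fast style; names are illustrative, not the exact patch):
{code:java}
import java.io.File;
import java.io.FileNotFoundException;

// Sketch: fail fast when hadoop.token.files names a missing file.
static void validateTokenFiles(org.apache.hadoop.conf.Configuration conf)
    throws FileNotFoundException {
  String tokenFiles = conf.getTrimmed("hadoop.token.files");
  if (tokenFiles == null || tokenFiles.isEmpty()) {
    return; // nothing configured, nothing to verify
  }
  for (String name : tokenFiles.split(",")) {
    File tokenFile = new File(name.trim());
    if (!tokenFile.isFile()) {
      throw new FileNotFoundException(
          "hadoop.token.files points to a missing file: " + tokenFile);
    }
  }
}
{code}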

> Make Kdiag check whether hadoop.token.files points to existent and valid files
> --
>
> Key: HADOOP-13018
> URL: https://issues.apache.org/jira/browse/HADOOP-13018
> Project: Hadoop Common
>  Issue Type: Improvement
>Affects Versions: 3.0.0-alpha1
>Reporter: Ravi Prakash
>Assignee: Ravi Prakash
> Attachments: HADOOP-13018.01.patch, HADOOP-13018.02.patch, 
> HADOOP-13018.03.patch, HADOOP-13018.04.patch
>
>
> Steve proposed that KDiag should fail fast to help debug the case that 
> hadoop.token.files points to a file not found. This JIRA is to affect that.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-13123) Permit the default hadoop delegation token file format to be configurable

2016-07-18 Thread Ravi Prakash (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13123?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15382982#comment-15382982
 ] 

Ravi Prakash commented on HADOOP-13123:
---

For completeness of this discussion: the patch for HADOOP-12563 was never 
backported into branch-2. It exists only in trunk, so only 3.x releases will 
have it. I don't know if we really should make the token file format 
configurable, but that's a discussion at a tangent... I'll save it for later.
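
If the default format were made configurable, it would presumably be an ordinary site-configuration entry; the property name below is purely hypothetical:
{code:xml}
<!-- Hypothetical property; the shipped default would remain FORMAT_PB -->
<property>
  <name>hadoop.security.credentials.default.format</name>
  <value>java</value>
</property>
{code}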

> Permit the default hadoop delegation token file format to be configurable
> -
>
> Key: HADOOP-13123
> URL: https://issues.apache.org/jira/browse/HADOOP-13123
> Project: Hadoop Common
>  Issue Type: Improvement
>Reporter: Matthew Paduano
>Assignee: Matthew Paduano
> Attachments: HADOOP-13123.01.patch
>
>
> If one environment updates to using the new dtutil code and accompanying 
> Credentials code, there is a backward compatibility issue with the default 
> file format being JAVA.  Older clients need to be updated to ask for a file 
> in the legacy format (FORMAT_JAVA).  
> As an aid to users in this trap, we can add a configuration property to set 
> the default file format.  When set to FORMAT_JAVA, the new server code will 
> serve up legacy files without being asked.  The default value for this 
> property will remain FORMAT_PB.  But affected users can add this config 
> option to the services using the newer code.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Comment Edited] (HADOOP-13342) ISAL download is breaking the Dockerfile

2016-07-06 Thread Ravi Prakash (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13342?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15365121#comment-15365121
 ] 

Ravi Prakash edited comment on HADOOP-13342 at 7/6/16 9:25 PM:
---

Oh! Sorry about that, Kai. Please create a new JIRA and I'm happy to +1 it. 
Also, I thought the Dockerfile was Allen's work. I didn't realize you'd added 
to it. My apologies.


was (Author: raviprak):
Oh! Sorry about that, Kai. Please create a new JIRA and I'm happy to +1 it.

> ISAL download is breaking the Dockerfile
> 
>
> Key: HADOOP-13342
> URL: https://issues.apache.org/jira/browse/HADOOP-13342
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: build
>Reporter: Allen Wittenauer
>Assignee: Allen Wittenauer
>Priority: Blocker
> Fix For: 3.0.0-alpha1
>
> Attachments: HADOOP-13342.00.patch
>
>
> http://http.us.debian.org/debian/pool/main/libi/libisal/libisal2_2.15.0-2_amd64.deb
>  is returning a 404.  We need to replace or remove this hack to prevent this 
> from happening in the future.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-13342) ISAL download is breaking the Dockerfile

2016-07-06 Thread Ravi Prakash (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13342?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15365121#comment-15365121
 ] 

Ravi Prakash commented on HADOOP-13342:
---

Oh! Sorry about that, Kai. Please create a new JIRA and I'm happy to +1 it.

> ISAL download is breaking the Dockerfile
> 
>
> Key: HADOOP-13342
> URL: https://issues.apache.org/jira/browse/HADOOP-13342
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: build
>Reporter: Allen Wittenauer
>Assignee: Allen Wittenauer
>Priority: Blocker
> Fix For: 3.0.0-alpha1
>
> Attachments: HADOOP-13342.00.patch
>
>
> http://http.us.debian.org/debian/pool/main/libi/libisal/libisal2_2.15.0-2_amd64.deb
>  is returning a 404.  We need to replace or remove this hack to prevent this 
> from happening in the future.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Updated] (HADOOP-13342) ISAL download is breaking the Dockerfile

2016-07-05 Thread Ravi Prakash (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-13342?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravi Prakash updated HADOOP-13342:
--
   Resolution: Fixed
Fix Version/s: 3.0.0-alpha1
   Status: Resolved  (was: Patch Available)

Committed to trunk

> ISAL download is breaking the Dockerfile
> 
>
> Key: HADOOP-13342
> URL: https://issues.apache.org/jira/browse/HADOOP-13342
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: build
>Reporter: Allen Wittenauer
>Assignee: Allen Wittenauer
>Priority: Blocker
> Fix For: 3.0.0-alpha1
>
> Attachments: HADOOP-13342.00.patch
>
>
> http://http.us.debian.org/debian/pool/main/libi/libisal/libisal2_2.15.0-2_amd64.deb
>  is returning a 404.  We need to replace or remove this hack to prevent this 
> from happening in the future.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-13342) ISAL download is breaking the Dockerfile

2016-07-05 Thread Ravi Prakash (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13342?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15362896#comment-15362896
 ] 

Ravi Prakash commented on HADOOP-13342:
---

+1 LGTM. Will commit

> ISAL download is breaking the Dockerfile
> 
>
> Key: HADOOP-13342
> URL: https://issues.apache.org/jira/browse/HADOOP-13342
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: build
>Reporter: Allen Wittenauer
>Assignee: Allen Wittenauer
>Priority: Blocker
> Attachments: HADOOP-13342.00.patch
>
>
> http://http.us.debian.org/debian/pool/main/libi/libisal/libisal2_2.15.0-2_amd64.deb
>  is returning a 404.  We need to replace or remove this hack to prevent this 
> from happening in the future.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Resolved] (HADOOP-13295) Possible Vulnerability in DataNodes via SSH

2016-06-27 Thread Ravi Prakash (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-13295?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravi Prakash resolved HADOOP-13295.
---
Resolution: Invalid

> Possible Vulnerability in DataNodes via SSH
> ---
>
> Key: HADOOP-13295
> URL: https://issues.apache.org/jira/browse/HADOOP-13295
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: security
>Reporter: Mobin Ranjbar
>
> I suspected something weird in my Hadoop cluster. When I run datanodes, after 
> a while my servers(except namenode) will be down for SSH Max Attempts. When I 
> checked the 'systemctl status ssh', I figured out there are some invalid 
> username/password attempts via SSH and the SSH daemon blocked all incoming 
> connections and I got connection refused.
> I have no problem when my datanodes are not running.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-13295) Possible Vulnerability in DataNodes via SSH

2016-06-24 Thread Ravi Prakash (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13295?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15348538#comment-15348538
 ] 

Ravi Prakash commented on HADOOP-13295:
---

Mobin! Could you please answer Steve's original question?
bq. How are you deploying it?

I'm inclined to close this JIRA as invalid. We haven't seen this issue anywhere 
else, and it is probably a deployment error.

> Possible Vulnerability in DataNodes via SSH
> ---
>
> Key: HADOOP-13295
> URL: https://issues.apache.org/jira/browse/HADOOP-13295
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: security
>Reporter: Mobin Ranjbar
>
> I suspected something weird in my Hadoop cluster. When I run datanodes, after 
> a while my servers(except namenode) will be down for SSH Max Attempts. When I 
> checked the 'systemctl status ssh', I figured out there are some invalid 
> username/password attempts via SSH and the SSH daemon blocked all incoming 
> connections and I got connection refused.
> I have no problem when my datanodes are not running.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-13295) Possible Vulnerability in DataNodes via SSH

2016-06-22 Thread Ravi Prakash (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13295?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15345083#comment-15345083
 ] 

Ravi Prakash commented on HADOOP-13295:
---

Or using the start-dfs.sh, start-*.sh scripts. Right?

> Possible Vulnerability in DataNodes via SSH
> ---
>
> Key: HADOOP-13295
> URL: https://issues.apache.org/jira/browse/HADOOP-13295
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: security
>Reporter: Mobin Ranjbar
>
> I suspected something weird in my Hadoop cluster. When I run datanodes, after 
> a while my servers(except namenode) will be down for SSH Max Attempts. When I 
> checked the 'systemctl status ssh', I figured out there are some invalid 
> username/password attempts via SSH and the SSH daemon blocked all incoming 
> connections and I got connection refused.
> I have no problem when my datanodes are not running.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-13287) TestS3ACredentials#testInstantiateFromURL fails if AWS secret key contains '+'.

2016-06-21 Thread Ravi Prakash (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13287?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15342479#comment-15342479
 ] 

Ravi Prakash commented on HADOOP-13287:
---

Sorry! I was trying to run the tests, but got distracted before I could figure 
out how. By the way, my handle is raviprak ;-)
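
One plausible culprit: standard URL decoding maps a raw '+' to a space, so a secret containing '+' survives the round trip only if it was percent-encoded first. A quick illustration with made-up values (this is not the test itself):
{code:java}
import java.net.URLDecoder;
import java.net.URLEncoder;

public class PlusInSecretDemo {
  public static void main(String[] args) throws Exception {
    String secret = "abc+def";
    // Percent-encoding turns '+' into %2B, which decodes back correctly.
    String encoded = URLEncoder.encode(secret, "UTF-8");      // "abc%2Bdef"
    System.out.println(URLDecoder.decode(encoded, "UTF-8"));  // abc+def
    // A raw '+' passed through a URL decoder becomes a space.
    System.out.println(URLDecoder.decode(secret, "UTF-8"));   // abc def
  }
}
{code}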

> TestS3ACredentials#testInstantiateFromURL fails if AWS secret key contains 
> '+'.
> ---
>
> Key: HADOOP-13287
> URL: https://issues.apache.org/jira/browse/HADOOP-13287
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: fs/s3, test
>Reporter: Chris Nauroth
>Assignee: Chris Nauroth
>Priority: Minor
> Fix For: 2.8.0
>
> Attachments: HADOOP-13287.001.patch, HADOOP-13287.002.patch
>
>
> HADOOP-3733 fixed accessing S3A with credentials on the command line for an 
> AWS secret key containing a '/'.  The patch added a new test suite: 
> {{TestS3ACredentialsInURL}}.  One of the tests fails if your AWS secret key 
> contains a '+'.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Comment Edited] (HADOOP-13188) S3A file-create should throw error rather than overwrite directories

2016-06-17 Thread Ravi Prakash (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13188?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15336614#comment-15336614
 ] 

Ravi Prakash edited comment on HADOOP-13188 at 6/17/16 6:18 PM:


Looks good to me. +1. FWIW, I was concerned about the performance penalty, so I 
uploaded 16 files (using {{hadoop fs -put <file> 
s3a://<ID>:<SECRET>@bucket/path}}). With the patch it took 
{code}
real2m16.564s
user0m11.571s
sys 0m0.582s
{code}

Without the patch:
{code}
real2m5.481s
user0m10.811s
sys 0m0.472s{code}

So it's not that bad a degradation. Still pretty bad performance overall, but 
that is clearly another JIRA.
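
For readers following along, the guard under test is roughly of this shape (a sketch assuming the S3AFileSystem create() context; not the actual patch):
{code:java}
// Sketch: refuse to create() over an existing directory.
try {
  FileStatus status = getFileStatus(f);
  if (status.isDirectory()) {
    throw new FileAlreadyExistsException(f + " is a directory");
  }
  if (!overwrite) {
    throw new FileAlreadyExistsException(f + " already exists");
  }
} catch (FileNotFoundException ignored) {
  // nothing at this path; safe to create
}
{code}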


was (Author: raviprak):
Looks good to me. +1. FWIW, I was concerned about the performance penalty, so I 
uploaded 16 files (using {{hadoop fs -put <file> 
s3a://<ID>:<SECRET>@bucket/path}}). With the patch it took 
{code}
real2m16.564s
user0m11.571s
sys 0m0.582s
{code}

Without the patch:
{code}
real2m5.481s
user0m10.811s
sys 0m0.472s{code}

So it's not that bad a degradation. Still pretty bad performance overall, but 
that is clearly another JIRA.

> S3A file-create should throw error rather than overwrite directories
> 
>
> Key: HADOOP-13188
> URL: https://issues.apache.org/jira/browse/HADOOP-13188
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Affects Versions: 2.7.2
>Reporter: Raymie Stata
>Assignee: Steve Loughran
>Priority: Minor
> Attachments: HADOOP-13188-branch-2-001.patch
>
>
> S3A.create(Path,FsPermission,boolean,int,short,long,Progressable) is not 
> checking to see if it's being asked to overwrite a directory.  It could 
> easily do so, and should throw an error in this case.
> There is a test-case for this in AbstractFSContractTestBase, but it's being 
> skipped because S3A is a blobstore.  However, both the Azure and Swift file 
> systems make this test, and the new S3 one should as well.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-13188) S3A file-create should throw error rather than overwrite directories

2016-06-17 Thread Ravi Prakash (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13188?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15336614#comment-15336614
 ] 

Ravi Prakash commented on HADOOP-13188:
---

Looks good to me. +1. FWIW, I was concerned about the performance penalty, so I 
uploaded 16 files (using {{hadoop fs -put <file> 
s3a://<ID>:<SECRET>@bucket/path}}). With the patch it took 
{code}
real2m16.564s
user0m11.571s
sys 0m0.582s
{code}

Without the patch:
{code}
real2m5.481s
user0m10.811s
sys 0m0.472s{code}

So it's not that bad a degradation. Still pretty bad performance overall, but 
that is clearly another JIRA.

> S3A file-create should throw error rather than overwrite directories
> 
>
> Key: HADOOP-13188
> URL: https://issues.apache.org/jira/browse/HADOOP-13188
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Affects Versions: 2.7.2
>Reporter: Raymie Stata
>Assignee: Steve Loughran
>Priority: Minor
> Attachments: HADOOP-13188-branch-2-001.patch
>
>
> S3A.create(Path,FsPermission,boolean,int,short,long,Progressable) is not 
> checking to see if it's being asked to overwrite a directory.  It could 
> easily do so, and should throw an error in this case.
> There is a test-case for this in AbstractFSContractTestBase, but it's being 
> skipped because S3A is a blobstore.  However, both the Azure and Swift file 
> systems make this test, and the new S3 one should as well.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Updated] (HADOOP-3733) "s3:" URLs break when Secret Key contains a slash, even if encoded

2016-06-16 Thread Ravi Prakash (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-3733?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravi Prakash updated HADOOP-3733:
-
  Resolution: Fixed
   Fix Version/s: 3.0.0-alpha1
  2.9.0
Target Version/s:   (was: 2.8.0)
  Status: Resolved  (was: Patch Available)

Thanks a lot, everyone, for your contributions on this long-standing issue. I'm 
glad we could close it out, thanks to Steve!
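
For anyone still on an older release: the encoding side has always been simple; the bug was in recovering the decoded secret downstream. An illustration using the example key from the description below:
{code:java}
// Percent-encode the secret so it can ride inside an s3 URL ('/' becomes %2F).
String secret = "Xqj1/NMvKBhl1jqKlzbYJS66ua0e8z7Kkvptl9bv";
String encoded = java.net.URLEncoder.encode(secret, "UTF-8");
// -> Xqj1%2FNMvKBhl1jqKlzbYJS66ua0e8z7Kkvptl9bv
{code}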

> "s3:" URLs break when Secret Key contains a slash, even if encoded
> --
>
> Key: HADOOP-3733
> URL: https://issues.apache.org/jira/browse/HADOOP-3733
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: fs/s3
>Affects Versions: 0.17.1, 2.0.2-alpha
>Reporter: Stuart Sierra
>Assignee: Steve Loughran
>Priority: Minor
> Fix For: 2.9.0, 3.0.0-alpha1
>
> Attachments: HADOOP-3733-20130223T011025Z.patch, 
> HADOOP-3733-branch-2-001.patch, HADOOP-3733-branch-2-002.patch, 
> HADOOP-3733-branch-2-003.patch, HADOOP-3733-branch-2-004.patch, 
> HADOOP-3733-branch-2-005.patch, HADOOP-3733-branch-2-006.patch, 
> HADOOP-3733-branch-2-007.patch, HADOOP-3733.patch, hadoop-3733.patch
>
>
> When using URLs of the form s3://ID:SECRET@BUCKET/ at the command line, 
> distcp fails if the SECRET contains a slash, even when the slash is 
> URL-encoded as %2F.
> Say your AWS Access Key ID is RYWX12N9WCY42XVOL8WH
> And your AWS Secret Key is Xqj1/NMvKBhl1jqKlzbYJS66ua0e8z7Kkvptl9bv
> And your bucket is called "mybucket"
> You can URL-encode the Secret Key as 
> Xqj1%2FNMvKBhl1jqKlzbYJS66ua0e8z7Kkvptl9bv
> But this doesn't work:
> {noformat}
> $ bin/hadoop distcp file:///source  
> s3://RYWX12N9WCY42XVOL8WH:Xqj1%2FNMvKBhl1jqKlzbYJS66ua0e8z7Kkvptl9bv@mybucket/dest
> 08/07/09 15:05:22 INFO util.CopyFiles: srcPaths=[file:///source]
> 08/07/09 15:05:22 INFO util.CopyFiles: 
> destPath=s3://RYWX12N9WCY42XVOL8WH:Xqj1%2FNMvKBhl1jqKlzbYJS66ua0e8z7Kkvptl9bv@mybucket/dest
> 08/07/09 15:05:23 WARN httpclient.RestS3Service: Unable to access bucket: 
> mybucket
> org.jets3t.service.S3ServiceException: S3 HEAD request failed. 
> ResponseCode=403, ResponseMessage=Forbidden
> at 
> org.jets3t.service.impl.rest.httpclient.RestS3Service.performRequest(RestS3Service.java:339)
> ...
> With failures, global counters are inaccurate; consider running with -i
> Copy failed: org.apache.hadoop.fs.s3.S3Exception: 
> org.jets3t.service.S3ServiceException: S3 PUT failed. XML Error Message: 
> <?xml version="1.0" encoding="UTF-8"?><Error><Code>SignatureDoesNotMatch</Code><Message>The 
> request signature we calculated does not match the signature you provided. 
> Check your key and signing method.</Message></Error>
> at 
> org.apache.hadoop.fs.s3.Jets3tFileSystemStore.createBucket(Jets3tFileSystemStore.java:141)
> ...
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org


