[jira] [Updated] (HADOOP-10842) CryptoExtension generateEncryptedKey method should receive the key name

2014-07-17 Thread Arun Suresh (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-10842?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arun Suresh updated HADOOP-10842:
-

Status: Patch Available  (was: Open)

 CryptoExtension generateEncryptedKey method should receive the key name
 ---

 Key: HADOOP-10842
 URL: https://issues.apache.org/jira/browse/HADOOP-10842
 Project: Hadoop Common
  Issue Type: Bug
  Components: security
Affects Versions: 3.0.0
Reporter: Alejandro Abdelnur
Assignee: Arun Suresh
 Attachments: HADOOP-10842-10841-COMBO.1.patch, HADOOP-10842.1.patch


 Generating an EEK should always be done using the current keyversion of a key 
 name. We should enforce that in the API by handing off EEKs for the latest 
 keyversion of a keyname only; thus we should ask for EEKs for a keyname, and 
 the {{CryptoExtension}} should use the latest keyversion.
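
 As an illustration only (not the actual patch), a minimal Java sketch of the proposed shape, assuming the caller passes just the key name and the extension resolves the latest keyversion itself; all class and field names below are invented:

{code}
// Hedged sketch only: names are illustrative, not the actual patch. The point
// is that callers pass just the key name; the extension looks up the latest
// keyversion of that name itself, so stale versions are never handed out.
public interface CryptoExtensionSketch {

  /** Opaque holder for an encrypted key; fields are illustrative. */
  final class EncryptedKey {
    public final String keyVersionName;   // version actually used to encrypt
    public final byte[] encryptedMaterial;
    public EncryptedKey(String keyVersionName, byte[] encryptedMaterial) {
      this.keyVersionName = keyVersionName;
      this.encryptedMaterial = encryptedMaterial;
    }
  }

  /** Generate an EEK for the latest keyversion of {@code keyName}. */
  EncryptedKey generateEncryptedKey(String keyName) throws java.io.IOException;
}
{code}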



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HADOOP-10641) Introduce Coordination Engine

2014-07-17 Thread Konstantin Boudnik (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-10641?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14064621#comment-14064621
 ] 

Konstantin Boudnik commented on HADOOP-10641:
-

As has been proposed above and agreed during the meet-up yesterday, I will go 
ahead and cut a new branch {{ConsensusNode}} off the trunk, so we'll start 
adding the implementation there.

 Introduce Coordination Engine
 -

 Key: HADOOP-10641
 URL: https://issues.apache.org/jira/browse/HADOOP-10641
 Project: Hadoop Common
  Issue Type: New Feature
Affects Versions: 3.0.0
Reporter: Konstantin Shvachko
Assignee: Plamen Jeliazkov
 Attachments: HADOOP-10641.patch, HADOOP-10641.patch, 
 HADOOP-10641.patch, hadoop-coordination.patch


 A Coordination Engine (CE) is a system that allows agreement on a sequence of 
 events in a distributed system. To be reliable, the CE should itself be 
 distributed.
 A Coordination Engine can be based on different algorithms (Paxos, Raft, 2PC, 
 ZAB) and have different implementations, depending on use cases and on 
 reliability, availability, and performance requirements.
 The CE should have a common API so that it can serve as a pluggable component 
 in different projects. The immediate beneficiaries are HDFS (HDFS-6469) and 
 HBase (HBASE-10909).
 The first implementation is proposed to be based on ZooKeeper.
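
 As a rough illustration of what such a pluggable API could look like; the interfaces and method names below are invented for this sketch and are not taken from the attached patches:

{code}
import java.io.IOException;
import java.io.Serializable;

// Illustrative sketch only. The idea is a pluggable engine that accepts
// proposals and delivers them back to every node in one agreed-upon global
// order, regardless of the underlying algorithm (ZooKeeper/ZAB, Paxos, Raft,
// 2PC, ...).
interface CoordinationEngineSketch {

  /** Submit an event for agreement; returns once the proposal is accepted. */
  void submitProposal(Serializable proposal) throws IOException;

  /** Register the callback invoked for every agreed event, in global order. */
  void registerHandler(AgreementHandler handler);
}

interface AgreementHandler {

  /** Invoked in the same order on every node of the distributed system. */
  void onAgreement(Serializable agreedEvent, long globalSequenceNumber);
}
{code}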



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HADOOP-10641) Introduce Coordination Engine

2014-07-17 Thread Alex Newman (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-10641?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14064623#comment-14064623
 ] 

Alex Newman commented on HADOOP-10641:
--

Hey dude. Should we delay this a bit?

On Wed, Jul 16, 2014 at 11:11 PM, Konstantin Boudnik (JIRA)


 Introduce Coordination Engine
 -

 Key: HADOOP-10641
 URL: https://issues.apache.org/jira/browse/HADOOP-10641
 Project: Hadoop Common
  Issue Type: New Feature
Affects Versions: 3.0.0
Reporter: Konstantin Shvachko
Assignee: Plamen Jeliazkov
 Attachments: HADOOP-10641.patch, HADOOP-10641.patch, 
 HADOOP-10641.patch, hadoop-coordination.patch


 A Coordination Engine (CE) is a system that allows agreement on a sequence of 
 events in a distributed system. To be reliable, the CE should itself be 
 distributed.
 A Coordination Engine can be based on different algorithms (Paxos, Raft, 2PC, 
 ZAB) and have different implementations, depending on use cases and on 
 reliability, availability, and performance requirements.
 The CE should have a common API so that it can serve as a pluggable component 
 in different projects. The immediate beneficiaries are HDFS (HDFS-6469) and 
 HBase (HBASE-10909).
 The first implementation is proposed to be based on ZooKeeper.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HADOOP-10842) CryptoExtension generateEncryptedKey method should receive the key name

2014-07-17 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-10842?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14064642#comment-14064642
 ] 

Hadoop QA commented on HADOOP-10842:


{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  
http://issues.apache.org/jira/secure/attachment/12656225/HADOOP-10842-10841-COMBO.1.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 1 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:red}-1 core tests{color}.  The patch failed these unit tests in 
hadoop-common-project/hadoop-common:

  org.apache.hadoop.ipc.TestIPC
  org.apache.hadoop.fs.TestSymlinkLocalFSFileSystem
  org.apache.hadoop.fs.TestSymlinkLocalFSFileContext

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-HADOOP-Build/4301//testReport/
Console output: 
https://builds.apache.org/job/PreCommit-HADOOP-Build/4301//console

This message is automatically generated.

 CryptoExtension generateEncryptedKey method should receive the key name
 ---

 Key: HADOOP-10842
 URL: https://issues.apache.org/jira/browse/HADOOP-10842
 Project: Hadoop Common
  Issue Type: Bug
  Components: security
Affects Versions: 3.0.0
Reporter: Alejandro Abdelnur
Assignee: Arun Suresh
 Attachments: HADOOP-10842-10841-COMBO.1.patch, HADOOP-10842.1.patch


 Generating an EEK should always be done using the current keyversion of a key 
 name. We should enforce that in the API by handing off EEKs for the latest 
 keyversion of a keyname only; thus we should ask for EEKs for a keyname, and 
 the {{CryptoExtension}} should use the latest keyversion.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HADOOP-10841) EncryptedKeyVersion should have a key name property

2014-07-17 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-10841?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14064643#comment-14064643
 ] 

Hadoop QA commented on HADOOP-10841:


{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12656223/HADOOP-10841.1.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 1 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:red}-1 core tests{color}.  The patch failed these unit tests in 
hadoop-common-project/hadoop-common:

  org.apache.hadoop.fs.TestSymlinkLocalFSFileSystem
  org.apache.hadoop.fs.TestSymlinkLocalFSFileContext
  org.apache.hadoop.ipc.TestIPC

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-HADOOP-Build/4302//testReport/
Console output: 
https://builds.apache.org/job/PreCommit-HADOOP-Build/4302//console

This message is automatically generated.

 EncryptedKeyVersion should have a key name property
 ---

 Key: HADOOP-10841
 URL: https://issues.apache.org/jira/browse/HADOOP-10841
 Project: Hadoop Common
  Issue Type: Improvement
  Components: security
Affects Versions: 3.0.0
Reporter: Alejandro Abdelnur
Assignee: Arun Suresh
 Attachments: HADOOP-10841.1.patch


 Having a key name property will help the NN efficiently determine the key name 
 of an EDEK, without additional KeyProvider calls (which can translate into 
 remote calls).
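
 A minimal sketch of the idea, assuming the key name simply rides along with the encrypted key version; the class and accessor names are illustrative, not necessarily those in the patch:

{code}
// Hedged sketch: carry the key name alongside the encrypted key version so the
// NN can read it directly instead of asking the KeyProvider. Names are
// illustrative, not the actual patch.
public class EncryptedKeyVersionSketch {
  private final String keyName;          // new property proposed by this JIRA
  private final String keyVersionName;
  private final byte[] encryptedKeyMaterial;

  public EncryptedKeyVersionSketch(String keyName, String keyVersionName,
      byte[] encryptedKeyMaterial) {
    this.keyName = keyName;
    this.keyVersionName = keyVersionName;
    this.encryptedKeyMaterial = encryptedKeyMaterial;
  }

  /** Lets the NN determine the key name of an EDEK without a remote call. */
  public String getKeyName() { return keyName; }
  public String getKeyVersionName() { return keyVersionName; }
  public byte[] getEncryptedKeyMaterial() { return encryptedKeyMaterial; }
}
{code}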



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Created] (HADOOP-10853) Refactor create instance of CryptoCodec and add CryptoCodecFactory

2014-07-17 Thread Yi Liu (JIRA)
Yi Liu created HADOOP-10853:
---

 Summary: Refactor create instance of CryptoCodec and add 
CryptoCodecFactory
 Key: HADOOP-10853
 URL: https://issues.apache.org/jira/browse/HADOOP-10853
 Project: Hadoop Common
  Issue Type: Sub-task
  Components: security
Reporter: Yi Liu
Assignee: Yi Liu


We should be able to create an instance of *CryptoCodec*:
* via codec class name (applications may have config for different crypto 
codecs)
* via algorithm/mode/padding (for automatic decryption, we need to find the 
correct crypto codec and the proper implementation)
* as a default crypto codec through specific config

This JIRA is for:
* creating an instance through a cipher suite (algorithm/mode/padding)
* refactoring creation of {{CryptoCodec}} instances into {{CryptoCodecFactory}}

We need to get all crypto codecs in the system; this can be done via a Java 
ServiceLoader + the hadoop.security.crypto.codecs config.
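
A hedged sketch of the factory idea, assuming discovery via a ServiceLoader and lookup by cipher suite; the simplified CryptoCodec interface and method names below are stand-ins for the real classes:

{code}
import java.util.ArrayList;
import java.util.List;
import java.util.ServiceLoader;

// Rough sketch only: discover codec implementations with a ServiceLoader and
// pick one by cipher suite. Everything here is a simplified stand-in.
final class CryptoCodecFactorySketch {

  interface CryptoCodec {
    /** e.g. "AES/CTR/NoPadding" (algorithm/mode/padding). */
    String getCipherSuite();
  }

  /** Return all codecs visible to the ServiceLoader. */
  static List<CryptoCodec> getAllCodecs() {
    List<CryptoCodec> codecs = new ArrayList<>();
    for (CryptoCodec codec : ServiceLoader.load(CryptoCodec.class)) {
      codecs.add(codec);
    }
    return codecs;
  }

  /** Pick the first codec matching the requested cipher suite, or null. */
  static CryptoCodec getCodecByCipherSuite(String cipherSuite) {
    for (CryptoCodec codec : getAllCodecs()) {
      if (codec.getCipherSuite().equalsIgnoreCase(cipherSuite)) {
        return codec;
      }
    }
    return null;
  }

  private CryptoCodecFactorySketch() {
  }
}
{code}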




--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HADOOP-10853) Refactor create instance of CryptoCodec and add CryptoCodecFactory

2014-07-17 Thread Yi Liu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-10853?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yi Liu updated HADOOP-10853:


Attachment: HADOOP-10853.001.patch

Uploading the patch.
*1.* {{hadoop.security.crypto.codecs}} + a Java ServiceLoader are used to get 
all crypto codecs in the system.
*2.* {{hadoop.security.crypto.cipher.suite}} + 
{{hadoop.security.crypto.codec.class}} select the default crypto codec.
*3.* When creating an instance using *algorithm/mode/padding*, there may be 
several implementations and we should pick the proper one; the default 
implementation types are defined in {{hadoop.security.crypto.codec.impl.type}}.

 Refactor create instance of CryptoCodec and add CryptoCodecFactory
 --

 Key: HADOOP-10853
 URL: https://issues.apache.org/jira/browse/HADOOP-10853
 Project: Hadoop Common
  Issue Type: Sub-task
  Components: security
Reporter: Yi Liu
Assignee: Yi Liu
 Fix For: 3.0.0

 Attachments: HADOOP-10853.001.patch


 We should be able to create an instance of *CryptoCodec*:
 * via codec class name (applications may have config for different crypto 
 codecs)
 * via algorithm/mode/padding (for automatic decryption, we need to find the 
 correct crypto codec and the proper implementation)
 * as a default crypto codec through specific config
 This JIRA is for:
 * creating an instance through a cipher suite (algorithm/mode/padding)
 * refactoring creation of {{CryptoCodec}} instances into {{CryptoCodecFactory}}
 We need to get all crypto codecs in the system; this can be done via a Java 
 ServiceLoader + the hadoop.security.crypto.codecs config.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HADOOP-10840) Fix OutOfMemoryError caused by metrics system in Azure File System

2014-07-17 Thread shanyu zhao (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-10840?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14064662#comment-14064662
 ] 

shanyu zhao commented on HADOOP-10840:
--

Thanks [~cnauroth]!

 Fix OutOfMemoryError caused by metrics system in Azure File System
 --

 Key: HADOOP-10840
 URL: https://issues.apache.org/jira/browse/HADOOP-10840
 Project: Hadoop Common
  Issue Type: Bug
  Components: metrics
Affects Versions: 2.4.1
Reporter: shanyu zhao
Assignee: shanyu zhao
 Fix For: 3.0.0

 Attachments: HADOOP-10840.1.patch, HADOOP-10840.2.patch, 
 HADOOP-10840.patch


 In Hadoop 2.x the Hadoop file system framework changed and no cache is 
 implemented (refer to HADOOP-6356). This means that for every WASB access a new 
 NativeAzureFileSystem is created, and along with it a metrics source is created 
 and added to MetricsSystemImpl. Over time the sources accumulate, eating memory 
 and causing a Java OutOfMemoryError.
 The fix is to utilize the unregisterSource() method added to MetricsSystem in 
 HADOOP-10839.
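
 A hedged sketch of the fix pattern, assuming each file-system instance registers a uniquely named source and unregisters it on close(); all class names below are illustrative stand-ins, not the actual Azure or metrics classes:

{code}
import java.util.HashMap;
import java.util.Map;

// Illustrative stand-ins only. The point: register a per-instance source on
// create and unregister it on close, so sources no longer accumulate.
class MetricsLeakFixSketch {

  /** Minimal stand-in for a metrics system with the new unregister call. */
  static class SimpleMetricsSystem {
    private final Map<String, Object> sources = new HashMap<>();
    void registerSource(String name, Object source) { sources.put(name, source); }
    void unregisterSource(String name) { sources.remove(name); } // per HADOOP-10839
    int sourceCount() { return sources.size(); }
  }

  static final SimpleMetricsSystem METRICS = new SimpleMetricsSystem();

  /** Stand-in for a NativeAzureFileSystem-like object. */
  static class AzureLikeFileSystem implements AutoCloseable {
    private final String sourceName;

    AzureLikeFileSystem(String uniqueId) {
      this.sourceName = "AzureFileSystemMetrics-" + uniqueId;
      METRICS.registerSource(sourceName, new Object()); // one source per instance
    }

    @Override
    public void close() {
      METRICS.unregisterSource(sourceName); // drop the source when done
    }
  }

  public static void main(String[] args) {
    for (int i = 0; i < 1000; i++) {
      try (AzureLikeFileSystem fs = new AzureLikeFileSystem(Integer.toString(i))) {
        // ... perform WASB-like accesses with fs ...
      }
    }
    System.out.println("sources left registered: " + METRICS.sourceCount()); // 0
  }
}
{code}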



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HADOOP-10692) Update metrics2 document and examples to be case sensitive

2014-07-17 Thread Akira AJISAKA (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-10692?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Akira AJISAKA updated HADOOP-10692:
---

  Resolution: Invalid
Assignee: (was: Akira AJISAKA)
Target Version/s:   (was: 2.6.0)
  Status: Resolved  (was: Patch Available)

Since HADOOP-10468 was fixed without an incompatible change, this issue has 
become invalid.

 Update metrics2 document and examples to be case sensitive
 --

 Key: HADOOP-10692
 URL: https://issues.apache.org/jira/browse/HADOOP-10692
 Project: Hadoop Common
  Issue Type: Bug
  Components: conf, metrics
Affects Versions: 2.5.0
Reporter: Akira AJISAKA
  Labels: newbie
 Attachments: HADOOP-10692.2.patch, HADOOP-10692.patch


 After HADOOP-10468, the prefix of the properties in metrics2 became case 
 sensitive. We should also update the package-info and hadoop-metrics2.properties 
 examples to be case sensitive.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HADOOP-10816) KeyShell returns -1 on error to the shell, should be 1

2014-07-17 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-10816?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14064790#comment-14064790
 ] 

Hudson commented on HADOOP-10816:
-

FAILURE: Integrated in Hadoop-Yarn-trunk #615 (See 
[https://builds.apache.org/job/Hadoop-Yarn-trunk/615/])
HADOOP-10816. KeyShell returns -1 on error to the shell, should be 1. (Mike 
Yoder via wang) (wang: 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1611229)
* /hadoop/common/trunk/hadoop-common-project/hadoop-common/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/crypto/key/KeyShell.java
* 
/hadoop/common/trunk/hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/crypto/key/TestKeyShell.java


 KeyShell returns -1 on error to the shell, should be 1
 --

 Key: HADOOP-10816
 URL: https://issues.apache.org/jira/browse/HADOOP-10816
 Project: Hadoop Common
  Issue Type: Bug
  Components: security
Affects Versions: 3.0.0
Reporter: Mike Yoder
Assignee: Mike Yoder
 Fix For: 3.0.0

 Attachments: HADOOP-10816.001.patch, HADOOP-10816.002.patch


 I've seen this in several places now - commands returning -1 on failure to 
 the shell. It's a bug. Someone confused their POSIX-style returns (0 on 
 success, < 0 on failure) with program returns, which are an unsigned 
 character. Thus, a return of -1 actually becomes 255 to the shell.
 {noformat}
 $ hadoop key create happykey2 --provider kms://http@localhost:16000/kms 
 --attr a=a --attr a=b
 Each attribute must correspond to only one value:
 atttribute a was repeated
 ...
 $ echo $?
 255
 {noformat}
 A return value of 1 instead of -1 does the right thing.
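
 A tiny demonstration of the wrap-around (plain Java, assumed for illustration; not the KeyShell code itself):

{code}
// Process exit statuses are an unsigned byte, so exiting with -1 is reported
// to the shell as 255. Returning 1 reports a plain failure instead.
public class ExitCodeDemo {
  public static void main(String[] args) {
    // Run this class and then `echo $?` in the shell: it prints 255, not -1.
    System.exit(-1);
  }
}
{code}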



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HADOOP-10840) Fix OutOfMemoryError caused by metrics system in Azure File System

2014-07-17 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-10840?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14064791#comment-14064791
 ] 

Hudson commented on HADOOP-10840:
-

FAILURE: Integrated in Hadoop-Yarn-trunk #615 (See 
[https://builds.apache.org/job/Hadoop-Yarn-trunk/615/])
HADOOP-10840. Fix OutOfMemoryError caused by metrics system in Azure File 
System. Contributed by Shanyu Zhao. (cnauroth: 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1611247)
* /hadoop/common/trunk/hadoop-common-project/hadoop-common/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-tools/hadoop-azure/src/main/java/org/apache/hadoop/fs/azure/NativeAzureFileSystem.java
* 
/hadoop/common/trunk/hadoop-tools/hadoop-azure/src/main/java/org/apache/hadoop/fs/azure/metrics/AzureFileSystemMetricsSystem.java
* 
/hadoop/common/trunk/hadoop-tools/hadoop-azure/src/test/java/org/apache/hadoop/fs/azure/AzureBlobStorageTestAccount.java
* 
/hadoop/common/trunk/hadoop-tools/hadoop-azure/src/test/java/org/apache/hadoop/fs/azure/NativeAzureFileSystemBaseTest.java


 Fix OutOfMemoryError caused by metrics system in Azure File System
 --

 Key: HADOOP-10840
 URL: https://issues.apache.org/jira/browse/HADOOP-10840
 Project: Hadoop Common
  Issue Type: Bug
  Components: metrics
Affects Versions: 2.4.1
Reporter: shanyu zhao
Assignee: shanyu zhao
 Fix For: 3.0.0

 Attachments: HADOOP-10840.1.patch, HADOOP-10840.2.patch, 
 HADOOP-10840.patch


 In Hadoop 2.x the Hadoop file system framework changed and no cache is 
 implemented (refer to HADOOP-6356). This means that for every WASB access a new 
 NativeAzureFileSystem is created, and along with it a metrics source is created 
 and added to MetricsSystemImpl. Over time the sources accumulate, eating memory 
 and causing a Java OutOfMemoryError.
 The fix is to utilize the unregisterSource() method added to MetricsSystem in 
 HADOOP-10839.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HADOOP-10839) Add unregisterSource() to MetricsSystem API

2014-07-17 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-10839?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14064795#comment-14064795
 ] 

Hudson commented on HADOOP-10839:
-

FAILURE: Integrated in Hadoop-Yarn-trunk #615 (See 
[https://builds.apache.org/job/Hadoop-Yarn-trunk/615/])
HADOOP-10839. Add unregisterSource() to MetricsSystem API. Contributed by 
Shanyu Zhao. (cnauroth: 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1611134)
* /hadoop/common/trunk/hadoop-common-project/hadoop-common/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/metrics2/MetricsSystem.java
* 
/hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/metrics2/impl/MetricsSystemImpl.java
* 
/hadoop/common/trunk/hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/metrics2/impl/TestMetricsSystemImpl.java


 Add unregisterSource() to MetricsSystem API
 ---

 Key: HADOOP-10839
 URL: https://issues.apache.org/jira/browse/HADOOP-10839
 Project: Hadoop Common
  Issue Type: Improvement
  Components: metrics
Affects Versions: 2.4.1
Reporter: shanyu zhao
Assignee: shanyu zhao
 Fix For: 3.0.0, 2.6.0

 Attachments: HADOOP-10839.2.patch, HADOOP-10839.patch


 Currently the MetricsSystem API has a register() method to register a 
 MetricsSource but doesn't have an unregister() method. This means that once a 
 MetricsSource is registered with the MetricsSystem, it will be there forever 
 until the MetricsSystem is shut down. This can in some cases cause a Java 
 OutOfMemoryError.
 One such case is in the file system metrics implementation. The new 
 AbstractFileSystem/FileContext framework does not implement a cache, so every 
 file system access can lead to the creation of a NativeFileSystem instance 
 (refer to HADOOP-6356). All these NativeFileSystem instances need to share the 
 same instance of MetricsSystemImpl, which means we cannot shut down the 
 MetricsSystem to clean up the MetricsSources that have been registered but are 
 no longer active. Over time the MetricsSource instances accumulate and 
 eventually we see an OutOfMemoryError.
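
 A simplified illustration of the API shape being proposed (not the actual Hadoop signatures):

{code}
// Simplified API shape only: MetricsSystem gains an unregisterSource() to pair
// with register(), so a single source can be dropped without shutting the
// whole metrics system down.
public abstract class MetricsSystemShape {

  /** Register a metrics source under a unique name (already exists today). */
  public abstract <T> T register(String name, String description, T source);

  /** Proposed addition: remove a previously registered source by name. */
  public abstract void unregisterSource(String name);
}
{code}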



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HADOOP-10824) Refactor KMSACLs to avoid locking

2014-07-17 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-10824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14064787#comment-14064787
 ] 

Hudson commented on HADOOP-10824:
-

FAILURE: Integrated in Hadoop-Yarn-trunk #615 (See 
[https://builds.apache.org/job/Hadoop-Yarn-trunk/615/])
HADOOP-10824. Refactor KMSACLs to avoid locking. (Benoy Antony via umamahesh) 
(umamahesh: 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1610969)
* /hadoop/common/trunk/hadoop-common-project/hadoop-common/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-common-project/hadoop-kms/src/main/java/org/apache/hadoop/crypto/key/kms/server/KMSACLs.java


 Refactor KMSACLs to avoid locking
 -

 Key: HADOOP-10824
 URL: https://issues.apache.org/jira/browse/HADOOP-10824
 Project: Hadoop Common
  Issue Type: Improvement
  Components: security
Affects Versions: 2.4.1
Reporter: Benoy Antony
Assignee: Benoy Antony
 Fix For: 3.0.0

 Attachments: HADOOP-10824.patch, HADOOP-10824.patch


 Currently _KMSACLs_ is made thread-safe using a _ReadWriteLock_. It is possible 
 to safely publish the _acls_ collection using _volatile_ instead.
 Similar refactoring has been done in 
 [HADOOP-10448|https://issues.apache.org/jira/browse/HADOOP-10448?focusedCommentId=13980112page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13980112]
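
 A minimal sketch of the volatile-publication idea, assuming the ACL map is rebuilt on reload and swapped in atomically; the class and field names are illustrative rather than the actual KMSACLs code:

{code}
import java.util.HashMap;
import java.util.Map;

// Illustrative sketch: instead of guarding the ACL map with a ReadWriteLock,
// build a fresh map off to the side and publish it through a volatile
// reference. Readers never block.
class VolatileAclsSketch {
  // Assigning a fully built map to a volatile field makes it visible to all
  // reader threads without locking.
  private volatile Map<String, String> acls = new HashMap<>();

  /** Called by the config-reload thread. */
  void reload(Map<String, String> freshlyLoaded) {
    acls = new HashMap<>(freshlyLoaded); // build, then publish atomically
  }

  /** Called on every request; no lock taken. */
  String getAcl(String name) {
    return acls.get(name);
  }
}
{code}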



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HADOOP-9921) daemon scripts should remove pid file on stop call after stop or process is found not running

2014-07-17 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-9921?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14064796#comment-14064796
 ] 

Hudson commented on HADOOP-9921:


FAILURE: Integrated in Hadoop-Yarn-trunk #615 (See 
[https://builds.apache.org/job/Hadoop-Yarn-trunk/615/])
HADOOP-9921. daemon scripts should remove pid file on stop call after stop or 
process is found not running ( Contributed by Vinayakumar B) (vinayakumarb: 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1610964)
* /hadoop/common/trunk/hadoop-common-project/hadoop-common/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/bin/hadoop-daemon.sh
* /hadoop/common/trunk/hadoop-mapreduce-project/bin/mr-jobhistory-daemon.sh
* /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/bin/yarn-daemon.sh


 daemon scripts should remove pid file on stop call after stop or process is 
 found not running
 -

 Key: HADOOP-9921
 URL: https://issues.apache.org/jira/browse/HADOOP-9921
 Project: Hadoop Common
  Issue Type: Bug
Affects Versions: 3.0.0, 2.1.0-beta
Reporter: Vinayakumar B
Assignee: Vinayakumar B
 Fix For: 2.6.0

 Attachments: HADOOP-9921.patch


 The daemon scripts should remove the pid file on a stop call made through the 
 daemon script, and should remove the pid file even if the process is not 
 running.
 The same pid file will be used by the start command; at that time, if the same 
 pid has been assigned to some other process, the start may fail.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HADOOP-10732) Update without holding write lock in JavaKeyStoreProvider#innerSetCredential()

2014-07-17 Thread Junping Du (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-10732?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14064857#comment-14064857
 ] 

Junping Du commented on HADOOP-10732:
-

Manually kick off the Jenkins test again.

 Update without holding write lock in JavaKeyStoreProvider#innerSetCredential()
 --

 Key: HADOOP-10732
 URL: https://issues.apache.org/jira/browse/HADOOP-10732
 Project: Hadoop Common
  Issue Type: Bug
Reporter: Ted Yu
Assignee: Ted Yu
Priority: Minor
 Attachments: hadoop-10732-v1.txt, hadoop-10732-v2.txt


 In 
 hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/security/alias/JavaKeyStoreProvider.java,
 innerSetCredential() doesn't wrap update with writeLock.lock() / 
 writeLock.unlock().
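
 A minimal sketch of the fix being discussed, assuming the update is wrapped in writeLock.lock()/unlock() with a try/finally; the method and field names are simplified stand-ins for the JavaKeyStoreProvider internals:

{code}
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Simplified stand-in: the mutating update is performed while holding the
// write lock, released in a finally block.
class LockedUpdateSketch {
  private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
  private final ReentrantReadWriteLock.WriteLock writeLock = lock.writeLock();

  void innerSetCredential(String alias, char[] material) {
    writeLock.lock();
    try {
      // ... update the underlying keystore entry here ...
    } finally {
      writeLock.unlock();
    }
  }
}
{code}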



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HADOOP-10732) Update without holding write lock in JavaKeyStoreProvider#innerSetCredential()

2014-07-17 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-10732?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14064883#comment-14064883
 ] 

Hadoop QA commented on HADOOP-10732:


{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12655353/hadoop-10732-v2.txt
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:red}-1 tests included{color}.  The patch doesn't appear to include 
any new or modified tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:red}-1 core tests{color}.  The patch failed these unit tests in 
hadoop-common-project/hadoop-common:

  org.apache.hadoop.metrics2.impl.TestMetricsSystemImpl
  org.apache.hadoop.fs.TestSymlinkLocalFSFileContext
  org.apache.hadoop.ipc.TestIPC
  org.apache.hadoop.fs.TestSymlinkLocalFSFileSystem

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-HADOOP-Build/4303//testReport/
Console output: 
https://builds.apache.org/job/PreCommit-HADOOP-Build/4303//console

This message is automatically generated.

 Update without holding write lock in JavaKeyStoreProvider#innerSetCredential()
 --

 Key: HADOOP-10732
 URL: https://issues.apache.org/jira/browse/HADOOP-10732
 Project: Hadoop Common
  Issue Type: Bug
Reporter: Ted Yu
Assignee: Ted Yu
Priority: Minor
 Attachments: hadoop-10732-v1.txt, hadoop-10732-v2.txt


 In 
 hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/security/alias/JavaKeyStoreProvider.java,
 innerSetCredential() doesn't wrap update with writeLock.lock() / 
 writeLock.unlock().



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HADOOP-10839) Add unregisterSource() to MetricsSystem API

2014-07-17 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-10839?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14064903#comment-14064903
 ] 

Hudson commented on HADOOP-10839:
-

FAILURE: Integrated in Hadoop-Mapreduce-trunk #1834 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1834/])
HADOOP-10839. Add unregisterSource() to MetricsSystem API. Contributed by 
Shanyu Zhao. (cnauroth: 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1611134)
* /hadoop/common/trunk/hadoop-common-project/hadoop-common/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/metrics2/MetricsSystem.java
* 
/hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/metrics2/impl/MetricsSystemImpl.java
* 
/hadoop/common/trunk/hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/metrics2/impl/TestMetricsSystemImpl.java


 Add unregisterSource() to MetricsSystem API
 ---

 Key: HADOOP-10839
 URL: https://issues.apache.org/jira/browse/HADOOP-10839
 Project: Hadoop Common
  Issue Type: Improvement
  Components: metrics
Affects Versions: 2.4.1
Reporter: shanyu zhao
Assignee: shanyu zhao
 Fix For: 3.0.0, 2.6.0

 Attachments: HADOOP-10839.2.patch, HADOOP-10839.patch


 Currently the MetricsSystem API has a register() method to register a 
 MetricsSource but doesn't have an unregister() method. This means that once a 
 MetricsSource is registered with the MetricsSystem, it will be there forever 
 until the MetricsSystem is shut down. This can in some cases cause a Java 
 OutOfMemoryError.
 One such case is in the file system metrics implementation. The new 
 AbstractFileSystem/FileContext framework does not implement a cache, so every 
 file system access can lead to the creation of a NativeFileSystem instance 
 (refer to HADOOP-6356). All these NativeFileSystem instances need to share the 
 same instance of MetricsSystemImpl, which means we cannot shut down the 
 MetricsSystem to clean up the MetricsSources that have been registered but are 
 no longer active. Over time the MetricsSource instances accumulate and 
 eventually we see an OutOfMemoryError.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HADOOP-10816) KeyShell returns -1 on error to the shell, should be 1

2014-07-17 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-10816?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14064898#comment-14064898
 ] 

Hudson commented on HADOOP-10816:
-

FAILURE: Integrated in Hadoop-Mapreduce-trunk #1834 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1834/])
HADOOP-10816. KeyShell returns -1 on error to the shell, should be 1. (Mike 
Yoder via wang) (wang: 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1611229)
* /hadoop/common/trunk/hadoop-common-project/hadoop-common/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/crypto/key/KeyShell.java
* 
/hadoop/common/trunk/hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/crypto/key/TestKeyShell.java


 KeyShell returns -1 on error to the shell, should be 1
 --

 Key: HADOOP-10816
 URL: https://issues.apache.org/jira/browse/HADOOP-10816
 Project: Hadoop Common
  Issue Type: Bug
  Components: security
Affects Versions: 3.0.0
Reporter: Mike Yoder
Assignee: Mike Yoder
 Fix For: 3.0.0

 Attachments: HADOOP-10816.001.patch, HADOOP-10816.002.patch


 I've seen this in several places now - commands returning -1 on failure to 
 the shell. It's a bug. Someone confused their POSIX-style returns (0 on 
 success, < 0 on failure) with program returns, which are an unsigned 
 character. Thus, a return of -1 actually becomes 255 to the shell.
 {noformat}
 $ hadoop key create happykey2 --provider kms://http@localhost:16000/kms 
 --attr a=a --attr a=b
 Each attribute must correspond to only one value:
 atttribute a was repeated
 ...
 $ echo $?
 255
 {noformat}
 A return value of 1 instead of -1 does the right thing.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HADOOP-10840) Fix OutOfMemoryError caused by metrics system in Azure File System

2014-07-17 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-10840?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14064899#comment-14064899
 ] 

Hudson commented on HADOOP-10840:
-

FAILURE: Integrated in Hadoop-Mapreduce-trunk #1834 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1834/])
HADOOP-10840. Fix OutOfMemoryError caused by metrics system in Azure File 
System. Contributed by Shanyu Zhao. (cnauroth: 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1611247)
* /hadoop/common/trunk/hadoop-common-project/hadoop-common/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-tools/hadoop-azure/src/main/java/org/apache/hadoop/fs/azure/NativeAzureFileSystem.java
* 
/hadoop/common/trunk/hadoop-tools/hadoop-azure/src/main/java/org/apache/hadoop/fs/azure/metrics/AzureFileSystemMetricsSystem.java
* 
/hadoop/common/trunk/hadoop-tools/hadoop-azure/src/test/java/org/apache/hadoop/fs/azure/AzureBlobStorageTestAccount.java
* 
/hadoop/common/trunk/hadoop-tools/hadoop-azure/src/test/java/org/apache/hadoop/fs/azure/NativeAzureFileSystemBaseTest.java


 Fix OutOfMemoryError caused by metrics system in Azure File System
 --

 Key: HADOOP-10840
 URL: https://issues.apache.org/jira/browse/HADOOP-10840
 Project: Hadoop Common
  Issue Type: Bug
  Components: metrics
Affects Versions: 2.4.1
Reporter: shanyu zhao
Assignee: shanyu zhao
 Fix For: 3.0.0

 Attachments: HADOOP-10840.1.patch, HADOOP-10840.2.patch, 
 HADOOP-10840.patch


 In Hadoop 2.x the Hadoop file system framework changed and no cache is 
 implemented (refer to HADOOP-6356). This means that for every WASB access a new 
 NativeAzureFileSystem is created, and along with it a metrics source is created 
 and added to MetricsSystemImpl. Over time the sources accumulate, eating memory 
 and causing a Java OutOfMemoryError.
 The fix is to utilize the unregisterSource() method added to MetricsSystem in 
 HADOOP-10839.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HADOOP-10839) Add unregisterSource() to MetricsSystem API

2014-07-17 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-10839?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14064924#comment-14064924
 ] 

Hudson commented on HADOOP-10839:
-

FAILURE: Integrated in Hadoop-Hdfs-trunk #1807 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk/1807/])
HADOOP-10839. Add unregisterSource() to MetricsSystem API. Contributed by 
Shanyu Zhao. (cnauroth: 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1611134)
* /hadoop/common/trunk/hadoop-common-project/hadoop-common/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/metrics2/MetricsSystem.java
* 
/hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/metrics2/impl/MetricsSystemImpl.java
* 
/hadoop/common/trunk/hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/metrics2/impl/TestMetricsSystemImpl.java


 Add unregisterSource() to MetricsSystem API
 ---

 Key: HADOOP-10839
 URL: https://issues.apache.org/jira/browse/HADOOP-10839
 Project: Hadoop Common
  Issue Type: Improvement
  Components: metrics
Affects Versions: 2.4.1
Reporter: shanyu zhao
Assignee: shanyu zhao
 Fix For: 3.0.0, 2.6.0

 Attachments: HADOOP-10839.2.patch, HADOOP-10839.patch


 Currently the MetricsSystem API has a register() method to register a 
 MetricsSource but doesn't have an unregister() method. This means that once a 
 MetricsSource is registered with the MetricsSystem, it will be there forever 
 until the MetricsSystem is shut down. This can in some cases cause a Java 
 OutOfMemoryError.
 One such case is in the file system metrics implementation. The new 
 AbstractFileSystem/FileContext framework does not implement a cache, so every 
 file system access can lead to the creation of a NativeFileSystem instance 
 (refer to HADOOP-6356). All these NativeFileSystem instances need to share the 
 same instance of MetricsSystemImpl, which means we cannot shut down the 
 MetricsSystem to clean up the MetricsSources that have been registered but are 
 no longer active. Over time the MetricsSource instances accumulate and 
 eventually we see an OutOfMemoryError.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HADOOP-10840) Fix OutOfMemoryError caused by metrics system in Azure File System

2014-07-17 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-10840?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14064920#comment-14064920
 ] 

Hudson commented on HADOOP-10840:
-

FAILURE: Integrated in Hadoop-Hdfs-trunk #1807 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk/1807/])
HADOOP-10840. Fix OutOfMemoryError caused by metrics system in Azure File 
System. Contributed by Shanyu Zhao. (cnauroth: 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1611247)
* /hadoop/common/trunk/hadoop-common-project/hadoop-common/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-tools/hadoop-azure/src/main/java/org/apache/hadoop/fs/azure/NativeAzureFileSystem.java
* 
/hadoop/common/trunk/hadoop-tools/hadoop-azure/src/main/java/org/apache/hadoop/fs/azure/metrics/AzureFileSystemMetricsSystem.java
* 
/hadoop/common/trunk/hadoop-tools/hadoop-azure/src/test/java/org/apache/hadoop/fs/azure/AzureBlobStorageTestAccount.java
* 
/hadoop/common/trunk/hadoop-tools/hadoop-azure/src/test/java/org/apache/hadoop/fs/azure/NativeAzureFileSystemBaseTest.java


 Fix OutOfMemoryError caused by metrics system in Azure File System
 --

 Key: HADOOP-10840
 URL: https://issues.apache.org/jira/browse/HADOOP-10840
 Project: Hadoop Common
  Issue Type: Bug
  Components: metrics
Affects Versions: 2.4.1
Reporter: shanyu zhao
Assignee: shanyu zhao
 Fix For: 3.0.0

 Attachments: HADOOP-10840.1.patch, HADOOP-10840.2.patch, 
 HADOOP-10840.patch


 In Hadoop 2.x the Hadoop file system framework changed and no cache is 
 implemented (refer to HADOOP-6356). This means that for every WASB access a new 
 NativeAzureFileSystem is created, and along with it a metrics source is created 
 and added to MetricsSystemImpl. Over time the sources accumulate, eating memory 
 and causing a Java OutOfMemoryError.
 The fix is to utilize the unregisterSource() method added to MetricsSystem in 
 HADOOP-10839.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HADOOP-10816) KeyShell returns -1 on error to the shell, should be 1

2014-07-17 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-10816?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14064919#comment-14064919
 ] 

Hudson commented on HADOOP-10816:
-

FAILURE: Integrated in Hadoop-Hdfs-trunk #1807 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk/1807/])
HADOOP-10816. KeyShell returns -1 on error to the shell, should be 1. (Mike 
Yoder via wang) (wang: 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1611229)
* /hadoop/common/trunk/hadoop-common-project/hadoop-common/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/crypto/key/KeyShell.java
* 
/hadoop/common/trunk/hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/crypto/key/TestKeyShell.java


 KeyShell returns -1 on error to the shell, should be 1
 --

 Key: HADOOP-10816
 URL: https://issues.apache.org/jira/browse/HADOOP-10816
 Project: Hadoop Common
  Issue Type: Bug
  Components: security
Affects Versions: 3.0.0
Reporter: Mike Yoder
Assignee: Mike Yoder
 Fix For: 3.0.0

 Attachments: HADOOP-10816.001.patch, HADOOP-10816.002.patch


 I've seen this in several places now - commands returning -1 on failure to 
 the shell. It's a bug. Someone confused their POSIX-style returns (0 on 
 success, < 0 on failure) with program returns, which are an unsigned 
 character. Thus, a return of -1 actually becomes 255 to the shell.
 {noformat}
 $ hadoop key create happykey2 --provider kms://http@localhost:16000/kms 
 --attr a=a --attr a=b
 Each attribute must correspond to only one value:
 atttribute a was repeated
 ...
 $ echo $?
 255
 {noformat}
 A return value of 1 instead of -1 does the right thing.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HADOOP-10732) Update without holding write lock in JavaKeyStoreProvider#innerSetCredential()

2014-07-17 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-10732?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14064952#comment-14064952
 ] 

Ted Yu commented on HADOOP-10732:
-

Test failures were not related to patch - they pass locally with patch v2.
{code}
Tests run: 160, Failures: 0, Errors: 0, Skipped: 11
{code}

 Update without holding write lock in JavaKeyStoreProvider#innerSetCredential()
 --

 Key: HADOOP-10732
 URL: https://issues.apache.org/jira/browse/HADOOP-10732
 Project: Hadoop Common
  Issue Type: Bug
Reporter: Ted Yu
Assignee: Ted Yu
Priority: Minor
 Attachments: hadoop-10732-v1.txt, hadoop-10732-v2.txt


 In 
 hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/security/alias/JavaKeyStoreProvider.java,
 innerSetCredential() doesn't wrap update with writeLock.lock() / 
 writeLock.unlock().



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HADOOP-10641) Introduce Coordination Engine

2014-07-17 Thread Allen Wittenauer (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-10641?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14064987#comment-14064987
 ] 

Allen Wittenauer commented on HADOOP-10641:
---

Did you mean ConsensusNameNode?

 Introduce Coordination Engine
 -

 Key: HADOOP-10641
 URL: https://issues.apache.org/jira/browse/HADOOP-10641
 Project: Hadoop Common
  Issue Type: New Feature
Affects Versions: 3.0.0
Reporter: Konstantin Shvachko
Assignee: Plamen Jeliazkov
 Attachments: HADOOP-10641.patch, HADOOP-10641.patch, 
 HADOOP-10641.patch, hadoop-coordination.patch


 A Coordination Engine (CE) is a system that allows agreement on a sequence of 
 events in a distributed system. To be reliable, the CE should itself be 
 distributed.
 A Coordination Engine can be based on different algorithms (Paxos, Raft, 2PC, 
 ZAB) and have different implementations, depending on use cases and on 
 reliability, availability, and performance requirements.
 The CE should have a common API so that it can serve as a pluggable component 
 in different projects. The immediate beneficiaries are HDFS (HDFS-6469) and 
 HBase (HBASE-10909).
 The first implementation is proposed to be based on ZooKeeper.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Resolved] (HADOOP-8100) share web server information for http filters

2014-07-17 Thread Allen Wittenauer (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-8100?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Allen Wittenauer resolved HADOOP-8100.
--

Resolution: Won't Fix

 share web server information for http filters
 -

 Key: HADOOP-8100
 URL: https://issues.apache.org/jira/browse/HADOOP-8100
 Project: Hadoop Common
  Issue Type: New Feature
Affects Versions: 1.0.0, 0.23.2, 0.24.0
Reporter: Allen Wittenauer
 Attachments: HADOOP-8100-branch-1.0.patch


 This is a simple fix which shares the web server bind information for 
 consumption downstream by 3rd-party plugins.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HADOOP-8100) share web server information for http filters

2014-07-17 Thread Allen Wittenauer (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-8100?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Allen Wittenauer updated HADOOP-8100:
-

Status: Open  (was: Patch Available)

 share web server information for http filters
 -

 Key: HADOOP-8100
 URL: https://issues.apache.org/jira/browse/HADOOP-8100
 Project: Hadoop Common
  Issue Type: New Feature
Affects Versions: 1.0.0, 0.23.2, 0.24.0
Reporter: Allen Wittenauer
 Attachments: HADOOP-8100-branch-1.0.patch


 This is a simple fix which shares the web server bind information for 
 consumption downstream by 3rd-party plugins.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Resolved] (HADOOP-8026) various shell script fixes

2014-07-17 Thread Allen Wittenauer (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-8026?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Allen Wittenauer resolved HADOOP-8026.
--

Resolution: Duplicate

This is all part of HADOOP-9902 now.

 various shell script fixes
 --

 Key: HADOOP-8026
 URL: https://issues.apache.org/jira/browse/HADOOP-8026
 Project: Hadoop Common
  Issue Type: Bug
Affects Versions: 1.0.0
Reporter: Allen Wittenauer
 Attachments: HADOOP-8026-branch-1.0.txt


 Various shell script fixes:
 * repair naked $0s so that dir detections work
 * remove superfluous JAVA_HOME settings
 * use /usr/bin/pdsh in slaves.sh if it exists



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Resolved] (HADOOP-8025) change default distcp log location to be /tmp rather than cwd

2014-07-17 Thread Allen Wittenauer (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-8025?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Allen Wittenauer resolved HADOOP-8025.
--

Resolution: Won't Fix

 change default distcp log location to be /tmp rather than cwd
 -

 Key: HADOOP-8025
 URL: https://issues.apache.org/jira/browse/HADOOP-8025
 Project: Hadoop Common
  Issue Type: Improvement
Affects Versions: 1.0.0
Reporter: Allen Wittenauer
Priority: Trivial
 Attachments: HADOOP-8025-branch-1.0.txt


 distcp loves to leave empty files around. This puts them in /tmp so at least 
 they are easy to find and kill.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HADOOP-10607) Create an API to Separate Credentials/Password Storage from Applications

2014-07-17 Thread Owen O'Malley (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-10607?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Owen O'Malley updated HADOOP-10607:
---

Fix Version/s: 2.5.0

 Create an API to Separate Credentials/Password Storage from Applications
 

 Key: HADOOP-10607
 URL: https://issues.apache.org/jira/browse/HADOOP-10607
 Project: Hadoop Common
  Issue Type: New Feature
  Components: security
Reporter: Larry McCay
Assignee: Larry McCay
 Fix For: 3.0.0, 2.5.0

 Attachments: 10607-10.patch, 10607-11.patch, 10607-12.patch, 
 10607-2.patch, 10607-3.patch, 10607-4.patch, 10607-5.patch, 10607-6.patch, 
 10607-7.patch, 10607-8.patch, 10607-9.patch, 10607-branch-2.patch, 10607.patch


 As with the filesystem API, we need to provide a generic mechanism to support 
 multiple credential storage mechanisms that are potentially from third 
 parties. 
 We need the ability to eliminate the storage of passwords and secrets in 
 clear text within configuration files or within code.
 Toward that end, I propose an API that is configured using a list of URLs of 
 CredentialProviders. The implementation will look for implementations using 
 the ServiceLoader interface and thus support third party libraries.
 Two providers will be included in this patch: one using the credentials cache 
 in MapReduce jobs and the other using Java KeyStores from either HDFS or the 
 local file system.
 A CredShell CLI will also be included in this patch which provides the 
 ability to manage the credentials within the stores.
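
 A hedged sketch of the proposed shape, assuming secrets are resolved through an ordered list of providers; every name below is illustrative and not the actual API from the patches:

{code}
import java.io.IOException;
import java.net.URI;
import java.util.List;

// Illustrative names only; not the actual API from the attached patches.
interface CredentialProviderSketch {

  /** e.g. a keystore-backed URI such as jceks://hdfs/user/alice/creds.jceks. */
  URI getUri();

  /** Returns the secret for an alias, or null if this provider lacks it. */
  char[] getCredentialEntry(String alias) throws IOException;
}

final class CredentialResolverSketch {

  /** Ask each configured provider in order until one knows the alias. */
  static char[] resolve(List<CredentialProviderSketch> providers, String alias)
      throws IOException {
    for (CredentialProviderSketch provider : providers) {
      char[] secret = provider.getCredentialEntry(alias);
      if (secret != null) {
        return secret;
      }
    }
    return null; // caller may fall back to a config value or fail
  }

  private CredentialResolverSketch() {
  }
}
{code}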



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HADOOP-10607) Create an API to Separate Credentials/Password Storage from Applications

2014-07-17 Thread Owen O'Malley (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-10607?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Owen O'Malley updated HADOOP-10607:
---

Fix Version/s: (was: 2.5.0)
   2.6.0

 Create an API to Separate Credentials/Password Storage from Applications
 

 Key: HADOOP-10607
 URL: https://issues.apache.org/jira/browse/HADOOP-10607
 Project: Hadoop Common
  Issue Type: New Feature
  Components: security
Reporter: Larry McCay
Assignee: Larry McCay
 Fix For: 3.0.0, 2.6.0

 Attachments: 10607-10.patch, 10607-11.patch, 10607-12.patch, 
 10607-2.patch, 10607-3.patch, 10607-4.patch, 10607-5.patch, 10607-6.patch, 
 10607-7.patch, 10607-8.patch, 10607-9.patch, 10607-branch-2.patch, 10607.patch


 As with the filesystem API, we need to provide a generic mechanism to support 
 multiple credential storage mechanisms that are potentially from third 
 parties. 
 We need the ability to eliminate the storage of passwords and secrets in 
 clear text within configuration files or within code.
 Toward that end, I propose an API that is configured using a list of URLs of 
 CredentialProviders. The implementation will look for implementations using 
 the ServiceLoader interface and thus support third party libraries.
 Two providers will be included in this patch: one using the credentials cache 
 in MapReduce jobs and the other using Java KeyStores from either HDFS or the 
 local file system.
 A CredShell CLI will also be included in this patch which provides the 
 ability to manage the credentials within the stores.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HADOOP-535) back to back testing of codecs

2014-07-17 Thread Allen Wittenauer (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-535?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14065054#comment-14065054
 ] 

Allen Wittenauer commented on HADOOP-535:
-

This was done years ago, wasn't it?

 back to back testing of codecs
 --

 Key: HADOOP-535
 URL: https://issues.apache.org/jira/browse/HADOOP-535
 Project: Hadoop Common
  Issue Type: Test
  Components: io
Reporter: Owen O'Malley
Assignee: Arun C Murthy

 We should write some unit tests that use codecs back to back, doing writing 
 and then reading:
 compressed block1, compressed block 2, compressed block3, ...
 That will check that the compression codecs are consuming the entire block 
 when they read.
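
 A hedged sketch of the test idea using plain java.util.zip streams (the real unit test would exercise Hadoop CompressionCodec implementations); the length-prefix framing here is an assumption made for the sketch:

{code}
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.zip.DeflaterOutputStream;
import java.util.zip.InflaterInputStream;

// Sketch only: compress several blocks back to back into one stream, then read
// them back in order and check that every block round-trips intact.
public class BackToBackCodecSketch {

  static byte[] compress(byte[] data) throws Exception {
    ByteArrayOutputStream bytes = new ByteArrayOutputStream();
    try (DeflaterOutputStream out = new DeflaterOutputStream(bytes)) {
      out.write(data);
    }
    return bytes.toByteArray();
  }

  static byte[] decompress(byte[] data) throws Exception {
    try (InflaterInputStream in =
        new InflaterInputStream(new ByteArrayInputStream(data))) {
      return in.readAllBytes();
    }
  }

  public static void main(String[] args) throws Exception {
    byte[][] blocks = {
        "compressed block1".getBytes(StandardCharsets.UTF_8),
        "compressed block 2".getBytes(StandardCharsets.UTF_8),
        "compressed block3".getBytes(StandardCharsets.UTF_8)};

    // Write each block length-prefixed so the reader knows where it ends.
    ByteArrayOutputStream stream = new ByteArrayOutputStream();
    DataOutputStream out = new DataOutputStream(stream);
    for (byte[] block : blocks) {
      byte[] compressed = compress(block);
      out.writeInt(compressed.length);
      out.write(compressed);
    }

    // Read the blocks back in order; each must decode to the original bytes.
    DataInputStream in =
        new DataInputStream(new ByteArrayInputStream(stream.toByteArray()));
    for (byte[] expected : blocks) {
      byte[] compressed = new byte[in.readInt()];
      in.readFully(compressed);
      if (!Arrays.equals(expected, decompress(compressed))) {
        throw new AssertionError("block did not round-trip");
      }
    }
    System.out.println("all blocks round-tripped");
  }
}
{code}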



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Created] (HADOOP-10854) unit tests for the shell scripts

2014-07-17 Thread Allen Wittenauer (JIRA)
Allen Wittenauer created HADOOP-10854:
-

 Summary: unit tests for the shell scripts
 Key: HADOOP-10854
 URL: https://issues.apache.org/jira/browse/HADOOP-10854
 Project: Hadoop Common
  Issue Type: Test
Reporter: Allen Wittenauer


With HADOOP-9902 moving a lot of functionality to functions, we should build 
some unit tests for them.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Created] (HADOOP-10855) Allow Text to be read with a known length

2014-07-17 Thread Todd Lipcon (JIRA)
Todd Lipcon created HADOOP-10855:


 Summary: Allow Text to be read with a known length
 Key: HADOOP-10855
 URL: https://issues.apache.org/jira/browse/HADOOP-10855
 Project: Hadoop Common
  Issue Type: Improvement
  Components: io
Affects Versions: 2.6.0
Reporter: Todd Lipcon
Assignee: Todd Lipcon
Priority: Minor


For the native task work (MAPREDUCE-2841) it is useful to be able to store 
strings in a different fashion than the default (varint-prefixed) 
serialization. We should provide a read method in Text which takes an 
already-known length to support this use case while still providing Text 
objects back to the user.
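
A hedged sketch of the idea, assuming the caller already knows the byte length and no varint prefix is read; the TextLike class below is a stand-in, not the real org.apache.hadoop.io.Text:

{code}
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.nio.charset.StandardCharsets;

// Sketch only: when the byte length is already known (e.g. a fixed-layout
// native task record), read exactly that many bytes into the Text-like object
// instead of consuming a varint length prefix first.
public class KnownLengthReadSketch {

  static class TextLike {
    private byte[] bytes = new byte[0];

    /** Read exactly len bytes; no length prefix is consumed from the stream. */
    void readWithKnownLength(DataInputStream in, int len) throws Exception {
      bytes = new byte[len];
      in.readFully(bytes);
    }

    @Override
    public String toString() {
      return new String(bytes, StandardCharsets.UTF_8);
    }
  }

  public static void main(String[] args) throws Exception {
    byte[] payload = "hello native task".getBytes(StandardCharsets.UTF_8);
    ByteArrayOutputStream buffer = new ByteArrayOutputStream();
    new DataOutputStream(buffer).write(payload); // stored without a varint prefix

    TextLike text = new TextLike();
    text.readWithKnownLength(
        new DataInputStream(new ByteArrayInputStream(buffer.toByteArray())),
        payload.length);
    System.out.println(text); // hello native task
  }
}
{code}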



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HADOOP-10855) Allow Text to be read with a known length

2014-07-17 Thread Todd Lipcon (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-10855?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Todd Lipcon updated HADOOP-10855:
-

Attachment: hadoop-10855.txt

Attached patch implements readWithKnownLength(). I also refactored the common 
code out from the existing read methods to call this new one after 
deserializing the length. Added a simple new unit test to verify.

 Allow Text to be read with a known length
 -

 Key: HADOOP-10855
 URL: https://issues.apache.org/jira/browse/HADOOP-10855
 Project: Hadoop Common
  Issue Type: Improvement
  Components: io
Affects Versions: 2.6.0
Reporter: Todd Lipcon
Assignee: Todd Lipcon
Priority: Minor
 Attachments: hadoop-10855.txt


 For the native task work (MAPREDUCE-2841) it is useful to be able to store 
 strings in a different fashion than the default (varint-prefixed) 
 serialization. We should provide a read method in Text which takes an 
 already-known length to support this use case while still providing Text 
 objects back to the user.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HADOOP-10855) Allow Text to be read with a known length

2014-07-17 Thread Todd Lipcon (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-10855?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Todd Lipcon updated HADOOP-10855:
-

Status: Patch Available  (was: Open)

 Allow Text to be read with a known length
 -

 Key: HADOOP-10855
 URL: https://issues.apache.org/jira/browse/HADOOP-10855
 Project: Hadoop Common
  Issue Type: Improvement
  Components: io
Affects Versions: 2.6.0
Reporter: Todd Lipcon
Assignee: Todd Lipcon
Priority: Minor
 Attachments: hadoop-10855.txt, hadoop-10855.txt


 For the native task work (MAPREDUCE-2841) it is useful to be able to store 
 strings in a different fashion than the default (varint-prefixed) 
 serialization. We should provide a read method in Text which takes an 
 already-known length to support this use case while still providing Text 
 objects back to the user.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HADOOP-10855) Allow Text to be read with a known length

2014-07-17 Thread Todd Lipcon (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-10855?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Todd Lipcon updated HADOOP-10855:
-

Attachment: hadoop-10855.txt

Oops, noticed a silly typo in a comment.

 Allow Text to be read with a known length
 -

 Key: HADOOP-10855
 URL: https://issues.apache.org/jira/browse/HADOOP-10855
 Project: Hadoop Common
  Issue Type: Improvement
  Components: io
Affects Versions: 2.6.0
Reporter: Todd Lipcon
Assignee: Todd Lipcon
Priority: Minor
 Attachments: hadoop-10855.txt, hadoop-10855.txt


 For the native task work (MAPREDUCE-2841) it is useful to be able to store 
 strings in a different fashion than the default (varint-prefixed) 
 serialization. We should provide a read method in Text which takes an 
 already-known length to support this use case while still providing Text 
 objects back to the user.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HADOOP-10855) Allow Text to be read with a known length

2014-07-17 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-10855?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14065095#comment-14065095
 ] 

Hadoop QA commented on HADOOP-10855:


{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12656290/hadoop-10855.txt
  against trunk revision .

{color:red}-1 patch{color}.  The patch command could not apply the patch.

Console output: 
https://builds.apache.org/job/PreCommit-HADOOP-Build/4304//console

This message is automatically generated.

 Allow Text to be read with a known length
 -

 Key: HADOOP-10855
 URL: https://issues.apache.org/jira/browse/HADOOP-10855
 Project: Hadoop Common
  Issue Type: Improvement
  Components: io
Affects Versions: 2.6.0
Reporter: Todd Lipcon
Assignee: Todd Lipcon
Priority: Minor
 Attachments: hadoop-10855.txt, hadoop-10855.txt


 For the native task work (MAPREDUCE-2841) it is useful to be able to store 
 strings in a different fashion than the default (varint-prefixed) 
 serialization. We should provide a read method in Text which takes an 
 already-known length to support this use case while still providing Text 
 objects back to the user.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Resolved] (HADOOP-1024) Add stable version line to the website front page

2014-07-17 Thread Allen Wittenauer (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-1024?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Allen Wittenauer resolved HADOOP-1024.
--

Resolution: Fixed

This was done forever ago.

 Add stable version line to the website front page
 -

 Key: HADOOP-1024
 URL: https://issues.apache.org/jira/browse/HADOOP-1024
 Project: Hadoop Common
  Issue Type: Improvement
Reporter: Owen O'Malley

 I think it would be worthwhile to add two lines to the top of the welcome 
 website page:
 Stable version: 0.10.1
 Latest version: 0.11.1 
 With each number linking to the respective release, like so: 
 http://www.apache.org/dyn/closer.cgi/lucene/hadoop/hadoop-0.10.1.tar.gz
 We can promote versions from Latest to Stable when they have proven 
 themselves.
 Thoughts?



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Resolved] (HADOOP-1464) IPC server should not log thread stacks at the info level

2014-07-17 Thread Allen Wittenauer (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-1464?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Allen Wittenauer resolved HADOOP-1464.
--

Resolution: Fixed

I'm going to close this out as stale.  I suspect this is no longer an issue.

 IPC server should not log thread stacks at the info level
 -

 Key: HADOOP-1464
 URL: https://issues.apache.org/jira/browse/HADOOP-1464
 Project: Hadoop Common
  Issue Type: Bug
  Components: ipc
Affects Versions: 0.12.3
Reporter: Hairong Kuang

 Currently, when the IPC server gets a call that has become too old, i.e. the call 
 has not been served for too long a time, it dumps all thread stacks to the logs at 
 the info level. Because the combined size of all thread stacks can be very large, 
 it would be better to log them at the debug level.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Resolved] (HADOOP-1496) Test coverage target in build files using emma

2014-07-17 Thread Allen Wittenauer (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-1496?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Allen Wittenauer resolved HADOOP-1496.
--

Resolution: Won't Fix

I'm going to close this now with won't fix given the clover coverage.

 Test coverage target in build files using emma
 --

 Key: HADOOP-1496
 URL: https://issues.apache.org/jira/browse/HADOOP-1496
 Project: Hadoop Common
  Issue Type: Improvement
  Components: build
 Environment: all
Reporter: woyg
Priority: Minor
 Attachments: emma.tgz, hadoop_clover.patch, patch.emma.txt, 
 patch.emma.txt.2


 Test coverage targets for Hadoop using Emma. 
 Test coverage will help identify the components that are not properly covered 
 by tests so that test cases can be written for them.
 Emma (http://emma.sourceforge.net/) is a good tool for coverage.
 If you have something else in mind, please suggest it.
 I have a patch ready with Emma.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Resolved] (HADOOP-1688) TestCrcCorruption hangs on windows

2014-07-17 Thread Allen Wittenauer (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-1688?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Allen Wittenauer resolved HADOOP-1688.
--

Resolution: Fixed

Closing this as stale.

 TestCrcCorruption hangs on windows
 --

 Key: HADOOP-1688
 URL: https://issues.apache.org/jira/browse/HADOOP-1688
 Project: Hadoop Common
  Issue Type: Bug
  Components: test
Affects Versions: 0.14.0
 Environment: Windows
Reporter: Konstantin Shvachko

 TestCrcCorruption times out on windows saying just that it timed out.
 No other useful information in the log.
 Some kind of timing issue, because if I run it with output=yes then it 
 succeeds.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Resolved] (HADOOP-1754) A testimonial page for hadoop?

2014-07-17 Thread Allen Wittenauer (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-1754?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Allen Wittenauer resolved HADOOP-1754.
--

Resolution: Fixed

We have entire conferences now.

Closing.

 A testimonial page for hadoop?
 --

 Key: HADOOP-1754
 URL: https://issues.apache.org/jira/browse/HADOOP-1754
 Project: Hadoop Common
  Issue Type: Wish
  Components: documentation
Reporter: Konstantin Shvachko
Priority: Minor

 Should we create a testimonial page on hadoop wiki with a link from Hadoop 
 home page so that people 
 could share their experience of using Hadoop?
 I see some satisfied users out there. :)



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HADOOP-1791) Cleanup local files command(s)

2014-07-17 Thread Allen Wittenauer (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-1791?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14065206#comment-14065206
 ] 

Allen Wittenauer commented on HADOOP-1791:
--

I'm not sure about -format as the option, but this would be kind of nice to 
have.

 Cleanup local files command(s)
 --

 Key: HADOOP-1791
 URL: https://issues.apache.org/jira/browse/HADOOP-1791
 Project: Hadoop Common
  Issue Type: New Feature
  Components: util
Affects Versions: 0.15.0
Reporter: Enis Soztutar
  Labels: newbie

 It would be good if we had a cleanup command to clean up all the local 
 directories that any component of Hadoop uses. That way, before the cluster is 
 restarted, or when a machine is pulled out of the 
 cluster, we can clean up all the local files. 
 I propose we add 
 {noformat}
 bin/hadoop datanode -format
 bin/hadoop tasktracker -format
 bin/hadoop jobtracker -format
 {noformat}



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HADOOP-1791) Cleanup local files command(s)

2014-07-17 Thread Allen Wittenauer (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-1791?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Allen Wittenauer updated HADOOP-1791:
-

Labels: newbie  (was: )

 Cleanup local files command(s)
 --

 Key: HADOOP-1791
 URL: https://issues.apache.org/jira/browse/HADOOP-1791
 Project: Hadoop Common
  Issue Type: New Feature
  Components: util
Affects Versions: 0.15.0
Reporter: Enis Soztutar
  Labels: newbie

 It would be good if we had a cleanup command to clean up all the local 
 directories that any component of Hadoop uses. That way, before the cluster is 
 restarted, or when a machine is pulled out of the 
 cluster, we can clean up all the local files. 
 I propose we add 
 {noformat}
 bin/hadoop datanode -format
 bin/hadoop tasktracker -format
 bin/hadoop jobtracker -format
 {noformat}



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HADOOP-1815) Separate client and server jars

2014-07-17 Thread Allen Wittenauer (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-1815?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14065215#comment-14065215
 ] 

Allen Wittenauer commented on HADOOP-1815:
--

With the move to protobuf, how close are we to closing this out?

 Separate client and server jars
 ---

 Key: HADOOP-1815
 URL: https://issues.apache.org/jira/browse/HADOOP-1815
 Project: Hadoop Common
  Issue Type: Bug
  Components: build
Affects Versions: 0.14.0
 Environment: All
Reporter: Milind Bhandarkar

 For the ease of deployment, one should not have to change the server jars, 
 and restart clusters, when minor features on the client side are changed. 
 This requires separating client and server jars for Hadoop. Version numbers 
 appended to hadoop jars can reflect the compatibility. e.g. the server jar 
 could be at 0.13.1, and the client jar could be at 0.13.2. In short, we can 
 treat the part following 0. as the major version number for now.
 This keeps major client frameworks such as streaming and Pig happy. To my 
 knowledge, Pig uses Hadoop's default jobclient, whereas streaming uses its 
 own jobclient. I would love to change streaming to use the default Hadoop 
 jobclient, if I can make modifications to it (e.g. to print more stats that 
 are available from TaskReport, for example), if I do not have to deploy the 
 new version of the whole jar to the backend and restart the mapreduce cluster.
 (I thought there was already a bug filed for separating the client and server 
 jar, but I could not find it. Hence the new Jira. Sorry about duplication, if 
 any.)



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HADOOP-10855) Allow Text to be read with a known length

2014-07-17 Thread Todd Lipcon (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-10855?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Todd Lipcon updated HADOOP-10855:
-

Attachment: hadoop-10855.txt

woops, my patch wasn't relative to the right dir... take 3.

 Allow Text to be read with a known length
 -

 Key: HADOOP-10855
 URL: https://issues.apache.org/jira/browse/HADOOP-10855
 Project: Hadoop Common
  Issue Type: Improvement
  Components: io
Affects Versions: 2.6.0
Reporter: Todd Lipcon
Assignee: Todd Lipcon
Priority: Minor
 Attachments: hadoop-10855.txt, hadoop-10855.txt, hadoop-10855.txt


 For the native task work (MAPREDUCE-2841) it is useful to be able to store 
 strings in a different fashion than the default (varint-prefixed) 
 serialization. We should provide a read method in Text which takes an 
 already-known length to support this use case while still providing Text 
 objects back to the user.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HADOOP-10732) Update without holding write lock in JavaKeyStoreProvider#innerSetCredential()

2014-07-17 Thread Owen O'Malley (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-10732?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Owen O'Malley updated HADOOP-10732:
---

   Resolution: Fixed
Fix Version/s: 2.6.0
   3.0.0
   Status: Resolved  (was: Patch Available)

The v2 patch removes the synchronization around the hash table, so I'm going to 
use the v1 patch.

I just committed this to trunk and branch-2. Thanks, Ted!

 Update without holding write lock in JavaKeyStoreProvider#innerSetCredential()
 --

 Key: HADOOP-10732
 URL: https://issues.apache.org/jira/browse/HADOOP-10732
 Project: Hadoop Common
  Issue Type: Bug
Reporter: Ted Yu
Assignee: Ted Yu
Priority: Minor
 Fix For: 3.0.0, 2.6.0

 Attachments: hadoop-10732-v1.txt, hadoop-10732-v2.txt


 In 
 hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/security/alias/JavaKeyStoreProvider.java,
 innerSetCredential() doesn't wrap update with writeLock.lock() / 
 writeLock.unlock().
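 As an illustration only (not the JavaKeyStoreProvider code), the locking pattern the report asks for looks like this: every mutation of the shared state happens between writeLock().lock() and unlock() in a try/finally:
{code}
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.locks.ReadWriteLock;
import java.util.concurrent.locks.ReentrantReadWriteLock;

public class LockedStore {
  private final ReadWriteLock lock = new ReentrantReadWriteLock(true);
  private final Map<String, char[]> cache = new HashMap<String, char[]>();

  public void setCredential(String alias, char[] material) {
    lock.writeLock().lock();
    try {
      cache.put(alias, material.clone());
    } finally {
      lock.writeLock().unlock();   // always released, even if put() throws
    }
  }

  public char[] getCredential(String alias) {
    lock.readLock().lock();
    try {
      char[] m = cache.get(alias);
      return m == null ? null : m.clone();
    } finally {
      lock.readLock().unlock();
    }
  }
}
{code}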



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HADOOP-10843) Unsafe.getLong is not supported correcly on Power PC, thus causing FastByteComparison's UnsafeComparer not working properly

2014-07-17 Thread Colin Patrick McCabe (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-10843?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14065227#comment-14065227
 ] 

Colin Patrick McCabe commented on HADOOP-10843:
---

What do you mean by "not supported correctly"?  Are you talking about alignment 
restrictions (i.e. the address must be a multiple of 8), or something else?  If 
it is something else, I would expect there to be a Sun/Oracle problem report 
open for this.

 Unsafe.getLong is not supported correcly on Power PC, thus causing 
 FastByteComparison's UnsafeComparer not working properly
 ---

 Key: HADOOP-10843
 URL: https://issues.apache.org/jira/browse/HADOOP-10843
 Project: Hadoop Common
  Issue Type: Bug
  Components: io
Affects Versions: 2.2.0, 2.3.0, 2.4.0, 2.4.1
Reporter: Jinghui Wang
Assignee: Jinghui Wang
 Attachments: HADOOP-10843.patch


 Unsafe.getLong is not supported correctly on Power PC. FastByteComparisons' 
 UnsafeComparer relies on the unsafe method Unsafe.getLong, which is not correctly 
 supported on Power PC.
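 For context, a minimal sketch of how a lexicographic comparer typically uses Unsafe.getLong and why raw long reads are byte-order sensitive (illustrative only; this is not the FastByteComparisons code and it does not claim to be the actual Power PC failure mode):
{code}
import java.lang.reflect.Field;
import java.nio.ByteOrder;
import sun.misc.Unsafe;

public class UnsafeCompareSketch {
  private static final Unsafe UNSAFE = getUnsafe();
  private static final long BYTE_ARRAY_BASE = UNSAFE.arrayBaseOffset(byte[].class);

  private static Unsafe getUnsafe() {
    try {
      Field f = Unsafe.class.getDeclaredField("theUnsafe");
      f.setAccessible(true);
      return (Unsafe) f.get(null);
    } catch (Exception e) {
      throw new RuntimeException(e);
    }
  }

  // Compare the first 8 bytes of two arrays lexicographically, the way an
  // 8-byte-stride comparer would.  Assumes both arrays hold at least 8 bytes.
  static int compareFirstWord(byte[] a, byte[] b) {
    long wa = UNSAFE.getLong(a, BYTE_ARRAY_BASE);
    long wb = UNSAFE.getLong(b, BYTE_ARRAY_BASE);
    if (ByteOrder.nativeOrder() == ByteOrder.LITTLE_ENDIAN) {
      // getLong returns the word in native byte order; on little-endian hosts
      // the bytes must be swapped to keep the comparison lexicographic.
      wa = Long.reverseBytes(wa);
      wb = Long.reverseBytes(wb);
    }
    return Long.compareUnsigned(wa, wb);  // Java 8+; Java 7 needs a manual unsigned compare
  }
}
{code}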



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HADOOP-10849) Implement conf substitution with UGI.current/loginUser

2014-07-17 Thread Colin Patrick McCabe (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-10849?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14065234#comment-14065234
 ] 

Colin Patrick McCabe commented on HADOOP-10849:
---

UserGroupInformation#getCurrentUser also accesses configuration objects; if one 
of them asks for ugi.current.user, then we get into an infinite regress.  There 
is some potential for deadlock here.  So I would say that we should avoid doing 
this unless we can find some way to solve those problems.

 Implement conf substitution with UGI.current/loginUser
 --

 Key: HADOOP-10849
 URL: https://issues.apache.org/jira/browse/HADOOP-10849
 Project: Hadoop Common
  Issue Type: Improvement
  Components: conf
Affects Versions: 2.4.1
Reporter: Gera Shegalov
Assignee: Gera Shegalov
 Attachments: HADOOP-10849.v01.patch


 Many path properties and similar settings in the Hadoop code base would be 
 easier to configure if we had substitution based on 
 {{UserGroupInformation#getCurrentUser}}. Currently we often use less elegant 
 concatenation code when we want to express the current user, as opposed to the 
 ${user.name} system property, which represents the user owning the JVM.
 This JIRA proposes the corresponding substitution support for the keys 
 {{ugi.current.user}} and {{ugi.login.user}}.
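 To make the proposal concrete, a small sketch of the concatenation done today versus the proposed substitution (the property key and path below are hypothetical):
{code}
import java.io.IOException;
import org.apache.hadoop.security.UserGroupInformation;

public class CurrentUserPath {
  // Today: build the per-user path by hand.
  static String stagingDirToday() throws IOException {
    UserGroupInformation ugi = UserGroupInformation.getCurrentUser();
    return "/user/" + ugi.getShortUserName() + "/staging";
  }

  // With the proposed substitution, the same thing would be plain config
  // (hypothetical key and value):
  //   <property>
  //     <name>example.staging.dir</name>
  //     <value>/user/${ugi.current.user}/staging</value>
  //   </property>
  // and conf.get("example.staging.dir") would expand to the current user.
}
{code}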



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HADOOP-1947) the hadoop-daemon.sh should allow the admin to configure the log4j appender for the servers

2014-07-17 Thread Allen Wittenauer (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-1947?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14065241#comment-14065241
 ] 

Allen Wittenauer commented on HADOOP-1947:
--

Ironically, this was fixed in hadoop-daemon.sh at some point, but 
yarn-daemon.sh did the exact same thing!



 the hadoop-daemon.sh should allow the admin to configure the log4j appender 
 for the servers
 ---

 Key: HADOOP-1947
 URL: https://issues.apache.org/jira/browse/HADOOP-1947
 Project: Hadoop Common
  Issue Type: Improvement
Reporter: Owen O'Malley
Assignee: Owen O'Malley

 Currently the bin/hadoop-daemon.sh script forces the servers to use the 
 INFO,DRFA as the root logger. It really should be configurable from at least 
 hadoop-env.sh. Otherwise, it is hard for admins to control how the logs are 
 managed.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Resolved] (HADOOP-2082) randomwriter should complain if there are too many arguments

2014-07-17 Thread Allen Wittenauer (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-2082?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Allen Wittenauer resolved HADOOP-2082.
--

Resolution: Fixed

way old and likely fixed by now.

 randomwriter should complain if there are too many arguments
 

 Key: HADOOP-2082
 URL: https://issues.apache.org/jira/browse/HADOOP-2082
 Project: Hadoop Common
  Issue Type: Improvement
Reporter: Owen O'Malley

 A user was moving from 0.13 to 0.14 and was invoking randomwriter with a 
 config on the command line like:
 bin/hadoop jar hadoop-*-examples.jar randomwriter output conf.xml
 which worked in 0.13, but in 0.14 it ignores the conf.xml without 
 complaining. The equivalent is 
 bin/hadoop jar hadoop-*-examples.jar randomwriter -conf conf.xml output 



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HADOOP-10733) Potential null dereference in CredentialShell#promptForCredential()

2014-07-17 Thread Owen O'Malley (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-10733?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Owen O'Malley updated HADOOP-10733:
---

   Resolution: Fixed
Fix Version/s: 2.6.0
   3.0.0
   Status: Resolved  (was: Patch Available)

I just committed this to trunk and branch-2. Thanks, Ted!

 Potential null dereference in CredentialShell#promptForCredential()
 ---

 Key: HADOOP-10733
 URL: https://issues.apache.org/jira/browse/HADOOP-10733
 Project: Hadoop Common
  Issue Type: Bug
Reporter: Ted Yu
Assignee: Ted Yu
Priority: Minor
 Fix For: 3.0.0, 2.6.0

 Attachments: hadoop-10733-v1.txt


 {code}
   char[] newPassword1 = c.readPassword("Enter password: ");
   char[] newPassword2 = c.readPassword("Enter password again: ");
   noMatch = !Arrays.equals(newPassword1, newPassword2);
   if (noMatch) {
 Arrays.fill(newPassword1, ' ');
 {code}
 newPassword1 might be null, leading to NullPointerException in Arrays.fill() 
 call.
 Similar issue for the following call on line 381:
 {code}
   Arrays.fill(newPassword2, ' ');
 {code}
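 One possible shape of the guard (a sketch only; the committed patch may differ) is to bail out before either array is used:
{code}
import java.io.Console;
import java.io.IOException;
import java.util.Arrays;

public class PasswordPrompt {
  // Console.readPassword returns null when no further input is available,
  // so check both reads before calling Arrays.equals / Arrays.fill.
  static char[] promptForPassword(Console c) throws IOException {
    char[] newPassword1 = c.readPassword("Enter password: ");
    char[] newPassword2 = c.readPassword("Enter password again: ");
    if (newPassword1 == null || newPassword2 == null) {
      throw new IOException("No console input available");
    }
    boolean noMatch = !Arrays.equals(newPassword1, newPassword2);
    Arrays.fill(newPassword2, ' ');   // always wipe the second copy
    if (noMatch) {
      Arrays.fill(newPassword1, ' ');
      return null;                    // caller re-prompts
    }
    return newPassword1;
  }
}
{code}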



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HADOOP-10793) KeyShell and CredentialShell args should use single-dash style

2014-07-17 Thread Andrew Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-10793?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14065282#comment-14065282
 ] 

Andrew Wang commented on HADOOP-10793:
--

Somehow I just realized that CredentialShell also uses two dashes, so let's fix 
that here as well.

 KeyShell and CredentialShell args should use single-dash style
 --

 Key: HADOOP-10793
 URL: https://issues.apache.org/jira/browse/HADOOP-10793
 Project: Hadoop Common
  Issue Type: Improvement
  Components: security
Affects Versions: 3.0.0
Reporter: Mike Yoder

 Follow-on from HADOOP-10736 as per [~andrew.wang] - the key shell uses the 
 gnu double dash style for command line args, while other command line 
 programs use a single dash.  Consider changing this, and consider another 
 argument parsing scheme, like the CommandLine class.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HADOOP-10793) KeyShell and CredentialShell args should use single-dash style

2014-07-17 Thread Andrew Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-10793?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Wang updated HADOOP-10793:
-

Summary: KeyShell and CredentialShell args should use single-dash style  
(was: Key Shell args use double dash style)

 KeyShell and CredentialShell args should use single-dash style
 --

 Key: HADOOP-10793
 URL: https://issues.apache.org/jira/browse/HADOOP-10793
 Project: Hadoop Common
  Issue Type: Improvement
  Components: security
Affects Versions: 3.0.0
Reporter: Mike Yoder

 Follow-on from HADOOP-10736 as per [~andrew.wang] - the key shell uses the 
 gnu double dash style for command line args, while other command line 
 programs use a single dash.  Consider changing this, and consider another 
 argument parsing scheme, like the CommandLine class.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HADOOP-10607) Create an API to Separate Credentials/Password Storage from Applications

2014-07-17 Thread Andrew Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-10607?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14065278#comment-14065278
 ] 

Andrew Wang commented on HADOOP-10607:
--

Hey guys, few q's and comments:

* Why was this merged to branch-2? AFAIK this isn't being used by any Hadoop 
components yet, so it doesn't belong in a release branch. I'd like to revert it 
out of branch-2 until there is such a consumer.
* CredentialShell is using the double dash style for flags. I'm going to 
broaden the scope of HADOOP-10793 to fix this for both KeyShell and 
CredentialShell.
* Larry, I think your IDE is auto-wrapping with tabs. I think this is default 
behavior with Eclipse. Another thing you can do is configure `git diff` to 
highlight whitespace errors like these for the future. Maybe we can fix some of 
these tabs in HADOOP-10793 too, or in a new JIRA. Normally I'm against 
whitespace only changes, but this is mostly new code so there's little chance 
of conflicts.

 Create an API to Separate Credentials/Password Storage from Applications
 

 Key: HADOOP-10607
 URL: https://issues.apache.org/jira/browse/HADOOP-10607
 Project: Hadoop Common
  Issue Type: New Feature
  Components: security
Reporter: Larry McCay
Assignee: Larry McCay
 Fix For: 3.0.0, 2.6.0

 Attachments: 10607-10.patch, 10607-11.patch, 10607-12.patch, 
 10607-2.patch, 10607-3.patch, 10607-4.patch, 10607-5.patch, 10607-6.patch, 
 10607-7.patch, 10607-8.patch, 10607-9.patch, 10607-branch-2.patch, 10607.patch


 As with the filesystem API, we need to provide a generic mechanism to support 
 multiple credential storage mechanisms that are potentially from third 
 parties. 
 We need the ability to eliminate the storage of passwords and secrets in 
 clear text within configuration files or within code.
 Toward that end, I propose an API that is configured using a list of URLs of 
 CredentialProviders. The implementation will look for implementations using 
 the ServiceLoader interface and thus support third party libraries.
 Two providers will be included in this patch. One using the credentials cache 
 in MapReduce jobs and the other using Java KeyStores from either HDFS or 
 local file system. 
 A CredShell CLI will also be included in this patch which provides the 
 ability to manage the credentials within the stores.
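 As a usage sketch against the API in the attached patches (class and constant names are taken from the patch and may differ slightly in the final commit), a client configures a provider URL and resolves an alias like this:
{code}
import java.io.IOException;
import java.util.List;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.security.alias.CredentialProvider;
import org.apache.hadoop.security.alias.CredentialProviderFactory;

public class CredentialLookup {
  public static char[] lookup(String alias) throws IOException {
    Configuration conf = new Configuration();
    // e.g. a Java keystore on the local FS; HDFS-backed jceks URLs work the same way
    conf.set(CredentialProviderFactory.CREDENTIAL_PROVIDER_PATH,
        "jceks://file/tmp/creds.jceks");
    List<CredentialProvider> providers =
        CredentialProviderFactory.getProviders(conf);
    for (CredentialProvider provider : providers) {
      CredentialProvider.CredentialEntry entry =
          provider.getCredentialEntry(alias);
      if (entry != null) {
        return entry.getCredential();
      }
    }
    return null;  // alias not found in any configured provider
  }
}
{code}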



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HADOOP-10591) Compression codecs must used pooled direct buffers or deallocate direct buffers when stream is closed

2014-07-17 Thread Colin Patrick McCabe (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-10591?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Colin Patrick McCabe updated HADOOP-10591:
--

   Resolution: Fixed
Fix Version/s: 2.6.0
   Status: Resolved  (was: Patch Available)

 Compression codecs must used pooled direct buffers or deallocate direct 
 buffers when stream is closed
 -

 Key: HADOOP-10591
 URL: https://issues.apache.org/jira/browse/HADOOP-10591
 Project: Hadoop Common
  Issue Type: Bug
Affects Versions: 2.2.0
Reporter: Hari Shreedharan
Assignee: Colin Patrick McCabe
 Fix For: 2.6.0

 Attachments: HADOOP-10591.001.patch, HADOOP-10591.002.patch


 Currently direct buffers allocated by compression codecs like Gzip (which 
 allocates 2 direct buffers per instance) are not deallocated when the stream 
 is closed. Eventually for long running processes which create a huge number 
 of files, these direct buffers are left hanging till a full gc, which may or 
 may not happen in a reasonable amount of time - especially if the process 
 does not use a whole lot of heap.
 Either these buffers should be pooled or they should be deallocated when the 
 stream is closed.
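 The pooling alternative already exists for callers that manage their own streams; an illustrative sketch using CodecPool so the compressor's direct buffers are reused rather than allocated (and leaked) per stream:
{code}
import java.io.IOException;
import java.io.OutputStream;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.io.compress.CodecPool;
import org.apache.hadoop.io.compress.CompressionCodec;
import org.apache.hadoop.io.compress.CompressionOutputStream;
import org.apache.hadoop.io.compress.Compressor;
import org.apache.hadoop.io.compress.GzipCodec;
import org.apache.hadoop.util.ReflectionUtils;

public class PooledCompressionWrite {
  public static void write(OutputStream raw, byte[] data) throws IOException {
    Configuration conf = new Configuration();
    CompressionCodec codec = ReflectionUtils.newInstance(GzipCodec.class, conf);
    // Borrow a compressor from the pool so its direct buffers are reused
    // across streams instead of being allocated for each one.
    Compressor compressor = CodecPool.getCompressor(codec);
    try {
      CompressionOutputStream out = codec.createOutputStream(raw, compressor);
      out.write(data);
      out.finish();
      out.close();
    } finally {
      CodecPool.returnCompressor(compressor);  // hand the buffers back
    }
  }
}
{code}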



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HADOOP-10733) Potential null dereference in CredentialShell#promptForCredential()

2014-07-17 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-10733?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14065296#comment-14065296
 ] 

Hudson commented on HADOOP-10733:
-

FAILURE: Integrated in Hadoop-trunk-Commit #5900 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/5900/])
HADOOP-10733. Fix potential null dereference in CredShell. (Ted Yu via
omalley) (omalley: 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1611419)
* /hadoop/common/trunk/hadoop-common-project/hadoop-common/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/security/alias/CredentialShell.java
* 
/hadoop/common/trunk/hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/security/alias/TestCredShell.java


 Potential null dereference in CredentialShell#promptForCredential()
 ---

 Key: HADOOP-10733
 URL: https://issues.apache.org/jira/browse/HADOOP-10733
 Project: Hadoop Common
  Issue Type: Bug
Reporter: Ted Yu
Assignee: Ted Yu
Priority: Minor
 Fix For: 3.0.0, 2.6.0

 Attachments: hadoop-10733-v1.txt


 {code}
   char[] newPassword1 = c.readPassword("Enter password: ");
   char[] newPassword2 = c.readPassword("Enter password again: ");
   noMatch = !Arrays.equals(newPassword1, newPassword2);
   if (noMatch) {
 Arrays.fill(newPassword1, ' ');
 {code}
 newPassword1 might be null, leading to NullPointerException in Arrays.fill() 
 call.
 Similar issue for the following call on line 381:
 {code}
   Arrays.fill(newPassword2, ' ');
 {code}



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HADOOP-10591) Compression codecs must used pooled direct buffers or deallocate direct buffers when stream is closed

2014-07-17 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-10591?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14065295#comment-14065295
 ] 

Hudson commented on HADOOP-10591:
-

FAILURE: Integrated in Hadoop-trunk-Commit #5900 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/5900/])
HADOOP-10591. Compression codecs must used pooled direct buffers or deallocate 
direct buffers when stream is closed (cmccabe) (cmccabe: 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1611423)
* /hadoop/common/trunk/hadoop-common-project/hadoop-common/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/io/compress/BZip2Codec.java
* 
/hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/io/compress/CompressionCodec.java
* 
/hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/io/compress/CompressionInputStream.java
* 
/hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/io/compress/CompressionOutputStream.java
* 
/hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/io/compress/DefaultCodec.java
* 
/hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/io/compress/GzipCodec.java
* 
/hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/io/compress/Lz4Codec.java
* 
/hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/io/compress/SnappyCodec.java


 Compression codecs must used pooled direct buffers or deallocate direct 
 buffers when stream is closed
 -

 Key: HADOOP-10591
 URL: https://issues.apache.org/jira/browse/HADOOP-10591
 Project: Hadoop Common
  Issue Type: Bug
Affects Versions: 2.2.0
Reporter: Hari Shreedharan
Assignee: Colin Patrick McCabe
 Fix For: 2.6.0

 Attachments: HADOOP-10591.001.patch, HADOOP-10591.002.patch


 Currently direct buffers allocated by compression codecs like Gzip (which 
 allocates 2 direct buffers per instance) are not deallocated when the stream 
 is closed. Eventually for long running processes which create a huge number 
 of files, these direct buffers are left hanging till a full gc, which may or 
 may not happen in a reasonable amount of time - especially if the process 
 does not use a whole lot of heap.
 Either these buffers should be pooled or they should be deallocated when the 
 stream is closed.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HADOOP-10732) Update without holding write lock in JavaKeyStoreProvider#innerSetCredential()

2014-07-17 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-10732?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14065298#comment-14065298
 ] 

Hudson commented on HADOOP-10732:
-

FAILURE: Integrated in Hadoop-trunk-Commit #5900 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/5900/])
HADOOP-10732. Fix locking in credential update. (Ted Yu via omalley) (omalley: 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1611415)
* /hadoop/common/trunk/hadoop-common-project/hadoop-common/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/security/alias/JavaKeyStoreProvider.java


 Update without holding write lock in JavaKeyStoreProvider#innerSetCredential()
 --

 Key: HADOOP-10732
 URL: https://issues.apache.org/jira/browse/HADOOP-10732
 Project: Hadoop Common
  Issue Type: Bug
Reporter: Ted Yu
Assignee: Ted Yu
Priority: Minor
 Fix For: 3.0.0, 2.6.0

 Attachments: hadoop-10732-v1.txt, hadoop-10732-v2.txt


 In 
 hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/security/alias/JavaKeyStoreProvider.java,
 innerSetCredential() doesn't wrap update with writeLock.lock() / 
 writeLock.unlock().



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HADOOP-10607) Create an API to Separate Credentials/Password Storage from Applications

2014-07-17 Thread Larry McCay (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-10607?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14065322#comment-14065322
 ] 

Larry McCay commented on HADOOP-10607:
--

Hi [~andrew.wang] - I will look into changing my preferences and configuring 
git diff as you describe. I thought that I was managing it manually well 
enough. Thanks for the hints!

 Create an API to Separate Credentials/Password Storage from Applications
 

 Key: HADOOP-10607
 URL: https://issues.apache.org/jira/browse/HADOOP-10607
 Project: Hadoop Common
  Issue Type: New Feature
  Components: security
Reporter: Larry McCay
Assignee: Larry McCay
 Fix For: 3.0.0, 2.6.0

 Attachments: 10607-10.patch, 10607-11.patch, 10607-12.patch, 
 10607-2.patch, 10607-3.patch, 10607-4.patch, 10607-5.patch, 10607-6.patch, 
 10607-7.patch, 10607-8.patch, 10607-9.patch, 10607-branch-2.patch, 10607.patch


 As with the filesystem API, we need to provide a generic mechanism to support 
 multiple credential storage mechanisms that are potentially from third 
 parties. 
 We need the ability to eliminate the storage of passwords and secrets in 
 clear text within configuration files or within code.
 Toward that end, I propose an API that is configured using a list of URLs of 
 CredentialProviders. The implementation will look for implementations using 
 the ServiceLoader interface and thus support third party libraries.
 Two providers will be included in this patch. One using the credentials cache 
 in MapReduce jobs and the other using Java KeyStores from either HDFS or 
 local file system. 
 A CredShell CLI will also be included in this patch which provides the 
 ability to manage the credentials within the stores.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Work started] (HADOOP-10818) native client: refactor URI code to be clearer

2014-07-17 Thread Colin Patrick McCabe (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-10818?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on HADOOP-10818 started by Colin Patrick McCabe.

 native client: refactor URI code to be clearer
 --

 Key: HADOOP-10818
 URL: https://issues.apache.org/jira/browse/HADOOP-10818
 Project: Hadoop Common
  Issue Type: Sub-task
  Components: native
Affects Versions: HADOOP-10388
Reporter: Colin Patrick McCabe
Assignee: Colin Patrick McCabe
 Attachments: HADOOP-10818-pnative.001.patch


 Refactor the {{common/uri.c}} code to be a bit clearer.  We should just be 
 able to refer to user_info, auth, port, path, etc. fields in the structure, 
 rather than calling accessors.  {{hdfsBuilder}} should just have a connection 
 URI rather than separate fields for all these things.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HADOOP-2270) Title: DFS submit client params overrides final params on cluster

2014-07-17 Thread Allen Wittenauer (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-2270?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Allen Wittenauer updated HADOOP-2270:
-

Labels: newbie  (was: )

 Title: DFS submit client params overrides final params on cluster 
 --

 Key: HADOOP-2270
 URL: https://issues.apache.org/jira/browse/HADOOP-2270
 Project: Hadoop Common
  Issue Type: Bug
  Components: conf
Affects Versions: 0.15.1
Reporter: Karam Singh
  Labels: newbie

 hdfs client params override the params set as final on the hdfs cluster nodes: 
 default values from the client-side hadoop-site.xml override the final 
 parameters of the cluster's hadoop-site.xml.
 Observed the following cases:
 1. dfs.trash.root=/recycle, dfs.trash.interval=10 and dfs.replication=2 are 
 marked final in hadoop-site.xml on the hdfs cluster.
    When the FsShell command hadoop dfs -put local_dir dest is fired from the 
 submission host, files still get replicated 3 times (the default) instead of 
 the final dfs.replication=2.
    Similarly, when hadoop dfs -rmr dfs_dir or hadoop dfs -rm file_path is 
 fired from the submit client, the file/directory is deleted directly without 
 being moved to /recycle.
    Here hadoop-site.xml on the submit client does not specify dfs.trash.root, 
 dfs.trash.interval or dfs.replication.
    The same happens when we submit a mapred job from the client: job.xml 
 displays default values which override the cluster values.
 2. dfs.trash.root=/recycle, dfs.trash.interval=10 and dfs.replication=2 are 
 marked final in hadoop-site.xml on the hdfs cluster, and 
 dfs.trash.root=/rubbish, dfs.trash.interval=2 and dfs.replication=5 are set in 
 hadoop-site.xml on the submit client.
    When the FsShell command hadoop dfs -put local_dir dest is fired from the 
 submit client, files get replicated 5 times instead of the final 
 dfs.replication=2.
    Similarly, when hadoop dfs -rmr dfs_dir or hadoop dfs -rm file_path is 
 fired from the submit client, the file/directory is moved to /rubbish instead 
 of /recycle.
    The same happens when we submit a mapred job from the client; job.xml 
 displays the following values:
    dfs.trash.root=/rubbish, dfs.trash.interval=2 and dfs.replication=5



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HADOOP-2270) Title: DFS submit client params overrides final params on cluster

2014-07-17 Thread Allen Wittenauer (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-2270?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14065340#comment-14065340
 ] 

Allen Wittenauer commented on HADOOP-2270:
--

I doubt this is still an issue, but it would be good for someone to verify.  
I'll mark this as a newbie jira for someone to look at, just in case...
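For anyone verifying, a small self-contained check of the final semantics (a sketch only): within a single Configuration, a later resource cannot override a property marked final, so the behaviour reported here comes from the client JVM loading its own resources and never seeing the server's finals at all.
{code}
import java.io.ByteArrayInputStream;
import org.apache.hadoop.conf.Configuration;

public class FinalParamCheck {
  public static void main(String[] args) {
    String serverSide =
        "<configuration><property><name>dfs.replication</name>"
        + "<value>2</value><final>true</final></property></configuration>";
    String clientSide =
        "<configuration><property><name>dfs.replication</name>"
        + "<value>5</value></property></configuration>";
    Configuration conf = new Configuration(false);
    conf.addResource(new ByteArrayInputStream(serverSide.getBytes()));
    conf.addResource(new ByteArrayInputStream(clientSide.getBytes()));
    // Expected output: 2 -- the later resource must not override the final value.
    System.out.println(conf.get("dfs.replication"));
  }
}
{code}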

 Title: DFS submit client params overrides final params on cluster 
 --

 Key: HADOOP-2270
 URL: https://issues.apache.org/jira/browse/HADOOP-2270
 Project: Hadoop Common
  Issue Type: Bug
  Components: conf
Affects Versions: 0.15.1
Reporter: Karam Singh
  Labels: newbie

 hdfs client params override the params set as final on the hdfs cluster nodes: 
 default values from the client-side hadoop-site.xml override the final 
 parameters of the cluster's hadoop-site.xml.
 Observed the following cases:
 1. dfs.trash.root=/recycle, dfs.trash.interval=10 and dfs.replication=2 are 
 marked final in hadoop-site.xml on the hdfs cluster.
    When the FsShell command hadoop dfs -put local_dir dest is fired from the 
 submission host, files still get replicated 3 times (the default) instead of 
 the final dfs.replication=2.
    Similarly, when hadoop dfs -rmr dfs_dir or hadoop dfs -rm file_path is 
 fired from the submit client, the file/directory is deleted directly without 
 being moved to /recycle.
    Here hadoop-site.xml on the submit client does not specify dfs.trash.root, 
 dfs.trash.interval or dfs.replication.
    The same happens when we submit a mapred job from the client: job.xml 
 displays default values which override the cluster values.
 2. dfs.trash.root=/recycle, dfs.trash.interval=10 and dfs.replication=2 are 
 marked final in hadoop-site.xml on the hdfs cluster, and 
 dfs.trash.root=/rubbish, dfs.trash.interval=2 and dfs.replication=5 are set in 
 hadoop-site.xml on the submit client.
    When the FsShell command hadoop dfs -put local_dir dest is fired from the 
 submit client, files get replicated 5 times instead of the final 
 dfs.replication=2.
    Similarly, when hadoop dfs -rmr dfs_dir or hadoop dfs -rm file_path is 
 fired from the submit client, the file/directory is moved to /rubbish instead 
 of /recycle.
    The same happens when we submit a mapred job from the client; job.xml 
 displays the following values:
    dfs.trash.root=/rubbish, dfs.trash.interval=2 and dfs.replication=5



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Created] (HADOOP-10856) HarFileSystem and HarFs support for HDFS encryption

2014-07-17 Thread Andrew Wang (JIRA)
Andrew Wang created HADOOP-10856:


 Summary: HarFileSystem and HarFs support for HDFS encryption
 Key: HADOOP-10856
 URL: https://issues.apache.org/jira/browse/HADOOP-10856
 Project: Hadoop Common
  Issue Type: Bug
Affects Versions: fs-encryption (HADOOP-10150 and HDFS-6134)
Reporter: Andrew Wang
Assignee: Andrew Wang


We need to examine support for Har with HDFS encryption.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HADOOP-10856) HarFileSystem and HarFs support for HDFS encryption

2014-07-17 Thread Alejandro Abdelnur (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-10856?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14065379#comment-14065379
 ] 

Alejandro Abdelnur commented on HADOOP-10856:
-

IMO, HAR simply has to support xAttrs.

If you are doing a HAR under .raw, the same magic as for distcp will kick in.

If you are doing a HAR outside of .raw, everything is unencrypted in the HAR.

If your HAR file is within an encryption zone, the HAR file itself is encrypted.

 HarFileSystem and HarFs support for HDFS encryption
 ---

 Key: HADOOP-10856
 URL: https://issues.apache.org/jira/browse/HADOOP-10856
 Project: Hadoop Common
  Issue Type: Bug
Affects Versions: fs-encryption (HADOOP-10150 and HDFS-6134)
Reporter: Andrew Wang
Assignee: Andrew Wang

 We need to examine support for Har with HDFS encryption.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HADOOP-9902) Shell script rewrite

2014-07-17 Thread Allen Wittenauer (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-9902?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14065387#comment-14065387
 ] 

Allen Wittenauer commented on HADOOP-9902:
--

A note for me: HDFS-2256 has an interesting idea for start-dfs.sh.

 Shell script rewrite
 

 Key: HADOOP-9902
 URL: https://issues.apache.org/jira/browse/HADOOP-9902
 Project: Hadoop Common
  Issue Type: Improvement
  Components: scripts
Affects Versions: 3.0.0
Reporter: Allen Wittenauer
Assignee: Allen Wittenauer
  Labels: releasenotes
 Attachments: HADOOP-9902-2.patch, HADOOP-9902-3.patch, 
 HADOOP-9902-4.patch, HADOOP-9902-5.patch, HADOOP-9902-6.patch, 
 HADOOP-9902.patch, HADOOP-9902.txt, hadoop-9902-1.patch, more-info.txt


 Umbrella JIRA for shell script rewrite.  See more-info.txt for more details.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Resolved] (HADOOP-2462) MiniMRCluster does not utilize multiple local directories in mapred.local.dir

2014-07-17 Thread Allen Wittenauer (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-2462?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Allen Wittenauer resolved HADOOP-2462.
--

Resolution: Incomplete

Stale.

 MiniMRCluster does not utilize multiple local directories in 
 mapred.local.dir
 ---

 Key: HADOOP-2462
 URL: https://issues.apache.org/jira/browse/HADOOP-2462
 Project: Hadoop Common
  Issue Type: Bug
  Components: test
Affects Versions: 0.15.0
Reporter: Konstantin Shvachko

 My hadoop-site.xml specifies 4 local directories
 {code}
 <property>
   <name>mapred.local.dir</name>
   <value>${hadoop.tmp.dir}/mapred/local1, ${hadoop.tmp.dir}/mapred/local2,
          ${hadoop.tmp.dir}/mapred/local3, ${hadoop.tmp.dir}/mapred/local4</value>
 </property>
 {code}
 and I am looking at MiniMRCluster.TaskTrackerRunner
 There are several things here:
 # localDirBase value is set to
 {code}
 /tmp/h/mapred/local1, /tmp/h/mapred/local2, /tmp/h/mapred/local3, 
 /tmp/h/mapred/local4
 {code}
 and I get a hierarchy of directories with commas and spaces in the names. 
 I think this was not designed to work with multiple dirs.
 # Further down, all new directories are generated with the same name
 {code}
 File ttDir = new File(localDirBase, 
   Integer.toString(trackerId) + "_" + 0);
 {code}
 So in fact only one directory is created. I think the intention was to have i 
 instead of 0
 {code}
 File ttDir = new File(localDirBase, 
   Integer.toString(trackerId) + "_" + i);
 {code}
 # On Windows, MiniMRCluster.TaskTrackerRunner in this case throws an 
 IOException, which is silently ignored by all but the 
 TestMiniMRMapRedDebugScript and MiniMR tests.
 {code}
 java.io.IOException: Mkdirs failed to create 
 /tmp/h/mapred/local1, /tmp/h/mapred/local2, /tmp/h/mapred/local3, 
 /tmp/h/mapred/local4/0_0
   at 
 org.apache.hadoop.mapred.MiniMRCluster$TaskTrackerRunner.<init>(MiniMRCluster.java:124)
   at org.apache.hadoop.mapred.MiniMRCluster.<init>(MiniMRCluster.java:293)
   at org.apache.hadoop.mapred.MiniMRCluster.<init>(MiniMRCluster.java:244)
   at 
 org.apache.hadoop.mapred.TestMiniMRClasspath.testClassPath(TestMiniMRClasspath.java:163)
   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
   at 
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
   at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
   at java.lang.reflect.Method.invoke(Method.java:585)
   at junit.framework.TestCase.runTest(TestCase.java:154)
   at junit.framework.TestCase.runBare(TestCase.java:127)
   at junit.framework.TestResult$1.protect(TestResult.java:106)
   at junit.framework.TestResult.runProtected(TestResult.java:124)
   at junit.framework.TestResult.run(TestResult.java:109)
   at junit.framework.TestCase.run(TestCase.java:118)
   at junit.framework.TestSuite.runTest(TestSuite.java:208)
   at junit.framework.TestSuite.run(TestSuite.java:203)
   at 
 org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.runTests(RemoteTestRunner.java:478)
   at 
 org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.run(RemoteTestRunner.java:344)
   at 
 org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.main(RemoteTestRunner.java:196)
 {code}
 I am marking it as Major because we actually do not test multiple local 
 directories.
 Looks like it was introduced rather recently by HADOOP-1819.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HADOOP-10855) Allow Text to be read with a known length

2014-07-17 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-10855?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14065386#comment-14065386
 ] 

Hadoop QA commented on HADOOP-10855:


{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12656306/hadoop-10855.txt
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 1 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:red}-1 core tests{color}.  The patch failed these unit tests in 
hadoop-common-project/hadoop-common:

  org.apache.hadoop.fs.shell.TestCopyPreserveFlag
  org.apache.hadoop.fs.TestSymlinkLocalFSFileContext
  org.apache.hadoop.fs.shell.TestTextCommand
  org.apache.hadoop.ipc.TestIPC
  org.apache.hadoop.fs.TestSymlinkLocalFSFileSystem
  org.apache.hadoop.fs.shell.TestPathData
  org.apache.hadoop.fs.TestDFVariations

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-HADOOP-Build/4305//testReport/
Console output: 
https://builds.apache.org/job/PreCommit-HADOOP-Build/4305//console

This message is automatically generated.

 Allow Text to be read with a known length
 -

 Key: HADOOP-10855
 URL: https://issues.apache.org/jira/browse/HADOOP-10855
 Project: Hadoop Common
  Issue Type: Improvement
  Components: io
Affects Versions: 2.6.0
Reporter: Todd Lipcon
Assignee: Todd Lipcon
Priority: Minor
 Attachments: hadoop-10855.txt, hadoop-10855.txt, hadoop-10855.txt


 For the native task work (MAPREDUCE-2841) it is useful to be able to store 
 strings in a different fashion than the default (varint-prefixed) 
 serialization. We should provide a read method in Text which takes an 
 already-known length to support this use case while still providing Text 
 objects back to the user.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HADOOP-10778) Use NativeCrc32 only if it is faster

2014-07-17 Thread Tsz Wo Nicholas Sze (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-10778?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14065407#comment-14065407
 ] 

Tsz Wo Nicholas Sze commented on HADOOP-10778:
--

It is 2.6 GHz i7.  How about we use crcutil?  It is Apache License 2.0.

 Use NativeCrc32 only if it is faster
 

 Key: HADOOP-10778
 URL: https://issues.apache.org/jira/browse/HADOOP-10778
 Project: Hadoop Common
  Issue Type: Improvement
  Components: util
Reporter: Tsz Wo Nicholas Sze
Assignee: Tsz Wo Nicholas Sze
 Attachments: c10778_20140702.patch


 From the benchmark post in [this 
 comment|https://issues.apache.org/jira/browse/HDFS-6560?focusedCommentId=14044060page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14044060],
  NativeCrc32 is slower than java.util.zip.CRC32 for Java 7 and above when 
 bytesPerChecksum  512.
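 For anyone wanting to reproduce locally, a rough single-chunk-size comparison of the two pure-software paths (illustrative only; the numbers in the linked comment came from a more careful benchmark, and NativeCrc32 itself is exercised through the checksum verification path rather than called directly):
{code}
import java.util.Random;
import java.util.zip.CRC32;
import org.apache.hadoop.util.PureJavaCrc32;

public class CrcMicroBench {
  public static void main(String[] args) {
    byte[] data = new byte[512];          // one bytesPerChecksum-sized chunk
    new Random(0).nextBytes(data);
    long sink = 0;                        // keeps the JIT from dropping the loops

    long t0 = System.nanoTime();
    CRC32 jdk = new CRC32();
    for (int i = 0; i < 1_000_000; i++) {
      jdk.reset(); jdk.update(data, 0, data.length); sink += jdk.getValue();
    }
    long t1 = System.nanoTime();

    PureJavaCrc32 pure = new PureJavaCrc32();
    for (int i = 0; i < 1_000_000; i++) {
      pure.reset(); pure.update(data, 0, data.length); sink += pure.getValue();
    }
    long t2 = System.nanoTime();

    System.out.printf("java.util.zip.CRC32 %d ms, PureJavaCrc32 %d ms (sink=%d)%n",
        (t1 - t0) / 1_000_000, (t2 - t1) / 1_000_000, sink);
  }
}
{code}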



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Resolved] (HADOOP-2560) Processing multiple input splits per mapper task

2014-07-17 Thread Allen Wittenauer (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-2560?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Allen Wittenauer resolved HADOOP-2560.
--

Resolution: Duplicate

This appears to predate MFIF/CFIF, as introduced by HADOOP-4565 which appears 
to fix the issue.  I'm going to close this out as resolved as a result.
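For reference, the CFIF route looks roughly like this in current MapReduce (a sketch; CombineTextInputFormat is the plain-text subclass of CombineFileInputFormat, and the split-size key it honours is the standard max-split-size setting):
{code}
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.CombineTextInputFormat;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;

public class CombineSplitsExample {
  // Pack many small, preferably rack-local blocks into each split instead of
  // getting one mapper per block.
  static void configure(Job job) {
    job.setInputFormatClass(CombineTextInputFormat.class);
    // Cap each combined split at 256 MB.
    FileInputFormat.setMaxInputSplitSize(job, 256L * 1024 * 1024);
  }
}
{code}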

 Processing multiple input splits per mapper task
 

 Key: HADOOP-2560
 URL: https://issues.apache.org/jira/browse/HADOOP-2560
 Project: Hadoop Common
  Issue Type: Bug
Reporter: Runping Qi
Assignee: dhruba borthakur
 Attachments: multipleSplitsPerMapper.patch


 Currently, an input split contains a consecutive chunk of an input file, which 
 by default corresponds to a DFS block.
 This may lead to a large number of mapper tasks if the input data is large, 
 which causes the following problems:
 1. Shuffling cost: since the framework has to move M * R map output segments 
 to the nodes running reducers, a larger M means a larger shuffling cost.
 2. High JVM initialization overhead.
 3. Disk fragmentation: a larger number of map output files means lower read 
 throughput when accessing them.
 Ideally, you want to keep the number of mappers to no more than 16 times the 
 number of nodes in the cluster.
 To achieve that, we can increase the input split size. However, if a split 
 spans more than one DFS block, you lose the data-locality scheduling benefits.
 One way to address this problem is to combine multiple input blocks from the 
 same rack into one split.
 If on average we combine B blocks into one split, we reduce the number of 
 mappers by a factor of B.
 Since all the blocks for one mapper share a rack, we can still benefit from 
 rack-aware scheduling.
 Thoughts?



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Resolved] (HADOOP-2608) Reading sequence file consumes 100% cpu with maximum throughput being about 5MB/sec per process

2014-07-17 Thread Allen Wittenauer (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-2608?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Allen Wittenauer resolved HADOOP-2608.
--

Resolution: Fixed

I'm going to close this out as stale.

 Reading sequence file consumes 100% cpu with maximum throughput being about 
 5MB/sec per process
 ---

 Key: HADOOP-2608
 URL: https://issues.apache.org/jira/browse/HADOOP-2608
 Project: Hadoop Common
  Issue Type: Improvement
  Components: io
Reporter: Runping Qi

 I did some tests on the throughput of scanning block-compressed sequence 
 files.
 The sustained throughput was bounded at 5MB/sec per process, with the cpu of 
 each process maxed at 100%.
 It seems to me that the cpu consumption is too high and the throughput is too 
 low for just scanning files.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HADOOP-2681) NullPointerException in TaskRunner.java when system property hadoop.log.dir is not set

2014-07-17 Thread Allen Wittenauer (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-2681?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Allen Wittenauer updated HADOOP-2681:
-

Labels: newbie  (was: )

 NullPointerException in TaskRunner.java when system property hadoop.log.dir 
 is not set
 

 Key: HADOOP-2681
 URL: https://issues.apache.org/jira/browse/HADOOP-2681
 Project: Hadoop Common
  Issue Type: Bug
  Components: conf
Affects Versions: 0.15.2
Reporter: Xu Zhang
Priority: Minor
  Labels: newbie

 Currently, a NullPointerException is thrown on line 321 in TaskRunner.java 
 when the system property hadoop.log.dir is not set.  Instead of a 
 NullPointerException, I expected a default value for hadoop.log.dir to be 
 used, or to see a more meaningful error message that could have helped me 
 figure out what was wrong (for example, telling me that I needed to set 
 hadoop.log.dir and how to do so).
 Here is one instance of such exceptions:
 WARN mapred.TaskRunner: task_200801181719_0001_m_00_0 Child Error
 java.lang.NullPointerException
   at java.io.File.<init>(File.java:222)
   at org.apache.hadoop.mapred.TaskRunner.run(TaskRunner.java:321)



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HADOOP-2681) NullPointerException in TaskRunner.java when system property hadoop.log.dir is not set

2014-07-17 Thread Allen Wittenauer (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-2681?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14065448#comment-14065448
 ] 

Allen Wittenauer commented on HADOOP-2681:
--

We should double-check *all* of the references to hadoop.log.dir.

HADOOP-9902 gives some guarantees that this is properly set, but the Java code 
should be more forgiving.
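A minimal example of the kind of forgiveness meant here (illustrative only; the real default value and message belong wherever TaskRunner builds the path):
{code}
import java.io.File;

public class LogDirSketch {
  // Fall back to a default and say what was missing, instead of letting
  // new File(null, ...) throw a bare NullPointerException.
  static File resolveLogDir() {
    String dir = System.getProperty("hadoop.log.dir");
    if (dir == null) {
      System.err.println("hadoop.log.dir is not set; set it in hadoop-env.sh "
          + "or pass -Dhadoop.log.dir=...; defaulting to ./logs");
      dir = "logs";
    }
    return new File(dir);
  }
}
{code}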

 NullPointerException in TaskRunner.java when system property hadoop.log.dir 
 is not set
 

 Key: HADOOP-2681
 URL: https://issues.apache.org/jira/browse/HADOOP-2681
 Project: Hadoop Common
  Issue Type: Bug
  Components: conf
Affects Versions: 0.15.2
Reporter: Xu Zhang
Priority: Minor
  Labels: newbie

 Currently, a NullPointerException is thrown on line 321 in TaskRunner.java 
 when the system property hadoop.log.dir is not set.  Instead of a 
 NullPointerException, I expected a default value for hadoop.log.dir to be 
 used, or to see a more meaningful error message that could have helped me 
 figure out what was wrong (for example, telling me that I needed to set 
 hadoop.log.dir and how to do so).
 Here is one instance of such exceptions:
 WARN mapred.TaskRunner: task_200801181719_0001_m_00_0 Child Error
 java.lang.NullPointerException
   at java.io.File.<init>(File.java:222)
   at org.apache.hadoop.mapred.TaskRunner.run(TaskRunner.java:321)



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HADOOP-2689) RegEx support for expressing datanodes in the slaves conf files

2014-07-17 Thread Allen Wittenauer (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-2689?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14065453#comment-14065453
 ] 

Allen Wittenauer commented on HADOOP-2689:
--

It should be noted that with HADOOP-9902, it is possible to replace the slaves 
code to handle this in a much easier fashion.  But this would be a good 
enhancement for the follow-up to that JIRA.

 RegEx support for expressing datanodes in the slaves conf files
 ---

 Key: HADOOP-2689
 URL: https://issues.apache.org/jira/browse/HADOOP-2689
 Project: Hadoop Common
  Issue Type: Improvement
  Components: conf
Affects Versions: 0.14.4
 Environment: All
Reporter: Venkat Ramachandran

 It will be very handy if datanodes and task trackers can be expressed in the 
 slave conf file as regular expressions.
 For example, 
 machine[1-200].corp
 machine[400-679].corp
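 A small sketch of what the expansion could look like (illustrative only; the entry syntax is just the example above, not an agreed format):
{code}
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class SlaveRangeExpander {
  // Expand an entry such as "machine[1-200].corp" into concrete hostnames;
  // entries without a [lo-hi] range pass through unchanged.
  static List<String> expand(String entry) {
    List<String> hosts = new ArrayList<String>();
    Matcher m = Pattern.compile("\\[(\\d+)-(\\d+)\\]").matcher(entry);
    if (!m.find()) {
      hosts.add(entry);
      return hosts;
    }
    int lo = Integer.parseInt(m.group(1));
    int hi = Integer.parseInt(m.group(2));
    for (int i = lo; i <= hi; i++) {
      hosts.add(entry.substring(0, m.start()) + i + entry.substring(m.end()));
    }
    return hosts;
  }
}
{code}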



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HADOOP-2715) Review and document '_' prefix convention in input directories

2014-07-17 Thread Allen Wittenauer (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-2715?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Allen Wittenauer updated HADOOP-2715:
-

Labels: newbie  (was: )

 Review and document '_' prefix convention in input directories
 --

 Key: HADOOP-2715
 URL: https://issues.apache.org/jira/browse/HADOOP-2715
 Project: Hadoop Common
  Issue Type: Bug
  Components: documentation
Reporter: eric baldeschwieler
  Labels: newbie

 We use files and directories prefixed with '_' to store logs, metadata and 
 other info that might be useful to the owner of a job within the output 
 directory.  The standard input methods then ignore such files by default.
 HADOOP-2391 led to some discussion of the '_' convention in output 
 directories.  Not all developers' input formats support this.  We should 
 review the convention and document it well so that future input methods 
 support it, or we should come up with an alternate approach.  
 My hope is that after some discussion we will close this bug by creating a 
 documentation patch explaining the convention.
 It sounds like the convention is implemented via some input filter classes.  
 We should discuss if this generic solution is helping or obscuring the intent 
 of the convention.  Perhaps we should just have a non-configurable filter, so 
 '_' prefixed files are treated like '.' prefixed files by most unix tools.
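 For reference, the convention as an input filter is tiny; a sketch matching FileInputFormat's default hidden-file behaviour:
{code}
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.fs.PathFilter;

// Skip files and directories whose names start with '_' or '.', the same way
// FileInputFormat's default hidden-file filter does.
public class UnderscorePrefixFilter implements PathFilter {
  @Override
  public boolean accept(Path p) {
    String name = p.getName();
    return !name.startsWith("_") && !name.startsWith(".");
  }
}
{code}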



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HADOOP-2715) Review and document '_' prefix convention in input directories

2014-07-17 Thread Allen Wittenauer (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-2715?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Allen Wittenauer updated HADOOP-2715:
-

Component/s: documentation

 Review and document '_' prefix convention in input directories
 --

 Key: HADOOP-2715
 URL: https://issues.apache.org/jira/browse/HADOOP-2715
 Project: Hadoop Common
  Issue Type: Bug
  Components: documentation
Reporter: eric baldeschwieler
  Labels: newbie

 We use files and directories prefixed with '_' to store logs, metadata and 
 other info that might be useful to the owner of a job within the output 
 directory.  The standard input methods then ignore such files by default.
 HADOOP-2391 led to some discussion of the '_' convention in output 
 directories.  Not all developers' input formats support this.  We should 
 review the convention and document it well so that future input methods 
 support it, or we should come up with an alternate approach.  
 My hope is that after some discussion we will close this bug by creating a 
 documentation patch explaining the convention.
 It sounds like the convention is implemented via some input filter classes.  
 We should discuss if this generic solution is helping or obscuring the intent 
 of the convention.  Perhaps we should just have a non-configurable filter, so 
 '_' prefixed files are treated like '.' prefixed files by most unix tools.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HADOOP-10793) KeyShell and CredentialShell args should use single-dash style

2014-07-17 Thread Owen O'Malley (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-10793?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14065475#comment-14065475
 ] 

Owen O'Malley commented on HADOOP-10793:


Andrew, please change the jira so that all of the commands support the proper 
two dashes.

 KeyShell and CredentialShell args should use single-dash style
 --

 Key: HADOOP-10793
 URL: https://issues.apache.org/jira/browse/HADOOP-10793
 Project: Hadoop Common
  Issue Type: Improvement
  Components: security
Affects Versions: 3.0.0
Reporter: Mike Yoder

 Follow-on from HADOOP-10736 as per [~andrew.wang] - the key shell uses the 
 gnu double dash style for command line args, while other command line 
 programs use a single dash.  Consider changing this, and consider another 
 argument parsing scheme, like the CommandLine class.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HADOOP-10607) Create an API to Separate Credentials/Password Storage from Applications

2014-07-17 Thread Owen O'Malley (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-10607?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14065485#comment-14065485
 ] 

Owen O'Malley commented on HADOOP-10607:


Andrew, it has to get released before it can be used by external components. 
Is there a technical concern with it getting into the 2.6 release?

 Create an API to Separate Credentials/Password Storage from Applications
 

 Key: HADOOP-10607
 URL: https://issues.apache.org/jira/browse/HADOOP-10607
 Project: Hadoop Common
  Issue Type: New Feature
  Components: security
Reporter: Larry McCay
Assignee: Larry McCay
 Fix For: 3.0.0, 2.6.0

 Attachments: 10607-10.patch, 10607-11.patch, 10607-12.patch, 
 10607-2.patch, 10607-3.patch, 10607-4.patch, 10607-5.patch, 10607-6.patch, 
 10607-7.patch, 10607-8.patch, 10607-9.patch, 10607-branch-2.patch, 10607.patch


 As with the filesystem API, we need to provide a generic mechanism to support 
 multiple credential storage mechanisms that are potentially from third 
 parties. 
 We need the ability to eliminate the storage of passwords and secrets in 
 clear text within configuration files or within code.
 Toward that end, I propose an API that is configured using a list of URLs of 
 CredentialProviders. The implementation will look for implementations using 
 the ServiceLoader interface and thus support third party libraries.
 Two providers will be included in this patch: one using the credentials cache 
 in MapReduce jobs and the other using Java KeyStores from either HDFS or the 
 local file system. 
 A CredShell CLI will also be included in this patch which provides the 
 ability to manage the credentials within the stores.
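 As a rough illustration of the ServiceLoader-based lookup being proposed, a 
 sketch might look like the following; the interface and class names below are 
 hypothetical stand-ins, not the API from the attached patches:

{code}
import java.net.URI;
import java.util.ArrayList;
import java.util.List;
import java.util.ServiceLoader;

// Hypothetical sketch of ServiceLoader-based discovery; SecretProviderFactory
// and SecretProvider are illustrative names, not the API from the patch.
interface SecretProvider {
  char[] getCredential(String alias);
}

interface SecretProviderFactory {
  // Returns a provider if this factory understands the URI's scheme, else null.
  SecretProvider createProvider(URI providerUri);
}

class SecretProviders {
  // Resolves the configured list of provider URIs against factories that
  // third-party jars register via META-INF/services.
  static List<SecretProvider> resolve(List<URI> providerUris) {
    List<SecretProvider> providers = new ArrayList<>();
    ServiceLoader<SecretProviderFactory> factories =
        ServiceLoader.load(SecretProviderFactory.class);
    for (URI uri : providerUris) {
      for (SecretProviderFactory factory : factories) {
        SecretProvider provider = factory.createProvider(uri);
        if (provider != null) {
          providers.add(provider);
          break;
        }
      }
    }
    return providers;
  }
}
{code}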



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HADOOP-10793) KeyShell and CredentialShell args should use single-dash style

2014-07-17 Thread Owen O'Malley (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-10793?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14065493#comment-14065493
 ] 

Owen O'Malley commented on HADOOP-10793:


Sorry, to be clear I mean that you should fix the rest of the Hadoop commands 
to accept either one or two dashes. 

Obviously the old commands can't require two dashes without breaking 
compatibility.

 KeyShell and CredentialShell args should use single-dash style
 --

 Key: HADOOP-10793
 URL: https://issues.apache.org/jira/browse/HADOOP-10793
 Project: Hadoop Common
  Issue Type: Improvement
  Components: security
Affects Versions: 3.0.0
Reporter: Mike Yoder

 Follow-on from HADOOP-10736 as per [~andrew.wang] - the key shell uses the 
 GNU double-dash style for command line args, while other command line 
 programs use a single dash.  Consider changing this, and consider another 
 argument parsing scheme, like the CommandLine class.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Resolved] (HADOOP-2776) Web interface uses internal hostnames on EC2

2014-07-17 Thread Allen Wittenauer (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-2776?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Allen Wittenauer resolved HADOOP-2776.
--

Resolution: Won't Fix

I'm going to close this as won't fix. I don't think this is anything that we 
actually can fix here other than providing a complicated hostname mapping 
system for web interfaces.

Part of the frustration I'm sure stems from a misunderstanding of what is 
actually happening:

bq. The slaves file has the public names listed.

The slaves file is only used by the shell code to run ssh connections.  It has 
absolutely zero impact on the core of Hadoop.

bq.  Resolving a public name inside EC2 returns the private IP (which would 
reverse to the internal DNS name).

Hadoop makes the perfectly valid assumption that the hostname the system tells 
us is a valid, network-connectable hostname.  It is, from the inside of EC2.  
We have no way to know that you are attempting to connect from a completely 
different address that is being forwarded from some external entity.

Proxying connections into a private network space is a perfectly valid 
solution.  

 Web interface uses internal hostnames on EC2
 

 Key: HADOOP-2776
 URL: https://issues.apache.org/jira/browse/HADOOP-2776
 Project: Hadoop Common
  Issue Type: Bug
  Components: contrib/cloud
Affects Versions: 0.15.1
 Environment: EC2 ami-a324c1ca
Reporter: David Phillips

 The web interface, for example http://$MASTER_HOST:50030/machines.jsp, uses 
 internal hostnames when running on EC2.  This makes it impossible to access 
 from outside EC2.
 The slaves file has the public names listed.  Resolving a public name inside 
 EC2 returns the private IP (which would reverse to the internal DNS name).



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HADOOP-10793) KeyShell and CredentialShell args should use single-dash style

2014-07-17 Thread Andrew Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-10793?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14065497#comment-14065497
 ] 

Andrew Wang commented on HADOOP-10793:
--

Hey Owen, since that'd be incompatible, we can't switch everything over until 
3.0. If this stuff wants to appear in a 2.x, I think consistency is the most 
important consideration, and thus it should use a single dash.

Even for 3.0, I don't think the ROI is positive. The Hadoop commands have used 
a single dash forever, and there's precedent for this style in the {{java}} 
command. Hadoop users at this point are used to it.

 KeyShell and CredentialShell args should use single-dash style
 --

 Key: HADOOP-10793
 URL: https://issues.apache.org/jira/browse/HADOOP-10793
 Project: Hadoop Common
  Issue Type: Improvement
  Components: security
Affects Versions: 3.0.0
Reporter: Mike Yoder

 Follow-on from HADOOP-10736 as per [~andrew.wang] - the key shell uses the 
 GNU double-dash style for command line args, while other command line 
 programs use a single dash.  Consider changing this, and consider another 
 argument parsing scheme, like the CommandLine class.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HADOOP-10793) KeyShell and CredentialShell args should use single-dash style

2014-07-17 Thread Owen O'Malley (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-10793?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14065508#comment-14065508
 ] 

Owen O'Malley commented on HADOOP-10793:


It won't be incompatible if you accept either one or two dashes.
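
A minimal sketch of what accepting either form could look like, purely as an 
illustration and not code from any patch: normalize a leading double dash to a 
single dash before the existing option parsing runs, so old callers are 
unaffected and GNU-style callers are accepted too.

{code}
// Illustrative sketch only, not code from any patch: fold a leading "--" down
// to "-" before the existing single-dash parsing runs, so both forms work.
final class DashNormalizer {
  static String[] normalize(String[] args) {
    String[] normalized = new String[args.length];
    for (int i = 0; i < args.length; i++) {
      String arg = args[i];
      // Leave a bare "--" alone; only rewrite "--option" style tokens.
      normalized[i] = (arg.startsWith("--") && arg.length() > 2)
          ? arg.substring(1)
          : arg;
    }
    return normalized;
  }
}
{code}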

 KeyShell and CredentialShell args should use single-dash style
 --

 Key: HADOOP-10793
 URL: https://issues.apache.org/jira/browse/HADOOP-10793
 Project: Hadoop Common
  Issue Type: Improvement
  Components: security
Affects Versions: 3.0.0
Reporter: Mike Yoder

 Follow-on from HADOOP-10736 as per [~andrew.wang] - the key shell uses the 
 GNU double-dash style for command line args, while other command line 
 programs use a single dash.  Consider changing this, and consider another 
 argument parsing scheme, like the CommandLine class.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HADOOP-10817) ProxyUsers configuration should support configurable prefixes

2014-07-17 Thread Andrew Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-10817?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14065513#comment-14065513
 ] 

Andrew Wang commented on HADOOP-10817:
--

Hi Tucu, thanks for working on this, I took a look and had some review comments:

* ProxyUsers#refreshSUGC, we should init the new sip before assigning it to the 
volatile variable. This way it's not visible before it's ready (see the sketch 
after these notes).

DefaultImpersonationProvider:
* Good time to add some comments about the regexes, namely example matches
* Would be good to Precondition check that init has been called wherever we use 
configPrefix or related
* Some basic checking of configPrefix in init as well, e.g. not empty, not null

* ImpersonationProvider needs class annotations. Just a reminder, if this is a 
public interface, adding a new method is incompatible. I do see it in branch-2 
already.
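
To illustrate the first point above about publishing through the volatile 
field, here is a minimal safe-publication sketch; the types and names are made 
up and are not the actual ProxyUsers code:

{code}
// Illustrative sketch of safe publication through a volatile field; the
// Provider type and method names are made up, not the actual ProxyUsers code.
final class ProviderHolder {
  interface Provider {
    void init(String configPrefix);
    boolean isAllowed(String user, String host);
  }

  private volatile Provider provider;

  void refresh(Provider fresh, String configPrefix) {
    fresh.init(configPrefix);  // finish all initialization first...
    provider = fresh;          // ...then publish via the volatile write
  }

  boolean check(String user, String host) {
    return provider.isAllowed(user, host);  // readers see a fully built object
  }
}
{code}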

 ProxyUsers configuration should support configurable prefixes 
 --

 Key: HADOOP-10817
 URL: https://issues.apache.org/jira/browse/HADOOP-10817
 Project: Hadoop Common
  Issue Type: Improvement
  Components: security
Affects Versions: 3.0.0
Reporter: Alejandro Abdelnur
Assignee: Alejandro Abdelnur
 Attachments: HADOOP-10817.patch, HADOOP-10817.patch


 Currently {{ProxyUsers}} and the {{ImpersonationProvider}} are hardcoded to 
 use {{hadoop.proxyuser.}} prefixes for loading proxy user configuration.
 Adding the possibility of using a custom prefix will enable reusing the 
 {{ProxyUsers}} class from other components (e.g. HttpFS and KMS).
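 A rough sketch of what a configurable prefix could look like when building the 
 per-user lookup keys; the helper below is illustrative, and only the 
 {{hadoop.proxyuser.}} pattern itself comes from the issue:

{code}
// Illustrative sketch only: derive the per-user config keys from a
// caller-supplied prefix instead of the hardcoded "hadoop.proxyuser." one.
final class ProxyUserKeys {
  private final String prefix;  // e.g. "hadoop.proxyuser", "httpfs.proxyuser"

  ProxyUserKeys(String prefix) {
    // Tolerate a trailing dot so both "x.proxyuser" and "x.proxyuser." work.
    this.prefix = prefix.endsWith(".")
        ? prefix.substring(0, prefix.length() - 1)
        : prefix;
  }

  String hostsKey(String user)  { return prefix + "." + user + ".hosts"; }
  String groupsKey(String user) { return prefix + "." + user + ".groups"; }
}
{code}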



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Resolved] (HADOOP-2835) hadoop fs -help ... should not require a NameNode to show help messages

2014-07-17 Thread Allen Wittenauer (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-2835?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Allen Wittenauer resolved HADOOP-2835.
--

Resolution: Fixed

Long ago fixed.

 hadoop fs -help ... should not require a NameNode to show help messages
 -

 Key: HADOOP-2835
 URL: https://issues.apache.org/jira/browse/HADOOP-2835
 Project: Hadoop Common
  Issue Type: Improvement
  Components: fs
Reporter: Tsz Wo Nicholas Sze
Priority: Minor

 For example, if we do hadoop fs -help get before starting a NameNode, we 
 will get 
 {code}
 bash-3.2$ ./bin/hadoop fs -help get
 08/02/14 15:59:52 INFO ipc.Client: Retrying connect to server: 
 some-host:some-port. Already tried 1 time(s).
 08/02/14 15:59:54 INFO ipc.Client: Retrying connect to server: 
 some-host:some-port. Already tried 2 time(s).
 08/02/14 15:59:56 INFO ipc.Client: Retrying connect to server: 
 some-host:some-port. Already tried 3 time(s).
 08/02/14 15:59:58 INFO ipc.Client: Retrying connect to server: 
 some-host:some-port. Already tried 4 time(s).
 08/02/14 16:00:00 INFO ipc.Client: Retrying connect to server: 
 some-host:some-port. Already tried 5 time(s).
 08/02/14 16:00:02 INFO ipc.Client: Retrying connect to server: 
 some-host:some-port. Already tried 6 time(s).
 08/02/14 16:00:04 INFO ipc.Client: Retrying connect to server: 
 some-host:some-port. Already tried 7 time(s).
 08/02/14 16:00:06 INFO ipc.Client: Retrying connect to server: 
 some-host:some-port. Already tried 8 time(s).
 08/02/14 16:00:08 INFO ipc.Client: Retrying connect to server: 
 some-host:some-port. Already tried 9 time(s).
 08/02/14 16:00:10 INFO ipc.Client: Retrying connect to server: 
 some-host:some-port. Already tried 10 time(s).
 Bad connection to FS. command aborted.
 {code}



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Resolved] (HADOOP-2846) Large input data-sets throw java.net.SocketTimeoutException: timed out waiting for rpc response exception

2014-07-17 Thread Allen Wittenauer (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-2846?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Allen Wittenauer resolved HADOOP-2846.
--

Resolution: Cannot Reproduce

Closing this as stale, especially since it is likely long since fixed.

 Large input data-sets throw java.net.SocketTimeoutException: timed out 
 waiting for rpc response exception
 ---

 Key: HADOOP-2846
 URL: https://issues.apache.org/jira/browse/HADOOP-2846
 Project: Hadoop Common
  Issue Type: Bug
Affects Versions: 0.15.3
Reporter: Amir Youssefi

 Pig scripts can run over a data set of 1 day. Using the same script and same 
 number of nodes on a larger data set (of 30 days) fails and throws the 
 following exception after 1+ hour of running. 
 java.net.SocketTimeoutException: timed out waiting for rpc response
 at org.apache.hadoop.ipc.Client.call(Client.java:484)
 at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:184)
 at $Proxy1.getJobStatus(Unknown Source)
 at sun.reflect.GeneratedMethodAccessor12.invoke(Unknown Source)
 at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
 at java.lang.reflect.Method.invoke(Method.java:597)
 at 
 org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:82)
 at 
 org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:59)
 at $Proxy1.getJobStatus(Unknown Source)
 at 
 org.apache.hadoop.mapred.JobClient$NetworkedJob.ensureFreshStatus(JobClient.java:182)
 at 
 org.apache.hadoop.mapred.JobClient$NetworkedJob.isComplete(JobClient.java:237)
 at 
 org.apache.pig.impl.mapreduceExec.MapReduceLauncher.launchPig(MapReduceLauncher.java:189)
 at 
 org.apache.pig.impl.physicalLayer.POMapreduce.open(POMapreduce.java:136)
 at 
 org.apache.pig.impl.physicalLayer.POMapreduce.open(POMapreduce.java:129)
 at 
 org.apache.pig.impl.physicalLayer.POMapreduce.open(POMapreduce.java:129)
 at 
 org.apache.pig.impl.physicalLayer.PhysicalPlan.exec(PhysicalPlan.java:39)
 at 
 org.apache.pig.impl.physicalLayer.IntermedResult.exec(IntermedResult.java:122)
 at org.apache.pig.PigServer.store(PigServer.java:445)
 at org.apache.pig.PigServer.store(PigServer.java:413)
 at 
 org.apache.pig.tools.grunt.GruntParser.processStore(GruntParser.java:135)
 at 
 org.apache.pig.tools.pigscript.parser.PigScriptParser.parse(PigScriptParser.java:327)
 at 
 org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:54)
 at org.apache.pig.tools.grunt.Grunt.exec(Grunt.java:54)
 at org.apache.pig.Main.main(Main.java:258)
 timed out waiting for rpc response
 Re-running always hits the same exception at 3% progress.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Resolved] (HADOOP-2860) ant tar should not copy the modified configs into the tarball

2014-07-17 Thread Allen Wittenauer (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-2860?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Allen Wittenauer resolved HADOOP-2860.
--

Resolution: Won't Fix

We no longer use ant.

(insert pink panther theme here, using the words dead ant to represent the 
horn section)

 ant tar should not copy the modified configs into the tarball
 ---

 Key: HADOOP-2860
 URL: https://issues.apache.org/jira/browse/HADOOP-2860
 Project: Hadoop Common
  Issue Type: Bug
  Components: build
Reporter: Owen O'Malley

 When generating releases, it is counter-intuitive that the tarball contains 
 the configuration files from the developer's test environment.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HADOOP-10793) KeyShell and CredentialShell args should use single-dash style

2014-07-17 Thread Andrew Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-10793?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14065541#comment-14065541
 ] 

Andrew Wang commented on HADOOP-10793:
--

How about this:

- We do single-dash for Key/CredShell in branch-2
- File a new JIRA for trunk to switch everything over to some new style

This way we have consistency in branch-2 and consistency in trunk. I'd like all 
commands to behave the same way.

Metapoint, I think that accepting both - and -- as the same is not great, since 
it's inventing our own new style. It's neither UNIX-style long and short args, 
nor Java/Hadoop-style single-dash always. I'd like to stick with something with 
at least some precedent.

 KeyShell and CredentialShell args should use single-dash style
 --

 Key: HADOOP-10793
 URL: https://issues.apache.org/jira/browse/HADOOP-10793
 Project: Hadoop Common
  Issue Type: Improvement
  Components: security
Affects Versions: 3.0.0
Reporter: Mike Yoder

 Follow-on from HADOOP-10736 as per [~andrew.wang] - the key shell uses the 
 GNU double-dash style for command line args, while other command line 
 programs use a single dash.  Consider changing this, and consider another 
 argument parsing scheme, like the CommandLine class.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Resolved] (HADOOP-2864) Improve the Scalability and Robustness of IPC

2014-07-17 Thread Allen Wittenauer (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-2864?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Allen Wittenauer resolved HADOOP-2864.
--

Resolution: Fixed

This has changed so much since this JIRA was filed that I'm just going to close 
this as stale.

 Improve the Scalability and Robustness of IPC
 -

 Key: HADOOP-2864
 URL: https://issues.apache.org/jira/browse/HADOOP-2864
 Project: Hadoop Common
  Issue Type: Improvement
  Components: ipc
Affects Versions: 0.16.0
Reporter: Hairong Kuang
Assignee: Hairong Kuang
 Attachments: RPCScalabilityDesignWeb.pdf


 This JIRA is intended to enhance IPC's scalability and robustness. 
 Currently an IPC server can easily hang due to a disk failure or garbage 
 collection, during which it cannot respond to the clients promptly. This has 
 caused a lot of dropped calls and delayed responses, and thus many running 
 applications fail on timeout. On the other side, if busy clients send a lot of 
 requests to the server in a short period of time, or too many clients 
 communicate with the server simultaneously, the server may be swamped by 
 requests and cannot respond promptly. 
 The proposed changes aim to 
 # provide better client/server coordination
 #* The server should be able to throttle clients during bursts of requests.
 #* A slow client should not prevent the server from serving other clients.
 #* A temporarily hanging server should not cause catastrophic failures to 
 clients.
 # Client/server should detect remote-side failures. Examples of failures 
 include: (1) the remote host has crashed; (2) the remote host has crashed and 
 then rebooted; (3) the remote process has crashed or been shut down by an operator;
 # Fairness. Each client should be able to make progress.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Resolved] (HADOOP-2882) HOD should put line breaks in to hadoop-site.xml

2014-07-17 Thread Allen Wittenauer (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-2882?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Allen Wittenauer resolved HADOOP-2882.
--

Resolution: Won't Fix

HOD is just a legend. Did it really exist? No one knows.

 HOD should put line breaks in to hadoop-site.xml
 

 Key: HADOOP-2882
 URL: https://issues.apache.org/jira/browse/HADOOP-2882
 Project: Hadoop Common
  Issue Type: Improvement
Reporter: Owen O'Malley

 It would help a lot if the hadoop-site files generated by HOD were readable. 
 Newlines would be a good start.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Resolved] (HADOOP-2892) providing temp space management for applications

2014-07-17 Thread Allen Wittenauer (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-2892?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Allen Wittenauer resolved HADOOP-2892.
--

Resolution: Duplicate

In one sense, this has been resolved by the usage of /tmp.  But in reality, 
this request has been reborn in HDFS-6382.

 providing temp space management for applications
 

 Key: HADOOP-2892
 URL: https://issues.apache.org/jira/browse/HADOOP-2892
 Project: Hadoop Common
  Issue Type: New Feature
Reporter: Olga Natkovich

 It would be great if Hadoop could provide temp space for applications to use. 
 This would be useful for any applications that chain M-R jobs, perform 
 checkpoints, and need to store some application-specific temp results. 
 DeleteOnExit for files and directories would be ideal.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Resolved] (HADOOP-2921) align map splits on sorted files with key boundaries

2014-07-17 Thread Allen Wittenauer (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-2921?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Allen Wittenauer resolved HADOOP-2921.
--

Resolution: Fixed

Stale

 align map splits on sorted files with key boundaries
 

 Key: HADOOP-2921
 URL: https://issues.apache.org/jira/browse/HADOOP-2921
 Project: Hadoop Common
  Issue Type: New Feature
Affects Versions: 0.16.0
Reporter: Joydeep Sen Sarma

 (This is something that we have implemented in the application layer - it may 
 be useful to have in Hadoop itself.)
 Long-term log storage systems often keep data sorted (by some sort-key). 
 Future computations on such files can often benefit from this sort order. If 
 the job requires grouping by the sort-key, then it should be possible to do 
 the reduction in the map stage itself.
 This is not natively supported by Hadoop (except in the degenerate case of 1 
 map file per task) since splits can span the sort-key. However, aligning the 
 data read by the map task to sort-key boundaries is straightforward - and 
 this would be a useful capability to have in Hadoop.
 The definition of the sort key should be left up to the application (it's not 
 necessarily the key field in a SequenceFile) through a generic interface - 
 but otherwise the SequenceFile and text file readers can use the extracted 
 sort key to align map task data with key boundaries.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Resolved] (HADOOP-2922) sequencefiles without keys

2014-07-17 Thread Allen Wittenauer (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-2922?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Allen Wittenauer resolved HADOOP-2922.
--

Resolution: Won't Fix

Stale

 sequencefiles without keys
 --

 Key: HADOOP-2922
 URL: https://issues.apache.org/jira/browse/HADOOP-2922
 Project: Hadoop Common
  Issue Type: New Feature
Affects Versions: 0.16.0
Reporter: Joydeep Sen Sarma

 SequenceFiles are invaluable for storing compressed/binary data. But when we 
 use them to store serialized records, we don't use the key part at all (we 
 just put something dummy there to satisfy the API). I have heard of other 
 projects using the same tactic (Jaql/Cascading).
 So this is a request for a modified version of SequenceFiles that doesn't 
 incur the space and compute overhead of processing/storing these dummy keys.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Resolved] (HADOOP-2975) IPC server should not allocate a buffer for each request

2014-07-17 Thread Allen Wittenauer (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-2975?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Allen Wittenauer resolved HADOOP-2975.
--

Resolution: Incomplete

A lot has changed here. Closing as stale.

 IPC server should not allocate a buffer for each request
 

 Key: HADOOP-2975
 URL: https://issues.apache.org/jira/browse/HADOOP-2975
 Project: Hadoop Common
  Issue Type: Improvement
  Components: ipc
Affects Versions: 0.16.0
Reporter: Hairong Kuang
Assignee: Ankur
 Attachments: Hadoop-2975-v1.patch, Hadoop-2975-v2.patch, 
 Hadoop-2975-v3.patch


 Currently the IPC server allocates a buffer for each incoming request. The 
 buffer is thrown away after the request is deserialized. This leads to very 
 inefficient heap utilization. It would be nicer if all requests from one 
 connection could share a common buffer, since the IPC server reads only one 
 request from a socket at a time.
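 A minimal sketch of the per-connection reuse idea, as an illustration only 
 rather than the actual IPC server code: keep one growable buffer on the 
 connection and reuse it for every request on that connection.

{code}
import java.io.DataInputStream;
import java.io.IOException;

// Illustrative sketch, not the actual IPC server code: keep one growable
// buffer per connection and reuse it for every request on that connection.
final class ConnectionBuffer {
  private byte[] data = new byte[4096];

  byte[] readRequest(DataInputStream in, int requestLength) throws IOException {
    if (requestLength > data.length) {
      data = new byte[requestLength];  // grow only when a larger request arrives
    }
    in.readFully(data, 0, requestLength);
    return data;  // only the first requestLength bytes are valid
  }
}
{code}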



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Resolved] (HADOOP-2960) A mapper should use some heuristics to decide whether to run the combiner during spills

2014-07-17 Thread Allen Wittenauer (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-2960?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Allen Wittenauer resolved HADOOP-2960.
--

Resolution: Won't Fix

Closing as won't fix, given the -1.

 A mapper should use some heuristics to decide whether to run the combiner 
 during spills
 ---

 Key: HADOOP-2960
 URL: https://issues.apache.org/jira/browse/HADOOP-2960
 Project: Hadoop Common
  Issue Type: Bug
Reporter: Runping Qi

 Right now, the combiner, if set, will be called for each spill, no matter 
 whether the combiner can actually reduce the values.
 The mapper should use some heuristics to decide whether to run the combiner 
 during spills.
 One such heuristic is to check the ratio of the number of keys to 
 the number of unique keys in the spill.
 The combiner will be called only if that ratio exceeds a certain threshold 
 (say 2).
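 A tiny sketch of the suggested heuristic, illustrative only and not taken from 
 any patch: count the records and distinct keys in the spill and only invoke 
 the combiner when keys repeat often enough for combining to pay off.

{code}
// Illustrative sketch of the heuristic described above, not from any patch:
// run the combiner only when keys repeat often enough for it to pay off.
final class CombinerHeuristic {
  private static final double MIN_RECORDS_PER_KEY = 2.0;  // example threshold

  static boolean shouldRunCombiner(long recordsInSpill, long uniqueKeysInSpill) {
    if (uniqueKeysInSpill == 0) {
      return false;  // empty spill, nothing to combine
    }
    double recordsPerKey = (double) recordsInSpill / uniqueKeysInSpill;
    return recordsPerKey >= MIN_RECORDS_PER_KEY;
  }
}
{code}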



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Resolved] (HADOOP-2980) slow reduce copies - map output locations not being fetched even when map complete

2014-07-17 Thread Allen Wittenauer (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-2980?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Allen Wittenauer resolved HADOOP-2980.
--

Resolution: Incomplete

I'm going to close this as stale.  If people are still seeing this as an issue, 
they should file a new jira with new data!

 slow reduce copies - map output locations not being fetched even when map 
 complete
 --

 Key: HADOOP-2980
 URL: https://issues.apache.org/jira/browse/HADOOP-2980
 Project: Hadoop Common
  Issue Type: Bug
Affects Versions: 0.15.3
Reporter: Joydeep Sen Sarma

 The maps are long finished. The reduces are stuck looking for map locations. 
 They make progress - but slowly. It almost seems like they get new map 
 locations every minute or so:
 2008-03-07 18:50:52,737 INFO org.apache.hadoop.mapred.ReduceTask: 
 task_200803041231_3586_r_21_0 done copying 
 task_200803041231_3586_m_004620_0 output from hadoop082.sf2p.facebook.com..
 2008-03-07 18:50:53,733 INFO org.apache.hadoop.mapred.ReduceTask: 
 task_200803041231_3586_r_21_0: Got 0 new map-outputs  0 obsolete 
 map-outputs from tasktracker and 0 map-outputs from previous failures
 2008-03-07 18:50:53,733 INFO org.apache.hadoop.mapred.ReduceTask: 
 task_200803041231_3586_r_21_0 Got 0 known map output location(s); 
 scheduling...
 ...
 2008-03-07 18:51:49,767 INFO org.apache.hadoop.mapred.ReduceTask: 
 task_200803041231_3586_r_21_0 Got 50 known map output location(s); 
 scheduling...
 2008-03-07 18:51:49,767 INFO org.apache.hadoop.mapred.ReduceTask: 
 task_200803041231_3586_r_21_0 Scheduled 41 of 50 known outputs (0 slow 
 hosts and 9 dup hosts)
 They get about 50 locations at a time, and this 1-minute delay pattern is 
 surprisingly common.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Resolved] (HADOOP-3037) Hudson needs to add src/test for checking javac warnings

2014-07-17 Thread Allen Wittenauer (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-3037?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Allen Wittenauer resolved HADOOP-3037.
--

Resolution: Not a Problem

Warning checks have been there for a while.

 Hudson needs to add src/test for checking javac warnings
 

 Key: HADOOP-3037
 URL: https://issues.apache.org/jira/browse/HADOOP-3037
 Project: Hadoop Common
  Issue Type: Improvement
  Components: build
Affects Versions: hudson
Reporter: Amareshwari Sriramadasu
 Fix For: hudson


 I think src/test is not included in the javac warnings checker.  HADOOP-3031 
 looks at the warnings introduced.
 Hudson needs to add src/test for checking javac warnings.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Resolved] (HADOOP-3120) Large #of tasks failing at one time can effectively hang the jobtracker

2014-07-17 Thread Allen Wittenauer (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-3120?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Allen Wittenauer resolved HADOOP-3120.
--

Resolution: Incomplete

I'm going to close this as stale. 

 Large #of tasks failing at one time can effectively hang the jobtracker 
 

 Key: HADOOP-3120
 URL: https://issues.apache.org/jira/browse/HADOOP-3120
 Project: Hadoop Common
  Issue Type: Bug
 Environment: Linux/Hadoop-15.3
Reporter: Pete Wyckoff
Priority: Minor

 We think that JobTracker.removeMarkedTaks does so much logging when this 
 happens (i.e. logging thousands of failed tasks per cycle) that nothing else can 
 go on (since it's called from a synchronized method) and thus by the next 
 cycle, the next waves of jobs have failed and we again have 10s of thousands 
 of failures to log and on and on.
 At least, the above is what we observed - just a continual printing of those 
 failures and nothing else happening on and on. Of course the original jobs 
 may have ultimately failed but new jobs come in to perpetuate the problem.
 This has happened to us a number of times and since we commented out the 
 log.info in that method we haven't had any problems. Although thousands and 
 thousands of task failures are hopefully not that common.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Resolved] (HADOOP-3122) test-patch target should check @SuppressWarnings(...)

2014-07-17 Thread Allen Wittenauer (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-3122?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Allen Wittenauer resolved HADOOP-3122.
--

Resolution: Incomplete

Closing this as stale.

 test-patch target should check @SuppressWarnings(...)
 -

 Key: HADOOP-3122
 URL: https://issues.apache.org/jira/browse/HADOOP-3122
 Project: Hadoop Common
  Issue Type: Improvement
  Components: build
Reporter: Tsz Wo Nicholas Sze
Priority: Minor

 The Java annotation @SuppressWarnings(...) can be used to get rid of 
 compiler warnings. In our patch process, QA should check for the 
 @SuppressWarnings(...) tag to prevent abuse of it.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Resolved] (HADOOP-3126) org.apache.hadoop.examples.RandomTextWriter$Counters fluctuate when RandonTextWriter job is running

2014-07-17 Thread Allen Wittenauer (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-3126?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Allen Wittenauer resolved HADOOP-3126.
--

Resolution: Fixed

I'm going to close this as fixed since it probably was.

 org.apache.hadoop.examples.RandomTextWriter$Counters fluctuate when 
 RandonTextWriter job is running
 ---

 Key: HADOOP-3126
 URL: https://issues.apache.org/jira/browse/HADOOP-3126
 Project: Hadoop Common
  Issue Type: Bug
Reporter: Runping Qi

 On the web GUI page, the values for RECORDS_WRITTEN and BYTES_WRITTEN do not 
 increase monotonically.
 Rather, their values go up and down.
 I suspect something is wrong with how the counters are updated.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Resolved] (HADOOP-3148) build-contrib.xml should inherit hadoop version parameter from root build.xml

2014-07-17 Thread Allen Wittenauer (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-3148?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Allen Wittenauer resolved HADOOP-3148.
--

Resolution: Fixed

Stale issue.

 build-contrib.xml should inherit hadoop version parameter from root build.xml
 -

 Key: HADOOP-3148
 URL: https://issues.apache.org/jira/browse/HADOOP-3148
 Project: Hadoop Common
  Issue Type: Bug
  Components: build
Reporter: Vinod Kumar Vavilapalli
Priority: Minor

 This is needed in HOD (and may be useful in other contrib projects), which, 
 in some cases, may be compiled and built separately. After HADOOP-3137, HOD 
 will obtain its version from the build parameter ${version}, and this will fail 
 to give the proper version when built independently (at the src/contrib/hod level).



--
This message was sent by Atlassian JIRA
(v6.2#6252)

