[jira] [Updated] (HADOOP-10842) CryptoExtension generateEncryptedKey method should receive the key name
[ https://issues.apache.org/jira/browse/HADOOP-10842?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arun Suresh updated HADOOP-10842: - Status: Patch Available (was: Open) CryptoExtension generateEncryptedKey method should receive the key name --- Key: HADOOP-10842 URL: https://issues.apache.org/jira/browse/HADOOP-10842 Project: Hadoop Common Issue Type: Bug Components: security Affects Versions: 3.0.0 Reporter: Alejandro Abdelnur Assignee: Arun Suresh Attachments: HADOOP-10842-10841-COMBO.1.patch, HADOOP-10842.1.patch Generating an EEK should always be done using the current keyversion of a key name. We should enforce that in the API by handing off EEKs for the latest keyversion of a keyname only; thus we should ask for EEKs by keyname, and the {{CryptoExtension}} should use the latest keyversion. -- This message was sent by Atlassian JIRA (v6.2#6252)
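[Editor's sketch] The proposed change reduces the API to a key name; the extension then resolves the latest key version itself. Below is a minimal, hypothetical illustration of that shape, not the attached patch; the {{EncryptedKeyVersion}} type (see the sketch under HADOOP-10841 below) and the exception list are assumptions.
{code}
import java.io.IOException;
import java.security.GeneralSecurityException;

public interface CryptoExtension {
  // Generate an encrypted encryption key (EEK) for the given key name.
  // The implementation must look up and use the latest version of that key,
  // so callers can never pin an EEK to a stale key version.
  EncryptedKeyVersion generateEncryptedKey(String encryptionKeyName)
      throws IOException, GeneralSecurityException;
}
{code}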
[jira] [Commented] (HADOOP-10641) Introduce Coordination Engine
[ https://issues.apache.org/jira/browse/HADOOP-10641?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14064621#comment-14064621 ] Konstantin Boudnik commented on HADOOP-10641: - As has been proposed above and agreed during the meet-up yesterday, I will go ahead and create a new branch {{ConsensusNode}} off the trunk, so we'll start adding the implementation there. Introduce Coordination Engine - Key: HADOOP-10641 URL: https://issues.apache.org/jira/browse/HADOOP-10641 Project: Hadoop Common Issue Type: New Feature Affects Versions: 3.0.0 Reporter: Konstantin Shvachko Assignee: Plamen Jeliazkov Attachments: HADOOP-10641.patch, HADOOP-10641.patch, HADOOP-10641.patch, hadoop-coordination.patch Coordination Engine (CE) is a system that allows agreement on a sequence of events in a distributed system. To be reliable, the CE should itself be distributed. A Coordination Engine can be based on different algorithms (Paxos, Raft, 2PC, ZAB) and have different implementations, depending on use cases and on reliability, availability, and performance requirements. The CE should have a common API so that it can serve as a pluggable component in different projects. The immediate beneficiaries are HDFS (HDFS-6469) and HBase (HBASE-10909). The first implementation is proposed to be based on ZooKeeper. -- This message was sent by Atlassian JIRA (v6.2#6252)
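[Editor's sketch] Since the CE is meant to be a common, pluggable API rather than one implementation, a rough interface sketch may help. Everything below is a hypothetical illustration of the description above, not the attached patch:
{code}
import java.io.IOException;
import java.io.Serializable;

// Hypothetical pluggable Coordination Engine API: clients submit proposals,
// and the engine delivers the agreed-upon events to a learner in the same
// global order on every node, regardless of whether the backend is Paxos,
// Raft, 2PC, ZAB, or (as proposed first) ZooKeeper.
public interface CoordinationEngine {

  // Submit a proposal; the engine decides its position in the global order.
  void submitProposal(Proposal proposal) throws IOException;

  // Register the callback that learns agreements in the agreed order.
  void registerLearner(AgreementListener listener);

  interface Proposal extends Serializable {}

  interface AgreementListener {
    // Invoked once per agreement, in the same order on every node.
    void agreementReached(Proposal proposal);
  }
}
{code}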
[jira] [Commented] (HADOOP-10641) Introduce Coordination Engine
[ https://issues.apache.org/jira/browse/HADOOP-10641?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14064623#comment-14064623 ] Alex Newman commented on HADOOP-10641: -- Hey dude. Should we delay this a bit? Introduce Coordination Engine - Key: HADOOP-10641 URL: https://issues.apache.org/jira/browse/HADOOP-10641 Project: Hadoop Common Issue Type: New Feature Affects Versions: 3.0.0 Reporter: Konstantin Shvachko Assignee: Plamen Jeliazkov Attachments: HADOOP-10641.patch, HADOOP-10641.patch, HADOOP-10641.patch, hadoop-coordination.patch Coordination Engine (CE) is a system that allows agreement on a sequence of events in a distributed system. To be reliable, the CE should itself be distributed. A Coordination Engine can be based on different algorithms (Paxos, Raft, 2PC, ZAB) and have different implementations, depending on use cases and on reliability, availability, and performance requirements. The CE should have a common API so that it can serve as a pluggable component in different projects. The immediate beneficiaries are HDFS (HDFS-6469) and HBase (HBASE-10909). The first implementation is proposed to be based on ZooKeeper. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HADOOP-10842) CryptoExtension generateEncryptedKey method should receive the key name
[ https://issues.apache.org/jira/browse/HADOOP-10842?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14064642#comment-14064642 ] Hadoop QA commented on HADOOP-10842: {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12656225/HADOOP-10842-10841-COMBO.1.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 1 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:red}-1 core tests{color}. The patch failed these unit tests in hadoop-common-project/hadoop-common: org.apache.hadoop.ipc.TestIPC org.apache.hadoop.fs.TestSymlinkLocalFSFileSystem org.apache.hadoop.fs.TestSymlinkLocalFSFileContext {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-HADOOP-Build/4301//testReport/ Console output: https://builds.apache.org/job/PreCommit-HADOOP-Build/4301//console This message is automatically generated. CryptoExtension generateEncryptedKey method should receive the key name --- Key: HADOOP-10842 URL: https://issues.apache.org/jira/browse/HADOOP-10842 Project: Hadoop Common Issue Type: Bug Components: security Affects Versions: 3.0.0 Reporter: Alejandro Abdelnur Assignee: Arun Suresh Attachments: HADOOP-10842-10841-COMBO.1.patch, HADOOP-10842.1.patch Generating an EEK should always be done using the current keyversion of a key name. We should enforce that in the API by handing off EEKs for the latest keyversion of a keyname only; thus we should ask for EEKs by keyname, and the {{CryptoExtension}} should use the latest keyversion. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HADOOP-10841) EncryptedKeyVersion should have a key name property
[ https://issues.apache.org/jira/browse/HADOOP-10841?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14064643#comment-14064643 ] Hadoop QA commented on HADOOP-10841: {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12656223/HADOOP-10841.1.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 1 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:red}-1 core tests{color}. The patch failed these unit tests in hadoop-common-project/hadoop-common: org.apache.hadoop.fs.TestSymlinkLocalFSFileSystem org.apache.hadoop.fs.TestSymlinkLocalFSFileContext org.apache.hadoop.ipc.TestIPC {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-HADOOP-Build/4302//testReport/ Console output: https://builds.apache.org/job/PreCommit-HADOOP-Build/4302//console This message is automatically generated. EncryptedKeyVersion should have a key name property --- Key: HADOOP-10841 URL: https://issues.apache.org/jira/browse/HADOOP-10841 Project: Hadoop Common Issue Type: Improvement Components: security Affects Versions: 3.0.0 Reporter: Alejandro Abdelnur Assignee: Arun Suresh Attachments: HADOOP-10841.1.patch Having a keyname will help the NN to efficiently (without additional keyprovider calls, which can translate into remote calls) determine the key name of an EDEK. -- This message was sent by Atlassian JIRA (v6.2#6252)
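[Editor's sketch] A hypothetical, simplified version of the improvement: carrying the key name on {{EncryptedKeyVersion}} itself lets the NameNode identify an EDEK's key without extra (possibly remote) KeyProvider calls. The {{KeyVersion}} type stands in for the provider's key-material holder; none of this is the attached patch.
{code}
public class EncryptedKeyVersion {
  private final String encryptionKeyName;        // the proposed new property
  private final String encryptionKeyVersionName; // version the EEK was made with
  private final byte[] encryptedKeyIv;           // IV used to encrypt the key
  private final KeyVersion encryptedKeyVersion;  // the encrypted key material

  public EncryptedKeyVersion(String keyName, String keyVersionName,
      byte[] iv, KeyVersion encryptedKey) {
    this.encryptionKeyName = keyName;
    this.encryptionKeyVersionName = keyVersionName;
    this.encryptedKeyIv = iv;
    this.encryptedKeyVersion = encryptedKey;
  }

  public String getEncryptionKeyName() { return encryptionKeyName; }
  public String getEncryptionKeyVersionName() { return encryptionKeyVersionName; }
  public byte[] getEncryptedKeyIv() { return encryptedKeyIv; }
  public KeyVersion getEncryptedKeyVersion() { return encryptedKeyVersion; }
}
{code}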
[jira] [Created] (HADOOP-10853) Refactor create instance of CryptoCodec and add CryptoCodecFactory
Yi Liu created HADOOP-10853: --- Summary: Refactor create instance of CryptoCodec and add CryptoCodecFactory Key: HADOOP-10853 URL: https://issues.apache.org/jira/browse/HADOOP-10853 Project: Hadoop Common Issue Type: Sub-task Components: security Reporter: Yi Liu Assignee: Yi Liu We should be able to create an instance of *CryptoCodec*: * via codec class name. (Applications may have config for different crypto codecs) * via algorithm/mode/padding. (For automatic decryption, we need to find the correct crypto codec and a proper implementation) * a default crypto codec through specific config. This JIRA is for * Creating an instance through a cipher suite (algorithm/mode/padding) * Refactoring creation of {{CryptoCodec}} instances into {{CryptoCodecFactory}} We need to get all crypto codecs in the system; this can be done via a Java ServiceLoader + the hadoop.security.crypto.codecs config. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HADOOP-10853) Refactor create instance of CryptoCodec and add CryptoCodecFactory
[ https://issues.apache.org/jira/browse/HADOOP-10853?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yi Liu updated HADOOP-10853: Attachment: HADOOP-10853.001.patch Uploaded the patch. *1.* {{hadoop.security.crypto.codecs}} + a Java ServiceLoader are used to get all crypto codecs in the system. *2.* {{hadoop.security.crypto.cipher.suite}} + {{hadoop.security.crypto.codec.class}} are for the default crypto codec. *3.* When creating an instance using *algorithm/mode/padding*, there may be several implementations; we should select the proper one. Default impl types are defined in {{hadoop.security.crypto.codec.impl.type}}. Refactor create instance of CryptoCodec and add CryptoCodecFactory -- Key: HADOOP-10853 URL: https://issues.apache.org/jira/browse/HADOOP-10853 Project: Hadoop Common Issue Type: Sub-task Components: security Reporter: Yi Liu Assignee: Yi Liu Fix For: 3.0.0 Attachments: HADOOP-10853.001.patch We should be able to create an instance of *CryptoCodec*: * via codec class name. (Applications may have config for different crypto codecs) * via algorithm/mode/padding. (For automatic decryption, we need to find the correct crypto codec and a proper implementation) * a default crypto codec through specific config. This JIRA is for * Creating an instance through a cipher suite (algorithm/mode/padding) * Refactoring creation of {{CryptoCodec}} instances into {{CryptoCodecFactory}} We need to get all crypto codecs in the system; this can be done via a Java ServiceLoader + the hadoop.security.crypto.codecs config. -- This message was sent by Atlassian JIRA (v6.2#6252)
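[Editor's sketch] A minimal illustration of the discovery-and-selection flow described in points 1-3, assuming a simplified {{CryptoCodec}} interface that exposes its cipher suite as a string; the real patch's types and config handling will differ:
{code}
import java.util.ServiceLoader;

// Hypothetical sketch: codecs are discovered via a Java ServiceLoader (the
// patch additionally consults the hadoop.security.crypto.codecs config), and
// an instance is selected by cipher suite, i.e. an algorithm/mode/padding
// triple such as "AES/CTR/NoPadding".
public final class CryptoCodecFactory {

  // Simplified stand-in for the real CryptoCodec abstraction.
  public interface CryptoCodec {
    String getCipherSuite();
  }

  private CryptoCodecFactory() {}

  // Return the first discovered codec supporting the given cipher suite,
  // or null when none matches.
  public static CryptoCodec getInstance(String cipherSuite) {
    for (CryptoCodec codec : ServiceLoader.load(CryptoCodec.class)) {
      if (cipherSuite.equals(codec.getCipherSuite())) {
        return codec;
      }
    }
    return null;
  }
}
{code}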
[jira] [Commented] (HADOOP-10840) Fix OutOfMemoryError caused by metrics system in Azure File System
[ https://issues.apache.org/jira/browse/HADOOP-10840?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14064662#comment-14064662 ] shanyu zhao commented on HADOOP-10840: -- Thanks [~cnauroth]! Fix OutOfMemoryError caused by metrics system in Azure File System -- Key: HADOOP-10840 URL: https://issues.apache.org/jira/browse/HADOOP-10840 Project: Hadoop Common Issue Type: Bug Components: metrics Affects Versions: 2.4.1 Reporter: shanyu zhao Assignee: shanyu zhao Fix For: 3.0.0 Attachments: HADOOP-10840.1.patch, HADOOP-10840.2.patch, HADOOP-10840.patch In Hadoop 2.x the Hadoop File System framework changed and no cache is implemented (refer to HADOOP-6356). This means that for every WASB access, a new NativeAzureFileSystem is created, along with which a metrics source is created and added to MetricsSystemImpl. Over time the sources accumulate, eating memory and eventually causing a Java OutOfMemoryError. The fix is to utilize the unregisterSource() method added to MetricsSystem in HADOOP-10839. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HADOOP-10692) Update metrics2 document and examples to be case sensitive
[ https://issues.apache.org/jira/browse/HADOOP-10692?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Akira AJISAKA updated HADOOP-10692: --- Resolution: Invalid Assignee: (was: Akira AJISAKA) Target Version/s: (was: 2.6.0) Status: Resolved (was: Patch Available) Since HADOOP-10468 was fixed without an incompatible change, this issue has become invalid. Update metrics2 document and examples to be case sensitive -- Key: HADOOP-10692 URL: https://issues.apache.org/jira/browse/HADOOP-10692 Project: Hadoop Common Issue Type: Bug Components: conf, metrics Affects Versions: 2.5.0 Reporter: Akira AJISAKA Labels: newbie Attachments: HADOOP-10692.2.patch, HADOOP-10692.patch After HADOOP-10468, the prefix of the properties in metrics2 became case sensitive. We should also update the package-info and hadoop-metrics2.properties examples to be case sensitive. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HADOOP-10816) KeyShell returns -1 on error to the shell, should be 1
[ https://issues.apache.org/jira/browse/HADOOP-10816?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14064790#comment-14064790 ] Hudson commented on HADOOP-10816: - FAILURE: Integrated in Hadoop-Yarn-trunk #615 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk/615/]) HADOOP-10816. KeyShell returns -1 on error to the shell, should be 1. (Mike Yoder via wang) (wang: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1611229) * /hadoop/common/trunk/hadoop-common-project/hadoop-common/CHANGES.txt * /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/crypto/key/KeyShell.java * /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/crypto/key/TestKeyShell.java KeyShell returns -1 on error to the shell, should be 1 -- Key: HADOOP-10816 URL: https://issues.apache.org/jira/browse/HADOOP-10816 Project: Hadoop Common Issue Type: Bug Components: security Affects Versions: 3.0.0 Reporter: Mike Yoder Assignee: Mike Yoder Fix For: 3.0.0 Attachments: HADOOP-10816.001.patch, HADOOP-10816.002.patch I've seen this in several places now - commands returning -1 on failure to the shell. It's a bug. Someone confused their posix style returns (0 on success, <0 on failure) with program returns, which are an unsigned character. Thus, a return of -1 actually becomes 255 to the shell. {noformat} $ hadoop key create happykey2 --provider kms://http@localhost:16000/kms --attr a=a --attr a=b Each attribute must correspond to only one value: atttribute a was repeated ... $ echo $? 255 {noformat} A return value of 1 instead of -1 does the right thing. -- This message was sent by Atlassian JIRA (v6.2#6252)
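[Editor's sketch] The 255 comes from the operating system keeping only the low 8 bits of a process's exit status, so -1 (all bits set) is reported as 255. A tiny self-contained demonstration (hypothetical file name):
{code}
// ExitStatusDemo.java -- compile, run, then `echo $?` in the shell.
// Only the low 8 bits of the status survive, so -1 shows up as 255,
// which is why KeyShell should return 1 (not -1) on error.
public class ExitStatusDemo {
  public static void main(String[] args) {
    System.exit(-1); // the shell observes 255, not -1
  }
}
{code}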
[jira] [Commented] (HADOOP-10840) Fix OutOfMemoryError caused by metrics system in Azure File System
[ https://issues.apache.org/jira/browse/HADOOP-10840?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14064791#comment-14064791 ] Hudson commented on HADOOP-10840: - FAILURE: Integrated in Hadoop-Yarn-trunk #615 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk/615/]) HADOOP-10840. Fix OutOfMemoryError caused by metrics system in Azure File System. Contributed by Shanyu Zhao. (cnauroth: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1611247) * /hadoop/common/trunk/hadoop-common-project/hadoop-common/CHANGES.txt * /hadoop/common/trunk/hadoop-tools/hadoop-azure/src/main/java/org/apache/hadoop/fs/azure/NativeAzureFileSystem.java * /hadoop/common/trunk/hadoop-tools/hadoop-azure/src/main/java/org/apache/hadoop/fs/azure/metrics/AzureFileSystemMetricsSystem.java * /hadoop/common/trunk/hadoop-tools/hadoop-azure/src/test/java/org/apache/hadoop/fs/azure/AzureBlobStorageTestAccount.java * /hadoop/common/trunk/hadoop-tools/hadoop-azure/src/test/java/org/apache/hadoop/fs/azure/NativeAzureFileSystemBaseTest.java Fix OutOfMemoryError caused by metrics system in Azure File System -- Key: HADOOP-10840 URL: https://issues.apache.org/jira/browse/HADOOP-10840 Project: Hadoop Common Issue Type: Bug Components: metrics Affects Versions: 2.4.1 Reporter: shanyu zhao Assignee: shanyu zhao Fix For: 3.0.0 Attachments: HADOOP-10840.1.patch, HADOOP-10840.2.patch, HADOOP-10840.patch In Hadoop 2.x the Hadoop File System framework changed and no cache is implemented (refer to HADOOP-6356). This means that for every WASB access, a new NativeAzureFileSystem is created, along with which a metrics source is created and added to MetricsSystemImpl. Over time the sources accumulate, eating memory and eventually causing a Java OutOfMemoryError. The fix is to utilize the unregisterSource() method added to MetricsSystem in HADOOP-10839. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HADOOP-10839) Add unregisterSource() to MetricsSystem API
[ https://issues.apache.org/jira/browse/HADOOP-10839?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14064795#comment-14064795 ] Hudson commented on HADOOP-10839: - FAILURE: Integrated in Hadoop-Yarn-trunk #615 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk/615/]) HADOOP-10839. Add unregisterSource() to MetricsSystem API. Contributed by Shanyu Zhao. (cnauroth: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1611134) * /hadoop/common/trunk/hadoop-common-project/hadoop-common/CHANGES.txt * /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/metrics2/MetricsSystem.java * /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/metrics2/impl/MetricsSystemImpl.java * /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/metrics2/impl/TestMetricsSystemImpl.java Add unregisterSource() to MetricsSystem API --- Key: HADOOP-10839 URL: https://issues.apache.org/jira/browse/HADOOP-10839 Project: Hadoop Common Issue Type: Improvement Components: metrics Affects Versions: 2.4.1 Reporter: shanyu zhao Assignee: shanyu zhao Fix For: 3.0.0, 2.6.0 Attachments: HADOOP-10839.2.patch, HADOOP-10839.patch Currently the MetricsSystem API has a register() method to register a MetricsSource but doesn't have an unregister() method. This means that once a MetricsSource is registered with the MetricsSystem, it will be there forever until the MetricsSystem is shut down. This in some cases can cause a Java OutOfMemoryError. One such case is in the file system metrics implementation. The new AbstractFileSystem/FileContext framework does not implement a cache, so every file system access can lead to the creation of a NativeFileSystem instance (refer to HADOOP-6356). And all these NativeFileSystem instances need to share the same instance of MetricsSystemImpl, which means we cannot shut down the MetricsSystem to clean up all the MetricsSources that have been registered but are no longer active. Over time the MetricsSource instances accumulate and eventually we see an OutOfMemoryError. -- This message was sent by Atlassian JIRA (v6.2#6252)
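[Editor's sketch] A hedged illustration of the usage pattern this API enables, with a hypothetical wrapper class: register() follows the existing MetricsSystem style, and unregisterSource() is the method added here.
{code}
import java.io.Closeable;

import org.apache.hadoop.metrics2.MetricsSource;
import org.apache.hadoop.metrics2.MetricsSystem;

// Hypothetical per-instance component: it registers its metrics source on
// construction and unregisters it on close(), so sources no longer pile up
// in MetricsSystemImpl for the life of the process.
public class InstrumentedComponent implements Closeable {
  private final MetricsSystem metricsSystem;
  private final String sourceName;

  public InstrumentedComponent(MetricsSystem ms, String name,
      MetricsSource source) {
    this.metricsSystem = ms;
    this.sourceName = name;
    ms.register(name, "per-instance metrics", source);
  }

  @Override
  public void close() {
    // Without unregisterSource(), this source would live until the whole
    // MetricsSystem shuts down -- the leak described in this issue.
    metricsSystem.unregisterSource(sourceName);
  }
}
{code}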
[jira] [Commented] (HADOOP-10824) Refactor KMSACLs to avoid locking
[ https://issues.apache.org/jira/browse/HADOOP-10824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14064787#comment-14064787 ] Hudson commented on HADOOP-10824: - FAILURE: Integrated in Hadoop-Yarn-trunk #615 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk/615/]) HADOOP-10824. Refactor KMSACLs to avoid locking. (Benoy Antony via umamahesh) (umamahesh: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1610969) * /hadoop/common/trunk/hadoop-common-project/hadoop-common/CHANGES.txt * /hadoop/common/trunk/hadoop-common-project/hadoop-kms/src/main/java/org/apache/hadoop/crypto/key/kms/server/KMSACLs.java Refactor KMSACLs to avoid locking - Key: HADOOP-10824 URL: https://issues.apache.org/jira/browse/HADOOP-10824 Project: Hadoop Common Issue Type: Improvement Components: security Affects Versions: 2.4.1 Reporter: Benoy Antony Assignee: Benoy Antony Fix For: 3.0.0 Attachments: HADOOP-10824.patch, HADOOP-10824.patch Currently _KMSACLs_ is made thread safe using _ReadWriteLock_. It is possible to safely publish the _acls_ collection using _volatile_. Similar refactoring has been done in [HADOOP-10448|https://issues.apache.org/jira/browse/HADOOP-10448?focusedCommentId=13980112&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13980112] -- This message was sent by Atlassian JIRA (v6.2#6252)
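[Editor's sketch] The refactoring pattern referenced above (also used in HADOOP-10448) can be sketched as follows; class and field names are illustrative, not the patch itself:
{code}
import java.util.HashMap;
import java.util.Map;

// Safe publication via volatile: each reload builds a fresh map and swaps it
// in with a single volatile write, so readers always see a complete,
// consistent snapshot and no ReadWriteLock is needed.
public class AclSnapshot {
  private volatile Map<String, String> acls = new HashMap<String, String>();

  // Called by the periodic reload thread.
  public void setACLs(Map<String, String> fresh) {
    // One volatile write publishes all entries at once.
    acls = new HashMap<String, String>(fresh);
  }

  // Hot path: lock-free read of the current snapshot.
  public String getAcl(String name) {
    return acls.get(name);
  }
}
{code}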
[jira] [Commented] (HADOOP-9921) daemon scripts should remove pid file on stop call after stop or process is found not running
[ https://issues.apache.org/jira/browse/HADOOP-9921?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14064796#comment-14064796 ] Hudson commented on HADOOP-9921: FAILURE: Integrated in Hadoop-Yarn-trunk #615 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk/615/]) HADOOP-9921. daemon scripts should remove pid file on stop call after stop or process is found not running ( Contributed by Vinayakumar B) (vinayakumarb: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1610964) * /hadoop/common/trunk/hadoop-common-project/hadoop-common/CHANGES.txt * /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/bin/hadoop-daemon.sh * /hadoop/common/trunk/hadoop-mapreduce-project/bin/mr-jobhistory-daemon.sh * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/bin/yarn-daemon.sh daemon scripts should remove pid file on stop call after stop or process is found not running - Key: HADOOP-9921 URL: https://issues.apache.org/jira/browse/HADOOP-9921 Project: Hadoop Common Issue Type: Bug Affects Versions: 3.0.0, 2.1.0-beta Reporter: Vinayakumar B Assignee: Vinayakumar B Fix For: 2.6.0 Attachments: HADOOP-9921.patch Daemon scripts should remove the pid file on a stop call, even if the process is not running. The same pid file will be used by the start command; at that time, if the same pid has been assigned to some other process, the start may fail. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HADOOP-10732) Update without holding write lock in JavaKeyStoreProvider#innerSetCredential()
[ https://issues.apache.org/jira/browse/HADOOP-10732?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14064857#comment-14064857 ] Junping Du commented on HADOOP-10732: - Manually kick off the Jenkins test again. Update without holding write lock in JavaKeyStoreProvider#innerSetCredential() -- Key: HADOOP-10732 URL: https://issues.apache.org/jira/browse/HADOOP-10732 Project: Hadoop Common Issue Type: Bug Reporter: Ted Yu Assignee: Ted Yu Priority: Minor Attachments: hadoop-10732-v1.txt, hadoop-10732-v2.txt In hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/security/alias/JavaKeyStoreProvider.java, innerSetCredential() doesn't wrap update with writeLock.lock() / writeLock.unlock(). -- This message was sent by Atlassian JIRA (v6.2#6252)
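[Editor's sketch] The fix under review amounts to the standard lock-with-try/finally idiom; here is a hedged, generic illustration (class and field names are stand-ins for the JavaKeyStoreProvider internals):
{code}
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Generic illustration of guarding a mutation with the write lock, as the
// provider's other mutating methods already do.
public class GuardedCredentialStore {
  private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock(true);
  private final ReentrantReadWriteLock.WriteLock writeLock = lock.writeLock();
  private final Map<String, char[]> entries = new HashMap<String, char[]>();

  void innerSetCredential(String alias, char[] material) {
    writeLock.lock();
    try {
      entries.put(alias, material.clone());
    } finally {
      writeLock.unlock(); // always released, even if the update throws
    }
  }
}
{code}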
[jira] [Commented] (HADOOP-10732) Update without holding write lock in JavaKeyStoreProvider#innerSetCredential()
[ https://issues.apache.org/jira/browse/HADOOP-10732?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14064883#comment-14064883 ] Hadoop QA commented on HADOOP-10732: {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12655353/hadoop-10732-v2.txt against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:red}-1 tests included{color}. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:red}-1 core tests{color}. The patch failed these unit tests in hadoop-common-project/hadoop-common: org.apache.hadoop.metrics2.impl.TestMetricsSystemImpl org.apache.hadoop.fs.TestSymlinkLocalFSFileContext org.apache.hadoop.ipc.TestIPC org.apache.hadoop.fs.TestSymlinkLocalFSFileSystem {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-HADOOP-Build/4303//testReport/ Console output: https://builds.apache.org/job/PreCommit-HADOOP-Build/4303//console This message is automatically generated. Update without holding write lock in JavaKeyStoreProvider#innerSetCredential() -- Key: HADOOP-10732 URL: https://issues.apache.org/jira/browse/HADOOP-10732 Project: Hadoop Common Issue Type: Bug Reporter: Ted Yu Assignee: Ted Yu Priority: Minor Attachments: hadoop-10732-v1.txt, hadoop-10732-v2.txt In hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/security/alias/JavaKeyStoreProvider.java, innerSetCredential() doesn't wrap update with writeLock.lock() / writeLock.unlock(). -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HADOOP-10839) Add unregisterSource() to MetricsSystem API
[ https://issues.apache.org/jira/browse/HADOOP-10839?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14064903#comment-14064903 ] Hudson commented on HADOOP-10839: - FAILURE: Integrated in Hadoop-Mapreduce-trunk #1834 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1834/]) HADOOP-10839. Add unregisterSource() to MetricsSystem API. Contributed by Shanyu Zhao. (cnauroth: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1611134) * /hadoop/common/trunk/hadoop-common-project/hadoop-common/CHANGES.txt * /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/metrics2/MetricsSystem.java * /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/metrics2/impl/MetricsSystemImpl.java * /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/metrics2/impl/TestMetricsSystemImpl.java Add unregisterSource() to MetricsSystem API --- Key: HADOOP-10839 URL: https://issues.apache.org/jira/browse/HADOOP-10839 Project: Hadoop Common Issue Type: Improvement Components: metrics Affects Versions: 2.4.1 Reporter: shanyu zhao Assignee: shanyu zhao Fix For: 3.0.0, 2.6.0 Attachments: HADOOP-10839.2.patch, HADOOP-10839.patch Currently the MetricsSystem API has a register() method to register a MetricsSource but doesn't have an unregister() method. This means that once a MetricsSource is registered with the MetricsSystem, it will be there forever until the MetricsSystem is shut down. This in some cases can cause a Java OutOfMemoryError. One such case is in the file system metrics implementation. The new AbstractFileSystem/FileContext framework does not implement a cache, so every file system access can lead to the creation of a NativeFileSystem instance (refer to HADOOP-6356). And all these NativeFileSystem instances need to share the same instance of MetricsSystemImpl, which means we cannot shut down the MetricsSystem to clean up all the MetricsSources that have been registered but are no longer active. Over time the MetricsSource instances accumulate and eventually we see an OutOfMemoryError. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HADOOP-10816) KeyShell returns -1 on error to the shell, should be 1
[ https://issues.apache.org/jira/browse/HADOOP-10816?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14064898#comment-14064898 ] Hudson commented on HADOOP-10816: - FAILURE: Integrated in Hadoop-Mapreduce-trunk #1834 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1834/]) HADOOP-10816. KeyShell returns -1 on error to the shell, should be 1. (Mike Yoder via wang) (wang: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1611229) * /hadoop/common/trunk/hadoop-common-project/hadoop-common/CHANGES.txt * /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/crypto/key/KeyShell.java * /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/crypto/key/TestKeyShell.java KeyShell returns -1 on error to the shell, should be 1 -- Key: HADOOP-10816 URL: https://issues.apache.org/jira/browse/HADOOP-10816 Project: Hadoop Common Issue Type: Bug Components: security Affects Versions: 3.0.0 Reporter: Mike Yoder Assignee: Mike Yoder Fix For: 3.0.0 Attachments: HADOOP-10816.001.patch, HADOOP-10816.002.patch I've seen this in several places now - commands returning -1 on failure to the shell. It's a bug. Someone confused their posix style returns (0 on success, <0 on failure) with program returns, which are an unsigned character. Thus, a return of -1 actually becomes 255 to the shell. {noformat} $ hadoop key create happykey2 --provider kms://http@localhost:16000/kms --attr a=a --attr a=b Each attribute must correspond to only one value: atttribute a was repeated ... $ echo $? 255 {noformat} A return value of 1 instead of -1 does the right thing. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HADOOP-10840) Fix OutOfMemoryError caused by metrics system in Azure File System
[ https://issues.apache.org/jira/browse/HADOOP-10840?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14064899#comment-14064899 ] Hudson commented on HADOOP-10840: - FAILURE: Integrated in Hadoop-Mapreduce-trunk #1834 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1834/]) HADOOP-10840. Fix OutOfMemoryError caused by metrics system in Azure File System. Contributed by Shanyu Zhao. (cnauroth: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1611247) * /hadoop/common/trunk/hadoop-common-project/hadoop-common/CHANGES.txt * /hadoop/common/trunk/hadoop-tools/hadoop-azure/src/main/java/org/apache/hadoop/fs/azure/NativeAzureFileSystem.java * /hadoop/common/trunk/hadoop-tools/hadoop-azure/src/main/java/org/apache/hadoop/fs/azure/metrics/AzureFileSystemMetricsSystem.java * /hadoop/common/trunk/hadoop-tools/hadoop-azure/src/test/java/org/apache/hadoop/fs/azure/AzureBlobStorageTestAccount.java * /hadoop/common/trunk/hadoop-tools/hadoop-azure/src/test/java/org/apache/hadoop/fs/azure/NativeAzureFileSystemBaseTest.java Fix OutOfMemoryError caused by metrics system in Azure File System -- Key: HADOOP-10840 URL: https://issues.apache.org/jira/browse/HADOOP-10840 Project: Hadoop Common Issue Type: Bug Components: metrics Affects Versions: 2.4.1 Reporter: shanyu zhao Assignee: shanyu zhao Fix For: 3.0.0 Attachments: HADOOP-10840.1.patch, HADOOP-10840.2.patch, HADOOP-10840.patch In Hadoop 2.x the Hadoop File System framework changed and no cache is implemented (refer to HADOOP-6356). This means that for every WASB access, a new NativeAzureFileSystem is created, along with which a metrics source is created and added to MetricsSystemImpl. Over time the sources accumulate, eating memory and eventually causing a Java OutOfMemoryError. The fix is to utilize the unregisterSource() method added to MetricsSystem in HADOOP-10839. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HADOOP-10839) Add unregisterSource() to MetricsSystem API
[ https://issues.apache.org/jira/browse/HADOOP-10839?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14064924#comment-14064924 ] Hudson commented on HADOOP-10839: - FAILURE: Integrated in Hadoop-Hdfs-trunk #1807 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk/1807/]) HADOOP-10839. Add unregisterSource() to MetricsSystem API. Contributed by Shanyu Zhao. (cnauroth: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1611134) * /hadoop/common/trunk/hadoop-common-project/hadoop-common/CHANGES.txt * /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/metrics2/MetricsSystem.java * /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/metrics2/impl/MetricsSystemImpl.java * /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/metrics2/impl/TestMetricsSystemImpl.java Add unregisterSource() to MetricsSystem API --- Key: HADOOP-10839 URL: https://issues.apache.org/jira/browse/HADOOP-10839 Project: Hadoop Common Issue Type: Improvement Components: metrics Affects Versions: 2.4.1 Reporter: shanyu zhao Assignee: shanyu zhao Fix For: 3.0.0, 2.6.0 Attachments: HADOOP-10839.2.patch, HADOOP-10839.patch Currently the MetricsSystem API has a register() method to register a MetricsSource but doesn't have an unregister() method. This means that once a MetricsSource is registered with the MetricsSystem, it will be there forever until the MetricsSystem is shut down. This in some cases can cause a Java OutOfMemoryError. One such case is in the file system metrics implementation. The new AbstractFileSystem/FileContext framework does not implement a cache, so every file system access can lead to the creation of a NativeFileSystem instance (refer to HADOOP-6356). And all these NativeFileSystem instances need to share the same instance of MetricsSystemImpl, which means we cannot shut down the MetricsSystem to clean up all the MetricsSources that have been registered but are no longer active. Over time the MetricsSource instances accumulate and eventually we see an OutOfMemoryError. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HADOOP-10840) Fix OutOfMemoryError caused by metrics system in Azure File System
[ https://issues.apache.org/jira/browse/HADOOP-10840?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14064920#comment-14064920 ] Hudson commented on HADOOP-10840: - FAILURE: Integrated in Hadoop-Hdfs-trunk #1807 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk/1807/]) HADOOP-10840. Fix OutOfMemoryError caused by metrics system in Azure File System. Contributed by Shanyu Zhao. (cnauroth: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1611247) * /hadoop/common/trunk/hadoop-common-project/hadoop-common/CHANGES.txt * /hadoop/common/trunk/hadoop-tools/hadoop-azure/src/main/java/org/apache/hadoop/fs/azure/NativeAzureFileSystem.java * /hadoop/common/trunk/hadoop-tools/hadoop-azure/src/main/java/org/apache/hadoop/fs/azure/metrics/AzureFileSystemMetricsSystem.java * /hadoop/common/trunk/hadoop-tools/hadoop-azure/src/test/java/org/apache/hadoop/fs/azure/AzureBlobStorageTestAccount.java * /hadoop/common/trunk/hadoop-tools/hadoop-azure/src/test/java/org/apache/hadoop/fs/azure/NativeAzureFileSystemBaseTest.java Fix OutOfMemoryError caused by metrics system in Azure File System -- Key: HADOOP-10840 URL: https://issues.apache.org/jira/browse/HADOOP-10840 Project: Hadoop Common Issue Type: Bug Components: metrics Affects Versions: 2.4.1 Reporter: shanyu zhao Assignee: shanyu zhao Fix For: 3.0.0 Attachments: HADOOP-10840.1.patch, HADOOP-10840.2.patch, HADOOP-10840.patch In Hadoop 2.x the Hadoop File System framework changed and no cache is implemented (refer to HADOOP-6356). This means that for every WASB access, a new NativeAzureFileSystem is created, along with which a metrics source is created and added to MetricsSystemImpl. Over time the sources accumulate, eating memory and eventually causing a Java OutOfMemoryError. The fix is to utilize the unregisterSource() method added to MetricsSystem in HADOOP-10839. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HADOOP-10816) KeyShell returns -1 on error to the shell, should be 1
[ https://issues.apache.org/jira/browse/HADOOP-10816?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14064919#comment-14064919 ] Hudson commented on HADOOP-10816: - FAILURE: Integrated in Hadoop-Hdfs-trunk #1807 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk/1807/]) HADOOP-10816. KeyShell returns -1 on error to the shell, should be 1. (Mike Yoder via wang) (wang: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1611229) * /hadoop/common/trunk/hadoop-common-project/hadoop-common/CHANGES.txt * /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/crypto/key/KeyShell.java * /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/crypto/key/TestKeyShell.java KeyShell returns -1 on error to the shell, should be 1 -- Key: HADOOP-10816 URL: https://issues.apache.org/jira/browse/HADOOP-10816 Project: Hadoop Common Issue Type: Bug Components: security Affects Versions: 3.0.0 Reporter: Mike Yoder Assignee: Mike Yoder Fix For: 3.0.0 Attachments: HADOOP-10816.001.patch, HADOOP-10816.002.patch I've seen this in several places now - commands returning -1 on failure to the shell. It's a bug. Someone confused their posix style returns (0 on success, <0 on failure) with program returns, which are an unsigned character. Thus, a return of -1 actually becomes 255 to the shell. {noformat} $ hadoop key create happykey2 --provider kms://http@localhost:16000/kms --attr a=a --attr a=b Each attribute must correspond to only one value: atttribute a was repeated ... $ echo $? 255 {noformat} A return value of 1 instead of -1 does the right thing. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HADOOP-10732) Update without holding write lock in JavaKeyStoreProvider#innerSetCredential()
[ https://issues.apache.org/jira/browse/HADOOP-10732?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14064952#comment-14064952 ] Ted Yu commented on HADOOP-10732: - Test failures were not related to the patch - they pass locally with patch v2. {code} Tests run: 160, Failures: 0, Errors: 0, Skipped: 11 {code} Update without holding write lock in JavaKeyStoreProvider#innerSetCredential() -- Key: HADOOP-10732 URL: https://issues.apache.org/jira/browse/HADOOP-10732 Project: Hadoop Common Issue Type: Bug Reporter: Ted Yu Assignee: Ted Yu Priority: Minor Attachments: hadoop-10732-v1.txt, hadoop-10732-v2.txt In hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/security/alias/JavaKeyStoreProvider.java, innerSetCredential() doesn't wrap update with writeLock.lock() / writeLock.unlock(). -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HADOOP-10641) Introduce Coordination Engine
[ https://issues.apache.org/jira/browse/HADOOP-10641?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14064987#comment-14064987 ] Allen Wittenauer commented on HADOOP-10641: --- Did you mean ConsensusNameNode? Introduce Coordination Engine - Key: HADOOP-10641 URL: https://issues.apache.org/jira/browse/HADOOP-10641 Project: Hadoop Common Issue Type: New Feature Affects Versions: 3.0.0 Reporter: Konstantin Shvachko Assignee: Plamen Jeliazkov Attachments: HADOOP-10641.patch, HADOOP-10641.patch, HADOOP-10641.patch, hadoop-coordination.patch Coordination Engine (CE) is a system that allows agreement on a sequence of events in a distributed system. To be reliable, the CE should itself be distributed. A Coordination Engine can be based on different algorithms (Paxos, Raft, 2PC, ZAB) and have different implementations, depending on use cases and on reliability, availability, and performance requirements. The CE should have a common API so that it can serve as a pluggable component in different projects. The immediate beneficiaries are HDFS (HDFS-6469) and HBase (HBASE-10909). The first implementation is proposed to be based on ZooKeeper. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Resolved] (HADOOP-8100) share web server information for http filters
[ https://issues.apache.org/jira/browse/HADOOP-8100?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Allen Wittenauer resolved HADOOP-8100. -- Resolution: Won't Fix share web server information for http filters - Key: HADOOP-8100 URL: https://issues.apache.org/jira/browse/HADOOP-8100 Project: Hadoop Common Issue Type: New Feature Affects Versions: 1.0.0, 0.23.2, 0.24.0 Reporter: Allen Wittenauer Attachments: HADOOP-8100-branch-1.0.patch This is a simple fix which shares the web server bind information for consumption downstream by 3rd-party plugins. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HADOOP-8100) share web server information for http filters
[ https://issues.apache.org/jira/browse/HADOOP-8100?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Allen Wittenauer updated HADOOP-8100: - Status: Open (was: Patch Available) share web server information for http filters - Key: HADOOP-8100 URL: https://issues.apache.org/jira/browse/HADOOP-8100 Project: Hadoop Common Issue Type: New Feature Affects Versions: 1.0.0, 0.23.2, 0.24.0 Reporter: Allen Wittenauer Attachments: HADOOP-8100-branch-1.0.patch This is a simple fix which shares the web server bind information for consumption downstream by 3rd-party plugins. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Resolved] (HADOOP-8026) various shell script fixes
[ https://issues.apache.org/jira/browse/HADOOP-8026?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Allen Wittenauer resolved HADOOP-8026. -- Resolution: Duplicate This is all part of HADOOP-9902 now. various shell script fixes -- Key: HADOOP-8026 URL: https://issues.apache.org/jira/browse/HADOOP-8026 Project: Hadoop Common Issue Type: Bug Affects Versions: 1.0.0 Reporter: Allen Wittenauer Attachments: HADOOP-8026-branch-1.0.txt Various shell script fixes: * repair naked $0s so that dir detections work * remove superfluous JAVA_HOME settings * use /usr/bin/pdsh in slaves.sh if it exists -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Resolved] (HADOOP-8025) change default distcp log location to be /tmp rather than cwd
[ https://issues.apache.org/jira/browse/HADOOP-8025?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Allen Wittenauer resolved HADOOP-8025. -- Resolution: Won't Fix change default distcp log location to be /tmp rather than cwd - Key: HADOOP-8025 URL: https://issues.apache.org/jira/browse/HADOOP-8025 Project: Hadoop Common Issue Type: Improvement Affects Versions: 1.0.0 Reporter: Allen Wittenauer Priority: Trivial Attachments: HADOOP-8025-branch-1.0.txt distcp loves to leave empty files around. This puts them in /tmp so at least they are easy to find and kill. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HADOOP-10607) Create an API to Separate Credentials/Password Storage from Applications
[ https://issues.apache.org/jira/browse/HADOOP-10607?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Owen O'Malley updated HADOOP-10607: --- Fix Version/s: 2.5.0 Create an API to Separate Credentials/Password Storage from Applications Key: HADOOP-10607 URL: https://issues.apache.org/jira/browse/HADOOP-10607 Project: Hadoop Common Issue Type: New Feature Components: security Reporter: Larry McCay Assignee: Larry McCay Fix For: 3.0.0, 2.5.0 Attachments: 10607-10.patch, 10607-11.patch, 10607-12.patch, 10607-2.patch, 10607-3.patch, 10607-4.patch, 10607-5.patch, 10607-6.patch, 10607-7.patch, 10607-8.patch, 10607-9.patch, 10607-branch-2.patch, 10607.patch As with the filesystem API, we need to provide a generic mechanism to support multiple credential storage mechanisms that are potentially from third parties. We need the ability to eliminate the storage of passwords and secrets in clear text within configuration files or within code. Toward that end, I propose an API that is configured using a list of URLs of CredentialProviders. The implementation will look for implementations using the ServiceLoader interface and thus support third party libraries. Two providers will be included in this patch: one using the credentials cache in MapReduce jobs and the other using Java KeyStores from either HDFS or the local file system. A CredShell CLI will also be included in this patch, which provides the ability to manage the credentials within the stores. -- This message was sent by Atlassian JIRA (v6.2#6252)
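[Editor's sketch] A hedged illustration of how a client might resolve providers and fetch a password by alias under this API; the class and method names follow the patch description, but treat the details as illustrative rather than the patch itself:
{code}
import java.io.IOException;
import java.util.List;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.security.alias.CredentialProvider;
import org.apache.hadoop.security.alias.CredentialProviderFactory;

// Resolve the configured provider URLs (implementations are discovered via
// ServiceLoader) and look up a secret by alias, so no clear-text password
// needs to live in a configuration file.
public class CredentialLookup {
  public static char[] getPassword(Configuration conf, String alias)
      throws IOException {
    List<CredentialProvider> providers =
        CredentialProviderFactory.getProviders(conf);
    for (CredentialProvider provider : providers) {
      CredentialProvider.CredentialEntry entry =
          provider.getCredentialEntry(alias);
      if (entry != null) {
        return entry.getCredential();
      }
    }
    return null; // alias not found in any configured provider
  }
}
{code}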
[jira] [Updated] (HADOOP-10607) Create an API to Separate Credentials/Password Storage from Applications
[ https://issues.apache.org/jira/browse/HADOOP-10607?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Owen O'Malley updated HADOOP-10607: --- Fix Version/s: (was: 2.5.0) 2.6.0 Create an API to Separate Credentials/Password Storage from Applications Key: HADOOP-10607 URL: https://issues.apache.org/jira/browse/HADOOP-10607 Project: Hadoop Common Issue Type: New Feature Components: security Reporter: Larry McCay Assignee: Larry McCay Fix For: 3.0.0, 2.6.0 Attachments: 10607-10.patch, 10607-11.patch, 10607-12.patch, 10607-2.patch, 10607-3.patch, 10607-4.patch, 10607-5.patch, 10607-6.patch, 10607-7.patch, 10607-8.patch, 10607-9.patch, 10607-branch-2.patch, 10607.patch As with the filesystem API, we need to provide a generic mechanism to support multiple credential storage mechanisms that are potentially from third parties. We need the ability to eliminate the storage of passwords and secrets in clear text within configuration files or within code. Toward that end, I propose an API that is configured using a list of URLs of CredentialProviders. The implementation will look for implementations using the ServiceLoader interface and thus support third party libraries. Two providers will be included in this patch: one using the credentials cache in MapReduce jobs and the other using Java KeyStores from either HDFS or the local file system. A CredShell CLI will also be included in this patch, which provides the ability to manage the credentials within the stores. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HADOOP-535) back to back testing of codecs
[ https://issues.apache.org/jira/browse/HADOOP-535?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14065054#comment-14065054 ] Allen Wittenauer commented on HADOOP-535: - This was done years ago, wasn't it? back to back testing of codecs -- Key: HADOOP-535 URL: https://issues.apache.org/jira/browse/HADOOP-535 Project: Hadoop Common Issue Type: Test Components: io Reporter: Owen O'Malley Assignee: Arun C Murthy We should write some unit tests that use codecs back to back, writing and then reading: compressed block 1, compressed block 2, compressed block 3, ... This will check that the compression codecs consume the entire block when they read. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Created] (HADOOP-10854) unit tests for the shell scripts
Allen Wittenauer created HADOOP-10854: - Summary: unit tests for the shell scripts Key: HADOOP-10854 URL: https://issues.apache.org/jira/browse/HADOOP-10854 Project: Hadoop Common Issue Type: Test Reporter: Allen Wittenauer With HADOOP-9902 moving a lot of functionality to functions, we should build some unit tests for them. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Created] (HADOOP-10855) Allow Text to be read with a known length
Todd Lipcon created HADOOP-10855: Summary: Allow Text to be read with a known length Key: HADOOP-10855 URL: https://issues.apache.org/jira/browse/HADOOP-10855 Project: Hadoop Common Issue Type: Improvement Components: io Affects Versions: 2.6.0 Reporter: Todd Lipcon Assignee: Todd Lipcon Priority: Minor For the native task work (MAPREDUCE-2841) it is useful to be able to store strings in a different fashion than the default (varint-prefixed) serialization. We should provide a read method in Text which takes an already-known length to support this use case while still providing Text objects back to the user. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HADOOP-10855) Allow Text to be read with a known length
[ https://issues.apache.org/jira/browse/HADOOP-10855?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Todd Lipcon updated HADOOP-10855: - Attachment: hadoop-10855.txt Attached patch implements readWithKnownLength(). I also refactored the common code out from the existing read methods to call this new one after deserializing the length. Added a simple new unit test to verify. Allow Text to be read with a known length - Key: HADOOP-10855 URL: https://issues.apache.org/jira/browse/HADOOP-10855 Project: Hadoop Common Issue Type: Improvement Components: io Affects Versions: 2.6.0 Reporter: Todd Lipcon Assignee: Todd Lipcon Priority: Minor Attachments: hadoop-10855.txt For the native task work (MAPREDUCE-2841) it is useful to be able to store strings in a different fashion than the default (varint-prefixed) serialization. We should provide a read method in Text which takes an already-known length to support this use case while still providing Text objects back to the user. -- This message was sent by Atlassian JIRA (v6.2#6252)
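[Editor's sketch] A small usage illustration of the new method, assuming the signature readWithKnownLength(DataInput, int) implied by the description, where the byte length is known out of band rather than read as a varint prefix:
{code}
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;

import org.apache.hadoop.io.Text;

public class KnownLengthReadDemo {
  public static void main(String[] args) throws IOException {
    byte[] utf8 = "hello, native task".getBytes("UTF-8");
    DataInputStream in = new DataInputStream(new ByteArrayInputStream(utf8));

    Text text = new Text();
    // The length comes from the caller's own framing, not a varint prefix.
    text.readWithKnownLength(in, utf8.length);
    System.out.println(text); // prints: hello, native task
  }
}
{code}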
[jira] [Updated] (HADOOP-10855) Allow Text to be read with a known length
[ https://issues.apache.org/jira/browse/HADOOP-10855?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Todd Lipcon updated HADOOP-10855: - Status: Patch Available (was: Open) Allow Text to be read with a known length - Key: HADOOP-10855 URL: https://issues.apache.org/jira/browse/HADOOP-10855 Project: Hadoop Common Issue Type: Improvement Components: io Affects Versions: 2.6.0 Reporter: Todd Lipcon Assignee: Todd Lipcon Priority: Minor Attachments: hadoop-10855.txt, hadoop-10855.txt For the native task work (MAPREDUCE-2841) it is useful to be able to store strings in a different fashion than the default (varint-prefixed) serialization. We should provide a read method in Text which takes an already-known length to support this use case while still providing Text objects back to the user. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HADOOP-10855) Allow Text to be read with a known length
[ https://issues.apache.org/jira/browse/HADOOP-10855?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Todd Lipcon updated HADOOP-10855: - Attachment: hadoop-10855.txt Oops, noticed a silly typo in a comment. Allow Text to be read with a known length - Key: HADOOP-10855 URL: https://issues.apache.org/jira/browse/HADOOP-10855 Project: Hadoop Common Issue Type: Improvement Components: io Affects Versions: 2.6.0 Reporter: Todd Lipcon Assignee: Todd Lipcon Priority: Minor Attachments: hadoop-10855.txt, hadoop-10855.txt For the native task work (MAPREDUCE-2841) it is useful to be able to store strings in a different fashion than the default (varint-prefixed) serialization. We should provide a read method in Text which takes an already-known length to support this use case while still providing Text objects back to the user. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HADOOP-10855) Allow Text to be read with a known length
[ https://issues.apache.org/jira/browse/HADOOP-10855?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14065095#comment-14065095 ] Hadoop QA commented on HADOOP-10855: {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12656290/hadoop-10855.txt against trunk revision . {color:red}-1 patch{color}. The patch command could not apply the patch. Console output: https://builds.apache.org/job/PreCommit-HADOOP-Build/4304//console This message is automatically generated. Allow Text to be read with a known length - Key: HADOOP-10855 URL: https://issues.apache.org/jira/browse/HADOOP-10855 Project: Hadoop Common Issue Type: Improvement Components: io Affects Versions: 2.6.0 Reporter: Todd Lipcon Assignee: Todd Lipcon Priority: Minor Attachments: hadoop-10855.txt, hadoop-10855.txt For the native task work (MAPREDUCE-2841) it is useful to be able to store strings in a different fashion than the default (varint-prefixed) serialization. We should provide a read method in Text which takes an already-known length to support this use case while still providing Text objects back to the user. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Resolved] (HADOOP-1024) Add stable version line to the website front page
[ https://issues.apache.org/jira/browse/HADOOP-1024?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Allen Wittenauer resolved HADOOP-1024. -- Resolution: Fixed This was done forever ago. Add stable version line to the website front page - Key: HADOOP-1024 URL: https://issues.apache.org/jira/browse/HADOOP-1024 Project: Hadoop Common Issue Type: Improvement Reporter: Owen O'Malley I think it would be worthwhile to add two lines to the top of the welcome website page: Stable version: 0.10.1 Latest version: 0.11.1 With the number linking off to the respective release like so: http://www.apache.org/dyn/closer.cgi/lucene/hadoop/hadoop-0.10.1.tar.gz We can promote versions from Latest to Stable when they have proven themselves. Thoughts? -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Resolved] (HADOOP-1464) IPC server should not log thread stacks at the info level
[ https://issues.apache.org/jira/browse/HADOOP-1464?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Allen Wittenauer resolved HADOOP-1464. -- Resolution: Fixed I'm going to close this out as stale. I suspect this is no longer an issue. IPC server should not log thread stacks at the info level - Key: HADOOP-1464 URL: https://issues.apache.org/jira/browse/HADOOP-1464 Project: Hadoop Common Issue Type: Bug Components: ipc Affects Versions: 0.12.3 Reporter: Hairong Kuang Currently when the IPC server gets a call which has become too old, i.e. the call has not been served for too long a time, it dumps all thread stacks to the logs at the info level. Because the size of all the thread stacks might be very big, it would be better to log them at the debug level. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Resolved] (HADOOP-1496) Test coverage target in build files using emma
[ https://issues.apache.org/jira/browse/HADOOP-1496?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Allen Wittenauer resolved HADOOP-1496. -- Resolution: Won't Fix I'm going to close this now with won't fix given the clover coverage. Test coverage target in build files using emma -- Key: HADOOP-1496 URL: https://issues.apache.org/jira/browse/HADOOP-1496 Project: Hadoop Common Issue Type: Improvement Components: build Environment: all Reporter: woyg Priority: Minor Attachments: emma.tgz, hadoop_clover.patch, patch.emma.txt, patch.emma.txt.2 Test coverage targets for Hadoop using emma. Test coverage will help in identifying the components which are not properly covered by tests, so test cases can be written for them. Emma (http://emma.sourceforge.net/) is a good tool for coverage. If you have something else in mind you can suggest it. I have a patch ready with emma. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Resolved] (HADOOP-1688) TestCrcCorruption hangs on windows
[ https://issues.apache.org/jira/browse/HADOOP-1688?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Allen Wittenauer resolved HADOOP-1688. -- Resolution: Fixed Closing this as stale. TestCrcCorruption hangs on windows -- Key: HADOOP-1688 URL: https://issues.apache.org/jira/browse/HADOOP-1688 Project: Hadoop Common Issue Type: Bug Components: test Affects Versions: 0.14.0 Environment: Windows Reporter: Konstantin Shvachko TestCrcCorruption times out on windows saying just that it timed out. No other useful information in the log. Some kind of timing issue, because if I run it with output=yes then it succeeds. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Resolved] (HADOOP-1754) A testimonial page for hadoop?
[ https://issues.apache.org/jira/browse/HADOOP-1754?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Allen Wittenauer resolved HADOOP-1754. -- Resolution: Fixed We have entire conferences now. Closing. A testimonial page for hadoop? -- Key: HADOOP-1754 URL: https://issues.apache.org/jira/browse/HADOOP-1754 Project: Hadoop Common Issue Type: Wish Components: documentation Reporter: Konstantin Shvachko Priority: Minor Should we create a testimonial page on hadoop wiki with a link from Hadoop home page so that people could share their experience of using Hadoop? I see some satisfied users out there. :) -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HADOOP-1791) Cleanup local files command(s)
[ https://issues.apache.org/jira/browse/HADOOP-1791?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14065206#comment-14065206 ] Allen Wittenauer commented on HADOOP-1791: -- I'm not sure about -format as the option, but this would be kind of nice to have. Cleanup local files command(s) -- Key: HADOOP-1791 URL: https://issues.apache.org/jira/browse/HADOOP-1791 Project: Hadoop Common Issue Type: New Feature Components: util Affects Versions: 0.15.0 Reporter: Enis Soztutar Labels: newbie It would be good if we had a cleanup command to clean up all the local directories that any component of Hadoop uses. That way, before the cluster is restarted, or when a machine is pulled out of the cluster, we can clean up all the local files. I propose we add {noformat} bin/hadoop datanode -format bin/hadoop tasktracker -format bin/hadoop jobtracker -format {noformat} -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HADOOP-1791) Cleanup local files command(s)
[ https://issues.apache.org/jira/browse/HADOOP-1791?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Allen Wittenauer updated HADOOP-1791: - Labels: newbie (was: ) Cleanup local files command(s) -- Key: HADOOP-1791 URL: https://issues.apache.org/jira/browse/HADOOP-1791 Project: Hadoop Common Issue Type: New Feature Components: util Affects Versions: 0.15.0 Reporter: Enis Soztutar Labels: newbie It would be good if we had a cleanup command to clean up all the local directories that any component of Hadoop uses. That way, before the cluster is restarted, or when a machine is pulled out of the cluster, we can clean up all the local files. I propose we add {noformat} bin/hadoop datanode -format bin/hadoop tasktracker -format bin/hadoop jobtracker -format {noformat} -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HADOOP-1815) Separate client and server jars
[ https://issues.apache.org/jira/browse/HADOOP-1815?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14065215#comment-14065215 ] Allen Wittenauer commented on HADOOP-1815: -- With the move to protobuf, how close are we to closing this out? Separate client and server jars --- Key: HADOOP-1815 URL: https://issues.apache.org/jira/browse/HADOOP-1815 Project: Hadoop Common Issue Type: Bug Components: build Affects Versions: 0.14.0 Environment: All Reporter: Milind Bhandarkar For the ease of deployment, one should not have to change the server jars, and restart clusters, when minor features on the client side are changed. This requires separating the client and server jars for Hadoop. Version numbers appended to hadoop jars can reflect the compatibility. e.g. the server jar could be at 0.13.1, and the client jar could be at 0.13.2. In short, we can treat the part following 0. as the major version number for now. This keeps major client frameworks such as streaming and Pig happy. To my knowledge, Pig uses hadoop's default jobclient, whereas streaming uses its own jobclient. I would love to change streaming to use the default hadoop jobclient, so that I can make modifications to it (e.g. to print more stats that are available from TaskReport) without having to deploy the new version of the whole jar to the backend and restart the mapreduce cluster. (I thought there was already a bug filed for separating the client and server jars, but I could not find it. Hence the new Jira. Sorry about duplication, if any.) -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HADOOP-10855) Allow Text to be read with a known length
[ https://issues.apache.org/jira/browse/HADOOP-10855?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Todd Lipcon updated HADOOP-10855: - Attachment: hadoop-10855.txt woops, my patch wasn't relative to the right dir... take 3. Allow Text to be read with a known length - Key: HADOOP-10855 URL: https://issues.apache.org/jira/browse/HADOOP-10855 Project: Hadoop Common Issue Type: Improvement Components: io Affects Versions: 2.6.0 Reporter: Todd Lipcon Assignee: Todd Lipcon Priority: Minor Attachments: hadoop-10855.txt, hadoop-10855.txt, hadoop-10855.txt For the native task work (MAPREDUCE-2841) it is useful to be able to store strings in a different fashion than the default (varint-prefixed) serialization. We should provide a read method in Text which takes an already-known length to support this use case while still providing Text objects back to the user. -- This message was sent by Atlassian JIRA (v6.2#6252)
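A minimal sketch of the idea (an illustrative helper, not the attached patch): fill a reusable Text from a DataInput when the byte length was stored out of band, skipping the varint prefix that {{Text#readFields}} expects.
{code}
import java.io.DataInput;
import java.io.IOException;

import org.apache.hadoop.io.Text;

final class KnownLengthTextReader {
  // Read exactly 'len' bytes of UTF-8 payload into 'text', with no varint prefix.
  static void read(DataInput in, Text text, int len) throws IOException {
    byte[] bytes = new byte[len];
    in.readFully(bytes, 0, len);
    text.set(bytes, 0, len); // reuse the caller's Text instance, no extra copy
  }
}
{code}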
[jira] [Updated] (HADOOP-10732) Update without holding write lock in JavaKeyStoreProvider#innerSetCredential()
[ https://issues.apache.org/jira/browse/HADOOP-10732?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Owen O'Malley updated HADOOP-10732: --- Resolution: Fixed Fix Version/s: 2.6.0 3.0.0 Status: Resolved (was: Patch Available) The v2 patch removes the synchronization around the hash table, so I'm going to use the v1 patch. I just committed this to trunk and branch-2. Thanks, Ted! Update without holding write lock in JavaKeyStoreProvider#innerSetCredential() -- Key: HADOOP-10732 URL: https://issues.apache.org/jira/browse/HADOOP-10732 Project: Hadoop Common Issue Type: Bug Reporter: Ted Yu Assignee: Ted Yu Priority: Minor Fix For: 3.0.0, 2.6.0 Attachments: hadoop-10732-v1.txt, hadoop-10732-v2.txt In hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/security/alias/JavaKeyStoreProvider.java, innerSetCredential() doesn't wrap update with writeLock.lock() / writeLock.unlock(). -- This message was sent by Atlassian JIRA (v6.2#6252)
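The locking pattern at issue, sketched minimally (class and member names here are illustrative, not the actual JavaKeyStoreProvider code): every mutation of the shared keystore state happens between writeLock.lock() and writeLock.unlock().
{code}
import java.util.concurrent.locks.ReadWriteLock;
import java.util.concurrent.locks.ReentrantReadWriteLock;

class CredentialStoreSketch {
  private final ReadWriteLock lock = new ReentrantReadWriteLock(true);

  void innerSetCredential(String alias, char[] material) {
    lock.writeLock().lock();
    try {
      // mutate the underlying keystore entry here
    } finally {
      lock.writeLock().unlock(); // always release, even if the update throws
    }
  }
}
{code}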
[jira] [Commented] (HADOOP-10843) Unsafe.getLong is not supported correctly on Power PC, thus causing FastByteComparison's UnsafeComparer to not work properly
[ https://issues.apache.org/jira/browse/HADOOP-10843?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14065227#comment-14065227 ] Colin Patrick McCabe commented on HADOOP-10843: --- What do you mean by not supported correctly? Are you talking about alignment restrictions (i.e. the address must be a multiple of 8), or something else? If it is something else, I would expect there to be a Sun/Oracle problem report open for this? Unsafe.getLong is not supported correctly on Power PC, thus causing FastByteComparison's UnsafeComparer to not work properly --- Key: HADOOP-10843 URL: https://issues.apache.org/jira/browse/HADOOP-10843 Project: Hadoop Common Issue Type: Bug Components: io Affects Versions: 2.2.0, 2.3.0, 2.4.0, 2.4.1 Reporter: Jinghui Wang Assignee: Jinghui Wang Attachments: HADOOP-10843.patch Unsafe.getLong is not supported correctly on Power PC. FastByteComparison's UnsafeComparer relies on the unsafe method Unsafe.getLong, which is not correctly supported on Power PC. -- This message was sent by Atlassian JIRA (v6.2#6252)
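One plausible reading of the report is byte order rather than alignment: Power PC is traditionally big-endian, and a comparer that reads eight bytes at a time with Unsafe.getLong must only byte-swap on little-endian hosts. A purely illustrative sketch of an order-aware unsigned compare (not the actual FastByteComparisons code):
{code}
import java.nio.ByteOrder;

final class ByteOrderAwareCompare {
  private static final boolean LITTLE_ENDIAN =
      ByteOrder.nativeOrder().equals(ByteOrder.LITTLE_ENDIAN);

  // Compare two 8-byte words lexicographically, as if comparing the
  // underlying byte arrays byte by byte.
  static int compareWords(long leftWord, long rightWord) {
    if (LITTLE_ENDIAN) {
      leftWord = Long.reverseBytes(leftWord);   // restore memory byte order
      rightWord = Long.reverseBytes(rightWord);
    }
    // unsigned comparison via the sign-bit flip trick
    long l = leftWord ^ Long.MIN_VALUE;
    long r = rightWord ^ Long.MIN_VALUE;
    return (l < r) ? -1 : ((l > r) ? 1 : 0);
  }
}
{code}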
[jira] [Commented] (HADOOP-10849) Implement conf substitution with UGI.current/loginUser
[ https://issues.apache.org/jira/browse/HADOOP-10849?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14065234#comment-14065234 ] Colin Patrick McCabe commented on HADOOP-10849: --- UserGroupInformation#getCurrentUser also accesses configuration objects; if one of them asks for ugi.current.user, then we get into an infinite regress. There is some potential for deadlock here. So I would say that we should avoid doing this unless we can find some way to solve those problems. Implement conf substitution with UGI.current/loginUser -- Key: HADOOP-10849 URL: https://issues.apache.org/jira/browse/HADOOP-10849 Project: Hadoop Common Issue Type: Improvement Components: conf Affects Versions: 2.4.1 Reporter: Gera Shegalov Assignee: Gera Shegalov Attachments: HADOOP-10849.v01.patch Many path properties and similar settings in the Hadoop code base would be easy to configure if we had substitutions with {{UserGroupInformation#getCurrentUser}}. Currently we often use less elegant concatenation code when we want to express currentUser, as opposed to the ${user.name} system property representing the user owning the JVM. This JIRA proposes the corresponding substitution support for the keys {{ugi.current.user}} and {{ugi.login.user}} -- This message was sent by Atlassian JIRA (v6.2#6252)
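A hedged sketch of one way to break the regress Colin describes: detect a re-entrant substitution on the same thread and return the unresolved key instead of recursing. Entirely illustrative, and not the approach taken in the attached patch.
{code}
final class UgiSubstitutionSketch {
  private static final ThreadLocal<Boolean> RESOLVING = new ThreadLocal<Boolean>() {
    @Override protected Boolean initialValue() { return Boolean.FALSE; }
  };

  static String resolveCurrentUser() {
    if (RESOLVING.get()) {
      return "${ugi.current.user}"; // re-entrant lookup: stop the recursion
    }
    RESOLVING.set(Boolean.TRUE);
    try {
      // UserGroupInformation.getCurrentUser().getShortUserName() would go here;
      // a constant keeps this sketch self-contained.
      return "currentUser";
    } finally {
      RESOLVING.set(Boolean.FALSE);
    }
  }
}
{code}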
[jira] [Commented] (HADOOP-1947) the hadoop-daemon.sh should allow the admin to configure the log4j appender for the servers
[ https://issues.apache.org/jira/browse/HADOOP-1947?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14065241#comment-14065241 ] Allen Wittenauer commented on HADOOP-1947: -- Ironically, this was fixed in hadoop-daemon.sh at some point, but yarn-daemon.sh did the exact same thing! the hadoop-daemon.sh should allow the admin to configure the log4j appender for the servers --- Key: HADOOP-1947 URL: https://issues.apache.org/jira/browse/HADOOP-1947 Project: Hadoop Common Issue Type: Improvement Reporter: Owen O'Malley Assignee: Owen O'Malley Currently the bin/hadoop-daemon.sh script forces the servers to use INFO,DRFA as the root logger. It really should be configurable from at least hadoop-env.sh. Otherwise, it is hard for admins to control how the logs are managed. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Resolved] (HADOOP-2082) randomwriter should complain if there are too many arguments
[ https://issues.apache.org/jira/browse/HADOOP-2082?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Allen Wittenauer resolved HADOOP-2082. -- Resolution: Fixed way old and likely fixed by now. randomwriter should complain if there are too many arguments Key: HADOOP-2082 URL: https://issues.apache.org/jira/browse/HADOOP-2082 Project: Hadoop Common Issue Type: Improvement Reporter: Owen O'Malley A user was moving from 0.13 to 0.14 and was invoking randomwriter with a config on the command line like: bin/hadoop jar hadoop-*-examples.jar randomwriter output conf.xml which worked in 0.13, but in 0.14 it ignores the conf.xml without complaining. The equivalent is bin/hadoop jar hadoop-*-examples.jar randomwriter -conf conf.xml output -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HADOOP-10733) Potential null dereference in CredentialShell#promptForCredential()
[ https://issues.apache.org/jira/browse/HADOOP-10733?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Owen O'Malley updated HADOOP-10733: --- Resolution: Fixed Fix Version/s: 2.6.0 3.0.0 Status: Resolved (was: Patch Available) I just committed this to trunk and branch-2. Thanks, Ted! Potential null dereference in CredentialShell#promptForCredential() --- Key: HADOOP-10733 URL: https://issues.apache.org/jira/browse/HADOOP-10733 Project: Hadoop Common Issue Type: Bug Reporter: Ted Yu Assignee: Ted Yu Priority: Minor Fix For: 3.0.0, 2.6.0 Attachments: hadoop-10733-v1.txt {code} char[] newPassword1 = c.readPassword("Enter password: "); char[] newPassword2 = c.readPassword("Enter password again: "); noMatch = !Arrays.equals(newPassword1, newPassword2); if (noMatch) { Arrays.fill(newPassword1, ' '); {code} newPassword1 might be null, leading to a NullPointerException in the Arrays.fill() call. Similar issue for the following call on line 381: {code} Arrays.fill(newPassword2, ' '); {code} -- This message was sent by Atlassian JIRA (v6.2#6252)
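A minimal sketch of the null-safe handling the fix calls for: Console#readPassword returns null when no console input is available, so both results are checked before Arrays.equals/Arrays.fill touch them (the helper class and method names are illustrative, not the committed patch):
{code}
import java.io.Console;
import java.util.Arrays;

final class PasswordPromptSketch {
  static char[] promptTwice(Console c) {
    char[] first = c.readPassword("Enter password: ");
    char[] second = c.readPassword("Enter password again: ");
    if (first == null || second == null) {
      return null; // no console input; nothing to compare or scrub
    }
    boolean match = Arrays.equals(first, second);
    Arrays.fill(second, ' ');  // scrub the duplicate copy
    if (!match) {
      Arrays.fill(first, ' '); // scrub on mismatch too
      return null;
    }
    return first;
  }
}
{code}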
[jira] [Commented] (HADOOP-10793) KeyShell and CredentialShell args should use single-dash style
[ https://issues.apache.org/jira/browse/HADOOP-10793?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14065282#comment-14065282 ] Andrew Wang commented on HADOOP-10793: -- Somehow I just realized that CredentialShell also uses two dashes, so let's fix that here as well. KeyShell and CredentialShell args should use single-dash style -- Key: HADOOP-10793 URL: https://issues.apache.org/jira/browse/HADOOP-10793 Project: Hadoop Common Issue Type: Improvement Components: security Affects Versions: 3.0.0 Reporter: Mike Yoder Follow-on from HADOOP-10736 as per [~andrew.wang] - the key shell uses the gnu double dash style for command line args, while other command line programs use a single dash. Consider changing this, and consider another argument parsing scheme, like the CommandLine class. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HADOOP-10793) KeyShell and CredentialShell args should use single-dash style
[ https://issues.apache.org/jira/browse/HADOOP-10793?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Wang updated HADOOP-10793: - Summary: KeyShell and CredentialShell args should use single-dash style (was: Key Shell args use double dash style) KeyShell and CredentialShell args should use single-dash style -- Key: HADOOP-10793 URL: https://issues.apache.org/jira/browse/HADOOP-10793 Project: Hadoop Common Issue Type: Improvement Components: security Affects Versions: 3.0.0 Reporter: Mike Yoder Follow-on from HADOOP-10736 as per [~andrew.wang] - the key shell uses the gnu double dash style for command line args, while other command line programs use a single dash. Consider changing this, and consider another argument parsing scheme, like the CommandLine class. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HADOOP-10607) Create an API to Separate Credentials/Password Storage from Applications
[ https://issues.apache.org/jira/browse/HADOOP-10607?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14065278#comment-14065278 ] Andrew Wang commented on HADOOP-10607: -- Hey guys, a few q's and comments: * Why was this merged to branch-2? AFAIK this isn't being used by any Hadoop components yet, so it doesn't belong in a release branch. I'd like to revert it out of branch-2 until there is such a consumer. * CredentialShell is using the double dash style for flags. I'm going to broaden the scope of HADOOP-10793 to fix this for both KeyShell and CredentialShell. * Larry, I think your IDE is auto-wrapping with tabs. I think this is default behavior with Eclipse. Another thing you can do is configure `git diff` to highlight whitespace errors like these for the future. Maybe we can fix some of these tabs in HADOOP-10793 too, or in a new JIRA. Normally I'm against whitespace-only changes, but this is mostly new code so there's little chance of conflicts. Create an API to Separate Credentials/Password Storage from Applications Key: HADOOP-10607 URL: https://issues.apache.org/jira/browse/HADOOP-10607 Project: Hadoop Common Issue Type: New Feature Components: security Reporter: Larry McCay Assignee: Larry McCay Fix For: 3.0.0, 2.6.0 Attachments: 10607-10.patch, 10607-11.patch, 10607-12.patch, 10607-2.patch, 10607-3.patch, 10607-4.patch, 10607-5.patch, 10607-6.patch, 10607-7.patch, 10607-8.patch, 10607-9.patch, 10607-branch-2.patch, 10607.patch As with the filesystem API, we need to provide a generic mechanism to support multiple credential storage mechanisms that are potentially from third parties. We need the ability to eliminate the storage of passwords and secrets in clear text within configuration files or within code. Toward that end, I propose an API that is configured using a list of URLs of CredentialProviders. The implementation will look for implementations using the ServiceLoader interface and thus support third party libraries. Two providers will be included in this patch. One using the credentials cache in MapReduce jobs and the other using Java KeyStores from either HDFS or local file system. A CredShell CLI will also be included in this patch which provides the ability to manage the credentials within the stores. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HADOOP-10591) Compression codecs must use pooled direct buffers or deallocate direct buffers when stream is closed
[ https://issues.apache.org/jira/browse/HADOOP-10591?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Colin Patrick McCabe updated HADOOP-10591: -- Resolution: Fixed Fix Version/s: 2.6.0 Status: Resolved (was: Patch Available) Compression codecs must use pooled direct buffers or deallocate direct buffers when stream is closed - Key: HADOOP-10591 URL: https://issues.apache.org/jira/browse/HADOOP-10591 Project: Hadoop Common Issue Type: Bug Affects Versions: 2.2.0 Reporter: Hari Shreedharan Assignee: Colin Patrick McCabe Fix For: 2.6.0 Attachments: HADOOP-10591.001.patch, HADOOP-10591.002.patch Currently direct buffers allocated by compression codecs like Gzip (which allocates 2 direct buffers per instance) are not deallocated when the stream is closed. Eventually, for long-running processes which create a huge number of files, these direct buffers are left hanging till a full GC, which may or may not happen in a reasonable amount of time - especially if the process does not use a whole lot of heap. Either these buffers should be pooled or they should be deallocated when the stream is closed. -- This message was sent by Atlassian JIRA (v6.2#6252)
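For the pooling half of the fix, Hadoop already exposes CodecPool; a short sketch of reusing a pooled Compressor so each stream does not allocate fresh direct buffers (the helper class here is illustrative, not the committed change):
{code}
import java.io.IOException;
import java.io.OutputStream;

import org.apache.hadoop.io.compress.CodecPool;
import org.apache.hadoop.io.compress.CompressionCodec;
import org.apache.hadoop.io.compress.CompressionOutputStream;
import org.apache.hadoop.io.compress.Compressor;

final class PooledCompressionSketch {
  static void writeCompressed(CompressionCodec codec, OutputStream out,
                              byte[] data) throws IOException {
    Compressor compressor = CodecPool.getCompressor(codec); // pooled, reusable
    try {
      CompressionOutputStream cos = codec.createOutputStream(out, compressor);
      cos.write(data);
      cos.finish(); // flush the compressed trailer without closing 'out'
    } finally {
      CodecPool.returnCompressor(compressor); // return buffers to the pool
    }
  }
}
{code}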
[jira] [Commented] (HADOOP-10733) Potential null dereference in CredentialShell#promptForCredential()
[ https://issues.apache.org/jira/browse/HADOOP-10733?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14065296#comment-14065296 ] Hudson commented on HADOOP-10733: - FAILURE: Integrated in Hadoop-trunk-Commit #5900 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/5900/]) HADOOP-10733. Fix potential null dereference in CredShell. (Ted Yu via omalley) (omalley: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1611419) * /hadoop/common/trunk/hadoop-common-project/hadoop-common/CHANGES.txt * /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/security/alias/CredentialShell.java * /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/security/alias/TestCredShell.java Potential null dereference in CredentialShell#promptForCredential() --- Key: HADOOP-10733 URL: https://issues.apache.org/jira/browse/HADOOP-10733 Project: Hadoop Common Issue Type: Bug Reporter: Ted Yu Assignee: Ted Yu Priority: Minor Fix For: 3.0.0, 2.6.0 Attachments: hadoop-10733-v1.txt {code} char[] newPassword1 = c.readPassword("Enter password: "); char[] newPassword2 = c.readPassword("Enter password again: "); noMatch = !Arrays.equals(newPassword1, newPassword2); if (noMatch) { Arrays.fill(newPassword1, ' '); {code} newPassword1 might be null, leading to a NullPointerException in the Arrays.fill() call. Similar issue for the following call on line 381: {code} Arrays.fill(newPassword2, ' '); {code} -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HADOOP-10591) Compression codecs must use pooled direct buffers or deallocate direct buffers when stream is closed
[ https://issues.apache.org/jira/browse/HADOOP-10591?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14065295#comment-14065295 ] Hudson commented on HADOOP-10591: - FAILURE: Integrated in Hadoop-trunk-Commit #5900 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/5900/]) HADOOP-10591. Compression codecs must use pooled direct buffers or deallocate direct buffers when stream is closed (cmccabe) (cmccabe: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1611423) * /hadoop/common/trunk/hadoop-common-project/hadoop-common/CHANGES.txt * /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/io/compress/BZip2Codec.java * /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/io/compress/CompressionCodec.java * /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/io/compress/CompressionInputStream.java * /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/io/compress/CompressionOutputStream.java * /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/io/compress/DefaultCodec.java * /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/io/compress/GzipCodec.java * /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/io/compress/Lz4Codec.java * /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/io/compress/SnappyCodec.java Compression codecs must use pooled direct buffers or deallocate direct buffers when stream is closed - Key: HADOOP-10591 URL: https://issues.apache.org/jira/browse/HADOOP-10591 Project: Hadoop Common Issue Type: Bug Affects Versions: 2.2.0 Reporter: Hari Shreedharan Assignee: Colin Patrick McCabe Fix For: 2.6.0 Attachments: HADOOP-10591.001.patch, HADOOP-10591.002.patch Currently direct buffers allocated by compression codecs like Gzip (which allocates 2 direct buffers per instance) are not deallocated when the stream is closed. Eventually, for long-running processes which create a huge number of files, these direct buffers are left hanging till a full GC, which may or may not happen in a reasonable amount of time - especially if the process does not use a whole lot of heap. Either these buffers should be pooled or they should be deallocated when the stream is closed. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HADOOP-10732) Update without holding write lock in JavaKeyStoreProvider#innerSetCredential()
[ https://issues.apache.org/jira/browse/HADOOP-10732?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14065298#comment-14065298 ] Hudson commented on HADOOP-10732: - FAILURE: Integrated in Hadoop-trunk-Commit #5900 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/5900/]) HADOOP-10732. Fix locking in credential update. (Ted Yu via omalley) (omalley: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1611415) * /hadoop/common/trunk/hadoop-common-project/hadoop-common/CHANGES.txt * /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/security/alias/JavaKeyStoreProvider.java Update without holding write lock in JavaKeyStoreProvider#innerSetCredential() -- Key: HADOOP-10732 URL: https://issues.apache.org/jira/browse/HADOOP-10732 Project: Hadoop Common Issue Type: Bug Reporter: Ted Yu Assignee: Ted Yu Priority: Minor Fix For: 3.0.0, 2.6.0 Attachments: hadoop-10732-v1.txt, hadoop-10732-v2.txt In hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/security/alias/JavaKeyStoreProvider.java, innerSetCredential() doesn't wrap update with writeLock.lock() / writeLock.unlock(). -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HADOOP-10607) Create an API to Separate Credentials/Password Storage from Applications
[ https://issues.apache.org/jira/browse/HADOOP-10607?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14065322#comment-14065322 ] Larry McCay commented on HADOOP-10607: -- Hi [~andrew.wang] - I will look into changing my preferences and configuring git diff as you describe. I thought that I was managing it manually well enough. Thanks for the hints! Create an API to Separate Credentials/Password Storage from Applications Key: HADOOP-10607 URL: https://issues.apache.org/jira/browse/HADOOP-10607 Project: Hadoop Common Issue Type: New Feature Components: security Reporter: Larry McCay Assignee: Larry McCay Fix For: 3.0.0, 2.6.0 Attachments: 10607-10.patch, 10607-11.patch, 10607-12.patch, 10607-2.patch, 10607-3.patch, 10607-4.patch, 10607-5.patch, 10607-6.patch, 10607-7.patch, 10607-8.patch, 10607-9.patch, 10607-branch-2.patch, 10607.patch As with the filesystem API, we need to provide a generic mechanism to support multiple credential storage mechanisms that are potentially from third parties. We need the ability to eliminate the storage of passwords and secrets in clear text within configuration files or within code. Toward that end, I propose an API that is configured using a list of URLs of CredentialProviders. The implementation will look for implementations using the ServiceLoader interface and thus support third party libraries. Two providers will be included in this patch. One using the credentials cache in MapReduce jobs and the other using Java KeyStores from either HDFS or local file system. A CredShell CLI will also be included in this patch which provides the ability to manage the credentials within the stores. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Work started] (HADOOP-10818) native client: refactor URI code to be clearer
[ https://issues.apache.org/jira/browse/HADOOP-10818?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Work on HADOOP-10818 started by Colin Patrick McCabe. native client: refactor URI code to be clearer -- Key: HADOOP-10818 URL: https://issues.apache.org/jira/browse/HADOOP-10818 Project: Hadoop Common Issue Type: Sub-task Components: native Affects Versions: HADOOP-10388 Reporter: Colin Patrick McCabe Assignee: Colin Patrick McCabe Attachments: HADOOP-10818-pnative.001.patch Refactor the {{common/uri.c}} code to be a bit clearer. We should just be able to refer to user_info, auth, port, path, etc. fields in the structure, rather than calling accessors. {{hdfsBuilder}} should just have a connection URI rather than separate fields for all these things. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HADOOP-2270) Title: DFS submit client params overrides final params on cluster
[ https://issues.apache.org/jira/browse/HADOOP-2270?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Allen Wittenauer updated HADOOP-2270: - Labels: newbie (was: ) Title: DFS submit client params overrides final params on cluster -- Key: HADOOP-2270 URL: https://issues.apache.org/jira/browse/HADOOP-2270 Project: Hadoop Common Issue Type: Bug Components: conf Affects Versions: 0.15.1 Reporter: Karam Singh Labels: newbie The HDFS client params override the params set as final on the HDFS cluster nodes. Default values of the client-side hadoop-site.xml override the final parameters of the HDFS hadoop-site.xml. Observed the following cases: 1. dfs.trash.root=/recycle, dfs.trash.interval=10 and dfs.replication=2 are marked final under hadoop-site.xml on the HDFS cluster. When the FsShell command hadoop dfs -put local_dir dest is fired from the submission host, files will still get replicated 3 times (the default) instead of the final dfs.replication=2. Similarly, when hadoop dfs -rmr dfs_dir or hadoop dfs -rm file_path is fired from the submit client, the file/directory gets deleted directly without being moved to /recycle. Here the hadoop-site.xml on the submit client does not specify dfs.trash.root, dfs.trash.interval or dfs.replication. The same is the case when we submit a mapred job from the client: job.xml displays default values which override the cluster values. 2. dfs.trash.root=/recycle, dfs.trash.interval=10 and dfs.replication=2 are marked final under hadoop-site.xml on the HDFS cluster, and dfs.trash.root=/rubbish, dfs.trash.interval=2 and dfs.replication=5 are set under hadoop-site.xml on the submit client. When the FsShell command hadoop dfs -put local_dir dest is fired from the submit client, files will get replicated 5 times instead of the final dfs.replication=2. Similarly, when hadoop dfs -rmr dfs_dir or hadoop dfs -rm file_path is fired from the submit client, the file/directory will be moved to /rubbish instead of /recycle. The same is the case when we submit a mapred job from the client; job.xml displays the following values: dfs.trash.root=/rubbish, dfs.trash.interval=2 and dfs.replication=5 -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HADOOP-2270) Title: DFS submit client params overrides final params on cluster
[ https://issues.apache.org/jira/browse/HADOOP-2270?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14065340#comment-14065340 ] Allen Wittenauer commented on HADOOP-2270: -- I doubt this is still an issue, but it would be good for someone to verify. I'll mark this as a newbie jira for someone to look at, just in case... Title: DFS submit client params overrides final params on cluster -- Key: HADOOP-2270 URL: https://issues.apache.org/jira/browse/HADOOP-2270 Project: Hadoop Common Issue Type: Bug Components: conf Affects Versions: 0.15.1 Reporter: Karam Singh Labels: newbie The HDFS client params override the params set as final on the HDFS cluster nodes. Default values of the client-side hadoop-site.xml override the final parameters of the HDFS hadoop-site.xml. Observed the following cases: 1. dfs.trash.root=/recycle, dfs.trash.interval=10 and dfs.replication=2 are marked final under hadoop-site.xml on the HDFS cluster. When the FsShell command hadoop dfs -put local_dir dest is fired from the submission host, files will still get replicated 3 times (the default) instead of the final dfs.replication=2. Similarly, when hadoop dfs -rmr dfs_dir or hadoop dfs -rm file_path is fired from the submit client, the file/directory gets deleted directly without being moved to /recycle. Here the hadoop-site.xml on the submit client does not specify dfs.trash.root, dfs.trash.interval or dfs.replication. The same is the case when we submit a mapred job from the client: job.xml displays default values which override the cluster values. 2. dfs.trash.root=/recycle, dfs.trash.interval=10 and dfs.replication=2 are marked final under hadoop-site.xml on the HDFS cluster, and dfs.trash.root=/rubbish, dfs.trash.interval=2 and dfs.replication=5 are set under hadoop-site.xml on the submit client. When the FsShell command hadoop dfs -put local_dir dest is fired from the submit client, files will get replicated 5 times instead of the final dfs.replication=2. Similarly, when hadoop dfs -rmr dfs_dir or hadoop dfs -rm file_path is fired from the submit client, the file/directory will be moved to /rubbish instead of /recycle. The same is the case when we submit a mapred job from the client; job.xml displays the following values: dfs.trash.root=/rubbish, dfs.trash.interval=2 and dfs.replication=5 -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Created] (HADOOP-10856) HarFileSystem and HarFs support for HDFS encryption
Andrew Wang created HADOOP-10856: Summary: HarFileSystem and HarFs support for HDFS encryption Key: HADOOP-10856 URL: https://issues.apache.org/jira/browse/HADOOP-10856 Project: Hadoop Common Issue Type: Bug Affects Versions: fs-encryption (HADOOP-10150 and HDFS-6134) Reporter: Andrew Wang Assignee: Andrew Wang We need to examine support for Har with HDFS encryption. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HADOOP-10856) HarFileSystem and HarFs support for HDFS encryption
[ https://issues.apache.org/jira/browse/HADOOP-10856?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14065379#comment-14065379 ] Alejandro Abdelnur commented on HADOOP-10856: - IMO, HAR simply has to support xAttrs. If you are doing a HAR under .raw, the same magic as for distcp will kick in. If you are doing a HAR outside of .raw, everything is unencrypted in the HAR. If your HAR file is within an encryption zone, the HAR file itself is encrypted. HarFileSystem and HarFs support for HDFS encryption --- Key: HADOOP-10856 URL: https://issues.apache.org/jira/browse/HADOOP-10856 Project: Hadoop Common Issue Type: Bug Affects Versions: fs-encryption (HADOOP-10150 and HDFS-6134) Reporter: Andrew Wang Assignee: Andrew Wang We need to examine support for Har with HDFS encryption. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HADOOP-9902) Shell script rewrite
[ https://issues.apache.org/jira/browse/HADOOP-9902?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14065387#comment-14065387 ] Allen Wittenauer commented on HADOOP-9902: -- A note for me: HDFS-2256 has an interesting idea for start-dfs.sh. Shell script rewrite Key: HADOOP-9902 URL: https://issues.apache.org/jira/browse/HADOOP-9902 Project: Hadoop Common Issue Type: Improvement Components: scripts Affects Versions: 3.0.0 Reporter: Allen Wittenauer Assignee: Allen Wittenauer Labels: releasenotes Attachments: HADOOP-9902-2.patch, HADOOP-9902-3.patch, HADOOP-9902-4.patch, HADOOP-9902-5.patch, HADOOP-9902-6.patch, HADOOP-9902.patch, HADOOP-9902.txt, hadoop-9902-1.patch, more-info.txt Umbrella JIRA for shell script rewrite. See more-info.txt for more details. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Resolved] (HADOOP-2462) MiniMRCluster does not utilize multiple local directories in mapred.local.dir
[ https://issues.apache.org/jira/browse/HADOOP-2462?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Allen Wittenauer resolved HADOOP-2462. -- Resolution: Incomplete Stale. MiniMRCluster does not utilize multiple local directories in mapred.local.dir --- Key: HADOOP-2462 URL: https://issues.apache.org/jira/browse/HADOOP-2462 Project: Hadoop Common Issue Type: Bug Components: test Affects Versions: 0.15.0 Reporter: Konstantin Shvachko My hadoop-site.xml specifies 4 local directories {code} <property> <name>mapred.local.dir</name> <value>${hadoop.tmp.dir}/mapred/local1, ${hadoop.tmp.dir}/mapred/local2, ${hadoop.tmp.dir}/mapred/local3, ${hadoop.tmp.dir}/mapred/local4</value> </property> {code} and I am looking at MiniMRCluster.TaskTrackerRunner. There are several things here: # localDirBase value is set to {code} /tmp/h/mapred/local1, /tmp/h/mapred/local2, /tmp/h/mapred/local3, /tmp/h/mapred/local4 {code} and I get a hierarchy of directories with commas and spaces in the names. I think this was not designed to work with multiple dirs. # Further down, all new directories are generated with the same name {code} File ttDir = new File(localDirBase, Integer.toString(trackerId) + "_" + 0); {code} So in fact only one directory is created. I think the intention was to have i instead of 0 {code} File ttDir = new File(localDirBase, Integer.toString(trackerId) + "_" + i); {code} # On Windows, MiniMRCluster.TaskTrackerRunner in this case throws an IOException, which is silently ignored by all but the TestMiniMRMapRedDebugScript MiniMR tests. {code} java.io.IOException: Mkdirs failed to create /tmp/h/mapred/local1, /tmp/h/mapred/local2, /tmp/h/mapred/local3, /tmp/h/mapred/local4/0_0 at org.apache.hadoop.mapred.MiniMRCluster$TaskTrackerRunner.<init>(MiniMRCluster.java:124) at org.apache.hadoop.mapred.MiniMRCluster.<init>(MiniMRCluster.java:293) at org.apache.hadoop.mapred.MiniMRCluster.<init>(MiniMRCluster.java:244) at org.apache.hadoop.mapred.TestMiniMRClasspath.testClassPath(TestMiniMRClasspath.java:163) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) at java.lang.reflect.Method.invoke(Method.java:585) at junit.framework.TestCase.runTest(TestCase.java:154) at junit.framework.TestCase.runBare(TestCase.java:127) at junit.framework.TestResult$1.protect(TestResult.java:106) at junit.framework.TestResult.runProtected(TestResult.java:124) at junit.framework.TestResult.run(TestResult.java:109) at junit.framework.TestCase.run(TestCase.java:118) at junit.framework.TestSuite.runTest(TestSuite.java:208) at junit.framework.TestSuite.run(TestSuite.java:203) at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.runTests(RemoteTestRunner.java:478) at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.run(RemoteTestRunner.java:344) at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.main(RemoteTestRunner.java:196) {code} I am marking it as Major because we actually do not test multiple local directories. Looks like it was introduced rather recently by HADOOP-1819. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HADOOP-10855) Allow Text to be read with a known length
[ https://issues.apache.org/jira/browse/HADOOP-10855?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14065386#comment-14065386 ] Hadoop QA commented on HADOOP-10855: {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12656306/hadoop-10855.txt against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 1 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:red}-1 core tests{color}. The patch failed these unit tests in hadoop-common-project/hadoop-common: org.apache.hadoop.fs.shell.TestCopyPreserveFlag org.apache.hadoop.fs.TestSymlinkLocalFSFileContext org.apache.hadoop.fs.shell.TestTextCommand org.apache.hadoop.ipc.TestIPC org.apache.hadoop.fs.TestSymlinkLocalFSFileSystem org.apache.hadoop.fs.shell.TestPathData org.apache.hadoop.fs.TestDFVariations {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-HADOOP-Build/4305//testReport/ Console output: https://builds.apache.org/job/PreCommit-HADOOP-Build/4305//console This message is automatically generated. Allow Text to be read with a known length - Key: HADOOP-10855 URL: https://issues.apache.org/jira/browse/HADOOP-10855 Project: Hadoop Common Issue Type: Improvement Components: io Affects Versions: 2.6.0 Reporter: Todd Lipcon Assignee: Todd Lipcon Priority: Minor Attachments: hadoop-10855.txt, hadoop-10855.txt, hadoop-10855.txt For the native task work (MAPREDUCE-2841) it is useful to be able to store strings in a different fashion than the default (varint-prefixed) serialization. We should provide a read method in Text which takes an already-known length to support this use case while still providing Text objects back to the user. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HADOOP-10778) Use NativeCrc32 only if it is faster
[ https://issues.apache.org/jira/browse/HADOOP-10778?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14065407#comment-14065407 ] Tsz Wo Nicholas Sze commented on HADOOP-10778: -- It is 2.6 GHz i7. How about we use crcutil? It is Apache License 2.0. Use NativeCrc32 only if it is faster Key: HADOOP-10778 URL: https://issues.apache.org/jira/browse/HADOOP-10778 Project: Hadoop Common Issue Type: Improvement Components: util Reporter: Tsz Wo Nicholas Sze Assignee: Tsz Wo Nicholas Sze Attachments: c10778_20140702.patch From the benchmark posted in [this comment|https://issues.apache.org/jira/browse/HDFS-6560?focusedCommentId=14044060page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14044060], NativeCrc32 is slower than java.util.zip.CRC32 for Java 7 and above when bytesPerChecksum > 512. -- This message was sent by Atlassian JIRA (v6.2#6252)
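For context, the kind of measurement behind that claim can be approximated with a rough probe of java.util.zip.CRC32 across chunk sizes. This is a hand-rolled sketch, not the JIRA's benchmark, and it only exercises the pure-Java side:
{code}
import java.util.zip.CRC32;

final class Crc32Probe {
  public static void main(String[] args) {
    byte[] data = new byte[64 * 1024 * 1024]; // zero-filled is fine for timing
    for (int bytesPerChecksum : new int[] {512, 4096, 65536}) {
      CRC32 crc = new CRC32();
      long start = System.nanoTime();
      for (int off = 0; off + bytesPerChecksum <= data.length; off += bytesPerChecksum) {
        crc.reset();
        crc.update(data, off, bytesPerChecksum); // one CRC per chunk, like checksummed I/O
      }
      long ms = (System.nanoTime() - start) / 1000000L;
      System.out.println(bytesPerChecksum + " bytes/checksum: " + ms + " ms");
    }
  }
}
{code}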
[jira] [Resolved] (HADOOP-2560) Processing multiple input splits per mapper task
[ https://issues.apache.org/jira/browse/HADOOP-2560?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Allen Wittenauer resolved HADOOP-2560. -- Resolution: Duplicate This appears to predate MFIF/CFIF, introduced by HADOOP-4565, which appears to fix the issue. I'm going to close this out as resolved. Processing multiple input splits per mapper task Key: HADOOP-2560 URL: https://issues.apache.org/jira/browse/HADOOP-2560 Project: Hadoop Common Issue Type: Bug Reporter: Runping Qi Assignee: dhruba borthakur Attachments: multipleSplitsPerMapper.patch Currently, an input split contains a consecutive chunk of input file, which by default corresponds to a DFS block. This may lead to a large number of mapper tasks if the input data is large. This leads to the following problems: 1. Shuffling cost: since the framework has to move M * R map output segments to the nodes running reducers, larger M means larger shuffling cost. 2. High JVM initialization overhead 3. Disk fragmentation: a larger number of map output files means lower read throughput for accessing them. Ideally, you want to keep the number of mappers to no more than 16 times the number of nodes in the cluster. To achieve that, we can increase the input split size. However, if a split spans more than one DFS block, you lose the data locality scheduling benefits. One way to address this problem is to combine multiple input blocks on the same rack into one split. If on average we combine B blocks into one split, then we will reduce the number of mappers by a factor of B. Since all the blocks for one mapper share a rack, we can benefit from rack-aware scheduling. Thoughts? -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Resolved] (HADOOP-2608) Reading sequence file consumes 100% cpu with maximum throughput being about 5MB/sec per process
[ https://issues.apache.org/jira/browse/HADOOP-2608?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Allen Wittenauer resolved HADOOP-2608. -- Resolution: Fixed I'm going to close this out as stale. Reading sequence file consumes 100% cpu with maximum throughput being about 5MB/sec per process --- Key: HADOOP-2608 URL: https://issues.apache.org/jira/browse/HADOOP-2608 Project: Hadoop Common Issue Type: Improvement Components: io Reporter: Runping Qi I did some tests on the throughput of scanning block-compressed sequence files. The sustained throughput was bounded at 5MB/sec per process, with the cpu of each process maxed at 100%. It seems to me that the cpu consumption is too high and the throughput is too low for just scanning files. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HADOOP-2681) NullPointerException in TaskRunner.java when system property hadoop.log.dir is not set
[ https://issues.apache.org/jira/browse/HADOOP-2681?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Allen Wittenauer updated HADOOP-2681: - Labels: newbie (was: ) NullPointerException in TaskRunner.java when system property hadoop.log.dir is not set Key: HADOOP-2681 URL: https://issues.apache.org/jira/browse/HADOOP-2681 Project: Hadoop Common Issue Type: Bug Components: conf Affects Versions: 0.15.2 Reporter: Xu Zhang Priority: Minor Labels: newbie Currently, a NullPointerException is thrown on line 321 in TaskRunner.java when the system property hadoop.log.dir is not set. Instead of a NullPointerException, I expected a default value for hadoop.log.dir to be used, or to see a more meaningful error message that could have helped me figure out what was wrong (like telling me that I needed to set hadoop.log.dir and how to do so). Here is one instance of such exceptions: WARN mapred.TaskRunner: task_200801181719_0001_m_00_0 Child Error java.lang.NullPointerException at java.io.File.<init>(File.java:222) at org.apache.hadoop.mapred.TaskRunner.run(TaskRunner.java:321) -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HADOOP-2681) NullPointerException in TaskRunner.java when system property hadoop.log.dir is not set
[ https://issues.apache.org/jira/browse/HADOOP-2681?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14065448#comment-14065448 ] Allen Wittenauer commented on HADOOP-2681: -- We should double-check *all* of the references to hadoop.log.dir. HADOOP-9902 gives some guarantees that this is properly set, but the Java code should be more forgiving. NullPointerException in TaskRunner.java when system property hadoop.log.dir is not set Key: HADOOP-2681 URL: https://issues.apache.org/jira/browse/HADOOP-2681 Project: Hadoop Common Issue Type: Bug Components: conf Affects Versions: 0.15.2 Reporter: Xu Zhang Priority: Minor Labels: newbie Currently, a NullPointerException is thrown on line 321 in TaskRunner.java when the system property hadoop.log.dir is not set. Instead of a NullPointerException, I expected a default value for hadoop.log.dir to be used, or to see a more meaningful error message that could have helped me figure out what was wrong (like telling me that I needed to set hadoop.log.dir and how to do so). Here is one instance of such exceptions: WARN mapred.TaskRunner: task_200801181719_0001_m_00_0 Child Error java.lang.NullPointerException at java.io.File.<init>(File.java:222) at org.apache.hadoop.mapred.TaskRunner.run(TaskRunner.java:321) -- This message was sent by Atlassian JIRA (v6.2#6252)
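A minimal sketch of the "more forgiving" behavior suggested above: fall back to a default and print an actionable message instead of letting a null reach new File(...). The fallback path is an assumption for illustration only, not a documented default:
{code}
import java.io.File;

final class LogDirSketch {
  static File resolveLogDir() {
    String dir = System.getProperty("hadoop.log.dir");
    if (dir == null) {
      // assumed fallback; real code would pick a documented default
      dir = System.getProperty("java.io.tmpdir") + File.separator + "hadoop-logs";
      System.err.println("hadoop.log.dir is not set; falling back to " + dir
          + ". Set it with -Dhadoop.log.dir=... or in hadoop-env.sh.");
    }
    return new File(dir);
  }
}
{code}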
[jira] [Commented] (HADOOP-2689) RegEx support for expressing datanodes in the slaves conf files
[ https://issues.apache.org/jira/browse/HADOOP-2689?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14065453#comment-14065453 ] Allen Wittenauer commented on HADOOP-2689: -- It should be noted that with HADOOP-9902, it is possible to replace the slaves code to handle this in a much easier fashion. But this would be a good enhancement for the follow-up to that jira. RegEx support for expressing datanodes in the slaves conf files --- Key: HADOOP-2689 URL: https://issues.apache.org/jira/browse/HADOOP-2689 Project: Hadoop Common Issue Type: Improvement Components: conf Affects Versions: 0.14.4 Environment: All Reporter: Venkat Ramachandran It would be very handy if datanodes and task trackers could be expressed in the slaves conf file as regular expressions. For example, machine[1-200].corp machine[400-679].corp -- This message was sent by Atlassian JIRA (v6.2#6252)
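A purely illustrative sketch of expanding the proposed bracket syntax; this feature does not exist in Hadoop, and the range notation is taken from the report itself:
{code}
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class HostRangeExpander {
  private static final Pattern RANGE = Pattern.compile("\\[(\\d+)-(\\d+)\\]");

  // "machine[1-3].corp" -> [machine1.corp, machine2.corp, machine3.corp]
  static List<String> expand(String entry) {
    List<String> hosts = new ArrayList<String>();
    Matcher m = RANGE.matcher(entry);
    if (!m.find()) {
      hosts.add(entry); // plain hostname, no range to expand
      return hosts;
    }
    int lo = Integer.parseInt(m.group(1));
    int hi = Integer.parseInt(m.group(2));
    for (int i = lo; i <= hi; i++) {
      hosts.add(entry.substring(0, m.start()) + i + entry.substring(m.end()));
    }
    return hosts;
  }
}
{code}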
[jira] [Updated] (HADOOP-2715) Review and document '_' prefix convention in input directories
[ https://issues.apache.org/jira/browse/HADOOP-2715?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Allen Wittenauer updated HADOOP-2715: - Labels: newbie (was: ) Review and document '_' prefix convention in input directories -- Key: HADOOP-2715 URL: https://issues.apache.org/jira/browse/HADOOP-2715 Project: Hadoop Common Issue Type: Bug Components: documentation Reporter: eric baldeschwieler Labels: newbie We use files and directories prefixed with '_' to store logs, metadata and other info that might be useful to the owner of a job within the output directory. The standard input methods then ignore such files by default. HADOOP-2391 led to some discussion of the '_' convention in output directories. Not all developers' input formats support this. We should review the convention and document it well so that future input methods support it, or we should come up with an alternate approach. My hope is that after some discussion we will close this bug by creating a documentation patch explaining the convention. It sounds like the convention is implemented via some input filter classes. We should discuss whether this generic solution is helping or obscuring the intent of the convention. Perhaps we should just have a non-configurable filter, so '_' prefixed files are treated like '.' prefixed files by most unix tools. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HADOOP-2715) Review and document '_' prefix convention in input directories
[ https://issues.apache.org/jira/browse/HADOOP-2715?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Allen Wittenauer updated HADOOP-2715: - Component/s: documentation Review and document '_' prefix convention in input directories -- Key: HADOOP-2715 URL: https://issues.apache.org/jira/browse/HADOOP-2715 Project: Hadoop Common Issue Type: Bug Components: documentation Reporter: eric baldeschwieler Labels: newbie We use files and directories prefixed with '_' to store logs, metadata and other info that might be useful to the owner of a job within the output directory. The standard input methods then ignore such files by default. HADOOP-2391 led to some discussion of the '_' convention in output directories. Not all developers' input formats support this. We should review the convention and document it well so that future input methods support it, or we should come up with an alternate approach. My hope is that after some discussion we will close this bug by creating a documentation patch explaining the convention. It sounds like the convention is implemented via some input filter classes. We should discuss whether this generic solution is helping or obscuring the intent of the convention. Perhaps we should just have a non-configurable filter, so '_' prefixed files are treated like '.' prefixed files by most unix tools. -- This message was sent by Atlassian JIRA (v6.2#6252)
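The filter convention under discussion boils down to something like the following PathFilter, a sketch in the spirit of FileInputFormat's hidden-file filter rather than the exact class:
{code}
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.fs.PathFilter;

// Skip paths whose final component starts with '_' or '.', so log and
// metadata files inside an output directory are not picked up as input.
final class HiddenPathFilter implements PathFilter {
  @Override
  public boolean accept(Path path) {
    String name = path.getName();
    return !name.startsWith("_") && !name.startsWith(".");
  }
}
{code}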
[jira] [Commented] (HADOOP-10793) KeyShell and CredentialShell args should use single-dash style
[ https://issues.apache.org/jira/browse/HADOOP-10793?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14065475#comment-14065475 ] Owen O'Malley commented on HADOOP-10793: Andrew, please change the jira so that all of the commands support the proper two dashes. KeyShell and CredentialShell args should use single-dash style -- Key: HADOOP-10793 URL: https://issues.apache.org/jira/browse/HADOOP-10793 Project: Hadoop Common Issue Type: Improvement Components: security Affects Versions: 3.0.0 Reporter: Mike Yoder Follow-on from HADOOP-10736 as per [~andrew.wang] - the key shell uses the gnu double dash style for command line args, while other command line programs use a single dash. Consider changing this, and consider another argument parsing scheme, like the CommandLine class. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HADOOP-10607) Create an API to Separate Credentials/Password Storage from Applications
[ https://issues.apache.org/jira/browse/HADOOP-10607?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14065485#comment-14065485 ] Owen O'Malley commented on HADOOP-10607: Andrew, it has to get released before it can be used by external components. Is there a technical concern with it getting in the 2.6 release? Create an API to Separate Credentials/Password Storage from Applications Key: HADOOP-10607 URL: https://issues.apache.org/jira/browse/HADOOP-10607 Project: Hadoop Common Issue Type: New Feature Components: security Reporter: Larry McCay Assignee: Larry McCay Fix For: 3.0.0, 2.6.0 Attachments: 10607-10.patch, 10607-11.patch, 10607-12.patch, 10607-2.patch, 10607-3.patch, 10607-4.patch, 10607-5.patch, 10607-6.patch, 10607-7.patch, 10607-8.patch, 10607-9.patch, 10607-branch-2.patch, 10607.patch As with the filesystem API, we need to provide a generic mechanism to support multiple credential storage mechanisms that are potentially from third parties. We need the ability to eliminate the storage of passwords and secrets in clear text within configuration files or within code. Toward that end, I propose an API that is configured using a list of URLs of CredentialProviders. The implementation will look for implementations using the ServiceLoader interface and thus support third party libraries. Two providers will be included in this patch. One using the credentials cache in MapReduce jobs and the other using Java KeyStores from either HDFS or local file system. A CredShell CLI will also be included in this patch which provides the ability to manage the credentials within the stores. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HADOOP-10793) KeyShell and CredentialShell args should use single-dash style
[ https://issues.apache.org/jira/browse/HADOOP-10793?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14065493#comment-14065493 ] Owen O'Malley commented on HADOOP-10793: Sorry, to be clear I mean that you should fix the rest of the Hadoop commands to accept either one or two dashes. Obviously the old commands can't require two dashes without breaking compatibility. KeyShell and CredentialShell args should use single-dash style -- Key: HADOOP-10793 URL: https://issues.apache.org/jira/browse/HADOOP-10793 Project: Hadoop Common Issue Type: Improvement Components: security Affects Versions: 3.0.0 Reporter: Mike Yoder Follow-on from HADOOP-10736 as per [~andrew.wang] - the key shell uses the gnu double dash style for command line args, while other command line programs use a single dash. Consider changing this, and consider another argument parsing scheme, like the CommandLine class. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Resolved] (HADOOP-2776) Web interface uses internal hostnames on EC2
[ https://issues.apache.org/jira/browse/HADOOP-2776?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Allen Wittenauer resolved HADOOP-2776. -- Resolution: Won't Fix I'm going to close this as won't fix. I don't think this is anything that we actually can fix here other than providing a complicated hostname mapping system for web interfaces. Part of the frustration I'm sure stems from a misunderstanding of what is actually happening: bq. The slaves file has the public names listed. The slaves file is only used by the shell code to run ssh connections. It has absolutely zero impact on the core of Hadoop. bq. Resolving a public name inside EC2 returns the private IP (which would reverse to the internal DNS name). Hadoop makes the perfectly valid assumption that the hostname the system tells us is a valid, network-connectable hostname. It is, from the inside of EC2. We have no way to know that you are attempting to connect from a completely different address that is being forwarded from some external entity. Proxying connections into a private network space is a perfectly valid solution. Web interface uses internal hostnames on EC2 Key: HADOOP-2776 URL: https://issues.apache.org/jira/browse/HADOOP-2776 Project: Hadoop Common Issue Type: Bug Components: contrib/cloud Affects Versions: 0.15.1 Environment: EC2 ami-a324c1ca Reporter: David Phillips The web interface, for example http://$MASTER_HOST:50030/machines.jsp, uses internal hostnames when running on EC2. This makes it impossible to access from outside EC2. The slaves file has the public names listed. Resolving a public name inside EC2 returns the private IP (which would reverse to the internal DNS name). -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HADOOP-10793) KeyShell and CredentialShell args should use single-dash style
[ https://issues.apache.org/jira/browse/HADOOP-10793?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14065497#comment-14065497 ] Andrew Wang commented on HADOOP-10793: -- Hey Owen, since that'd be incompatible, we can't switch everything over until 3.0. If this stuff wants to appear in a 2.x, I think consistency is the most important consideration, and thus should use a single dash. Even for 3.0, I don't think the ROI is positive. The Hadoop commands have used a single dash forever, and there's precedent for this style in the {{java}} command. Hadoop users at this point are used to it. KeyShell and CredentialShell args should use single-dash style -- Key: HADOOP-10793 URL: https://issues.apache.org/jira/browse/HADOOP-10793 Project: Hadoop Common Issue Type: Improvement Components: security Affects Versions: 3.0.0 Reporter: Mike Yoder Follow-on from HADOOP-10736 as per [~andrew.wang] - the key shell uses the gnu double dash style for command line args, while other command line programs use a single dash. Consider changing this, and consider another argument parsing scheme, like the CommandLine class. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HADOOP-10793) KeyShell and CredentialShell args should use single-dash style
[ https://issues.apache.org/jira/browse/HADOOP-10793?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14065508#comment-14065508 ] Owen O'Malley commented on HADOOP-10793: It won't be incompatible if you accept either one or two dashes. KeyShell and CredentialShell args should use single-dash style -- Key: HADOOP-10793 URL: https://issues.apache.org/jira/browse/HADOOP-10793 Project: Hadoop Common Issue Type: Improvement Components: security Affects Versions: 3.0.0 Reporter: Mike Yoder Follow-on from HADOOP-10736 as per [~andrew.wang] - the key shell uses the gnu double dash style for command line args, while other command line programs use a single dash. Consider changing this, and consider another argument parsing scheme, like the CommandLine class. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HADOOP-10817) ProxyUsers configuration should support configurable prefixes
[ https://issues.apache.org/jira/browse/HADOOP-10817?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14065513#comment-14065513 ] Andrew Wang commented on HADOOP-10817: -- Hi Tucu, thanks for working on this. I took a look and had some review comments: * ProxyUsers#refreshSUGC: we should init the new sip before assigning it to the volatile variable. This way it's not visible before it's ready. DefaultImpersonationProvider: * Good time to add some comments about the regexes, namely example matches * Would be good to add a Precondition check that init has been called wherever we use configPrefix or related fields * Some basic checking of configPrefix in init as well, e.g. not empty, not null * ImpersonationProvider needs class annotations. Just a reminder: if this is a public interface, adding a new method is incompatible. I do see it in branch-2 already. ProxyUsers configuration should support configurable prefixes -- Key: HADOOP-10817 URL: https://issues.apache.org/jira/browse/HADOOP-10817 Project: Hadoop Common Issue Type: Improvement Components: security Affects Versions: 3.0.0 Reporter: Alejandro Abdelnur Assignee: Alejandro Abdelnur Attachments: HADOOP-10817.patch, HADOOP-10817.patch Currently {{ProxyUsers}} and the {{ImpersonationProvider}} are hardcoded to use {{hadoop.proxyuser.}} prefixes for loading proxy user configuration. Adding the possibility of using a custom prefix will enable reusing the {{ProxyUsers}} class from other components (i.e. HttpFS and KMS). -- This message was sent by Atlassian JIRA (v6.2#6252)
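The first review point is a standard safe-publication pattern; a minimal sketch, with illustrative names rather than the actual patch code:
{code}
public class ProviderHolder {
  interface Provider {
    void init();               // load config, compile regexes, etc.
  }

  // Readers see either the old provider or a fully initialized new one.
  private volatile Provider sip;

  public void refresh(Provider fresh) {
    fresh.init();              // initialize first...
    sip = fresh;               // ...then publish via the volatile write
  }

  public Provider get() {
    return sip;
  }
}
{code}
If the assignment happened before {{init()}}, a concurrent reader could observe a provider whose regexes and config are not yet loaded; the volatile write after full initialization rules that out.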
[jira] [Resolved] (HADOOP-2835) hadoop fs -help ... should not require a NameNode to show help messages
[ https://issues.apache.org/jira/browse/HADOOP-2835?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Allen Wittenauer resolved HADOOP-2835. -- Resolution: Fixed Long ago fixed. hadoop fs -help ... should not require a NameNode to show help messages - Key: HADOOP-2835 URL: https://issues.apache.org/jira/browse/HADOOP-2835 Project: Hadoop Common Issue Type: Improvement Components: fs Reporter: Tsz Wo Nicholas Sze Priority: Minor For example, if we do hadoop fs -help get before starting a NameNode, we will get {code} bash-3.2$ ./bin/hadoop fs -help get 08/02/14 15:59:52 INFO ipc.Client: Retrying connect to server: some-host:some-port. Already tried 1 time(s). 08/02/14 15:59:54 INFO ipc.Client: Retrying connect to server: some-host:some-port. Already tried 2 time(s). 08/02/14 15:59:56 INFO ipc.Client: Retrying connect to server: some-host:some-port. Already tried 3 time(s). 08/02/14 15:59:58 INFO ipc.Client: Retrying connect to server: some-host:some-port. Already tried 4 time(s). 08/02/14 16:00:00 INFO ipc.Client: Retrying connect to server: some-host:some-port. Already tried 5 time(s). 08/02/14 16:00:02 INFO ipc.Client: Retrying connect to server: some-host:some-port. Already tried 6 time(s). 08/02/14 16:00:04 INFO ipc.Client: Retrying connect to server: some-host:some-port. Already tried 7 time(s). 08/02/14 16:00:06 INFO ipc.Client: Retrying connect to server: some-host:some-port. Already tried 8 time(s). 08/02/14 16:00:08 INFO ipc.Client: Retrying connect to server: some-host:some-port. Already tried 9 time(s). 08/02/14 16:00:10 INFO ipc.Client: Retrying connect to server: some-host:some-port. Already tried 10 time(s). Bad connection to FS. command aborted. {code} -- This message was sent by Atlassian JIRA (v6.2#6252)
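The shape of the fix is simple; a minimal sketch (not the actual FsShell code) of checking for the help flag before any client connection is created:
{code}
public class HelpFirst {
  public static void main(String[] args) throws Exception {
    for (String a : args) {
      if ("-help".equals(a)) {
        printUsage();
        return;                // exit before any RPC is attempted
      }
    }
    // Only now create the client, e.g. FileSystem.get(new Configuration()),
    // which is the step that retries against an unreachable NameNode.
  }

  static void printUsage() {
    System.out.println("-get <src> <localdst>: copy files to the local file system");
  }
}
{code}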
[jira] [Resolved] (HADOOP-2846) Large input data-sets throw java.net.SocketTimeoutException: timed out waiting for rpc response exception
[ https://issues.apache.org/jira/browse/HADOOP-2846?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Allen Wittenauer resolved HADOOP-2846. -- Resolution: Cannot Reproduce Closing this as stale, especially since it is likely long since fixed. Large input data-sets throw java.net.SocketTimeoutException: timed out waiting for rpc response exception --- Key: HADOOP-2846 URL: https://issues.apache.org/jira/browse/HADOOP-2846 Project: Hadoop Common Issue Type: Bug Affects Versions: 0.15.3 Reporter: Amir Youssefi Pig scripts can run over a data set of 1 day. Using the same script and same number of nodes on a larger data set (of 30 days) fails and throws the following exception after 1+ hour of running. java.net.SocketTimeoutException: timed out waiting for rpc response at org.apache.hadoop.ipc.Client.call(Client.java:484) at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:184) at $Proxy1.getJobStatus(Unknown Source) at sun.reflect.GeneratedMethodAccessor12.invoke(Unknown Source) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) at java.lang.reflect.Method.invoke(Method.java:597) at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:82) at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:59) at $Proxy1.getJobStatus(Unknown Source) at org.apache.hadoop.mapred.JobClient$NetworkedJob.ensureFreshStatus(JobClient.java:182) at org.apache.hadoop.mapred.JobClient$NetworkedJob.isComplete(JobClient.java:237) at org.apache.pig.impl.mapreduceExec.MapReduceLauncher.launchPig(MapReduceLauncher.java:189) at org.apache.pig.impl.physicalLayer.POMapreduce.open(POMapreduce.java:136) at org.apache.pig.impl.physicalLayer.POMapreduce.open(POMapreduce.java:129) at org.apache.pig.impl.physicalLayer.POMapreduce.open(POMapreduce.java:129) at org.apache.pig.impl.physicalLayer.PhysicalPlan.exec(PhysicalPlan.java:39) at org.apache.pig.impl.physicalLayer.IntermedResult.exec(IntermedResult.java:122) at org.apache.pig.PigServer.store(PigServer.java:445) at org.apache.pig.PigServer.store(PigServer.java:413) at org.apache.pig.tools.grunt.GruntParser.processStore(GruntParser.java:135) at org.apache.pig.tools.pigscript.parser.PigScriptParser.parse(PigScriptParser.java:327) at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:54) at org.apache.pig.tools.grunt.Grunt.exec(Grunt.java:54) at org.apache.pig.Main.main(Main.java:258) timed out waiting for rpc response Re-running always fails at the same 3% progress. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Resolved] (HADOOP-2860) ant tar should not copy the modified configs into the tarball
[ https://issues.apache.org/jira/browse/HADOOP-2860?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Allen Wittenauer resolved HADOOP-2860. -- Resolution: Won't Fix We no longer use Ant. (Insert Pink Panther theme here, using the words "dead ant" to represent the horn section.) ant tar should not copy the modified configs into the tarball --- Key: HADOOP-2860 URL: https://issues.apache.org/jira/browse/HADOOP-2860 Project: Hadoop Common Issue Type: Bug Components: build Reporter: Owen O'Malley When generating releases, it is counter-intuitive that the tarball contains the configuration files from the developer's test environment. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HADOOP-10793) KeyShell and CredentialShell args should use single-dash style
[ https://issues.apache.org/jira/browse/HADOOP-10793?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14065541#comment-14065541 ] Andrew Wang commented on HADOOP-10793: -- How about this: - We do single-dash for Key/CredShell in branch-2 - File a new JIRA for trunk to switch everything over to some new style This way we have consistency in branch-2 and consistency in trunk. I'd like all commands to behave the same way. Meta point: I think that accepting both - and -- as equivalent is not great, since it invents our own new style. It's neither UNIX-style long and short args, nor Java/Hadoop-style single-dash always. I'd like to stick with something with at least some precedent. KeyShell and CredentialShell args should use single-dash style -- Key: HADOOP-10793 URL: https://issues.apache.org/jira/browse/HADOOP-10793 Project: Hadoop Common Issue Type: Improvement Components: security Affects Versions: 3.0.0 Reporter: Mike Yoder Follow-on from HADOOP-10736 as per [~andrew.wang] - the key shell uses the GNU double-dash style for command line args, while other command line programs use a single dash. Consider changing this, and consider another argument parsing scheme, like the CommandLine class. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Resolved] (HADOOP-2864) Improve the Scalability and Robustness of IPC
[ https://issues.apache.org/jira/browse/HADOOP-2864?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Allen Wittenauer resolved HADOOP-2864. -- Resolution: Fixed This has changed so much since this JIRA was filed that I'm just going to close this as stale. Improve the Scalability and Robustness of IPC - Key: HADOOP-2864 URL: https://issues.apache.org/jira/browse/HADOOP-2864 Project: Hadoop Common Issue Type: Improvement Components: ipc Affects Versions: 0.16.0 Reporter: Hairong Kuang Assignee: Hairong Kuang Attachments: RPCScalabilityDesignWeb.pdf This JIRA is intended to enhance IPC's scalability and robustness. Currently an IPC server can easily hang due to a disk failure or garbage collection, during which it cannot respond to clients promptly. This has caused a lot of dropped calls and delayed responses, and thus many running applications fail on timeout. On the other side, if busy clients send a lot of requests to the server in a short period of time, or too many clients communicate with the server simultaneously, the server may be swarmed by requests and cannot work responsively. The proposed changes aim to # provide better client/server coordination #* The server should be able to throttle clients during bursts of requests. #* A slow client should not prevent the server from serving other clients. #* A temporarily hanging server should not cause catastrophic failures to clients. # Client/server should detect remote-side failures. Examples of failures include: (1) the remote host has crashed; (2) the remote host has crashed and then rebooted; (3) the remote process has crashed or been shut down by an operator; # Fairness. Each client should be able to make progress. -- This message was sent by Atlassian JIRA (v6.2#6252)
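As a concrete illustration of the throttling goal, here is a minimal Java sketch using a bounded permit pool. This is one possible design sketched under assumptions, not code from the attached proposal; the class name and the limit of 100 are illustrative.
{code}
import java.util.concurrent.Semaphore;

public class ThrottledHandler {
  // Illustrative limit: at most 100 requests in flight at once.
  private final Semaphore permits = new Semaphore(100);

  public String handle(String request) throws InterruptedException {
    permits.acquire();          // bursts beyond the limit block here
    try {
      return process(request);
    } finally {
      permits.release();
    }
  }

  private String process(String request) {
    return "ok:" + request;
  }
}
{code}
Bounding in-flight work like this keeps a burst of clients from exhausting handler threads and memory, at the cost of making excess callers wait.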
[jira] [Resolved] (HADOOP-2882) HOD should put line breaks in to hadoop-site.xml
[ https://issues.apache.org/jira/browse/HADOOP-2882?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Allen Wittenauer resolved HADOOP-2882. -- Resolution: Won't Fix HOD is just a legend. Did it really exist? No one knows. HOD should put line breaks in to hadoop-site.xml Key: HADOOP-2882 URL: https://issues.apache.org/jira/browse/HADOOP-2882 Project: Hadoop Common Issue Type: Improvement Reporter: Owen O'Malley It would help a lot if the hadoop-site files generated by HOD were readable. Newlines would be a good start. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Resolved] (HADOOP-2892) providing temp space management for applications
[ https://issues.apache.org/jira/browse/HADOOP-2892?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Allen Wittenauer resolved HADOOP-2892. -- Resolution: Duplicate In one sense, this has been resolved by the usage of /tmp. But in reality, this request has been reborn in HDFS-6382. providing temp space management for applications Key: HADOOP-2892 URL: https://issues.apache.org/jira/browse/HADOOP-2892 Project: Hadoop Common Issue Type: New Feature Reporter: Olga Natkovich It would be great if Hadoop could provide temp space for applications to use. This would be useful for any applications that chain M-R jobs, perform checkpoints, and need to store some application-specific temp results. DeleteOnExit for files and directories would be ideal. -- This message was sent by Atlassian JIRA (v6.2#6252)
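The FileSystem API did eventually grow exactly the delete-on-exit primitive asked for here; a short sketch of how an application can use it (the scratch path is illustrative):
{code}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class TempScratch {
  public static void main(String[] args) throws Exception {
    FileSystem fs = FileSystem.get(new Configuration());
    Path scratch = new Path("/tmp/myjob-" + System.currentTimeMillis());
    fs.mkdirs(scratch);
    fs.deleteOnExit(scratch);   // deleted when this FileSystem is closed
    // ... chain M-R jobs, write checkpoints under `scratch` ...
    fs.close();                 // triggers the deferred delete
  }
}
{code}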
[jira] [Resolved] (HADOOP-2921) align map splits on sorted files with key boundaries
[ https://issues.apache.org/jira/browse/HADOOP-2921?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Allen Wittenauer resolved HADOOP-2921. -- Resolution: Fixed Stale align map splits on sorted files with key boundaries Key: HADOOP-2921 URL: https://issues.apache.org/jira/browse/HADOOP-2921 Project: Hadoop Common Issue Type: New Feature Affects Versions: 0.16.0 Reporter: Joydeep Sen Sarma (This is something that we have implemented in the application layer - it may be useful to have in Hadoop itself.) Long-term log storage systems often keep data sorted (by some sort-key). Future computations on such files can often benefit from this sort order. If the job requires grouping by the sort-key, then it should be possible to do reduction in the map stage itself. This is not natively supported by Hadoop (except in the degenerate case of 1 map file per task), since splits can span the sort-key. However, aligning the data read by the map task to sort-key boundaries is straightforward, and this would be a useful capability to have in Hadoop. The definition of the sort key should be left up to the application (it's not necessarily the key field in a SequenceFile), exposed through a generic interface - but otherwise the SequenceFile and text file readers can use the extracted sort key to align map task data with key boundaries. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Resolved] (HADOOP-2922) sequencefiles without keys
[ https://issues.apache.org/jira/browse/HADOOP-2922?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Allen Wittenauer resolved HADOOP-2922. -- Resolution: Won't Fix Stale sequencefiles without keys -- Key: HADOOP-2922 URL: https://issues.apache.org/jira/browse/HADOOP-2922 Project: Hadoop Common Issue Type: New Feature Affects Versions: 0.16.0 Reporter: Joydeep Sen Sarma SequenceFiles are invaluable for storing compressed/binary data, but when we use them to store serialized records, we don't use the key part at all (we just put something dummy there to satisfy the API). I have heard of other projects using the same tactics (jaql/cascading). So this is a request for a modified version of SequenceFiles that doesn't incur the space and compute overhead of processing/storing these dummy keys. -- This message was sent by Atlassian JIRA (v6.2#6252)
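The conventional workaround (rather than a new file format) is to use {{NullWritable}} as the key class, which serializes to zero bytes and so avoids most of the dummy-key overhead described here; a short sketch, with an illustrative path:
{code}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.NullWritable;
import org.apache.hadoop.io.SequenceFile;
import org.apache.hadoop.io.Text;

public class KeylessSeqFile {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    FileSystem fs = FileSystem.get(conf);
    Path path = new Path("/tmp/values-only.seq");
    SequenceFile.Writer writer =
        SequenceFile.createWriter(fs, conf, path, NullWritable.class, Text.class);
    try {
      // NullWritable.get() writes no bytes, so only the value costs space.
      writer.append(NullWritable.get(), new Text("record with no real key"));
    } finally {
      writer.close();
    }
  }
}
{code}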
[jira] [Resolved] (HADOOP-2975) IPC server should not allocate a buffer for each request
[ https://issues.apache.org/jira/browse/HADOOP-2975?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Allen Wittenauer resolved HADOOP-2975. -- Resolution: Incomplete A lot has changed here. Closing as stale. IPC server should not allocate a buffer for each request Key: HADOOP-2975 URL: https://issues.apache.org/jira/browse/HADOOP-2975 Project: Hadoop Common Issue Type: Improvement Components: ipc Affects Versions: 0.16.0 Reporter: Hairong Kuang Assignee: Ankur Attachments: Hadoop-2975-v1.patch, Hadoop-2975-v2.patch, Hadoop-2975-v3.patch Currently the IPC server allocates a buffer for each incoming request. The buffer is thrown away after the request is deserialized. This leads to very inefficient heap utilization. It would be better if all requests from one connection could share the same common buffer, since the IPC server reads only one request from a socket at a time. -- This message was sent by Atlassian JIRA (v6.2#6252)
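A minimal Java sketch of the idea (not the actual IPC server code or the attached patches): keep one growable buffer per connection and reset it between requests instead of allocating a fresh array for every call.
{code}
import java.io.ByteArrayOutputStream;

public class ConnectionBuffer {
  // One buffer per connection; grows to fit the largest request seen.
  private final ByteArrayOutputStream buf = new ByteArrayOutputStream(64 * 1024);

  public byte[] nextRequest(byte[] wireBytes, int len) {
    buf.reset();                  // reuse the existing backing array
    buf.write(wireBytes, 0, len);
    return buf.toByteArray();     // copy handed to the call handler
  }
}
{code}
Because a connection reads only one request at a time, the reuse is safe without extra locking on the buffer itself.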
[jira] [Resolved] (HADOOP-2960) A mapper should use some heuristics to decide whether to run the combiner during spills
[ https://issues.apache.org/jira/browse/HADOOP-2960?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Allen Wittenauer resolved HADOOP-2960. -- Resolution: Won't Fix Closing as won't fix, given the -1. A mapper should use some heuristics to decide whether to run the combiner during spills --- Key: HADOOP-2960 URL: https://issues.apache.org/jira/browse/HADOOP-2960 Project: Hadoop Common Issue Type: Bug Reporter: Runping Qi Right now, the combiner, if set, will be called for each spill, no matter whether the combiner can actually reduce the values. The mapper should use some heuristics to decide whether to run the combiner during spills. One such heuristic is to check the ratio of the number of keys to the number of unique keys in the spill. The combiner will be called only if that ratio exceeds a certain threshold (say 2). -- This message was sent by Atlassian JIRA (v6.2#6252)
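A minimal Java sketch of the heuristic as described, with illustrative names; the threshold of 2 is taken from the description above:
{code}
public class CombinerHeuristic {
  private static final double MIN_RECORDS_PER_UNIQUE_KEY = 2.0;

  /** Run the combiner only when keys repeat enough to shrink the spill. */
  static boolean shouldCombine(long numRecords, long numUniqueKeys) {
    if (numUniqueKeys == 0) {
      return false;              // empty spill, nothing to combine
    }
    return (double) numRecords / numUniqueKeys >= MIN_RECORDS_PER_UNIQUE_KEY;
  }
}
{code}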
[jira] [Resolved] (HADOOP-2980) slow reduce copies - map output locations not being fetched even when map complete
[ https://issues.apache.org/jira/browse/HADOOP-2980?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Allen Wittenauer resolved HADOOP-2980. -- Resolution: Incomplete I'm going to close this as stale. If people are still seeing this as an issue, they should file a new JIRA with new data! slow reduce copies - map output locations not being fetched even when map complete -- Key: HADOOP-2980 URL: https://issues.apache.org/jira/browse/HADOOP-2980 Project: Hadoop Common Issue Type: Bug Affects Versions: 0.15.3 Reporter: Joydeep Sen Sarma Maps are long finished. Reduces are stuck looking for map locations. They make progress, but slowly. It almost seems like they get new map locations every minute or so: 2008-03-07 18:50:52,737 INFO org.apache.hadoop.mapred.ReduceTask: task_200803041231_3586_r_21_0 done copying task_200803041231_3586_m_004620_0 output from hadoop082.sf2p.facebook.com.. 2008-03-07 18:50:53,733 INFO org.apache.hadoop.mapred.ReduceTask: task_200803041231_3586_r_21_0: Got 0 new map-outputs 0 obsolete map-outputs from tasktracker and 0 map-outputs from previous failures 2008-03-07 18:50:53,733 INFO org.apache.hadoop.mapred.ReduceTask: task_200803041231_3586_r_21_0 Got 0 known map output location(s); scheduling... ... 2008-03-07 18:51:49,767 INFO org.apache.hadoop.mapred.ReduceTask: task_200803041231_3586_r_21_0 Got 50 known map output location(s); scheduling... 2008-03-07 18:51:49,767 INFO org.apache.hadoop.mapred.ReduceTask: task_200803041231_3586_r_21_0 Scheduled 41 of 50 known outputs (0 slow hosts and 9 dup hosts) They get about 50 locations at a time, and this 1-minute delay pattern is surprisingly common. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Resolved] (HADOOP-3037) Hudson needs to add src/test for checking javac warnings
[ https://issues.apache.org/jira/browse/HADOOP-3037?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Allen Wittenauer resolved HADOOP-3037. -- Resolution: Not a Problem Warning checks have been there for a while. Hudson needs to add src/test for checking javac warnings Key: HADOOP-3037 URL: https://issues.apache.org/jira/browse/HADOOP-3037 Project: Hadoop Common Issue Type: Improvement Components: build Affects Versions: hudson Reporter: Amareshwari Sriramadasu Fix For: hudson I think src/test is not included in the javac warnings check. HADOOP-3031 looks at newly introduced warnings. Hudson needs to add src/test for checking javac warnings -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Resolved] (HADOOP-3120) Large #of tasks failing at one time can effectively hang the jobtracker
[ https://issues.apache.org/jira/browse/HADOOP-3120?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Allen Wittenauer resolved HADOOP-3120. -- Resolution: Incomplete I'm going to close this as stale. Large #of tasks failing at one time can effectively hang the jobtracker Key: HADOOP-3120 URL: https://issues.apache.org/jira/browse/HADOOP-3120 Project: Hadoop Common Issue Type: Bug Environment: Linux/Hadoop-15.3 Reporter: Pete Wyckoff Priority: Minor We think that JobTracker.removeMarkedTasks does so much logging when this happens (i.e., logging thousands of failed tasks per cycle) that nothing else can go on (since it's called from a synchronized method), and thus by the next cycle the next waves of jobs have failed and we again have tens of thousands of failures to log, and on and on. At least, the above is what we observed: just a continual printing of those failures and nothing else happening, on and on. Of course the original jobs may have ultimately failed, but new jobs come in to perpetuate the problem. This has happened to us a number of times, and since we commented out the log.info in that method we haven't had any problems. Then again, thousands and thousands of task failures are hopefully not that common. -- This message was sent by Atlassian JIRA (v6.2#6252)
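The reporter's workaround was to comment the log line out entirely; a gentler variant, sketched below in Java with illustrative names, would rate-limit the logging instead so the synchronized section stays cheap during a failure storm:
{code}
public class ThrottledLog {
  private long lastLogMillis;
  private long suppressed;

  public synchronized void logFailure(String msg) {
    long now = System.currentTimeMillis();
    if (now - lastLogMillis >= 1000) {   // at most one line per second
      System.err.println(msg + " (+" + suppressed + " similar suppressed)");
      lastLogMillis = now;
      suppressed = 0;
    } else {
      suppressed++;                      // count instead of printing
    }
  }
}
{code}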
[jira] [Resolved] (HADOOP-3122) test-patch target should check @SuppressWarnings(...)
[ https://issues.apache.org/jira/browse/HADOOP-3122?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Allen Wittenauer resolved HADOOP-3122. -- Resolution: Incomplete Closing this as stale. test-patch target should check @SuppressWarnings(...) - Key: HADOOP-3122 URL: https://issues.apache.org/jira/browse/HADOOP-3122 Project: Hadoop Common Issue Type: Improvement Components: build Reporter: Tsz Wo Nicholas Sze Priority: Minor The @SuppressWarnings(...) annotation can be used to get rid of compiler warnings. In our patch process, QA should check for @SuppressWarnings(...) to prevent abuse of this tag. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Resolved] (HADOOP-3126) org.apache.hadoop.examples.RandomTextWriter$Counters fluctuate when RandomTextWriter job is running
[ https://issues.apache.org/jira/browse/HADOOP-3126?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Allen Wittenauer resolved HADOOP-3126. -- Resolution: Fixed I'm going to close this as fixed since it probably was. org.apache.hadoop.examples.RandomTextWriter$Counters fluctuate when RandomTextWriter job is running --- Key: HADOOP-3126 URL: https://issues.apache.org/jira/browse/HADOOP-3126 Project: Hadoop Common Issue Type: Bug Reporter: Runping Qi On the web GUI page, the values for RECORDS_WRITTEN and BYTES_WRITTEN do not increase monotonically. Rather, their values go up and down. I suspect something is wrong with how the counters are updated. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Resolved] (HADOOP-3148) build-contrib.xml should inherit hadoop version parameter from root build.xml
[ https://issues.apache.org/jira/browse/HADOOP-3148?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Allen Wittenauer resolved HADOOP-3148. -- Resolution: Fixed Stale issue. build-contrib.xml should inherit hadoop version parameter from root build.xml - Key: HADOOP-3148 URL: https://issues.apache.org/jira/browse/HADOOP-3148 Project: Hadoop Common Issue Type: Bug Components: build Reporter: Vinod Kumar Vavilapalli Priority: Minor This is needed in HOD (and may be useful in other contrib projects), which, in some cases, may be compiled and built separately. After HADOOP-3137, HOD will obtain its version from the build parameter ${version}, and this will fail to give the proper version when built independently (at the src/contrib/hod level). -- This message was sent by Atlassian JIRA (v6.2#6252)