[jira] [Updated] (HADOOP-9616) In branch-2, baseline of Javadoc Warnings (specified in test-patch.properties) is mismatch with Javadoc warnings in current codebase

2013-06-04 Thread Junping Du (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-9616?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Junping Du updated HADOOP-9616:
---

Attachment: patchJavadocWarnings-old.txt

Attached the previous Javadoc check output as *-old.txt.

 In branch-2, baseline of Javadoc Warnings (specified in 
 test-patch.properties) is mismatch with  Javadoc warnings in current codebase
 -

 Key: HADOOP-9616
 URL: https://issues.apache.org/jira/browse/HADOOP-9616
 Project: Hadoop Common
  Issue Type: Bug
Reporter: Junping Du
Assignee: Junping Du
 Attachments: HADOOP-9616.patch, patchJavadocWarnings-old.txt


 Now the baseline is set to 13 warnings, but there are 29 warnings now. 16 of 
 them come from use of Sun proprietary APIs, and 13 from incorrect links in the 
 docs. I think we should at least fix the 13 link warnings and set the 
 baseline to 16.
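
For context, the branch-2 baseline lives in dev-support/test-patch.properties. A sketch of the kind of change proposed follows; the property name and value are written from memory and should be treated as assumptions, not the actual patch:

```properties
# dev-support/test-patch.properties (illustrative sketch, not the actual patch)
# Acceptable number of Javadoc warnings before test-patch flags a regression.
# Raising it to 16 accounts for the remaining Sun-proprietary-API warnings
# after the 13 broken-link warnings are fixed.
OK_JAVADOC_WARNINGS=16
```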

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HADOOP-9616) In branch-2, baseline of Javadoc Warnings (specified in test-patch.properties) is mismatch with Javadoc warnings in current codebase

2013-06-04 Thread Junping Du (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-9616?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Junping Du updated HADOOP-9616:
---

Attachment: HADOOP-9616.patch

Attached a patch that fixes these Javadoc issues.



[jira] [Updated] (HADOOP-9616) In branch-2, baseline of Javadoc Warnings (specified in test-patch.properties) is mismatch with Javadoc warnings in current codebase

2013-06-04 Thread Junping Du (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-9616?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Junping Du updated HADOOP-9616:
---

Attachment: patchJavadocWarnings-new.txt

Attached the Javadoc check output after applying this patch as *-new.txt.



[jira] [Updated] (HADOOP-9616) In branch-2, baseline of Javadoc Warnings (specified in test-patch.properties) is mismatch with Javadoc warnings in current codebase

2013-06-04 Thread Junping Du (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-9616?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Junping Du updated HADOOP-9616:
---

 Tags: javadoc
 Target Version/s: 2.1.0-beta
Affects Version/s: 2.0.4-alpha
   Status: Patch Available  (was: Open)



[jira] [Commented] (HADOOP-9616) In branch-2, baseline of Javadoc Warnings (specified in test-patch.properties) is mismatch with Javadoc warnings in current codebase

2013-06-04 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-9616?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13674187#comment-13674187
 ] 

Hadoop QA commented on HADOOP-9616:
---

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  
http://issues.apache.org/jira/secure/attachment/12586073/patchJavadocWarnings-new.txt
  against trunk revision .

{color:red}-1 patch{color}.  The patch command could not apply the patch.

Console output: 
https://builds.apache.org/job/PreCommit-HADOOP-Build/2598//console

This message is automatically generated.



[jira] [Commented] (HADOOP-9616) In branch-2, baseline of Javadoc Warnings (specified in test-patch.properties) is mismatch with Javadoc warnings in current codebase

2013-06-04 Thread Junping Du (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-9616?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13674192#comment-13674192
 ] 

Junping Du commented on HADOOP-9616:


This patch is for branch-2 only, so it cannot be applied by the trunk Jenkins build.



[jira] [Commented] (HADOOP-9392) Token based authentication and Single Sign On

2013-06-04 Thread Andrew Purtell (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-9392?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13674196#comment-13674196
 ] 

Andrew Purtell commented on HADOOP-9392:


When is the meetup?

 Token based authentication and Single Sign On
 -

 Key: HADOOP-9392
 URL: https://issues.apache.org/jira/browse/HADOOP-9392
 Project: Hadoop Common
  Issue Type: New Feature
  Components: security
Reporter: Kai Zheng
Assignee: Kai Zheng
 Fix For: 3.0.0

 Attachments: token-based-authn-plus-sso.pdf


 This is an umbrella entry for one of project Rhino’s topics; for details of 
 project Rhino, please refer to 
 https://github.com/intel-hadoop/project-rhino/. The major goal for this entry, 
 as described in project Rhino, was:
  
 “Core, HDFS, ZooKeeper, and HBase currently support Kerberos authentication 
 at the RPC layer, via SASL. However this does not provide valuable attributes 
 such as group membership, classification level, organizational identity, or 
 support for user defined attributes. Hadoop components must interrogate 
 external resources for discovering these attributes and at scale this is 
 problematic. There is also no consistent delegation model. HDFS has a simple 
 delegation capability, and only Oozie can take limited advantage of it. We 
 will implement a common token based authentication framework to decouple 
 internal user and service authentication from external mechanisms used to 
 support it (like Kerberos)”
  
 We’d like to start our work from Hadoop-Common and try to provide common 
 facilities by extending the existing authentication framework to support:
 1. Pluggable token provider interface
 2. Pluggable token verification protocol and interface
 3. Security mechanism to distribute secrets in cluster nodes
 4. Delegation model of user authentication



[jira] [Commented] (HADOOP-9287) Parallel testing hadoop-common

2013-06-04 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-9287?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13674249#comment-13674249
 ] 

Hudson commented on HADOOP-9287:


Integrated in Hadoop-Yarn-trunk #230 (See 
[https://builds.apache.org/job/Hadoop-Yarn-trunk/230/])
Move HADOOP-9287 in CHANGES.txt after committing to branch-2 (Revision 
1489258)

 Result = FAILURE
jlowe : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1489258
Files : 
* /hadoop/common/trunk/hadoop-common-project/hadoop-common/CHANGES.txt


 Parallel testing hadoop-common
 --

 Key: HADOOP-9287
 URL: https://issues.apache.org/jira/browse/HADOOP-9287
 Project: Hadoop Common
  Issue Type: Test
  Components: test
Affects Versions: 3.0.0
Reporter: Tsuyoshi OZAWA
Assignee: Andrey Klochkov
 Fix For: 3.0.0, 2.1.0-beta

 Attachments: HADOOP-9287.1.patch, HADOOP-9287-branch-2--N1.patch, 
 HADOOP-9287--N3.patch, HADOOP-9287--N3.patch, HADOOP-9287--N4.patch, 
 HADOOP-9287--N5.patch, HADOOP-9287--N6.patch, HADOOP-9287--N7.patch, 
 HADOOP-9287.patch, HADOOP-9287.patch


 The Maven Surefire plugin supports a parallel testing feature. By using it, the 
 tests can run faster.
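
As an illustrative sketch (not the actual HADOOP-9287 patch), enabling Surefire's parallel mode in a pom.xml looks roughly like this; the plugin coordinates and the parallel/threadCount options are standard Surefire configuration, but the specific values here are assumptions:

```xml
<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-surefire-plugin</artifactId>
  <configuration>
    <!-- run test classes in parallel threads within one forked JVM -->
    <parallel>classes</parallel>
    <threadCount>4</threadCount>
  </configuration>
</plugin>
```

Note that tests sharing mutable global state (ports, temp directories) must be isolated before such a setting is safe, which is what most of the work in a parallel-testing patch typically consists of.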



[jira] [Commented] (HADOOP-9481) Broken conditional logic with HADOOP_SNAPPY_LIBRARY

2013-06-04 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-9481?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13674242#comment-13674242
 ] 

Hudson commented on HADOOP-9481:


Integrated in Hadoop-Yarn-trunk #230 (See 
[https://builds.apache.org/job/Hadoop-Yarn-trunk/230/])
HADOOP-9481. Move from trunk to release 2.1.0 section (Revision 1489261)

 Result = FAILURE
suresh : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1489261
Files : 
* /hadoop/common/trunk/hadoop-common-project/hadoop-common/CHANGES.txt


 Broken conditional logic with HADOOP_SNAPPY_LIBRARY
 ---

 Key: HADOOP-9481
 URL: https://issues.apache.org/jira/browse/HADOOP-9481
 Project: Hadoop Common
  Issue Type: Bug
Affects Versions: 3.0.0, 2.1.0-beta
Reporter: Vadim Bondarev
Assignee: Vadim Bondarev
Priority: Minor
 Fix For: 3.0.0, 2.1.0-beta

 Attachments: HADOOP-9481-trunk--N1.patch, HADOOP-9481-trunk--N4.patch


 The problem is a regression introduced by the recent fix 
 https://issues.apache.org/jira/browse/HADOOP-8562.
 That fix makes some improvements for the Windows platform, but breaks the 
 native code on Unix.
 Namely, consider the HADOOP-8562 diff of the file 
 hadoop-common-project/hadoop-common/src/main/native/src/org/apache/hadoop/io/compress/snappy/SnappyCompressor.c:
 {noformat}
 --- 
 hadoop-common-project/hadoop-common/src/main/native/src/org/apache/hadoop/io/compress/snappy/SnappyCompressor.c
 +++ 
 hadoop-common-project/hadoop-common/src/main/native/src/org/apache/hadoop/io/compress/snappy/SnappyCompressor.c
 @@ -16,12 +16,18 @@
   * limitations under the License.
   */
 -#include <dlfcn.h>
 +
 +#if defined HADOOP_SNAPPY_LIBRARY
 +
  #include <stdio.h>
  #include <stdlib.h>
  #include <string.h>
 +#ifdef UNIX
 +#include <dlfcn.h>
  #include "config.h"
 +#endif // UNIX
 +
  #include "org_apache_hadoop_io_compress_snappy.h"
  #include "org_apache_hadoop_io_compress_snappy_SnappyCompressor.h"
 @@ -81,7 +87,7 @@ JNIEXPORT jint JNICALL 
 Java_org_apache_hadoop_io_compress_snappy_SnappyCompresso
    UNLOCK_CLASS(env, clazz, SnappyCompressor);
    if (uncompressed_bytes == 0) {
 -    return 0;
 +    return (jint)0;
    }
    // Get the output direct buffer
 @@ -90,7 +96,7 @@ JNIEXPORT jint JNICALL 
 Java_org_apache_hadoop_io_compress_snappy_SnappyCompresso
    UNLOCK_CLASS(env, clazz, SnappyCompressor);
    if (compressed_bytes == 0) {
 -    return 0;
 +    return (jint)0;
    }
    /* size_t should always be 4 bytes or larger. */
 @@ -109,3 +115,5 @@ JNIEXPORT jint JNICALL 
 Java_org_apache_hadoop_io_compress_snappy_SnappyCompresso
    (*env)->SetIntField(env, thisj, SnappyCompressor_uncompressedDirectBufLen, 
 0);
    return (jint)buf_len;
  }
 +
 +#endif //define HADOOP_SNAPPY_LIBRARY
 {noformat}
 Here we see that the whole class implementation got enclosed in an "#if defined 
 HADOOP_SNAPPY_LIBRARY" directive, and the point is that 
 HADOOP_SNAPPY_LIBRARY is *not* defined. 
 This causes the class implementation to be effectively empty, which, in turn, 
 causes an UnsatisfiedLinkError to be thrown at runtime upon any attempt 
 to invoke the native methods implemented there.
 The actual intention of the authors of HADOOP-8562 was (as we suppose) to 
 include "config.h", where HADOOP_SNAPPY_LIBRARY is defined. But 
 currently it is *not* included, because it resides *inside* the "#if defined 
 HADOOP_SNAPPY_LIBRARY" block.
 The situation is similar with "#ifdef UNIX": the UNIX and WINDOWS macros are 
 defined in org_apache_hadoop.h, which is indirectly included through 
 "org_apache_hadoop_io_compress_snappy.h", and in the current code 
 this happens *after* the "#ifdef UNIX" block, so that block is 
 *not* compiled on Unix.
 The suggested patch fixes the described problems by reordering the #include 
 and #if preprocessor directives accordingly, bringing the methods of the class 
 org.apache.hadoop.io.compress.snappy.SnappyCompressor back to working order.
 Of course, the Snappy native libraries must be installed in order to build and 
 invoke the Snappy native methods.
 (Note: there was a typo in the commit message: 8952 was written in place of 8562: 
 HADOOP-8952. Enhancements to support Hadoop on Windows Server and Windows 
 Azure environments. Contributed by Ivan Mitic, Chuan Liu, Ramya Sunil, Bikas 
 Saha, Kanna Karanam, John Gordon, Brandon Li, Chris Nauroth, David Lao, 
 Sumadhur Reddy Bolli, Arpit Agarwal, Ahmed El Baz, Mike Liddell, Jing Zhao, 
 Thejas Nair, Steve Maine, Ganeshan Iyer, Raja Aluri, Giridharan Kesavan, 
 Ramya Bharathi Nimmagadda.
 git-svn-id: https://svn.apache.org/repos/asf/hadoop/common/trunk@1453486 
 13f79535-47bb-0310-9956-ffa450edef68
 )


[jira] [Commented] (HADOOP-9397) Incremental dist tar build fails

2013-06-04 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-9397?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13674240#comment-13674240
 ] 

Hudson commented on HADOOP-9397:


Integrated in Hadoop-Yarn-trunk #230 (See 
[https://builds.apache.org/job/Hadoop-Yarn-trunk/230/])
Move HADOOP-9397 to 2.1.0-beta after merging it into branch-2. (Revision 
1489026)

 Result = FAILURE
jlowe : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1489026
Files : 
* /hadoop/common/trunk/hadoop-common-project/hadoop-common/CHANGES.txt


 Incremental dist tar build fails
 

 Key: HADOOP-9397
 URL: https://issues.apache.org/jira/browse/HADOOP-9397
 Project: Hadoop Common
  Issue Type: Bug
  Components: build
Affects Versions: 3.0.0
Reporter: Jason Lowe
Assignee: Chris Nauroth
 Fix For: 3.0.0, 2.1.0-beta

 Attachments: HADOOP-9397.1.patch


 Building a dist tar build when the dist tarball already exists from a 
 previous build fails.



[jira] [Commented] (HADOOP-9614) smart-test-patch.sh hangs for new version of patch (2.7.1)

2013-06-04 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-9614?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13674321#comment-13674321
 ] 

Hudson commented on HADOOP-9614:


Integrated in Hadoop-Hdfs-0.23-Build #628 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-0.23-Build/628/])
HADOOP-9614. smart-test-patch.sh hangs for new version of patch (2.7.1) 
(Ravi Prakash via jeagles) (Revision 1489159)

 Result = UNSTABLE
jeagles : 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1489159
Files : 
* /hadoop/common/branches/branch-0.23/dev-support/smart-apply-patch.sh
* 
/hadoop/common/branches/branch-0.23/hadoop-common-project/hadoop-common/CHANGES.txt


 smart-test-patch.sh hangs for new version of patch (2.7.1)
 --

 Key: HADOOP-9614
 URL: https://issues.apache.org/jira/browse/HADOOP-9614
 Project: Hadoop Common
  Issue Type: Bug
Affects Versions: 3.0.0, 0.23.7, 2.0.4-alpha
Reporter: Ravi Prakash
Assignee: Ravi Prakash
 Fix For: 3.0.0, 0.23.8, 2.0.5-alpha

 Attachments: HADOOP-9614.patch, HADOOP-9614.patch


 patch -p0 -E --dry-run prints "checking file ..." for the new version of 
 patch (2.7.1) rather than "patching file ..." as it did for older versions. This 
 causes TMP2 to become empty, which causes the script to hang forever on this 
 command:
 PREFIX_DIRS_AND_FILES=$(cut -d '/' -f 1 | sort | uniq)



[jira] [Updated] (HADOOP-8545) Filesystem Implementation for OpenStack Swift

2013-06-04 Thread Steve Loughran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-8545?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Loughran updated HADOOP-8545:
---

Status: Open  (was: Patch Available)

 Filesystem Implementation for OpenStack Swift
 -

 Key: HADOOP-8545
 URL: https://issues.apache.org/jira/browse/HADOOP-8545
 Project: Hadoop Common
  Issue Type: New Feature
  Components: fs
Affects Versions: 2.0.3-alpha, 1.2.0
Reporter: Tim Miller
Assignee: Dmitry Mezhensky
  Labels: hadoop, patch
 Attachments: HADOOP-8545-026.patch, HADOOP-8545-027.patch, 
 HADOOP-8545-028.patch, HADOOP-8545-10.patch, HADOOP-8545-11.patch, 
 HADOOP-8545-12.patch, HADOOP-8545-13.patch, HADOOP-8545-14.patch, 
 HADOOP-8545-15.patch, HADOOP-8545-16.patch, HADOOP-8545-17.patch, 
 HADOOP-8545-18.patch, HADOOP-8545-19.patch, HADOOP-8545-1.patch, 
 HADOOP-8545-20.patch, HADOOP-8545-21.patch, HADOOP-8545-22.patch, 
 HADOOP-8545-23.patch, HADOOP-8545-24.patch, HADOOP-8545-25.patch, 
 HADOOP-8545-2.patch, HADOOP-8545-3.patch, HADOOP-8545-4.patch, 
 HADOOP-8545-5.patch, HADOOP-8545-6.patch, HADOOP-8545-7.patch, 
 HADOOP-8545-8.patch, HADOOP-8545-9.patch, HADOOP-8545-javaclouds-2.patch, 
 HADOOP-8545.patch, HADOOP-8545.patch


 Add a filesystem implementation for the OpenStack Swift object store, similar to 
 the one which exists today for S3.



[jira] [Commented] (HADOOP-9287) Parallel testing hadoop-common

2013-06-04 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-9287?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13674340#comment-13674340
 ] 

Hudson commented on HADOOP-9287:


Integrated in Hadoop-Hdfs-trunk #1420 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk/1420/])
Move HADOOP-9287 in CHANGES.txt after committing to branch-2 (Revision 
1489258)

 Result = FAILURE
jlowe : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1489258
Files : 
* /hadoop/common/trunk/hadoop-common-project/hadoop-common/CHANGES.txt




[jira] [Commented] (HADOOP-9614) smart-test-patch.sh hangs for new version of patch (2.7.1)

2013-06-04 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-9614?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13674342#comment-13674342
 ] 

Hudson commented on HADOOP-9614:


Integrated in Hadoop-Hdfs-trunk #1420 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk/1420/])
HADOOP-9614. smart-test-patch.sh hangs for new version of patch (2.7.1) 
(Ravi Prakash via jeagles) (Revision 1489136)

 Result = FAILURE
jeagles : 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1489136
Files : 
* /hadoop/common/trunk/dev-support/smart-apply-patch.sh
* /hadoop/common/trunk/hadoop-common-project/hadoop-common/CHANGES.txt




[jira] [Commented] (HADOOP-9397) Incremental dist tar build fails

2013-06-04 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-9397?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13674331#comment-13674331
 ] 

Hudson commented on HADOOP-9397:


Integrated in Hadoop-Hdfs-trunk #1420 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk/1420/])
Move HADOOP-9397 to 2.1.0-beta after merging it into branch-2. (Revision 
1489026)

 Result = FAILURE
jlowe : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1489026
Files : 
* /hadoop/common/trunk/hadoop-common-project/hadoop-common/CHANGES.txt




[jira] [Commented] (HADOOP-9481) Broken conditional logic with HADOOP_SNAPPY_LIBRARY

2013-06-04 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-9481?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13674333#comment-13674333
 ] 

Hudson commented on HADOOP-9481:


Integrated in Hadoop-Hdfs-trunk #1420 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk/1420/])
HADOOP-9481. Move from trunk to release 2.1.0 section (Revision 1489261)

 Result = FAILURE
suresh : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1489261
Files : 
* /hadoop/common/trunk/hadoop-common-project/hadoop-common/CHANGES.txt



[jira] [Commented] (HADOOP-9615) Hadoop Jar command not working when used with Spring ORM

2013-06-04 Thread Deepa Vasanthkumar (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-9615?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13674386#comment-13674386
 ] 

Deepa Vasanthkumar commented on HADOOP-9615:



As a workaround, I removed all configuration XMLs from the jar file and placed 
them in the working directory.
This resolved the issue for the time being.


 Hadoop Jar command not working when used with Spring ORM
 

 Key: HADOOP-9615
 URL: https://issues.apache.org/jira/browse/HADOOP-9615
 Project: Hadoop Common
  Issue Type: Bug
  Components: fs
Affects Versions: 2.0.0-alpha
 Environment: CentOS, 
Reporter: Deepa Vasanthkumar
  Labels: hadoop-2.0

 Unable to invoke the 'hadoop jar' command for a class which contains a Spring 
 persistence unit.
 The problem is that the jar file uses Spring ORM for loading the persistence 
 configurations, and based on these configurations,
 I need to move the files to HDFS. 
 While invoking the jar with the hadoop jar command (having Spring ORM injected), 
 the exception is: 
 Exception in thread "main" 
 org.springframework.beans.factory.BeanCreationException: Error creating bean 
 with name 
 'org.springframework.dao.annotation.PersistenceExceptionTranslationPostProcessor#0'
  defined in class path resource [applicationContext.xml]
 Error creating bean with name 'entityManagerFactory' defined in class path 
 resource [applicationContext.xml]: 
 Invocation of init method failed; nested exception is 
 java.lang.IllegalStateException: Conflicting persistence unit definitions for 
 name 'Persistance': file:/home/user/Desktop/ABC/apnJar.jar, 
 file:/tmp/hadoop-user/hadoop-unjar2841422106164401019/
 Caused by: java.lang.IllegalStateException: Conflicting persistence unit 
 definitions for name 'Persistance': 

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HADOOP-9397) Incremental dist tar build fails

2013-06-04 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-9397?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13674399#comment-13674399
 ] 

Hudson commented on HADOOP-9397:


Integrated in Hadoop-Mapreduce-trunk #1446 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1446/])
Move HADOOP-9397 to 2.1.0-beta after merging it into branch-2. (Revision 
1489026)

 Result = SUCCESS
jlowe : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1489026
Files : 
* /hadoop/common/trunk/hadoop-common-project/hadoop-common/CHANGES.txt


 Incremental dist tar build fails
 

 Key: HADOOP-9397
 URL: https://issues.apache.org/jira/browse/HADOOP-9397
 Project: Hadoop Common
  Issue Type: Bug
  Components: build
Affects Versions: 3.0.0
Reporter: Jason Lowe
Assignee: Chris Nauroth
 Fix For: 3.0.0, 2.1.0-beta

 Attachments: HADOOP-9397.1.patch


 A dist tar build fails when the dist tarball already exists from a previous 
 build.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HADOOP-9614) smart-test-patch.sh hangs for new version of patch (2.7.1)

2013-06-04 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-9614?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13674410#comment-13674410
 ] 

Hudson commented on HADOOP-9614:


Integrated in Hadoop-Mapreduce-trunk #1446 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1446/])
HADOOP-9614. smart-test-patch.sh hangs for new version of patch (2.7.1) 
(Ravi Prakash via jeagles) (Revision 1489136)

 Result = SUCCESS
jeagles : 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1489136
Files : 
* /hadoop/common/trunk/dev-support/smart-apply-patch.sh
* /hadoop/common/trunk/hadoop-common-project/hadoop-common/CHANGES.txt


 smart-test-patch.sh hangs for new version of patch (2.7.1)
 --

 Key: HADOOP-9614
 URL: https://issues.apache.org/jira/browse/HADOOP-9614
 Project: Hadoop Common
  Issue Type: Bug
Affects Versions: 3.0.0, 0.23.7, 2.0.4-alpha
Reporter: Ravi Prakash
Assignee: Ravi Prakash
 Fix For: 3.0.0, 0.23.8, 2.0.5-alpha

 Attachments: HADOOP-9614.patch, HADOOP-9614.patch


 patch -p0 -E --dry-run prints "checking file " with the new version of 
 patch (2.7.1) rather than "patching file " as it did with older versions. This 
 causes TMP2 to become empty, which causes the script to hang forever on this 
 command:
 PREFIX_DIRS_AND_FILES=$(cut -d '/' -f 1 | sort | uniq)

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HADOOP-9481) Broken conditional logic with HADOOP_SNAPPY_LIBRARY

2013-06-04 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-9481?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13674401#comment-13674401
 ] 

Hudson commented on HADOOP-9481:


Integrated in Hadoop-Mapreduce-trunk #1446 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1446/])
HADOOP-9481. Move from trunk to release 2.1.0 section (Revision 1489261)

 Result = SUCCESS
suresh : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1489261
Files : 
* /hadoop/common/trunk/hadoop-common-project/hadoop-common/CHANGES.txt


 Broken conditional logic with HADOOP_SNAPPY_LIBRARY
 ---

 Key: HADOOP-9481
 URL: https://issues.apache.org/jira/browse/HADOOP-9481
 Project: Hadoop Common
  Issue Type: Bug
Affects Versions: 3.0.0, 2.1.0-beta
Reporter: Vadim Bondarev
Assignee: Vadim Bondarev
Priority: Minor
 Fix For: 3.0.0, 2.1.0-beta

 Attachments: HADOOP-9481-trunk--N1.patch, HADOOP-9481-trunk--N4.patch


 The problem is a regression introduced by the recent fix 
 https://issues.apache.org/jira/browse/HADOOP-8562 .
 That fix makes some improvements for the Windows platform, but breaks the 
 native code on Unix.
 Namely, consider the HADOOP-8562 diff of the file 
 hadoop-common-project/hadoop-common/src/main/native/src/org/apache/hadoop/io/compress/snappy/SnappyCompressor.c
  :  
 {noformat}
 --- hadoop-common-project/hadoop-common/src/main/native/src/org/apache/hadoop/io/compress/snappy/SnappyCompressor.c
 +++ hadoop-common-project/hadoop-common/src/main/native/src/org/apache/hadoop/io/compress/snappy/SnappyCompressor.c
 @@ -16,12 +16,18 @@
   * limitations under the License.
   */
 -#include <dlfcn.h>
 +
 +#if defined HADOOP_SNAPPY_LIBRARY
 +
  #include <stdio.h>
  #include <stdlib.h>
  #include <string.h>
 +#ifdef UNIX
 +#include <dlfcn.h>
  #include "config.h"
 +#endif // UNIX
 +
  #include "org_apache_hadoop_io_compress_snappy.h"
  #include "org_apache_hadoop_io_compress_snappy_SnappyCompressor.h"
 @@ -81,7 +87,7 @@ JNIEXPORT jint JNICALL Java_org_apache_hadoop_io_compress_snappy_SnappyCompresso
    UNLOCK_CLASS(env, clazz, "SnappyCompressor");
    if (uncompressed_bytes == 0) {
 -    return 0;
 +    return (jint)0;
    }
    // Get the output direct buffer
 @@ -90,7 +96,7 @@ JNIEXPORT jint JNICALL Java_org_apache_hadoop_io_compress_snappy_SnappyCompresso
    UNLOCK_CLASS(env, clazz, "SnappyCompressor");
    if (compressed_bytes == 0) {
 -    return 0;
 +    return (jint)0;
    }
    /* size_t should always be 4 bytes or larger. */
 @@ -109,3 +115,5 @@ JNIEXPORT jint JNICALL Java_org_apache_hadoop_io_compress_snappy_SnappyCompresso
    (*env)->SetIntField(env, thisj, SnappyCompressor_uncompressedDirectBufLen, 0);
    return (jint)buf_len;
  }
 +
 +#endif //define HADOOP_SNAPPY_LIBRARY
 {noformat}
 Here we see that the whole class implementation got enclosed in an 
 "#if defined HADOOP_SNAPPY_LIBRARY" block, and the point is that 
 HADOOP_SNAPPY_LIBRARY is *not* defined. 
 This causes the class implementation to be effectively empty, which, in turn, 
 causes an UnsatisfiedLinkError to be thrown at runtime upon any attempt 
 to invoke the native methods implemented there.
 The actual intention of the authors of HADOOP-8562 was (we suppose) to 
 include "config.h", where HADOOP_SNAPPY_LIBRARY is defined. But 
 currently it is *not* included, because it resides *inside* the "#if defined 
 HADOOP_SNAPPY_LIBRARY" block.
 The situation is similar with "#ifdef UNIX": the UNIX and WINDOWS macros are 
 defined in org_apache_hadoop.h, which is indirectly included through 
 "org_apache_hadoop_io_compress_snappy.h", but in the current code this 
 happens *after* the "#ifdef UNIX" check, so the "#ifdef UNIX" block is *not* 
 compiled on Unix.
 The suggested patch fixes the described problems by reordering the #include 
 and #if preprocessor directives accordingly, bringing the methods of class 
 org.apache.hadoop.io.compress.snappy.SnappyCompressor back to work.
 Of course, the Snappy native libraries must be installed to build and invoke 
 the Snappy native methods.
 (Note: there was a typo in the commit message: 8952 was written in place of 8562: 
 HADOOP-8952. Enhancements to support Hadoop on Windows Server and Windows 
 Azure environments. Contributed by Ivan Mitic, Chuan Liu, Ramya Sunil, Bikas 
 Saha, Kanna Karanam, John Gordon, Brandon Li, Chris Nauroth, David Lao, 
 Sumadhur Reddy Bolli, Arpit Agarwal, Ahmed El Baz, Mike Liddell, Jing Zhao, 
 Thejas Nair, Steve Maine, Ganeshan Iyer, Raja Aluri, Giridharan Kesavan, 
 Ramya Bharathi Nimmagadda.
git-svn-id: https://svn.apache.org/repos/asf/hadoop/common/trunk@1453486 
 13f79535-47bb-0310-9956-ffa450edef68
 )

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (HADOOP-9533) Centralized Hadoop SSO/Token Server

2013-06-04 Thread Daryn Sharp (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-9533?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13674475#comment-13674475
 ] 

Daryn Sharp commented on HADOOP-9533:
-

Initial thoughts: passing a bearer token over the network creates a weakest 
link - it's dangerous if a rogue or untrustworthy service is involved.  SASL 
can be leveraged to avoid passing the actual bearer token over the network, 
although that might complicate the desired design.

The nice thing about the SASL abstraction is that mechanisms can easily be 
swapped out without writing code.  While I'm not opposed to SSL/TLS, relying 
solely on SSL (at least OpenSSL) for authentication may not be a good idea, 
since it seems to be rife with security flaws.  I'd consider SSL an extra 
security layer, not the only security layer.

 Centralized Hadoop SSO/Token Server
 ---

 Key: HADOOP-9533
 URL: https://issues.apache.org/jira/browse/HADOOP-9533
 Project: Hadoop Common
  Issue Type: New Feature
  Components: security
Reporter: Larry McCay
 Attachments: HSSO-Interaction-Overview-rev-1.docx, 
 HSSO-Interaction-Overview-rev-1.pdf


 This is an umbrella Jira filing to oversee a set of proposals for introducing 
 a new master service for Hadoop Single Sign On (HSSO).
 There is an increasing need for pluggable authentication providers that 
 authenticate both users and services as well as validate tokens in order to 
 federate identities authenticated by trusted IDPs. These IDPs may be deployed 
 within the enterprise or third-party IDPs that are external to the enterprise.
 These needs speak to a specific pain point: a narrow integration 
 path into the enterprise identity infrastructure. Kerberos is a fine solution 
 for those that already have it in place or are willing to adopt its use but 
 there remains a class of user that finds this unacceptable and needs to 
 integrate with a wider variety of identity management solutions.
 Another specific pain point is that of rolling and distributing keys. A 
 related and integral part of the HSSO server is a library called the Credential 
 Management Framework (CMF), which will be a common library for easing the 
 management of secrets, keys and credentials.
 Initially, the existing delegation, block access and job tokens will continue 
 to be utilized. There may be some changes required to leverage a PKI based 
 signature facility rather than shared secrets. This is a means to simplify 
 the solution for the pain point of distributing shared secrets.
 This project will primarily centralize the responsibility of authentication 
 and federation into a single service that is trusted across the Hadoop 
 cluster and optionally across multiple clusters. This greatly simplifies a 
 number of things in the Hadoop ecosystem:
 1. a single token format that is used across all of Hadoop regardless of 
 authentication method
 2. a single service to have pluggable providers instead of all services
 3. a single token authority that would be trusted across the cluster(s) and, 
 through PKI encryption, be able to easily issue cryptographically verifiable 
 tokens
 4. automatic rolling of the token authority’s keys and publishing of the 
 public key for easy access by those parties that need to verify incoming 
 tokens
 5. use of PKI for signatures, which eliminates the need for securely sharing 
 and distributing shared secrets
 In addition to serving as the internal Hadoop SSO service, this service will 
 be leveraged by the Knox Gateway from the cluster perimeter in order to 
 acquire the Hadoop cluster tokens. The same token mechanism that is used for 
 internal services will be used to represent user identities, providing for 
 interesting scenarios such as SSO across Hadoop clusters within an enterprise 
 and/or into the cloud.
 The HSSO service will comprise three major components and capabilities:
 1. Federating IDP – authenticates users/services and issues the common 
 Hadoop token
 2. Federating SP – validates the tokens of trusted external IDPs and issues 
 the common Hadoop token
 3. Token Authority – management of the common Hadoop tokens, including: 
 a. Issuance 
 b. Renewal
 c. Revocation
 As this is a meta Jira for tracking this overall effort, the details of the 
 individual efforts will be submitted along with the child Jira filings.
 Hadoop-Common would seem to be the most appropriate home for such a service 
 and its related common facilities. We will also leverage and extend existing 
 common mechanisms as appropriate.
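The PKI-signature idea in points 3-5 can be sketched with nothing more than the JDK's java.security API. Everything below (class name, token string, the RSA/SHA-256 choice) is an illustrative assumption, not the proposed HSSO design:

```java
import java.security.*;

public class TokenSignerSketch {
    // The token authority generates a key pair; only the public half needs
    // to be published to verifiers, so no shared secret is ever distributed.
    static KeyPair authorityKeys() {
        try {
            KeyPairGenerator kpg = KeyPairGenerator.getInstance("RSA");
            kpg.initialize(2048);
            return kpg.generateKeyPair();
        } catch (GeneralSecurityException e) {
            throw new RuntimeException(e);
        }
    }

    // The authority signs the serialized token with its private key.
    static byte[] sign(byte[] token, PrivateKey key) {
        try {
            Signature s = Signature.getInstance("SHA256withRSA");
            s.initSign(key);
            s.update(token);
            return s.sign();
        } catch (GeneralSecurityException e) {
            throw new RuntimeException(e);
        }
    }

    // Any service holding the published public key can verify the token.
    static boolean verify(byte[] token, byte[] sig, PublicKey key) {
        try {
            Signature s = Signature.getInstance("SHA256withRSA");
            s.initVerify(key);
            s.update(token);
            return s.verify(sig);
        } catch (GeneralSecurityException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        KeyPair kp = authorityKeys();
        byte[] token = "user=alice,expiry=1370400000".getBytes();
        byte[] sig = sign(token, kp.getPrivate());
        System.out.println(verify(token, sig, kp.getPublic()));                 // true
        System.out.println(verify("user=mallory".getBytes(), sig, kp.getPublic())); // false
    }
}
```

Rolling the authority's keys (point 4) then amounts to generating a new pair and republishing the public key, with no per-service secret redistribution.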

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HADOOP-9487) Deprecation warnings in Configuration should go to their own log or otherwise be suppressible

2013-06-04 Thread Chu Tong (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-9487?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13674560#comment-13674560
 ] 

Chu Tong commented on HADOOP-9487:
--

[~sjlee0], your idea seems a bit vague to me; can you please elaborate? Thanks.

 Deprecation warnings in Configuration should go to their own log or otherwise 
 be suppressible
 -

 Key: HADOOP-9487
 URL: https://issues.apache.org/jira/browse/HADOOP-9487
 Project: Hadoop Common
  Issue Type: Improvement
  Components: conf
Affects Versions: 3.0.0
Reporter: Steve Loughran

 Running local Pig jobs triggers large quantities of warnings about deprecated 
 properties - something I don't care about, as I'm not in a position to fix 
 them without delving into Pig. 
 I can suppress them by changing the log level, but that can hide other 
 warnings that may actually matter.
 If there were a special Configuration.deprecated log for all deprecation 
 messages, it could be suppressed by people who don't want noisy logs.
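The dedicated-log idea can be sketched in a few lines with java.util.logging (Hadoop itself uses commons-logging/log4j, and the logger name below is just the one suggested in the description, not an existing Hadoop logger):

```java
import java.util.logging.Level;
import java.util.logging.Logger;

public class DeprecationLogSketch {
    // All deprecation warnings go through this one logger, so users can
    // silence them without touching the log level of anything else.
    static final Logger DEPRECATION_LOG = Logger.getLogger("Configuration.deprecated");

    static void warnDeprecation(String oldKey, String newKey) {
        DEPRECATION_LOG.warning(oldKey + " is deprecated; use " + newKey + " instead");
    }

    public static void main(String[] args) {
        warnDeprecation("mapred.job.name", "mapreduce.job.name");  // emitted
        DEPRECATION_LOG.setLevel(Level.OFF);    // suppress only the deprecation noise
        warnDeprecation("dfs.name.dir", "dfs.namenode.name.dir");  // now silent
    }
}
```

The same suppression could be done from a logging config file rather than code, which is what makes a dedicated logger name attractive.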

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HADOOP-9487) Deprecation warnings in Configuration should go to their own log or otherwise be suppressible

2013-06-04 Thread Sangjin Lee (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-9487?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13674568#comment-13674568
 ] 

Sangjin Lee commented on HADOOP-9487:
-

[~stayhf] I'm not sure which part of my comment you're referring to... When we 
run any MapReduce job that uses old properties, the log is flooded with 
deprecation warnings to the point that it is often hard to make out other 
legitimate warnings or errors. I support the idea of providing an ability to 
suppress these deprecation warning messages if a certain boolean option is 
enabled.

 Deprecation warnings in Configuration should go to their own log or otherwise 
 be suppressible
 -

 Key: HADOOP-9487
 URL: https://issues.apache.org/jira/browse/HADOOP-9487
 Project: Hadoop Common
  Issue Type: Improvement
  Components: conf
Affects Versions: 3.0.0
Reporter: Steve Loughran

 Running local Pig jobs triggers large quantities of warnings about deprecated 
 properties - something I don't care about, as I'm not in a position to fix 
 them without delving into Pig. 
 I can suppress them by changing the log level, but that can hide other 
 warnings that may actually matter.
 If there were a special Configuration.deprecated log for all deprecation 
 messages, it could be suppressed by people who don't want noisy logs.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HADOOP-9487) Deprecation warnings in Configuration should go to their own log or otherwise be suppressible

2013-06-04 Thread Chu Tong (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-9487?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13674576#comment-13674576
 ] 

Chu Tong commented on HADOOP-9487:
--

[~sjlee0], so I assume this boolean variable would have to be specified in one 
of the configuration files?

 Deprecation warnings in Configuration should go to their own log or otherwise 
 be suppressible
 -

 Key: HADOOP-9487
 URL: https://issues.apache.org/jira/browse/HADOOP-9487
 Project: Hadoop Common
  Issue Type: Improvement
  Components: conf
Affects Versions: 3.0.0
Reporter: Steve Loughran

 Running local Pig jobs triggers large quantities of warnings about deprecated 
 properties - something I don't care about, as I'm not in a position to fix 
 them without delving into Pig. 
 I can suppress them by changing the log level, but that can hide other 
 warnings that may actually matter.
 If there were a special Configuration.deprecated log for all deprecation 
 messages, it could be suppressed by people who don't want noisy logs.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HADOOP-9618) Add thread which detects JVM pauses

2013-06-04 Thread Todd Lipcon (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-9618?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Todd Lipcon updated HADOOP-9618:


Attachment: hadoop-9618.txt

The attached patch implements this feature. I wasn't able to write an automated 
unit test for this, since it relies on forcing the JVM to pause, which isn't 
really doable in a JUnit context. I added a main function, though, which leaks 
memory and forces GC. Here's the output from my machine:

{code}
todd@todd-w510:~/git/hadoop-common/hadoop-common-project/hadoop-common$ java 
-verbose:gc -Xmx1g -XX:+UseConcMarkSweepGC -cp 
/home/todd/confs/devconf.2nn-2/:target/classes/:/home/todd/git/hadoop-common/hadoop-dist/target/hadoop-2.0.0-cdh4.1.2/share/hadoop/common/lib/\*
 org.apache.hadoop.util.JvmPauseMonitor
WARNING: org.apache.hadoop.metrics.jvm.EventCounter is deprecated. Please use 
org.apache.hadoop.log.metrics.EventCounter in all the log4j.properties files.
[GC 16715K->9327K(83008K), 0.0964960 secs]
[GC 26351K->23551K(83008K), 0.0406780 secs]
[GC 40575K->38828K(83008K), 0.0375710 secs]
[GC 39126K(83008K), 0.0027320 secs]
[GC 55852K->55851K(83008K), 0.0422420 secs]
[GC 70138K->69911K(88000K), 0.0437410 secs]
[GC 86935K->86934K(105028K), 0.0479130 secs]
[GC 103958K->103957K(121924K), 0.0572590 secs]
[GC 120981K->120804K(138824K), 0.0521890 secs]
[GC 137828K->137826K(155720K), 0.0508530 secs]
[GC 154850K->154848K(172808K), 0.0519540 secs]
[GC 163351K->163177K(181064K), 0.0307160 secs]
[GC 180201K->180201K(198160K), 0.0554300 secs]
[GC 197225K->197223K(215248K), 0.0660260 secs]
[GC 214247K->214244K(232144K), 0.0611980 secs]
[GC 231268K->231151K(249040K), 0.0543830 secs]
[GC 247289K->247117K(265168K), 0.0616800 secs]
[Full GC 247117K->225823K(265168K), 1.0242170 secs]
13/06/04 11:32:14 INFO util.JvmPauseMonitor: Detected pause in JVM or host 
machine (eg GC): pause of approximately 1145ms
GC pool 'ParNew' had collection(s): count=2 time=117ms
GC pool 'ConcurrentMarkSweep' had collection(s): count=2 time=445ms
[GC 268895K->268138K(424824K), 0.1054190 secs]
[GC 268825K(424824K), 0.0061170 secs]
[GC 311210K->311208K(424824K), 0.1476640 secs]
[GC 352421K->351750K(424824K), 0.1371280 secs]
[GC 394822K->394691K(454648K), 0.1522430 secs]
[GC 437763K->437760K(481144K), 0.1447820 secs]
[GC 480832K->486355K(529720K), 0.1565950 secs]
[GC 529427K->529894K(573304K), 0.1264970 secs]
[GC 547400K->546694K(590008K), 0.0609490 secs]
[Full GC 546694K->507462K(590008K), 2.6322880 secs]
13/06/04 11:32:19 INFO util.JvmPauseMonitor: Detected pause in JVM or host 
machine (eg GC): pause of approximately 2735ms
GC pool 'ParNew' had collection(s): count=2 time=188ms
GC pool 'ConcurrentMarkSweep' had collection(s): count=2 time=298ms
{code}

 Add thread which detects JVM pauses
 ---

 Key: HADOOP-9618
 URL: https://issues.apache.org/jira/browse/HADOOP-9618
 Project: Hadoop Common
  Issue Type: New Feature
  Components: util
Affects Versions: 3.0.0
Reporter: Todd Lipcon
Assignee: Todd Lipcon
 Attachments: hadoop-9618.txt


 Often, users struggle to understand what happened when a long JVM pause 
 (GC or otherwise) causes things to malfunction inside a Hadoop daemon. For 
 example, a long GC pause while logging an edit to the QJM may cause the edit 
 to time out, or a long GC pause may make other IPCs to the NameNode time out. 
 We should add a simple thread which loops on 1-second sleeps and, if the 
 sleep ever takes significantly longer than 1 second, logs a WARN. This will 
 make GC pauses obvious in logs.
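The loop described above can be sketched roughly as follows (the class name, method names, and thresholds are illustrative stand-ins, not the actual patch attached as hadoop-9618.txt):

```java
public class PauseDetectorSketch {
    static final long SLEEP_MS = 1000;           // nominal loop period
    static final long WARN_THRESHOLD_MS = 1000;  // extra delay that triggers a WARN

    /** Sleeps once; returns how many extra milliseconds the sleep took. */
    static long detectOnce() {
        long start = System.nanoTime();
        try {
            Thread.sleep(SLEEP_MS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        long extraMs = (System.nanoTime() - start) / 1_000_000 - SLEEP_MS;
        if (extraMs > WARN_THRESHOLD_MS) {
            // A sleep that badly overran its period implies the whole JVM
            // (or the host) was paused, e.g. by a stop-the-world GC.
            System.err.println("WARN: Detected pause in JVM or host machine "
                + "(eg GC): pause of approximately " + extraMs + "ms");
        }
        return extraMs;
    }

    public static void main(String[] args) {
        // The real monitor would run this loop forever in a daemon thread.
        for (int i = 0; i < 3; i++) {
            detectOnce();
        }
    }
}
```

The trick is that the detector needs no cooperation from the GC: any stop-the-world event delays the sleeping thread itself, which is exactly what gets measured.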

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HADOOP-9517) Document Hadoop Compatibility

2013-06-04 Thread Karthik Kambatla (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-9517?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13674683#comment-13674683
 ] 

Karthik Kambatla commented on HADOOP-9517:
--

bq. Classes that subclass a Hadoop class to provide a plugin point MAY need 
recompiling on each major version, possibly with the handling of changes to 
methods.

If the Hadoop class being extended is Public and its stability is defined by 
the annotation, is that not sufficient indication to the user that it might 
need to be changed as that interface/class changes? For example, we recently 
added a SchedulingPolicy to the FairScheduler, annotated Public-Evolving: the 
policies written for the current version need to be updated as and when the 
SchedulingPolicy class changes. Once it becomes Stable, we follow the standard 
deprecation rules for a Public-Stable API, which protects the policies. No?

I think it is important to detail how the API compatibility rules impact 
user-level code. Maybe I am missing something here; otherwise, we might not 
need specific policies for them.
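For illustration, the annotation contract being discussed looks roughly like this. The annotations below are stand-ins defined locally (the real ones are Hadoop's org.apache.hadoop.classification.InterfaceAudience and InterfaceStability), and SchedulingPolicyExample is an invented stand-in for the FairScheduler plugin point:

```java
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;

public class AnnotationSketch {
    // Stand-ins for Hadoop's audience/stability annotations.
    @Retention(RetentionPolicy.RUNTIME) public @interface Public {}
    @Retention(RetentionPolicy.RUNTIME) public @interface Evolving {}

    // Public = users may subclass this plugin point; Evolving = its methods
    // may change between minor releases, so subclasses may need recompiling
    // or updating on upgrade.
    @Public @Evolving
    public abstract static class SchedulingPolicyExample {
        public abstract int comparePriority(String appA, String appB);
    }

    public static void main(String[] args) {
        // A user-written policy compiled against the current release; the
        // Evolving annotation is the warning that it may break on upgrade.
        SchedulingPolicyExample fifo = new SchedulingPolicyExample() {
            public int comparePriority(String appA, String appB) {
                return appA.compareTo(appB);
            }
        };
        System.out.println(SchedulingPolicyExample.class.isAnnotationPresent(Evolving.class));
        System.out.println(fifo.comparePriority("app1", "app2") < 0);
    }
}
```

The annotations carry no behavior; they only encode the compatibility promise, which is exactly why the question of whether they are "sufficient indication" matters.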

 Document Hadoop Compatibility
 -

 Key: HADOOP-9517
 URL: https://issues.apache.org/jira/browse/HADOOP-9517
 Project: Hadoop Common
  Issue Type: Bug
  Components: documentation
Reporter: Arun C Murthy
Assignee: Karthik Kambatla
 Attachments: hadoop-9517.patch, hadoop-9517.patch, hadoop-9517.patch, 
 hadoop-9517.patch, hadoop-9517-proposal-v1.patch, 
 hadoop-9517-proposal-v1.patch


 As we get ready to call hadoop-2 stable, we need to better define 'Hadoop 
 Compatibility'.
 http://wiki.apache.org/hadoop/Compatibility is a start; let's document the 
 requirements clearly and completely.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HADOOP-9618) Add thread which detects JVM pauses

2013-06-04 Thread Todd Lipcon (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-9618?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13674684#comment-13674684
 ] 

Todd Lipcon commented on HADOOP-9618:
-

BTW, the info reported from the beans seems to be off due to an OpenJDK bug. 
When I run the same test program with Oracle JDK 1.6.0_14, I get correct stats 
from the CMS MXBean:

13/06/04 11:36:33 INFO util.JvmPauseMonitor: Detected pause in JVM or host 
machine (eg GC): pause of approximately 3232ms
GC pool 'ParNew' had collection(s): count=1 time=56ms
GC pool 'ConcurrentMarkSweep' had collection(s): count=1 time=3665ms


 Add thread which detects JVM pauses
 ---

 Key: HADOOP-9618
 URL: https://issues.apache.org/jira/browse/HADOOP-9618
 Project: Hadoop Common
  Issue Type: New Feature
  Components: util
Affects Versions: 3.0.0
Reporter: Todd Lipcon
Assignee: Todd Lipcon
 Attachments: hadoop-9618.txt


 Often, users struggle to understand what happened when a long JVM pause 
 (GC or otherwise) causes things to malfunction inside a Hadoop daemon. For 
 example, a long GC pause while logging an edit to the QJM may cause the edit 
 to time out, or a long GC pause may make other IPCs to the NameNode time out. 
 We should add a simple thread which loops on 1-second sleeps and, if the 
 sleep ever takes significantly longer than 1 second, logs a WARN. This will 
 make GC pauses obvious in logs.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HADOOP-9618) Add thread which detects JVM pauses

2013-06-04 Thread Todd Lipcon (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-9618?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Todd Lipcon updated HADOOP-9618:


Status: Patch Available  (was: Open)

 Add thread which detects JVM pauses
 ---

 Key: HADOOP-9618
 URL: https://issues.apache.org/jira/browse/HADOOP-9618
 Project: Hadoop Common
  Issue Type: New Feature
  Components: util
Affects Versions: 3.0.0
Reporter: Todd Lipcon
Assignee: Todd Lipcon
 Attachments: hadoop-9618.txt


 Often, users struggle to understand what happened when a long JVM pause 
 (GC or otherwise) causes things to malfunction inside a Hadoop daemon. For 
 example, a long GC pause while logging an edit to the QJM may cause the edit 
 to time out, or a long GC pause may make other IPCs to the NameNode time out. 
 We should add a simple thread which loops on 1-second sleeps and, if the 
 sleep ever takes significantly longer than 1 second, logs a WARN. This will 
 make GC pauses obvious in logs.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HADOOP-9533) Centralized Hadoop SSO/Token Server

2013-06-04 Thread Kevin Minder (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-9533?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13675130#comment-13675130
 ] 

Kevin Minder commented on HADOOP-9533:
--

I'm happy to announce that we have secured a time slot and dedicated space 
during Hadoop Summit NA for forward-looking Hadoop security design 
collaboration.  Currently, a room has been allocated on the 26th from 1:45 to 
3:30 PT.  The specific location will be available at the Summit, and any 
changes in date or time will be announced publicly to the best of our 
abilities.  In order to create a manageable agenda for this session, I'd like 
to schedule some prep meetings via meetup.com to start discussions and 
preparations with those who would be interested in co-organizing the session.

 Centralized Hadoop SSO/Token Server
 ---

 Key: HADOOP-9533
 URL: https://issues.apache.org/jira/browse/HADOOP-9533
 Project: Hadoop Common
  Issue Type: New Feature
  Components: security
Reporter: Larry McCay
 Attachments: HSSO-Interaction-Overview-rev-1.docx, 
 HSSO-Interaction-Overview-rev-1.pdf


 This is an umbrella Jira filing to oversee a set of proposals for introducing 
 a new master service for Hadoop Single Sign On (HSSO).
 There is an increasing need for pluggable authentication providers that 
 authenticate both users and services as well as validate tokens in order to 
 federate identities authenticated by trusted IDPs. These IDPs may be deployed 
 within the enterprise or third-party IDPs that are external to the enterprise.
 These needs speak to a specific pain point: a narrow integration 
 path into the enterprise identity infrastructure. Kerberos is a fine solution 
 for those that already have it in place or are willing to adopt its use but 
 there remains a class of user that finds this unacceptable and needs to 
 integrate with a wider variety of identity management solutions.
 Another specific pain point is that of rolling and distributing keys. A 
 related and integral part of the HSSO server is a library called the Credential 
 Management Framework (CMF), which will be a common library for easing the 
 management of secrets, keys and credentials.
 Initially, the existing delegation, block access and job tokens will continue 
 to be utilized. There may be some changes required to leverage a PKI based 
 signature facility rather than shared secrets. This is a means to simplify 
 the solution for the pain point of distributing shared secrets.
 This project will primarily centralize the responsibility of authentication 
 and federation into a single service that is trusted across the Hadoop 
 cluster and optionally across multiple clusters. This greatly simplifies a 
 number of things in the Hadoop ecosystem:
 1. a single token format that is used across all of Hadoop regardless of 
 authentication method
 2. a single service to have pluggable providers instead of all services
 3. a single token authority that would be trusted across the cluster(s) and, 
 through PKI encryption, be able to easily issue cryptographically verifiable 
 tokens
 4. automatic rolling of the token authority’s keys and publishing of the 
 public key for easy access by those parties that need to verify incoming 
 tokens
 5. use of PKI for signatures, which eliminates the need for securely sharing 
 and distributing shared secrets
 In addition to serving as the internal Hadoop SSO service, this service will 
 be leveraged by the Knox Gateway from the cluster perimeter in order to 
 acquire the Hadoop cluster tokens. The same token mechanism that is used for 
 internal services will be used to represent user identities, providing for 
 interesting scenarios such as SSO across Hadoop clusters within an enterprise 
 and/or into the cloud.
 The HSSO service will comprise three major components and capabilities:
 1. Federating IDP – authenticates users/services and issues the common 
 Hadoop token
 2. Federating SP – validates the tokens of trusted external IDPs and issues 
 the common Hadoop token
 3. Token Authority – management of the common Hadoop tokens, including: 
 a. Issuance 
 b. Renewal
 c. Revocation
 As this is a meta Jira for tracking this overall effort, the details of the 
 individual efforts will be submitted along with the child Jira filings.
 Hadoop-Common would seem to be the most appropriate home for such a service 
 and its related common facilities. We will also leverage and extend existing 
 common mechanisms as appropriate.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HADOOP-9392) Token based authentication and Single Sign On

2013-06-04 Thread Kevin Minder (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-9392?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13675132#comment-13675132
 ] 

Kevin Minder commented on HADOOP-9392:
--

I'm happy to announce that we have secured a time slot and dedicated space 
during Hadoop Summit NA for forward-looking Hadoop security design 
collaboration.  Currently, a room has been allocated on the 26th from 1:45 to 
3:30 PT.  The specific location will be available at the Summit, and any 
changes in date or time will be announced publicly to the best of our 
abilities.  In order to create a manageable agenda for this session, I'd like 
to schedule some prep meetings via meetup.com to start discussions and 
preparations with those who would be interested in co-organizing the session.


 Token based authentication and Single Sign On
 -

 Key: HADOOP-9392
 URL: https://issues.apache.org/jira/browse/HADOOP-9392
 Project: Hadoop Common
  Issue Type: New Feature
  Components: security
Reporter: Kai Zheng
Assignee: Kai Zheng
 Fix For: 3.0.0

 Attachments: token-based-authn-plus-sso.pdf


 This is an umbrella entry for one of project Rhino's topics; for details of 
 project Rhino, please refer to 
 https://github.com/intel-hadoop/project-rhino/. The major goal for this entry, 
 as described in project Rhino, was:
  
 “Core, HDFS, ZooKeeper, and HBase currently support Kerberos authentication 
 at the RPC layer, via SASL. However this does not provide valuable attributes 
 such as group membership, classification level, organizational identity, or 
 support for user defined attributes. Hadoop components must interrogate 
 external resources for discovering these attributes and at scale this is 
 problematic. There is also no consistent delegation model. HDFS has a simple 
 delegation capability, and only Oozie can take limited advantage of it. We 
 will implement a common token based authentication framework to decouple 
 internal user and service authentication from external mechanisms used to 
 support it (like Kerberos)”
  
 We'd like to start our work from Hadoop-Common and try to provide common 
 facilities by extending the existing authentication framework to support:
 1.Pluggable token provider interface 
 2.Pluggable token verification protocol and interface
 3.Security mechanism to distribute secrets in cluster nodes
 4.Delegation model of user authentication

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HADOOP-9392) Token based authentication and Single Sign On

2013-06-04 Thread Kevin Minder (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-9392?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13675168#comment-13675168
 ] 

Kevin Minder commented on HADOOP-9392:
--

Logistics for remote attendance will also be announced publicly once we have 
that figured out.  We won't be making any decisions about security at any prep 
meeting or the Summit session, and detailed summaries will be provided here 
for those who cannot attend. 

 Token based authentication and Single Sign On
 -

 Key: HADOOP-9392
 URL: https://issues.apache.org/jira/browse/HADOOP-9392
 Project: Hadoop Common
  Issue Type: New Feature
  Components: security
Reporter: Kai Zheng
Assignee: Kai Zheng
 Fix For: 3.0.0

 Attachments: token-based-authn-plus-sso.pdf


 This is an umbrella entry for one of project Rhino's topics; for details of 
 project Rhino, please refer to 
 https://github.com/intel-hadoop/project-rhino/. The major goal for this entry, 
 as described in project Rhino, was:
  
 “Core, HDFS, ZooKeeper, and HBase currently support Kerberos authentication 
 at the RPC layer, via SASL. However this does not provide valuable attributes 
 such as group membership, classification level, organizational identity, or 
 support for user defined attributes. Hadoop components must interrogate 
 external resources for discovering these attributes and at scale this is 
 problematic. There is also no consistent delegation model. HDFS has a simple 
 delegation capability, and only Oozie can take limited advantage of it. We 
 will implement a common token based authentication framework to decouple 
 internal user and service authentication from external mechanisms used to 
 support it (like Kerberos)”
  
 We'd like to start our work from Hadoop-Common and try to provide common 
 facilities by extending the existing authentication framework to support:
 1.Pluggable token provider interface 
 2.Pluggable token verification protocol and interface
 3.Security mechanism to distribute secrets in cluster nodes
 4.Delegation model of user authentication

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HADOOP-9533) Centralized Hadoop SSO/Token Server

2013-06-04 Thread Kevin Minder (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-9533?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13675167#comment-13675167
 ] 

Kevin Minder commented on HADOOP-9533:
--

Logistics for remote attendance will also be announced publicly once we have 
that figured out.  We won't be making any decisions about security at any prep 
meeting or the Summit session, and detailed summaries will be provided here 
for those who cannot attend. 

 Centralized Hadoop SSO/Token Server
 ---

 Key: HADOOP-9533
 URL: https://issues.apache.org/jira/browse/HADOOP-9533
 Project: Hadoop Common
  Issue Type: New Feature
  Components: security
Reporter: Larry McCay
 Attachments: HSSO-Interaction-Overview-rev-1.docx, 
 HSSO-Interaction-Overview-rev-1.pdf


 This is an umbrella Jira filing to oversee a set of proposals for introducing 
 a new master service for Hadoop Single Sign On (HSSO).
 There is an increasing need for pluggable authentication providers that 
 authenticate both users and services, as well as validate tokens, in order to 
 federate identities authenticated by trusted IDPs. These IDPs may be deployed 
 within the enterprise or be third-party IDPs external to the enterprise.
 These needs speak to a specific pain point: a narrow integration 
 path into the enterprise identity infrastructure. Kerberos is a fine solution 
 for those that already have it in place or are willing to adopt its use, but 
 there remains a class of users that find this unacceptable and need to 
 integrate with a wider variety of identity management solutions.
 Another specific pain point is that of rolling and distributing keys. A 
 related and integral part of the HSSO server is a library called the Credential 
 Management Framework (CMF), which will be a common library for easing the 
 management of secrets, keys and credentials.
 Initially, the existing delegation, block access and job tokens will continue 
 to be utilized. There may be some changes required to leverage a PKI based 
 signature facility rather than shared secrets. This is a means to simplify 
 the solution for the pain point of distributing shared secrets.
 This project will primarily centralize the responsibility of authentication 
 and federation into a single service that is trusted across the Hadoop 
 cluster and optionally across multiple clusters. This greatly simplifies a 
 number of things in the Hadoop ecosystem:
 1.a single token format that is used across all of Hadoop regardless of 
 authentication method
 2.a single service to have pluggable providers instead of all services
 3.a single token authority that would be trusted across the cluster/s and 
 through PKI encryption be able to easily issue cryptographically verifiable 
 tokens
 4.automatic rolling of the token authority’s keys and publishing of the 
 public key for easy access by those parties that need to verify incoming 
 tokens
 5.use of PKI for signatures eliminates the need for securely sharing and 
 distributing shared secrets
 In addition to serving as the internal Hadoop SSO service, this service will 
 be leveraged by the Knox Gateway from the cluster perimeter in order to 
 acquire the Hadoop cluster tokens. The same token mechanism that is used for 
 internal services will be used to represent user identities, providing for 
 interesting scenarios such as SSO across Hadoop clusters within an enterprise 
 and/or into the cloud.
 The HSSO service will be comprised of three major components and capabilities:
 1.Federating IDP – authenticates users/services and issues the common 
 Hadoop token
 2.Federating SP – validates the token of trusted external IDPs and issues 
 the common Hadoop token
 3.Token Authority – management of the common Hadoop tokens – including: 
 a.Issuance 
 b.Renewal
 c.Revocation
 As this is a meta Jira for tracking this overall effort, the details of the 
 individual efforts will be submitted along with the child Jira filings.
 Hadoop-Common would seem to be the most appropriate home for such a service 
 and its related common facilities. We will also leverage and extend existing 
 common mechanisms as appropriate.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HADOOP-9487) Deprecation warnings in Configuration should go to their own log or otherwise be suppressible

2013-06-04 Thread Sangjin Lee (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-9487?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13675179#comment-13675179
 ] 

Sangjin Lee commented on HADOOP-9487:
-

I'm not suggesting an implementation yet. :) It could be a boolean variable in 
the config or some other flag. If it is part of the config, I presume one needs 
to make sure we don't introduce circular logic (not that I think there is any).

 Deprecation warnings in Configuration should go to their own log or otherwise 
 be suppressible
 -

 Key: HADOOP-9487
 URL: https://issues.apache.org/jira/browse/HADOOP-9487
 Project: Hadoop Common
  Issue Type: Improvement
  Components: conf
Affects Versions: 3.0.0
Reporter: Steve Loughran

 Running local pig jobs triggers large quantities of warnings about deprecated 
 properties -something I don't care about as I'm not in a position to fix 
 without delving into Pig. 
 I can suppress them by changing the log level, but that can hide other 
 warnings that may actually matter.
 If there were a special Configuration.deprecated log for all deprecation 
 messages, that log could be suppressed by people who don't want noisy logs.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HADOOP-9585) unit test failure :org.apache.hadoop.fs.TestFsShellReturnCode.testChgrp

2013-06-04 Thread Sanjay Radia (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-9585?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13675201#comment-13675201
 ] 

Sanjay Radia commented on HADOOP-9585:
--

The error messages for non-existent files are normal. It appears that the only 
failing test is the chgrp one. Is the user ID "jenkins" a member of the group 
"admin"? 

 unit test failure :org.apache.hadoop.fs.TestFsShellReturnCode.testChgrp
 ---

 Key: HADOOP-9585
 URL: https://issues.apache.org/jira/browse/HADOOP-9585
 Project: Hadoop Common
  Issue Type: Bug
  Components: test
Affects Versions: 1.3.0
 Environment: 
 https://builds.apache.org/job/Hadoop-branch1/lastCompletedBuild/testReport/org.apache.hadoop.fs/TestFsShellReturnCode/testChgrp/
Reporter: Giridharan Kesavan

 Standard Error
 chmod: could not get status for 
 '/home/jenkins/jenkins-slave/workspace/Hadoop-branch1/trunk/build/test/data/testChmod/fileDoesNotExist':
  File 
 /home/jenkins/jenkins-slave/workspace/Hadoop-branch1/trunk/build/test/data/testChmod/fileDoesNotExist
  does not exist.
 chmod: could not get status for 
 '/home/jenkins/jenkins-slave/workspace/Hadoop-branch1/trunk/build/test/data/testChmod/nonExistingfiles*'
 chown: could not get status for 
 '/home/jenkins/jenkins-slave/workspace/Hadoop-branch1/trunk/build/test/data/testChown/fileDoesNotExist':
  File 
 /home/jenkins/jenkins-slave/workspace/Hadoop-branch1/trunk/build/test/data/testChown/fileDoesNotExist
  does not exist.
 chown: could not get status for 
 '/home/jenkins/jenkins-slave/workspace/Hadoop-branch1/trunk/build/test/data/testChown/nonExistingfiles*'
 chgrp: failed on 
 'file:/home/jenkins/jenkins-slave/workspace/Hadoop-branch1/trunk/build/test/data/testChgrp/fileExists':
  chgrp: changing group of 
 `/home/jenkins/jenkins-slave/workspace/Hadoop-branch1/trunk/build/test/data/testChgrp/fileExists':
  Operation not permitted

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HADOOP-9618) Add thread which detects JVM pauses

2013-06-04 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-9618?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13675235#comment-13675235
 ] 

Hadoop QA commented on HADOOP-9618:
---

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12586143/hadoop-9618.txt
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:red}-1 tests included{color}.  The patch doesn't appear to include 
any new or modified tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-common-project/hadoop-common hadoop-hdfs-project/hadoop-hdfs.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-HADOOP-Build/2599//testReport/
Console output: 
https://builds.apache.org/job/PreCommit-HADOOP-Build/2599//console

This message is automatically generated.

 Add thread which detects JVM pauses
 ---

 Key: HADOOP-9618
 URL: https://issues.apache.org/jira/browse/HADOOP-9618
 Project: Hadoop Common
  Issue Type: New Feature
  Components: util
Affects Versions: 3.0.0
Reporter: Todd Lipcon
Assignee: Todd Lipcon
 Attachments: hadoop-9618.txt


 Often times users struggle to understand what happened when a long JVM pause 
 (GC or otherwise) causes things to malfunction inside a Hadoop daemon. For 
 example, a long GC pause while logging an edit to the QJM may cause the edit 
 to timeout, or a long GC pause may make other IPCs to the NameNode timeout. 
 We should add a simple thread which loops on 1-second sleeps, and if the 
 sleep ever takes significantly longer than 1 second, log a WARN. This will 
 make GC pauses obvious in logs.
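
 The loop described above can be sketched in a few lines of Java; the class 
 name, warn threshold, and message format here are illustrative assumptions, 
 not the actual HADOOP-9618 patch:

```java
// Sketch of a JVM pause detector: sleep ~1s in a loop and warn whenever the
// observed sleep took significantly longer than requested (hypothetical names).
public class JvmPauseMonitorSketch {
    static final long SLEEP_MS = 1000;           // loop interval
    static final long WARN_THRESHOLD_MS = 1000;  // assumed warn threshold

    // Returns the extra delay in ms if it exceeds the threshold, else -1.
    static long detectPause(long expectedMs, long actualMs, long thresholdMs) {
        long extra = actualMs - expectedMs;
        return extra > thresholdMs ? extra : -1;
    }

    public static void main(String[] args) {
        Thread monitor = new Thread(() -> {
            while (!Thread.currentThread().isInterrupted()) {
                long start = System.nanoTime();
                try {
                    Thread.sleep(SLEEP_MS);
                } catch (InterruptedException e) {
                    return; // shut down cleanly
                }
                long elapsedMs = (System.nanoTime() - start) / 1_000_000L;
                long pause = detectPause(SLEEP_MS, elapsedMs, WARN_THRESHOLD_MS);
                if (pause >= 0) {
                    // A long GC (or whole-machine) pause shows up here.
                    System.err.println("WARN: JVM paused for approximately "
                            + pause + " ms beyond the expected sleep");
                }
            }
        }, "JvmPauseMonitorSketch");
        monitor.setDaemon(true); // never keep the daemon process alive
        monitor.start();
    }
}
```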

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HADOOP-9392) Token based authentication and Single Sign On

2013-06-04 Thread Sanjay Radia (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-9392?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13675264#comment-13675264
 ] 

Sanjay Radia commented on HADOOP-9392:
--

Thanks for the Jira and the slides on what you are proposing.
{quote}
There is also no consistent delegation model. HDFS has a simple delegation 
capability, and only Oozie can take limited advantage of it. We will implement 
a common token based authentication framework to decouple internal user and 
service authentication from external mechanisms used to support it (like 
Kerberos)
{quote}
I am puzzled by the above statement. Hadoop has delegation tokens and a trust 
model. For the largest part we use delegation tokens (e.g. the MR job client 
gets HDFS delegation tokens, etc.) so that the job can run as the user that 
submitted it. Further, in some cases we use trusted proxies like Oozie (but 
this can be any trusted service, not just Oozie) to access system services as 
specific users. The delegation tokens and the trusted proxies are two 
independent mechanisms. So I feel the statements in the quoted block are not 
correct, or perhaps you are using the term delegation in a different sense. 
Details of the Hadoop delegation tokens are in the following very detailed 
paper on Hadoop security (see 
http://hortonworks.com/wp-content/uploads/2011/10/security-design_withCover-1.pdf).

You also state "We will implement a common token based authentication 
framework to decouple internal user and service authentication from external 
mechanisms used to support it (like Kerberos)". Note that the internal Hadoop 
tokens are separate from the Kerberos tokens - indeed they are nicely decoupled 
- so the problem, IMHO, is not the decoupling but other issues. One such issue 
is the authentication implementation, which has a burnt-in notion of supporting 
UGI, Kerberos or the delegation tokens for authentication. As Daryn pointed 
out, this implementation needs to change so that the authentication mechanism 
is pluggable. This jira, I believe, is proposing much more than making this 
implementation pluggable. 


Don't get me wrong, I am not criticizing the Jira but merely trying to 
understand some of the statements in the description and the slides you have 
posted.  I do agree that we need to allow other forms of authentication besides 
Kerberos/ActiveDir. I also agree that attributes like group membership should 
have been part of the Hadoop delegation tokens to avoid back calls, which are 
not feasible in cloud environments. Do you have a more detailed design besides 
the slides that you have uploaded to this jira?  I would like to get the next 
level of details. Your comments in this jira do give some more details, but it 
would be good to put them in a design document. Further, I suspect you are 
trying to replace the Hadoop delegation tokens - I don't disagree with that, 
but I would like to understand the why and how from your perspective.

Would this be an accurate description of this Jira: "Single sign-on for 
non-Kerberos environments using tokens"? Hadoop does support single sign-on 
for Kerberos/ActiveDir environments; of course that is not good enough, since 
many customers do not have Kerberos/ActiveDir. 

 Token based authentication and Single Sign On
 -

 Key: HADOOP-9392
 URL: https://issues.apache.org/jira/browse/HADOOP-9392
 Project: Hadoop Common
  Issue Type: New Feature
  Components: security
Reporter: Kai Zheng
Assignee: Kai Zheng
 Fix For: 3.0.0

 Attachments: token-based-authn-plus-sso.pdf


 This is an umbrella entry for one of project Rhino's topics; for details of 
 project Rhino, please refer to 
 https://github.com/intel-hadoop/project-rhino/. The major goal for this entry, 
 as described in project Rhino, was:
  
 “Core, HDFS, ZooKeeper, and HBase currently support Kerberos authentication 
 at the RPC layer, via SASL. However this does not provide valuable attributes 
 such as group membership, classification level, organizational identity, or 
 support for user defined attributes. Hadoop components must interrogate 
 external resources for discovering these attributes and at scale this is 
 problematic. There is also no consistent delegation model. HDFS has a simple 
 delegation capability, and only Oozie can take limited advantage of it. We 
 will implement a common token based authentication framework to decouple 
 internal user and service authentication from external mechanisms used to 
 support it (like Kerberos)”
  
 We'd like to start our work from Hadoop-Common and try to provide common 
 facilities by extending the existing authentication framework to support:
 1.Pluggable token provider interface 
 2.Pluggable token verification protocol and interface
 3.Security 

[jira] [Commented] (HADOOP-9487) Deprecation warnings in Configuration should go to their own log or otherwise be suppressible

2013-06-04 Thread Chu Tong (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-9487?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13675334#comment-13675334
 ] 

Chu Tong commented on HADOOP-9487:
--

As I am still kind of new here, please correct me if I am wrong. I am proposing 
to add a new boolean field in mapred-site.xml to suppress the warnings if the 
user wants. Any suggestions?

 Deprecation warnings in Configuration should go to their own log or otherwise 
 be suppressible
 -

 Key: HADOOP-9487
 URL: https://issues.apache.org/jira/browse/HADOOP-9487
 Project: Hadoop Common
  Issue Type: Improvement
  Components: conf
Affects Versions: 3.0.0
Reporter: Steve Loughran

 Running local pig jobs triggers large quantities of warnings about deprecated 
 properties -something I don't care about as I'm not in a position to fix 
 without delving into Pig. 
 I can suppress them by changing the log level, but that can hide other 
 warnings that may actually matter.
 If there were a special Configuration.deprecated log for all deprecation 
 messages, that log could be suppressed by people who don't want noisy logs.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HADOOP-9582) Non-existent file to hadoop fs -conf doesn't throw error

2013-06-04 Thread Ashwin Shankar (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-9582?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13675372#comment-13675372
 ] 

Ashwin Shankar commented on HADOOP-9582:


Hi Jason,
Thanks for your comments. Yes, this patch doesn't address the problem in places 
other than FsShell, but could we check it in so that we are at least covered 
from the FsShell standpoint? Maybe we could create another JIRA to investigate 
whether we would break any use case by making the change directly in 
GenericOptionsParser.

 Non-existent file to hadoop fs -conf doesn't throw error
 --

 Key: HADOOP-9582
 URL: https://issues.apache.org/jira/browse/HADOOP-9582
 Project: Hadoop Common
  Issue Type: Bug
  Components: conf
Affects Versions: 3.0.0, 0.23.7, 2.0.4-alpha
Reporter: Ashwin Shankar
 Attachments: HADOOP-9582.txt, HADOOP-9582.txt, HADOOP-9582.txt


 When we run :
 hadoop fs -conf BAD_FILE -ls /
 we expect hadoop to throw an error,but it doesn't.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HADOOP-9517) Document Hadoop Compatibility

2013-06-04 Thread Karthik Kambatla (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-9517?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karthik Kambatla updated HADOOP-9517:
-

Attachment: hadoop-9517-v2.patch

Updated the patch to:
# differentiate between the kinds of file formats
# list the kinds of HDFS metadata upgrades
# include the policy proposal for proto files

The policies being proposed (from comments in the JIRA) are explicitly labelled 
(Proposal)

At this point, it would be great to make sure the document
# captures all the items that affect compatibility
# has accurate policies for the not-newly-proposed items
# proposes reasonable new policies
# presents the material well

Once done with the above, it might be a good idea to get this in and address 
items with no policies in subsequent JIRAs or sub-tasks, so that we can discuss 
them in isolation and in detail. Thoughts? 

 Document Hadoop Compatibility
 -

 Key: HADOOP-9517
 URL: https://issues.apache.org/jira/browse/HADOOP-9517
 Project: Hadoop Common
  Issue Type: Bug
  Components: documentation
Reporter: Arun C Murthy
Assignee: Karthik Kambatla
 Attachments: hadoop-9517.patch, hadoop-9517.patch, hadoop-9517.patch, 
 hadoop-9517.patch, hadoop-9517-proposal-v1.patch, 
 hadoop-9517-proposal-v1.patch, hadoop-9517-v2.patch


 As we get ready to call hadoop-2 stable we need to better define 'Hadoop 
 Compatibility'.
 http://wiki.apache.org/hadoop/Compatibility is a start, let's document 
 requirements clearly and completely.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HADOOP-8717) JAVA_HOME detected in hadoop-config.sh under OS X does not work

2013-06-04 Thread Josh Rosen (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-8717?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13675382#comment-13675382
 ] 

Josh Rosen commented on HADOOP-8717:


I just ran into this same issue with Hadoop 2.0.4-alpha on OS X 10.8.3.

 JAVA_HOME detected in hadoop-config.sh under OS X does not work
 ---

 Key: HADOOP-8717
 URL: https://issues.apache.org/jira/browse/HADOOP-8717
 Project: Hadoop Common
  Issue Type: Bug
  Components: conf
 Environment: OS: Darwin 11.4.0 Darwin Kernel Version 11.4.0: Mon Apr  
 9 19:32:15 PDT 2012; root:xnu-1699.26.8~1/RELEASE_X86_64 x86_64
 java version 1.6.0_33
 Java(TM) SE Runtime Environment (build 1.6.0_33-b03-424-11M3720)
 Java HotSpot(TM) 64-Bit Server VM (build 20.8-b03-424, mixed mode)
Reporter: Jianbin Wei
Priority: Minor
 Fix For: 3.0.0

 Attachments: HADOOP-8717.patch, HADOOP-8717.patch, HADOOP-8717.patch, 
 HADOOP-8717.patch


 After setting up a single node hadoop on mac, copy some text file to it and 
 run
 $ hadoop jar 
 ./share/hadoop/mapreduce/hadoop-mapreduce-examples-3.0.0-SNAPSHOT.jar  
 wordcount /file.txt output
 It reports
 12/08/21 15:32:18 INFO Job.java:mapreduce.Job:1265: Running job: 
 job_1345588312126_0001
 12/08/21 15:32:22 INFO Job.java:mapreduce.Job:1286: Job 
 job_1345588312126_0001 running in uber mode : false
 12/08/21 15:32:22 INFO Job.java:mapreduce.Job:1293:  map 0% reduce 0%
 12/08/21 15:32:22 INFO Job.java:mapreduce.Job:1306: Job 
 job_1345588312126_0001 failed with state FAILED due to: Application 
 application_1345588312126_0001 failed 1 times due to AM Container for 
 appattempt_1345588312126_0001_01 exited with  exitCode: 127 due to: 
 .Failing this attempt.. Failing the application.
 12/08/21 15:32:22 INFO Job.java:mapreduce.Job:1311: Counters: 0
 $ cat 
 /tmp/logs/application_1345588312126_0001/container_1345588312126_0001_01_01/stderr
 /bin/bash: /bin/java: No such file or directory
 The detected JAVA_HOME is not used somehow.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HADOOP-8717) JAVA_HOME detected in hadoop-config.sh under OS X does not work

2013-06-04 Thread Josh Rosen (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-8717?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13675384#comment-13675384
 ] 

Josh Rosen commented on HADOOP-8717:


The attached patch fixed things for me, but I just thought I'd mention that 
this affects that Hadoop version, too.

 JAVA_HOME detected in hadoop-config.sh under OS X does not work
 ---

 Key: HADOOP-8717
 URL: https://issues.apache.org/jira/browse/HADOOP-8717
 Project: Hadoop Common
  Issue Type: Bug
  Components: conf
 Environment: OS: Darwin 11.4.0 Darwin Kernel Version 11.4.0: Mon Apr  
 9 19:32:15 PDT 2012; root:xnu-1699.26.8~1/RELEASE_X86_64 x86_64
 java version 1.6.0_33
 Java(TM) SE Runtime Environment (build 1.6.0_33-b03-424-11M3720)
 Java HotSpot(TM) 64-Bit Server VM (build 20.8-b03-424, mixed mode)
Reporter: Jianbin Wei
Priority: Minor
 Fix For: 3.0.0

 Attachments: HADOOP-8717.patch, HADOOP-8717.patch, HADOOP-8717.patch, 
 HADOOP-8717.patch


 After setting up a single node hadoop on mac, copy some text file to it and 
 run
 $ hadoop jar 
 ./share/hadoop/mapreduce/hadoop-mapreduce-examples-3.0.0-SNAPSHOT.jar  
 wordcount /file.txt output
 It reports
 12/08/21 15:32:18 INFO Job.java:mapreduce.Job:1265: Running job: 
 job_1345588312126_0001
 12/08/21 15:32:22 INFO Job.java:mapreduce.Job:1286: Job 
 job_1345588312126_0001 running in uber mode : false
 12/08/21 15:32:22 INFO Job.java:mapreduce.Job:1293:  map 0% reduce 0%
 12/08/21 15:32:22 INFO Job.java:mapreduce.Job:1306: Job 
 job_1345588312126_0001 failed with state FAILED due to: Application 
 application_1345588312126_0001 failed 1 times due to AM Container for 
 appattempt_1345588312126_0001_01 exited with  exitCode: 127 due to: 
 .Failing this attempt.. Failing the application.
 12/08/21 15:32:22 INFO Job.java:mapreduce.Job:1311: Counters: 0
 $ cat 
 /tmp/logs/application_1345588312126_0001/container_1345588312126_0001_01_01/stderr
 /bin/bash: /bin/java: No such file or directory
 The detected JAVA_HOME is not used somehow.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HADOOP-9614) smart-test-patch.sh hangs for new version of patch (2.7.1)

2013-06-04 Thread Konstantin Boudnik (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-9614?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Konstantin Boudnik updated HADOOP-9614:
---

Fix Version/s: (was: 2.0.5-alpha)

 smart-test-patch.sh hangs for new version of patch (2.7.1)
 --

 Key: HADOOP-9614
 URL: https://issues.apache.org/jira/browse/HADOOP-9614
 Project: Hadoop Common
  Issue Type: Bug
Affects Versions: 3.0.0, 0.23.7, 2.0.4-alpha
Reporter: Ravi Prakash
Assignee: Ravi Prakash
 Fix For: 3.0.0, 0.23.8

 Attachments: HADOOP-9614.patch, HADOOP-9614.patch


 patch -p0 -E --dry-run prints "checking file" for the new version of 
 patch (2.7.1) rather than "patching file" as it did for older versions. This 
 causes TMP2 to become empty, which causes the script to hang forever on this 
 command:
 PREFIX_DIRS_AND_FILES=$(cut -d '/' -f 1 | sort | uniq)

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HADOOP-9618) Add thread which detects JVM pauses

2013-06-04 Thread Colin Patrick McCabe (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-9618?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13675436#comment-13675436
 ] 

Colin Patrick McCabe commented on HADOOP-9618:
--

I kind of wish we could use the JVM's {{-Xloggc:logfile}} option to get this 
information, since theoretically it should be more trustworthy than trying to 
guess.  Is that too much hassle to configure by default?

I suppose the thread method detects machine pauses which are *not* the result 
of GCs, so you could say that it gives more information (although perhaps more 
questionable information).

I'm a little gun-shy of the 1 second timeout.  It wasn't too long ago that the 
Linux scheduler quantum was 100 milliseconds.  So if you had ten threads 
hogging the CPU, you'd already have no time left to run your watchdog thread.  
I think the timeout either needs to be longer, or the thread needs to be a 
high-priority thread, possibly even realtime priority.

Have you tried running this with a gnarly MapReduce job going on?

 Add thread which detects JVM pauses
 ---

 Key: HADOOP-9618
 URL: https://issues.apache.org/jira/browse/HADOOP-9618
 Project: Hadoop Common
  Issue Type: New Feature
  Components: util
Affects Versions: 3.0.0
Reporter: Todd Lipcon
Assignee: Todd Lipcon
 Attachments: hadoop-9618.txt


 Often times users struggle to understand what happened when a long JVM pause 
 (GC or otherwise) causes things to malfunction inside a Hadoop daemon. For 
 example, a long GC pause while logging an edit to the QJM may cause the edit 
 to timeout, or a long GC pause may make other IPCs to the NameNode timeout. 
 We should add a simple thread which loops on 1-second sleeps, and if the 
 sleep ever takes significantly longer than 1 second, log a WARN. This will 
 make GC pauses obvious in logs.
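 A rough shell analogue of the proposed watchdog (the attached patch is a Java 
 thread; the warn threshold here is illustrative, not from the attachment):

```shell
# Sleep for a fixed interval and warn when the wall-clock gap is much larger
# than requested -- a large overrun implies a GC or machine pause.
SLEEP_MS=1000
WARN_THRESHOLD_MS=3000   # only warn when the overrun is multiple seconds

# check_pause EXPECTED_MS ACTUAL_MS -> prints a WARN line for large overruns
check_pause() {
  extra=$(( $2 - $1 ))
  if [ "$extra" -ge "$WARN_THRESHOLD_MS" ]; then
    echo "WARN: detected pause of approximately ${extra} ms (GC or machine pause?)"
  fi
}

# The daemon loop would be (not run here, since it never terminates):
#   while true; do
#     start=$(date +%s%3N)                 # GNU date, milliseconds
#     sleep $(( SLEEP_MS / 1000 ))
#     check_pause "$SLEEP_MS" $(( $(date +%s%3N) - start ))
#   done
check_pause "$SLEEP_MS" 5200   # simulate a 4.2 s wall-clock gap
```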



[jira] [Commented] (HADOOP-9618) Add thread which detects JVM pauses

2013-06-04 Thread Colin Patrick McCabe (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-9618?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13675439#comment-13675439
 ] 

Colin Patrick McCabe commented on HADOOP-9618:
--

er, that should read 10 milliseconds / 100 CPU-bound threads

 Add thread which detects JVM pauses
 ---

 Key: HADOOP-9618
 URL: https://issues.apache.org/jira/browse/HADOOP-9618
 Project: Hadoop Common
  Issue Type: New Feature
  Components: util
Affects Versions: 3.0.0
Reporter: Todd Lipcon
Assignee: Todd Lipcon
 Attachments: hadoop-9618.txt


 Often times users struggle to understand what happened when a long JVM pause 
 (GC or otherwise) causes things to malfunction inside a Hadoop daemon. For 
 example, a long GC pause while logging an edit to the QJM may cause the edit 
 to timeout, or a long GC pause may make other IPCs to the NameNode timeout. 
 We should add a simple thread which loops on 1-second sleeps, and if the 
 sleep ever takes significantly longer than 1 second, log a WARN. This will 
 make GC pauses obvious in logs.



[jira] [Commented] (HADOOP-9517) Document Hadoop Compatibility

2013-06-04 Thread Sanjay Radia (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-9517?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13675461#comment-13675461
 ] 

Sanjay Radia commented on HADOOP-9517:
--

bq.  ... Optional fields can be added any time .. Fields can be renamed any 
time ...

This is what stable means. Hence I suggest that we add a comment to the .proto 
files saying that the .protos are private-stable, and we could add a comment on 
the kinds of changes allowed. Will file a jira for this. Given that there are no 
annotations for .proto files, a comment is the best that can be done.

bq. Don't you think the proto files for client interfaces should be public? I 
was chatting with Todd about this, and it seems to us they should.
I would still mark them as private till we make the rpc and data transfer 
protocols themselves public (i.e. the protos being public is useless without 
the rpc proto being public). 
Todd and I occasionally disagree ;-)




 Document Hadoop Compatibility
 -

 Key: HADOOP-9517
 URL: https://issues.apache.org/jira/browse/HADOOP-9517
 Project: Hadoop Common
  Issue Type: Bug
  Components: documentation
Reporter: Arun C Murthy
Assignee: Karthik Kambatla
 Attachments: hadoop-9517.patch, hadoop-9517.patch, hadoop-9517.patch, 
 hadoop-9517.patch, hadoop-9517-proposal-v1.patch, 
 hadoop-9517-proposal-v1.patch, hadoop-9517-v2.patch


 As we get ready to call hadoop-2 stable we need to better define 'Hadoop 
 Compatibility'.
 http://wiki.apache.org/hadoop/Compatibility is a start, let's document 
 requirements clearly and completely.



[jira] [Created] (HADOOP-9619) Mark stability of .proto files

2013-06-04 Thread Sanjay Radia (JIRA)
Sanjay Radia created HADOOP-9619:


 Summary: Mark stability of .proto files
 Key: HADOOP-9619
 URL: https://issues.apache.org/jira/browse/HADOOP-9619
 Project: Hadoop Common
  Issue Type: Sub-task
Reporter: Sanjay Radia
Assignee: Sanjay Radia






[jira] [Commented] (HADOOP-9619) Mark stability of .proto files

2013-06-04 Thread Todd Lipcon (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-9619?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13675463#comment-13675463
 ] 

Todd Lipcon commented on HADOOP-9619:
-

One thing I think we should clarify: I don't think the .proto files themselves 
should be stable. Rather, the field IDs and types are stable -- i.e. we should 
not make any changes to the proto files which cause wire incompatibility. But 
we should feel free to rename the .proto files themselves, or change the 
annotations that affect code generation.

If a downstream project wants to use the Hadoop wire protocol, they should be 
making a _copy_ of the proto files for their use case, not assuming that we'll 
never rename a protobuf field (which is wire-compatible).

 Mark stability of .proto files
 --

 Key: HADOOP-9619
 URL: https://issues.apache.org/jira/browse/HADOOP-9619
 Project: Hadoop Common
  Issue Type: Sub-task
  Components: documentation
Reporter: Sanjay Radia
Assignee: Sanjay Radia





[jira] [Commented] (HADOOP-9619) Mark stability of .proto files

2013-06-04 Thread Sanjay Radia (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-9619?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13675469#comment-13675469
 ] 

Sanjay Radia commented on HADOOP-9619:
--

Proposing to mark the .proto files in Hadoop common and HDFS as 
*private-stable*. 
This is for both client-facing protocols and internal protocols.
We are promising wire compatibility, hence the label stable.
Once we decide to make the rpc and data transfer protocols public-stable, we 
can then mark the client-facing .protos as public.

Since there are no annotations for .proto files, I am planning to simply add a 
comment plus a short description of what changes can be made (add optional 
fields, ...) 

 Mark stability of .proto files
 --

 Key: HADOOP-9619
 URL: https://issues.apache.org/jira/browse/HADOOP-9619
 Project: Hadoop Common
  Issue Type: Sub-task
  Components: documentation
Reporter: Sanjay Radia
Assignee: Sanjay Radia





[jira] [Commented] (HADOOP-9618) Add thread which detects JVM pauses

2013-06-04 Thread Todd Lipcon (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-9618?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13675471#comment-13675471
 ] 

Todd Lipcon commented on HADOOP-9618:
-

bq. I kind of wish we could use the JVM's Xloggc:logfile to get this 
information, since theoretically it should be more trustworthy than trying to 
guess. Is that too much hassle to configure by default?

The problem is that the GC logs don't roll, plus it's difficult to correlate 
them with the log4j stream, since the timestamps in the GC logs are in a 
different format than log4j's, etc -- plus they won't roll up through alternate 
log4j appenders to centralized monitoring.

bq. I suppose the thread method detects machine pauses which are not the result 
of GCs, so you could say that it gives more information (although perhaps more 
questionable information).

Yep - I've seen cases where the kernel locks up for multiple seconds due to 
some bug, and that's interesting. Also, there are JVM safepoint pauses, which 
are nasty and aren't in the GC logs unless you use 
-XX:+PrintSafepointStatistics, which is super verbose.

bq. I'm a little gun-shy of the 1 second timeout. It wasn't too long ago that 
the Linux scheduler quantum was 100 milliseconds. So if you had ten threads 
hogging the CPU, you'd already have no time left to run your watchdog thread. I 
think the timeout either needs to be longer, or the thread needs to be a 
high-priority thread, possibly even realtime priority.

If one of your important Hadoop daemons is so overloaded, I think that would be 
interesting as well. This only logs if the 1-second sleep takes 3 seconds, so 
things like scheduling jitter won't cause log messages unless the jitter is 
multiple seconds long. At that point, I'd want to know about it regardless of 
whether it's GC, a kernel issue, contention for machine resources, swap, etc. 
Do you disagree?

 Add thread which detects JVM pauses
 ---

 Key: HADOOP-9618
 URL: https://issues.apache.org/jira/browse/HADOOP-9618
 Project: Hadoop Common
  Issue Type: New Feature
  Components: util
Affects Versions: 3.0.0
Reporter: Todd Lipcon
Assignee: Todd Lipcon
 Attachments: hadoop-9618.txt


 Often times users struggle to understand what happened when a long JVM pause 
 (GC or otherwise) causes things to malfunction inside a Hadoop daemon. For 
 example, a long GC pause while logging an edit to the QJM may cause the edit 
 to timeout, or a long GC pause may make other IPCs to the NameNode timeout. 
 We should add a simple thread which loops on 1-second sleeps, and if the 
 sleep ever takes significantly longer than 1 second, log a WARN. This will 
 make GC pauses obvious in logs.



[jira] [Commented] (HADOOP-9619) Mark stability of .proto files

2013-06-04 Thread Sanjay Radia (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-9619?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13675473#comment-13675473
 ] 

Sanjay Radia commented on HADOOP-9619:
--

We added our comments simultaneously.

bq.   I don't think the .proto files themselves should be stable
Todd, stable means evolving in a compatible way - hence you can change things 
like the names of fields but not their ids, add new optional fields, etc. Yes, 
we can rename a file. Let me document that in a separate file and have the 
.proto files refer to it.

bq. If a downstream project ...
agreed.


 Mark stability of .proto files
 --

 Key: HADOOP-9619
 URL: https://issues.apache.org/jira/browse/HADOOP-9619
 Project: Hadoop Common
  Issue Type: Sub-task
  Components: documentation
Reporter: Sanjay Radia
Assignee: Sanjay Radia





[jira] [Commented] (HADOOP-9422) HADOOP_HOME should not be required to be set to be able to launch commands using hadoop.util.Shell

2013-06-04 Thread Enis Soztutar (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-9422?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13675522#comment-13675522
 ] 

Enis Soztutar commented on HADOOP-9422:
---

The requirement, as far as I can tell, was that windows support needs to find a 
native library, winutils, and it looks under hadoop home. +1 on making it a 
soft dependency on non-windows platforms, and in a further issue changing 
winutils to live under the hadoop native lib dir. 

 HADOOP_HOME should not be required to be set to be able to launch commands 
 using hadoop.util.Shell
 --

 Key: HADOOP-9422
 URL: https://issues.apache.org/jira/browse/HADOOP-9422
 Project: Hadoop Common
  Issue Type: Bug
Affects Versions: 3.0.0
Reporter: Hitesh Shah

 Not sure why this is an enforced requirement, especially in cases where a 
 deployment is done using multiple tar-balls (one each for 
 common/hdfs/mapreduce/yarn). 



[jira] [Commented] (HADOOP-9619) Mark stability of .proto files

2013-06-04 Thread Sanjay Radia (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-9619?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13675526#comment-13675526
 ] 

Sanjay Radia commented on HADOOP-9619:
--

Here are the proposed rules that will be put in a file.

When a .proto file is marked as stable it means that changes should be made in 
a compatible fashion. Compatibility can be broken at a major release, though 
breaking compatibility even at major releases has grave consequences.

The following changes are *never* allowed. 
* Change a field id 
* Reuse an old field id that was previously deleted
Field numbers are cheap; changing or reusing them is not a good idea.


The following changes *cannot* be made to a *stable* .proto except at a major 
release, since they are considered *incompatible*.
* Modify a field type in an *incompatible* way (as defined recursively)
* Add or delete a required field
* Delete an optional field - Q. I am not sure if all our current optional 
fields have reasonable defaults to allow deletions

The following changes are allowed at any time
* Add an optional field, with the expectation that the code deals with the 
field missing due to communication with an older version of the code.
* Rename a field 
* Rename a .proto file
* Change .proto annotations that affect code generation (e.g. name of java 
package)



 Mark stability of .proto files
 --

 Key: HADOOP-9619
 URL: https://issues.apache.org/jira/browse/HADOOP-9619
 Project: Hadoop Common
  Issue Type: Sub-task
  Components: documentation
Reporter: Sanjay Radia
Assignee: Sanjay Radia





[jira] [Created] (HADOOP-9620) Stabilize branch-2 on Windows

2013-06-04 Thread Chuan Liu (JIRA)
Chuan Liu created HADOOP-9620:
-

 Summary: Stabilize branch-2 on Windows
 Key: HADOOP-9620
 URL: https://issues.apache.org/jira/browse/HADOOP-9620
 Project: Hadoop Common
  Issue Type: Bug
Affects Versions: 2.1.0-beta
Reporter: Chuan Liu


After merging trunk changes into branch-2, we are seeing a new set of unit test 
failures in branch-2 on Windows. This is an umbrella jira to track the 
remaining issues in branch-2 on Windows, with the goal to stabilize branch-2 on 
Windows and achieve a 100% unit test pass rate.
