[jira] [Commented] (HADOOP-9421) Convert SASL to use ProtoBuf and add lengths for non-blocking processing

2013-06-20 Thread Luke Lu (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-9421?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13688991#comment-13688991
 ] 

Luke Lu commented on HADOOP-9421:
-

bq. As described, optimize token path.

[Your 
patch|https://issues.apache.org/jira/secure/attachment/12588738/HADOOP-9421.patch]
 still doesn't have the proper client initiation support:
# Only works with token auths that use digest-md5; a major protocol change 
would be required to optimize for SCRAM (the modern digest-md5 replacement), 
Kerberos, or any SASL mechanism that has an initial response.
# Fallback prevention seems broken, as the patch didn't modify 
TestSaslRpc#testSimpleServerWith*Token yet these tests still pass.

 Convert SASL to use ProtoBuf and add lengths for non-blocking processing
 

 Key: HADOOP-9421
 URL: https://issues.apache.org/jira/browse/HADOOP-9421
 Project: Hadoop Common
  Issue Type: Sub-task
Affects Versions: 2.0.3-alpha
Reporter: Sanjay Radia
Assignee: Daryn Sharp
Priority: Blocker
 Attachments: HADOOP-9421.patch, HADOOP-9421.patch, HADOOP-9421.patch, 
 HADOOP-9421.patch, HADOOP-9421.patch, HADOOP-9421.patch, 
 HADOOP-9421-v2-demo.patch




--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HADOOP-9421) Convert SASL to use ProtoBuf and add lengths for non-blocking processing

2013-06-20 Thread Luke Lu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-9421?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Luke Lu updated HADOOP-9421:


Attachment: HADOOP-9421.patch

Patch with proper client initiation and fallback prevention unit tests.

Please review.

 Convert SASL to use ProtoBuf and add lengths for non-blocking processing
 

 Key: HADOOP-9421
 URL: https://issues.apache.org/jira/browse/HADOOP-9421
 Project: Hadoop Common
  Issue Type: Sub-task
Affects Versions: 2.0.3-alpha
Reporter: Sanjay Radia
Assignee: Daryn Sharp
Priority: Blocker
 Attachments: HADOOP-9421.patch, HADOOP-9421.patch, HADOOP-9421.patch, 
 HADOOP-9421.patch, HADOOP-9421.patch, HADOOP-9421.patch, HADOOP-9421.patch, 
 HADOOP-9421-v2-demo.patch






[jira] [Updated] (HADOOP-9651) Filesystems to throw FileAlreadyExistsException in createFile(path, overwrite=false) when the file exists

2013-06-20 Thread JIRA

 [ 
https://issues.apache.org/jira/browse/HADOOP-9651?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jean-Baptiste Onofré updated HADOOP-9651:
-

Attachment: (was: HADOOP-9651.patch)

 Filesystems to throw FileAlreadyExistsException in createFile(path, 
 overwrite=false) when the file exists
 -

 Key: HADOOP-9651
 URL: https://issues.apache.org/jira/browse/HADOOP-9651
 Project: Hadoop Common
  Issue Type: Sub-task
  Components: fs
Affects Versions: 2.1.0-beta
Reporter: Steve Loughran
Priority: Minor
 Attachments: HADOOP-9651.patch


 While HDFS and other filesystems throw a {{FileAlreadyExistsException}} if 
 you try to create a file that exists and you have set {{overwrite=false}}, 
 {{RawLocalFileSystem}} throws a plain {{IOException}}. This makes it 
 impossible to distinguish a create operation failing from a fixable problem 
 (the file is there) and something more fundamental.
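A minimal sketch of the guard the description calls for, using plain java.io; the class name and the nested exception are stand-ins for {{RawLocalFileSystem}} and {{org.apache.hadoop.fs.FileAlreadyExistsException}}, not the actual patch:

```java
import java.io.File;
import java.io.IOException;

// Illustrative sketch only: CreateGuard and the nested exception are
// stand-ins for RawLocalFileSystem and Hadoop's FileAlreadyExistsException.
public class CreateGuard {

    // Local stand-in for org.apache.hadoop.fs.FileAlreadyExistsException.
    static class FileAlreadyExistsException extends IOException {
        FileAlreadyExistsException(String msg) { super(msg); }
    }

    // With overwrite=false, an existing file is a distinct, recoverable
    // condition, so raise the specific subtype rather than a plain
    // IOException.
    static void checkCreate(File f, boolean overwrite) throws IOException {
        if (f.exists() && !overwrite) {
            throw new FileAlreadyExistsException("File already exists: " + f);
        }
    }

    public static void main(String[] args) throws Exception {
        File tmp = File.createTempFile("guard", ".txt");
        tmp.deleteOnExit();
        try {
            checkCreate(tmp, false);
        } catch (FileAlreadyExistsException e) {
            System.out.println("caught: " + e.getMessage());
        }
        checkCreate(tmp, true); // overwrite=true: no exception
    }
}
```

Callers can then catch the subtype to tell "the file is already there" apart from more fundamental I/O failures.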



[jira] [Updated] (HADOOP-9651) Filesystems to throw FileAlreadyExistsException in createFile(path, overwrite=false) when the file exists

2013-06-20 Thread JIRA

 [ 
https://issues.apache.org/jira/browse/HADOOP-9651?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jean-Baptiste Onofré updated HADOOP-9651:
-

Attachment: HADOOP-9651.patch

New patch with fix on other filesystems.

 Filesystems to throw FileAlreadyExistsException in createFile(path, 
 overwrite=false) when the file exists
 -

 Key: HADOOP-9651
 URL: https://issues.apache.org/jira/browse/HADOOP-9651
 Project: Hadoop Common
  Issue Type: Sub-task
  Components: fs
Affects Versions: 2.1.0-beta
Reporter: Steve Loughran
Priority: Minor
 Attachments: HADOOP-9651.patch


 While HDFS and other filesystems throw a {{FileAlreadyExistsException}} if 
 you try to create a file that exists and you have set {{overwrite=false}}, 
 {{RawLocalFileSystem}} throws a plain {{IOException}}. This makes it 
 impossible to distinguish a create operation failing from a fixable problem 
 (the file is there) and something more fundamental.



[jira] [Commented] (HADOOP-9451) Node with one topology layer should be handled as fault topology when NodeGroup layer is enabled

2013-06-20 Thread Luke Lu (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-9451?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13689024#comment-13689024
 ] 

Luke Lu commented on HADOOP-9451:
-

Committed to branch-2 and 2.1-beta as well. Thanks for the note, Suresh!

 Node with one topology layer should be handled as fault topology when 
 NodeGroup layer is enabled
 

 Key: HADOOP-9451
 URL: https://issues.apache.org/jira/browse/HADOOP-9451
 Project: Hadoop Common
  Issue Type: Bug
  Components: net
Affects Versions: 1.1.2
Reporter: Junping Du
Assignee: Junping Du
 Fix For: 1.2.0, 3.0.0

 Attachments: HADOOP-9451-branch-1.patch, HADOOP-9451.patch, 
 HADOOP-9451-v2.patch, HDFS-4652-branch1.patch, HDFS-4652.patch


 Currently, nodes with a one-layer topology are allowed to join a cluster 
 that has the NodeGroup layer enabled, which causes some exception cases. 
 When the NodeGroup layer is enabled, the cluster should assume that at least 
 two layers (Rack/NodeGroup) form a valid topology for each node, and should 
 throw an exception when a one-layer node tries to join.
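The rule described above can be sketched as a simple path-depth check; the class and method names are illustrative, not the actual NetworkTopology code:

```java
// Illustrative sketch: count topology layers from a network location
// path like "/rack1/nodegroup1" and reject one-layer nodes when the
// NodeGroup layer is enabled. Names are hypothetical.
public class TopologyCheck {

    static int layers(String networkLocation) {
        if (networkLocation == null || networkLocation.isEmpty()
                || networkLocation.equals("/")) {
            return 0;
        }
        // "/rack1/nodegroup1".split("/") -> ["", "rack1", "nodegroup1"]
        return networkLocation.split("/").length - 1;
    }

    static void validateForNodeGroup(String networkLocation) {
        // With NodeGroup enabled, a valid location needs at least
        // two layers: /rack/nodegroup.
        if (layers(networkLocation) < 2) {
            throw new IllegalArgumentException("Invalid topology "
                + networkLocation + ": at least /rack/nodegroup is required");
        }
    }

    public static void main(String[] args) {
        validateForNodeGroup("/rack1/nodegroup1"); // ok
        try {
            validateForNodeGroup("/rack1"); // one layer: rejected
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());
        }
    }
}
```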



[jira] [Updated] (HADOOP-9451) Node with one topology layer should be handled as fault topology when NodeGroup layer is enabled

2013-06-20 Thread Luke Lu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-9451?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Luke Lu updated HADOOP-9451:


Target Version/s: 1.2.0, 2.1.0-beta  (was: 1.2.0, 3.0.0)
   Fix Version/s: (was: 3.0.0)
  2.1.0-beta

 Node with one topology layer should be handled as fault topology when 
 NodeGroup layer is enabled
 

 Key: HADOOP-9451
 URL: https://issues.apache.org/jira/browse/HADOOP-9451
 Project: Hadoop Common
  Issue Type: Bug
  Components: net
Affects Versions: 1.1.2
Reporter: Junping Du
Assignee: Junping Du
 Fix For: 1.2.0, 2.1.0-beta

 Attachments: HADOOP-9451-branch-1.patch, HADOOP-9451.patch, 
 HADOOP-9451-v2.patch, HDFS-4652-branch1.patch, HDFS-4652.patch


 Currently, nodes with a one-layer topology are allowed to join a cluster 
 that has the NodeGroup layer enabled, which causes some exception cases. 
 When the NodeGroup layer is enabled, the cluster should assume that at least 
 two layers (Rack/NodeGroup) form a valid topology for each node, and should 
 throw an exception when a one-layer node tries to join.



[jira] [Commented] (HADOOP-9421) Convert SASL to use ProtoBuf and add lengths for non-blocking processing

2013-06-20 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-9421?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13689025#comment-13689025
 ] 

Hadoop QA commented on HADOOP-9421:
---

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12588781/HADOOP-9421.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 1 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:red}-1 core tests{color}.  The patch failed these unit tests in 
hadoop-common-project/hadoop-common:

  org.apache.hadoop.ipc.TestSaslRPC

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-HADOOP-Build/2682//testReport/
Console output: 
https://builds.apache.org/job/PreCommit-HADOOP-Build/2682//console

This message is automatically generated.

 Convert SASL to use ProtoBuf and add lengths for non-blocking processing
 

 Key: HADOOP-9421
 URL: https://issues.apache.org/jira/browse/HADOOP-9421
 Project: Hadoop Common
  Issue Type: Sub-task
Affects Versions: 2.0.3-alpha
Reporter: Sanjay Radia
Assignee: Daryn Sharp
Priority: Blocker
 Attachments: HADOOP-9421.patch, HADOOP-9421.patch, HADOOP-9421.patch, 
 HADOOP-9421.patch, HADOOP-9421.patch, HADOOP-9421.patch, HADOOP-9421.patch, 
 HADOOP-9421-v2-demo.patch






[jira] [Commented] (HADOOP-9651) Filesystems to throw FileAlreadyExistsException in createFile(path, overwrite=false) when the file exists

2013-06-20 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-9651?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13689033#comment-13689033
 ] 

Hadoop QA commented on HADOOP-9651:
---

{color:green}+1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12588785/HADOOP-9651.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 2 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-common-project/hadoop-common.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-HADOOP-Build/2683//testReport/
Console output: 
https://builds.apache.org/job/PreCommit-HADOOP-Build/2683//console

This message is automatically generated.

 Filesystems to throw FileAlreadyExistsException in createFile(path, 
 overwrite=false) when the file exists
 -

 Key: HADOOP-9651
 URL: https://issues.apache.org/jira/browse/HADOOP-9651
 Project: Hadoop Common
  Issue Type: Sub-task
  Components: fs
Affects Versions: 2.1.0-beta
Reporter: Steve Loughran
Priority: Minor
 Attachments: HADOOP-9651.patch


 While HDFS and other filesystems throw a {{FileAlreadyExistsException}} if 
 you try to create a file that exists and you have set {{overwrite=false}}, 
 {{RawLocalFileSystem}} throws a plain {{IOException}}. This makes it 
 impossible to distinguish a create operation failing from a fixable problem 
 (the file is there) and something more fundamental.



[jira] [Commented] (HADOOP-9258) Add stricter tests to FileSystemContractTestBase

2013-06-20 Thread Steve Loughran (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-9258?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13689060#comment-13689060
 ] 

Steve Loughran commented on HADOOP-9258:


go for it!

 Add stricter tests to FileSystemContractTestBase
 

 Key: HADOOP-9258
 URL: https://issues.apache.org/jira/browse/HADOOP-9258
 Project: Hadoop Common
  Issue Type: Sub-task
  Components: test
Affects Versions: 1.1.1, 2.0.3-alpha
Reporter: Steve Loughran
Assignee: Steve Loughran
 Fix For: 3.0.0

 Attachments: HADOOP-9258-8.patch, HADOOP-9528-2.patch, 
 HADOOP-9528-3.patch, HADOOP-9528-4.patch, HADOOP-9528-5.patch, 
 HADOOP-9528-6.patch, HADOOP-9528-7.patch, HADOOP-9528.patch


 The File System Contract contains implicit assumptions that aren't checked in 
 the contract test base. Add more tests to define the contract's assumptions 
 more rigorously for those filesystems that are tested by this (not Local, BTW).



[jira] [Created] (HADOOP-9657) NetUtils.wrapException to have special handling for 0.0.0.0 addresses and :0 ports

2013-06-20 Thread Steve Loughran (JIRA)
Steve Loughran created HADOOP-9657:
--

 Summary: NetUtils.wrapException to have special handling for 
0.0.0.0 addresses and :0 ports
 Key: HADOOP-9657
 URL: https://issues.apache.org/jira/browse/HADOOP-9657
 Project: Hadoop Common
  Issue Type: Improvement
  Components: net
Reporter: Steve Loughran
Priority: Minor


When an exception is wrapped, it may look like {{0.0.0.0:0 failed on connection 
exception: java.net.ConnectException: Connection refused; For more details see: 
 http://wiki.apache.org/hadoop/ConnectionRefused}}

We should recognise all-zero IP addresses and 0 ports and flag them as a 
misconfigured endpoint, since that is clearly the case.
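A sketch of the proposed recognition, assuming the endpoint is available as a plain host string and port; {{isWildcardEndpoint}} and {{diagnose}} are hypothetical names, not the actual NetUtils API:

```java
// Hypothetical sketch: recognise unresolved wildcard endpoints before
// wrapping the exception. isWildcardEndpoint/diagnose are not the
// actual NetUtils methods.
public class WildcardCheck {

    static boolean isWildcardEndpoint(String host, int port) {
        return "0.0.0.0".equals(host) || port == 0;
    }

    static String diagnose(String host, int port) {
        if (isWildcardEndpoint(host, port)) {
            return "Endpoint " + host + ":" + port
                + " is unresolved; the endpoint configuration is likely wrong.";
        }
        return host + ":" + port;
    }

    public static void main(String[] args) {
        System.out.println(diagnose("0.0.0.0", 0));
        System.out.println(diagnose("example.com", 8020));
    }
}
```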




[jira] [Created] (HADOOP-9658) SnappyCodec#checkNativeCodeLoaded may unexpectedly fail when native code is not loaded

2013-06-20 Thread Zhijie Shen (JIRA)
Zhijie Shen created HADOOP-9658:
---

 Summary: SnappyCodec#checkNativeCodeLoaded may unexpectedly fail 
when native code is not loaded
 Key: HADOOP-9658
 URL: https://issues.apache.org/jira/browse/HADOOP-9658
 Project: Hadoop Common
  Issue Type: Bug
Reporter: Zhijie Shen
Assignee: Zhijie Shen


{code}
  public static void checkNativeCodeLoaded() {
    if (!NativeCodeLoader.buildSupportsSnappy()) {
      throw new RuntimeException("native snappy library not available: " +
          "this version of libhadoop was built without " +
          "snappy support.");
    }
    if (!SnappyCompressor.isNativeCodeLoaded()) {
      throw new RuntimeException("native snappy library not available: " +
          "SnappyCompressor has not been loaded.");
    }
    if (!SnappyDecompressor.isNativeCodeLoaded()) {
      throw new RuntimeException("native snappy library not available: " +
          "SnappyDecompressor has not been loaded.");
    }
  }
{code}
buildSupportsSnappy is a native method. If the native code is not loaded, the 
method itself is missing, so calling it fails with an UnsatisfiedLinkError 
before the check can run. Therefore, whether the native code is loaded or not, 
the first runtime exception will never be thrown.
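A minimal sketch of the failure mode and one possible guard, under the assumption that the fix is to catch the link error explicitly; the stub method below simulates the missing library and is not the real NativeCodeLoader:

```java
// Sketch only: buildSupportsSnappy() here is a stub that simulates the
// missing-library case; the real method is a native one on NativeCodeLoader.
public class NativeCheckSketch {

    static boolean buildSupportsSnappy() {
        // Calling an unresolved native method raises UnsatisfiedLinkError.
        throw new UnsatisfiedLinkError("libhadoop not loaded");
    }

    static void checkNativeCodeLoaded() {
        try {
            if (!buildSupportsSnappy()) {
                throw new RuntimeException("native snappy library not available: "
                    + "this version of libhadoop was built without snappy support.");
            }
        } catch (UnsatisfiedLinkError e) {
            // Without this catch, callers would see an Error, not the
            // intended RuntimeException.
            throw new RuntimeException("native snappy library not available: "
                + "libhadoop is not loaded.", e);
        }
    }

    public static void main(String[] args) {
        try {
            checkNativeCodeLoaded();
        } catch (RuntimeException e) {
            System.out.println(e.getMessage());
        }
    }
}
```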



[jira] [Updated] (HADOOP-9421) Convert SASL to use ProtoBuf and add lengths for non-blocking processing

2013-06-20 Thread Luke Lu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-9421?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Luke Lu updated HADOOP-9421:


Attachment: HADOOP-9421.patch

Test failures were due to atm's r1494787 checkin. New patch to make the tests 
work again.

 Convert SASL to use ProtoBuf and add lengths for non-blocking processing
 

 Key: HADOOP-9421
 URL: https://issues.apache.org/jira/browse/HADOOP-9421
 Project: Hadoop Common
  Issue Type: Sub-task
Affects Versions: 2.0.3-alpha
Reporter: Sanjay Radia
Assignee: Daryn Sharp
Priority: Blocker
 Attachments: HADOOP-9421.patch, HADOOP-9421.patch, HADOOP-9421.patch, 
 HADOOP-9421.patch, HADOOP-9421.patch, HADOOP-9421.patch, HADOOP-9421.patch, 
 HADOOP-9421.patch, HADOOP-9421-v2-demo.patch






[jira] [Commented] (HADOOP-8608) Add Configuration API for parsing time durations

2013-06-20 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-8608?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13689120#comment-13689120
 ] 

Hudson commented on HADOOP-8608:


Integrated in Hadoop-Yarn-trunk #246 (See 
[https://builds.apache.org/job/Hadoop-Yarn-trunk/246/])
Move HADOOP-8608 to branch-2.1 (Revision 1494824)

 Result = FAILURE
cdouglas : 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1494824
Files : 
* /hadoop/common/trunk/hadoop-common-project/hadoop-common/CHANGES.txt


 Add Configuration API for parsing time durations
 

 Key: HADOOP-8608
 URL: https://issues.apache.org/jira/browse/HADOOP-8608
 Project: Hadoop Common
  Issue Type: Improvement
  Components: conf
Affects Versions: 2.1.0-beta
Reporter: Todd Lipcon
Assignee: Chris Douglas
Priority: Minor
 Fix For: 3.0.0

 Attachments: 8608-0.patch, 8608-1.patch, 8608-2.patch


 Hadoop has a lot of configurations which specify durations or intervals of 
 time. Unfortunately these different configurations have little consistency in 
 units - eg some are in milliseconds, some in seconds, and some in minutes. 
 This makes it difficult for users to configure, since they have to always 
 refer back to docs to remember the unit for each property.
 The proposed solution is to add an API like {{Configuration.getTimeDuration}} 
 which allows the user to specify the units with a postfix. For example, 
 "10ms", "10s", "10m", "10h", or even "10d". For backwards-compatibility, if 
 the user does not specify a unit, the API can specify the default unit, and 
 warn the user that they should specify an explicit unit instead.
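The suffix scheme can be sketched like this; the method below is a simplified stand-in, not the actual {{Configuration.getTimeDuration}} signature:

```java
import java.util.concurrent.TimeUnit;

// Simplified stand-in for the proposed API; the real
// Configuration.getTimeDuration signature differs. The two-letter "ms"
// suffix is checked before the one-letter ones so "10ms" is not read
// as "10m".
public class DurationParse {

    static long parseMillis(String value, TimeUnit defaultUnit) {
        String v = value.trim().toLowerCase();
        TimeUnit unit = defaultUnit;
        String num = v;
        if (v.endsWith("ms")) {
            unit = TimeUnit.MILLISECONDS; num = v.substring(0, v.length() - 2);
        } else if (v.endsWith("s")) {
            unit = TimeUnit.SECONDS; num = v.substring(0, v.length() - 1);
        } else if (v.endsWith("m")) {
            unit = TimeUnit.MINUTES; num = v.substring(0, v.length() - 1);
        } else if (v.endsWith("h")) {
            unit = TimeUnit.HOURS; num = v.substring(0, v.length() - 1);
        } else if (v.endsWith("d")) {
            unit = TimeUnit.DAYS; num = v.substring(0, v.length() - 1);
        }
        // No suffix: fall back to the default unit (the real proposal
        // would also warn the user here).
        return unit.toMillis(Long.parseLong(num));
    }
}
```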



[jira] [Commented] (HADOOP-9421) Convert SASL to use ProtoBuf and add lengths for non-blocking processing

2013-06-20 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-9421?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13689130#comment-13689130
 ] 

Hadoop QA commented on HADOOP-9421:
---

{color:green}+1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12588814/HADOOP-9421.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 1 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-common-project/hadoop-common.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-HADOOP-Build/2684//testReport/
Console output: 
https://builds.apache.org/job/PreCommit-HADOOP-Build/2684//console

This message is automatically generated.

 Convert SASL to use ProtoBuf and add lengths for non-blocking processing
 

 Key: HADOOP-9421
 URL: https://issues.apache.org/jira/browse/HADOOP-9421
 Project: Hadoop Common
  Issue Type: Sub-task
Affects Versions: 2.0.3-alpha
Reporter: Sanjay Radia
Assignee: Daryn Sharp
Priority: Blocker
 Attachments: HADOOP-9421.patch, HADOOP-9421.patch, HADOOP-9421.patch, 
 HADOOP-9421.patch, HADOOP-9421.patch, HADOOP-9421.patch, HADOOP-9421.patch, 
 HADOOP-9421.patch, HADOOP-9421-v2-demo.patch






[jira] [Commented] (HADOOP-9421) Convert SASL to use ProtoBuf and add lengths for non-blocking processing

2013-06-20 Thread Daryn Sharp (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-9421?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13689208#comment-13689208
 ] 

Daryn Sharp commented on HADOOP-9421:
-

bq. Only works with token auths that use digest-md5; a major protocol change 
would be required to optimize for SCRAM (the modern digest-md5 replacement), 
Kerberos, or any SASL mechanism that has an initial response.

A major protocol change will not be required for other auths.  The client is 
properly coded to handle the server providing an initial challenge for any 
auth, but the server currently only does it for tokens.  When the server auths 
become extensible, additional initial challenges can be added w/o changing the 
client. I.e., it's forward compatible.

I did not generate an initial challenge for kerberos because the SASL mechanism 
does not support it.  An exception is thrown if you try.

This is intended to be a minimal change to provide a base implementation for 
future work.  I thought everybody would be satisfied by removal of an existing 
round trip to offset the negotiate response?

I'll look at your modifications to the patch.
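For reference, the standard javax.security.sasl API exposes exactly this distinction; the helper below is an illustrative sketch of a client deciding whether to send data first, not code from the patch:

```java
import javax.security.sasl.SaslClient;
import javax.security.sasl.SaslException;

// Illustrative helper: mechanisms such as PLAIN report
// hasInitialResponse() == true and may send data before any server
// challenge, while DIGEST-MD5 must wait for the server to go first.
public class SaslInit {

    public static byte[] initialResponse(SaslClient client) throws SaslException {
        if (client.hasInitialResponse()) {
            // Evaluate an empty challenge to obtain the initial token.
            return client.evaluateChallenge(new byte[0]);
        }
        return null; // wait for the server's first challenge
    }
}
```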

 Convert SASL to use ProtoBuf and add lengths for non-blocking processing
 

 Key: HADOOP-9421
 URL: https://issues.apache.org/jira/browse/HADOOP-9421
 Project: Hadoop Common
  Issue Type: Sub-task
Affects Versions: 2.0.3-alpha
Reporter: Sanjay Radia
Assignee: Daryn Sharp
Priority: Blocker
 Attachments: HADOOP-9421.patch, HADOOP-9421.patch, HADOOP-9421.patch, 
 HADOOP-9421.patch, HADOOP-9421.patch, HADOOP-9421.patch, HADOOP-9421.patch, 
 HADOOP-9421.patch, HADOOP-9421-v2-demo.patch






[jira] [Commented] (HADOOP-8608) Add Configuration API for parsing time durations

2013-06-20 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-8608?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13689215#comment-13689215
 ] 

Hudson commented on HADOOP-8608:


Integrated in Hadoop-Hdfs-trunk #1436 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk/1436/])
Move HADOOP-8608 to branch-2.1 (Revision 1494824)

 Result = FAILURE
cdouglas : 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1494824
Files : 
* /hadoop/common/trunk/hadoop-common-project/hadoop-common/CHANGES.txt


 Add Configuration API for parsing time durations
 

 Key: HADOOP-8608
 URL: https://issues.apache.org/jira/browse/HADOOP-8608
 Project: Hadoop Common
  Issue Type: Improvement
  Components: conf
Affects Versions: 2.1.0-beta
Reporter: Todd Lipcon
Assignee: Chris Douglas
Priority: Minor
 Fix For: 3.0.0

 Attachments: 8608-0.patch, 8608-1.patch, 8608-2.patch


 Hadoop has a lot of configurations which specify durations or intervals of 
 time. Unfortunately these different configurations have little consistency in 
 units - eg some are in milliseconds, some in seconds, and some in minutes. 
 This makes it difficult for users to configure, since they have to always 
 refer back to docs to remember the unit for each property.
 The proposed solution is to add an API like {{Configuration.getTimeDuration}} 
 which allows the user to specify the units with a postfix. For example, 
 "10ms", "10s", "10m", "10h", or even "10d". For backwards-compatibility, if 
 the user does not specify a unit, the API can specify the default unit, and 
 warn the user that they should specify an explicit unit instead.



[jira] [Commented] (HADOOP-9264) port change to use Java untar API on Windows from branch-1-win to trunk

2013-06-20 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-9264?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13689214#comment-13689214
 ] 

Hudson commented on HADOOP-9264:


Integrated in Hadoop-Hdfs-trunk #1436 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk/1436/])
HADOOP-9264. Change attribution of HADOOP-9264 from trunk to 2.1.0-beta. 
(cnauroth) (Revision 1494709)

 Result = FAILURE
cnauroth : 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1494709
Files : 
* /hadoop/common/trunk/hadoop-common-project/hadoop-common/CHANGES.txt


 port change to use Java untar API on Windows from branch-1-win to trunk
 ---

 Key: HADOOP-9264
 URL: https://issues.apache.org/jira/browse/HADOOP-9264
 Project: Hadoop Common
  Issue Type: Bug
  Components: fs
Affects Versions: 3.0.0
Reporter: Chris Nauroth
Assignee: Chris Nauroth
 Fix For: 3.0.0, 2.1.0-beta

 Attachments: HADOOP-9264.1.patch, test-untar.tar, test-untar.tgz


 HADOOP-8847 originally introduced this change on branch-1-win.  HADOOP-9081 
 ported the change to branch-trunk-win.  This should be simple to port to 
 trunk, which would simplify the merge and test activity happening on 
 HADOOP-8562.



[jira] [Issue Comment Deleted] (HADOOP-9421) Convert SASL to use ProtoBuf and add lengths for non-blocking processing

2013-06-20 Thread Robert Joseph Evans (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-9421?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Joseph Evans updated HADOOP-9421:


Comment: was deleted

(was: I think it's close. It needs to be rebased against trunk for atm's 
security fix. I'm also adding two unit tests to make sure fallback prevention 
actually works.)

 Convert SASL to use ProtoBuf and add lengths for non-blocking processing
 

 Key: HADOOP-9421
 URL: https://issues.apache.org/jira/browse/HADOOP-9421
 Project: Hadoop Common
  Issue Type: Sub-task
Affects Versions: 2.0.3-alpha
Reporter: Sanjay Radia
Assignee: Daryn Sharp
Priority: Blocker
 Attachments: HADOOP-9421.patch, HADOOP-9421.patch, HADOOP-9421.patch, 
 HADOOP-9421.patch, HADOOP-9421.patch, HADOOP-9421.patch, HADOOP-9421.patch, 
 HADOOP-9421.patch, HADOOP-9421-v2-demo.patch






[jira] [Issue Comment Deleted] (HADOOP-9421) Convert SASL to use ProtoBuf and add lengths for non-blocking processing

2013-06-20 Thread Robert Joseph Evans (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-9421?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Joseph Evans updated HADOOP-9421:


Comment: was deleted

(was: Test failures were due to atm's r1494787 checkin. New patch to make the 
tests work again.)

 Convert SASL to use ProtoBuf and add lengths for non-blocking processing
 

 Key: HADOOP-9421
 URL: https://issues.apache.org/jira/browse/HADOOP-9421
 Project: Hadoop Common
  Issue Type: Sub-task
Affects Versions: 2.0.3-alpha
Reporter: Sanjay Radia
Assignee: Daryn Sharp
Priority: Blocker
 Attachments: HADOOP-9421.patch, HADOOP-9421.patch, HADOOP-9421.patch, 
 HADOOP-9421.patch, HADOOP-9421.patch, HADOOP-9421.patch, HADOOP-9421.patch, 
 HADOOP-9421.patch, HADOOP-9421-v2-demo.patch






[jira] [Issue Comment Deleted] (HADOOP-9421) Convert SASL to use ProtoBuf and add lengths for non-blocking processing

2013-06-20 Thread Robert Joseph Evans (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-9421?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Joseph Evans updated HADOOP-9421:


Comment: was deleted

(was: Looks like I need to merge with atm's fall-back-to-simple option commit 
(without a JIRA). )

 Convert SASL to use ProtoBuf and add lengths for non-blocking processing
 

 Key: HADOOP-9421
 URL: https://issues.apache.org/jira/browse/HADOOP-9421
 Project: Hadoop Common
  Issue Type: Sub-task
Affects Versions: 2.0.3-alpha
Reporter: Sanjay Radia
Assignee: Daryn Sharp
Priority: Blocker
 Attachments: HADOOP-9421.patch, HADOOP-9421.patch, HADOOP-9421.patch, 
 HADOOP-9421.patch, HADOOP-9421.patch, HADOOP-9421.patch, HADOOP-9421.patch, 
 HADOOP-9421.patch, HADOOP-9421-v2-demo.patch






[jira] [Commented] (HADOOP-9264) port change to use Java untar API on Windows from branch-1-win to trunk

2013-06-20 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-9264?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13689260#comment-13689260
 ] 

Hudson commented on HADOOP-9264:


Integrated in Hadoop-Mapreduce-trunk #1463 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1463/])
HADOOP-9264. Change attribution of HADOOP-9264 from trunk to 2.1.0-beta. 
(cnauroth) (Revision 1494709)

 Result = SUCCESS
cnauroth : 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1494709
Files : 
* /hadoop/common/trunk/hadoop-common-project/hadoop-common/CHANGES.txt


 port change to use Java untar API on Windows from branch-1-win to trunk
 ---

 Key: HADOOP-9264
 URL: https://issues.apache.org/jira/browse/HADOOP-9264
 Project: Hadoop Common
  Issue Type: Bug
  Components: fs
Affects Versions: 3.0.0
Reporter: Chris Nauroth
Assignee: Chris Nauroth
 Fix For: 3.0.0, 2.1.0-beta

 Attachments: HADOOP-9264.1.patch, test-untar.tar, test-untar.tgz


 HADOOP-8847 originally introduced this change on branch-1-win.  HADOOP-9081 
 ported the change to branch-trunk-win.  This should be simple to port to 
 trunk, which would simplify the merge and test activity happening on 
 HADOOP-8562.



[jira] [Commented] (HADOOP-8608) Add Configuration API for parsing time durations

2013-06-20 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-8608?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13689261#comment-13689261
 ] 

Hudson commented on HADOOP-8608:


Integrated in Hadoop-Mapreduce-trunk #1463 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1463/])
Move HADOOP-8608 to branch-2.1 (Revision 1494824)

 Result = SUCCESS
cdouglas : 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1494824
Files : 
* /hadoop/common/trunk/hadoop-common-project/hadoop-common/CHANGES.txt


 Add Configuration API for parsing time durations
 

 Key: HADOOP-8608
 URL: https://issues.apache.org/jira/browse/HADOOP-8608
 Project: Hadoop Common
  Issue Type: Improvement
  Components: conf
Affects Versions: 2.1.0-beta
Reporter: Todd Lipcon
Assignee: Chris Douglas
Priority: Minor
 Fix For: 3.0.0

 Attachments: 8608-0.patch, 8608-1.patch, 8608-2.patch


 Hadoop has a lot of configurations which specify durations or intervals of 
 time. Unfortunately these different configurations have little consistency in 
 units - eg some are in milliseconds, some in seconds, and some in minutes. 
 This makes it difficult for users to configure, since they have to always 
 refer back to docs to remember the unit for each property.
 The proposed solution is to add an API like {{Configuration.getTimeDuration}} 
 which allows the user to specify the units with a postfix. For example, 
 10ms, 10s, 10m, 10h, or even 10d. For backwards-compatibility, if 
 the user does not specify a unit, the API can specify the default unit, and 
 warn the user that they should specify an explicit unit instead.
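 The suffix-parsing idea above can be sketched roughly as follows. This is a
 hypothetical stand-alone illustration (class and method names invented), not
 the Configuration.getTimeDuration implementation itself:

```java
import java.util.concurrent.TimeUnit;

// Hypothetical sketch of suffix-based duration parsing in the spirit of the
// proposed Configuration.getTimeDuration API. Not the committed implementation.
public class TimeDurationDemo {
  public static long toMillis(String value, TimeUnit defaultUnit) {
    String v = value.trim();
    TimeUnit unit = defaultUnit;
    int cut = v.length();
    // Check the longer suffix ("ms") before the shorter ones it contains ("s").
    if (v.endsWith("ms")) { unit = TimeUnit.MILLISECONDS; cut -= 2; }
    else if (v.endsWith("s")) { unit = TimeUnit.SECONDS; cut -= 1; }
    else if (v.endsWith("m")) { unit = TimeUnit.MINUTES; cut -= 1; }
    else if (v.endsWith("h")) { unit = TimeUnit.HOURS; cut -= 1; }
    else if (v.endsWith("d")) { unit = TimeUnit.DAYS; cut -= 1; }
    // No suffix: fall back to the caller-supplied default unit, matching the
    // backwards-compatibility behavior described above.
    long raw = Long.parseLong(v.substring(0, cut));
    return unit.toMillis(raw);
  }

  public static void main(String[] args) {
    System.out.println(toMillis("10s", TimeUnit.MILLISECONDS)); // 10000
  }
}
```

 A caller that reads "10" with a default unit of seconds gets the same result
 as one that reads "10s", which is the backwards-compatible path.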



[jira] [Commented] (HADOOP-9421) Convert SASL to use ProtoBuf and add lengths for non-blocking processing

2013-06-20 Thread Daryn Sharp (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-9421?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13689283#comment-13689283
 ] 

Daryn Sharp commented on HADOOP-9421:
-

I don't understand the advantage of this patch.  At a minimum, here are major 
problems.
* Re-introduces the roundtrip I removed for tokens and usable by other auths in 
the future
* Appears to add yet another roundtrip for non-token auths
* Completely removes the ability for the client to choose the best or most 
preferred auth
* Ruins pluggable auths because the client now requires specific logic to 
guess if it can do the new auth
* Prevents elimination of token use_ip
* Prevents supporting tokens for multi-interface, multi-A record, or CNAMEs
* Breaks my ability to add IP failover support
* Allows clients to do complete DOS attacks by tying up the socket indefinitely 
with initiates

Given my correction of the misunderstandings of my prior patch, what are the 
disadvantages?

 Convert SASL to use ProtoBuf and add lengths for non-blocking processing
 

 Key: HADOOP-9421
 URL: https://issues.apache.org/jira/browse/HADOOP-9421
 Project: Hadoop Common
  Issue Type: Sub-task
Affects Versions: 2.0.3-alpha
Reporter: Sanjay Radia
Assignee: Daryn Sharp
Priority: Blocker
 Attachments: HADOOP-9421.patch, HADOOP-9421.patch, HADOOP-9421.patch, 
 HADOOP-9421.patch, HADOOP-9421.patch, HADOOP-9421.patch, HADOOP-9421.patch, 
 HADOOP-9421.patch, HADOOP-9421-v2-demo.patch






[jira] [Commented] (HADOOP-9421) Convert SASL to use ProtoBuf and add lengths for non-blocking processing

2013-06-20 Thread Luke Lu (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-9421?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13689336#comment-13689336
 ] 

Luke Lu commented on HADOOP-9421:
-

bq. I don't understand the advantage of this patch. At a minimum, here are 
major problems.

I think you misunderstood the patch and the current flow as well. In the 
[Current 
flow|https://issues.apache.org/jira/browse/HADOOP-9421?focusedCommentId=13688055&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13688055],
 steps 1 and 2 are in the same C-S packet. 

My patch is an improvement over yours in that challenge or negotiation is not 
done blindly. It can support SCRAM/Kerberos round-trip reduction with no 
protocol branching while yours cannot.

Conceptually, I just replace step 2 (still part of the connection header 
packet) with a wrapped RpcSaslProto to allow future extensibility. So, 
feature-wise, my patch is a superset of yours.

 Convert SASL to use ProtoBuf and add lengths for non-blocking processing
 

 Key: HADOOP-9421
 URL: https://issues.apache.org/jira/browse/HADOOP-9421
 Project: Hadoop Common
  Issue Type: Sub-task
Affects Versions: 2.0.3-alpha
Reporter: Sanjay Radia
Assignee: Daryn Sharp
Priority: Blocker
 Attachments: HADOOP-9421.patch, HADOOP-9421.patch, HADOOP-9421.patch, 
 HADOOP-9421.patch, HADOOP-9421.patch, HADOOP-9421.patch, HADOOP-9421.patch, 
 HADOOP-9421.patch, HADOOP-9421-v2-demo.patch






[jira] [Created] (HADOOP-9659) Hadoop authentication enhancement use cases, goals, requirements and constraints

2013-06-20 Thread Kevin Minder (JIRA)
Kevin Minder created HADOOP-9659:


 Summary: Hadoop authentication enhancement use cases, goals, 
requirements and constraints
 Key: HADOOP-9659
 URL: https://issues.apache.org/jira/browse/HADOOP-9659
 Project: Hadoop Common
  Issue Type: Task
  Components: security
Reporter: Kevin Minder
Priority: Blocker


We need to collect use cases, goals, requirements and constraints in a central 
location to inform all of the various efforts underway to improve Hadoop 
security initially focusing on authentication.



[jira] [Commented] (HADOOP-9392) Token based authentication and Single Sign On

2013-06-20 Thread Kevin Minder (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-9392?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13689361#comment-13689361
 ] 

Kevin Minder commented on HADOOP-9392:
--

Here is a summary of the discussion we had during the above call.

Attendees: Andrew Purtell, Brian Swan, Benoy Antony, Avik Dey, Kai Zheng, Kyle 
Leckie, Larry McCay, Kevin Minder, Tianyou Li

-- Goals & Perspective --

Hortonworks
* Plug into any enterprise Idp infrastructure
* Enhance Hadoop security model to better support perimeter security
* Align client programming model for different Hadoop deployment models

Microsoft
* Support pluggable identity providers: ActiveDirectory, cloud and beyond
* Enhance user isolation within Hadoop cluster

Intel
* Support token based authentication
* Support fine grained authorization
* Seamless identity delegation at every layer
* Support single sign on: from the user's desktop, between Hadoop clusters
* Pluggable at every level
* Provide a security toolkit that would be integrated across the ecosystem
* Must be backward compatible
* Must take both RPC and HTTP into account and should follow common model

eBay
* Integrate better with eBay SSO
* Provide SSO integration at RPC layer

-- Summit Planning --

* Think of the Summit session as a meet-and-greet and kickoff of a 
cross-cutting security community
* Create a new Jira to collect high-level use cases, goals and usability
* Use time at summit to approach design at a whiteboard from a clean slate 
perspective against those use cases and goals
* Get a sense of how we can divide and conquer the problem space
* Figure out how best to collaborate
* Figure out how we can all get hacking on this ASAP

-- Ideas --

* Foster a security community within the Hadoop community
  * Suggest creating a focused security-dev type community mailing list
  * Suggest creating a wiki area devoted to overall security efforts

* Ideally, current independent designs will inform a collaborative design, 
pulling in the best of existing code to accelerate

* Link the security doc Jira HADOOP-9621 to other related security Jiras

-- Questions --

* What would a central token authority (i.e. HSSO) provide beyond the work 
that is already being done?
  * HADOOP-9479 (Benoy Antony)
  * HADOOP-8779 (Daryn Sharp)

* How can HSSO and TAS work together?  What is the relationship? 

 Token based authentication and Single Sign On
 -

 Key: HADOOP-9392
 URL: https://issues.apache.org/jira/browse/HADOOP-9392
 Project: Hadoop Common
  Issue Type: New Feature
  Components: security
Reporter: Kai Zheng
Assignee: Kai Zheng
 Fix For: 3.0.0

 Attachments: token-based-authn-plus-sso.pdf


 This is an umbrella entry for one of project Rhino’s topic, for details of 
 project Rhino, please refer to 
 https://github.com/intel-hadoop/project-rhino/. The major goal for this entry 
 as described in project Rhino was 
  
 “Core, HDFS, ZooKeeper, and HBase currently support Kerberos authentication 
 at the RPC layer, via SASL. However this does not provide valuable attributes 
 such as group membership, classification level, organizational identity, or 
 support for user defined attributes. Hadoop components must interrogate 
 external resources for discovering these attributes and at scale this is 
 problematic. There is also no consistent delegation model. HDFS has a simple 
 delegation capability, and only Oozie can take limited advantage of it. We 
 will implement a common token based authentication framework to decouple 
 internal user and service authentication from external mechanisms used to 
 support it (like Kerberos)”
  
 We’d like to start our work from Hadoop-Common and try to provide common 
 facilities by extending existing authentication framework which support:
 1.Pluggable token provider interface 
 2.Pluggable token verification protocol and interface
 3.Security mechanism to distribute secrets in cluster nodes
 4.Delegation model of user authentication



[jira] [Updated] (HADOOP-9659) Hadoop authentication enhancement use cases, goals, requirements and constraints

2013-06-20 Thread Kevin Minder (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-9659?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kevin Minder updated HADOOP-9659:
-

Priority: Critical  (was: Blocker)

 Hadoop authentication enhancement use cases, goals, requirements and 
 constraints
 

 Key: HADOOP-9659
 URL: https://issues.apache.org/jira/browse/HADOOP-9659
 Project: Hadoop Common
  Issue Type: Task
  Components: security
Reporter: Kevin Minder
Priority: Critical

 We need to collect use cases, goals, requirements and constraints in a 
 central location to inform all of the various efforts underway to improve 
 Hadoop security initially focusing on authentication.



[jira] [Commented] (HADOOP-9533) Centralized Hadoop SSO/Token Server

2013-06-20 Thread Kevin Minder (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-9533?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13689363#comment-13689363
 ] 

Kevin Minder commented on HADOOP-9533:
--

This is a summary of the discussion that occurred during the above meeting.  

-- Attendees --
Andrew Purtell, Brian Swan, Benoy Antony, Avik Dey, Kai Zheng, Kyle Leckie, 
Larry McCay, Kevin Minder, Tianyou Li

-- Goals & Perspective --

Hortonworks
* Plug into any enterprise Idp infrastructure
* Enhance Hadoop security model to better support perimeter security
* Align client programming model for different Hadoop deployment models

Microsoft
* Support pluggable identity providers: ActiveDirectory, cloud and beyond
* Enhance user isolation within Hadoop cluster

Intel
* Support token based authentication
* Support fine grained authorization
* Seamless identity delegation at every layer
* Support single sign on: from the user's desktop, between Hadoop clusters
* Pluggable at every level
* Provide a security toolkit that would be integrated across the ecosystem
* Must be backward compatible
* Must take both RPC and HTTP into account and should follow common model

eBay
* Integrate better with eBay SSO
* Provide SSO integration at RPC layer

-- Summit Planning --

* Think of the Summit session as a meet-and-greet and kickoff of a 
cross-cutting security community
* Create a new Jira to collect high-level use cases, goals and usability
* Use time at summit to approach design at a whiteboard from a clean slate 
perspective against those use cases and goals
* Get a sense of how we can divide and conquer the problem space
* Figure out how best to collaborate
* Figure out how we can all get hacking on this ASAP

-- Ideas --

* Foster a security community within the Hadoop community
  * Suggest creating a focused security-dev type community mailing list
  * Suggest creating a wiki area devoted to overall security efforts

* Ideally, current independent designs will inform a collaborative design, 
pulling in the best of existing code to accelerate

* Link the security doc Jira HADOOP-9621 to other related security Jiras

-- Questions --

* What would a central token authority (i.e. HSSO) provide beyond the work 
that is already being done?
  * HADOOP-9479 (Benoy Antony)
  * HADOOP-8779 (Daryn Sharp)

* How can HSSO and TAS work together?  What is the relationship? 

 Centralized Hadoop SSO/Token Server
 ---

 Key: HADOOP-9533
 URL: https://issues.apache.org/jira/browse/HADOOP-9533
 Project: Hadoop Common
  Issue Type: New Feature
  Components: security
Reporter: Larry McCay
 Attachments: HSSO-Interaction-Overview-rev-1.docx, 
 HSSO-Interaction-Overview-rev-1.pdf


 This is an umbrella Jira filing to oversee a set of proposals for introducing 
 a new master service for Hadoop Single Sign On (HSSO).
 There is an increasing need for pluggable authentication providers that 
 authenticate both users and services as well as validate tokens in order to 
 federate identities authenticated by trusted IDPs. These IDPs may be deployed 
 within the enterprise or third-party IDPs that are external to the enterprise.
 These needs speak to a specific pain point: which is a narrow integration 
 path into the enterprise identity infrastructure. Kerberos is a fine solution 
 for those that already have it in place or are willing to adopt its use but 
 there remains a class of user that finds this unacceptable and needs to 
 integrate with a wider variety of identity management solutions.
 Another specific pain point is that of rolling and distributing keys. A 
 related and integral part of the HSSO server is library called the Credential 
 Management Framework (CMF), which will be a common library for easing the 
 management of secrets, keys and credentials.
 Initially, the existing delegation, block access and job tokens will continue 
 to be utilized. There may be some changes required to leverage a PKI based 
 signature facility rather than shared secrets. This is a means to simplify 
 the solution for the pain point of distributing shared secrets.
 This project will primarily centralize the responsibility of authentication 
 and federation into a single service that is trusted across the Hadoop 
 cluster and optionally across multiple clusters. This greatly simplifies a 
 number of things in the Hadoop ecosystem:
 1.a single token format that is used across all of Hadoop regardless of 
 authentication method
 2.a single service to have pluggable providers instead of all services
 3.a single token authority that would be trusted across the cluster/s and 
 through PKI encryption be able to easily issue cryptographically verifiable 
 tokens
 4.automatic rolling of the token authority’s keys and publishing of the 
 public key for easy access by those parties that need 

[jira] [Commented] (HADOOP-9653) Token validation and transmission

2013-06-20 Thread Kai Zheng (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-9653?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13689398#comment-13689398
 ] 

Kai Zheng commented on HADOOP-9653:
---

At any rate, I have difficulty visualizing how arbitrary token types are 
going to be presented by the clients for either RPC or HTTP based APIs in a 
common way.  
The GSS mechanism certainly works in the SASL RPC framework, as we all know;
the GSS mechanism can also work in the SPNEGO process for HttpUrlConnection or 
a browser to access a REST interface or web resources;
in situations where SPNEGO might not work for web services, a special filter 
should be used to contact the IdP to get an identity token, then request an 
access token, and use that access token to authenticate to the web service. In 
this case, since the filter and the servlets providing the service are hosted 
in the same web server, no token transfer is involved.


 Token validation and transmission
 -

 Key: HADOOP-9653
 URL: https://issues.apache.org/jira/browse/HADOOP-9653
 Project: Hadoop Common
  Issue Type: Sub-task
  Components: security
Reporter: Kai Zheng
Assignee: Kai Zheng
  Labels: rhino
 Fix For: 3.0.0


 HADOOP-9392 proposes a customizable token authenticator for services to 
 implement the TokenAuthn method; supporting pluggable token validation was 
 thought to be a significant feature in itself, so it deserves to be addressed 
 in a separate JIRA. This JIRA will also consider how to securely transmit 
 tokens in Hadoop RPC in a way that defends against all of the classical 
 attacks. Note the authentication negotiation and wrapping of Hadoop RPC 
 should be backwards compatible and interoperable with existing deployments, 
 and should therefore be SASL based.



[jira] [Updated] (HADOOP-9656) Gridmix unit tests fail on Windows and Linux

2013-06-20 Thread Chuan Liu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-9656?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chuan Liu updated HADOOP-9656:
--

Attachment: HADOOP-9656-trunk.patch

 Gridmix unit tests fail on Windows and Linux
 

 Key: HADOOP-9656
 URL: https://issues.apache.org/jira/browse/HADOOP-9656
 Project: Hadoop Common
  Issue Type: Bug
Affects Versions: 3.0.0, 2.1.0-beta
Reporter: Chuan Liu
Assignee: Chuan Liu
Priority: Minor
 Attachments: HADOOP-9656-trunk.patch, HADOOP-9656-trunk.patch


 The following three Gridmix unit tests fail on both Windows and Linux:
 *TestGridmixSubmission
 *TestLoadJob
 *TestSleepJob
 One common cause of failure for both Windows and Linux is that -1 was passed 
 to {{scaleConfigParameter()}} as the default per-task memory request in 
 {{GridmixJob.configureHighRamProperties()}} method.
 In addition to the memory setting issue, Windows also has a path issue. In the 
 {{CommonJobTest.doSubmission()}} method, the root path is an HDFS path. 
 However, it is initialized as a local file path. This leads to a later failure 
 to create root on HDFS.



[jira] [Commented] (HADOOP-9421) Convert SASL to use ProtoBuf and add lengths for non-blocking processing

2013-06-20 Thread Luke Lu (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-9421?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13689451#comment-13689451
 ] 

Luke Lu commented on HADOOP-9421:
-

Specifically:

bq. Re-introduces the roundtrip I removed for tokens and usable by other auths 
in the future

Not true. The initiate proto wrapped in rpc header is part of the connection 
header packet even though they're logically separate messages.

bq. Appears to add yet another roundtrip for non-token auths

Not true. See above.

bq. Completely removes the ability for the client to choose the best or most 
preferred auth

Not true. In fact, the client initiate proto allows future auths without 
introducing a new round-trip.

bq. Ruins pluggable auths because the client now requires specific logic to 
guess if it can do the new auth

Not true. initiate allows but does not require client-specific logic, and is 
hence more extensible.

bq. Prevents elimination of token use_ip

Not applicable to any known token mechs: Digest-MD5 or SCRAM, as the former is 
always server initiated and the latter doesn't care.

bq. Prevents supporting tokens for multi-interface, multi-A record, or CNAMEs

Not true. Token auth doesn't care; see above. And the initiate proto is 
extensible for all kinds of auth metadata.

bq. Breaks my ability to add IP failover support

Not true. IP failover works with tokens as is and for Kerberos if server 
principal is shared among the servers for the same logical server. Can be 
extended to support insane cross server principal failover, while maintaining 
minimum round-trips in normal cases.

bq. Allows clients to do complete DOS attacks by tying up the socket 
indefinitely with initiates

Clients can already do the same by keeping RPC connections open indefinitely. A 
DoS is only significant if it costs the client fewer resources than it costs 
the server, which is not the case here.

In summary, your patch changes the major flow of the current RPC with a new 
negotiate round-trip, except for a round-trip reduction hack for Digest-MD5 
tokens, since it disallows the client from sending any new auth metadata in the 
first packet. My patch is actually a (conceptually) small change to extend the 
capability to send arbitrary auth metadata in the first packet and allows the 
server to intelligently respond with either challenge or negotiate, which 
enables round-trip optimization for all future auths besides Digest-MD5 tokens.
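For context on the initiation question being argued here, the stock 
javax.security.sasl API already models this property: 
SaslClient.hasInitialResponse() tells the caller whether the mechanism can 
produce a first token before any server challenge arrives, which is what lets 
an initiate message ride along with the first packet. A minimal illustration 
(PLAIN is chosen purely as an example mechanism, and all names are invented; 
this is not code from either patch):

```java
import java.util.HashMap;
import javax.security.auth.callback.Callback;
import javax.security.auth.callback.NameCallback;
import javax.security.auth.callback.PasswordCallback;
import javax.security.sasl.Sasl;
import javax.security.sasl.SaslClient;

// Demonstrates SASL client initiation: a mechanism with an initial response
// can produce its first token with no server challenge. PLAIN is used only
// as an example mechanism; credentials here are placeholders.
public class SaslInitiateDemo {
  /** Builds a client for the example mechanism and returns its first token. */
  static byte[] firstToken() throws Exception {
    SaslClient client = Sasl.createSaslClient(
        new String[] {"PLAIN"}, null, "rpc", "localhost",
        new HashMap<String, Object>(),
        callbacks -> {
          for (Callback cb : callbacks) {
            if (cb instanceof NameCallback) {
              ((NameCallback) cb).setName("user");
            } else if (cb instanceof PasswordCallback) {
              ((PasswordCallback) cb).setPassword("secret".toCharArray());
            }
          }
        });
    // evaluateChallenge with an empty challenge yields the first SASL token,
    // which a protocol may piggyback on its connection header.
    return client.hasInitialResponse()
        ? client.evaluateChallenge(new byte[0])
        : null;
  }

  public static void main(String[] args) throws Exception {
    System.out.println("initial token bytes: " + firstToken().length);
  }
}
```

A server-initiated mechanism like DIGEST-MD5 returns false from 
hasInitialResponse(), which is why the token fast path is being debated 
separately from mechanisms such as PLAIN, SCRAM, or GSSAPI.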



 Convert SASL to use ProtoBuf and add lengths for non-blocking processing
 

 Key: HADOOP-9421
 URL: https://issues.apache.org/jira/browse/HADOOP-9421
 Project: Hadoop Common
  Issue Type: Sub-task
Affects Versions: 2.0.3-alpha
Reporter: Sanjay Radia
Assignee: Daryn Sharp
Priority: Blocker
 Attachments: HADOOP-9421.patch, HADOOP-9421.patch, HADOOP-9421.patch, 
 HADOOP-9421.patch, HADOOP-9421.patch, HADOOP-9421.patch, HADOOP-9421.patch, 
 HADOOP-9421.patch, HADOOP-9421-v2-demo.patch






[jira] [Updated] (HADOOP-9656) Gridmix unit tests fail on Windows and Linux

2013-06-20 Thread Chuan Liu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-9656?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chuan Liu updated HADOOP-9656:
--

Status: Patch Available  (was: Open)

 Gridmix unit tests fail on Windows and Linux
 

 Key: HADOOP-9656
 URL: https://issues.apache.org/jira/browse/HADOOP-9656
 Project: Hadoop Common
  Issue Type: Bug
Affects Versions: 3.0.0, 2.1.0-beta
Reporter: Chuan Liu
Assignee: Chuan Liu
Priority: Minor
 Attachments: HADOOP-9656-trunk.patch, HADOOP-9656-trunk.patch


 The following three Gridmix unit tests fail on both Windows and Linux:
 *TestGridmixSubmission
 *TestLoadJob
 *TestSleepJob
 One common cause of failure for both Windows and Linux is that -1 was passed 
 to {{scaleConfigParameter()}} as the default per-task memory request in 
 {{GridmixJob.configureHighRamProperties()}} method.
 In addition to the memory setting issue, Windows also has a path issue. In the 
 {{CommonJobTest.doSubmission()}} method, the root path is an HDFS path. 
 However, it is initialized as a local file path. This leads to a later failure 
 to create root on HDFS.



[jira] [Commented] (HADOOP-9656) Gridmix unit tests fail on Windows and Linux

2013-06-20 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-9656?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13689483#comment-13689483
 ] 

Hadoop QA commented on HADOOP-9656:
---

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  
http://issues.apache.org/jira/secure/attachment/12588871/HADOOP-9656-trunk.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 1 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:red}-1 core tests{color}.  The patch failed these unit tests in 
hadoop-tools/hadoop-gridmix:

  org.apache.hadoop.mapred.gridmix.TestDistCacheEmulation

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-HADOOP-Build/2685//testReport/
Console output: 
https://builds.apache.org/job/PreCommit-HADOOP-Build/2685//console

This message is automatically generated.

 Gridmix unit tests fail on Windows and Linux
 

 Key: HADOOP-9656
 URL: https://issues.apache.org/jira/browse/HADOOP-9656
 Project: Hadoop Common
  Issue Type: Bug
Affects Versions: 3.0.0, 2.1.0-beta
Reporter: Chuan Liu
Assignee: Chuan Liu
Priority: Minor
 Attachments: HADOOP-9656-trunk.patch, HADOOP-9656-trunk.patch


 The following three Gridmix unit tests fail on both Windows and Linux:
 *TestGridmixSubmission
 *TestLoadJob
 *TestSleepJob
 One common cause of failure for both Windows and Linux is that -1 was passed 
 to {{scaleConfigParameter()}} as the default per-task memory request in 
 {{GridmixJob.configureHighRamProperties()}} method.
 In addition to the memory setting issue, Windows also has a path issue. In the 
 {{CommonJobTest.doSubmission()}} method, the root path is an HDFS path. 
 However, it is initialized as a local file path. This leads to a later failure 
 to create root on HDFS.



[jira] [Commented] (HADOOP-8470) Implementation of 4-layer subclass of NetworkTopology (NetworkTopologyWithNodeGroup)

2013-06-20 Thread Suresh Srinivas (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-8470?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13689485#comment-13689485
 ] 

Suresh Srinivas commented on HADOOP-8470:
-

[~szetszwo] Nicholas can you please merge this into branch-2.1.0-beta as well?

 Implementation of 4-layer subclass of NetworkTopology 
 (NetworkTopologyWithNodeGroup)
 

 Key: HADOOP-8470
 URL: https://issues.apache.org/jira/browse/HADOOP-8470
 Project: Hadoop Common
  Issue Type: Sub-task
Affects Versions: 1.0.0, 2.0.0-alpha
Reporter: Junping Du
Assignee: Junping Du
 Fix For: 1.2.0, 2.1.0-beta

 Attachments: HADOOP-8470-branch-2.patch, 
 HADOOP-8470-NetworkTopology-new-impl.patch, 
 HADOOP-8470-NetworkTopology-new-impl-v2.patch, 
 HADOOP-8470-NetworkTopology-new-impl-v3.patch, 
 HADOOP-8470-NetworkTopology-new-impl-v4.patch


 To support the four-layer hierarchical topology shown in the attached figure, 
 NetworkTopologyWithNodeGroup was developed as a subclass of NetworkTopology, 
 along with unit tests. In NetworkTopologyWithNodeGroup, overriding the methods 
 add, remove, and pseudoSortByDistance was most relevant to supporting the 
 four-layer topology. The method pseudoSortByDistance selects the nodes to use 
 for reading data and sorts them in the sequence node-local, nodegroup-local, 
 rack-local, rack-off. Another slight change to pseudoSortByDistance is to 
 support cases where the data node and node manager are separated: if the 
 reader cannot be found in the NetworkTopology tree (formed by data nodes 
 only), then it will try to sort according to the reader's sibling node in the 
 tree.
 The distance calculation changes the weights from 0 (local), 2 (rack-local), 
 4 (rack-off) to: 0 (local), 2 (nodegroup-local), 4 (rack-local), 6 (rack-off).
 The additional node group layer should be specified in the topology script or 
 table mapping, e.g. input 10.1.1.1, output: /rack1/nodegroup1
 A subclass of InnerNode, InnerNodeWithNodeGroup, was also needed to support 
 NetworkTopologyWithNodeGroup.
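 The four-layer weighting described above can be sketched as follows. This is 
 a hypothetical stand-alone illustration (class, method, and location names 
 invented), not the actual NetworkTopologyWithNodeGroup code:

```java
// Hypothetical illustration of the four-layer distance weighting described
// above: 0 (local), 2 (nodegroup-local), 4 (rack-local), 6 (rack-off).
// Not the actual NetworkTopologyWithNodeGroup implementation.
public class NodeGroupDistanceDemo {
  /**
   * Network locations look like /rack/nodegroup/host, e.g. the topology
   * script maps 10.1.1.1 to /rack1/nodegroup1 and the host is appended.
   */
  static int weight(String a, String b) {
    if (a.equals(b)) {
      return 0;                       // same host: local
    }
    String[] pa = a.split("/");       // ["", rack, nodegroup, host]
    String[] pb = b.split("/");
    if (pa[1].equals(pb[1])) {
      if (pa[2].equals(pb[2])) {
        return 2;                     // same node group
      }
      return 4;                       // same rack, different node group
    }
    return 6;                         // different rack
  }

  public static void main(String[] args) {
    System.out.println(weight("/rack1/nodegroup1/h1", "/rack1/nodegroup1/h2"));
  }
}
```

 Sorting reader candidates by this weight reproduces the node-local, 
 nodegroup-local, rack-local, rack-off ordering described in the issue.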



[jira] [Commented] (HADOOP-9421) Convert SASL to use ProtoBuf and add lengths for non-blocking processing

2013-06-20 Thread Daryn Sharp (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-9421?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13689490#comment-13689490
 ] 

Daryn Sharp commented on HADOOP-9421:
-

I'm referring to the roundtrip your patch introduces by responding with 
negotiate if it's a non-token auth.

The client can't choose the best auth, or even know the supported auths, if it 
has already guessed prior to connection.  How will the client know whether the 
server does DIGEST-MD5 or SCRAM for tokens?  It won't work in a mixed 
environment.

Eliminating use_ip is not related to the mech.  A server hint is for the token 
selection itself instead of the fragile way tokens are currently selected.  
Tokens are completely sensitive to multi-interface hosts, and different 
hostnames for the same machine.

IP failover with a shared principal isn't an option, at least for us.  A shared 
principal prevents direct communication with the HA NNs because the client will 
use the actual host's principal, not the shared principal.  Which also means 
DNs can't heartbeat into both NNs w/o hardcoding in the config, which may be 
problematic for federation + HA.

The roundtrip reduction hack is a feature that can be extended to any sasl 
mechanism that can initiate.

The point you keep missing is +the client can't guess an auth method+ but you 
keep focusing on retaining that behavior.  We need to resolve this with the 
offline call today.

 Convert SASL to use ProtoBuf and add lengths for non-blocking processing
 

 Key: HADOOP-9421
 URL: https://issues.apache.org/jira/browse/HADOOP-9421
 Project: Hadoop Common
  Issue Type: Sub-task
Affects Versions: 2.0.3-alpha
Reporter: Sanjay Radia
Assignee: Daryn Sharp
Priority: Blocker
 Attachments: HADOOP-9421.patch, HADOOP-9421.patch, HADOOP-9421.patch, 
 HADOOP-9421.patch, HADOOP-9421.patch, HADOOP-9421.patch, HADOOP-9421.patch, 
 HADOOP-9421.patch, HADOOP-9421-v2-demo.patch






[jira] [Updated] (HADOOP-8029) org.apache.hadoop.io.nativeio.NativeIO.posixFadviseIfPossible does not handle EINVAL

2013-06-20 Thread Colin Patrick McCabe (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-8029?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Colin Patrick McCabe updated HADOOP-8029:
-

Attachment: HADOOP-8029-b1.003.patch

* disable fadvise after it fails
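A minimal sketch of what "disable fadvise after it fails" could look like. This is an illustration only, with an assumed Fadvise interface standing in for NativeIO's JNI call; the attached patch may be shaped differently.

```java
import java.util.concurrent.atomic.AtomicBoolean;

// Hedged sketch of the "disable fadvise after it fails" idea: remember the
// first EINVAL (e.g. from tmpfs, which rejects posix_fadvise) and make later
// calls no-ops instead of failing every map-output transfer. The Fadvise
// interface is a stand-in for NativeIO's JNI method, not the real API.
public class FadviseGuard {
    static final int EINVAL = 22;
    static final AtomicBoolean fadviseBroken = new AtomicBoolean(false);

    interface Fadvise { int advise(int fd, long off, long len, int advice); }

    /** Returns 0 on success or suppressed failure; other errnos pass through. */
    static int posixFadviseIfPossible(Fadvise impl, int fd, long off, long len, int advice) {
        if (fadviseBroken.get()) {
            return 0;                    // previously failed: skip the native call
        }
        int rc = impl.advise(fd, off, len, advice);
        if (rc == EINVAL) {
            fadviseBroken.set(true);     // filesystem doesn't support fadvise
            return 0;                    // swallow rather than fail the job
        }
        return rc;
    }
}
```

After the first EINVAL every subsequent call becomes a cheap flag check, so a tmpfs-backed task directory no longer kills map-output serving.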

 org.apache.hadoop.io.nativeio.NativeIO.posixFadviseIfPossible does not handle 
 EINVAL
 

 Key: HADOOP-8029
 URL: https://issues.apache.org/jira/browse/HADOOP-8029
 Project: Hadoop Common
  Issue Type: Bug
  Components: native
Affects Versions: 0.20.205.0
 Environment: Debian Wheezy 64-bit 
 uname -a = Linux desktop 3.1.0-1-amd64 #1 SMP Tue Jan 10 05:01:58 UTC 2012 
 x86_64 GNU/Linux 
 cat /etc/issue = Debian GNU/Linux wheezy/sid \n \l 
 /etc/apt/sources.list =  
 deb http://ftp.us.debian.org/debian/ wheezy main contrib non-free 
 deb-src http://ftp.us.debian.org/debian/ wheezy main contrib non-free 
 deb http://security.debian.org/ wheezy/updates main contrib non-free 
 deb-src http://security.debian.org/ wheezy/updates main contrib non-free 
 deb http://archive.cloudera.com/debian squeeze-cdh3 contrib 
 deb-src http://archive.cloudera.com/debian squeeze-cdh3 contrib 
 Hadoop specific configuration (disabled permissions, pseudo-distributed mode, 
 replication set to 1, from my own blog post here: http://j.mp/tsVBR4
Reporter: Tim Mattison
 Attachments: HADOOP-8029.001.patch, HADOOP-8029-b1.003.patch

   Original Estimate: 4h
  Remaining Estimate: 4h

 When Hadoop's directories reside on tmpfs in Debian Wheezy (and possibly all 
 Linux 3.1 distros) in an installation that is using the native libraries 
 fadvise returns EINVAL when trying to run a MapReduce job.  Since EINVAL 
 isn't handled, all MapReduce jobs report "Map output lost, rescheduling: 
 getMapOutput".
 A full stack trace for this issue looks like this:
 [exec] 12/02/03 09:50:58 INFO mapred.JobClient: Task Id : 
 attempt_201202030949_0001_m_00_0, Status : FAILED
 [exec] Map output lost, rescheduling: 
 getMapOutput(attempt_201202030949_0001_m_00_0,0) failed :
 [exec] EINVAL: Invalid argument
 [exec] at org.apache.hadoop.io.nativeio.NativeIO.posix_fadvise(Native Method)
 [exec] at 
 org.apache.hadoop.io.nativeio.NativeIO.posixFadviseIfPossible(NativeIO.java:177)
 [exec] at 
 org.apache.hadoop.mapred.TaskTracker$MapOutputServlet.doGet(TaskTracker.java:4026)
 [exec] at javax.servlet.http.HttpServlet.service(HttpServlet.java:707)
 [exec] at javax.servlet.http.HttpServlet.service(HttpServlet.java:820)
 [exec] at 
 org.mortbay.jetty.servlet.ServletHolder.handle(ServletHolder.java:511)
 [exec] at 
 org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1221)
 [exec] at 
 org.apache.hadoop.http.HttpServer$QuotingInputFilter.doFilter(HttpServer.java:829)
 [exec] at 
 org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1212)
 [exec] at 
 org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:399)
 [exec] at 
 org.mortbay.jetty.security.SecurityHandler.handle(SecurityHandler.java:216)
 Some logic will need to be implemented to handle EINVAL to properly support 
 all file systems.



[jira] [Updated] (HADOOP-9640) RPC Congestion Control

2013-06-20 Thread Xiaobo Peng (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-9640?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiaobo Peng updated HADOOP-9640:


Attachment: (was: rpc-congestion-control-design-draft.pdf)

 RPC Congestion Control
 --

 Key: HADOOP-9640
 URL: https://issues.apache.org/jira/browse/HADOOP-9640
 Project: Hadoop Common
  Issue Type: Improvement
Reporter: Xiaobo Peng
 Attachments: rpc-congestion-control-draft-plan.pdf


 Several production Hadoop cluster incidents occurred where the Namenode was 
 overloaded and failed to be responsive.  This task is to improve the system 
 to detect RPC congestion early, and to provide good diagnostic information 
 for alerts that identify suspicious jobs/users so as to restore services 
 quickly.
 Excerpted from the communication of one incident, “The map task of a user was 
 creating huge number of small files in the user directory. Due to the heavy 
 load on NN, the JT also was unable to communicate with NN...The cluster 
 became responsive only once the job was killed.”
 Excerpted from the communication of another incident, “Namenode was 
 overloaded by GetBlockLocation requests (Correction: should be getFileInfo 
 requests. the job had a bug that called getFileInfo for a nonexistent file in 
 an endless loop). All other requests to namenode were also affected by this 
 and hence all jobs slowed down. Cluster almost came to a grinding 
 halt…Eventually killed jobtracker to kill all jobs that are running.”
 Excerpted from HDFS-945, “We've seen defective applications cause havoc on 
 the NameNode, for e.g. by doing 100k+ 'listStatus' on very large directories 
 (60k files) etc.”
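 A hedged sketch of the kind of diagnostic the description asks for: a per-user
 RPC counter whose "top talker" can be surfaced in an alert when the call queue
 backs up. This only illustrates the idea; it is not the design in the attached
 draft plan, and all names are invented.

```java
import java.util.Comparator;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.LongAdder;

// Illustrative per-user RPC counter; NOT the design in the attached plan.
public class RpcTopTalkers {
    private final Map<String, LongAdder> counts = new ConcurrentHashMap<>();

    /** Record one RPC from the given user (cheap enough for the handler path). */
    void record(String user) {
        counts.computeIfAbsent(user, u -> new LongAdder()).increment();
    }

    /** User with the most recorded calls, or null if nothing was recorded. */
    String topTalker() {
        return counts.entrySet().stream()
                .max(Comparator.comparingLong(
                        (Map.Entry<String, LongAdder> e) -> e.getValue().sum()))
                .map(Map.Entry::getKey)
                .orElse(null);
    }
}
```

 In the getFileInfo-loop incident quoted above, such a counter would have
 named the offending user long before the cluster ground to a halt.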



[jira] [Updated] (HADOOP-9640) RPC Congestion Control

2013-06-20 Thread Xiaobo Peng (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-9640?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiaobo Peng updated HADOOP-9640:


Attachment: rpc-congestion-control-draft-plan.pdf

 RPC Congestion Control
 --

 Key: HADOOP-9640
 URL: https://issues.apache.org/jira/browse/HADOOP-9640
 Project: Hadoop Common
  Issue Type: Improvement
Reporter: Xiaobo Peng
 Attachments: rpc-congestion-control-draft-plan.pdf


 Several production Hadoop cluster incidents occurred where the Namenode was 
 overloaded and failed to be responsive.  This task is to improve the system 
 to detect RPC congestion early, and to provide good diagnostic information 
 for alerts that identify suspicious jobs/users so as to restore services 
 quickly.
 Excerpted from the communication of one incident, “The map task of a user was 
 creating huge number of small files in the user directory. Due to the heavy 
 load on NN, the JT also was unable to communicate with NN...The cluster 
 became responsive only once the job was killed.”
 Excerpted from the communication of another incident, “Namenode was 
 overloaded by GetBlockLocation requests (Correction: should be getFileInfo 
 requests. the job had a bug that called getFileInfo for a nonexistent file in 
 an endless loop). All other requests to namenode were also affected by this 
 and hence all jobs slowed down. Cluster almost came to a grinding 
 halt…Eventually killed jobtracker to kill all jobs that are running.”
 Excerpted from HDFS-945, “We've seen defective applications cause havoc on 
 the NameNode, for e.g. by doing 100k+ 'listStatus' on very large directories 
 (60k files) etc.”



[jira] [Commented] (HADOOP-9585) unit test failure :org.apache.hadoop.fs.TestFsShellReturnCode.testChgrp

2013-06-20 Thread Leo Leung (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-9585?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13689602#comment-13689602
 ] 

Leo Leung commented on HADOOP-9585:
---

The current test case assumes jenkins (or the test runner) is part of the admin 
group.
Suggest changing the test case to add 
1) a positive test scenario that includes the user's own groups
2) and a negative scenario that uses a non-existing (invalid) group.

something like:
+String[] groups = UserGroupInformation.getCurrentUser().getGroupNames();
-String argv[] = { "-chgrp", "admin", f1 };
-verify(fs, "-chgrp", argv, 1, fsShell, 0);
+for (String member : groups)
+{
+  String argv[] = { "-chgrp", member, f1 };
+  verify(fs, "-chgrp", argv, 1, fsShell, 0);
+}
-// Test 2: exit code for chgrp on non existing path is 1
+// Test 2: exit code for non-existing group on existing file
+String argv1[] = { "-chgrp", "groupdoesnotexist", f1 };
+verify(fs, "-chgrp", argv1, 1, fsShell, 1);




 unit test failure :org.apache.hadoop.fs.TestFsShellReturnCode.testChgrp
 ---

 Key: HADOOP-9585
 URL: https://issues.apache.org/jira/browse/HADOOP-9585
 Project: Hadoop Common
  Issue Type: Bug
  Components: test
Affects Versions: 1.3.0
 Environment: 
 https://builds.apache.org/job/Hadoop-branch1/lastCompletedBuild/testReport/org.apache.hadoop.fs/TestFsShellReturnCode/testChgrp/
Reporter: Giridharan Kesavan

 Standard Error
 chmod: could not get status for 
 '/home/jenkins/jenkins-slave/workspace/Hadoop-branch1/trunk/build/test/data/testChmod/fileDoesNotExist':
  File 
 /home/jenkins/jenkins-slave/workspace/Hadoop-branch1/trunk/build/test/data/testChmod/fileDoesNotExist
  does not exist.
 chmod: could not get status for 
 '/home/jenkins/jenkins-slave/workspace/Hadoop-branch1/trunk/build/test/data/testChmod/nonExistingfiles*'
 chown: could not get status for 
 '/home/jenkins/jenkins-slave/workspace/Hadoop-branch1/trunk/build/test/data/testChown/fileDoesNotExist':
  File 
 /home/jenkins/jenkins-slave/workspace/Hadoop-branch1/trunk/build/test/data/testChown/fileDoesNotExist
  does not exist.
 chown: could not get status for 
 '/home/jenkins/jenkins-slave/workspace/Hadoop-branch1/trunk/build/test/data/testChown/nonExistingfiles*'
 chgrp: failed on 
 'file:/home/jenkins/jenkins-slave/workspace/Hadoop-branch1/trunk/build/test/data/testChgrp/fileExists':
  chgrp: changing group of 
 `/home/jenkins/jenkins-slave/workspace/Hadoop-branch1/trunk/build/test/data/testChgrp/fileExists':
  Operation not permitted



[jira] [Commented] (HADOOP-9421) Convert SASL to use ProtoBuf and add lengths for non-blocking processing

2013-06-20 Thread Luke Lu (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-9421?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13689619#comment-13689619
 ] 

Luke Lu commented on HADOOP-9421:
-

bq. I'm referring to the roundtrip your patch introduces by responding with 
negotiate if it's a non-token auth.

The roundtrip is mandatory for non-token auth in your patch. It's optional in 
mine.

bq. The client can't choose the best auth

Sometimes it's the only choice, e.g. delegation tokens for tasks, which cover 
99% of the auth workload in typical clusters.

bq. The roundtrip reduction hack is a feature that can be extended to any 
sasl mechanism that can initiate.

No, it cannot reduce the round trip for SCRAM or Kerberos, while my patch can.

bq. The point you keep missing is the client can't guess an auth method but you 
keep focusing on retaining that behavior.

The point you keep missing is that the client-initiate proto is free and 
optional from a network round-trip point of view, which opens the door for future 
optimization for SCRAM etc.

 Convert SASL to use ProtoBuf and add lengths for non-blocking processing
 

 Key: HADOOP-9421
 URL: https://issues.apache.org/jira/browse/HADOOP-9421
 Project: Hadoop Common
  Issue Type: Sub-task
Affects Versions: 2.0.3-alpha
Reporter: Sanjay Radia
Assignee: Daryn Sharp
Priority: Blocker
 Attachments: HADOOP-9421.patch, HADOOP-9421.patch, HADOOP-9421.patch, 
 HADOOP-9421.patch, HADOOP-9421.patch, HADOOP-9421.patch, HADOOP-9421.patch, 
 HADOOP-9421.patch, HADOOP-9421-v2-demo.patch






[jira] [Updated] (HADOOP-7140) IPC Reader threads do not stop when server stops

2013-06-20 Thread Ivan Mitic (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-7140?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ivan Mitic updated HADOOP-7140:
---

Affects Version/s: 1.3.0
   1-win

Looks like we merged HADOOP-6713 to branch-1 but missed this Jira. I ran into 
an issue where the TT didn't shut itself down because of running IPC reader 
threads. 

Reopening for backport to branch-1 and branch-1-win.
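The failure mode above can be sketched minimally: a reader thread blocked in I/O never re-checks a stop flag unless it is also interrupted, so the daemon cannot exit. This is only an illustration of the fix's shape (sleep stands in for select()); the real patch touches the Reader threads in org.apache.hadoop.ipc.Server.

```java
// Minimal sketch: a reader loop that actually observes shutdown. Without the
// interrupt, a reader blocked in select()/accept() (simulated here by sleep)
// never re-checks the flag, so the daemon cannot shut itself down.
public class StoppableReader extends Thread {
    private volatile boolean running = true;

    @Override public void run() {
        while (running) {
            try {
                Thread.sleep(10_000);    // stand-in for readSelector.select()
            } catch (InterruptedException e) {
                // interrupted by shutdown(): loop re-checks 'running' and exits
            }
        }
    }

    public void shutdown() throws InterruptedException {
        running = false;
        interrupt();                     // wake the blocked reader
        join(5_000);                     // bounded wait for a clean exit
    }
}
```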

 IPC Reader threads do not stop when server stops
 

 Key: HADOOP-7140
 URL: https://issues.apache.org/jira/browse/HADOOP-7140
 Project: Hadoop Common
  Issue Type: Bug
Affects Versions: 0.22.0, 1-win, 1.3.0
Reporter: Todd Lipcon
Assignee: Todd Lipcon
Priority: Critical
 Fix For: 0.22.0

 Attachments: hadoop-7140.txt, hadoop-7140.txt


 After HADOOP-6713, the new IPC Reader threads are not properly stopped when 
 the server shuts down. One repercussion of this is that conditions that are 
 supposed to shut down a daemon no longer work (eg the TT doesn't shut itself 
 down if it detects an incompatible build version)



[jira] [Updated] (HADOOP-7140) IPC Reader threads do not stop when server stops

2013-06-20 Thread Ivan Mitic (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-7140?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ivan Mitic updated HADOOP-7140:
---

Attachment: HADOOP-7140.branch-1-win.patch
HADOOP-7140.branch-1.patch

Attaching the branch-1 and branch-1-win compatible patches. 

 IPC Reader threads do not stop when server stops
 

 Key: HADOOP-7140
 URL: https://issues.apache.org/jira/browse/HADOOP-7140
 Project: Hadoop Common
  Issue Type: Bug
Affects Versions: 0.22.0, 1-win, 1.3.0
Reporter: Todd Lipcon
Assignee: Todd Lipcon
Priority: Critical
 Fix For: 0.22.0

 Attachments: HADOOP-7140.branch-1.patch, 
 HADOOP-7140.branch-1-win.patch, hadoop-7140.txt, hadoop-7140.txt


 After HADOOP-6713, the new IPC Reader threads are not properly stopped when 
 the server shuts down. One repercussion of this is that conditions that are 
 supposed to shut down a daemon no longer work (eg the TT doesn't shut itself 
 down if it detects an incompatible build version)



[jira] [Commented] (HADOOP-9421) Convert SASL to use ProtoBuf and add lengths for non-blocking processing

2013-06-20 Thread Sanjay Radia (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-9421?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13689739#comment-13689739
 ] 

Sanjay Radia commented on HADOOP-9421:
--

bq. The client can't choose the best auth

bq. Sometimes it's the only choice, e.g. delegation tokens for tasks, which cover 
99% of the auth workload in typical clusters.

That is correct. BTW, the client can guess the best default for the initial auth 
since it is the default authentication for the cluster (e.g. Kerberos or LDAP).

 Convert SASL to use ProtoBuf and add lengths for non-blocking processing
 

 Key: HADOOP-9421
 URL: https://issues.apache.org/jira/browse/HADOOP-9421
 Project: Hadoop Common
  Issue Type: Sub-task
Affects Versions: 2.0.3-alpha
Reporter: Sanjay Radia
Assignee: Daryn Sharp
Priority: Blocker
 Attachments: HADOOP-9421.patch, HADOOP-9421.patch, HADOOP-9421.patch, 
 HADOOP-9421.patch, HADOOP-9421.patch, HADOOP-9421.patch, HADOOP-9421.patch, 
 HADOOP-9421.patch, HADOOP-9421-v2-demo.patch






[jira] [Commented] (HADOOP-9421) Convert SASL to use ProtoBuf and add lengths for non-blocking processing

2013-06-20 Thread Sanjay Radia (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-9421?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13689759#comment-13689759
 ] 

Sanjay Radia commented on HADOOP-9421:
--

Luke, Daryn, can both of you please summarize the packet exchange for simple, 
kerberos, and hadoop-tokens for each of your approaches? Also the exchange when 
the server does not support auth-method X initiated by the client and the client 
is supposed to use another auth-method from a list. Let's first get agreement on 
the exchange.

 Convert SASL to use ProtoBuf and add lengths for non-blocking processing
 

 Key: HADOOP-9421
 URL: https://issues.apache.org/jira/browse/HADOOP-9421
 Project: Hadoop Common
  Issue Type: Sub-task
Affects Versions: 2.0.3-alpha
Reporter: Sanjay Radia
Assignee: Daryn Sharp
Priority: Blocker
 Attachments: HADOOP-9421.patch, HADOOP-9421.patch, HADOOP-9421.patch, 
 HADOOP-9421.patch, HADOOP-9421.patch, HADOOP-9421.patch, HADOOP-9421.patch, 
 HADOOP-9421.patch, HADOOP-9421-v2-demo.patch






[jira] [Commented] (HADOOP-9421) Convert SASL to use ProtoBuf and add lengths for non-blocking processing

2013-06-20 Thread Daryn Sharp (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-9421?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13689763#comment-13689763
 ] 

Daryn Sharp commented on HADOOP-9421:
-

bq. That is correct. BTW, the client can guess the best default for the initial 
auth since it is the default authentication for the cluster (e.g. Kerberos or LDAP).

Wrong.  You can't guess in a heterogeneous security environment for different 
services.  You can't even guess which mechanism a particular service will use 
for tokens.

How will you simultaneously support accessing both LDAP- and KERBEROS-secured 
services?  Or something pluggable like an SSO_TOKEN on one cluster and a normal 
TOKEN on another?  And either of them might be using DIGEST-MD5 or SCRAM or 
something else.

How will you guess w/o a slew of config options?

 Convert SASL to use ProtoBuf and add lengths for non-blocking processing
 

 Key: HADOOP-9421
 URL: https://issues.apache.org/jira/browse/HADOOP-9421
 Project: Hadoop Common
  Issue Type: Sub-task
Affects Versions: 2.0.3-alpha
Reporter: Sanjay Radia
Assignee: Daryn Sharp
Priority: Blocker
 Attachments: HADOOP-9421.patch, HADOOP-9421.patch, HADOOP-9421.patch, 
 HADOOP-9421.patch, HADOOP-9421.patch, HADOOP-9421.patch, HADOOP-9421.patch, 
 HADOOP-9421.patch, HADOOP-9421-v2-demo.patch






[jira] [Commented] (HADOOP-9421) Convert SASL to use ProtoBuf and add lengths for non-blocking processing

2013-06-20 Thread Larry McCay (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-9421?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13689774#comment-13689774
 ] 

Larry McCay commented on HADOOP-9421:
-

Just a thought...
I have recently found the use of plantuml very useful for these sorts of
discussions. Sequence diagrams may go a long way and they are refreshingly
easy with that tool.



 Convert SASL to use ProtoBuf and add lengths for non-blocking processing
 

 Key: HADOOP-9421
 URL: https://issues.apache.org/jira/browse/HADOOP-9421
 Project: Hadoop Common
  Issue Type: Sub-task
Affects Versions: 2.0.3-alpha
Reporter: Sanjay Radia
Assignee: Daryn Sharp
Priority: Blocker
 Attachments: HADOOP-9421.patch, HADOOP-9421.patch, HADOOP-9421.patch, 
 HADOOP-9421.patch, HADOOP-9421.patch, HADOOP-9421.patch, HADOOP-9421.patch, 
 HADOOP-9421.patch, HADOOP-9421-v2-demo.patch






[jira] [Created] (HADOOP-9660) [WINDOWS] Powershell / cmd parses -Dkey=value from command line as [-Dkey, value] which breaks GenericsOptionParser

2013-06-20 Thread Enis Soztutar (JIRA)
Enis Soztutar created HADOOP-9660:
-

 Summary: [WINDOWS] Powershell / cmd parses -Dkey=value from 
command line as [-Dkey, value] which breaks GenericsOptionParser
 Key: HADOOP-9660
 URL: https://issues.apache.org/jira/browse/HADOOP-9660
 Project: Hadoop Common
  Issue Type: Bug
  Components: scripts, util
Reporter: Enis Soztutar
Assignee: Enis Soztutar
 Fix For: 3.0.0, 2.1.0-beta


When parsing parameters to a class implementing Tool, and using ToolRunner, we 
can pass 
{code}
bin/hadoop tool_class -Dkey=value 
{code}
However, powershell parses the '=' sign itself, and sends it to java as 
[-Dkey, value], which breaks GenericOptionsParser. 

Using "-Dkey=value" or '-Dkey=value' does not fix the problem. The only 
workaround seems to be to trick PS by using: 
"'-Dkey=value'" (single + double quote)

In cmd, "-Dkey=value" works, but not "'-Dkey=value'". 

http://stackoverflow.com/questions/4940375/how-do-i-pass-an-equal-sign-when-calling-a-batch-script-in-powershell
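One defensive repair for the failure mode above can be sketched as follows: if the shell split "-Dkey=value" into ["-Dkey", "value"], glue it back together before GenericOptionsParser sees it. This only illustrates the symptom; the actual branch-1/branch-2 patches may fix it elsewhere (e.g. in the .cmd launch scripts), and the helper name is invented.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative re-join of args that cmd/PowerShell split on '='; NOT the
// actual patch. ["-Dkey", "value"] becomes ["-Dkey=value"] again.
public class DSplitRejoin {
    static List<String> rejoin(String[] args) {
        List<String> out = new ArrayList<>();
        for (int i = 0; i < args.length; i++) {
            // A bare "-Dkey" (no '=') followed by another token was one
            // "-Dkey=value" argument before the shell split it on '='.
            if (args[i].startsWith("-D") && args[i].length() > 2
                    && !args[i].contains("=") && i + 1 < args.length) {
                out.add(args[i] + "=" + args[++i]);
            } else {
                out.add(args[i]);
            }
        }
        return out;
    }
}
```

Note the `-D key=value` two-argument form (a bare "-D") is deliberately left alone, since GenericOptionsParser already accepts it.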



[jira] [Commented] (HADOOP-9660) [WINDOWS] Powershell / cmd parses -Dkey=value from command line as [-Dkey, value] which breaks GenericsOptionParser

2013-06-20 Thread Enis Soztutar (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-9660?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13689775#comment-13689775
 ] 

Enis Soztutar commented on HADOOP-9660:
---

Did some further testing with this:
Under powershell, command line args[] is sent as:
{code}
PS D:\hbase> bin/hbase org.apache.hadoop.hbase.TestCmd -D key=value
[-D, key, value]
PS D:\hbase> bin/hbase org.apache.hadoop.hbase.TestCmd -D "key=value"
[-D, key, value]
PS D:\hbase> bin/hbase org.apache.hadoop.hbase.TestCmd -D 'key=value'
[-D, key=value]
PS D:\hbase> bin/hbase org.apache.hadoop.hbase.TestCmd -D 'key=value'
[-D, 'key, value']
{code}

Under cmd:
{code}
D:\hbase>bin\hbase org.apache.hadoop.hbase.TestCmd -Dkey=value
[-Dkey, value]
D:\hbase>bin\hbase org.apache.hadoop.hbase.TestCmd -D key=value
[-D, key, value]
D:\hbase>bin\hbase org.apache.hadoop.hbase.TestCmd -D "key=value"
[-D, key=value]
D:\hbase>bin\hbase org.apache.hadoop.hbase.TestCmd -D "key=value"
[-D, key=value]
D:\hbase>bin\hbase org.apache.hadoop.hbase.TestCmd -D 'key=value'
[-D, 'key, value']
D:\hbase>bin\hbase org.apache.hadoop.hbase.TestCmd -D "'key=value'"
[-D, 'key=value']
{code}

Notice that quoting with " works for cmd, but not for powershell, and single + 
double quote works for powershell but not for cmd. 

 [WINDOWS] Powershell / cmd parses -Dkey=value from command line as [-Dkey, 
 value] which breaks GenericsOptionParser
 ---

 Key: HADOOP-9660
 URL: https://issues.apache.org/jira/browse/HADOOP-9660
 Project: Hadoop Common
  Issue Type: Bug
  Components: scripts, util
Reporter: Enis Soztutar
Assignee: Enis Soztutar
 Fix For: 3.0.0, 2.1.0-beta


 When parsing parameters to a class implementing Tool, and using ToolRunner, 
 we can pass 
 {code}
 bin/hadoop tool_class -Dkey=value 
 {code}
 However, powershell parses the '=' sign itself, and sends it to java as 
 [-Dkey, value], which breaks GenericOptionsParser. 
 Using "-Dkey=value" or '-Dkey=value' does not fix the problem. The only 
 workaround seems to be to trick PS by using: 
 "'-Dkey=value'" (single + double quote)
 In cmd, "-Dkey=value" works, but not "'-Dkey=value'". 
 http://stackoverflow.com/questions/4940375/how-do-i-pass-an-equal-sign-when-calling-a-batch-script-in-powershell



[jira] [Updated] (HADOOP-9660) [WINDOWS] Powershell / cmd parses -Dkey=value from command line as [-Dkey, value] which breaks GenericsOptionParser

2013-06-20 Thread Enis Soztutar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-9660?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Enis Soztutar updated HADOOP-9660:
--

Attachment: hadoop-9660-branch2_v1.patch
hadoop-9660-branch1_v1.patch

Attaching branch1 and branch2 patches. Branch 2 also applies to trunk. 

Tested the patches locally, and with 
{code}
bin/hadoop fs -Dkey=value -Dfs.default.name=somefs -ls /
{code}
Seems to work. 

 [WINDOWS] Powershell / cmd parses -Dkey=value from command line as [-Dkey, 
 value] which breaks GenericsOptionParser
 ---

 Key: HADOOP-9660
 URL: https://issues.apache.org/jira/browse/HADOOP-9660
 Project: Hadoop Common
  Issue Type: Bug
  Components: scripts, util
Reporter: Enis Soztutar
Assignee: Enis Soztutar
 Fix For: 3.0.0, 2.1.0-beta

 Attachments: hadoop-9660-branch1_v1.patch, 
 hadoop-9660-branch2_v1.patch


 When parsing parameters to a class implementing Tool, and using ToolRunner, 
 we can pass 
 {code}
 bin/hadoop tool_class -Dkey=value 
 {code}
 However, powershell parses the '=' sign itself, and sends it to java as 
 [-Dkey, value], which breaks GenericOptionsParser. 
 Using "-Dkey=value" or '-Dkey=value' does not fix the problem. The only 
 workaround seems to be to trick PS by using: 
 "'-Dkey=value'" (single + double quote)
 In cmd, "-Dkey=value" works, but not "'-Dkey=value'". 
 http://stackoverflow.com/questions/4940375/how-do-i-pass-an-equal-sign-when-calling-a-batch-script-in-powershell

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HADOOP-9660) [WINDOWS] Powershell / cmd parses -Dkey=value from command line as [-Dkey, value] which breaks GenericsOptionParser

2013-06-20 Thread Enis Soztutar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-9660?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Enis Soztutar updated HADOOP-9660:
--

Status: Patch Available  (was: Open)

 [WINDOWS] Powershell / cmd parses -Dkey=value from command line as [-Dkey, 
 value] which breaks GenericsOptionParser
 ---

 Key: HADOOP-9660
 URL: https://issues.apache.org/jira/browse/HADOOP-9660
 Project: Hadoop Common
  Issue Type: Bug
  Components: scripts, util
Reporter: Enis Soztutar
Assignee: Enis Soztutar
 Fix For: 3.0.0, 2.1.0-beta

 Attachments: hadoop-9660-branch1_v1.patch, 
 hadoop-9660-branch2_v1.patch


 When parsing parameters to a class implementing Tool, and using ToolRunner, 
 we can pass 
 {code}
 bin/hadoop tool_class -Dkey=value 
 {code}
 However, powershell parses the '=' sign itself, and sends it to java as 
 [-Dkey, value], which breaks GenericOptionsParser. 
 Using "-Dkey=value" or '-Dkey=value' does not fix the problem. The only 
 workaround seems to be to trick PS by using: 
 "'-Dkey=value'" (single + double quote)
 In cmd, "-Dkey=value" works, but not "'-Dkey=value'". 
 http://stackoverflow.com/questions/4940375/how-do-i-pass-an-equal-sign-when-calling-a-batch-script-in-powershell



[jira] [Commented] (HADOOP-9421) Convert SASL to use ProtoBuf and add lengths for non-blocking processing

2013-06-20 Thread Daryn Sharp (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-9421?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13689823#comment-13689823
 ] 

Daryn Sharp commented on HADOOP-9421:
-

+Simple to insecure+
{noformat}
C -> S connectHeader(SIMPLE)
C -> S connectionContext
C -> S RPC request
{noformat}

No change/overhead.

+Simple to secure+
{noformat}
C -> S connectHeader(SIMPLE)
C -> S connectionContext
C -> S RPC request
C <- S RPC exception SIMPLE not supported, close connection
{noformat}

No change/overhead.

Note: There's an existing race condition.  The server sent the exception in 
response to the connect header, but the client already blasted the context and 
request.

+SASL to insecure server+
{noformat}
C -> S connectHeader(SASL)
C <- S SUCCESS
C -> S connectionContext
C -> S RPC request
{noformat}

Immediate success message replaces the switch to simple.

+SASL to secure server+
{noformat}
C -> S connectHeader(SASL)
C <- S NEGOTIATE {
 [TOKEN, DIGEST-MD5, proto, realm, first-challenge-token]
 [KERBEROS, GSSAPI, user, host, null]
   }
C -> S INITIATE [ Auth from NEGOTIATE ] response-token
[... back and forth CHALLENGE/RESPONSE ...]
S -> C SUCCESS final-token
C -> S connectionContext
C -> S RPC request
{noformat}

The INITIATE for a token will use the first-challenge-token provided by the 
server to cut out a roundtrip.  Since the client cannot initiate, the existing 
code wastes a round trip soliciting the first DIGEST-MD5 challenge that I'm now 
immediately returning.

The INITIATE for kerberos cannot include a first challenge.  The GSSAPI 
mechanism cannot generate a challenge w/o an initial response from the client.  
This is fine and expected.

With this design, a follow on change can allow the client to pick the first 
auth type it supports instead of guessing in advance what it should try.  It 
will use the fields as provided, of course with safeguards to sanity check for 
malicious behavior.  Using the advertised fields is how the client will support 
IP failover.

Walking the advertised auths will allow completely decoupling the auth type 
code from the RPC client.  The auth types can be implemented as pluggable 
services that will be called on demand if the server requests that auth type.  
The pluggable services may use the protocol/serverId fields to decide if they 
have the credentials to even attempt the auth.

The client should never attempt an auth unless it knows the server supports the 
auth, and the client actually has the credentials to do the auth.

+Bad SASL client to secure+
{noformat}
C -> S connectHeader(SASL)
C <- S NEGOTIATE {
 [TOKEN, DIGEST-MD5, proto, realm, first-challenge-token]
 [KERBEROS, GSSAPI, user, host, null]
   }
C -> S INITIATE [ INVALID ] response-token
C <- S RPC Exception INVALID not supported, connection closed
{noformat}

Well, you were supposed to reply with one of the advertised methods, not make a 
blind guess...  Sorry, game over.
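The rejection above is just a membership check against the advertised set. A sketch of the idea (illustrative only; class and method names are made up, not the patch's server code):

```java
import java.util.Set;

public class InitiateCheck {
    // Reject an INITIATE whose auth was not in the NEGOTIATE we sent.
    static void validateInitiate(Set<String> advertised, String requested) {
        if (!advertised.contains(requested)) {
            // Game over: fail the connection rather than negotiate further.
            throw new IllegalArgumentException(requested + " not supported");
        }
    }

    public static void main(String[] args) {
        Set<String> advertised = Set.of("DIGEST-MD5", "GSSAPI");
        validateInitiate(advertised, "DIGEST-MD5"); // fine, was advertised
        try {
            validateInitiate(advertised, "INVALID");
        } catch (IllegalArgumentException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```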


 Convert SASL to use ProtoBuf and add lengths for non-blocking processing
 

 Key: HADOOP-9421
 URL: https://issues.apache.org/jira/browse/HADOOP-9421
 Project: Hadoop Common
  Issue Type: Sub-task
Affects Versions: 2.0.3-alpha
Reporter: Sanjay Radia
Assignee: Daryn Sharp
Priority: Blocker
 Attachments: HADOOP-9421.patch, HADOOP-9421.patch, HADOOP-9421.patch, 
 HADOOP-9421.patch, HADOOP-9421.patch, HADOOP-9421.patch, HADOOP-9421.patch, 
 HADOOP-9421.patch, HADOOP-9421-v2-demo.patch




--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HADOOP-9656) Gridmix unit tests fail on Windows and Linux

2013-06-20 Thread Chuan Liu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-9656?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chuan Liu updated HADOOP-9656:
--

Attachment: HADOOP-9656-trunk.2.patch

Attaching a new patch to include the fix for TestDistCacheEmulation.

 Gridmix unit tests fail on Windows and Linux
 

 Key: HADOOP-9656
 URL: https://issues.apache.org/jira/browse/HADOOP-9656
 Project: Hadoop Common
  Issue Type: Bug
Affects Versions: 3.0.0, 2.1.0-beta
Reporter: Chuan Liu
Assignee: Chuan Liu
Priority: Minor
 Attachments: HADOOP-9656-trunk.2.patch, HADOOP-9656-trunk.patch, 
 HADOOP-9656-trunk.patch


 The following Gridmix unit tests fail on both Windows and Linux:
 * TestGridmixSubmission
 * TestLoadJob
 * TestSleepJob
 * TestDistCacheEmulation
 For the first three unit tests, one common cause of failure on both Windows 
 and Linux is that -1 was passed to {{scaleConfigParameter()}} as the default 
 per-task memory request in the {{GridmixJob.configureHighRamProperties()}} method.
 In addition to the memory setting issue, Windows also has a path issue. In 
 the {{CommonJobTest.doSubmission()}} method, the root path is an HDFS path. 
 However, it is initialized as a local file path. This leads to a later failure 
 to create root on HDFS.
 For TestDistCacheEmulation, the test asserts a different permission (0644) 
 than the permission that the GridmixGenerateDistCacheData job sets on those files.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HADOOP-9656) Gridmix unit tests fail on Windows and Linux

2013-06-20 Thread Chuan Liu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-9656?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chuan Liu updated HADOOP-9656:
--

Description: 
The following Gridmix unit tests fail on both Windows and Linux:

* TestGridmixSubmission
* TestLoadJob
* TestSleepJob
* TestDistCacheEmulation

For the first three unit tests, one common cause of failure on both Windows 
and Linux is that -1 was passed to {{scaleConfigParameter()}} as the default 
per-task memory request in the {{GridmixJob.configureHighRamProperties()}} method.

In addition to the memory setting issue, Windows also has a path issue. In the 
{{CommonJobTest.doSubmission()}} method, the root path is an HDFS path. However, 
it is initialized as a local file path. This leads to a later failure to create 
root on HDFS.

For TestDistCacheEmulation, the test asserts a different permission (0644) than 
the permission that the GridmixGenerateDistCacheData job sets on those files.

  was:
The following three Gridmix unit tests fail on both Windows and Linux:

*TestGridmixSubmission
*TestLoadJob
*TestSleepJob

One common cause of failure for both Windows and Linux is that -1 was passed to 
{{scaleConfigParameter()}} as the default per-task memory request in 
{{GridmixJob.configureHighRamProperties()}} method.

In additional to the memory setting issue, Windows also have a path issue. In 
{{CommonJobTest.doSubmission()}} method, root path is an HDFS path. However, 
it is initialized as a local file path. This lead to later failure to create 
root on HDFS.
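The -1 failure mode described above can be sketched in a few lines: a scaling helper that blindly multiplies a memory request turns the "unset" sentinel -1 into a nonsense value instead of passing it through. This is an illustrative stand-in, not Gridmix's actual `scaleConfigParameter()` implementation:

```java
public class ScaleDemo {
    // Naive version: treats the sentinel like a real memory request.
    static long scaleNaive(long originalMb, double ratio) {
        return (long) (originalMb * ratio);
    }

    // Guarded version: negative means "not configured", so leave it alone.
    static long scaleWithGuard(long originalMb, double ratio) {
        return originalMb < 0 ? originalMb : (long) (originalMb * ratio);
    }

    public static void main(String[] args) {
        System.out.println(scaleNaive(-1, 2.0));     // -2: a bogus memory request
        System.out.println(scaleWithGuard(-1, 2.0)); // -1: still "unset"
    }
}
```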




[jira] [Commented] (HADOOP-9660) [WINDOWS] Powershell / cmd parses -Dkey=value from command line as [-Dkey, value] which breaks GenericsOptionParser

2013-06-20 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-9660?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13689837#comment-13689837
 ] 

Hadoop QA commented on HADOOP-9660:
---

{color:green}+1 overall{color}.  Here are the results of testing the latest 
attachment 
  
http://issues.apache.org/jira/secure/attachment/12588954/hadoop-9660-branch2_v1.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 1 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-common-project/hadoop-common.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-HADOOP-Build/2686//testReport/
Console output: 
https://builds.apache.org/job/PreCommit-HADOOP-Build/2686//console

This message is automatically generated.

 [WINDOWS] Powershell / cmd parses -Dkey=value from command line as [-Dkey, 
 value] which breaks GenericsOptionParser
 ---

 Key: HADOOP-9660
 URL: https://issues.apache.org/jira/browse/HADOOP-9660
 Project: Hadoop Common
  Issue Type: Bug
  Components: scripts, util
Reporter: Enis Soztutar
Assignee: Enis Soztutar
 Fix For: 3.0.0, 2.1.0-beta

 Attachments: hadoop-9660-branch1_v1.patch, 
 hadoop-9660-branch2_v1.patch


 When parsing parameters to a class implementing Tool, and using ToolRunner, 
 we can pass 
 {code}
 bin/hadoop tool_class -Dkey=value 
 {code}
 However, PowerShell parses the '=' sign itself, and sends it to java as 
 [-Dkey, value], which breaks GenericOptionsParser. 
 Using "-Dkey=value" or '-Dkey=value' does not fix the problem. The only 
 workaround seems to be to trick PS by using: 
 '"-Dkey=value"' (single + double quotes)
 In cmd, "-Dkey=value" works, but not '-Dkey=value'. 
 http://stackoverflow.com/questions/4940375/how-do-i-pass-an-equal-sign-when-calling-a-batch-script-in-powershell
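The effect of that split can be shown without Hadoop at all: once the shell hands the JVM two argv entries instead of one, the `key=value` pairing is gone before any parser runs. This is a minimal illustrative stand-in for a `-D` option parser, not GenericOptionsParser itself:

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class DParseDemo {
    // Collect -Dkey=value style properties from argv.
    static Map<String, String> parse(String[] argv) {
        Map<String, String> props = new LinkedHashMap<>();
        for (String arg : argv) {
            if (arg.startsWith("-D") && arg.contains("=")) {
                String kv = arg.substring(2);
                int eq = kv.indexOf('=');
                props.put(kv.substring(0, eq), kv.substring(eq + 1));
            }
        }
        return props;
    }

    public static void main(String[] args) {
        // What cmd passes: one argument, '=' intact.
        System.out.println(parse(new String[] {"-Dkey=value"}));
        // What PowerShell passes: two arguments, the value is orphaned.
        System.out.println(parse(new String[] {"-Dkey", "value"}));
    }
}
```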

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HADOOP-9619) Mark stability of .proto files

2013-06-20 Thread Sanjay Radia (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-9619?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sanjay Radia updated HADOOP-9619:
-

Attachment: HADOOP-9619-v3.patch

Updated to include the yarn public api protos

 Mark stability of .proto files
 --

 Key: HADOOP-9619
 URL: https://issues.apache.org/jira/browse/HADOOP-9619
 Project: Hadoop Common
  Issue Type: Sub-task
  Components: documentation
Reporter: Sanjay Radia
Assignee: Sanjay Radia
 Attachments: HADOOP-9619.patch, HADOOP-9619-v2.patch, 
 HADOOP-9619-v3.patch




--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HADOOP-9619) Mark stability of .proto files

2013-06-20 Thread Karthik Kambatla (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-9619?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13689857#comment-13689857
 ] 

Karthik Kambatla commented on HADOOP-9619:
--

Nit: instead of referring to the wiki page, it would be good to refer to the 
apt-docs page on Compatibility, since the wiki can change or be removed. But 
this can also be addressed later, when we actually remove the wiki.

+1



[jira] [Commented] (HADOOP-9656) Gridmix unit tests fail on Windows and Linux

2013-06-20 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-9656?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13689859#comment-13689859
 ] 

Hadoop QA commented on HADOOP-9656:
---

{color:green}+1 overall{color}.  Here are the results of testing the latest 
attachment 
  
http://issues.apache.org/jira/secure/attachment/12588965/HADOOP-9656-trunk.2.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 2 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-tools/hadoop-gridmix.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-HADOOP-Build/2687//testReport/
Console output: 
https://builds.apache.org/job/PreCommit-HADOOP-Build/2687//console

This message is automatically generated.



[jira] [Commented] (HADOOP-9421) Convert SASL to use ProtoBuf and add lengths for non-blocking processing

2013-06-20 Thread Luke Lu (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-9421?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13689870#comment-13689870
 ] 

Luke Lu commented on HADOOP-9421:
-

My simple to \* is equivalent to Daryn's. Note that consecutive C -> S messages 
can be merged into one TCP packet.

SASL to insecure
{code}
C -> S connectionHeader(SASL), INITIATE(optional initial token) 
S -> C SUCCESS
C -> S connectionContext, RPC request
{code}

SASL to secure
{code}
C -> S connectionHeader(SASL), INITIATE(optional initial token, [(TOKEN, 
DIGEST-MD5)])
S -> C CHALLENGE(challenge-token) or NEGOTIATE([(TOKEN, DIGEST-MD5), (KERBEROS, 
GSSAPI), ...])
C -> S RESPONSE(response-token) or REINITIATE(initial token, [(TOKEN, 
DIGEST-MD5)])
...
S -> C SUCCESS(final-token)
C -> S connectionContext, RPC request
{code}

Bottom line: my patch is a strict superset of Daryn's patch from a protocol 
POV. The keyword is *optional* client initiate.  Daryn's protocol can *not* 
support SCRAM (or any modern auth requiring a client nonce) without an extra 
round trip.

Most of the credit for my patch goes to Daryn, as adding optional client 
initiate is simple (only a few extra lines).



[jira] [Commented] (HADOOP-9619) Mark stability of .proto files

2013-06-20 Thread Vinod Kumar Vavilapalli (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-9619?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13689894#comment-13689894
 ] 

Vinod Kumar Vavilapalli commented on HADOOP-9619:
-

+1 for the YARN changes. I'm going to move the Admin protocol to be private 
after this goes in.

Instead of the wiki, the correct way is to link to a site doc URL, but we can't 
do that yet, as 2.1 isn't released yet.



[jira] [Commented] (HADOOP-9421) Convert SASL to use ProtoBuf and add lengths for non-blocking processing

2013-06-20 Thread Arun C Murthy (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-9421?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13689895#comment-13689895
 ] 

Arun C Murthy commented on HADOOP-9421:
---

Guys, how far are we from getting this done? 



[jira] [Created] (HADOOP-9661) Allow metrics sources to be extended

2013-06-20 Thread Sandy Ryza (JIRA)
Sandy Ryza created HADOOP-9661:
--

 Summary: Allow metrics sources to be extended
 Key: HADOOP-9661
 URL: https://issues.apache.org/jira/browse/HADOOP-9661
 Project: Hadoop Common
  Issue Type: Improvement
  Components: metrics
Affects Versions: 2.0.5-alpha
Reporter: Sandy Ryza
Assignee: Sandy Ryza


My use case is to create an FSQueueMetrics that extends QueueMetrics and 
includes some additional fair-scheduler-specific information.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HADOOP-9421) Convert SASL to use ProtoBuf and add lengths for non-blocking processing

2013-06-20 Thread Daryn Sharp (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-9421?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13689940#comment-13689940
 ] 

Daryn Sharp commented on HADOOP-9421:
-

I know Luke means well, but an initial initiate will never be used after 
subsequent changes.  The negotiate will always be required for IP failover and 
improved token selection.  The approach is trying to over-optimize and weakens 
the design.

I benched both patches: Luke's token auth times are 1ms slower over 100 
samples, but let's say they are the same. However, Luke's patch is 8-11ms 
slower with Kerberos than mine. It also has only a subset of the changes needed 
to keep future changes compatible. 



[jira] [Commented] (HADOOP-9421) Convert SASL to use ProtoBuf and add lengths for non-blocking processing

2013-06-20 Thread Luke Lu (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-9421?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13689943#comment-13689943
 ] 

Luke Lu commented on HADOOP-9421:
-

My patch is ready for review, commit, and scale testing. I'd appreciate fellow 
committer reviews and +1s :)

I was hoping to nudge Daryn into adding optional client initiate. My patch 
shows that adding optional client initiate is simple and future-proof.



[jira] [Updated] (HADOOP-9661) Allow metrics sources to be extended

2013-06-20 Thread Sandy Ryza (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-9661?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sandy Ryza updated HADOOP-9661:
---

Attachment: HADOOP-9661.patch

 Allow metrics sources to be extended
 

 Key: HADOOP-9661
 URL: https://issues.apache.org/jira/browse/HADOOP-9661
 Project: Hadoop Common
  Issue Type: Improvement
  Components: metrics
Affects Versions: 2.0.5-alpha
Reporter: Sandy Ryza
Assignee: Sandy Ryza
 Attachments: HADOOP-9661.patch


 My use case is to create an FSQueueMetrics that extends QueueMetrics and 
 includes some additional fair-scheduler-specific information.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HADOOP-9661) Allow metrics sources to be extended

2013-06-20 Thread Sandy Ryza (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-9661?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sandy Ryza updated HADOOP-9661:
---

Status: Patch Available  (was: Open)



[jira] [Commented] (HADOOP-9654) IPC timeout doesn't seem to be kicking in

2013-06-20 Thread Roman Shaposhnik (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-9654?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13689946#comment-13689946
 ] 

Roman Shaposhnik commented on HADOOP-9654:
--

[~jagane] as a matter of fact I didn't know that -- thanks a million for 
bringing this up! I can definitely give your suggestion a try (the NN keeps 
OOMing, which gives me a perfect testbed for this).

I do have a question for the rest of the folks here, though: a client that 
never times out doesn't strike me as a great default. Am I missing something? 
Should we change the default so that the client actually times out?
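The default under discussion boils down to the socket read timeout: a read with `SO_TIMEOUT` of 0 blocks forever, while a nonzero timeout eventually raises `SocketTimeoutException`. A self-contained demonstration (plain sockets, not Hadoop's IPC client):

```java
import java.io.IOException;
import java.net.ServerSocket;
import java.net.Socket;
import java.net.SocketTimeoutException;

public class TimeoutDemo {
    public static void main(String[] args) throws IOException {
        try (ServerSocket server = new ServerSocket(0);
             Socket client = new Socket("localhost", server.getLocalPort());
             Socket peer = server.accept()) {  // peer never writes anything back
            client.setSoTimeout(200);  // 200 ms; a value of 0 blocks indefinitely
            try {
                client.getInputStream().read();
                System.out.println("read returned");
            } catch (SocketTimeoutException e) {
                System.out.println("timed out");
            }
        }
    }
}
```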

 IPC timeout doesn't seem to be kicking in
 -

 Key: HADOOP-9654
 URL: https://issues.apache.org/jira/browse/HADOOP-9654
 Project: Hadoop Common
  Issue Type: Bug
  Components: ipc
Affects Versions: 2.1.0-beta
Reporter: Roman Shaposhnik

 During my Bigtop testing I made the NN OOM. This, in turn, made all of the 
 clients stuck in the IPC call (even the new clients that I run *after* the NN 
 went OOM). Here's an example of a jstack output on the client that was 
 running:
 {noformat}
 $ hadoop fs -lsr /
 {noformat}
 Stacktrace:
 {noformat}
 /usr/java/jdk1.6.0_21/bin/jstack 19078
 2013-06-19 23:14:00
 Full thread dump Java HotSpot(TM) 64-Bit Server VM (17.0-b16 mixed mode):
 Attach Listener daemon prio=10 tid=0x7fcd8c8c1800 nid=0x5105 waiting on 
 condition [0x]
java.lang.Thread.State: RUNNABLE
 IPC Client (1223039541) connection to 
 ip-10-144-82-213.ec2.internal/10.144.82.213:17020 from root daemon prio=10 
 tid=0x7fcd8c7ea000 nid=0x4aa0 runnable [0x7fcd443e2000]
java.lang.Thread.State: RUNNABLE
   at sun.nio.ch.EPollArrayWrapper.epollWait(Native Method)
   at sun.nio.ch.EPollArrayWrapper.poll(EPollArrayWrapper.java:210)
   at sun.nio.ch.EPollSelectorImpl.doSelect(EPollSelectorImpl.java:65)
   at sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:69)
   - locked 0x7fcd7529de18 (a sun.nio.ch.Util$1)
   - locked 0x7fcd7529de00 (a java.util.Collections$UnmodifiableSet)
   - locked 0x7fcd7529da80 (a sun.nio.ch.EPollSelectorImpl)
   at sun.nio.ch.SelectorImpl.select(SelectorImpl.java:80)
   at 
 org.apache.hadoop.net.SocketIOWithTimeout$SelectorPool.select(SocketIOWithTimeout.java:335)
   at 
 org.apache.hadoop.net.SocketIOWithTimeout.doIO(SocketIOWithTimeout.java:157)
   at 
 org.apache.hadoop.net.SocketInputStream.read(SocketInputStream.java:161)
   at 
 org.apache.hadoop.net.SocketInputStream.read(SocketInputStream.java:131)
   at java.io.FilterInputStream.read(FilterInputStream.java:116)
   at java.io.FilterInputStream.read(FilterInputStream.java:116)
   at 
 org.apache.hadoop.ipc.Client$Connection$PingInputStream.read(Client.java:421)
   at java.io.BufferedInputStream.fill(BufferedInputStream.java:218)
   at java.io.BufferedInputStream.read(BufferedInputStream.java:237)
   - locked 0x7fcd752aaf18 (a java.io.BufferedInputStream)
   at java.io.DataInputStream.readInt(DataInputStream.java:370)
   at 
 org.apache.hadoop.ipc.Client$Connection.receiveRpcResponse(Client.java:943)
   at org.apache.hadoop.ipc.Client$Connection.run(Client.java:840)
 Low Memory Detector daemon prio=10 tid=0x7fcd8c09 nid=0x4a9b 
 runnable [0x]
java.lang.Thread.State: RUNNABLE
 CompilerThread1 daemon prio=10 tid=0x7fcd8c08d800 nid=0x4a9a waiting on 
 condition [0x]
java.lang.Thread.State: RUNNABLE
 CompilerThread0 daemon prio=10 tid=0x7fcd8c08a800 nid=0x4a99 waiting on 
 condition [0x]
java.lang.Thread.State: RUNNABLE
 Signal Dispatcher daemon prio=10 tid=0x7fcd8c088800 nid=0x4a98 runnable 
 [0x]
java.lang.Thread.State: RUNNABLE
 Finalizer daemon prio=10 tid=0x7fcd8c06a000 nid=0x4a97 in Object.wait() 
 [0x7fcd902e9000]
java.lang.Thread.State: WAITING (on object monitor)
   at java.lang.Object.wait(Native Method)
   - waiting on 0x7fcd75fc0470 (a java.lang.ref.ReferenceQueue$Lock)
   at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:118)
   - locked 0x7fcd75fc0470 (a java.lang.ref.ReferenceQueue$Lock)
   at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:134)
   at java.lang.ref.Finalizer$FinalizerThread.run(Finalizer.java:159)
 Reference Handler daemon prio=10 tid=0x7fcd8c068000 nid=0x4a96 in 
 Object.wait() [0x7fcd903ea000]
java.lang.Thread.State: WAITING (on object monitor)
   at java.lang.Object.wait(Native Method)
   - waiting on 0x7fcd75fc0550 (a java.lang.ref.Reference$Lock)
   at java.lang.Object.wait(Object.java:485)
   at java.lang.ref.Reference$ReferenceHandler.run(Reference.java:116)
   

[jira] [Commented] (HADOOP-9619) Mark stability of .proto files

2013-06-20 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-9619?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13689949#comment-13689949
 ] 

Hadoop QA commented on HADOOP-9619:
---

{color:green}+1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12588968/HADOOP-9619-v3.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+0 tests included{color}.  The patch appears to be a 
documentation patch that doesn't require tests.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-common-project/hadoop-common hadoop-hdfs-project/hadoop-hdfs 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-HADOOP-Build/2688//testReport/
Console output: 
https://builds.apache.org/job/PreCommit-HADOOP-Build/2688//console

This message is automatically generated.



[jira] [Commented] (HADOOP-9421) Convert SASL to use ProtoBuf and add lengths for non-blocking processing

2013-06-20 Thread Daryn Sharp (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-9421?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13689950#comment-13689950
 ] 

Daryn Sharp commented on HADOOP-9421:
-

We also need to consider the issues caused by guessing the auth during rolling 
upgrades that change the security mechanisms.



[jira] [Commented] (HADOOP-9421) Convert SASL to use ProtoBuf and add lengths for non-blocking processing

2013-06-20 Thread Benoy Antony (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-9421?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13689952#comment-13689952
 ] 

Benoy Antony commented on HADOOP-9421:
--

Luke, is there a sufficient need to add a new state, REINITIATE? The only 
benefit is that it avoids an isSupportedInitiate() call.

From what I understood, Luke's change is a desirable optimization of Daryn's 
protocol. Such optimizations are quite common in protocols. I agree that it 
complicates the design a bit. It may never be used once the subsequent changes 
related to IP failover etc. land, but why not keep it, since it should not have 
any impact on performance? Why would it cause a change in performance? 

 




[jira] [Commented] (HADOOP-9421) Convert SASL to use ProtoBuf and add lengths for non-blocking processing

2013-06-20 Thread Daryn Sharp (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-9421?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13689955#comment-13689955
 ] 

Daryn Sharp commented on HADOOP-9421:
-

If it's not obvious, I'm -1 on Luke's patch.  The biggest issue is we must have 
IP failover support.  At best, the client will fail to initiate GSSAPI with the 
wrong principal, and you'd have to assume an exception means soliciting a 
negotiate message to recreate the SASL client.  At worst, you get the wrong 
service ticket and fail the negotiation. 



[jira] [Commented] (HADOOP-9421) Convert SASL to use ProtoBuf and add lengths for non-blocking processing

2013-06-20 Thread Daryn Sharp (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-9421?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13689959#comment-13689959
 ] 

Daryn Sharp commented on HADOOP-9421:
-

You can do the feature that will never be used by blasting the initiate, 
ignoring the negotiate response, and hoping for the best. 



[jira] [Commented] (HADOOP-9421) Convert SASL to use ProtoBuf and add lengths for non-blocking processing

2013-06-20 Thread Luke Lu (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-9421?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13689961#comment-13689961
 ] 

Luke Lu commented on HADOOP-9421:
-

bq. Luke, Is there a sufficient need to add a new state REINITIATE ? The only 
benefit is that it avoids a isSupportedInitiate() call.

Yes. It's the most straightforward way to avoid an initiate/negotiate loop, 
besides aiding debuggability :)

Bad SASL to secure example:
{code}
C -> S connectionHeader(SASL), INITIATE([(INVALID, ...)])
C <- S NEGOTIATE([(TOKEN, DIGEST-MD5), (KERBEROS, GSSAPI), ...]) // due to 
isSupportedInitiate returning false
C -> S REINITIATE([(INVALID, ...)]) // note: there is no transition from 
REINITIATE back to NEGOTIATE
C <- S RPC Exception: INVALID not supported, connection closed
{code}

The code is simple and hopefully not too subtle (see 
Connection#saslReadAndProcess and ProcessSaslMessage).
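The loop-prevention rule above can be sketched in a few lines. All names here (SaslServerSketch, SaslState, process) are hypothetical illustration, not the patch's actual code, which lives in Connection#saslReadAndProcess:

```java
// Sketch of the server-side state handling described above: an unsupported
// mechanism in INITIATE triggers one NEGOTIATE, but the same failure in
// REINITIATE is a hard error, so an initiate/negotiate loop is impossible.
import java.util.Arrays;
import java.util.List;

public class SaslServerSketch {
  enum SaslState { NEGOTIATE, INITIATE, REINITIATE, CHALLENGE }

  static final List<String> SUPPORTED = Arrays.asList("DIGEST-MD5", "GSSAPI");

  // Returns the server's reply state for a client SASL message.
  static SaslState process(SaslState clientState, String mechanism) {
    boolean supported = SUPPORTED.contains(mechanism);
    switch (clientState) {
      case INITIATE:
        // First attempt: an unsupported mechanism just triggers renegotiation.
        return supported ? SaslState.CHALLENGE : SaslState.NEGOTIATE;
      case REINITIATE:
        // Second attempt after a NEGOTIATE: no transition back to NEGOTIATE,
        // so a bad client gets an exception and a closed connection instead.
        if (!supported) {
          throw new IllegalArgumentException(mechanism + " not supported");
        }
        return SaslState.CHALLENGE;
      default:
        throw new IllegalStateException("unexpected state " + clientState);
    }
  }
}
```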



[jira] [Commented] (HADOOP-9661) Allow metrics sources to be extended

2013-06-20 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-9661?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13689970#comment-13689970
 ] 

Hadoop QA commented on HADOOP-9661:
---

{color:green}+1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12588984/HADOOP-9661.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 1 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-common-project/hadoop-common.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-HADOOP-Build/2689//testReport/
Console output: 
https://builds.apache.org/job/PreCommit-HADOOP-Build/2689//console

This message is automatically generated.

 Allow metrics sources to be extended
 

 Key: HADOOP-9661
 URL: https://issues.apache.org/jira/browse/HADOOP-9661
 Project: Hadoop Common
  Issue Type: Improvement
  Components: metrics
Affects Versions: 2.0.5-alpha
Reporter: Sandy Ryza
Assignee: Sandy Ryza
 Attachments: HADOOP-9661.patch


 My use case is to create an FSQueueMetrics that extends QueueMetrics and 
 includes some additional fair-scheduler-specific information.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HADOOP-9421) Convert SASL to use ProtoBuf and add lengths for non-blocking processing

2013-06-20 Thread Luke Lu (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-9421?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13689971#comment-13689971
 ] 

Luke Lu commented on HADOOP-9421:
-

bq. Luke's patch is 8-11ms slower with Kerberos than mine

Ah, that's probably because my unoptimized patch does the initial token twice 
(once in INITIATE and again in REINITIATE), which causes an extra round trip to 
the KDC to get a service ticket. But I can fix this with no protocol changes and 
eliminate the difference. OTOH, I can further eliminate the NEGOTIATE round 
trip for Kerberos in the normal/steady-state case. 



[jira] [Commented] (HADOOP-9421) Convert SASL to use ProtoBuf and add lengths for non-blocking processing

2013-06-20 Thread Daryn Sharp (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-9421?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13689976#comment-13689976
 ] 

Daryn Sharp commented on HADOOP-9421:
-

No, you cannot try to further avoid the negotiate. W/o the negotiate to 
properly initialize the SASL client for the proper service host, IP failover 
will never work!





[jira] [Commented] (HADOOP-9421) Convert SASL to use ProtoBuf and add lengths for non-blocking processing

2013-06-20 Thread Luke Lu (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-9421?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13689980#comment-13689980
 ] 

Luke Lu commented on HADOOP-9421:
-

bq. At best the client will fail to initiate GSSAPI with the wrong principle. 
You'd have to assume an exception means solicit a negotiate message to recreate 
the SASL client. At worst you get the wrong service ticket and fail the 
negotiation. 

I think we can do better with my protocol for the insane failover case 
(separate server principals per HA server of a logical service):
{code}
C -> S connectionHeader, INITIATE(old-token, [KERBEROS], old-host)
C <- S NEGOTIATE([TOKEN, KERBEROS], new-host) // no exception, as the server 
can detect that old-host and new-host differ
C -> S REINITIATE(new-token, KERBEROS)
...
{code}

Client can cache the server name and in steady state case:
{code}
C -> S connectionHeader, INITIATE(new-token, [KERBEROS], new-host)
C <- S CHALLENGE(final-token)
{code}

Look ma, no NEGOTIATE!
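The steady-state shortcut above amounts to the client caching the server name learned from a previous NEGOTIATE and using it on the next INITIATE. A minimal sketch, with entirely hypothetical class and method names:

```java
// Sketch of a client-side cache of the host a server advertised in a
// NEGOTIATE, keyed by logical service. On reconnect, the cached host lets
// the INITIATE carry a token for the right principal with no NEGOTIATE
// round trip; a cache miss falls back to the connection address and the
// client simply expects a NEGOTIATE.
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class ServerNameCache {
  private static final Map<String, String> CACHE = new ConcurrentHashMap<>();

  // Called when a NEGOTIATE arrives: remember the advertised host.
  static void remember(String service, String advertisedHost) {
    CACHE.put(service, advertisedHost);
  }

  // Called before sending INITIATE: prefer the cached host if present.
  static String hostFor(String service, String connectionHost) {
    return CACHE.getOrDefault(service, connectionHost);
  }
}
```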





[jira] [Commented] (HADOOP-7140) IPC Reader threads do not stop when server stops

2013-06-20 Thread Chris Nauroth (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-7140?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13690009#comment-13690009
 ] 

Chris Nauroth commented on HADOOP-7140:
---

+1 for the backport patches.  I ran {{TestRPC}} successfully with the backport.

 IPC Reader threads do not stop when server stops
 

 Key: HADOOP-7140
 URL: https://issues.apache.org/jira/browse/HADOOP-7140
 Project: Hadoop Common
  Issue Type: Bug
Affects Versions: 0.22.0, 1-win, 1.3.0
Reporter: Todd Lipcon
Assignee: Todd Lipcon
Priority: Critical
 Fix For: 0.22.0

 Attachments: HADOOP-7140.branch-1.patch, 
 HADOOP-7140.branch-1-win.patch, hadoop-7140.txt, hadoop-7140.txt


 After HADOOP-6713, the new IPC Reader threads are not properly stopped when 
 the server shuts down. One repercussion of this is that conditions that are 
 supposed to shut down a daemon no longer work (eg the TT doesn't shut itself 
 down if it detects an incompatible build version)

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HADOOP-7140) IPC Reader threads do not stop when server stops

2013-06-20 Thread Ivan Mitic (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-7140?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13690030#comment-13690030
 ] 

Ivan Mitic commented on HADOOP-7140:


Thanks Chris! Will commit to branch-1 shortly.

Since we'll be merging branch-1 changes into branch-1-win, I'll wait for the 
merge to pick up this change. 



[jira] [Commented] (HADOOP-9421) Convert SASL to use ProtoBuf and add lengths for non-blocking processing

2013-06-20 Thread Daryn Sharp (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-9421?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13690034#comment-13690034
 ] 

Daryn Sharp commented on HADOOP-9421:
-

You seem to be trying to tailor a design that only considers today's 
implementation of tokens and kerberos.  It seems easy when you assume there 
are only two choices.  The optimization becomes more and more complicated, and 
in many cases impossible, instead of simply doing something the server tells 
you to do.

When pluggable auth support allows a world of heterogeneous security, such as 
Knox or Rhino, requiring REINITIATE penalties becomes very expensive.

Sorry for the very long read, but these are topics I intended to address on the 
call that unfortunately didn't happen today.

+IP failover+
Distinct service principals with IP failover isn't insane.  With a shared 
principal services can't be accessed directly because the host doesn't match 
the shared principal.  So a different config with a hardcoded shared principal 
is needed.  Similarly, DNs won't be able to heartbeat directly into HA NNs.  
I'm sure there are more problems than we've already discovered investigating 
that route.

The root issue is that the client must only use the hostname that appears in 
the kerberos service principal, which means you can't access the service via 
all its interfaces, hostnames, or even pretty CNAMEs.

If the server advertises "this is who I am" via the NEGOTIATE, then the problem 
is solved.

+Token selection issues+
Selecting tokens pre-connection based on the service as a host or ip port tuple 
is a problem.  Let's take a few examples:

Using the IP precludes multi-interface host support, for instance if you want 
to have a fast/private intra-cluster network and a separate public network.  
Tokens will contain the public IP, but clients using the private interface 
(different IP) can't find them.  This isn't contrived, it's something Cloudera 
has wanted to do.

You also can't use the IP because changing a service's IP will break clients 
using tokens with the old IP.  In comes the bane of my creation, use_ip=false, 
to use the given hostname.  But you can't allow non-fully-qualified names 
because they will resolve differently depending on the DNS search path.  
There's a raft of reasons why the canonicalization isn't as straightforward as 
you'd think, which led to a custom NetUtils resolver and complicated path 
normalization.

Likewise, any sort of public proxy or NAT-ing between an external client and a 
cluster service creates an unusable token service within the grid.

HA token logic is unnecessarily convoluted to clone tokens from a logical uri 
into multiple tokens with each failover's service.

_Solution_
A clean solution to all these problems is tokens contain a server generated 
opaque id.  The server NEGOTIATE reports this id.  The client looks for a token 
with that id.  Now no matter what interface/IP/hostname/proxy/NAT is used, the 
client will always find the token.

If you cut out the use of the NEGOTIATE, this ability is gone.
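The opaque-id selection described above could look roughly like this. The Token shape here is a hypothetical illustration, not Hadoop's actual Token class:

```java
// Sketch of selecting a token by a server-generated opaque id reported in
// the NEGOTIATE, instead of matching on a host:port "service" string, so
// interface/IP/hostname/proxy/NAT differences no longer matter.
import java.util.List;
import java.util.Optional;

public class TokenById {
  // Hypothetical token carrying an opaque server id.
  static final class Token {
    final String serverId;
    Token(String serverId) { this.serverId = serverId; }
  }

  // Find the credential whose id matches what the server advertised.
  static Optional<Token> select(List<Token> credentials, String negotiatedId) {
    return credentials.stream()
        .filter(t -> t.serverId.equals(negotiatedId))
        .findFirst();
  }
}
```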

+Supporting new auth methods+
Other new auths in the future may need the protocol/serverId hints from the 
NEGOTIATE to locate the required credentials.  Guessing may not be an option.

The RPC client shouldn't have to be modified to make a pre-connection guess for 
all the auth methods it supports.  Because...

Why should the client attempt an auth method before it _even knows if the 
server can do it_?  Let's look at some hairy examples:

The client tries to do kerberos, so it needs to generate the initial response 
to take advantage of your optimization.  But the server isn't kerberized.  So 
either the client fails because it has no TGT (which it doesn't even need!), or 
it fails to get a non-existent service principal.

What if the client decides to use an SSO service, but the server doesn't do 
SSO?  Take a REINITIATE penalty every time?

+Supporting new mechanisms+
Let's say we add support for a new mechanism like SCRAM.  Just because the 
client can do it doesn't mean all services across all clusters can do it.  The 
server's NEGOTIATE will tell the client if it can do DIGEST-MD5, SCRAM, etc.

Inter-cluster compatibility and rolling upgrades will introduce scenarios where 
the required mechanism differs, and penalizing the client to REINITIATE is not 
a valid option.

---

In all of these scenarios, there aren't complex issues if the NEGOTIATE is used 
to choose an appropriate auth type.  In a world of multiple auths and multiple 
mechanisms per auth, requiring REINITIATE penalties is too expensive.

Ignoring all the issues I've cited, your optimization doesn't appear to have a 
positive impact on performance.  Even if it did shave a few milliseconds or 
even 100ms, will it have a measurable real-world impact?  Considering how many 
RPC requests are performed over a single connection, will the negligible 
penalty from 

[jira] [Updated] (HADOOP-7140) IPC Reader threads do not stop when server stops

2013-06-20 Thread Ivan Mitic (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-7140?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ivan Mitic updated HADOOP-7140:
---

Fix Version/s: 1.3.0

Patch committed to branch-1.



[jira] [Commented] (HADOOP-9421) Convert SASL to use ProtoBuf and add lengths for non-blocking processing

2013-06-20 Thread Luke Lu (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-9421?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13690051#comment-13690051
 ] 

Luke Lu commented on HADOOP-9421:
-

bq. In a world of multiple auths and multiple mechanisms for an auth, requiring 
REINITIATE penalties is too expensive.

If a client can't pick a mechanism, it could skip the initial token and send an 
empty INITIATE; REINITIATE is then not expensive, i.e. exactly equivalent to 
yours.

Hadoop RPC should foremost serve its most common workload: delegation tokens. A 
performance regression for the most common workload in the name of integration 
is not acceptable. The specific optimization for DIGEST-MD5 (cramming a 
speculative challenge into a negotiate) doesn't work with modern 
client-initiated auths like SCRAM. If we have to replace DIGEST-MD5 for 
security reasons, we'll be SOL. 

bq. Ignoring all the issues I've cited, your optimization doesn't appear to 
have a positive impact on performance.

There is no optimization in my patch, which merely leaves the door open for 
future optimization. In fact, there is a performance bug in my impl for 
Kerberos. It's you who added a speculative optimization for DIGEST-MD5 that 
doesn't work with its future replacement, SCRAM.

bq. I feel like we've spent weeks haggling over an ill-suited premature 
optimization that could have been spent building upon this implementation.

I merely want to leave the optional client-initiate proto *in the protocol* for 
future optimizations. I feel like I'm being forced to implement the 
optimization to show that it's straightforward and incremental.
