[jira] [Commented] (HADOOP-10596) HttpServer2 should apply the authentication filter to some urls instead of null

2014-05-13 Thread Haohui Mai (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-10596?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13996056#comment-13996056
 ] 

Haohui Mai commented on HADOOP-10596:
-

The patch does not seem to be conceptually clean. For example, the secret file 
is an implementation detail of the filter; it should not be exposed in 
HttpServer2. Maybe I don't quite understand the use case here. Under what 
circumstances would you only want parts of the HttpServer to be authenticated?



 HttpServer2 should apply the authentication filter to some urls instead of 
 null
 ---

 Key: HADOOP-10596
 URL: https://issues.apache.org/jira/browse/HADOOP-10596
 Project: Hadoop Common
  Issue Type: Bug
Reporter: Zhijie Shen
Assignee: Zhijie Shen
 Attachments: HADOOP-10596.1.patch


 HttpServer2 should apply the authentication filter to some urls instead of 
 null. In addition, it should be more flexible for users to configure SPNEGO.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HADOOP-10564) Add username to native RPCv9 client

2014-05-13 Thread Colin Patrick McCabe (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-10564?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Colin Patrick McCabe updated HADOOP-10564:
--

Attachment: HADOOP-10564-pnative.005.patch

Hi Binglin, 

Earlier, on HADOOP-10389, you made a comment that we should set call id rather 
than just using 0 everywhere.  I implemented this in version #5 of this patch.

I think the comment was this:

bq. 1. to my understanding, rpc client should have a map<callid, call> to 
record all unfinished calls, but I could not find any code assigning call 
ids (only making them 0) and managing unfinished calls, could you help me 
locate that logic?

We don't need a map here, since we can only have one call ID in flight at once 
(the server doesn't *yet* support this).  In the future, this might change, but 
for now, it's OK just to check that the call ID we got in the response was the 
same as the call ID we made in the request.
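
To make the check concrete, a conceptual sketch (in Java for readability; the 
actual client is C, and the names here are illustrative, not from the patch):

{code}
// Conceptual sketch only (the real client is C; these names are illustrative).
// The point: remember the call ID assigned to the request and reject a
// response that carries a different one, since only one call is in flight.
private int nextCallId = 0;

int assignCallId() {
  return nextCallId++;               // fresh call ID for each request
}

void checkResponseCallId(int requestCallId, int responseCallId)
    throws java.io.IOException {
  if (requestCallId != responseCallId) {
    throw new java.io.IOException("expected call ID " + requestCallId
        + " in response, but got " + responseCallId);
  }
}
{code}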

 Add username to native RPCv9 client
 ---

 Key: HADOOP-10564
 URL: https://issues.apache.org/jira/browse/HADOOP-10564
 Project: Hadoop Common
  Issue Type: Sub-task
  Components: native
Affects Versions: HADOOP-10388
Reporter: Colin Patrick McCabe
Assignee: Colin Patrick McCabe
 Attachments: HADOOP-10564-pnative.002.patch, 
 HADOOP-10564-pnative.003.patch, HADOOP-10564-pnative.004.patch, 
 HADOOP-10564-pnative.005.patch, HADOOP-10564.001.patch


 Add the ability for the native RPCv9 client to set a username when initiating 
 a connection.



--
This message was sent by Atlassian JIRA
(v6.2#6252)



[jira] [Commented] (HADOOP-10596) HttpServer2 should apply the authentication filter to some urls instead of null

2014-05-13 Thread Zhijie Shen (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-10596?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13996051#comment-13996051
 ] 

Zhijie Shen commented on HADOOP-10596:
--

bq. If I understand correctly, this is supported today in the webhdfs server.

Right, the YARN style is to have the webapp embedded in the daemon and to 
configure it programmatically, which is actually the motivation here.
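
For context, a hedged sketch of what the programmatic configuration can look 
like; AUTH_TYPE and the kerberos.* keys are the standard hadoop-auth handler 
settings, while the values and the registration step are placeholders:

{code}
// Sketch: configuring the hadoop-auth filter programmatically rather than
// through core-site.xml. Values below are placeholders.
import java.util.HashMap;
import java.util.Map;
import org.apache.hadoop.security.authentication.server.AuthenticationFilter;

Map<String, String> params = new HashMap<String, String>();
params.put(AuthenticationFilter.AUTH_TYPE, "kerberos");
params.put("kerberos.principal", "HTTP/host.example.com@EXAMPLE.COM");
params.put("kerberos.keytab", "/etc/security/keytabs/spnego.service.keytab");
// ...then register the filter against explicit url patterns such as "/*",
// instead of passing null.
{code}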

 HttpServer2 should apply the authentication filter to some urls instead of 
 null
 ---

 Key: HADOOP-10596
 URL: https://issues.apache.org/jira/browse/HADOOP-10596
 Project: Hadoop Common
  Issue Type: Bug
Reporter: Zhijie Shen
Assignee: Zhijie Shen
 Attachments: HADOOP-10596.1.patch


 HttpServer2 should apply the authentication filter to some urls instead of 
 null. In addition, it should be more flexible for users to configure SPNEGO.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HADOOP-10583) bin/hadoop key throws NPE with no args and assorted other fixups

2014-05-13 Thread Charles Lamb (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-10583?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Charles Lamb updated HADOOP-10583:
--

Attachment: HADOOP-10583.4.patch

Adds more changes suggested by Alejandro. Does not address the comment about 
using commons-cli.

 bin/hadoop key throws NPE with no args and assorted other fixups
 

 Key: HADOOP-10583
 URL: https://issues.apache.org/jira/browse/HADOOP-10583
 Project: Hadoop Common
  Issue Type: Bug
  Components: bin
Reporter: Charles Lamb
Assignee: Charles Lamb
Priority: Minor
  Labels: patch
 Fix For: 3.0.0

 Attachments: HADOOP-10583.1.patch, HADOOP-10583.2.patch, 
 HADOOP-10583.3.patch, HADOOP-10583.4.patch, HADOOP-10583.5.patch


 bin/hadoop key throws NPE.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HADOOP-10566) Refactor proxyservers out of ProxyUsers

2014-05-13 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-10566?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13992792#comment-13992792
 ] 

Hudson commented on HADOOP-10566:
-

FAILURE: Integrated in Hadoop-Hdfs-trunk #1751 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk/1751/])
HADOOP-10566. Add toLowerCase support to auth_to_local rules for service name. 
(tucu) (tucu: 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1593105)
* 
/hadoop/common/trunk/hadoop-common-project/hadoop-auth/src/main/java/org/apache/hadoop/security/authentication/util/KerberosName.java
* 
/hadoop/common/trunk/hadoop-common-project/hadoop-auth/src/test/java/org/apache/hadoop/security/authentication/util/TestKerberosName.java
* /hadoop/common/trunk/hadoop-common-project/hadoop-common/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-common-project/hadoop-common/src/site/apt/SecureMode.apt.vm


 Refactor proxyservers out of ProxyUsers
 ---

 Key: HADOOP-10566
 URL: https://issues.apache.org/jira/browse/HADOOP-10566
 Project: Hadoop Common
  Issue Type: Sub-task
  Components: security
Affects Versions: 2.4.0
Reporter: Benoy Antony
Assignee: Benoy Antony
 Attachments: HADOOP-10566.patch, HADOOP-10566.patch, 
 HADOOP-10566.patch, HADOOP-10566.patch


 HADOOP-10498 added the proxyservers feature in ProxyUsers. It is beneficial 
 to treat this as a separate feature since:
 1. ProxyUsers is per proxyuser, whereas proxyservers is per cluster; the 
 cardinality is different. 
 2. ProxyUsers.authorize() and ProxyUsers.isproxyUser() are synchronized and 
 hence share the same lock, which impacts performance.
 Since these are two separate features, it will be an improvement to keep them 
 separate. It also enables one to fine-tune each feature independently.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HADOOP-10599) Support prioritization of DN RPCs over client RPCs

2014-05-13 Thread Daryn Sharp (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-10599?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13996401#comment-13996401
 ] 

Daryn Sharp commented on HADOOP-10599:
--

With the current global fair lock, I wonder if DNs could effectively starve 
clients.  The block updates may create a glut of write ops at the front of the 
call queue.  With no prioritization, client reads create bubbles in between 
the write ops.  

I think you've touched upon the angle we are using to attack the problem.  Once 
I can finally get away from the webhdfs hardening that derailed the effort, 
we'll resume work on fine-grained FSN locking + distinct BM locking + a 
separate RPC service for the BM.  This should effectively achieve 
prioritization of DNs with much better performance characteristics overall.

 Support prioritization of DN RPCs over client RPCs
 --

 Key: HADOOP-10599
 URL: https://issues.apache.org/jira/browse/HADOOP-10599
 Project: Hadoop Common
  Issue Type: Sub-task
Reporter: Ming Ma

 We might need to prioritize DN RPCs over client RPCs so that, no matter what 
 applications do to the NN RPC and FSNamesystem's global lock, DN requests 
 will be processed in a timely manner. After a cluster is configured to have 
 the service RPC server separated from the client RPC server, the problem is 
 mitigated to some degree by the fair FSNamesystem global lock. Also, if the 
 NN global lock can be made more fine-grained, such a need becomes less 
 important. Still, it will be good to evaluate whether this is a good option.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HADOOP-10177) Create CLI tools for managing keys via the KeyProvider API

2014-05-13 Thread Owen O'Malley (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-10177?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Owen O'Malley updated HADOOP-10177:
---

Fix Version/s: (was: 0.3.0)
   3.0.0

 Create CLI tools for managing keys via the KeyProvider API
 --

 Key: HADOOP-10177
 URL: https://issues.apache.org/jira/browse/HADOOP-10177
 Project: Hadoop Common
  Issue Type: Bug
  Components: security
Reporter: Owen O'Malley
Assignee: Larry McCay
 Fix For: 3.0.0

 Attachments: 10177-2.patch, 10177-3.patch, 10177.patch


 The KeyProvider API provides access to keys, but we need CLI tools to provide 
 the ability to create and delete keys. I'd think it would look something like:
 {code}
 % hadoop key create key1
 % hadoop key roll key1
 % hadoop key list key1
 % hadoop key delete key1
 {code}



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HADOOP-10587) Use a thread-local cache in TokenIdentifier#getBytes to avoid creating many DataOutputBuffer objects

2014-05-13 Thread Colin Patrick McCabe (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-10587?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13993079#comment-13993079
 ] 

Colin Patrick McCabe commented on HADOOP-10587:
---

Recently, I encountered an edit log with many {{DELEGATION_TOKEN_IDENTIFIER}} 
log entries.  Attempts to load this edit log failed with an out-of-memory 
error, occurring in {{TokenIdentifier#getBytes}}.  While the edit log seemed to 
have an unusually large number of these log entries due to a misconfiguration, I 
think it would also be useful to optimize {{TokenIdentifier#getBytes}} to 
reduce this memory consumption somewhat.
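
A minimal sketch of the idea, assuming the reusable buffer lives in a 
ThreadLocal inside TokenIdentifier and the result is trimmed to the written 
length (illustrative; not necessarily the exact patch):

{code}
import java.io.IOException;
import java.util.Arrays;
import org.apache.hadoop.io.DataOutputBuffer;

// Inside TokenIdentifier: one reusable DataOutputBuffer per thread, instead
// of allocating a fresh buffer on every getBytes() call.
private static final ThreadLocal<DataOutputBuffer> TOKEN_BUFFER =
    new ThreadLocal<DataOutputBuffer>() {
      @Override
      protected DataOutputBuffer initialValue() {
        return new DataOutputBuffer();
      }
    };

public byte[] getBytes() {
  DataOutputBuffer buf = TOKEN_BUFFER.get();
  buf.reset();                  // rewind the reused per-thread buffer
  try {
    write(buf);                 // TokenIdentifier serializes itself here
  } catch (IOException e) {
    throw new RuntimeException("i/o error serializing token identifier", e);
  }
  // copy out only the bytes actually written
  return Arrays.copyOf(buf.getData(), buf.getLength());
}
{code}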

 Use a thread-local cache in TokenIdentifier#getBytes to avoid creating many 
 DataOutputBuffer objects
 

 Key: HADOOP-10587
 URL: https://issues.apache.org/jira/browse/HADOOP-10587
 Project: Hadoop Common
  Issue Type: Improvement
Affects Versions: 2.4.0
Reporter: Colin Patrick McCabe
Assignee: Colin Patrick McCabe
Priority: Minor
 Attachments: HADOOP-10587.001.patch


 We can use a thread-local cache in TokenIdentifier#getBytes to avoid creating 
 many DataOutputBuffer objects.  This will reduce our memory usage (for 
 example, when loading edit logs), and help prevent OOMs.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HADOOP-10566) Refactor proxyservers out of ProxyUsers

2014-05-13 Thread Suresh Srinivas (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-10566?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13996618#comment-13996618
 ] 

Suresh Srinivas commented on HADOOP-10566:
--

I will commit this patch shortly.

 Refactor proxyservers out of ProxyUsers
 ---

 Key: HADOOP-10566
 URL: https://issues.apache.org/jira/browse/HADOOP-10566
 Project: Hadoop Common
  Issue Type: Sub-task
  Components: security
Affects Versions: 2.4.0
Reporter: Benoy Antony
Assignee: Benoy Antony
 Attachments: HADOOP-10566.patch, HADOOP-10566.patch, 
 HADOOP-10566.patch, HADOOP-10566.patch, HADOOP-10566.patch


 HADOOP-10498 added the proxyservers feature in ProxyUsers. It is beneficial 
 to treat this as a separate feature since:
 1. ProxyUsers is per proxyuser, whereas proxyservers is per cluster; the 
 cardinality is different. 
 2. ProxyUsers.authorize() and ProxyUsers.isproxyUser() are synchronized and 
 hence share the same lock, which impacts performance.
 Since these are two separate features, it will be an improvement to keep them 
 separate. It also enables one to fine-tune each feature independently.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Created] (HADOOP-10602) Documentation has broken Go Back hyperlinks.

2014-05-13 Thread Chris Nauroth (JIRA)
Chris Nauroth created HADOOP-10602:
--

 Summary: Documentation has broken Go Back hyperlinks.
 Key: HADOOP-10602
 URL: https://issues.apache.org/jira/browse/HADOOP-10602
 Project: Hadoop Common
  Issue Type: Bug
  Components: documentation
Affects Versions: 2.4.0, 3.0.0
Reporter: Chris Nauroth


Multiple pages of our documentation have Go Back links that are broken, 
because they point to an incorrect relative path.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HADOOP-10448) Support pluggable mechanism to specify proxy user settings

2014-05-13 Thread Suresh Srinivas (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-10448?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13996643#comment-13996643
 ] 

Suresh Srinivas commented on HADOOP-10448:
--

[~daryn], do you have any further comments?

 Support pluggable mechanism to specify proxy user settings
 --

 Key: HADOOP-10448
 URL: https://issues.apache.org/jira/browse/HADOOP-10448
 Project: Hadoop Common
  Issue Type: Sub-task
  Components: security
Affects Versions: 2.3.0
Reporter: Benoy Antony
Assignee: Benoy Antony
 Attachments: HADOOP-10448.patch, HADOOP-10448.patch, 
 HADOOP-10448.patch, HADOOP-10448.patch, HADOOP-10448.patch, 
 HADOOP-10448.patch, HADOOP-10448.patch, HADOOP-10448.patch, HADOOP-10448.patch


 We have a requirement to support a large number of superusers (users who 
 impersonate another user; see 
 http://hadoop.apache.org/docs/r1.2.1/Secure_Impersonation.html). 
 Currently each superuser needs to be defined in core-site.xml via proxyuser 
 settings. This becomes cumbersome when there are 1000 entries.
 It seems useful to have a pluggable mechanism to specify proxy user settings, 
 with the current approach as the default. 



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Created] (HADOOP-10600) AuthenticationFilterInitializer doesn't allow null signature secret file

2014-05-13 Thread Zhijie Shen (JIRA)
Zhijie Shen created HADOOP-10600:


 Summary: AuthenticationFilterInitializer doesn't allow null 
signature secret file
 Key: HADOOP-10600
 URL: https://issues.apache.org/jira/browse/HADOOP-10600
 Project: Hadoop Common
  Issue Type: Bug
Reporter: Zhijie Shen


AuthenticationFilterInitializer doesn't allow a null signature secret file. 
However, a null signature secret is acceptable in AuthenticationFilter, where a 
random signature secret is created instead.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HADOOP-10566) Refactor proxyservers out of ProxyUsers

2014-05-13 Thread Benoy Antony (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-10566?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13996692#comment-13996692
 ] 

Benoy Antony commented on HADOOP-10566:
---

jspWriterOutput is removed in this patch since it was an unused variable in 
TestJspHelper.
The references to jspWriterOutput were removed by HDFS-6252. The compilation 
error could be because HDFS-6252 is not merged. Is HDFS-6252 merged to 
branch-2?

I can create a patch which doesn't touch jspWriterOutput (that change is 
unrelated). Please let me know if that's the right approach.

 Refactor proxyservers out of ProxyUsers
 ---

 Key: HADOOP-10566
 URL: https://issues.apache.org/jira/browse/HADOOP-10566
 Project: Hadoop Common
  Issue Type: Sub-task
  Components: security
Affects Versions: 2.4.0
Reporter: Benoy Antony
Assignee: Benoy Antony
 Attachments: HADOOP-10566.patch, HADOOP-10566.patch, 
 HADOOP-10566.patch, HADOOP-10566.patch, HADOOP-10566.patch


 HADOOP-10498 added the proxyservers feature in ProxyUsers. It is beneficial 
 to treat this as a separate feature since:
 1. ProxyUsers is per proxyuser, whereas proxyservers is per cluster; the 
 cardinality is different. 
 2. ProxyUsers.authorize() and ProxyUsers.isproxyUser() are synchronized and 
 hence share the same lock, which impacts performance.
 Since these are two separate features, it will be an improvement to keep them 
 separate. It also enables one to fine-tune each feature independently.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HADOOP-10572) Example NFS mount command must pass noacl as it isn't supported by the server yet

2014-05-13 Thread Brandon Li (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-10572?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13996650#comment-13996650
 ] 

Brandon Li commented on HADOOP-10572:
-

Thank you, Harsh and Mark. I've committed the patch.

 Example NFS mount command must pass noacl as it isn't supported by the server 
 yet
 -

 Key: HADOOP-10572
 URL: https://issues.apache.org/jira/browse/HADOOP-10572
 Project: Hadoop Common
  Issue Type: Improvement
  Components: nfs
Affects Versions: 2.4.0
Reporter: Harsh J
Assignee: Harsh J
Priority: Trivial
 Fix For: 2.5.0

 Attachments: HADOOP-10572.patch


 Use of the documented default mount command results in the WARN log event 
 below on the server side, because the client tries to locate the ACL program 
 (#100227):
 {code}
 12:26:11.975 AM   TRACE   org.apache.hadoop.oncrpc.RpcCall
 Xid:-1114380537, messageType:RPC_CALL, rpcVersion:2, program:100227, 
 version:3, procedure:0, credential:(AuthFlavor:AUTH_NONE), 
 verifier:(AuthFlavor:AUTH_NONE)
 12:26:11.976 AM   TRACE   org.apache.hadoop.oncrpc.RpcProgram 
 NFS3 procedure #0
 12:26:11.976 AM   WARNorg.apache.hadoop.oncrpc.RpcProgram 
 Invalid RPC call program 100227
 {code}
 The client mount command must pass {{noacl}} to avoid this.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HADOOP-10602) Documentation has broken Go Back hyperlinks.

2014-05-13 Thread Chris Nauroth (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-10602?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris Nauroth updated HADOOP-10602:
---

Priority: Trivial  (was: Major)

 Documentation has broken Go Back hyperlinks.
 --

 Key: HADOOP-10602
 URL: https://issues.apache.org/jira/browse/HADOOP-10602
 Project: Hadoop Common
  Issue Type: Bug
  Components: documentation
Affects Versions: 3.0.0, 2.4.0
Reporter: Chris Nauroth
Priority: Trivial
  Labels: newbie

 Multiple pages of our documentation have Go Back links that are broken, 
 because they point to an incorrect relative path.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HADOOP-10389) Native RPCv9 client

2014-05-13 Thread Binglin Chang (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-10389?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13996017#comment-13996017
 ] 

Binglin Chang commented on HADOOP-10389:


bq. The rationale behind call id in general is that in some future version of 
the Java RPC system, we may want to allow multiple calls to be in flight at 
once
I guess I always thought this was already implemented, because the client can 
already make parallel calls and there are multiple rpc handler threads on the 
server side, so doing this should be natural and easy, although I haven't 
tested this. Are you sure about this? If so I can try to add this in java... 

bq. From the library user's perspective, they are calling hdfsOpen, hdfsClose, 
etc. etc.
So those methods all need to initialize hrpc_proxy again (which needs the 
server address, user and other configs). What I am trying to say is that maybe 
proxy and call can be separated: the proxy can be shared, with a call on the 
stack for each call. Maybe it's too late to change that, just my two cents.

bq. You just can't de-allocate the proxy while it is in use.
So there should be a method for the user to cancel an ongoing rpc (which also 
needs to make sure that after the cancel completes, there is no more memory 
access to hrpc_proxy and call); it looks like hrpc_proxy_deactivate can't do 
this yet?



 Native RPCv9 client
 ---

 Key: HADOOP-10389
 URL: https://issues.apache.org/jira/browse/HADOOP-10389
 Project: Hadoop Common
  Issue Type: Sub-task
Affects Versions: HADOOP-10388
Reporter: Binglin Chang
Assignee: Colin Patrick McCabe
 Attachments: HADOOP-10388.001.patch, HADOOP-10389.002.patch, 
 HADOOP-10389.004.patch, HADOOP-10389.005.patch






--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Created] (HADOOP-10598) Support configurable RPC fair share

2014-05-13 Thread Ming Ma (JIRA)
Ming Ma created HADOOP-10598:


 Summary: Support configurable RPC fair share
 Key: HADOOP-10598
 URL: https://issues.apache.org/jira/browse/HADOOP-10598
 Project: Hadoop Common
  Issue Type: Sub-task
Reporter: Ming Ma


It will be useful if we can support an RPC min fair share on a per-user or 
per-group basis. That will be useful for SLA jobs in a shared cluster. It will 
be complementary to the history-based soft policy defined in the fair queue's 
history RPC server.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HADOOP-10607) Create an API to separate Credentials/Password Storage from Applications

2014-05-13 Thread Larry McCay (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-10607?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Larry McCay updated HADOOP-10607:
-

Attachment: 10607.patch

Initial patch contribution

 Create an API to separate Credentials/Password Storage from Applications
 

 Key: HADOOP-10607
 URL: https://issues.apache.org/jira/browse/HADOOP-10607
 Project: Hadoop Common
  Issue Type: Bug
  Components: security
Reporter: Larry McCay
Assignee: Larry McCay
 Fix For: 3.0.0

 Attachments: 10607.patch


 As with the filesystem API, we need to provide a generic mechanism to support 
 multiple credential storage mechanisms that are potentially from third 
 parties. 
 We need the ability to eliminate the storage of passwords and secrets in 
 clear text within configuration files or within code.
 Toward that end, I propose an API that is configured using a list of URLs of 
 CredentialProviders. The implementation will look for implementations using 
 the ServiceLoader interface and thus support third party libraries.
 Two providers will be included in this patch: one using the credentials cache 
 in MapReduce jobs and the other using Java KeyStores from either HDFS or the 
 local file system. 
 A CredShell CLI will also be included in this patch, which provides the 
 ability to manage the credentials within the stores.
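
 As an illustration of the ServiceLoader mechanism described above, a hedged 
 sketch; the interface and method names below are assumptions, not the actual 
 API from this patch:

 {code}
 // Illustrative sketch only: the interface and method names here are
 // assumptions, not the actual HADOOP-10607 API.
 import java.net.URI;
 import java.util.ArrayList;
 import java.util.List;
 import java.util.ServiceLoader;

 interface CredentialProvider { /* getCredentialEntry(...), etc. */ }

 interface CredentialProviderFactory {
   // return null when this factory does not handle the URI's scheme
   CredentialProvider createProvider(URI providerUri);
 }

 class CredentialProviderLocator {
   // Resolve each configured provider URL against the factories registered
   // via META-INF/services (the ServiceLoader mechanism), so third-party
   // jars can contribute providers without code changes.
   static List<CredentialProvider> loadProviders(List<URI> providerUris) {
     List<CredentialProvider> result = new ArrayList<CredentialProvider>();
     for (URI uri : providerUris) {
       for (CredentialProviderFactory f
           : ServiceLoader.load(CredentialProviderFactory.class)) {
         CredentialProvider p = f.createProvider(uri);
         if (p != null) {
           result.add(p);       // first factory that handles the scheme wins
           break;
         }
       }
     }
     return result;
   }
 }
 {code}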



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Created] (HADOOP-10608) Support appending data in DistCp

2014-05-13 Thread Jing Zhao (JIRA)
Jing Zhao created HADOOP-10608:
--

 Summary: Support appending data in DistCp
 Key: HADOOP-10608
 URL: https://issues.apache.org/jira/browse/HADOOP-10608
 Project: Hadoop Common
  Issue Type: Improvement
Reporter: Jing Zhao
Assignee: Jing Zhao


Currently when doing distcp with the -update option, for two files with the 
same file name but different file length or checksum, we overwrite the whole 
file. It would be good if we could detect the case where (sourceFile = 
targetFile + appended_data) and only transfer the appended data segment to the 
target. This would be very useful for incremental distcp.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HADOOP-10607) Create an API to Separate Credentials/Password Storage from Applications

2014-05-13 Thread Larry McCay (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-10607?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Larry McCay updated HADOOP-10607:
-

Summary: Create an API to Separate Credentials/Password Storage from 
Applications  (was: Create an API to separate Credentials/Password Storage from 
Applications)

 Create an API to Separate Credentials/Password Storage from Applications
 

 Key: HADOOP-10607
 URL: https://issues.apache.org/jira/browse/HADOOP-10607
 Project: Hadoop Common
  Issue Type: New Feature
  Components: security
Reporter: Larry McCay
Assignee: Larry McCay
 Fix For: 3.0.0

 Attachments: 10607.patch


 As with the filesystem API, we need to provide a generic mechanism to support 
 multiple credential storage mechanisms that are potentially from third 
 parties. 
 We need the ability to eliminate the storage of passwords and secrets in 
 clear text within configuration files or within code.
 Toward that end, I propose an API that is configured using a list of URLs of 
 CredentialProviders. The implementation will look for implementations using 
 the ServiceLoader interface and thus support third party libraries.
 Two providers will be included in this patch: one using the credentials cache 
 in MapReduce jobs and the other using Java KeyStores from either HDFS or the 
 local file system. 
 A CredShell CLI will also be included in this patch, which provides the 
 ability to manage the credentials within the stores.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Assigned] (HADOOP-10489) UserGroupInformation#getTokens and UserGroupInformation#addToken can lead to ConcurrentModificationException

2014-05-13 Thread Robert Kanter (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-10489?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Kanter reassigned HADOOP-10489:
--

Assignee: Robert Kanter

 UserGroupInformation#getTokens and UserGroupInformation#addToken can lead to 
 ConcurrentModificationException
 

 Key: HADOOP-10489
 URL: https://issues.apache.org/jira/browse/HADOOP-10489
 Project: Hadoop Common
  Issue Type: Bug
Reporter: Jing Zhao
Assignee: Robert Kanter

 Currently UserGroupInformation#getTokens and UserGroupInformation#addToken 
 use UGI's monitor to protect the iteration and modification of 
 Credentials#tokenMap. Per 
 [discussion|https://issues.apache.org/jira/browse/HADOOP-10475?focusedCommentId=13965851&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13965851]
  in HADOOP-10475, this can still lead to ConcurrentModificationException.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HADOOP-10608) Support appending data in DistCp

2014-05-13 Thread Jing Zhao (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-10608?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13997212#comment-13997212
 ] 

Jing Zhao commented on HADOOP-10608:


If both the source FS and the target FS are HDFS, I think what we can do here 
is (see the sketch below):
# Check the lengths of the two files with the same name. 
# If the source file's length is greater than the target file's length, 
compare the checksums of their common-length prefix. 
# If the checksums match, copy only the difference, using position read and 
append functionality.
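
A hedged sketch of steps 1 and 2, assuming a (hypothetical) helper 
checksumOfPrefix that computes the checksum of only the first N bytes of a 
file, i.e. the new FileSystem API this approach calls for:

{code}
import java.io.IOException;
import org.apache.hadoop.fs.FileChecksum;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

// hypothetical helper: checksum of bytes [0, len) of the file (the proposed
// new FileSystem API; not an existing method)
abstract FileChecksum checksumOfPrefix(FileSystem fs, Path p, long len)
    throws IOException;

// Decide whether dst is a strict prefix of src, in which case only the tail
// of src needs to be copied (via position read on the source and append on
// the target).
boolean canAppend(FileSystem srcFs, Path src, FileSystem dstFs, Path dst)
    throws IOException {
  FileStatus s = srcFs.getFileStatus(src);
  FileStatus d = dstFs.getFileStatus(dst);
  if (s.getLen() <= d.getLen()) {
    return false;                       // source not longer: nothing to append
  }
  FileChecksum sc = checksumOfPrefix(srcFs, src, d.getLen());
  FileChecksum dc = checksumOfPrefix(dstFs, dst, d.getLen());
  return sc != null && sc.equals(dc);   // equal prefix => copy only the tail
}
{code}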

 Support appending data in DistCp
 

 Key: HADOOP-10608
 URL: https://issues.apache.org/jira/browse/HADOOP-10608
 Project: Hadoop Common
  Issue Type: Improvement
Reporter: Jing Zhao
Assignee: Jing Zhao

 Currently when doing distcp with the -update option, for two files with the 
 same file name but different file length or checksum, we overwrite the whole 
 file. It would be good if we could detect the case where (sourceFile = 
 targetFile + appended_data) and only transfer the appended data segment to 
 the target. This would be very useful for incremental distcp.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HADOOP-10607) Create an API to separate Credentials/Password Storage from Applications

2014-05-13 Thread Larry McCay (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-10607?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13997196#comment-13997196
 ] 

Larry McCay commented on HADOOP-10607:
--

The credential provider API follows the same general provider-SPI pattern as 
the key provider API.

 Create an API to separate Credentials/Password Storage from Applications
 

 Key: HADOOP-10607
 URL: https://issues.apache.org/jira/browse/HADOOP-10607
 Project: Hadoop Common
  Issue Type: Bug
  Components: security
Reporter: Larry McCay
Assignee: Larry McCay
 Fix For: 3.0.0


 As with the filesystem API, we need to provide a generic mechanism to support 
 multiple credential storage mechanisms that are potentially from third 
 parties. 
 We need the ability to eliminate the storage of passwords and secrets in 
 clear text within configuration files or within code.
 Toward that end, I propose an API that is configured using a list of URLs of 
 CredentialProviders. The implementation will look for implementations using 
 the ServiceLoader interface and thus support third party libraries.
 Two providers will be included in this patch: one using the credentials cache 
 in MapReduce jobs and the other using Java KeyStores from either HDFS or the 
 local file system. 
 A CredShell CLI will also be included in this patch, which provides the 
 ability to manage the credentials within the stores.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HADOOP-10607) Create an API to separate Credentials/Password Storage from Applications

2014-05-13 Thread Larry McCay (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-10607?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Larry McCay updated HADOOP-10607:
-

Status: Patch Available  (was: Open)

 Create an API to separate Credentials/Password Storage from Applications
 

 Key: HADOOP-10607
 URL: https://issues.apache.org/jira/browse/HADOOP-10607
 Project: Hadoop Common
  Issue Type: Bug
  Components: security
Reporter: Larry McCay
Assignee: Larry McCay
 Fix For: 3.0.0

 Attachments: 10607.patch


 As with the filesystem API, we need to provide a generic mechanism to support 
 multiple credential storage mechanisms that are potentially from third 
 parties. 
 We need the ability to eliminate the storage of passwords and secrets in 
 clear text within configuration files or within code.
 Toward that end, I propose an API that is configured using a list of URLs of 
 CredentialProviders. The implementation will look for implementations using 
 the ServiceLoader interface and thus support third party libraries.
 Two providers will be included in this patch: one using the credentials cache 
 in MapReduce jobs and the other using Java KeyStores from either HDFS or the 
 local file system. 
 A CredShell CLI will also be included in this patch, which provides the 
 ability to manage the credentials within the stores.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HADOOP-10607) Create an API to separate Credentials/Password Storage from Applications

2014-05-13 Thread Larry McCay (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-10607?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Larry McCay updated HADOOP-10607:
-

Issue Type: New Feature  (was: Bug)

 Create an API to separate Credentials/Password Storage from Applications
 

 Key: HADOOP-10607
 URL: https://issues.apache.org/jira/browse/HADOOP-10607
 Project: Hadoop Common
  Issue Type: New Feature
  Components: security
Reporter: Larry McCay
Assignee: Larry McCay
 Fix For: 3.0.0

 Attachments: 10607.patch


 As with the filesystem API, we need to provide a generic mechanism to support 
 multiple credential storage mechanisms that are potentially from third 
 parties. 
 We need the ability to eliminate the storage of passwords and secrets in 
 clear text within configuration files or within code.
 Toward that end, I propose an API that is configured using a list of URLs of 
 CredentialProviders. The implementation will look for implementations using 
 the ServiceLoader interface and thus support third party libraries.
 Two providers will be included in this patch: one using the credentials cache 
 in MapReduce jobs and the other using Java KeyStores from either HDFS or the 
 local file system. 
 A CredShell CLI will also be included in this patch, which provides the 
 ability to manage the credentials within the stores.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HADOOP-10591) Compression codecs must use pooled direct buffers or deallocate direct buffers when stream is closed

2014-05-13 Thread Colin Patrick McCabe (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-10591?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13996789#comment-13996789
 ] 

Colin Patrick McCabe commented on HADOOP-10591:
---

We have two ways we could go on this one.  One is to implement a buffer pooling 
scheme.  Another is to manually free the direct buffers.

The buffer-pooling scheme initially might seem more attractive, but it's 
problematic.  We don't know that all the buffers we're creating will be the 
same size, so we end up with the same kind of problems you get when 
implementing {{malloc}}.  It's also unclear how long we should hang on to 
buffers when they're not in use.

Manually freeing the buffers is possible through a Sun-specific API.  We do 
this in a few other cases; for example, to {{munmap}} a memory segment.  This 
is probably the simpler route to go.
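
A minimal sketch of that route, using the Sun-specific cleaner (the same 
trick used for {{munmap}}); this is an illustration, not the eventual patch:

{code}
// Sketch of the Sun-specific route (works on Sun/Oracle JVMs before Java 9).
import java.nio.ByteBuffer;

static void freeDirectBuffer(ByteBuffer buf) {
  if (buf == null || !buf.isDirect()) {
    return;                                   // nothing to free
  }
  sun.misc.Cleaner cleaner = ((sun.nio.ch.DirectBuffer) buf).cleaner();
  if (cleaner != null) {
    cleaner.clean();                          // release native memory now
  }
}
{code}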

 Compression codecs must use pooled direct buffers or deallocate direct 
 buffers when stream is closed
 -

 Key: HADOOP-10591
 URL: https://issues.apache.org/jira/browse/HADOOP-10591
 Project: Hadoop Common
  Issue Type: Bug
Reporter: Hari Shreedharan

 Currently, direct buffers allocated by compression codecs like Gzip (which 
 allocates 2 direct buffers per instance) are not deallocated when the stream 
 is closed. For long-running processes that create a huge number of files, 
 these direct buffers are left hanging until a full GC, which may or may not 
 happen in a reasonable amount of time, especially if the process does not 
 use a whole lot of heap.
 Either these buffers should be pooled or they should be deallocated when the 
 stream is closed.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HADOOP-10596) HttpServer2 should apply the authentication filter to some urls instead of null

2014-05-13 Thread Haohui Mai (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-10596?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13996639#comment-13996639
 ] 

Haohui Mai commented on HADOOP-10596:
-

bq. params.put(AuthenticationFilter.AUTH_TYPE, kerberos);

Introducing the secret file is a much larger change. I think a better idea is 
to clean it up.

bq. ... all the web resources should be secured. Please let me know if it is 
not the case.

I think there is a configuration option to toggle whether the NN web UI can be 
accessed without spnego in secure mode.

 HttpServer2 should apply the authentication filter to some urls instead of 
 null
 ---

 Key: HADOOP-10596
 URL: https://issues.apache.org/jira/browse/HADOOP-10596
 Project: Hadoop Common
  Issue Type: Bug
Reporter: Zhijie Shen
Assignee: Zhijie Shen
 Attachments: HADOOP-10596.1.patch


 HttpServer2 should apply the authentication filter to some urls instead of 
 null. In addition, it should be more flexible for users to configure SPNEGO.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HADOOP-10607) Create an API to separate Credentials/Password Storage from Applications

2014-05-13 Thread Larry McCay (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-10607?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Larry McCay updated HADOOP-10607:
-

Description: 
As with the filesystem API, we need to provide a generic mechanism to support 
multiple credential storage mechanisms that are potentially from third parties. 

We need the ability to eliminate the storage of passwords and secrets in clear 
text within configuration files or within code.

Toward that end, I propose an API that is configured using a list of URLs of 
CredentialProviders. The implementation will look for implementations using the 
ServiceLoader interface and thus support third party libraries.

Two providers will be included in this patch: one using the credentials cache 
in MapReduce jobs and the other using Java KeyStores from either HDFS or the 
local file system. 

A CredShell CLI will also be included in this patch, which provides the 
ability to manage the credentials within the stores.



  was:
As with the filesystem API, we need to provide a generic mechanism to support 
multiple credential storage mechanisms that are potentially from third parties. 

We need the ability to eliminate the storage of passwords and secrets in clear 
text within configuration files or within code.

Toward that end, I propose an API that is configured using a list of URLs of 
CredentialProviders. The implementation will look for implementations using the 
ServiceLoader interface and thus support third party libraries.

Two providers will be included in this patch: one using the credentials cache 
in MapReduce jobs and the other using Java KeyStores from either HDFS or the 
local file system. 




 Create an API to separate Credentials/Password Storage from Applications
 

 Key: HADOOP-10607
 URL: https://issues.apache.org/jira/browse/HADOOP-10607
 Project: Hadoop Common
  Issue Type: Bug
  Components: security
Reporter: Larry McCay
Assignee: Larry McCay
 Fix For: 3.0.0


 As with the filesystem API, we need to provide a generic mechanism to support 
 multiple credential storage mechanisms that are potentially from third 
 parties. 
 We need the ability to eliminate the storage of passwords and secrets in 
 clear text within configuration files or within code.
 Toward that end, I propose an API that is configured using a list of URLs of 
 CredentialProviders. The implementation will look for implementations using 
 the ServiceLoader interface and thus support third party libraries.
 Two providers will be included in this patch: one using the credentials cache 
 in MapReduce jobs and the other using Java KeyStores from either HDFS or the 
 local file system. 
 A CredShell CLI will also be included in this patch, which provides the 
 ability to manage the credentials within the stores.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Issue Comment Deleted] (HADOOP-10585) Retry policies ignore interrupted exceptions

2014-05-13 Thread Kihwal Lee (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-10585?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kihwal Lee updated HADOOP-10585:


Comment: was deleted

(was: +1 The patch looks straightforward. )

 Retry policies ignore interrupted exceptions
 ---

 Key: HADOOP-10585
 URL: https://issues.apache.org/jira/browse/HADOOP-10585
 Project: Hadoop Common
  Issue Type: Bug
  Components: ipc
Affects Versions: 2.0.0-alpha, 3.0.0
Reporter: Daryn Sharp
Assignee: Daryn Sharp
Priority: Critical
 Attachments: HADOOP-10585.patch


 Retry policies should not use {{ThreadUtil.sleepAtLeastIgnoreInterrupts}}.  
 This prevents {{FsShell}} commands from being aborted during retries.  It 
 also causes orphaned webhdfs DN DFSClients to keep running after the webhdfs 
 client closes the connection.  Jetty goes into a loop constantly sending 
 interrupts to the handler thread.  Webhdfs retries cause multiple nodes to 
 have these orphaned clients.  The DN cannot shut down until orphaned clients 
 complete.
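
 As an illustration of the fix direction, a hedged sketch of a retry sleep 
 that honors interrupts instead of swallowing them (names are illustrative):

 {code}
 import java.io.InterruptedIOException;

 // Sketch: an interruptible retry sleep. Unlike
 // ThreadUtil.sleepAtLeastIgnoreInterrupts, an interrupt aborts the retry
 // instead of being swallowed.
 static void retrySleep(long millis) throws InterruptedIOException {
   try {
     Thread.sleep(millis);
   } catch (InterruptedException e) {
     Thread.currentThread().interrupt();   // preserve the interrupt status
     throw new InterruptedIOException("retry interrupted");
   }
 }
 {code}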



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Resolved] (HADOOP-10151) Implement a Buffer-Based Cipher InputStream and OutputStream

2014-05-13 Thread Yi Liu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-10151?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yi Liu resolved HADOOP-10151.
-

Resolution: Won't Fix

 Implement a Buffer-Based Cipher InputStream and OutputStream
 

 Key: HADOOP-10151
 URL: https://issues.apache.org/jira/browse/HADOOP-10151
 Project: Hadoop Common
  Issue Type: Sub-task
  Components: security
Affects Versions: 3.0.0
Reporter: Yi Liu
Assignee: Yi Liu
  Labels: rhino
 Fix For: 3.0.0

 Attachments: HADOOP-10151.patch


 Cipher InputStream and OutputStream are buffer-based, and the buffer is used 
 to cache the encrypted data or result.  Cipher InputStream is used to read 
 encrypted data, and the result is plain text.  Cipher OutputStream is used to 
 write plain data, and the result is encrypted data.  



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HADOOP-10151) Implement a Buffer-Based Cipher InputStream and OutputStream

2014-05-13 Thread Yi Liu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-10151?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yi Liu updated HADOOP-10151:


Issue Type: Task  (was: Sub-task)
Parent: (was: HADOOP-10150)

 Implement a Buffer-Based Cipher InputStream and OutputStream
 

 Key: HADOOP-10151
 URL: https://issues.apache.org/jira/browse/HADOOP-10151
 Project: Hadoop Common
  Issue Type: Task
  Components: security
Affects Versions: 3.0.0
Reporter: Yi Liu
Assignee: Yi Liu
  Labels: rhino
 Fix For: 3.0.0

 Attachments: HADOOP-10151.patch


 Cipher InputStream and OutputStream are buffer-based, and the buffer is used 
 to cache the encrypted data or result.  Cipher InputStream is used to read 
 encrypted data, and the result is plain text.  Cipher OutputStream is used to 
 write plain data, and the result is encrypted data.  



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HADOOP-10566) Refactor proxyservers out of ProxyUsers

2014-05-13 Thread Benoy Antony (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-10566?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13992934#comment-13992934
 ] 

Benoy Antony commented on HADOOP-10566:
---

[~wheat9], [~arpitagarwal], [~sureshms], could one of you please review and 
commit this patch?
There is already a +1 from [~daryn].
The earlier comments related to commits seem to be in error.

 Refactor proxyservers out of ProxyUsers
 ---

 Key: HADOOP-10566
 URL: https://issues.apache.org/jira/browse/HADOOP-10566
 Project: Hadoop Common
  Issue Type: Sub-task
  Components: security
Affects Versions: 2.4.0
Reporter: Benoy Antony
Assignee: Benoy Antony
 Attachments: HADOOP-10566.patch, HADOOP-10566.patch, 
 HADOOP-10566.patch, HADOOP-10566.patch, HADOOP-10566.patch


 HADOOP-10498 added the proxyservers feature in ProxyUsers. It is beneficial 
 to treat this as a separate feature since:
 1. ProxyUsers is per proxyuser, whereas proxyservers is per cluster; the 
 cardinality is different. 
 2. ProxyUsers.authorize() and ProxyUsers.isproxyUser() are synchronized and 
 hence share the same lock, which impacts performance.
 Since these are two separate features, it will be an improvement to keep them 
 separate. It also enables one to fine-tune each feature independently.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HADOOP-10587) Use a thread-local cache in TokenIdentifier#getBytes to avoid creating many DataOutputBuffer objects

2014-05-13 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-10587?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13993153#comment-13993153
 ] 

Hadoop QA commented on HADOOP-10587:


{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  
http://issues.apache.org/jira/secure/attachment/12644003/HADOOP-10587.001.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:red}-1 tests included{color}.  The patch doesn't appear to include 
any new or modified tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:red}-1 core tests{color}.  The patch failed these unit tests in 
hadoop-common-project/hadoop-common:

  org.apache.hadoop.ha.TestZKFailoverControllerStress

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-HADOOP-Build/3930//testReport/
Console output: 
https://builds.apache.org/job/PreCommit-HADOOP-Build/3930//console

This message is automatically generated.

 Use a thread-local cache in TokenIdentifier#getBytes to avoid creating many 
 DataOutputBuffer objects
 

 Key: HADOOP-10587
 URL: https://issues.apache.org/jira/browse/HADOOP-10587
 Project: Hadoop Common
  Issue Type: Improvement
Affects Versions: 2.4.0
Reporter: Colin Patrick McCabe
Assignee: Colin Patrick McCabe
Priority: Minor
 Attachments: HADOOP-10587.001.patch


 We can use a thread-local cache in TokenIdentifier#getBytes to avoid creating 
 many DataOutputBuffer objects.  This will reduce our memory usage (for 
 example, when loading edit logs), and help prevent OOMs.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HADOOP-10156) Define Buffer-based Encryptor/Decryptor interfaces and provide implementation for AES CTR.

2014-05-13 Thread Yi Liu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-10156?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yi Liu updated HADOOP-10156:


Issue Type: Task  (was: Sub-task)
Parent: (was: HADOOP-10150)

 Define Buffer-based Encryptor/Decryptor interfaces and provide implementation 
 for AES CTR.
 --

 Key: HADOOP-10156
 URL: https://issues.apache.org/jira/browse/HADOOP-10156
 Project: Hadoop Common
  Issue Type: Task
  Components: security
Affects Versions: 3.0.0
Reporter: Yi Liu
Assignee: Yi Liu
  Labels: rhino
 Fix For: 3.0.0

 Attachments: HADOOP-10156.patch


 Define encryptor and decryptor interfaces; they are buffer-based to improve 
 performance.  We use direct buffers to avoid byte copies between Java and 
 native code when there is a JNI call.  In this JIRA, AES CTR mode encryption 
 and decryption are implemented. 
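
 For reference, a minimal sketch of buffer-based AES-CTR on top of 
 javax.crypto, assuming direct ByteBuffers for input and output; key/IV 
 handling here is illustrative only:

 {code}
 import java.nio.ByteBuffer;
 import javax.crypto.Cipher;
 import javax.crypto.spec.IvParameterSpec;
 import javax.crypto.spec.SecretKeySpec;

 // Encrypt between two direct ByteBuffers with the JDK provider; a
 // JNI-backed provider can read/write direct buffers without copying
 // through byte[]. Caller flips `in` before and `out` after the call.
 static void encrypt(byte[] key, byte[] iv, ByteBuffer in, ByteBuffer out)
     throws Exception {
   Cipher cipher = Cipher.getInstance("AES/CTR/NoPadding");
   cipher.init(Cipher.ENCRYPT_MODE,
       new SecretKeySpec(key, "AES"), new IvParameterSpec(iv));
   cipher.update(in, out);   // CTR is a stream mode: no padding, same length
 }
 {code}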



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HADOOP-10608) Support appending data in DistCp

2014-05-13 Thread Jing Zhao (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-10608?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jing Zhao updated HADOOP-10608:
---

Attachment: HADOOP-10608.000.patch

Very initial patch to demonstrate the idea. The patch includes the following 
changes:
# add a new FileSystem API to get the file checksum for a given length (i.e., 
to calculate the checksum of the [0, length] part of the file)
# add a new option, -append, to distcp, which only works with -update and 
without -skipcrccheck
# use position read and append to copy the newly added data


 Support appending data in DistCp
 

 Key: HADOOP-10608
 URL: https://issues.apache.org/jira/browse/HADOOP-10608
 Project: Hadoop Common
  Issue Type: Improvement
Reporter: Jing Zhao
Assignee: Jing Zhao
 Attachments: HADOOP-10608.000.patch


 Currently when doing distcp with the -update option, for two files with the 
 same file name but different file length or checksum, we overwrite the whole 
 file. It would be good if we could detect the case where (sourceFile = 
 targetFile + appended_data) and only transfer the appended data segment to 
 the target. This would be very useful for incremental distcp.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Resolved] (HADOOP-10154) Provide cryptographic filesystem implementation and its data IO.

2014-05-13 Thread Yi Liu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-10154?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yi Liu resolved HADOOP-10154.
-

Resolution: Won't Fix

 Provide cryptographic filesystem implementation and its data IO.
 -

 Key: HADOOP-10154
 URL: https://issues.apache.org/jira/browse/HADOOP-10154
 Project: Hadoop Common
  Issue Type: Sub-task
  Components: security
Affects Versions: 3.0.0
Reporter: Yi Liu
Assignee: Yi Liu
  Labels: rhino
 Fix For: 3.0.0


 The JIRA includes the cryptographic filesystem data InputStream, which 
 extends FSDataInputStream, and OutputStream, which extends 
 FSDataOutputStream.  Implementation of the cryptographic file system is also 
 included in this JIRA.  



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HADOOP-10481) Fix new findbugs warnings in hadoop-auth

2014-05-13 Thread Swarnim Kulkarni (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-10481?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Swarnim Kulkarni updated HADOOP-10481:
--

Status: Patch Available  (was: Open)

 Fix new findbugs warnings in hadoop-auth
 

 Key: HADOOP-10481
 URL: https://issues.apache.org/jira/browse/HADOOP-10481
 Project: Hadoop Common
  Issue Type: Sub-task
Reporter: Haohui Mai
Assignee: Swarnim Kulkarni
  Labels: newbie
 Attachments: HADOOP-10481.1.patch.txt, HADOOP-10481.2.patch.txt


 The following findbugs warnings need to be fixed:
 {noformat}
 [INFO] --- findbugs-maven-plugin:2.5.3:check (default-cli) @ hadoop-auth ---
 [INFO] BugInstance size is 2
 [INFO] Error size is 0
 [INFO] Total bugs: 2
 [INFO] Found reliance on default encoding in 
 org.apache.hadoop.security.authentication.server.AuthenticationFilter.init(FilterConfig):
  String.getBytes() 
 [org.apache.hadoop.security.authentication.server.AuthenticationFilter] At 
 AuthenticationFilter.java:[lines 76-455]
 [INFO] Found reliance on default encoding in 
 org.apache.hadoop.security.authentication.util.Signer.computeSignature(String):
  String.getBytes() [org.apache.hadoop.security.authentication.util.Signer] 
 At Signer.java:[lines 34-96]
 {noformat}
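
 The usual fix for this class of warning is to name the charset explicitly 
 instead of relying on the platform default; a sketch, not the actual patch:

 {code}
 import java.nio.charset.Charset;

 // Explicit charset avoids the findbugs "reliance on default encoding"
 // warning from String.getBytes().
 static byte[] toUtf8Bytes(String s) {
   return s.getBytes(Charset.forName("UTF-8"));
 }
 {code}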



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Work started] (HADOOP-10479) Fix new findbugs warnings in hadoop-minikdc

2014-05-13 Thread Swarnim Kulkarni (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-10479?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on HADOOP-10479 started by Swarnim Kulkarni.

 Fix new findbugs warnings in hadoop-minikdc
 ---

 Key: HADOOP-10479
 URL: https://issues.apache.org/jira/browse/HADOOP-10479
 Project: Hadoop Common
  Issue Type: Sub-task
Reporter: Haohui Mai
Assignee: Swarnim Kulkarni
  Labels: newbie

 The following findbugs warnings need to be fixed:
 {noformat}
 [INFO] --- findbugs-maven-plugin:2.5.3:check (default-cli) @ hadoop-minikdc 
 ---
 [INFO] BugInstance size is 2
 [INFO] Error size is 0
 [INFO] Total bugs: 2
 [INFO] Found reliance on default encoding in 
 org.apache.hadoop.minikdc.MiniKdc.initKDCServer(): new 
 java.io.InputStreamReader(InputStream) [org.apache.hadoop.minikdc.MiniKdc] 
 At MiniKdc.java:[lines 112-557]
 [INFO] Found reliance on default encoding in 
 org.apache.hadoop.minikdc.MiniKdc.main(String[]): new 
 java.io.FileReader(File) [org.apache.hadoop.minikdc.MiniKdc] At 
 MiniKdc.java:[lines 112-557]
 {noformat}
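
 A corresponding sketch for the reader constructors: FileReader and the 
 one-argument InputStreamReader always use the platform default encoding, so 
 wrap a stream and pass the charset explicitly (a sketch, not the actual 
 patch):

 {code}
 import java.io.BufferedReader;
 import java.io.File;
 import java.io.FileInputStream;
 import java.io.IOException;
 import java.io.InputStreamReader;
 import java.nio.charset.Charset;

 // Replaces new FileReader(file): same stream, but with an explicit charset.
 static BufferedReader openUtf8(File f) throws IOException {
   return new BufferedReader(new InputStreamReader(
       new FileInputStream(f), Charset.forName("UTF-8")));
 }
 {code}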



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HADOOP-10389) Native RPCv9 client

2014-05-13 Thread Colin Patrick McCabe (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-10389?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13996762#comment-13996762
 ] 

Colin Patrick McCabe commented on HADOOP-10389:
---

bq. So those methods all need to initialize hrpc_proxy again (which needs the 
server address, user and other configs). What I am trying to say is that maybe 
proxy and call can be separated: the proxy can be shared, with a call on the 
stack for each call. Maybe it's too late to change that, just my two cents.

I think the performance is actually going to be pretty good, since we're just 
putting an object on the stack and doing some memory copying.  I have some code 
which implements the native filesystem which I will post soon... I think some 
of this will make more sense when you see how it gets used.

bq. So there should be a method for the user to cancel an ongoing rpc (which 
also needs to make sure that after the cancel completes, there is no more 
memory access to hrpc_proxy and call); it looks like hrpc_proxy_deactivate 
can't do this yet?

The most important use-case for cancelling an RPC is when shutting down the 
filesystem in {{hdfsClose}}.  We can already handle that by calling 
{{hrpc_messenger_shutdown}}, which will abort all in-progress RPCs.

bq. I thought more about this, adding timeout to call also works and seems like 
a better solution.

Yeah, I want to implement timeouts.  The two most important timeouts are how 
long we should wait for a response from the server and how long we should keep 
around an inactive connection.

 Native RPCv9 client
 ---

 Key: HADOOP-10389
 URL: https://issues.apache.org/jira/browse/HADOOP-10389
 Project: Hadoop Common
  Issue Type: Sub-task
Affects Versions: HADOOP-10388
Reporter: Binglin Chang
Assignee: Colin Patrick McCabe
 Attachments: HADOOP-10388.001.patch, HADOOP-10389.002.patch, 
 HADOOP-10389.004.patch, HADOOP-10389.005.patch






--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HADOOP-10562) Namenode exits on exception without printing stack trace in AbstractDelegationTokenSecretManager

2014-05-13 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-10562?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13992824#comment-13992824
 ] 

Hudson commented on HADOOP-10562:
-

FAILURE: Integrated in Hadoop-Mapreduce-trunk #1777 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1777/])
YARN-2018. TestClientRMService.testTokenRenewalWrongUser fails after 
HADOOP-10562. (Contributed by Ming Ma) (arp: 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1592783)
* /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/TestClientRMService.java


 Namenode exits on exception without printing stack trace in 
 AbstractDelegationTokenSecretManager
 

 Key: HADOOP-10562
 URL: https://issues.apache.org/jira/browse/HADOOP-10562
 Project: Hadoop Common
  Issue Type: Bug
Affects Versions: 1.2.1, 2.4.0
Reporter: Suresh Srinivas
Assignee: Suresh Srinivas
Priority: Critical
 Fix For: 3.0.0, 1.3.0, 2.5.0

 Attachments: HADOOP-10562.1.patch, HADOOP-10562.branch-1.1.patch, 
 HADOOP-10562.patch


 Not printing the stack trace makes debugging harder.



--
This message was sent by Atlassian JIRA
(v6.2#6252)