[jira] [Commented] (HADOOP-9845) Update protobuf to 2.5 from 2.4.x

2013-12-09 Thread Mikhail Antonov (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-9845?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13842958#comment-13842958
 ] 

Mikhail Antonov commented on HADOOP-9845:
-

So I'm guessing, 2.0.5/6 aren't getting this patch?

 Update protobuf to 2.5 from 2.4.x
 -

 Key: HADOOP-9845
 URL: https://issues.apache.org/jira/browse/HADOOP-9845
 Project: Hadoop Common
  Issue Type: Improvement
  Components: performance
Affects Versions: 2.0.5-alpha
Reporter: stack
Assignee: Alejandro Abdelnur
Priority: Blocker
 Fix For: 2.1.0-beta

 Attachments: HADOOP-9845.patch, HADOOP-9845.patch


 protobuf 2.5 is a bit faster with a new Parse to avoid a builder step and a 
 few other goodies that we'd like to take advantage of over in hbase 
 especially now we are all pb all the time.  Unfortunately the protoc 
 generated files are no longer compatible w/ 2.4.1 generated files.  Hadoop 
 uses 2.4.1 pb.  This latter fact makes it so we cannot upgrade until hadoop 
 does.
 This issue suggests hadoop2 move to protobuf 2.5.
 I can do the patch no prob. if there is interest.
 (When we upgraded our build broke with complaints like the below:
 {code}
 java.lang.UnsupportedOperationException: This is supposed to be overridden by 
 subclasses.
   at 
 com.google.protobuf.GeneratedMessage.getUnknownFields(GeneratedMessage.java:180)
   at 
 org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$GetDatanodeReportRequestProto.getSerializedSize(ClientNamenodeProtocolProtos.java:21566)
   at 
 com.google.protobuf.AbstractMessageLite.toByteString(AbstractMessageLite.java:49)
   at 
 org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.constructRpcRequest(ProtobufRpcEngine.java:149)
   at 
 org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:193)
   at com.sun.proxy.$Proxy14.getDatanodeReport(Unknown Source)
   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
   at 
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
   at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
   at java.lang.reflect.Method.invoke(Method.java:597)
   at 
 org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:164)
   at 
 org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:83)
   at com.sun.proxy.$Proxy14.getDatanodeReport(Unknown Source)
   at 
 org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getDatanodeReport(ClientNamenodeProtocolTranslatorPB.java:488)
   at org.apache.hadoop.hdfs.DFSClient.datanodeReport(DFSClient.java:1887)
   at 
 org.apache.hadoop.hdfs.MiniDFSCluster.waitActive(MiniDFSCluster.java:1798
 ...
 {code}
 More over in HBASE-8165 if interested.
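 For illustration, a minimal sketch of the builder-vs-parser difference in protobuf 2.5 (MyRequestProto is a placeholder message class, not one of Hadoop's generated protos):
{code}
// Sketch only: MyRequestProto stands in for any protobuf-generated message class.
static MyRequestProto decode(byte[] bytes)
    throws com.google.protobuf.InvalidProtocolBufferException {
  // protobuf 2.4.x style: deserialize through an intermediate Builder.
  MyRequestProto viaBuilder = MyRequestProto.newBuilder().mergeFrom(bytes).build();
  // protobuf 2.5 style: the generated PARSER parses directly, skipping the Builder step.
  MyRequestProto viaParser = MyRequestProto.PARSER.parseFrom(bytes);
  return viaParser;
}
{code}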



--
This message was sent by Atlassian JIRA
(v6.1.4#6159)


[jira] [Commented] (HADOOP-9639) truly shared cache for jars (jobjar/libjar)

2013-12-09 Thread Steve Loughran (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-9639?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13843068#comment-13843068
 ] 

Steve Loughran commented on HADOOP-9639:


Some quick comments on this

# The upload mechanism assumes that rename() is atomic. This should be spelled 
out, to avoid people trying to use blobstores as their cache infrastructure 
(a sketch of the pattern follows this list).
# Obviously: add a specific exception to indicate some kind of race condition.
# The shared cache enabled flags are obviously things that admins would want the 
right to set and make final in yarn-site.xml files, with clients handling this 
without problems.
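
To spell the rename() point out, a minimal sketch of the write-to-temp-then-rename publish step the design presumably relies on (SharedCacheUploaderSketch and publish() are illustrative names, not from the design doc):
{code}
import java.io.IOException;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

// Illustrative only: publish a resource into the cache by writing to a
// temporary path and then rename()-ing it into place. This is only safe if
// rename() is atomic, which HDFS provides but many blobstores do not.
class SharedCacheUploaderSketch {
  void publish(FileSystem fs, byte[] jarBytes, Path finalPath) throws IOException {
    Path tmp = new Path(finalPath.getParent(), finalPath.getName() + ".tmp");
    try (FSDataOutputStream out = fs.create(tmp, true)) {
      out.write(jarBytes);
    }
    // Visibility of finalPath flips in one step -- readers never see a partial file.
    if (!fs.rename(tmp, finalPath)) {
      throw new IOException("rename failed, cache entry may already exist: " + finalPath);
    }
  }
}
{code}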


Security: you also have to think about preserving the security of files I don't 
want to share with others, either by allowing me to mix cached with uncached 
files (e.g. keeping configuration resources with sensitive information out of 
the cache), or by not letting others in the cluster know what binaries I'm 
pushing around. Presumably clusters that care about such things will just 
disable the cache altogether, but there is the use case of a shared cache for 
most data with some private resources. If that use case is not to be supported, 
we should at least call it out.

co-ordination wise

# I (personally) think we should all just embrace the presence of a ZK quorum 
on the cluster as core infrastructure that HA systems need; it would stop 
everyone writing their own "let's use the filesystem to synchronize clients" 
scheme based on the assumption that FileSystem.create() with overwrite==false 
guarantees unique access (a sketch of that assumption is below). But that's 
just an opinion; I don't see that a side feature should force the action. At 
the same time, if the cache is optional, ZK could be made a prerequisite for 
caching. It would fundamentally change how confident we could be that the 
system is correct, even on filesystems that break the assumptions of POSIX 
more significantly than HDFS does.
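
For reference, the create()-as-lock idiom I mean looks roughly like this (CreateBasedLockSketch and tryAcquire() are illustrative names, not code from the design):
{code}
import java.io.IOException;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

// Sketch of the create()-as-lock idiom: whoever creates the marker file first
// "wins" the right to upload the resource. Correctness depends on create()
// with overwrite==false being atomic and exclusive, which holds on HDFS but is
// not guaranteed for every Hadoop-compatible filesystem.
class CreateBasedLockSketch {
  boolean tryAcquire(FileSystem fs, Path lockFile) {
    try {
      fs.create(lockFile, false).close();  // fails if the file already exists
      return true;                         // we created it first
    } catch (IOException probablyLostTheRace) {
      return false;                        // someone else got there, or a real IO error
    }
  }
}
{code}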

* 
[HADOOP-9361|https://github.com/steveloughran/hadoop-trunk/tree/stevel/HADOOP-9361-filesystem-contract/hadoop-common-project/hadoop-common/src/site/markdown/filesystem]
 is attempting to formally define the semantics of a Hadoop-compatible 
filesystem. If you could use that as the foundation assumptions, perhaps even 
the 
[notation|https://github.com/steveloughran/hadoop-trunk/blob/stevel/HADOOP-9361-filesystem-contract/hadoop-common-project/hadoop-common/src/site/markdown/filesystem/notation.md]
 for defining your own behavior, the analysis on p7 could be proved more 
rigorously.

* The semantics of {{happens-before}} come from [Lamport78], [Time, Clocks 
and the Ordering of Events in a Distributed 
System|http://research.microsoft.com/en-us/um/people/lamport/pubs/time-clocks.pdf],
 so that should be used as the citation; it is more appropriate than the memory 
models of Java or out-of-order CPUs.

* Script-wise, I've been evolving a [generic YARN service 
launcher|https://github.com/hortonworks/hoya/tree/master/hoya-core/src/main/java/org/apache/hadoop/yarn/service/launcher],
 which is nearly ready to submit as [YARN-679]: if the cleaner service were 
implemented as a YARN service it could be invoked as a run-once command line, 
or deployed in a YARN container service which provided cron-like services.

 truly shared cache for jars (jobjar/libjar)
 ---

 Key: HADOOP-9639
 URL: https://issues.apache.org/jira/browse/HADOOP-9639
 Project: Hadoop Common
  Issue Type: New Feature
  Components: filecache
Affects Versions: 2.0.4-alpha
Reporter: Sangjin Lee
Assignee: Sangjin Lee
 Attachments: shared_cache_design.pdf, shared_cache_design_v2.pdf, 
 shared_cache_design_v3.pdf, shared_cache_design_v4.pdf


 Currently there is the distributed cache that enables you to cache jars and 
 files so that attempts from the same job can reuse them. However, sharing is 
 limited with the distributed cache because it is normally on a per-job basis. 
 On a large cluster, sometimes copying of jobjars and libjars becomes so 
 prevalent that it consumes a large portion of the network bandwidth, not to 
 speak of defeating the purpose of bringing compute to where data is. This 
 is wasteful because in most cases code doesn't change much across many jobs.
 I'd like to propose and discuss feasibility of introducing a truly shared 
 cache so that multiple jobs from multiple users can share and cache jars. 
 This JIRA is to open the discussion.



--
This message was sent by Atlassian JIRA
(v6.1.4#6159)


[jira] [Moved] (HADOOP-10150) Hadoop cryptographic file system

2013-12-09 Thread Yi Liu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-10150?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yi Liu moved HDFS-5143 to HADOOP-10150:
---

        Component/s: security  (was: security)
      Fix Version/s: 3.0.0  (was: 3.0.0)
   Target Version/s: 3.0.0  (was: 3.0.0)
  Affects Version/s: 3.0.0  (was: 3.0.0)
                Key: HADOOP-10150  (was: HDFS-5143)
            Project: Hadoop Common  (was: Hadoop HDFS)

 Hadoop cryptographic file system
 

 Key: HADOOP-10150
 URL: https://issues.apache.org/jira/browse/HADOOP-10150
 Project: Hadoop Common
  Issue Type: New Feature
  Components: security
Affects Versions: 3.0.0
Reporter: Yi Liu
Assignee: Yi Liu
  Labels: rhino
 Fix For: 3.0.0

 Attachments: CryptographicFileSystem.patch, HADOOP cryptographic file 
 system.pdf


 There is an increasing need for securing data when Hadoop customers use 
 various upper layer applications, such as Map-Reduce, Hive, Pig, HBase and so 
 on.
 HADOOP CFS (HADOOP Cryptographic File System) is used to secure data, based 
 on HADOOP “FilterFileSystem” decorating DFS or other file systems, and 
 transparent to upper layer applications. It’s configurable, scalable and fast.
 High level requirements:
 1. Transparent to, and no modification required for, upper layer applications.
 2. “Seek” and “PositionedReadable” are supported for the input stream of CFS 
 if the wrapped file system supports them.
 3. Very high performance for encryption and decryption; they will not become 
 a bottleneck.
 4. Can decorate HDFS and all other file systems in Hadoop, and will not modify 
 the existing structure of the file system, such as the namenode and datanode 
 structure, if the wrapped file system is HDFS.
 5. Admins can configure encryption policies, such as which directories will be 
 encrypted.
 6. A robust key management framework.
 7. Support Pread and append operations if the wrapped file system supports them.
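
As a rough illustration of the decorating approach (CryptoFileSystemSketch is a placeholder class, not the actual CFS patch):
{code}
import java.io.IOException;
import org.apache.hadoop.fs.FSDataInputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.FilterFileSystem;
import org.apache.hadoop.fs.Path;

// Placeholder sketch of the decoration idea: a FilterFileSystem that would
// decrypt on read (and encrypt on write), leaving the wrapped file system
// (e.g. HDFS) untouched.
class CryptoFileSystemSketch extends FilterFileSystem {
  CryptoFileSystemSketch(FileSystem wrapped) {
    super(wrapped);
  }

  @Override
  public FSDataInputStream open(Path f, int bufferSize) throws IOException {
    FSDataInputStream raw = fs.open(f, bufferSize);
    // A real implementation would wrap 'raw' in a decrypting stream that still
    // implements Seekable/PositionedReadable; returning it unchanged here just
    // shows where the decoration hooks in.
    return raw;
  }
}
{code}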



--
This message was sent by Atlassian JIRA
(v6.1.4#6159)


[jira] [Commented] (HADOOP-10150) Hadoop cryptographic file system

2013-12-09 Thread Yi Liu (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-10150?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13843097#comment-13843097
 ] 

Yi Liu commented on HADOOP-10150:
-

Hi Owen, thanks for bringing it up here. I am working on breaking down the 
patches and creating sub-task JIRAs, as already mentioned in my previous 
response. The rest of your comment seems to be about a different JIRA and is 
probably best discussed on that JIRA.

* HADOOP-10149: since I have that patch already implemented, do you mind 
assigning it to me? I will take that piece of code and apply it there for review.
* Since HADOOP-10141 tries to improve on HADOOP-9333, why not provide your 
feedback on HADOOP-9333 instead of opening a JIRA that duplicates part of that 
work?


 Hadoop cryptographic file system
 

 Key: HADOOP-10150
 URL: https://issues.apache.org/jira/browse/HADOOP-10150
 Project: Hadoop Common
  Issue Type: New Feature
  Components: security
Affects Versions: 3.0.0
Reporter: Yi Liu
Assignee: Yi Liu
  Labels: rhino
 Fix For: 3.0.0

 Attachments: CryptographicFileSystem.patch, HADOOP cryptographic file 
 system.pdf


 There is an increasing need for securing data when Hadoop customers use 
 various upper layer applications, such as Map-Reduce, Hive, Pig, HBase and so 
 on.
 HADOOP CFS (HADOOP Cryptographic File System) is used to secure data, based 
 on HADOOP “FilterFileSystem” decorating DFS or other file systems, and 
 transparent to upper layer applications. It’s configurable, scalable and fast.
 High level requirements:
 1. Transparent to, and no modification required for, upper layer applications.
 2. “Seek” and “PositionedReadable” are supported for the input stream of CFS 
 if the wrapped file system supports them.
 3. Very high performance for encryption and decryption; they will not become 
 a bottleneck.
 4. Can decorate HDFS and all other file systems in Hadoop, and will not modify 
 the existing structure of the file system, such as the namenode and datanode 
 structure, if the wrapped file system is HDFS.
 5. Admins can configure encryption policies, such as which directories will be 
 encrypted.
 6. A robust key management framework.
 7. Support Pread and append operations if the wrapped file system supports them.



--
This message was sent by Atlassian JIRA
(v6.1.4#6159)


[jira] [Commented] (HADOOP-10150) Hadoop cryptographic file system

2013-12-09 Thread Yi Liu (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-10150?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13843098#comment-13843098
 ] 

Yi Liu commented on HADOOP-10150:
-

Thanks Uma, I am working on breaking down the patches and creating sub-task 
JIRAs. I will convert this JIRA to the Common project.

 Hadoop cryptographic file system
 

 Key: HADOOP-10150
 URL: https://issues.apache.org/jira/browse/HADOOP-10150
 Project: Hadoop Common
  Issue Type: New Feature
  Components: security
Affects Versions: 3.0.0
Reporter: Yi Liu
Assignee: Yi Liu
  Labels: rhino
 Fix For: 3.0.0

 Attachments: CryptographicFileSystem.patch, HADOOP cryptographic file 
 system.pdf


 There is an increasing need for securing data when Hadoop customers use 
 various upper layer applications, such as Map-Reduce, Hive, Pig, HBase and so 
 on.
 HADOOP CFS (HADOOP Cryptographic File System) is used to secure data, based 
 on HADOOP “FilterFileSystem” decorating DFS or other file systems, and 
 transparent to upper layer applications. It’s configurable, scalable and fast.
 High level requirements:
 1. Transparent to, and no modification required for, upper layer applications.
 2. “Seek” and “PositionedReadable” are supported for the input stream of CFS 
 if the wrapped file system supports them.
 3. Very high performance for encryption and decryption; they will not become 
 a bottleneck.
 4. Can decorate HDFS and all other file systems in Hadoop, and will not modify 
 the existing structure of the file system, such as the namenode and datanode 
 structure, if the wrapped file system is HDFS.
 5. Admins can configure encryption policies, such as which directories will be 
 encrypted.
 6. A robust key management framework.
 7. Support Pread and append operations if the wrapped file system supports them.



--
This message was sent by Atlassian JIRA
(v6.1.4#6159)


[jira] [Updated] (HADOOP-10151) Implement a Buffer-Based Cipher InputStream and OutputStream

2013-12-09 Thread Yi Liu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-10151?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yi Liu updated HADOOP-10151:


Description: The cipher InputStream and OutputStream are buffer-based; the 
buffer is used to cache the encrypted data or result. The cipher InputStream is 
used to read encrypted data, and the result is plain text. The cipher 
OutputStream is used to write plain data, and the result is encrypted data.
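
A minimal sketch of the buffer-based read path being described, assuming a stream-mode cipher such as AES/CTR (BufferedDecryptInputStream is a hypothetical name, not the class in the patch):
{code}
import java.io.IOException;
import java.io.InputStream;
import javax.crypto.Cipher;

// Hypothetical sketch: read ciphertext into an internal buffer, decrypt it a
// chunk at a time with a Cipher already initialized in DECRYPT_MODE, and serve
// plain text to the caller. A stream mode such as AES/CTR/NoPadding is assumed,
// so update() yields output for every chunk and no doFinal()/padding handling
// is shown.
class BufferedDecryptInputStream extends InputStream {
  private final InputStream in;
  private final Cipher cipher;                     // pre-initialized for decryption
  private final byte[] encrypted = new byte[8192]; // ciphertext buffer
  private byte[] plain = new byte[0];              // decrypted result buffer
  private int pos = 0;

  BufferedDecryptInputStream(InputStream in, Cipher cipher) {
    this.in = in;
    this.cipher = cipher;
  }

  @Override
  public int read() throws IOException {
    if (pos >= plain.length && !refill()) {
      return -1;                                   // end of stream
    }
    return plain[pos++] & 0xff;
  }

  private boolean refill() throws IOException {
    int n;
    while ((n = in.read(encrypted)) >= 0) {
      byte[] out = cipher.update(encrypted, 0, n); // decrypt this chunk
      if (out != null && out.length > 0) {
        plain = out;
        pos = 0;
        return true;
      }
      // a block-mode cipher may buffer internally; keep reading in that case
    }
    return false;                                  // underlying stream exhausted
  }
}
{code}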

 Implement a Buffer-Based Cipher InputStream and OutputStream
 

 Key: HADOOP-10151
 URL: https://issues.apache.org/jira/browse/HADOOP-10151
 Project: Hadoop Common
  Issue Type: Sub-task
  Components: security
Affects Versions: 3.0.0
Reporter: Yi Liu
Assignee: Yi Liu
 Fix For: 3.0.0


 The cipher InputStream and OutputStream are buffer-based; the buffer is used 
 to cache the encrypted data or result. The cipher InputStream is used to read 
 encrypted data, and the result is plain text. The cipher OutputStream is used 
 to write plain data, and the result is encrypted data.



--
This message was sent by Atlassian JIRA
(v6.1.4#6159)


[jira] [Created] (HADOOP-10151) Implement a Buffer-Based Cipher InputStream and OutputStream

2013-12-09 Thread Yi Liu (JIRA)
Yi Liu created HADOOP-10151:
---

 Summary: Implement a Buffer-Based Cipher InputStream and 
OutputStream
 Key: HADOOP-10151
 URL: https://issues.apache.org/jira/browse/HADOOP-10151
 Project: Hadoop Common
  Issue Type: Sub-task
  Components: security
Affects Versions: 3.0.0
Reporter: Yi Liu
Assignee: Yi Liu
 Fix For: 3.0.0






--
This message was sent by Atlassian JIRA
(v6.1.4#6159)


[jira] [Created] (HADOOP-10152) Distributed file cipher InputStream and OutputStream which provide 1:1 mapping of plain text data and cipher data.

2013-12-09 Thread Yi Liu (JIRA)
Yi Liu created HADOOP-10152:
---

 Summary: Distributed file cipher InputStream and OutputStream 
which provide 1:1 mapping of plain text data and cipher data.
 Key: HADOOP-10152
 URL: https://issues.apache.org/jira/browse/HADOOP-10152
 Project: Hadoop Common
  Issue Type: Sub-task
  Components: security
Affects Versions: 3.0.0
Reporter: Yi Liu
Assignee: Yi Liu
 Fix For: 3.0.0






--
This message was sent by Atlassian JIRA
(v6.1.4#6159)


[jira] [Updated] (HADOOP-10152) Distributed file cipher InputStream and OutputStream which provide 1:1 mapping of plain text data and cipher data.

2013-12-09 Thread Yi Liu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-10152?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yi Liu updated HADOOP-10152:


Description: To make it easy to seek and do positioned reads on a distributed 
file, the length of the encrypted file should be the same as the length of the 
plain file, and the positions should have a 1:1 mapping. So in this JIRA we 
define a distributed file cipher InputStream (FSDecryptorStream) and 
OutputStream (FSEncryptorStream). The distributed file cipher InputStream is 
seekable and positioned-readable. This JIRA is different from HADOOP-10151: the 
file may be read and written many times and on multiple nodes.
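
For intuition, a sketch of how a counter-mode cipher gives the 1:1 length mapping and positioned reads described above (CtrPositionedReadSketch is illustrative only, not the FSDecryptorStream implementation):
{code}
import java.security.GeneralSecurityException;
import javax.crypto.Cipher;
import javax.crypto.spec.IvParameterSpec;
import javax.crypto.spec.SecretKeySpec;

// Illustrative only: AES/CTR keeps the ciphertext exactly as long as the plain
// text, so byte N of the encrypted file corresponds to byte N of the plain
// file. Decrypting from an arbitrary offset just means starting the counter at
// block (offset / 16) and discarding (offset % 16) leading bytes.
class CtrPositionedReadSketch {
  private static final int BLOCK = 16;

  static byte[] decryptAt(byte[] key, byte[] baseIv, byte[] cipherText, long offset)
      throws GeneralSecurityException {
    // Advance the 128-bit counter by the number of whole blocks before 'offset'.
    byte[] iv = baseIv.clone();
    addCounter(iv, offset / BLOCK);

    Cipher cipher = Cipher.getInstance("AES/CTR/NoPadding");
    cipher.init(Cipher.DECRYPT_MODE, new SecretKeySpec(key, "AES"), new IvParameterSpec(iv));

    // Pad the front so the ciphertext lines up with the block boundary, then
    // throw the padding away after decryption.
    int skip = (int) (offset % BLOCK);
    byte[] padded = new byte[skip + cipherText.length];
    System.arraycopy(cipherText, 0, padded, skip, cipherText.length);
    byte[] plainPadded = cipher.doFinal(padded);

    byte[] plain = new byte[cipherText.length];
    System.arraycopy(plainPadded, skip, plain, 0, cipherText.length);
    return plain;
  }

  // Treat the 16-byte IV as a big-endian counter and add 'delta' blocks to it.
  private static void addCounter(byte[] iv, long delta) {
    for (int i = iv.length - 1; i >= 0 && delta != 0; i--) {
      long sum = (iv[i] & 0xffL) + (delta & 0xffL);
      iv[i] = (byte) sum;
      delta = (delta >>> 8) + (sum >>> 8);
    }
  }
}
{code}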

 Distributed file cipher InputStream and OutputStream which provide 1:1 
 mapping of plain text data and cipher data.
 --

 Key: HADOOP-10152
 URL: https://issues.apache.org/jira/browse/HADOOP-10152
 Project: Hadoop Common
  Issue Type: Sub-task
  Components: security
Affects Versions: 3.0.0
Reporter: Yi Liu
Assignee: Yi Liu
 Fix For: 3.0.0


 To make it easy to seek and do positioned reads on a distributed file, the 
 length of the encrypted file should be the same as the length of the plain 
 file, and the positions should have a 1:1 mapping. So in this JIRA we define a 
 distributed file cipher InputStream (FSDecryptorStream) and OutputStream 
 (FSEncryptorStream). The distributed file cipher InputStream is seekable and 
 positioned-readable. This JIRA is different from HADOOP-10151: the file may be 
 read and written many times and on multiple nodes.



--
This message was sent by Atlassian JIRA
(v6.1.4#6159)


[jira] [Updated] (HADOOP-10153) Define Crypto policy interfaces and provide its default implementation.

2013-12-09 Thread Yi Liu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-10153?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yi Liu updated HADOOP-10153:


Labels: rhino  (was: )

 Define Crypto policy interfaces and provide its default implementation.
 ---

 Key: HADOOP-10153
 URL: https://issues.apache.org/jira/browse/HADOOP-10153
 Project: Hadoop Common
  Issue Type: Sub-task
  Components: security
Affects Versions: 3.0.0
Reporter: Yi Liu
Assignee: Yi Liu
  Labels: rhino
 Fix For: 3.0.0


 This JIRA defines the crypto policy interface; developers/users can implement 
 their own crypto policy to decide how files/directories are encrypted. This 
 JIRA also includes a default implementation.



--
This message was sent by Atlassian JIRA
(v6.1.4#6159)


[jira] [Updated] (HADOOP-10151) Implement a Buffer-Based Cipher InputStream and OutputStream

2013-12-09 Thread Yi Liu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-10151?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yi Liu updated HADOOP-10151:


Labels: rhino  (was: )

 Implement a Buffer-Based Cipher InputStream and OutputStream
 

 Key: HADOOP-10151
 URL: https://issues.apache.org/jira/browse/HADOOP-10151
 Project: Hadoop Common
  Issue Type: Sub-task
  Components: security
Affects Versions: 3.0.0
Reporter: Yi Liu
Assignee: Yi Liu
  Labels: rhino
 Fix For: 3.0.0


 The cipher InputStream and OutputStream are buffer-based; the buffer is used 
 to cache the encrypted data or result. The cipher InputStream is used to read 
 encrypted data, and the result is plain text. The cipher OutputStream is used 
 to write plain data, and the result is encrypted data.



--
This message was sent by Atlassian JIRA
(v6.1.4#6159)


[jira] [Updated] (HADOOP-10153) Define Crypto policy interfaces and provide its default implementation.

2013-12-09 Thread Yi Liu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-10153?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yi Liu updated HADOOP-10153:


Description: This JIRA defines the crypto policy interface; developers/users 
can implement their own crypto policy to decide how files/directories are 
encrypted. This JIRA also includes a default implementation.
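
A rough sketch of what such a policy interface and a default implementation might look like (CryptoPolicy and DirectoryCryptoPolicy are hypothetical names, not taken from the patch):
{code}
import org.apache.hadoop.fs.Path;

// Hypothetical shape of a crypto policy: the file system consults it to decide
// whether a path should be encrypted and with which algorithm/key.
interface CryptoPolicy {
  /** @return true if data written under this path should be encrypted. */
  boolean isEncrypted(Path path);

  /** @return cipher transformation to use, e.g. "AES/CTR/NoPadding". */
  String getCipherSuite(Path path);

  /** @return name of the key (resolved by the key management framework) for this path. */
  String getKeyName(Path path);
}

// A trivial default: encrypt everything under one configured directory.
class DirectoryCryptoPolicy implements CryptoPolicy {
  private final Path encryptedRoot;

  DirectoryCryptoPolicy(Path encryptedRoot) {
    this.encryptedRoot = encryptedRoot;
  }

  @Override
  public boolean isEncrypted(Path path) {
    for (Path p = path; p != null; p = p.getParent()) {
      if (p.equals(encryptedRoot)) {
        return true;
      }
    }
    return false;
  }

  @Override
  public String getCipherSuite(Path path) {
    return "AES/CTR/NoPadding";
  }

  @Override
  public String getKeyName(Path path) {
    return "default";
  }
}
{code}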

 Define Crypto policy interfaces and provide its default implementation.
 ---

 Key: HADOOP-10153
 URL: https://issues.apache.org/jira/browse/HADOOP-10153
 Project: Hadoop Common
  Issue Type: Sub-task
  Components: security
Affects Versions: 3.0.0
Reporter: Yi Liu
Assignee: Yi Liu
  Labels: rhino
 Fix For: 3.0.0


 This JIRA defines the crypto policy interface; developers/users can implement 
 their own crypto policy to decide how files/directories are encrypted. This 
 JIRA also includes a default implementation.



--
This message was sent by Atlassian JIRA
(v6.1.4#6159)


[jira] [Updated] (HADOOP-10152) Distributed file cipher InputStream and OutputStream which provide 1:1 mapping of plain text data and cipher data.

2013-12-09 Thread Yi Liu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-10152?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yi Liu updated HADOOP-10152:


Labels: rhino  (was: )

 Distributed file cipher InputStream and OutputStream which provide 1:1 
 mapping of plain text data and cipher data.
 --

 Key: HADOOP-10152
 URL: https://issues.apache.org/jira/browse/HADOOP-10152
 Project: Hadoop Common
  Issue Type: Sub-task
  Components: security
Affects Versions: 3.0.0
Reporter: Yi Liu
Assignee: Yi Liu
  Labels: rhino
 Fix For: 3.0.0


 To make it easy to seek and do positioned reads on a distributed file, the 
 length of the encrypted file should be the same as the length of the plain 
 file, and the positions should have a 1:1 mapping. So in this JIRA we define a 
 distributed file cipher InputStream (FSDecryptorStream) and OutputStream 
 (FSEncryptorStream). The distributed file cipher InputStream is seekable and 
 positioned-readable. This JIRA is different from HADOOP-10151: the file may be 
 read and written many times and on multiple nodes.



--
This message was sent by Atlassian JIRA
(v6.1.4#6159)


[jira] [Created] (HADOOP-10153) Define Crypto policy interfaces and provide its default implementation.

2013-12-09 Thread Yi Liu (JIRA)
Yi Liu created HADOOP-10153:
---

 Summary: Define Crypto policy interfaces and provide its default 
implementation.
 Key: HADOOP-10153
 URL: https://issues.apache.org/jira/browse/HADOOP-10153
 Project: Hadoop Common
  Issue Type: Sub-task
  Components: security
Affects Versions: 3.0.0
Reporter: Yi Liu
Assignee: Yi Liu
 Fix For: 3.0.0






--
This message was sent by Atlassian JIRA
(v6.1.4#6159)


[jira] [Updated] (HADOOP-10154) Provide cryptographic filesystem implementation and its data IO.

2013-12-09 Thread Yi Liu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-10154?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yi Liu updated HADOOP-10154:


Description: This JIRA includes the cryptographic filesystem data InputStream, 
which extends FSDataInputStream, and OutputStream, which extends 
FSDataOutputStream. The implementation of the cryptographic file system is also 
included in this JIRA.

 Provide cryptographic filesystem implementation and its data IO.
 -

 Key: HADOOP-10154
 URL: https://issues.apache.org/jira/browse/HADOOP-10154
 Project: Hadoop Common
  Issue Type: Sub-task
  Components: security
Affects Versions: 3.0.0
Reporter: Yi Liu
Assignee: Yi Liu
 Fix For: 3.0.0


 This JIRA includes the cryptographic filesystem data InputStream, which extends 
 FSDataInputStream, and OutputStream, which extends FSDataOutputStream. The 
 implementation of the cryptographic file system is also included in this JIRA.



--
This message was sent by Atlassian JIRA
(v6.1.4#6159)


[jira] [Created] (HADOOP-10154) Provide cryptographic filesystem implementation and its data IO.

2013-12-09 Thread Yi Liu (JIRA)
Yi Liu created HADOOP-10154:
---

 Summary: Provide cryptographic filesystem implementation and its 
data IO.
 Key: HADOOP-10154
 URL: https://issues.apache.org/jira/browse/HADOOP-10154
 Project: Hadoop Common
  Issue Type: Sub-task
  Components: security
Affects Versions: 3.0.0
Reporter: Yi Liu
Assignee: Yi Liu
 Fix For: 3.0.0






--
This message was sent by Atlassian JIRA
(v6.1.4#6159)


[jira] [Created] (HADOOP-10155) Hadoop-crypto which includes native cipher implementation.

2013-12-09 Thread Yi Liu (JIRA)
Yi Liu created HADOOP-10155:
---

 Summary: Hadoop-crypto which includes native cipher 
implementation. 
 Key: HADOOP-10155
 URL: https://issues.apache.org/jira/browse/HADOOP-10155
 Project: Hadoop Common
  Issue Type: Sub-task
  Components: security
Affects Versions: 3.0.0
Reporter: Yi Liu
Assignee: Yi Liu
 Fix For: 3.0.0






--
This message was sent by Atlassian JIRA
(v6.1.4#6159)


[jira] [Updated] (HADOOP-10154) Provide cryptographic filesystem implementation and its data IO.

2013-12-09 Thread Yi Liu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-10154?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yi Liu updated HADOOP-10154:


Labels: rhino  (was: )

 Provide cryptographic filesystem implementation and its data IO.
 -

 Key: HADOOP-10154
 URL: https://issues.apache.org/jira/browse/HADOOP-10154
 Project: Hadoop Common
  Issue Type: Sub-task
  Components: security
Affects Versions: 3.0.0
Reporter: Yi Liu
Assignee: Yi Liu
  Labels: rhino
 Fix For: 3.0.0


 This JIRA includes the cryptographic filesystem data InputStream, which extends 
 FSDataInputStream, and OutputStream, which extends FSDataOutputStream. The 
 implementation of the cryptographic file system is also included in this JIRA.



--
This message was sent by Atlassian JIRA
(v6.1.4#6159)


[jira] [Updated] (HADOOP-10155) Hadoop-crypto which includes native cipher implementation.

2013-12-09 Thread Yi Liu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-10155?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yi Liu updated HADOOP-10155:


Description: A native cipher is used to improve performance: when using OpenSSL 
with AES-NI enabled, the native cipher is 20x faster than the Java cipher, for 
example in CBC/CTR mode.

 Hadoop-crypto which includes native cipher implementation. 
 ---

 Key: HADOOP-10155
 URL: https://issues.apache.org/jira/browse/HADOOP-10155
 Project: Hadoop Common
  Issue Type: Sub-task
  Components: security
Affects Versions: 3.0.0
Reporter: Yi Liu
Assignee: Yi Liu
  Labels: rhino
 Fix For: 3.0.0


 A native cipher is used to improve performance: when using OpenSSL with AES-NI 
 enabled, the native cipher is 20x faster than the Java cipher, for example in 
 CBC/CTR mode.



--
This message was sent by Atlassian JIRA
(v6.1.4#6159)


[jira] [Updated] (HADOOP-10155) Hadoop-crypto which includes native cipher implementation.

2013-12-09 Thread Yi Liu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-10155?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yi Liu updated HADOOP-10155:


Labels: rhino  (was: )

 Hadoop-crypto which includes native cipher implementation. 
 ---

 Key: HADOOP-10155
 URL: https://issues.apache.org/jira/browse/HADOOP-10155
 Project: Hadoop Common
  Issue Type: Sub-task
  Components: security
Affects Versions: 3.0.0
Reporter: Yi Liu
Assignee: Yi Liu
  Labels: rhino
 Fix For: 3.0.0


 A native cipher is used to improve performance: when using OpenSSL with AES-NI 
 enabled, the native cipher is 20x faster than the Java cipher, for example in 
 CBC/CTR mode.



--
This message was sent by Atlassian JIRA
(v6.1.4#6159)


[jira] [Commented] (HADOOP-10150) Hadoop cryptographic file system

2013-12-09 Thread Yi Liu (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-10150?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13843114#comment-13843114
 ] 

Yi Liu commented on HADOOP-10150:
-

Hi Owen, I have filed 5 sub tasks, and initial patches will be attached later. 
I want to use HADOOP-10149 to attach ByteBufferCipher API patch.

 Hadoop cryptographic file system
 

 Key: HADOOP-10150
 URL: https://issues.apache.org/jira/browse/HADOOP-10150
 Project: Hadoop Common
  Issue Type: New Feature
  Components: security
Affects Versions: 3.0.0
Reporter: Yi Liu
Assignee: Yi Liu
  Labels: rhino
 Fix For: 3.0.0

 Attachments: CryptographicFileSystem.patch, HADOOP cryptographic file 
 system.pdf


 There is an increasing need for securing data when Hadoop customers use 
 various upper layer applications, such as Map-Reduce, Hive, Pig, HBase and so 
 on.
 HADOOP CFS (HADOOP Cryptographic File System) is used to secure data, based 
 on HADOOP “FilterFileSystem” decorating DFS or other file systems, and 
 transparent to upper layer applications. It’s configurable, scalable and fast.
 High level requirements:
 1. Transparent to, and no modification required for, upper layer applications.
 2. “Seek” and “PositionedReadable” are supported for the input stream of CFS 
 if the wrapped file system supports them.
 3. Very high performance for encryption and decryption; they will not become 
 a bottleneck.
 4. Can decorate HDFS and all other file systems in Hadoop, and will not modify 
 the existing structure of the file system, such as the namenode and datanode 
 structure, if the wrapped file system is HDFS.
 5. Admins can configure encryption policies, such as which directories will be 
 encrypted.
 6. A robust key management framework.
 7. Support Pread and append operations if the wrapped file system supports them.



--
This message was sent by Atlassian JIRA
(v6.1.4#6159)


[jira] [Commented] (HADOOP-9611) mvn-rpmbuild against google-guice 3.0 yields missing cglib dependency

2013-12-09 Thread Robert Rati (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-9611?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13843127#comment-13843127
 ] 

Robert Rati commented on HADOOP-9611:
-

Some additional modules were found to be missing the dependency since the patch 
was first posted.

 mvn-rpmbuild against google-guice  3.0 yields missing cglib dependency
 ---

 Key: HADOOP-9611
 URL: https://issues.apache.org/jira/browse/HADOOP-9611
 Project: Hadoop Common
  Issue Type: Improvement
  Components: build
Affects Versions: 3.0.0, 2.1.0-beta
Reporter: Timothy St. Clair
  Labels: maven
 Attachments: HADOOP-2.2.0-9611.patch, HADOOP-9611.patch


 Google guice 3.0 repackaged some external dependencies (cglib), which are 
 broken out and exposed when running a mvn-rpmbuild against a stock Fedora 18 
 machine (3.1.2-6).  By adding the explicit dependency, it fixes the error and 
 causes no impact to normal mvn builds.



--
This message was sent by Atlassian JIRA
(v6.1.4#6159)


[jira] [Commented] (HADOOP-10075) Update jetty dependency to version 9

2013-12-09 Thread Robert Rati (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-10075?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13843131#comment-13843131
 ] 

Robert Rati commented on HADOOP-10075:
--

Yes, HBase will need to change to Jetty 9 as well.  I'm working on HBase and 
will be providing a patch to them when done.  The change needed for HBase to 
interact with Hadoop using Jetty 9 is pretty minor from the work I've done so 
far.  IIRC, it's a returned variable type change from a call into Hadoop 
(Connector class issue again).

 Update jetty dependency to version 9
 

 Key: HADOOP-10075
 URL: https://issues.apache.org/jira/browse/HADOOP-10075
 Project: Hadoop Common
  Issue Type: Improvement
Affects Versions: 2.2.0
Reporter: Robert Rati
Assignee: Robert Rati
 Attachments: HADOOP-10075.patch


 Jetty6 is no longer maintained.  Update the dependency to jetty9.



--
This message was sent by Atlassian JIRA
(v6.1.4#6159)


[jira] [Commented] (HADOOP-9611) mvn-rpmbuild against google-guice 3.0 yields missing cglib dependency

2013-12-09 Thread Steve Loughran (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-9611?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13843298#comment-13843298
 ] 

Steve Loughran commented on HADOOP-9611:


OK, so the 2.2 patch is intended for branch-2 and then trunk?

 mvn-rpmbuild against google-guice  3.0 yields missing cglib dependency
 ---

 Key: HADOOP-9611
 URL: https://issues.apache.org/jira/browse/HADOOP-9611
 Project: Hadoop Common
  Issue Type: Improvement
  Components: build
Affects Versions: 3.0.0, 2.1.0-beta
Reporter: Timothy St. Clair
  Labels: maven
 Attachments: HADOOP-2.2.0-9611.patch, HADOOP-9611.patch


 Google guice 3.0 repackaged some external dependencies (cglib), which are 
 broken out and exposed when running a mvn-rpmbuild against a stock Fedora 18 
 machine (3.1.2-6).  By adding the explicit dependency, it fixes the error and 
 causes no impact to normal mvn builds.



--
This message was sent by Atlassian JIRA
(v6.1.4#6159)


[jira] [Commented] (HADOOP-9611) mvn-rpmbuild against google-guice 3.0 yields missing cglib dependency

2013-12-09 Thread Robert Rati (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-9611?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13843307#comment-13843307
 ] 

Robert Rati commented on HADOOP-9611:
-

Correct. 

 mvn-rpmbuild against google-guice  3.0 yields missing cglib dependency
 ---

 Key: HADOOP-9611
 URL: https://issues.apache.org/jira/browse/HADOOP-9611
 Project: Hadoop Common
  Issue Type: Improvement
  Components: build
Affects Versions: 3.0.0, 2.1.0-beta
Reporter: Timothy St. Clair
  Labels: maven
 Attachments: HADOOP-2.2.0-9611.patch, HADOOP-9611.patch


 Google guice 3.0 repackaged some external dependencies (cglib), which are 
 broken out and exposed when running a mvn-rpmbuild against a stock Fedora 18 
 machine (3.1.2-6).  By adding the explicit dependency, it fixes the error and 
 causes no impact to normal mvn builds.



--
This message was sent by Atlassian JIRA
(v6.1.4#6159)


[jira] [Commented] (HADOOP-10151) Implement a Buffer-Based Cipher InputStream and OutputStream

2013-12-09 Thread Owen O'Malley (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-10151?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13843332#comment-13843332
 ] 

Owen O'Malley commented on HADOOP-10151:


After looking more at the Cipher API, it does in fact have ByteBuffer methods, 
but the methods on Cipher are final and it delegates to an underlying 
implementation object. Having a direct implementation will be much more 
efficient.

 Implement a Buffer-Based Cipher InputStream and OutputStream
 

 Key: HADOOP-10151
 URL: https://issues.apache.org/jira/browse/HADOOP-10151
 Project: Hadoop Common
  Issue Type: Sub-task
  Components: security
Affects Versions: 3.0.0
Reporter: Yi Liu
Assignee: Yi Liu
  Labels: rhino
 Fix For: 3.0.0


 The cipher InputStream and OutputStream are buffer-based; the buffer is used 
 to cache the encrypted data or result. The cipher InputStream is used to read 
 encrypted data, and the result is plain text. The cipher OutputStream is used 
 to write plain data, and the result is encrypted data.



--
This message was sent by Atlassian JIRA
(v6.1.4#6159)


[jira] [Issue Comment Deleted] (HADOOP-10151) Implement a Buffer-Based Cipher InputStream and OutputStream

2013-12-09 Thread Owen O'Malley (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-10151?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Owen O'Malley updated HADOOP-10151:
---

Comment: was deleted

(was: After looking more at the Cipher API, it does in fact have ByteBuffer 
methods, but the methods on Cipher are final and it delegates to an underlying 
implementation object. Having a direct implementation will be much more 
efficient.)

 Implement a Buffer-Based Cipher InputStream and OutputStream
 

 Key: HADOOP-10151
 URL: https://issues.apache.org/jira/browse/HADOOP-10151
 Project: Hadoop Common
  Issue Type: Sub-task
  Components: security
Affects Versions: 3.0.0
Reporter: Yi Liu
Assignee: Yi Liu
  Labels: rhino
 Fix For: 3.0.0


 The cipher InputStream and OutputStream are buffer-based; the buffer is used 
 to cache the encrypted data or result. The cipher InputStream is used to read 
 encrypted data, and the result is plain text. The cipher OutputStream is used 
 to write plain data, and the result is encrypted data.



--
This message was sent by Atlassian JIRA
(v6.1.4#6159)


[jira] [Commented] (HADOOP-10153) Define Crypto policy interfaces and provide its default implementation.

2013-12-09 Thread Owen O'Malley (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-10153?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13843345#comment-13843345
 ] 

Owen O'Malley commented on HADOOP-10153:


After looking at the javax.crypto.Cipher API more deeply, it does have both 
byte[] and ByteBuffer methods. 

Unfortunately the methods in Cipher are declared final and the actual 
implementation is done by a nested implementation class. It isn't clear what 
the performance penalty for using javax.crypto.Cipher with an OpenSSL-based 
provider would be, but it doesn't look excessive.

My thoughts are that instead of reinventing the wheel, we should use the 
standard javax.crypto.Cipher API with a custom provider that is based on 
OpenSSL.

Thoughts?

 Define Crypto policy interfaces and provide its default implementation.
 ---

 Key: HADOOP-10153
 URL: https://issues.apache.org/jira/browse/HADOOP-10153
 Project: Hadoop Common
  Issue Type: Sub-task
  Components: security
Affects Versions: 3.0.0
Reporter: Yi Liu
Assignee: Yi Liu
  Labels: rhino
 Fix For: 3.0.0


 This JIRA defines the crypto policy interface; developers/users can implement 
 their own crypto policy to decide how files/directories are encrypted. This 
 JIRA also includes a default implementation.



--
This message was sent by Atlassian JIRA
(v6.1.4#6159)


[jira] [Issue Comment Deleted] (HADOOP-10153) Define Crypto policy interfaces and provide its default implementation.

2013-12-09 Thread Owen O'Malley (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-10153?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Owen O'Malley updated HADOOP-10153:
---

Comment: was deleted

(was: After looking at the javax.crypto.Cipher API more deeply, it does have 
both byte[] and ByteBuffer methods. 

Unfortunately the methods in Cipher are declared final and the actual 
implementation is done by a nested implementation class. It isn't clear what 
the performance penalty for using javax.crypto.Cipher with an OpenSSL-based 
provider would be, but it doesn't look excessive.

My thoughts are that instead of reinventing the wheel, we should use the 
standard javax.crypto.Cipher API with a custom provider that is based on 
OpenSSL.

Thoughts?)

 Define Crypto policy interfaces and provide its default implementation.
 ---

 Key: HADOOP-10153
 URL: https://issues.apache.org/jira/browse/HADOOP-10153
 Project: Hadoop Common
  Issue Type: Sub-task
  Components: security
Affects Versions: 3.0.0
Reporter: Yi Liu
Assignee: Yi Liu
  Labels: rhino
 Fix For: 3.0.0


 This JIRA defines the crypto policy interface; developers/users can implement 
 their own crypto policy to decide how files/directories are encrypted. This 
 JIRA also includes a default implementation.



--
This message was sent by Atlassian JIRA
(v6.1.4#6159)


[jira] [Commented] (HADOOP-10149) Create ByteBuffer-based cipher API

2013-12-09 Thread Owen O'Malley (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-10149?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13843351#comment-13843351
 ] 

Owen O'Malley commented on HADOOP-10149:


After looking at the javax.crypto.Cipher API more deeply, it does have both 
byte[] and ByteBuffer methods.

Unfortunately the methods in Cipher are declared final and the actual 
implementation is done by a nested implementation class. It isn't clear what 
the performance penalty for using javax.crypto.Cipher with an OpenSSL-based 
provider would be, but it doesn't look excessive.

My thoughts are that instead of reinventing the wheel, we should use the 
standard javax.crypto.Cipher API with a custom provider that is based on 
OpenSSL.

Thoughts?
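
For reference, a minimal sketch of the ByteBuffer entry point on javax.crypto.Cipher with an ordinary JCE provider (ByteBufferCipherDemo is illustrative only):
{code}
import java.nio.ByteBuffer;
import java.security.GeneralSecurityException;
import javax.crypto.Cipher;
import javax.crypto.spec.IvParameterSpec;
import javax.crypto.spec.SecretKeySpec;

// Minimal sketch of the ByteBuffer-based path on javax.crypto.Cipher.
// It also works with direct buffers, which is what makes it interesting for zero-copy.
class ByteBufferCipherDemo {
  static void encrypt(byte[] key, byte[] iv, ByteBuffer input, ByteBuffer output)
      throws GeneralSecurityException {
    Cipher cipher = Cipher.getInstance("AES/CTR/NoPadding");
    cipher.init(Cipher.ENCRYPT_MODE, new SecretKeySpec(key, "AES"), new IvParameterSpec(iv));
    // update(ByteBuffer, ByteBuffer) is the final method that delegates to the
    // provider's CipherSpi; the output position is advanced by the bytes produced.
    cipher.update(input, output);
  }
}
{code}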

 Create ByteBuffer-based cipher API
 --

 Key: HADOOP-10149
 URL: https://issues.apache.org/jira/browse/HADOOP-10149
 Project: Hadoop Common
  Issue Type: Bug
Reporter: Owen O'Malley
Assignee: Owen O'Malley

 As part of HDFS-5143, [~hitliuyi] included a ByteBuffer-based API for 
 encryption and decryption. Especially, because of the zero-copy work this 
 seems like an important piece of work. 
 This API should be discussed independently instead of just as part of 
 HDFS-5143.



--
This message was sent by Atlassian JIRA
(v6.1.4#6159)


[jira] [Commented] (HADOOP-10148) backport hadoop-10107 to branch-0.23

2013-12-09 Thread Jonathan Eagles (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-10148?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13843424#comment-13843424
 ] 

Jonathan Eagles commented on HADOOP-10148:
--

I'm +1 on this patch. Thanks, Chen.

 backport hadoop-10107 to branch-0.23
 

 Key: HADOOP-10148
 URL: https://issues.apache.org/jira/browse/HADOOP-10148
 Project: Hadoop Common
  Issue Type: Sub-task
  Components: ipc
Affects Versions: 0.23.10
Reporter: Chen He
Assignee: Chen He
Priority: Minor
 Attachments: HADOOP-10148.patch


 Found this in [build 
 #5440|https://builds.apache.org/job/PreCommit-HDFS-Build/5440/testReport/junit/org.apache.hadoop.hdfs.server.blockmanagement/TestUnderReplicatedBlocks/testSetrepIncWithUnderReplicatedBlocks/]
 Caused by: java.lang.NullPointerException
   at org.apache.hadoop.ipc.Server.getNumOpenConnections(Server.java:2434)
   at 
 org.apache.hadoop.ipc.metrics.RpcMetrics.numOpenConnections(RpcMetrics.java:74)



--
This message was sent by Atlassian JIRA
(v6.1.4#6159)


[jira] [Commented] (HADOOP-10148) backport hadoop-10107 to branch-0.23

2013-12-09 Thread Jonathan Eagles (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-10148?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13843422#comment-13843422
 ] 

Jonathan Eagles commented on HADOOP-10148:
--

There will very likely be a 0.23.11. Since this is just a cosmetic problem in 
the logs, I think this JIRA is a good candidate for that release.

 backport hadoop-10107 to branch-0.23
 

 Key: HADOOP-10148
 URL: https://issues.apache.org/jira/browse/HADOOP-10148
 Project: Hadoop Common
  Issue Type: Sub-task
  Components: ipc
Affects Versions: 0.23.10
Reporter: Chen He
Assignee: Chen He
Priority: Minor
 Attachments: HADOOP-10148.patch


 Found this in [build 
 #5440|https://builds.apache.org/job/PreCommit-HDFS-Build/5440/testReport/junit/org.apache.hadoop.hdfs.server.blockmanagement/TestUnderReplicatedBlocks/testSetrepIncWithUnderReplicatedBlocks/]
 Caused by: java.lang.NullPointerException
   at org.apache.hadoop.ipc.Server.getNumOpenConnections(Server.java:2434)
   at 
 org.apache.hadoop.ipc.metrics.RpcMetrics.numOpenConnections(RpcMetrics.java:74)



--
This message was sent by Atlassian JIRA
(v6.1.4#6159)


[jira] [Updated] (HADOOP-10148) backport hadoop-10107 to branch-0.23

2013-12-09 Thread Jonathan Eagles (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-10148?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Eagles updated HADOOP-10148:
-

   Resolution: Fixed
Fix Version/s: 0.23.11
   Status: Resolved  (was: Patch Available)

 backport hadoop-10107 to branch-0.23
 

 Key: HADOOP-10148
 URL: https://issues.apache.org/jira/browse/HADOOP-10148
 Project: Hadoop Common
  Issue Type: Sub-task
  Components: ipc
Affects Versions: 0.23.10
Reporter: Chen He
Assignee: Chen He
Priority: Minor
 Fix For: 0.23.11

 Attachments: HADOOP-10148.patch


 Found this in [build 
 #5440|https://builds.apache.org/job/PreCommit-HDFS-Build/5440/testReport/junit/org.apache.hadoop.hdfs.server.blockmanagement/TestUnderReplicatedBlocks/testSetrepIncWithUnderReplicatedBlocks/]
 Caused by: java.lang.NullPointerException
   at org.apache.hadoop.ipc.Server.getNumOpenConnections(Server.java:2434)
   at 
 org.apache.hadoop.ipc.metrics.RpcMetrics.numOpenConnections(RpcMetrics.java:74)



--
This message was sent by Atlassian JIRA
(v6.1.4#6159)


[jira] [Commented] (HADOOP-9639) truly shared cache for jars (jobjar/libjar)

2013-12-09 Thread Sangjin Lee (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-9639?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13843447#comment-13843447
 ] 

Sangjin Lee commented on HADOOP-9639:
-

[~stev...@iseran.com], thanks much for your valuable comments. I'm going to go 
over them and reply soon. Thanks again!

 truly shared cache for jars (jobjar/libjar)
 ---

 Key: HADOOP-9639
 URL: https://issues.apache.org/jira/browse/HADOOP-9639
 Project: Hadoop Common
  Issue Type: New Feature
  Components: filecache
Affects Versions: 2.0.4-alpha
Reporter: Sangjin Lee
Assignee: Sangjin Lee
 Attachments: shared_cache_design.pdf, shared_cache_design_v2.pdf, 
 shared_cache_design_v3.pdf, shared_cache_design_v4.pdf


 Currently there is the distributed cache that enables you to cache jars and 
 files so that attempts from the same job can reuse them. However, sharing is 
 limited with the distributed cache because it is normally on a per-job basis. 
 On a large cluster, sometimes copying of jobjars and libjars becomes so 
 prevalent that it consumes a large portion of the network bandwidth, not to 
 speak of defeating the purpose of bringing compute to where data is. This 
 is wasteful because in most cases code doesn't change much across many jobs.
 I'd like to propose and discuss feasibility of introducing a truly shared 
 cache so that multiple jobs from multiple users can share and cache jars. 
 This JIRA is to open the discussion.



--
This message was sent by Atlassian JIRA
(v6.1.4#6159)


[jira] [Updated] (HADOOP-10142) Avoid groups lookup for unprivileged users such as dr.who

2013-12-09 Thread Xi Fang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-10142?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xi Fang updated HADOOP-10142:
-

Attachment: HADOOP-10142-branch-1.patch

Thanks Vinay. I backported this to branch-1-win and branch-1 
(HADOOP-10142-branch-1.patch).

 Avoid groups lookup for unprivileged users such as dr.who
 ---

 Key: HADOOP-10142
 URL: https://issues.apache.org/jira/browse/HADOOP-10142
 Project: Hadoop Common
  Issue Type: Bug
Reporter: Vinay
Assignee: Vinay
 Fix For: 2.3.0

 Attachments: HADOOP-10142-branch-1.patch, HADOOP-10142.patch, 
 HADOOP-10142.patch, HADOOP-10142.patch, HADOOP-10142.patch


 Reduce the logs generated by ShellBasedUnixGroupsMapping.
 For example: using WebHDFS from Windows generates the following log for each request
 {noformat}2013-12-03 11:34:56,589 WARN 
 org.apache.hadoop.security.ShellBasedUnixGroupsMapping: got exception trying 
 to get groups for user dr.who
 org.apache.hadoop.util.Shell$ExitCodeException: id: dr.who: No such user
 at org.apache.hadoop.util.Shell.runCommand(Shell.java:504)
 at org.apache.hadoop.util.Shell.run(Shell.java:417)
 at 
 org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:636)
 at org.apache.hadoop.util.Shell.execCommand(Shell.java:725)
 at org.apache.hadoop.util.Shell.execCommand(Shell.java:708)
 at 
 org.apache.hadoop.security.ShellBasedUnixGroupsMapping.getUnixGroups(ShellBasedUnixGroupsMapping.java:83)
 at 
 org.apache.hadoop.security.ShellBasedUnixGroupsMapping.getGroups(ShellBasedUnixGroupsMapping.java:52)
 at 
 org.apache.hadoop.security.JniBasedUnixGroupsMappingWithFallback.getGroups(JniBasedUnixGroupsMappingWithFallback.java:50)
 at org.apache.hadoop.security.Groups.getGroups(Groups.java:95)
 at 
 org.apache.hadoop.security.UserGroupInformation.getGroupNames(UserGroupInformation.java:1376)
 at 
 org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.init(FSPermissionChecker.java:63)
 at 
 org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getPermissionChecker(FSNamesystem.java:3228)
 at 
 org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getListingInt(FSNamesystem.java:4063)
 at 
 org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getListing(FSNamesystem.java:4052)
 at 
 org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.getListing(NameNodeRpcServer.java:748)
 at 
 org.apache.hadoop.hdfs.server.namenode.web.resources.NamenodeWebHdfsMethods.getDirectoryListing(NamenodeWebHdfsMethods.java:715)
 at 
 org.apache.hadoop.hdfs.server.namenode.web.resources.NamenodeWebHdfsMethods.getListingStream(NamenodeWebHdfsMethods.java:727)
 at 
 org.apache.hadoop.hdfs.server.namenode.web.resources.NamenodeWebHdfsMethods.get(NamenodeWebHdfsMethods.java:675)
 at 
 org.apache.hadoop.hdfs.server.namenode.web.resources.NamenodeWebHdfsMethods.access$400(NamenodeWebHdfsMethods.java:114)
 at 
 org.apache.hadoop.hdfs.server.namenode.web.resources.NamenodeWebHdfsMethods$3.run(NamenodeWebHdfsMethods.java:623)
 at 
 org.apache.hadoop.hdfs.server.namenode.web.resources.NamenodeWebHdfsMethods$3.run(NamenodeWebHdfsMethods.java:618)
 at java.security.AccessController.doPrivileged(Native Method)
 at javax.security.auth.Subject.doAs(Subject.java:396)
 at 
 org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1515)
 at 
 org.apache.hadoop.hdfs.server.namenode.web.resources.NamenodeWebHdfsMethods.get(NamenodeWebHdfsMethods.java:618)
 at 
 org.apache.hadoop.hdfs.server.namenode.web.resources.NamenodeWebHdfsMethods.getRoot(NamenodeWebHdfsMethods.java:586)
 at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
 at 
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
 at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
 at java.lang.reflect.Method.invoke(Method.java:597)
 at 
 com.sun.jersey.spi.container.JavaMethodInvokerFactory$1.invoke(JavaMethodInvokerFactory.java:60)
 at 
 com.sun.jersey.server.impl.model.method.dispatch.AbstractResourceMethodDispatchProvider$ResponseOutInvoker._dispatch(AbstractResourceMethodDispatchProvider.java:205)
 at 
 com.sun.jersey.server.impl.model.method.dispatch.ResourceJavaMethodDispatcher.dispatch(ResourceJavaMethodDispatcher.java:75)
 at 
 com.sun.jersey.server.impl.uri.rules.HttpMethodRule.accept(HttpMethodRule.java:288)
 at 
 com.sun.jersey.server.impl.uri.rules.ResourceClassRule.accept(ResourceClassRule.java:108)
 at 
 com.sun.jersey.server.impl.uri.rules.RightHandPathRule.accept(RightHandPathRule.java:147)
  

[jira] [Commented] (HADOOP-9845) Update protobuf to 2.5 from 2.4.x

2013-12-09 Thread Konstantin Shvachko (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-9845?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13843478#comment-13843478
 ] 

Konstantin Shvachko commented on HADOOP-9845:
-

Mikhail, we can port it to 2.0, but there are things to consider.
I know a few binary distributions that are based on 2.0. 
And people may consider this an incompatible change, since it can affect, in 
particular, their cluster management software.
Other than that it is just a matter of porting the patch.

 Update protobuf to 2.5 from 2.4.x
 -

 Key: HADOOP-9845
 URL: https://issues.apache.org/jira/browse/HADOOP-9845
 Project: Hadoop Common
  Issue Type: Improvement
  Components: performance
Affects Versions: 2.0.5-alpha
Reporter: stack
Assignee: Alejandro Abdelnur
Priority: Blocker
 Fix For: 2.1.0-beta

 Attachments: HADOOP-9845.patch, HADOOP-9845.patch


 protobuf 2.5 is a bit faster with a new Parse to avoid a builder step and a 
 few other goodies that we'd like to take advantage of over in hbase 
 especially now we are all pb all the time.  Unfortunately the protoc 
 generated files are no longer compatible w/ 2.4.1 generated files.  Hadoop 
 uses 2.4.1 pb.  This latter fact makes it so we cannot upgrade until hadoop 
 does.
 This issue suggests hadoop2 move to protobuf 2.5.
 I can do the patch no prob. if there is interest.
 (When we upgraded our build broke with complaints like the below:
 {code}
 java.lang.UnsupportedOperationException: This is supposed to be overridden by 
 subclasses.
   at 
 com.google.protobuf.GeneratedMessage.getUnknownFields(GeneratedMessage.java:180)
   at 
 org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$GetDatanodeReportRequestProto.getSerializedSize(ClientNamenodeProtocolProtos.java:21566)
   at 
 com.google.protobuf.AbstractMessageLite.toByteString(AbstractMessageLite.java:49)
   at 
 org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.constructRpcRequest(ProtobufRpcEngine.java:149)
   at 
 org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:193)
   at com.sun.proxy.$Proxy14.getDatanodeReport(Unknown Source)
   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
   at 
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
   at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
   at java.lang.reflect.Method.invoke(Method.java:597)
   at 
 org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:164)
   at 
 org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:83)
   at com.sun.proxy.$Proxy14.getDatanodeReport(Unknown Source)
   at 
 org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getDatanodeReport(ClientNamenodeProtocolTranslatorPB.java:488)
   at org.apache.hadoop.hdfs.DFSClient.datanodeReport(DFSClient.java:1887)
   at 
 org.apache.hadoop.hdfs.MiniDFSCluster.waitActive(MiniDFSCluster.java:1798
 ...
 {code}
 More over in HBASE-8165 if interested.



--
This message was sent by Atlassian JIRA
(v6.1.4#6159)


[jira] [Commented] (HADOOP-10106) Incorrect thread name in RPC log messages

2013-12-09 Thread Sanjay Radia (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-10106?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13843477#comment-13843477
 ] 

Sanjay Radia commented on HADOOP-10106:
---

Refactoring of code generally needs to be done in a separate JIRA. Isn't fixing 
the thread name possible with the old structure? If so, I suggest that you do 
the refactoring in a separate JIRA.

 Incorrect thread name in RPC log messages
 -

 Key: HADOOP-10106
 URL: https://issues.apache.org/jira/browse/HADOOP-10106
 Project: Hadoop Common
  Issue Type: Bug
Reporter: Ming Ma
Priority: Minor
 Attachments: hadoop_10106_trunk.patch


 INFO org.apache.hadoop.ipc.Server: IPC Server listener on 8020: 
 readAndProcess from client 10.115.201.46 threw exception 
 org.apache.hadoop.ipc.RpcServerException: Unknown out of band call 
 #-2147483647
 This is thrown by a reader thread, so the message should be like
 INFO org.apache.hadoop.ipc.Server: Socket Reader #1 for port 8020: 
 readAndProcess from client 10.115.201.46 threw exception 
 org.apache.hadoop.ipc.RpcServerException: Unknown out of band call 
 #-2147483647
 Another example is Responder.processResponse, which can also be called by the 
 handler thread. When that happens, the thread name should be the handler 
 thread, not the responder thread.



--
This message was sent by Atlassian JIRA
(v6.1.4#6159)


[jira] [Commented] (HADOOP-9845) Update protobuf to 2.5 from 2.4.x

2013-12-09 Thread Mikhail Antonov (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-9845?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13843485#comment-13843485
 ] 

Mikhail Antonov commented on HADOOP-9845:
-

Konstantin, thanks for reply!

Essentially, I'm building Bigtop from sources on Fedora 19 (I know it's not 
supported officially, but I wanted to give it a try), and found that _make_ 
tries to build 2.0.6-alpha, which uses protobuf 2.4.*, but it seems Fedora 19 
only installs protobuf 2.5.0.

Maybe it's more appropriate to discuss this on the Bigtop dev mailing list?

 Update protobuf to 2.5 from 2.4.x
 -

 Key: HADOOP-9845
 URL: https://issues.apache.org/jira/browse/HADOOP-9845
 Project: Hadoop Common
  Issue Type: Improvement
  Components: performance
Affects Versions: 2.0.5-alpha
Reporter: stack
Assignee: Alejandro Abdelnur
Priority: Blocker
 Fix For: 2.1.0-beta

 Attachments: HADOOP-9845.patch, HADOOP-9845.patch


 protobuf 2.5 is a bit faster with a new Parse to avoid a builder step and a 
 few other goodies that we'd like to take advantage of over in hbase 
 especially now we are all pb all the time.  Unfortunately the protoc 
 generated files are no longer compatible w/ 2.4.1 generated files.  Hadoop 
 uses 2.4.1 pb.  This latter fact makes it so we cannot upgrade until hadoop 
 does.
 This issue suggests hadoop2 move to protobuf 2.5.
 I can do the patch no prob. if there is interest.
 (When we upgraded our build broke with complaints like the below:
 {code}
 java.lang.UnsupportedOperationException: This is supposed to be overridden by 
 subclasses.
   at 
 com.google.protobuf.GeneratedMessage.getUnknownFields(GeneratedMessage.java:180)
   at 
 org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$GetDatanodeReportRequestProto.getSerializedSize(ClientNamenodeProtocolProtos.java:21566)
   at 
 com.google.protobuf.AbstractMessageLite.toByteString(AbstractMessageLite.java:49)
   at 
 org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.constructRpcRequest(ProtobufRpcEngine.java:149)
   at 
 org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:193)
   at com.sun.proxy.$Proxy14.getDatanodeReport(Unknown Source)
   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
   at 
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
   at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
   at java.lang.reflect.Method.invoke(Method.java:597)
   at 
 org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:164)
   at 
 org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:83)
   at com.sun.proxy.$Proxy14.getDatanodeReport(Unknown Source)
   at 
 org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getDatanodeReport(ClientNamenodeProtocolTranslatorPB.java:488)
   at org.apache.hadoop.hdfs.DFSClient.datanodeReport(DFSClient.java:1887)
   at 
 org.apache.hadoop.hdfs.MiniDFSCluster.waitActive(MiniDFSCluster.java:1798
 ...
 {code}
 More over in HBASE-8165 if interested.
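
For context on why the upgrade matters, a rough sketch of the parsing 
difference the description above refers to (illustrative only; 
GetDatanodeReportRequestProto stands in for any generated message, and the 
checked exception is simply propagated):

{code}
// Illustrative only -- not part of the attached patches.
static void parseBothWays(byte[] bytes)
    throws com.google.protobuf.InvalidProtocolBufferException {
  // protobuf 2.4.x-style parsing effectively goes through a builder:
  GetDatanodeReportRequestProto viaBuilder =
      GetDatanodeReportRequestProto.newBuilder().mergeFrom(bytes).build();
  // protobuf 2.5.0 generates a PARSER that avoids the builder step:
  GetDatanodeReportRequestProto viaParser =
      GetDatanodeReportRequestProto.PARSER.parseFrom(bytes);
}
{code}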



--
This message was sent by Atlassian JIRA
(v6.1.4#6159)


[jira] [Updated] (HADOOP-9640) RPC Congestion Control with FairCallQueue

2013-12-09 Thread Chris Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-9640?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris Li updated HADOOP-9640:
-

Attachment: faircallqueue4.patch

Added new version

 RPC Congestion Control with FairCallQueue
 -

 Key: HADOOP-9640
 URL: https://issues.apache.org/jira/browse/HADOOP-9640
 Project: Hadoop Common
  Issue Type: Improvement
Affects Versions: 3.0.0, 2.2.0
Reporter: Xiaobo Peng
  Labels: hdfs, qos, rpc
 Attachments: MinorityMajorityPerformance.pdf, 
 NN-denial-of-service-updated-plan.pdf, faircallqueue.patch, 
 faircallqueue2.patch, faircallqueue3.patch, faircallqueue4.patch, 
 rpc-congestion-control-draft-plan.pdf


 Several production Hadoop cluster incidents occurred where the Namenode was 
 overloaded and failed to respond. 
 We can improve quality of service for users during namenode peak loads by 
 replacing the FIFO call queue with a [Fair Call 
 Queue|https://issues.apache.org/jira/secure/attachment/12616864/NN-denial-of-service-updated-plan.pdf].
  (this plan supersedes rpc-congestion-control-draft-plan).
 Excerpted from the communication of one incident, “The map task of a user was 
 creating huge number of small files in the user directory. Due to the heavy 
 load on NN, the JT also was unable to communicate with NN...The cluster 
 became responsive only once the job was killed.”
 Excerpted from the communication of another incident, “Namenode was 
 overloaded by GetBlockLocation requests (Correction: should be getFileInfo 
 requests. the job had a bug that called getFileInfo for a nonexistent file in 
 an endless loop). All other requests to namenode were also affected by this 
 and hence all jobs slowed down. Cluster almost came to a grinding 
 halt…Eventually killed jobtracker to kill all jobs that are running.”
 Excerpted from HDFS-945, “We've seen defective applications cause havoc on 
 the NameNode, for e.g. by doing 100k+ 'listStatus' on very large directories 
 (60k files) etc.”
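
For readers skimming the digest, a toy sketch of the idea in the description 
above (this is NOT the attached faircallqueue patch, which is more elaborate): 
calls from heavy users are demoted to a lower-priority queue so one misbehaving 
client cannot starve everyone else behind a single FIFO queue.

{code}
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.LinkedBlockingQueue;

// Toy illustration of the fair-call-queue idea only; not the attached patch.
public class ToyFairCallQueue<E> {
  private static final int HEAVY_USER_THRESHOLD = 100;
  private final LinkedBlockingQueue<E> high = new LinkedBlockingQueue<E>();
  private final LinkedBlockingQueue<E> low = new LinkedBlockingQueue<E>();
  private final Map<String, Integer> callCounts = new HashMap<String, Integer>();

  public synchronized void put(String user, E call) throws InterruptedException {
    Integer count = callCounts.get(user);
    int updated = (count == null) ? 1 : count + 1;
    callCounts.put(user, updated);
    // Heavy users are demoted so occasional callers keep getting served.
    (updated > HEAVY_USER_THRESHOLD ? low : high).put(call);
  }

  public E take() throws InterruptedException {
    E next = high.poll();            // serve light users' calls first
    return (next != null) ? next : low.take();
  }
}
{code}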



--
This message was sent by Atlassian JIRA
(v6.1.4#6159)


[jira] [Commented] (HADOOP-9640) RPC Congestion Control with FairCallQueue

2013-12-09 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-9640?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13843551#comment-13843551
 ] 

Hadoop QA commented on HADOOP-9640:
---

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12617895/faircallqueue4.patch
  against trunk revision .

{color:red}-1 patch{color}.  The patch command could not apply the patch.

Console output: 
https://builds.apache.org/job/PreCommit-HADOOP-Build/3347//console

This message is automatically generated.

 RPC Congestion Control with FairCallQueue
 -

 Key: HADOOP-9640
 URL: https://issues.apache.org/jira/browse/HADOOP-9640
 Project: Hadoop Common
  Issue Type: Improvement
Affects Versions: 3.0.0, 2.2.0
Reporter: Xiaobo Peng
  Labels: hdfs, qos, rpc
 Attachments: MinorityMajorityPerformance.pdf, 
 NN-denial-of-service-updated-plan.pdf, faircallqueue.patch, 
 faircallqueue2.patch, faircallqueue3.patch, faircallqueue4.patch, 
 rpc-congestion-control-draft-plan.pdf


 Several production Hadoop cluster incidents occurred where the Namenode was 
 overloaded and failed to respond. 
 We can improve quality of service for users during namenode peak loads by 
 replacing the FIFO call queue with a [Fair Call 
 Queue|https://issues.apache.org/jira/secure/attachment/12616864/NN-denial-of-service-updated-plan.pdf].
  (this plan supersedes rpc-congestion-control-draft-plan).
 Excerpted from the communication of one incident, “The map task of a user was 
 creating huge number of small files in the user directory. Due to the heavy 
 load on NN, the JT also was unable to communicate with NN...The cluster 
 became responsive only once the job was killed.”
 Excerpted from the communication of another incident, “Namenode was 
 overloaded by GetBlockLocation requests (Correction: should be getFileInfo 
 requests. the job had a bug that called getFileInfo for a nonexistent file in 
 an endless loop). All other requests to namenode were also affected by this 
 and hence all jobs slowed down. Cluster almost came to a grinding 
 halt…Eventually killed jobtracker to kill all jobs that are running.”
 Excerpted from HDFS-945, “We've seen defective applications cause havoc on 
 the NameNode, for e.g. by doing 100k+ 'listStatus' on very large directories 
 (60k files) etc.”



--
This message was sent by Atlassian JIRA
(v6.1.4#6159)


[jira] [Updated] (HADOOP-9640) RPC Congestion Control with FairCallQueue

2013-12-09 Thread Chris Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-9640?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris Li updated HADOOP-9640:
-

Attachment: faircallqueue5.patch

Updated patch to target trunk

 RPC Congestion Control with FairCallQueue
 -

 Key: HADOOP-9640
 URL: https://issues.apache.org/jira/browse/HADOOP-9640
 Project: Hadoop Common
  Issue Type: Improvement
Affects Versions: 3.0.0, 2.2.0
Reporter: Xiaobo Peng
  Labels: hdfs, qos, rpc
 Attachments: MinorityMajorityPerformance.pdf, 
 NN-denial-of-service-updated-plan.pdf, faircallqueue.patch, 
 faircallqueue2.patch, faircallqueue3.patch, faircallqueue4.patch, 
 faircallqueue5.patch, rpc-congestion-control-draft-plan.pdf


 Several production Hadoop cluster incidents occurred where the Namenode was 
 overloaded and failed to respond. 
 We can improve quality of service for users during namenode peak loads by 
 replacing the FIFO call queue with a [Fair Call 
 Queue|https://issues.apache.org/jira/secure/attachment/12616864/NN-denial-of-service-updated-plan.pdf].
  (this plan supersedes rpc-congestion-control-draft-plan).
 Excerpted from the communication of one incident, “The map task of a user was 
 creating huge number of small files in the user directory. Due to the heavy 
 load on NN, the JT also was unable to communicate with NN...The cluster 
 became responsive only once the job was killed.”
 Excerpted from the communication of another incident, “Namenode was 
 overloaded by GetBlockLocation requests (Correction: should be getFileInfo 
 requests. the job had a bug that called getFileInfo for a nonexistent file in 
 an endless loop). All other requests to namenode were also affected by this 
 and hence all jobs slowed down. Cluster almost came to a grinding 
 halt…Eventually killed jobtracker to kill all jobs that are running.”
 Excerpted from HDFS-945, “We've seen defective applications cause havoc on 
 the NameNode, for e.g. by doing 100k+ 'listStatus' on very large directories 
 (60k files) etc.”



--
This message was sent by Atlassian JIRA
(v6.1.4#6159)


[jira] [Commented] (HADOOP-9640) RPC Congestion Control with FairCallQueue

2013-12-09 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-9640?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13843586#comment-13843586
 ] 

Hadoop QA commented on HADOOP-9640:
---

{color:green}+1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12617900/faircallqueue5.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 4 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-common-project/hadoop-common.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-HADOOP-Build/3348//testReport/
Console output: 
https://builds.apache.org/job/PreCommit-HADOOP-Build/3348//console

This message is automatically generated.

 RPC Congestion Control with FairCallQueue
 -

 Key: HADOOP-9640
 URL: https://issues.apache.org/jira/browse/HADOOP-9640
 Project: Hadoop Common
  Issue Type: Improvement
Affects Versions: 3.0.0, 2.2.0
Reporter: Xiaobo Peng
  Labels: hdfs, qos, rpc
 Attachments: MinorityMajorityPerformance.pdf, 
 NN-denial-of-service-updated-plan.pdf, faircallqueue.patch, 
 faircallqueue2.patch, faircallqueue3.patch, faircallqueue4.patch, 
 faircallqueue5.patch, rpc-congestion-control-draft-plan.pdf


 Several production Hadoop cluster incidents occurred where the Namenode was 
 overloaded and failed to respond. 
 We can improve quality of service for users during namenode peak loads by 
 replacing the FIFO call queue with a [Fair Call 
 Queue|https://issues.apache.org/jira/secure/attachment/12616864/NN-denial-of-service-updated-plan.pdf].
  (this plan supersedes rpc-congestion-control-draft-plan).
 Excerpted from the communication of one incident, “The map task of a user was 
 creating huge number of small files in the user directory. Due to the heavy 
 load on NN, the JT also was unable to communicate with NN...The cluster 
 became responsive only once the job was killed.”
 Excerpted from the communication of another incident, “Namenode was 
 overloaded by GetBlockLocation requests (Correction: should be getFileInfo 
 requests. the job had a bug that called getFileInfo for a nonexistent file in 
 an endless loop). All other requests to namenode were also affected by this 
 and hence all jobs slowed down. Cluster almost came to a grinding 
 halt…Eventually killed jobtracker to kill all jobs that are running.”
 Excerpted from HDFS-945, “We've seen defective applications cause havoc on 
 the NameNode, for e.g. by doing 100k+ 'listStatus' on very large directories 
 (60k files) etc.”



--
This message was sent by Atlassian JIRA
(v6.1.4#6159)


[jira] [Commented] (HADOOP-9320) Hadoop native build failure on ARM hard-float

2013-12-09 Thread Trevor Robinson (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-9320?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13843661#comment-13843661
 ] 

Trevor Robinson commented on HADOOP-9320:
-

Could someone please commit this patch? The build is broken for ARM hard-float 
systems, which are now the default (Oracle Java 7u40 supports armhf). This fix 
trivially reorders two chunks of JNIFlags.cmake so that JAVA_JVM_LIBRARY is 
defined before it is used, and it has no effect on other platforms. Thanks.

 Hadoop native build failure on ARM hard-float
 -

 Key: HADOOP-9320
 URL: https://issues.apache.org/jira/browse/HADOOP-9320
 Project: Hadoop Common
  Issue Type: Bug
  Components: native
Affects Versions: 2.0.3-alpha
 Environment: $ uname -a
 Linux 3.5.0-1000-highbank #154-Ubuntu SMP Thu Jan 10 09:13:40 UTC 2013 armv7l 
 armv7l armv7l GNU/Linux
 $ java -version
 java version 1.8.0-ea
 Java(TM) SE Runtime Environment (build 1.8.0-ea-b36e)
 Java HotSpot(TM) Client VM (build 25.0-b04, mixed mode)
Reporter: Trevor Robinson
Assignee: Trevor Robinson
  Labels: build-failure
 Attachments: HADOOP-9320.patch


 ARM JVM float ABI detection is failing in JNIFlags.cmake because 
 JAVA_JVM_LIBRARY is not set at that point. The failure currently causes CMake 
 to assume a soft-float JVM. This causes the build to fail with hard-float 
 OpenJDK (but don't use that) and [Oracle Java 8 Preview for 
 ARM|http://jdk8.java.net/fxarmpreview/]. Hopefully the April update of Oracle 
 Java 7 will support hard-float as well.



--
This message was sent by Atlassian JIRA
(v6.1.4#6159)


[jira] [Commented] (HADOOP-9320) Hadoop native build failure on ARM hard-float

2013-12-09 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-9320?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13843736#comment-13843736
 ] 

Hadoop QA commented on HADOOP-9320:
---

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12570212/HADOOP-9320.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:red}-1 tests included{color}.  The patch doesn't appear to include 
any new or modified tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-common-project/hadoop-common.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-HADOOP-Build/3349//testReport/
Console output: 
https://builds.apache.org/job/PreCommit-HADOOP-Build/3349//console

This message is automatically generated.

 Hadoop native build failure on ARM hard-float
 -

 Key: HADOOP-9320
 URL: https://issues.apache.org/jira/browse/HADOOP-9320
 Project: Hadoop Common
  Issue Type: Bug
  Components: native
Affects Versions: 2.0.3-alpha
 Environment: $ uname -a
 Linux 3.5.0-1000-highbank #154-Ubuntu SMP Thu Jan 10 09:13:40 UTC 2013 armv7l 
 armv7l armv7l GNU/Linux
 $ java -version
 java version 1.8.0-ea
 Java(TM) SE Runtime Environment (build 1.8.0-ea-b36e)
 Java HotSpot(TM) Client VM (build 25.0-b04, mixed mode)
Reporter: Trevor Robinson
Assignee: Trevor Robinson
  Labels: build-failure
 Attachments: HADOOP-9320.patch


 ARM JVM float ABI detection is failing in JNIFlags.cmake because 
 JAVA_JVM_LIBRARY is not set at that point. The failure currently causes CMake 
 to assume a soft-float JVM. This causes the build to fail with hard-float 
 OpenJDK (but don't use that) and [Oracle Java 8 Preview for 
 ARM|http://jdk8.java.net/fxarmpreview/]. Hopefully the April update of Oracle 
 Java 7 will support hard-float as well.



--
This message was sent by Atlassian JIRA
(v6.1.4#6159)


[jira] [Commented] (HADOOP-10089) How to build eclipse plugin for hadoop 2.2 or what is the alternative to program hadoop 2.2?

2013-12-09 Thread Yoonmin Nam (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-10089?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13843772#comment-13843772
 ] 

Yoonmin Nam commented on HADOOP-10089:
--

It isn't working because of the UNRESOLVED DEPENDENCIES error:

[ivy:resolve] :: resolution report :: resolve 117ms :: artifacts dl 3ms
[ivy:resolve] WARN: ::
[ivy:resolve] WARN: ::  UNRESOLVED DEPENDENCIES ::
[ivy:resolve] WARN: ::
[ivy:resolve] WARN: :: 
org.apache.hadoop#hadoop-mapreduce-client-jobclient;2.2.0: not found
[ivy:resolve] WARN: :: 
org.apache.hadoop#hadoop-mapreduce-client-core;2.2.0: not found
[ivy:resolve] WARN: :: 
org.apache.hadoop#hadoop-mapreduce-client-common;2.2.0: not found
[ivy:resolve] WARN: :: org.apache.hadoop#hadoop-hdfs;2.2.0: not found
[ivy:resolve] WARN: :: org.apache.hadoop#hadoop-common;2.2.0: not found
[ivy:resolve] WARN: ::
[ivy:resolve]   report for org.apache.hadoop#eclipse-plugin;working@saturn00 
common produced in 
/home/hadoop/.ivy2/cache/org.apache.hadoop-eclipse-plugin-common.xml
[ivy:resolve]   resolve done (117ms resolve - 3ms download)
[ivy:resolve] 
[ivy:resolve] :: problems summary ::
[ivy:resolve]  WARNINGS
[ivy:resolve]   module not found: 
org.apache.hadoop#hadoop-mapreduce-client-jobclient;2.2.0
[ivy:resolve]    local: tried
[ivy:resolve] 
/home/hadoop/.ivy2/local/org.apache.hadoop/hadoop-mapreduce-client-jobclient/2.2.0/ivys/ivy.xml
[ivy:resolve] -- artifact 
org.apache.hadoop#hadoop-mapreduce-client-jobclient;2.2.0!hadoop-mapreduce-client-jobclient.jar:
[ivy:resolve] 
/home/hadoop/.ivy2/local/org.apache.hadoop/hadoop-mapreduce-client-jobclient/2.2.0/jars/hadoop-mapreduce-client-jobclient.jar
[ivy:resolve]   module not found: 
org.apache.hadoop#hadoop-mapreduce-client-core;2.2.0
[ivy:resolve]    local: tried
[ivy:resolve] 
/home/hadoop/.ivy2/local/org.apache.hadoop/hadoop-mapreduce-client-core/2.2.0/ivys/ivy.xml
[ivy:resolve] -- artifact 
org.apache.hadoop#hadoop-mapreduce-client-core;2.2.0!hadoop-mapreduce-client-core.jar:
[ivy:resolve] 
/home/hadoop/.ivy2/local/org.apache.hadoop/hadoop-mapreduce-client-core/2.2.0/jars/hadoop-mapreduce-client-core.jar
[ivy:resolve]   module not found: 
org.apache.hadoop#hadoop-mapreduce-client-common;2.2.0
[ivy:resolve]    local: tried
[ivy:resolve] 
/home/hadoop/.ivy2/local/org.apache.hadoop/hadoop-mapreduce-client-common/2.2.0/ivys/ivy.xml
[ivy:resolve] -- artifact 
org.apache.hadoop#hadoop-mapreduce-client-common;2.2.0!hadoop-mapreduce-client-common.jar:
[ivy:resolve] 
/home/hadoop/.ivy2/local/org.apache.hadoop/hadoop-mapreduce-client-common/2.2.0/jars/hadoop-mapreduce-client-common.jar
[ivy:resolve]   module not found: org.apache.hadoop#hadoop-hdfs;2.2.0
[ivy:resolve]    local: tried
[ivy:resolve] 
/home/hadoop/.ivy2/local/org.apache.hadoop/hadoop-hdfs/2.2.0/ivys/ivy.xml
[ivy:resolve] -- artifact 
org.apache.hadoop#hadoop-hdfs;2.2.0!hadoop-hdfs.jar:
[ivy:resolve] 
/home/hadoop/.ivy2/local/org.apache.hadoop/hadoop-hdfs/2.2.0/jars/hadoop-hdfs.jar
[ivy:resolve]   module not found: org.apache.hadoop#hadoop-common;2.2.0
[ivy:resolve]    local: tried
[ivy:resolve] 
/home/hadoop/.ivy2/local/org.apache.hadoop/hadoop-common/2.2.0/ivys/ivy.xml
[ivy:resolve] -- artifact 
org.apache.hadoop#hadoop-common;2.2.0!hadoop-common.jar:
[ivy:resolve] 
/home/hadoop/.ivy2/local/org.apache.hadoop/hadoop-common/2.2.0/jars/hadoop-common.jar
[ivy:resolve]   ::
[ivy:resolve]   ::  UNRESOLVED DEPENDENCIES ::
[ivy:resolve]   ::
[ivy:resolve]   :: 
org.apache.hadoop#hadoop-mapreduce-client-jobclient;2.2.0: not found
[ivy:resolve]   :: 
org.apache.hadoop#hadoop-mapreduce-client-core;2.2.0: not found
[ivy:resolve]   :: 
org.apache.hadoop#hadoop-mapreduce-client-common;2.2.0: not found
[ivy:resolve]   :: org.apache.hadoop#hadoop-hdfs;2.2.0: not found
[ivy:resolve]   :: org.apache.hadoop#hadoop-common;2.2.0: not found
[ivy:resolve]   ::
[ivy:resolve] 
[ivy:resolve] :: USE VERBOSE OR DEBUG MESSAGE LEVEL FOR MORE DETAILS

BUILD FAILED
/home/hadoop/hadoop2x-eclipse-plugin-master/src/contrib/build-contrib.xml:492: 
impossible to resolve dependencies:
resolve failed - see output for details
at org.apache.ivy.ant.IvyResolve.doExecute(IvyResolve.java:251)
at org.apache.ivy.ant.IvyTask.execute(IvyTask.java:277)
at org.apache.tools.ant.UnknownElement.execute(UnknownElement.java:292)
at sun.reflect.GeneratedMethodAccessor4.invoke(Unknown Source)

[jira] [Updated] (HADOOP-10044) Improve the javadoc of rpc code

2013-12-09 Thread Sanjay Radia (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-10044?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sanjay Radia updated HADOOP-10044:
--

Status: Open  (was: Patch Available)

 Improve the javadoc of rpc code
 ---

 Key: HADOOP-10044
 URL: https://issues.apache.org/jira/browse/HADOOP-10044
 Project: Hadoop Common
  Issue Type: Improvement
Reporter: Sanjay Radia
Assignee: Sanjay Radia
Priority: Minor
 Attachments: HADOOP-10044.20131014.patch, hadoop-10044.patch






--
This message was sent by Atlassian JIRA
(v6.1.4#6159)


[jira] [Updated] (HADOOP-10044) Improve the javadoc of rpc code

2013-12-09 Thread Sanjay Radia (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-10044?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sanjay Radia updated HADOOP-10044:
--

Status: Patch Available  (was: Open)

 Improve the javadoc of rpc code
 ---

 Key: HADOOP-10044
 URL: https://issues.apache.org/jira/browse/HADOOP-10044
 Project: Hadoop Common
  Issue Type: Improvement
Reporter: Sanjay Radia
Assignee: Sanjay Radia
Priority: Minor
 Attachments: HADOOP-10044.20131014.patch, hadoop-10044.patch






--
This message was sent by Atlassian JIRA
(v6.1.4#6159)


[jira] [Updated] (HADOOP-9867) org.apache.hadoop.mapred.LineRecordReader does not handle multibyte record delimiters well

2013-12-09 Thread Vinay (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-9867?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinay updated HADOOP-9867:
--

Attachment: HADOOP-9867.patch

Attaching the updated patch based on HADOOP-9622 changes

 org.apache.hadoop.mapred.LineRecordReader does not handle multibyte record 
 delimiters well
 --

 Key: HADOOP-9867
 URL: https://issues.apache.org/jira/browse/HADOOP-9867
 Project: Hadoop Common
  Issue Type: Bug
  Components: io
Affects Versions: 0.20.2, 0.23.9, 2.2.0
 Environment: CDH3U2 Redhat linux 5.7
Reporter: Kris Geusebroek
Assignee: Vinay
Priority: Critical
 Attachments: HADOOP-9867.patch, HADOOP-9867.patch, HADOOP-9867.patch


 Having defined a recorddelimiter of multiple bytes in a new InputFileFormat 
 sometimes has the effect of skipping records from the input.
 This happens when the input splits are split off just after a 
 recordseparator. Starting point for the next split would be non zero and 
 skipFirstLine would be true. A seek into the file is done to start - 1 and 
 the text until the first recorddelimiter is ignored (due to the presumption 
 that this record is already handled by the previous maptask). Since the 
 record delimiter is multibyte, the seek only got the last byte of the delimiter 
 into scope and it's not recognized as a full delimiter. So the text is skipped 
 until the next delimiter (ignoring a full record!!)
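
A tiny illustration of the failure mode described above (hypothetical data and 
names, not the attached patch): backing up a single byte in front of a two-byte 
delimiter leaves only its last byte in view, so the delimiter is never matched 
and a whole record is silently dropped.

{code}
import java.nio.charset.StandardCharsets;

// Hypothetical illustration of the off-by-(delimiter length - 1) skip.
public class MultibyteDelimiterSkipDemo {
  public static void main(String[] args) {
    byte[] data = "rec1|#rec2|#rec3".getBytes(StandardCharsets.UTF_8);
    String delimiter = "|#";
    int start = 6;              // the split begins exactly at "rec2"
    int seekTo = start - 1;     // old logic: back up one byte only
    // Only the trailing '#' of "|#" is in view, so the reader never matches
    // the delimiter here and skips ahead to the NEXT one, dropping "rec2".
    // Backing up delimiter.length() bytes keeps the whole delimiter in view.
    System.out.println("byte at seekTo: " + (char) data[seekTo]);
    System.out.println("bytes to back up instead: " + delimiter.length());
  }
}
{code}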



--
This message was sent by Atlassian JIRA
(v6.1.4#6159)


[jira] [Commented] (HADOOP-9867) org.apache.hadoop.mapred.LineRecordReader does not handle multibyte record delimiters well

2013-12-09 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-9867?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13843911#comment-13843911
 ] 

Hadoop QA commented on HADOOP-9867:
---

{color:green}+1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12617969/HADOOP-9867.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 2 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-common-project/hadoop-common 
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-HADOOP-Build/3350//testReport/
Console output: 
https://builds.apache.org/job/PreCommit-HADOOP-Build/3350//console

This message is automatically generated.

 org.apache.hadoop.mapred.LineRecordReader does not handle multibyte record 
 delimiters well
 --

 Key: HADOOP-9867
 URL: https://issues.apache.org/jira/browse/HADOOP-9867
 Project: Hadoop Common
  Issue Type: Bug
  Components: io
Affects Versions: 0.20.2, 0.23.9, 2.2.0
 Environment: CDH3U2 Redhat linux 5.7
Reporter: Kris Geusebroek
Assignee: Vinay
Priority: Critical
 Attachments: HADOOP-9867.patch, HADOOP-9867.patch, HADOOP-9867.patch


 Having defined a recorddelimiter of multiple bytes in a new InputFileFormat 
 sometimes has the effect of skipping records from the input.
 This happens when the input splits are split off just after a 
 recordseparator. Starting point for the next split would be non zero and 
 skipFirstLine would be true. A seek into the file is done to start - 1 and 
 the text until the first recorddelimiter is ignored (due to the presumption 
 that this record is already handled by the previous maptask). Since the 
 record delimiter is multibyte, the seek only got the last byte of the delimiter 
 into scope and it's not recognized as a full delimiter. So the text is skipped 
 until the next delimiter (ignoring a full record!!)



--
This message was sent by Atlassian JIRA
(v6.1.4#6159)


[jira] [Commented] (HADOOP-10149) Create ByteBuffer-based cipher API

2013-12-09 Thread Yi Liu (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-10149?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13843990#comment-13843990
 ] 

Yi Liu commented on HADOOP-10149:
-

Owen, did you mean Liu? Assuming you did, here are my thoughts on your 
comments. If you were not responding to me, sorry for the interruption.

Good to hear you are trying to help commit our work. Hopefully that will not 
require creating duplicate JIRAs to discuss the work, or to break it up into 
smaller pieces; we are already doing that for HDFS-5143 and will do the same 
with HADOOP-9331. If you have specific technical review comments, please 
provide them.



 Create ByteBuffer-based cipher API
 --

 Key: HADOOP-10149
 URL: https://issues.apache.org/jira/browse/HADOOP-10149
 Project: Hadoop Common
  Issue Type: Bug
Reporter: Owen O'Malley
Assignee: Owen O'Malley

 As part of HDFS-5143, [~hitliuyi] included a ByteBuffer-based API for 
 encryption and decryption. Especially, because of the zero-copy work this 
 seems like an important piece of work. 
 This API should be discussed independently instead of just as part of 
 HDFS-5143.
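
For discussion, one possible shape such an API could take (purely hypothetical; 
this is not the HDFS-5143 code): a cipher interface that reads from one 
ByteBuffer and writes into another, so direct buffers coming off the zero-copy 
read path can be handed to a native implementation without extra copies.

{code}
// Purely hypothetical API shape for discussion -- not the HDFS-5143 code.
import java.nio.ByteBuffer;
import java.security.GeneralSecurityException;

public interface ByteBufferCipher {
  /**
   * Encrypts the remaining bytes of {@code input} into {@code output},
   * advancing both buffer positions. Direct buffers can be passed straight
   * through to a native implementation without the copies a byte[]-based
   * API would force.
   */
  void encrypt(ByteBuffer input, ByteBuffer output) throws GeneralSecurityException;

  /** Decrypts the remaining bytes of {@code input} into {@code output}. */
  void decrypt(ByteBuffer input, ByteBuffer output) throws GeneralSecurityException;
}
{code}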



--
This message was sent by Atlassian JIRA
(v6.1.4#6159)