[jira] [Commented] (HADOOP-10641) Introduce Coordination Engine

2014-06-08 Thread Lohit Vijayarenu (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-10641?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14021117#comment-14021117
 ] 

Lohit Vijayarenu commented on HADOOP-10641:
---

Minor comments.
- It looks like checkQuorum is effectively a no-op for submitProposal in the ZK-based 
implementation, since zooKeeper.create would fail if there is no quorum anyway?
- In the ZK-based Coordination Engine implementation, how are ZNodes cleaned up? 
Looking at the patch, each proposal creates a PERSISTENT_SEQUENTIAL ZNode, but there 
is no mention of cleanup.
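For context, a minimal sketch of how PERSISTENT_SEQUENTIAL proposals behave (the helper name and path prefix below are assumptions, not code from the patch): ZooKeeper appends a 10-digit counter to the created path, which is what gives proposals a total order, and the create call itself fails on quorum loss, which is the redundancy noted above.

```java
// Hypothetical helper, not from the HADOOP-10641 patch. A proposal created
// with CreateMode.PERSISTENT_SEQUENTIAL, e.g.
//   zk.create("/ce/proposal-", data, acl, CreateMode.PERSISTENT_SEQUENTIAL)
// returns a path with a 10-digit sequence suffix such as
// "/ce/proposal-0000000042"; that suffix is the global order of proposals.
public class ProposalPaths {
    static long sequenceOf(String createdPath) {
        // The sequence counter is always the last 10 characters of the path.
        return Long.parseLong(createdPath.substring(createdPath.length() - 10));
    }
}
```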

 Introduce Coordination Engine
 -

 Key: HADOOP-10641
 URL: https://issues.apache.org/jira/browse/HADOOP-10641
 Project: Hadoop Common
  Issue Type: New Feature
Affects Versions: 3.0.0
Reporter: Konstantin Shvachko
Assignee: Plamen Jeliazkov
 Attachments: HADOOP-10641.patch, HADOOP-10641.patch, 
 HADOOP-10641.patch


 Coordination Engine (CE) is a system that allows agreement on a sequence of 
 events in a distributed system. To be reliable, the CE should itself be 
 distributed.
 A Coordination Engine can be based on different algorithms (Paxos, Raft, 2PC, 
 ZAB) and have different implementations, depending on use cases and on 
 reliability, availability, and performance requirements.
 The CE should have a common API, so that it can serve as a pluggable component 
 in different projects. The immediate beneficiaries are HDFS (HDFS-6469) and 
 HBase (HBASE-10909).
 The first implementation is proposed to be based on ZooKeeper.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HADOOP-9361) Strictly define the expected behavior of filesystem APIs and write tests to verify compliance

2014-06-08 Thread Steve Loughran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-9361?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Loughran updated HADOOP-9361:
---

Status: Open  (was: Patch Available)

 Strictly define the expected behavior of filesystem APIs and write tests to 
 verify compliance
 -

 Key: HADOOP-9361
 URL: https://issues.apache.org/jira/browse/HADOOP-9361
 Project: Hadoop Common
  Issue Type: Improvement
  Components: fs, test
Affects Versions: 2.4.0, 3.0.0
Reporter: Steve Loughran
Assignee: Steve Loughran
 Attachments: HADOOP-9361-001.patch, HADOOP-9361-002.patch, 
 HADOOP-9361-003.patch, HADOOP-9361-004.patch, HADOOP-9361-005.patch, 
 HADOOP-9361-006.patch, HADOOP-9361-007.patch, HADOOP-9361-008.patch, 
 HADOOP-9361-009.patch, HADOOP-9361-011.patch, HADOOP-9361-012.patch, 
 HADOOP-9361-013.patch, HADOOP-9361-014.patch, HADOOP-9361.awang-addendum.patch


 {{FileSystem}} and {{FileContract}} aren't tested rigorously enough; while 
 HDFS gets tested downstream, other filesystems, such as blobstore bindings, 
 don't.
 The only common tests are those of {{FileSystemContractTestBase}}, 
 which HADOOP-9258 shows is incomplete.
 I propose:
 # writing more tests which clarify expected behavior
 # moving the tests for each operation in the interface into their own JUnit4 
 test classes, instead of one big test suite
 # having each FS declare via a properties file which behaviors it offers, 
 such as atomic-rename, atomic-delete, umask, and immediate-consistency; test 
 methods can downgrade to skipped test cases if a feature is missing.
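
Point 3 could look roughly like this (a sketch with assumed names, not the patch's actual classes): each filesystem ships a properties file of feature flags, and test methods consult it, skipping via JUnit4's Assume.assumeTrue when a feature is absent.

```java
import java.io.IOException;
import java.io.Reader;
import java.util.Properties;

// Hypothetical feature-flag reader; the property names used with it
// (atomic-rename etc.) mirror the examples in the proposal, not a spec.
public class ContractOptions {
    private final Properties props = new Properties();

    public ContractOptions(Reader source) throws IOException {
        props.load(source);
    }

    // A feature defaults to unsupported unless the FS declares it.
    public boolean supports(String feature) {
        return Boolean.parseBoolean(props.getProperty(feature, "false"));
    }
}
```

A rename test could then begin with `Assume.assumeTrue(options.supports("atomic-rename"))`, so a blobstore binding reports a skipped test rather than a failure.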





[jira] [Updated] (HADOOP-9361) Strictly define the expected behavior of filesystem APIs and write tests to verify compliance

2014-06-08 Thread Steve Loughran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-9361?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Loughran updated HADOOP-9361:
---

Attachment: HADOOP-9361-015.patch

This is revision -015 of the patch. It:

# incorporates all of Andrew's modifications. Andrew - thanks for putting in 
the effort!
# adds a section on object stores to the introduction, to clarify how they 
differ.

Once we add a `Blobstore` marker to object stores, we can expand that a bit 
more.

 Strictly define the expected behavior of filesystem APIs and write tests to 
 verify compliance
 -

 Key: HADOOP-9361
 URL: https://issues.apache.org/jira/browse/HADOOP-9361
 Project: Hadoop Common
  Issue Type: Improvement
  Components: fs, test
Affects Versions: 3.0.0, 2.4.0
Reporter: Steve Loughran
Assignee: Steve Loughran
 Attachments: HADOOP-9361-001.patch, HADOOP-9361-002.patch, 
 HADOOP-9361-003.patch, HADOOP-9361-004.patch, HADOOP-9361-005.patch, 
 HADOOP-9361-006.patch, HADOOP-9361-007.patch, HADOOP-9361-008.patch, 
 HADOOP-9361-009.patch, HADOOP-9361-011.patch, HADOOP-9361-012.patch, 
 HADOOP-9361-013.patch, HADOOP-9361-014.patch, HADOOP-9361-015.patch, 
 HADOOP-9361.awang-addendum.patch







[jira] [Updated] (HADOOP-10648) Service Authorization Improvements

2014-06-08 Thread Benoy Antony (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-10648?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Benoy Antony updated HADOOP-10648:
--

Description: Umbrella jira for a set of improvements on service 
Authorization  (was: Umbrella jira for set of improvements on service 
Authorization)

 Service Authorization Improvements
 --

 Key: HADOOP-10648
 URL: https://issues.apache.org/jira/browse/HADOOP-10648
 Project: Hadoop Common
  Issue Type: Improvement
  Components: security
Reporter: Benoy Antony
Assignee: Benoy Antony

 Umbrella jira for a set of improvements on service Authorization





[jira] [Updated] (HADOOP-9361) Strictly define the expected behavior of filesystem APIs and write tests to verify compliance

2014-06-08 Thread Steve Loughran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-9361?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Loughran updated HADOOP-9361:
---

Status: Patch Available  (was: Open)

 Strictly define the expected behavior of filesystem APIs and write tests to 
 verify compliance
 -

 Key: HADOOP-9361
 URL: https://issues.apache.org/jira/browse/HADOOP-9361
 Project: Hadoop Common
  Issue Type: Improvement
  Components: fs, test
Affects Versions: 2.4.0, 3.0.0
Reporter: Steve Loughran
Assignee: Steve Loughran
 Attachments: HADOOP-9361-001.patch, HADOOP-9361-002.patch, 
 HADOOP-9361-003.patch, HADOOP-9361-004.patch, HADOOP-9361-005.patch, 
 HADOOP-9361-006.patch, HADOOP-9361-007.patch, HADOOP-9361-008.patch, 
 HADOOP-9361-009.patch, HADOOP-9361-011.patch, HADOOP-9361-012.patch, 
 HADOOP-9361-013.patch, HADOOP-9361-014.patch, HADOOP-9361-015.patch, 
 HADOOP-9361.awang-addendum.patch







[jira] [Commented] (HADOOP-10561) Copy command with preserve option should handle Xattrs

2014-06-08 Thread Uma Maheswara Rao G (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-10561?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14021221#comment-14021221
 ] 

Uma Maheswara Rao G commented on HADOOP-10561:
--

Overall, the latest patch looks good to me.

Tiny nit:
input attibute -- input attribute

+1 from me once this comment is addressed.

 Copy command with preserve option should handle Xattrs
 --

 Key: HADOOP-10561
 URL: https://issues.apache.org/jira/browse/HADOOP-10561
 Project: Hadoop Common
  Issue Type: Improvement
  Components: fs
Affects Versions: 3.0.0
Reporter: Uma Maheswara Rao G
Assignee: Yi Liu
 Attachments: HADOOP-10561.1.patch, HADOOP-10561.2.patch, 
 HADOOP-10561.3.patch, HADOOP-10561.patch


 The design docs for XAttrs stated that we handle preserve options with copy 
 commands.
 From the doc:
 The preserve option of commands like the “cp -p” shell command and “distcp -p” 
 should work on XAttrs.
 If the source fs supports XAttrs but the target fs does not, XAttrs will be 
 ignored with a warning message.





[jira] [Commented] (HADOOP-9361) Strictly define the expected behavior of filesystem APIs and write tests to verify compliance

2014-06-08 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-9361?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14021294#comment-14021294
 ] 

Hadoop QA commented on HADOOP-9361:
---

{color:green}+1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12648875/HADOOP-9361-015.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 77 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-common-project/hadoop-common hadoop-hdfs-project/hadoop-hdfs 
hadoop-tools/hadoop-openstack.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-HADOOP-Build/4024//testReport/
Console output: 
https://builds.apache.org/job/PreCommit-HADOOP-Build/4024//console

This message is automatically generated.

 Strictly define the expected behavior of filesystem APIs and write tests to 
 verify compliance
 -

 Key: HADOOP-9361
 URL: https://issues.apache.org/jira/browse/HADOOP-9361
 Project: Hadoop Common
  Issue Type: Improvement
  Components: fs, test
Affects Versions: 3.0.0, 2.4.0
Reporter: Steve Loughran
Assignee: Steve Loughran
 Attachments: HADOOP-9361-001.patch, HADOOP-9361-002.patch, 
 HADOOP-9361-003.patch, HADOOP-9361-004.patch, HADOOP-9361-005.patch, 
 HADOOP-9361-006.patch, HADOOP-9361-007.patch, HADOOP-9361-008.patch, 
 HADOOP-9361-009.patch, HADOOP-9361-011.patch, HADOOP-9361-012.patch, 
 HADOOP-9361-013.patch, HADOOP-9361-014.patch, HADOOP-9361-015.patch, 
 HADOOP-9361.awang-addendum.patch







[jira] [Commented] (HADOOP-10641) Introduce Coordination Engine

2014-06-08 Thread Plamen Jeliazkov (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-10641?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14021473#comment-14021473
 ] 

Plamen Jeliazkov commented on HADOOP-10641:
---

Hi Lohit, thanks for your comments!
# checkQuorum is an optimization some coordination engines may choose to 
implement in order to fail fast on client requests. In the NameNode case, if 
quorum loss were suspected, that NameNode could start issuing StandbyExceptions.
# You are correct that the ZKCoordinationEngine does not currently implement 
ZNode clean-up. That is because it was made as a proof of concept for the 
CoordinationEngine API. Nonetheless, proper clean-up can be implemented: all 
one has to do is delete the ZNodes that everyone else has already learned about.
## Suppose you have Nodes A, B, and C, and Agreements 1, 2, 3, 4, and 5.
## Nodes A and B learn Agreement 1 first. Node C is a lagging node. A & B 
contain 1. C contains nothing.
## Nodes A and B continue onwards, learning up to Agreement 4. A & B now 
contain 1, 2, 3, and 4. C contains nothing.
## Node C finally learns Agreement 1. A & B still contain 1, 2, 3, and 4. C 
contains 1.
## We can now discard Agreement 1 from persistence, because we know that all 
the Nodes, A, B, and C, have safely learned about and applied Agreement 1.
## We can apply this process to all other Agreements.
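
The discard rule in the steps above can be sketched as pure bookkeeping (hypothetical names, not the ZKCoordinationEngine's code): the highest agreement safe to delete is the minimum of the nodes' last-applied agreement ids.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;

// Hypothetical clean-up bookkeeping for the scheme described above.
public class AgreementGc {
    // Given each node's last learned/applied agreement id, every agreement
    // up to the minimum has been learned by all nodes and may be discarded.
    static long safeWatermark(Map<String, Long> lastApplied) {
        return lastApplied.isEmpty() ? 0L : Collections.min(lastApplied.values());
    }

    // ZNodes for agreements at or below the watermark can be deleted.
    static List<Long> collectible(List<Long> persistedIds, long watermark) {
        List<Long> out = new ArrayList<>();
        for (long id : persistedIds) {
            if (id <= watermark) {
                out.add(id);
            }
        }
        return out;
    }
}
```

In the walk-through above, after Node C learns Agreement 1 the watermark is 1 (A=4, B=4, C=1), so only Agreement 1's ZNode is deletable.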

 Introduce Coordination Engine
 -

 Key: HADOOP-10641
 URL: https://issues.apache.org/jira/browse/HADOOP-10641
 Project: Hadoop Common
  Issue Type: New Feature
Affects Versions: 3.0.0
Reporter: Konstantin Shvachko
Assignee: Plamen Jeliazkov
 Attachments: HADOOP-10641.patch, HADOOP-10641.patch, 
 HADOOP-10641.patch







[jira] [Updated] (HADOOP-10669) Avro serialization does not flush buffered serialized values causing data loss

2014-06-08 Thread Mikhail Bernadsky (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-10669?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mikhail Bernadsky updated HADOOP-10669:
---

Attachment: HADOOP-10669_alt.patch

 Avro serialization does not flush buffered serialized values causing data loss
 --

 Key: HADOOP-10669
 URL: https://issues.apache.org/jira/browse/HADOOP-10669
 Project: Hadoop Common
  Issue Type: Bug
  Components: io
Affects Versions: 2.4.0
Reporter: Mikhail Bernadsky
 Attachments: HADOOP-10669.patch, HADOOP-10669_alt.patch


 Found this while debugging Nutch. 
 MapTask serializes keys and values to the same stream, in pairs: 
 keySerializer.serialize(key); 
 ... 
 valSerializer.serialize(value);
 ... 
 bb.write(b0, 0, 0); 
 AvroSerializer does not flush its buffer after each serialization, so if it 
 is used as valSerializer, the values are only partially written, or not 
 written at all, to the output stream before the record is marked as complete 
 (the last line above).





[jira] [Updated] (HADOOP-10669) Avro serialization does not flush buffered serialized values causing data loss

2014-06-08 Thread Mikhail Bernadsky (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-10669?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mikhail Bernadsky updated HADOOP-10669:
---

Description: 
Found this while debugging Nutch. 

MapTask serializes keys and values to the same stream, in pairs: 

keySerializer.serialize(key); 
... 
valSerializer.serialize(value);
... 
bb.write(b0, 0, 0); 

AvroSerializer does not flush its buffer after each serialization, so if it is 
used as valSerializer, the values are only partially written, or not written at 
all, to the output stream before the record is marked as complete (the last line 
above).

EDIT: Added HADOOP-10669_alt.patch. This is a less intrusive fix, as it does 
not try to flush the MapTask stream. Instead, we write serialized values directly 
to the MapTask stream and avoid using a buffer on the Avro side. 
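
The failure mode, and both fixes, can be illustrated with plain java.io streams (this is an illustration of the buffering behavior, not Hadoop's or Avro's actual classes):

```java
import java.io.BufferedOutputStream;
import java.io.IOException;
import java.io.OutputStream;

// A value writer with its own buffer, like AvroSerializer's. Until that
// buffer is flushed, the enclosing stream sees nothing, even though the
// caller believes the record is complete.
public class ValueWriter {
    private final BufferedOutputStream buffered;
    private final OutputStream direct;

    public ValueWriter(OutputStream out) {
        this.buffered = new BufferedOutputStream(out, 8192);
        this.direct = out;
    }

    // Buggy shape: bytes can remain in the private buffer.
    public void writeBuffered(byte[] value) throws IOException {
        buffered.write(value);
    }

    // Fix 1: flush the private buffer after every record.
    public void writeAndFlush(byte[] value) throws IOException {
        buffered.write(value);
        buffered.flush();
    }

    // Fix 2 (the _alt patch's approach): skip the private buffer entirely
    // and write straight to the caller's stream.
    public void writeDirect(byte[] value) throws IOException {
        direct.write(value);
    }
}
```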

  was:
Found this debugging Nutch. 

MapTask serializes keys and values to the same stream, in pairs: 

keySerializer.serialize(key); 
. 
valSerializer.serialize(value);
 . 
bb.write(b0, 0, 0); 

AvroSerializer does not flush its buffer after each serialization. So if it is 
used for valSerializer, the values are only partially written or not written at 
all to the output stream before the record is marked as complete (the last line 
above).


 Avro serialization does not flush buffered serialized values causing data loss
 --

 Key: HADOOP-10669
 URL: https://issues.apache.org/jira/browse/HADOOP-10669
 Project: Hadoop Common
  Issue Type: Bug
  Components: io
Affects Versions: 2.4.0
Reporter: Mikhail Bernadsky
 Attachments: HADOOP-10669.patch, HADOOP-10669_alt.patch







[jira] [Updated] (HADOOP-9629) Support Windows Azure Storage - Blob as a file system in Hadoop

2014-06-08 Thread Mike Liddell (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-9629?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mike Liddell updated HADOOP-9629:
-

Attachment: HADOOP-9629.trunk.3.patch

 Support Windows Azure Storage - Blob as a file system in Hadoop
 ---

 Key: HADOOP-9629
 URL: https://issues.apache.org/jira/browse/HADOOP-9629
 Project: Hadoop Common
  Issue Type: Improvement
Reporter: Mostafa Elhemali
Assignee: Mike Liddell
 Attachments: HADOOP-9629 - Azure Filesystem - Information for 
 developers.docx, HADOOP-9629 - Azure Filesystem - Information for 
 developers.pdf, HADOOP-9629.2.patch, HADOOP-9629.3.patch, HADOOP-9629.patch, 
 HADOOP-9629.trunk.1.patch, HADOOP-9629.trunk.2.patch, 
 HADOOP-9629.trunk.3.patch


 h2. Description
 This JIRA adds a new file system implementation for accessing 
 Windows Azure Storage - Blob from within Hadoop, such as using blobs as input 
 to MR jobs or configuring MR jobs to put their output directly into blob 
 storage.
 h2. High level design
 At a high level, the code here extends the FileSystem class to provide an 
 implementation for accessing blob storage; the scheme wasb is used for 
 accessing it over HTTP, and wasbs for accessing it over HTTPS. We use the URI 
 scheme {code}wasb[s]://container@account/path/to/file{code} to address 
 individual blobs. We use the standard Azure Java SDK 
 (com.microsoft.windowsazure) to do most of the work. In order to map a 
 hierarchical file system onto the flat name-value pair nature of blob 
 storage, we create a specially tagged blob named path/to/dir whenever we 
 create a directory called path/to/dir; files under that are then stored as 
 normal blobs path/to/dir/file. We have many metrics implemented using 
 the Metrics2 interface. Tests are implemented mostly using a mock 
 implementation of the Azure SDK functionality, with an option to test 
 against real blob storage if configured (instructions are provided in 
 README.txt).
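
As a sketch of that addressing (using only java.net.URI; the real resolution happens inside the wasb FileSystem binding, and a real account authority would normally be of the form account.blob.core.windows.net): the container travels in the URI's user-info component and the account in its host component.

```java
import java.net.URI;

// Hypothetical illustration of how a wasb[s] URI decomposes; not code from
// the HADOOP-9629 patch.
public class WasbUri {
    static String container(URI uri) { return uri.getUserInfo(); }
    static String account(URI uri)   { return uri.getHost(); }
    static String blobPath(URI uri)  { return uri.getPath(); }
    static boolean secure(URI uri)   { return "wasbs".equals(uri.getScheme()); }
}
```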
 h2. Credits and history
 This has been ongoing work for a while; an early version of it can be 
 seen in HADOOP-8079. This JIRA is a significant revision of that, and 
 we'll post the patch here for Hadoop trunk first, then post a patch for 
 branch-1 as well to backport the functionality if accepted. Credit for 
 this work goes to the early team: [~minwei], [~davidlao], [~lengningliu] and 
 [~stojanovic], as well as the multiple people who have taken over this work 
 since then (hope I don't forget anyone): [~dexterb], Johannes Klein, [~ivanmi], 
 Michael Rys, [~mostafae], [~brian_swan], [~mikelid], [~xifang], and 
 [~chuanliu].
 h2. Test
 Besides unit tests, we have used WASB as the default file system in our 
 service product. (HDFS is also used, but not as the default file system.) 
 Various customer and test workloads have been run against clusters with 
 such configurations for quite some time. The current version reflects the 
 code tested and used in our production environment.





[jira] [Commented] (HADOOP-9629) Support Windows Azure Storage - Blob as a file system in Hadoop

2014-06-08 Thread Mike Liddell (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-9629?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14021497#comment-14021497
 ] 

Mike Liddell commented on HADOOP-9629:
--

The annotations and suggested usages sound good.
The only changes I suggest are:
- AzureException: Public + Evolving
- WasbFsck: Public + Evolving

Sound good?

 Support Windows Azure Storage - Blob as a file system in Hadoop
 ---

 Key: HADOOP-9629
 URL: https://issues.apache.org/jira/browse/HADOOP-9629
 Project: Hadoop Common
  Issue Type: Improvement
Reporter: Mostafa Elhemali
Assignee: Mike Liddell
 Attachments: HADOOP-9629 - Azure Filesystem - Information for 
 developers.docx, HADOOP-9629 - Azure Filesystem - Information for 
 developers.pdf, HADOOP-9629.2.patch, HADOOP-9629.3.patch, HADOOP-9629.patch, 
 HADOOP-9629.trunk.1.patch, HADOOP-9629.trunk.2.patch, 
 HADOOP-9629.trunk.3.patch







[jira] [Updated] (HADOOP-10561) Copy command with preserve option should handle Xattrs

2014-06-08 Thread Yi Liu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-10561?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yi Liu updated HADOOP-10561:


Attachment: HADOOP-10561.4.patch

Thanks, Uma, for the review. The new patch addresses your comments.

 Copy command with preserve option should handle Xattrs
 --

 Key: HADOOP-10561
 URL: https://issues.apache.org/jira/browse/HADOOP-10561
 Project: Hadoop Common
  Issue Type: Improvement
  Components: fs
Affects Versions: 3.0.0
Reporter: Uma Maheswara Rao G
Assignee: Yi Liu
 Attachments: HADOOP-10561.1.patch, HADOOP-10561.2.patch, 
 HADOOP-10561.3.patch, HADOOP-10561.4.patch, HADOOP-10561.patch







[jira] [Updated] (HADOOP-9629) Support Windows Azure Storage - Blob as a file system in Hadoop

2014-06-08 Thread Mike Liddell (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-9629?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mike Liddell updated HADOOP-9629:
-

Attachment: (was: HADOOP-9629.trunk.3.patch)

 Support Windows Azure Storage - Blob as a file system in Hadoop
 ---

 Key: HADOOP-9629
 URL: https://issues.apache.org/jira/browse/HADOOP-9629
 Project: Hadoop Common
  Issue Type: Improvement
Reporter: Mostafa Elhemali
Assignee: Mike Liddell
 Attachments: HADOOP-9629 - Azure Filesystem - Information for 
 developers.docx, HADOOP-9629 - Azure Filesystem - Information for 
 developers.pdf, HADOOP-9629.2.patch, HADOOP-9629.3.patch, HADOOP-9629.patch, 
 HADOOP-9629.trunk.1.patch, HADOOP-9629.trunk.2.patch







[jira] [Updated] (HADOOP-9629) Support Windows Azure Storage - Blob as a file system in Hadoop

2014-06-08 Thread Mike Liddell (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-9629?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mike Liddell updated HADOOP-9629:
-

Attachment: HADOOP-9629.trunk.3.patch

 Support Windows Azure Storage - Blob as a file system in Hadoop
 ---

 Key: HADOOP-9629
 URL: https://issues.apache.org/jira/browse/HADOOP-9629
 Project: Hadoop Common
  Issue Type: Improvement
Reporter: Mostafa Elhemali
Assignee: Mike Liddell
 Attachments: HADOOP-9629 - Azure Filesystem - Information for 
 developers.docx, HADOOP-9629 - Azure Filesystem - Information for 
 developers.pdf, HADOOP-9629.2.patch, HADOOP-9629.3.patch, HADOOP-9629.patch, 
 HADOOP-9629.trunk.1.patch, HADOOP-9629.trunk.2.patch, 
 HADOOP-9629.trunk.3.patch


 h2. Description
 This JIRA incorporates adding a new file system implementation for accessing 
 Windows Azure Storage - Blob from within Hadoop, such as using blobs as input 
 to MR jobs or configuring MR jobs to put their output directly into blob 
 storage.
 h2. High level design
 At a high level, the code here extends the FileSystem class to provide an 
 implementation for accessing blob storage; the scheme wasb is used for 
 accessing it over HTTP, and wasbs for accessing over HTTPS. We use the URI 
 scheme: {code}wasb[s]://container@account/path/to/file{code} to address 
 individual blobs. We use the standard Azure Java SDK 
 (com.microsoft.windowsazure) to do most of the work. In order to map a 
 hierarchical file system over the flat name-value pair nature of blob 
 storage, we create a specially tagged blob named path/to/dir whenever we 
 create a directory called path/to/dir, then files under that are stored as 
 normal blobs path/to/dir/file. We have many metrics implemented for it using 
 the Metrics2 interface. Tests are implemented mostly using a mock 
 implementation for the Azure SDK functionality, with an option to test 
 against a real blob storage if configured (instructions provided inside in 
 README.txt).
 h2. Credits and history
 This has been ongoing work for a while, and the early version of this work 
 can be seen in HADOOP-8079. This JIRA is a significant revision of that and 
 we'll post the patch here for Hadoop trunk first, then post a patch for 
 branch-1 as well for backporting the functionality if accepted. Credit for 
 this work goes to the early team: [~minwei], [~davidlao], [~lengningliu] and 
 [~stojanovic] as well as multiple people who have taken over this work since 
 then (hope I don't forget anyone): [~dexterb], Johannes Klein, [~ivanmi], 
 Michael Rys, [~mostafae], [~brian_swan], [~mikelid], [~xifang], and 
 [~chuanliu].
 h2. Test
 Besides unit tests, we have used WASB as the default file system in our 
 service product. (HDFS is also used, but not as the default file system.) A 
 variety of customer and test workloads have been run against clusters with 
 this configuration for quite some time. The current version reflects the 
 code tested and used in our production environment.
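The directory-marker scheme described in the design section above can be sketched as plain path manipulation over a flat name-value store. This is a hypothetical illustration, not the actual hadoop-azure code; the class, helper names, and marker value are invented:

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class WasbPathSketch {
    // A flat blob store: blob name -> contents (directory markers hold a tag).
    static Map<String, String> blobs = new LinkedHashMap<>();

    // Strip the leading '/' so "/path/to/dir" maps to blob name "path/to/dir".
    static String blobName(String path) {
        return path.startsWith("/") ? path.substring(1) : path;
    }

    // mkdir creates a specially tagged blob named after the directory itself.
    static void mkdir(String path) {
        blobs.put(blobName(path), "<dir-marker>");
    }

    // Files under a directory are stored as ordinary blobs "path/to/dir/file".
    static void createFile(String path, String data) {
        blobs.put(blobName(path), data);
    }

    // Listing a directory becomes a prefix scan over the flat namespace.
    static List<String> list(String dirPath) {
        String prefix = blobName(dirPath) + "/";
        List<String> out = new ArrayList<>();
        for (String name : blobs.keySet()) {
            if (name.startsWith(prefix)) out.add(name);
        }
        return out;
    }

    public static void main(String[] args) {
        mkdir("/path/to/dir");
        createFile("/path/to/dir/file", "hello");
        System.out.println(blobs.get("path/to/dir"));  // prints <dir-marker>
        System.out.println(list("/path/to/dir"));      // prints [path/to/dir/file]
    }
}
```

The key point the sketch shows is that the store itself has no hierarchy; directories exist only because the marker blob and the shared name prefix make them recoverable.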



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HADOOP-9629) Support Windows Azure Storage - Blob as a file system in Hadoop

2014-06-08 Thread Mike Liddell (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-9629?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14021499#comment-14021499
 ] 

Mike Liddell commented on HADOOP-9629:
--

new patch: HADOOP-9629.trunk.4.patch
 - addresses code-review comments from [~cnauroth], see 
https://reviews.apache.org/r/22096/
 - adds InterfaceAudience and InterfaceStability annotations to the main 
classes.




--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HADOOP-9629) Support Windows Azure Storage - Blob as a file system in Hadoop

2014-06-08 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-9629?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14021502#comment-14021502
 ] 

Hadoop QA commented on HADOOP-9629:
---

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  
http://issues.apache.org/jira/secure/attachment/12648900/HADOOP-9629.trunk.3.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 25 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:red}-1 javadoc{color}.  The javadoc tool appears to have generated 
100 warning messages.
See 
https://builds.apache.org/job/PreCommit-HADOOP-Build/4025//artifact/trunk/patchprocess/diffJavadocWarnings.txt
 for details.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-tools/hadoop-azure hadoop-tools/hadoop-tools-dist.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-HADOOP-Build/4025//testReport/
Console output: 
https://builds.apache.org/job/PreCommit-HADOOP-Build/4025//console

This message is automatically generated.




--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HADOOP-9629) Support Windows Azure Storage - Blob as a file system in Hadoop

2014-06-08 Thread Mike Liddell (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-9629?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14021506#comment-14021506
 ] 

Mike Liddell commented on HADOOP-9629:
--

The previous comment gave the wrong patch file name: the new patch is 
HADOOP-9629.trunk.3.patch.




--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HADOOP-9629) Support Windows Azure Storage - Blob as a file system in Hadoop

2014-06-08 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-9629?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14021511#comment-14021511
 ] 

Hadoop QA commented on HADOOP-9629:
---

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  
http://issues.apache.org/jira/secure/attachment/12648902/HADOOP-9629.trunk.3.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 25 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:red}-1 javadoc{color}.  The javadoc tool appears to have generated 
100 warning messages.
See 
https://builds.apache.org/job/PreCommit-HADOOP-Build/4027//artifact/trunk/patchprocess/diffJavadocWarnings.txt
 for details.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-tools/hadoop-azure hadoop-tools/hadoop-tools-dist.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-HADOOP-Build/4027//testReport/
Console output: 
https://builds.apache.org/job/PreCommit-HADOOP-Build/4027//console

This message is automatically generated.




--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HADOOP-10640) Implement Namenode RPCs in HDFS native client

2014-06-08 Thread Wenwu Peng (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-10640?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14021541#comment-14021541
 ] 

Wenwu Peng commented on HADOOP-10640:
-

Ran {code}NAMENODE_URI=hdfs://localhost:8020 ./test_libhdfs_meta_ops{code} and 
hit this error:

{code}
hdfsBuilderConnect: ndfs failed to connect: 
org.apache.hadoop.native.HadoopCore.OutOfMemoryException: 
cnn_get_server_defaults: failed to allocate sync_ctx (error 12)
org.apache.hadoop.native.HadoopCore.OutOfMemoryException: 
cnn_get_server_defaults: failed to allocate sync_ctx
error: did not expect '0': 
'/root/hadoop-common/hadoop-native-core/src/main/native/test/fs/test_libhdfs_meta_ops.c at line 60'
{code}

 Implement Namenode RPCs in HDFS native client
 -

 Key: HADOOP-10640
 URL: https://issues.apache.org/jira/browse/HADOOP-10640
 Project: Hadoop Common
  Issue Type: Sub-task
  Components: native
Affects Versions: HADOOP-10388
Reporter: Colin Patrick McCabe
Assignee: Colin Patrick McCabe
 Attachments: HADOOP-10640-pnative.001.patch, 
 HADOOP-10640-pnative.002.patch, HADOOP-10640-pnative.003.patch


 Implement the parts of libhdfs that just involve making RPCs to the Namenode, 
 such as mkdir, rename, etc.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HADOOP-10561) Copy command with preserve option should handle Xattrs

2014-06-08 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-10561?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14021558#comment-14021558
 ] 

Hadoop QA commented on HADOOP-10561:


{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12648901/HADOOP-10561.4.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 2 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:red}-1 core tests{color}.  The patch failed these unit tests in 
hadoop-common-project/hadoop-common hadoop-hdfs-project/hadoop-hdfs:

  org.apache.hadoop.hdfs.server.datanode.TestBPOfferService

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-HADOOP-Build/4026//testReport/
Console output: 
https://builds.apache.org/job/PreCommit-HADOOP-Build/4026//console

This message is automatically generated.

 Copy command with preserve option should handle Xattrs
 --

 Key: HADOOP-10561
 URL: https://issues.apache.org/jira/browse/HADOOP-10561
 Project: Hadoop Common
  Issue Type: Improvement
  Components: fs
Affects Versions: 3.0.0
Reporter: Uma Maheswara Rao G
Assignee: Yi Liu
 Attachments: HADOOP-10561.1.patch, HADOOP-10561.2.patch, 
 HADOOP-10561.3.patch, HADOOP-10561.4.patch, HADOOP-10561.patch


 The design doc for XAttrs stated that we should handle the preserve option of 
 copy commands.
 From the doc:
 The preserve option of commands like the "cp -p" shell command and "distcp -p" 
 should work on XAttrs. If the source fs supports XAttrs but the target fs 
 does not, XAttrs will be ignored with a warning message.
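The warn-and-skip behavior described above can be sketched as follows. This is a hypothetical illustration with invented names (copyXAttrs, targetSupportsXAttrs), not the actual FsShell or DistCp code:

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class XAttrCopySketch {
    // Copy xattrs from source to target; if the target does not support
    // xattrs, skip them and emit a warning instead of failing the copy.
    // Returns the number of attributes actually copied.
    static int copyXAttrs(Map<String, byte[]> sourceXAttrs,
                          Map<String, byte[]> targetXAttrs,
                          boolean targetSupportsXAttrs) {
        if (!targetSupportsXAttrs) {
            System.err.println("WARN: target filesystem does not support "
                + "XAttrs; ignoring " + sourceXAttrs.size() + " attribute(s)");
            return 0;
        }
        targetXAttrs.putAll(sourceXAttrs);
        return sourceXAttrs.size();
    }

    public static void main(String[] args) {
        Map<String, byte[]> src = new LinkedHashMap<>();
        src.put("user.owner-team", "search".getBytes());
        Map<String, byte[]> dst = new LinkedHashMap<>();
        System.out.println(copyXAttrs(src, dst, true));   // prints 1
        System.out.println(copyXAttrs(src, dst, false));  // warns, prints 0
    }
}
```

The design choice worth noting is that an unsupported target degrades the copy (attributes dropped, warning logged) rather than aborting it, matching the quoted design-doc text.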



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Created] (HADOOP-10670) Allow AuthenticationFilter to respect signature secret file even without AuthenticationFilterInitializer

2014-06-08 Thread Kai Zheng (JIRA)
Kai Zheng created HADOOP-10670:
--

 Summary: Allow AuthenticationFilter to respect signature secret 
file even without AuthenticationFilterInitializer
 Key: HADOOP-10670
 URL: https://issues.apache.org/jira/browse/HADOOP-10670
 Project: Hadoop Common
  Issue Type: Improvement
Reporter: Kai Zheng
Assignee: Kai Zheng
Priority: Minor


In the Hadoop web console, AuthenticationFilterInitializer makes it possible to 
configure AuthenticationFilter with the required signature secret by specifying 
the signature.secret.file property. This improvement would also allow that when 
AuthenticationFilterInitializer isn't used, in situations like webhdfs.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HADOOP-10670) Allow AuthenticationFilter to respect signature secret file even without AuthenticationFilterInitializer

2014-06-08 Thread Kai Zheng (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-10670?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kai Zheng updated HADOOP-10670:
---

Attachment: hadoop-10670.patch




--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HADOOP-10670) Allow AuthenticationFilter to respect signature secret file even without AuthenticationFilterInitializer

2014-06-08 Thread Kai Zheng (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-10670?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kai Zheng updated HADOOP-10670:
---

Status: Patch Available  (was: Open)

Attached a patch. Summary of changes:
1. Moved the signature secret file reading from AuthenticationFilterInitializer 
to AuthenticationFilter.
2. In AuthenticationFilter, if SIGNATURE_SECRET is configured, use it; 
otherwise, if SIGNATURE_SECRET_FILE is configured, read the secret from that 
file; otherwise generate a random secret as before.
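The fallback order in item 2 can be sketched as below. This is a hypothetical illustration, not the patch itself; the class name, method name, and use of java.util.Properties are invented, though the two property names follow the description:

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.security.SecureRandom;
import java.util.Properties;

public class SecretResolver {
    static final String SIGNATURE_SECRET = "signature.secret";
    static final String SIGNATURE_SECRET_FILE = "signature.secret.file";

    // Resolve the signing secret: an explicit configured value wins,
    // then a secret file, then a randomly generated secret as before.
    static String resolveSecret(Properties config) {
        String secret = config.getProperty(SIGNATURE_SECRET);
        if (secret != null) {
            return secret;
        }
        String file = config.getProperty(SIGNATURE_SECRET_FILE);
        if (file != null) {
            try {
                // Read the whole secret file and trim the trailing newline.
                return new String(Files.readAllBytes(Paths.get(file))).trim();
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
        }
        // Neither property set: fall back to a random secret.
        return Long.toString(new SecureRandom().nextLong());
    }

    public static void main(String[] args) {
        Properties conf = new Properties();
        conf.setProperty(SIGNATURE_SECRET, "s3cret");
        System.out.println(resolveSecret(conf));  // prints s3cret
    }
}
```

A usage consequence of this ordering: deployments that set both properties keep their current behavior, since the inline secret shadows the file.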





--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HADOOP-10640) Implement Namenode RPCs in HDFS native client

2014-06-08 Thread Binglin Chang (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-10640?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14021649#comment-14021649
 ] 

Binglin Chang commented on HADOOP-10640:


 Isn't sizeof(struct hrpc_proxy) always larger than RPC_PROXY_USERDATA_MAX?
{code}
void *hrpc_proxy_alloc_userdata(struct hrpc_proxy *proxy, size_t size)
{
    if (size > RPC_PROXY_USERDATA_MAX) {
        return NULL;
    }
    return proxy->userdata;
}

struct hrpc_sync_ctx *hrpc_proxy_alloc_sync_ctx(struct hrpc_proxy *proxy)
{
    struct hrpc_sync_ctx *ctx = 
        hrpc_proxy_alloc_userdata(proxy, sizeof(struct hrpc_proxy));
    if (!ctx) {
        return NULL;
    }
    if (uv_sem_init(&ctx->sem, 0)) {
        return NULL;
    }
    memset(ctx, 0, sizeof(ctx));
    return ctx;
}
{code}




--
This message was sent by Atlassian JIRA
(v6.2#6252)