[jira] [Created] (HDFS-13616) Batch listing of multiple directories

2018-05-24 Thread Andrew Wang (JIRA)
Andrew Wang created HDFS-13616:
--

 Summary: Batch listing of multiple directories
 Key: HDFS-13616
 URL: https://issues.apache.org/jira/browse/HDFS-13616
 Project: Hadoop HDFS
  Issue Type: New Feature
Affects Versions: 3.2.0
Reporter: Andrew Wang
Assignee: Andrew Wang


One of the dominant workloads for external metadata services is listing of 
partition directories. This can end up bottlenecked on round-trip time (RTT) when 
partition directories contain a small number of files. This is fairly common, 
since fine-grained partitioning is used for partition pruning by the query 
engines.

A batched listing API that takes multiple paths amortizes the RTT cost. Initial 
benchmarks show a 10-20x improvement in metadata loading performance.
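A minimal sketch of the idea, in plain Java with hypothetical names (MetadataService, batchedList are illustrative only, not the actual HDFS-13616 API): the client pays one round trip for the whole batch instead of one per directory.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

/**
 * Illustrative sketch (not the HDFS-13616 API): batching many small
 * directory listings into one RPC so the round trip is paid once.
 * MetadataService and its methods are hypothetical names.
 */
public class BatchListingSketch {
    public interface MetadataService {
        List<String> list(String dir); // one round trip per call
    }

    // In a real protocol this whole loop runs server-side, so the client
    // pays one RTT instead of dirs.size() RTTs.
    public static Map<String, List<String>> batchedList(MetadataService svc, List<String> dirs) {
        return dirs.stream().collect(Collectors.toMap(d -> d, svc::list));
    }

    public static void main(String[] args) {
        MetadataService svc = dir -> Arrays.asList(dir + "/part-00000");
        System.out.println(batchedList(svc, Arrays.asList("/t/p=1", "/t/p=2")).size());
    }
}
```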



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org



[jira] [Created] (HDFS-13611) Unsafe use of Text as a ConcurrentHashMap key in PBHelperClient

2018-05-23 Thread Andrew Wang (JIRA)
Andrew Wang created HDFS-13611:
--

 Summary: Unsafe use of Text as a ConcurrentHashMap key in 
PBHelperClient
 Key: HDFS-13611
 URL: https://issues.apache.org/jira/browse/HDFS-13611
 Project: Hadoop HDFS
  Issue Type: Bug
Reporter: Andrew Wang
Assignee: Andrew Wang


Follow-on to HDFS-13601, a bug spotted by [~tlipcon]: since Text is mutable, 
it's not safe to use as a hash map key.
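The failure mode can be shown with any mutable key; here MutableKey stands in for org.apache.hadoop.io.Text (whose backing bytes can be changed via Text#set after insertion):

```java
import java.util.HashMap;
import java.util.Map;

/**
 * Illustrates why a mutable object is unsafe as a hash-map key.
 * MutableKey is a stand-in for org.apache.hadoop.io.Text.
 */
public class MutableKeyDemo {
    static final class MutableKey {
        String value;
        MutableKey(String v) { value = v; }
        @Override public int hashCode() { return value.hashCode(); }
        @Override public boolean equals(Object o) {
            return o instanceof MutableKey && ((MutableKey) o).value.equals(value);
        }
    }

    public static boolean lookupAfterMutation() {
        Map<MutableKey, Integer> map = new HashMap<>();
        MutableKey key = new MutableKey("token-kind");
        map.put(key, 42);
        key.value = "other"; // mutate after insertion: hashCode changes
        // The entry sits in the bucket chosen by the OLD hash, but its
        // equals() now answers for the NEW value, so lookups miss it.
        return map.containsKey(new MutableKey("token-kind"));
    }

    public static void main(String[] args) {
        System.out.println(lookupAfterMutation()); // false: the entry is lost
    }
}
```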






[jira] [Created] (HDFS-13601) Optimize ByteString conversions in PBHelper

2018-05-21 Thread Andrew Wang (JIRA)
Andrew Wang created HDFS-13601:
--

 Summary: Optimize ByteString conversions in PBHelper
 Key: HDFS-13601
 URL: https://issues.apache.org/jira/browse/HDFS-13601
 Project: Hadoop HDFS
  Issue Type: Bug
Affects Versions: 2.9.1, 3.1.0
Reporter: Andrew Wang
Assignee: Andrew Wang


While doing some profiling of the NN with JMC, I saw a lot of time being spent 
on String->ByteString conversions. These are often the same strings being 
converted over and over again, meaning there's room for optimization.
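The optimization idea can be sketched as memoizing the conversion (illustrative only, not the actual HDFS-13601 patch; byte[] stands in for protobuf's ByteString):

```java
import java.nio.charset.StandardCharsets;
import java.util.concurrent.ConcurrentHashMap;

/**
 * Sketch: memoize repeated String->bytes conversions so each distinct
 * string is converted once. byte[] stands in for ByteString here.
 */
public class ConversionCache {
    private static final ConcurrentHashMap<String, byte[]> CACHE = new ConcurrentHashMap<>();

    public static byte[] toBytes(String s) {
        // computeIfAbsent converts each distinct string only once;
        // subsequent calls for the same string are a map lookup.
        return CACHE.computeIfAbsent(s, k -> k.getBytes(StandardCharsets.UTF_8));
    }

    public static void main(String[] args) {
        byte[] a = toBytes("datanodeUuid");
        byte[] b = toBytes("datanodeUuid");
        System.out.println(a == b); // same cached instance: true
    }
}
```

Note that this is only safe because ByteString is immutable; caching a raw byte[] like this would require defensive copies in production. The key, of course, must also be immutable, which is exactly the pitfall HDFS-13611 describes.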






[jira] [Reopened] (HDFS-12499) dfs.namenode.shared.edits.dir property is currently namenode specific key

2017-10-31 Thread Andrew Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-12499?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Wang reopened HDFS-12499:


> dfs.namenode.shared.edits.dir property is currently namenode specific key
> -
>
> Key: HDFS-12499
> URL: https://issues.apache.org/jira/browse/HDFS-12499
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: qjm
>Reporter: Bharat Viswanadham
>Assignee: Bharat Viswanadham
> Attachments: HDFS-12499.01.patch, HDFS-12499.02.patch
>
>
> HDFS + Federation cluster +QJM
> dfs.namenode.shared.edits.dir property can be set as:
> 1. dfs.namenode.shared.edits.dir.<nameserviceId>
> 2. dfs.namenode.shared.edits.dir.<nameserviceId>.<namenodeId>
> Both ways of configuring are currently supported. Option 2 should not be 
> supported, as for a particular nameservice the quorum of journal nodes should 
> be the same.
> If option 2 is supported, then for a nameservice Id that has two namenodes, 
> users can configure different journal node values for each namenode, which is 
> incorrect.
> Example:
> <configuration>
>   <property>
>     <name>dfs.nameservices</name>
>     <value>ns1,ns2</value>
>   </property>
>   <property>
>     <name>dfs.ha.namenodes.ns1</name>
>     <value>nn1,nn2</value>
>   </property>
>   <property>
>     <name>dfs.ha.namenodes.ns2</name>
>     <value>nn1,nn2</value>
>   </property>
>   <property>
>     <name>dfs.namenode.shared.edits.dir.ns1.nn1</name>
>     <value>qjournal://mycluster-node-1:8485;mycluster-node-2:8485;mycluster-node-3:8485/ns1</value>
>   </property>
>   <property>
>     <name>dfs.namenode.shared.edits.dir.ns1.nn2</name>
>     <value>qjournal://mycluster-node-3:8485;mycluster-node-4:8485;mycluster-node-5:8485/ns1</value>
>   </property>
>   <property>
>     <name>dfs.namenode.shared.edits.dir.ns2.nn1</name>
>     <value>qjournal://mycluster-node-1:8485;mycluster-node-2:8485;mycluster-node-3:8485/ns2</value>
>   </property>
>   <property>
>     <name>dfs.namenode.shared.edits.dir.ns2.nn2</name>
>     <value>qjournal://mycluster-node-3:8485;mycluster-node-4:8485;mycluster-node-5:8485/ns2</value>
>   </property>
> </configuration>
> This JIRA is to discuss whether we should keep supporting the 2nd way of 
> configuring, or remove it.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org



[jira] [Created] (HDFS-12660) Enable async edit logging test cases in TestFailureToReadEdits

2017-10-13 Thread Andrew Wang (JIRA)
Andrew Wang created HDFS-12660:
--

 Summary: Enable async edit logging test cases in 
TestFailureToReadEdits
 Key: HDFS-12660
 URL: https://issues.apache.org/jira/browse/HDFS-12660
 Project: Hadoop HDFS
  Issue Type: Improvement
Affects Versions: 2.9.0, 3.0.0
Reporter: Andrew Wang


Per discussion in HDFS-12603, this test is failing occasionally due to 
mysterious mocking issues. Let's try and fix them in this issue.






[jira] [Reopened] (HDFS-12603) Enable async edit logging by default

2017-10-09 Thread Andrew Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-12603?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Wang reopened HDFS-12603:


Reopening since the branch-2 patch broke the unit test.

> Enable async edit logging by default
> 
>
> Key: HDFS-12603
> URL: https://issues.apache.org/jira/browse/HDFS-12603
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: namenode
>Affects Versions: 2.8.0, 2.9.0, 3.0.0-alpha1
>Reporter: Andrew Wang
>Assignee: Andrew Wang
> Fix For: 3.0.0
>
> Attachments: HDFS-12603.001.patch, HDFS-12603.002.patch, 
> HDFS-12603.003.patch, HDFS-12603.branch-2.01.patch
>
>
> HDFS-7964 added support for async edit logging. Based on further discussion 
> in that JIRA, we think it's safe to turn this on by default for better 
> out-of-the-box performance.






[jira] [Created] (HDFS-12612) DFSStripedOutputStream#close will throw if called a second time with a failed streamer

2017-10-06 Thread Andrew Wang (JIRA)
Andrew Wang created HDFS-12612:
--

 Summary: DFSStripedOutputStream#close will throw if called a 
second time with a failed streamer
 Key: HDFS-12612
 URL: https://issues.apache.org/jira/browse/HDFS-12612
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: erasure-coding
Affects Versions: 3.0.0-beta1
Reporter: Andrew Wang


Found while testing with Hive. We have a cluster with 2 DNs and the XOR-2-1 
policy. If you write a file and call close() twice, it throws this exception:

{noformat}
17/10/04 16:02:14 WARN hdfs.DFSOutputStream: Cannot allocate parity 
block(index=2, policy=XOR-2-1-1024k). Not enough datanodes? Exclude nodes=[]
...
Caused by: java.io.IOException: Failed to get parity block, index=2
at 
org.apache.hadoop.hdfs.DFSStripedOutputStream.allocateNewBlock(DFSStripedOutputStream.java:500)
 ~[hadoop-hdfs-client-3.0.0-alpha3-cdh6.x-SNAPSHOT.jar:?]
at 
org.apache.hadoop.hdfs.DFSStripedOutputStream.writeChunk(DFSStripedOutputStream.java:524)
 ~[hadoop-hdfs-client-3.0.0-alpha3-cdh6.x-SNAPSHOT.jar:?]
{noformat}

This is because in DFSStripedOutputStream#closeImpl, if the stream is closed, 
we throw an exception if any of the striped streamers had an exception:

{code}
  protected synchronized void closeImpl() throws IOException {
if (isClosed()) {
  final MultipleIOException.Builder b = new MultipleIOException.Builder();
  for(int i = 0; i < streamers.size(); i++) {
final StripedDataStreamer si = getStripedDataStreamer(i);
try {
  si.getLastException().check(true);
} catch (IOException e) {
  b.add(e);
}
  }
  final IOException ioe = b.build();
  if (ioe != null) {
throw ioe;
  }
  return;
}
{code}

I think this is incorrect, since we only need to throw in this situation if we 
have too many failed streamers. close() should also be idempotent; if it's going 
to throw at all, it should throw the first time we call it.
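The idempotent-close pattern argued for here can be sketched as follows (illustrative only, not the actual DFSStripedOutputStream fix):

```java
import java.io.IOException;

/**
 * Sketch of an idempotent close(): throw (if at all) on the first call,
 * and make later calls no-ops instead of re-throwing.
 */
public class IdempotentClose {
    private boolean closed = false;

    public synchronized void close() throws IOException {
        if (closed) {
            return;              // second and later calls: no-op, no re-throw
        }
        closed = true;           // mark closed before work that may throw
        checkStreamerFailures(); // would throw only if too many streamers failed
    }

    protected void checkStreamerFailures() throws IOException {
        // In the real code: count failed streamers and throw only when the
        // failures exceed what the erasure-coding policy can tolerate.
    }

    public static boolean closeTwiceSucceeds() {
        try {
            IdempotentClose s = new IdempotentClose();
            s.close();
            s.close(); // must be a no-op, not a re-throw
            return true;
        } catch (IOException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(closeTwiceSucceeds());
    }
}
```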






[jira] [Created] (HDFS-12603) Enable async edit logging by default

2017-10-05 Thread Andrew Wang (JIRA)
Andrew Wang created HDFS-12603:
--

 Summary: Enable async edit logging by default
 Key: HDFS-12603
 URL: https://issues.apache.org/jira/browse/HDFS-12603
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: namenode
Affects Versions: 3.0.0-alpha1, 2.8.0, 2.9.0
Reporter: Andrew Wang
Assignee: Andrew Wang


HDFS-7964 added support for async edit logging. Based on further discussion in 
that JIRA, we think it's safe to turn this on by default for better 
out-of-the-box performance.






[jira] [Resolved] (HDFS-12442) WebHdfsFileSystem#getFileBlockLocations will always return BlockLocation#corrupt as false

2017-10-03 Thread Andrew Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-12442?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Wang resolved HDFS-12442.

Resolution: Invalid

> WebHdfsFileSystem#getFileBlockLocations will always return 
> BlockLocation#corrupt as false
> -
>
> Key: HDFS-12442
> URL: https://issues.apache.org/jira/browse/HDFS-12442
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: webhdfs
>Affects Versions: 2.9.0, 3.0.0-alpha2
>Reporter: Rushabh S Shah
>Assignee: Rushabh S Shah
>Priority: Critical
> Attachments: HDFS-12442-1.patch
>
>
> Was going through {{JsonUtilClient#toBlockLocation}} code.
> Below is the relevant code snippet.
> {code:title=JsonUtilClient.java|borderStyle=solid}
>  /** Convert a Json map to BlockLocation. **/
>   static BlockLocation toBlockLocation(Map<?, ?> m)
>   throws IOException {
> ...
> ...  
> boolean corrupt = Boolean.
> getBoolean(m.get("corrupt").toString());
> ...
> ...
>   }
> {code}
> According to java docs for {{Boolean#getBoolean}}
> {noformat}
> Returns true if and only if the system property named by the argument exists 
> and is equal to the string "true". 
> {noformat}
> I assume, the map value for key {{corrupt}} will be populated with either 
> {{true}} or {{false}}.
> On the client side, {{Boolean#getBoolean}} looks up a system property named 
> by its argument instead of parsing the string itself.
> So it will always return false unless a system property named "true" happens 
> to be set to "true".
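The difference between the two methods is easy to demonstrate:

```java
/**
 * Demonstrates the bug class: Boolean.getBoolean reads a SYSTEM PROPERTY
 * named by its argument, while Boolean.parseBoolean parses the string.
 */
public class BooleanGotcha {
    // Buggy: looks up a system property literally named "true" or "false".
    public static boolean wrong(String jsonValue) {
        return Boolean.getBoolean(jsonValue);
    }

    // Correct: parses the string value itself.
    public static boolean right(String jsonValue) {
        return Boolean.parseBoolean(jsonValue);
    }

    public static void main(String[] args) {
        String corrupt = "true"; // value taken from the JSON map
        System.out.println(wrong(corrupt));  // false: no system property named "true"
        System.out.println(right(corrupt));  // true
    }
}
```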






[jira] [Reopened] (HDFS-12442) WebHdfsFileSystem#getFileBlockLocations will always return BlockLocation#corrupt as false

2017-10-03 Thread Andrew Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-12442?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Wang reopened HDFS-12442:


> WebHdfsFileSystem#getFileBlockLocations will always return 
> BlockLocation#corrupt as false
> -
>
> Key: HDFS-12442
> URL: https://issues.apache.org/jira/browse/HDFS-12442
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: webhdfs
>Affects Versions: 2.9.0, 3.0.0-alpha2
>Reporter: Rushabh S Shah
>Assignee: Rushabh S Shah
>Priority: Critical
> Attachments: HDFS-12442-1.patch
>
>
> Was going through {{JsonUtilClient#toBlockLocation}} code.
> Below is the relevant code snippet.
> {code:title=JsonUtilClient.java|borderStyle=solid}
>  /** Convert a Json map to BlockLocation. **/
>   static BlockLocation toBlockLocation(Map<?, ?> m)
>   throws IOException {
> ...
> ...  
> boolean corrupt = Boolean.
> getBoolean(m.get("corrupt").toString());
> ...
> ...
>   }
> {code}
> According to java docs for {{Boolean#getBoolean}}
> {noformat}
> Returns true if and only if the system property named by the argument exists 
> and is equal to the string "true". 
> {noformat}
> I assume, the map value for key {{corrupt}} will be populated with either 
> {{true}} or {{false}}.
> On the client side, {{Boolean#getBoolean}} looks up a system property named 
> by its argument instead of parsing the string itself.
> So it will always return false unless a system property named "true" happens 
> to be set to "true".






[jira] [Created] (HDFS-12567) BlockPlacementPolicyRackFaultTolerant fails with racks with very few nodes

2017-09-29 Thread Andrew Wang (JIRA)
Andrew Wang created HDFS-12567:
--

 Summary: BlockPlacementPolicyRackFaultTolerant fails with racks 
with very few nodes
 Key: HDFS-12567
 URL: https://issues.apache.org/jira/browse/HDFS-12567
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: erasure-coding
Affects Versions: 3.0.0-alpha1
Reporter: Andrew Wang


Found this while doing some testing on an internal cluster with an unusual 
setup: one rack with ~20 nodes, plus a few more racks with just a few nodes 
each. Placement would fail to find (# data blocks) datanodes even though there 
were plenty of DNs on the 20-node rack.

I managed to reproduce this same issue in a unit test, stack trace like this:

{noformat}
java.io.IOException: File /testfile0 could only be written to 5 of the 6 
required nodes for RS-6-3-1024k. There are 9 datanode(s) running and no node(s) 
are excluded in this operation.
at 
org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.chooseTarget4NewBlock(BlockManager.java:2083)
at 
org.apache.hadoop.hdfs.server.namenode.FSDirWriteFileOp.chooseTargetForNewBlock(FSDirWriteFileOp.java:286)
at 
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:2609)
at 
org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.addBlock(NameNodeRpcServer.java:863)
at 
org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.addBlock(ClientNamenodeProtocolServerSideTranslatorPB.java:548)
at 
org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
at 
org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:523)
at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:991)
at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:869)
at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:815)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1962)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2675)
{noformat}

This isn't a very critical bug since it's an unusual rack configuration, but it 
can easily happen during testing.
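A toy model shows why tiny racks can cap placement (this is an illustrative simplification, not the actual BlockPlacementPolicyRackFaultTolerant algorithm): assume an even-spread constraint where each rack may receive at most ceil(needed / numRacks) replicas, so losing one rack never loses too many blocks.

```java
/**
 * Toy model of rack-fault-tolerant placement (illustrative only): cap
 * each rack at ceil(needed / numRacks) targets and see how many of the
 * required targets can actually be placed.
 */
public class RackSpreadModel {
    public static int placeable(int[] rackSizes, int needed) {
        int maxPerRack = (needed + rackSizes.length - 1) / rackSizes.length; // ceil
        int total = 0;
        for (int size : rackSizes) {
            total += Math.min(size, maxPerRack); // tiny racks contribute little
        }
        return Math.min(total, needed);
    }

    public static void main(String[] args) {
        // One big rack plus three tiny ones: 23 DNs total, but under the
        // even-spread cap only 6 of the 9 RS-6-3 targets can be placed.
        System.out.println(placeable(new int[]{20, 1, 1, 1}, 9));
    }
}
```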






[jira] [Created] (HDFS-12534) Provide logical BlockLocations for EC files for better split calculation

2017-09-22 Thread Andrew Wang (JIRA)
Andrew Wang created HDFS-12534:
--

 Summary: Provide logical BlockLocations for EC files for better 
split calculation
 Key: HDFS-12534
 URL: https://issues.apache.org/jira/browse/HDFS-12534
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: erasure-coding
Affects Versions: 3.0.0-beta1
Reporter: Andrew Wang
Assignee: Andrew Wang


I talked to [~vanzin] and [~alex.behm] some more about split calculation with 
EC. It turns out HDFS-1 was resolved prematurely. Applications depend on 
HDFS BlockLocation to understand where the split points are. The current scheme 
of returning one BlockLocation per block group loses this information.

We should change this to provide logical blocks. Divide the file length by the 
block size and provide suitable BlockLocations to match, with virtual offsets 
and lengths too.

I'm not marking this as incompatible, since changing it this way would in fact 
make it more compatible from the perspective of applications that are 
scheduling against replicated files. Thus, it'd be good for beta1 if possible, 
but okay for later too.
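The proposed computation can be sketched in a few lines (offsets and lengths only; names are illustrative, not the eventual HDFS API):

```java
import java.util.ArrayList;
import java.util.List;

/**
 * Sketch: synthesize one logical block location per blockSize-worth of
 * file, with virtual offsets/lengths, instead of one location per EC
 * block group.
 */
public class LogicalBlockLocations {
    public static final class Loc {
        public final long offset, length;
        Loc(long offset, long length) { this.offset = offset; this.length = length; }
    }

    public static List<Loc> compute(long fileLength, long blockSize) {
        List<Loc> locs = new ArrayList<>();
        for (long off = 0; off < fileLength; off += blockSize) {
            // The last logical block may be shorter than blockSize.
            locs.add(new Loc(off, Math.min(blockSize, fileLength - off)));
        }
        return locs;
    }

    public static void main(String[] args) {
        // 300 MB file, 128 MB blocks -> 3 logical blocks: 128, 128, 44 MB
        System.out.println(compute(300L << 20, 128L << 20).size());
    }
}
```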






[jira] [Created] (HDFS-12497) Re-enable TestDFSStripedOutputStreamWithFailure tests

2017-09-19 Thread Andrew Wang (JIRA)
Andrew Wang created HDFS-12497:
--

 Summary: Re-enable TestDFSStripedOutputStreamWithFailure tests
 Key: HDFS-12497
 URL: https://issues.apache.org/jira/browse/HDFS-12497
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: erasure-coding
Affects Versions: 3.0.0-beta1
Reporter: Andrew Wang


We disabled this suite of tests in HDFS-12417 since they were very flaky. We 
should fix these tests and re-enable them.






[jira] [Resolved] (HDFS-11156) Add new op GETFILEBLOCKLOCATIONS to WebHDFS REST API

2017-09-14 Thread Andrew Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-11156?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Wang resolved HDFS-11156.

  Resolution: Fixed
Target Version/s: 3.0.0-alpha2, 2.8.0  (was: 2.8.0, 3.0.0-alpha2)

Hi Weiwei, we need to leave this one resolved for changelog purposes since it 
was released in alpha2. Let's use a new JIRA to track the follow-up work.

> Add new op GETFILEBLOCKLOCATIONS to WebHDFS REST API
> 
>
> Key: HDFS-11156
> URL: https://issues.apache.org/jira/browse/HDFS-11156
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: webhdfs
>Affects Versions: 2.7.3
>Reporter: Weiwei Yang
>Assignee: Weiwei Yang
> Fix For: 3.0.0-alpha2
>
> Attachments: BlockLocationProperties_JSON_Schema.jpg, 
> BlockLocations_JSON_Schema.jpg, FileStatuses_JSON_Schema.jpg, 
> HDFS-11156.01.patch, HDFS-11156.02.patch, HDFS-11156.03.patch, 
> HDFS-11156.04.patch, HDFS-11156.05.patch, HDFS-11156.06.patch, 
> HDFS-11156.07.patch, HDFS-11156.08.patch, HDFS-11156.09.patch, 
> HDFS-11156.10.patch, HDFS-11156.11.patch, HDFS-11156.12.patch, 
> HDFS-11156.13.patch, HDFS-11156.14.patch, HDFS-11156.15.patch, 
> HDFS-11156.16.patch, HDFS-11156-branch-2.01.patch, 
> Output_JSON_format_v10.jpg, SampleResponse_JSON.jpg
>
>
> Following webhdfs REST API
> {code}
> http://<HOST>:<PORT>/webhdfs/v1/<PATH>?op=GET_BLOCK_LOCATIONS&offset=0&length=1
> {code}
> will get a response like
> {code}
> {
>   "LocatedBlocks" : {
> "fileLength" : 1073741824,
> "isLastBlockComplete" : true,
> "isUnderConstruction" : false,
> "lastLocatedBlock" : { ... },
> "locatedBlocks" : [ {...} ]
>   }
> }
> {code}
> This represents *o.a.h.h.p.LocatedBlocks*. However, according to the 
> *FileSystem* API, 
> {code}
> public BlockLocation[] getFileBlockLocations(Path p, long start, long len)
> {code}
> clients would expect an array of BlockLocation. This mismatch should be 
> fixed. Marked as Incompatible change as this will change the output of the 
> GET_BLOCK_LOCATIONS API.






[jira] [Resolved] (HDFS-12457) Revert HDFS-11156 Add new op GETFILEBLOCKLOCATIONS to WebHDFS REST API

2017-09-14 Thread Andrew Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-12457?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Wang resolved HDFS-12457.

   Resolution: Fixed
 Assignee: Andrew Wang
Fix Version/s: 3.0.0-beta1

Reverted per discussion on HDFS-11156. This JIRA is for changelog purposes.

> Revert HDFS-11156 Add new op GETFILEBLOCKLOCATIONS to WebHDFS REST API
> --
>
> Key: HDFS-12457
> URL: https://issues.apache.org/jira/browse/HDFS-12457
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: webhdfs
>Affects Versions: 3.0.0-alpha2
>Reporter: Andrew Wang
>Assignee: Andrew Wang
> Fix For: 3.0.0-beta1
>
>







[jira] [Created] (HDFS-12457) Revert HDFS-11156 Add new op GETFILEBLOCKLOCATIONS to WebHDFS REST API

2017-09-14 Thread Andrew Wang (JIRA)
Andrew Wang created HDFS-12457:
--

 Summary: Revert HDFS-11156 Add new op GETFILEBLOCKLOCATIONS to 
WebHDFS REST API
 Key: HDFS-12457
 URL: https://issues.apache.org/jira/browse/HDFS-12457
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: webhdfs
Affects Versions: 3.0.0-alpha2
Reporter: Andrew Wang









[jira] [Reopened] (HDFS-6874) Add GETFILEBLOCKLOCATIONS operation to HttpFS

2017-09-14 Thread Andrew Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-6874?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Wang reopened HDFS-6874:
---

Reverted this per discussion on HDFS-11156.

> Add GETFILEBLOCKLOCATIONS operation to HttpFS
> -
>
> Key: HDFS-6874
> URL: https://issues.apache.org/jira/browse/HDFS-6874
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: httpfs
>Affects Versions: 2.4.1, 2.7.3
>Reporter: Gao Zhong Liang
>Assignee: Weiwei Yang
>  Labels: BB2015-05-TBR
> Attachments: HDFS-6874.02.patch, HDFS-6874.03.patch, 
> HDFS-6874.04.patch, HDFS-6874.05.patch, HDFS-6874.06.patch, 
> HDFS-6874.07.patch, HDFS-6874.08.patch, HDFS-6874-1.patch, 
> HDFS-6874-branch-2.6.0.patch, HDFS-6874.patch
>
>
> The GETFILEBLOCKLOCATIONS operation is missing in HttpFS, though it is 
> already supported in WebHDFS. For a GETFILEBLOCKLOCATIONS request, 
> org.apache.hadoop.fs.http.server.HttpFSServer currently returns BAD_REQUEST:
> {code}
> ...
> case GETFILEBLOCKLOCATIONS: {
>   response = Response.status(Response.Status.BAD_REQUEST).build();
>   break;
> }
> ...
> {code}






[jira] [Created] (HDFS-12444) Reduce runtime of TestWriteReadStripedFile

2017-09-13 Thread Andrew Wang (JIRA)
Andrew Wang created HDFS-12444:
--

 Summary: Reduce runtime of TestWriteReadStripedFile
 Key: HDFS-12444
 URL: https://issues.apache.org/jira/browse/HDFS-12444
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: erasure-coding, test
Affects Versions: 3.0.0-alpha4
Reporter: Andrew Wang
Assignee: Andrew Wang


This test takes a long time to run since it writes a lot of data, and 
frequently times out during precommit testing. If we change the EC policy from 
RS(6,3) to RS(3,2) then it will run a lot faster.






[jira] [Created] (HDFS-12438) Rename dfs.datanode.ec.reconstruction.stripedblock.threads.size to dfs.datanode.ec.reconstruction.stripedblock.threads

2017-09-12 Thread Andrew Wang (JIRA)
Andrew Wang created HDFS-12438:
--

 Summary: Rename 
dfs.datanode.ec.reconstruction.stripedblock.threads.size to 
dfs.datanode.ec.reconstruction.stripedblock.threads
 Key: HDFS-12438
 URL: https://issues.apache.org/jira/browse/HDFS-12438
 Project: Hadoop HDFS
  Issue Type: Improvement
Affects Versions: 3.0.0-alpha3
Reporter: Andrew Wang
Assignee: Andrew Wang


We should rename this config key to match other config keys used to size thread 
pools.






[jira] [Resolved] (HDFS-11515) -du throws ConcurrentModificationException

2017-09-01 Thread Andrew Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-11515?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Wang resolved HDFS-11515.

  Resolution: Invalid
Target Version/s: 2.8.1, 3.0.0-beta1  (was: 3.0.0-beta1, 2.8.1)

Agree, let's resolve this one too. Thanks Istvan.

> -du throws ConcurrentModificationException
> --
>
> Key: HDFS-11515
> URL: https://issues.apache.org/jira/browse/HDFS-11515
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode, shell
>Affects Versions: 2.8.0, 3.0.0-alpha2
>Reporter: Wei-Chiu Chuang
>Assignee: Istvan Fajth
> Attachments: HDFS-11515.001.patch, HDFS-11515.002.patch, 
> HDFS-11515.003.patch, HDFS-11515.004.patch, HDFS-11515.test.patch
>
>
> HDFS-10797 fixed a disk summary (-du) bug, but it introduced a new bug.
> The bug can be reproduced running the following commands:
> {noformat}
> bash-4.1$ hdfs dfs -mkdir /tmp/d0
> bash-4.1$ hdfs dfsadmin -allowSnapshot /tmp/d0
> Allowing snaphot on /tmp/d0 succeeded
> bash-4.1$ hdfs dfs -touchz /tmp/d0/f4
> bash-4.1$ hdfs dfs -mkdir /tmp/d0/d1
> bash-4.1$ hdfs dfs -createSnapshot /tmp/d0 s1
> Created snapshot /tmp/d0/.snapshot/s1
> bash-4.1$ hdfs dfs -mkdir /tmp/d0/d1/d2
> bash-4.1$ hdfs dfs -mkdir /tmp/d0/d1/d3
> bash-4.1$ hdfs dfs -mkdir /tmp/d0/d1/d2/d4
> bash-4.1$ hdfs dfs -mkdir /tmp/d0/d1/d3/d5
> bash-4.1$ hdfs dfs -createSnapshot /tmp/d0 s2
> Created snapshot /tmp/d0/.snapshot/s2
> bash-4.1$ hdfs dfs -rmdir /tmp/d0/d1/d2/d4
> bash-4.1$ hdfs dfs -rmdir /tmp/d0/d1/d2
> bash-4.1$ hdfs dfs -rmdir /tmp/d0/d1/d3/d5
> bash-4.1$ hdfs dfs -rmdir /tmp/d0/d1/d3
> bash-4.1$ hdfs dfs -du -h /tmp/d0
> du: java.util.ConcurrentModificationException
> 0 0 /tmp/d0/f4
> {noformat}
> A ConcurrentModificationException forced du to terminate abruptly.
> Correspondingly, NameNode log has the following error:
> {noformat}
> 2017-03-08 14:32:17,673 WARN org.apache.hadoop.ipc.Server: IPC Server handler 
> 4 on 8020, call org.apache.hadoop.hdfs.protocol.ClientProtocol.getContentSumma
> ry from 10.0.0.198:49957 Call#2 Retry#0
> java.util.ConcurrentModificationException
> at java.util.HashMap$HashIterator.nextEntry(HashMap.java:922)
> at java.util.HashMap$KeyIterator.next(HashMap.java:956)
> at 
> org.apache.hadoop.hdfs.server.namenode.ContentSummaryComputationContext.tallyDeletedSnapshottedINodes(ContentSummaryComputationContext.java:209)
> at 
> org.apache.hadoop.hdfs.server.namenode.INode.computeAndConvertContentSummary(INode.java:507)
> at 
> org.apache.hadoop.hdfs.server.namenode.FSDirectory.getContentSummary(FSDirectory.java:2302)
> at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getContentSummary(FSNamesystem.java:4535)
> at 
> org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.getContentSummary(NameNodeRpcServer.java:1087)
> at 
> org.apache.hadoop.hdfs.server.namenode.AuthorizationProviderProxyClientProtocol.getContentSummary(AuthorizationProviderProxyClientProtocol.java:5
> 63)
> at 
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.getContentSummary(ClientNamenodeProtocolServerSideTranslatorPB.jav
> a:873)
> at 
> org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
> at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:617)
> at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1073)
> at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2216)
> at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2212)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:415)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1920)
> at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2210)
> {noformat}
> The bug is due to an improper use of HashSet, not concurrent operations. 
> Basically, a HashSet cannot be updated while an iterator is traversing it.
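This failure mode is easy to reproduce in isolation, along with the standard fix of removing through the iterator:

```java
import java.util.ConcurrentModificationException;
import java.util.HashSet;
import java.util.Iterator;
import java.util.Set;

/**
 * Reproduces the failure mode: structurally modifying a HashSet while an
 * iterator is traversing it throws ConcurrentModificationException, even
 * in a single thread. The fix is to remove through the iterator.
 */
public class CmeDemo {
    public static boolean modifyDuringIteration() {
        Set<String> inodes = new HashSet<>();
        inodes.add("d2"); inodes.add("d3"); inodes.add("d4");
        try {
            for (String s : inodes) {
                inodes.remove(s); // modifies the set behind the iterator's back
            }
            return false;
        } catch (ConcurrentModificationException e) {
            return true; // the fail-fast iterator detected the modification
        }
    }

    public static int removeViaIterator() {
        Set<String> inodes = new HashSet<>();
        inodes.add("d2"); inodes.add("d3"); inodes.add("d4");
        for (Iterator<String> it = inodes.iterator(); it.hasNext(); ) {
            it.next();
            it.remove(); // safe: the iterator tracks its own change
        }
        return inodes.size();
    }

    public static void main(String[] args) {
        System.out.println(modifyDuringIteration() + " " + removeViaIterator());
    }
}
```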






[jira] [Resolved] (HDFS-8295) Add MODIFY and REMOVE ECSchema editlog operations

2017-08-31 Thread Andrew Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8295?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Wang resolved HDFS-8295.
---
Resolution: Invalid

I'm resolving this since it's likely stale; we've substantially revisited the 
pluggable EC policy API.

> Add MODIFY and REMOVE ECSchema editlog operations
> -
>
> Key: HDFS-8295
> URL: https://issues.apache.org/jira/browse/HDFS-8295
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Xinwei Qin 
>Assignee: Xinwei Qin 
> Attachments: HDFS-8295.001.patch
>
>
> If MODIFY and REMOVE ECSchema operations are supported, then add these 
> editlog operations to persist them. 






[jira] [Resolved] (HDFS-9604) Move ErasureCodingPolicyManager to FSDirectory

2017-08-31 Thread Andrew Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9604?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Wang resolved HDFS-9604.
---
Resolution: Invalid

This code has been refactored a few times now, so this is no longer valid.

> Move ErasureCodingPolicyManager to FSDirectory
> --
>
> Key: HDFS-9604
> URL: https://issues.apache.org/jira/browse/HDFS-9604
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Walter Su
>Assignee: Walter Su
> Attachments: HDFS-9604.01.patch
>
>
> ErasureCodingPolicy is part of directory metadata, so it's better to put it 
> in FSDirectory.






[jira] [Resolved] (HDFS-9386) Erasure coding: updateBlockForPipeline sometimes returns non-striped block for striped file

2017-08-31 Thread Andrew Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9386?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Wang resolved HDFS-9386.
---
  Resolution: Cannot Reproduce
Target Version/s:   (was: )

Haven't seen this recently, and the bug is two years old; resolving.

> Erasure coding: updateBlockForPipeline sometimes returns non-striped block 
> for striped file
> ---
>
> Key: HDFS-9386
> URL: https://issues.apache.org/jira/browse/HDFS-9386
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: erasure-coding
>Affects Versions: 3.0.0-alpha1
>Reporter: Zhe Zhang
>
> I've seen this bug a few times. The returned {{LocatedBlock}} from 
> {{updateBlockForPipeline}} is sometimes not {{LocatedStripedBlock}}. However, 
> {{FSNamesystem#bumpBlockGenerationStamp}} did return a 
> {{LocatedStripedBlock}}. Maybe a bug in PB. I'm still debugging.






[jira] [Resolved] (HDFS-8140) ECSchema supports for offline EditsVisitor over an OEV XML file

2017-08-31 Thread Andrew Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8140?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Wang resolved HDFS-8140.
---
Resolution: Duplicate

This is a dupe of HDFS-11467.

> ECSchema supports for offline EditsVisitor over an OEV XML file
> ---
>
> Key: HDFS-8140
> URL: https://issues.apache.org/jira/browse/HDFS-8140
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Affects Versions: HDFS-7285
>Reporter: Xinwei Qin 
>Assignee: Xinwei Qin 
>
> Make the ECSchema info in Editlog Support for offline EditsVistor over an OEV 
> XML file, which is not implemented in HDFS-7859.






[jira] [Created] (HDFS-12377) Refactor TestReadStripedFileWithDecoding to avoid test timeouts

2017-08-30 Thread Andrew Wang (JIRA)
Andrew Wang created HDFS-12377:
--

 Summary: Refactor TestReadStripedFileWithDecoding to avoid test 
timeouts
 Key: HDFS-12377
 URL: https://issues.apache.org/jira/browse/HDFS-12377
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: erasure-coding
Affects Versions: 3.0.0-alpha3
Reporter: Andrew Wang
Assignee: Andrew Wang


This test times out since the nested for loops mean it runs 12 configurations 
inside each test method.

Let's refactor this to use JUnit parameters instead.






[jira] [Resolved] (HDFS-10531) Add EC policy and storage policy related usage summarization function to dfs du command

2017-08-18 Thread Andrew Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10531?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Wang resolved HDFS-10531.

Resolution: Duplicate

> Add EC policy and storage policy related usage summarization function to dfs 
> du command
> ---
>
> Key: HDFS-10531
> URL: https://issues.apache.org/jira/browse/HDFS-10531
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Affects Versions: 3.0.0-alpha1
>Reporter: Rui Gao
>Assignee: SammiChen
>  Labels: hdfs-ec-3.0-nice-to-have
> Attachments: HDFS-10531.001.patch
>
>
> Currently du command output:
> {code}
> [ ~]$ hdfs dfs -du  -h /home/rgao/
> 0  /home/rgao/.Trash
> 0  /home/rgao/.staging
> 100 M  /home/rgao/ds
> 250 M  /home/rgao/ds-2
> 200 M  /home/rgao/noECBackup-ds
> 500 M  /home/rgao/noECBackup-ds-2
> {code}
> For HDFS users and administrators, EC policy and storage policy related usage 
> summarization would be very helpful when managing cluster storage. The 
> intended output of du could look like the following.
> {code}
> [ ~]$ hdfs dfs -du  -h -t( total, parameter to be added) /home/rgao
>  
> 0  /home/rgao/.Trash
> 0  /home/rgao/.staging
> [Archive] [EC:RS-DEFAULT-6-3-64k] 100 M  /home/rgao/ds
> [DISK] [EC:RS-DEFAULT-6-3-64k] 250 M  /home/rgao/ds-2
> [DISK] [Replica] 200 M  /home/rgao/noECBackup-ds
> [DISK] [Replica] 500 M  /home/rgao/noECBackup-ds-2
>  
> Total:
>  
> [Archive][EC:RS-DEFAULT-6-3-64k]  100 M
> [Archive][Replica]0 M
> [DISK] [EC:RS-DEFAULT-6-3-64k] 250 M
> [DISK] [Replica]   700 M  
>  
> [Archive][ALL] 100M
> [DISK][ALL]  950M
> [ALL] [EC:RS-DEFAULT-6-3-64k]350M
> [ALL] [Replica]  700M
> {code} 






[jira] [Reopened] (HDFS-10531) Add EC policy and storage policy related usage summarization function to dfs du command

2017-08-18 Thread Andrew Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10531?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Wang reopened HDFS-10531:


> Add EC policy and storage policy related usage summarization function to dfs 
> du command
> ---
>
> Key: HDFS-10531
> URL: https://issues.apache.org/jira/browse/HDFS-10531
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Affects Versions: 3.0.0-alpha1
>Reporter: Rui Gao
>Assignee: SammiChen
>  Labels: hdfs-ec-3.0-nice-to-have
> Attachments: HDFS-10531.001.patch
>
>
> Currently du command output:
> {code}
> [ ~]$ hdfs dfs -du  -h /home/rgao/
> 0  /home/rgao/.Trash
> 0  /home/rgao/.staging
> 100 M  /home/rgao/ds
> 250 M  /home/rgao/ds-2
> 200 M  /home/rgao/noECBackup-ds
> 500 M  /home/rgao/noECBackup-ds-2
> {code}
> For HDFS users and administrators, EC policy and storage policy related usage 
> summarization would be very helpful when managing cluster storage. The 
> intended output of du could look like the following.
> {code}
> [ ~]$ hdfs dfs -du  -h -t( total, parameter to be added) /home/rgao
>  
> 0  /home/rgao/.Trash
> 0  /home/rgao/.staging
> [Archive] [EC:RS-DEFAULT-6-3-64k] 100 M  /home/rgao/ds
> [DISK] [EC:RS-DEFAULT-6-3-64k] 250 M  /home/rgao/ds-2
> [DISK] [Replica] 200 M  /home/rgao/noECBackup-ds
> [DISK] [Replica] 500 M  /home/rgao/noECBackup-ds-2
>  
> Total:
>  
> [Archive][EC:RS-DEFAULT-6-3-64k]  100 M
> [Archive][Replica]0 M
> [DISK] [EC:RS-DEFAULT-6-3-64k] 250 M
> [DISK] [Replica]   700 M  
>  
> [Archive][ALL] 100M
> [DISK][ALL]  950M
> [ALL] [EC:RS-DEFAULT-6-3-64k]350M
> [ALL] [Replica]  700M
> {code} 






[jira] [Resolved] (HDFS-8387) Erasure Coding: Revisit the long and int datatypes usage in striping logic

2017-08-18 Thread Andrew Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8387?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Wang resolved HDFS-8387.
---
Resolution: Delivered

Thanks Rakesh for working on this, resolving per above recommendation.

> Erasure Coding: Revisit the long and int datatypes usage in striping logic
> --
>
> Key: HDFS-8387
> URL: https://issues.apache.org/jira/browse/HDFS-8387
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Rakesh R
>Assignee: Rakesh R
>  Labels: hdfs-ec-3.0-nice-to-have
>
> The idea of this jira is to revisit the usage of {{long}} and {{int}} data 
> types in the striping logic.
> Related discussion 
> [here|https://issues.apache.org/jira/browse/HDFS-8294?focusedCommentId=14540788=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14540788]
>  in HDFS-8294






[jira] [Resolved] (HDFS-11813) TestDFSStripedOutputStreamWithFailure070 failed randomly

2017-08-17 Thread Andrew Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-11813?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Wang resolved HDFS-11813.

Resolution: Duplicate

This looks the same as HDFS-11882, where we've got a patch that seems close; 
let's dupe to that one.

> TestDFSStripedOutputStreamWithFailure070 failed randomly
> 
>
> Key: HDFS-11813
> URL: https://issues.apache.org/jira/browse/HDFS-11813
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: SammiChen
>Assignee: SammiChen
>  Labels: hdfs-ec-3.0-must-do
>
> TestDFSStripedOutputStreamWithFailure070 failed randomly. Here is the stack 
> trace,
> java.lang.AssertionError: failed, dn=0, 
> length=1638400java.lang.IllegalStateException
>   at 
> com.google.common.base.Preconditions.checkState(Preconditions.java:129)
>   at 
> org.apache.hadoop.hdfs.DFSStripedOutputStream.updatePipeline(DFSStripedOutputStream.java:780)
>   at 
> org.apache.hadoop.hdfs.DFSStripedOutputStream.checkStreamerFailures(DFSStripedOutputStream.java:664)
>   at 
> org.apache.hadoop.hdfs.DFSStripedOutputStream.closeImpl(DFSStripedOutputStream.java:1034)
>   at 
> org.apache.hadoop.hdfs.DFSOutputStream.close(DFSOutputStream.java:842)
>   at 
> org.apache.hadoop.fs.FSDataOutputStream$PositionCache.close(FSDataOutputStream.java:72)
>   at 
> org.apache.hadoop.fs.FSDataOutputStream.close(FSDataOutputStream.java:101)
>   at 
> org.apache.hadoop.hdfs.TestDFSStripedOutputStreamWithFailure.runTest(TestDFSStripedOutputStreamWithFailure.java:472)
>   at 
> org.apache.hadoop.hdfs.TestDFSStripedOutputStreamWithFailure.runTest(TestDFSStripedOutputStreamWithFailure.java:360)
>   at 
> org.apache.hadoop.hdfs.TestDFSStripedOutputStreamWithFailure.run(TestDFSStripedOutputStreamWithFailure.java:574)
>   at 
> org.apache.hadoop.hdfs.TestDFSStripedOutputStreamWithFailure.test7(TestDFSStripedOutputStreamWithFailure.java:614)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47)
>   at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>   at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44)
>   at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>   at 
> org.junit.internal.runners.statements.FailOnTimeout$StatementThread.run(FailOnTimeout.java:74)
>   at org.junit.Assert.fail(Assert.java:88)
>   at 
> org.apache.hadoop.hdfs.TestDFSStripedOutputStreamWithFailure.runTest(TestDFSStripedOutputStreamWithFailure.java:365)
>   at 
> org.apache.hadoop.hdfs.TestDFSStripedOutputStreamWithFailure.run(TestDFSStripedOutputStreamWithFailure.java:574)
>   at 
> org.apache.hadoop.hdfs.TestDFSStripedOutputStreamWithFailure.test7(TestDFSStripedOutputStreamWithFailure.java:614)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47)
>   at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>   at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44)
>   at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>   at 
> org.junit.internal.runners.statements.FailOnTimeout$StatementThread.run(FailOnTimeout.java:74)






[jira] [Resolved] (HDFS-11814) Benchmark and tune for preferred default cell size

2017-08-15 Thread Andrew Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-11814?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Wang resolved HDFS-11814.

Resolution: Done

Thanks Wei! Let's resolve this one and continue with changing the defaults in 
HDFS-12303.

> Benchmark and tune for preferred default cell size
> -
>
> Key: HDFS-11814
> URL: https://issues.apache.org/jira/browse/HDFS-11814
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: SammiChen
>Assignee: Wei Zhou
>  Labels: hdfs-ec-3.0-must-do
> Attachments: RS-6-3-Concurrent.png, RS-Read.png, RS-Write.png
>
>
> Doing some benchmarking to see which cell size is more desirable than the 
> current 64k.






[jira] [Created] (HDFS-12257) Expose getSnapshottableDirListing as a public API in HdfsAdmin

2017-08-03 Thread Andrew Wang (JIRA)
Andrew Wang created HDFS-12257:
--

 Summary: Expose getSnapshottableDirListing as a public API in 
HdfsAdmin
 Key: HDFS-12257
 URL: https://issues.apache.org/jira/browse/HDFS-12257
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: snapshots
Affects Versions: 2.6.5
Reporter: Andrew Wang


Found at HIVE-16294. We have a CLI API for listing snapshottable dirs, but no 
programmatic API. Other snapshot APIs are exposed in HdfsAdmin, I think we 
should expose listing there as well.






[jira] [Created] (HDFS-12222) Add EC information to BlockLocations

2017-07-28 Thread Andrew Wang (JIRA)
Andrew Wang created HDFS-12222:
--

 Summary: Add EC information to BlockLocations
 Key: HDFS-12222
 URL: https://issues.apache.org/jira/browse/HDFS-12222
 Project: Hadoop HDFS
  Issue Type: Bug
Affects Versions: 3.0.0-alpha1
Reporter: Andrew Wang


HDFS applications query block location information to compute splits. One 
example of this is FileInputFormat:

https://github.com/apache/hadoop/blob/d4015f8628dd973c7433639451a9acc3e741d2a2/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapred/FileInputFormat.java#L346

You see bits of code like this that calculate offsets as follows:

{noformat}
long bytesInThisBlock = blkLocations[startIndex].getOffset() + 
  blkLocations[startIndex].getLength() - offset;
{noformat}

EC confuses this since the block locations include parity block locations as 
well, which are not part of the logical file length. This messes up the offset 
calculation and thus topology/caching information too.

Applications can figure out what's a parity block by reading the EC policy and 
then parsing the schema, but it'd be a lot better if we exposed this more 
generically in BlockLocation instead.
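To make the quoted arithmetic concrete, here is a minimal sketch of the split-size calculation, with an invented `Loc` class standing in for BlockLocation (not the HDFS API itself); the comment notes how parity entries would break it:

```java
public class SplitOffsetDemo {
    // Hypothetical stand-in for BlockLocation: logical offset and length only.
    static final class Loc {
        final long offset, length;
        Loc(long offset, long length) { this.offset = offset; this.length = length; }
    }

    // Mirrors the FileInputFormat arithmetic quoted above: bytes remaining in
    // the block containing `splitStart`. For an EC file, parity blocks would
    // appear as extra entries whose offsets are not part of the logical file
    // length, so indexing by block position gives wrong answers.
    static long bytesInThisBlock(Loc[] blks, int startIndex, long splitStart) {
        return blks[startIndex].offset + blks[startIndex].length - splitStart;
    }

    public static void main(String[] args) {
        long mb = 1024L * 1024L;
        // A replicated file: two 128 MB blocks at logical offsets 0 and 128 MB.
        Loc[] blks = { new Loc(0, 128 * mb), new Loc(128 * mb, 128 * mb) };
        long remaining = bytesInThisBlock(blks, 0, 64 * mb);
        if (remaining != 64 * mb) throw new AssertionError();
        System.out.println("ok");
    }
}
```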






[jira] [Created] (HDFS-12218) Rename split EC / replicated block metrics in BlockManager

2017-07-28 Thread Andrew Wang (JIRA)
Andrew Wang created HDFS-12218:
--

 Summary: Rename split EC / replicated block metrics in BlockManager
 Key: HDFS-12218
 URL: https://issues.apache.org/jira/browse/HDFS-12218
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: erasure-coding, metrics
Affects Versions: 3.0.0-beta1
Reporter: Andrew Wang


Noted in HDFS-12206, we should propagate the naming changes made in HDFS-12206 
for FSNamesystem into BlockManager and related classes. Also an opportunity to 
clarify usage of "ECBlocks" vs "ECBlockGroups" in some names.








[jira] [Created] (HDFS-12207) A few DataXceiver#writeBlock cleanups related to optional storage IDs and types

2017-07-27 Thread Andrew Wang (JIRA)
Andrew Wang created HDFS-12207:
--

 Summary: A few DataXceiver#writeBlock cleanups related to optional 
storage IDs and types
 Key: HDFS-12207
 URL: https://issues.apache.org/jira/browse/HDFS-12207
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: datanode
Affects Versions: 3.0.0-alpha4
Reporter: Andrew Wang


Here's the conversation that [~ehiggs] and I had on HDFS-12151 regarding some 
improvements:

bq. Should we use nst > 0 rather than targetStorageTypes.length > 0 (amended) 
here for clarity?
Yes.
bq. Should the targetStorageTypes.length > 0 check really be nsi > 0? We could 
elide it then since it's already captured in the outside if.
This does look redundant since targetStorageIds.length will be either 0 or == 
targetStorageTypes.length
bq. Finally, I don't understand why we need to add the targeted ID/type for 
checkAccess. Each DN only needs to validate itself, yea? BTSM#checkAccess 
indicates this in its javadoc, but it looks like we run through ourselves and 
the targets each time:
That seems like a good simplification. I think I had assumed the BTI and 
requested types being checked should be the same (String - String, uint64 - 
uint64); but I don't see a reason why they have to be. Chris Douglas, what do 
you think?






[jira] [Created] (HDFS-12206) Rename the split EC / replicated block metrics

2017-07-27 Thread Andrew Wang (JIRA)
Andrew Wang created HDFS-12206:
--

 Summary: Rename the split EC / replicated block metrics
 Key: HDFS-12206
 URL: https://issues.apache.org/jira/browse/HDFS-12206
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: metrics
Affects Versions: 3.0.0-alpha4
Reporter: Andrew Wang
Assignee: Andrew Wang


Going through the split EC/replicated metrics, I think it'd be better to name 
the replicated-only metrics with "ReplicatedBlocks" rather than "BlocksStat" 
for clarity. "Stat" is also not a very descriptive name, so remove it for the 
EC blocks as well. Finally, fix some inconsistencies that were missed earlier.






[jira] [Created] (HDFS-12197) Do the HDFS dist stitching in hadoop-hdfs-project

2017-07-25 Thread Andrew Wang (JIRA)
Andrew Wang created HDFS-12197:
--

 Summary: Do the HDFS dist stitching in hadoop-hdfs-project
 Key: HDFS-12197
 URL: https://issues.apache.org/jira/browse/HDFS-12197
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: build
Affects Versions: 3.0.0-alpha4
Reporter: Andrew Wang


Problem reported by [~lars_francke] on HDFS-11596. We can no longer easily 
start a namenode and datanode from the source directory without doing a full 
build per the wiki instructions: 
https://wiki.apache.org/hadoop/HowToSetupYourDevelopmentEnvironment

This is because we don't have a top-level dist for HDFS. $HADOOP_YARN_HOME for 
instance can be set to {{hadoop-yarn-project/target}}, but $HADOOP_HDFS_HOME 
goes into the submodule: {{hadoop-hdfs-project/hadoop-hdfs/target}}. This means 
it's missing the files from the sibling hadoop-hdfs-client module (which is 
required by the namenode), but also other siblings like nfs and httpfs.

So, I think the right fix is doing the dist stitching at the 
{{hadoop-hdfs-project}} level where we can aggregate all the child modules, and 
pointing $HADOOP_HDFS_HOME at this directory.






[jira] [Reopened] (HDFS-10480) Add an admin command to list currently open files

2017-06-30 Thread Andrew Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10480?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Wang reopened HDFS-10480:


> Add an admin command to list currently open files
> -
>
> Key: HDFS-10480
> URL: https://issues.apache.org/jira/browse/HDFS-10480
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Kihwal Lee
>Assignee: Manoj Govindassamy
> Fix For: 2.9.0, 3.0.0-alpha4
>
> Attachments: HDFS-10480.02.patch, HDFS-10480.03.patch, 
> HDFS-10480.04.patch, HDFS-10480.05.patch, HDFS-10480.06.patch, 
> HDFS-10480.07.patch, HDFS-10480-branch-2.01.patch, 
> HDFS-10480-branch-2.8.01.patch, HDFS-10480-trunk-1.patch, 
> HDFS-10480-trunk.patch
>
>
> Currently there is no easy way to obtain the list of active leases or files 
> being written. It will be nice if we have an admin command to list open files 
> and their lease holders.






[jira] [Resolved] (HDFS-11787) After HDFS-11515, -du still throws ConcurrentModificationException

2017-06-26 Thread Andrew Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-11787?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Wang resolved HDFS-11787.

  Resolution: Duplicate
   Fix Version/s: (was: 2.8.2)
  (was: 3.0.0-alpha4)
  (was: 2.9.0)
Target Version/s: 2.8.1, 3.0.0-beta1  (was: 3.0.0-beta1, 2.8.1)

Duping to HDFS-11515 since I don't think we need this for changelog purposes.

> After HDFS-11515, -du still throws ConcurrentModificationException
> --
>
> Key: HDFS-11787
> URL: https://issues.apache.org/jira/browse/HDFS-11787
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: snapshots, tools
>Affects Versions: 3.0.0-alpha4, 2.8.1
>Reporter: Wei-Chiu Chuang
>Assignee: Wei-Chiu Chuang
>
> I ran a modified NameNode that was patched against HDFS-11515 on a production 
> cluster fsimage, and I am still seeing ConcurrentModificationException.
> It seems that there are corner cases not covered by HDFS-11515. Filing this 
> jira to discuss how to proceed.






[jira] [Reopened] (HDFS-11515) -du throws ConcurrentModificationException

2017-06-26 Thread Andrew Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-11515?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Wang reopened HDFS-11515:


I think we can simplify the JIRA history since this change never made it into a 
release, re-opening.

> -du throws ConcurrentModificationException
> --
>
> Key: HDFS-11515
> URL: https://issues.apache.org/jira/browse/HDFS-11515
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode, shell
>Affects Versions: 2.8.0, 3.0.0-alpha2
>Reporter: Wei-Chiu Chuang
>Assignee: Istvan Fajth
> Attachments: HDFS-11515.001.patch, HDFS-11515.002.patch, 
> HDFS-11515.003.patch, HDFS-11515.004.patch, HDFS-11515.test.patch
>
>
> HDFS-10797 fixed a disk summary (-du) bug, but it introduced a new bug.
> The bug can be reproduced running the following commands:
> {noformat}
> bash-4.1$ hdfs dfs -mkdir /tmp/d0
> bash-4.1$ hdfs dfsadmin -allowSnapshot /tmp/d0
> Allowing snaphot on /tmp/d0 succeeded
> bash-4.1$ hdfs dfs -touchz /tmp/d0/f4
> bash-4.1$ hdfs dfs -mkdir /tmp/d0/d1
> bash-4.1$ hdfs dfs -createSnapshot /tmp/d0 s1
> Created snapshot /tmp/d0/.snapshot/s1
> bash-4.1$ hdfs dfs -mkdir /tmp/d0/d1/d2
> bash-4.1$ hdfs dfs -mkdir /tmp/d0/d1/d3
> bash-4.1$ hdfs dfs -mkdir /tmp/d0/d1/d2/d4
> bash-4.1$ hdfs dfs -mkdir /tmp/d0/d1/d3/d5
> bash-4.1$ hdfs dfs -createSnapshot /tmp/d0 s2
> Created snapshot /tmp/d0/.snapshot/s2
> bash-4.1$ hdfs dfs -rmdir /tmp/d0/d1/d2/d4
> bash-4.1$ hdfs dfs -rmdir /tmp/d0/d1/d2
> bash-4.1$ hdfs dfs -rmdir /tmp/d0/d1/d3/d5
> bash-4.1$ hdfs dfs -rmdir /tmp/d0/d1/d3
> bash-4.1$ hdfs dfs -du -h /tmp/d0
> du: java.util.ConcurrentModificationException
> 0 0 /tmp/d0/f4
> {noformat}
> A ConcurrentModificationException forced du to terminate abruptly.
> Correspondingly, NameNode log has the following error:
> {noformat}
> 2017-03-08 14:32:17,673 WARN org.apache.hadoop.ipc.Server: IPC Server handler 
> 4 on 8020, call org.apache.hadoop.hdfs.protocol.ClientProtocol.getContentSumma
> ry from 10.0.0.198:49957 Call#2 Retry#0
> java.util.ConcurrentModificationException
> at java.util.HashMap$HashIterator.nextEntry(HashMap.java:922)
> at java.util.HashMap$KeyIterator.next(HashMap.java:956)
> at 
> org.apache.hadoop.hdfs.server.namenode.ContentSummaryComputationContext.tallyDeletedSnapshottedINodes(ContentSummaryComputationContext.java:209)
> at 
> org.apache.hadoop.hdfs.server.namenode.INode.computeAndConvertContentSummary(INode.java:507)
> at 
> org.apache.hadoop.hdfs.server.namenode.FSDirectory.getContentSummary(FSDirectory.java:2302)
> at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getContentSummary(FSNamesystem.java:4535)
> at 
> org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.getContentSummary(NameNodeRpcServer.java:1087)
> at 
> org.apache.hadoop.hdfs.server.namenode.AuthorizationProviderProxyClientProtocol.getContentSummary(AuthorizationProviderProxyClientProtocol.java:5
> 63)
> at 
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.getContentSummary(ClientNamenodeProtocolServerSideTranslatorPB.jav
> a:873)
> at 
> org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
> at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:617)
> at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1073)
> at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2216)
> at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2212)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:415)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1920)
> at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2210)
> {noformat}
> The bug is due to an improper use of HashSet, not concurrent operations. 
> Basically, a HashSet cannot be updated while an iterator is traversing it.
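A minimal sketch of the HashSet pitfall described above, using invented names (not the HDFS code itself): mutating the set mid-iteration trips the fail-fast iterator, while removing through the iterator is safe.

```java
import java.util.ConcurrentModificationException;
import java.util.HashSet;
import java.util.Iterator;
import java.util.Set;

public class HashSetCmeDemo {
    // Returns true if a structural change during iteration throws CME.
    static boolean mutationDuringIterationThrows() {
        Set<String> inodes = new HashSet<>();
        inodes.add("d2");
        inodes.add("d3");
        try {
            for (String dir : inodes) {
                inodes.remove(dir);  // structural change while iterating
            }
        } catch (ConcurrentModificationException e) {
            return true;  // the fail-fast iterator detected the mutation
        }
        return false;
    }

    // The safe pattern: remove through the iterator itself.
    static int removeAllViaIterator() {
        Set<String> inodes = new HashSet<>();
        inodes.add("d2");
        inodes.add("d3");
        for (Iterator<String> it = inodes.iterator(); it.hasNext(); ) {
            it.next();
            it.remove();  // iterator-mediated removal is allowed
        }
        return inodes.size();
    }

    public static void main(String[] args) {
        if (!mutationDuringIterationThrows()) throw new AssertionError();
        if (removeAllViaIterator() != 0) throw new AssertionError();
        System.out.println("ok");
    }
}
```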






[jira] [Resolved] (HDFS-11419) BlockPlacementPolicyDefault is choosing datanode in an inefficient way

2017-06-26 Thread Andrew Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-11419?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Wang resolved HDFS-11419.

   Resolution: Fixed
Fix Version/s: 3.0.0-alpha4

Resolving as it looks like this was completed.

> BlockPlacementPolicyDefault is choosing datanode in an inefficient way
> --
>
> Key: HDFS-11419
> URL: https://issues.apache.org/jira/browse/HDFS-11419
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: namenode
>Reporter: Chen Liang
>Assignee: Chen Liang
> Fix For: 3.0.0-alpha4
>
>
> Currently in {{BlockPlacementPolicyDefault}}, {{chooseTarget}} will end up 
> calling into {{chooseRandom}}, which will first find a random datanode by 
> calling
> {code}DatanodeDescriptor chosenNode = chooseDataNode(scope, 
> excludedNodes);{code}, then it checks whether that returned datanode 
> satisfies storage type requirement
> {code}storage = chooseStorage4Block(
>   chosenNode, blocksize, results, entry.getKey());{code}
> If yes, {{numOfReplicas--;}}, otherwise, the node is added to excluded nodes, 
> and runs the loop again until {{numOfReplicas}} is down to 0.
> A problem here is that the storage type is not considered until after a 
> random node is already returned. We've seen a case where a cluster has a 
> large number of datanodes, while only a few satisfy the storage type 
> condition. So, for the most part, this code blindly picks random datanodes 
> that do not satisfy the storage type requirement.
> To make matters worse, the way {{NetworkTopology#chooseRandom}} works is 
> that, given a set of excluded nodes, it first finds a random datanodes, then 
> if it is in excluded nodes set, try find another random nodes. So the more 
> excluded nodes there are, the more likely a random node will be in the 
> excluded set, in which case we basically wasted one iteration.
> Therefore, this JIRA proposes to augment/modify the relevant classes in a way 
> that datanodes can be found more efficiently. There are currently two 
> different high level solutions we are considering:
> 1. add some field to Node base types to describe the storage type info, and 
> when searching for a node, we take into account such field(s), and do not 
> return node that does not meet the storage type requirement.
> 2. change {{NetworkTopology}} class to be aware of storage types, e.g. for 
> one storage type, there is one tree subset that connects all the nodes with 
> that type. And one search happens on only one such subset. So unexpected 
> storage types are simply not in the search space. 
> Thanks [~szetszwo] for the offline discussion, and thanks [~linyiqun] for 
> pointing out a wrong statement (corrected now) in the description. Any 
> further comments are more than welcome.
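A toy simulation, with invented names and numbers (not the BlockPlacementPolicyDefault code), of why blind random draws against a growing excluded set waste iterations when few nodes satisfy the storage type:

```java
import java.util.HashSet;
import java.util.Random;
import java.util.Set;

public class RejectionSamplingDemo {
    // Count total random draws needed to find `wanted` acceptable nodes when
    // only `eligible` of `total` satisfy the storage-type requirement and
    // every rejected node is added to the excluded set, as described above.
    static int drawsNeeded(int total, int eligible, int wanted, long seed) {
        Random rng = new Random(seed);
        Set<Integer> excluded = new HashSet<>();
        int found = 0, draws = 0;
        while (found < wanted) {
            int node = rng.nextInt(total);       // blind random pick
            draws++;
            if (excluded.contains(node)) {
                continue;                        // wasted iteration
            }
            if (node < eligible) {               // nodes [0, eligible) have the right type
                found++;
            }
            excluded.add(node);
        }
        return draws;
    }

    public static void main(String[] args) {
        // 1000 nodes, only 10 with the desired storage type: far more than
        // 3 draws are needed to place 3 replicas.
        int draws = drawsNeeded(1000, 10, 3, 42L);
        if (draws < 3) throw new AssertionError();
        System.out.println("draws=" + draws);
    }
}
```

A topology that indexes nodes by storage type (option 2 above) would shrink the search space to the eligible nodes only.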






[jira] [Created] (HDFS-12032) Inaccurate comment on DatanodeDescriptor#getNumberOfBlocksToBeErasureCoded

2017-06-23 Thread Andrew Wang (JIRA)
Andrew Wang created HDFS-12032:
--

 Summary: Inaccurate comment on 
DatanodeDescriptor#getNumberOfBlocksToBeErasureCoded
 Key: HDFS-12032
 URL: https://issues.apache.org/jira/browse/HDFS-12032
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: namenode
Affects Versions: 3.0.0-alpha3
Reporter: Andrew Wang
Assignee: Andrew Wang
Priority: Trivial


I noticed this comment is an inaccurate copy-paste:

{noformat}
  /**
   * The number of work items that are pending to be replicated
   */
  @VisibleForTesting
  public int getNumberOfBlocksToBeErasureCoded() {
return erasurecodeBlocks.size();
  }
{noformat}






[jira] [Created] (HDFS-12009) dfsadmin -setBalancerBandwidth does not accept human friendly units

2017-06-21 Thread Andrew Wang (JIRA)
Andrew Wang created HDFS-12009:
--

 Summary: dfsadmin -setBalancerBandwidth does not accept human 
friendly units
 Key: HDFS-12009
 URL: https://issues.apache.org/jira/browse/HDFS-12009
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: shell
Affects Versions: 3.0.0-alpha3
Reporter: Andrew Wang


We support human readable units now. The default balancing bandwidth in the 
conf is "10m". However, human readable units are not supported by dfsadmin 
-setBalancerBandwidth. This means you can't pass the output of "hdfs getconf 
-confKey dfs.datanode.balance.bandwidthPerSec" to "hdfs dfsadmin 
-setBalancerBandwidth".

This is a regression from before human-readable units were introduced.
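For illustration, a minimal parser for suffixes like "10m" (a sketch only; Hadoop's own parser handles more suffixes and edge cases, and this class name is invented):

```java
public class HumanUnits {
    // Parse strings like "10m", "512k", or "1g" into a byte count.
    static long parse(String s) {
        s = s.trim().toLowerCase();
        char last = s.charAt(s.length() - 1);
        long scale;
        switch (last) {
            case 'k': scale = 1L << 10; break;
            case 'm': scale = 1L << 20; break;
            case 'g': scale = 1L << 30; break;
            case 't': scale = 1L << 40; break;
            default:  return Long.parseLong(s);  // plain byte count
        }
        return Long.parseLong(s.substring(0, s.length() - 1)) * scale;
    }

    public static void main(String[] args) {
        if (parse("10m") != 10L * (1L << 20)) throw new AssertionError();
        if (parse("1024") != 1024L) throw new AssertionError();
        System.out.println("ok");
    }
}
```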






[jira] [Resolved] (HDFS-8672) Erasure Coding: Add EC-related Metrics to NN (separate striped blocks count from UnderReplicatedBlocks count)

2017-06-19 Thread Andrew Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8672?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Wang resolved HDFS-8672.
---
Resolution: Duplicate

Closing this as I believe it dupes HDFS-10999 which Manoj completed.

> Erasure Coding: Add EC-related Metrics to NN (separate striped blocks count 
> from UnderReplicatedBlocks count)
> -
>
> Key: HDFS-8672
> URL: https://issues.apache.org/jira/browse/HDFS-8672
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Walter Su
>Assignee: Manoj Govindassamy
>Priority: Minor
>  Labels: hdfs-ec-3.0-nice-to-have
>
> 1. {{MissingBlocks}} metric is updated in HDFS-8461 so it includes striped 
> blocks.
> 2. {{CorruptBlocks}} metric is updated in HDFS-8619 so it includes striped 
> blocks.
> 3. {{UnderReplicatedBlocks}} and {{PendingReplicationBlocks}} includes 
> striped blocks (HDFS-7912).
> This jira aims to separate the striped blocks count from the 
> {{UnderReplicatedBlocks}} count.
> EC file recovery needs coding, which is more expensive than block replication.
> It's necessary to separate the striped blocks count from the UnderReplicatedBlocks 
> count so users can know what's going on.






[jira] [Created] (HDFS-11956) Fix BlockToken compatibility with Hadoop 2.x clients

2017-06-09 Thread Andrew Wang (JIRA)
Andrew Wang created HDFS-11956:
--

 Summary: Fix BlockToken compatibility with Hadoop 2.x clients
 Key: HDFS-11956
 URL: https://issues.apache.org/jira/browse/HDFS-11956
 Project: Hadoop HDFS
  Issue Type: Bug
Affects Versions: 3.0.0-alpha4
Reporter: Andrew Wang
Priority: Blocker


Seems like HDFS-9807 broke backwards compatibility with Hadoop 2.x clients. 
When talking to a 3.0.0-alpha4 DN with security on:

{noformat}
2017-06-06 23:27:22,568 WARN org.apache.hadoop.hdfs.server.datanode.DataNode: 
Block token verification failed: op=WRITE_BLOCK, 
remoteAddress=/172.28.208.200:53900, message=Block token with StorageIDs 
[DS-c0f24154-a39b-4941-93cd-5b8323067ba2] not valid for access with StorageIDs 
[]
{noformat}






[jira] [Created] (HDFS-11941) Move dfsadmin triggerBlockReport and metaSave to debugadmin

2017-06-07 Thread Andrew Wang (JIRA)
Andrew Wang created HDFS-11941:
--

 Summary: Move dfsadmin triggerBlockReport and metaSave to 
debugadmin
 Key: HDFS-11941
 URL: https://issues.apache.org/jira/browse/HDFS-11941
 Project: Hadoop HDFS
  Issue Type: Improvement
Affects Versions: 2.9.0, 3.0.0-alpha4
Reporter: Andrew Wang


Filing a JIRA for discussion. While reviewing the [dfsadmin 
commands|https://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-hdfs/HDFSCommands.html#dfsadmin], 
some of them seem better suited for debugadmin:

* triggerBlockReport: similar to recoverLease in debugadmin, you don't need 
this unless HDFS is in a bad state
* metasave: dumps NN data structures to a side file. Seems purely for debugging.

DebugAdmin commands notably do not need to be compatible between releases and 
do not have Java API equivalents in HdfsAdmin.






[jira] [Created] (HDFS-11885) createEncryptionZone should not block on initializing EDEK cache

2017-05-25 Thread Andrew Wang (JIRA)
Andrew Wang created HDFS-11885:
--

 Summary: createEncryptionZone should not block on initializing 
EDEK cache
 Key: HDFS-11885
 URL: https://issues.apache.org/jira/browse/HDFS-11885
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: encryption
Affects Versions: 2.6.5
Reporter: Andrew Wang
Assignee: Andrew Wang
Priority: Critical


When creating an encryption zone, we call {{ensureKeyIsInitialized}}, which 
calls {{provider.warmUpEncryptedKeys(keyName)}}. This is a blocking call, which 
attempts to fill the key cache up to the low watermark.

If the KMS is down or slow, this can take a very long time, and cause the 
createZone RPC to fail with a timeout.
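The fix implied above is to decouple the RPC from the cache warm-up. A toy sketch of that decoupling (function and names are illustrative stand-ins, not the actual KMSClientProvider API):

```python
import threading
import time

def warm_up_edek_cache(key_name):
    # Stand-in for provider.warmUpEncryptedKeys(keyName). In the bug
    # report this call blocks until the EDEK cache reaches the low
    # watermark, so a slow or down KMS stalls the whole createZone RPC.
    time.sleep(0.05)

def create_zone_nonblocking(key_name):
    # Sketch of the proposed behavior: kick the warm-up off in the
    # background so the RPC can return immediately; the cache fills
    # lazily and later EDEK requests fall back to the KMS if needed.
    t = threading.Thread(target=warm_up_edek_cache, args=(key_name,), daemon=True)
    t.start()
    return "zone created for " + key_name

result = create_zone_nonblocking("mykey")
```

The trade-off is that the first few file creates in the new zone may each pay a KMS round trip until the cache is warm.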






[jira] [Reopened] (HDFS-11644) Support for querying outputstream capabilities

2017-05-09 Thread Andrew Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-11644?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Wang reopened HDFS-11644:


> Support for querying outputstream capabilities
> --
>
> Key: HDFS-11644
> URL: https://issues.apache.org/jira/browse/HDFS-11644
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: erasure-coding
>Affects Versions: 3.0.0-alpha1
>Reporter: Andrew Wang
>Assignee: Manoj Govindassamy
>  Labels: hdfs-ec-3.0-must-do
> Fix For: 3.0.0-alpha3
>
> Attachments: HDFS-11644.01.patch, HDFS-11644.02.patch, 
> HDFS-11644.03.patch, HDFS-11644-branch-2.01.patch
>
>
> FSDataOutputStream#hsync checks if a stream implements Syncable, and if so, 
> calls hsync. Otherwise, it just calls flush. This is used, for instance, by 
> YARN's FileSystemTimelineWriter.
> DFSStripedOutputStream extends DFSOutputStream, which implements Syncable. 
> However, DFSStripedOS throws a runtime exception when the Syncable methods 
> are called.
> We should refactor the inheritance structure so DFSStripedOS does not 
> implement Syncable.






[jira] [Created] (HDFS-11757) Query StreamCapabilities when creating balancer's lock file

2017-05-04 Thread Andrew Wang (JIRA)
Andrew Wang created HDFS-11757:
--

 Summary: Query StreamCapabilities when creating balancer's lock 
file
 Key: HDFS-11757
 URL: https://issues.apache.org/jira/browse/HDFS-11757
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: balancer & mover
Affects Versions: 3.0.0-alpha2
Reporter: Andrew Wang


Once HDFS-11644 goes in, we'll have a clean way of querying for stream 
capabilities. We should redo the check in the Balancer, introduced in 
HDFS-11643, to query the capabilities.
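The capability-query pattern being proposed can be sketched as follows (class and method names are illustrative stand-ins for StreamCapabilities#hasCapability, not the actual Hadoop API):

```python
class EcOutputStream:
    """Toy stand-in for an erasure-coded stream: hflush unsupported."""
    def has_capability(self, cap):
        return cap not in ("hflush", "hsync")

class ReplicatedOutputStream:
    """Toy stand-in for a replicated stream: hflush supported."""
    def has_capability(self, cap):
        return True
    def hflush(self):
        pass

def write_lock_file(stream):
    # Sketch of the proposed Balancer fencing check: ask the stream
    # whether it supports hflush instead of calling it blindly and
    # failing at runtime on EC files.
    if stream.has_capability("hflush"):
        stream.hflush()
        return "fenced with hflush"
    return "fenced without hflush"
```

This replaces an instanceof-style check with an explicit capability query, so new stream types degrade gracefully.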






[jira] [Created] (HDFS-11687) Add new public encryption APIs required by Hive

2017-04-20 Thread Andrew Wang (JIRA)
Andrew Wang created HDFS-11687:
--

 Summary: Add new public encryption APIs required by Hive
 Key: HDFS-11687
 URL: https://issues.apache.org/jira/browse/HDFS-11687
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: encryption
Affects Versions: 2.6.5
Reporter: Andrew Wang


As discovered on HADOOP-14333, Hive is using reflection to get a DFSClient for 
its encryption shim. We should provide proper public APIs for getting this 
information.






[jira] [Resolved] (HDFS-11683) getErasureCodingPolicyByName should throw a checked exception

2017-04-19 Thread Andrew Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-11683?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Wang resolved HDFS-11683.

Resolution: Not A Problem

Looking more, it seems like this is actually pretty common in our API. Never mind.

> getErasureCodingPolicyByName should throw a checked exception
> -
>
> Key: HDFS-11683
> URL: https://issues.apache.org/jira/browse/HDFS-11683
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: erasure-coding
>Affects Versions: 3.0.0-alpha3
>Reporter: Andrew Wang
>  Labels: newbie
>
> getErasureCodingPolicyByName throws a HadoopIllegalArgumentException, which 
> is a RuntimeException. This is basically only used by CLI tools and client 
> code, we should throw a different exception in the NameNode.






[jira] [Created] (HDFS-11683) getErasureCodingPolicyByName should throw a checked exception

2017-04-19 Thread Andrew Wang (JIRA)
Andrew Wang created HDFS-11683:
--

 Summary: getErasureCodingPolicyByName should throw a checked 
exception
 Key: HDFS-11683
 URL: https://issues.apache.org/jira/browse/HDFS-11683
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: erasure-coding
Affects Versions: 3.0.0-alpha3
Reporter: Andrew Wang


getErasureCodingPolicyByName throws a HadoopIllegalArgumentException, which is 
a RuntimeException. This is basically only used by CLI tools and client code, 
we should throw a different exception in the NameNode.
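The distinction being argued for: HadoopIllegalArgumentException is unchecked, so callers aren't forced to handle a missing policy. A sketch of the checked-exception style in Python terms (the exception name and policy table are hypothetical, not the actual NameNode code):

```python
# In Java terms: HadoopIllegalArgumentException extends RuntimeException,
# so the compiler never forces callers to handle it. A checked (declared)
# exception makes the "policy not found" case part of the API contract.
class NoSuchErasureCodingPolicyException(Exception):
    pass

POLICIES = {"RS-6-3-1024k": {"data": 6, "parity": 3}}

def get_ec_policy_by_name(name):
    if name not in POLICIES:
        raise NoSuchErasureCodingPolicyException(name)
    return POLICIES[name]

try:
    get_ec_policy_by_name("bogus")
    outcome = "found"
except NoSuchErasureCodingPolicyException:
    # The caller is pushed to handle the missing-policy case explicitly.
    outcome = "handled missing policy"
```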






[jira] [Created] (HDFS-11682) TestBalancer#testBalancerWithStripedFile is flaky

2017-04-19 Thread Andrew Wang (JIRA)
Andrew Wang created HDFS-11682:
--

 Summary: TestBalancer#testBalancerWithStripedFile is flaky
 Key: HDFS-11682
 URL: https://issues.apache.org/jira/browse/HDFS-11682
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: test
Affects Versions: 3.0.0-alpha3
Reporter: Andrew Wang


Saw this fail in two different ways on a precommit run, but it passes locally.






[jira] [Created] (HDFS-11672) Fix some misleading log messages related to EC block groups

2017-04-18 Thread Andrew Wang (JIRA)
Andrew Wang created HDFS-11672:
--

 Summary: Fix some misleading log messages related to EC block 
groups
 Key: HDFS-11672
 URL: https://issues.apache.org/jira/browse/HDFS-11672
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: erasure-coding
Affects Versions: 3.0.0-alpha1
Reporter: Andrew Wang
Priority: Minor


I turned the NameNode's BlockStateChange and hdfs.StateChange logs up to debug 
and noticed some out-of-date log messages like the following:

{noformat}
2017-04-18 17:19:50,639 [IPC Server handler 4 on 45133] DEBUG hdfs.StateChange 
(FSDirWriteFileOp.java:addBlock(507)) - DIR* FSDirectory.addBlock: 
/test2RecoveryTasksForSameBlockGroup with blk_-9223372036854775792_1001 block 
is added to the in-memory file system

2017-04-18 17:19:50,836 [Block report processor] DEBUG BlockStateChange 
(LowRedundancyBlocks.java:update(353)) - BLOCK* 
NameSystem.LowRedundancyBlock.update: blk_-9223372036854775792_1001 has only 8 
replicas and needs 9 replicas so is added to neededReconstructions at priority 
level 2

2017-04-18 17:19:51,235 [IPC Server handler 4 on 45133] DEBUG hdfs.StateChange 
(FSNamesystem.java:closeFile(3740)) - closeFile: 
/test2RecoveryTasksForSameBlockGroup with 1 blocks is persisted to the file 
system
{noformat}






[jira] [Created] (HDFS-11671) TestReconstructStripedBlocks#test2RecoveryTasksForSameBlockGroup fails

2017-04-18 Thread Andrew Wang (JIRA)
Andrew Wang created HDFS-11671:
--

 Summary: 
TestReconstructStripedBlocks#test2RecoveryTasksForSameBlockGroup fails
 Key: HDFS-11671
 URL: https://issues.apache.org/jira/browse/HDFS-11671
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: erasure-coding, test
Affects Versions: 3.0.0-alpha2
Reporter: Andrew Wang


This failed on a unit test run with 3.0.0-alpha2.

{noformat}
java.lang.AssertionError: expected:<1> but was:<0>
at org.junit.Assert.fail(Assert.java:88)
at org.junit.Assert.failNotEquals(Assert.java:743)
at org.junit.Assert.assertEquals(Assert.java:118)
at org.junit.Assert.assertEquals(Assert.java:555)
at org.junit.Assert.assertEquals(Assert.java:542)
at 
org.apache.hadoop.hdfs.server.namenode.TestReconstructStripedBlocks.test2RecoveryTasksForSameBlockGroup(TestReconstructStripedBlocks.java:223)
{noformat}






[jira] [Created] (HDFS-11660) TestFsDataset#testPageRounder fails intermittently with AssertionError

2017-04-17 Thread Andrew Wang (JIRA)
Andrew Wang created HDFS-11660:
--

 Summary: TestFsDataset#testPageRounder fails intermittently with 
AssertionError
 Key: HDFS-11660
 URL: https://issues.apache.org/jira/browse/HDFS-11660
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: test
Affects Versions: 2.6.5
Reporter: Andrew Wang
Assignee: Andrew Wang


We've seen this test fail occasionally with an error like the following:

{noformat}
java.lang.AssertionError
at 
org.apache.hadoop.hdfs.server.datanode.BPServiceActor.offerService(BPServiceActor.java:510)
at 
org.apache.hadoop.hdfs.server.datanode.BPServiceActor.run(BPServiceActor.java:695)
at java.lang.Thread.run(Thread.java:745)
{noformat}

This assertion fires when the heartbeat response is null.






[jira] [Created] (HDFS-11652) Improve ECSchema and ErasureCodingPolicy toString, hashCode, equals

2017-04-12 Thread Andrew Wang (JIRA)
Andrew Wang created HDFS-11652:
--

 Summary: Improve ECSchema and ErasureCodingPolicy toString, 
hashCode, equals
 Key: HDFS-11652
 URL: https://issues.apache.org/jira/browse/HDFS-11652
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: erasure-coding
Affects Versions: 3.0.0-alpha2
Reporter: Andrew Wang
Assignee: Andrew Wang
Priority: Minor


Some small cleanups to these methods.
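The contract these cleanups have to preserve: objects that are equal must hash equally, and toString should be compact and readable. A sketch of that contract (field names are illustrative, not the actual ECSchema/ErasureCodingPolicy fields):

```python
from dataclasses import dataclass

# frozen=True gives value-based __eq__ and __hash__ together, the same
# pairing the equals/hashCode contract requires in Java: equal policies
# must produce equal hashes so they behave as map/set keys.
@dataclass(frozen=True)
class EcPolicy:
    codec: str
    data_units: int
    parity_units: int
    cell_size: int

    def __str__(self):
        # Compact, human-readable form, e.g. "rs-6-3-1024k".
        return (f"{self.codec}-{self.data_units}-"
                f"{self.parity_units}-{self.cell_size // 1024}k")

a = EcPolicy("rs", 6, 3, 1024 * 1024)
b = EcPolicy("rs", 6, 3, 1024 * 1024)
# a == b and hash(a) == hash(b): safe to use as dict/set keys.
```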






[jira] [Created] (HDFS-11651) Add a public API for specifying an EC policy at create time

2017-04-12 Thread Andrew Wang (JIRA)
Andrew Wang created HDFS-11651:
--

 Summary: Add a public API for specifying an EC policy at create 
time
 Key: HDFS-11651
 URL: https://issues.apache.org/jira/browse/HDFS-11651
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: erasure-coding
Affects Versions: 3.0.0-alpha3
Reporter: Andrew Wang


Follow-on work from HDFS-10996. We extended the create builder, but it still 
requires casting to DistributedFileSystem to use, so it is not a public API.






[jira] [Created] (HDFS-11644) DFSStripedOutputStream should not implement Syncable

2017-04-11 Thread Andrew Wang (JIRA)
Andrew Wang created HDFS-11644:
--

 Summary: DFSStripedOutputStream should not implement Syncable
 Key: HDFS-11644
 URL: https://issues.apache.org/jira/browse/HDFS-11644
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: erasure-coding
Affects Versions: 3.0.0-alpha1
Reporter: Andrew Wang


FSDataOutputStream#hsync checks if a stream implements Syncable, and if so, 
calls hsync. Otherwise, it just calls flush. This is used, for instance, by 
YARN's FileSystemTimelineWriter.

DFSStripedOutputStream extends DFSOutputStream, which implements Syncable. 
However, DFSStripedOS throws a runtime exception when the Syncable methods are 
called.

We should refactor the inheritance structure so DFSStripedOS does not implement 
Syncable.
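The dispatch described above, and why the inheritance is a problem, can be sketched like this (Python stand-ins for the Java classes; the dispatch mirrors what the text says FSDataOutputStream#hsync does):

```python
class Syncable:
    """Marker for streams that claim durable-sync support."""
    def hsync(self):
        raise NotImplementedError

class ReplicatedStream(Syncable):
    def hsync(self):
        return "hsynced"
    def flush(self):
        return "flushed"

class StripedStream(Syncable):
    # Inherits Syncable via DFSOutputStream, but can't honor it:
    # the Syncable methods throw at runtime.
    def hsync(self):
        raise RuntimeError("hsync not supported for EC files")
    def flush(self):
        return "flushed"

def fs_hsync(stream):
    # FSDataOutputStream#hsync per the description: if the stream is
    # Syncable, call hsync; otherwise fall back to flush. A stream that
    # implements Syncable but throws defeats this type-based check.
    if isinstance(stream, Syncable):
        return stream.hsync()
    return stream.flush()
```

If StripedStream simply did not implement Syncable, fs_hsync would fall back to flush instead of blowing up, which is the refactoring the issue proposes.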






[jira] [Created] (HDFS-11643) Balancer fencing fails when writing erasure coded lock file

2017-04-11 Thread Andrew Wang (JIRA)
Andrew Wang created HDFS-11643:
--

 Summary: Balancer fencing fails when writing erasure coded lock 
file
 Key: HDFS-11643
 URL: https://issues.apache.org/jira/browse/HDFS-11643
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: balancer & mover, erasure-coding
Affects Versions: 3.0.0-alpha1
Reporter: Andrew Wang
Priority: Critical


At startup, the balancer writes its hostname to the lock file and calls 
hflush(). hflush is not supported for EC files, so this fails when the entire 
filesystem is erasure coded.






[jira] [Resolved] (HDFS-11629) Revert HDFS-11431 hadoop-hdfs-client JAR does not include ConfiguredFailoverProxyProvider.

2017-04-05 Thread Andrew Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-11629?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Wang resolved HDFS-11629.

   Resolution: Fixed
Fix Version/s: 2.8.1

> Revert HDFS-11431 hadoop-hdfs-client JAR does not include 
> ConfiguredFailoverProxyProvider.
> --
>
> Key: HDFS-11629
> URL: https://issues.apache.org/jira/browse/HDFS-11629
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 2.8.0
>Reporter: Andrew Wang
>Assignee: Andrew Wang
> Fix For: 2.8.1
>
>
> New JIRA for tracking the revert of HDFS-11431 from branch-2.8.






[jira] [Created] (HDFS-11629) Revert HDFS-11431 hadoop-hdfs-client JAR does not include ConfiguredFailoverProxyProvider.

2017-04-05 Thread Andrew Wang (JIRA)
Andrew Wang created HDFS-11629:
--

 Summary: Revert HDFS-11431 hadoop-hdfs-client JAR does not include 
ConfiguredFailoverProxyProvider.
 Key: HDFS-11629
 URL: https://issues.apache.org/jira/browse/HDFS-11629
 Project: Hadoop HDFS
  Issue Type: Bug
Affects Versions: 2.8.0
Reporter: Andrew Wang
Assignee: Andrew Wang


New JIRA for tracking the revert of HDFS-11431 from branch-2.8.






[jira] [Reopened] (HDFS-11538) Move ClientProtocol HA proxies into hadoop-hdfs-client

2017-04-05 Thread Andrew Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-11538?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Wang reopened HDFS-11538:


> Move ClientProtocol HA proxies into hadoop-hdfs-client
> --
>
> Key: HDFS-11538
> URL: https://issues.apache.org/jira/browse/HDFS-11538
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs-client
>Affects Versions: 2.8.0, 3.0.0-alpha1
>Reporter: Andrew Wang
>Assignee: Huafeng Wang
>Priority: Blocker
> Fix For: 3.0.0-alpha3
>
> Attachments: HDFS-11538.001.patch, HDFS-11538.002.patch, 
> HDFS-11538.003.patch
>
>
> Follow-up for HDFS-11431. We should move this missing class over rather than 
> pulling in the whole hadoop-hdfs dependency.






[jira] [Created] (HDFS-11623) Refactor system erasure coding policies into protocol package

2017-04-04 Thread Andrew Wang (JIRA)
Andrew Wang created HDFS-11623:
--

 Summary: Refactor system erasure coding policies into protocol 
package
 Key: HDFS-11623
 URL: https://issues.apache.org/jira/browse/HDFS-11623
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: erasure-coding
Affects Versions: 3.0.0-alpha2
Reporter: Andrew Wang
Assignee: Andrew Wang


This is a precursor to HDFS-11565. We need to move the set of system defined EC 
policies out of the NameNode's ECPolicyManager into the hdfs-client module so 
it can be referenced by the client.






[jira] [Created] (HDFS-11600) Refactor TestDFSStripedOutputStreamWithFailure test classes

2017-03-30 Thread Andrew Wang (JIRA)
Andrew Wang created HDFS-11600:
--

 Summary: Refactor TestDFSStripedOutputStreamWithFailure test 
classes
 Key: HDFS-11600
 URL: https://issues.apache.org/jira/browse/HDFS-11600
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: test
Affects Versions: 3.0.0-alpha2
Reporter: Andrew Wang
Priority: Minor


TestDFSStripedOutputStreamWithFailure has a great number of subclasses. The 
tests are parameterized based on the name of these subclasses.

Seems like we could parameterize these tests with JUnit and then not need all 
these separate test classes.

Another note: the tests will sometimes randomly return instead of running. Using 
{{Assume}} instead would make it clearer in the test output that these tests 
were skipped.
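The suggestion boils down to generating the test matrix as parameters and skipping inapplicable cases explicitly rather than returning silently. A minimal sketch in Python (the case shape and skip condition are made up for illustration; JUnit's {{Parameterized}} runner and {{Assume}} play these roles in Java):

```python
# Generate the case matrix instead of declaring one subclass per case.
CASES = [(dn_count, kill_pos)
         for dn_count in (9, 10, 11)
         for kill_pos in (0, 1)]

def run_case(dn_count, kill_pos):
    if dn_count < 10 and kill_pos > 0:
        # Assume-style explicit skip: the case is reported as skipped
        # in the output instead of silently passing.
        return "skipped"
    return "ran"

results = [run_case(*case) for case in CASES]
```

The skipped cases now show up as such in the results instead of masquerading as passes.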






[jira] [Created] (HDFS-11596) hadoop-hdfs-client jar is in the wrong directory in release tarball

2017-03-29 Thread Andrew Wang (JIRA)
Andrew Wang created HDFS-11596:
--

 Summary: hadoop-hdfs-client jar is in the wrong directory in 
release tarball
 Key: HDFS-11596
 URL: https://issues.apache.org/jira/browse/HDFS-11596
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: build
Affects Versions: 3.0.0-alpha2, 2.8.0
Reporter: Andrew Wang
Priority: Critical


Mentioned by [~aw] on HDFS-11356. The hdfs-client jar is in the lib directory 
rather than with the other hadoop jars:

From the alpha2 artifacts:

{noformat}
-> % find . -name "*hdfs-client*.jar"
./share/hadoop/httpfs/tomcat/webapps/webhdfs/WEB-INF/lib/hadoop-hdfs-client-3.0.0-alpha2.jar
./share/hadoop/hdfs/sources/hadoop-hdfs-client-3.0.0-alpha2-sources.jar
./share/hadoop/hdfs/sources/hadoop-hdfs-client-3.0.0-alpha2-test-sources.jar
./share/hadoop/hdfs/lib/hadoop-hdfs-client-3.0.0-alpha2.jar
./share/hadoop/hdfs/hadoop-hdfs-client-3.0.0-alpha2-tests.jar
{noformat}

Strangely enough, the tests jar is in the right place.






[jira] [Reopened] (HDFS-11170) Add builder-based create API to FileSystem

2017-03-24 Thread Andrew Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-11170?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Wang reopened HDFS-11170:


Reopen issue for branch-2 backport.

> Add builder-based create API to FileSystem
> --
>
> Key: HDFS-11170
> URL: https://issues.apache.org/jira/browse/HDFS-11170
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: SammiChen
>Assignee: SammiChen
>  Labels: hdfs-ec-3.0-nice-to-have
> Fix For: 3.0.0-alpha3
>
> Attachments: HDFS-11170-00.patch, HDFS-11170-01.patch, 
> HDFS-11170-02.patch, HDFS-11170-03.patch, HDFS-11170-04.patch, 
> HDFS-11170-05.patch, HDFS-11170-06.patch, HDFS-11170-07.patch, 
> HDFS-11170-08.patch
>
>
> The FileSystem class supports multiple create functions to help users create 
> files. Some create functions have many parameters, and it's hard for users to 
> remember these parameters and their order. This task is to add builder-based 
> create functions to help users create files more easily.
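The builder idea above can be sketched as named, chainable setters with defaults, in place of a long positional parameter list (method names and defaults here are illustrative, not FileSystem's actual builder API):

```python
class CreateBuilder:
    """Toy file-create builder: each setter returns self for chaining."""
    def __init__(self, path):
        self.path = path
        # Defaults stand in for the many positional create() parameters.
        self.opts = {"overwrite": False,
                     "replication": 3,
                     "block_size": 128 * 2**20}

    def overwrite(self, flag):
        self.opts["overwrite"] = flag
        return self

    def replication(self, n):
        self.opts["replication"] = n
        return self

    def build(self):
        # A real builder would open the stream; here we just return
        # what would be passed to the filesystem.
        return (self.path, dict(self.opts))

path, opts = CreateBuilder("/tmp/f").overwrite(True).replication(2).build()
```

Callers set only the options they care about, by name, and everything else keeps a sensible default.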






[jira] [Created] (HDFS-11565) Use compact identifiers for built-in ECPolicies in HdfsFileStatus

2017-03-22 Thread Andrew Wang (JIRA)
Andrew Wang created HDFS-11565:
--

 Summary: Use compact identifiers for built-in ECPolicies in 
HdfsFileStatus
 Key: HDFS-11565
 URL: https://issues.apache.org/jira/browse/HDFS-11565
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: erasure-coding
Affects Versions: 3.0.0-alpha2
Reporter: Andrew Wang
Assignee: Andrew Wang


Discussed briefly on HDFS-7337 with Kai Zheng. Quoting our convo:

{quote}
From looking at the protos, one other question I had is about the overhead of 
these protos when using the hardcoded policies. There are a bunch of strings 
and ints, which can be kind of heavy since they're added to each 
HdfsFileStatus. Should we make the built-in ones identified by purely an ID, 
with these fully specified protos used for the pluggable policies?
{quote}

{quote}
Sounds like this could be considered separately because, for either built-in 
policies or plugged-in policies, the full meta info is maintained either in the 
code or persisted in the fsimage, so identifying them by purely an ID should 
work fine. If you agree, we could refactor the code you mentioned above 
separately.
{quote}
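The compact-identifier scheme under discussion can be sketched as: built-in policies travel as a bare ID in each HdfsFileStatus, while only pluggable policies carry the full descriptor (the ID, field names, and wire shape below are illustrative, not the actual protobuf definitions):

```python
# Built-in policies are known to both sides, so the receiver can
# reconstruct the full policy from just the ID.
BUILT_IN = {1: {"name": "RS-6-3-1024k", "data": 6, "parity": 3}}

def encode_policy(policy_id, full_policy=None):
    if policy_id in BUILT_IN:
        return {"id": policy_id}                      # compact: ID only
    return {"id": policy_id, "policy": full_policy}   # pluggable: full info

def decode_policy(msg):
    # Look up built-ins locally; fall back to the inlined descriptor.
    return BUILT_IN.get(msg["id"]) or msg["policy"]

wire = encode_policy(1)
```

Since the full metadata lives in the code (built-ins) or the fsimage (plugged-ins), the per-file message shrinks to a single integer in the common case.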






[jira] [Resolved] (HDFS-11431) hadoop-hdfs-client JAR does not include ConfiguredFailoverProxyProvider

2017-03-16 Thread Andrew Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-11431?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Wang resolved HDFS-11431.

Resolution: Fixed

Let's close this as fixed only in branch-2.8.0 / branch-2.8. I also reverted 
this from branch-2.

Filed HDFS-11538 to do the real fix for 2.9 and 3.0.

> hadoop-hdfs-client JAR does not include ConfiguredFailoverProxyProvider
> ---
>
> Key: HDFS-11431
> URL: https://issues.apache.org/jira/browse/HDFS-11431
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: build, hdfs-client
>Affects Versions: 2.8.0, 3.0.0-alpha3
>Reporter: Steven Rand
>Assignee: Steven Rand
>Priority: Blocker
>  Labels: maven
> Fix For: 2.8.0
>
> Attachments: HDFS-11431-branch-2.8.0.001.patch, 
> HDFS-11431-branch-2.8.0.002.patch
>
>
> The {{hadoop-hdfs-client-2.8.0.jar}} file does not include the 
> {{ConfiguredFailoverProxyProvider}} class. This breaks client applications 
> that use this class to communicate with the active NameNode in an HA 
> deployment of HDFS.






[jira] [Created] (HDFS-11538) Move ConfiguredFailoverProxyProvider into hadoop-hdfs-client

2017-03-16 Thread Andrew Wang (JIRA)
Andrew Wang created HDFS-11538:
--

 Summary: Move ConfiguredFailoverProxyProvider into 
hadoop-hdfs-client
 Key: HDFS-11538
 URL: https://issues.apache.org/jira/browse/HDFS-11538
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: hdfs-client
Affects Versions: 3.0.0-alpha1, 2.8.0
Reporter: Andrew Wang
Priority: Blocker


Follow-up for HDFS-11431. We should move this missing class over rather than 
pulling in the whole hadoop-hdfs dependency.






[jira] [Created] (HDFS-11510) Revamp erasure coding user documentation

2017-03-07 Thread Andrew Wang (JIRA)
Andrew Wang created HDFS-11510:
--

 Summary: Revamp erasure coding user documentation
 Key: HDFS-11510
 URL: https://issues.apache.org/jira/browse/HDFS-11510
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: documentation
Affects Versions: 3.0.0-alpha3
Reporter: Andrew Wang
Assignee: Andrew Wang


After we finish more of the must-do EC changes targeted for 3.0, it'd be good 
to take a fresh look at the EC documentation to make sure it's comprehensive, 
particularly how to choose a good erasure coding policy for your cluster and 
how to enable policies.






[jira] [Created] (HDFS-11506) Move ErasureCodingPolicyManager#getSystemDefaultPolicy to test code

2017-03-06 Thread Andrew Wang (JIRA)
Andrew Wang created HDFS-11506:
--

 Summary: Move ErasureCodingPolicyManager#getSystemDefaultPolicy to 
test code
 Key: HDFS-11506
 URL: https://issues.apache.org/jira/browse/HDFS-11506
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: erasure-coding
Affects Versions: 3.0.0-alpha3
Reporter: Andrew Wang
Assignee: Andrew Wang


In HDFS-11416, we removed usages of getSystemDefaultPolicy in production code. 
Let's move it to a test package to make sure no production code calls it in the 
future.






[jira] [Created] (HDFS-11505) Do not enable any erasure coding policies by default

2017-03-06 Thread Andrew Wang (JIRA)
Andrew Wang created HDFS-11505:
--

 Summary: Do not enable any erasure coding policies by default
 Key: HDFS-11505
 URL: https://issues.apache.org/jira/browse/HDFS-11505
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: erasure-coding
Affects Versions: 3.0.0-alpha3
Reporter: Andrew Wang
Assignee: Andrew Wang


As discussed on HDFS-11314, administrators need to choose the correct set of EC 
policies based on cluster size and desired fault-tolerance properties.

This means we should not enable any EC policies by default, since any default 
value could be incorrect.
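The cluster-size dependence can be made concrete: a Reed-Solomon policy with d data and p parity units needs at least d + p datanodes to place every unit on a distinct node. A sketch of that feasibility check, using the standard RS shapes (the function is illustrative, not an actual admin tool):

```python
def feasible_policies(num_datanodes):
    # RS(d, p) needs at least d + p datanodes so each striped unit can
    # land on a distinct node (ideally distinct racks as well).
    shapes = {"RS-3-2": (3, 2), "RS-6-3": (6, 3), "RS-10-4": (10, 4)}
    return sorted(name for name, (d, p) in shapes.items()
                  if num_datanodes >= d + p)

print(feasible_policies(9))   # ['RS-3-2', 'RS-6-3']
```

A 9-node cluster can run RS-6-3 but not RS-10-4, which is exactly why no single default policy is safe to ship enabled.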






[jira] [Created] (HDFS-11503) Integrate Chocolate Cloud RS coder implementation

2017-03-06 Thread Andrew Wang (JIRA)
Andrew Wang created HDFS-11503:
--

 Summary: Integrate Chocolate Cloud RS coder implementation
 Key: HDFS-11503
 URL: https://issues.apache.org/jira/browse/HDFS-11503
 Project: Hadoop HDFS
  Issue Type: New Feature
  Components: erasure-coding
Affects Versions: 3.0.0-alpha2
Reporter: Andrew Wang
Assignee: Marcell Feher


Quote from Marcell on HDFS-7285:

First of all let me introduce ourselves: we are Chocolate Cloud from Denmark, 
we use erasure coding to improve storage solutions. We already have 
Reed-Solomon and Random Linear Network Coding backends for Liberasurecode, and 
now we are at the final stage of developing our RS plugin to HDFS-EC. The 
performance of our plugin is similar to ISA-L's, in some configurations we are 
better, in others we are worse (our initial speed comparison charts can be 
found here: https://www.chocolate-cloud.cc/Plugins/HDFS-EC/hdfs.html).

We would like our plugin to become officially supported in Hadoop 3.0. We can 
already provide a preliminary version of our (native) library and a patch with 
the necessary glue code for the next alpha release.

I'd like to know your thoughts about whether it's possible and how it could be 
achieved.

P.S.: I'm happy to share more details if there's interest.






[jira] [Created] (HDFS-11498) Make RestCsrfPreventionHandler and WebHdfsHandler compatible with Netty 4.0

2017-03-03 Thread Andrew Wang (JIRA)
Andrew Wang created HDFS-11498:
--

 Summary: Make RestCsrfPreventionHandler and WebHdfsHandler 
compatible with Netty 4.0
 Key: HDFS-11498
 URL: https://issues.apache.org/jira/browse/HDFS-11498
 Project: Hadoop HDFS
  Issue Type: Bug
Affects Versions: 2.8.0
Reporter: Andrew Wang
Assignee: Andrew Wang


Per discussion in HADOOP-13866, it looks like we can change 2.8.0 back to 
exposing Netty 4.0, but still be ABI compatible with Netty 4.1 for users like 
HBase that want to swap out the version.






[jira] [Reopened] (HDFS-11477) Simplify file IO profiling configuration

2017-03-02 Thread Andrew Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-11477?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Wang reopened HDFS-11477:


Reopening, I think this broke the build. I saw a DNConf error referring to a 
FILEIO config key, reverted this and it passed afterwards.

Precommit seems pretty messed up right now unfortunately :(

> Simplify file IO profiling configuration
> 
>
> Key: HDFS-11477
> URL: https://issues.apache.org/jira/browse/HDFS-11477
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Hanisha Koneru
>Assignee: Hanisha Koneru
>Priority: Minor
> Attachments: HDFS-11477.000.patch, HDFS-11477.001.patch, 
> HDFS-11477.002.patch
>
>
> For Profiling FileIO events, there are 2 keys:
> - DFS_DATANODE_ENABLE_FILEIO_PROFILING_KEY for enabling the hooks
> - DFS_DATANODE_FILEIO_PROFILING_SAMPLING_FRACTION_KEY for setting the 
> sampling fraction 
> We can instead have only the sampling fraction key and set it to 0 if we want 
> to disable profiling.






[jira] [Created] (HDFS-11466) Change dfs.namenode.write-lock-reporting-threshold-ms default from 1000ms to 5000ms

2017-02-27 Thread Andrew Wang (JIRA)
Andrew Wang created HDFS-11466:
--

 Summary: Change dfs.namenode.write-lock-reporting-threshold-ms 
default from 1000ms to 5000ms
 Key: HDFS-11466
 URL: https://issues.apache.org/jira/browse/HDFS-11466
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: namenode
Affects Versions: 3.0.0-alpha1, 2.8.0, 2.7.4
Reporter: Andrew Wang
Assignee: Andrew Wang


Per discussion on HDFS-10798, it might make sense to change the default value 
for long write lock holds to 5000ms like the read threshold, to avoid spamming 
the log.
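
If adopted, the change amounts to the following hdfs-site.xml fragment (a sketch of the proposed default; the 5000 ms value mirrors the read-lock reporting threshold referenced above):

```xml
<!-- Proposed default: only report NameNode write-lock holds longer than
     5000 ms, matching the read-lock reporting threshold. -->
<property>
  <name>dfs.namenode.write-lock-reporting-threshold-ms</name>
  <value>5000</value>
</property>
```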






[jira] [Created] (HDFS-11465) Rename BlockType#CONTIGUOUS to BlockType#REPLICATED

2017-02-27 Thread Andrew Wang (JIRA)
Andrew Wang created HDFS-11465:
--

 Summary: Rename BlockType#CONTIGUOUS to BlockType#REPLICATED
 Key: HDFS-11465
 URL: https://issues.apache.org/jira/browse/HDFS-11465
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: namenode
Affects Versions: 3.0.0-alpha2
Reporter: Andrew Wang


HDFS-10759 introduced a BlockType enum to INodeFile, with possible values 
CONTIGUOUS or STRIPED.

Since HDFS-8030 wants to implement "contiguous EC", CONTIGUOUS isn't an 
appropriate name. I propose we rename CONTIGUOUS to REPLICATED for clarity.






[jira] [Created] (HDFS-11428) Change setErasureCodingPolicy to take a required string EC policy name

2017-02-17 Thread Andrew Wang (JIRA)
Andrew Wang created HDFS-11428:
--

 Summary: Change setErasureCodingPolicy to take a required string 
EC policy name
 Key: HDFS-11428
 URL: https://issues.apache.org/jira/browse/HDFS-11428
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: erasure-coding
Affects Versions: 3.0.0-alpha1
Reporter: Andrew Wang
Assignee: Andrew Wang


The current {{setErasureCodingPolicy}} API takes an optional {{ECPolicy}}. This 
makes calling the API harder for clients, since they need to turn a specified 
name into a policy, and the set of available EC policies is only available on 
the NN.

You can see this awkwardness in the current EC cli set command: it first 
fetches the list of EC policies, looks for the one specified by the user, then 
calls set. This means we need to issue two RPCs for every set (inefficient), 
and we need to do validation on the NN side anyway (extraneous work).

Since we're phasing out the system default EC policy, it also makes sense to 
make the policy a required parameter.






[jira] [Created] (HDFS-11427) Rename "rs-legacy" to "rs"

2017-02-17 Thread Andrew Wang (JIRA)
Andrew Wang created HDFS-11427:
--

 Summary: Rename "rs-legacy" to "rs"
 Key: HDFS-11427
 URL: https://issues.apache.org/jira/browse/HDFS-11427
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: erasure-coding
Affects Versions: 3.0.0-alpha1
Reporter: Andrew Wang
Assignee: Andrew Wang


I think it's confusing (and verbose) that our RS code is named "rs-default" 
instead of just "rs", partially because we also have the idea of setting a 
default implementation for each codec. Let's rename it for simplicity.

This is an incompatible change since it affects some configuration keys.






[jira] [Created] (HDFS-11426) Refactor EC CLI to be similar to storage policies CLI

2017-02-17 Thread Andrew Wang (JIRA)
Andrew Wang created HDFS-11426:
--

 Summary: Refactor EC CLI to be similar to storage policies CLI
 Key: HDFS-11426
 URL: https://issues.apache.org/jira/browse/HDFS-11426
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: erasure-coding, shell
Affects Versions: 3.0.0-alpha1
Reporter: Andrew Wang
Assignee: Andrew Wang


The {{hdfs erasurecode}} CLI is similar to {{hdfs storagepolicies}} in terms of 
functionality, but different in terms of behavior. Let's refactor {{ECCli}} to 
be more similar to the various Admin classes we already have, and also make its 
calling syntax mimic {{hdfs storagepolicies}} as closely as possible.






[jira] [Created] (HDFS-11416) Refactor out system default erasure coding policy

2017-02-14 Thread Andrew Wang (JIRA)
Andrew Wang created HDFS-11416:
--

 Summary: Refactor out system default erasure coding policy
 Key: HDFS-11416
 URL: https://issues.apache.org/jira/browse/HDFS-11416
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: erasure-coding
Affects Versions: 3.0.0-alpha1
Reporter: Andrew Wang
Assignee: Andrew Wang


As discussed on HDFS-7859, the system default EC policy is mostly a relic from 
development when the system only supported a single global policy. Now, we 
support multiple policies, and the system default policy is mostly used by 
tests.

We should refactor to remove this concept.






[jira] [Created] (HDFS-11406) Remove unused getStartInstance and getFinalizeInstance in FSEditLogOp

2017-02-10 Thread Andrew Wang (JIRA)
Andrew Wang created HDFS-11406:
--

 Summary: Remove unused getStartInstance and getFinalizeInstance in 
FSEditLogOp
 Key: HDFS-11406
 URL: https://issues.apache.org/jira/browse/HDFS-11406
 Project: Hadoop HDFS
  Issue Type: Improvement
Affects Versions: 2.8.0
Reporter: Andrew Wang
Assignee: Andrew Wang
Priority: Trivial


Looking at FSEditLogOp, these two methods are unused and can be removed:

{code}
static RollingUpgradeOp getStartInstance(OpInstanceCache cache) {
  return (RollingUpgradeOp) cache.get(OP_ROLLING_UPGRADE_START);
}

static RollingUpgradeOp getFinalizeInstance(OpInstanceCache cache) {
  return (RollingUpgradeOp) cache.get(OP_ROLLING_UPGRADE_FINALIZE);
}
{code}






[jira] [Created] (HDFS-11389) Support rolling upgrade between 2.x and 3.x

2017-02-03 Thread Andrew Wang (JIRA)
Andrew Wang created HDFS-11389:
--

 Summary: Support rolling upgrade between 2.x and 3.x
 Key: HDFS-11389
 URL: https://issues.apache.org/jira/browse/HDFS-11389
 Project: Hadoop HDFS
  Issue Type: Task
  Components: rolling upgrades
Affects Versions: 3.0.0-alpha2
Reporter: Andrew Wang
Assignee: Ray Chiang
Priority: Blocker


Counterpart JIRA to HDFS-11096. We need to:

* examine YARN and MR's JACC report for binary and source incompatibilities
* run the [PB 
differ|https://issues.apache.org/jira/browse/HDFS-11096?focusedCommentId=15816405&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15816405]
 that Sean wrote for HDFS-11096 for the YARN PBs.
* sanity test some rolling upgrades between 2.x and 3.x. Ideally these are 
automated and something we can run upstream.






[jira] [Resolved] (HDFS-11376) Revert HDFS-8377 Support HTTP/2 in datanode

2017-01-26 Thread Andrew Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-11376?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Wang resolved HDFS-11376.

   Resolution: Fixed
Fix Version/s: 3.0.0-alpha3
   2.9.0

> Revert HDFS-8377 Support HTTP/2 in datanode
> ---
>
> Key: HDFS-11376
> URL: https://issues.apache.org/jira/browse/HDFS-11376
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: datanode
>Affects Versions: 2.9.0, 3.0.0-alpha3
>Reporter: Andrew Wang
>Assignee: Xiao Chen
> Fix For: 2.9.0, 3.0.0-alpha3
>
>
> Tracking JIRA so this revert shows up in the release notes. See discussion on 
> HADOOP-13866.






[jira] [Created] (HDFS-11376) Revert HDFS-8377 Support HTTP/2 in datanode

2017-01-26 Thread Andrew Wang (JIRA)
Andrew Wang created HDFS-11376:
--

 Summary: Revert HDFS-8377 Support HTTP/2 in datanode
 Key: HDFS-11376
 URL: https://issues.apache.org/jira/browse/HDFS-11376
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: datanode
Affects Versions: 2.9.0, 3.0.0-alpha3
Reporter: Andrew Wang
Assignee: Xiao Chen


Tracking JIRA so this revert shows up in the release notes. See discussion on 
HADOOP-13866.






[jira] [Resolved] (HDFS-7344) [umbrella] Erasure Coding worker and support in DataNode

2017-01-19 Thread Andrew Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7344?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Wang resolved HDFS-7344.
---
Resolution: Done

Resolving since the required subtasks for this umbrella seem to be complete.

> [umbrella] Erasure Coding worker and support in DataNode
> 
>
> Key: HDFS-7344
> URL: https://issues.apache.org/jira/browse/HDFS-7344
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: datanode
>Reporter: Kai Zheng
>Assignee: Li Bo
> Attachments: ECWorker-design-v2.pdf, hdfs-ec-datanode.0108.zip, 
> hdfs-ec-datanode.0108.zip, HDFS ECWorker Design.pdf
>
>
> According to HDFS-7285 and the design, this handles DataNode side extension 
> and related support for Erasure Coding. More specifically, it implements 
> {{ECWorker}}, which reconstructs lost blocks (in striping layout).
> It generally needs to restore BlockGroup and schema information from coding 
> commands from NameNode or other entities, and construct specific coding work 
> to execute. The required block reader, writer, either local or remote, 
> encoder and decoder, will be implemented separately as sub-tasks. 
> This JIRA will track all the linked sub-tasks, and is responsible for general 
> discussions and integration for ECWorker. It won't resolve until all the 
> related tasks are done.






[jira] [Created] (HDFS-11349) [Umbrella] Additional testing for striped erasure coding

2017-01-18 Thread Andrew Wang (JIRA)
Andrew Wang created HDFS-11349:
--

 Summary: [Umbrella] Additional testing for striped erasure coding
 Key: HDFS-11349
 URL: https://issues.apache.org/jira/browse/HDFS-11349
 Project: Hadoop HDFS
  Issue Type: Test
  Components: erasure-coding, test
Affects Versions: 3.0.0-alpha1
Reporter: Andrew Wang


Umbrella for testing work for striped erasure coding.






[jira] [Created] (HDFS-11348) [Umbrella] Append and truncate for striped EC files

2017-01-18 Thread Andrew Wang (JIRA)
Andrew Wang created HDFS-11348:
--

 Summary: [Umbrella] Append and truncate for striped EC files
 Key: HDFS-11348
 URL: https://issues.apache.org/jira/browse/HDFS-11348
 Project: Hadoop HDFS
  Issue Type: New Feature
  Components: erasure-coding
Affects Versions: 3.0.0-alpha1
Reporter: Andrew Wang


Umbrella JIRA for adding append, hflush/hsync, and truncate support to striped 
erasure coded files.






[jira] [Created] (HDFS-11347) [Umbrella] Converting files from replicated to striped EC

2017-01-18 Thread Andrew Wang (JIRA)
Andrew Wang created HDFS-11347:
--

 Summary: [Umbrella] Converting files from replicated to striped EC
 Key: HDFS-11347
 URL: https://issues.apache.org/jira/browse/HDFS-11347
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: erasure-coding
Affects Versions: 3.0.0-alpha1
Reporter: Andrew Wang


Umbrella for work related to converting replicated files to striped EC.






[jira] [Resolved] (HDFS-8095) Allow to configure the system default EC schema

2017-01-10 Thread Andrew Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8095?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Wang resolved HDFS-8095.
---
Resolution: Not A Problem

Resolving per above comments, since we think that this can be handled by a 
combination of HDFS-7859 and HDFS-11314. Thanks [~drankye] for the discussion!

> Allow to configure the system default EC schema
> ---
>
> Key: HDFS-8095
> URL: https://issues.apache.org/jira/browse/HDFS-8095
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Kai Zheng
>Assignee: Kai Zheng
>  Labels: hdfs-ec-3.0-nice-to-have
>
> As suggested by [~umamaheswararao] and [~vinayrpet] in HDFS-8074, we may 
> desire allowing to configure the system default EC schema, so in any 
> deployment a cluster admin may be able to define their own system default 
> one. In the discussion, we have two approaches to configure the system 
> default schema: 1) predefine it in the {{ecschema-def.xml}} file, making sure 
> it's not changed; 2) configure the key parameter values as properties in 
> {{core-site.xml}}. Open this for future consideration in case it's forgotten.






[jira] [Created] (HDFS-11314) Validate client-provided EC schema on the NameNode

2017-01-10 Thread Andrew Wang (JIRA)
Andrew Wang created HDFS-11314:
--

 Summary: Validate client-provided EC schema on the NameNode
 Key: HDFS-11314
 URL: https://issues.apache.org/jira/browse/HDFS-11314
 Project: Hadoop HDFS
  Issue Type: Sub-task
  Components: erasure-coding
Affects Versions: 3.0.0-alpha1
Reporter: Andrew Wang


Filing based on discussion in HDFS-8095. A user might specify a policy that is 
not appropriate for the cluster, e.g. a RS (10,4) policy when the cluster only 
has 10 nodes. The NN should only allow the client to choose from a pre-approved 
list determined by the cluster administrator.






[jira] [Resolved] (HDFS-10258) Erasure Coding: support small cluster whose #DataNode < # (Blocks in a BlockGroup)

2017-01-10 Thread Andrew Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10258?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Wang resolved HDFS-10258.

Resolution: Later

We also committed the XOR 2,1 policy, so I think the priority of this JIRA is 
lessened. We can revisit if small clusters are found to be important later.

> Erasure Coding: support small cluster whose #DataNode < # (Blocks in a 
> BlockGroup)
> --
>
> Key: HDFS-10258
> URL: https://issues.apache.org/jira/browse/HDFS-10258
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Li Bo
>Assignee: Li Bo
>
> Currently EC has not supported small clusters whose datanode number is 
> smaller than the block numbers in a block group. This sub task will solve 
> this problem.






[jira] [Resolved] (HDFS-8796) Erasure coding: merge HDFS-8499 to EC branch and refactor BlockInfoStriped

2017-01-10 Thread Andrew Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8796?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Wang resolved HDFS-8796.
---
Resolution: Invalid

I think this JIRA is invalid now given that the EC branch has been merged to 
trunk, resolving.

> Erasure coding: merge HDFS-8499 to EC branch and refactor BlockInfoStriped
> --
>
> Key: HDFS-8796
> URL: https://issues.apache.org/jira/browse/HDFS-8796
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: namenode
>Affects Versions: HDFS-7285
>Reporter: Zhe Zhang
>Assignee: Zhe Zhang
> Attachments: HDFS-8796-HDFS-7285.00.patch, 
> HDFS-8796-HDFS-7285.01-part1.patch, HDFS-8796-HDFS-7285.01-part2.patch
>
>
> Separating this change from the HDFS-8728 discussion. Per suggestion from 
> [~szetszwo], clarifying the description of the change.






[jira] [Resolved] (HDFS-7674) [umbrella] Adding metrics for Erasure Coding

2017-01-10 Thread Andrew Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7674?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Wang resolved HDFS-7674.
---
Resolution: Done

I think we can close this since the subtasks are resolved. Thanks everyone for 
the hard work!

> [umbrella] Adding metrics for Erasure Coding
> 
>
> Key: HDFS-7674
> URL: https://issues.apache.org/jira/browse/HDFS-7674
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Kai Zheng
>Assignee: Li Bo
>
> As the design (in HDFS-7285) indicates, erasure coding involves non-trivial 
> impact and workload for NameNode, DataNode and client; it also allows 
> configurable and pluggable erasure codec and schema with flexible tradeoff 
> options (see HDFS-7337). To support necessary analysis and adjustment, we'd 
> better have various meaningful metrics for the EC support, like 
> encoding/decoding tasks, recovered blocks, read/transferred data size, 
> computation time and etc.






[jira] [Resolved] (HDFS-6555) Specify file encryption attributes at create time

2017-01-06 Thread Andrew Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-6555?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Wang resolved HDFS-6555.
---
Resolution: Not A Problem

I think we can resolve this since crypto info is passed around as an xattr. 
With /.reserved/raw, we support distcp without decryption.

> Specify file encryption attributes at create time
> -
>
> Key: HDFS-6555
> URL: https://issues.apache.org/jira/browse/HDFS-6555
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: namenode, security
>Affects Versions: 3.0.0-alpha1
>Reporter: Charles Lamb
>Assignee: Charles Lamb
>
> We need to create a Crypto Blob for passing around crypto info. 






[jira] [Resolved] (HDFS-6891) Follow-on work for transparent data at rest encryption

2017-01-06 Thread Andrew Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-6891?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Wang resolved HDFS-6891.
---
Resolution: Fixed

I moved the remaining subtasks out to separate issues, we can resolve this 
umbrella.

> Follow-on work for transparent data at rest encryption
> --
>
> Key: HDFS-6891
> URL: https://issues.apache.org/jira/browse/HDFS-6891
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: encryption
>Affects Versions: 2.5.0
>Reporter: Andrew Wang
>Assignee: Charles Lamb
>
> This is an umbrella JIRA to track remaining subtasks from HDFS-6134.






[jira] [Resolved] (HDFS-11297) hadoop-7285-power

2017-01-06 Thread Andrew Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-11297?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Wang resolved HDFS-11297.

Resolution: Invalid

> hadoop-7285-power
> -
>
> Key: HDFS-11297
> URL: https://issues.apache.org/jira/browse/HDFS-11297
> Project: Hadoop HDFS
>  Issue Type: Task
>  Components: erasure-coding
>Affects Versions: HDFS-7285
> Environment: power
>Reporter: xlsong
> Fix For: HDFS-7285
>
> Attachments: instruction.doc
>
>
> hadoop-7285-power






[jira] [Resolved] (HDFS-9896) WebHDFS API may return invalid JSON

2017-01-05 Thread Andrew Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9896?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Wang resolved HDFS-9896.
---
Resolution: Duplicate

> WebHDFS API may return invalid JSON
> ---
>
> Key: HDFS-9896
> URL: https://issues.apache.org/jira/browse/HDFS-9896
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: webhdfs
>Affects Versions: 2.6.0, 2.6.5
> Environment: FreeBSD 10.2
>Reporter: Alexander Shorin
>
> {code}
> >>> import requests
> >>> resp = 
> >>> requests.get('http://server:5/webhdfs/v1/tmp/test/\x00/not_found.txt?op=GETFILESTATUS')
> >>> resp.content
> '{"RemoteException":{"exception":"FileNotFoundException","javaClassName":"java.io.FileNotFoundException","message":"File
>  does not exist: /tmp/test/\x00/not_found.txt"}}'
> >>> resp.json()
> Traceback (most recent call last):
>   File "", line 1, in 
>   File 
> "/home/sandbox/project/venv/lib/python2.7/site-packages/requests/models.py", 
> line 800, in json
> self.content.decode(encoding), **kwargs
>   File "/usr/local/lib/python2.7/json/__init__.py", line 338, in loads
> return _default_decoder.decode(s)
>   File "/usr/local/lib/python2.7/json/decoder.py", line 366, in decode
> obj, end = self.raw_decode(s, idx=_w(s, 0).end())
>   File "/usr/local/lib/python2.7/json/decoder.py", line 382, in raw_decode
> obj, end = self.scan_once(s, idx)
> ValueError: Invalid control character at: line 1 column 147 (char 146)
> {code}
> The null byte {{\x00}} should be encoded according to JSON rules as 
> {{\u0000}}. It seems like WebHDFS returns the path back as-is without any 
> processing, breaking the content type.






[jira] [Reopened] (HDFS-9896) WebHDFS API may return invalid JSON

2017-01-05 Thread Andrew Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9896?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Wang reopened HDFS-9896:
---

> WebHDFS API may return invalid JSON
> ---
>
> Key: HDFS-9896
> URL: https://issues.apache.org/jira/browse/HDFS-9896
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: webhdfs
>Affects Versions: 2.6.0, 2.6.5
> Environment: FreeBSD 10.2
>Reporter: Alexander Shorin
>
> {code}
> >>> import requests
> >>> resp = 
> >>> requests.get('http://server:5/webhdfs/v1/tmp/test/\x00/not_found.txt?op=GETFILESTATUS')
> >>> resp.content
> '{"RemoteException":{"exception":"FileNotFoundException","javaClassName":"java.io.FileNotFoundException","message":"File
>  does not exist: /tmp/test/\x00/not_found.txt"}}'
> >>> resp.json()
> Traceback (most recent call last):
>   File "", line 1, in 
>   File 
> "/home/sandbox/project/venv/lib/python2.7/site-packages/requests/models.py", 
> line 800, in json
> self.content.decode(encoding), **kwargs
>   File "/usr/local/lib/python2.7/json/__init__.py", line 338, in loads
> return _default_decoder.decode(s)
>   File "/usr/local/lib/python2.7/json/decoder.py", line 366, in decode
> obj, end = self.raw_decode(s, idx=_w(s, 0).end())
>   File "/usr/local/lib/python2.7/json/decoder.py", line 382, in raw_decode
> obj, end = self.scan_once(s, idx)
> ValueError: Invalid control character at: line 1 column 147 (char 146)
> {code}
> The null byte {{\x00}} should be encoded according to JSON rules as 
> {{\u0000}}. It seems like WebHDFS returns the path back as-is without any 
> processing, breaking the content type.






[jira] [Reopened] (HDFS-11156) Add new op GETFILEBLOCKLOCATIONS to WebHDFS REST API

2017-01-04 Thread Andrew Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-11156?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Wang reopened HDFS-11156:


Reopening for precommit run.

> Add new op GETFILEBLOCKLOCATIONS to WebHDFS REST API
> 
>
> Key: HDFS-11156
> URL: https://issues.apache.org/jira/browse/HDFS-11156
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: webhdfs
>Affects Versions: 2.7.3
>Reporter: Weiwei Yang
>Assignee: Weiwei Yang
> Fix For: 3.0.0-alpha2
>
> Attachments: BlockLocationProperties_JSON_Schema.jpg, 
> BlockLocations_JSON_Schema.jpg, FileStatuses_JSON_Schema.jpg, 
> HDFS-11156-branch-2.01.patch, HDFS-11156.01.patch, HDFS-11156.02.patch, 
> HDFS-11156.03.patch, HDFS-11156.04.patch, HDFS-11156.05.patch, 
> HDFS-11156.06.patch, HDFS-11156.07.patch, HDFS-11156.08.patch, 
> HDFS-11156.09.patch, HDFS-11156.10.patch, HDFS-11156.11.patch, 
> HDFS-11156.12.patch, HDFS-11156.13.patch, HDFS-11156.14.patch, 
> HDFS-11156.15.patch, HDFS-11156.16.patch, Output_JSON_format_v10.jpg, 
> SampleResponse_JSON.jpg
>
>
> Following webhdfs REST API
> {code}
> http://&lt;namenode&gt;:&lt;port&gt;/webhdfs/v1/&lt;path&gt;?op=GET_BLOCK_LOCATIONS&offset=0&length=1
> {code}
> will get a response like
> {code}
> {
>   "LocatedBlocks" : {
> "fileLength" : 1073741824,
> "isLastBlockComplete" : true,
> "isUnderConstruction" : false,
> "lastLocatedBlock" : { ... },
> "locatedBlocks" : [ {...} ]
>   }
> }
> {code}
> This represents for *o.a.h.h.p.LocatedBlocks*. However according to 
> *FileSystem* API, 
> {code}
> public BlockLocation[] getFileBlockLocations(Path p, long start, long len)
> {code}
> clients would expect an array of BlockLocation. This mismatch should be 
> fixed. Marked as Incompatible change as this will change the output of the 
> GET_BLOCK_LOCATIONS API.





