[jira] [Commented] (HDFS-10885) [SPS]: Mover tool should not be allowed to run when Storage Policy Satisfier is on

2016-10-21 Thread Uma Maheswara Rao G (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10885?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15595897#comment-15595897
 ] 

Uma Maheswara Rao G commented on HDFS-10885:


{quote}
I do think it's a very good way, but what should we do when a user disables the 
XAttr feature through dfs.namenode.xattrs.enabled? Is there any potential issue 
if we require the user to enable the feature?
{quote}
I think the inode-existence check can drive the main decision; the XAttr would 
just indicate who is running, for logging purposes. That is what I was trying 
to say. In any case, HSM already depends on XAttrs for storing policy IDs. 
Any suggestions, [~rakeshr]?

> [SPS]: Mover tool should not be allowed to run when Storage Policy Satisfier 
> is on
> --
>
> Key: HDFS-10885
> URL: https://issues.apache.org/jira/browse/HDFS-10885
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: datanode, namenode
>Affects Versions: HDFS-10285
>Reporter: Wei Zhou
>Assignee: Wei Zhou
> Fix For: HDFS-10285
>
> Attachments: HDFS-10800-HDFS-10885-00.patch, 
> HDFS-10800-HDFS-10885-01.patch, HDFS-10800-HDFS-10885-02.patch, 
> HDFS-10885-HDFS-10285.03.patch, HDFS-10885-HDFS-10285.04.patch
>
>
> These two cannot run at the same time; they must be mutually exclusive to 
> avoid conflicting and fighting with each other.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-10885) [SPS]: Mover tool should not be allowed to run when Storage Policy Satisfier is on

2016-10-19 Thread Uma Maheswara Rao G (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10885?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15590227#comment-15590227
 ] 

Uma Maheswara Rao G commented on HDFS-10885:


Hi [~zhouwei], thank you for working on this task.

I think depending on a config option does not really work here, because the 
Mover can run from any process whose config items differ from the Namenode's. 
So the Mover may have this item disabled in its configs while it is enabled 
and running at the NN.
This is a little tricky to handle, but here is the idea that strikes me for 
now.

How about we use the mover ID file to communicate this? Right now the Mover 
already depends on that file: if the file exists, it will not allow another 
Mover to run. So we may need to treat it as a reserved path in the NN and use 
that file's inode existence. While SPS is running, it can set an XAttr on that 
file to indicate that SPS is active, so that when the file already exists and 
the XAttr says SPS, the Mover can log that information to let the user know. 
This is just an initial thought. Other suggestions are most welcome.
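To make the idea concrete, here is a minimal, self-contained sketch of the proposed decision logic. It only simulates NN state with a plain map; the reserved path and XAttr name (`/system/mover.id`, `user.sps.running`) are hypothetical placeholders, not the actual SPS implementation:

```java
import java.util.HashMap;
import java.util.Map;

/**
 * Sketch of the proposed Mover/SPS coordination. The NN namespace is
 * simulated with a map; the reserved path and XAttr name below are
 * assumptions for illustration only.
 */
public class MoverStartCheck {
  static final String MOVER_ID_PATH = "/system/mover.id"; // assumed reserved path
  static final String SPS_XATTR = "user.sps.running";     // assumed XAttr name

  // path -> xattrs; a key being present means the inode exists.
  static final Map<String, Map<String, String>> namespace = new HashMap<>();

  /** Inode existence drives the decision; the XAttr only says who holds it. */
  static String canStartMover() {
    Map<String, String> xattrs = namespace.get(MOVER_ID_PATH);
    if (xattrs == null) {
      return "START";                  // no mover-id inode: safe to start
    }
    if ("true".equals(xattrs.get(SPS_XATTR))) {
      return "BLOCKED_BY_SPS";         // log that SPS is running
    }
    return "BLOCKED_BY_OTHER_MOVER";   // another Mover holds the path
  }

  public static void main(String[] args) {
    System.out.println(canStartMover());                 // START
    namespace.put(MOVER_ID_PATH, new HashMap<>());
    System.out.println(canStartMover());                 // BLOCKED_BY_OTHER_MOVER
    namespace.get(MOVER_ID_PATH).put(SPS_XATTR, "true");
    System.out.println(canStartMover());                 // BLOCKED_BY_SPS
  }
}
```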



> [SPS]: Mover tool should not be allowed to run when Storage Policy Satisfier 
> is on
> --
>
> Key: HDFS-10885
> URL: https://issues.apache.org/jira/browse/HDFS-10885
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: datanode, namenode
>Affects Versions: HDFS-10285
>Reporter: Wei Zhou
>Assignee: Wei Zhou
> Fix For: HDFS-10285
>
> Attachments: HDFS-10800-HDFS-10885-00.patch, 
> HDFS-10800-HDFS-10885-01.patch, HDFS-10800-HDFS-10885-02.patch, 
> HDFS-10885-HDFS-10285.03.patch, HDFS-10885-HDFS-10285.04.patch
>
>
> These two cannot run at the same time; they must be mutually exclusive to 
> avoid conflicting and fighting with each other.






[jira] [Updated] (HDFS-11029) [SPS]:Provide retry mechanism for the blocks which were failed while moving its storage at DNs

2016-10-18 Thread Uma Maheswara Rao G (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-11029?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Uma Maheswara Rao G updated HDFS-11029:
---
Summary: [SPS]:Provide retry mechanism for the blocks which were failed 
while moving its storage at DNs  (was: Provide retry mechanism for the blocks 
which were failed while moving its storage at DNs)

> [SPS]:Provide retry mechanism for the blocks which were failed while moving 
> its storage at DNs
> --
>
> Key: HDFS-11029
> URL: https://issues.apache.org/jira/browse/HDFS-11029
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: namenode
>Affects Versions: HDFS-10285
>Reporter: Uma Maheswara Rao G
>Assignee: Uma Maheswara Rao G
>
> When the DN co-ordinator finds that some of the blocks associated with a 
> trackID could not be moved to their target storages due to errors, a retry 
> may work in some cases; for example, if the target node had no space, a 
> retry that finds another target can succeed. 
> So, based on the movement result flag (SUCCESS/FAILURE) from the DN 
> co-ordinator, the NN would retry by scanning the blocks again.






[jira] [Created] (HDFS-11029) Provide retry mechanism for the blocks which were failed while moving its storage at DNs

2016-10-18 Thread Uma Maheswara Rao G (JIRA)
Uma Maheswara Rao G created HDFS-11029:
--

 Summary: Provide retry mechanism for the blocks which were failed 
while moving its storage at DNs
 Key: HDFS-11029
 URL: https://issues.apache.org/jira/browse/HDFS-11029
 Project: Hadoop HDFS
  Issue Type: Sub-task
  Components: namenode
Affects Versions: HDFS-10285
Reporter: Uma Maheswara Rao G
Assignee: Uma Maheswara Rao G


When the DN co-ordinator finds that some of the blocks associated with a 
trackID could not be moved to their target storages due to errors, a retry may 
work in some cases; for example, if the target node had no space, a retry that 
finds another target can succeed. 
So, based on the movement result flag (SUCCESS/FAILURE) from the DN 
co-ordinator, the NN would retry by scanning the blocks again.
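The retry decision described here can be sketched in a few lines. This is a self-contained simulation (the queue, method, and enum names are illustrative, not the branch's actual classes):

```java
import java.util.ArrayDeque;
import java.util.Queue;

/**
 * Sketch of the NN-side retry decision: when a DN co-ordinator reports
 * FAILURE for a trackID, the NN re-queues it so the blocks are scanned
 * (and new targets chosen) again. All names are illustrative.
 */
public class MovementRetry {
  enum Result { SUCCESS, FAILURE }

  // trackIDs waiting to be (re)scanned by the satisfier thread.
  static final Queue<Long> pendingTrackIds = new ArrayDeque<>();

  static void handleResult(long trackId, Result result) {
    if (result == Result.FAILURE) {
      pendingTrackIds.add(trackId); // retry, e.g. to pick a target with space
    }
  }

  public static void main(String[] args) {
    handleResult(1L, Result.SUCCESS); // done, nothing to do
    handleResult(2L, Result.FAILURE); // will be scanned again
    System.out.println(pendingTrackIds); // prints [2]
  }
}
```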






[jira] [Commented] (HDFS-10884) [SPS]: Add block movement tracker to track the completion of block movement future tasks at DN

2016-10-18 Thread Uma Maheswara Rao G (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10884?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15584888#comment-15584888
 ] 

Uma Maheswara Rao G commented on HDFS-10884:


[~rakeshr] Thanks for the patch. The overall idea looks good. A quick question 
on the patch:

{code}
void handle(BlockMovementResult result) {
+  completedBlocks.add(result);
+  // TODO: notify namenode about the success/failures.
+}
+
{code}
Are you planning to notify the NN for each and every block, or once per 
trackID with a combined result for all of its blocks?

I will continue the review and post my feedback by tomorrow. Thanks.

> [SPS]: Add block movement tracker to track the completion of block movement 
> future tasks at DN
> --
>
> Key: HDFS-10884
> URL: https://issues.apache.org/jira/browse/HDFS-10884
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: datanode
>Affects Versions: HDFS-10285
>Reporter: Rakesh R
>Assignee: Rakesh R
> Attachments: HDFS-10884-HDFS-10285-00.patch, 
> HDFS-10884-HDFS-10285-01.patch, HDFS-10884-HDFS-10285-02.patch, 
> HDFS-10884-HDFS-10285-03.patch, HDFS-10884-HDFS-10285-04.patch
>
>
> Presently the 
> [StoragePolicySatisfyWorker#processBlockMovingTasks()|https://github.com/apache/hadoop/blob/HDFS-10285/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/StoragePolicySatisfyWorker.java#L147]
>  function acts as a blocking call. The idea of this JIRA is to implement a 
> mechanism to track these movements asynchronously, allowing new movements to 
> proceed while previous ones are still being processed.
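One common way to get this kind of non-blocking tracking in Java is a pool plus a `CompletionService`, which is sketched below. This is only a self-contained illustration of the pattern, not the branch's actual worker code; the String "result" and method names are assumptions:

```java
import java.util.concurrent.CompletionService;
import java.util.concurrent.ExecutorCompletionService;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

/**
 * Sketch: block-move tasks are submitted to a pool and their futures are
 * drained from a CompletionService, so new movement commands can be
 * accepted while earlier ones are still running.
 */
public class BlockMoveTracker {
  static final ExecutorService movers = Executors.newFixedThreadPool(2);
  static final CompletionService<String> completion =
      new ExecutorCompletionService<>(movers);

  /** Submit one block movement; returns immediately (non-blocking). */
  static void submitMove(String block) {
    completion.submit(() -> block + ":MOVED"); // stand-in for the real copy
  }

  /** Take the next completed result (a tracker thread would loop on this). */
  static String awaitOne() {
    try {
      return completion.take().get();
    } catch (Exception e) {
      throw new RuntimeException(e);
    }
  }

  public static void main(String[] args) {
    submitMove("blk_1");
    submitMove("blk_2");
    System.out.println(awaitOne()); // completion order is not deterministic
    System.out.println(awaitOne());
    movers.shutdown();
  }
}
```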






[jira] [Commented] (HDFS-10801) [SPS]: Protocol buffer changes for sending storage movement commands from NN to DN

2016-10-07 Thread Uma Maheswara Rao G (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10801?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15556385#comment-15556385
 ] 

Uma Maheswara Rao G commented on HDFS-10801:


Latest patch looks good to me. +1

> [SPS]: Protocol buffer changes for sending storage movement commands from NN 
> to DN 
> ---
>
> Key: HDFS-10801
> URL: https://issues.apache.org/jira/browse/HDFS-10801
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: datanode, namenode
>Affects Versions: HDFS-10285
>Reporter: Uma Maheswara Rao G
>Assignee: Rakesh R
> Attachments: HDFS-10801-HDFS-10285-00.patch, 
> HDFS-10801-HDFS-10285-01.patch
>
>
> This JIRA is for tracking the work of protocol buffer changes for sending the 
> storage movement commands from NN to DN






[jira] [Commented] (HDFS-10801) [SPS]: Protocol buffer changes for sending storage movement commands from NN to DN

2016-10-06 Thread Uma Maheswara Rao G (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10801?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15553458#comment-15553458
 ] 

Uma Maheswara Rao G commented on HDFS-10801:


Thank you [~rakeshr] for working on this.
Overall the patch looks good to me. I have a few minor comments, though.

* Code piece 
{code}
+message BlockStorageMovementProto {
+  required BlockProto block = 1;
+  required DatanodeInfosProto sourceDnInfos = 2;
+  required DatanodeInfosProto targetDnInfos = 3;
+  required StorageTypesProto sourceStorageTypes = 4;
+  required StorageTypesProto targetStorageTypes = 5;
+}
{code}
Can you please make the variable ordering consistent? BlockMovingInfo's 
toString also uses a different ordering; it would be good to follow one 
consistent order.

* BlockStorageMovementCommand: I would like to see javadoc for the variables 
and methods here. Explaining a bit about trackID would make it clearer.

* processBlockMovingTasks: I would like to see a comment here drafting the 
overall processing idea. I am fine if you want to add it in another place, 
with a reference link here.


> [SPS]: Protocol buffer changes for sending storage movement commands from NN 
> to DN 
> ---
>
> Key: HDFS-10801
> URL: https://issues.apache.org/jira/browse/HDFS-10801
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: datanode, namenode
>Affects Versions: HDFS-10285
>Reporter: Uma Maheswara Rao G
>Assignee: Rakesh R
> Attachments: HDFS-10801-HDFS-10285-00.patch
>
>
> This JIRA is for tracking the work of protocol buffer changes for sending the 
> storage movement commands from NN to DN






[jira] [Commented] (HDFS-10285) Storage Policy Satisfier in Namenode

2016-09-23 Thread Uma Maheswara Rao G (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10285?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15517549#comment-15517549
 ] 

Uma Maheswara Rao G commented on HDFS-10285:


[~ehiggs] Thank you for taking a look, and for the questions.
satisfyStoragePolicy does not care about the block layout (EC or replication); 
erasure-coded files still support storage policies. So, if you call 
satisfyStoragePolicy on a directory, we consider all immediate files under 
that directory, and a daemon thread analyzes whether each file's actual block 
storage placement at the DNs really matches its namespace storage policy. If 
there is a mismatch, the storage needs to move, and we issue a command to the 
DNs to move the block storage. Hope this answers your question. 
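As a rough illustration of the mismatch check described above: the real analysis compares a file's storage policy against the replicas' reported storage types, which can be reduced to a multiset-containment test. The string types and matching rule below are simplified assumptions, not the branch's actual logic:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Sketch: does the actual block placement already satisfy the policy? */
public class PlacementCheck {

  /**
   * True when the replicas' storage types cover every type the policy
   * expects (counting duplicates), i.e. no movement is needed.
   */
  static boolean satisfies(List<String> expected, List<String> actual) {
    List<String> remaining = new ArrayList<>(actual);
    for (String type : expected) {
      if (!remaining.remove(type)) {
        return false; // at least one replica must be moved
      }
    }
    return true;
  }

  public static void main(String[] args) {
    // A COLD-like policy expects three ARCHIVE replicas; one is still on
    // DISK, so the satisfier would schedule a DISK -> ARCHIVE move.
    System.out.println(satisfies(
        Arrays.asList("ARCHIVE", "ARCHIVE", "ARCHIVE"),
        Arrays.asList("ARCHIVE", "ARCHIVE", "DISK"))); // prints false
  }
}
```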


> Storage Policy Satisfier in Namenode
> 
>
> Key: HDFS-10285
> URL: https://issues.apache.org/jira/browse/HDFS-10285
> Project: Hadoop HDFS
>  Issue Type: New Feature
>  Components: datanode, namenode
>Affects Versions: 2.7.2
>Reporter: Uma Maheswara Rao G
>Assignee: Uma Maheswara Rao G
> Attachments: Storage-Policy-Satisfier-in-HDFS-May10.pdf
>
>
> Heterogeneous storage in HDFS introduced the concept of storage policies. 
> These policies can be set on a directory/file to specify the user's 
> preference for where the physical blocks should be stored. When the user 
> sets the storage policy before writing data, the blocks can take advantage 
> of the policy preference and the physical blocks are stored accordingly. 
> If the user sets the storage policy after writing and completing the file, 
> the blocks would already have been written with the default storage policy 
> (namely DISK). The user then has to run the 'Mover tool' explicitly, 
> specifying all such file names as a list. In some distributed-system 
> scenarios (e.g. HBase) it would be difficult to collect all the files and 
> run the tool, as different nodes can write files separately and files can 
> have different paths.
> Another scenario: when the user renames a file from a directory with one 
> effective storage policy (inherited from the parent directory) to a 
> directory with a different effective policy, the inherited storage policy is 
> not copied from the source; the file takes its policy from the destination 
> parent. This rename operation is just a metadata change in the Namenode; the 
> physical blocks still remain placed according to the source storage policy.
> So, tracking all such business-logic-driven file names across distributed 
> nodes (e.g. region servers) and running the Mover tool is difficult for 
> admins. The proposal here is to provide an API in the Namenode itself to 
> trigger storage policy satisfaction. A daemon thread inside the Namenode 
> would track such calls and issue movement commands to the DNs. 
> Will post a detailed design document soon. 






[jira] [Updated] (HDFS-10800) [SPS]: Daemon thread in Namenode to find blocks placed in other storage than what the policy specifies

2016-09-23 Thread Uma Maheswara Rao G (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10800?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Uma Maheswara Rao G updated HDFS-10800:
---
   Resolution: Fixed
 Hadoop Flags: Reviewed
Fix Version/s: HDFS-10285
   Status: Resolved  (was: Patch Available)

I have just committed this to the branch.

> [SPS]: Daemon thread in Namenode to find blocks placed in other storage than 
> what the policy specifies
> --
>
> Key: HDFS-10800
> URL: https://issues.apache.org/jira/browse/HDFS-10800
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: namenode
>Affects Versions: HDFS-10285
>Reporter: Uma Maheswara Rao G
>Assignee: Uma Maheswara Rao G
> Fix For: HDFS-10285
>
> Attachments: HDFS-10800-HDFS-10285-00.patch, 
> HDFS-10800-HDFS-10285-01.patch, HDFS-10800-HDFS-10285-02.patch, 
> HDFS-10800-HDFS-10285-03.patch, HDFS-10800-HDFS-10285-04.patch, 
> HDFS-10800-HDFS-10285-05.patch
>
>
> This JIRA is for implementing a daemon thread called StoragePolicySatisfier 
> in the Namenode, which scans the requested files' blocks that are placed in 
> different storages in the DNs than the related policies specify. 
>  The idea is:
>   # When a user asks for some files/dirs to have their storage policy 
> satisfied, they are tracked in the NN, and the StoragePolicySatisfier thread 
> picks files one by one, then checks for blocks that may have been placed in 
> a different storage in the DNs than what the storage policy expects.
>   # After checking them all, it also constructs the data structures with 
> the required information to move a block from one storage to another.
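Step #2 above, building the "what to move where" records, can be sketched as follows. This is a deliberately simplified, self-contained illustration: a single expected storage type per file and String records stand in for the branch's real BlockMovingInfo structures:

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/**
 * Sketch of step #2: for one tracked file, build the "move this block from
 * storage X to storage Y" records that would later be sent to DNs.
 */
public class SatisfierSketch {

  /** One entry per block whose current storage differs from the expected one. */
  static List<String> computeMoves(Map<String, String> blockToActualType,
                                   String expectedType) {
    List<String> moves = new ArrayList<>();
    for (Map.Entry<String, String> e : blockToActualType.entrySet()) {
      if (!expectedType.equals(e.getValue())) {
        moves.add(e.getKey() + ": " + e.getValue() + " -> " + expectedType);
      }
    }
    return moves;
  }

  public static void main(String[] args) {
    Map<String, String> placement = new LinkedHashMap<>();
    placement.put("blk_1", "DISK");     // misplaced for an ARCHIVE policy
    placement.put("blk_2", "ARCHIVE");  // already satisfied
    System.out.println(computeMoves(placement, "ARCHIVE"));
    // prints [blk_1: DISK -> ARCHIVE]
  }
}
```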






[jira] [Commented] (HDFS-10800) [SPS]: Daemon thread in Namenode to find blocks placed in other storage than what the policy specifies

2016-09-23 Thread Uma Maheswara Rao G (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10800?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15517508#comment-15517508
 ] 

Uma Maheswara Rao G commented on HDFS-10800:


Thanks, Rakesh, for the reviews. I am going ahead and committing this patch to 
the branch; the test failures are unrelated to this change.

> [SPS]: Daemon thread in Namenode to find blocks placed in other storage than 
> what the policy specifies
> --
>
> Key: HDFS-10800
> URL: https://issues.apache.org/jira/browse/HDFS-10800
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: namenode
>Affects Versions: HDFS-10285
>Reporter: Uma Maheswara Rao G
>Assignee: Uma Maheswara Rao G
> Attachments: HDFS-10800-HDFS-10285-00.patch, 
> HDFS-10800-HDFS-10285-01.patch, HDFS-10800-HDFS-10285-02.patch, 
> HDFS-10800-HDFS-10285-03.patch, HDFS-10800-HDFS-10285-04.patch, 
> HDFS-10800-HDFS-10285-05.patch
>
>
> This JIRA is for implementing a daemon thread called StoragePolicySatisfier 
> in the Namenode, which scans the requested files' blocks that are placed in 
> different storages in the DNs than the related policies specify. 
>  The idea is:
>   # When a user asks for some files/dirs to have their storage policy 
> satisfied, they are tracked in the NN, and the StoragePolicySatisfier thread 
> picks files one by one, then checks for blocks that may have been placed in 
> a different storage in the DNs than what the storage policy expects.
>   # After checking them all, it also constructs the data structures with 
> the required information to move a block from one storage to another.






[jira] [Updated] (HDFS-10800) [SPS]: Daemon thread in Namenode to find blocks placed in other storage than what the policy specifies

2016-09-21 Thread Uma Maheswara Rao G (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10800?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Uma Maheswara Rao G updated HDFS-10800:
---
Attachment: HDFS-10800-HDFS-10285-05.patch

A minor update to the patch: I forgot to move blockMovingInfos inside the 
computeAndAssign* APIs. Please review this patch.

> [SPS]: Daemon thread in Namenode to find blocks placed in other storage than 
> what the policy specifies
> --
>
> Key: HDFS-10800
> URL: https://issues.apache.org/jira/browse/HDFS-10800
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: namenode
>Affects Versions: HDFS-10285
>Reporter: Uma Maheswara Rao G
>Assignee: Uma Maheswara Rao G
> Attachments: HDFS-10800-HDFS-10285-00.patch, 
> HDFS-10800-HDFS-10285-01.patch, HDFS-10800-HDFS-10285-02.patch, 
> HDFS-10800-HDFS-10285-03.patch, HDFS-10800-HDFS-10285-04.patch, 
> HDFS-10800-HDFS-10285-05.patch
>
>
> This JIRA is for implementing a daemon thread called StoragePolicySatisfier 
> in the Namenode, which scans the requested files' blocks that are placed in 
> different storages in the DNs than the related policies specify. 
>  The idea is:
>   # When a user asks for some files/dirs to have their storage policy 
> satisfied, they are tracked in the NN, and the StoragePolicySatisfier thread 
> picks files one by one, then checks for blocks that may have been placed in 
> a different storage in the DNs than what the storage policy expects.
>   # After checking them all, it also constructs the data structures with 
> the required information to move a block from one storage to another.






[jira] [Updated] (HDFS-10800) [SPS]: Storage Policy Satisfier daemon thread in Namenode to find the blocks which were placed in other storages than what NN is expecting.

2016-09-21 Thread Uma Maheswara Rao G (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10800?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Uma Maheswara Rao G updated HDFS-10800:
---
Attachment: HDFS-10800-HDFS-10285-04.patch

Thanks a lot, [~rakeshr], for the thoughtful suggestions. I have incorporated 
all of them in this patch. Please review.

> [SPS]: Storage Policy Satisfier daemon thread in Namenode to find the blocks 
> which were placed in other storages than what NN is expecting.
> ---
>
> Key: HDFS-10800
> URL: https://issues.apache.org/jira/browse/HDFS-10800
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: namenode
>Affects Versions: HDFS-10285
>Reporter: Uma Maheswara Rao G
>Assignee: Uma Maheswara Rao G
> Attachments: HDFS-10800-HDFS-10285-00.patch, 
> HDFS-10800-HDFS-10285-01.patch, HDFS-10800-HDFS-10285-02.patch, 
> HDFS-10800-HDFS-10285-03.patch, HDFS-10800-HDFS-10285-04.patch
>
>
> This JIRA is for implementing a daemon thread called StoragePolicySatisfier 
> in the Namenode, which should scan the requested files' blocks that were 
> placed in the wrong storages in the DNs. 
>  The idea is:
>   # When a user calls satisfyStoragePolicy on some files/dirs, they are 
> tracked in the NN, and the StoragePolicySatisfier thread picks files one by 
> one, then checks for blocks that may have been placed in a different storage 
> in the DNs than what the NN expects.
>   # After checking them all, it also constructs the data structures with 
> the required information to move a block from one storage to another.






[jira] [Updated] (HDFS-10800) [SPS]: Storage Policy Satisfier daemon thread in Namenode to find the blocks which were placed in other storages than what NN is expecting.

2016-09-20 Thread Uma Maheswara Rao G (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10800?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Uma Maheswara Rao G updated HDFS-10800:
---
Attachment: HDFS-10800-HDFS-10285-03.patch

Thank you, [~rakeshr], for the review.
I have updated the patch to fix all your comments. For #4, I have left it as 
is; we can refine it later after adding the dynamic start/stop option (a TODO 
about this is already in place). Thanks.

Please review!

> [SPS]: Storage Policy Satisfier daemon thread in Namenode to find the blocks 
> which were placed in other storages than what NN is expecting.
> ---
>
> Key: HDFS-10800
> URL: https://issues.apache.org/jira/browse/HDFS-10800
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: namenode
>Affects Versions: HDFS-10285
>Reporter: Uma Maheswara Rao G
>Assignee: Uma Maheswara Rao G
> Attachments: HDFS-10800-HDFS-10285-00.patch, 
> HDFS-10800-HDFS-10285-01.patch, HDFS-10800-HDFS-10285-02.patch, 
> HDFS-10800-HDFS-10285-03.patch
>
>
> This JIRA is for implementing a daemon thread called StoragePolicySatisfier 
> in the Namenode, which should scan the requested files' blocks that were 
> placed in the wrong storages in the DNs. 
>  The idea is:
>   # When a user calls satisfyStoragePolicy on some files/dirs, they are 
> tracked in the NN, and the StoragePolicySatisfier thread picks files one by 
> one, then checks for blocks that may have been placed in a different storage 
> in the DNs than what the NN expects.
>   # After checking them all, it also constructs the data structures with 
> the required information to move a block from one storage to another.






[jira] [Updated] (HDFS-10800) [SPS]: Storage Policy Satisfier daemon thread in Namenode to find the blocks which were placed in other storages than what NN is expecting.

2016-09-15 Thread Uma Maheswara Rao G (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10800?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Uma Maheswara Rao G updated HDFS-10800:
---
Attachment: HDFS-10800-HDFS-10285-02.patch

Attached a revised patch. I also took the chance to fix an intermittent test 
failure in HDFS-10794.
Please review the patch.

> [SPS]: Storage Policy Satisfier daemon thread in Namenode to find the blocks 
> which were placed in other storages than what NN is expecting.
> ---
>
> Key: HDFS-10800
> URL: https://issues.apache.org/jira/browse/HDFS-10800
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: namenode
>Affects Versions: HDFS-10285
>Reporter: Uma Maheswara Rao G
>Assignee: Uma Maheswara Rao G
> Attachments: HDFS-10800-HDFS-10285-00.patch, 
> HDFS-10800-HDFS-10285-01.patch, HDFS-10800-HDFS-10285-02.patch
>
>
> This JIRA is for implementing a daemon thread called StoragePolicySatisfier 
> in the Namenode, which should scan the requested files' blocks that were 
> placed in the wrong storages in the DNs. 
>  The idea is:
>   # When a user calls satisfyStoragePolicy on some files/dirs, they are 
> tracked in the NN, and the StoragePolicySatisfier thread picks files one by 
> one, then checks for blocks that may have been placed in a different storage 
> in the DNs than what the NN expects.
>   # After checking them all, it also constructs the data structures with 
> the required information to move a block from one storage to another.






[jira] [Comment Edited] (HDFS-10800) [SPS]: Storage Policy Satisfier daemon thread in Namenode to find the blocks which were placed in other storages than what NN is expecting.

2016-09-13 Thread Uma Maheswara Rao G (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10800?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15487684#comment-15487684
 ] 

Uma Maheswara Rao G edited comment on HDFS-10800 at 9/13/16 4:40 PM:
-

Since HDFS-10794 is committed, I will re-generate patch on latest code again.


was (Author: umamaheswararao):
Since HDFS-10800 is committed, I will re-generate patch on latest code again.

> [SPS]: Storage Policy Satisfier daemon thread in Namenode to find the blocks 
> which were placed in other storages than what NN is expecting.
> ---
>
> Key: HDFS-10800
> URL: https://issues.apache.org/jira/browse/HDFS-10800
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: namenode
>Affects Versions: HDFS-10285
>Reporter: Uma Maheswara Rao G
>Assignee: Uma Maheswara Rao G
> Attachments: HDFS-10800-HDFS-10285-00.patch, 
> HDFS-10800-HDFS-10285-01.patch
>
>
> This JIRA is for implementing a daemon thread called StoragePolicySatisfier 
> in the Namenode, which should scan the requested files' blocks that were 
> placed in the wrong storages in the DNs. 
>  The idea is:
>   # When a user calls satisfyStoragePolicy on some files/dirs, they are 
> tracked in the NN, and the StoragePolicySatisfier thread picks files one by 
> one, then checks for blocks that may have been placed in a different storage 
> in the DNs than what the NN expects.
>   # After checking them all, it also constructs the data structures with 
> the required information to move a block from one storage to another.






[jira] [Comment Edited] (HDFS-10800) [SPS]: Storage Policy Satisfier daemon thread in Namenode to find the blocks which were placed in other storages than what NN is expecting.

2016-09-13 Thread Uma Maheswara Rao G (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10800?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15487684#comment-15487684
 ] 

Uma Maheswara Rao G edited comment on HDFS-10800 at 9/13/16 4:33 PM:
-

Since HDFS-10800 is committed, I will re-generate patch on latest code again.


was (Author: umamaheswararao):
Since HDFS-10800 is committed, I will generate patch based on that changes.

> [SPS]: Storage Policy Satisfier daemon thread in Namenode to find the blocks 
> which were placed in other storages than what NN is expecting.
> ---
>
> Key: HDFS-10800
> URL: https://issues.apache.org/jira/browse/HDFS-10800
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: namenode
>Affects Versions: HDFS-10285
>Reporter: Uma Maheswara Rao G
>Assignee: Uma Maheswara Rao G
> Attachments: HDFS-10800-HDFS-10285-00.patch, 
> HDFS-10800-HDFS-10285-01.patch
>
>
> This JIRA is for implementing a daemon thread called StoragePolicySatisfier 
> in the Namenode, which should scan the requested files' blocks that were 
> placed in the wrong storages in the DNs. 
>  The idea is:
>   # When a user calls satisfyStoragePolicy on some files/dirs, they are 
> tracked in the NN, and the StoragePolicySatisfier thread picks files one by 
> one, then checks for blocks that may have been placed in a different storage 
> in the DNs than what the NN expects.
>   # After checking them all, it also constructs the data structures with 
> the required information to move a block from one storage to another.






[jira] [Commented] (HDFS-10800) [SPS]: Storage Policy Satisfier daemon thread in Namenode to find the blocks which were placed in other storages than what NN is expecting.

2016-09-13 Thread Uma Maheswara Rao G (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10800?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15487684#comment-15487684
 ] 

Uma Maheswara Rao G commented on HDFS-10800:


Since HDFS-10800 is committed, I will generate patch based on that changes.

> [SPS]: Storage Policy Satisfier daemon thread in Namenode to find the blocks 
> which were placed in other storages than what NN is expecting.
> ---
>
> Key: HDFS-10800
> URL: https://issues.apache.org/jira/browse/HDFS-10800
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: namenode
>Affects Versions: HDFS-10285
>Reporter: Uma Maheswara Rao G
>Assignee: Uma Maheswara Rao G
> Attachments: HDFS-10800-HDFS-10285-00.patch, 
> HDFS-10800-HDFS-10285-01.patch
>
>
> This JIRA is for implementing a daemon thread called StoragePolicySatisfier 
> in the Namenode, which should scan the requested files' blocks that were 
> placed in the wrong storages in the DNs. 
>  The idea is:
>   # When a user calls satisfyStoragePolicy on some files/dirs, they are 
> tracked in the NN, and the StoragePolicySatisfier thread picks files one by 
> one, then checks for blocks that may have been placed in a different storage 
> in the DNs than what the NN expects.
>   # After checking them all, it also constructs the data structures with 
> the required information to move a block from one storage to another.






[jira] [Commented] (HDFS-10794) [SPS]: Provide storage policy satisfy worker at DN for co-ordinating the block storage movement work

2016-09-13 Thread Uma Maheswara Rao G (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10794?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15487680#comment-15487680
 ] 

Uma Maheswara Rao G commented on HDFS-10794:


Thanks a lot Kai for the reviews and commit!

> [SPS]: Provide storage policy satisfy worker at DN for co-ordinating the 
> block storage movement work
> 
>
> Key: HDFS-10794
> URL: https://issues.apache.org/jira/browse/HDFS-10794
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: datanode, namenode
>Affects Versions: HDFS-10285
>Reporter: Rakesh R
>Assignee: Rakesh R
> Fix For: HDFS-10285
>
> Attachments: HDFS-10794-00.patch, HDFS-10794-HDFS-10285.00.patch, 
> HDFS-10794-HDFS-10285.01.patch, HDFS-10794-HDFS-10285.02.patch, 
> HDFS-10794-HDFS-10285.03.patch
>
>
> The idea of this jira is to implement a mechanism to move blocks to the 
> given targets in order to satisfy the block storage policy. The datanode 
> receives {{blocktomove}} details via the heartbeat response from the NN. More 
> specifically, it's a datanode-side extension to handle the block storage 
> movement commands.
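The heartbeat-driven command flow described above can be sketched in plain Java. This is a minimal model under assumed names: `BlockMovementWorkerSketch`, `BlockToMove`, and `onHeartbeatResponse` are illustrative, not the actual HDFS-10794 classes.

```java
import java.util.ArrayDeque;
import java.util.List;
import java.util.Queue;

/**
 * Hedged sketch, not the actual HDFS-10794 code: models a datanode-side
 * worker that consumes "block to move" commands delivered in a heartbeat
 * response. All class and method names here are illustrative.
 */
public class BlockMovementWorkerSketch {
    /** One movement task: block id plus source and target storage types. */
    static class BlockToMove {
        final long blockId;
        final String source, target;
        BlockToMove(long blockId, String source, String target) {
            this.blockId = blockId; this.source = source; this.target = target;
        }
    }

    private final Queue<BlockToMove> pending = new ArrayDeque<>();

    /** Called when a heartbeat response carries movement commands. */
    void onHeartbeatResponse(List<BlockToMove> commands) {
        pending.addAll(commands);
    }

    public static void main(String[] args) {
        BlockMovementWorkerSketch worker = new BlockMovementWorkerSketch();
        worker.onHeartbeatResponse(List.of(
                new BlockToMove(1001L, "DISK", "ARCHIVE")));
        // A real worker would copy the replica and report status back to the
        // NN; here we just drain the queue to show the command flow.
        BlockToMove task = worker.pending.poll();
        System.out.println(task.blockId + ": " + task.source + " -> " + task.target);
    }
}
```

The key design point mirrored here is that the DN never decides what to move; it only executes commands the NN piggybacks on heartbeat responses.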






[jira] [Updated] (HDFS-10800) [SPS]: Storage Policy Satisfier daemon thread in Namenode to find the blocks which were placed in other storages than what NN is expecting.

2016-08-31 Thread Uma Maheswara Rao G (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10800?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Uma Maheswara Rao G updated HDFS-10800:
---
Attachment: HDFS-10800-HDFS-10285-01.patch

Tests passed in my local Eclipse. Since my local Eclipse does not have Java 
assertions enabled, this was not caught there. I have updated the patch to 
correct it now.

> [SPS]: Storage Policy Satisfier daemon thread in Namenode to find the blocks 
> which were placed in other storages than what NN is expecting.
> ---
>
> Key: HDFS-10800
> URL: https://issues.apache.org/jira/browse/HDFS-10800
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: namenode
>Affects Versions: HDFS-10285
>Reporter: Uma Maheswara Rao G
>Assignee: Uma Maheswara Rao G
> Attachments: HDFS-10800-HDFS-10285-00.patch, 
> HDFS-10800-HDFS-10285-01.patch
>






[jira] [Updated] (HDFS-10800) [SPS]: Storage Policy Satisfier daemon thread in Namenode to find the blocks which were placed in other storages than what NN is expecting.

2016-08-31 Thread Uma Maheswara Rao G (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10800?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Uma Maheswara Rao G updated HDFS-10800:
---
Status: Patch Available  (was: Open)

> [SPS]: Storage Policy Satisfier daemon thread in Namenode to find the blocks 
> which were placed in other storages than what NN is expecting.
> ---
>
> Key: HDFS-10800
> URL: https://issues.apache.org/jira/browse/HDFS-10800
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: namenode
>Affects Versions: HDFS-10285
>Reporter: Uma Maheswara Rao G
>Assignee: Uma Maheswara Rao G
> Attachments: HDFS-10800-HDFS-10285-00.patch
>






[jira] [Updated] (HDFS-10800) [SPS]: Storage Policy Satisfier daemon thread in Namenode to find the blocks which were placed in other storages than what NN is expecting.

2016-08-30 Thread Uma Maheswara Rao G (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10800?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Uma Maheswara Rao G updated HDFS-10800:
---
Summary: [SPS]: Storage Policy Satisfier daemon thread in Namenode to find 
the blocks which were placed in other storages than what NN is expecting.  
(was: [SPS]: Storage Policy Satisfier daemon thread in Namenode to find the 
blocks which were placed in wrong storages than what NN is expecting.)

> [SPS]: Storage Policy Satisfier daemon thread in Namenode to find the blocks 
> which were placed in other storages than what NN is expecting.
> ---
>
> Key: HDFS-10800
> URL: https://issues.apache.org/jira/browse/HDFS-10800
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: namenode
>Affects Versions: HDFS-10285
>Reporter: Uma Maheswara Rao G
>Assignee: Uma Maheswara Rao G
> Attachments: HDFS-10800-HDFS-10285-00.patch
>






[jira] [Updated] (HDFS-10800) [SPS]: Storage Policy Satisfier daemon thread in Namenode to find the blocks which were placed in wrong storages than what NN is expecting.

2016-08-29 Thread Uma Maheswara Rao G (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10800?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Uma Maheswara Rao G updated HDFS-10800:
---
Affects Version/s: HDFS-10285

> [SPS]: Storage Policy Satisfier daemon thread in Namenode to find the blocks 
> which were placed in wrong storages than what NN is expecting.
> ---
>
> Key: HDFS-10800
> URL: https://issues.apache.org/jira/browse/HDFS-10800
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: namenode
>Affects Versions: HDFS-10285
>Reporter: Uma Maheswara Rao G
>Assignee: Uma Maheswara Rao G
> Attachments: HDFS-10800-HDFS-10285-00.patch
>






[jira] [Updated] (HDFS-10800) [SPS]: Storage Policy Satisfier daemon thread in Namenode to find the blocks which were placed in wrong storages than what NN is expecting.

2016-08-29 Thread Uma Maheswara Rao G (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10800?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Uma Maheswara Rao G updated HDFS-10800:
---
Attachment: HDFS-10800-HDFS-10285-00.patch

Attaching the first patch for this JIRA.
The StoragePolicySatisfier daemon thread starts along with BlockManager 
activation, takes block collections one by one, identifies the blocks that 
need storage movement, and builds the source-to-target DN mapping with the 
expected storage types.
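The scan-and-plan step described in this comment can be modeled roughly as follows. This is a hedged sketch under assumed names; the real StoragePolicySatisfier works against BlockManager state and DatanodeDescriptors, not plain lists.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

/**
 * Hedged sketch of the NN-side logic: for a tracked file, compare the
 * storage types its policy expects with where the replicas actually sit,
 * and build a (moveFrom -> moveTo) list. Illustrative names only.
 */
public class SatisfierScanSketch {
    static List<String[]> planMoves(List<String> expected, List<String> actual) {
        List<String> need = new ArrayList<>(expected);
        List<String> have = new ArrayList<>(actual);
        // Replicas already on a wanted storage type satisfy one slot each.
        for (Iterator<String> it = need.iterator(); it.hasNext(); ) {
            if (have.remove(it.next())) {
                it.remove();
            }
        }
        // Remaining mismatches pair up as (source storage, target storage).
        List<String[]> moves = new ArrayList<>();
        for (int i = 0; i < need.size() && i < have.size(); i++) {
            moves.add(new String[] { have.get(i), need.get(i) });
        }
        return moves;
    }

    public static void main(String[] args) {
        // COLD policy wants three ARCHIVE replicas; two are still on DISK.
        List<String[]> moves = planMoves(
                List.of("ARCHIVE", "ARCHIVE", "ARCHIVE"),
                List.of("DISK", "DISK", "ARCHIVE"));
        for (String[] m : moves) {
            System.out.println(m[0] + "->" + m[1]);
        }
    }
}
```

Satisfied replicas are subtracted first so only genuine mismatches become movement tasks, which matches the "identify the blocks that need storage movement" step above.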

> [SPS]: Storage Policy Satisfier daemon thread in Namenode to find the blocks 
> which were placed in wrong storages than what NN is expecting.
> ---
>
> Key: HDFS-10800
> URL: https://issues.apache.org/jira/browse/HDFS-10800
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: namenode
>Reporter: Uma Maheswara Rao G
>Assignee: Uma Maheswara Rao G
> Attachments: HDFS-10800-HDFS-10285-00.patch
>






[jira] [Issue Comment Deleted] (HDFS-10794) [SPS]: Provide storage policy satisfy worker at DN for co-ordinating the block storage movement work

2016-08-29 Thread Uma Maheswara Rao G (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10794?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Uma Maheswara Rao G updated HDFS-10794:
---
Comment: was deleted

(was: Attaching the first patch for this JIRA.
StoragePolicySatisfier Daemon thread starts along with BlockManager activation 
and take blockCollections one by one and identify the neededStorageMovement 
blocks and builds the source and target DN mapping with the expected storage 
types.)

> [SPS]: Provide storage policy satisfy worker at DN for co-ordinating the 
> block storage movement work
> 
>
> Key: HDFS-10794
> URL: https://issues.apache.org/jira/browse/HDFS-10794
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: datanode, namenode
>Affects Versions: HDFS-10285
>Reporter: Rakesh R
>Assignee: Rakesh R
> Attachments: HDFS-10794-00.patch, HDFS-10794-HDFS-10285.00.patch, 
> HDFS-10794-HDFS-10285.01.patch, HDFS-10794-HDFS-10285.02.patch
>






[jira] [Updated] (HDFS-10794) [SPS]: Provide storage policy satisfy worker at DN for co-ordinating the block storage movement work

2016-08-29 Thread Uma Maheswara Rao G (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10794?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Uma Maheswara Rao G updated HDFS-10794:
---
Attachment: (was: HDFS-10800-HDFS-10285-00.patch)

> [SPS]: Provide storage policy satisfy worker at DN for co-ordinating the 
> block storage movement work
> 
>
> Key: HDFS-10794
> URL: https://issues.apache.org/jira/browse/HDFS-10794
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: datanode, namenode
>Affects Versions: HDFS-10285
>Reporter: Rakesh R
>Assignee: Rakesh R
> Attachments: HDFS-10794-00.patch, HDFS-10794-HDFS-10285.00.patch, 
> HDFS-10794-HDFS-10285.01.patch, HDFS-10794-HDFS-10285.02.patch
>






[jira] [Updated] (HDFS-10794) [SPS]: Provide storage policy satisfy worker at DN for co-ordinating the block storage movement work

2016-08-29 Thread Uma Maheswara Rao G (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10794?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Uma Maheswara Rao G updated HDFS-10794:
---
Attachment: HDFS-10800-HDFS-10285-00.patch

Attaching the first patch for this JIRA.
StoragePolicySatisfier Daemon thread starts along with BlockManager activation 
and take blockCollections one by one and identify the neededStorageMovement 
blocks and builds the source and target DN mapping with the expected storage 
types.

> [SPS]: Provide storage policy satisfy worker at DN for co-ordinating the 
> block storage movement work
> 
>
> Key: HDFS-10794
> URL: https://issues.apache.org/jira/browse/HDFS-10794
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: datanode, namenode
>Affects Versions: HDFS-10285
>Reporter: Rakesh R
>Assignee: Rakesh R
> Attachments: HDFS-10794-00.patch, HDFS-10794-HDFS-10285.00.patch, 
> HDFS-10794-HDFS-10285.01.patch, HDFS-10794-HDFS-10285.02.patch, 
> HDFS-10800-HDFS-10285-00.patch
>






[jira] [Commented] (HDFS-10794) [SPS]: Provide storage policy satisfy worker at DN for co-ordinating the block storage movement work

2016-08-26 Thread Uma Maheswara Rao G (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10794?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15439719#comment-15439719
 ] 

Uma Maheswara Rao G commented on HDFS-10794:


Hi [~rakeshr], thank you for working on this. Jenkins will pick the patch up 
automatically based on its name. Could you please follow our patch naming 
convention?

Please check the section [naming your 
patch|http://wiki.apache.org/hadoop/HowToContribute#Naming_your_patch]

{quote}
Patches for a non-trunk branch should be named 
<jira-id>-<branch name>.<revision>.patch, e.g. HDFS-1234-branch-2.003.patch. 
The branch name suffix should be the exact name of a git branch, such as 
"branch-2". Jenkins will check the name of the patch and detect the appropriate 
branch for testing.
{quote}
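The quoted convention can be encoded as a regular expression. The pattern below is my reading of the wiki text, not Jenkins' actual matcher, so treat it as an approximation.

```java
import java.util.regex.Pattern;

/**
 * Sketch: checks a patch file name against the branch-patch naming
 * convention <jira-id>-<branch name>.<revision>.patch. This regex is an
 * approximation of the wiki rule, not the real Jenkins precommit logic.
 */
public class PatchNameCheck {
    // e.g. HDFS-1234-branch-2.003.patch or HDFS-10794-HDFS-10285.02.patch
    static final Pattern BRANCH_PATCH =
            Pattern.compile("[A-Z]+-\\d+-[A-Za-z0-9.-]+\\.\\d+\\.patch");

    public static void main(String[] args) {
        // Branch-targeted name: matches.
        System.out.println(
            BRANCH_PATCH.matcher("HDFS-10794-HDFS-10285.02.patch").matches());
        // Trunk-style name without a branch suffix: does not match.
        System.out.println(
            BRANCH_PATCH.matcher("HDFS-10794-00.patch").matches());
    }
}
```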

> [SPS]: Provide storage policy satisfy worker at DN for co-ordinating the 
> block storage movement work
> 
>
> Key: HDFS-10794
> URL: https://issues.apache.org/jira/browse/HDFS-10794
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: datanode, namenode
>Reporter: Rakesh R
>Assignee: Rakesh R
> Attachments: HDFS-10794-00.patch
>






[jira] [Updated] (HDFS-10794) [SPS]: Provide storage policy satisfy worker at DN for co-ordinating the block storage movement work

2016-08-25 Thread Uma Maheswara Rao G (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10794?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Uma Maheswara Rao G updated HDFS-10794:
---
Summary: [SPS]: Provide storage policy satisfy worker at DN for 
co-ordinating the block storage movement work  (was: Provide storage policy 
satisfy worker at DN for co-ordinating the block storage movement work)

> [SPS]: Provide storage policy satisfy worker at DN for co-ordinating the 
> block storage movement work
> 
>
> Key: HDFS-10794
> URL: https://issues.apache.org/jira/browse/HDFS-10794
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: datanode, namenode
>Reporter: Rakesh R
>Assignee: Rakesh R
> Attachments: HDFS-10794-00.patch
>






[jira] [Created] (HDFS-10802) [SPS]: Add satisfyStoragePolicy API in HdfsAdmin

2016-08-25 Thread Uma Maheswara Rao G (JIRA)
Uma Maheswara Rao G created HDFS-10802:
--

 Summary: [SPS]: Add satisfyStoragePolicy API in HdfsAdmin
 Key: HDFS-10802
 URL: https://issues.apache.org/jira/browse/HDFS-10802
 Project: Hadoop HDFS
  Issue Type: Sub-task
  Components: hdfs-client
Reporter: Uma Maheswara Rao G
Assignee: Uma Maheswara Rao G


This JIRA tracks the work of adding a user/admin API for calling 
satisfyStoragePolicy.
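The proposed call did not exist yet when this sub-task was filed, so the stub below only models the intended contract as I read it: the client hands the NN a path, and the NN records it for asynchronous satisfaction. The class and method shapes are assumptions, not the committed HdfsAdmin API.

```java
import java.util.ArrayList;
import java.util.List;

/**
 * Hedged sketch of the proposed user/admin API. This stands in for the
 * NN-side bookkeeping only: satisfyStoragePolicy is fire-and-forget per
 * path (an assumption; the real signature may differ).
 */
public class SatisfyPolicyApiSketch {
    // Stands in for the NN-side queue of paths awaiting satisfaction.
    private final List<String> tracked = new ArrayList<>();

    /** Proposed API shape: record the path; a daemon processes it later. */
    public void satisfyStoragePolicy(String path) {
        tracked.add(path); // a real implementation would persist this in the NN
    }

    public static void main(String[] args) {
        SatisfyPolicyApiSketch admin = new SatisfyPolicyApiSketch();
        admin.satisfyStoragePolicy("/hbase/data");
        System.out.println(admin.tracked);
    }
}
```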






[jira] [Created] (HDFS-10801) [SPS]: Protocol buffer changes for sending storage movement commands from NN to DN

2016-08-25 Thread Uma Maheswara Rao G (JIRA)
Uma Maheswara Rao G created HDFS-10801:
--

 Summary: [SPS]: Protocol buffer changes for sending storage 
movement commands from NN to DN 
 Key: HDFS-10801
 URL: https://issues.apache.org/jira/browse/HDFS-10801
 Project: Hadoop HDFS
  Issue Type: Sub-task
  Components: datanode, namenode
Reporter: Uma Maheswara Rao G
Assignee: Rakesh R


This JIRA tracks the protocol buffer changes for sending the storage movement 
commands from the NN to the DN.






[jira] [Created] (HDFS-10800) [SPS]: Storage Policy Satisfier daemon thread in Namenode to find the blocks which were placed in wrong storages than what NN is expecting.

2016-08-25 Thread Uma Maheswara Rao G (JIRA)
Uma Maheswara Rao G created HDFS-10800:
--

 Summary: [SPS]: Storage Policy Satisfier daemon thread in Namenode 
to find the blocks which were placed in wrong storages than what NN is 
expecting.
 Key: HDFS-10800
 URL: https://issues.apache.org/jira/browse/HDFS-10800
 Project: Hadoop HDFS
  Issue Type: Sub-task
  Components: namenode
Reporter: Uma Maheswara Rao G
Assignee: Uma Maheswara Rao G


This JIRA is for implementing a daemon thread called StoragePolicySatisfier in 
the namenode, which should scan the requested files' blocks for those placed in 
the wrong storages in DNs. 

 The idea is:
  # When a user calls satisfyStoragePolicy on some files/dirs, they should be 
tracked in the NN; the StoragePolicySatisfier thread will then pick the files 
one by one and check for blocks that may be placed on a different storage in 
the DN than what the NN expects.
  # After checking, it should also construct the data structures holding 
the information required to move a block from one storage to another.






[jira] [Commented] (HDFS-10285) Storage Policy Satisfier in Namenode

2016-08-17 Thread Uma Maheswara Rao G (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10285?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15425781#comment-15425781
 ] 

Uma Maheswara Rao G commented on HDFS-10285:


I have created a branch for this. Let's use HDFS-10285 branch for this work.

> Storage Policy Satisfier in Namenode
> 
>
> Key: HDFS-10285
> URL: https://issues.apache.org/jira/browse/HDFS-10285
> Project: Hadoop HDFS
>  Issue Type: New Feature
>  Components: datanode, namenode
>Affects Versions: 2.7.2
>Reporter: Uma Maheswara Rao G
>Assignee: Uma Maheswara Rao G
> Attachments: Storage-Policy-Satisfier-in-HDFS-May10.pdf
>
>
> Heterogeneous storage in HDFS introduced the concept of storage policies. These 
> policies can be set on a directory/file to specify the user's preference for 
> where the physical blocks should be stored. When the user sets the storage 
> policy before writing data, the blocks take advantage of the storage policy 
> preference and the physical blocks are stored accordingly. 
> If the user sets the storage policy after writing and completing the file, 
> the blocks will already have been written with the default storage policy 
> (nothing but DISK). The user then has to run the 'Mover tool' explicitly, 
> specifying all such file names as a list. In some distributed system scenarios 
> (ex: HBase) it would be difficult to collect all the files and run the tool, 
> as different nodes can write files separately and files can have different 
> paths.
> Another scenario is when the user renames a file from a directory with one 
> effective storage policy (inherited from the parent directory) into a 
> directory with a different effective storage policy: the inherited policy is 
> not copied from the source, so the policy of the destination file/dir's 
> parent takes effect. The rename operation is just a metadata change in the 
> Namenode; the physical blocks still remain placed per the source storage 
> policy.
> So, tracking all such business-logic-based file names from distributed nodes 
> (ex: region servers) and running the Mover tool could be difficult for admins. 
> The proposal here is to provide an API in the Namenode itself to trigger 
> storage policy satisfaction. A daemon thread inside the Namenode should track 
> such calls and send movement commands to the DNs. 
> Will post the detailed design thoughts document soon. 
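The rename scenario from the description can be illustrated with a tiny model: rename changes only NN metadata, so the file's effective (inherited) policy changes while its replicas stay put. Paths, policies, and the policy-lookup helper below are illustrative values, not HDFS internals.

```java
import java.util.Map;

/**
 * Sketch of the rename scenario: a file written under /hot (HOT policy,
 * replicas on DISK) is renamed into /cold (COLD policy). The effective
 * policy now comes from the new parent, but no block has moved.
 */
public class RenameScenarioSketch {
    public static void main(String[] args) {
        Map<String, String> dirPolicy = Map.of("/hot", "HOT", "/cold", "COLD");
        String replicaStorage = "DISK";          // where the block was written
        String pathAfterRename = "/cold/file1";  // metadata-only move in the NN

        // Effective policy is inherited from the destination parent...
        String parent = pathAfterRename.substring(0, pathAfterRename.lastIndexOf('/'));
        String effective = dirPolicy.get(parent);
        // ...but the physical replica has not moved, hence the mismatch
        // that the proposed satisfier daemon would detect and fix.
        System.out.println("policy=" + effective + " replicaOn=" + replicaStorage);
    }
}
```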






[jira] [Commented] (HDFS-10285) Storage Policy Satisfier in Namenode

2016-08-15 Thread Uma Maheswara Rao G (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10285?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15422168#comment-15422168
 ] 

Uma Maheswara Rao G commented on HDFS-10285:


Thank you Kai and Rakesh for the suggestion. Sure I will create a branch for 
this work.

> Storage Policy Satisfier in Namenode
> 
>
> Key: HDFS-10285
> URL: https://issues.apache.org/jira/browse/HDFS-10285
> Project: Hadoop HDFS
>  Issue Type: New Feature
>  Components: datanode, namenode
>Affects Versions: 2.7.2
>Reporter: Uma Maheswara Rao G
>Assignee: Uma Maheswara Rao G
> Attachments: Storage-Policy-Satisfier-in-HDFS-May10.pdf
>






[jira] [Commented] (HDFS-10285) Storage Policy Satisfier in Namenode

2016-08-09 Thread Uma Maheswara Rao G (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10285?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15413011#comment-15413011
 ] 

Uma Maheswara Rao G commented on HDFS-10285:


[~yuanbo] Thanks for taking a look.
For #1:
  We can provide options for both Java and the command line.

For #2: 
Actually, the idea is that we should not allow both to run at the same time. If 
a user/admin wants to run the tool, they should first switch off the Namenode 
StoragePolicySatisfier by issuing a dynamic config command to disable it. This 
was the idea; I will add that point to the doc in the next revision. In the 
same way, if this thread is running in the Namenode, the "Mover tool" should 
not be allowed to run.

Hope these points make sense to you.
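The mutual exclusion described in #2 amounts to a simple guard. The sketch below is illustrative only; the flag name and error message are assumptions, not the code eventually committed for HDFS-10885.

```java
/**
 * Hedged sketch of the exclusivity rule: refuse to start the Mover tool
 * while the in-Namenode satisfier is enabled, so the two never fight over
 * the same blocks. Names are illustrative, not the HDFS-10885 patch.
 */
public class MoverSpsExclusionSketch {
    // Stands in for a dynamically reconfigurable NN setting.
    private boolean spsEnabled = true;

    /** Returns an error instead of letting both block movers run at once. */
    String tryStartMover() {
        if (spsEnabled) {
            return "ERROR: disable StoragePolicySatisfier before running Mover";
        }
        return "Mover started";
    }

    public static void main(String[] args) {
        MoverSpsExclusionSketch nn = new MoverSpsExclusionSketch();
        System.out.println(nn.tryStartMover()); // refused while SPS is on
        nn.spsEnabled = false;                  // admin switched SPS off
        System.out.println(nn.tryStartMover()); // now allowed
    }
}
```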


> Storage Policy Satisfier in Namenode
> 
>
> Key: HDFS-10285
> URL: https://issues.apache.org/jira/browse/HDFS-10285
> Project: Hadoop HDFS
>  Issue Type: New Feature
>  Components: datanode, namenode
>Affects Versions: 2.7.2
>Reporter: Uma Maheswara Rao G
>Assignee: Uma Maheswara Rao G
> Attachments: Storage-Policy-Satisfier-in-HDFS-May10.pdf
>






[jira] [Updated] (HDFS-10565) Erasure Coding: Document about the current allowed storage policies for EC Striped mode files

2016-07-22 Thread Uma Maheswara Rao G (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10565?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Uma Maheswara Rao G updated HDFS-10565:
---
   Resolution: Fixed
Fix Version/s: 3.0.0-alpha1
   Status: Resolved  (was: Patch Available)

Thanks a lot Jing for review. I have just committed this.

> Erasure Coding: Document about the current allowed storage policies for EC 
> Striped mode files 
> --
>
> Key: HDFS-10565
> URL: https://issues.apache.org/jira/browse/HDFS-10565
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: documentation
>Affects Versions: 3.0.0-alpha1
>Reporter: Uma Maheswara Rao G
>Assignee: Uma Maheswara Rao G
> Fix For: 3.0.0-alpha1
>
> Attachments: HDFS-10565-00.patch
>
>
> HDFS-10473 implemented allowing only the ALL_SSD, HOT and COLD policies to 
> take effect while moving/placing blocks for striped EC files. This JIRA 
> tracks documenting that behavior.
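The rule being documented is essentially an allow-list check, which can be sketched as follows (illustrative code, not the HDFS-10473 implementation; only the three policy names come from the description above):

```java
import java.util.Set;

/**
 * Sketch of the documented rule: for striped EC files, only ALL_SSD, HOT
 * and COLD storage policies take effect for block placement/movement;
 * other policies are ignored. Method names are illustrative.
 */
public class EcPolicySketch {
    static final Set<String> ALLOWED_FOR_STRIPED = Set.of("ALL_SSD", "HOT", "COLD");

    /** Non-striped files honor any policy; striped files use the allow-list. */
    static boolean takesEffect(String policy, boolean striped) {
        return !striped || ALLOWED_FOR_STRIPED.contains(policy);
    }

    public static void main(String[] args) {
        System.out.println(takesEffect("ONE_SSD", true)); // ignored for EC files
        System.out.println(takesEffect("COLD", true));    // honored for EC files
    }
}
```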






[jira] [Updated] (HDFS-10565) Erasure Coding: Document about the current allowed storage policies for EC Striped mode files

2016-07-21 Thread Uma Maheswara Rao G (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10565?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Uma Maheswara Rao G updated HDFS-10565:
---
Attachment: HDFS-10565-00.patch

Attached a patch for it. Please review.

> Erasure Coding: Document about the current allowed storage policies for EC 
> Striped mode files 
> --
>
> Key: HDFS-10565
> URL: https://issues.apache.org/jira/browse/HDFS-10565
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: documentation
>Affects Versions: 3.0.0-alpha1
>Reporter: Uma Maheswara Rao G
>Assignee: Uma Maheswara Rao G
> Attachments: HDFS-10565-00.patch
>






[jira] [Updated] (HDFS-10565) Erasure Coding: Document about the current allowed storage policies for EC Striped mode files

2016-07-21 Thread Uma Maheswara Rao G (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10565?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Uma Maheswara Rao G updated HDFS-10565:
---
Status: Patch Available  (was: Open)

> Erasure Coding: Document about the current allowed storage policies for EC 
> Striped mode files 
> --
>
> Key: HDFS-10565
> URL: https://issues.apache.org/jira/browse/HDFS-10565
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: documentation
>Affects Versions: 3.0.0-alpha1
>Reporter: Uma Maheswara Rao G
>Assignee: Uma Maheswara Rao G
> Attachments: HDFS-10565-00.patch
>
>
> HDFS-10473 was implemented to allow only the ALL_SSD, HOT, and COLD storage 
> policies to take effect while moving/placing blocks for striped EC files. 
> This JIRA tracks the documentation of that behavior.






[jira] [Updated] (HDFS-10590) Fix TestReconstructStripedBlocks.testCountLiveReplicas test failures

2016-07-13 Thread Uma Maheswara Rao G (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10590?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Uma Maheswara Rao G updated HDFS-10590:
---
   Resolution: Fixed
 Hadoop Flags: Reviewed
Fix Version/s: 3.0.0-alpha1
   Status: Resolved  (was: Patch Available)

I have committed this to trunk. Thanks Rakesh.

> Fix TestReconstructStripedBlocks.testCountLiveReplicas test failures
> 
>
> Key: HDFS-10590
> URL: https://issues.apache.org/jira/browse/HDFS-10590
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Rakesh R
>Assignee: Rakesh R
> Fix For: 3.0.0-alpha1
>
> Attachments: HDFS-10590-00.patch
>
>
> This JIRA is to fix the test case failure; please see the stacktrace below.
> Reference: 
> [Build_15968|https://builds.apache.org/job/PreCommit-HDFS-Build/15968/testReport/junit/org.apache.hadoop.hdfs.server.namenode/TestReconstructStripedBlocks/testCountLiveReplicas/]
> {code}
> java.lang.AssertionError: null
>   at org.junit.Assert.fail(Assert.java:86)
>   at org.junit.Assert.assertTrue(Assert.java:41)
>   at org.junit.Assert.assertTrue(Assert.java:52)
>   at 
> org.apache.hadoop.hdfs.server.namenode.TestReconstructStripedBlocks.testCountLiveReplicas(TestReconstructStripedBlocks.java:324)
> {code}






[jira] [Commented] (HDFS-10590) Fix TestReconstructStripedBlocks.testCountLiveReplicas test failures

2016-07-07 Thread Uma Maheswara Rao G (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10590?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15366969#comment-15366969
 ] 

Uma Maheswara Rao G commented on HDFS-10590:


+1, Thanks Rakesh

> Fix TestReconstructStripedBlocks.testCountLiveReplicas test failures
> 
>
> Key: HDFS-10590
> URL: https://issues.apache.org/jira/browse/HDFS-10590
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Rakesh R
>Assignee: Rakesh R
> Attachments: HDFS-10590-00.patch
>
>
> This JIRA is to fix the test case failure; please see the stacktrace below.
> Reference: 
> [Build_15968|https://builds.apache.org/job/PreCommit-HDFS-Build/15968/testReport/junit/org.apache.hadoop.hdfs.server.namenode/TestReconstructStripedBlocks/testCountLiveReplicas/]
> {code}
> java.lang.AssertionError: null
>   at org.junit.Assert.fail(Assert.java:86)
>   at org.junit.Assert.assertTrue(Assert.java:41)
>   at org.junit.Assert.assertTrue(Assert.java:52)
>   at 
> org.apache.hadoop.hdfs.server.namenode.TestReconstructStripedBlocks.testCountLiveReplicas(TestReconstructStripedBlocks.java:324)
> {code}






[jira] [Updated] (HDFS-10592) Fix intermittent test failure of TestNameNodeResourceChecker#testCheckThatNameNodeResourceMonitorIsRunning

2016-07-07 Thread Uma Maheswara Rao G (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10592?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Uma Maheswara Rao G updated HDFS-10592:
---
  Resolution: Fixed
Hadoop Flags: Reviewed
  Status: Resolved  (was: Patch Available)

I have just committed this to trunk, branch-2, and branch-2.8. Thanks.

> Fix intermittent test failure of 
> TestNameNodeResourceChecker#testCheckThatNameNodeResourceMonitorIsRunning
> --
>
> Key: HDFS-10592
> URL: https://issues.apache.org/jira/browse/HDFS-10592
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: test
>Reporter: Rakesh R
>Assignee: Rakesh R
> Fix For: 2.8.0
>
> Attachments: HDFS-10592-00.patch, HDFS-10592-01.patch
>
>
> This JIRA is to fix the 
> {{TestNameNodeResourceChecker#testCheckThatNameNodeResourceMonitorIsRunning}} 
> test case failure.
> Reference 
> [Build_15973|https://builds.apache.org/job/PreCommit-HDFS-Build/15973/testReport/junit/org.apache.hadoop.hdfs.server.namenode/TestNameNodeResourceChecker/testCheckThatNameNodeResourceMonitorIsRunning/]






[jira] [Updated] (HDFS-10592) Fix intermittent test failure of TestNameNodeResourceChecker#testCheckThatNameNodeResourceMonitorIsRunning

2016-07-07 Thread Uma Maheswara Rao G (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10592?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Uma Maheswara Rao G updated HDFS-10592:
---
Component/s: test

> Fix intermittent test failure of 
> TestNameNodeResourceChecker#testCheckThatNameNodeResourceMonitorIsRunning
> --
>
> Key: HDFS-10592
> URL: https://issues.apache.org/jira/browse/HDFS-10592
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: test
>Reporter: Rakesh R
>Assignee: Rakesh R
> Fix For: 2.8.0
>
> Attachments: HDFS-10592-00.patch, HDFS-10592-01.patch
>
>
> This JIRA is to fix the 
> {{TestNameNodeResourceChecker#testCheckThatNameNodeResourceMonitorIsRunning}} 
> test case failure.
> Reference 
> [Build_15973|https://builds.apache.org/job/PreCommit-HDFS-Build/15973/testReport/junit/org.apache.hadoop.hdfs.server.namenode/TestNameNodeResourceChecker/testCheckThatNameNodeResourceMonitorIsRunning/]






[jira] [Commented] (HDFS-10592) Fix intermittent test failure of TestNameNodeResourceChecker#testCheckThatNameNodeResourceMonitorIsRunning

2016-07-07 Thread Uma Maheswara Rao G (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10592?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15366704#comment-15366704
 ] 

Uma Maheswara Rao G commented on HDFS-10592:


Thanks Rakesh for working on this.
+1 for the patch. Will commit this.

> Fix intermittent test failure of 
> TestNameNodeResourceChecker#testCheckThatNameNodeResourceMonitorIsRunning
> --
>
> Key: HDFS-10592
> URL: https://issues.apache.org/jira/browse/HDFS-10592
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Rakesh R
>Assignee: Rakesh R
> Fix For: 2.8.0
>
> Attachments: HDFS-10592-00.patch, HDFS-10592-01.patch
>
>
> This JIRA is to fix the 
> {{TestNameNodeResourceChecker#testCheckThatNameNodeResourceMonitorIsRunning}} 
> test case failure.
> Reference 
> [Build_15973|https://builds.apache.org/job/PreCommit-HDFS-Build/15973/testReport/junit/org.apache.hadoop.hdfs.server.namenode/TestNameNodeResourceChecker/testCheckThatNameNodeResourceMonitorIsRunning/]






[jira] [Updated] (HDFS-10555) Unable to loadFSEdits due to a failure in readCachePoolInfo

2016-06-23 Thread Uma Maheswara Rao G (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10555?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Uma Maheswara Rao G updated HDFS-10555:
---
Affects Version/s: 2.9.0

> Unable to loadFSEdits due to a failure in readCachePoolInfo
> ---
>
> Key: HDFS-10555
> URL: https://issues.apache.org/jira/browse/HDFS-10555
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: caching, namenode
>Affects Versions: 2.9.0
>Reporter: Uma Maheswara Rao G
>Assignee: Uma Maheswara Rao G
>Priority: Critical
> Fix For: 2.9.0
>
> Attachments: HDFS-10555-00.patch
>
>
> Recently some tests are failing, unable to loadFSEdits due to a failure in 
> readCachePoolInfo. See the code below, from FSImageSerialization.java:
> {code}
>   }
> if ((flags & ~0x2F) != 0) {
>   throw new IOException("Unknown flag in CachePoolInfo: " + flags);
> }
> {code}
> When all CachePoolInfo flag bits are set, {{flags & ~0x2F}} evaluates to a 
> non-zero value, so this check fails. When the 0x20 flag was added, the mask 
> was changed from ~0x1F to ~0x2F instead of ~0x3F, which drops the 0x10 bit. 
> To fix this, we can change the mask value to ~0x3F.






[jira] [Updated] (HDFS-10555) Unable to loadFSEdits due to a failure in readCachePoolInfo

2016-06-23 Thread Uma Maheswara Rao G (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10555?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Uma Maheswara Rao G updated HDFS-10555:
---
Component/s: caching

> Unable to loadFSEdits due to a failure in readCachePoolInfo
> ---
>
> Key: HDFS-10555
> URL: https://issues.apache.org/jira/browse/HDFS-10555
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: caching, namenode
>Reporter: Uma Maheswara Rao G
>Assignee: Uma Maheswara Rao G
>Priority: Critical
> Fix For: 2.9.0
>
> Attachments: HDFS-10555-00.patch
>
>
> Recently some tests are failing, unable to loadFSEdits due to a failure in 
> readCachePoolInfo. See the code below, from FSImageSerialization.java:
> {code}
>   }
> if ((flags & ~0x2F) != 0) {
>   throw new IOException("Unknown flag in CachePoolInfo: " + flags);
> }
> {code}
> When all CachePoolInfo flag bits are set, {{flags & ~0x2F}} evaluates to a 
> non-zero value, so this check fails. When the 0x20 flag was added, the mask 
> was changed from ~0x1F to ~0x2F instead of ~0x3F, which drops the 0x10 bit. 
> To fix this, we can change the mask value to ~0x3F.






[jira] [Commented] (HDFS-10555) Unable to loadFSEdits due to a failure in readCachePoolInfo

2016-06-23 Thread Uma Maheswara Rao G (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10555?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15346700#comment-15346700
 ] 

Uma Maheswara Rao G commented on HDFS-10555:


Thanks a lot, Kihwal, for merging this to branch-2. I was just about to do that. :-)


> Unable to loadFSEdits due to a failure in readCachePoolInfo
> ---
>
> Key: HDFS-10555
> URL: https://issues.apache.org/jira/browse/HDFS-10555
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Reporter: Uma Maheswara Rao G
>Assignee: Uma Maheswara Rao G
>Priority: Critical
> Fix For: 2.9.0
>
> Attachments: HDFS-10555-00.patch
>
>
> Recently some tests are failing, unable to loadFSEdits due to a failure in 
> readCachePoolInfo. See the code below, from FSImageSerialization.java:
> {code}
>   }
> if ((flags & ~0x2F) != 0) {
>   throw new IOException("Unknown flag in CachePoolInfo: " + flags);
> }
> {code}
> When all CachePoolInfo flag bits are set, {{flags & ~0x2F}} evaluates to a 
> non-zero value, so this check fails. When the 0x20 flag was added, the mask 
> was changed from ~0x1F to ~0x2F instead of ~0x3F, which drops the 0x10 bit. 
> To fix this, we can change the mask value to ~0x3F.






[jira] [Updated] (HDFS-10473) Allow only suitable storage policies to be set on striped files

2016-06-22 Thread Uma Maheswara Rao G (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10473?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Uma Maheswara Rao G updated HDFS-10473:
---
   Resolution: Fixed
Fix Version/s: 3.0.0-alpha1
   Status: Resolved  (was: Patch Available)

> Allow only suitable storage policies to be set on striped files
> ---
>
> Key: HDFS-10473
> URL: https://issues.apache.org/jira/browse/HDFS-10473
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: namenode
>Reporter: Uma Maheswara Rao G
>Assignee: Uma Maheswara Rao G
> Fix For: 3.0.0-alpha1
>
> Attachments: HDFS-10473-01.patch, HDFS-10473-02.patch, 
> HDFS-10473-03.patch, HDFS-10473-04.patch, HDFS-10473-05.patch
>
>
> Currently some of the existing storage policies are not suitable for striped 
> layout files.
> This JIRA proposes to reject setting a storage policy on striped files.
> Another thought is to allow only suitable storage policies, like ALL_SSD.
> Since the major use case of EC is cold data, this may not be of high 
> importance, so I am OK with rejecting storage policy changes on striped files 
> at this stage. Please suggest if others have thoughts on this.
> Thanks [~zhz] for the offline discussion on this.






[jira] [Created] (HDFS-10565) Erasure Coding: Document about the current allowed storage policies for EC Striped mode files

2016-06-22 Thread Uma Maheswara Rao G (JIRA)
Uma Maheswara Rao G created HDFS-10565:
--

 Summary: Erasure Coding: Document about the current allowed 
storage policies for EC Striped mode files 
 Key: HDFS-10565
 URL: https://issues.apache.org/jira/browse/HDFS-10565
 Project: Hadoop HDFS
  Issue Type: Sub-task
  Components: documentation
Affects Versions: 3.0.0-alpha1
Reporter: Uma Maheswara Rao G
Assignee: Uma Maheswara Rao G


HDFS-10473 was implemented to allow only the ALL_SSD, HOT, and COLD storage 
policies to take effect while moving/placing blocks for striped EC files. This 
JIRA tracks the documentation of that behavior.
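As a rough illustration of the restriction described above, the check amounts to a whitelist of the three allowed policies. This is a hedged standalone sketch; the class, enum, and method names are hypothetical and not the actual HDFS-10473 implementation:

```java
import java.util.EnumSet;

// Illustrative sketch of a storage-policy suitability check for striped EC
// files. All names here are hypothetical, not the real HDFS code.
public class StripedPolicyCheckDemo {
    enum StoragePolicy { HOT, COLD, WARM, ALL_SSD, ONE_SSD, LAZY_PERSIST }

    // The three policies the JIRA allows to take effect for striped files.
    static final EnumSet<StoragePolicy> ALLOWED_FOR_STRIPED =
        EnumSet.of(StoragePolicy.ALL_SSD, StoragePolicy.HOT, StoragePolicy.COLD);

    static boolean isPolicySuitableForStriped(StoragePolicy policy) {
        return ALLOWED_FOR_STRIPED.contains(policy);
    }

    public static void main(String[] args) {
        System.out.println(isPolicySuitableForStriped(StoragePolicy.ALL_SSD)); // true
        System.out.println(isPolicySuitableForStriped(StoragePolicy.ONE_SSD)); // false
    }
}
```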






[jira] [Commented] (HDFS-10473) Allow only suitable storage policies to be set on striped files

2016-06-22 Thread Uma Maheswara Rao G (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10473?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15344931#comment-15344931
 ] 

Uma Maheswara Rao G commented on HDFS-10473:


Thank you [~jingzhao] and [~zhz] for the reviews. I have just committed this to 
trunk. I will look into updating the documentation for this change and will 
file a JIRA for that. Thanks

> Allow only suitable storage policies to be set on striped files
> ---
>
> Key: HDFS-10473
> URL: https://issues.apache.org/jira/browse/HDFS-10473
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: namenode
>Reporter: Uma Maheswara Rao G
>Assignee: Uma Maheswara Rao G
> Attachments: HDFS-10473-01.patch, HDFS-10473-02.patch, 
> HDFS-10473-03.patch, HDFS-10473-04.patch, HDFS-10473-05.patch
>
>
> Currently some of the existing storage policies are not suitable for striped 
> layout files.
> This JIRA proposes to reject setting a storage policy on striped files.
> Another thought is to allow only suitable storage policies, like ALL_SSD.
> Since the major use case of EC is cold data, this may not be of high 
> importance, so I am OK with rejecting storage policy changes on striped files 
> at this stage. Please suggest if others have thoughts on this.
> Thanks [~zhz] for the offline discussion on this.






[jira] [Updated] (HDFS-10555) Unable to loadFSEdits due to a failure in readCachePoolInfo

2016-06-21 Thread Uma Maheswara Rao G (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10555?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Uma Maheswara Rao G updated HDFS-10555:
---
Status: Patch Available  (was: Open)

> Unable to loadFSEdits due to a failure in readCachePoolInfo
> ---
>
> Key: HDFS-10555
> URL: https://issues.apache.org/jira/browse/HDFS-10555
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Reporter: Uma Maheswara Rao G
>Assignee: Uma Maheswara Rao G
>Priority: Critical
> Attachments: HDFS-10555-00.patch
>
>
> Recently some tests are failing, unable to loadFSEdits due to a failure in 
> readCachePoolInfo. See the code below, from FSImageSerialization.java:
> {code}
>   }
> if ((flags & ~0x2F) != 0) {
>   throw new IOException("Unknown flag in CachePoolInfo: " + flags);
> }
> {code}
> When all CachePoolInfo flag bits are set, {{flags & ~0x2F}} evaluates to a 
> non-zero value, so this check fails. When the 0x20 flag was added, the mask 
> was changed from ~0x1F to ~0x2F instead of ~0x3F, which drops the 0x10 bit. 
> To fix this, we can change the mask value to ~0x3F.






[jira] [Updated] (HDFS-10555) Unable to loadFSEdits due to a failure in readCachePoolInfo

2016-06-21 Thread Uma Maheswara Rao G (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10555?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Uma Maheswara Rao G updated HDFS-10555:
---
Attachment: HDFS-10555-00.patch

I have just attached a simple patch to fix this.

> Unable to loadFSEdits due to a failure in readCachePoolInfo
> ---
>
> Key: HDFS-10555
> URL: https://issues.apache.org/jira/browse/HDFS-10555
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Reporter: Uma Maheswara Rao G
>Assignee: Uma Maheswara Rao G
>Priority: Critical
> Attachments: HDFS-10555-00.patch
>
>
> Recently some tests are failing, unable to loadFSEdits due to a failure in 
> readCachePoolInfo. See the code below, from FSImageSerialization.java:
> {code}
>   }
> if ((flags & ~0x2F) != 0) {
>   throw new IOException("Unknown flag in CachePoolInfo: " + flags);
> }
> {code}
> When all CachePoolInfo flag bits are set, {{flags & ~0x2F}} evaluates to a 
> non-zero value, so this check fails. When the 0x20 flag was added, the mask 
> was changed from ~0x1F to ~0x2F instead of ~0x3F, which drops the 0x10 bit. 
> To fix this, we can change the mask value to ~0x3F.






[jira] [Updated] (HDFS-10473) Allow only suitable storage policies to be set on striped files

2016-06-21 Thread Uma Maheswara Rao G (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10473?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Uma Maheswara Rao G updated HDFS-10473:
---
Attachment: HDFS-10473-05.patch

Thanks a lot, [~jingzhao], for the quick review. Yup, since this will mostly be 
set on a high-level directory, it makes sense to reduce the logging to debug 
level. I have just changed it. Please check.

> Allow only suitable storage policies to be set on striped files
> ---
>
> Key: HDFS-10473
> URL: https://issues.apache.org/jira/browse/HDFS-10473
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: namenode
>Reporter: Uma Maheswara Rao G
>Assignee: Uma Maheswara Rao G
> Attachments: HDFS-10473-01.patch, HDFS-10473-02.patch, 
> HDFS-10473-03.patch, HDFS-10473-04.patch, HDFS-10473-05.patch
>
>
> Currently some of the existing storage policies are not suitable for striped 
> layout files.
> This JIRA proposes to reject setting a storage policy on striped files.
> Another thought is to allow only suitable storage policies, like ALL_SSD.
> Since the major use case of EC is cold data, this may not be of high 
> importance, so I am OK with rejecting storage policy changes on striped files 
> at this stage. Please suggest if others have thoughts on this.
> Thanks [~zhz] for the offline discussion on this.






[jira] [Comment Edited] (HDFS-10473) Allow only suitable storage policies to be set on striped files

2016-06-21 Thread Uma Maheswara Rao G (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10473?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15343118#comment-15343118
 ] 

Uma Maheswara Rao G edited comment on HDFS-10473 at 6/22/16 12:46 AM:
--

Failures seem to be unrelated; I have just filed a JIRA for them: HDFS-10555


was (Author: umamaheswararao):
I have just filed a JIRA for failures: HDFS-10555

> Allow only suitable storage policies to be set on striped files
> ---
>
> Key: HDFS-10473
> URL: https://issues.apache.org/jira/browse/HDFS-10473
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: namenode
>Reporter: Uma Maheswara Rao G
>Assignee: Uma Maheswara Rao G
> Attachments: HDFS-10473-01.patch, HDFS-10473-02.patch, 
> HDFS-10473-03.patch, HDFS-10473-04.patch
>
>
> Currently some of the existing storage policies are not suitable for striped 
> layout files.
> This JIRA proposes to reject setting a storage policy on striped files.
> Another thought is to allow only suitable storage policies, like ALL_SSD.
> Since the major use case of EC is cold data, this may not be of high 
> importance, so I am OK with rejecting storage policy changes on striped files 
> at this stage. Please suggest if others have thoughts on this.
> Thanks [~zhz] for the offline discussion on this.






[jira] [Commented] (HDFS-10473) Allow only suitable storage policies to be set on striped files

2016-06-21 Thread Uma Maheswara Rao G (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10473?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15343118#comment-15343118
 ] 

Uma Maheswara Rao G commented on HDFS-10473:


I have just filed a JIRA for the failures: HDFS-10555

> Allow only suitable storage policies to be set on striped files
> ---
>
> Key: HDFS-10473
> URL: https://issues.apache.org/jira/browse/HDFS-10473
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: namenode
>Reporter: Uma Maheswara Rao G
>Assignee: Uma Maheswara Rao G
> Attachments: HDFS-10473-01.patch, HDFS-10473-02.patch, 
> HDFS-10473-03.patch, HDFS-10473-04.patch
>
>
> Currently some of the existing storage policies are not suitable for striped 
> layout files.
> This JIRA proposes to reject setting a storage policy on striped files.
> Another thought is to allow only suitable storage policies, like ALL_SSD.
> Since the major use case of EC is cold data, this may not be of high 
> importance, so I am OK with rejecting storage policy changes on striped files 
> at this stage. Please suggest if others have thoughts on this.
> Thanks [~zhz] for the offline discussion on this.






[jira] [Commented] (HDFS-10555) Unable to loadFSEdits due to a failure in readCachePoolInfo

2016-06-21 Thread Uma Maheswara Rao G (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10555?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15343114#comment-15343114
 ] 

Uma Maheswara Rao G commented on HDFS-10555:


{noformat}
testptch/hadoop/hadoop-hdfs-project/hadoop-hdfs/target/test/data/1/dfs/name-0-1/current/edits_001-094;
 failing over to edit log 
/testptch/hadoop/hadoop-hdfs-project/hadoop-hdfs/target/test/data/1/dfs/name-0-2/current/edits_001-094
java.io.IOException: Unknown flag in CachePoolInfo: 63
at 
org.apache.hadoop.hdfs.server.namenode.FSImageSerialization.readCachePoolInfo(FSImageSerialization.java:687)
at 
org.apache.hadoop.hdfs.server.namenode.FSEditLogOp$AddCachePoolOp.readFields(FSEditLogOp.java:3974)
at 
org.apache.hadoop.hdfs.server.namenode.FSEditLogOp$LengthPrefixedReader.decodeOp(FSEditLogOp.java:4747)
at 
org.apache.hadoop.hdfs.server.namenode.FSEditLogOp$Reader.readOp(FSEditLogOp.java:4607)
at 
org.apache.hadoop.hdfs.server.namenode.EditLogFileInputStream.nextOpImpl(EditLogFileInputStream.java:202)
at 
org.apache.hadoop.hdfs.server.namenode.EditLogFileInputStream.nextOp(EditLogFileInputStream.java:249)
at 
org.apache.hadoop.hdfs.server.namenode.EditLogInputStream.readOp(EditLogInputStream.java:85)
at 
org.apache.hadoop.hdfs.server.namenode.RedundantEditLogInputStream.nextOp(RedundantEditLogInputStream.java:189)
at 
org.apache.hadoop.hdfs.server.namenode.EditLogInputStream.readOp(EditLogInputStream.java:85)
at 
org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadEditRecords(FSEditLogLoader.java:196)
at 
org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadFSEdits(FSEditLogLoader.java:149)
at 
org.apache.hadoop.hdfs.server.namenode.FSImage.loadEdits(FSImage.java:837)
{noformat}

> Unable to loadFSEdits due to a failure in readCachePoolInfo
> ---
>
> Key: HDFS-10555
> URL: https://issues.apache.org/jira/browse/HDFS-10555
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Reporter: Uma Maheswara Rao G
>Assignee: Uma Maheswara Rao G
>Priority: Critical
>
> Recently some tests are failing, unable to loadFSEdits due to a failure in 
> readCachePoolInfo. See the code below, from FSImageSerialization.java:
> {code}
>   }
> if ((flags & ~0x2F) != 0) {
>   throw new IOException("Unknown flag in CachePoolInfo: " + flags);
> }
> {code}
> When all CachePoolInfo flag bits are set, {{flags & ~0x2F}} evaluates to a 
> non-zero value, so this check fails. When the 0x20 flag was added, the mask 
> was changed from ~0x1F to ~0x2F instead of ~0x3F, which drops the 0x10 bit. 
> To fix this, we can change the mask value to ~0x3F.






[jira] [Comment Edited] (HDFS-10555) Unable to loadFSEdits due to a failure in readCachePoolInfo

2016-06-21 Thread Uma Maheswara Rao G (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10555?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15343114#comment-15343114
 ] 

Uma Maheswara Rao G edited comment on HDFS-10555 at 6/22/16 12:44 AM:
--

{noformat}
testptch/hadoop/hadoop-hdfs-project/hadoop-hdfs/target/test/data/1/dfs/name-0-1/current/edits_001-094;
 failing over to edit log 
/testptch/hadoop/hadoop-hdfs-project/hadoop-hdfs/target/test/data/1/dfs/name-0-2/current/edits_001-094
java.io.IOException: Unknown flag in CachePoolInfo: 63
at 
org.apache.hadoop.hdfs.server.namenode.FSImageSerialization.readCachePoolInfo(FSImageSerialization.java:687)
at 
org.apache.hadoop.hdfs.server.namenode.FSEditLogOp$AddCachePoolOp.readFields(FSEditLogOp.java:3974)
at 
org.apache.hadoop.hdfs.server.namenode.FSEditLogOp$LengthPrefixedReader.decodeOp(FSEditLogOp.java:4747)
at 
org.apache.hadoop.hdfs.server.namenode.FSEditLogOp$Reader.readOp(FSEditLogOp.java:4607)
at 
org.apache.hadoop.hdfs.server.namenode.EditLogFileInputStream.nextOpImpl(EditLogFileInputStream.java:202)
at 
org.apache.hadoop.hdfs.server.namenode.EditLogFileInputStream.nextOp(EditLogFileInputStream.java:249)
at 
org.apache.hadoop.hdfs.server.namenode.EditLogInputStream.readOp(EditLogInputStream.java:85)
at 
org.apache.hadoop.hdfs.server.namenode.RedundantEditLogInputStream.nextOp(RedundantEditLogInputStream.java:189)
at 
org.apache.hadoop.hdfs.server.namenode.EditLogInputStream.readOp(EditLogInputStream.java:85)
at 
org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadEditRecords(FSEditLogLoader.java:196)
at 
org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadFSEdits(FSEditLogLoader.java:149)
at 
org.apache.hadoop.hdfs.server.namenode.FSImage.loadEdits(FSImage.java:837)

{noformat}


was (Author: umamaheswararao):
{noformat}
testptch/hadoop/hadoop-hdfs-project/hadoop-hdfs/target/test/data/1/dfs/name-0-1/current/edits_001-094;
 failing over to edit log 
/testptch/hadoop/hadoop-hdfs-project/hadoop-hdfs/target/test/data/1/dfs/name-0-2/current/edits_001-094
java.io.IOException: Unknown flag in CachePoolInfo: 63
at 
org.apache.hadoop.hdfs.server.namenode.FSImageSerialization.readCachePoolInfo(FSImageSerialization.java:687)
at 
org.apache.hadoop.hdfs.server.namenode.FSEditLogOp$AddCachePoolOp.readFields(FSEditLogOp.java:3974)
at 
org.apache.hadoop.hdfs.server.namenode.FSEditLogOp$LengthPrefixedReader.decodeOp(FSEditLogOp.java:4747)
at 
org.apache.hadoop.hdfs.server.namenode.FSEditLogOp$Reader.readOp(FSEditLogOp.java:4607)
at 
org.apache.hadoop.hdfs.server.namenode.EditLogFileInputStream.nextOpImpl(EditLogFileInputStream.java:202)
at 
org.apache.hadoop.hdfs.server.namenode.EditLogFileInputStream.nextOp(EditLogFileInputStream.java:249)
at 
org.apache.hadoop.hdfs.server.namenode.EditLogInputStream.readOp(EditLogInputStream.java:85)
at 
org.apache.hadoop.hdfs.server.namenode.RedundantEditLogInputStream.nextOp(RedundantEditLogInputStream.java:189)
at 
org.apache.hadoop.hdfs.server.namenode.EditLogInputStream.readOp(EditLogInputStream.java:85)
at 
org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadEditRecords(FSEditLogLoader.java:196)
at 
org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadFSEdits(FSEditLogLoader.java:149)
at 
org.apache.hadoop.hdfs.server.namenode.FSImage.loadEdits(FSImage.java:837)
{nofromat}

> Unable to loadFSEdits due to a failure in readCachePoolInfo
> ---
>
> Key: HDFS-10555
> URL: https://issues.apache.org/jira/browse/HDFS-10555
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Reporter: Uma Maheswara Rao G
>Assignee: Uma Maheswara Rao G
>Priority: Critical
>
> Recently some tests are failing, unable to loadFSEdits due to a failure in 
> readCachePoolInfo. See the code below, from FSImageSerialization.java:
> {code}
>   }
> if ((flags & ~0x2F) != 0) {
>   throw new IOException("Unknown flag in CachePoolInfo: " + flags);
> }
> {code}
> When all CachePoolInfo flag bits are set, {{flags & ~0x2F}} evaluates to a 
> non-zero value, so this check fails. When the 0x20 flag was added, the mask 
> was changed from ~0x1F to ~0x2F instead of ~0x3F, which drops the 0x10 bit. 
> To fix this, we can change the mask value to ~0x3F.






[jira] [Created] (HDFS-10555) Unable to loadFSEdits due to a failure in readCachePoolInfo

2016-06-21 Thread Uma Maheswara Rao G (JIRA)
Uma Maheswara Rao G created HDFS-10555:
--

 Summary: Unable to loadFSEdits due to a failure in 
readCachePoolInfo
 Key: HDFS-10555
 URL: https://issues.apache.org/jira/browse/HDFS-10555
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: namenode
Reporter: Uma Maheswara Rao G
Assignee: Uma Maheswara Rao G
Priority: Critical


Recently some tests are failing: loadFSEdits cannot complete due to a failure 
in readCachePoolInfo.
The problem is in the code below, in FSImageSerialization.java:
{code}
  }
if ((flags & ~0x2F) != 0) {
  throw new IOException("Unknown flag in CachePoolInfo: " + flags);
}
{code}

When all flag fields of CachePoolInfo are set to true, (flags & ~0x2F) 
evaluates to a non-zero value, so this check fails. The cause is that when the 
0x20 flag was added, the mask was changed from ~0x1F to ~0x2F instead of ~0x3F.
To fix this issue, we can change the mask value to ~0x3F.






[jira] [Updated] (HDFS-10473) Allow only suitable storage policies to be set on striped files

2016-06-21 Thread Uma Maheswara Rao G (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10473?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Uma Maheswara Rao G updated HDFS-10473:
---
Attachment: HDFS-10473-04.patch

[~jingzhao], I have updated the patch as per the comments. Please review it.

> Allow only suitable storage policies to be set on striped files
> ---
>
> Key: HDFS-10473
> URL: https://issues.apache.org/jira/browse/HDFS-10473
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: namenode
>Reporter: Uma Maheswara Rao G
>Assignee: Uma Maheswara Rao G
> Attachments: HDFS-10473-01.patch, HDFS-10473-02.patch, 
> HDFS-10473-03.patch, HDFS-10473-04.patch
>
>
> Currently some of existing storage policies are not suitable for striped 
> layout files.
> This JIRA proposes to reject setting storage policy on striped files.
> Another thought is to allow only suitable storage polices like ALL_SSD.
> Since the major use case of EC is for cold data, this may not be at high 
> importance. So, I am ok to reject setting storage policy on striped files at 
> this stage. Please suggest if others have some thoughts on this.
> Thanks [~zhz] for offline discussion on this.






[jira] [Commented] (HDFS-10473) Allow only suitable storage policies to be set on striped files

2016-06-14 Thread Uma Maheswara Rao G (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10473?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15330637#comment-15330637
 ] 

Uma Maheswara Rao G commented on HDFS-10473:


{quote}
We also need to consider other scenarios such as the NameNode tries to choose 
extra datanodes to reconstruct EC blocks with missing internal blocks (i.e., 
{{ErasureCodingWork#chooseTargets}}). Maybe we can consider adding the extra 
check introduced in the current patch directly in 
{{INodeFile#getStoragePolicyID}}?
{quote}
Good point [~jingzhao]. Let me update the patch to cover this scenario.
Thanks a lot Zhe for reviews.
[~arpitagarwal] Thanks. Sure.

> Allow only suitable storage policies to be set on striped files
> ---
>
> Key: HDFS-10473
> URL: https://issues.apache.org/jira/browse/HDFS-10473
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: namenode
>Reporter: Uma Maheswara Rao G
>Assignee: Uma Maheswara Rao G
> Attachments: HDFS-10473-01.patch, HDFS-10473-02.patch, 
> HDFS-10473-03.patch
>
>
> Currently some of existing storage policies are not suitable for striped 
> layout files.
> This JIRA proposes to reject setting storage policy on striped files.
> Another thought is to allow only suitable storage polices like ALL_SSD.
> Since the major use case of EC is for cold data, this may not be at high 
> importance. So, I am ok to reject setting storage policy on striped files at 
> this stage. Please suggest if others have some thoughts on this.
> Thanks [~zhz] for offline discussion on this.






[jira] [Updated] (HDFS-10473) Allow only suitable storage policies to be set on striped files

2016-06-11 Thread Uma Maheswara Rao G (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10473?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Uma Maheswara Rao G updated HDFS-10473:
---
Attachment: HDFS-10473-03.patch

Addressed some checkstyle issues (lines longer than 80 characters).
I have not fixed the whitespace issues reported by checkstyle, as they look 
invalid to me. The other whitespace issues in the patch are unrelated; they 
come from LICENSE.txt.
Also corrected the test failures.

Please review the patch.

> Allow only suitable storage policies to be set on striped files
> ---
>
> Key: HDFS-10473
> URL: https://issues.apache.org/jira/browse/HDFS-10473
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: namenode
>Reporter: Uma Maheswara Rao G
>Assignee: Uma Maheswara Rao G
> Attachments: HDFS-10473-01.patch, HDFS-10473-02.patch, 
> HDFS-10473-03.patch
>
>
> Currently some of existing storage policies are not suitable for striped 
> layout files.
> This JIRA proposes to reject setting storage policy on striped files.
> Another thought is to allow only suitable storage polices like ALL_SSD.
> Since the major use case of EC is for cold data, this may not be at high 
> importance. So, I am ok to reject setting storage policy on striped files at 
> this stage. Please suggest if others have some thoughts on this.
> Thanks [~zhz] for offline discussion on this.






[jira] [Updated] (HDFS-10473) Allow only suitable storage policies to be set on striped files

2016-06-11 Thread Uma Maheswara Rao G (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10473?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Uma Maheswara Rao G updated HDFS-10473:
---
Attachment: HDFS-10473-02.patch

Thank you [~jingzhao] for the reviews.
I updated the patch as we discussed. If these changes look fine, then I will 
work on the documentation to cover them.

> Allow only suitable storage policies to be set on striped files
> ---
>
> Key: HDFS-10473
> URL: https://issues.apache.org/jira/browse/HDFS-10473
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: namenode
>Reporter: Uma Maheswara Rao G
>Assignee: Uma Maheswara Rao G
> Attachments: HDFS-10473-01.patch, HDFS-10473-02.patch
>
>
> Currently some of existing storage policies are not suitable for striped 
> layout files.
> This JIRA proposes to reject setting storage policy on striped files.
> Another thought is to allow only suitable storage polices like ALL_SSD.
> Since the major use case of EC is for cold data, this may not be at high 
> importance. So, I am ok to reject setting storage policy on striped files at 
> this stage. Please suggest if others have some thoughts on this.
> Thanks [~zhz] for offline discussion on this.






[jira] [Commented] (HDFS-10473) Allow only suitable storage policies to be set on striped files

2016-06-08 Thread Uma Maheswara Rao G (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10473?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15320126#comment-15320126
 ] 

Uma Maheswara Rao G commented on HDFS-10473:


>Also, do you see any critical issue if an admin really sets WARM/ONE_SSD 
>policy to EC files?
In the striping model all data blocks are equally important, so there is no 
point in setting the ONE_SSD policy on these files. That is the only concern I 
had; I am not sure whether allowing it is fine.
To simplify, how about the Mover tool moves EC files only if the target policy 
is ARCHIVE/ALL_SSD? (Allowing the Mover tool to perform movements for other 
policies such as ONE_SSD makes no sense to me.) We can just log that the 
policy is not recommended for striped EC files and skip the movement.
Let's leave the NN-side changes as is, since they bring complexities to handle.

> Allow only suitable storage policies to be set on striped files
> ---
>
> Key: HDFS-10473
> URL: https://issues.apache.org/jira/browse/HDFS-10473
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: namenode
>Reporter: Uma Maheswara Rao G
>Assignee: Uma Maheswara Rao G
> Attachments: HDFS-10473-01.patch
>
>
> Currently some of existing storage policies are not suitable for striped 
> layout files.
> This JIRA proposes to reject setting storage policy on striped files.
> Another thought is to allow only suitable storage polices like ALL_SSD.
> Since the major use case of EC is for cold data, this may not be at high 
> importance. So, I am ok to reject setting storage policy on striped files at 
> this stage. Please suggest if others have some thoughts on this.
> Thanks [~zhz] for offline discussion on this.






[jira] [Updated] (HDFS-10473) Allow only suitable storage policies to be set on striped files

2016-06-07 Thread Uma Maheswara Rao G (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10473?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Uma Maheswara Rao G updated HDFS-10473:
---
Description: 
Currently some of existing storage policies are not suitable for striped layout 
files.
This JIRA proposes to reject setting storage policy on striped files.

Another thought is to allow only suitable storage polices like ALL_SSD.
Since the major use case of EC is for cold data, this may not be at high 
importance. So, I am ok to reject setting storage policy on striped files at 
this stage. Please suggest if others have some thoughts on this.

Thanks [~zhz] for offline discussion on this.

  was:
Currently existing storage policies are not suitable for striped layout files.
This JIRA proposes to reject setting storage policy on striped files.

Another thought is to allow only suitable storage polices like ALL_SSD.
Since the major use case of EC is for cold data, this may not be at high 
importance. So, I am ok to reject setting storage policy on striped files at 
this stage. Please suggest if others have some thoughts on this.

Thanks [~zhz] for offline discussion on this.


> Allow only suitable storage policies to be set on striped files
> ---
>
> Key: HDFS-10473
> URL: https://issues.apache.org/jira/browse/HDFS-10473
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: namenode
>Reporter: Uma Maheswara Rao G
>Assignee: Uma Maheswara Rao G
> Attachments: HDFS-10473-01.patch
>
>
> Currently some of existing storage policies are not suitable for striped 
> layout files.
> This JIRA proposes to reject setting storage policy on striped files.
> Another thought is to allow only suitable storage polices like ALL_SSD.
> Since the major use case of EC is for cold data, this may not be at high 
> importance. So, I am ok to reject setting storage policy on striped files at 
> this stage. Please suggest if others have some thoughts on this.
> Thanks [~zhz] for offline discussion on this.






[jira] [Comment Edited] (HDFS-10473) Allow only suitable storage policies to be set on striped files

2016-06-07 Thread Uma Maheswara Rao G (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10473?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15319807#comment-15319807
 ] 

Uma Maheswara Rao G edited comment on HDFS-10473 at 6/8/16 1:00 AM:


Thanks a lot Jing for taking a look.
{quote}
My understanding is policies like "WARM" and "ONE_SSD" are mainly targeting 
replication (since they're mainly setting specific storage type for the first 
replica) thus are not suitable. Could you please confirm it?
{quote}
Yes, you are right.

{quote}
For the patch, storage policies are mainly set on directories (in fact to set 
storage policies on files is not recommended), and we allow moving EC files 
across EC directory boundaries. Therefore it is not possible to disallow 
setting storage policies on striped file in O(1) time complexity. Looks like 
the changes on the NN side may be unnecessary here. We only need to let Mover 
ignore striped files for now.
{quote}
In reality, yes. In the current patch we disable it only when someone sets the 
policy directly on a file. You are right that we cannot disallow it for every 
file under a directory; that case is handled only while the Mover runs.

The actual plan is to identify the suitable policies and enable only those. As 
a first step we thought we would disable it and then think more carefully 
about which policies are suitable; but yes, we can think it through now and do 
the full change.
Here is how I am thinking:
The Mover is the key tool that moves the file blocks, so let's define the 
EC-allowed policies, i.e., ALL_SSD, ARCHIVE, etc.
Since policies are static, let's keep the allowed list statically in code. 
When the Mover attempts to move striped files, if the target policy is one of 
them we proceed for that file; otherwise we skip it.
For files specifically, if someone attempts to set a policy other than the 
above, we reject the call straight away. We cannot do this in the directory 
case because the policy applies to many files under it, some of which may be 
non-EC files. Maybe when listing policies for EC files we should ignore an 
inherited policy that is not in the above list?
What do you say?
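The allow-list idea above could be sketched roughly as follows. The class and policy names here are illustrative assumptions, not the actual patch; HDFS internally identifies storage policies by byte ID rather than by name.

```java
import java.util.Arrays;
import java.util.List;

public class EcAllowedPolicies {
    // Assumed allow-list; the real set would be decided on the JIRA.
    static final List<String> EC_ALLOWED = Arrays.asList("ALL_SSD", "ARCHIVE");

    /**
     * Decides whether the Mover should attempt block movement for a file.
     * Replicated files keep the existing behavior; striped (EC) files are
     * moved only when the target policy is in the allow-list, otherwise the
     * file is skipped (and a warning would be logged).
     */
    static boolean shouldMove(boolean isStriped, String targetPolicy) {
        if (!isStriped) {
            return true; // non-EC file: no restriction
        }
        return EC_ALLOWED.contains(targetPolicy);
    }

    public static void main(String[] args) {
        System.out.println(shouldMove(true, "ONE_SSD")); // false: skipped
        System.out.println(shouldMove(true, "ARCHIVE")); // true: moved
    }
}
```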




was (Author: umamaheswararao):
Thanks a lot Jing for taking look.
{quote}
My understanding is policies like "WARM" and "ONE_SSD" are mainly targeting 
replication (since they're mainly setting specific storage type for the first 
replica) thus are not suitable. Could you please confirm it?
{quote}
Yes. You are right.

{quote}
For the patch, storage policies are mainly set on directories (in fact to set 
storage policies on files is not recommended), and we allow moving EC files 
across EC directory boundaries. Therefore it is not possible to disallow 
setting storage policies on striped file in O(1) time complexity. Looks like 
the changes on the NN side may be unnecessary here. We only need to let Mover 
ignore striped files for now.
{quote}
In reality yes. Currently in NN we just disable for files only if some one 
sets. Yes you are right we can not disable for each level of file here. This 
handle while running mover only.  

Actual plan is to find the suitable policies and enable only for them. At first 
step we thought we will disable and then think more carefully what policies 
suitable. Yes, we can think now itself and do full changes.
Here is how I am thinking :
Mover is the key tool here who moves the file blocks.  So, lets define EC 
allowed policies. i.e, ALL_SSD, ARCHIVE, etc
Since policies are static, lets keep allowed list statically in Code. When 
mover attempt move striped files, if the targeted policy is either of the them, 
then we will just proceed for that file, otherwise we will just skip. 
For the files specifically if someone attempting to set other than above 
policy, then we don't allow straightway by rejecting call. We can not do on 
directory case because it applies for many files under it. some of them may be 
non ec file directories. May be when listing policies for EC files, we should 
ignore if inherited one other than above list?
What do you say?



> Allow only suitable storage policies to be set on striped files
> ---
>
> Key: HDFS-10473
> URL: https://issues.apache.org/jira/browse/HDFS-10473
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: namenode
>Reporter: Uma Maheswara Rao G
>Assignee: Uma Maheswara Rao G
> Attachments: HDFS-10473-01.patch
>
>
> Currently existing storage policies are not suitable for striped layout files.
> This JIRA proposes to reject setting storage policy on striped files.
> Another thought is to allow only suitable storage polices like ALL_SSD.
> Since the major use case of EC is for cold data, this may not be at high 
> importance. So, I am ok to reject setting storage policy on striped files at 
> this stage. Please suggest if others have some thoughts on this.
> Thanks [~zhz] for offline discussion on this.

[jira] [Commented] (HDFS-10473) Allow only suitable storage policies to be set on striped files

2016-06-07 Thread Uma Maheswara Rao G (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10473?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15319807#comment-15319807
 ] 

Uma Maheswara Rao G commented on HDFS-10473:


Thanks a lot Jing for taking a look.
{quote}
My understanding is policies like "WARM" and "ONE_SSD" are mainly targeting 
replication (since they're mainly setting specific storage type for the first 
replica) thus are not suitable. Could you please confirm it?
{quote}
Yes, you are right.

{quote}
For the patch, storage policies are mainly set on directories (in fact to set 
storage policies on files is not recommended), and we allow moving EC files 
across EC directory boundaries. Therefore it is not possible to disallow 
setting storage policies on striped file in O(1) time complexity. Looks like 
the changes on the NN side may be unnecessary here. We only need to let Mover 
ignore striped files for now.
{quote}
In reality, yes. Currently in the NN we disable it only when someone sets the 
policy directly on a file. You are right that we cannot disallow it at every 
file level; that case is handled only while the Mover runs.

The actual plan is to identify the suitable policies and enable only those. As 
a first step we thought we would disable it and then think more carefully 
about which policies are suitable; but yes, we can think it through now and do 
the full change.
Here is how I am thinking:
The Mover is the key tool that moves the file blocks, so let's define the 
EC-allowed policies, i.e., ALL_SSD, ARCHIVE, etc.
Since policies are static, let's keep the allowed list statically in code. 
When the Mover attempts to move striped files, if the target policy is one of 
them we proceed for that file; otherwise we skip it.
For files specifically, if someone attempts to set a policy other than the 
above, we reject the call straight away. We cannot do this in the directory 
case because the policy applies to many files under it, some of which may be 
non-EC files. Maybe when listing policies for EC files we should ignore an 
inherited policy that is not in the above list?
What do you say?



> Allow only suitable storage policies to be set on striped files
> ---
>
> Key: HDFS-10473
> URL: https://issues.apache.org/jira/browse/HDFS-10473
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: namenode
>Reporter: Uma Maheswara Rao G
>Assignee: Uma Maheswara Rao G
> Attachments: HDFS-10473-01.patch
>
>
> Currently existing storage policies are not suitable for striped layout files.
> This JIRA proposes to reject setting storage policy on striped files.
> Another thought is to allow only suitable storage polices like ALL_SSD.
> Since the major use case of EC is for cold data, this may not be at high 
> importance. So, I am ok to reject setting storage policy on striped files at 
> this stage. Please suggest if others have some thoughts on this.
> Thanks [~zhz] for offline discussion on this.






[jira] [Comment Edited] (HDFS-10473) Allow only suitable storage policies to be set on striped files

2016-06-07 Thread Uma Maheswara Rao G (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10473?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15319583#comment-15319583
 ] 

Uma Maheswara Rao G edited comment on HDFS-10473 at 6/7/16 10:26 PM:
-

Here is a patch that rejects setting a storage policy on striped files. Let me 
file a JIRA for further discussion on identifying suitable policies / defining 
new ones.


was (Author: umamaheswararao):
Here is a patch which rejects to set storage policy on striped files.

> Allow only suitable storage policies to be set on striped files
> ---
>
> Key: HDFS-10473
> URL: https://issues.apache.org/jira/browse/HDFS-10473
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: namenode
>Reporter: Uma Maheswara Rao G
>Assignee: Uma Maheswara Rao G
> Attachments: HDFS-10473-01.patch
>
>
> Currently existing storage policies are not suitable for striped layout files.
> This JIRA proposes to reject setting storage policy on striped files.
> Another thought is to allow only suitable storage polices like ALL_SSD.
> Since the major use case of EC is for cold data, this may not be at high 
> importance. So, I am ok to reject setting storage policy on striped files at 
> this stage. Please suggest if others have some thoughts on this.
> Thanks [~zhz] for offline discussion on this.






[jira] [Updated] (HDFS-10473) Allow only suitable storage policies to be set on striped files

2016-06-07 Thread Uma Maheswara Rao G (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10473?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Uma Maheswara Rao G updated HDFS-10473:
---
Status: Patch Available  (was: Open)

> Allow only suitable storage policies to be set on striped files
> ---
>
> Key: HDFS-10473
> URL: https://issues.apache.org/jira/browse/HDFS-10473
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: namenode
>Reporter: Uma Maheswara Rao G
>Assignee: Uma Maheswara Rao G
> Attachments: HDFS-10473-01.patch
>
>
> Currently existing storage policies are not suitable for striped layout files.
> This JIRA proposes to reject setting storage policy on striped files.
> Another thought is to allow only suitable storage polices like ALL_SSD.
> Since the major use case of EC is for cold data, this may not be at high 
> importance. So, I am ok to reject setting storage policy on striped files at 
> this stage. Please suggest if others have some thoughts on this.
> Thanks [~zhz] for offline discussion on this.






[jira] [Updated] (HDFS-10473) Allow only suitable storage policies to be set on striped files

2016-06-07 Thread Uma Maheswara Rao G (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10473?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Uma Maheswara Rao G updated HDFS-10473:
---
Attachment: HDFS-10473-01.patch

Here is a patch that rejects setting a storage policy on striped files.

> Allow only suitable storage policies to be set on striped files
> ---
>
> Key: HDFS-10473
> URL: https://issues.apache.org/jira/browse/HDFS-10473
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: namenode
>Reporter: Uma Maheswara Rao G
>Assignee: Uma Maheswara Rao G
> Attachments: HDFS-10473-01.patch
>
>
> Currently existing storage policies are not suitable for striped layout files.
> This JIRA proposes to reject setting storage policy on striped files.
> Another thought is to allow only suitable storage polices like ALL_SSD.
> Since the major use case of EC is for cold data, this may not be at high 
> importance. So, I am ok to reject setting storage policy on striped files at 
> this stage. Please suggest if others have some thoughts on this.
> Thanks [~zhz] for offline discussion on this.






[jira] [Created] (HDFS-10473) Allow only suitable storage policies to be set on striped files

2016-06-01 Thread Uma Maheswara Rao G (JIRA)
Uma Maheswara Rao G created HDFS-10473:
--

 Summary: Allow only suitable storage policies to be set on striped 
files
 Key: HDFS-10473
 URL: https://issues.apache.org/jira/browse/HDFS-10473
 Project: Hadoop HDFS
  Issue Type: Sub-task
  Components: namenode
Reporter: Uma Maheswara Rao G
Assignee: Uma Maheswara Rao G


Currently existing storage policies are not suitable for striped layout files.
This JIRA proposes to reject setting storage policy on striped files.

Another thought is to allow only suitable storage polices like ALL_SSD.
Since the major use case of EC is for cold data, this may not be at high 
importance. So, I am ok to reject setting storage policy on striped files at 
this stage. Please suggest if others have some thoughts on this.

Thanks [~zhz] for offline discussion on this.






[jira] [Comment Edited] (HDFS-10285) Storage Policy Satisfier in Namenode

2016-05-12 Thread Uma Maheswara Rao G (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10285?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15281164#comment-15281164
 ] 

Uma Maheswara Rao G edited comment on HDFS-10285 at 5/13/16 4:51 AM:
-

Attached the initial version of the document. Please help review it, and we 
can improve the document based on feedback.

Thanks [~rakeshr] for co-authoring on the design doc. Thanks [~anoopsamjohn], 
[~drankye], [~ram_krish],[~jingcheng...@intel.com] for helping on reviews.

Thanks,
Uma & Rakesh


was (Author: umamaheswararao):
Attached the initial version of document. Please help in review and we can 
improve the document based on feedbacks.

> Storage Policy Satisfier in Namenode
> 
>
> Key: HDFS-10285
> URL: https://issues.apache.org/jira/browse/HDFS-10285
> Project: Hadoop HDFS
>  Issue Type: New Feature
>  Components: datanode, namenode
>Affects Versions: 2.7.2
>Reporter: Uma Maheswara Rao G
>Assignee: Uma Maheswara Rao G
> Attachments: Storage-Policy-Satisfier-in-HDFS-May10.pdf
>
>
> Heterogeneous storage in HDFS introduced the concept of storage policies. 
> These policies can be set on a directory/file to specify the user's 
> preference for where to store the physical blocks. When the user sets the 
> storage policy before writing data, the blocks can take advantage of the 
> storage policy preference and the physical blocks are stored accordingly.
> If the user sets the storage policy after writing and completing the file, 
> the blocks will already have been written with the default storage policy 
> (namely DISK). The user then has to run the ‘Mover tool’ explicitly, 
> specifying all such file names as a list. In some distributed system 
> scenarios (e.g., HBase) it would be difficult to collect all the files and 
> run the tool, as different nodes can write files independently and the files 
> can have different paths.
> Another scenario: when the user renames a file covered by one storage policy 
> (inherited from its parent directory) into a directory covered by another 
> storage policy, the inherited storage policy is not copied from the source; 
> the file instead takes the destination file/dir parent's storage policy. 
> This rename operation is just a metadata change in the Namenode, so the 
> physical blocks still remain with the source storage policy.
> Tracking all such business-logic-driven file names across distributed nodes 
> (e.g., region servers) and running the Mover tool could be difficult for 
> admins.
> The proposal here is to provide an API in the Namenode itself to trigger 
> storage policy satisfaction. A daemon thread inside the Namenode would track 
> such calls and dispatch movement commands to the DNs.
> Will post the detailed design thoughts document soon.






[jira] [Updated] (HDFS-8430) Erasure coding: compute file checksum for striped files (stripe by stripe)

2016-05-11 Thread Uma Maheswara Rao G (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8430?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Uma Maheswara Rao G updated HDFS-8430:
--
Summary: Erasure coding: compute file checksum for striped files (stripe by 
stripe)  (was: Erasure coding: compute file checksum for stripe files)

> Erasure coding: compute file checksum for striped files (stripe by stripe)
> --
>
> Key: HDFS-8430
> URL: https://issues.apache.org/jira/browse/HDFS-8430
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Affects Versions: HDFS-7285
>Reporter: Walter Su
>Assignee: Kai Zheng
>  Labels: hdfs-ec-3.0-must-do
> Attachments: HDFS-8430-poc1.patch
>
>
> HADOOP-3981 introduces a distributed file checksum algorithm. It is designed 
> for replicated blocks.
> {{DFSClient.getFileChecksum()}} needs some updates so it can work for 
> striped block groups.






[jira] [Commented] (HDFS-8430) Erasure coding: compute file checksum for stripe files

2016-05-11 Thread Uma Maheswara Rao G (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8430?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15281220#comment-15281220
 ] 

Uma Maheswara Rao G commented on HDFS-8430:
---

[~drankye] could you please revise the patch as per the plan? Do you think we 
can target this before 3.0 rc?

> Erasure coding: compute file checksum for stripe files
> --
>
> Key: HDFS-8430
> URL: https://issues.apache.org/jira/browse/HDFS-8430
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Affects Versions: HDFS-7285
>Reporter: Walter Su
>Assignee: Kai Zheng
>  Labels: hdfs-ec-3.0-must-do
> Attachments: HDFS-8430-poc1.patch
>
>
> HADOOP-3981 introduces a distributed file checksum algorithm. It is designed 
> for replicated blocks.
> {{DFSClient.getFileChecksum()}} needs some updates so it can work for 
> striped block groups.






[jira] [Commented] (HDFS-8287) DFSStripedOutputStream.writeChunk should not wait for writing parity

2016-05-11 Thread Uma Maheswara Rao G (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8287?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15281210#comment-15281210
 ] 

Uma Maheswara Rao G commented on HDFS-8287:
---

[~kaisasak] any update on this?

> DFSStripedOutputStream.writeChunk should not wait for writing parity 
> -
>
> Key: HDFS-8287
> URL: https://issues.apache.org/jira/browse/HDFS-8287
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: erasure-coding, hdfs-client
>Reporter: Tsz Wo Nicholas Sze
>Assignee: Kai Sasaki
> Attachments: HDFS-8287-HDFS-7285.00.patch, 
> HDFS-8287-HDFS-7285.01.patch, HDFS-8287-HDFS-7285.02.patch, 
> HDFS-8287-HDFS-7285.03.patch, HDFS-8287-HDFS-7285.04.patch, 
> HDFS-8287-HDFS-7285.05.patch, HDFS-8287-HDFS-7285.06.patch, 
> HDFS-8287-HDFS-7285.07.patch, HDFS-8287-HDFS-7285.08.patch, 
> HDFS-8287-HDFS-7285.09.patch, HDFS-8287-HDFS-7285.10.patch, 
> HDFS-8287-HDFS-7285.11.patch, HDFS-8287-HDFS-7285.WIP.patch, 
> HDFS-8287-performance-report.pdf, HDFS-8287.12.patch, HDFS-8287.13.patch, 
> HDFS-8287.14.patch, HDFS-8287.15.patch, h8287_20150911.patch, jstack-dump.txt
>
>
> When a striping cell is full, writeChunk computes and generates parity 
> packets. It sequentially calls waitAndQueuePacket, so the user client cannot 
> continue to write data until it finishes.
> We should instead allow the user client to continue writing, rather than 
> blocking it while parity is being written.
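The non-blocking behavior described above can be sketched as handing parity work off to a background thread, so the writer never waits on parity packets. This is only an illustrative sketch under assumed semantics, not the actual DFSStripedOutputStream code; the class name, the queue, and the toy XOR "parity" below are all hypothetical stand-ins for the real codec and streamer plumbing.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Hypothetical sketch: when a striping cell fills, parity generation is
// offloaded to a single background worker, so the calling writer thread
// returns immediately instead of blocking the way writeChunk originally did.
public class AsyncParityWriter implements AutoCloseable {
    private final BlockingQueue<byte[]> parityQueue = new ArrayBlockingQueue<>(64);
    private final ExecutorService parityWorker = Executors.newSingleThreadExecutor();

    // Called by the writer thread when a striping cell becomes full.
    public void onCellFull(byte[] cellData) {
        parityWorker.execute(() -> {
            byte[] parity = encodeParity(cellData); // placeholder for the real codec
            parityQueue.offer(parity);              // drained by the parity streamer
        });
        // Returns immediately: the user client keeps writing data chunks.
    }

    // Toy "parity" (byte-wise XOR with a constant), NOT Reed-Solomon.
    private byte[] encodeParity(byte[] data) {
        byte[] parity = new byte[data.length];
        for (int i = 0; i < data.length; i++) {
            parity[i] = (byte) (data[i] ^ 0x5A);
        }
        return parity;
    }

    // Blocks briefly for a parity buffer; null if none arrives in time.
    public byte[] takeParity() {
        try {
            return parityQueue.poll(5, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return null;
        }
    }

    @Override
    public void close() {
        parityWorker.shutdown();
    }
}
```

The design point is that only the hand-off is synchronous; the expensive encoding and the parity packet queueing happen off the data path.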






[jira] [Commented] (HDFS-9079) Erasure coding: preallocate multiple generation stamps and serialize updates from data streamers

2016-05-11 Thread Uma Maheswara Rao G (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9079?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15281208#comment-15281208
 ] 

Uma Maheswara Rao G commented on HDFS-9079:
---

[~zhz], could you please take a look at find bug comments?

> Erasure coding: preallocate multiple generation stamps and serialize updates 
> from data streamers
> 
>
> Key: HDFS-9079
> URL: https://issues.apache.org/jira/browse/HDFS-9079
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: erasure-coding
>Affects Versions: HDFS-7285
>Reporter: Zhe Zhang
>Assignee: Zhe Zhang
> Attachments: HDFS-9079-HDFS-7285.00.patch, HDFS-9079.01.patch, 
> HDFS-9079.02.patch, HDFS-9079.03.patch, HDFS-9079.04.patch, 
> HDFS-9079.05.patch, HDFS-9079.06.patch, HDFS-9079.07.patch, 
> HDFS-9079.08.patch, HDFS-9079.09.patch, HDFS-9079.10.patch, 
> HDFS-9079.11.patch, HDFS-9079.12.patch, HDFS-9079.13.patch, 
> HDFS-9079.14.patch, HDFS-9079.15.patch
>
>
> A non-striped DataStreamer goes through the following steps in error handling:
> {code}
> 1) Finds error => 2) Asks NN for new GS => 3) Gets new GS from NN => 4) 
> Applies new GS to DN (createBlockOutputStream) => 5) Ack from DN => 6) 
> Updates block on NN
> {code}
> With multiple streamer threads running in parallel, we need to correctly handle a 
> large number of possible combinations of interleaved thread events. For 
> example, {{streamer_B}} starts step 2 in between events {{streamer_A.2}} and 
> {{streamer_A.3}}.
> HDFS-9040 moves steps 1, 2, 3, 6 from streamer to {{DFSStripedOutputStream}}. 
> This JIRA proposes some further optimizations based on HDFS-9040:
> # We can preallocate GS when NN creates a new striped block group 
> ({{FSN#createNewBlock}}). For each new striped block group we can reserve 
> {{NUM_PARITY_BLOCKS}} GS's. If more than {{NUM_PARITY_BLOCKS}} errors have 
> happened we shouldn't try to further recover anyway.
> # We can use a dedicated event processor to offload the error handling logic 
> from {{DFSStripedOutputStream}}, which is not a long running daemon.
> # We can limit the lifespan of a streamer to be a single block. A streamer 
> ends either after finishing the current block or when encountering a DN 
> failure.
> With the proposed change, a {{StripedDataStreamer}}'s flow becomes:
> {code}
> 1) Finds DN error => 2) Notify coordinator (async, not waiting for response) 
> => terminates
> 1) Finds external error => 2) Applies new GS to DN (createBlockOutputStream) 
> => 3) Ack from DN => 4) Notify coordinator (async, not waiting for response)
> {code}
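Item 1 of the proposal above (preallocating {{NUM_PARITY_BLOCKS}} generation stamps per striped block group) can be illustrated with a toy reservation class. This is a hedged sketch under assumed semantics: the class and method names are invented, and the real NameNode-side logic in FSN#createNewBlock differs.

```java
// Toy sketch of generation-stamp preallocation for one striped block group:
// the NameNode reserves NUM_PARITY_BLOCKS extra stamps up front, so each
// recoverable DN failure consumes one without a NameNode round trip.
public class ReservedGenerationStamps {
    public static final int NUM_PARITY_BLOCKS = 3; // assumes an RS(6,3) schema

    private final long firstReserved;
    private int used = 0;

    // In the real design the reserved range would come from the NameNode
    // when it allocates the block group.
    public ReservedGenerationStamps(long firstReserved) {
        this.firstReserved = firstReserved;
    }

    // Returns the next reserved GS, or -1 once more than NUM_PARITY_BLOCKS
    // failures have occurred (at that point the group is unrecoverable anyway,
    // matching the rationale quoted above).
    public long nextOnFailure() {
        if (used >= NUM_PARITY_BLOCKS) {
            return -1;
        }
        return firstReserved + used++;
    }
}
```

The point of the reservation is exactly the one stated in the proposal: bounding recovery attempts by the parity count makes the per-group GS budget fixed and known in advance.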






[jira] [Resolved] (HDFS-7350) WebHDFS: Support EC commands through webhdfs

2016-05-11 Thread Uma Maheswara Rao G (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7350?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Uma Maheswara Rao G resolved HDFS-7350.
---
Resolution: Invalid

> WebHDFS: Support EC commands through webhdfs
> 
>
> Key: HDFS-7350
> URL: https://issues.apache.org/jira/browse/HDFS-7350
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Affects Versions: HDFS-7285
>Reporter: Uma Maheswara Rao G
>Assignee: Uma Maheswara Rao G
>







[jira] [Updated] (HDFS-10285) Storage Policy Satisfier in Namenode

2016-05-11 Thread Uma Maheswara Rao G (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10285?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Uma Maheswara Rao G updated HDFS-10285:
---
Attachment: Storage-Policy-Satisfier-in-HDFS-May10.pdf

Attached the initial version of the document. Please help review it, and we can 
improve it based on feedback.

> Storage Policy Satisfier in Namenode
> 
>
> Key: HDFS-10285
> URL: https://issues.apache.org/jira/browse/HDFS-10285
> Project: Hadoop HDFS
>  Issue Type: New Feature
>  Components: datanode, namenode
>Affects Versions: 2.7.2
>Reporter: Uma Maheswara Rao G
>Assignee: Uma Maheswara Rao G
> Attachments: Storage-Policy-Satisfier-in-HDFS-May10.pdf
>
>
> Heterogeneous storage in HDFS introduced the concept of storage policies. 
> These policies can be set on a directory/file to specify the user's 
> preference for where to store the physical blocks. When the user sets the 
> storage policy before writing data, the blocks can take advantage of the 
> storage policy preferences and the physical blocks are stored accordingly. 
> If the user sets the storage policy after writing and completing the file, 
> then the blocks would have been written with the default storage policy 
> (namely DISK). The user has to run the 'Mover tool' explicitly, specifying 
> all such file names as a list. In some distributed system scenarios (e.g. 
> HBase) it would be difficult to collect all the files and run the tool, as 
> different nodes can write files separately and files can have different paths.
> Another scenario is when the user renames files from a directory with an 
> effective storage policy (inherited from the parent directory) into a 
> directory with a different effective storage policy: the inherited storage 
> policy is not copied from the source, so the policy of the destination 
> file/dir's parent takes effect. This rename operation is just a metadata 
> change in the Namenode; the physical blocks still remain with the source 
> storage policy.
> So, tracking all such business-logic-based file names from distributed 
> nodes (e.g. region servers) and running the Mover tool could be difficult 
> for admins. Here the proposal is to provide an API in the Namenode itself to 
> trigger storage policy satisfaction. A daemon thread inside the Namenode 
> would track such calls and send movement commands to the DNs. 
> Will post the detailed design document soon. 






[jira] [Commented] (HDFS-10285) Storage Policy Satisfier in Namenode

2016-04-27 Thread Uma Maheswara Rao G (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10285?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15261581#comment-15261581
 ] 

Uma Maheswara Rao G commented on HDFS-10285:


{quote}
So does this mean there would be a need to reserve and copy the inherited 
storage policy in distcp tool?
{quote}
The current implementation does not copy the source storage policy. What do you 
mean by preserve here? Sorry, I did not follow this. Could you elaborate a bit?

{quote}
Yeah, having an API to allow applications to trigger the mover behavior sounds 
good. As mentioned in the proposal, there is a need in HBase on HDFS HSM. Maybe 
Jingcheng Du and Wei Zhou could have detailed description about this as I know 
you have the relevant work.
{quote}
That will be great!

Thanks a lot, Kai for your comments.

> Storage Policy Satisfier in Namenode
> 
>
> Key: HDFS-10285
> URL: https://issues.apache.org/jira/browse/HDFS-10285
> Project: Hadoop HDFS
>  Issue Type: New Feature
>  Components: datanode, namenode
>Affects Versions: 2.7.2
>Reporter: Uma Maheswara Rao G
>Assignee: Uma Maheswara Rao G
>
> Heterogeneous storage in HDFS introduced the concept of storage policies. 
> These policies can be set on a directory/file to specify the user's 
> preference for where to store the physical blocks. When the user sets the 
> storage policy before writing data, the blocks can take advantage of the 
> storage policy preferences and the physical blocks are stored accordingly. 
> If the user sets the storage policy after writing and completing the file, 
> then the blocks would have been written with the default storage policy 
> (namely DISK). The user has to run the 'Mover tool' explicitly, specifying 
> all such file names as a list. In some distributed system scenarios (e.g. 
> HBase) it would be difficult to collect all the files and run the tool, as 
> different nodes can write files separately and files can have different paths.
> Another scenario is when the user renames files from a directory with an 
> effective storage policy (inherited from the parent directory) into a 
> directory with a different effective storage policy: the inherited storage 
> policy is not copied from the source, so the policy of the destination 
> file/dir's parent takes effect. This rename operation is just a metadata 
> change in the Namenode; the physical blocks still remain with the source 
> storage policy.
> So, tracking all such business-logic-based file names from distributed 
> nodes (e.g. region servers) and running the Mover tool could be difficult 
> for admins. Here the proposal is to provide an API in the Namenode itself to 
> trigger storage policy satisfaction. A daemon thread inside the Namenode 
> would track such calls and send movement commands to the DNs. 
> Will post the detailed design document soon. 





[jira] [Created] (HDFS-10285) Storage Policy Satisfier in Namenode

2016-04-13 Thread Uma Maheswara Rao G (JIRA)
Uma Maheswara Rao G created HDFS-10285:
--

 Summary: Storage Policy Satisfier in Namenode
 Key: HDFS-10285
 URL: https://issues.apache.org/jira/browse/HDFS-10285
 Project: Hadoop HDFS
  Issue Type: New Feature
  Components: datanode, namenode
Affects Versions: 2.7.2
Reporter: Uma Maheswara Rao G
Assignee: Uma Maheswara Rao G


Heterogeneous storage in HDFS introduced the concept of storage policies. These 
policies can be set on a directory/file to specify the user's preference for 
where to store the physical blocks. When the user sets the storage policy before 
writing data, the blocks can take advantage of the storage policy preferences 
and the physical blocks are stored accordingly. 

If the user sets the storage policy after writing and completing the file, then 
the blocks would have been written with the default storage policy (namely 
DISK). The user has to run the 'Mover tool' explicitly, specifying all such 
file names as a list. In some distributed system scenarios (e.g. HBase) it 
would be difficult to collect all the files and run the tool, as different 
nodes can write files separately and files can have different paths.

Another scenario is when the user renames files from a directory with an 
effective storage policy (inherited from the parent directory) into a directory 
with a different effective storage policy: the inherited storage policy is not 
copied from the source, so the policy of the destination file/dir's parent 
takes effect. This rename operation is just a metadata change in the Namenode; 
the physical blocks still remain with the source storage policy.

So, tracking all such business-logic-based file names from distributed nodes 
(e.g. region servers) and running the Mover tool could be difficult for admins. 

Here the proposal is to provide an API in the Namenode itself to trigger 
storage policy satisfaction. A daemon thread inside the Namenode would track 
such calls and send movement commands to the DNs. 

Will post the detailed design document soon. 
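The core satisfier step described above — compare the storage types a block's replicas currently occupy against what the policy expects, and derive movement commands for the mismatches — can be sketched as a small pure function. The names and the multiset-difference semantics here are illustrative assumptions, not the eventual HDFS-10285 API.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Sketch: given the storage types a policy expects for a block's replicas and
// the types the replicas currently sit on, compute which moves are needed.
public class PolicySatisfierSketch {
    // Returns "from->to" descriptions, one per replica violating the policy.
    public static List<String> computeMoves(List<String> expected, List<String> actual) {
        Map<String, Integer> want = count(expected);
        Map<String, Integer> have = count(actual);
        List<String> sources = new ArrayList<>(); // surplus types to move away from
        List<String> targets = new ArrayList<>(); // missing types to move onto
        for (Map.Entry<String, Integer> e : have.entrySet()) {
            int surplus = e.getValue() - want.getOrDefault(e.getKey(), 0);
            for (int i = 0; i < surplus; i++) sources.add(e.getKey());
        }
        for (Map.Entry<String, Integer> e : want.entrySet()) {
            int missing = e.getValue() - have.getOrDefault(e.getKey(), 0);
            for (int i = 0; i < missing; i++) targets.add(e.getKey());
        }
        List<String> moves = new ArrayList<>();
        for (int i = 0; i < Math.min(sources.size(), targets.size()); i++) {
            moves.add(sources.get(i) + "->" + targets.get(i));
        }
        return moves;
    }

    private static Map<String, Integer> count(List<String> types) {
        Map<String, Integer> m = new HashMap<>();
        for (String t : types) m.merge(t, 1, Integer::sum);
        return m;
    }
}
```

For example, a block whose policy expects three ARCHIVE replicas but which sits on {DISK, DISK, ARCHIVE} would yield two DISK->ARCHIVE moves; in the proposal, the Namenode daemon would translate such results into DN movement commands.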





[jira] [Updated] (HDFS-9719) Refactoring ErasureCodingWorker into smaller reusable constructs

2016-04-06 Thread Uma Maheswara Rao G (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9719?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Uma Maheswara Rao G updated HDFS-9719:
--
   Resolution: Fixed
 Hadoop Flags: Reviewed
Fix Version/s: 3.0.0
   Status: Resolved  (was: Patch Available)

I have just committed this to trunk. Thanks Kai.

> Refactoring ErasureCodingWorker into smaller reusable constructs
> 
>
> Key: HDFS-9719
> URL: https://issues.apache.org/jira/browse/HDFS-9719
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Kai Zheng
>Assignee: Kai Zheng
> Fix For: 3.0.0
>
> Attachments: HDFS-9719-v1.patch, HDFS-9719-v2.patch, 
> HDFS-9719-v3.patch, HDFS-9719-v4.patch, HDFS-9719-v5.patch, 
> HDFS-9719-v6.patch, HDFS-9719-v7.patch, HDFS-9719-v8.patch, HDFS-9719-v9.patch
>
>
> This proposes refactoring {{ErasureCodingWorker}} into smaller constructs 
> that can be reused in other places, like block group checksum computing on 
> the datanode side. As discussed in HDFS-8430 and implemented in the 
> HDFS-9694 patch, checksum computing for striped block groups would be 
> distributed to the datanodes in the group, where missing/corrupted block 
> data should be reconstructed in order to recompute the block checksum. Most 
> of the needed code is in the current ErasureCodingWorker and could be reused 
> to avoid duplication. Fortunately, we have very good and complete tests, 
> which will make the refactoring much easier. The refactoring will also help 
> a lot with subsequent tasks in phase II for non-striped erasure-coded files 
> and blocks. 





[jira] [Commented] (HDFS-9719) Refactoring ErasureCodingWorker into smaller reusable constructs

2016-04-06 Thread Uma Maheswara Rao G (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9719?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15229666#comment-15229666
 ] 

Uma Maheswara Rao G commented on HDFS-9719:
---

Thanks [~drankye] for the update. The latest patch looks good to me.
+1; I will commit the patch soon.

> Refactoring ErasureCodingWorker into smaller reusable constructs
> 
>
> Key: HDFS-9719
> URL: https://issues.apache.org/jira/browse/HDFS-9719
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Kai Zheng
>Assignee: Kai Zheng
> Attachments: HDFS-9719-v1.patch, HDFS-9719-v2.patch, 
> HDFS-9719-v3.patch, HDFS-9719-v4.patch, HDFS-9719-v5.patch, 
> HDFS-9719-v6.patch, HDFS-9719-v7.patch, HDFS-9719-v8.patch, HDFS-9719-v9.patch
>
>
> This proposes refactoring {{ErasureCodingWorker}} into smaller constructs 
> that can be reused in other places, like block group checksum computing on 
> the datanode side. As discussed in HDFS-8430 and implemented in the 
> HDFS-9694 patch, checksum computing for striped block groups would be 
> distributed to the datanodes in the group, where missing/corrupted block 
> data should be reconstructed in order to recompute the block checksum. Most 
> of the needed code is in the current ErasureCodingWorker and could be reused 
> to avoid duplication. Fortunately, we have very good and complete tests, 
> which will make the refactoring much easier. The refactoring will also help 
> a lot with subsequent tasks in phase II for non-striped erasure-coded files 
> and blocks. 





[jira] [Resolved] (HDFS-9694) Make existing DFSClient#getFileChecksum() work for striped blocks

2016-03-26 Thread Uma Maheswara Rao G (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9694?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Uma Maheswara Rao G resolved HDFS-9694.
---
Resolution: Fixed

I have just committed this. Earlier it was my mistake: I missed adding a newly 
added file. Thanks

> Make existing DFSClient#getFileChecksum() work for striped blocks
> -
>
> Key: HDFS-9694
> URL: https://issues.apache.org/jira/browse/HDFS-9694
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Kai Zheng
>Assignee: Kai Zheng
> Fix For: 3.0.0
>
> Attachments: HDFS-9694-v1.patch, HDFS-9694-v2.patch, 
> HDFS-9694-v3.patch, HDFS-9694-v4.patch, HDFS-9694-v5.patch, 
> HDFS-9694-v6.patch, HDFS-9694-v7.patch, HDFS-9694-v8.patch, HDFS-9694-v9.patch
>
>
> This is a sub-task of HDFS-8430 and will make the existing API 
> {{FileSystem#getFileChecksum(path)}} work for striped files. It will also 
> refactor existing code and lay out basic work for subsequent tasks, like 
> support for the new API proposed there.





[jira] [Commented] (HDFS-9694) Make existing DFSClient#getFileChecksum() work for striped blocks

2016-03-26 Thread Uma Maheswara Rao G (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15213294#comment-15213294
 ] 

Uma Maheswara Rao G commented on HDFS-9694:
---

Thanks [~arpitagarwal] and [~kaisasak] for noticing and reverting. I will check 
and recommit it. Thanks

> Make existing DFSClient#getFileChecksum() work for striped blocks
> -
>
> Key: HDFS-9694
> URL: https://issues.apache.org/jira/browse/HDFS-9694
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Kai Zheng
>Assignee: Kai Zheng
> Fix For: 3.0.0
>
> Attachments: HDFS-9694-v1.patch, HDFS-9694-v2.patch, 
> HDFS-9694-v3.patch, HDFS-9694-v4.patch, HDFS-9694-v5.patch, 
> HDFS-9694-v6.patch, HDFS-9694-v7.patch, HDFS-9694-v8.patch, HDFS-9694-v9.patch
>
>
> This is a sub-task of HDFS-8430 and will make the existing API 
> {{FileSystem#getFileChecksum(path)}} work for striped files. It will also 
> refactor existing code and lay out basic work for subsequent tasks, like 
> support for the new API proposed there.





[jira] [Updated] (HDFS-9694) Make existing DFSClient#getFileChecksum() work for striped blocks

2016-03-26 Thread Uma Maheswara Rao G (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9694?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Uma Maheswara Rao G updated HDFS-9694:
--
   Resolution: Fixed
 Hadoop Flags: Reviewed
Fix Version/s: 3.0.0
 Release Note: Makes the getFileChecksum API work with striped-layout EC 
files. Checksum computation is done at the block level in a distributed 
fashion. The current API does not support comparing the checksum generated for 
a normal file with the checksum generated for the same file in striped layout.
   Status: Resolved  (was: Patch Available)

I have just committed this patch to trunk.

> Make existing DFSClient#getFileChecksum() work for striped blocks
> -
>
> Key: HDFS-9694
> URL: https://issues.apache.org/jira/browse/HDFS-9694
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Kai Zheng
>Assignee: Kai Zheng
> Fix For: 3.0.0
>
> Attachments: HDFS-9694-v1.patch, HDFS-9694-v2.patch, 
> HDFS-9694-v3.patch, HDFS-9694-v4.patch, HDFS-9694-v5.patch, 
> HDFS-9694-v6.patch, HDFS-9694-v7.patch, HDFS-9694-v8.patch, HDFS-9694-v9.patch
>
>
> This is a sub-task of HDFS-8430 and will make the existing API 
> {{FileSystem#getFileChecksum(path)}} work for striped files. It will also 
> refactor existing code and lay out basic work for subsequent tasks, like 
> support for the new API proposed there.





[jira] [Commented] (HDFS-9694) Make existing DFSClient#getFileChecksum() work for striped blocks

2016-03-26 Thread Uma Maheswara Rao G (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15212855#comment-15212855
 ] 

Uma Maheswara Rao G commented on HDFS-9694:
---

Overall, the latest patch looks good to me. +1.
I will go ahead and push this patch shortly. Thanks, Kai, for your hard work 
on this.

> Make existing DFSClient#getFileChecksum() work for striped blocks
> -
>
> Key: HDFS-9694
> URL: https://issues.apache.org/jira/browse/HDFS-9694
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Kai Zheng
>Assignee: Kai Zheng
> Attachments: HDFS-9694-v1.patch, HDFS-9694-v2.patch, 
> HDFS-9694-v3.patch, HDFS-9694-v4.patch, HDFS-9694-v5.patch, 
> HDFS-9694-v6.patch, HDFS-9694-v7.patch, HDFS-9694-v8.patch, HDFS-9694-v9.patch
>
>
> This is a sub-task of HDFS-8430 and will make the existing API 
> {{FileSystem#getFileChecksum(path)}} work for striped files. It will also 
> refactor existing code and lay out basic work for subsequent tasks, like 
> support for the new API proposed there.





[jira] [Commented] (HDFS-9694) Make existing DFSClient#getFileChecksum() work for striped blocks

2016-03-22 Thread Uma Maheswara Rao G (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15207662#comment-15207662
 ] 

Uma Maheswara Rao G commented on HDFS-9694:
---

{quote}
I guess I'd better incorporate the change in the patch to be in together, so we 
may avoid the further change into the protocol. Sounds good?
{quote}
I would recommend not incorporating future related changes in this JIRA; let 
that change go into another JIRA when it's needed. I just wanted to know your 
idea because, if the plan is not to handle it with a flag, then the op name 
might need refining. But a flag is a good idea in general, so I am fine with 
the flag and with leaving that change to the next JIRA.


> Make existing DFSClient#getFileChecksum() work for striped blocks
> -
>
> Key: HDFS-9694
> URL: https://issues.apache.org/jira/browse/HDFS-9694
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Kai Zheng
>Assignee: Kai Zheng
> Attachments: HDFS-9694-v1.patch, HDFS-9694-v2.patch, 
> HDFS-9694-v3.patch, HDFS-9694-v4.patch, HDFS-9694-v5.patch, 
> HDFS-9694-v6.patch, HDFS-9694-v7.patch
>
>
> This is a sub-task of HDFS-8430 and will make the existing API 
> {{FileSystem#getFileChecksum(path)}} work for striped files. It will also 
> refactor existing code and lay out basic work for subsequent tasks, like 
> support for the new API proposed there.





[jira] [Commented] (HDFS-9694) Make existing DFSClient#getFileChecksum() work for striped blocks

2016-03-22 Thread Uma Maheswara Rao G (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15207137#comment-15207137
 ] 

Uma Maheswara Rao G commented on HDFS-9694:
---



Hi [~drankye], I have a question, 
{code}
 send(out, Op.BLOCK_GROUP_CHECKSUM, proto);
{code}
Are you planning to have a flag later to indicate striped or non-striped mode, 
or do you want to have a separate flag itself?

Wouldn't NonStripedBlockGroupChecksumComputer --> 
BlockGroupNonStripedChecksumComputer be more consistent with 
StripedFileNonStripedChecksumComputer?

Other than this, it mostly looks good to me. Once these are addressed, and if 
there are no objections from others, I plan to commit this. 

> Make existing DFSClient#getFileChecksum() work for striped blocks
> -
>
> Key: HDFS-9694
> URL: https://issues.apache.org/jira/browse/HDFS-9694
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Kai Zheng
>Assignee: Kai Zheng
> Attachments: HDFS-9694-v1.patch, HDFS-9694-v2.patch, 
> HDFS-9694-v3.patch, HDFS-9694-v4.patch, HDFS-9694-v5.patch, 
> HDFS-9694-v6.patch, HDFS-9694-v7.patch
>
>
> This is a sub-task of HDFS-8430 and will make the existing API 
> {{FileSystem#getFileChecksum(path)}} work for striped files. It will also 
> refactor existing code and lay out basic work for subsequent tasks, like 
> support for the new API proposed there.





[jira] [Commented] (HDFS-9694) Make existing DFSClient#getFileChecksum() work for striped blocks

2016-03-19 Thread Uma Maheswara Rao G (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15198216#comment-15198216
 ] 

Uma Maheswara Rao G commented on HDFS-9694:
---

Thanks [~drankye] for the update. It reads better now with the Stripe and 
NonStripe naming for the checksum computers.

The findbugs warnings are still showing up. I think you need to add the 
exclusions in 
./hadoop-hdfs-project/hadoop-hdfs-client/dev-support/findbugsExcludeFile.xml.
You may want to add your class in this section:
{noformat}
(The XML exclusion entries were stripped from the archived message.)
{noformat}


> Make existing DFSClient#getFileChecksum() work for striped blocks
> -
>
> Key: HDFS-9694
> URL: https://issues.apache.org/jira/browse/HDFS-9694
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Kai Zheng
>Assignee: Kai Zheng
> Attachments: HDFS-9694-v1.patch, HDFS-9694-v2.patch, 
> HDFS-9694-v3.patch, HDFS-9694-v4.patch, HDFS-9694-v5.patch, HDFS-9694-v6.patch
>
>
> This is a sub-task of HDFS-8430 and will make the existing API 
> {{FileSystem#getFileChecksum(path)}} work for striped files. It will also 
> refactor existing code and lay out basic work for subsequent tasks, like 
> support for the new API proposed there.





[jira] [Commented] (HDFS-9719) Refactoring ErasureCodingWorker into smaller reusable constructs

2016-03-15 Thread Uma Maheswara Rao G (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9719?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15194813#comment-15194813
 ] 

Uma Maheswara Rao G commented on HDFS-9719:
---

Thanks for the update [~drankye], I will have a look at it tomorrow. 

> Refactoring ErasureCodingWorker into smaller reusable constructs
> 
>
> Key: HDFS-9719
> URL: https://issues.apache.org/jira/browse/HDFS-9719
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Kai Zheng
>Assignee: Kai Zheng
> Attachments: HDFS-9719-v1.patch, HDFS-9719-v2.patch, 
> HDFS-9719-v3.patch, HDFS-9719-v4.patch, HDFS-9719-v5.patch, 
> HDFS-9719-v6.patch, HDFS-9719-v7.patch
>
>
> This proposes refactoring {{ErasureCodingWorker}} into smaller constructs 
> that can be reused in other places, like block group checksum computing on 
> the datanode side. As discussed in HDFS-8430 and implemented in the 
> HDFS-9694 patch, checksum computing for striped block groups would be 
> distributed to the datanodes in the group, where missing/corrupted block 
> data should be reconstructed in order to recompute the block checksum. Most 
> of the needed code is in the current ErasureCodingWorker and could be reused 
> to avoid duplication. Fortunately, we have very good and complete tests, 
> which will make the refactoring much easier. The refactoring will also help 
> a lot with subsequent tasks in phase II for non-striped erasure-coded files 
> and blocks. 





[jira] [Commented] (HDFS-9694) Make existing DFSClient#getFileChecksum() work for striped blocks

2016-03-15 Thread Uma Maheswara Rao G (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15194786#comment-15194786
 ] 

Uma Maheswara Rao G commented on HDFS-9694:
---

[~drankye], Thanks for working on this patch. Great work.
Here are my comments/questions:

# StripedBlockChecksumComputer: It seems we don't actually do anything with 
stripes here; we are calculating checksums at the block level, right? So this 
should be something like 
StripedBGBlockCheksumComputer/BlockGroupBlockChecksumComputer, so that later, 
when we implement pure stripe-based checksum calculation, we could name it 
StripedBGStripeCheksumComputer/BlockGroupStripedChecksumComputer. Just a 
thought.
# StripedFileChecksumComputer: Same as above. I am not strong on the name at 
this point, but here is my thought: how about 
StripedFileBlkLevelChecksumComputer now, and 
StripedFileStripeLevelChecksumComputer later when we compute stripe-based 
checksums? Just a thought; it would be great if we can find more meaningful 
names.
# I think the client's getFileCheckSum should have proper javadoc explaining 
that when this API is called on a striped file, a block-level checksum is 
calculated, not a stripe-level one, and the result can't be compared with a 
replicated file's checksum result. 

> Make existing DFSClient#getFileChecksum() work for striped blocks
> -
>
> Key: HDFS-9694
> URL: https://issues.apache.org/jira/browse/HDFS-9694
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Kai Zheng
>Assignee: Kai Zheng
> Attachments: HDFS-9694-v1.patch, HDFS-9694-v2.patch, 
> HDFS-9694-v3.patch, HDFS-9694-v4.patch, HDFS-9694-v5.patch
>
>
> This is a sub-task of HDFS-8430 and will make the existing API 
> {{FileSystem#getFileChecksum(path)}} work for striped files. It will also 
> refactor existing code and lay out basic work for subsequent tasks, like 
> support for the new API proposed there.





[jira] [Commented] (HDFS-8030) HDFS Erasure Coding Phase II -- EC with contiguous layout

2016-03-14 Thread Uma Maheswara Rao G (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8030?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15194425#comment-15194425
 ] 

Uma Maheswara Rao G commented on HDFS-8030:
---

[~zhz], Rakesh and I discussed the document again for some time. Please find 
the points below to discuss/address, and share your opinion.
# As this design converts files into EC mode from the normal file layout, 
block groups need to be created later, at conversion time. Block groups are 
generally allocated contiguous block IDs, but how do we obtain contiguous 
block IDs when converting?
# Does this create memory overhead, since we need to track block groups 
separately if the block IDs are not contiguous, as discussed in #1?
# Parity creation by reading the whole 6 blocks means bringing 6*256MB into 
memory. I think we need to think more about this point. We may need to keep 
contiguous block IDs but generate parity based on stripes: Blk_0, Blk_1…Blk_5 
are the contiguous blocks; we read a cell from each block, treat the cells as 
a stripe, and generate 3 parities, continuing until all the data in the blocks 
is processed. Need to think more on this.
# Do we support a mixed zone, containing both striped files and contiguous EC 
files?
Please, others, also review the document and share your feedback.
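Point 3 above — reading one cell from each contiguous block, treating those cells as a stripe, and generating parity stripe-by-stripe so that all six full blocks never sit in memory at once — can be sketched with a toy single-XOR parity. This is an assumption-laden illustration: the real scheme would use Reed-Solomon with 3 parity blocks and would stream cells from disk rather than hold whole blocks in arrays.

```java
// Sketch: stripe-wise parity over contiguous blocks. Instead of loading all
// data blocks (6 x 256MB) at once, process one cell per block at a time.
public class StripeParitySketch {
    // blocks: equal-length data block buffers; cellSize must divide the length.
    // Returns a single XOR parity block (the real design emits 3 RS parities).
    public static byte[] xorParity(byte[][] blocks, int cellSize) {
        int blockLen = blocks[0].length;
        byte[] parity = new byte[blockLen];
        for (int off = 0; off < blockLen; off += cellSize) {
            // One "stripe": the cell at offset 'off' from every data block.
            // Only these cells (not whole blocks) need to be resident at once.
            for (byte[] block : blocks) {
                for (int i = off; i < off + cellSize; i++) {
                    parity[i] ^= block[i];
                }
            }
        }
        return parity;
    }
}
```

The key property is that peak memory is proportional to (number of blocks) x (cell size), not (number of blocks) x (block size), while the block IDs themselves can remain contiguous.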

> HDFS Erasure Coding Phase II -- EC with contiguous layout
> -
>
> Key: HDFS-8030
> URL: https://issues.apache.org/jira/browse/HDFS-8030
> Project: Hadoop HDFS
>  Issue Type: New Feature
>  Components: erasure-coding
>Reporter: Zhe Zhang
>Assignee: Zhe Zhang
> Attachments: HDFSErasureCodingPhaseII-20151204.pdf
>
>
> Data redundancy form -- replication or erasure coding, should be orthogonal 
> to block layout -- contiguous or striped. This JIRA explores the combination 
> of {{Erasure Coding}} + {{Contiguous}} block layout.
> As will be detailed in the design document, key benefits include preserving 
> block locality, and easy conversion between hot and cold modes. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9719) Refactoring ErasureCodingWorker into smaller reusable constructs

2016-03-08 Thread Uma Maheswara Rao G (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9719?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15186493#comment-15186493
 ] 

Uma Maheswara Rao G commented on HDFS-9719:
---

[~drankye], thanks, Kai, for working on this JIRA. Overall the changes look 
good; I have the following comments/questions, though.

# I still see some variable names like {{toRecoverLen}}. Can we take the 
chance in this patch to rename them, e.g. to {{toReconstructLen}}?
# {{doReadMinimum}}: this method name looks wrong. It actually reads from the 
minimum required source datanodes, but the name suggests it reads some 
minimum data length.
# I think the current refactored names do not represent what the classes 
actually do. For example, {{StripedReaders}} looks like a holder class, but 
it does more than that. Also, I assume a {{StripedReader}} itself should 
handle multiple chunk/cell readers. So how about renaming {{StripedReaders}} 
-> {{StripedReader}} and {{StripedReader}} -> {{StripedChunkReader}}? The 
same comment applies to {{StripedWriter*}}.
Please check whether this naming makes sense to you. If you have better 
naming suggestions, that would be great.

> Refactoring ErasureCodingWorker into smaller reusable constructs
> 
>
> Key: HDFS-9719
> URL: https://issues.apache.org/jira/browse/HDFS-9719
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Kai Zheng
>Assignee: Kai Zheng
> Attachments: HDFS-9719-v1.patch, HDFS-9719-v2.patch, 
> HDFS-9719-v3.patch, HDFS-9719-v4.patch, HDFS-9719-v5.patch, HDFS-9719-v6.patch
>
>
> This proposes refactoring {{ErasureCodingWorker}} into smaller constructs 
> that can be reused in other places, such as block group checksum computing 
> on the datanode side. As discussed in HDFS-8430 and implemented in the 
> HDFS-9694 patch, checksum computing for striped block groups would be 
> distributed to the datanodes in the group, where missed/corrupted block 
> data should be reconstructable in order to recompute the block checksum. 
> Most of the needed code is in the current {{ErasureCodingWorker}} and could 
> be reused to avoid duplication. Fortunately, we have very good and complete 
> tests, which will make the refactoring much easier. The refactoring will 
> also help a lot with subsequent phase II tasks for non-striped 
> erasure-coded files and blocks. 





[jira] [Commented] (HDFS-9719) Refactoring ErasureCodingWorker into smaller reusable constructs

2016-03-07 Thread Uma Maheswara Rao G (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9719?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15184285#comment-15184285
 ] 

Uma Maheswara Rao G commented on HDFS-9719:
---

Thanks [~rakeshr] for pinging me. I will take a look today. Thanks Kai for the 
work.

> Refactoring ErasureCodingWorker into smaller reusable constructs
> 
>
> Key: HDFS-9719
> URL: https://issues.apache.org/jira/browse/HDFS-9719
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Kai Zheng
>Assignee: Kai Zheng
> Attachments: HDFS-9719-v1.patch, HDFS-9719-v2.patch, 
> HDFS-9719-v3.patch, HDFS-9719-v4.patch, HDFS-9719-v5.patch, HDFS-9719-v6.patch
>
>
> This proposes refactoring {{ErasureCodingWorker}} into smaller constructs 
> that can be reused in other places, such as block group checksum computing 
> on the datanode side. As discussed in HDFS-8430 and implemented in the 
> HDFS-9694 patch, checksum computing for striped block groups would be 
> distributed to the datanodes in the group, where missed/corrupted block 
> data should be reconstructable in order to recompute the block checksum. 
> Most of the needed code is in the current {{ErasureCodingWorker}} and could 
> be reused to avoid duplication. Fortunately, we have very good and complete 
> tests, which will make the refactoring much easier. The refactoring will 
> also help a lot with subsequent phase II tasks for non-striped 
> erasure-coded files and blocks. 





[jira] [Updated] (HDFS-9733) Refactor DFSClient#getFileChecksum and DataXceiver#blockChecksum

2016-02-29 Thread Uma Maheswara Rao G (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9733?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Uma Maheswara Rao G updated HDFS-9733:
--
   Resolution: Fixed
 Hadoop Flags: Reviewed
Fix Version/s: 3.0.0
   Status: Resolved  (was: Patch Available)

> Refactor DFSClient#getFileChecksum and DataXceiver#blockChecksum
> 
>
> Key: HDFS-9733
> URL: https://issues.apache.org/jira/browse/HDFS-9733
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Kai Zheng
>Assignee: Kai Zheng
> Fix For: 3.0.0
>
> Attachments: HDFS-9733-v1.patch, HDFS-9733-v2.patch, 
> HDFS-9733-v3.patch, HDFS-9733-v4.patch, HDFS-9733-v5.patch, 
> HDFS-9733-v6.patch, HDFS-9733-v7.patch, HDFS-9733-v8.patch, HDFS-9733-v9.patch
>
>
> To prepare for file checksum computing for striped files, this refactors 
> the existing code in {{DFSClient#getFileChecksum}} and 
> {{DataXceiver#blockChecksum}} to make HDFS-8430 and HDFS-9694 easier.





[jira] [Commented] (HDFS-9733) Refactor DFSClient#getFileChecksum and DataXceiver#blockChecksum

2016-02-29 Thread Uma Maheswara Rao G (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9733?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15173271#comment-15173271
 ] 

Uma Maheswara Rao G commented on HDFS-9733:
---

I have just committed this to trunk. Thanks a lot, Kai for the work.

> Refactor DFSClient#getFileChecksum and DataXceiver#blockChecksum
> 
>
> Key: HDFS-9733
> URL: https://issues.apache.org/jira/browse/HDFS-9733
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Kai Zheng
>Assignee: Kai Zheng
> Attachments: HDFS-9733-v1.patch, HDFS-9733-v2.patch, 
> HDFS-9733-v3.patch, HDFS-9733-v4.patch, HDFS-9733-v5.patch, 
> HDFS-9733-v6.patch, HDFS-9733-v7.patch, HDFS-9733-v8.patch, HDFS-9733-v9.patch
>
>
> To prepare for file checksum computing for striped files, this refactors 
> the existing code in {{DFSClient#getFileChecksum}} and 
> {{DataXceiver#blockChecksum}} to make HDFS-8430 and HDFS-9694 easier.





[jira] [Commented] (HDFS-9733) Refactor DFSClient#getFileChecksum and DataXceiver#blockChecksum

2016-02-29 Thread Uma Maheswara Rao G (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9733?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15173270#comment-15173270
 ] 

Uma Maheswara Rao G commented on HDFS-9733:
---

+1 on the latest patch.

> Refactor DFSClient#getFileChecksum and DataXceiver#blockChecksum
> 
>
> Key: HDFS-9733
> URL: https://issues.apache.org/jira/browse/HDFS-9733
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Kai Zheng
>Assignee: Kai Zheng
> Attachments: HDFS-9733-v1.patch, HDFS-9733-v2.patch, 
> HDFS-9733-v3.patch, HDFS-9733-v4.patch, HDFS-9733-v5.patch, 
> HDFS-9733-v6.patch, HDFS-9733-v7.patch, HDFS-9733-v8.patch, HDFS-9733-v9.patch
>
>
> To prepare for file checksum computing for striped files, this refactors 
> the existing code in {{DFSClient#getFileChecksum}} and 
> {{DataXceiver#blockChecksum}} to make HDFS-8430 and HDFS-9694 easier.





[jira] [Commented] (HDFS-9733) Refactor DFSClient#getFileChecksum and DataXceiver#blockChecksum

2016-02-24 Thread Uma Maheswara Rao G (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9733?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15162737#comment-15162737
 ] 

Uma Maheswara Rao G commented on HDFS-9733:
---

Thanks, Kai, for updating the patch.
For your question, I would prefer to keep this patch complete rather than 
leaving open points for another JIRA. So let's make every variable we can 
private here; when required, we can change them later in another JIRA.

> Refactor DFSClient#getFileChecksum and DataXceiver#blockChecksum
> 
>
> Key: HDFS-9733
> URL: https://issues.apache.org/jira/browse/HDFS-9733
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Kai Zheng
>Assignee: Kai Zheng
> Attachments: HDFS-9733-v1.patch, HDFS-9733-v2.patch, 
> HDFS-9733-v3.patch, HDFS-9733-v4.patch, HDFS-9733-v5.patch, HDFS-9733-v6.patch
>
>
> To prepare for file checksum computing for striped files, this refactors 
> the existing code in {{DFSClient#getFileChecksum}} and 
> {{DataXceiver#blockChecksum}} to make HDFS-8430 and HDFS-9694 easier.





[jira] [Commented] (HDFS-9733) Refactor DFSClient#getFileChecksum and DataXceiver#blockChecksum

2016-02-23 Thread Uma Maheswara Rao G (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9733?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15159607#comment-15159607
 ] 

Uma Maheswara Rao G commented on HDFS-9733:
---

[~drankye] I think one reason for the test failure could be that, when there 
are zero located blocks (e.g. for zero-size files), we never actually try any 
datanode, but here you set crcType only when tryDatanode is called. A default 
crcType should be set up front; I checked the older code, and there we set a 
default crcType by default. Please validate and update the patch accordingly. 
Thanks.
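
A minimal abstract sketch of the failure mode described above (the function 
names and the default value are hypothetical, not the actual DFSClient code):

```python
# Sketch: if crcType were only assigned inside try_datanode(), a file
# with zero located blocks would leave it unset. Initializing a
# default up front, as the older code did, avoids that.
DEFAULT_CRC = "CRC32C"  # hypothetical default value

def file_checksum(located_blocks, try_datanode):
    crc_type = DEFAULT_CRC        # default set before any datanode call
    block_checksums = []
    for blk in located_blocks:    # empty for a zero-size file
        crc_type, csum = try_datanode(blk)
        block_checksums.append(csum)
    return crc_type, block_checksums

# A zero-size file never reaches try_datanode, yet crc_type is valid.
result = file_checksum([], try_datanode=None)
```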

> Refactor DFSClient#getFileChecksum and DataXceiver#blockChecksum
> 
>
> Key: HDFS-9733
> URL: https://issues.apache.org/jira/browse/HDFS-9733
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Kai Zheng
>Assignee: Kai Zheng
> Attachments: HDFS-9733-v1.patch, HDFS-9733-v2.patch, 
> HDFS-9733-v3.patch, HDFS-9733-v4.patch
>
>
> To prepare for file checksum computing for striped files, this refactors 
> the existing code in {{DFSClient#getFileChecksum}} and 
> {{DataXceiver#blockChecksum}} to make HDFS-8430 and HDFS-9694 easier.





[jira] [Commented] (HDFS-9733) Refactor DFSClient#getFileChecksum and DataXceiver#blockChecksum

2016-02-23 Thread Uma Maheswara Rao G (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9733?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15158723#comment-15158723
 ] 

Uma Maheswara Rao G commented on HDFS-9733:
---

Hi [~drankye], the test failures seem to be related. Can you please fix them?
Also, please address all the checkstyle warnings.

> Refactor DFSClient#getFileChecksum and DataXceiver#blockChecksum
> 
>
> Key: HDFS-9733
> URL: https://issues.apache.org/jira/browse/HDFS-9733
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Kai Zheng
>Assignee: Kai Zheng
> Attachments: HDFS-9733-v1.patch, HDFS-9733-v2.patch, 
> HDFS-9733-v3.patch, HDFS-9733-v4.patch
>
>
> To prepare for file checksum computing for striped files, this refactors 
> the existing code in {{DFSClient#getFileChecksum}} and 
> {{DataXceiver#blockChecksum}} to make HDFS-8430 and HDFS-9694 easier.





[jira] [Commented] (HDFS-9733) Refactor DFSClient#getFileChecksum and DataXceiver#blockChecksum

2016-02-22 Thread Uma Maheswara Rao G (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9733?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15158354#comment-15158354
 ] 

Uma Maheswara Rao G commented on HDFS-9733:
---

Thanks Kai for updating the patch. The latest patch looks good to me. 
+1 pending Jenkins.

> Refactor DFSClient#getFileChecksum and DataXceiver#blockChecksum
> 
>
> Key: HDFS-9733
> URL: https://issues.apache.org/jira/browse/HDFS-9733
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Kai Zheng
>Assignee: Kai Zheng
> Attachments: HDFS-9733-v1.patch, HDFS-9733-v2.patch, 
> HDFS-9733-v3.patch, HDFS-9733-v4.patch
>
>
> To prepare for file checksum computing for striped files, this refactors 
> the existing code in {{DFSClient#getFileChecksum}} and 
> {{DataXceiver#blockChecksum}} to make HDFS-8430 and HDFS-9694 easier.





[jira] [Commented] (HDFS-8430) Erasure coding: compute file checksum for stripe files

2016-02-21 Thread Uma Maheswara Rao G (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8430?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15156530#comment-15156530
 ] 

Uma Maheswara Rao G commented on HDFS-8430:
---

[~drankye] I have just posted some comments in HDFS-9733. Please take a look.

> Erasure coding: compute file checksum for stripe files
> --
>
> Key: HDFS-8430
> URL: https://issues.apache.org/jira/browse/HDFS-8430
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Affects Versions: HDFS-7285
>Reporter: Walter Su
>Assignee: Kai Zheng
> Attachments: HDFS-8430-poc1.patch
>
>
> HADOOP-3981 introduced a distributed file checksum algorithm designed for 
> replicated blocks.
> {{DFSClient.getFileChecksum()}} needs some updates so it can work for 
> striped block groups.





[jira] [Commented] (HDFS-9733) Refactor DFSClient#getFileChecksum and DataXceiver#blockChecksum

2016-02-21 Thread Uma Maheswara Rao G (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9733?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15156529#comment-15156529
 ] 

Uma Maheswara Rao G commented on HDFS-9733:
---

Thanks for the refactor, Kai. Overall it looks good to me.
Following are my questions/comments:

# Why do we need two abstract classes here, {{AbstractBlockChecksumComputer}} 
and {{BlockChecksumComputer}}? Isn't one enough?
# Could you please add a doc comment for {{ReplicatedBlockChecksumComputer}} 
describing what it does?
# A small doc comment for the {{compute}} method as well?
# How is the {{CorruptedBlocks}} class related?
I think overall you can take this chance to improve the javadoc around these 
newly created classes. These classes were added in {{DFSUtilClient.java}}.

> Refactor DFSClient#getFileChecksum and DataXceiver#blockChecksum
> 
>
> Key: HDFS-9733
> URL: https://issues.apache.org/jira/browse/HDFS-9733
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Kai Zheng
>Assignee: Kai Zheng
> Attachments: HDFS-9733-v1.patch, HDFS-9733-v2.patch, 
> HDFS-9733-v3.patch
>
>
> To prepare for file checksum computing for striped files, this refactors 
> the existing code in {{DFSClient#getFileChecksum}} and 
> {{DataXceiver#blockChecksum}} to make HDFS-8430 and HDFS-9694 easier.




