[jira] [Commented] (HDFS-13123) RBF: Add a balancer tool to move data across subclusters

2018-02-09 Thread Chris Douglas (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-13123?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16359032#comment-16359032
 ] 

Chris Douglas commented on HDFS-13123:
--

bq. hard linking across block pools as one option and even tiered storage
Yes, [~virajith] and I wrote a prototype of this with an intern for a similar 
project. Making it a proper transaction is complicated, but architecturally RBF 
is in the right place to coordinate this cleanly.

We added APIs to generate and attach an FSImage for a NN subtree. Attaching an 
image required reallocating not only the inodeIds but also the blockIds, which 
were hardlinked into a contiguous range in the destination blockId space. We 
didn't solve all the failover and edit log cases, but these seem tractable as 
long as the subtree is immutable. Without that assumption (which RBF can 
enforce/detect) thar be dragons.
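
As a rough illustration of the ID reallocation described above (not the prototype's code), the step can be sketched as building old-to-new maps that place the subtree's inodeIds and blockIds into contiguous ranges in the destination namespace. Every name and number here (`IdRemap`, `remap_subtree_ids`, the sample IDs) is hypothetical, and the sketch assumes the subtree stays immutable while it runs:

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class IdRemap:
    """Old-to-new ID maps for one attached subtree (hypothetical structure)."""
    inodes: Dict[int, int]
    blocks: Dict[int, int]


def remap_subtree_ids(src_inode_ids: List[int],
                      src_block_ids: List[int],
                      next_dst_inode_id: int,
                      next_dst_block_id: int) -> IdRemap:
    """Give every source ID a fresh ID from a contiguous destination range.

    Assumes the subtree is immutable for the duration of the call, so the
    set of source IDs cannot change underneath the mapping.
    """
    # Sort so the assignment is deterministic; de-duplicate defensively.
    inodes = {old: next_dst_inode_id + i
              for i, old in enumerate(sorted(set(src_inode_ids)))}
    blocks = {old: next_dst_block_id + i
              for i, old in enumerate(sorted(set(src_block_ids)))}
    return IdRemap(inodes=inodes, blocks=blocks)


# Made-up sample IDs, for illustration only.
remap = remap_subtree_ids(
    src_inode_ids=[16390, 16385, 16402],
    src_block_ids=[1073741830, 1073741825],
    next_dst_inode_id=50000,
    next_dst_block_id=2000000000,
)
```

The sketch covers only the bookkeeping; the hard linking of block files, failover, and edit log handling mentioned above are out of scope.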

> RBF: Add a balancer tool to move data across subclusters
> ---
>
> Key: HDFS-13123
> URL: https://issues.apache.org/jira/browse/HDFS-13123
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Wei Yan
>Assignee: Wei Yan
>Priority: Major
> Attachments: HDFS Router-Based Federation Rebalancer.pdf
>
>
> Following the discussion in HDFS-12615, this Jira tracks the effort to build 
> a rebalancer tool that router-based federation can use to move data among 
> subclusters.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-13123) RBF: Add a balancer tool to move data across subclusters

2018-02-08 Thread Wei Yan (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-13123?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16357638#comment-16357638
 ] 

Wei Yan commented on HDFS-13123:


{quote}hard linking across block pools as one option and even tiered storage
{quote}
Yes, other people have mentioned this to me as well, but I don't have the 
details yet, so I just wrote "copy" instead of "distcp" in the doc ;)



[jira] [Commented] (HDFS-13123) RBF: Add a balancer tool to move data across subclusters

2018-02-08 Thread Íñigo Goiri (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-13123?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16357575#comment-16357575
 ] 

Íñigo Goiri commented on HDFS-13123:


Thanks [~ywskycn] for the doc. Right now we are leveraging DistCp for this but 
I remember there was some conversation about other options.
[~chris.douglas], I remember you mentioned doing hard linking across block 
pools as one option and even tiered storage; any thoughts?
In any case, I think we should start with DistCp but keep in mind the option to 
leverage other mechanisms.



[jira] [Commented] (HDFS-13123) RBF: Add a balancer tool to move data across subclusters

2018-02-08 Thread Íñigo Goiri (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-13123?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16357371#comment-16357371
 ] 

Íñigo Goiri commented on HDFS-13123:


Just for reference, the document in HDFS-10467 already mentions the Rebalancer 
in a few places, so this Jira will be the place to track that component in 
detail.



[jira] [Commented] (HDFS-13123) RBF: Add a balancer tool to move data across subclusters

2018-02-07 Thread Wei Yan (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-13123?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16356499#comment-16356499
 ] 

Wei Yan commented on HDFS-13123:


I'll post a quick doc summarizing the solution by the end of this week.
