[jira] [Commented] (HDFS-12615) Router-based HDFS federation phase 2

2018-05-16 Thread Chris Douglas (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12615?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16478432#comment-16478432
 ] 

Chris Douglas commented on HDFS-12615:
--

bq. Any Jira which has not been released, we should certainly revert. 
A blanket veto over a clerical error is not on the table. A set of unrelated 
features and fixes were unhelpfully linked from a single JIRA. Let's clean it 
up and move significant features- like security- to design doc + branch, 
leaving bug fixes and the like to normal dev on trunk, as we would with any 
other module. Particularly for code that only affects RBF, isolating it on a 
branch doesn't do anything for the stability of trunk.

bq. I don't quite agree with the patch count logic, for I have seen patches 
which are more than 200 KB at times. Let us just say "we know a big feature when 
we see it"
I was referring to the 5/9 patches implementing these features. Legalese 
defining when a branch is required is not a good use of anyone's time. Let's 
not belabor it. However, you requested _all_ development move to a branch, 
based on an impression that you explicitly have not verified. Without seeking 
objective criteria that apply to all situations, you need to research this 
particular patchset to ground your claims about _it_.

Let's be concrete. You have the impression that (a) changes committed to trunk 
affect non-RBF deployments and (b) RBF features miss critical cases. Those are 
demonstrable.

bq. I *feel* that a JIRA that has these following features [...] *Sounds like* 
a major undertaking in my mind and *feels like* these do need a branch and 
these are significant features.
RBF has a significant _roadmap_. A flat list of tasks- with mixed granularity- 
is a poor way to organize it.

bq. [...] since they are not very keen on communicating details, I am proposing 
that you move all this work to a branch and bring it back when the whole idea 
is baked
Not everyone participating in RBF development has worked in OSS projects 
before. It's fine to explore ideas in code, collaboratively, in JIRA. Failing 
to signal which JIRAs are variations on a theme (e.g., protobuf boilerplate), 
prototypes, or features affecting non-RBF: that's not OK. Reviewers can't 
follow every JIRA; they need help finding the relevant signal.

Your confidence that people working on RBF are applying reasonable filters and 
*soliciting* others' opinion is extremely important. From a random sampling, 
that seems to be happening. Reviewing the code in issues [~arpitagarwal] 
cited... they may "look non-trivial at a glance", but after a slightly longer 
glance, they look pretty straightforward to me. Or at least, they follow from 
the design of the router.

bq. Perhaps there is a communication problem here. I am not sure where your 
assumption comes from; reading the comments on the security patch, I am not 
able to come to that conclusion. Please take a look at HDFS-12284. [...] If we 
both are reading the same comments, I am at a loss on how you came to the 
conclusion that it was a proposal and not a patch.
Committing a patch that claims to add security, without asking for broader 
review, would be absurd. Reading a discussion about a lack of clarity on 
_delegation tokens_ and concluding that the group believes its implementation is 
ready to merge... that requires more assumptions about those developers' intent 
and competence than to conclude "prototype". _However_, if it were being 
developed in a branch, that signal would be unambiguous.

bq. It will benefit the RBF feature as well as future maintainers to have 
design notes or a detailed change description beyond a 1-line summary because 
most are large patches
From the samples I read, this is a recurring problem. Many RBF JIRAs should 
have more prose around the design tradeoffs, not just comments on the code. 
Taking HDFS-13224 as an example, one would need to read the patch to 
understand what was implemented, and how. Again, most of these follow from the 
RBF design and the larger patch size often comes from PB boilerplate, but 
raising the salient details both for review and for future maintainers (I'm 
glad you brought this up) is not optional.

> Router-based HDFS federation phase 2
> 
>
> Key: HDFS-12615
> URL: https://issues.apache.org/jira/browse/HDFS-12615
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Íñigo Goiri
>Assignee: Íñigo Goiri
>Priority: Major
>  Labels: RBF
>
> This umbrella JIRA tracks set of improvements over the Router-based HDFS 
> federation (HDFS-10467).



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: 

[jira] [Commented] (HDFS-12615) Router-based HDFS federation phase 2

2018-05-16 Thread Anu Engineer (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12615?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16478341#comment-16478341
 ] 

Anu Engineer commented on HDFS-12615:
-

{quote}Given that we discarded reverting, I'm not sure a branch makes sense for 
what is left in this JIRA.
{quote}
Just to make sure that we are all on the same page, what I proposed was a 
compromise based on what [~linyiqun] mentioned. Any Jira which has not been 
released, we *should* certainly revert.




[jira] [Commented] (HDFS-12615) Router-based HDFS federation phase 2

2018-05-16 Thread JIRA

[ 
https://issues.apache.org/jira/browse/HDFS-12615?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16478089#comment-16478089
 ] 

Íñigo Goiri commented on HDFS-12615:


Given that we discarded reverting, I'm not sure a branch makes sense for what 
is left in this JIRA.
The ones we are still in time to amend are already in their own umbrella and will 
move into their own branch accordingly.
If we have a branch for what is left open here, we will have a branch with a 
topic like "stuffs on RBF".
Similarly, a design doc for such a "topic" would have a similar issue.
Given that, I'm proposing to take the big chunks in this JIRA and if they are 
large enough, move them to their own umbrella with a proper design doc.
If they are too small for a full doc, then just fix the description.

So far, I've gone through them:
* HDFS-13044: this is pretty big () and I created HDFS-13575 to add a proper 
design doc there.
* HDFS-13484: this is around 4/5 JIRAs so not sure an umbrella makes sense but 
I'll add a design doc to HDFS-13484.
* HDFS-13224: this one is also 4/5 JIRAs but it might be worth having its own 
umbrella. For sure a design doc. Thoughts?

I'm also going over their descriptions trying to make them "more descriptive".


BTW, as I'm going through all these patches, I have to say that size in 
bytes is not a very good indicator.
For example, HDFS-13478 is 90KB of nothing (defining RPC interfaces with PB 
implementations is extremely verbose).
In any case, right now most patches are <10KB and the only outlier is 
HDFS-13215 which is a refactor.




[jira] [Commented] (HDFS-12615) Router-based HDFS federation phase 2

2018-05-16 Thread Arpit Agarwal (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12615?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16477959#comment-16477959
 ] 

Arpit Agarwal commented on HDFS-12615:
--

Hi [~elgoiri], splitting patches sounds like a lot of work. Instead, is it 
practical to bring in the large phase 2 changes via a single feature branch 
with a design doc (security being an exception)? If that is impractical, can we 
have a design or implementation note for each large patch and ensure that 
future changes occur in a branch?

A branch also makes it easier to see the impact of a large feature on the 
existing code by looking at the merge patch, e.g. for proposals like HDFS-13248.




[jira] [Commented] (HDFS-12615) Router-based HDFS federation phase 2

2018-05-16 Thread Anu Engineer (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12615?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16477951#comment-16477951
 ] 

Anu Engineer commented on HDFS-12615:
-

Moving JIRAs out makes no difference, IMHO; but if you are moving them into 
their own branch, +1.

Thank you for addressing the feedback.




[jira] [Commented] (HDFS-12615) Router-based HDFS federation phase 2

2018-05-16 Thread JIRA

[ 
https://issues.apache.org/jira/browse/HDFS-12615?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16477928#comment-16477928
 ] 

Íñigo Goiri commented on HDFS-12615:


bq. Could you please explain what you are proposing to do?
I'm putting them in their own umbrella and then we can decide if you guys want 
to start branches for those or not.
With this, we remove the clutter from this JIRA while being able to track 
them; just accounting for now.
In addition, we can add detailed descriptions based on your thoughts.




[jira] [Commented] (HDFS-12615) Router-based HDFS federation phase 2

2018-05-16 Thread Anu Engineer (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12615?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16477911#comment-16477911
 ] 

Anu Engineer commented on HDFS-12615:
-

[~goiri] I am sorry, I am not sure I understand what you mean by "let me split 
this one". Could you please explain what you are proposing to do?




[jira] [Commented] (HDFS-12615) Router-based HDFS federation phase 2

2018-05-16 Thread JIRA

[ 
https://issues.apache.org/jira/browse/HDFS-12615?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16477908#comment-16477908
 ] 

Íñigo Goiri commented on HDFS-12615:


Thanks [~arpitagarwal], let me go over them:
# HDFS-13364: this is pretty straightforward. I think it's just verbose.
# HDFS-13484: let me split this one.
# HDFS-13224: let me split this one.
# HDFS-13044: let me split this one.
# HDFS-13347: let me add a better description.




[jira] [Commented] (HDFS-12615) Router-based HDFS federation phase 2

2018-05-16 Thread Arpit Agarwal (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12615?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16477862#comment-16477862
 ] 

Arpit Agarwal commented on HDFS-12615:
--

Hi [~elgoiri], some tasks that fall into the _small bugfixes_ bucket look 
non-trivial at a glance.
# HDFS-13364 - Support NamenodeProtocol in the Router
# HDFS-13484 - Disable Nameservices from the federation
# HDFS-13224 - support mount points across multiple subclusters
# HDFS-13044 - this introduces a safe-mode concept.
# HDFS-13347 - this introduces caching for DN reports. The caching behavior is 
not clear from the short jira description.

It will benefit the RBF feature as well as future maintainers to have design 
notes or a detailed change description beyond a 1-line summary because most are 
large patches (40-70KB+). Perhaps we didn't pay much attention to these jiras 
because we saw them as sub-tasks and assumed the development was being done in 
a branch. A feature branch feels like a better vehicle for multiple changes of 
this magnitude.




[jira] [Commented] (HDFS-12615) Router-based HDFS federation phase 2

2018-05-15 Thread JIRA

[ 
https://issues.apache.org/jira/browse/HDFS-12615?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16476340#comment-16476340
 ] 

Íñigo Goiri commented on HDFS-12615:


[~anu], thanks for your thoughts.
I think your concerns were valid from the beginning and I agree that this 
umbrella didn't have a consistent story.
My idea was to have a place to track the RBF related modifications.
In retrospect, if I had just used tags for these features (which is the 
common approach), this wouldn't have been noticed and this would now just be 
hidden under the carpet.
From this discussion, I think I made the right decision, as now we are able to 
set a better standard for RBF.
I'm not sure I 100% agree with your statement that this would have required a 
branch a long time ago.
I agree that some of the features (~30% of the original JIRAs) should have used 
a branch, but there are a lot of small bugfixes which should go to trunk without 
a branch.

I hope you agree your concerns were heard.
To summarize, the actions were:
* Take the larger components into their own JIRAs which potentially will become 
their own branch:
** Security: Already on that track. As I have 0 confidence in this part, we 
were already making noise trying to get others to review.
** Rebalancer: Already on that track.
** DNs vs Router: Already on that track but still in early discussion.
** Quotas: this grew out of hand and hopefully now is on track in its own 
umbrella.
* Leave this branch as maintenance bug fixes. If I'm not wrong, all the open 
JIRAs now would qualify as "regular maintenance".

The problem is that now we have a few rough edges with some medium features 
that are still left here.
I'd like to trim this umbrella further if so.
Any thoughts here? Should we split things further (multi-destination mount 
points)?

In general, I would like to highlight that all the work here is self-contained 
to RBF and shouldn't have an impact on the rest of HDFS.
(Security is a different beast and we should pay special attention within HDFS 
in general.)
If I'm not wrong, the changes to the rest of the components have been minimal.




[jira] [Commented] (HDFS-12615) Router-based HDFS federation phase 2

2018-05-15 Thread Anu Engineer (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12615?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16476295#comment-16476295
 ] 

Anu Engineer commented on HDFS-12615:
-

[~chris.douglas] Thanks for the response. Please forgive me for the length of 
this response, I wanted to make sure that I have carefully considered all your 
arguments and responded to them. 
I am arguing that the core issue of this JIRA is lack of communication, and I 
am perhaps overcompensating for it :)
{quote}"All future work" is too broad, but developing complex features in 
branches is a fair ask.
{quote}
I agree that is a reasonable statement. I hope that by the end of this message 
I will have articulated my concerns and my position reasonably well.
{quote}Would you mind looking through the JIRAs implementing quotas and 
multi-subcluster mounts, as an exercise?
{quote}
Will do, but I also want to make sure that we are not missing the forest for 
the trees. Please read on to understand my concerns.
{quote}It is important to distinguish between routine maintenance and 
significant features.
{quote}
Completely Agree.
{quote}For the latter, design docs should be written, experts' opinion 
solicited, and implementation should be in a branch if it requires more than a 
handful of patches.
{quote}
You nailed it on the head. I don't quite agree with the patch count logic, for 
I have seen patches which are more than 200 KB at times. Let us just say "we 
know a big feature when we see it" 
([https://en.wikipedia.org/wiki/I_know_it_when_I_see_it]).

Given that, I feel that a JIRA that has the following features:
 # Security
 # Quotas
 # Multi-subcluster mounts
 # Dynamic Cluster Build-outs
 # Change of Configuration Management
 # DN Protocol changes
 # Inter-Cluster Rebalancing

sounds like a major undertaking in my mind and feels like these do need a 
branch and these are significant features.
{quote}In that particular case, I assume the patch was shared for discussion, 
not commit.
{quote}
Perhaps there is a communication problem here. I am not sure where your 
assumption comes from; reading the comments on the security patch, I am not 
able to come to that conclusion. Please take a look at HDFS-12284.

I have copied some sample comments from the JIRA (to make it easy for you to 
read). Do these comments sound to you as if it was a proposal?

_"I've tested it on the command line. Adding directory/deleting 
directory/copying file from the local file system, etc. I am working on the 
unit test. Please review the patch in the meantime. "_
 _"Thanks, Sherwood Zheng, nice work! Could you click "Submit Patch" to trigger 
Jenkins testing?"_
 _"Can you open a new JIRA for the delegation tokens part?"_
If we both are reading the same comments, I am at a loss on how you came to the 
conclusion that it was a proposal and not a patch.

And that brings us to the most critical issues I have.

While building all these features, there are no design docs, and from where I 
am standing there is not even consistency about what is a real patch, a proposal, or 
a real JIRA. Let me show you an example:
 # Let us look at HDFS-13098. This is supposed to be a discussion JIRA 
(something that I learned after my comments).
 # At some point, the discussion turned to "let us commit this other patch, since 
that will help this work" (HDFS-13312).
 # I come and comment that it all looks great, but please commit to your 
branch, so that people understand what you are proposing (see the earlier point 
about the lack of a design document in this important JIRA).
 # I specifically write "I don't object to patches, just move it to a branch". 
Maybe this is a research idea, maybe it is not well-formed; whatever it is, please 
keep it in a branch.
 # Then, and only then, I get a comment which says, "This just an idea", we are 
not planning to commit it.
 # If you have 97 patches, how am I supposed to know that committing HDFS-13312 
is not to further this cause and this is just an idea? If this is just an idea, 
why would you want to commit that other patch with the reasoning that it helps 
this patch? 
 # And now we have decided to abandon HDFS-13312? I am confused about what 
state this work is in.

Overall, this JIRA does a very poor job of communicating what they want to do. 
*That is the core issue*. 
As I said, the people involved seem to understand (or at least I hope) what 
they are doing, but since they are not very keen on communicating details, I am 
proposing that you move all this work to a branch and bring it back when the 
whole idea is baked. I am really trying to be helpful here.

{quote}you should look at the code/JIRAs if you're going to make sweeping 
statements like this. Many features did have design documents.
{quote}
I hope I have given you examples of what triggered this statement. Other than 
Quota work, there are no design documents. If you have them please do feel free 
to share. Perhaps I am mistaken.

We 

[jira] [Commented] (HDFS-12615) Router-based HDFS federation phase 2

2018-05-15 Thread Yiqun Lin (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12615?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16475380#comment-16475380
 ] 

Yiqun Lin commented on HDFS-12615:
--

[~anu], actually for the Quotas, the major work is done in 3 JIRAs (HDFS-12934, 
HDFS-12972, HDFS-12973), and the other 6 are minor bug fixes. We didn't change HDFS 
core functionality, and the global quota was done at the Router level. You may want 
to look into the code/JIRAs for Quota and Multi-Subcluster mounts.




[jira] [Commented] (HDFS-12615) Router-based HDFS federation phase 2

2018-05-15 Thread Chris Douglas (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12615?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16475338#comment-16475338
 ] 

Chris Douglas commented on HDFS-12615:
--

bq. Could you please do me a favor and move all future work and all the JIRAs 
not released into a branch?
"All future work" is too broad, but developing complex features in branches is 
a fair ask. Would you mind looking through the JIRAs implementing quotas and 
multi-subcluster mounts, as an exercise? Reasonable people can disagree on 
minimum criteria requiring a feature branch, but calling a merge vote for 5-9 
patches is probably excessive.

It is important to distinguish between routine maintenance and significant 
features. For the latter, design docs should be written, experts' opinion 
solicited, and implementation should be in a branch if it requires more than a 
handful of patches. The project can't claim that RBF is secure, then issue a 
flurry of CVEs that could have been caught in design review. In that particular 
case, I assume the patch was shared for discussion, not commit.

bq. As I said, I feel bad about the way this project is handled, that is, heaping lots 
of critical features without even design documents and all these features going 
directly into the trunk.
[~anu], you should look at the code/JIRAs if you're going to make sweeping 
statements like this. Many features did have design documents. Most JIRAs were 
repairs/improvements to released functionality or straightforward extensions. 
The only data point you've cited is the number of subtasks in this JIRA, which 
is a clerical problem, not a stop-the-world emergency.

Since some subset of these JIRAs are already in a release, even that criterion 
doesn't tie these JIRAs together. Your suggestion- moving JIRAs not part of a 
release out of this umbrella- makes sense. Let's wrap up this issue as whatever 
was released, then transition to (a) significant features in branches with 
subtasks and (b) normal maintenance in individual JIRAs.




[jira] [Commented] (HDFS-12615) Router-based HDFS federation phase 2

2018-05-14 Thread Anu Engineer (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12615?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16475238#comment-16475238
 ] 

Anu Engineer commented on HDFS-12615:
-

[~linyiqun]  I am not going to insist that you need to do something difficult. 
It is not my intention to make life harder for people who produce code and make 
the product better.
 * Could you please do me a favor and move all future work and all the JIRAs 
not released into a branch?  +That would address my concerns for now.+

As I said, I feel bad about the way this project is handled, that is, heaping lots of 
critical features without even design documents and all these features going 
directly into the trunk. Let bygones be bygones and see how we can make forward 
progress without too much pain for all involved.




[jira] [Commented] (HDFS-12615) Router-based HDFS federation phase 2

2018-05-14 Thread Yiqun Lin (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12615?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16475226#comment-16475226
 ] 

Yiqun Lin commented on HDFS-12615:
--

Hi [~anu], I agree that we can move the work on Quota and Multi-Subcluster 
mounts to the new branch. But reverting from trunk would break current 
behavior. As Quota and Multi-Subcluster mounts have been released in the latest 
version, these features are already being used in users' production environments. 
If we revert them, users cannot use these features.




[jira] [Commented] (HDFS-12615) Router-based HDFS federation phase 2

2018-05-14 Thread Anu Engineer (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12615?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16475175#comment-16475175
 ] 

Anu Engineer commented on HDFS-12615:
-

[~goiri] Thanks for taking care of this. Can you move the Quota work and 
Multi-Subcluster mounts to the branch that you have opened and revert them from 
trunk? That way all the new work for these features (the outstanding patches, 
etc.) can go into that branch. Then you can leave the rest of the patches as-is. 
Thank you very much for your consideration.




[jira] [Commented] (HDFS-12615) Router-based HDFS federation phase 2

2018-05-14 Thread JIRA

[ 
https://issues.apache.org/jira/browse/HDFS-12615?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16475153#comment-16475153
 ] 

Íñigo Goiri commented on HDFS-12615:


bq. Is there a design doc that you can point me to Quota work?
Yep, the design doc was added to HDFS-12934 at the very beginning.
I moved everything under HDFS-13553 to make it a self-contained feature.

bq. or for Multi-Subcluster mount points?
This is my bad, I initially expected it to happen in one JIRA but it got all 
the way to 4 (5 if you count a compilation fix in branch-2).
Not sure what to do here; I could make it into its own umbrella.

bq. I see this master JIRA containing all these changes (Big and Small).
Agreed.
I've done some trimming now to tackle this.
At this point, all the larger features are out of this JIRA and everything else 
(other than the multi-subcluster work) is a self-contained change.
I don't think the ones remaining are worth reverting.




[jira] [Commented] (HDFS-12615) Router-based HDFS federation phase 2

2018-05-14 Thread Anu Engineer (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12615?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16475114#comment-16475114
 ] 

Anu Engineer commented on HDFS-12615:
-

[~chris.douglas] Thank you for the comments. Quotas and Multi-subcluster mount 
points do not look like small features to me. Also, the rest of the feature 
set that triggered this discussion falls into the same bucket.

As for the rest of the patches, I have not followed them closely enough to say 
which ones implement new features or change HDFS core functionality. 

My point is precisely that: this JIRA seems to mix big and small work items 
indiscriminately. The Security work had a patch, and then [~daryn] and 
[~arpitagarwal] had to come and request a design doc. 

Is there a design doc that you can point me to for the Quota work, or for 
Multi-Subcluster mount points? You have looked at many of these patches, and 
they might be part of the small, normal course of work, which is what this JIRA 
was intended for. But then you also see that there are large tasks interwoven 
into this master JIRA. 

That is why I am saying that all this work should move to a feature branch. 
From my point of view, this master JIRA contains all these changes (big and 
small), and I am requesting that you take it to its own branch. Please note, 
I am not arguing the merits of the patches per se; I am saying that when you 
have so many features clubbed under one JIRA, it is confusing to me what is 
being achieved here. Let me repeat myself: I am not against any of these 
features, but I am all for due diligence and for doing this work and other 
items in a branch. It just seems to be the right thing to do here.

 




[jira] [Commented] (HDFS-12615) Router-based HDFS federation phase 2

2018-05-14 Thread Chris Douglas (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12615?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16475098#comment-16475098
 ] 

Chris Douglas commented on HDFS-12615:
--

bq. Right now, this JIRA has become too big, and it has interwoven JIRAs(small 
and large) – like you mentioned the Quotas or Multi-subcluster mount points in 
the trunk.
Agreed, the number of subtasks for this JIRA belies any coherent theme for 
"Phase II" items. The bug fixes, perf improvements, docs, unit tests, and API 
cleanup can continue on trunk. As with other modules, they can use tags for 
tracking.

bq. I still think at this stage ( perhaps this should have been done much 
earlier) that we should open a new branch and revert the changes in trunk and 
2.9.
Going through the subtasks, this looks like normal development, just 
over-indexed. If a large feature (like security) relies on a web of JIRAs 
and/or impacts HDFS significantly, then as you say, that's easier to manage on 
a branch.




[jira] [Commented] (HDFS-12615) Router-based HDFS federation phase 2

2018-05-14 Thread Anu Engineer (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12615?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16474997#comment-16474997
 ] 

Anu Engineer commented on HDFS-12615:
-

[~goiri] Thanks for your reply. As you mentioned, I realize that this JIRA was 
intended to be a small set of 20+ work items, but since then it has grown to 97 
and will keep growing.

Once you are part of it and file JIRAs every day, it is hard to realize how big 
the patch sets have become.

I still think at this stage (perhaps this should have been done much earlier) 
that we should open a new branch and revert the changes in trunk and 2.9.

Then we can bring Quotas, Multi-subcluster mount points, DN changes, Dynamic 
Sub-cluster, Rebalancer, etc. together when we merge the tree.

Right now, this JIRA has become too big, and it has interwoven JIRAs (small and 
large), like the Quotas or Multi-subcluster mount points you mentioned, in 
trunk. I sincerely believe that a branch is the most productive thing to do; 
that way you can design and test changes without impacting trunk.




[jira] [Commented] (HDFS-12615) Router-based HDFS federation phase 2

2018-05-14 Thread Íñigo Goiri (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12615?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16474977#comment-16474977
 ] 

Íñigo Goiri commented on HDFS-12615:


For the three pieces of work I mentioned before (i.e., security, DN vs Router 
and Rebalancer), I fully agree: each should be done as a chunk, with its own 
design document, etc.
I'll do that (as I did for security) once the effort starts and then we can 
start their own branch.
We can start the discussion individually for each of those JIRAs (soon to be 
umbrellas).
So I think for that chunk we are on the same page.

Then the question is this JIRA/umbrella and its nature.
To be truthful, when I started this umbrella, I was targeting a few bug fixes 
(I reported around 20 of them).
I didn't expect so many tasks; this is something that has been growing over 
time.
I went through the patches and the changes are:
* Small bug fixes.
* Performance improvements.
* Adding unit tests.
* Interface cleanup.
* Documentation extensions.
* Moving everything into a separate module (HDFS-13215).
* Modular new features (e.g., HDFS-12512).
* Two big features:
** Quotas: HDFS-12934, HDFS-12972, HDFS-12973, HDFS-13253, HDFS-13308, 
HDFS-13307, HDFS-13346, HDFS-13380, and HDFS-13528.
** Multi subcluster mount points: HDFS-13224, HDFS-13250, HDFS-13237, 
HDFS-13299, and HDFS-13291.

Most of the items left open are minor fixes or new features that I'm not even 
sure will eventually be done.
Given that the controversial/large parts that are left (i.e., Rebalancer and DN 
vs Router) will be moved once they get traction, do you still see a need for a 
branch for the rest?




[jira] [Commented] (HDFS-12615) Router-based HDFS federation phase 2

2018-05-14 Thread Anu Engineer (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12615?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16474747#comment-16474747
 ] 

Anu Engineer commented on HDFS-12615:
-

When you have 97 bug fixes to be made on a feature, I believe it is time to 
open up your own branch. I have no issues with the check-ins, but I just want 
them to come in in a stable, transactional fashion. Some of the work that you 
are proposing, including security and DN interaction with the Router, is indeed 
a fundamental change. Just to point out, the security work has a patch, and 
[~daryn] requested a design doc. 

My request is to move all Phase-2 items of RBF to a branch and merge when it is 
ready. Right now, it seems that small changes are being intermingled with 
big-ticket items and directly committed to trunk. If this project made zero 
impact on the rest of the code, I would *not* be this concerned, but if we are 
making fundamental changes, let us bring them in via a branch.




[jira] [Commented] (HDFS-12615) Router-based HDFS federation phase 2

2018-05-14 Thread Íñigo Goiri (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12615?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16474742#comment-16474742
 ] 

Íñigo Goiri commented on HDFS-12615:


Thanks [~anu] for the comments.
What we have committed so far has been bug fixes and minor features (the 
largest one being the quota).
In the discussion we had a month ago (see my comment on April 11th), we 
identified the tasks that would be much larger:
* Security
* DN interacting with the Router
* Rebalancer

The plan was to eventually make those three into separate umbrellas.
We have already done the split for security (HDFS-13532) and once we have 
critical mass, we can do that for the other two.

To summarize, in my opinion, this phase-2 is only to track bug fixing and minor 
tasks.
Once a task gets significant enough, it moves to its own umbrella (the 
discussion for branch should be done individually for each of those).
Any thoughts on this approach?




[jira] [Commented] (HDFS-12615) Router-based HDFS federation phase 2

2018-05-14 Thread Anu Engineer (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12615?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16474692#comment-16474692
 ] 

Anu Engineer commented on HDFS-12615:
-

There are some very fundamental changes being proposed in Phase 2 of this 
work. I propose that it be done in a branch and not as part of trunk. Let us 
bring it in once we are able to understand the scope well.

I know I am commenting a bit late, but I have not been concentrating on the 
changes that are being proposed by RBF.




[jira] [Commented] (HDFS-12615) Router-based HDFS federation phase 2

2018-05-11 Thread Íñigo Goiri (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12615?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16472868#comment-16472868
 ] 

Íñigo Goiri commented on HDFS-12615:


I will be presenting this work at the DataWorks Summit.
Is anyone interested in sharing their experience deploying RBF?




[jira] [Commented] (HDFS-12615) Router-based HDFS federation phase 2

2018-04-17 Thread Íñigo Goiri (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12615?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16441539#comment-16441539
 ] 

Íñigo Goiri commented on HDFS-12615:


Thanks [~daryn], I created HDFS-13469 to discuss and track this part.
I'm not really sure how to handle this as we would need to know the location of 
all the ids.
Anyway, let's discuss there.




[jira] [Commented] (HDFS-12615) Router-based HDFS federation phase 2

2018-04-17 Thread Daryn Sharp (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12615?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16441470#comment-16441470
 ] 

Daryn Sharp commented on HDFS-12615:


Correct, [~ywskycn].  That's the JIRA that originally started the feature.

bq. What are the ClientProtocol methods that would rely on ids?

Some methods directly take a fileId, which the router currently ignores.  Every 
namesystem operation handles "/.reserved/.inodes/NNN".
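
As a rough illustration of the path form mentioned above, the following Python sketch recognizes {{/.reserved/.inodes/NNN}} paths and extracts the numeric id. It is not Router or NameNode code: the function name {{parse_inode_path}} is hypothetical, and accepting a trailing sub-path after the id is an assumption of this sketch. The point it makes is only that such a path carries no mount-point prefix, so anything that resolves paths by prefix has nothing to match on.

```python
import re
from typing import Optional

# The reserved prefix is the one quoted in the comment above.
# Everything else here (names, trailing sub-path handling) is illustrative.
_INODE_PATH = re.compile(r"^/\.reserved/\.inodes/(\d+)(/.*)?$")


def parse_inode_path(path: str) -> Optional[int]:
    """Return the inode id if `path` looks like /.reserved/.inodes/NNN, else None."""
    match = _INODE_PATH.match(path)
    return int(match.group(1)) if match else None


if __name__ == "__main__":
    for p in ("/.reserved/.inodes/16386", "/user/alice/data.txt"):
        print(p, "->", parse_inode_path(p))
```

A regular path returns {{None}} here, while an inode path returns only the id, with no hint of which subcluster owns it.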




[jira] [Commented] (HDFS-12615) Router-based HDFS federation phase 2

2018-04-17 Thread Íñigo Goiri (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12615?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16441337#comment-16441337
 ] 

Íñigo Goiri commented on HDFS-12615:


[~ywskycn] can you link the other JIRA for context?
I haven't made any active effort to support inode IDs; we need to go through 
those cases.
What are the ClientProtocol methods that would rely on ids?




[jira] [Commented] (HDFS-12615) Router-based HDFS federation phase 2

2018-04-17 Thread Wei Yan (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12615?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16441093#comment-16441093
 ] 

Wei Yan commented on HDFS-12615:


[~daryn] asked in another discussion how RBF supports file access using an 
inode ID. I'm not sure I understand the question correctly; I guess it is 
something related to HDFS-4489? Daryn, please correct me if I'm wrong here :)

[~elgoiri], do you have context about supporting this?




[jira] [Commented] (HDFS-12615) Router-based HDFS federation phase 2

2018-04-16 Thread Wei Yan (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12615?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16439621#comment-16439621
 ] 

Wei Yan commented on HDFS-12615:


{quote}Looking forward to seeing the patch, :).
{quote}
I was too busy last week and didn't have a chance to finish it. I will try to 
put something there this week.




[jira] [Commented] (HDFS-12615) Router-based HDFS federation phase 2

2018-04-11 Thread Yiqun Lin (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12615?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16434868#comment-16434868
 ] 

Yiqun Lin commented on HDFS-12615:
--

Thanks [~ywskycn] and [~elgoiri] for the comments.
{quote}Regarding rebalancer, it is tracked in HDFS-13123. Let me put the poc 
patch this week, and clear the implementation details, and then we can split 
out some sub-tasks there.
{quote}
Looking forward to seeing the patch, :).
{quote}As you mentioned we closed most of the opened tasks and now we have 
three big parts:
 ...
 I think tracking 1 and 2 in this umbrella is fine but I'm thinking on making 
the others their own umbrella:
 ...
{quote}
Agreed.

Let's see if there are some comments from others.




[jira] [Commented] (HDFS-12615) Router-based HDFS federation phase 2

2018-04-11 Thread Íñigo Goiri (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12615?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16434256#comment-16434256
 ] 

Íñigo Goiri commented on HDFS-12615:


[~linyiqun] thanks for following up.
As you mentioned, we closed most of the open tasks and now we have three big 
parts:
# Tasks that would be important to have, like the one to keep locality 
(HDFS-13248) and the DBMS store (HDFS-13245).
# Small fixes, improvements, and some new unit tests.
# Big tasks like security, the rebalancer, and the DNs interacting with the 
Routers.

I think tracking 1 and 2 in this umbrella is fine, but I'm thinking of making 
each of the others its own umbrella:
* HDFS-12510: security will have 3 or more patches here, including the local 
security, delegation tokens, documentation, etc. ([~zhengxg3] is working on 
this).
* HDFS-13123: the rebalancer will require a few JIRAs, like the store for the 
rebalancer logs, the rebalancer itself, unit tests, etc. ([~ywskycn] is taking 
care of this).
* HDFS-13098: the DNs interacting with the Routers will require a few subtasks 
and something similar to HDFS-13312. (I can take this.)

Any thoughts on this? Is any other important or nice-to-have feature missing?




[jira] [Commented] (HDFS-12615) Router-based HDFS federation phase 2

2018-04-11 Thread Wei Yan (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12615?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16434064#comment-16434064
 ] 

Wei Yan commented on HDFS-12615:


{quote} * What's the plan of RBF Rebalancer? It looks a useful tool for 
users.{quote}
Regarding the rebalancer, it is tracked in HDFS-13123. Let me post the PoC 
patch this week and clarify the implementation details; then we can split out 
some sub-tasks there.




[jira] [Commented] (HDFS-12615) Router-based HDFS federation phase 2

2018-04-10 Thread Yiqun Lin (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12615?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16433344#comment-16433344
 ] 

Yiqun Lin commented on HDFS-12615:
--

It is amazing that we have made so many fixes and improvements to RBF. 
Thanks to all the contributors!
I'm just revisiting the previous plan for RBF phase 2 to make sure we are 
heading in the same direction:
{quote}At this point the only critical task missing is enabling security 
(mainly HDFS-12284), Sherwood Zheng has been doing progress there and we are 
only missing the delegation tokens.
 The other tasks are mostly done we should try to complete them, the most 
important in my opinion:
 * HDFS-12773 to do a proper HDFS State Store.
 * HDFS-12792 to have complete unit tests.

Then there are a bunch of open topics:
 # Rebalancer of data across subclusters. We have some internal prototype based 
on DistCp that I shared with Wei Yan (we could also include moving blocks 
across blockpools). We could start a separate umbrella for this.
 # DNs interacting with the Routers. I recently opened HDFS-13098 to discuss 
this but this could.
 # Spreading a mount point across multiple mount points. This is similar to 
merge mount points and we have most of it working in production.{quote}
Item #3 in the open topics (*Spreading a mount point across multiple mount 
points*) has been implemented now (its documentation, HDFS-13237, will be 
merged soon). HDFS-12773 and HDFS-12792 are both merged.

Now there are two major pieces of work left in phase 2:
 * RBF security tasks: HDFS-12284 and HDFS-13358.
 * DBMS State Store implementation: HDFS-13245.

[~elgoiri], I have some questions:
 * Besides these, is there anything else we need to implement?
 * What's the next plan for RBF phase 2?
 * What's the plan for the RBF Rebalancer? It looks like a useful tool for users.




[jira] [Commented] (HDFS-12615) Router-based HDFS federation phase 2

2018-03-20 Thread Íñigo Goiri (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12615?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16406686#comment-16406686
 ] 

Íñigo Goiri commented on HDFS-12615:


HDFS-13215 has already been committed and now we have hadoop-hdfs-rbf working.
As a result, we are now able to run FindBugs comfortably :)
There are a couple of errors 
[here|https://builds.apache.org/job/PreCommit-HDFS-Build/23560/artifact/out/branch-findbugs-hadoop-hdfs-project_hadoop-hdfs-rbf-warnings.html]
 that we can potentially fix.
[~ywskycn] thanks for creating the new module.




[jira] [Commented] (HDFS-12615) Router-based HDFS federation phase 2

2018-03-16 Thread Íñigo Goiri (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12615?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16402573#comment-16402573
 ] 

Íñigo Goiri commented on HDFS-12615:


We are freezing all commits to RBF until we get HDFS-13215 in.
[~ywskycn], please update this thread once done.




[jira] [Commented] (HDFS-12615) Router-based HDFS federation phase 2

2018-03-08 Thread Íñigo Goiri (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12615?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16391593#comment-16391593
 ] 

Íñigo Goiri commented on HDFS-12615:


{quote}
BTW, I have a question: should we configure the remote nameservice or the data 
center's Router on the client?
{quote}
For the client, it's enough to set up the Router addresses with a 
{{ConfiguredFailoverProxyProvider}}; there is no need to set up the remote 
nameservices.
The documentation 
[here|http://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-hdfs/HDFSRouterFederation.html]
 shows an example that accesses the namespaces both directly and through the 
Router, but you only need:
{code}
 
  
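<configuration>
  <!-- The Routers r1 and r2 are listed as if they were the HA NameNodes of ns-fed. -->
  <property>
    <name>dfs.nameservices</name>
    <value>ns-fed</value>
  </property>
  <property>
    <name>dfs.ha.namenodes.ns-fed</name>
    <value>r1,r2</value>
  </property>
  <property>
    <name>dfs.namenode.rpc-address.ns-fed.r1</name>
    <value>router1:rpc-port</value>
  </property>
  <property>
    <name>dfs.namenode.rpc-address.ns-fed.r2</name>
    <value>router2:rpc-port</value>
  </property>
  <property>
    <name>dfs.client.failover.proxy.provider.ns-fed</name>
    <value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
  </property>
  <property>
    <name>dfs.client.failover.random.order</name>
    <value>true</value>
  </property>
</configuration>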
{code}
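
For deployments with more than two Routers, the same set of properties can be generated instead of written by hand. The following Python sketch is only an illustration under stated assumptions: the helper names {{router_client_properties}} and {{to_hadoop_xml}} and the {{r1}}, {{r2}}, ... id scheme are hypothetical, and the addresses are placeholders. It emits {{dfs.ha.namenodes.<nameservice>}}, the standard HDFS HA key that lists the namenode ids of a nameservice.

```python
from typing import Dict, List
from xml.sax.saxutils import escape

# Proxy provider class named in the comment above.
PROVIDER = ("org.apache.hadoop.hdfs.server.namenode.ha."
            "ConfiguredFailoverProxyProvider")


def router_client_properties(nameservice: str, routers: List[str]) -> Dict[str, str]:
    """Build client properties that expose the Routers as one HA nameservice."""
    # Hypothetical id scheme: r1, r2, ... in the order the addresses are given.
    ids = [f"r{i}" for i in range(1, len(routers) + 1)]
    props = {
        "dfs.nameservices": nameservice,
        f"dfs.ha.namenodes.{nameservice}": ",".join(ids),
    }
    for rid, address in zip(ids, routers):
        props[f"dfs.namenode.rpc-address.{nameservice}.{rid}"] = address
    props[f"dfs.client.failover.proxy.provider.{nameservice}"] = PROVIDER
    # Ask clients to try the Routers in random order rather than always r1 first.
    props["dfs.client.failover.random.order"] = "true"
    return props


def to_hadoop_xml(props: Dict[str, str]) -> str:
    """Render the properties in the <configuration>/<property> layout."""
    lines = ["<configuration>"]
    for name, value in props.items():
        lines += ["  <property>",
                  f"    <name>{escape(name)}</name>",
                  f"    <value>{escape(value)}</value>",
                  "  </property>"]
    lines.append("</configuration>")
    return "\n".join(lines)


if __name__ == "__main__":
    # "rpc-port" is a placeholder, as in the comment above.
    routers = ["router1:rpc-port", "router2:rpc-port"]
    print(to_hadoop_xml(router_client_properties("ns-fed", routers)))
```

The client then refers to the federated namespace by the nameservice name ({{ns-fed}} here) and never needs the addresses of the remote NameNodes.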




[jira] [Commented] (HDFS-12615) Router-based HDFS federation phase 2

2018-03-08 Thread maobaolong (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12615?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16391387#comment-16391387
 ] 

maobaolong commented on HDFS-12615:
---

[~elgoiri] I've created a new issue to track the DBMS State Store.

BTW, I have a question: should we configure the remote nameservice or the data 
center's Router on the client?




[jira] [Commented] (HDFS-12615) Router-based HDFS federation phase 2

2018-03-06 Thread Íñigo Goiri (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12615?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16388116#comment-16388116
 ] 

Íñigo Goiri commented on HDFS-12615:


[~maobaolong], I tried something similar to the one in YARN-3663 but I never 
got very far.

Feel free to create a task and start with it; just keep YARN-3663 as a 
reference.




[jira] [Commented] (HDFS-12615) Router-based HDFS federation phase 2

2018-03-05 Thread maobaolong (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12615?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16387250#comment-16387250
 ] 

maobaolong commented on HDFS-12615:
---

[~elgoiri] Should we implement a MySQL-backed StateStore?




[jira] [Commented] (HDFS-12615) Router-based HDFS federation phase 2

2018-02-07 Thread Wei Yan (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12615?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16356497#comment-16356497
 ] 

Wei Yan commented on HDFS-12615:


[~elgoiri] [~linyiqun] just created HDFS-13123 for the balancer.




[jira] [Commented] (HDFS-12615) Router-based HDFS federation phase 2

2018-02-07 Thread Yiqun Lin (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12615?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16356427#comment-16356427
 ] 

Yiqun Lin commented on HDFS-12615:
--

{quote}
For the rebalancer itself, I have some code working locally, by reusing 
lots of code from your prototype. I can put a quick doc summarizing the idea, 
and we can start to merge the code back to trunk.
{quote}
Sounds great. Please go ahead, [~ywskycn], :).




[jira] [Commented] (HDFS-12615) Router-based HDFS federation phase 2

2018-02-07 Thread JIRA

[ 
https://issues.apache.org/jira/browse/HDFS-12615?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16356319#comment-16356319
 ] 

Íñigo Goiri commented on HDFS-12615:


[~ywskycn] I think it makes sense to create a separate umbrella and add the 
design docs, etc., there.




[jira] [Commented] (HDFS-12615) Router-based HDFS federation phase 2

2018-02-07 Thread Wei Yan (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12615?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16356309#comment-16356309
 ] 

Wei Yan commented on HDFS-12615:


[~elgoiri] For the #1 rebalancer, should we create a separate umbrella Jira so 
that we can try to close this phase 2 Jira? Or should we still put the umbrella 
under this Jira?

For the rebalancer itself, I have some code working locally, by reusing 
lots of code from your prototype. I can put a quick doc summarizing the idea, 
and we can start to merge the code back to trunk.




[jira] [Commented] (HDFS-12615) Router-based HDFS federation phase 2

2018-02-05 Thread JIRA

[ 
https://issues.apache.org/jira/browse/HDFS-12615?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16352669#comment-16352669
 ] 

Íñigo Goiri commented on HDFS-12615:


{quote}The #1 and #2 don't have to be done in this JIRA, right?
{quote}
Correct, I think those are big enough to be their own umbrella.
{quote}Does this mean that we support the mount point that points to multiple 
mount points? What's the use case for this?
{quote}
It would be a mount point that points to multiple *subclusters*.
 Internally we use it extensively for a couple of use cases:
 * Folders with a lot of data; they pretty much don't fit in a single subcluster
 * Folders with a lot of accesses; we have jobs with thousands of containers 
and all of them start at the same time and overload the NN

In these cases, we spread mount points across multiple subclusters and we have 
a bunch of policies for that.
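
As a rough illustration of what such a policy could look like, here is a minimal sketch. It is not the RBF code: the `SpreadPolicy` interface, the two policies, and the subcluster names `ns0` to `ns2` are all hypothetical names made up for this example. A hash-based policy keeps a given path on one subcluster, which suits folders that are too large for a single subcluster; a random policy spreads new files evenly, which may help when many containers start at once and hit the same NN.

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.ThreadLocalRandom;

/** Hypothetical sketch of spreading one mount point across subclusters. */
public class MountSpreadSketch {

  /** Picks the subcluster that should serve a path under a spread mount point. */
  interface SpreadPolicy {
    String choose(String path, List<String> subclusters);
  }

  /** Deterministic: the same path always maps to the same subcluster. */
  static final SpreadPolicy HASH = (path, subclusters) ->
      subclusters.get(Math.floorMod(path.hashCode(), subclusters.size()));

  /** Non-deterministic: each call may return a different subcluster. */
  static final SpreadPolicy RANDOM = (path, subclusters) ->
      subclusters.get(ThreadLocalRandom.current().nextInt(subclusters.size()));

  public static void main(String[] args) {
    // Assumed subcluster identifiers, for illustration only.
    List<String> subclusters = Arrays.asList("ns0", "ns1", "ns2");
    for (String p : new String[] {"/data/part-0", "/data/part-1", "/data/part-2"}) {
      System.out.println(p + " -> " + HASH.choose(p, subclusters));
    }
  }
}
```

With a random policy, a reader cannot compute where a file was placed, so a lookup would have to consult every subcluster behind the mount point; the hash policy avoids that at the cost of less even load.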




[jira] [Commented] (HDFS-12615) Router-based HDFS federation phase 2

2018-02-04 Thread Yiqun Lin (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12615?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16352087#comment-16352087
 ] 

Yiqun Lin commented on HDFS-12615:
--

Hi [~elgoiri], thanks for sharing the plan for the next steps. Just some 
comments on the open topics that you mentioned:

1. The #1 and #2 don't have to be done in this JIRA, right?

2.
{quote}
Spreading a mount point across multiple mount points...
{quote}
Does this mean that we support the mount point that points to multiple mount 
points? What's the use case for this?





[jira] [Commented] (HDFS-12615) Router-based HDFS federation phase 2

2018-02-01 Thread JIRA

[ 
https://issues.apache.org/jira/browse/HDFS-12615?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16349120#comment-16349120
 ] 

Íñigo Goiri commented on HDFS-12615:


[~linyiqun] was asking in HDFS-13043 what the plan would be once we completed 
the tasks related to exposing the Router state.
At this point the only critical task missing is enabling security (mainly 
HDFS-12284); [~zhengxg3] has been making progress there and we are only missing 
the delegation tokens.
The other tasks are mostly done and we should try to complete them; the most 
important, in my opinion, are:
* HDFS-12773 to do a proper HDFS State Store.
* HDFS-12792 to have complete unit tests.

Then there are a bunch of open topics:
# Rebalancer of data across subclusters. We have an internal prototype based 
on DistCp that I shared with [~ywskycn] (we could also include moving blocks 
across blockpools). We could start a separate umbrella for this.
# DNs interacting with the Routers. I recently opened HDFS-13098 to discuss 
this but this could.
# Spreading a mount point across multiple mount points. This is similar to 
merge mount points and we have most of it working in production.

What are people's opinions? Which ones should we tackle in this umbrella? Any 
preferences?

