Github user mgaido91 commented on the issue:
https://github.com/apache/spark/pull/21216
> I disagree that it is federation. It's just declaring multiple HDFS
services in the same config file.
I am using the terminology used on the Hadoop website. The
configuration I am using is the basic one suggested in the Federation
Configuration section, as you can see here:
https://hadoop.apache.org/docs/r2.8.3/hadoop-project-dist/hadoop-hdfs/Federation.html#Federation_Configuration.
What you are referring to as federation is called federation + ViewFS
on the Hadoop website:
https://hadoop.apache.org/docs/stable/hadoop-project-dist/hadoop-hdfs/ViewFs.html#Appendix:_A_Mount_Table_Configuration_Example.
Anyway, there are two possible configurations:
1. What I refer to as federation (without ViewFS): without this change,
writes to any namespace other than the default one would fail;
2. ViewFS enabled, where this change is not needed; in this case, the
risk with this change is that we would handle the same thing twice.
So, what about adding a check for whether ViewFS is enabled: if so, we skip
the code added here; if not, we add all the namespaces. That way, all
scenarios should be covered. What do you think?
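The proposed check could be sketched roughly as follows. This is illustrative, not actual Spark code: the `Map` stands in for Hadoop's `Configuration`, and the helper name is made up, but `fs.defaultFS` and `dfs.nameservices` are the real Hadoop property names from the Federation Configuration docs.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;

class NamespaceResolver {
    // Hypothetical helper: decide which filesystems/namespaces to handle,
    // given a map of Hadoop configuration properties.
    static List<String> namespacesToHandle(Map<String, String> conf) {
        String defaultFs = conf.getOrDefault("fs.defaultFS", "hdfs://localhost:8020");
        if (defaultFs.startsWith("viewfs://")) {
            // Case 2: ViewFS enabled. The mount table already routes every
            // path to the right nameservice, so we skip the extra handling
            // and avoid dealing with the same namespace twice.
            return Arrays.asList(defaultFs);
        }
        // Case 1: federation without ViewFS. Enumerate every nameservice
        // declared in dfs.nameservices, so writes to namespaces other than
        // the default one do not fail.
        String nameservices = conf.get("dfs.nameservices");
        if (nameservices == null || nameservices.isEmpty()) {
            return Arrays.asList(defaultFs);
        }
        List<String> result = new ArrayList<>();
        for (String ns : nameservices.split(",")) {
            result.add("hdfs://" + ns.trim());
        }
        return result;
    }
}
```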
---
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]