Re: [PR] AMBARI-26065: Add Hadoop Federation Router Service support [ambari]

2024-04-26 Thread via GitHub


JiaLiangC commented on PR #3782:
URL: https://github.com/apache/ambari/pull/3782#issuecomment-2080308226

   @AnanyaSingh2121 
   I believe decoupling is necessary. I'm not clear on how you enabled 
federation, or whether it requires modifying the Hadoop configuration or 
restarting the cluster. With a standalone router, however, testing requires no 
changes to the current Hadoop cluster configuration to enable federation.
   
   1. First, we can enable federation on an existing cluster without altering 
any Hadoop cluster configurations, so implementing federation has no impact on 
this cluster. It only requires configuring the router to monitor this Hadoop 
cluster. Therefore, to enable federation for a cluster, one just needs to 
deploy a router, point it at the NameNode address to monitor, set up a common 
ZooKeeper and namespace, and then mount it. If we want to separate this cluster 
from the federated cluster for independent use, we only need to uninstall this 
router, again without any modifications or restarts of the existing Hadoop 
cluster.
   
   I think we should distinguish between client and server configurations. 
Ambari currently lacks a way to differentiate Hadoop's client configurations 
from its server configurations. After enabling federation, only the client 
configurations need to change, and this causes no conflict because the 
server-side configurations remain unchanged. Client configurations are what 
other big data components read. Alternatively, a separate set of federated 
client configurations can be maintained manually for other components.
   
   After enabling federation, server-side configurations remain unchanged. 
   I provide a client configuration such as:
   dfs.nameservices=ns-fed
   dfs.namenode.rpc-address.ns-fed.r1=your router1 address
   dfs.namenode.rpc-address.ns-fed.r2=your router2 address
   dfs.namenode.rpc-address.ns-fed.rn=your routern address
   
   fs.defaultFS=hdfs://ns-fed
   This way, all users utilizing this client configuration will access the 
federation.
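
As a rough illustration of the client configuration above, the sketch below 
generates the same properties from a list of Router RPC addresses. It is an 
example under stated assumptions, not part of the PR: the function name 
`build_router_client_conf` and the `router*.example.com:8888` addresses are 
hypothetical. The two extra keys (`dfs.ha.namenodes.<ns>` and 
`dfs.client.failover.proxy.provider.<ns>`) do not appear in the comment above; 
they are added here on the assumption that a logical nameservice with several 
RPC addresses needs them so the client can fail over between Routers.

```python
from typing import Dict, List


def build_router_client_conf(nameservice: str,
                             router_rpc_addresses: List[str]) -> Dict[str, str]:
    """Return client-side properties that point a logical nameservice at Routers.

    Hypothetical helper for illustration only; it is not Ambari or Hadoop code.
    """
    if not router_rpc_addresses:
        raise ValueError("at least one Router RPC address is required")

    # Logical ids r1, r2, ... rn, matching the naming used in the comment.
    router_ids = [f"r{i}" for i in range(1, len(router_rpc_addresses) + 1)]

    conf = {
        "fs.defaultFS": f"hdfs://{nameservice}",
        "dfs.nameservices": nameservice,
        # Assumption: these two keys are needed for client-side failover.
        f"dfs.ha.namenodes.{nameservice}": ",".join(router_ids),
        f"dfs.client.failover.proxy.provider.{nameservice}":
            "org.apache.hadoop.hdfs.server.namenode.ha."
            "ConfiguredFailoverProxyProvider",
    }
    for router_id, address in zip(router_ids, router_rpc_addresses):
        conf[f"dfs.namenode.rpc-address.{nameservice}.{router_id}"] = address
    return conf


if __name__ == "__main__":
    example = build_router_client_conf(
        "ns-fed",
        ["router1.example.com:8888", "router2.example.com:8888"],  # placeholders
    )
    for key in sorted(example):
        print(f"{key}={example[key]}")
```

In a real deployment `fs.defaultFS` would go into the client's core-site and 
the remaining keys into its hdfs-site; the server-side files stay untouched.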
   
   If enabling or disabling federation requires restarting the entire cluster, 
that would be unacceptable for an existing cluster carrying a significant 
volume of business.
   
   
   It seems you have developed a more comprehensive router solution. You can 
open a new issue and submit a PR. I will consider closing this PR accordingly.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscr...@ambari.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: dev-unsubscr...@ambari.apache.org
For additional commands, e-mail: dev-h...@ambari.apache.org



Re: [PR] AMBARI-26065: Add Hadoop Federation Router Service support [ambari]

2024-04-26 Thread via GitHub


AnanyaSingh2121 commented on PR #3782:
URL: https://github.com/apache/ambari/pull/3782#issuecomment-2079794895

   Hi @JiaLiangC ,
   
   Thank you for addressing the points in detail. I have some more feedback.
   
   - Ideally, we will not have a federated cluster without Router, because 
Router is what we will use to manage federation. So installing it as a separate 
service does not provide the required decoupling. Once federation is enabled, 
Router is a requirement, so it does not make sense to just delete Router. If 
Router needs to be uninstalled, then federation will also have to be disabled. 
We cannot separate the two, in my opinion.
   
   - About the client requests: yes, you are correct that we will eventually 
have to modify core-site and hdfs-site, so it is better to have all the conf 
files in one place. My concern was also about managing multiple Routers, since 
in that case you cannot set fs.defaultFS to a single Router's RPC address. For 
this purpose the Hadoop community proposes ns-fed, a logical namespace for the 
Routers. That will require testing in Ambari, as those configurations will give 
rise to conflicts. The Hadoop community has also raised a Jira for the same.
   
   - Also, we have developed and tested Router-based federation with Router as 
a component of HDFS. We have also included several UI changes so that Router 
can be enabled as part of the federation flow. We will raise a Jira for this 
and submit a patch. 
   
   Request your inputs also on the same.
   
   Thanks





Re: [PR] AMBARI-26065: Add Hadoop Federation Router Service support [ambari]

2024-04-23 Thread via GitHub


JiaLiangC commented on PR #3782:
URL: https://github.com/apache/ambari/pull/3782#issuecomment-2073814703

   @AnanyaSingh2121 
   Thank you for your feedback. Let me address your questions:
   
   1. **Why treat the router as a separate service?**
   Although the router is part of the Hadoop source code, from an architectural 
standpoint the router cluster in a federated setup operates independently of 
any specific Hadoop cluster and acts as a routing layer. Thus, I believe it is 
inappropriate to associate the router with any particular Hadoop cluster 
during deployment. Moreover, deploying the router independently allows more 
flexible scaling and clearer maintenance. To transition a cluster into a 
federated cluster, one simply needs to add a router and modify a few 
configurations. If a cluster with a router needs to leave the federated setup 
and return to being an independent cluster, this can easily be done by 
uninstalling the router service and adjusting the current Hadoop 
configurations, which keeps the coupling low.
   
   2. **About routing client requests to clusters and the configuration:**
   After deploying the router, configuring it to monitor the current cluster, 
and modifying the cluster's hdfs-site, you can mount directories with 
`hdfs dfsrouteradmin` from the command line. Clients access the cluster using 
the `core-site` and `hdfs-site` configurations, so by modifying those, all 
client configurations can be pointed at the router cluster.
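
The mount step can be sketched as follows. This is a minimal example that 
assumes the `hdfs dfsrouteradmin -add <mount point> <nameservice> <target path>` 
form of the command; the helper name `mount_add_command` and the `/data` and 
`ns1` values are hypothetical and only serve the illustration.

```python
import shlex
from typing import List


def mount_add_command(mount_point: str, nameservice: str,
                      target_path: str) -> List[str]:
    """Build the argv for adding one mount table entry via the Router admin CLI.

    Hypothetical helper; it only assembles the command and does not run it.
    """
    for label, path in (("mount_point", mount_point),
                        ("target_path", target_path)):
        if not path.startswith("/"):
            raise ValueError(f"{label} must be an absolute path: {path!r}")
    if not nameservice:
        raise ValueError("nameservice must not be empty")
    return ["hdfs", "dfsrouteradmin", "-add",
            mount_point, nameservice, target_path]


if __name__ == "__main__":
    # Placeholder values: map /data in the federated view to /data on ns1.
    print(shlex.join(mount_add_command("/data", "ns1", "/data")))
```

Returning an argv list rather than a single string keeps the command safe to 
pass to `subprocess.run` without shell quoting issues.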
   
   3. **On your third point:**
   That's a great idea. It requires coordination with the frontend. Perhaps we 
can iterate over it step by step; achieving perfection right away is 
challenging.
   
   4. **Regarding the router `service_advisor` you mentioned:**
   If you're interested, you might consider submitting a PR to address this.
   





Re: [PR] AMBARI-26065: Add Hadoop Federation Router Service support [ambari]

2024-04-23 Thread via GitHub


AnanyaSingh2121 commented on PR #3782:
URL: https://github.com/apache/ambari/pull/3782#issuecomment-2072920519

   Hi @JiaLiangC , 
   Thank you for initiating the integration of dfsrouter with Ambari. I was 
going through the PR and had a few suggestions and doubts. 
   I see that you are adding Router as a separate service in Ambari. 
Essentially, Router is just an HDFS component. Could you please explain the 
reason for this approach? We are still going to store the Router configuration 
file under the hadoop_conf directory.
   
   Also, could you please explain how the following will be handled in your 
approach:
   
   - How is it ensured that client requests are routed via the Router in 
single-Router and multi-Router setups?
   - The Hadoop community requires a certain set of configurations for managing 
a multi-Router setup. How will we support those configurations?
   - Router is made a master, so the cardinality will be 1+. But the user 
should not be able to add a Router without enabling multiple namespaces. Don't 
you think this should be integrated into the Ambari federation setup, so that 
Router installation can be controlled?
   - Some Router configurations, such as registering the nameservices, can and 
should be set automatically by Ambari, since it is aware of the multiple 
namespaces in the cluster. Why then do we need to include those configurations 
in hdfs-rbf-site? In my opinion, these configurations should only be loaded 
when multiple namespaces exist.
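
One plausible reading of the last point is that the list of NameNodes a Router 
monitors could be derived from the hdfs-site that Ambari already manages. The 
sketch below illustrates that idea only. It assumes the Router property 
`dfs.federation.router.monitor.namenode` takes comma-separated `ns.nn` 
identifiers; the function name `router_monitor_targets` and the sample 
nameservices are hypothetical, and nothing here reflects how Ambari or the PR 
actually computes the value.

```python
from typing import Dict, List


def router_monitor_targets(hdfs_site: Dict[str, str]) -> str:
    """Derive a monitor list from the nameservices declared in hdfs-site.

    Hypothetical helper: emits "ns.nn" for each HA NameNode, or just "ns"
    when a nameservice declares no NameNode ids (non-HA).
    """
    targets: List[str] = []
    nameservices = hdfs_site.get("dfs.nameservices", "")
    for ns in (item.strip() for item in nameservices.split(",")):
        if not ns:
            continue
        nn_ids = [
            nn.strip()
            for nn in hdfs_site.get(f"dfs.ha.namenodes.{ns}", "").split(",")
            if nn.strip()
        ]
        if nn_ids:
            targets.extend(f"{ns}.{nn}" for nn in nn_ids)
        else:
            targets.append(ns)
    return ",".join(targets)


if __name__ == "__main__":
    sample_hdfs_site = {  # placeholder values for illustration
        "dfs.nameservices": "ns1,ns2",
        "dfs.ha.namenodes.ns1": "nn1,nn2",
        "dfs.ha.namenodes.ns2": "nn1,nn2",
    }
    print(router_monitor_targets(sample_hdfs_site))
```

An empty result means no nameservices are declared, which is the case where 
the Router configurations arguably should not be loaded at all.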





Re: [PR] AMBARI-26065: Add Hadoop Federation Router Service support [ambari]

2024-04-21 Thread via GitHub


JiaLiangC commented on PR #3782:
URL: https://github.com/apache/ambari/pull/3782#issuecomment-2068305909

   @brahmareddybattula 
   Certainly, multiple routers have been tested. Specific test screenshots have 
been added to the PR description above.





Re: [PR] AMBARI-26065: Add Hadoop Federation Router Service support [ambari]

2024-04-21 Thread via GitHub


brahmareddybattula commented on PR #3782:
URL: https://github.com/apache/ambari/pull/3782#issuecomment-2068068858

   @JiaLiangC thanks for reporting this and working on it; this is a 
much-needed feature. Did you also test with multiple routers?





Re: [PR] AMBARI-26065: Add Hadoop Federation Router Service support [ambari]

2024-04-19 Thread via GitHub


JiaLiangC commented on PR #3782:
URL: https://github.com/apache/ambari/pull/3782#issuecomment-2066120022

   @virajjasani Could you help review this PR?

