[GitHub] flink issue #2461: [FLINK-4505][Cluster Management] Implement TaskManagerFac...
Github user wangzhijiang999 commented on the issue: https://github.com/apache/flink/pull/2461 @tillrohrmann , thank you for merging and help. If there are any following works to do related with TaskManager, you can assign to me and I am willing to do. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] flink issue #2461: [FLINK-4505][Cluster Management] Implement TaskManagerFac...
Github user tillrohrmann commented on the issue: https://github.com/apache/flink/pull/2461 I've merged your PR @wangzhijiang999. Thanks for your work :-) You can close this PR now. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] flink issue #2461: [FLINK-4505][Cluster Management] Implement TaskManagerFac...
Github user tillrohrmann commented on the issue: https://github.com/apache/flink/pull/2461 Thanks for your contribution @wangzhijiang999. I'll merge it. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] flink issue #2461: [FLINK-4505][Cluster Management] Implement TaskManagerFac...
Github user wangzhijiang999 commented on the issue: https://github.com/apache/flink/pull/2461 @tillrohrmann , I created the `TaskExecutorRunner` for constructing the related components for starting `TaskExecutor`, and removed the factory class. Currently only the `ResourceID` and 'Configuration` parameters must be passed to the method `startComponents`, other parameters are optional. After you confirm the way , I will write some testings based on it. The `MemoryLogger` is removed from `TaskExecutorRunner` temporarily. The `ActorGateway` should not be passed to the constructor of `MemoryLogger` directly I think. If you agree, I will re-structure the `MemoryLogger` in another jira. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] flink issue #2461: [FLINK-4505][Cluster Management] Implement TaskManagerFac...
Github user tillrohrmann commented on the issue: https://github.com/apache/flink/pull/2461 I think you should have at least another method `startComponents` which starts the different components. Everything else can be added later when we see that it would be helpful. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] flink issue #2461: [FLINK-4505][Cluster Management] Implement TaskManagerFac...
Github user tillrohrmann commented on the issue: https://github.com/apache/flink/pull/2461 Hi @wangzhijiang999, maybe we should start simple without introducing a factory method, because there might actually be not many cases to distinguish. Maybe we could rename the `TaskManagerFactory` into `TaskManagerRunner` which has static methods to create the `TaskManagers` components and does the network selection. That way we keep the initialization and the actual `TaskManager` logic separated. For testing purposes I guess we don't need to setup any components because they are usually mocked or one is using testing components. Passing these components to the constructor of the `TaskManager` should not be a big deal. Does this make sense? --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] flink issue #2461: [FLINK-4505][Cluster Management] Implement TaskManagerFac...
Github user wangzhijiang999 commented on the issue: https://github.com/apache/flink/pull/2461 @tillrohrmann , I tried to understand your idea as follows: 1. Provide specific TaskExecutorFactory class instead of abstract factory for both standalone/yarn mode. 2. Network selection and RPC service creation methods should be pulled out from factory, maybe remove to some utility class?(for JM reuse) 3. The constructor of TaskExecutorFactory supports many different parameters: such as TaskExecutorFactory(Configuration, ResourceID) TaskExecutorFactory(RpcService, HighAvailabilityServices) TaskExecutorFactory(hostname, port) TaskExecutorFactory(Configuration, ResourceID, RpcService, HighAvailabilityServices,hostname, port), The above three constructors for partial parameters can be transformed into the fourth final constructor , all the missing parameters can be generated by internal default value. 4. The TaskExecutorFactory supports the method "createTaskManger" to bring up TaskManger for outside world, and this method will construct the related components(TaskManagerConfiguration, NetworkManager, IOManager, MemoryManager) 5. For testing mode, construct the TestingTaskExecutorFactory to pass all the components explicitly, including (ResourceID,MemoryManager,IOManager,NetworkEnvironment,numberOfSlots, RpcService,HighAvailabilityServices), the TaskManagerConfiguration should be passed from outside or generate implicitly? 6. In addition, the localTaskManagerCommunication parameter is needed before to decide whether to create NettyConfig for standalone or yarn mode. Now I will remove this parameter to create ~~NettyConfig~~ always. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] flink issue #2461: [FLINK-4505][Cluster Management] Implement TaskManagerFac...
Github user tillrohrmann commented on the issue: https://github.com/apache/flink/pull/2461 I think we don't need the distinction between standalone and yarn in the TaskManager startup. They should actually be the same. The difference will be the `StandaloneRunner`, `YarnRunner` and `MesosRunner`. I think the factory should be responsible for creating the `TaskManager` components. I would pull out the `RpcService` creation and, thus, also the network interface selection, because this is code which can be re-used across different components (TM and JM, for example). --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] flink issue #2461: [FLINK-4505][Cluster Management] Implement TaskManagerFac...
Github user wangzhijiang999 commented on the issue: https://github.com/apache/flink/pull/2461 @tillrohrmann , the current implementation is just follow the previous process for yarn and standalone modes. And I agree your opinion and actually it is not very clear for current ways. I think first we should re-define and confirm the parameters passed to the "createTaskManager" method for different factories, then I can fix the inner processes based on the input parameters. Would you suggest the specific parameters passed for different factories? Thank you! --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] flink issue #2461: [FLINK-4505][Cluster Management] Implement TaskManagerFac...
Github user tillrohrmann commented on the issue: https://github.com/apache/flink/pull/2461 Thanks for the contribution @wangzhijiang999. I think the abstraction is not right. The `StandaloneFactory` and the `YarnFactory` should actually only differ in the class of the `TaskExecutor` they start. Furthermore, I think that the network interface selection should not be part of the factory. Instead I would pass the hostname and port to the `createTaskManager(hostname, port)` method which constructs the `TaskManager`. Maybe it makes even sense to pass the `RpcService` via the `createTaskManager(RpcService)` and defer the RpcService creation to the outside method running the select interface method. The difference between the Standalone/Yarn and Testing factory is that the first two create the TM components whereas the `TestingFactory` is initialized with the `TaskManager` components. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] flink issue #2461: [FLINK-4505][Cluster Management] Implement TaskManagerFac...
Github user wangzhijiang999 commented on the issue: https://github.com/apache/flink/pull/2461 @mxm , The {{StandaloneTaskExecutorFactory}} can be used for mini cluster or testing mode I think, and the {{YarnTaskExecutorFactory}} used for yarn mode. After you confirm the implementation, I can add some testings for the factory if needed. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---