[jira] [Comment Edited] (YARN-8881) Phase 1 - Add basic pluggable device plugin framework
[ https://issues.apache.org/jira/browse/YARN-8881?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16687713#comment-16687713 ] Zhankun Tang edited comment on YARN-8881 at 11/15/18 9:45 AM: -- [~sunilg] , {quote}I am like +0 for this change. {{Integer.compare(id, other.getId())}}. It could a simple {{if (id != other.getId())}} {quote} One question about this. The _Integer.compare(id, other.getId)_'s implementation is below: {code:java} public static int compare(int x, int y) { return (x < y) ? -1 : ((x == y) ? 0 : 1); }{code} Do you mean we just move it out is more performant? I didn't get your idea clearly. was (Author: tangzhankun): [~sunilg] , {quote}I am like +0 for this change. {{Integer.compare(id, other.getId())}}. It could a simple {{if (id != other.getId())}} {quote} One question about this. The _Integer.compare(id, other.getId)'_s implementation is below: {code:java} public static int compare(int x, int y) { return (x < y) ? -1 : ((x == y) ? 0 : 1); }{code} Do you mean we just move it out is more performant? I didn't get your idea clearly. > Phase 1 - Add basic pluggable device plugin framework > - > > Key: YARN-8881 > URL: https://issues.apache.org/jira/browse/YARN-8881 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Zhankun Tang >Assignee: Zhankun Tang >Priority: Major > Attachments: YARN-8881-trunk.001.patch, YARN-8881-trunk.002.patch, > YARN-8881-trunk.003.patch, YARN-8881-trunk.004.patch, > YARN-8881-trunk.005.patch, YARN-8881-trunk.006.patch, > YARN-8881-trunk.007.patch, YARN-8881-trunk.008.patch, > YARN-8881-trunk.009.patch, YARN-8881-trunk.010.patch > > > It includes adding support in "ResourcePluginManager" to load plugin classes > based on configuration, an interface for the vendor to implement and the > adapter to decouple plugin and YARN internals. And the vendor device resource > discovery will be ready after this support -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Comment Edited] (YARN-8881) Phase 1 - Add basic pluggable device plugin framework
[ https://issues.apache.org/jira/browse/YARN-8881?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16687406#comment-16687406 ] Zhankun Tang edited comment on YARN-8881 at 11/15/18 2:28 AM: -- [~csingh], Thanks for the review! {quote}It checks if the pluginClazz is an implementation of {{DevicePlugin}} via {{isAssginableFrom}}. Why is {{checkInterfaceCompatibility}} required? {quote} Zhankun => Good question. isAssiginableFrom only check whether the interface is the plugin class's parent. But it doesn't ensure the real methods implemented in the plugin class matches YARN needs. For instance, "FakeTestDevicePlugin4" implement a method "getOldRegisterRequestInfo"(should be "getRegisterRequestInfo") but can also be reflected and will throw an error until _plugin.getRegisterRequestInfo_ is called if we don't check. This could happen due to the plugin depend on an outdated NM API or some reason we don't know. Here the _checkInterfaceCompatibility_ does a basic fast fail to ensure methods invoked by YARN exists. We'll add more sanity-check later to ensure plugin that its return value is correct, the method implemented is stateless .etc. {quote}DevicePluginAdapter -> resourceName and devicePlugin can be final. LOG -> can be private. {quote} Zhankun => Thanks. Will fix it {quote}The braces {} don't need to be surrounded by quotes. It is surrounded by quotes at quite a few places. {quote} Zhankun => Yeah. Will remove all quotes. {quote}serialVersionUID is 1 for all the serializable classes. Use a time based or random large number. {quote} Zhankun => Yeah. Will change them. was (Author: tangzhankun): [~csingh], Thanks for the review! {quote}It checks if the pluginClazz is an implementation of {{DevicePlugin}} via {{isAssginableFrom}}. Why is {{checkInterfaceCompatibility}} required? {quote} Zhankun => Good question. isAssiginableFrom only check whether the interface is the plugin class's parent. But it doesn't ensure the real methods implemented in the plugin class matches YARN needs. For instance, "FakeTestDevicePlugin4" implement a method "getOldRegisterRequestInfo"(should be "getRegisterRequestInfo") but can also be reflected and will throw an error until _plugin.getRegisterRequestInfo_ is called. This could happen due to the plugin depend on an outdated NM API or some reason we don't know. Here the _checkInterfaceCompatibility_ does a basic fast fail to ensure methods invoked by YARN exists. We'll add more sanity-check later to ensure plugin that its return value is correct, the method implemented is stateless .etc. {quote}DevicePluginAdapter -> resourceName and devicePlugin can be final. LOG -> can be private. {quote} Zhankun => Thanks. Will fix it {quote}The braces {} don't need to be surrounded by quotes. It is surrounded by quotes at quite a few places. {quote} Zhankun => Yeah. Will remove all quotes. {quote}serialVersionUID is 1 for all the serializable classes. Use a time based or random large number. {quote} Zhankun => Yeah. Will change them. > Phase 1 - Add basic pluggable device plugin framework > - > > Key: YARN-8881 > URL: https://issues.apache.org/jira/browse/YARN-8881 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Zhankun Tang >Assignee: Zhankun Tang >Priority: Major > Attachments: YARN-8881-trunk.001.patch, YARN-8881-trunk.002.patch, > YARN-8881-trunk.003.patch, YARN-8881-trunk.004.patch, > YARN-8881-trunk.005.patch, YARN-8881-trunk.006.patch, > YARN-8881-trunk.007.patch, YARN-8881-trunk.008.patch, > YARN-8881-trunk.009.patch > > > It includes adding support in "ResourcePluginManager" to load plugin classes > based on configuration, an interface for the vendor to implement and the > adapter to decouple plugin and YARN internals. And the vendor device resource > discovery will be ready after this support -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Comment Edited] (YARN-8881) Phase 1 - Add basic pluggable device plugin framework
[ https://issues.apache.org/jira/browse/YARN-8881?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16687406#comment-16687406 ] Zhankun Tang edited comment on YARN-8881 at 11/15/18 2:27 AM: -- [~csingh], Thanks for the review! {quote}It checks if the pluginClazz is an implementation of {{DevicePlugin}} via {{isAssginableFrom}}. Why is {{checkInterfaceCompatibility}} required? {quote} Zhankun => Good question. isAssiginableFrom only check whether the interface is the plugin class's parent. But it doesn't ensure the real methods implemented in the plugin class matches YARN needs. For instance, "FakeTestDevicePlugin4" implement a method "getOldRegisterRequestInfo"(should be "getRegisterRequestInfo") but can also be reflected and will throw an error until _plugin.getRegisterRequestInfo_ is called. This could happen due to the plugin depend on an outdated NM API or some reason we don't know. Here the _checkInterfaceCompatibility_ does a basic fast fail to ensure methods invoked by YARN exists. We'll add more sanity-check later to ensure plugin that its return value is correct, the method implemented is stateless .etc. {quote}DevicePluginAdapter -> resourceName and devicePlugin can be final. LOG -> can be private. {quote} Zhankun => Thanks. Will fix it {quote}The braces {} don't need to be surrounded by quotes. It is surrounded by quotes at quite a few places. {quote} Zhankun => Yeah. Will remove all quotes. {quote}serialVersionUID is 1 for all the serializable classes. Use a time based or random large number. {quote} Zhankun => Yeah. Will change them. was (Author: tangzhankun): [~csingh], Thanks for the review! {quote}It checks if the pluginClazz is an implementation of {{DevicePlugin}} via {{isAssginableFrom}}. Why is {{checkInterfaceCompatibility}} required? {quote} Zhankun => Good question. isAssiginableFrom only check whether the interface is the plugin class's parent. But it doesn't ensure the real methods implemented in the plugin class matches YARN needs. For instance, "FakeTestDevicePlugin4" doesn't implement a method "getOldRegisterRequestInfo" but can also be reflected and will throw an error until _plugin.getRegisterRequestInfo_ is called. This could happen due to the plugin depend on an outdated NM API or something we don't know. Here the _checkInterfaceCompatibility_ does a basic fast fail to ensure methods invoked by YARN exists. We'll add more sanity-check later to ensure plugin that its return value is correct, the method implemented is stateless .etc. {quote}DevicePluginAdapter -> resourceName and devicePlugin can be final. LOG -> can be private. {quote} Zhankun => Thanks. Will fix it {quote}The braces {} don't need to be surrounded by quotes. It is surrounded by quotes at quite a few places. {quote} Zhankun => Yeah. Will remove all quotes. {quote}serialVersionUID is 1 for all the serializable classes. Use a time based or random large number. {quote} Zhankun => Yeah. Will change them. > Phase 1 - Add basic pluggable device plugin framework > - > > Key: YARN-8881 > URL: https://issues.apache.org/jira/browse/YARN-8881 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Zhankun Tang >Assignee: Zhankun Tang >Priority: Major > Attachments: YARN-8881-trunk.001.patch, YARN-8881-trunk.002.patch, > YARN-8881-trunk.003.patch, YARN-8881-trunk.004.patch, > YARN-8881-trunk.005.patch, YARN-8881-trunk.006.patch, > YARN-8881-trunk.007.patch, YARN-8881-trunk.008.patch, > YARN-8881-trunk.009.patch > > > It includes adding support in "ResourcePluginManager" to load plugin classes > based on configuration, an interface for the vendor to implement and the > adapter to decouple plugin and YARN internals. And the vendor device resource > discovery will be ready after this support -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Comment Edited] (YARN-8881) Phase 1 - Add basic pluggable device plugin framework
[ https://issues.apache.org/jira/browse/YARN-8881?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16685355#comment-16685355 ] Zhankun Tang edited comment on YARN-8881 at 11/13/18 3:20 PM: -- [~sunilg], Many thanks for the review! {quote}In Device class, any reason why using Boxed type Integer. It may be better to use int {quote} Zhankun => Yeah. int is better for performance. Here the integer can do a null check to ensure the plugin to set it. If the plugin doesn't set it, int would lead to 0 by default. We cannot distinguish whether it's intentional or not. {quote}DevicePlugin is placed in hadoop-yarn-server-nodemanager package. Is that correct? {quote} Zhankun => DevicePlugin.java is placed in nodemanager package. Do you mean the real vendor device plugin? We haven't decided whether to put the vendor plugin maven project. {quote}DevicePlugin: do we need validation interface api to ensure basic sanity is done during allocation? {quote} Zhankun => Yeah. For now the "checkInterfaceCompatibility(DevicePlugin.class, pluginClazz)" will do a very basic check to ensure the interfaces implemented is correct. More complex checking like if the plugin methods are stateless, return result .etc is valid will be wrapped into a new specific class in the future. And can be used as a seperate tool. Please see YARN-8946. {quote}Why DeviceRegisterRequest need to serializable? {quote} Zhankun => Good point. I did this incase we have serialization in the future. This is overkill. I'll delete it. {quote}who will really invoke adapter.getNodeResourceHandlerInstance() ? {quote} Zhankun => It's the "NodeStatusUpdaterImpl"#serviceInit. {quote}DevicePluginAdapter cleanup is missing? Does it need to be handled by ResourcePluginManager? {quote} Zhankun => It's not missing. We haven't seen scenarios that need logic here. So it's empty. {quote}In below case, do we need to throw exception? if (pluginMap.containsKey(resourceName)) Unknown macro: \{ throw new YarnRuntimeException(resourceName + " already registered! Please change resource type name"); 162 } {quote} Zhankun => Yeah. I'm afraid so. This indicates that the plugin registered a resourceName that already exists. We could just log an error and ignore it but I'm afraid that the admin would think his configuration or plugin is working correctly although it's not. So we throw an exception to tell that there's something wrong and the admin needs to know this. The error message is not good enough, I'll refine it to "... Please change resource type name for or configure correct resource name in resource-types.xml". was (Author: tangzhankun): [~sunilg], Many thanks for the review! {quote}In Device class, any reason why using Boxed type Integer. It may be better to use int {quote} Zhankun => Yeah. int is better for performance. Here the integer can do a null check to ensure the plugin to set it. If the plugin doesn't set it, int would lead to 0 by default. We cannot distinguish whether it's intentional or not. {quote}DevicePlugin is placed in hadoop-yarn-server-nodemanager package. Is that correct? {quote} Zhankun => DevicePlugin.java is placed in nodemanager package. Do you mean the real vendor device plugin? We haven't decided whether to put the vendor plugin maven project. {quote}DevicePlugin: do we need validation interface api to ensure basic sanity is done during allocation? {quote} Zhankun => Yeah. For now the "checkInterfaceCompatibility(DevicePlugin.class, pluginClazz)" will do a very basic check to ensure the interfaces implemented is correct. More complex checking like if the plugin methods are stateless, return result .etc is valid will be wrapped into a new specific class in the future. And can be used as a seperate tool. Please see YARN-8946. {quote}Why DeviceRegisterRequest need to serializable? {quote} Zhankun => Good point. I did this incase we have serialization in the future. This is overkill. I'll delete it. {quote}who will really invoke adapter.getNodeResourceHandlerInstance() ? {quote} Zhankun => It's the "NodeStatusUpdaterImpl"#serviceInit. {quote}DevicePluginAdapter cleanup is missing? Does it need to be handled by ResourcePluginManager? {quote} Zhankun => It's not missing. We haven't seen scenarios that need logic here. So it's empty. {quote}In below case, do we need to throw exception? if (pluginMap.containsKey(resourceName)) Unknown macro: \{ throw new YarnRuntimeException(resourceName + " already registered! Please change resource type name"); 162 } {quote} Zhankun => Yeah. I'm afraid so. This indicates that the plugin registered a resourceName that already exists. We could just log an error and ignore it but I'm afraid that the admin would think his configuration or plugin is working correctly although it's not. So we throw an exception to tell that there's