[jira] [Commented] (YARN-1503) Support making additional 'LocalResources' available to running containers
[ https://issues.apache.org/jira/browse/YARN-1503?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16012024#comment-16012024 ]

Bingxue Qiu commented on YARN-1503:
---

Hi [~jianhe], we have a use case that needs the full implementation of localization status in ContainerStatusProto, so we implemented it. Please feel free to give some advice, thanks.
[YARN-6606|https://issues.apache.org/jira/browse/YARN-6606]

> Support making additional 'LocalResources' available to running containers
> --
>
> Key: YARN-1503
> URL: https://issues.apache.org/jira/browse/YARN-1503
> Project: Hadoop YARN
> Issue Type: Improvement
> Reporter: Siddharth Seth
> Assignee: Jian He
> Attachments: Continuous-resource-localization.pdf
>
>
> We have a use case, where additional resources (jars, libraries etc) need to
> be made available to an already running container. Ideally, we'd like this to
> be done via YARN (instead of having potentially multiple containers per node
> download resources on their own).
> Proposal:
> NM to support an additional API where a list of resources can be specified.
> Something like "localiceResource(ContainerId, Map<String, LocalResource>)"
> NM would also require an additional API to get state for these resources -
> "getLocalizationState(ContainerId)" - which returns the current state of all
> local resources for the specified container(s).

--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
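To make the proposal in the issue description concrete, here is a minimal in-memory sketch of the two calls it names. Everything in it is an assumption made for illustration: the type names, the use of plain strings for container IDs and resource URIs, the three-value state enum, and the `markFinished` helper are hypothetical and are not the real YARN classes. The method name normalizes the spelling of the proposal's "localiceResource".

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical names throughout; none of these are the real YARN types.
enum LocalizationState { PENDING, COMPLETED, FAILED }

interface ContainerLocalizer {
    // Request extra resources for a running container: link name -> remote URI.
    void localizeResource(String containerId, Map<String, String> resources);

    // Current state of every resource requested so far for the container.
    Map<String, LocalizationState> getLocalizationState(String containerId);
}

class InMemoryLocalizer implements ContainerLocalizer {
    private final Map<String, Map<String, LocalizationState>> states = new HashMap<>();

    @Override
    public synchronized void localizeResource(String containerId, Map<String, String> resources) {
        Map<String, LocalizationState> perContainer =
            states.computeIfAbsent(containerId, id -> new HashMap<>());
        for (String linkName : resources.keySet()) {
            // A repeated request does not reset a resource that is already tracked.
            perContainer.putIfAbsent(linkName, LocalizationState.PENDING);
        }
    }

    // Stand-in for a download finishing; a real NM would drive this itself.
    synchronized void markFinished(String containerId, String linkName, boolean success) {
        Map<String, LocalizationState> perContainer = states.get(containerId);
        if (perContainer != null && perContainer.containsKey(linkName)) {
            perContainer.put(linkName,
                success ? LocalizationState.COMPLETED : LocalizationState.FAILED);
        }
    }

    @Override
    public synchronized Map<String, LocalizationState> getLocalizationState(String containerId) {
        Map<String, LocalizationState> copy = new HashMap<>();
        Map<String, LocalizationState> perContainer = states.get(containerId);
        if (perContainer != null) {
            copy.putAll(perContainer);
        }
        return copy;
    }
}
```

In this sketch the request call returns immediately and the caller polls the state call, which is only one of the possible designs discussed in the thread.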
[jira] [Commented] (YARN-1503) Support making additional 'LocalResources' available to running containers
[ https://issues.apache.org/jira/browse/YARN-1503?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15438392#comment-15438392 ]

Jian He commented on YARN-1503:
---

It is differentiated. To clarify, I'm referring here to the re-localization process, not normal localization. For normal container localization, we can keep the behavior the same as today. For re-localization, i.e. localizing resources while the container is running, the container should not fail if the localization process fails. The AM just gets a notification that the localization failed; the AM itself chooses to ignore, retry, or fail the task depending on the use case.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
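The AM-side choice described above (ignore, retry, or fail the task) could be sketched roughly as follows. This is only an illustration under assumed names: `FailureAction`, `RelocalizationPolicy`, and its two parameters are hypothetical and not part of any YARN API. The container keeps running in every branch; only the task-level decision differs.

```java
// Hypothetical AM-side policy; the names are illustrative, not YARN APIs.
enum FailureAction { IGNORE, RETRY, FAIL_TASK }

class RelocalizationPolicy {
    private final boolean resourceRequired;
    private final int maxAttempts;

    RelocalizationPolicy(boolean resourceRequired, int maxAttempts) {
        this.resourceRequired = resourceRequired;
        this.maxAttempts = maxAttempts;
    }

    // Decide what the AM does after the NM reports a failed re-localization.
    // attemptsSoFar counts the attempts already made, including the failed one.
    FailureAction onLocalizationFailed(int attemptsSoFar) {
        if (!resourceRequired) {
            return FailureAction.IGNORE;      // the task can proceed without it
        }
        if (attemptsSoFar < maxAttempts) {
            return FailureAction.RETRY;       // ask the NM to localize again
        }
        return FailureAction.FAIL_TASK;       // give up on the task, not the container
    }
}
```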
[jira] [Commented] (YARN-1503) Support making additional 'LocalResources' available to running containers
[ https://issues.apache.org/jira/browse/YARN-1503?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15437626#comment-15437626 ]

Arun Suresh commented on YARN-1503:
---

bq. The re-localization process should not be tied to the container state machine, regardless of whether the localization fails or succeeds. The container continues to run.

Hmmm... then shouldn't we differentiate between re-localization and the localization that is required to start the container? Or are you proposing that the AM calls the new localize API first and then calls startContainer only after it receives a successful response? That way, we could maybe remove the localization-related states from the NM Container state machine completely, but that also means existing AMs would need to be modified (or maybe we can just handle it in the NMClient).
[jira] [Commented] (YARN-1503) Support making additional 'LocalResources' available to running containers
[ https://issues.apache.org/jira/browse/YARN-1503?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15436983#comment-15436983 ]

Junping Du commented on YARN-1503:
--

bq. Anyway, this is advanced stuff and does not conflict with the core change. I'll open a separate JIRA and discuss how to implement it when it comes up.

Sure. The plan sounds good to me. Just a notification for those watching this JIRA: I am starting to review the first patch (YARN-5557) under this umbrella, which should be independent of our discussions above. Please let me know if you have any concerns.
[jira] [Commented] (YARN-1503) Support making additional 'LocalResources' available to running containers
[ https://issues.apache.org/jira/browse/YARN-1503?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15436378#comment-15436378 ]

Jian He commented on YARN-1503:
---

Thanks for the feedback, Arun, Varun, Junping.

bq. I was wondering how this would tie into the NM Container state machine.

The re-localization process should not be tied to the container state machine, regardless of whether the localization fails or succeeds. The container continues to run. This echoes the requirement for Tez re-localization. The AM also gets a notification of whether the localization process succeeded or failed.

bq. I haven't found details of our solution for some failover cases, such as AM or NM restart:

For AM restart, the AM simply queries the NM for the localization status. For NM restart, the NM needs to persist the symlink mapping. I had thought of adding the symlink to the LocalResource object itself so that it gets persisted automatically. Anyway, this is advanced stuff and does not conflict with the core change. I'll open a separate JIRA and discuss how to implement it when it comes up.
[jira] [Commented] (YARN-1503) Support making additional 'LocalResources' available to running containers
[ https://issues.apache.org/jira/browse/YARN-1503?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15435976#comment-15435976 ]

Junping Du commented on YARN-1503:
--

Thanks [~jianhe] for the document. The use cases and proposal make sense to me. However, I haven't found details of our solution for some failover cases, such as AM or NM restart:
- If the AM fails and restarts, does the localization effort need to be redone? Do we have some way to track re-localization on running containers, either by communicating with the NMs or by sharing some context across attempts?
- The NM could restart just as it is receiving or replying to an additional localization request from the AM. We should consider persisting the request to the NM state store before replying to the AM.

Maybe we should add these scenarios to the detailed design?
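The second point above, persisting a request before replying to the AM, could look roughly like this. It is a sketch under stated assumptions, not the NM's actual recovery code: `RecoveryStore` is a hypothetical in-memory stand-in for the NM state store, and `LocalizationRequestHandler` is an invented name.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for the NM state store; a real one would write to disk.
class RecoveryStore {
    private final Map<String, List<String>> requests = new LinkedHashMap<>();

    synchronized void recordRequest(String containerId, String resourceUri) {
        requests.computeIfAbsent(containerId, id -> new ArrayList<>()).add(resourceUri);
    }

    // What a restarted NM would read back in order to resume localization.
    synchronized List<String> requestsFor(String containerId) {
        List<String> copy = new ArrayList<>();
        List<String> found = requests.get(containerId);
        if (found != null) {
            copy.addAll(found);
        }
        return copy;
    }
}

class LocalizationRequestHandler {
    private final RecoveryStore store;

    LocalizationRequestHandler(RecoveryStore store) {
        this.store = store;
    }

    // Persist first, acknowledge second: once the AM sees "true", the request
    // is already in the store and survives an NM restart.
    boolean handle(String containerId, String resourceUri) {
        store.recordRequest(containerId, resourceUri);
        return true;
    }
}
```

With this ordering, a restart between the write and the reply leaves the AM without an acknowledgement, so it may resend; a real implementation would therefore need the request to be idempotent.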
[jira] [Commented] (YARN-1503) Support making additional 'LocalResources' available to running containers
[ https://issues.apache.org/jira/browse/YARN-1503?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15435337#comment-15435337 ]

Vinod Kumar Vavilapalli commented on YARN-1503:
---

[~jianhe], [~asuresh], [~vvasudev],

This got a little too confusing. I just (a) edited the title of this JIRA back to what it was originally, and (b) reopened YARN-5532, editing its title back to what it originally targeted.

How about we do the following?
- Use YARN-5532 to build the basic building block of ResourceLocalization as an independent unit inside the NM. Today, it is wired into the Container's state machine. The goal is to make it first-class so that multiple consumers, such as upgrades and continuous localization, can use this core localization service.
- Use YARN-1503 (this JIRA) to define and implement APIs for continuous localization.
- Use YARN-4876 to define and implement the APIs needed for upgrades, along with the corresponding Container state machine changes.

Thoughts?
[jira] [Commented] (YARN-1503) Support making additional 'LocalResources' available to running containers
[ https://issues.apache.org/jira/browse/YARN-1503?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15432722#comment-15432722 ]

Jian He commented on YARN-1503:
---

I uploaded a design doc for this and updated the title and description accordingly.
[jira] [Commented] (YARN-1503) Support making additional 'LocalResources' available to running containers
[ https://issues.apache.org/jira/browse/YARN-1503?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13847380#comment-13847380 ]

Bikas Saha commented on YARN-1503:
--

+1. If multiple containers are already running on a node and we issue the same set of new resources to be localized for all these containers, then will the NM make sure that there is only 1 concurrent download? Will this be a blocking call or is the second API meant to be used as a polling mechanism?

Support making additional 'LocalResources' available to running containers
--

Key: YARN-1503
URL: https://issues.apache.org/jira/browse/YARN-1503
Project: Hadoop YARN
Issue Type: Improvement
Reporter: Siddharth Seth
Assignee: Siddharth Seth

We have a use case, where additional resources (jars, libraries etc) need to be made available to an already running container. Ideally, we'd like this to be done via YARN (instead of having potentially multiple containers per node download resources on their own).
Proposal:
NM to support an additional API where a list of resources can be specified. Something like localiceResource(ContainerId, Map<String, LocalResource>)
NM would also require an additional API to get state for these resources - getLocalizationState(ContainerId) - which returns the current state of all local resources for the specified container(s).

--
This message was sent by Atlassian JIRA
(v6.1.4#6159)
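One way the "only 1 concurrent download" behaviour asked about above could be obtained is to key in-flight downloads by remote URI and hand every requester the same future. The sketch below assumes this design purely for illustration; `SharedDownloads` and its `fetcher` parameter are hypothetical and do not describe how the NM actually works. The returned future also covers the second question, since a caller can either poll it with `isDone()` or block on it with `join()`.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;

// Hypothetical helper: one download per remote URI, shared by all requesters.
class SharedDownloads {
    private final ConcurrentMap<String, CompletableFuture<String>> byUri =
        new ConcurrentHashMap<>();
    private final AtomicInteger started = new AtomicInteger();
    private final Function<String, String> fetcher; // remote URI -> local path

    SharedDownloads(Function<String, String> fetcher) {
        this.fetcher = fetcher;
    }

    // Non-blocking: the first caller for a URI starts the download, later
    // callers get the same future back. computeIfAbsent is atomic per key.
    CompletableFuture<String> request(String remoteUri) {
        return byUri.computeIfAbsent(remoteUri, uri -> {
            started.incrementAndGet();
            return CompletableFuture.supplyAsync(() -> fetcher.apply(uri));
        });
    }

    int downloadsStarted() {
        return started.get();
    }
}
```

As a simplification, completed and failed futures stay in the map forever; a fuller version would evict failures so that a retry can start a fresh download.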
[jira] [Commented] (YARN-1503) Support making additional 'LocalResources' available to running containers
[ https://issues.apache.org/jira/browse/YARN-1503?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13847775#comment-13847775 ]

Vinod Kumar Vavilapalli commented on YARN-1503:
---

bq. We have a use case, where additional resources (jars, libraries etc) need to be made available to an already running container.

Could we get a slightly more detailed explanation of the use case so everyone can understand? And why is something like YARN-1040 not enough?
[jira] [Commented] (YARN-1503) Support making additional 'LocalResources' available to running containers
[ https://issues.apache.org/jira/browse/YARN-1503?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13848111#comment-13848111 ]

Siddharth Seth commented on YARN-1503:
--

bq. A slightly more detailed explanation of the use-case so everyone can understand? And why something like YARN-1040 is not enough.

YARN-1040 talks about launching multiple processes within the same container. This requirement is for a single running process: we want to avoid re-launching the process because of the cost of starting a new Java process. The specific use case is running different tasks within the same JVM, where one task may need some additional jars (Hive UDFs, for example).