[jira] [Commented] (YARN-1040) De-link container life cycle from an Allocation

2016-03-25 Thread Bikas Saha (JIRA)
[ https://issues.apache.org/jira/browse/YARN-1040?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15212248#comment-15212248 ] Bikas Saha commented on YARN-1040: -- bq. Hmmm.. Given that launching multiple processes, being a new

[jira] [Commented] (YARN-1040) De-link container life cycle from an Allocation

2016-03-23 Thread Bikas Saha (JIRA)
[ https://issues.apache.org/jira/browse/YARN-1040?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15208906#comment-15208906 ] Bikas Saha commented on YARN-1040: -- It would be great if existing apps can use the changes in YARN-1040 to

[jira] [Commented] (YARN-1040) De-link container life cycle from an Allocation

2016-03-18 Thread Bikas Saha (JIRA)
[ https://issues.apache.org/jira/browse/YARN-1040?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15202391#comment-15202391 ] Bikas Saha commented on YARN-1040: -- This design doc effectively looks like a re-design of almost all core

[jira] [Updated] (YARN-4758) Enable discovery of AMs by containers

2016-03-03 Thread Bikas Saha (JIRA)
[ https://issues.apache.org/jira/browse/YARN-4758?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bikas Saha updated YARN-4758: - Description: {color:red} This is already discussed on the umbrella JIRA YARN-1489. Copying some of my

[jira] [Commented] (YARN-1011) [Umbrella] Schedule containers based on utilization of currently allocated containers

2016-02-28 Thread Bikas Saha (JIRA)
[ https://issues.apache.org/jira/browse/YARN-1011?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15171244#comment-15171244 ] Bikas Saha commented on YARN-1011: -- bq. If it absolutely wants a guaranteed container, we should allocate

[jira] [Commented] (YARN-1040) De-link container life cycle from the process and add ability to execute multiple processes in the same long-lived container

2016-02-25 Thread Bikas Saha (JIRA)
[ https://issues.apache.org/jira/browse/YARN-1040?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15167889#comment-15167889 ] Bikas Saha commented on YARN-1040: -- Vinod, the plan you are suggesting has merits. But my initial

[jira] [Commented] (YARN-1040) De-link container life cycle from the process and add ability to execute multiple processes in the same long-lived container

2016-02-25 Thread Bikas Saha (JIRA)
[ https://issues.apache.org/jira/browse/YARN-1040?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15167738#comment-15167738 ] Bikas Saha commented on YARN-1040: -- My guess is that YARN-4725 may be redundant after we do this work

[jira] [Commented] (YARN-1040) De-link container life cycle from the process and add ability to execute multiple processes in the same long-lived container

2016-02-24 Thread Bikas Saha (JIRA)
[ https://issues.apache.org/jira/browse/YARN-1040?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15166633#comment-15166633 ] Bikas Saha commented on YARN-1040: -- I am sorry if I caused a digression by mentioning Slider etc. I am

[jira] [Commented] (YARN-1040) De-link container life cycle from the process and add ability to execute multiple processes in the same long-lived container

2016-02-24 Thread Bikas Saha (JIRA)
[ https://issues.apache.org/jira/browse/YARN-1040?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15165668#comment-15165668 ] Bikas Saha commented on YARN-1040: -- Agree with your scenarios. I am trying to figure a way by which this

[jira] [Commented] (YARN-1040) De-link container life cycle from the process and add ability to execute multiple processes in the same long-lived container

2016-02-24 Thread Bikas Saha (JIRA)
[ https://issues.apache.org/jira/browse/YARN-1040?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15163708#comment-15163708 ] Bikas Saha commented on YARN-1040: -- I am not sure we need to place (somewhat artificial) constraints on

[jira] [Commented] (YARN-1011) [Umbrella] Schedule containers based on utilization of currently allocated containers

2016-02-22 Thread Bikas Saha (JIRA)
[ https://issues.apache.org/jira/browse/YARN-1011?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15157482#comment-15157482 ] Bikas Saha commented on YARN-1011: -- 2.3 The relationship between overall cluster utilization and node

[jira] [Commented] (YARN-1011) [Umbrella] Schedule containers based on utilization of currently allocated containers

2016-02-20 Thread Bikas Saha (JIRA)
[ https://issues.apache.org/jira/browse/YARN-1011?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15155867#comment-15155867 ] Bikas Saha commented on YARN-1011: -- 2.3. What's the reasoning behind this? Over-allocating a node seems to

[jira] [Commented] (YARN-2019) Retrospect on decision of making RM crashed if any exception throw in ZKRMStateStore

2016-01-26 Thread Bikas Saha (JIRA)
[ https://issues.apache.org/jira/browse/YARN-2019?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15118116#comment-15118116 ] Bikas Saha commented on YARN-2019: -- Does this now mean that during a failover the new RM could forget

[jira] [Commented] (YARN-1011) [Umbrella] Schedule containers based on utilization of currently allocated containers

2016-01-06 Thread Bikas Saha (JIRA)
[ https://issues.apache.org/jira/browse/YARN-1011?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15086326#comment-15086326 ] Bikas Saha commented on YARN-1011: -- Some of what I am saying emanates from prior experience with a

[jira] [Commented] (YARN-1011) [Umbrella] Schedule containers based on utilization of currently allocated containers

2016-01-06 Thread Bikas Saha (JIRA)
[ https://issues.apache.org/jira/browse/YARN-1011?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15086325#comment-15086325 ] Bikas Saha commented on YARN-1011: -- bq. At this point what happens to the opportunistic container. It is

[jira] [Commented] (YARN-1011) [Umbrella] Schedule containers based on utilization of currently allocated containers

2016-01-05 Thread Bikas Saha (JIRA)
[ https://issues.apache.org/jira/browse/YARN-1011?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15083683#comment-15083683 ] Bikas Saha commented on YARN-1011: -- Good points but let me play the devil's advocate to get some more

[jira] [Commented] (YARN-1011) [Umbrella] Schedule containers based on utilization of currently allocated containers

2016-01-05 Thread Bikas Saha (JIRA)
[ https://issues.apache.org/jira/browse/YARN-1011?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15083957#comment-15083957 ] Bikas Saha commented on YARN-1011: -- I agree with natural container churn in favor of preemption to avoid

[jira] [Commented] (YARN-1011) [Umbrella] Schedule containers based on utilization of currently allocated containers

2016-01-04 Thread Bikas Saha (JIRA)
[ https://issues.apache.org/jira/browse/YARN-1011?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15081950#comment-15081950 ] Bikas Saha commented on YARN-1011: -- In Tez we always try to allocate the most important work to the next

[jira] [Commented] (YARN-1011) [Umbrella] Schedule containers based on utilization of currently allocated containers

2015-12-27 Thread Bikas Saha (JIRA)
[ https://issues.apache.org/jira/browse/YARN-1011?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15072279#comment-15072279 ] Bikas Saha commented on YARN-1011: -- In my prior experience, something like this is not practical without

[jira] [Commented] (YARN-1197) Support changing resources of an allocated container

2015-12-16 Thread Bikas Saha (JIRA)
[ https://issues.apache.org/jira/browse/YARN-1197?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15060559#comment-15060559 ] Bikas Saha commented on YARN-1197: -- The API supports it but the backend implementation does not. So in the

[jira] [Commented] (YARN-4108) CapacityScheduler: Improve preemption to preempt only those containers that would satisfy the incoming request

2015-11-25 Thread Bikas Saha (JIRA)
[ https://issues.apache.org/jira/browse/YARN-4108?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15028104#comment-15028104 ] Bikas Saha commented on YARN-4108: -- These problems will be hard to solve without involving the scheduler

[jira] [Commented] (YARN-4390) Consider container request size during CS preemption

2015-11-24 Thread Bikas Saha (JIRA)
[ https://issues.apache.org/jira/browse/YARN-4390?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15025542#comment-15025542 ] Bikas Saha commented on YARN-4390: -- I am not sure if this is a bug as described. If preemption does free

[jira] [Commented] (YARN-2047) RM should honor NM heartbeat expiry after RM restart

2015-11-09 Thread Bikas Saha (JIRA)
[ https://issues.apache.org/jira/browse/YARN-2047?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14997181#comment-14997181 ] Bikas Saha commented on YARN-2047: -- I think the general idea is that the AM cannot be trusted about

[jira] [Commented] (YARN-2047) RM should honor NM heartbeat expiry after RM restart

2015-11-04 Thread Bikas Saha (JIRA)
[ https://issues.apache.org/jira/browse/YARN-2047?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14991069#comment-14991069 ] Bikas Saha commented on YARN-2047: -- From the description it seems like the original scope was making sure

[jira] [Commented] (YARN-3911) Add tail of stderr to diagnostics if container fails to launch or it container logs are empty

2015-10-30 Thread Bikas Saha (JIRA)
[ https://issues.apache.org/jira/browse/YARN-3911?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14983845#comment-14983845 ] Bikas Saha commented on YARN-3911: -- Sure > Add tail of stderr to diagnostics if container fails to launch

[jira] [Commented] (YARN-4278) On AM registration, response should include cluster Nodes report on demanded by registration request.

2015-10-23 Thread Bikas Saha (JIRA)
[ https://issues.apache.org/jira/browse/YARN-4278?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14970840#comment-14970840 ] Bikas Saha commented on YARN-4278: -- Client can ask for it but IMO, no clients actually do. Hence we

[jira] [Commented] (YARN-4278) On AM registration, response should include cluster Nodes report on demanded by registration request.

2015-10-22 Thread Bikas Saha (JIRA)
[ https://issues.apache.org/jira/browse/YARN-4278?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14970408#comment-14970408 ] Bikas Saha commented on YARN-4278: -- Not sure if the alternate to allow an AM to use the ClientRMProtocol

[jira] [Commented] (YARN-1509) Make AMRMClient support send increase container request and get increased/decreased containers

2015-10-16 Thread Bikas Saha (JIRA)
[ https://issues.apache.org/jira/browse/YARN-1509?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14961035#comment-14961035 ] Bikas Saha commented on YARN-1509: -- Does issuing an increase followed by the decrease actually remove the

[jira] [Commented] (YARN-1509) Make AMRMClient support send increase container request and get increased/decreased containers

2015-10-16 Thread Bikas Saha (JIRA)
[ https://issues.apache.org/jira/browse/YARN-1509?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14961154#comment-14961154 ] Bikas Saha commented on YARN-1509: -- Sounds good > Make AMRMClient support send increase container request

[jira] [Commented] (YARN-1509) Make AMRMClient support send increase container request and get increased/decreased containers

2015-10-15 Thread Bikas Saha (JIRA)
[ https://issues.apache.org/jira/browse/YARN-1509?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14959877#comment-14959877 ] Bikas Saha commented on YARN-1509: -- bq. Increase/Decrease/Change Not sure why the implementation went down

[jira] [Commented] (YARN-3911) Add tail of stderr to diagnostics if container fails to launch or it container logs are empty

2015-10-15 Thread Bikas Saha (JIRA)
[ https://issues.apache.org/jira/browse/YARN-3911?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14959579#comment-14959579 ] Bikas Saha commented on YARN-3911: -- I am sorry for the super delayed response. For some reason I was not a

[jira] [Commented] (YARN-1509) Make AMRMClient support send increase container request and get increased/decreased containers

2015-10-09 Thread Bikas Saha (JIRA)
[ https://issues.apache.org/jira/browse/YARN-1509?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14950835#comment-14950835 ] Bikas Saha commented on YARN-1509: -- A change container request (maybe not supported now) can be increase

[jira] [Commented] (YARN-1509) Make AMRMClient support send increase container request and get increased/decreased containers

2015-10-07 Thread Bikas Saha (JIRA)
[ https://issues.apache.org/jira/browse/YARN-1509?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14947315#comment-14947315 ] Bikas Saha commented on YARN-1509: -- Sorry for coming in late on this. I have some questions on the API.

[jira] [Commented] (YARN-4087) Followup fixes after YARN-2019 regarding RM behavior when state-store error occurs

2015-09-07 Thread Bikas Saha (JIRA)
[ https://issues.apache.org/jira/browse/YARN-4087?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14734102#comment-14734102 ] Bikas Saha commented on YARN-4087: -- Repeating my comments from YARN-2019 here There would be 2 kinds of

[jira] [Commented] (YARN-2019) Retrospect on decision of making RM crashed if any exception throw in ZKRMStateStore

2015-09-07 Thread Bikas Saha (JIRA)
[ https://issues.apache.org/jira/browse/YARN-2019?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14734100#comment-14734100 ] Bikas Saha commented on YARN-2019: -- Sorry for coming in late on this. There would be 2 kinds of state

[jira] [Commented] (YARN-3942) Timeline store to read events from HDFS

2015-09-03 Thread Bikas Saha (JIRA)
[ https://issues.apache.org/jira/browse/YARN-3942?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14729917#comment-14729917 ] Bikas Saha commented on YARN-3942: -- In Option 1 the latency would be the time to read the entire session

[jira] [Commented] (YARN-4088) RM should be able to process heartbeats from NM asynchronously

2015-08-27 Thread Bikas Saha (JIRA)
[ https://issues.apache.org/jira/browse/YARN-4088?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14717746#comment-14717746 ] Bikas Saha commented on YARN-4088: -- Is the suggestion to process them concurrently? Not

[jira] [Commented] (YARN-4088) RM should be able to process heartbeats from NM asynchronously

2015-08-27 Thread Bikas Saha (JIRA)
[ https://issues.apache.org/jira/browse/YARN-4088?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14717907#comment-14717907 ] Bikas Saha commented on YARN-4088: -- Right. So the combined objective is to continue to

[jira] [Commented] (YARN-4088) RM should be able to process heartbeats from NM asynchronously

2015-08-27 Thread Bikas Saha (JIRA)
[ https://issues.apache.org/jira/browse/YARN-4088?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14717831#comment-14717831 ] Bikas Saha commented on YARN-4088: -- Why not on a 3K cluster? We could slow down heartbeats

[jira] [Commented] (YARN-3736) Add RMStateStore apis to store and load accepted reservations for failover

2015-08-06 Thread Bikas Saha (JIRA)
[ https://issues.apache.org/jira/browse/YARN-3736?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14660415#comment-14660415 ] Bikas Saha commented on YARN-3736: -- Sounds good! Add RMStateStore apis to store and load

[jira] [Commented] (YARN-3736) Add RMStateStore apis to store and load accepted reservations for failover

2015-08-05 Thread Bikas Saha (JIRA)
[ https://issues.apache.org/jira/browse/YARN-3736?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14658798#comment-14658798 ] Bikas Saha commented on YARN-3736: -- Folks, how much load is this going to add to the state

[jira] [Commented] (YARN-2005) Blacklisting support for scheduling AMs

2015-07-29 Thread Bikas Saha (JIRA)
[ https://issues.apache.org/jira/browse/YARN-2005?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14646700#comment-14646700 ] Bikas Saha commented on YARN-2005: -- I am fine for opening a separate jira for the specific

[jira] [Created] (YARN-3994) RM should respect AM resource/placement constraints

2015-07-29 Thread Bikas Saha (JIRA)
Bikas Saha created YARN-3994: Summary: RM should respect AM resource/placement constraints Key: YARN-3994 URL: https://issues.apache.org/jira/browse/YARN-3994 Project: Hadoop YARN Issue Type:

[jira] [Commented] (YARN-2005) Blacklisting support for scheduling AMs

2015-07-28 Thread Bikas Saha (JIRA)
[ https://issues.apache.org/jira/browse/YARN-2005?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14644966#comment-14644966 ] Bikas Saha commented on YARN-2005: -- I believe the reverse case is also valid. A user may

[jira] [Created] (YARN-3911) Add tail of stderr to diagnostics if container fails to launch or it container logs are empty

2015-07-10 Thread Bikas Saha (JIRA)
Bikas Saha created YARN-3911: Summary: Add tail of stderr to diagnostics if container fails to launch or it container logs are empty Key: YARN-3911 URL: https://issues.apache.org/jira/browse/YARN-3911

[jira] [Commented] (YARN-1197) Support changing resources of an allocated container

2015-06-29 Thread Bikas Saha (JIRA)
[ https://issues.apache.org/jira/browse/YARN-1197?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14606218#comment-14606218 ] Bikas Saha commented on YARN-1197: -- There has been a lot of discussion that looks like its

[jira] [Commented] (YARN-1902) Allocation of too many containers when a second request is done with the same resource capability

2015-05-18 Thread Bikas Saha (JIRA)
[ https://issues.apache.org/jira/browse/YARN-1902?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14548492#comment-14548492 ] Bikas Saha commented on YARN-1902: -- An alternate approach that we tried in Apache Tez is

[jira] [Commented] (YARN-1902) Allocation of too many containers when a second request is done with the same resource capability

2015-05-15 Thread Bikas Saha (JIRA)
[ https://issues.apache.org/jira/browse/YARN-1902?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14546377#comment-14546377 ] Bikas Saha commented on YARN-1902: -- The AMRMClient was not written to automatically remove

[jira] [Commented] (YARN-1902) Allocation of too many containers when a second request is done with the same resource capability

2015-05-15 Thread Bikas Saha (JIRA)
[ https://issues.apache.org/jira/browse/YARN-1902?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14546421#comment-14546421 ] Bikas Saha commented on YARN-1902: -- Yes. And then the RM may give a container on H1 which

[jira] [Commented] (YARN-908) URL is a YARN API record instead of being java.net.url

2015-05-01 Thread Bikas Saha (JIRA)
[ https://issues.apache.org/jira/browse/YARN-908?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14523396#comment-14523396 ] Bikas Saha commented on YARN-908: - The issue is why not use standard java URL instead of

[jira] [Updated] (YARN-745) Move UnmanagedAMLauncher to yarn client package

2015-05-01 Thread Bikas Saha (JIRA)
[ https://issues.apache.org/jira/browse/YARN-745?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bikas Saha updated YARN-745: Assignee: (was: Bikas Saha) Move UnmanagedAMLauncher to yarn client package

[jira] [Resolved] (YARN-240) Rename ProcessTree.isSetsidAvailable

2015-05-01 Thread Bikas Saha (JIRA)
[ https://issues.apache.org/jira/browse/YARN-240?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bikas Saha resolved YARN-240. - Resolution: Won't Fix Rename ProcessTree.isSetsidAvailable

[jira] [Updated] (YARN-456) allow OS scheduling priority of NM to be different than the containers it launches for Windows

2015-05-01 Thread Bikas Saha (JIRA)
[ https://issues.apache.org/jira/browse/YARN-456?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bikas Saha updated YARN-456: Assignee: (was: Bikas Saha) allow OS scheduling priority of NM to be different than the containers it

[jira] [Updated] (YARN-394) RM should be able to return requests that it cannot fulfill

2015-05-01 Thread Bikas Saha (JIRA)
[ https://issues.apache.org/jira/browse/YARN-394?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bikas Saha updated YARN-394: Assignee: (was: Bikas Saha) RM should be able to return requests that it cannot fulfill

[jira] [Commented] (YARN-556) RM Restart phase 2 - Work preserving restart

2015-05-01 Thread Bikas Saha (JIRA)
[ https://issues.apache.org/jira/browse/YARN-556?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14523973#comment-14523973 ] Bikas Saha commented on YARN-556: - [~jianhe] [~adhoot] [~kasha] [~vinodkv] Should we resolve

[jira] [Updated] (YARN-255) Support secure AM launch for unmanaged AM's

2015-05-01 Thread Bikas Saha (JIRA)
[ https://issues.apache.org/jira/browse/YARN-255?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bikas Saha updated YARN-255: Assignee: (was: Bikas Saha) Support secure AM launch for unmanaged AM's

[jira] [Commented] (YARN-2140) Add support for network IO isolation/scheduling for containers

2015-03-12 Thread Bikas Saha (JIRA)
[ https://issues.apache.org/jira/browse/YARN-2140?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14355395#comment-14355395 ] Bikas Saha commented on YARN-2140: -- This paper may have useful insights into the network

[jira] [Commented] (YARN-2261) YARN should have a way to run post-application cleanup

2015-02-17 Thread Bikas Saha (JIRA)
[ https://issues.apache.org/jira/browse/YARN-2261?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14325548#comment-14325548 ] Bikas Saha commented on YARN-2261: -- Looks like AM preemption will not fail the AM and so

[jira] [Commented] (YARN-2261) YARN should have a way to run post-application cleanup

2015-02-17 Thread Bikas Saha (JIRA)
[ https://issues.apache.org/jira/browse/YARN-2261?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14325547#comment-14325547 ] Bikas Saha commented on YARN-2261: -- Sounds reasonable. Though things like how does YARN

[jira] [Commented] (YARN-3025) Provide API for retrieving blacklisted nodes

2015-01-26 Thread Bikas Saha (JIRA)
[ https://issues.apache.org/jira/browse/YARN-3025?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14292848#comment-14292848 ] Bikas Saha commented on YARN-3025: -- Yes. But that would mean that the RM cannot provide

[jira] [Commented] (YARN-3025) Provide API for retrieving blacklisted nodes

2015-01-25 Thread Bikas Saha (JIRA)
[ https://issues.apache.org/jira/browse/YARN-3025?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14291522#comment-14291522 ] Bikas Saha commented on YARN-3025: -- In the worst case, the frequency of this update can be

[jira] [Commented] (YARN-3025) Provide API for retrieving blacklisted nodes

2015-01-09 Thread Bikas Saha (JIRA)
[ https://issues.apache.org/jira/browse/YARN-3025?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14271658#comment-14271658 ] Bikas Saha commented on YARN-3025: -- The AM is expected to maintain its own state. However,

[jira] [Commented] (YARN-2139) [Umbrella] Support for Disk as a Resource in YARN

2014-12-03 Thread Bikas Saha (JIRA)
[ https://issues.apache.org/jira/browse/YARN-2139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14233341#comment-14233341 ] Bikas Saha commented on YARN-2139: -- So to be clear, currently vdisks is counting the

[jira] [Commented] (YARN-2139) [Umbrella] Support for Disk as a Resource in YARN

2014-12-02 Thread Bikas Saha (JIRA)
[ https://issues.apache.org/jira/browse/YARN-2139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14232133#comment-14232133 ] Bikas Saha commented on YARN-2139: -- Thanks for the update. It's not clear to me how we are

[jira] [Commented] (YARN-2139) [Umbrella] Support for Disk as a Resource in YARN

2014-12-02 Thread Bikas Saha (JIRA)
[ https://issues.apache.org/jira/browse/YARN-2139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14232448#comment-14232448 ] Bikas Saha commented on YARN-2139: -- Is the concept of vdisk representing a spinning disk

[jira] [Commented] (YARN-1902) Allocation of too many containers when a second request is done with the same resource capability

2014-11-27 Thread Bikas Saha (JIRA)
[ https://issues.apache.org/jira/browse/YARN-1902?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14228030#comment-14228030 ] Bikas Saha commented on YARN-1902: -- If all the requests are for the same location (say

[jira] [Commented] (YARN-2139) [Umbrella] Support for Disk as a Resource in YARN

2014-11-21 Thread Bikas Saha (JIRA)
[ https://issues.apache.org/jira/browse/YARN-2139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14221591#comment-14221591 ] Bikas Saha commented on YARN-2139: -- Given that this design and possible implementation

[jira] [Commented] (YARN-1902) Allocation of too many containers when a second request is done with the same resource capability

2014-10-29 Thread Bikas Saha (JIRA)
[ https://issues.apache.org/jira/browse/YARN-1902?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14188516#comment-14188516 ] Bikas Saha commented on YARN-1902: -- bq. Given a ContainerRequest x with Resource y, when

[jira] [Commented] (YARN-2314) ContainerManagementProtocolProxy can create thousands of threads for a large cluster

2014-10-14 Thread Bikas Saha (JIRA)
[ https://issues.apache.org/jira/browse/YARN-2314?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14171408#comment-14171408 ] Bikas Saha commented on YARN-2314: -- Folks, this is something that would be of interest in

[jira] [Commented] (YARN-2314) ContainerManagementProtocolProxy can create thousands of threads for a large cluster

2014-10-14 Thread Bikas Saha (JIRA)
[ https://issues.apache.org/jira/browse/YARN-2314?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14171429#comment-14171429 ] Bikas Saha commented on YARN-2314: -- To be clear, my question was only to clarify if Tez

[jira] [Commented] (YARN-2314) ContainerManagementProtocolProxy can create thousands of threads for a large cluster

2014-10-14 Thread Bikas Saha (JIRA)
[ https://issues.apache.org/jira/browse/YARN-2314?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14171459#comment-14171459 ] Bikas Saha commented on YARN-2314: -- My understanding from the comments was that in most

[jira] [Commented] (YARN-2261) YARN should have a way to run post-application cleanup

2014-07-11 Thread Bikas Saha (JIRA)
[ https://issues.apache.org/jira/browse/YARN-2261?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14058979#comment-14058979 ] Bikas Saha commented on YARN-2261: -- The cleanup would be indistinguishable from an AM that

[jira] [Commented] (YARN-2261) YARN should have a way to run post-application cleanup

2014-07-11 Thread Bikas Saha (JIRA)
[ https://issues.apache.org/jira/browse/YARN-2261?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14059094#comment-14059094 ] Bikas Saha commented on YARN-2261: -- Would that be an existing issue that needs to be

[jira] [Commented] (YARN-2261) YARN should have a way to run post-application cleanup

2014-07-08 Thread Bikas Saha (JIRA)
[ https://issues.apache.org/jira/browse/YARN-2261?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14055231#comment-14055231 ] Bikas Saha commented on YARN-2261: -- +1 for having the control/responsibility in YARN An

[jira] [Commented] (YARN-1366) AM should implement Resync with the ApplicationMasterService instead of shutting down

2014-07-06 Thread Bikas Saha (JIRA)
[ https://issues.apache.org/jira/browse/YARN-1366?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14053201#comment-14053201 ] Bikas Saha commented on YARN-1366: -- bq. I meant an empty response. After this does

[jira] [Commented] (YARN-1366) AM should implement Resync with the ApplicationMasterService instead of shutting down

2014-07-02 Thread Bikas Saha (JIRA)
[ https://issues.apache.org/jira/browse/YARN-1366?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14050965#comment-14050965 ] Bikas Saha commented on YARN-1366: -- Why are we returning the old allocateResponse to the

[jira] [Commented] (YARN-1366) AM should implement Resync with the ApplicationMasterService instead of shutting down

2014-07-02 Thread Bikas Saha (JIRA)
[ https://issues.apache.org/jira/browse/YARN-1366?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14050978#comment-14050978 ] Bikas Saha commented on YARN-1366: -- Does a null response make sense for the user? AM

[jira] [Commented] (YARN-614) Retry attempts automatically for hardware failures or YARN issues and set default app retries to 1

2014-06-24 Thread Bikas Saha (JIRA)
[ https://issues.apache.org/jira/browse/YARN-614?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14043029#comment-14043029 ] Bikas Saha commented on YARN-614: - Steve, what you want should already happen. AM will

[jira] [Commented] (YARN-2052) ContainerId creation after work preserving restart is broken

2014-06-18 Thread Bikas Saha (JIRA)
[ https://issues.apache.org/jira/browse/YARN-2052?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14034876#comment-14034876 ] Bikas Saha commented on YARN-2052: -- With 32 bits for epoch number we have 4 billion

[jira] [Commented] (YARN-2052) ContainerId creation after work preserving restart is broken

2014-06-17 Thread Bikas Saha (JIRA)
[ https://issues.apache.org/jira/browse/YARN-2052?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14034691#comment-14034691 ] Bikas Saha commented on YARN-2052: -- bq. Had an offline discussion with Vinod. Maybe it's

[jira] [Commented] (YARN-1373) Transition RMApp and RMAppAttempt state to RUNNING after restart for recovered running apps

2014-06-17 Thread Bikas Saha (JIRA)
[ https://issues.apache.org/jira/browse/YARN-1373?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14034700#comment-14034700 ] Bikas Saha commented on YARN-1373: -- Sorry I am not clear how this is a dup. This jira is

[jira] [Commented] (YARN-2052) ContainerId creation after work preserving restart is broken

2014-06-17 Thread Bikas Saha (JIRA)
[ https://issues.apache.org/jira/browse/YARN-2052?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14034731#comment-14034731 ] Bikas Saha commented on YARN-2052: -- Why would ContainerId#compareTo fail? Existing

[jira] [Commented] (YARN-2052) ContainerId creation after work preserving restart is broken

2014-06-17 Thread Bikas Saha (JIRA)
[ https://issues.apache.org/jira/browse/YARN-2052?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14034732#comment-14034732 ] Bikas Saha commented on YARN-2052: -- Ah. I did not see the rest of the comment. Yes.

[jira] [Commented] (YARN-2148) TestNMClient failed due more exit code values added and passed to AM

2014-06-11 Thread Bikas Saha (JIRA)
[ https://issues.apache.org/jira/browse/YARN-2148?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14028216#comment-14028216 ] Bikas Saha commented on YARN-2148: -- [~ozawa] Can you please apply this patch and run the

[jira] [Commented] (YARN-2148) TestNMClient failed due more exit code values added and passed to AM

2014-06-11 Thread Bikas Saha (JIRA)
[ https://issues.apache.org/jira/browse/YARN-2148?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14028213#comment-14028213 ] Bikas Saha commented on YARN-2148: -- lgtm. +1. TestNMClient failed due more exit code

[jira] [Commented] (YARN-2052) ContainerId creation after work preserving restart is broken

2014-06-11 Thread Bikas Saha (JIRA)
[ https://issues.apache.org/jira/browse/YARN-2052?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14028734#comment-14028734 ] Bikas Saha commented on YARN-2052: -- Don't we already read and write synchronously from the

[jira] [Commented] (YARN-2140) Add support for network IO isolation/scheduling for containers

2014-06-10 Thread Bikas Saha (JIRA)
[ https://issues.apache.org/jira/browse/YARN-2140?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14026672#comment-14026672 ] Bikas Saha commented on YARN-2140: -- [~ywskycn] For this and YARN-2139 my suggestion would

[jira] [Updated] (YARN-2091) Add more values to ContainerExitStatus and pass it from NM to RM and then to app masters

2014-06-10 Thread Bikas Saha (JIRA)
[ https://issues.apache.org/jira/browse/YARN-2091?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bikas Saha updated YARN-2091: - Summary: Add more values to ContainerExitStatus and pass it from NM to RM and then to app masters (was:

[jira] [Commented] (YARN-1365) ApplicationMasterService to allow Register and Unregister of an app that was running before restart

2014-06-10 Thread Bikas Saha (JIRA)
[ https://issues.apache.org/jira/browse/YARN-1365?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14027187#comment-14027187 ] Bikas Saha commented on YARN-1365: -- Sounds like the right approach. Keeps things

[jira] [Commented] (YARN-2091) Add ContainerExitStatus.KILL_EXCEEDED_MEMORY and pass it to app masters

2014-06-06 Thread Bikas Saha (JIRA)
[ https://issues.apache.org/jira/browse/YARN-2091?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14020350#comment-14020350 ] Bikas Saha commented on YARN-2091: -- How about naming it hasDefaultExitCode() and directly

[jira] [Commented] (YARN-2091) Add ContainerExitStatus.KILL_EXCEEDED_MEMORY and pass it to app masters

2014-06-02 Thread Bikas Saha (JIRA)
[ https://issues.apache.org/jira/browse/YARN-2091?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14015611#comment-14015611 ] Bikas Saha commented on YARN-2091: -- Can this miss a case when the exitCode has not been

[jira] [Commented] (YARN-2091) Add ContainerExitStatus.KILL_EXCEEDED_MEMORY and pass it to app masters

2014-06-02 Thread Bikas Saha (JIRA)
[ https://issues.apache.org/jira/browse/YARN-2091?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14016081#comment-14016081 ] Bikas Saha commented on YARN-2091: -- If we are sure that the default value is set in the

[jira] [Commented] (YARN-2091) Add ContainerExitStatus.KILL_EXCEEDED_MEMORY and pass it to app masters

2014-05-30 Thread Bikas Saha (JIRA)
[ https://issues.apache.org/jira/browse/YARN-2091?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14014079#comment-14014079 ] Bikas Saha commented on YARN-2091: -- Why is isAMAware needed. All values in

[jira] [Commented] (YARN-2091) Add ContainerExitStatus.KILL_EXCEEDED_MEMORY and pass it to app masters

2014-05-30 Thread Bikas Saha (JIRA)
[ https://issues.apache.org/jira/browse/YARN-2091?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14014284#comment-14014284 ] Bikas Saha commented on YARN-2091: -- We can check all cases of ContainerKillEvent and add

[jira] [Commented] (YARN-2091) Add ContainerExitStatus.KILL_EXCEEDED_MEMORY and pass it to app masters

2014-05-29 Thread Bikas Saha (JIRA)
[ https://issues.apache.org/jira/browse/YARN-2091?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14012980#comment-14012980 ] Bikas Saha commented on YARN-2091: -- Instead of having the following if-else code

[jira] [Commented] (YARN-2091) Add ContainerExitStatus.KILL_EXCEEDED_MEMORY and pass it to app masters

2014-05-29 Thread Bikas Saha (JIRA)
[ https://issues.apache.org/jira/browse/YARN-2091?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14013071#comment-14013071 ] Bikas Saha commented on YARN-2091: -- That would make sense if YARN would allow specifying

[jira] [Commented] (YARN-2091) Add ContainerExitStatus.KILL_EXCEEDED_MEMORY and pass it to app masters

2014-05-27 Thread Bikas Saha (JIRA)
[ https://issues.apache.org/jira/browse/YARN-2091?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14010585#comment-14010585 ] Bikas Saha commented on YARN-2091: -- That's the missing pieces AFAIK. That exit reason needs

[jira] [Commented] (YARN-2091) Add ContainerExitStatus.KILL_EXCEEDED_MEMORY and pass it to app masters

2014-05-27 Thread Bikas Saha (JIRA)
[ https://issues.apache.org/jira/browse/YARN-2091?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14010630#comment-14010630 ] Bikas Saha commented on YARN-2091: -- We are on the same page. The kill reason is directly a

[jira] [Commented] (YARN-796) Allow for (admin) labels on nodes and resource-requests

2014-05-23 Thread Bikas Saha (JIRA)
[ https://issues.apache.org/jira/browse/YARN-796?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14007866#comment-14007866 ] Bikas Saha commented on YARN-796: - Thanks [~john.jian.fang] An interesting use case from

[jira] [Commented] (YARN-1366) ApplicationMasterService should Resync with the AM upon allocate call after restart

2014-05-21 Thread Bikas Saha (JIRA)
[ https://issues.apache.org/jira/browse/YARN-1366?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14004927#comment-14004927 ] Bikas Saha commented on YARN-1366: -- I mean what will go wrong is we allow unregister
