[
https://issues.apache.org/jira/browse/YARN-1197?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15059332#comment-15059332
]
sandflee commented on YARN-1197:
Seems it does not support increasing memory and decreasing CPU cores
[
https://issues.apache.org/jira/browse/YARN-1197?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15059362#comment-15059362
]
sandflee commented on YARN-1197:
Got it, thanks, [~leftnoteasy]!
> Support changing resourc
[
https://issues.apache.org/jira/browse/YARN-4138?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15059777#comment-15059777
]
sandflee commented on YARN-4138:
1, use Resources.fitsin(targetResource, lastConfirmedReso
[
https://issues.apache.org/jira/browse/YARN-1197?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15061346#comment-15061346
]
sandflee commented on YARN-1197:
user applications (long-running) are running on our YARN pla
[
https://issues.apache.org/jira/browse/YARN-1197?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15061348#comment-15061348
]
sandflee commented on YARN-1197:
user applications (long-running) are running on our YARN pla
[
https://issues.apache.org/jira/browse/YARN-1197?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15061349#comment-15061349
]
sandflee commented on YARN-1197:
user applications (long-running) are running on our YARN pla
[
https://issues.apache.org/jira/browse/YARN-1197?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15061347#comment-15061347
]
sandflee commented on YARN-1197:
user applications (long-running) are running on our YARN pla
[
https://issues.apache.org/jira/browse/YARN-1197?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15061577#comment-15061577
]
sandflee commented on YARN-1197:
seems complicated for AM to do this, especially we added
[
https://issues.apache.org/jira/browse/YARN-4138?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15063576#comment-15063576
]
sandflee commented on YARN-4138:
{quote}
We should not update lastConfirmedResource in this
sandflee created YARN-4495:
--
Summary: add a way to tell AM container increase/decrease request
is invalid
Key: YARN-4495
URL: https://issues.apache.org/jira/browse/YARN-4495
Project: Hadoop YARN
Is
[
https://issues.apache.org/jira/browse/YARN-4495?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
sandflee updated YARN-4495:
---
Description: now RM may pass InvalidResourceRequestException to AM or just
ignore the change request, the forme
[
https://issues.apache.org/jira/browse/YARN-4138?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15071403#comment-15071403
]
sandflee commented on YARN-4138:
Hi, [~mding], sorry for the late reply,
1, If AM send tok
[
https://issues.apache.org/jira/browse/YARN-4138?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15071475#comment-15071475
]
sandflee commented on YARN-4138:
+ decreaseRequest = new SchedContainerChangeRequest(
[
https://issues.apache.org/jira/browse/YARN-4138?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15071492#comment-15071492
]
sandflee commented on YARN-4138:
there seems to be a deadlock: in the allocate and rollback logic we
[
https://issues.apache.org/jira/browse/YARN-4138?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15072314#comment-15072314
]
sandflee commented on YARN-4138:
Hi, [~mding], I'll open a new jira to track this, not to d
[
https://issues.apache.org/jira/browse/YARN-4495?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15072387#comment-15072387
]
sandflee commented on YARN-4495:
RM will pass InvalidResourceRequestException to AM in belo
sandflee created YARN-4519:
--
Summary: potential deadlock of CapacityScheduler between decrease
container and assign containers
Key: YARN-4519
URL: https://issues.apache.org/jira/browse/YARN-4519
Project: Had
[
https://issues.apache.org/jira/browse/YARN-4519?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
sandflee updated YARN-4519:
---
Description:
In CapacityScheduler.allocate(), first get the FiCaSchedulerApp sync lock, and
maybe get CapacitySc
sandflee created YARN-4520:
--
Summary: FinishAppEvent is leaked in leveldb if no app's container
running on this node
Key: YARN-4520
URL: https://issues.apache.org/jira/browse/YARN-4520
Project: Hadoop YARN
[
https://issues.apache.org/jira/browse/YARN-4138?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15072430#comment-15072430
]
sandflee commented on YARN-4138:
when releasing containers, we didn't hold SchedulerApp's lo
[
https://issues.apache.org/jira/browse/YARN-4520?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
sandflee updated YARN-4520:
---
Attachment: YARN-4520.01.patch
> FinishAppEvent is leaked in leveldb if no app's container running on this node
[
https://issues.apache.org/jira/browse/YARN-4520?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
sandflee updated YARN-4520:
---
Attachment: YARN-4520.02.patch
fix checkstyle errors
> FinishAppEvent is leaked in leveldb if no app's contain
[
https://issues.apache.org/jira/browse/YARN-4495?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
sandflee updated YARN-4495:
---
Attachment: YARN-4495.01.patch
just protocol change, add FailedResourceChange to AllocateResponse, represents
[
https://issues.apache.org/jira/browse/YARN-4495?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15073347#comment-15073347
]
sandflee commented on YARN-4495:
Hi [~jianhe] [~wangda] [~mding] , do you think the change
[
https://issues.apache.org/jira/browse/YARN-4495?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15073400#comment-15073400
]
sandflee commented on YARN-4495:
Thanks [~mding], [~wangda], yes this could simplify the cod
[
https://issues.apache.org/jira/browse/YARN-4495?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15073472#comment-15073472
]
sandflee commented on YARN-4495:
[~mding] [~wangda] one problem, seems hadoop rpc could onl
[
https://issues.apache.org/jira/browse/YARN-3328?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15073587#comment-15073587
]
sandflee commented on YARN-3328:
we fix this problem by removing container state machine in
[
https://issues.apache.org/jira/browse/YARN-4519?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15073622#comment-15073622
]
sandflee commented on YARN-4519:
sorry, I don't understand
1, why should we put compute de
[
https://issues.apache.org/jira/browse/YARN-4519?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15074038#comment-15074038
]
sandflee commented on YARN-4519:
got it, thanks [~mding]!
> potential deadlock of Capacit
[
https://issues.apache.org/jira/browse/YARN-4495?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15074049#comment-15074049
]
sandflee commented on YARN-4495:
the main problem is we couldn't pass containerId to
inva
[
https://issues.apache.org/jira/browse/YARN-4495?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15074052#comment-15074052
]
sandflee commented on YARN-4495:
better to pass why resource change request is failed.
> a
[
https://issues.apache.org/jira/browse/YARN-4495?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15074473#comment-15074473
]
sandflee commented on YARN-4495:
to [~mding],
1, we have a StateMachine in AM to track e
[
https://issues.apache.org/jira/browse/YARN-4495?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15075043#comment-15075043
]
sandflee commented on YARN-4495:
Hi, [~leftnoteasy], I did a simple test throwing an exceptio
sandflee created YARN-4528:
--
Summary: decreaseConainer Message maybe lost if NM restart
Key: YARN-4528
URL: https://issues.apache.org/jira/browse/YARN-4528
Project: Hadoop YARN
Issue Type: Bug
[
https://issues.apache.org/jira/browse/YARN-4528?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
sandflee updated YARN-4528:
---
Summary: decreaseContainer Message maybe lost if NM restart (was:
decreaseConainer Message maybe lost if NM re
[
https://issues.apache.org/jira/browse/YARN-4495?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15075570#comment-15075570
]
sandflee commented on YARN-4495:
thanks [~wangda], hoping for more suggestions
> add a way to
[
https://issues.apache.org/jira/browse/YARN-4528?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15075772#comment-15075772
]
sandflee commented on YARN-4528:
since in most cases container size is not changed, so I pr
[
https://issues.apache.org/jira/browse/YARN-4520?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
sandflee updated YARN-4520:
---
Description:
once we restart nodemanager we see many logs like :
2015-12-28 11:59:18,725 WARN
org.apache.hadoo
[
https://issues.apache.org/jira/browse/YARN-4528?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
sandflee updated YARN-4528:
---
Attachment: YARN-4528.01.patch
1, pending container decrease msg until next heartbeat.
2, nodemanager#allocate d
[
https://issues.apache.org/jira/browse/YARN-4528?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15081322#comment-15081322
]
sandflee commented on YARN-4528:
Hi, [~mding], container decrease msg is passed like conta
[
https://issues.apache.org/jira/browse/YARN-4528?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15081342#comment-15081342
]
sandflee commented on YARN-4528:
[~jianhe] reviewing the code of how containers complete ms
[
https://issues.apache.org/jira/browse/YARN-4528?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15081969#comment-15081969
]
sandflee commented on YARN-4528:
thanks [~mding], yes this could happen, but rarely. should
[
https://issues.apache.org/jira/browse/YARN-4581?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
sandflee updated YARN-4581:
---
Description:
we enable ApplicationHistoryWriter, and find thousands of Errors:
{quote}
2016-01-08 03:13:03,44
sandflee created YARN-4581:
--
Summary: thread leak makes RM crash while RM is recovering
Key: YARN-4581
URL: https://issues.apache.org/jira/browse/YARN-4581
Project: Hadoop YARN
Issue Type: Bug
[
https://issues.apache.org/jira/browse/YARN-4581?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
sandflee updated YARN-4581:
---
Attachment: YARN-4581.01.patch
Simple fix for the thread leak problem.
> thread leak makes RM crash while RM is recov
[
https://issues.apache.org/jira/browse/YARN-4581?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15095369#comment-15095369
]
sandflee commented on YARN-4581:
thanks [~Naganarasimha] [~djp], our cluster is based on 2.
[
https://issues.apache.org/jira/browse/YARN-4581?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15102942#comment-15102942
]
sandflee commented on YARN-4581:
thanks Junping, Naga, Vinod!
> AHS writer thread leak mak
sandflee created YARN-4646:
--
Summary: AMRMClient crashed when RM transition from active to
standby
Key: YARN-4646
URL: https://issues.apache.org/jira/browse/YARN-4646
Project: Hadoop YARN
Issue Typ
[
https://issues.apache.org/jira/browse/YARN-4646?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15118815#comment-15118815
]
sandflee commented on YARN-4646:
I propose not passing Interrupted exception to client whil
[
https://issues.apache.org/jira/browse/YARN-4646?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15118939#comment-15118939
]
sandflee commented on YARN-4646:
Thanks [~zxu], they're the same issue, but patch in MAPRED
[
https://issues.apache.org/jira/browse/YARN-4646?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15120626#comment-15120626
]
sandflee commented on YARN-4646:
MR AM catches most remote exceptions and retries, I don't kn
[
https://issues.apache.org/jira/browse/YARN-4138?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15133627#comment-15133627
]
sandflee commented on YARN-4138:
Hi, [~mding], there may be some cases that are not user/app errors,
1,
[
https://issues.apache.org/jira/browse/YARN-4138?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15133758#comment-15133758
]
sandflee commented on YARN-4138:
To simplify the race condition process, could we reject the
sandflee created YARN-4672:
--
Summary: container resource increased msg may lost if nm restart
Key: YARN-4672
URL: https://issues.apache.org/jira/browse/YARN-4672
Project: Hadoop YARN
Issue Type: Bug
[
https://issues.apache.org/jira/browse/YARN-4672?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15133783#comment-15133783
]
sandflee commented on YARN-4672:
This will trigger container resource rollback logic, thoug
sandflee created YARN-4673:
--
Summary: race condition in ResourceTrackerService#nodeHeartBeat
while processing deduplicated msg
Key: YARN-4673
URL: https://issues.apache.org/jira/browse/YARN-4673
Project: Had
[
https://issues.apache.org/jira/browse/YARN-4138?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15142052#comment-15142052
]
sandflee commented on YARN-4138:
looks good to me too, thanks [~mding]
> Roll back contain
[
https://issues.apache.org/jira/browse/YARN-4673?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
sandflee updated YARN-4673:
---
Attachment: YARN-4673.01.patch
> race condition in ResourceTrackerService#nodeHeartBeat while processing
> ded
[
https://issues.apache.org/jira/browse/YARN-4673?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15166923#comment-15166923
]
sandflee commented on YARN-4673:
Hi, [~ozawa], in ResourceTrackerService we may concurrently
sandflee created YARN-4740:
--
Summary: container complete msg may lost while AM restart in race
condition
Key: YARN-4740
URL: https://issues.apache.org/jira/browse/YARN-4740
Project: Hadoop YARN
Iss
[
https://issues.apache.org/jira/browse/YARN-4740?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
sandflee updated YARN-4740:
---
Attachment: YARN-4740.01.patch
put containers in finishedContainersSentToAM back to justFinishedContainer if
[
https://issues.apache.org/jira/browse/YARN-4741?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15171341#comment-15171341
]
sandflee commented on YARN-4741:
Hi, [~sjlee0],
1, does the num of FINISHED_CONTAINERS_PUL
[
https://issues.apache.org/jira/browse/YARN-4741?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15171378#comment-15171378
]
sandflee commented on YARN-4741:
one race condition may cause the "Invalid event
FINISHED_
[
https://issues.apache.org/jira/browse/YARN-4741?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15173395#comment-15173395
]
sandflee commented on YARN-4741:
without the fix of YARN-3990 and YARN-3896, our rm was flo
[
https://issues.apache.org/jira/browse/YARN-4740?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
sandflee updated YARN-4740:
---
Attachment: YARN-4740.02.patch
> container complete msg may lost while AM restart in race condition
> -
[
https://issues.apache.org/jira/browse/YARN-4740?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15176947#comment-15176947
]
sandflee commented on YARN-4740:
Thanks for your suggestions; attached a new patch to fix these.
[
https://issues.apache.org/jira/browse/YARN-4740?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15178708#comment-15178708
]
sandflee commented on YARN-4740:
yes, this patch ensures the AM receives at least one container c
[
https://issues.apache.org/jira/browse/YARN-4763?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15182437#comment-15182437
]
sandflee commented on YARN-4763:
I think a general fix for this is that we should get rmappa
[
https://issues.apache.org/jira/browse/YARN-4763?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15184713#comment-15184713
]
sandflee commented on YARN-4763:
yes, thanks for pointing this out
> RMApps Page crashes with
[
https://issues.apache.org/jira/browse/YARN-4924?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15227703#comment-15227703
]
sandflee commented on YARN-4924:
In YARN-4051, we also had containers leak from NEW to DONE
[
https://issues.apache.org/jira/browse/YARN-4924?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15229486#comment-15229486
]
sandflee commented on YARN-4924:
thanks [~nroberts], another thought, seems it's not nesses
[
https://issues.apache.org/jira/browse/YARN-4924?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
sandflee updated YARN-4924:
---
Attachment: YARN-4924.01.patch
remove FINISH_APP related code in NM
> NM recovery race can lead to container n
[
https://issues.apache.org/jira/browse/YARN-4740?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15232971#comment-15232971
]
sandflee commented on YARN-4740:
thanks [~jianhe] for reviewing and committing!
> AM may n
[
https://issues.apache.org/jira/browse/YARN-4924?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
sandflee updated YARN-4924:
---
Attachment: YARN-4924.02.patch
> NM recovery race can lead to container not cleaned up
> --
[
https://issues.apache.org/jira/browse/YARN-4924?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15233085#comment-15233085
]
sandflee commented on YARN-4924:
{quote}
I don't think removeDeprecatedKeys is an appropria
sandflee created YARN-4936:
--
Summary: FileInputStream should be closed explicitly in
NMWebService#getLogs
Key: YARN-4936
URL: https://issues.apache.org/jira/browse/YARN-4936
Project: Hadoop YARN
Is
[
https://issues.apache.org/jira/browse/YARN-4936?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
sandflee updated YARN-4936:
---
Attachment: YARN-4936.01.patch
> FileInputStream should be closed explicitly in NMWebService#getLogs
>
sandflee created YARN-4939:
--
Summary: the decommissioning Node should keep alive if NM restart
Key: YARN-4939
URL: https://issues.apache.org/jira/browse/YARN-4939
Project: Hadoop YARN
Issue Type: B
[
https://issues.apache.org/jira/browse/YARN-4939?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
sandflee updated YARN-4939:
---
Attachment: YARN-4939.01.patch
> the decommissioning Node should keep alive if NM restart
> --
sandflee created YARN-4940:
--
Summary: yarn node -list -all failed if RM start with
decommissioned node
Key: YARN-4940
URL: https://issues.apache.org/jira/browse/YARN-4940
Project: Hadoop YARN
Issue
[
https://issues.apache.org/jira/browse/YARN-4924?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15235369#comment-15235369
]
sandflee commented on YARN-4924:
thanks [~jlowe], I added @Deprecated to FINISHED_APP_KEY_
[
https://issues.apache.org/jira/browse/YARN-4924?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
sandflee updated YARN-4924:
---
Attachment: YARN-4924.03.patch
> NM recovery race can lead to container not cleaned up
> --
[
https://issues.apache.org/jira/browse/YARN-4940?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15235388#comment-15235388
]
sandflee commented on YARN-4940:
seems not, they are all caused by YARN-3102
> yarn node -
[
https://issues.apache.org/jira/browse/YARN-4924?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15236072#comment-15236072
]
sandflee commented on YARN-4924:
From the interface of DB, createWriteBatch did not th
[
https://issues.apache.org/jira/browse/YARN-4924?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15236115#comment-15236115
]
sandflee commented on YARN-4924:
in case createWriteBatch throws a runtime exception, see
[
https://issues.apache.org/jira/browse/YARN-4924?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
sandflee updated YARN-4924:
---
Attachment: YARN-4924.04.patch
> NM recovery race can lead to container not cleaned up
> --
[
https://issues.apache.org/jira/browse/YARN-4924?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15236128#comment-15236128
]
sandflee commented on YARN-4924:
Thanks [~jlowe], I hadn't noticed that DBException is a RUNTIM
[
https://issues.apache.org/jira/browse/YARN-4939?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
sandflee updated YARN-4939:
---
Attachment: YARN-4939.02.patch
./bin/yarn node -list -states DECOMMISSIONING couldn't get the
decommissionin
[
https://issues.apache.org/jira/browse/YARN-4939?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
sandflee updated YARN-4939:
---
Attachment: (was: YARN-4939.02.patch)
> the decommissioning Node should keep alive if NM restart
> ---
[
https://issues.apache.org/jira/browse/YARN-4939?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
sandflee updated YARN-4939:
---
Attachment: YARN-4939.02.patch
> the decommissioning Node should keep alive if NM restart
> --
[
https://issues.apache.org/jira/browse/YARN-2567?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15236459#comment-15236459
]
sandflee commented on YARN-2567:
Hi, [~vinodkv], could you assign this to me, I'd like to
[
https://issues.apache.org/jira/browse/YARN-2567?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15236701#comment-15236701
]
sandflee commented on YARN-2567:
The main idea is to lazily store NM status, if RM failover
[
https://issues.apache.org/jira/browse/YARN-2567?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15236724#comment-15236724
]
sandflee commented on YARN-2567:
there may be one problem: if NM recovered as a finished
[
https://issues.apache.org/jira/browse/YARN-4940?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
sandflee updated YARN-4940:
---
Attachment: YARN-4940.01.patch
> yarn node -list -all failed if RM start with decommissioned node
> ---
[
https://issues.apache.org/jira/browse/YARN-4940?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
sandflee updated YARN-4940:
---
Attachment: YARN-4940.02.patch
> yarn node -list -all failed if RM start with decommissioned node
> ---
[
https://issues.apache.org/jira/browse/YARN-4940?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15237007#comment-15237007
]
sandflee commented on YARN-4940:
rather than converting UnknownNodeId, using NodeId seems
[
https://issues.apache.org/jira/browse/YARN-4940?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15237387#comment-15237387
]
sandflee commented on YARN-4940:
thanks [~kshukla], the test failures seem unrelated, I
[
https://issues.apache.org/jira/browse/YARN-2567?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15239425#comment-15239425
]
sandflee commented on YARN-2567:
Thanks [~jlowe], agree that an asynchronous state store wi
[
https://issues.apache.org/jira/browse/YARN-4939?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
sandflee updated YARN-4939:
---
Attachment: YARN-4939.03.patch
> the decommissioning Node should keep alive if NM restart
> --
[
https://issues.apache.org/jira/browse/YARN-4939?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
sandflee updated YARN-4939:
---
Attachment: YARN-4939.04.patch
> the decommissioning Node should keep alive if NM restart
> --