[jira] [Updated] (YARN-2313) Livelock can occur on FairScheduler when there are lots entry in queue

2014-07-21 Thread Tsuyoshi OZAWA (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-2313?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tsuyoshi OZAWA updated YARN-2313:
-

Attachment: YARN-2313.4.patch

Thanks for the review, [~sandyr].

Updated a patch:
* Moved new configuration definition to FairSchedulerConfiguration.
* Excluded warning by updateInterval.
* Replaced warning message use with using
* Removed trailing spaces.
 

 Livelock can occur on FairScheduler when there are lots entry in queue
 --

 Key: YARN-2313
 URL: https://issues.apache.org/jira/browse/YARN-2313
 Project: Hadoop YARN
  Issue Type: Bug
  Components: fairscheduler
Affects Versions: 2.4.1
Reporter: Tsuyoshi OZAWA
Assignee: Tsuyoshi OZAWA
 Attachments: YARN-2313.1.patch, YARN-2313.2.patch, YARN-2313.3.patch, 
 YARN-2313.4.patch, rm-stack-trace.txt


 Observed livelock on FairScheduler when there are lots entry in queue. After 
 my investigating code, following case can occur:
 1. {{update()}} called by UpdateThread takes longer times than 
 UPDATE_INTERVAL(500ms) if there are lots queue.
 2. UpdateThread goes busy loop.
 3. Other threads(AllocationFileReloader, 
 ResourceManager$SchedulerEventDispatcher) can wait forever.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (YARN-2313) Livelock can occur on FairScheduler when there are lots entry in queue

2014-07-17 Thread Tsuyoshi OZAWA (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-2313?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tsuyoshi OZAWA updated YARN-2313:
-

Attachment: YARN-2313.1.patch

Ideally, UPDATE_INTERVAL should be calculated based on current number of 
entries in queue. Another workaround is making UPDATE_INTERVAL configurable. 
Attached patch takes the latter approach, because it's easy to implement.

 Livelock can occur on FairScheduler when there are lots entry in queue
 --

 Key: YARN-2313
 URL: https://issues.apache.org/jira/browse/YARN-2313
 Project: Hadoop YARN
  Issue Type: Bug
  Components: fairscheduler
Affects Versions: 2.4.1
Reporter: Tsuyoshi OZAWA
Assignee: Tsuyoshi OZAWA
 Attachments: YARN-2313.1.patch


 Observed livelock on FairScheduler when there are lots entry in queue. After 
 my investigating code, following case can occur:
 1. {{update()}} called by UpdateThread takes longer times than 
 UPDATE_INTERVAL(500ms) if there are lots queue.
 2. UpdateThread goes busy loop.
 3. Other threads(AllocationFileReloader, 
 ResourceManager$SchedulerEventDispatcher) can wait forever.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (YARN-2313) Livelock can occur on FairScheduler when there are lots entry in queue

2014-07-17 Thread Tsuyoshi OZAWA (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-2313?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tsuyoshi OZAWA updated YARN-2313:
-

Attachment: rm-stack-trace.txt

Attached stack trace when we faced the problem.

 Livelock can occur on FairScheduler when there are lots entry in queue
 --

 Key: YARN-2313
 URL: https://issues.apache.org/jira/browse/YARN-2313
 Project: Hadoop YARN
  Issue Type: Bug
  Components: fairscheduler
Affects Versions: 2.4.1
Reporter: Tsuyoshi OZAWA
Assignee: Tsuyoshi OZAWA
 Attachments: YARN-2313.1.patch, rm-stack-trace.txt


 Observed livelock on FairScheduler when there are lots entry in queue. After 
 my investigating code, following case can occur:
 1. {{update()}} called by UpdateThread takes longer times than 
 UPDATE_INTERVAL(500ms) if there are lots queue.
 2. UpdateThread goes busy loop.
 3. Other threads(AllocationFileReloader, 
 ResourceManager$SchedulerEventDispatcher) can wait forever.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (YARN-2313) Livelock can occur on FairScheduler when there are lots entry in queue

2014-07-17 Thread Tsuyoshi OZAWA (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-2313?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tsuyoshi OZAWA updated YARN-2313:
-

Attachment: (was: YARN-2313.1.patch)

 Livelock can occur on FairScheduler when there are lots entry in queue
 --

 Key: YARN-2313
 URL: https://issues.apache.org/jira/browse/YARN-2313
 Project: Hadoop YARN
  Issue Type: Bug
  Components: fairscheduler
Affects Versions: 2.4.1
Reporter: Tsuyoshi OZAWA
Assignee: Tsuyoshi OZAWA
 Attachments: YARN-2313.1.patch, rm-stack-trace.txt


 Observed livelock on FairScheduler when there are lots entry in queue. After 
 my investigating code, following case can occur:
 1. {{update()}} called by UpdateThread takes longer times than 
 UPDATE_INTERVAL(500ms) if there are lots queue.
 2. UpdateThread goes busy loop.
 3. Other threads(AllocationFileReloader, 
 ResourceManager$SchedulerEventDispatcher) can wait forever.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (YARN-2313) Livelock can occur on FairScheduler when there are lots entry in queue

2014-07-17 Thread Tsuyoshi OZAWA (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-2313?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tsuyoshi OZAWA updated YARN-2313:
-

Attachment: YARN-2313.1.patch

 Livelock can occur on FairScheduler when there are lots entry in queue
 --

 Key: YARN-2313
 URL: https://issues.apache.org/jira/browse/YARN-2313
 Project: Hadoop YARN
  Issue Type: Bug
  Components: fairscheduler
Affects Versions: 2.4.1
Reporter: Tsuyoshi OZAWA
Assignee: Tsuyoshi OZAWA
 Attachments: YARN-2313.1.patch, rm-stack-trace.txt


 Observed livelock on FairScheduler when there are lots entry in queue. After 
 my investigating code, following case can occur:
 1. {{update()}} called by UpdateThread takes longer times than 
 UPDATE_INTERVAL(500ms) if there are lots queue.
 2. UpdateThread goes busy loop.
 3. Other threads(AllocationFileReloader, 
 ResourceManager$SchedulerEventDispatcher) can wait forever.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (YARN-2313) Livelock can occur on FairScheduler when there are lots entry in queue

2014-07-17 Thread Tsuyoshi OZAWA (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-2313?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tsuyoshi OZAWA updated YARN-2313:
-

Attachment: YARN-2313.2.patch

Fixed the warning by findbugs.

 Livelock can occur on FairScheduler when there are lots entry in queue
 --

 Key: YARN-2313
 URL: https://issues.apache.org/jira/browse/YARN-2313
 Project: Hadoop YARN
  Issue Type: Bug
  Components: fairscheduler
Affects Versions: 2.4.1
Reporter: Tsuyoshi OZAWA
Assignee: Tsuyoshi OZAWA
 Attachments: YARN-2313.1.patch, YARN-2313.2.patch, rm-stack-trace.txt


 Observed livelock on FairScheduler when there are lots entry in queue. After 
 my investigating code, following case can occur:
 1. {{update()}} called by UpdateThread takes longer times than 
 UPDATE_INTERVAL(500ms) if there are lots queue.
 2. UpdateThread goes busy loop.
 3. Other threads(AllocationFileReloader, 
 ResourceManager$SchedulerEventDispatcher) can wait forever.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (YARN-2313) Livelock can occur on FairScheduler when there are lots entry in queue

2014-07-17 Thread Tsuyoshi OZAWA (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-2313?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tsuyoshi OZAWA updated YARN-2313:
-

Attachment: YARN-2313.3.patch

 Livelock can occur on FairScheduler when there are lots entry in queue
 --

 Key: YARN-2313
 URL: https://issues.apache.org/jira/browse/YARN-2313
 Project: Hadoop YARN
  Issue Type: Bug
  Components: fairscheduler
Affects Versions: 2.4.1
Reporter: Tsuyoshi OZAWA
Assignee: Tsuyoshi OZAWA
 Attachments: YARN-2313.1.patch, YARN-2313.2.patch, YARN-2313.3.patch, 
 rm-stack-trace.txt


 Observed livelock on FairScheduler when there are lots entry in queue. After 
 my investigating code, following case can occur:
 1. {{update()}} called by UpdateThread takes longer times than 
 UPDATE_INTERVAL(500ms) if there are lots queue.
 2. UpdateThread goes busy loop.
 3. Other threads(AllocationFileReloader, 
 ResourceManager$SchedulerEventDispatcher) can wait forever.



--
This message was sent by Atlassian JIRA
(v6.2#6252)