[jira] [Commented] (YARN-3553) TreeSet is not a nice container for organizing schedulableEntities.

2015-05-04 Thread Xianyin Xin (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3553?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14526295#comment-14526295
 ] 

Xianyin Xin commented on YARN-3553:
---

Thanks, [~leftnoteasy] and [~cwelch]. Now that it is not an issue, just close 
it.

 TreeSet is not a nice container for organizing schedulableEntities.
 ---

 Key: YARN-3553
 URL: https://issues.apache.org/jira/browse/YARN-3553
 Project: Hadoop YARN
  Issue Type: Wish
  Components: scheduler
Reporter: Xianyin Xin

 For TreeSet, element is identified by comparator, not the object reference. 
 If any *attributes that used for comparing two elements* of an specific 
 element is modified by other methods, the TreeSet will be in an un-sorted 
 state, and cannot become sorted forever except that we reconstruct another 
 TreeSet with the elements. To avoid this, one must be *very careful* when 
 they try to modify the attributes (such as increase or decrease the used 
 capacity of a schedulabeEntity) of an object.
 An example in AbstractComparatorOrderingPolicy.java, Line63,
 {code}
   protected void reorderSchedulableEntity(S schedulableEntity) {
 //remove, update comparable data, and reinsert to update position in order
 schedulableEntities.remove(schedulableEntity);
 updateSchedulingResourceUsage(
 schedulableEntity.getSchedulingResourceUsage());
 schedulableEntities.add(schedulableEntity);
   }
 {code}
 This method tries to remove the schedulableEntity first and then reinsert it 
 so as to reorder the set. However, the changes of the schedulableEntity 
 should be done in the middle of the above two operations. But the comparator 
 of the class is not clear, so we don't know which attributes of the 
 schedulableEntity was changed. If we changed the schedulableEntity outside 
 the method and then inform the orderingPolicy that we made such a change, the 
 operation schedulableEntities.remove(schedulableEntity) would not work 
 correctly since the element of a TreeSet is identified by comparator. Any 
 implement class of this abstract class should overwrite this method, but few 
 does. Another choice is that we make modification of a schedulableEntity 
 manually, but we mustn't forget to reorder the set when we do so and must 
 remember the order: remove, modify the attributes(used for comparing), 
 insert, or use an iterator to mark the schedulableEntity so that we can 
 remove and reinsert it correctly.
 YARN-897 is an example that we fell into the trap. If the comparator become 
 complex in future, e.g., we consider other types of resources in comparator, 
 such traps will be more and disperse anywhere, which makes it easy to let a 
 TreeSet become a un-sorted state.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3553) TreeSet is not a nice container for organizing schedulableEntities.

2015-05-01 Thread Craig Welch (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3553?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14523757#comment-14523757
 ] 

Craig Welch commented on YARN-3553:
---

[~xinxianyin], this pattern is important for these implementations for 
efficiency reasons - the frequency with which changes occur which effect 
ordering is much less than the frequency with which the applications must be 
available in the proper order for allocation (esp. when allocating on 
heartbeat, which is typical..).  By iteratively resorting individual elements 
only when needed we avoid frequent resorting of all applications in the queue, 
which can be quite expensive with the frequency it would occur.  Values are 
cached with a pretty simple update lifecycle to avoid the issues you are 
concerned about.  Finally, this is an implementation specific choice, other 
implementations of ordering policies are free to use other data structures / 
sorting frequency, although the concern wrt efficiency which this approach 
avoids would apply to any non-iterative approaches.

 TreeSet is not a nice container for organizing schedulableEntities.
 ---

 Key: YARN-3553
 URL: https://issues.apache.org/jira/browse/YARN-3553
 Project: Hadoop YARN
  Issue Type: Wish
  Components: scheduler
Reporter: Xianyin Xin

 For TreeSet, element is identified by comparator, not the object reference. 
 If any *attributes that used for comparing two elements* of an specific 
 element is modified by other methods, the TreeSet will be in an un-sorted 
 state, and cannot become sorted forever except that we reconstruct another 
 TreeSet with the elements. To avoid this, one must be *very careful* when 
 they try to modify the attributes (such as increase or decrease the used 
 capacity of a schedulabeEntity) of an object.
 An example in AbstractComparatorOrderingPolicy.java, Line63,
 {code}
   protected void reorderSchedulableEntity(S schedulableEntity) {
 //remove, update comparable data, and reinsert to update position in order
 schedulableEntities.remove(schedulableEntity);
 updateSchedulingResourceUsage(
 schedulableEntity.getSchedulingResourceUsage());
 schedulableEntities.add(schedulableEntity);
   }
 {code}
 This method tries to remove the schedulableEntity first and then reinsert it 
 so as to reorder the set. However, the changes of the schedulableEntity 
 should be done in the middle of the above two operations. But the comparator 
 of the class is not clear, so we don't know which attributes of the 
 schedulableEntity was changed. If we changed the schedulableEntity outside 
 the method and then inform the orderingPolicy that we made such a change, the 
 operation schedulableEntities.remove(schedulableEntity) would not work 
 correctly since the element of a TreeSet is identified by comparator. Any 
 implement class of this abstract class should overwrite this method, but few 
 does. Another choice is that we make modification of a schedulableEntity 
 manually, but we mustn't forget to reorder the set when we do so and must 
 remember the order: remove, modify the attributes(used for comparing), 
 insert, or use an iterator to mark the schedulableEntity so that we can 
 remove and reinsert it correctly.
 YARN-897 is an example that we fell into the trap. If the comparator become 
 complex in future, e.g., we consider other types of resources in comparator, 
 such traps will be more and disperse anywhere, which makes it easy to let a 
 TreeSet become a un-sorted state.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3553) TreeSet is not a nice container for organizing schedulableEntities.

2015-04-30 Thread Wangda Tan (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3553?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14522495#comment-14522495
 ] 

Wangda Tan commented on YARN-3553:
--

[~xinxianyin],
I think what you mentioned is a valid point, but it should be already resolved 
by {{updateSchedulingResourceUsage}}. When we reorder application, we will 
first remove it from TreeSet, call {{updateSchedulingResourceUsage}} to update 
it's cached demand/usage, and reinsert it back to TreeSet. And FairComparator 
also uses cached demand/pending to compare two applications.

If SchedulableEntity doesn't touch cached fields (it shouldn't touch), and 
TreeSet uses cached fields to compare items, this problem isn't existed, do you 
agree?

+[~cwelch]

 TreeSet is not a nice container for organizing schedulableEntities.
 ---

 Key: YARN-3553
 URL: https://issues.apache.org/jira/browse/YARN-3553
 Project: Hadoop YARN
  Issue Type: Wish
  Components: scheduler
Reporter: Xianyin Xin

 For TreeSet, element is identified by comparator, not the object reference. 
 If any *attributes that used for comparing two elements* of an specific 
 element is modified by other methods, the TreeSet will be in an un-sorted 
 state, and cannot become sorted forever except that we reconstruct another 
 TreeSet with the elements. To avoid this, one must be *very careful* when 
 they try to modify the attributes (such as increase or decrease the used 
 capacity of a schedulabeEntity) of an object.
 An example in AbstractComparatorOrderingPolicy.java, Line63,
 {code}
   protected void reorderSchedulableEntity(S schedulableEntity) {
 //remove, update comparable data, and reinsert to update position in order
 schedulableEntities.remove(schedulableEntity);
 updateSchedulingResourceUsage(
 schedulableEntity.getSchedulingResourceUsage());
 schedulableEntities.add(schedulableEntity);
   }
 {code}
 This method tries to remove the schedulableEntity first and then reinsert it 
 so as to reorder the set. However, the changes of the schedulableEntity 
 should be done in the middle of the above two operations. But the comparator 
 of the class is not clear, so we don't know which attributes of the 
 schedulableEntity was changed. If we changed the schedulableEntity outside 
 the method and then inform the orderingPolicy that we made such a change, the 
 operation schedulableEntities.remove(schedulableEntity) would not work 
 correctly since the element of a TreeSet is identified by comparator. Any 
 implement class of this abstract class should overwrite this method, but few 
 does. Another choice is that we make modification of a schedulableEntity 
 manually, but we mustn't forget to reorder the set when we do so and must 
 remember the order: remove, modify the attributes(used for comparing), 
 insert, or use an iterator to mark the schedulableEntity so that we can 
 remove and reinsert it correctly.
 YARN-897 is an example that we fell into the trap. If the comparator become 
 complex in future, e.g., we consider other types of resources in comparator, 
 such traps will be more and disperse anywhere, which makes it easy to let a 
 TreeSet become a un-sorted state.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)