[ https://issues.apache.org/jira/browse/APEXMALHAR-2197?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15426793#comment-15426793 ]
ASF GitHub Bot commented on APEXMALHAR-2197: -------------------------------------------- GitHub user brightchen opened a pull request: https://github.com/apache/apex-malhar/pull/374 APEXMALHAR-2197 #resolve #comment fix TimeBasedPriorityQueue.removeLR… …U throws NoSuchElementException You can merge this pull request into a Git repository by running: $ git pull https://github.com/brightchen/apex-malhar APEXMALHAR-2197 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/apex-malhar/pull/374.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #374 ---- commit 2989162248a9057357dbfe725e6473840c46b28a Author: brightchen <bri...@datatorrent.com> Date: 2016-08-18T16:50:55Z APEXMALHAR-2197 #resolve #comment fix TimeBasedPriorityQueue.removeLRU throws NoSuchElementException ---- > TimeBasedPriorityQueue.removeLRU throws NoSuchElementException > -------------------------------------------------------------- > > Key: APEXMALHAR-2197 > URL: https://issues.apache.org/jira/browse/APEXMALHAR-2197 > Project: Apache Apex Malhar > Issue Type: Bug > Reporter: bright chen > Assignee: bright chen > Original Estimate: 120h > Remaining Estimate: 120h > > When there are lots of keys and large throughput, the following exception > will be throw. > java.util.NoSuchElementException > at java.util.TreeMap$PrivateEntryIterator.nextEntry(TreeMap.java:1113) > at java.util.TreeMap$KeyIterator.next(TreeMap.java:1169) > at > org.apache.apex.malhar.lib.state.spillable.TimeBasedPriorityQueue.removeLRU(TimeBasedPriorityQueue.java:75) > at > org.apache.apex.malhar.lib.state.spillable.WindowBoundedMapCache.endWindow(WindowBoundedMapCache.java:119) > at > org.apache.apex.malhar.lib.state.spillable.SpillableByteArrayListMultimapImpl.endWindow(SpillableByteArrayListMultimapImpl.java:281) > at > com.datatorrent.benchmark.spillable.SpillableDSBenchmarkTest.testSpillableMutimap(SpillableDSBenchmarkTest.java:139) > The reason is as following: > In TimeBasedPriorityQueue.TimeWrapper<T>, when implement compareTo(), only > compare by time. While in equals(), compare both time and key. > Generally, "compareTo() == 0” means “equals() == true”. But in > TimeBasedPriorityQueue, compareTo() was used by “sortedTimestamp” and it > would be OK only compare by time. > But it would cause problem in removeLRU() if lots of key written in same > time. In this case, one time(even different key) is one entry of > sortedTimestamp, but the counter and count was calculated by key. > So the counter could be much larger than the number of entry of > sortedTimestamp. And NoSuchElementException would throw when reach the end of > the iterator. > Solutions: > Add key as the less important factor could solve this problem. But probably > need to limit key to support compareTo() > Another idea I think also work is use two maps, the benefit is don’t need to > compare by key( only compare by time) > - timeToKeys: multiple list map from time to keys > - keyToRecentTime: key to most recent time > implementation: > - upSert(T key): get time from keyToRecentTime map; if don’t have time, put > (key, time) to both keyToRecentTime and timeToKeys; if have time and time is > different, remove key from timeToKeys, update keyToRecentTime > - removeLRU(int count): get time sortedTimestamp, and then get and remove > keys from timeToKeys, and remove entry from keyToRecentTime. -- This message was sent by Atlassian JIRA (v6.3.4#6332)