[jira] [Commented] (FLINK-2545) NegativeArraySizeException while creating hash table bloom filters
[ https://issues.apache.org/jira/browse/FLINK-2545?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14727371#comment-14727371 ]

ASF GitHub Bot commented on FLINK-2545:
---------------------------------------

Github user asfgit closed the pull request at: https://github.com/apache/flink/pull/1067

> NegativeArraySizeException while creating hash table bloom filters
> ------------------------------------------------------------------
>
>             Key: FLINK-2545
>             URL: https://issues.apache.org/jira/browse/FLINK-2545
>         Project: Flink
>      Issue Type: Bug
>      Components: Distributed Runtime
> Affects Versions: master
>        Reporter: Greg Hogan
>        Assignee: Chengxiang Li
>
> The following exception occurred a second time when I immediately re-ran my
> application, though after recompiling and restarting Flink the subsequent
> execution ran without error.
>
> java.lang.Exception: The data preparation for task '...' , caused an error: null
> 	at org.apache.flink.runtime.operators.RegularPactTask.run(RegularPactTask.java:465)
> 	at org.apache.flink.runtime.operators.RegularPactTask.invoke(RegularPactTask.java:354)
> 	at org.apache.flink.runtime.taskmanager.Task.run(Task.java:581)
> 	at java.lang.Thread.run(Thread.java:745)
> Caused by: java.lang.NegativeArraySizeException
> 	at org.apache.flink.runtime.operators.hash.MutableHashTable.buildBloomFilterForBucket(MutableHashTable.java:1160)
> 	at org.apache.flink.runtime.operators.hash.MutableHashTable.buildBloomFilterForBucketsInPartition(MutableHashTable.java:1143)
> 	at org.apache.flink.runtime.operators.hash.MutableHashTable.spillPartition(MutableHashTable.java:1117)
> 	at org.apache.flink.runtime.operators.hash.MutableHashTable.insertBucketEntry(MutableHashTable.java:946)
> 	at org.apache.flink.runtime.operators.hash.MutableHashTable.insertIntoTable(MutableHashTable.java:868)
> 	at org.apache.flink.runtime.operators.hash.MutableHashTable.buildInitialTable(MutableHashTable.java:692)
> 	at org.apache.flink.runtime.operators.hash.MutableHashTable.open(MutableHashTable.java:455)
> 	at org.apache.flink.runtime.operators.hash.ReusingBuildSecondHashMatchIterator.open(ReusingBuildSecondHashMatchIterator.java:93)
> 	at org.apache.flink.runtime.operators.JoinDriver.prepare(JoinDriver.java:195)
> 	at org.apache.flink.runtime.operators.RegularPactTask.run(RegularPactTask.java:459)
> 	... 3 more

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
[ https://issues.apache.org/jira/browse/FLINK-2545?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14725153#comment-14725153 ]

ASF GitHub Bot commented on FLINK-2545:
---------------------------------------

Github user StephanEwen commented on the pull request: https://github.com/apache/flink/pull/1067#issuecomment-136667661

Looks good, I'll merge this!
[ https://issues.apache.org/jira/browse/FLINK-2545?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14721931#comment-14721931 ]

ASF GitHub Bot commented on FLINK-2545:
---------------------------------------

Github user ChengXiangLi commented on the pull request: https://github.com/apache/flink/pull/1067#issuecomment-136243437

Nice job, @greghogan, you pointed out both the root cause and the solution. I added the logic to skip the last buckets as @StephanEwen suggested, and added a related unit test for this issue.
[ https://issues.apache.org/jira/browse/FLINK-2545?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=1472#comment-1472 ]

ASF GitHub Bot commented on FLINK-2545:
---------------------------------------

Github user StephanEwen commented on the pull request: https://github.com/apache/flink/pull/1067#issuecomment-135983916

Ah, that makes perfect sense. The last memory segment is not fully used (only up to the point where the hash index has initialized enough buckets). The bloom filter initialization loops should also skip those last buckets. Since the memory for these buckets is not initialized, their contents (such as the count) are undefined.
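The failure mode this describes can be sketched in a minimal, self-contained model. The layout here (one count per bucket, passed as a plain int) stands in for Flink's MemorySegment-based bucket headers and the names are hypothetical, not the actual MutableHashTable code:

```java
// Sketch of why an uninitialized bucket header crashes bloom filter
// construction, and the guard that avoids it. Bucket layout and method
// names are illustrative, not Flink's actual implementation.
public class BloomFilterSketch {

    // A naive per-bucket step: allocate space for `count` entries.
    // If the header memory was never initialized, count can be garbage
    // (negative), and `new int[count]` throws NegativeArraySizeException.
    static int[] collectHashes(int count) {
        return new int[count];
    }

    // Guarded version: skip buckets whose count is impossible, i.e.
    // whose header memory is undefined. Returns whether the bucket
    // was actually processed.
    static boolean buildForBucket(int count) {
        if (count < 0) {
            return false; // uninitialized header: skip this bucket
        }
        collectHashes(count);
        return true;
    }

    public static void main(String[] args) {
        System.out.println(buildForBucket(9));      // an initialized bucket: true
        System.out.println(buildForBucket(-32697)); // garbage count: false, no crash
    }
}
```

The guard only suppresses the symptom; as the discussion below notes, the real fix is to make sure bloom filter construction never visits buckets that the table never initialized.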
[ https://issues.apache.org/jira/browse/FLINK-2545?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14720556#comment-14720556 ]

ASF GitHub Bot commented on FLINK-2545:
---------------------------------------

Github user greghogan commented on the pull request: https://github.com/apache/flink/pull/1067#issuecomment-135885866

I am currently running release-0.10.0-milestone-1. Debugging with Eclipse and looking at MutableHashTable.initTable, numBuckets is computed as 16086. There are 63 memory segments with 256 buckets each = 16128 total buckets. The last 16128 - 16086 = 42 buckets are not initialized by initTable, which terminates the inner loop when bucket == numBuckets.

Here is an example header dump from the last memory segment showing the crossover from initialized to uninitialized data.

{code}
offset  partition  status  count  next-pointer
26880   10         0       0      -72340172838076673
27008   11         0       0      -72340172838076673
27136   12         0       0      -72340172838076673
27264   13         0       0      -72340172838076673
27392   0          -56     9      844425030795264
27520   0          -56     9      -9191846839379296256
27648   0          -56     9      10133099245469696
27776   0          -56     9      12103424082444288
{code}

Setting a breakpoint in MutableHashTable.buildBloomFilterForBucket for count < 0, the last memory segment looked as follows (this is from a different execution, operation, and thread).

{code}
offset  partition  status  count   next-pointer
26880   10         0       9       27584547767975936
27008   11         0       9       -9208735337998712832
27136   12         0       9       4503599694479360
27264   13         0       9       -9219994337067139072
27392   0          0       -32697  1161165883580435
27520   0          3       -15328  18016855230957176
27648   0          5       1388    -33740636012148672
27776   0          6       25494   -17363350186618861
{code}

MutableHashTable.buildBloomFilterForBucketsInPartition processed offset 27392, which happened to match the partition number and bucket status even though it looks to be uninitialized. After changing MutableHashTable.initTable to initialize all buckets I have not seen the bug reoccur.
{code}
for (int k = 0; k < bucketsPerSegment /* && bucket < numBuckets*/; k++, bucket++) {
}
{code}

I see at least three potential resolutions:
1) have MutableHashTable.initTable initialize all buckets,
2) have MutableHashTable.buildBloomFilterForBucket skip uninitialized buckets, or
3) I have not looked enough at MutableHashTable.getInitialTableSize, but is it possible to completely fill the last segment with usable buckets?
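The arithmetic above can be checked with a minimal model of the guarded initialization loop. The class and method names are hypothetical stand-ins; only the loop shape mirrors the initTable snippet quoted above:

```java
// Minimal model of the bucket initialization arithmetic described above.
// Names and the simplified layout are illustrative, not Flink's actual code.
public class BucketInitModel {
    static final int BUCKETS_PER_SEGMENT = 256;

    // Counts how many buckets a guarded loop like initTable's initializes:
    // the inner loop stops at numBuckets, leaving the remaining bucket
    // headers in the last segment untouched (undefined memory).
    static int initializedBuckets(int numSegments, int numBuckets) {
        int bucket = 0;
        for (int i = 0; i < numSegments; i++) {
            for (int k = 0; k < BUCKETS_PER_SEGMENT && bucket < numBuckets; k++, bucket++) {
                // a real implementation would write the bucket header here
            }
        }
        return bucket;
    }

    public static void main(String[] args) {
        int numSegments = 63;
        int numBuckets = 16086; // value observed in the debugger
        int total = numSegments * BUCKETS_PER_SEGMENT;
        int initialized = initializedBuckets(numSegments, numBuckets);
        // prints "16128 total, 42 uninitialized"
        System.out.println(total + " total, " + (total - initialized) + " uninitialized");
    }
}
```

This reproduces the 42 uninitialized trailing buckets from the header dump: 63 × 256 = 16128 bucket slots, of which only 16086 are initialized.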
[ https://issues.apache.org/jira/browse/FLINK-2545?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14717934#comment-14717934 ]

ASF GitHub Bot commented on FLINK-2545:
---------------------------------------

Github user ChengXiangLi commented on the pull request: https://github.com/apache/flink/pull/1067#issuecomment-135612599

Thanks for the reminder, @zentol and @StephanEwen; I was in too much of a hurry to open this PR. I tried to fix the exception in the bloom filter in this PR and to verify the other potential hash table issues behind the negative count separately; obviously, there is no need to do it that way. So let's wait for Greg's response now.
[ https://issues.apache.org/jira/browse/FLINK-2545?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14716324#comment-14716324 ]

ASF GitHub Bot commented on FLINK-2545:
---------------------------------------

Github user StephanEwen commented on the pull request: https://github.com/apache/flink/pull/1067#issuecomment-135354418

@ChengXiangLi Do you know what caused the problem initially? I was puzzled, because the count in the bucket should never be negative, and a zero-sized bucket should work with your original code. It would be great to capture that error, to see whether the root bug was actually somewhere else, not in the bloom filters but in other parts of the hash table structure. Hopefully Greg can help us reproduce this...
[ https://issues.apache.org/jira/browse/FLINK-2545?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14716292#comment-14716292 ]

ASF GitHub Bot commented on FLINK-2545:
---------------------------------------

Github user zentol commented on the pull request: https://github.com/apache/flink/pull/1067#issuecomment-135342976

Have you verified that returning at that position does not cause other issues? This is essentially just swallowing the thrown exception, hoping nothing else goes wrong. I don't see how this actually fixes the issue. The count being negative tells us something is wrong with how the bucket count is set. Resolving that would be a fix.
[ https://issues.apache.org/jira/browse/FLINK-2545?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14716279#comment-14716279 ]

ASF GitHub Bot commented on FLINK-2545:
---------------------------------------

GitHub user ChengXiangLi opened a pull request: https://github.com/apache/flink/pull/1067

[FLINK-2545] add bucket member count verification while build bloom filter

Bug fix; see the detailed error message at [FLINK-2545](https://issues.apache.org/jira/browse/FLINK-2545).

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/ChengXiangLi/flink FLINK-2545

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/flink/pull/1067.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:

    This closes #1067

commit 072ce60ac875a270f2cd06ec6323f9eb2814d5ea
Author: chengxiang li
Date: 2015-08-27T06:28:38Z

    [FLINK-2545] add bucket member count verification while build bloom filter.
[ https://issues.apache.org/jira/browse/FLINK-2545?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14716152#comment-14716152 ]

Chengxiang Li commented on FLINK-2545:
--------------------------------------

Hi, [~greghogan], do you mind sharing how to reproduce this issue? The exception is caused by a negative array size, which represents the count of bucket members. It would be very simple to add a check of the count value to fix this issue, but the count should never be negative according to the hash table implementation, so if possible I want to reproduce this issue to check whether there are other hidden issues behind it.
[ https://issues.apache.org/jira/browse/FLINK-2545?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14701802#comment-14701802 ]

Stephan Ewen commented on FLINK-2545:
-------------------------------------

Thanks for reporting this! As a quick fix, you can disable bloom filters by adding {{taskmanager.runtime.hashjoin-bloom-filters: false}} to the Flink config. See here for a reference: https://ci.apache.org/projects/flink/flink-docs-master/setup/config.html#runtime-algorithms

The bloom filters are a relatively new addition in 0.10.
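Concretely, the quick fix suggested in the last comment amounts to one line in the TaskManager's flink-conf.yaml (the surrounding keys are illustrative context, not required):

```yaml
# flink-conf.yaml
# Workaround for FLINK-2545: disable bloom filters in hash joins
# until the hash table's bucket initialization is fixed.
taskmanager.runtime.hashjoin-bloom-filters: false
```

The TaskManager must be restarted for the setting to take effect.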