[
https://issues.apache.org/jira/browse/FLINK-3994?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15307691#comment-15307691
]
ASF GitHub Bot commented on FLINK-3994:
---------------------------------------
GitHub user chiwanpark opened a pull request:
https://github.com/apache/flink/pull/2056
[FLINK-3994] [ml, tests] Fix flaky KNN integration tests
This PR addresses the flaky KNN integration tests. The failures appear to be
caused by sharing an `ExecutionEnvironment` between test cases, although I'm not
sure of the exact reason. This PR gives each test case its own
`ExecutionEnvironment`. With this change, the tests pass on my local machine and
on my Travis-CI build [1].
I have some doubts, because this does not fix the root cause of the problem.
AFAIK, and as @StephanEwen said, sharing an `ExecutionEnvironment` should be
supported. Additionally, `mvn clean verify` passes on my local machine even
without this PR. If you have any other opinions, please leave a comment.
[1]: https://travis-ci.org/chiwanpark/flink/builds/134104491
P.S. We need to rewrite the commit message.
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/chiwanpark/flink hotfix-ml-test
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/flink/pull/2056.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #2056
----
commit a47ae8481bcbac2c490386089ee6b1e740f3a1f4
Author: Chiwan Park <[email protected]>
Date: 2016-05-31T08:50:05Z
[hotfix] [ml] Fix flaky KNN integration tests
----
> Instable KNNITSuite
> -------------------
>
> Key: FLINK-3994
> URL: https://issues.apache.org/jira/browse/FLINK-3994
> Project: Flink
> Issue Type: Bug
> Components: Machine Learning Library, Tests
> Affects Versions: 1.1.0
> Reporter: Chiwan Park
> Assignee: Chiwan Park
> Priority: Critical
> Labels: test-stability
> Fix For: 1.1.0
>
>
> KNNITSuite fails in Travis-CI with the following error:
> {code}
> org.apache.flink.runtime.client.JobExecutionException: Job execution failed.
> at org.apache.flink.runtime.jobmanager.JobManager$$anonfun$handleMessage$1$$anonfun$applyOrElse$7.apply$mcV$sp(JobManager.scala:806)
> at org.apache.flink.runtime.jobmanager.JobManager$$anonfun$handleMessage$1$$anonfun$applyOrElse$7.apply(JobManager.scala:752)
> at org.apache.flink.runtime.jobmanager.JobManager$$anonfun$handleMessage$1$$anonfun$applyOrElse$7.apply(JobManager.scala:752)
> at scala.concurrent.impl.Future$PromiseCompletingRunnable.liftedTree1$1(Future.scala:24)
> at scala.concurrent.impl.Future$PromiseCompletingRunnable.run(Future.scala:24)
> at akka.dispatch.TaskInvocation.run(AbstractDispatcher.scala:41)
> at akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(AbstractDispatcher.scala:401)
> at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
> at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.pollAndExecAll(ForkJoinPool.java:1253)
> at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1346)
> ...
> Cause: java.io.IOException: Insufficient number of network buffers:
> required 32, but only 4 available. The total number of network buffers is
> currently set to 2048. You can increase this number by setting the
> configuration key 'taskmanager.network.numberOfBuffers'.
> at org.apache.flink.runtime.io.network.buffer.NetworkBufferPool.createBufferPool(NetworkBufferPool.java:196)
> at org.apache.flink.runtime.io.network.NetworkEnvironment.registerTask(NetworkEnvironment.java:327)
> at org.apache.flink.runtime.taskmanager.Task.run(Task.java:497)
> at java.lang.Thread.run(Thread.java:745)
> ...
> {code}
> https://s3.amazonaws.com/archive.travis-ci.org/jobs/134064237/log.txt
> https://s3.amazonaws.com/archive.travis-ci.org/jobs/134064236/log.txt
> https://s3.amazonaws.com/archive.travis-ci.org/jobs/134064235/log.txt
> https://s3.amazonaws.com/archive.travis-ci.org/jobs/134052961/log.txt
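The arithmetic in the error message quoted above can be reproduced with a toy model. This is only a sketch, assuming a fixed pool of 2048 buffers from which each task reserves its share up front. `BufferPoolSketch` and `FixedPool` are hypothetical names, not Flink classes, and the assumption that earlier work still holds 2044 buffers is made up purely to match the numbers in the message. It is not a diagnosis: the PR description states that the exact cause is unknown.

```java
public class BufferPoolSketch {

    /** Hypothetical fixed-size pool; a stand-in, not Flink's NetworkBufferPool. */
    static final class FixedPool {
        private final int total;
        private int available;

        FixedPool(int total) {
            this.total = total;
            this.available = total;
        }

        /** Reserves the requested number of buffers, or fails if fewer are left. */
        void reserve(int required) {
            if (required > available) {
                // Flink reports this as an IOException; the sketch uses an
                // unchecked exception only to stay short.
                throw new IllegalStateException(
                        "Insufficient number of network buffers: required " + required
                                + ", but only " + available + " available. The total number"
                                + " of network buffers is currently set to " + total + ".");
            }
            available -= required;
        }

        int available() {
            return available;
        }
    }

    public static void main(String[] args) {
        // Assumption for illustration: a pool reused across test cases in which
        // earlier work still holds 2044 of the 2048 buffers.
        FixedPool shared = new FixedPool(2048);
        shared.reserve(2044);
        try {
            shared.reserve(32);
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage());
        }

        // A pool that starts out empty satisfies the same request.
        FixedPool fresh = new FixedPool(2048);
        fresh.reserve(32);
        System.out.println("Buffers left in fresh pool: " + fresh.available());
    }
}
```

If the pool is simply too small rather than being held by earlier work, the error message itself names the configuration key to raise: 'taskmanager.network.numberOfBuffers'.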
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)