[jira] [Commented] (FLINK-4833) Unstable test OperatorStatsAccumulatorTest.testAccumulatorHeavyHitterCountMinSketch

2016-10-21 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4833?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15595040#comment-15595040
 ] 

ASF GitHub Bot commented on FLINK-4833:
---

Github user asfgit closed the pull request at:

https://github.com/apache/flink/pull/2660


> Unstable test OperatorStatsAccumulatorTest.testAccumulatorHeavyHitterCountMinSketch
> ---
>
>         Key: FLINK-4833
>         URL: https://issues.apache.org/jira/browse/FLINK-4833
>     Project: Flink
>  Issue Type: Bug
>  Components: Tests
>    Reporter: Kostas Kloudas
>      Labels: test-stability
>
> Some instances:
> view-source:https://s3.amazonaws.com/archive.travis-ci.org/jobs/167801187/log.txt
> view-source:https://s3.amazonaws.com/archive.travis-ci.org/jobs/167801191/log.txt
> view-source:https://s3.amazonaws.com/archive.travis-ci.org/jobs/167801193/log.txt
> view-source:https://s3.amazonaws.com/archive.travis-ci.org/jobs/167801195/log.txt



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-4833) Unstable test OperatorStatsAccumulatorTest.testAccumulatorHeavyHitterCountMinSketch

2016-10-21 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4833?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15594717#comment-15594717
 ] 

ASF GitHub Bot commented on FLINK-4833:
---

Github user zentol commented on the issue:

https://github.com/apache/flink/pull/2660
  
+1 to merge.




[jira] [Commented] (FLINK-4833) Unstable test OperatorStatsAccumulatorTest.testAccumulatorHeavyHitterCountMinSketch

2016-10-19 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4833?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15588827#comment-15588827
 ] 

ASF GitHub Bot commented on FLINK-4833:
---

GitHub user mxm opened a pull request:

https://github.com/apache/flink/pull/2660

[FLINK-4833] properly log exceptions in CountMinHeavyHitter

This logs the underlying exception properly, which could help us find the 
exact cause of the reported problems.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/mxm/flink FLINK-4833

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/flink/pull/2660.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2660


commit d117c59e704e9faad8d5001cb9b2164cd4aa7b9a
Author: Maximilian Michels 
Date:   2016-10-19T13:50:15Z

[FLINK-4833] properly log exceptions in CountMinHeavyHitter






[jira] [Commented] (FLINK-4833) Unstable test OperatorStatsAccumulatorTest.testAccumulatorHeavyHitterCountMinSketch

2016-10-18 Thread Robert Metzger (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4833?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15585028#comment-15585028
 ] 

Robert Metzger commented on FLINK-4833:
---

But why is the error now happening so frequently? Did we change something in 
the accumulator implementation?

I got two more cases today:
https://s3.amazonaws.com/archive.travis-ci.org/jobs/168509703/log.txt
https://s3.amazonaws.com/archive.travis-ci.org/jobs/168509705/log.txt



[jira] [Commented] (FLINK-4833) Unstable test OperatorStatsAccumulatorTest.testAccumulatorHeavyHitterCountMinSketch

2016-10-17 Thread Maximilian Michels (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4833?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15581973#comment-15581973
 ] 

Maximilian Michels commented on FLINK-4833:
---

I think this occurs when the TaskManager snapshots the state of the operator 
statistics operator concurrently with values being added.
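
To illustrate the kind of race described above, here is a minimal sketch of an accumulator-like class whose snapshot cannot interleave with updates. It is an assumption for illustration only: `GuardedCounts`, `add` and `snapshot` are hypothetical names, nothing here is taken from the Flink code base, and this is not necessarily how FLINK-4829 addresses the problem.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical stand-in for an accumulator that is updated and snapshotted concurrently. */
final class GuardedCounts {

    private final Map<String, Long> counts = new HashMap<>();

    /** Called by the thread that processes values. */
    synchronized void add(String value) {
        counts.merge(value, 1L, Long::sum);
    }

    /**
     * Called by the thread that reports state. Copying under the same lock
     * as add() means the copy never observes a half-applied update.
     */
    synchronized Map<String, Long> snapshot() {
        return new HashMap<>(counts);
    }

    public static void main(String[] args) throws InterruptedException {
        GuardedCounts acc = new GuardedCounts();
        Thread writer = new Thread(() -> {
            for (int i = 0; i < 100_000; i++) {
                acc.add("key-" + (i % 1000));
            }
        });
        writer.start();
        // Take snapshots while the writer is still adding values.
        while (writer.isAlive()) {
            acc.snapshot();
        }
        writer.join();
        long total = 0;
        for (long c : acc.snapshot().values()) {
            total += c;
        }
        System.out.println("values counted: " + total);
    }
}
```

Without the shared lock, copying a plain `HashMap` while another thread modifies it is unsafe and may fail intermittently, which would match the sporadic nature of the test failures.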

The underlying exception in {{CountMinHeavyHitter.merge(..)}} is not properly 
forwarded. This could be improved in the operator statistics code. The 
concurrent modification itself (snapshotting while the accumulator is being 
modified) can only be solved by FLINK-4829.
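
As a rough sketch of what forwarding the underlying exception could look like: the example below contrasts rethrowing without the cause against logging and chaining it. All names (`ExceptionForwardingSketch`, `MergeException`, `merge`, `cloneLosingCause`, `cloneKeepingCause`) are hypothetical and not taken from the Flink sources; only the message text "Failed to clone OperatorStatistics!" comes from the stack trace reported on this issue. `java.util.logging` is used only to keep the example self-contained.

```java
import java.util.logging.Level;
import java.util.logging.Logger;

final class ExceptionForwardingSketch {

    private static final Logger LOG =
            Logger.getLogger(ExceptionForwardingSketch.class.getName());

    /** Hypothetical checked exception standing in for a failed merge. */
    static class MergeException extends Exception {
        MergeException(String message) {
            super(message);
        }
    }

    /** Hypothetical merge step; it always fails so the handling can be observed. */
    static void merge() throws MergeException {
        throw new MergeException("state changed during merge");
    }

    /** Drops the cause: the caller only ever sees the generic message. */
    static Object cloneLosingCause() {
        try {
            merge();
            return new Object();
        } catch (MergeException e) {
            throw new RuntimeException("Failed to clone OperatorStatistics!");
        }
    }

    /** Logs the cause and chains it, so it appears as "Caused by" in the trace. */
    static Object cloneKeepingCause() {
        try {
            merge();
            return new Object();
        } catch (MergeException e) {
            LOG.log(Level.SEVERE, "Failed to clone OperatorStatistics!", e);
            throw new RuntimeException("Failed to clone OperatorStatistics!", e);
        }
    }

    public static void main(String[] args) {
        try {
            cloneKeepingCause();
        } catch (RuntimeException e) {
            System.out.println(e.getMessage() + " cause: " + e.getCause());
        }
    }
}
```

With the second variant, a failing test run would show the real reason for the failed clone instead of stopping at the generic `RuntimeException`.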



[jira] [Commented] (FLINK-4833) Unstable test OperatorStatsAccumulatorTest.testAccumulatorHeavyHitterCountMinSketch

2016-10-17 Thread Robert Metzger (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4833?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15581841#comment-15581841
 ] 

Robert Metzger commented on FLINK-4833:
---

More of this: 
https://s3.amazonaws.com/archive.travis-ci.org/jobs/167840958/log.txt

{code}
org.apache.flink.runtime.client.JobExecutionException: Failed to retrieve accumulator results.
    at org.apache.flink.runtime.jobmanager.JobManager$$anonfun$handleMessage$1$$anonfun$applyOrElse$7.apply$mcV$sp(JobManager.scala:877)
    at org.apache.flink.runtime.jobmanager.JobManager$$anonfun$handleMessage$1$$anonfun$applyOrElse$7.apply(JobManager.scala:845)
    at org.apache.flink.runtime.jobmanager.JobManager$$anonfun$handleMessage$1$$anonfun$applyOrElse$7.apply(JobManager.scala:845)
    at scala.concurrent.impl.Future$PromiseCompletingRunnable.liftedTree1$1(Future.scala:24)
    at scala.concurrent.impl.Future$PromiseCompletingRunnable.run(Future.scala:24)
    at akka.dispatch.TaskInvocation.run(AbstractDispatcher.scala:41)
    at akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(AbstractDispatcher.scala:401)
    at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
    at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.pollAndExecAll(ForkJoinPool.java:1253)
    at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1346)
    at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
    at scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
Caused by: java.lang.RuntimeException: Failed to clone OperatorStatistics!
    at org.apache.flink.contrib.operatorstatistics.OperatorStatistics.clone(OperatorStatistics.java:219)
    at org.apache.flink.contrib.operatorstatistics.OperatorStatisticsAccumulator.clone(OperatorStatisticsAccumulator.java:62)
    at org.apache.flink.api.common.accumulators.AccumulatorHelper.mergeInto(AccumulatorHelper.java:49)
    at org.apache.flink.runtime.executiongraph.ExecutionGraph.aggregateUserAccumulators(ExecutionGraph.java:625)
    at org.apache.flink.runtime.executiongraph.ExecutionGraph.getAccumulatorsSerialized(ExecutionGraph.java:640)
    at org.apache.flink.runtime.jobmanager.JobManager$$anonfun$handleMessage$1$$anonfun$applyOrElse$7.apply$mcV$sp(JobManager.scala:866)
    at org.apache.flink.runtime.jobmanager.JobManager$$anonfun$handleMessage$1$$anonfun$applyOrElse$7.apply(JobManager.scala:845)
    at org.apache.flink.runtime.jobmanager.JobManager$$anonfun$handleMessage$1$$anonfun$applyOrElse$7.apply(JobManager.scala:845)
    at scala.concurrent.impl.Future$PromiseCompletingRunnable.liftedTree1$1(Future.scala:24)
    at scala.concurrent.impl.Future$PromiseCompletingRunnable.run(Future.scala:24)
    at akka.dispatch.TaskInvocation.run(AbstractDispatcher.scala:41)
    at akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(AbstractDispatcher.scala:401)
    at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
    at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.pollAndExecAll(ForkJoinPool.java:1253)
    at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1346)
    at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
    at scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)


Results :

Tests in error: 
  OperatorStatsAccumulatorTest.testAccumulatorHeavyHitterCountMinSketch » JobExecution
{code}

[~tmendt] can you look into the issue?



[jira] [Commented] (FLINK-4833) Unstable test OperatorStatsAccumulatorTest.testAccumulatorHeavyHitterCountMinSketch

2016-10-17 Thread Maximilian Michels (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4833?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15581816#comment-15581816
 ] 

Maximilian Michels commented on FLINK-4833:
---

This can also be traced back to FLINK-4829.
