[jira] [Commented] (FLINK-1959) Accumulators BROKEN after Partitioning

2015-05-12 Thread Alexander Alexandrov (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1959?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14540146#comment-14540146
 ] 

Alexander Alexandrov commented on FLINK-1959:
-

After some more debugging with [~elbehery] we managed to identify the issue.

Adding {{partitionByHash}} to the pipeline breaks the local chain in two and 
creates two tasks. The second task (the receiver of the partition) starts with 
a RegularPactTask which runs a NoOpDriver and does not have a UDF stub. Because 
of that, the test at [line 
511|https://github.com/apache/flink/blob/master/flink-runtime/src/main/java/org/apache/flink/runtime/operators/RegularPactTask.java#L511]
 fails and the accumulators are never reported. 

@[~StephanEwen]: Unless you see a problem with that solution, I suggest 
removing the {{stub != null}} check at [line 
511|https://github.com/apache/flink/blob/master/flink-runtime/src/main/java/org/apache/flink/runtime/operators/RegularPactTask.java#L511]
 and always reporting accumulators.
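
To illustrate the effect, here is a minimal, self-contained sketch in plain Java. It is a toy model, not Flink code: the {{Task}} class, the {{collect}} method, and the accumulator name {{empty-fields}} are all made up for illustration. The only assumption it encodes is that a task reports its accumulators only when its head driver has a stub, which is how I read the check at line 511.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Toy model only: none of these types are Flink classes.
final class AccumulatorReportSketch {

    /** Hypothetical stand-in for a task: a head driver plus its chained operators. */
    static final class Task {
        // UDF of the head driver; null models a no-op driver without a stub.
        final Object stub;
        // Accumulators registered by UDFs running in this task (head or chained).
        final Map<String, Long> accumulators = new HashMap<>();

        Task(Object stub) {
            this.stub = stub;
        }
    }

    /** Merges what each task reports; with requireStub, stub-less tasks report nothing. */
    static Map<String, Long> collect(List<Task> tasks, boolean requireStub) {
        Map<String, Long> jobResult = new HashMap<>();
        for (Task task : tasks) {
            if (requireStub && task.stub == null) {
                continue; // models the guarded report: these accumulators are dropped
            }
            task.accumulators.forEach((name, value) -> jobResult.merge(name, value, Long::sum));
        }
        return jobResult;
    }

    /** Builds a two-task pipeline: a sender, then a receiver whose head driver has no stub. */
    static List<Task> partitionedPipeline() {
        Task sender = new Task(new Object());          // head driver with a stub, no accumulators
        Task receiver = new Task(null);                // no-op head driver, no stub
        receiver.accumulators.put("empty-fields", 3L); // registered by the chained filter
        List<Task> tasks = new ArrayList<>();
        tasks.add(sender);
        tasks.add(receiver);
        return tasks;
    }

    public static void main(String[] args) {
        List<Task> tasks = partitionedPipeline();
        System.out.println("with check:    " + collect(tasks, true).get("empty-fields"));
        System.out.println("without check: " + collect(tasks, false).get("empty-fields"));
    }
}
```

Run as is, the first line prints null (the accumulator registered by the chained filter is dropped) and the second prints 3, which matches the NULL result described in the issue and the behaviour expected once the check is removed.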

Regards,
Alexander


 Accumulators BROKEN after Partitioning
 --

 Key: FLINK-1959
 URL: https://issues.apache.org/jira/browse/FLINK-1959
 Project: Flink
  Issue Type: Bug
  Components: Examples
Affects Versions: master
Reporter: mustafa elbehery
Priority: Critical
 Fix For: master


 While running the Accumulator example in 
 https://github.com/Elbehery/flink/blob/master/flink-examples/flink-java-examples/src/main/java/org/apache/flink/examples/java/relational/EmptyFieldsCountAccumulator.java,
 I tried to alter the data flow by applying partitionByHash before the 
 filter, and the resulting accumulator was NULL. 
 While debugging, I could see the accumulator in the runtime map. However, when 
 retrieving the accumulator from the JobExecutionResult object, it was NULL. 
 The line that causes the problem is file.partitionByHash(1).filter(new 
 EmptyFieldFilter()), used instead of file.filter(new EmptyFieldFilter()).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-1959) Accumulators BROKEN after Partitioning

2015-05-12 Thread Stephan Ewen (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1959?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14540610#comment-14540610
 ] 

Stephan Ewen commented on FLINK-1959:
-

Great debugging, this is perfect!

I will prepare a fix based on that...



[jira] [Commented] (FLINK-1959) Accumulators BROKEN after Partitioning

2015-05-12 Thread mustafa elbehery (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1959?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14539987#comment-14539987
 ] 

mustafa elbehery commented on FLINK-1959:
-

I have tried debugging to find the problem. What I can see is that 
re-partitioning is working fine and the data is redistributed as expected. 
Moreover, the accumulators of each sub-task are created and filled with the 
required information. However, the merge method is never called when 
re-partitioning is used. 

I could see in debug mode that the accumulator in the runtime is empty.

Any suggestions for solving the problem?



[jira] [Commented] (FLINK-1959) Accumulators BROKEN after Partitioning

2015-05-11 Thread Alexander Alexandrov (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1959?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14537845#comment-14537845
 ] 

Alexander Alexandrov commented on FLINK-1959:
-

I think that some communication / traversal chain gets broken in the 
PartitionByHash node.

You can either 

(1) try to dig through the code and see where this happens, or
(2) use an alternative to the accumulator until the issue is resolved (e.g. 
write the information to a pre-defined HDFS path).

 Accumulators BROKEN after Partitioning
 --

 Key: FLINK-1959
 URL: https://issues.apache.org/jira/browse/FLINK-1959
 Project: Flink
  Issue Type: Bug
  Components: Examples
Affects Versions: 0.8.1
Reporter: mustafa elbehery
Priority: Critical
 Fix For: 0.8.1




[jira] [Commented] (FLINK-1959) Accumulators BROKEN after Partitioning

2015-05-11 Thread Stephan Ewen (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1959?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14538540#comment-14538540
 ] 

Stephan Ewen commented on FLINK-1959:
-

I will try and look into this very soon...



[jira] [Commented] (FLINK-1959) Accumulators BROKEN after Partitioning

2015-05-10 Thread mustafa elbehery (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1959?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14537312#comment-14537312
 ] 

mustafa elbehery commented on FLINK-1959:
-

Any updates on this bug? This is kind of critical to me.



[jira] [Commented] (FLINK-1959) Accumulators BROKEN after Partitioning

2015-04-30 Thread mustafa elbehery (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1959?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14521195#comment-14521195
 ] 

mustafa elbehery commented on FLINK-1959:
-

I just tried with DOP 1, 2 and 5. All return NULL:

Number of detected empty fields per column: null



[jira] [Commented] (FLINK-1959) Accumulators BROKEN after Partitioning

2015-04-29 Thread Stephan Ewen (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1959?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14519542#comment-14519542
 ] 

Stephan Ewen commented on FLINK-1959:
-

Can you check whether this happens only with a parallelism greater than one, 
or also with parallelism one?



[jira] [Commented] (FLINK-1959) Accumulators BROKEN after Partitioning

2015-04-29 Thread mustafa elbehery (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1959?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14519494#comment-14519494
 ] 

mustafa elbehery commented on FLINK-1959:
-

BTW, the version of Flink is 0.9 but it was not in the drop-down list.
