[GitHub] [flink] StephanEwen commented on a change in pull request #8756: [FLINK-12406] [Runtime] Report BLOCKING_PERSISTENT result partition meta back to client

2019-07-02 Thread GitBox
StephanEwen commented on a change in pull request #8756: [FLINK-12406] 
[Runtime] Report BLOCKING_PERSISTENT result partition meta back to client
URL: https://github.com/apache/flink/pull/8756#discussion_r299700273
 
 

 ##
 File path: 
flink-runtime/src/main/java/org/apache/flink/runtime/executiongraph/AccessExecutionGraph.java
 ##
 @@ -158,6 +159,14 @@
 	 */
 	Map<String, SerializedValue<OptionalFailure<Object>>> getAccumulatorsSerialized();
 
+	/**
+	 * Returns a {@link BlockingPersistentResultPartitionMeta}.
+	 * @return BlockingPersistentResultPartitionMeta contains ResultPartition locations
+	 */
+	default BlockingPersistentResultPartitionMeta getBlockingPersistentResultPartitionMeta() {
 
 Review comment:
   I think this is a tricky use of a default method, because this is not a working implementation for all subclasses, which is the only way default methods on interfaces should be used.
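
   For illustration, a minimal sketch with hypothetical names (not this PR's interfaces) of the difference between a default method that works for every subclass and one that does not:

	// Hypothetical example: the default method below is a complete implementation
	// for every possible subclass, because it is derived only from the abstract method.
	interface AccessResult {

		java.util.Set<String> getFailedIds();

		default boolean hasFailures() {
			return !getFailedIds().isEmpty();
		}

		// A method like getBlockingPersistentResultPartitionMeta(), which has no meaningful
		// shared implementation, is better declared abstract so each subclass must provide it.
	}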




[GitHub] [flink] StephanEwen commented on a change in pull request #8756: [FLINK-12406] [Runtime] Report BLOCKING_PERSISTENT result partition meta back to client

2019-07-02 Thread GitBox
StephanEwen commented on a change in pull request #8756: [FLINK-12406] 
[Runtime] Report BLOCKING_PERSISTENT result partition meta back to client
URL: https://github.com/apache/flink/pull/8756#discussion_r299699717
 
 

 ##
 File path: 
flink-java/src/main/java/org/apache/flink/api/java/ExecutionEnvironment.java
 ##
 @@ -118,6 +119,9 @@
 
 	private final ExecutionConfig config = new ExecutionConfig();
 
+	private final BlockingPersistentResultPartitionMeta blockingPersistentResultPartitionMeta =
 
 Review comment:
   This should probably not be a separate field, but rather be part of the `lastJobExecutionResult`.
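
   A rough sketch of that direction, with stand-in types so it compiles on its own (the field and getter names are assumptions, not part of this PR):

	// Sketch: expose the partition metadata through the stored job result
	// instead of keeping a second field next to lastJobExecutionResult.
	class EnvironmentSketch {

		interface ResultWithPartitionMeta {
			Object getBlockingPersistentResultPartitionMeta();
		}

		private ResultWithPartitionMeta lastJobExecutionResult;

		Object getBlockingPersistentResultPartitionMeta() {
			if (lastJobExecutionResult == null) {
				throw new IllegalStateException("No job has been executed yet.");
			}
			// delegate to the job result rather than tracking the metadata separately
			return lastJobExecutionResult.getBlockingPersistentResultPartitionMeta();
		}
	}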




[GitHub] [flink] StephanEwen commented on a change in pull request #8756: [FLINK-12406] [Runtime] Report BLOCKING_PERSISTENT result partition meta back to client

2019-07-02 Thread GitBox
StephanEwen commented on a change in pull request #8756: [FLINK-12406] 
[Runtime] Report BLOCKING_PERSISTENT result partition meta back to client
URL: https://github.com/apache/flink/pull/8756#discussion_r299701049
 
 

 ##
 File path: 
flink-runtime/src/main/java/org/apache/flink/runtime/executiongraph/ExecutionGraph.java
 ##
 @@ -814,6 +820,69 @@ public Executor getFutureExecutor() {
 			entry -> serializeAccumulator(entry.getKey(), entry.getValue())));
 	}
 
+	@Override
+	public BlockingPersistentResultPartitionMeta getBlockingPersistentResultPartitionMeta() {
+		Map> resultPartitionDescriptors = new HashMap<>();
+
+		// keep record of all failed IntermediateDataSetID
+		Set<AbstractID> failedIntermediateDataSetIds = new HashSet<>();
+
+		for (ExecutionVertex executionVertex : getAllExecutionVertices()) {
+			for (IntermediateResultPartition intermediateResultPartition : executionVertex.getProducedPartitions().values()) {
+				if (intermediateResultPartition.getResultType() == ResultPartitionType.BLOCKING_PERSISTENT) {
+					try {
+						addLocation(resultPartitionDescriptors, intermediateResultPartition);
+					} catch (Throwable throwable) {
+						LOG.error("Failed to get location of ResultPartition: " + intermediateResultPartition.getPartitionId(), throwable);
+						failedIntermediateDataSetIds.add(
+							new AbstractID(intermediateResultPartition.getIntermediateResult().getId()));
+					}
+				}
+			}
+		}
+
+		return new BlockingPersistentResultPartitionMeta(resultPartitionDescriptors, failedIntermediateDataSetIds);
+	}
+
+	/**
+	 *
+	 * @param resultPartitionDescriptors
+	 * @param intermediateResultPartition
+	 * throw exception if any error occurs.
+	 */
+	public void addLocation(
+			Map> resultPartitionDescriptors,
+			IntermediateResultPartition intermediateResultPartition) {
+
+		IntermediateDataSetID dataSetID = intermediateResultPartition.getIntermediateResult().getId();
+
+		Map map = resultPartitionDescriptors.computeIfAbsent(
+			new AbstractID(dataSetID), key -> new HashMap<>()
+		);
+
+		TaskManagerLocation taskManagerLocation = null;
+
+		// The taskManagerLocation should be ready already since the previous job is done.
+		try {
+			taskManagerLocation = intermediateResultPartition
+				.getProducer().getCurrentExecutionAttempt().getTaskManagerLocationFuture().get(1, TimeUnit.SECONDS);
 
 Review comment:
   This is a blocking wait, which cannot be used in a non-blocking data structure like the ExecutionGraph.
   The call to the future needs to complete or fail instantly.
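
   One non-blocking alternative, sketched with a plain CompletableFuture (the helper name and error handling are illustrative, not from this PR): CompletableFuture#getNow either returns the already-completed value or fails immediately, so the execution graph thread never waits.

	import java.util.concurrent.CompletableFuture;

	class NonBlockingAccessSketch {

		// Read a value that is expected to be available already, without blocking.
		static <T> T getAvailableOrFail(CompletableFuture<T> future, String description) {
			// getNow returns the result if the future is complete, otherwise the fallback (null),
			// and it never blocks; an exceptionally completed future fails right here.
			T value = future.getNow(null);
			if (value == null) {
				throw new IllegalStateException(description + " is not available yet");
			}
			return value;
		}
	}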




[GitHub] [flink] StephanEwen commented on a change in pull request #8756: [FLINK-12406] [Runtime] Report BLOCKING_PERSISTENT result partition meta back to client

2019-07-02 Thread GitBox
StephanEwen commented on a change in pull request #8756: [FLINK-12406] 
[Runtime] Report BLOCKING_PERSISTENT result partition meta back to client
URL: https://github.com/apache/flink/pull/8756#discussion_r299701522
 
 

 ##
 File path: 
flink-tests/src/test/java/org/apache/flink/test/operators/ExecutionEnvironmentITCase.java
 ##
 @@ -66,6 +74,49 @@ public void mapPartition(Iterable values, Collector out) throw
 		assertEquals(PARALLELISM, resultCollection.size());
 	}
 
+	@Test
+	public void testAccessingBlockingPersistentResultPartition() throws Exception {
 
 Review comment:
   I am not sure whether this test is well placed in the ExecutionEnvironmentITCase.
   It does not test the ExecutionEnvironment; it tests a different feature.




[GitHub] [flink] StephanEwen commented on a change in pull request #8756: [FLINK-12406] [Runtime] Report BLOCKING_PERSISTENT result partition meta back to client

2019-07-02 Thread GitBox
StephanEwen commented on a change in pull request #8756: [FLINK-12406] 
[Runtime] Report BLOCKING_PERSISTENT result partition meta back to client
URL: https://github.com/apache/flink/pull/8756#discussion_r299700721
 
 

 ##
 File path: 
flink-runtime/src/main/java/org/apache/flink/runtime/executiongraph/ExecutionGraph.java
 ##
 @@ -814,6 +820,69 @@ public Executor getFutureExecutor() {
 			entry -> serializeAccumulator(entry.getKey(), entry.getValue())));
 	}
 
+	@Override
+	public BlockingPersistentResultPartitionMeta getBlockingPersistentResultPartitionMeta() {
+		Map> resultPartitionDescriptors = new HashMap<>();
+
+		// keep record of all failed IntermediateDataSetID
+		Set<AbstractID> failedIntermediateDataSetIds = new HashSet<>();
+
+		for (ExecutionVertex executionVertex : getAllExecutionVertices()) {
+			for (IntermediateResultPartition intermediateResultPartition : executionVertex.getProducedPartitions().values()) {
+				if (intermediateResultPartition.getResultType() == ResultPartitionType.BLOCKING_PERSISTENT) {
+					try {
+						addLocation(resultPartitionDescriptors, intermediateResultPartition);
+					} catch (Throwable throwable) {
+						LOG.error("Failed to get location of ResultPartition: " + intermediateResultPartition.getPartitionId(), throwable);
+						failedIntermediateDataSetIds.add(
+							new AbstractID(intermediateResultPartition.getIntermediateResult().getId()));
+					}
+				}
+			}
+		}
+
+		return new BlockingPersistentResultPartitionMeta(resultPartitionDescriptors, failedIntermediateDataSetIds);
+	}
+
+	/**
+	 *
+	 * @param resultPartitionDescriptors
+	 * @param intermediateResultPartition
+	 * throw exception if any error occurs.
+	 */
+	public void addLocation(
 
 Review comment:
   This utility method should not be in the ExecutionGraph - it only bloats the public API surface.
   It should be `private static` or moved to a utility class.
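
   A minimal sketch of that refactoring (types reduced to placeholders, names illustrative): moving the helper into a private static method or a small utility class keeps it off the ExecutionGraph's public API.

	// Hypothetical utility holder; the real parameter types would be the ones from the PR.
	final class ResultPartitionMetaUtils {

		private ResultPartitionMetaUtils() {
			// utility class, no instances
		}

		// Same logic as the former public ExecutionGraph method, but a static helper,
		// so it does not widen the ExecutionGraph signature and can be tested in isolation.
		static void addLocation(
				java.util.Map<Object, java.util.Map<Object, Object>> resultPartitionDescriptors,
				Object intermediateResultPartition) {
			// ... unchanged body, operating only on its arguments
		}
	}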




[GitHub] [flink] StephanEwen commented on a change in pull request #8756: [FLINK-12406] [Runtime] Report BLOCKING_PERSISTENT result partition meta back to client

2019-07-02 Thread GitBox
StephanEwen commented on a change in pull request #8756: [FLINK-12406] 
[Runtime] Report BLOCKING_PERSISTENT result partition meta back to client
URL: https://github.com/apache/flink/pull/8756#discussion_r299701304
 
 

 ##
 File path: 
flink-tests/src/test/java/org/apache/flink/test/operators/ExecutionEnvironmentITCase.java
 ##
 @@ -66,6 +74,49 @@ public void mapPartition(Iterable values, Collector out) throw
 		assertEquals(PARALLELISM, resultCollection.size());
 	}
 
+	@Test
+	public void testAccessingBlockingPersistentResultPartition() throws Exception {
+		final ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();
+
+		env.setParallelism(1);
+
+		DataSet<Tuple2<Long, Long>> input = env.fromElements(new Tuple2<>(1L, 2L));
+
+		DataSet ds = input.map((MapFunction<Tuple2<Long, Long>, Object>) value -> new Tuple2<>(value.f0 + 1, value.f1));
+
+		// specify IntermediateDataSetID
+		AbstractID intermediateDataSetId = new AbstractID();
+
+		// this output branch will be excluded.
+		ds.output(BlockingShuffleOutputFormat.createOutputFormat(intermediateDataSetId))
+			.setParallelism(1);
+
+		ds.collect();
+
+		BlockingPersistentResultPartitionMeta meta = env.getBlockingPersistentResultPartitionMeta();
+
+		// only one cached IntermediateDataSet
+		Assert.assertEquals(1, meta.getResultPartitionDescriptors().size());
+
+		AbstractID intermediateDataSetID = meta.getResultPartitionDescriptors().keySet().iterator().next();
+
+		// IntermediateDataSetID should be the same
+		Assert.assertEquals(intermediateDataSetId, intermediateDataSetID);
+
+		Map descriptors = meta.getResultPartitionDescriptors().get(intermediateDataSetID);
+
+		Assert.assertEquals(1, descriptors.size());
+
+		ResultPartitionDescriptor descriptor = descriptors.values().iterator().next();
+
+		Assert.assertTrue(
 
 Review comment:
   Separate conditions should have separate assertions, so that it is visible 
from a test failure which of the conditions failed.
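
   A small illustration in the test's JUnit 4 style (the descriptor fields here are hypothetical) of why one assertion per condition gives clearer failures:

	import org.junit.Assert;

	class AssertionStyleSketch {

		// Stand-in for ResultPartitionDescriptor with illustrative fields.
		static class Descriptor {
			String host;
			Integer partitionId;
		}

		static void checkCombined(Descriptor d) {
			// A single assertTrue over several conditions only reports "expected true".
			Assert.assertTrue(d.host != null && d.partitionId != null);
		}

		static void checkSeparately(Descriptor d) {
			// Separate assertions with messages pinpoint exactly which condition failed.
			Assert.assertNotNull("host is missing", d.host);
			Assert.assertNotNull("partition id is missing", d.partitionId);
		}
	}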

