[GitHub] [flink] StephanEwen commented on a change in pull request #8756: [FLINK-12406] [Runtime] Report BLOCKING_PERSISTENT result partition meta back to client
StephanEwen commented on a change in pull request #8756: [FLINK-12406] [Runtime] Report BLOCKING_PERSISTENT result partition meta back to client
URL: https://github.com/apache/flink/pull/8756#discussion_r299700273

File path: flink-runtime/src/main/java/org/apache/flink/runtime/executiongraph/AccessExecutionGraph.java

@@ -158,6 +159,14 @@
 	 */
 	Map<String, SerializedValue<OptionalFailure<Object>>> getAccumulatorsSerialized();

+	/**
+	 * Returns a {@link BlockingPersistentResultPartitionMeta}.
+	 *
+	 * @return BlockingPersistentResultPartitionMeta contains ResultPartition locations
+	 */
+	default BlockingPersistentResultPartitionMeta getBlockingPersistentResultPartitionMeta() {

Review comment: I think this is a tricky use of a default method, because this is not a working implementation for all subclasses, and default methods on interfaces should only be used when the implementation is valid for every implementor.

This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
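The pitfall the reviewer describes can be sketched in a few lines (all names here are hypothetical, not Flink's actual types): a default method that merely throws is not a real implementation, so any subclass that forgets to override it still compiles, and the gap only surfaces as a runtime failure.

```java
public class DefaultMethodPitfall {

    interface Graph {
        // Problematic pattern: this "implementation" is not valid for all
        // subclasses. An implementor that forgets to override it compiles
        // fine and fails only when the method is actually called.
        default String partitionMeta() {
            throw new UnsupportedOperationException("not supported by " + getClass().getName());
        }
    }

    // Compiles without overriding partitionMeta(), so the missing
    // implementation is invisible until runtime.
    static class ArchivedGraph implements Graph { }

    public static void main(String[] args) {
        Graph g = new ArchivedGraph();
        try {
            g.partitionMeta();
        } catch (UnsupportedOperationException e) {
            System.out.println("runtime failure: " + e.getMessage());
        }
    }
}
```

Making the method abstract on the interface instead would turn this runtime failure into a compile error for every implementor.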
[GitHub] [flink] StephanEwen commented on a change in pull request #8756: [FLINK-12406] [Runtime] Report BLOCKING_PERSISTENT result partition meta back to client
StephanEwen commented on a change in pull request #8756: [FLINK-12406] [Runtime] Report BLOCKING_PERSISTENT result partition meta back to client
URL: https://github.com/apache/flink/pull/8756#discussion_r299699717

File path: flink-java/src/main/java/org/apache/flink/api/java/ExecutionEnvironment.java

@@ -118,6 +119,9 @@
 	private final ExecutionConfig config = new ExecutionConfig();

+	private final BlockingPersistentResultPartitionMeta blockingPersistentResultPartitionMeta =

Review comment: This should probably not be in a separate field, but in the `lastJobExecutionResult`.
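The shape the reviewer is asking for can be sketched as follows (all class and field names are hypothetical stand-ins, not Flink's actual API): the partition meta travels with the result of the last execution instead of living in a parallel field that must be kept in sync with it.

```java
public class ResultCarriesMeta {

    // Hypothetical stand-in for BlockingPersistentResultPartitionMeta.
    static class PartitionMeta {
        final int partitionCount;
        PartitionMeta(int partitionCount) { this.partitionCount = partitionCount; }
    }

    // Hypothetical stand-in for the job execution result: it carries the
    // meta directly, so "result of the last job" and "partitions of the
    // last job" can never disagree.
    static class JobResult {
        final long netRuntimeMillis;
        final PartitionMeta partitionMeta;
        JobResult(long netRuntimeMillis, PartitionMeta partitionMeta) {
            this.netRuntimeMillis = netRuntimeMillis;
            this.partitionMeta = partitionMeta;
        }
    }

    public static void main(String[] args) {
        JobResult last = new JobResult(1234L, new PartitionMeta(1));
        // One object answers both "how did the job run" and
        // "where are its persistent partitions".
        System.out.println(last.netRuntimeMillis + " ms, "
            + last.partitionMeta.partitionCount + " partition(s)");
    }
}
```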
[GitHub] [flink] StephanEwen commented on a change in pull request #8756: [FLINK-12406] [Runtime] Report BLOCKING_PERSISTENT result partition meta back to client
StephanEwen commented on a change in pull request #8756: [FLINK-12406] [Runtime] Report BLOCKING_PERSISTENT result partition meta back to client
URL: https://github.com/apache/flink/pull/8756#discussion_r299701049

File path: flink-runtime/src/main/java/org/apache/flink/runtime/executiongraph/ExecutionGraph.java

@@ -814,6 +820,69 @@ public Executor getFutureExecutor() {
 			entry -> serializeAccumulator(entry.getKey(), entry.getValue())));
 	}

+	@Override
+	public BlockingPersistentResultPartitionMeta getBlockingPersistentResultPartitionMeta() {
+		Map<AbstractID, Map<ResultPartitionID, ResultPartitionDescriptor>> resultPartitionDescriptors = new HashMap<>();
+
+		// keep record of all failed IntermediateDataSetID
+		Set<AbstractID> failedIntermediateDataSetIds = new HashSet<>();
+
+		for (ExecutionVertex executionVertex : getAllExecutionVertices()) {
+			for (IntermediateResultPartition intermediateResultPartition : executionVertex.getProducedPartitions().values()) {
+				if (intermediateResultPartition.getResultType() == ResultPartitionType.BLOCKING_PERSISTENT) {
+					try {
+						addLocation(resultPartitionDescriptors, intermediateResultPartition);
+					} catch (Throwable throwable) {
+						LOG.error("Failed to get location of ResultPartition: " + intermediateResultPartition.getPartitionId(), throwable);
+						failedIntermediateDataSetIds.add(
+							new AbstractID(intermediateResultPartition.getIntermediateResult().getId()));
+					}
+				}
+			}
+		}
+
+		return new BlockingPersistentResultPartitionMeta(resultPartitionDescriptors, failedIntermediateDataSetIds);
+	}
+
+	/**
+	 * @param resultPartitionDescriptors
+	 * @param intermediateResultPartition
+	 * throws an exception if any error occurs.
+	 */
+	public void addLocation(
+			Map<AbstractID, Map<ResultPartitionID, ResultPartitionDescriptor>> resultPartitionDescriptors,
+			IntermediateResultPartition intermediateResultPartition) {
+
+		IntermediateDataSetID dataSetID = intermediateResultPartition.getIntermediateResult().getId();
+
+		Map<ResultPartitionID, ResultPartitionDescriptor> map = resultPartitionDescriptors.computeIfAbsent(
+			new AbstractID(dataSetID), key -> new HashMap<>()
+		);
+
+		TaskManagerLocation taskManagerLocation = null;
+
+		// The taskManagerLocation should be ready already since the previous job is done.
+		try {
+			taskManagerLocation = intermediateResultPartition
+				.getProducer().getCurrentExecutionAttempt().getTaskManagerLocationFuture().get(1, TimeUnit.SECONDS);

Review comment: This is a blocking waiting call, which cannot be used in a non-blocking data structure like the ExecutionGraph. The call to the future needs to complete or fail instantly.
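The non-blocking alternative the reviewer asks for can be sketched with `CompletableFuture.getNow`, which returns immediately instead of parking the caller for up to a second. The `lookupLocation` helper and the `String` stand-in for `TaskManagerLocation` below are illustrative assumptions, not the PR's actual code.

```java
import java.util.concurrent.CompletableFuture;

public class NonBlockingLookup {

    // Hypothetical stand-in for reading Execution#getTaskManagerLocationFuture():
    // getNow() returns the value if the future is already complete, or the
    // supplied fallback otherwise. Unlike get(1, TimeUnit.SECONDS), it never
    // blocks the calling thread.
    static String lookupLocation(CompletableFuture<String> locationFuture) {
        String location = locationFuture.getNow(null);
        if (location == null) {
            // Fail instantly instead of waiting: in a non-blocking structure
            // an incomplete future at this point is an error, not a delay.
            throw new IllegalStateException("TaskManagerLocation not yet known");
        }
        return location;
    }

    public static void main(String[] args) {
        CompletableFuture<String> done = CompletableFuture.completedFuture("tm-host:6121");
        System.out.println(lookupLocation(done)); // prints tm-host:6121

        CompletableFuture<String> pending = new CompletableFuture<>();
        try {
            lookupLocation(pending);
        } catch (IllegalStateException e) {
            System.out.println("pending: " + e.getMessage());
        }
    }
}
```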
[GitHub] [flink] StephanEwen commented on a change in pull request #8756: [FLINK-12406] [Runtime] Report BLOCKING_PERSISTENT result partition meta back to client
StephanEwen commented on a change in pull request #8756: [FLINK-12406] [Runtime] Report BLOCKING_PERSISTENT result partition meta back to client
URL: https://github.com/apache/flink/pull/8756#discussion_r299701522

File path: flink-tests/src/test/java/org/apache/flink/test/operators/ExecutionEnvironmentITCase.java

@@ -66,6 +74,49 @@ public void mapPartition(Iterable values, Collector out) throw
 		assertEquals(PARALLELISM, resultCollection.size());
 	}

+	@Test
+	public void testAccessingBlockingPersistentResultPartition() throws Exception {

Review comment: I am not sure this test is well placed in the ExecutionEnvironmentITCase. It does not test the ExecutionEnvironment; it tests a different feature.
[GitHub] [flink] StephanEwen commented on a change in pull request #8756: [FLINK-12406] [Runtime] Report BLOCKING_PERSISTENT result partition meta back to client
StephanEwen commented on a change in pull request #8756: [FLINK-12406] [Runtime] Report BLOCKING_PERSISTENT result partition meta back to client
URL: https://github.com/apache/flink/pull/8756#discussion_r299700721

File path: flink-runtime/src/main/java/org/apache/flink/runtime/executiongraph/ExecutionGraph.java

@@ -814,6 +820,69 @@ public Executor getFutureExecutor() {
 			entry -> serializeAccumulator(entry.getKey(), entry.getValue())));
 	}

+	@Override
+	public BlockingPersistentResultPartitionMeta getBlockingPersistentResultPartitionMeta() {
+		Map<AbstractID, Map<ResultPartitionID, ResultPartitionDescriptor>> resultPartitionDescriptors = new HashMap<>();
+
+		// keep record of all failed IntermediateDataSetID
+		Set<AbstractID> failedIntermediateDataSetIds = new HashSet<>();
+
+		for (ExecutionVertex executionVertex : getAllExecutionVertices()) {
+			for (IntermediateResultPartition intermediateResultPartition : executionVertex.getProducedPartitions().values()) {
+				if (intermediateResultPartition.getResultType() == ResultPartitionType.BLOCKING_PERSISTENT) {
+					try {
+						addLocation(resultPartitionDescriptors, intermediateResultPartition);
+					} catch (Throwable throwable) {
+						LOG.error("Failed to get location of ResultPartition: " + intermediateResultPartition.getPartitionId(), throwable);
+						failedIntermediateDataSetIds.add(
+							new AbstractID(intermediateResultPartition.getIntermediateResult().getId()));
+					}
+				}
+			}
+		}
+
+		return new BlockingPersistentResultPartitionMeta(resultPartitionDescriptors, failedIntermediateDataSetIds);
+	}
+
+	/**
+	 * @param resultPartitionDescriptors
+	 * @param intermediateResultPartition
+	 * throws an exception if any error occurs.
+	 */
+	public void addLocation(

Review comment: This utility method should not be in the ExecutionGraph - it just blows up the public signature. It should be private static or moved to a utility class.
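The fix here is mechanical: demote the helper so it is no longer part of the class's public API. A minimal sketch under hypothetical names (none of these are Flink's actual types):

```java
import java.util.HashMap;
import java.util.Map;

public class HelperVisibility {

    private final Map<String, String> locations = new HashMap<>();

    public void register(String partitionId, String location) {
        locations.put(partitionId, location);
    }

    // The public API stays small: only the operation callers actually need.
    public Map<String, String> snapshotLocations() {
        Map<String, String> copy = new HashMap<>();
        locations.forEach((id, loc) -> addLocation(copy, id, loc));
        return copy;
    }

    // private static: takes everything it needs as parameters, is invisible
    // outside the class, and does not widen the public signature. It could
    // equally move to a package-private utility class.
    private static void addLocation(Map<String, String> target, String id, String loc) {
        target.put(id, loc);
    }

    public static void main(String[] args) {
        HelperVisibility h = new HelperVisibility();
        h.register("p-1", "tm-host:6121");
        System.out.println(h.snapshotLocations());
    }
}
```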
[GitHub] [flink] StephanEwen commented on a change in pull request #8756: [FLINK-12406] [Runtime] Report BLOCKING_PERSISTENT result partition meta back to client
StephanEwen commented on a change in pull request #8756: [FLINK-12406] [Runtime] Report BLOCKING_PERSISTENT result partition meta back to client
URL: https://github.com/apache/flink/pull/8756#discussion_r299701304

File path: flink-tests/src/test/java/org/apache/flink/test/operators/ExecutionEnvironmentITCase.java

@@ -66,6 +74,49 @@ public void mapPartition(Iterable values, Collector out) throw
 		assertEquals(PARALLELISM, resultCollection.size());
 	}

+	@Test
+	public void testAccessingBlockingPersistentResultPartition() throws Exception {
+		final ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();
+
+		env.setParallelism(1);
+
+		DataSet<Tuple2<Long, Long>> input = env.fromElements(new Tuple2<>(1L, 2L));
+
+		DataSet<Object> ds = input.map((MapFunction<Tuple2<Long, Long>, Object>) value -> new Tuple2<>(value.f0 + 1, value.f1));
+
+		// specify IntermediateDataSetID
+		AbstractID intermediateDataSetId = new AbstractID();
+
+		// this output branch will be excluded.
+		ds.output(BlockingShuffleOutputFormat.createOutputFormat(intermediateDataSetId))
+			.setParallelism(1);
+
+		ds.collect();
+
+		BlockingPersistentResultPartitionMeta meta = env.getBlockingPersistentResultPartitionMeta();
+
+		// only one cached IntermediateDataSet
+		Assert.assertEquals(1, meta.getResultPartitionDescriptors().size());
+
+		AbstractID intermediateDataSetID = meta.getResultPartitionDescriptors().keySet().iterator().next();
+
+		// IntermediateDataSetID should be the same
+		Assert.assertEquals(intermediateDataSetID, intermediateDataSetID);
+
+		Map<ResultPartitionID, ResultPartitionDescriptor> descriptors = meta.getResultPartitionDescriptors().get(intermediateDataSetID);
+
+		Assert.assertEquals(1, descriptors.size());
+
+		ResultPartitionDescriptor descriptor = descriptors.values().iterator().next();
+
+		Assert.assertTrue(

Review comment: Separate conditions should have separate assertions, so that it is visible from a test failure which of the conditions failed.
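The reviewer's point can be sketched without any test framework (the checks and values below are hypothetical): a single assertion over a conjunction only reports that the whole expression was false, while split assertions name the exact condition that broke.

```java
public class SeparateAssertions {

    // Combined check: a failure says only that the conjunction was false,
    // hiding whether the port or the host was the problem.
    static void combinedCheck(int port, String host) {
        if (!(port > 0 && host != null)) {
            throw new AssertionError("expected port > 0 && host != null");
        }
    }

    // Separate checks: each failure message pinpoints the broken condition.
    static void separateChecks(int port, String host) {
        if (!(port > 0)) {
            throw new AssertionError("port was " + port);
        }
        if (host == null) {
            throw new AssertionError("host was null");
        }
    }

    public static void main(String[] args) {
        try {
            combinedCheck(-1, "tm-host");
        } catch (AssertionError e) {
            System.out.println("combined: " + e.getMessage());
        }
        try {
            separateChecks(-1, "tm-host");
        } catch (AssertionError e) {
            System.out.println("separate: " + e.getMessage());
        }
    }
}
```

With JUnit the same idea is two `Assert.assertTrue(...)` (or better, `assertEquals`) calls instead of one `assertTrue(a && b)`.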