[jira] [Commented] (FLINK-10006) Improve logging in BarrierBuffer
[ https://issues.apache.org/jira/browse/FLINK-10006?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16578035#comment-16578035 ]

ASF GitHub Bot commented on FLINK-10006:

NicoK closed pull request #6470: [FLINK-10006][network] improve logging in BarrierBuffer: prepend owning task name
URL: https://github.com/apache/flink/pull/6470

This is a PR merged from a forked repository.
As GitHub hides the original diff on merge, it is displayed below for
the sake of provenance:

As this is a foreign pull request (from a fork), the diff is supplied
below (as it won't show otherwise due to GitHub magic):

diff --git a/flink-runtime/src/main/java/org/apache/flink/runtime/io/network/partition/consumer/InputGate.java b/flink-runtime/src/main/java/org/apache/flink/runtime/io/network/partition/consumer/InputGate.java
index 0413caa8aec..c78abb5165a 100644
--- a/flink-runtime/src/main/java/org/apache/flink/runtime/io/network/partition/consumer/InputGate.java
+++ b/flink-runtime/src/main/java/org/apache/flink/runtime/io/network/partition/consumer/InputGate.java
@@ -69,6 +69,8 @@
 	int getNumberOfInputChannels();
 
+	String getOwningTaskName();
+
 	boolean isFinished();
 
 	void requestPartitions() throws IOException, InterruptedException;
diff --git a/flink-runtime/src/main/java/org/apache/flink/runtime/io/network/partition/consumer/SingleInputGate.java b/flink-runtime/src/main/java/org/apache/flink/runtime/io/network/partition/consumer/SingleInputGate.java
index 06e80ff531a..2e7d076f3f8 100644
--- a/flink-runtime/src/main/java/org/apache/flink/runtime/io/network/partition/consumer/SingleInputGate.java
+++ b/flink-runtime/src/main/java/org/apache/flink/runtime/io/network/partition/consumer/SingleInputGate.java
@@ -62,10 +62,10 @@
 /**
  * An input gate consumes one or more partitions of a single produced intermediate result.
  *
- * Each intermediate result is partitioned over its producing parallel subtasks; each of these
+ * Each intermediate result is partitioned over its producing parallel subtasks; each of these
  * partitions is furthermore partitioned into one or more subpartitions.
  *
- * As an example, consider a map-reduce program, where the map operator produces data and the
+ * As an example, consider a map-reduce program, where the map operator produces data and the
  * reduce operator consumes the produced data.
  *
  * {@code
@@ -74,7 +74,7 @@
  * +-+ +-+ ++
  * }
  *
- * When deploying such a program in parallel, the intermediate result will be partitioned over its
+ * When deploying such a program in parallel, the intermediate result will be partitioned over its
  * producing parallel subtasks; each of these partitions is furthermore partitioned into one or more
  * subpartitions.
  *
@@ -95,7 +95,7 @@
  * +-+
  * }
  *
- * In the above example, two map subtasks produce the intermediate result in parallel, resulting
+ * In the above example, two map subtasks produce the intermediate result in parallel, resulting
  * in two partitions (Partition 1 and 2). Each of these partitions is further partitioned into two
  * subpartitions -- one for each parallel reduce subtask.
  */
@@ -274,6 +274,11 @@ public int getNumberOfQueuedBuffers() {
 		return 0;
 	}
 
+	@Override
+	public String getOwningTaskName() {
+		return owningTaskName;
+	}
+
 	//
 	// Setup/Life-cycle
 	//
@@ -364,7 +369,7 @@ else if (partitionLocation.isRemote()) {
 					throw new IllegalStateException("Tried to update unknown channel with unknown channel.");
 				}
 
-				LOG.debug("Updated unknown input channel to {}.", newChannel);
+				LOG.debug("{}: Updated unknown input channel to {}.", owningTaskName, newChannel);
 
 				inputChannels.put(partitionId, newChannel);
 
@@ -393,7 +398,7 @@ public void retriggerPartitionRequest(IntermediateResultPartitionID partitionId)
 
 				checkNotNull(ch, "Unknown input channel with ID " + partitionId);
 
-				LOG.debug("Retriggering partition request {}:{}.", ch.partitionId, consumedSubpartitionIndex);
+				LOG.debug("{}: Retriggering partition request {}:{}.", owningTaskName, ch.partitionId, consumedSubpartitionIndex);
 
 				if (ch.getClass() == RemoteInputChannel.class) {
 					final RemoteInputChannel rch = (RemoteInputChannel) ch;
@@ -432,7 +437,8 @@ public void
[jira] [Commented] (FLINK-10006) Improve logging in BarrierBuffer
[ https://issues.apache.org/jira/browse/FLINK-10006?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16576366#comment-16576366 ]

ASF GitHub Bot commented on FLINK-10006:

dawidwys commented on issue #6470: [FLINK-10006][network] improve logging in BarrierBuffer: prepend owning task name
URL: https://github.com/apache/flink/pull/6470#issuecomment-412103541

+1, lgtm

This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

> Improve logging in BarrierBuffer
>
> Key: FLINK-10006
> URL: https://issues.apache.org/jira/browse/FLINK-10006
> Project: Flink
> Issue Type: Improvement
> Components: Network
> Affects Versions: 1.5.2, 1.6.0, 1.7.0
> Reporter: Nico Kruber
> Assignee: Nico Kruber
> Priority: Major
> Labels: pull-request-available
>
> Almost all log messages of {{BarrierBuffer}} do not contain the task name and
> are therefore of little use if either multiple slots are executed on a single
> TM or multiple checkpoints run in parallel.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
[jira] [Commented] (FLINK-10006) Improve logging in BarrierBuffer
[ https://issues.apache.org/jira/browse/FLINK-10006?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16566768#comment-16566768 ]

ASF GitHub Bot commented on FLINK-10006:

NicoK commented on issue #6470: [FLINK-10006][network] improve logging in BarrierBuffer: prepend owning task name
URL: https://github.com/apache/flink/pull/6470#issuecomment-409922334

agreed, and done
[jira] [Commented] (FLINK-10006) Improve logging in BarrierBuffer
[ https://issues.apache.org/jira/browse/FLINK-10006?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16565516#comment-16565516 ]

ASF GitHub Bot commented on FLINK-10006:

yanghua commented on a change in pull request #6470: [FLINK-10006][network] improve logging in BarrierBuffer: prepend owning task name
URL: https://github.com/apache/flink/pull/6470#discussion_r206931865

## File path: flink-runtime/src/main/java/org/apache/flink/runtime/io/network/partition/consumer/UnionInputGate.java ##

@@ -129,6 +129,12 @@ public int getNumberOfInputChannels() {
 		return totalNumberOfInputChannels;
 	}
 
+	@Override
+	public String getOwningTaskName() {
+		// all input dates have the same owning task

Review comment:
"dates" ? or "gates"?
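
The hunk under review is cut off before the method body, so the following is only a minimal sketch of how a union of gates could answer `getOwningTaskName()`, assuming (as the code comment states) that all wrapped gates belong to the same task. `OwningTaskNameSketch`, `SingleGate` and `UnionGate` are hypothetical stand-ins, not Flink's classes; only the `getOwningTaskName()` signature is taken from the diff.

```java
import java.util.Arrays;
import java.util.List;

final class OwningTaskNameSketch {

    /** Simplified stand-in for Flink's InputGate; only the new accessor is modeled. */
    interface InputGate {
        String getOwningTaskName();
    }

    /** Hypothetical single gate that stores the task name it was constructed with. */
    static final class SingleGate implements InputGate {
        private final String owningTaskName;

        SingleGate(String owningTaskName) {
            this.owningTaskName = owningTaskName;
        }

        @Override
        public String getOwningTaskName() {
            return owningTaskName;
        }
    }

    /** Hypothetical union gate; assumes every wrapped gate belongs to one task. */
    static final class UnionGate implements InputGate {
        private final List<InputGate> gates;

        UnionGate(InputGate... gates) {
            if (gates.length == 0) {
                throw new IllegalArgumentException("at least one input gate is required");
            }
            this.gates = Arrays.asList(gates);
        }

        @Override
        public String getOwningTaskName() {
            // All input gates have the same owning task, so asking the first one suffices.
            return gates.get(0).getOwningTaskName();
        }
    }

    public static void main(String[] args) {
        InputGate union = new UnionGate(
            new SingleGate("Reduce (1/2)"),
            new SingleGate("Reduce (1/2)"));
        System.out.println(union.getOwningTaskName());
    }
}
```

Delegating to the first gate keeps the union free of its own name field; whether the merged change does exactly this cannot be confirmed from the truncated hunk.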
[jira] [Commented] (FLINK-10006) Improve logging in BarrierBuffer
[ https://issues.apache.org/jira/browse/FLINK-10006?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16565517#comment-16565517 ]

ASF GitHub Bot commented on FLINK-10006:

yanghua commented on a change in pull request #6470: [FLINK-10006][network] improve logging in BarrierBuffer: prepend owning task name
URL: https://github.com/apache/flink/pull/6470#discussion_r206931425

## File path: flink-runtime/src/main/java/org/apache/flink/runtime/io/network/partition/consumer/SingleInputGate.java ##

@@ -432,7 +437,8 @@ public void releaseAllResources() throws IOException {
 						inputChannel.releaseAllResources();
 					} catch (IOException e) {
-						LOG.warn("Error during release of channel resources: " + e.getMessage(), e);
+						LOG.warn("{}: Error during release of channel resources: " + e.getMessage(),

Review comment:
Using a placeholder ({}) for `e.getMessage()` looks better to me.
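
To illustrate the review suggestion without pulling in a logging dependency, here is a small sketch that compares the two call shapes. `PlaceholderSketch.format` is a hypothetical helper that fills each `{}` in order; it only approximates SLF4J-style formatting and is not SLF4J's implementation. The task name and exception text are made up for the example.

```java
import java.io.IOException;

final class PlaceholderSketch {

    /** Hypothetical stand-in for "{}"-style formatting: fills placeholders left to right. */
    static String format(String pattern, Object... args) {
        StringBuilder out = new StringBuilder();
        int from = 0;
        for (Object arg : args) {
            int idx = pattern.indexOf("{}", from);
            if (idx < 0) {
                break; // more arguments than placeholders: ignore the rest
            }
            out.append(pattern, from, idx).append(arg);
            from = idx + 2;
        }
        return out.append(pattern.substring(from)).toString();
    }

    public static void main(String[] args) {
        String owningTaskName = "Map (1/4)"; // made-up task name
        IOException e = new IOException("connection reset"); // made-up error

        // Shape from the diff: placeholder for the task name, concatenation for the message.
        String mixed = format(
            "{}: Error during release of channel resources: " + e.getMessage(),
            owningTaskName);

        // Shape suggested in the review: both values go through placeholders.
        String parameterized = format(
            "{}: Error during release of channel resources: {}",
            owningTaskName, e.getMessage());

        System.out.println(mixed);
        System.out.println(parameterized);
    }
}
```

Both calls produce the same text here; the parameterized form keeps the pattern a compile-time constant and avoids mixing two styles in one statement. With SLF4J itself, the exception can still be passed as the final argument so that its stack trace is logged.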
[jira] [Commented] (FLINK-10006) Improve logging in BarrierBuffer
[ https://issues.apache.org/jira/browse/FLINK-10006?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16565414#comment-16565414 ]

ASF GitHub Bot commented on FLINK-10006:

NicoK opened a new pull request #6470: [FLINK-10006][network] improve logging in BarrierBuffer: prepend owning task name
URL: https://github.com/apache/flink/pull/6470

## What is the purpose of the change

Almost all log messages of `BarrierBuffer` do not contain the task name and are therefore of little use if either multiple slots are executed on a single TM or multiple checkpoints run in parallel.

Please also merge to `release-1.6` and `master`.

## Brief change log

- prepend the owning task name to `BarrierBuffer` log messages (available from the `InputGate`)
- add task name to `SingleInputGate` logs

## Verifying this change

This change is a trivial rework / code cleanup without any test coverage.

- Manually tested the log messages with the logging level set to `DEBUG`, running `SocketWindowWordCount` with checkpointing and parallelism 4.

## Does this pull request potentially affect one of the following parts:

- Dependencies (does it add or upgrade a dependency): **no**
- The public API, i.e., is any changed class annotated with `@Public(Evolving)`: **no**
- The serializers: **no**
- The runtime per-record code paths (performance sensitive): **no**
- Anything that affects deployment or recovery: JobManager (and its components), Checkpointing, Yarn/Mesos, ZooKeeper: **no**
- The S3 file system connector: **no**

## Documentation

- Does this pull request introduce a new feature? **no**
- If yes, how is the feature documented? **not applicable**
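
The change log above can be sketched roughly as follows: a consumer of an input gate asks the gate for its owning task name and puts it in front of every log line, so that identical messages from different subtasks on one TaskManager can be told apart. `BarrierLogger`, the log wording and the task names are hypothetical; only `getOwningTaskName()` comes from the pull request.

```java
import java.util.ArrayList;
import java.util.List;

final class TaskNamePrefixSketch {

    /** Simplified stand-in for Flink's InputGate; only the new accessor is modeled. */
    interface InputGate {
        String getOwningTaskName();
    }

    /** Hypothetical, heavily reduced stand-in for BarrierBuffer that records its log lines. */
    static final class BarrierLogger {
        private final InputGate inputGate;
        private final List<String> logLines;

        BarrierLogger(InputGate inputGate, List<String> logLines) {
            this.inputGate = inputGate;
            this.logLines = logLines;
        }

        void onBarrier(long checkpointId, int channelIndex) {
            // Prefix the (made-up) message with the owning task name taken from the gate.
            logLines.add(inputGate.getOwningTaskName()
                + ": Received barrier for checkpoint " + checkpointId
                + " from channel " + channelIndex + ".");
        }
    }

    public static void main(String[] args) {
        List<String> log = new ArrayList<>();
        // Two subtasks sharing one log; without the prefix both lines would be identical.
        new BarrierLogger(() -> "Map (1/4)", log).onBarrier(7L, 0);
        new BarrierLogger(() -> "Map (2/4)", log).onBarrier(7L, 0);
        log.forEach(System.out::println);
    }
}
```

Reading the name from the gate rather than passing it into the buffer separately matches the "available from the `InputGate`" remark in the change log and avoids widening constructor signatures.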