chamikaramj commented on a change in pull request #14158:
URL: https://github.com/apache/beam/pull/14158#discussion_r589708724
##########
File path:
runners/google-cloud-dataflow-java/src/main/java/org/apache/beam/runners/dataflow/DataflowPipelineJob.java
##########
@@ -121,6 +122,8 @@
private @Nullable String latestStateString;
+ private RunnerApi.Pipeline pipelineProto = null;
Review comment:
Done.
##########
File path:
runners/google-cloud-dataflow-java/src/main/java/org/apache/beam/runners/dataflow/DataflowMetrics.java
##########
@@ -249,14 +249,33 @@ private boolean isMetricTentative(MetricUpdate metricUpdate) {
*/
private MetricKey getMetricHashKey(MetricUpdate metricUpdate) {
String fullStepName = metricUpdate.getName().getContext().get("step");
- if (dataflowPipelineJob.transformStepNames == null
- || !dataflowPipelineJob.transformStepNames.inverse().containsKey(fullStepName)) {
- // If we can't translate internal step names to user step names, we just skip them
- // altogether.
- return null;
+
+ if (dataflowPipelineJob.getPipelineProto() != null
+ && dataflowPipelineJob
+ .getPipelineProto()
+ .getComponents()
+ .getTransformsMap()
+ .containsKey(fullStepName)) {
+ // Dataflow Runner v2 with portable job submission uses proto transform map
+ // IDs for step names. Hence we lookup user step names based on the proto.
+ fullStepName =
+ dataflowPipelineJob
+ .getPipelineProto()
+ .getComponents()
+ .getTransformsMap()
+ .get(fullStepName)
+ .getUniqueName();
+ } else {
+ if (dataflowPipelineJob.transformStepNames == null
+ || !dataflowPipelineJob.transformStepNames.inverse().containsKey(fullStepName)) {
+ // If we can't translate internal step names to user step names, we just skip them
+ // altogether.
+ return null;
Review comment:
Done.
##########
File path:
runners/google-cloud-dataflow-java/src/main/java/org/apache/beam/runners/dataflow/DataflowMetrics.java
##########
@@ -249,14 +249,33 @@ private boolean isMetricTentative(MetricUpdate metricUpdate) {
*/
private MetricKey getMetricHashKey(MetricUpdate metricUpdate) {
String fullStepName = metricUpdate.getName().getContext().get("step");
- if (dataflowPipelineJob.transformStepNames == null
- || !dataflowPipelineJob.transformStepNames.inverse().containsKey(fullStepName)) {
- // If we can't translate internal step names to user step names, we just skip them
- // altogether.
- return null;
+
+ if (dataflowPipelineJob.getPipelineProto() != null
Review comment:
Not sure I fully understood, but it seems like adding a getUniqueName()
method would not save much.
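For reference, a minimal sketch of what such a helper might look like if the two lookups were pulled out of getMetricHashKey. This is only an illustration, not code from the PR: `StepNameResolver` and `resolve` are hypothetical names, plain `Map<String, String>` values stand in for the proto's transforms map and for `transformStepNames.inverse()`, and the legacy branch assumes the internal step name maps directly to a user step name.

```java
import java.util.Map;
import java.util.Optional;

/** Hypothetical helper; not part of the PR or of the Beam API. */
final class StepNameResolver {
  // Stand-in for getPipelineProto().getComponents().getTransformsMap(), reduced to
  // transform ID -> unique name. May be null when no pipeline proto is available.
  private final Map<String, String> protoTransformNames;

  // Stand-in for transformStepNames.inverse(): internal step name -> user step name.
  // May be null, mirroring the null check in the diff.
  private final Map<String, String> legacyStepNames;

  StepNameResolver(
      Map<String, String> protoTransformNames, Map<String, String> legacyStepNames) {
    this.protoTransformNames = protoTransformNames;
    this.legacyStepNames = legacyStepNames;
  }

  /** Returns the user step name, or empty if the step cannot be translated. */
  Optional<String> resolve(String fullStepName) {
    // Proto-based lookup is tried first (the Runner v2 portable submission case).
    if (protoTransformNames != null && protoTransformNames.containsKey(fullStepName)) {
      return Optional.of(protoTransformNames.get(fullStepName));
    }
    // Otherwise fall back to the legacy step-name mapping.
    if (legacyStepNames != null && legacyStepNames.containsKey(fullStepName)) {
      return Optional.of(legacyStepNames.get(fullStepName));
    }
    // Neither mapping knows the step; the caller would skip the metric.
    return Optional.empty();
  }
}
```

With a helper of this shape, the caller would return null on an empty result and use the resolved name otherwise, so the saving over the inline version is indeed modest.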
##########
File path:
runners/google-cloud-dataflow-java/src/main/java/org/apache/beam/runners/dataflow/DataflowPipelineJob.java
##########
@@ -140,6 +143,24 @@ public DataflowPipelineJob(
this.dataflowMetrics = new DataflowMetrics(this, this.dataflowClient);
}
+ /**
+ * Constructs the job.
+ *
+ * @param jobId the job id
+ * @param dataflowOptions used to configure the client for the Dataflow Service
+ * @param transformStepNames a mapping from AppliedPTransforms to Step Names
+ * @param pipelineProto Runner API pipeline proto.
+ */
+ public DataflowPipelineJob(
+ DataflowClient dataflowClient,
+ String jobId,
+ DataflowPipelineOptions dataflowOptions,
+ Map<AppliedPTransform<?, ?, ?>, String> transformStepNames,
+ RunnerApi.Pipeline pipelineProto) {
Review comment:
Done.