[
https://issues.apache.org/jira/browse/BEAM-2812?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Ismaël Mejía reassigned BEAM-2812:
----------------------------------
Assignee: (was: Amit Sela)
> Dropped windows counters / log prints no longer working
> -------------------------------------------------------
>
> Key: BEAM-2812
> URL: https://issues.apache.org/jira/browse/BEAM-2812
> Project: Beam
> Issue Type: Bug
> Components: runner-spark
> Reporter: Aviem Zur
> Priority: Major
>
> Aggregators were removed from the Spark runner in
> https://github.com/apache/beam/pull/2838, which caused a regression in the
> dropped-window counters and log messages.
> {{CounterCell}} instances are created ad hoc instead of using the {{Metrics}}
> class static factory methods:
> [SparkGroupAlsoByWindowViaWindowSet.java#L213-L219|https://github.com/apache/beam/blob/v2.1.0/runners/spark/src/main/java/org/apache/beam/runners/spark/stateful/SparkGroupAlsoByWindowViaWindowSet.java#L213-L219]
> The context in which the metrics are reported isn't taken into account.
> Because these counters are passed to a lazily evaluated iterator
> [SparkGroupAlsoByWindowViaWindowSet.java#L221-L223|https://github.com/apache/beam/blob/v2.1.0/runners/spark/src/main/java/org/apache/beam/runners/spark/stateful/SparkGroupAlsoByWindowViaWindowSet.java#L221-L223]
> the subsequent code that inspects them always runs immediately after
> initialization, before they are populated. The log prints therefore never
> happen, because the conditional statements check the counters before they
> have been updated
> [SparkGroupAlsoByWindowViaWindowSet.java#L323-L333|https://github.com/apache/beam/blob/v2.1.0/runners/spark/src/main/java/org/apache/beam/runners/spark/stateful/SparkGroupAlsoByWindowViaWindowSet.java#L323-L333].
> These counts should be exposed as metrics as well as in logs.
> Additionally,
> {{org.apache.beam.runners.core.LateDataUtils#dropExpiredWindows}} now takes a
> {{CounterCell}} as a parameter. {{CounterCell}} is a metrics implementation
> class and should generally not be used elsewhere (its Javadoc says so as
> well). We should look into changing this method to use something else, and
> perhaps make {{CounterCell}} and similar classes package-private (moving
> runner code that uses them into the same package).
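
The lazy-evaluation problem described above can be reproduced without Beam or Spark. The following is a minimal sketch, not the runner's actual code: the class name LazyCounterDemo, the method dropNegatives, and the use of AtomicLong in place of CounterCell are all assumptions made for illustration, with negative numbers standing in for expired windows. It shows that a counter handed to a lazily evaluated iterator still reads zero if it is checked before the iterator is consumed.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;
import java.util.concurrent.atomic.AtomicLong;

public class LazyCounterDemo {

    /**
     * Hypothetical stand-in for the dropping step: wraps the source lazily and
     * increments the counter only when a negative element is actually skipped.
     */
    static Iterator<Integer> dropNegatives(Iterator<Integer> source, AtomicLong dropped) {
        return new Iterator<Integer>() {
            private Integer pending; // next element to hand out, if already found

            @Override
            public boolean hasNext() {
                // Work happens here, on demand, not when the iterator is created.
                while (pending == null && source.hasNext()) {
                    Integer candidate = source.next();
                    if (candidate < 0) {
                        dropped.incrementAndGet();
                    } else {
                        pending = candidate;
                    }
                }
                return pending != null;
            }

            @Override
            public Integer next() {
                if (!hasNext()) {
                    throw new NoSuchElementException();
                }
                Integer result = pending;
                pending = null;
                return result;
            }
        };
    }

    /** Returns the counter value read before and after the iterator is consumed. */
    static long[] readBeforeAndAfter(List<Integer> input) {
        AtomicLong dropped = new AtomicLong();
        Iterator<Integer> lazy = dropNegatives(input.iterator(), dropped);

        // Reading here mirrors the bug: nothing has been pulled through yet.
        long before = dropped.get();

        while (lazy.hasNext()) {
            lazy.next();
        }

        // Only after consumption does the counter reflect the dropped elements.
        long after = dropped.get();
        return new long[] {before, after};
    }

    public static void main(String[] args) {
        long[] counts = readBeforeAndAfter(Arrays.asList(1, -2, 3, -4, -5));
        System.out.println("before consumption: " + counts[0]);
        System.out.println("after consumption: " + counts[1]);
    }
}
```

Any logging guarded by a check such as "dropped > 0" would have to run after the iterator is drained, or inside the iterator itself, to ever fire.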