Hi there!

I wonder: why do tests that use TestDataflowRunner fail when the Dataflow
pipeline runs into transient difficulties?

Let's consider the JDBC Performance test case: its pipelines sometimes have
trouble connecting to a Postgres instance. When this happens, processing of
the bundle is retried, as described in the Dataflow FAQ [1]. The
PSQLExceptions raised on Dataflow (due to the connection problems) are
collected by TestDataflowRunner's messageHandler. After all the data
processing is done, TestDataflowRunner "rethrows" the gathered exceptions if
there are any ([2], [3]). IMO, this results in a "false-negative": Maven
fails because of the rethrown exceptions, even though the job actually
succeeded on Dataflow (State.DONE).

I think we should "rethrow" those exceptions only if the job's final state
is something other than DONE (AFAIK, DONE means that the job succeeded on
Dataflow). If Dataflow managed to recover from the exceptions, I don't see
any reason for the test to fail. Am I missing something here? WDYT?
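
To make the proposal above concrete, here is a minimal sketch of the check I
have in mind. It is not the actual TestDataflowRunner code: the class
RethrowPolicySketch, the JobState enum and the shouldRethrow method are all
hypothetical names invented for illustration, and the error text is a
placeholder rather than a real log line.

```java
import java.util.List;

public class RethrowPolicySketch {
  // Hypothetical stand-in for the job's terminal state (not the Beam enum).
  enum JobState { DONE, FAILED, CANCELLED, UNKNOWN }

  /**
   * Decides whether the error messages collected during the run should
   * fail the test. Errors are ignored once the job has finished as DONE,
   * because that means the failed bundles were retried successfully.
   */
  static boolean shouldRethrow(JobState finalState, List<String> collectedErrors) {
    return finalState != JobState.DONE && !collectedErrors.isEmpty();
  }

  public static void main(String[] args) {
    // Placeholder message standing in for a collected PSQLException.
    List<String> errors = List.of("PSQLException: <placeholder message>");

    // Job recovered from the transient errors: the test should pass.
    System.out.println(shouldRethrow(JobState.DONE, errors));

    // Job did not succeed: the collected errors should fail the test.
    System.out.println(shouldRethrow(JobState.FAILED, errors));
  }
}
```

With this policy, a run that hit transient connection errors but still ended
in DONE would pass, while any other final state would still surface the
collected exceptions.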

[1]
https://cloud.google.com/dataflow/faq#how-are-java-exceptions-handled-in-dataflow
[2]
https://github.com/apache/beam/blob/a3e262b96be5e6507f3c38413341b4ab607ade41/runners/google-cloud-dataflow-java/src/main/java/org/apache/beam/runners/dataflow/TestDataflowRunner.java#L197
[3]
https://builds.apache.org/view/A-D/view/Beam/job/beam_PerformanceTests_JDBC/291/console
