Minbo Bae created BEAM-12504:
--------------------------------
Summary: SpannerIO Read SessionNotFoundException for long-running
pipeline
Key: BEAM-12504
URL: https://issues.apache.org/jira/browse/BEAM-12504
Project: Beam
Issue Type: Bug
Components: io-java-gcp
Reporter: Minbo Bae
[SpannerIO|https://github.com/apache/beam/blob/v2.30.0/sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/spanner/SpannerIO.java#L328]
creates a transaction session with
[CreateTransaction|https://github.com/apache/beam/blob/v2.30.0/sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/spanner/SpannerIO.java#L747]
at the beginning of the pipeline run. If the pipeline takes a long time to reach
SpannerIO.ReadAll, as in the example below, the session can be [cleaned
up|https://cloud.google.com/spanner/docs/sessions#keep_an_idle_session_alive]
by the Spanner service while it sits idle. The read step then fails with the error
`com.google.cloud.spanner.SessionNotFoundException: NOT_FOUND:
com.google.api.gax.rpc.NotFoundException: io.grpc.StatusRuntimeException:
NOT_FOUND: Session not found:
projects/<PROJECT>/instances/<INSTANCE>/databases/<DATABASE>/sessions/<SESSION_ID>`
PCollection<ReadOperation> queries = pipeline
    .apply(Create.of((Void) null))
    .apply("BeforeRead", ParDo.of( ... )); // long-running, for example more than 2 hours
queries
    .apply(SpannerIO.readAll().withSpannerConfig(...))
    .apply("AfterRead", ParDo.of( ... ));
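
The failure mode above can be illustrated with a toy model that uses only the Java standard library. This is a sketch, not SpannerIO or Spanner client code: the class IdleSessionModel, its use method, and the one-hour IDLE_LIMIT are hypothetical names and values chosen for illustration (the linked Spanner sessions page describes idle sessions being deleted after roughly an hour). It shows why a session created at pipeline start may be gone after a two-hour BeforeRead step, while a session that receives periodic requests stays alive.

```java
import java.time.Duration;
import java.time.Instant;

/** Toy model of a server-side session that is deleted after an idle timeout. */
class IdleSessionModel {
    /** Assumed idle limit; a stand-in for the server's real policy. */
    public static final Duration IDLE_LIMIT = Duration.ofHours(1);

    private Instant lastUsed;

    public IdleSessionModel(Instant createdAt) {
        this.lastUsed = createdAt;
    }

    /**
     * Simulates a request at time 'now'. Returns false (modeling NOT_FOUND)
     * if the session has been idle longer than IDLE_LIMIT; otherwise records
     * the use and returns true.
     */
    public boolean use(Instant now) {
        if (Duration.between(lastUsed, now).compareTo(IDLE_LIMIT) > 0) {
            return false; // session was cleaned up while idle
        }
        lastUsed = now;
        return true;
    }

    public static void main(String[] args) {
        Instant start = Instant.EPOCH; // arbitrary reference time

        // Session created at pipeline start, first used after a 2-hour step.
        IdleSessionModel idle = new IdleSessionModel(start);
        boolean idleOk = idle.use(start.plus(Duration.ofHours(2)));
        System.out.println("created at start, used after 2h: " + idleOk);
        // prints "created at start, used after 2h: false"

        // Same session lifetime, but touched every 30 minutes.
        IdleSessionModel kept = new IdleSessionModel(start);
        boolean keptOk = true;
        for (int minutes = 30; minutes <= 120; minutes += 30) {
            keptOk &= kept.use(start.plus(Duration.ofMinutes(minutes)));
        }
        System.out.println("touched every 30 min, used after 2h: " + keptOk);
        // prints "touched every 30 min, used after 2h: true"
    }
}
```

In this model the only thing that matters is the gap between consecutive requests, which matches the reported symptom: the session is created early, sits unused during BeforeRead, and is no longer found when the read finally runs.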
--
This message was sent by Atlassian Jira
(v8.3.4#803005)