[ https://issues.apache.org/jira/browse/BEAM-9629?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17070806#comment-17070806 ]
Tim Robertson commented on BEAM-9629:
-------------------------------------
I'd recommend setting the default to something low (1-3) but documenting that
clearly and recommending people increase it to a sensible limit for their
environment (e.g. have it in the default examples so it is prominent).
I expect you could easily hit DB resource limits - e.g. a Spark cluster running
500 executors opening connections simultaneously by default. We have hit limits
(not Beam related) with various microservices and Hadoop processes (e.g. Sqoop)
hammering PostgreSQL.
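
To make the arithmetic concrete, here is a rough sketch, not anything taken from JdbcIO itself. It assumes every executor keeps its own pool and that all pools fill at the same time; the function names and the `reserved` parameter are made up for illustration. The 100 used in the usage note is PostgreSQL's stock max_connections default.

```python
def total_connections(workers, pool_size_per_worker):
    """Worst-case number of connections if every worker fills its pool at once."""
    return workers * pool_size_per_worker


def max_safe_pool_size(workers, db_max_connections, reserved=0):
    """Largest per-worker pool size that stays within the database limit.

    `reserved` is a hypothetical allowance for other clients (admin sessions,
    other services) that share the same database.
    """
    if workers <= 0:
        raise ValueError("workers must be positive")
    available = db_max_connections - reserved
    # Integer division: every worker gets the same share; never return a negative size.
    return max(available // workers, 0)
```

With 500 executors and a default of 3, total_connections(500, 3) comes to 1500 connections, and max_safe_pool_size(500, 100) comes to 0, i.e. even one connection per executor would exceed a stock PostgreSQL limit. That is why the default should stay low and the documentation should push people to size it for their own environment.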
> JdbcIO seems to run out of connections in the connection pool and freezes
> pipeline
> ----------------------------------------------------------------------------------
>
> Key: BEAM-9629
> URL: https://issues.apache.org/jira/browse/BEAM-9629
> Project: Beam
> Issue Type: Bug
> Components: io-java-jdbc
> Affects Versions: 2.19.0
> Environment: Dataflow, Direct Runner on macOS Catalina.
> Reporter: Boris Shilov
> Assignee: Ismaël Mejía
> Priority: Major
> Labels: performance
> Fix For: 2.21.0
>
>
> Greetings,
> I am using JdbcIO via the Scala wrappers provided in the Scio project. I am
> trying to read a few dozen tables in parallel from MySQL, but above 8
> concurrent SELECT operations the pipeline freezes. With help of the Scio
> maintainers we've been able to isolate the issue as likely originating in
> JdbcIO running out of connections in the connection pool and idling
> indefinitely. The issue occurs both on the Direct Runner and Dataflow.
> Please see linked issue for more context:
> https://github.com/spotify/scio/issues/2774
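
The behaviour described in the report can be illustrated with a toy model of a bounded pool. This is only a sketch under assumptions: it is not the JdbcIO code path, a semaphore stands in for the connection pool, the function name is made up, and a timeout replaces the indefinite wait so the starved reader can be observed instead of hanging. The pool size of 8 in the usage note is taken from the threshold mentioned in the report.

```python
import threading


def count_starved_readers(pool_size, readers, wait_seconds=0.2):
    """Count readers that never get a connection while the others hold theirs."""
    pool = threading.BoundedSemaphore(pool_size)  # stand-in for the connection pool
    attempted = threading.Semaphore(0)            # signals "this reader has tried to borrow"
    finish = threading.Event()                    # tells the holders to give connections back
    starved = []

    def reader():
        # A real pool with an unbounded wait would block here forever;
        # the timeout only exists to make the starvation measurable.
        got_connection = pool.acquire(timeout=wait_seconds)
        if not got_connection:
            starved.append(threading.current_thread().name)
        attempted.release()
        if got_connection:
            finish.wait()   # keep the connection until every reader has tried
            pool.release()

    threads = [threading.Thread(target=reader) for _ in range(readers)]
    for t in threads:
        t.start()
    for _ in threads:
        attempted.acquire()  # wait until each reader has either borrowed or timed out
    finish.set()
    for t in threads:
        t.join()
    return len(starved)
```

With count_starved_readers(8, 9), eight readers hold all the connections and the ninth never obtains one; with count_starved_readers(8, 8) nobody starves. If the wait were unbounded and the holders depended on the ninth reader making progress, the whole pipeline would idle indefinitely, which matches the symptom described above.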