[jira] [Comment Edited] (BEAM-7230) Using JdbcIO creates huge amount of connections

2019-09-27 Thread Anton Bankovskii (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-7230?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16939262#comment-16939262
 ] 

Anton Bankovskii edited comment on BEAM-7230 at 9/27/19 11:28 AM:
--

Unfortunately I came across the same behavior while executing the pipeline 
using the DataflowRunner.

Exception in Dataflow console:
{noformat}
java.lang.RuntimeException: org.apache.beam.sdk.util.UserCodeException:
Caused by: [SKIPPED] 14 more
Caused by: java.lang.NullPointerException
    at org.apache.beam.sdk.io.jdbc.JdbcIO$PoolableDataSourceProvider.buildDataSource(JdbcIO.java:1394)
    at org.apache.beam.sdk.io.jdbc.JdbcIO$PoolableDataSourceProvider.apply(JdbcIO.java:1389)
    at org.apache.beam.sdk.io.jdbc.JdbcIO$PoolableDataSourceProvider.apply(JdbcIO.java:1369)
    at org.apache.beam.sdk.io.jdbc.JdbcIO$ReadFn.setup(JdbcIO.java:862)
{noformat}
Testing locally with DirectRunner gives no error.

 

Here's my code:
{code:java}
public class ReadFromJdbcFn extends PTransform<PBegin, PCollection<String>> {

    private JdbcToCsvOptions options;

    public ReadFromJdbcFn(JdbcToCsvOptions options) {
        this.options = options;
    }

    @Override
    public PCollection<String> expand(PBegin input) {

        final JdbcIO.DataSourceConfiguration dataSourceConfiguration =
            configure(options.getConnectionString(), options.getDriverName(),
                options.getUser(), options.getPassword(), options.getConnectionProperties());

        final SerializableFunction<Void, DataSource> dataSourceProviderFunction =
            JdbcIO.PoolableDataSourceProvider.of(dataSourceConfiguration);

        return input.apply(
            JdbcIO.<String>read()
                .withDataSourceProviderFn(dataSourceProviderFunction)
                .withCoder(StringUtf8Coder.of())
                .withFetchSize(options.getFetchSize())
                .withQuery(options.getQuery())
                .withRowMapper(new JdbcToCsvRowMapper()));
    }

    private static JdbcIO.DataSourceConfiguration configure(String connString,
            String driver, String user, String password, String connProperties) {

        final JdbcIO.DataSourceConfiguration dataSourceConfiguration =
            JdbcIO.DataSourceConfiguration.create(driver, connString)
                .withUsername(user)
                .withPassword(password);

        return connProperties == null
            ? dataSourceConfiguration
            : dataSourceConfiguration.withConnectionProperties(connProperties);
    }
}

{code}
Also, I wonder: will Dataflow autoscale its worker count based on the amount of 
data read from the database?
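
For reference, here is a rough sketch of how the transform above could be wired into a pipeline and run locally with the DirectRunner (the setup that, as noted, does not reproduce the error). It assumes JdbcToCsvOptions extends PipelineOptions; the getOutput() option and the TextIO sink are hypothetical placeholders, not part of the original job.
{code:java}
import org.apache.beam.sdk.Pipeline;
import org.apache.beam.sdk.io.TextIO;
import org.apache.beam.sdk.options.PipelineOptionsFactory;

public class JdbcToCsvLocalRun {
    public static void main(String[] args) {
        // Assumes JdbcToCsvOptions extends PipelineOptions; getOutput() is a
        // hypothetical placeholder for wherever the CSV lines should go.
        JdbcToCsvOptions options =
            PipelineOptionsFactory.fromArgs(args).withValidation().as(JdbcToCsvOptions.class);

        // Without a --runner argument this uses the DirectRunner, which is the
        // configuration reported above as working without the NullPointerException.
        Pipeline pipeline = Pipeline.create(options);

        pipeline
            .apply("ReadFromJdbc", new ReadFromJdbcFn(options))
            .apply("WriteCsv", TextIO.write().to(options.getOutput()).withSuffix(".csv"));

        pipeline.run().waitUntilFinish();
    }
}
{code}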


was (Author: stabmeqt):
Unfortunately I came across the same behavior while executing the pipeline 
using the DataflowRunner.

Exception in Dataflow console:
{noformat}
java.lang.RuntimeException: org.apache.beam.sdk.util.UserCodeException:
Caused by: [SKIPPED] 14 more
Caused by: java.lang.NullPointerException
    at org.apache.beam.sdk.io.jdbc.JdbcIO$PoolableDataSourceProvider.buildDataSource(JdbcIO.java:1394)
    at org.apache.beam.sdk.io.jdbc.JdbcIO$PoolableDataSourceProvider.apply(JdbcIO.java:1389)
    at org.apache.beam.sdk.io.jdbc.JdbcIO$PoolableDataSourceProvider.apply(JdbcIO.java:1369)
    at org.apache.beam.sdk.io.jdbc.JdbcIO$ReadFn.setup(JdbcIO.java:862)
{noformat}
Testing locally with DirectRunner gives no error.

 

Here's my code:
{code:java}
public class ReadFromJdbcFn extends PTransform<PBegin, PCollection<String>> {

    private JdbcToCsvOptions options;

    public ReadFromJdbcFn(JdbcToCsvOptions options) {
        this.options = options;
    }

    @Override
    public PCollection<String> expand(PBegin input) {

        final JdbcIO.DataSourceConfiguration dataSourceConfiguration =
            configure(options.getConnectionString(), options.getDriverName(),
                options.getUser(), options.getPassword(), options.getConnectionProperties());

        final SerializableFunction<Void, DataSource> dataSourceProviderFunction =
            JdbcIO.PoolableDataSourceProvider.of(dataSourceConfiguration);

        return input.apply(
            JdbcIO.<String>read()
                .withDataSourceProviderFn(dataSourceProviderFunction)
                .withCoder(StringUtf8Coder.of())
                .withFetchSize(options.getFetchSize())
                .withQuery(options.getQuery())
                .withRowMapper(new JdbcToCsvRowMapper()));
    }

    private static JdbcIO.DataSourceConfiguration configure(String connString,
            String driver, String user, String password, String connProperties) {

        final JdbcIO.DataSourceConfiguration dataSourceConfiguration =
            JdbcIO.DataSourceConfiguration.create(driver, connString)
                .withUsername(user)
                .withPassword(password);

        return connProperties == null
            ? dataSourceConfiguration
            : dataSourceConfiguration.withConnectionProperties(connProperties);
    }
}

{code}

[jira] [Comment Edited] (BEAM-7230) Using JdbcIO creates huge amount of connections

2019-09-27 Thread Anton Bankovskii (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-7230?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16939262#comment-16939262
 ] 

Anton Bankovskii edited comment on BEAM-7230 at 9/27/19 9:48 AM:
-

Unfortunately I came across the same behavior while executing the pipeline 
using the DataflowRunner.

Exception in Dataflow console:
{noformat}
java.lang.RuntimeException: org.apache.beam.sdk.util.UserCodeException:
Caused by: [SKIPPED] 14 more
Caused by: java.lang.NullPointerException
    at org.apache.beam.sdk.io.jdbc.JdbcIO$PoolableDataSourceProvider.buildDataSource(JdbcIO.java:1394)
    at org.apache.beam.sdk.io.jdbc.JdbcIO$PoolableDataSourceProvider.apply(JdbcIO.java:1389)
    at org.apache.beam.sdk.io.jdbc.JdbcIO$PoolableDataSourceProvider.apply(JdbcIO.java:1369)
    at org.apache.beam.sdk.io.jdbc.JdbcIO$ReadFn.setup(JdbcIO.java:862)
{noformat}
Testing locally with DirectRunner gives no error.

 

Here's my code:
{code:java}
public class ReadFromJdbcFn extends PTransform<PBegin, PCollection<String>> {

    private JdbcToCsvOptions options;

    public ReadFromJdbcFn(JdbcToCsvOptions options) {
        this.options = options;
    }

    @Override
    public PCollection<String> expand(PBegin input) {

        final JdbcIO.DataSourceConfiguration dataSourceConfiguration =
            configure(options.getConnectionString(), options.getDriverName(),
                options.getUser(), options.getPassword(), options.getConnectionProperties());

        final SerializableFunction<Void, DataSource> dataSourceProviderFunction =
            JdbcIO.PoolableDataSourceProvider.of(dataSourceConfiguration);

        return input.apply(
            JdbcIO.<String>read()
                .withDataSourceProviderFn(dataSourceProviderFunction)
                .withCoder(StringUtf8Coder.of())
                .withFetchSize(options.getFetchSize())
                .withQuery(options.getQuery())
                .withRowMapper(new JdbcToCsvRowMapper()));
    }

    private static JdbcIO.DataSourceConfiguration configure(String connString,
            String driver, String user, String password, String connProperties) {

        final JdbcIO.DataSourceConfiguration dataSourceConfiguration =
            JdbcIO.DataSourceConfiguration.create(driver, connString)
                .withUsername(user)
                .withPassword(password);

        return connProperties == null
            ? dataSourceConfiguration
            : dataSourceConfiguration.withConnectionProperties(connProperties);
    }
}

{code}


was (Author: stabmeqt):
Unfortunately I came across the same behavior while executing the pipeline 
using the DataflowRunner.

Exception in Dataflow console:
{noformat}
java.lang.RuntimeException: org.apache.beam.sdk.util.UserCodeException:
Caused by: [SKIPPED] 14 more
Caused by: java.lang.NullPointerException
    at org.apache.beam.sdk.io.jdbc.JdbcIO$PoolableDataSourceProvider.buildDataSource(JdbcIO.java:1394)
    at org.apache.beam.sdk.io.jdbc.JdbcIO$PoolableDataSourceProvider.apply(JdbcIO.java:1389)
    at org.apache.beam.sdk.io.jdbc.JdbcIO$PoolableDataSourceProvider.apply(JdbcIO.java:1369)
    at org.apache.beam.sdk.io.jdbc.JdbcIO$ReadFn.setup(JdbcIO.java:862)
{noformat}
Testing locally with DirectRunner gives no error.

> Using JdbcIO creates huge amount of connections
> ---
>
> Key: BEAM-7230
> URL: https://issues.apache.org/jira/browse/BEAM-7230
> Project: Beam
>  Issue Type: Bug
>  Components: runner-dataflow
>Affects Versions: 2.11.0
>Reporter: Brachi Packter
>Assignee: Ismaël Mejía
>Priority: Major
> Fix For: 2.13.0
>
>  Time Spent: 2h 50m
>  Remaining Estimate: 0h
>
> I want to write from Dataflow to GCP Cloud SQL. I'm using a connection pool, 
> and I still see a huge number of connections in GCP SQL (4k while I set the 
> connection pool to 300), most of them sleeping.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Comment Edited] (BEAM-7230) Using JdbcIO creates huge amount of connections

2019-09-27 Thread Anton Bankovskii (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-7230?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16939262#comment-16939262
 ] 

Anton Bankovskii edited comment on BEAM-7230 at 9/27/19 9:45 AM:
-

Unfortunately I came across the same behavior while executing the pipeline 
using the DataflowRunner.

Exception in Dataflow console:
{noformat}
java.lang.RuntimeException: org.apache.beam.sdk.util.UserCodeException:
Caused by: [SKIPPED] 14 more
Caused by: java.lang.NullPointerException
    at org.apache.beam.sdk.io.jdbc.JdbcIO$PoolableDataSourceProvider.buildDataSource(JdbcIO.java:1394)
    at org.apache.beam.sdk.io.jdbc.JdbcIO$PoolableDataSourceProvider.apply(JdbcIO.java:1389)
    at org.apache.beam.sdk.io.jdbc.JdbcIO$PoolableDataSourceProvider.apply(JdbcIO.java:1369)
    at org.apache.beam.sdk.io.jdbc.JdbcIO$ReadFn.setup(JdbcIO.java:862)
{noformat}
Testing locally with DirectRunner gives no error.


was (Author: stabmeqt):
Unfortunately I came across the same behavior while executing the pipeline 
using the DataflowRunner.

Exception in Dataflow console:
{noformat}
java.lang.RuntimeException: org.apache.beam.sdk.util.UserCodeException: java.lang.NullPointerException
    at org.apache.beam.runners.dataflow.worker.IntrinsicMapTaskExecutorFactory$1.typedApply(IntrinsicMapTaskExecutorFactory.java:194)
    at org.apache.beam.runners.dataflow.worker.IntrinsicMapTaskExecutorFactory$1.typedApply(IntrinsicMapTaskExecutorFactory.java:165)
    at org.apache.beam.runners.dataflow.worker.graph.Networks$TypeSafeNodeFunction.apply(Networks.java:63)
    at org.apache.beam.runners.dataflow.worker.graph.Networks$TypeSafeNodeFunction.apply(Networks.java:50)
    at org.apache.beam.runners.dataflow.worker.graph.Networks.replaceDirectedNetworkNodes(Networks.java:87)
    at org.apache.beam.runners.dataflow.worker.IntrinsicMapTaskExecutorFactory.create(IntrinsicMapTaskExecutorFactory.java:125)
    at org.apache.beam.runners.dataflow.worker.BatchDataflowWorker.doWork(BatchDataflowWorker.java:352)
    at org.apache.beam.runners.dataflow.worker.BatchDataflowWorker.getAndPerformWork(BatchDataflowWorker.java:305)
    at org.apache.beam.runners.dataflow.worker.DataflowBatchWorkerHarness$WorkerThread.doWork(DataflowBatchWorkerHarness.java:140)
    at org.apache.beam.runners.dataflow.worker.DataflowBatchWorkerHarness$WorkerThread.call(DataflowBatchWorkerHarness.java:120)
    at org.apache.beam.runners.dataflow.worker.DataflowBatchWorkerHarness$WorkerThread.call(DataflowBatchWorkerHarness.java:107)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)
Caused by: org.apache.beam.sdk.util.UserCodeException: java.lang.NullPointerException
    at org.apache.beam.sdk.util.UserCodeException.wrap(UserCodeException.java:34)
    at org.apache.beam.sdk.io.jdbc.JdbcIO$ReadFn$DoFnInvoker.invokeSetup(Unknown Source)
    at org.apache.beam.runners.dataflow.worker.DoFnInstanceManagers$ConcurrentQueueInstanceManager.deserializeCopy(DoFnInstanceManagers.java:80)
    at org.apache.beam.runners.dataflow.worker.DoFnInstanceManagers$ConcurrentQueueInstanceManager.peek(DoFnInstanceManagers.java:62)
    at org.apache.beam.runners.dataflow.worker.UserParDoFnFactory.create(UserParDoFnFactory.java:95)
    at org.apache.beam.runners.dataflow.worker.DefaultParDoFnFactory.create(DefaultParDoFnFactory.java:75)
    at org.apache.beam.runners.dataflow.worker.IntrinsicMapTaskExecutorFactory.createParDoOperation(IntrinsicMapTaskExecutorFactory.java:264)
    at org.apache.beam.runners.dataflow.worker.IntrinsicMapTaskExecutorFactory.access$000(IntrinsicMapTaskExecutorFactory.java:86)
    at org.apache.beam.runners.dataflow.worker.IntrinsicMapTaskExecutorFactory$1.typedApply(IntrinsicMapTaskExecutorFactory.java:183)
    ... 14 more
Caused by: java.lang.NullPointerException
    at org.apache.beam.sdk.io.jdbc.JdbcIO$PoolableDataSourceProvider.buildDataSource(JdbcIO.java:1394)
    at org.apache.beam.sdk.io.jdbc.JdbcIO$PoolableDataSourceProvider.apply(JdbcIO.java:1389)
    at org.apache.beam.sdk.io.jdbc.JdbcIO$PoolableDataSourceProvider.apply(JdbcIO.java:1369)
    at org.apache.beam.sdk.io.jdbc.JdbcIO$ReadFn.setup(JdbcIO.java:862)
{noformat}
Testing locally with DirectRunner gives no error.

> Using JdbcIO creates huge amount of connections
> ---
>
> Key: BEAM-7230
> URL: https://issues.apache.org/jira/browse/BEAM-7230
> Project: Beam
>  Issue Type: Bug
>  Components: runner-dataflow
>Affects Versions: 2.11.0
>Reporter: Brachi Packter
>Assignee: Ismaël Mejía
>Priority: Major
> Fix For: 2.13.0
>
>  Time Spent: 2h 50m
>  Remaining Estimate: 0h
>
> I want to write from Dataflow to GCP Cloud SQL. I'm using a connection pool, 
> and I still see a huge number of connections in GCP SQL (4k while I set the 
> connection pool to 300), most of them sleeping.

[jira] [Comment Edited] (BEAM-7230) Using JdbcIO creates huge amount of connections

2019-05-13 Thread Brachi Packter (JIRA)


[ 
https://issues.apache.org/jira/browse/BEAM-7230?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16838667#comment-16838667
 ] 

Brachi Packter edited comment on BEAM-7230 at 5/13/19 4:16 PM:
---

But I didn't use `PoolableDatasource`, because I see GCP recommends using 
`HikariCP`. (Yes, I keep it static.)

Do you think it would be better to use MemoizedDataSourceProvider?


was (Author: brachi_packter):
But I didn't use `PoolableDatasource`, because I see GCP recommends using 
`HikariCP`.

Do you think it would be better to use MemoizedDataSourceProvider?

> Using JdbcIO creates huge amount of connections
> ---
>
> Key: BEAM-7230
> URL: https://issues.apache.org/jira/browse/BEAM-7230
> Project: Beam
>  Issue Type: Bug
>  Components: runner-dataflow
>Affects Versions: 2.11.0
>Reporter: Brachi Packter
>Assignee: Ismaël Mejía
>Priority: Major
>
> I want to write from Dataflow to GCP Cloud SQL. I'm using a connection pool, 
> and I still see a huge number of connections in GCP SQL (4k while I set the 
> connection pool to 300), most of them sleeping.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Comment Edited] (BEAM-7230) Using JdbcIO creates huge amount of connections

2019-05-13 Thread JIRA


[ 
https://issues.apache.org/jira/browse/BEAM-7230?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16838642#comment-16838642
 ] 

Ismaël Mejía edited comment on BEAM-7230 at 5/13/19 3:56 PM:
-

[~brachi_packter] notice that in the provided function, if you use 
PoolableDatasource, you MUST guarantee that the DataSource is constructed as a 
static value; that way it will exist only once per JVM. Otherwise you will be 
building a pool per DoFn, and in streaming that will end up opening many 
connections, which is the core of this issue.
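
For illustration, a minimal sketch of that advice: a SerializableFunction<Void, DataSource> that keeps the pool in a static field so it is built at most once per worker JVM. It uses HikariCP (mentioned elsewhere in this thread); the class name and pool size are illustrative assumptions, not code from this issue.
{code:java}
import com.zaxxer.hikari.HikariConfig;
import com.zaxxer.hikari.HikariDataSource;
import javax.sql.DataSource;
import org.apache.beam.sdk.transforms.SerializableFunction;

public class StaticDataSourceProvider implements SerializableFunction<Void, DataSource> {

    // One pool per JVM, shared by every DoFn instance and thread on the worker.
    private static volatile DataSource dataSource;

    private final String jdbcUrl;
    private final String user;
    private final String password;

    public StaticDataSourceProvider(String jdbcUrl, String user, String password) {
        this.jdbcUrl = jdbcUrl;
        this.user = user;
        this.password = password;
    }

    @Override
    public DataSource apply(Void input) {
        if (dataSource == null) {
            synchronized (StaticDataSourceProvider.class) {
                if (dataSource == null) {
                    HikariConfig config = new HikariConfig();
                    config.setJdbcUrl(jdbcUrl);
                    config.setUsername(user);
                    config.setPassword(password);
                    config.setMaximumPoolSize(10); // sized per worker, not per pipeline
                    dataSource = new HikariDataSource(config);
                }
            }
        }
        return dataSource;
    }
}
{code}
Such a function would be passed to JdbcIO through .withDataSourceProviderFn(new StaticDataSourceProvider(url, user, password)), so every DoFn on the worker resolves to the same pool.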

 


was (Author: iemejia):
[~brachi_packter] notice that in the provided function you MUST guarantee that 
the DataSource is constructed as a static value; that way it will exist only 
once per JVM. Otherwise you will be building a pool per DoFn, and in streaming 
that will end up opening many connections, which will bring us back to the same 
error here.

 

> Using JdbcIO creates huge amount of connections
> ---
>
> Key: BEAM-7230
> URL: https://issues.apache.org/jira/browse/BEAM-7230
> Project: Beam
>  Issue Type: Bug
>  Components: runner-dataflow
>Affects Versions: 2.11.0
>Reporter: Brachi Packter
>Assignee: Ismaël Mejía
>Priority: Major
>
> I want to write from Dataflow to GCP Cloud SQL. I'm using a connection pool, 
> and I still see a huge number of connections in GCP SQL (4k while I set the 
> connection pool to 300), most of them sleeping.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Comment Edited] (BEAM-7230) Using JdbcIO creates huge amount of connections

2019-05-13 Thread Brachi Packter (JIRA)


[ 
https://issues.apache.org/jira/browse/BEAM-7230?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16838451#comment-16838451
 ] 

Brachi Packter edited comment on BEAM-7230 at 5/13/19 10:34 AM:


You close the data source in @Teardown:

 
{code:java}
@Teardown
public void teardown() throws Exception {
connection.close();
if (dataSource instanceof AutoCloseable) {
  ((AutoCloseable) dataSource).close();
}
}
{code}
So, if I want one DataSource per worker, shared between all DoFn threads, it 
must not be closed.

 

my code:
{code:java}
.apply(JdbcIO.>write()
.withDataSourceProviderFn((SerializableFunction)   
input1 -> {
ConnectionPoolProvider.init(dataBaseName, userName, project, 
region, instanceName);
return ConnectionPoolProvider.getInstance().getDataSource();
})
.withStatement(sqlInsertCommand)
.withPreparedStatementSetter((JdbcIO.PreparedStatementSetter>)
 this::prepareStatement
));
{code}
This leads to this exception:

 
{code:java}
Caused by: java.sql.SQLException: HikariDataSource HikariDataSource 
(HikariPool-1) has been closed. 
com.zaxxer.hikari.HikariDataSource.getConnection(HikariDataSource.java:96) 
org.apache.beam.sdk.io.jdbc.JdbcIO$WriteVoid$WriteFn.startBundle(JdbcIO.java:990)
{code}
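
One illustrative way to reconcile this with the teardown quoted above, which only closes data sources that implement AutoCloseable, is to hand JdbcIO a thin javax.sql.DataSource delegate that is not AutoCloseable, so the shared pool survives each DoFn teardown. This is only a sketch of that idea, not a fix adopted in this issue; the class name is hypothetical.
{code:java}
import java.io.PrintWriter;
import java.sql.Connection;
import java.sql.SQLException;
import java.sql.SQLFeatureNotSupportedException;
import java.util.logging.Logger;
import javax.sql.DataSource;

/** Delegates to a shared pool but is not AutoCloseable, so JdbcIO's teardown leaves it open. */
public class NonClosingDataSource implements DataSource {

    private final DataSource delegate;

    public NonClosingDataSource(DataSource delegate) {
        this.delegate = delegate;
    }

    @Override
    public Connection getConnection() throws SQLException {
        return delegate.getConnection();
    }

    @Override
    public Connection getConnection(String username, String password) throws SQLException {
        return delegate.getConnection(username, password);
    }

    @Override
    public PrintWriter getLogWriter() throws SQLException {
        return delegate.getLogWriter();
    }

    @Override
    public void setLogWriter(PrintWriter out) throws SQLException {
        delegate.setLogWriter(out);
    }

    @Override
    public void setLoginTimeout(int seconds) throws SQLException {
        delegate.setLoginTimeout(seconds);
    }

    @Override
    public int getLoginTimeout() throws SQLException {
        return delegate.getLoginTimeout();
    }

    @Override
    public Logger getParentLogger() throws SQLFeatureNotSupportedException {
        return delegate.getParentLogger();
    }

    @Override
    public <T> T unwrap(Class<T> iface) throws SQLException {
        return delegate.unwrap(iface);
    }

    @Override
    public boolean isWrapperFor(Class<?> iface) throws SQLException {
        return delegate.isWrapperFor(iface);
    }
}
{code}
The provider function would then return new NonClosingDataSource(ConnectionPoolProvider.getInstance().getDataSource()) instead of the pool itself, keeping the shared HikariPool open for later bundles.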
 


was (Author: brachi_packter):
You close the data source in @Teardown:

 
{code:java}
@Teardown
public void teardown() throws Exception {
connection.close();
if (dataSource instanceof AutoCloseable) {
  ((AutoCloseable) dataSource).close();
}
}
{code}
So, if I want one DataSource per worker, shared between all DoFn threads, it 
must not be closed.

 

my code:
{code:java}
.apply(JdbcIO.>write()
.withDataSourceProviderFn((SerializableFunction) input1 -> { 
ConnectionPoolProvider.init(dataBaseName, userName, project, region, 
instanceName); return ConnectionPoolProvider.getInstance().getDataSource(); }) 
.withStatement(sqlInsertCommand) 
.withPreparedStatementSetter((JdbcIO.PreparedStatementSetter>)
 this::prepareStatement ));
{code}
This leads to this exception:

 
{code:java}
Caused by: java.sql.SQLException: HikariDataSource HikariDataSource 
(HikariPool-1) has been closed. 
com.zaxxer.hikari.HikariDataSource.getConnection(HikariDataSource.java:96) 
org.apache.beam.sdk.io.jdbc.JdbcIO$WriteVoid$WriteFn.startBundle(JdbcIO.java:990)
{code}
 

> Using JdbcIO creates huge amount of connections
> ---
>
> Key: BEAM-7230
> URL: https://issues.apache.org/jira/browse/BEAM-7230
> Project: Beam
>  Issue Type: Bug
>  Components: runner-dataflow
>Affects Versions: 2.11.0
>Reporter: Brachi Packter
>Assignee: Ismaël Mejía
>Priority: Major
>
> I want to write from Dataflow to GCP Cloud SQL. I'm using a connection pool, 
> and I still see a huge number of connections in GCP SQL (4k while I set the 
> connection pool to 300), most of them sleeping.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Comment Edited] (BEAM-7230) Using JdbcIO creates huge amount of connections

2019-05-13 Thread JIRA


[ 
https://issues.apache.org/jira/browse/BEAM-7230?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16838359#comment-16838359
 ] 

Ismaël Mejía edited comment on BEAM-7230 at 5/13/19 8:19 AM:
-

You can try today by using the `-SNAPSHOT` version `2.13.0-SNAPSHOT`. The 
release vote for 2.13.0 is starting this week, so that could be another way. 
The only doubt I have is that I am not sure whether they build 'snapshot' 
versions of their worker images; if that does not work, we should wait some 
days until the release is voted/out.


was (Author: iemejia):
You can try today by using the -SNAPSHOT version `2.13.0-SNAPSHOT`-. The 
release vote for 2.13.0 is starting this week, so that could be another way. 
The only doubt I have is that I am not sure whether they build 'snapshot' 
versions of their worker images; if that does not work, we should wait some 
days until the release is voted/out.

> Using JdbcIO creates huge amount of connections
> ---
>
> Key: BEAM-7230
> URL: https://issues.apache.org/jira/browse/BEAM-7230
> Project: Beam
>  Issue Type: Bug
>  Components: runner-dataflow
>Affects Versions: 2.11.0
>Reporter: Brachi Packter
>Assignee: Ismaël Mejía
>Priority: Major
>
> I want to write from Dataflow to GCP Cloud SQL. I'm using a connection pool, 
> and I still see a huge number of connections in GCP SQL (4k while I set the 
> connection pool to 300), most of them sleeping.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)