Paul Wu created SPARK-27077:
-------------------------------
Summary: DataFrameReader and Number of Connection Limitation
Key: SPARK-27077
URL: https://issues.apache.org/jira/browse/SPARK-27077
Project: Spark
Issue Type: Bug
Components: Spark Core, SQL
Affects Versions: 2.3.2
Reporter: Paul Wu
I am not sure whether this is a Spark core issue or a Vertica issue, but I suspect
it is on the Spark side. When we read from a data source with
sparkSession.read.load -- in my case, a Vertica DB -- the DataFrameReader issues a
large number of initial JDBC connection requests. My account is limited to 16
connections (and at least 6 of them were available for my load), and when that
large burst of requests went out, I got the exception below. Eventually the reader
settles down to far fewer connections (in my case, 2 simultaneous
DataFrameReaders), so I think there should be an option that prevents the reader
from issuing more initial connection requests than the user's limit allows.
Without such an option, my app can fail randomly because of the number of
connections my Vertica account allows.
java.sql.SQLNonTransientConnectionException: [Vertica][VJDBC](7470) FATAL: New
session rejected because connection limit of 16 on database already met for
M21176
at com.vertica.util.ServerErrorData.buildException(Unknown Source)
at com.vertica.io.ProtocolStream.readStartupMessages(Unknown Source)
at com.vertica.io.ProtocolStream.initSession(Unknown Source)
at com.vertica.core.VConnection.tryConnect(Unknown Source)
at com.vertica.core.VConnection.connect(Unknown Source)
        at com.vertica.jdbc.common.BaseConnectionFactory.doConnect(Unknown Source)
        at com.vertica.jdbc.common.AbstractDriver.connect(Unknown Source)
        at java.sql.DriverManager.getConnection(DriverManager.java:664)
        at java.sql.DriverManager.getConnection(DriverManager.java:208)
        at com.vertica.spark.datasource.VerticaDataSourceRDD$.resolveTable(VerticaRDD.scala:105)
        at com.vertica.spark.datasource.VerticaRelation.<init>(VerticaRelation.scala:34)
        at com.vertica.spark.datasource.DefaultSource.createRelation(VerticaSource.scala:47)
        at org.apache.spark.sql.execution.datasources.DataSource.resolveRelation(DataSource.scala:341)
        at org.apache.spark.sql.DataFrameReader.loadV1Source(DataFrameReader.scala:239)
        at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:227)
        at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:164)
        at com.att.iqi.data.ConnectorPrepareHourlyDataRT$1.run(ConnectorPrepareHourlyDataRT.java:156)
Caused by: com.vertica.support.exceptions.NonTransientConnectionException:
[Vertica][VJDBC](7470) FATAL: New session rejected because connection limit of
16 on database already met for M21176
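For Spark's built-in JDBC source, the numPartitions read option already caps the
number of concurrent JDBC connections; whether the Vertica connector honors an
equivalent option is not clear from this report. As a client-side workaround
sketch in the meantime, the number of simultaneous connection attempts can be
capped with a semaphore. This is a minimal, self-contained Java illustration of
the idea, not the actual connector code: ConnectionThrottle and openConnection
are hypothetical names, and openConnection merely simulates (rather than calls)
DriverManager.getConnection.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

public class ConnectionThrottle {
    private final Semaphore permits;                       // connection slots
    private final AtomicInteger active = new AtomicInteger();
    private final AtomicInteger peak = new AtomicInteger();

    ConnectionThrottle(int maxConnections) {
        this.permits = new Semaphore(maxConnections);
    }

    // Hypothetical stand-in for a real DriverManager.getConnection(...) call.
    void openConnection() throws InterruptedException {
        permits.acquire();                 // block until a slot is free
        try {
            int now = active.incrementAndGet();
            peak.accumulateAndGet(now, Math::max);  // record peak concurrency
            Thread.sleep(20);              // simulate connection setup work
        } finally {
            active.decrementAndGet();
            permits.release();
        }
    }

    // Launch many concurrent attempts; return the peak concurrency observed.
    int run(int attempts) throws InterruptedException {
        ExecutorService pool = Executors.newFixedThreadPool(attempts);
        for (int i = 0; i < attempts; i++) {
            pool.submit(() -> {
                try {
                    openConnection();
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
        }
        pool.shutdown();
        pool.awaitTermination(30, TimeUnit.SECONDS);
        return peak.get();
    }

    public static void main(String[] args) throws InterruptedException {
        // 32 attempts, but never more than 6 in flight at once.
        int peak = new ConnectionThrottle(6).run(32);
        System.out.println("peak concurrent connection attempts: " + peak);
    }
}
```

An option like the one requested here would effectively move this throttle
inside the DataFrameReader, so the initial burst never exceeds the account's
connection limit.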
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]