[ https://issues.apache.org/jira/browse/SPARK-27077?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Paul Wu updated SPARK-27077:
----------------------------
    Description:

I am not sure whether this is a Spark core issue or a Vertica issue, but I am inclined to think it is on the Spark side. When we read from a data source with sparkSession.read.load (in my case, a Vertica database), the DataFrameReader issues a large number of initial JDBC connection requests. My account is limited to 16 connections (and at least 6 of them are available for my load), yet when that burst of requests goes out I get the exception below. Eventually the reader settles down to far fewer connections (in my case, 2 simultaneous DataFrameReaders). I therefore suggest a parameter that prevents the reader from sending out more initial connection requests than the user's limit. Without such an option, my application can fail randomly depending on how many connections my Vertica account allows.

java.sql.SQLNonTransientConnectionException: [Vertica][VJDBC](7470) FATAL: New session rejected because connection limit of 16 on database already met for M21176
	at com.vertica.util.ServerErrorData.buildException(Unknown Source)
	at com.vertica.io.ProtocolStream.readStartupMessages(Unknown Source)
	at com.vertica.io.ProtocolStream.initSession(Unknown Source)
	at com.vertica.core.VConnection.tryConnect(Unknown Source)
	at com.vertica.core.VConnection.connect(Unknown Source)
	at com.vertica.jdbc.common.BaseConnectionFactory.doConnect(Unknown Source)
	at com.vertica.jdbc.common.AbstractDriver.connect(Unknown Source)
	at java.sql.DriverManager.getConnection(DriverManager.java:664)
	at java.sql.DriverManager.getConnection(DriverManager.java:208)
	at com.vertica.spark.datasource.VerticaDataSourceRDD$.resolveTable(VerticaRDD.scala:105)
	at com.vertica.spark.datasource.VerticaRelation.<init>(VerticaRelation.scala:34)
	at com.vertica.spark.datasource.DefaultSource.createRelation(VerticaSource.scala:47)
	at org.apache.spark.sql.execution.datasources.DataSource.resolveRelation(DataSource.scala:341)
	at org.apache.spark.sql.DataFrameReader.loadV1Source(DataFrameReader.scala:239)
	at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:227)
	at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:164)
	at com.att.iqi.data.ConnectorPrepareHourlyDataRT$1.run(ConnectorPrepareHourlyDataRT.java:156)
Caused by: com.vertica.support.exceptions.NonTransientConnectionException: [Vertica][VJDBC](7470) FATAL: New session rejected because connection limit of 16 on database already met for
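For reference, Spark's built-in JDBC data source already exposes a knob of this kind: the numPartitions read option bounds both the number of partitions and the maximum number of concurrent JDBC connections opened for the read. Whether the third-party Vertica connector in the stack trace honors an equivalent option is not confirmed here; the sketch below uses the built-in source against a plain JDBC URL, and the URL, table, split column, and credentials are all placeholders:

```java
import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.SparkSession;

public class BoundedJdbcRead {
    public static void main(String[] args) {
        SparkSession spark = SparkSession.builder()
                .appName("bounded-jdbc-read")
                .master("local[*]")
                .getOrCreate();

        // Placeholder connection details -- not the reporter's environment.
        Dataset<Row> df = spark.read()
                .format("jdbc")
                .option("url", "jdbc:vertica://host:5433/db")
                .option("dbtable", "my_table")
                .option("user", "user")
                .option("password", "password")
                // Caps the partition count AND the maximum number of
                // simultaneous JDBC connections opened for this read.
                .option("numPartitions", "4")
                // Partitioned reads also need a numeric split column
                // and its value range.
                .option("partitionColumn", "id")
                .option("lowerBound", "0")
                .option("upperBound", "1000000")
                .load();

        df.show();
        spark.stop();
    }
}
```

This only bounds the connections of a single read; it does not coordinate across several DataFrameReaders running at once, which is the scenario in the report.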
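Until an option like the one requested exists on the data-source side, the random failures can also be avoided in the calling application: the stack trace shows the reads are launched from concurrently running threads, so the loads can be gated behind a java.util.concurrent.Semaphore. This is a sketch of the general technique, not code from the reporter's application; the sleep stands in for the actual sparkSession.read().load(...) call:

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

public class BoundedLoads {
    // At most 2 loads run at once, mirroring the "2 simultaneous
    // DataFrameReaders" the report says the account can sustain.
    private static final Semaphore PERMITS = new Semaphore(2);

    // Instrumentation so the bound can be observed.
    static final AtomicInteger active = new AtomicInteger();
    static final AtomicInteger maxActive = new AtomicInteger();

    // Stand-in for one DataFrameReader load against the database.
    static void load() throws InterruptedException {
        PERMITS.acquire();   // block here instead of opening a 17th connection
        try {
            int now = active.incrementAndGet();
            maxActive.accumulateAndGet(now, Math::max);
            Thread.sleep(50); // simulate the read
            active.decrementAndGet();
        } finally {
            PERMITS.release();
        }
    }

    public static void main(String[] args) throws InterruptedException {
        // 8 threads all want to load, but only 2 run concurrently.
        ExecutorService pool = Executors.newFixedThreadPool(8);
        for (int i = 0; i < 8; i++) {
            pool.submit(() -> {
                try {
                    load();
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
        }
        pool.shutdown();
        pool.awaitTermination(10, TimeUnit.SECONDS);
        System.out.println("max concurrent loads = " + maxActive.get());
    }
}
```

The drawback of this workaround is that every caller in the application has to go through the same gate; a reader-side option would enforce the limit in one place.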
> DataFrameReader and Number of Connection Limitation
> ---------------------------------------------------
>
>                 Key: SPARK-27077
>                 URL: https://issues.apache.org/jira/browse/SPARK-27077
>             Project: Spark
>          Issue Type: Bug
>          Components: Spark Core, SQL
>    Affects Versions: 2.3.2
>            Reporter: Paul Wu
>            Priority: Major
>

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)