[ https://issues.apache.org/jira/browse/SPARK-27077?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Paul Wu updated SPARK-27077:
----------------------------
    Description:

I am not sure whether this is a Spark core issue or a Vertica issue, but I am inclined to think it is on the Spark side. When we read from a data source with sparkSession.read.load (in my case, a Vertica database), the DataFrameReader issues a large number of initial JDBC connection requests. My account is limited to 16 connections (and at least 6 of them are available for my load), yet when that burst of requests goes out I get the exception below. Eventually the reader settles down to far fewer connections (in my case, 2 simultaneous DataFrameReaders). I therefore suggest a parameter that prevents the reader from sending out more initial connection requests than the user's limit. Without such an option, my application can fail randomly depending on how many connections my Vertica account allows.

java.sql.SQLNonTransientConnectionException: [Vertica][VJDBC](7470) FATAL: New session rejected because connection limit of 16 on database already met for M21176
	at com.vertica.util.ServerErrorData.buildException(Unknown Source)
	at com.vertica.io.ProtocolStream.readStartupMessages(Unknown Source)
	at com.vertica.io.ProtocolStream.initSession(Unknown Source)
	at com.vertica.core.VConnection.tryConnect(Unknown Source)
	at com.vertica.core.VConnection.connect(Unknown Source)
	at com.vertica.jdbc.common.BaseConnectionFactory.doConnect(Unknown Source)
	at com.vertica.jdbc.common.AbstractDriver.connect(Unknown Source)
	at java.sql.DriverManager.getConnection(DriverManager.java:664)
	at java.sql.DriverManager.getConnection(DriverManager.java:208)
	at com.vertica.spark.datasource.VerticaDataSourceRDD$.resolveTable(VerticaRDD.scala:105)
	at com.vertica.spark.datasource.VerticaRelation.<init>(VerticaRelation.scala:34)
	at com.vertica.spark.datasource.DefaultSource.createRelation(VerticaSource.scala:47)
	at org.apache.spark.sql.execution.datasources.DataSource.resolveRelation(DataSource.scala:341)
	at org.apache.spark.sql.DataFrameReader.loadV1Source(DataFrameReader.scala:239)
	at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:227)
	at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:164)
	at com.att.iqi.data.ConnectorPrepareHourlyDataRT$1.run(ConnectorPrepareHourlyDataRT.java:156)
Caused by: com.vertica.support.exceptions.NonTransientConnectionException: [Vertica][VJDBC](7470) FATAL: New session rejected because connection limit of 16 on database already met for
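For reference, Spark's built-in JDBC data source already exposes a knob of this kind: the numPartitions read option bounds both the number of partitions and the maximum number of concurrent JDBC connections opened for the read. Whether the third-party Vertica connector in the stack trace honors an equivalent option is not confirmed here; the sketch below uses the built-in source against a plain JDBC URL, and the URL, table, split column, and credentials are all placeholders:

```java
import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.SparkSession;

public class BoundedJdbcRead {
    public static void main(String[] args) {
        SparkSession spark = SparkSession.builder()
                .appName("bounded-jdbc-read")
                .master("local[*]")
                .getOrCreate();

        // Placeholder connection details -- not the reporter's environment.
        Dataset<Row> df = spark.read()
                .format("jdbc")
                .option("url", "jdbc:vertica://host:5433/db")
                .option("dbtable", "my_table")
                .option("user", "user")
                .option("password", "password")
                // Caps the partition count AND the maximum number of
                // simultaneous JDBC connections opened for this read.
                .option("numPartitions", "4")
                // Partitioned reads also need a numeric split column
                // and its value range.
                .option("partitionColumn", "id")
                .option("lowerBound", "0")
                .option("upperBound", "1000000")
                .load();

        df.show();
        spark.stop();
    }
}
```

This only bounds the connections of a single read; it does not coordinate across several DataFrameReaders running at once, which is the scenario in the report.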
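Until an option like the one requested exists on the data-source side, the random failures can also be avoided in the calling application: the stack trace shows the reads are launched from concurrently running threads, so the loads can be gated behind a java.util.concurrent.Semaphore. This is a sketch of the general technique, not code from the reporter's application; the sleep stands in for the actual sparkSession.read().load(...) call:

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

public class BoundedLoads {
    // At most 2 loads run at once, mirroring the "2 simultaneous
    // DataFrameReaders" the report says the account can sustain.
    private static final Semaphore PERMITS = new Semaphore(2);

    // Instrumentation so the bound can be observed.
    static final AtomicInteger active = new AtomicInteger();
    static final AtomicInteger maxActive = new AtomicInteger();

    // Stand-in for one DataFrameReader load against the database.
    static void load() throws InterruptedException {
        PERMITS.acquire();   // block here instead of opening a 17th connection
        try {
            int now = active.incrementAndGet();
            maxActive.accumulateAndGet(now, Math::max);
            Thread.sleep(50); // simulate the read
            active.decrementAndGet();
        } finally {
            PERMITS.release();
        }
    }

    public static void main(String[] args) throws InterruptedException {
        // 8 threads all want to load, but only 2 run concurrently.
        ExecutorService pool = Executors.newFixedThreadPool(8);
        for (int i = 0; i < 8; i++) {
            pool.submit(() -> {
                try {
                    load();
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
        }
        pool.shutdown();
        pool.awaitTermination(10, TimeUnit.SECONDS);
        System.out.println("max concurrent loads = " + maxActive.get());
    }
}
```

The drawback of this workaround is that every caller in the application has to go through the same gate; a reader-side option would enforce the limit in one place.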
> DataFrameReader and Number of Connection Limitation
> ---------------------------------------------------
>
>                 Key: SPARK-27077
>                 URL: https://issues.apache.org/jira/browse/SPARK-27077
>             Project: Spark
>          Issue Type: Bug
>          Components: Spark Core, SQL
>    Affects Versions: 2.3.2
>            Reporter: Paul Wu
>            Priority: Major
>

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)