[
https://issues.apache.org/jira/browse/SPARK-41070?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17635065#comment-17635065
]
Hyukjin Kwon commented on SPARK-41070:
--------------------------------------
This sounds more like a question than an issue, and I suspect it isn't a
performance regression. Let's discuss it on the Spark user/dev mailing list
before filing an issue.
> Performance issue when Spark SQL connects with TeraData
> --------------------------------------------------------
>
> Key: SPARK-41070
> URL: https://issues.apache.org/jira/browse/SPARK-41070
> Project: Spark
> Issue Type: Bug
> Components: Spark Core, SQL
> Affects Versions: 2.4.4
> Reporter: Ramakrishna
> Priority: Major
>
> We are connecting to Teradata from Spark SQL with the API below:
> Dataset<Row> jdbcDF = spark.read().jdbc(connectionUrl,
> tableQuery, connectionProperties);
> When we run the above logic on a large table with a million rows, we see the
> extra query below executed every time, which causes a performance hit on the
> DB.
> We got the information below from our DBA. We don't have any Spark SQL logs.
> SELECT 1 FROM ONE_MILLION_ROWS_TABLE;
> |1|
> |1|
> |1|
> |1|
> |1|
> |1|
> |1|
> |1|
> |1|
> |1|
>
> Can you please clarify why this query is executed? Is there any chance that
> our own code triggers this type of query, for example when it checks the
> row count of the DataFrame?
>
> Please share your inputs on this.
>
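
One possible explanation, offered as an assumption rather than a confirmed
diagnosis of this report: when an action such as count() needs no columns from
a JDBC-backed DataFrame, Spark's JDBC source substitutes the literal 1 for the
column list. That produces SELECT 1 FROM <table> and sends one row per table
row back to Spark. If the goal is only a row count, a minimal sketch is to pass
a COUNT(*) subquery as the table argument so that the database does the
counting. The class name CountPushdown, the helper countSubquery, and the
aliases row_count and t are hypothetical names used for illustration.

```java
public class CountPushdown {

    // Hypothetical helper: wraps a table name in an aliased COUNT(*) subquery
    // that can be passed as the `table` argument of spark.read().jdbc(...).
    static String countSubquery(String table) {
        return "(SELECT COUNT(*) AS row_count FROM " + table + ") t";
    }

    public static void main(String[] args) {
        String tableQuery = countSubquery("ONE_MILLION_ROWS_TABLE");
        System.out.println(tableQuery);

        // With Spark on the classpath, the string would be used roughly as:
        //   Dataset<Row> countDF =
        //       spark.read().jdbc(connectionUrl, tableQuery, connectionProperties);
        //   long rows = ((Number) countDF.first().get(0)).longValue();
        // Only a single row is then transferred from the database.
    }
}
```

Whether this applies here depends on how the application uses jdbcDF. Calling
jdbcDF.count() on the full table is the usual trigger for the SELECT 1 form.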