CaoYu created SPARK-40502:
-----------------------------
Summary: Support dataframe API use jdbc data source in PySpark
Key: SPARK-40502
URL: https://issues.apache.org/jira/browse/SPARK-40502
Project: Spark
Issue Type: New Feature
Components: PySpark
Affects Versions: 3.3.0
Reporter: CaoYu
When using PySpark, I want to read data from a MySQL database, so I would like
to use JDBCRDD as is possible in Java/Scala.
But that is not supported in PySpark.
For some reasons I cannot use the DataFrame API and can only use the RDD
API, even though I know the DataFrame API can read from a JDBC source fairly well.
So I want to implement functionality for PySpark that reads data from a JDBC
source through the RDD API.
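For context, a minimal sketch of what PySpark already supports today: the
DataFrame reader can load a JDBC source, and `.rdd` on the resulting DataFrame
yields an RDD of Rows. The URL, table name, and credentials below are
hypothetical placeholders, and this assumes a MySQL JDBC driver is on the
Spark classpath.

```python
def read_mysql_as_rdd(url="jdbc:mysql://localhost:3306/testdb",
                      table="users", user="root", password="secret"):
    # Deferred import so the sketch can be shown without a live Spark install.
    from pyspark.sql import SparkSession

    spark = SparkSession.builder.appName("jdbc-read").getOrCreate()
    df = (spark.read.format("jdbc")
          .option("url", url)        # hypothetical MySQL instance
          .option("dbtable", table)  # hypothetical table
          .option("user", user)
          .option("password", password)
          .load())
    # Existing workaround: convert the JDBC-backed DataFrame to an RDD of
    # Row objects, usable with plain RDD transformations.
    return df.rdd
```

The feature requested here would instead expose a direct JDBCRDD-style entry
point, without going through the DataFrame reader first.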
*But I don't know whether this is necessary for PySpark, so we can discuss it.*
*If it is necessary for PySpark, I want to contribute it to Spark.*
*I hope this Jira task can be assigned to me, so I can start working on the
implementation.*
*If not, please close this Jira task.*
*Thanks a lot.*
--
This message was sent by Atlassian Jira
(v8.20.10#820010)
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]