CaoYu created SPARK-40502:
-----------------------------

             Summary: Support dataframe API use jdbc data source in PySpark
                 Key: SPARK-40502
                 URL: https://issues.apache.org/jira/browse/SPARK-40502
             Project: Spark
          Issue Type: New Feature
          Components: PySpark
    Affects Versions: 3.3.0
            Reporter: CaoYu


When using PySpark, I want to read data from a MySQL database, so I would like 
to use JdbcRDD as in Java/Scala.

However, JdbcRDD is not supported in PySpark.

 

For certain reasons, I cannot use the DataFrame API and can only use the RDD 
(datastream) API, even though I know the DataFrame API can read from a JDBC 
source fairly well.
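
For context, here is a minimal sketch of the DataFrame-based JDBC read mentioned above, with the result exposed as an RDD through `df.rdd`. It is an illustration only, not the feature requested in this issue. The helper names (`jdbc_options`, `read_jdbc_as_rdd`, `demo`), the URL, table name, and credentials are hypothetical placeholders. The driver class assumes MySQL Connector/J 8.x is on the classpath.

```python
def jdbc_options(url, table, user, password, driver="com.mysql.cj.jdbc.Driver"):
    """Build the option map for Spark's built-in JDBC data source.

    The default driver class is an assumption (MySQL Connector/J 8.x).
    """
    return {
        "url": url,
        "dbtable": table,
        "user": user,
        "password": password,
        "driver": driver,
    }


def read_jdbc_as_rdd(spark, options):
    """Load a table through the DataFrame JDBC source and return an RDD of Row."""
    df = spark.read.format("jdbc").options(**options).load()
    return df.rdd


def demo():
    """Hypothetical usage; needs pyspark, the connector JAR, and a reachable database."""
    from pyspark.sql import SparkSession

    spark = SparkSession.builder.appName("jdbc-to-rdd").getOrCreate()
    # Placeholder connection details, not taken from the issue.
    opts = jdbc_options("jdbc:mysql://localhost:3306/testdb", "people", "user", "secret")
    rdd = read_jdbc_as_rdd(spark, opts)
    print(rdd.take(5))
    spark.stop()
```

This still goes through the DataFrame reader internally, so it only helps when the restriction applies to the downstream processing code and not to the read itself.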

 
So I would like to implement functionality that lets PySpark read data from a 
JDBC source through the RDD API.
 
*However, I don't know whether this is necessary for PySpark, so we can discuss it.*
 
*If it is necessary for PySpark, I would like to contribute it to Spark.*
*I hope this Jira task can be assigned to me so that I can start working on 
the implementation.*
 
*If not, please close this Jira task.*
 
 
*Thanks a lot.*
 



