[
https://issues.apache.org/jira/browse/SPARK-36996?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17428141#comment-17428141
]
Senthil Kumar commented on SPARK-36996:
---------------------------------------
Sample output after these changes:

SQL:
mysql> CREATE TABLE Persons(Id int NOT NULL, FirstName varchar(255), LastName
varchar(255), Age int);
mysql> desc Persons;
+-----------+--------------+------+-----+---------+-------+
| Field | Type | Null | Key | Default | Extra |
+-----------+--------------+------+-----+---------+-------+
| Id | int | NO | | NULL | |
| FirstName | varchar(255) | YES | | NULL | |
| LastName | varchar(255) | YES | | NULL | |
| Age | int | YES | | NULL | |
+-----------+--------------+------+-----+---------+-------+
Spark:
scala> val df = spark.read.format("jdbc").
         option("database", "Test_DB").
         option("user", "root").
         option("password", "").
         option("driver", "com.mysql.cj.jdbc.Driver").
         option("url", "jdbc:mysql://localhost:3306/Test_DB").
         option("dbtable", "Persons").
         load()
df: org.apache.spark.sql.DataFrame = [Id: int, FirstName: string ... 2 more fields]
scala> df.printSchema()
root
|-- Id: integer (nullable = false)
|-- FirstName: string (nullable = true)
|-- LastName: string (nullable = true)
|-- Age: integer (nullable = true)
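For context, the JDBC reader can derive each field's nullability from java.sql.ResultSetMetaData.isNullable. A minimal sketch of that mapping, assuming the helper name isColumnNullable (hypothetical, not an actual Spark method):

```scala
import java.sql.ResultSetMetaData

// Hypothetical helper illustrating the mapping: isNullable returns
// columnNoNulls (0), columnNullable (1), or columnNullableUnknown (2);
// only columnNoNulls should yield nullable = false in the Spark schema.
def isColumnNullable(flag: Int): Boolean =
  flag != ResultSetMetaData.columnNoNulls

// A NOT NULL column such as Id reports columnNoNulls:
assert(!isColumnNullable(ResultSetMetaData.columnNoNulls))
// Ordinary columns such as FirstName report columnNullable:
assert(isColumnNullable(ResultSetMetaData.columnNullable))
```

When the driver reports columnNullableUnknown, keeping nullable = true is the safe default, since a false negative there would let nulls slip past Spark's schema.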
And for TIMESTAMP columns:

SQL:
mysql> create table timestamp_test(id int(11), time_stamp timestamp not null default current_timestamp);
Spark:
scala> val df = spark.read.format("jdbc").
         option("database", "Test_DB").
         option("user", "root").
         option("password", "").
         option("driver", "com.mysql.cj.jdbc.Driver").
         option("url", "jdbc:mysql://localhost:3306/Test_DB").
         option("dbtable", "timestamp_test").
         load()
df: org.apache.spark.sql.DataFrame = [id: int, time_stamp: timestamp]
scala> df.printSchema()
root
|-- id: integer (nullable = true)
|-- time_stamp: timestamp (nullable = true)
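On Spark versions without this fix, one workaround is to rebuild the DataFrame with the nullability the source table actually declares. A self-contained sketch; the local DataFrame here is a stand-in for the JDBC read above (over JDBC, every column would arrive nullable before this fix), and the column list in notNullCols is taken from the Persons example:

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.types.StructType

val spark = SparkSession.builder().master("local[1]").appName("nullable-fix").getOrCreate()
import spark.implicits._

// Stand-in for spark.read.format("jdbc")...load() from the example above.
val df = Seq((1, "Tom", "Hanks", 60)).toDF("Id", "FirstName", "LastName", "Age")

// Flip nullability for the columns the source table declares NOT NULL.
val notNullCols = Set("Id")
val corrected = StructType(df.schema.map(f =>
  if (notNullCols.contains(f.name)) f.copy(nullable = false) else f))

// createDataFrame with an explicit schema preserves its nullability flags.
val fixed = spark.createDataFrame(df.rdd, corrected)
fixed.printSchema()
```

Note this only relabels the schema; it does not validate the data, so it is safe only when the source column genuinely forbids nulls.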
> fixing "SQL column nullable setting not retained as part of spark read" issue
> -----------------------------------------------------------------------------
>
> Key: SPARK-36996
> URL: https://issues.apache.org/jira/browse/SPARK-36996
> Project: Spark
> Issue Type: Improvement
> Components: SQL
> Affects Versions: 3.0.0, 3.1.0, 3.1.1, 3.1.2
> Reporter: Senthil Kumar
> Priority: Major
>
> Columns declared NOT NULL in SQL do not retain their 'nullable' setting when
> the table is read through Spark's jdbc format.
>
> SQL :
> ------------
>
> mysql> CREATE TABLE Persons(Id int NOT NULL, FirstName varchar(255), LastName
> varchar(255), Age int);
>
> mysql> desc Persons;
> +-----------+--------------+------+-----+---------+-------+
> | Field | Type | Null | Key | Default | Extra |
> +-----------+--------------+------+-----+---------+-------+
> | Id | int | NO | | NULL | |
> | FirstName | varchar(255) | YES | | NULL | |
> | LastName | varchar(255) | YES | | NULL | |
> | Age | int | YES | | NULL | |
> +-----------+--------------+------+-----+---------+-------+
>
> But in Spark we get all the columns as "Nullable":
> =============
> scala> val df = spark.read.format("jdbc").
>          option("database", "Test_DB").
>          option("user", "root").
>          option("password", "").
>          option("driver", "com.mysql.cj.jdbc.Driver").
>          option("url", "jdbc:mysql://localhost:3306/Test_DB").
>          option("dbtable", "Persons").
>          load()
> df: org.apache.spark.sql.DataFrame = [Id: int, FirstName: string ... 2 more fields]
> scala> df.printSchema()
> root
> |-- Id: integer (nullable = true)
> |-- FirstName: string (nullable = true)
> |-- LastName: string (nullable = true)
> |-- Age: integer (nullable = true)
> =============
>
--
This message was sent by Atlassian Jira
(v8.3.4#803005)
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]