[
https://issues.apache.org/jira/browse/SPARK-36801?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Senthil Kumar updated SPARK-36801:
----------------------------------
Description:
Reading a table through the Spark SQL JDBC data source does not preserve column
nullability: columns declared "NOT NULL" in the database are reported as "nullable".
For example:
mysql> CREATE TABLE Persons(Id int NOT NULL, FirstName varchar(255), LastName
varchar(255), Age int);
Query OK, 0 rows affected (0.04 sec)
mysql> show tables;
+-------------------+
| Tables_in_test_db |
+-------------------+
| Persons |
+-------------------+
1 row in set (0.00 sec)
mysql> desc Persons;
+-----------+--------------+------+-----+---------+-------+
| Field | Type | Null | Key | Default | Extra |
+-----------+--------------+------+-----+---------+-------+
| Id | int | NO | | NULL | |
| FirstName | varchar(255) | YES | | NULL | |
| LastName | varchar(255) | YES | | NULL | |
| Age | int | YES | | NULL | |
+-----------+--------------+------+-----+---------+-------+
val df = spark.read.format("jdbc")
  .option("database", "Test_DB")
  .option("user", "root")
  .option("password", "")
  .option("driver", "com.mysql.cj.jdbc.Driver")
  .option("url", "jdbc:mysql://localhost:3306/Test_DB")
  .option("query", "(select * from Persons)")
  .load()
df.printSchema()
*output:*
root
|-- Id: integer (nullable = true)
|-- FirstName: string (nullable = true)
|-- LastName: string (nullable = true)
|-- Age: integer (nullable = true)
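Until the documentation note lands, callers who need the original nullability can reapply it themselves. The sketch below assumes the `spark` session and the `df` DataFrame from the snippet above; the schema is rebuilt by hand to match the MySQL DDL, since Spark discards the NOT NULL constraint on read.

{code:scala}
import org.apache.spark.sql.types._

// Rebuild the schema with the nullability the source table actually declares
// (the JDBC reader marks every column nullable = true).
val desiredSchema = StructType(Seq(
  StructField("Id", IntegerType, nullable = false),
  StructField("FirstName", StringType, nullable = true),
  StructField("LastName", StringType, nullable = true),
  StructField("Age", IntegerType, nullable = true)
))

// Re-wrap the existing rows under the corrected schema.
val dfStrict = spark.createDataFrame(df.rdd, desiredSchema)
dfStrict.printSchema()  // Id now prints nullable = false
{code}

Note this only relabels the metadata; Spark does not re-validate the rows against the stricter schema, so it is safe only because the database already enforces the constraint.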
So we need to add a note to the documentation [1]: "All columns are automatically
converted to be nullable for compatibility reasons."
Ref:
[1] https://spark.apache.org/docs/latest/sql-data-sources-jdbc.html#jdbc-to-other-databases
was:
Reading using Spark SQL jdbc DataSource does not maintain nullable type and
changes "non nullable" columns to "nullable".
So we need to add a note, in Documentation[1], "All columns are automatically
converted to be nullable for compatibility reasons."
[1] https://spark.apache.org/docs/latest/sql-data-sources-jdbc.html#jdbc-to-other-databases
> Document change for Spark sql jdbc
> ----------------------------------
>
> Key: SPARK-36801
> URL: https://issues.apache.org/jira/browse/SPARK-36801
> Project: Spark
> Issue Type: Documentation
> Components: Documentation
> Affects Versions: 3.0.0, 3.0.2, 3.0.3, 3.1.0, 3.1.1, 3.1.2
> Reporter: Senthil Kumar
> Priority: Trivial
>
--
This message was sent by Atlassian Jira
(v8.3.4#803005)
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]