[ 
https://issues.apache.org/jira/browse/SPARK-28151?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shiv Prashant Sood updated SPARK-28151:
---------------------------------------
    Description: 
##ByteType issue
Writing a dataframe with a column of type ByteType fails when using the JDBC connector for 
SQL Server. Append and read of tables also fail. The problem is due to:

1. (Write path) Incorrect mapping of ByteType in getCommonJDBCType() in 
JdbcUtils.scala, where ByteType gets mapped to the text "BYTE". It should be mapped to 
TINYINT:
{code:scala}
case ByteType => Option(JdbcType("BYTE", java.sql.Types.TINYINT))
{code}

In getCatalystType() (the JDBC-to-Catalyst type mapping), TINYINT is mapped to 
IntegerType, while it should be mapped to ByteType. Mapping to IntegerType is acceptable 
from the point of view of upcasting, but leads to a 4-byte allocation rather than 
1 byte per ByteType value.
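For illustration, here is a minimal sketch of the corrected two-way mapping in plain Java against java.sql.Types. The class and method names are simplified stand-ins, not Spark's actual getCommonJDBCType()/getCatalystType() signatures or the MsSqlServerDialect API:

```java
import java.sql.Types;

// Simplified stand-ins for Spark's write- and read-path type mappings;
// names are illustrative, not Spark's actual API.
public class ByteTypeMapping {

    // Write path: Catalyst ByteType should emit TINYINT, because
    // SQL Server has no BYTE data type.
    static String toJdbcName(String catalystType) {
        if (catalystType.equals("ByteType")) {
            return "TINYINT"; // was "BYTE", which SQL Server rejects
        }
        return null;
    }

    // Read path: java.sql.Types.TINYINT should come back as ByteType
    // (1 byte per value) rather than IntegerType (4 bytes per value).
    static String toCatalystName(int jdbcType) {
        if (jdbcType == Types.TINYINT) {
            return "ByteType"; // was IntegerType
        }
        return null;
    }

    public static void main(String[] args) {
        System.out.println(toJdbcName("ByteType"));        // TINYINT
        System.out.println(toCatalystName(Types.TINYINT)); // ByteType
    }
}
```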



2. (Read path) The read path ends up calling makeGetter(dt: DataType, metadata: 
Metadata), which sets the value in the RDD row according to the data type. There is 
no mapping for ByteType here, so reads will fail with an error once 
getCatalystType() is fixed.
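The gap can be sketched roughly as follows. This is a toy dispatch, not Spark's actual makeGetter(); it only shows why a missing ByteType arm matters once TINYINT maps to ByteType:

```java
import java.util.function.Function;

// Toy version of a makeGetter-style factory: pick a converter that
// narrows the fetched value to the JVM width of the Catalyst type.
// The ByteType arm is the one the read path is missing.
public class GetterSketch {

    static Function<Number, Object> makeGetter(String catalystType) {
        switch (catalystType) {
            case "ByteType":    return n -> n.byteValue(); // 1-byte slot
            case "IntegerType": return n -> n.intValue();  // 4-byte slot
            default:
                // Without a ByteType case, reads of TINYINT columns
                // would land here and fail.
                throw new IllegalArgumentException("no getter for " + catalystType);
        }
    }

    public static void main(String[] args) {
        Object b = makeGetter("ByteType").apply(42);
        System.out.println(b.getClass().getSimpleName()); // Byte
    }
}
```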

Note: these issues were found when reading/writing with SQL Server. A PR to fix 
these mappings in MSSQLServerDialect will be submitted soon.

Error seen when writing a table:

(JDBC Write failed,com.microsoft.sqlserver.jdbc.SQLServerException: Column, 
parameter, or variable #2: *Cannot find data type BYTE*.)
com.microsoft.sqlserver.jdbc.SQLServerException: Column, parameter, or variable 
#2: Cannot find data type BYTE.
com.microsoft.sqlserver.jdbc.SQLServerException.makeFromDatabaseError(SQLServerException.java:254)
com.microsoft.sqlserver.jdbc.SQLServerStatement.getNextResult(SQLServerStatement.java:1608)
com.microsoft.sqlserver.jdbc.SQLServerStatement.doExecuteStatement(SQLServerStatement.java:859)
 .. 

  was:
##ByteType issue
Writing a dataframe with a column of type ByteType fails when using the JDBC connector for 
SQL Server. Append and read of tables also fail. The problem is due to:

1. (Write path) Incorrect mapping of ByteType in getCommonJDBCType() in 
JdbcUtils.scala, where ByteType gets mapped to the text "BYTE". It should be mapped to 
TINYINT:
{code:scala}
case ByteType => Option(JdbcType("BYTE", java.sql.Types.TINYINT))
{code}

In getCatalystType() (the JDBC-to-Catalyst type mapping), TINYINT is mapped to 
IntegerType, while it should be mapped to ByteType. Mapping to IntegerType is acceptable 
from the point of view of upcasting, but leads to a 4-byte allocation rather than 
1 byte per ByteType value.



2. (Read path) The read path ends up calling makeGetter(dt: DataType, metadata: 
Metadata), which sets the value in the RDD row according to the data type. There is 
no mapping for ByteType here, so reads will fail with an error once 
getCatalystType() is fixed.

Note: these issues were found when reading/writing with SQL Server. A PR to fix 
these mappings in MSSQLServerDialect will be submitted soon.

Error seen when writing a table:

(JDBC Write failed,com.microsoft.sqlserver.jdbc.SQLServerException: Column, 
parameter, or variable #2: *Cannot find data type BYTE*.)
com.microsoft.sqlserver.jdbc.SQLServerException: Column, parameter, or variable 
#2: Cannot find data type BYTE.
com.microsoft.sqlserver.jdbc.SQLServerException.makeFromDatabaseError(SQLServerException.java:254)
com.microsoft.sqlserver.jdbc.SQLServerStatement.getNextResult(SQLServerStatement.java:1608)
com.microsoft.sqlserver.jdbc.SQLServerStatement.doExecuteStatement(SQLServerStatement.java:859)
 ..

##ShortType and FloatType issue
ShortType and FloatType are not correctly mapped to the right JDBC types when 
using the JDBC connector. This results in tables and Spark dataframes being created 
with unintended types.

Some example issues:

    A write from a dataframe with a ShortType column results in a SQL table with 
column type INTEGER as opposed to SMALLINT, and thus a larger table than expected.
    A read results in a dataframe with type IntegerType as opposed to ShortType.

FloatType has an issue on the read path. On the write path the Spark data type 
FloatType is correctly mapped to the JDBC equivalent REAL. But on the read path, 
when JDBC data types are converted to Catalyst data types 
(getCatalystType()), REAL gets incorrectly mapped to DoubleType rather 
than FloatType.
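As an illustrative sketch of the intended correspondences (simplified names, not Spark's dialect API): SMALLINT should round-trip with ShortType, and REAL with FloatType, since SQL Server's REAL is a 4-byte float:

```java
import java.sql.Types;

// Illustrative read-path mapping; names are stand-ins, not Spark's API.
public class ShortFloatMapping {

    static String toCatalystName(int jdbcType) {
        switch (jdbcType) {
            case Types.SMALLINT: return "ShortType";  // was IntegerType
            case Types.REAL:     return "FloatType";  // was DoubleType
            case Types.DOUBLE:   return "DoubleType";
            default:             return null;
        }
    }

    public static void main(String[] args) {
        System.out.println(toCatalystName(Types.REAL)); // FloatType
    }
}
```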


> [JDBC Connector] ByteType is not correctly mapped for read/write of SQLServer 
> tables
> ------------------------------------------------------------------------------------
>
>                 Key: SPARK-28151
>                 URL: https://issues.apache.org/jira/browse/SPARK-28151
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 3.0.0, 2.4.3
>            Reporter: Shiv Prashant Sood
>            Priority: Minor
>



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
