shivsood opened a new pull request #24969: [SPARK-28151] ByteType is not correctly mapped for read/write of SQLServer tables
URL: https://github.com/apache/spark/pull/24969

Fix for ByteType not being correctly mapped for read/write of SQLServer tables.

Issues: Writing a dataframe with a column of type ByteType fails when using the JDBC connector for SQL Server. Appending to and reading from tables also fail. The problem is due to:

1. (Write path) Incorrect mapping of ByteType in getCommonJDBCType() in JdbcUtils.scala, where ByteType gets mapped to the type name BYTE. It should be mapped to TINYINT. The current mapping is `case ByteType => Option(JdbcType("BYTE", java.sql.Types.TINYINT))`. In getCatalystType() (JDBC to Catalyst type mapping), TINYINT is mapped to INTEGER, while it should be mapped to ByteType. Mapping to integer is OK from the point of view of upcasting, but leads to a 4-byte allocation rather than 1 byte for ByteType.

2. (Read path) The read path ends up calling makeGetter(dt: DataType, metadata: Metadata). This function sets the value in the RDD row according to the data type. There is no case for ByteType there, so reads will fail with an error once getCatalystType() is fixed.

Note: These issues were found when reading/writing with SQLServer.

Error seen when writing a table:

(JDBC Write failed,com.microsoft.sqlserver.jdbc.SQLServerException: Column, parameter, or variable #2: Cannot find data type BYTE.)
com.microsoft.sqlserver.jdbc.SQLServerException: Column, parameter, or variable #2: Cannot find data type BYTE.
com.microsoft.sqlserver.jdbc.SQLServerException.makeFromDatabaseError(SQLServerException.java:254)
com.microsoft.sqlserver.jdbc.SQLServerStatement.getNextResult(SQLServerStatement.java:1608)
com.microsoft.sqlserver.jdbc.SQLServerStatement.doExecuteStatement(SQLServerStatement.java:859)
..

## What changes were proposed in this pull request?
(MsSqlServerDialect.scala) ByteType is mapped to "TINYINT".

(JdbcUtils.scala) Reading of ByteType is added to makeGetter, to enable reading after ByteType is translated to TINYINT.

## How was this patch tested?
Unit test: added and passed.

Integration test: tested end to end with SQLServer using the JDBC connector.
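
The type mapping described in this pull request can be illustrated with a small standalone sketch. It does not use Spark itself: `DataType`, `ByteType`, `IntegerType` and `JdbcType` below are local stand-ins for the Spark classes of the same names, and `ByteTypeMappingSketch`, `commonJdbcType`, `sqlServerJdbcType`, `resolveJdbcType` and `sqlServerCatalystType` are hypothetical names chosen for illustration. The sketch assumes a dialect-specific mapping is consulted first and the common mapping is used as a fallback.

```scala
import java.sql.Types

object ByteTypeMappingSketch {
  // Local stand-ins for Spark's Catalyst types (assumption: not the real classes).
  sealed trait DataType
  case object ByteType extends DataType
  case object IntegerType extends DataType

  // Local stand-in for Spark's JdbcType: a column type name plus a java.sql.Types code.
  final case class JdbcType(databaseTypeDefinition: String, jdbcNullType: Int)

  // Write path, common mapping: ByteType yields the type name "BYTE",
  // which SQL Server rejects ("Cannot find data type BYTE").
  def commonJdbcType(dt: DataType): Option[JdbcType] = dt match {
    case ByteType    => Some(JdbcType("BYTE", Types.TINYINT))
    case IntegerType => Some(JdbcType("INTEGER", Types.INTEGER))
  }

  // Write path, SQL Server override: ByteType yields "TINYINT" instead.
  def sqlServerJdbcType(dt: DataType): Option[JdbcType] = dt match {
    case ByteType => Some(JdbcType("TINYINT", Types.TINYINT))
    case _        => None // no override: defer to the common mapping
  }

  // Dialect first, common mapping as the fallback.
  def resolveJdbcType(dt: DataType): Option[JdbcType] =
    sqlServerJdbcType(dt).orElse(commonJdbcType(dt))

  // Read path: map JDBC TINYINT back to ByteType rather than IntegerType.
  def sqlServerCatalystType(sqlType: Int): Option[DataType] = sqlType match {
    case Types.TINYINT => Some(ByteType)
    case _             => None // no override for other JDBC types
  }
}
```

With these definitions, `ByteTypeMappingSketch.resolveJdbcType(ByteTypeMappingSketch.ByteType)` yields a `JdbcType` whose type name is `TINYINT`, while types without an override, such as `IntegerType`, still go through the common mapping.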
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]

With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]
