Re: [PR] [SPARK-47945][SQL] MsSQLServer: Document Mapping Spark SQL Data Types from Microsoft SQL Server and add tests [spark]
yaooqinn closed pull request #46173: [SPARK-47945][SQL] MsSQLServer: Document Mapping Spark SQL Data Types from Microsoft SQL Server and add tests
URL: https://github.com/apache/spark/pull/46173

--
This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org
For queries about this service, please contact Infrastructure at: us...@infra.apache.org
Re: [PR] [SPARK-47945][SQL] MsSQLServer: Document Mapping Spark SQL Data Types from Microsoft SQL Server and add tests [spark]
yaooqinn commented on PR #46173:
URL: https://github.com/apache/spark/pull/46173#issuecomment-2071563284

Thank you @dongjoon-hyun. Merged to master.
Re: [PR] [SPARK-47945][SQL] MsSQLServer: Document Mapping Spark SQL Data Types from Microsoft SQL Server and add tests [spark]
dongjoon-hyun commented on code in PR #46173:
URL: https://github.com/apache/spark/pull/46173#discussion_r1575686088

## docs/sql-data-sources-jdbc.md:
@@ -1441,3 +1441,192 @@
The Spark Catalyst data types below are not supported with suitable Oracle types

- NullType
- ObjectType
- VariantType

### Mapping Spark SQL Data Types from Microsoft SQL Server

The table below describes the data type conversions from Microsoft SQL Server data types to Spark SQL data types, when reading data from a Microsoft SQL Server table using the built-in JDBC data source with mssql-jdbc as the activated JDBC driver.

| SQL Server Data Type | Spark SQL Data Type | Remarks |
|---|---|---|
| bit | BooleanType | |
| tinyint | ShortType | |
| smallint | ShortType | |
| int | IntegerType | |
| bigint | LongType | |
| float(p), real | FloatType | 1 ≤ p ≤ 24 |
| float[(p)] | DoubleType | 25 ≤ p ≤ 53 |
| double precision | DoubleType | |
| smallmoney | DecimalType(10, 4) | |
| money | DecimalType(19, 4) | |
| decimal[(p[, s])], numeric[(p[, s])] | DecimalType(p, s) | |
| date | DateType | |
| datetime | TimestampType | (Default) preferTimestampNTZ=false or spark.sql.timestampType=TIMESTAMP_LTZ |
| datetime | TimestampNTZType | preferTimestampNTZ=true or spark.sql.timestampType=TIMESTAMP_NTZ |
| datetime2 [ (fractional seconds precision) ] | TimestampType | (Default) preferTimestampNTZ=false or spark.sql.timestampType=TIMESTAMP_LTZ |
| datetime2 [ (fractional seconds precision) ] | TimestampNTZType | preferTimestampNTZ=true or spark.sql.timestampType=TIMESTAMP_NTZ |
| datetimeoffset [ (fractional seconds precision) ] | StringType | |

Review Comment (on the datetimeoffset → StringType row):
Ya, let's take a look at that separately.
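The float(p) rows of the quoted table encode a precision cutoff: SQL Server treats float(p) as single precision for 1 ≤ p ≤ 24 and double precision for 25 ≤ p ≤ 53, which the documented mapping turns into FloatType and DoubleType respectively. A minimal Python sketch of that rule (the function name is invented for illustration; the real mapping lives in MsSqlServerDialect.scala):

```python
def spark_type_for_sqlserver_float(p: int) -> str:
    """Illustrative only: mirrors the float(p)/real rows of the mapping table."""
    if 1 <= p <= 24:
        return "FloatType"      # stored as 4-byte float by SQL Server
    if 25 <= p <= 53:
        return "DoubleType"     # stored as 8-byte float by SQL Server
    raise ValueError(f"float precision out of range: {p}")

print(spark_type_for_sqlserver_float(24))  # FloatType
print(spark_type_for_sqlserver_float(25))  # DoubleType
```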
Re: [PR] [SPARK-47945][SQL] MsSQLServer: Document Mapping Spark SQL Data Types from Microsoft SQL Server and add tests [spark]
dongjoon-hyun commented on code in PR #46173:
URL: https://github.com/apache/spark/pull/46173#discussion_r1575684362

## connector/docker-integration-tests/src/test/scala/org/apache/spark/sql/jdbc/MsSqlServerIntegrationSuite.scala:
@@ -265,7 +273,7 @@ class MsSqlServerIntegrationSuite extends DockerJDBCIntegrationSuite {
   assert(types(7).equals("class java.lang.String"))
   assert(types(8).equals("class [B"))
   assert(row.getString(0).length == 10)
-  assert(row.getString(0).trim.equals("the"))
+  assert(row.getString(0).equals("the".padTo(10, ' ')))

Review Comment:
Thanks. Yes, that would be safe.
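The assertion change under discussion assumes a CHAR(10) column comes back space-padded to its declared length, so the fixed value can be compared exactly instead of after trimming. A Python analogue of the Scala `"the".padTo(10, ' ')` expectation (illustrative only, not the test suite's code):

```python
# CHAR(n) values are read back space-padded to length n; here "the" in CHAR(10).
value = "the".ljust(10)  # Python analogue of Scala's "the".padTo(10, ' ')

assert len(value) == 10            # declared column width preserved
assert value == "the       "       # literal plus 7 trailing spaces
assert value.strip() == "the"      # trimming recovers the original literal
```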
Re: [PR] [SPARK-47945][SQL] MsSQLServer: Document Mapping Spark SQL Data Types from Microsoft SQL Server and add tests [spark]
yaooqinn commented on code in PR #46173:
URL: https://github.com/apache/spark/pull/46173#discussion_r1575669023

## connector/docker-integration-tests/src/test/scala/org/apache/spark/sql/jdbc/MsSqlServerIntegrationSuite.scala:
@@ -265,7 +273,7 @@ class MsSqlServerIntegrationSuite extends DockerJDBCIntegrationSuite {
   assert(types(7).equals("class java.lang.String"))
   assert(types(8).equals("class [B"))
   assert(row.getString(0).length == 10)
-  assert(row.getString(0).trim.equals("the"))
+  assert(row.getString(0).equals("the".padTo(10, ' ')))

Review Comment:
Anyway, I removed this LOC.
Re: [PR] [SPARK-47945][SQL] MsSQLServer: Document Mapping Spark SQL Data Types from Microsoft SQL Server and add tests [spark]
yaooqinn commented on code in PR #46173:
URL: https://github.com/apache/spark/pull/46173#discussion_r1575667729

## connector/docker-integration-tests/src/test/scala/org/apache/spark/sql/jdbc/MsSqlServerIntegrationSuite.scala:
@@ -265,7 +273,7 @@ class MsSqlServerIntegrationSuite extends DockerJDBCIntegrationSuite {
   assert(types(7).equals("class java.lang.String"))
   assert(types(8).equals("class [B"))
   assert(row.getString(0).length == 10)
-  assert(row.getString(0).trim.equals("the"))
+  assert(row.getString(0).equals("the".padTo(10, ' ')))

Review Comment:
Char padding is ANSI-irrelevant, AFAIK.
Re: [PR] [SPARK-47945][SQL] MsSQLServer: Document Mapping Spark SQL Data Types from Microsoft SQL Server and add tests [spark]
yaooqinn commented on code in PR #46173:
URL: https://github.com/apache/spark/pull/46173#discussion_r1575666204

## docs/sql-data-sources-jdbc.md:
@@ -1441,3 +1441,192 @@
The Spark Catalyst data types below are not supported with suitable Oracle types

- NullType
- ObjectType
- VariantType

### Mapping Spark SQL Data Types from Microsoft SQL Server

The table below describes the data type conversions from Microsoft SQL Server data types to Spark SQL data types, when reading data from a Microsoft SQL Server table using the built-in JDBC data source with mssql-jdbc as the activated JDBC driver.

| SQL Server Data Type | Spark SQL Data Type | Remarks |
|---|---|---|
| bit | BooleanType | |
| tinyint | ShortType | |
| smallint | ShortType | |
| int | IntegerType | |
| bigint | LongType | |
| float(p), real | FloatType | 1 ≤ p ≤ 24 |
| float[(p)] | DoubleType | 25 ≤ p ≤ 53 |
| double precision | DoubleType | |
| smallmoney | DecimalType(10, 4) | |
| money | DecimalType(19, 4) | |
| decimal[(p[, s])], numeric[(p[, s])] | DecimalType(p, s) | |
| date | DateType | |
| datetime | TimestampType | (Default) preferTimestampNTZ=false or spark.sql.timestampType=TIMESTAMP_LTZ |
| datetime | TimestampNTZType | preferTimestampNTZ=true or spark.sql.timestampType=TIMESTAMP_NTZ |
| datetime2 [ (fractional seconds precision) ] | TimestampType | (Default) preferTimestampNTZ=false or spark.sql.timestampType=TIMESTAMP_LTZ |
| datetime2 [ (fractional seconds precision) ] | TimestampNTZType | preferTimestampNTZ=true or spark.sql.timestampType=TIMESTAMP_NTZ |
| datetimeoffset [ (fractional seconds precision) ] | StringType | |

Review Comment (on the datetimeoffset → StringType row):
https://github.com/apache/spark/blob/9d715ba491710969340d9e8a49a21d11f51ef7d3/sql/core/src/main/scala/org/apache/spark/sql/jdbc/MsSqlServerDialect.scala#L112-L114

This comment appears not to be true, as we use mssql-jdbc.
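The datetime/datetime2 rows of the quoted table hinge on two settings: the JDBC read option preferTimestampNTZ and the session conf spark.sql.timestampType (where the option, if unset, follows the conf). A hedged Python sketch of that selection rule as documented, under the assumption that preferTimestampNTZ defaults from the session conf; the function is invented for illustration and is not Spark API:

```python
from typing import Optional

def spark_timestamp_type(prefer_timestamp_ntz: Optional[bool] = None,
                         session_timestamp_type: str = "TIMESTAMP_LTZ") -> str:
    """Illustrative only: which Spark type a SQL Server datetime/datetime2
    column maps to, per the table quoted above. Assumes the unset JDBC option
    inherits its default from spark.sql.timestampType."""
    if prefer_timestamp_ntz is None:
        prefer_timestamp_ntz = (session_timestamp_type == "TIMESTAMP_NTZ")
    return "TimestampNTZType" if prefer_timestamp_ntz else "TimestampType"

print(spark_timestamp_type())                                    # TimestampType
print(spark_timestamp_type(prefer_timestamp_ntz=True))           # TimestampNTZType
print(spark_timestamp_type(session_timestamp_type="TIMESTAMP_NTZ"))  # TimestampNTZType
```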
Re: [PR] [SPARK-47945][SQL] MsSQLServer: Document Mapping Spark SQL Data Types from Microsoft SQL Server and add tests [spark]
dongjoon-hyun commented on code in PR #46173:
URL: https://github.com/apache/spark/pull/46173#discussion_r1575663710

## connector/docker-integration-tests/src/test/scala/org/apache/spark/sql/jdbc/MsSqlServerIntegrationSuite.scala:
@@ -265,7 +273,7 @@ class MsSqlServerIntegrationSuite extends DockerJDBCIntegrationSuite {
   assert(types(7).equals("class java.lang.String"))
   assert(types(8).equals("class [B"))
   assert(row.getString(0).length == 10)
-  assert(row.getString(0).trim.equals("the"))
+  assert(row.getString(0).equals("the".padTo(10, ' ')))

Review Comment:
This seems to be the ANSI result. Is this safe in non-ANSI CI, @yaooqinn?
- https://github.com/apache/spark/actions/workflows/build_non_ansi.yml