Re: [PR] [SPARK-47945][SQL] MsSQLServer: Document Mapping Spark SQL Data Types from Microsoft SQL Server and add tests [spark]

2024-04-23 Thread via GitHub


yaooqinn closed pull request #46173: [SPARK-47945][SQL] MsSQLServer: Document 
Mapping Spark SQL Data Types from Microsoft SQL Server and add tests
URL: https://github.com/apache/spark/pull/46173





Re: [PR] [SPARK-47945][SQL] MsSQLServer: Document Mapping Spark SQL Data Types from Microsoft SQL Server and add tests [spark]

2024-04-23 Thread via GitHub


yaooqinn commented on PR #46173:
URL: https://github.com/apache/spark/pull/46173#issuecomment-2071563284

   Thank you @dongjoon-hyun 
   
   Merged to master





Re: [PR] [SPARK-47945][SQL] MsSQLServer: Document Mapping Spark SQL Data Types from Microsoft SQL Server and add tests [spark]

2024-04-23 Thread via GitHub


dongjoon-hyun commented on code in PR #46173:
URL: https://github.com/apache/spark/pull/46173#discussion_r1575686088


##
docs/sql-data-sources-jdbc.md:
##
@@ -1441,3 +1441,192 @@ The Spark Catalyst data types below are not supported with suitable Oracle types
 - NullType
 - ObjectType
 - VariantType
+
+### Mapping Spark SQL Data Types from Microsoft SQL Server
+
+The below table describes the data type conversions from Microsoft SQL Server data types to Spark SQL Data Types, when reading data from a Microsoft SQL Server table using the built-in jdbc data source with the mssql-jdbc as the activated JDBC Driver.
+
+| SQL Server Data Type | Spark SQL Data Type | Remarks |
+|---|---|---|
+| bit | BooleanType | |
+| tinyint | ShortType | |
+| smallint | ShortType | |
+| int | IntegerType | |
+| bigint | LongType | |
+| float(p), real | FloatType | 1 ≤ p ≤ 24 |
+| float[(p)] | DoubleType | 25 ≤ p ≤ 53 |
+| double precision | DoubleType | |
+| smallmoney | DecimalType(10, 4) | |
+| money | DecimalType(19, 4) | |
+| decimal[(p[, s])], numeric[(p[, s])] | DecimalType(p, s) | |
+| date | DateType | |
+| datetime | TimestampType | (Default) preferTimestampNTZ=false or spark.sql.timestampType=TIMESTAMP_LTZ |
+| datetime | TimestampNTZType | preferTimestampNTZ=true or spark.sql.timestampType=TIMESTAMP_NTZ |
+| datetime2 [ (fractional seconds precision) ] | TimestampType | (Default) preferTimestampNTZ=false or spark.sql.timestampType=TIMESTAMP_LTZ |
+| datetime2 [ (fractional seconds precision) ] | TimestampNTZType | preferTimestampNTZ=true or spark.sql.timestampType=TIMESTAMP_NTZ |
+| datetimeoffset [ (fractional seconds precision) ] | StringType | |

Review Comment:
   Ya, let's take a look at that separately.
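   [Editor's note] For context on the mapping table quoted above, here is a minimal, hedged sketch of reading a SQL Server table through the built-in JDBC data source. The connection URL, database, table name, and credentials are placeholders, not values from this PR; the `preferTimestampNTZ` option is the switch referenced in the Remarks column that decides whether `datetime`/`datetime2` columns arrive as TimestampType or TimestampNTZType.

   ```scala
   import org.apache.spark.sql.SparkSession

   object MsSqlServerReadSketch {
     def main(args: Array[String]): Unit = {
       val spark = SparkSession.builder()
         .appName("mssql-jdbc-type-mapping-sketch")
         .getOrCreate()

       // Placeholder connection details -- adjust to your own SQL Server instance.
       val df = spark.read
         .format("jdbc")
         .option("url", "jdbc:sqlserver://localhost:1433;databaseName=testdb")
         .option("dbtable", "dbo.some_table")
         .option("user", "sa")
         .option("password", "********")
         // When true, datetime/datetime2 columns are read as TimestampNTZType
         // instead of the default TimestampType (see the Remarks column above).
         .option("preferTimestampNTZ", "true")
         .load()

       // Printing the schema shows the Spark SQL types chosen for each column.
       df.printSchema()
     }
   }
   ```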






Re: [PR] [SPARK-47945][SQL] MsSQLServer: Document Mapping Spark SQL Data Types from Microsoft SQL Server and add tests [spark]

2024-04-23 Thread via GitHub


dongjoon-hyun commented on code in PR #46173:
URL: https://github.com/apache/spark/pull/46173#discussion_r1575684362


##
connector/docker-integration-tests/src/test/scala/org/apache/spark/sql/jdbc/MsSqlServerIntegrationSuite.scala:
##
@@ -265,7 +273,7 @@ class MsSqlServerIntegrationSuite extends DockerJDBCIntegrationSuite {
     assert(types(7).equals("class java.lang.String"))
     assert(types(8).equals("class [B"))
     assert(row.getString(0).length == 10)
-    assert(row.getString(0).trim.equals("the"))
+    assert(row.getString(0).equals("the".padTo(10, ' ')))

Review Comment:
   Thanks. Yes, that would be safe.






Re: [PR] [SPARK-47945][SQL] MsSQLServer: Document Mapping Spark SQL Data Types from Microsoft SQL Server and add tests [spark]

2024-04-22 Thread via GitHub


yaooqinn commented on code in PR #46173:
URL: https://github.com/apache/spark/pull/46173#discussion_r1575669023


##
connector/docker-integration-tests/src/test/scala/org/apache/spark/sql/jdbc/MsSqlServerIntegrationSuite.scala:
##
@@ -265,7 +273,7 @@ class MsSqlServerIntegrationSuite extends DockerJDBCIntegrationSuite {
     assert(types(7).equals("class java.lang.String"))
     assert(types(8).equals("class [B"))
     assert(row.getString(0).length == 10)
-    assert(row.getString(0).trim.equals("the"))
+    assert(row.getString(0).equals("the".padTo(10, ' ')))

Review Comment:
   Anyway, I removed this LOC






Re: [PR] [SPARK-47945][SQL] MsSQLServer: Document Mapping Spark SQL Data Types from Microsoft SQL Server and add tests [spark]

2024-04-22 Thread via GitHub


yaooqinn commented on code in PR #46173:
URL: https://github.com/apache/spark/pull/46173#discussion_r1575667729


##
connector/docker-integration-tests/src/test/scala/org/apache/spark/sql/jdbc/MsSqlServerIntegrationSuite.scala:
##
@@ -265,7 +273,7 @@ class MsSqlServerIntegrationSuite extends DockerJDBCIntegrationSuite {
     assert(types(7).equals("class java.lang.String"))
     assert(types(8).equals("class [B"))
     assert(row.getString(0).length == 10)
-    assert(row.getString(0).trim.equals("the"))
+    assert(row.getString(0).equals("the".padTo(10, ' ')))

Review Comment:
   Char padding is ANSI-irrelevant, AFAIK
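   [Editor's note] A small illustrative sketch of what the revised assertion checks, using plain Scala only. The CHAR(10) column and the value "the" mirror the test quoted above; the blank padding being compared is a property of the fixed-length CHAR(10) value returned by the database/driver, which is the point made here about it being unrelated to ANSI mode.

   ```scala
   object CharPaddingSketch extends App {
     // A CHAR(10) column storing "the" comes back as a fixed-length,
     // blank-padded 10-character string.
     val fromCharColumn = "the".padTo(10, ' ')          // "the       "

     assert(fromCharColumn.length == 10)
     assert(fromCharColumn.equals("the".padTo(10, ' ')))  // padded comparison, as in the new assertion
     assert(fromCharColumn.trim.equals("the"))            // the older, trim-based comparison

     println(s"[$fromCharColumn]")                        // prints: [the       ]
   }
   ```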






Re: [PR] [SPARK-47945][SQL] MsSQLServer: Document Mapping Spark SQL Data Types from Microsoft SQL Server and add tests [spark]

2024-04-22 Thread via GitHub


yaooqinn commented on code in PR #46173:
URL: https://github.com/apache/spark/pull/46173#discussion_r1575666204


##
docs/sql-data-sources-jdbc.md:
##
@@ -1441,3 +1441,192 @@ The Spark Catalyst data types below are not supported with suitable Oracle types
 - NullType
 - ObjectType
 - VariantType
+
+### Mapping Spark SQL Data Types from Microsoft SQL Server
+
+The below table describes the data type conversions from Microsoft SQL Server data types to Spark SQL Data Types, when reading data from a Microsoft SQL Server table using the built-in jdbc data source with the mssql-jdbc as the activated JDBC Driver.
+
+| SQL Server Data Type | Spark SQL Data Type | Remarks |
+|---|---|---|
+| bit | BooleanType | |
+| tinyint | ShortType | |
+| smallint | ShortType | |
+| int | IntegerType | |
+| bigint | LongType | |
+| float(p), real | FloatType | 1 ≤ p ≤ 24 |
+| float[(p)] | DoubleType | 25 ≤ p ≤ 53 |
+| double precision | DoubleType | |
+| smallmoney | DecimalType(10, 4) | |
+| money | DecimalType(19, 4) | |
+| decimal[(p[, s])], numeric[(p[, s])] | DecimalType(p, s) | |
+| date | DateType | |
+| datetime | TimestampType | (Default) preferTimestampNTZ=false or spark.sql.timestampType=TIMESTAMP_LTZ |
+| datetime | TimestampNTZType | preferTimestampNTZ=true or spark.sql.timestampType=TIMESTAMP_NTZ |
+| datetime2 [ (fractional seconds precision) ] | TimestampType | (Default) preferTimestampNTZ=false or spark.sql.timestampType=TIMESTAMP_LTZ |
+| datetime2 [ (fractional seconds precision) ] | TimestampNTZType | preferTimestampNTZ=true or spark.sql.timestampType=TIMESTAMP_NTZ |
+| datetimeoffset [ (fractional seconds precision) ] | StringType | |

Review Comment:
   https://github.com/apache/spark/blob/9d715ba491710969340d9e8a49a21d11f51ef7d3/sql/core/src/main/scala/org/apache/spark/sql/jdbc/MsSqlServerDialect.scala#L112-L114
   
   This comment appears not to be true, since we use mssql-jdbc
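   [Editor's note] A hedged sketch of how one might check the runtime behaviour under discussion, i.e. how a `datetimeoffset` column surfaces in Spark when read through mssql-jdbc. The connection URL and credentials are hypothetical; the asserted type reflects the mapping documented in the table above (datetimeoffset → StringType).

   ```scala
   import org.apache.spark.sql.SparkSession
   import org.apache.spark.sql.types.StringType

   object DatetimeOffsetMappingSketch {
     def main(args: Array[String]): Unit = {
       val spark = SparkSession.builder()
         .appName("datetimeoffset-mapping-sketch")
         .getOrCreate()

       // Hypothetical query producing a single DATETIMEOFFSET(7) column named "ts_off".
       val df = spark.read
         .format("jdbc")
         .option("url", "jdbc:sqlserver://localhost:1433;databaseName=testdb")
         .option("query",
           "SELECT CAST('2024-04-22 10:00:00 +08:00' AS DATETIMEOFFSET(7)) AS ts_off")
         .option("user", "sa")
         .option("password", "********")
         .load()

       // Per the documented mapping, the column should arrive as StringType.
       assert(df.schema("ts_off").dataType == StringType)
       df.show(truncate = false)
     }
   }
   ```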






Re: [PR] [SPARK-47945][SQL] MsSQLServer: Document Mapping Spark SQL Data Types from Microsoft SQL Server and add tests [spark]

2024-04-22 Thread via GitHub


dongjoon-hyun commented on code in PR #46173:
URL: https://github.com/apache/spark/pull/46173#discussion_r1575663710


##
connector/docker-integration-tests/src/test/scala/org/apache/spark/sql/jdbc/MsSqlServerIntegrationSuite.scala:
##
@@ -265,7 +273,7 @@ class MsSqlServerIntegrationSuite extends DockerJDBCIntegrationSuite {
     assert(types(7).equals("class java.lang.String"))
     assert(types(8).equals("class [B"))
     assert(row.getString(0).length == 10)
-    assert(row.getString(0).trim.equals("the"))
+    assert(row.getString(0).equals("the".padTo(10, ' ')))

Review Comment:
   This seems to be ANSI result. Is this safe in non-ANSI CI, @yaooqinn ?
   - https://github.com/apache/spark/actions/workflows/build_non_ansi.yml


