[GitHub] [spark] sadikovi commented on a diff in pull request #37327: [SPARK-39904][SQL] Rename inferDate to preferDate and add check for inferSchema = false

2022-07-31 Thread GitBox


sadikovi commented on code in PR #37327:
URL: https://github.com/apache/spark/pull/37327#discussion_r934085913


##
docs/sql-data-sources-csv.md:
##
@@ -109,9 +109,9 @@ Data source options of CSV can be set via:
 read
   
   
-inferDate 
+inferDate

Review Comment:
   Personally, I would prefer `preferDate` over `prefersDate`, but the latter is consistent with the JSON option `prefersDecimal`, so it is fine.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] [spark] sadikovi commented on a diff in pull request #37327: [SPARK-39904][SQL] Rename inferDate to preferDate and add check for inferSchema = false

2022-07-31 Thread GitBox


sadikovi commented on code in PR #37327:
URL: https://github.com/apache/spark/pull/37327#discussion_r934085422


##
docs/sql-data-sources-csv.md:
##
@@ -109,9 +109,9 @@ Data source options of CSV can be set via:
 read
   
   
-inferDate 
+inferDate

Review Comment:
   Yes, let's change the name to `prefersDate`, since we are not going to throw an exception when schema inference is disabled.
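
For illustration only, here is a minimal sketch of the semantics described in this comment, assuming the flag is silently ignored when `inferSchema` is off, while the existing legacy time parser check still throws. `PrefersDateSketch`, `resolvePrefersDate`, and `TimeParserPolicy` are hypothetical stand-ins for this example and are not Spark's `CSVOptions` API; the exception type and message are likewise assumptions.

```scala
// Hypothetical, self-contained stand-in for the option handling discussed above.
// None of these names come from Spark itself.
object PrefersDateSketch {
  // Stand-in for the legacy time parser policy setting.
  sealed trait TimeParserPolicy
  case object Legacy extends TimeParserPolicy
  case object Corrected extends TimeParserPolicy

  // Reads a boolean option, defaulting to false when the key is absent.
  private def getBool(options: Map[String, String], key: String): Boolean =
    options.get(key).exists(_.trim.equalsIgnoreCase("true"))

  /** Returns the effective value of the flag for the given reader options. */
  def resolvePrefersDate(
      options: Map[String, String],
      policy: TimeParserPolicy): Boolean = {
    val prefersDateFlag = getBool(options, "prefersDate")
    val inferSchemaFlag = getBool(options, "inferSchema")
    if (prefersDateFlag && policy == Legacy) {
      // Assumed error type; mirrors the incompatibility with the legacy parser.
      throw new IllegalArgumentException(
        "prefersDate cannot be combined with the LEGACY time parser policy")
    }
    // No exception when inferSchema is off: the flag simply has no effect.
    prefersDateFlag && inferSchemaFlag
  }

  def main(args: Array[String]): Unit = {
    val both = Map("prefersDate" -> "true", "inferSchema" -> "true")
    val flagOnly = Map("prefersDate" -> "true")
    println(resolvePrefersDate(both, Corrected))     // true
    println(resolvePrefersDate(flagOnly, Corrected)) // false
  }
}
```

With this shape, a caller who sets the flag without schema inference gets ordinary string columns rather than an error, which is the behavior the rename to `prefersDate` is meant to signal.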






[GitHub] [spark] sadikovi commented on a diff in pull request #37327: [SPARK-39904][SQL] Rename inferDate to preferDate and add check for inferSchema = false

2022-07-28 Thread GitBox


sadikovi commented on code in PR #37327:
URL: https://github.com/apache/spark/pull/37327#discussion_r932712356


##
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/csv/CSVOptions.scala:
##
@@ -153,19 +153,24 @@ class CSVOptions(
    * Disabled by default for backwards compatibility and performance. When enabled, date entries in
    * timestamp columns will be cast to timestamp upon parsing. Not compatible with
    * legacyTimeParserPolicy == LEGACY since legacy date parser will accept extra trailing characters
+   *
+   * The flag is only enabled if inferSchema is set to true.
    */
-  val inferDate = {
-    val inferDateFlag = getBool("inferDate")
-    if (SQLConf.get.legacyTimeParserPolicy == LegacyBehaviorPolicy.LEGACY && inferDateFlag) {
+  val preferDate = {
+    val preferDateFlag = getBool("preferDate")
+    if (preferDateFlag && SQLConf.get.legacyTimeParserPolicy == LegacyBehaviorPolicy.LEGACY) {
       throw QueryExecutionErrors.inferDateWithLegacyTimeParserError()
     }
-    inferDateFlag
+    if (preferDateFlag && !inferSchemaFlag) {

Review Comment:
   Okay, I can do that.






[GitHub] [spark] sadikovi commented on a diff in pull request #37327: [SPARK-39904][SQL] Rename inferDate to preferDate and add check for inferSchema = false

2022-07-28 Thread GitBox


sadikovi commented on code in PR #37327:
URL: https://github.com/apache/spark/pull/37327#discussion_r931880374


##
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/csv/CSVOptions.scala:
##
@@ -153,19 +153,24 @@ class CSVOptions(
    * Disabled by default for backwards compatibility and performance. When enabled, date entries in
    * timestamp columns will be cast to timestamp upon parsing. Not compatible with
    * legacyTimeParserPolicy == LEGACY since legacy date parser will accept extra trailing characters
+   *
+   * The flag is only enabled if inferSchema is set to true.
    */
-  val inferDate = {
-    val inferDateFlag = getBool("inferDate")
-    if (SQLConf.get.legacyTimeParserPolicy == LegacyBehaviorPolicy.LEGACY && inferDateFlag) {
+  val preferDate = {
+    val preferDateFlag = getBool("preferDate")
+    if (preferDateFlag && SQLConf.get.legacyTimeParserPolicy == LegacyBehaviorPolicy.LEGACY) {
       throw QueryExecutionErrors.inferDateWithLegacyTimeParserError()
     }
-    inferDateFlag
+    if (preferDateFlag && !inferSchemaFlag) {

Review Comment:
   Okay, @Jonathancui123 would you be able to clarify the semantics of the 
flag? Thanks.






[GitHub] [spark] sadikovi commented on a diff in pull request #37327: [SPARK-39904][SQL] Rename inferDate to preferDate and add check for inferSchema = false

2022-07-28 Thread GitBox


sadikovi commented on code in PR #37327:
URL: https://github.com/apache/spark/pull/37327#discussion_r931877506


##
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/csv/CSVOptions.scala:
##
@@ -153,19 +153,24 @@ class CSVOptions(
    * Disabled by default for backwards compatibility and performance. When enabled, date entries in
    * timestamp columns will be cast to timestamp upon parsing. Not compatible with
    * legacyTimeParserPolicy == LEGACY since legacy date parser will accept extra trailing characters
+   *
+   * The flag is only enabled if inferSchema is set to true.
    */
-  val inferDate = {
-    val inferDateFlag = getBool("inferDate")
-    if (SQLConf.get.legacyTimeParserPolicy == LegacyBehaviorPolicy.LEGACY && inferDateFlag) {
+  val preferDate = {
+    val preferDateFlag = getBool("preferDate")
+    if (preferDateFlag && SQLConf.get.legacyTimeParserPolicy == LegacyBehaviorPolicy.LEGACY) {
       throw QueryExecutionErrors.inferDateWithLegacyTimeParserError()
     }
-    inferDateFlag
+    if (preferDateFlag && !inferSchemaFlag) {

Review Comment:
   Also, I chose `preferDate` by analogy with `inferDate`. Or should we go for `prefersDate`?






[GitHub] [spark] sadikovi commented on a diff in pull request #37327: [SPARK-39904][SQL] Rename inferDate to preferDate and add check for inferSchema = false

2022-07-28 Thread GitBox


sadikovi commented on code in PR #37327:
URL: https://github.com/apache/spark/pull/37327#discussion_r931876296


##
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/csv/CSVOptions.scala:
##
@@ -153,19 +153,24 @@ class CSVOptions(
    * Disabled by default for backwards compatibility and performance. When enabled, date entries in
    * timestamp columns will be cast to timestamp upon parsing. Not compatible with
    * legacyTimeParserPolicy == LEGACY since legacy date parser will accept extra trailing characters
+   *
+   * The flag is only enabled if inferSchema is set to true.
    */
-  val inferDate = {
-    val inferDateFlag = getBool("inferDate")
-    if (SQLConf.get.legacyTimeParserPolicy == LegacyBehaviorPolicy.LEGACY && inferDateFlag) {
+  val preferDate = {
+    val preferDateFlag = getBool("preferDate")
+    if (preferDateFlag && SQLConf.get.legacyTimeParserPolicy == LegacyBehaviorPolicy.LEGACY) {
       throw QueryExecutionErrors.inferDateWithLegacyTimeParserError()
     }
-    inferDateFlag
+    if (preferDateFlag && !inferSchemaFlag) {

Review Comment:
   I followed the flag definition in the docs, which says `inferSchema` must be enabled for this feature to work.


