[GitHub] [spark] SparkQA commented on issue #24486: [SPARK-27592][SQL] Set the bucketed data source table SerDe correctly
SparkQA commented on issue #24486: [SPARK-27592][SQL] Set the bucketed data source table SerDe correctly URL: https://github.com/apache/spark/pull/24486#issuecomment-521537900 **[Test build #109147 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/109147/testReport)** for PR 24486 at commit [`842bd3e`](https://github.com/apache/spark/commit/842bd3ec57a33093a5f47ceb38016ebabf9503e1). This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] 19kka commented on issue #25450: [SPARK-23793][SQL]Handle database names in spark.udf.register()
19kka commented on issue #25450: [SPARK-23793][SQL]Handle database names in spark.udf.register() URL: https://github.com/apache/spark/pull/25450#issuecomment-521537674
> Thank you for your first contribution, @19kka. Could you run `dev/scalastyle` and fix the errors? I saw some violations like [this](https://github.com/apache/spark/pull/25450/files#diff-85fdb913077429ac8e211a3c68375994L24) here.

I'm very sorry that I forgot to check the style. I have now fixed the style errors and added a test to UDFSuite. I read the related register code again and realized that `spark.udf.register()` is responsible for **creating temporary functions**, so I modified the code: if the function name passed to `spark.udf.register()` includes a **database** name, it throws an AnalysisException, e.g.
```scala
spark.udf.register("db.fun1", (x: Long) => x + 1) // throws AnalysisException
```
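The check described in the comment above can be sketched without Spark as follows. This is only an illustration: `TempFunctionNames` and `requireUnqualified` are hypothetical names, `IllegalArgumentException` stands in for Spark's `AnalysisException`, and treating any dot as a database qualifier is a simplifying assumption rather than what the PR necessarily does.

```scala
// Hypothetical stand-in for the validation described above; not Spark's code.
object TempFunctionNames {

  /** Returns the name unchanged if it has no database qualifier, otherwise throws. */
  def requireUnqualified(name: String): String = {
    // Assumption: a dot in the name means "database.function".
    if (name.contains(".")) {
      throw new IllegalArgumentException(
        s"Temporary function name '$name' must not include a database name")
    }
    name
  }
}
```

With this sketch, `TempFunctionNames.requireUnqualified("fun1")` returns the name, while `"db.fun1"` is rejected.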
[GitHub] [spark] AmplabJenkins removed a comment on issue #25460: [SPARK-25474][SQL][Followup] fallback to hdfs when relation table stats is not available
AmplabJenkins removed a comment on issue #25460: [SPARK-25474][SQL][Followup] fallback to hdfs when relation table stats is not available URL: https://github.com/apache/spark/pull/25460#issuecomment-521535225 Can one of the admins verify this patch?
[GitHub] [spark] SparkQA commented on issue #25460: [SPARK-25474][SQL][Followup] fallback to hdfs when relation table stats is not available
SparkQA commented on issue #25460: [SPARK-25474][SQL][Followup] fallback to hdfs when relation table stats is not available URL: https://github.com/apache/spark/pull/25460#issuecomment-521535973 **[Test build #109146 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/109146/testReport)** for PR 25460 at commit [`c7edcb4`](https://github.com/apache/spark/commit/c7edcb4f89e57332f24a7f4994c7e762eecf12df).
[GitHub] [spark] AmplabJenkins removed a comment on issue #25460: [SPARK-25474][SQL][Followup] fallback to hdfs when relation table stats is not available
AmplabJenkins removed a comment on issue #25460: [SPARK-25474][SQL][Followup] fallback to hdfs when relation table stats is not available URL: https://github.com/apache/spark/pull/25460#issuecomment-521535442 Merged build finished. Test PASSed.
[GitHub] [spark] AmplabJenkins removed a comment on issue #25460: [SPARK-25474][SQL][Followup] fallback to hdfs when relation table stats is not available
AmplabJenkins removed a comment on issue #25460: [SPARK-25474][SQL][Followup] fallback to hdfs when relation table stats is not available URL: https://github.com/apache/spark/pull/25460#issuecomment-521535447 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/14215/ Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #25460: [SPARK-25474][SQL][Followup] fallback to hdfs when relation table stats is not available
AmplabJenkins commented on issue #25460: [SPARK-25474][SQL][Followup] fallback to hdfs when relation table stats is not available URL: https://github.com/apache/spark/pull/25460#issuecomment-521535442 Merged build finished. Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #25460: [SPARK-25474][SQL][Followup] fallback to hdfs when relation table stats is not available
AmplabJenkins commented on issue #25460: [SPARK-25474][SQL][Followup] fallback to hdfs when relation table stats is not available URL: https://github.com/apache/spark/pull/25460#issuecomment-521535447 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/14215/ Test PASSed.
[GitHub] [spark] cloud-fan closed pull request #25418: [SPARK-28695][SS] Use CaseInsensitiveMap in KafkaSourceProvider to make source param handling more robust
cloud-fan closed pull request #25418: [SPARK-28695][SS] Use CaseInsensitiveMap in KafkaSourceProvider to make source param handling more robust URL: https://github.com/apache/spark/pull/25418
[GitHub] [spark] shahidki31 commented on a change in pull request #22502: [SPARK-25474][SQL] Support `spark.sql.statistics.fallBackToHdfs` in data source tables
shahidki31 commented on a change in pull request #22502: [SPARK-25474][SQL] Support `spark.sql.statistics.fallBackToHdfs` in data source tables URL: https://github.com/apache/spark/pull/22502#discussion_r314191902
## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/HadoopFsRelation.scala
## @@ -71,7 +70,13 @@ case class HadoopFsRelation(
   override def sizeInBytes: Long = {
     val compressionFactor = sqlContext.conf.fileCompressionFactor
-    (location.sizeInBytes * compressionFactor).toLong
+    val defaultSize = (location.sizeInBytes * compressionFactor).toLong
+    location match {
+      case cfi: CatalogFileIndex if sparkSession.sessionState.conf.fallBackToHdfsForStatsEnabled =>
Review comment: @cloud-fan I have created a followup PR to add the extra condition. Thanks
[GitHub] [spark] AmplabJenkins commented on issue #25460: [SPARK-25474][SQL][Followup] fallback to hdfs when relation table stats is not available
AmplabJenkins commented on issue #25460: [SPARK-25474][SQL][Followup] fallback to hdfs when relation table stats is not available URL: https://github.com/apache/spark/pull/25460#issuecomment-521535225 Can one of the admins verify this patch?
[GitHub] [spark] shahidki31 commented on issue #25460: [SPARK-25474][SQL][Followup] fallback to hdfs when relation table stats is not available
shahidki31 commented on issue #25460: [SPARK-25474][SQL][Followup] fallback to hdfs when relation table stats is not available URL: https://github.com/apache/spark/pull/25460#issuecomment-521534572 cc @cloud-fan @dongjoon-hyun
[GitHub] [spark] dilipbiswal edited a comment on issue #25448: [SPARK-28697][SQL] Invalidate Database/Table names starting with underscore
dilipbiswal edited a comment on issue #25448: [SPARK-28697][SQL] Invalidate Database/Table names starting with underscore URL: https://github.com/apache/spark/pull/25448#issuecomment-521533557
@cloud-fan @dongjoon-hyun @HyukjinKwon Was just checking the DB2 definition of an identifier in [link](https://www.ibm.com/support/knowledgecenter/en/SSEPGG_9.7.0/com.ibm.db2.luw.sql.ref.doc/doc/r720.html). It is defined as follows:
```
An ordinary identifier is an uppercase letter followed by zero or more characters, each of which is an uppercase letter, a digit, or the underscore character. Note that lower case letters can be used when specifying an ordinary identifier, but they are converted to uppercase when processed. An ordinary identifier should not be a reserved word.
```
Hive seems to have allowed a digit as the first character as well.
```
Identifier
    : (Letter | Digit) (Letter | Digit | '_')*
    | {allowQuotedId()}? QuotedIdentifier  /* though at the language level we allow all Identifiers to be QuotedIdentifiers; at the API level only columns are allowed to be of this form */
    | '`' RegexComponent+ '`'
    ;
```
Not sure why Spark allowed "_" as the starting character to begin with. Is it to match some other system?
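To make the difference between the two quoted rules concrete, here is a rough Scala sketch that encodes each as a regular expression. The object and method names are hypothetical, `Letter` is assumed to mean ASCII letters, DB2's uppercase folding is modelled by simply accepting lowercase letters, and the reserved-word and quoted-identifier cases are left out.

```scala
// Illustrative only: approximates the two identifier rules quoted above.
object IdentifierRules {
  // DB2 ordinary identifier: a letter, then letters, digits or underscores.
  // Lowercase is accepted here because DB2 folds it to uppercase.
  private val Db2Ordinary = "[A-Za-z][A-Za-z0-9_]*".r

  // Hive unquoted identifier: a letter or digit, then letters, digits or underscores.
  private val HiveUnquoted = "[A-Za-z0-9][A-Za-z0-9_]*".r

  def isDb2Ordinary(s: String): Boolean = Db2Ordinary.pattern.matcher(s).matches()

  def isHiveUnquoted(s: String): Boolean = HiveUnquoted.pattern.matcher(s).matches()
}
```

Under these assumptions a name such as `_tmp` is rejected by both rules, while `1abc` is accepted only by the Hive rule.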
[GitHub] [spark] AmplabJenkins removed a comment on issue #25458: [SPARK-27931][SQL] Accept 'on' and 'off' as input and trim input for the boolean data type.
AmplabJenkins removed a comment on issue #25458: [SPARK-27931][SQL] Accept 'on' and 'off' as input and trim input for the boolean data type. URL: https://github.com/apache/spark/pull/25458#issuecomment-521533643 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/109139/ Test FAILed.
[GitHub] [spark] shahidki31 opened a new pull request #25460: [SPARK-25474][SQL][Followup] fallback to hdfs when relation table stats is not available
shahidki31 opened a new pull request #25460: [SPARK-25474][SQL][Followup] fallback to hdfs when relation table stats is not available URL: https://github.com/apache/spark/pull/25460
…ts not available
## What changes were proposed in this pull request?
When the table relation stats are not empty, do not fall back to HDFS for size estimation.
## How was this patch tested?
Existing tests
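The condition described in this pull request can be sketched in isolation as below. Everything here is hypothetical (`SizeEstimate`, its parameters, and the thunk used for the HDFS lookup); it only models the guard "fall back to HDFS when the flag is on and no table stats exist" and assumes the default size is used otherwise, which is not confirmed by the PR text.

```scala
// A minimal sketch of the fallback guard; not the actual HadoopFsRelation code.
object SizeEstimate {

  /** Fall back to HDFS only when the flag is enabled and the table has no stats. */
  def shouldFallBackToHdfs(statsDefined: Boolean, fallBackEnabled: Boolean): Boolean =
    fallBackEnabled && !statsDefined

  /**
   * hdfsSize is passed as a function so the (potentially expensive) file system
   * lookup only runs when the fallback is actually taken.
   */
  def sizeInBytes(
      statsDefined: Boolean,
      fallBackEnabled: Boolean,
      hdfsSize: () => Long,
      defaultSize: Long): Long =
    if (shouldFallBackToHdfs(statsDefined, fallBackEnabled)) hdfsSize() else defaultSize
}
```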
[GitHub] [spark] dilipbiswal edited a comment on issue #25448: [SPARK-28697][SQL] Invalidate Database/Table names starting with underscore
dilipbiswal edited a comment on issue #25448: [SPARK-28697][SQL] Invalidate Database/Table names starting with underscore URL: https://github.com/apache/spark/pull/25448#issuecomment-521533557
@cloud-fan @dongjoon-hyun Was just checking the DB2 definition of an identifier in [link](https://www.ibm.com/support/knowledgecenter/en/SSEPGG_9.7.0/com.ibm.db2.luw.sql.ref.doc/doc/r720.html). It is defined as follows:
```
An ordinary identifier is an uppercase letter followed by zero or more characters, each of which is an uppercase letter, a digit, or the underscore character. Note that lower case letters can be used when specifying an ordinary identifier, but they are converted to uppercase when processed. An ordinary identifier should not be a reserved word.
```
Hive seems to have allowed a digit as the first character as well.
```
Identifier
    : (Letter | Digit) (Letter | Digit | '_')*
    | {allowQuotedId()}? QuotedIdentifier  /* though at the language level we allow all Identifiers to be QuotedIdentifiers; at the API level only columns are allowed to be of this form */
    | '`' RegexComponent+ '`'
    ;
```
Not sure why Spark allowed "_" as the starting character to begin with. Is it to match some other system?
[GitHub] [spark] cloud-fan commented on issue #25418: [SPARK-28695][SS] Use CaseInsensitiveMap in KafkaSourceProvider to make source param handling more robust
cloud-fan commented on issue #25418: [SPARK-28695][SS] Use CaseInsensitiveMap in KafkaSourceProvider to make source param handling more robust URL: https://github.com/apache/spark/pull/25418#issuecomment-521534340 thanks, merging to master!
[GitHub] [spark] AmplabJenkins removed a comment on issue #25458: [SPARK-27931][SQL] Accept 'on' and 'off' as input and trim input for the boolean data type.
AmplabJenkins removed a comment on issue #25458: [SPARK-27931][SQL] Accept 'on' and 'off' as input and trim input for the boolean data type. URL: https://github.com/apache/spark/pull/25458#issuecomment-521533637 Merged build finished. Test FAILed.
[GitHub] [spark] dilipbiswal commented on issue #25448: [SPARK-28697][SQL] Invalidate Database/Table names starting with underscore
dilipbiswal commented on issue #25448: [SPARK-28697][SQL] Invalidate Database/Table names starting with underscore URL: https://github.com/apache/spark/pull/25448#issuecomment-521533557
@cloud-fan @dongjoon-hyun Was just checking the DB2 definition of an identifier in [link](https://www.ibm.com/support/knowledgecenter/en/SSEPGG_9.7.0/com.ibm.db2.luw.sql.ref.doc/doc/r720.html). It is defined as follows:
```
An ordinary identifier is an uppercase letter followed by zero or more characters, each of which is an uppercase letter, a digit, or the underscore character. Note that lower case letters can be used when specifying an ordinary identifier, but they are converted to uppercase when processed. An ordinary identifier should not be a reserved word.
```
Hive seems to have allowed a digit as the first character as well.
```
Identifier
    : (Letter | Digit) (Letter | Digit | '_')*
    | {allowQuotedId()}? QuotedIdentifier  /* though at the language level we allow all Identifiers to be QuotedIdentifiers; at the API level only columns are allowed to be of this form */
    | '`' RegexComponent+ '`'
    ;
```
Not sure why Spark allowed "_" as the starting character to begin with. Is it to match some other system?
[GitHub] [spark] AmplabJenkins commented on issue #25458: [SPARK-27931][SQL] Accept 'on' and 'off' as input and trim input for the boolean data type.
AmplabJenkins commented on issue #25458: [SPARK-27931][SQL] Accept 'on' and 'off' as input and trim input for the boolean data type. URL: https://github.com/apache/spark/pull/25458#issuecomment-521533643 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/109139/ Test FAILed.
[GitHub] [spark] AmplabJenkins commented on issue #25458: [SPARK-27931][SQL] Accept 'on' and 'off' as input and trim input for the boolean data type.
AmplabJenkins commented on issue #25458: [SPARK-27931][SQL] Accept 'on' and 'off' as input and trim input for the boolean data type. URL: https://github.com/apache/spark/pull/25458#issuecomment-521533637 Merged build finished. Test FAILed.
[GitHub] [spark] SparkQA removed a comment on issue #25458: [SPARK-27931][SQL] Accept 'on' and 'off' as input and trim input for the boolean data type.
SparkQA removed a comment on issue #25458: [SPARK-27931][SQL] Accept 'on' and 'off' as input and trim input for the boolean data type. URL: https://github.com/apache/spark/pull/25458#issuecomment-521506392 **[Test build #109139 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/109139/testReport)** for PR 25458 at commit [`7d61642`](https://github.com/apache/spark/commit/7d61642860125ff8049578507b6b1143eacad88b).
[GitHub] [spark] SparkQA commented on issue #25458: [SPARK-27931][SQL] Accept 'on' and 'off' as input and trim input for the boolean data type.
SparkQA commented on issue #25458: [SPARK-27931][SQL] Accept 'on' and 'off' as input and trim input for the boolean data type. URL: https://github.com/apache/spark/pull/25458#issuecomment-521533431 **[Test build #109139 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/109139/testReport)** for PR 25458 at commit [`7d61642`](https://github.com/apache/spark/commit/7d61642860125ff8049578507b6b1143eacad88b).
* This patch **fails Spark unit tests**.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] [spark] MaxGekk commented on a change in pull request #25410: [SPARK-28690][SQL] Add `date_part` function for timestamps/dates
MaxGekk commented on a change in pull request #25410: [SPARK-28690][SQL] Add `date_part` function for timestamps/dates URL: https://github.com/apache/spark/pull/25410#discussion_r314189408
## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/datetimeExpressions.scala
## @@ -1963,3 +1963,64 @@ case class Epoch(child: Expression, timeZoneId: Option[String] = None)
     defineCodeGen(ctx, ev, c => s"$dtu.getEpoch($c, $zid)")
   }
 }
+
+@ExpressionDescription(
+  usage = "_FUNC_(field, source) - Extracts a part of the date/timestamp.",
+  arguments = """
+    Arguments:
+      * field - selects which part of the source should be extracted. Supported string values are:
+                ["MILLENNIUM", "CENTURY", "DECADE", "YEAR", "QUARTER", "MONTH",
+                 "WEEK", "DAY", "DAYOFWEEK", "DOW", "ISODOW", "DOY",
+                 "HOUR", "MINUTE", "SECOND"]
+      * source - a date (or timestamp) column from where `field` should be extracted
+  """,
+  examples = """
+    Examples:
+      > SELECT _FUNC_('YEAR', TIMESTAMP '2019-08-12 01:00:00.123456');
+       2019
+      > SELECT _FUNC_('week', timestamp'2019-08-12 01:00:00.123456');
+       33
+      > SELECT _FUNC_('doy', DATE'2019-08-12');
+       224
+  """,
+  since = "3.0.0")
+case class DatePart(field: Expression, source: Expression, child: Expression)
+  extends RuntimeReplaceable {
+
+  def this(field: Expression, source: Expression) {
+    this(field, source, {
+      if (!field.foldable) {
+        throw new AnalysisException("The field parameter needs to be a foldable string value.")
Review comment: According to the PostgreSQL docs https://www.postgresql.org/docs/11/functions-datetime.html#FUNCTIONS-DATETIME-EXTRACT:
> _source_ must be a value expression ... the _**field**_ parameter needs to be **a string value**

Accepting _field_ as an expression is an undocumented feature. We could support that separately if it is needed.
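As a rough illustration of what the `date_part` examples in the diff above compute, the following Scala sketch maps a few of the field names onto `java.time` accessors. It is not the Catalyst implementation: `DatePartSketch` is a hypothetical name, only a subset of fields is covered, and treating `WEEK` as the ISO week of the week-based year is an assumption (it does reproduce the `33` shown for 2019-08-12).

```scala
import java.time.LocalDate
import java.time.temporal.IsoFields
import java.util.Locale

// Hypothetical sketch; field names are matched case-insensitively as in the examples.
object DatePartSketch {
  def datePart(field: String, source: LocalDate): Int =
    field.toUpperCase(Locale.ROOT) match {
      case "YEAR"    => source.getYear
      case "QUARTER" => source.get(IsoFields.QUARTER_OF_YEAR)
      case "MONTH"   => source.getMonthValue
      case "WEEK"    => source.get(IsoFields.WEEK_OF_WEEK_BASED_YEAR) // assumed ISO week
      case "DAY"     => source.getDayOfMonth
      case "DOY"     => source.getDayOfYear
      case other     => throw new IllegalArgumentException(s"Unsupported field: $other")
    }
}
```

For example, `DatePartSketch.datePart("doy", LocalDate.of(2019, 8, 12))` gives 224, matching the documented example.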
[GitHub] [spark] SparkQA commented on issue #25410: [SPARK-28690][SQL] Add `date_part` function for timestamps/dates
SparkQA commented on issue #25410: [SPARK-28690][SQL] Add `date_part` function for timestamps/dates URL: https://github.com/apache/spark/pull/25410#issuecomment-521532339 **[Test build #109145 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/109145/testReport)** for PR 25410 at commit [`1b2c8d4`](https://github.com/apache/spark/commit/1b2c8d4d72394cfd27e8d4e6b0a9291706cd62e5).
[GitHub] [spark] AmplabJenkins removed a comment on issue #25410: [SPARK-28690][SQL] Add `date_part` function for timestamps/dates
AmplabJenkins removed a comment on issue #25410: [SPARK-28690][SQL] Add `date_part` function for timestamps/dates URL: https://github.com/apache/spark/pull/25410#issuecomment-521531847 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/14214/ Test PASSed.
[GitHub] [spark] AmplabJenkins removed a comment on issue #25410: [SPARK-28690][SQL] Add `date_part` function for timestamps/dates
AmplabJenkins removed a comment on issue #25410: [SPARK-28690][SQL] Add `date_part` function for timestamps/dates URL: https://github.com/apache/spark/pull/25410#issuecomment-521531838 Merged build finished. Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #25410: [SPARK-28690][SQL] Add `date_part` function for timestamps/dates
AmplabJenkins commented on issue #25410: [SPARK-28690][SQL] Add `date_part` function for timestamps/dates URL: https://github.com/apache/spark/pull/25410#issuecomment-521531838 Merged build finished. Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #25410: [SPARK-28690][SQL] Add `date_part` function for timestamps/dates
AmplabJenkins commented on issue #25410: [SPARK-28690][SQL] Add `date_part` function for timestamps/dates URL: https://github.com/apache/spark/pull/25410#issuecomment-521531847 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/14214/ Test PASSed.
[GitHub] [spark] MaxGekk commented on a change in pull request #25410: [SPARK-28690][SQL] Add `date_part` function for timestamps/dates
MaxGekk commented on a change in pull request #25410: [SPARK-28690][SQL] Add `date_part` function for timestamps/dates URL: https://github.com/apache/spark/pull/25410#discussion_r314188108
## File path: sql/core/src/test/resources/sql-tests/inputs/pgSQL/timestamp.sql
## @@ -187,12 +187,11 @@ SELECT '' AS date_trunc_week, date_trunc( 'week', timestamp '2004-02-29 15:44:17
 -- WHERE d1 BETWEEN timestamp '1902-01-01'
 --AND timestamp '2038-01-01';
 
--- [SPARK-28420] Date/Time Functions: date_part
--- SELECT '' AS "54", d1 as "timestamp",
---date_part( 'year', d1) AS year, date_part( 'month', d1) AS month,
---date_part( 'day', d1) AS day, date_part( 'hour', d1) AS hour,
---date_part( 'minute', d1) AS minute, date_part( 'second', d1) AS second
---FROM TIMESTAMP_TBL WHERE d1 BETWEEN '1902-01-01' AND '2038-01-01';
+SELECT '' AS `54`, d1 as `timestamp`,
+date_part( 'year', d1) AS `year`, date_part( 'month', d1) AS `month`,
+date_part( 'day', d1) AS `day`, date_part( 'hour', d1) AS `hour`,
+date_part( 'minute', d1) AS `minute`, date_part( 'second', d1) AS `second`
+FROM TIMESTAMP_TBL WHERE d1 BETWEEN '1902-01-01' AND '2038-01-01';
 
 -- SELECT '' AS "54", d1 as "timestamp",
 --date_part( 'quarter', d1) AS quarter, date_part( 'msec', d1) AS msec,
Review comment: I uncommented those 2 queries
[GitHub] [spark] cloud-fan commented on issue #24486: [SPARK-27592][SQL] Set the bucketed data source table SerDe correctly
cloud-fan commented on issue #24486: [SPARK-27592][SQL] Set the bucketed data source table SerDe correctly URL: https://github.com/apache/spark/pull/24486#issuecomment-521531072 LGTM
[GitHub] [spark] cloud-fan commented on a change in pull request #24486: [SPARK-27592][SQL] Set the bucketed data source table SerDe correctly
cloud-fan commented on a change in pull request #24486: [SPARK-27592][SQL] Set the bucketed data source table SerDe correctly URL: https://github.com/apache/spark/pull/24486#discussion_r314187995
## File path: sql/hive/src/test/scala/org/apache/spark/sql/hive/HiveMetastoreCatalogSuite.scala ##
@@ -284,4 +284,40 @@ class DataSourceWithHiveMetastoreCatalogSuite
 }
 }
+
+ test("Set the bucketed data source table SerDe correctly") {
Review comment: Let's include the JIRA ID in the test name.
[GitHub] [spark] MaxGekk commented on a change in pull request #25410: [SPARK-28690][SQL] Add `date_part` function for timestamps/dates
MaxGekk commented on a change in pull request #25410: [SPARK-28690][SQL] Add `date_part` function for timestamps/dates URL: https://github.com/apache/spark/pull/25410#discussion_r314188026
## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/datetimeExpressions.scala ##
@@ -1963,3 +1963,64 @@ case class Epoch(child: Expression, timeZoneId: Option[String] = None)
 defineCodeGen(ctx, ev, c => s"$dtu.getEpoch($c, $zid)")
 }
 }
+
+@ExpressionDescription(
+ usage = "_FUNC_(field, source) - Extracts a part of the date/timestamp.",
+ arguments = """
+Arguments:
+ * field - selects which part of the source should be extracted. Supported string values are:
+["MILLENNIUM", "CENTURY", "DECADE", "YEAR", "QUARTER", "MONTH",
+ "WEEK", "DAY", "DAYOFWEEK", "DOW", "ISODOW", "DOY",
+ "HOUR", "MINUTE", "SECOND"]
Review comment: I documented all values for consistency.
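For readers unfamiliar with the field names listed in the `arguments` text above, here is a minimal, self-contained sketch of what a few of them extract. It is not the implementation under review: `DatePartSketch` and `datePart` are hypothetical names, the sketch relies on `java.time` rather than Catalyst expressions, and it covers only a subset of the documented fields (the remaining ones return `None`).

```scala
import java.time.LocalDateTime
import java.util.Locale

// Hypothetical illustration only; not the code from pull request #25410.
object DatePartSketch {
  // Returns the requested part of `ts`, or None for fields this sketch does not cover.
  def datePart(field: String, ts: LocalDateTime): Option[Int] =
    field.toUpperCase(Locale.ROOT) match {
      case "YEAR"    => Some(ts.getYear)
      case "QUARTER" => Some((ts.getMonthValue - 1) / 3 + 1) // months 1-3 map to quarter 1
      case "MONTH"   => Some(ts.getMonthValue)
      case "DAY"     => Some(ts.getDayOfMonth)
      case "DOY"     => Some(ts.getDayOfYear)
      case "HOUR"    => Some(ts.getHour)
      case "MINUTE"  => Some(ts.getMinute)
      case "SECOND"  => Some(ts.getSecond)
      case _         => None
    }
}
```

The field name is matched case-insensitively here as a convenience of the sketch; whether the real function does the same is not stated in this thread.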
[GitHub] [spark] wangyum commented on issue #24486: [SPARK-27592][SQL] Set the bucketed data source table SerDe correctly
wangyum commented on issue #24486: [SPARK-27592][SQL] Set the bucketed data source table SerDe correctly URL: https://github.com/apache/spark/pull/24486#issuecomment-521528867 ping @cloud-fan
[GitHub] [spark] cloud-fan commented on issue #24715: [SPARK-25474][SQL] Data source tables support fallback to HDFS for size estimation
cloud-fan commented on issue #24715: [SPARK-25474][SQL] Data source tables support fallback to HDFS for size estimation URL: https://github.com/apache/spark/pull/24715#issuecomment-521528661 The idea LGTM, can you rebase this PR?
[GitHub] [spark] cloud-fan commented on a change in pull request #22502: [SPARK-25474][SQL] Support `spark.sql.statistics.fallBackToHdfs` in data source tables
cloud-fan commented on a change in pull request #22502: [SPARK-25474][SQL] Support `spark.sql.statistics.fallBackToHdfs` in data source tables URL: https://github.com/apache/spark/pull/22502#discussion_r314185683
## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/HadoopFsRelation.scala ##
@@ -71,7 +70,13 @@ case class HadoopFsRelation(
 override def sizeInBytes: Long = {
 val compressionFactor = sqlContext.conf.fileCompressionFactor
-(location.sizeInBytes * compressionFactor).toLong
+val defaultSize = (location.sizeInBytes * compressionFactor).toLong
+location match {
+ case cfi: CatalogFileIndex if sparkSession.sessionState.conf.fallBackToHdfsForStatsEnabled =>
Review comment: IIUC, the issue in this PR is that we always fall back to HDFS stats even if table stats are available.
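The concern in the review comment above is about precedence between the available size sources. The following is a minimal sketch of one possible ordering, assuming that catalog statistics should win over a file-system scan. All names (`SizeEstimateSketch`, `estimate`, and its parameters) are hypothetical and do not come from Spark; only the idea of a config-gated fallback is taken from the diff.

```scala
// Hypothetical model of the precedence discussed in this thread; not Spark code.
object SizeEstimateSketch {
  def estimate(
      catalogStats: Option[Long], // size recorded in table statistics, if any
      fallBackToHdfs: Boolean,    // stands in for spark.sql.statistics.fallBackToHdfs
      hdfsSize: () => Long,       // size computed from the file system (possibly expensive)
      defaultSize: Long): Long =
    catalogStats match {
      case Some(size)             => size        // table stats exist: use them, skip the scan
      case None if fallBackToHdfs => hdfsSize()  // no stats and fallback enabled: scan
      case None                   => defaultSize // no stats and fallback disabled
    }
}
```

Passing the file-system size as a function keeps the potentially expensive scan from running when table statistics are already available.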
[GitHub] [spark] cloud-fan commented on issue #25448: [SPARK-28697][SQL] Invalidate Database/Table names starting with underscore
cloud-fan commented on issue #25448: [SPARK-28697][SQL] Invalidate Database/Table names starting with underscore URL: https://github.com/apache/spark/pull/25448#issuecomment-521527717 Wait, does a table name starting with `_` currently work in Spark? From SPARK-19059 it seems to be supported, but from SPARK-28697 it seems not.
[GitHub] [spark] SparkQA commented on issue #25368: [SPARK-28635][SQL] create CatalogManager to track registered v2 catalogs
SparkQA commented on issue #25368: [SPARK-28635][SQL] create CatalogManager to track registered v2 catalogs URL: https://github.com/apache/spark/pull/25368#issuecomment-521525967 **[Test build #109144 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/109144/testReport)** for PR 25368 at commit [`45cbbd0`](https://github.com/apache/spark/commit/45cbbd04408251e14a9157d1a5b93ae6a8e91401).
[GitHub] [spark] AmplabJenkins removed a comment on issue #25368: [SPARK-28635][SQL] create CatalogManager to track registered v2 catalogs
AmplabJenkins removed a comment on issue #25368: [SPARK-28635][SQL] create CatalogManager to track registered v2 catalogs URL: https://github.com/apache/spark/pull/25368#issuecomment-521525502 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/14213/ Test PASSed.
[GitHub] [spark] AmplabJenkins removed a comment on issue #25368: [SPARK-28635][SQL] create CatalogManager to track registered v2 catalogs
AmplabJenkins removed a comment on issue #25368: [SPARK-28635][SQL] create CatalogManager to track registered v2 catalogs URL: https://github.com/apache/spark/pull/25368#issuecomment-521525496 Merged build finished. Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #25368: [SPARK-28635][SQL] create CatalogManager to track registered v2 catalogs
AmplabJenkins commented on issue #25368: [SPARK-28635][SQL] create CatalogManager to track registered v2 catalogs URL: https://github.com/apache/spark/pull/25368#issuecomment-521525502 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/14213/ Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #25368: [SPARK-28635][SQL] create CatalogManager to track registered v2 catalogs
AmplabJenkins commented on issue #25368: [SPARK-28635][SQL] create CatalogManager to track registered v2 catalogs URL: https://github.com/apache/spark/pull/25368#issuecomment-521525496 Merged build finished. Test PASSed.
[GitHub] [spark] shahidki31 commented on a change in pull request #22502: [SPARK-25474][SQL] Support `spark.sql.statistics.fallBackToHdfs` in data source tables
shahidki31 commented on a change in pull request #22502: [SPARK-25474][SQL] Support `spark.sql.statistics.fallBackToHdfs` in data source tables URL: https://github.com/apache/spark/pull/22502#discussion_r314183178
## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/HadoopFsRelation.scala ##
@@ -71,7 +70,13 @@ case class HadoopFsRelation(
 override def sizeInBytes: Long = {
 val compressionFactor = sqlContext.conf.fileCompressionFactor
-(location.sizeInBytes * compressionFactor).toLong
+val defaultSize = (location.sizeInBytes * compressionFactor).toLong
+location match {
+ case cfi: CatalogFileIndex if sparkSession.sessionState.conf.fallBackToHdfsForStatsEnabled =>
Review comment: I am not sure there is any issue in this PR. As per the code linked below, the `sizeInBytes` method is reached only if the table doesn't have any statistics. Maybe we can add the extra check mentioned above. https://github.com/apache/spark/blob/0526529b31737e5bf4829f8259f3a020f2cc51f1/sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/LogicalRelation.scala#L42-L46
[GitHub] [spark] wangyum commented on a change in pull request #22502: [SPARK-25474][SQL] Support `spark.sql.statistics.fallBackToHdfs` in data source tables
wangyum commented on a change in pull request #22502: [SPARK-25474][SQL] Support `spark.sql.statistics.fallBackToHdfs` in data source tables URL: https://github.com/apache/spark/pull/22502#discussion_r314180567
## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/HadoopFsRelation.scala ##
@@ -71,7 +70,13 @@ case class HadoopFsRelation(
 override def sizeInBytes: Long = {
 val compressionFactor = sqlContext.conf.fileCompressionFactor
-(location.sizeInBytes * compressionFactor).toLong
+val defaultSize = (location.sizeInBytes * compressionFactor).toLong
+location match {
+ case cfi: CatalogFileIndex if sparkSession.sessionState.conf.fallBackToHdfsForStatsEnabled =>
Review comment: https://github.com/apache/spark/pull/24715
[GitHub] [spark] cloud-fan commented on a change in pull request #22502: [SPARK-25474][SQL] Support `spark.sql.statistics.fallBackToHdfs` in data source tables
cloud-fan commented on a change in pull request #22502: [SPARK-25474][SQL] Support `spark.sql.statistics.fallBackToHdfs` in data source tables URL: https://github.com/apache/spark/pull/22502#discussion_r314180114
## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/HadoopFsRelation.scala ##
@@ -71,7 +70,13 @@ case class HadoopFsRelation(
 override def sizeInBytes: Long = {
 val compressionFactor = sqlContext.conf.fileCompressionFactor
-(location.sizeInBytes * compressionFactor).toLong
+val defaultSize = (location.sizeInBytes * compressionFactor).toLong
+location match {
+ case cfi: CatalogFileIndex if sparkSession.sessionState.conf.fallBackToHdfsForStatsEnabled =>
Review comment: @wangyum can you send a PR for your proposal? It's unclear to me what you are proposing here.
[GitHub] [spark] younggyuchun commented on a change in pull request #25458: [SPARK-27931][SQL] Accept 'on' and 'off' as input and trim input for the boolean data type.
younggyuchun commented on a change in pull request #25458: [SPARK-27931][SQL] Accept 'on' and 'off' as input and trim input for the boolean data type. URL: https://github.com/apache/spark/pull/25458#discussion_r314179583
## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/StringUtils.scala ##
@@ -65,12 +65,15 @@ object StringUtils extends Logging {
 "(?s)" + out.result() // (?s) enables dotall mode, causing "." to match new lines
 }
- private[this] val trueStrings = Set("t", "true", "y", "yes", "1").map(UTF8String.fromString)
- private[this] val falseStrings = Set("f", "false", "n", "no", "0").map(UTF8String.fromString)
+ private[this] val trueStrings =
+Set("t", "true", "y", "yes", "1", "on").map(UTF8String.fromString)
+
+ private[this] val falseStrings =
+Set("f", "false", "n", "no", "0", "off").map(UTF8String.fromString)
Review comment: Ah okay. Let me add that too. Thank you
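To make the effect of the two sets in the diff above concrete, here is a minimal, Spark-free sketch of trimming an input and looking it up. It uses plain `String` instead of `UTF8String`; `BooleanStringsSketch` and `parse` are hypothetical names, and the case-insensitive matching is an assumption of this sketch rather than something the diff shows.

```scala
import java.util.Locale

// Hypothetical, Spark-free sketch of the matching described in the PR title.
object BooleanStringsSketch {
  private val trueStrings = Set("t", "true", "y", "yes", "1", "on")
  private val falseStrings = Set("f", "false", "n", "no", "0", "off")

  // Trims the input, lowercases it (an assumption), then checks both sets.
  // Returns None when the input is not a recognised boolean literal.
  def parse(input: String): Option[Boolean] = {
    val s = input.trim.toLowerCase(Locale.ROOT)
    if (trueStrings.contains(s)) Some(true)
    else if (falseStrings.contains(s)) Some(false)
    else None
  }
}
```

Returning `Option[Boolean]` keeps unrecognised input distinct from `false`, which is one way to model a cast that yields no value for invalid strings.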
[GitHub] [spark] wangyum commented on a change in pull request #22502: [SPARK-25474][SQL] Support `spark.sql.statistics.fallBackToHdfs` in data source tables
wangyum commented on a change in pull request #22502: [SPARK-25474][SQL] Support `spark.sql.statistics.fallBackToHdfs` in data source tables URL: https://github.com/apache/spark/pull/22502#discussion_r314179379
## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/HadoopFsRelation.scala ##
@@ -71,7 +70,13 @@ case class HadoopFsRelation(
 override def sizeInBytes: Long = {
 val compressionFactor = sqlContext.conf.fileCompressionFactor
-(location.sizeInBytes * compressionFactor).toLong
+val defaultSize = (location.sizeInBytes * compressionFactor).toLong
+location match {
+ case cfi: CatalogFileIndex if sparkSession.sessionState.conf.fallBackToHdfsForStatsEnabled =>
Review comment: Yes. I have prepared some tests to illustrate this issue. These tests pass before this commit:
```scala
test("Non-partitioned data source table") {
  withTempDir { dir =>
    withTable("spark_25474") {
      sql(s"CREATE TABLE spark_25474 (c1 BIGINT) USING PARQUET LOCATION '${dir.toURI}'")
      spark.range(5).write.mode(SaveMode.Overwrite).parquet(dir.getCanonicalPath)
      assert(getCatalogTable("spark_25474").stats.isEmpty)
      val relation = spark.table("spark_25474").queryExecution.analyzed.children.head
      assert(relation.stats.sizeInBytes === 935)
    }
  }
}

test("Partitioned data source table default") {
  withTempDir { dir =>
    withTable("spark_25474") {
      spark.sql("CREATE TABLE spark_25474(a int, b int) USING parquet " +
        s"PARTITIONED BY(a) LOCATION '${dir.toURI}'")
      spark.sql("INSERT INTO TABLE spark_25474 PARTITION(a=1) SELECT 2")
      assert(getCatalogTable("spark_25474").stats.isEmpty)
      val relation = spark.table("spark_25474").queryExecution.analyzed.children.head
      // scalastyle:off line.size.limit
      // It's 8.0EB in this case. This 8.0EB comes from:
      // https://github.com/apache/spark/blob/c30b5297bc607ae33cc2fcf624b127942154e559/sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/DataSource.scala#L383-L387
      // scalastyle:on line.size.limit
      assert(relation.stats.sizeInBytes === conf.defaultSizeInBytes)
    }
  }
}

test("Partitioned data source table and disable HIVE_MANAGE_FILESOURCE_PARTITIONS") {
  withSQLConf(SQLConf.HIVE_MANAGE_FILESOURCE_PARTITIONS.key -> "false") {
    withTempDir { dir =>
      withTable("spark_25474") {
        spark.sql("CREATE TABLE spark_25474(a int, b int) USING parquet " +
          s"PARTITIONED BY(a) LOCATION '${dir.toURI}'")
        spark.sql("INSERT INTO TABLE spark_25474 PARTITION(a=1) SELECT 2")
        assert(getCatalogTable("spark_25474").stats.isEmpty)
        val relation = spark.table("spark_25474").queryExecution.analyzed.children.head
        assert(relation.stats.sizeInBytes === 418)
      }
    }
  }
}
```
https://github.com/apache/spark/compare/master...wangyum:SPARK-25474-DEV?expand=1
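The `8.0EB` figure mentioned in the test comment above can be checked with a little arithmetic. This sketch assumes that `conf.defaultSizeInBytes` is left at `Long.MaxValue` (the thread does not state the default) and that EB here means 2^60 bytes; `DefaultSizeSketch` and its members are hypothetical names.

```scala
// Hypothetical arithmetic check; the default value is an assumption, not taken from the thread.
object DefaultSizeSketch {
  // Assumed default for the size estimate when nothing better is known.
  val assumedDefaultSizeInBytes: Long = Long.MaxValue

  // One exabyte in the binary sense: 2^60 bytes.
  val bytesPerEB: Double = math.pow(2, 60)

  // Converts a byte count to exabytes, rounded to one decimal place.
  def toEB(bytes: Long): Double =
    math.round(bytes / bytesPerEB * 10) / 10.0
}
```

Under those assumptions, `DefaultSizeSketch.toEB(DefaultSizeSketch.assumedDefaultSizeInBytes)` evaluates to `8.0`, since 2^63 / 2^60 = 8.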
[GitHub] [spark] SparkQA commented on issue #24440: [SPARK-27545] [SQL] Uncache table needs to delete the temporary view …
SparkQA commented on issue #24440: [SPARK-27545] [SQL] Uncache table needs to delete the temporary view … URL: https://github.com/apache/spark/pull/24440#issuecomment-521519898 **[Test build #109143 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/109143/testReport)** for PR 24440 at commit [`770ee42`](https://github.com/apache/spark/commit/770ee4261335635fafe79afebb1ce7302db96d92).
[GitHub] [spark] AmplabJenkins removed a comment on issue #24440: [SPARK-27545] [SQL] Uncache table needs to delete the temporary view …
AmplabJenkins removed a comment on issue #24440: [SPARK-27545] [SQL] Uncache table needs to delete the temporary view … URL: https://github.com/apache/spark/pull/24440#issuecomment-521519493 Merged build finished. Test PASSed.
[GitHub] [spark] AmplabJenkins removed a comment on issue #24440: [SPARK-27545] [SQL] Uncache table needs to delete the temporary view …
AmplabJenkins removed a comment on issue #24440: [SPARK-27545] [SQL] Uncache table needs to delete the temporary view … URL: https://github.com/apache/spark/pull/24440#issuecomment-521519498 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/14212/ Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #24440: [SPARK-27545] [SQL] Uncache table needs to delete the temporary view …
AmplabJenkins commented on issue #24440: [SPARK-27545] [SQL] Uncache table needs to delete the temporary view … URL: https://github.com/apache/spark/pull/24440#issuecomment-521519498 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/14212/ Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #24440: [SPARK-27545] [SQL] Uncache table needs to delete the temporary view …
AmplabJenkins commented on issue #24440: [SPARK-27545] [SQL] Uncache table needs to delete the temporary view … URL: https://github.com/apache/spark/pull/24440#issuecomment-521519493 Merged build finished. Test PASSed.
[GitHub] [spark] httfighter commented on issue #24440: [SPARK-27545] [SQL] Uncache table needs to delete the temporary view …
httfighter commented on issue #24440: [SPARK-27545] [SQL] Uncache table needs to delete the temporary view … URL: https://github.com/apache/spark/pull/24440#issuecomment-521518729 @dongjoon-hyun Thank you for the reminder. I have added a test case.
[GitHub] [spark] AmplabJenkins removed a comment on issue #25456: [SPARK-28739][SQL] Add a simple cost check for Adaptive Query Execution
AmplabJenkins removed a comment on issue #25456: [SPARK-28739][SQL] Add a simple cost check for Adaptive Query Execution URL: https://github.com/apache/spark/pull/25456#issuecomment-521517932 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/14211/ Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #25456: [SPARK-28739][SQL] Add a simple cost check for Adaptive Query Execution
AmplabJenkins commented on issue #25456: [SPARK-28739][SQL] Add a simple cost check for Adaptive Query Execution URL: https://github.com/apache/spark/pull/25456#issuecomment-521517930 Merged build finished. Test PASSed.
[GitHub] [spark] AmplabJenkins removed a comment on issue #25456: [SPARK-28739][SQL] Add a simple cost check for Adaptive Query Execution
AmplabJenkins removed a comment on issue #25456: [SPARK-28739][SQL] Add a simple cost check for Adaptive Query Execution URL: https://github.com/apache/spark/pull/25456#issuecomment-521517930 Merged build finished. Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #25456: [SPARK-28739][SQL] Add a simple cost check for Adaptive Query Execution
AmplabJenkins commented on issue #25456: [SPARK-28739][SQL] Add a simple cost check for Adaptive Query Execution URL: https://github.com/apache/spark/pull/25456#issuecomment-521517932 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/14211/ Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #25459: [SPARK-28734[DOC] Initial table of content in the left hand side bar for SQL doc
AmplabJenkins commented on issue #25459: [SPARK-28734[DOC] Initial table of content in the left hand side bar for SQL doc URL: https://github.com/apache/spark/pull/25459#issuecomment-521517572 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/109141/ Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #25459: [SPARK-28734[DOC] Initial table of content in the left hand side bar for SQL doc
AmplabJenkins commented on issue #25459: [SPARK-28734[DOC] Initial table of content in the left hand side bar for SQL doc URL: https://github.com/apache/spark/pull/25459#issuecomment-521517570 Merged build finished. Test PASSed.
[GitHub] [spark] AmplabJenkins removed a comment on issue #25459: [SPARK-28734[DOC] Initial table of content in the left hand side bar for SQL doc
AmplabJenkins removed a comment on issue #25459: [SPARK-28734[DOC] Initial table of content in the left hand side bar for SQL doc URL: https://github.com/apache/spark/pull/25459#issuecomment-521517572 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/109141/ Test PASSed.
[GitHub] [spark] SparkQA commented on issue #25459: [SPARK-28734[DOC] Initial table of content in the left hand side bar for SQL doc
SparkQA commented on issue #25459: [SPARK-28734[DOC] Initial table of content in the left hand side bar for SQL doc URL: https://github.com/apache/spark/pull/25459#issuecomment-521517517 **[Test build #109141 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/109141/testReport)** for PR 25459 at commit [`791ee67`](https://github.com/apache/spark/commit/791ee67a26230d44b6839e4d414980d9889cea74).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] [spark] AmplabJenkins removed a comment on issue #25459: [SPARK-28734[DOC] Initial table of content in the left hand side bar for SQL doc
AmplabJenkins removed a comment on issue #25459: [SPARK-28734[DOC] Initial table of content in the left hand side bar for SQL doc URL: https://github.com/apache/spark/pull/25459#issuecomment-521517570 Merged build finished. Test PASSed.
[GitHub] [spark] SparkQA removed a comment on issue #25459: [SPARK-28734[DOC] Initial table of content in the left hand side bar for SQL doc
SparkQA removed a comment on issue #25459: [SPARK-28734[DOC] Initial table of content in the left hand side bar for SQL doc URL: https://github.com/apache/spark/pull/25459#issuecomment-521515799 **[Test build #109141 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/109141/testReport)** for PR 25459 at commit [`791ee67`](https://github.com/apache/spark/commit/791ee67a26230d44b6839e4d414980d9889cea74).
[GitHub] [spark] SparkQA commented on issue #25456: [SPARK-28739][SQL] Add a simple cost check for Adaptive Query Execution
SparkQA commented on issue #25456: [SPARK-28739][SQL] Add a simple cost check for Adaptive Query Execution URL: https://github.com/apache/spark/pull/25456#issuecomment-521516999 **[Test build #109142 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/109142/testReport)** for PR 25456 at commit [`74dd386`](https://github.com/apache/spark/commit/74dd3865e0fe3287d73a7b6aa954cc63bf17e9fd).
[GitHub] [spark] AmplabJenkins removed a comment on issue #25459: [SPARK-28734[DOC] Initial table of content in the left hand side bar for SQL doc
AmplabJenkins removed a comment on issue #25459: [SPARK-28734[DOC] Initial table of content in the left hand side bar for SQL doc URL: https://github.com/apache/spark/pull/25459#issuecomment-521516649 Merged build finished. Test PASSed.
[GitHub] [spark] AmplabJenkins removed a comment on issue #25459: [SPARK-28734[DOC] Initial table of content in the left hand side bar for SQL doc
AmplabJenkins removed a comment on issue #25459: [SPARK-28734[DOC] Initial table of content in the left hand side bar for SQL doc URL: https://github.com/apache/spark/pull/25459#issuecomment-521516655 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/14210/ Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #25459: [SPARK-28734[DOC] Initial table of content in the left hand side bar for SQL doc
AmplabJenkins commented on issue #25459: [SPARK-28734[DOC] Initial table of content in the left hand side bar for SQL doc URL: https://github.com/apache/spark/pull/25459#issuecomment-521516649 Merged build finished. Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #25459: [SPARK-28734[DOC] Initial table of content in the left hand side bar for SQL doc
AmplabJenkins commented on issue #25459: [SPARK-28734[DOC] Initial table of content in the left hand side bar for SQL doc URL: https://github.com/apache/spark/pull/25459#issuecomment-521516655 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/14210/ Test PASSed.
[GitHub] [spark] AmplabJenkins removed a comment on issue #25459: [SPARK-28734[DOC] Initial table of content in the left hand side bar for SQL doc
AmplabJenkins removed a comment on issue #25459: [SPARK-28734[DOC] Initial table of content in the left hand side bar for SQL doc URL: https://github.com/apache/spark/pull/25459#issuecomment-521515466 Merged build finished. Test PASSed.
[GitHub] [spark] AmplabJenkins removed a comment on issue #25459: [SPARK-28734[DOC] Initial table of content in the left hand side bar for SQL doc
AmplabJenkins removed a comment on issue #25459: [SPARK-28734[DOC] Initial table of content in the left hand side bar for SQL doc URL: https://github.com/apache/spark/pull/25459#issuecomment-521515467 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/14209/ Test PASSed.
[GitHub] [spark] wangyum commented on a change in pull request #25458: [SPARK-27931][SQL] Accept 'on' and 'off' as input and trim input for the boolean data type.
wangyum commented on a change in pull request #25458: [SPARK-27931][SQL] Accept 'on' and 'off' as input and trim input for the boolean data type. URL: https://github.com/apache/spark/pull/25458#discussion_r314176109 ## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/StringUtils.scala ## @@ -65,12 +65,15 @@ object StringUtils extends Logging { "(?s)" + out.result() // (?s) enables dotall mode, causing "." to match new lines } - private[this] val trueStrings = Set("t", "true", "y", "yes", "1").map(UTF8String.fromString) - private[this] val falseStrings = Set("f", "false", "n", "no", "0").map(UTF8String.fromString) + private[this] val trueStrings = +Set("t", "true", "y", "yes", "1", "on").map(UTF8String.fromString) + + private[this] val falseStrings = +Set("f", "false", "n", "no", "0", "off").map(UTF8String.fromString) Review comment: But PostgreSQL also accepts `of`, `tru`, `fals`, ...: ```sql postgres=# select cast('of' as boolean), cast('tru' as boolean), cast('fals' as boolean); bool | bool | bool --+--+-- f| t| f (1 row) ``` https://github.com/postgres/postgres/commit/9729c9360886bee7feddc6a1124b0742de4b9f3d
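The PostgreSQL output quoted in the review comment above suggests that any unambiguous prefix of a boolean keyword is accepted. Below is a minimal Scala sketch of that idea, assuming a fixed keyword list and case-insensitive, whitespace-trimmed matching. The object name `BoolPrefixSketch` and its `parse` method are hypothetical; this is neither the code in PR #25458 nor PostgreSQL's actual parser.

```scala
import java.util.Locale

// Hypothetical illustration only: unique-prefix matching of boolean keywords.
object BoolPrefixSketch {
  // Assumed keyword lists; "1" and "0" only ever match themselves.
  private val trueWords = Seq("true", "yes", "on", "1")
  private val falseWords = Seq("false", "no", "off", "0")

  /** Returns Some(value) when the trimmed, lower-cased input is a prefix of
    * keywords on exactly one side, and None when it is empty, unknown, or
    * ambiguous (for example "o", which is a prefix of both "on" and "off"). */
  def parse(input: String): Option[Boolean] = {
    val s = input.trim.toLowerCase(Locale.ROOT)
    if (s.isEmpty) {
      None
    } else {
      val matchesTrue = trueWords.exists(_.startsWith(s))
      val matchesFalse = falseWords.exists(_.startsWith(s))
      if (matchesTrue && !matchesFalse) Some(true)
      else if (matchesFalse && !matchesTrue) Some(false)
      else None
    }
  }
}
```

With this sketch, `BoolPrefixSketch.parse("of")` yields `Some(false)` and `BoolPrefixSketch.parse("tru")` yields `Some(true)`, while `"o"` is rejected as ambiguous. The fixed `Set` lookup in the diff is simpler but only accepts the exact strings listed.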
[GitHub] [spark] SparkQA commented on issue #25459: [SPARK-28734[DOC] Initial table of content in the left hand side bar for SQL doc
SparkQA commented on issue #25459: [SPARK-28734[DOC] Initial table of content in the left hand side bar for SQL doc URL: https://github.com/apache/spark/pull/25459#issuecomment-521515799 **[Test build #109141 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/109141/testReport)** for PR 25459 at commit [`791ee67`](https://github.com/apache/spark/commit/791ee67a26230d44b6839e4d414980d9889cea74).
[GitHub] [spark] AmplabJenkins commented on issue #25459: [SPARK-28734[DOC] Initial table of content in the left hand side bar for SQL doc
AmplabJenkins commented on issue #25459: [SPARK-28734[DOC] Initial table of content in the left hand side bar for SQL doc URL: https://github.com/apache/spark/pull/25459#issuecomment-521515466 Merged build finished. Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #25459: [SPARK-28734[DOC] Initial table of content in the left hand side bar for SQL doc
AmplabJenkins commented on issue #25459: [SPARK-28734[DOC] Initial table of content in the left hand side bar for SQL doc URL: https://github.com/apache/spark/pull/25459#issuecomment-521515467 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/14209/ Test PASSed.
[GitHub] [spark] dilipbiswal opened a new pull request #25459: [SPARK-28734[DOC] Initial table of content in the left hand side bar for SQL doc
dilipbiswal opened a new pull request #25459: [SPARK-28734[DOC] Initial table of content in the left hand side bar for SQL doc URL: https://github.com/apache/spark/pull/25459 ## What changes were proposed in this pull request? This is an initial PR that creates the table of contents for the SQL reference guide. The left sidebar will display additional menu items corresponding to supported SQL constructs. Once this PR is merged, we will fill in the content incrementally. Additionally, this PR contains a minor change to make the left sidebar scrollable. Currently it is not possible to scroll in the left-hand side window. ## How was this patch tested? Used jekyll build and serve to verify.
[GitHub] [spark] dilipbiswal commented on issue #25459: [SPARK-28734[DOC] Initial table of content in the left hand side bar for SQL doc
dilipbiswal commented on issue #25459: [SPARK-28734[DOC] Initial table of content in the left hand side bar for SQL doc URL: https://github.com/apache/spark/pull/25459#issuecomment-521514797 cc @gatorsmile
[GitHub] [spark] younggyuchun commented on a change in pull request #25458: [SPARK-27931][SQL] Accept 'on' and 'off' as input and trim input for the boolean data type.
younggyuchun commented on a change in pull request #25458: [SPARK-27931][SQL] Accept 'on' and 'off' as input and trim input for the boolean data type. URL: https://github.com/apache/spark/pull/25458#discussion_r314174879 ## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/StringUtils.scala ## @@ -65,12 +65,15 @@ object StringUtils extends Logging { "(?s)" + out.result() // (?s) enables dotall mode, causing "." to match new lines } - private[this] val trueStrings = Set("t", "true", "y", "yes", "1").map(UTF8String.fromString) - private[this] val falseStrings = Set("f", "false", "n", "no", "0").map(UTF8String.fromString) + private[this] val trueStrings = +Set("t", "true", "y", "yes", "1", "on").map(UTF8String.fromString) + + private[this] val falseStrings = +Set("f", "false", "n", "no", "0", "off").map(UTF8String.fromString) Review comment: Yes, I guess so. Do you know of other common string representations used in other databases?
[GitHub] [spark] wangyum commented on a change in pull request #25443: [WIP][SPARK-28723][test-hadoop3.2][test-maven] Test JDK 11 with Hadoop-3.2/Hive 2.3.6 on jenkins
wangyum commented on a change in pull request #25443: [WIP][SPARK-28723][test-hadoop3.2][test-maven] Test JDK 11 with Hadoop-3.2/Hive 2.3.6 on jenkins URL: https://github.com/apache/spark/pull/25443#discussion_r314174619 ## File path: pom.xml ## @@ -115,7 +115,7 @@ UTF-8 UTF-8 -11 +1.8 Review comment: Let's wait for the fix of PySpark and SparkR?
[GitHub] [spark] BestOreo commented on a change in pull request #25342: [SPARK-28571][CORE][SHUFFLE] Use the shuffle writer plugin for the SortShuffleWriter
BestOreo commented on a change in pull request #25342: [SPARK-28571][CORE][SHUFFLE] Use the shuffle writer plugin for the SortShuffleWriter URL: https://github.com/apache/spark/pull/25342#discussion_r314174130 ## File path: core/src/main/scala/org/apache/spark/util/collection/ShufflePartitionPairsWriter.scala ## @@ -0,0 +1,98 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + *http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.spark.util.collection + +import java.io.{Closeable, FilterOutputStream, OutputStream} + +import org.apache.spark.serializer.{SerializationStream, SerializerInstance, SerializerManager} +import org.apache.spark.shuffle.ShuffleWriteMetricsReporter +import org.apache.spark.shuffle.api.ShufflePartitionWriter +import org.apache.spark.storage.BlockId + +/** + * A key-value writer inspired by {@link DiskBlockObjectWriter} that pushes the bytes to an + * arbitrary partition writer instead of writing to local disk through the block manager. 
+ */ +private[spark] class ShufflePartitionPairsWriter( +partitionWriter: ShufflePartitionWriter, +serializerManager: SerializerManager, +serializerInstance: SerializerInstance, +blockId: BlockId, +writeMetrics: ShuffleWriteMetricsReporter) + extends PairsWriter with Closeable { + + private var isOpen = false + private var partitionStream: OutputStream = _ + private var wrappedStream: OutputStream = _ + private var objOut: SerializationStream = _ + private var numRecordsWritten = 0 + private var curNumBytesWritten = 0L + + override def write(key: Any, value: Any): Unit = { +if (!isOpen) { + open() + isOpen = true +} +objOut.writeKey(key) +objOut.writeValue(value) +writeMetrics.incRecordsWritten(1) + } + + private def open(): Unit = { +partitionStream = partitionWriter.openStream +wrappedStream = serializerManager.wrapStream(blockId, partitionStream) +objOut = serializerInstance.serializeStream(wrappedStream) + } + + override def close(): Unit = { +if (isOpen) { Review comment: The worry is unnecessary because `wrappedStream` and `objOut` must be initialized successfully if `partitionStream` is opened as an OutputStream without exception. I also think the `isOpen` flag makes the code easier to understand.
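The `isOpen` pattern discussed in the review comment above can be shown in isolation. The following is a minimal sketch, assuming a single underlying `OutputStream` in place of the three streams in the diff. `LazyWriterSketch` and its `openStream` parameter are hypothetical names and are not part of Spark.

```scala
import java.io.{ByteArrayOutputStream, Closeable, OutputStream}

// Hypothetical stand-in for illustration; not Spark's ShufflePartitionPairsWriter.
final class LazyWriterSketch(openStream: () => OutputStream) extends Closeable {
  private var isOpen = false
  private var out: OutputStream = _

  def write(b: Int): Unit = {
    if (!isOpen) {
      // The flag is set only after the stream was obtained without an exception.
      out = openStream()
      isOpen = true
    }
    out.write(b)
  }

  override def close(): Unit = {
    // If nothing was ever written, no stream was opened and there is nothing to close.
    if (isOpen) {
      out.close()
      isOpen = false
    }
  }
}
```

In this sketch, calling `close()` before any `write` never invokes `openStream`, and a second `close()` is a no-op because the flag is reset. If opening involved several steps and a later one threw, the flag would stay false and the earlier stream would not be closed by `close()`; whether that can happen for the real writer is the point debated in the thread.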
[GitHub] [spark] dongjoon-hyun commented on issue #25446: [SPARK-28724] [SQL] Throw error message when cast out range decimal to long
dongjoon-hyun commented on issue #25446: [SPARK-28724] [SQL] Throw error message when cast out range decimal to long URL: https://github.com/apache/spark/pull/25446#issuecomment-521513201 Thank you for your understanding, @LiShuMing . Thank you, @maropu .
[GitHub] [spark] cloud-fan commented on a change in pull request #25348: [SPARK-28554][SQL] Adds a v1 fallback writer implementation for v2 data source codepaths
cloud-fan commented on a change in pull request #25348: [SPARK-28554][SQL] Adds a v1 fallback writer implementation for v2 data source codepaths URL: https://github.com/apache/spark/pull/25348#discussion_r314173218 ## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/v2/DataSourceV2Strategy.scala ## @@ -200,24 +202,37 @@ object DataSourceV2Strategy extends Strategy with PredicateHelper { catalog, ident, parts, +query, planLater(query), props, writeOptions, orCreate = orCreate) :: Nil } case AppendData(r: DataSourceV2Relation, query, _) => - AppendDataExec(r.table.asWritable, r.options, planLater(query)) :: Nil Review comment: If end-users look at the SQL tab and see `AppendDataExecV1`, they would expect to see v1 version of CTAS physical plan as well, and may report a bug if they don't see it. BTW I think there are other ways to implement this feature (users know if v1 fallback is triggered from SQL tab), e.g. we can use SQLMetrics to report it, which can be updated at runtime and support CTAS as well.
[GitHub] [spark] AmplabJenkins removed a comment on issue #25457: [SPARK-27234][SS][PYTHON][BRANCH-2.4] Use InheritableThreadLocal for current epoch in EpochTracker (to support Python UDFs)
AmplabJenkins removed a comment on issue #25457: [SPARK-27234][SS][PYTHON][BRANCH-2.4] Use InheritableThreadLocal for current epoch in EpochTracker (to support Python UDFs) URL: https://github.com/apache/spark/pull/25457#issuecomment-521511185 Merged build finished. Test PASSed.
[GitHub] [spark] AmplabJenkins removed a comment on issue #25457: [SPARK-27234][SS][PYTHON][BRANCH-2.4] Use InheritableThreadLocal for current epoch in EpochTracker (to support Python UDFs)
AmplabJenkins removed a comment on issue #25457: [SPARK-27234][SS][PYTHON][BRANCH-2.4] Use InheritableThreadLocal for current epoch in EpochTracker (to support Python UDFs) URL: https://github.com/apache/spark/pull/25457#issuecomment-521511188 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/109135/ Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #25457: [SPARK-27234][SS][PYTHON][BRANCH-2.4] Use InheritableThreadLocal for current epoch in EpochTracker (to support Python UDFs)
AmplabJenkins commented on issue #25457: [SPARK-27234][SS][PYTHON][BRANCH-2.4] Use InheritableThreadLocal for current epoch in EpochTracker (to support Python UDFs) URL: https://github.com/apache/spark/pull/25457#issuecomment-521511188 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/109135/ Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #25457: [SPARK-27234][SS][PYTHON][BRANCH-2.4] Use InheritableThreadLocal for current epoch in EpochTracker (to support Python UDFs)
AmplabJenkins commented on issue #25457: [SPARK-27234][SS][PYTHON][BRANCH-2.4] Use InheritableThreadLocal for current epoch in EpochTracker (to support Python UDFs) URL: https://github.com/apache/spark/pull/25457#issuecomment-521511185 Merged build finished. Test PASSed.
[GitHub] [spark] SparkQA removed a comment on issue #25457: [SPARK-27234][SS][PYTHON][BRANCH-2.4] Use InheritableThreadLocal for current epoch in EpochTracker (to support Python UDFs)
SparkQA removed a comment on issue #25457: [SPARK-27234][SS][PYTHON][BRANCH-2.4] Use InheritableThreadLocal for current epoch in EpochTracker (to support Python UDFs) URL: https://github.com/apache/spark/pull/25457#issuecomment-521473909 **[Test build #109135 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/109135/testReport)** for PR 25457 at commit [`4c5fdd6`](https://github.com/apache/spark/commit/4c5fdd668be1f31561849e5fe485e814e318a3ef).
[GitHub] [spark] SparkQA commented on issue #25457: [SPARK-27234][SS][PYTHON][BRANCH-2.4] Use InheritableThreadLocal for current epoch in EpochTracker (to support Python UDFs)
SparkQA commented on issue #25457: [SPARK-27234][SS][PYTHON][BRANCH-2.4] Use InheritableThreadLocal for current epoch in EpochTracker (to support Python UDFs) URL: https://github.com/apache/spark/pull/25457#issuecomment-521511008 **[Test build #109135 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/109135/testReport)** for PR 25457 at commit [`4c5fdd6`](https://github.com/apache/spark/commit/4c5fdd668be1f31561849e5fe485e814e318a3ef). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] [spark] dongjoon-hyun commented on a change in pull request #25443: [WIP][SPARK-28723][test-hadoop3.2][test-maven] Test JDK 11 with Hadoop-3.2/Hive 2.3.6 on jenkins
dongjoon-hyun commented on a change in pull request #25443: [WIP][SPARK-28723][test-hadoop3.2][test-maven] Test JDK 11 with Hadoop-3.2/Hive 2.3.6 on jenkins URL: https://github.com/apache/spark/pull/25443#discussion_r314172342
## File path: pom.xml ##
@@ -115,7 +115,7 @@
 UTF-8
 UTF-8
-11
+1.8
Review comment: Thanks!
[GitHub] [spark] dongjoon-hyun commented on issue #25457: [SPARK-27234][SS][PYTHON][BRANCH-2.4] Use InheritableThreadLocal for current epoch in EpochTracker (to support Python UDFs)
dongjoon-hyun commented on issue #25457: [SPARK-27234][SS][PYTHON][BRANCH-2.4] Use InheritableThreadLocal for current epoch in EpochTracker (to support Python UDFs) URL: https://github.com/apache/spark/pull/25457#issuecomment-521510518 Merged to `branch-2.4`.
[GitHub] [spark] dongjoon-hyun closed pull request #25457: [SPARK-27234][SS][PYTHON][BRANCH-2.4] Use InheritableThreadLocal for current epoch in EpochTracker (to support Python UDFs)
dongjoon-hyun closed pull request #25457: [SPARK-27234][SS][PYTHON][BRANCH-2.4] Use InheritableThreadLocal for current epoch in EpochTracker (to support Python UDFs) URL: https://github.com/apache/spark/pull/25457
[GitHub] [spark] wangyum commented on a change in pull request #25458: [SPARK-27931][SQL] Accept 'on' and 'off' as input and trim input for the boolean data type.
wangyum commented on a change in pull request #25458: [SPARK-27931][SQL] Accept 'on' and 'off' as input and trim input for the boolean data type. URL: https://github.com/apache/spark/pull/25458#discussion_r314171694
## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/StringUtils.scala ##
@@ -65,12 +65,15 @@ object StringUtils extends Logging {
     "(?s)" + out.result() // (?s) enables dotall mode, causing "." to match new lines
   }
-  private[this] val trueStrings = Set("t", "true", "y", "yes", "1").map(UTF8String.fromString)
-  private[this] val falseStrings = Set("f", "false", "n", "no", "0").map(UTF8String.fromString)
+  private[this] val trueStrings =
+    Set("t", "true", "y", "yes", "1", "on").map(UTF8String.fromString)
+
+  private[this] val falseStrings =
+    Set("f", "false", "n", "no", "0", "off").map(UTF8String.fromString)
Review comment: It seems only PostgreSQL accepts `on` and `off`?
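For readers following the change under review, the behaviour described in the PR title (accept `on`/`off` and trim the input) can be sketched roughly as below. This is only an illustrative sketch, not the code in the pull request: it uses plain `String` instead of Spark's `UTF8String` so that it is self-contained, the object name `BooleanStrings` and method `parse` are hypothetical, and the case-insensitive match is an assumption.

```scala
import java.util.Locale

// Hypothetical helper, not part of Spark: illustrates trimming plus on/off support.
object BooleanStrings {
  // Spellings taken from the diff under review, including the new "on" and "off".
  private val trueStrings = Set("t", "true", "y", "yes", "1", "on")
  private val falseStrings = Set("f", "false", "n", "no", "0", "off")

  /** Returns Some(true) or Some(false) for a recognised spelling, None otherwise. */
  def parse(raw: String): Option[Boolean] = {
    // Trim surrounding whitespace first; lower-casing is an assumption of this sketch.
    val s = raw.trim.toLowerCase(Locale.ROOT)
    if (trueStrings.contains(s)) Some(true)
    else if (falseStrings.contains(s)) Some(false)
    else None
  }
}
```

Returning `Option[Boolean]` leaves it to the caller to decide what an unrecognised string should become (for example a null result from a cast); that choice is a design decision of this sketch rather than something stated in the thread.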
[GitHub] [spark] cloud-fan closed pull request #25402: [SPARK-28666] Support saveAsTable for V2 tables through Session Catalog
cloud-fan closed pull request #25402: [SPARK-28666] Support saveAsTable for V2 tables through Session Catalog URL: https://github.com/apache/spark/pull/25402
[GitHub] [spark] AmplabJenkins removed a comment on issue #25443: [WIP][SPARK-28723][test-hadoop3.2][test-maven] Test JDK 11 with Hadoop-3.2/Hive 2.3.6 on jenkins
AmplabJenkins removed a comment on issue #25443: [WIP][SPARK-28723][test-hadoop3.2][test-maven] Test JDK 11 with Hadoop-3.2/Hive 2.3.6 on jenkins URL: https://github.com/apache/spark/pull/25443#issuecomment-521509487 Merged build finished. Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #25455: [WIP][SPARK-28737][CORE] Update Jersey to 2.29
AmplabJenkins commented on issue #25455: [WIP][SPARK-28737][CORE] Update Jersey to 2.29 URL: https://github.com/apache/spark/pull/25455#issuecomment-521509388 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/109137/ Test FAILed.
[GitHub] [spark] AmplabJenkins removed a comment on issue #25443: [WIP][SPARK-28723][test-hadoop3.2][test-maven] Test JDK 11 with Hadoop-3.2/Hive 2.3.6 on jenkins
AmplabJenkins removed a comment on issue #25443: [WIP][SPARK-28723][test-hadoop3.2][test-maven] Test JDK 11 with Hadoop-3.2/Hive 2.3.6 on jenkins URL: https://github.com/apache/spark/pull/25443#issuecomment-521509492 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/14208/ Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #25443: [WIP][SPARK-28723][test-hadoop3.2][test-maven] Test JDK 11 with Hadoop-3.2/Hive 2.3.6 on jenkins
AmplabJenkins commented on issue #25443: [WIP][SPARK-28723][test-hadoop3.2][test-maven] Test JDK 11 with Hadoop-3.2/Hive 2.3.6 on jenkins URL: https://github.com/apache/spark/pull/25443#issuecomment-521509492 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/14208/ Test PASSed.
[GitHub] [spark] AmplabJenkins removed a comment on issue #25455: [WIP][SPARK-28737][CORE] Update Jersey to 2.29
AmplabJenkins removed a comment on issue #25455: [WIP][SPARK-28737][CORE] Update Jersey to 2.29 URL: https://github.com/apache/spark/pull/25455#issuecomment-521509386 Merged build finished. Test FAILed.
[GitHub] [spark] AmplabJenkins removed a comment on issue #25455: [WIP][SPARK-28737][CORE] Update Jersey to 2.29
AmplabJenkins removed a comment on issue #25455: [WIP][SPARK-28737][CORE] Update Jersey to 2.29 URL: https://github.com/apache/spark/pull/25455#issuecomment-521509388 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/109137/ Test FAILed.