[GitHub] [spark] maropu commented on a change in pull request #28104: [SPARK-31331][SQL][DOCS] Document Spark integration with Hive UDFs/UDAFs/UDTFs
maropu commented on a change in pull request #28104:
URL: https://github.com/apache/spark/pull/28104#discussion_r411105164

## File path: docs/sql-ref-functions-udf-hive.md

@@ -19,4 +19,90 @@ license: |
 limitations under the License.
---
-Integration with Hive UDFs/UDAFs/UDTFs
\ No newline at end of file
+### Description
+
+Spark SQL supports integration of Hive UDFs, UDAFs and UDTFs. Similar to Spark UDFs and UDAFs, Hive UDFs work on a single row as input and generate a single row as output, while Hive UDAFs operate on multiple rows and return a single aggregated row as a result. In addition, Hive also supports UDTFs (User Defined Tabular Functions) that act on one row as input and return multiple rows as output. To use Hive UDFs/UDAFs/UDTFs, the user should register them in Spark, and then use them in Spark SQL queries.
+
+### Examples
+
+Hive has two UDF interfaces: [UDF](https://github.com/apache/hive/blob/master/udf/src/java/org/apache/hadoop/hive/ql/exec/UDF.java) and [GenericUDF](https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDF.java).
+The example below uses [GenericUDFAbs](https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFAbs.java), derived from `GenericUDF`.
+
+{% highlight sql %}
+-- Register `GenericUDFAbs` and use it in Spark SQL.
+-- Note that, if you use your own programmed one, you need to add a JAR containing it
+-- into a classpath,
+-- e.g., ADD JAR yourHiveUDF.jar;
+CREATE TEMPORARY FUNCTION testUDF AS 'org.apache.hadoop.hive.ql.udf.generic.GenericUDFAbs';
+
+SELECT * FROM t;
+  +-----+
+  |value|
+  +-----+
+  | -1.0|
+  |  2.0|
+  | -3.0|
+  +-----+
+
+SELECT testUDF(value) FROM t;
+  +--------------+
+  |testUDF(value)|
+  +--------------+
+  |           1.0|
+  |           2.0|
+  |           3.0|
+  +--------------+
+{% endhighlight %}
+
+The example below uses [GenericUDTFExplode](https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDTFExplode.java), derived from [GenericUDTF](https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDTF.java).
+
+{% highlight sql %}
+-- Register `GenericUDTFExplode` and use it in Spark SQL
+CREATE TEMPORARY FUNCTION hiveUDTF
+    AS 'org.apache.hadoop.hive.ql.udf.generic.GenericUDTFExplode';
+
+SELECT * FROM t;
+  +------+

Review comment:
Ah, I see. Actually, no strong reason, just format consistency. Before https://github.com/apache/spark/pull/28151, we used different and inconsistent formats across the SQL documents, so I put in a simple rule to use the same format in https://github.com/apache/spark/pull/28151. But if we have a better format for the documents, the reformat looks fine.
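For anyone wanting to run the quoted doc example end to end, here is a minimal Scala sketch. The session setup and the tiny table `t` are assumptions made for illustration; `GenericUDFAbs` is the real Hive class the doc references.

```scala
import org.apache.spark.sql.SparkSession

object HiveUdfExample {
  def main(args: Array[String]): Unit = {
    // CREATE TEMPORARY FUNCTION backed by a Hive class needs a Hive-enabled build.
    val spark = SparkSession.builder()
      .master("local[*]")
      .enableHiveSupport()
      .getOrCreate()
    import spark.implicits._

    // The small table the doc's example assumes.
    Seq(-1.0, 2.0, -3.0).toDF("value").createOrReplaceTempView("t")

    // Register the built-in Hive GenericUDF; for a custom UDF, ADD JAR first.
    spark.sql("CREATE TEMPORARY FUNCTION testUDF AS " +
      "'org.apache.hadoop.hive.ql.udf.generic.GenericUDFAbs'")

    spark.sql("SELECT testUDF(value) FROM t").show()
    // +--------------+
    // |testUDF(value)|
    // +--------------+
    // |           1.0|
    // |           2.0|
    // |           3.0|
    // +--------------+
  }
}
```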
[GitHub] [spark] Ngone51 commented on a change in pull request #28254: [SPARK-31478][CORE]Call `StopExecutor` before killing executors
Ngone51 commented on a change in pull request #28254:
URL: https://github.com/apache/spark/pull/28254#discussion_r40015

## File path: core/src/main/scala/org/apache/spark/scheduler/cluster/CoarseGrainedSchedulerBackend.scala

@@ -769,6 +769,8 @@ class CoarseGrainedSchedulerBackend(scheduler: TaskSchedulerImpl, val rpcEnv: Rp
     val killExecutors: Boolean => Future[Boolean] =
       if (executorsToKill.nonEmpty) {
+        executorsToKill.foreach(id =>
+          executorDataMap.get(id).foreach(_.executorEndpoint.send(StopExecutor)))

Review comment:
`StopExecutor` may still arrive at the executor after the kill request reaches the worker/container, due to network delay. Isn't that possible?
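Restating the concern against the patched block, with identifiers taken from the diff; `doKillExecutors` stands in for the follow-on cluster-manager request and is named here only for illustration, so treat this as an annotated sketch rather than the actual code path:

```scala
// 1) Fire-and-forget RPC asking each executor to stop gracefully.
executorsToKill.foreach { id =>
  executorDataMap.get(id).foreach(_.executorEndpoint.send(StopExecutor))
}
// 2) The backend separately asks the cluster manager to kill the same
//    executors. The two requests travel over different channels, so under
//    network delay the kill can land on the worker/container BEFORE the
//    executor has processed StopExecutor, which is the race being raised.
doKillExecutors(executorsToKill)
```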
[GitHub] [spark] maropu commented on a change in pull request #28251: [SPARK-31476][SQL] Add an ExpressionInfo entry for EXTRACT
maropu commented on a change in pull request #28251:
URL: https://github.com/apache/spark/pull/28251#discussion_r411109456

## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/FunctionRegistry.scala

@@ -423,6 +423,7 @@ object FunctionRegistry {
     expression[MakeTimestamp]("make_timestamp"),
     expression[MakeInterval]("make_interval"),
     expression[DatePart]("date_part"),
+    expression[Extract]("extract"),

Review comment:
> Not a big deal but better if we can avoid exposing more APIs

Yea, +1. Btw, isn't it better to add tests for the `extract(field, source)` case?
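On the suggested test, a rough spot-check of the `extract(field FROM source)` form; this is a sketch, not an actual entry in Spark's test suites:

```scala
import org.apache.spark.sql.SparkSession

object ExtractCheck {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().master("local[*]").getOrCreate()

    // EXTRACT uses FROM between the field and the source expression.
    spark.sql("SELECT extract(YEAR FROM DATE'2019-08-12') AS y").show()
    // +----+
    // |   y|
    // +----+
    // |2019|
    // +----+
  }
}
```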
[GitHub] [spark] AmplabJenkins commented on issue #27944: [SPARK-31180][ML] Implement PowerTransform
AmplabJenkins commented on issue #27944:
URL: https://github.com/apache/spark/pull/27944#issuecomment-616325425
[GitHub] [spark] AmplabJenkins removed a comment on issue #27944: [SPARK-31180][ML] Implement PowerTransform
AmplabJenkins removed a comment on issue #27944:
URL: https://github.com/apache/spark/pull/27944#issuecomment-616325425
[GitHub] [spark] iRakson commented on a change in pull request #28254: [SPARK-31478][CORE]Call `StopExecutor` before killing executors
iRakson commented on a change in pull request #28254:
URL: https://github.com/apache/spark/pull/28254#discussion_r411109001

## File path: core/src/main/scala/org/apache/spark/scheduler/cluster/CoarseGrainedSchedulerBackend.scala

+        executorsToKill.foreach(id =>
+          executorDataMap.get(id).foreach(_.executorEndpoint.send(StopExecutor)))

Review comment:
Yes.
[GitHub] [spark] SparkQA commented on issue #27944: [SPARK-31180][ML] Implement PowerTransform
SparkQA commented on issue #27944:
URL: https://github.com/apache/spark/pull/27944#issuecomment-616325119

**[Test build #121504 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/121504/testReport)** for PR 27944 at commit [`f2fd922`](https://github.com/apache/spark/commit/f2fd9229f2d5914535ed87411e8d9080bbc5c7d9).
[GitHub] [spark] cloud-fan commented on issue #28266: [SPARK-31256][SQL] DataFrameNaFunctions.drop should work for nested columns
cloud-fan commented on issue #28266:
URL: https://github.com/apache/spark/pull/28266#issuecomment-616322205

@dongjoon-hyun yes
[GitHub] [spark] cloud-fan commented on issue #28197: [SPARK-31431][SQL] Add CalendarInterval encoder support
cloud-fan commented on issue #28197:
URL: https://github.com/apache/spark/pull/28197#issuecomment-616320345

I'd say `CalendarInterval` should be treated the same as `Decimal`. They are semi-public, and are already supported partially (inside case classes). It's arguable whether we want to support more.
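A sketch of the "partial" support being referenced: a `CalendarInterval` field inside a case class already encodes, while a top-level `Dataset[CalendarInterval]` is what the PR would add. Setup names are illustrative, and this assumes Spark 3.0 APIs:

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.unsafe.types.CalendarInterval

case class WithInterval(name: String, i: CalendarInterval)

object IntervalEncoderSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().master("local[*]").getOrCreate()
    import spark.implicits._

    // Works today: CalendarInterval nested in a product type.
    val ds = Seq(WithInterval("a", new CalendarInterval(0, 1, 0L))).toDS()
    ds.printSchema()
    // root
    //  |-- name: string (nullable = true)
    //  |-- i: interval (nullable = true)

    // What the PR discusses adding: Seq(new CalendarInterval(0, 1, 0L)).toDS()
  }
}
```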
[GitHub] [spark] HyukjinKwon commented on a change in pull request #28104: [SPARK-31331][SQL][DOCS] Document Spark integration with Hive UDFs/UDAFs/UDTFs
HyukjinKwon commented on a change in pull request #28104:
URL: https://github.com/apache/spark/pull/28104#discussion_r411103361

## File path: docs/sql-ref-functions-udf-hive.md

Review comment:
Also, seems like we should comment these outputs out.
[GitHub] [spark] AmplabJenkins removed a comment on issue #28268: [SPARK-31492][ML] flatten the result dataframe of FValueTest
AmplabJenkins removed a comment on issue #28268:
URL: https://github.com/apache/spark/pull/28268#issuecomment-616319622
[GitHub] [spark] SparkQA removed a comment on issue #28268: [SPARK-31492][ML] flatten the result dataframe of FValueTest
SparkQA removed a comment on issue #28268:
URL: https://github.com/apache/spark/pull/28268#issuecomment-616293948

**[Test build #121496 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/121496/testReport)** for PR 28268 at commit [`3caf7d1`](https://github.com/apache/spark/commit/3caf7d12408f3cd6d8245c53974ea84703c5f767).
[GitHub] [spark] AmplabJenkins commented on issue #28268: [SPARK-31492][ML] flatten the result dataframe of FValueTest
AmplabJenkins commented on issue #28268:
URL: https://github.com/apache/spark/pull/28268#issuecomment-616319622
[GitHub] [spark] cloud-fan commented on a change in pull request #28226: [SPARK-31452][SQL] Do not create partition spec for 0-size partitions in AQE
cloud-fan commented on a change in pull request #28226:
URL: https://github.com/apache/spark/pull/28226#discussion_r411102045

## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/adaptive/OptimizeSkewedJoin.scala

@@ -88,9 +88,11 @@ case class OptimizeSkewedJoin(conf: SQLConf) extends Rule[SparkPlan] {
   private def targetSize(sizes: Seq[Long], medianSize: Long): Long = {
     val advisorySize = conf.getConf(SQLConf.ADVISORY_PARTITION_SIZE_IN_BYTES)
     val nonSkewSizes = sizes.filterNot(isSkewed(_, medianSize))
-    // It's impossible that all the partitions are skewed, as we use median size to define skew.
-    assert(nonSkewSizes.nonEmpty)
-    math.max(advisorySize, nonSkewSizes.sum / nonSkewSizes.length)
+    if (nonSkewSizes.isEmpty) {

Review comment:
Because we calculate the median size based on the original map stats, but the input partitions are coalesced. It's possible that all partitions (after coalescing) are skewed.
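Spelling the explanation out against the patched helper; names follow the diff, so treat this as an annotated paraphrase, not necessarily the merged code:

```scala
private def targetSize(sizes: Seq[Long], medianSize: Long): Long = {
  val advisorySize = conf.getConf(SQLConf.ADVISORY_PARTITION_SIZE_IN_BYTES)
  // `medianSize` was derived from the ORIGINAL map output stats, while
  // `sizes` describes partitions AFTER coalescing, so the two are not
  // measured on the same population...
  val nonSkewSizes = sizes.filterNot(isSkewed(_, medianSize))
  if (nonSkewSizes.isEmpty) {
    // ...which means every coalesced partition can legitimately exceed the
    // skew threshold at once; fall back to the advisory size instead of
    // asserting, as the old code did.
    advisorySize
  } else {
    math.max(advisorySize, nonSkewSizes.sum / nonSkewSizes.length)
  }
}
```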
[GitHub] [spark] SparkQA commented on issue #28268: [SPARK-31492][ML] flatten the result dataframe of FValueTest
SparkQA commented on issue #28268:
URL: https://github.com/apache/spark/pull/28268#issuecomment-616319130

**[Test build #121496 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/121496/testReport)** for PR 28268 at commit [`3caf7d1`](https://github.com/apache/spark/commit/3caf7d12408f3cd6d8245c53974ea84703c5f767).

 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.
[GitHub] [spark] AmplabJenkins commented on issue #26339: [SPARK-27194][SPARK-29302][SQL] Fix the issue that for dynamic partition overwrite a task would conflict with its speculative task
AmplabJenkins commented on issue #26339:
URL: https://github.com/apache/spark/pull/26339#issuecomment-616318720
[GitHub] [spark] HyukjinKwon commented on a change in pull request #28104: [SPARK-31331][SQL][DOCS] Document Spark integration with Hive UDFs/UDAFs/UDTFs
HyukjinKwon commented on a change in pull request #28104:
URL: https://github.com/apache/spark/pull/28104#discussion_r411101626

## File path: docs/sql-ref-functions-udf-hive.md

Review comment:
Quick question. Why did we use:
```
+---+
|col|
+---+
|  1|
|  2|
|  3|
|  4|
+---+
```
format over the Hive string format (which is produced by the `spark-sql` script)?
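For reference, the boxed format in question is what `Dataset.show()` prints, while the `spark-sql` CLI emits Hive-style rows; a sketch assuming a local session:

```scala
import org.apache.spark.sql.SparkSession

object ShowFormatDemo {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().master("local[*]").getOrCreate()

    // Dataset.show() produces the boxed ASCII table used in the SQL docs.
    spark.sql("SELECT explode(array(1, 2, 3, 4)) AS col").show()
    // +---+
    // |col|
    // +---+
    // |  1|
    // |  2|
    // |  3|
    // |  4|
    // +---+

    // The spark-sql CLI would instead print one Hive-style row per line:
    // 1
    // 2
    // 3
    // 4
  }
}
```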
[GitHub] [spark] AmplabJenkins removed a comment on issue #26339: [SPARK-27194][SPARK-29302][SQL] Fix the issue that for dynamic partition overwrite a task would conflict with its speculative task
AmplabJenkins removed a comment on issue #26339:
URL: https://github.com/apache/spark/pull/26339#issuecomment-616318720
[GitHub] [spark] cloud-fan commented on a change in pull request #28237: [SPARK-31465][SQL][DOCS] Document Literal in SQL Reference
cloud-fan commented on a change in pull request #28237:
URL: https://github.com/apache/spark/pull/28237#discussion_r411101432

## File path: docs/sql-ref-literals.md

@@ -0,0 +1,506 @@
+---
+layout: global
+title: Literals
+displayTitle: Literals
+license: |
+  Licensed to the Apache Software Foundation (ASF) under one or more
+  contributor license agreements. See the NOTICE file distributed with
+  this work for additional information regarding copyright ownership.
+  The ASF licenses this file to You under the Apache License, Version 2.0
+  (the "License"); you may not use this file except in compliance with
+  the License. You may obtain a copy of the License at
+
+    http://www.apache.org/licenses/LICENSE-2.0
+
+  Unless required by applicable law or agreed to in writing, software
+  distributed under the License is distributed on an "AS IS" BASIS,
+  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+  See the License for the specific language governing permissions and
+  limitations under the License.
+---
+
+A literal (also known as a constant) represents a fixed data value. Spark SQL supports the following literals:
+
+ * [String Literal](#string-literal)
+ * [Null Literal](#null-literal)
+ * [Boolean Literal](#boolean-literal)
+ * [Numeric Literal](#numeric-literal)
+ * [Datetime Literal](#datetime-literal)
+ * [Interval Literal](#interval-literal)
+
+### String Literal
+
+A string literal is used to specify a character string value.
+
+#### Syntax
+
+{% highlight sql %}
+'c [ ... ]' | "c [ ... ]"
+{% endhighlight %}
+
+#### Parameters
+
+  c
+    One character from the character set. Use \ to escape special characters (e.g., ' or \).
+
+#### Examples
+
+{% highlight sql %}
+SELECT 'Hello, World!' AS col;
+  +-------------+
+  |          col|
+  +-------------+
+  |Hello, World!|
+  +-------------+
+
+SELECT "SPARK SQL" AS col;
+  +---------+
+  |      col|
+  +---------+
+  |SPARK SQL|
+  +---------+
+
+SELECT 'it\'s $10.' AS col;
+  +---------+
+  |      col|
+  +---------+
+  |it's $10.|
+  +---------+
+{% endhighlight %}
+
+### Null Literal
+
+A null literal is used to specify a null value.
+
+#### Syntax
+
+{% highlight sql %}
+NULL
+{% endhighlight %}
+
+#### Examples
+
+{% highlight sql %}
+SELECT NULL AS col;
+  +----+
+  | col|
+  +----+
+  |NULL|
+  +----+
+{% endhighlight %}
+
+### Boolean Literal
+
+A boolean literal is used to specify a boolean value.
+
+#### Syntax
+
+{% highlight sql %}
+TRUE | FALSE
+{% endhighlight %}
+
+#### Examples
+
+{% highlight sql %}
+SELECT TRUE AS col;
+  +----+
+  | col|
+  +----+
+  |true|
+  +----+
+{% endhighlight %}
+
+### Numeric Literal
+
+A numeric literal is used to specify a fixed or floating-point number.
+
+#### Integer Literal
+
+#### Syntax
+
+{% highlight sql %}
+[ + | - ] digit [ ... ] [ L | S | Y ]
+{% endhighlight %}
+
+#### Parameters
+
+  digit
+    Any numeral from 0 to 9.
+
+  L
+    Case insensitive, indicates BIGINT, which is an 8-byte signed integer number.
+
+  S
+    Case insensitive, indicates SMALLINT, which is a 2-byte signed integer number.
+
+  Y
+    Case insensitive, indicates TINYINT, which is a 1-byte signed integer number.
+
+  default (no postfix)
+    Indicates a 4-byte signed integer number.
+
+#### Examples
+
+{% highlight sql %}
+SELECT -2147483648 AS col;
+  +-----------+
+  |        col|
+  +-----------+
+  |-2147483648|
+  +-----------+
+
+SELECT 9223372036854775807l AS col;
+  +-------------------+
+  |                col|
+  +-------------------+
+  |9223372036854775807|
+  +-------------------+
+
+SELECT -32Y AS col;
+  +---+
+  |col|
+  +---+
+  |-32|
+  +---+
+
+SELECT 482S AS col;
+  +---+
+  |col|
+  +---+
+  |482|
+  +---+
+{% endhighlight %}
+
+#### Decimal Literal
+
+#### Syntax
+
+{% highlight sql %}
+[ + | - ] { digit [ ... ] . [ digit [ ... ] ] | . digit [ ... ] }
+{% endhighlight %}
+
+#### Parameters
+
+  digit
+    Any numeral from 0 to 9.
+
+#### Examples
+
+{% highlight sql %}
+SELECT 12.578 AS col;
+  +------+
+  |   col|
+  +------+
+  |12.578|
+  +------+
+
+SELECT -0.1234567 AS col;
+  +----------+
+  |       col|
+  +----------+
+  |-0.1234567|
+  +----------+
+
+SELECT -.1234567 AS col;
+  +----------+
+  |       col|
+  +----------+
+  |-0.1234567|
+  +----------+
+{% endhighlight %}
+
+#### Floating Point and BigDecimal Literals
+
+#### Syntax
+
+{% highlight sql %}
+[ + | - ] { digit [ ... ] [ E [ + | - ] digit [ ... ] ] [ D | BD ] |
+    digit [ ... ] . [ digit [ ... ] ] [ E [ + | - ] digit [ ... ] ] [ D | BD ] |
+    . digit [ ... ] [ E [ + | - ] digit [ ... ] ] [ D | BD ] }
+{% endhighlight %}
+
+#### Parameters
+
+  digit
+    Any numeral from 0 to 9.
+
+  D
+    Case insensitive, indicates DOUBLE, which is an 8-byte double-precision floating point number.
+
+  BD
+    Case insensitive, indicates
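Since the quoted draft enumerates the integer postfixes, here is a quick way to confirm the types they produce; a sketch assuming an existing SparkSession named `spark` (for example inside spark-shell):

```scala
// Confirming the postfix-to-type mapping described in the draft above.
spark.sql("SELECT 127Y AS y, 482S AS s, 42 AS i, 9223372036854775807L AS l").printSchema()
// root
//  |-- y: tinyint (nullable = false)
//  |-- s: smallint (nullable = false)
//  |-- i: int (nullable = false)
//  |-- l: bigint (nullable = false)
```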
[GitHub] [spark] SparkQA commented on issue #26339: [SPARK-27194][SPARK-29302][SQL] Fix the issue that for dynamic partition overwrite a task would conflict with its speculative task
SparkQA commented on issue #26339:
URL: https://github.com/apache/spark/pull/26339#issuecomment-616318288

**[Test build #121503 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/121503/testReport)** for PR 26339 at commit [`fdeeb5c`](https://github.com/apache/spark/commit/fdeeb5c3acf3f917a370a77d7327401949eb34a4).
[GitHub] [spark] cloud-fan commented on a change in pull request #28237: [SPARK-31465][SQL][DOCS] Document Literal in SQL Reference
cloud-fan commented on a change in pull request #28237:
URL: https://github.com/apache/spark/pull/28237#discussion_r411100904

## File path: docs/sql-ref-literals.md
[GitHub] [spark] cloud-fan commented on a change in pull request #28237: [SPARK-31465][SQL][DOCS] Document Literal in SQL Reference
cloud-fan commented on a change in pull request #28237:
URL: https://github.com/apache/spark/pull/28237#discussion_r411100552

## File path: docs/sql-ref-literals.md
[GitHub] [spark] cloud-fan commented on a change in pull request #28237: [SPARK-31465][SQL][DOCS] Document Literal in SQL Reference
cloud-fan commented on a change in pull request #28237:
URL: https://github.com/apache/spark/pull/28237#discussion_r411099766

## File path: docs/sql-ref-literals.md
[GitHub] [spark] cloud-fan commented on a change in pull request #28237: [SPARK-31465][SQL][DOCS] Document Literal in SQL Reference
cloud-fan commented on a change in pull request #28237:
URL: https://github.com/apache/spark/pull/28237#discussion_r411099482

## File path: docs/sql-ref-literals.md

+[ + | - ] { digit [ ... ] . [ digit [ ... ] ] | . digit [ ... ] }

Review comment:
and there is no way to create float type literals (except using functions like `cast` and `float`). Maybe we should look at the SQL standard and support it later.
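The two workarounds mentioned, shown concretely (assuming a SparkSession `spark`); `float` here is Spark's cast-style function, not a literal suffix:

```scala
// No FLOAT literal postfix exists, so a cast (in either spelling) is needed.
spark.sql("SELECT CAST(1.5 AS FLOAT) AS a, float(1.5) AS b").printSchema()
// root
//  |-- a: float (nullable = false)
//  |-- b: float (nullable = false)
```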
[GitHub] [spark] dongjoon-hyun commented on issue #28148: [SPARK-31381][SPARK-29245][SQL] Upgrade built-in Hive 2.3.6 to 2.3.7
dongjoon-hyun commented on issue #28148:
URL: https://github.com/apache/spark/pull/28148#issuecomment-616316299

Finally! Thank you, @wangyum.
[GitHub] [spark] maropu commented on a change in pull request #28237: [SPARK-31465][SQL][DOCS] Document Literal in SQL Reference
maropu commented on a change in pull request #28237:
URL: https://github.com/apache/spark/pull/28237#discussion_r411099259

## File path: docs/sql-ref-literals.md

+#### Floating Point and BigDecimal Literals

Review comment:
> I think we should mention BD in the Decimal Literal section.

+1
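For context on what the `BD` postfix buys, a small check assuming an existing SparkSession `spark`; the exact precision/scale shown is my expectation under Spark 3.0 defaults, not quoted from the PR:

```scala
// BD forces an exact DECIMAL, D forces DOUBLE, and a plain decimal literal
// is also DECIMAL by default.
spark.sql("SELECT 12.578BD AS bd, 12.578D AS d, 12.578 AS plain").printSchema()
// root
//  |-- bd: decimal(5,3) (nullable = false)
//  |-- d: double (nullable = false)
//  |-- plain: decimal(5,3) (nullable = false)
```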
[GitHub] [spark] cloud-fan commented on a change in pull request #28237: [SPARK-31465][SQL][DOCS] Document Literal in SQL Reference
cloud-fan commented on a change in pull request #28237: URL: https://github.com/apache/spark/pull/28237#discussion_r411098844 ## File path: docs/sql-ref-literals.md ## @@ -0,0 +1,506 @@ +--- +layout: global +title: Literals +displayTitle: Literals +license: | + Licensed to the Apache Software Foundation (ASF) under one or more + contributor license agreements. See the NOTICE file distributed with + this work for additional information regarding copyright ownership. + The ASF licenses this file to You under the Apache License, Version 2.0 + (the "License"); you may not use this file except in compliance with + the License. You may obtain a copy of the License at + + http://www.apache.org/licenses/LICENSE-2.0 + + Unless required by applicable law or agreed to in writing, software + distributed under the License is distributed on an "AS IS" BASIS, + WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + See the License for the specific language governing permissions and + limitations under the License. +--- + +A literal (also known as a constant) represents a fixed data value. Spark SQL supports the following literals: + + * [String Literal](#string-literal) + * [Null Literal](#null-literal) + * [Boolean Literal](#boolean-literal) + * [Numeric Literal](#numeric-literal) + * [Datetime Literal](#datetime-literal) + * [Interval Literal](#interval-literal) + +### String Literal + +A string literal is used to specify a character string value. + + Syntax + +{% highlight sql %} +'c [ ... ]' | "c [ ... ]" +{% endhighlight %} + + Parameters + + + c + +One character from the character set. Use \ to escape special characters (e.g., ' or \). + + + + Examples + +{% highlight sql %} +SELECT 'Hello, World!' AS col; + +-+ + | col| + +-+ + |Hello, World!| + +-+ + +SELECT "SPARK SQL" AS col; + +-+ + | col| + +-+ + |Spark SQL| + +-+ + +SELECT SELECT 'it\'s $10.' AS col; + +-+ + | col| + +-+ + |It's $10.| + +-+ +{% endhighlight %} + +### Null Literal + +A null literal is used to specify a null value. + + Syntax + +{% highlight sql %} +NULL +{% endhighlight %} + + Examples + +{% highlight sql %} +SELECT NULL AS col; + ++ + | col| + ++ + |NULL| + ++ +{% endhighlight %} + +### Boolean Literal + +A boolean literal is used to specify a boolean value. + + Syntax + +{% highlight sql %} +TRUE | FALSE +{% endhighlight %} + + Examples + +{% highlight sql %} +SELECT TRUE AS col; + ++ + | col| + ++ + |true| + ++ +{% endhighlight %} + +### Numeric Literal + +A numeric literal is used to specify a fixed or floating-point number. + + Integer Literal + + Syntax + +{% highlight sql %} +[ + | - ] digit [ ... ] [ L | S | Y ] +{% endhighlight %} + + Parameters + + + digit + +Any numeral from 0 to 9. + + + + L + +Case insensitive, indicates BIGINT, which is a 8-byte signed integer number. + + + + S + +Case insensitive, indicates SMALLINT, which is a 2-byte signed integer number. + + + + Y + +Case insensitive, indicates TINYINT, which is a 1-byte signed integer number. + + + + default (no postfix) + +Indicates a 4-byte signed integer number. + + + + Examples + +{% highlight sql %} +SELECT -2147483648 AS col; + +---+ + |col| + +---+ + |-2147483648| + +---+ + +SELECT 9223372036854775807l AS col; + +---+ + |col| + +---+ + |9223372036854775807| + +---+ + +SELECT -32Y AS col; + +---+ + |col| + +---+ + |-32| + +---+ + +SELECT 482S AS col; + +---+ + |col| + +---+ + |482| + +---+ +{% endhighlight %} + + Decimal Literal + + Syntax + +{% highlight sql %} +[ + | - ] { digit [ ... ] . [ digit [ ... ] ] | . digit [ ... 
] } Review comment: seems better to mention the fraction literals together: ``` decimal_digits: [ + | - ] { digit [ ... ] . [ digit [ ... ] ] | . digit [ ... ] } exponent: E [+ | -] digit [ ... ] decimal literal: decimal_digits | decimal_digits [exponent] 'BD' double literal: decimal_digits exponent | decimal_digits [exponent] 'D' ``` This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
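A hypothetical spark-shell check (not part of the PR diff) of how the four forms in the suggested grammar would behave; the result types in the comments are assumptions derived from that grammar.

```scala
// Hypothetical illustration of the suggested grammar; the commented types are assumptions.
spark.sql("SELECT 12.578 AS a, 12.578BD AS b, 2.5E2 AS c, 2.5D AS d").printSchema()
// a: decimal(5,3)  -- decimal_digits            (decimal literal)
// b: decimal(5,3)  -- decimal_digits 'BD'       (decimal literal)
// c: double        -- decimal_digits exponent   (double literal)
// d: double        -- decimal_digits 'D'        (double literal)
```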
[GitHub] [spark] dongjoon-hyun commented on issue #28266: [SPARK-31256][SQL] DataFrameNaFunctions.drop should work for nested columns
dongjoon-hyun commented on issue #28266: URL: https://github.com/apache/spark/pull/28266#issuecomment-616314632 So, SPARK-31256 introduced a regression in 2.4.5 and this recovers it? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] viirya commented on a change in pull request #27728: [SPARK-25556][SPARK-17636][SPARK-31026][SPARK-31060][SQL][test-hive1.2] Nested Column Predicate Pushdown for Parquet
viirya commented on a change in pull request #27728: URL: https://github.com/apache/spark/pull/27728#discussion_r411096547 ## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/DataSourceStrategy.scala ## @@ -652,10 +652,19 @@ object DataSourceStrategy { */ object PushableColumn { def unapply(e: Expression): Option[String] = { -def helper(e: Expression) = e match { - case a: Attribute => Some(a.name) +val nestedPredicatePushdownEnabled = SQLConf.get.nestedPredicatePushdownEnabled +import org.apache.spark.sql.connector.catalog.CatalogV2Implicits.MultipartIdentifierHelper +def helper(e: Expression): Option[Seq[String]] = e match { + case a: Attribute => +if (nestedPredicatePushdownEnabled || !a.name.contains(".")) { + Some(Seq(a.name)) +} else { + None +} + case s: GetStructField if nestedPredicatePushdownEnabled => +helper(s.child).map(_ :+ s.childSchema(s.ordinal).name) case _ => None } -helper(e) +helper(e).map(_.quoted) Review comment: I can try looking at this this week. If anyone picks it up before me, I'm also ok. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
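For readers following the quoted `PushableColumn` change, a simplified sketch (not the actual `MultipartIdentifierHelper` implementation) of what joining and quoting the collected name parts amounts to; the real helper only quotes parts when necessary, while this version quotes unconditionally for brevity.

```scala
// Simplified sketch: render a multi-part column name the way `_.quoted` would,
// escaping embedded backticks and joining the parts with dots.
def quoted(parts: Seq[String]): String =
  parts.map(p => "`" + p.replace("`", "``") + "`").mkString(".")

quoted(Seq("name"))                 // `name`
quoted(Seq("address", "zip.code"))  // `address`.`zip.code`
```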
[GitHub] [spark] cloud-fan commented on a change in pull request #28237: [SPARK-31465][SQL][DOCS] Document Literal in SQL Reference
cloud-fan commented on a change in pull request #28237: URL: https://github.com/apache/spark/pull/28237#discussion_r411096138 ## File path: docs/sql-ref-literals.md ## @@ -0,0 +1,505 @@ +--- +layout: global +title: Literals +displayTitle: Literals +license: | + Licensed to the Apache Software Foundation (ASF) under one or more + contributor license agreements. See the NOTICE file distributed with + this work for additional information regarding copyright ownership. + The ASF licenses this file to You under the Apache License, Version 2.0 + (the "License"); you may not use this file except in compliance with + the License. You may obtain a copy of the License at + + http://www.apache.org/licenses/LICENSE-2.0 + + Unless required by applicable law or agreed to in writing, software + distributed under the License is distributed on an "AS IS" BASIS, + WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + See the License for the specific language governing permissions and + limitations under the License. +--- + +A literal (also known as a constant) represents a fixed data value. Spark SQL supports the following literals: + + * [String Literal](#string-literal) + * [Null Literal](#null-literal) + * [Boolean Literal](#boolean-literal) + * [Numeric Literal](#numeric-literal) + * [Datetime Literal](#datetime-literal) + * [Interval Literal](#interval-literal) + +### String Literal + +A string literal is used to specify a character string value. + + Syntax + +{% highlight sql %} +'c [ ... ]' | "c [ ... ]" +{% endhighlight %} + + Parameters + + + c + +One character from the character set. Use \ to escape special characters. + + + + Examples + +{% highlight sql %} +SELECT 'Hello, World!' AS col; + +-+ + | col| + +-+ + |Hello, World!| + +-+ + +SELECT "SPARK SQL" AS col; + +-+ + | col| + +-+ + |Spark SQL| + +-+ + +SELECT SELECT 'it\'s $10.' AS col; + +-+ + | col| + +-+ + |It's $10.| + +-+ +{% endhighlight %} + +### Null Literal + +A null literal is used to specify a null value. + + Syntax + +{% highlight sql %} +NULL +{% endhighlight %} + + Examples + +{% highlight sql %} +SELECT NULL AS col; + ++ + | col| + ++ + |NULL| + ++ +{% endhighlight %} + +### Boolean Literal + +A boolean literal is used to specify a boolean value. + + Syntax + +{% highlight sql %} +TRUE | FALSE +{% endhighlight %} + + Examples + +{% highlight sql %} +SELECT TRUE AS col; + ++ + | col| + ++ + |true| + ++ +{% endhighlight %} + +### Numeric Literal + +A numeric literal is used to specify a fixed or floating-point number. + + Integer Literal + + Syntax + +{% highlight sql %} +[ + | - ] digit [ ... ] [ L | S | Y ] +{% endhighlight %} + + Parameters + + + digit + +Any numeral from 0 to 9. + + + + L + +Case insensitive, indicates BIGINT, which is a 8-byte signed integer number. + + + + S + +Case insensitive, indicates SMALLINT, which is a 2-byte signed integer number. + + + + Y + +Case insensitive, indicates TINYINT, which is a 1-byte signed integer number. + + + + default (no postfix) + +Indicates a 4-byte signed integer number. + + + Examples + +{% highlight sql %} +SELECT -2147483648 AS col; + +---+ + |col| + +---+ + |-2147483648| + +---+ + +SELECT 9223372036854775807l AS col; + +---+ + |col| + +---+ + |9223372036854775807| + +---+ + +SELECT -32Y AS col; + +---+ + |col| + +---+ + |-32| + +---+ + +SELECT 482S AS col; + +---+ + |col| + +---+ + |482| + +---+ +{% endhighlight %} + + Decimal Literal + + Syntax + +{% highlight sql %} +[ + | - ] { digit [ ... ] . [ digit [ ... ] ] | . digit [ ... 
] } +{% endhighlight %} + + Parameters + + + digit + +Any numeral from 0 to 9. + + + + Examples + +{% highlight sql %} +SELECT 12.578 AS col; + +--+ + | col| + +--+ + |12.578| + +--+ + +SELECT -0.1234567 AS col; + +--+ + | col| + +--+ + |-0.1234567| + +--+ + +SELECT -.1234567 AS col; + +--+ + | col| + +--+ + |-0.1234567| + +--+ +{% endhighlight %} + + Floating Point and BigDecimal Literals Review comment: they are different SQL syntax, but they both create decimal literals (decimal type values). I think we should mention `BD` in the `Decimal Literal` section. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
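A hedged illustration of the point above, i.e. that the `BD` suffix yields a decimal-typed value just like the plain fractional form, whereas the `D` suffix yields a double; the `typeof` call and the exact precision/scale shown are assumptions.

```scala
// Assumed behavior: BD produces DECIMAL, D produces DOUBLE.
spark.sql("SELECT typeof(12.578BD) AS bd_type, typeof(12.578D) AS d_type").show()
// +------------+------+
// |     bd_type|d_type|
// +------------+------+
// |decimal(5,3)|double|
// +------------+------+
```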
[GitHub] [spark] cloud-fan commented on a change in pull request #28237: [SPARK-31465][SQL][DOCS] Document Literal in SQL Reference
cloud-fan commented on a change in pull request #28237: URL: https://github.com/apache/spark/pull/28237#discussion_r411095492 ## File path: docs/sql-ref-literals.md ## @@ -0,0 +1,506 @@ +--- +layout: global +title: Literals +displayTitle: Literals +license: | + Licensed to the Apache Software Foundation (ASF) under one or more + contributor license agreements. See the NOTICE file distributed with + this work for additional information regarding copyright ownership. + The ASF licenses this file to You under the Apache License, Version 2.0 + (the "License"); you may not use this file except in compliance with + the License. You may obtain a copy of the License at + + http://www.apache.org/licenses/LICENSE-2.0 + + Unless required by applicable law or agreed to in writing, software + distributed under the License is distributed on an "AS IS" BASIS, + WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + See the License for the specific language governing permissions and + limitations under the License. +--- + +A literal (also known as a constant) represents a fixed data value. Spark SQL supports the following literals: + + * [String Literal](#string-literal) + * [Null Literal](#null-literal) + * [Boolean Literal](#boolean-literal) + * [Numeric Literal](#numeric-literal) + * [Datetime Literal](#datetime-literal) + * [Interval Literal](#interval-literal) + +### String Literal + +A string literal is used to specify a character string value. + + Syntax + +{% highlight sql %} +'c [ ... ]' | "c [ ... ]" +{% endhighlight %} + + Parameters + + + c + +One character from the character set. Use \ to escape special characters (e.g., ' or \). + + + + Examples + +{% highlight sql %} +SELECT 'Hello, World!' AS col; + +-+ + | col| + +-+ + |Hello, World!| + +-+ + +SELECT "SPARK SQL" AS col; + +-+ + | col| + +-+ + |Spark SQL| + +-+ + +SELECT SELECT 'it\'s $10.' AS col; + +-+ + | col| + +-+ + |It's $10.| + +-+ +{% endhighlight %} + +### Null Literal + +A null literal is used to specify a null value. + + Syntax + +{% highlight sql %} +NULL +{% endhighlight %} + + Examples + +{% highlight sql %} +SELECT NULL AS col; + ++ + | col| + ++ + |NULL| + ++ +{% endhighlight %} + +### Boolean Literal + +A boolean literal is used to specify a boolean value. + + Syntax + +{% highlight sql %} +TRUE | FALSE +{% endhighlight %} + + Examples + +{% highlight sql %} +SELECT TRUE AS col; + ++ + | col| + ++ + |true| + ++ +{% endhighlight %} + +### Numeric Literal + +A numeric literal is used to specify a fixed or floating-point number. + + Integer Literal + + Syntax + +{% highlight sql %} +[ + | - ] digit [ ... ] [ L | S | Y ] +{% endhighlight %} + + Parameters + + + digit + +Any numeral from 0 to 9. + + + + L + +Case insensitive, indicates BIGINT, which is a 8-byte signed integer number. + + + + S + +Case insensitive, indicates SMALLINT, which is a 2-byte signed integer number. + + + + Y + +Case insensitive, indicates TINYINT, which is a 1-byte signed integer number. + + + + default (no postfix) + +Indicates a 4-byte signed integer number. + + + + Examples + +{% highlight sql %} +SELECT -2147483648 AS col; + +---+ + |col| + +---+ + |-2147483648| + +---+ + +SELECT 9223372036854775807l AS col; + +---+ + |col| + +---+ + |9223372036854775807| + +---+ + +SELECT -32Y AS col; + +---+ + |col| + +---+ + |-32| + +---+ + +SELECT 482S AS col; + +---+ + |col| + +---+ + |482| + +---+ +{% endhighlight %} + + Decimal Literal + + Syntax + +{% highlight sql %} +[ + | - ] { digit [ ... ] . [ digit [ ... ] ] | . digit [ ... 
] } +{% endhighlight %} + + Parameters + + + digit + +Any numeral from 0 to 9. + + + + Examples + +{% highlight sql %} +SELECT 12.578 AS col; + +--+ + | col| + +--+ + |12.578| + +--+ + +SELECT -0.1234567 AS col; + +--+ + | col| + +--+ + |-0.1234567| + +--+ + +SELECT -.1234567 AS col; Review comment: `123.` is also a decimal, right? which is the same as `123.0` This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] viirya commented on issue #28197: [SPARK-31431][SQL] Add CalendarInterval encoder support
viirya commented on issue #28197: URL: https://github.com/apache/spark/pull/28197#issuecomment-616311588 Do we expect users to read data and represent it as CalendarInterval in a Dataset? It seems to me that CalendarInterval is only meant for use inside Spark rows. Although not quite the same, this sounds similar to exposing UTF8String in a Dataset. The domain objects we provide Dataset encoders for should be objects that users frequently work with in their business logic, so this encoder looked strange to me at first. It may not be a problem, but I wonder whether we should be careful about adding encoders. If others also agree to add it, I'm fine with this. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] cloud-fan commented on a change in pull request #28237: [SPARK-31465][SQL][DOCS] Document Literal in SQL Reference
cloud-fan commented on a change in pull request #28237: URL: https://github.com/apache/spark/pull/28237#discussion_r411095121 ## File path: docs/sql-ref-literals.md ## @@ -0,0 +1,506 @@ +--- +layout: global +title: Literals +displayTitle: Literals +license: | + Licensed to the Apache Software Foundation (ASF) under one or more + contributor license agreements. See the NOTICE file distributed with + this work for additional information regarding copyright ownership. + The ASF licenses this file to You under the Apache License, Version 2.0 + (the "License"); you may not use this file except in compliance with + the License. You may obtain a copy of the License at + + http://www.apache.org/licenses/LICENSE-2.0 + + Unless required by applicable law or agreed to in writing, software + distributed under the License is distributed on an "AS IS" BASIS, + WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + See the License for the specific language governing permissions and + limitations under the License. +--- + +A literal (also known as a constant) represents a fixed data value. Spark SQL supports the following literals: + + * [String Literal](#string-literal) + * [Null Literal](#null-literal) + * [Boolean Literal](#boolean-literal) + * [Numeric Literal](#numeric-literal) + * [Datetime Literal](#datetime-literal) + * [Interval Literal](#interval-literal) + +### String Literal + +A string literal is used to specify a character string value. + + Syntax + +{% highlight sql %} +'c [ ... ]' | "c [ ... ]" +{% endhighlight %} + + Parameters + + + c + +One character from the character set. Use \ to escape special characters (e.g., ' or \). + + + + Examples + +{% highlight sql %} +SELECT 'Hello, World!' AS col; + +-+ + | col| + +-+ + |Hello, World!| + +-+ + +SELECT "SPARK SQL" AS col; + +-+ + | col| + +-+ + |Spark SQL| + +-+ + +SELECT SELECT 'it\'s $10.' AS col; + +-+ + | col| + +-+ + |It's $10.| + +-+ +{% endhighlight %} + +### Null Literal + +A null literal is used to specify a null value. + + Syntax + +{% highlight sql %} +NULL +{% endhighlight %} + + Examples + +{% highlight sql %} +SELECT NULL AS col; + ++ + | col| + ++ + |NULL| + ++ +{% endhighlight %} + +### Boolean Literal + +A boolean literal is used to specify a boolean value. + + Syntax + +{% highlight sql %} +TRUE | FALSE +{% endhighlight %} + + Examples + +{% highlight sql %} +SELECT TRUE AS col; + ++ + | col| + ++ + |true| + ++ +{% endhighlight %} + +### Numeric Literal + +A numeric literal is used to specify a fixed or floating-point number. + + Integer Literal + + Syntax + +{% highlight sql %} +[ + | - ] digit [ ... ] [ L | S | Y ] +{% endhighlight %} + + Parameters + + + digit + +Any numeral from 0 to 9. + + + + L + +Case insensitive, indicates BIGINT, which is a 8-byte signed integer number. + + + + S + +Case insensitive, indicates SMALLINT, which is a 2-byte signed integer number. + + + + Y + +Case insensitive, indicates TINYINT, which is a 1-byte signed integer number. + + + + default (no postfix) + +Indicates a 4-byte signed integer number. + + + + Examples + +{% highlight sql %} +SELECT -2147483648 AS col; + +---+ + |col| + +---+ + |-2147483648| + +---+ + +SELECT 9223372036854775807l AS col; + +---+ + |col| + +---+ + |9223372036854775807| + +---+ + +SELECT -32Y AS col; + +---+ + |col| + +---+ + |-32| + +---+ + +SELECT 482S AS col; + +---+ + |col| + +---+ + |482| + +---+ +{% endhighlight %} + + Decimal Literal + + Syntax + +{% highlight sql %} +[ + | - ] { digit [ ... ] . [ digit [ ... ] ] | . digit [ ... 
] } Review comment: nit: ``` [ + | - ] { { digit [ ... ] . [digit [ ... ] ] } | { . digit [ ... ] } } ``` This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] cloud-fan commented on a change in pull request #28237: [SPARK-31465][SQL][DOCS] Document Literal in SQL Reference
cloud-fan commented on a change in pull request #28237: URL: https://github.com/apache/spark/pull/28237#discussion_r411095492 ## File path: docs/sql-ref-literals.md ## @@ -0,0 +1,506 @@ +--- +layout: global +title: Literals +displayTitle: Literals +license: | + Licensed to the Apache Software Foundation (ASF) under one or more + contributor license agreements. See the NOTICE file distributed with + this work for additional information regarding copyright ownership. + The ASF licenses this file to You under the Apache License, Version 2.0 + (the "License"); you may not use this file except in compliance with + the License. You may obtain a copy of the License at + + http://www.apache.org/licenses/LICENSE-2.0 + + Unless required by applicable law or agreed to in writing, software + distributed under the License is distributed on an "AS IS" BASIS, + WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + See the License for the specific language governing permissions and + limitations under the License. +--- + +A literal (also known as a constant) represents a fixed data value. Spark SQL supports the following literals: + + * [String Literal](#string-literal) + * [Null Literal](#null-literal) + * [Boolean Literal](#boolean-literal) + * [Numeric Literal](#numeric-literal) + * [Datetime Literal](#datetime-literal) + * [Interval Literal](#interval-literal) + +### String Literal + +A string literal is used to specify a character string value. + + Syntax + +{% highlight sql %} +'c [ ... ]' | "c [ ... ]" +{% endhighlight %} + + Parameters + + + c + +One character from the character set. Use \ to escape special characters (e.g., ' or \). + + + + Examples + +{% highlight sql %} +SELECT 'Hello, World!' AS col; + +-+ + | col| + +-+ + |Hello, World!| + +-+ + +SELECT "SPARK SQL" AS col; + +-+ + | col| + +-+ + |Spark SQL| + +-+ + +SELECT SELECT 'it\'s $10.' AS col; + +-+ + | col| + +-+ + |It's $10.| + +-+ +{% endhighlight %} + +### Null Literal + +A null literal is used to specify a null value. + + Syntax + +{% highlight sql %} +NULL +{% endhighlight %} + + Examples + +{% highlight sql %} +SELECT NULL AS col; + ++ + | col| + ++ + |NULL| + ++ +{% endhighlight %} + +### Boolean Literal + +A boolean literal is used to specify a boolean value. + + Syntax + +{% highlight sql %} +TRUE | FALSE +{% endhighlight %} + + Examples + +{% highlight sql %} +SELECT TRUE AS col; + ++ + | col| + ++ + |true| + ++ +{% endhighlight %} + +### Numeric Literal + +A numeric literal is used to specify a fixed or floating-point number. + + Integer Literal + + Syntax + +{% highlight sql %} +[ + | - ] digit [ ... ] [ L | S | Y ] +{% endhighlight %} + + Parameters + + + digit + +Any numeral from 0 to 9. + + + + L + +Case insensitive, indicates BIGINT, which is a 8-byte signed integer number. + + + + S + +Case insensitive, indicates SMALLINT, which is a 2-byte signed integer number. + + + + Y + +Case insensitive, indicates TINYINT, which is a 1-byte signed integer number. + + + + default (no postfix) + +Indicates a 4-byte signed integer number. + + + + Examples + +{% highlight sql %} +SELECT -2147483648 AS col; + +---+ + |col| + +---+ + |-2147483648| + +---+ + +SELECT 9223372036854775807l AS col; + +---+ + |col| + +---+ + |9223372036854775807| + +---+ + +SELECT -32Y AS col; + +---+ + |col| + +---+ + |-32| + +---+ + +SELECT 482S AS col; + +---+ + |col| + +---+ + |482| + +---+ +{% endhighlight %} + + Decimal Literal + + Syntax + +{% highlight sql %} +[ + | - ] { digit [ ... ] . [ digit [ ... ] ] | . digit [ ... 
] } +{% endhighlight %} + + Parameters + + + digit + +Any numeral from 0 to 9. + + + + Examples + +{% highlight sql %} +SELECT 12.578 AS col; + +--+ + | col| + +--+ + |12.578| + +--+ + +SELECT -0.1234567 AS col; + +--+ + | col| + +--+ + |-0.1234567| + +--+ + +SELECT -.1234567 AS col; Review comment: `123.` is also a decimal, right? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] cloud-fan commented on a change in pull request #28237: [SPARK-31465][SQL][DOCS] Document Literal in SQL Reference
cloud-fan commented on a change in pull request #28237: URL: https://github.com/apache/spark/pull/28237#discussion_r411095121 ## File path: docs/sql-ref-literals.md ## @@ -0,0 +1,506 @@ +--- +layout: global +title: Literals +displayTitle: Literals +license: | + Licensed to the Apache Software Foundation (ASF) under one or more + contributor license agreements. See the NOTICE file distributed with + this work for additional information regarding copyright ownership. + The ASF licenses this file to You under the Apache License, Version 2.0 + (the "License"); you may not use this file except in compliance with + the License. You may obtain a copy of the License at + + http://www.apache.org/licenses/LICENSE-2.0 + + Unless required by applicable law or agreed to in writing, software + distributed under the License is distributed on an "AS IS" BASIS, + WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + See the License for the specific language governing permissions and + limitations under the License. +--- + +A literal (also known as a constant) represents a fixed data value. Spark SQL supports the following literals: + + * [String Literal](#string-literal) + * [Null Literal](#null-literal) + * [Boolean Literal](#boolean-literal) + * [Numeric Literal](#numeric-literal) + * [Datetime Literal](#datetime-literal) + * [Interval Literal](#interval-literal) + +### String Literal + +A string literal is used to specify a character string value. + + Syntax + +{% highlight sql %} +'c [ ... ]' | "c [ ... ]" +{% endhighlight %} + + Parameters + + + c + +One character from the character set. Use \ to escape special characters (e.g., ' or \). + + + + Examples + +{% highlight sql %} +SELECT 'Hello, World!' AS col; + +-+ + | col| + +-+ + |Hello, World!| + +-+ + +SELECT "SPARK SQL" AS col; + +-+ + | col| + +-+ + |Spark SQL| + +-+ + +SELECT SELECT 'it\'s $10.' AS col; + +-+ + | col| + +-+ + |It's $10.| + +-+ +{% endhighlight %} + +### Null Literal + +A null literal is used to specify a null value. + + Syntax + +{% highlight sql %} +NULL +{% endhighlight %} + + Examples + +{% highlight sql %} +SELECT NULL AS col; + ++ + | col| + ++ + |NULL| + ++ +{% endhighlight %} + +### Boolean Literal + +A boolean literal is used to specify a boolean value. + + Syntax + +{% highlight sql %} +TRUE | FALSE +{% endhighlight %} + + Examples + +{% highlight sql %} +SELECT TRUE AS col; + ++ + | col| + ++ + |true| + ++ +{% endhighlight %} + +### Numeric Literal + +A numeric literal is used to specify a fixed or floating-point number. + + Integer Literal + + Syntax + +{% highlight sql %} +[ + | - ] digit [ ... ] [ L | S | Y ] +{% endhighlight %} + + Parameters + + + digit + +Any numeral from 0 to 9. + + + + L + +Case insensitive, indicates BIGINT, which is a 8-byte signed integer number. + + + + S + +Case insensitive, indicates SMALLINT, which is a 2-byte signed integer number. + + + + Y + +Case insensitive, indicates TINYINT, which is a 1-byte signed integer number. + + + + default (no postfix) + +Indicates a 4-byte signed integer number. + + + + Examples + +{% highlight sql %} +SELECT -2147483648 AS col; + +---+ + |col| + +---+ + |-2147483648| + +---+ + +SELECT 9223372036854775807l AS col; + +---+ + |col| + +---+ + |9223372036854775807| + +---+ + +SELECT -32Y AS col; + +---+ + |col| + +---+ + |-32| + +---+ + +SELECT 482S AS col; + +---+ + |col| + +---+ + |482| + +---+ +{% endhighlight %} + + Decimal Literal + + Syntax + +{% highlight sql %} +[ + | - ] { digit [ ... ] . [ digit [ ... ] ] | . digit [ ... 
] } Review comment: nit: ``` [ + | - ] { { digit [ ... ] [. digit [ ... ] ] } | { . digit [ ... ] } } ``` This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] huaxingao commented on issue #28237: [SPARK-31465][SQL][DOCS] Document Literal in SQL Reference
huaxingao commented on issue #28237: URL: https://github.com/apache/spark/pull/28237#issuecomment-616309537 @cloud-fan I addressed all the comments. Could you please check one more time? Thanks! This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] AmplabJenkins removed a comment on issue #28269: [SPARK-31493][SQL] Optimize InSet to In according partition size at InSubqueryExec
AmplabJenkins removed a comment on issue #28269: URL: https://github.com/apache/spark/pull/28269#issuecomment-616306537 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] AmplabJenkins commented on issue #28269: [SPARK-31493][SQL] Optimize InSet to In according partition size at InSubqueryExec
AmplabJenkins commented on issue #28269: URL: https://github.com/apache/spark/pull/28269#issuecomment-616306537 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] SparkQA commented on issue #28269: [SPARK-31493][SQL] Optimize InSet to In according partition size at InSubqueryExec
SparkQA commented on issue #28269: URL: https://github.com/apache/spark/pull/28269#issuecomment-616306215 **[Test build #121502 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/121502/testReport)** for PR 28269 at commit [`dfb8504`](https://github.com/apache/spark/commit/dfb8504b00e736d5bb230850cafd749acf83b130). This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] cloud-fan commented on a change in pull request #28250: [SPARK-31475][SQL] Broadcast stage in AQE did not timeout
cloud-fan commented on a change in pull request #28250: URL: https://github.com/apache/spark/pull/28250#discussion_r411089336 ## File path: sql/core/src/test/scala/org/apache/spark/sql/execution/joins/BroadcastJoinSuite.scala ## @@ -398,4 +399,22 @@ class BroadcastJoinSuite extends QueryTest with SQLTestUtils with AdaptiveSparkP } } } + + test("Broadcast timeout") { +val timeout = 30 +val slowUDF = udf({ x: Int => Thread.sleep(timeout * 10 * 1000); x }) +val df1 = spark.range(10).select($"id" as 'a) +val df2 = spark.range(5).select(slowUDF($"id") as 'a) +val testDf = df1.join(broadcast(df2), "a") +withSQLConf(SQLConf.BROADCAST_TIMEOUT.key -> timeout.toString) { + val e = intercept[Exception] { +testDf.collect() + } + AdaptiveTestUtils.assertExceptionMessage(e, s"Could not execute broadcast in $timeout secs.") Review comment: so this test runs 30 seconds? Can we make it a bit shorter? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
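One way the comment above could be addressed, sketched from the quoted test itself (all names are taken from the quoted diff and assumed to be in scope); only the timeout constant changes, and the UDF still sleeps far past it:

```scala
// Sketch: same test as quoted above, but with a 2-second broadcast timeout so the
// suite does not block for ~30 seconds while still exercising the timeout path.
test("Broadcast timeout") {
  val timeout = 2
  val slowUDF = udf({ x: Int => Thread.sleep(timeout * 10 * 1000); x })
  val df1 = spark.range(10).select($"id" as 'a)
  val df2 = spark.range(5).select(slowUDF($"id") as 'a)
  val testDf = df1.join(broadcast(df2), "a")
  withSQLConf(SQLConf.BROADCAST_TIMEOUT.key -> timeout.toString) {
    val e = intercept[Exception](testDf.collect())
    AdaptiveTestUtils.assertExceptionMessage(e, s"Could not execute broadcast in $timeout secs.")
  }
}
```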
[GitHub] [spark] AmplabJenkins removed a comment on issue #28265: [SPARK-31234][SQL][FOLLOW-UP] ResetCommand should not affect static SQL Configuration
AmplabJenkins removed a comment on issue #28265: URL: https://github.com/apache/spark/pull/28265#issuecomment-616304410 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] AmplabJenkins commented on issue #28265: [SPARK-31234][SQL][FOLLOW-UP] ResetCommand should not affect static SQL Configuration
AmplabJenkins commented on issue #28265: URL: https://github.com/apache/spark/pull/28265#issuecomment-616304410 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] AmplabJenkins removed a comment on issue #28270: [SPARK-31494][ML] flatten the result dataframe of ANOVATest
AmplabJenkins removed a comment on issue #28270: URL: https://github.com/apache/spark/pull/28270#issuecomment-616302400 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] SparkQA commented on issue #28270: [SPARK-31494][ML] flatten the result dataframe of ANOVATest
SparkQA commented on issue #28270: URL: https://github.com/apache/spark/pull/28270#issuecomment-616304123 **[Test build #121500 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/121500/testReport)** for PR 28270 at commit [`ec132a7`](https://github.com/apache/spark/commit/ec132a778b98b9315cbc5417c1dd209ffb418bd6). This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] SparkQA commented on issue #28265: [SPARK-31234][SQL][FOLLOW-UP] ResetCommand should not affect static SQL Configuration
SparkQA commented on issue #28265: URL: https://github.com/apache/spark/pull/28265#issuecomment-616304143 **[Test build #121501 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/121501/testReport)** for PR 28265 at commit [`a63ad80`](https://github.com/apache/spark/commit/a63ad80b75777e0d6aaf40f26825612b389518d8). This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] AmplabJenkins removed a comment on issue #28237: [SPARK-31465][SQL][DOCS] Document Literal in SQL Reference
AmplabJenkins removed a comment on issue #28237: URL: https://github.com/apache/spark/pull/28237#issuecomment-616304127 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] AmplabJenkins commented on issue #28237: [SPARK-31465][SQL][DOCS] Document Literal in SQL Reference
AmplabJenkins commented on issue #28237: URL: https://github.com/apache/spark/pull/28237#issuecomment-616304127 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] cloud-fan commented on issue #28265: [SPARK-31234][SQL][FOLLOW-UP] ResetCommand should not affect static SQL Configuration
cloud-fan commented on issue #28265: URL: https://github.com/apache/spark/pull/28265#issuecomment-616304084 LGTM This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] SparkQA removed a comment on issue #28237: [SPARK-31465][SQL][DOCS] Document Literal in SQL Reference
SparkQA removed a comment on issue #28237: URL: https://github.com/apache/spark/pull/28237#issuecomment-616300097 **[Test build #121499 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/121499/testReport)** for PR 28237 at commit [`35cb286`](https://github.com/apache/spark/commit/35cb2862b338ecc5d24deef2e0f7ad216cae1534). This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] SparkQA commented on issue #28237: [SPARK-31465][SQL][DOCS] Document Literal in SQL Reference
SparkQA commented on issue #28237: URL: https://github.com/apache/spark/pull/28237#issuecomment-616303982 **[Test build #121499 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/121499/testReport)** for PR 28237 at commit [`35cb286`](https://github.com/apache/spark/commit/35cb2862b338ecc5d24deef2e0f7ad216cae1534). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] cloud-fan commented on a change in pull request #28251: [SPARK-31476][SQL] Add an ExpressionInfo entry for EXTRACT
cloud-fan commented on a change in pull request #28251: URL: https://github.com/apache/spark/pull/28251#discussion_r411087296 ## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/FunctionRegistry.scala ## @@ -423,6 +423,7 @@ object FunctionRegistry { expression[MakeTimestamp]("make_timestamp"), expression[MakeInterval]("make_interval"), expression[DatePart]("date_part"), +expression[Extract]("extract"), Review comment: one side effect is that we now support `extract(field, source)` in addition to `extract(field FROM source)`. Not a big deal, but it would be better if we can avoid exposing more APIs. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
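A hedged illustration of the side effect being discussed; if the registry entry from the quoted diff is kept, both call forms below would presumably be accepted and return the same value:

```scala
// Assumed behavior once `extract` is registered in FunctionRegistry.
spark.sql("SELECT extract(YEAR FROM TIMESTAMP '2019-08-12 01:00:00')").show()  // SQL-standard form: 2019
spark.sql("SELECT extract('YEAR', TIMESTAMP '2019-08-12 01:00:00')").show()    // function-call form enabled by registration: 2019
```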
[GitHub] [spark] gatorsmile commented on a change in pull request #28265: [SPARK-31234][SQL][FOLLOW-UP] ResetCommand should not affect static SQL Configuration
gatorsmile commented on a change in pull request #28265: URL: https://github.com/apache/spark/pull/28265#discussion_r411086764 ## File path: sql/core/src/test/scala/org/apache/spark/sql/SparkSessionBuilderSuite.scala ## @@ -163,9 +163,9 @@ class SparkSessionBuilderSuite extends SparkFunSuite with BeforeAndAfterEach { .getOrCreate() assert(session.sessionState.conf.getConfString("spark.app.name") === "test-app-SPARK-31234") -assert(session.sessionState.conf.getConf(GLOBAL_TEMP_DATABASE) === "globalTempDB-SPARK-31234") +assert(session.sessionState.conf.getConf(GLOBAL_TEMP_DATABASE) === "globaltempdb-spark-31234") Review comment: This difference between Spark 2.4 and Spark 3.0 is caused by https://github.com/apache/spark/pull/24979/ This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] gatorsmile commented on issue #28265: [SPARK-31234][SQL][FOLLOW-UP] ResetCommand should not affect static SQL Configuration
gatorsmile commented on issue #28265: URL: https://github.com/apache/spark/pull/28265#issuecomment-616303005 cc @cloud-fan This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] gatorsmile commented on a change in pull request #24979: [SPARK-28179][SQL] Avoid hard-coded config: spark.sql.globalTempDatabase
gatorsmile commented on a change in pull request #24979: URL: https://github.com/apache/spark/pull/24979#discussion_r411086592 ## File path: sql/core/src/main/scala/org/apache/spark/sql/internal/SharedState.scala ## @@ -158,7 +158,7 @@ private[sql] class SharedState( // System preserved database should not exists in metastore. However it's hard to guarantee it // for every session, because case-sensitivity differs. Here we always lowercase it to make our // life easier. Review comment: https://github.com/apache/spark/pull/28265 fixed it. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] gatorsmile commented on a change in pull request #24979: [SPARK-28179][SQL] Avoid hard-coded config: spark.sql.globalTempDatabase
gatorsmile commented on a change in pull request #24979: URL: https://github.com/apache/spark/pull/24979#discussion_r411086416 ## File path: sql/core/src/main/scala/org/apache/spark/sql/internal/SharedState.scala ## @@ -158,7 +158,7 @@ private[sql] class SharedState( // System preserved database should not exists in metastore. However it's hard to guarantee it // for every session, because case-sensitivity differs. Here we always lowercase it to make our // life easier. Review comment: This description should be moved to StaticSQLConf This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] SparkQA removed a comment on issue #26339: [SPARK-27194][SPARK-29302][SQL] Fix the issue that for dynamic partition overwrite a task would conflict with its speculative task
SparkQA removed a comment on issue #26339: URL: https://github.com/apache/spark/pull/26339#issuecomment-616267769 **[Test build #121493 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/121493/testReport)** for PR 26339 at commit [`a4012d8`](https://github.com/apache/spark/commit/a4012d87d1e18c7a1c922d4198ff7aa7756cac81). This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] AmplabJenkins commented on issue #28270: [SPARK-31494][ML] flatten the result dataframe of ANOVATest
AmplabJenkins commented on issue #28270: URL: https://github.com/apache/spark/pull/28270#issuecomment-616302400 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] AmplabJenkins removed a comment on issue #26339: [SPARK-27194][SPARK-29302][SQL] Fix the issue that for dynamic partition overwrite a task would conflict with its speculative task
AmplabJenkins removed a comment on issue #26339: URL: https://github.com/apache/spark/pull/26339#issuecomment-616302130 Merged build finished. Test FAILed. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] AmplabJenkins removed a comment on issue #26339: [SPARK-27194][SPARK-29302][SQL] Fix the issue that for dynamic partition overwrite a task would conflict with its speculative task
AmplabJenkins removed a comment on issue #26339: URL: https://github.com/apache/spark/pull/26339#issuecomment-616302135 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/121493/ Test FAILed. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] cloud-fan commented on a change in pull request #28248: [SPARK-31474][SQL] Consistency between dayofweek/dow in extract exprsession and dayofweek function
cloud-fan commented on a change in pull request #28248: URL: https://github.com/apache/spark/pull/28248#discussion_r411086116 ## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/datetimeExpressions.scala ## @@ -2215,7 +2219,11 @@ case class DatePart(field: Expression, source: Expression, child: Expression) > SELECT _FUNC_(seconds FROM interval 5 hours 30 seconds 1 milliseconds 1 microseconds); 30.001001 """, + note = """ +The _FUNC_ function is equivalent to `date_part`. Review comment: BTW, is `EXTRACT` more widely used? If so, we should put the documentation in `Extract` instead. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] AmplabJenkins commented on issue #26339: [SPARK-27194][SPARK-29302][SQL] Fix the issue that for dynamic partition overwrite a task would conflict with its speculative task
AmplabJenkins commented on issue #26339: URL: https://github.com/apache/spark/pull/26339#issuecomment-616302130 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] zhengruifeng opened a new pull request #28270: [SPARK-31494][ML] flatten the result dataframe of ANOVATest
zhengruifeng opened a new pull request #28270: URL: https://github.com/apache/spark/pull/28270 ### What changes were proposed in this pull request? Add a new method `def test(dataset: DataFrame, featuresCol: String, labelCol: String, flatten: Boolean): DataFrame` ### Why are the changes needed? Similar to the new `test` method in `ChiSquareTest`, it will: 1, support DataFrame operations on the returned df; 2, keep the driver from becoming a bottleneck with a large numFeatures ### Does this PR introduce any user-facing change? Yes, a new method is added ### How was this patch tested? existing test suites This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
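A hedged usage sketch of the proposed overload (the signature is copied from the description above; `dataset` and the result column names are assumptions, mirroring the flattened `ChiSquareTest` output):

```scala
import org.apache.spark.ml.stat.ANOVATest
import org.apache.spark.sql.functions.col

// flatten = true is assumed to yield one row per feature instead of a single
// row of arrays, so ordinary DataFrame operations apply directly.
val perFeature = ANOVATest.test(dataset, "features", "label", flatten = true)
perFeature
  .filter(col("pValue") < 0.05)   // column names are assumptions
  .orderBy("featureIndex")
  .show()
```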
[GitHub] [spark] cloud-fan commented on a change in pull request #28248: [SPARK-31474][SQL] Consistency between dayofweek/dow in extract exprsession and dayofweek function
cloud-fan commented on a change in pull request #28248: URL: https://github.com/apache/spark/pull/28248#discussion_r411085917 ## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/datetimeExpressions.scala ## @@ -2215,7 +2219,11 @@ case class DatePart(field: Expression, source: Expression, child: Expression) > SELECT _FUNC_(seconds FROM interval 5 hours 30 seconds 1 milliseconds 1 microseconds); 30.001001 """, + note = """ +The _FUNC_ function is equivalent to `date_part`. Review comment: `date_part(field, source)` This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] cloud-fan commented on a change in pull request #28248: [SPARK-31474][SQL] Consistency between dayofweek/dow in extract exprsession and dayofweek function
cloud-fan commented on a change in pull request #28248: URL: https://github.com/apache/spark/pull/28248#discussion_r411085824 ## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/datetimeExpressions.scala ## @@ -2179,7 +2178,11 @@ object DatePartLike { > SELECT _FUNC_('seconds', interval 5 hours 30 seconds 1 milliseconds 1 microseconds); 30.001001 """, + note = """ +The _FUNC_ function is equivalent to the SQL-standard function `extract` Review comment: `EXTRACT(field FROM source)` This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
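Taking the two suggestions above together, a small check (values taken from the quoted documentation examples) would show the equivalence in both directions:

```scala
// Both forms should return 30.001001 per the quoted doc examples.
spark.sql("SELECT date_part('seconds', interval 5 hours 30 seconds 1 milliseconds 1 microseconds)").show()
spark.sql("SELECT extract(seconds FROM interval 5 hours 30 seconds 1 milliseconds 1 microseconds)").show()
```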
[GitHub] [spark] SparkQA commented on issue #26339: [SPARK-27194][SPARK-29302][SQL] Fix the issue that for dynamic partition overwrite a task would conflict with its speculative task
SparkQA commented on issue #26339: URL: https://github.com/apache/spark/pull/26339#issuecomment-616301796 **[Test build #121493 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/121493/testReport)** for PR 26339 at commit [`a4012d8`](https://github.com/apache/spark/commit/a4012d87d1e18c7a1c922d4198ff7aa7756cac81). * This patch **fails Spark unit tests**. * This patch merges cleanly. * This patch adds the following public classes _(experimental)_: * `class RenameFailedForFirstTaskFirstAttemptFileSystem extends RawLocalFileSystem ` * `class PartitionedSpeculateRenameFailedWriteSuite extends QueryTest with SharedSparkSession ` This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] cloud-fan commented on a change in pull request #28248: [SPARK-31474][SQL] Consistency between dayofweek/dow in extract exprsession and dayofweek function
cloud-fan commented on a change in pull request #28248: URL: https://github.com/apache/spark/pull/28248#discussion_r411085373 ## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/datetimeExpressions.scala ## @@ -2130,38 +2129,38 @@ object DatePartLike { } } +// scalastyle:off line.size.limit @ExpressionDescription( usage = "_FUNC_(field, source) - Extracts a part of the date/timestamp or interval source.", arguments = """ Arguments: * field - selects which part of the source should be extracted. - Supported string values of `field` for dates and timestamps are: -["MILLENNIUM", ("MILLENNIA", "MIL", "MILS"), - "CENTURY", ("CENTURIES", "C", "CENT"), - "DECADE", ("DECADES", "DEC", "DECS"), - "YEAR", ("Y", "YEARS", "YR", "YRS"), - "ISOYEAR", - "QUARTER", ("QTR"), - "MONTH", ("MON", "MONS", "MONTHS"), - "WEEK", ("W", "WEEKS"), - "DAY", ("D", "DAYS"), - "DAYOFWEEK", - "DOW", - "ISODOW", - "DOY", - "HOUR", ("H", "HOURS", "HR", "HRS"), - "MINUTE", ("M", "MIN", "MINS", "MINUTES"), - "SECOND", ("S", "SEC", "SECONDS", "SECS"), - "MILLISECONDS", ("MSEC", "MSECS", "MILLISECON", "MSECONDS", "MS"), - "MICROSECONDS", ("USEC", "USECS", "USECONDS", "MICROSECON", "US"), - "EPOCH"] -Supported string values of `field` for intervals are: - ["YEAR", ("Y", "YEARS", "YR", "YRS"), - "MONTH", ("MON", "MONS", "MONTHS"), - "DAY", ("D", "DAYS"), - "HOUR", ("H", "HOURS", "HR", "HRS"), - "MINUTE", ("M", "MIN", "MINS", "MINUTES"), - "SECOND", ("S", "SEC", "SECONDS", "SECS")] + - Supported string values of `field` for dates and timestamps are: + - "MILLENNIUM", ("MILLENNIA", "MIL", "MILS") - the conventional numbering of millennia + - "CENTURY", ("CENTURIES", "C", "CENT") - the conventional numbering of centuries + - "DECADE", ("DECADES", "DEC", "DECS") - the year field divided by 1 + - "YEAR", ("Y", "YEARS", "YR", "YRS") - the year field + - "ISOYEAR" - the ISO 8601 week-numbering year that the datetime falls in + - "QUARTER", ("QTR") - the quarter (1 - 4) of the year that the datetime falls in + - "MONTH", ("MON", "MONS", "MONTHS") - the month field + - "WEEK", ("W", "WEEKS") - the number of the ISO 8601 week-of-week-based-year. A week is considered to start on a Monday and week 1 is the first week with >3 days. In the ISO week-numbering system, it is possible for early-January dates to be part of the 52nd or 53rd week of the previous year, and for late-December dates to be part of the first week of the next year. For example, 2005-01-02 is part of the 53rd week of year 2004, while 2012-12-31 is part of the first week of 2013 + - "DAY", ("D", "DAYS") - the day of the month field (1 - 31) + - "DAYOFWEEK",("DOW") - the day of the week for datetime as Sunday(1) to Saturday(7) + - "ISODOW" - ISO 8601 based day of the week for datetime as Monday(1) to Sunday(7) + - "DOY" - the day of the year (1 - 365/366) + - "HOUR", ("H", "HOURS", "HR", "HRS") - The hour field (0 - 23) + - "MINUTE", ("M", "MIN", "MINS", "MINUTES") - the minutes field (0 - 59) + - "SECOND", ("S", "SEC", "SECONDS", "SECS") - the seconds field, including fractional parts + - "MILLISECONDS", ("MSEC", "MSECS", "MILLISECON", "MSECONDS", "MS") - the seconds field, including fractional parts, multiplied by 1000. Note that this includes full seconds + - "MICROSECONDS", ("USEC", "USECS", "USECONDS", "MICROSECON", "US") - The seconds field, including fractional parts, multiplied by 100. 
Note that this includes full seconds + - "EPOCH" - the number of seconds with fractional part in microsecond precision since 1970-01-01 00:00:00 local time (can be negative) + - Supported string values of `field` for interval(which consists of `months`, `days`, `microseconds`) are: + - "YEAR", ("Y", "YEARS", "YR", "YRS") - the total `months` / 12 + - "MONTH", ("MON", "MONS", "MONTHS") - the total `months` modulo 12 Review comment: ``` the total `months` % 12 ``` to be consistent with ``` the total `months` / 12 ``` This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this
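A minimal Scala sketch of the interval field semantics discussed in the review comment above, assuming the documented behavior (for intervals, `YEAR` is the total `months` divided by 12 and `MONTH` is the remainder); the session setup is only there to make the snippet self-contained.

```scala
import org.apache.spark.sql.SparkSession

// Sketch, assuming the documented semantics: for an interval of 14 months,
// YEAR = 14 / 12 = 1 and MONTH = 14 % 12 = 2.
val spark = SparkSession.builder().master("local[*]").appName("extract-interval").getOrCreate()

spark.sql("SELECT extract(YEAR FROM INTERVAL 14 MONTHS) AS years").show()   // expected: 1
spark.sql("SELECT extract(MONTH FROM INTERVAL 14 MONTHS) AS months").show() // expected: 2
```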
[GitHub] [spark] cloud-fan commented on a change in pull request #28248: [SPARK-31474][SQL] Consistency between dayofweek/dow in extract expression and dayofweek function
cloud-fan commented on a change in pull request #28248: URL: https://github.com/apache/spark/pull/28248#discussion_r411085051 ## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/datetimeExpressions.scala ## @@ -2130,38 +2129,38 @@ object DatePartLike { } } +// scalastyle:off line.size.limit @ExpressionDescription( usage = "_FUNC_(field, source) - Extracts a part of the date/timestamp or interval source.", arguments = """ Arguments: * field - selects which part of the source should be extracted. - Supported string values of `field` for dates and timestamps are: -["MILLENNIUM", ("MILLENNIA", "MIL", "MILS"), - "CENTURY", ("CENTURIES", "C", "CENT"), - "DECADE", ("DECADES", "DEC", "DECS"), - "YEAR", ("Y", "YEARS", "YR", "YRS"), - "ISOYEAR", - "QUARTER", ("QTR"), - "MONTH", ("MON", "MONS", "MONTHS"), - "WEEK", ("W", "WEEKS"), - "DAY", ("D", "DAYS"), - "DAYOFWEEK", - "DOW", - "ISODOW", - "DOY", - "HOUR", ("H", "HOURS", "HR", "HRS"), - "MINUTE", ("M", "MIN", "MINS", "MINUTES"), - "SECOND", ("S", "SEC", "SECONDS", "SECS"), - "MILLISECONDS", ("MSEC", "MSECS", "MILLISECON", "MSECONDS", "MS"), - "MICROSECONDS", ("USEC", "USECS", "USECONDS", "MICROSECON", "US"), - "EPOCH"] -Supported string values of `field` for intervals are: - ["YEAR", ("Y", "YEARS", "YR", "YRS"), - "MONTH", ("MON", "MONS", "MONTHS"), - "DAY", ("D", "DAYS"), - "HOUR", ("H", "HOURS", "HR", "HRS"), - "MINUTE", ("M", "MIN", "MINS", "MINUTES"), - "SECOND", ("S", "SEC", "SECONDS", "SECS")] + - Supported string values of `field` for dates and timestamps are: + - "MILLENNIUM", ("MILLENNIA", "MIL", "MILS") - the conventional numbering of millennia + - "CENTURY", ("CENTURIES", "C", "CENT") - the conventional numbering of centuries + - "DECADE", ("DECADES", "DEC", "DECS") - the year field divided by 1 + - "YEAR", ("Y", "YEARS", "YR", "YRS") - the year field + - "ISOYEAR" - the ISO 8601 week-numbering year that the datetime falls in + - "QUARTER", ("QTR") - the quarter (1 - 4) of the year that the datetime falls in + - "MONTH", ("MON", "MONS", "MONTHS") - the month field Review comment: `the month field (1 - 12)` This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] AmplabJenkins removed a comment on issue #28237: [SPARK-31465][SQL][DOCS] Document Literal in SQL Reference
AmplabJenkins removed a comment on issue #28237: URL: https://github.com/apache/spark/pull/28237#issuecomment-616300361 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] cloud-fan commented on a change in pull request #28248: [SPARK-31474][SQL] Consistency between dayofweek/dow in extract expression and dayofweek function
cloud-fan commented on a change in pull request #28248: URL: https://github.com/apache/spark/pull/28248#discussion_r411084360 ## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/datetimeExpressions.scala ## @@ -2089,8 +2089,7 @@ object DatePart { case "MONTH" | "MON" | "MONS" | "MONTHS" => Month(source) case "WEEK" | "W" | "WEEKS" => WeekOfYear(source) case "DAY" | "D" | "DAYS" => DayOfMonth(source) -case "DAYOFWEEK" => DayOfWeek(source) -case "DOW" => Subtract(DayOfWeek(source), Literal(1)) +case "DAYOFWEEK" | "DOW" => DayOfWeek(source) Review comment: I said that the `DOW` behavior looks more reasonable, but unfortunately, we already have `DAYOFWEEK` in Spark 2.4 and we can't change that. It's more important to keep internal consistency. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
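A rough sketch of the internal consistency being argued for, assuming the behavior proposed in this PR (both `DAYOFWEEK` and `DOW` map to the same expression as the existing `dayofweek` function from Spark 2.4, counting Sunday as 1 through Saturday as 7):

```scala
import org.apache.spark.sql.SparkSession

// Sketch under the assumption that DOW becomes an alias of DAYOFWEEK after this change.
// 2009-07-30 was a Thursday, so every value below should be 5.
val spark = SparkSession.builder().master("local[*]").appName("dow-consistency").getOrCreate()

spark.sql("SELECT dayofweek(DATE'2009-07-30') AS func").show()
spark.sql(
  "SELECT extract(DAYOFWEEK FROM DATE'2009-07-30') AS dayofweek, " +
  "extract(DOW FROM DATE'2009-07-30') AS dow").show()
```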
[GitHub] [spark] AmplabJenkins commented on issue #28237: [SPARK-31465][SQL][DOCS] Document Literal in SQL Reference
AmplabJenkins commented on issue #28237: URL: https://github.com/apache/spark/pull/28237#issuecomment-616300361 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] SparkQA commented on issue #28237: [SPARK-31465][SQL][DOCS] Document Literal in SQL Reference
SparkQA commented on issue #28237: URL: https://github.com/apache/spark/pull/28237#issuecomment-616300097 **[Test build #121499 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/121499/testReport)** for PR 28237 at commit [`35cb286`](https://github.com/apache/spark/commit/35cb2862b338ecc5d24deef2e0f7ad216cae1534). This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] huaxingao commented on a change in pull request #28237: [SPARK-31465][SQL][DOCS] Document Literal in SQL Reference
huaxingao commented on a change in pull request #28237: URL: https://github.com/apache/spark/pull/28237#discussion_r411082047 ## File path: docs/sql-ref-literals.md ## @@ -0,0 +1,505 @@ +--- +layout: global +title: Literals +displayTitle: Literals +license: | + Licensed to the Apache Software Foundation (ASF) under one or more + contributor license agreements. See the NOTICE file distributed with + this work for additional information regarding copyright ownership. + The ASF licenses this file to You under the Apache License, Version 2.0 + (the "License"); you may not use this file except in compliance with + the License. You may obtain a copy of the License at + + http://www.apache.org/licenses/LICENSE-2.0 + + Unless required by applicable law or agreed to in writing, software + distributed under the License is distributed on an "AS IS" BASIS, + WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + See the License for the specific language governing permissions and + limitations under the License. +--- + +A literal (also known as a constant) represents a fixed data value. Spark SQL supports the following literals: + + * [String Literal](#string-literal) + * [Null Literal](#null-literal) + * [Boolean Literal](#boolean-literal) + * [Numeric Literal](#numeric-literal) + * [Datetime Literal](#datetime-literal) + * [Interval Literal](#interval-literal) + +### String Literal + +A string literal is used to specify a character string value. + + Syntax + +{% highlight sql %} +'c [ ... ]' | "c [ ... ]" +{% endhighlight %} + + Parameters + + + c + +One character from the character set. Use \ to escape special characters. + + + + Examples + +{% highlight sql %} +SELECT 'Hello, World!' AS col; + +-+ + | col| + +-+ + |Hello, World!| + +-+ + +SELECT "SPARK SQL" AS col; + +-+ + | col| + +-+ + |Spark SQL| + +-+ + +SELECT SELECT 'it\'s $10.' AS col; + +-+ + | col| + +-+ + |It's $10.| + +-+ +{% endhighlight %} + +### Null Literal + +A null literal is used to specify a null value. + + Syntax + +{% highlight sql %} +NULL +{% endhighlight %} + + Examples + +{% highlight sql %} +SELECT NULL AS col; + ++ + | col| + ++ + |NULL| + ++ +{% endhighlight %} + +### Boolean Literal + +A boolean literal is used to specify a boolean value. + + Syntax + +{% highlight sql %} +TRUE | FALSE +{% endhighlight %} + + Examples + +{% highlight sql %} +SELECT TRUE AS col; + ++ + | col| + ++ + |true| + ++ +{% endhighlight %} + +### Numeric Literal + +A numeric literal is used to specify a fixed or floating-point number. + + Integer Literal + + Syntax Review comment: I prefer not to add. The font seems to small if add one more # https://user-images.githubusercontent.com/13592258/79713482-c030bb80-8282-11ea-992f-eadad975f4c2.png;> vs https://user-images.githubusercontent.com/13592258/79713488-c3c44280-8282-11ea-8af2-17efcb897b06.png;> This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] huaxingao commented on a change in pull request #28237: [SPARK-31465][SQL][DOCS] Document Literal in SQL Reference
huaxingao commented on a change in pull request #28237: URL: https://github.com/apache/spark/pull/28237#discussion_r411081987 ## File path: docs/sql-ref-literals.md ## @@ -0,0 +1,505 @@ +--- +layout: global +title: Literals +displayTitle: Literals +license: | + Licensed to the Apache Software Foundation (ASF) under one or more + contributor license agreements. See the NOTICE file distributed with + this work for additional information regarding copyright ownership. + The ASF licenses this file to You under the Apache License, Version 2.0 + (the "License"); you may not use this file except in compliance with + the License. You may obtain a copy of the License at + + http://www.apache.org/licenses/LICENSE-2.0 + + Unless required by applicable law or agreed to in writing, software + distributed under the License is distributed on an "AS IS" BASIS, + WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + See the License for the specific language governing permissions and + limitations under the License. +--- + +A literal (also known as a constant) represents a fixed data value. Spark SQL supports the following literals: + + * [String Literal](#string-literal) + * [Null Literal](#null-literal) + * [Boolean Literal](#boolean-literal) + * [Numeric Literal](#numeric-literal) + * [Datetime Literal](#datetime-literal) + * [Interval Literal](#interval-literal) + +### String Literal + +A string literal is used to specify a character string value. + + Syntax + +{% highlight sql %} +'c [ ... ]' | "c [ ... ]" +{% endhighlight %} + + Parameters + + + c + +One character from the character set. Use \ to escape special characters. + + + + Examples + +{% highlight sql %} +SELECT 'Hello, World!' AS col; + +-+ + | col| + +-+ + |Hello, World!| + +-+ + +SELECT "SPARK SQL" AS col; + +-+ + | col| + +-+ + |Spark SQL| + +-+ + +SELECT SELECT 'it\'s $10.' AS col; + +-+ + | col| + +-+ + |It's $10.| + +-+ +{% endhighlight %} + +### Null Literal + +A null literal is used to specify a null value. + + Syntax + +{% highlight sql %} +NULL +{% endhighlight %} + + Examples + +{% highlight sql %} +SELECT NULL AS col; + ++ + | col| + ++ + |NULL| + ++ +{% endhighlight %} + +### Boolean Literal + +A boolean literal is used to specify a boolean value. + + Syntax + +{% highlight sql %} +TRUE | FALSE +{% endhighlight %} + + Examples + +{% highlight sql %} +SELECT TRUE AS col; + ++ + | col| + ++ + |true| + ++ +{% endhighlight %} + +### Numeric Literal + +A numeric literal is used to specify a fixed or floating-point number. + + Integer Literal + + Syntax + +{% highlight sql %} +[ + | - ] digit [ ... ] [ L | S | Y ] +{% endhighlight %} + + Parameters + + + digit + +Any numeral from 0 to 9. + + + + L + +Case insensitive, indicates BIGINT, which is a 8-byte signed integer number. + + + + S + +Case insensitive, indicates SMALLINT, which is a 2-byte signed integer number. + + + + Y + +Case insensitive, indicates TINYINT, which is a 1-byte signed integer number. + + + + default (no postfix) + +Indicates a 4-byte signed integer number. + + + Examples + +{% highlight sql %} +SELECT -2147483648 AS col; + +---+ + |col| + +---+ + |-2147483648| + +---+ + +SELECT 9223372036854775807l AS col; + +---+ + |col| + +---+ + |9223372036854775807| + +---+ + +SELECT -32Y AS col; + +---+ + |col| + +---+ + |-32| + +---+ + +SELECT 482S AS col; + +---+ + |col| + +---+ + |482| + +---+ +{% endhighlight %} + + Decimal Literal + + Syntax + +{% highlight sql %} +[ + | - ] { digit [ ... ] . [ digit [ ... ] ] | . digit [ ... 
] } +{% endhighlight %} + + Parameters + + + digit + +Any numeral from 0 to 9. + + + + Examples + +{% highlight sql %} +SELECT 12.578 AS col; + +--+ + | col| + +--+ + |12.578| + +--+ + +SELECT -0.1234567 AS col; + +--+ + | col| + +--+ + |-0.1234567| + +--+ + +SELECT -.1234567 AS col; + +--+ + | col| + +--+ + |-0.1234567| + +--+ +{% endhighlight %} + + Floating Point and BigDecimal Literals Review comment: https://github.com/apache/spark/blob/master/sql/catalyst/src/main/antlr4/org/apache/spark/sql/catalyst/parser/SqlBase.g4#L992-L1002 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
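Since the thread points at the grammar rather than showing output, here is a small hedged example of the floating-point and BigDecimal suffixes under discussion. My reading of the linked `SqlBase.g4` rules is that `D` selects DOUBLE and `BD` selects DECIMAL, but the grammar link above is authoritative.

```scala
import org.apache.spark.sql.SparkSession

// Hedged sketch: print the types that the literal suffixes resolve to,
// assuming D -> DOUBLE and BD -> DECIMAL per the linked grammar.
val spark = SparkSession.builder().master("local[*]").appName("literal-suffixes").getOrCreate()

spark.sql("SELECT 5D AS double_col, 12.578BD AS decimal_col, 1.2E3 AS exponent_col").printSchema()
```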
[GitHub] [spark] AmplabJenkins removed a comment on issue #28269: [SPARK-31493][SQL] Optimize InSet to In according partition size at InSubqueryExec
AmplabJenkins removed a comment on issue #28269: URL: https://github.com/apache/spark/pull/28269#issuecomment-616296943 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] AmplabJenkins removed a comment on issue #28148: [SPARK-31381][SPARK-29245][SQL] Upgrade built-in Hive 2.3.6 to 2.3.7
AmplabJenkins removed a comment on issue #28148: URL: https://github.com/apache/spark/pull/28148#issuecomment-616296974 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] AmplabJenkins commented on issue #28148: [SPARK-31381][SPARK-29245][SQL] Upgrade built-in Hive 2.3.6 to 2.3.7
AmplabJenkins commented on issue #28148: URL: https://github.com/apache/spark/pull/28148#issuecomment-616296974 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] AmplabJenkins commented on issue #28269: [SPARK-31493][SQL] Optimize InSet to In according partition size at InSubqueryExec
AmplabJenkins commented on issue #28269: URL: https://github.com/apache/spark/pull/28269#issuecomment-616296943 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] SparkQA commented on issue #28269: [SPARK-31493][SQL] Optimize InSet to In according partition size at InSubqueryExec
SparkQA commented on issue #28269: URL: https://github.com/apache/spark/pull/28269#issuecomment-616296734 **[Test build #121497 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/121497/testReport)** for PR 28269 at commit [`a135dbc`](https://github.com/apache/spark/commit/a135dbc12bdf98008db04069289c98db3d9627e6). This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] SparkQA commented on issue #28148: [SPARK-31381][SPARK-29245][SQL] Upgrade built-in Hive 2.3.6 to 2.3.7
SparkQA commented on issue #28148: URL: https://github.com/apache/spark/pull/28148#issuecomment-616296735 **[Test build #121498 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/121498/testReport)** for PR 28148 at commit [`de13433`](https://github.com/apache/spark/commit/de134334e712ba822e6a8243a4ba3f50ec5b2ba3). This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] wangyum commented on issue #28148: [SPARK-31381][SPARK-29245][SQL] Upgrade built-in Hive 2.3.6 to 2.3.7
wangyum commented on issue #28148: URL: https://github.com/apache/spark/pull/28148#issuecomment-616295772 retest this please. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] ulysses-you commented on a change in pull request #28269: [SPARK-31493][SQL] Optimize InSet to In according partition size at InSubqueryExec
ulysses-you commented on a change in pull request #28269: URL: https://github.com/apache/spark/pull/28269#discussion_r411078284 ## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/dynamicpruning/PartitionPruning.scala ## @@ -109,7 +109,7 @@ object PartitionPruning extends Rule[LogicalPlan] with PredicateHelper { * the size in bytes of the partitioned plan after filtering is greater than the size * in bytes of the plan on the other side of the join. We estimate the filtering ratio * using column statistics if they are available, otherwise we use the config value of - * `spark.sql.optimizer.joinFilterRatio`. + * `SQLConf.DYNAMIC_PARTITION_PRUNING_FALLBACK_FILTER_RATIO`. Review comment: Update the config name. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] ulysses-you opened a new pull request #28269: [SPARK-31493][SQL] Optimize InSet to In according partition size at InSubqueryExec
ulysses-you opened a new pull request #28269: URL: https://github.com/apache/spark/pull/28269 ### What changes were proposed in this pull request? To respect `OptimizeIn`, use `In` or `InSet` according to the partition size. ### Why are the changes needed? Better performance. ### Does this PR introduce any user-facing change? No. ### How was this patch tested? Added a UT. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
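To make the proposal concrete, a hedged sketch of the selection logic rather than the actual patch: build an `In` expression when the number of pruned partition values is small, mirroring the threshold style used by `OptimizeIn`, and fall back to `InSet` otherwise. The helper name and the default threshold below are illustrative only.

```scala
import org.apache.spark.sql.catalyst.expressions.{Expression, In, InSet, Literal}

// Illustrative helper, not Spark API: pick In for small value lists so the plan stays
// readable and further simplifications can apply, and InSet for large ones for fast lookups.
def chooseInExpression(child: Expression, values: Seq[Any], threshold: Int = 10): Expression = {
  if (values.length <= threshold) {
    In(child, values.map(Literal(_)))
  } else {
    InSet(child, values.toSet)
  }
}
```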
[GitHub] [spark] AmplabJenkins commented on issue #28148: [WIP][SPARK-31381][SPARK-29245][SQL][test-hadoop3.2][test-java11] Upgrade built-in Hive 2.3.6 to 2.3.7
AmplabJenkins commented on issue #28148: URL: https://github.com/apache/spark/pull/28148#issuecomment-616294803 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] AmplabJenkins removed a comment on issue #28148: [WIP][SPARK-31381][SPARK-29245][SQL][test-hadoop3.2][test-java11] Upgrade built-in Hive 2.3.6 to 2.3.7
AmplabJenkins removed a comment on issue #28148: URL: https://github.com/apache/spark/pull/28148#issuecomment-616294803 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] SparkQA removed a comment on issue #28148: [WIP][SPARK-31381][SPARK-29245][SQL][test-hadoop3.2][test-java11] Upgrade built-in Hive 2.3.6 to 2.3.7
SparkQA removed a comment on issue #28148: URL: https://github.com/apache/spark/pull/28148#issuecomment-616255932 **[Test build #121491 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/121491/testReport)** for PR 28148 at commit [`de13433`](https://github.com/apache/spark/commit/de134334e712ba822e6a8243a4ba3f50ec5b2ba3). This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] Ngone51 commented on a change in pull request #28226: [SPARK-31452][SQL] Do not create partition spec for 0-size partitions in AQE
Ngone51 commented on a change in pull request #28226: URL: https://github.com/apache/spark/pull/28226#discussion_r411076696 ## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/adaptive/OptimizeSkewedJoin.scala ## @@ -88,9 +88,11 @@ case class OptimizeSkewedJoin(conf: SQLConf) extends Rule[SparkPlan] { private def targetSize(sizes: Seq[Long], medianSize: Long): Long = { val advisorySize = conf.getConf(SQLConf.ADVISORY_PARTITION_SIZE_IN_BYTES) val nonSkewSizes = sizes.filterNot(isSkewed(_, medianSize)) -// It's impossible that all the partitions are skewed, as we use median size to define skew. -assert(nonSkewSizes.nonEmpty) -math.max(advisorySize, nonSkewSizes.sum / nonSkewSizes.length) +if (nonSkewSizes.isEmpty) { Review comment: Why is it possible for this to be empty now? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
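For context, a simplified, self-contained sketch of the `targetSize` logic being reviewed. The quoted diff is cut off after the new `if (nonSkewSizes.isEmpty)` branch, so the fallback body below is an assumption, and the skew test is a hard-coded stand-in for the real config-driven check.

```scala
// Stand-alone sketch; the real isSkewed check is driven by config values
// (skewed partition factor/threshold), which are simplified here.
def targetSize(sizes: Seq[Long], medianSize: Long, advisorySize: Long): Long = {
  def isSkewed(size: Long): Boolean = size > medianSize * 5 && size > advisorySize
  val nonSkewSizes = sizes.filterNot(isSkewed)
  if (nonSkewSizes.isEmpty) {
    // Assumed fallback when every partition is classified as skewed
    // (e.g. once 0-size partitions are no longer counted): use the advisory size.
    advisorySize
  } else {
    math.max(advisorySize, nonSkewSizes.sum / nonSkewSizes.length)
  }
}
```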
[GitHub] [spark] SparkQA commented on issue #28148: [WIP][SPARK-31381][SPARK-29245][SQL][test-hadoop3.2][test-java11] Upgrade built-in Hive 2.3.6 to 2.3.7
SparkQA commented on issue #28148: URL: https://github.com/apache/spark/pull/28148#issuecomment-616294353 **[Test build #121491 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/121491/testReport)** for PR 28148 at commit [`de13433`](https://github.com/apache/spark/commit/de134334e712ba822e6a8243a4ba3f50ec5b2ba3). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] AmplabJenkins removed a comment on issue #28268: [SPARK-31492][ML] flatten the result dataframe of FValueTest
AmplabJenkins removed a comment on issue #28268: URL: https://github.com/apache/spark/pull/28268#issuecomment-616294101 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] AmplabJenkins commented on issue #28268: [SPARK-31492][ML] flatten the result dataframe of FValueTest
AmplabJenkins commented on issue #28268: URL: https://github.com/apache/spark/pull/28268#issuecomment-616294101 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] SparkQA commented on issue #28268: [SPARK-31492][ML] flatten the result dataframe of FValueTest
SparkQA commented on issue #28268: URL: https://github.com/apache/spark/pull/28268#issuecomment-616293948 **[Test build #121496 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/121496/testReport)** for PR 28268 at commit [`3caf7d1`](https://github.com/apache/spark/commit/3caf7d12408f3cd6d8245c53974ea84703c5f767). This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] zhengruifeng opened a new pull request #28268: [SPARK-31492][ML] flatten the result dataframe of FValueTest
zhengruifeng opened a new pull request #28268: URL: https://github.com/apache/spark/pull/28268 ### What changes were proposed in this pull request? Add a new method `def test(dataset: DataFrame, featuresCol: String, labelCol: String, flatten: Boolean): DataFrame`. ### Why are the changes needed? Similar to the new test method in ChiSquareTest, it will: 1, support DataFrame operations on the returned df; 2, keep the driver from becoming a bottleneck with a large `numFeatures` ### Does this PR introduce any user-facing change? Yes, it adds a new method. ### How was this patch tested? Existing test suites. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
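A hedged usage sketch of the proposed overload, assuming it lands with the signature quoted in the PR description; the data values and column names below are made up.

```scala
import org.apache.spark.ml.linalg.Vectors
import org.apache.spark.ml.stat.FValueTest
import org.apache.spark.sql.SparkSession

// Usage sketch under the assumption that the flatten overload is available.
val spark = SparkSession.builder().master("local[*]").appName("fvalue-flatten").getOrCreate()

val df = spark.createDataFrame(Seq(
  (4.6, Vectors.dense(1.0, 2.0)),
  (6.6, Vectors.dense(3.0, 4.0)),
  (5.1, Vectors.dense(5.0, 6.0)),
  (7.6, Vectors.dense(7.0, 8.0))
)).toDF("label", "features")

// flatten = true should yield one row per feature, so the result can be filtered,
// sorted or joined like any other DataFrame instead of collecting arrays on the driver.
val results = FValueTest.test(df, "features", "label", flatten = true)
results.show()
```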
[GitHub] [spark] yaooqinn edited a comment on issue #28197: [SPARK-31431][SQL] Add CalendarInterval encoder support
yaooqinn edited a comment on issue #28197: URL: https://github.com/apache/spark/pull/28197#issuecomment-616291295 Hi @viirya, thanks for the details. Taking your commit https://github.com/apache/spark/commit/48e44b24a7663142176102ac4c6bf4242f103804 as an example, with `Seq(Set(interval)).toDF()`, do intervals already work as domain objects? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] yaooqinn commented on issue #28197: [SPARK-31431][SQL] Add CalendarInterval encoder support
yaooqinn commented on issue #28197: URL: https://github.com/apache/spark/pull/28197#issuecomment-616291295 Hi, taking your commit https://github.com/apache/spark/commit/48e44b24a7663142176102ac4c6bf4242f103804 as an example, with `Seq(Set(interval)).toDF()`, do intervals already work as domain objects? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
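For readers following the thread, a sketch of the usage being asked about. It assumes both the Set encoder from the linked commit and a CalendarInterval encoder (the subject of this PR) are available; without the latter this would not compile or run.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.unsafe.types.CalendarInterval

// Sketch only: relies on an implicit encoder for Set[CalendarInterval],
// which is exactly what the question is probing.
val spark = SparkSession.builder().master("local[*]").appName("interval-encoder").getOrCreate()
import spark.implicits._

val interval = new CalendarInterval(0, 1, 0L) // months, days, microseconds
val df = Seq(Set(interval)).toDF()
df.printSchema()
df.show(truncate = false)
```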
[GitHub] [spark] uncleGen commented on issue #27694: [SPARK-30946][SS] Serde entry with UnsafeRow on FileStream(Source/Sink)Log with LZ4 compression
uncleGen commented on issue #27694: URL: https://github.com/apache/spark/pull/27694#issuecomment-616289728 Suppose there is a pipeline of streaming jobs owned by different end-users or departments. If a user in the middle of the pipeline upgrades their Spark and uses `CompactibleFileStreamLog` V2 (as you said, by default we can read from version 1 and write to version 2 for a smooth migration), then the downstream jobs will fail. Is there something I misunderstand? If not, wouldn't it be better to keep the version of `CompactibleFileStreamLog` as-is, or to add a config for upgrading the `CompactibleFileStreamLog` version that defaults to `false`? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] Ngone51 commented on a change in pull request #28254: [SPARK-31478][CORE]Call `StopExecutor` before killing executors
Ngone51 commented on a change in pull request #28254: URL: https://github.com/apache/spark/pull/28254#discussion_r411069475 ## File path: core/src/main/scala/org/apache/spark/scheduler/cluster/CoarseGrainedSchedulerBackend.scala ## @@ -769,6 +769,8 @@ class CoarseGrainedSchedulerBackend(scheduler: TaskSchedulerImpl, val rpcEnv: Rp val killExecutors: Boolean => Future[Boolean] = if (executorsToKill.nonEmpty) { + executorsToKill.foreach(id => + executorDataMap.get(id).foreach(_.executorEndpoint.send(StopExecutor))) Review comment: Can we guarantee that `stop` is called before `kill` in this way? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] zhengruifeng commented on issue #28202: [SPARK-31433][ML] Summarizer supports string arguments
zhengruifeng commented on issue #28202: URL: https://github.com/apache/spark/pull/28202#issuecomment-616288772 I think it is not worth that much, so I will close it. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] gatorsmile commented on a change in pull request #27728: [SPARK-25556][SPARK-17636][SPARK-31026][SPARK-31060][SQL][test-hive1.2] Nested Column Predicate Pushdown for Parquet
gatorsmile commented on a change in pull request #27728: URL: https://github.com/apache/spark/pull/27728#discussion_r411068064 ## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/DataSourceStrategy.scala ## @@ -652,10 +652,19 @@ object DataSourceStrategy { */ object PushableColumn { def unapply(e: Expression): Option[String] = { -def helper(e: Expression) = e match { - case a: Attribute => Some(a.name) +val nestedPredicatePushdownEnabled = SQLConf.get.nestedPredicatePushdownEnabled +import org.apache.spark.sql.connector.catalog.CatalogV2Implicits.MultipartIdentifierHelper +def helper(e: Expression): Option[Seq[String]] = e match { + case a: Attribute => +if (nestedPredicatePushdownEnabled || !a.name.contains(".")) { + Some(Seq(a.name)) +} else { + None +} + case s: GetStructField if nestedPredicatePushdownEnabled => +helper(s.child).map(_ :+ s.childSchema(s.ordinal).name) case _ => None } -helper(e) +helper(e).map(_.quoted) Review comment: @viirya Are you interested in this follow up? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
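As a quick illustration of what the quoted `PushableColumn` helper enables, here is a hedged sketch: with nested predicate pushdown, a filter on a struct field can be handed to the data source as a quoted multi-part column name instead of being skipped. The path and schema below are made up.

```scala
import org.apache.spark.sql.SparkSession

// Hypothetical layout: a Parquet table with a column person: struct<name: string, age: int>.
val spark = SparkSession.builder().master("local[*]").appName("nested-pushdown").getOrCreate()

val df = spark.read.parquet("/tmp/people")
// With nested predicate pushdown enabled, the physical plan's PushedFilters should
// reference the nested column (e.g. `person.age`) rather than dropping the predicate.
df.filter("person.age > 21").explain(true)
```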
[GitHub] [spark] AmplabJenkins commented on issue #28260: [SPARK-31487][CORE] Move slots check of barrier job from DAGScheduler to TaskSchedulerImpl
AmplabJenkins commented on issue #28260: URL: https://github.com/apache/spark/pull/28260#issuecomment-616286093 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org