[GitHub] [spark] maropu commented on a change in pull request #28104: [SPARK-31331][SQL][DOCS] Document Spark integration with Hive UDFs/UDAFs/UDTFs

2020-04-19 Thread GitBox


maropu commented on a change in pull request #28104:
URL: https://github.com/apache/spark/pull/28104#discussion_r411105164



##
File path: docs/sql-ref-functions-udf-hive.md
##
@@ -19,4 +19,90 @@ license: |
   limitations under the License.
 ---
 
-Integration with Hive UDFs/UDAFs/UDTFs
\ No newline at end of file
+### Description
+
+Spark SQL supports integration of Hive UDFs, UDAFs and UDTFs. Similar to Spark
UDFs and UDAFs, Hive UDFs work on a single row as input and generate a single
row as output, while Hive UDAFs operate on multiple rows and return a single
aggregated row as a result. In addition, Hive also supports UDTFs (User Defined
Tabular Functions), which act on one row as input and return multiple rows as
output. To use Hive UDFs/UDAFs/UDTFs, the user should register them in Spark
and then use them in Spark SQL queries.
+
+### Examples
+
+Hive has two UDF interfaces: 
[UDF](https://github.com/apache/hive/blob/master/udf/src/java/org/apache/hadoop/hive/ql/exec/UDF.java)
 and 
[GenericUDF](https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDF.java).
+The example below uses
[GenericUDFAbs](https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFAbs.java),
which is derived from `GenericUDF`.
+
+{% highlight sql %}
+-- Register `GenericUDFAbs` and use it in Spark SQL.
+-- Note that, if you use your own implementation, you need to add the JAR containing it
+-- to the classpath first,
+-- e.g., ADD JAR yourHiveUDF.jar;
+CREATE TEMPORARY FUNCTION testUDF AS 'org.apache.hadoop.hive.ql.udf.generic.GenericUDFAbs';
+
+SELECT * FROM t;
+  +-----+
+  |value|
+  +-----+
+  | -1.0|
+  |  2.0|
+  | -3.0|
+  +-----+
+
+SELECT testUDF(value) FROM t;
+  +--------------+
+  |testUDF(value)|
+  +--------------+
+  |           1.0|
+  |           2.0|
+  |           3.0|
+  +--------------+
+{% endhighlight %}
+
+
+The example below uses
[GenericUDTFExplode](https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDTFExplode.java),
which is derived from
[GenericUDTF](https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDTF.java).
+
+{% highlight sql %}
+-- Register `GenericUDTFExplode` and use it in Spark SQL
+CREATE TEMPORARY FUNCTION hiveUDTF
+AS 'org.apache.hadoop.hive.ql.udf.generic.GenericUDTFExplode';
+
+SELECT * FROM t;
+  +--+

Review comment:
   Ah, I see. Actually, there is no strong reason; it is just for format
consistency. Before https://github.com/apache/spark/pull/28151, we used
different and inconsistent formats across the SQL documents, so I put in a
simple rule to use the same format in https://github.com/apache/spark/pull/28151.
But if we have a better format for the documents, the reformat looks fine.
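
   A minimal end-to-end sketch of the registration flow the patch documents, assuming a Hive-enabled `SparkSession` running locally; `yourHiveUDF.jar` in the docs is a placeholder, so the built-in `GenericUDFAbs` is used here instead:

   ```scala
   import org.apache.spark.sql.SparkSession

   object HiveUdfExample {
     def main(args: Array[String]): Unit = {
       // Hive support is required so CREATE TEMPORARY FUNCTION can load Hive UDF classes.
       val spark = SparkSession.builder()
         .appName("hive-udf-sketch")
         .master("local[1]")
         .enableHiveSupport()
         .getOrCreate()
       import spark.implicits._

       // Register the built-in Hive GenericUDFAbs under the name `testUDF`.
       spark.sql("CREATE TEMPORARY FUNCTION testUDF AS " +
         "'org.apache.hadoop.hive.ql.udf.generic.GenericUDFAbs'")

       // Recreate the sample table from the docs and apply the UDF.
       Seq(-1.0, 2.0, -3.0).toDF("value").createOrReplaceTempView("t")
       spark.sql("SELECT testUDF(value) FROM t").show()

       spark.stop()
     }
   }
   ```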





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] [spark] Ngone51 commented on a change in pull request #28254: [SPARK-31478][CORE]Call `StopExecutor` before killing executors

2020-04-19 Thread GitBox


Ngone51 commented on a change in pull request #28254:
URL: https://github.com/apache/spark/pull/28254#discussion_r40015



##
File path: 
core/src/main/scala/org/apache/spark/scheduler/cluster/CoarseGrainedSchedulerBackend.scala
##
@@ -769,6 +769,8 @@ class CoarseGrainedSchedulerBackend(scheduler: 
TaskSchedulerImpl, val rpcEnv: Rp
 
   val killExecutors: Boolean => Future[Boolean] =
 if (executorsToKill.nonEmpty) {
+  executorsToKill.foreach(id =>
+    executorDataMap.get(id).foreach(_.executorEndpoint.send(StopExecutor)))

Review comment:
   `StopExecutor` may arrive at the executor after `kill` arrives at the
worker/container due to network delay. Isn't that possible?
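
   A toy sketch of the ordering concern (nothing here is Spark's actual RPC machinery): the two requests travel over independent channels, so their arrival order is not guaranteed:

   ```scala
   import scala.concurrent.{Await, Future}
   import scala.concurrent.ExecutionContext.Implicits.global
   import scala.concurrent.duration._
   import scala.util.Random

   object RaceSketch {
     def main(args: Array[String]): Unit = {
       // Random latency stands in for network delay; neither "channel"
       // orders itself against the other.
       def deliver(msg: String): Future[String] = Future {
         Thread.sleep(Random.nextInt(100)); msg
       }

       val first = Future.firstCompletedOf(Seq(
         deliver("StopExecutor reached the executor"),
         deliver("kill reached the worker/container")))

       // Either message can win, which is exactly the race in question.
       println(Await.result(first, 1.second))
     }
   }
   ```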





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] [spark] maropu commented on a change in pull request #28251: [SPARK-31476][SQL] Add an ExpressionInfo entry for EXTRACT

2020-04-19 Thread GitBox


maropu commented on a change in pull request #28251:
URL: https://github.com/apache/spark/pull/28251#discussion_r411109456



##
File path: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/FunctionRegistry.scala
##
@@ -423,6 +423,7 @@ object FunctionRegistry {
 expression[MakeTimestamp]("make_timestamp"),
 expression[MakeInterval]("make_interval"),
 expression[DatePart]("date_part"),
+expression[Extract]("extract"),

Review comment:
   > Not a big deal but better if we can avoid exposing more APIs
   
   Yea, +1.
   
   btw, wouldn't it be better to add tests for the case `extract(field, source)`?
   
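   A hedged sketch of what such a test could exercise; `ExtractSyntaxCheck` is an illustrative name, and the comma-separated form assumes the registry entry added by this PR accepts the field as a string, analogous to `date_part`:

   ```scala
   import org.apache.spark.sql.SparkSession

   object ExtractSyntaxCheck {
     def main(args: Array[String]): Unit = {
       val spark = SparkSession.builder().appName("extract-check").master("local[1]").getOrCreate()

       // The parser-level form that already works today.
       val fromForm = spark.sql(
         "SELECT extract(YEAR FROM TIMESTAMP '2019-08-12 01:00:00.123456')").head().get(0)
       // The function-call form enabled by registering `extract`.
       val commaForm = spark.sql(
         "SELECT extract('YEAR', TIMESTAMP '2019-08-12 01:00:00.123456')").head().get(0)

       // Both forms should agree on the result.
       assert(fromForm == commaForm)
       spark.stop()
     }
   }
   ```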





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] [spark] AmplabJenkins commented on issue #27944: [SPARK-31180][ML] Implement PowerTransform

2020-04-19 Thread GitBox


AmplabJenkins commented on issue #27944:
URL: https://github.com/apache/spark/pull/27944#issuecomment-616325425







This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] [spark] AmplabJenkins removed a comment on issue #27944: [SPARK-31180][ML] Implement PowerTransform

2020-04-19 Thread GitBox


AmplabJenkins removed a comment on issue #27944:
URL: https://github.com/apache/spark/pull/27944#issuecomment-616325425







This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] [spark] iRakson commented on a change in pull request #28254: [SPARK-31478][CORE]Call `StopExecutor` before killing executors

2020-04-19 Thread GitBox


iRakson commented on a change in pull request #28254:
URL: https://github.com/apache/spark/pull/28254#discussion_r411109001



##
File path: 
core/src/main/scala/org/apache/spark/scheduler/cluster/CoarseGrainedSchedulerBackend.scala
##
@@ -769,6 +769,8 @@ class CoarseGrainedSchedulerBackend(scheduler: 
TaskSchedulerImpl, val rpcEnv: Rp
 
   val killExecutors: Boolean => Future[Boolean] =
 if (executorsToKill.nonEmpty) {
+  executorsToKill.foreach(id =>
+    executorDataMap.get(id).foreach(_.executorEndpoint.send(StopExecutor)))

Review comment:
   Yes.





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] [spark] SparkQA commented on issue #27944: [SPARK-31180][ML] Implement PowerTransform

2020-04-19 Thread GitBox


SparkQA commented on issue #27944:
URL: https://github.com/apache/spark/pull/27944#issuecomment-616325119


   **[Test build #121504 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/121504/testReport)**
 for PR 27944 at commit 
[`f2fd922`](https://github.com/apache/spark/commit/f2fd9229f2d5914535ed87411e8d9080bbc5c7d9).



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] [spark] cloud-fan commented on issue #28266: [SPARK-31256][SQL] DataFrameNaFunctions.drop should work for nested columns

2020-04-19 Thread GitBox


cloud-fan commented on issue #28266:
URL: https://github.com/apache/spark/pull/28266#issuecomment-616322205


   @dongjoon-hyun yes



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] [spark] cloud-fan commented on issue #28197: [SPARK-31431][SQL] Add CalendarInterval encoder support

2020-04-19 Thread GitBox


cloud-fan commented on issue #28197:
URL: https://github.com/apache/spark/pull/28197#issuecomment-616320345


   I'd say `CalendarInterval` should be treated the same as `Decimal`. They are
semi-public, and are already partially supported (inside case classes). It's
arguable whether we want to support more.
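
   For illustration, a small sketch of the partial support mentioned above, assuming (as stated) that a `CalendarInterval` field inside a case class is already encodable, while a top-level `Dataset[CalendarInterval]` is what this PR would add:

   ```scala
   import org.apache.spark.sql.SparkSession
   import org.apache.spark.unsafe.types.CalendarInterval

   case class Timed(id: Int, span: CalendarInterval)

   object IntervalEncoderSketch {
     def main(args: Array[String]): Unit = {
       val spark = SparkSession.builder().appName("interval-sketch").master("local[1]").getOrCreate()
       import spark.implicits._

       // Works today: CalendarInterval nested inside a case class.
       Seq(Timed(1, new CalendarInterval(0, 1, 0L))).toDS().show()

       // Would need the encoder discussed in this PR:
       // val bare = Seq(new CalendarInterval(0, 1, 0L)).toDS()
       spark.stop()
     }
   }
   ```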



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] [spark] HyukjinKwon commented on a change in pull request #28104: [SPARK-31331][SQL][DOCS] Document Spark integration with Hive UDFs/UDAFs/UDTFs

2020-04-19 Thread GitBox


HyukjinKwon commented on a change in pull request #28104:
URL: https://github.com/apache/spark/pull/28104#discussion_r411103361



##
File path: docs/sql-ref-functions-udf-hive.md
##
@@ -19,4 +19,90 @@ license: |
   limitations under the License.
 ---
 
-Integration with Hive UDFs/UDAFs/UDTFs
\ No newline at end of file
+### Description
+
+Spark SQL supports integration of Hive UDFs, UDAFs and UDTFs. Similar to Spark
UDFs and UDAFs, Hive UDFs work on a single row as input and generate a single
row as output, while Hive UDAFs operate on multiple rows and return a single
aggregated row as a result. In addition, Hive also supports UDTFs (User Defined
Tabular Functions), which act on one row as input and return multiple rows as
output. To use Hive UDFs/UDAFs/UDTFs, the user should register them in Spark
and then use them in Spark SQL queries.
+
+### Examples
+
+Hive has two UDF interfaces: 
[UDF](https://github.com/apache/hive/blob/master/udf/src/java/org/apache/hadoop/hive/ql/exec/UDF.java)
 and 
[GenericUDF](https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDF.java).
+The example below uses
[GenericUDFAbs](https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFAbs.java),
which is derived from `GenericUDF`.
+
+{% highlight sql %}
+-- Register `GenericUDFAbs` and use it in Spark SQL.
+-- Note that, if you use your own implementation, you need to add the JAR containing it
+-- to the classpath first,
+-- e.g., ADD JAR yourHiveUDF.jar;
+CREATE TEMPORARY FUNCTION testUDF AS 'org.apache.hadoop.hive.ql.udf.generic.GenericUDFAbs';
+
+SELECT * FROM t;
+  +-+
+  |value|
+  +-+
+  | -1.0|
+  |  2.0|
+  | -3.0|
+  +-+
+
+SELECT testUDF(value) FROM t;
+  +--------------+
+  |testUDF(value)|
+  +--------------+
+  |           1.0|
+  |           2.0|
+  |           3.0|
+  +--------------+
+{% endhighlight %}
+
+
+The example below uses
[GenericUDTFExplode](https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDTFExplode.java),
which is derived from
[GenericUDTF](https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDTF.java).
+
+{% highlight sql %}
+-- Register `GenericUDTFExplode` and use it in Spark SQL
+CREATE TEMPORARY FUNCTION hiveUDTF
+AS 'org.apache.hadoop.hive.ql.udf.generic.GenericUDTFExplode';
+
+SELECT * FROM t;
+  +--+

Review comment:
   Also, it seems like we should comment these outputs out.





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] [spark] AmplabJenkins removed a comment on issue #28268: [SPARK-31492][ML] flatten the result dataframe of FValueTest

2020-04-19 Thread GitBox


AmplabJenkins removed a comment on issue #28268:
URL: https://github.com/apache/spark/pull/28268#issuecomment-616319622







This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] [spark] SparkQA removed a comment on issue #28268: [SPARK-31492][ML] flatten the result dataframe of FValueTest

2020-04-19 Thread GitBox


SparkQA removed a comment on issue #28268:
URL: https://github.com/apache/spark/pull/28268#issuecomment-616293948


   **[Test build #121496 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/121496/testReport)**
 for PR 28268 at commit 
[`3caf7d1`](https://github.com/apache/spark/commit/3caf7d12408f3cd6d8245c53974ea84703c5f767).



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] [spark] AmplabJenkins commented on issue #28268: [SPARK-31492][ML] flatten the result dataframe of FValueTest

2020-04-19 Thread GitBox


AmplabJenkins commented on issue #28268:
URL: https://github.com/apache/spark/pull/28268#issuecomment-616319622







This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] [spark] cloud-fan commented on a change in pull request #28226: [SPARK-31452][SQL] Do not create partition spec for 0-size partitions in AQE

2020-04-19 Thread GitBox


cloud-fan commented on a change in pull request #28226:
URL: https://github.com/apache/spark/pull/28226#discussion_r411102045



##
File path: 
sql/core/src/main/scala/org/apache/spark/sql/execution/adaptive/OptimizeSkewedJoin.scala
##
@@ -88,9 +88,11 @@ case class OptimizeSkewedJoin(conf: SQLConf) extends 
Rule[SparkPlan] {
   private def targetSize(sizes: Seq[Long], medianSize: Long): Long = {
 val advisorySize = conf.getConf(SQLConf.ADVISORY_PARTITION_SIZE_IN_BYTES)
 val nonSkewSizes = sizes.filterNot(isSkewed(_, medianSize))
-// It's impossible that all the partitions are skewed, as we use median 
size to define skew.
-assert(nonSkewSizes.nonEmpty)
-math.max(advisorySize, nonSkewSizes.sum / nonSkewSizes.length)
+if (nonSkewSizes.isEmpty) {

Review comment:
   Because we calculate the median size based on the original map stats, but
the input partitions are coalesced, it's possible that all partitions (after
coalescing) are skewed.
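
   A worked toy example of that situation (illustrative numbers, with a made-up 5x skew factor):

   ```scala
   object SkewSketch {
     def main(args: Array[String]): Unit = {
       // Original map output sizes in MB: the median is 1, so the skew
       // threshold (median * 5 here) is 5 MB.
       val mapSizes = Seq(1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 100L)
       val median = mapSizes.sorted.apply(mapSizes.size / 2)
       val threshold = median * 5

       // Coalescing may merge the eight 1 MB partitions into one 8 MB chunk,
       // so every coalesced partition exceeds the threshold and counts as skewed.
       val coalesced = Seq(8L, 100L)
       val nonSkew = coalesced.filterNot(_ > threshold)
       println(s"median=$median threshold=$threshold nonSkew=$nonSkew") // nonSkew is empty
     }
   }
   ```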





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] [spark] SparkQA commented on issue #28268: [SPARK-31492][ML] flatten the result dataframe of FValueTest

2020-04-19 Thread GitBox


SparkQA commented on issue #28268:
URL: https://github.com/apache/spark/pull/28268#issuecomment-616319130


   **[Test build #121496 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/121496/testReport)**
 for PR 28268 at commit 
[`3caf7d1`](https://github.com/apache/spark/commit/3caf7d12408f3cd6d8245c53974ea84703c5f767).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] [spark] AmplabJenkins commented on issue #26339: [SPARK-27194][SPARK-29302][SQL] Fix the issue that for dynamic partition overwrite a task would conflict with its speculative task

2020-04-19 Thread GitBox


AmplabJenkins commented on issue #26339:
URL: https://github.com/apache/spark/pull/26339#issuecomment-616318720







This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] [spark] HyukjinKwon commented on a change in pull request #28104: [SPARK-31331][SQL][DOCS] Document Spark integration with Hive UDFs/UDAFs/UDTFs

2020-04-19 Thread GitBox


HyukjinKwon commented on a change in pull request #28104:
URL: https://github.com/apache/spark/pull/28104#discussion_r411101626



##
File path: docs/sql-ref-functions-udf-hive.md
##
@@ -19,4 +19,90 @@ license: |
   limitations under the License.
 ---
 
-Integration with Hive UDFs/UDAFs/UDTFs
\ No newline at end of file
+### Description
+
+Spark SQL supports integration of Hive UDFs, UDAFs and UDTFs. Similar to Spark
UDFs and UDAFs, Hive UDFs work on a single row as input and generate a single
row as output, while Hive UDAFs operate on multiple rows and return a single
aggregated row as a result. In addition, Hive also supports UDTFs (User Defined
Tabular Functions), which act on one row as input and return multiple rows as
output. To use Hive UDFs/UDAFs/UDTFs, the user should register them in Spark
and then use them in Spark SQL queries.
+
+### Examples
+
+Hive has two UDF interfaces: 
[UDF](https://github.com/apache/hive/blob/master/udf/src/java/org/apache/hadoop/hive/ql/exec/UDF.java)
 and 
[GenericUDF](https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDF.java).
+The example below uses
[GenericUDFAbs](https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFAbs.java),
which is derived from `GenericUDF`.
+
+{% highlight sql %}
+-- Register `GenericUDFAbs` and use it in Spark SQL.
+-- Note that, if you use your own implementation, you need to add the JAR containing it
+-- to the classpath first,
+-- e.g., ADD JAR yourHiveUDF.jar;
+CREATE TEMPORARY FUNCTION testUDF AS 'org.apache.hadoop.hive.ql.udf.generic.GenericUDFAbs';
+
+SELECT * FROM t;
+  +-----+
+  |value|
+  +-----+
+  | -1.0|
+  |  2.0|
+  | -3.0|
+  +-----+
+
+SELECT testUDF(value) FROM t;
+  +--------------+
+  |testUDF(value)|
+  +--------------+
+  |           1.0|
+  |           2.0|
+  |           3.0|
+  +--------------+
+{% endhighlight %}
+
+
+The example below uses
[GenericUDTFExplode](https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDTFExplode.java),
which is derived from
[GenericUDTF](https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDTF.java).
+
+{% highlight sql %}
+-- Register `GenericUDTFExplode` and use it in Spark SQL
+CREATE TEMPORARY FUNCTION hiveUDTF
+AS 'org.apache.hadoop.hive.ql.udf.generic.GenericUDTFExplode';
+
+SELECT * FROM t;
+  +--+

Review comment:
   quick question. Why did we use:
   
   ```
 +---+
 |col|
 +---+
 |  1|
 |  2|
 |  3|
 |  4|
 +---+
   ```
   
   format over the Hive string format (which is produced by the `spark-sql` script)?
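
   For context, a minimal way to reproduce the grid format (a sketch; `spark-sql` itself prints Hive-style tab-separated values instead):

   ```scala
   import org.apache.spark.sql.SparkSession

   object ShowFormats {
     def main(args: Array[String]): Unit = {
       val spark = SparkSession.builder().appName("formats").master("local[1]").getOrCreate()
       // df.show() renders the +---+ grid quoted above; the spark-sql shell
       // would print the same rows as plain tab-separated lines.
       spark.sql("SELECT explode(array(1, 2, 3, 4)) AS col").show()
       spark.stop()
     }
   }
   ```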





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] [spark] AmplabJenkins removed a comment on issue #26339: [SPARK-27194][SPARK-29302][SQL] Fix the issue that for dynamic partition overwrite a task would conflict with its speculative task

2020-04-19 Thread GitBox


AmplabJenkins removed a comment on issue #26339:
URL: https://github.com/apache/spark/pull/26339#issuecomment-616318720







This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] [spark] cloud-fan commented on a change in pull request #28237: [SPARK-31465][SQL][DOCS] Document Literal in SQL Reference

2020-04-19 Thread GitBox


cloud-fan commented on a change in pull request #28237:
URL: https://github.com/apache/spark/pull/28237#discussion_r411101432



##
File path: docs/sql-ref-literals.md
##
@@ -0,0 +1,506 @@
+---
+layout: global
+title: Literals
+displayTitle: Literals
+license: |
+  Licensed to the Apache Software Foundation (ASF) under one or more
+  contributor license agreements.  See the NOTICE file distributed with
+  this work for additional information regarding copyright ownership.
+  The ASF licenses this file to You under the Apache License, Version 2.0
+  (the "License"); you may not use this file except in compliance with
+  the License.  You may obtain a copy of the License at
+ 
+ http://www.apache.org/licenses/LICENSE-2.0
+ 
+  Unless required by applicable law or agreed to in writing, software
+  distributed under the License is distributed on an "AS IS" BASIS,
+  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+  See the License for the specific language governing permissions and
+  limitations under the License.
+---
+
+A literal (also known as a constant) represents a fixed data value. Spark SQL 
supports the following literals:
+
+ * [String Literal](#string-literal)
+ * [Null Literal](#null-literal)
+ * [Boolean Literal](#boolean-literal)
+ * [Numeric Literal](#numeric-literal)
+ * [Datetime Literal](#datetime-literal)
+ * [Interval Literal](#interval-literal)
+
+### String Literal
+
+A string literal is used to specify a character string value.
+
+#### Syntax
+
+{% highlight sql %}
+'c [ ... ]' | "c [ ... ]"
+{% endhighlight %}
+
+#### Parameters
+
+<dl>
+  <dt><code><em>c</em></code></dt>
+  <dd>
+    One character from the character set. Use <code>\</code> to escape special characters (e.g., <code>'</code> or <code>\</code>).
+  </dd>
+</dl>
+
+#### Examples
+
+{% highlight sql %}
+SELECT 'Hello, World!' AS col;
+  +-------------+
+  |          col|
+  +-------------+
+  |Hello, World!|
+  +-------------+
+
+SELECT "SPARK SQL" AS col;
+  +---------+
+  |      col|
+  +---------+
+  |SPARK SQL|
+  +---------+
+
+SELECT 'it\'s $10.' AS col;
+  +---------+
+  |      col|
+  +---------+
+  |it's $10.|
+  +---------+
+{% endhighlight %}
+
+### Null Literal
+
+A null literal is used to specify a null value.
+
+#### Syntax
+
+{% highlight sql %}
+NULL
+{% endhighlight %}
+
+#### Examples
+
+{% highlight sql %}
+SELECT NULL AS col;
+  +----+
+  | col|
+  +----+
+  |NULL|
+  +----+
+{% endhighlight %}
+
+### Boolean Literal
+
+A boolean literal is used to specify a boolean value.
+
+#### Syntax
+
+{% highlight sql %}
+TRUE | FALSE
+{% endhighlight %}
+
+#### Examples
+
+{% highlight sql %}
+SELECT TRUE AS col;
+  +----+
+  | col|
+  +----+
+  |true|
+  +----+
+{% endhighlight %}
+
+### Numeric Literal
+
+A numeric literal is used to specify a fixed or floating-point number.
+
+#### Integer Literal
+
+##### Syntax
+
+{% highlight sql %}
+[ + | - ] digit [ ... ] [ L | S | Y ]
+{% endhighlight %}
+
+##### Parameters
+
+<dl>
+  <dt><code><em>digit</em></code></dt>
+  <dd>
+    Any numeral from 0 to 9.
+  </dd>
+  <dt><code><em>L</em></code></dt>
+  <dd>
+    Case insensitive, indicates <code>BIGINT</code>, which is an 8-byte signed integer number.
+  </dd>
+  <dt><code><em>S</em></code></dt>
+  <dd>
+    Case insensitive, indicates <code>SMALLINT</code>, which is a 2-byte signed integer number.
+  </dd>
+  <dt><code><em>Y</em></code></dt>
+  <dd>
+    Case insensitive, indicates <code>TINYINT</code>, which is a 1-byte signed integer number.
+  </dd>
+  <dt><code><em>default (no postfix)</em></code></dt>
+  <dd>
+    Indicates a 4-byte signed integer number.
+  </dd>
+</dl>
+
+##### Examples
+
+{% highlight sql %}
+SELECT -2147483648 AS col;
+  +-----------+
+  |        col|
+  +-----------+
+  |-2147483648|
+  +-----------+
+
+SELECT 9223372036854775807l AS col;
+  +-------------------+
+  |                col|
+  +-------------------+
+  |9223372036854775807|
+  +-------------------+
+
+SELECT -32Y AS col;
+  +---+
+  |col|
+  +---+
+  |-32|
+  +---+
+
+SELECT 482S AS col;
+  +---+
+  |col|
+  +---+
+  |482|
+  +---+
+{% endhighlight %}
+
+#### Decimal Literal
+
+##### Syntax
+
+{% highlight sql %}
+[ + | - ] { digit [ ... ] . [ digit [ ... ] ] | . digit [ ... ] }
+{% endhighlight %}
+
+##### Parameters
+
+<dl>
+  <dt><code><em>digit</em></code></dt>
+  <dd>
+    Any numeral from 0 to 9.
+  </dd>
+</dl>
+
+##### Examples
+
+{% highlight sql %}
+SELECT 12.578 AS col;
+  +------+
+  |   col|
+  +------+
+  |12.578|
+  +------+
+
+SELECT -0.1234567 AS col;
+  +----------+
+  |       col|
+  +----------+
+  |-0.1234567|
+  +----------+
+
+SELECT -.1234567 AS col;
+  +----------+
+  |       col|
+  +----------+
+  |-0.1234567|
+  +----------+
+{% endhighlight %}
+
+#### Floating Point and BigDecimal Literals
+
+##### Syntax
+
+{% highlight sql %}
+[ + | - ] { digit [ ... ] [ E [ + | - ] digit [ ... ] ] [ D | BD ] |
+digit [ ... ] . [ digit [ ... ] ] [ E [ + | - ] digit [ ... ] ] [ D | BD ] |
+. digit [ ... ] [ E [ + | - ] digit [ ... ] ] [ D | BD ] }
+{% endhighlight %}
+
+##### Parameters
+
+<dl>
+  <dt><code><em>digit</em></code></dt>
+  <dd>
+    Any numeral from 0 to 9.
+  </dd>
+  <dt><code><em>D</em></code></dt>
+  <dd>
+    Case insensitive, indicates <code>DOUBLE</code>, which is an 8-byte double-precision floating point number.
+  </dd>
+  <dt><code><em>BD</em></code></dt>
+  <dd>
+    Case insensitive, indicates 

[GitHub] [spark] SparkQA commented on issue #26339: [SPARK-27194][SPARK-29302][SQL] Fix the issue that for dynamic partition overwrite a task would conflict with its speculative task

2020-04-19 Thread GitBox


SparkQA commented on issue #26339:
URL: https://github.com/apache/spark/pull/26339#issuecomment-616318288


   **[Test build #121503 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/121503/testReport)**
 for PR 26339 at commit 
[`fdeeb5c`](https://github.com/apache/spark/commit/fdeeb5c3acf3f917a370a77d7327401949eb34a4).



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] [spark] cloud-fan commented on a change in pull request #28237: [SPARK-31465][SQL][DOCS] Document Literal in SQL Reference

2020-04-19 Thread GitBox


cloud-fan commented on a change in pull request #28237:
URL: https://github.com/apache/spark/pull/28237#discussion_r411100904



##
File path: docs/sql-ref-literals.md
##

[GitHub] [spark] cloud-fan commented on a change in pull request #28237: [SPARK-31465][SQL][DOCS] Document Literal in SQL Reference

2020-04-19 Thread GitBox


cloud-fan commented on a change in pull request #28237:
URL: https://github.com/apache/spark/pull/28237#discussion_r411100552



##
File path: docs/sql-ref-literals.md
##

[GitHub] [spark] cloud-fan commented on a change in pull request #28237: [SPARK-31465][SQL][DOCS] Document Literal in SQL Reference

2020-04-19 Thread GitBox


cloud-fan commented on a change in pull request #28237:
URL: https://github.com/apache/spark/pull/28237#discussion_r411099766



##
File path: docs/sql-ref-literals.md
##

[GitHub] [spark] cloud-fan commented on a change in pull request #28237: [SPARK-31465][SQL][DOCS] Document Literal in SQL Reference

2020-04-19 Thread GitBox


cloud-fan commented on a change in pull request #28237:
URL: https://github.com/apache/spark/pull/28237#discussion_r411099482



##
File path: docs/sql-ref-literals.md
##

Review comment:
   Also, there is no way to create float type literals (except by using
functions like `cast` and `float`). Maybe we should look at the SQL standard
and support them later.
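
   For reference, a small sketch of the workaround mentioned above, assuming the `typeof` function available since Spark 3.0 for displaying result types:

   ```scala
   import org.apache.spark.sql.SparkSession

   object FloatLiteralWorkaround {
     def main(args: Array[String]): Unit = {
       val spark = SparkSession.builder().appName("float-literals").master("local[1]").getOrCreate()

       // There is no FLOAT literal suffix, so a cast (or its `float(...)` alias)
       // is the only way to produce a FLOAT constant today.
       spark.sql(
         "SELECT typeof(CAST(1.5 AS FLOAT)) AS via_cast, typeof(float(1.5)) AS via_alias").show()
       // Both columns should report: float
       spark.stop()
     }
   }
   ```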





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] [spark] dongjoon-hyun commented on issue #28148: [SPARK-31381][SPARK-29245][SQL] Upgrade built-in Hive 2.3.6 to 2.3.7

2020-04-19 Thread GitBox


dongjoon-hyun commented on issue #28148:
URL: https://github.com/apache/spark/pull/28148#issuecomment-616316299


   Finally! Thank you, @wangyum.



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] [spark] maropu commented on a change in pull request #28237: [SPARK-31465][SQL][DOCS] Document Literal in SQL Reference

2020-04-19 Thread GitBox


maropu commented on a change in pull request #28237:
URL: https://github.com/apache/spark/pull/28237#discussion_r411099259



##
File path: docs/sql-ref-literals.md
##

Review comment:
   > I think we should mention BD in the Decimal Literal section.
   
   +1





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: 

[GitHub] [spark] cloud-fan commented on a change in pull request #28237: [SPARK-31465][SQL][DOCS] Document Literal in SQL Reference

2020-04-19 Thread GitBox


cloud-fan commented on a change in pull request #28237:
URL: https://github.com/apache/spark/pull/28237#discussion_r411098844



##
File path: docs/sql-ref-literals.md
##
@@ -0,0 +1,506 @@
+---
+layout: global
+title: Literals
+displayTitle: Literals
+license: |
+  Licensed to the Apache Software Foundation (ASF) under one or more
+  contributor license agreements.  See the NOTICE file distributed with
+  this work for additional information regarding copyright ownership.
+  The ASF licenses this file to You under the Apache License, Version 2.0
+  (the "License"); you may not use this file except in compliance with
+  the License.  You may obtain a copy of the License at
+ 
+ http://www.apache.org/licenses/LICENSE-2.0
+ 
+  Unless required by applicable law or agreed to in writing, software
+  distributed under the License is distributed on an "AS IS" BASIS,
+  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+  See the License for the specific language governing permissions and
+  limitations under the License.
+---
+
+A literal (also known as a constant) represents a fixed data value. Spark SQL 
supports the following literals:
+
+ * [String Literal](#string-literal)
+ * [Null Literal](#null-literal)
+ * [Boolean Literal](#boolean-literal)
+ * [Numeric Literal](#numeric-literal)
+ * [Datetime Literal](#datetime-literal)
+ * [Interval Literal](#interval-literal)
+
+### String Literal
+
+A string literal is used to specify a character string value.
+
+ Syntax
+
+{% highlight sql %}
+'c [ ... ]' | "c [ ... ]"
+{% endhighlight %}
+
+ Parameters
+
+
+  c
+  
+One character from the character set. Use \ to escape special 
characters (e.g., ' or \).
+  
+
+
+ Examples
+
+{% highlight sql %}
+SELECT 'Hello, World!' AS col;
+  +-------------+
+  |          col|
+  +-------------+
+  |Hello, World!|
+  +-------------+
+
+SELECT "SPARK SQL" AS col;
+  +---------+
+  |      col|
+  +---------+
+  |SPARK SQL|
+  +---------+
+
+SELECT 'it\'s $10.' AS col;
+  +---------+
+  |      col|
+  +---------+
+  |it's $10.|
+  +---------+
+{% endhighlight %}
+
+### Null Literal
+
+A null literal is used to specify a null value.
+
+ Syntax
+
+{% highlight sql %}
+NULL
+{% endhighlight %}
+
+ Examples
+
+{% highlight sql %}
+SELECT NULL AS col;
+  ++
+  | col|
+  ++
+  |NULL|
+  ++
+{% endhighlight %}
+
+### Boolean Literal
+
+A boolean literal is used to specify a boolean value.
+
+ Syntax
+
+{% highlight sql %}
+TRUE | FALSE
+{% endhighlight %}
+
+ Examples
+
+{% highlight sql %}
+SELECT TRUE AS col;
+  ++
+  | col|
+  ++
+  |true|
+  ++
+{% endhighlight %}
+
+### Numeric Literal
+
+A numeric literal is used to specify a fixed or floating-point number.
+
+ Integer Literal
+
+ Syntax
+
+{% highlight sql %}
+[ + | - ] digit [ ... ] [ L | S | Y ]
+{% endhighlight %}
+
+ Parameters
+
+
+  digit
+  
+Any numeral from 0 to 9.
+  
+
+
+  L
+  
+Case insensitive, indicates BIGINT, which is an 8-byte signed integer number.
+  
+
+
+  S
+  
+Case insensitive, indicates SMALLINT, which is a 2-byte 
signed integer number.
+  
+
+
+  Y
+  
+Case insensitive, indicates TINYINT, which is a 1-byte signed 
integer number.
+  
+
+
+  default (no postfix)
+  
+Indicates a 4-byte signed integer number.
+  
+
+
+ Examples
+
+{% highlight sql %}
+SELECT -2147483648 AS col;
+  +---+
+  |col|
+  +---+
+  |-2147483648|
+  +---+
+
+SELECT 9223372036854775807l AS col;
+  +---+
+  |col|
+  +---+
+  |9223372036854775807|
+  +---+
+
+SELECT -32Y AS col;
+  +---+
+  |col|
+  +---+
+  |-32|
+  +---+
+
+SELECT 482S AS col;
+  +---+
+  |col|
+  +---+
+  |482|
+  +---+
+{% endhighlight %}
+
+ Decimal Literal
+
+ Syntax
+
+{% highlight sql %}
+[ + | - ] { digit [ ... ] . [ digit [ ... ] ] | . digit [ ... ] }

Review comment:
   seems better to mention the fraction literals together:
   ```
   decimal_digits: [ + | - ] { digit [ ... ] . [ digit [ ... ] ] | . digit [ 
... ] }
   exponent: E [+ | -] digit [ ... ]
   
   decimal literal: decimal_digits | decimal_digits [exponent] 'BD'
   double literal: decimal_digits exponent | decimal_digits [exponent] 'D'
   ```
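   
   A few queries that exercise those forms (a sketch, assuming the postfix rules above: a bare exponent or `D` yields a double, while `BD` yields a decimal):
   ```sql
   SELECT 5.2BD AS col;    -- a decimal literal via the 'BD' postfix
   SELECT 5E2 AS col;      -- 500.0, a double (decimal_digits exponent)
   SELECT 5.2E2D AS col;   -- 520.0, a double via the 'D' postfix
   ```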





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] [spark] dongjoon-hyun commented on issue #28266: [SPARK-31256][SQL] DataFrameNaFunctions.drop should work for nested columns

2020-04-19 Thread GitBox


dongjoon-hyun commented on issue #28266:
URL: https://github.com/apache/spark/pull/28266#issuecomment-616314632


   So, SPARK-31256 introduced a regression in 2.4.5 and this recovers it?



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] [spark] viirya commented on a change in pull request #27728: [SPARK-25556][SPARK-17636][SPARK-31026][SPARK-31060][SQL][test-hive1.2] Nested Column Predicate Pushdown for Parquet

2020-04-19 Thread GitBox


viirya commented on a change in pull request #27728:
URL: https://github.com/apache/spark/pull/27728#discussion_r411096547



##
File path: 
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/DataSourceStrategy.scala
##
@@ -652,10 +652,19 @@ object DataSourceStrategy {
  */
 object PushableColumn {
   def unapply(e: Expression): Option[String] = {
-def helper(e: Expression) = e match {
-  case a: Attribute => Some(a.name)
+val nestedPredicatePushdownEnabled = 
SQLConf.get.nestedPredicatePushdownEnabled
+import 
org.apache.spark.sql.connector.catalog.CatalogV2Implicits.MultipartIdentifierHelper
+def helper(e: Expression): Option[Seq[String]] = e match {
+  case a: Attribute =>
+if (nestedPredicatePushdownEnabled || !a.name.contains(".")) {
+  Some(Seq(a.name))
+} else {
+  None
+}
+  case s: GetStructField if nestedPredicatePushdownEnabled =>
+helper(s.child).map(_ :+ s.childSchema(s.ordinal).name)
   case _ => None
 }
-helper(e)
+helper(e).map(_.quoted)

Review comment:
   I can try looking at this this week. If anyone picks it up before me, 
I'm also ok.
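   
   A small end-to-end illustration of what the helper above enables (a sketch; the path and schema are hypothetical, and nested predicate pushdown must be enabled):
   ```scala
   import org.apache.spark.sql.SparkSession
   import org.apache.spark.sql.functions.col
   
   val spark = SparkSession.builder().master("local[*]").appName("npd").getOrCreate()
   
   // Assumed Parquet schema: name struct<first: string, last: string>
   val df = spark.read.parquet("/tmp/people")
   
   // With the feature on, GetStructField over the "name" attribute unfolds to
   // Seq("name", "first") and is quoted as "name.first" for the source filter;
   // look for it under PushedFilters in the physical plan.
   df.filter(col("name.first") === "Ann").explain()
   ```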





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] [spark] cloud-fan commented on a change in pull request #28237: [SPARK-31465][SQL][DOCS] Document Literal in SQL Reference

2020-04-19 Thread GitBox


cloud-fan commented on a change in pull request #28237:
URL: https://github.com/apache/spark/pull/28237#discussion_r411096138



##
File path: docs/sql-ref-literals.md
##
@@ -0,0 +1,505 @@
+---
+layout: global
+title: Literals
+displayTitle: Literals
+license: |
+  Licensed to the Apache Software Foundation (ASF) under one or more
+  contributor license agreements.  See the NOTICE file distributed with
+  this work for additional information regarding copyright ownership.
+  The ASF licenses this file to You under the Apache License, Version 2.0
+  (the "License"); you may not use this file except in compliance with
+  the License.  You may obtain a copy of the License at
+ 
+ http://www.apache.org/licenses/LICENSE-2.0
+ 
+  Unless required by applicable law or agreed to in writing, software
+  distributed under the License is distributed on an "AS IS" BASIS,
+  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+  See the License for the specific language governing permissions and
+  limitations under the License.
+---
+
+A literal (also known as a constant) represents a fixed data value. Spark SQL 
supports the following literals:
+
+ * [String Literal](#string-literal)
+ * [Null Literal](#null-literal)
+ * [Boolean Literal](#boolean-literal)
+ * [Numeric Literal](#numeric-literal)
+ * [Datetime Literal](#datetime-literal)
+ * [Interval Literal](#interval-literal)
+
+### String Literal
+
+A string literal is used to specify a character string value.
+
+ Syntax
+
+{% highlight sql %}
+'c [ ... ]' | "c [ ... ]"
+{% endhighlight %}
+
+ Parameters
+
+
+  c
+  
+One character from the character set. Use \ to escape special 
characters.
+  
+
+
+ Examples
+
+{% highlight sql %}
+SELECT 'Hello, World!' AS col;
+  +-+
+  |  col|
+  +-+
+  |Hello, World!|
+  +-+
+
+SELECT "SPARK SQL" AS col;
+  +-+
+  |  col|
+  +-+
+  |Spark SQL|
+  +-+
+
+SELECT 'it\'s $10.' AS col;
+  +-+
+  |  col|
+  +-+
+  |It's $10.|
+  +-+
+{% endhighlight %}
+
+### Null Literal
+
+A null literal is used to specify a null value.
+
+ Syntax
+
+{% highlight sql %}
+NULL
+{% endhighlight %}
+
+ Examples
+
+{% highlight sql %}
+SELECT NULL AS col;
+  ++
+  | col|
+  ++
+  |NULL|
+  ++
+{% endhighlight %}
+
+### Boolean Literal
+
+A boolean literal is used to specify a boolean value.
+
+ Syntax
+
+{% highlight sql %}
+TRUE | FALSE
+{% endhighlight %}
+
+ Examples
+
+{% highlight sql %}
+SELECT TRUE AS col;
+  ++
+  | col|
+  ++
+  |true|
+  ++
+{% endhighlight %}
+
+### Numeric Literal
+
+A numeric literal is used to specify a fixed or floating-point number.
+
+ Integer Literal
+
+ Syntax
+
+{% highlight sql %}
+[ + | - ] digit [ ... ] [ L | S | Y ]
+{% endhighlight %}
+
+ Parameters
+
+
+  digit
+  
+Any numeral from 0 to 9.
+  
+
+
+  L
+  
+Case insensitive, indicates BIGINT, which is an 8-byte signed integer number.
+  
+
+
+  S
+  
+Case insensitive, indicates SMALLINT, which is a 2-byte 
signed integer number.
+  
+
+
+  Y
+  
+Case insensitive, indicates TINYINT, which is a 1-byte signed 
integer number.
+  
+
+
+  default (no postfix)
+  
+Indicates a 4-byte signed integer number.
+  
+
+ Examples
+
+{% highlight sql %}
+SELECT -2147483648 AS col;
+  +---+
+  |col|
+  +---+
+  |-2147483648|
+  +---+
+
+SELECT 9223372036854775807l AS col;
+  +---+
+  |col|
+  +---+
+  |9223372036854775807|
+  +---+
+
+SELECT -32Y AS col;
+  +---+
+  |col|
+  +---+
+  |-32|
+  +---+
+
+SELECT 482S AS col;
+  +---+
+  |col|
+  +---+
+  |482|
+  +---+
+{% endhighlight %}
+
+ Decimal Literal
+
+ Syntax
+
+{% highlight sql %}
+[ + | - ] { digit [ ... ] . [ digit [ ... ] ] | . digit [ ... ] }
+{% endhighlight %}
+
+ Parameters
+
+
+  digit
+  
+Any numeral from 0 to 9.
+  
+
+
+ Examples
+
+{% highlight sql %}
+SELECT 12.578 AS col;
+  +--+
+  |   col|
+  +--+
+  |12.578|
+  +--+
+
+SELECT -0.1234567 AS col;
+  +--+
+  |   col|
+  +--+
+  |-0.1234567|
+  +--+
+
+SELECT -.1234567 AS col;
+  +--+
+  |   col|
+  +--+
+  |-0.1234567|
+  +--+
+{% endhighlight %}
+
+ Floating Point and BigDecimal Literals

Review comment:
   they are different SQL syntax, but they both create decimal literals 
(decimal type values). I think we should mention `BD` in the `Decimal Literal` 
section. 





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [spark] cloud-fan commented on a change in pull request #28237: [SPARK-31465][SQL][DOCS] Document Literal in SQL Reference

2020-04-19 Thread GitBox


cloud-fan commented on a change in pull request #28237:
URL: https://github.com/apache/spark/pull/28237#discussion_r411095492



##
File path: docs/sql-ref-literals.md
##
@@ -0,0 +1,506 @@
+---
+layout: global
+title: Literals
+displayTitle: Literals
+license: |
+  Licensed to the Apache Software Foundation (ASF) under one or more
+  contributor license agreements.  See the NOTICE file distributed with
+  this work for additional information regarding copyright ownership.
+  The ASF licenses this file to You under the Apache License, Version 2.0
+  (the "License"); you may not use this file except in compliance with
+  the License.  You may obtain a copy of the License at
+ 
+ http://www.apache.org/licenses/LICENSE-2.0
+ 
+  Unless required by applicable law or agreed to in writing, software
+  distributed under the License is distributed on an "AS IS" BASIS,
+  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+  See the License for the specific language governing permissions and
+  limitations under the License.
+---
+
+A literal (also known as a constant) represents a fixed data value. Spark SQL 
supports the following literals:
+
+ * [String Literal](#string-literal)
+ * [Null Literal](#null-literal)
+ * [Boolean Literal](#boolean-literal)
+ * [Numeric Literal](#numeric-literal)
+ * [Datetime Literal](#datetime-literal)
+ * [Interval Literal](#interval-literal)
+
+### String Literal
+
+A string literal is used to specify a character string value.
+
+ Syntax
+
+{% highlight sql %}
+'c [ ... ]' | "c [ ... ]"
+{% endhighlight %}
+
+ Parameters
+
+
+  c
+  
+One character from the character set. Use \ to escape special 
characters (e.g., ' or \).
+  
+
+
+ Examples
+
+{% highlight sql %}
+SELECT 'Hello, World!' AS col;
+  +-+
+  |  col|
+  +-+
+  |Hello, World!|
+  +-+
+
+SELECT "SPARK SQL" AS col;
+  +-+
+  |  col|
+  +-+
+  |Spark SQL|
+  +-+
+
+SELECT 'it\'s $10.' AS col;
+  +-+
+  |  col|
+  +-+
+  |It's $10.|
+  +-+
+{% endhighlight %}
+
+### Null Literal
+
+A null literal is used to specify a null value.
+
+ Syntax
+
+{% highlight sql %}
+NULL
+{% endhighlight %}
+
+ Examples
+
+{% highlight sql %}
+SELECT NULL AS col;
+  ++
+  | col|
+  ++
+  |NULL|
+  ++
+{% endhighlight %}
+
+### Boolean Literal
+
+A boolean literal is used to specify a boolean value.
+
+ Syntax
+
+{% highlight sql %}
+TRUE | FALSE
+{% endhighlight %}
+
+ Examples
+
+{% highlight sql %}
+SELECT TRUE AS col;
+  ++
+  | col|
+  ++
+  |true|
+  ++
+{% endhighlight %}
+
+### Numeric Literal
+
+A numeric literal is used to specify a fixed or floating-point number.
+
+ Integer Literal
+
+ Syntax
+
+{% highlight sql %}
+[ + | - ] digit [ ... ] [ L | S | Y ]
+{% endhighlight %}
+
+ Parameters
+
+
+  digit
+  
+Any numeral from 0 to 9.
+  
+
+
+  L
+  
+Case insensitive, indicates BIGINT, which is an 8-byte signed integer number.
+  
+
+
+  S
+  
+Case insensitive, indicates SMALLINT, which is a 2-byte 
signed integer number.
+  
+
+
+  Y
+  
+Case insensitive, indicates TINYINT, which is a 1-byte signed 
integer number.
+  
+
+
+  default (no postfix)
+  
+Indicates a 4-byte signed integer number.
+  
+
+
+ Examples
+
+{% highlight sql %}
+SELECT -2147483648 AS col;
+  +---+
+  |col|
+  +---+
+  |-2147483648|
+  +---+
+
+SELECT 9223372036854775807l AS col;
+  +---+
+  |col|
+  +---+
+  |9223372036854775807|
+  +---+
+
+SELECT -32Y AS col;
+  +---+
+  |col|
+  +---+
+  |-32|
+  +---+
+
+SELECT 482S AS col;
+  +---+
+  |col|
+  +---+
+  |482|
+  +---+
+{% endhighlight %}
+
+ Decimal Literal
+
+ Syntax
+
+{% highlight sql %}
+[ + | - ] { digit [ ... ] . [ digit [ ... ] ] | . digit [ ... ] }
+{% endhighlight %}
+
+ Parameters
+
+
+  digit
+  
+Any numeral from 0 to 9.
+  
+
+
+ Examples
+
+{% highlight sql %}
+SELECT 12.578 AS col;
+  +--+
+  |   col|
+  +--+
+  |12.578|
+  +--+
+
+SELECT -0.1234567 AS col;
+  +--+
+  |   col|
+  +--+
+  |-0.1234567|
+  +--+
+
+SELECT -.1234567 AS col;

Review comment:
   `123.` is also a decimal, right? which is the same as `123.0`
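   
   A quick check of that trailing-dot form (sketch):
   ```sql
   SELECT 123. AS col;   -- parses as a decimal, same value as 123.0
   ```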





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] [spark] viirya commented on issue #28197: [SPARK-31431][SQL] Add CalendarInterval encoder support

2020-04-19 Thread GitBox


viirya commented on issue #28197:
URL: https://github.com/apache/spark/pull/28197#issuecomment-616311588


   Do we expect users to read data and represent it as CalendarInterval in a 
Dataset? It seems to me that CalendarInterval is only meant for use inside 
Spark's internal rows. Although not the same, it sounds similar to exposing 
UTF8String in a Dataset.
   
   The domain objects we provide Dataset encoders for should, I think, be 
frequently used domain objects that users rely on in their business logic.
   
   This encoder looked strange to me at first. It doesn't seem to be a problem, 
but I wonder whether we should be careful about adding encoders. If others also 
agree to add it, I'm fine with this.



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] [spark] cloud-fan commented on a change in pull request #28237: [SPARK-31465][SQL][DOCS] Document Literal in SQL Reference

2020-04-19 Thread GitBox


cloud-fan commented on a change in pull request #28237:
URL: https://github.com/apache/spark/pull/28237#discussion_r411095121



##
File path: docs/sql-ref-literals.md
##
@@ -0,0 +1,506 @@
+---
+layout: global
+title: Literals
+displayTitle: Literals
+license: |
+  Licensed to the Apache Software Foundation (ASF) under one or more
+  contributor license agreements.  See the NOTICE file distributed with
+  this work for additional information regarding copyright ownership.
+  The ASF licenses this file to You under the Apache License, Version 2.0
+  (the "License"); you may not use this file except in compliance with
+  the License.  You may obtain a copy of the License at
+ 
+ http://www.apache.org/licenses/LICENSE-2.0
+ 
+  Unless required by applicable law or agreed to in writing, software
+  distributed under the License is distributed on an "AS IS" BASIS,
+  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+  See the License for the specific language governing permissions and
+  limitations under the License.
+---
+
+A literal (also known as a constant) represents a fixed data value. Spark SQL 
supports the following literals:
+
+ * [String Literal](#string-literal)
+ * [Null Literal](#null-literal)
+ * [Boolean Literal](#boolean-literal)
+ * [Numeric Literal](#numeric-literal)
+ * [Datetime Literal](#datetime-literal)
+ * [Interval Literal](#interval-literal)
+
+### String Literal
+
+A string literal is used to specify a character string value.
+
+ Syntax
+
+{% highlight sql %}
+'c [ ... ]' | "c [ ... ]"
+{% endhighlight %}
+
+ Parameters
+
+
+  c
+  
+One character from the character set. Use \ to escape special 
characters (e.g., ' or \).
+  
+
+
+ Examples
+
+{% highlight sql %}
+SELECT 'Hello, World!' AS col;
+  +-+
+  |  col|
+  +-+
+  |Hello, World!|
+  +-+
+
+SELECT "SPARK SQL" AS col;
+  +-+
+  |  col|
+  +-+
+  |Spark SQL|
+  +-+
+
+SELECT 'it\'s $10.' AS col;
+  +-+
+  |  col|
+  +-+
+  |It's $10.|
+  +-+
+{% endhighlight %}
+
+### Null Literal
+
+A null literal is used to specify a null value.
+
+ Syntax
+
+{% highlight sql %}
+NULL
+{% endhighlight %}
+
+ Examples
+
+{% highlight sql %}
+SELECT NULL AS col;
+  ++
+  | col|
+  ++
+  |NULL|
+  ++
+{% endhighlight %}
+
+### Boolean Literal
+
+A boolean literal is used to specify a boolean value.
+
+ Syntax
+
+{% highlight sql %}
+TRUE | FALSE
+{% endhighlight %}
+
+ Examples
+
+{% highlight sql %}
+SELECT TRUE AS col;
+  ++
+  | col|
+  ++
+  |true|
+  ++
+{% endhighlight %}
+
+### Numeric Literal
+
+A numeric literal is used to specify a fixed or floating-point number.
+
+ Integer Literal
+
+ Syntax
+
+{% highlight sql %}
+[ + | - ] digit [ ... ] [ L | S | Y ]
+{% endhighlight %}
+
+ Parameters
+
+
+  digit
+  
+Any numeral from 0 to 9.
+  
+
+
+  L
+  
+Case insensitive, indicates BIGINT, which is an 8-byte signed integer number.
+  
+
+
+  S
+  
+Case insensitive, indicates SMALLINT, which is a 2-byte 
signed integer number.
+  
+
+
+  Y
+  
+Case insensitive, indicates TINYINT, which is a 1-byte signed 
integer number.
+  
+
+
+  default (no postfix)
+  
+Indicates a 4-byte signed integer number.
+  
+
+
+ Examples
+
+{% highlight sql %}
+SELECT -2147483648 AS col;
+  +---+
+  |col|
+  +---+
+  |-2147483648|
+  +---+
+
+SELECT 9223372036854775807l AS col;
+  +---+
+  |col|
+  +---+
+  |9223372036854775807|
+  +---+
+
+SELECT -32Y AS col;
+  +---+
+  |col|
+  +---+
+  |-32|
+  +---+
+
+SELECT 482S AS col;
+  +---+
+  |col|
+  +---+
+  |482|
+  +---+
+{% endhighlight %}
+
+ Decimal Literal
+
+ Syntax
+
+{% highlight sql %}
+[ + | - ] { digit [ ... ] . [ digit [ ... ] ] | . digit [ ... ] }

Review comment:
   nit:
   ```
   [ + | - ] { { digit [ ... ] . [digit [ ... ] ] } | { . digit [ ... ] } }
   ```





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] [spark] cloud-fan commented on a change in pull request #28237: [SPARK-31465][SQL][DOCS] Document Literal in SQL Reference

2020-04-19 Thread GitBox


cloud-fan commented on a change in pull request #28237:
URL: https://github.com/apache/spark/pull/28237#discussion_r411095492



##
File path: docs/sql-ref-literals.md
##
@@ -0,0 +1,506 @@
+---
+layout: global
+title: Literals
+displayTitle: Literals
+license: |
+  Licensed to the Apache Software Foundation (ASF) under one or more
+  contributor license agreements.  See the NOTICE file distributed with
+  this work for additional information regarding copyright ownership.
+  The ASF licenses this file to You under the Apache License, Version 2.0
+  (the "License"); you may not use this file except in compliance with
+  the License.  You may obtain a copy of the License at
+ 
+ http://www.apache.org/licenses/LICENSE-2.0
+ 
+  Unless required by applicable law or agreed to in writing, software
+  distributed under the License is distributed on an "AS IS" BASIS,
+  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+  See the License for the specific language governing permissions and
+  limitations under the License.
+---
+
+A literal (also known as a constant) represents a fixed data value. Spark SQL 
supports the following literals:
+
+ * [String Literal](#string-literal)
+ * [Null Literal](#null-literal)
+ * [Boolean Literal](#boolean-literal)
+ * [Numeric Literal](#numeric-literal)
+ * [Datetime Literal](#datetime-literal)
+ * [Interval Literal](#interval-literal)
+
+### String Literal
+
+A string literal is used to specify a character string value.
+
+ Syntax
+
+{% highlight sql %}
+'c [ ... ]' | "c [ ... ]"
+{% endhighlight %}
+
+ Parameters
+
+
+  c
+  
+One character from the character set. Use \ to escape special 
characters (e.g., ' or \).
+  
+
+
+ Examples
+
+{% highlight sql %}
+SELECT 'Hello, World!' AS col;
+  +-+
+  |  col|
+  +-+
+  |Hello, World!|
+  +-+
+
+SELECT "SPARK SQL" AS col;
+  +-+
+  |  col|
+  +-+
+  |Spark SQL|
+  +-+
+
+SELECT 'it\'s $10.' AS col;
+  +-+
+  |  col|
+  +-+
+  |It's $10.|
+  +-+
+{% endhighlight %}
+
+### Null Literal
+
+A null literal is used to specify a null value.
+
+ Syntax
+
+{% highlight sql %}
+NULL
+{% endhighlight %}
+
+ Examples
+
+{% highlight sql %}
+SELECT NULL AS col;
+  ++
+  | col|
+  ++
+  |NULL|
+  ++
+{% endhighlight %}
+
+### Boolean Literal
+
+A boolean literal is used to specify a boolean value.
+
+ Syntax
+
+{% highlight sql %}
+TRUE | FALSE
+{% endhighlight %}
+
+ Examples
+
+{% highlight sql %}
+SELECT TRUE AS col;
+  ++
+  | col|
+  ++
+  |true|
+  ++
+{% endhighlight %}
+
+### Numeric Literal
+
+A numeric literal is used to specify a fixed or floating-point number.
+
+ Integer Literal
+
+ Syntax
+
+{% highlight sql %}
+[ + | - ] digit [ ... ] [ L | S | Y ]
+{% endhighlight %}
+
+ Parameters
+
+
+  digit
+  
+Any numeral from 0 to 9.
+  
+
+
+  L
+  
+Case insensitive, indicates BIGINT, which is an 8-byte signed integer number.
+  
+
+
+  S
+  
+Case insensitive, indicates SMALLINT, which is a 2-byte 
signed integer number.
+  
+
+
+  Y
+  
+Case insensitive, indicates TINYINT, which is a 1-byte signed 
integer number.
+  
+
+
+  default (no postfix)
+  
+Indicates a 4-byte signed integer number.
+  
+
+
+ Examples
+
+{% highlight sql %}
+SELECT -2147483648 AS col;
+  +---+
+  |col|
+  +---+
+  |-2147483648|
+  +---+
+
+SELECT 9223372036854775807l AS col;
+  +---+
+  |col|
+  +---+
+  |9223372036854775807|
+  +---+
+
+SELECT -32Y AS col;
+  +---+
+  |col|
+  +---+
+  |-32|
+  +---+
+
+SELECT 482S AS col;
+  +---+
+  |col|
+  +---+
+  |482|
+  +---+
+{% endhighlight %}
+
+ Decimal Literal
+
+ Syntax
+
+{% highlight sql %}
+[ + | - ] { digit [ ... ] . [ digit [ ... ] ] | . digit [ ... ] }
+{% endhighlight %}
+
+ Parameters
+
+
+  digit
+  
+Any numeral from 0 to 9.
+  
+
+
+ Examples
+
+{% highlight sql %}
+SELECT 12.578 AS col;
+  +--+
+  |   col|
+  +--+
+  |12.578|
+  +--+
+
+SELECT -0.1234567 AS col;
+  +--+
+  |   col|
+  +--+
+  |-0.1234567|
+  +--+
+
+SELECT -.1234567 AS col;

Review comment:
   `123.` is also a decimal, right?





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] [spark] cloud-fan commented on a change in pull request #28237: [SPARK-31465][SQL][DOCS] Document Literal in SQL Reference

2020-04-19 Thread GitBox


cloud-fan commented on a change in pull request #28237:
URL: https://github.com/apache/spark/pull/28237#discussion_r411095121



##
File path: docs/sql-ref-literals.md
##
@@ -0,0 +1,506 @@
+---
+layout: global
+title: Literals
+displayTitle: Literals
+license: |
+  Licensed to the Apache Software Foundation (ASF) under one or more
+  contributor license agreements.  See the NOTICE file distributed with
+  this work for additional information regarding copyright ownership.
+  The ASF licenses this file to You under the Apache License, Version 2.0
+  (the "License"); you may not use this file except in compliance with
+  the License.  You may obtain a copy of the License at
+ 
+ http://www.apache.org/licenses/LICENSE-2.0
+ 
+  Unless required by applicable law or agreed to in writing, software
+  distributed under the License is distributed on an "AS IS" BASIS,
+  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+  See the License for the specific language governing permissions and
+  limitations under the License.
+---
+
+A literal (also known as a constant) represents a fixed data value. Spark SQL 
supports the following literals:
+
+ * [String Literal](#string-literal)
+ * [Null Literal](#null-literal)
+ * [Boolean Literal](#boolean-literal)
+ * [Numeric Literal](#numeric-literal)
+ * [Datetime Literal](#datetime-literal)
+ * [Interval Literal](#interval-literal)
+
+### String Literal
+
+A string literal is used to specify a character string value.
+
+ Syntax
+
+{% highlight sql %}
+'c [ ... ]' | "c [ ... ]"
+{% endhighlight %}
+
+ Parameters
+
+
+  c
+  
+One character from the character set. Use \ to escape special 
characters (e.g., ' or \).
+  
+
+
+ Examples
+
+{% highlight sql %}
+SELECT 'Hello, World!' AS col;
+  +-+
+  |  col|
+  +-+
+  |Hello, World!|
+  +-+
+
+SELECT "SPARK SQL" AS col;
+  +-+
+  |  col|
+  +-+
+  |Spark SQL|
+  +-+
+
+SELECT 'it\'s $10.' AS col;
+  +-+
+  |  col|
+  +-+
+  |It's $10.|
+  +-+
+{% endhighlight %}
+
+### Null Literal
+
+A null literal is used to specify a null value.
+
+ Syntax
+
+{% highlight sql %}
+NULL
+{% endhighlight %}
+
+ Examples
+
+{% highlight sql %}
+SELECT NULL AS col;
+  ++
+  | col|
+  ++
+  |NULL|
+  ++
+{% endhighlight %}
+
+### Boolean Literal
+
+A boolean literal is used to specify a boolean value.
+
+ Syntax
+
+{% highlight sql %}
+TRUE | FALSE
+{% endhighlight %}
+
+ Examples
+
+{% highlight sql %}
+SELECT TRUE AS col;
+  ++
+  | col|
+  ++
+  |true|
+  ++
+{% endhighlight %}
+
+### Numeric Literal
+
+A numeric literal is used to specify a fixed or floating-point number.
+
+ Integer Literal
+
+ Syntax
+
+{% highlight sql %}
+[ + | - ] digit [ ... ] [ L | S | Y ]
+{% endhighlight %}
+
+ Parameters
+
+
+  digit
+  
+Any numeral from 0 to 9.
+  
+
+
+  L
+  
+Case insensitive, indicates BIGINT, which is an 8-byte signed integer number.
+  
+
+
+  S
+  
+Case insensitive, indicates SMALLINT, which is a 2-byte 
signed integer number.
+  
+
+
+  Y
+  
+Case insensitive, indicates TINYINT, which is a 1-byte signed 
integer number.
+  
+
+
+  default (no postfix)
+  
+Indicates a 4-byte signed integer number.
+  
+
+
+ Examples
+
+{% highlight sql %}
+SELECT -2147483648 AS col;
+  +---+
+  |col|
+  +---+
+  |-2147483648|
+  +---+
+
+SELECT 9223372036854775807l AS col;
+  +---+
+  |col|
+  +---+
+  |9223372036854775807|
+  +---+
+
+SELECT -32Y AS col;
+  +---+
+  |col|
+  +---+
+  |-32|
+  +---+
+
+SELECT 482S AS col;
+  +---+
+  |col|
+  +---+
+  |482|
+  +---+
+{% endhighlight %}
+
+ Decimal Literal
+
+ Syntax
+
+{% highlight sql %}
+[ + | - ] { digit [ ... ] . [ digit [ ... ] ] | . digit [ ... ] }

Review comment:
   nit:
   ```
   [ + | - ] { { digit [ ... ] . [digit [ ... ] ] } | { . digit [ ... ] } }
   ```





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] [spark] cloud-fan commented on a change in pull request #28237: [SPARK-31465][SQL][DOCS] Document Literal in SQL Reference

2020-04-19 Thread GitBox


cloud-fan commented on a change in pull request #28237:
URL: https://github.com/apache/spark/pull/28237#discussion_r411095121



##
File path: docs/sql-ref-literals.md
##
@@ -0,0 +1,506 @@
+---
+layout: global
+title: Literals
+displayTitle: Literals
+license: |
+  Licensed to the Apache Software Foundation (ASF) under one or more
+  contributor license agreements.  See the NOTICE file distributed with
+  this work for additional information regarding copyright ownership.
+  The ASF licenses this file to You under the Apache License, Version 2.0
+  (the "License"); you may not use this file except in compliance with
+  the License.  You may obtain a copy of the License at
+ 
+ http://www.apache.org/licenses/LICENSE-2.0
+ 
+  Unless required by applicable law or agreed to in writing, software
+  distributed under the License is distributed on an "AS IS" BASIS,
+  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+  See the License for the specific language governing permissions and
+  limitations under the License.
+---
+
+A literal (also known as a constant) represents a fixed data value. Spark SQL 
supports the following literals:
+
+ * [String Literal](#string-literal)
+ * [Null Literal](#null-literal)
+ * [Boolean Literal](#boolean-literal)
+ * [Numeric Literal](#numeric-literal)
+ * [Datetime Literal](#datetime-literal)
+ * [Interval Literal](#interval-literal)
+
+### String Literal
+
+A string literal is used to specify a character string value.
+
+ Syntax
+
+{% highlight sql %}
+'c [ ... ]' | "c [ ... ]"
+{% endhighlight %}
+
+ Parameters
+
+
+  c
+  
+One character from the character set. Use \ to escape special 
characters (e.g., ' or \).
+  
+
+
+ Examples
+
+{% highlight sql %}
+SELECT 'Hello, World!' AS col;
+  +-+
+  |  col|
+  +-+
+  |Hello, World!|
+  +-+
+
+SELECT "SPARK SQL" AS col;
+  +-+
+  |  col|
+  +-+
+  |Spark SQL|
+  +-+
+
+SELECT 'it\'s $10.' AS col;
+  +-+
+  |  col|
+  +-+
+  |It's $10.|
+  +-+
+{% endhighlight %}
+
+### Null Literal
+
+A null literal is used to specify a null value.
+
+ Syntax
+
+{% highlight sql %}
+NULL
+{% endhighlight %}
+
+ Examples
+
+{% highlight sql %}
+SELECT NULL AS col;
+  ++
+  | col|
+  ++
+  |NULL|
+  ++
+{% endhighlight %}
+
+### Boolean Literal
+
+A boolean literal is used to specify a boolean value.
+
+ Syntax
+
+{% highlight sql %}
+TRUE | FALSE
+{% endhighlight %}
+
+ Examples
+
+{% highlight sql %}
+SELECT TRUE AS col;
+  ++
+  | col|
+  ++
+  |true|
+  ++
+{% endhighlight %}
+
+### Numeric Literal
+
+A numeric literal is used to specify a fixed or floating-point number.
+
+ Integer Literal
+
+ Syntax
+
+{% highlight sql %}
+[ + | - ] digit [ ... ] [ L | S | Y ]
+{% endhighlight %}
+
+ Parameters
+
+
+  digit
+  
+Any numeral from 0 to 9.
+  
+
+
+  L
+  
+Case insensitive, indicates BIGINT, which is an 8-byte signed integer number.
+  
+
+
+  S
+  
+Case insensitive, indicates SMALLINT, which is a 2-byte 
signed integer number.
+  
+
+
+  Y
+  
+Case insensitive, indicates TINYINT, which is a 1-byte signed 
integer number.
+  
+
+
+  default (no postfix)
+  
+Indicates a 4-byte signed integer number.
+  
+
+
+ Examples
+
+{% highlight sql %}
+SELECT -2147483648 AS col;
+  +---+
+  |col|
+  +---+
+  |-2147483648|
+  +---+
+
+SELECT 9223372036854775807l AS col;
+  +---+
+  |col|
+  +---+
+  |9223372036854775807|
+  +---+
+
+SELECT -32Y AS col;
+  +---+
+  |col|
+  +---+
+  |-32|
+  +---+
+
+SELECT 482S AS col;
+  +---+
+  |col|
+  +---+
+  |482|
+  +---+
+{% endhighlight %}
+
+ Decimal Literal
+
+ Syntax
+
+{% highlight sql %}
+[ + | - ] { digit [ ... ] . [ digit [ ... ] ] | . digit [ ... ] }

Review comment:
   nit:
   ```
   [ + | - ] { { digit [ ... ] [. digit [ ... ] ] } | { . digit [ ... ] } }
   ```





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] [spark] huaxingao commented on issue #28237: [SPARK-31465][SQL][DOCS] Document Literal in SQL Reference

2020-04-19 Thread GitBox


huaxingao commented on issue #28237:
URL: https://github.com/apache/spark/pull/28237#issuecomment-616309537


   @cloud-fan 
   I addressed all the comments. Could you please check one more time? Thanks!



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] [spark] AmplabJenkins removed a comment on issue #28269: [SPARK-31493][SQL] Optimize InSet to In according partition size at InSubqueryExec

2020-04-19 Thread GitBox


AmplabJenkins removed a comment on issue #28269:
URL: https://github.com/apache/spark/pull/28269#issuecomment-616306537







This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] [spark] AmplabJenkins commented on issue #28269: [SPARK-31493][SQL] Optimize InSet to In according partition size at InSubqueryExec

2020-04-19 Thread GitBox


AmplabJenkins commented on issue #28269:
URL: https://github.com/apache/spark/pull/28269#issuecomment-616306537







This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] [spark] SparkQA commented on issue #28269: [SPARK-31493][SQL] Optimize InSet to In according partition size at InSubqueryExec

2020-04-19 Thread GitBox


SparkQA commented on issue #28269:
URL: https://github.com/apache/spark/pull/28269#issuecomment-616306215


   **[Test build #121502 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/121502/testReport)**
 for PR 28269 at commit 
[`dfb8504`](https://github.com/apache/spark/commit/dfb8504b00e736d5bb230850cafd749acf83b130).



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] [spark] cloud-fan commented on a change in pull request #28250: [SPARK-31475][SQL] Broadcast stage in AQE did not timeout

2020-04-19 Thread GitBox


cloud-fan commented on a change in pull request #28250:
URL: https://github.com/apache/spark/pull/28250#discussion_r411089336



##
File path: 
sql/core/src/test/scala/org/apache/spark/sql/execution/joins/BroadcastJoinSuite.scala
##
@@ -398,4 +399,22 @@ class BroadcastJoinSuite extends QueryTest with 
SQLTestUtils with AdaptiveSparkP
 }
 }
   }
+
+  test("Broadcast timeout") {
+val timeout = 30
+val slowUDF = udf({ x: Int => Thread.sleep(timeout * 10 * 1000); x })
+val df1 = spark.range(10).select($"id" as 'a)
+val df2 = spark.range(5).select(slowUDF($"id") as 'a)
+val testDf = df1.join(broadcast(df2), "a")
+withSQLConf(SQLConf.BROADCAST_TIMEOUT.key -> timeout.toString) {
+  val e = intercept[Exception] {
+testDf.collect()
+  }
+  AdaptiveTestUtils.assertExceptionMessage(e, s"Could not execute 
broadcast in $timeout secs.")

Review comment:
   so this test runs 30 seconds? Can we make it a bit shorter?
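   
   One way to shorten it (a sketch of the quoted test with only the constant changed; the sleep just has to outlast the timeout):
   ```scala
   val timeout = 5  // seconds; the UDF still sleeps far longer than this
   val slowUDF = udf({ x: Int => Thread.sleep(timeout * 10 * 1000); x })
   val df1 = spark.range(10).select($"id" as 'a)
   val df2 = spark.range(5).select(slowUDF($"id") as 'a)
   val testDf = df1.join(broadcast(df2), "a")
   withSQLConf(SQLConf.BROADCAST_TIMEOUT.key -> timeout.toString) {
     val e = intercept[Exception] { testDf.collect() }
     AdaptiveTestUtils.assertExceptionMessage(e, s"Could not execute broadcast in $timeout secs.")
   }
   ```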





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] [spark] AmplabJenkins removed a comment on issue #28265: [SPARK-31234][SQL][FOLLOW-UP] ResetCommand should not affect static SQL Configuration

2020-04-19 Thread GitBox


AmplabJenkins removed a comment on issue #28265:
URL: https://github.com/apache/spark/pull/28265#issuecomment-616304410







This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] [spark] AmplabJenkins commented on issue #28265: [SPARK-31234][SQL][FOLLOW-UP] ResetCommand should not affect static SQL Configuration

2020-04-19 Thread GitBox


AmplabJenkins commented on issue #28265:
URL: https://github.com/apache/spark/pull/28265#issuecomment-616304410







This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] [spark] AmplabJenkins removed a comment on issue #28270: [SPARK-31494][ML] flatten the result dataframe of ANOVATest

2020-04-19 Thread GitBox


AmplabJenkins removed a comment on issue #28270:
URL: https://github.com/apache/spark/pull/28270#issuecomment-616302400







This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] [spark] SparkQA commented on issue #28270: [SPARK-31494][ML] flatten the result dataframe of ANOVATest

2020-04-19 Thread GitBox


SparkQA commented on issue #28270:
URL: https://github.com/apache/spark/pull/28270#issuecomment-616304123


   **[Test build #121500 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/121500/testReport)**
 for PR 28270 at commit 
[`ec132a7`](https://github.com/apache/spark/commit/ec132a778b98b9315cbc5417c1dd209ffb418bd6).



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] [spark] SparkQA commented on issue #28265: [SPARK-31234][SQL][FOLLOW-UP] ResetCommand should not affect static SQL Configuration

2020-04-19 Thread GitBox


SparkQA commented on issue #28265:
URL: https://github.com/apache/spark/pull/28265#issuecomment-616304143


   **[Test build #121501 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/121501/testReport)**
 for PR 28265 at commit 
[`a63ad80`](https://github.com/apache/spark/commit/a63ad80b75777e0d6aaf40f26825612b389518d8).



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] [spark] AmplabJenkins removed a comment on issue #28237: [SPARK-31465][SQL][DOCS] Document Literal in SQL Reference

2020-04-19 Thread GitBox


AmplabJenkins removed a comment on issue #28237:
URL: https://github.com/apache/spark/pull/28237#issuecomment-616304127







This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] [spark] AmplabJenkins commented on issue #28237: [SPARK-31465][SQL][DOCS] Document Literal in SQL Reference

2020-04-19 Thread GitBox


AmplabJenkins commented on issue #28237:
URL: https://github.com/apache/spark/pull/28237#issuecomment-616304127







This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] [spark] cloud-fan commented on issue #28265: [SPARK-31234][SQL][FOLLOW-UP] ResetCommand should not affect static SQL Configuration

2020-04-19 Thread GitBox


cloud-fan commented on issue #28265:
URL: https://github.com/apache/spark/pull/28265#issuecomment-616304084


   LGTM



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] [spark] SparkQA removed a comment on issue #28237: [SPARK-31465][SQL][DOCS] Document Literal in SQL Reference

2020-04-19 Thread GitBox


SparkQA removed a comment on issue #28237:
URL: https://github.com/apache/spark/pull/28237#issuecomment-616300097


   **[Test build #121499 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/121499/testReport)**
 for PR 28237 at commit 
[`35cb286`](https://github.com/apache/spark/commit/35cb2862b338ecc5d24deef2e0f7ad216cae1534).



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] [spark] SparkQA commented on issue #28237: [SPARK-31465][SQL][DOCS] Document Literal in SQL Reference

2020-04-19 Thread GitBox


SparkQA commented on issue #28237:
URL: https://github.com/apache/spark/pull/28237#issuecomment-616303982


   **[Test build #121499 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/121499/testReport)**
 for PR 28237 at commit 
[`35cb286`](https://github.com/apache/spark/commit/35cb2862b338ecc5d24deef2e0f7ad216cae1534).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] [spark] cloud-fan commented on a change in pull request #28251: [SPARK-31476][SQL] Add an ExpressionInfo entry for EXTRACT

2020-04-19 Thread GitBox


cloud-fan commented on a change in pull request #28251:
URL: https://github.com/apache/spark/pull/28251#discussion_r411087296



##
File path: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/FunctionRegistry.scala
##
@@ -423,6 +423,7 @@ object FunctionRegistry {
 expression[MakeTimestamp]("make_timestamp"),
 expression[MakeInterval]("make_interval"),
 expression[DatePart]("date_part"),
+expression[Extract]("extract"),

Review comment:
   One side effect is that we now support `extract(field, source)` in addition to 
`extract(field from source)`. Not a big deal, but it would be better if we could 
avoid exposing more APIs.
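   
   For illustration, the two call shapes side by side (a sketch, assuming the two-argument form takes the field as a string literal, mirroring `date_part`):
   ```sql
   SELECT extract(YEAR FROM DATE '2019-08-12');   -- SQL-standard syntax
   SELECT extract('YEAR', DATE '2019-08-12');     -- function-call form exposed by this registration
   ```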





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] [spark] gatorsmile commented on a change in pull request #28265: [SPARK-31234][SQL][FOLLOW-UP] ResetCommand should not affect static SQL Configuration

2020-04-19 Thread GitBox


gatorsmile commented on a change in pull request #28265:
URL: https://github.com/apache/spark/pull/28265#discussion_r411086764



##
File path: 
sql/core/src/test/scala/org/apache/spark/sql/SparkSessionBuilderSuite.scala
##
@@ -163,9 +163,9 @@ class SparkSessionBuilderSuite extends SparkFunSuite with 
BeforeAndAfterEach {
   .getOrCreate()
 
 assert(session.sessionState.conf.getConfString("spark.app.name") === 
"test-app-SPARK-31234")
-assert(session.sessionState.conf.getConf(GLOBAL_TEMP_DATABASE) === 
"globalTempDB-SPARK-31234")
+assert(session.sessionState.conf.getConf(GLOBAL_TEMP_DATABASE) === 
"globaltempdb-spark-31234")

Review comment:
   This difference between Spark 2.4 and Spark 3.0 is caused by 
https://github.com/apache/spark/pull/24979/





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] [spark] gatorsmile commented on issue #28265: [SPARK-31234][SQL][FOLLOW-UP] ResetCommand should not affect static SQL Configuration

2020-04-19 Thread GitBox


gatorsmile commented on issue #28265:
URL: https://github.com/apache/spark/pull/28265#issuecomment-616303005


   cc @cloud-fan 



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] [spark] gatorsmile commented on a change in pull request #24979: [SPARK-28179][SQL] Avoid hard-coded config: spark.sql.globalTempDatabase

2020-04-19 Thread GitBox


gatorsmile commented on a change in pull request #24979:
URL: https://github.com/apache/spark/pull/24979#discussion_r411086592



##
File path: 
sql/core/src/main/scala/org/apache/spark/sql/internal/SharedState.scala
##
@@ -158,7 +158,7 @@ private[sql] class SharedState(
 // System preserved database should not exist in metastore. However it's 
hard to guarantee it
 // for every session, because case-sensitivity differs. Here we always 
lowercase it to make our
 // life easier.

Review comment:
   https://github.com/apache/spark/pull/28265 fixed it. 





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] [spark] gatorsmile commented on a change in pull request #24979: [SPARK-28179][SQL] Avoid hard-coded config: spark.sql.globalTempDatabase

2020-04-19 Thread GitBox


gatorsmile commented on a change in pull request #24979:
URL: https://github.com/apache/spark/pull/24979#discussion_r411086416



##
File path: 
sql/core/src/main/scala/org/apache/spark/sql/internal/SharedState.scala
##
@@ -158,7 +158,7 @@ private[sql] class SharedState(
 // System preserved database should not exist in metastore. However it's 
hard to guarantee it
 // for every session, because case-sensitivity differs. Here we always 
lowercase it to make our
 // life easier.

Review comment:
   This description should be moved to StaticSQLConf





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] [spark] SparkQA removed a comment on issue #26339: [SPARK-27194][SPARK-29302][SQL] Fix the issue that for dynamic partition overwrite a task would conflict with its speculative task

2020-04-19 Thread GitBox


SparkQA removed a comment on issue #26339:
URL: https://github.com/apache/spark/pull/26339#issuecomment-616267769


   **[Test build #121493 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/121493/testReport)**
 for PR 26339 at commit 
[`a4012d8`](https://github.com/apache/spark/commit/a4012d87d1e18c7a1c922d4198ff7aa7756cac81).



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] [spark] AmplabJenkins commented on issue #28270: [SPARK-31494][ML] flatten the result dataframe of ANOVATest

2020-04-19 Thread GitBox


AmplabJenkins commented on issue #28270:
URL: https://github.com/apache/spark/pull/28270#issuecomment-616302400







This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] [spark] AmplabJenkins removed a comment on issue #26339: [SPARK-27194][SPARK-29302][SQL] Fix the issue that for dynamic partition overwrite a task would conflict with its speculative task

2020-04-19 Thread GitBox


AmplabJenkins removed a comment on issue #26339:
URL: https://github.com/apache/spark/pull/26339#issuecomment-616302130


   Merged build finished. Test FAILed.



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] [spark] AmplabJenkins removed a comment on issue #26339: [SPARK-27194][SPARK-29302][SQL] Fix the issue that for dynamic partition overwrite a task would conflict with its speculative task

2020-04-19 Thread GitBox


AmplabJenkins removed a comment on issue #26339:
URL: https://github.com/apache/spark/pull/26339#issuecomment-616302135


   Test FAILed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/121493/
   Test FAILed.



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] [spark] cloud-fan commented on a change in pull request #28248: [SPARK-31474][SQL] Consistency between dayofweek/dow in extract exprsession and dayofweek function

2020-04-19 Thread GitBox


cloud-fan commented on a change in pull request #28248:
URL: https://github.com/apache/spark/pull/28248#discussion_r411086116



##
File path: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/datetimeExpressions.scala
##
@@ -2215,7 +2219,11 @@ case class DatePart(field: Expression, source: 
Expression, child: Expression)
   > SELECT _FUNC_(seconds FROM interval 5 hours 30 seconds 1 milliseconds 
1 microseconds);
30.001001
   """,
+  note = """
+The _FUNC_ function is equivalent to `date_part`.

Review comment:
   BTW is `EXTRACT` more widely used? If yes then we should put the 
document in `Extract`.





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] [spark] AmplabJenkins commented on issue #26339: [SPARK-27194][SPARK-29302][SQL] Fix the issue that for dynamic partition overwrite a task would conflict with its speculative task

2020-04-19 Thread GitBox


AmplabJenkins commented on issue #26339:
URL: https://github.com/apache/spark/pull/26339#issuecomment-616302130







This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] [spark] zhengruifeng opened a new pull request #28270: [SPARK-31494][ML] flatten the result dataframe of ANOVATest

2020-04-19 Thread GitBox


zhengruifeng opened a new pull request #28270:
URL: https://github.com/apache/spark/pull/28270


   ### What changes were proposed in this pull request?
   Add a new method `def test(dataset: DataFrame, featuresCol: String, 
labelCol: String, flatten: Boolean): DataFrame`.
   
   
   ### Why are the changes needed?
   Similar to the new `test` method in `ChiSquareTest`, it will:
   1. support DataFrame operations on the returned DataFrame;
   2. keep the driver from becoming a bottleneck with a large numFeatures.
   
   ### Does this PR introduce any user-facing change?
   Yes, new method added
   
   
   ### How was this patch tested?
   existing testsuites
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] [spark] cloud-fan commented on a change in pull request #28248: [SPARK-31474][SQL] Consistency between dayofweek/dow in extract expression and dayofweek function

2020-04-19 Thread GitBox


cloud-fan commented on a change in pull request #28248:
URL: https://github.com/apache/spark/pull/28248#discussion_r411085917



##
File path: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/datetimeExpressions.scala
##
@@ -2215,7 +2219,11 @@ case class DatePart(field: Expression, source: 
Expression, child: Expression)
   > SELECT _FUNC_(seconds FROM interval 5 hours 30 seconds 1 milliseconds 
1 microseconds);
30.001001
   """,
+  note = """
+The _FUNC_ function is equivalent to `date_part`.

Review comment:
   `date_part(field, source)`





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] [spark] cloud-fan commented on a change in pull request #28248: [SPARK-31474][SQL] Consistency between dayofweek/dow in extract expression and dayofweek function

2020-04-19 Thread GitBox


cloud-fan commented on a change in pull request #28248:
URL: https://github.com/apache/spark/pull/28248#discussion_r411085824



##
File path: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/datetimeExpressions.scala
##
@@ -2179,7 +2178,11 @@ object DatePartLike {
   > SELECT _FUNC_('seconds', interval 5 hours 30 seconds 1 milliseconds 1 
microseconds);
30.001001
   """,
+  note = """
+The _FUNC_ function is equivalent to the SQL-standard function `extract`

Review comment:
   `EXTRACT(field FROM source)`





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] [spark] SparkQA commented on issue #26339: [SPARK-27194][SPARK-29302][SQL] Fix the issue that for dynamic partition overwrite a task would conflict with its speculative task

2020-04-19 Thread GitBox


SparkQA commented on issue #26339:
URL: https://github.com/apache/spark/pull/26339#issuecomment-616301796


   **[Test build #121493 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/121493/testReport)**
 for PR 26339 at commit 
[`a4012d8`](https://github.com/apache/spark/commit/a4012d87d1e18c7a1c922d4198ff7aa7756cac81).
* This patch **fails Spark unit tests**.
* This patch merges cleanly.
* This patch adds the following public classes _(experimental)_:
 * `class RenameFailedForFirstTaskFirstAttemptFileSystem extends 
RawLocalFileSystem `
 * `class PartitionedSpeculateRenameFailedWriteSuite extends QueryTest with 
SharedSparkSession `



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] [spark] cloud-fan commented on a change in pull request #28248: [SPARK-31474][SQL] Consistency between dayofweek/dow in extract expression and dayofweek function

2020-04-19 Thread GitBox


cloud-fan commented on a change in pull request #28248:
URL: https://github.com/apache/spark/pull/28248#discussion_r411085373



##
File path: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/datetimeExpressions.scala
##
@@ -2130,38 +2129,38 @@ object DatePartLike {
   }
 }
 
+// scalastyle:off line.size.limit
 @ExpressionDescription(
   usage = "_FUNC_(field, source) - Extracts a part of the date/timestamp or 
interval source.",
   arguments = """
 Arguments:
   * field - selects which part of the source should be extracted.
-   Supported string values of `field` for dates and timestamps are:
-["MILLENNIUM", ("MILLENNIA", "MIL", "MILS"),
- "CENTURY", ("CENTURIES", "C", "CENT"),
- "DECADE", ("DECADES", "DEC", "DECS"),
- "YEAR", ("Y", "YEARS", "YR", "YRS"),
- "ISOYEAR",
- "QUARTER", ("QTR"),
- "MONTH", ("MON", "MONS", "MONTHS"),
- "WEEK", ("W", "WEEKS"),
- "DAY", ("D", "DAYS"),
- "DAYOFWEEK",
- "DOW",
- "ISODOW",
- "DOY",
- "HOUR", ("H", "HOURS", "HR", "HRS"),
- "MINUTE", ("M", "MIN", "MINS", "MINUTES"),
- "SECOND", ("S", "SEC", "SECONDS", "SECS"),
- "MILLISECONDS", ("MSEC", "MSECS", "MILLISECON", "MSECONDS", 
"MS"),
- "MICROSECONDS", ("USEC", "USECS", "USECONDS", "MICROSECON", 
"US"),
- "EPOCH"]
-Supported string values of `field` for intervals are:
- ["YEAR", ("Y", "YEARS", "YR", "YRS"),
-  "MONTH", ("MON", "MONS", "MONTHS"),
-  "DAY", ("D", "DAYS"),
-  "HOUR", ("H", "HOURS", "HR", "HRS"),
-  "MINUTE", ("M", "MIN", "MINS", "MINUTES"),
-  "SECOND", ("S", "SEC", "SECONDS", "SECS")]
+  - Supported string values of `field` for dates and timestamps are:
+  - "MILLENNIUM", ("MILLENNIA", "MIL", "MILS") - the conventional 
numbering of millennia
+  - "CENTURY", ("CENTURIES", "C", "CENT") - the conventional 
numbering of centuries
+  - "DECADE", ("DECADES", "DEC", "DECS") - the year field divided 
by 1
+  - "YEAR", ("Y", "YEARS", "YR", "YRS") - the year field
+  - "ISOYEAR" - the ISO 8601 week-numbering year that the datetime 
falls in
+  - "QUARTER", ("QTR") - the quarter (1 - 4) of the year that the 
datetime falls in
+  - "MONTH", ("MON", "MONS", "MONTHS") - the month field
+  - "WEEK", ("W", "WEEKS") - the number of the ISO 8601 
week-of-week-based-year. A week is considered to start on a Monday and week 1 
is the first week with >3 days. In the ISO week-numbering system, it is 
possible for early-January dates to be part of the 52nd or 53rd week of the 
previous year, and for late-December dates to be part of the first week of the 
next year. For example, 2005-01-02 is part of the 53rd week of year 2004, while 
2012-12-31 is part of the first week of 2013
+  - "DAY", ("D", "DAYS") - the day of the month field (1 - 31)
+  - "DAYOFWEEK",("DOW") - the day of the week for datetime as 
Sunday(1) to Saturday(7)
+  - "ISODOW" - ISO 8601 based day of the week for datetime as 
Monday(1) to Sunday(7)
+  - "DOY" - the day of the year (1 - 365/366)
+  - "HOUR", ("H", "HOURS", "HR", "HRS") - The hour field (0 - 23)
+  - "MINUTE", ("M", "MIN", "MINS", "MINUTES") - the minutes field 
(0 - 59)
+  - "SECOND", ("S", "SEC", "SECONDS", "SECS") - the seconds field, 
including fractional parts
+  - "MILLISECONDS", ("MSEC", "MSECS", "MILLISECON", "MSECONDS", 
"MS") - the seconds field, including fractional parts, multiplied by 1000. Note 
that this includes full seconds
+  - "MICROSECONDS", ("USEC", "USECS", "USECONDS", "MICROSECON", 
"US") - The seconds field, including fractional parts, multiplied by 100. 
Note that this includes full seconds
+  - "EPOCH" - the number of seconds with fractional part in 
microsecond precision since 1970-01-01 00:00:00 local time (can be negative)
+  - Supported string values of `field` for interval (which consists of `months`, `days`, `microseconds`) are:
+  - "YEAR", ("Y", "YEARS", "YR", "YRS") - the total `months` / 12
+  - "MONTH", ("MON", "MONS", "MONTHS") - the total `months` modulo 
12

Review comment:
   ```
   the total `months` % 12
   ```
   to be consistent with
   ```
   the total `months` / 12
   ```
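   
   A quick illustration of the two documented formulas (assuming Spark 3.0 interval syntax):
   
   ```sql
   SELECT extract(YEAR FROM INTERVAL 14 MONTHS);   -- 1, i.e. the total `months` / 12
   SELECT extract(MONTH FROM INTERVAL 14 MONTHS);  -- 2, i.e. the total `months` % 12
   ```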





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

[GitHub] [spark] cloud-fan commented on a change in pull request #28248: [SPARK-31474][SQL] Consistency between dayofweek/dow in extract expression and dayofweek function

2020-04-19 Thread GitBox


cloud-fan commented on a change in pull request #28248:
URL: https://github.com/apache/spark/pull/28248#discussion_r411085051



##
File path: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/datetimeExpressions.scala
##
@@ -2130,38 +2129,38 @@ object DatePartLike {
   }
 }
 
+// scalastyle:off line.size.limit
 @ExpressionDescription(
   usage = "_FUNC_(field, source) - Extracts a part of the date/timestamp or 
interval source.",
   arguments = """
 Arguments:
   * field - selects which part of the source should be extracted.
-   Supported string values of `field` for dates and timestamps are:
-["MILLENNIUM", ("MILLENNIA", "MIL", "MILS"),
- "CENTURY", ("CENTURIES", "C", "CENT"),
- "DECADE", ("DECADES", "DEC", "DECS"),
- "YEAR", ("Y", "YEARS", "YR", "YRS"),
- "ISOYEAR",
- "QUARTER", ("QTR"),
- "MONTH", ("MON", "MONS", "MONTHS"),
- "WEEK", ("W", "WEEKS"),
- "DAY", ("D", "DAYS"),
- "DAYOFWEEK",
- "DOW",
- "ISODOW",
- "DOY",
- "HOUR", ("H", "HOURS", "HR", "HRS"),
- "MINUTE", ("M", "MIN", "MINS", "MINUTES"),
- "SECOND", ("S", "SEC", "SECONDS", "SECS"),
- "MILLISECONDS", ("MSEC", "MSECS", "MILLISECON", "MSECONDS", 
"MS"),
- "MICROSECONDS", ("USEC", "USECS", "USECONDS", "MICROSECON", 
"US"),
- "EPOCH"]
-Supported string values of `field` for intervals are:
- ["YEAR", ("Y", "YEARS", "YR", "YRS"),
-  "MONTH", ("MON", "MONS", "MONTHS"),
-  "DAY", ("D", "DAYS"),
-  "HOUR", ("H", "HOURS", "HR", "HRS"),
-  "MINUTE", ("M", "MIN", "MINS", "MINUTES"),
-  "SECOND", ("S", "SEC", "SECONDS", "SECS")]
+  - Supported string values of `field` for dates and timestamps are:
+  - "MILLENNIUM", ("MILLENNIA", "MIL", "MILS") - the conventional 
numbering of millennia
+  - "CENTURY", ("CENTURIES", "C", "CENT") - the conventional 
numbering of centuries
+  - "DECADE", ("DECADES", "DEC", "DECS") - the year field divided 
by 1
+  - "YEAR", ("Y", "YEARS", "YR", "YRS") - the year field
+  - "ISOYEAR" - the ISO 8601 week-numbering year that the datetime 
falls in
+  - "QUARTER", ("QTR") - the quarter (1 - 4) of the year that the 
datetime falls in
+  - "MONTH", ("MON", "MONS", "MONTHS") - the month field

Review comment:
   `the month field (1 - 12)`





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] [spark] AmplabJenkins removed a comment on issue #28237: [SPARK-31465][SQL][DOCS] Document Literal in SQL Reference

2020-04-19 Thread GitBox


AmplabJenkins removed a comment on issue #28237:
URL: https://github.com/apache/spark/pull/28237#issuecomment-616300361







This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] [spark] cloud-fan commented on a change in pull request #28248: [SPARK-31474][SQL] Consistency between dayofweek/dow in extract expression and dayofweek function

2020-04-19 Thread GitBox


cloud-fan commented on a change in pull request #28248:
URL: https://github.com/apache/spark/pull/28248#discussion_r411084360



##
File path: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/datetimeExpressions.scala
##
@@ -2089,8 +2089,7 @@ object DatePart {
 case "MONTH" | "MON" | "MONS" | "MONTHS" => Month(source)
 case "WEEK" | "W" | "WEEKS" => WeekOfYear(source)
 case "DAY" | "D" | "DAYS" => DayOfMonth(source)
-case "DAYOFWEEK" => DayOfWeek(source)
-case "DOW" => Subtract(DayOfWeek(source), Literal(1))
+case "DAYOFWEEK" | "DOW" => DayOfWeek(source)

Review comment:
   I said that the `DOW` behavior looks more reasonable, but unfortunately, 
we already have `DAYOFWEEK` in Spark 2.4 and we can't change that. It's more 
important to keep internal consistency.
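   
   To illustrate the consistency argument (the example assumes the behavior after this patch):
   
   ```sql
   -- 2009-07-30 is a Thursday:
   SELECT dayofweek(DATE '2009-07-30');               -- 5 (Sunday = 1, ..., Saturday = 7)
   SELECT extract(DAYOFWEEK FROM DATE '2009-07-30');  -- 5
   SELECT extract(DOW FROM DATE '2009-07-30');        -- 5 with this change (previously 4)
   ```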





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] [spark] AmplabJenkins commented on issue #28237: [SPARK-31465][SQL][DOCS] Document Literal in SQL Reference

2020-04-19 Thread GitBox


AmplabJenkins commented on issue #28237:
URL: https://github.com/apache/spark/pull/28237#issuecomment-616300361







This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] [spark] SparkQA commented on issue #28237: [SPARK-31465][SQL][DOCS] Document Literal in SQL Reference

2020-04-19 Thread GitBox


SparkQA commented on issue #28237:
URL: https://github.com/apache/spark/pull/28237#issuecomment-616300097


   **[Test build #121499 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/121499/testReport)**
 for PR 28237 at commit 
[`35cb286`](https://github.com/apache/spark/commit/35cb2862b338ecc5d24deef2e0f7ad216cae1534).



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] [spark] huaxingao commented on a change in pull request #28237: [SPARK-31465][SQL][DOCS] Document Literal in SQL Reference

2020-04-19 Thread GitBox


huaxingao commented on a change in pull request #28237:
URL: https://github.com/apache/spark/pull/28237#discussion_r411082047



##
File path: docs/sql-ref-literals.md
##
@@ -0,0 +1,505 @@
+---
+layout: global
+title: Literals
+displayTitle: Literals
+license: |
+  Licensed to the Apache Software Foundation (ASF) under one or more
+  contributor license agreements.  See the NOTICE file distributed with
+  this work for additional information regarding copyright ownership.
+  The ASF licenses this file to You under the Apache License, Version 2.0
+  (the "License"); you may not use this file except in compliance with
+  the License.  You may obtain a copy of the License at
+ 
+ http://www.apache.org/licenses/LICENSE-2.0
+ 
+  Unless required by applicable law or agreed to in writing, software
+  distributed under the License is distributed on an "AS IS" BASIS,
+  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+  See the License for the specific language governing permissions and
+  limitations under the License.
+---
+
+A literal (also known as a constant) represents a fixed data value. Spark SQL 
supports the following literals:
+
+ * [String Literal](#string-literal)
+ * [Null Literal](#null-literal)
+ * [Boolean Literal](#boolean-literal)
+ * [Numeric Literal](#numeric-literal)
+ * [Datetime Literal](#datetime-literal)
+ * [Interval Literal](#interval-literal)
+
+### String Literal
+
+A string literal is used to specify a character string value.
+
+#### Syntax
+
+{% highlight sql %}
+'c [ ... ]' | "c [ ... ]"
+{% endhighlight %}
+
+#### Parameters
+
+
+  c
+  
+One character from the character set. Use \ to escape special 
characters.
+  
+
+
+#### Examples
+
+{% highlight sql %}
+SELECT 'Hello, World!' AS col;
+  +-------------+
+  |          col|
+  +-------------+
+  |Hello, World!|
+  +-------------+
+
+SELECT "SPARK SQL" AS col;
+  +---------+
+  |      col|
+  +---------+
+  |SPARK SQL|
+  +---------+
+
+SELECT 'it\'s $10.' AS col;
+  +---------+
+  |      col|
+  +---------+
+  |it's $10.|
+  +---------+
+{% endhighlight %}
+
+### Null Literal
+
+A null literal is used to specify a null value.
+
+#### Syntax
+
+{% highlight sql %}
+NULL
+{% endhighlight %}
+
+#### Examples
+
+{% highlight sql %}
+SELECT NULL AS col;
+  +----+
+  | col|
+  +----+
+  |NULL|
+  +----+
+{% endhighlight %}
+
+### Boolean Literal
+
+A boolean literal is used to specify a boolean value.
+
+#### Syntax
+
+{% highlight sql %}
+TRUE | FALSE
+{% endhighlight %}
+
+#### Examples
+
+{% highlight sql %}
+SELECT TRUE AS col;
+  +----+
+  | col|
+  +----+
+  |true|
+  +----+
+{% endhighlight %}
+
+### Numeric Literal
+
+A numeric literal is used to specify a fixed or floating-point number.
+
+#### Integer Literal
+
+#### Syntax

Review comment:
   I prefer not to add. The font seems too small if we add one more `#`:
   screenshot: https://user-images.githubusercontent.com/13592258/79713482-c030bb80-8282-11ea-992f-eadad975f4c2.png
   vs
   screenshot: https://user-images.githubusercontent.com/13592258/79713488-c3c44280-8282-11ea-8af2-17efcb897b06.png
   





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] [spark] huaxingao commented on a change in pull request #28237: [SPARK-31465][SQL][DOCS] Document Literal in SQL Reference

2020-04-19 Thread GitBox


huaxingao commented on a change in pull request #28237:
URL: https://github.com/apache/spark/pull/28237#discussion_r411081987



##
File path: docs/sql-ref-literals.md
##
@@ -0,0 +1,505 @@
+---
+layout: global
+title: Literals
+displayTitle: Literals
+license: |
+  Licensed to the Apache Software Foundation (ASF) under one or more
+  contributor license agreements.  See the NOTICE file distributed with
+  this work for additional information regarding copyright ownership.
+  The ASF licenses this file to You under the Apache License, Version 2.0
+  (the "License"); you may not use this file except in compliance with
+  the License.  You may obtain a copy of the License at
+ 
+ http://www.apache.org/licenses/LICENSE-2.0
+ 
+  Unless required by applicable law or agreed to in writing, software
+  distributed under the License is distributed on an "AS IS" BASIS,
+  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+  See the License for the specific language governing permissions and
+  limitations under the License.
+---
+
+A literal (also known as a constant) represents a fixed data value. Spark SQL 
supports the following literals:
+
+ * [String Literal](#string-literal)
+ * [Null Literal](#null-literal)
+ * [Boolean Literal](#boolean-literal)
+ * [Numeric Literal](#numeric-literal)
+ * [Datetime Literal](#datetime-literal)
+ * [Interval Literal](#interval-literal)
+
+### String Literal
+
+A string literal is used to specify a character string value.
+
+#### Syntax
+
+{% highlight sql %}
+'c [ ... ]' | "c [ ... ]"
+{% endhighlight %}
+
+#### Parameters
+
+
+  c
+  
+One character from the character set. Use \ to escape special 
characters.
+  
+
+
+#### Examples
+
+{% highlight sql %}
+SELECT 'Hello, World!' AS col;
+  +-------------+
+  |          col|
+  +-------------+
+  |Hello, World!|
+  +-------------+
+
+SELECT "SPARK SQL" AS col;
+  +---------+
+  |      col|
+  +---------+
+  |SPARK SQL|
+  +---------+
+
+SELECT 'it\'s $10.' AS col;
+  +---------+
+  |      col|
+  +---------+
+  |it's $10.|
+  +---------+
+{% endhighlight %}
+
+### Null Literal
+
+A null literal is used to specify a null value.
+
+#### Syntax
+
+{% highlight sql %}
+NULL
+{% endhighlight %}
+
+#### Examples
+
+{% highlight sql %}
+SELECT NULL AS col;
+  +----+
+  | col|
+  +----+
+  |NULL|
+  +----+
+{% endhighlight %}
+
+### Boolean Literal
+
+A boolean literal is used to specify a boolean value.
+
+#### Syntax
+
+{% highlight sql %}
+TRUE | FALSE
+{% endhighlight %}
+
+#### Examples
+
+{% highlight sql %}
+SELECT TRUE AS col;
+  +----+
+  | col|
+  +----+
+  |true|
+  +----+
+{% endhighlight %}
+
+### Numeric Literal
+
+A numeric literal is used to specify a fixed or floating-point number.
+
+#### Integer Literal
+
+#### Syntax
+
+{% highlight sql %}
+[ + | - ] digit [ ... ] [ L | S | Y ]
+{% endhighlight %}
+
+#### Parameters
+
+
+  digit
+  
+Any numeral from 0 to 9.
+  
+
+
+  L
+  
+Case insensitive, indicates BIGINT, which is an 8-byte signed integer number.
+  
+
+
+  S
+  
+Case insensitive, indicates SMALLINT, which is a 2-byte 
signed integer number.
+  
+
+
+  Y
+  
+Case insensitive, indicates TINYINT, which is a 1-byte signed 
integer number.
+  
+
+
+  default (no postfix)
+  
+Indicates a 4-byte signed integer number.
+  
+
+#### Examples
+
+{% highlight sql %}
+SELECT -2147483648 AS col;
+  +-----------+
+  |        col|
+  +-----------+
+  |-2147483648|
+  +-----------+
+
+SELECT 9223372036854775807l AS col;
+  +-------------------+
+  |                col|
+  +-------------------+
+  |9223372036854775807|
+  +-------------------+
+
+SELECT -32Y AS col;
+  +---+
+  |col|
+  +---+
+  |-32|
+  +---+
+
+SELECT 482S AS col;
+  +---+
+  |col|
+  +---+
+  |482|
+  +---+
+{% endhighlight %}
+
+#### Decimal Literal
+
+#### Syntax
+
+{% highlight sql %}
+[ + | - ] { digit [ ... ] . [ digit [ ... ] ] | . digit [ ... ] }
+{% endhighlight %}
+
+#### Parameters
+
+
+  digit
+  
+Any numeral from 0 to 9.
+  
+
+
+#### Examples
+
+{% highlight sql %}
+SELECT 12.578 AS col;
+  +------+
+  |   col|
+  +------+
+  |12.578|
+  +------+
+
+SELECT -0.1234567 AS col;
+  +----------+
+  |       col|
+  +----------+
+  |-0.1234567|
+  +----------+
+
+SELECT -.1234567 AS col;
+  +----------+
+  |       col|
+  +----------+
+  |-0.1234567|
+  +----------+
+{% endhighlight %}
+
+#### Floating Point and BigDecimal Literals

Review comment:
   
https://github.com/apache/spark/blob/master/sql/catalyst/src/main/antlr4/org/apache/spark/sql/catalyst/parser/SqlBase.g4#L992-L1002





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [spark] AmplabJenkins removed a comment on issue #28269: [SPARK-31493][SQL] Optimize InSet to In according partition size at InSubqueryExec

2020-04-19 Thread GitBox


AmplabJenkins removed a comment on issue #28269:
URL: https://github.com/apache/spark/pull/28269#issuecomment-616296943







This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] [spark] AmplabJenkins removed a comment on issue #28148: [SPARK-31381][SPARK-29245][SQL] Upgrade built-in Hive 2.3.6 to 2.3.7

2020-04-19 Thread GitBox


AmplabJenkins removed a comment on issue #28148:
URL: https://github.com/apache/spark/pull/28148#issuecomment-616296974







This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] [spark] AmplabJenkins commented on issue #28148: [SPARK-31381][SPARK-29245][SQL] Upgrade built-in Hive 2.3.6 to 2.3.7

2020-04-19 Thread GitBox


AmplabJenkins commented on issue #28148:
URL: https://github.com/apache/spark/pull/28148#issuecomment-616296974







This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] [spark] AmplabJenkins commented on issue #28269: [SPARK-31493][SQL] Optimize InSet to In according partition size at InSubqueryExec

2020-04-19 Thread GitBox


AmplabJenkins commented on issue #28269:
URL: https://github.com/apache/spark/pull/28269#issuecomment-616296943







This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] [spark] SparkQA commented on issue #28269: [SPARK-31493][SQL] Optimize InSet to In according partition size at InSubqueryExec

2020-04-19 Thread GitBox


SparkQA commented on issue #28269:
URL: https://github.com/apache/spark/pull/28269#issuecomment-616296734


   **[Test build #121497 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/121497/testReport)**
 for PR 28269 at commit 
[`a135dbc`](https://github.com/apache/spark/commit/a135dbc12bdf98008db04069289c98db3d9627e6).



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] [spark] SparkQA commented on issue #28148: [SPARK-31381][SPARK-29245][SQL] Upgrade built-in Hive 2.3.6 to 2.3.7

2020-04-19 Thread GitBox


SparkQA commented on issue #28148:
URL: https://github.com/apache/spark/pull/28148#issuecomment-616296735


   **[Test build #121498 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/121498/testReport)**
 for PR 28148 at commit 
[`de13433`](https://github.com/apache/spark/commit/de134334e712ba822e6a8243a4ba3f50ec5b2ba3).



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] [spark] wangyum commented on issue #28148: [SPARK-31381][SPARK-29245][SQL] Upgrade built-in Hive 2.3.6 to 2.3.7

2020-04-19 Thread GitBox


wangyum commented on issue #28148:
URL: https://github.com/apache/spark/pull/28148#issuecomment-616295772


   retest this please.



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] [spark] ulysses-you commented on a change in pull request #28269: [SPARK-31493][SQL] Optimize InSet to In according partition size at InSubqueryExec

2020-04-19 Thread GitBox


ulysses-you commented on a change in pull request #28269:
URL: https://github.com/apache/spark/pull/28269#discussion_r411078284



##
File path: 
sql/core/src/main/scala/org/apache/spark/sql/execution/dynamicpruning/PartitionPruning.scala
##
@@ -109,7 +109,7 @@ object PartitionPruning extends Rule[LogicalPlan] with 
PredicateHelper {
* the size in bytes of the partitioned plan after filtering is greater than 
the size
* in bytes of the plan on the other side of the join. We estimate the 
filtering ratio
* using column statistics if they are available, otherwise we use the 
config value of
-   * `spark.sql.optimizer.joinFilterRatio`.
+   * `SQLConf.DYNAMIC_PARTITION_PRUNING_FALLBACK_FILTER_RATIO`.

Review comment:
   Update the config name.





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] [spark] ulysses-you opened a new pull request #28269: [SPARK-31493][SQL] Optimize InSet to In according partition size at InSubqueryExec

2020-04-19 Thread GitBox


ulysses-you opened a new pull request #28269:
URL: https://github.com/apache/spark/pull/28269


   
   
   ### What changes were proposed in this pull request?
   
   To respect `OptimizeIn`: use `In` or `InSet` according to the partition size.
   
   ### Why are the changes needed?
   
   Better performance.
   
   ### Does this PR introduce any user-facing change?
   
   No.
   
   ### How was this patch tested?
   
   Add UT.
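   
   For context, `OptimizeIn` already makes this choice for literal IN-lists based on a threshold; a quick sketch of the existing knob (illustration only, not part of this patch):
   
   ```sql
   -- IN-lists longer than this threshold are rewritten to InSet by OptimizeIn:
   SET spark.sql.optimizer.inSetConversionThreshold=10;
   ```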



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] [spark] AmplabJenkins commented on issue #28148: [WIP][SPARK-31381][SPARK-29245][SQL][test-hadoop3.2][test-java11] Upgrade built-in Hive 2.3.6 to 2.3.7

2020-04-19 Thread GitBox


AmplabJenkins commented on issue #28148:
URL: https://github.com/apache/spark/pull/28148#issuecomment-616294803







This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] [spark] AmplabJenkins removed a comment on issue #28148: [WIP][SPARK-31381][SPARK-29245][SQL][test-hadoop3.2][test-java11] Upgrade built-in Hive 2.3.6 to 2.3.7

2020-04-19 Thread GitBox


AmplabJenkins removed a comment on issue #28148:
URL: https://github.com/apache/spark/pull/28148#issuecomment-616294803







This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] [spark] SparkQA removed a comment on issue #28148: [WIP][SPARK-31381][SPARK-29245][SQL][test-hadoop3.2][test-java11] Upgrade built-in Hive 2.3.6 to 2.3.7

2020-04-19 Thread GitBox


SparkQA removed a comment on issue #28148:
URL: https://github.com/apache/spark/pull/28148#issuecomment-616255932


   **[Test build #121491 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/121491/testReport)**
 for PR 28148 at commit 
[`de13433`](https://github.com/apache/spark/commit/de134334e712ba822e6a8243a4ba3f50ec5b2ba3).



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] [spark] Ngone51 commented on a change in pull request #28226: [SPARK-31452][SQL] Do not create partition spec for 0-size partitions in AQE

2020-04-19 Thread GitBox


Ngone51 commented on a change in pull request #28226:
URL: https://github.com/apache/spark/pull/28226#discussion_r411076696



##
File path: 
sql/core/src/main/scala/org/apache/spark/sql/execution/adaptive/OptimizeSkewedJoin.scala
##
@@ -88,9 +88,11 @@ case class OptimizeSkewedJoin(conf: SQLConf) extends 
Rule[SparkPlan] {
   private def targetSize(sizes: Seq[Long], medianSize: Long): Long = {
 val advisorySize = conf.getConf(SQLConf.ADVISORY_PARTITION_SIZE_IN_BYTES)
 val nonSkewSizes = sizes.filterNot(isSkewed(_, medianSize))
-// It's impossible that all the partitions are skewed, as we use median 
size to define skew.
-assert(nonSkewSizes.nonEmpty)
-math.max(advisorySize, nonSkewSizes.sum / nonSkewSizes.length)
+if (nonSkewSizes.isEmpty) {

Review comment:
   Why is it possible to be empty now?





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] [spark] SparkQA commented on issue #28148: [WIP][SPARK-31381][SPARK-29245][SQL][test-hadoop3.2][test-java11] Upgrade built-in Hive 2.3.6 to 2.3.7

2020-04-19 Thread GitBox


SparkQA commented on issue #28148:
URL: https://github.com/apache/spark/pull/28148#issuecomment-616294353


   **[Test build #121491 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/121491/testReport)**
 for PR 28148 at commit 
[`de13433`](https://github.com/apache/spark/commit/de134334e712ba822e6a8243a4ba3f50ec5b2ba3).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] [spark] AmplabJenkins removed a comment on issue #28268: [SPARK-31492][ML] flatten the result dataframe of FValueTest

2020-04-19 Thread GitBox


AmplabJenkins removed a comment on issue #28268:
URL: https://github.com/apache/spark/pull/28268#issuecomment-616294101







This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] [spark] AmplabJenkins commented on issue #28268: [SPARK-31492][ML] flatten the result dataframe of FValueTest

2020-04-19 Thread GitBox


AmplabJenkins commented on issue #28268:
URL: https://github.com/apache/spark/pull/28268#issuecomment-616294101







This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] [spark] SparkQA commented on issue #28268: [SPARK-31492][ML] flatten the result dataframe of FValueTest

2020-04-19 Thread GitBox


SparkQA commented on issue #28268:
URL: https://github.com/apache/spark/pull/28268#issuecomment-616293948


   **[Test build #121496 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/121496/testReport)**
 for PR 28268 at commit 
[`3caf7d1`](https://github.com/apache/spark/commit/3caf7d12408f3cd6d8245c53974ea84703c5f767).



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] [spark] zhengruifeng opened a new pull request #28268: [SPARK-31492][ML] flatten the result dataframe of FValueTest

2020-04-19 Thread GitBox


zhengruifeng opened a new pull request #28268:
URL: https://github.com/apache/spark/pull/28268


   ### What changes were proposed in this pull request?
   add a new method  `def test(dataset: DataFrame, featuresCol: String, 
labelCol: String, flatten: Boolean): DataFrame`
   
   ### Why are the changes needed?
   
   
   Similar to the new `test` method in `ChiSquareTest`, it will:
   1, support df operation on the returned df;
   2, make driver no longer a bottleneck with large `numFeatures`
   
   
   ### Does this PR introduce any user-facing change?
   Yes, add a new method
   
   
   ### How was this patch tested?
   existing testsuites
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] [spark] yaooqinn edited a comment on issue #28197: [SPARK-31431][SQL] Add CalendarInterval encoder support

2020-04-19 Thread GitBox


yaooqinn edited a comment on issue #28197:
URL: https://github.com/apache/spark/pull/28197#issuecomment-616291295


   Hi @viirya, thanks for the details. Take your commit as an example:
   https://github.com/apache/spark/commit/48e44b24a7663142176102ac4c6bf4242f103804
   In `Seq(Set(interval)).toDF()`, do intervals work as domain objects already?



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] [spark] yaooqinn commented on issue #28197: [SPARK-31431][SQL] Add CalendarInterval encoder support

2020-04-19 Thread GitBox


yaooqinn commented on issue #28197:
URL: https://github.com/apache/spark/pull/28197#issuecomment-616291295


   Hi, take your commit as an example:
   https://github.com/apache/spark/commit/48e44b24a7663142176102ac4c6bf4242f103804
   In `Seq(Set(interval)).toDF()`, do intervals work as domain objects already?



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] [spark] uncleGen commented on issue #27694: [SPARK-30946][SS] Serde entry with UnsafeRow on FileStream(Source/Sink)Log with LZ4 compression

2020-04-19 Thread GitBox


uncleGen commented on issue #27694:
URL: https://github.com/apache/spark/pull/27694#issuecomment-616289728


   Suppose there is a streaming job pipeline, and these streaming jobs come from different
   end-users or departments. If a mid-pipeline end-user upgrades their Spark and uses
   `CompactibleFileStreamLog` V2 (as you said, by default we can read from version 1 and
   write to version 2 for smooth migration), then the downstream jobs will fail. Is there
   something I misunderstand? So, is it better to keep the version of
   `CompactibleFileStreamLog`, or to leave a config to upgrade the version of
   `CompactibleFileStreamLog` with default `false`?



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] [spark] Ngone51 commented on a change in pull request #28254: [SPARK-31478][CORE]Call `StopExecutor` before killing executors

2020-04-19 Thread GitBox


Ngone51 commented on a change in pull request #28254:
URL: https://github.com/apache/spark/pull/28254#discussion_r411069475



##
File path: 
core/src/main/scala/org/apache/spark/scheduler/cluster/CoarseGrainedSchedulerBackend.scala
##
@@ -769,6 +769,8 @@ class CoarseGrainedSchedulerBackend(scheduler: 
TaskSchedulerImpl, val rpcEnv: Rp
 
   val killExecutors: Boolean => Future[Boolean] =
 if (executorsToKill.nonEmpty) {
+  executorsToKill.foreach(id =>
+
executorDataMap.get(id).foreach(_.executorEndpoint.send(StopExecutor)))

Review comment:
   Can we guarantee that `stop` is called before `kill` in this way?





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] [spark] zhengruifeng commented on issue #28202: [SPARK-31433][ML] Summarizer supports string arguments

2020-04-19 Thread GitBox


zhengruifeng commented on issue #28202:
URL: https://github.com/apache/spark/pull/28202#issuecomment-616288772


   I think it is not worth that much, so I will close it.



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] [spark] gatorsmile commented on a change in pull request #27728: [SPARK-25556][SPARK-17636][SPARK-31026][SPARK-31060][SQL][test-hive1.2] Nested Column Predicate Pushdown for Parquet

2020-04-19 Thread GitBox


gatorsmile commented on a change in pull request #27728:
URL: https://github.com/apache/spark/pull/27728#discussion_r411068064



##
File path: 
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/DataSourceStrategy.scala
##
@@ -652,10 +652,19 @@ object DataSourceStrategy {
  */
 object PushableColumn {
   def unapply(e: Expression): Option[String] = {
-def helper(e: Expression) = e match {
-  case a: Attribute => Some(a.name)
+val nestedPredicatePushdownEnabled = 
SQLConf.get.nestedPredicatePushdownEnabled
+import 
org.apache.spark.sql.connector.catalog.CatalogV2Implicits.MultipartIdentifierHelper
+def helper(e: Expression): Option[Seq[String]] = e match {
+  case a: Attribute =>
+if (nestedPredicatePushdownEnabled || !a.name.contains(".")) {
+  Some(Seq(a.name))
+} else {
+  None
+}
+  case s: GetStructField if nestedPredicatePushdownEnabled =>
+helper(s.child).map(_ :+ s.childSchema(s.ordinal).name)
   case _ => None
 }
-helper(e)
+helper(e).map(_.quoted)

Review comment:
   @viirya Are you interested in this follow up?





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] [spark] AmplabJenkins commented on issue #28260: [SPARK-31487][CORE] Move slots check of barrier job from DAGScheduler to TaskSchedulerImpl

2020-04-19 Thread GitBox


AmplabJenkins commented on issue #28260:
URL: https://github.com/apache/spark/pull/28260#issuecomment-616286093







This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org


