[GitHub] [spark] maropu commented on issue #28306: [SPARK-31465][SQL][DOCS][FOLLOW-UP] Document Literal in SQL Reference

2020-04-22 Thread GitBox


maropu commented on issue #28306:
URL: https://github.com/apache/spark/pull/28306#issuecomment-618194775


   Oops, I see. LGTM.



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] [spark] AmplabJenkins commented on issue #28306: [SPARK-31465][SQL][DOCS][FOLLOW-UP] Document Literal in SQL Reference

2020-04-22 Thread GitBox


AmplabJenkins commented on issue #28306:
URL: https://github.com/apache/spark/pull/28306#issuecomment-618194577










[GitHub] [spark] AmplabJenkins removed a comment on issue #28306: [SPARK-31465][SQL][DOCS][FOLLOW-UP] Document Literal in SQL Reference

2020-04-22 Thread GitBox


AmplabJenkins removed a comment on issue #28306:
URL: https://github.com/apache/spark/pull/28306#issuecomment-618194577










[GitHub] [spark] SparkQA commented on issue #28306: [SPARK-31465][SQL][DOCS][FOLLOW-UP] Document Literal in SQL Reference

2020-04-22 Thread GitBox


SparkQA commented on issue #28306:
URL: https://github.com/apache/spark/pull/28306#issuecomment-618194437


   **[Test build #121651 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/121651/testReport)** for PR 28306 at commit [`9ab90f2`](https://github.com/apache/spark/commit/9ab90f2a5cb305684746db2ec9ec98b1e8b9921e).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.






[GitHub] [spark] SparkQA removed a comment on issue #28306: [SPARK-31465][SQL][DOCS][FOLLOW-UP] Document Literal in SQL Reference

2020-04-22 Thread GitBox


SparkQA removed a comment on issue #28306:
URL: https://github.com/apache/spark/pull/28306#issuecomment-618190388


   **[Test build #121651 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/121651/testReport)** for PR 28306 at commit [`9ab90f2`](https://github.com/apache/spark/commit/9ab90f2a5cb305684746db2ec9ec98b1e8b9921e).






[GitHub] [spark] maropu commented on a change in pull request #28304: [SPARK-31523][SQL] LogicalPlan doCanonicalize should throw exception if not resolved

2020-04-22 Thread GitBox


maropu commented on a change in pull request #28304:
URL: https://github.com/apache/spark/pull/28304#discussion_r413525735



##########
File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/LogicalPlan.scala
##########
@@ -40,6 +40,13 @@ abstract class LogicalPlan
     super.verboseString(maxFields) + statsCache.map(", " + _.toString).getOrElse("")
   }
 
+  override protected def doCanonicalize(): LogicalPlan = {
+if (!resolved) {

Review comment:
   Do you think users use canonicalization? I think it is for internal use only, though.
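The fail-fast guard being reviewed above can be illustrated outside of Catalyst. Below is a minimal Python sketch (hypothetical class and method names, not Spark's actual `LogicalPlan` API) of rejecting canonicalization on an unresolved plan instead of silently producing a bogus canonical form:

```python
# Minimal sketch of the guard under review: canonicalization is an internal
# operation that only makes sense after analysis, so requesting it on an
# unresolved plan raises immediately. Names here are illustrative.
class LogicalPlan:
    def __init__(self, name, resolved):
        self.name = name
        self.resolved = resolved
        self._canonicalized = None

    def canonicalized(self):
        # Lazily computed and cached, mirroring Catalyst's `lazy val`.
        if self._canonicalized is None:
            if not self.resolved:
                raise RuntimeError(f"Cannot canonicalize unresolved plan: {self.name}")
            self._canonicalized = self._do_canonicalize()
        return self._canonicalized

    def _do_canonicalize(self):
        # Real implementations would normalize expression IDs, aliases, etc.
        return self

plan = LogicalPlan("scan", resolved=True)
print(plan.canonicalized().name)
```

Making the failure explicit here turns a subtle downstream comparison bug into an immediate, debuggable error.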








[GitHub] [spark] huaxingao commented on issue #28306: [SPARK-31465][SQL][DOCS][FOLLOW-UP] Document Literal in SQL Reference

2020-04-22 Thread GitBox


huaxingao commented on issue #28306:
URL: https://github.com/apache/spark/pull/28306#issuecomment-618191338


   @cloud-fan @maropu 
   Sorry, you guys are too fast for me :)






[GitHub] [spark] AmplabJenkins removed a comment on issue #28306: [SPARK-31465][SQL][DOCS][FOLLOW-UP] Document Literal in SQL Reference

2020-04-22 Thread GitBox


AmplabJenkins removed a comment on issue #28306:
URL: https://github.com/apache/spark/pull/28306#issuecomment-618190834










[GitHub] [spark] AmplabJenkins commented on issue #28302: [SPARK-31522][SQL] Hive metastore client initialization related configurations should be static

2020-04-22 Thread GitBox


AmplabJenkins commented on issue #28302:
URL: https://github.com/apache/spark/pull/28302#issuecomment-618190779










[GitHub] [spark] AmplabJenkins commented on issue #28306: [SPARK-31465][SQL][DOCS][FOLLOW-UP] Document Literal in SQL Reference

2020-04-22 Thread GitBox


AmplabJenkins commented on issue #28306:
URL: https://github.com/apache/spark/pull/28306#issuecomment-618190834










[GitHub] [spark] AmplabJenkins removed a comment on issue #28302: [SPARK-31522][SQL] Hive metastore client initialization related configurations should be static

2020-04-22 Thread GitBox


AmplabJenkins removed a comment on issue #28302:
URL: https://github.com/apache/spark/pull/28302#issuecomment-618190779










[GitHub] [spark] SparkQA commented on issue #28302: [SPARK-31522][SQL] Hive metastore client initialization related configurations should be static

2020-04-22 Thread GitBox


SparkQA commented on issue #28302:
URL: https://github.com/apache/spark/pull/28302#issuecomment-618190414


   **[Test build #121652 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/121652/testReport)** for PR 28302 at commit [`c72ba70`](https://github.com/apache/spark/commit/c72ba701ed5685e89b90fb001dfaf32a4b6a9e4a).






[GitHub] [spark] SparkQA commented on issue #28306: [SPARK-31465][SQL][DOCS][FOLLOW-UP] Document Literal in SQL Reference

2020-04-22 Thread GitBox


SparkQA commented on issue #28306:
URL: https://github.com/apache/spark/pull/28306#issuecomment-618190388


   **[Test build #121651 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/121651/testReport)** for PR 28306 at commit [`9ab90f2`](https://github.com/apache/spark/commit/9ab90f2a5cb305684746db2ec9ec98b1e8b9921e).






[GitHub] [spark] huaxingao opened a new pull request #28306: [SPARK-31465][SQL][DOCS][FOLLOW-UP] Document Literal in SQL Reference

2020-04-22 Thread GitBox


huaxingao opened a new pull request #28306:
URL: https://github.com/apache/spark/pull/28306


   
   
   ### What changes were proposed in this pull request?
   Need to address a few more comments.
   
   
   ### Why are the changes needed?
   Fix a few problems.
   
   
   ### Does this PR introduce any user-facing change?
   Yes
   
   
   ### How was this patch tested?
   Manually built and checked.
   






[GitHub] [spark] AmplabJenkins removed a comment on issue #28302: [SPARK-31522][SQL] Hive metastore client initialization related configurations should be static

2020-04-22 Thread GitBox


AmplabJenkins removed a comment on issue #28302:
URL: https://github.com/apache/spark/pull/28302#issuecomment-618188366










[GitHub] [spark] maropu commented on a change in pull request #28294: [SPARK-31519][SQL] Cast in having aggregate expressions returns the wrong result

2020-04-22 Thread GitBox


maropu commented on a change in pull request #28294:
URL: https://github.com/apache/spark/pull/28294#discussion_r413521707



##########
File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala
##########
@@ -238,13 +238,13 @@ class Analyzer(
       ResolveNaturalAndUsingJoin ::
       ResolveOutputRelation ::
       ExtractWindowExpressions ::
+      ResolveTimeZone(conf) ::

Review comment:
   ok, merged in https://github.com/apache/spark/commit/ca90e1932dcdc43748297c627ec857b6ea97dff7
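One simplified way to see why the position of `ResolveTimeZone` in the rule list above matters is a toy single-pass rule batch. This is plain Python with illustrative names only; Spark's real analyzer batches (and their fixed-point semantics) are more involved:

```python
# Toy model of a rule batch that applies each rule once, in declaration
# order: a rule that depends on another rule's output must come after it.
def run_once_batch(plan, rules):
    for rule in rules:
        plan = rule(plan)
    return plan

def resolve_time_zone(plan):
    # Fill in the session time zone if the cast doesn't carry one yet.
    if plan.get("tz") is None:
        return {**plan, "tz": "UTC"}
    return plan

def fold_cast(plan):
    # This rule can only act once the time zone is known.
    if plan.get("tz") is not None and not plan["folded"]:
        return {**plan, "folded": True}
    return plan

# ResolveTimeZone first: the cast can be folded within the same pass.
early = run_once_batch({"tz": None, "folded": False},
                       [resolve_time_zone, fold_cast])
# Reversed order: folding misses its chance in this pass.
late = run_once_batch({"tz": None, "folded": False},
                      [fold_cast, resolve_time_zone])
print(early, late)
```

In the toy run, only the first ordering ends with the cast folded, which is the kind of ordering sensitivity the diff addresses.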








[GitHub] [spark] AmplabJenkins commented on issue #28302: [SPARK-31522][SQL] Hive metastore client initialization related configurations should be static

2020-04-22 Thread GitBox


AmplabJenkins commented on issue #28302:
URL: https://github.com/apache/spark/pull/28302#issuecomment-618188366










[GitHub] [spark] huaxingao edited a comment on issue #28237: [SPARK-31465][SQL][DOCS] Document Literal in SQL Reference

2020-04-22 Thread GitBox


huaxingao edited a comment on issue #28237:
URL: https://github.com/apache/spark/pull/28237#issuecomment-618185381


   Thank you all for the help!
   Actually I need to address a couple more comments. Sorry I was not fast enough. I will have a follow-up in a few minutes.






[GitHub] [spark] maropu commented on issue #28288: [SPARK-31515][SQL] Canonicalize Cast should consider the value of needTimeZone

2020-04-22 Thread GitBox


maropu commented on issue #28288:
URL: https://github.com/apache/spark/pull/28288#issuecomment-618187901


   Thanks, all! Merged to master/3.0






[GitHub] [spark] SparkQA commented on issue #28302: [SPARK-31522][SQL] Hive metastore client initialization related configurations should be static

2020-04-22 Thread GitBox


SparkQA commented on issue #28302:
URL: https://github.com/apache/spark/pull/28302#issuecomment-618188030


   **[Test build #121650 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/121650/testReport)** for PR 28302 at commit [`09b87ff`](https://github.com/apache/spark/commit/09b87ff5cfdfc2844f6b94063e1863be71ff5a78).






[GitHub] [spark] AmplabJenkins removed a comment on issue #28288: [SPARK-31515][SQL] Canonicalize Cast should consider the value of needTimeZone

2020-04-22 Thread GitBox


AmplabJenkins removed a comment on issue #28288:
URL: https://github.com/apache/spark/pull/28288#issuecomment-618184980










[GitHub] [spark] huaxingao commented on issue #28237: [SPARK-31465][SQL][DOCS] Document Literal in SQL Reference

2020-04-22 Thread GitBox


huaxingao commented on issue #28237:
URL: https://github.com/apache/spark/pull/28237#issuecomment-618185381


   Thank you all for the help!
   Actually I need to address a couple more comments. Sorry I was not fast enough. I will have a follow-up in a few minutes.






[GitHub] [spark] AmplabJenkins commented on issue #28288: [SPARK-31515][SQL] Canonicalize Cast should consider the value of needTimeZone

2020-04-22 Thread GitBox


AmplabJenkins commented on issue #28288:
URL: https://github.com/apache/spark/pull/28288#issuecomment-618184980










[GitHub] [spark] SparkQA removed a comment on issue #28288: [SPARK-31515][SQL] Canonicalize Cast should consider the value of needTimeZone

2020-04-22 Thread GitBox


SparkQA removed a comment on issue #28288:
URL: https://github.com/apache/spark/pull/28288#issuecomment-618115365


   **[Test build #121641 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/121641/testReport)** for PR 28288 at commit [`73f4694`](https://github.com/apache/spark/commit/73f4694e5cee1aa2f256a90b03d1fb09ee5a295d).






[GitHub] [spark] SparkQA commented on issue #28288: [SPARK-31515][SQL] Canonicalize Cast should consider the value of needTimeZone

2020-04-22 Thread GitBox


SparkQA commented on issue #28288:
URL: https://github.com/apache/spark/pull/28288#issuecomment-618184359


   **[Test build #121641 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/121641/testReport)** for PR 28288 at commit [`73f4694`](https://github.com/apache/spark/commit/73f4694e5cee1aa2f256a90b03d1fb09ee5a295d).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.






[GitHub] [spark] maropu commented on issue #28237: [SPARK-31465][SQL][DOCS] Document Literal in SQL Reference

2020-04-22 Thread GitBox


maropu commented on issue #28237:
URL: https://github.com/apache/spark/pull/28237#issuecomment-618182481


   Thanks, all! Merged to master/3.0.






[GitHub] [spark] xuanyuanking commented on a change in pull request #28294: [SPARK-31519][SQL] Cast in having aggregate expressions returns the wrong result

2020-04-22 Thread GitBox


xuanyuanking commented on a change in pull request #28294:
URL: https://github.com/apache/spark/pull/28294#discussion_r413511664



##########
File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/basicLogicalOperators.scala
##########
@@ -583,6 +583,16 @@ case class Aggregate(
   }
 }
 
+case class AggregateWithHaving(

Review comment:
   ```
   move it into unresolved.scala?
   ```
   Yeah, makes sense, will change it to unresolved.scala.
   
   ```
   Could we rename this into UnresolvedHaving
   ```
   Since `GROUP BY` does not always come with `Aggregate` (it can also be `GroupingSets`), we only handle the `Aggregate` part with `AggregateWithHaving`. So maybe let's keep it as `AggregateWithHaving`? WDYT :)
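The placeholder-node pattern under discussion can be sketched as follows. This is plain Python with illustrative names and string-typed conditions; the real nodes live in Catalyst and carry full expression trees:

```python
# Sketch of the pattern: the parser emits an unresolved placeholder
# (AggregateWithHaving), and an analyzer rule later rewrites it into an
# ordinary Filter over the Aggregate, so the HAVING condition is resolved
# against the aggregate's output rather than the raw input columns.
from dataclasses import dataclass

@dataclass
class Aggregate:
    group_by: tuple
    aggregates: tuple

@dataclass
class AggregateWithHaving:  # unresolved placeholder produced by the parser
    condition: str
    child: Aggregate

@dataclass
class Filter:
    condition: str
    child: Aggregate

def resolve_having(plan):
    # Analyzer rule: once the child Aggregate is in place, turn the
    # placeholder into a plain Filter on top of it.
    if isinstance(plan, AggregateWithHaving):
        return Filter(plan.condition, plan.child)
    return plan

agg = Aggregate(group_by=("dept",), aggregates=("sum(salary)",))
print(resolve_having(AggregateWithHaving("sum(salary) > 100", agg)))
```

Deferring the rewrite to a dedicated rule is what lets the same placeholder serve both plain aggregates and grouping-set variants.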








[GitHub] [spark] cloud-fan commented on issue #28276: [SPARK-31476][SQL][FOLLOWUP] Add tests for extract('field', source)

2020-04-22 Thread GitBox


cloud-fan commented on issue #28276:
URL: https://github.com/apache/spark/pull/28276#issuecomment-618178541


   thanks, merging to master/3.0!






[GitHub] [spark] cloud-fan commented on a change in pull request #28237: [SPARK-31465][SQL][DOCS] Document Literal in SQL Reference

2020-04-22 Thread GitBox


cloud-fan commented on a change in pull request #28237:
URL: https://github.com/apache/spark/pull/28237#discussion_r413508809



##########
File path: docs/sql-ref-literals.md
##########
@@ -0,0 +1,532 @@
+---
+layout: global
+title: Literals
+displayTitle: Literals
+license: |
+  Licensed to the Apache Software Foundation (ASF) under one or more
+  contributor license agreements.  See the NOTICE file distributed with
+  this work for additional information regarding copyright ownership.
+  The ASF licenses this file to You under the Apache License, Version 2.0
+  (the "License"); you may not use this file except in compliance with
+  the License.  You may obtain a copy of the License at
+ 
+ http://www.apache.org/licenses/LICENSE-2.0
+ 
+  Unless required by applicable law or agreed to in writing, software
+  distributed under the License is distributed on an "AS IS" BASIS,
+  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+  See the License for the specific language governing permissions and
+  limitations under the License.
+---
+
+A literal (also known as a constant) represents a fixed data value. Spark SQL supports the following literals:
+
+ * [String Literal](#string-literal)
+ * [Binary Literal](#binary-literal)
+ * [Null Literal](#null-literal)
+ * [Boolean Literal](#boolean-literal)
+ * [Numeric Literal](#numeric-literal)
+ * [Datetime Literal](#datetime-literal)
+ * [Interval Literal](#interval-literal)
+
+### String Literal
+
+A string literal is used to specify a character string value.
+
+#### Syntax
+
+{% highlight sql %}
+'c [ ... ]' | "c [ ... ]"
+{% endhighlight %}
+
+#### Parameters
+
+<dl>
+  <dt><code><em>c</em></code></dt>
+  <dd>
+    One character from the character set. Use <code>\</code> to escape special characters (e.g., <code>'</code> or <code>\</code>).
+  </dd>
+</dl>
+
+#### Examples
+
+{% highlight sql %}
+SELECT 'Hello, World!' AS col;
++-------------+
+|          col|
++-------------+
+|Hello, World!|
++-------------+
+
+SELECT "SPARK SQL" AS col;
++---------+
+|      col|
++---------+
+|Spark SQL|
++---------+
+
+SELECT 'it\'s $10.' AS col;
++---------+
+|      col|
++---------+
+|It's $10.|
++---------+
+{% endhighlight %}
+
+### Binary Literal
+
+A binary literal is used to specify a byte sequence value.
+
+#### Syntax
+
+{% highlight sql %}
+X { 'c [ ... ]' | "c [ ... ]" }
+{% endhighlight %}
+
+#### Parameters
+
+<dl>
+  <dt><code><em>c</em></code></dt>
+  <dd>
+    One character from the character set.
+  </dd>
+</dl>
+
+#### Examples
+
+{% highlight sql %}
+SELECT X'123456' AS col;
++----------+
+|       col|
++----------+
+|[12 34 56]|
++----------+
+{% endhighlight %}
+
+### Null Literal
+
+A null literal is used to specify a null value.
+
+#### Syntax
+
+{% highlight sql %}
+NULL
+{% endhighlight %}
+
+#### Examples
+
+{% highlight sql %}
+SELECT NULL AS col;
++----+
+| col|
++----+
+|NULL|
++----+
+{% endhighlight %}
+
+### Boolean Literal
+
+A boolean literal is used to specify a boolean value.
+
+#### Syntax
+
+{% highlight sql %}
+TRUE | FALSE
+{% endhighlight %}
+
+#### Examples
+
+{% highlight sql %}
+SELECT TRUE AS col;
++----+
+| col|
++----+
+|true|
++----+
+{% endhighlight %}
+
+### Numeric Literal
+
+A numeric literal is used to specify a fixed or floating-point number.
+
+#### Integral Literal
+
+##### Syntax
+
+{% highlight sql %}
+[ + | - ] digit [ ... ] [ L | S | Y ]
+{% endhighlight %}
+
+##### Parameters
+
+<dl>
+  <dt><code><em>digit</em></code></dt>
+  <dd>
+    Any numeral from 0 to 9.
+  </dd>
+</dl>
+
+<dl>
+  <dt><code><em>L</em></code></dt>
+  <dd>
+    Case insensitive, indicates <code>BIGINT</code>, which is an 8-byte signed integer number.
+  </dd>
+</dl>
+
+<dl>
+  <dt><code><em>S</em></code></dt>
+  <dd>
+    Case insensitive, indicates <code>SMALLINT</code>, which is a 2-byte signed integer number.
+  </dd>
+</dl>
+
+<dl>
+  <dt><code><em>Y</em></code></dt>
+  <dd>
+    Case insensitive, indicates <code>TINYINT</code>, which is a 1-byte signed integer number.
+  </dd>
+</dl>
+
+<dl>
+  <dt><code><em>default (no postfix)</em></code></dt>
+  <dd>
+    Indicates a 4-byte signed integer number.
+  </dd>
+</dl>
+
+##### Examples
+
+{% highlight sql %}
+SELECT -2147483648 AS col;
++-----------+
+|        col|
++-----------+
+|-2147483648|
++-----------+
+
+SELECT 9223372036854775807l AS col;
++-------------------+
+|                col|
++-------------------+
+|9223372036854775807|
++-------------------+
+
+SELECT -32Y AS col;
++---+
+|col|
++---+
+|-32|
++---+
+
+SELECT 482S AS col;
++---+
+|col|
++---+
+|482|
++---+
+{% endhighlight %}
+
+#### Fractional Literals
+
+##### Syntax
+
+decimal literals:
+{% highlight sql %}
+decimal_digits { [ BD ] | [ exponent BD ] } | digit [ ... ] [ exponent ] BD
+{% endhighlight %}
+
+double literals:
+{% highlight sql %}
+decimal_digits  { D | exponent [ D ] }  | digit [ ... ] { exponent [ D ] | [ exponent ] D }
+{% endhighlight %}
+
+While decimal_digits is defined as
+{% highlight sql %}
+[ + | - ] { digit [ ... ] . [ digit [ ... ] ] | . digit [ ... ] }
+{% endhighlight %}
+
+and exponent is defined as
+{% highlight sql %}
+E [ + | - ] digit [ ... ]
+{% endhighlight %}
+
+##### Parameters
+
+<dl>
+  <dt><code><em>digit</em></code></dt>
+  <dd>
+    Any numeral from 0 to 9.
+  </dd>
+</dl>
+
+<dl>
+  <dt><code><em>D</em></code></dt>
+  <dd>
+    Case insensitive, indicates <code>DOUBLE</code>, which is an 8-byte double-precision floating point number.
+  </dd>
+</dl>
+
+<dl>
+  <dt><code><em>BD</em></code></dt>
+  <dd>
+    Case

[GitHub] [spark] cloud-fan commented on a change in pull request #28237: [SPARK-31465][SQL][DOCS] Document Literal in SQL Reference

2020-04-22 Thread GitBox


cloud-fan commented on a change in pull request #28237:
URL: https://github.com/apache/spark/pull/28237#discussion_r413508540



##
File path: docs/sql-ref-literals.md
##
@@ -0,0 +1,532 @@
+---
+layout: global
+title: Literals
+displayTitle: Literals
+license: |
+  Licensed to the Apache Software Foundation (ASF) under one or more
+  contributor license agreements.  See the NOTICE file distributed with
+  this work for additional information regarding copyright ownership.
+  The ASF licenses this file to You under the Apache License, Version 2.0
+  (the "License"); you may not use this file except in compliance with
+  the License.  You may obtain a copy of the License at
+ 
+ http://www.apache.org/licenses/LICENSE-2.0
+ 
+  Unless required by applicable law or agreed to in writing, software
+  distributed under the License is distributed on an "AS IS" BASIS,
+  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+  See the License for the specific language governing permissions and
+  limitations under the License.
+---
+
+A literal (also known as a constant) represents a fixed data value. Spark SQL 
supports the following literals:
+
+ * [String Literal](#string-literal)
+ * [Binary Literal](#binary-literal)
+ * [Null Literal](#null-literal)
+ * [Boolean Literal](#boolean-literal)
+ * [Numeric Literal](#numeric-literal)
+ * [Datetime Literal](#datetime-literal)
+ * [Interval Literal](#interval-literal)
+
+### String Literal
+
+A string literal is used to specify a character string value.
+
+ Syntax
+
+{% highlight sql %}
+'c [ ... ]' | "c [ ... ]"
+{% endhighlight %}
+
+ Parameters
+
+
+  c
+  
+One character from the character set. Use \ to escape special 
characters (e.g., ' or \).
+  
+
+
+ Examples
+
+{% highlight sql %}
+SELECT 'Hello, World!' AS col;
++-+
+|  col|
++-+
+|Hello, World!|
++-+
+
+SELECT "SPARK SQL" AS col;
++-+
+|  col|
++-+
+|Spark SQL|
++-+
+
+SELECT 'it\'s $10.' AS col;
++-+
+|  col|
++-+
+|It's $10.|
++-+
+{% endhighlight %}
+
+### Binary Literal
+
+A binary literal is used to specify a byte sequence value.
+
+ Syntax
+
+{% highlight sql %}
+X { 'c [ ... ]' | "c [ ... ]" }
+{% endhighlight %}
+
+ Parameters
+
+
+  c
+  
+One character from the character set.
+  
+
+
+ Examples
+
+{% highlight sql %}
+SELECT X'123456' AS col;
++--+
+|   col|
++--+
+|[12 34 56]|
++--+
+{% endhighlight %}
+
+### Null Literal
+
+A null literal is used to specify a null value.
+
+ Syntax
+
+{% highlight sql %}
+NULL
+{% endhighlight %}
+
+ Examples
+
+{% highlight sql %}
+SELECT NULL AS col;
+++
+| col|
+++
+|NULL|
+++
+{% endhighlight %}
+
+### Boolean Literal
+
+A boolean literal is used to specify a boolean value.
+
+ Syntax
+
+{% highlight sql %}
+TRUE | FALSE
+{% endhighlight %}
+
+ Examples
+
+{% highlight sql %}
+SELECT TRUE AS col;
+++
+| col|
+++
+|true|
+++
+{% endhighlight %}
+
+### Numeric Literal
+
+A numeric literal is used to specify a fixed or floating-point number.
+
+ Integral Literal
+
+ Syntax
+
+{% highlight sql %}
+[ + | - ] digit [ ... ] [ L | S | Y ]
+{% endhighlight %}
+
+ Parameters
+
+
+  digit
+  
+Any numeral from 0 to 9.
+  
+
+
+  L
+  
+Case insensitive, indicates BIGINT, which is a 8-byte signed 
integer number.
+  
+
+
+  S
+  
+Case insensitive, indicates SMALLINT, which is a 2-byte 
signed integer number.
+  
+
+
+  Y
+  
+Case insensitive, indicates TINYINT, which is a 1-byte signed 
integer number.
+  
+
+
+  default (no postfix)
+  
+Indicates a 4-byte signed integer number.
+  
+
+
+ Examples
+
+{% highlight sql %}
+SELECT -2147483648 AS col;
++---+
+|col|
++---+
+|-2147483648|
++---+
+
+SELECT 9223372036854775807l AS col;
++---+
+|col|
++---+
+|9223372036854775807|
++---+
+
+SELECT -32Y AS col;
++---+
+|col|
++---+
+|-32|
++---+
+
+SELECT 482S AS col;
++---+
+|col|
++---+
+|482|
++---+
+{% endhighlight %}
+
+ Fractional Literals
+
+ Syntax
+
+decimal literals:
+{% highlight sql %}
+decimal_digits { [ BD ] | [ exponent BD ] } | digit [ ... ] [ exponent ] BD
+{% endhighlight %}
+
+double literals:
+{% highlight sql %}
+decimal_digits  { D | exponent [ D ] }  | digit [ ... ] { exponent [ D ] | [ 
exponent ] D }
+{% endhighlight %}
+
+While decimal_digits is defined as
+{% highlight sql %}
+[ + | - ] { digit [ ... ] . [ digit [ ... ] ] | . digit [ ... ] }
+{% endhighlight %}
+
+and exponent is defined as
+{% highlight sql %}
+E [ + | - ] digit [ ... ]
+{% endhighlight %}
+
+##### Parameters
+
+
+  digit
+  
+Any numeral from 0 to 9.
+  
+
+
+  D
+  
+Case insensitive, indicates DOUBLE, which is an 8-byte double-precision floating-point number.
+  
+
+
+  BD
+  
+Case 

[GitHub] [spark] cloud-fan commented on a change in pull request #28237: [SPARK-31465][SQL][DOCS] Document Literal in SQL Reference

2020-04-22 Thread GitBox


cloud-fan commented on a change in pull request #28237:
URL: https://github.com/apache/spark/pull/28237#discussion_r413508685



##
File path: docs/sql-ref-literals.md
##
@@ -0,0 +1,532 @@
+---
+layout: global
+title: Literals
+displayTitle: Literals
+license: |
+  Licensed to the Apache Software Foundation (ASF) under one or more
+  contributor license agreements.  See the NOTICE file distributed with
+  this work for additional information regarding copyright ownership.
+  The ASF licenses this file to You under the Apache License, Version 2.0
+  (the "License"); you may not use this file except in compliance with
+  the License.  You may obtain a copy of the License at
+ 
+ http://www.apache.org/licenses/LICENSE-2.0
+ 
+  Unless required by applicable law or agreed to in writing, software
+  distributed under the License is distributed on an "AS IS" BASIS,
+  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+  See the License for the specific language governing permissions and
+  limitations under the License.
+---
+
+A literal (also known as a constant) represents a fixed data value. Spark SQL 
supports the following literals:
+
+ * [String Literal](#string-literal)
+ * [Binary Literal](#binary-literal)
+ * [Null Literal](#null-literal)
+ * [Boolean Literal](#boolean-literal)
+ * [Numeric Literal](#numeric-literal)
+ * [Datetime Literal](#datetime-literal)
+ * [Interval Literal](#interval-literal)
+
+### String Literal
+
+A string literal is used to specify a character string value.
+
+#### Syntax
+
+{% highlight sql %}
+'c [ ... ]' | "c [ ... ]"
+{% endhighlight %}
+
+#### Parameters
+
+
+  c
+  
+One character from the character set. Use \ to escape special 
characters (e.g., ' or \).
+  
+
+
+#### Examples
+
+{% highlight sql %}
+SELECT 'Hello, World!' AS col;
++-------------+
+|          col|
++-------------+
+|Hello, World!|
++-------------+
+
+SELECT "SPARK SQL" AS col;
++---------+
+|      col|
++---------+
+|SPARK SQL|
++---------+
+
+SELECT 'it\'s $10.' AS col;
++---------+
+|      col|
++---------+
+|it's $10.|
++---------+
+{% endhighlight %}
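+
+Quote styles can also be mixed to avoid escaping; a single quote needs no
+escape inside a double-quoted string (an illustrative example, not from the
+patch above):
+
+{% highlight sql %}
+SELECT "it's $10." AS col;
++---------+
+|      col|
++---------+
+|it's $10.|
++---------+
+{% endhighlight %}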
+
+### Binary Literal
+
+A binary literal is used to specify a byte sequence value.
+
+#### Syntax
+
+{% highlight sql %}
+X { 'c [ ... ]' | "c [ ... ]" }
+{% endhighlight %}
+
+#### Parameters
+
+
+  c
+  
+One character from the character set.
+  
+
+
+#### Examples
+
+{% highlight sql %}
+SELECT X'123456' AS col;
++----------+
+|       col|
++----------+
+|[12 34 56]|
++----------+
+{% endhighlight %}
+
+### Null Literal
+
+A null literal is used to specify a null value.
+
+#### Syntax
+
+{% highlight sql %}
+NULL
+{% endhighlight %}
+
+#### Examples
+
+{% highlight sql %}
+SELECT NULL AS col;
++----+
+| col|
++----+
+|NULL|
++----+
+{% endhighlight %}
+
+### Boolean Literal
+
+A boolean literal is used to specify a boolean value.
+
+#### Syntax
+
+{% highlight sql %}
+TRUE | FALSE
+{% endhighlight %}
+
+#### Examples
+
+{% highlight sql %}
+SELECT TRUE AS col;
++----+
+| col|
++----+
+|true|
++----+
+{% endhighlight %}
+
+### Numeric Literal
+
+A numeric literal is used to specify a fixed or floating-point number.
+
+#### Integral Literal
+
+##### Syntax
+
+{% highlight sql %}
+[ + | - ] digit [ ... ] [ L | S | Y ]
+{% endhighlight %}
+
+##### Parameters
+
+
+  digit
+  
+Any numeral from 0 to 9.
+  
+
+
+  L
+  
+Case insensitive, indicates BIGINT, which is an 8-byte signed integer number.
+  
+
+
+  S
+  
+Case insensitive, indicates SMALLINT, which is a 2-byte 
signed integer number.
+  
+
+
+  Y
+  
+Case insensitive, indicates TINYINT, which is a 1-byte signed 
integer number.
+  
+
+
+  default (no postfix)
+  
+Indicates a 4-byte signed integer number.
+  
+
+
+##### Examples
+
+{% highlight sql %}
+SELECT -2147483648 AS col;
++-----------+
+|        col|
++-----------+
+|-2147483648|
++-----------+
+
+SELECT 9223372036854775807l AS col;
++-------------------+
+|                col|
++-------------------+
+|9223372036854775807|
++-------------------+
+
+SELECT -32Y AS col;
++---+
+|col|
++---+
+|-32|
++---+
+
+SELECT 482S AS col;
++---+
+|col|
++---+
+|482|
++---+
+{% endhighlight %}
+
+#### Fractional Literals
+
+##### Syntax
+
+decimal literals:
+{% highlight sql %}
+decimal_digits { [ BD ] | [ exponent BD ] } | digit [ ... ] [ exponent ] BD
+{% endhighlight %}
+
+double literals:
+{% highlight sql %}
+decimal_digits { D | exponent [ D ] } | digit [ ... ] { exponent [ D ] | [ exponent ] D }
+{% endhighlight %}
+
+where decimal_digits is defined as
+{% highlight sql %}
+[ + | - ] { digit [ ... ] . [ digit [ ... ] ] | . digit [ ... ] }
+{% endhighlight %}
+
+and exponent is defined as
+{% highlight sql %}
+E [ + | - ] digit [ ... ]
+{% endhighlight %}
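+
+Putting these rules together, forms like the following are accepted
+(illustrative examples; the D and exponent forms produce DOUBLE, while the
+others produce DECIMAL with precision and scale inferred from the digits):
+
+{% highlight sql %}
+SELECT 12.578 AS col;    -- DECIMAL(5,3)
+SELECT 5E2 AS col;       -- DOUBLE: 500.0
+SELECT -3.E-3D AS col;   -- DOUBLE: -0.003
+SELECT 123.BD AS col;    -- DECIMAL(3,0)
+{% endhighlight %}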
+
+##### Parameters
+
+
+  digit
+  
+Any numeral from 0 to 9.
+  
+
+
+  D
+  
+Case insensitive, indicates DOUBLE, which is an 8-byte double-precision floating-point number.
+  
+
+
+  BD
+  
+Case 

[GitHub] [spark] viirya commented on a change in pull request #27207: [SPARK-18886][CORE] Make Locality wait time measure resource under utilization due to delay scheduling.

2020-04-22 Thread GitBox


viirya commented on a change in pull request #27207:
URL: https://github.com/apache/spark/pull/27207#discussion_r413506421



##
File path: 
core/src/main/scala/org/apache/spark/scheduler/TaskSchedulerImpl.scala
##
@@ -319,20 +336,38 @@ private[spark] class TaskSchedulerImpl(
 taskSetsByStageIdAndAttempt -= manager.taskSet.stageId
   }
 }
+resetOnPreviousOffer -= manager.taskSet
 manager.parent.removeSchedulable(manager)
 logInfo(s"Removed TaskSet ${manager.taskSet.id}, whose tasks have all 
completed, from pool" +
   s" ${manager.parent.name}")
   }
 
+  /**
+   * Offers resources to a single [[TaskSetManager]] at a given max allowed 
[[TaskLocality]].
+   *
+   * @param taskSet task set manager to offer resources to
+   * @param maxLocality max locality to allow when scheduling
+   * @param shuffledOffers shuffled resource offers to use for scheduling,
+   *   remaining resources are tracked by below fields as 
tasks are scheduled
+   * @param availableCpus  remaining cpus per offer,
+   *   value at index 'i' corresponds to shuffledOffers[i]
+   * @param availableResources remaining resources per offer,
+   *   value at index 'i' corresponds to 
shuffledOffers[i]
+   * @param tasks tasks scheduled per offer, value at index 'i' corresponds to 
shuffledOffers[i]
+   * @param addressesWithDescs tasks scheduler per host:port, used for barrier 
tasks
+   * @return tuple of (had delay schedule rejects?, option of min locality of 
launched task)

Review comment:
   If returning true, I think it means no delay schedule rejects, not had 
delay schedule rejects.





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] [spark] viirya commented on a change in pull request #27207: [SPARK-18886][CORE] Make Locality wait time measure resource under utilization due to delay scheduling.

2020-04-22 Thread GitBox


viirya commented on a change in pull request #27207:
URL: https://github.com/apache/spark/pull/27207#discussion_r413503891



##
File path: core/src/main/scala/org/apache/spark/internal/config/package.scala
##
@@ -543,6 +543,16 @@ package object config {
   .version("1.2.0")
   .fallbackConf(DYN_ALLOCATION_SCHEDULER_BACKLOG_TIMEOUT)
 
+  private[spark] val LEGACY_LOCALITY_WAIT_RESET =
+ConfigBuilder("spark.locality.wait.legacyResetOnTaskLaunch")
+.doc("Whether to use the legacy behavior of locality wait, which resets 
the delay timer " +
+  "anytime a task is scheduled. See Delay Scheduling section of 
TaskSchedulerImpl's class " +
+  "documentation for more details.")
+.internal()
+.version("3.0.0")

Review comment:
   I think this was not merged into 3.0 branch, right?








[GitHub] [spark] AmplabJenkins commented on issue #28264: [SPARK-31491][SQL][DOCS] Re-arrange Data Types page to document Floating Point Special Values

2020-04-22 Thread GitBox


AmplabJenkins commented on issue #28264:
URL: https://github.com/apache/spark/pull/28264#issuecomment-618173352










[GitHub] [spark] AmplabJenkins removed a comment on issue #28264: [SPARK-31491][SQL][DOCS] Re-arrange Data Types page to document Floating Point Special Values

2020-04-22 Thread GitBox


AmplabJenkins removed a comment on issue #28264:
URL: https://github.com/apache/spark/pull/28264#issuecomment-618173352










[GitHub] [spark] SparkQA removed a comment on issue #28264: [SPARK-31491][SQL][DOCS] Re-arrange Data Types page to document Floating Point Special Values

2020-04-22 Thread GitBox


SparkQA removed a comment on issue #28264:
URL: https://github.com/apache/spark/pull/28264#issuecomment-618169835


   **[Test build #121649 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/121649/testReport)**
 for PR 28264 at commit 
[`ef7611e`](https://github.com/apache/spark/commit/ef7611e870a3ee3069bccfb0804072eb185a5b34).






[GitHub] [spark] SparkQA commented on issue #28264: [SPARK-31491][SQL][DOCS] Re-arrange Data Types page to document Floating Point Special Values

2020-04-22 Thread GitBox


SparkQA commented on issue #28264:
URL: https://github.com/apache/spark/pull/28264#issuecomment-618173252


   **[Test build #121649 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/121649/testReport)**
 for PR 28264 at commit 
[`ef7611e`](https://github.com/apache/spark/commit/ef7611e870a3ee3069bccfb0804072eb185a5b34).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.






[GitHub] [spark] HyukjinKwon commented on issue #28295: [SPARK-31508][SQL] we need convert the type to double type if one is…

2020-04-22 Thread GitBox


HyukjinKwon commented on issue #28295:
URL: https://github.com/apache/spark/pull/28295#issuecomment-618171808


   Closing as a duplicate of https://github.com/apache/spark/pull/27150.






[GitHub] [spark] HyukjinKwon commented on issue #28305: [SPARK-31474][SQL][FOLLOWUP] Replace _FUNC_ placeholder with functionname in the note field of expression info

2020-04-22 Thread GitBox


HyukjinKwon commented on issue #28305:
URL: https://github.com/apache/spark/pull/28305#issuecomment-618171396


   Documentation build passed in the Github Actions. I am going to merge this.






[GitHub] [spark] HyukjinKwon commented on issue #28305: [SPARK-31474][SQL][FOLLOWUP] Replace _FUNC_ placeholder with functionname in the note field of expression info

2020-04-22 Thread GitBox


HyukjinKwon commented on issue #28305:
URL: https://github.com/apache/spark/pull/28305#issuecomment-618171468


   Merged to master and branch-3.0.






[GitHub] [spark] AmplabJenkins removed a comment on issue #28264: [SPARK-31491][SQL][DOCS] Re-arrange Data Types page to document Floating Point Special Values

2020-04-22 Thread GitBox


AmplabJenkins removed a comment on issue #28264:
URL: https://github.com/apache/spark/pull/28264#issuecomment-618170176










[GitHub] [spark] AmplabJenkins commented on issue #28264: [SPARK-31491][SQL][DOCS] Re-arrange Data Types page to document Floating Point Special Values

2020-04-22 Thread GitBox


AmplabJenkins commented on issue #28264:
URL: https://github.com/apache/spark/pull/28264#issuecomment-618170176










[GitHub] [spark] SparkQA commented on issue #28264: [SPARK-31491][SQL][DOCS] Re-arrange Data Types page to document Floating Point Special Values

2020-04-22 Thread GitBox


SparkQA commented on issue #28264:
URL: https://github.com/apache/spark/pull/28264#issuecomment-618169835


   **[Test build #121649 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/121649/testReport)**
 for PR 28264 at commit 
[`ef7611e`](https://github.com/apache/spark/commit/ef7611e870a3ee3069bccfb0804072eb185a5b34).






[GitHub] [spark] cloud-fan commented on issue #28300: [MINOR][SQL] Add comments for filters values and return values of Row.get()/apply()

2020-04-22 Thread GitBox


cloud-fan commented on issue #28300:
URL: https://github.com/apache/spark/pull/28300#issuecomment-618168987


   thanks, merging to master/3.0!






[GitHub] [spark] cloud-fan commented on a change in pull request #28303: [SPARK-31495][SQL][FOLLOW-UP][3.0] Fix test failure of explain-aqe.sql

2020-04-22 Thread GitBox


cloud-fan commented on a change in pull request #28303:
URL: https://github.com/apache/spark/pull/28303#discussion_r413496559



##
File path: sql/core/src/test/resources/sql-tests/results/explain-aqe.sql.out
##
@@ -314,7 +314,7 @@ Arguments: HashedRelationBroadcastMode(List(cast(input[0, 
int, true] as bigint))
 Left keys [1]: [key#x]
 Right keys [1]: [key#x]
 Join condition: None
-   

Review comment:
   does it mean the master branch outputs some extra spaces in EXPLAIN 
FORMATTED?








[GitHub] [spark] AmplabJenkins removed a comment on issue #28302: [SPARK-31522][SQL] Hive metastore client initialization related configurations should be static

2020-04-22 Thread GitBox


AmplabJenkins removed a comment on issue #28302:
URL: https://github.com/apache/spark/pull/28302#issuecomment-618164998


   Test FAILed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/121644/
   Test FAILed.






[GitHub] [spark] AmplabJenkins removed a comment on issue #28302: [SPARK-31522][SQL] Hive metastore client initialization related configurations should be static

2020-04-22 Thread GitBox


AmplabJenkins removed a comment on issue #28302:
URL: https://github.com/apache/spark/pull/28302#issuecomment-618164990


   Merged build finished. Test FAILed.






[GitHub] [spark] SparkQA removed a comment on issue #28302: [SPARK-31522][SQL] Hive metastore client initialization related configurations should be static

2020-04-22 Thread GitBox


SparkQA removed a comment on issue #28302:
URL: https://github.com/apache/spark/pull/28302#issuecomment-618137351


   **[Test build #121644 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/121644/testReport)**
 for PR 28302 at commit 
[`5c15a98`](https://github.com/apache/spark/commit/5c15a98270c428b0d9e7bacf553162d650a887b3).






[GitHub] [spark] AmplabJenkins commented on issue #28302: [SPARK-31522][SQL] Hive metastore client initialization related configurations should be static

2020-04-22 Thread GitBox


AmplabJenkins commented on issue #28302:
URL: https://github.com/apache/spark/pull/28302#issuecomment-618164990










[GitHub] [spark] SparkQA commented on issue #28302: [SPARK-31522][SQL] Hive metastore client initialization related configurations should be static

2020-04-22 Thread GitBox


SparkQA commented on issue #28302:
URL: https://github.com/apache/spark/pull/28302#issuecomment-618164856


   **[Test build #121644 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/121644/testReport)**
 for PR 28302 at commit 
[`5c15a98`](https://github.com/apache/spark/commit/5c15a98270c428b0d9e7bacf553162d650a887b3).
* This patch **fails Spark unit tests**.
* This patch merges cleanly.
* This patch adds no public classes.






[GitHub] [spark] HyukjinKwon commented on a change in pull request #27861: [SPARK-30707][SQL]Window function set partitionSpec as order spec when orderSpec is empty

2020-04-22 Thread GitBox


HyukjinKwon commented on a change in pull request #27861:
URL: https://github.com/apache/spark/pull/27861#discussion_r413491709



##
File path: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/parser/AstBuilder.scala
##
@@ -1691,7 +1691,19 @@ class AstBuilder(conf: SQLConf) extends 
SqlBaseBaseVisitor[AnyRef] with Logging
   override def visitWindowDef(ctx: WindowDefContext): WindowSpecDefinition = 
withOrigin(ctx) {
 // CLUSTER BY ... | PARTITION BY ... ORDER BY ...
 val partition = ctx.partition.asScala.map(expression)
-val order = ctx.sortItem.asScala.map(visitSortItem)
+val order = if (ctx.sortItem.asScala.nonEmpty) {
+  ctx.sortItem.asScala.map(visitSortItem)
+} else if (ctx.windowFrame != null &&
+  ctx.windowFrame().frameType.getType == SqlBaseParser.RANGE) {
+  // for RANGE window frame, we won't add default order spec
+  ctx.sortItem.asScala.map(visitSortItem)
+} else {
+  // Same default behaviors like hive, when order spec is null
+  // set partition spec expression as order spec
+  ctx.partition.asScala.map { expr =>
+SortOrder(expression(expr), Ascending, Ascending.defaultNullOrdering, 
Set.empty)

Review comment:
   I think we should not fix it because, at least on the Spark side, the 
results will be non-deterministic. I doubt it is good to add this support 
only for compatibility with other DBMSes when the output is expected to be 
useless.
   
   Maybe disallowing it might be a better idea than finding another problem 
later caused by the different and indeterministic data.
   
   Do you maybe know other cases from other distributed DBMSs such as presto?








[GitHub] [spark] AmplabJenkins commented on issue #28305: [SPARK-31474][SQL][FOLLOWUP] Replace _FUNC_ placeholder with functionname in the note field of expression info

2020-04-22 Thread GitBox


AmplabJenkins commented on issue #28305:
URL: https://github.com/apache/spark/pull/28305#issuecomment-618157415










[GitHub] [spark] AmplabJenkins removed a comment on issue #28305: [SPARK-31474][SQL][FOLLOWUP] Replace _FUNC_ placeholder with functionname in the note field of expression info

2020-04-22 Thread GitBox


AmplabJenkins removed a comment on issue #28305:
URL: https://github.com/apache/spark/pull/28305#issuecomment-618157415










[GitHub] [spark] SparkQA commented on issue #28305: [SPARK-31474][SQL][FOLLOWUP] Replace _FUNC_ placeholder with functionname in the note field of expression info

2020-04-22 Thread GitBox


SparkQA commented on issue #28305:
URL: https://github.com/apache/spark/pull/28305#issuecomment-618157211


   **[Test build #121648 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/121648/testReport)**
 for PR 28305 at commit 
[`2a3d5ce`](https://github.com/apache/spark/commit/2a3d5cef9ce2b4a9c4ff6134d1a3ca4350654aa0).






[GitHub] [spark] yaooqinn opened a new pull request #28305: [SPARK-31474][SQL][FOLLOWUP] Replace _FUNC_ placeholder with functionname in the note field of expression info

2020-04-22 Thread GitBox


yaooqinn opened a new pull request #28305:
URL: https://github.com/apache/spark/pull/28305


   
   
   ### What changes were proposed in this pull request?
   
   _FUNC_ has been used in the note() field of `ExpressionDescription` since 
https://github.com/apache/spark/pull/28248, and more such cases may follow; we 
should replace it with the function name in the documentation
   
   
   ### Why are the changes needed?
   
   doc fix
   
   ### Does this PR introduce any user-facing change?
   
   no
   
   ### How was this patch tested?
   
   pass Jenkins, and verify locally with Jekyll serve






[GitHub] [spark] AmplabJenkins removed a comment on issue #27557: [SPARK-30804][SS] Measure and log elapsed time for "compact" operation in CompactibleFileStreamLog

2020-04-22 Thread GitBox


AmplabJenkins removed a comment on issue #27557:
URL: https://github.com/apache/spark/pull/27557#issuecomment-618153699










[GitHub] [spark] AmplabJenkins commented on issue #27557: [SPARK-30804][SS] Measure and log elapsed time for "compact" operation in CompactibleFileStreamLog

2020-04-22 Thread GitBox


AmplabJenkins commented on issue #27557:
URL: https://github.com/apache/spark/pull/27557#issuecomment-618153699










[GitHub] [spark] SparkQA commented on issue #27557: [SPARK-30804][SS] Measure and log elapsed time for "compact" operation in CompactibleFileStreamLog

2020-04-22 Thread GitBox


SparkQA commented on issue #27557:
URL: https://github.com/apache/spark/pull/27557#issuecomment-618153414


   **[Test build #121647 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/121647/testReport)**
 for PR 27557 at commit 
[`648f0dc`](https://github.com/apache/spark/commit/648f0dc2b5b2b53e2c62641ea0da0e04a5ffec0b).






[GitHub] [spark] Ngone51 commented on issue #28303: [SPARK-31495][SQL][FOLLOW-UP][3.0] Fix test failure of explain-aqe.sql

2020-04-22 Thread GitBox


Ngone51 commented on issue #28303:
URL: https://github.com/apache/spark/pull/28303#issuecomment-618153067


   thanks!






[GitHub] [spark] AngersZhuuuu commented on a change in pull request #27861: [SPARK-30707][SQL]Window function set partitionSpec as order spec when orderSpec is empty

2020-04-22 Thread GitBox


AngersZh commented on a change in pull request #27861:
URL: https://github.com/apache/spark/pull/27861#discussion_r413477419



##
File path: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/parser/AstBuilder.scala
##
@@ -1691,7 +1691,19 @@ class AstBuilder(conf: SQLConf) extends 
SqlBaseBaseVisitor[AnyRef] with Logging
   override def visitWindowDef(ctx: WindowDefContext): WindowSpecDefinition = 
withOrigin(ctx) {
 // CLUSTER BY ... | PARTITION BY ... ORDER BY ...
 val partition = ctx.partition.asScala.map(expression)
-val order = ctx.sortItem.asScala.map(visitSortItem)
+val order = if (ctx.sortItem.asScala.nonEmpty) {
+  ctx.sortItem.asScala.map(visitSortItem)
+} else if (ctx.windowFrame != null &&
+  ctx.windowFrame().frameType.getType == SqlBaseParser.RANGE) {
+  // for RANGE window frame, we won't add default order spec
+  ctx.sortItem.asScala.map(visitSortItem)
+} else {
+  // Same default behaviors like hive, when order spec is null
+  // set partition spec expression as order spec
+  ctx.partition.asScala.map { expr =>
+SortOrder(expression(expr), Ascending, Ascending.defaultNullOrdering, 
Set.empty)

Review comment:
   > deterministic
   
   For the same SQL, the result is deterministic.
   
   And adding the partition columns as the order-by columns by default keeps the result deterministic.
   
   I ran into this problem when migrating Hive SQL to Spark SQL.








[GitHub] [spark] dongjoon-hyun commented on a change in pull request #28301: [SPARK-31521][CORE] Correct the fetch size when merging blocks into a merged block

2020-04-22 Thread GitBox


dongjoon-hyun commented on a change in pull request #28301:
URL: https://github.com/apache/spark/pull/28301#discussion_r413475681



##
File path: 
core/src/main/scala/org/apache/spark/storage/ShuffleBlockFetcherIterator.scala
##
@@ -414,21 +414,23 @@ final class ShuffleBlockFetcherIterator(
 def shouldMergeIntoPreviousBatchBlockId =
   mergedBlockInfo.last.blockId.asInstanceOf[ShuffleBlockBatchId].mapId 
== startBlockId.mapId
 
-val startReduceId = if (mergedBlockInfo.nonEmpty && 
shouldMergeIntoPreviousBatchBlockId) {
-  // Remove the previous batch block id as we will add a new one to 
replace it.
-  mergedBlockInfo.remove(mergedBlockInfo.length - 1).blockId
-.asInstanceOf[ShuffleBlockBatchId].startReduceId
-} else {
-  startBlockId.reduceId
-}
+val (startReduceId, size) =
+  if (mergedBlockInfo.nonEmpty && shouldMergeIntoPreviousBatchBlockId) 
{
+// Remove the previous batch block id as we will add a new one to 
replace it.
+val removed = mergedBlockInfo.remove(mergedBlockInfo.length - 1)
+  (removed.blockId.asInstanceOf[ShuffleBlockBatchId].startReduceId,
+removed.size + toBeMerged.map(_.size).sum)
+  } else {
+(startBlockId.reduceId, toBeMerged.map(_.size).sum)
+  }

Review comment:
   Thank you, @Ngone51 !








[GitHub] [spark] dongjoon-hyun commented on issue #28293: [SPARK-31518][CORE] Expose filterByRange in JavaPairRDD

2020-04-22 Thread GitBox


dongjoon-hyun commented on issue #28293:
URL: https://github.com/apache/spark/pull/28293#issuecomment-618149931


   @wetneb, you have been added to the Apache Spark contributor group and SPARK-31518 has been assigned to you. Thank you again, @wetneb!






[GitHub] [spark] AmplabJenkins removed a comment on issue #28304: [SPARK-31523][SQL] LogicalPlan doCanonicalize should throw exception if not resolved

2020-04-22 Thread GitBox


AmplabJenkins removed a comment on issue #28304:
URL: https://github.com/apache/spark/pull/28304#issuecomment-618149114


   Test FAILed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/121646/
   Test FAILed.






[GitHub] [spark] AmplabJenkins removed a comment on issue #28304: [SPARK-31523][SQL] LogicalPlan doCanonicalize should throw exception if not resolved

2020-04-22 Thread GitBox


AmplabJenkins removed a comment on issue #28304:
URL: https://github.com/apache/spark/pull/28304#issuecomment-618149106


   Merged build finished. Test FAILed.






[GitHub] [spark] SparkQA removed a comment on issue #28304: [SPARK-31523][SQL] LogicalPlan doCanonicalize should throw exception if not resolved

2020-04-22 Thread GitBox


SparkQA removed a comment on issue #28304:
URL: https://github.com/apache/spark/pull/28304#issuecomment-618143562


   **[Test build #121646 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/121646/testReport)**
 for PR 28304 at commit 
[`49b27e1`](https://github.com/apache/spark/commit/49b27e1bc150dfb8a356bc544b7cd247aa1513cc).






[GitHub] [spark] SparkQA commented on issue #28304: [SPARK-31523][SQL] LogicalPlan doCanonicalize should throw exception if not resolved

2020-04-22 Thread GitBox


SparkQA commented on issue #28304:
URL: https://github.com/apache/spark/pull/28304#issuecomment-618149079


   **[Test build #121646 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/121646/testReport)**
 for PR 28304 at commit 
[`49b27e1`](https://github.com/apache/spark/commit/49b27e1bc150dfb8a356bc544b7cd247aa1513cc).
* This patch **fails Spark unit tests**.
* This patch merges cleanly.
* This patch adds no public classes.






[GitHub] [spark] AmplabJenkins commented on issue #28304: [SPARK-31523][SQL] LogicalPlan doCanonicalize should throw exception if not resolved

2020-04-22 Thread GitBox


AmplabJenkins commented on issue #28304:
URL: https://github.com/apache/spark/pull/28304#issuecomment-618149106










[GitHub] [spark] dongjoon-hyun commented on issue #28285: [SPARK-31510][R][BUILD] Set setwd in R documentation build

2020-04-22 Thread GitBox


dongjoon-hyun commented on issue #28285:
URL: https://github.com/apache/spark/pull/28285#issuecomment-618147860


   Thank you! Ya. It was really weird.






[GitHub] [spark] maropu commented on issue #28264: [SPARK-31491][SQL][DOCS] Re-arrange Data Types page to document Floating Point Special Values

2020-04-22 Thread GitBox


maropu commented on issue #28264:
URL: https://github.com/apache/spark/pull/28264#issuecomment-618147127


   Could you update the screenshot, too? Looks fine except for the existing 
comments.






[GitHub] [spark] ulysses-you commented on a change in pull request #28304: [SPARK-31523][SQL] LogicalPlan doCanonicalize should throw exception if not resolved

2020-04-22 Thread GitBox


ulysses-you commented on a change in pull request #28304:
URL: https://github.com/apache/spark/pull/28304#discussion_r413470723



##
File path: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/LogicalPlan.scala
##
@@ -40,6 +40,13 @@ abstract class LogicalPlan
 super.verboseString(maxFields) + statsCache.map(", " + 
_.toString).getOrElse("")
   }
 
+  override protected def doCanonicalize(): LogicalPlan = {
+if (!resolved) {

Review comment:
   I considered using an assert, but this can be triggered in normal usage, e.g.
   ```
   spark.sql("select id, count(*) from t1 group by id limit 1").queryExecution.logical.canonicalized
   ```
   
   So throwing an exception may be better.
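As an aside, the fail-fast idea under discussion can be illustrated with a toy, Spark-independent sketch; the `Plan` and `canonicalize` names below are stand-ins, not the actual Catalyst API, where the PR instead overrides `doCanonicalize` in `LogicalPlan`:

```scala
// Toy illustration of the fail-fast guard: refuse to canonicalize a plan
// that has not been resolved yet. `Plan` is a stand-in, not Catalyst's
// LogicalPlan.
final case class Plan(name: String, resolved: Boolean)

def canonicalize(plan: Plan): Plan = {
  if (!plan.resolved) {
    // Fail fast: canonicalizing an unresolved plan is meaningless.
    throw new IllegalStateException(s"Plan '${plan.name}' is not resolved")
  }
  plan.copy(name = plan.name.toLowerCase) // stand-in "canonical" form
}

val ok = canonicalize(Plan("Aggregate", resolved = true))
println(ok.name) // aggregate

val failedFast =
  try { canonicalize(Plan("Aggregate", resolved = false)); false }
  catch { case _: IllegalStateException => true }
println(failedFast) // true
```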








[GitHub] [spark] maropu commented on a change in pull request #28264: [SPARK-31491][SQL][DOCS] Re-arrange Data Types page to document Floating Point Special Values

2020-04-22 Thread GitBox


maropu commented on a change in pull request #28264:
URL: https://github.com/apache/spark/pull/28264#discussion_r413469950



##
File path: docs/sql-ref-datatypes.md
##
@@ -706,3 +708,61 @@ The following table shows the type names as well as 
aliases used in Spark SQL pa
 
 
 
+
+### Floating Point Special Values
+
+Spark SQL supports several special floating point values in a case-insensitive 
manner:
+
+ * Inf/+Inf/Infinity/+Infinity: positive infinity
+   * ```FloatType```: 1.0f / 0.0f, which is equal to the value returned by 
java.lang.Float.intBitsToFloat(0x7f80).
+   * ```DoubleType```: 1.0 / 0.0, which is equal to the value returned by 
java.lang.Double.longBitsToDouble(0x7ff0L).
+ * -Inf/-Infinity: negative infinity
+   * ```FloatType```: -1.0f / 0.0f, which is equal to the value returned by 
java.lang.Float.intBitsToFloat(0xff80).
+   * ```DoubleType```: -1.0 / 0.0, which is equal to the value returned by 
java.lang.Double.longBitsToDouble(0xfff0L).
+ * NaN: not a number
+   * ```FloatType```: 0.0f / 0.0f, which is equivalent to the value returned 
by java.lang.Float.intBitsToFloat(0x7fc0).
+   * ```DoubleType```:  0.0d / 0.0, which is equivalent to the value returned 
by java.lang.Double.longBitsToDouble(0x7ff8L).
+
+ Examples
+
+{% highlight sql %}
+SELECT double('infinity');
++------------------------+
+|CAST(infinity AS DOUBLE)|
++------------------------+
+|                Infinity|
++------------------------+
+
+SELECT float('-inf');
++-------------------+
+|CAST(-inf AS FLOAT)|
++-------------------+
+|          -Infinity|
++-------------------+
+
+SELECT float('NaN');
++------------------+
+|CAST(NaN AS FLOAT)|
++------------------+
+|               NaN|
++------------------+
+{% endhighlight %}
+
+### -Infinity/Infinity Semantics
+

Review comment:
   Could you add a short description of what this section covers?
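For reference, the IEEE 754 bit patterns documented in the diff above are plain JVM facts and can be checked without Spark; a small illustrative Scala snippet (using the full 32-/64-bit constants, which are the standard IEEE 754 encodings):

```scala
// JVM bit patterns for the special floating point values documented above.
val posInfF = java.lang.Float.intBitsToFloat(0x7f800000)             // +Infinity
val negInfF = java.lang.Float.intBitsToFloat(0xff800000)             // -Infinity
val nanF    = java.lang.Float.intBitsToFloat(0x7fc00000)             // NaN
val posInfD = java.lang.Double.longBitsToDouble(0x7ff0000000000000L) // +Infinity
val nanD    = java.lang.Double.longBitsToDouble(0x7ff8000000000000L) // NaN

println(posInfF == 1.0f / 0.0f)  // true
println(negInfF == -1.0f / 0.0f) // true
println(nanF.isNaN)              // true
println(posInfD == 1.0 / 0.0)    // true
println(nanD.isNaN)              // true
```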








[GitHub] [spark] turboFei commented on issue #26339: [SPARK-27194][SPARK-29302][SQL] Fix the issue that for dynamic partition overwrite a task would conflict with its speculative task

2020-04-22 Thread GitBox


turboFei commented on issue #26339:
URL: https://github.com/apache/spark/pull/26339#issuecomment-618145797


   gentle ping @cloud-fan 






[GitHub] [spark] maropu commented on a change in pull request #28304: [SPARK-31523][SQL] LogicalPlan doCanonicalize should throw exception if not resolved

2020-04-22 Thread GitBox


maropu commented on a change in pull request #28304:
URL: https://github.com/apache/spark/pull/28304#discussion_r413469121



##
File path: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/LogicalPlan.scala
##
@@ -40,6 +40,13 @@ abstract class LogicalPlan
 super.verboseString(maxFields) + statsCache.map(", " + 
_.toString).getOrElse("")
   }
 
+  override protected def doCanonicalize(): LogicalPlan = {
+if (!resolved) {

Review comment:
   Should this be an assert rather than throwing an exception?








[GitHub] [spark] HeartSaVioR commented on a change in pull request #27557: [SPARK-30804][SS] Measure and log elapsed time for "compact" operation in CompactibleFileStreamLog

2020-04-22 Thread GitBox


HeartSaVioR commented on a change in pull request #27557:
URL: https://github.com/apache/spark/pull/27557#discussion_r413469060



##
File path: 
sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/CompactibleFileStreamLog.scala
##
@@ -268,6 +288,7 @@ abstract class CompactibleFileStreamLog[T <: AnyRef : 
ClassTag](
 
 object CompactibleFileStreamLog {
   val COMPACT_FILE_SUFFIX = ".compact"
+  val COMPACT_LATENCY_WARN_THRESHOLD_MS = 2000

Review comment:
   Yeah, it's a heuristic - I think end users should be notified when a batch spends more than 2 seconds just compacting metadata, as the latency here is opaque to them if we don't log it, and they will be left wondering.








[GitHub] [spark] HyukjinKwon commented on a change in pull request #27861: [SPARK-30707][SQL]Window function set partitionSpec as order spec when orderSpec is empty

2020-04-22 Thread GitBox


HyukjinKwon commented on a change in pull request #27861:
URL: https://github.com/apache/spark/pull/27861#discussion_r413468819



##
File path: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/parser/AstBuilder.scala
##
@@ -1691,7 +1691,19 @@ class AstBuilder(conf: SQLConf) extends 
SqlBaseBaseVisitor[AnyRef] with Logging
   override def visitWindowDef(ctx: WindowDefContext): WindowSpecDefinition = 
withOrigin(ctx) {
 // CLUSTER BY ... | PARTITION BY ... ORDER BY ...
 val partition = ctx.partition.asScala.map(expression)
-val order = ctx.sortItem.asScala.map(visitSortItem)
+val order = if (ctx.sortItem.asScala.nonEmpty) {
+  ctx.sortItem.asScala.map(visitSortItem)
+} else if (ctx.windowFrame != null &&
+  ctx.windowFrame().frameType.getType == SqlBaseParser.RANGE) {
+  // for RANGE window frame, we won't add default order spec
+  ctx.sortItem.asScala.map(visitSortItem)
+} else {
+  // Same default behaviors like hive, when order spec is null
+  // set partition spec expression as order spec
+  ctx.partition.asScala.map { expr =>
+SortOrder(expression(expr), Ascending, Ascending.defaultNullOrdering, 
Set.empty)

Review comment:
   I guess it's because PostgreSQL can keep the natural order, while Spark can't. Is the PostgreSQL result deterministic?








[GitHub] [spark] AmplabJenkins commented on issue #28304: [SPARK-31523][SQL] LogicalPlan doCanonicalize should throw exception if not resolved

2020-04-22 Thread GitBox


AmplabJenkins commented on issue #28304:
URL: https://github.com/apache/spark/pull/28304#issuecomment-618143947










[GitHub] [spark] AmplabJenkins removed a comment on issue #28304: [SPARK-31523][SQL] LogicalPlan doCanonicalize should throw exception if not resolved

2020-04-22 Thread GitBox


AmplabJenkins removed a comment on issue #28304:
URL: https://github.com/apache/spark/pull/28304#issuecomment-618143947










[GitHub] [spark] SparkQA commented on issue #28304: [SPARK-31523][SQL] LogicalPlan doCanonicalize should throw exception if not resolved

2020-04-22 Thread GitBox


SparkQA commented on issue #28304:
URL: https://github.com/apache/spark/pull/28304#issuecomment-618143562


   **[Test build #121646 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/121646/testReport)**
 for PR 28304 at commit 
[`49b27e1`](https://github.com/apache/spark/commit/49b27e1bc150dfb8a356bc544b7cd247aa1513cc).






[GitHub] [spark] HeartSaVioR commented on a change in pull request #27557: [SPARK-30804][SS] Measure and log elapsed time for "compact" operation in CompactibleFileStreamLog

2020-04-22 Thread GitBox


HeartSaVioR commented on a change in pull request #27557:
URL: https://github.com/apache/spark/pull/27557#discussion_r413466284



##
File path: 
sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/CompactibleFileStreamLog.scala
##
@@ -177,16 +178,35 @@ abstract class CompactibleFileStreamLog[T <: AnyRef : 
ClassTag](
* corresponding `batchId` file. It will delete expired files as well if 
enabled.
*/
   private def compact(batchId: Long, logs: Array[T]): Boolean = {
-val validBatches = getValidBatchesBeforeCompactionBatch(batchId, 
compactInterval)
-val allLogs = validBatches.flatMap { id =>
-  super.get(id).getOrElse {
-throw new IllegalStateException(
-  s"${batchIdToPath(id)} doesn't exist when compacting batch $batchId 
" +
-s"(compactInterval: $compactInterval)")
-  }
-} ++ logs
+val (allLogs, loadElapsedMs) = Utils.timeTakenMs {
+  val validBatches = getValidBatchesBeforeCompactionBatch(batchId, 
compactInterval)
+  validBatches.flatMap { id =>
+super.get(id).getOrElse {
+  throw new IllegalStateException(
+s"${batchIdToPath(id)} doesn't exist when compacting batch 
$batchId " +
+  s"(compactInterval: $compactInterval)")
+}
+  } ++ logs
+}
+val compactedLogs = compactLogs(allLogs)
+
 // Return false as there is another writer.
-super.add(batchId, compactLogs(allLogs).toArray)
+val (writeSucceed, writeElapsedMs) = Utils.timeTakenMs {
+  super.add(batchId, compactedLogs.toArray)
+}
+
+val elapsedMs = loadElapsedMs + writeElapsedMs
+if (elapsedMs >= COMPACT_LATENCY_WARN_THRESHOLD_MS) {
+  logWarning(s"Compacting took $elapsedMs ms (load: $loadElapsedMs ms," +
+s" write: $writeElapsedMs ms) for compact batch $batchId")
+  logWarning(s"Loaded ${allLogs.size} entries 
(${SizeEstimator.estimate(allLogs)} bytes in " +

Review comment:
   Yes that sounds better. I'll add "(estimated)" after "bytes". Thanks!
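The load/write timing pattern in the diff above can be mimicked with a small standalone helper; this sketch mirrors what `Utils.timeTakenMs` does in Spark but is illustrative, not the actual utility:

```scala
// Standalone sketch of the timeTakenMs pattern used in the diff above:
// run a block, return its result together with the elapsed wall-clock millis.
def timeTakenMs[T](body: => T): (T, Long) = {
  val startNs = System.nanoTime()
  val result = body
  (result, (System.nanoTime() - startNs) / 1000000L)
}

// Time two phases separately, as compact() does for load and write.
val (loaded, loadElapsedMs) = timeTakenMs((1 to 1000).toArray)
val (written, writeElapsedMs) = timeTakenMs(loaded.sum)

val elapsedMs = loadElapsedMs + writeElapsedMs
// Warn only when the combined latency crosses a threshold, as the PR does
// with COMPACT_LATENCY_WARN_THRESHOLD_MS = 2000.
if (elapsedMs >= 2000) println(s"Compacting took $elapsedMs ms")
println(written) // 500500
```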








[GitHub] [spark] ulysses-you opened a new pull request #28304: [SPARK-31523][SQL] LogicalPlan doCanonicalize should throw exception if not resolved

2020-04-22 Thread GitBox


ulysses-you opened a new pull request #28304:
URL: https://github.com/apache/spark/pull/28304


   
   
   ### What changes were proposed in this pull request?
   
   Throw an AnalysisException if a LogicalPlan is not resolved.
   
   ### Why are the changes needed?
   
   There is no point in canonicalizing an unresolved plan; this change makes it fail fast.
   
   ### Does this PR introduce any user-facing change?
   
   No.
   
   ### How was this patch tested?
   
   Jenkins tests pass.






[GitHub] [spark] AngersZhuuuu commented on a change in pull request #27861: [SPARK-30707][SQL]Window function set partitionSpec as order spec when orderSpec is empty

2020-04-22 Thread GitBox


AngersZhuuuu commented on a change in pull request #27861:
URL: https://github.com/apache/spark/pull/27861#discussion_r413465498



##
File path: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/parser/AstBuilder.scala
##
@@ -1691,7 +1691,19 @@ class AstBuilder(conf: SQLConf) extends 
SqlBaseBaseVisitor[AnyRef] with Logging
   override def visitWindowDef(ctx: WindowDefContext): WindowSpecDefinition = 
withOrigin(ctx) {
 // CLUSTER BY ... | PARTITION BY ... ORDER BY ...
 val partition = ctx.partition.asScala.map(expression)
-val order = ctx.sortItem.asScala.map(visitSortItem)
+val order = if (ctx.sortItem.asScala.nonEmpty) {
+  ctx.sortItem.asScala.map(visitSortItem)
+} else if (ctx.windowFrame != null &&
+  ctx.windowFrame().frameType.getType == SqlBaseParser.RANGE) {
+  // for RANGE window frame, we won't add default order spec
+  ctx.sortItem.asScala.map(visitSortItem)
+} else {
+  // Same default behaviors like hive, when order spec is null
+  // set partition spec expression as order spec
+  ctx.partition.asScala.map { expr =>
+SortOrder(expression(expr), Ascending, Ascending.defaultNullOrdering, 
Set.empty)

Review comment:
   > But the results will be useless. When can it be useful if the order is indeterministic for the functions dependent on the order .. ?
   
   In PostgreSQL, if we don't specify an order column, the result follows the partition column's default sort order.
   ```
   angerszhu=# explain analyze verbose select id, num, lead(id) over (partition 
by num) from s4;
                                     QUERY PLAN
   -----------------------------------------------------------------------------
WindowAgg  (cost=158.51..198.06 rows=2260 width=12) (actual 
time=0.107..0.122 rows=6 loops=1)
  Output: id, num, lead(id) OVER (?)
  ->  Sort  (cost=158.51..164.16 rows=2260 width=8) (actual 
time=0.079..0.081 rows=6 loops=1)
Output: num, id
Sort Key: s4.num
Sort Method: quicksort  Memory: 25kB
->  Seq Scan on public.s4  (cost=0.00..32.60 rows=2260 width=8) 
(actual time=0.057..0.061 rows=6 loops=1)
  Output: num, id
Planning Time: 0.114 ms
Execution Time: 0.214 ms
   
   angerszhu=# explain analyze verbose select id, num, lead(id) over (partition 
by num order by id) from s4;
                                     QUERY PLAN
   -----------------------------------------------------------------------------
WindowAgg  (cost=158.51..203.71 rows=2260 width=12) (actual 
time=0.976..1.017 rows=6 loops=1)
  Output: id, num, lead(id) OVER (?)
  ->  Sort  (cost=158.51..164.16 rows=2260 width=8) (actual 
time=0.067..0.070 rows=6 loops=1)
Output: id, num
Sort Key: s4.num, s4.id
Sort Method: quicksort  Memory: 25kB
->  Seq Scan on public.s4  (cost=0.00..32.60 rows=2260 width=8) 
(actual time=0.042..0.045 rows=6 loops=1)
  Output: id, num
Planning Time: 0.155 ms
Execution Time: 1.208 ms
   (10 rows)
   ```
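The nondeterminism being debated can be reproduced with a toy, Spark-free model of `lead(id) OVER (PARTITION BY num)`: with no ORDER BY, the lead value depends entirely on the order rows happen to arrive in.

```scala
// Toy model of lead(id) over (partition by num) with no ORDER BY:
// the lead of each row is just the next row in arrival order within
// its num-partition, so shuffling the input changes the answer.
def leadById(rows: Seq[(Int, Int)]): Map[(Int, Int), Option[Int]] = // (id, num)
  rows.groupBy(_._2).flatMap { case (_, part) =>
    part.zip(part.drop(1).map(r => Some(r._1)) :+ None)
      .map { case (row, lead) => row -> lead }
  }

val a = leadById(Seq((1, 10), (2, 10), (3, 20)))
val b = leadById(Seq((2, 10), (1, 10), (3, 20))) // same rows, different order
println(a((1, 10))) // Some(2): row 2 follows row 1 in this arrival order
println(b((1, 10))) // None: row 1 is last in its partition this time
```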








[GitHub] [spark] mridulm commented on a change in pull request #28257: [SPARK-31485][CORE] Avoid application hang if only partial barrier tasks launched

2020-04-22 Thread GitBox


mridulm commented on a change in pull request #28257:
URL: https://github.com/apache/spark/pull/28257#discussion_r413463015



##
File path: 
core/src/main/scala/org/apache/spark/scheduler/TaskSchedulerImpl.scala
##
@@ -468,8 +466,9 @@ private[spark] class TaskSchedulerImpl(
   resourceProfileIds: Array[Int],
   availableCpus: Array[Int],
   availableResources: Array[Map[String, Buffer[String]]],
-  rpId: Int): Int = {
-val resourceProfile = sc.resourceProfileManager.resourceProfileFromId(rpId)
+  taskSet: TaskSetManager): Int = {
+val resourceProfile = sc.resourceProfileManager.resourceProfileFromId(
+  taskSet.taskSet.resourceProfileId)
 val offersForResourceProfile = resourceProfileIds.zipWithIndex.filter { 
case (id, _) =>

Review comment:
   Ah! Yes, I knew I missed something :-)
   Thanks








[GitHub] [spark] mridulm commented on a change in pull request #28257: [SPARK-31485][CORE] Avoid application hang if only partial barrier tasks launched

2020-04-22 Thread GitBox


mridulm commented on a change in pull request #28257:
URL: https://github.com/apache/spark/pull/28257#discussion_r413462366



##
File path: 
core/src/main/scala/org/apache/spark/scheduler/TaskSchedulerImpl.scala
##
@@ -741,8 +750,12 @@ private[spark] class TaskSchedulerImpl(
 if (state == TaskState.LOST) {
   // TaskState.LOST is only used by the deprecated Mesos 
fine-grained scheduling mode,
   // where each executor corresponds to a single task, so mark the 
executor as failed.
-  val execId = taskIdToExecutorId.getOrElse(tid, throw new 
IllegalStateException(
-"taskIdToTaskSetManager.contains(tid) <=> 
taskIdToExecutorId.contains(tid)"))
+  val execId = taskIdToExecutorId.getOrElse(tid, {
+val errorMsg =
+  "taskIdToTaskSetManager.contains(tid) <=> 
taskIdToExecutorId.contains(tid)"
+taskSet.abort(errorMsg)
+throw new SparkException(errorMsg)

Review comment:
   Is the exception change fine here? +CC @tgravescs 








[GitHub] [spark] AmplabJenkins commented on issue #28257: [SPARK-31485][CORE] Avoid application hang if only partial barrier tasks launched

2020-04-22 Thread GitBox


AmplabJenkins commented on issue #28257:
URL: https://github.com/apache/spark/pull/28257#issuecomment-618139740










[GitHub] [spark] AmplabJenkins removed a comment on issue #28257: [SPARK-31485][CORE] Avoid application hang if only partial barrier tasks launched

2020-04-22 Thread GitBox


AmplabJenkins removed a comment on issue #28257:
URL: https://github.com/apache/spark/pull/28257#issuecomment-618139740










[GitHub] [spark] Ngone51 commented on a change in pull request #28257: [SPARK-31485][CORE] Avoid application hang if only partial barrier tasks launched

2020-04-22 Thread GitBox


Ngone51 commented on a change in pull request #28257:
URL: https://github.com/apache/spark/pull/28257#discussion_r413462299



##
File path: 
core/src/main/scala/org/apache/spark/scheduler/TaskSchedulerImpl.scala
##
@@ -468,8 +466,9 @@ private[spark] class TaskSchedulerImpl(
   resourceProfileIds: Array[Int],
   availableCpus: Array[Int],
   availableResources: Array[Map[String, Buffer[String]]],
-  rpId: Int): Int = {
-val resourceProfile = sc.resourceProfileManager.resourceProfileFromId(rpId)
+  taskSet: TaskSetManager): Int = {
+val resourceProfile = sc.resourceProfileManager.resourceProfileFromId(
+  taskSet.taskSet.resourceProfileId)
 val offersForResourceProfile = resourceProfileIds.zipWithIndex.filter { 
case (id, _) =>

Review comment:
   We need `taskSet: TaskSetManager` now because we'll use it to abort the task set below.
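
   The motivation for the signature change can be illustrated with a small, hypothetical sketch (plain Python, not Spark's actual code; all names here are made up for illustration): passing the whole manager rather than a bare id lets the callee both resolve the resource profile and abort the task set on failure.

   ```python
   # Hypothetical sketch of the refactoring discussed above -- names are
   # illustrative, not Spark's real API.
   class TaskSetManager:
       def __init__(self, resource_profile_id):
           self.resource_profile_id = resource_profile_id
           self.abort_reason = None

       def abort(self, reason):
           # Record why the task set was aborted.
           self.abort_reason = reason

   def cpus_per_task(manager, profiles):
       """Before: took only an rp_id. After: takes the manager, so it can abort."""
       profile = profiles.get(manager.resource_profile_id)
       if profile is None:
           # With the manager in hand, the callee can abort instead of
           # silently failing the lookup.
           manager.abort("no resource profile %d" % manager.resource_profile_id)
           return None
       return profile["cpus_per_task"]
   ```

   With only the id available, the lookup failure would have to propagate back to the caller before anything could be aborted.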








[GitHub] [spark] SparkQA commented on issue #28257: [SPARK-31485][CORE] Avoid application hang if only partial barrier tasks launched

2020-04-22 Thread GitBox


SparkQA commented on issue #28257:
URL: https://github.com/apache/spark/pull/28257#issuecomment-618139303


   **[Test build #121645 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/121645/testReport)** for PR 28257 at commit [`6495a9a`](https://github.com/apache/spark/commit/6495a9a31c3076540e791e7f2652452407df28c2).






[GitHub] [spark] mridulm commented on a change in pull request #28257: [SPARK-31485][CORE] Avoid application hang if only partial barrier tasks launched

2020-04-22 Thread GitBox


mridulm commented on a change in pull request #28257:
URL: https://github.com/apache/spark/pull/28257#discussion_r413461520



##
File path: core/src/main/scala/org/apache/spark/scheduler/TaskSchedulerImpl.scala
##
@@ -468,8 +466,9 @@ private[spark] class TaskSchedulerImpl(
       resourceProfileIds: Array[Int],
       availableCpus: Array[Int],
       availableResources: Array[Map[String, Buffer[String]]],
-      rpId: Int): Int = {
-    val resourceProfile = sc.resourceProfileManager.resourceProfileFromId(rpId)
+      taskSet: TaskSetManager): Int = {
+    val resourceProfile = sc.resourceProfileManager.resourceProfileFromId(
+      taskSet.taskSet.resourceProfileId)
     val offersForResourceProfile = resourceProfileIds.zipWithIndex.filter { case (id, _) =>

Review comment:
   True, but I was trying to make sense of whether it was relevant to the fix or not. Looks like an unrelated cleanup.








[GitHub] [spark] AmplabJenkins removed a comment on issue #28302: [SPARK-31522][SQL] Hive metastore client initialization related configurations should be static

2020-04-22 Thread GitBox


AmplabJenkins removed a comment on issue #28302:
URL: https://github.com/apache/spark/pull/28302#issuecomment-618137618










[GitHub] [spark] HyukjinKwon commented on a change in pull request #27861: [SPARK-30707][SQL]Window function set partitionSpec as order spec when orderSpec is empty

2020-04-22 Thread GitBox


HyukjinKwon commented on a change in pull request #27861:
URL: https://github.com/apache/spark/pull/27861#discussion_r413460172



##
File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/parser/AstBuilder.scala
##
@@ -1691,7 +1691,19 @@ class AstBuilder(conf: SQLConf) extends SqlBaseBaseVisitor[AnyRef] with Logging
   override def visitWindowDef(ctx: WindowDefContext): WindowSpecDefinition = withOrigin(ctx) {
     // CLUSTER BY ... | PARTITION BY ... ORDER BY ...
     val partition = ctx.partition.asScala.map(expression)
-    val order = ctx.sortItem.asScala.map(visitSortItem)
+    val order = if (ctx.sortItem.asScala.nonEmpty) {
+      ctx.sortItem.asScala.map(visitSortItem)
+    } else if (ctx.windowFrame != null &&
+      ctx.windowFrame().frameType.getType == SqlBaseParser.RANGE) {
+      // for RANGE window frame, we won't add default order spec
+      ctx.sortItem.asScala.map(visitSortItem)
+    } else {
+      // Same default behaviors like hive, when order spec is null
+      // set partition spec expression as order spec
+      ctx.partition.asScala.map { expr =>
+        SortOrder(expression(expr), Ascending, Ascending.defaultNullOrdering, Set.empty)

Review comment:
   But the results will be useless. When can it be useful if the order is non-deterministic for the functions that depend on the order?








[GitHub] [spark] AmplabJenkins commented on issue #28302: [SPARK-31522][SQL] Hive metastore client initialization related configurations should be static

2020-04-22 Thread GitBox


AmplabJenkins commented on issue #28302:
URL: https://github.com/apache/spark/pull/28302#issuecomment-618137618










[GitHub] [spark] SparkQA commented on issue #28302: [SPARK-31522][SQL] Hive metastore client initialization related configurations should be static

2020-04-22 Thread GitBox


SparkQA commented on issue #28302:
URL: https://github.com/apache/spark/pull/28302#issuecomment-618137351


   **[Test build #121644 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/121644/testReport)** for PR 28302 at commit [`5c15a98`](https://github.com/apache/spark/commit/5c15a98270c428b0d9e7bacf553162d650a887b3).






[GitHub] [spark] HyukjinKwon commented on a change in pull request #27861: [SPARK-30707][SQL]Window function set partitionSpec as order spec when orderSpec is empty

2020-04-22 Thread GitBox


HyukjinKwon commented on a change in pull request #27861:
URL: https://github.com/apache/spark/pull/27861#discussion_r413459621



##
File path: sql/core/src/test/resources/sql-tests/results/postgreSQL/window_part1.sql.out
##
@@ -422,7 +421,7 @@ struct
 -- !query
 SELECT count(*) OVER (PARTITION BY four) FROM (SELECT * FROM tenk1 WHERE FALSE)s

Review comment:
   Okay, now I completely get what you're trying to do. You do want _window functions_ to work without specifying the ordering, and non-window functions already work without specifying ordering (because the results will be deterministic anyway). Yes, -1 for the same comment from @hvanhovell.








[GitHub] [spark] AmplabJenkins removed a comment on issue #28292: [SPARK-31516][DOC] Fix non-existed metric hiveClientCalls.count of CodeGenerator in DOC

2020-04-22 Thread GitBox


AmplabJenkins removed a comment on issue #28292:
URL: https://github.com/apache/spark/pull/28292#issuecomment-617641723


   Can one of the admins verify this patch?






[GitHub] [spark] AngersZhuuuu commented on a change in pull request #27861: [SPARK-30707][SQL]Window function set partitionSpec as order spec when orderSpec is empty

2020-04-22 Thread GitBox


AngersZhuuuu commented on a change in pull request #27861:
URL: https://github.com/apache/spark/pull/27861#discussion_r413458776



##
File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/parser/AstBuilder.scala
##
@@ -1691,7 +1691,19 @@ class AstBuilder(conf: SQLConf) extends SqlBaseBaseVisitor[AnyRef] with Logging
   override def visitWindowDef(ctx: WindowDefContext): WindowSpecDefinition = withOrigin(ctx) {
     // CLUSTER BY ... | PARTITION BY ... ORDER BY ...
     val partition = ctx.partition.asScala.map(expression)
-    val order = ctx.sortItem.asScala.map(visitSortItem)
+    val order = if (ctx.sortItem.asScala.nonEmpty) {
+      ctx.sortItem.asScala.map(visitSortItem)
+    } else if (ctx.windowFrame != null &&
+      ctx.windowFrame().frameType.getType == SqlBaseParser.RANGE) {
+      // for RANGE window frame, we won't add default order spec
+      ctx.sortItem.asScala.map(visitSortItem)
+    } else {
+      // Same default behaviors like hive, when order spec is null
+      // set partition spec expression as order spec
+      ctx.partition.asScala.map { expr =>
+        SortOrder(expression(expr), Ascending, Ascending.defaultNullOrdering, Set.empty)

Review comment:
   > Wait .. why do we set the ordering column as partition column? We should just leave it unspecified so only (non-window) aggregation functions work together with unbounded windows so it doesn't get affected by the order. This is what Scala API does.
   
   Eh, Hive does it like this... For me, when the user doesn't set an ORDER BY clause, it means they don't care about the result order. For RANGE frames, the DataFrame API can't support this.
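
   The behavioral point being debated can be emulated outside Spark (a hedged sketch in plain Python; the helper name is made up): the Hive-style default turns the partition key into the sort key, which groups and orders rows by partition value but still leaves rows *within* a partition in an unspecified order, so order-dependent window functions remain non-deterministic.

   ```python
   # Emulate the Hive-style default: when no ORDER BY is given, sort by the
   # partition key itself, ascending. Names here are illustrative only.
   def hive_default_order(rows, partition_key):
       # Python's sort is stable, so ties (rows sharing a partition value)
       # keep their arbitrary incoming order -- exactly the non-determinism
       # discussed above.
       return sorted(rows, key=partition_key)

   rows = [("b", 2), ("a", 3), ("a", 1)]
   ordered = hive_default_order(rows, partition_key=lambda r: r[0])
   # Partition values are now grouped and ordered ("a", "a", "b"), but
   # ("a", 3) vs ("a", 1) kept whatever order the input happened to have.
   ```

   This is why an implicit default helps only aggregates that ignore intra-partition order; row_number-style functions would still see an arbitrary order within each partition.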








[GitHub] [spark] wezhang commented on issue #28292: [SPARK-31516][DOC] Fix non-existed metric hiveClientCalls.count of CodeGenerator in DOC

2020-04-22 Thread GitBox


wezhang commented on issue #28292:
URL: https://github.com/apache/spark/pull/28292#issuecomment-618136677


   Retest this please.






[GitHub] [spark] HyukjinKwon commented on a change in pull request #27861: [SPARK-30707][SQL]Window function set partitionSpec as order spec when orderSpec is empty

2020-04-22 Thread GitBox


HyukjinKwon commented on a change in pull request #27861:
URL: https://github.com/apache/spark/pull/27861#discussion_r413456806



##
File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/parser/AstBuilder.scala
##
@@ -1691,7 +1691,19 @@ class AstBuilder(conf: SQLConf) extends SqlBaseBaseVisitor[AnyRef] with Logging
   override def visitWindowDef(ctx: WindowDefContext): WindowSpecDefinition = withOrigin(ctx) {
     // CLUSTER BY ... | PARTITION BY ... ORDER BY ...
     val partition = ctx.partition.asScala.map(expression)
-    val order = ctx.sortItem.asScala.map(visitSortItem)
+    val order = if (ctx.sortItem.asScala.nonEmpty) {
+      ctx.sortItem.asScala.map(visitSortItem)
+    } else if (ctx.windowFrame != null &&
+      ctx.windowFrame().frameType.getType == SqlBaseParser.RANGE) {
+      // for RANGE window frame, we won't add default order spec
+      ctx.sortItem.asScala.map(visitSortItem)
+    } else {
+      // Same default behaviors like hive, when order spec is null
+      // set partition spec expression as order spec
+      ctx.partition.asScala.map { expr =>
+        SortOrder(expression(expr), Ascending, Ascending.defaultNullOrdering, Set.empty)

Review comment:
   Wait .. why do we set the ordering column as the partition column? We should just leave it unspecified so only (non-window) aggregation functions work together with unbounded windows, so the result doesn't get affected by the order. This is what the Scala API does.








[GitHub] [spark] HyukjinKwon removed a comment on issue #27861: [SPARK-30707][SQL]Window function set partitionSpec as order spec when orderSpec is empty

2020-04-22 Thread GitBox


HyukjinKwon removed a comment on issue #27861:
URL: https://github.com/apache/spark/pull/27861#issuecomment-618128508


   I think he wants to use the partition column as the ordering column implicitly, instead of not specifying it - https://github.com/apache/spark/pull/27861/files#diff-9847f5cef7cf7fbc5830fbc6b779ee10R1702. It wouldn't work if the ordering is not specified per https://github.com/apache/spark/blob/a28ed86a387b286745b30cd4d90b3d558205a5a7/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala#L2773-L2776






[GitHub] [spark] wezhang commented on issue #28292: [SPARK-31516][DOC] Fix non-existed metric hiveClientCalls.count of CodeGenerator in DOC

2020-04-22 Thread GitBox


wezhang commented on issue #28292:
URL: https://github.com/apache/spark/pull/28292#issuecomment-618134605


   There might be a networking issue in the build; I have to restart it.
   
   ```
   ERRORS
   SERVER ERROR: Gateway Time-out url=https://repo.typesafe.com/typesafe/ivy-releases/org.scala-sbt/classpath/0.13.18/ivys/ivy.xml
   SERVER ERROR: Gateway Time-out url=https://repo.typesafe.com/typesafe/ivy-releases/org.scala-sbt/logging/0.13.18/ivys/ivy.xml

   :: USE VERBOSE OR DEBUG MESSAGE LEVEL FOR MORE DETAILS
   unresolved dependency: org.scala-sbt#classpath;0.13.18: not found
   unresolved dependency: org.scala-sbt#logging;0.13.18: not found
   Error during sbt execution: Error retrieving required libraries
     (see /home/runner/.sbt/boot/update.log for complete log)
   Error: Could not retrieve sbt 0.13.18

   Jekyll 4.0.0   Please append `--trace` to the `build` command for any additional information or backtrace.

   /home/runner/work/spark/spark/docs/_plugins/copy_api_dirs.rb:30:in `': Unidoc generation failed (RuntimeError)
       from /opt/hostedtoolcache/Ruby/2.7.1/x64/lib/ruby/2.7.0/rubygems/core_ext/kernel_require.rb:92:in `require'
       from /opt/hostedtoolcache/Ruby/2.7.1/x64/lib/ruby/2.7.0/rubygems/core_ext/kernel_require.rb:92:in `require'
       from /opt/hostedtoolcache/Ruby/2.7.1/x64/lib/ruby/gems/2.7.0/gems/jekyll-4.0.0/lib/jekyll/external.rb:60:in `block in require_with_graceful_fail'
   ...
   ```






[GitHub] [spark] wezhang commented on issue #28292: [SPARK-31516][DOC] Fix non-existed metric hiveClientCalls.count of CodeGenerator in DOC

2020-04-22 Thread GitBox


wezhang commented on issue #28292:
URL: https://github.com/apache/spark/pull/28292#issuecomment-618133795


   > Hi @wezhang, well spotted, indeed this was a mistake, thanks for fixing it.
   > LGTM
   
   Thank you a lot!





