[GitHub] [spark] AmplabJenkins removed a comment on issue #26449: [SPARK-29822][SQL] Fix cast error when there are white spaces between signs and values
AmplabJenkins removed a comment on issue #26449: [SPARK-29822][SQL] Fix cast error when there are white spaces between signs and values URL: https://github.com/apache/spark/pull/26449#issuecomment-552332580 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/18461/ Test PASSed. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] AmplabJenkins removed a comment on issue #26449: [SPARK-29822][SQL] Fix cast error when there are white spaces between signs and values
AmplabJenkins removed a comment on issue #26449: [SPARK-29822][SQL] Fix cast error when there are white spaces between signs and values URL: https://github.com/apache/spark/pull/26449#issuecomment-552332574 Merged build finished. Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #26449: [SPARK-29822][SQL] Fix cast error when there are white spaces between signs and values
AmplabJenkins commented on issue #26449: [SPARK-29822][SQL] Fix cast error when there are white spaces between signs and values URL: https://github.com/apache/spark/pull/26449#issuecomment-552332580 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/18461/ Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #26449: [SPARK-29822][SQL] Fix cast error when there are white spaces between signs and values
AmplabJenkins commented on issue #26449: [SPARK-29822][SQL] Fix cast error when there are white spaces between signs and values URL: https://github.com/apache/spark/pull/26449#issuecomment-552332574 Merged build finished. Test PASSed.
[GitHub] [spark] SparkQA commented on issue #26449: [SPARK-29822][SQL] Fix cast error when there are white spaces between signs and values
SparkQA commented on issue #26449: [SPARK-29822][SQL] Fix cast error when there are white spaces between signs and values URL: https://github.com/apache/spark/pull/26449#issuecomment-552332236 **[Test build #113575 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/113575/testReport)** for PR 26449 at commit [`34bf719`](https://github.com/apache/spark/commit/34bf71910830e1efabe4af635b2f0ef6692f57d1).
[GitHub] [spark] AngersZhuuuu commented on issue #26437: [SPARK-29800][SQL] Plan Exists 's subquery in PlanSubqueries
AngersZhuuuu commented on issue #26437: [SPARK-29800][SQL] Plan Exists 's subquery in PlanSubqueries
URL: https://github.com/apache/spark/pull/26437#issuecomment-552331181

> > If result is huge, it's a big cost.
>
> Yea true, we can add a LIMIT 1 to the scalar subquery before count, then the result won't be huge. This is also how we implement `Dataset.isEmpty`.

I am not sure whether `LIMIT 1` produces its result by executing only one partition. My line of thinking is described in https://github.com/apache/spark/pull/26437#discussion_r344589540
[GitHub] [spark] AngersZhuuuu commented on a change in pull request #26437: [SPARK-29800][SQL] Plan Exists 's subquery in PlanSubqueries
AngersZhuuuu commented on a change in pull request #26437: [SPARK-29800][SQL] Plan Exists 's subquery in PlanSubqueries
URL: https://github.com/apache/spark/pull/26437#discussion_r344589540

## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/subquery.scala ##
@@ -106,12 +106,20 @@ object RewritePredicateSubquery extends Rule[LogicalPlan] with PredicateHelper {
       // Filter the plan by applying left semi and left anti joins.
       withSubquery.foldLeft(newFilter) {
-        case (p, Exists(sub, conditions, _)) =>
-          val (joinCond, outerPlan) = rewriteExistentialExpr(conditions, p)
-          buildJoin(outerPlan, sub, LeftSemi, joinCond)
-        case (p, Not(Exists(sub, conditions, _))) =>
-          val (joinCond, outerPlan) = rewriteExistentialExpr(conditions, p)
-          buildJoin(outerPlan, sub, LeftAnti, joinCond)
+        case (p, exists @ Exists(sub, conditions, _)) =>
+          if (SubqueryExpression.hasCorrelatedSubquery(exists)) {
+            val (joinCond, outerPlan) = rewriteExistentialExpr(conditions, p)
+            buildJoin(outerPlan, sub, LeftSemi, joinCond)
+          } else {
+            Filter(exists, newFilter)
+          }
+        case (p, Not(exists @ Exists(sub, conditions, _))) =>
+          if (SubqueryExpression.hasCorrelatedSubquery(exists)) {
+            val (joinCond, outerPlan) = rewriteExistentialExpr(conditions, p)
+            buildJoin(outerPlan, sub, LeftAnti, joinCond)
+          } else {
+            Filter(Not(exists), newFilter)
+          }

Review comment:

> @AngersZhuuuu I discussed this with Wenchen briefly. Do you think we can safely inject a "LIMIT 1" into our subplan to expedite its execution? Please let us know what you think.

I am also thinking about how to reduce the execution cost of this subquery. `LIMIT 1` is OK. My direction is to make the execution work like Spark Thrift Server's incremental collect, so that only one partition is executed. Shall we discuss the safety and cost of these two approaches?
[GitHub] [spark] SparkQA commented on issue #26439: [SPARK-29801][ML] ML models unify toString method
SparkQA commented on issue #26439: [SPARK-29801][ML] ML models unify toString method URL: https://github.com/apache/spark/pull/26439#issuecomment-552329953 **[Test build #113574 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/113574/testReport)** for PR 26439 at commit [`b5265a0`](https://github.com/apache/spark/commit/b5265a01907f7181639423b1f3ee54aa55654dc5).
[GitHub] [spark] cloud-fan commented on issue #26437: [SPARK-29800][SQL] Plan Exists 's subquery in PlanSubqueries
cloud-fan commented on issue #26437: [SPARK-29800][SQL] Plan Exists 's subquery in PlanSubqueries
URL: https://github.com/apache/spark/pull/26437#issuecomment-552329321

> If result is huge, it's a big cost.

Yea true, we can add a LIMIT 1 to the scalar subquery before count, then the result won't be huge. This is also how we implement `Dataset.isEmpty`.
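To illustrate the cost argument behind the LIMIT 1 suggestion, here is a minimal sketch in plain Scala collections. It is not Spark code: `CountingSource`, `existsByCount`, and `existsByLimit` are hypothetical names made up for this note, and a lazy `Iterator` only stands in for a subquery result. It shows that counting everything touches every row, while limiting to one row first touches a single row.

```scala
object ExistsSketch {
  // Hypothetical stand-in for a subquery result: a lazy row source
  // that records how many rows it actually had to produce.
  final class CountingSource(n: Int) {
    var produced: Int = 0
    def rows: Iterator[Int] = Iterator.range(0, n).map { i => produced += 1; i }
  }

  // Existence check by counting all rows (cost grows with the result size).
  def existsByCount(src: CountingSource): Boolean = src.rows.size > 0

  // Existence check by limiting to one row before counting.
  def existsByLimit(src: CountingSource): Boolean = src.rows.take(1).size > 0

  def main(args: Array[String]): Unit = {
    val full = new CountingSource(1000)
    val limited = new CountingSource(1000)
    println(s"count: ${existsByCount(full)}, rows produced = ${full.produced}")
    println(s"limit: ${existsByLimit(limited)}, rows produced = ${limited.produced}")
  }
}
```

Whether a real `LIMIT 1` in Spark avoids running all partitions is exactly the open question raised in the thread; this sketch says nothing about partition scheduling.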
[GitHub] [spark] yaooqinn commented on a change in pull request #26449: [SPARK-29822][SQL] Fix cast error when there are white spaces between signs and values
yaooqinn commented on a change in pull request #26449: [SPARK-29822][SQL] Fix cast error when there are white spaces between signs and values
URL: https://github.com/apache/spark/pull/26449#discussion_r344587864

## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/IntervalUtils.scala ##
@@ -425,14 +425,18 @@ object IntervalUtils {
   }

   private object ParseState extends Enumeration {
+    type ParseState = Value
+
     val PREFIX,
-    BEGIN_VALUE,
-    PARSE_SIGN,
-    PARSE_UNIT_VALUE,
-    FRACTIONAL_PART,
-    BEGIN_UNIT_NAME,
-    UNIT_NAME_SUFFIX,
-    END_UNIT_NAME = Value
+    NEXT_VALUE_UNIT,

Review comment:

cool. good naming
[GitHub] [spark] AmplabJenkins removed a comment on issue #26439: [SPARK-29801][ML] ML models unify toString method
AmplabJenkins removed a comment on issue #26439: [SPARK-29801][ML] ML models unify toString method URL: https://github.com/apache/spark/pull/26439#issuecomment-552328802 Merged build finished. Test PASSed.
[GitHub] [spark] AmplabJenkins removed a comment on issue #26439: [SPARK-29801][ML] ML models unify toString method
AmplabJenkins removed a comment on issue #26439: [SPARK-29801][ML] ML models unify toString method URL: https://github.com/apache/spark/pull/26439#issuecomment-552328807 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/18460/ Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #26439: [SPARK-29801][ML] ML models unify toString method
AmplabJenkins commented on issue #26439: [SPARK-29801][ML] ML models unify toString method URL: https://github.com/apache/spark/pull/26439#issuecomment-552328802 Merged build finished. Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #26439: [SPARK-29801][ML] ML models unify toString method
AmplabJenkins commented on issue #26439: [SPARK-29801][ML] ML models unify toString method URL: https://github.com/apache/spark/pull/26439#issuecomment-552328807 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/18460/ Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #26449: [SPARK-29822][SQL] Fix cast error when there are white spaces between signs and values
AmplabJenkins commented on issue #26449: [SPARK-29822][SQL] Fix cast error when there are white spaces between signs and values URL: https://github.com/apache/spark/pull/26449#issuecomment-552328172 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/18459/ Test PASSed.
[GitHub] [spark] AmplabJenkins removed a comment on issue #26449: [SPARK-29822][SQL] Fix cast error when there are white spaces between signs and values
AmplabJenkins removed a comment on issue #26449: [SPARK-29822][SQL] Fix cast error when there are white spaces between signs and values URL: https://github.com/apache/spark/pull/26449#issuecomment-552328172 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/18459/ Test PASSed.
[GitHub] [spark] AmplabJenkins removed a comment on issue #26449: [SPARK-29822][SQL] Fix cast error when there are white spaces between signs and values
AmplabJenkins removed a comment on issue #26449: [SPARK-29822][SQL] Fix cast error when there are white spaces between signs and values URL: https://github.com/apache/spark/pull/26449#issuecomment-552328167 Merged build finished. Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #26449: [SPARK-29822][SQL] Fix cast error when there are white spaces between signs and values
AmplabJenkins commented on issue #26449: [SPARK-29822][SQL] Fix cast error when there are white spaces between signs and values URL: https://github.com/apache/spark/pull/26449#issuecomment-552328167 Merged build finished. Test PASSed.
[GitHub] [spark] viirya commented on a change in pull request #26312: [SPARK-29649][SQL] Stop task set if FileAlreadyExistsException was thrown when writing to output file
viirya commented on a change in pull request #26312: [SPARK-29649][SQL] Stop task set if FileAlreadyExistsException was thrown when writing to output file
URL: https://github.com/apache/spark/pull/26312#discussion_r344586962

## File path: core/src/main/scala/org/apache/spark/TaskOutputFileAlreadyExistException.scala ##
@@ -0,0 +1,23 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark
+
+/**
+ * Exception thrown when a task cannot write to output file due to the file already exists.
+ */
+private[spark] class TaskOutputFileAlreadyExistException(error: Throwable) extends Exception(error)

Review comment:

This seems unnecessary? I see that `TaskNotSerializableException` does not extend it either.
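As a rough illustration of what a dedicated wrapper exception like the one in this diff allows, the sketch below shows a caller recognising the wrapper by type while the original error stays reachable through `getCause`. This is an assumption-laden toy, not the PR's scheduler logic: `shouldAbort` is a hypothetical helper, the class here is a public copy of the one in the diff, and `java.nio.file.FileAlreadyExistsException` merely stands in for whichever file-exists exception the real code wraps.

```scala
import java.nio.file.FileAlreadyExistsException

object WrapSketch {
  // Public copy of the wrapper class from the diff, for illustration only.
  class TaskOutputFileAlreadyExistException(error: Throwable) extends Exception(error)

  // Hypothetical helper: decide whether a failure should stop further retries.
  def shouldAbort(t: Throwable): Boolean = t match {
    case _: TaskOutputFileAlreadyExistException => true
    case _ => false
  }

  def main(args: Array[String]): Unit = {
    val wrapped = new TaskOutputFileAlreadyExistException(
      new FileAlreadyExistsException("/tmp/out"))
    println(shouldAbort(wrapped))
    println(wrapped.getCause.getClass.getSimpleName)
  }
}
```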
[GitHub] [spark] cloud-fan commented on a change in pull request #26449: [SPARK-29822][SQL] Fix cast error when there are white spaces between signs and values
cloud-fan commented on a change in pull request #26449: [SPARK-29822][SQL] Fix cast error when there are white spaces between signs and values
URL: https://github.com/apache/spark/pull/26449#discussion_r344586633

## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/IntervalUtils.scala ##
@@ -425,14 +425,18 @@ object IntervalUtils {
   }

   private object ParseState extends Enumeration {
+    type ParseState = Value

Review comment:

Is this a necessary change?
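For context on the question, here is a small standalone sketch of what a `type ParseState = Value` alias does in a Scala `Enumeration`: it lets method signatures name the type as `ParseState` rather than `ParseState.Value`. The state names other than `PREFIX` and the `next` function are illustrative assumptions and do not reproduce the state machine in `IntervalUtils`.

```scala
object ParseStateSketch {
  object ParseState extends Enumeration {
    // Alias so the value type can be written `ParseState` once imported,
    // instead of `ParseState.Value`.
    type ParseState = Value
    // Illustrative states only; the real parser has a different set.
    val PREFIX, TRIM_BEFORE_SIGN, SIGN = Value
  }
  import ParseState._

  // Hypothetical transition function whose signature uses the alias.
  def next(state: ParseState): ParseState = state match {
    case PREFIX           => TRIM_BEFORE_SIGN
    case TRIM_BEFORE_SIGN => SIGN
    case SIGN             => SIGN
  }

  def main(args: Array[String]): Unit = {
    println(next(PREFIX))
  }
}
```

Without the alias the same signature would read `def next(state: ParseState.Value): ParseState.Value`, so the alias is a readability choice rather than a functional requirement.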
[GitHub] [spark] SparkQA commented on issue #26044: [SPARK-29375][SQL] Exchange reuse across all subquery levels
SparkQA commented on issue #26044: [SPARK-29375][SQL] Exchange reuse across all subquery levels URL: https://github.com/apache/spark/pull/26044#issuecomment-552327771 **[Test build #113573 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/113573/testReport)** for PR 26044 at commit [`a3978f0`](https://github.com/apache/spark/commit/a3978f0add8fbd5498d560f80e822fa26382ab81).
[GitHub] [spark] SparkQA commented on issue #26439: [SPARK-29801][ML] ML models unify toString method
SparkQA commented on issue #26439: [SPARK-29801][ML] ML models unify toString method URL: https://github.com/apache/spark/pull/26439#issuecomment-552327760 **[Test build #113572 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/113572/testReport)** for PR 26439 at commit [`5711743`](https://github.com/apache/spark/commit/5711743ed39307ea26837a22fd20d35dfaea36b6).
[GitHub] [spark] cloud-fan commented on a change in pull request #26449: [SPARK-29822][SQL] Fix cast error when there are white spaces between signs and values
cloud-fan commented on a change in pull request #26449: [SPARK-29822][SQL] Fix cast error when there are white spaces between signs and values
URL: https://github.com/apache/spark/pull/26449#discussion_r344586541

## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/IntervalUtils.scala ##
@@ -425,14 +425,18 @@ object IntervalUtils {
   }

   private object ParseState extends Enumeration {
+    type ParseState = Value
+
     val PREFIX,
-    BEGIN_VALUE,
-    PARSE_SIGN,
-    PARSE_UNIT_VALUE,
-    FRACTIONAL_PART,
-    BEGIN_UNIT_NAME,
-    UNIT_NAME_SUFFIX,
-    END_UNIT_NAME = Value
+    NEXT_VALUE_UNIT,

Review comment:

How about `TRIM_BEFORE_SIGN`, to be consistent?
[GitHub] [spark] SparkQA commented on issue #26449: [SPARK-29822][SQL] Fix cast error when there are white spaces between signs and values
SparkQA commented on issue #26449: [SPARK-29822][SQL] Fix cast error when there are white spaces between signs and values URL: https://github.com/apache/spark/pull/26449#issuecomment-552327746 **[Test build #113571 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/113571/testReport)** for PR 26449 at commit [`83b9ae3`](https://github.com/apache/spark/commit/83b9ae36b09a3636ee3e1761ee5b4415ed21a8af).
[GitHub] [spark] MaxGekk removed a comment on issue #26358: [SPARK-29712][SQL] Take into account the left bound in `fromDayTimeString()`
MaxGekk removed a comment on issue #26358: [SPARK-29712][SQL] Take into account the left bound in `fromDayTimeString()`
URL: https://github.com/apache/spark/pull/26358#issuecomment-552325651

> so let's keep the existing behavior if pgsql dialect is set, and fail for invalid bounds otherwise.

This is what's implemented in the PR. Let me resolve conflicts.
[GitHub] [spark] AmplabJenkins removed a comment on issue #26044: [SPARK-29375][SQL] Exchange reuse across all subquery levels
AmplabJenkins removed a comment on issue #26044: [SPARK-29375][SQL] Exchange reuse across all subquery levels URL: https://github.com/apache/spark/pull/26044#issuecomment-552326777 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/113570/ Test FAILed.
[GitHub] [spark] AmplabJenkins removed a comment on issue #26044: [SPARK-29375][SQL] Exchange reuse across all subquery levels
AmplabJenkins removed a comment on issue #26044: [SPARK-29375][SQL] Exchange reuse across all subquery levels URL: https://github.com/apache/spark/pull/26044#issuecomment-552326773 Merged build finished. Test FAILed.
[GitHub] [spark] SparkQA removed a comment on issue #26044: [SPARK-29375][SQL] Exchange reuse across all subquery levels
SparkQA removed a comment on issue #26044: [SPARK-29375][SQL] Exchange reuse across all subquery levels URL: https://github.com/apache/spark/pull/26044#issuecomment-552325588 **[Test build #113570 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/113570/testReport)** for PR 26044 at commit [`1634003`](https://github.com/apache/spark/commit/163400365af5b8e44d6931c17a7162d90e47b9a7).
[GitHub] [spark] AmplabJenkins commented on issue #26044: [SPARK-29375][SQL] Exchange reuse across all subquery levels
AmplabJenkins commented on issue #26044: [SPARK-29375][SQL] Exchange reuse across all subquery levels URL: https://github.com/apache/spark/pull/26044#issuecomment-552326773 Merged build finished. Test FAILed.
[GitHub] [spark] AmplabJenkins removed a comment on issue #26359: [SPARK-29713][SQL] Support Interval Unit Abbreviations in Interval Literals
AmplabJenkins removed a comment on issue #26359: [SPARK-29713][SQL] Support Interval Unit Abbreviations in Interval Literals URL: https://github.com/apache/spark/pull/26359#issuecomment-552326650 Merged build finished. Test PASSed.
[GitHub] [spark] SparkQA commented on issue #26044: [SPARK-29375][SQL] Exchange reuse across all subquery levels
SparkQA commented on issue #26044: [SPARK-29375][SQL] Exchange reuse across all subquery levels
URL: https://github.com/apache/spark/pull/26044#issuecomment-552326757

**[Test build #113570 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/113570/testReport)** for PR 26044 at commit [`1634003`](https://github.com/apache/spark/commit/163400365af5b8e44d6931c17a7162d90e47b9a7).
* This patch **fails to build**.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] [spark] cloud-fan commented on a change in pull request #26011: [SPARK-29343][SQL] Eliminate sorts without limit in the subquery of Join/Aggregation
cloud-fan commented on a change in pull request #26011: [SPARK-29343][SQL] Eliminate sorts without limit in the subquery of Join/Aggregation
URL: https://github.com/apache/spark/pull/26011#discussion_r344585670

## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala ##
@@ -953,40 +952,62 @@ object CombineFilters extends Rule[LogicalPlan] with PredicateHelper {
 }

 /**
- * Removes no-op SortOrder from Sort
+ * Removes Sort operation. This can happen:
+ * 1) if the sort order is empty or the sort order does not have any reference
+ * 2) if the child is already sorted
+ * 3) if there is another Sort operator separated by 0...n Project/Filter operators
+ * 4) if the Sort operator is within Join separated by 0...n Project/Filter operators only,
+ *    and the Join conditions is deterministic
+ * 5) if the Sort operator is within GroupBy separated by 0...n Project/Filter operators only,
+ *    and the aggregate function is order irrelevant
  */
 object EliminateSorts extends Rule[LogicalPlan] {
   def apply(plan: LogicalPlan): LogicalPlan = plan transform {
     case s @ Sort(orders, _, child) if orders.isEmpty || orders.exists(_.child.foldable) =>
       val newOrders = orders.filterNot(_.child.foldable)
       if (newOrders.isEmpty) child else s.copy(order = newOrders)
-  }
-}
-
-/**
- * Removes redundant Sort operation. This can happen:
- * 1) if the child is already sorted
- * 2) if there is another Sort operator separated by 0...n Project/Filter operators
- */
-object RemoveRedundantSorts extends Rule[LogicalPlan] {
-  def apply(plan: LogicalPlan): LogicalPlan = plan transformDown {
     case Sort(orders, true, child) if SortOrder.orderingSatisfies(child.outputOrdering, orders) =>
       child
     case s @ Sort(_, _, child) => s.copy(child = recursiveRemoveSort(child))
+    case j @ Join(originLeft, originRight, _, cond, _) if cond.forall(_.deterministic) =>
+      j.copy(left = recursiveRemoveSort(originLeft), right = recursiveRemoveSort(originRight))
+    case g @ Aggregate(_, aggs, originChild) if isOrderIrrelevantAggs(aggs) =>
+      g.copy(child = recursiveRemoveSort(originChild))
   }

-  def recursiveRemoveSort(plan: LogicalPlan): LogicalPlan = plan match {
+  private def recursiveRemoveSort(plan: LogicalPlan): LogicalPlan = plan match {
     case Sort(_, _, child) => recursiveRemoveSort(child)
     case other if canEliminateSort(other) =>
       other.withNewChildren(other.children.map(recursiveRemoveSort))
     case _ => plan
   }

-  def canEliminateSort(plan: LogicalPlan): Boolean = plan match {
+  private def canEliminateSort(plan: LogicalPlan): Boolean = plan match {
     case p: Project => p.projectList.forall(_.deterministic)
     case f: Filter => f.condition.deterministic
     case _ => false
   }
+
+  private def isOrderIrrelevantAggs(aggs: Seq[NamedExpression]): Boolean = {
+    def isOrderIrrelevantAggFunction(func: AggregateFunction): Boolean = func match {
+      case _: Sum => true
+      case _: Min => true
+      case _: Max => true
+      case _: Count => true
+      case _: Average => true

Review comment:

Ah, that's a good point. AVG over floating-point values is order sensitive. I'm not sure this can really affect queries in practice, but it is better to be conservative here. @WangGuangxin can you fix it in a follow-up?
[GitHub] [spark] AmplabJenkins commented on issue #26359: [SPARK-29713][SQL] Support Interval Unit Abbreviations in Interval Literals
AmplabJenkins commented on issue #26359: [SPARK-29713][SQL] Support Interval Unit Abbreviations in Interval Literals URL: https://github.com/apache/spark/pull/26359#issuecomment-552326653 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/113564/ Test PASSed. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] AmplabJenkins commented on issue #26044: [SPARK-29375][SQL] Exchange reuse across all subquery levels
AmplabJenkins commented on issue #26044: [SPARK-29375][SQL] Exchange reuse across all subquery levels URL: https://github.com/apache/spark/pull/26044#issuecomment-552326777 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/113570/ Test FAILed. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] AmplabJenkins commented on issue #26359: [SPARK-29713][SQL] Support Interval Unit Abbreviations in Interval Literals
AmplabJenkins commented on issue #26359: [SPARK-29713][SQL] Support Interval Unit Abbreviations in Interval Literals URL: https://github.com/apache/spark/pull/26359#issuecomment-552326650 Merged build finished. Test PASSed. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] AmplabJenkins removed a comment on issue #26359: [SPARK-29713][SQL] Support Interval Unit Abbreviations in Interval Literals
AmplabJenkins removed a comment on issue #26359: [SPARK-29713][SQL] Support Interval Unit Abbreviations in Interval Literals URL: https://github.com/apache/spark/pull/26359#issuecomment-552326653 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/113564/ Test PASSed. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] peter-toth edited a comment on issue #26044: [SPARK-29375][SQL] Exchange reuse across all subquery levels
peter-toth edited a comment on issue #26044: [SPARK-29375][SQL] Exchange reuse across all subquery levels URL: https://github.com/apache/spark/pull/26044#issuecomment-552225930 > Could you show us actual performance numbers with/without this pr? I created a simple benchmark here: https://github.com/peter-toth/spark/commit/1fcc34bd6273258947da46e9cbf4182e6224aa66 just to show that this PR (`originalReuseExchangeVersion - false`) makes sense in some cases.
```
Running benchmark: Exchange reuse 1
  Running case: originalReuseExchangeVersion - true
  Stopped after 2 iterations, 24415 ms
  Running case: originalReuseExchangeVersion - false
  Stopped after 2 iterations, 13158 ms

OpenJDK 64-Bit Server VM 1.8.0_212-b04 on Mac OS X 10.14.6
Intel(R) Core(TM) i7-8750H CPU @ 2.20GHz
Exchange reuse 1:                      Best Time(ms)  Avg Time(ms)  Stdev(ms)  Rate(M/s)   Per Row(ns)  Relative
originalReuseExchangeVersion - true            11946         12208        370        0.0  2389167936.6      1.0X
originalReuseExchangeVersion - false            6538          6579         59        0.0  1307541708.0      1.8X

Running benchmark: Exchange reuse 2
  Running case: originalReuseExchangeVersion - true
  Stopped after 2 iterations, 23746 ms
  Running case: originalReuseExchangeVersion - false
  Stopped after 2 iterations, 13400 ms

OpenJDK 64-Bit Server VM 1.8.0_212-b04 on Mac OS X 10.14.6
Intel(R) Core(TM) i7-8750H CPU @ 2.20GHz
Exchange reuse 2:                      Best Time(ms)  Avg Time(ms)  Stdev(ms)  Rate(M/s)   Per Row(ns)  Relative
originalReuseExchangeVersion - true            11760         11873        160        0.0  2352022663.0      1.0X
originalReuseExchangeVersion - false            6642          6700         82        0.0  1328495627.4      1.8X
```
This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] SparkQA removed a comment on issue #26359: [SPARK-29713][SQL] Support Interval Unit Abbreviations in Interval Literals
SparkQA removed a comment on issue #26359: [SPARK-29713][SQL] Support Interval Unit Abbreviations in Interval Literals URL: https://github.com/apache/spark/pull/26359#issuecomment-552281904 **[Test build #113564 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/113564/testReport)** for PR 26359 at commit [`b75dbdb`](https://github.com/apache/spark/commit/b75dbdb8f6d96ba77684eb1dde966488178e76c0). This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] AmplabJenkins removed a comment on issue #26044: [SPARK-29375][SQL] Exchange reuse across all subquery levels
AmplabJenkins removed a comment on issue #26044: [SPARK-29375][SQL] Exchange reuse across all subquery levels URL: https://github.com/apache/spark/pull/26044#issuecomment-552326083 Merged build finished. Test PASSed. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] SparkQA commented on issue #26359: [SPARK-29713][SQL] Support Interval Unit Abbreviations in Interval Literals
SparkQA commented on issue #26359: [SPARK-29713][SQL] Support Interval Unit Abbreviations in Interval Literals URL: https://github.com/apache/spark/pull/26359#issuecomment-552326190 **[Test build #113564 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/113564/testReport)** for PR 26359 at commit [`b75dbdb`](https://github.com/apache/spark/commit/b75dbdb8f6d96ba77684eb1dde966488178e76c0). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] AmplabJenkins commented on issue #26044: [SPARK-29375][SQL] Exchange reuse across all subquery levels
AmplabJenkins commented on issue #26044: [SPARK-29375][SQL] Exchange reuse across all subquery levels URL: https://github.com/apache/spark/pull/26044#issuecomment-552326089 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/18458/ Test PASSed. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] cloud-fan commented on a change in pull request #26102: [SPARK-29448][SQL] Support the `INTERVAL` type by Parquet datasource
cloud-fan commented on a change in pull request #26102: [SPARK-29448][SQL] Support the `INTERVAL` type by Parquet datasource URL: https://github.com/apache/spark/pull/26102#discussion_r344585132 ## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/DataSource.scala ## @@ -498,10 +498,8 @@ case class DataSource( outputColumnNames: Seq[String], physicalPlan: SparkPlan): BaseRelation = { val outputColumns = DataWritingCommand.logicalPlanOutputWithNames(data, outputColumnNames) -if (outputColumns.map(_.dataType).exists(_.isInstanceOf[CalendarIntervalType])) { Review comment: The interval type is kind of an internal type for now. Whether we can read/write it from/to data sources is a big decision. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] AmplabJenkins removed a comment on issue #26044: [SPARK-29375][SQL] Exchange reuse across all subquery levels
AmplabJenkins removed a comment on issue #26044: [SPARK-29375][SQL] Exchange reuse across all subquery levels URL: https://github.com/apache/spark/pull/26044#issuecomment-552326089 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/18458/ Test PASSed. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] AmplabJenkins commented on issue #26044: [SPARK-29375][SQL] Exchange reuse across all subquery levels
AmplabJenkins commented on issue #26044: [SPARK-29375][SQL] Exchange reuse across all subquery levels URL: https://github.com/apache/spark/pull/26044#issuecomment-552326083 Merged build finished. Test PASSed. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] SparkQA commented on issue #26044: [SPARK-29375][SQL] Exchange reuse across all subquery levels
SparkQA commented on issue #26044: [SPARK-29375][SQL] Exchange reuse across all subquery levels URL: https://github.com/apache/spark/pull/26044#issuecomment-552325588 **[Test build #113570 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/113570/testReport)** for PR 26044 at commit [`1634003`](https://github.com/apache/spark/commit/163400365af5b8e44d6931c17a7162d90e47b9a7). This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] MaxGekk commented on issue #26358: [SPARK-29712][SQL] Take into account the left bound in `fromDayTimeString()`
MaxGekk commented on issue #26358: [SPARK-29712][SQL] Take into account the left bound in `fromDayTimeString()` URL: https://github.com/apache/spark/pull/26358#issuecomment-552325651 > so let's keep the existing behavior if pgsql dialect is set, and fail for invalid bounds otherwise. This is what's implemented in the PR. Let me resolve conflicts. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] gatorsmile commented on a change in pull request #26011: [SPARK-29343][SQL] Eliminate sorts without limit in the subquery of Join/Aggregation
gatorsmile commented on a change in pull request #26011: [SPARK-29343][SQL] Eliminate sorts without limit in the subquery of Join/Aggregation URL: https://github.com/apache/spark/pull/26011#discussion_r344584156 ## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala ## @@ -953,40 +952,62 @@ object CombineFilters extends Rule[LogicalPlan] with PredicateHelper { } /** - * Removes no-op SortOrder from Sort + * Removes Sort operation. This can happen: + * 1) if the sort order is empty or the sort order does not have any reference + * 2) if the child is already sorted + * 3) if there is another Sort operator separated by 0...n Project/Filter operators + * 4) if the Sort operator is within Join separated by 0...n Project/Filter operators only, + *and the Join conditions is deterministic + * 5) if the Sort operator is within GroupBy separated by 0...n Project/Filter operators only, + *and the aggregate function is order irrelevant */ object EliminateSorts extends Rule[LogicalPlan] { def apply(plan: LogicalPlan): LogicalPlan = plan transform { case s @ Sort(orders, _, child) if orders.isEmpty || orders.exists(_.child.foldable) => val newOrders = orders.filterNot(_.child.foldable) if (newOrders.isEmpty) child else s.copy(order = newOrders) - } -} - -/** - * Removes redundant Sort operation. 
This can happen: - * 1) if the child is already sorted - * 2) if there is another Sort operator separated by 0...n Project/Filter operators - */ -object RemoveRedundantSorts extends Rule[LogicalPlan] { - def apply(plan: LogicalPlan): LogicalPlan = plan transformDown { case Sort(orders, true, child) if SortOrder.orderingSatisfies(child.outputOrdering, orders) => child case s @ Sort(_, _, child) => s.copy(child = recursiveRemoveSort(child)) +case j @ Join(originLeft, originRight, _, cond, _) if cond.forall(_.deterministic) => + j.copy(left = recursiveRemoveSort(originLeft), right = recursiveRemoveSort(originRight)) +case g @ Aggregate(_, aggs, originChild) if isOrderIrrelevantAggs(aggs) => + g.copy(child = recursiveRemoveSort(originChild)) } - def recursiveRemoveSort(plan: LogicalPlan): LogicalPlan = plan match { + private def recursiveRemoveSort(plan: LogicalPlan): LogicalPlan = plan match { case Sort(_, _, child) => recursiveRemoveSort(child) case other if canEliminateSort(other) => other.withNewChildren(other.children.map(recursiveRemoveSort)) case _ => plan } - def canEliminateSort(plan: LogicalPlan): Boolean = plan match { + private def canEliminateSort(plan: LogicalPlan): Boolean = plan match { case p: Project => p.projectList.forall(_.deterministic) case f: Filter => f.condition.deterministic case _ => false } + + private def isOrderIrrelevantAggs(aggs: Seq[NamedExpression]): Boolean = { +def isOrderIrrelevantAggFunction(func: AggregateFunction): Boolean = func match { + case _: Sum => true + case _: Min => true + case _: Max => true + case _: Count => true + case _: Average => true Review comment: cc @maryannxue @cloud-fan This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. 
For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] cloud-fan closed pull request #26097: [SPARK-29421][SQL] Supporting Create Table Like Using Provider
cloud-fan closed pull request #26097: [SPARK-29421][SQL] Supporting Create Table Like Using Provider URL: https://github.com/apache/spark/pull/26097 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] yaooqinn commented on a change in pull request #26449: [SPARK-29822][SQL] Fix cast error when there are white spaces between signs and values
yaooqinn commented on a change in pull request #26449: [SPARK-29822][SQL] Fix cast error when there are white spaces between signs and values URL: https://github.com/apache/spark/pull/26449#discussion_r344583580 ## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/IntervalUtils.scala ## @@ -425,11 +425,15 @@ object IntervalUtils { } private object ParseState extends Enumeration { +type ParseState = Value + val PREFIX, BEGIN_VALUE, Review comment: or `NEXT_VALUE_UNIT`? `NEXT_VALUE_UNIT_PAIR` This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] peter-toth commented on issue #26044: [SPARK-29375][SQL] Exchange reuse across all subquery levels
peter-toth commented on issue #26044: [SPARK-29375][SQL] Exchange reuse across all subquery levels URL: https://github.com/apache/spark/pull/26044#issuecomment-552324810 retest this please This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] cloud-fan commented on issue #26097: [SPARK-29421][SQL] Supporting Create Table Like Using Provider
cloud-fan commented on issue #26097: [SPARK-29421][SQL] Supporting Create Table Like Using Provider URL: https://github.com/apache/spark/pull/26097#issuecomment-552324944 thanks, merging to master! This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] gatorsmile commented on a change in pull request #26011: [SPARK-29343][SQL] Eliminate sorts without limit in the subquery of Join/Aggregation
gatorsmile commented on a change in pull request #26011: [SPARK-29343][SQL] Eliminate sorts without limit in the subquery of Join/Aggregation URL: https://github.com/apache/spark/pull/26011#discussion_r344583899 ## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala ## @@ -953,40 +952,62 @@ object CombineFilters extends Rule[LogicalPlan] with PredicateHelper { } /** - * Removes no-op SortOrder from Sort + * Removes Sort operation. This can happen: + * 1) if the sort order is empty or the sort order does not have any reference + * 2) if the child is already sorted + * 3) if there is another Sort operator separated by 0...n Project/Filter operators + * 4) if the Sort operator is within Join separated by 0...n Project/Filter operators only, + *and the Join conditions is deterministic + * 5) if the Sort operator is within GroupBy separated by 0...n Project/Filter operators only, + *and the aggregate function is order irrelevant */ object EliminateSorts extends Rule[LogicalPlan] { def apply(plan: LogicalPlan): LogicalPlan = plan transform { case s @ Sort(orders, _, child) if orders.isEmpty || orders.exists(_.child.foldable) => val newOrders = orders.filterNot(_.child.foldable) if (newOrders.isEmpty) child else s.copy(order = newOrders) - } -} - -/** - * Removes redundant Sort operation. 
This can happen: - * 1) if the child is already sorted - * 2) if there is another Sort operator separated by 0...n Project/Filter operators - */ -object RemoveRedundantSorts extends Rule[LogicalPlan] { - def apply(plan: LogicalPlan): LogicalPlan = plan transformDown { case Sort(orders, true, child) if SortOrder.orderingSatisfies(child.outputOrdering, orders) => child case s @ Sort(_, _, child) => s.copy(child = recursiveRemoveSort(child)) +case j @ Join(originLeft, originRight, _, cond, _) if cond.forall(_.deterministic) => + j.copy(left = recursiveRemoveSort(originLeft), right = recursiveRemoveSort(originRight)) +case g @ Aggregate(_, aggs, originChild) if isOrderIrrelevantAggs(aggs) => + g.copy(child = recursiveRemoveSort(originChild)) } - def recursiveRemoveSort(plan: LogicalPlan): LogicalPlan = plan match { + private def recursiveRemoveSort(plan: LogicalPlan): LogicalPlan = plan match { case Sort(_, _, child) => recursiveRemoveSort(child) case other if canEliminateSort(other) => other.withNewChildren(other.children.map(recursiveRemoveSort)) case _ => plan } - def canEliminateSort(plan: LogicalPlan): Boolean = plan match { + private def canEliminateSort(plan: LogicalPlan): Boolean = plan match { case p: Project => p.projectList.forall(_.deterministic) case f: Filter => f.condition.deterministic case _ => false } + + private def isOrderIrrelevantAggs(aggs: Seq[NamedExpression]): Boolean = { +def isOrderIrrelevantAggFunction(func: AggregateFunction): Boolean = func match { + case _: Sum => true + case _: Min => true + case _: Max => true + case _: Count => true + case _: Average => true Review comment: We could still see a precision difference for floating point data types after eliminating the sort. I am afraid some end users might prefer adding a sort in these cases to ensure the results are consistent. This is an automated message from the Apache Git Service. 
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
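The precision concern raised in this review thread comes from floating-point addition not being associative, so the order in which rows reach a sum or average can change the result. A minimal sketch in plain Scala (not Spark code; `FloatOrderDemo`, `sumLeftToRight`, and `avg` are hypothetical names used only for illustration):

```scala
object FloatOrderDemo {
  // Sums strictly left to right, so the result depends on the order of `xs`.
  def sumLeftToRight(xs: Seq[Double]): Double = xs.foldLeft(0.0)(_ + _)

  // Average built on the order-dependent sum above.
  def avg(xs: Seq[Double]): Double = sumLeftToRight(xs) / xs.length

  def main(args: Array[String]): Unit = {
    val rows = Seq(1e16, 1.0, -1e16)       // 1.0 is absorbed: 1e16 + 1.0 rounds back to 1e16
    val reordered = Seq(1e16, -1e16, 1.0)  // the large values cancel first, so 1.0 survives
    println(sumLeftToRight(rows))          // 0.0
    println(sumLeftToRight(reordered))     // 1.0
    println(avg(rows) == avg(reordered))   // false
  }
}
```

The same rows give different sums, and therefore different averages, depending only on their order. This is the sense in which `Sum` and `Average` over floating-point columns are order sensitive, while `Min`, `Max`, and `Count` are not.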
[GitHub] [spark] yaooqinn commented on a change in pull request #26449: [SPARK-29822][SQL] Fix cast error when there are white spaces between signs and values
yaooqinn commented on a change in pull request #26449: [SPARK-29822][SQL] Fix cast error when there are white spaces between signs and values URL: https://github.com/apache/spark/pull/26449#discussion_r344583580 ## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/IntervalUtils.scala ## @@ -425,11 +425,15 @@ object IntervalUtils { } private object ParseState extends Enumeration { +type ParseState = Value + val PREFIX, BEGIN_VALUE, Review comment: or `NEXT_VALUE_UNIT`? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] AngersZhuuuu commented on issue #26437: [SPARK-29800][SQL] Plan Exists 's subquery in PlanSubqueries
AngersZhuuuu commented on issue #26437: [SPARK-29800][SQL] Plan Exists 's subquery in PlanSubqueries URL: https://github.com/apache/spark/pull/26437#issuecomment-552323985 > some more thoughts: I think we can rewrite non-correlated EXISTS subquery to a non-correlated scalar subquery. > > e.g. `SELECT ... FROM t1 WHERE (SELECT ...)` can be converted to `SELECT ... FROM t1 WHERE (SELECT count(*) FROM (SELECT ...)) = 0`. Then we don't need to create a new expression. It's OK to do it this way, but it adds one more shuffle for `count()`. If the subquery result is huge, that is a big cost. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] dilipbiswal commented on a change in pull request #26437: [SPARK-29800][SQL] Plan Exists 's subquery in PlanSubqueries
dilipbiswal commented on a change in pull request #26437: [SPARK-29800][SQL] Plan Exists 's subquery in PlanSubqueries URL: https://github.com/apache/spark/pull/26437#discussion_r344582313 ## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/subquery.scala ## @@ -106,12 +106,20 @@ object RewritePredicateSubquery extends Rule[LogicalPlan] with PredicateHelper { // Filter the plan by applying left semi and left anti joins. withSubquery.foldLeft(newFilter) { -case (p, Exists(sub, conditions, _)) => - val (joinCond, outerPlan) = rewriteExistentialExpr(conditions, p) - buildJoin(outerPlan, sub, LeftSemi, joinCond) -case (p, Not(Exists(sub, conditions, _))) => - val (joinCond, outerPlan) = rewriteExistentialExpr(conditions, p) - buildJoin(outerPlan, sub, LeftAnti, joinCond) +case (p, exists @ Exists(sub, conditions, _)) => + if (SubqueryExpression.hasCorrelatedSubquery(exists)) { +val (joinCond, outerPlan) = rewriteExistentialExpr(conditions, p) +buildJoin(outerPlan, sub, LeftSemi, joinCond) + } else { +Filter(exists, newFilter) + } +case (p, Not(exists @ Exists(sub, conditions, _))) => + if (SubqueryExpression.hasCorrelatedSubquery(exists)) { +val (joinCond, outerPlan) = rewriteExistentialExpr(conditions, p) +buildJoin(outerPlan, sub, LeftAnti, joinCond) + } else { +Filter(Not(exists), newFilter) + } Review comment: @AngersZhuuuu I discussed this with Wenchen. Do you think we can safely inject a "LIMIT 1" into our subplan to expedite its execution? Please let us know what you think. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] dilipbiswal commented on a change in pull request #26437: [SPARK-29800][SQL] Plan Exists 's subquery in PlanSubqueries
dilipbiswal commented on a change in pull request #26437: [SPARK-29800][SQL] Plan Exists 's subquery in PlanSubqueries URL: https://github.com/apache/spark/pull/26437#discussion_r344582313 ## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/subquery.scala ## @@ -106,12 +106,20 @@ object RewritePredicateSubquery extends Rule[LogicalPlan] with PredicateHelper { // Filter the plan by applying left semi and left anti joins. withSubquery.foldLeft(newFilter) { -case (p, Exists(sub, conditions, _)) => - val (joinCond, outerPlan) = rewriteExistentialExpr(conditions, p) - buildJoin(outerPlan, sub, LeftSemi, joinCond) -case (p, Not(Exists(sub, conditions, _))) => - val (joinCond, outerPlan) = rewriteExistentialExpr(conditions, p) - buildJoin(outerPlan, sub, LeftAnti, joinCond) +case (p, exists @ Exists(sub, conditions, _)) => + if (SubqueryExpression.hasCorrelatedSubquery(exists)) { +val (joinCond, outerPlan) = rewriteExistentialExpr(conditions, p) +buildJoin(outerPlan, sub, LeftSemi, joinCond) + } else { +Filter(exists, newFilter) + } +case (p, Not(exists @ Exists(sub, conditions, _))) => + if (SubqueryExpression.hasCorrelatedSubquery(exists)) { +val (joinCond, outerPlan) = rewriteExistentialExpr(conditions, p) +buildJoin(outerPlan, sub, LeftAnti, joinCond) + } else { +Filter(Not(exists), newFilter) + } Review comment: @AngersZhuuuu I discussed this with Wenchen briefly. Do you think we can safely inject a "LIMIT 1" into our subplan to expedite its execution? Please let us know what you think. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
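The intuition behind the "LIMIT 1" suggestion, and behind the cost concern about a `count()`-based rewrite, can be sketched with plain Scala iterators. This is only an analogy under stated assumptions, not Spark's planner: `ExistsSketch`, `CountingRows`, `existsByCount`, and `existsByLimit1` are hypothetical names, and the iterator stands in for the rows a non-correlated subquery would produce.

```scala
object ExistsSketch {
  // Hypothetical stand-in for a subquery result; `pulled` records how many rows were produced.
  final class CountingRows(n: Int) {
    var pulled = 0
    def rows: Iterator[Int] = Iterator.range(0, n).map { r => pulled += 1; r }
  }

  // Existence decided by counting: every row has to be produced first.
  def existsByCount(src: CountingRows): Boolean = src.rows.size > 0

  // Existence decided after a "LIMIT 1": at most one row is produced.
  def existsByLimit1(src: CountingRows): Boolean = src.rows.take(1).toList.nonEmpty
}
```

Both functions return the same answer for any input, but the counting variant consumes the whole result while the limited variant stops after the first row. In a distributed engine the difference would also involve shuffle cost, which this sketch does not model.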
[GitHub] [spark] yaooqinn commented on a change in pull request #26449: [SPARK-29822][SQL] Fix cast error when there are white spaces between signs and values
yaooqinn commented on a change in pull request #26449: [SPARK-29822][SQL] Fix cast error when there are white spaces between signs and values
URL: https://github.com/apache/spark/pull/26449#discussion_r344581845

## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/IntervalUtils.scala

@@ -425,11 +425,15 @@ object IntervalUtils {
   }

   private object ParseState extends Enumeration {
+    type ParseState = Value
+
     val PREFIX,
         BEGIN_VALUE,

Review comment:
interval (PREFIX, BODY)
BODY (SIGN, VALUE, UNIT)+
Seems not persuasive enough.
[GitHub] [spark] cloud-fan commented on issue #26358: [SPARK-29712][SQL] Take into account the left bound in `fromDayTimeString()`
cloud-fan commented on issue #26358: [SPARK-29712][SQL] Take into account the left bound in `fromDayTimeString()` URL: https://github.com/apache/spark/pull/26358#issuecomment-552322442 So let's keep the existing behavior if the pgsql dialect is set, and fail for invalid bounds otherwise.
[GitHub] [spark] AmplabJenkins removed a comment on issue #26219: [SPARK-29563][SQL] CREATE TABLE LIKE should look up catalog/table like v2 commands
AmplabJenkins removed a comment on issue #26219: [SPARK-29563][SQL] CREATE TABLE LIKE should look up catalog/table like v2 commands URL: https://github.com/apache/spark/pull/26219#issuecomment-552322134 Merged build finished. Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #26219: [SPARK-29563][SQL] CREATE TABLE LIKE should look up catalog/table like v2 commands
AmplabJenkins commented on issue #26219: [SPARK-29563][SQL] CREATE TABLE LIKE should look up catalog/table like v2 commands URL: https://github.com/apache/spark/pull/26219#issuecomment-552322137 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/113563/ Test PASSed.
[GitHub] [spark] AmplabJenkins removed a comment on issue #26219: [SPARK-29563][SQL] CREATE TABLE LIKE should look up catalog/table like v2 commands
AmplabJenkins removed a comment on issue #26219: [SPARK-29563][SQL] CREATE TABLE LIKE should look up catalog/table like v2 commands URL: https://github.com/apache/spark/pull/26219#issuecomment-552322137 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/113563/ Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #26219: [SPARK-29563][SQL] CREATE TABLE LIKE should look up catalog/table like v2 commands
AmplabJenkins commented on issue #26219: [SPARK-29563][SQL] CREATE TABLE LIKE should look up catalog/table like v2 commands URL: https://github.com/apache/spark/pull/26219#issuecomment-552322134 Merged build finished. Test PASSed.
[GitHub] [spark] SparkQA removed a comment on issue #26219: [SPARK-29563][SQL] CREATE TABLE LIKE should look up catalog/table like v2 commands
SparkQA removed a comment on issue #26219: [SPARK-29563][SQL] CREATE TABLE LIKE should look up catalog/table like v2 commands URL: https://github.com/apache/spark/pull/26219#issuecomment-552279012 **[Test build #113563 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/113563/testReport)** for PR 26219 at commit [`11ea11b`](https://github.com/apache/spark/commit/11ea11b80898f006ffe314f626b046a524a15ff6).
[GitHub] [spark] AmplabJenkins removed a comment on issue #26449: [SPARK-29822][SQL] Fix cast error when there are white spaces between signs and values
AmplabJenkins removed a comment on issue #26449: [SPARK-29822][SQL] Fix cast error when there are white spaces between signs and values URL: https://github.com/apache/spark/pull/26449#issuecomment-552321756 Merged build finished. Test PASSed.
[GitHub] [spark] AmplabJenkins removed a comment on issue #26449: [SPARK-29822][SQL] Fix cast error when there are white spaces between signs and values
AmplabJenkins removed a comment on issue #26449: [SPARK-29822][SQL] Fix cast error when there are white spaces between signs and values URL: https://github.com/apache/spark/pull/26449#issuecomment-552321764 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/18457/ Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #26449: [SPARK-29822][SQL] Fix cast error when there are white spaces between signs and values
AmplabJenkins commented on issue #26449: [SPARK-29822][SQL] Fix cast error when there are white spaces between signs and values URL: https://github.com/apache/spark/pull/26449#issuecomment-552321764 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/18457/ Test PASSed.
[GitHub] [spark] SparkQA commented on issue #26219: [SPARK-29563][SQL] CREATE TABLE LIKE should look up catalog/table like v2 commands
SparkQA commented on issue #26219: [SPARK-29563][SQL] CREATE TABLE LIKE should look up catalog/table like v2 commands URL: https://github.com/apache/spark/pull/26219#issuecomment-552321711 **[Test build #113563 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/113563/testReport)** for PR 26219 at commit [`11ea11b`](https://github.com/apache/spark/commit/11ea11b80898f006ffe314f626b046a524a15ff6).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] [spark] AmplabJenkins commented on issue #26449: [SPARK-29822][SQL] Fix cast error when there are white spaces between signs and values
AmplabJenkins commented on issue #26449: [SPARK-29822][SQL] Fix cast error when there are white spaces between signs and values URL: https://github.com/apache/spark/pull/26449#issuecomment-552321756 Merged build finished. Test PASSed.
[GitHub] [spark] SparkQA commented on issue #26449: [SPARK-29822][SQL] Fix cast error when there are white spaces between signs and values
SparkQA commented on issue #26449: [SPARK-29822][SQL] Fix cast error when there are white spaces between signs and values URL: https://github.com/apache/spark/pull/26449#issuecomment-552321400 **[Test build #113569 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/113569/testReport)** for PR 26449 at commit [`0a301a1`](https://github.com/apache/spark/commit/0a301a17fcc2e76a89f2f8ea1ecf21125b8b88f8).
[GitHub] [spark] cloud-fan commented on a change in pull request #26449: [SPARK-29822][SQL] Fix cast error when there are white spaces between signs and values
cloud-fan commented on a change in pull request #26449: [SPARK-29822][SQL] Fix cast error when there are white spaces between signs and values
URL: https://github.com/apache/spark/pull/26449#discussion_r344580453

## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/IntervalUtils.scala

@@ -425,11 +425,15 @@ object IntervalUtils {
   }

   private object ParseState extends Enumeration {
+    type ParseState = Value
+
     val PREFIX,
         BEGIN_VALUE,

Review comment: What does `BODY` mean? Everything else LGTM.
[GitHub] [spark] AmplabJenkins removed a comment on issue #25964: [SPARK-29287][Core] Add ExecutorConstructed message to tell driver which executor is ready for making offers
AmplabJenkins removed a comment on issue #25964: [SPARK-29287][Core] Add ExecutorConstructed message to tell driver which executor is ready for making offers URL: https://github.com/apache/spark/pull/25964#issuecomment-552320377 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/113567/ Test FAILed.
[GitHub] [spark] AmplabJenkins removed a comment on issue #25964: [SPARK-29287][Core] Add ExecutorConstructed message to tell driver which executor is ready for making offers
AmplabJenkins removed a comment on issue #25964: [SPARK-29287][Core] Add ExecutorConstructed message to tell driver which executor is ready for making offers URL: https://github.com/apache/spark/pull/25964#issuecomment-552320369 Merged build finished. Test FAILed.
[GitHub] [spark] AmplabJenkins commented on issue #25964: [SPARK-29287][Core] Add ExecutorConstructed message to tell driver which executor is ready for making offers
AmplabJenkins commented on issue #25964: [SPARK-29287][Core] Add ExecutorConstructed message to tell driver which executor is ready for making offers URL: https://github.com/apache/spark/pull/25964#issuecomment-552320377 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/113567/ Test FAILed.
[GitHub] [spark] AmplabJenkins commented on issue #25964: [SPARK-29287][Core] Add ExecutorConstructed message to tell driver which executor is ready for making offers
AmplabJenkins commented on issue #25964: [SPARK-29287][Core] Add ExecutorConstructed message to tell driver which executor is ready for making offers URL: https://github.com/apache/spark/pull/25964#issuecomment-552320369 Merged build finished. Test FAILed.
[GitHub] [spark] SparkQA removed a comment on issue #25964: [SPARK-29287][Core] Add ExecutorConstructed message to tell driver which executor is ready for making offers
SparkQA removed a comment on issue #25964: [SPARK-29287][Core] Add ExecutorConstructed message to tell driver which executor is ready for making offers URL: https://github.com/apache/spark/pull/25964#issuecomment-552296531 **[Test build #113567 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/113567/testReport)** for PR 25964 at commit [`aac0b00`](https://github.com/apache/spark/commit/aac0b00260374bb89c1006cdeabe1a55d8b4fb20).
[GitHub] [spark] SparkQA commented on issue #25964: [SPARK-29287][Core] Add ExecutorConstructed message to tell driver which executor is ready for making offers
SparkQA commented on issue #25964: [SPARK-29287][Core] Add ExecutorConstructed message to tell driver which executor is ready for making offers URL: https://github.com/apache/spark/pull/25964#issuecomment-552320153 **[Test build #113567 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/113567/testReport)** for PR 25964 at commit [`aac0b00`](https://github.com/apache/spark/commit/aac0b00260374bb89c1006cdeabe1a55d8b4fb20).
* This patch **fails Spark unit tests**.
* This patch merges cleanly.
* This patch adds the following public classes _(experimental)_:
  * `case class LaunchedExecutor(executorId: String) extends CoarseGrainedClusterMessage`
[GitHub] [spark] cloud-fan commented on issue #26437: [SPARK-29800][SQL] Plan Exists 's subquery in PlanSubqueries
cloud-fan commented on issue #26437: [SPARK-29800][SQL] Plan Exists 's subquery in PlanSubqueries URL: https://github.com/apache/spark/pull/26437#issuecomment-552319361 Some more thoughts: I think we can rewrite a non-correlated EXISTS subquery to a non-correlated scalar subquery, e.g. `SELECT ... FROM t1 WHERE EXISTS (SELECT ...)` can be converted to `SELECT ... FROM t1 WHERE (SELECT count(*) FROM (SELECT ...)) > 0`. Then we don't need to create a new expression.
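The equivalence behind this rewrite can be illustrated without Spark. For a non-correlated subquery, `EXISTS (sub)` has the same truth value as `(SELECT count(*) FROM sub) > 0` (a comparison with `= 0` would correspond to `NOT EXISTS`), and that value is computed once and then applied to every outer row. The following is a minimal sketch on in-memory collections; `ExistsAsScalarSketch` and its methods are hypothetical names for this example only.

```scala
object ExistsAsScalarSketch {
  // WHERE EXISTS (sub): every outer row is kept iff the subquery returns any row.
  def filterWithExists[A, B](t1: Seq[A], sub: Seq[B]): Seq[A] =
    t1.filter(_ => sub.nonEmpty)

  // WHERE (SELECT count(*) FROM sub) > 0: the scalar is evaluated once up front.
  def filterWithScalarCount[A, B](t1: Seq[A], sub: Seq[B]): Seq[A] = {
    val cnt: Long = sub.size.toLong
    t1.filter(_ => cnt > 0)
  }

  def main(args: Array[String]): Unit = {
    val t1 = Seq("a", "b", "c")
    println(filterWithExists(t1, Seq(1, 2)) == filterWithScalarCount(t1, Seq(1, 2)))
    println(filterWithExists(t1, Seq.empty[Int]) == filterWithScalarCount(t1, Seq.empty[Int]))
  }
}
```

Note that the count form, taken literally, scans the whole subquery result, so it trades the short-circuit behavior of an existence check for reuse of the existing scalar-subquery machinery.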
[GitHub] [spark] yaooqinn commented on a change in pull request #26449: [SPARK-29822][SQL] Fix cast error when there are white spaces between signs and values
yaooqinn commented on a change in pull request #26449: [SPARK-29822][SQL] Fix cast error when there are white spaces between signs and values
URL: https://github.com/apache/spark/pull/26449#discussion_r344578122

## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/IntervalUtils.scala

@@ -425,11 +425,15 @@ object IntervalUtils {
   }

   private object ParseState extends Enumeration {
+    type ParseState = Value
+
     val PREFIX,
         BEGIN_VALUE,

Review comment: So we may change `PARSE_UNIT_VALUE` to `PARSE_VALUE` too. How about renaming them all as follows?

```scala
val PREFIX,
    BODY,
    SIGN,
    TRIM_BEFORE_VALUE,
    VALUE,
    VALUE_FRACTIONAL_PART,
    TRIM_BEFORE_UNIT,
    UNIT_BEGIN,
    UNIT_SUFFIX,
    UNIT_END = Value
```

That seems clear enough for each state.
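To make the role of these state names concrete, here is a heavily simplified sketch of an enumeration-driven parser that tolerates whitespace between a sign and its value, which is the case this PR is about. It is not the parser in `IntervalUtils`: it uses only a subset of the proposed names (with a single `UNIT` state), handles integers only, ignores overflow, and does not validate unit names. `IntervalStateSketch` and `parse` are hypothetical.

```scala
object IntervalStateSketch {
  // A subset of the proposed state names; the real enumeration has more states.
  object ParseState extends Enumeration {
    type ParseState = Value
    val SIGN, TRIM_BEFORE_VALUE, VALUE, TRIM_BEFORE_UNIT, UNIT = Value
  }

  /** Parses strings such as "- 1 day +2 hours" into (signedValue, unit) pairs. */
  def parse(input: String): Option[Seq[(Long, String)]] = {
    import ParseState._
    val s = input.trim
    val out = Seq.newBuilder[(Long, String)]
    val unit = new StringBuilder
    var state: ParseState = SIGN
    var negative = false
    var value = 0L
    var i = 0
    def isAsciiDigit(c: Char): Boolean = c >= '0' && c <= '9'
    while (i < s.length) {
      val c = s.charAt(i)
      state match {
        case SIGN =>
          if (c.isWhitespace) i += 1 // extra separators between two units
          else {
            negative = c == '-'
            if (c == '-' || c == '+') i += 1 // the sign itself is optional
            value = 0L
            state = TRIM_BEFORE_VALUE
          }
        case TRIM_BEFORE_VALUE =>
          if (c.isWhitespace) i += 1 // whitespace between sign and value is allowed
          else if (isAsciiDigit(c)) state = VALUE
          else return None
        case VALUE =>
          if (isAsciiDigit(c)) { value = value * 10 + (c - '0'); i += 1 }
          else if (c.isWhitespace) state = TRIM_BEFORE_UNIT
          else return None
        case TRIM_BEFORE_UNIT =>
          if (c.isWhitespace) i += 1
          else if (c.isLetter) { unit.clear(); state = UNIT }
          else return None
        case UNIT =>
          if (c.isLetter) { unit.append(c); i += 1 }
          else if (c.isWhitespace) {
            out += ((if (negative) -value else value, unit.toString))
            i += 1
            state = SIGN
          } else return None
      }
    }
    // Input is only valid if it ends in the middle of (or right after) a unit name.
    if (state == UNIT) {
      out += ((if (negative) -value else value, unit.toString))
      Some(out.result())
    } else None
  }

  def main(args: Array[String]): Unit = {
    println(parse("- 1 day +2 hours"))
    println(parse("- day"))
  }
}
```

Giving the whitespace-skipping steps their own states (`TRIM_BEFORE_VALUE`, `TRIM_BEFORE_UNIT`) keeps each `case` to a single decision, which is the readability argument being made in the review thread.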
[GitHub] [spark] AmplabJenkins removed a comment on issue #26097: [SPARK-29421][SQL] Supporting Create Table Like Using Provider
AmplabJenkins removed a comment on issue #26097: [SPARK-29421][SQL] Supporting Create Table Like Using Provider URL: https://github.com/apache/spark/pull/26097#issuecomment-552318438 Merged build finished. Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #26097: [SPARK-29421][SQL] Supporting Create Table Like Using Provider
AmplabJenkins commented on issue #26097: [SPARK-29421][SQL] Supporting Create Table Like Using Provider URL: https://github.com/apache/spark/pull/26097#issuecomment-552318438 Merged build finished. Test PASSed.
[GitHub] [spark] AmplabJenkins removed a comment on issue #26097: [SPARK-29421][SQL] Supporting Create Table Like Using Provider
AmplabJenkins removed a comment on issue #26097: [SPARK-29421][SQL] Supporting Create Table Like Using Provider URL: https://github.com/apache/spark/pull/26097#issuecomment-552318446 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/113561/ Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #26097: [SPARK-29421][SQL] Supporting Create Table Like Using Provider
AmplabJenkins commented on issue #26097: [SPARK-29421][SQL] Supporting Create Table Like Using Provider URL: https://github.com/apache/spark/pull/26097#issuecomment-552318446 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/113561/ Test PASSed.
[GitHub] [spark] SparkQA removed a comment on issue #26097: [SPARK-29421][SQL] Supporting Create Table Like Using Provider
SparkQA removed a comment on issue #26097: [SPARK-29421][SQL] Supporting Create Table Like Using Provider URL: https://github.com/apache/spark/pull/26097#issuecomment-552274572 **[Test build #113561 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/113561/testReport)** for PR 26097 at commit [`0ae26a6`](https://github.com/apache/spark/commit/0ae26a627060c576d9daea23bd2eb17e4ec81b55).
[GitHub] [spark] yaooqinn commented on a change in pull request #26449: [SPARK-29822][SQL] Fix cast error when there are white spaces between signs and values
yaooqinn commented on a change in pull request #26449: [SPARK-29822][SQL] Fix cast error when there are white spaces between signs and values
URL: https://github.com/apache/spark/pull/26449#discussion_r344578122

## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/IntervalUtils.scala

@@ -425,11 +425,15 @@ object IntervalUtils {
   }

   private object ParseState extends Enumeration {
+    type ParseState = Value
+
     val PREFIX,
         BEGIN_VALUE,

Review comment: So we may change `PARSE_UNIT_VALUE` to `PARSE_VALUE` too. How about renaming them all as follows?

```scala
val PREFIX,
    BODY,
    SIGN,
    TRIM_BEFORE_VALUE,
    VALUE,
    VALUE_FRACTIONAL_PART,
    TRIM_BEFORE_UNIT,
    BEGIN_UNIT,
    UNIT_SUFFIX,
    END_UNIT = Value
```

That seems clear enough for each state.
[GitHub] [spark] SparkQA commented on issue #26097: [SPARK-29421][SQL] Supporting Create Table Like Using Provider
SparkQA commented on issue #26097: [SPARK-29421][SQL] Supporting Create Table Like Using Provider URL: https://github.com/apache/spark/pull/26097#issuecomment-552318069 **[Test build #113561 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/113561/testReport)** for PR 26097 at commit [`0ae26a6`](https://github.com/apache/spark/commit/0ae26a627060c576d9daea23bd2eb17e4ec81b55).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] [spark] cloud-fan commented on a change in pull request #26312: [SPARK-29649][SQL] Stop task set if FileAlreadyExistsException was thrown when writing to output file
cloud-fan commented on a change in pull request #26312: [SPARK-29649][SQL] Stop task set if FileAlreadyExistsException was thrown when writing to output file
URL: https://github.com/apache/spark/pull/26312#discussion_r344577012

## File path: core/src/main/scala/org/apache/spark/TaskOutputFileAlreadyExistException.scala

@@ -0,0 +1,23 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark
+
+/**
+ * Exception thrown when a task cannot write to output file due to the file already exists.
+ */
+private[spark] class TaskOutputFileAlreadyExistException(error: Throwable) extends Exception(error)

Review comment: Shall we extend `SparkException`?
[GitHub] [spark] rednaxelafx commented on issue #26420: [SPARK-27986][SQL] Support ANSI SQL filter predicate for aggregate expression.
rednaxelafx commented on issue #26420: [SPARK-27986][SQL] Support ANSI SQL filter predicate for aggregate expression.
URL: https://github.com/apache/spark/pull/26420#issuecomment-552315759

I'd like to propose a solution for the codegen part that'll augment this PR. The overall direction this PR is taking sounds good to me, although I haven't reviewed the full details yet (I'd like to do that some time this week).

I'll prepare a separate PR for demo purposes to show how it'll augment the codegen part. It's actually fairly easy and could also serve as a bit of code cleanup for a lot of the declarative aggregate functions.

The tl;dr is that I'd like to have explicit support for the user-specified filter clause in the infrastructure, instead of relying solely on a rewrite. A lot of aggregate functions are null-skipping by nature, e.g. `count()`, `sum()`, `avg()`, etc. But that's not a property common to ALL possible aggregate functions, and some of them have interesting semantics, like `first()`/`last()`, where you can configure whether to include nulls in the result or skip them and take only the non-null values.

Having explicit support for the filter clause in the infrastructure ensures that we can properly support this feature, without having to rely on a logical rewrite that might work for most aggregate functions while a handful of exception cases have to be implemented in really ugly ways.
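The concern about null-skipping can be shown with a toy model. Assume the logical rewrite turns `agg(x) FILTER (WHERE p)` into `agg(CASE WHEN p THEN x END)`, so that rows failing the predicate become nulls. That is harmless for `count`, which skips nulls anyway, but it changes the answer for a `first` that is configured to keep nulls. The sketch below uses `Option[Int]` for nullable values; `FilterClauseSketch` and its methods are hypothetical and only model the semantics, not Spark's implementation.

```scala
object FilterClauseSketch {
  // (value, whether the FILTER predicate holds for this row)
  type Row = (Option[Int], Boolean)

  // count(x): null-skipping by definition.
  def count(xs: Seq[Option[Int]]): Long = xs.count(_.isDefined).toLong

  // first(x): with ignoreNulls = false the first row wins even if it is null.
  def first(xs: Seq[Option[Int]], ignoreNulls: Boolean): Option[Int] =
    if (ignoreNulls) xs.flatten.headOption else xs.headOption.flatten

  // Explicit FILTER support: rows are dropped before they reach the aggregate.
  def filtered(rows: Seq[Row]): Seq[Option[Int]] =
    rows.collect { case (v, true) => v }

  // Assumed logical rewrite: rows failing the predicate are replaced by null.
  def rewritten(rows: Seq[Row]): Seq[Option[Int]] =
    rows.map { case (v, p) => if (p) v else None }

  def main(args: Array[String]): Unit = {
    val rows: Seq[Row] = Seq((Some(10), false), (Some(20), true))
    println(s"count: ${count(filtered(rows))} vs ${count(rewritten(rows))}")
    println(s"first: ${first(filtered(rows), false)} vs ${first(rewritten(rows), false)}")
  }
}
```

In this model the two strategies agree for `count` and for `first` with `ignoreNulls = true`, and disagree for `first` with `ignoreNulls = false`, because the null injected by the rewrite is indistinguishable from a genuine null input.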
[GitHub] [spark] gengliangwang commented on a change in pull request #26444: [SPARK-29807][SQL] Rename "spark.sql.ansi.enabled" to "spark.sql.dialect.spark.ansi.enabled"
gengliangwang commented on a change in pull request #26444: [SPARK-29807][SQL] Rename "spark.sql.ansi.enabled" to "spark.sql.dialect.spark.ansi.enabled" URL: https://github.com/apache/spark/pull/26444#discussion_r344573615 ## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala ## @@ -1655,6 +1655,14 @@ object SQLConf { .checkValues(Dialect.values.map(_.toString)) .createWithDefault(Dialect.SPARK.toString) + val DIALECT_SPARK_ANSI_ENABLED = buildConf("spark.sql.dialect.spark.ansi.enabled") Review comment: +1 with @HyukjinKwon. `spark.sql.dialect.spark.ansi.enabled` is clearer, considering that we have another option, `spark.sql.dialect`, which has two choices: `spark` and `postgresql`.
[GitHub] [spark] cloud-fan commented on a change in pull request #26449: [SPARK-29822][SQL] Fix cast error when there are white spaces between signs and values
cloud-fan commented on a change in pull request #26449: [SPARK-29822][SQL] Fix cast error when there are white spaces between signs and values URL: https://github.com/apache/spark/pull/26449#discussion_r344573576 ## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/IntervalUtils.scala ## @@ -425,11 +425,15 @@ object IntervalUtils { } private object ParseState extends Enumeration { +type ParseState = Value + val PREFIX, BEGIN_VALUE, Review comment: we can rename it to `TRIM_BEFORE_PARSE_VALUE`
[GitHub] [spark] cloud-fan commented on a change in pull request #26449: [SPARK-29822][SQL] Fix cast error when there are white spaces between signs and values
cloud-fan commented on a change in pull request #26449: [SPARK-29822][SQL] Fix cast error when there are white spaces between signs and values URL: https://github.com/apache/spark/pull/26449#discussion_r344573164 ## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/IntervalUtils.scala ## @@ -425,11 +425,15 @@ object IntervalUtils { } private object ParseState extends Enumeration { +type ParseState = Value + val PREFIX, BEGIN_VALUE, PARSE_SIGN, +TRIM_VALUE, Review comment: can we make the name clearer? e.g. `TRIM_BEFORE_PARSE_UNIT_VALUE`
[GitHub] [spark] cloud-fan commented on a change in pull request #26449: [SPARK-29822][SQL] Fix cast error when there are white spaces between signs and values
cloud-fan commented on a change in pull request #26449: [SPARK-29822][SQL] Fix cast error when there are white spaces between signs and values URL: https://github.com/apache/spark/pull/26449#discussion_r344573164 ## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/IntervalUtils.scala ## @@ -425,11 +425,15 @@ object IntervalUtils { } private object ParseState extends Enumeration { +type ParseState = Value + val PREFIX, BEGIN_VALUE, PARSE_SIGN, +TRIM_VALUE, Review comment: can we make the name clearer? e.g. `TRIM_BEFORE_PARSE_SIGN`
[GitHub] [spark] cloud-fan commented on a change in pull request #26449: [SPARK-29822][SQL] Fix cast error when there are white spaces between signs and values
cloud-fan commented on a change in pull request #26449: [SPARK-29822][SQL] Fix cast error when there are white spaces between signs and values URL: https://github.com/apache/spark/pull/26449#discussion_r344573164 ## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/IntervalUtils.scala ## @@ -425,11 +425,15 @@ object IntervalUtils { } private object ParseState extends Enumeration { +type ParseState = Value + val PREFIX, BEGIN_VALUE, PARSE_SIGN, +TRIM_VALUE, Review comment: can we make the name clearer? e.g. `TRIM_BEFORE_UNIT_VALUE`
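For readers without the PR open, the naming discussion above concerns states of a hand-written parser. The following is a minimal sketch of that general technique, an `Enumeration`-driven state machine with a dedicated state that trims whitespace between a sign and its value. It is not the code in `IntervalUtils`: `SignTrimSketch`, `parseSigned`, and the state names `TRIM_BEFORE_VALUE` and `PARSE_VALUE` are hypothetical, and only `ParseState`, `type ParseState = Value`, and `PARSE_SIGN` are taken from the quoted diff.

```scala
object SignTrimSketch {
  private object ParseState extends Enumeration {
    type ParseState = Value
    // PARSE_SIGN appears in the quoted diff; the other two names are illustrative.
    val PARSE_SIGN, TRIM_BEFORE_VALUE, PARSE_VALUE = Value
  }
  import ParseState._

  /** Parses inputs such as "-  12" or "+7"; returns None for malformed input. */
  def parseSigned(s: String): Option[Long] = {
    var state: ParseState = PARSE_SIGN
    var negative = false
    var value = 0L
    var seenDigit = false
    var ok = true
    var i = 0
    while (ok && i < s.length) {
      val c = s.charAt(i)
      state match {
        case PARSE_SIGN =>
          // Consume an optional sign, then always move on to trimming.
          if (c == '-') { negative = true; i += 1 }
          else if (c == '+') { i += 1 }
          state = TRIM_BEFORE_VALUE
        case TRIM_BEFORE_VALUE =>
          // Skip whitespace; the first other character belongs to the value.
          if (Character.isWhitespace(c)) i += 1 else state = PARSE_VALUE
        case PARSE_VALUE =>
          if (c >= '0' && c <= '9') {
            value = value * 10 + (c - '0')
            seenDigit = true
            i += 1
          } else {
            ok = false // any non-digit here makes the input invalid
          }
      }
    }
    if (ok && seenDigit) Some(if (negative) -value else value) else None
  }
}
```

Under these assumptions, `SignTrimSketch.parseSigned("-  12")` yields `Some(-12)`, while `"- "` and `"-1x"` yield `None`. Giving the trimming step its own state is what makes a descriptive name along the lines of the reviewer's suggestions useful.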
[GitHub] [spark] AmplabJenkins removed a comment on issue #26449: [SPARK-29822][SQL] Fix cast error when there are white spaces between signs and values
AmplabJenkins removed a comment on issue #26449: [SPARK-29822][SQL] Fix cast error when there are white spaces between signs and values URL: https://github.com/apache/spark/pull/26449#issuecomment-552310293 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/18456/ Test PASSed.
[GitHub] [spark] AmplabJenkins removed a comment on issue #26449: [SPARK-29822][SQL] Fix cast error when there are white spaces between signs and values
AmplabJenkins removed a comment on issue #26449: [SPARK-29822][SQL] Fix cast error when there are white spaces between signs and values URL: https://github.com/apache/spark/pull/26449#issuecomment-552310291 Merged build finished. Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #26449: [SPARK-29822][SQL] Fix cast error when there are white spaces between signs and values
AmplabJenkins commented on issue #26449: [SPARK-29822][SQL] Fix cast error when there are white spaces between signs and values URL: https://github.com/apache/spark/pull/26449#issuecomment-552310291 Merged build finished. Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #26449: [SPARK-29822][SQL] Fix cast error when there are white spaces between signs and values
AmplabJenkins commented on issue #26449: [SPARK-29822][SQL] Fix cast error when there are white spaces between signs and values URL: https://github.com/apache/spark/pull/26449#issuecomment-552310293 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/18456/ Test PASSed.
[GitHub] [spark] SparkQA commented on issue #26449: [SPARK-29822][SQL] Fix cast error when there are white spaces between signs and values
SparkQA commented on issue #26449: [SPARK-29822][SQL] Fix cast error when there are white spaces between signs and values URL: https://github.com/apache/spark/pull/26449#issuecomment-552309842 **[Test build #113568 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/113568/testReport)** for PR 26449 at commit [`cb8aa6b`](https://github.com/apache/spark/commit/cb8aa6b78f191f5a40595a96f58de0fb31ff5b5b).
[GitHub] [spark] viirya commented on a change in pull request #26461: [SPARK-29831][SQL] Scan Hive partitioned table should not dramatically increase data parallelism
viirya commented on a change in pull request #26461: [SPARK-29831][SQL] Scan Hive partitioned table should not dramatically increase data parallelism URL: https://github.com/apache/spark/pull/26461#discussion_r344570879 ## File path: sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveUtils.scala ## @@ -155,6 +155,16 @@ private[spark] object HiveUtils extends Logging { .booleanConf .createWithDefault(true) + val HIVE_TABLE_SCAN_MAX_PARALLELISM = buildConf("spark.sql.hive.tableScan.maxParallelism") Review comment: I think we can get the split size and the total number of splits of a Hadoop RDD.
[GitHub] [spark] cloud-fan commented on a change in pull request #26437: [SPARK-29800][SQL] Plan Exists 's subquery in PlanSubqueries
cloud-fan commented on a change in pull request #26437: [SPARK-29800][SQL] Plan Exists 's subquery in PlanSubqueries URL: https://github.com/apache/spark/pull/26437#discussion_r344568476
## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/subquery.scala ##
@@ -171,6 +171,63 @@ case class InSubqueryExec(
   }
 }

+/**
+ * The physical node of exists-subquery. This is for support use exists in join's on condition,
+ * since some join type we can't pushdown exists condition, we plan it here
+ */
+case class ExistsExec(child: Expression,
+                      subQuery: String,
+                      plan: BaseSubqueryExec,
+                      exprId: ExprId,
+                      private var resultBroadcast: Broadcast[Boolean] = null)
+  extends ExecSubqueryExpression {
+
+  @transient private var result: Boolean = _
+
+  override def dataType: DataType = BooleanType
+  override def children: Seq[Expression] = child :: Nil
+  override def nullable: Boolean = child.nullable
+  override def toString: String = s"EXISTS ${plan.name}"
+  override def withNewPlan(plan: BaseSubqueryExec): ExistsExec = copy(plan = plan)
+
+  override def semanticEquals(other: Expression): Boolean = other match {
+    case in: ExistsExec => child.semanticEquals(in.child) && plan.sameResult(in.plan)
+    case _ => false
+  }
+
+
+  def updateResult(): Unit = {
+    result = !plan.execute().isEmpty()
+    resultBroadcast = plan.sqlContext.sparkContext.broadcast[Boolean](result)
+  }
+
+  def values(): Option[Boolean] = Option(resultBroadcast).map(_.value)
+
+  private def prepareResult(): Unit = {
+    require(resultBroadcast != null, s"$this has not finished")
+    result = resultBroadcast.value
+  }
+
+  override def eval(input: InternalRow): Any = {
+    prepareResult()
+    result
+  }
+
+  override lazy val canonicalized: ExistsExec = {
+    copy(
+      child = child.canonicalized,
+      subQuery = subQuery,
+      plan = plan.canonicalized.asInstanceOf[BaseSubqueryExec],
+      exprId = ExprId(0),
+      resultBroadcast = null)
+  }
+
+  override def doGenCode(ctx: CodegenContext, ev: ExprCode): ExprCode = {
+    prepareResult()
+    ExistsSubquery(child, subQuery, result).doGenCode(ctx, ev)

Review comment: We don't have to extend `UnaryExpression` and we can still implement codegen, right?