[GitHub] [spark] yaooqinn commented on a change in pull request #26465: [SPARK-29390][SQL] Add the justify_days(), justify_hours() and justif_interval() functions
yaooqinn commented on a change in pull request #26465: [SPARK-29390][SQL] Add the justify_days(), justify_hours() and justif_interval() functions URL: https://github.com/apache/spark/pull/26465#discussion_r345057918

File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/IntervalUtils.scala

@@ -637,4 +637,39 @@ object IntervalUtils {
     new CalendarInterval(totalMonths, totalDays, micros)
   }
+
+  /**
+   * Adjust interval so 30-day time periods are represented as months
+   */
+  def justifyDays(interval: CalendarInterval): CalendarInterval = {
+    val months = Math.addExact(interval.months, interval.days / DAYS_PER_MONTH.toInt)

Review comment: Good catch; we should calculate all the days first, just like the other two functions.

This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] AmplabJenkins removed a comment on issue #26425: [SPARK-29789][SQL] should not parse the bucket column name when creating v2 tables
AmplabJenkins removed a comment on issue #26425: [SPARK-29789][SQL] should not parse the bucket column name when creating v2 tables URL: https://github.com/apache/spark/pull/26425#issuecomment-552777507 Merged build finished. Test PASSed.
[GitHub] [spark] AmplabJenkins removed a comment on issue #26425: [SPARK-29789][SQL] should not parse the bucket column name when creating v2 tables
AmplabJenkins removed a comment on issue #26425: [SPARK-29789][SQL] should not parse the bucket column name when creating v2 tables URL: https://github.com/apache/spark/pull/26425#issuecomment-552777515 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/113610/ Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #26425: [SPARK-29789][SQL] should not parse the bucket column name when creating v2 tables
AmplabJenkins commented on issue #26425: [SPARK-29789][SQL] should not parse the bucket column name when creating v2 tables URL: https://github.com/apache/spark/pull/26425#issuecomment-552777515 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/113610/ Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #26425: [SPARK-29789][SQL] should not parse the bucket column name when creating v2 tables
AmplabJenkins commented on issue #26425: [SPARK-29789][SQL] should not parse the bucket column name when creating v2 tables URL: https://github.com/apache/spark/pull/26425#issuecomment-552777507 Merged build finished. Test PASSed.
[GitHub] [spark] SparkQA commented on issue #26425: [SPARK-29789][SQL] should not parse the bucket column name when creating v2 tables
SparkQA commented on issue #26425: [SPARK-29789][SQL] should not parse the bucket column name when creating v2 tables URL: https://github.com/apache/spark/pull/26425#issuecomment-552776357

**[Test build #113610 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/113610/testReport)** for PR 26425 at commit [`550bf25`](https://github.com/apache/spark/commit/550bf25ac6f2638e99965f0a12ee57f72af1121c).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] [spark] SparkQA removed a comment on issue #26425: [SPARK-29789][SQL] should not parse the bucket column name when creating v2 tables
SparkQA removed a comment on issue #26425: [SPARK-29789][SQL] should not parse the bucket column name when creating v2 tables URL: https://github.com/apache/spark/pull/26425#issuecomment-552724928 **[Test build #113610 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/113610/testReport)** for PR 26425 at commit [`550bf25`](https://github.com/apache/spark/commit/550bf25ac6f2638e99965f0a12ee57f72af1121c).
[GitHub] [spark] cloud-fan commented on issue #26461: [SPARK-29831][SQL] Scan Hive partitioned table should not dramatically increase data parallelism
cloud-fan commented on issue #26461: [SPARK-29831][SQL] Scan Hive partitioned table should not dramatically increase data parallelism URL: https://github.com/apache/spark/pull/26461#issuecomment-552776318

> Maybe we can also rely on maxSplitBytes in Hive Scan and decide parallelism?

SGTM
[GitHub] [spark] cloud-fan commented on a change in pull request #26463: [SPARK-29837][SQL] PostgreSQL dialect: cast to boolean
cloud-fan commented on a change in pull request #26463: [SPARK-29837][SQL] PostgreSQL dialect: cast to boolean URL: https://github.com/apache/spark/pull/26463#discussion_r345055743

File path: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/postgreSQL/CastSuite.scala

@@ -16,44 +16,57 @@
  */
 package org.apache.spark.sql.catalyst.expressions.postgreSQL
 
+import java.sql.{Date, Timestamp}
+
 import org.apache.spark.SparkFunSuite
 import org.apache.spark.sql.catalyst.expressions.{ExpressionEvalHelper, Literal}
 
 class CastSuite extends SparkFunSuite with ExpressionEvalHelper {
-  private def checkPostgreCastStringToBoolean(v: Any, expected: Any): Unit = {
-    checkEvaluation(PostgreCastStringToBoolean(Literal(v)), expected)
+  private def checkPostgreCastToBoolean(v: Any, expected: Any): Unit = {
+    checkEvaluation(PostgreCastToBoolean(Literal(v), None), expected)
   }
 
   test("cast string to boolean") {
-    checkPostgreCastStringToBoolean("true", true)
-    checkPostgreCastStringToBoolean("tru", true)
-    checkPostgreCastStringToBoolean("tr", true)
-    checkPostgreCastStringToBoolean("t", true)
-    checkPostgreCastStringToBoolean("tRUe", true)
-    checkPostgreCastStringToBoolean("tRue ", true)
-    checkPostgreCastStringToBoolean("tRu ", true)
-    checkPostgreCastStringToBoolean("yes", true)
-    checkPostgreCastStringToBoolean("ye", true)
-    checkPostgreCastStringToBoolean("y", true)
-    checkPostgreCastStringToBoolean("1", true)
-    checkPostgreCastStringToBoolean("on", true)
+    checkPostgreCastToBoolean("true", true)
+    checkPostgreCastToBoolean("tru", true)
+    checkPostgreCastToBoolean("tr", true)
+    checkPostgreCastToBoolean("t", true)
+    checkPostgreCastToBoolean("tRUe", true)
+    checkPostgreCastToBoolean("tRue ", true)
+    checkPostgreCastToBoolean("tRu ", true)
+    checkPostgreCastToBoolean("yes", true)
+    checkPostgreCastToBoolean("ye", true)
+    checkPostgreCastToBoolean("y", true)
+    checkPostgreCastToBoolean("1", true)
+    checkPostgreCastToBoolean("on", true)
+
+    checkPostgreCastToBoolean("false", false)
+    checkPostgreCastToBoolean("fals", false)
+    checkPostgreCastToBoolean("fal", false)
+    checkPostgreCastToBoolean("fa", false)
+    checkPostgreCastToBoolean("f", false)
+    checkPostgreCastToBoolean("fAlse", false)
+    checkPostgreCastToBoolean("fAls", false)
+    checkPostgreCastToBoolean("FAlsE", false)
+    checkPostgreCastToBoolean("no", false)
+    checkPostgreCastToBoolean("n", false)
+    checkPostgreCastToBoolean("0", false)
+    checkPostgreCastToBoolean("off", false)
+    checkPostgreCastToBoolean("of", false)
-    checkPostgreCastStringToBoolean("false", false)
-    checkPostgreCastStringToBoolean("fals", false)
-    checkPostgreCastStringToBoolean("fal", false)
-    checkPostgreCastStringToBoolean("fa", false)
-    checkPostgreCastStringToBoolean("f", false)
-    checkPostgreCastStringToBoolean("fAlse", false)
-    checkPostgreCastStringToBoolean("fAls", false)
-    checkPostgreCastStringToBoolean("FAlsE", false)
-    checkPostgreCastStringToBoolean("no", false)
-    checkPostgreCastStringToBoolean("n", false)
-    checkPostgreCastStringToBoolean("0", false)
-    checkPostgreCastStringToBoolean("off", false)
-    checkPostgreCastStringToBoolean("of", false)
+    intercept[Exception](checkPostgreCastToBoolean("o", null))
+    intercept[Exception](checkPostgreCastToBoolean("abc", null))
+    intercept[Exception](checkPostgreCastToBoolean("", null))
+  }
-    checkPostgreCastStringToBoolean("o", null)
-    checkPostgreCastStringToBoolean("abc", null)
-    checkPostgreCastStringToBoolean("", null)
+
+  test("unsupported data types to cast to boolean") {
+    intercept[Exception](checkPostgreCastToBoolean(new Timestamp(1), null))

Review comment: We expect to see `AnalysisException` for these cases.
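For context on the inputs exercised in the test above: PostgreSQL accepts case-insensitive, whitespace-trimmed, unambiguous prefixes of `true`/`false`/`yes`/`no`/`on`/`off`, plus `1` and `0`; `o` is rejected because it could start either `on` or `off`. The following Python sketch illustrates that acceptance rule, consistent with the quoted test cases; it is an illustration, not the Spark implementation.

```python
def parse_postgres_boolean(s: str) -> bool:
    """Parse a string as a boolean the way the quoted tests expect.

    A value is valid if, after trimming and lowercasing, it is a prefix
    of exactly one family of keywords ('true'/'yes'/'on'/'1' vs
    'false'/'no'/'off'/'0').  An ambiguous or unmatched prefix raises,
    mirroring the cast error in the tests.
    """
    v = s.strip().lower()
    if not v:
        raise ValueError(f"invalid input syntax for type boolean: {s!r}")
    true_words = ("true", "yes", "on", "1")
    false_words = ("false", "no", "off", "0")
    is_true = any(w.startswith(v) for w in true_words)
    is_false = any(w.startswith(v) for w in false_words)
    if is_true and not is_false:
        return True
    if is_false and not is_true:
        return False
    # 'o' reaches here: a prefix of both 'on' and 'off'.
    raise ValueError(f"invalid input syntax for type boolean: {s!r}")
```

Note how this explains the `intercept` cases: `"of"` uniquely prefixes `off` and is accepted, while `"o"` is ambiguous and rejected.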
[GitHub] [spark] cloud-fan commented on a change in pull request #26463: [SPARK-29837][SQL] PostgreSQL dialect: cast to boolean
cloud-fan commented on a change in pull request #26463: [SPARK-29837][SQL] PostgreSQL dialect: cast to boolean URL: https://github.com/apache/spark/pull/26463#discussion_r345055219

File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/postgreSQL/PostgreCastToBoolean.scala

@@ -0,0 +1,81 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.spark.sql.catalyst.expressions.postgreSQL
+
+import org.apache.spark.sql.catalyst.analysis.TypeCheckResult
+import org.apache.spark.sql.catalyst.expressions.{CastBase, Expression, TimeZoneAwareExpression}
+import org.apache.spark.sql.catalyst.expressions.codegen.Block._
+import org.apache.spark.sql.catalyst.util.postgreSQL.StringUtils
+import org.apache.spark.sql.internal.SQLConf
+import org.apache.spark.sql.types._
+import org.apache.spark.unsafe.types.UTF8String
+
+case class PostgreCastToBoolean(child: Expression, timeZoneId: Option[String])
+  extends CastBase {
+
+  override protected def ansiEnabled = SQLConf.get.ansiEnabled
+
+  override def withTimeZone(timeZoneId: String): TimeZoneAwareExpression =
+    copy(timeZoneId = Option(timeZoneId))
+
+  override def checkInputDataTypes(): TypeCheckResult = child.dataType match {
+    case StringType | IntegerType => TypeCheckResult.TypeCheckSuccess
+    case dt => throw new UnsupportedOperationException(s"cannot cast type $dt to boolean")

Review comment: We should follow other expressions and return `TypeCheckResult.TypeCheckFailure` here.
[GitHub] [spark] cloud-fan commented on a change in pull request #26463: [SPARK-29837][SQL] PostgreSQL dialect: cast to boolean
cloud-fan commented on a change in pull request #26463: [SPARK-29837][SQL] PostgreSQL dialect: cast to boolean URL: https://github.com/apache/spark/pull/26463#discussion_r345054980

File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/postgreSQL/PostgreCastToBoolean.scala

@@ -0,0 +1,81 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.spark.sql.catalyst.expressions.postgreSQL
+
+import org.apache.spark.sql.catalyst.analysis.TypeCheckResult
+import org.apache.spark.sql.catalyst.expressions.{CastBase, Expression, TimeZoneAwareExpression}
+import org.apache.spark.sql.catalyst.expressions.codegen.Block._
+import org.apache.spark.sql.catalyst.util.postgreSQL.StringUtils
+import org.apache.spark.sql.internal.SQLConf
+import org.apache.spark.sql.types._
+import org.apache.spark.unsafe.types.UTF8String
+
+case class PostgreCastToBoolean(child: Expression, timeZoneId: Option[String])
+  extends CastBase {
+
+  override protected def ansiEnabled = SQLConf.get.ansiEnabled

Review comment: To be safe, shall we throw an exception here? We don't expect this method to get called: the pgsql dialect should not be affected by the ansi mode config.
[GitHub] [spark] cloud-fan commented on a change in pull request #26463: [SPARK-29837][SQL] PostgreSQL dialect: cast to boolean
cloud-fan commented on a change in pull request #26463: [SPARK-29837][SQL] PostgreSQL dialect: cast to boolean URL: https://github.com/apache/spark/pull/26463#discussion_r345054707

File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/Cast.scala

@@ -273,7 +273,7 @@ abstract class CastBase extends UnaryExpression with TimeZoneAwareExpression wit
   private[this] def needsTimeZone: Boolean = Cast.needsTimeZone(child.dataType, dataType)
 
   // [[func]] assumes the input is no longer null because eval already does the null check.
-  @inline private[this] def buildCast[T](a: Any, func: T => Any): Any = func(a.asInstanceOf[T])
+  @inline protected[this] def buildCast[T](a: Any, func: T => Any): Any = func(a.asInstanceOf[T])

Review comment: nit: I don't think `[this]` buys us anything here. How about just using `protected`?
[GitHub] [spark] cloud-fan commented on a change in pull request #26437: [SPARK-29800][SQL] Plan non-correlated Exists 's subquery in PlanSubqueries
cloud-fan commented on a change in pull request #26437: [SPARK-29800][SQL] Plan non-correlated Exists 's subquery in PlanSubqueries URL: https://github.com/apache/spark/pull/26437#discussion_r345054174

File path: sql/core/src/main/scala/org/apache/spark/sql/execution/subquery.scala

@@ -194,6 +258,9 @@ case class PlanSubqueries(sparkSession: SparkSession) extends Rule[SparkPlan] {
       }
       val executedPlan = new QueryExecution(sparkSession, query).executedPlan
       InSubqueryExec(expr, SubqueryExec(s"subquery#${exprId.id}", executedPlan), exprId)
+    case expressions.Exists(sub, children, exprId) =>
+      val executedPlan = new QueryExecution(sparkSession, Project(Nil, sub)).executedPlan

Review comment: Can we add the Project a bit earlier? It seems weird to add the Project and do column pruning at planning time.
[GitHub] [spark] viirya commented on issue #26461: [SPARK-29831][SQL] Scan Hive partitioned table should not dramatically increase data parallelism
viirya commented on issue #26461: [SPARK-29831][SQL] Scan Hive partitioned table should not dramatically increase data parallelism URL: https://github.com/apache/spark/pull/26461#issuecomment-552774504

> spark.default.parallelism doesn't really affect data source scan AFAIK. We do have a similar problem to set the number of reducers and we solved it with the recent adaptive execution work.

Because the default parallelism affects `maxSplitBytes`: https://github.com/apache/spark/blob/053dd858d38e6107bc71e0aa3a4954291b74f8c8/sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/FilePartition.scala#L86-L95

IIUC, `spark.sql.files.maxPartitionBytes` and `spark.default.parallelism` both affect the split size used and therefore the final parallelism of the scan. Maybe we can also rely on `maxSplitBytes` in Hive Scan to decide parallelism?
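The `FilePartition.maxSplitBytes` logic linked above pads each file with the open cost, divides the total by the default parallelism, and clamps the result between the open cost and `spark.sql.files.maxPartitionBytes`. The following Python sketch approximates that calculation; the default values mirror Spark's documented config defaults and are illustrative only.

```python
def max_split_bytes(total_file_bytes: int, num_files: int,
                    max_partition_bytes: int = 128 * 1024 * 1024,  # spark.sql.files.maxPartitionBytes
                    open_cost_in_bytes: int = 4 * 1024 * 1024,     # spark.sql.files.openCostInBytes
                    default_parallelism: int = 8) -> int:
    """Approximate the target split size for a file scan.

    Each file is charged an extra 'open cost', the padded total is spread
    over the default parallelism, and the per-core size is clamped into
    [openCostInBytes, maxPartitionBytes].
    """
    total_bytes = total_file_bytes + num_files * open_cost_in_bytes
    bytes_per_core = total_bytes // default_parallelism
    return min(max_partition_bytes, max(open_cost_in_bytes, bytes_per_core))
```

This is why both configs matter, as the comment argues: with few cores, `bytes_per_core` grows, splits get larger, and scan parallelism drops, until `maxPartitionBytes` caps the split size.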
[GitHub] [spark] cloud-fan commented on a change in pull request #26437: [SPARK-29800][SQL] Plan non-correlated Exists 's subquery in PlanSubqueries
cloud-fan commented on a change in pull request #26437: [SPARK-29800][SQL] Plan non-correlated Exists 's subquery in PlanSubqueries URL: https://github.com/apache/spark/pull/26437#discussion_r345053263

File path: sql/core/src/main/scala/org/apache/spark/sql/execution/subquery.scala

@@ -171,6 +173,68 @@ case class InSubqueryExec(
   }
 }
 
+/**
+ * The physical node of exists-subquery. This is for support use exists in join's on condition,
+ * since some join type we can't pushdown exists condition, we plan it here
+ */
+case class ExistsExec(
+    plan: BaseSubqueryExec,
+    exprId: ExprId,
+    private var resultBroadcast: Broadcast[Boolean] = null)
+  extends ExecSubqueryExpression {
+
+  @transient private var result: Boolean = _
+
+  override def dataType: DataType = BooleanType
+  override def children: Seq[Expression] = Nil
+  override def nullable: Boolean = false
+  override def toString: String = s"EXISTS ${plan.name}"

Review comment: This should be `EXISTS (${plan.simpleString})`, to follow `org.apache.spark.sql.execution.ScalarSubquery`.
[GitHub] [spark] cloud-fan commented on a change in pull request #26437: [SPARK-29800][SQL] Plan non-correlated Exists 's subquery in PlanSubqueries
cloud-fan commented on a change in pull request #26437: [SPARK-29800][SQL] Plan non-correlated Exists 's subquery in PlanSubqueries URL: https://github.com/apache/spark/pull/26437#discussion_r345052924

File path: sql/core/src/main/scala/org/apache/spark/sql/execution/subquery.scala

@@ -171,6 +173,68 @@ case class InSubqueryExec(
   }
 }
 
+/**
+ * The physical node of exists-subquery. This is for support use exists in join's on condition,

Review comment: We can just say `The physical node for non-correlated EXISTS subquery.` This is pretty general and not only for join conditions.
[GitHub] [spark] AmplabJenkins removed a comment on issue #26463: [SPARK-29837][SQL] PostgreSQL dialect: cast to boolean
AmplabJenkins removed a comment on issue #26463: [SPARK-29837][SQL] PostgreSQL dialect: cast to boolean URL: https://github.com/apache/spark/pull/26463#issuecomment-552772595 Merged build finished. Test PASSed.
[GitHub] [spark] AmplabJenkins removed a comment on issue #26463: [SPARK-29837][SQL] PostgreSQL dialect: cast to boolean
AmplabJenkins removed a comment on issue #26463: [SPARK-29837][SQL] PostgreSQL dialect: cast to boolean URL: https://github.com/apache/spark/pull/26463#issuecomment-552772600 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/18501/ Test PASSed.
[GitHub] [spark] cloud-fan commented on a change in pull request #26437: [SPARK-29800][SQL] Plan non-correlated Exists 's subquery in PlanSubqueries
cloud-fan commented on a change in pull request #26437: [SPARK-29800][SQL] Plan non-correlated Exists 's subquery in PlanSubqueries URL: https://github.com/apache/spark/pull/26437#discussion_r345052634

File path: sql/core/src/main/scala/org/apache/spark/sql/execution/subquery.scala

@@ -171,6 +173,68 @@ case class InSubqueryExec(
   }
 }
 
+/**
+ * The physical node of exists-subquery. This is for support use exists in join's on condition,
+ * since some join type we can't pushdown exists condition, we plan it here
+ */
+case class ExistsExec(
+    plan: BaseSubqueryExec,
+    exprId: ExprId,
+    private var resultBroadcast: Broadcast[Boolean] = null)

Review comment: Are you sure it's beneficial to broadcast a single boolean? I don't think so.
[GitHub] [spark] cloud-fan commented on a change in pull request #26437: [SPARK-29800][SQL] Plan non-correlated Exists 's subquery in PlanSubqueries
cloud-fan commented on a change in pull request #26437: [SPARK-29800][SQL] Plan non-correlated Exists 's subquery in PlanSubqueries URL: https://github.com/apache/spark/pull/26437#discussion_r345052326

File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/subquery.scala

@@ -64,9 +64,9 @@ object SubqueryExpression {
   /**
    * Returns true when an expression contains an IN or EXISTS subquery and false otherwise.
    */
-  def hasInOrExistsSubquery(e: Expression): Boolean = {
+  def hasInOrCorrelatedExistsSubquery(e: Expression): Boolean = {
     e.find {
-      case _: ListQuery | _: Exists => true
+      case _: ListQuery | _: Exists if e.children.nonEmpty => true

Review comment: this will skip non-correlated IN, we should do
```
case _: ListQuery => true
case _: Exists if e.children.nonEmpty => true
```
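The distinction the reviewer is drawing (any IN subquery counts, but an EXISTS counts only when it is correlated, i.e. when it carries outer references as children) can be modeled with a toy expression tree. This is an illustrative Python sketch of the intended semantics, not Catalyst code; `Expr`, `ListQuery`, and `Exists` here are hypothetical stand-ins.

```python
from dataclasses import dataclass, field

@dataclass
class Expr:
    children: list = field(default_factory=list)

    def find(self, pred):
        """Depth-first search for the first node matching pred, or None."""
        if pred(self):
            return self
        for c in self.children:
            r = c.find(pred)
            if r is not None:
                return r
        return None

@dataclass
class ListQuery(Expr):   # the subquery inside IN (...)
    pass

@dataclass
class Exists(Expr):      # children hold outer references when correlated
    pass

def has_in_or_correlated_exists_subquery(e: Expr) -> bool:
    # Mirrors the suggested match: any ListQuery counts, but an Exists
    # counts only when it carries outer references (i.e. it is correlated).
    return e.find(lambda x: isinstance(x, ListQuery)
                  or (isinstance(x, Exists) and len(x.children) > 0)) is not None
```

A non-correlated EXISTS can be planned once up front (the point of this PR), so it deliberately falls through the check.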
[GitHub] [spark] AmplabJenkins commented on issue #26463: [SPARK-29837][SQL] PostgreSQL dialect: cast to boolean
AmplabJenkins commented on issue #26463: [SPARK-29837][SQL] PostgreSQL dialect: cast to boolean URL: https://github.com/apache/spark/pull/26463#issuecomment-552772595 Merged build finished. Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #26463: [SPARK-29837][SQL] PostgreSQL dialect: cast to boolean
AmplabJenkins commented on issue #26463: [SPARK-29837][SQL] PostgreSQL dialect: cast to boolean URL: https://github.com/apache/spark/pull/26463#issuecomment-552772600 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/18501/ Test PASSed.
[GitHub] [spark] cloud-fan commented on a change in pull request #26465: [SPARK-29390][SQL] Add the justify_days(), justify_hours() and justif_interval() functions
cloud-fan commented on a change in pull request #26465: [SPARK-29390][SQL] Add the justify_days(), justify_hours() and justif_interval() functions URL: https://github.com/apache/spark/pull/26465#discussion_r345051840

File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/IntervalUtils.scala

@@ -637,4 +637,39 @@ object IntervalUtils {
     new CalendarInterval(totalMonths, totalDays, micros)
   }
+
+  /**
+   * Adjust interval so 30-day time periods are represented as months
+   */
+  def justifyDays(interval: CalendarInterval): CalendarInterval = {
+    val months = Math.addExact(interval.months, interval.days / DAYS_PER_MONTH.toInt)

Review comment: if `days` is negative, will we end up with something like `-1 month 2 days`?
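For comparison, PostgreSQL's `justify_days` folds whole 30-day chunks into months and then adjusts the result so months and days do not disagree in sign, which is exactly the mixed-sign case the reviewer is asking about. The following Python sketch approximates that PostgreSQL behavior; it is not the Scala code under review.

```python
DAYS_PER_MONTH = 30

def justify_days(months: int, days: int) -> tuple:
    """Fold 30-day chunks into months, then normalize mixed signs."""
    # Truncate toward zero, as integer division does in C and Scala.
    whole_months = int(days / DAYS_PER_MONTH)
    days -= whole_months * DAYS_PER_MONTH
    months += whole_months
    # Sign normalization: never report e.g. "-1 months 2 days" for an
    # interval whose components disagree in sign.
    if months > 0 and days < 0:
        months -= 1
        days += DAYS_PER_MONTH
    elif months < 0 and days > 0:
        months += 1
        days -= DAYS_PER_MONTH
    return months, days
```

Under this rule, 0 months and -32 days become -1 months -2 days, and a mixed input like 1 month -2 days is normalized to 0 months 28 days.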
[GitHub] [spark] cloud-fan commented on a change in pull request #26465: [SPARK-29390][SQL] Add the justify_days(), justify_hours() and justif_interval() functions
cloud-fan commented on a change in pull request #26465: [SPARK-29390][SQL] Add the justify_days(), justify_hours() and justif_interval() functions URL: https://github.com/apache/spark/pull/26465#discussion_r345051327

File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/intervalExpressions.scala

@@ -257,3 +257,69 @@ case class MakeInterval(
 
   override def prettyName: String = "make_interval"
 }
+
+abstract class IntervalJustifyLike(
+    child: Expression,
+    justify: CalendarInterval => CalendarInterval,
+    justifyType: String) extends UnaryExpression with ExpectsInputTypes {

Review comment: nit: it's clearer to pass in `justifyFuncName: String`
```
extends IntervalJustifyLike(child, justifyHours, "justifyHours")
```
[GitHub] [spark] AmplabJenkins removed a comment on issue #26480: [SPARK-29808][ML][PYTHON] StopWordsRemover should support multi-cols
AmplabJenkins removed a comment on issue #26480: [SPARK-29808][ML][PYTHON] StopWordsRemover should support multi-cols URL: https://github.com/apache/spark/pull/26480#issuecomment-552769456 Merged build finished. Test PASSed.
[GitHub] [spark] AmplabJenkins removed a comment on issue #26480: [SPARK-29808][ML][PYTHON] StopWordsRemover should support multi-cols
AmplabJenkins removed a comment on issue #26480: [SPARK-29808][ML][PYTHON] StopWordsRemover should support multi-cols URL: https://github.com/apache/spark/pull/26480#issuecomment-552769465 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/113614/ Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #26480: [SPARK-29808][ML][PYTHON] StopWordsRemover should support multi-cols
AmplabJenkins commented on issue #26480: [SPARK-29808][ML][PYTHON] StopWordsRemover should support multi-cols URL: https://github.com/apache/spark/pull/26480#issuecomment-552769456 Merged build finished. Test PASSed.
[GitHub] [spark] AngersZhuuuu commented on issue #26406: [SPARK-29145][SQL][FOLLOW-UP] Move tests from `SubquerySuite` to `subquery/in-subquery/in-joins.sql`
AngersZhuuuu commented on issue #26406: [SPARK-29145][SQL][FOLLOW-UP] Move tests from `SubquerySuite` to `subquery/in-subquery/in-joins.sql` URL: https://github.com/apache/spark/pull/26406#issuecomment-552769401 @maropu The tests passed; is there any more work to do?
[GitHub] [spark] AmplabJenkins commented on issue #26480: [SPARK-29808][ML][PYTHON] StopWordsRemover should support multi-cols
AmplabJenkins commented on issue #26480: [SPARK-29808][ML][PYTHON] StopWordsRemover should support multi-cols URL: https://github.com/apache/spark/pull/26480#issuecomment-552769465 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/113614/ Test PASSed.
[GitHub] [spark] cloud-fan commented on issue #26461: [SPARK-29831][SQL] Scan Hive partitioned table should not dramatically increase data parallelism
cloud-fan commented on issue #26461: [SPARK-29831][SQL] Scan Hive partitioned table should not dramatically increase data parallelism URL: https://github.com/apache/spark/pull/26461#issuecomment-552769026 I agree with the problem mentioned by @viirya, but I'm not sure this config is the right cure. Users still need to know about the big-parallelism problem and set the config carefully. The file source config `spark.sql.files.maxPartitionBytes` is much simpler to use: it defines how much data you want each task to process, and mostly you don't need to change it for different queries. `spark.default.parallelism` doesn't really affect data source scans AFAIK. We had a similar problem with setting the number of reducers, and we solved it with the recent adaptive execution work. I'm OK with having a config for Hive table scans, but we should make it simple to set.
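As a reference point for the config discussed above, tuning the file-source knob looks like this. This is a sketch assuming an existing `SparkSession` named `spark`; the value shown is Spark's documented default, not a recommendation for any particular workload:

```scala
// Cap the input per scan task; scan parallelism then scales with data size
// rather than with a fixed partition count.
spark.conf.set("spark.sql.files.maxPartitionBytes", 128 * 1024 * 1024) // 128 MB, the default
```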
[GitHub] [spark] SparkQA removed a comment on issue #26480: [SPARK-29808][ML][PYTHON] StopWordsRemover should support multi-cols
SparkQA removed a comment on issue #26480: [SPARK-29808][ML][PYTHON] StopWordsRemover should support multi-cols URL: https://github.com/apache/spark/pull/26480#issuecomment-552751882 **[Test build #113614 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/113614/testReport)** for PR 26480 at commit [`0d2f624`](https://github.com/apache/spark/commit/0d2f624a5ce7a7cbc9ec9bc35cc08f8d6fcd98b4).
[GitHub] [spark] SparkQA commented on issue #26480: [SPARK-29808][ML][PYTHON] StopWordsRemover should support multi-cols
SparkQA commented on issue #26480: [SPARK-29808][ML][PYTHON] StopWordsRemover should support multi-cols URL: https://github.com/apache/spark/pull/26480#issuecomment-552768706 **[Test build #113614 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/113614/testReport)** for PR 26480 at commit [`0d2f624`](https://github.com/apache/spark/commit/0d2f624a5ce7a7cbc9ec9bc35cc08f8d6fcd98b4).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] [spark] cloud-fan commented on issue #26418: [SPARK-29783][SQL] Support SQL Standard/ISO_8601 output style for interval type
cloud-fan commented on issue #26418: [SPARK-29783][SQL] Support SQL Standard/ISO_8601 output style for interval type URL: https://github.com/apache/spark/pull/26418#issuecomment-552766814 The part I worry about most is the SQL STANDARD style. For `-2 day 23 hour`, the SQL STANDARD rendering would be `-2 23`, which actually means `-2 day -23 hour`. How is that solved?
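To make the ambiguity concrete: in the SQL standard day-time format a single leading sign covers every field, so one way out (a sketch only, not necessarily what this PR does) is to normalize the fields to a single sign before printing. `formatDayHour` below is a hypothetical helper, not Spark's formatter:

```scala
// Hypothetical day/hour formatter in SQL standard style. Normalizing to
// total hours first guarantees all printed fields share one sign, so a
// single leading '-' is unambiguous: -2 day 23 hour == -25 hours -> "-1 1".
def formatDayHour(days: Int, hours: Int): String = {
  val totalHours = days * 24 + hours
  val sign = if (totalHours < 0) "-" else ""
  val t = math.abs(totalHours)
  s"$sign${t / 24} ${t % 24}"
}
```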
[GitHub] [spark] Ngone51 commented on a change in pull request #26463: [SPARK-29837][SQL] PostgreSQL dialect: cast to boolean
Ngone51 commented on a change in pull request #26463: [SPARK-29837][SQL] PostgreSQL dialect: cast to boolean URL: https://github.com/apache/spark/pull/26463#discussion_r345045212 ## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/Cast.scala ## @@ -273,7 +273,7 @@ abstract class CastBase extends UnaryExpression with TimeZoneAwareExpression wit private[this] def needsTimeZone: Boolean = Cast.needsTimeZone(child.dataType, dataType) // [[func]] assumes the input is no longer null because eval already does the null check. - @inline private[this] def buildCast[T](a: Any, func: T => Any): Any = func(a.asInstanceOf[T]) + @inline private[catalyst] def buildCast[T](a: Any, func: T => Any): Any = func(a.asInstanceOf[T]) Review comment: Because `case StringType` inside `PostgreCastToBoolean.castToBoolean()` needs to call `buildCast`.
[GitHub] [spark] cloud-fan commented on a change in pull request #26479: [SPARK-29855][SQL] typed literals with negative sign with proper result or exception
cloud-fan commented on a change in pull request #26479: [SPARK-29855][SQL] typed literals with negative sign with proper result or exception URL: https://github.com/apache/spark/pull/26479#discussion_r345044456 ## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/parser/AstBuilder.scala ## @@ -1860,16 +1860,19 @@ class AstBuilder(conf: SQLConf) extends SqlBaseBaseVisitor[AnyRef] with Logging override def visitTypeConstructor(ctx: TypeConstructorContext): Literal = withOrigin(ctx) { val value = string(ctx.STRING) val valueType = ctx.identifier.getText.toUpperCase(Locale.ROOT) +val negativeSign = Option(ctx.negativeSign).map(_.getText).getOrElse("") +val isNegative = if (negativeSign == "-") true else false Review comment: do we really need to check `ctx.negativeSign.getText`? the parser rule is `MINUS? ...`
[GitHub] [spark] AmplabJenkins removed a comment on issue #26481: [WIP][SPARK-29840][SQL] PostgreSQL dialect: cast to integer
AmplabJenkins removed a comment on issue #26481: [WIP][SPARK-29840][SQL] PostgreSQL dialect: cast to integer URL: https://github.com/apache/spark/pull/26481#issuecomment-552764490 Can one of the admins verify this patch?
[GitHub] [spark] AmplabJenkins commented on issue #26481: [WIP][SPARK-29840][SQL] PostgreSQL dialect: cast to integer
AmplabJenkins commented on issue #26481: [WIP][SPARK-29840][SQL] PostgreSQL dialect: cast to integer URL: https://github.com/apache/spark/pull/26481#issuecomment-552765089 Can one of the admins verify this patch?
[GitHub] [spark] cloud-fan commented on a change in pull request #26219: [SPARK-29563][SQL] CREATE TABLE LIKE should look up catalog/table like v2 commands
cloud-fan commented on a change in pull request #26219: [SPARK-29563][SQL] CREATE TABLE LIKE should look up catalog/table like v2 commands URL: https://github.com/apache/spark/pull/26219#discussion_r345043711 ## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/v2Commands.scala ## @@ -187,6 +187,16 @@ case class CreateTableAsSelect( } } + +/** + * Create a new table with the same table definition of the source table. + */ +case class CreateTableLike( +targetCatalog: TableCatalog, +targetTableName: Seq[String], +sourceCatalog: Option[TableCatalog], +sourceTableName: Seq[String], Review comment: for the source table, what we really care about is the table itself, not which catalog it comes from. I think it's better to define the plan as
```
case class CreateTableLike(
    targetCatalog: TableCatalog,
    targetTableName: Seq[String],
    sourceTable: NamedRelation,
    location: Option[String],
    provider: Option[String],
    ifNotExists: Boolean)
```
In the planner, we match `CreateTableLike(..., r: DataSourceV2Relation, ..)` and create the physical plan with source table `r.table`
[GitHub] [spark] AmplabJenkins commented on issue #26481: [WIP][SPARK-29840][SQL] PostgreSQL dialect: cast to integer
AmplabJenkins commented on issue #26481: [WIP][SPARK-29840][SQL] PostgreSQL dialect: cast to integer URL: https://github.com/apache/spark/pull/26481#issuecomment-552764490 Can one of the admins verify this patch?
[GitHub] [spark] cloud-fan commented on a change in pull request #26219: [SPARK-29563][SQL] CREATE TABLE LIKE should look up catalog/table like v2 commands
cloud-fan commented on a change in pull request #26219: [SPARK-29563][SQL] CREATE TABLE LIKE should look up catalog/table like v2 commands URL: https://github.com/apache/spark/pull/26219#discussion_r345042723 ## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/ResolveCatalogs.scala ## @@ -141,6 +141,37 @@ class ResolveCatalogs(val catalogManager: CatalogManager) writeOptions = c.options.filterKeys(_ != "path"), ignoreIfExists = c.ifNotExists) +case c @ CreateTableLikeStatement(target, source, loc, ifNotExists) => + def validateLocation(loc: Option[String]) = { +if (loc.isDefined) { + throw new AnalysisException("Location clause not supported for " + +"CREATE TABLE LIKE statement when tables are of V2 type") +} + } + (target, source) match { +case (NonSessionCatalog(tCatalog, t), NonSessionCatalog(sCatalog, s)) => + validateLocation(loc) + CreateTableLike(tCatalog.asTableCatalog, +t, +Some(sCatalog.asTableCatalog), +s, +ifNotExists) +case (NonSessionCatalog(tCatalog, t), SessionCatalog(sCatalog, s)) => + validateLocation(loc) + CreateTableLike(tCatalog.asTableCatalog, +t, +None, Review comment: shall we pass `sCatalog` in?
[GitHub] [spark] cloud-fan commented on a change in pull request #26219: [SPARK-29563][SQL] CREATE TABLE LIKE should look up catalog/table like v2 commands
cloud-fan commented on a change in pull request #26219: [SPARK-29563][SQL] CREATE TABLE LIKE should look up catalog/table like v2 commands URL: https://github.com/apache/spark/pull/26219#discussion_r345042409 ## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/ResolveCatalogs.scala ## @@ -141,6 +141,37 @@ class ResolveCatalogs(val catalogManager: CatalogManager) writeOptions = c.options.filterKeys(_ != "path"), ignoreIfExists = c.ifNotExists) +case c @ CreateTableLikeStatement(target, source, loc, ifNotExists) => + def validateLocation(loc: Option[String]) = { +if (loc.isDefined) { + throw new AnalysisException("Location clause not supported for " + +"CREATE TABLE LIKE statement when tables are of V2 type") +} + } + (target, source) match { +case (NonSessionCatalog(tCatalog, t), NonSessionCatalog(sCatalog, s)) => + validateLocation(loc) + CreateTableLike(tCatalog.asTableCatalog, +t, +Some(sCatalog.asTableCatalog), +s, +ifNotExists) +case (NonSessionCatalog(tCatalog, t), SessionCatalog(sCatalog, s)) => Review comment: If we need to catch the session catalog, we should move this case to `ResolveSessionCatalog`. It's OK to create a v2 command in `ResolveSessionCatalog`.
[GitHub] [spark] amanomer opened a new pull request #26481: [WIP][SPARK-29838][SQL] PostgreSQL dialect: cast to integer
amanomer opened a new pull request #26481: [WIP][SPARK-29838][SQL] PostgreSQL dialect: cast to integer URL: https://github.com/apache/spark/pull/26481 ### What changes were proposed in this pull request? Make SparkSQL's `cast as int` behavior consistent with PostgreSQL when spark.sql.dialect is configured as PostgreSQL. ### Why are the changes needed? SparkSQL and PostgreSQL have a lot of different cast behaviors between types by default. We should make SparkSQL's cast behavior consistent with PostgreSQL when spark.sql.dialect is configured as PostgreSQL. ### Does this PR introduce any user-facing change? Yes. If users switch to the PostgreSQL dialect now, they will get an AnalysisException when they try to cast `ByteType`, `TimestampType` or `DateType` to `IntegerType`. ### How was this patch tested? Added test cases.
[GitHub] [spark] MaxGekk commented on a change in pull request #26479: [SPARK-29855][SQL] typed literals with negative sign with proper result or exception
MaxGekk commented on a change in pull request #26479: [SPARK-29855][SQL] typed literals with negative sign with proper result or exception URL: https://github.com/apache/spark/pull/26479#discussion_r345040668 ## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/parser/AstBuilder.scala ## @@ -1860,16 +1860,19 @@ class AstBuilder(conf: SQLConf) extends SqlBaseBaseVisitor[AnyRef] with Logging override def visitTypeConstructor(ctx: TypeConstructorContext): Literal = withOrigin(ctx) { val value = string(ctx.STRING) val valueType = ctx.identifier.getText.toUpperCase(Locale.ROOT) +val negativeSign = Option(ctx.negativeSign).map(_.getText).getOrElse("") +val isNegative = if (negativeSign == "-") true else false Review comment: and remove `val negativeSign`
[GitHub] [spark] MaxGekk commented on a change in pull request #26479: [SPARK-29855][SQL] typed literals with negative sign with proper result or exception
MaxGekk commented on a change in pull request #26479: [SPARK-29855][SQL] typed literals with negative sign with proper result or exception URL: https://github.com/apache/spark/pull/26479#discussion_r345040291 ## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/parser/AstBuilder.scala ## @@ -1860,16 +1860,19 @@ class AstBuilder(conf: SQLConf) extends SqlBaseBaseVisitor[AnyRef] with Logging override def visitTypeConstructor(ctx: TypeConstructorContext): Literal = withOrigin(ctx) { val value = string(ctx.STRING) val valueType = ctx.identifier.getText.toUpperCase(Locale.ROOT) +val negativeSign = Option(ctx.negativeSign).map(_.getText).getOrElse("") +val isNegative = if (negativeSign == "-") true else false Review comment:
```suggestion
    val isNegative = ctx.negativeSign != null && ctx.negativeSign.getText == "-"
```
[GitHub] [spark] cloud-fan commented on issue #26479: [SPARK-29855][SQL] typed literals with negative sign with proper result or exception
cloud-fan commented on issue #26479: [SPARK-29855][SQL] typed literals with negative sign with proper result or exception URL: https://github.com/apache/spark/pull/26479#issuecomment-552760958 looks reasonable to me. Is it a common behavior in other DBs like pgsql?
[GitHub] [spark] cloud-fan commented on a change in pull request #26479: [SPARK-29855][SQL] typed literals with negative sign with proper result or exception
cloud-fan commented on a change in pull request #26479: [SPARK-29855][SQL] typed literals with negative sign with proper result or exception URL: https://github.com/apache/spark/pull/26479#discussion_r345040238 ## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/parser/AstBuilder.scala ## @@ -1860,16 +1860,19 @@ class AstBuilder(conf: SQLConf) extends SqlBaseBaseVisitor[AnyRef] with Logging override def visitTypeConstructor(ctx: TypeConstructorContext): Literal = withOrigin(ctx) { val value = string(ctx.STRING) val valueType = ctx.identifier.getText.toUpperCase(Locale.ROOT) +val negativeSign = Option(ctx.negativeSign).map(_.getText).getOrElse("") +val isNegative = if (negativeSign == "-") true else false Review comment: is it simply `val isNegative = ctx.negativeSign != null`?
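A toy illustration of the two checks debated in this thread, with the ANTLR token modeled as a plain nullable `String` rather than the real parser types: under a `MINUS? ...` rule the token text can only ever be `-`, so the null-check and the text-check agree.

```scala
// Toy model of ctx.negativeSign as a nullable String (not ANTLR types).
def isNegativeNullCheck(tok: String): Boolean = tok != null
def isNegativeTextCheck(tok: String): Boolean = tok != null && tok == "-"
```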
[GitHub] [spark] cloud-fan commented on a change in pull request #26463: [SPARK-29837][SQL] PostgreSQL dialect: cast to boolean
cloud-fan commented on a change in pull request #26463: [SPARK-29837][SQL] PostgreSQL dialect: cast to boolean URL: https://github.com/apache/spark/pull/26463#discussion_r345039472 ## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/Cast.scala ## @@ -273,7 +273,7 @@ abstract class CastBase extends UnaryExpression with TimeZoneAwareExpression wit private[this] def needsTimeZone: Boolean = Cast.needsTimeZone(child.dataType, dataType) // [[func]] assumes the input is no longer null because eval already does the null check. - @inline private[this] def buildCast[T](a: Any, func: T => Any): Any = func(a.asInstanceOf[T]) + @inline private[catalyst] def buildCast[T](a: Any, func: T => Any): Any = func(a.asInstanceOf[T]) Review comment: this is minor, but I just want to make it clear: in `PostgreCastToBoolean` we call `super.castToBoolean`, so why does the modifier of `buildCast` become a problem?
[GitHub] [spark] cloud-fan commented on a change in pull request #26219: [SPARK-29563][SQL] CREATE TABLE LIKE should look up catalog/table like v2 commands
cloud-fan commented on a change in pull request #26219: [SPARK-29563][SQL] CREATE TABLE LIKE should look up catalog/table like v2 commands URL: https://github.com/apache/spark/pull/26219#discussion_r345038983 ## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/ResolveCatalogs.scala ## @@ -141,6 +141,37 @@ class ResolveCatalogs(val catalogManager: CatalogManager) writeOptions = c.options.filterKeys(_ != "path"), ignoreIfExists = c.ifNotExists) +case c @ CreateTableLikeStatement(target, source, loc, ifNotExists) => + def validateLocation(loc: Option[String]) = { +if (loc.isDefined) { + throw new AnalysisException("Location clause not supported for " + +"CREATE TABLE LIKE statement when tables are of V2 type") Review comment: we can support the LOCATION clause. See `CatalogV2Utils.convertTableProperties`: we can store the location in a special table property, `location`.
[GitHub] [spark] 07ARB commented on issue #26477: [SPARK-29776][SQL] rpad returning invalid value when parameter is empty
07ARB commented on issue #26477: [SPARK-29776][SQL] rpad returning invalid value when parameter is empty URL: https://github.com/apache/spark/pull/26477#issuecomment-552759351 @HyukjinKwon, please verify this PR.
[GitHub] [spark] Ngone51 commented on a change in pull request #26463: [SPARK-29837][SQL] PostgreSQL dialect: cast to boolean
Ngone51 commented on a change in pull request #26463: [SPARK-29837][SQL] PostgreSQL dialect: cast to boolean URL: https://github.com/apache/spark/pull/26463#discussion_r345035046
## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/postgreSQL/PostgreCastToBoolean.scala
## @@ -0,0 +1,82 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.spark.sql.catalyst.expressions.postgreSQL
+
+import org.apache.spark.sql.catalyst.analysis.TypeCheckResult
+import org.apache.spark.sql.catalyst.expressions.{CastBase, Expression, TimeZoneAwareExpression}
+import org.apache.spark.sql.catalyst.expressions.codegen.Block._
+import org.apache.spark.sql.catalyst.expressions.codegen.JavaCode
+import org.apache.spark.sql.catalyst.util.postgreSQL.StringUtils
+import org.apache.spark.sql.internal.SQLConf
+import org.apache.spark.sql.types._
+import org.apache.spark.unsafe.types.UTF8String
+
+case class PostgreCastToBoolean(child: Expression, timeZoneId: Option[String])
+  extends CastBase {
+
+  override protected def ansiEnabled = SQLConf.get.ansiEnabled
+
+  override def withTimeZone(timeZoneId: String): TimeZoneAwareExpression =
+    copy(timeZoneId = Option(timeZoneId))
+
+  override def checkInputDataTypes(): TypeCheckResult = child.dataType match {
+    case StringType | IntegerType => super.checkInputDataTypes()
+    case dt => throw new UnsupportedOperationException(s"cannot cast type $dt to boolean")
+  }
+
+  override def castToBoolean(from: DataType): Any => Any = from match {
+    case StringType =>
+      buildCast[UTF8String](_, str => {
+        val s = str.trim().toLowerCase()
+        if (StringUtils.isTrueString(s)) {
+          true
+        } else if (StringUtils.isFalseString(s)) {
+          false
+        } else {
+          throw new IllegalArgumentException(s"invalid input syntax for type boolean: $s")
+        }
+      })
+    case IntegerType =>
+      super.castToBoolean(from)
+  }
+
+  override def castToBooleanCode(from: DataType): CastFunction = from match {
+    case StringType =>
+      val stringUtils = inline"${StringUtils.getClass.getName.stripSuffix("$")}"
+      (c, evPrim, evNull) =>
+        code"""
+          if ($stringUtils.isTrueString($c.trim().toLowerCase())) {
+            $evPrim = true;
+          } else if ($stringUtils.isFalseString($c.trim().toLowerCase())) {
+            $evPrim = false;
+          } else {
+            throw new IllegalArgumentException("invalid input syntax for type boolean: $c");
+          }
+        """
+
+    case IntegerType =>
+      super.castToBooleanCode(from)
+  }
+
+  override def dataType: DataType = BooleanType
+
+  override def nullable: Boolean = false
Review comment: makes sense. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
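The string branch of the cast above trims and lower-cases the input, matches it against a fixed set of true/false spellings, and raises on anything else (rather than returning NULL as Spark's default cast does). A minimal pure-Python sketch of that semantics follows; the literal sets mirror PostgreSQL's documented boolean input spellings and are an assumption here, not Spark's actual `StringUtils` contents.

```python
# Hypothetical sketch of PostgreSQL-style string-to-boolean casting.
# The accepted spellings below are assumed from PostgreSQL's docs
# ("t"/"true"/"yes"/"on"/"1" and their false counterparts); Spark's
# StringUtils.isTrueString/isFalseString may accept a different set.
TRUE_STRINGS = {"t", "true", "y", "yes", "on", "1"}
FALSE_STRINGS = {"f", "false", "n", "no", "off", "0"}

def pg_cast_to_boolean(s: str) -> bool:
    """Trim and lower-case the input, then match the literal sets.
    Invalid input raises instead of yielding NULL."""
    v = s.strip().lower()
    if v in TRUE_STRINGS:
        return True
    if v in FALSE_STRINGS:
        return False
    raise ValueError(f"invalid input syntax for type boolean: {s}")
```

The raise-on-garbage behavior is the whole point of the PostgreSQL dialect cast: `cast('abc' as boolean)` fails loudly instead of silently producing NULL.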
[GitHub] [spark] AmplabJenkins removed a comment on issue #26479: [SPARK-29855][SQL] typed literals with negative sign with proper result or exception
AmplabJenkins removed a comment on issue #26479: [SPARK-29855][SQL] typed literals with negative sign with proper result or exception URL: https://github.com/apache/spark/pull/26479#issuecomment-552753984 Merged build finished. Test PASSed.
[GitHub] [spark] AmplabJenkins removed a comment on issue #26479: [SPARK-29855][SQL] typed literals with negative sign with proper result or exception
AmplabJenkins removed a comment on issue #26479: [SPARK-29855][SQL] typed literals with negative sign with proper result or exception URL: https://github.com/apache/spark/pull/26479#issuecomment-552753992 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/18500/ Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #26479: [SPARK-29855][SQL] typed literals with negative sign with proper result or exception
AmplabJenkins commented on issue #26479: [SPARK-29855][SQL] typed literals with negative sign with proper result or exception URL: https://github.com/apache/spark/pull/26479#issuecomment-552753992 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/18500/ Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #26479: [SPARK-29855][SQL] typed literals with negative sign with proper result or exception
AmplabJenkins commented on issue #26479: [SPARK-29855][SQL] typed literals with negative sign with proper result or exception URL: https://github.com/apache/spark/pull/26479#issuecomment-552753984 Merged build finished. Test PASSed.
[GitHub] [spark] AmplabJenkins removed a comment on issue #26480: [SPARK-29808][ML][PYTHON] StopWordsRemover should support multi-cols
AmplabJenkins removed a comment on issue #26480: [SPARK-29808][ML][PYTHON] StopWordsRemover should support multi-cols URL: https://github.com/apache/spark/pull/26480#issuecomment-552753705 Merged build finished. Test PASSed.
[GitHub] [spark] AmplabJenkins removed a comment on issue #26480: [SPARK-29808][ML][PYTHON] StopWordsRemover should support multi-cols
AmplabJenkins removed a comment on issue #26480: [SPARK-29808][ML][PYTHON] StopWordsRemover should support multi-cols URL: https://github.com/apache/spark/pull/26480#issuecomment-552753711 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/18499/ Test PASSed.
[GitHub] [spark] AmplabJenkins removed a comment on issue #26478: [SPARK-29853][SQL]lpad returning empty instead of NULL for empty pad …
AmplabJenkins removed a comment on issue #26478: [SPARK-29853][SQL]lpad returning empty instead of NULL for empty pad … URL: https://github.com/apache/spark/pull/26478#issuecomment-552752714 Can one of the admins verify this patch?
[GitHub] [spark] AmplabJenkins commented on issue #26480: [SPARK-29808][ML][PYTHON] StopWordsRemover should support multi-cols
AmplabJenkins commented on issue #26480: [SPARK-29808][ML][PYTHON] StopWordsRemover should support multi-cols URL: https://github.com/apache/spark/pull/26480#issuecomment-552753711 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/18499/ Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #26478: [SPARK-29853][SQL]lpad returning empty instead of NULL for empty pad …
AmplabJenkins commented on issue #26478: [SPARK-29853][SQL]lpad returning empty instead of NULL for empty pad … URL: https://github.com/apache/spark/pull/26478#issuecomment-552753596 Can one of the admins verify this patch?
[GitHub] [spark] AmplabJenkins commented on issue #26480: [SPARK-29808][ML][PYTHON] StopWordsRemover should support multi-cols
AmplabJenkins commented on issue #26480: [SPARK-29808][ML][PYTHON] StopWordsRemover should support multi-cols URL: https://github.com/apache/spark/pull/26480#issuecomment-552753705 Merged build finished. Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #26478: [SPARK-29853][SQL]lpad returning empty instead of NULL for empty pad …
AmplabJenkins commented on issue #26478: [SPARK-29853][SQL]lpad returning empty instead of NULL for empty pad … URL: https://github.com/apache/spark/pull/26478#issuecomment-552752714 Can one of the admins verify this patch?
[GitHub] [spark] AmplabJenkins removed a comment on issue #26477: [SPARK-29776][SQL] rpad returning invalid value when parameter is empty
AmplabJenkins removed a comment on issue #26477: [SPARK-29776][SQL] rpad returning invalid value when parameter is empty URL: https://github.com/apache/spark/pull/26477#issuecomment-552743106 Can one of the admins verify this patch?
[GitHub] [spark] AmplabJenkins commented on issue #26477: [SPARK-29776][SQL] rpad returning invalid value when parameter is empty
AmplabJenkins commented on issue #26477: [SPARK-29776][SQL] rpad returning invalid value when parameter is empty URL: https://github.com/apache/spark/pull/26477#issuecomment-552752723 Can one of the admins verify this patch?
[GitHub] [spark] SparkQA commented on issue #26479: [SPARK-29855][SQL] typed literals with negative sign with proper result or exception
SparkQA commented on issue #26479: [SPARK-29855][SQL] typed literals with negative sign with proper result or exception URL: https://github.com/apache/spark/pull/26479#issuecomment-552752803 **[Test build #113615 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/113615/testReport)** for PR 26479 at commit [`1d04cb3`](https://github.com/apache/spark/commit/1d04cb3765198cc4ff36141dc7c34cd105ccefd8).
[GitHub] [spark] SparkQA commented on issue #26480: [SPARK-29808][ML][PYTHON] StopWordsRemover should support multi-cols
SparkQA commented on issue #26480: [SPARK-29808][ML][PYTHON] StopWordsRemover should support multi-cols URL: https://github.com/apache/spark/pull/26480#issuecomment-552751882 **[Test build #113614 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/113614/testReport)** for PR 26480 at commit [`0d2f624`](https://github.com/apache/spark/commit/0d2f624a5ce7a7cbc9ec9bc35cc08f8d6fcd98b4).
[GitHub] [spark] Ngone51 commented on a change in pull request #26463: [SPARK-29837][SQL] PostgreSQL dialect: cast to boolean
Ngone51 commented on a change in pull request #26463: [SPARK-29837][SQL] PostgreSQL dialect: cast to boolean URL: https://github.com/apache/spark/pull/26463#discussion_r345030984
## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/postgreSQL/PostgreCastToBoolean.scala
## @@ -0,0 +1,82 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.spark.sql.catalyst.expressions.postgreSQL
+
+import org.apache.spark.sql.catalyst.analysis.TypeCheckResult
+import org.apache.spark.sql.catalyst.expressions.{CastBase, Expression, TimeZoneAwareExpression}
+import org.apache.spark.sql.catalyst.expressions.codegen.Block._
+import org.apache.spark.sql.catalyst.expressions.codegen.JavaCode
+import org.apache.spark.sql.catalyst.util.postgreSQL.StringUtils
+import org.apache.spark.sql.internal.SQLConf
+import org.apache.spark.sql.types._
+import org.apache.spark.unsafe.types.UTF8String
+
+case class PostgreCastToBoolean(child: Expression, timeZoneId: Option[String])
+  extends CastBase {
+
+  override protected def ansiEnabled = SQLConf.get.ansiEnabled
+
+  override def withTimeZone(timeZoneId: String): TimeZoneAwareExpression =
+    copy(timeZoneId = Option(timeZoneId))
+
+  override def checkInputDataTypes(): TypeCheckResult = child.dataType match {
+    case StringType | IntegerType => super.checkInputDataTypes()
Review comment: I just want the `StringType` and `IntegerType` to do the `Cast.canCast()` check in `super.checkInputDataTypes()`, though I know it's not necessary. I'll return success directly.
[GitHub] [spark] amanomer commented on issue #26317: [SPARK-29628][SQL] Forcibly create a temporary view in CREATE VIEW if referencing a temporary view
amanomer commented on issue #26317: [SPARK-29628][SQL] Forcibly create a temporary view in CREATE VIEW if referencing a temporary view URL: https://github.com/apache/spark/pull/26317#issuecomment-552750431 @cloud-fan Kindly look into this PR.
[GitHub] [spark] Ngone51 commented on a change in pull request #26463: [SPARK-29837][SQL] PostgreSQL dialect: cast to boolean
Ngone51 commented on a change in pull request #26463: [SPARK-29837][SQL] PostgreSQL dialect: cast to boolean URL: https://github.com/apache/spark/pull/26463#discussion_r345030210
## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/Cast.scala
## @@ -273,7 +273,7 @@ abstract class CastBase extends UnaryExpression with TimeZoneAwareExpression wit
   private[this] def needsTimeZone: Boolean = Cast.needsTimeZone(child.dataType, dataType)
   // [[func]] assumes the input is no longer null because eval already does the null check.
-  @inline private[this] def buildCast[T](a: Any, func: T => Any): Any = func(a.asInstanceOf[T])
+  @inline private[catalyst] def buildCast[T](a: Any, func: T => Any): Any = func(a.asInstanceOf[T])
Review comment: no, but `protected[this]` could be fine.
[GitHub] [spark] cloud-fan commented on issue #25702: [SPARK-29001][CORE] Print events that take too long time to process
cloud-fan commented on issue #25702: [SPARK-29001][CORE] Print events that take too long time to process URL: https://github.com/apache/spark/pull/25702#issuecomment-552749651 thanks, merging to master!
[GitHub] [spark] cloud-fan closed pull request #25702: [SPARK-29001][CORE] Print events that take too long time to process
cloud-fan closed pull request #25702: [SPARK-29001][CORE] Print events that take too long time to process URL: https://github.com/apache/spark/pull/25702
[GitHub] [spark] qudade edited a comment on issue #26118: [SPARK-24915][Python] Fix Row handling with Schema.
qudade edited a comment on issue #26118: [SPARK-24915][Python] Fix Row handling with Schema. URL: https://github.com/apache/spark/pull/26118#issuecomment-552749031 @zero323 @HyukjinKwon I did some performance tests (Details in [this gist](https://gist.github.com/qudade/dc9d01f55d27d65ab66d68e3b8d1588d)). As expected, for `Row`s that are created using kwargs (has `__from_dict__`) AND where fields are ordered alphabetically the performance is worse (~15% at 15 fields, ~25% at 150 fields) and [memory](https://gist.github.com/qudade/dc9d01f55d27d65ab66d68e3b8d1588d#gistcomment-3080607) [consumption](https://gist.github.com/qudade/dc9d01f55d27d65ab66d68e3b8d1588d#gistcomment-3080608) increases. Of course, this is an edge case - I think it is rare to construct dataframes from `Row`s (otherwise this bug would have been fixed earlier) but for tests/experiments when performance is less of an issue. If performance is an issue, we could check if the order of fields is already alphabetical (making the performance worse for the general case) or determine the order once and reuse this mapping (might require major changes). In my experience, not being able to create a dataframe from dict-like `Row`s was a time-consuming annoyance. The value added by this PR is to enable this. What do you think? Is there any way to improve the code without making it unnecessarily complex?
[GitHub] [spark] qudade commented on issue #26118: [SPARK-24915][Python] Fix Row handling with Schema.
qudade commented on issue #26118: [SPARK-24915][Python] Fix Row handling with Schema. URL: https://github.com/apache/spark/pull/26118#issuecomment-552749031 @zero323 @HyukjinKwon I did some performance tests (Details in https://gist.github.com/qudade/dc9d01f55d27d65ab66d68e3b8d1588d). As expected, for `Row`s that are created using kwargs (has `__from_dict__`) AND where fields are ordered alphabetically the performance is worse (~15% at 15 fields, ~25% at 150 fields) and [memory](https://gist.github.com/qudade/dc9d01f55d27d65ab66d68e3b8d1588d#gistcomment-3080607) [consumption](https://gist.github.com/qudade/dc9d01f55d27d65ab66d68e3b8d1588d#gistcomment-3080608) increases. Of course, this is an edge case - I think it is rare to construct dataframes from `Row`s (otherwise this bug would have been fixed earlier) but for tests/experiments when performance is less of an issue. If performance is an issue, we could check if the order of fields is already alphabetical (making the performance worse for the general case) or determine the order once and reuse this mapping (might require major changes). In my experience, not being able to create a dataframe from dict-like `Row`s was a time-consuming annoyance. The value added by this PR is to enable this. What do you think? Is there any way to improve the code without making it unnecessarily complex?
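The ordering problem discussed in the two comments above comes from kwargs-created `Row`s carrying their fields in alphabetical order while the target schema declares its own order, so values must be re-mapped by name before they can be zipped positionally. A pure-Python illustration of that re-mapping (hypothetical helper names, not PySpark internals):

```python
# Re-map a kwargs-style row (fields sorted alphabetically) onto a schema's
# declared field order by looking values up by name instead of position.
def reorder_to_schema(row: dict, schema_fields: list) -> tuple:
    """Return the row's values in schema order, keyed by field name."""
    return tuple(row[name] for name in schema_fields)

row = {"age": 41, "name": "Alice"}   # kwargs-style Row: alphabetical field order
schema = ["name", "age"]             # schema declares "name" first
assert reorder_to_schema(row, schema) == ("Alice", 41)
```

The performance cost measured in the gist is essentially the cost of doing this name lookup per row instead of a straight positional zip.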
[GitHub] [spark] cloud-fan commented on issue #26473: [WIP] Strict parsing of day-time strings to intervals
cloud-fan commented on issue #26473: [WIP] Strict parsing of day-time strings to intervals URL: https://github.com/apache/spark/pull/26473#issuecomment-552747382 +1 to keep the old implementation and add a legacy config to fallback
[GitHub] [spark] iRakson commented on a change in pull request #26467: [SPARK-29477]Improve tooltip for Streaming tab
iRakson commented on a change in pull request #26467: [SPARK-29477]Improve tooltip for Streaming tab URL: https://github.com/apache/spark/pull/26467#discussion_r345028095
## File path: streaming/src/main/scala/org/apache/spark/streaming/ui/BatchPage.scala
## @@ -37,10 +37,11 @@ private[ui] class BatchPage(parent: StreamingTab) extends WebUIPage("batch") {
   private def columns: Seq[Node] = {
     Output Op Id
     Description
-    Output Op Duration
+    Output Op Duration {SparkUIUtils.tooltip("Time taken to handle all jobs of the batch",
Review comment: Should we show the exact start time and end time of the job as the tooltip? Or do we need to define the start and end time here?
[GitHub] [spark] huaxingao commented on a change in pull request #26480: [SPARK-29808][ML][PYTHON] StopWordsRemover should support multi-cols
huaxingao commented on a change in pull request #26480: [SPARK-29808][ML][PYTHON] StopWordsRemover should support multi-cols URL: https://github.com/apache/spark/pull/26480#discussion_r345028011
## File path: mllib/src/main/scala/org/apache/spark/ml/feature/StopWordsRemover.scala
## @@ -51,6 +57,14 @@ class StopWordsRemover @Since("1.5.0") (@Since("1.5.0") override val uid: String
   @Since("1.5.0")
   def setOutputCol(value: String): this.type = set(outputCol, value)

+  /** @group setParam */
+  @Since("3.0.0")
+  def setInputCols(value: Array[String]): this.type = set(inputCols, value)
+
+  /** @group setParam */
+  @Since("3.0.0")
+  def setOutputCols(value: Array[String]): this.type = set(outputCols, value)
+
Review comment: I am debating whether I should add ```stopWordsArray/caseSensitiveArray/localArray```. It seems to me that users will use the same set of ```stopWords``` for all columns, so there's no need to add those.
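The design point in the review comment above is that one shared `stopWords` set is applied to every input column, with one output column per input column. A pure-Python sketch of that multi-column behaviour (a stand-in for illustration, not the MLlib implementation; the `_filtered` suffix is a hypothetical naming choice):

```python
# Apply one shared stop-word set across several input columns, writing one
# filtered output column per input column, as the multi-col PR proposes.
def remove_stop_words(rows, input_cols, stop_words, case_sensitive=False):
    stops = set(stop_words) if case_sensitive else {w.lower() for w in stop_words}
    key = (lambda w: w) if case_sensitive else str.lower
    out = []
    for row in rows:
        new_row = dict(row)
        for col in input_cols:
            # same stop-word set for every column
            new_row[col + "_filtered"] = [w for w in row[col] if key(w) not in stops]
        out.append(new_row)
    return out
```

If per-column stop-word sets were ever needed, the parameter would have to become an array of sets (the `stopWordsArray` idea the comment debates); sharing one set keeps the API surface small.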
[GitHub] [spark] yaooqinn commented on issue #26479: [SPARK-29855][SQL] typed literals with negative sign with proper result or exception
yaooqinn commented on issue #26479: [SPARK-29855][SQL] typed literals with negative sign with proper result or exception URL: https://github.com/apache/spark/pull/26479#issuecomment-552746551 cc @cloud-fan @maropu @HyukjinKwon @MaxGekk, thanks a lot
[GitHub] [spark] huaxingao opened a new pull request #26480: [SPARK-29808][ML][PYTHON] StopWordsRemover should support multi-cols
huaxingao opened a new pull request #26480: [SPARK-29808][ML][PYTHON] StopWordsRemover should support multi-cols URL: https://github.com/apache/spark/pull/26480
### What changes were proposed in this pull request?
Add multi-cols support in StopWordsRemover
### Why are the changes needed?
As a basic Transformer, StopWordsRemover should support multi-cols. Param stopWords can be applied across all columns.
### Does this PR introduce any user-facing change?
```StopWordsRemover.setInputCols``` ```StopWordsRemover.setOutputCols```
### How was this patch tested?
Unit tests
[GitHub] [spark] yaooqinn opened a new pull request #26479: [SPARK-29855][SQL] typed literals with negative sign with proper result or exception
yaooqinn opened a new pull request #26479: [SPARK-29855][SQL] typed literals with negative sign with proper result or exception URL: https://github.com/apache/spark/pull/26479
### What changes were proposed in this pull request?
```sql
-- !query 83
select -integer '7'
-- !query 83 schema
struct<7:int>
-- !query 83 output
7

-- !query 86
select -date '1999-01-01'
-- !query 86 schema
struct
-- !query 86 output
1999-01-01

-- !query 87
select -timestamp '1999-01-01'
-- !query 87 schema
struct
-- !query 87 output
1999-01-01 00:00:00
```
The integer result should be -7, and the date and timestamp results are confusing; those queries should throw exceptions instead.
### Why are the changes needed?
bug fix
### Does this PR introduce any user-facing change?
NO
### How was this patch tested?
ADD UTs
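The PR description above can be restated as a rule for unary minus on typed literals: negate numeric values, reject date/timestamp values loudly rather than silently dropping the sign. A hypothetical Python helper sketching that intent (not Spark's parser):

```python
# Sketch of the intended unary-minus semantics for typed literals:
# numeric literals negate, date/timestamp literals raise instead of
# silently ignoring the sign as in the buggy output shown above.
import datetime

def negate_typed_literal(value):
    if isinstance(value, bool):
        raise TypeError("cannot negate a boolean literal")
    if isinstance(value, (int, float)):
        return -value
    if isinstance(value, (datetime.date, datetime.datetime)):
        raise TypeError(f"cannot negate a {type(value).__name__} literal")
    raise TypeError(f"unsupported literal type: {type(value).__name__}")
```

Under this rule, `-integer '7'` yields -7, while `-date '1999-01-01'` is a parse/analysis error rather than the unchanged date the query output showed.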
[GitHub] [spark] 07ARB opened a new pull request #26478: [SPARK-29853][SQL]lpad returning empty instead of NULL for empty pad …
07ARB opened a new pull request #26478: [SPARK-29853][SQL]lpad returning empty instead of NULL for empty pad … URL: https://github.com/apache/spark/pull/26478
### What changes were proposed in this pull request?
lpad should return NULL instead of empty for an empty pad value.
### Why are the changes needed?
A check is needed for the case where the padding string is empty and the required length is greater than zero.
### Does this PR introduce any user-facing change?
NO
### How was this patch tested?
Existing unit tests corrected as per this JIRA.
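To make the behaviour change concrete: with an empty pad string and a target length longer than the input, `lpad` cannot pad, and the PR argues the result should be SQL NULL rather than the unpadded or empty string. A minimal Python sketch of those semantics, assuming the NULL-on-empty-pad rule the PR proposes (not Spark's actual `UTF8String.lpad`):

```python
# Sketch of lpad under the semantics proposed in SPARK-29853: truncate when
# the target length is <= the input length; return None (SQL NULL) when the
# pad string is empty and padding would be required; otherwise pad on the left.
from typing import Optional

def lpad(s: str, length: int, pad: str) -> Optional[str]:
    if length <= len(s):
        return s[:length]      # lpad truncates when length <= input length
    if pad == "":
        return None            # proposed NULL behaviour for an empty pad
    need = length - len(s)
    return (pad * need)[:need] + s
```

For example, `lpad('hi', 5, '')` would return NULL instead of `'hi'`, while `lpad('hi', 5, '??')` still returns `'???hi'`.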
[GitHub] [spark] AmplabJenkins removed a comment on issue #25971: [SPARK-29298][CORE] Separate block manager heartbeat endpoint from driver endpoint
AmplabJenkins removed a comment on issue #25971: [SPARK-29298][CORE] Separate block manager heartbeat endpoint from driver endpoint URL: https://github.com/apache/spark/pull/25971#issuecomment-552743462 Merged build finished. Test PASSed.
[GitHub] [spark] AmplabJenkins removed a comment on issue #25971: [SPARK-29298][CORE] Separate block manager heartbeat endpoint from driver endpoint
AmplabJenkins removed a comment on issue #25971: [SPARK-29298][CORE] Separate block manager heartbeat endpoint from driver endpoint URL: https://github.com/apache/spark/pull/25971#issuecomment-552743470 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/18498/ Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #25971: [SPARK-29298][CORE] Separate block manager heartbeat endpoint from driver endpoint
AmplabJenkins commented on issue #25971: [SPARK-29298][CORE] Separate block manager heartbeat endpoint from driver endpoint URL: https://github.com/apache/spark/pull/25971#issuecomment-552743462 Merged build finished. Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #25971: [SPARK-29298][CORE] Separate block manager heartbeat endpoint from driver endpoint
AmplabJenkins commented on issue #25971: [SPARK-29298][CORE] Separate block manager heartbeat endpoint from driver endpoint URL: https://github.com/apache/spark/pull/25971#issuecomment-552743470 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/18498/ Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #26477: [SPARK-29776][SQL] rpad returning invalid value when parameter is empty
AmplabJenkins commented on issue #26477: [SPARK-29776][SQL] rpad returning invalid value when parameter is empty URL: https://github.com/apache/spark/pull/26477#issuecomment-552743106 Can one of the admins verify this patch?
[GitHub] [spark] cloud-fan commented on a change in pull request #26473: [WIP] Strict parsing of day-time strings to intervals
cloud-fan commented on a change in pull request #26473: [WIP] Strict parsing of day-time strings to intervals URL: https://github.com/apache/spark/pull/26473#discussion_r345024238
## File path: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/util/IntervalUtilsSuite.scala
@@ -266,4 +226,59 @@ class IntervalUtilsSuite extends SparkFunSuite {
       assert(e.getMessage.contains("divide by zero"))
     }
   }
+
+  test("from day-time string to interval") {
+    def check(input: String, from: IntervalUnit, to: IntervalUnit, expected: String): Unit = {
+      withClue(s"from = $from, to = $to") {
+        assert(fromDayTimeString(input, from, to) === fromString(expected))
+      }
+    }
+    def checkFail(
+        input: String,
+        from: IntervalUnit,
+        to: IntervalUnit,
+        errMsg: String): Unit = {
+      try {
+        fromDayTimeString(input, from, to)
+        fail("Expected to throw an exception for the invalid input")
+      } catch {
+        case e: IllegalArgumentException =>
+          assert(e.getMessage.contains(errMsg))
+      }
+    }
+
+    check("12:40", HOUR, MINUTE, "12 hours 40 minutes")
+    check("+12:40", HOUR, MINUTE, "12 hours 40 minutes")
+    check("-12:40", HOUR, MINUTE, "-12 hours -40 minutes")
+    checkFail("5 12:40", HOUR, MINUTE, "must match day-time format")
+
+    check("12:40:30.9", HOUR, SECOND, "12 hours 40 minutes 30.99 seconds")
Review comment: can we test `12:40:30.0123456789` as well?
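The strict hour-to-second parsing exercised by these tests can be modeled in a few lines. This is a hedged Python sketch, not Spark's `IntervalUtils.fromDayTimeString`: the regex, the microsecond rounding, and the error message are illustrative assumptions; the point is that strict mode rejects any string carrying a day field (e.g. `5 12:40`) when only HOUR-to-SECOND fields were requested.

```python
import re

# Hypothetical model of strict HOUR-to-SECOND day-time parsing; Spark's real
# implementation is IntervalUtils.fromDayTimeString, which this only sketches.
HOUR_TO_SECOND = re.compile(r"^([+-])?(\d+):(\d{1,2})(?::(\d{1,2})(\.\d{1,9})?)?$")

def parse_hour_to_second(s: str) -> int:
    """Parse '[+|-]HH:MM[:SS[.fffffffff]]' into signed microseconds."""
    m = HOUR_TO_SECOND.match(s.strip())
    if m is None:
        # Strict mode: a leading day field, e.g. '5 12:40', does not match.
        raise ValueError(f"Interval string '{s}' must match day-time format")
    sign = -1 if m.group(1) == "-" else 1
    hours, minutes = int(m.group(2)), int(m.group(3))
    seconds = int(m.group(4) or 0)
    fraction = float(m.group(5) or 0.0)
    micros = ((hours * 60 + minutes) * 60 + seconds) * 1_000_000
    micros += round(fraction * 1_000_000)  # fraction kept to microsecond precision
    return sign * micros
```

For example, `parse_hour_to_second("-12:40")` yields a negative microsecond count, while `parse_hour_to_second("5 12:40")` raises, mirroring the `checkFail` case in the quoted test.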
[GitHub] [spark] SparkQA commented on issue #25728: [SPARK-29020][WIP][SQL] Improving array_sort behaviour
SparkQA commented on issue #25728: [SPARK-29020][WIP][SQL] Improving array_sort behaviour URL: https://github.com/apache/spark/pull/25728#issuecomment-552742241 **[Test build #113613 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/113613/testReport)** for PR 25728 at commit [`89653bc`](https://github.com/apache/spark/commit/89653bc3892405160f7e53259cf64fe7e275e5c5).
[GitHub] [spark] AmplabJenkins removed a comment on issue #26476: [SPARK-29851][SQL] V2 catalog: Change default behavior of dropping namespace to cascade
AmplabJenkins removed a comment on issue #26476: [SPARK-29851][SQL] V2 catalog: Change default behavior of dropping namespace to cascade URL: https://github.com/apache/spark/pull/26476#issuecomment-552740931 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/113609/ Test FAILed.
[GitHub] [spark] SparkQA commented on issue #26471: [SPARK-29850][SQL] sort-merge-join an empty table should not memory leak
SparkQA commented on issue #26471: [SPARK-29850][SQL] sort-merge-join an empty table should not memory leak URL: https://github.com/apache/spark/pull/26471#issuecomment-552741202 **[Test build #113611 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/113611/testReport)** for PR 26471 at commit [`28416df`](https://github.com/apache/spark/commit/28416df9acb3de964bf1c152033c63b405250ce5).
[GitHub] [spark] AmplabJenkins removed a comment on issue #26476: [SPARK-29851][SQL] V2 catalog: Change default behavior of dropping namespace to cascade
AmplabJenkins removed a comment on issue #26476: [SPARK-29851][SQL] V2 catalog: Change default behavior of dropping namespace to cascade URL: https://github.com/apache/spark/pull/26476#issuecomment-552740923 Merged build finished. Test FAILed.
[GitHub] [spark] 07ARB opened a new pull request #26477: [SPARK-29776][SQL] rpad returning invalid value when parameter is empty
07ARB opened a new pull request #26477: [SPARK-29776][SQL] rpad returning invalid value when parameter is empty URL: https://github.com/apache/spark/pull/26477 ### What changes were proposed in this pull request? Make rpad return a NULL value when the padding parameter is empty. ### Why are the changes needed? A check is needed for the case where the padding string is empty and the required length is greater than zero. ### Does this PR introduce any user-facing change? No ### How was this patch tested? Existing unit tests were corrected per this JIRA.
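The rpad change is symmetric to the lpad one. Again as a hedged sketch — a hypothetical model, not Spark's implementation, with the NULL-only-when-padding-is-required rule assumed from the PR description:

```python
def rpad(s, length, pad):
    """Hypothetical model of rpad with the proposed empty-pad rule:
    return SQL NULL (None) when the pad string is empty but padding
    would actually be required (length > len(s))."""
    if s is None or length is None or pad is None:
        return None                      # SQL NULL inputs propagate
    if length <= 0:
        return ""
    if len(s) >= length:
        return s[:length]                # truncate; no padding needed
    if pad == "":
        return None                      # proposed: NULL instead of ''
    need = length - len(s)
    return s + (pad * (need // len(pad) + 1))[:need]
```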
[GitHub] [spark] SparkQA commented on issue #25971: [SPARK-29298][CORE] Separate block manager heartbeat endpoint from driver endpoint
SparkQA commented on issue #25971: [SPARK-29298][CORE] Separate block manager heartbeat endpoint from driver endpoint URL: https://github.com/apache/spark/pull/25971#issuecomment-552741231 **[Test build #113612 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/113612/testReport)** for PR 25971 at commit [`7b8b398`](https://github.com/apache/spark/commit/7b8b398633789b65d116ce716d6fb1afcded0427).
[GitHub] [spark] cloud-fan commented on issue #26176: [SPARK-29519][SQL] SHOW TBLPROPERTIES should do multi-catalog resolution.
cloud-fan commented on issue #26176: [SPARK-29519][SQL] SHOW TBLPROPERTIES should do multi-catalog resolution. URL: https://github.com/apache/spark/pull/26176#issuecomment-552741205 thanks, merging to master!
[GitHub] [spark] cloud-fan closed pull request #26176: [SPARK-29519][SQL] SHOW TBLPROPERTIES should do multi-catalog resolution.
cloud-fan closed pull request #26176: [SPARK-29519][SQL] SHOW TBLPROPERTIES should do multi-catalog resolution. URL: https://github.com/apache/spark/pull/26176
[GitHub] [spark] AmplabJenkins commented on issue #26476: [SPARK-29851][SQL] V2 catalog: Change default behavior of dropping namespace to cascade
AmplabJenkins commented on issue #26476: [SPARK-29851][SQL] V2 catalog: Change default behavior of dropping namespace to cascade URL: https://github.com/apache/spark/pull/26476#issuecomment-552740923 Merged build finished. Test FAILed.
[GitHub] [spark] huaxingao commented on issue #25573: [SPARK-28833][DOCS][SQL] Document ALTER VIEW command
huaxingao commented on issue #25573: [SPARK-28833][DOCS][SQL] Document ALTER VIEW command URL: https://github.com/apache/spark/pull/25573#issuecomment-552740967 @kevinyu98 The build failure seems to be OK. I always have that failure too.
[GitHub] [spark] AmplabJenkins commented on issue #26476: [SPARK-29851][SQL] V2 catalog: Change default behavior of dropping namespace to cascade
AmplabJenkins commented on issue #26476: [SPARK-29851][SQL] V2 catalog: Change default behavior of dropping namespace to cascade URL: https://github.com/apache/spark/pull/26476#issuecomment-552740931 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/113609/ Test FAILed.
[GitHub] [spark] SparkQA removed a comment on issue #26476: [SPARK-29851][SQL] V2 catalog: Change default behavior of dropping namespace to cascade
SparkQA removed a comment on issue #26476: [SPARK-29851][SQL] V2 catalog: Change default behavior of dropping namespace to cascade URL: https://github.com/apache/spark/pull/26476#issuecomment-552720168 **[Test build #113609 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/113609/testReport)** for PR 26476 at commit [`118ac4f`](https://github.com/apache/spark/commit/118ac4f20fbfa7b918d5e2959690a50a97015234).
[GitHub] [spark] SparkQA commented on issue #26476: [SPARK-29851][SQL] V2 catalog: Change default behavior of dropping namespace to cascade
SparkQA commented on issue #26476: [SPARK-29851][SQL] V2 catalog: Change default behavior of dropping namespace to cascade URL: https://github.com/apache/spark/pull/26476#issuecomment-552740675 **[Test build #113609 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/113609/testReport)** for PR 26476 at commit [`118ac4f`](https://github.com/apache/spark/commit/118ac4f20fbfa7b918d5e2959690a50a97015234). * This patch **fails Spark unit tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] [spark] AmplabJenkins removed a comment on issue #26413: [SPARK-16872][ML][PYSPARK] Impl Gaussian Naive Bayes Classifier
AmplabJenkins removed a comment on issue #26413: [SPARK-16872][ML][PYSPARK] Impl Gaussian Naive Bayes Classifier URL: https://github.com/apache/spark/pull/26413#issuecomment-552740105 Merged build finished. Test PASSed.