[GitHub] spark pull request: [SPARK-3501] [SQL] Fix the bug of Hive SimpleU...
Github user marmbrus commented on the pull request: https://github.com/apache/spark/pull/2368#issuecomment-56244170 Thanks! Merged to master. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-3501] [SQL] Fix the bug of Hive SimpleU...
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/2368 --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-3501] [SQL] Fix the bug of Hive SimpleU...
Github user marmbrus commented on a diff in the pull request: https://github.com/apache/spark/pull/2368#discussion_r17619745 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/Cast.scala --- @@ -244,18 +244,22 @@ case class Cast(child: Expression, dataType: DataType) extends UnaryExpression { b = x.numeric.asInstanceOf[Numeric[Any]].toFloat(b) } - private[this] lazy val cast: Any = Any = dataType match { -case StringType = castToString -case BinaryType = castToBinary -case DecimalType = castToDecimal -case TimestampType = castToTimestamp -case BooleanType = castToBoolean -case ByteType = castToByte -case ShortType = castToShort -case IntegerType = castToInt -case FloatType = castToFloat -case LongType = castToLong -case DoubleType = castToDouble + private[this] lazy val cast: Any = Any = if (child.dataType == dataType) { +(e: Any) = e + } else { +dataType match { + case StringType = castToString --- End diff -- Perhaps: ```scala private[this] lazy val cast: Any = Any = dataType match { case dt if dt == child.dataType = identity[Any] _ case StringType = castToString case BinaryType = castToBinary case DecimalType = castToDecimal case TimestampType = castToTimestamp case BooleanType = castToBoolean case ByteType = castToByte case ShortType = castToShort case IntegerType = castToInt case FloatType = castToFloat case LongType = castToLong case DoubleType = castToDouble } ``` --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-3501] [SQL] Fix the bug of Hive SimpleU...
Github user marmbrus commented on the pull request: https://github.com/apache/spark/pull/2368#issuecomment-55788087 Minor style suggestion, otherwise LGTM. Thanks! --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-3501] [SQL] Fix the bug of Hive SimpleU...
Github user chenghao-intel commented on a diff in the pull request: https://github.com/apache/spark/pull/2368#discussion_r17639971 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/Cast.scala --- @@ -244,18 +244,22 @@ case class Cast(child: Expression, dataType: DataType) extends UnaryExpression { b = x.numeric.asInstanceOf[Numeric[Any]].toFloat(b) } - private[this] lazy val cast: Any = Any = dataType match { -case StringType = castToString -case BinaryType = castToBinary -case DecimalType = castToDecimal -case TimestampType = castToTimestamp -case BooleanType = castToBoolean -case ByteType = castToByte -case ShortType = castToShort -case IntegerType = castToInt -case FloatType = castToFloat -case LongType = castToLong -case DoubleType = castToDouble + private[this] lazy val cast: Any = Any = if (child.dataType == dataType) { +(e: Any) = e + } else { +dataType match { + case StringType = castToString --- End diff -- Thank you @marmbrus , it's more clean. :) --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-3501] [SQL] Fix the bug of Hive SimpleU...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/2368#issuecomment-55843681 [QA tests have started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/20445/consoleFull) for PR 2368 at commit [`5c9c3a5`](https://github.com/apache/spark/commit/5c9c3a59d423383885151ff9cd17e0744776baf0). * This patch merges cleanly. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-3501] [SQL] Fix the bug of Hive SimpleU...
Github user chenghao-intel commented on the pull request: https://github.com/apache/spark/pull/2368#issuecomment-6637 retest this please. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-3501] [SQL] Fix the bug of Hive SimpleU...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/2368#issuecomment-6842 [QA tests have started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/20329/consoleFull) for PR 2368 at commit [`b804abd`](https://github.com/apache/spark/commit/b804abd5be4161531db38193be310cf628674cec). * This patch merges cleanly. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-3501] [SQL] Fix the bug of Hive SimpleU...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/2368#issuecomment-55564638 **[Tests timed out](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/20329/consoleFull)** after a configured wait of `120m`. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-3501] [SQL] Fix the bug of Hive SimpleU...
Github user yhuai commented on a diff in the pull request: https://github.com/apache/spark/pull/2368#discussion_r17562206 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala --- @@ -32,12 +32,12 @@ object Optimizer extends RuleExecutor[LogicalPlan] { Batch(Combine Limits, FixedPoint(100), CombineLimits) :: Batch(ConstantFolding, FixedPoint(100), + SimplifyCasts, --- End diff -- I guess we do not want to rely on optimization rules to avoid exceptions? --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-3501] [SQL] Fix the bug of Hive SimpleU...
Github user marmbrus commented on a diff in the pull request: https://github.com/apache/spark/pull/2368#discussion_r17562464 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala --- @@ -32,12 +32,12 @@ object Optimizer extends RuleExecutor[LogicalPlan] { Batch(Combine Limits, FixedPoint(100), CombineLimits) :: Batch(ConstantFolding, FixedPoint(100), + SimplifyCasts, --- End diff -- I'm not proposing that we don't use simplify casts. I'm proposing that it should be possible to evaluate expressions even if we don't perform certain optimizations. Any valid expression should be evaluate-able and there is nothing invalid about casting a type to itself (just like there is nothing invalid with an expression like `x + 0` even though the `+ 0` is a no-op). --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-3501] [SQL] Fix the bug of Hive SimpleU...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/2368#issuecomment-55685521 [QA tests have started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/20361/consoleFull) for PR 2368 at commit [`3f15731`](https://github.com/apache/spark/commit/3f1573133ec7e9515f61361f4c37d65376a2). * This patch merges cleanly. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-3501] [SQL] Fix the bug of Hive SimpleU...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/2368#issuecomment-55690559 [QA tests have finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/20361/consoleFull) for PR 2368 at commit [`3f15731`](https://github.com/apache/spark/commit/3f1573133ec7e9515f61361f4c37d65376a2). * This patch **passes** unit tests. * This patch merges cleanly. * This patch adds no public classes. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-3501] [SQL] Fix the bug of Hive SimpleU...
Github user chenghao-intel commented on the pull request: https://github.com/apache/spark/pull/2368#issuecomment-55693692 You're right @marmbrus . the expression should work even without being optimized. I've updated the no-op case in `Cast` and revert the change in the `Optimizer`. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-3501] [SQL] Fix the bug of Hive SimpleU...
Github user chenghao-intel commented on a diff in the pull request: https://github.com/apache/spark/pull/2368#discussion_r17524564 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala --- @@ -32,12 +32,12 @@ object Optimizer extends RuleExecutor[LogicalPlan] { Batch(Combine Limits, FixedPoint(100), CombineLimits) :: Batch(ConstantFolding, FixedPoint(100), + SimplifyCasts, --- End diff -- For this case, I don't think we have to move `SimplifyCasts` ahead, I am wondering if people construct the expressions with DSL API may jump into this trap accidentally. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-3501] [SQL] Fix the bug of Hive SimpleU...
Github user marmbrus commented on a diff in the pull request: https://github.com/apache/spark/pull/2368#discussion_r17524775 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala --- @@ -32,12 +32,12 @@ object Optimizer extends RuleExecutor[LogicalPlan] { Batch(Combine Limits, FixedPoint(100), CombineLimits) :: Batch(ConstantFolding, FixedPoint(100), + SimplifyCasts, --- End diff -- I'm inclined to leave it as it is. All of the rules in this batch are optimizations, and thus it should be possible to apply them at anytime without an exception being thrown. If there are other bugs with the casting operators we should fix those instead of rearranging to mask the problem. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-3501] [SQL] Fix the bug of Hive SimpleU...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/2368#issuecomment-55548796 [QA tests have started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/20318/consoleFull) for PR 2368 at commit [`330a5c8`](https://github.com/apache/spark/commit/330a5c8331e22ba122dbc8b09c592cf60f97c919). * This patch merges cleanly. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-3501] [SQL] Fix the bug of Hive SimpleU...
Github user chenghao-intel commented on a diff in the pull request: https://github.com/apache/spark/pull/2368#discussion_r17525387 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala --- @@ -32,12 +32,12 @@ object Optimizer extends RuleExecutor[LogicalPlan] { Batch(Combine Limits, FixedPoint(100), CombineLimits) :: Batch(ConstantFolding, FixedPoint(100), + SimplifyCasts, --- End diff -- I just added an new unit test ```SELECT CAST(CAST('123' AS binary) AS binary) FROM src LIMIT 1```, this test will fail because the `ConstantFolding` will try to compute the `foldable` expression prior to `SimplifyCasts`. Sorry, I should add this test case earlier. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-3501] [SQL] Fix the bug of Hive SimpleU...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/2368#issuecomment-55549403 [QA tests have started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/20319/consoleFull) for PR 2368 at commit [`b804abd`](https://github.com/apache/spark/commit/b804abd5be4161531db38193be310cf628674cec). * This patch merges cleanly. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-3501] [SQL] Fix the bug of Hive SimpleU...
Github user marmbrus commented on a diff in the pull request: https://github.com/apache/spark/pull/2368#discussion_r17525906 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala --- @@ -32,12 +32,12 @@ object Optimizer extends RuleExecutor[LogicalPlan] { Batch(Combine Limits, FixedPoint(100), CombineLimits) :: Batch(ConstantFolding, FixedPoint(100), + SimplifyCasts, --- End diff -- Okay, I see. I would argue that's a bug in the casting operator, not a reason to change the ordering here though. On Sep 14, 2014 8:09 PM, Cheng Hao notificati...@github.com wrote: In sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala: @@ -32,12 +32,12 @@ object Optimizer extends RuleExecutor[LogicalPlan] { Batch(Combine Limits, FixedPoint(100), CombineLimits) :: Batch(ConstantFolding, FixedPoint(100), + SimplifyCasts, I just added an new unit test SELECT CAST(CAST('123' AS binary) AS binary) FROM src LIMIT 1, this test will fail because the ConstantFolding will try to compute the foldable expression prior to SimplifyCasts. Sorry, I should add this test case earlier. â Reply to this email directly or view it on GitHub https://github.com/apache/spark/pull/2368/files#r17525387. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-3501] [SQL] Fix the bug of Hive SimpleU...
Github user adrian-wang commented on a diff in the pull request: https://github.com/apache/spark/pull/2368#discussion_r17526186 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala --- @@ -32,12 +32,12 @@ object Optimizer extends RuleExecutor[LogicalPlan] { Batch(Combine Limits, FixedPoint(100), CombineLimits) :: Batch(ConstantFolding, FixedPoint(100), + SimplifyCasts, --- End diff -- So maybe it is better to add a match case in the function `lazy val cast: Any = Any`, to eliminate those meaningless casts. And in this way we'll no longer need `SimplifyCasts` here. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-3501] [SQL] Fix the bug of Hive SimpleU...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/2368#issuecomment-2752 **[Tests timed out](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/20318/consoleFull)** after a configured wait of `120m`. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-3501] [SQL] Fix the bug of Hive SimpleU...
Github user chenghao-intel commented on a diff in the pull request: https://github.com/apache/spark/pull/2368#discussion_r17526456 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala --- @@ -32,12 +32,12 @@ object Optimizer extends RuleExecutor[LogicalPlan] { Batch(Combine Limits, FixedPoint(100), CombineLimits) :: Batch(ConstantFolding, FixedPoint(100), + SimplifyCasts, --- End diff -- Usually we don't change the expression tree within an expression node, but utilizing an external rule object like `SimplifyCasts`, thus they can achieve the same goal, the later will make the code more clean and effective in runtime. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-3501] [SQL] Fix the bug of Hive SimpleU...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/2368#issuecomment-3380 **[Tests timed out](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/20319/consoleFull)** after a configured wait of `120m`. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-3501] [SQL] Fix the bug of Hive SimpleU...
Github user marmbrus commented on a diff in the pull request: https://github.com/apache/spark/pull/2368#discussion_r17513558 --- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/hiveUdfs.scala --- @@ -51,13 +51,13 @@ private[hive] abstract class HiveFunctionRegistry val function = functionInfo.getFunctionClass.newInstance().asInstanceOf[UDF] val method = function.getResolver.getEvalMethod(children.map(_.dataType.toTypeInfo)) - lazy val expectedDataTypes = method.getParameterTypes.map(javaClassToDataType) + val expectedDataTypes = method.getParameterTypes.map(javaClassToDataType) HiveSimpleUdf( functionClassName, children.zip(expectedDataTypes).map { case (e, NullType) = e - case (e, t) = Cast(e, t) + case (e, t) = if (e.dataType == t) e else Cast(e, t) --- End diff -- Nit: I'd structure this as a guard to call out the two cases more clearly: ```scala case (e, t) if (e.dataType == t) = e case (e, t) = Cast(e, t) ``` --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-3501] [SQL] Fix the bug of Hive SimpleU...
Github user marmbrus commented on a diff in the pull request: https://github.com/apache/spark/pull/2368#discussion_r17513564 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala --- @@ -32,12 +32,12 @@ object Optimizer extends RuleExecutor[LogicalPlan] { Batch(Combine Limits, FixedPoint(100), CombineLimits) :: Batch(ConstantFolding, FixedPoint(100), + SimplifyCasts, --- End diff -- Do we need to move this still? --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-3501] [SQL] Fix the bug of Hive SimpleU...
Github user marmbrus commented on the pull request: https://github.com/apache/spark/pull/2368#issuecomment-55502187 Thanks for fixing this! While I think eliminating the problem is great, it also seems like a bug to me that casting a timestamp to a timestamp throws an exception since none of the other datatypes do that. Mind adding a no-op case to the `eval` path as well? --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-3501] [SQL] Fix the bug of Hive SimpleU...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/2368#issuecomment-55369750 [QA tests have finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/20208/consoleFull) for PR 2368 at commit [`b834ed4`](https://github.com/apache/spark/commit/b834ed40363b19f924d51277880300d2f60f0524). * This patch **passes** unit tests. * This patch merges cleanly. * This patch adds no public classes. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-3501] [SQL] Fix the bug of Hive SimpleU...
Github user chenghao-intel commented on the pull request: https://github.com/apache/spark/pull/2368#issuecomment-55363030 test this please. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-3501] [SQL] Fix the bug of Hive SimpleU...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/2368#issuecomment-55363315 [QA tests have started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/20208/consoleFull) for PR 2368 at commit [`b834ed4`](https://github.com/apache/spark/commit/b834ed40363b19f924d51277880300d2f60f0524). * This patch merges cleanly. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org