[GitHub] spark pull request #19752: [SPARK-22520][SQL] Support code generation for la...
Github user asfgit closed the pull request at:

    https://github.com/apache/spark/pull/19752
[GitHub] spark pull request #19752: [SPARK-22520][SQL] Support code generation for la...
Github user mgaido91 commented on a diff in the pull request:

    https://github.com/apache/spark/pull/19752#discussion_r153242781

--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/conditionalExpressions.scala ---
@@ -158,111 +178,86 @@ abstract class CaseWhenBase(
 val elseCase = elseValue.map(" ELSE " + _.sql).getOrElse("")
 "CASE" + cases + elseCase + " END"
 }
-}
-
-
-/**
- * Case statements of the form "CASE WHEN a THEN b [WHEN c THEN d]* [ELSE e] END".
- * When a = true, returns b; when c = true, returns d; else returns e.
- *
- * @param branches seq of (branch condition, branch value)
- * @param elseValue optional value for the else branch
- */
-// scalastyle:off line.size.limit
-@ExpressionDescription(
- usage = "CASE WHEN expr1 THEN expr2 [WHEN expr3 THEN expr4]* [ELSE expr5] END - When `expr1` = true, returns `expr2`; else when `expr3` = true, returns `expr4`; else returns `expr5`.",
- arguments = """
-Arguments:
- * expr1, expr3 - the branch condition expressions should all be boolean type.
- * expr2, expr4, expr5 - the branch value expressions and else value expression should all be
- same type or coercible to a common type.
- """,
- examples = """
-Examples:
- > SELECT CASE WHEN 1 > 0 THEN 1 WHEN 2 > 0 THEN 2.0 ELSE 1.2 END;
- 1
- > SELECT CASE WHEN 1 < 0 THEN 1 WHEN 2 > 0 THEN 2.0 ELSE 1.2 END;
- 2
- > SELECT CASE WHEN 1 < 0 THEN 1 WHEN 2 < 0 THEN 2.0 ELSE null END;
- NULL
- """)
-// scalastyle:on line.size.limit
-case class CaseWhen(
-val branches: Seq[(Expression, Expression)],
-val elseValue: Option[Expression] = None)
- extends CaseWhenBase(branches, elseValue) with CodegenFallback with Serializable {
-
- override def doGenCode(ctx: CodegenContext, ev: ExprCode): ExprCode = {
-super[CodegenFallback].doGenCode(ctx, ev)
- }
-
- def toCodegen(): CaseWhenCodegen = {
-CaseWhenCodegen(branches, elseValue)
- }
-}
-
-/**
- * CaseWhen expression used when code generation condition is satisfied.
- * OptimizeCodegen optimizer replaces CaseWhen into CaseWhenCodegen.
- *
- * @param branches seq of (branch condition, branch value)
- * @param elseValue optional value for the else branch
- */
-case class CaseWhenCodegen(
-val branches: Seq[(Expression, Expression)],
-val elseValue: Option[Expression] = None)
- extends CaseWhenBase(branches, elseValue) with Serializable {
 override def doGenCode(ctx: CodegenContext, ev: ExprCode): ExprCode = {
-// Generate code that looks like:
-//
-// condA = ...
-// if (condA) {
-// valueA
-// } else {
-// condB = ...
-// if (condB) {
-// valueB
-// } else {
-// condC = ...
-// if (condC) {
-// valueC
-// } else {
-// elseValue
-// }
-// }
-// }
+// This variable represents whether the first successful condition is met or not.
+// It is initialized to `false` and it is set to `true` when the first condition which
+// evaluates to `true` is met and therefore is not needed to go on anymore on the computation
+// of the following conditions.
+val conditionMet = ctx.freshName("caseWhenConditionMet")
+ctx.addMutableState(ctx.JAVA_BOOLEAN, ev.isNull)
+ctx.addMutableState(ctx.javaType(dataType), ev.value)
 val cases = branches.map { case (condExpr, valueExpr) =>
 val cond = condExpr.genCode(ctx)
 val res = valueExpr.genCode(ctx)
 s"""
-${cond.code}
-if (!${cond.isNull} && ${cond.value}) {
- ${res.code}
- ${ev.isNull} = ${res.isNull};
- ${ev.value} = ${res.value};
+if(!$conditionMet) {
+ ${cond.code}
+ if (!${cond.isNull} && ${cond.value}) {
+${res.code}
+${ev.isNull} = ${res.isNull};
+${ev.value} = ${res.value};
+$conditionMet = true;
+ }
 }
 """
 }
-var generatedCode = cases.mkString("", "\nelse {\n", "\nelse {\n")
-
-elseValue.foreach { elseExpr =>
+val elseCode = elseValue.map { elseExpr =>
 val res = elseExpr.genCode(ctx)
- generatedCode +=
-s"""
+ s"""
+if(!$conditionMet) {
 ${res.code}
 ${ev.isNull} = ${res.isNull};
 ${ev.value} = ${res.value};
-"""
+}
+ """
 }
-generatedCode += "}\n" * cases.size
+val
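The quoted hunk replaces the nested `if / else { if / else { ... } }` template with a flat sequence of blocks, each guarded by a `conditionMet` flag. The following is a minimal Java sketch of the two shapes for a two-branch CASE WHEN, written by hand to show that they select the same branch. It is not the code Spark emits: the class `CaseWhenShapes`, the `Result` holder (standing in for `ev.isNull` / `ev.value`), and the use of a nullable `Boolean` to model `cond.isNull` are all assumptions made for illustration.

```java
public class CaseWhenShapes {
    /** Hypothetical stand-in for the generated ev.isNull / ev.value pair. */
    public static final class Result {
        public boolean isNull = true;
        public int value = 0;
    }

    /** Old shape: every additional WHEN branch adds one more level of nesting. */
    public static Result nested(Boolean condA, Boolean condB, int a, int b, Integer elseValue) {
        Result ev = new Result();
        if (condA != null && condA) {
            ev.isNull = false;
            ev.value = a;
        } else {
            if (condB != null && condB) {
                ev.isNull = false;
                ev.value = b;
            } else if (elseValue != null) {
                ev.isNull = false;
                ev.value = elseValue;
            }
        }
        return ev;
    }

    /** New shape: one independent block per branch, each guarded by the flag. */
    public static Result flat(Boolean condA, Boolean condB, int a, int b, Integer elseValue) {
        Result ev = new Result();
        boolean conditionMet = false;
        if (!conditionMet) {
            if (condA != null && condA) {
                ev.isNull = false;
                ev.value = a;
                conditionMet = true;
            }
        }
        if (!conditionMet) {
            if (condB != null && condB) {
                ev.isNull = false;
                ev.value = b;
                conditionMet = true;
            }
        }
        // ELSE block: runs only when no WHEN condition evaluated to true.
        if (!conditionMet && elseValue != null) {
            ev.isNull = false;
            ev.value = elseValue;
        }
        return ev;
    }

    public static void main(String[] args) {
        // Compare both shapes over every true / false / null combination.
        Boolean[] states = {Boolean.TRUE, Boolean.FALSE, null};
        for (Boolean condA : states) {
            for (Boolean condB : states) {
                Result n = nested(condA, condB, 10, 20, 30);
                Result f = flat(condA, condB, 10, 20, 30);
                if (n.isNull != f.isNull || n.value != f.value) {
                    throw new IllegalStateException("shapes disagree");
                }
            }
        }
        System.out.println("nested and flat shapes agree");
    }
}
```

In the flat shape the nesting depth stays constant however many branches there are, at the cost of one extra boolean test per block once a condition has matched.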
[GitHub] spark pull request #19752: [SPARK-22520][SQL] Support code generation for la...
Github user cloud-fan commented on a diff in the pull request:

    https://github.com/apache/spark/pull/19752#discussion_r153241209

--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/conditionalExpressions.scala ---
@@ -158,111 +178,86 @@ abstract class CaseWhenBase(
[GitHub] spark pull request #19752: [SPARK-22520][SQL] Support code generation for la...
Github user mgaido91 commented on a diff in the pull request:

    https://github.com/apache/spark/pull/19752#discussion_r153239636

--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/conditionalExpressions.scala ---
@@ -158,111 +178,86 @@ abstract class CaseWhenBase(
[GitHub] spark pull request #19752: [SPARK-22520][SQL] Support code generation for la...
Github user mgaido91 commented on a diff in the pull request:

    https://github.com/apache/spark/pull/19752#discussion_r153237434

--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/conditionalExpressions.scala ---
@@ -158,111 +178,86 @@ abstract class CaseWhenBase(
[GitHub] spark pull request #19752: [SPARK-22520][SQL] Support code generation for la...
Github user mgaido91 commented on a diff in the pull request:

    https://github.com/apache/spark/pull/19752#discussion_r153236140

--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/conditionalExpressions.scala ---
@@ -158,111 +178,86 @@ abstract class CaseWhenBase(
[GitHub] spark pull request #19752: [SPARK-22520][SQL] Support code generation for la...
Github user cloud-fan commented on a diff in the pull request:

    https://github.com/apache/spark/pull/19752#discussion_r153226714

--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/conditionalExpressions.scala ---
@@ -158,111 +178,86 @@ abstract class CaseWhenBase(
[GitHub] spark pull request #19752: [SPARK-22520][SQL] Support code generation for la...
Github user cloud-fan commented on a diff in the pull request:

    https://github.com/apache/spark/pull/19752#discussion_r153225638

--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/conditionalExpressions.scala ---
@@ -158,111 +178,86 @@ abstract class CaseWhenBase(
[GitHub] spark pull request #19752: [SPARK-22520][SQL] Support code generation for la...
Github user mgaido91 commented on a diff in the pull request:

    https://github.com/apache/spark/pull/19752#discussion_r153183113

--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/conditionalExpressions.scala ---
@@ -158,111 +178,86 @@ abstract class CaseWhenBase(
[GitHub] spark pull request #19752: [SPARK-22520][SQL] Support code generation for la...
Github user mgaido91 commented on a diff in the pull request: https://github.com/apache/spark/pull/19752#discussion_r153181270 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/conditionalExpressions.scala --- @@ -158,111 +178,86 @@ abstract class CaseWhenBase( val elseCase = elseValue.map(" ELSE " + _.sql).getOrElse("") "CASE" + cases + elseCase + " END" } -} - - -/** - * Case statements of the form "CASE WHEN a THEN b [WHEN c THEN d]* [ELSE e] END". - * When a = true, returns b; when c = true, returns d; else returns e. - * - * @param branches seq of (branch condition, branch value) - * @param elseValue optional value for the else branch - */ -// scalastyle:off line.size.limit -@ExpressionDescription( - usage = "CASE WHEN expr1 THEN expr2 [WHEN expr3 THEN expr4]* [ELSE expr5] END - When `expr1` = true, returns `expr2`; else when `expr3` = true, returns `expr4`; else returns `expr5`.", - arguments = """ -Arguments: - * expr1, expr3 - the branch condition expressions should all be boolean type. - * expr2, expr4, expr5 - the branch value expressions and else value expression should all be - same type or coercible to a common type. - """, - examples = """ -Examples: - > SELECT CASE WHEN 1 > 0 THEN 1 WHEN 2 > 0 THEN 2.0 ELSE 1.2 END; - 1 - > SELECT CASE WHEN 1 < 0 THEN 1 WHEN 2 > 0 THEN 2.0 ELSE 1.2 END; - 2 - > SELECT CASE WHEN 1 < 0 THEN 1 WHEN 2 < 0 THEN 2.0 ELSE null END; - NULL - """) -// scalastyle:on line.size.limit -case class CaseWhen( -val branches: Seq[(Expression, Expression)], -val elseValue: Option[Expression] = None) - extends CaseWhenBase(branches, elseValue) with CodegenFallback with Serializable { - - override def doGenCode(ctx: CodegenContext, ev: ExprCode): ExprCode = { -super[CodegenFallback].doGenCode(ctx, ev) - } - - def toCodegen(): CaseWhenCodegen = { -CaseWhenCodegen(branches, elseValue) - } -} - -/** - * CaseWhen expression used when code generation condition is satisfied. 
- * OptimizeCodegen optimizer replaces CaseWhen into CaseWhenCodegen. - * - * @param branches seq of (branch condition, branch value) - * @param elseValue optional value for the else branch - */ -case class CaseWhenCodegen( -val branches: Seq[(Expression, Expression)], -val elseValue: Option[Expression] = None) - extends CaseWhenBase(branches, elseValue) with Serializable { override def doGenCode(ctx: CodegenContext, ev: ExprCode): ExprCode = { -// Generate code that looks like: -// -// condA = ... -// if (condA) { -// valueA -// } else { -// condB = ... -// if (condB) { -// valueB -// } else { -// condC = ... -// if (condC) { -// valueC -// } else { -// elseValue -// } -// } -// } +// This variable represents whether the first successful condition is met or not. +// It is initialized to `false` and it is set to `true` when the first condition which +// evaluates to `true` is met and therefore is not needed to go on anymore on the computation +// of the following conditions. +val conditionMet = ctx.freshName("caseWhenConditionMet") +ctx.addMutableState(ctx.JAVA_BOOLEAN, ev.isNull) +ctx.addMutableState(ctx.javaType(dataType), ev.value) val cases = branches.map { case (condExpr, valueExpr) => val cond = condExpr.genCode(ctx) val res = valueExpr.genCode(ctx) s""" -${cond.code} -if (!${cond.isNull} && ${cond.value}) { - ${res.code} - ${ev.isNull} = ${res.isNull}; - ${ev.value} = ${res.value}; +if(!$conditionMet) { + ${cond.code} + if (!${cond.isNull} && ${cond.value}) { +${res.code} +${ev.isNull} = ${res.isNull}; +${ev.value} = ${res.value}; +$conditionMet = true; + } } """ } -var generatedCode = cases.mkString("", "\nelse {\n", "\nelse {\n") - -elseValue.foreach { elseExpr => +val elseCode = elseValue.map { elseExpr => val res = elseExpr.genCode(ctx) - generatedCode += -s""" + s""" +if(!$conditionMet) { ${res.code} ${ev.isNull} = ${res.isNull}; ${ev.value} = ${res.value}; -""" +} + """ } -generatedCode += "}\n" * cases.size +val
[GitHub] spark pull request #19752: [SPARK-22520][SQL] Support code generation for la...
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/19752#discussion_r153177401 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/conditionalExpressions.scala --- @@ -158,111 +178,86 @@ abstract class CaseWhenBase( val elseCase = elseValue.map(" ELSE " + _.sql).getOrElse("") "CASE" + cases + elseCase + " END" } -} - - -/** - * Case statements of the form "CASE WHEN a THEN b [WHEN c THEN d]* [ELSE e] END". - * When a = true, returns b; when c = true, returns d; else returns e. - * - * @param branches seq of (branch condition, branch value) - * @param elseValue optional value for the else branch - */ -// scalastyle:off line.size.limit -@ExpressionDescription( - usage = "CASE WHEN expr1 THEN expr2 [WHEN expr3 THEN expr4]* [ELSE expr5] END - When `expr1` = true, returns `expr2`; else when `expr3` = true, returns `expr4`; else returns `expr5`.", - arguments = """ -Arguments: - * expr1, expr3 - the branch condition expressions should all be boolean type. - * expr2, expr4, expr5 - the branch value expressions and else value expression should all be - same type or coercible to a common type. - """, - examples = """ -Examples: - > SELECT CASE WHEN 1 > 0 THEN 1 WHEN 2 > 0 THEN 2.0 ELSE 1.2 END; - 1 - > SELECT CASE WHEN 1 < 0 THEN 1 WHEN 2 > 0 THEN 2.0 ELSE 1.2 END; - 2 - > SELECT CASE WHEN 1 < 0 THEN 1 WHEN 2 < 0 THEN 2.0 ELSE null END; - NULL - """) -// scalastyle:on line.size.limit -case class CaseWhen( -val branches: Seq[(Expression, Expression)], -val elseValue: Option[Expression] = None) - extends CaseWhenBase(branches, elseValue) with CodegenFallback with Serializable { - - override def doGenCode(ctx: CodegenContext, ev: ExprCode): ExprCode = { -super[CodegenFallback].doGenCode(ctx, ev) - } - - def toCodegen(): CaseWhenCodegen = { -CaseWhenCodegen(branches, elseValue) - } -} - -/** - * CaseWhen expression used when code generation condition is satisfied. 
- * OptimizeCodegen optimizer replaces CaseWhen into CaseWhenCodegen. - * - * @param branches seq of (branch condition, branch value) - * @param elseValue optional value for the else branch - */ -case class CaseWhenCodegen( -val branches: Seq[(Expression, Expression)], -val elseValue: Option[Expression] = None) - extends CaseWhenBase(branches, elseValue) with Serializable { override def doGenCode(ctx: CodegenContext, ev: ExprCode): ExprCode = { -// Generate code that looks like: -// -// condA = ... -// if (condA) { -// valueA -// } else { -// condB = ... -// if (condB) { -// valueB -// } else { -// condC = ... -// if (condC) { -// valueC -// } else { -// elseValue -// } -// } -// } +// This variable represents whether the first successful condition is met or not. +// It is initialized to `false` and it is set to `true` when the first condition which +// evaluates to `true` is met and therefore is not needed to go on anymore on the computation +// of the following conditions. +val conditionMet = ctx.freshName("caseWhenConditionMet") +ctx.addMutableState(ctx.JAVA_BOOLEAN, ev.isNull) +ctx.addMutableState(ctx.javaType(dataType), ev.value) val cases = branches.map { case (condExpr, valueExpr) => val cond = condExpr.genCode(ctx) val res = valueExpr.genCode(ctx) s""" -${cond.code} -if (!${cond.isNull} && ${cond.value}) { - ${res.code} - ${ev.isNull} = ${res.isNull}; - ${ev.value} = ${res.value}; +if(!$conditionMet) { + ${cond.code} + if (!${cond.isNull} && ${cond.value}) { +${res.code} +${ev.isNull} = ${res.isNull}; +${ev.value} = ${res.value}; +$conditionMet = true; + } } """ } -var generatedCode = cases.mkString("", "\nelse {\n", "\nelse {\n") - -elseValue.foreach { elseExpr => +val elseCode = elseValue.map { elseExpr => val res = elseExpr.genCode(ctx) - generatedCode += -s""" + s""" +if(!$conditionMet) { ${res.code} ${ev.isNull} = ${res.isNull}; ${ev.value} = ${res.value}; -""" +} + """ } -generatedCode += "}\n" * cases.size +val
[GitHub] spark pull request #19752: [SPARK-22520][SQL] Support code generation for la...
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/19752#discussion_r153177002 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/conditionalExpressions.scala --- @@ -158,111 +178,86 @@ abstract class CaseWhenBase( val elseCase = elseValue.map(" ELSE " + _.sql).getOrElse("") "CASE" + cases + elseCase + " END" } -} - - -/** - * Case statements of the form "CASE WHEN a THEN b [WHEN c THEN d]* [ELSE e] END". - * When a = true, returns b; when c = true, returns d; else returns e. - * - * @param branches seq of (branch condition, branch value) - * @param elseValue optional value for the else branch - */ -// scalastyle:off line.size.limit -@ExpressionDescription( - usage = "CASE WHEN expr1 THEN expr2 [WHEN expr3 THEN expr4]* [ELSE expr5] END - When `expr1` = true, returns `expr2`; else when `expr3` = true, returns `expr4`; else returns `expr5`.", - arguments = """ -Arguments: - * expr1, expr3 - the branch condition expressions should all be boolean type. - * expr2, expr4, expr5 - the branch value expressions and else value expression should all be - same type or coercible to a common type. - """, - examples = """ -Examples: - > SELECT CASE WHEN 1 > 0 THEN 1 WHEN 2 > 0 THEN 2.0 ELSE 1.2 END; - 1 - > SELECT CASE WHEN 1 < 0 THEN 1 WHEN 2 > 0 THEN 2.0 ELSE 1.2 END; - 2 - > SELECT CASE WHEN 1 < 0 THEN 1 WHEN 2 < 0 THEN 2.0 ELSE null END; - NULL - """) -// scalastyle:on line.size.limit -case class CaseWhen( -val branches: Seq[(Expression, Expression)], -val elseValue: Option[Expression] = None) - extends CaseWhenBase(branches, elseValue) with CodegenFallback with Serializable { - - override def doGenCode(ctx: CodegenContext, ev: ExprCode): ExprCode = { -super[CodegenFallback].doGenCode(ctx, ev) - } - - def toCodegen(): CaseWhenCodegen = { -CaseWhenCodegen(branches, elseValue) - } -} - -/** - * CaseWhen expression used when code generation condition is satisfied. 
- * OptimizeCodegen optimizer replaces CaseWhen into CaseWhenCodegen. - * - * @param branches seq of (branch condition, branch value) - * @param elseValue optional value for the else branch - */ -case class CaseWhenCodegen( -val branches: Seq[(Expression, Expression)], -val elseValue: Option[Expression] = None) - extends CaseWhenBase(branches, elseValue) with Serializable { override def doGenCode(ctx: CodegenContext, ev: ExprCode): ExprCode = { -// Generate code that looks like: -// -// condA = ... -// if (condA) { -// valueA -// } else { -// condB = ... -// if (condB) { -// valueB -// } else { -// condC = ... -// if (condC) { -// valueC -// } else { -// elseValue -// } -// } -// } +// This variable represents whether the first successful condition is met or not. +// It is initialized to `false` and it is set to `true` when the first condition which +// evaluates to `true` is met and therefore is not needed to go on anymore on the computation +// of the following conditions. +val conditionMet = ctx.freshName("caseWhenConditionMet") +ctx.addMutableState(ctx.JAVA_BOOLEAN, ev.isNull) +ctx.addMutableState(ctx.javaType(dataType), ev.value) val cases = branches.map { case (condExpr, valueExpr) => val cond = condExpr.genCode(ctx) val res = valueExpr.genCode(ctx) s""" -${cond.code} -if (!${cond.isNull} && ${cond.value}) { - ${res.code} - ${ev.isNull} = ${res.isNull}; - ${ev.value} = ${res.value}; +if(!$conditionMet) { + ${cond.code} + if (!${cond.isNull} && ${cond.value}) { +${res.code} +${ev.isNull} = ${res.isNull}; +${ev.value} = ${res.value}; +$conditionMet = true; + } } """ } -var generatedCode = cases.mkString("", "\nelse {\n", "\nelse {\n") - -elseValue.foreach { elseExpr => +val elseCode = elseValue.map { elseExpr => val res = elseExpr.genCode(ctx) - generatedCode += -s""" + s""" +if(!$conditionMet) { ${res.code} ${ev.isNull} = ${res.isNull}; ${ev.value} = ${res.value}; -""" +} + """ } -generatedCode += "}\n" * cases.size +val
[GitHub] spark pull request #19752: [SPARK-22520][SQL] Support code generation for la...
Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/19752#discussion_r153124078 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/conditionalExpressions.scala --- @@ -158,111 +178,73 @@ abstract class CaseWhenBase( val elseCase = elseValue.map(" ELSE " + _.sql).getOrElse("") "CASE" + cases + elseCase + " END" } -} - - -/** - * Case statements of the form "CASE WHEN a THEN b [WHEN c THEN d]* [ELSE e] END". - * When a = true, returns b; when c = true, returns d; else returns e. - * - * @param branches seq of (branch condition, branch value) - * @param elseValue optional value for the else branch - */ -// scalastyle:off line.size.limit -@ExpressionDescription( - usage = "CASE WHEN expr1 THEN expr2 [WHEN expr3 THEN expr4]* [ELSE expr5] END - When `expr1` = true, returns `expr2`; else when `expr3` = true, returns `expr4`; else returns `expr5`.", - arguments = """ -Arguments: - * expr1, expr3 - the branch condition expressions should all be boolean type. - * expr2, expr4, expr5 - the branch value expressions and else value expression should all be - same type or coercible to a common type. - """, - examples = """ -Examples: - > SELECT CASE WHEN 1 > 0 THEN 1 WHEN 2 > 0 THEN 2.0 ELSE 1.2 END; - 1 - > SELECT CASE WHEN 1 < 0 THEN 1 WHEN 2 > 0 THEN 2.0 ELSE 1.2 END; - 2 - > SELECT CASE WHEN 1 < 0 THEN 1 WHEN 2 < 0 THEN 2.0 ELSE null END; - NULL - """) -// scalastyle:on line.size.limit -case class CaseWhen( -val branches: Seq[(Expression, Expression)], -val elseValue: Option[Expression] = None) - extends CaseWhenBase(branches, elseValue) with CodegenFallback with Serializable { - - override def doGenCode(ctx: CodegenContext, ev: ExprCode): ExprCode = { -super[CodegenFallback].doGenCode(ctx, ev) - } - - def toCodegen(): CaseWhenCodegen = { -CaseWhenCodegen(branches, elseValue) - } -} - -/** - * CaseWhen expression used when code generation condition is satisfied. 
- * OptimizeCodegen optimizer replaces CaseWhen into CaseWhenCodegen. - * - * @param branches seq of (branch condition, branch value) - * @param elseValue optional value for the else branch - */ -case class CaseWhenCodegen( -val branches: Seq[(Expression, Expression)], -val elseValue: Option[Expression] = None) - extends CaseWhenBase(branches, elseValue) with Serializable { override def doGenCode(ctx: CodegenContext, ev: ExprCode): ExprCode = { -// Generate code that looks like: -// -// condA = ... -// if (condA) { -// valueA -// } else { -// condB = ... -// if (condB) { -// valueB -// } else { -// condC = ... -// if (condC) { -// valueC -// } else { -// elseValue -// } -// } -// } +// This variable represents whether the first successful condition is met or not. +// It is initialized to `false` and it is set to `true` when the first condition which +// evaluates to `true` is met and therefore is not needed to go on anymore on the computation +// of the following conditions. +val conditionMet = ctx.freshName("caseWhenConditionMet") +ctx.addMutableState(ctx.JAVA_BOOLEAN, ev.isNull) +ctx.addMutableState(ctx.javaType(dataType), ev.value) val cases = branches.map { case (condExpr, valueExpr) => val cond = condExpr.genCode(ctx) val res = valueExpr.genCode(ctx) s""" -${cond.code} -if (!${cond.isNull} && ${cond.value}) { - ${res.code} - ${ev.isNull} = ${res.isNull}; - ${ev.value} = ${res.value}; +if(!$conditionMet) { + ${cond.code} + if (!${cond.isNull} && ${cond.value}) { +${res.code} +${ev.isNull} = ${res.isNull}; +${ev.value} = ${res.value}; +$conditionMet = true; + } } """ } -var generatedCode = cases.mkString("", "\nelse {\n", "\nelse {\n") - -elseValue.foreach { elseExpr => +val elseCode = elseValue.map { elseExpr => val res = elseExpr.genCode(ctx) - generatedCode += -s""" + s""" +if(!$conditionMet) { ${res.code} ${ev.isNull} = ${res.isNull}; ${ev.value} = ${res.value}; -""" +} + """ } -generatedCode += "}\n" * cases.size +val
[GitHub] spark pull request #19752: [SPARK-22520][SQL] Support code generation for la...
Github user kiszk commented on a diff in the pull request: https://github.com/apache/spark/pull/19752#discussion_r153123184 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/conditionalExpressions.scala --- @@ -158,111 +178,73 @@ abstract class CaseWhenBase( val elseCase = elseValue.map(" ELSE " + _.sql).getOrElse("") "CASE" + cases + elseCase + " END" } -} - - -/** - * Case statements of the form "CASE WHEN a THEN b [WHEN c THEN d]* [ELSE e] END". - * When a = true, returns b; when c = true, returns d; else returns e. - * - * @param branches seq of (branch condition, branch value) - * @param elseValue optional value for the else branch - */ -// scalastyle:off line.size.limit -@ExpressionDescription( - usage = "CASE WHEN expr1 THEN expr2 [WHEN expr3 THEN expr4]* [ELSE expr5] END - When `expr1` = true, returns `expr2`; else when `expr3` = true, returns `expr4`; else returns `expr5`.", - arguments = """ -Arguments: - * expr1, expr3 - the branch condition expressions should all be boolean type. - * expr2, expr4, expr5 - the branch value expressions and else value expression should all be - same type or coercible to a common type. - """, - examples = """ -Examples: - > SELECT CASE WHEN 1 > 0 THEN 1 WHEN 2 > 0 THEN 2.0 ELSE 1.2 END; - 1 - > SELECT CASE WHEN 1 < 0 THEN 1 WHEN 2 > 0 THEN 2.0 ELSE 1.2 END; - 2 - > SELECT CASE WHEN 1 < 0 THEN 1 WHEN 2 < 0 THEN 2.0 ELSE null END; - NULL - """) -// scalastyle:on line.size.limit -case class CaseWhen( -val branches: Seq[(Expression, Expression)], -val elseValue: Option[Expression] = None) - extends CaseWhenBase(branches, elseValue) with CodegenFallback with Serializable { - - override def doGenCode(ctx: CodegenContext, ev: ExprCode): ExprCode = { -super[CodegenFallback].doGenCode(ctx, ev) - } - - def toCodegen(): CaseWhenCodegen = { -CaseWhenCodegen(branches, elseValue) - } -} - -/** - * CaseWhen expression used when code generation condition is satisfied. 
- * OptimizeCodegen optimizer replaces CaseWhen into CaseWhenCodegen. - * - * @param branches seq of (branch condition, branch value) - * @param elseValue optional value for the else branch - */ -case class CaseWhenCodegen( -val branches: Seq[(Expression, Expression)], -val elseValue: Option[Expression] = None) - extends CaseWhenBase(branches, elseValue) with Serializable { override def doGenCode(ctx: CodegenContext, ev: ExprCode): ExprCode = { -// Generate code that looks like: -// -// condA = ... -// if (condA) { -// valueA -// } else { -// condB = ... -// if (condB) { -// valueB -// } else { -// condC = ... -// if (condC) { -// valueC -// } else { -// elseValue -// } -// } -// } +// This variable represents whether the first successful condition is met or not. +// It is initialized to `false` and it is set to `true` when the first condition which +// evaluates to `true` is met and therefore is not needed to go on anymore on the computation +// of the following conditions. +val conditionMet = ctx.freshName("caseWhenConditionMet") +ctx.addMutableState(ctx.JAVA_BOOLEAN, ev.isNull) +ctx.addMutableState(ctx.javaType(dataType), ev.value) val cases = branches.map { case (condExpr, valueExpr) => val cond = condExpr.genCode(ctx) val res = valueExpr.genCode(ctx) s""" -${cond.code} -if (!${cond.isNull} && ${cond.value}) { - ${res.code} - ${ev.isNull} = ${res.isNull}; - ${ev.value} = ${res.value}; +if(!$conditionMet) { + ${cond.code} + if (!${cond.isNull} && ${cond.value}) { +${res.code} +${ev.isNull} = ${res.isNull}; +${ev.value} = ${res.value}; +$conditionMet = true; + } } """ } -var generatedCode = cases.mkString("", "\nelse {\n", "\nelse {\n") - -elseValue.foreach { elseExpr => +val elseCode = elseValue.map { elseExpr => val res = elseExpr.genCode(ctx) - generatedCode += -s""" + s""" +if(!$conditionMet) { ${res.code} ${ev.isNull} = ${res.isNull}; ${ev.value} = ${res.value}; -""" +} + """ } -generatedCode += "}\n" * cases.size +val allConditions
[GitHub] spark pull request #19752: [SPARK-22520][SQL] Support code generation for la...
Github user mgaido91 commented on a diff in the pull request: https://github.com/apache/spark/pull/19752#discussion_r153118605 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/conditionalExpressions.scala --- @@ -211,111 +231,73 @@ abstract class CaseWhenBase( val elseCase = elseValue.map(" ELSE " + _.sql).getOrElse("") "CASE" + cases + elseCase + " END" } -} - - -/** - * Case statements of the form "CASE WHEN a THEN b [WHEN c THEN d]* [ELSE e] END". - * When a = true, returns b; when c = true, returns d; else returns e. - * - * @param branches seq of (branch condition, branch value) - * @param elseValue optional value for the else branch - */ -// scalastyle:off line.size.limit -@ExpressionDescription( - usage = "CASE WHEN expr1 THEN expr2 [WHEN expr3 THEN expr4]* [ELSE expr5] END - When `expr1` = true, returns `expr2`; else when `expr3` = true, returns `expr4`; else returns `expr5`.", - arguments = """ -Arguments: - * expr1, expr3 - the branch condition expressions should all be boolean type. - * expr2, expr4, expr5 - the branch value expressions and else value expression should all be - same type or coercible to a common type. - """, - examples = """ -Examples: - > SELECT CASE WHEN 1 > 0 THEN 1 WHEN 2 > 0 THEN 2.0 ELSE 1.2 END; - 1 - > SELECT CASE WHEN 1 < 0 THEN 1 WHEN 2 > 0 THEN 2.0 ELSE 1.2 END; - 2 - > SELECT CASE WHEN 1 < 0 THEN 1 WHEN 2 < 0 THEN 2.0 ELSE null END; - NULL - """) -// scalastyle:on line.size.limit -case class CaseWhen( -val branches: Seq[(Expression, Expression)], -val elseValue: Option[Expression] = None) - extends CaseWhenBase(branches, elseValue) with CodegenFallback with Serializable { - - override def doGenCode(ctx: CodegenContext, ev: ExprCode): ExprCode = { -super[CodegenFallback].doGenCode(ctx, ev) - } - - def toCodegen(): CaseWhenCodegen = { -CaseWhenCodegen(branches, elseValue) - } -} - -/** - * CaseWhen expression used when code generation condition is satisfied. 
- * OptimizeCodegen optimizer replaces CaseWhen into CaseWhenCodegen. - * - * @param branches seq of (branch condition, branch value) - * @param elseValue optional value for the else branch - */ -case class CaseWhenCodegen( -val branches: Seq[(Expression, Expression)], -val elseValue: Option[Expression] = None) - extends CaseWhenBase(branches, elseValue) with Serializable { override def doGenCode(ctx: CodegenContext, ev: ExprCode): ExprCode = { -// Generate code that looks like: -// -// condA = ... -// if (condA) { -// valueA -// } else { -// condB = ... -// if (condB) { -// valueB -// } else { -// condC = ... -// if (condC) { -// valueC -// } else { -// elseValue -// } -// } -// } +// This variable represents whether the first successful condition is met or not. +// It is initialized to `false` and it is set to `true` when the first condition which +// evaluates to `true` is met and therefore is not needed to go on anymore on the computation +// of the following conditions. +val conditionMet = ctx.freshName("caseWhenConditionMet") +ctx.addMutableState(ctx.JAVA_BOOLEAN, ev.isNull, "") --- End diff -- Thanks, I branched from a version in which there was no default value. I have merged and fixed it.
[GitHub] spark pull request #19752: [SPARK-22520][SQL] Support code generation for la...
Github user mgaido91 commented on a diff in the pull request: https://github.com/apache/spark/pull/19752#discussion_r153118387 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/conditionalExpressions.scala --- @@ -211,111 +231,73 @@ abstract class CaseWhenBase( val elseCase = elseValue.map(" ELSE " + _.sql).getOrElse("") "CASE" + cases + elseCase + " END" } -} - - -/** - * Case statements of the form "CASE WHEN a THEN b [WHEN c THEN d]* [ELSE e] END". - * When a = true, returns b; when c = true, returns d; else returns e. - * - * @param branches seq of (branch condition, branch value) - * @param elseValue optional value for the else branch - */ -// scalastyle:off line.size.limit -@ExpressionDescription( - usage = "CASE WHEN expr1 THEN expr2 [WHEN expr3 THEN expr4]* [ELSE expr5] END - When `expr1` = true, returns `expr2`; else when `expr3` = true, returns `expr4`; else returns `expr5`.", - arguments = """ -Arguments: - * expr1, expr3 - the branch condition expressions should all be boolean type. - * expr2, expr4, expr5 - the branch value expressions and else value expression should all be - same type or coercible to a common type. - """, - examples = """ -Examples: - > SELECT CASE WHEN 1 > 0 THEN 1 WHEN 2 > 0 THEN 2.0 ELSE 1.2 END; - 1 - > SELECT CASE WHEN 1 < 0 THEN 1 WHEN 2 > 0 THEN 2.0 ELSE 1.2 END; - 2 - > SELECT CASE WHEN 1 < 0 THEN 1 WHEN 2 < 0 THEN 2.0 ELSE null END; - NULL - """) -// scalastyle:on line.size.limit -case class CaseWhen( -val branches: Seq[(Expression, Expression)], -val elseValue: Option[Expression] = None) - extends CaseWhenBase(branches, elseValue) with CodegenFallback with Serializable { - - override def doGenCode(ctx: CodegenContext, ev: ExprCode): ExprCode = { -super[CodegenFallback].doGenCode(ctx, ev) - } - - def toCodegen(): CaseWhenCodegen = { -CaseWhenCodegen(branches, elseValue) - } -} - -/** - * CaseWhen expression used when code generation condition is satisfied. 
- * OptimizeCodegen optimizer replaces CaseWhen into CaseWhenCodegen. - * - * @param branches seq of (branch condition, branch value) - * @param elseValue optional value for the else branch - */ -case class CaseWhenCodegen( -val branches: Seq[(Expression, Expression)], -val elseValue: Option[Expression] = None) - extends CaseWhenBase(branches, elseValue) with Serializable { override def doGenCode(ctx: CodegenContext, ev: ExprCode): ExprCode = { -// Generate code that looks like: -// -// condA = ... -// if (condA) { -// valueA -// } else { -// condB = ... -// if (condB) { -// valueB -// } else { -// condC = ... -// if (condC) { -// valueC -// } else { -// elseValue -// } -// } -// } +// This variable represents whether the first successful condition is met or not. +// It is initialized to `false` and it is set to `true` when the first condition which +// evaluates to `true` is met and therefore is not needed to go on anymore on the computation +// of the following conditions. +val conditionMet = ctx.freshName("caseWhenConditionMet") +ctx.addMutableState(ctx.JAVA_BOOLEAN, ev.isNull, "") +ctx.addMutableState(ctx.javaType(dataType), ev.value, "") val cases = branches.map { case (condExpr, valueExpr) => val cond = condExpr.genCode(ctx) val res = valueExpr.genCode(ctx) s""" -${cond.code} -if (!${cond.isNull} && ${cond.value}) { - ${res.code} - ${ev.isNull} = ${res.isNull}; - ${ev.value} = ${res.value}; +if(!$conditionMet) { + ${cond.code} + if (!${cond.isNull} && ${cond.value}) { +${res.code} +${ev.isNull} = ${res.isNull}; +${ev.value} = ${res.value}; +$conditionMet = true; + } } """ } -var generatedCode = cases.mkString("", "\nelse {\n", "\nelse {\n") - -elseValue.foreach { elseExpr => +val elseCode = elseValue.map { elseExpr => val res = elseExpr.genCode(ctx) - generatedCode += -s""" + s""" +if(!$conditionMet) { ${res.code} ${ev.isNull} = ${res.isNull}; ${ev.value} = ${res.value}; -""" +} + """ } -generatedCode += "}\n" * cases.size +val
[GitHub] spark pull request #19752: [SPARK-22520][SQL] Support code generation for la...
Github user mgaido91 commented on a diff in the pull request: https://github.com/apache/spark/pull/19752#discussion_r153118326 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/conditionalExpressions.scala --- @@ -211,111 +231,73 @@ abstract class CaseWhenBase( val elseCase = elseValue.map(" ELSE " + _.sql).getOrElse("") "CASE" + cases + elseCase + " END" } -} - - -/** - * Case statements of the form "CASE WHEN a THEN b [WHEN c THEN d]* [ELSE e] END". - * When a = true, returns b; when c = true, returns d; else returns e. - * - * @param branches seq of (branch condition, branch value) - * @param elseValue optional value for the else branch - */ -// scalastyle:off line.size.limit -@ExpressionDescription( - usage = "CASE WHEN expr1 THEN expr2 [WHEN expr3 THEN expr4]* [ELSE expr5] END - When `expr1` = true, returns `expr2`; else when `expr3` = true, returns `expr4`; else returns `expr5`.", - arguments = """ -Arguments: - * expr1, expr3 - the branch condition expressions should all be boolean type. - * expr2, expr4, expr5 - the branch value expressions and else value expression should all be - same type or coercible to a common type. - """, - examples = """ -Examples: - > SELECT CASE WHEN 1 > 0 THEN 1 WHEN 2 > 0 THEN 2.0 ELSE 1.2 END; - 1 - > SELECT CASE WHEN 1 < 0 THEN 1 WHEN 2 > 0 THEN 2.0 ELSE 1.2 END; - 2 - > SELECT CASE WHEN 1 < 0 THEN 1 WHEN 2 < 0 THEN 2.0 ELSE null END; - NULL - """) -// scalastyle:on line.size.limit -case class CaseWhen( -val branches: Seq[(Expression, Expression)], -val elseValue: Option[Expression] = None) - extends CaseWhenBase(branches, elseValue) with CodegenFallback with Serializable { - - override def doGenCode(ctx: CodegenContext, ev: ExprCode): ExprCode = { -super[CodegenFallback].doGenCode(ctx, ev) - } - - def toCodegen(): CaseWhenCodegen = { -CaseWhenCodegen(branches, elseValue) - } -} - -/** - * CaseWhen expression used when code generation condition is satisfied. 
- * OptimizeCodegen optimizer replaces CaseWhen into CaseWhenCodegen. - * - * @param branches seq of (branch condition, branch value) - * @param elseValue optional value for the else branch - */ -case class CaseWhenCodegen( -val branches: Seq[(Expression, Expression)], -val elseValue: Option[Expression] = None) - extends CaseWhenBase(branches, elseValue) with Serializable { override def doGenCode(ctx: CodegenContext, ev: ExprCode): ExprCode = { -// Generate code that looks like: --- End diff -- I don't think it is necessary: the generated code is now much simpler and more standard, and no comment like this is provided anywhere else. Anyway, if you feel it is needed, I can add it.
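For readers following the thread, here is a rough sketch of the two shapes being discussed: the nested if/else chain described by the removed "Generate code that looks like" comment, and the flat sequence of blocks guarded by a flag that the new diff emits. This is not the code Spark generates. The class name `CaseWhenSketch`, the method names, the fixed two-branch layout and the string values are all assumptions made for illustration; only the `conditionMet` flag and the `!isNull && value` test are taken from the diff.

```java
import java.util.Objects;

public class CaseWhenSketch {

    // Flat form, modelled on the new diff: every branch is an independent
    // block guarded by the flag, so no block is nested inside another.
    static String flattened(Boolean condA, Boolean condB, String elseValue) {
        boolean conditionMet = false; // mirrors caseWhenConditionMet in the diff
        String value = null;          // stands in for ev.value (null doubles as ev.isNull)

        if (!conditionMet) {
            // A null condition counts as "not matched", like !cond.isNull && cond.value.
            if (condA != null && condA) {
                value = "A";
                conditionMet = true;
            }
        }
        if (!conditionMet) {
            if (condB != null && condB) {
                value = "B";
                conditionMet = true;
            }
        }
        if (!conditionMet) {
            value = elseValue; // else branch, also guarded by the flag
        }
        return value;
    }

    // Nested form, modelled on the removed comment: each further branch
    // lives inside the else block of the previous one.
    static String nested(Boolean condA, Boolean condB, String elseValue) {
        if (condA != null && condA) {
            return "A";
        } else {
            if (condB != null && condB) {
                return "B";
            } else {
                return elseValue;
            }
        }
    }

    public static void main(String[] args) {
        Boolean[] inputs = {Boolean.TRUE, Boolean.FALSE, null};
        // Check that both shapes agree on every combination of inputs.
        for (Boolean a : inputs) {
            for (Boolean b : inputs) {
                String flat = flattened(a, b, "E");
                String nest = nested(a, b, "E");
                if (!Objects.equals(flat, nest)) {
                    throw new AssertionError("mismatch for a=" + a + ", b=" + b);
                }
                System.out.println("a=" + a + ", b=" + b + " -> " + flat);
            }
        }
    }
}
```

In the flat form the nesting depth stays constant no matter how many branches there are, which is presumably why the result variables are registered through `ctx.addMutableState` in the diff: blocks that share state through fields do not need to sit inside one another.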
[GitHub] spark pull request #19752: [SPARK-22520][SQL] Support code generation for la...
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/19752#discussion_r153101065 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/conditionalExpressions.scala --- @@ -211,111 +231,73 @@ abstract class CaseWhenBase( val elseCase = elseValue.map(" ELSE " + _.sql).getOrElse("") "CASE" + cases + elseCase + " END" } -} - - -/** - * Case statements of the form "CASE WHEN a THEN b [WHEN c THEN d]* [ELSE e] END". - * When a = true, returns b; when c = true, returns d; else returns e. - * - * @param branches seq of (branch condition, branch value) - * @param elseValue optional value for the else branch - */ -// scalastyle:off line.size.limit -@ExpressionDescription( - usage = "CASE WHEN expr1 THEN expr2 [WHEN expr3 THEN expr4]* [ELSE expr5] END - When `expr1` = true, returns `expr2`; else when `expr3` = true, returns `expr4`; else returns `expr5`.", - arguments = """ -Arguments: - * expr1, expr3 - the branch condition expressions should all be boolean type. - * expr2, expr4, expr5 - the branch value expressions and else value expression should all be - same type or coercible to a common type. - """, - examples = """ -Examples: - > SELECT CASE WHEN 1 > 0 THEN 1 WHEN 2 > 0 THEN 2.0 ELSE 1.2 END; - 1 - > SELECT CASE WHEN 1 < 0 THEN 1 WHEN 2 > 0 THEN 2.0 ELSE 1.2 END; - 2 - > SELECT CASE WHEN 1 < 0 THEN 1 WHEN 2 < 0 THEN 2.0 ELSE null END; - NULL - """) -// scalastyle:on line.size.limit -case class CaseWhen( -val branches: Seq[(Expression, Expression)], -val elseValue: Option[Expression] = None) - extends CaseWhenBase(branches, elseValue) with CodegenFallback with Serializable { - - override def doGenCode(ctx: CodegenContext, ev: ExprCode): ExprCode = { -super[CodegenFallback].doGenCode(ctx, ev) - } - - def toCodegen(): CaseWhenCodegen = { -CaseWhenCodegen(branches, elseValue) - } -} - -/** - * CaseWhen expression used when code generation condition is satisfied. 
- * OptimizeCodegen optimizer replaces CaseWhen into CaseWhenCodegen. - * - * @param branches seq of (branch condition, branch value) - * @param elseValue optional value for the else branch - */ -case class CaseWhenCodegen( -val branches: Seq[(Expression, Expression)], -val elseValue: Option[Expression] = None) - extends CaseWhenBase(branches, elseValue) with Serializable { override def doGenCode(ctx: CodegenContext, ev: ExprCode): ExprCode = { -// Generate code that looks like: --- End diff --
Shall we keep this comment and update it?
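For context on what an updated comment would need to describe: the revisions quoted in this thread replace the nested if/else chain with a sequence of independent `if (!conditionMet)` blocks. The following is a minimal, Spark-free sketch of that evaluation strategy. `FlatCaseWhen`, `Branch`, and `eval` are hypothetical names introduced only for illustration, `None` stands in for SQL NULL, and the initial NULL result is an assumption of this sketch; it models the semantics and is not the code Spark generates.

```scala
// Hypothetical model of the flat, flag-guarded CASE WHEN evaluation.
// Not Spark code: names and types here are assumptions for illustration.
object FlatCaseWhen {
  // A branch pairs a nullable condition with a nullable value (None models SQL NULL).
  type Branch[A] = (() => Option[Boolean], () => Option[A])

  def eval[A](branches: Seq[Branch[A]], elseValue: Option[() => Option[A]]): Option[A] = {
    // Becomes true once the first satisfied condition has been found.
    var conditionMet = false
    // Assumed default: the result stays NULL if nothing matches and there is no ELSE.
    var result: Option[A] = None

    // Each branch is its own guarded block instead of being nested in an else.
    branches.foreach { case (cond, value) =>
      if (!conditionMet) {
        // A NULL condition counts as "not satisfied", like `!isNull && value`.
        if (cond().contains(true)) {
          result = value()
          conditionMet = true
        }
      }
    }

    // The ELSE value is only evaluated when no branch matched.
    elseValue.foreach { e =>
      if (!conditionMet) result = e()
    }
    result
  }
}
```

Because each block depends only on the flag and not on being nested inside the previous block, the blocks can be distributed across separate methods, which appears to be the purpose of the `ctx.splitExpressions` call in the later revisions of this diff. In this model, calling `eval` with no matching branch and no ELSE value yields `None`, that is, NULL.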
[GitHub] spark pull request #19752: [SPARK-22520][SQL] Support code generation for la...
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/19752#discussion_r153100483 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/conditionalExpressions.scala --- @@ -211,111 +231,73 @@ abstract class CaseWhenBase( val elseCase = elseValue.map(" ELSE " + _.sql).getOrElse("") "CASE" + cases + elseCase + " END" } -} - - -/** - * Case statements of the form "CASE WHEN a THEN b [WHEN c THEN d]* [ELSE e] END". - * When a = true, returns b; when c = true, returns d; else returns e. - * - * @param branches seq of (branch condition, branch value) - * @param elseValue optional value for the else branch - */ -// scalastyle:off line.size.limit -@ExpressionDescription( - usage = "CASE WHEN expr1 THEN expr2 [WHEN expr3 THEN expr4]* [ELSE expr5] END - When `expr1` = true, returns `expr2`; else when `expr3` = true, returns `expr4`; else returns `expr5`.", - arguments = """ -Arguments: - * expr1, expr3 - the branch condition expressions should all be boolean type. - * expr2, expr4, expr5 - the branch value expressions and else value expression should all be - same type or coercible to a common type. - """, - examples = """ -Examples: - > SELECT CASE WHEN 1 > 0 THEN 1 WHEN 2 > 0 THEN 2.0 ELSE 1.2 END; - 1 - > SELECT CASE WHEN 1 < 0 THEN 1 WHEN 2 > 0 THEN 2.0 ELSE 1.2 END; - 2 - > SELECT CASE WHEN 1 < 0 THEN 1 WHEN 2 < 0 THEN 2.0 ELSE null END; - NULL - """) -// scalastyle:on line.size.limit -case class CaseWhen( -val branches: Seq[(Expression, Expression)], -val elseValue: Option[Expression] = None) - extends CaseWhenBase(branches, elseValue) with CodegenFallback with Serializable { - - override def doGenCode(ctx: CodegenContext, ev: ExprCode): ExprCode = { -super[CodegenFallback].doGenCode(ctx, ev) - } - - def toCodegen(): CaseWhenCodegen = { -CaseWhenCodegen(branches, elseValue) - } -} - -/** - * CaseWhen expression used when code generation condition is satisfied. 
- * OptimizeCodegen optimizer replaces CaseWhen into CaseWhenCodegen. - * - * @param branches seq of (branch condition, branch value) - * @param elseValue optional value for the else branch - */ -case class CaseWhenCodegen( -val branches: Seq[(Expression, Expression)], -val elseValue: Option[Expression] = None) - extends CaseWhenBase(branches, elseValue) with Serializable { override def doGenCode(ctx: CodegenContext, ev: ExprCode): ExprCode = { -// Generate code that looks like: -// -// condA = ... -// if (condA) { -// valueA -// } else { -// condB = ... -// if (condB) { -// valueB -// } else { -// condC = ... -// if (condC) { -// valueC -// } else { -// elseValue -// } -// } -// } +// This variable represents whether the first successful condition is met or not. +// It is initialized to `false` and it is set to `true` when the first condition which +// evaluates to `true` is met and therefore is not needed to go on anymore on the computation +// of the following conditions. +val conditionMet = ctx.freshName("caseWhenConditionMet") +ctx.addMutableState(ctx.JAVA_BOOLEAN, ev.isNull, "") +ctx.addMutableState(ctx.javaType(dataType), ev.value, "") val cases = branches.map { case (condExpr, valueExpr) => val cond = condExpr.genCode(ctx) val res = valueExpr.genCode(ctx) s""" -${cond.code} -if (!${cond.isNull} && ${cond.value}) { - ${res.code} - ${ev.isNull} = ${res.isNull}; - ${ev.value} = ${res.value}; +if(!$conditionMet) { + ${cond.code} + if (!${cond.isNull} && ${cond.value}) { +${res.code} +${ev.isNull} = ${res.isNull}; +${ev.value} = ${res.value}; +$conditionMet = true; + } } """ } -var generatedCode = cases.mkString("", "\nelse {\n", "\nelse {\n") - -elseValue.foreach { elseExpr => +val elseCode = elseValue.map { elseExpr => val res = elseExpr.genCode(ctx) - generatedCode += -s""" + s""" +if(!$conditionMet) { ${res.code} ${ev.isNull} = ${res.isNull}; ${ev.value} = ${res.value}; -""" +} + """ } -generatedCode += "}\n" * cases.size +val
[GitHub] spark pull request #19752: [SPARK-22520][SQL] Support code generation for la...
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/19752#discussion_r153100103 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/conditionalExpressions.scala --- @@ -211,111 +231,73 @@ abstract class CaseWhenBase( val elseCase = elseValue.map(" ELSE " + _.sql).getOrElse("") "CASE" + cases + elseCase + " END" } -} - - -/** - * Case statements of the form "CASE WHEN a THEN b [WHEN c THEN d]* [ELSE e] END". - * When a = true, returns b; when c = true, returns d; else returns e. - * - * @param branches seq of (branch condition, branch value) - * @param elseValue optional value for the else branch - */ -// scalastyle:off line.size.limit -@ExpressionDescription( - usage = "CASE WHEN expr1 THEN expr2 [WHEN expr3 THEN expr4]* [ELSE expr5] END - When `expr1` = true, returns `expr2`; else when `expr3` = true, returns `expr4`; else returns `expr5`.", - arguments = """ -Arguments: - * expr1, expr3 - the branch condition expressions should all be boolean type. - * expr2, expr4, expr5 - the branch value expressions and else value expression should all be - same type or coercible to a common type. - """, - examples = """ -Examples: - > SELECT CASE WHEN 1 > 0 THEN 1 WHEN 2 > 0 THEN 2.0 ELSE 1.2 END; - 1 - > SELECT CASE WHEN 1 < 0 THEN 1 WHEN 2 > 0 THEN 2.0 ELSE 1.2 END; - 2 - > SELECT CASE WHEN 1 < 0 THEN 1 WHEN 2 < 0 THEN 2.0 ELSE null END; - NULL - """) -// scalastyle:on line.size.limit -case class CaseWhen( -val branches: Seq[(Expression, Expression)], -val elseValue: Option[Expression] = None) - extends CaseWhenBase(branches, elseValue) with CodegenFallback with Serializable { - - override def doGenCode(ctx: CodegenContext, ev: ExprCode): ExprCode = { -super[CodegenFallback].doGenCode(ctx, ev) - } - - def toCodegen(): CaseWhenCodegen = { -CaseWhenCodegen(branches, elseValue) - } -} - -/** - * CaseWhen expression used when code generation condition is satisfied. 
- * OptimizeCodegen optimizer replaces CaseWhen into CaseWhenCodegen. - * - * @param branches seq of (branch condition, branch value) - * @param elseValue optional value for the else branch - */ -case class CaseWhenCodegen( -val branches: Seq[(Expression, Expression)], -val elseValue: Option[Expression] = None) - extends CaseWhenBase(branches, elseValue) with Serializable { override def doGenCode(ctx: CodegenContext, ev: ExprCode): ExprCode = { -// Generate code that looks like: -// -// condA = ... -// if (condA) { -// valueA -// } else { -// condB = ... -// if (condB) { -// valueB -// } else { -// condC = ... -// if (condC) { -// valueC -// } else { -// elseValue -// } -// } -// } +// This variable represents whether the first successful condition is met or not. +// It is initialized to `false` and it is set to `true` when the first condition which +// evaluates to `true` is met and therefore is not needed to go on anymore on the computation +// of the following conditions. +val conditionMet = ctx.freshName("caseWhenConditionMet") +ctx.addMutableState(ctx.JAVA_BOOLEAN, ev.isNull, "") --- End diff --
nit: `ctx.addMutableState(ctx.JAVA_BOOLEAN, ev.isNull)`, as the empty string is the default value of the third parameter.
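The nit relies on Scala default arguments. As a small sketch, using a hypothetical `ToyContext` class rather than Spark's `CodegenContext` (the field and parameter names below are assumptions for illustration), omitting a trailing argument that has a default is equivalent to passing that default explicitly:

```scala
import scala.collection.mutable.ArrayBuffer

// Hypothetical stand-in for a codegen context; this is NOT Spark's CodegenContext.
class ToyContext {
  // Each entry records (javaType, variableName, initCode).
  val mutableStates: ArrayBuffer[(String, String, String)] = ArrayBuffer.empty

  // `initCode` defaults to the empty string, so callers may leave it out.
  def addMutableState(javaType: String, variableName: String, initCode: String = ""): Unit =
    mutableStates += ((javaType, variableName, initCode))
}

object DefaultArgDemo {
  def main(args: Array[String]): Unit = {
    val ctx = new ToyContext
    ctx.addMutableState("boolean", "isNull1", "") // explicit empty init code
    ctx.addMutableState("boolean", "isNull1")     // same call, relying on the default
    // Both calls record identical entries.
    println(ctx.mutableStates(0) == ctx.mutableStates(1))
  }
}
```

Dropping the explicit `""` therefore changes nothing at runtime; it only removes noise at the call site.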
[GitHub] spark pull request #19752: [SPARK-22520][SQL] Support code generation for la...
Github user mgaido91 commented on a diff in the pull request: https://github.com/apache/spark/pull/19752#discussion_r153082196 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/conditionalExpressions.scala --- @@ -141,14 +141,34 @@ case class If(predicate: Expression, trueValue: Expression, falseValue: Expressi } /** - * Abstract parent class for common logic in CaseWhen and CaseWhenCodegen. + * Case statements of the form "CASE WHEN a THEN b [WHEN c THEN d]* [ELSE e] END". + * When a = true, returns b; when c = true, returns d; else returns e. * * @param branches seq of (branch condition, branch value) * @param elseValue optional value for the else branch */ -abstract class CaseWhenBase( +// scalastyle:off line.size.limit +@ExpressionDescription( + usage = "CASE WHEN expr1 THEN expr2 [WHEN expr3 THEN expr4]* [ELSE expr5] END - When `expr1` = true, returns `expr2`; else when `expr3` = true, returns `expr4`; else returns `expr5`.", + arguments = """ +Arguments: + * expr1, expr3 - the branch condition expressions should all be boolean type. + * expr2, expr4, expr5 - the branch value expressions and else value expression should all be + same type or coercible to a common type. + """, + examples = """ +Examples: + > SELECT CASE WHEN 1 > 0 THEN 1 WHEN 2 > 0 THEN 2.0 ELSE 1.2 END; + 1 + > SELECT CASE WHEN 1 < 0 THEN 1 WHEN 2 > 0 THEN 2.0 ELSE 1.2 END; + 2 + > SELECT CASE WHEN 1 < 0 THEN 1 WHEN 2 < 0 THEN 2.0 ELSE null END; --- End diff --
I confirm that Hive returns NULL, so I am updating the description as requested.
[GitHub] spark pull request #19752: [SPARK-22520][SQL] Support code generation for la...
Github user mgaido91 commented on a diff in the pull request: https://github.com/apache/spark/pull/19752#discussion_r153081719 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/conditionalExpressions.scala --- @@ -211,111 +231,62 @@ abstract class CaseWhenBase( val elseCase = elseValue.map(" ELSE " + _.sql).getOrElse("") "CASE" + cases + elseCase + " END" } -} - - -/** - * Case statements of the form "CASE WHEN a THEN b [WHEN c THEN d]* [ELSE e] END". - * When a = true, returns b; when c = true, returns d; else returns e. - * - * @param branches seq of (branch condition, branch value) - * @param elseValue optional value for the else branch - */ -// scalastyle:off line.size.limit -@ExpressionDescription( - usage = "CASE WHEN expr1 THEN expr2 [WHEN expr3 THEN expr4]* [ELSE expr5] END - When `expr1` = true, returns `expr2`; else when `expr3` = true, returns `expr4`; else returns `expr5`.", - arguments = """ -Arguments: - * expr1, expr3 - the branch condition expressions should all be boolean type. - * expr2, expr4, expr5 - the branch value expressions and else value expression should all be - same type or coercible to a common type. - """, - examples = """ -Examples: - > SELECT CASE WHEN 1 > 0 THEN 1 WHEN 2 > 0 THEN 2.0 ELSE 1.2 END; - 1 - > SELECT CASE WHEN 1 < 0 THEN 1 WHEN 2 > 0 THEN 2.0 ELSE 1.2 END; - 2 - > SELECT CASE WHEN 1 < 0 THEN 1 WHEN 2 < 0 THEN 2.0 ELSE null END; - NULL - """) -// scalastyle:on line.size.limit -case class CaseWhen( -val branches: Seq[(Expression, Expression)], -val elseValue: Option[Expression] = None) - extends CaseWhenBase(branches, elseValue) with CodegenFallback with Serializable { - - override def doGenCode(ctx: CodegenContext, ev: ExprCode): ExprCode = { -super[CodegenFallback].doGenCode(ctx, ev) - } - - def toCodegen(): CaseWhenCodegen = { -CaseWhenCodegen(branches, elseValue) - } -} - -/** - * CaseWhen expression used when code generation condition is satisfied. 
- * OptimizeCodegen optimizer replaces CaseWhen into CaseWhenCodegen. - * - * @param branches seq of (branch condition, branch value) - * @param elseValue optional value for the else branch - */ -case class CaseWhenCodegen( -val branches: Seq[(Expression, Expression)], -val elseValue: Option[Expression] = None) - extends CaseWhenBase(branches, elseValue) with Serializable { override def doGenCode(ctx: CodegenContext, ev: ExprCode): ExprCode = { -// Generate code that looks like: -// -// condA = ... -// if (condA) { -// valueA -// } else { -// condB = ... -// if (condB) { -// valueB -// } else { -// condC = ... -// if (condC) { -// valueC -// } else { -// elseValue -// } -// } -// } +val conditionMet = ctx.freshName("caseWhenConditionMet") +ctx.addMutableState("boolean", ev.isNull, "") +ctx.addMutableState(ctx.javaType(dataType), ev.value, "") val cases = branches.map { case (condExpr, valueExpr) => val cond = condExpr.genCode(ctx) val res = valueExpr.genCode(ctx) s""" -${cond.code} -if (!${cond.isNull} && ${cond.value}) { - ${res.code} - ${ev.isNull} = ${res.isNull}; - ${ev.value} = ${res.value}; +if(!$conditionMet) { + ${cond.code} + if (!${cond.isNull} && ${cond.value}) { +${res.code} +${ev.isNull} = ${res.isNull}; +${ev.value} = ${res.value}; +$conditionMet = true; + } } """ } -var generatedCode = cases.mkString("", "\nelse {\n", "\nelse {\n") - -elseValue.foreach { elseExpr => +val elseCode = elseValue.map { elseExpr => val res = elseExpr.genCode(ctx) - generatedCode += -s""" + s""" +if(!$conditionMet) { ${res.code} ${ev.isNull} = ${res.isNull}; ${ev.value} = ${res.value}; -""" +} + """ } -generatedCode += "}\n" * cases.size +val allConditions = cases ++ elseCode + +val code = if (ctx.INPUT_ROW == null || ctx.currentVars != null) { + allConditions.mkString("\n") +} else { + ctx.splitExpressions(allConditions, "caseWhen", +("InternalRow", ctx.INPUT_ROW) :: ("boolean", conditionMet) :: Nil, returnType = "boolean", +
[GitHub] spark pull request #19752: [SPARK-22520][SQL] Support code generation for la...
Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/19752#discussion_r153081547 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/conditionalExpressions.scala --- @@ -211,111 +231,62 @@ abstract class CaseWhenBase( val elseCase = elseValue.map(" ELSE " + _.sql).getOrElse("") "CASE" + cases + elseCase + " END" } -} - - -/** - * Case statements of the form "CASE WHEN a THEN b [WHEN c THEN d]* [ELSE e] END". - * When a = true, returns b; when c = true, returns d; else returns e. - * - * @param branches seq of (branch condition, branch value) - * @param elseValue optional value for the else branch - */ -// scalastyle:off line.size.limit -@ExpressionDescription( - usage = "CASE WHEN expr1 THEN expr2 [WHEN expr3 THEN expr4]* [ELSE expr5] END - When `expr1` = true, returns `expr2`; else when `expr3` = true, returns `expr4`; else returns `expr5`.", - arguments = """ -Arguments: - * expr1, expr3 - the branch condition expressions should all be boolean type. - * expr2, expr4, expr5 - the branch value expressions and else value expression should all be - same type or coercible to a common type. - """, - examples = """ -Examples: - > SELECT CASE WHEN 1 > 0 THEN 1 WHEN 2 > 0 THEN 2.0 ELSE 1.2 END; - 1 - > SELECT CASE WHEN 1 < 0 THEN 1 WHEN 2 > 0 THEN 2.0 ELSE 1.2 END; - 2 - > SELECT CASE WHEN 1 < 0 THEN 1 WHEN 2 < 0 THEN 2.0 ELSE null END; - NULL - """) -// scalastyle:on line.size.limit -case class CaseWhen( -val branches: Seq[(Expression, Expression)], -val elseValue: Option[Expression] = None) - extends CaseWhenBase(branches, elseValue) with CodegenFallback with Serializable { - - override def doGenCode(ctx: CodegenContext, ev: ExprCode): ExprCode = { -super[CodegenFallback].doGenCode(ctx, ev) - } - - def toCodegen(): CaseWhenCodegen = { -CaseWhenCodegen(branches, elseValue) - } -} - -/** - * CaseWhen expression used when code generation condition is satisfied. 
- * OptimizeCodegen optimizer replaces CaseWhen into CaseWhenCodegen. - * - * @param branches seq of (branch condition, branch value) - * @param elseValue optional value for the else branch - */ -case class CaseWhenCodegen( -val branches: Seq[(Expression, Expression)], -val elseValue: Option[Expression] = None) - extends CaseWhenBase(branches, elseValue) with Serializable { override def doGenCode(ctx: CodegenContext, ev: ExprCode): ExprCode = { -// Generate code that looks like: -// -// condA = ... -// if (condA) { -// valueA -// } else { -// condB = ... -// if (condB) { -// valueB -// } else { -// condC = ... -// if (condC) { -// valueC -// } else { -// elseValue -// } -// } -// } +val conditionMet = ctx.freshName("caseWhenConditionMet") +ctx.addMutableState("boolean", ev.isNull, "") +ctx.addMutableState(ctx.javaType(dataType), ev.value, "") val cases = branches.map { case (condExpr, valueExpr) => val cond = condExpr.genCode(ctx) val res = valueExpr.genCode(ctx) s""" -${cond.code} -if (!${cond.isNull} && ${cond.value}) { - ${res.code} - ${ev.isNull} = ${res.isNull}; - ${ev.value} = ${res.value}; +if(!$conditionMet) { + ${cond.code} + if (!${cond.isNull} && ${cond.value}) { +${res.code} +${ev.isNull} = ${res.isNull}; +${ev.value} = ${res.value}; +$conditionMet = true; + } } """ } -var generatedCode = cases.mkString("", "\nelse {\n", "\nelse {\n") - -elseValue.foreach { elseExpr => +val elseCode = elseValue.map { elseExpr => val res = elseExpr.genCode(ctx) - generatedCode += -s""" + s""" +if(!$conditionMet) { ${res.code} ${ev.isNull} = ${res.isNull}; ${ev.value} = ${res.value}; -""" +} + """ } -generatedCode += "}\n" * cases.size +val allConditions = cases ++ elseCode + +val code = if (ctx.INPUT_ROW == null || ctx.currentVars != null) { + allConditions.mkString("\n") +} else { + ctx.splitExpressions(allConditions, "caseWhen", +("InternalRow", ctx.INPUT_ROW) :: ("boolean", conditionMet) :: Nil, returnType = "boolean",
[GitHub] spark pull request #19752: [SPARK-22520][SQL] Support code generation for la...
Github user mgaido91 commented on a diff in the pull request: https://github.com/apache/spark/pull/19752#discussion_r153081270 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/conditionalExpressions.scala --- @@ -211,111 +231,62 @@ abstract class CaseWhenBase( val elseCase = elseValue.map(" ELSE " + _.sql).getOrElse("") "CASE" + cases + elseCase + " END" } -} - - -/** - * Case statements of the form "CASE WHEN a THEN b [WHEN c THEN d]* [ELSE e] END". - * When a = true, returns b; when c = true, returns d; else returns e. - * - * @param branches seq of (branch condition, branch value) - * @param elseValue optional value for the else branch - */ -// scalastyle:off line.size.limit -@ExpressionDescription( - usage = "CASE WHEN expr1 THEN expr2 [WHEN expr3 THEN expr4]* [ELSE expr5] END - When `expr1` = true, returns `expr2`; else when `expr3` = true, returns `expr4`; else returns `expr5`.", - arguments = """ -Arguments: - * expr1, expr3 - the branch condition expressions should all be boolean type. - * expr2, expr4, expr5 - the branch value expressions and else value expression should all be - same type or coercible to a common type. - """, - examples = """ -Examples: - > SELECT CASE WHEN 1 > 0 THEN 1 WHEN 2 > 0 THEN 2.0 ELSE 1.2 END; - 1 - > SELECT CASE WHEN 1 < 0 THEN 1 WHEN 2 > 0 THEN 2.0 ELSE 1.2 END; - 2 - > SELECT CASE WHEN 1 < 0 THEN 1 WHEN 2 < 0 THEN 2.0 ELSE null END; - NULL - """) -// scalastyle:on line.size.limit -case class CaseWhen( -val branches: Seq[(Expression, Expression)], -val elseValue: Option[Expression] = None) - extends CaseWhenBase(branches, elseValue) with CodegenFallback with Serializable { - - override def doGenCode(ctx: CodegenContext, ev: ExprCode): ExprCode = { -super[CodegenFallback].doGenCode(ctx, ev) - } - - def toCodegen(): CaseWhenCodegen = { -CaseWhenCodegen(branches, elseValue) - } -} - -/** - * CaseWhen expression used when code generation condition is satisfied. 
- * OptimizeCodegen optimizer replaces CaseWhen into CaseWhenCodegen. - * - * @param branches seq of (branch condition, branch value) - * @param elseValue optional value for the else branch - */ -case class CaseWhenCodegen( -val branches: Seq[(Expression, Expression)], -val elseValue: Option[Expression] = None) - extends CaseWhenBase(branches, elseValue) with Serializable { override def doGenCode(ctx: CodegenContext, ev: ExprCode): ExprCode = { -// Generate code that looks like: -// -// condA = ... -// if (condA) { -// valueA -// } else { -// condB = ... -// if (condB) { -// valueB -// } else { -// condC = ... -// if (condC) { -// valueC -// } else { -// elseValue -// } -// } -// } +val conditionMet = ctx.freshName("caseWhenConditionMet") +ctx.addMutableState("boolean", ev.isNull, "") +ctx.addMutableState(ctx.javaType(dataType), ev.value, "") val cases = branches.map { case (condExpr, valueExpr) => val cond = condExpr.genCode(ctx) val res = valueExpr.genCode(ctx) s""" -${cond.code} -if (!${cond.isNull} && ${cond.value}) { - ${res.code} - ${ev.isNull} = ${res.isNull}; - ${ev.value} = ${res.value}; +if(!$conditionMet) { + ${cond.code} + if (!${cond.isNull} && ${cond.value}) { +${res.code} +${ev.isNull} = ${res.isNull}; +${ev.value} = ${res.value}; +$conditionMet = true; + } } """ } -var generatedCode = cases.mkString("", "\nelse {\n", "\nelse {\n") - -elseValue.foreach { elseExpr => +val elseCode = elseValue.map { elseExpr => val res = elseExpr.genCode(ctx) - generatedCode += -s""" + s""" +if(!$conditionMet) { ${res.code} ${ev.isNull} = ${res.isNull}; ${ev.value} = ${res.value}; -""" +} + """ } -generatedCode += "}\n" * cases.size +val allConditions = cases ++ elseCode + +val code = if (ctx.INPUT_ROW == null || ctx.currentVars != null) { + allConditions.mkString("\n") +} else { + ctx.splitExpressions(allConditions, "caseWhen", +("InternalRow", ctx.INPUT_ROW) :: ("boolean", conditionMet) :: Nil, returnType = "boolean", +
[GitHub] spark pull request #19752: [SPARK-22520][SQL] Support code generation for la...
Github user mgaido91 commented on a diff in the pull request: https://github.com/apache/spark/pull/19752#discussion_r153081226 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/conditionalExpressions.scala --- @@ -141,14 +141,34 @@ case class If(predicate: Expression, trueValue: Expression, falseValue: Expressi } /** - * Abstract parent class for common logic in CaseWhen and CaseWhenCodegen. + * Case statements of the form "CASE WHEN a THEN b [WHEN c THEN d]* [ELSE e] END". + * When a = true, returns b; when c = true, returns d; else returns e. * * @param branches seq of (branch condition, branch value) * @param elseValue optional value for the else branch */ -abstract class CaseWhenBase( +// scalastyle:off line.size.limit +@ExpressionDescription( + usage = "CASE WHEN expr1 THEN expr2 [WHEN expr3 THEN expr4]* [ELSE expr5] END - When `expr1` = true, returns `expr2`; else when `expr3` = true, returns `expr4`; else returns `expr5`.", + arguments = """ +Arguments: + * expr1, expr3 - the branch condition expressions should all be boolean type. + * expr2, expr4, expr5 - the branch value expressions and else value expression should all be + same type or coercible to a common type. + """, + examples = """ +Examples: + > SELECT CASE WHEN 1 > 0 THEN 1 WHEN 2 > 0 THEN 2.0 ELSE 1.2 END; + 1 + > SELECT CASE WHEN 1 < 0 THEN 1 WHEN 2 > 0 THEN 2.0 ELSE 1.2 END; + 2 + > SELECT CASE WHEN 1 < 0 THEN 1 WHEN 2 < 0 THEN 2.0 ELSE null END; --- End diff --
I can follow your first suggestion and test this on Hive, although I haven't actually changed this part of the code. I will post the Hive result as soon as possible.
[GitHub] spark pull request #19752: [SPARK-22520][SQL] Support code generation for la...
Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/19752#discussion_r153079554 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/conditionalExpressions.scala --- @@ -211,111 +231,62 @@ abstract class CaseWhenBase( val elseCase = elseValue.map(" ELSE " + _.sql).getOrElse("") "CASE" + cases + elseCase + " END" } -} - - -/** - * Case statements of the form "CASE WHEN a THEN b [WHEN c THEN d]* [ELSE e] END". - * When a = true, returns b; when c = true, returns d; else returns e. - * - * @param branches seq of (branch condition, branch value) - * @param elseValue optional value for the else branch - */ -// scalastyle:off line.size.limit -@ExpressionDescription( - usage = "CASE WHEN expr1 THEN expr2 [WHEN expr3 THEN expr4]* [ELSE expr5] END - When `expr1` = true, returns `expr2`; else when `expr3` = true, returns `expr4`; else returns `expr5`.", - arguments = """ -Arguments: - * expr1, expr3 - the branch condition expressions should all be boolean type. - * expr2, expr4, expr5 - the branch value expressions and else value expression should all be - same type or coercible to a common type. - """, - examples = """ -Examples: - > SELECT CASE WHEN 1 > 0 THEN 1 WHEN 2 > 0 THEN 2.0 ELSE 1.2 END; - 1 - > SELECT CASE WHEN 1 < 0 THEN 1 WHEN 2 > 0 THEN 2.0 ELSE 1.2 END; - 2 - > SELECT CASE WHEN 1 < 0 THEN 1 WHEN 2 < 0 THEN 2.0 ELSE null END; - NULL - """) -// scalastyle:on line.size.limit -case class CaseWhen( -val branches: Seq[(Expression, Expression)], -val elseValue: Option[Expression] = None) - extends CaseWhenBase(branches, elseValue) with CodegenFallback with Serializable { - - override def doGenCode(ctx: CodegenContext, ev: ExprCode): ExprCode = { -super[CodegenFallback].doGenCode(ctx, ev) - } - - def toCodegen(): CaseWhenCodegen = { -CaseWhenCodegen(branches, elseValue) - } -} - -/** - * CaseWhen expression used when code generation condition is satisfied. 
- * OptimizeCodegen optimizer replaces CaseWhen into CaseWhenCodegen. - * - * @param branches seq of (branch condition, branch value) - * @param elseValue optional value for the else branch - */ -case class CaseWhenCodegen( -val branches: Seq[(Expression, Expression)], -val elseValue: Option[Expression] = None) - extends CaseWhenBase(branches, elseValue) with Serializable { override def doGenCode(ctx: CodegenContext, ev: ExprCode): ExprCode = { -// Generate code that looks like: -// -// condA = ... -// if (condA) { -// valueA -// } else { -// condB = ... -// if (condB) { -// valueB -// } else { -// condC = ... -// if (condC) { -// valueC -// } else { -// elseValue -// } -// } -// } +val conditionMet = ctx.freshName("caseWhenConditionMet") +ctx.addMutableState("boolean", ev.isNull, "") +ctx.addMutableState(ctx.javaType(dataType), ev.value, "") val cases = branches.map { case (condExpr, valueExpr) => val cond = condExpr.genCode(ctx) val res = valueExpr.genCode(ctx) s""" -${cond.code} -if (!${cond.isNull} && ${cond.value}) { - ${res.code} - ${ev.isNull} = ${res.isNull}; - ${ev.value} = ${res.value}; +if(!$conditionMet) { + ${cond.code} + if (!${cond.isNull} && ${cond.value}) { +${res.code} +${ev.isNull} = ${res.isNull}; +${ev.value} = ${res.value}; +$conditionMet = true; + } } """ } -var generatedCode = cases.mkString("", "\nelse {\n", "\nelse {\n") - -elseValue.foreach { elseExpr => +val elseCode = elseValue.map { elseExpr => val res = elseExpr.genCode(ctx) - generatedCode += -s""" + s""" +if(!$conditionMet) { ${res.code} ${ev.isNull} = ${res.isNull}; ${ev.value} = ${res.value}; -""" +} + """ } -generatedCode += "}\n" * cases.size +val allConditions = cases ++ elseCode + +val code = if (ctx.INPUT_ROW == null || ctx.currentVars != null) { + allConditions.mkString("\n") +} else { + ctx.splitExpressions(allConditions, "caseWhen", +("InternalRow", ctx.INPUT_ROW) :: ("boolean", conditionMet) :: Nil, returnType = "boolean",
[GitHub] spark pull request #19752: [SPARK-22520][SQL] Support code generation for la...
Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/19752#discussion_r153079931 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/conditionalExpressions.scala --- @@ -141,14 +141,34 @@ case class If(predicate: Expression, trueValue: Expression, falseValue: Expressi } /** - * Abstract parent class for common logic in CaseWhen and CaseWhenCodegen. + * Case statements of the form "CASE WHEN a THEN b [WHEN c THEN d]* [ELSE e] END". + * When a = true, returns b; when c = true, returns d; else returns e. * * @param branches seq of (branch condition, branch value) * @param elseValue optional value for the else branch */ -abstract class CaseWhenBase( +// scalastyle:off line.size.limit +@ExpressionDescription( + usage = "CASE WHEN expr1 THEN expr2 [WHEN expr3 THEN expr4]* [ELSE expr5] END - When `expr1` = true, returns `expr2`; else when `expr3` = true, returns `expr4`; else returns `expr5`.", + arguments = """ +Arguments: + * expr1, expr3 - the branch condition expressions should all be boolean type. + * expr2, expr4, expr5 - the branch value expressions and else value expression should all be + same type or coercible to a common type. + """, + examples = """ +Examples: + > SELECT CASE WHEN 1 > 0 THEN 1 WHEN 2 > 0 THEN 2.0 ELSE 1.2 END; + 1 + > SELECT CASE WHEN 1 < 0 THEN 1 WHEN 2 > 0 THEN 2.0 ELSE 1.2 END; + 2 + > SELECT CASE WHEN 1 < 0 THEN 1 WHEN 2 < 0 THEN 2.0 ELSE null END; --- End diff --
`> SELECT CASE WHEN 1 < 0 THEN 1 WHEN 2 < 0 THEN 2.0 ELSE null END;` -> `> SELECT CASE WHEN 1 < 0 THEN 1 WHEN 2 < 0 THEN 2.0 END;`
[GitHub] spark pull request #19752: [SPARK-22520][SQL] Support code generation for la...
Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/19752#discussion_r153079772 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/conditionalExpressions.scala --- @@ -211,111 +231,62 @@ abstract class CaseWhenBase( val elseCase = elseValue.map(" ELSE " + _.sql).getOrElse("") "CASE" + cases + elseCase + " END" } -} - - -/** - * Case statements of the form "CASE WHEN a THEN b [WHEN c THEN d]* [ELSE e] END". - * When a = true, returns b; when c = true, returns d; else returns e. - * - * @param branches seq of (branch condition, branch value) - * @param elseValue optional value for the else branch - */ -// scalastyle:off line.size.limit -@ExpressionDescription( - usage = "CASE WHEN expr1 THEN expr2 [WHEN expr3 THEN expr4]* [ELSE expr5] END - When `expr1` = true, returns `expr2`; else when `expr3` = true, returns `expr4`; else returns `expr5`.", - arguments = """ -Arguments: - * expr1, expr3 - the branch condition expressions should all be boolean type. - * expr2, expr4, expr5 - the branch value expressions and else value expression should all be - same type or coercible to a common type. - """, - examples = """ -Examples: - > SELECT CASE WHEN 1 > 0 THEN 1 WHEN 2 > 0 THEN 2.0 ELSE 1.2 END; - 1 - > SELECT CASE WHEN 1 < 0 THEN 1 WHEN 2 > 0 THEN 2.0 ELSE 1.2 END; - 2 - > SELECT CASE WHEN 1 < 0 THEN 1 WHEN 2 < 0 THEN 2.0 ELSE null END; - NULL - """) -// scalastyle:on line.size.limit -case class CaseWhen( -val branches: Seq[(Expression, Expression)], -val elseValue: Option[Expression] = None) - extends CaseWhenBase(branches, elseValue) with CodegenFallback with Serializable { - - override def doGenCode(ctx: CodegenContext, ev: ExprCode): ExprCode = { -super[CodegenFallback].doGenCode(ctx, ev) - } - - def toCodegen(): CaseWhenCodegen = { -CaseWhenCodegen(branches, elseValue) - } -} - -/** - * CaseWhen expression used when code generation condition is satisfied. 
- * OptimizeCodegen optimizer replaces CaseWhen into CaseWhenCodegen. - * - * @param branches seq of (branch condition, branch value) - * @param elseValue optional value for the else branch - */ -case class CaseWhenCodegen( -val branches: Seq[(Expression, Expression)], -val elseValue: Option[Expression] = None) - extends CaseWhenBase(branches, elseValue) with Serializable { override def doGenCode(ctx: CodegenContext, ev: ExprCode): ExprCode = { -// Generate code that looks like: -// -// condA = ... -// if (condA) { -// valueA -// } else { -// condB = ... -// if (condB) { -// valueB -// } else { -// condC = ... -// if (condC) { -// valueC -// } else { -// elseValue -// } -// } -// } +val conditionMet = ctx.freshName("caseWhenConditionMet") +ctx.addMutableState("boolean", ev.isNull, "") +ctx.addMutableState(ctx.javaType(dataType), ev.value, "") val cases = branches.map { case (condExpr, valueExpr) => val cond = condExpr.genCode(ctx) val res = valueExpr.genCode(ctx) s""" -${cond.code} -if (!${cond.isNull} && ${cond.value}) { - ${res.code} - ${ev.isNull} = ${res.isNull}; - ${ev.value} = ${res.value}; +if(!$conditionMet) { + ${cond.code} + if (!${cond.isNull} && ${cond.value}) { +${res.code} +${ev.isNull} = ${res.isNull}; +${ev.value} = ${res.value}; +$conditionMet = true; + } } """ } -var generatedCode = cases.mkString("", "\nelse {\n", "\nelse {\n") - -elseValue.foreach { elseExpr => +val elseCode = elseValue.map { elseExpr => val res = elseExpr.genCode(ctx) - generatedCode += -s""" + s""" +if(!$conditionMet) { ${res.code} ${ev.isNull} = ${res.isNull}; ${ev.value} = ${res.value}; -""" +} + """ } -generatedCode += "}\n" * cases.size +val allConditions = cases ++ elseCode + +val code = if (ctx.INPUT_ROW == null || ctx.currentVars != null) { + allConditions.mkString("\n") +} else { + ctx.splitExpressions(allConditions, "caseWhen", --- End diff -- Style issue. Indent ---
[GitHub] spark pull request #19752: [SPARK-22520][SQL] Support code generation for la...
Github user gatorsmile commented on a diff in the pull request:

    https://github.com/apache/spark/pull/19752#discussion_r153079586

    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/conditionalExpressions.scala ---
    @@ -211,111 +231,62 @@ abstract class CaseWhenBase(
         val elseCase = elseValue.map(" ELSE " + _.sql).getOrElse("")
         "CASE" + cases + elseCase + " END"
       }
    -}
    -
    -
    -/**
    - * Case statements of the form "CASE WHEN a THEN b [WHEN c THEN d]* [ELSE e] END".
    - * When a = true, returns b; when c = true, returns d; else returns e.
    - *
    - * @param branches seq of (branch condition, branch value)
    - * @param elseValue optional value for the else branch
    - */
    -// scalastyle:off line.size.limit
    -@ExpressionDescription(
    -  usage = "CASE WHEN expr1 THEN expr2 [WHEN expr3 THEN expr4]* [ELSE expr5] END - When `expr1` = true, returns `expr2`; else when `expr3` = true, returns `expr4`; else returns `expr5`.",
    -  arguments = """
    -    Arguments:
    -      * expr1, expr3 - the branch condition expressions should all be boolean type.
    -      * expr2, expr4, expr5 - the branch value expressions and else value expression should all be
    -          same type or coercible to a common type.
    -  """,
    -  examples = """
    -    Examples:
    -      > SELECT CASE WHEN 1 > 0 THEN 1 WHEN 2 > 0 THEN 2.0 ELSE 1.2 END;
    -       1
    -      > SELECT CASE WHEN 1 < 0 THEN 1 WHEN 2 > 0 THEN 2.0 ELSE 1.2 END;
    -       2
    -      > SELECT CASE WHEN 1 < 0 THEN 1 WHEN 2 < 0 THEN 2.0 ELSE null END;
    -       NULL
    -  """)
    -// scalastyle:on line.size.limit
    -case class CaseWhen(
    -    val branches: Seq[(Expression, Expression)],
    -    val elseValue: Option[Expression] = None)
    -  extends CaseWhenBase(branches, elseValue) with CodegenFallback with Serializable {
    -
    -  override def doGenCode(ctx: CodegenContext, ev: ExprCode): ExprCode = {
    -    super[CodegenFallback].doGenCode(ctx, ev)
    -  }
    -
    -  def toCodegen(): CaseWhenCodegen = {
    -    CaseWhenCodegen(branches, elseValue)
    -  }
    -}
    -
    -/**
    - * CaseWhen expression used when code generation condition is satisfied.
    - * OptimizeCodegen optimizer replaces CaseWhen into CaseWhenCodegen.
    - *
    - * @param branches seq of (branch condition, branch value)
    - * @param elseValue optional value for the else branch
    - */
    -case class CaseWhenCodegen(
    -    val branches: Seq[(Expression, Expression)],
    -    val elseValue: Option[Expression] = None)
    -  extends CaseWhenBase(branches, elseValue) with Serializable {

       override def doGenCode(ctx: CodegenContext, ev: ExprCode): ExprCode = {
    -    // Generate code that looks like:
    -    //
    -    // condA = ...
    -    // if (condA) {
    -    //   valueA
    -    // } else {
    -    //   condB = ...
    -    //   if (condB) {
    -    //     valueB
    -    //   } else {
    -    //     condC = ...
    -    //     if (condC) {
    -    //       valueC
    -    //     } else {
    -    //       elseValue
    -    //     }
    -    //   }
    -    // }
    +    val conditionMet = ctx.freshName("caseWhenConditionMet")
    --- End diff --

    Add a comment to explain what it is.
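To illustrate what a flag like `caseWhenConditionMet` is for, here is a minimal runtime sketch in plain Scala. It is not the code Spark generates. It assumes the flat `if (!conditionMet)` blocks are grouped into helper methods that receive the flag and return its updated value, and that the result lives in fields (mirroring `ctx.addMutableState`) so that any helper can assign it. The names `ConditionMetSketch`, `caseWhen0`, `caseWhen1` and `eval` are hypothetical.

```scala
object ConditionMetSketch {
  // Stand-ins for ev.isNull / ev.value. Fields, not locals, so that any
  // helper method can assign them (assumption mirroring addMutableState).
  private var isNull: Boolean = true
  private var value: Int = 0

  // Hypothetical helper holding the first chunk of branches. It receives the
  // flag and returns it, because a method cannot mutate the caller's local.
  private def caseWhen0(id: Long, met: Boolean): Boolean = {
    var conditionMet = met
    if (!conditionMet) {
      if (id == 0L) { isNull = false; value = 10; conditionMet = true }
    }
    if (!conditionMet) {
      if (id == 1L) { isNull = false; value = 11; conditionMet = true }
    }
    conditionMet
  }

  // Hypothetical helper holding the second chunk of branches.
  private def caseWhen1(id: Long, met: Boolean): Boolean = {
    var conditionMet = met
    if (!conditionMet) {
      if (id == 2L) { isNull = false; value = 12; conditionMet = true }
    }
    conditionMet
  }

  // Evaluates the whole CASE WHEN; None models SQL NULL.
  def eval(id: Long, elseValue: Option[Int]): Option[Int] = {
    isNull = true
    value = 0
    var conditionMet = false
    conditionMet = caseWhen0(id, conditionMet)
    conditionMet = caseWhen1(id, conditionMet)
    // The ELSE branch is just one more guarded block.
    if (!conditionMet) {
      elseValue.foreach { v => isNull = false; value = v }
    }
    if (isNull) None else Some(value)
  }

  def main(args: Array[String]): Unit = {
    println(eval(1L, Some(0)))
    println(eval(7L, Some(0)))
    println(eval(7L, None))
  }
}
```

Once a branch sets the flag, every later block is skipped, which gives the same first-match semantics as the nested `else` chain without requiring the branches to live in one method.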
[GitHub] spark pull request #19752: [SPARK-22520][SQL] Support code generation for la...
Github user gatorsmile commented on a diff in the pull request:

    https://github.com/apache/spark/pull/19752#discussion_r153079959

    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/conditionalExpressions.scala ---
    @@ -141,14 +141,34 @@ case class If(predicate: Expression, trueValue: Expression, falseValue: Expressi
     }

     /**
    - * Abstract parent class for common logic in CaseWhen and CaseWhenCodegen.
    + * Case statements of the form "CASE WHEN a THEN b [WHEN c THEN d]* [ELSE e] END".
    + * When a = true, returns b; when c = true, returns d; else returns e.
      *
      * @param branches seq of (branch condition, branch value)
      * @param elseValue optional value for the else branch
      */
    -abstract class CaseWhenBase(
    +// scalastyle:off line.size.limit
    +@ExpressionDescription(
    +  usage = "CASE WHEN expr1 THEN expr2 [WHEN expr3 THEN expr4]* [ELSE expr5] END - When `expr1` = true, returns `expr2`; else when `expr3` = true, returns `expr4`; else returns `expr5`.",
    +  arguments = """
    +    Arguments:
    +      * expr1, expr3 - the branch condition expressions should all be boolean type.
    +      * expr2, expr4, expr5 - the branch value expressions and else value expression should all be
    +          same type or coercible to a common type.
    +  """,
    +  examples = """
    +    Examples:
    +      > SELECT CASE WHEN 1 > 0 THEN 1 WHEN 2 > 0 THEN 2.0 ELSE 1.2 END;
    +       1
    +      > SELECT CASE WHEN 1 < 0 THEN 1 WHEN 2 > 0 THEN 2.0 ELSE 1.2 END;
    +       2
    +      > SELECT CASE WHEN 1 < 0 THEN 1 WHEN 2 < 0 THEN 2.0 ELSE null END;
    --- End diff --

    Could you double check Hive returns NULL in the following case?

    > SELECT CASE WHEN 1 < 0 THEN 1 WHEN 2 < 0 THEN 2.0 END;
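The case being asked about can be modelled in a few lines of plain Scala. This is only a sketch of the usual SQL rule, under the assumption that a missing ELSE behaves like `ELSE NULL` and that a NULL condition counts as not matched; it does not verify what Hive actually returns. `CaseWhenSemanticsSketch` and `caseWhen` are hypothetical names, and `None` stands in for SQL NULL.

```scala
object CaseWhenSemanticsSketch {
  // None models SQL NULL, both for conditions and for values.
  def caseWhen[A](
      branches: Seq[(Option[Boolean], Option[A])],
      elseValue: Option[A] = None): Option[A] =
    branches
      .collectFirst { case (Some(true), v) => v } // NULL or false conditions are skipped
      .getOrElse(elseValue)                        // a missing ELSE behaves like ELSE NULL

  def main(args: Array[String]): Unit = {
    // CASE WHEN 1 < 0 THEN 1 WHEN 2 < 0 THEN 2.0 [ELSE ...] END
    val noMatch: Seq[(Option[Boolean], Option[Double])] =
      Seq((Some(1 < 0), Some(1.0)), (Some(2 < 0), Some(2.0)))
    println(caseWhen(noMatch))            // no ELSE clause
    println(caseWhen(noMatch, Some(1.2))) // ELSE 1.2
  }
}
```

Under these assumptions, the query with no ELSE and the documented `ELSE null` example are indistinguishable, which is why the reviewer's variant is a useful extra check.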
[GitHub] spark pull request #19752: [SPARK-22520][SQL] Support code generation for la...
Github user gatorsmile commented on a diff in the pull request:

    https://github.com/apache/spark/pull/19752#discussion_r153079595

    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/conditionalExpressions.scala ---
    @@ -211,111 +231,62 @@ abstract class CaseWhenBase(
         val elseCase = elseValue.map(" ELSE " + _.sql).getOrElse("")
         "CASE" + cases + elseCase + " END"
       }
    -}
    -
    -
    -/**
    - * Case statements of the form "CASE WHEN a THEN b [WHEN c THEN d]* [ELSE e] END".
    - * When a = true, returns b; when c = true, returns d; else returns e.
    - *
    - * @param branches seq of (branch condition, branch value)
    - * @param elseValue optional value for the else branch
    - */
    -// scalastyle:off line.size.limit
    -@ExpressionDescription(
    -  usage = "CASE WHEN expr1 THEN expr2 [WHEN expr3 THEN expr4]* [ELSE expr5] END - When `expr1` = true, returns `expr2`; else when `expr3` = true, returns `expr4`; else returns `expr5`.",
    -  arguments = """
    -    Arguments:
    -      * expr1, expr3 - the branch condition expressions should all be boolean type.
    -      * expr2, expr4, expr5 - the branch value expressions and else value expression should all be
    -          same type or coercible to a common type.
    -  """,
    -  examples = """
    -    Examples:
    -      > SELECT CASE WHEN 1 > 0 THEN 1 WHEN 2 > 0 THEN 2.0 ELSE 1.2 END;
    -       1
    -      > SELECT CASE WHEN 1 < 0 THEN 1 WHEN 2 > 0 THEN 2.0 ELSE 1.2 END;
    -       2
    -      > SELECT CASE WHEN 1 < 0 THEN 1 WHEN 2 < 0 THEN 2.0 ELSE null END;
    -       NULL
    -  """)
    -// scalastyle:on line.size.limit
    -case class CaseWhen(
    -    val branches: Seq[(Expression, Expression)],
    -    val elseValue: Option[Expression] = None)
    -  extends CaseWhenBase(branches, elseValue) with CodegenFallback with Serializable {
    -
    -  override def doGenCode(ctx: CodegenContext, ev: ExprCode): ExprCode = {
    -    super[CodegenFallback].doGenCode(ctx, ev)
    -  }
    -
    -  def toCodegen(): CaseWhenCodegen = {
    -    CaseWhenCodegen(branches, elseValue)
    -  }
    -}
    -
    -/**
    - * CaseWhen expression used when code generation condition is satisfied.
    - * OptimizeCodegen optimizer replaces CaseWhen into CaseWhenCodegen.
    - *
    - * @param branches seq of (branch condition, branch value)
    - * @param elseValue optional value for the else branch
    - */
    -case class CaseWhenCodegen(
    -    val branches: Seq[(Expression, Expression)],
    -    val elseValue: Option[Expression] = None)
    -  extends CaseWhenBase(branches, elseValue) with Serializable {

       override def doGenCode(ctx: CodegenContext, ev: ExprCode): ExprCode = {
    -    // Generate code that looks like:
    -    //
    -    // condA = ...
    -    // if (condA) {
    -    //   valueA
    -    // } else {
    -    //   condB = ...
    -    //   if (condB) {
    -    //     valueB
    -    //   } else {
    -    //     condC = ...
    -    //     if (condC) {
    -    //       valueC
    -    //     } else {
    -    //       elseValue
    -    //     }
    -    //   }
    -    // }
    +    val conditionMet = ctx.freshName("caseWhenConditionMet")
    +    ctx.addMutableState("boolean", ev.isNull, "")
    --- End diff --

    `ctx.JAVA_BOOLEAN`
[GitHub] spark pull request #19752: [SPARK-22520][SQL] Support code generation for la...
Github user gatorsmile commented on a diff in the pull request:

    https://github.com/apache/spark/pull/19752#discussion_r153080028

    --- Diff: sql/core/src/test/scala/org/apache/spark/sql/DataFrameSuite.scala ---
    @@ -2126,4 +2126,17 @@ class DataFrameSuite extends QueryTest with SharedSQLContext {
         val mean = result.select("DecimalCol").where($"summary" === "mean")
         assert(mean.collect().toSet === Set(Row("0.034567890")))
       }
    +
    +  test("SPARK-22520: support code generation for large CaseWhen") {
    +    val N = 30
    +    var expr1 = when($"id" === lit(0), 0)
    +    var expr2 = when($"id" === lit(0), 10)
    +    (1 to N).foreach { i =>
    +      expr1 = expr1.when($"id" === lit(i), -i)
    +      expr2 = expr2.when($"id" === lit(i + 10), i)
    +    }
    +    val df = spark.range(1).select(expr1, expr2.otherwise(0))
    +    df.show
    --- End diff --

    compare the results?
[GitHub] spark pull request #19752: [SPARK-22520][SQL] Support code generation for la...
Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/19752#discussion_r153079696 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/conditionalExpressions.scala --- @@ -211,111 +231,62 @@ abstract class CaseWhenBase( val elseCase = elseValue.map(" ELSE " + _.sql).getOrElse("") "CASE" + cases + elseCase + " END" } -} - - -/** - * Case statements of the form "CASE WHEN a THEN b [WHEN c THEN d]* [ELSE e] END". - * When a = true, returns b; when c = true, returns d; else returns e. - * - * @param branches seq of (branch condition, branch value) - * @param elseValue optional value for the else branch - */ -// scalastyle:off line.size.limit -@ExpressionDescription( - usage = "CASE WHEN expr1 THEN expr2 [WHEN expr3 THEN expr4]* [ELSE expr5] END - When `expr1` = true, returns `expr2`; else when `expr3` = true, returns `expr4`; else returns `expr5`.", - arguments = """ -Arguments: - * expr1, expr3 - the branch condition expressions should all be boolean type. - * expr2, expr4, expr5 - the branch value expressions and else value expression should all be - same type or coercible to a common type. - """, - examples = """ -Examples: - > SELECT CASE WHEN 1 > 0 THEN 1 WHEN 2 > 0 THEN 2.0 ELSE 1.2 END; - 1 - > SELECT CASE WHEN 1 < 0 THEN 1 WHEN 2 > 0 THEN 2.0 ELSE 1.2 END; - 2 - > SELECT CASE WHEN 1 < 0 THEN 1 WHEN 2 < 0 THEN 2.0 ELSE null END; - NULL - """) -// scalastyle:on line.size.limit -case class CaseWhen( -val branches: Seq[(Expression, Expression)], -val elseValue: Option[Expression] = None) - extends CaseWhenBase(branches, elseValue) with CodegenFallback with Serializable { - - override def doGenCode(ctx: CodegenContext, ev: ExprCode): ExprCode = { -super[CodegenFallback].doGenCode(ctx, ev) - } - - def toCodegen(): CaseWhenCodegen = { -CaseWhenCodegen(branches, elseValue) - } -} - -/** - * CaseWhen expression used when code generation condition is satisfied. 
- * OptimizeCodegen optimizer replaces CaseWhen into CaseWhenCodegen. - * - * @param branches seq of (branch condition, branch value) - * @param elseValue optional value for the else branch - */ -case class CaseWhenCodegen( -val branches: Seq[(Expression, Expression)], -val elseValue: Option[Expression] = None) - extends CaseWhenBase(branches, elseValue) with Serializable { override def doGenCode(ctx: CodegenContext, ev: ExprCode): ExprCode = { -// Generate code that looks like: -// -// condA = ... -// if (condA) { -// valueA -// } else { -// condB = ... -// if (condB) { -// valueB -// } else { -// condC = ... -// if (condC) { -// valueC -// } else { -// elseValue -// } -// } -// } +val conditionMet = ctx.freshName("caseWhenConditionMet") +ctx.addMutableState("boolean", ev.isNull, "") +ctx.addMutableState(ctx.javaType(dataType), ev.value, "") val cases = branches.map { case (condExpr, valueExpr) => val cond = condExpr.genCode(ctx) val res = valueExpr.genCode(ctx) s""" -${cond.code} -if (!${cond.isNull} && ${cond.value}) { - ${res.code} - ${ev.isNull} = ${res.isNull}; - ${ev.value} = ${res.value}; +if(!$conditionMet) { + ${cond.code} + if (!${cond.isNull} && ${cond.value}) { +${res.code} +${ev.isNull} = ${res.isNull}; +${ev.value} = ${res.value}; +$conditionMet = true; + } } """ } -var generatedCode = cases.mkString("", "\nelse {\n", "\nelse {\n") - -elseValue.foreach { elseExpr => +val elseCode = elseValue.map { elseExpr => val res = elseExpr.genCode(ctx) - generatedCode += -s""" + s""" +if(!$conditionMet) { ${res.code} ${ev.isNull} = ${res.isNull}; ${ev.value} = ${res.value}; -""" +} + """ } -generatedCode += "}\n" * cases.size +val allConditions = cases ++ elseCode + +val code = if (ctx.INPUT_ROW == null || ctx.currentVars != null) { + allConditions.mkString("\n") +} else { + ctx.splitExpressions(allConditions, "caseWhen", +("InternalRow", ctx.INPUT_ROW) :: ("boolean", conditionMet) :: Nil, returnType = "boolean",
[GitHub] spark pull request #19752: [SPARK-22520][SQL] Support code generation for la...
Github user mgaido91 commented on a diff in the pull request: https://github.com/apache/spark/pull/19752#discussion_r152865462 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/conditionalExpressions.scala --- @@ -211,111 +231,61 @@ abstract class CaseWhenBase( val elseCase = elseValue.map(" ELSE " + _.sql).getOrElse("") "CASE" + cases + elseCase + " END" } -} - - -/** - * Case statements of the form "CASE WHEN a THEN b [WHEN c THEN d]* [ELSE e] END". - * When a = true, returns b; when c = true, returns d; else returns e. - * - * @param branches seq of (branch condition, branch value) - * @param elseValue optional value for the else branch - */ -// scalastyle:off line.size.limit -@ExpressionDescription( - usage = "CASE WHEN expr1 THEN expr2 [WHEN expr3 THEN expr4]* [ELSE expr5] END - When `expr1` = true, returns `expr2`; else when `expr3` = true, returns `expr4`; else returns `expr5`.", - arguments = """ -Arguments: - * expr1, expr3 - the branch condition expressions should all be boolean type. - * expr2, expr4, expr5 - the branch value expressions and else value expression should all be - same type or coercible to a common type. - """, - examples = """ -Examples: - > SELECT CASE WHEN 1 > 0 THEN 1 WHEN 2 > 0 THEN 2.0 ELSE 1.2 END; - 1 - > SELECT CASE WHEN 1 < 0 THEN 1 WHEN 2 > 0 THEN 2.0 ELSE 1.2 END; - 2 - > SELECT CASE WHEN 1 < 0 THEN 1 WHEN 2 < 0 THEN 2.0 ELSE null END; - NULL - """) -// scalastyle:on line.size.limit -case class CaseWhen( -val branches: Seq[(Expression, Expression)], -val elseValue: Option[Expression] = None) - extends CaseWhenBase(branches, elseValue) with CodegenFallback with Serializable { - - override def doGenCode(ctx: CodegenContext, ev: ExprCode): ExprCode = { -super[CodegenFallback].doGenCode(ctx, ev) - } - - def toCodegen(): CaseWhenCodegen = { -CaseWhenCodegen(branches, elseValue) - } -} - -/** - * CaseWhen expression used when code generation condition is satisfied. 
- * OptimizeCodegen optimizer replaces CaseWhen into CaseWhenCodegen. - * - * @param branches seq of (branch condition, branch value) - * @param elseValue optional value for the else branch - */ -case class CaseWhenCodegen( -val branches: Seq[(Expression, Expression)], -val elseValue: Option[Expression] = None) - extends CaseWhenBase(branches, elseValue) with Serializable { override def doGenCode(ctx: CodegenContext, ev: ExprCode): ExprCode = { -// Generate code that looks like: -// -// condA = ... -// if (condA) { -// valueA -// } else { -// condB = ... -// if (condB) { -// valueB -// } else { -// condC = ... -// if (condC) { -// valueC -// } else { -// elseValue -// } -// } -// } +val conditionMet = ctx.freshName("caseWhenConditionMet") +ctx.addMutableState("boolean", ev.isNull, "") +ctx.addMutableState(ctx.javaType(dataType), ev.value, "") val cases = branches.map { case (condExpr, valueExpr) => val cond = condExpr.genCode(ctx) val res = valueExpr.genCode(ctx) s""" -${cond.code} -if (!${cond.isNull} && ${cond.value}) { - ${res.code} - ${ev.isNull} = ${res.isNull}; - ${ev.value} = ${res.value}; +if(!$conditionMet) { + ${cond.code} + if (!${cond.isNull} && ${cond.value}) { +${res.code} +${ev.isNull} = ${res.isNull}; +${ev.value} = ${res.value}; +$conditionMet = true; + } } """ } -var generatedCode = cases.mkString("", "\nelse {\n", "\nelse {\n") - -elseValue.foreach { elseExpr => +val elseCode = elseValue.map { elseExpr => val res = elseExpr.genCode(ctx) - generatedCode += -s""" + s""" +if(!$conditionMet) { ${res.code} ${ev.isNull} = ${res.isNull}; ${ev.value} = ${res.value}; -""" -} +} + """ +}.getOrElse("") -generatedCode += "}\n" * cases.size +val casesCode = if (ctx.INPUT_ROW == null || ctx.currentVars != null) { + cases.mkString("\n") +} else { + ctx.splitExpressions(cases, "caseWhen", --- End diff -- I can reuse the same UT added in #19797 for the 64KB limit, if it is ok for you. As far as the performance is
[GitHub] spark pull request #19752: [SPARK-22520][SQL] Support code generation for la...
Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/19752#discussion_r152643842 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/conditionalExpressions.scala --- @@ -211,111 +231,61 @@ abstract class CaseWhenBase( val elseCase = elseValue.map(" ELSE " + _.sql).getOrElse("") "CASE" + cases + elseCase + " END" } -} - - -/** - * Case statements of the form "CASE WHEN a THEN b [WHEN c THEN d]* [ELSE e] END". - * When a = true, returns b; when c = true, returns d; else returns e. - * - * @param branches seq of (branch condition, branch value) - * @param elseValue optional value for the else branch - */ -// scalastyle:off line.size.limit -@ExpressionDescription( - usage = "CASE WHEN expr1 THEN expr2 [WHEN expr3 THEN expr4]* [ELSE expr5] END - When `expr1` = true, returns `expr2`; else when `expr3` = true, returns `expr4`; else returns `expr5`.", - arguments = """ -Arguments: - * expr1, expr3 - the branch condition expressions should all be boolean type. - * expr2, expr4, expr5 - the branch value expressions and else value expression should all be - same type or coercible to a common type. - """, - examples = """ -Examples: - > SELECT CASE WHEN 1 > 0 THEN 1 WHEN 2 > 0 THEN 2.0 ELSE 1.2 END; - 1 - > SELECT CASE WHEN 1 < 0 THEN 1 WHEN 2 > 0 THEN 2.0 ELSE 1.2 END; - 2 - > SELECT CASE WHEN 1 < 0 THEN 1 WHEN 2 < 0 THEN 2.0 ELSE null END; - NULL - """) -// scalastyle:on line.size.limit -case class CaseWhen( -val branches: Seq[(Expression, Expression)], -val elseValue: Option[Expression] = None) - extends CaseWhenBase(branches, elseValue) with CodegenFallback with Serializable { - - override def doGenCode(ctx: CodegenContext, ev: ExprCode): ExprCode = { -super[CodegenFallback].doGenCode(ctx, ev) - } - - def toCodegen(): CaseWhenCodegen = { -CaseWhenCodegen(branches, elseValue) - } -} - -/** - * CaseWhen expression used when code generation condition is satisfied. 
- * OptimizeCodegen optimizer replaces CaseWhen into CaseWhenCodegen. - * - * @param branches seq of (branch condition, branch value) - * @param elseValue optional value for the else branch - */ -case class CaseWhenCodegen( -val branches: Seq[(Expression, Expression)], -val elseValue: Option[Expression] = None) - extends CaseWhenBase(branches, elseValue) with Serializable { override def doGenCode(ctx: CodegenContext, ev: ExprCode): ExprCode = { -// Generate code that looks like: -// -// condA = ... -// if (condA) { -// valueA -// } else { -// condB = ... -// if (condB) { -// valueB -// } else { -// condC = ... -// if (condC) { -// valueC -// } else { -// elseValue -// } -// } -// } +val conditionMet = ctx.freshName("caseWhenConditionMet") +ctx.addMutableState("boolean", ev.isNull, "") +ctx.addMutableState(ctx.javaType(dataType), ev.value, "") val cases = branches.map { case (condExpr, valueExpr) => val cond = condExpr.genCode(ctx) val res = valueExpr.genCode(ctx) s""" -${cond.code} -if (!${cond.isNull} && ${cond.value}) { - ${res.code} - ${ev.isNull} = ${res.isNull}; - ${ev.value} = ${res.value}; +if(!$conditionMet) { + ${cond.code} + if (!${cond.isNull} && ${cond.value}) { +${res.code} +${ev.isNull} = ${res.isNull}; +${ev.value} = ${res.value}; +$conditionMet = true; + } } """ } -var generatedCode = cases.mkString("", "\nelse {\n", "\nelse {\n") - -elseValue.foreach { elseExpr => +val elseCode = elseValue.map { elseExpr => val res = elseExpr.genCode(ctx) - generatedCode += -s""" + s""" +if(!$conditionMet) { ${res.code} ${ev.isNull} = ${res.isNull}; ${ev.value} = ${res.value}; -""" -} +} + """ +}.getOrElse("") -generatedCode += "}\n" * cases.size +val casesCode = if (ctx.INPUT_ROW == null || ctx.currentVars != null) { + cases.mkString("\n") +} else { + ctx.splitExpressions(cases, "caseWhen", --- End diff -- Then, could you show us a test case? Can be a performance test if the function is hard to hit a 64KB limit. ---
[GitHub] spark pull request #19752: [SPARK-22520][SQL] Support code generation for la...
Github user mgaido91 commented on a diff in the pull request: https://github.com/apache/spark/pull/19752#discussion_r152538038 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/conditionalExpressions.scala --- @@ -211,111 +231,61 @@ abstract class CaseWhenBase( val elseCase = elseValue.map(" ELSE " + _.sql).getOrElse("") "CASE" + cases + elseCase + " END" } -} - - -/** - * Case statements of the form "CASE WHEN a THEN b [WHEN c THEN d]* [ELSE e] END". - * When a = true, returns b; when c = true, returns d; else returns e. - * - * @param branches seq of (branch condition, branch value) - * @param elseValue optional value for the else branch - */ -// scalastyle:off line.size.limit -@ExpressionDescription( - usage = "CASE WHEN expr1 THEN expr2 [WHEN expr3 THEN expr4]* [ELSE expr5] END - When `expr1` = true, returns `expr2`; else when `expr3` = true, returns `expr4`; else returns `expr5`.", - arguments = """ -Arguments: - * expr1, expr3 - the branch condition expressions should all be boolean type. - * expr2, expr4, expr5 - the branch value expressions and else value expression should all be - same type or coercible to a common type. - """, - examples = """ -Examples: - > SELECT CASE WHEN 1 > 0 THEN 1 WHEN 2 > 0 THEN 2.0 ELSE 1.2 END; - 1 - > SELECT CASE WHEN 1 < 0 THEN 1 WHEN 2 > 0 THEN 2.0 ELSE 1.2 END; - 2 - > SELECT CASE WHEN 1 < 0 THEN 1 WHEN 2 < 0 THEN 2.0 ELSE null END; - NULL - """) -// scalastyle:on line.size.limit -case class CaseWhen( -val branches: Seq[(Expression, Expression)], -val elseValue: Option[Expression] = None) - extends CaseWhenBase(branches, elseValue) with CodegenFallback with Serializable { - - override def doGenCode(ctx: CodegenContext, ev: ExprCode): ExprCode = { -super[CodegenFallback].doGenCode(ctx, ev) - } - - def toCodegen(): CaseWhenCodegen = { -CaseWhenCodegen(branches, elseValue) - } -} - -/** - * CaseWhen expression used when code generation condition is satisfied. 
- * OptimizeCodegen optimizer replaces CaseWhen into CaseWhenCodegen. - * - * @param branches seq of (branch condition, branch value) - * @param elseValue optional value for the else branch - */ -case class CaseWhenCodegen( -val branches: Seq[(Expression, Expression)], -val elseValue: Option[Expression] = None) - extends CaseWhenBase(branches, elseValue) with Serializable { override def doGenCode(ctx: CodegenContext, ev: ExprCode): ExprCode = { -// Generate code that looks like: -// -// condA = ... -// if (condA) { -// valueA -// } else { -// condB = ... -// if (condB) { -// valueB -// } else { -// condC = ... -// if (condC) { -// valueC -// } else { -// elseValue -// } -// } -// } +val conditionMet = ctx.freshName("caseWhenConditionMet") +ctx.addMutableState("boolean", ev.isNull, "") +ctx.addMutableState(ctx.javaType(dataType), ev.value, "") val cases = branches.map { case (condExpr, valueExpr) => val cond = condExpr.genCode(ctx) val res = valueExpr.genCode(ctx) s""" -${cond.code} -if (!${cond.isNull} && ${cond.value}) { - ${res.code} - ${ev.isNull} = ${res.isNull}; - ${ev.value} = ${res.value}; +if(!$conditionMet) { + ${cond.code} + if (!${cond.isNull} && ${cond.value}) { +${res.code} +${ev.isNull} = ${res.isNull}; +${ev.value} = ${res.value}; +$conditionMet = true; + } } """ } -var generatedCode = cases.mkString("", "\nelse {\n", "\nelse {\n") - -elseValue.foreach { elseExpr => +val elseCode = elseValue.map { elseExpr => val res = elseExpr.genCode(ctx) - generatedCode += -s""" + s""" +if(!$conditionMet) { ${res.code} ${ev.isNull} = ${res.isNull}; ${ev.value} = ${res.value}; -""" -} +} + """ +}.getOrElse("") -generatedCode += "}\n" * cases.size +val casesCode = if (ctx.INPUT_ROW == null || ctx.currentVars != null) { + cases.mkString("\n") +} else { + ctx.splitExpressions(cases, "caseWhen", --- End diff -- But I think that this implicitly covers also #18641, even though its main goal is another. ---
[GitHub] spark pull request #19752: [SPARK-22520][SQL] Support code generation for la...
Github user mgaido91 commented on a diff in the pull request: https://github.com/apache/spark/pull/19752#discussion_r152537080 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/conditionalExpressions.scala --- @@ -211,111 +231,61 @@ abstract class CaseWhenBase( val elseCase = elseValue.map(" ELSE " + _.sql).getOrElse("") "CASE" + cases + elseCase + " END" } -} - - -/** - * Case statements of the form "CASE WHEN a THEN b [WHEN c THEN d]* [ELSE e] END". - * When a = true, returns b; when c = true, returns d; else returns e. - * - * @param branches seq of (branch condition, branch value) - * @param elseValue optional value for the else branch - */ -// scalastyle:off line.size.limit -@ExpressionDescription( - usage = "CASE WHEN expr1 THEN expr2 [WHEN expr3 THEN expr4]* [ELSE expr5] END - When `expr1` = true, returns `expr2`; else when `expr3` = true, returns `expr4`; else returns `expr5`.", - arguments = """ -Arguments: - * expr1, expr3 - the branch condition expressions should all be boolean type. - * expr2, expr4, expr5 - the branch value expressions and else value expression should all be - same type or coercible to a common type. - """, - examples = """ -Examples: - > SELECT CASE WHEN 1 > 0 THEN 1 WHEN 2 > 0 THEN 2.0 ELSE 1.2 END; - 1 - > SELECT CASE WHEN 1 < 0 THEN 1 WHEN 2 > 0 THEN 2.0 ELSE 1.2 END; - 2 - > SELECT CASE WHEN 1 < 0 THEN 1 WHEN 2 < 0 THEN 2.0 ELSE null END; - NULL - """) -// scalastyle:on line.size.limit -case class CaseWhen( -val branches: Seq[(Expression, Expression)], -val elseValue: Option[Expression] = None) - extends CaseWhenBase(branches, elseValue) with CodegenFallback with Serializable { - - override def doGenCode(ctx: CodegenContext, ev: ExprCode): ExprCode = { -super[CodegenFallback].doGenCode(ctx, ev) - } - - def toCodegen(): CaseWhenCodegen = { -CaseWhenCodegen(branches, elseValue) - } -} - -/** - * CaseWhen expression used when code generation condition is satisfied. 
- * OptimizeCodegen optimizer replaces CaseWhen into CaseWhenCodegen. - * - * @param branches seq of (branch condition, branch value) - * @param elseValue optional value for the else branch - */ -case class CaseWhenCodegen( -val branches: Seq[(Expression, Expression)], -val elseValue: Option[Expression] = None) - extends CaseWhenBase(branches, elseValue) with Serializable { override def doGenCode(ctx: CodegenContext, ev: ExprCode): ExprCode = { -// Generate code that looks like: -// -// condA = ... -// if (condA) { -// valueA -// } else { -// condB = ... -// if (condB) { -// valueB -// } else { -// condC = ... -// if (condC) { -// valueC -// } else { -// elseValue -// } -// } -// } +val conditionMet = ctx.freshName("caseWhenConditionMet") +ctx.addMutableState("boolean", ev.isNull, "") +ctx.addMutableState(ctx.javaType(dataType), ev.value, "") val cases = branches.map { case (condExpr, valueExpr) => val cond = condExpr.genCode(ctx) val res = valueExpr.genCode(ctx) s""" -${cond.code} -if (!${cond.isNull} && ${cond.value}) { - ${res.code} - ${ev.isNull} = ${res.isNull}; - ${ev.value} = ${res.value}; +if(!$conditionMet) { + ${cond.code} + if (!${cond.isNull} && ${cond.value}) { +${res.code} +${ev.isNull} = ${res.isNull}; +${ev.value} = ${res.value}; +$conditionMet = true; + } } """ } -var generatedCode = cases.mkString("", "\nelse {\n", "\nelse {\n") - -elseValue.foreach { elseExpr => +val elseCode = elseValue.map { elseExpr => val res = elseExpr.genCode(ctx) - generatedCode += -s""" + s""" +if(!$conditionMet) { ${res.code} ${ev.isNull} = ${res.isNull}; ${ev.value} = ${res.value}; -""" -} +} + """ +}.getOrElse("") -generatedCode += "}\n" * cases.size +val casesCode = if (ctx.INPUT_ROW == null || ctx.currentVars != null) { + cases.mkString("\n") +} else { + ctx.splitExpressions(cases, "caseWhen", --- End diff -- I think that we need to call it, indeed, as explained in this comment:
[GitHub] spark pull request #19752: [SPARK-22520][SQL] Support code generation for la...
Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/19752#discussion_r152482401 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/conditionalExpressions.scala --- @@ -211,111 +231,61 @@ abstract class CaseWhenBase( val elseCase = elseValue.map(" ELSE " + _.sql).getOrElse("") "CASE" + cases + elseCase + " END" } -} - - -/** - * Case statements of the form "CASE WHEN a THEN b [WHEN c THEN d]* [ELSE e] END". - * When a = true, returns b; when c = true, returns d; else returns e. - * - * @param branches seq of (branch condition, branch value) - * @param elseValue optional value for the else branch - */ -// scalastyle:off line.size.limit -@ExpressionDescription( - usage = "CASE WHEN expr1 THEN expr2 [WHEN expr3 THEN expr4]* [ELSE expr5] END - When `expr1` = true, returns `expr2`; else when `expr3` = true, returns `expr4`; else returns `expr5`.", - arguments = """ -Arguments: - * expr1, expr3 - the branch condition expressions should all be boolean type. - * expr2, expr4, expr5 - the branch value expressions and else value expression should all be - same type or coercible to a common type. - """, - examples = """ -Examples: - > SELECT CASE WHEN 1 > 0 THEN 1 WHEN 2 > 0 THEN 2.0 ELSE 1.2 END; - 1 - > SELECT CASE WHEN 1 < 0 THEN 1 WHEN 2 > 0 THEN 2.0 ELSE 1.2 END; - 2 - > SELECT CASE WHEN 1 < 0 THEN 1 WHEN 2 < 0 THEN 2.0 ELSE null END; - NULL - """) -// scalastyle:on line.size.limit -case class CaseWhen( -val branches: Seq[(Expression, Expression)], -val elseValue: Option[Expression] = None) - extends CaseWhenBase(branches, elseValue) with CodegenFallback with Serializable { - - override def doGenCode(ctx: CodegenContext, ev: ExprCode): ExprCode = { -super[CodegenFallback].doGenCode(ctx, ev) - } - - def toCodegen(): CaseWhenCodegen = { -CaseWhenCodegen(branches, elseValue) - } -} - -/** - * CaseWhen expression used when code generation condition is satisfied. 
- * OptimizeCodegen optimizer replaces CaseWhen into CaseWhenCodegen. - * - * @param branches seq of (branch condition, branch value) - * @param elseValue optional value for the else branch - */ -case class CaseWhenCodegen( -val branches: Seq[(Expression, Expression)], -val elseValue: Option[Expression] = None) - extends CaseWhenBase(branches, elseValue) with Serializable { override def doGenCode(ctx: CodegenContext, ev: ExprCode): ExprCode = { -// Generate code that looks like: -// -// condA = ... -// if (condA) { -// valueA -// } else { -// condB = ... -// if (condB) { -// valueB -// } else { -// condC = ... -// if (condC) { -// valueC -// } else { -// elseValue -// } -// } -// } +val conditionMet = ctx.freshName("caseWhenConditionMet") +ctx.addMutableState("boolean", ev.isNull, "") +ctx.addMutableState(ctx.javaType(dataType), ev.value, "") val cases = branches.map { case (condExpr, valueExpr) => val cond = condExpr.genCode(ctx) val res = valueExpr.genCode(ctx) s""" -${cond.code} -if (!${cond.isNull} && ${cond.value}) { - ${res.code} - ${ev.isNull} = ${res.isNull}; - ${ev.value} = ${res.value}; +if(!$conditionMet) { + ${cond.code} + if (!${cond.isNull} && ${cond.value}) { +${res.code} +${ev.isNull} = ${res.isNull}; +${ev.value} = ${res.value}; +$conditionMet = true; + } } """ } -var generatedCode = cases.mkString("", "\nelse {\n", "\nelse {\n") - -elseValue.foreach { elseExpr => +val elseCode = elseValue.map { elseExpr => val res = elseExpr.genCode(ctx) - generatedCode += -s""" + s""" +if(!$conditionMet) { ${res.code} ${ev.isNull} = ${res.isNull}; ${ev.value} = ${res.value}; -""" -} +} + """ +}.getOrElse("") -generatedCode += "}\n" * cases.size +val casesCode = if (ctx.INPUT_ROW == null || ctx.currentVars != null) { + cases.mkString("\n") +} else { + ctx.splitExpressions(cases, "caseWhen", --- End diff -- In almost all the cases, we do not need to call `splitExpressions` after merging the PR
[GitHub] spark pull request #19752: [SPARK-22520][SQL] Support code generation for la...
GitHub user mgaido91 opened a pull request:

    https://github.com/apache/spark/pull/19752

    [SPARK-22520][SQL] Support code generation for large CaseWhen

    ## What changes were proposed in this pull request?

    Code generation is disabled for CaseWhen when the number of branches is higher than `spark.sql.codegen.maxCaseBranches` (which defaults to 20). This was done to prevent the well known 64KB method limit exception.
    This PR proposes to support code generation also in those cases (without causing exceptions of course). As a side effect, we could get rid of the `spark.sql.codegen.maxCaseBranches` configuration.

    ## How was this patch tested?

    existing UTs

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/mgaido91/spark SPARK-22520

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/19752.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #19752

commit 98eaae9436adf63ec3023ee077f2fff8e23dfa35
Author: Marco Gaido
Date:   2017-11-14T17:41:00Z

    [SPARK-22520][SQL] Support code generation for large CaseWhen
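Why the shape of the generated code matters for the 64KB limit can be sketched without Spark. The plain-Scala example below builds Java-like source strings in two ways: as one nested `if ... else { ... }` chain, which is a single statement that cannot be cut apart, and as independent blocks guarded by a shared flag, which can be grouped into several method bodies. `CaseWhenCodegenSketch`, `Branch`, `nested`, `flat` and `chunk` are hypothetical names, and chunking by a fixed number of blocks is an assumption made for brevity; it is not how `ctx.splitExpressions` decides where to split.

```scala
object CaseWhenCodegenSketch {
  // Hypothetical stand-in for one (condition, value) pair; both are Java snippets.
  final case class Branch(cond: String, value: String)

  // Nested shape: every branch opens an `else {` that encloses all later
  // branches, so the whole CASE WHEN is one indivisible statement.
  def nested(branches: Seq[Branch], elseValue: Option[String]): String = {
    val opened = branches
      .map(b => s"if (${b.cond}) { result = ${b.value}; }")
      .mkString("", " else { ", " else { ")
    val tail = elseValue.map(v => s"result = $v; ").getOrElse("")
    opened + tail + ("}" * branches.size)
  }

  // Flat shape: every branch (and the ELSE) is a standalone block that is
  // skipped once an earlier branch has set the shared flag.
  def flat(branches: Seq[Branch], elseValue: Option[String]): Seq[String] = {
    val cases = branches.map { b =>
      s"if (!conditionMet) { if (${b.cond}) { result = ${b.value}; conditionMet = true; } }"
    }
    cases ++ elseValue.map(v => s"if (!conditionMet) { result = $v; }")
  }

  // Independent blocks can be grouped; each group could become the body of
  // its own helper method, keeping every method small.
  def chunk(blocks: Seq[String], blocksPerMethod: Int): Seq[String] =
    blocks.grouped(blocksPerMethod).map(_.mkString("\n")).toSeq

  def main(args: Array[String]): Unit = {
    val branches = (0 until 30).map(i => Branch(s"id == $i", s"${-i}"))
    println(nested(branches.take(2), Some("0")))
    chunk(flat(branches, Some("0")), 10).foreach { body =>
      println(s"--- helper body ---\n$body")
    }
  }
}
```

The flat form evaluates a flag test per branch even after a match, which is the price paid for being able to split the code across methods.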