[GitHub] spark pull request #19813: [SPARK-22600][SQL] Fix 64kb limit for deeply nest...

viirya Mon, 04 Dec 2017 00:59:38 -0800

Github user viirya commented on a diff in the pull request:

    https://github.com/apache/spark/pull/19813#discussion_r154588045
  
    --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/ColumnarBatchScan.scala ---
    @@ -108,7 +108,10 @@ private[sql] trait ColumnarBatchScan extends CodegenSupport {
              |}""".stripMargin)
     
         ctx.currentVars = null
    +    // `rowIdx` isn't in `ctx.currentVars`. If the expressions are split later, we can't track it.
    +    // So making it as global variable.
    --- End diff --
    
    I think that works, although it feels a bit hacky. Something like:
    
    ```scala
      val rowidx = ctx.freshName("rowIdx")
      val rowidxExpr = AttributeReference("rowIdx", IntegerType, nullable = false)()
      val columnsBatchInput = (output zip colVars).map { case (attr, colVar) =>
        val exprCode = genCodeColumnVector(ctx, colVar, rowidx, attr.dataType, attr.nullable)
        // Attach `rowidx` as an input variable of the generated code so it
        // stays trackable if the expressions are split later.
        exprCode.inputVars = Seq(ExprInputVar(rowidxExpr,
          ExprCode("", isNull = "false", value = rowidx)))
        exprCode
      }
    ```
    
    Making `rowIdx` global adds just one global variable, so I don't think it is a big problem?


