Github user cloud-fan commented on a diff in the pull request:
https://github.com/apache/spark/pull/19813#discussion_r156287640
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/codegen/CodeGenerator.scala
---
@@ -55,8 +55,45 @@ import org.apache.spark.util.{ParentClassLoader, Utils}
* to null.
* @param value A term for a (possibly primitive) value of the result of the evaluation. Not
* valid if `isNull` is set to `true`.
+ * @param inputRow A term that holds the input row name when generating this code.
+ * @param inputVars A list of [[ExprInputVar]] that holds input variables when generating this code.
*/
-case class ExprCode(var code: String, var isNull: String, var value: String)
+case class ExprCode(
+ var code: String,
+ var isNull: String,
+ var value: String,
+ var inputRow: String = null,
+ var inputVars: Seq[ExprInputVar] = Seq.empty) {
+
+ // Returns true if this value is a literal.
+ def isLiteral(): Boolean = {
+ assert(value.nonEmpty, "ExprCode.value can't be empty string.")
+
+ if (value == "true" || value == "false" || value == "null") {
+ true
+ } else {
+ // The valid characters for the first character of a Java variable is [a-zA-Z_$].
+ value.head match {
+ case v if v >= 'a' && v <= 'z' => false
--- End diff --
Hmm, this seems very hard to do. The code is already generated and uses the
input names as they are, e.g. a Java variable `a`, a literal `123`, or an array
access `arr[1]`. Ideally we would analyze what each input really refers to:
`a` refers to the Java variable `a`, `123` refers to nothing, and `arr[1]`
refers to the Java variable `arr`. That is close to impossible in the current
string-based framework, so we need to think more about how to deal with it.
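
To make the three cases concrete, here is a rough sketch of the kind of analysis
I mean. `TermRefSketch` and `referencedVariable` are hypothetical names, not
anything in Spark, and the sketch assumes ASCII-only identifiers and terms that
are a bare identifier, a literal, or a simple array access:

```scala
// Hypothetical helper, not part of Spark: guesses which Java variable,
// if any, a generated-code term refers to.
object TermRefSketch {
  // Java literals that look like identifiers but refer to nothing.
  private val JavaLiterals = Set("true", "false", "null")

  // A plain Java identifier (ASCII only, which is an assumption of this sketch).
  private val Identifier = "([a-zA-Z_$][a-zA-Z0-9_$]*)".r

  // An identifier followed by one or more index expressions, e.g. arr[1] or m[i][j].
  private val ArrayAccess = """([a-zA-Z_$][a-zA-Z0-9_$]*)(?:\[[^\]]*\])+""".r

  def referencedVariable(term: String): Option[String] = term.trim match {
    case t if JavaLiterals.contains(t) => None
    case Identifier(name) => Some(name)   // `a` refers to variable `a`
    case ArrayAccess(name) => Some(name)  // `arr[1]` refers to variable `arr`
    case _ => None                        // `123`, or anything this sketch cannot parse
  }
}
```

Even this small version shows the problem: `arr[i]` also depends on `i`, and a
term like `a + b` falls through to `None`. Pattern matching on strings does not
scale to arbitrary generated expressions.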
---