GitHub user chenghao-intel opened a pull request:
https://github.com/apache/spark/pull/482
[WIP][Spark-SQL] Optimize the Constant Folding for Expression
Currently, expressions do not handle a constant null well during constant
folding.
For example, Sum(a, null) always produces Literal(null, NumericType) at
runtime, yet the optimizer does not fold it ahead of time.
I changed the nullable interface from
def nullable: Boolean
to
def nullable: Nullability
Nullability has three concrete objects (neverNull, alwaysNull and
possibleNull), which give the Optimizer useful hints for constant folding.
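A minimal sketch of what such a three-valued type could look like, for illustration only. The object names neverNull, alwaysNull and possibleNull come from the description above; the sealed trait layout, the NullabilityOps helper and its propagate method are assumptions, not the code in this pull request.

```scala
// Hypothetical sketch: only the three object names are taken from the PR text.
sealed trait Nullability
case object neverNull extends Nullability
case object alwaysNull extends Nullability
case object possibleNull extends Nullability

// Assumed helper: combines child nullability for a null-propagating binary
// operator such as arithmetic addition.
object NullabilityOps {
  def propagate(left: Nullability, right: Nullability): Nullability =
    (left, right) match {
      // If either operand is always null, the result is always null.
      case (`alwaysNull`, _) | (_, `alwaysNull`) => alwaysNull
      // If neither operand can be null, the result cannot be null either.
      case (`neverNull`, `neverNull`) => neverNull
      // Otherwise nothing can be decided before runtime.
      case _ => possibleNull
    }
}
```

With a plain Boolean, "may be null" and "is certainly null" are indistinguishable; the third value is what would let an optimizer replace an expression with a null literal before execution.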
For example:
```
explain select isnull(key+null) from src;
== Logical Plan ==
Project [HiveGenericUdf#isnull((key#30 + CAST(null, IntegerType))) AS c_0#28]
MetastoreRelation default, src, None
== Optimized Logical Plan ==
Project [true AS c_0#28]
MetastoreRelation default, src, None
== Physical Plan ==
Project [true AS c_0#28]
HiveTableScan [], (MetastoreRelation default, src, None), None
```
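The rewrite in the plan above, where isnull(key + null) becomes the literal true, can be sketched with a toy expression tree. This is a simplified illustration under stated assumptions: Expr, Column, Lit, Add, IsNull and FoldNulls are hypothetical names, not Catalyst classes, and the rule ignores data types and casts.

```scala
// Toy model only; none of these classes are Spark's own.
sealed trait Nullability
case object neverNull extends Nullability
case object alwaysNull extends Nullability
case object possibleNull extends Nullability

sealed trait Expr { def nullable: Nullability }

// A table column: assumed here to be possibly null.
case class Column(name: String) extends Expr {
  val nullable: Nullability = possibleNull
}

// A literal is always null or never null, depending on its value.
case class Lit(value: Any) extends Expr {
  val nullable: Nullability = if (value == null) alwaysNull else neverNull
}

// Addition propagates null from either operand.
case class Add(left: Expr, right: Expr) extends Expr {
  val nullable: Nullability = (left.nullable, right.nullable) match {
    case (`alwaysNull`, _) | (_, `alwaysNull`) => alwaysNull
    case (`neverNull`, `neverNull`) => neverNull
    case _ => possibleNull
  }
}

// isnull(...) itself always yields a non-null boolean.
case class IsNull(child: Expr) extends Expr {
  val nullable: Nullability = neverNull
}

// Assumed folding rule: use the nullability hint to replace expressions
// whose result is known before execution.
object FoldNulls {
  def apply(e: Expr): Expr = e match {
    case IsNull(c) => c.nullable match {
      case `alwaysNull` => Lit(true)
      case `neverNull`  => Lit(false)
      case _            => IsNull(apply(c))
    }
    case a @ Add(l, r) =>
      if (a.nullable == alwaysNull) Lit(null) else Add(apply(l), apply(r))
    case other => other
  }
}
```

In this sketch, FoldNulls(IsNull(Add(Column("key"), Lit(null)))) evaluates to Lit(true), mirroring the optimized plan, while IsNull(Column("key")) is left untouched because its child is only possibleNull.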
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/chenghao-intel/spark optimize_constant_folding
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/spark/pull/482.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #482
----
commit 787509bdcc6e7f23425b53a624fc92d9b22e8f24
Author: Cheng Hao <[email protected]>
Date: 2014-04-22T06:01:40Z
unify the nullable interface
----