GitHub user viirya opened a pull request:

    https://github.com/apache/spark/pull/11810

    [SPARK-13996][SQL] Add more not null attributes for Filter codegen

    ## What changes were proposed in this pull request?
    JIRA: https://issues.apache.org/jira/browse/SPARK-13996
    
    Filter codegen determines which attributes are not null by looking for
    IsNotNull(a) expressions where child.output.contains(a). This check is not
    comprehensive: it misses attributes that are only referenced inside a larger
    expression, such as isnotnull(k + 1). This patch extends the check to cover
    such cases.
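
The idea can be illustrated outside Spark with a minimal sketch. It assumes a hypothetical miniature expression tree (`Attr`, `Lit`, `Add`, `Coalesce`, `IsNotNull`, `And`) rather than Catalyst's real classes, and it is not necessarily how the patch itself is implemented. The assumption is that if an expression returns null whenever any of its inputs is null, then `IsNotNull` over that expression implies every attribute it references is not null.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical stand-in for an expression tree; these are NOT Spark's classes.
final class NotNullInference {
    static abstract class Expr {
        final List<Expr> children;
        Expr(Expr... children) { this.children = Arrays.asList(children); }
        // True when a null input always makes this node evaluate to null.
        boolean nullIntolerant() { return false; }
    }
    static final class Attr extends Expr {
        final String name;
        Attr(String name) { super(); this.name = name; }
        @Override boolean nullIntolerant() { return true; }
    }
    static final class Lit extends Expr {
        final Object value;
        Lit(Object value) { super(); this.value = value; }
        @Override boolean nullIntolerant() { return true; }
    }
    static final class Add extends Expr {
        Add(Expr l, Expr r) { super(l, r); }
        @Override boolean nullIntolerant() { return true; }
    }
    // Coalesce may return a non-null value for a null input, so it keeps the default.
    static final class Coalesce extends Expr {
        Coalesce(Expr... cs) { super(cs); }
    }
    static final class IsNotNull extends Expr {
        IsNotNull(Expr c) { super(c); }
    }
    static final class And extends Expr {
        And(Expr l, Expr r) { super(l, r); }
    }

    // Split a condition into its top-level AND-ed predicates.
    static void conjuncts(Expr e, List<Expr> out) {
        if (e instanceof And) {
            for (Expr c : e.children) conjuncts(c, out);
        } else {
            out.add(e);
        }
    }

    // True when every node in the tree propagates null from its inputs.
    static boolean propagatesNull(Expr e) {
        if (!e.nullIntolerant()) return false;
        for (Expr c : e.children) {
            if (!propagatesNull(c)) return false;
        }
        return true;
    }

    // Collect the names of all attributes referenced in the tree.
    static void references(Expr e, Set<String> out) {
        if (e instanceof Attr) out.add(((Attr) e).name);
        for (Expr c : e.children) references(c, out);
    }

    // Attributes of childOutput that the condition proves to be not null.
    static Set<String> notNullAttributes(Expr condition, Set<String> childOutput) {
        List<Expr> preds = new ArrayList<>();
        conjuncts(condition, preds);
        Set<String> result = new LinkedHashSet<>();
        for (Expr p : preds) {
            if (p instanceof IsNotNull && propagatesNull(p.children.get(0))) {
                Set<String> refs = new LinkedHashSet<>();
                references(p.children.get(0), refs);
                if (childOutput.containsAll(refs)) result.addAll(refs);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        Set<String> output = new LinkedHashSet<>(Arrays.asList("k", "v"));
        Expr cond = new IsNotNull(new Add(new Attr("k"), new Lit(1)));
        System.out.println(notNullAttributes(cond, output)); // prints [k]
    }
}
```

With the condition `isnotnull(k + 1)` and child output `{k, v}`, the sketch reports `k` as not null. Wrapping `k` in `Coalesce` yields nothing, because `coalesce(k, 0)` can be non-null even when `k` is null.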
    
    E.g., for this plan:
    
        import org.apache.spark.sql.Row
        import org.apache.spark.sql.types.{IntegerType, StringType, StructType}

        val rdd = sqlContext.sparkContext.makeRDD(Seq(Row(1, "1"), Row(null, "1"), Row(2, "2")))
        val schema = new StructType().add("k", IntegerType).add("v", StringType)
        val smallDF = sqlContext.createDataFrame(rdd, schema)
        val df = smallDF.filter("isnotnull(k + 1)")
    
    The code snippet generated without this patch:
    
        protected void processNext() throws java.io.IOException {
          /*** PRODUCE: Filter isnotnull((k#0 + 1)) */

          /*** PRODUCE: INPUT */

          while (!shouldStop() && inputadapter_input.hasNext()) {
            InternalRow inputadapter_row = (InternalRow) inputadapter_input.next();
            /*** CONSUME: Filter isnotnull((k#0 + 1)) */
            /* input[0, int] */
            boolean filter_isNull = inputadapter_row.isNullAt(0);
            int filter_value = filter_isNull ? -1 : (inputadapter_row.getInt(0));

            /* isnotnull((input[0, int] + 1)) */
            /* (input[0, int] + 1) */
            boolean filter_isNull3 = true;
            int filter_value3 = -1;

            if (!filter_isNull) {
              filter_isNull3 = false; // resultCode could change nullability.
              filter_value3 = filter_value + 1;

            }
            if (!(!(filter_isNull3))) continue;

            filter_metricValue.add(1);
    
    With this patch:
    
        protected void processNext() throws java.io.IOException {
          /*** PRODUCE: Filter isnotnull((k#0 + 1)) */

          /*** PRODUCE: INPUT */

          while (!shouldStop() && inputadapter_input.hasNext()) {
            InternalRow inputadapter_row = (InternalRow) inputadapter_input.next();
            /*** CONSUME: Filter isnotnull((k#0 + 1)) */
            /* input[0, int] */
            boolean filter_isNull = inputadapter_row.isNullAt(0);
            int filter_value = filter_isNull ? -1 : (inputadapter_row.getInt(0));

            if (filter_isNull) continue;

            filter_metricValue.add(1);
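
Both generated loops keep the same rows, because `k + 1` is null exactly when `k` is null. The following sketch is a hand-written simplification, not Spark output: it replays the two null checks over the `k` column of the example rows, using the hypothetical names `FilterLoops`, `withoutPatch`, and `withPatch`, and it collects the surviving values instead of updating a metric.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical replay of the two null checks; a null cell models isNullAt(0).
final class FilterLoops {
    // Mirrors the unpatched loop: evaluate k + 1 with null tracking, then test it.
    static List<Integer> withoutPatch(Integer[] column) {
        List<Integer> kept = new ArrayList<>();
        for (Integer cell : column) {
            boolean isNull = (cell == null);
            int value = isNull ? -1 : cell;
            boolean sumIsNull = true;
            int sum = -1;
            if (!isNull) {
                sumIsNull = false;
                sum = value + 1;
            }
            if (sumIsNull) continue; // row filtered out
            kept.add(value);
        }
        return kept;
    }

    // Mirrors the patched loop: test the input attribute directly.
    static List<Integer> withPatch(Integer[] column) {
        List<Integer> kept = new ArrayList<>();
        for (Integer cell : column) {
            boolean isNull = (cell == null);
            int value = isNull ? -1 : cell;
            if (isNull) continue; // row filtered out
            kept.add(value);
        }
        return kept;
    }

    public static void main(String[] args) {
        Integer[] k = {1, null, 2}; // the k column of the example rows
        System.out.println(withoutPatch(k)); // prints [1, 2]
        System.out.println(withPatch(k));    // prints [1, 2]
    }
}
```

The patched loop reaches the same decision without evaluating the addition or tracking a second null flag.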
    
    
    ## How was this patch tested?
    
    Existing tests.
    


You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/viirya/spark-1 add-more-not-null-attrs

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/11810.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #11810
    
----
commit 8e59a7450db57ca210010fa3df1cb41a84311fed
Author: Liang-Chi Hsieh <[email protected]>
Date:   2016-03-18T03:36:40Z

    Add more not null attributes for Filter codegen.

----

