Github user eowhadi commented on a diff in the pull request:

    https://github.com/apache/incubator-trafodion/pull/255#discussion_r49391405
  
    --- Diff: core/sql/src/main/java/org/trafodion/sql/HTableClient.java ---
    @@ -421,29 +839,140 @@ else if (versions > 0)
          else
            numColsInScan = 0;
          if (colNamesToFilter != null) {
    -       FilterList list = new FilterList(FilterList.Operator.MUST_PASS_ALL);
    -
    -       for (int i = 0; i < colNamesToFilter.length; i++) {
    -         byte[] colName = (byte[])colNamesToFilter[i];
    -         byte[] coByte = (byte[])compareOpList[i];
    -         byte[] colVal = (byte[])colValuesToCompare[i];
    -
    -         if ((coByte == null) || (colVal == null)) {
    -           return false;
    -         }
    -
    -         String coStr = new String(coByte);
    -         CompareOp co = CompareOp.valueOf(coStr);
    -
    -         SingleColumnValueFilter filter1 = 
    -             new SingleColumnValueFilter(getFamily(colName), 
getName(colName), 
    -                 co, colVal);
    -         list.addFilter(filter1);
    -       }
    -
    +           FilterList list;
    +           boolean narrowDownResultColumns = false; //to check if we need 
a narrow down column filter (V2 only feature)
    +           if (compareOpList == null)return false;
    +           if (new String((byte[])compareOpList[0]).equals("V2")){ // are 
we dealing with predicate pushdown V2
    +                   list = new 
FilterList(FilterList.Operator.MUST_PASS_ALL);
    +                   HashMap<String,Object> columnsToRemove = new 
HashMap<String,Object>();
    +                   //if columnsToRemove not null, we are narrowing down 
using the SingleColumnValue[Exclude]Filter method
    +                   //else we will use the explicit FamilyFilter and 
QualifierFilter
    +                   //the simplified logic is that we can use the first 
method if and only if each and every column in the
    +                   //pushed down predicate shows up only once.
    +               for (int i = 0; i < colNamesToFilter.length; i++) {
    +                 byte[] colName = (byte[])colNamesToFilter[i];
    +         
    +                 // check if the filter column is already part of the 
column list, if not add it if we are limiting columns (not *)
    +                 if(columns!=null && columns.length > 0){// if not *
    +                     boolean columnAlreadyIn = false; //assume column not 
yet in the scan object
    +                     for (int k=0; k<columns.length;k++){
    +                             if (Arrays.equals(colName, 
(byte[])columns[k])){
    +                                     columnAlreadyIn = true;//found 
already exist
    +                                     break;//no need to look further
    +                             }
    +                     }
    +                     if (!columnAlreadyIn){// column was not already in, 
so add it
    +                             
scan.addColumn(getFamily(colName),getName(colName));
    +                             narrowDownResultColumns = true; //since we 
added a column for predicate eval, we need to remove it later out of result set
    +                             String strColName = new String(colName);
    +                             if (columnsToRemove != null && 
columnsToRemove.containsKey(strColName)){// if we already added this column, it 
means it shows up more than once
    +                                     columnsToRemove = null; // therefore, 
use the FamilyFilter/QualifierFilter method
    +                             }else if (columnsToRemove != null)// else 
    +                                     columnsToRemove.put(strColName,null); 
// add it to the list of column that should be nuked with the Exclude version 
of the SingleColumnValueFilter
    +                     }
    +                 }         
    +               }
    +               if (columnsToRemove != null)
    +               { //we are almost done checking if Exclude version of 
SingleColumnnValueFilter can be used. Th elast check s about to know if there 
is a IS_NULL_NULL
    +                 //operation that cannot be using the Exclude method, as 
it is transformed in a filterList with OR, therefore we cannot guaranty that 
the SingleColumnValueExcludeFilter
    +                 //performing the exclusion will be reached.
    +                   boolean is_null_nullFound = false;
    +                   for (Object o:compareOpList ){
    +                           if (new 
String((byte[])o).equals("IS_NULL_NULL")){
    +                                   is_null_nullFound = true;
    +                                   break;
    +                           }                                       
    +                   }
    +                   if (is_null_nullFound){
    +                           columnsToRemove = null; // disable Exclude 
method version of SingleColumnnValueFilter
    +                   }else
    +                           narrowDownResultColumns = false; // we will use 
the Exclude version of SingleColumnnValueFilter, so bypass the 
Family/QualifierFilter method
    +               }
    +               Filter f 
=constructV2Filter(colNamesToFilter,compareOpList,colValuesToCompare, 
columnsToRemove);
    +               if (f==null) return false; // error logging done inside 
constructV2Filter
    +               list.addFilter(f);
    +           }//end V2
    +           else{// deal with V1
    +               list = new FilterList(FilterList.Operator.MUST_PASS_ALL);
    +               
    +               for (int i = 0; i < colNamesToFilter.length; i++) {
    +                 byte[] colName = (byte[])colNamesToFilter[i];
    +                 byte[] coByte = (byte[])compareOpList[i];
    +                 byte[] colVal = (byte[])colValuesToCompare[i];
    +   
    +                 if ((coByte == null) || (colVal == null)) {
    +                   return false;
    +                 }
    +                 String coStr = new String(coByte);
    +                 CompareOp co = CompareOp.valueOf(coStr);
    +   
    +                 SingleColumnValueFilter filter1 = 
    +                     new SingleColumnValueFilter(getFamily(colName), 
getName(colName), 
    +                         co, colVal);
    +                 list.addFilter(filter1);
    +               }                   
    +           }//end V1
    +       // if we added a column for predicate eval, we need to filter down 
result columns
    +       FilterList resultColumnsOnlyFilter = null;
    +       if (narrowDownResultColumns){               
    +               HashMap<String,ArrayList<byte[]>> hm = new 
HashMap<String,ArrayList<byte[]>>(3);//use to deal with multiple family table
    +               // initialize hm with list of columns requested for output
    +                   for (int i=0; i<columns.length; i++){ // if we are here 
we know columns is not null
    +                           if (hm.containsKey(new 
String(getFamily((byte[])columns[i])))){
    +                                   hm.get(new 
String(getFamily((byte[])columns[i]))).add((byte[])columns[i]);
    +                           }else{
    +                                   ArrayList<byte[]> al = new 
ArrayList<byte[]>();
    +                                   al.add((byte[])columns[i]);
    +                                   hm.put(new 
String(getFamily((byte[])columns[i])), al);
    +                           }                               
    +                   }
    +                   
    +           if (hm.size()==1){//only one column family
    +                   resultColumnsOnlyFilter = new 
FilterList(FilterList.Operator.MUST_PASS_ALL);
    --- End diff --
    
    The resultColumnsOnlyFilter is used to filter out columns that we don't 
need in the result set. On line 834, we have all the columns that the scan 
operator will fetch from the RS, but those needed only for predicate 
evaluation can be filtered out and not returned to the ESP. So 
resultColumnsOnlyFilter is one way to do this filtering, alongside the 
other method, which uses SingleColumnValueExcludeFilter instead of the 
non-Exclude one. Here is a copy-paste of the section in the blueprint 
describing the trick:
    Always returning predicate columns:
    The current pushdown feature only makes use of SingleColumnValueFilter. 
However, there is a sibling filter, SingleColumnValueExcludeFilter, that 
behaves the same, except that it does not return the column on which it is 
filtering. Choosing correctly between the Exclude version and the regular 
version is the key to correctly handling whether or not the column 
participating in the filter is pushed. This trick alone is not sufficient, as 
during complex expression evaluations some nodes may not be evaluated due to 
“fast exit”. Therefore, we combine results using a MUST_PASS_ALL FilterList 
(built on an ArrayList to preserve order) with the original filter expression 
as its first filter, followed by a MUST_PASS_ONE FilterList of QualifierFilters 
that lets through every column needed in the result set. For multi-column-family 
tables, FamilyFilter needs to be added to the logic.
    To simplify the logic when deciding whether to use the “Exclude” 
method or the FamilyFilter/QualifierFilter method, we simply use the 
optimized Exclude method if and only if every column added to the result 
set shows up only once in the predicate pushdown. If at least one shows up 
twice, we fall back on Family/QualifierFilter. We also check that the 
predicate does not contain any IS_NULL_NULL operation, as these are transformed 
into an OR filter list, making the Exclude method inappropriate: we are not 
guaranteed that every filter inside the OR will be executed.
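
    The decision rule described above can be sketched in plain Java, without 
any HBase dependency. This is only a simplified model of the logic in the diff, 
not the code from the pull request: the class NarrowingStrategy, the enum 
Method, the method choose, and the use of String lists in place of the byte[] 
arrays are all hypothetical names and simplifications introduced for 
illustration. Only the IS_NULL_NULL operator name comes from the source.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical, simplified model of the choice between the two narrowing methods.
public class NarrowingStrategy {

    /** How columns fetched only for predicate evaluation are dropped from the result. */
    public enum Method { NONE, EXCLUDE_FILTER, FAMILY_QUALIFIER_FILTER }

    /**
     * @param requestedColumns columns wanted in the result; null or empty models "select *"
     * @param predicateColumns columns referenced by the pushed-down predicate, in order
     * @param compareOps       operator names of the pushed-down predicate (assumed strings)
     */
    public static Method choose(List<String> requestedColumns,
                                List<String> predicateColumns,
                                List<String> compareOps) {
        // With "select *" every column is returned anyway, so nothing is narrowed.
        if (requestedColumns == null || requestedColumns.isEmpty()) {
            return Method.NONE;
        }

        // Collect the columns that must be added to the scan purely for the predicate.
        Set<String> addedForPredicate = new HashSet<>();
        boolean addedColumnSeenTwice = false;
        for (String col : predicateColumns) {
            if (requestedColumns.contains(col)) {
                continue; // already part of the result, nothing to remove later
            }
            if (!addedForPredicate.add(col)) {
                addedColumnSeenTwice = true; // same added column used more than once
            }
        }

        if (addedForPredicate.isEmpty()) {
            return Method.NONE; // no extra column was fetched
        }
        // The Exclude method is only safe when each added column appears exactly once
        // and no IS_NULL_NULL operation (expanded into an OR list) is present.
        if (addedColumnSeenTwice || compareOps.contains("IS_NULL_NULL")) {
            return Method.FAMILY_QUALIFIER_FILTER;
        }
        return Method.EXCLUDE_FILTER;
    }

    public static void main(String[] args) {
        List<String> requested = Arrays.asList("cf:a");
        // "EQUAL" is a placeholder operator name used only for this illustration.
        System.out.println(choose(requested, Arrays.asList("cf:b"),
                Arrays.asList("EQUAL")));
        System.out.println(choose(requested, Arrays.asList("cf:b", "cf:b"),
                Arrays.asList("EQUAL", "EQUAL")));
        System.out.println(choose(requested, Arrays.asList("cf:b"),
                Arrays.asList("IS_NULL_NULL")));
    }
}
```

    In this model, the first call selects the Exclude method, while the second 
(column used twice) and the third (IS_NULL_NULL present) fall back on the 
Family/QualifierFilter method.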



