jon-wei opened a new issue #7970: Consider adding support for a pre-transform 
filter for transform specs
URL: https://github.com/apache/incubator-druid/issues/7970
 
 
   A user asked the following on the mailing list: 
https://groups.google.com/d/msg/druid-user/5QfVuff8MJw/65w9qsZ_BQAJ
   
   > I've noticed that in ingestion, when specifying the filter section in
   > transformSpec, I can filter only on fields that are in the dimensions list.
   > Even if the field is present in the raw data, if it does not appear in the
   > dimensions list the filter won't consider that field.
   > This creates a situation where I need to add a field to the Data Source's
   > dimensions even though I do not care about that field apart from filtering
   > purposes.
   >
   > How can we work around this?
   
   Transform spec filters are currently applied after the transforms:
   
   ```
     /**
      * Transforms an input row, or returns null if the row should be filtered out.
      *
      * @param row the input row
      */
     @Nullable
     public InputRow transform(@Nullable final InputRow row)
     {
       if (row == null) {
         return null;
       }
   
       final InputRow transformedRow;
   
       if (transforms.isEmpty()) {
         transformedRow = row;
       } else {
         transformedRow = new TransformedInputRow(row, transforms);
       }
   
       if (valueMatcher != null) {
         rowSupplierForValueMatcher.set(transformedRow);
         if (!valueMatcher.matches()) {
           return null;
         }
       }
   
       return transformedRow;
     }
   ```
   
   If pre-transform filters were supported, a user who doesn't want to keep the 
filter column could filter on it first and then apply a transform that nulls it 
out, so that it's not written to the final segments.
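The ordering described above could be sketched roughly as follows. This is a standalone illustration, not Druid code: rows are modeled as plain maps, and `PreTransformFilterSketch`, `preFilter`, `postFilter`, `KEEP_PROD`, `NULL_OUT_ENV`, and the `env`/`page` columns are all hypothetical names invented for the example.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Predicate;
import java.util.function.UnaryOperator;

public class PreTransformFilterSketch {
    // Hypothetical filter: keep only rows whose raw "env" column is "prod".
    static final Predicate<Map<String, Object>> KEEP_PROD =
        r -> "prod".equals(r.get("env"));

    // Hypothetical transform: null out the filter column after it has been used.
    static final UnaryOperator<Map<String, Object>> NULL_OUT_ENV = r -> {
        Map<String, Object> copy = new HashMap<>(r);
        copy.put("env", null);
        return copy;
    };

    // Small helper to build an example row.
    static Map<String, Object> row(String env, String page) {
        Map<String, Object> r = new HashMap<>();
        r.put("env", env);
        r.put("page", page);
        return r;
    }

    /** Returns the transformed row, or null if either filter rejects it. */
    static Map<String, Object> transform(
            Map<String, Object> row,
            Predicate<Map<String, Object>> preFilter,   // assumed new step: sees the raw row
            List<UnaryOperator<Map<String, Object>>> transforms,
            Predicate<Map<String, Object>> postFilter)  // mirrors the existing valueMatcher step
    {
        if (row == null) {
            return null;
        }
        if (preFilter != null && !preFilter.test(row)) {
            return null;
        }
        Map<String, Object> transformed = new HashMap<>(row);
        for (UnaryOperator<Map<String, Object>> t : transforms) {
            transformed = t.apply(transformed);
        }
        if (postFilter != null && !postFilter.test(transformed)) {
            return null;
        }
        return transformed;
    }

    public static void main(String[] args) {
        List<UnaryOperator<Map<String, Object>>> transforms =
            Collections.singletonList(NULL_OUT_ENV);

        // Pre-transform filter: the "prod" row is kept, with "env" nulled out.
        System.out.println(transform(row("prod", "home"), KEEP_PROD, transforms, null));
        // Pre-transform filter: the "dev" row is dropped, so this prints null.
        System.out.println(transform(row("dev", "home"), KEEP_PROD, transforms, null));
        // Same filter applied after the transform: "env" is already null, so
        // even the "prod" row is rejected and this prints null.
        System.out.println(transform(row("prod", "home"), null, transforms, KEEP_PROD));
    }
}
```

The last call shows why the ordering matters in this sketch: once the transform has nulled out the column, a filter that runs afterwards can no longer match on it.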
   
   Maybe there are better ways to support that use case as well.
   
   
   
   
   
