suneet-s opened a new issue #9589: TransformSpec for firehoses appear to 
perform the operation twice
URL: https://github.com/apache/druid/issues/9589
 
 
   ### Affected Version
   
   Tested in 0.18
   
   ### Description
   
   I am writing integration tests for transform specs and noticed that when 
using a transform spec with a parser, the transformation is applied 
twice. See the ingestion spec below.
   
   You can reproduce this by symlinking `/resources` to 
`$DRUID_CODEBASE/integration-tests/src/test/resources`.
   
   ```
   {
       "type": "index",
       "spec": {
           "dataSchema": {
               "dataSource": "wiki-tests-2",
               "metricsSpec": [
                   {
                       "type": "count",
                       "name": "count"
                   },
                   {
                       "type": "doubleSum",
                       "name": "added",
                       "fieldName": "added"
                   },
                   {
                       "type": "doubleSum",
                       "name": "triple-added",
                       "fieldName": "triple-added"
                   },
                   {
                       "type": "doubleSum",
                       "name": "deleted",
                       "fieldName": "deleted"
                   },
                   {
                       "type": "doubleSum",
                       "name": "delta",
                       "fieldName": "delta"
                   },
                   {
                       "name": "thetaSketch",
                       "type": "thetaSketch",
                       "fieldName": "user"
                   },
                   {
                       "name": "quantilesDoublesSketch",
                       "type": "quantilesDoublesSketch",
                       "fieldName": "delta"
                   },
                   {
                       "name": "HLLSketchBuild",
                       "type": "HLLSketchBuild",
                       "fieldName": "user"
                   }
               ],
               "granularitySpec": {
                   "segmentGranularity": "DAY",
                   "queryGranularity": "second",
                   "intervals" : [ "2013-08-31/2013-09-02" ]
               },
               "parser": {
                   "parseSpec": {
                       "format" : "json",
                       "timestampSpec": {
                           "column": "timestamp"
                       },
                       "dimensionsSpec": {
                           "dimensions": [
                               "page",
                               "language",
                               "user",
                               "unpatrolled",
                               "newPage",
                               "robot",
                               "anonymous",
                               "namespace",
                               "continent",
                               "country",
                               "region",
                               "city"
                           ]
                       }
                   }
               },
               "transformSpec": {
                   "transforms": [
                       {
                           "type": "expression",
                           "name": "language",
                           "expression": "concat('l-', language)"
                       },
                       {
                           "type": "expression",
                           "name": "triple-added",
                           "expression": "added * 3"
                       }
                   ]
               }
           },
           "ioConfig": {
               "type": "index",
               "firehose": {
                   "type": "local",
                   "baseDir": "/resources/data/batch_index",
                   "filter": "wikipedia_index_data*"
               }
           },
           "tuningConfig": {
               "type": "index",
               "maxRowsPerSegment": 10
           }
       }
   }
   ```
   
   However, if you switch to the new format (inputSource/inputFormat instead of 
firehoses), the transformation is applied as expected.
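   
   For comparison, here is a minimal sketch of how the same spec might look in 
the inputSource/inputFormat layout. It is a reconstruction, not the exact spec 
used in the test. It assumes the native batch format documented for 0.17 and 
later, where `timestampSpec` and `dimensionsSpec` move out of `parser` to the 
top level of `dataSchema`, and the firehose is replaced by an `inputSource` 
plus an `inputFormat`. Only the parts that differ are shown. `metricsSpec`, 
`granularitySpec`, `transformSpec`, and `tuningConfig` are omitted and assumed 
to carry over unchanged.
   
   ```
   {
       "type": "index",
       "spec": {
           "dataSchema": {
               "dataSource": "wiki-tests-2",
               "timestampSpec": {
                   "column": "timestamp"
               },
               "dimensionsSpec": {
                   "dimensions": [
                       "page",
                       "language",
                       "user",
                       "unpatrolled",
                       "newPage",
                       "robot",
                       "anonymous",
                       "namespace",
                       "continent",
                       "country",
                       "region",
                       "city"
                   ]
               }
           },
           "ioConfig": {
               "type": "index",
               "inputSource": {
                   "type": "local",
                   "baseDir": "/resources/data/batch_index",
                   "filter": "wikipedia_index_data*"
               },
               "inputFormat": {
                   "type": "json"
               }
           }
       }
   }
   ```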

