kaplanmaxe opened a new pull request #10230:
URL: https://github.com/apache/druid/pull/10230


   Related to #8560 
   
   This PR adds several bitwise expressions for use at ingestion time.
   
   **Use Case**
   
   Say you're doing CDC and your source table has a column of binary flags. For 
OLAP queries, it's extremely useful to extract these flags into their own 
dimensions, which the new expressions make possible at ingestion time.
   
   Example:
   
   - You might have a column in your source DB called `flags` that represents a 
series of binary flags. The bit with value 1 might mean an order was placed on 
mobile, the bit with value 2 might mean it was purchased on web, and the bit 
with value 4 might mean it was the user's first purchase.
   - Instead of ingesting the raw integer value of the `flags` column, you can 
break this column out into something like `mobile_ind`, `web_ind`, and 
`first_purchase_ind`, where the expressions are `bitwiseAnd(flags, 1)`, 
`bitwiseAnd(flags, 2)`, and `bitwiseAnd(flags, 4)` respectively.
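
The masking in the example above can be sketched in plain Python. This is only an illustration of the arithmetic, not Druid code: the mask constants and the helper names `extract_flags` and `extract_indicators` are hypothetical. Note that an AND with a mask returns the mask value itself (for example 4), so a strict 0/1 indicator needs one more normalization step, shown here as an ordinary comparison.

```python
# Hypothetical mask values, taken from the flags example above.
MOBILE = 1
WEB = 2
FIRST_PURCHASE = 4


def extract_flags(flags: int) -> dict:
    """Mirror bitwiseAnd(flags, mask) for each mask; values are 0 or the mask."""
    return {
        "mobile_ind": flags & MOBILE,
        "web_ind": flags & WEB,
        "first_purchase_ind": flags & FIRST_PURCHASE,
    }


def extract_indicators(flags: int) -> dict:
    """Normalize each masked value to 0 or 1."""
    return {name: int(value != 0) for name, value in extract_flags(flags).items()}


# flags = 5 has the bits with values 1 and 4 set: mobile, first purchase.
masked = extract_flags(5)
indicators = extract_indicators(5)
print(masked)
print(indicators)
```

Whether to store the masked value or a normalized 0/1 is a schema choice; the PR's example uses the masked value directly.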
   
   Druid SQL currently does not support bitwise operations (#8560), which makes 
these expressions even more valuable IMO.
   
   **Tested on a local cluster**
   
   Ingestion spec:
   
   ```
   {
     "type": "index_parallel",
     "spec": {
       "ioConfig": {
         "type": "index_parallel",
         "inputSource": {
           "type": "inline",
           "data": "\"x\",\"y\"\n4,2\n8,4\n16,8\n3,2\n5,2"
         },
         "inputFormat": {
           "type": "csv",
           "findColumnsFromHeader": true
         }
       },
       "tuningConfig": {
         "type": "index_parallel",
         "partitionsSpec": {
           "type": "dynamic"
         }
       },
       "dataSchema": {
         "dataSource": "bitwise_test",
         "granularitySpec": {
           "type": "uniform",
           "queryGranularity": "NONE",
           "rollup": false,
           "segmentGranularity": "YEAR"
         },
         "timestampSpec": {
           "column": "!!!_no_such_column_!!!",
           "missingValue": "2010-01-01T00:00:00Z"
         },
         "transformSpec": {
           "transforms": [
             {
               "type": "expression",
               "name": "zBitwiseAnd",
               "expression": "bitwiseAnd(CAST(x, 'LONG'), CAST(y, 'LONG'))"
             },
             {
               "type": "expression",
               "name": "zBitwiseOr",
               "expression": "bitwiseOr(CAST(x, 'LONG'), CAST(y, 'LONG'))"
             },
             {
               "type": "expression",
               "name": "zBitwiseComplement",
               "expression": "bitwiseComplement(CAST(x, 'LONG'))"
             },
             {
               "type": "expression",
               "expression": "bitwiseShiftLeft(CAST(x, 'LONG'), CAST(y, 'LONG'))",
               "name": "zBitwiseShiftLeft"
             },
             {
               "type": "expression",
               "name": "zBitwiseShiftRight",
               "expression": "bitwiseShiftRight(CAST(x, 'LONG'), CAST(y, 'LONG'))"
             },
             {
               "type": "expression",
               "name": "zBitwiseXor",
               "expression": "bitwiseXor(CAST(x, 'LONG'), CAST(y, 'LONG'))"
             }
           ]
         },
         "dimensionsSpec": {
           "dimensions": [
             {
               "type": "long",
               "name": "x"
             },
             {
               "type": "long",
               "name": "y"
             },
             {
               "type": "long",
               "name": "zBitwiseAnd"
             },
             {
               "type": "long",
               "name": "zBitwiseComplement"
             },
             {
               "type": "long",
               "name": "zBitwiseOr"
             },
             {
               "type": "long",
               "name": "zBitwiseShiftLeft"
             },
             {
               "type": "long",
               "name": "zBitwiseShiftRight"
             },
             {
               "type": "long",
               "name": "zBitwiseXor"
             }
           ]
         }
       }
     }
   }
   ```
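
For reference, the values these six transforms should produce on the five inline rows can be worked out by hand. The sketch below does that arithmetic in Python, assuming the new expressions follow ordinary two's-complement 64-bit `long` semantics (as Java's `&`, `|`, `^`, `~`, `<<`, `>>` do). That is an assumption, not something read off the screenshot. For inputs this small, Python's arbitrary-precision integers give the same results as 64-bit longs.

```python
# The five (x, y) rows from the inline CSV in the ingestion spec.
rows = [(4, 2), (8, 4), (16, 8), (3, 2), (5, 2)]


def bitwise_columns(x: int, y: int) -> dict:
    """Compute one output row, keyed by the transform names used in the spec."""
    return {
        "zBitwiseAnd": x & y,
        "zBitwiseOr": x | y,
        "zBitwiseComplement": ~x,      # two's complement: ~x == -x - 1
        "zBitwiseShiftLeft": x << y,
        "zBitwiseShiftRight": x >> y,  # arithmetic (sign-preserving) shift
        "zBitwiseXor": x ^ y,
    }


results = [bitwise_columns(x, y) for x, y in rows]
for (x, y), row in zip(rows, results):
    print(x, y, row)
```

Under that assumption, only the row `3,2` has a non-zero `zBitwiseAnd` (the value 2), and every `zBitwiseComplement` value is negative.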
   
   ![Screenshot from 2020-08-01 18-27-54](https://user-images.githubusercontent.com/3860450/89112251-f8bd6500-d42d-11ea-8aff-33f761a34f61.png)
   
   
   <hr>
   
   This PR has:
   - [x] been self-reviewed.
   - [x] added documentation for new or modified features or behaviors.
   - [x] added comments explaining the "why" and the intent of the code 
wherever it would not be obvious to an unfamiliar reader.
   - [x] been tested in a test Druid cluster.
   




