teyeheimans opened a new issue #12393: URL: https://github.com/apache/druid/issues/12393
### Description Since [my last feature request](https://github.com/apache/druid/issues/8560) the bitwise operators have been added. These are VERY helpful and work _almost_ perfect. The problem now is that when we want to use them, you always have to make use of a virtual column. For example: A simple query like: ```sql SELECT * FROM "table" WHERE BITWISE_AND("flags", 4) = 0 LIMIT 10 ``` Converts into: ```json { "queryType": "scan", ... "virtualColumns": [ { "type": "expression", "name": "v0", "expression": "bitwiseAnd(\"flags\",4)", "outputType": "LONG" } ], ... "filter": { "type": "selector", "dimension": "v0", "value": "0", "extractionFn": null }, ... } ``` As you can see, due to the fact that we use a virtual column, we are able to use the bitwise expressions. However, druid has some other places where records can be filtered. For example, the [DruidInputSource](https://druid.apache.org/docs/latest/ingestion/native-batch.html#druid-input-source) and the [Filtered Aggregator](https://druid.apache.org/docs/latest/querying/aggregations.html#filtered-aggregator). In these places it is not possible to use a virtual column, and thus we cannot use the bitwise expressions. Example. Assume a table where we store the role a user can have in a "roles" field. This field is a digit, where each bit represents a role. Thus: 1 = Content moderator 2 = Financial user 4 = Data analyst 8 = Sales ... etc A person can have multiple roles of course. So somebody with a "roles" value of 5 is both Content moderator as data analyst. Now, back to druid. When I want to create a new druid dataSource and fill it with a native batch ingestion task with data from another druid dataSource, I cannot filter on the "roles" as I want to. See the "filter" part below: ```json { "type": "index_parallel", "spec": { "dataSchema": { "dataSource": "dataAnalystUsers", "timestampSpec": { ... }, "dimensionsSpec": { ... }, "metricsSpec": [ ... ], "granularitySpec": { ... } }, "ioConfig": { "type": "index_parallel", "inputSource": { "type": "druid", "dataSource": "users", "interval": "2022-02-12/P1D", "filter": { 'type': '????', 'dimension': 'roles', 'value': 4 } } }, "tuningConfig": { ... } } } ``` As you can see, we have no way to filter here using the bitwise expressions. My suggestion is to create native filter types for the bitwise expressions: Maybe something like below, but this is only a idea. ```json { 'filter': { 'type': 'bitwiseAnd', 'dimension': '<DIMENSION>' 'value': <VALUE>, 'expected': <VALUE> } } ``` -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected] --------------------------------------------------------------------- To unsubscribe, e-mail: [email protected] For additional commands, e-mail: [email protected]
