tedyu opened a new pull request #30984:
URL: https://github.com/apache/spark/pull/30984
### What changes were proposed in this pull request?
This PR adds support for treating json / jsonb expressions as pushable
columns in PushableColumnBase. With this change, a SupportsPushDownFilters
implementation can push a Filter on such an expression down to the DB.
### Why are the changes needed?
Currently, an implementation of SupportsPushDownFilters has no chance to
perform pushdown on json expressions, even if the third-party DB engine
supports json expression pushdown.
This poses a challenge when a table schema uses json / jsonb columns to
encode a large amount of data.
### Does this PR introduce _any_ user-facing change?
No
### How was this patch tested?
Here is the plan prior to predicate pushdown:
```
2020-12-26 03:28:59,926 (Time-limited test) [DEBUG -
org.apache.spark.internal.Logging.logDebug(Logging.scala:61)] Adaptive
execution enabled for plan: Sort [id#34 ASC NULLS FIRST], true, 0
+- Project [id#34, address#35, phone#37, get_json_object(phone#37, $.code)
AS phone#33]
+- Filter (get_json_object(phone#37, $.code) = 1200)
+- BatchScan[id#34, address#35, phone#37] Cassandra Scan: test.person
- Cassandra Filters: []
- Requested Columns: [id,address,phone]
```
Here is the plan with pushdown:
```
2020-12-28 01:40:08,150 (Time-limited test) [DEBUG -
org.apache.spark.internal.Logging.logDebug(Logging.scala:61)] Adaptive
execution enabled for plan: Sort [id#34 ASC NULLS FIRST], true, 0
+- Project [id#34, address#35, phone#37, get_json_object(phone#37,
$.code) AS phone#33]
+- BatchScan[id#34, address#35, phone#37] Cassandra Scan: test.person
- Cassandra Filters: [[phone->'code' = ?, 1200]]
- Requested Columns: [id,address,phone]
```
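To illustrate the mapping visible in the two plans, here is a minimal sketch of how a `get_json_object(phone, '$.code')` predicate might be turned into the `phone->'code' = ?` form shown under `Cassandra Filters`. The class `JsonPathPushdown` and its methods are hypothetical names made up for this example; they are not part of Spark or of this PR, and the sketch assumes only single top-level field paths are pushable.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper for illustration only; not Spark API and not the PR's code. */
public class JsonPathPushdown {
    // Assumption: only a single top-level field such as "$.code" is pushable.
    private static final Pattern SIMPLE_PATH =
        Pattern.compile("\\$\\.([A-Za-z_][A-Za-z0-9_]*)");

    /** Returns e.g. phone->'code', or empty when the path is not a simple field lookup. */
    public static Optional<String> toColumnExpr(String column, String jsonPath) {
        Matcher m = SIMPLE_PATH.matcher(jsonPath);
        if (!m.matches()) {
            return Optional.empty();
        }
        return Optional.of(column + "->'" + m.group(1) + "'");
    }

    /** Builds the parameterized predicate text; the literal (e.g. 1200) is bound separately. */
    public static Optional<String> toEqualityFilter(String column, String jsonPath) {
        return toColumnExpr(column, jsonPath).map(expr -> expr + " = ?");
    }

    public static void main(String[] args) {
        // Simple field path: can be expressed as a pushed-down filter.
        System.out.println(toEqualityFilter("phone", "$.code").orElse("<not pushable>"));
        // Array index path: rejected by this sketch, so the Filter would stay in Spark.
        System.out.println(toEqualityFilter("phone", "$.a[0]").orElse("<not pushable>"));
    }
}
```

Returning an empty `Optional` for unsupported paths mirrors the usual pushdown contract: anything the data source cannot handle is left for Spark to evaluate after the scan.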