gianm commented on code in PR #12350:
URL: https://github.com/apache/druid/pull/12350#discussion_r873069568
##########
docs/ingestion/native-batch.md:
##########
@@ -387,19 +400,44 @@ them to create the final segments. Finally, they push the final segments to the
> the task may fail if the input changes in between the two passes.
#### Multi-dimension range partitioning
-> Multiple dimension (multi-dimension) range partitioning is an experimental feature. Multi-dimension range partitioning is currently not supported in the sequential mode of the Parallel task.
-When you use multi-dimension partitioning for your data, Druid is able to distribute segment sizes more evenly than with single dimension partitioning.
+> Multiple dimension (multi-dimension) range partitioning is an experimental feature.
+> Multi-dimension range partitioning is currently not supported in the sequential mode of the
+> Parallel task.
-For segment pruning to be effective and translate into better query performance, you must include the first of your `partitionDimensions` in the `WHERE` clause at query time. For example, given the following `partitionDimensions`:
-```
- "partitionsSpec": {
- "type": "range",
- "partitionDimensions":["coutryName","cityName"],
- "targetRowsPerSegment" : 5000
+When you use multi-dimension range partitioning for your data, Druid is able to distribute segment
+sizes more evenly than with single dimension partitioning.
+
+Range partitioning has several benefits:
+
+1. Lower storage footprint due to combining similar data into the same segments, which improves compressibility.
+2. Better query performance due to Broker-level segment pruning, which removes segments from
+ consideration when they cannot possibly contain data matching the query filter.
+
+For Broker-level segment pruning to be effective, you must include a set of `partitionDimensions`,
+starting from the left, in the `WHERE` clause at query time using filters that support pruning.
+Filters that support pruning include:
+
+- Equality on literals that match the type of the column, like `x = 'foo'` and `x IN ('foo', 'bar')`
Review Comment:
I thought that non-string types would be converted into strings and would
still be included in the shard specs. IIRC you implemented it, though, so I'll
believe what you say. What actually happens in this case?
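
To make the question concrete, here is a minimal sketch of range-based pruning when every boundary and every filter literal is compared as a string. This is not Druid's code: `Range`, `may_contain`, and `prune` are hypothetical names, and the assumption that a non-string literal is stringified before comparison is exactly the behavior being asked about, not something confirmed here.

```python
from typing import List, Optional, Tuple

# Hypothetical stand-in for a single-dimension range shard spec:
# (start, end), start inclusive, end exclusive, None meaning unbounded.
Range = Tuple[Optional[str], Optional[str]]


def may_contain(rng: Range, value: str) -> bool:
    """Return True if a segment covering `rng` could hold `value`."""
    start, end = rng
    return (start is None or start <= value) and (end is None or value < end)


def prune(segments: List[Range], literal: object) -> List[int]:
    """Return indexes of segments that survive pruning for `dim = literal`.

    Assumption (unverified): the literal is converted to a string and
    compared lexicographically, the same way the boundaries were built.
    """
    value = str(literal)
    return [i for i, rng in enumerate(segments) if may_contain(rng, value)]


# Boundaries built from stringified values.
segments = [(None, "2"), ("2", "5"), ("5", None)]

print(prune(segments, 10))   # "10" sorts before "2" as a string
print(prune(segments, "3"))
```

Under that assumption pruning stays consistent as long as ingestion and query time use the same string conversion and ordering, even though the ordering is lexicographic rather than numeric.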
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]