writer-jill commented on code in PR #13329:
URL: https://github.com/apache/druid/pull/13329#discussion_r1018012761


##########
docs/ingestion/ingestion-spec.md:
##########
@@ -477,35 +477,37 @@ The `indexSpec` object can include the following properties:
 |-----|-----------|-------|
 |bitmap|Compression format for bitmap indexes. Should be a JSON object with `type` set to `roaring` or `concise`. For type `roaring`, the boolean property `compressRunOnSerialization` (defaults to true) controls whether or not run-length encoding will be used when it is determined to be more space-efficient.|`{"type": "roaring"}`|
 |dimensionCompression|Compression format for dimension columns. Options are `lz4`, `lzf`, `zstd`, or `uncompressed`.|`lz4`|
-|stringDictionaryEncoding|Encoding format for string typed column value dictionaries.|`{"type":"utf8"}`|
+|stringDictionaryEncoding|Encoding format for STRING-typed column value dictionaries. The default setting `utf8` suits most use cases.<br>Example to enable front coding: `{"type":"frontCoded", "bucketSize": 4}`<br>`bucketSize` is the number of values to place in a bucket to perform delta encoding. Must be a power of 2, maximum is 128. Defaults to 4.<br>See [Front coding](#front-coding) for more information.|`{"type":"utf8"}`|
 |metricCompression|Compression format for primitive type metric columns. Options are `lz4`, `lzf`, `zstd`, `uncompressed`, or `none` (which is more efficient than `uncompressed`, but not supported by older versions of Druid).|`lz4`|
 |longEncoding|Encoding format for long-typed columns. Applies regardless of whether they are dimensions or metrics. Options are `auto` or `longs`. `auto` encodes the values using offset or lookup table depending on column cardinality, and store them with variable size. `longs` stores the value as-is with 8 bytes each.|`longs`|
 |jsonCompression|Compression format to use for nested column raw data. Options are `lz4`, `lzf`, `zstd`, or `uncompressed`.|`lz4`|
 
+##### Front coding
 
-#### String Dictionary Encoding
+By default, Druid stores values in STRING-typed columns as uncompressed UTF-8 encoded bytes.
 
-##### UTF8
-By default, `STRING` typed column store the values as uncompressed UTF8 encoded bytes.
+Starting in version 25.0, Druid can store STRING columns using an incremental encoding strategy called front coding. This allows Druid to create smaller UTF-8 encoded segments with very little performance cost.

Review Comment:
   Added COMPLEX<json> to this section.
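
For readers unfamiliar with the technique, the sketch below illustrates the general idea of front coding with a `bucketSize`, as described in the `stringDictionaryEncoding` row above. It is an illustration under stated assumptions, not Druid's actual on-disk format or implementation: the function names `shared_prefix_len`, `front_code`, and `front_decode` are hypothetical, and encoding each value against the previous value in its bucket is one common variant of incremental encoding that may differ from what Druid does.

```python
def shared_prefix_len(a: bytes, b: bytes) -> int:
    """Return the number of leading bytes that a and b have in common."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n


def front_code(sorted_values, bucket_size=4):
    """Encode sorted strings as buckets of (prefix_length, suffix) pairs.

    Assumption for illustration: each value is encoded against the
    previous value in the same bucket.
    """
    buckets = []
    for start in range(0, len(sorted_values), bucket_size):
        bucket = []
        previous = b""
        for value in sorted_values[start:start + bucket_size]:
            raw = value.encode("utf-8")
            # The first entry of a bucket has no predecessor, so it is stored whole.
            shared = shared_prefix_len(previous, raw) if bucket else 0
            bucket.append((shared, raw[shared:]))
            previous = raw
        buckets.append(bucket)
    return buckets


def front_decode(buckets):
    """Rebuild the original strings from the bucketed (prefix_length, suffix) pairs."""
    values = []
    for bucket in buckets:
        previous = b""
        for shared, suffix in bucket:
            raw = previous[:shared] + suffix
            values.append(raw.decode("utf-8"))
            previous = raw
    return values


values = ["druid", "druidic", "drum", "dry", "east"]
encoded = front_code(values, bucket_size=4)
assert front_decode(encoded) == values
```

In this sketch a larger bucket shares prefixes across more values, so it stores fewer full strings, but reading the last value of a bucket means walking through more entries first. That is the general space-versus-lookup trade-off a bucket size controls in this kind of scheme.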



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

