Jackie-Jiang commented on code in PR #11604:
URL: https://github.com/apache/pinot/pull/11604#discussion_r1337618005
##########
pinot-segment-local/src/main/java/org/apache/pinot/segment/local/realtime/impl/json/MutableJsonIndexImpl.java:
##########
@@ -106,8 +106,11 @@ private void addFlattenedRecords(List<Map<String, String>> records) {
// Put both key and key-value into the posting list. Key is useful for checking if a key exists in the json.
String key = entry.getKey();
_postingListMap.computeIfAbsent(key, k -> new RoaringBitmap()).add(_nextFlattenedDocId);
- String keyValue = key + JsonIndexCreator.KEY_VALUE_SEPARATOR + entry.getValue();
- _postingListMap.computeIfAbsent(keyValue, k -> new RoaringBitmap()).add(_nextFlattenedDocId);
+ int length = _jsonIndexConfig.getMaxValueLength();
Review Comment:
Yes. In order to look up a value in the JSON index, you need to provide the
exact value. The JSON index also supports key-only lookup. More details can be found
[here](https://docs.pinot.apache.org/basics/indexing/json-index#how-to-use-the-json-index).
Based on this, I suggest replacing any very long value with a placeholder
such as `"__SKIPPED__"` to save more space. This should also give better
compression, because the resulting key-value pair can repeat across records
and repeated entries compress well. The original value can still be retrieved
with `JSON_EXTRACT_SCALAR` by looking up the key.
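
To make the suggestion concrete, here is a minimal standalone sketch, not the actual change for `MutableJsonIndexImpl`. It assumes a `maxValueLength` threshold like the one returned by `getMaxValueLength()` in the diff, uses `java.util.BitSet` as a stand-in for `RoaringBitmap` so it runs without extra dependencies, and the class name, the `'\0'` separator, and the `SKIPPED_PLACEHOLDER` constant are hypothetical names chosen for illustration.

```java
import java.util.BitSet;
import java.util.HashMap;
import java.util.Map;

// Hypothetical, simplified model of the posting-list map in a JSON index.
class PlaceholderPostingSketch {
  // Assumed separator; the real code uses JsonIndexCreator.KEY_VALUE_SEPARATOR.
  static final char KEY_VALUE_SEPARATOR = '\0';
  // Placeholder name taken from the review comment; the constant itself is hypothetical.
  static final String SKIPPED_PLACEHOLDER = "__SKIPPED__";

  // BitSet stands in for RoaringBitmap to keep the sketch dependency-free.
  private final Map<String, BitSet> postingListMap = new HashMap<>();
  private final int maxValueLength;

  PlaceholderPostingSketch(int maxValueLength) {
    this.maxValueLength = maxValueLength;
  }

  void add(int flattenedDocId, Map<String, String> record) {
    for (Map.Entry<String, String> entry : record.entrySet()) {
      String key = entry.getKey();
      // Key-only posting list: still supports "does this key exist" lookups.
      postingListMap.computeIfAbsent(key, k -> new BitSet()).set(flattenedDocId);

      // Replace over-long values with the placeholder instead of indexing them verbatim.
      String value = entry.getValue();
      if (value.length() > maxValueLength) {
        value = SKIPPED_PLACEHOLDER;
      }
      String keyValue = key + KEY_VALUE_SEPARATOR + value;
      postingListMap.computeIfAbsent(keyValue, k -> new BitSet()).set(flattenedDocId);
    }
  }

  // Exact key-value lookup; returns an empty BitSet when nothing matches.
  BitSet lookup(String key, String value) {
    return postingListMap.getOrDefault(key + KEY_VALUE_SEPARATOR + value, new BitSet());
  }

  // Key-only lookup.
  BitSet lookupKey(String key) {
    return postingListMap.getOrDefault(key, new BitSet());
  }

  // Number of distinct posting lists (dictionary entries).
  int size() {
    return postingListMap.size();
  }
}
```

In this sketch, all over-long values for the same key share a single `key` + separator + `__SKIPPED__` entry, so the dictionary holds one entry for them instead of one per distinct long value. Exact-match lookup on a skipped value finds nothing, while key-only lookup still finds the document.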
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]