gortiz commented on PR #12223:
URL: https://github.com/apache/pinot/pull/12223#issuecomment-1882772321
This is a very nice feature! Nice contribution @vvivekiyer!
I have some comments to add:
## The concept itself
I understand that the issue was detected in on heap dictionaries, but I
think we can use this approach also in offheap dictionaries as well.
Specifically, interning could be applied when the value is read from the
dictionary.
## The config
I find this config too repetitive.
Instead of:
```json
{
"indexes": {
"dictionary": {
"onHeap": true,
"useVarLengthDictionary": true,
"onHeapConfig": {
"enableInterning":true,
"internerCapacity":32000000
}
}
}
}
```
I would suggest something like:
```json
{
"indexes": {
"dictionary": {
"onHeap": true,
"useVarLengthDictionary": true,
"intern": {
"capacity":32000000
}
}
}
}
```
With an optional implicit field `intern.disabled`.
I would use this approach even if we do not support interning in offheap
dictionaries (which is something we may decide to change in future). This
approach is simpler and easier to read.
## Support in older syntax
As said in the comments, I recommend *against* adding new features in the
old syntax (in this case, in indexingConfig).
For compatibility reasons `index-spi` had to support all features that were
supported in the old syntax, but we don't have to add support for new features
in that syntax.
Users can migrate to the new syntax in case they want to use new features.
The translation can even be done automatically. Is not like we want to force
people to use the new syntax, but the index config logic is already too complex
and 2 ways to configure each new feature will make it even more complex.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]