[ https://issues.apache.org/jira/browse/SOLR-11023?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16097303#comment-16097303 ]
Adrien Grand commented on SOLR-11023:
-------------------------------------
bq. I'm going to start working on this, but I'm still unclear if "points" is
the best way to go for the "very low cardinality + all values are small
positive ints" situation.
I think points are not a good fit in that case: they will use more disk and be
slower at exact queries, which are probably the common case on an enum field.
Even if the user wants to run range queries, the low cardinality of the field
should make the inverted index more efficient than points. I'd really store it
like a string field and just add more logic in the field type to restrict which
values may be used.
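
To make the suggestion concrete, here is a minimal sketch of the idea in plain Java. It is a toy model under stated assumptions, not Solr's EnumField and not Lucene's postings format: the class name EnumTermIndexSketch and its methods are hypothetical, and the enum names are assumed to map to 1, 2, 3, ... in configuration order. It shows the two points above: values are restricted at index time, and with only a handful of distinct terms a range query is just the union of a few postings lists.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.TreeSet;

/** Toy model only (hypothetical names): an enum field backed by a tiny inverted index. */
final class EnumTermIndexSketch {
    // Allowed enum names mapped to small positive ints (assumed: 1, 2, 3, ... in config order).
    private final Map<String, Integer> allowed = new LinkedHashMap<>();
    // One postings list (sorted doc ids) per distinct enum value.
    private final TreeMap<Integer, TreeSet<Integer>> postings = new TreeMap<>();

    EnumTermIndexSketch(List<String> namesInOrder) {
        for (int i = 0; i < namesInOrder.size(); i++) {
            allowed.put(namesInOrder.get(i), i + 1);
        }
    }

    /** Index-time restriction: reject anything outside the configured enum. */
    void add(int docId, String name) {
        postings.computeIfAbsent(valueOf(name), v -> new TreeSet<>()).add(docId);
    }

    /** Exact query: a single term lookup. */
    List<Integer> exact(String name) {
        TreeSet<Integer> docs = postings.get(valueOf(name));
        return docs == null ? new ArrayList<>() : new ArrayList<>(docs);
    }

    /** Range query: union of the postings for the few terms in [lo, hi]. */
    List<Integer> range(String lo, String hi) {
        int from = valueOf(lo);
        int to = valueOf(hi);
        TreeSet<Integer> docs = new TreeSet<>();
        if (from > to) {
            return new ArrayList<>(docs); // empty range
        }
        for (TreeSet<Integer> p : postings.subMap(from, true, to, true).values()) {
            docs.addAll(p);
        }
        return new ArrayList<>(docs);
    }

    // Resolves an enum name to its int value, or fails for unknown names.
    private int valueOf(String name) {
        Integer value = allowed.get(name);
        if (value == null) {
            throw new IllegalArgumentException("Unknown enum value: " + name);
        }
        return value;
    }
}
```

With an enum of "low", "medium", "high", the range "medium" to "high" touches only two postings lists, which is why low cardinality keeps the term-based approach cheap in this model. Real costs in Lucene depend on the codec and the data, so treat this as an illustration of the argument rather than a benchmark.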
> Need SortedNumerics/Points version of EnumField
> -----------------------------------------------
>
> Key: SOLR-11023
> URL: https://issues.apache.org/jira/browse/SOLR-11023
> Project: Solr
> Issue Type: Improvement
> Security Level: Public(Default Security Level. Issues are Public)
> Reporter: Hoss Man
> Assignee: Hoss Man
> Labels: numeric-tries-to-points
> Attachments: SOLR-11023.patch
>
>
> Although it's not a subclass of TrieField, EnumField does use
> "LegacyIntField" to index the int value associated with each of the enum
> values, in addition to using SortedSetDocValuesField when {{docValues="true"
> multiValued="true"}}.
> I have no idea if Points would be better/worse than Terms for low cardinality
> use cases like EnumField, but either way we should think about a new variant
> of EnumField that doesn't depend on
> LegacyIntField/LegacyNumericUtils.intToPrefixCoded and uses
> SortedNumericDocValues.