[jira] [Commented] (SOLR-11023) Need SortedNumerics/Points version of EnumField
[ https://issues.apache.org/jira/browse/SOLR-11023?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16117543#comment-16117543 ]

Hoss Man commented on SOLR-11023:
---------------------------------

Thanks for finishing this, Steve!

> Need SortedNumerics/Points version of EnumField
> -----------------------------------------------
>
> Key: SOLR-11023
> URL: https://issues.apache.org/jira/browse/SOLR-11023
> Project: Solr
> Issue Type: Improvement
> Security Level: Public (Default Security Level. Issues are Public)
> Reporter: Hoss Man
> Assignee: Steve Rowe
> Priority: Blocker
> Labels: numeric-tries-to-points
> Fix For: 7.0, master (8.0), 7.1
> Attachments: SOLR-11023.patch, SOLR-11023.patch, SOLR-11023.patch,
> SOLR-11023.patch, SOLR-11023.patch, SOLR-11023.patch
>
> Although it's not a subclass of TrieField, EnumField does use
> "LegacyIntField" to index the int value associated with each of the enum
> values, in addition to using SortedSetDocValuesField when
> {{docValues="true" multivalued="true"}}.
>
> I have no idea if Points would be better/worse than Terms for low-cardinality
> use cases like EnumField, but either way we should think about a new variant
> of EnumField that doesn't depend on
> LegacyIntField/LegacyNumericUtils.intToPrefixCoded and uses
> SortedNumericDocValues.

--
This message was sent by Atlassian JIRA (v6.4.14#64029)

To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-11023) Need SortedNumerics/Points version of EnumField
[ https://issues.apache.org/jira/browse/SOLR-11023?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16115407#comment-16115407 ]

ASF subversion and git services commented on SOLR-11023:

Commit 5d632c0a0e8769b512a365a98d348dd3d5ef0bbc in lucene-solr's branch refs/heads/branch_7x from [~steve_rowe]
[ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=5d632c0 ]

SOLR-11023: add docValues="true" to an enum field declaration in schema.xml, so that EnumFieldType, which requires docValues, stops causing TestDistributedSearch to fail
[jira] [Commented] (SOLR-11023) Need SortedNumerics/Points version of EnumField
[ https://issues.apache.org/jira/browse/SOLR-11023?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16115406#comment-16115406 ]

ASF subversion and git services commented on SOLR-11023:

Commit c58bbaa6cabe91c3823d2e9c6395379d987fec60 in lucene-solr's branch refs/heads/branch_7_0 from [~steve_rowe]
[ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=c58bbaa ]

SOLR-11023: add docValues="true" to an enum field declaration in schema.xml, so that EnumFieldType, which requires docValues, stops causing TestDistributedSearch to fail
[jira] [Commented] (SOLR-11023) Need SortedNumerics/Points version of EnumField
[ https://issues.apache.org/jira/browse/SOLR-11023?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16115408#comment-16115408 ]

ASF subversion and git services commented on SOLR-11023:

Commit 3f9e748202ab8619af83f093ba4739f5a1e5c57b in lucene-solr's branch refs/heads/master from [~steve_rowe]
[ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=3f9e748 ]

SOLR-11023: add docValues="true" to an enum field declaration in schema.xml, so that EnumFieldType, which requires docValues, stops causing TestDistributedSearch to fail
[jira] [Commented] (SOLR-11023) Need SortedNumerics/Points version of EnumField
[ https://issues.apache.org/jira/browse/SOLR-11023?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16115109#comment-16115109 ]

ASF subversion and git services commented on SOLR-11023:

Commit 9627d1db5dccd6dc9c0c307065628efea621d8e5 in lucene-solr's branch refs/heads/master from [~steve_rowe]
[ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=9627d1d ]

SOLR-11023: Added EnumFieldType, a non-Trie-based version of EnumField, and deprecated EnumField in favor of EnumFieldType.

> Fix For: 7.0
[jira] [Commented] (SOLR-11023) Need SortedNumerics/Points version of EnumField
[ https://issues.apache.org/jira/browse/SOLR-11023?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16115107#comment-16115107 ]

ASF subversion and git services commented on SOLR-11023:

Commit 41f6ae55ba2cbd01848130f2be3db2cea48e34a4 in lucene-solr's branch refs/heads/branch_7_0 from [~steve_rowe]
[ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=41f6ae5 ]

SOLR-11023: Added EnumFieldType, a non-Trie-based version of EnumField, and deprecated EnumField in favor of EnumFieldType.

Conflicts:
  solr/core/src/java/org/apache/solr/schema/EnumField.java
  solr/core/src/test/org/apache/solr/schema/EnumFieldTest.java

> Fix For: 7.0
[jira] [Commented] (SOLR-11023) Need SortedNumerics/Points version of EnumField
[ https://issues.apache.org/jira/browse/SOLR-11023?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16115108#comment-16115108 ]

ASF subversion and git services commented on SOLR-11023:

Commit ca0696492382e4f44d48c399986a489a82281de0 in lucene-solr's branch refs/heads/branch_7x from [~steve_rowe]
[ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=ca06964 ]

SOLR-11023: Added EnumFieldType, a non-Trie-based version of EnumField, and deprecated EnumField in favor of EnumFieldType.

> Fix For: 7.0
[jira] [Commented] (SOLR-11023) Need SortedNumerics/Points version of EnumField
[ https://issues.apache.org/jira/browse/SOLR-11023?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16100647#comment-16100647 ]

Steve Rowe commented on SOLR-11023:

I marked this issue explicitly as a Blocker for 7.0.

> Priority: Blocker
> Attachments: SOLR-11023.patch, SOLR-11023.patch
[jira] [Commented] (SOLR-11023) Need SortedNumerics/Points version of EnumField
[ https://issues.apache.org/jira/browse/SOLR-11023?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16099688#comment-16099688 ]

Adrien Grand commented on SOLR-11023:

bq. So I think we really still want to use ints for the store/index/docvalues and deal with all the String/label conversion only in response writing and query parsing. Obviously we'll still want to use NumericUtils.intToSortableBytes for the indexed bytes so that our range queries are still "numeric" and not "alphabetical"

It sounds good to me.

> Attachments: SOLR-11023.patch, SOLR-11023.patch
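The approach quoted in this message (ints in the index, labels only at the query and response edges, and a byte encoding that sorts numerically) can be illustrated with a small stand-alone sketch. Everything here is an assumption for illustration: the class name, the label table, and the helper methods are hypothetical, and `toSortableBytes` is a hand-written stand-in for the sign-bit-flip, big-endian idea behind `NumericUtils.intToSortableBytes`, not a call into Lucene or a copy of the patch.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.Map;

// Sketch only: stand-alone illustration, not Lucene's or Solr's actual code.
final class EnumSortableBytesSketch {

    // Hypothetical enum configuration: label -> int value.
    static final Map<String, Integer> LABEL_TO_INT = new LinkedHashMap<>();
    static {
        LABEL_TO_INT.put("Low", 1);
        LABEL_TO_INT.put("Medium", 2);
        LABEL_TO_INT.put("High", 10);
    }

    // Flip the sign bit and write big-endian, so that unsigned byte-wise
    // order of the result matches signed int order of the input.
    static byte[] toSortableBytes(int value) {
        int flipped = value ^ 0x80000000;
        return new byte[] {
            (byte) (flipped >>> 24), (byte) (flipped >>> 16),
            (byte) (flipped >>> 8), (byte) flipped
        };
    }

    // Query-parsing side: translate a label to its int, rejecting unknown labels.
    static int labelToInt(String label) {
        Integer value = LABEL_TO_INT.get(label);
        if (value == null) {
            throw new IllegalArgumentException("Unknown enum label: " + label);
        }
        return value;
    }

    // Response-writing side: translate a stored int back to its label.
    static String intToLabel(int value) {
        for (Map.Entry<String, Integer> e : LABEL_TO_INT.entrySet()) {
            if (e.getValue() == value) {
                return e.getKey();
            }
        }
        throw new IllegalArgumentException("Unknown enum value: " + value);
    }

    public static void main(String[] args) {
        byte[] medium = toSortableBytes(labelToInt("Medium")); // int 2
        byte[] high = toSortableBytes(labelToInt("High"));     // int 10
        // Byte-wise comparison of the encoded ints follows numeric order (2 < 10).
        System.out.println(Arrays.compareUnsigned(medium, high) < 0);
        // Comparing the decimal strings instead would put "10" before "2".
        System.out.println("10".compareTo("2") < 0);
        System.out.println(intToLabel(10));
    }
}
```

The point of the encoding is that a range over the indexed bytes selects the same documents as a range over the underlying ints, which a plain string encoding of the numbers would not guarantee.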
[jira] [Commented] (SOLR-11023) Need SortedNumerics/Points version of EnumField
[ https://issues.apache.org/jira/browse/SOLR-11023?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16097303#comment-16097303 ]

Adrien Grand commented on SOLR-11023:

bq. I'm going to start working on this, but I'm still unclear if "points" is the best way to go for the "very low cardinality + all values are small positive ints" situation.

I think points are not a good fit in that case: they will use more disk and be slower at exact queries, even though exact queries are probably common on an enum field. Even if the user wants to run range queries, the low cardinality of the field should make the inverted index more efficient than points. I'd really store it like a string field but just add more logic in the field type to restrict what values may be used?

> Assignee: Hoss Man
> Attachments: SOLR-11023.patch
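The suggestion in this message, treating the field like a string field and restricting which values are accepted, could look roughly like the following. This is a minimal sketch under stated assumptions: the class and method names are hypothetical and nothing here is Solr's FieldType API or part of any attached patch.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedHashSet;
import java.util.Set;

// Sketch only: a hypothetical validator, not a Solr FieldType implementation.
final class RestrictedStringFieldSketch {

    // The closed set of labels this hypothetical field accepts.
    private final Set<String> allowed;

    RestrictedStringFieldSketch(String... labels) {
        this.allowed = Collections.unmodifiableSet(
            new LinkedHashSet<>(Arrays.asList(labels)));
    }

    // Returns the term that would be indexed as-is, or rejects the value
    // if it is not one of the configured labels.
    String toIndexedTerm(String raw) {
        if (!allowed.contains(raw)) {
            throw new IllegalArgumentException(
                "Value '" + raw + "' is not one of " + allowed);
        }
        return raw;
    }

    public static void main(String[] args) {
        RestrictedStringFieldSketch field =
            new RestrictedStringFieldSketch("Low", "Medium", "High");
        System.out.println(field.toIndexedTerm("Medium"));
        try {
            field.toIndexedTerm("Urgent");
        } catch (IllegalArgumentException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

One trade-off of indexing the labels themselves is that term order becomes alphabetical rather than following the enum's int values, which matters for range queries and sorting.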