I always prefer ints to strings: they can't help but take
up less memory, and comparing two ints is much faster than
comparing two strings. Lucene can play some tricks to make
that difference less noticeable, though.

That said, if there are only a few distinct values, it'll be
hard to actually measure the perf difference.

And if there are a _lot_ of unique values, you have bigger problems
than the int/string distinction. Faceting on very high
cardinality fields can itself be a performance problem.

But I'd certainly add docValues="true" to the field definition no
matter which type you decide on.
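
As a rough sketch, assuming you go with option #2 from your message, the
definition would look something like the following. The only change is the
added docValues attribute; everything else is copied from your example.

```xml
<!-- Sketch only: option #2 from the original question, plus docValues. -->
<field name="FACET_ID" type="int" multiValued="true" indexed="true"
       required="true" stored="false" docValues="true"/>
```

Keep in mind that changing docValues on an existing field means
re-indexing your documents.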

Best,
Erick

On Wed, May 25, 2016 at 9:29 AM, Steven White <swhite4...@gmail.com> wrote:
> Hi everyone,
>
> I will be faceting on data of type integer and I'm wondering if there is any
> difference in how I design my schema.  I have no need to sort or use range
> facets. Given this, in terms of Lucene performance and index size, does it
> make any difference if I use:
>
> #1: <field name="FACET_ID" type="string" multiValued="true" indexed="true"
> required="true" stored="false"/>
>
> Or
>
> #2: <field name="FACET_ID" type="int" multiValued="true" indexed="true"
> required="true" stored="false"/>
>
> (notice how I changed the "type" from "string" to "int" in #2)
>
> Thanks in advance.
>
> Steve
