Re: Faceting on indexed=false stored=false docValues=true fields
Sorry, correction: taking "the" time.

On Mon, 19 Oct 2020 22:18:30 +0300 uyilmaz wrote:
> Thanks for taking time to write a detailed answer.
>
> We use Solr to both store our data and to perform aggregations, using
> faceting or streaming expressions. When required analysis is too complex to
> do in Solr, we export large query results from Solr to a more capable
> analysis tool.
>
> So I guess all fields need to be docValues="true", because export handler and
> streaming both require fields to have docValues, and even if I won't use a
> field in queries or facets, it should be available to read in result set.
> Fields that won't be searched or faceted can be (indexed=false stored=false
> docValues=true) right?
>
> --uyilmaz

--
uyilmaz
Re: Faceting on indexed=false stored=false docValues=true fields
Thanks for taking time to write a detailed answer.

We use Solr both to store our data and to perform aggregations, using faceting
or streaming expressions. When the required analysis is too complex to do in
Solr, we export large query results from Solr to a more capable analysis tool.

So I guess all fields need to be docValues="true", because the export handler
and streaming both require fields to have docValues, and even if I won't use a
field in queries or facets, it should be available to read in the result set.
Fields that won't be searched or faceted can be (indexed=false stored=false
docValues=true), right?

--uyilmaz

On Mon, 19 Oct 2020 14:14:27 -0400 Michael Gibney wrote:
> As you've observed, it is indeed possible to facet on fields with
> docValues=true, indexed=false; but in almost all cases you should
> probably set indexed=true. 1. for distributed facet count refinement,
> the "indexed" approach is used to look up counts by value; 2. assuming
> you're wanting to do something usual, e.g. allow users to apply
> filters based on facet counts, the filter application would use the
> "indexed" approach as well. Where indexed=false, if either filtering
> or distributed refinement is attempted, I'm not 100% sure what
> happens. It might fail, or lead to inconsistent results, or attempt to
> look up results via the equivalent of a "table scan" over docValues (I
> think the last of these is what actually happens, fwiw) ... but none
> of these options is likely desirable.
>
> Michael

--
uyilmaz
Re: Faceting on indexed=false stored=false docValues=true fields
Hmm. Fields used for faceting will also be used for filtering, which is a kind
of search. Are docValues OK for filtering? I expect they might be slow the
first time, then cached.

wunder
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/ (my blog)

> On Oct 19, 2020, at 11:15 AM, Erick Erickson wrote:
>
> uyilmaz:
>
> Hmm, that _is_ confusing. And inaccurate.
>
> In this context, it should read something like
>
> The Text field should have indexed="true" docValues="false" if used for
> searching but not faceting and the String field should have indexed="false"
> docValues="true" if used for faceting but not searching.
>
> I’ll fix this, thanks for pointing this out.
>
> Erick
Re: Faceting on indexed=false stored=false docValues=true fields
uyilmaz:

Hmm, that _is_ confusing. And inaccurate.

In this context, it should read something like

The Text field should have indexed="true" docValues="false" if used for
searching but not faceting and the String field should have indexed="false"
docValues="true" if used for faceting but not searching.

I’ll fix this, thanks for pointing this out.

Erick

> On Oct 19, 2020, at 1:42 PM, uyilmaz wrote:
>
> Thanks! This also contributed to my confusion:
>
> https://lucene.apache.org/solr/guide/8_4/faceting.html#field-value-faceting-parameters
>
> "If you want Solr to perform both analysis (for searching) and faceting on
> the full literal strings, use the copyField directive in your Schema to
> create two versions of the field: one Text and one String. Make sure both are
> indexed="true"."
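Erick's corrected wording can be pictured as a schema fragment. This is only a minimal sketch and is not taken from the thread: the field names title_txt and title_str are hypothetical, and the text_general and string field types are assumed to be defined as in Solr's default configset.

```xml
<!-- Hypothetical field names; fieldType names assumed from Solr's default configset. -->

<!-- Analyzed copy: used for searching, not faceting. -->
<field name="title_txt" type="text_general" indexed="true" docValues="false"/>

<!-- Literal copy: used for faceting, not searching. -->
<field name="title_str" type="string" indexed="false" stored="false" docValues="true"/>

<!-- Populate the String field from the Text field at index time. -->
<copyField source="title_txt" dest="title_str"/>
```

If users will also filter on the facet values, Michael's reply in this thread suggests that setting indexed="true" on the String field as well is probably the safer choice.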
Re: Faceting on indexed=false stored=false docValues=true fields
As you've observed, it is indeed possible to facet on fields with
docValues=true, indexed=false; but in almost all cases you should
probably set indexed=true. 1. for distributed facet count refinement,
the "indexed" approach is used to look up counts by value; 2. assuming
you're wanting to do something usual, e.g. allow users to apply
filters based on facet counts, the filter application would use the
"indexed" approach as well. Where indexed=false, if either filtering
or distributed refinement is attempted, I'm not 100% sure what
happens. It might fail, or lead to inconsistent results, or attempt to
look up results via the equivalent of a "table scan" over docValues (I
think the last of these is what actually happens, fwiw) ... but none
of these options is likely desirable.

Michael

On Mon, Oct 19, 2020 at 1:42 PM uyilmaz wrote:
> Thanks! This also contributed to my confusion:
>
> https://lucene.apache.org/solr/guide/8_4/faceting.html#field-value-faceting-parameters
>
> "If you want Solr to perform both analysis (for searching) and faceting on
> the full literal strings, use the copyField directive in your Schema to
> create two versions of the field: one Text and one String. Make sure both are
> indexed="true"."
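Michael's recommendation amounts to keeping both structures on a field that is faceted on. A minimal sketch, assuming a hypothetical field named category and the string field type from Solr's default configset:

```xml
<!-- Hypothetical field name. Per the reply above, indexed="true" is what
     filtering and distributed facet refinement rely on, while
     docValues="true" serves the facet counting itself. -->
<field name="category" type="string" indexed="true" stored="false" docValues="true"/>
```

With a definition like this, a filter such as fq=category:books (value hypothetical) would, according to the reply above, use the "indexed" approach rather than something like a scan over docValues.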
Re: Faceting on indexed=false stored=false docValues=true fields
Thanks! This also contributed to my confusion:

https://lucene.apache.org/solr/guide/8_4/faceting.html#field-value-faceting-parameters

"If you want Solr to perform both analysis (for searching) and faceting on
the full literal strings, use the copyField directive in your Schema to
create two versions of the field: one Text and one String. Make sure both are
indexed="true"."

On Mon, 19 Oct 2020 13:08:00 -0400 Alexandre Rafalovitch wrote:
> I think this is all explained quite well in the Ref Guide:
> https://lucene.apache.org/solr/guide/8_6/docvalues.html
>
> DocValues is a different way to index/store values. Faceting is a
> primary use case where docValues are better than what 'indexed=true'
> gives you.
>
> Regards,
> Alex.

--
uyilmaz
Re: Faceting on indexed=false stored=false docValues=true fields
I think this is all explained quite well in the Ref Guide:
https://lucene.apache.org/solr/guide/8_6/docvalues.html

DocValues is a different way to index/store values. Faceting is a primary use
case where docValues are better than what 'indexed=true' gives you.

Regards,
Alex.

On Mon, 19 Oct 2020 at 12:51, uyilmaz wrote:
> Hey all,
>
> From my little experiments, I see that (if I didn't make a stupid mistake) we
> can facet on fields marked as both indexed and stored being false:
>
> stored="false" docValues="true"/>
>
> I'm surprised by this, I thought I would need to index it. Can you confirm
> this?
>
> Regards
>
> --
> uyilmaz
Faceting on indexed=false stored=false docValues=true fields
Hey all,

From my little experiments, I see that (if I didn't make a stupid mistake) we
can facet on fields marked as both indexed and stored being false:

I'm surprised by this, I thought I would need to index it. Can you confirm this?

Regards

--
uyilmaz