Re: Faceting on indexed=false stored=false docValues=true fields

2020-10-19 Thread uyilmaz
Sorry, correction: taking "the" time.

On Mon, 19 Oct 2020 22:18:30 +0300
uyilmaz wrote:

> Thanks for taking time to write a detailed answer.


-- 
uyilmaz 


Re: Faceting on indexed=false stored=false docValues=true fields

2020-10-19 Thread uyilmaz
Thanks for taking time to write a detailed answer.

We use Solr both to store our data and to perform aggregations, using faceting
or streaming expressions. When the required analysis is too complex to do in
Solr, we export large query results from Solr to a more capable analysis tool.

So I guess all fields need to be docValues="true", because the export handler
and streaming both require fields to have docValues, and even if I won't use a
field in queries or facets, it should still be available to read in the result
set. Fields that won't be searched or faceted can be (indexed=false
stored=false docValues=true), right?
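
For illustration, here is a minimal Python sketch of how a request to the export handler mentioned above might be built. The host, the collection name `events`, and the field names `id` and `country_s` are hypothetical; the sketch only assembles the URL and assumes every field listed in `fl` and `sort` is defined with docValues="true".

```python
from urllib.parse import urlencode


def build_export_url(base_url, collection, query, fields, sort):
    """Assemble a URL for Solr's /export handler.

    The handler expects both a field list (fl) and a sort, so this helper
    refuses to build a URL without them.
    """
    if not fields:
        raise ValueError("the export handler needs an explicit field list (fl)")
    if not sort:
        raise ValueError("the export handler needs a sort parameter")
    params = {"q": query, "fl": ",".join(fields), "sort": sort}
    return f"{base_url}/{collection}/export?{urlencode(params)}"


export_url = build_export_url(
    "http://localhost:8983/solr",  # assumed local Solr instance
    "events",                      # hypothetical collection name
    "*:*",
    ["id", "country_s"],           # hypothetical docValues="true" fields
    "id asc",
)
print(export_url)
```

Nothing is sent over the network here; the resulting URL can be passed to whatever HTTP client is already in use.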

--uyilmaz




-- 
uyilmaz 


Re: Faceting on indexed=false stored=false docValues=true fields

2020-10-19 Thread Walter Underwood
Hmm. Fields used for faceting will also be used for filtering, which is a kind
of search. Are docValues OK for filtering? I expect they might be slow the
first time, then cached.

wunder
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/  (my blog)




Re: Faceting on indexed=false stored=false docValues=true fields

2020-10-19 Thread Erick Erickson
uyilmaz:

Hmm, that _is_ confusing. And inaccurate.

In this context, it should read something like:

The Text field should have indexed="true" docValues="false" if used for
searching but not faceting, and the String field should have indexed="false"
docValues="true" if used for faceting but not searching.
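
As a rough sketch of that wording, the Python below builds a Schema API payload with one Text field for searching, one String field for faceting, and a copyField between them. The field names `title_txt` and `title_str` are hypothetical, and the type names `text_general` and `string` are assumed to exist in the schema (they do in Solr's default configset). The payload is only printed; posting it to the collection's `/schema` endpoint is left out.

```python
import json


def search_and_facet_fields(text_name, string_name):
    """Build Schema API commands for a searchable Text field plus a
    facet-only String copy of it."""
    return {
        "add-field": [
            # Analyzed field: used for searching, not for faceting.
            {"name": text_name, "type": "text_general",
             "indexed": True, "docValues": False},
            # Literal field: used for faceting, not for searching.
            {"name": string_name, "type": "string",
             "indexed": False, "docValues": True},
        ],
        # Copy the raw input of the Text field into the String field.
        "add-copy-field": {"source": text_name, "dest": [string_name]},
    }


payload = search_and_facet_fields("title_txt", "title_str")  # hypothetical names
print(json.dumps(payload, indent=2))
```

If the String field will also be used for filtering or for distributed facet refinement, the advice elsewhere in this thread suggests setting `indexed` to true on it as well.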

I’ll fix this, thanks for pointing this out.

Erick




Re: Faceting on indexed=false stored=false docValues=true fields

2020-10-19 Thread Michael Gibney
As you've observed, it is indeed possible to facet on fields with
docValues=true, indexed=false; but in almost all cases you should
probably set indexed=true, for two reasons:

1. For distributed facet count refinement, the "indexed" approach is
   used to look up counts by value.
2. Assuming you want to do something usual, e.g. allow users to apply
   filters based on facet counts, the filter application would use the
   "indexed" approach as well.

Where indexed=false, if either filtering or distributed refinement is
attempted, I'm not 100% sure what happens. It might fail, or lead to
inconsistent results, or attempt to look up results via the equivalent
of a "table scan" over docValues (I think the last of these is what
actually happens, fwiw) ... but none of these options is likely
desirable.
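
To make the refinement point concrete, here is a toy Python model, not Solr's actual implementation. Each "shard" holds a column of values (docValues-like) and an inverted map from value to document ids (index-like). The shard data and all function names are made up for illustration. Refinement asks a shard for the count of one specific value, which is a direct lookup with the inverted map and a full scan without it.

```python
from collections import Counter, defaultdict


def build_shard(values):
    """Return (doc_values, postings) for a list of per-document values."""
    doc_values = list(values)             # position in the list = doc id
    postings = defaultdict(list)          # value -> list of doc ids
    for doc_id, value in enumerate(doc_values):
        postings[value].append(doc_id)
    return doc_values, dict(postings)


def count_by_lookup(postings, value):
    """'Indexed' approach: jump straight to the postings for one value."""
    return len(postings.get(value, []))


def count_by_scan(doc_values, value):
    """docValues-only approach: visit every document (a 'table scan')."""
    return sum(1 for v in doc_values if v == value)


def refined_counts(shards, n):
    """Merge per-shard top-n facets, then fetch the counts shards left out."""
    first_pass = [dict(Counter(dv).most_common(n)) for dv, _ in shards]
    candidates = set().union(*first_pass)
    totals = Counter()
    for (_, postings), reported in zip(shards, first_pass):
        for value in candidates:
            if value in reported:
                totals[value] += reported[value]
            else:
                # Refinement: this shard did not report the value.
                totals[value] += count_by_lookup(postings, value)
    return dict(totals)


shard_a = build_shard(["tr", "tr", "tr", "us", "us"])
shard_b = build_shard(["de", "de", "de", "tr", "tr", "us"])
totals = refined_counts([shard_a, shard_b], n=1)
print(totals)
```

Before refinement the two shards report only `tr` (3) and `de` (3); the per-value lookups are what separate the two totals.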

Michael



Re: Faceting on indexed=false stored=false docValues=true fields

2020-10-19 Thread uyilmaz
Thanks! This also contributed to my confusion:

https://lucene.apache.org/solr/guide/8_4/faceting.html#field-value-faceting-parameters

"If you want Solr to perform both analysis (for searching) and faceting on the 
full literal strings, use the copyField directive in your Schema to create two 
versions of the field: one Text and one String. Make sure both are 
indexed="true"."



-- 
uyilmaz 


Re: Faceting on indexed=false stored=false docValues=true fields

2020-10-19 Thread Alexandre Rafalovitch
I think this is all explained quite well in the Ref Guide:
https://lucene.apache.org/solr/guide/8_6/docvalues.html

DocValues is a different way to index/store values. Faceting is a
primary use case where docValues are better than what 'indexed=true'
gives you.
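
A toy Python comparison of the two layouts may help here. It is a simplified model, not how Lucene stores anything on disk, and the sample data is invented. With a docValues-like column, facet counts come from one pass over the matching documents; with only an inverted map, every term's postings must be intersected with the result set.

```python
from collections import Counter

# Invented sample data: list position = doc id, entry = field value.
colors = ["red", "blue", "red", "green", "blue", "red"]

# docValues-like layout: doc id -> value.
doc_values = colors

# Inverted layout: term -> set of doc ids containing it.
inverted = {}
for doc_id, term in enumerate(colors):
    inverted.setdefault(term, set()).add(doc_id)


def facet_via_doc_values(doc_values, result_set):
    """Read the value of each matching document once and tally it."""
    return Counter(doc_values[doc_id] for doc_id in result_set)


def facet_via_inverted(inverted, result_set):
    """Intersect each term's postings with the result set."""
    counts = Counter()
    for term, postings in inverted.items():
        hits = len(postings & result_set)
        if hits:
            counts[term] = hits
    return counts


result_set = {0, 1, 2, 5}  # doc ids matched by some query
by_column = facet_via_doc_values(doc_values, result_set)
by_terms = facet_via_inverted(inverted, result_set)
print(by_column, by_terms)
```

Both functions return the same counts; the difference is that the first does work proportional to the number of matching documents, while the second touches every distinct term in the field.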

Regards,
   Alex.



Faceting on indexed=false stored=false docValues=true fields

2020-10-19 Thread uyilmaz


Hey all,

From my little experiments, I see that (if I didn't make a stupid mistake) we
can facet on fields marked as both indexed and stored being false:

... indexed="false" stored="false" docValues="true"/>

I'm surprised by this; I thought I would need to index it. Can you confirm this?
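
For anyone wanting to repeat the experiment, a field-facet request of the kind described could be assembled as in the Python sketch below. The host, the collection name `events`, and the field name `country_s` are hypothetical; the sketch only builds the URL.

```python
from urllib.parse import urlencode


def build_facet_url(base_url, collection, facet_field, query="*:*"):
    """Assemble a /select URL that returns facet counts and no documents."""
    params = [
        ("q", query),
        ("rows", 0),                 # only the facet counts are of interest
        ("facet", "true"),
        ("facet.field", facet_field),
    ]
    return f"{base_url}/{collection}/select?{urlencode(params)}"


facet_url = build_facet_url(
    "http://localhost:8983/solr",  # assumed local Solr instance
    "events",                      # hypothetical collection name
    "country_s",                   # hypothetical docValues-only field
)
print(facet_url)
```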

Regards

-- 
uyilmaz