Re: Facet performance problem

2018-02-20 Thread Shawn Heisey

On 2/20/2018 1:18 AM, LOPEZ-CORTES Mariano-ext wrote:

We return a facet list of values in "motifPresence" field (person status).
Status:
[ ] status1
[x] status2
[x] status3

The user then selects one or more statuses (this is the step we called "facet
filtering").

Query is then re-executed with fq=motifPresence:(status2 OR status3)

We use fq in order to not alter the score in main query.

We've read that facet fields should have docValues=true.

Do we also need indexed=true?


Facets, grouping, and sorting are more efficient with docValues, but 
searches aren't helped by docValues.  Without indexed="true", searches 
on the field will be VERY slow.  A filter query is still a search.  The 
"filter" in filter query just refers to the fact that it's separate from 
the main query, and that it does not affect relevancy scoring.
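For illustration, here is a rough sketch of a field definition that carries both properties, written as a Solr Schema API "replace-field" command built in Python. It assumes a managed schema, and the field type "string" is a guess because the thread never shows the type. The helper name replace_field_command is made up for this example. The resulting JSON would be POSTed to the collection's /schema endpoint, and a full reindex is still needed afterwards.

```python
import json

def replace_field_command(name, field_type):
    """Build a Schema API 'replace-field' command enabling both lookups."""
    return {
        "replace-field": {
            "name": name,
            "type": field_type,   # assumed; the thread does not show the field type
            "indexed": True,      # inverted index, used by q and fq searches
            "docValues": True,    # column storage, used by facets, grouping, sorting
            "stored": True,       # as in the posted definition
            "required": False,    # as in the posted definition
        }
    }

# "string" is a placeholder type chosen for this sketch only.
payload = json.dumps(replace_field_command("motifPresence", "string"), indent=2)
print(payload)
```

With a classic schema.xml the same two attributes would be set directly on the field element instead.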


Thanks,
Shawn



RE: Facet performance problem

2018-02-20 Thread LOPEZ-CORTES Mariano-ext
Our query looks like this:

...facet=true&facet.field=motifPresence

We return a facet list of values in "motifPresence" field (person status).
Status:
[ ] status1
[x] status2
[x] status3

The user then selects one or more statuses (this is the step we called
"facet filtering").

Query is then re-executed with fq=motifPresence:(status2 OR status3)

We use fq in order to not alter the score in main query.

We've read that facet fields should have docValues=true.

Do we also need indexed=true?
Is there any other problem in our solution?
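The two requests described above could be sketched roughly as follows. This only builds the request parameters; the main query "*:*" is a placeholder (the real one is elided in this message), and build_select_params is a hypothetical helper, not part of any Solr client.

```python
from urllib.parse import urlencode

def build_select_params(main_query, facet_field, selected_values):
    """Build Solr /select parameters for a faceted search with an optional filter."""
    params = [
        ("q", main_query),
        ("facet", "true"),
        ("facet.field", facet_field),
    ]
    if selected_values:
        # fq restricts the result set without contributing to relevancy scoring.
        params.append(("fq", f"{facet_field}:({' OR '.join(selected_values)})"))
    return params

# First request: no status selected yet, so no fq is sent.
first = build_select_params("*:*", "motifPresence", [])

# Second request: the user ticked status2 and status3.
params = build_select_params("*:*", "motifPresence", ["status2", "status3"])
print(urlencode(params))
```

Values containing spaces or query syntax characters would need quoting or escaping before being joined into the fq clause; this sketch assumes simple tokens like the ones shown.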

-----Original Message-----
From: Erick Erickson [mailto:erickerick...@gmail.com]
Sent: Monday, February 19, 2018 18:18
To: solr-user
Subject: Re: Facet performance problem

I'm confused here. What do you mean by "facet filtering"? Your examples have no 
facets at all, just a _filter query_.

I'll assume you want to use filter query (fq), and faceting has nothing to do 
with it. This is one of the tricky bits of docValues.
While it's _possible_ to search on a field that's defined as above, it's very 
inefficient: you specified 'indexed="false"', so there's no "inverted index" 
for the field. The docValues are searched instead, and it's essentially a 
table scan.

If you mean to search against this field, set indexed="true". You'll have to 
completely reindex your corpus of course.

If you intend to facet, group or sort on this field, you should _also_ have 
docValues="true".

Best,
Erick

On Mon, Feb 19, 2018 at 7:47 AM, MOUSSA MZE Oussama-ext 
<oussama.moussa-mze-...@pole-emploi.fr> wrote:
> Hi
>
> We have the following environment:
>
> 3-node cluster
> 1 shard
> Replication factor = 2
> 8GB per node
>
> 29 million documents
>
> We are faceting on the field "motifPresence", defined as follows:
>
> <field name="motifPresence" ... indexed="false" stored="true" required="false"/>
>
> Once the user selects a motifPresence filter, we execute the search again with:
>
> fq: (value1 OR value2 OR value3 OR ...)
>
> The problem is: with facet filtering the query is too slow, and its response
> time is greater than that of the main search (without facet filtering).
>
> Thanks in advance!


Re: Facet performance problem

2018-02-19 Thread Erick Erickson
I'm confused here. What do you mean by "facet filtering"? Your
examples have no facets at all, just a _filter query_.

I'll assume you want to use filter query (fq), and faceting has
nothing to do with it. This is one of the tricky bits of docValues.
While it's _possible_ to search on a field that's defined as above,
it's very inefficient: you specified 'indexed="false"', so there's no
"inverted index" for the field. The docValues are searched instead, and
it's essentially a table scan.

If you mean to search against this field, set indexed="true". You'll
have to completely reindex your corpus of course.

If you intend to facet, group or sort on this field, you should _also_
have docValues="true".
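To make the "table scan" point concrete, here is a toy model in Python. It is only an analogy with made-up data, not how Lucene actually stores anything: a per-document column stands in for docValues, and a term-to-documents map stands in for the inverted index.

```python
from collections import defaultdict

# Hypothetical toy corpus: document id -> value of the status field.
docs = {0: "status1", 1: "status2", 2: "status3", 3: "status2", 4: "status1"}

# Column-style storage (roughly what docValues provide): one value per document.
doc_values = [docs[i] for i in sorted(docs)]

# Inverted index (roughly what indexed="true" provides): term -> matching doc ids.
inverted = defaultdict(list)
for doc_id, value in sorted(docs.items()):
    inverted[value].append(doc_id)

def filter_by_scan(wanted):
    """Check every document's value: cost grows with the corpus size."""
    return [doc_id for doc_id, value in enumerate(doc_values) if value in wanted]

def filter_by_index(wanted):
    """Look up each wanted term directly: cost grows with the number of matches."""
    return sorted(doc_id for term in wanted for doc_id in inverted.get(term, []))

wanted = {"status2", "status3"}
assert filter_by_scan(wanted) == filter_by_index(wanted) == [1, 2, 3]
```

Both functions return the same documents. The difference is that the scan has to touch all 29 million entries in the real corpus, while the index lookup only touches the postings for the requested terms.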

Best,
Erick

On Mon, Feb 19, 2018 at 7:47 AM, MOUSSA MZE Oussama-ext wrote:
> Hi
>
> We have the following environment:
>
> 3-node cluster
> 1 shard
> Replication factor = 2
> 8GB per node
>
> 29 million documents
>
> We are faceting on the field "motifPresence", defined as follows:
>
> <field name="motifPresence" ... indexed="false" stored="true" required="false"/>
>
> Once the user selects a motifPresence filter, we execute the search again with:
>
> fq: (value1 OR value2 OR value3 OR ...)
>
> The problem is: with facet filtering the query is too slow, and its response
> time is greater than that of the main search (without facet filtering).
>
> Thanks in advance!