Re: facets & docValues
You can be pretty sure that adding static warming queries will improve your performance following soft commits. But opening new searchers every 2 seconds may be too fast to allow for warming, so you may need to adjust. As a general rule, you cannot open searchers faster than you can warm them.

Joel Bernstein
http://joelsolr.blogspot.com/

On Tue, May 5, 2020 at 5:54 PM Revas wrote:

> Hi Joel,
>
> No, we have not. We have a softCommit requirement of 2 seconds.
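To make the rule of thumb above concrete, here is a minimal sketch that compares a measured searcher warm-up time with the soft commit interval. The `headroom` factor and both function names are my own assumptions for illustration, not anything Solr defines; the warm-up figure would have to come from your own measurements.

```python
def searcher_can_keep_up(warmup_ms: float, commit_interval_ms: float,
                         headroom: float = 0.5) -> bool:
    """Return True if warming fits inside a fraction of the commit interval.

    `headroom` is an assumed safety factor: 0.5 means warming may use at
    most half of the time between two soft commits.
    """
    if warmup_ms < 0 or commit_interval_ms <= 0 or not 0 < headroom <= 1:
        raise ValueError("times must be positive and headroom in (0, 1]")
    return warmup_ms <= commit_interval_ms * headroom


def min_commit_interval_ms(warmup_ms: float, headroom: float = 0.5) -> float:
    """Smallest commit interval that still leaves the assumed headroom."""
    if warmup_ms < 0 or not 0 < headroom <= 1:
        raise ValueError("warmup must be >= 0 and headroom in (0, 1]")
    return warmup_ms / headroom


if __name__ == "__main__":
    # With a 2 second soft commit interval, 1.5 seconds of warming is too slow.
    print(searcher_can_keep_up(1500, 2000))
    print(min_commit_interval_ms(1500))
```

If the check fails, the options are the ones already named in this thread: lengthen the soft commit interval or make the warming queries cheaper.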
Re: facets & docValues
Hi Joel,

No, we have not. We have a softCommit requirement of 2 seconds.

On Tue, May 5, 2020 at 3:31 PM Joel Bernstein wrote:

> Have you configured static warming queries for the facets? This will warm
> the cache structures for the facet fields. You just want to make sure your
> commits are spaced far enough apart that the warming completes before a new
> searcher starts warming.
>
> Joel Bernstein
> http://joelsolr.blogspot.com/
Re: facets & docValues
Have you configured static warming queries for the facets? This will warm the cache structures for the facet fields. You just want to make sure your commits are spaced far enough apart that the warming completes before a new searcher starts warming.

Joel Bernstein
http://joelsolr.blogspot.com/

On Mon, May 4, 2020 at 10:27 AM Revas wrote:

> Hi Erick,
>
> Thanks for the explanation and advice. With facet queries, do docValues
> help at all?
>
> 1) indexed=true, docValues=true => all facets
>
> 2)
>    - indexed=true, docValues=true => only for subfacets
>    - indexed=true, docValues=false => facet query
>    - docValues=true, indexed=false => term facets
>
> In case 1, indexing slowed considerably, but overall facet performance
> improved many fold.
> In case 2, overall performance showed only a slight improvement.
>
> Does that mean that turning on docValues helps performance even for facet
> queries, i.e. that fetching from docValues for a facet query is faster than
> fetching from stored fields?
>
> Thanks
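For reference, static warming queries are normally declared in solrconfig.xml as a newSearcher listener using solr.QuerySenderListener. The sketch below generates such a listener element from a list of facet fields. The field names `category` and `brand` are hypothetical, and the helper names are mine; the real warming queries should mirror the facet requests your application actually sends.

```python
import xml.etree.ElementTree as ET


def build_warming_listener(facet_fields):
    """Build a <listener event="newSearcher"> element with one warming
    query per facet field (field names are supplied by the caller)."""
    listener = ET.Element(
        "listener",
        {"event": "newSearcher", "class": "solr.QuerySenderListener"},
    )
    queries = ET.SubElement(listener, "arr", {"name": "queries"})
    for field in facet_fields:
        lst = ET.SubElement(queries, "lst")
        params = (
            ("q", "*:*"),
            ("rows", "0"),          # only the facet counts matter for warming
            ("facet", "true"),
            ("facet.field", field),
        )
        for name, value in params:
            ET.SubElement(lst, "str", {"name": name}).text = value
    return listener


def listener_xml(facet_fields):
    """Serialize the listener so it can be pasted into solrconfig.xml."""
    return ET.tostring(build_warming_listener(facet_fields), encoding="unicode")


if __name__ == "__main__":
    print(listener_xml(["category", "brand"]))  # hypothetical field names
```

The generated element belongs inside the `<query>` section of solrconfig.xml. Whether the warming completes in time still depends on the commit interval, as discussed above.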
Re: facets & docValues
Hi Erick,

Thanks for the explanation and advice. With facet queries, do docValues help at all?

1) indexed=true, docValues=true => all facets

2)
   - indexed=true, docValues=true => only for subfacets
   - indexed=true, docValues=false => facet query
   - docValues=true, indexed=false => term facets

In case 1, indexing slowed considerably, but overall facet performance improved many fold.
In case 2, overall performance showed only a slight improvement.

Does that mean that turning on docValues helps performance even for facet queries, i.e. that fetching from docValues for a facet query is faster than fetching from stored fields?

Thanks

On Thu, Apr 16, 2020 at 1:50 PM Erick Erickson wrote:

> DocValues should help when faceting over fields, i.e. facet.field=blah.
>
> I would expect docValues to help with sub facets too, but I don’t know
> the code well enough to say definitively one way or the other.
>
> The empirical approach would be to set “uninvertible=false” (Solr 7.6 and
> later) and turn docValues off. What that means is that if any operation
> tries to uninvert the index on the Java heap, you’ll get an exception like:
>
> “can not sort on a field w/o docValues unless it is indexed=true
> uninvertible=true and the type supports Uninversion:”
>
> See SOLR-12962.
>
> Speed is only one issue. The entire point of docValues is to not “uninvert”
> the field on the heap. This used to lead to very significant memory
> pressure. So when turning docValues off, you run the risk of reverting to
> the old behavior and having unexpected memory consumption, not to mention
> slowdowns when the uninversion takes place.
>
> Also, unless your documents are very large, this is a tiny corpus. It can
> be quite hard to get realistic numbers; the signal gets lost in the noise.
>
> You should only shard when your individual query times exceed your
> requirement. Say you have a 95th-percentile requirement of 1 second
> response time.
>
> Let’s further say that you can meet that requirement at 50 queries/second,
> but when you get to 75 queries/second your response time exceeds your
> requirement. Do NOT shard at this point. Add another replica instead.
> As a general rule, sharding adds inevitable overhead and should only be
> considered when you can’t get adequate response time even under fairly
> light query loads.
>
> Best,
> Erick
Re: facets & docValues
DocValues should help when faceting over fields, i.e. facet.field=blah.

I would expect docValues to help with sub facets too, but I don’t know the code well enough to say definitively one way or the other.

The empirical approach would be to set “uninvertible=false” (Solr 7.6 and later) and turn docValues off. What that means is that if any operation tries to uninvert the index on the Java heap, you’ll get an exception like:

“can not sort on a field w/o docValues unless it is indexed=true uninvertible=true and the type supports Uninversion:”

See SOLR-12962.

Speed is only one issue. The entire point of docValues is to not “uninvert” the field on the heap. This used to lead to very significant memory pressure. So when turning docValues off, you run the risk of reverting to the old behavior and having unexpected memory consumption, not to mention slowdowns when the uninversion takes place.

Also, unless your documents are very large, this is a tiny corpus. It can be quite hard to get realistic numbers; the signal gets lost in the noise.

You should only shard when your individual query times exceed your requirement. Say you have a 95th-percentile requirement of 1 second response time.

Let’s further say that you can meet that requirement at 50 queries/second, but when you get to 75 queries/second your response time exceeds your requirement. Do NOT shard at this point. Add another replica instead. As a general rule, sharding adds inevitable overhead and should only be considered when you can’t get adequate response time even under fairly light query loads.

Best,
Erick

On Apr 16, 2020, at 12:08 PM, Revas wrote:

> Hi Erick,
>
> You are correct, we have only about 1.8M documents so far, and turning on
> indexing for the facet fields greatly improved the timings of the facet
> request, which contains sub facets and facet queries. So do docValues help
> at all for sub facets and facet queries? Our tests revealed a further
> query-time improvement when we turned docValues off. Is that the right
> approach?
>
> Currently we have only 1 shard, and we are thinking of scaling by
> increasing the number of shards when we see query times deteriorate.
> Any suggestions?
>
> Thanks.
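A sketch of how the experiment described above might be set up through the Schema API. As far as I understand SOLR-12962, the exception quoted above is what Solr raises for a field that has uninvertible=false, so that is the value used here. The field name `category` and the type `string` are hypothetical, and the helper function is mine; only the `replace-field` command and the field properties are Solr's.

```python
import json


def replace_field_command(name, field_type, *, indexed, doc_values,
                          uninvertible, stored=False):
    """Build a Schema API 'replace-field' command as a plain dict."""
    return {
        "replace-field": {
            "name": name,
            "type": field_type,
            "indexed": indexed,
            "stored": stored,
            "docValues": doc_values,
            # False asks Solr to fail instead of uninverting on the heap.
            "uninvertible": uninvertible,
        }
    }


# Hypothetical test field: indexed, no docValues, uninversion forbidden.
payload = replace_field_command(
    "category", "string", indexed=True, doc_values=False, uninvertible=False
)
body = json.dumps(payload, indent=2)

if __name__ == "__main__":
    # The body would be POSTed to the collection's /schema endpoint.
    print(body)
```

After a schema change like this the documents need to be reindexed before timings mean anything. Any request that then fails with the exception above is one that would otherwise have built an uninverted structure on the heap.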
Re: facets & docValues
Hi Erick,

You are correct, we have only about 1.8M documents so far, and turning on indexing for the facet fields greatly improved the timings of the facet request, which contains sub facets and facet queries. So do docValues help at all for sub facets and facet queries? Our tests revealed a further query-time improvement when we turned docValues off. Is that the right approach?

Currently we have only 1 shard, and we are thinking of scaling by increasing the number of shards when we see query times deteriorate. Any suggestions?

Thanks.

On Wed, Apr 15, 2020 at 8:21 AM Erick Erickson wrote:

> In a word, “yes”. I also suspect your corpus isn’t very big.
>
> I think the key is the facet queries. Now, I’m talking from
> theory rather than diving into the code, but querying on
> a docValues=true, indexed=false field is really doing a
> search. And searching on a field like that is effectively
> analogous to a table scan. Even if somehow an internal
> structure were constructed to deal with it, it would
> probably be on the heap, where you don’t want it.
>
> So the test would be to take the queries out and measure
> performance, but I think that’s the root issue here.
>
> Best,
> Erick
Re: facets & docValues
In a word, “yes”. I also suspect your corpus isn’t very big.

I think the key is the facet queries. Now, I’m talking from theory rather than diving into the code, but querying on a docValues=true, indexed=false field is really doing a search. And searching on a field like that is effectively analogous to a table scan. Even if somehow an internal structure were constructed to deal with it, it would probably be on the heap, where you don’t want it.

So the test would be to take the queries out and measure performance, but I think that’s the root issue here.

Best,
Erick

On Apr 14, 2020, at 11:51 PM, Revas wrote:

> We have faceting fields that have been defined as indexed=false,
> stored=false and docValues=true.
>
> However, we use a lot of subfacets using JSON facets, and facet ranges
> using facet.queries. We see that after every soft commit our performance
> worsens, while it is ideal between commits.
>
> How are docValues fields affected by a soft commit, and do we need to
> enable indexing if we use subfacets and facet queries to improve
> performance?
>
> Tha
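The table-scan analogy can be illustrated with a toy model. This is not how Lucene stores anything internally; it only contrasts the two access patterns: a term lookup in an inverted index (what indexed=true provides) versus walking a per-document column of values (roughly what is left when only docValues exist). All names and data are made up.

```python
from collections import defaultdict

# Toy column of values, one entry per document id (made-up data).
COLOR_BY_DOC = ["red", "blue", "red", "green", "blue", "red"]


def build_inverted_index(values):
    """Map each term to the sorted list of document ids containing it."""
    postings = defaultdict(list)
    for doc_id, value in enumerate(values):
        postings[value].append(doc_id)
    return postings


def match_with_index(postings, term):
    """One dictionary lookup; cost does not depend on the corpus size."""
    return list(postings.get(term, []))


def match_with_scan(values, term):
    """Visit every document's value: the 'table scan' access pattern."""
    return [doc_id for doc_id, value in enumerate(values) if value == term]


if __name__ == "__main__":
    index = build_inverted_index(COLOR_BY_DOC)
    print(match_with_index(index, "red"))
    print(match_with_scan(COLOR_BY_DOC, "red"))
```

Both functions return the same documents. The difference is that the scan touches every document for every facet query, which would be consistent with facet queries getting faster once the fields were indexed.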
facets & docValues
We have faceting fields that have been defined as indexed=false, stored=false and docValues=true.

However, we use a lot of subfacets using JSON facets, and facet ranges using facet.queries. We see that after every soft commit our performance worsens, while it is ideal between commits.

How are docValues fields affected by a soft commit, and do we need to enable indexing if we use subfacets and facet queries to improve performance?

Tha
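For readers unfamiliar with the request shape being described, here is a minimal sketch of a JSON Facet API body with a terms facet and query-type subfacets standing in for the ranges. The field names `category` and `price` and the bucket boundaries are hypothetical; the actual requests in this thread were not posted.

```python
import json


def build_facet_request(terms_field, range_field, buckets):
    """Build a JSON Facet API request: a terms facet on `terms_field`
    with one query subfacet per (low, high) bucket on `range_field`."""
    subfacets = {
        f"{low}_to_{high}": {
            "type": "query",
            "q": f"{range_field}:[{low} TO {high}]",
        }
        for low, high in buckets
    }
    return {
        "query": "*:*",
        "limit": 0,  # no documents needed, only facet counts
        "facet": {
            f"by_{terms_field}": {
                "type": "terms",
                "field": terms_field,
                "facet": subfacets,
            }
        },
    }


if __name__ == "__main__":
    request = build_facet_request("category", "price", [(0, 100), (101, 500)])
    print(json.dumps(request, indent=2))
```

Each query subfacet is evaluated per bucket of the parent terms facet, so the cost of searching on `price` is paid many times in one request. That is why the way that field is indexed matters so much here.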