Yonik Seeley wrote on 1.9.2017 at 17:03:
> On Fri, Sep 1, 2017 at 9:17 AM, Ere Maijala <ere.maij...@helsinki.fi> wrote:
>> I spoke a bit too soon. Now I see why I didn't see any improvement from
>> facet.method=uif before: its performance seems to depend heavily on how many
>> facets are returned. With an index of 6 million records and the facet having
>> 1960 buckets:
>>
>> facet.limit=20 takes 4ms
>> facet.limit=200 takes ~100ms
>> facet.limit=2000 takes ~1300ms
>>
>> So, for some uses it provides a nice boost, but if you need to fetch more
>> than a few top items, it doesn't perform properly.
>
> Another thought on this one:
> If it does slow down more than 4.x when requesting many items, it's either
> 1) a bug introduced at some point
> 2) not actually slower, but due to the 6.6 index having more segments
> (ord->string conversion needs to merge multiple term enumerators, so
> more segments == slower)
>
> If you could check #2, that would be great!  If it doesn't seem to be
> the problem, could you open up a new JIRA issue for this?
>
Thanks for the insight, Yonik. I can confirm that #2 is true. I ran

<optimize maxSegments="1" waitSearcher="true"/>

and after it completed I was able to retrieve 2000 values in 17ms.
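
For anyone who wants to trigger the same optimize over HTTP instead of posting the XML message, here is a minimal sketch in Python. It only builds the request URL for the update handler and does not send it. The host, port and the helper name optimize_url are assumptions for illustration; the core name biblio2 is taken from the log output further down.

```python
from urllib.parse import urlencode


def optimize_url(base_url, core, max_segments=1, wait_searcher=True):
    """Build an /update URL asking Solr to optimize down to max_segments.

    base_url and core are placeholders; adjust them to your installation.
    """
    params = {
        "optimize": "true",
        "maxSegments": str(max_segments),
        "waitSearcher": "true" if wait_searcher else "false",
    }
    return f"{base_url.rstrip('/')}/{core}/update?{urlencode(params)}"


# Hypothetical local installation; prints the URL only.
print(optimize_url("http://localhost:8983/solr", "biblio2"))
```

The resulting URL can be requested with any HTTP client. Note that a forced merge down to one segment rewrites the whole index, so on a large collection it can take a long time.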

Does this mean we should have a very aggressive merge policy? That's something I haven't tweaked, and it's not quite clear to me what the best way would be to achieve a consistently low number of segments.
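
To illustrate why the segment count matters here: resolving ordinals to strings across a multi-segment index means walking several sorted term lists in step, one per segment. The following is a toy sketch in Python of such a k-way merge. It is not Lucene's actual implementation; the function name merged_terms and the sample term lists are made up for illustration.

```python
import heapq


def merged_terms(segments):
    """Yield globally sorted, de-duplicated terms from per-segment sorted lists."""
    previous = None
    # heapq.merge keeps one cursor per input list, so every term it
    # produces costs roughly log(k) comparisons for k segments.
    for term in heapq.merge(*segments):
        if term != previous:
            yield term
            previous = term


# Three hypothetical segments, each with its own sorted term dictionary.
segments = [["a", "c", "e"], ["b", "c", "f"], ["a", "d"]]
print(list(merged_terms(segments)))
```

With a single segment there is nothing to merge, which is consistent with the 17ms result after optimizing.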

I encountered one issue with some further testing. I assume this is a bug: Trying to use facet.method=uif with a solr.DateRangeField causes the following exception:

2017-09-04 12:50:33.246 ERROR (qtp1205044462-18602) [ x:biblio2] o.a.s.s.HttpSolrCall null:org.apache.solr.common.SolrException: Exception during facet.field: search_daterange_mv
        at org.apache.solr.request.SimpleFacets.lambda$getFacetFieldCounts$0(SimpleFacets.java:809)
        at java.util.concurrent.FutureTask.run(FutureTask.java:266)
        at org.apache.solr.request.SimpleFacets$3.execute(SimpleFacets.java:742)
        at org.apache.solr.request.SimpleFacets.getFacetFieldCounts(SimpleFacets.java:818)
        at org.apache.solr.handler.component.FacetComponent.getFacetCounts(FacetComponent.java:326)
        at org.apache.solr.handler.component.FacetComponent.process(FacetComponent.java:274)
        at org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:304)
        at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:173)
        at org.apache.solr.core.SolrCore.execute(SolrCore.java:2477)
        at org.apache.solr.servlet.HttpSolrCall.execute(HttpSolrCall.java:723)
        at org.apache.solr.servlet.HttpSolrCall.call(HttpSolrCall.java:529)
        at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:361)
        at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:305)
        at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1691)
        at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:582)
        at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:143)
        at org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:548)
        at org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:226)
        at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1180)
        at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:512)
        at org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:185)
        at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1112)
        at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141)
        at org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:213)
        at org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:119)
        at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:134)
        at org.eclipse.jetty.rewrite.handler.RewriteHandler.handle(RewriteHandler.java:335)
        at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:134)
        at org.eclipse.jetty.server.Server.handle(Server.java:534)
        at org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:320)
        at org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:251)
        at org.eclipse.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:273)
        at org.eclipse.jetty.io.FillInterest.fillable(FillInterest.java:95)
        at org.eclipse.jetty.io.SelectChannelEndPoint$2.run(SelectChannelEndPoint.java:93)
        at org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume.executeProduceConsume(ExecuteProduceConsume.java:303)
        at org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume.produceConsume(ExecuteProduceConsume.java:148)
        at org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume.run(ExecuteProduceConsume.java:136)
        at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:671)
        at org.eclipse.jetty.util.thread.QueuedThreadPool$2.run(QueuedThreadPool.java:589)
        at java.lang.Thread.run(Thread.java:748)

Caused by: java.lang.IllegalStateException: instead call createFields() because isPolyField() is true
        at org.apache.solr.schema.AbstractSpatialFieldType.createField(AbstractSpatialFieldType.java:204)
        at org.apache.solr.schema.AbstractSpatialFieldType.createField(AbstractSpatialFieldType.java:73)
        at org.apache.solr.schema.FieldType.toObject(FieldType.java:385)
        at org.apache.solr.search.facet.FacetFieldProcessorByArray.lambda$calcFacets$0(FacetFieldProcessorByArray.java:113)
        at org.apache.solr.search.facet.FacetFieldProcessor.findTopSlots(FacetFieldProcessor.java:333)
        at org.apache.solr.search.facet.FacetFieldProcessorByArray.calcFacets(FacetFieldProcessorByArray.java:110)
        at org.apache.solr.search.facet.FacetFieldProcessorByArray.process(FacetFieldProcessorByArray.java:58)
        at org.apache.solr.search.facet.FacetProcessor.processSubs(FacetProcessor.java:460)
        at org.apache.solr.search.facet.FacetProcessor.fillBucket(FacetProcessor.java:407)
        at org.apache.solr.search.facet.FacetQueryProcessor.process(FacetQuery.java:64)
        at org.apache.solr.request.SimpleFacets.getTermCounts(SimpleFacets.java:544)
        at org.apache.solr.request.SimpleFacets.getTermCounts(SimpleFacets.java:405)
        at org.apache.solr.request.SimpleFacets.lambda$getFacetFieldCounts$0(SimpleFacets.java:803)

--Ere

> -Yonik
>
>
>> Query used was:
>>
>> q=*:*&rows=0&facet=true&facet.field=building&facet.mincount=1&facet.limit=2000&debugQuery=true&facet.method=uif
>>
>> --Ere
>>
>>
>> Ere Maijala wrote on 1.9.2017 at 13:10:
>>>
>>> I can confirm that we're seeing the same issue as Günter. For a collection
>>> of 57 million bibliographic records, Solr 4.10.2 (without docValues) can
>>> consistently return a facet in about 20ms, while Solr 6.6.0 with docValues
>>> takes around 2600ms. I've tested some versions between those two too, but I
>>> don't have comparable numbers for them.
>>>
>>> I thought I had tried all different combinations of docValues="true/false"
>>> and facet.method=fc/uif/enum, but now that I checked it again, it seems that
>>> I may have missed a test, as a 6.6.0 index with docValues="false" and
>>> facet.method=uif is markedly faster than other methods. At around 700ms it's
>>> still nowhere near as fast as 4.10.2, but a whole lot better. It seems
>>> that docValues needs to be disabled for facet.method=uif to have an effect
>>> though, which is unfortunate. Otherwise it reports that the applied method is
>>> UIF, but the performance is actually much worse than with FC. I'll do
>>> another round of testing to verify all this. I can report to SOLR-8096 when
>>> I have something conclusive.
>>>
>>> --Ere
>>>
>>> Yonik Seeley wrote on 31.8.2017 at 20:04:
>>>>
>>>> A possible improvement for some multiValued fields might be to use the
>>>> "uif" facet method (UnInvertedField was the default method for
>>>> multiValued fields in 4.x)
>>>> I'm not sure if you would need to reindex without docValues on that
>>>> field to try it though.
>>>>
>>>> Example: to enable on the "union" field, add f.union.facet.method=uif
>>>>
>>>> Support for this was added in
>>>> https://issues.apache.org/jira/browse/SOLR-8466
>>>>
>>>> -Yonik
>>>>
>>>>
>>>> On Thu, Aug 31, 2017 at 10:41 AM, Günter Hipler
>>>> <guenter.hip...@unibas.ch> wrote:
>>>>>
>>>>> Hi,
>>>>>
>>>>> in the meantime I came across the reason for the slow facet processing
>>>>> capacities of SOLR since version 5.x
>>>>>
>>>>>    https://issues.apache.org/jira/browse/SOLR-8096
>>>>>    https://issues.apache.org/jira/browse/LUCENE-5666
>>>>>
>>>>> compared to version 4.x
>>>>>
>>>>> Various library networks across the world are suffering from the same
>>>>> symptoms:
>>>>>
>>>>> Facet processing is one of the most important features of a search server
>>>>> (for us) and it seems (at least IMHO) there is no solution for the issue
>>>>> since March 2015 (release date for the last SOLR 4 version)
>>>>>
>>>>> What are the plans / ideas of the solr developers for a possible future
>>>>> solution? Or maybe there is already a solution I haven't seen so far.
>>>>>
>>>>> Thanks for any feedback
>>>>>
>>>>> Günter
>>>>>
>>>>>
>>>>>
>>>>> On 21.08.2017 15:35, guenterh.li...@bluewin.ch wrote:
>>>>>>
>>>>>>
>>>>>> Hi,
>>>>>>
>>>>>> I can't figure out the reason why the facet processing in version 6 needs
>>>>>> significantly more time compared to version 4.
>>>>>>
>>>>>> The debugging response (for 30 million documents)
>>>>>>
>>>>>> solr 4
>>>>>> <lst name="process"><double name="time">280.0</double>
>>>>>> <lst name="query"><double name="time">0.0</double></lst>
>>>>>> <lst name="facet"><double name="time">280.0</double></lst>
>>>>>> (once the query is cached)
>>>>>> before caching: between 1.5 and 2 sec
>>>>>>
>>>>>>
>>>>>> solr 6.x (my last try was with 6.6)
>>>>>> without docvalues for faceting fields (same schema as version 4)
>>>>>> <lst name="process"><double name="time">5874.0</double>
>>>>>> <lst name="query"><double name="time">0.0</double></lst>
>>>>>> <lst name="facet"><double name="time">5873.0</double></lst>
>>>>>> <lst name="facet_module"><double name="time">0.0</double></lst>
>>>>>> the time is not getting better even after repeating the query several times
>>>>>>
>>>>>>
>>>>>> solr 6.6 with docvalues for faceting fields
>>>>>> <lst name="process"><double name="time">9837.0</double>
>>>>>> <lst name="query"><double name="time">0.0</double></lst>
>>>>>> <lst name="facet"><double name="time">9837.0</double></lst>
>>>>>> <lst name="facet_module"><double name="time">0.0</double></lst>
>>>>>>
>>>>>> query used (our production system with version 4)
>>>>>>
>>>>>>
>>>>>> http://search.swissbib.ch/solr/sb-biblio/select?debugQuery=true&q=*:*&facet=true&facet.field=union&facet.field=navAuthor_full&facet.field=format&facet.field=language&facet.field=navSub_green&facet.field=navSubform&facet.field=publishDate&qt=edismax&ps=2&json.nl=arrarr&bf=recip(abs(ms(NOW/DAY,freshness)),3.16e-10,100,100)&fl=*,score&hl.fragsize=250&start=0&q.op=AND&sort=score+desc&rows=0&hl.simple.pre={{{{START_HILITE}}}}&facet.limit=100&hl.simple.post={{{{END_HILITE}}}}&spellcheck=false&qf=title_short^1000+title_alt^200+title_sub^200+title_old^200+title_new^200+author^750+author_additional^100+author_additional_dsv11_txt_mv^100+title_additional_dsv11_txt_mv^100+series^200+topic^500+addfields_txt_mv^50+publplace_txt_mv^25+publplace_dsv11_txt_mv^25+fulltext+callnumber^1000+ctrlnum^1000+publishDate+isbn+variant_isbn_isn_mv+issn+localcode+id&pf=title_short^1000&facet.mincount=1&hl.fl=fulltext&&wt=xml&facet.sort=count
>>>>>>
>>>>>>
>>>>>> Running the queries on smaller indices (8 million docs), the difference is
>>>>>> similar, although the absolute figures for processing time are smaller
>>>>>>
>>>>>>
>>>>>> Any hints as to why there are such huge differences?
>>>>>>
>>>>>> Günter
>>>>>>
>>>>>>
>>>>>
>>>>> --
>>>>> Universität Basel
>>>>> Universitätsbibliothek
>>>>> Günter Hipler
>>>>> Projekt SwissBib
>>>>> Schoenbeinstrasse 18-20
>>>>> 4056 Basel, Schweiz
>>>>> Tel.: + 41 (0)61 267 31 12 Fax: ++41 61 267 3103
>>>>> E-Mail guenter.hip...@unibas.ch
>>>>> URL: www.swissbib.org  / http://www.ub.unibas.ch/
>>>>>
>>>
>>
>> --
>> Ere Maijala
>> Kansalliskirjasto / The National Library of Finland
--
Ere Maijala
Kansalliskirjasto / The National Library of Finland
