[ 
https://issues.apache.org/jira/browse/SOLR-6889?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16084502#comment-16084502
 ] 

Shinichiro Abe commented on SOLR-6889:
--------------------------------------

I agree. But couldn't it be made much faster by parallelizing the calls to 
FieldOffsetStrategy#getOffsetsEnums(), especially for the OffsetSource.ANALYSIS 
strategy, i.e. the storeOffsetsWithPositions = false case, in which the user can 
select the fields to highlight after indexing? At the time, I assumed that the 
text analysis work done by the standard highlighter could be parallelized, 
borrowing the idea from the facet.threads method. That said, I have seen a 
benchmark where UH's offsetSource=ANALYSIS is already much faster than the 
standard highlighter.
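
To make the idea concrete, here is a minimal sketch of per-field parallel analysis in the spirit of facet.threads. It uses only the JDK, not the Lucene or Solr API: PerFieldAnalysisSketch, findOffsets(), analyzeFields() and the threads argument are hypothetical names, and findOffsets() is a plain substring scan standing in for the real re-analysis behind FieldOffsetStrategy#getOffsetsEnums().

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

final class PerFieldAnalysisSketch {

    // Hypothetical stand-in for re-analyzing a field's text: returns the
    // [start, end) character offsets of every occurrence of term.
    static List<int[]> findOffsets(String text, String term) {
        List<int[]> offsets = new ArrayList<>();
        if (term.isEmpty()) {
            return offsets;
        }
        int from = 0;
        int idx;
        while ((idx = text.indexOf(term, from)) >= 0) {
            offsets.add(new int[] {idx, idx + term.length()});
            from = idx + term.length();
        }
        return offsets;
    }

    // Submits one analysis task per field to a bounded pool (the role a
    // facet.threads-like parameter would play), then collects the results
    // in the original field order.
    static Map<String, List<int[]>> analyzeFields(
            Map<String, String> fields, String term, int threads) {
        ExecutorService pool = Executors.newFixedThreadPool(Math.max(1, threads));
        try {
            Map<String, Future<List<int[]>>> pending = new LinkedHashMap<>();
            for (Map.Entry<String, String> e : fields.entrySet()) {
                String text = e.getValue();
                Callable<List<int[]>> task = () -> findOffsets(text, term);
                pending.put(e.getKey(), pool.submit(task));
            }
            Map<String, List<int[]>> result = new LinkedHashMap<>();
            for (Map.Entry<String, Future<List<int[]>>> e : pending.entrySet()) {
                result.put(e.getKey(), e.getValue().get());
            }
            return result;
        } catch (InterruptedException ex) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while analyzing fields", ex);
        } catch (ExecutionException ex) {
            throw new IllegalStateException("field analysis failed", ex.getCause());
        } finally {
            pool.shutdown();
        }
    }
}
```

Whether this pays off would depend on how expensive the analysis of each field is compared with the cost of handing the work to another thread. A real implementation would presumably reuse a shared executor instead of creating a pool per request.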

> Highlight using multiple threads
> --------------------------------
>
>                 Key: SOLR-6889
>                 URL: https://issues.apache.org/jira/browse/SOLR-6889
>             Project: Solr
>          Issue Type: Improvement
>    Affects Versions: 6.0
>            Reporter: Shinichiro Abe
>         Attachments: SOLR-6889.patch
>
>
> I think we could improve search performance a little by using 
> Stream.parallel().forEach(), which is processor-aware via the fork/join 
> framework under the hood.
> It would especially affect the for-loops over the docList, e.g. in debugging 
> and highlighting.
> This improvement seems most effective in environments with many CPUs.
> My test conditions:
> 1. Core i5 (2 cores, 4 threads), standalone Solr.
> 2. q=日本&debug=true&hl=true; the other parameters are 
> [here|https://github.com/anond2/simplesearch/blob/master/conf/solrconfig.xml#L836].
> 3. 7171 hits / 12000 docs (taken from a ja.wikipedia dump)
> 4. Compared to trunk, parallel streams are a little faster.
> My query execution results (QTime, in ms):
> {noformat}
> == rows=10 ==
>     trunk  patch 
> 1st 236    146
> 2nd 179    100
> 3rd 79     72
> 4th 75     53
> 5th 91     80
> == rows=50 ==
>     trunk  patch 
> 1st 485    325
> 2nd 225    243
> 3rd 199    151
> 4th 168    127
> 5th 149    118
> == rows=100 ==
>     trunk  patch 
> 1st 948    607
> 2nd 472    390
> 3rd 237    201
> 4th 256    200
> 5th 224    178
> == rows=500 ==
>     trunk  patch 
> 1st 3248   2826
> 2nd 1545   1067
> 3rd 1563   801
> 4th 1551   816
> 5th 1452   777
> {noformat}
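
For readers without the patch at hand, the approach in the quoted description can be sketched roughly as follows. This is not the code from SOLR-6889.patch: ParallelDocLoopSketch, highlight() and highlightAll() are hypothetical names, and the per-document work is reduced to a simple string replacement so that the example runs on the JDK alone.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.IntStream;

final class ParallelDocLoopSketch {

    // Hypothetical per-document work: wrap every occurrence of term in <em> tags.
    static String highlight(String text, String term) {
        return term.isEmpty() ? text : text.replace(term, "<em>" + term + "</em>");
    }

    // Replaces a sequential for-loop over the documents with a parallel stream.
    // Each index writes only to its own array slot, so no locking is needed and
    // the results stay in the original document order.
    static List<String> highlightAll(List<String> docs, String term) {
        String[] out = new String[docs.size()];
        IntStream.range(0, docs.size())
                .parallel()
                .forEach(i -> out[i] = highlight(docs.get(i), term));
        return Arrays.asList(out);
    }
}
```

Parallel streams run on the common fork/join pool by default, so the degree of parallelism follows the number of available processors. Indexing into a preallocated array is one way to keep the output order stable when forEach() runs out of order.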



