Re: Impact and WAND

Atri Sharma Thu, 11 Jul 2019 06:02:40 -0700

Note that any other scoring mode (COMPLETE or COMPLETE_NO_SCORES) will
mandatorily visit all hits, so there is no scope of skipping and hence
no point of using impacts.


On Thu, Jul 11, 2019 at 8:51 AM Wu,Yunfeng <wuyunfen...@baidu.com> wrote:
>
>
> @Adrien Grand <jpou...@gmail.com<mailto:jpou...@gmail.com>>. Thanks for your 
> reply.
>
> The explanation ` skip low-scoring matches` is great,  I  looked up some docs 
> and inspect some related code.
>
> I noticed the ` block-max WAND` mode only work when  ScoreMode.TOP_SCORES is 
> used,   is right?  (The basic TermQuery would generate ImpactDISI with 
> scoreMode is TOP_SCORES.)
>
> Lucene compute max score per block and then cached in `MaxScoreCache` , this 
> means we can skip low-scoring block( current one block 128 DocIds)  and in 
> competitive block  still need to score any docId as seen,   I confused with  
> `MaxScoreCache#getMaxScoreForLevel(int level)`, what the level mean? Skip 
> level?  (Somewhere invoke this method pass one Integer upTo param)
>
> Thanks Lucene Team
>
>
> 在 2019年7月10日，下午10:52，Adrien Grand 
> <jpou...@gmail.com<mailto:jpou...@gmail.com>> 写道：
>
> To clarify, the scoring process is not accelerated because we
> terminate early but because we can skip low-scoring matches (there
> might be competitive hits at the very end of the index).
>
> CompetitiveImpactAccumulator is indeed related to WAND. It helps store
> the maximum score impacts per block of documents in postings lists.
> Then this information is leveraged by block-max WAND in order to skip
> low-scoring blocks.
>
> This does indeed help avoid reading norms, but also document IDs and
> term frequencies.
>
> On Wed, Jul 10, 2019 at 4:10 PM Wu,Yunfeng 
> <wuyunfen...@baidu.com<mailto:wuyunfen...@baidu.com>> wrote:
>
> Hi,
>
> We discuss some topic from https://github.com/apache/lucene-solr/pull/595. As 
> Atri Sharma propose discuss with the java dev list.
>
>
> Impact `frequency ` and `norm ` just to accelerate the `score process`  which 
>  `terminate early`.
>
> In impact mode, `CompetitiveImpactAccumulator` will record (freq, norm) pair 
> , would stored at index level. Also I noted `CompetitiveImpactAccumulator` 
> commented with `This class accumulates the (freq, norm) pairs that may 
> produce competitive scores`,  maybe related to `WAND`?
>
>
> The norm value which produced or consumed by `Lucene80NormsFormat`.
>
> In this ` Impact way`, we can avoid read norms from `Lucene80NormsProducer` 
> that may generate the extra IO?  （ the norm value Lucene stored twice.）and 
> take full advantage of the WAND method?
>
>
>
> --
> Adrien
>

---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
For additional commands, e-mail: java-user-h...@lucene.apache.org

Re: Impact and WAND

Reply via email to