Derek:

Why do you care? What evidence do you have that this matters _practically_?

If you look at scoring with a small number of documents, you'll see
significant differences due to deleted documents. In most cases, as the
number of documents grows, the difference in ranking between an index with
no deletions and one that has deletions is usually not noticeable.
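
As a rough illustration of the point above, here is a toy sketch in Python.
It assumes the classic Lucene TF-IDF formula idf = 1 + ln(numDocs / (docFreq + 1))
and assumes that deleted-but-unmerged documents still count toward both
numbers. The helper names and all the document counts are made up for the
example; they are not taken from Derek's index.

```python
import math


def classic_idf(doc_freq, num_docs):
    """Classic TF-IDF style idf: 1 + ln(numDocs / (docFreq + 1))."""
    return 1.0 + math.log(num_docs / (doc_freq + 1))


def idf_shift(live_docs, live_df, deleted_docs, deleted_df):
    """Relative change in idf when deleted docs are still counted.

    live_docs / live_df:       totals for documents that are not deleted
    deleted_docs / deleted_df: deleted documents still present in segments
    """
    clean = classic_idf(live_df, live_docs)
    dirty = classic_idf(live_df + deleted_df, live_docs + deleted_docs)
    return abs(dirty - clean) / clean


# Hypothetical small index: 10 live docs, the term is in 2 of them,
# plus 1 deleted doc that also contains the term.
small = idf_shift(live_docs=10, live_df=2, deleted_docs=1, deleted_df=1)

# Hypothetical large index with the same 10% deletion ratio, where the
# deleted docs contain the term at a slightly higher rate than live ones.
large = idf_shift(live_docs=1_000_000, live_df=20_000,
                  deleted_docs=100_000, deleted_df=2_500)

print(f"small index idf shift: {small:.2%}")
print(f"large index idf shift: {large:.2%}")
```

With these made-up numbers the shift is far larger for the small index than
for the large one, which is the effect described above. Real indexes will
differ, so treat this only as a way to reason about the numbers.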

I'm suggesting that this is a red herring. Your specific situation may be
different, of course, but since scoring is really only about ranking docs
relative to each other, unless the relative positions change enough to be
noticeable it's not a problem.

Note that I'm saying "relative rankings", NOT "absolute score". Document scores
have no meaning outside comparisons to other docs _in the same query_. So
unless you see documents changing their position in the list due to having
deleted docs, it's not worth spending time on IMO.

Best,
Erick

On Tue, Sep 1, 2015 at 12:45 AM, Upayavira <u...@odoko.co.uk> wrote:
> I wonder if this resolves it [1]. It has been applied to trunk, but not
> to the 5.x release branch.
>
> If you needed it in 5.x, I wonder if there's a way that particular
> choice could be made configurable.
>
> Upayavira
>
> [1] https://issues.apache.org/jira/browse/LUCENE-6711
> On Tue, Sep 1, 2015, at 02:43 AM, Derek Poh wrote:
>> Hi Upayavira
>>
>> In fact we are using optimize currently but were advised to use expunge
>> deletes as it is less resource intensive.
>> So expunge deletes will only remove deleted documents, it will not merge
>> all index segments into one?
>>
>> If we don't use optimize, the deleted documents in the index will affect
>> the scores (with docFreq=2) of the matched documents which will affect
>> the relevancy of the search result.
>>
>> Derek
>>
>> On 9/1/2015 12:05 AM, Upayavira wrote:
>> > If you really must expunge deletes, use optimize. That will merge all
>> > index segments into one, and in the process will remove any deleted
>> > documents.
>> >
>> > Why do you need to expunge deleted documents anyway? It is generally
>> > done in the background for you, so you shouldn't need to worry about it.
>> >
>> > Upayavira
>> >
>> > On Mon, Aug 31, 2015, at 06:46 AM, davidphilip cherian wrote:
>> >> Hi,
>> >>
>> >> The below curl command worked without error, you can try.
>> >>
>> >> curl http://localhost:8983/solr/techproducts/update?commit=true -H
>> >> "Content-Type: text/xml" --data-binary '<commit waitSearcher="false"
>> >> expungeDeletes="true"/>'
>> >>
>> >> However, after executing this, I could still see same deleted counts on
>> >> dashboard.  Deleted Docs:6
>> >> I am not sure whether that means,  the command did not take effect or it
>> >> took effect but did not reflect on dashboard view.
>> >>
>> >>
>> >>
>> >>
>> >>
>> >> On Mon, Aug 31, 2015 at 8:51 AM, Derek Poh <d...@globalsources.com>
>> >> wrote:
>> >>
>> >>> Hi
>> >>>
>> >>> I tried doing an expungeDeletes=true with the following but get the
>> >>> message 'missing content stream'. What am I missing? Do I need to
>> >>> provide additional parameters?
>> >>>
>> >>> curl 'http://127.0.0.1:8983/solr/supplier/update/json?expungeDeletes=true';
>> >>>
>> >>> Thanks,
>> >>> Derek
>> >>>
>> >>> ----------------------
>> >>> CONFIDENTIALITY NOTICE
>> >>> This e-mail (including any attachments) may contain confidential and/or
>> >>> privileged information. If you are not the intended recipient or have
>> >>> received this e-mail in error, please inform the sender immediately and
>> >>> delete this e-mail (including any attachments) from your computer, and you
>> >>> must not use, disclose to anyone else or copy this e-mail (including any
>> >>> attachments), whether in whole or in part.
>> >>> This e-mail and any reply to it may be monitored for security, legal,
>> >>> regulatory compliance and/or other appropriate reasons.
>> >>>
>> >>>
>> >
>>
>>
