Hmm, I wonder if Sphinx doesn't delete documents that appear in both
indexes...

Scenarios to try:
- delete a record, add a different record, merge, and see whether the
deleted record is kept around
- flag the core copy as deleted, merge with the empty delta, re-index the
delta, merge again, and see whether only the updates are kept
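
To make the expected outcome of those scenarios concrete, here is a toy Ruby model of the merge as the quoted Sphinx documentation describes it. It is only a sketch for reasoning, not Sphinx's actual implementation: the Doc struct and the merge_indexes and search helpers are hypothetical names invented for illustration, and an "index" is just a Hash of document id to attributes and keywords.

```ruby
require 'set'

# Hypothetical toy document: attribute values plus the keywords it is indexed under.
Doc = Struct.new(:attrs, :keywords)

# Models `indexer --merge DST SRC --merge-dst-range ATTR MIN MAX` as the quoted
# documentation describes it. This is an assumption-laden sketch, not Sphinx code.
def merge_indexes(dst, src, attr: nil, min: 0, max: 0)
  # Filter the destination: only documents whose ATTR lies in MIN..MAX are kept.
  kept = dst.select do |_id, doc|
    attr.nil? || (min..max).cover?(doc.attrs.fetch(attr, 0))
  end

  # Copy the surviving destination documents so the inputs are not mutated.
  merged = kept.transform_values { |doc| Doc.new(doc.attrs.dup, doc.keywords.dup) }

  # Fold in the source: its attributes win on duplicate ids, keywords accumulate.
  src.each do |id, doc|
    old_keywords = merged.key?(id) ? merged[id].keywords : Set.new
    merged[id] = Doc.new(doc.attrs.dup, old_keywords | doc.keywords)
  end
  merged
end

# Returns the sorted ids of documents that match a keyword.
def search(index, keyword)
  index.select { |_id, doc| doc.keywords.include?(keyword) }.keys.sort
end

core  = { 4013 => Doc.new({ 'sphinx_deleted' => 0 }, Set['old']) }
delta = { 4013 => Doc.new({ 'sphinx_deleted' => 0 }, Set['new']) }

# Merge without flagging anything: the old keyword lingers, as the docs warn.
plain = merge_indexes(core, delta)

# Flag the core copy first, then merge with the filter on sphinx_deleted.
core[4013].attrs['sphinx_deleted'] = 1
filtered = merge_indexes(core, delta, attr: 'sphinx_deleted', min: 0, max: 0)

puts search(plain, 'old').inspect     # [4013]
puts search(filtered, 'old').inspect  # []
puts search(filtered, 'new').inspect  # [4013]
```

In this model the flagged core copy is dropped and only the delta's data survives, so if the real merge still matches the old data, the difference from the documented behaviour is what the scenarios above should help pin down.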

-- 
Pat

On 27/11/2009, at 12:48 PM, Canvas wrote:

> Hi Pat,
>
> Thanks for your timely reply. client.update() does work, but I still
> have a weird problem. Here are the steps to reproduce it.
>
> Step 1: build full index
> # rake thinking_sphinx:index RAILS_ENV=development
>
> Step 2: make some changes to file 4013
>
> Step 3: build delta index ==> the file 4013 is searchable with both
> old and new data
> # rake thinking_sphinx:index:delta RAILS_ENV=development
>
> Step 4: flag the file in core index as "sphinx_deleted" ==> the file
> is searchable only by new data, so far so good.
> >> client.update('buy_sell_file_core', ['sphinx_deleted'], { 4013 => [1] })
>
> Step 5: merge delta to core
> # /usr/local/bin/indexer --config '/workspace/CA/BETA_2/EconveyancePro/config/development.sphinx.conf' --rotate --merge buy_sell_file_core buy_sell_file_delta --merge-dst-range sphinx_deleted 0 0
>
> And now the problem occurs, I can search by both new and old data
> again. It doesn't make any sense to me. Why can I search by the old
> data again? Isn't the old index supposed to be deleted in Step 4? Did
> I do anything wrong in step 5?
>
> Thank you very much for your help.
>
>
> Best wishes,
>
> Canvas
>
> On Nov 25, 12:26 am, Pat Allan <[email protected]> wrote:
>> Hi Canvas
>>
>> Are you using Thinking Sphinx? It adds an internal attribute called
>> sphinx_deleted, and sets records' values to 1 when they are deleted in
>> Ruby code. Then, if you're using the datetime deltas, the merge
>> automatically uses the --merge-dst-range option to remove deleted
>> items from the index.
>>
>> However, if you're using Riddle, then the equivalent call is
>> client.update, not UpdateAttributes. I recommend looking at the source
>> code to get a good understanding of how it all works. Riddle
>> documentation is pretty thin on the ground, but I'd like to improve it
>> over time.
>>
>> Hope this helps.
>>
>> --
>> Pat
>>
>> On 25/11/2009, at 8:00 AM, Canvas wrote:
>>
>>
>>
>>> Hi there guys,
>>
>>> I am currently using Sphinx 0.9.8.1. The following is from the Sphinx
>>> 0.9.8.1 documentation:
>>
>>> " The basic command syntax is as follows:
>>
>>> indexer --merge DSTINDEX SRCINDEX [--rotate]
>>
>>> Only the DSTINDEX index will be affected: the contents of SRCINDEX
>>> will be merged into it. --rotate switch will be required if DSTINDEX
>>> is already being served by searchd. The initially devised usage
>>> pattern is to merge a smaller update from SRCINDEX into DSTINDEX.
>>> Thus, when merging the attributes, values from SRCINDEX will win if
>>> duplicate document IDs are encountered. Note, however, that the "old"
>>> keywords will not be automatically removed in such cases. For example,
>>> if there's a keyword "old" associated with document 123 in DSTINDEX,
>>> and a keyword "new" associated with it in SRCINDEX, document 123 will
>>> be found by both keywords after the merge. You can supply an explicit
>>> condition to remove documents from DSTINDEX to mitigate that; the
>>> relevant switch is --merge-dst-range:
>>
>>> indexer --merge main delta --merge-dst-range deleted 0 0
>>
>>> This switch lets you apply filters to the destination index along with
>>> merging. There can be several filters; all of their conditions must be
>>> met in order to include the document in the resulting merged index. In
>>> the example above, the filter passes only those records where
>>> 'deleted' is 0, eliminating all records that were flagged as deleted
>>> (for instance, using UpdateAttributes() call).  "
>>
>>> It seems that I need to use the "UpdateAttributes()" call to update
>>> the full index before merging. My question is: how do I call
>>> "UpdateAttributes()" on the full index to mark the records that are in
>>> the delta index as deleted?
>>
>>> By the way, I am using a view which contains an "update_at" column,
>>> which is used as the timestamp to select data for the delta index.
>>
>>> Any suggestion is appreciated. Thanks.
>>
>>> Best wishes,
>>
>>> Canvas
>>
>>> --
>>
>>> You received this message because you are subscribed to the Google
>>> Groups "Thinking Sphinx" group.
>>> To post to this group, send email to [email protected].
>>> To unsubscribe from this group, send email to [email protected].
>>> For more options, visit this group at
>>> http://groups.google.com/group/thinking-sphinx?hl=en.
>>
>
