Hi Pat,

I have the initial groundwork for a patch. No tests yet, but I figured
I would get your input on whether you would accept this type of patch,
as it was a little more invasive than just modifying the datetime-delta.

The basic approach was to modify the SQL generation to add a
sql_query_pre to the core models and then add a sql_query_killlist to
the delta models. It does require the addition of a new table to track
the index times, and it only adds the killlist if this table option is
passed to the datetime-delta.

Thinking Sphinx changes:
http://github.com/adamcooper/thinking-sphinx/commit/8e07d16e7e6194205f161c025bbc7c9c7a94eb16

ts-datetime-delta changes:
http://github.com/adamcooper/ts-datetime-delta/commit/73f6061503f9bfa8f5ba616292805031bc3b9279

We are planning to use this in production, and I would like to ensure
it gets merged back to the mainline so we don't have to maintain a
separate fork.

Thanks,
Adam

On Mar 27, 10:12 pm, Pat Allan <[email protected]> wrote:
> Hi Adam
>
> The kill list conf option is in TS (was added a couple of months ago, I
> think). However, if you wanted a default option for deltas, then it really
> only applies to the datetime deltas (in other deltas, you don't get the
> doubling up, and there's no easy way to figure out which documents to delete
> either).
>
> So: a patch is definitely welcome, and you'll want to add it to
> ts-datetime-delta, not the thinking-sphinx gem.
>
> Great detective work by both you and Steven.
>
> Cheers
>
> --
> Pat
>
> On 27/03/2010, at 12:15 PM, adamcooper wrote:
>
> > We've been looking at this issue fairly in depth and it seems to be
> > due to a datetime-delta implementation issue (not sure if it's only
> > specific to ours...)
> >
> > The real crux of the issue is because the main and delta index overlap
> > and therefore double the facet counts.
> >
> > After a full index, the delta and main index have an overlap of
> > records that have been updated in last hour (or threshold time). So
> > any facet counting on those records get doubled.
> > Normal searches are not affected. Counts are further messed up as
> > records that were originally only in the main index are updated and
> > appear in the delta index and therefore become double counted. I can
> > explain this issue further if it's not totally clear. Index merging
> > further causes issues.
> >
> > We have manually tweaked the conf file to get the correct behaviour
> > using sql_query_killlist and sql_query_post and an intermediate
> > table. Basically, we track the last full index time and add any
> > records that have been updated since then to the delta killlist.
> > We'll probably add the deleted records to there too (based on an
> > acts_as_paranoid type deletion).
> >
> > Does anyone know if there is an easier way to accomplish this? Or is
> > this what others are doing too?
> >
> > Pat, we are looking to patch Thinking Sphinx to add killlist support,
> > have you done any initial thinking as to how you would want it added
> > or how it best could be added?
> >
> > Thanks,
> > Adam
> >
> > --
> > You received this message because you are subscribed to the Google Groups
> > "Thinking Sphinx" group.
> > To post to this group, send email to [email protected].
> > To unsubscribe from this group, send email to
> > [email protected].
> > For more options, visit this group
> > at http://groups.google.com/group/thinking-sphinx?hl=en.
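
For anyone following the thread, here is a rough sketch of why the main/delta overlap doubles facet counts and how a kill list on the delta index avoids it. It is a toy simulation in Python, not Sphinx configuration and not code from either patch. The sample data, the function name facet_counts and the kill_list parameter are all made up for illustration, and it assumes the kill list contains every id updated since the last full index.

```python
from collections import Counter

# Hypothetical toy data: document id -> facet value (e.g. a category).
main_index = {1: "books", 2: "books", 3: "music"}

# Records 2 and 3 were updated after the last full index, so the
# datetime delta picks them up again (record 3 also changed category).
delta_index = {2: "books", 3: "films"}


def facet_counts(main, delta, kill_list=frozenset()):
    """Count facet values across the main and delta indexes.

    Ids in kill_list are suppressed from the main index, which is
    roughly what a sql_query_killlist on the delta index asks for.
    """
    counts = Counter()
    for doc_id, facet in main.items():
        if doc_id not in kill_list:
            counts[facet] += 1
    counts.update(delta.values())
    return counts


# Without a kill list, record 2 is counted in both indexes and the
# stale copy of record 3 is still counted under its old category.
naive = facet_counts(main_index, delta_index)

# With a kill list of everything updated since the last full index,
# each record is counted exactly once, using its latest values.
fixed = facet_counts(main_index, delta_index, kill_list=set(delta_index))

print(dict(naive))
print(dict(fixed))
```

With the kill list in place the total facet count matches the number of distinct records, which is the behaviour the manually tweaked conf file was aiming for.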
