A brief update - the index on the delta column did the trick
(surprise, I know ;)).

Just for fun:
collected 1702286 docs, 119.8 MB
total 491.290 sec, 243840.36 bytes/sec, 3464.93 docs/sec
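
For anyone finding this thread later, here is a minimal sketch of the two changes discussed below: an index on the delta column, and a reset query limited to rows where delta = 1. The helper names and the index name are made up for illustration. They are not part of ThinkingSphinx, and the real reset_query in default_delta.rb may look different. Only the index was confirmed to help in this thread; nobody measured the WHERE clause.

```ruby
# Hypothetical helpers for illustration only; these are NOT part of
# ThinkingSphinx, and the real reset_query in default_delta.rb may differ.

# Build the MySQL statement that adds an index on the delta column.
# The index name "index_<table>_on_<column>" is an assumed convention.
def delta_index_sql(table, column = "delta")
  "CREATE INDEX `index_#{table}_on_#{column}` ON `#{table}` (`#{column}`)"
end

# Build a reset statement that only targets rows currently flagged as delta,
# rather than issuing an unconditional UPDATE against the whole table.
def scoped_reset_sql(table, column = "delta")
  "UPDATE `#{table}` SET `#{column}` = 0 WHERE `#{column}` = 1"
end

puts delta_index_sql("users")
puts scoped_reset_sql("users")
```

With the index in place, MySQL can locate the rows where delta = 1 without scanning the full table, which is presumably why the scoped UPDATE and the delta index build both benefit.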

dpc

On Mar 24, 9:19 pm, Pat Allan wrote:
> Yup, putting an index on the column is a good idea - and if you want  
> to put through a patch that only changes delta to 0 WHERE delta = 1,  
> then default_delta is the right place to make that change. Not sure if  
> it'd make it faster, but you've got a decent sized dataset to check :)
>
> Cheers
>
> --
> Pat
>
> On 25/03/2009, at 3:15 PM, Damon P. Cortesi wrote:
>
>
>
> > That'd certainly be a good start, wouldn't it hehe.
>
> > I don't seem to have an index on there, I'll update that this weekend
> > and see if it helps.
>
> > On Mar 24, 5:08 am, James Healy wrote:
> >> I'm indexing a table with 500K records and delta indexing, and building
> >> the core index only takes a couple of minutes. It's hard to imagine
> >> triple the number of records should blow out the indexing time by a
> >> factor of more than 20.
>
> >> In my case, each time the core index is rebuilt there's probably only
> >> 3000-4000 records with delta set to 1 - I'm not sure if that makes a
> >> difference to indexing time.
>
> >> Is your delta column indexed? At first thought I wouldn't think that
> >> would matter for an update query, but maybe it does?
>
> >> Indexing your delta column is good practice anyway, at the very least it
> >> speeds up the building of the delta index.
>
> >> -- James Healy <jimmy-at-deefa-dot-com>  Tue, 24 Mar 2009 23:02:42 +1100
>
> >> Damon P. Cortesi wrote:
>
> >>> I've got a table with about 1.5m entries that I'm indexing using
> >>> ThinkingSphinx (Twitter data, bios specifically - tweepsearch.com).
>
> >>> I have delta indexing enabled, which works fantastic. But as the size
> >>> of the table has grown, so has the indexing time for the core index.
> >>> As an example, I have a `rake ts:index` task running right now that's
> >>> been going for 60 minutes. Not on indexing, though, but on a db query -
> >>> a "show processlist" in MySQL shows the following query:
> >>> UPDATE `users` SET `delta` = 0
>
> >>> So I'm assuming this task is attempting to set the `delta` column of
> >>> every row in my table, which is leading to this delay. It seems like
> >>> this query is originating out of the reset_query method in:
> >>> lib/thinking_sphinx/deltas/default_delta.rb
>
> >>> I considered adding a WHERE clause to this to see if that might help,
> >>> but wasn't quite sure this was the right place, or even if that would
> >>> be appropriate.
>
> >>> Any insight would be appreciated,
>
> >>> dpc
>
> >>> --
> >>> Damon P. Cortesi
> >>> Security Guy, Twitter Apps
> >>> www. tweetstats | tweepsearch | tweetsum .com
>
>