Hi Jonathan

Can you share your index definitions so I can get a better idea of where the 
problem might be?

Also: which versions of Rails and Thinking Sphinx are you using?

-- 
Pat

> On 28 Jun 2015, at 11:47 pm, [email protected] wrote:
> 
> Hi Pat,
> 
> I implemented according to this, and the indexing time went down (5 times 
> faster on development). However, the delta indexing time went up (30 times 
> slower on development). See below the indexing stats:
> 
> Index                    Total docs     Bytes  Time (sec)
> incident_index_1_core          7331   6531122      39.436
> incident_index_1_delta            6      1128       0.184
> incident_index_2_core          7319   6751189      45.477
> incident_index_2_delta            5       843       0.233
> incident_index_3_core          7390   6803814      42.064
> incident_index_3_delta            8      2143       0.203
> incident_index_4_core          7278   6377664      37.665
> incident_index_4_delta            6      1108       0.436
> incident_index_5_core          7396   6601358      39.704
> incident_index_5_delta            6       944       0.216
> 
> Index                    Total docs     Bytes  Time (sec)
> incident_index_6_core          7331  28239593       8.802
> incident_index_6_delta            6  24763425       5.234
> incident_index_7_core          7319  28331726       8.819
> incident_index_7_delta            5  24763289       5.321
> incident_index_8_core          7390  28310121       7.913
> incident_index_8_delta            8  24764366       5.282
> incident_index_9_core          7278  28162260       7.891
> incident_index_9_delta            6  24763330       5.456
> incident_index_10_core         7396  28152075       9.562
> incident_index_10_delta           6  24763308       5.303
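
As a rough check of the "5 times faster" and "30 times slower" figures, the times in the stats above can be totalled per index type. This is only a sketch: the grouping of indices 1-5 as the original setup and 6-10 as the new one is an assumption read off the stats, and the variable names are made up for illustration.

```ruby
# Times (sec) copied from the indexing stats above.
# Assumption: indices 1-5 are the original setup, 6-10 the new one.
core_before  = [39.436, 45.477, 42.064, 37.665, 39.704]
core_after   = [8.802, 8.819, 7.913, 7.891, 9.562]
delta_before = [0.184, 0.233, 0.203, 0.436, 0.216]
delta_after  = [5.234, 5.321, 5.282, 5.456, 5.303]

# Compare total time spent per index type across all five indices.
core_speedup   = core_before.sum / core_after.sum
delta_slowdown = delta_after.sum / delta_before.sum

puts format("core indexing:  %.1fx faster", core_speedup)
puts format("delta indexing: %.1fx slower", delta_slowdown)
```

The original delta times are so small that the per-index ratio swings a lot, so the totals give only an approximate figure.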
> 
> Any idea why this is happening?
> 
> Thanks,
> Jonathan
> 
> On Friday, July 26, 2013 at 3:57:38 PM UTC+3, Pat Allan wrote:
> Heya Steve 
> 
> Was just looking into how difficult this would be to implement properly, and 
> noticed I have added the ability to take a string as the source query - 
> instead of the column references. So, it's possible without hacking around in 
> the index definition itself: 
> 
> https://gist.github.com/pat/6088629
> 
> It's worth noting that the document id (Sphinx's equivalent of a primary key) 
> involves the normal primary key with an offset and a multiplier. Make sure 
> those two integers match what's in your generated index in sql_query. They 
> may change when you add other indices to your app (depends on alphabetical 
> order of your index files). 
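
A minimal sketch of the document id arithmetic described above, assuming it takes the form "primary key * multiplier + offset". The method names and the example numbers are hypothetical; the real multiplier and offset should be copied from the sql_query in your generated configuration.

```ruby
# Sketch: derive a Sphinx document id from a record's primary key,
# assuming the "primary_key * multiplier + offset" form.
# Method names and example values are hypothetical.
def sphinx_document_id(primary_key, multiplier:, offset:)
  primary_key * multiplier + offset
end

# Reverse mapping: recover the primary key from a document id.
def primary_key_for(document_id, multiplier:, offset:)
  (document_id - offset) / multiplier
end

doc_id = sphinx_document_id(42, multiplier: 3, offset: 1)
puts doc_id
puts primary_key_for(doc_id, multiplier: 3, offset: 1)
```

If a hand-written source query uses different integers from the generated one, the ids Sphinx returns will not map back to the right records.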
> 
> Also: there's probably some metaprogramming you could add to simplify things 
> a bit more. 
> 
> Would love to hear if this approach helps with your real app and not just the 
> test one :) 
> 
> -- 
> Pat 
> 
> On 26/07/2013, at 12:14 AM, Pat Allan wrote: 
> 
> > Hi Steve 
> > 
> > I've got a way forward to greatly improve the speed of indexing… 
> > unfortunately, it's not going to work within Thinking Sphinx easily right 
> > now. 
> > 
> > Sphinx can gather attribute and field values from separate queries. TS 
> > v1/v2 supported this for attributes, and TS v3 added it for fields. The 
> > catch is that those separate queries don't work for HABTM joins. I'd love 
> > to change that, but it's painful from an ActiveRecord perspective because 
> > the base of the query is not a model's table but the HABTM join table. 
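
To illustrate why separate queries help: when several has-many associations are joined in a single query, each document yields one row per combination of associated records, and GROUP BY then has to collapse them. Separate queries return one row per associated record instead. The sketch below only counts rows, and the per-document association counts are invented for illustration.

```ruby
# Hypothetical number of associated records for a single document.
association_counts = { images: 10, subscribers: 20, tags: 15, videos: 5 }

# One big joined query: rows multiply across the joined associations.
joined_rows = association_counts.values.reduce(1) { |rows, n| rows * n }

# One query per association: rows simply add up.
separate_rows = association_counts.values.sum

puts "joined query rows per document:   #{joined_rows}"
puts "separate query rows per document: #{separate_rows}"
```

Each extra association multiplies the joined row count again, which fits the way the timings climb so steeply as associations are added.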
> > 
> > Here's the configuration for the relevant source that I modified by hand: 
> > https://gist.github.com/pat/6080031
> > 
> > You'll see that the main query is nice and short - and then there's each of 
> > the MVA and joined field definitions. If you put this in the generated 
> > source definition in config/development.sphinx.conf, and then run the 
> > indexer manually (NOT through the rake task, that'll overwrite this): 
> >  indexer --config config/development.sphinx.conf --all --rotate 
> > 
> > (Remove --rotate if Sphinx isn't running.) You'll see it's pretty damn 
> > fast. 
> > 
> > Now, ways forward? Well, I'd love to write something for TS v3 that can 
> > handle HABTM - it's just a shame that it might need to be pure ARel rather 
> > than ActiveRecord-built (which can otherwise help with joins). 
> > 
> > But otherwise: switch from HABTM to has_many/has_many :through - make each 
> > of the joins an actual model. Then, you can add :source => :query to each 
> > of the appropriate field and attribute definitions, and it should generate 
> > something pretty much the same. 
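
As a configuration sketch of that second option, one association (tags) might look like the fragment below. It is meant to live inside a Rails app with Thinking Sphinx loaded and is not a standalone script. The DocumentTag join model and the file paths are hypothetical, and this has not been checked against a specific Thinking Sphinx version.

```ruby
# app/models/document_tag.rb
# Hypothetical join model replacing the HABTM join table.
class DocumentTag < ActiveRecord::Base
  belongs_to :document
  belongs_to :tag
end

# app/models/document.rb
class Document < ActiveRecord::Base
  has_many :document_tags
  has_many :tags, through: :document_tags
end

# app/indices/document_index.rb
ThinkingSphinx::Index.define :document, with: :active_record, delta: true do
  # Ask for the field and the attribute to be gathered by their own
  # queries instead of being joined into the main sql_query.
  indexes tags.name, as: :tags, source: :query
  has document_tags.tag_id, as: :tag_ids, source: :query

  has updated_at
end
```

The same pattern would be repeated for each of the other associations, and the generated source in config/development.sphinx.conf can be compared with the hand-modified one to confirm the main query stays short.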
> > 
> > Hope this provides some clarity at the very least! And also: thanks for the 
> > test app, really helped with debugging! 
> > 
> > -- 
> > Pat 
> > 
> > 
> > On 25/07/2013, at 2:54 PM, Steve Kenworthy wrote: 
> > 
> >> Hi there, 
> >> 
> >> Firstly, thinking-sphinx is awesome and I love it. Thanks Pat for an 
> >> excellent project. V3 is looking great and represents a lot of hard work 
> >> and effort. 
> >> 
> > >> I've been using thinking-sphinx to index a document model and it's really 
> > >> slowed down when I add lots of associations in the index. In fact, it 
> > >> never finishes on my machine (8 GB RAM, 8 CPUs) when I add 4 associations. 
> >> 
> >> Times: 
> >>         • 4 seconds - when 1 association (images) is indexed 
> >>         • 6 seconds - when 2 associations (images and subscribers) are 
> >> indexed 
> >>         • 23 seconds - when 2 associations (images and countries) are 
> >> indexed 
> >>         • 115 seconds - when 3 associations (images, subscribers and tags) 
> >> are indexed 
> >>         • 113 seconds - when 3 associations (images, subscribers and 
> >> videos) are indexed (just to prove it's not tags slowing it down) 
> >>         • ꝏ (not finishing) - when 4 associations or more are selected. 
> >> 
> >> Here's my index file: 
> >> 
> >> ThinkingSphinx::Index.define :document, with: :active_record, delta: true, 
> >> sql_range_step: 999999999, group_concat_max_len: 16384 do 
> >> 
> >>  has countries(:id), as: :country_ids 
> >>  has images(:id), as: :image_ids, facet: true 
> >>  has subscribers(:id), as: :subscriber_ids, facet: true 
> >>  has tags(:id), as: :tag_ids, facet: true 
> >>  has videos(:id), as: :video_ids, facet: true 
> >> 
> > >>  indexes countries.name, as: :countries 
> >>  indexes images.title, as: :images 
> >>  indexes subscribers.title, as: :subscribers 
> > >>  indexes tags.name, as: :tags 
> >>  indexes videos.title, as: :videos 
> >> 
> >>  has updated_at 
> >> 
> >> end 
> >> 
> > >> The generated SQL is a massive GROUP BY query and is not finishing. See it 
> > >> here: https://github.com/crossroads/rails3-ts-example#what-sphinx-is-doing 
> >> 
> >> I'd really appreciate some advice on how to optimise this so indexing 
> >> becomes viable again. Do I just have too much going on here? I'm using 
> >> facets, indexes and attributes. Perhaps there is a better way to optimise? 
> >> A friend suggested pre-computing with some joins... how would this work? 
> >> 
> >> Vital stats: using mysql v14.14, sphinx 2.0.4, Ubuntu, rails 3.2.13, 
> >> thinking-sphinx 3.0.4 
> >> 
> > >> For those who'd like to take a look, I've uploaded a sample project here, 
> > >> which can be cloned: https://github.com/crossroads/rails3-ts-example 
> > >> If you follow the instructions, it will set up a db with test data and 
> > >> reproduce the problem quickly. 
> >> 
> > >> There's also the Sphinx-generated SQL and EXPLAIN: 
> > >> https://github.com/crossroads/rails3-ts-example#what-sphinx-is-doing 
> >> 
> >> Thanks in advance for anyone taking the time to read. 
> >> 
> >> Regards, 
> >> Steve 
> >> 
> >> -- 
> >> You received this message because you are subscribed to the Google Groups 
> >> "Thinking Sphinx" group. 
> > >> To unsubscribe from this group and stop receiving emails from it, send an 
> > >> email to thinking-sphi...@googlegroups.com. 
> > >> To post to this group, send email to thinkin...@googlegroups.com. 
> > >> Visit this group at http://groups.google.com/group/thinking-sphinx. 
> > >> For more options, visit https://groups.google.com/groups/opt_out. 
> >> 
> >> 
