Hi there,

I've run into some scaling problems in the way the indexer handles long lists of multi-valued attributes. In the worst case I have items with over 25,000 attributes attached. Indexing these through a LEFT JOIN with GROUP_CONCAT took a long time and put considerable load on the database.

Reading up on the Sphinx documentation, I found that multi-valued attributes can also be indexed through a separate query that simply retrieves all the <document, attribute> pairs. A quick test showed that this speeds up indexing tremendously.

This feature isn't supported by thinking-sphinx, so I took a stab at it in my fork at http://github.com/menno/thinking-sphinx/commits/mva

It's tested in production for my use case, which is along the lines of Item.has_many :tags, :through => :taggings. For that association it can run "select item_id, tag_id from taggings" to get all the pairs. There are specs and code for other has-many associations, but they, and other cases, haven't been thoroughly tested.

Another point of concern is that I needed access to the unique-id expression used in the select query to match up the ids. I've moved this logic to ThinkingSphinx.unique_id_expression(offset), but I still needed to pass the offset around a lot more than I'd like.

I hope this can be of use to someone. Feel free to comment on the implementation and tests, as it's my first encounter with the internals of thinking-sphinx, cucumber and rspec ;)

Cheers,
Menno van der Sman
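For anyone who wants to see the shape of the separate-query approach, here is a minimal sketch of the kind of sql_attr_multi line Sphinx accepts for it. The helper name mva_from_query and its parameters are hypothetical and not part of thinking-sphinx or the fork. The document-id formula (primary key * number of indexed models + offset) is an assumption about what the unique-id expression produces, and the values 3 and 1 are made up for illustration.

```ruby
# Hypothetical helper, not part of thinking-sphinx's API.
# Builds a Sphinx `sql_attr_multi` line that loads a multi-valued attribute
# from its own query of <document, attribute> pairs, instead of a
# GROUP_CONCAT column in the main indexing query.
def mva_from_query(attribute:, table:, document_key:, value_key:, model_count:, offset:)
  # Assumption: document ids are computed as
  #   primary key * number of indexed models + per-model offset,
  # so the pairs query has to apply the same expression to match up the ids.
  doc_id = "#{document_key} * #{model_count} + #{offset}"
  query  = "SELECT #{doc_id} AS id, #{value_key} AS #{attribute} FROM #{table}"
  "sql_attr_multi = uint #{attribute} from query; #{query}"
end

# Example for Item.has_many :tags, :through => :taggings
# (model_count and offset are illustrative values only).
puts mva_from_query(
  attribute:    "tag_ids",
  table:        "taggings",
  document_key: "item_id",
  value_key:    "tag_id",
  model_count:  3,
  offset:       1
)
```

The point of the sketch is that the pairs query never touches the items table or groups rows, which is presumably why it is so much cheaper than the join, but it must reproduce the document-id expression exactly. That is the reason the offset ends up being passed around.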
