Okay, here are a few thoughts on the matter - finally (sorry for the delay). So, I definitely think this should all be part of TS proper - I'd love for Thinking Sphinx to opt for smarter string facets when possible (as it removes the need for the class_crc attribute, and for tracking which models are being indexed, which has been a pain from the beginning).
And same for string facets - although we're still faced with the existing limitation there when having arrays of strings - MVAs are still integer-only.

Also, 1.10-beta has a useful option of sql_field_string - declaring both a field and attribute from a single column. This is a natural fit for :sortable => true.

And the newly released 2.0.1-beta improves things further with string attributes - they can be used for sorting and grouping - wish they'd hurry up with the filtering as well! (I've had a brief attempt at getting TS working with Sphinx 2.0.1-beta - Sphinx was having problems starting up via Ruby code, but was fine if I called it from the command line. Will investigate more soon.)

The trick will be having it all degrade nicely for older versions of Sphinx. There's already a check or two in Riddle for this - Riddle.loaded_version.

If you want to take a stab at it, that'd be wonderful - but otherwise, I'm guessing you'll be in Berlin for Euruko? Perhaps we can work on it together then?

--
Pat

On 18/04/2011, at 7:38 AM, Clemens Kofler wrote:

> I've been thinking about string attributes again lately, especially in
> terms of facets, since my previous approach (see
> http://groups.google.com/group/thinking-sphinx/browse_thread/thread/c8cc4fb1e38f7679/76f353007ff827ef?lnk=gst&q=string+attribute#76f353007ff827ef)
> had speed issues. I came up with a way faster solution:
> https://gist.github.com/924493. Now I'm wondering if it would make
> sense to port that stuff back to Thinking Sphinx.
>
> I'm thinking the following: When defining a string attribute, Thinking
> Sphinx could internally keep 2 attributes (similar to what it does for
> facets now) – the original string value as well as its str2ordinal
> counterpart. For sorting, grouping and filtering one could use the
> str2ordinal value, and for stuff like facet labels it could use the actual
> string value. This would largely allow getting rid of the translation
> part that does all the reverse lookups, which is the main reason for
> the speed decrease.
>
> The main issue I see would be the support for 2 different flavors: pre
> 1.10-beta Sphinx installations would need the current implementation
> whereas 1.10-beta and later could use the new implementation. In the
> near future, there might even be a third version that could lose the
> sorting/grouping/filtering column, once Sphinx is able to do that
> natively on string attributes.
>
> The question is: Would it make sense to implement that in Thinking
> Sphinx, bearing in mind the additional complexity it brings (mapping
> all grouping, sorting and filtering to a custom column internally –
> although that happens for facets anyway)? If so, I'm happy to try
> coming up with a clean implementation. Otherwise, I'll just adapt a
> blog post that I have in my pipeline where I explain the whole issue
> and my solution.
>
> WDYT?
>
> --
> You received this message because you are subscribed to the Google Groups
> "Thinking Sphinx" group.
> To post to this group, send email to [email protected].
> To unsubscribe from this group, send email to
> [email protected].
> For more options, visit this group at
> http://groups.google.com/group/thinking-sphinx?hl=en.
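
For illustration only, the version-dependent degradation discussed in this thread could be sketched roughly as below. This is not code from Thinking Sphinx or Riddle: the method name string_attribute_strategy and the three strategy symbols are made up for the example, and the thresholds (1.10 and 2.0.1) are simply the versions mentioned in the thread. It assumes the caller already has a version string in hand (perhaps from something like Riddle.loaded_version, whose exact return format is not confirmed here).

```ruby
require "rubygems" # Gem::Version is used only for version comparison

# Hypothetical helper, not part of Thinking Sphinx or Riddle.
# Maps a Sphinx version string to a strategy for handling string attributes.
def string_attribute_strategy(version_string)
  # Keep only the leading numeric part, so "1.10-beta" is compared as "1.10".
  numeric = version_string[/\A\d+(?:\.\d+)*/]
  raise ArgumentError, "unrecognised version: #{version_string.inspect}" if numeric.nil?

  version = Gem::Version.new(numeric)

  if version >= Gem::Version.new("2.0.1")
    # String attributes can sort and group natively; filtering would
    # still need an integer companion attribute.
    :native_sort_and_group
  elsif version >= Gem::Version.new("1.10")
    # Keep the real string attribute plus a str2ordinal counterpart.
    :string_plus_ordinal
  else
    # Older Sphinx: ordinal only, labels need a reverse lookup.
    :ordinal_with_lookup
  end
end
```

Calling string_attribute_strategy("1.10-beta") would select the two-attribute approach from the quoted proposal, while anything older falls back to the existing behaviour.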
