Okay, here are a few thoughts on the matter - finally (sorry for the delay).

So, I definitely think this should all be part of TS proper - I'd love for 
Thinking Sphinx to opt for smarter string attributes when possible (as that 
removes the need for the class_crc attribute, and for tracking which models 
are being indexed, which has been a pain from the beginning).

The same goes for string facets - although we're still faced with the existing 
limitation there for arrays of strings, since MVAs are still integer-only.

Also, 1.10-beta has a useful sql_field_string option, which declares both a 
field and a string attribute from a single column. This is a natural fit for 
:sortable => true.
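
To make the :sortable => true idea concrete, here is a rough sketch of the 
two kinds of source settings involved. The sortable_settings helper and its 
field_string_supported flag are made up for illustration and are not part of 
Thinking Sphinx or Riddle; only the sql_field_string and sql_attr_str2ordinal 
directive names come from Sphinx itself, and the _sort suffix is an assumption.

```ruby
# Hypothetical helper, not part of Thinking Sphinx or Riddle: returns the
# Sphinx source settings that could back a sortable string column.
def sortable_settings(column, field_string_supported)
  if field_string_supported
    # One directive declares both the field and the string attribute.
    ["sql_field_string = #{column}"]
  else
    # Older Sphinx: the column stays a plain field from sql_query, and a
    # separate ordinal attribute (assumed here to be named <column>_sort)
    # is used for sorting.
    ["sql_attr_str2ordinal = #{column}_sort"]
  end
end

puts sortable_settings('name', true)
puts sortable_settings('name', false)
```

Whether the first form is enough on its own depends on the Sphinx version 
being able to sort on string attributes.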

And the newly released 2.0.1-beta improves things further with string 
attributes - they can be used for sorting and grouping - wish they'd hurry up 
with the filtering as well! (I've had a brief attempt at getting TS working 
with Sphinx 2.0.1-beta - Sphinx was having problems starting up via Ruby code, 
but was fine if I called it from the command line. Will investigate more soon).


The trick will be having it all degrade nicely for older versions of Sphinx. 
There's already a check or two in Riddle for this - Riddle.loaded_version. If 
you want to take a stab at it, that'd be wonderful - but otherwise, I'm 
guessing you'll be in Berlin for Euruko? Perhaps we can work on it together then?
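
As a starting point for the degrading, something along these lines might 
work. It is only a sketch: the StringAttributeStrategy module, its method 
names and the strategy symbols are invented for illustration, and it assumes 
the version arrives as a string like the one Riddle.loaded_version reports.

```ruby
# Hypothetical module, not part of Thinking Sphinx or Riddle: picks a
# string-attribute strategy from a Sphinx version string.
module StringAttributeStrategy
  # Turn "1.10-beta" into [1, 10] so versions compare numerically.
  def self.parse(version)
    version.to_s[/\d+(\.\d+)*/].to_s.split('.').map { |part| part.to_i }
  end

  # True when version is the same as, or newer than, minimum.
  def self.at_least?(version, minimum)
    (parse(version) <=> parse(minimum)) >= 0
  end

  def self.pick(version)
    if at_least?(version, '2.0.1')
      # String attributes can be sorted and grouped on directly;
      # filtering would still need another approach.
      :native_sort_and_group
    elsif at_least?(version, '1.10')
      # Keep the string attribute alongside an ordinal counterpart.
      :string_plus_ordinal
    else
      # Older Sphinx: stick with the current implementation.
      :legacy
    end
  end
end

puts StringAttributeStrategy.pick('1.10-beta')
```

In real code the argument would presumably be Riddle.loaded_version rather 
than a literal.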

-- 
Pat

On 18/04/2011, at 7:38 AM, Clemens Kofler wrote:

> I've been thinking about string attributes again lately, especially in
> terms of facets, since my previous approach (see
> http://groups.google.com/group/thinking-sphinx/browse_thread/thread/c8cc4fb1e38f7679/76f353007ff827ef?lnk=gst&q=string+attribute#76f353007ff827ef)
> had speed issues. I came up with a way faster solution:
> https://gist.github.com/924493. Now I'm wondering if it would make
> sense to port that stuff back to Thinking Sphinx.
> 
> I'm thinking the following: When defining a string attribute, Thinking
> Sphinx could internally keep 2 attributes (similar to what it does for
> facets now) – the original string value as well as its str2ordinal
> counterpart. For sorting, grouping and filtering one could use the
> str2ordinal value, and for stuff like facet labels one could use the
> actual string value. This would largely allow us to get rid of the
> translation part that does all the reverse lookups, which is the main
> reason for the speed decrease.
> 
> The main issue I see would be the support for 2 different flavors: pre
> 1.10-beta Sphinx installations would need the current implementation
> whereas 1.10-beta and later could use the new implementation. In the
> near future, there might even be a third version that could lose the
> sorting/grouping/filtering column, once Sphinx is able to do that
> natively on string attributes.
> 
> The question is: Would it make sense to implement that in Thinking
> Sphinx, bearing in mind the additional complexity it brings (mapping
> all grouping, sorting and filtering to a custom column internally –
> although that happens for facets anyway)? If so, I'm happy to try
> coming up with a clean implementation. Otherwise, I'll just adapt a
> blog post that I have in my pipeline where I explain the whole issue
> and my solution.
> 
> WDYT?
> 
> -- 
> You received this message because you are subscribed to the Google Groups 
> "Thinking Sphinx" group.
> To post to this group, send email to [email protected].
> To unsubscribe from this group, send email to 
> [email protected].
> For more options, visit this group at 
> http://groups.google.com/group/thinking-sphinx?hl=en.
> 
