aaaaaaaaaaaaaahhhhhhhhhhhhhhhh !!!!!!!!!!!!!!! :)
On May 27, 5:15 pm, Pat Allan <[email protected]> wrote:
> Okay, I've made the change. Anyone now installing Thinking Sphinx via
> plugin or gem gets a warning, and morphology has a default of nil.
>
> I'll remove the warning at some point, maybe in a couple of months.
>
> Cheers
>
> --
> Pat
>
> On 17/05/2009, at 10:08 PM, Pat Allan wrote:
>
>
>
> > You did indeed write a lot, but that's okay; it gives me a more thorough
> > understanding.
>
> > Grouping is probably the best way to do what you've done for the
> > account names - so your approach seems right to me (even though it's
> > not a perfect solution). As much as I can comprehend the problem at
> > the moment, anyway :)
>
> > As for alerting people to the removal of the default morphology, I
> > like the idea of having messages when the plugin or gem is installed
> > (and both of those should be doable, I'm almost certain).
>
> > If you want to have a go at forking and patching, be my guest - for
> > plugins, I think PLUGIN_ROOT/install.rb is what should hold code that
> > gets run on installation (it might be housed under PLUGIN_ROOT/rails/
> > install.rb since Rails 2.1). No idea what the process is for gems, but
> > the rspec gem outputs a message, so TS should be able to as well.
>
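For the gem side, RubyGems supports a `post_install_message` field on the gem specification, which is printed once after `gem install` finishes. A minimal sketch, assuming a hypothetical gem name, version, and notice wording (none of these come from the actual Thinking Sphinx gemspec):

```ruby
require "rubygems"

# Hypothetical notice text; the real wording is up to the maintainer.
MORPHOLOGY_NOTICE = <<~MSG
  Thinking Sphinx no longer sets a default morphology.
  To keep English stemming, add this to config/sphinx.yml:

    development:
      morphology: stem_en
MSG

# Hypothetical spec, for illustration only.
spec = Gem::Specification.new do |s|
  s.name    = "example-search-plugin"
  s.version = "0.0.1"
  s.summary = "Illustrates a one-time install notice"
  s.authors = ["Example Author"]
  s.files   = []
  # RubyGems prints this string once, right after installation.
  s.post_install_message = MORPHOLOGY_NOTICE
end

puts spec.post_install_message
```

A plugin's install.rb could print the same text with a plain `puts`, so the warning shows up at install time rather than on every indexing run.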
> > Otherwise, when I have the time and motivation, I'll attempt it myself
> > - which is fine by me, but don't be afraid to give it a shot yourself.
>
> > Cheers
>
> > --
> > Pat
>
> > On 15/05/2009, at 12:35 PM, aitrus wrote:
>
> >> Hi Pat,
>
> >> Thanks again for your work on TS. Sorry, I get worked up easily. To
> >> answer your question, first:
>
> >> I'm doing some data warehouse-ish applications. I pull in lots of
> >> data from various systems. Then I use things like account names,
> >> group names, resource names, host names, etc., to find unique
> >> records.
>
> >> When it comes to grouping, I have an association setup of Personnel/
> >> Divisions <-- ownership --> Accounts.
>
> >> A person can have many accounts. However, each person has only one
> >> personnel record. If I render a search in Sphinx, it paginates the
> >> Personnel records--then if I try to display accounts, the pagination
> >> is very strange.
>
> >> So, I needed to do a search in Sphinx on the Accounts table, (a)
> >> eliminating duplicate account names, and (b) eliminating accounts with
> >> no owner (took some digging to figure out I need to have a "has"
> >> attribute).
>
> >> The way I eventually got this to work (after much whiskey and
> >> self-mutilation) is to set up:
>
> >> has staffs(:id), :as => :has_staffs, :type => :integer
> >> has ["LOWER(`accounts`.`name`)"], :as => :sort_account_name, :type => :string
>
> >> in my define_index block. Then I run the following Sphinx search:
>
> >> @staff_results = Account.search query,
> >>   :conditions     => conditions,
> >>   :page           => params[:page],
> >>   :group_function => :attr,
> >>   :group_by       => "sort_account_name",
> >>   :group_clause   => sort,
> >>   :without        => {:has_staffs => 0}
>
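To make the intent of that search concrete, here is a plain-Ruby sketch of the same two filters applied in memory: drop accounts with no owner, then keep one account per lower-cased name. This is not how Sphinx does it internally; the `Account` struct and the sample data are made up for illustration.

```ruby
# Hypothetical in-memory stand-in for the accounts table.
Account = Struct.new(:name, :staff_ids)

accounts = [
  Account.new("jsmith", [1]),
  Account.new("JSmith", [2]),   # same name, different case
  Account.new("orphan", []),    # no owner
  Account.new("backup", [3])
]

# (b) exclude accounts without an owner
owned = accounts.reject { |a| a.staff_ids.empty? }

# (a) collapse duplicates by lower-cased name, keeping the first one seen
unique = owned.group_by { |a| a.name.downcase }.map { |_, group| group.first }

puts unique.map(&:name).sort.inspect  # prints ["backup", "jsmith"]
```

Grouping on the lower-cased name is what the `sort_account_name` attribute provides on the Sphinx side; the sketch only shows which rows are expected to survive.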
> >> Which solves my biggest problem. I still have the issue that one
> >> account can have many owners--but I have not begun that work. I also
> >> just noticed, after reviewing some logs, that if ":sortable => true"
> >> is enabled, you create a "<column>_sort" attribute. I haven't tried
> >> using this in the above "group_by" entry, yet.
>
> >> The biggest use of Sphinx (for me) is that it lets me minimize the
> >> size of my MySQL indexes (thus speeding up MySQL) and instead use
> >> Sphinx to quickly crawl text fields. For example, a unique Unix
> >> account could be described as an account (case-sensitive) per server.
> >> There are several platforms/accounts being warehoused. My account
> >> database has 634,000 records. A MySQL search for this account would
> >> be ungodly, since InnoDB lacks full-text indexing.
>
> >> Another issue I've had is figuring out that I needed to set up the
> >> charset table for Sphinx, so it would index various special
> >> characters--some user/group/resource names can have those tucked
> >> away. Of special note are @ (at sign), $ (dollar sign), # (hash/pound
> >> symbol), and parentheses, period, hyphen, underscore, etc.
>
> >> I solved that in the sphinx.yml and it looks like:
>
> >> development:
> >>   morphology: "none"
> >>   charset_table: "0..9, A..Z->a..z, _, a..z, U+410..U+42F->U+430..U+44F, U+430..U+44F, U+0024, $, @, *, ., -, U+0028, (, U+0029, ), \"#\""
>
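Why the charset table matters: Sphinx treats any character missing from charset_table as a separator, so a name like `svc$backup@host` is split into several words unless `$` and `@` are listed. A rough Ruby model of that behaviour follows. It is a simplification, not Sphinx's tokenizer; the `tokenize` helper and its letters-and-digits baseline are assumptions made for illustration.

```ruby
# Simplified model: lower-case the text, then split on every character
# that is not in the allowed set (letters, digits, plus any extras).
def tokenize(text, extra_chars = "")
  allowed = ("a".."z").to_a + ("0".."9").to_a + extra_chars.chars
  text.downcase.chars
      .chunk { |ch| allowed.include?(ch) }  # runs of allowed / disallowed chars
      .select { |is_word, _| is_word }      # keep only the allowed runs
      .map { |_, chars| chars.join }
end

puts tokenize("svc$backup@host").inspect        # prints ["svc", "backup", "host"]
puts tokenize("svc$backup@host", "$@").inspect  # prints ["svc$backup@host"]
```

With the extra characters allowed, the whole account name is indexed as one word, which is what a literal search needs.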
> >> I'm saying some things you probably already know--but I'm hoping
> >> google indexes my post and saves other developers from the
> >> psychological trauma that I experienced.
>
> >> I'll be using Sphinx also as part of a web page, but every search term
> >> will be literal--there's not much use for wordlists, stemming, etc.,
> >> even in that situation.
>
> >> Hope this gives good insight into my experience. As for how to
> >> notify, that would be a question of how Rails plugin / gem install
> >> stuff works.
>
> >> My first question would be if you can issue a notice on screen when
> >> you first install a plugin. Or if "gem install" lets you output
> >> something, similar to a license agreement.
>
> >> If there's no easy, verbose way to do it--then I think you should have
> >> the next update look for a "sphinx.yml" file. If it doesn't exist,
> >> create it from a boilerplate in which your current defaults remain
> >> the default. But below them, include a commented-out line that
> >> overrides them.
>
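The "create sphinx.yml if it is missing" idea could look roughly like this. It is a sketch only: the `ensure_sphinx_config` helper, the boilerplate contents, and the path are assumptions, not Thinking Sphinx code.

```ruby
require "fileutils"
require "tmpdir"

# Hypothetical boilerplate: keeps the old default and shows how to override it.
BOILERPLATE = <<~YAML
  development:
    morphology: stem_en
    # morphology: none   # uncomment to disable stemming
YAML

# Writes the boilerplate only when no config file exists yet.
# Returns true if a file was created, false if one was already there.
def ensure_sphinx_config(path)
  return false if File.exist?(path)
  FileUtils.mkdir_p(File.dirname(path))
  File.write(path, BOILERPLATE)
  true
end

Dir.mktmpdir do |root|
  path = File.join(root, "config", "sphinx.yml")
  puts ensure_sphinx_config(path)  # first call creates the file
  puts ensure_sphinx_config(path)  # second call leaves it untouched
end
```

Because an existing file is never overwritten, anyone who has already chosen a morphology keeps their setting.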
> >> Another way is to intentionally break the existing plugin-install url
> >> for Sphinx--so people have to go look at your webpage and pay
> >> attention.
>
> >> I can think of more ideas. I'd be happy to contribute to TS, but I'm
> >> still new to Ruby/Rails (coming from Perl) and I want to avoid the
> >> risk of committing bad code.
>
> >> I wrote a lot :( Thank you.
>
> >> On May 14, 11:58 pm, Pat Allan <[email protected]> wrote:
> >>> Fair points, even if you're a little worked up about it.
>
> >>> When I was last doing some refactoring of the TS Configuration class,
> >>> I considered removing the default morphology, but didn't because
> >>> people were already using TS working on the (yes, barely documented)
> >>> assumption that it *is* the default.
>
> >>> So, I agree that the default should be nothing, with people setting
> >>> one if they want it... but how do we deprecate it cleanly? Just
> >>> removing it is easy to do, but a warning would be nice, except we
> >>> don't want that warning appearing *every* time ts:in is run, or
> >>> something like that.
>
> >>> Suggestions welcome.
>
> >>> Also, re: your grouping issue, care to elaborate?
>
> >>> --
> >>> Pat
>
> >>> On 14/05/2009, at 12:04 PM, aitrus wrote:
>
> >>>> Pat, I love Thinking Sphinx and I appreciate everything you've done
> >>>> for Rails.
>
> >>>> Having said that.... for the love of god, please don't set defaults
> >>>> like this. I didn't even know what was going on. I'm doing an import
> >>>> on hundreds of thousands of records and the full-text search of Sphinx
> >>>> makes this so much faster.
>
> >>>> But apparently you're setting the morphology to "stem_en" as a
> >>>> default. I can't find anything about this behavior and it took me
> >>>> forever to figure out that this was the actual issue. I have spent
> >>>> hours trying to figure out why "AB0E" also matched "AB0S". In fact, I
> >>>> didn't even realize this was an issue until after I developed
> >>>> everything, and began to QA my records.
>
> >>>> Sweet jesus :( Please organize this in a way that is either obvious
> >>>> or painstakingly documented.
>
> >>>> I had another issue with TS, where I was trying to group results based
> >>>> on certain columns (via has_many and h_m:through). Such a nightmare.
>
> >>>> I really appreciate your work, but there needs to be some kind of
> >>>> emphasis on documenting various assumptions before implementation. Or
> >>>> maybe, at least, just have:
>
> >>>> rake ts:in --no-stems
>
> >>>> Sigh.
>
>