Works perfectly! Thanks for your work on this plugin, and sorry for not paying attention :)
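
For anyone who finds this thread later: the fix in Pat's message quoted below is a "morphology: stem_en" entry under each environment in config/sphinx.yml. Here is a minimal Ruby sketch that builds that same structure and prints it as YAML. The helper name sphinx_settings and its default environment list are my own illustration, not part of Thinking Sphinx, and the sketch only prints the YAML rather than writing the file.

```ruby
require "yaml"

# Hypothetical helper (not a Thinking Sphinx API): builds the
# per-environment settings hash described in the quoted message.
def sphinx_settings(morphology, environments = %w[development test production])
  environments.each_with_object({}) do |env, config|
    # Each environment gets its own morphology entry.
    config[env] = { "morphology" => morphology }
  end
end

# Print the YAML; paste the result into config/sphinx.yml if it looks right.
puts sphinx_settings("stem_en").to_yaml
```

Passing a different value, such as "none", gives the literal-matching setup that aitrus describes further down the thread.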
On Jul 21, 5:41 pm, Pat Allan <[email protected]> wrote:
> Or I can just cut and paste that message here:
>
> With the release of Thinking Sphinx 1.1.18, there is one important change
> to note: previously, the default morphology for indexing was 'stem_en'.
> The new default is nil, to avoid any unexpected behavior. If you wish to
> keep the old value though, you will need to add the following settings to
> your config/sphinx.yml file:
>
> development:
>   morphology: stem_en
> test:
>   morphology: stem_en
> production:
>   morphology: stem_en
>
> To understand morphologies/stemmers better, visit the following link:
> http://www.sphinxsearch.com/docs/manual-0.9.8.html#conf-morphology
>
> Hope this helps.
>
> Cheers
>
> --
> Pat
>
> On 21/07/2009, at 4:24 PM, jim wrote:
>
> > cool but, how do you turn on stemming. Sorry haven't read this entire
> > post yet but was betting that when I did I'd see my answer. Plus, I
> > think I remember seeing some info/post install notes on the screen
> > when I installed TS. I was going to re-install and look at that again
> > real close.
> >
> > On Jul 21, 4:48 pm, Pat Allan <[email protected]> wrote:
> >> Sorry for the confusion Jim. I'll update the documents to remove the
> >> mention of the default stemming.
> >
> >> --
> >> Pat
> >
> >> On 21/07/2009, at 3:35 PM, jim wrote:
> >
> >>> aaaaaaaaaaaaaahhhhhhhhhhhhhhhh !!!!!!!!!!!!!!! :)
> >
> >>> On May 27, 5:15 pm, Pat Allan <[email protected]> wrote:
> >>>> Okay, I've made the change. Anyone now installing Thinking Sphinx via
> >>>> plugin or gem gets a warning, and morphology has a default of nil.
> >
> >>>> I'll remove the warning at some point, maybe in a couple of months.
> >
> >>>> Cheers
> >
> >>>> --
> >>>> Pat
> >
> >>>> On 17/05/2009, at 10:08 PM, Pat Allan wrote:
> >
> >>>>> You did indeed write a lot, but that's okay, provides a more thorough
> >>>>> understanding.
> >
> >>>>> Grouping is probably the best way to do what you've done for the
> >>>>> account names - so your approach seems right to me (even though it's
> >>>>> not a perfect solution). As much as I can comprehend the problem at
> >>>>> the moment, anyway :)
> >
> >>>>> As for alerting people to the removal of the default morphology, I
> >>>>> like the idea of having messages when the plugin or gem is installed
> >>>>> (and both of those should be doable, I'm almost certain).
> >
> >>>>> If you want to have a go at forking and patching, be my guest - for
> >>>>> plugins, I think PLUGIN_ROOT/install.rb is what should hold code that
> >>>>> gets run on installation (it might be housed under
> >>>>> PLUGIN_ROOT/rails/install.rb since Rails 2.1). No idea what the
> >>>>> process is for gems, but the rspec gem outputs a message, so TS
> >>>>> should be able to as well.
> >
> >>>>> Otherwise, when I have the time and motivation, I'll attempt it
> >>>>> myself - which is fine by me, but don't be afraid to give it a shot
> >>>>> yourself.
> >
> >>>>> Cheers
> >
> >>>>> --
> >>>>> Pat
> >
> >>>>> On 15/05/2009, at 12:35 PM, aitrus wrote:
> >
> >>>>>> Hi Pat,
> >
> >>>>>> Thanks again for your work on TS. Sorry, I get worked up easily. To
> >>>>>> answer your question, first:
> >
> >>>>>> I'm doing some data warehouse-ish applications. I pull in lots of
> >>>>>> data from various systems. Then I use things like account names,
> >>>>>> group names, resource names, host names, etc., to find unique
> >>>>>> records.
> >
> >>>>>> When it comes to grouping, I have an association setup of
> >>>>>> Personnel/Divisions <-- ownership --> Accounts.
> >
> >>>>>> A person can have many accounts. However, each person has only one
> >>>>>> personnel record.
> >>>>>> If I render a search in Sphinx, it paginates the Personnel
> >>>>>> records--then if I try to display accounts, the pagination is very
> >>>>>> strange.
> >
> >>>>>> So, I needed to do a search in Sphinx on the Accounts table, (a)
> >>>>>> eliminating duplicate account names, and (b) eliminating accounts
> >>>>>> with no owner (took some digging to figure out I need to have a
> >>>>>> "has" attribute).
> >
> >>>>>> The way I eventually got this to work (after much whiskey and
> >>>>>> self-mutilation) is to setup:
> >
> >>>>>>   has staffs(:id), :as => :has_staffs, :type => :integer
> >>>>>>   has ["LOWER(`accounts`.`name`)"], :as => :sort_account_name,
> >>>>>>     :type => :string
> >
> >>>>>> in my define index. Then I run the following sphinx search:
> >
> >>>>>>   @staff_results = Account.search query,
> >>>>>>     :conditions => conditions, :page => params[:page],
> >>>>>>     :group_function => :attr, :group_by => "sort_account_name",
> >>>>>>     :group_clause => sort, :without => {:has_staffs => 0}
> >
> >>>>>> Which solves my biggest problem. I still have the issue that one
> >>>>>> account can have many owners--but I have not begun that work. I also
> >>>>>> just noticed, after reviewing some logs, that if ":sortable => true"
> >>>>>> is enabled, you create a "<column>_sort" attribute. I haven't tried
> >>>>>> using this in the above "group_by" entry, yet.
> >
> >>>>>> The biggest use of Sphinx (for me) is that it lets me minimize the
> >>>>>> size of my MySQL indexes (thus speeding up MySQL), and instead uses
> >>>>>> Sphinx to quickly crawl text fields. For example, a unique unix
> >>>>>> account could be described as an account (case-sensitive) per
> >>>>>> server. There's several platforms/accounts being warehoused. My
> >>>>>> account database has 634,000 records.
> >>>>>> A mysql search for this account would be ungodly, since InnoDB lacks
> >>>>>> fulltext indexing, etc.
> >
> >>>>>> Another issue I've had is figuring out that I needed to setup the
> >>>>>> Charset Table for Sphinx, so it would index various special
> >>>>>> characters--some user/group/resource names can have those tucked
> >>>>>> away. Of special note are @ (at-sign), $ (dollar-sign), #
> >>>>>> (hash/pound-symbol), and parentheses, period, hyphen, underscore,
> >>>>>> etc.
> >
> >>>>>> I solved that in the sphinx.yml and it looks like:
> >
> >>>>>> development:
> >>>>>>   morphology: "none"
> >>>>>>   charset_table: "0..9, A..Z->a..z, _, a..z, U+410..U+42F->U+430..U+44F, U+430..U+44F, U+0024, $, @, *, ., -, U+0028, (, U+0029, ), \"#\""
> >
> >>>>>> I'm saying some things you probably already know--but I'm hoping
> >>>>>> google indexes my post and saves other developers from the
> >>>>>> psychological trauma that I experienced.
> >
> >>>>>> I'll be using Sphinx also as part of a web page, but every search
> >>>>>> term will be literal--there's not much use for wordlists, stemming,
> >>>>>> etc, even in that situation.
> >
> >>>>>> Hope this gives good insight into my experience. As for how to
> >>>>>> notify, that would be a question of how Rails plugin / gem install
> >>>>>> stuff works.
> >
> >>>>>> My first question would be if you can issue a notice on screen when
> >>>>>> you first install a plugin. Or if "gem install" lets you output
> >>>>>> something, similar to a license agreement.
> >
> >>>>>> If there's no easy, verbose way to do it--then I think you should
> >>>>>> have the next update look for a "sphinx.yml" file. If it doesn't
> >>>>>> exist, create it with a boiler plate and have your current defaults
> >>>>>> remain the default. But below them, comment out a line that
> >>>>>> overrides it.
> >
> >>>>>> Another way is to intentionally break the existing plugin-install
> >>>>>> url for Sphinx--so people have to go look at your webpage and pay
> >>>>>> attention.
> >
> >>>>>> I can think of more ideas. I'd be happy to contribute to TS, but I'm
> >>>>>> still new to Ruby/Rails (coming from Perl) and I want to avoid the
> >>>>>> risk of committing bad code.
> >
> >>>>>> I wrote a lot :( Thank you.
> >
> >>>>>> On May 14, 11:58 pm, Pat Allan <[email protected]> wrote:
> >>>>>>> Fair points, even if you're a little worked up about it.
> >
> >>>>>>> When I was last doing some refactoring of the TS Configuration
> >>>>>>> class, I considered removing the default morphology, but didn't
> >>>>>>> because people were already using TS working on the (yes, barely
> >>>>>>> documented) assumption that it *is* the default.
> >
> >>>>>>> So, I agree about the default being nothing, and people set it if
> >>>>>>> they want one.. but how do we deprecate it cleanly? Beyond just
> >>>>>>> removing it, which is easy to do, but a warning would be nice,
> >>>>>>> except we don't want that warning appearing *every* time ts:in is
> >>>>>>> run, or something like that.
> >
> >>>>>>> Suggestions welcome.
> >
> >>>>>>> Also, re: your grouping issue, care to elaborate?
> >
> >>>>>>> --
> >>>>>>> Pat
> >
> >>>>>>> On 14/05/2009, at 12:04 PM, aitrus wrote:
> >
> >>>>>>>> Pat, I love Thinking Sphinx and I appreciate everything you've
> >>>>>>>> done for Rails.
> >
> >>>>>>>> Having said that.... for the love of god, please don't set
> >>>>>>>> defaults like this. I didn't even know what was going on. I'm
> >>>>>>>> doing an import on hundreds of thousands of records and the
> >>>>>>>> full-text search of Sphinx makes this so much faster.
> >
> >>>>>>>> But apparently you're setting the morphology to "stem_en" as a
> >>>>>>>> default.
> >>>>>>>> I can't find anything about this behavior and it took me forever
> >>>>>>>> to figure out that this was the actual issue. I have spent hours
> >>>>>>>> trying to figure out why "AB0E" also matched "AB0S". In fact, I
> >>>>>>>> didn't even realize this was an issue until after I developed
> >>>>>>>> everything, and began to QA my records.
> >
> >>>>>>>> Sweet jesus :( Please organize this in a way that is either

...

--~--~---------~--~----~------------~-------~--~----~
You received this message because you are subscribed to the Google Groups "Thinking Sphinx" group.
To post to this group, send email to [email protected]
To unsubscribe from this group, send email to [email protected]
For more options, visit this group at http://groups.google.com/group/thinking-sphinx?hl=en
-~----------~----~----~----~------~----~------~--~---
