Thanks a lot Pat, where do you want me to send?

On Mar 5, 12:44 am, Pat Allan <[email protected]> wrote:
> Hi Pedro
>
> Looks like there's something seriously wrong in the generated SQL statement. 
> If possible (and let's do this off-list), can I get access to the app's code 
> so I can reproduce the problem locally, and see if I can fix it?
>
> --
> Pat
>
> On 05/03/2011, at 9:55 AM, Pedro Cunha wrote:
>
>
>
>
>
>
>
> > Pat could you take a look at:http://dl.dropbox.com/u/1175879/output.zip
>
> > As you can see, this is an output for just 1 restaurant and it's
> > totally weird :\
>
> > Best Regards,
> > Pedro
>
> > On Mar 3, 6:57 am, Pat Allan <[email protected]> wrote:
> >> Hi Pedro
>
> >> Not that it's particularly helpful, but this is definitely not how Sphinx 
> >> behaves. It definitely should not take an hour toindex15 documents. It 
> >> shouldn't even take a minute, unless the query is particularly complex - 
> >> and this isn't what I'd call complex!
>
> >> The reason for adding the group_by call, is that PostgreSQL is strict with 
> >> the SQL it expects - anything that isn't within an aggregation function 
> >> should be mentioned in the GROUP BY clause.
>
> >> Thinking Sphinx will do this automatically for any fields it references to 
> >> - but it can't look into SQL snippets to figure out which columns are 
> >> used, hence why you need to manually add it yourself, because of your 
> >> CRC'd state attribute.
>
> >> The repeated data is definitely odd. Thinking Sphinx concatenates data 
> >> from columns in associations, but the query shouldn't duplicate data 
> >> (unless the record is attached to many associated records that look the 
> >> same - but that's not a SQL problem, that's a data problem).
>
> >> When you limit the query to one restaurant, can you confirm it's joining 
> >> only on the menu categories and menu items you're expecting? (Add the id 
> >> columns from both tables to the start of your query, and to the GROUP BY 
> >> clause).
>
> >> --
> >> Pat
>
> >> On 03/03/2011, at 6:13 AM, Pedro Cunha wrote:
>
> >>> Hello Pat,
>
> >>> Have been investigating last days and I'm almost at the same place.
>
> >>> I'm using PostgreSQL.
>
> >>> As far as I read we need to trust how postgres analyses queries and he
> >>> can choose or not to use indexes.
>
> >>> Herehttp://explain.depesz.com/s/sA9isa bit (only a bit..) more
> >>> detailed the EXPLAIN ANALYSE.
>
> >>> I've been digging through the query that TS generates, and for the 10
> >>> entities (restaurants) i'm trying toindex, 8500~ rows are created.
>
> >>> What is the reason for when including all "has x" to use "group by"?
>
> >>> Also I noticed something weird, imagine I pick one of the lines of the
> >>> output, and now pick a column, is it supposed to see lots of
> >>> information repeated?  like:
>
> >>> ".. Negócios Negócios Negócios Negócios Negócios Negócios Sair a dois
> >>> Negócios Negócios Negócios Negócios Negócios Negócios Negócios Sair a
> >>> dois Negócios Negócios Negócios Sair a dois Negócios Sair a dois
> >>> Descontrair Sair a dois Sair a dois Sair a dois Sair a dois Sair a
> >>> dois Sair a dois Sair a dois Sair a dois Sair a dois Sair a dois Sair
> >>> a dois Sair a dois Sair a dois Sair a dois Sair a dois Sair a dois
> >>> Sair a dois Sair a dois Sair a dois Sair a dois Sair a dois Sair a
> >>> dois Sair a dois Sair a dois Sair a dois Sair a dois Sair a dois Sair
> >>> a dois Sair a dois Sair a dois Sair a dois Sair a dois Sair a dois
> >>> Sair a dois Sair a dois Sair a dois Sair a dois Sair a dois Sair a
> >>> dois Sair a dois Sair a dois Sair a dois Sair a dois Sair a dois Sair
> >>> a dois Sair a dois Sair a dois Sair a dois Sair a dois Sair a dois
> >>> Sair a dois Sair a dois Sair a dois Sair a dois Sair a dois Sair a
> >>> dois Sair a dois Sair a dois ..."
>
> >>> Production database is at the moment with 15 restaurants.. and it's
> >>> taking 1h toindex
>
> >>> As DB grows, time is not incremental but exponential and this is
> >>> worrying me a lot. Also because delta indexing is taking around 15~
> >>> mins.
>
> >>> Is it supposed that 1:N associations decay so badly?
>
> >>> I'm running out of ideas and considering breaking references for this
> >>> case to save one inner join to happen.
>
> >>> Restaurant (1 : N ) Menu Categories
> >>> Menu Categories ( 1 : N ) Menu Items  <====== Shortcut this one
>
> >>> Best regards,
> >>> Pedro
>
> >>> On Feb 24, 3:59 am, Pat Allan <[email protected]> wrote:
> >>>> Hi Pedro
>
> >>>> Wow, PostgreSQL's EXPLAIN output is pretty complex!
>
> >>>> I'm not spotting anything obviously odd, but I'm not sure what to look 
> >>>> for.
>
> >>>> Perhaps you try it all again, but comment out all the fields and 
> >>>> attributes except those from menu items? Indexing should be a bit 
> >>>> faster, but I'd imagine there's still a speed hit to some extent. I'm 
> >>>> not sure if that'll shine any light on the situation though...
>
> >>>> Is this all on your development environment? If it's production or 
> >>>> staging, are you sure that the database indexes on the foreign keys 
> >>>> exist? (ie: latest migrations have been run)
>
> >>>> --
> >>>> Pat
>
> >>>> On 24/02/2011, at 11:47 AM, Pedro Cunha wrote:
>
> >>>>> Hello Pat,
>
> >>>>> Here is the link:
> >>>>>https://gist.github.com/841520
>
> >>>>> With:
> >>>>> - define_index block
> >>>>> - sql query on conf
> >>>>> - explain
> >>>>> - explain analyse
>
> >>>>> Thanks in advance
>
> >>>>> On Feb 23, 11:51 pm, Pat Allan <[email protected]> wrote:
> >>>>>> Hi Pedro
>
> >>>>>> Perhaps post the EXPLAIN and EXPLAIN ANALYSE outputs to a gist? To be 
> >>>>>> honest, given I've not used the PostgreSQL equivalents, I'm not sure 
> >>>>>> how much sense I'll be able to make out of it, but let's start with 
> >>>>>> that.
>
> >>>>>> Also, can you add the SQL query to the gist as well?
>
> >>>>>> Cheers
>
> >>>>>> --
> >>>>>> Pat
>
> >>>>>> On 24/02/2011, at 6:07 AM, Pedro Cunha wrote:
>
> >>>>>>> Hey Pat, I'm using PostgresSQL, and command is the same.
>
> >>>>>>> I tried looking at the output and it kinda looks like chinese to me.
> >>>>>>> From what I see, I can see no reference to indexes whatsoever. That
> >>>>>>> can be a problem. According to Postgres docs I must do an ANALYZE
> >>>>>>> <table_name>  to make pg server aware of anindexon that specific
> >>>>>>> table.
>
> >>>>>>> After I ve done that and run the EXPLAIN and EXPLAIN ANALYSE I still
> >>>>>>> can't see any reference to indexes. That might be a problem.
>
> >>>>>>> Do you want me to copy paste here the whole log or is there some way I
> >>>>>>> can attach that file to this thread? (it's a bit lengthy)
>
> >>>>>>> Best regards
>
> >>>>>>> On Feb 23, 12:31 am, Pat Allan <[email protected]> wrote:
> >>>>>>>> Hi Pedro
>
> >>>>>>>> Given you've got the database indexes, this definitely feels far 
> >>>>>>>> tooslow. Are you using MySQL or PostgreSQL?
>
> >>>>>>>> If you're using MySQL, then I would recommend taking the generated 
> >>>>>>>> sql_query value from the restaurant_core_0 source in your sphinx 
> >>>>>>>> configuration file, remove the $start and $end parts of the WHERE 
> >>>>>>>> clause, and add EXPLAIN to the start of the query. Then, run this 
> >>>>>>>> query in a MySQL console:
>
> >>>>>>>>   EXPLAIN SELECT ...
>
> >>>>>>>> This should give you some information on what database indexes are 
> >>>>>>>> being used for the query.
>
> >>>>>>>> I'm not sure what the equivalent of the EXPLAIN command for 
> >>>>>>>> PostgreSQL is, but I'd be surprised if such a thing didn't exist.
>
> >>>>>>>> Cheers
>
> >>>>>>>> --
> >>>>>>>> Pat
>
> >>>>>>>> On 23/02/2011, at 1:32 AM, Pedro Cunha wrote:
>
> >>>>>>>>> Hello there,
>
> >>>>>>>>> I'm having a weirdslowindexing. Tried looking for similar issues on
> >>>>>>>>> this group and couldn't find.
>
> >>>>>>>>> When Iindexan has_many through association, perfomance decays a lot.
> >>>>>>>>> To help trace this issue I ll give as much info as I can.
>
> >>>>>>>>> So the association looks like this:
>
> >>>>>>>>> Restaurant (1 : N ) Menu Categories
> >>>>>>>>> Menu Categories ( 1 : N ) Menu Items
>
> >>>>>>>>>Indexon table menu_categories for restaurant_id (DB level)
> >>>>>>>>>Indexon table menu_items for menu_category_id (DB level)
>
> >>>>>>>>> define_index on restaurant
> >>>>>>>>> ...
> >>>>>>>>>    indexes menu_categories.menu_items(:name), :as => :menu_item_name
> >>>>>>>>>    indexes menu_categories.menu_items(:subtitle), :as
> >>>>>>>>> => :menu_item_subtitle
> >>>>>>>>>    indexes menu_categories.menu_items(:description), :as
> >>>>>>>>> => :menu_item_description
> >>>>>>>>> ...
> >>>>>>>>> end
>
> >>>>>>>>> Without the indexes before, and with a lot of other stuff I get this
> >>>>>>>>> perfomance:
>
> >>>>>>>>> indexingindex'restaurant_core'...
> >>>>>>>>> collected 12 docs, 3.0 MB
> >>>>>>>>> collected 122936 attr values
> >>>>>>>>> sorted 0.2 Mvalues, 100.0% done
> >>>>>>>>> sorted 2.2 Mhits, 100.0% done
> >>>>>>>>> total 12 docs, 2973437 bytes
> >>>>>>>>> total 16.741 sec, 177611 bytes/sec, 0.71 docs/sec
> >>>>>>>>> distributedindex'restaurant' can not be directly indexed; skipping.
>
> >>>>>>>>> With the indexes
>
> >>>>>>>>> indexingindex'restaurant_core'...
> >>>>>>>>> collected 12 docs, 15.9 MB
> >>>>>>>>> collected 423920 attr values
> >>>>>>>>> sorted 0.8 Mvalues, 100.0% done
> >>>>>>>>> sorted 11.4 Mhits, 100.0% done
> >>>>>>>>> total 12 docs, 15885288 bytes
> >>>>>>>>> total 320.783 sec, 49520 bytes/sec, 0.03 docs/sec
> >>>>>>>>> distributedindex'restaurant' can not be directly indexed; skipping.
>
> >>>>>>>>> table ids are pretty small so I think that issue is not causing
> >>>>>>>>> problems.. still I tried adding the sql step config and performance
> >>>>>>>>> stays the same
>
> >>>>>>>>> Properties on define_index are:
>
> >>>>>>>>>    set_property :min_prefix_len => 3
> >>>>>>>>>    set_property :stopwords => "#{RAILS_ROOT}/config/sphinx/
> >>>>>>>>> stoplist.txt"
> >>>>>>>>>    set_property :delta => ThinkingSphinx::Deltas::ResqueDelta
>
> >>>>>>>>> Tried commenting min_prefix_len and perfomance keeps the same..
>
> >>>>>>>>> Running out of ideas :(
>
> >>>>>>>>> Best regards,
> >>>>>>>>> Pedro
>
> >>>>>>>>> --
> >>>>>>>>> You received this message because you are subscribed to the Google 
> >>>>>>>>> Groups "Thinking Sphinx" group.
> >>>>>>>>> To post to this group, send email to 
> >>>>>>>>> [email protected].
> >>>>>>>>> To
>
> ...
>
> read more »

-- 
You received this message because you are subscribed to the Google Groups 
"Thinking Sphinx" group.
To post to this group, send email to [email protected].
To unsubscribe from this group, send email to 
[email protected].
For more options, visit this group at 
http://groups.google.com/group/thinking-sphinx?hl=en.

Reply via email to