On May 27, 8:11 pm, Paulo Pereira <[email protected]> wrote:
> Hi everyone,
>
> I'm currently working on a project which will execute a rake task from
> time to time, and said taks involves a complex recommendation
> algorithm that can be accomplished with a complex SQL query.
>
> I may use a single one to do everything, or a lot of them, one per
> item to be computed. The nature of the query isn't in cause here, it's
> just
>
> #################
> ######## ONE SINGLE QUERY
> query "INSERT INTO deviations(pivot_id, deviant_id, value)
> SELECT
>   ratings.content_id as pivot_id,
>   ratings2.content_id as deviant_id,
>   AVG(ratings.rate - ratings2.rate) as value
> FROM ratings
> INNER JOIN ratings as ratings2
>   ON ratings.viewer_id = ratings2.viewer_id
>   AND ratings.viewer_type = ratings2.viewer_type
>   AND ratings.content_id < ratings2.content_id
> GROUP BY pivot_id, deviant_id"
>
> ActiveRecord::Base.connection().execute(query)
>
> ######## MULTIPLE QUERIES
> sql = ActiveRecord::Base.connection()
> ids.each do |id|
>   query = "INSERT INTO deviations(pivot_id, deviant_id, value)
>   SELECT
>     #{id.to_i} as pivot_id,
>     ratings2.content_id as deviant_id,
>     AVG(ratings.rate - ratings2.rate) as value
>   FROM ratings
>   INNER JOIN ratings as ratings2
>     ON ratings.viewer_id = ratings2.viewer_id
>     AND ratings.viewer_type = ratings2.viewer_type
>     AND ratings.content_id = #{id.to_i}
>     AND ratings.content_id < ratings2.content_id
>   GROUP BY deviant_id"
>   sql.execute(query)
> end
> #################
>
> Just for curiosity sake, I have an index on [viewer_id, viewer_type,
> content_id]
>
> I teste performance, for the single query variant and the multiple
> queries variant, and then changed MySQL so I can have more memory and
> the temp tables aren't stored in the hard drive.
> After changing MySQL settings, the single query variant improved,
> (268sec vs 80sec) but the multiple queries variant took almost twice
> the time. (190sec vs 390sec) What the...??

hard to say without knowing what you changed. Take a look at your
query plans (use explain) to see if your queries could be better. If
your tables are innodb you'll might also want to increase the innodb
buffer pool size
>
> I checked process info, and it's just using one of the four available
> CPUs.
>
> What can I do to improve this performance?
> How to use all the available CPUs?

> Is it to create multiple threads inside the rake task, and then use a
> connection per task with multiple queries??
> Am I doing something fundamentally wrong here?
>
Running multiple queries in parallel won't help if you are IO bound
(also unfortunately running a mysql query with the standard ruby mysql
gem will block all threads in the ruby interpreter).

Fred


> Thanks in advance for some tips, I'm really out of ideas, and this is
> my master thesis that ends in two weeks :(
>
> Cheers,
>
> Paulo Pereira
>
> THE NEW MYSQL OPTIONS:
> [mysqld]
> sort_buffer_size=2M
> read_buffer_size=2M
> join_buffer_size=2M
>
> read_rnd_buffer_size=2M
>
> max_heap_table_size=256M
> tmp_table_size=256M
>
> myisam_sort_buffer_size=64M
>
> thread_cache=256
>
> query_cache_type=1
> query_cache_limit=1M
> query_cache_size=32M
--~--~---------~--~----~------------~-------~--~----~
You received this message because you are subscribed to the Google Groups "Ruby 
on Rails: Talk" group.
To post to this group, send email to [email protected]
To unsubscribe from this group, send email to 
[email protected]
For more options, visit this group at 
http://groups.google.com/group/rubyonrails-talk?hl=en
-~----------~----~----~----~------~----~------~--~---

Reply via email to