80,000 records is a small enough data set that you should have a lot of
options available.
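
As one illustration of those options, here is a minimal sketch of querying
records held in memory with plain Ruby Enumerable methods. The Consumption
struct and its fields are hypothetical stand-ins for your real ActiveRecord
models, so treat this as a starting point rather than a recommendation:

```ruby
# Hypothetical stand-in for an ActiveRecord model; the field names are assumptions.
Consumption = Struct.new(:id, :precinct_id, :building_type, :year, :amount)

# In a real app these would be loaded once from the database.
records = [
  Consumption.new(1, 10, "residential", 2009, 120.0),
  Consumption.new(2, 10, "commercial",  2009, 300.0),
  Consumption.new(3, 11, "residential", 2009,  80.0),
  Consumption.new(4, 11, "residential", 2010,  95.0)
]

# Ad hoc "query": any predicate works, at the cost of a linear scan.
residential_2009 = records.select do |r|
  r.building_type == "residential" && r.year == 2009
end

# Index on a single attribute for repeated lookups.
by_precinct = records.group_by(&:precinct_id)
total_for_precinct_11 = by_precinct.fetch(11, []).sum(&:amount)

# Index on a compound key (several attributes at once).
by_type_and_year = records.group_by { |r| [r.building_type, r.year] }
commercial_2009 = by_type_and_year.fetch(["commercial", 2009], [])
```

A linear scan over 80,000 small objects is usually cheap next to a database
round trip, so building an index with group_by is only worth it for lookups
you repeat many times.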

Just taking a step back from the problem:

1) Do you have a specific, measurable goal for your performance?
2) How big is the gap between your goal and where you are now?
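
To put numbers on those two questions, something like the following sketch
might help. It uses Ruby's standard Benchmark module; run_projection, the
sample data and the 2 second goal are all made-up placeholders for your own
algorithm and target:

```ruby
require "benchmark"

# Hypothetical stand-in for the real projection algorithm.
def run_projection(amounts)
  amounts.sum { |a| a * 1.02 }
end

# Times the given block and reports how far it is from the goal.
def measure_gap(goal_seconds)
  elapsed = Benchmark.realtime { yield }
  { elapsed: elapsed, goal: goal_seconds, gap: elapsed - goal_seconds }
end

amounts = Array.new(80_000) { |i| i.to_f }  # assumed sample data
result = measure_gap(2.0) { run_projection(amounts) }  # 2.0s is an assumed goal

puts format("elapsed: %.3fs, goal: %.1fs, gap: %+.3fs",
            result[:elapsed], result[:goal], result[:gap])
```

A positive gap means you are slower than the goal. Timing the database
queries and the in-memory work separately with the same helper would show
where the gap actually comes from.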

On May 13, 2:43 pm, Sean Seefried <[email protected]> wrote:
> Hi all,
>
> I've got a question that I hope generates healthy debate and perhaps
> even a solution for me.  Without going into too much detail I'm
> working on a project in which we perform calculations on a large
> hierarchical data set. We haven't used the acts_as_tree or
> acts_as_nested_set plugins because each level in the hierarchy has a well
> defined role and various attributes that only fit at that level in the
> hierarchy. To give you a brief taste the hierarchy roughly goes: local
> government area, precinct, building type, consumption.
>
> For a given local government area the number of records contained in
> the entire hierarchy is about 80000, the bulk of them being
> consumption records (since they are the leaves of the hierarchy). A
> feature of the project is that one should be able to perform a
> projection of consumptions into the future. This is a fairly complex
> algorithm and involves looking at all 80000 records and combining them
> in various ways.
>
> This algorithm, naively written, has a huge database latency. The bulk
> of the time is spent querying and receiving results from the
> database.
>
> We have had some success in optimising various parts of the algorithm
> by performing fewer queries. A lot of the time this means that we pull
> the records out of the database and put them into some kind of look-up
> structure (a hash with a key equal to the attributes (plural) of
> interest in the model).  This allows us to do a kind of "in memory"
> query but, annoyingly, only on whatever we choose as the key for the
> hash. You can no longer perform general queries on the collection in
> memory. Basically we lose all the expressiveness/terseness of
> ActiveRecord.
>
> What we really want to be able to do is this:
>
> 1. We want to pull a large collection of objects from the database
> into memory
> 2. We want to be able to select subsets of these in-memory objects
> with a similar flexibility
>    to querying them using ActiveRecord
> 3. After having updated them in memory we want to be able to write
> them back to the database. I should mention that none of their unique
> keys will have changed. We want this to be some kind of bulk update.
>
> Some approaches that I've thought of but I don't think work:
> - Simply caching will not work because, as far as I know, this only
> works when you perform the same query twice. We are not doing this.
> We're querying the collection in many different ways.
> - This isn't really an issue to do with the kind of database.
> Switching over to CouchDB or an Object Database doesn't obviously
> solve our problem. The problem is that although we know in advance
> that all our queries will return results that are a subset of the
> collection of 80000 objects we want to be able to perform many
> different sorts of queries returning many different subsets of the
> 80000 objects.  Having the objects all sitting in memory really seems
> to be the way to go.
> - We could also just create a class hierarchy that mirrors the
> hierarchy of the data and forget about ActiveRecord entirely. We could
> then just serialize this structure and write it to disk or a database.
> We don't get the advantage of being able to query the data structure
> this way though.
>
> Some final notes:
> a) This site will, at most, have a few simultaneous clients so having
> all 80000 records in memory should not be a problem.
> b) We've had some luck with the bulk-update part of point 3 above.  Zach
> Dennis' AR-extensions plug-in has been quite useful.
>
> This email is still not as clear as I wanted it to be even though I've
> spent some time on it. If you need any clarification please feel free
> to ask.
>
> Cheers,
>
> Sean
>
> --
> You received this message because you are subscribed to the Google Groups 
> "Ruby or Rails Oceania" group.
> To post to this group, send email to [email protected].
> To unsubscribe from this group, send email to 
> [email protected].
> For more options, visit this group 
> at http://groups.google.com/group/rails-oceania?hl=en.

