80,000 records is a small enough data set that you should have a lot of options available.
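
To make one of those options concrete, here is a rough sketch of querying an in-memory collection with hash-style conditions. It is plain Ruby, not ActiveRecord: the Consumption struct, its attribute names, the sample rows and the `where` helper are all made up for illustration, and in a real app the rows would come from a single query up front.

```ruby
# Hypothetical stand-in for an ActiveRecord model. The attribute names and
# the sample rows below are invented for this sketch.
Consumption = Struct.new(:id, :precinct, :building_type, :year, :amount)

# A thin wrapper that adds hash-style conditions to an in-memory array.
class InMemoryCollection
  include Enumerable

  def initialize(records)
    @records = records
  end

  def each(&block)
    @records.each(&block)
  end

  # where(precinct: "North", year: 2009..2010) returns a new collection.
  # Conditions are matched with ===, so a value, a Range or a Regexp works.
  # The name "where" is this sketch's own choice, not a library call.
  def where(conditions)
    matches = @records.select do |record|
      conditions.all? { |attr, expected| expected === record[attr] }
    end
    self.class.new(matches)
  end
end

records = InMemoryCollection.new([
  Consumption.new(1, "North", "house", 2009, 120.0),
  Consumption.new(2, "North", "office", 2009, 300.0),
  Consumption.new(3, "South", "house", 2010, 95.5),
  Consumption.new(4, "North", "house", 2010, 130.0)
])

# Ad-hoc subsets, without committing to a single hash key in advance.
north_houses = records.where(precinct: "North", building_type: "house")
recent = records.where(year: 2010..2011)

# Enumerable methods still work for aggregation.
total_by_precinct = records.group_by(&:precinct)
                           .transform_values { |rs| rs.sum(&:amount) }
```

Each `where` call is a linear scan, which is probably fine at this size. If one particular lookup turns out to be hot, a `group_by` index on that attribute can sit alongside the general scan.
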
Just taking a step back from the problem:

1) Do you have a specific, measurable goal for your performance?
2) How big is the gap between your goal and where you are now?

On May 13, 2:43 pm, Sean Seefried <[email protected]> wrote:
> Hi all,
>
> I've got a question that I hope generates healthy debate and perhaps
> even a solution for me. Without going into too much detail I'm
> working on a project in which we perform calculations on a large
> hierarchical data set. We haven't used the acts_as_tree or
> acts_as_nested plugins because each level in the hierarchy has a well
> defined role and various attributes that only fit at that level in the
> hierarchy. To give you a brief taste the hierarchy roughly goes: local
> government area, precinct, building type, consumption.
>
> For a given local government area the number of records contained in
> the entire hierarchy is about 80000, the bulk of them being
> consumption records (since they are the leaves of the hierarchy). A
> feature of the project is that one should be able to perform a
> projection of consumptions into the future. This is a fairly complex
> algorithm and involves looking at all 80000 records and combining them
> in various ways.
>
> This algorithm, naively written, has a huge database latency. The bulk
> of the time is spent querying and receiving results from the
> database.
>
> We have had some success in optimising various parts of the algorithm
> by performing fewer queries. A lot of the time this means that we pull
> the records out of the database and put them into some kind of look-up
> structure (a hash with a key equal to the attributes (plural) of
> interest in the model). This allows us to do a kind of "in memory"
> query but, annoyingly, only on whatever we choose as the key for the
> hash. You can no longer perform general queries on the collection in
> memory. Basically we lose all the expressiveness/terseness of
> ActiveRecord.
>
> What we really want to be able to do is this:
>
> 1. We want to pull a large collection of objects from the database
> into memory
> 2. We want to be able to select subsets of these in-memory objects
> with a similar flexibility
> to querying them using ActiveRecord
> 3. After having updated them in memory we want to be able to write
> them back to the database. I should mention that none of their unique
> keys will have changed. We want this to be some kind of bulk update.
>
> Some approaches that I've thought of but I don't think work:
> - Simply caching will not work because, as far as I know, this only
> works when you perform the same query twice. We are not doing this.
> We're querying the collection in many different ways.
> - This isn't really an issue to do with the kind of database.
> Switching over to CouchDB or an Object Database doesn't obviously
> solve our problem. The problem is that although we know in advance
> that all our queries will return results that are a subset of the
> collection of 80000 objects we want to be able to perform many
> different sorts of queries returning many different subsets of the
> 80000 objects. Having the objects all sitting in memory really seems
> to be the way to go.
> - We could also just create a class hierarchy that mirrors the
> hierarchy of the data and forget about ActiveRecord entirely. We could
> then just serialize this structure and write it to disk or a database.
> We don't get the advantage of being able to query the data structure
> this way though.
>
> Some final notes:
> a) This site will, at most, have a few simultaneous clients so having
> all 80000 records in memory should not be a problem.
> b) We've had some luck with the bulk-update part of point 3 above. Zach
> Dennis' AR-extensions plug-in has been quite useful.
>
> This email is still not as clear as I wanted it to be even though I've
> spent some time on it. If you need any clarification please feel free
> to ask.
>
> Cheers,
>
> Sean
>
> --
> You received this message because you are subscribed to the Google Groups
> "Ruby or Rails Oceania" group.
> To post to this group, send email to [email protected].
> To unsubscribe from this group, send email to
> [email protected].
> For more options, visit this group
> at http://groups.google.com/group/rails-oceania?hl=en.
