I believe PostgreSQL can run stored procedures written in Ruby (or Perl, Python, Lisp, C...).

One alternative I haven't seen suggested is to load the entire database into an in-memory SQLite instance. That would give you in-memory SQL access.
On May 13, 10:17 pm, Shanon McQuay <[email protected]> wrote:
> 80,000 records is a small enough data set that you should have a lot of
> options available.
>
> Just taking a step back from the problem:
>
> 1) Do you have a specific, measurable goal for your performance?
> 2) How big is the gap between your goal and where you are now?
>
> On May 13, 2:43 pm, Sean Seefried <[email protected]> wrote:
>
> > Hi all,
> >
> > I've got a question that I hope generates healthy debate and perhaps
> > even a solution for me. Without going into too much detail I'm
> > working on a project in which we perform calculations on a large
> > hierarchical data set. We haven't used the acts_as_tree or
> > acts_as_nested plugins because each level in the hierarchy has a well
> > defined role and various attributes that only fit at that level in the
> > hierarchy. To give you a brief taste the hierarchy roughly goes: local
> > government area, precinct, building type, consumption.
> >
> > For a given local government area the number of records contained in
> > the entire hierarchy is about 80000, the bulk of them being
> > consumption records (since they are the leaves of the hierarchy). A
> > feature of the project is that one should be able to perform a
> > projection of consumptions into the future. This is a fairly complex
> > algorithm and involves looking at all 80000 records and combining them
> > in various ways.
> >
> > This algorithm, naively written, has a huge database latency. The bulk
> > of the time is spent querying and receiving results from the
> > database.
> >
> > We have had some success in optimising various parts of the algorithm
> > by performing fewer queries. A lot of the time this means that we pull
> > the records out of the database and put them into some kind of look-up
> > structure (a hash with a key equal to the attributes (plural) of
> > interest in the model). This allows us to do a kind of "in memory"
> > query but, annoyingly, only on whatever we choose as the key for the
> > hash. You can no longer perform general queries on the collection in
> > memory. Basically we lose all the expressiveness/terseness of
> > ActiveRecord.
> >
> > What we really want to be able to do is this:
> >
> > 1. We want to pull a large collection of objects from the database
> >    into memory
> > 2. We want to be able to select subsets of these in-memory objects
> >    with a similar flexibility to querying them using ActiveRecord
> > 3. After having updated them in memory we want to be able to write
> >    them back to the database. I should mention that none of their
> >    unique keys will have changed. We want this to be some kind of
> >    bulk update.
> >
> > Some approaches that I've thought of but I don't think work:
> >
> > - Simply caching will not work because, as far as I know, this only
> >   works when you perform the same query twice. We are not doing this.
> >   We're querying the collection in many different ways.
> > - This isn't really an issue to do with the kind of database.
> >   Switching over to CouchDB or an Object Database doesn't obviously
> >   solve our problem. The problem is that although we know in advance
> >   that all our queries will return results that are a subset of the
> >   collection of 80000 objects we want to be able to perform many
> >   different sorts of queries returning many different subsets of the
> >   80000 objects. Having the objects all sitting in memory really seems
> >   to be the way to go.
> > - We could also just create a class hierarchy that mirrors the
> >   hierarchy of the data and forget about ActiveRecord entirely. We could
> >   then just serialize this structure and write it to disk or a database.
> >   We don't get the advantage of being able to query the data structure
> >   this way though.
> >
> > Some final notes:
> >
> > a) This site will, at most, have a few simultaneous clients so having
> >    all 80000 records in memory should not be a problem.
> > b) We've had some luck with the bulk-update part of point 3 above. Zach
> >    Dennis' AR-extensions plug-in has been quite useful.
> >
> > This email is still not as clear as I wanted it to be even though I've
> > spent some time on it. If you need any clarification please feel free
> > to ask.
> >
> > Cheers,
> >
> > Sean
> >
> > --
> > You received this message because you are subscribed to the Google Groups
> > "Ruby or Rails Oceania" group.
> > To post to this group, send email to [email protected].
> > To unsubscribe from this group, send email to
> > [email protected].
> > For more options, visit this group
> > at http://groups.google.com/group/rails-oceania?hl=en.
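
To make the trade-off in the quoted message concrete, here is a rough plain-Ruby sketch of the two styles of in-memory lookup it describes: a hash keyed on a fixed set of attributes, and a more general filter over the whole collection. Everything here is an assumption made for illustration: the Consumption struct stands in for the real model, the attribute names and sample rows are made up, and the where helper is hypothetical rather than anything from ActiveRecord or AR-extensions.

```ruby
# Hypothetical stand-in for an ActiveRecord model; names and data are made up.
Consumption = Struct.new(:id, :precinct, :building_type, :year, :amount)

records = [
  Consumption.new(1, "north", "residential", 2010, 120.0),
  Consumption.new(2, "north", "commercial",  2010, 300.0),
  Consumption.new(3, "south", "residential", 2010,  80.0),
  Consumption.new(4, "south", "residential", 2011,  90.0)
]

# Approach from the thread: a hash keyed on the attributes of interest.
# Lookups are fast, but only for queries on exactly this key.
by_precinct_and_type = records.group_by { |r| [r.precinct, r.building_type] }
north_residential = by_precinct_and_type[["north", "residential"]]

# A more general, linear-time filter. Each condition is matched with ===,
# so plain values, ranges and lambdas can all be used as conditions.
def where(records, conditions)
  records.select do |record|
    conditions.all? { |attr, expected| expected === record[attr] }
  end
end

south_2010 = where(records, precinct: "south", year: 2010)
large      = where(records, amount: 100.0..Float::INFINITY)

# Update in memory and keep the changed records for a later bulk write-back.
dirty = where(records, building_type: "residential").each { |r| r.amount *= 1.1 }
```

The general filter scans every record on each call, which is probably acceptable at 80000 rows but is not indexed the way the hash is. The dirty array is the set that would then be handed to whatever bulk-update mechanism is in use.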
