Thank you, many points to consider. I haven't intentionally used 'deferring' before; I'll look into that.
On Mon, Jun 29, 2009 at 11:32 AM, Mike Orr <[email protected]> wrote:
>
> On Sun, Jun 28, 2009 at 6:31 PM, Ian Jamieson <[email protected]> wrote:
> > I have a similar question, but what I'm interested in is how to
> > deal with a large number of records?
> > In the previous example, if there are thousands of users,
> > paginate ends up with something like 200 pages.
> > Suppose one day I had to do some sort of online report;
> > how do I try not to use all the server resources?
>
> That's basic SQL planning. Keep all the hard work inside the database
> engine, and fetch as little data into Python as necessary. Add
> indexes for columns used in 'where' or 'order by' clauses. Take
> advantage of aggregate functions and 'group by'. Use 'limit' and
> 'offset' if you're only interested in a subset of results (this is
> what Paginate does). Both MySQL and Postgres have extensive
> documentation on how to optimize queries; look for the 'EXPLAIN' and
> 'ANALYZE' commands.
>
> If your table has several large text fields that you use only
> occasionally, put them in a 'defer' group in the mapper, so they will
> only be fetched when you tell it to.
>
> # Model
> orm.mapper(Entry, t_entry, properties={
>     "content": orm.deferred(ec.content, group="details"),
>     },
>     order_by=[ec.entry_date.desc()])
>
> # Controller code, or class method in model
> q = meta.Session.query(Entry)
> q = q.options(orm.undefer_group("details"))
>
> Here I defer all columns except those used in searches and index
> pages. The main query will ignore them, but SQLAlchemy will fetch
> them just-in-time if I access them. But I know which pages I need
> them for -- the details page and the modify form -- and those happen
> to be the pages where I fetch only one record. So I undefer the group
> on those pages, and SQLAlchemy includes them in the main query.
>
> Sometimes it's more efficient to fetch slightly more than you need, if
> you can leverage an index, and each record is short.
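To make the limit/offset point above concrete, here is a minimal sketch of what Paginate is doing under the hood. It uses the stdlib sqlite3 module rather than MySQL/Postgres/SQLAlchemy so it runs standalone, and the `users` table, `fetch_page` helper, and page sizes are all made up for illustration:

```python
import sqlite3

# In-memory database standing in for the real users table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany("INSERT INTO users (name) VALUES (?)",
                 [("user%04d" % i,) for i in range(1000)])

# Index the column used in ORDER BY so the engine can walk it directly.
conn.execute("CREATE INDEX ix_users_name ON users (name)")

def fetch_page(conn, page, per_page=20):
    """Fetch one page of users; LIMIT/OFFSET keeps the rest in the database."""
    offset = (page - 1) * per_page
    cur = conn.execute(
        "SELECT id, name FROM users ORDER BY name LIMIT ? OFFSET ?",
        (per_page, offset))
    return cur.fetchall()

page10 = fetch_page(conn, 10)
print(len(page10), page10[0][1])   # 20 user0180
```

Only one page of rows ever crosses into Python; the sorting and skipping happen inside the engine, against the index.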
> Say you have to do some complex search or reporting that's cumbersome
> to translate to SQL. Fetch a single column and the primary key, using
> 'where' to knock off any chunks that are easy to specify. Then do your
> final calculation in Python. This is also useful if you have to do two
> different calculations on the same column, maybe in different subsets
> of records. You can make one inclusive query that fetches the records
> for both, and then do both calculations in parallel in Python.
>
> When you have to do ad hoc calculations over a large number of
> records, and you only need one or two columns, make a SQL builder
> query rather than an ORM query. It's several times faster.
>
> You can also precalculate report statistics at night and save them in
> cache tables. So if your report shows monthly results, you can put
> the intermediate results in a Monthly table, and then the report will
> come up faster on demand.
>
> --
> Mike Orr <[email protected]>
>
> --~--~---------~--~----~------------~-------~--~----~
> You received this message because you are subscribed to the Google Groups "pylons-discuss" group.
> To post to this group, send email to [email protected]
> To unsubscribe from this group, send email to [email protected]
> For more options, visit this group at http://groups.google.com/group/pylons-discuss?hl=en
> -~----------~----~----~----~------~----~------~--~---
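P.S. Sketching the "one inclusive query, two calculations in Python" idea for my own notes. Again sqlite3 stands in for the real engine, and the `orders` table, regions, and amounts are invented for the example:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, region TEXT, amount REAL)")
conn.executemany("INSERT INTO orders (region, amount) VALUES (?, ?)",
                 [("north", 10.0), ("north", 30.0), ("south", 5.0),
                  ("south", 15.0), ("west", 100.0)])

# One inclusive query: WHERE knocks off the easy chunk (skip 'west'),
# and we fetch only the columns the calculations actually need.
rows = conn.execute(
    "SELECT region, amount FROM orders WHERE region IN ('north', 'south')"
).fetchall()

# Two different calculations over the same fetched column, in one pass.
totals, maxima = {}, {}
for region, amount in rows:
    totals[region] = totals.get(region, 0.0) + amount
    maxima[region] = max(maxima.get(region, 0.0), amount)

print(totals)  # {'north': 40.0, 'south': 20.0}
print(maxima)  # {'north': 30.0, 'south': 15.0}
```

One round trip to the database, a narrow result set, and both aggregates fall out of a single loop instead of two separate queries.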
