Thank you, many points to consider.  I haven't intentionally used deferred
columns before; I'll look into them.

On Mon, Jun 29, 2009 at 11:32 AM, Mike Orr <[email protected]> wrote:

>
> On Sun, Jun 28, 2009 at 6:31 PM, Ian Jamieson <[email protected]> wrote:
> > I have a similar question, but what I'm interested in is how to
> > deal with a large number of records?
> > In the previous example, if there are thousands of users,
> > paginate could end up with something like 200 pages.
> > Suppose one day I had to do some sort of online report,
> > how do I try not use all the server resources?
>
> That's basic SQL planning.  Keep all the hard work inside the database
> engine, and fetch as little data into Python as possible.  Add
> indexes for columns used in 'where' or 'order by' clauses.  Take
> advantage of aggregate functions and 'group by'.  Use 'limit' and
> 'offset' if you're only interested in a subset of results (this is
> what Paginate does).  Both MySQL and Postgres have extensive
> documentation on how to optimize queries; look for the 'EXPLAIN' and
> 'ANALYZE' commands.
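
[Editor's sketch of the 'limit'/'offset' point above, not from the thread: the
`users` table and its columns are invented, using SQLAlchemy Core against an
in-memory SQLite database.]

```python
import sqlalchemy as sa

# Hypothetical users table -- the names are illustrative, not from the thread.
metadata = sa.MetaData()
users = sa.Table(
    "users", metadata,
    sa.Column("id", sa.Integer, primary_key=True),
    sa.Column("name", sa.String(50)),
    sa.Column("created", sa.Integer),
)
# Index the column used in ORDER BY so the database can page cheaply.
sa.Index("ix_users_created", users.c.created)

engine = sa.create_engine("sqlite://")
metadata.create_all(engine)

with engine.begin() as conn:
    conn.execute(users.insert(), [
        {"name": "user%d" % i, "created": i} for i in range(100)
    ])
    # Page 3, 20 rows per page: the database does the sorting and slicing,
    # and Python receives only the 20 rows it will actually display.
    page = conn.execute(
        sa.select(users.c.id, users.c.name)
        .order_by(users.c.created.desc())
        .limit(20)
        .offset(40)
    ).fetchall()

print(len(page))        # 20 rows fetched, not 100
print(page[0].name)     # "user59" -- newest first, third page
```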
>
> If your table has several large text fields that you use only
> occasionally, put them in a 'defer' group in the mapper, so they will
> only be fetched when you tell it to.
>
> # Model
> orm.mapper(Entry, t_entry, properties={
>    "content": orm.deferred(t_entry.c.content, group="details"),
>    },
>    order_by=[t_entry.c.entry_date.desc()])
>
> # Controller code, or class method in model
> q = meta.Session.query(Entry)
> q = q.options(orm.undefer_group("details"))
>
> Here I defer all columns except those used in searches and index
> pages.  The main query will ignore them, but SQLAlchemy will fetch
> them just-in-time if I access them.  But I know which pages I need
> them for -- the details page and the modify form -- and those happen
> to be the pages where I fetch only one record.  So I undefer the group
> on those pages, and SQLAlchemy includes them in the main query.
>
> Sometimes it's more efficient to fetch slightly more than you need, if
> you can leverage an index, and each record is short.  Say you have to
> do some complex search or reporting that's cumbersome to translate to
> SQL.  Fetch a single column and the primary key, using 'where' to
> knock off any chunks that are easy to specify.  Then do your final
> calculation in Python.  This is also useful if you have to do two
> different calculations on the same column; maybe in different subsets
> of records.  You can make one inclusive query that fetches the records
> for both, and then do both calculations in parallel in Python.
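
[Editor's sketch of the "one inclusive query, two calculations" idea above, in
plain Python; the rows and their meaning are invented for illustration.]

```python
# Rows as a single inclusive query might return them: (primary key, amount).
rows = [(1, 10.0), (2, -4.0), (3, 7.5), (4, -1.5), (5, 3.0)]

# Two different calculations over the same column, on different subsets of
# records, done in one pass instead of two separate queries.
credit_total = 0.0
debit_total = 0.0
for pk, amount in rows:
    if amount >= 0:
        credit_total += amount
    else:
        debit_total += amount

print(credit_total)  # 20.5
print(debit_total)   # -5.5
```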
>
> When you have to do ad hoc calculations over a large number of
> records, and you only need one or two columns, make a SQL builder
> query rather than an ORM query.  It's several times faster.
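
[Editor's sketch of a SQL-builder query, not from the thread: a hypothetical
`entry` table stands in for the thread's `t_entry`, using SQLAlchemy Core
against in-memory SQLite.]

```python
import sqlalchemy as sa

metadata = sa.MetaData()
# Hypothetical table standing in for t_entry in the thread.
t_entry = sa.Table(
    "entry", metadata,
    sa.Column("id", sa.Integer, primary_key=True),
    sa.Column("title", sa.String(100)),
    sa.Column("content", sa.Text),
)

engine = sa.create_engine("sqlite://")
metadata.create_all(engine)

with engine.begin() as conn:
    conn.execute(t_entry.insert(), [
        {"title": "entry %d" % i, "content": "body %d" % i} for i in range(3)
    ])
    # SQL-builder query: no ORM objects are constructed, just the two
    # columns we need, returned as lightweight row tuples.
    titles = conn.execute(
        sa.select(t_entry.c.id, t_entry.c.title).order_by(t_entry.c.id)
    ).fetchall()

print(titles[0].title)   # "entry 0"
```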
>
> You can also precalculate report statistics at night and save them in
> cache tables.  So if your report shows monthly results, you can put
> the intermediate results in a Monthly table, and then the report will
> come up faster on demand.
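
[Editor's sketch of the cache-table idea above, not from the thread: the
`sales` and `monthly` tables and their columns are invented, using SQLAlchemy
Core against in-memory SQLite.]

```python
import sqlalchemy as sa

metadata = sa.MetaData()
# Hypothetical source and cache tables -- the names are illustrative.
sales = sa.Table(
    "sales", metadata,
    sa.Column("id", sa.Integer, primary_key=True),
    sa.Column("month", sa.String(7)),   # e.g. "2009-06"
    sa.Column("amount", sa.Integer),
)
monthly = sa.Table(
    "monthly", metadata,
    sa.Column("month", sa.String(7), primary_key=True),
    sa.Column("total", sa.Integer),
)

engine = sa.create_engine("sqlite://")
metadata.create_all(engine)

with engine.begin() as conn:
    conn.execute(sales.insert(), [
        {"month": "2009-05", "amount": 10},
        {"month": "2009-05", "amount": 15},
        {"month": "2009-06", "amount": 20},
    ])
    # Nightly job: aggregate once and store the intermediate results.
    summary = sa.select(
        sales.c.month, sa.func.sum(sales.c.amount)
    ).group_by(sales.c.month)
    conn.execute(monthly.insert().from_select(["month", "total"], summary))

    # The report now reads the small cache table instead of re-aggregating.
    report = conn.execute(
        sa.select(monthly).order_by(monthly.c.month)
    ).fetchall()

print(report)  # [('2009-05', 25), ('2009-06', 20)]
```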
>
> --
> Mike Orr <[email protected]>
>

--~--~---------~--~----~------------~-------~--~----~
You received this message because you are subscribed to the Google Groups
"pylons-discuss" group.
-~----------~----~----~----~------~----~------~--~---