On 08/22/2018 01:14 PM, Andy Armstrong wrote:

I've got a database of TV and Radio programmes [1] which currently uses DBI and 
ad-hoc queries. I'm considering migrating it to use DBIC.

The two most important tables are 'programmes' - which contains about 5.5 
million TV and Radio programme listings and 'services' which lists 90 different 
channels on which those programmes have been broadcast over time. There are 
lots of other tables but those two are hit by all the hot paths through the 

The programmes table is large enough that there's no point in trying to cache 
it at query level - generally speaking we're hitting it completely randomly and 
if, say, a particular programme gets a lot of hits that's taken care of by CDN 

The services table is small, almost completely static (i.e. I don't mind a 
manual restart if it changes) and, in the current implementation, cached 
entirely in memory.

Anywhere we display a programme we also need to have its service.

programmes -> belongs_to -> services
services -> has_many -> programmes

Most of the pages listing programmes (search, schedule) can be generated with a 
single query to programmes. However for each programme I have to look up its 
service. That's currently very quick because the services are cached in memory.

With DBIC every call to $prog->service->name results in a query against 
services - so for a page of 20 search results that's 20 additional queries. Even on a 
single page it's often the case that many of those queries are searching for the same 

I've played around with DBIx::Class::Cursor::Cached but that doesn't seem to 
cache belongs_to lookups at all. Here's what I tried:

   my $cache = Cache::FileCache->new( { namespace => 'GenomeCache' } );

     { cache_object => $cache,
       cache_for    => 3600

It seems to cache my top level search against programmes but not the subsequent 
queries for $prog->service->name.

What's the state of the art for dealing with such static data? Ideally I'd like an interface 
that amounts to a switch that I can throw on a per-table basis to say "this is 
reference data". Obviously I can just implement my own $prog->service that finds the 
service in a static cache but that feels messy compared to having a DBIC based 
implementation of services that understands that it's completely in-core cacheable.


[1] https://genome.ch.bbc.co.uk

This seems like a simple schema, not sure what caching would buy you unless the DB is not a local service on the same box or network (or getting several hundred requests per second). Generally, in my apps when it comes to more complex queries, I first write the query in raw SQL and benchmark it. Then I convert it over to a DBIC resultset (which normally means using the prefetch option) and bench mark that. If it much slower than the raw SQL query, I might look into only fetching the columns I need (via the 'columns => []' property). If that doesn't speed it up enough, I'll then switch to using the hashref inflator resultsource class which usually brings it right in line raw SQL performance.


List: http://lists.scsys.co.uk/cgi-bin/mailman/listinfo/dbix-class
IRC: irc.perl.org#dbix-class
SVN: http://dev.catalyst.perl.org/repos/bast/DBIx-Class/
Searchable Archive: http://www.grokbase.com/group/dbix-class@lists.scsys.co.uk

Reply via email to