On Sun, Jan 29, 2012 at 06:12:50PM +0000, Dave Mitchell wrote:
> 
> I've now written some working caching code, that reduces CPU usage on
> while ($sth->fetch()) {$c++} from 15.23s to 13.10s, which is a 14%
> saving!! The test code just fetches millions of rows from a 2-int table
> and then doesn't do anything real with the returned data, so a real-world
> example will see a smaller saving, but I'm still very pleased with it :-)

Nice.

> It's not ready for use yet; it passes the DBI test suite, but:
>  * I haven't tested it under ithreads or early perls yet;
>  * I haven't tried it against the DBD::mysql test suite yet;
>  * I haven't written any tests yet for the method cache invalidation,
>       e.g. if code does *DBD::mysql::st::foo = sub {...},
>    will subsequent calls to $sth->foo() call the new function?
> 
> As regards that last point, can you recommend where I should place the new
> tests? Is there an existing test script that they can be added to, or if a
> new script needs adding, what should it be called, and is there a good
> existing script to use as a template?

The DBI test suite is, sadly, rather crufty. Lots of test files have far
outgrown their original scope.

I'd reccomend starting a new one, say 31methcache.t. You could use
DBD::Sponge (as many tests do, e.g. 30subclass.t) and create two
statement handles each preloaded with different data.
Then fetchrow_arrayref from one and fetchrow_hashref from the other,
for example, checking you get back the right data with the right
kind of reference. Just a thought.

> The code itself (see diff below) just attaches some magic to the outer CV
> that caches the GV of the corresponding inner method.

Any reason for not using the existing 'internal method attribute'
mechanism? _install_method() already allocates a structure and
attaches it the the CV using CvXSUBANY(cv).any_ptr.  You could add the
three elements of method_cache_t directly into dbi_ima_t.  Then, since
the dispatch code already has the structure pointer (ima) handy, your
cache code could avoid the mg_findext call and use ima->method_cache
directly.

> The caching code *does* assume that the name of the outer sub never
> changes, i.e. that GvNAME(CvGV(cv)) in XS_DBI_dispatch is constant WRT cv;
> is this a reasonable assumption?

Yes, I think so.

Though some kind of 'kill switch' to disable the cache might be handy,
in case odd things are happening and the cache is suspected.

> The code makes use of non-API knowledge about PL_generation and
> HvMROMETA(stash)->cache_gen, which could conceivably change in a later
> perl release.

Which seems like a good argument for extending the perl API to cover
that kind of functionality.

> > PS - I'm specifically being paid only to fix a performance issue on
> > non-threaded builds, so I won't be looking at threaded issues. But
> > dPERINTERP looks like bad news on threaded builds.
> 
> I've since been told that a) I'm allowed to say who's funding me:
> it's booking.com (thanks, guys!); and that b) depending on how much time I
> burn on the cache issue, I may have time to look at threaded issues too,
> specifically the dPERINTERP rewrite using MY_CXT - assuming that no one
> beats me to it!

That would be great. Thanks again Dave.

> Dave.
> 
> 
> Index: DBI.xs

Looks good.

Tim.

Reply via email to