On 6/23/06, Mark Hedges <[EMAIL PROTECTED]> wrote:
>
> I tried to explain on the Mason list that DBIx::Class is "fork-
> and thread-safe" in a discussion on how (why not) to cache a
> DBI connection at Apache start-up.
>
> It occurred to me I don't really know what this means, and I
> couldn't find in previous discussions or the DBIC::Storage::DBI
> man page what actually happens.
>

It's a rather complicated can of worms, is what it is :)

To begin with, fork/thread-safety goes beyond just apache + $worker issues and the startup stages of such an app. I, for instance, have written command-line utilities and long-running system daemons that use DBIx::Class and fork themselves whenever they feel it's convenient. Because DBIx::Class supports this, my DBIC-related code doesn't need to know anything about that, or do anything about it. $schema just keeps working as expected after the fork for both child and parent, even if one or the other exits.

The basic underlying issue is that if you get a $dbh via DBI->connect, then fork off a child, the parent and child share the connection, and they don't play nice with each other. DBI documents that this is unsafe, and in practice you can see warnings and errors from this (it's a race thing, so sometimes you can get away with it for a little while, but it will bite you eventually). DBI makes no effort to detect the situation or do anything about it; things just break.

Threads will also by default share a connection if you let them share a $dbh, but instead of seemingly working and then randomly failing later, DBI will throw exceptions as soon as you try to touch the $dbh from the wrong thread, and likely terminate your app/worker/whatever.

So, from the point of view of the application or module author who is using DBI directly, one has to be careful, every time one forks (or spawns a thread), to obtain a fresh new database handle for the new process / thread to avoid problems. You also have to be careful what you do with the old one, as the $dbh destructor will close the physical (by that I mean socket) connection. So if you fork off a child, and the child does "undef $dbh", this will kill the parent's connection too by default. The $dbh attribute "InactiveDestroy" is used to work around that particular issue.

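For illustration, the manual routine described above might be sketched with plain DBI roughly as follows. This is only a sketch under assumptions: it presumes DBD::SQLite is installed and uses an in-memory database as a stand-in DSN, and the helper name connect_fresh is made up for the example. None of those come from the discussion itself.

```perl
use strict;
use warnings;
use DBI;

# Hypothetical helper: returns a brand-new handle on every call.
# Assumption: DBD::SQLite is available; the in-memory DSN is a stand-in
# for whatever database the application really talks to.
sub connect_fresh {
    return DBI->connect('dbi:SQLite:dbname=:memory:', '', '',
                        { RaiseError => 1, PrintError => 0 });
}

my $dbh = connect_fresh();

my $pid = fork();
die "fork failed: $!" unless defined $pid;

if ($pid == 0) {
    # Child: tell the inherited handle not to close the underlying
    # connection when it is destroyed, then let go of it.
    $dbh->{InactiveDestroy} = 1;
    undef $dbh;

    # Child now obtains its own connection and uses only that one.
    my $child_dbh = connect_fresh();
    my ($one) = $child_dbh->selectrow_array('SELECT 1');
    $child_dbh->disconnect;
    exit($one == 1 ? 0 : 1);
}

# Parent: wait for the child, then check that its own handle still works.
waitpid($pid, 0);
my $child_status = $? >> 8;
my ($parent_ok) = $dbh->selectrow_array('SELECT 1');
$dbh->disconnect;
```

The point of the sketch is the ordering in the child: set InactiveDestroy first, drop the inherited handle second, and only then connect again. With a real network database, skipping the first step is what takes the parent's connection down.
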
For a generically-useful ORM like DBIx::Class, the larger issue is that since we are not the application, we can't really know when or if the app author is going to fork or thread. I suppose if just *before* any forking or threading operation the app author did a $schema->storage->disconnect, that would solve the issue right there. But they often won't, or don't know where to, or potentially don't even have direct control of the forking/threading code (as is the case with apache worker modules).

So the most robust answer was to build support into DBIx::Class to automatically detect that the process or thread context has changed and take appropriate measures as necessary to use DBI safely and correctly, which frees the user from ever having to worry about all this crap. You just use it, and it just works, and you can keep your $schema across forks and threads just fine.

> If I connect in the startup.pl, does that mean that each forked
> child shares the connection? (I'm guessing no.)
>

With straight DBI, yes, and that breaks things. With DBIx::Class, the first time each child tries to use its connection, it will first detect that the PID has changed, then set InactiveDestroy, undef its $dbh, and reconnect. This is transparent to the user of DBIx::Class.

> If I connect in startup.pl under an Apache2 threaded worker
> model, does that mean each mod_perl thread shares the
> connection?
>

Same answer as above - with straight DBI if you connect in startup.pl you will have issues, but with DBIx::Class you can connect in startup.pl and everything works fine. Each new process and/or thread gets its own connection (but won't actually make that connection until it tries to access the database and "notices" that the pid/thread has changed out from under it).

> How does this actually work under FastCGI?

It all depends on what FastCGI environment you're in and how you're using it, but I think normally it's a non-issue for FastCGI, as the workers are all separate procs to begin with (spawned by the FCGI proc manager).

> Is there any way for multiple processes or threads to really
> share a DBIC connection?

If by that you mean multiple procs/threads sharing a DBI connection via DBIC: not at the moment, but in theory yes. There are modules out there on CPAN that go about this by multiplexing the requests of many procs/threads into a single "db worker" proc/thread which handles everything via a single connection (or a pool of connections, but the important thing is n_threads > n_conns). They use locking to make sure only one proc/thread can really access a connection at a time. We could make a Storage::DBI subclass that works similarly. Note that there are issues with respect to transactions and other potentially (accidentally) shared state between the multiplexed processes that must be dealt with one way or another.

Personally, I don't think it's worth much in most normalish real-world scenarios. On reasonable platforms idle connections really don't cost much at all. Distributing the same txn load over 5 or 50 connections shouldn't really change much for most people - some perhaps, but enough to be worth the added complexity? But if someone finds themselves in a situation where this would be beneficial, feel free to write the support for it, or bug someone else to, or sponsor someone else to, etc :)

-- Brandon

_______________________________________________
List: http://lists.rawmode.org/cgi-bin/mailman/listinfo/dbix-class
Wiki: http://dbix-class.shadowcatsystems.co.uk/
IRC: irc.perl.org#dbix-class
SVN: http://dev.catalyst.perl.org/repos/bast/trunk/DBIx-Class/
Searchable Archive: http://www.mail-archive.com/[email protected]/
