On Wed, Apr 02, 2014 at 09:23:13AM +0200, Philippe Bruhat (BooK) wrote:
> Hi,
> 
> During the last Perl QA Hackathon, I've been talking with TIMB, REHSACK,
> HMBRAND and RIBASUSHI about Test::Database.
> 
> Going back to the needs that people have, it seems to me that:
> - the main need is to obtain DSN to databases that match a certain
>   set of prerequisites
> - the prerequisites can vary wildly, and much more than what
>   Test::Database currently supports (the most obvious example being any
>   type of database that is behind a DBD::ODBC DSN)

See note about grep below.

> - with enough support from CPAN Testers, there is no real need to
>   *automatically* create databases (unless explicitely asked to, most
>   likely for the file-based databases)
> 
> Test::Database suffers from bad naming: Test::Database::Handle,
> Test::Database::Driver, driver_dsn are overloading names already used
> with slightly different meanings in DBI. These can probably be fixed,
> as the main interface for Test::Database is its handle / handles method.
> 
> It is also used by a number of modules on CPAN, so we have to be careful
> with backwards compatibility.
> 
> Here are some of my conclusions:
> - Test::Database should only return workspaces / databases that have
>   already been configured, and assume that the necessary persmissions for
>   testing are granted (create table, drop table, etc)
> - Test::Database must support a much larger set of configuration files,
>   located in various places: system (/etc), global ($HOME), local (cwd).
>   That last bit would allow some projects to provide their own config file
>   (as RIBASUSHI wants to do with DBIx::Class)

Would the config be the 'union' of all the found config files?

> - the configuration files would support a much more fine-grained
>   description for each DSN, and the characteristics would be described,
>   and not discovered (well, apart from the DBD version, which is probably
>   the only runtime attrbute). For example, the details of a database
>   engine to which one connect through an ODBC DSN probably can't easily
>   be discovered. These DSN would be Test::DSN objects, which have a larger
>   set of attributes than the current Test::Database::Handle objects,
>   but could share the same basic methods (dns, connection_info, etc)
> 
> With this in mind, Test::DSN would replace the legacy
> Test::Database::Handle, and provide the dsn itself (pointing to a database
> where the test script can create and drop its tables), the connection
> information, and a set of attributes describing the workspace. Test::Database
> would read its config files to get the list of available Test::DSN objects
> and their attributes, and provide a method to request them based on their
> attributes (e.g. give me a dsn for MySQL version 5.6 minimum, with such and
> such options configured).

So Test::DSN is an extra module in the Test-Database distro?

> There are other issues that come to mind, like how to deal with parallel
> testing, or even multiple different test suites using the same database
> dsn at the same time. Maybe Test::Database could provide some sort of
> optional locking mechanism, but I've have a hard time coming up with a
> good one. Modules like DBIT already take that into account, by creating
> tables with names explicitely designed to avoid clashes.

I think it's reasonable for Test::Database/Test::DSN to declare that
"the database you get conected to may be in active use by other
applications, and/or concurrent instances of your application,
so you need to take appropriate care."

In other words, I think it's outside the scope of Test::Database/Test.

Some logic to assist with table naming and cleanup etc will be
developed as part of DBIT and could be extracted. If so, I think it
belongs in a separate distro. (Separation of Concerns etc.)


> If some locking is provided,

If table naming and cleanup is done right then locking shouldn't be needed.

> I think it probably must be at the table
> level, i.e. a script will declare that it wants exclusive access to tables
> X, Y, Z when requesting a dsn. The function will block (there's probably
> a need for a timeout, to avoid having test scripts hanging forever)
> until it can return a dsn with exclusive access to those tables.

This is a potential can of worms. There are modules on CPAN that provide
this kind of functionality. It would be good to use one of them if
possible - or provide a mechanism for third-party modules to plug in to.


> Another option is that Test::Database could provide a function that
> returns a unique prefix for a given script, so that the test script
> can use that when creating the tables. That would provide isolated
> "namespaces", and then we don't have to deal with locking directly.
> With that namespace, we can even let Test::Database do things like
> "drop all tables in that namespace".

That's the approach DBIT is taking. There's a basic start in here:
https://github.com/perl5-dbi/DBI-Test/blob/master/sandbox/tim/lib/DBI/Test/FixtureProvider/GenericBase.pm#L49
I'd be very happy to see that factored out into a separate distro.
I'd rather it wasn't shipped within Test-Database though as I think it
has wider uses.


> As for file-based DBD, they should be handled specifically:
> - Test::Database can already create a unique Test::DSN for any of those
> - Test::Database would only have "drivers" for file-based DSN, as these
>   would be the only case where database creation needs to be delegated
>   to Test::Database.
> - if a test wants specific options, they could be part of the request,
>   and influence the database creation proceass

I'd caution against making a hard distinction between file and non-file DBDs
as it may influence the design it ways that cause limitations later.

It seems to me that the key distinction is "can we reliably and safely
create and delete a private database". That's true for file-based DBDs
(ie create a subdirectory and rm -rf it afterwards) but may also be true
for some non-file DBDs in some situations.

Even for file-based DBDs a user may want to define several DSNs for a
given DBD each with various combinations of extra arguments (such as
mldbm_type=$mldbm_type,dbm_type=$dbm_type for DBD::DBM).

So providing a 'private database creation service' requires editing the
DSN that's provided in the config file in order to add-in the details of
the specific newly created database (eg file path or database/schema name).

So, rather than have separate kinds of config entries (dsn= and driver_dsn=)
perhaps have a single dsn= entry and define a 'placeholder' string that
indicates that a private database should be created. Something like this:

    dbi:DBM:f_dir={CREATE_DATABASE}
    dbi:Oracle:sid={CREATE_DATABASE}

If Test::Database sees a placeholder it would pass the dsn to some
driver-specific code that would create the database and edit the DSN.
Extra params could influence the creation:

    {CREATE_DATABASE:foo=bar,bing=pop}


> I've said above that all DSN should come from configuration files, but
> there are still a few that can be discovered automatically, like the
> test databases for MySQL and Pg. Should these be automatically included
> in the list of DSN if they are found?

IIRC I think we decided that "well defined test databases" (ie those
with standard locations/ports and publically known username/password)
should be included implicitly, if there's no explicit config entries for
that driver. I don't have a strong opinion on it though.


> Looking at some of the existing test suites using Test::Database, I've
> noted the following:
> - it should be possible to make negative queries (e.g. "any DSN, but
>   not for SQLite2 or DBM)

Rather than complicate the API, can't users just grep the results:

    @handles = grep { $_->mumble ne 'SQLite2' } $foo->handles;

That seems like an idiom worth supporting (via suitable attribute
methods on the handles) and promoting in the docs.

> - I've seen the following used in several places:
>   $handle->{driver}->drop_database($handle->name);


Tim.

Reply via email to