On Thu, Apr 03, 2014 at 03:22:56PM +0100, Tim Bunce wrote: > > > > Here are some of my conclusions: > > - Test::Database should only return workspaces / databases that have > > already been configured, and assume that the necessary persmissions for > > testing are granted (create table, drop table, etc) > > - Test::Database must support a much larger set of configuration files, > > located in various places: system (/etc), global ($HOME), local (cwd). > > That last bit would allow some projects to provide their own config file > > (as RIBASUSHI wants to do with DBIx::Class) > > Would the config be the 'union' of all the found config files?
A priori yes, but I guess it depends on the type of things found in the config (the reference behaviour for this would be git config files). If necessary the configuration items in the most specific file would override those in the more general one. So DSN would be collected from all the available files, but if we come up with some other configuration parameters that can influence the selection of the dsn, that might influence the list. For example, maybe the requests could be partially stored in the config file. If some project only supports MySQL and SQLite, that could be written in the project's local configuration, and the scripts wouldn't need to give a specific request when requesting a dsn. > > With this in mind, Test::DSN would replace the legacy > > Test::Database::Handle, and provide the dsn itself (pointing to a database > > where the test script can create and drop its tables), the connection > > information, and a set of attributes describing the workspace. > > So Test::DSN is an extra module in the Test-Database distro? I think so, for now. > > There are other issues that come to mind, like how to deal with parallel > > testing, or even multiple different test suites using the same database > > dsn at the same time. Maybe Test::Database could provide some sort of > > optional locking mechanism, but I've have a hard time coming up with a > > good one. Modules like DBIT already take that into account, by creating > > tables with names explicitely designed to avoid clashes. > > I think it's reasonable for Test::Database/Test::DSN to declare that > "the database you get conected to may be in active use by other > applications, and/or concurrent instances of your application, > so you need to take appropriate care." > > In other words, I think it's outside the scope of Test::Database/Test. OK. > Some logic to assist with table naming and cleanup etc will be > developed as part of DBIT and could be extracted. If so, I think it > belongs in a separate distro. (Separation of Concerns etc.) And if Test::Database only deals with giving out Test::DSN objects, then it should not need to care about naming issues (except for file-based DSN, but that's a simpler problem). Does the logic already exist somewhere as code or module? > > If some locking is provided, > > If table naming and cleanup is done right then locking shouldn't be needed. That's a relief. :-) > > I think it probably must be at the table > > level, i.e. a script will declare that it wants exclusive access to tables > > X, Y, Z when requesting a dsn. The function will block (there's probably > > a need for a timeout, to avoid having test scripts hanging forever) > > until it can return a dsn with exclusive access to those tables. > > This is a potential can of worms. There are modules on CPAN that provide > this kind of functionality. It would be good to use one of them if > possible - or provide a mechanism for third-party modules to plug in to. Agreed. > > Another option is that Test::Database could provide a function that > > returns a unique prefix for a given script, so that the test script > > can use that when creating the tables. That would provide isolated > > "namespaces", and then we don't have to deal with locking directly. > > With that namespace, we can even let Test::Database do things like > > "drop all tables in that namespace". > > That's the approach DBIT is taking. There's a basic start in here: > https://github.com/perl5-dbi/DBI-Test/blob/master/sandbox/tim/lib/DBI/Test/FixtureProvider/GenericBase.pm#L49 > I'd be very happy to see that factored out into a separate distro. > I'd rather it wasn't shipped within Test-Database though as I think it > has wider uses. OK. Do you have a name in mind already? The namespace:: top-level seems to be mostly about Perl namespaces. It seems to me that it's something that should have various "strategies" to generate a "namespace". Like, which character set is available, are there any length limits, that kind of thing. The constraints are different for table names, database names, etc. And I'll expand by just copying and pasting the comments from DBI::Test::FixtureProvider::GenericBase: # could also provide methods to: # - supply a regex that matches the get_dbit_temp_name name # - parse a get_dbit_temp_name to extract especially the yymmddhh # - drop all get_dbit_temp_name tables older than N hours (0=all) > > As for file-based DBD, they should be handled specifically: > > - Test::Database can already create a unique Test::DSN for any of those > > - Test::Database would only have "drivers" for file-based DSN, as these > > would be the only case where database creation needs to be delegated > > to Test::Database. > > - if a test wants specific options, they could be part of the request, > > and influence the database creation proceass > > I'd caution against making a hard distinction between file and non-file DBDs > as it may influence the design it ways that cause limitations later. > > It seems to me that the key distinction is "can we reliably and safely > create and delete a private database". That's true for file-based DBDs > (ie create a subdirectory and rm -rf it afterwards) but may also be true > for some non-file DBDs in some situations. > > Even for file-based DBDs a user may want to define several DSNs for a > given DBD each with various combinations of extra arguments (such as > mldbm_type=$mldbm_type,dbm_type=$dbm_type for DBD::DBM). > > So providing a 'private database creation service' requires editing the > DSN that's provided in the config file in order to add-in the details of > the specific newly created database (eg file path or database/schema name). > > So, rather than have separate kinds of config entries (dsn= and driver_dsn=) > perhaps have a single dsn= entry and define a 'placeholder' string that > indicates that a private database should be created. Something like this: > > dbi:DBM:f_dir={CREATE_DATABASE} > dbi:Oracle:sid={CREATE_DATABASE} > > If Test::Database sees a placeholder it would pass the dsn to some > driver-specific code that would create the database and edit the DSN. > Extra params could influence the creation: > > {CREATE_DATABASE:foo=bar,bing=pop} > I really like this. The notion of "private database" is both very explicit and nicely generic. So after having read the config files, Test::Database has: - a list of plain DSN - a list of DSN that can be used by the private database creation service to produce a DSN on demand (similar to the current driver_dsn) I think that the warning you gave above ("the database you get connected to may be in active use by other applications, and/or concurrent instances of your application, so you need to take appropriate care.") only applies to the first kind of DSN. So, for DBD::CSV, a dsn for the private database creation service would look like this: dbi:CSV:f_dir={CREATE_DATABASE} And following Tux's preferences, it could also be: dbi:CSV:f_dir={CREATE_DATABASE};f_ext=.csv/r Now there are plenty of combinations of parameters that make up valid but slightly different DSN. I think that Test::Database should only provide one of those by default (to avoid combinatorial explosion), to be dropped if the configuration happens to provide one of more DSN for that DBD. So, for all supported DBD, Test::Database would provide a private database factory in the form of a DSN with the {CREATE_DATABASE} placeholder. And actually, that's really something that each Test::Database::Driver should provide. For the "private databases", we can even provide various levels of privacy (as offered by the namespace-providing module), like DSN that are private to the project (the namespace is generated using the project top-level directory), or to a given test script (the namespace is generated using the script full path). This is something that should be part of the request done by the test script: # this DSN is expected to not be shared with anything else @handles = Test::Database->handles( privacy_level => 'script' ); # this DSN will be shared across test scripts in the same project, # enabling some 000-startup.t to setup some data to be used by later # scripts, for example @handles = Test::Database->handles( privacy_level => 'project' ); I'm not sure which privacy level is a good default. Probably script-level. Looking at the mapping files Test::Database left in my /tmp, I realized that the consequence of `dzil test` creating a new directory under .build every time it's run is that naively using cwd as the key to differentiate between different projects is probably not enough. The best location for private file-based DSN is probably cwd itself, under some hidden directory. In the dzil case, they would be implicitely be removed with the temporary build directory when the test suite runs successfully. So a .test-database/ directory at the project level would be the home for the project local configuration file, and the various file-based private databases created for the project. > > I've said above that all DSN should come from configuration files, but > > there are still a few that can be discovered automatically, like the > > test databases for MySQL and Pg. Should these be automatically included > > in the list of DSN if they are found? > > IIRC I think we decided that "well defined test databases" (ie those > with standard locations/ports and publically known username/password) > should be included implicitly, if there's no explicit config entries for > that driver. I don't have a strong opinion on it though. > Agreed. > > Looking at some of the existing test suites using Test::Database, I've > > noted the following: > > - it should be possible to make negative queries (e.g. "any DSN, but > > not for SQLite2 or DBM) > > Rather than complicate the API, can't users just grep the results: > > @handles = grep { $_->mumble ne 'SQLite2' } $foo->handles; > > That seems like an idiom worth supporting (via suitable attribute > methods on the handles) and promoting in the docs. Indeed. One does not prevent the other, so we can have a nice and simple declarative interface to the method, and anything really complex can be programmed using the Test::DSN attributes and grep. It should be possible, as I've noted above, to also enable some sort of project-level filter capabilities in the local config file. For example to indicate things are are supported or unsupported throughout the whole project. This should be doable declaratively in the config files. -- Philippe Bruhat (BooK) Just because you believe it does not make it so. (Moral from Groo The Wanderer #54 (Epic))