Re: Is DBIx::Table::Inflate a good name for a CPAN module?

Jonathan Leffler Fri, 10 Aug 2012 17:28:32 -0700

On Sun, Aug 5, 2012 at 12:48 PM, "José Diaz Seng" <josediazs...@gmx.de> wrote:


> I am about to write a CPAN module having the working title
> DBIx::Table::Inflate and would like to hear your opinion about the name.
>
>  Your comments are very welcome, especially recommendations against the
> name or alternative names (and also any other suggestions regarding the
> capabilities of the module). I have read the module naming guide and posted
> a pretty detailed description of the module to the perl modules mailing
> list, see the thread starting with
> http://www.nntp.perl.org/group/perl.modules/2012/08/msg81608.html .
>
>  brian d foy suggested that I talk to this mailing list too to get
> additional advice, so I did.
>

And, in response to a further question from me (accidentally sent privately,
and hence responded to privately), José sent the following extra
information from the perl.org web-site.  I apologize if GMail doesn't wrap
the very long lines here sanely; I'm not sure how to twist its arm so that
it does.

> The module exposes only one method inflate() which can be used to add
> records to a table, mainly with performance tests of database clients
> in mind. inflate() automatically takes into account the different
> types of constraints on the target table and tries to generate
> meaningful records by using values coming from the target table itself
> or from referenced tables. A sample usage (subject to changes) looks
> like this:
>
> use DBIx::Table::Inflate;
>
> my $inflation = DBIx::Table::Inflate->new();
>
> $inflation->inflate({
>  dbh => $dbh,
>  table_name => $table,
>  target_size => $size,
>  num_random => $random,
>  max_tree_depth => $depth,
>  min_children => $min_children,
>  min_roots => $min_roots
>  });
>
>  The parameters are as follows:
>
> dbh: DBI database handle
>
> table_name: name of the table to be inflated
>
> target_size: total number of records to be reached
>
> num_random: the first num_random records use fresh random choices
> for their values taken from foreign key relations or from the target table 
> itself. For performance reasons, these values are stored in a cache and 
> re-used for the remaining target_size - num_random records.
>
> The last three parameters become relevant only in case of self-references
> of the target table (current definition of a self-reference: a one-column-wide
> foreign key constraint from the table to itself):
>
> max_tree_depth: maximum depth at which new records will be inserted
>
> min_children: minimum number of child records to be inserted (currently using 
> a breadth-first traversal)
>
> min_roots: the minimum number of root elements defined after completion. A 
> record is considered to be a root element if the corresponding parent id is 
> null or equal to its child id
>
>
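
If I have understood the num_random description correctly, the caching
could be sketched roughly as below.  This is only my reading, not José's
code: pick_values() and its arguments are names I have invented, the
candidate list stands in for values read from a referenced table, and the
round-robin reuse of the cache is my assumption, since the description
does not say how the cached values are reused.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of the proposed module).
# $candidates: array ref of values, e.g. keys read from a referenced table.
# Returns an array ref with one chosen value per record to be inserted.
sub pick_values {
    my ($candidates, $target_size, $num_random) = @_;
    die "num_random must be >= 1\n" if $target_size > 0 && $num_random < 1;

    my (@cache, @picked);
    for my $i (0 .. $target_size - 1) {
        if ($i < $num_random) {
            # Fresh random choice; remember it for later reuse.
            my $value = $candidates->[ int rand scalar @$candidates ];
            push @cache,  $value;
            push @picked, $value;
        }
        else {
            # Assumption: cycle through the cached choices in order.
            push @picked, $cache[ ($i - $num_random) % @cache ];
        }
    }
    return \@picked;
}
```

With three candidates, target_size 8 and num_random 3, records 4 to 8
would simply repeat the first three choices in order.
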
As already mentioned, there is a module DBIx::Table::Dup; this module fits
into the same general namespace reasonably sensibly.  I'm not
over-enthusiastic about the name 'Inflate', but at the moment I don't have
a good alternative.  I don't know whether a name such as
DBIx::Table::DataGenerator would be better?  Given that you have a database
handle, I assume it actively loads the data, rather than generating a file
for later loading.  Thus, it is arguably a
DBIx::Table::DataGeneratorAndLoader, or even
DBIx::Table::TestDataGeneratorAndLoader, but that gets rather unwieldy.
(OTOH, the ideas might set your thought processes in motion.)  If the
module is renamed, would you rename the method too?

The overall functionality looks reasonable (and the recursive table
structure handling is interesting).

I'm not sure if you would want to allow the user to specify a seed (for the
PRNG), so that they can get deterministic results, or provide a mechanism
to allow them to retrieve the seed used (or both).  Have you considered an
option to specify the size of transaction (to avoid overwhelming the
database and its logs)?  Have you considered what information goes in the
'table name'?  Does it allow for 'owner' or 'schema' or similar, or should
that be specifiable separately?  The $dbh specifies the database to be
used, which avoids some problems (notably, connection strings); that's good.
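
To make those three questions a little more concrete, here is a rough
sketch of what I have in mind.  None of this is from the proposed module:
init_prng(), split_table_name(), batch_sizes() and load_in_batches() are
names I made up, and the naive split on '.' ignores quoted identifiers
that contain dots.  The only DBI calls used are begin_work() and commit().

```perl
use strict;
use warnings;

# Hypothetical: use the caller's seed if given, otherwise invent one,
# and return it so that a run can be repeated later.
sub init_prng {
    my ($seed) = @_;
    $seed = time() ^ ($$ << 15) unless defined $seed;
    srand($seed);
    return $seed;
}

# Hypothetical: split 'schema.table' into (schema, table); a bare
# 'table' gives an undefined schema.  Quoted names are not handled.
sub split_table_name {
    my ($name) = @_;
    my @parts = split /\./, $name, 2;
    return @parts == 2 ? @parts : (undef, $parts[0]);
}

# Hypothetical: how many rows go into each transaction.
sub batch_sizes {
    my ($total, $per_txn) = @_;
    die "rows per transaction must be >= 1\n" if $per_txn < 1;
    my @sizes;
    while ($total > 0) {
        my $n = $total < $per_txn ? $total : $per_txn;
        push @sizes, $n;
        $total -= $n;
    }
    return @sizes;
}

# Hypothetical: commit after every batch rather than once at the end.
# Assumes AutoCommit is on, as begin_work() requires.
sub load_in_batches {
    my ($dbh, $total, $per_txn, $insert_row) = @_;
    for my $n (batch_sizes($total, $per_txn)) {
        $dbh->begin_work;
        $insert_row->() for 1 .. $n;
        $dbh->commit;
    }
    return;
}
```

Reporting the value returned by init_prng() would cover both halves of
the seed question: a user can pass a seed in, or read back the one that
was chosen and reuse it for the next run.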

-- 
Jonathan Leffler <jonathan.leff...@gmail.com>  #include <disclaimer.h>
Guardian of DBD::Informix - v2011.0612 - http://dbi.perl.org
"Blessed are we who can laugh at ourselves, for we shall never cease to be
amused."
