On Fri, Aug 10, 2012 at 05:27:47PM -0700, Jonathan Leffler wrote:
> On Sun, Aug 5, 2012 at 12:48 PM, "José Diaz Seng" <josediazs...@gmx.de> wrote:
> 
> > I am about to write a CPAN module having the working title
> > DBIx::Table::Inflate and would like to hear your opinion about the name.
> >
> >  Your comments are very welcome, especially recommendations against the
> > name or alternative names (and also any other suggestions regarding the
> > capabilities of the module). I have read the module naming guide and posted
> > a pretty detailed description of the module to the perl modules mailing
> > list, see the thread starting with
> > http://www.nntp.perl.org/group/perl.modules/2012/08/msg81608.html .
> >
> >  brian d foy suggested that I talk to this mailing list too to get
> > additional advice, so I did.

> The module exposes only one method inflate() which can be used to add
> records to a table, mainly with performance tests of database clients
> in mind. inflate() automatically takes into account the different
> types of constraints on the target table and tries to generate
> meaningful records by using values coming from the target table itself
> or from referenced tables. A sample usage (subject to changes) looks
> like this:

> > use DBIx::Table::Inflate;
> >
> > my $inflation = DBIx::Table::Inflate->new();
> >
> > $inflation->inflate({
> >  dbh =>$dbh,
> >  table_name => $table,
> >  target_size => $size,
> >  num_random => $random,
> >  max_tree_depth => $depth,
> >  min_children => $min_children,
> >  min_roots => $min_roots
> >  });
> >
> >  The parameters are as follows:
> >
> > dbh: DBI database handle
> > table_name: name of the table to be inflated
> > target_size: total number of records to be reached
> > num_random: The first num_random number of records use fresh random choices 
> > for their values taken from foreign key relations or from the target table 
> > itself. For performance reasons, these values are stored in a cache and 
> > re-used for the remaining target_size - num_random records.
> > The last three parameters become relevant only in case of 
> > self-references of the target table (current definition of a self-ref: 
> > a one-column-wide foreign key constraint to itself):
> > max_tree_depth: maximum depth at which new records will be inserted
> > min_children: minimum number of child records to be inserted (currently 
> > using a breadth-first traversal)
> > min_roots: the minimum number of root elements defined after completion. A 
> > record is considered to be a root element if the corresponding parent id is 
> > null or equal to its child id
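The num_random behaviour described above could be sketched roughly as
follows. This is only an illustration of the caching idea, not the
module's code: make_value_picker is a hypothetical helper, and cycling
through the cached values in order is an assumption (the description
only says that the cached values are re-used).

```perl
use strict;
use warnings;

# Hypothetical helper, not part of the proposed module. Returns a closure
# that picks values for one column. The first $num_random calls draw a
# fresh random value from @$candidates and remember it; later calls
# cycle through the remembered values instead of drawing again.
sub make_value_picker {
    my ($candidates, $num_random) = @_;
    die "need at least one candidate value\n" unless @$candidates;
    die "num_random must be at least 1\n"     unless $num_random >= 1;

    my @cache;
    my $calls = 0;
    return sub {
        my $value;
        if ($calls < $num_random) {
            # Fresh random choice, stored for later re-use.
            $value = $candidates->[ int rand @$candidates ];
            push @cache, $value;
        }
        else {
            # Re-use cached values in the order they were drawn.
            $value = $cache[ ($calls - $num_random) % @cache ];
        }
        $calls++;
        return $value;
    };
}

# Example: 2 fresh choices, then the same 2 values repeat.
my $pick   = make_value_picker([ 10, 20, 30 ], 2);
my @values = map { $pick->() } 1 .. 6;
```

With target_size = 6 and num_random = 2, only the first two values
involve a random choice; the remaining four come from the cache.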

> As already mentioned, there is a module DBIx::Table::Dup; this module fits
> into the same general namespace reasonably sensibly.  I'm not
> over-enthusiastic about the name 'Inflate', but at the moment I don't have
> a good alternative.

The term inflate is already closely associated with other kinds of
actions, so it wouldn't be a good choice for this module.
See http://search.cpan.org/search?query=inflate&mode=all

> I don't know whether a name such as
> DBIx::Table::DataGenerator would be better?  Given that you have a database
> handle, I assume it actively loads the data, rather than generating a file
> for later loading.  Thus, it is arguably a
> DBIx::Table::DataGeneratorAndLoader, or even
> DBIx::Table::TestDataGeneratorAndLoader, but that gets rather unwieldy.
> (OTOH, the ideas might set your thought processes in motion.)  If the
> module is renamed, would you rename the method too?

DBIx::Table::TestDataGenerator seems reasonable. Or perhaps
DBIx::Table::TestDataPopulate.

> The overall functionality looks reasonable (and the recursive table
> structure handling is interesting).
> 
> I'm not sure if you would want to allow the user to specify a seed (for the
> PRNG), so that they can get deterministic results, or provide a mechanism
> to allow them to retrieve the seed used (or both).  Have you considered an
> option to specify the size of transaction (to avoid overwhelming the
> database and its logs)?  Have you considered what information goes in the
> 'table name'?  Does it allow for 'owner' or 'schema' or similar, or should
> that be specifiable separately?

All good advice.
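
As a rough sketch of the seed and transaction-size ideas (the helper
names seed_prng and insert_in_batches are made up for illustration and
are not part of any existing module): srand, begin_work, commit and
rollback are standard Perl and DBI calls. The sketch assumes the handle
has AutoCommit and RaiseError enabled, so that a failed execute dies.

```perl
use strict;
use warnings;

# Hypothetical helper: seed Perl's PRNG and return the seed that was
# used, so the caller can log it and reproduce a run later.
sub seed_prng {
    my ($seed) = @_;
    if (defined $seed) {
        srand($seed);
        return $seed;
    }
    # With no argument, srand picks a seed and (since Perl 5.14)
    # returns it.
    return srand();
}

# Hypothetical helper: insert rows in transactions of at most
# $batch_size rows each, to keep individual transactions small.
# $dbh and $sth are expected to behave like a DBI database handle and
# a prepared statement handle.
sub insert_in_batches {
    my ($dbh, $sth, $rows, $batch_size) = @_;
    die "batch_size must be at least 1\n" unless $batch_size >= 1;

    my @pending = @$rows;
    while (my @batch = splice @pending, 0, $batch_size) {
        $dbh->begin_work;
        my $ok = eval { $sth->execute(@$_) for @batch; 1 };
        if ($ok) {
            $dbh->commit;
        }
        else {
            # Undo the partial batch and pass the error on.
            my $err = $@;
            $dbh->rollback;
            die $err;
        }
    }
    return scalar @$rows;
}
```

Returning the seed lets a caller print it with a test report and pass
it back in to repeat the same sequence of random choices.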

You're already using an object, so you could allow further control by
defining and documenting methods that could be overridden in a subclass.

Rather than thinking in terms of providing a solution, it's good to
think in terms of providing a kit of parts that's already configured for
one common use-case.
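
A minimal sketch of what I mean, with made-up class and method names
(My::TestDataGenerator, pick_value and build_record are illustrative
only, not a proposed API): the base class documents small hook methods,
and a subclass overrides just the one it wants to change.

```perl
use strict;
use warnings;

# Hypothetical base class; all names here are for illustration only.
package My::TestDataGenerator;

sub new {
    my ($class, %args) = @_;
    return bless {%args}, $class;
}

# Hook: choose a value for one column from a list of candidates.
# Subclasses may override this.
sub pick_value {
    my ($self, $column, $candidates) = @_;
    return $candidates->[ int rand @$candidates ];
}

# Builds one record by calling the pick_value hook for each column.
# $columns maps a column name to an array ref of candidate values.
sub build_record {
    my ($self, $columns) = @_;
    my %record;
    $record{$_} = $self->pick_value($_, $columns->{$_})
        for sort keys %$columns;
    return \%record;
}

# A subclass that changes one part and inherits the rest.
package My::TestDataGenerator::FixedStatus;
our @ISA = ('My::TestDataGenerator');

sub pick_value {
    my ($self, $column, $candidates) = @_;
    # Always use the same status; defer to the parent otherwise.
    return 'active' if $column eq 'status';
    return $self->SUPER::pick_value($column, $candidates);
}

package main;

my $generator = My::TestDataGenerator::FixedStatus->new;
my $record    = $generator->build_record(
    { id => [7], status => [ 'new', 'closed' ] },
);
```

The default class covers the common case, and anyone with unusual needs
replaces a single method instead of asking for another option.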

The dbix-class mailing list might be a useful place to discuss what
you're doing. They're likely to know about related modules and ideas.
http://lists.perl.org/list/dbix-class.html

Have fun.

Tim.
