Authors,

I have written a module that deals with France's INSEE codes, which
allows one to look up postcodes and the like. I've been toying
with Geography::FR::Postcode as a name. (Any other ideas?)

The thing is, it relies on a text file that is 750KiB zipped, updated
periodically. So I'm looking at a reader package that knows how to pick
apart a certain format (or formats) of the data file and answer
questions (for instance, what towns have the postcode 66100). Reading
the unzipped file on each run and producing hashes takes about a second,
which is good enough for a first version.
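
To give an idea of what the reader does, here is a minimal sketch. The
tab-separated layout (INSEE code, postcode, town), the sample rows and
the names load_postcodes and towns_for are all assumptions made up for
illustration, not the actual layout of the INSEE file.

```perl
use strict;
use warnings;

# Build a hash mapping each postcode to the list of towns that use it.
# Assumed layout: one record per line, tab-separated:
# INSEE code, postcode, town name.
sub load_postcodes {
    my ($fh) = @_;
    my %towns_by_postcode;
    while (my $line = <$fh>) {
        chomp $line;
        next if $line =~ /^\s*$/;    # skip blank lines
        my ($insee, $postcode, $town) = split /\t/, $line;
        push @{ $towns_by_postcode{$postcode} }, $town;
    }
    return \%towns_by_postcode;
}

# Return the towns for a postcode (empty list if the postcode is unknown).
sub towns_for {
    my ($index, $postcode) = @_;
    return @{ $index->{$postcode} || [] };
}

# Illustrative rows only, read from an in-memory filehandle.
my $sample = "66136\t66100\tPERPIGNAN\n75056\t75001\tPARIS\n";
open my $fh, '<', \$sample or die "cannot open in-memory file: $!";
my $index = load_postcodes($fh);
close $fh;

print join(', ', towns_for($index, '66100')), "\n";
```

The real module would read the unzipped data file instead of a string,
but the hash-of-lists idea is the same.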

One problem is that the INSEE web site doesn't make it easy to predict
what the new filename will be, so I can't fetch the data from INSEE
during the installation process. And I would like to avoid wrapping the
data file itself up as a CPAN module. So I would create another package
containing a solitary package variable that holds the URI of the data
file on INSEE's web site, and I just update *that* when new versions
come out.
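
Roughly, the Data package could be as small as the sketch below. The
version number and the URI are hypothetical placeholders (example.com
stands in for the real INSEE location), and the variable name $URI is
just my working assumption.

```perl
package Geography::FR::Postcode::Data;
use strict;
use warnings;

# Hypothetical version; bumped each time INSEE publishes a new file.
our $VERSION = '0.01';

# Hypothetical placeholder; the real value would be the current
# location of the data file on INSEE's web site.
our $URI = 'http://www.example.com/insee/data.zip';

package main;

print "Data $Geography::FR::Postcode::Data::VERSION ",
      "points at $Geography::FR::Postcode::Data::URI\n";
```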

Something like this:

Geography::FR::Postcode

  depends on
  Geography::FR::Postcode::Data

Installing Geography::FR::Postcode forces the dependency on
Geography::FR::Postcode::Data to be resolved first. So Data is
downloaded and, as part of its installation process, the data file is
downloaded and installed somewhere on the local system.

I suppose it will default to the site_perl directory if run in batch
mode, but interactive installations would allow the directory to be
specified. OS distribution maintainers may wish to override the default
(how? an environment variable à la
PERL_G_F_P_PATH=/usr/local/share/doc/insee?).
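
The environment-variable idea might look something like this sketch.
The function name data_dir and the Geography/FR/Postcode subdirectory
under site_perl are assumptions; only PERL_G_F_P_PATH comes from the
paragraph above.

```perl
use strict;
use warnings;
use Config;
use File::Spec;

# Pick the directory for the data file: the environment variable wins,
# otherwise fall back to a (hypothetical) directory under site_perl.
sub data_dir {
    my $override = $ENV{PERL_G_F_P_PATH};
    return $override if defined $override && length $override;
    return File::Spec->catdir(
        $Config{installsitelib}, 'Geography', 'FR', 'Postcode'
    );
}

{
    # Simulate a distribution maintainer overriding the default.
    local $ENV{PERL_G_F_P_PATH} = '/usr/local/share/doc/insee';
    print data_dir(), "\n";
}
```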

After Geography::FR::Postcode::Data is installed, the installation
of Geography::FR::Postcode goes forward (waving hands: knowing where Data put the damned file).
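
One way I can imagine un-waving the hands: Data records where it put
the file and exposes that through a function, and Postcode simply asks.
In this sketch the path, the file name insee.txt and the function name
data_file are all hypothetical; I'm assuming the installer would
rewrite $DATA_FILE when it places the file.

```perl
use strict;
use warnings;

package Geography::FR::Postcode::Data;

# Assumption: the installer rewrites this value when it places the file.
our $DATA_FILE = '/usr/local/share/doc/insee/insee.txt';

sub data_file { return $DATA_FILE }

package Geography::FR::Postcode;

# The reader never hard-codes a path; it asks Data where the file is.
sub data_file { return Geography::FR::Postcode::Data->data_file }

package main;

print Geography::FR::Postcode->data_file, "\n";
```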

Next year, a new version of the INSEE file comes out. I test, and see
that the current reader code can deal with it. I release a new version
of Geography::FR::Postcode::Data. The client sees that there is an
update for this, and installs it. New data file, everyone happy.
(Assuming the installation causes the new file to overwrite the old one;
otherwise Postcode will continue to run with the old file.)

The following year, a new version comes out, and surprise! they've added
a new column in the file. So I release a new version of
Geography::FR::Postcode as well, that knows how to read both formats,
and a new version of Geography::FR::Postcode::Data.
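
Reading both formats could be as simple as counting columns, as in this
sketch. The three-column and four-column layouts, the field names and
the function name parse_record are assumptions for illustration; I
don't know what INSEE would actually add.

```perl
use strict;
use warnings;

# Assumed layouts: the old format has 3 tab-separated columns,
# the new one appends a hypothetical 4th column ("extra").
sub parse_record {
    my ($line) = @_;
    chomp $line;
    my @fields = split /\t/, $line, -1;    # -1 keeps trailing empty fields
    if (@fields == 3) {
        my ($insee, $postcode, $town) = @fields;
        return { insee => $insee, postcode => $postcode, town => $town };
    }
    if (@fields == 4) {
        my ($insee, $postcode, $town, $extra) = @fields;
        return {
            insee => $insee, postcode => $postcode,
            town  => $town,  extra    => $extra,
        };
    }
    die "unrecognised record with " . scalar(@fields) . " fields\n";
}

# Illustrative rows only.
my $old = parse_record("66136\t66100\tPERPIGNAN\n");
my $new = parse_record("66136\t66100\tPERPIGNAN\tX\n");
print "$old->{town} / $new->{town} ($new->{extra})\n";
```

Dying on an unknown column count means a third, unexpected format shows
up as an error rather than as silently wrong answers.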

Does that sound sane? Does anyone have some pointers on how to deal with
the placement of datafiles on the local system with one module, and having the other module know where to find them?

Or am I making this unnecessarily complicated? (I could just bundle the
data file with the distribution, but the size of the data file, and the
likelihood that the format will rarely change, invite the above
approach.)

Thanks,
David
--
Much of the propaganda that passes for news in our own society is given
to immobilising and pacifying people and diverting them from the idea
that they can confront power. -- John Pilger


