Hi Sam and Tony:

On Wed, 11 Jul 2007, Samuele Kaplun wrote:

> with Jean-Yves and Tony we were thinking about directly using the
> current filesystem structure for the refextract job. Tony says his
> tool works on main documents only.  Right now, the structure of the
> files/g0..gn folders is that every gn folder contains a list of
> subfolders named after the bibdoc id, and each subfolder then
> contains the real documents and a secret hidden file: ".recid", which,
> as you can guess, contains the recid to which the bibdoc belongs.
> Since refextract should run on main files only, and since maybe this
> feature could be used in the future for other purposes, what do you
> think if a new hidden file is created, named ".type" which contains
> the content of the bibrec_bibdoc.type column, i.e. the kind of
> bibdoc (either Main or Additional). I know this is a replication of
> information, but it could be very useful for tools that want to avoid
> depending on the database.  What do you think? Could I add such a
> feature (even to run on production...)?

What is the status of this thing -- have you added it?

I think we can easily avoid it, because refextract has to have an
associated daemon anyway that would look inside the database to fetch
the list of recIDs to extract references from -- that is, for new
records and for records with modified PDF files since the last run.
This daemon, let's call it 'refextractadmin', will normally run via
bibsched periodically and will invoke its 'refextract' slave program
only for good (main) PDF files.  In this way no duplicate '.type'
hidden file would be needed.
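
To make the idea concrete, here is a rough sketch of the selection
step such a daemon could perform.  It is only an illustration under
assumptions: the BibDocRow shape, the 'modified' field and the
function names are hypothetical, and the rows stand in for whatever
the real database query would return.  Only the Main/Additional
values of bibrec_bibdoc.type come from the discussion above.

```python
from datetime import datetime
from typing import Callable, Iterable, List, NamedTuple


class BibDocRow(NamedTuple):
    """One row as a database query might return it (hypothetical shape)."""
    recid: int
    bibdoc_id: int
    doc_type: str       # value of bibrec_bibdoc.type: "Main" or "Additional"
    modified: datetime  # assumed last-modification time of the file


def select_main_docs(rows: Iterable[BibDocRow],
                     last_run: datetime) -> List[BibDocRow]:
    """Keep only main documents that are new or changed since last_run."""
    return [r for r in rows
            if r.doc_type == "Main" and r.modified > last_run]


def run_refextractadmin(rows: Iterable[BibDocRow],
                        last_run: datetime,
                        extract: Callable[[int, int], None]) -> List[int]:
    """Invoke the extractor once per selected document.

    'extract' stands in for the refextract slave program; its
    (recid, bibdoc_id) signature is an assumption of this sketch.
    Returns the recIDs that were processed.
    """
    processed = []
    for row in select_main_docs(rows, last_run):
        extract(row.recid, row.bibdoc_id)
        processed.append(row.recid)
    return processed
```

This assumes that a newly attached file gets a modification time later
than the previous run, so that new records and modified files are both
caught by the same comparison.  The document type is taken from the
database, so the filesystem needs no extra '.type' file.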

Best regards
-- 
Tibor Simko ** CERN Document Server ** <http://cds.cern.ch/>
