On Sat, 1 Jun 2002 [EMAIL PROTECTED] wrote:

> We have a little over a thousand 1 or 2 page documents stored in native
> notes format.  They are things like product specifications, recipes, etc.
> These are available to staff internally using native notes access.  Notes
> serves these up to the web for customer access dynamically converting them
> to html.  It also provides indexing, searching and browsing.  These
> documents tend to be in only about 6 standard formats.

This sounds a lot like an Electronic Document Management System (EDMS).
Then again, I work for an EDMS vendor, so everything looks like an EDMS.

> - I am keen to serve the documents as pdf's.  These seem to be reasonably
> universally readable and have several other advantages.  They make it much
> more difficult for end users to modify.  We have had problems with people
> downloading large portions of our web site and presenting it as their own.
> We can include things like a watermark.  However, storing as pdf seems like
> a bad option.

You can edit PDF documents, but it isn't trivial. The baddies would have
to touch each PDF document individually, which is probably enough to make
them just go and steal from someone else. Why is storing as PDF bad?

> - should I store the text in a database or in individual files.  The
> database encourages structure and the possibility to make simple changes
> across all documents.  Storing as individual files provides flexibility at
> the expense of having lots of small things to manipulate.

Lots of small files can also cause I/O pain. I like the idea of storing
the information in a database and then generating the PDF documents on
the fly.
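
A rough sketch of what I mean, in Python with SQLite standing in for
whatever database you end up with. The table, the two layout names, the
watermark text and the function names are all made up for illustration,
and the last step (handing the rendered text to a real PDF generator) is
left out on purpose:

```python
import sqlite3
from string import Template

# Hypothetical layouts: one template per standard document format.
LAYOUTS = {
    "spec": Template("PRODUCT SPECIFICATION\n$title\n\n$body\n"),
    "recipe": Template("RECIPE\n$title\n\n$body\n"),
}


def open_store(path=":memory:"):
    """Open the document store, creating the (assumed) table if needed."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS documents ("
        "id INTEGER PRIMARY KEY, layout TEXT NOT NULL, "
        "title TEXT NOT NULL, body TEXT NOT NULL)"
    )
    return conn


def add_document(conn, layout, title, body):
    """Store one document and return its id."""
    cur = conn.execute(
        "INSERT INTO documents (layout, title, body) VALUES (?, ?, ?)",
        (layout, title, body),
    )
    conn.commit()
    return cur.lastrowid


def render(conn, doc_id, watermark="(c) Example Pty Ltd"):
    """Build the page text for one document at request time.

    A real system would pass this text to a PDF library instead of
    returning it; the watermark line is an illustrative placeholder.
    """
    row = conn.execute(
        "SELECT layout, title, body FROM documents WHERE id = ?", (doc_id,)
    ).fetchone()
    if row is None:
        raise KeyError(doc_id)
    layout, title, body = row
    text = LAYOUTS[layout].substitute(title=title, body=body)
    return text + "\n" + watermark + "\n"
```

Because the layout lives in one place, a change to a template or to the
watermark shows up in every document the next time it is requested.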

> - as the documents are generally limited to about 6 different layouts
> DocBook might be a good option.  If so, do we store the text in files or a
> database?  How do I edit it?  How do I train authorised users to modify
> documents?

I have yet to get DocBook to present information in a layout I like.
DocBook doesn't let you specify things like page breaks, because you're
not meant to know what device the data is being rendered to.

> - how do we provide the indexing, searching and browsing functions with
> any/all of the above?

ht://Dig (htdig) handles the indexing and searching.

> I am leaning a little towards PostgreSQL and perl for several reasons.
>
> 1) Gus says perl is good
> 2) I have a little (read very little) perl experience.
> 3) I suspect in the long run it is going to be easier and more efficient to
> store all the text in a RDBMS than as text files
> 4) I need to have a system for authenticating users and ensuring they have
> access only to selected subsets of the documents.  Documents could be more
> easily categorised in the RDBMS.

MySQL and Perl just work together. I am sure this is also the case for
many other databases. You can also integrate PDF generators with Perl.
The other obvious option is PHP.
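
On your point 4, restricting users to subsets of the documents is mostly
a join once the categories are in the database. A minimal sketch, again
in Python with SQLite only so it runs anywhere; the table names, user
names and categories are invented, and real authentication (passwords,
sessions) is not shown:

```python
import sqlite3

# Assumed schema: each document has one category, and each grant row
# allows one user to read one category.
SCHEMA = """
CREATE TABLE documents (id INTEGER PRIMARY KEY, title TEXT, category TEXT);
CREATE TABLE grants (username TEXT, category TEXT);
"""


def visible_titles(conn, username):
    """Return the titles this user may read, sorted by title."""
    rows = conn.execute(
        "SELECT d.title FROM documents d "
        "JOIN grants g ON g.category = d.category "
        "WHERE g.username = ? ORDER BY d.title",
        (username,),
    )
    return [title for (title,) in rows]


def demo():
    """Build a small in-memory database with made-up sample data."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.executemany(
        "INSERT INTO documents (title, category) VALUES (?, ?)",
        [
            ("Flour spec", "specs"),
            ("Scone recipe", "recipes"),
            ("Yeast spec", "specs"),
        ],
    )
    conn.executemany(
        "INSERT INTO grants (username, category) VALUES (?, ?)",
        [("alice", "specs"), ("bob", "recipes")],
    )
    conn.commit()
    return conn
```

Adding a user to another category is then a single row in the grants
table rather than a change to any document.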

> I am not sure whether this project is beyond our capabilities given other
> commitments.  It may be necessary to contract out the initial work and then
> take up maintenance / improvements ourselves.

Be careful assuming that you can easily comprehend other people's poor
coding.

Mikal

-- 

Michael Still ([EMAIL PROTECTED])     UMT+10hrs

-- 
SLUG - Sydney Linux User's Group - http://slug.org.au/
More Info: http://lists.slug.org.au/listinfo/slug
