I am sure this is going to end up a long email so I will apologise in
advance.

I am looking for some advice as to what direction to take in a project I am
considering at work.  We currently use Lotus Notes for email and document
storage.  (This is my fault, a decision made 6 or 7 years ago before I knew
better.)

We have a little over a thousand 1 or 2 page documents stored in native
Notes format.  They are things like product specifications, recipes, etc.
These are available to staff internally using native Notes access.  Notes
also serves them up to the web for customer access, dynamically converting
them to HTML.  It also provides indexing, searching and browsing.  The
documents fall into only about 6 standard formats.

I want to move away from Notes for a number of reasons, including:

- Notes is proprietary.
- Notes native is OK but Notes web is slow.
- I am keen to serve the documents as PDFs.  These seem to be reasonably
universally readable and have several other advantages.  They are much
more difficult for end users to modify.  We have had problems with people
downloading large portions of our web site and presenting it as their own.
We can include things like a watermark.  However, storing the documents as
PDF seems like a bad option.

I have been thinking about this for a while and as usual with Linux there
is more than one way to do it.  I have lots of ideas and questions bumping
around in my head.  None seems to jump out as an ideal solution.  The
purpose of this email is to seek advice about what direction to head and /
or where to look for more information.

My ideas/questions include:

- we could edit the documents in something like OpenOffice, save as
PostScript and convert from there.  This would provide flexibility and a
good editing platform but I think it would be slow and cumbersome.
OpenOffice is supposed to store as XML.  However, a cursory look at a file
created by OpenOffice didn't show anything human readable.

- should I store the text in a database or in individual files?  The
database encourages structure and offers the possibility of making simple
changes across all documents.  Storing as individual files provides
flexibility at the expense of having lots of small things to manipulate.

- I have read a little about Zope but am not sure whether it might be
useful or a bit much like Notes.  Zope might be useful for the rest of the
web site.

- as the documents are generally limited to about 6 different layouts
DocBook might be a good option.  If so, do we store the text in files or a
database?  How do I edit it?  How do I train authorised users to modify
documents?

- We have experimented a little with LaTeX.  Is LaTeX a better option, and
if so, do we store the markup bits with the text or store the text in a
database and add markup on the fly to present documents in a standard
format?  One advantage of LaTeX seems to be that we could store each
document individually and so have very flexible formats.  However, again
training could be a problem.  No WYSIWYG.

- how do we provide the indexing, searching and browsing functions with
any/all of the above?
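
To make the "text in a database, markup added on the fly" idea a bit more
concrete, here is a rough sketch.  It uses Python and an in-memory SQLite
database purely as stand-ins so it can be run anywhere; the table layout,
the single "spec" template and all the function names are invented for
illustration and are not a worked-out design.  It only shows the shape of
the approach: plain text goes in, a layout template wraps it in LaTeX on
the way out, and a crude substring search stands in for real indexing.

```python
import sqlite3
from string import Template

# Hypothetical: one LaTeX template per standard layout (only one shown).
LAYOUTS = {
    "spec": Template(
        "\\documentclass{article}\n"
        "\\begin{document}\n"
        "\\section*{$title}\n"
        "$body\n"
        "\\end{document}\n"
    ),
}

# Characters that have special meaning in LaTeX and must be escaped.
_LATEX_SPECIALS = {
    "\\": r"\textbackslash{}",
    "&": r"\&", "%": r"\%", "$": r"\$", "#": r"\#",
    "_": r"\_", "{": r"\{", "}": r"\}",
    "~": r"\textasciitilde{}", "^": r"\textasciicircum{}",
}


def latex_escape(text):
    """Escape plain text one character at a time so nothing is escaped twice."""
    return "".join(_LATEX_SPECIALS.get(ch, ch) for ch in text)


def open_store():
    """Create an in-memory database holding plain text only, no markup."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE documents ("
        " id INTEGER PRIMARY KEY,"
        " layout TEXT NOT NULL,"
        " title TEXT NOT NULL,"
        " body TEXT NOT NULL)"
    )
    return conn


def add_document(conn, layout, title, body):
    """Store one document and return its id."""
    cur = conn.execute(
        "INSERT INTO documents (layout, title, body) VALUES (?, ?, ?)",
        (layout, title, body),
    )
    return cur.lastrowid


def render_latex(conn, doc_id):
    """Fetch the stored text and wrap it in the template for its layout."""
    row = conn.execute(
        "SELECT layout, title, body FROM documents WHERE id = ?", (doc_id,)
    ).fetchone()
    if row is None:
        raise KeyError(doc_id)
    layout, title, body = row
    return LAYOUTS[layout].substitute(
        title=latex_escape(title), body=latex_escape(body)
    )


def search(conn, term):
    """Naive substring search; LIKE wildcards in term are not escaped."""
    pattern = "%" + term + "%"
    rows = conn.execute(
        "SELECT id, title FROM documents"
        " WHERE title LIKE ? OR body LIKE ? ORDER BY id",
        (pattern, pattern),
    )
    return rows.fetchall()
```

The LaTeX that comes out would still need to be run through something like
pdflatex to get a PDF, and a change to a template would then apply to every
document using that layout.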


I am leaning a little towards PostgreSQL and Perl for several reasons.

1) Gus says Perl is good
2) I have a little (read very little) Perl experience.
3) I suspect in the long run it is going to be easier and more efficient to
store all the text in an RDBMS than as text files
4) I need to have a system for authenticating users and ensuring they have
access only to selected subsets of the documents.  Documents could be more
easily categorised in the RDBMS.
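
As a rough illustration of point 4, this is the sort of thing I imagine
the RDBMS making easy.  Again it is only a sketch in Python with SQLite as
a stand-in; the two tables, the category names and the function names are
all hypothetical, and it covers only "who may see what", not the
authentication (login) side.

```python
import sqlite3

# Hypothetical schema: each document has one category, and each user is
# granted access to zero or more categories.
SCHEMA = """
CREATE TABLE documents (
    id INTEGER PRIMARY KEY,
    title TEXT NOT NULL,
    category TEXT NOT NULL
);
CREATE TABLE user_categories (
    username TEXT NOT NULL,
    category TEXT NOT NULL,
    PRIMARY KEY (username, category)
);
"""


def make_db():
    """Create an empty in-memory database with the two tables."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    return conn


def add_document(conn, title, category):
    """File a document under a single category."""
    conn.execute(
        "INSERT INTO documents (title, category) VALUES (?, ?)",
        (title, category),
    )


def grant(conn, username, category):
    """Allow a user to see a category; repeating a grant is harmless."""
    conn.execute(
        "INSERT OR IGNORE INTO user_categories VALUES (?, ?)",
        (username, category),
    )


def visible_titles(conn, username):
    """Return, in title order, only the documents this user may see."""
    rows = conn.execute(
        "SELECT d.title FROM documents AS d"
        " JOIN user_categories AS u ON u.category = d.category"
        " WHERE u.username = ? ORDER BY d.title",
        (username,),
    )
    return [title for (title,) in rows]
```

A user with no grants simply gets an empty list, so the default is that
nothing is visible until access is given.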


I am not sure whether this project is beyond our capabilities given other
commitments.  It may be necessary to contract out the initial work and then
take up maintenance / improvements ourselves.


Thank you and regards,
Steven

-- 
SLUG - Sydney Linux User's Group - http://slug.org.au/
More Info: http://lists.slug.org.au/listinfo/slug
