This is something I have some experience with, having worked
at Australia's largest legal publisher.

Preserving paging information is vital for looseleaf, but it
is also very difficult:

* You need to be able to break *anywhere* - that could be in
  the middle of a paragraph

* If you are going to be reusing the content for different
  publications you need to be able to preserve different
  page breaks for different publications

* SGML/XML is intrinsically hierarchical/tree-like, but
  looseleaf pages are intrinsically a linear list. They
  just don't map very well.
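
To make the mismatch concrete, here is a minimal sketch (not
taken from any real system) that keeps page breaks outside the
tree as character offsets into the flattened text, one table per
publication. The element names, publication names and offsets are
all made up for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample content; the element names are illustrative only.
SAMPLE = (
    "<section><para>Alpha beta gamma.</para>"
    "<para>Delta epsilon.</para></section>"
)


def linear_text(root):
    """Flatten the tree into the linear character stream that pages cut across."""
    return "".join(root.itertext())


def paginate(text, breaks):
    """Split text at the given character offsets, one offset per page break."""
    pages, start = [], 0
    for offset in sorted(breaks):
        pages.append(text[start:offset])
        start = offset
    pages.append(text[start:])
    return pages


# Stand-off break tables: one list of offsets per publication (assumed names).
BREAKS = {
    "looseleaf-service": [10],  # falls in the middle of the first paragraph
    "bound-volume": [17],       # falls exactly between the two paragraphs
}

root = ET.fromstring(SAMPLE)
text = linear_text(root)
pages_by_publication = {
    name: paginate(text, offsets) for name, offsets in BREAKS.items()
}
```

The tree never sees the breaks, so the same content can carry a
different set for each publication. The catch is that every edit
to the text shifts the offsets, which is exactly the state that
is hard to maintain.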

It is a nasty problem to try and solve, particularly
using a purely technological solution.

I did a lot of thinking about this one. In my view it would
be damn near impossible for a CMS to maintain that kind of state.
Personally, I think that the CMS should output a set of "Book
Specific Content" to another application whose job is to
track page numbers and page breaks from one version of a book
to the next.
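
As a rough illustration of what that second application might
do, the sketch below compares two releases of a book, each held
as a mapping from printed page number to page text, and reports
which pages would need to be reissued or withdrawn. The function
name, the data shape and the "102A" style of inserted page number
are assumptions for the example only.

```python
def pages_to_reissue(previous, current):
    """Compare two releases, each a dict of printed page number -> page text.

    Returns (changed, removed): pages that are new or differ in the
    current release, and pages that no longer exist in it.
    """
    changed = [p for p in current if previous.get(p) != current[p]]
    removed = [p for p in previous if p not in current]
    return changed, removed


# Hypothetical releases of the same book.
previous = {"101": "Alpha beta", "102": " gamma. Delta"}
current = {
    "101": "Alpha beta",
    "102": " gamma, revised. Delta",
    "102A": " epsilon.",  # overflow page inserted after 102
}

changed, removed = pages_to_reissue(previous, current)
```

Here page 101 is untouched, so only 102 and the new 102A would go
out in the update. Deciding where the new breaks fall is still
the hard part and is not attempted here.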

At 19:42 9/12/2002, Mike McNamara wrote:
Hi Darrel,

As Bill Trippe mentioned in his reply, this is an old but very current
problem that many Legal & Technical publishers of 'large-volume'
documents face as they move their publications to a variety of
electronic media products whilst still having to maintain paper
products and the various 'links' and updates (loose-leaf) to them.

PDF can certainly 'solve' some of these problems, but as Bill mentioned
the problem may be better solved by looking at a 'composition' engine
where you can create controlled paged publications that can be
maintained with loose-leaf versioning control.

Typically, but not always, these 'composition' engines/formatters can
be fed structured data such as XML & SGML (however, many of today's
data streams into these engines/formatters remain proprietary). The
same data stream (preferably a structured one) can also be used to
feed the production engines of other multimedia products.

Some questions that I would ask are: how many pages are we talking about
here? If it's thousands then a non/structured data/composition engine
may be the way to go, if it's only hundreds then the cost of
implementation may prohibit that route. Another question would be what
is the current format of the original data? I would suspect that there
are a few different ones?

I have worked with a number of Legal/Technical publishers over the past
few years with their own composition formatters/engines, and in some
cases with the service companies that provide paged products to them,
and would be happy to share some background experiences off-list if
you would like.

You should be assured that you are not the only person with this
problem out there, but you can also be assured that there are a number
of solutions to the problem, depending on how you want to go about it.

Mike McNamara


> One bit of content we are planning on disseminating via a
> CMS has an unusual restriction: the pagination must be preserved.
>
> These are documents that ultimately need to be searchable, viewable,
> parsable, but also retain a specific pagination scheme for proper
> citations. For instance, the document, itself, may cite another
> page of the document, and other documents need to cite specific
> pages of other documents.
>
> The easy solution is to just publish them as PDFs, but that
> just doesn't seem to be the elegant solution in my mind. Is there a
> way to store structured content in a way that also retains the page
> structure of the original typed paper document? Or would PDF be the
> way to go?
>
> -Darrel
> --
> http://cms-list.org/
> trim your replies for good karma.

-------------------------
Brian White
Step Two Designs Pty Ltd
Knowledge Management Consultancy, SGML & XML
Phone: +612-93197901
Web:   http://www.steptwo.com.au/
Email: [EMAIL PROTECTED]

Content Management Requirements Toolkit
112 CMS requirements, ready to cut-and-paste

