[Pharo-dev] Improving the documentation model

stephan Tue, 21 Apr 2015 06:11:08 -0700

TL;DR: Some roadmap ideas. Looks like a lot of work.
Comments and improvements welcome.
We should replace the Pillar document format
by a better one, suitable for WYSIWYG editing and
creating long documents.


---

The current documentation model for Pharo is Pillar.
Pillar is the document model from the Pier CMS and
provides exports to, among others, HTML and LaTeX. It is a
simplified form of the LaTeX document model
without a WYSIWYG UI.

In the research world two documentation systems
dominate: LaTeX and Word. Word and its clones
dominate areas where ease of use for small papers
without maths is important, LaTeX the other fields.

From personal experience I know that the lack of
abstraction in Word and clones makes it very expensive
to create large, consistently formatted documents.
In addition, the typographical quality of the resulting
documents is much lower than that achievable with
LaTeX.

On the other hand, repurposing LaTeX to generate
anything other than PDF/paper documentation is
difficult because of the underlying language that
LaTeX is written in, and there is no easy-to-use
WYSIWYG UI for LaTeX.

It pains me to see the return of text-based formatting
with primitive formats like Markdown. At least in LaTeX
you can preserve semantic-level content; with Markdown
we are back at HTML 1.0.

The program I liked best for creating longer documents
was FrameMaker. It provided the needed abstractions
in an efficient WYSIWYG UI. FrameMaker has been sold
since 1986, so the performance of current hardware
should be enough to run something similar in Smalltalk.
I used versions 5.5 and 6, and had to abandon it when
Adobe stopped development of the Mac version, which was
never migrated from PowerPC.

FrameMaker was fast enough to create books with
hundreds and even thousands of pages. It had working
versions of the long document features Word claimed to
have.

With Athens and TxText we now have low-level
abstractions for dealing with cursor and selection, fonts,
rendering glyphs and producing both on-screen and
PDF output.

On top of TxText we could add a model somewhat like
the attached figure.

[Figure: UML diagram of document structure]

A book consists of a number of named documents.
This is essential for dealing with longer material, as
in a WYSIWYG system we want to avoid having to re-layout
too much after a key is pressed. Across documents we only
need to remember the starting page/section numbers.
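
As a rough illustration of that last point, here is a minimal Python sketch of a book made of named documents. The names Book, Document, page_count and starting_pages are assumptions made for this example only; they are not taken from the figure or from any existing Pharo code. The idea is that each document is laid out on its own, and the book only needs the page counts to derive where each document starts.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Document:
    """A named part of a book (hypothetical class for illustration)."""
    name: str
    page_count: int  # known once this document has been laid out on its own


@dataclass
class Book:
    """A book is an ordered collection of named documents."""
    documents: List[Document] = field(default_factory=list)

    def starting_pages(self, first_page: int = 1) -> Dict[str, int]:
        """Map each document name to the page number it starts on."""
        starts = {}
        next_page = first_page
        for doc in self.documents:
            starts[doc.name] = next_page
            next_page += doc.page_count
        return starts


book = Book([
    Document("Preface", 4),
    Document("Chapter 1", 30),
    Document("Chapter 2", 22),
])
print(book.starting_pages())
```

Editing inside "Chapter 1" only forces a re-layout of that document; the other documents are touched only if its page count changes, and then only their starting numbers move.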

Each document consists of pages. On a page there can be fixed
content and content that is dependent on the text flow.
Most pages of a document have a similar layout, so each page
refers to a masterpage that defines the default content.
A document can have separate masterpages for
first, left and right pages, and rotated or extra large ones.
A masterpage can define fixed items and calculated ones
(page numbers and current chapter). A textframedefinition
describes the textframes and the textflow for each textframe.
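
The choice of masterpage for a given page could be sketched as follows, again in Python. MasterPage, DocumentLayout and master_for are hypothetical names, and the sketch assumes that a document starts on a right-hand page; rotated or extra large masterpages are left out.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class MasterPage:
    """Default content for a page (only a name in this sketch)."""
    name: str


@dataclass
class DocumentLayout:
    """Masterpages of one document; left and first are optional."""
    right: MasterPage
    left: Optional[MasterPage] = None
    first: Optional[MasterPage] = None

    def master_for(self, page_index: int) -> MasterPage:
        """Return the masterpage for a 0-based page index in the document."""
        if page_index == 0 and self.first is not None:
            return self.first
        # Assumption: the document starts on a right-hand page,
        # so odd indices are left-hand pages.
        if page_index % 2 == 1 and self.left is not None:
            return self.left
        return self.right


layout = DocumentLayout(
    right=MasterPage("Right"),
    left=MasterPage("Left"),
    first=MasterPage("First"),
)
print([layout.master_for(i).name for i in range(4)])
```

A single-sided document would define only the right masterpage, and every page would fall back to it.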

The text (and other in-line content) of the document is stored
in paragraphs, which are stored in textflows.
The paragraphstyle of a paragraph knows how to lay it out in
a textframe, and how to deal with the end of a textframe.
The paragraphstyle knows how to paginate, how to number
or provide other autotext at the beginning of a paragraph and
if the paragraph text should be part of a table of contents.
A textframe is a (rectangular) area on a page.
The characterstyle of a paragraph is responsible for the font family,
size and style. The characterstyle can be overridden
by a specific paragraph or by a textrange.
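
The override chain for characterstyles could work roughly like this Python sketch. CharacterStyle, over and effective_style are names invented for the example, and the rule that a textrange override wins over a paragraph override is an assumption. Each override only sets the attributes it wants to change, and everything else is inherited.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class CharacterStyle:
    """Font attributes; None means 'not set here, inherit'."""
    family: Optional[str] = None
    size: Optional[float] = None
    style: Optional[str] = None

    def over(self, base: "CharacterStyle") -> "CharacterStyle":
        """Layer this style on top of base, keeping base where unset."""
        return CharacterStyle(
            family=self.family if self.family is not None else base.family,
            size=self.size if self.size is not None else base.size,
            style=self.style if self.style is not None else base.style,
        )


def effective_style(
    default: CharacterStyle,
    paragraph_override: Optional[CharacterStyle] = None,
    range_override: Optional[CharacterStyle] = None,
) -> CharacterStyle:
    """Resolve the style for a run of text, most specific override last."""
    result = default
    for override in (paragraph_override, range_override):
        if override is not None:
            result = override.over(result)
    return result


body = CharacterStyle(family="Times", size=10.0, style="regular")
resolved = effective_style(
    body,
    paragraph_override=CharacterStyle(size=12.0),
    range_override=CharacterStyle(style="italic"),
)
print(resolved)
```

Because overrides store only differences, changing the default font family updates every paragraph and textrange that did not set its own.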

With a model like this (and adding maths, tables, notes, figures
and references) we should be able to use Pharo both to create
high-quality documentation and to write research articles
(and books) in-image.

 Stephan
