I strongly believe that the WYSIWYG editing route is a fundamentally worse approach for documentation (and textbooks) than text-based formatting such as Pillar, Markdown and Asciidoc.
Specifically, it will result in less community contribution, and it will make distributed version control of documents much harder.

That said, I will support whatever documentation format or tech stack the community adopts. Any documentation is better than none, regardless of the underlying details. (Though I will always advocate for text-based formats.)

Instead of going the route that you propose (which essentially attempts to recreate Word or Google Docs in-image), I think we should:

* Extend Pillar. With a few more features, it can be on par with Markdown and Asciidoc, and then eventually surpass them (and have nice Pharo-specific features like detection and unit-testing of Pharo code blocks, etc).
* Invest in instant-preview tech, in-image. This is similar to what PillarHub does, for example, by using the Ace online editor. Take a look at the first screenshot in: http://pillarhub.pharocloud.com/hub/pillarhub/about

Side-by-side instant preview makes it possible to have the best of both worlds, text-based markup and WYSIWYG, without the typical WYSIWYG drawbacks (first and foremost, making distributed version control difficult).

I will attempt to explain my reasoning, both for text-based markup, and against heavier WYSIWYG approaches.

1) With documentation (and textbooks), the number one goal and number one virtue is: make it easy for people to contribute. This means two things -- the simplicity of the format (this is where LaTeX fails), and the ease of distributed version control (merging, pull requests, reviews and commentary on contributions). Look at the explosion in community-generated documentation and content (READMEs on repos, the entirety of Wikipedia) that has resulted from *making it easy for people to contribute and edit collaboratively*.

2) Version control, especially with more than one or two collaborators, is a *nightmare* with WYSIWYG tools.
Look at what the state of the art is, at what Microsoft's Word and Google's Docs have been able to accomplish in terms of revision control. Despite unimaginable amounts of person-hours of development put into it, it's pretty much unusable. (I speak from extensive personal experience, both from collaborating on technical and business documents using Word and Docs, and from seeing my wife and her friends, who are professional authors, struggle with Word's revision control systems while working with their publishers.)

We are not Microsoft or Google. We are not going to solve the WYSIWYG source control / version control problem better than they have. We need to focus where we spend our efforts. In contrast, text-based markup version control is a solved problem.

3) The convenience of WYSIWYG can be provided with side-by-side instant preview (again, see PillarHub and the numerous WYSIWYG instant-previews in Markdown editors).

4) The ability to render source text-based markup into multiple formats (PDF, HTML, etc) is *essential*. Going from WYSIWYG to HTML is impossible (all attempts to do so, by Microsoft, Adobe, etc, have utterly failed). Whereas going from text-based to print/PDF is very doable (see LaTeX, the entirety of the HTML ecosystem, Pillar, etc). This is a serious problem that FrameMaker never had to solve.

5) Text-based markup is not primitive. You mention a comparison to HTML 1.0. This is apt, but in the opposite direction than you intend. HTML 1.0 may have been primitive. But it has evolved into HTML 5, which not only has many semantics-level content features, but is expressive enough that pretty much all UIs are moving to it (operating systems, desktop app suites, mobile devices, etc). Pillar may be in a primitive state right now, but it already has some decent semantics-level capability, and can have a lot more with considerably less effort than it would take to evolve WYSIWYG tools.
And actually, the WYSIWYG approach is much closer to HTML 1.0, in the sense that users have to indicate *semantic* intentions like emphasis by selecting different fonts, versus something like HTML 4/5, where the actual intention is declared (EM tags, BLOCKQUOTE tags, etc).

6) You mention the two standards in document writing (Word and LaTeX), and the drawbacks to each. I completely agree there. There is a third option, however, on which the world of open-source community technical writing is standardizing. And it involves text-based markup languages:

* Markdown (in the form of https://www.gitbook.com/ )
* Asciidoc (more specifically, the Asciidoctor http://asciidoctor.org/ text processor and publishing toolchain). Take a look at https://medium.com/@chacon/living-the-future-of-technical-writing-2f368bd0a272 for example.
* Pillar (that's us).

7) Text-based markup formats (aside from LaTeX) actually possess all of the desired features that you mention as required for writing book-length technical documentation. Something like Asciidoc already possesses them, and Pillar has most of them (and can be extended to have the rest). Let's take a look at some of them:

* High-level publishing-centered semantic abstractions. In other words, both the concept of book sections, chapters, chapter sections, paragraphs, figures, etc, as well as the ability to compose a larger document out of smaller named documents:
  - Pillar and Asciidoc have chapters, sections, paragraphs, figures and named scripts/code blocks.
  - Asciidoc has the ability to do file imports (compose a larger document out of smaller docs).
* Links. Asciidoc has semantic links both within a document, and across different Asciidoc documents (references to chapters, sections, etc). Pillar has within-document links, and inter-doc links are on the roadmap.
* The ability to drop down to a more expressive markup (LaTeX or HTML).
For heavier-duty features like formulas and equations, all of the simple markup languages allow the author to drop down to LaTeX and lay out formulas to their heart's content.

8) Text-based markup formats are much easier both to extend (add new semantic tags, etc) and to machine-process (parse, apply macros, etc) than WYSIWYG formats.

So, in summary:

- Text-based markup languages result in a lot more technical docs being written.
- Pillar can match or beat any of the WYSIWYG editor features, without too much investment of time and effort.

On Tue, Apr 21, 2015 at 9:11 AM, stephan <[email protected]> wrote:

> TL;DR: Some roadmap ideas. Looks like a lot of work.
> Comments and improvements welcome:
> We should replace the Pillar document format
> by a better one, suitable for WYSIWYG editing and
> creating long documents.
>
> ---
>
> The current documentation model for Pharo is Pillar.
> Pillar is the document model from the Pier CMS and
> provides exports to (a.o) html and LaTeX. It is a
> simplified form of the LaTeX document model
> without a WYSIWYG UI.
>
> In the research world two documentation systems
> dominate: LaTeX and Word. Word and its clones
> dominate areas where ease of use for small papers
> without maths are important, LaTeX the other fields.
>
> From personal experience I know that the lack of
> abstraction in Word and clones makes it very expensive
> to create large, consistently formatted documents.
> In addition, the typographical quality of the resulting
> documents is much lower than that achievable with
> LaTeX.
>
> On the other hand, repurposing LaTeX to generate
> anything other than PDF/paper documentation is
> difficult because of the underlying language that
> LaTeX is written in, and there is no easy to use
> WYSIWYG UI for LaTeX.
>
> It pains me to see the return of text based formatting
> with primitive formats like markdown. At least in LaTeX
> you can preserve semantics level content, in markdown
> we are back at html 1.0.
>
> The program I liked best for creating longer documents
> was Framemaker. That provided the needed abstractions
> in an efficient WYSIWYG UI. Framemaker was sold
> from 1986, so the performance of current hardware
> should be enough to run something similar in smalltalk.
> I used versions 5.5 and 6, and had to abandon it when
> Adobe stopped development and it was never migrated
> from PowerPC.
>
> Framemaker was fast enough to create books with
> hundreds and even thousands of pages. It had working
> versions of the long document features Word claimed to
> have.
>
> With Athens and TxText we now have low level
> abstractions for dealing with cursor and selection, fonts,
> rendering glyphs and having both on-screen and
> PDF output.
>
> On top of TxText we could add a model somewhat like
> the attached figure
> [image: UML diagram of document structure]
> A book consists of a number of named documents.
> This is essential for dealing with longer material, as
> in a wysiwyg system we want to avoid having to re-layout
> too much after a key is pressed. Across documents we only
> need to remember the starting page/section numbers.
>
> Each document consists of pages. On a page there can be fixed
> content and content that is dependent on the text flow.
> Most pages of a document have a similar layout, so each page
> refers to a masterpage that defines the default content.
> A document can have separate masterpages for
> first, left and right pages, and rotated or extra large ones.
> A masterpage can define fixed items and calculated ones
> (pagenumbers and current chapter). A textframedefinition
> describes the textframes and the textflow for each textframe.
>
> The text (and other in-line contents) of the document are stored
> in paragraphs, which are stored in textflows.
> The paragraphstyle of a paragraph knows how to layout it in
> a textframe, and how to deal with the end of a textframe.
> The paragraphstyle knows how to paginate, how to number
> or provide other autotext at the beginning of a paragraph and
> if the paragraph text should be part of a table of contents.
> A textframe is a (rectangular) area on a page.
> The characterstyle of a paragraph is responsible for the font family,
> size and style. The characterstyle can be overridden
> by a specific paragraph or by a textrange.
>
> With a model like this (and adding maths, tables, notes, figures
> and references) we should be able to use Pharo to create both
> high-quality documentation, and write research articles
> (and books) in-image.
>
> Stephan
