Russel Winder wrote:
Adam,

On Tue, 2008-10-28 at 08:36 +1100, Adam Murdoch wrote:
  
Hi,

I don't like latex. At all. I tried to like it, I really did. I just 
can't do it.
    

You just haven't tried hard enough ;-)

LaTeX is the best markup language for writing big documents, if anyone
suggests DocBook/XML I shall scream and say "Don't do it, don't let XML
pollute your life, don't let O'Reilly and Pragmatic Programmers
indoctrinate you".  Also of course PDF generation from DocBook/XML is an
arcane art form whereas doing it from LaTeX is a doddle.

My experience was the exact opposite.

I spent maybe 3 hours last weekend doing a proof of concept port of the Gradle user guide to docbook. I started from scratch, with no knowledge of the docbook markup or the tools, and ended up with a gradle build which generates some reasonable looking html and pdfs. I followed the steps described in the user guide, and they just worked. No arcane art involved.

Compare that to my attempts to generate the userguide using latex. I started with a build that was already set up, and latex source that had already been written. All I had to do was install the tools. Heh. I reckon it took about 5 hours. As a newcomer, I found the documentation cryptic, full of terminology no-one wanted to explain to me. I couldn't find any authoritative guide which would lead me through the installation and setup process. The tools didn't work and the usability of latex is insultingly bad when something goes wrong. Awful, just awful. And the end result was an ordinary looking pdf and some terrible html.

To me, getting to the point where I could just generate the user guide was a truly frustrating experience. It most definitely was not a doddle. This is not the experience I want others to have when they want to contribute to the documentation. And I don't think my experience was unique for someone with no previous exposure to latex. I reckon it's representative of what most newcomers will go through.

Then I decided to try to make the output look a bit better. I started with the html, as this was the worst by far. My experience was unpleasant beyond words. I reckon I spent 8 hours on this before I gave up. I could find no documentation to help me with this. I found snippets of config files on the internet. I dug through the source of tex4ht. I tried to find alternatives to tex4ht. Blerg. After all this I ended up with something that looked acceptable. But I still couldn't manage to do everything I wanted to.

Any mechanism that involves a binary format such as Word or ODF is a
disaster from a version control perspective.  FlatFile ODF generates
single line files, at least in OpenOffice.org, and so is useless for
version control.  The only sane formats are LaTeX and XML.  And XML
sucks for any form of human authorship -- it is a communication notation
only for consenting computers.

  

What's the problem with xml in this instance? You have to put pretty much the same mark-up in pretty much the same places as you do for latex.


  
I've been having a play with docbook, and I quite like it. Here's what I 
think are its advantages over latex:
    

SCREEAAMMMM... Don't do it, don't let XML pollute your life, don't let
O'Reilly and Pragmatic Programmers indoctrinate you (see I said I
would :-)

I must immediately point out that DocBook/XML appears to have no easy
way of including files by reference.  This makes it very hard to
construct big documents by decomposition to smaller units -- something
that LaTeX handles very well.  

This is one of the things I tried in my proof of concept. You can compose any logical docbook component (book, chapter, section, program listings, etc) from multiple source files, using the <include> tag:

<include href=""/>

Pretty much the same as latex.
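
As a rough sketch of that composition (not the docbook toolchain itself), here is the same idea using the XInclude support in Python's standard library. It assumes the include tag above is the XInclude one; the file names, titles and sample content are all made up, and are held in memory so the snippet runs on its own.

```python
import xml.etree.ElementTree as ET
from xml.etree import ElementInclude

# Hypothetical source files, kept in a dict so nothing is read from disk.
SOURCES = {
    "book.xml": (
        '<book xmlns:xi="http://www.w3.org/2001/XInclude">'
        "<title>User Guide</title>"
        '<xi:include href="intro.xml"/>'
        "<chapter><title>Samples</title>"
        '<programlisting><xi:include href="hello.gradle" parse="text"/>'
        "</programlisting>"
        "</chapter>"
        "</book>"
    ),
    "intro.xml": "<chapter><title>Introduction</title></chapter>",
    "hello.gradle": 'println "hello"\n',
}


def load(href, parse, encoding=None):
    """Resolve one include: an Element for parse="xml", a string for parse="text"."""
    text = SOURCES[href]
    return ET.fromstring(text) if parse == "xml" else text


def compose(name):
    """Parse a source file and expand every include element in place."""
    root = ET.fromstring(SOURCES[name])
    ElementInclude.include(root, loader=load)
    return root


if __name__ == "__main__":
    book = compose("book.xml")
    for chapter in book.findall("chapter"):
        print(chapter.findtext("title"))
```

The parse="text" form pulls a file in as literal text, which is one way a program listing can be fed from a sample file instead of being pasted into the document.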

OK there are known idioms using entities
for creating books from chapters, but it is including parts of code
files that DocBook/XML seems to have no way of handling. 

  
- Tool support

Because docbook is an xml based markup language, I can point intellij at 
the (well documented) schema and then get all kinds of goodness, such as 
code completion, error checking, syntax highlighting, refactors, etc. 
For example, Intellij understands the code includes, so I can jump back 
and forth between the doc files and the samples.
    

This is true.  On the other hand Emacs with nXML is better.  Actually
Emacs with AUCTeX is far superior because it uses LaTeX and not XML.

O'Reilly push using XXE and to some extent they have a point.  If you
have to author DocBook/XML this does do the job.

How are you doing code includes?  Is it by inclusion or reference?

  
- 100% Java toolchain
    

Why is this relevant?  I would argue that enforcing non-functional
requirements such as this is counter-productive.

True. I have the words backwards. The benefit is that I can put together a distribution of reasonable size (10s of megs) which the gradle build can install (by download or from source control), on any platform, and have a pretty good chance of the documentation generation working for any newcomer. And for the CI builds.

The 100% Java bit is just the implementation.

  The tool support above
is a much better line.  Except that I don't want to write XML programs
in IntelliJ IDEA :-)

  
There are a bunch of tools for converting docbook to other formats. The 
one I've been playing with is 100% Java.

To convert docbook to (x)html, you run the source documents through 
xalan with some stylesheets provided by the docbook project. To convert 
docbook to pdf, you do the above (with a different stylesheet) to 
produce XSL-FO output, then run that through Apache FOP.
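
The two conversions can be sketched as command lines. This is only an illustration: it assumes the Xalan jars are on the classpath, that a fop launcher script is on the PATH, and that the docbook stylesheets are unpacked under a docbook-xsl directory. The file names are made up, and the snippet only builds the commands, it does not run them.

```python
import os

# Assumed stylesheet locations inside an unpacked docbook-xsl distribution.
HTML_XSL = "docbook-xsl/html/docbook.xsl"
FO_XSL = "docbook-xsl/fo/docbook.xsl"


def xalan(source, stylesheet, output):
    """Command line for one XSLT pass with Xalan's command-line class."""
    return ["java", "org.apache.xalan.xslt.Process",
            "-IN", source, "-XSL", stylesheet, "-OUT", output]


def html_steps(source, out_html):
    """docbook -> html is a single transformation."""
    return [xalan(source, HTML_XSL, out_html)]


def pdf_steps(source, out_pdf):
    """docbook -> formatting objects -> pdf takes two steps."""
    fo_file = os.path.splitext(out_pdf)[0] + ".fo"
    return [xalan(source, FO_XSL, fo_file),
            ["fop", "-fo", fo_file, "-pdf", out_pdf]]


if __name__ == "__main__":
    steps = (html_steps("userguide.xml", "userguide.html")
             + pdf_steps("userguide.xml", "userguide.pdf"))
    for step in steps:
        print(" ".join(step))
```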
    

The publishing industry would have a good laugh here.  True they use XML
extensively and DocBook/XML has a place, but they all use extortionately
expensive XSL-FO toolchains that actually work.
  

Yeah, but we're just writing a user guide here. We're not publishing a book. The docbook tools just have to work well enough. And they do. Both the html and pdf output look pretty good to me.

XSL-FO is simply an intermediate representation used by the toolchain I happened to use. There are other options for toolchains if we want to use them. For example, the build could use dblatex for pdf generation if it is installed, and fall back to xalan/fop if not. This gives us both the super low cost of contributing and the typesetting goodness of latex.
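
A minimal sketch of that fallback, assuming "installed" simply means a dblatex executable can be found on the PATH. The function name and the backend labels are made up for illustration, and the real build would do this in Gradle rather than Python.

```python
import shutil


def choose_pdf_backend(which=shutil.which):
    """Return "dblatex" if that executable is found, otherwise "xalan-fop".

    The lookup function is a parameter so the choice can be exercised
    without dblatex actually being installed.
    """
    return "dblatex" if which("dblatex") else "xalan-fop"


if __name__ == "__main__":
    print("pdf backend:", choose_pdf_backend())
```

A newcomer with nothing installed still gets a pdf from the bundled tools, and anyone who has gone to the trouble of installing latex gets the nicer typesetting.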

 
  
Because these tools are 100% java, and relatively small (certainly 
compared to the 600meg texlive distro needed for the mac), they could be 
checked in to subversion. This means that there would be no setup effort 
for a developer to build (and contribute to) the documentation: just 
checkout and gradle userguide. Setting up the latex tools is complex and 
confusing and fragile.
    

Bad metric.  The faults of Mac packaging are the faults of Mac packaging
-- and it's 800MB on my Mac Mini, but then I don't actually use that; I
use Ubuntu, where you can pick and choose and so avoid the 400MB of
foreign language documentation.
  

But what choice do I have as someone who uses a mac? 600megs is a big distribution. For a while I thought the download was the wrong thing to get (how could it possibly be that big?) and kept chasing around the documentation looking for the actual download.

It's another factor that makes one think twice about contributing to Gradle's documentation.

The argument about managing the toolchain is, however, a strong one, so
keep on with that.  I still hate authoring material in XML, or HTML for
that matter.

Also Subversion should be avoided as well :-)

  
Having the tools in svn would also mean that the CI builds could 
generate the documentation (and test the generation of the documentation 
on multiple platforms).

Having docbook integration might also be an interesting option for 
report/website generation too.
    

There is nothing you can do in XML you can't do with LaTeX and indeed
vice versa.  The question is what is easy and what is hard using the
format for the purpose it is being used for.  For writing books LaTeX is
best and DocBook/XML comes a far distant second.

  

Yeah, but we're not writing books. We're writing a user guide. And we want people to contribute. To me, the most important factor for Gradle's documentation system is that it have the smallest possible mental and time cost for someone to pay before they can start to contribute. Docbook easily trumps latex here.

Small cost of contributing is more important than whether we can do typesetting for printing. Or even what the markup language is.


  
- Familiar technologies

I think xml + xslt are technologies more familiar to java developers and 
the pool of potential Gradle contributors, than latex is. This lowers 
the barrier for contributing to the documentation.
    

This is a bit sophistic.  However the point about contribution is
important.  I would suggest that neither XML nor LaTeX are appealing in
that sense.  They are however the only sane representations for version
control.

  
- Better documented

Latex has heaps of documentation, and most of it is not very detailed. 
Docbook has some fine and detailed documentation. There's the definitive 
guide:

http://docbook.org/tdg/en/html/docbook.html

and the user guide for the stylesheets:

http://www.sagehill.net/docbookxsl/index.html
    

Sorry but this is just wrong.  LaTeX has excellent documentation, not
only online but also in physical form, i.e. books.

Maybe it does. I couldn't find it. I spent a lot of time searching for it.

I did find some good tutorials on how to write latex documents. But they really only covered the basic stuff like how to create a section or a table. They stopped long before all my questions were answered (such as, how do I install it?).

Compare this with my docbook experience. About 15 minutes of searching found me the definitive guide and the user guide. These told me exactly what steps I needed to take. And answered pretty much all my questions about authoring, generation, and customisation.

  
- Better looking output

The pdf that latex produces looks dated (and not in a funky retro kind 
of way, just kind of tired and old). I think the docbook generated 
output looks better.
    

Sophistry.  And indeed casuistry.  Just because the default LaTeX fonts
are crap and the default LaTeX styling crap doesn't make the tool crap.
  

I have managed to produce better looking output using docbook in far less time than it has taken using latex. I still don't have acceptably good output using latex.

This is certainly another data point suggesting the crapness of latex.

It takes only a small amount of effort (admittedly relatively expert
effort, but then this is also true of DocBook/XML) to make a LaTeX
document look good

But it doesn't. The effort also includes the work of figuring out how to make the change. And in my experience, this is very high with latex.

 -- cf. "Python for Rookies" by Sarah Mount, James
Shuttleworth, and errr.... me.  Typeset purely in LaTeX by errr... me.
And I am not a LaTeX expert, for that you need Sebastian Rahtz or one of
that crew.

(http://edu.cengage.co.uk/instructors/product.aspx?isbn=1844807010
http://www.amazon.co.uk/Python-Rookies-Sarah-Mount/dp/1844807010/ go on,
you know you want to buy it :-)

  
- More options for customisation

Because docbook is an xml based markup, we have more options for 
transformation and customisation. The docbook stylesheets have 
(apparently) been designed to be easily extended to change the generated 
markup. The HTML markup that they generate is semantically cleaner than 
that produced by tex4ht, and so is easier to style.
    

This is true.  So what repurposing are you expecting other than PDF and
HTML generation?

It would be nice to weave some of the content into the website. For example, chapter 2 would make an excellent 'Gradle in 10 minutes' page. And all of the samples with the associated words would work well assembled as a cookbook/FAQ page.

So I think there are 3 outputs: standalone pdf and html userguides, and the website.

  There is no point in planning for situations that will
not happen.

  

Yeah, like publishing a book from this content.

I suspect the quality of the HTML generation is a factor of the
configuration of the tools not a problem with the tools per se.  Refer
back to the PDF generation issue.  It is not LaTeX that is the problem
here; it is the configuration of LaTeX for this case.

  

But configuring the tool is part of the cost of using the tool. It is very much an important factor for those who have to use it.


  
More importantly, the documentation describes how to do the 
customisations, which cannot be said for the atrocious tex4ht documentation.

I could keep going, but you get the idea.

Thoughts?
    

This reads like advocacy of the new convert.

Quite possibly. My comments are, however, based on doing stuff, rather than just reading about stuff.


Adam
