Thanks for your help, everyone. I need to brush up on my DocBook before I reply in real detail. It's been eons. I did know DocBook quite well back in the day, but at the time was not happy with the available tools. DocBook itself I think is just dandy, but the tools I was using then were a real PITA.
Camille, I'm afraid mine is quite a low-budget operation. However, I'm contemplating using a Kickstarter campaign to finance an initial print run of at least one of my books. If I do that, I expect I could afford to pay for a license for the proprietary version of your tool.

It's been a long time, but I was at one time intimately familiar with the Apache Xerces-C (actually C++) XML DOM API. One approach that I could conceivably take would be to write a C++ program that uses Xerces-C to read my essays one at a time, each into its own DOM, then copy the contents of the XHTML elements into the corresponding DocBook 5 XML elements. For <p> to <para> that would be straightforward, but I haven't looked into the other kinds of elements yet, or attributes.

I just now installed some of the DocBook packages on my Mountain Lion MacBook Pro with MacPorts; however, the docbook-utils package would not install, no doubt due to some configuration bug in its port file. I'll report that via the MacPorts trouble ticket procedure.

Best,
Mike Crawford
[email protected]
http://www.warplife.com/

On Mon, Jul 29, 2013 at 6:25 PM, Richard Hamilton <[email protected]> wrote:
> Hi Mike,
>
> I have had very good luck with Herold (http://www.michael-a-fuchs.de).
>
> I'm usually not fortunate enough to have strict xhtml, so we do some
> pre-processing (usually on well-behaved, but idiosyncratic, html), tidy it up
> into xhtml, then run Herold.
>
> You may find that you need to do some light pre- or post-processing, but for
> us it has never been more than a short XSL stylesheet to do things like
> remove empty paragraphs from the initial XHTML or change the root element in
> the resulting DocBook (the latter can probably be handled by Herold using
> Groovy scripts, but I've learned all the scripting languages I need for the
> time being, so I stick with XSL or Perl-:).
>
> When we build a book, like you're doing, rather than concatenate pieces, we
> keep each file separate, then create a "book" file that uses XInclude to pull
> in the chapters. That simplifies the scripting and makes it easier to move
> parts around in the book.
>
> Regarding the killer feature, if you use the right option (I don't remember
> off-hand, but it's in Bob Stayton's book (http://sagehill.net)), you can get
> exactly what you want for links in the hard copy.
>
> Best Regards,
> Dick Hamilton
> -------
> XML Press
> XML for Technical Communicators
> http://xmlpress.net
> [email protected]
>
>
>
> On Jul 27, 2013, at 6:18 PM, Michael Crawford wrote:
>
>> Greetings, Earthlings,
>>
>> I have some articles and essays that are all marked up with valid XHTML 1.0
>> Strict with CSS, that I would like to publish as bound, dead-tree books,
>> possibly also eBooks.
>>
>> It seems to me that the best way to do that would be to convert each
>> collection of essays into a single DocBook XML document. Can you give me
>> some tips on how to get started? I'm happy to Read The Fine Manual, but
>> there are so many.
>>
>> One such volume, when printed both-sides on US Letter paper, is ~250 pages.
>> The essays range from two to fifty pages.
>>
>> What I _think_ I need to do is to use some manner of XML-to-XML
>> transformation, to strip everything from the beginning of each document, up
>> to and including the opening <body>, then from the closing </body>, to the
>> end of each document....
>>
>> ... then concatenate them all together, with each present XHTML document
>> being a single chapter in the resulting DocBook document...
>>
>> ... then replace HTML-style tags and attributes with DocBook-style: <p> to
>> <para>, for example...
>>
>> ... what would be for me, A Killer Feature, would be to convert each HTML <a
>> href="..."> hyperlink into a DocBook footnote.
>> So where I have this:
>>
>> ===========
>> a long-forgotten <a href="http://www.kuro5hin.org/">cesspool</a> in a
>> far-off corner of the World-Wide Web...
>> ===========
>> would look something like this in hardcopy form:
>>
>> a long-forgotten cesspool[1] in a far-off corner of the World-Wide Web...
>> ----
>> 1. http://www.kuro5hin.org/
>>
>> =========
>>
>> I'd also like to design my own custom stylesheets. I'll ask about that
>> later though. I have a copy of "Android Programming: The Big Nerd Ranch
>> Guide" by Bill Phillips and Brian Hardy. In the Acknowledgements, the
>> authors credit Chris Loper of http://www.intelligentenglish.com/ for his
>> DocBook toolchain.
>>
>> That volume is exquisite. I'd like to design my own volume, not to look the
>> same, but to look as good, with my own personal style.
>>
>> Thanks for any advice you can give me.
>>
>> Mike Crawford
>> [email protected]
>> http://www.warplife.com/
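
For anyone following the thread, the conversion discussed above (keep only the XHTML body, map each <p> to a DocBook <para>, and turn each <a href="..."> into a footnote holding the URL) can be sketched in a few lines. This is only a rough illustration in Python's standard library, not the Xerces-C program or the Herold workflow mentioned above. The names xhtml_to_chapter, convert_inline, and SAMPLE are made up for this example, and the sketch assumes well-formed XHTML with no undefined entities such as &nbsp;. Any inline markup other than links is flattened to plain text here, which a real converter would presumably want to handle properly.

```python
import xml.etree.ElementTree as ET

XHTML_NS = "http://www.w3.org/1999/xhtml"
DOCBOOK_NS = "http://docbook.org/ns/docbook"

# Serialize DocBook elements with a default namespace instead of ns0: prefixes.
ET.register_namespace("", DOCBOOK_NS)


def q(ns, tag):
    """Build the {namespace}tag form that ElementTree expects."""
    return "{%s}%s" % (ns, tag)


def _append_text(dst, last, text):
    """Append text after the last child of dst, or to dst.text if there is none."""
    if not text:
        return
    if last is None:
        dst.text = (dst.text or "") + text
    else:
        last.tail = (last.tail or "") + text


def convert_inline(src, dst):
    """Copy the mixed content of an XHTML element into a DocBook element.

    Each <a href="..."> becomes its link text followed by a
    <footnote><para>URL</para></footnote>. Other inline elements are
    flattened to their text (a simplification made for this sketch).
    """
    dst.text = src.text
    last = None  # most recent child appended to dst
    for child in src:
        href = child.get("href")
        if child.tag == q(XHTML_NS, "a") and href:
            _append_text(dst, last, "".join(child.itertext()))
            footnote = ET.SubElement(dst, q(DOCBOOK_NS, "footnote"))
            ET.SubElement(footnote, q(DOCBOOK_NS, "para")).text = href
            footnote.tail = child.tail
            last = footnote
        else:
            _append_text(dst, last, "".join(child.itertext()) + (child.tail or ""))


def xhtml_to_chapter(xhtml_text, title):
    """Turn one XHTML document into a DocBook 5 <chapter> element."""
    root = ET.fromstring(xhtml_text)
    body = root.find(q(XHTML_NS, "body"))
    if body is None:
        raise ValueError("no XHTML <body> element found")
    chapter = ET.Element(q(DOCBOOK_NS, "chapter"), {"version": "5.0"})
    ET.SubElement(chapter, q(DOCBOOK_NS, "title")).text = title
    for p in body.iter(q(XHTML_NS, "p")):
        convert_inline(p, ET.SubElement(chapter, q(DOCBOOK_NS, "para")))
    return chapter


# Hypothetical input, based on the example sentence quoted in the thread.
SAMPLE = """<html xmlns="http://www.w3.org/1999/xhtml">
<head><title>Example</title></head>
<body>
<p>a long-forgotten <a href="http://www.kuro5hin.org/">cesspool</a> in a far-off corner of the World-Wide Web...</p>
</body>
</html>"""

chapter = xhtml_to_chapter(SAMPLE, "Example")
print(ET.tostring(chapter, encoding="unicode"))
```

Each essay would be run through something like xhtml_to_chapter separately and written to its own chapter file, which fits the one-file-per-chapter arrangement Dick describes. Whether the footnotes then print the way the hardcopy mock-up shows depends on the stylesheets used, which this sketch does not address.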
