Re: [docbook-apps] A little XML-to-XML handholding?

Michael Crawford Tue, 30 Jul 2013 00:02:30 -0700

Thanks for your help, everyone.

I need to brush up on my DocBook before I reply in real detail.  It's
been eons.  I did know DocBook quite well back in the day, but at the
time was not happy with the available tools.  DocBook itself I think
is just dandy, but the tools I was using then were a real PITA.


Camille, I'm afraid mine is quite a low-budget operation.  However,
I'm contemplating using a Kickstarter campaign to finance an initial
print run of at least one of my books.  If I do that, I expect I could
afford to pay for a license for the proprietary version of your tool.

It's been a long time, but I was at one time intimately familiar with
the Apache Xerces-C (actually C++) XML DOM API.  One approach I could
conceivably take would be to write a C++ program that uses Xerces-C to
read my essays one at a time, each into its own DOM, then copy the
contents of the XHTML elements into the corresponding DocBook 5 XML
elements.  For <p> to <para> that would be straightforward, but I
haven't yet looked into the other kinds of elements, or attributes.
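
As a rough sketch of that element-by-element copy, here is what it
might look like using Python's standard xml.etree.ElementTree in place
of Xerces-C (a swapped-in tool, chosen only to keep the example short).
The function names, the "Untitled" fallback and the choice to put the
bare URL inside the footnote's <para> are all assumptions made for
illustration. Only <p> and <a href="..."> are handled; any other inline
markup is flattened to its text, and named entities such as &nbsp;
would likely need extra handling before parsing.

```python
import xml.etree.ElementTree as ET

XHTML_NS = "http://www.w3.org/1999/xhtml"
DOCBOOK_NS = "http://docbook.org/ns/docbook"


def xh(tag):
    """Qualified name of an XHTML element."""
    return f"{{{XHTML_NS}}}{tag}"


def db(tag):
    """Qualified name of a DocBook 5 element."""
    return f"{{{DOCBOOK_NS}}}{tag}"


def append_text(parent, text):
    """Append text at the current end of parent's mixed content."""
    if not text:
        return
    if len(parent):
        last = parent[-1]
        last.tail = (last.tail or "") + text
    else:
        parent.text = (parent.text or "") + text


def convert_inline(src, dst):
    """Copy the text of src into dst, turning each <a href> into a footnote."""
    append_text(dst, src.text)
    for child in src:
        # The text of every inline child stays in the running text.
        convert_inline(child, dst)
        if child.tag == xh("a") and child.get("href"):
            # The URL follows the link text as a DocBook footnote.
            note = ET.SubElement(dst, db("footnote"))
            ET.SubElement(note, db("para")).text = child.get("href")
        append_text(dst, child.tail)


def xhtml_to_chapter(xhtml_text):
    """Build a DocBook <chapter> from one XHTML document (hypothetical helper)."""
    root = ET.fromstring(xhtml_text)
    chapter = ET.Element(db("chapter"), {"version": "5.0"})
    title = ET.SubElement(chapter, db("title"))
    title.text = root.findtext(f"{xh('head')}/{xh('title')}", default="Untitled")
    body = root.find(xh("body"))  # assumes the document has a <body>
    for p in body.iter(xh("p")):
        convert_inline(p, ET.SubElement(chapter, db("para")))
    return chapter


SAMPLE = (
    '<html xmlns="http://www.w3.org/1999/xhtml">'
    "<head><title>An Essay</title></head><body>"
    '<p>a long-forgotten <a href="http://www.kuro5hin.org/">cesspool</a>'
    " in a far-off corner of the World-Wide Web...</p>"
    "</body></html>"
)

# Serialize with DocBook as the default namespace.
ET.register_namespace("", DOCBOOK_NS)
print(ET.tostring(xhtml_to_chapter(SAMPLE), encoding="unicode"))
```

Each essay would go through xhtml_to_chapter() separately, so the
result is one chapter element per XHTML file.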

I just now installed some of the DocBook packages on my Mountain Lion
MacBook Pro with MacPorts; however, the docbook-utils package would not
install, no doubt due to some configuration bug in its port file.
I'll report that via the MacPorts trouble ticket procedure.

Best,

Mike Crawford
[email protected]
http://www.warplife.com/

On Mon, Jul 29, 2013 at 6:25 PM, Richard Hamilton <[email protected]> wrote:
> Hi Mike,
>
> I have had very good luck with Herold (http://www.michael-a-fuchs.de).
>
> I'm usually not fortunate enough to have strict xhtml, so we do some 
> pre-processing (usually on well-behaved, but idiosyncratic, html), tidy it up 
> into xhtml, then run Herold.
>
> You may find that you need to do some light pre- or post-processing, but for 
> us it has never been more than a short XSL stylesheet to do things like 
> remove empty paragraphs from the initial XHTML or change the root element in 
> the resulting DocBook (the latter can probably be handled by Herold using 
> Groovy scripts, but I've learned all the scripting languages I need for the 
> time being, so I stick with XSL or Perl :-).
>
> When we build a book, like you're doing, rather than concatenate pieces, we 
> keep each file separate, then create a "book" file that uses xinclude to pull 
> in the chapters. That simplifies the scripting and makes it easier to move 
> parts around in the book.
>
> Regarding the killer feature, if you use the right option (I don't remember 
> off-hand, but it's in Bob Stayton's book (http://sagehill.net)), you can get 
> exactly what you want for links in the hard copy.
>
> Best Regards,
> Dick Hamilton
> -------
> XML Press
> XML for Technical Communicators
> http://xmlpress.net
> [email protected]
>
>
>
> On Jul 27, 2013, at 6:18 PM, Michael Crawford wrote:
>
>> Greetings, Earthlings,
>>
>> I have some articles and essays that are all marked up with valid XHTML 1.0 
>> Strict with CSS, that I would like to publish as bound, dead-tree books, 
>> possibly also eBooks.
>>
>> It seems to me that the best way to do that would be to convert each 
>> collection of essays into a single DocBook XML document.  Can you give me 
>> some tips on how to get started?  I'm happy to Read The Fine Manual, but 
>> there are so many.
>>
>> One such volume, when printed both-sides on US Letter paper, is ~250 pages.  
>> The essays range from two to fifty pages.
>>
>> What I _think_ I need to do is to use some manner of XML-to-XML 
>> transformation, to strip everything from the beginning of each document, up 
>> to and including the opening <body>, then from the closing </body>, to the 
>> end of each document....
>>
>> ... then concatenate them all together, with each present XHTML document 
>> being a single chapter in the resulting DocBook document...
>>
>> ... then replace HTML-style tags and attributes with DocBook-style: <p> to 
>> <para>, for example...
>>
>> ... what would be for me, A Killer Feature, would be to convert each HTML <a 
>> href="..."> hyperlink into a DocBook footnote.  So where I have this:
>>
>> ===========
>> a long-forgotten <a href="http://www.kuro5hin.org/">cesspool</a> in a 
>> far-off corner of the World-Wide Web...
>> ===========
>> would look something like this in hardcopy form:
>>
>> a long-forgotten cesspool[1] in a far-off corner of the World-Wide Web...
>> ----
>> 1. http://www.kuro5hin.org/
>>
>> =========
>>
>> I'd also like to design my own custom stylesheets.  I'll ask about that 
>> later though.  I have a copy of "Android Programming: The Big Nerd Ranch 
>> Guide" by Bill Phillips and Brian Hardy.  In the Acknowledgements, the 
>> authors credit Chris Loper of http://www.intelligentenglish.com/ for his 
>> DocBook toolchain.
>>
>> That volume is exquisite.  I'd like to design my own volume, not to look the 
>> same, but to look as good, with my own personal style.
>>
>> Thanks for any advice you can give me.
>>
>> Mike Crawford
>> [email protected]
>> http://www.warplife.com/
>>
>>
>

