
Re: [docbook-apps] Upquotes for code

2022-08-07 Thread Mike Maxwell


On 8/7/2022 3:11 AM, Esteban Zimanyi wrote:

However, I wonder whether a better solution would be to translate
docbook inline code to LaTeX \lstinline in the same way that
programlisting is translated to \lstlisting; that would definitely
solve the problem in an easy and efficient way.

Do you think that the docbook community could envision this possibility?


I can't speak for the DocBook community, although my sense is that most 
people on this mailing list use XSL-FO rather than dblatex.  The 
developer of dblatex is Benoît Guillon.  Back when we were using 
dblatex, he was responsive to emails, but the last release I see is 
2016, so I suspect he's no longer actively maintaining it.


That said, the XSLT code that translates DocBook XML into LaTeX is 
included in the dblatex package, so you can edit it to produce whatever 
LaTeX code you want.  We did that in our project, although I confess I 
found XSLT extremely hard to debug.  So yes, you can translate inline
code into \lstinline, and that particular transform sounds like it would
be fairly simple.
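
As a rough illustration of the LaTeX that such a customized transform might emit (a sketch only: the document skeleton and the choice of ! as delimiter are assumptions, and the real output depends on how the dblatex XSLT is edited), compare \texttt with \lstinline:

```latex
\documentclass{article}
\usepackage{textcomp}  % required by the listings option upquote=true
\usepackage{listings}
\lstset{basicstyle=\ttfamily, upquote=true}
\begin{document}
% Inline code as emitted with \texttt: the quotes are set as curly quotes.
one of the values \texttt{'minimal'} (the default)

% Inline code as a customized stylesheet could emit it: straight quotes.
one of the values \lstinline!'minimal'! (the default)
\end{document}
```

\lstinline takes its argument between two identical delimiter characters, so the transform would need to choose a delimiter that does not occur in the code being wrapped.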


   Mike Maxwell
   University of Maryland

--
This email has been checked for viruses by AVG antivirus software.
www.avg.com

-
To unsubscribe, e-mail: docbook-apps-unsubscr...@lists.oasis-open.org
For additional commands, e-mail: docbook-apps-h...@lists.oasis-open.org

Re: [docbook-apps] Upquotes for code

2022-08-06 Thread Mike Maxwell

We used dblatex extensively some years ago, with our own LaTeX 
stylesheets.  My memory is somewhat vague, but I think that you won't 
see the \usepackage{upquote} in the LaTeX output file, because it's 
inside the (in your case) texstyle.sty file--you'll only see a line that 
says

   \usepackage{texstyle}

You should, however, see it loading 'upquote' in LaTeX's .log file.

But more importantly, the documentation for the upquote package 
(https://ctan.math.washington.edu/tex-archive/macros/latex/contrib/upquote/upquote.pdf) 
says "The package does not affect \tt, \texttt, etc."  So I wouldn't 
expect it to force straight quotes inside \texttt{} strings, only inside 
'verbatim' or 'verb' blocks.  The way to force straight quotes inside 
\texttt{} is here:


https://tex.stackexchange.com/questions/257612/adding-straight-quote-marks-to-texttt

although it admittedly requires you to use \textquotesingle{}, which is 
a lot of typing.  (You could of course define a simple macro like 
\tq{} to be \textquotesingle{}.)
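
A minimal sketch of the points above, assuming pdflatex and the textcomp package (which provides \textquotesingle); \tq is only the hypothetical shorthand suggested above, not an existing command:

```latex
\documentclass{article}
\usepackage{textcomp}   % provides \textquotesingle
\usepackage{upquote}    % straightens quotes in \verb and verbatim only
\newcommand{\tq}{\textquotesingle}
\begin{document}
\verb|'minimal'|           % straight quotes, thanks to upquote

\texttt{'minimal'}         % still curly: upquote does not affect \texttt

\texttt{\tq minimal\tq}    % straight quotes via the macro
\end{document}
```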


There are some more complicated solutions here (in that you may have to 
copy-paste more code into your .sty file), which allow you to just 
use the single quote mark:

https://tex.stackexchange.com/questions/436308/changing-all-single-quotes-to-be-straight-when-within-texttt

My personal preference is the XeLaTeX solution, since that allows you to 
use non-ASCII Unicode text in your input files.  dblatex calls XeLaTeX if 
you use

   -b xetex

on the command line.  But you may have no need for that, and pdflatex is 
admittedly faster.


Mike Maxwell
University of Maryland

On 8/6/2022 8:21 AM, Esteban Zimanyi wrote:

I am using dblatex to generate pdf content from docbook source files.

I was able to make programlisting (listings in LaTeX) produce
straight quotes by passing a file named 'texstyle.sty' whose contents
are as follows

%%
%% This style is derived from the manual
%% http://dblatex.sourceforge.net/doc/manual/sec-custom-latex.html
%%
\NeedsTeXFormat{LaTeX2e}
\ProvidesPackage{texstyle}[2017/04/25 PostGIS DocBook Style]

%% Just use the original package and pass the options
\RequirePackageWithOptions{docbook}

%% Make regular quotes within programlisting tags (#3726)
\usepackage{upquote}
\usepackage{listings}
\lstset{upquote=true}


However, this does not work with inline code (texttt in LaTeX). When
compiling with the debug flag

$ dblatex -s texstyle.sty -d mobilitydb-berlinmod.xml

and analyzing the generated file, I can see that there is NO

\usepackage{upquote}

and the content where the problem is looks as follows

 with one of the values \texttt{'minimal'} (the default),
\texttt{'medium'}, 

Any idea how to solve this ?

Thanks for your answer !



Re: [docbook-apps] Language support in XSL-FO Stylesheets

2022-06-23 Thread Mike Maxwell

This is probably not relevant to your cases, but we typeset and 
published (through Mouton) a number of grammars ten or so years ago, 
where the languages being described had unusual scripts: Western Panjabi 
(Nasta'liq variety of Arabic script, hence right-to-left), Bangla 
(Bengali script), Dhivehi (Thaana script, also right-to-left), and 
Pashto (Naskh variety of Arabic script).  But we were using dblatex 
(http://dblatex.sourceforge.net) to convert our DocBook source to 
XeLaTeX (a Unicode-aware version of LaTeX).


   Mike Maxwell
   University of Maryland

On 6/23/2022 10:35 AM, M. Downing Roberts wrote:

Hi Frank,

I can't add much except to say that I also hit a wall trying to generate 
a bilingual book (English and Japanese), and the index in particular was 
very difficult.


I got some help from Bob Stayton, but the only solution was a hack to 
generate the index using another application that I wrote, which 
massaged the XSL-FO.


The problem, including more detail from Bob, is recorded in this GitHub 
issue: https://github.com/docbook/xslt10-stylesheets/issues/238


All best,

M. Roberts

On Thu, Jun 23, 2022 at 11:07 PM Frank Steimke 
<f-stei...@berger-und-steimke.de> wrote:


Dear List Members,

I had already sent this to the docbook list, but this list,
docbook-apps, is probably a better fit.

I am using DocBook for a bilingual book. Most of the content is
written in English, but parts are written in German.
PDF is produced with XSL Stylesheets 1.79.2 shipped with Oxygen 24
and the Antenna House Formatter v7.

Since most content is English, /book/@xml:lang is 'en'. Fragments in
German have @xml:lang='de' at the appropriate level, e.g. for
section or note elements. Sometimes I have phrase or emphasis
elements only because of the @xml:lang attribute.

My observation is that hyphenation is wrong in the PDF document for
the German fragments. I think I have found the reason, but I am
puzzled. There are two issues which I can't understand:

1) There is a template named "language.attribute" in l10n.xsl. It
calculates the language value by looking at the ancestor axis, and
emits an attribute named @lang with that value. *First issue:* the
name of the attribute is wrong; the correct name is @language. See
section 7.10.2 "Language" <https://www.w3.org/TR/xsl11/#language> 
in Extensible Stylesheet Language (XSL) Version 1.1.


2) The template named "language.attribute" is rarely used. *Second
issue:* I had to create a customization layer for the templates that
match d:para or d:simpara, which emit an fo:block element, so
that they call the language.attribute template. The same goes for
d:phrase and d:emphasis in inline.xsl.

Maybe I have missed something obvious. Are there any reasons for
this lack of language support?

Sincerely, Frank Steimke




Re: [docbook-apps] Asciidoc -> docbook -> PDF tooling

2021-11-10 Thread Michael Maxwell


Probably not the solution you're looking for, but we have used dblatex
   http://dblatex.sourceforge.net
(not to be confused with the older db2latex).  For our use 
case--grammars of foreign languages, with mixed languages and scripts 
(Arabic, Bengali, Thaana)--it was probably the only solution.  But if 
you're not familiar with LaTeX, tweaking it might be a steep learning curve.


On 11/9/2021 7:01 AM, Randall Wood wrote:

I am working on an AsciiDoc -> DocBook -> PDF toolchain for an open source project 
(so all tooling must be freely available) because the direct AsciiDoc -> PDF 
toolchain is inadequate for our purposes.

I currently have a java/maven-based AsciiDoc -> DocBook -> FOP -> PDF chain 
within the docbkx-maven-plugin, but would like any suggestions that appear to be better 
maintained and are cross platform.


Randall Wood

--

Mike Maxwell
"Digital objects last forever--or five years,
whichever comes first."  --Jeff Rothenberg


Re: [docbook-apps] Two nagging problems with docbook2pdf in texlive 2019

2021-06-11 Thread Michael Maxwell

> I’ve been chatting, off and on, with Peter Flynn about working
> on a way to use TeX as a formatting back end in the modern era.

dblatex?
   https://pypi.org/project/dblatex/
   http://dblatex.sourceforge.net

In the period 2006 through 2016, we used dblatex with XeLaTeX a *lot*, 
and found it very good.  It did require some tailoring for our purposes, 
in part because we had some additions to the DocBook schema for 
linguistic "stuff", and in part to convert it into the styles we were 
using, one of which was a style conforming to the Mouton language 
grammar series.  Some of the tailoring was in XSLT (yuck, although I 
suppose your mileage may vary), and some was in LaTeX style sheets.

To boast a little, one of the grammars that we typeset with DocBook + 
dblatex + XeLaTeX is here:

https://www.amazon.com/Descriptive-Grammar-Pashto-Its-Dialects/dp/B00XTAT77U/

Unfortunately, the "Look inside" view doesn't let you see our Arabic 
script examples (right-to-left text, of course), but trust me, they're 
pretty :).

It looks like dblatex hasn't been updated for a few years, but in our 
experience it was reasonably mature.  At the very least, it could 
provide a good starting point.

On 6/11/2021 11:00 AM, Norm Tovey-Walsh wrote:

Kevin Dunn  writes:

Thanks, Dave. You were helpful to me 10 years ago. The XEP PDF output
looks pretty nice with the default xsl stylesheet. There are some
fancy things I achieved with dsssl and jadetex, and I'm not sure how

There’s a blast from the past!

I’ve been chatting, off and on, with Peter Flynn about working on a way
to use TeX as a formatting back end in the modern era. But it’s not in
the top couple of reams of the todo list, at the moment.

I expect the future is XML+CSS and that’s what I have in mind for the
xslTNG stylesheets. It would be entirely possible, of course, to
generate XSL-FO, but it feels like custom-HTML output and custom-CSS fed
through Antennahouse would be the shortest path to victory.

PrinceXML would also work that way.

AFAIK, there are no free formatters that take HTML+CSS and produce
results comparable with FOP, which surprises me. (Not that the FOP level
of output would satisfy your requirements; but the lack of reasonable
open source print formatters is one of the things that leads me to
ponder generating LaTeX. You know, like we did in the 90’s when we were
young! :-))

 Be seeing you,
   norm

--
Norman Tovey-Walsh 
https://nwalsh.com/

Linux. Because rebooting is for hardware upgrades.

--
Mike Maxwell
"Digital objects last forever--or five years,
whichever comes first."  --Jeff Rothenberg


Re: [docbook-apps] DocBook XSL: The Next Generation

2020-07-29 Thread Michael Maxwell




On 7/29/2020 11:10 AM, Thomas Schraitle wrote:

Just for clarification: how do we get PDF these days?


I suspect I'm not part of this "we", but for a different "we": we get 
them via dblatex + (Xe)LaTeX.  That gave us the ability to mix 
left-to-right and right-to-left scripts and still get well-typeset 
results, as well as typesetting tables that ran > 1 page (maybe that's 
doable with other methods, not sure), and interlinear text examples 
(which "we" linguists like to use, but are quite difficult to typeset 
without LaTeX tools).


But I suspect most people don't have our needs, so for most of you this 
would be overkill.

--
Mike Maxwell
"I may not remember, but I never forget."
--Social Crimes, Jane Stanton Hitchcock


Re: [docbook-apps] How to make chemical structure automatically numbered

2018-10-12 Thread maxwell


Bernhard--

I'll just comment on one thing, then shut up :-).  It appears you're 
already familiar with LaTeX.  It's possible to convert DocBook to LaTeX 
using the dblatex program (http://dblatex.sourceforge.net/), and you can 
create customizations to take advantage of other LaTeX packages, etc.  
We've used dblatex for over a decade to do grammars; we added a few XML 
schemas + transformations not in standard DocBook for linguistic 
structures.  There was a bit of a learning curve; xslt is definitely my 
un-favorite programming language.  (Andy Black, who may be lurking on 
this list, helped us through the early days.)


   Mike Maxwell
   University of Maryland






Re: AW: [docbook-apps] Capital latin "Q" with dot above

2018-06-15 Thread maxwell


On 2018-06-15 00:48, Frank Steimke wrote:

...
So writing down the combining character sequence is no problem at all,
however it has to be supported by a font in the final PDF file. If the
default font does not work: Have a look at Google Noto Fonts
(https://www.google.com/get/noto/).


SIL's fonts are also very good at supporting standards, including the 
positioning of combining diacritics.  There's a list here:
   
http://scripts.sil.org/cms/scripts/page.php?site_id=nrsi=fontdownloads
Many of these are non-Roman fonts, so not what you're looking for.  But 
the Doulos and Charis fonts have wide coverage of Latin characters (I 
want to say "complete"), and support not only combining diacritics, but 
stacked diacritics.  The SIL fonts are free (in fact the SIL font 
license has become a semi-standard for free fonts).


There are other fonts which did *not* correctly support stacked 
diacritics the last time I looked, such as Linux Biolinum, Linux 
Libertine, Nimbus Sans, Fira Mono (and maybe other Fira fonts), and 
DejaVu.  Of course if you don't have stacked diacritics (=multiple 
diacritics on a single base character), these may be adequate.


   Mike Maxwell
   University of Maryland



RE: [docbook-apps] Capital latin "Q" with dot above

2018-06-12 Thread maxwell


On 2018-06-12 14:14, Jan Tosovsky wrote:

You cannot combine multiple characters to form the composed one. If
there is Unicode character for this combination, you can use it
directly. If it is not displayed correctly, it means your font doesn't
contain this character and you need to switch to the font with broader
character support.


You can if, as jmt pointed out, one of the characters (the dot, in this 
case) is a Unicode combining diacritical mark.  At least you can in 
systems I've used, for most fonts.  Systems I've used in this case means 
XeLaTeX (a Unicode-aware version of LaTeX) and Microsoft Word.  I have 
not used the XSL-FO transform, but hopefully it works there too.
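
For XeLaTeX, a minimal sketch might look like the following. It assumes the file is compiled with xelatex and that a font which positions combining marks well is installed (Charis SIL is used here purely as an example); the combining dot is written as ^^^^0307 so that the example stays ASCII-only.

```latex
% Compile with: xelatex
\documentclass{article}
\usepackage{fontspec}
\setmainfont{Charis SIL}  % assumption: any installed font that positions U+0307 correctly
\begin{document}
% Base letter Q followed by U+0307 COMBINING DOT ABOVE
Q^^^^0307
\end{document}
```

Typing the two characters directly in a UTF-8 source file should be equivalent; whether the dot sits correctly over the Q depends on the font.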


There may of course be Latin fonts where the dot-over is not defined, 
and there might possibly be fonts where it is defined but where it 
doesn't position itself correctly over some base characters 
(particularly where there is more than one diacritic, i.e. stacked 
diacritics).  Fonts should come with a chart that tells what code points 
they handle, but that information can sometimes be difficult to find.


   Mike Maxwell
   University of Maryland


Re: [docbook-apps] Re: Using the DocBook XSLT 2.0 stylesheets with Gradle

2018-03-06 Thread maxwell


On 2018-03-06 10:44, Tony Graham wrote:

On 06/03/2018 14:44, Niels Müller Larsen wrote:
...

I am on Linux, my present web development students on Win or Mac, so
 cross platform is important, and that was all I wanted to comment
on.


After installing Emacs, the second thing that I install on any Windows
system is the Cygwin tools.  With a bash shell running in an Emacs
buffer, it's almost like using a rational operating system.


FWIW, Windows 10 now has the ability to install a Linux sub-system, 
basically a bash prompt.  You can then use apt-get to install lots of 
other standard Linux programs.  GUI apps are not officially 
supported, but the apps I've tried work if you also install an X-windows 
server app.  I used to use Cygwin, but I prefer the Windows Linux 
sub-system.


   Mike Maxwell
   University of Maryland



[docbook-apps] editing localization files

2017-03-29 Thread maxwell

We're using dblatex, which uses some files from the docbook 
distribution, including specifically the localization files.  I noticed 
that the English localization file (en.xml) does not include 
localization terms for 'acknowledgments' or 'Acknowledgements' (although 
nb.xml happens to include it, in English no less).


I could add this as a customization, but it seems like I (or someone) 
should really add it to the stock DocBook files, since 
<acknowledgements> is one of the DocBook elements.  And indeed, up at 
the top of en.xml, it says

-






-
But the latter link is broken (maybe we have an old version), and I 
can't figure out where the source for these things is now stored.  (I 
tried Google...)


If someone can point me to where the source is, I'd be happy to edit it; 
or if there are permission issues, perhaps one of you can edit it to add 
this term.


   Mike Maxwell
   University of Maryland



Re: [docbook-apps] [ANN] XSL Coverage scripts

2016-11-12 Thread maxwell


Sorry, I meant Saxon-HE (same link)

On 11/12/2016 1:13 PM, maxwell wrote:

Looks great, I want to try it!  But does it work with the open source
version of Saxon?  Saxon-CE, described here:
   http://www.saxonica.com/download/opensource.xml
The version # there is quite different from what you give below (6.5.5):
theirs is 9.7.

   Mike Maxwell



Re: [docbook-apps] [ANN] XSL Coverage scripts

2016-11-12 Thread maxwell

Looks great, I want to try it!  But does it work with the open source 
version of Saxon?  Saxon-CE, described here:

   http://www.saxonica.com/download/opensource.xml
The version # there is quite different from what you give below (6.5.5): 
theirs is 9.7.


   Mike Maxwell

On 11/11/2016 10:06 PM, ben.guillon wrote:

Hi Richard,

Yes, it means that: when stylesheets are processed on a given test xml
file, the tool traces the templates applied, and within the called
templates the XSL instructions performed (some instructions can be
unreachable because of conditional processing with xsl:if, xsl:choose).
In the HTML coverage report, the covered lines (that is, the XSL lines
used to transform the XML) have green background, while unused lines
have yellow backgrounds. Clicking on the green lines point to the XML
line(s) processed. The overall statistics for the stylesheets used are
in a coverage index file.

An example of such a report is here:
https://marsgui.github.io/xslcoverage/example/traces/coverage_index.html

Regards,
BG

On Fri, 11 Nov 2016 20:35:37 +0100, Richard Hamilton
<hamil...@xmlpress.net> wrote:


Hi Ben,

This looks interesting, but I’ve got a basic (dumb:-) question. What
do you mean by coverage?

Do you mean test coverage, that is, calculating how much a given test
xml file exercises the stylesheets, or do you mean something else?

Thanks,
Dick Hamilton
---
XML Press
XML for Technical Communicators
http://xmlpress.net
hamil...@xmlpress.net


On Nov 8, 2016, at 15:58, ben.guillon <ben.guil...@gmail.com> wrote:

Hi,

For your information, I've packaged a few python scripts and a java
plugin for Saxon to compute and visualize the coverage of XSL
stylesheets when processed on documents with saxon (currently tested
with saxon 6.5.5).

It's available here:

https://github.com/marsgui/xslcoverage/tree/master

You can look at the result with an example at the end of the readme.

Regards,
BG




Re: [docbook-apps] how should manpage 2nd level headings include quoted code?

2016-08-08 Thread maxwell


On 2016-08-08 16:32, Bob Stayton wrote:

...I seem to recall that while \fP will work in simple
cases to restore the previous font, it does not work when inline font
changes are nested.  For example:

the  sequence  ‘‘\fB...\fR...\fI...\fP...\fP’’

will result in italics afterward instead of bold.  That's because the
font changes are not a true stack, just a single "previous" font.


IIUC, this is the reason that LaTeX (an entirely different typesetting 
system, of course) now discourages the use of the old commands \it and 
\bf, in favor of \textit, \textbf, etc.
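
A small LaTeX illustration of the difference (a sketch, using the article class, which still defines the old \bf and \it commands for compatibility):

```latex
\documentclass{article}
\begin{document}
% Old commands: \it selects a fixed font, so the bold is lost in the inner group.
{\bf bold {\it italic but not bold} bold again}

% NFSS commands: attributes combine, and each group restores the previous font.
\textbf{bold \textit{bold italic} bold again}
\end{document}
```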


   Mike Maxwell


Re: [docbook-apps] Show off what you've done with Docbook

2015-09-14 Thread maxwell


On 2015-09-14 03:07, Eric Streit wrote:

I tried DBLaTeX: the output is better, but it's only docbook 4.


Not sure what you mean here.  You mean you're still working in DB4?  I 
vaguely recall that we started out using dblatex with DB4 before we 
moved to DB5.  At any rate we've been using dblatex with DB5 for a long 
time.  I *think* dblatex still handles DB4, but I haven't tried it for 
years (if at all).



Some of the above problems were solved using DBLaTeX, but the output
needs some adjustments: maybe new styles 


No doubt.  Fortunately and unfortunately, there's a huge number of pages 
on the web about formatting LaTeX documents to look like nearly anything 
you can imagine.  FWIW, I would recommend using xelatex, rather than 
vanilla latex, because xelatex (often referred to as xetex: xetex is to 
xelatex as tex is to latex) handles Unicode UTF-8 out of the box.  
Almost all LaTeX packages can be used in XeLaTeX (the ones that can't 
are mostly ones that do something special with non-unicode characters).  
dblatex will output xe(la)tex-conformant code if you give it the command 
line parameter

-b xetex

   Mike Maxwell


Re: [docbook-apps] Show off what you've done with Docbook

2015-09-13 Thread maxwell


On 9/13/2015 2:45 PM, Warren Block wrote:

The FreeBSD DocBook toolchain almost supports dblatex, but the older
versions did not have all the features we needed for PDFs.  Table of
contents and some other things, as I recall.  We now have a port of the
latest version of dblatex, but it dies mysteriously and needs more
investigation.


I don't recall exactly how we do the table of contents, but as I recall 
it's literally a one-liner.  For our grammars, we additionally have a 
table of tables, a table of figures, and for one document a table of 
equations.  Also a glossary, a bibliography (for which we use bibtex data 
files, together with the Biber program, which has advantages over the 
older bibtex program), and an index.  The glossary and index require the 
appropriate XML elements, and of course the bibliography requires the 
.bib files, but otherwise it's all automagic.
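
In plain LaTeX terms, the pieces listed above correspond roughly to the commands below. This is a generic sketch, not the actual dblatex output or the style sheets described here; grammar.bib and example-key are hypothetical names, and the glossary and table of equations are left out.

```latex
\documentclass{book}
\usepackage[backend=biber]{biblatex}  % bibliography processed by Biber
\addbibresource{grammar.bib}          % hypothetical .bib file
\usepackage{makeidx}
\makeindex
\begin{document}
\tableofcontents
\listoftables
\listoffigures
\chapter{Introduction}
A cited claim \cite{example-key} and an indexed term\index{term}.
\printbibliography
\printindex
\end{document}
```

Building such a file by hand takes a LaTeX run, then biber and makeindex, then further LaTeX runs to resolve the references.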


Most of the customization we've had to do has been on the LaTeX side, 
for which we have two style sheets--one for producing PDFs for use 
on-line, the other for camera-ready copy for the print publisher.


Does your system die during the dblatex phase, or the LaTeX phase?  We 
hardly ever see a failure in the dblatex phase, unless there's actually 
an error in the DocBook XML.

--
Mike Maxwell
maxw...@umiacs.umd.edu
"I cannot believe that our existence in this universe
is a mere quirk of fate, an accident of history, an
incidental blip in the great cosmic drama. Our
involvement is too intimate. The physical species
Homo may count for nothing, but the existence of
mind in some organism on some planet in the universe
is surely a fact of fundamental significance. Through
conscious beings the universe has generated
self-awareness." --Paul Davies


Re: [docbook-apps] Show off what you've done with Docbook

2015-09-13 Thread maxwell


On 9/13/2015 12:19 PM, Warren Block wrote:

On Sun, 13 Sep 2015, Gerard Nicol wrote:

2. The complexity of customizing the default look of the documents,
which look OK, but don't look as good as they would need to be to be
put in front of a client.


That is difficult.  Based on my unscientific sampling, XSL is not
well-regarded.  As languages go, it's rarely used and poorly understood.
And then there is the separation between print and other media, which
usually means also dealing with XSL-FO and Fop.  Fop has its own set of
problems.


FWIW, there is (at least) one alternative: dblatex 
(http://dblatex.sourceforge.net/).  No XSL-FO, rather it uses XSLT to 
convert a DocBook doc to a LaTeX doc, then you run LaTeX.  We've used it 
for book-length grammars (with a couple additions to support things that 
linguists need).  The grammars are (if I may say so) nicely formatted, 
support right-to-left text, etc.

--
Mike Maxwell
maxw...@umiacs.umd.edu
"I cannot believe that our existence in this universe
is a mere quirk of fate, an accident of history, an
incidental blip in the great cosmic drama. Our
involvement is too intimate. The physical species
Homo may count for nothing, but the existence of
mind in some organism on some planet in the universe
is surely a fact of fundamental significance. Through
conscious beings the universe has generated
self-awareness." --Paul Davies


Re: [docbook-apps] Visual DocBook editors

2015-05-23 Thread maxwell


On 5/23/2015 6:18 AM, Dew, Simon wrote:

The free (GPL) version of Serna 4.4 is still available:


I believe there's also a free older version of XMLMind available 
(www.xmlmind.com).  I don't find any reference to it on their website, 
but there's some info on getting it here:

   http://www.xlingpaper.org/?page_id=51
XLingPaper is an alternative XML schema to DocBook, designed for 
linguists.  But the editor itself is the XMLmind XML editor, XXE for short.


It's also possible to download a temporary evaluation copy of the 
current version of XXE, but that presumably is not what you want.

--
   Mike Maxwell
   What good is a universe without somebody around to look at it?
   --Robert Dicke, Princeton physicist


Re: [docbook-apps] Strip docbook-5 to content only

2014-03-25 Thread maxwell


On 2014-03-25 03:42, davep wrote:

  I'm tempted to ask why you didn't use XSLT but I won't <grin/>


or xml_grep --text_only

   Mike Maxwell



Re: [docbook-apps] Syntax highlighting

2013-12-07 Thread Mike Maxwell


On 12/7/2013 5:05 PM, Frank Arensmeier wrote:

Actually, there is a renderer that is able to generate PDF from HTML
with really good results:
Wkhtmltopdf (https://code.google.com/p/wkhtmltopdf/).


There's also htmltolatex (with LaTeX to create the PDF):

   http://htmltolatex.sourceforge.net/

I'm not sure how it handles character encodings; I would guess that it
could be made to preserve UTF-8 (rather than substituting special LaTeX
names), in which case you could use XeLaTeX to produce the PDF.

--
Mike Maxwell
maxw...@umiacs.umd.edu
My definition of an interesting universe is
one that has the capacity to study itself.
--Stephen Eastmond


Re: [docbook-apps] GSoC Project Idea: integrated LaTeX output support for the stylesheets

2013-04-14 Thread Mike Maxwell


On 4/5/2013 4:40 PM, Gábor Kövesdán wrote:

Em 05-04-2013 22:16, maxwell escreveu:

One obvious disadvantage is that we've needed to understand LaTeX (not plain TeX), since many
of the tweaks rely on changes to our LaTeX style sheets, or alternative LaTeX packages. But
this has been an advantage at the same time, since the LaTeX typesetting is quite mature and
has handled everything we've thrown at it.

Someone mentioned that dblatex uses Python.  The amount of Python code in dblatex is quite
small, and I've never had to do anything with it. The xslt code, otoh, I've had to deal with
extensively, although most of that has had to do with odd things we're doing with the alignment
of right-to-left text, and some linguistic data structures we've added to the standard DocBook
structures.


Having this solution, would you still be interested in a more integrated solution that follows
the conventions of the DocBook XSL stylesheets and allows tuning with parameters and easy
customization? Are there any serious problems in dblatex that aren't solved for you? Any
functionality that you would prefer to be implemented in a different way?


Sorry to be late responding; my excuse is that I got married the day after this 
email.

dblatex does have some parameters for tuning, see 
http://dblatex.sourceforge.net/doc/manual/sec-params.html.  My impression is that dblatex directly 
implements many of the DocBook XSL parameters, and that dblatex handles internationalization the 
same way that DocBook XSL does (although I could be mistaken).


I haven't run into any serious problems in dblatex.  The only thing I would prefer is that the 
longer xslt functions be broken into smaller ones.  When I do a major customization, I have to copy 
and change an entire function, and when the function is long, that's a nuisance.  And if the author 
of dblatex changes that function in a later version, then I'll have to hunt down the diffs between 
my changed copy and the function I copied from, then make those changes in the newer version. 
That's pretty much unavoidable given the way xslt works, but it's easier with small functions.


Or of course if it were done with Python instead of xslt, I'd be still happier, because I understand 
Python reasonably well.  Xslt, otoh, always trips me up when I try to do even simple things.  But I 
guess that's just my personal experience!

--
Mike Maxwell
maxw...@umiacs.umd.edu
My definition of an interesting universe is
one that has the capacity to study itself.
--Stephen Eastmond


RE: [docbook-apps] GSoC Project Idea: integrated LaTeX output support for the stylesheets

2013-04-05 Thread maxwell


On 2013-04-05 12:19, honyk wrote:
I joined a company that used dblatex. My task was to create outputs that meet
the corporate identity. I very quickly switched to XSL-FO and a commercial
XSL-FO processor. I still believe this was the only solution to cope with
that.

If you need huge customizations in TeX based solutions, you need deep
knowledge of both the XSLT and TeX parts of the production workflow, as you
usually have to customize both of them.


FWIW, we've used the dblatex solution for five years now, and are very 
pleased with it.  We chose it, rather than the XSL-FO route, because 
we're producing grammars of languages that have mixed left-to-right and 
right-to-left text (and not just garden-variety Arabic script).  At the 
time we made this choice, I don't believe XSL-FO supported right-to-left 
text well.  Of course, that probably matters not at all in your 
situation!  (BTW, we're actually using XeLaTeX, a Unicode-aware version 
of LaTeX.)


One obvious disadvantage is that we've needed to understand LaTeX (not 
plain TeX), since many of the tweaks rely on changes to our LaTeX style 
sheets, or alternative LaTeX packages.  But this has been an advantage 
at the same time, since the LaTeX typesetting is quite mature and has 
handled everything we've thrown at it.


Someone mentioned that dblatex uses Python.  The amount of Python code 
in dblatex is quite small, and I've never had to do anything with it.  
The xslt code, otoh, I've had to deal with extensively, although most of 
that has had to do with odd things we're doing with the alignment of 
right-to-left text, and some linguistic data structures we've added to 
the standard DocBook structures.


   Mike Maxwell
   University of Maryland


Re: [docbook-apps] Spell and grammar checking DocBook/ XML documents

2012-10-21 Thread Mike Maxwell


On 10/18/2012 10:33 AM, daniel.ke...@finaris.de wrote:

I'm probably not the first one to ask this question, but I still need to ask: Is there any
efficient, mostly MS-Word-like method to spell and grammar check XML documents and/or their
transformation results (HTML, PDF, etc.)?


I don't recall seeing the following answer: XMLmind (xmlmind.com), which is a semi-WYSIWYG DocBook 
(and DITA) editor, has a built-in spell checker.  It defaults to English, but other languages can be 
selected.  There is an option in a config file to skip certain elements (like program listings, I 
suppose; I haven't tried it).  There is no grammar checker, but then I've never seen a grammar 
checker that I would consider worth using.


Disclaimer: we use XMLmind a lot, and are happy with it.
--
Mike Maxwell
maxw...@umiacs.umd.edu
My definition of an interesting universe is
one that has the capacity to study itself.
--Stephen Eastmond


Re: [docbook-apps] Small!! Lightweight!! xslt processor which is standalone!! and runs Docbook/XSL stylesheets?

2012-08-17 Thread maxwell

On Fri, 17 Aug 2012 16:31:39 +0200, Dan Shelton dan.f.shel...@gmail.com
wrote:
 No, because the package MUST be self-contained, which means it must
 have all parts in one source bundle to work on machines which are
 behind firewalls or more likely even an intranet with no connection to
 the Internet. The only requirement is a working C89 compiler.
 
 We already checked xsltproc and it can completely be ruled out because
 it will require almost 80MB of extra source code to meet its minimum
 dependencies.

I sympathize.  And I find xslt extremely counter-intuitive--every time I
work with it, I spend hours trying to figure out how to do something that seems
like it should be trivial.  I guess if I used xslt every day...but I don't
feel that way about other computer languages that I use on an occasional
basis.

I don't suppose an alternative solution would work--doing it in Python or
some other language that implements SAX or DOM?

   Mike Maxwell


Re: [docbook-apps] WYSIWYG Editor for docbook

2012-08-13 Thread maxwell

On Mon, 13 Aug 2012 17:21:55 +0100, Paul Taylor paul_t...@fastmail.fm
wrote:
 I'm giving Oxygen a go; I found it a bit difficult to use at first but am now
 getting the hang of it.
 One advantage it does seem to have is that not only can you work with DocBook
 XML, it's also set up to generate HTML/PDF etc. from the XML with minimum
 effort.

XMLMind (http://www.xmlmind.com) has also been mentioned.  We use it in
our projects with a slightly modified version of DocBook v5 (we added a
couple additional constructs, which was not a hard task).  There's a slight
initial learning curve--I don't think there's any DocBook editor which is
truly wysiwyg--but our writers and editors have adapted well.  There is a
free version as well as a licensed version; the free version is quite
capable, in fact there's very little that the professional version adds. 
(We do use the pro version, in part because we need its capability to
interact with a WebDAV server for purposes of using svn.)

   Mike Maxwell
   University of Maryland


Re: [docbook-apps] WYSIWYG Editor for docbook

2012-08-13 Thread maxwell

On Mon, 13 Aug 2012 23:05:29 +0200, Jirka Kosek ji...@kosek.cz wrote:
 On 13.8.2012 21:00, Jeff Chimene wrote:
 
 Not exactly the answer you're looking for, but to hijack the thread -
 have HTML 5 + CSS3 sufficiently advanced the art that WYSIWYGness can
 be achieved?
 
 Sure, for example http://xopus.com/

I don't know enough about how these things work, but what happens if you
resize your browser window--does the HTML change width to fit, or do you
get a scroll bar if the window is narrower than the anticipated text
pane width?  (I personally hate websites that do that, I'd much prefer that
they wrap, but ymmv.)

   Mike Maxwell




Re: [docbook-apps] Are docbook xsl up-to-date?

2012-03-29 Thread maxwell

On Thu, 29 Mar 2012 09:15:38 -0700, Bob Stayton b...@sagehill.net
wrote:
 As a workaround, I would suggest avoiding putting indexterms inside inline
 elements inside footnotes.  Instead, put the indexterm just before or after
 the inline element.

FYI, until recently indexterms were not allowed inside footnotes at all:
   http://www.docbook.org/tdg5/en/html/footnote.html
They are allowed with v5.1:
   http://www.docbook.org/tdg51/en/html/footnote.html

   Mike Maxwell


Re: [docbook-apps] dblatex and FO

2012-03-25 Thread Mike Maxwell


On 3/25/2012 2:44 PM, Bob Stayton wrote:

I have come across a Makefile that is trying to apply dblatex to an FO file. The Makefile command
specifies the .fo file (which was generated from xsltproc in another step) as the input, and a
.pdf file as the output. I'm not a dblatex user, but it is my understanding that it operates on
the original XML, not XSL-FO output, to generate a PDF. Is this Makefile in error?


You're correct, it does not have a command line parameter for .fo input, just xml or sgml.

I just checked that with the current version, 0.3.2.  Perhaps an older version used .fo input. 
There was also an older program called db2latex, I believe.  It might be that db2latex used .fo 
files as input; I've never used it.

--
Mike Maxwell
maxw...@umiacs.umd.edu
My definition of an interesting universe is
one that has the capacity to study itself.
--Stephen Eastmond


Re: [docbook-apps] DocBook - PDF paths that aren't hugely expensive?

2012-03-22 Thread maxwell

On Thu, 22 Mar 2012 03:11:37 -0700, Robin Lee Powell
rlpow...@digitalkingdom.org wrote:
 - dblatex is hugely brittle (trust me on this)

I guess I don't trust you on this.  We've been using dblatex for several
years now on book-length grammars, with excellent results for articles,
reports, and books.  It's true that we don't exercise all the DocBook
elements (we don't use most of the computational elements, for example),
but I would be surprised to find dblatex having significant problems there.
Between the command-line parameters to dblatex (which can be put into a
.xsl file) and LaTeX style sheets, there's not much you can't tweak.

   Mike Maxwell
   University of Maryland


Re: [docbook-apps] info problems

2012-02-06 Thread maxwell

On Sun, 05 Feb 2012 09:26:24 -0500, Mike Maxwell maxw...@umiacs.umd.edu
wrote:
 Yes, that's what it's doing--it's complaining about the second case 
 above, and singling out the info element...  
 
 That eliminates one explanation; I'll look into others.  Thanks!

It was my dumb error.  Sorry 'bout the noise.

   Mike Maxwell


Re: [docbook-apps] info problems

2012-02-05 Thread Mike Maxwell


On 2/5/2012 2:27 AM, Richard Hamilton wrote:

The limitation is that you can't have a title both inside <info> and
directly under the same element, so the following is invalid:

<book> <title>...</title> <info><title>...</title></info> ...
</book>

This, however, is valid:

<book> <info><title>...</title></info> ... <chapter>
<title>...</title> <info>other info stuff</info> ... </chapter>
</book>


Are you saying that XMLmind complains about the second case above?


Yes, that's what it's doing--it's complaining about the second case 
above, and singling out the info element.  Now that I think about it, 
I guess if the problem had been what I thought (the second structure 
above was invalid), it would have complained about the title elements 
in the chapters instead, assuming it parses top-down.


That eliminates one explanation; I'll look into others.  Thanks!
--
Mike Maxwell
maxw...@umiacs.umd.edu
My definition of an interesting universe is
one that has the capacity to study itself.
--Stephen Eastmond


[docbook-apps] Processing cells in CALS tables

2011-09-12 Thread maxwell

I'd like to apply an XSL transformation to my DocBook (CALS-style) tables
which outputs each cell of each table, together with the label of its
row(s) and column(s).  That is, for a table that looks like

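            | Col1Label  |  Col2Label |
------------|------------|------------|
Row1Label   |   Cell1    |  Cell2     |
------------|------------|------------|
Row2Label   |   Cell3                 |
Row3Label   |                         |
------------|-------------------------|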
the transform would output something like
...
<Entry>
   Cell1
   <RowLabels>
      <RowLabel text="Row1Label"/>
   </RowLabels>
   <ColLabels>
      <ColLabel text="Col1Label"/>
   </ColLabels>
</Entry>
<Entry>
   Cell2
   <RowLabels>
      <RowLabel text="Row1Label"/>
   </RowLabels>
   <ColLabels>
      <ColLabel text="Col2Label"/>
   </ColLabels>
</Entry>
<Entry>
   Cell3
   <RowLabels>
      <RowLabel text="Row2Label"/>
      <RowLabel text="Row3Label"/>
   </RowLabels>
   <ColLabels>
      <ColLabel text="Col1Label"/>
      <ColLabel text="Col2Label"/>
   </ColLabels>
</Entry>
...

Before I go off and try to do something like that, is there any existing
transform that does that?  I'd prefer XSLT or Python, but I'd accept other
solutions.

BTW, the application is that we're writing grammars in DocBook, and if a
table represents a paradigm, then its entries--the forms of the
paradigm--are test cases for our parser.  I can extract the test cases
easily, but it would be nice to also extract an indication of the expected
parse, which is partly indicated by the row and column labels.  I have
thought of going in the opposite direction--that is, creating an XML
structure for paradigms, then automatically converting that into a DocBook
table.  But there are other problems with that.
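
One way the extraction described above might be approached in Python is sketched below. It is not an existing transform, just an illustration under several assumptions: the column labels sit in thead, each row label is a tbody entry occupying only the first column, and spans are expressed with the CALS attributes namest, nameend and morerows. The function names and the sample table are made up.

```python
import xml.etree.ElementTree as ET


def local(tag):
    """Strip any '{namespace}' prefix so DocBook 4 and 5 tables both work."""
    return tag.rsplit("}", 1)[-1]


def children(elem, name):
    return [c for c in elem if local(c.tag) == name]


def cell_text(entry):
    return " ".join("".join(entry.itertext()).split())


def place_rows(rows, colnames):
    """Return (text, row_indices, col_indices) for every entry, honouring
    namest/nameend (column spans) and morerows (row spans)."""
    occupied = set()
    cells = []
    for r, row in enumerate(rows):
        col = 0
        for entry in children(row, "entry"):
            while (r, col) in occupied:  # skip slots filled from rows above
                col += 1
            namest, nameend = entry.get("namest"), entry.get("nameend")
            first = colnames.index(namest) if namest else col
            last = colnames.index(nameend) if nameend else first
            rs = list(range(r, r + int(entry.get("morerows", "0")) + 1))
            cs = list(range(first, last + 1))
            occupied.update((i, j) for i in rs for j in cs)
            cells.append((cell_text(entry), rs, cs))
            col = last + 1
    return cells


def labelled_cells(table_xml):
    root = ET.fromstring(table_xml)
    tgroup = next(e for e in root.iter() if local(e.tag) == "tgroup")
    colnames = [c.get("colname") for c in children(tgroup, "colspec")]
    head = place_rows(children(children(tgroup, "thead")[0], "row"), colnames)
    body = place_rows(children(children(tgroup, "tbody")[0], "row"), colnames)
    col_label = {c: text for text, _, cs in head for c in cs}
    # Assumption: a body entry sitting only in column 0 is a row label.
    row_label = {r: text for text, rs, cs in body if cs == [0] for r in rs}
    return [
        (text,
         [row_label[r] for r in rs if r in row_label],
         [col_label[c] for c in cs if c in col_label])
        for text, rs, cs in body
        if cs != [0]
    ]


# Hypothetical table with the same shape as the example in this message.
sample = """<informaltable><tgroup cols="3">
<colspec colname="c0"/><colspec colname="c1"/><colspec colname="c2"/>
<thead><row><entry/><entry>Col1Label</entry><entry>Col2Label</entry></row></thead>
<tbody>
<row><entry>Row1Label</entry><entry>Cell1</entry><entry>Cell2</entry></row>
<row><entry>Row2Label</entry>
     <entry namest="c1" nameend="c2" morerows="1">Cell3</entry></row>
<row><entry>Row3Label</entry></row>
</tbody></tgroup></informaltable>"""

for cell in labelled_cells(sample):
    print(cell)
```

Real tables may name spans with spanspec/spanname or nest tables inside entries; neither is handled in this sketch.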

   Mike Maxwell


Re: AW: [docbook-apps] ragged index with recent fop snapshots

2011-08-01 Thread Mike Maxwell


On 8/1/2011 3:45 AM, Markus Hoenicka wrote:

I assume that dblatex and/or teTeX do not have any provisions to handle
UTF-8 encoded XML files automagically. This may be the point where my
problems started.


Dunno about teTeX, but dblatex is fine with UTF-8--we use it that way 
exclusively, and then use XeLaTeX (a Unicode-aware version of LaTeX) to 
process the resulting file.



I'm currently using two non-Unicode fonts that still have all required
glyphs. It required a bit of testing though, but many fonts actually
have a full set of greek characters and those few symbols that my
document uses.


If you're using non-Unicode fonts, then you'll need LaTeX rather than 
XeLaTeX.  I'm not sure how (or whether) dblatex handles the conversion 
in that case--it wouldn't be something a standard encoding converter 
could handle, because you'd have to convert the non-ASCII (or at least 
non-ISO) characters into some kind of LaTeX commands, I imagine.  I've 
never tried that.

--
Mike Maxwell
maxw...@umiacs.umd.edu
My definition of an interesting universe is
one that has the capacity to study itself.
--Stephen Eastmond


Re: AW: [docbook-apps] ragged index with recent fop snapshots

2011-07-30 Thread Mike Maxwell


On 7/30/2011 5:47 PM, markus.hoeni...@mhoenicka.de wrote:

1) I gave dblatex a whirl. It took a while to install missing packages
from CTAN until dblatex ran at all. Eventually I got stuck when
(La)TeX told me that a glyph was missing from a font that it intended
to use. My document uses quite a few greek characters and other
symbols, but I wouldn't expect LaTeX to have a problem with that. As it
wasn't apparent to me how to move on from there, I had to give up.


Sounds like you found another solution, but for the record:

The easiest way to get all the LaTeX packages you might need to have 
installed is to install the latest TeX Live distro:

   http://www.tug.org/texlive/
It's big, but compared to hard disks these days, no worries.

As for the font issue: I'm not sure whether you were using LaTeX (8-bit 
pre-Unicode characters) or XeLaTeX (Unicode compliant version).  (Both 
come in the TeX Live distro.)  On the assumption that your XML was 
Unicode, you should have been using XeLaTeX.  A good Unicode font (and 
it's nice looking, too) is the Charis SIL font:

   http://scripts.sil.org/CharisSILfont
It includes regular, bold, italic, bold italic, small caps, tons of 
diacritics, IPA etc.  It does not however cover Greek characters.


I'm sure there are Unicode fonts that cover both Roman and Greek 
characters (Arial Unicode does, but is probably not what one would want 
for typesetting).


The more general solution is to tag the non-Roman strings in XML (either 
manually or by a script), and create a small XSL transform that tags 
them for the font when converting to XeLaTeX.  We do that for our 
grammars, which routinely mix in strings in Perso-Arabic scripts, 
Bengali script, etc.  Or you could run a script over the XeLaTeX output 
by dblatex and font-tag the non-Roman strings directly.


You might also want to use the Polyglossia package
   http://www.tex.ac.uk/ctan/macros/xetex/latex/polyglossia/
to provide language-specific hyphenation, etc.

And if someone wants help, there's a XeTeX mailing list:
   http://www.tug.org/mailman/listinfo/xetex
--
Mike Maxwell
maxw...@umiacs.umd.edu
My definition of an interesting universe is
one that has the capacity to study itself.
--Stephen Eastmond


Re: [docbook-apps] db 5, para formatting, fo output

2011-07-26 Thread maxwell

On Tue, 26 Jul 2011 22:01:31 +0200, Křištof Želechovski
giecr...@stegny.2a.pl wrote:
 What is the purpose of including such long URL in a printed document?
 Do you expect your readers to actually type it (without errors) when it is
 wider than the printed page?

PDF?


Re: AW: [docbook-apps] ragged index with recent fop snapshots

2011-07-18 Thread Mike Maxwell


On 7/18/2011 3:46 AM, Markus Hoenicka wrote:

On Mon, 18 Jul 2011 08:19:33 +0200, robert.buer...@bmw.de wrote:

we are using dblatex with which we never had such problems.
Maybe you give it a try.


this may be an option although I never tried if dblatex handles
citations and reference lists well enough for my purposes. I keep my
references in a SQL database and use RefDB to provide them as raw
DocBook bibliomixed entries.


dblatex does well with citations.  As for the reference lists, it looks 
like RefDB outputs as BibTeX as well.  We use dblatex +XeLaTeX with 
BibTeX for the references (rather than DocBook entries), and the ff. 
processing instruction:

 <?bibtex bibfiles="References" bibstyle="sp" mode="all"?>
(which pulls in a BibTeX file References.bib).  Then we can take 
advantage of the diversity of LaTeX packages for formatting the 
bibliography.  The result is more than adequate for our purposes.  The 
one drawback might be that BibTeX doesn't handle Unicode sorts well, 
although I think that only bites you if you have non-Roman scripts in 
some of your authors' names.  (It seems to handle accented Roman scripts 
IIRC, but I haven't tested this thoroughly.)

--
Mike Maxwell
maxw...@umiacs.umd.edu
My definition of an interesting universe is
one that has the capacity to study itself.
--Stephen Eastmond


Re: AW: [docbook-apps] ragged index with recent fop snapshots

2011-07-18 Thread maxwell

Markus Hoenicka wrote:
 At first sight, I didn't find any information about SVG 
 graphics though. Are they known to work?

I don't know.  I expect if there's a problem, it would be with LaTeX (or
XeLaTeX), not with dblatex.  As of a few years ago, svg and Xe(La)TeX were
not compatible:
   http://tug.org/pipermail/xetex/2008-June/010185.html
Doesn't look like the situation's much better in vanilla LaTeX; the
consensus seems to be that you need to convert a SVG graphic to PDF before
including it in the LaTeX source.  The same work-around would work in
XeLaTeX, too.

   Mike Maxwell


Re: [docbook-apps] Alternatives to MS Arial Unicode for PDF output?

2011-07-08 Thread maxwell

On Fri, 8 Jul 2011 12:05:56 EDT, deannel...@aol.com wrote:
 We use DejaVu fonts that are freely available from 
 http://dejavu-fonts.org/
  
 They have several unicode fonts that incorporate almost all of the unicode
 sets. 
 ...
 From: Ron Catterall  r...@catterall.net
 ...
 the Arial Unicode MS font can be downloaded for  free.
 ...
 This font was useful for fop, to  produce PDFs of documents containing,
 for example, Chinese or Japanese  characters.

AFAIK, the DejaVu fonts don't include any CJK characters.  In fact, there
are few if any fonts that cover the entire Unicode range, or even most of
it.  (The only one I know of simply puts up a box containing the code point
of the character--not any glyphs for the character.)  And once you get
beyond Latin or Cyrillic characters, many of the character sets place
extraordinary demands on the rendering system.  Arabic scripts are
connected, and the Nasta'liq versions of Arabic scripts even more so;
glyphs flip over preceding glyphs in many Indic languages, or show up on
both the left and the right of a preceding glyph; and so on.  It's hard to
find a good font for any one such script, much less a font that covers all
or most of them.

That said, you can look here:
   http://en.wikipedia.org/wiki/Unicode_typefaces

I think the right solution is to use multiple fonts for a document that
contains multiple scripts.  We have that problem with multilingual
documents, and it's reasonably (not completely) straightforward to tag
sequences of characters in this or that Unicode block for the font that
they should use.  (The tags will be dependent on your typesetting system,
of course.)  
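
To make the tagging idea concrete, here is a rough sketch in Python. It is not the code we use; it guesses a character's script from the first word of its Unicode name, lets neutral characters (spaces, punctuation, digits) ride along with the preceding run, and wraps each non-default run in a DocBook phrase element. The role values and the function names are hypothetical, and a typesetting-specific tag could be emitted instead.

```python
import unicodedata
from xml.sax.saxutils import escape


def script_of(ch):
    """Crude script guess: the first word of the Unicode character name.
    Returns None for characters with no script of their own."""
    if not ch.isalpha():
        return None
    return unicodedata.name(ch, "UNKNOWN").split()[0]


def script_runs(text, default="LATIN"):
    """Split text into (script, substring) runs."""
    runs = []
    for ch in text:
        script = script_of(ch)
        if runs and script in (None, runs[-1][0]):
            runs[-1][1] += ch  # same script, or a neutral character
        else:
            runs.append([script or default, ch])
    return [(script, chunk) for script, chunk in runs]


def tag_runs(text, default="LATIN"):
    """Wrap every run that is not in the default script in a phrase element."""
    out = []
    for script, chunk in script_runs(text, default):
        if script == default:
            out.append(escape(chunk))
        else:
            body = chunk.rstrip()
            tail = chunk[len(body):]  # keep trailing spaces outside the tag
            out.append(
                f'<phrase role="{script.lower()}">{escape(body)}</phrase>{tail}'
            )
    return "".join(out)


print(tag_runs("see \u03b1\u03b2\u03b3 here"))
```

Attaching neutral characters to the preceding run is a simplification; punctuation at the boundary between two scripts is exactly where a real implementation needs more care.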

   Mike Maxwell


Re: [docbook-apps] Writing mode, xsl-fo output

2011-04-01 Thread maxwell

On Fri, 1 Apr 2011 10:40:16 -0700, Bob Stayton b...@sagehill.net
wrote:
 But when you say some rl-tb text, do you mean a mixed language document?
 In that case, the writing mode value should be for the dominant language,
 since the document's writing mode determines the page layout.
 Any inline translated text should get the
 correct text direction based on its Unicode character range.

That last sentence--that the writing direction can be determined by
inspecting the characters--is a common intuition (it was once my own
intuition).  But it isn't quite that simple, since some symmetrical
punctuation marks belong sometimes to L2R text, and sometimes to R2L text. 
For example, an ASCII period at the end of a run of R2L text might belong
at the left end of the R2L text, or--if the R2L text is at the end of an
L2R text--it might belong at the right end of the L2R text (and therefore
at the right end of the R2L text).  

Asymmetrical punctuation marks sometimes exist as distinct L2R and R2L
code points in Unicode, like the ASCII comma vs. the Arabic comma U+060C. 
But parentheses (which of course are asymmetrical) are also sometimes used
inside runs of R2L text--I've seen them in Urdu, for example.  Here I
believe the ASCII open parenthesis is used as an Urdu close paren, and vice
versa.

Space characters of course also fall into this category of ambiguous
direction, although that's generally handled correctly by algorithmic
methods.

There's been considerable discussion of this general issue (whether it's
possible to algorithmically determine the ends of an R2L run inside an L2R
run, or vice versa) over on the XeTeX mailing list.  The opinion of Those
Who Know seems to be that it is not 100% decidable.
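
The ambiguity is visible in the Unicode character properties themselves. As a small illustration (the helper name `is_strong` is made up), Python's unicodedata module reports the bidirectional class of each character: letters carry a strong direction (L, R or AL), while the full stop, the commas, the parentheses and the space do not, so their placement has to be resolved from context.

```python
import unicodedata

STRONG = {"L", "R", "AL"}  # bidi classes that carry their own direction


def is_strong(ch):
    """True if the character has an inherent writing direction."""
    return unicodedata.bidirectional(ch) in STRONG


samples = {
    "Latin a": "a",
    "Hebrew alef": "\u05d0",
    "Arabic beh": "\u0628",
    "full stop": ".",
    "Arabic comma": "\u060c",
    "left parenthesis": "(",
    "space": " ",
}

for label, ch in samples.items():
    print(
        f"{label:17} U+{ord(ch):04X}"
        f"  class={unicodedata.bidirectional(ch):3}"
        f"  strong={is_strong(ch)}"
        f"  mirrored={bool(unicodedata.mirrored(ch))}"
    )
```

The mirrored flag on the parenthesis is what allows the same code point to be drawn as an opening or a closing bracket depending on the direction of the run it ends up in.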

   Mike Maxwell


Re: [docbook-apps] Writing mode, xsl-fo output

2011-04-01 Thread maxwell

On Fri, 1 Apr 2011 20:38:13 +0100, Dave Pawson da...@dpawson.co.uk
wrote:
 I think I would rather specify what I'm writing rather than
 leave it to the code point.
 
 Although I'm unsure who / what would do that? The formatter?

There's a general DocBook attr @dir, see:
   http://www.docbook.org/tdg5/en/html/ref-elements.html
For spans of text within e.g. a paragraph, you might use this attr on a
phrase.

That said, I'm not sure how well the formatting tools use this.  We use
dblatex, and I don't recall how well this attr is supported; there's this
comment in the file dblatex-0.3/xsl/common/l10n.xsl:  

   <!-- FIXME: This is sort of hack, but it was the easiest 
way to add at least partial support for dir attribute
   -->

There are also two Unicode chars for this purpose, see:
   http://www.w3.org/TR/WCAG-TECHS/H34.html

In our own grammar work, we have added a few elements in our DocBook
localization for text in right-to-left languages, which of course
necessitated our writing some special XSLT code for the conversion to
XeTeX.

   Mike Maxwell



Re: [docbook-apps] programlisting page breaks

2011-03-15 Thread maxwell

On Tue, 15 Mar 2011 07:30:42 -0500, David Cramer da...@thingbag.net
wrote:
 On 03/15/2011 04:08 AM, Dave Pawson wrote:
 I have both long and short programlistings? 3 to 100 lines and more.
 I think it rather judgemental to assume all docbook users
 only have one or the other?

 I agree completely. If we put keep-together=always on example, then 
 we're saying examples must never span more than one page. 

FWIW, the same problem comes up with tables (and perhaps other things). My
team is actually using another tool chain, dblatex -> XeTeX -> PDF, but
most of the parameters that dblatex uses are the same as the parameters
that the FO tool chain uses.  In particular, you can elect at the document
level when you run dblatex whether tables should float or extend over
multiple pages.  We have documents where some tables need to do one, and
some the other.  While we could use a processing instruction, we have
instead (mis-)used the table@floatstyle attribute, together with some
locally munged xslt code, to set individual tables to long or float
(with the default still determined at the document level by the parameter
when the XML document is converted).

I guess an issue with using an attribute like this is that it's really
only valid for paginated output (most PDFs), and then only if you don't
change the paper size or even the font size.  IMO, issues like this should
really be left to the discretion of the processor, which (presumably) knows
what the final page size will be (or if there will not even be pages, e.g.
an HTML output).  But today's processors (even LaTeX) aren't smart enough
for that.

A similar issue concerns table widths.  By default, tables usually extend
the width of the page or the column, but this can be overridden in several
ways (by explicitly setting colspec@colwidth in centimeters or inches, by
using the table@width attr, or by a processing instruction). But if we were
producing HTML output on the fly, and the user happened to have a browser
set very wide, 50% might not be appropriate, nor would a setting in terms
of inches or centimeters be appropriate (since it's impossible to know what
the user's physical screen width is).  I suppose a setting in terms of
number of characters might be reasonable (assuming that the table doesn't
contain a bunch of 'W's or '.'s, I guess).

And another issue is the table@orient parameter, which is needed for
output which may be printed, but might be considered irrelevant for HTML
output (where the screen can be scrolled).

I guess it's for these reasons that some of these settings can be
determined by processor instructions.  But processor instructions of course
make for maintenance and archivability problems...

   Mike Maxwell


Re: [docbook-apps] Problem with tables and roundtrip to WordML

2011-02-28 Thread maxwell

On Tue, 1 Mar 2011 07:56:03 +1100, Steve Ball steve.b...@explain.com.au
wrote:
 Although I have not yet checked, I believe the problem is that the content
 of the cells is not in a para. That is, your DocBook should look like:
 
 <thead><row><entry><para>a1</para></entry><entry><para>a2</para></entry></row></thead>
 <tbody><row><entry><para>b1</para></entry><entry><para>b2</para></entry></row></tbody>

FWIW, the DocBook 5 standard explicitly allows for text inside entry
elements, without a para tag:
   http://www.docbook.org/tdg5/en/html/entry.html#children
I believe earlier versions of DocBook also allowed inline elements,
including text; see the discussion of Pernicious Mixed Content here:
   http://www.docbook.org/tdg/en/html/entry.html

  Mike Maxwell


Re: AW: [docbook-apps] PDF customization questions from newbie

2011-02-11 Thread Mike Maxwell


On 2/10/2011 8:02 PM, ben.guillon wrote:

In a word, if you know and like latex you'll be in a friendly world.


As a happy user of dblatex, I'll second Benoit's (and Robert Buergel's) 
response.  We use it for typesetting grammars, where the primary 
language is English and the secondary languages have challenging 
scripts: Bengali, Urdu, Pashto, which I think would be doubly 
challenging if we used the standard XSL-FO approach.


We've needed to add a few xslt transforms for specialized constructs 
found in grammars (like interlinear text--and of course we've added 
those constructs to our DocBook schemas), and we've made a few other 
modifications to particular transforms for specialized sorts of things. 
 All these changes are kept in separate files and survive upgrades to 
dblatex.


Benoit also mentioned xetex.  Since I assume your DocBook documents are 
in UTF-8, I highly recommend using xe(la)tex as your back end, rather 
than plain latex.  Xetex comes with the TeX-live distro, so if you have 
that you should be all set to go.


   Mike Maxwell
   University of Maryland


RE: [docbook-apps] Need recommendations for the best modern Linux backend tools for DocBook

2010-11-23 Thread maxwell

On Tue, 23 Nov 2010 15:06:42 -0800 (PST), Alan W. Irwin
ir...@beluga.phys.uvic.ca wrote:
 On 2010-11-23 22:55+0100 Mauritz Jeanson wrote:
 
 dblatex can produce DVI: http://dblatex.sourceforge.net/.
 
 That sounds like a most interesting approach since it transforms to
 latex.  The dvi, PostScript and PDF results are then produced from
 that latex source using native latex, dvips, etc. tools which I am
 familiar with.  Has anybody compared dblatex results for PostScript
 and PDF to what you get with a FOP-based approach? (I will obviously
 be doing that comparison myself when I get a chance, but I am curious
 about what other's comparisons have shown as well.)

I haven't done a *comparison*, but we routinely use dblatex + XeTeX (= a
Unicode-aware version of LaTeX) to produce multi-lingual PDF documents.  We
chose this route because it wasn't clear that the FOP route would deal well
with some of the scripts we needed, such as the Nasta'liq version of the
Arabic script.  

We have been well satisfied with dblatex.  Despite its current version
number (0.3), it seems to be stable and reasonably complete.  (Disclaimer:
there are a lot of DocBook elements that we are not using, so we haven't
done a thorough test.)  Once you get the hang of it, it seems reasonably
easy to modify, which we have needed to do because of some additional XML
elements we added.  That isn't to say it's perfect; in the last two weeks,
I seem to have run into two bugs, or at least incomplete features: one
having to do with captions on long tables, the other being the fact that
the XSL template for the <entry> element forgot to check whether the
'valign' attr was specified on the <tbody> (or <thead> or <tfoot>) element.
As usual, fixing the problem was much easier than finding it.  Fixing it
was of course made possible by the fact that the software is open source
(http://dblatex.sourceforge.net/).

In order to modify it you have to understand XML (of course), XSLT, and
LaTeX; any XSL templates you define automatically override the
corresponding dblatex-supplied templates.  XSLT is still not my favorite
programming language...  There are of course a number of things you can do
without modifying the program, by using command line parameters or LaTeX
style sheets.  I believe most of these are modeled after the corresponding
XSL-FO parameters, but not having used the latter I can't say for sure.

The author of dblatex, Benoît Guillon, has historically been quick to
respond to queries and bug fixes on the mailing list.  However, the list
seems to have gone silent recently, apart from my two bug fix posts. 
(Maybe they're sparing me embarrassment by not pointing out that my fix was
wrong and/or the feature was already there :-).)

   Mike Maxwell


Re: [docbook-apps] Need recommendations for the best modern Linux backend tools for DocBook

2010-11-23 Thread maxwell

On 11/24/2010 09:34 AM, maxwell wrote:
 I haven't done a *comparison*, but we routinely use dblatex + XeTeX (= a
 Unicode-aware version of LaTeX) to produce multi-lingual PDF documents. 
 We chose this route because it wasn't clear that the FOP route would 
 deal well with some of the scripts we needed, such as the Nasta'liq 
 version of the Arabic script.

I'll mention one other thing that's nice about using dblatex: there is a
ton of useful LaTeX packages out there.  We have a few landscape tables
in our grammars, and they're always a pain in the neck if you're viewing
the PDF.  You can spare the neck pain by picking the monitor up and turning
it sideways, or by telling Adobe Reader--and probably most other PDF
readers--to rotate the image sideways; just click on the menu item.  But
even nicer, it turns (pun intended) out that you can embed a piece of code
in the PDF that tells the Reader to rotate any given page.  And there's a
LaTeX package which, given a landscape mode page (which dblatex produces
when it encounters a table with the 'orient=land' attribute), inserts
that code into the PDF.  A one-line change to our style sheet now produces
PDFs that tell Reader to automagically rotate landscape pages.  I was
impressed!

   Mike Maxwell



Re: [docbook-apps] diff marking

2010-11-14 Thread Mike Maxwell


On 11/14/2010 12:52 AM, Keith Fahlgren wrote:

On Sat, Nov 13, 2010 at 6:59 PM, Mike Maxwellmaxw...@umiacs.umd.edu  wrote:

But what a price :-(


Understood. I haven't used either tool in a production environment, so
I'm unable to give any endorsement. That said, I would not expect a
feature-complete open source alternative in the short term:
(schema-aware-) XML differencing is a seriously hard problem.


Would it be possible to do a two-step diff:
1) Run another XML diff, which produces some format like a patch
2) Use the patch to create a modified version of one of the input files, 
with 'revisionflag' appropriately marked.


There seem to be several programs out there that accomplish step (1), 
although I have no idea how well they work.  It seems like a fairly 
simple task to write a program that would perform step (2).  Admittedly, 
I haven't tried that, and it could be lots harder than I think :-).
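
For what it's worth, here is a rough Python sketch of step (2), under some big simplifying assumptions. Instead of reading a patch produced by another tool, it diffs the direct children of two root elements itself (as serialized strings, via difflib) and sets 'revisionflag' on the children that differ. The function name flag_revisions is made up for illustration, and anything nested more than one level down is flagged wholesale, so this is nowhere near a schema-aware diff.

```python
import copy
import difflib
import xml.etree.ElementTree as ET


def flag_revisions(old_root, new_root):
    """Return a copy of new_root whose direct children carry revisionflag.

    Children are compared as serialized strings, so any change inside a
    child flags that whole child; this is much coarser than a real XML diff.
    """
    old_kids = list(old_root)
    new_kids = list(new_root)
    old_keys = [ET.tostring(e, encoding="unicode").strip() for e in old_kids]
    new_keys = [ET.tostring(e, encoding="unicode").strip() for e in new_kids]

    # Start from an empty copy of the new root and refill it child by child.
    result = ET.Element(new_root.tag, dict(new_root.attrib))
    result.text = new_root.text

    def flagged(elem, flag):
        clone = copy.deepcopy(elem)
        clone.set("revisionflag", flag)
        return clone

    matcher = difflib.SequenceMatcher(a=old_keys, b=new_keys, autojunk=False)
    for op, i1, i2, j1, j2 in matcher.get_opcodes():
        if op == "equal":
            result.extend([copy.deepcopy(e) for e in new_kids[j1:j2]])
        elif op == "insert":
            result.extend([flagged(e, "added") for e in new_kids[j1:j2]])
        elif op == "delete":
            # Keep the removed children in the output, marked as deleted.
            result.extend([flagged(e, "deleted") for e in old_kids[i1:i2]])
        else:
            # "replace": keep only the new children, marked as changed.
            result.extend([flagged(e, "changed") for e in new_kids[j1:j2]])
    return result
```

Given two sections where the second para was reworded and the new version has an extra para at the end, this leaves the untouched paras alone, marks the reworded one 'changed' and the extra one 'added'.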

--
Mike Maxwell
maxw...@umiacs.umd.edu
A library is the best possible imitation, by human beings,
of a divine mind, where the whole universe is viewed and
understood at the same time... we have invented libraries
because we know that we do not have divine powers, but we
try to do our best to imitate them. --Umberto Eco


[docbook-apps] diff marking

2010-11-13 Thread Mike Maxwell

Is there any program out there which, given two versions of a DocBook v5 
file, finds the differences and outputs a marked-up version *using the 
revisionflag attributes*?


Afaik, most xml diff programs create an output more like the traditional 
line-based diff program, or some sort of 'patch' output.  One exception 
is Norm Walsh's diffmk program, which is supposed to set the 
revisionflag attributes.  Unfortunately, v2 of this program seems to work 
only with DocBook v4 (and possibly earlier) documents; in particular, it 
requires a DocType.  And I can't get v3 to run at all, it just gives me 
the ff. error:

   Failed to load Main-Class manifest attribute
which seems to mean that something that's supposed to be there (the
equivalent of a 'main' in some programming languages, maybe) isn't.
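
As far as I can tell, this is the message 'java -jar' prints when the jar's META-INF/MANIFEST.MF has no Main-Class entry, so Java doesn't know which class to start. A jar is just a zip file, so it is easy to look. The Python sketch below does that; main_class_of is a made-up helper name, and it ignores the manifest rule for continuation lines, which should only matter for very long class names.

```python
import zipfile


def main_class_of(jar):
    """Return the Main-Class named in a jar's manifest, or None if absent.

    `jar` may be a path or a file-like object; a jar is an ordinary zip file.
    """
    with zipfile.ZipFile(jar) as zf:
        try:
            manifest = zf.read("META-INF/MANIFEST.MF").decode("utf-8")
        except KeyError:
            return None  # the jar has no manifest at all
    for line in manifest.splitlines():
        name, sep, value = line.partition(":")
        # Manifest attribute names are case-insensitive.
        if sep and name.strip().lower() == "main-class":
            return value.strip()
    return None
```

If this returns None, 'java -jar' cannot start the jar; it would have to be run with the jar on the classpath and the main class named explicitly, assuming you can find out which class that is.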

I'd be very happy to be proven wrong.
--
Mike Maxwell
maxw...@umiacs.umd.edu
A library is the best possible imitation, by human beings,
of a divine mind, where the whole universe is viewed and
understood at the same time... we have invented libraries
because we know that we do not have divine powers, but we
try to do our best to imitate them. --Umberto Eco


Re: [docbook-apps] diff marking

2010-11-13 Thread Mike Maxwell


On 11/13/2010 9:50 PM, Keith Fahlgren wrote:

On Sat, Nov 13, 2010 at 5:26 PM, Mike Maxwellmaxw...@umiacs.umd.edu  wrote:

Is there any program out there which, given two versions of a DocBook v5
file, finds the differences and outputs a marked-up version *using the
revisionflag attributes*?


oXygen's XML Diff (http://www.oxygenxml.com/xml_diff_and_merge.html)
and DeltaXML (http://www.deltaxml.com/) are the two commercial
differencing products I'm aware of. I believe that DeltaXML does
generate the revisionflag-marked output documents.


Ah, thanks, I missed that!  From their website:
   DeltaXML DocBook Compare highlights changes between
   any two docbook files. Docbook Compare performs a
   detailed comparison between the two files and
   automatically adds revision flags to highlight added,
   deleted or changed text. Changes to individual table
   cells are clearly identified.
But what a price :-(: $1500/year for a site license!  (Which is the 
cheapest license they have; the only other choice is a multiple site 
license.)

--
Mike Maxwell
maxw...@umiacs.umd.edu
A library is the best possible imitation, by human beings,
of a divine mind, where the whole universe is viewed and
understood at the same time... we have invented libraries
because we know that we do not have divine powers, but we
try to do our best to imitate them. --Umberto Eco


Re: [docbook-apps] removing elements and attrs from title

2010-11-11 Thread Mike Maxwell


On 11/11/2010 1:44 AM, Cramer, David W (David) wrote:

Interesting idea about removing xml:id from titles. Try the
following: ...

That seems to work...not sure if it's the best/easiest way.


Yes, that gets rid of the xml:id attr in titles.  Thanks!


I think to remove <remark> you just add the following inside
your <include>:

<define name="db.remark"><notAllowed/></define>


Sorry, I wasn't clear; I *only* want to remove remark from inside 
title, whereas the above removes it everywhere.  The problem is that 
there is something about the title defn in DB5 that prevents any 
changes to its content elements.  Even the ff. minimalist defn triggers 
the interleave error msg:

  <define name="db.title">
    <element name="title">
      <text/>
    </element>
  </define>

BTW, the reason I want to make these two changes has to do with our 
processing to produce a PDF (what you call a toolchain, I think).  Our 
output method doesn't allow xrefs to titles (which is the only use we're 
making of xml:id).  And it appears that the LaTeX macro I cribbed for 
outputting remarks with a colored background (to make them easy to 
see) goes into an infinite loop when the remark (or the LaTeX 
\comment{} command that this gets translated into) is inside a title 
(\title{}).  Maybe I'll go back and try to figure out why the LaTeX 
macro does that, instead of prohibiting remarks inside titles...

--
Mike Maxwell


[docbook-apps] removing elements and attrs from title

2010-11-10 Thread Mike Maxwell

I want to customize my DocBook schema by removing one element (remark) 
and one attribute (xml:id) from the title element.  I'm trying to 
follow the customization instructions for DocBook 5.0 
(http://www.docbook.org/tdg5/en/html/ch05.html#ch05-layers), but I'm 
apparently doing something wrong. I'm using the XML form of the RelaxNG 
schema, whereas the instructions show the RNC form; but I think I've 
done the conversion right.


I have the following:
  <include href="xxe-config:docbook5/rng/V5.0/docbook.rng">
    ...
    <define name="db.title">
      <!--Replace the DocBook standard db.title-->
      <element name="title">
        <ref name="db.MyTitle.attlist"/>
        <zeroOrMore>
          <ref name="db.MyTitle.inlines"/>
        </zeroOrMore>
      </element>
    </define>
  </include>

  <define name="db.MyTitle.attlist">...</define>
  <define name="db.MyTitle.inlines">...</define>

('xxe-config' in the first line points to a place in the file system)

The code for <define name="db.title"> is copied over from the DB5 schema 
included with XXE, except I substituted db.MyTitle.attlist for 
db.title.attlist, and db.MyTitle.inlines for db.all.inlines (and omitted
the <a:documentation> element).

However, validation gives me the following error:
   overlapping element names in operands of interleave
pointing to line 1262 of the docbook.rng.  Line 1262 is the second line 
in this definition:

<define name="db.titleonlyreq.info">
  <element name="info">
    <a:documentation>A wrapper for information about a component or 
    other block with only a required title</a:documentation>
    <ref name="db.titleonlyreq.info.attlist"/>
    <interleave>
      <ref name="db._title.onlyreq"/>
      <zeroOrMore>
        <ref name="db.info.elements"/>
      </zeroOrMore>
    </interleave>
  </element>
</define>

At this point, I get lost tracing things backwards.  I just don't see 
how db.titleonlyreq.info (or db.titleonlyreq.info.attlist) gets called 
from my re-definition of db.title, nor what the error is.


Commenting out parts of my modifications doesn't help, either; in fact, 
I still get the error msg if I reduce my modifications to this:

  <include href="xxe-config:docbook5/rng/V5.0/docbook.rng">
    ...
    <define name="db.title">
      <!--Replace the DocBook standard db.title-->
      <element name="title">
        <text/>
      </element>
    </define>
  </include>

What am I doing wrong?
--
Mike Maxwell
maxw...@umiacs.umd.edu
A library is the best possible imitation, by human beings,
of a divine mind, where the whole universe is viewed and
understood at the same time... we have invented libraries
because we know that we do not have divine powers, but we
try to do our best to imitate them. --Umberto Eco


Re: [docbook-apps] Docbook for industrial usage

2010-09-24 Thread Mike Maxwell


On 9/24/2010 1:15 PM, Thomas Schraitle wrote:

Actually OpenSuse internally also uses DocBook for documentation, but I
think that they use TeX for doing final typesetting of PDF version.


The TeX toolchain was long gone. We use a XSL-FO toolchain now. :)


I don't know what toolchain that was, but we routinely use dblatex
   http://dblatex.sourceforge.net/
It produces LaTeX, or better, XeLaTeX output.
--
Mike Maxwell
maxw...@umiacs.umd.edu
A library is the best possible imitation, by human beings,
of a divine mind, where the whole universe is viewed and
understood at the same time... we have invented libraries
because we know that we do not have divine powers, but we
try to do our best to imitate them. --Umberto Eco


[docbook-apps] elements inside remark

2010-07-22 Thread maxwell

According to the documentation on DB 5
(http://www.docbook.org/tdg5/en/html/remark.html), it appears that a large
number of inline elements ought to be able to appear inside a remark
element.  However, the docbook.rng
(http://www.docbook.org/xml/5.0b9/rng/docbook.rng) appears to allow only a
few of these:
   inlinemediaobject, remark, superscript, subscript, 
   xref, link, olink, anchor, biblioref, alt, annotation, 
   indexterm.singular, indexterm.startofrange, indexterm.endofrange,
   phrase, replaceable
Missing from this list afaict are many of the gui inlines, markup inlines,
operating system inlines, etc. that the documentation says should be
allowed (and which I believe DocBook 4.5 allowed).

Perhaps this is intentional; there are some comments at
http://www.oasis-open.org/docbook/specs/docbook-5.0b6-spec-wd-01.html to
the effect that DB5 drastically reduced the content models of many
inlines. This description uses the command element as an example, and
this content model seems to correspond to the DB5 schema.  Again, this
differs from the documentation at
http://www.docbook.org/tdg5/en/html/command.html.  So maybe this is a case
of the documentation at http://www.docbook.org/tdg5/en/html/command.html
(etc.) not having caught up with the schema?

Or maybe http://www.docbook.org/tdg5/ is not the definitive documentation
for DB5, because it's v1.1 of the documentation, and for now 1.0 is what I
should be using?

So maybe my question boils down to this:  Where can I find both the
current and consistent schema and human-readable documentation for DB5?
 
   Mike Maxwell
   


Re: [docbook-apps] linkend attr required in biblioref?

2010-07-22 Thread maxwell

On Thu, 22 Jul 2010 09:25:48 +0200, Markus Hoenicka
markus.hoeni...@mhoenicka.de wrote:
 maxwell maxw...@umiacs.umd.edu was heard to say:
 
 Perhaps this XML frag is supposed to be
    <citation><biblioref linkend="Chomsky1965" begin="23" end="25"/></citation>
 
 I can't comment on the innards of the RNG schema regarding the  
 biblioref element, but as one of those who originally suggested the  
 biblioref element I'd like to confirm that the above usage is what we  
 had in mind. 

Thanks, after playing around I see that this works well--in fact, I don't
need the citation element (which I showed above wrapping the biblioref
element) at all.  Much cleaner!

I think the 'xrefstyle' attribute on biblioref is supposed to be used
for things like where the parens go, or whether things are parenthesized,
etc.  I suppose it's too much to ask that this could be standardized in
DocBook, perhaps along the lines of the LaTeX 'natbib' package--probably
there's just too much variability.  If only everyone did things like me!

   Mike Maxwell


[docbook-apps] linkend attr required in biblioref?

2010-07-21 Thread maxwell

The DocBook 5 RNG schema
(http://www.docbook.org/xml/5.0b1/rng/docbook.rng) makes the 'linkend' attr
in the biblioref element obligatory.  But afaict, the documentation for
DocBook 5 does not require that attr on this element.  

Specifically, http://www.docbook.org/tdg5/en/html/biblioref.html does not
show the 'linkend' attr among the attributes specific to biblioref.  The
'linkend' attr apparently comes as one of the common linking attributes. 
But there's no indication at the description of Common Linking Attributes
(http://www.docbook.org/tdg5/en/html/ref-elements.html#common.linking.attributes)
that this attribute should be obligatory, either.

In docbook.rng, on the other hand, the 'linkend' attr is one of two
possible attrs under db.common.req.linking.attributes, where the 'req'
presumably means 'required'.  And this db.common.req.linking.attributes is
among the attributes on the list db.biblioref.attlist, but it isn't
bracketed by <optional>...</optional>.  

So if the documentation is correct, then it seems like either
'db.common.req.linking.attributes' should be bracketed by
<optional>...</optional> under the def for the biblioref element
(although it seems odd to have a 'required' element be 'optional'), or more
likely the biblioref element should include the list
'db.common.linking.attributes', rather than the list
'db.common.req.linking.attributes'.

Or am I missing something?  Stepping back to the purpose of the
biblioref element, my understanding is that one place this element is
used is inside a citation, so that you can add e.g. page numbers to a
citation.  And this is exactly what we want to do, e.g. we would like to
get the output
   Chomsky 1965:23-25
from something like
   <citation>Chomsky1965 <biblioref begin="23" end="25"/></citation>
(where 'Chomsky1965' corresponds to the abbrev element in one of our
bibliorefs, or some such).  Given this XML frag, there's no need for a
'linkend' attr on the biblioref.

But maybe there is an issue of mixed content?  Perhaps this XML frag is
supposed to be
   <citation><biblioref linkend="Chomsky1965" begin="23" end="25"/></citation>
?  If so, I guess the documentation should be clarified to say that the
'linkend' attr is obligatory on biblioref (and an example added to show
how to use biblioref: there is no example at present).

In sum, there seems to be a problem with either the documentation or the
schema--or me.  (Nah, can't be me...)

   Mike Maxwell



Re: [docbook-apps] write docbook attributes from Word 2007 (roundtrip stylesheets)

2010-06-22 Thread maxwell

On Tue, 22 Jun 2010 17:03:37 +0200, Jorge Martínez de Salinas
jorge.mar...@gmail.com wrote:
 We're thinking in documenting our designs using Docbook. However, some
 members of our team don't feel comfortable editing the XML directly
 (they usually work with Word 2007). 

FWIW, there is a product--
   http://www.xmlmind.com/xmleditor/
which puts a nice interface in front of the XML.  Not quite WYSIWYG, but
nice.  We use it for our grammars.  Most of our grammar writers have never
used anything but Word before, but after a day or so they seem to be quite
comfortable with it.  The company that makes it has a great mailing list,
and responds to each query.

(As a Word 2003-and-earlier-versions user, if you can stomach the jump to
Word 2007, you can do anything; XMLmind is a much easier jump, IMO.)

   Mike Maxwell
   CASL/ U MD


[docbook-apps] LaTeX (was: InDesign typography advantage)

2010-06-15 Thread maxwell

Giuseppe Bonelli peppo.bone...@gmail.com wrote:
 I agree with you that the quality of a page typeset with LaTeX is
 _very_ high, but I think it would be _very_ difficult to introduce a
 LaTeX typesetting phase in a production workflow of a traditional
 publishing house. In other environments this could definitely be a good
 solution.

I am told that many publishing houses routinely use LaTeX.  I have dealt
with one (Springer).  One catch, however, might be that dblatex's output
contains some LaTeX commands that refer to dblatex-specific stylesheets;
often publishers that use LaTeX have their own in-house style sheets, I
think.

Introducing a LaTeX phase to a publishing house that is *not* already
familiar with it is of course a different problem...

   Mike Maxwell


Re: [docbook-apps] InDesign typography advantage [Was: Re: [docbook-apps] DocBook and InDesign]

2010-06-15 Thread maxwell

Jirka Kosek ji...@kosek.cz wrote:
 TeX as well FO work in batch mode -- you can't interactively fiddle with
 details like line and page breaks and object placement and instantly see
 changes on-screen. This is necessary especially for documents with a more
 artistic design.

There are LaTeX editors that allow you to do this via a two-pane editor,
with the LaTeX editor in one pane and a PDF view in the other. LEd is an
example:
   http://www.latexeditor.org
I've never used such an editor, so I can't vouch for whether it is
instant.  Perhaps one could create such a DocBook editor.  (XMLmind
allows you to work in a partially wysiwyg environment, although they're
quick to point out that it's very partial; and it certainly doesn't allow
low-level fiddling.)

 Also I'm not sure whether pdfTeX implementation of hz-algorithm and
 hanging punctuation is on a par with one available in InDesign.
 
 http://en.wikipedia.org/wiki/Hz-program
 http://en.wikipedia.org/wiki/Hanging_punctuation

XeTeX (a Unicode-enabled version of LaTeX) has an experimental
implementation of character protrusion or margin kerning (new in the
last few months).  Not being a typographer (I almost wrote typologist, an
area that I do claim to know a little about!), I'm not sure how much that
answers your question.

Perhaps more relevant, there is a discussion thread here:
   http://scripts.sil.org/xetex
about the relative uses and merits of Xe(La)TeX and InDesign.  Some of the
points would also pertain to DocBook in general.

   Mike Maxwell


Re: [docbook-apps] dblatex Randomly Produces Bold Pages

2010-06-11 Thread maxwell

On Fri, 11 Jun 2010 10:23:22 -0500, Russell Harvey
rhar...@morrisdickson.com wrote:
 I'm using dblatex via Python to produce a PDF file on a Windows machine.

This is perhaps better addressed on the dblatex mailing list (unless
there's s.t. odd about your XML, which I didn't look at that closely).  The
dblatex mailing list can be accessed here:
   https://lists.sourceforge.net/lists/listinfo/dblatex-users
FWIW, we've used dblatex for several years now, with some fairly hairy
stuff (mixed left-to-right and right-to-left scripts, for example), and
never had a problem with bolding.  But we haven't used some of the elements
you're using, such as procedure, mediaobject, screenshot etc.  So
perhaps there's a problem there.

   Mike Maxwell
   CASL/ U MD


Re: [docbook-apps] Re: including non-xml

2010-04-29 Thread maxwell

On Thu, 29 Apr 2010 16:32:13 +0100, Ivan Ristic ivan.ris...@gmail.com
wrote:
 On Mon, Apr 26, 2010 at 11:26 PM, maxwell maxw...@umiacs.umd.edu
wrote:
 Sounds like a job for Literate Programming.  Instead of keeping your
 source code and examples in external file(s), keep them in your DocBook
 document, and extract them automagically to produce the source code in
 the programming language.
 
 I can see how that could work in some simple cases. For anything other
 than trivial programs, the edit-compile-run cycle would be very small
 because of the inability to use an IDE or edit the source code
 directly.

We do edit the program fragments directly, in a programmer's editor.  They
are contained in separate files, and x-included into the DocBook document. 
The point of the literate programming approach is not to make you edit
everything in one application, but to document the result in an enduring
way, including bringing all the test cases and program code together in one
place in a coherent way (as opposed to a bunch of files with maybe a
readme)--and maybe most importantly, to document all this in a way that
will allow people in the future to understand what you have done.

I haven't thought through the IDE issue, because I dislike IDEs.  Your
mileage may vary...

   Mike Maxwell


Re: [docbook-apps] Re: including non-xml

2010-04-26 Thread maxwell

On Mon, 26 Apr 2010 16:25:49 -0500, Grant Taylor
gtay...@riverviewtech.net wrote:
 Can you not have all your examples in separate source files?
 
 You might need a pre-processor to include them together when compiling 
 your source code.  Can you compile your source code as is or are some of
 the sections duplicate?

Sounds like a job for Literate Programming.  Instead of keeping your
source code and examples in external file(s), keep them in your DocBook
document, and extract them automagically to produce the source code in the
programming language.  The implementation that Norm Walsh did several years
back, for an earlier version of DocBook, can easily be adapted to DB5.  It
assumes you only want to get a single source code file out, but could
easily be adapted to allow for extracting (tangling) multiple output
files.

We're using this for grammars of natural languages; the individual grammar
rules scattered throughout our prose grammar get tangled into an XML
document, which we further process into another target language.  But the
tangled doc can be in any programming language.
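
To give a feel for the tangling step, here is a minimal Python sketch. It is not Norm Walsh's implementation and does not use his markup; it simply assumes, for illustration, that every <programlisting> in a DocBook 5 document names its output file in the 'role' attribute, and that listings without a role are display-only. The function name tangle and the role convention are both my own assumptions.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

DB_NS = "http://docbook.org/ns/docbook"  # DocBook 5 namespace


def tangle(docbook_xml):
    """Map each output name to the code gathered from <programlisting>s.

    Assumption (not a DocBook rule): the 'role' attribute names the output
    file; listings without a role are treated as display-only and skipped.
    """
    root = ET.fromstring(docbook_xml)
    chunks = defaultdict(list)
    # iter() walks the tree in document order, so fragments keep their order.
    for listing in root.iter(f"{{{DB_NS}}}programlisting"):
        target = listing.get("role")
        if target:
            chunks[target].append("".join(listing.itertext()))
    return {name: "\n".join(parts) for name, parts in chunks.items()}
```

The result is a dict from output name to concatenated code, so writing several files is just a loop over its items. Using itertext() means any inline markup inside a listing is dropped and only the text survives.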

   Mike Maxwell

-
To unsubscribe, e-mail: docbook-apps-unsubscr...@lists.oasis-open.org
For additional commands, e-mail: docbook-apps-h...@lists.oasis-open.org

Re: [docbook-apps] Converting Symbol Fonts to UTF-8

2010-04-18 Thread Mike Maxwell


Bob Stayton wrote:
Thanks, that clarifies the situation.  This seems to be a two-byte 
encoding, perhaps specific to Microsoft's Symbol font?  I thought the 
Symbol font was single byte, so I'm not understanding those numbers.  
Anybody else recognize this?


There is something of an explanation here:
  http://scripts.sil.org/fontfaq_unicodeword
Specifically:
--
To further complicate the picture, there are two different ways to 
encode 8-bit fonts: as normal text fonts, called UGL, or as symbol 
fonts. Most fonts containing alphabetic characters (e.g., Times New 
Roman, Arial) are encoded as UGL fonts. Fonts containing symbols (e.g. 
Wingdings) are typically encoded as symbol fonts. Word 97/2000 uses two 
different translation schemes between Unicode values and 8-bit values, 
depending on whether the font used for the text in question is a UGL 
font or a symbol font. If the font is a UGL font, Word 97/2000 converts 
the characters between the standard 8-bit and Unicode values defined by 
the active codepage. No such standard conversion exists for symbols, 
however, so if the font is a symbol font, Word 97/2000 converts the 
characters to a different set of Unicode values in what is called the 
“Private Use Area” (PUA) of Unicode.

-
There's a PDF link at the bottom of the page which goes into more details.
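
In practice the Private Use Area values that show up are usually the original 8-bit symbol code plus 0xF000 (i.e. U+F020 through U+F0FF), which would explain the "two-byte" numbers. Assuming that offset holds for your data, converting back is a table lookup. The Python sketch below shows the idea; the helper name depua is made up, the offset is an assumption you should check against your own files, and the table has only a few Symbol-font entries that I'm sure of. Other symbol fonts (Wingdings etc.) would need their own tables.

```python
# Partial table: Symbol-font byte value -> standard Unicode character.
# Only a handful of entries are listed; a real table covers the whole font.
SYMBOL_TO_UNICODE = {
    0x61: "\u03b1",  # alpha
    0x62: "\u03b2",  # beta
    0x64: "\u03b4",  # delta
    0x67: "\u03b3",  # gamma
    0x70: "\u03c0",  # pi
}

PUA_BASE = 0xF000  # assumed offset applied to symbol-font bytes


def depua(text, table=SYMBOL_TO_UNICODE):
    """Replace PUA code points U+F020..U+F0FF with mapped characters.

    Characters with no table entry are left untouched so nothing is lost.
    """
    out = []
    for ch in text:
        byte = ord(ch) - PUA_BASE
        if 0x20 <= byte <= 0xFF and byte in table:
            out.append(table[byte])
        else:
            out.append(ch)
    return "".join(out)
```

Leaving unmapped characters in place, rather than dropping them, makes it easy to search the output for leftover PUA code points and extend the table as needed.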
--
   Mike Maxwell
   What good is a universe without somebody around to look at it?
   --Robert Dicke, Princeton physicist



Re: [docbook-apps] CSS based docbook editor

2010-04-02 Thread maxwell

On Fri, 02 Apr 2010 13:51:05 +0200, Nathalie Sequeira n...@n-faktor.net
wrote:
 I am currently looking for a docBook editor that is directed towards end
 users
 ...
 - And especially:
 ability to import and transform rtf documents preformatted in a 
 specified way by contributing authors (or at least open them and 
 reformat, instead of having to cut and paste single text blocks, which 
 seems to be the necessary procedure e.g. in XMLmind?)

Three or four years ago, we had to start our project with MsWord.  Once we
had XMLMind, we imported the Word doc into XXE (= XMLMind Editor) using a
tool which I have long since forgotten.  The tool worked reasonably well,
and we were able to import the doc as a whole; I think we had to do some
clean-up after we got it into XXE.  Unless you're doing this conversion on
multiple documents, I would consider that a one-time cost, and a small one.

I haven't looked at tools to convert Word to DocBook since then, but I
would guess that they have gotten better.  I don't know whether there are
tools that do both the import and allow editing; conceptually, those are
rather different tasks.

There was a slight learning curve for XXE, but we have now at least half a
dozen people who have learned it and are reasonably happy with it.  (Some
are happier in XXE than in Word 2007, but your mileage may vary.)

I do have a few minor gripes about XXE:
1) It doesn't handle right-to-left text well (that's probably not an issue
for most people!)
2) It doesn't do track changes (yet; that's on their list, and there are
work-arounds if you don't mind having the changes flagged only in your
PDFs)
3) It mungs the XML code by adding newlines in places no human would add
them (but that's only an issue if you want to hand-edit the XML in a
programmer's editor; it has no practical effect otherwise)

I will also mention that they have an active mailing list for users, and
they are *very* responsive to reasonable requests.

Unless you are lucky, you will need someone who can figure out how to
modify the standard configurations for XXE, and for DocBook as a whole, to
do what you want.  Our own setup is decidedly non-vanilla, and it does take
some work to keep on top of things.

In sum, we are satisfied customers of XXE.  (We have gladly paid for the
Professional version.)

   Mike Maxwell


[docbook-apps] table titles

2010-03-25 Thread maxwell

The DocBook 5 web page that describes (CALS) tables
(http://www.docbook.org/tdg5/en/html/cals.table.html) says:
   This table element identifies a formal table (one without a title).
Isn't that a typo?  Shouldn't it say one with a title?  (The synopsis
and children both mention a title element.)

(The descriptions of figure, informalfigure and informaltable all
appear to be correct.)

   Mike Maxwell


Re: [docbook-apps] Capturing phrase books and dictionaries

2010-03-18 Thread Mike Maxwell


Lech Rzedzicki wrote:

We're trying to keep our markup close to DB5 but we also want to
tighten the schema a bit further.
One area we're particularly struggling with is phrase books and
dictionaries. This was originally modelled using TEI and reflects the
actual structure quite well.
The problem we have is that both in the original language portion
(form) and in the target language explanation (sense) we need to
allow many optional elements such as example, pronunciation, often
multiple times (as there can be many forms or senses or many examples
for each sense or form), gradually this led us to a very complex and
loose model which also doesn't maintain the relationship between the
original and translation too well.

I was wondering if any of you have any experience dealing with similar
content and whether you could share your experience and schemas?


We are working a lot with XML-based bilingual dictionaries (not phrase 
books, although they may be similar).  I think the bottom line is, don't 
use DocBook for dictionaries (at least not for the body of the 
dictionary, i.e. all the entries).  It just isn't the same kind of 
structure.


TEI-encoded dictionaries tend to reflect the structure of the print 
dictionary from which the electronic form was derived.  That has a 
couple of advantages:
1) It's easy(-er) to convert from the print form to the electronic form, 
and go back later and make sure you did it right
2) It makes producing a new print copy of the dictionary that looks like 
the original print dictionary easy(-er).


It also has some disadvantages:
1) Unless you're working with a bunch of similar dictionaries from a 
single publisher, you're likely to wind up with a large number of 
schemas (or DTDs), one for each dictionary, and that can be hard to manage.
2) The large number of schemas in (1) also means that you probably have 
to write a different CSS (or whatever you use) for each one.
3) You're limited to a single presentation form, i.e. it is difficult to 
display a root-based dictionary as a stem-based dictionary.


What we (and probably most people who work with multiple electronic 
dictionaries) do instead, is to use a generic lexicon schema.  This 
flattens the overall structure of a typical print dictionary (e.g. 
subentries become entries on their own); the original structure is 
instead represented by xrefs (so a sub-entry and a minor entry both have 
pointers back to the main entry). One can then postpone until run-time 
decisions like root-based vs. stem-based presentation, or whether a 
given minor entry is displayed as a sub-entry or as an entry on its own 
(and perhaps alphabetized on its own, if that's relevant to the 
electronic display).  The run-time decisions are then implemented using 
one of two (or several) style sheets.
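
As a rough illustration of the flattened approach described above, here is a
minimal Python sketch.  It is not our actual schema or code: the Entry class,
its field names (in particular "main" for the back-pointer) and the sample
entries are all hypothetical, and a real system would do this with XML and
style sheets rather than Python.  The point is only that every entry is stored
at the top level with an xref to its main entry, and nesting is decided at
display time.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Entry:
    """One flattened lexicon entry (hypothetical structure)."""
    id: str
    headword: str
    gloss: str
    main: Optional[str] = None  # xref: id of the main entry, if this is a minor entry


def render(entries: List[Entry], nest_subentries: bool) -> List[str]:
    """Return display lines; nesting is a run-time choice, not a storage choice."""
    by_headword = sorted(entries, key=lambda e: e.headword)
    if not nest_subentries:
        # Every entry stands on its own, alphabetized on its own.
        return [f"{e.headword}: {e.gloss}" for e in by_headword]
    lines = []
    for e in by_headword:
        if e.main is not None:
            continue  # minor entries are shown under their main entry instead
        lines.append(f"{e.headword}: {e.gloss}")
        for sub in by_headword:
            if sub.main == e.id:
                lines.append(f"    {sub.headword}: {sub.gloss}")
    return lines


# Made-up sample data: "writer" is a minor entry pointing back to "write".
LEXICON = [
    Entry("e1", "write", "to form letters"),
    Entry("e2", "writer", "one who writes", main="e1"),
    Entry("e3", "read", "to interpret text"),
]
```

Calling render(LEXICON, nest_subentries=True) lists "writer" indented under
"write", while render(LEXICON, nest_subentries=False) lists all three entries
alphabetically on their own; the stored data is the same in both cases.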


More than that about this approach (as opposed to doing something with 
dictionaries inside DocBook) probably doesn't belong on this list. 
Fortunately there are lexicography mailing lists, e.g. the Lexicography 
list (see http://linguistlist.org/lists/get-lists.cfm).

--
   Mike Maxwell
   What good is a universe without somebody around to look at it?
   --Robert Dicke, Princeton physicist

Re: [docbook-apps] Google summer of code

2010-03-17 Thread Mike Maxwell


Stefan Seefeld wrote:

On 03/14/2010 03:26 PM, Mike Maxwell wrote:


My sense (which I guess I've voiced a couple times) is that there is 
already an awful lot (too much, IMO) about DB that is specific to 
programming languages.  Our localization has over 200 lines like

   <define name="db.classsynopsis"><notAllowed/></define>

My guess is that if you were to add programming elements in a separate 
namespace, you would want to move all the existing 
programming-specific elements into that namespace too.


I don't think this is possible without breaking lots of existing 
documentation.


If backward-compatibility wasn't an issue, I would very much like the 
suggestion.


Given that the root element of a DocBook 5 file looks something like
   <chapter xmlns="http://docbook.org/ns/docbook" ...>
is this really a problem?  Couldn't a DB document written for the new 
modular DocBook schema have something like

   <chapter xmlns="http://docbook.org/ns/ModDocBook">
or
   <chapter xmlns="http://docbook.org/ns/docbook6" ...>
?  So any documentation written in the Olde DB could continue to use the 
old schema, and not get broken.


At any rate, I would think it would be trivial to port a DB 5 document 
to such a modular docbook, by adding a namespace declaration for 
programming language-specific elements at the top, and prefixing any 
programming language-specific element names with the namespace 
abbreviation.
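
To make the porting idea concrete, here is a minimal sketch in Python using
the standard xml.etree.ElementTree module.  Everything specific in it is an
assumption for illustration: the PROG_NS namespace URI and the "prog" prefix
are invented (no such modular DocBook namespace exists), and
PROGRAMMING_ELEMENTS is only a small sample of the programming-specific
element names, not a complete list.

```python
import xml.etree.ElementTree as ET

DB_NS = "http://docbook.org/ns/docbook"
# Hypothetical namespace for a programming module; not a real DocBook URI.
PROG_NS = "http://example.org/ns/docbook-programming"
# Illustrative subset of programming-specific DocBook element names.
PROGRAMMING_ELEMENTS = {
    "classsynopsis", "methodsynopsis", "ooclass", "classname", "methodname",
}


def port_to_modular(xml_text: str) -> str:
    """Move programming-specific DocBook elements into a separate namespace."""
    root = ET.fromstring(xml_text)
    db_prefix = "{" + DB_NS + "}"
    for elem in root.iter():
        if elem.tag.startswith(db_prefix):
            local = elem.tag[len(db_prefix):]
            if local in PROGRAMMING_ELEMENTS:
                # Re-tag the element in the (hypothetical) programming namespace.
                elem.tag = "{" + PROG_NS + "}" + local
    # Serialize with DocBook as the default namespace and "prog" as the prefix.
    ET.register_namespace("", DB_NS)
    ET.register_namespace("prog", PROG_NS)
    return ET.tostring(root, encoding="unicode")
```

Given a chapter containing a classsynopsis, the output keeps chapter and title
in the DocBook namespace and writes the synopsis as prog:classsynopsis, with
the extra namespace declaration added on the root element.  Note that
ElementTree drops comments and processing instructions when parsing this way,
so a real conversion would more likely be a small XSLT transform.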


But maybe I'm missing something...
--
   Mike Maxwell
   What good is a universe without somebody around to look at it?
   --Robert Dicke, Princeton physicist

Re: [docbook-apps] Google summer of code

2010-03-14 Thread Mike Maxwell


Stefan Seefeld wrote:
I have been watching with interest the transition from DB 4 to DB 5, and 
in particular, the discussions about extensibility of a core 
vocabulary. A long time ago we discussed DocBook's limited support for 
programming language representations, and what to do about it. Norm 
argued that DB was not a modeling language, and thus, that it wasn't a 
good idea to add more elements akin to ooclass, methodsynopsis, etc. 
to the core.


Well, I hope that as part of my project proposal 
(http://docbook.xmlpress.net/tiki-index.php?page=api-markup), we can in 
fact add more elements, but keep them in a separate namespace, so the 
extension is better defined as such.


I would hope that other extensions (slides, website, etc.) could use a 
similar approach, i.e. all become domain-specific extensions (or 
profiles). This has a couple of important advantages, not the least 
that users are free to mix these vocabularies for their own purpose.


I come to this discussion as a DocBook user, but don't have any 
background on why DB got to where it is today.  Our use case is probably 
different from anyone else's: we're doing grammars of natural languages 
(Bengali, Urdu, Pashto...), so we have some grammar-specific extensions 
in their own namespace.  We also use the literate programming extension 
that Norm wrote some years ago, so that we can automatically turn our 
grammars into parsers.


That said--

My sense (which I guess I've voiced a couple times) is that there is 
already an awful lot (too much, IMO) about DB that is specific to 
programming languages.  Our localization has over 200 lines like

   <define name="db.classsynopsis"><notAllowed/></define>

My guess is that if you were to add programming elements in a separate 
namespace, you would want to move all the existing programming-specific 
elements into that namespace too.  I think the result would be a very
much more modular DocBook, rather like a modern programming language 
(Python, say) in which the generic stuff is in the main language, and 
constructs that deal with particular domains are in library modules.


In case it's not clear, I'm entirely for such modularization.
--
   Mike Maxwell
   What good is a universe without somebody around to look at it?
   --Robert Dicke, Princeton physicist

Re: [docbook-apps] Apostrophe in docbook document

2010-01-26 Thread maxwell

On Tue, 26 Jan 2010 14:42:34 -0600, Ron Catterall r...@catterall.net
wrote:
 Imagine a linguist wanting to search some text to count
 ...
 The problem of course is not a Docbook problem, it is in the UTF tables 

The problem is with neither, it is with the linguist :-).  (I can say
that, because I'm a linguist.)

All seriousness aside, using corpora for linguistics requires more than
looking for certain Unicode characters, which may not be used consistently
anyway (and especially in a case like this, where the characters--if they
were distinct Unicode characters--would doubtless be confused).  

Distinguishing between quotes and apostrophes requires some fairly complex
methods.  There are rules of thumb that often work, but they will break on
certain cases.  Corpus linguists become familiar with where these things
break, and construct work-arounds accordingly, or hand-tag recalcitrant
cases.
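
For the curious, here is the sort of rule of thumb I have in mind, as a minimal
Python sketch.  The function name and the rule itself (a ' with a letter on
both sides is an apostrophe, anything else is a quote) are just an
illustration, not a method anyone should rely on.

```python
import re
from typing import List, Tuple


def classify_marks(text: str) -> List[Tuple[int, str]]:
    """Label each ASCII ' in text as 'apostrophe' or 'quote' by a crude rule."""
    labels = []
    for match in re.finditer("'", text):
        i = match.start()
        # Treat the edges of the string as whitespace.
        before = text[i - 1] if i > 0 else " "
        after = text[i + 1] if i + 1 < len(text) else " "
        if before.isalpha() and after.isalpha():
            labels.append((i, "apostrophe"))  # letter on both sides, e.g. don't
        else:
            labels.append((i, "quote"))
    return labels
```

This gets "don't" right, but it labels the mark in "the dogs' bowls" as a
quote, which is exactly the kind of case where the rule breaks and a
work-around or hand-tagging is needed.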

If you really want an interesting problem, go for distinguishing among the
uses of the ASCII period!

   Mike Maxwell

Re: [docbook-apps] Graphic formats for screenshots

2009-08-06 Thread maxwell

On Thu, 6 Aug 2009 13:14:26 EDT, deannel...@aol.com wrote:
 I've never personally used it as a spoon, but it could be used 
 that way with some modifications.

Modifications to the CD, or to your mouth?

Re: [docbook-apps] RE: DocBook topic element]

2009-08-02 Thread Mike Maxwell


Denis Bradford wrote:

As a writer who uses both every day, I would look elsewhere for relative
strengths and weaknesses. For example, in DITA's favor I might point out
its small tag set...

IMO, DocBook's swiss army knife flexibility is a Good Thing in a
modular XML publishing system. From this standpoint, adding a
well-considered topic model is one more useful refinement to DocBook.


When a Swiss Army Knife has too many tools, it's hard to find the right 
one; all the blades you don't want get in the way.


I know there's work under way on a simplified DocBook.  Has any thought 
been given to modularizing DB, in the sense of having a core (which 
would probably be the simplified DB) and modules (for software/hardware 
documentation, maybe this topic model, and who knows what else--maybe 
linguistics and computational linguistics (my thing), chemistry,...)? 
It would be similar to the idea of having programming libraries (the 
Python model), rather than cramming everything into the main language 
(the Common Lisp/Ada model).


(Yes, I do know that I can omit elements.  We do that--lots of it.)
--
   Mike Maxwell
   What good is a universe without somebody around to look at it?
   --Robert Dicke, Princeton physicist
