Re: Asciidoc r2d984d29754c breaks a2x dblatex toolchain

Stuart Rackham Wed, 23 Feb 2011 17:11:50 -0800


On 24/02/11 13:38, Lex Trotman wrote:

On 24 February 2011 10:32, Stuart Rackham <[email protected]> wrote:

Hi Christian

On 24/02/11 03:45, Christian Kampka wrote:


Hi Stuart,

I believe revision 2d984d29754c broke the toolchain for section titles that
contain German umlauts, because it replaced the regexp that substitutes
Unicode characters in section title ids.

Attached you will find a document test.txt that will not build with
asciidoc tip and dblatex 0.3 using 'a2x -f pdf test.txt'.


You are correct, but the problem is with dblatex -- the id="_einführung"
attribute is perfectly legal DocBook. If you run the same test using FOP
instead of dblatex it works fine.

Your patch does allow dblatex to execute without errors, but it will cause
legal id attributes from other languages to degenerate to possibly ambiguous
(and hence illegal) underscores, e.g. id="_" will be generated instead of
id="_这是一个测试".
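
The collision described above can be sketched in Python. This is an
illustration only, not asciidoc's actual code: the function names ascii_id and
unicode_id, the "_" prefix handling, and both regular expressions are
assumptions made for the example.

```python
import re


def ascii_id(title, prefix="_"):
    # Hypothetical ASCII-only rule: every run of characters outside
    # [a-z0-9] collapses to a single underscore.
    base = re.sub(r"[^a-z0-9]+", "_", title.lower()).strip("_")
    return prefix + base


def unicode_id(title, prefix="_"):
    # Hypothetical Unicode-aware rule: \W on a str pattern is Unicode-aware
    # in Python 3, so letters from any script are kept.
    base = re.sub(r"\W+", "_", title.lower()).strip("_")
    return prefix + base


if __name__ == "__main__":
    for title in ("Einführung", "这是一个测试", "另一个标题"):
        print(repr(title), ascii_id(title), unicode_id(title))
```

Under the ASCII-only rule both Chinese titles reduce to the same id "_", which
is the ambiguity in question, while the Unicode-aware rule keeps them
distinct.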

You can work around the dblatex problem by specifying explicit ids e.g.

[[_einf_hrung]]
== Einführung
This is a test

or you could disable all auto-generated ids by undefining the sectids
attribute e.g.

:sectids!:

== Einführung
This is a test

Unless I can be convinced otherwise I'm reluctant to commit a patch that
introduces other problems and that really belongs in dblatex.

I've cc'd this email to the asciidoc discussion group for comment.


Well since you asked for comment :-)

I'd agree that a patch that fixes one language but breaks another is
not a good idea.

Without investigating too far, maybe all ids have to be hashes of the
human readable label so they will still be unique, but only contain
hex digits.  And that way there would not be any need for the
underscore that isn't acceptable in HTML 4.

My rationale for the auto-generated IDs was for a reasonably consistent
human-readable ID. A link like:


http://www.methods.co.nz/asciidoc/userguide.html#_inline_elements

is friendlier than:

http://www.methods.co.nz/asciidoc/userguide.html#_abc1345252345

So for me anyway, changing to a hash kind of defeats their purpose.

Of course the problem with auto-generated ids, vs explicit ids, is that they
are brittle (if you edit the section title the id changes).
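
For comparison, the hash-based alternative discussed above might look like the
following Python sketch. It is not a proposal for asciidoc itself: the name
hashed_id, the choice of SHA-1, the 12-digit truncation, and the leading "h"
(added so the id starts with a letter) are all assumptions made for the
example.

```python
import hashlib


def hashed_id(title, length=12):
    # Hypothetical scheme: SHA-1 of the UTF-8 encoded title, truncated to
    # `length` hex digits. The leading "h" is an added assumption so that
    # the id never starts with a digit.
    digest = hashlib.sha1(title.encode("utf-8")).hexdigest()
    return "h" + digest[:length]


if __name__ == "__main__":
    # The same title always yields the same id, but any edit to the title
    # yields an unrelated one.
    print(hashed_id("Inline Elements"))
    print(hashed_id("Inline Element"))
```

Such ids contain only ASCII characters whatever the language of the title, but
they are no less brittle than readable ids: editing the title still changes
the id, and the result gives a reader no hint about the section it points to.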

Regarding ``underscore that isn't acceptable in HTML 4'': this does not cause
a problem because asciidoc uses the HTML 4 name attribute, not the id
attribute, when generating HTML 4 (see
http://groups.google.com/group/asciidoc/browse_thread/thread/98e0b437cb97bd91/4c9c9353126d2e6f).



Cheers, Stuart


Cheers
Lex



Cheers, Stuart

I also attached a patch to correct the regexp for this problem, I hope
you can accept it into the official mainline code.

Cheers,
Christian


--
You received this message because you are subscribed to the Google Groups
"asciidoc" group.
To post to this group, send email to [email protected].
To unsubscribe from this group, send email to
[email protected].
For more options, visit this group at
http://groups.google.com/group/asciidoc?hl=en.


