On 24 February 2011 11:38, Lex Trotman <[email protected]> wrote:
> On 24 February 2011 10:32, Stuart Rackham <[email protected]> wrote:
>> Hi Christian
>>
>> On 24/02/11 03:45, Christian Kampka wrote:
>>>
>>> Hi Stuart,
>>>
>>> I believe that in revision 2d984d29754c you broke the toolchain for
>>> section titles containing German umlauts by replacing the regexp that
>>> substitutes Unicode characters in section title ids.
>>>
>>> Attached you will find a document test.txt that will not build with
>>> asciidoc tip and dblatex 0.3 using 'a2x -f pdf test.txt'.
>>
>> You are correct, but the problem is with dblatex -- the id="_einführung"
>> attribute is perfectly legal DocBook. If you run the same test using FOP
>> instead of dblatex it works fine.
>>
>> Your patch does allow dblatex to execute without errors but it will cause
>> legal id attributes from other languages to degenerate to possibly ambiguous
>> (and hence illegal) underscores e.g. id="_" will be generated instead of
>> id="_这是一个测试"
>>
>> You can work around the dblatex problem by specifying explicit ids e.g.
>>
>> [_einf_hrung]
>> == Einführung
>> This is a test
>>
>> or you could disable all auto-generated ids by undefining the sectids
>> attribute e.g.
>>
>> :sectids!:
>>
>> == Einführung
>> This is a test
>>
>> Unless I can be convinced otherwise I'm reluctant to commit a patch that
>> introduces other problems and that really belongs in dblatex.
>>
>> I've cc'd this email to the asciidoc discussion group for comment.
>
> Well since you asked for comment :-)
>
> I'd agree that a patch that fixes one language but breaks another is
> not a good idea.
>
> Without investigating too far, maybe all ids have to be hashes of the
> human readable label so they will still be unique, but only contain
> hex digits.  That way there would also be no need for the leading
> underscore, which isn't acceptable as the first character of an id in
> HTML 4.

In fact they don't need to be a hash; just the hex of the characters would do.
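
As a rough sketch of what I mean (Python, since asciidoc is Python). The
function names are made up for illustration and are not part of asciidoc,
the ASCII-only regexp is only my stand-in for the kind of substitution the
patch does, and the "id" prefix is my own assumption so that the result
starts with a letter rather than a digit:

```python
import re


def ascii_only_id(title, prefix="_"):
    """Assumed stand-in for an ASCII-only substitution (not asciidoc's code).

    Every run of characters outside [a-z0-9] collapses to one underscore.
    """
    slug = re.sub(r"[^a-z0-9]+", "_", title.lower()).strip("_")
    return prefix + slug


def hex_id(title, prefix="id"):
    """Hex of the title's UTF-8 bytes, behind a letter-only prefix."""
    return prefix + title.encode("utf-8").hex()


if __name__ == "__main__":
    for title in ("Einführung", "这是一个测试", "另一个标题"):
        print(title, ascii_only_id(title), hex_id(title))
```

With the ASCII-only version both Chinese titles end up as the same bare
"_", which is the ambiguity Stuart describes. The hex version stays distinct
for distinct titles and can be decoded back, at the cost of ids that are not
human readable.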

Cheers
Lex

>
> Cheers
> Lex
>
>>
>>
>> Cheers, Stuart
>>
>>> I also attached a patch to correct the regexp for this problem. I hope
>>> you can accept it into the official mainline code.
>>>
>>> Cheers,
>>> Christian
>>>
>>>
>>
>> --
>> You received this message because you are subscribed to the Google Groups
>> "asciidoc" group.
>> To post to this group, send email to [email protected].
>> To unsubscribe from this group, send email to
>> [email protected].
>> For more options, visit this group at
>> http://groups.google.com/group/asciidoc?hl=en.
>>
>>
>

