On 24 February 2011 11:38, Lex Trotman <[email protected]> wrote:
> On 24 February 2011 10:32, Stuart Rackham <[email protected]> wrote:
>> Hi Christian
>>
>> On 24/02/11 03:45, Christian Kampka wrote:
>>>
>>> Hi Stuart,
>>>
>>> I believe in revision 2d984d29754c, you broke the toolchain by replacing
>>> the regexp that substitutes unicode characters in section title ids with
>>> German umlauts.
>>>
>>> Attached you will find a document test.txt that will not build with
>>> asciidoc tip and dblatex 0.3 using 'a2x -f pdf test.txt'.
>>
>> You are correct, but the problem is with dblatex -- the id="_einführung"
>> attribute is perfectly legal DocBook. If you run the same test using FOP
>> instead of dblatex it works fine.
>>
>> Your patch does allow dblatex to execute without errors but it will cause
>> legal id attributes from other languages to degenerate to possibly ambiguous
>> (and hence illegal) underscores e.g. id="_" will be generated instead of
>> id="_这是一 个测试"
>>
>> You can work around the dblatex problem by specifying explicit ids e.g.
>>
>> [_einf_hrung]
>> == Einführung
>> This is a test
>>
>> or you could disable all auto-generated ids by undefining the sectids
>> attribute e.g.
>>
>> :sectids!:
>>
>> == Einführung
>> This is a test
>>
>> Unless I can be convinced otherwise I'm reluctant to commit a patch that
>> introduces other problems and that really belongs in dblatex.
>>
>> I've cc'd this email to the asciidoc discussion group for comment.
>
> Well since you asked for comment :-)
>
> I'd agree that a patch that fixes one language but breaks another is
> not a good idea.
>
> Without investigating too far, maybe all ids have to be hashes of the
> human readable label so they will still be unique, but only contain
> hex digits. And that way there would not be any need for the
> underscore that isn't acceptable in HTML 4.

In fact they don't need to be a hash, just the hex of the characters would do.

Cheers
lex

>
> Cheers
> Lex
>
>>
>>
>> Cheers, Stuart
>>
>>> I also attached a patch to correct the regexp for this problem, I hope
>>> you can accept it into the official mainline code.
>>>
>>> Cheers,
>>> Christian
>>>
>>>
>>
>> --
>> You received this message because you are subscribed to the Google Groups
>> "asciidoc" group.
>> To post to this group, send email to [email protected].
>> To unsubscribe from this group, send email to
>> [email protected].
>> For more options, visit this group at
>> http://groups.google.com/group/asciidoc?hl=en.
>>
>>
>
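
To make the "hex of the characters" suggestion concrete, here is a minimal sketch in Python. It is not asciidoc's actual id generation: the helper name `hex_id` and the `id` prefix are assumptions made for this example. The letter prefix is there because HTML 4 id values must begin with a letter, and a bare hex string could begin with a digit.

```python
def hex_id(title, prefix="id"):
    """Return an ASCII-only id: a letter prefix plus the hex of the title's UTF-8 bytes.

    Hypothetical helper for illustration only; not part of asciidoc.
    """
    return prefix + title.encode("utf-8").hex()


def title_from_hex_id(ident, prefix="id"):
    """Invert hex_id, showing that distinct titles always give distinct ids."""
    return bytes.fromhex(ident[len(prefix):]).decode("utf-8")


if __name__ == "__main__":
    for title in ("Einführung", "这是一个测试"):
        ident = hex_id(title)
        print(title, "->", ident)
        assert title_from_hex_id(ident) == title
```

Because the encoding is reversible, two different titles can never collapse to the same id, which avoids the ambiguous id="_" case described above. The cost is that the ids are long and no longer human readable.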
