Again. The code doesn't even *attempt* to hyphenate because the word contains non-letter characters.
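
A minimal sketch of the behaviour proposed in the original message quoted below: treat every non-letter character as a possible break point, keep scanning past the first one, and append a hyphenation character when the break is at something other than "-". This is not the attached patch and not FOP code. The names `HYPHEN`, `break_candidates` and `split_to_fit` are hypothetical, and width is counted in characters as a simplifying assumption (real layout would measure glyph widths).

```python
# Assumption: a plain "-" stands in for the hyphenation character; as noted
# in the thread, the real one may be a different code point.
HYPHEN = "-"


def break_candidates(word):
    """Return (head, tail) splits after every non-letter character in word."""
    candidates = []
    # Skip index 0 and the last character: a break there leaves nothing useful.
    for i, ch in enumerate(word[:-1]):
        if i > 0 and not ch.isalpha():
            head = word[: i + 1]
            if ch != HYPHEN:
                # Break at e.g. ":" gets a single hyphenation character added.
                head += HYPHEN
            candidates.append((head, word[i + 1:]))
    return candidates


def split_to_fit(word, room):
    """Return the split with the longest head that fits in room, or None."""
    best = None
    for head, tail in break_candidates(word):
        if len(head) <= room:
            best = (head, tail)  # keep scanning: a later break may still fit
    return best


word = "DAV:version-controlled-collection"
print(split_to_fit(word, 15))  # ('DAV:version-', 'controlled-collection')
print(split_to_fit(word, 25))  # ('DAV:version-controlled-', 'collection')
print(split_to_fit(word, 3))   # None
```

With 25 characters of room the break lands after the second "-" rather than always after the first, which is the difference from the current behaviour described in the quoted message.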

Julian

--
<green/>bytes GmbH -- http://www.greenbytes.de -- tel:+492512807760

> -----Original Message-----
> From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]
> Sent: Monday, August 11, 2003 3:39 PM
> To: [EMAIL PROTECTED]
> Subject: RE: 0.20.5 vs hyphenation
>
> Another observation: the hyphenation character may not be the minus sign.
> In fact, this would be semantically inconsistent: the minus sign and the
> hyphen are two different Unicode codepoints. See <hyphen-char> in the
> hyphenation rules.
>
> =============================================
> Marcelo Jaccoud Amaral
> Petrobras - TI - Negócios Eletrônicos
> mailto:jaccoud [at] petrobras.com.br
> voice: +55 21 2534-3485
> fax: +55 21 2534-1809
> =============================================
> There are only 10 kinds of people in the world: those who
> understand binary and those who don't.
>
> "Julian Reschke" <[EMAIL PROTECTED]mx.de>
> 2003-08-11 10:06
> To: <[EMAIL PROTECTED]>
> cc: (bcc: Marcelo Jaccoud Amaral/RJ/Petrobras)
> Subject: RE: 0.20.5 vs hyphenation
> Please reply to fop-user
>
> Marcelo,
>
> yes, I did search the archives. BTW: the test I am processing is English.
>
> As far as I understand, the issue is that once the code encounters a word
> containing non-letters, it doesn't even attempt to hyphenate it at all (so
> language settings and hyphenation rules aren't applied anyway). If the word
> itself is wider than the current line, it will just overflow (or be wrapped
> around silently).
>
> I think that it's definitely better to break the word where the minus sign
> is instead of giving up and producing warnings.
>
> Hope this clarifies a bit,
>
> Julian
>
> --
> <green/>bytes GmbH -- http://www.greenbytes.de -- tel:+492512807760
>
> > -----Original Message-----
> > From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]
> > Sent: Monday, August 11, 2003 2:46 PM
> > To: [EMAIL PROTECTED]
> > Subject: Re: 0.20.5 vs hyphenation
> >
> > This issue has appeared in the list before (search the archives, please).
> >
> > The point is that these "words" do not belong to the language being
> > processed (for you, I believe it is German). FOP can only hyphenate
> > based on the hyphenation rules, which are language-specific. I don't
> > believe this is a bug; on the contrary, it is a very consistent behaviour,
> > because there are no valid German words which contain colons, or percents,
> > or other peculiar signs. Or English. Or Portuguese. Or words from any
> > other natural language.
> >
> > The most semantically consistent way to do what you want without messing
> > with the code -- I think it is correct as it is -- would be to create a
> > hyphenation rule file for the language you are trying to hyphenate (say,
> > "x-url") and assign it to the text using xml:lang. Yes, you can create
> > hyphenation rules for artificial languages, too. FOP cannot tell the
> > difference. You can also insert soft hyphens manually, although that would
> > not function if the hyphenation rules reject the "word" (I have not tried).
> >
> > Cheers. <|:-) <--[how would you hyphenate this???]
> >
> > =============================================
> > Marcelo Jaccoud Amaral
> > Petrobras - TI - Negócios Eletrônicos
> > mailto:jaccoud [at] petrobras.com.br
> > =============================================
> > There are only 10 kinds of people in the world: those who
> > understand binary and those who don't.
> >
> > "Julian Reschke" <[EMAIL PROTECTED]mx.de>
> > 2003-08-09 18:38
> > To: <[EMAIL PROTECTED]>
> > cc: (bcc: Marcelo Jaccoud Amaral/RJ/Petrobras)
> > Subject: 0.20.5 vs hyphenation
> > Please reply to fop-user
> >
> > Hi.
> >
> > I think the non-letter handling in LineArea's hyphenation routine needs to
> > be enhanced.
> >
> > I'm producing a two-column index which contains keywords from WebDAV specs,
> > such as "DAV:version-controlled-collection", i.e. the words contain
> > multiple non-letter characters.
> >
> > The current implementation seems to have two limitations:
> >
> > - if a non-letter character other than "/" and "-" is found, the whole
> >   word isn't hyphenated at all.
> >
> > - the code that tries to identify "-"-separated word parts does this only
> >   once instead of continuing until no space is left.
> >
> > The result is that words like the example above aren't hyphenated at all,
> > and that words like "version-controlled-collection" will always be
> > hyphenated after the first minus sign, even if there would have been
> > enough room for more.
> >
> > I'm currently testing a patch that changes this to:
> >
> > - accept *all* non-letter characters as possible hyphenation points,
> >
> > - continue scanning the word after the first non-letter char was found,
> >   and
> >
> > - add a single hyphenation character if hyphenation occurs at a
> >   non-letter character other than "-".
> >
> > This seems to work fine for my test cases.
> >
> > As I'm new to this list I wonder how to proceed? Should I raise a bug and
> > post the patch?
> >
> > Julian
> >
> > (patch attached)
> >
> > --
> > <green/>bytes GmbH -- http://www.greenbytes.de -- tel:+492512807760
> >
> > ---------------------------------------------------------------------
> > To unsubscribe, e-mail: [EMAIL PROTECTED]
> > For additional commands, e-mail: [EMAIL PROTECTED]