Another observation: the hyphenation character may not be the minus sign.
In fact, this would be semantically inconsistent: the minus sign and the
hyphen are two different Unicode codepoints. See <hyphen-char> in the
hyphenation rules.
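
To make the distinction concrete, here is a small Python sketch (not part of FOP; the helper name describe is made up for this example) that prints the code points involved. The character typed as "-" on a keyboard is U+002D HYPHEN-MINUS; Unicode also defines a dedicated HYPHEN, a SOFT HYPHEN and a MINUS SIGN.

```python
import unicodedata

# Characters that are easily confused with one another.
CANDIDATES = ["\u002d", "\u2010", "\u00ad", "\u2212"]


def describe(ch: str) -> str:
    """Return 'U+XXXX NAME' for a single character."""
    return f"U+{ord(ch):04X} {unicodedata.name(ch)}"


for ch in CANDIDATES:
    print(describe(ch))
```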

=============================================
Marcelo Jaccoud Amaral
Petrobras - TI - Negócios Eletrônicos
mailto:jaccoud [at] petrobras.com.br
voice: +55 21 2534-3485
fax: +55 21 2534-1809
=============================================
There are only 10 kinds of people in the world: those who understand binary
and those who don't.

From: "Julian Reschke" <[EMAIL PROTECTED]>
Date: 2003-08-11 10:06
To: <[EMAIL PROTECTED]>
cc: (bcc: Marcelo Jaccoud Amaral/RJ/Petrobras)
Subject: RE: 0.20.5 vs hyphenation
Please reply to: fop-user

Marcelo,

yes, I did search the archives. BTW: the test I am processing is English.

As far as I understand, the issue is that once the code encounters a word
containing non-letters, it doesn't even attempt to hyphenate it at all (so
language settings and hyphenation rules aren't applied anyway). If the word
itself is wider than the current line, it will just overflow (or be wrapped
around silently).

I think that it's definitely better to break the word where the minus sign
is instead of giving up and producing warnings.
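
As an illustration only, the behaviour described in the quoted message below (accept every non-letter character as a break point, keep scanning past the first one, and add a hyphenation character unless the break falls on "-") might be sketched in Python roughly like this. This is not the attached patch and not FOP's Java code: the function name break_word, the greedy strategy and the character-count width model are all assumptions made for the example.

```python
def break_word(word: str, width: int, hyphen: str = "-") -> list[str]:
    """Greedily split `word` into lines of at most `width` characters,
    breaking only after non-letter characters.

    A hyphenation character is appended when the break falls after a
    non-letter character other than "-". Width is counted in characters,
    which is a simplification of real glyph widths.
    """
    lines = []
    rest = word
    while len(rest) > width:
        best = None  # (characters consumed, text of the line)
        for i, ch in enumerate(rest[:-1]):  # never break after the last char
            if ch.isalpha():
                continue
            piece = rest[: i + 1]
            line = piece if ch == "-" else piece + hyphen
            if len(line) <= width:
                best = (i + 1, line)  # keep the longest piece that fits
        if best is None:
            break  # no usable break point: let the remainder overflow
        lines.append(best[1])
        rest = rest[best[0]:]
    lines.append(rest)
    return lines


print(break_word("DAV:version-controlled-collection", 20))
print(break_word("DAV:version-controlled-collection", 30))
```

With a width of 30 the break falls after the second "-" rather than the first, which is the "continue scanning" part of the proposal.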

Hope this clarifies a bit,

Julian

--
<green/>bytes GmbH -- http://www.greenbytes.de -- tel:+492512807760

> -----Original Message-----
> From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]
> Sent: Monday, August 11, 2003 2:46 PM
> To: [EMAIL PROTECTED]
> Subject: Re: 0.20.5 vs hyphenation
>
>
>
> This issue has appeared in the list before (search the archives, please).
>
> The point is that these "words" do not belong to the language being
> processed (in your case, I believe it is German). FOP can only hyphenate
> based on the hyphenation rules, which are language-specific. I don't
> believe this is a bug; on the contrary, it is very consistent behaviour,
> because there are no valid German words which contain colons, percent
> signs, or other peculiar characters. Or English. Or Portuguese. Or words
> from any other natural language.
>
> The most semantically consistent way to do what you want without messing
> with the code -- I think it is correct as it is -- would be to create a
> hyphenation rule file for the language you are trying to hyphenate (say,
> "x-url") and assign it to the text using xml:lang. Yes, you can create
> hyphenation rules for artificial languages, too. FOP cannot tell the
> difference. You can also insert soft hyphens manually, although that
> would not work if the hyphenation rules reject the "word" (I have not
> tried).
>
> Cheers.  <|:-)         <--[how would you hyphenate this???]
>
> =============================================
> Marcelo Jaccoud Amaral
> Petrobras - TI - Negócios Eletrônicos
> mailto:jaccoud [at] petrobras.com.br
> =============================================
> There are only 10 kinds of people in the world: those who
> understand binary
> and those who don't.
>
> From: "Julian Reschke" <[EMAIL PROTECTED]>
> Date: 2003-08-09 18:38
> To: <[EMAIL PROTECTED]>
> cc: (bcc: Marcelo Jaccoud Amaral/RJ/Petrobras)
> Subject: 0.20.5 vs hyphenation
> Please reply to: fop-user
>
> Hi.
>
> I think the non-letter handling in LineArea's hyphenation routine needs
> to be enhanced.
>
> I'm producing a two-column index which contains keywords from WebDAV
> specs, such as "DAV:version-controlled-collection", i.e. the words
> contain multiple non-letter characters.
>
> The current implementation seems to have two limitations:
>
> - if a non-letter character other than "/" and "-" is found, the whole
>   word isn't hyphenated at all.
>
> - the code that tries to identify "-"-separated word parts does this only
>   once instead of continuing until no space is left.
>
> The result is that words like the example above aren't hyphenated at all,
> and that words like "version-controlled-collection" will always be
> hyphenated after the first minus sign, even if there would have been
> enough room for more.
>
> I'm currently testing a patch that changes this to:
>
> - accept *all* non-letter characters as possible hyphenation points,
>
> - continue scanning the word after the first non-letter char was found,
>   and
>
> - add a single hyphenation character if hyphenation occurs at a
>   non-letter character other than "-".
>
> This seems to work fine for my test cases.
>
> As I'm new to this list I wonder how to proceed? Should I raise a bug and
> post the patch?
>
>
> Julian
>
> (patch attached)
>
>
> --
> <green/>bytes GmbH -- http://www.greenbytes.de -- tel:+492512807760
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: [EMAIL PROTECTED]
> For additional commands, e-mail: [EMAIL PROTECTED]
>