Re: 0.20.5 vs hyphenation

jaccoud 11 Aug 2003 13:00:53 -0000

This issue has appeared in the list before (search the archives, please).

The point is that these "words" do not belong to the language being
processed (which, I believe, is German in your case). FOP can only hyphenate
based on hyphenation rules, which are language-specific. I don't believe
this is a bug. On the contrary, it is very consistent behaviour, because
there are no valid German words that contain colons, percent signs, or
other peculiar characters. Nor are there any in English, Portuguese, or any
other natural language.


The most semantically consistent way to do what you want without messing
with the code -- I think it is correct as it is -- would be to create a
hyphenation rule file for the language you are trying to hyphenate (say,
"x-url") and assign it to the text using xml:lang. Yes, you can create
hyphenation rules for artificial languages, too; FOP cannot tell the
difference. You can also insert soft hyphens manually, although that might
not work if the hyphenation rules reject the "word" (I have not tried).
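
For the soft-hyphen route, something along these lines could pre-process
the keywords before they go into the FO file. It is only a sketch: the
function name add_soft_hyphens is made up for illustration, and it assumes
the formatter honours U+00AD as a break opportunity, which (as said above)
has not been tried with FOP.

```python
SOFT_HYPHEN = "\u00ad"  # Unicode SOFT HYPHEN


def add_soft_hyphens(word):
    """Insert a soft hyphen after each non-letter character of `word`.

    A literal "-" is skipped, because a break after it would otherwise
    show two hyphens. Nothing is appended after the last character.
    """
    out = []
    last = len(word) - 1
    for i, ch in enumerate(word):
        out.append(ch)
        if i < last and not ch.isalpha() and ch != "-":
            out.append(SOFT_HYPHEN)
    return "".join(out)


if __name__ == "__main__":
    marked = add_soft_hyphens("DAV:version-controlled-collection")
    # repr() makes the otherwise invisible soft hyphen visible
    print(repr(marked))
```

The soft hyphen is invisible unless a line break actually happens at that
position, so the index entries look unchanged when they fit on one line.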

Cheers.  <|:-)         <--[how would you hyphenate this???]

=============================================
Marcelo Jaccoud Amaral
Petrobras - TI - Negócios Eletrônicos
mailto:jaccoud [at] petrobras.com.br
=============================================
There are only 10 kinds of people in the world: those who understand binary
and those who don't.



"Julian Reschke" <[EMAIL PROTECTED]>
2003-08-09 18:38
Please reply to: fop-user

To:      <[EMAIL PROTECTED]>
cc:      (bcc: Marcelo Jaccoud Amaral/RJ/Petrobras)
Subject: 0.20.5 vs hyphenation




Hi.

I think the non-letter handling in LineArea's hyphenation routine needs to
be enhanced.

I'm producing a two-column index which contains keywords from WebDAV specs,
such as "DAV:version-controlled-collection", i.e. the words contain multiple
non-letter characters.

The current implementation seems to have two limitations:

- if a non-letter character other than "/" and "-" is found, the whole word
isn't hyphenated at all.

- the code that tries to identify "-"-separated word parts does this only
once instead of continuing until no space is left.

The result is that words like the example above aren't hyphenated at all,
and that words like "version-controlled-collection" will always be
hyphenated after the first minus sign, even if there would have been enough
room for more.

I'm currently testing a patch that changes this to:

- accept *all* non-letter characters as possible hyphenation points,

- continue scanning the word after the first non-letter char was found, and

- add a single hyphenation character if hyphenation occurs at a non-letter
character other than "-".

This seems to work fine for my test cases.
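
The three changes listed above can be sketched roughly as follows. This is
not the attached patch and not FOP's LineArea code, just a simplified
illustration: the names break_points and split_word are made up, widths are
counted in characters instead of measured glyph widths, and the
language-specific hyphenation patterns inside letter runs are ignored.

```python
def break_points(word):
    """Return every index i where word may be split into word[:i] / word[i:].

    Every non-letter character (not only "/" and "-") opens a break
    opportunity directly after itself; the last character is excluded.
    """
    return [i + 1 for i, ch in enumerate(word[:-1]) if not ch.isalpha()]


def split_word(word, width):
    """Greedily split `word` into pieces of at most `width` characters."""
    lines = []
    rest = word
    while len(rest) > width:
        best = None
        # Keep scanning past the first non-letter: the last break that
        # still fits wins, so each line is filled as far as possible.
        for i in break_points(rest):
            head = rest[:i]
            if not head.endswith("-"):
                head += "-"  # add a hyphen when breaking at e.g. ":"
            if len(head) <= width:
                best = (i, head)
        if best is None:
            break  # no break opportunity fits; leave the remainder whole
        i, head = best
        lines.append(head)
        rest = rest[i:]
    lines.append(rest)
    return lines


if __name__ == "__main__":
    for width in (25, 20, 8):
        print(width, split_word("DAV:version-controlled-collection", width))
```

With a width of 25 the break lands after the second minus sign rather than
the first, and with a width of 8 the break after the colon gets an extra
hyphen ("DAV:-").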

As I'm new to this list I wonder how to proceed? Should I raise a bug and
post the patch?


Julian

(patch attached)


--
<green/>bytes GmbH -- http://www.greenbytes.de -- tel:+492512807760
---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]




