Again.

The code doesn't even *attempt* to hyphenate because the word contains
non-letter characters.
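
To make the behaviour concrete, here is a minimal sketch of that kind of gate. It is not the actual FOP code: the class and method names are made up for illustration, and it ignores the special handling of "/" and "-" described in the original report quoted below.

```java
public class NonLetterGate {

    // Hypothetical helper: returns true only if every character is a letter.
    // A word that fails this check would never reach the hyphenator.
    static boolean isPureLetterWord(String word) {
        if (word.isEmpty()) {
            return false;
        }
        for (int i = 0; i < word.length(); i++) {
            if (!Character.isLetter(word.charAt(i))) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        // An ordinary word passes the gate, a WebDAV keyword does not.
        System.out.println(isPureLetterWord("collection"));
        System.out.println(isPureLetterWord("DAV:version-controlled-collection"));
    }
}
```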

Julian

--
<green/>bytes GmbH -- http://www.greenbytes.de -- tel:+492512807760

> -----Original Message-----
> From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]
> Sent: Monday, August 11, 2003 3:39 PM
> To: [EMAIL PROTECTED]
> Subject: RE: 0.20.5 vs hyphenation
>
>
>
> Another observation: the hyphenation character may not be the minus sign.
> In fact, this would be semantically inconsistent: the minus sign and the
> hyphen are two different Unicode codepoints. See <hyphen-char> in the
> hyphenation rules.
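
For reference, a small sketch of the code points involved. The class name and the isDash helper are illustrative only; the code point values and general categories come from Unicode, and the ASCII "-" typed in these mails is U+002D HYPHEN-MINUS, which is neither U+2010 HYPHEN nor U+2212 MINUS SIGN.

```java
public class HyphenCodePoints {

    static final char HYPHEN_MINUS = '\u002D'; // the ASCII "-" character
    static final char SOFT_HYPHEN  = '\u00AD'; // invisible break opportunity
    static final char HYPHEN       = '\u2010'; // the dedicated hyphen
    static final char MINUS_SIGN   = '\u2212'; // the mathematical minus

    // Hypothetical helper: true if Unicode classifies c as dash punctuation (Pd).
    static boolean isDash(char c) {
        return Character.getType(c) == Character.DASH_PUNCTUATION;
    }

    public static void main(String[] args) {
        char[] all = { HYPHEN_MINUS, SOFT_HYPHEN, HYPHEN, MINUS_SIGN };
        for (char c : all) {
            // Print the code point, its Unicode name, and whether it is a dash.
            System.out.println(String.format("U+%04X %s dash=%b",
                    (int) c, Character.getName(c), isDash(c)));
        }
    }
}
```

The minus sign is classified as a math symbol rather than as dash punctuation, which is the inconsistency being pointed out.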
>
> =============================================
> Marcelo Jaccoud Amaral
> Petrobras - TI - Negócios Eletrônicos
> mailto:jaccoud [at] petrobras.com.br
> voice: +55 21 2534-3485
> fax: +55 21 2534-1809
> =============================================
> There are only 10 kinds of people in the world: those who
> understand binary
> and those who don't.
>
> From:     "Julian Reschke" <[EMAIL PROTECTED]mx.de>
> Date:     2003-08-11 10:06
> To:       <[EMAIL PROTECTED]>
> cc:       (bcc: Marcelo Jaccoud Amaral/RJ/Petrobras)
> Subject:  RE: 0.20.5 vs hyphenation
> Reply to: fop-user
>
> Marcelo,
>
> yes, I did search the archives. BTW: the text I am processing is English.
>
> As far as I understand, the issue is that once the code encounters a word
> containing non-letters, it doesn't even attempt to hyphenate it at all (so
> language settings and hyphenation rules aren't applied anyway). If the
> word itself is wider than the current line, it will just overflow (or be
> wrapped around silently).
>
> I think that it's definitely better to break the word where the minus
> sign is instead of giving up and producing warnings.
>
> Hope this clarifies a bit,
>
> Julian
>
> > -----Original Message-----
> > From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]
> > Sent: Monday, August 11, 2003 2:46 PM
> > To: [EMAIL PROTECTED]
> > Subject: Re: 0.20.5 vs hyphenation
> >
> >
> >
> > This issue has appeared in the list before (search the archives, please).
> >
> > The point is that these "words" do not belong to the language being
> > processed (in your case, I believe it is German). FOP can only hyphenate
> > based on the hyphenation rules, which are language-specific. I don't
> > believe this is a bug; on the contrary, it is very consistent behaviour,
> > because there are no valid German words which contain colons, or percent
> > signs, or other peculiar signs. Or English. Or Portuguese. Or words from
> > any other natural language.
> >
> > The most semantically consistent way to do what you want without messing
> > with the code -- I think it is correct as it is -- would be to create a
> > hyphenation rule file for the language you are trying to hyphenate (say,
> > "x-url") and assign it to the text using xml:lang. Yes, you can create
> > hyphenation rules for artificial languages, too. FOP cannot tell the
> > difference. You can also insert soft hyphens manually, although that
> > would not work if the hyphenation rules reject the "word" (I have not
> > tried).
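
The soft-hyphen idea could be sketched as a preprocessing step like the one below. This is only an illustration: the class and method names are invented, and whether FOP honours the inserted U+00AD characters in such "words" is exactly the part that has not been tried.

```java
public class SoftHyphenInserter {

    static final char SOFT_HYPHEN = '\u00AD';

    // Hypothetical helper: appends a soft hyphen after every non-letter
    // character, except when that character is the last one in the word.
    static String withSoftHyphens(String word) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < word.length(); i++) {
            char c = word.charAt(i);
            out.append(c);
            boolean isLast = (i == word.length() - 1);
            if (!Character.isLetter(c) && !isLast) {
                out.append(SOFT_HYPHEN);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        String marked = withSoftHyphens("DAV:version-controlled-collection");
        // Soft hyphens are invisible, so report the length difference instead.
        System.out.println(marked.length() - "DAV:version-controlled-collection".length());
    }
}
```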
> >
> > Cheers.  <|:-)         <--[how would you hyphenate this???]
> >
> > =============================================
> > Marcelo Jaccoud Amaral
> > Petrobras - TI - Negócios Eletrônicos
> > mailto:jaccoud [at] petrobras.com.br
> > =============================================
> > There are only 10 kinds of people in the world: those who
> > understand binary
> > and those who don't.
> >
> >
> > From:     "Julian Reschke" <[EMAIL PROTECTED]mx.de>
> > Date:     2003-08-09 18:38
> > To:       <[EMAIL PROTECTED]>
> > cc:       (bcc: Marcelo Jaccoud Amaral/RJ/Petrobras)
> > Subject:  0.20.5 vs hyphenation
> > Reply to: fop-user
> >
> > Hi.
> >
> > I think the non-letter handling in LineArea's hyphenation routine needs
> > to be enhanced.
> >
> > I'm producing a two-column index which contains keywords from WebDAV
> > specs, such as "DAV:version-controlled-collection", i.e. the words
> > contain multiple non-letter characters.
> >
> > The current implementation seems to have two limitations:
> >
> > - if a non-letter character other than "/" and "-" is found, the whole
> > word isn't hyphenated at all.
> >
> > - the code that tries to identify "-"-separated word parts does this only
> > once instead of continuing until no space is left.
> >
> > The result is that words like the example above aren't hyphenated at
> > all, and that words like "version-controlled-collection" will always be
> > hyphenated after the first minus sign, even if there would have been
> > enough room for more.
> >
> > I'm currently testing a patch that changes this to:
> >
> > - accept *all* non-letter characters as possible hyphenation points,
> >
> > - continue scanning the word after the first non-letter char was found,
> > and
> >
> > - add a single hyphenation character if hyphenation occurs at a
> > non-letter character other than "-".
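
The three rules listed above could look roughly like the sketch below. To be clear, this is not the attached patch: the names are hypothetical, and the available width is counted in characters, whereas the real LineArea code presumably works with font metrics.

```java
public class NonLetterBreaker {

    // Hypothetical helper: returns the index at which to split the word so
    // that the head fits into maxChars, or -1 if no break point fits.
    static int findBreak(String word, int maxChars) {
        int best = -1;
        // Keep scanning past the first non-letter; the last fitting one wins.
        for (int i = 0; i < word.length() - 1; i++) {
            char c = word.charAt(i);
            if (Character.isLetter(c)) {
                continue;
            }
            // The head ends with c; a hyphen is added unless c is already "-".
            int headLength = (i + 1) + (c == '-' ? 0 : 1);
            if (headLength <= maxChars) {
                best = i + 1;
            }
        }
        return best;
    }

    // Splits the word into head and tail, or returns it whole if nothing fits.
    static String[] split(String word, int maxChars) {
        int at = findBreak(word, maxChars);
        if (at < 0) {
            return new String[] { word };
        }
        char breakChar = word.charAt(at - 1);
        String head = word.substring(0, at) + (breakChar == '-' ? "" : "-");
        return new String[] { head, word.substring(at) };
    }

    public static void main(String[] args) {
        String[] parts = split("DAV:version-controlled-collection", 25);
        for (String part : parts) {
            System.out.println(part);
        }
    }
}
```

With room for 25 characters the sketch breaks after the second minus sign instead of the first, and a break at ":" gets one added hyphen.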
> >
> > This seems to work fine for my test cases.
> >
> > As I'm new to this list, I wonder how to proceed. Should I raise a bug
> > and post the patch?
> >
> >
> > Julian
> >
> > (patch attached)
> >
> >
> > --
> > <green/>bytes GmbH -- http://www.greenbytes.de -- tel:+492512807760
> > ---------------------------------------------------------------------
> > To unsubscribe, e-mail: [EMAIL PROTECTED]
> > For additional commands, e-mail: [EMAIL PROTECTED]