Re: Leading/trailing space removal in LineLM

2005-11-08 Thread Manuel Mall
On Sat, 5 Nov 2005 12:05 am, Luca Furini wrote: Manuel Mall wrote: Here are some of the combinations I have identified: 1. Non-breaking / non-elastic space = probably just a normal character, i.e. part of a word. 2. Non-breaking / elastic space, e.g. U+00A0 Non-breaking space =

Re: Leading/trailing space removal in LineLM

2005-11-08 Thread Manuel Mall
On Wed, 9 Nov 2005 12:47 am, Andreas L Delmelle wrote: On Nov 4, 2005, at 17:05, Luca Furini wrote: Hi Manuel / Luca, Manuel Mall wrote: Here are some of the combinations I have identified: snip / 6. Breaking / elastic / non-removable, e.g. U+3000 Ideographic space = Must handle

Re: Leading/trailing space removal in LineLM

2005-11-08 Thread Manuel Mall
On Wed, 9 Nov 2005 12:32 pm, Andreas L Delmelle wrote: On Nov 9, 2005, at 02:09, Manuel Mall wrote: Manuel, snip/ We're (again) more in agreement than we realize, I think... Although, now you got me wondering what you think is my 'basic sentiment' :-) After reading your post = yes we are

Re: Leading/trailing space removal in LineLM

2005-11-04 Thread J.Pietschmann
Luca Furini wrote: note that a word with a soft hyphen in its middle would not be hyphenated, unless we ignore this character when collecting word fragments Well, in order to prepare for hyphenation, other characters like joiners have to be removed too. We should probably also use Unicode

Re: Leading/trailing space removal in LineLM

2005-11-03 Thread Manuel Mall
On Wed, 2 Nov 2005 11:58 pm, Luca Furini wrote: Manuel Mall wrote: Luca wrote a longer response to this but my mail reader doesn't like the character set (is that topical or what?). Sorry, it looks really horrible ... still don't know what went wrong, but I won't do it again! :-) Any

Re: Leading/trailing space removal in LineLM

2005-11-03 Thread J.Pietschmann
Manuel Mall wrote: Hmm, to me it appears that UNICODE and XSL-FO have slightly different models when it comes to white space in the context of line breaking which is causing the discussion here. I don't think so. The overlap between UAX14 and XSLFO is that both mandate a line break for each

Re: Leading/trailing space removal in LineLM

2005-11-03 Thread Andreas L Delmelle
On Nov 3, 2005, at 08:53, Manuel Mall wrote: On Thu, 3 Nov 2005 06:03 am, J.Pietschmann wrote: Computing line breaking opportunities and discarding whitespace at the end (or beginning) of a line are different matters. If whitespace has to be retained, trailing spaces after a non-space string

Re: Leading/trailing space removal in LineLM

2005-11-03 Thread Manuel Mall
On Fri, 4 Nov 2005 06:22 am, Andreas L Delmelle wrote: On Nov 3, 2005, at 08:53, Manuel Mall wrote: On Thu, 3 Nov 2005 06:03 am, J.Pietschmann wrote: But I am not sure if this can be done in all cases. Otherwise we may have to modify the UNICODE line breaking algorithm to cater for the

Re: Leading/trailing space removal in LineLM

2005-11-02 Thread Luca Furini
Manuel Mall wrote: So we end up with only two cases to consider: preserve white space and remove white space around a line break created by the Knuth algorithm. 1. Preserve white space: IMO in this case the space itself is actually not a break opportunity but there are now two break

Re: Leading/trailing space removal in LineLM

2005-11-02 Thread Manuel Mall
On Wed, 2 Nov 2005 01:59 pm, Manuel Mall wrote: On Wed, 2 Nov 2005 04:18 am, Simon Pepping wrote: On Tue, Nov 01, 2005 at 11:40:42PM +0800, Manuel Mall wrote: This is probably a question for Luca or Simon. snip/ Glue and penalty items are removed at the start of a line. This is part

Re: Leading/trailing space removal in LineLM

2005-11-02 Thread Luca Furini
Manuel Mall wrote: Luca wrote a longer response to this but my mail reader doesn't like the character set (is that topical or what?). Sorry, it looks really horrible ... still don't know what went wrong, but I won't do it again! :-) Anyway, at the end Luca asks the question about the UAX#14

Re: Leading/trailing space removal in LineLM

2005-11-02 Thread Simon Pepping
On Wed, Nov 02, 2005 at 04:58:09PM +0100, Luca Furini wrote: Manuel Mall wrote: Luca wrote a longer response to this but my mail reader doesn't like the character set (is that topical or what?). Sorry, it looks really horrible ... still don't know what went wrong, but I won't do it

Re: Leading/trailing space removal in LineLM

2005-11-02 Thread J.Pietschmann
Manuel Mall wrote: a) Yes, UAX#14 always breaks at the end of a sequence of spaces. b) But it also says that it assumes any trailing spaces in a line are being removed. This conflicts with XSL-FO, which can force spaces to be retained; therefore adjustments to the algorithm are necessary to cater for

Re: Leading/trailing space removal in LineLM

2005-11-02 Thread Manuel Mall
On Thu, 3 Nov 2005 06:03 am, J.Pietschmann wrote: Manuel Mall wrote: a) Yes, UAX#14 always breaks at the end of a sequence of spaces. b) But it also says that it assumes any trailing spaces in a line are being removed. This conflicts with XSL-FO, which can force spaces to be retained; therefore

Re: Leading/trailing space removal in LineLM

2005-11-01 Thread Simon Pepping
On Tue, Nov 01, 2005 at 11:40:42PM +0800, Manuel Mall wrote: This is probably a question for Luca or Simon. In LineLM we have this code: // ignore KnuthGlue and KnuthPenalty objects // at the beginning of the line seqIterator =
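
For readers without the FOP sources at hand, the behaviour under discussion (ignoring glue and penalty elements at the beginning of a line) can be sketched roughly as follows. This is an illustration only, not the LineLM code quoted above, which is truncated in this listing. The class LineStartSketch, the enum Kind and the method firstContentIndex are hypothetical names made up for this sketch; they are not FOP's KnuthGlue/KnuthPenalty classes, and the sketch ignores the preserved-white-space case debated elsewhere in the thread.

```java
import java.util.List;

// Hypothetical sketch, not FOP code: models a Knuth element sequence
// as a list of element kinds and skips discardable items at a line start.
final class LineStartSketch {

    // Stand-ins for the three Knuth element types (box, glue, penalty).
    enum Kind { BOX, GLUE, PENALTY }

    // Returns the index of the first BOX at or after lineStart,
    // or seq.size() if only glue and penalties remain.
    static int firstContentIndex(List<Kind> seq, int lineStart) {
        int i = lineStart;
        while (i < seq.size() && seq.get(i) != Kind.BOX) {
            i++; // glue and penalty items at the line start are ignored
        }
        return i;
    }

    public static void main(String[] args) {
        List<Kind> seq = List.of(Kind.GLUE, Kind.PENALTY, Kind.BOX, Kind.GLUE, Kind.BOX);
        System.out.println(firstContentIndex(seq, 0));
        System.out.println(firstContentIndex(seq, 3));
    }
}
```

If white space has to be preserved, a space could not simply be modelled as discardable glue in this way, which is the conflict with the UAX#14 assumption raised in the replies listed above.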