Re: Line wrapping of mixed LTR/RTL text
> From: Cosmin Apreutesei
> Date: Tue, 28 Aug 2018 21:28:58 +0300
> Cc: unicode@unicode.org
>
> > That is not so if the line ends after the whitespace: in that case the
> > whitespace is trailing, and will appear at the visual end of the
> > line.
>
> So only if it's a soft break I should indeed remove the last logical
> space, if it's before a hard break then leave it alone.

Actually, you don't have to remove it, you could leave it. It's only
an aesthetic issue.

> > No, it should show the space after ABC to the left of ABC,
> > i.e. immediately before the line end.
>
> Just to make sure, this moving of the last space at the visual end of
> the line can only be experienced with a moving cursor, right? I mean
> as far as displaying goes (and as far as line width computation for
> the purposes of line wrapping goes), that space is just removed,
> right?

As I said, not necessarily. But it is definitely there when you
reorder characters for display.

> I'm trying to infer the purpose of moving that space to the
> end of the line instead of just removing it

If you remove trailing space, then you need to see it being trailing
before you remove it. That is the purpose of moving it.

> > What UAX#9 tells you is that you need to decide that the line will
> > wrap after the space that follows "ABC"
>
> ... but when computing the line width I should not include the width
> of that space, right? since it will not take space in the box in the
> end.

If you will remove the space, then yes.

> You mean it will produce this:
>
> " ABC لمفاتيح"

Yes.
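For illustration, the width question above could be handled roughly like
this. It is only a sketch of one possible policy (drop trailing whitespace
from the measured width at a soft break, keep it at a hard break); the
function name line_advance and the one-unit-per-character char_width
default are assumptions made up for this example, not part of any library.

```python
def line_advance(text, start, end, is_soft_break, char_width=lambda ch: 1):
    """Width of text[start:end] as used when fitting a line into the box.

    At a soft break the trailing whitespace is not measured, on the
    assumption that the renderer will drop it (or let it hang in the margin).
    At a hard break it is measured like any other character.
    """
    if is_soft_break:
        # Strip trailing whitespace from the measured range only;
        # the text itself is left untouched.
        while end > start and text[end - 1].isspace():
            end -= 1
    return sum(char_width(ch) for ch in text[start:end])


# "ABC " measured as a soft-wrapped line and as a hard-broken line.
soft = line_advance("ABC DEF", 0, 4, is_soft_break=True)
hard = line_advance("ABC DEF", 0, 4, is_soft_break=False)
```

Note that the space is only excluded from the measurement; it stays in the
logical text, so it is still there when the line is reordered for display.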
Re: Line wrapping of mixed LTR/RTL text
Hi Philippe,

> The space encoded just before the logical end of line or linewrap (in the
> middle of the displayed line) has to be moved at end of the physical line (in
> the paragraph direction), it should not be kept in the middle.

Ok, that seems to confirm what Eli is saying and it clarifies that
sentence from UAX#9. Thanks!
Re: Line wrapping of mixed LTR/RTL text
Hi Eli,

thanks for answering! I think I'm getting closer. Just a few more
clarifications if you please.

> That is not so if the line ends after the whitespace: in that case the
> whitespace is trailing, and will appear at the visual end of the
> line.

So only if it's a soft break I should indeed remove the last logical
space, if it's before a hard break then leave it alone.

> Only if you add some character after the whitespace will the
> whitespace "jump" to the other side of the word.

... because the hard break just turned into a soft break and the newly
typed character will appear on the next line with a hard line break
after it, right?

> No, it should show the space after ABC to the left of ABC,
> i.e. immediately before the line end.

Just to make sure, this moving of the last space at the visual end of
the line can only be experienced with a moving cursor, right? I mean
as far as displaying goes (and as far as line width computation for
the purposes of line wrapping goes), that space is just removed,
right?

I'm trying to infer the purpose of moving that space to the end of the
line instead of just removing it: is the idea to always provide a
cursor at the visual end of the line so that typing can continue there
or is there more to it?

> What UAX#9 tells you is that you need to decide that the line will
> wrap after the space that follows "ABC"

... but when computing the line width I should not include the width
of that space, right? since it will not take space in the box in the
end.

>, then reorder the line as if it
> ended after that space, which will produce this:
>
> لمفاتيح ABC
>
> (with the trailing space to the left of "ABC"). Then you should
> display "DEF" on the next line.

You mean it will produce this:

" ABC لمفاتيح"
Re: Line wrapping of mixed LTR/RTL text
The space encoded just before the logical end of line or linewrap (in
the middle of the displayed line) has to be moved to the end of the
physical line (in the paragraph direction); it should not be kept in
the middle.

If you need to force a linewrap on a non-breaking space (because
there's no other break opportunity to wrap the line elsewhere), then
treat that non-breaking space as a regular breaking space, which will
also be moved to the end of the row (after the margin on the ending
side of the paragraph), and choose the last non-breaking space on the
row; usually, all spaces present at linewraps (including non-breaking
spaces) are compacted.

But there are other style policies that will force the linewrap
preferably after a trailing punctuation or a separator punctuation, or
before a leading punctuation, or just after the last unbreakable
cluster that can fit the row (including in the middle of words at
arbitrary positions if there's no hyphenation process or the script
does not support hyphenation, such as sinograms and kanas).

Where to insert linewraps is very fuzzy and depends on the rendering
context and capabilities of the target device (you cannot scroll a
piece of printed paper, but you can scroll a display with a scrollbar
or using navigation cursors in a width-restricted input field).

On Tue, 28 Aug 2018 at 16:34, Cosmin Apreutesei via Unicode
<unicode@unicode.org> wrote:

> Hello everyone,
>
> I'm having a bit of trouble implementing line wrapping with bidi and I
> would like to ask for some advice or hints on what is the proper way
> to do this.
>
> UAX#9 section 3.4 says that bidi reordering should be done after line
> wrapping. But in order to do line wrapping correctly I need to be able
> to visually ignore some whitespace, and I'm not sure exactly which
> whitespace must be ignored.
>
> There is this sentence in UAX#9 which provides a clue: "[...] trailing
> whitespace will appear at the visual end of the line (in the paragraph
> direction).". I'm not sure what that means, but by doing some tests
> with fribidi and libunibreak I noticed that the whitespace always
> sticks to the logical end of the word (so visually to the right for
> LTR runs and to the left for RTL runs), regardless of the base
> paragraph direction. Is it safe to use this assumption and always
> remove the whitespace at the logical end of the last word of the line?
> Or is it more complicated than that?
>
> Quick example showing the problem. The following text:
>
> لمفاتيح ABC DEF
>
> with RTL base direction would wrap (for a certain line width) as:
>
> ABC لمفاتيح
> DEF
>
> with two spaces between the Latin and Arabic text, one from the Latin
> text and one from the Arabic text. Since the line logically ends with
> the "C" and LTR direction, I should have to probably remove the space
> after the "C" (and, as a rule, just remove the whitespace at the
> logical end of the word, regardless of paragraph's direction or word's
> direction). Is this the right way to do it?
>
> Screenshots attached.
>
> Thanks!
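The fallback order described in the reply above (ordinary space first,
then a non-breaking space, then an arbitrary cut) might be sketched as
follows. This is a deliberately simplified illustration: choose_break is a
hypothetical helper, widths are counted in code points, and a single code
point stands in for an "unbreakable cluster", which is an assumption that
does not hold for real text (combining marks, ligatures, etc.). It also
ignores the punctuation-based policies mentioned above.

```python
NBSP = "\u00a0"


def choose_break(text, start, max_width):
    """Return the index at which the next line should start.

    Preference order: after the last ordinary space whose preceding text
    fits, else after the last non-breaking space, else a forced cut after
    max_width code points.
    """
    limit = start + max_width
    if limit >= len(text):
        return len(text)  # the rest of the text fits on this line
    for space in (" ", NBSP):
        # A space at index i is usable if text[start:i] fits, i.e. i <= limit;
        # the space itself trails the line and is not counted.
        i = text.rfind(space, start + 1, limit + 1)
        if i != -1:
            return i + 1
    return limit  # no space of either kind: cut in the middle of the word


brk = choose_break("ABC DEF", 0, 5)
```

The space that ends up before the returned index is the one that becomes
trailing whitespace of the line, and so is the one that moves to the
visual end of the line when the line is reordered.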
Re: Line wrapping of mixed LTR/RTL text
> Date: Tue, 28 Aug 2018 13:44:58 +0300
> From: Cosmin Apreutesei via Unicode
>
> There is this sentence in UAX#9 which provides a clue: "[...] trailing
> whitespace will appear at the visual end of the line (in the paragraph
> direction).". I'm not sure what that means, but by doing some tests
> with fribidi and libunibreak I noticed that the whitespace always
> sticks to the logical end of the word (so visually to the right for
> LTR runs and to the left for RTL runs), regardless of the base
> paragraph direction.

That is not so if the line ends after the whitespace: in that case the
whitespace is trailing, and will appear at the visual end of the
line. Only if you add some character after the whitespace will the
whitespace "jump" to the other side of the word.

> Quick example showing the problem. The following text:
>
> لمفاتيح ABC DEF
>
> with RTL base direction would wrap (for a certain line width) as:
>
> ABC لمفاتيح
> DEF
>
> with two spaces between the Latin and Arabic text, one from the Latin
> text and one from the Arabic text.

No, it should show the space after ABC to the left of ABC,
i.e. immediately before the line end.

What UAX#9 tells you is that you need to decide that the line will
wrap after the space that follows "ABC", then reorder the line as if it
ended after that space, which will produce this:

لمفاتيح ABC

(with the trailing space to the left of "ABC"). Then you should
display "DEF" on the next line.

IOW, the correct order is:

  . find levels
  . wrap in logical order
  . reorder wrapped lines
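To make the three steps concrete, here is a toy sketch in Python. It is
NOT a UAX#9 implementation: find_levels only knows strong L/R characters
and treats everything else as neutral (no numbers, no explicit embeddings
or isolates), the wrapper breaks only at U+0020 and counts one unit per
code point, and only the trailing-whitespace part of rule L1 is applied.
All function names are made up for this example. A real program would use
a bidi library for steps 1 and 3.

```python
import unicodedata


def strong_dir(ch):
    """Return 'L' or 'R' for strong characters, None for everything else."""
    bc = unicodedata.bidirectional(ch)
    if bc == "L":
        return "L"
    if bc in ("R", "AL"):
        return "R"
    return None  # toy model: digits, spaces, punctuation all count as neutral


def find_levels(text, base_level):
    """Step 1 (toy): one embedding level per character of the paragraph."""
    base = "R" if base_level % 2 else "L"
    dirs = [strong_dir(ch) for ch in text]
    n, i = len(dirs), 0
    while i < n:
        if dirs[i] is not None:
            i += 1
            continue
        j = i
        while j < n and dirs[j] is None:
            j += 1
        before = dirs[i - 1] if i > 0 else base
        after = dirs[j] if j < n else base
        # A neutral run takes its neighbours' direction if they agree,
        # otherwise the paragraph direction.
        dirs[i:j] = [before if before == after else base] * (j - i)
        i = j
    return [base_level if d == base else base_level + 1 for d in dirs]


def wrap_logical(text, max_width):
    """Step 2: greedy wrap in logical order, returning (start, end) ranges.
    Trailing spaces belong to the line but do not count toward its width."""
    lines, start, n = [], 0, len(text)
    while start < n:
        end = pos = start
        while pos < n:
            w = pos
            while w < n and text[w] != " ":
                w += 1
            word_end = w
            while w < n and text[w] == " ":
                w += 1
            if word_end - start > max_width and end > start:
                break
            end = pos = w
        lines.append((start, end))
        start = end
    return lines


def reorder_line(text, levels, start, end, base_level):
    """Step 3: reorder one wrapped line into visual (left-to-right) order."""
    chars, lv = list(text[start:end]), list(levels[start:end])
    i = len(chars)
    while i > 0 and chars[i - 1] == " ":
        i -= 1
        lv[i] = base_level  # trailing whitespace takes the paragraph level
    highest = max(lv, default=0)
    lowest_odd = min((l for l in lv if l % 2), default=highest + 1)
    for level in range(highest, lowest_odd - 1, -1):
        i = 0
        while i < len(lv):
            j = i
            while j < len(lv) and lv[j] >= level:
                j += 1
            chars[i:j] = reversed(chars[i:j])  # reverse each run at >= level
            i = max(j, i + 1)
    return "".join(chars)


def layout(text, base_level, max_width):
    levels = find_levels(text, base_level)  # levels come from the whole paragraph
    return [reorder_line(text, levels, s, e, base_level)
            for s, e in wrap_logical(text, max_width)]


# The example from this thread, RTL paragraph, wrapping after "ABC ".
text = "\u0644\u0645\u0641\u0627\u062a\u064a\u062d ABC DEF"
visual_lines = layout(text, base_level=1, max_width=11)
```

The strings in visual_lines are stored in visual order, leftmost character
first, so the first one starts with the trailing space followed by "ABC".
Printing them in a bidi-aware terminal will reorder them a second time, so
inspect them with repr() or by index instead.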