On Mon, 12 Feb 2001, Mark Davis wrote:

Thank you for your answer.

> Asmus Freytag is the one to talk to; he can look into this.

Do you think I should contact him directly off-line? I thought he was on this list, now as well as back in March 2000 when I wrote about TUS 3.0 p. 124.

> On Mon, 12 Feb 2001, "Jungshik Shin" <[EMAIL PROTECTED]> wrote:
> > On Sun, 11 Feb 2001, Mark Davis wrote:
> >
> > MD> Please read TUS Chapter 5 and the Linebreak TR before proceeding, as I
> > MD> recommended in my last message. The Unicode standard is online, as is
> >
> > As I wrote when TUS 3.0 came out, I cannot help wondering where the idea
> > that leads to the following in the TR on line breaking (and what's written
> > about it in Chap 5 of TUS 3.0) came from.
> >
> > UTR14> Korean may alternately use a space-based (style 1) instead of the
> > UTR14> style 2 context analysis.

BTW, this clearly shows that what Rick McGowan wrote about 'either ... or' in March 2000, in response to what I wrote about the Korean line breaking rule (TUS 3.0 p. 124), is not right, as I argued then. I'm sure he's right about 'either ... or' in English grammar, but the intention of the author is on my side if the author of UTR 14 is the same as that of the part in question in TUS 3.0. I'm enclosing at the end of this message a part of my message in response to him.

> > I'm very alarmed to find this 'misinformation' crept into the UTS and
> > UTR14 (now UAX #14). It would be nice if somebody in charge could get
> > this straightened.

This wasn't straightened out in Unicode 3.1, either. What would be the best way to get it addressed before the next revision comes out? I'm afraid just raising it on this list wouldn't be sufficient (of course, I should have followed up more vigorously last year).

Regards,

Jungshik Shin

Enc.
1. Two messages of mine
   the first one: March 1, 2000
   the second one: March 2, 2000


From: Jungshik Shin <[EMAIL PROTECTED]>
Subject: Korean line breaking rules : Unicode 3.0 (p. 124)
Date: Wed, 1 Mar 2000 19:23:23 -0800 (PST)

On Sun, 13 Feb 2000, Kenneth Whistler wrote:

> Lest anyone feel unduly constrained, let me note that now that
> the editorial committee has closed the book, so to speak, on Unicode 3.0,
> all of you who are about to open the book for the first time should
> feel free to unleash your commentary on the text.

I've just received my copy of the Unicode 3.0 book, so here goes my first commentary.

On page 124 (section 5.15 Locating Text Element Boundaries), the third paragraph has the following around the end:

U3.0> In particular, word, line, and sentence boundaries will need to
U3.0> be customized according to locale and user preference. In Korean,
U3.0> for example, lines may be broken either at spaces(as in Latin text) or
U3.0> on ideographic boundaries (as in Chinese).

First of all, it's a great mystery to me how on earth this strange notion of Korean having *two* different line breaking rules (as opposed to one) crept into the expertise of non-Korean experts on Korean and finally made it into the Unicode 3.0 book and the Unicode TR on line breaking. None of the dozens of Korean books on my bookshelves that I've just gone through breaks lines *exclusively* at spaces. All of them break lines freely at *syllables*.

The only places I can think of where lines are broken *exclusively* at spaces (for Korean text) are web browsers like Netscape and MS IE, which are completely *broken* as far as Korean line breaking is concerned, and possibly earlier implementations of Korean LaTeX. One may add to the list Korean text formatted by a non-localized version of 'fmt' (in Unix) as another example. To work around the problem caused by these broken web browsers, some Korean web authors apply a simple filter to their html files that inserts <wbr> between every pair of Korean syllables.
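
For illustration only, such a filter might look like the following Python sketch. It is not the filter any particular author used: the function name `insert_wbr` is hypothetical, and the sketch assumes the input is plain text (it does not skip HTML tags or attribute values) and that "Korean syllable" means a precomposed Hangul syllable in the range U+AC00 to U+D7A3.

```python
import re

# Assumption: a "Korean syllable" is a precomposed Hangul syllable,
# i.e. a code point in U+AC00..U+D7A3. Conjoining jamo are ignored.
_SYLLABLE_BEFORE_SYLLABLE = re.compile(r"([\uAC00-\uD7A3])(?=[\uAC00-\uD7A3])")


def insert_wbr(text: str) -> str:
    """Insert <wbr> between every pair of adjacent Hangul syllables.

    Hypothetical helper: it treats the input as plain text, so Hangul
    inside HTML tags or attributes would be altered as well.
    """
    # The lookahead leaves the second syllable unconsumed, so it can
    # also act as the first syllable of the next pair.
    return _SYLLABLE_BEFORE_SYLLABLE.sub(r"\1<wbr>", text)


# "\ud55c\uad6d\uc5b4" is a three-syllable Korean word.
print(insert_wbr("\ud55c\uad6d\uc5b4 text!"))
```

Because <wbr> is only placed between two syllables, no break opportunity is added before a following space or a punctuation mark such as '!'.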
To see what I mean, you may want to take a look at
<http://photon.hgs.yale.edu/~jungshik/lb.html> and
<http://photon.hgs.yale.edu/~jungshik/lbscreenshot.jpg>

Let me emphasize that lines can be broken at any syllable boundary in Korean text (except for some obvious exceptions that also apply to English text, i.e. punctuation marks like '!' and '?' cannot begin a line).

Secondly, even in Latin scripts (well, at least in English) lines can be broken not only at spaces but also at syllables (syllabic boundaries) with a hyphen. The only difference between Korean line breaking and English line breaking is that Korean doesn't need a hyphen when lines are broken at syllables, because in Korean syllables form another visual unit a level higher than alphabetic/phonetic letters (consonants and vowels).

Thirdly, the expression 'ideographic boundaries' is not appropriate; it should be 'syllabic boundaries' or 'syllables'.

Given these, I'd like to suggest that the last sentence (the one that begins with 'In Korean, for instance...') be removed in a future edition, because Korean is NOT a good example of a case where there can be multiple line breaking rules depending on user preference.

Jungshik Shin


From: Jungshik Shin <[EMAIL PROTECTED]>
Subject: RE: Korean line breaking rules : Unicode 3.0 (p. 124)
Date: Thu, 2 Mar 2000 12:20:31 -0800 (PST)

On Thu, 2 Mar 2000, Rick McGowan wrote:

> I think that unfortunately both Hoon Kim and Jungshik Shin I think have
> *entirely* mis-interpreted the text. The text says:

> U3.0> for example, lines may be broken either at spaces(as in Latin
> U3.0> text) or on ideographic boundaries (as in Chinese).

> The word "or" on the second line would never be interpreted as an "exclusive
> or", it is an "inclusive or". In "C Language" syntax, it means "A|B"; it
> does not mean "A^B".

U3.0> In particular, word, line, and sentence boundaries will need to
U3.0> be customized according to locale and user preference. In Korean,

If it's written with that intention, what would you say about the preceding two lines? What's 'user preference' here? It implies 'exclusive or', doesn't it? In other words, it implies users may choose to turn off 'B', doesn't it? (No Korean typesetter in her/his right mind would do that.) If not, what's the point of giving Korean line breaking as an example right after that sentence about 'user preference'?

On top of that, if that's your intention, it'd be clearer to say 'lines can be broken on both spaces and syllable boundaries' (or 'on any syllable boundaries including spaces'), wouldn't it?

> In that light, some of their previous comments should probably be re-examined.

Nonetheless, the last sentence of the paragraph in question about Korean line breaking had better be removed (it's not necessary at all in my opinion) to avoid the possible and unnecessary confusion it leads to (as is evident in Netscape's implementation of Korean line breaking).

Jungshik Shin