Re: RTL PUA?

2011-08-22 Thread Asmus Freytag
On 8/21/2011 7:34 PM, Doug Ewell wrote: So what you are asking about is a directional control character that would assign subsequent characters a BC of 'AL', right? You don't want to call this a LANGUAGE MARK or anything else that implies language identification, because of the existence of

RE: RTL PUA?

2011-08-22 Thread Jonathan Rosenne
I don't buy the assumption that all the world is either AAT, Graphite or Uniscribe. Anyhow, this discussion is going off topic; the issue is whether Unicode should specify an RTL PUA area, not whether some products, however respectable, provide a bypass. Jony

Re: RTL PUA?

2011-08-22 Thread Michael Everson
On 22 Aug 2011, at 03:57, Peter Constable wrote: From: unicode-bou...@unicode.org [mailto:unicode-bou...@unicode.org] On Behalf Of Asmus Freytag Treating PUA characters as ON is very problematic As would be changing the default property of PUA characters from L to ON. Which is why that

Re: RTL PUA?

2011-08-22 Thread Michael Everson
On 22 Aug 2011, at 05:53, Shriramana Sharma wrote: While I don't know much about RTL scripts, if the logic order is ALEF + LAMED, but the presentation order is LAMED + ALEF *because of the RTL nature* do you write the rule as ALEF + LAMED = ALEF_LAMED_LIGATURE or LAMED + ALEF =

Re: C1 Control Pictures Proposal

2011-08-22 Thread Sean Leonard
On Aug 17, 2011, at 4:38 PM, Andrew West wrote: Unless you can show evidence that C1 control pictures are currently in use and that there is a clear demand from the user community to On Aug 21, 2011, at 10:13 AM, Doug Ewell wrote: Perhaps it would help for you to do a quick survey of

Re: RTL PUA?

2011-08-22 Thread Petr Tomasek
On Mon, Aug 22, 2011 at 10:42:05AM +0530, Shriramana Sharma wrote: On 08/22/2011 08:24 AM, Peter Constable wrote: I'm not saying that there shouldn't be _some_ software that can do what you expect. But there will likely be some different views on what ought to be included within that some.

Re: Code pages and Unicode

2011-08-22 Thread Andrew West
On 21 August 2011 02:14, Richard Wordingham richard.wording...@ntlworld.com wrote: On Fri, 19 Aug 2011 17:03:41 -0700 Ken Whistler k...@sybase.com wrote: O.k., so apparently we have awhile to go before we have to start worrying about the Y2K or IPv4 problem for Unicode. Call me again in the

Re: Code pages and Unicode

2011-08-22 Thread Shriramana Sharma
On 08/22/2011 03:05 PM, Andrew West wrote: Can anyone think of a way to extend UTF-16 without adding new surrogates or inventing a new general category? Why would anyone *need* to do so? UTF-16 can represent all codepoints up to Plane 16, right? -- Shriramana Sharma
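As a rough illustration of the point about Plane 16, here is a minimal Python sketch of the standard UTF-16 surrogate-pair arithmetic. The function names are made up for this example; only the arithmetic comes from the encoding form itself. It shows that the largest value a high/low pair can reach is U+10FFFF, the last code point of Plane 16.

```python
def utf16_encode(cp: int) -> list[int]:
    """Return the UTF-16 code units for one code point (illustrative helper)."""
    if not 0 <= cp <= 0x10FFFF or 0xD800 <= cp <= 0xDFFF:
        raise ValueError("not a Unicode scalar value")
    if cp < 0x10000:
        # BMP code points are a single 16-bit code unit.
        return [cp]
    # Supplementary code points: 20 bits split across two surrogates.
    v = cp - 0x10000
    return [0xD800 + (v >> 10), 0xDC00 + (v & 0x3FF)]


def utf16_decode_pair(high: int, low: int) -> int:
    """Combine a high and a low surrogate back into a code point."""
    return 0x10000 + ((high - 0xD800) << 10) + (low - 0xDC00)


# The highest possible pair decodes to the end of Plane 16.
print(hex(utf16_decode_pair(0xDBFF, 0xDFFF)))
```

Anything beyond that range has no representation in the existing scheme, which is what the question about "extending UTF-16" is getting at.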

Re: RTL PUA?

2011-08-22 Thread Shriramana Sharma
On 08/22/2011 04:34 PM, Behdad Esfahbod wrote: On 08/22/11 06:53, Shriramana Sharma wrote: While I don't know much about RTL scripts, if the logic order is ALEF + LAMED, but the presentation order is LAMED + ALEF *because of the RTL nature* do you write the rule as ALEF + LAMED =

Re: RTL PUA?

2011-08-22 Thread Shriramana Sharma
On 08/22/2011 05:26 PM, Behdad Esfahbod wrote: OpenType tables contain entries in the logical order of the script in question. I.e., Arabic tables are always RTL. Yes I understand, but still, to clarify: The font tables themselves contain only ASCII characters I presume. In it do you write:

Re: Code pages and Unicode

2011-08-22 Thread Andrew West
On 22 August 2011 12:51, Shriramana Sharma samj...@gmail.com wrote: On 08/22/2011 03:05 PM, Andrew West wrote: Can anyone think of a way to extend UTF-16 without adding new surrogates or inventing a new general category? Why would anyone *need* to do so? UTF-16 can represent all codepoints

Re: RTL PUA?

2011-08-22 Thread Shriramana Sharma
On 08/22/2011 12:21 PM, Jonathan Rosenne wrote: I don't buy the assumption that all the world is either AAT, Graphite or Uniscribe. Nobody asserted that either. It is only pointed out that major implementations are able to provide what you seek. Anyhow, this discussion is going off topic,

Re: RTL PUA?

2011-08-22 Thread Joó Ádám
Um... Computers are hardware, and don't understand a thing. What I think you mean is computer _software_. (I know, I'm being pedantic, but with good reason.) Sorry, I just can’t resist pointing out that difference between hardware and software is only the fact that the former is material,

Re: RTL PUA?

2011-08-22 Thread Mark E. Shoulson
On 08/22/2011 08:26 AM, Shriramana Sharma wrote: On 08/22/2011 05:26 PM, Behdad Esfahbod wrote: OpenType tables contain entries in the logical order of the script in question. I.e., Arabic tables are always RTL. Yes I understand, but still, to clarify: The font tables themselves contain only

Re: RTL PUA?

2011-08-22 Thread Philippe Verdy
2011/8/22 Peter Constable peter...@microsoft.com: From: ver...@gmail.com [mailto:ver...@gmail.com] On Behalf Of Philippe Verdy As I explained in an earlier message, the layout engine doesn't use the default property value but the resolved bidi level. Once again, you refuse to understand my

Re: RTL PUA?

2011-08-22 Thread Philippe Verdy
2011/8/22 Peter Constable peter...@microsoft.com: From: unicode-bou...@unicode.org [mailto:unicode-bou...@unicode.org] On Behalf Of Asmus Freytag Treating PUA characters as ON is very problematic As would be changing the default property of PUA characters from L to ON. I also agree with

Re: RTL PUA?

2011-08-22 Thread Philippe Verdy
2011/8/22 Shriramana Sharma samj...@gmail.com: On 08/22/2011 12:01 AM, Peter Constable wrote: If you mean a rule to substitute [g1 g2] with [g3] won't apply if the sequence processed by the OpenType Layout lookup processor is [g2 g1], Peter, actually I suspect Philippe is thinking that in

RE: Code pages and Unicode

2011-08-22 Thread Doug Ewell
srivas sinnathurai sisrivas at blueyonder dot co dot uk wrote: The true lifting of UTF-16 would be to UTF-32. Leave the UTF-16 untouched and make the new half as versatile as possible. I think any other solution is just a patch-up for the time being. There is no evidence whatsoever that this

RE: RTL PUA?

2011-08-22 Thread Murray Sargent
It's actually quite easy to convince Uniscribe to treat specific characters as RTL, others as LTR, and, in general, with whatever classifications you desire. Pass a preprocessed string to Uniscribe's ScriptItemize(). RichEdit has used that approach to some degree starting with RichEdit 3.0

Re: RTL PUA?

2011-08-22 Thread Philippe Verdy
2011/8/22 Mark E. Shoulson m...@kli.org: I'm not certain I understand the question, but if I have it right... The logic order is ALEF + LAMED, and the presentation... places those in a right-to-left sequence, shall we say (since talking about the presentation *order* is confusing here). The

Re: RTL PUA?

2011-08-22 Thread Philippe Verdy
2011/8/22 Shriramana Sharma samj...@gmail.com: Hi Behdad. I only asked whether the OT *tables* would contain the entries in the logical order or the visual order. Clearly it would still be the visual order (but Philippe Verdy seemed to imagine/suggest otherwise). No ! I've not imagined that.

Re: RTL PUA?

2011-08-22 Thread Philippe Verdy
2011/8/22 Shriramana Sharma samj...@gmail.com: On 08/22/2011 05:26 PM, Behdad Esfahbod wrote: OpenType tables contain entries in the logical order of the script in question. I.e., Arabic tables are always RTL. Yes I understand, but still, to clarify: The font tables themselves contain only

Re: Code pages and Unicode

2011-08-22 Thread John H. Jenkins
Christoph Päper 於 2011年8月20日 上午2:31 寫道: Mark Davis ☕: Under the original design principles of Unicode, the goal was a bit more limited; we envisioned […] a generative mechanism for infrequent CJK ideographs, I'd still like having that as an option. Et voilà! We have Ideographic

Re: RTL PUA?

2011-08-22 Thread Philippe Verdy
2011/8/22 Joó Ádám a...@jooadam.hu: Um... Computers are hardware, and don't understand a thing. What I think you mean is computer _software_. (I know, I'm being pedantic, but with good reason.) Sorry, I just can’t resist pointing out that difference between hardware and software is only

Implement BIDI algorithm by line

2011-08-22 Thread li bo
Hi all, I have a question about the BIDI algorithm implementation. The Bidi algorithm specifies that one must resolve embedding levels for a whole paragraph before breaking the paragraph into lines. I don't understand why. Should we first break the paragraph into lines and remember the paragraph level, and then

RE: RTL PUA?

2011-08-22 Thread Doug Ewell
Philippe Verdy verdy underscore p at wanadoo dot fr wrote: As well, the small properties files can be embedded, in a very compact form, in the PUA font. As soon as you embed all the information in the font, you require different solutions for systems that use different font technologies. I was

Re: RTL PUA?

2011-08-22 Thread Shriramana Sharma
On 08/20/2011 10:54 AM, Shriramana Sharma wrote: On 08/19/2011 10:05 PM, Mark Davis ☕ wrote: All of the property assignments to PUA characters (except the GC) are purely informative. I just now noticed that you had excepted the GC in the above. Why is that? How are applications supposed to
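For readers who want to see the defaults under discussion, the following Python sketch queries the standard library's `unicodedata` module for the General Category and Bidi Class of a few private-use code points. The helper name `default_properties` is made up for this example, and the values reported are whatever the Unicode Character Database bundled with the Python build assigns; as the thread notes, everything except the GC is informative and may be overridden by a higher-level protocol.

```python
import unicodedata


def default_properties(cp: int) -> tuple[str, str]:
    """Return (General_Category, Bidi_Class) for a code point, per the bundled UCD."""
    ch = chr(cp)
    return unicodedata.category(ch), unicodedata.bidirectional(ch)


# First and last BMP private-use code points, plus one each from Planes 15 and 16.
for cp in (0xE000, 0xF8FF, 0xF0000, 0x10FFFD):
    gc, bc = default_properties(cp)
    print(f"U+{cp:04X}: gc={gc} bc={bc}")
```

A layout engine that consults only these defaults will treat private-use text as left-to-right, which is the behaviour the RTL PUA proposal is trying to work around.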

Re: RTL PUA?

2011-08-22 Thread Shriramana Sharma
On 08/22/2011 05:20 PM, Shriramana Sharma wrote: Hi Behdad. I only asked whether the OT *tables* would contain the entries in the logical order or the visual order. Clearly it would still be the visual order My mistake: I should have said *logical* order. (but Philippe Verdy seemed to

Re: RTL PUA?

2011-08-22 Thread Shriramana Sharma
On 08/22/2011 09:00 PM, Philippe Verdy wrote: The font tables themselves contain only ASCII characters I presume. No. The lookup tables contain sequences of numeric glyph ids (16 bit integers in TrueType and OpenType). Which are also not the code point values, and not the character names or

Re: RTL PUA?

2011-08-22 Thread Shriramana Sharma
On 08/22/2011 09:31 PM, Doug Ewell wrote: Philippe Verdy verdy underscore p at wanadoo dot fr wrote: As well, the small properties files can be embedded, in a very compact form, in the PUA font. As soon as you embed all the information in the font, you require different solutions for systems

Re: Feedback from C1 Control Pictures Proposal

2011-08-22 Thread Frank da Cruz
I would like to ask Frank for a bit of help here (and, to the extent that Ken thinks that the proposal is reasonable, some affirmation that the uses/demonstration of demand will be seen as acceptable to the Unicode people). Specifically, can Frank help identify, and possibly provide

Re: Code pages and Unicode

2011-08-22 Thread William_J_G Overington
On Monday 22 August 2011, Andrew West andrewcw...@gmail.com wrote: Can anyone think of a way to extend UTF-16 without adding new surrogates or inventing a new general category? Andrew How about a triple sequence of two high surrogates followed by one low surrogate? I suggest this as a

RE: RTL PUA?

2011-08-22 Thread Doug Ewell
Shriramana Sharma samjnaa at gmail dot com wrote: As soon as you embed all the information in the font, you require different solutions for systems that use different font technologies. Why? In the end all the systems base upon the character properties specified by the standard. For the

Re: RTL PUA?

2011-08-22 Thread Petr Tomasek
On Mon, Aug 22, 2011 at 07:51:22AM -0700, Doug Ewell wrote: Some PUA properties, like glyph shapes and maybe directionality, can be stored in a font. Others, like numeric values and casing, might not or cannot. An interchangeable format needs to be agreed upon for the Why not? P.T. --

Re: RTL PUA?

2011-08-22 Thread Shriramana Sharma
On 08/22/2011 10:12 PM, Doug Ewell wrote: Right, so if you embed that table in an OT font, the information is not available to a system that uses a font technology other than OT. I don't understand why you would say so -- assuming we are all talking about TrueType fonts, AAT just uses some

Re: RTL PUA?

2011-08-22 Thread John Hudson
Shriramana Sharma wrote: The font tables themselves contain only ASCII characters I presume. OpenType Layout tables use Glyph IDs. OTL development tools typically use glyph names, which may be particular to the tool or the same names used in the post or CFF tables. OTL tables work on

RE: RTL PUA?

2011-08-22 Thread Doug Ewell
Petr Tomasek tomasek at etf dot cuni dot cz wrote: Some PUA properties, like glyph shapes and maybe directionality, can be stored in a font. Others, like numeric values and casing, might not or cannot. An interchangeable format needs to be agreed upon for Why not? Where does one store

Re: RTL PUA?

2011-08-22 Thread John Hudson
Shriramana Sharma wrote: I was just noting that the glyph tables themselves don't *use* the actual codepoints of the characters getting ligated (while they *refer* to them). Characters are mapped to glyph IDs in the font cmap tables. Glyph IDs are mapped to other glyph IDs (one-to-one,
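The two-stage mapping described here (characters to glyph IDs via cmap, then glyph IDs to other glyph IDs) can be sketched with a toy model in Python. Everything in it is hypothetical: the glyph ID numbers, the `CMAP` and `LIGATURES` dictionaries and the `shape` function are invented for illustration and do not reflect any real font or any real OpenType API. The ligature rule is keyed in logical order (ALEF then LAMED), as stated elsewhere in the thread.

```python
# Hypothetical toy font data: glyph IDs and the ligature rule are invented.
CMAP = {0x05D0: 10, 0x05DC: 11}   # HEBREW LETTER ALEF -> 10, LAMED -> 11
LIGATURES = {(10, 11): 20}        # alef, lamed (logical order) -> ligature glyph 20


def shape(text: str) -> list[int]:
    """Map characters to glyph IDs, then apply pairwise ligature substitution."""
    # Stage 1: cmap lookup; unmapped characters get glyph 0 (.notdef).
    glyphs = [CMAP.get(ord(ch), 0) for ch in text]
    # Stage 2: substitution works purely on glyph IDs, never on code points.
    out = []
    i = 0
    while i < len(glyphs):
        pair = tuple(glyphs[i:i + 2])
        if pair in LIGATURES:
            out.append(LIGATURES[pair])
            i += 2
        else:
            out.append(glyphs[i])
            i += 1
    return out


print(shape("\u05D0\u05DC"))  # logical order ALEF, LAMED
print(shape("\u05DC\u05D0"))  # reversed order: the rule does not match
```

The second call illustrates why the order in which the engine feeds glyphs to the lookup matters: the same rule silently fails to apply when the sequence arrives reversed.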

Re: Code pages and Unicode

2011-08-22 Thread Jean-François Colson
On 22/08/11 16:55, Doug Ewell wrote: srivas sinnathurai sisrivas at blueyonder dot co dot uk wrote: The true lifting of UTF-16 would be to UTF-32. Leave the UTF-16 untouched and make the new half as versatile as possible. I think any other solution is just a patch-up for the time being. There

Re: RTL PUA?

2011-08-22 Thread William_J_G Overington
On Monday 22 August 2011, Philippe Verdy verd...@wanadoo.fr wrote: So there are only two options: [snipped] ... : this requires an approval either by the UTC WG2 (solution 1) or by the OpenType working group (solution 2). Would a third option work? In the Description section of the

Re: Code pages and Unicode

2011-08-22 Thread Jean-François Colson
On 20/08/11 02:03, Ken Whistler wrote: O.k., so apparently we have awhile to go before we have to start worrying about the Y2K or IPv4 problem for Unicode. Call me again in the year 2851, and we'll still have 5 years left to design a new scheme and plan for the transition. ;-) --Ken I

Re: RTL PUA?

2011-08-22 Thread John H. Jenkins
Doug Ewell 於 2011年8月22日 上午10:59 寫道: Petr Tomasek tomasek at etf dot cuni dot cz wrote: Some PUA properties, like glyph shapes and maybe directionality, can be stored in a font. Others, like numeric values and casing, might not or cannot. An interchangeable format needs to be agreed upon

Re: RTL PUA?

2011-08-22 Thread Joó Ádám
True -- so if someone wanted a PUA script to be handled properly in sorting etc one would have to prepare collation tables which would obviously go *outside* the font. If a proper definition of an unencoded script needs additional properties which cannot be stored in the font anyway, why would

Re: RTL PUA?

2011-08-22 Thread Shriramana Sharma
On 08/22/2011 10:55 PM, Joó Ádám wrote: If a proper definition of an unencoded script needs additional properties which cannot be stored in the font anyway, why would you want to store part of it in OT tables? It’s just not the right place. Fonts’ sole purpose is to display already defined

Re: RTL PUA?

2011-08-22 Thread John H. Jenkins
William_J_G Overington 於 2011年8月22日 上午10:49 寫道: In the Description section of the Macintosh Roman section of a TrueType font, include a line of text in a plain text format of which the following line of text is an example.

Re: RTL PUA?

2011-08-22 Thread William_J_G Overington
On Monday 22 August 2011, John H. Jenkins jenk...@apple.com wrote: Forgive my asking, but this reference to the description section of the Macintosh Roman section of a TrueType font has me puzzled, because I don't know what you're talking about. What table contains this string? When I

Re: RTL PUA?

2011-08-22 Thread Philippe Verdy
2011/8/22 Doug Ewell d...@ewellic.org: Depending on how you count, there are already two to four fonts that support Ewellic in the PUA. There are probably many more that support Tengwar or Cirth or Klingon. First, these fonts can work fine with the default LTR directionality. So there's no

Re: Implement BIDI algorithm by line

2011-08-22 Thread Asmus Freytag
Huh? What context is this in? On 8/22/2011 11:18 AM, CE Whitehead wrote: Hi. I think many line breaks within paragraphs are soft line breaks but that embedding levels have to be taken into account when deciding the width of the glyphs; that's as near as I can tell. Here is the description

Re: RTL PUA?

2011-08-22 Thread Philippe Verdy
2011/8/22 Shriramana Sharma samj...@gmail.com: On 08/22/2011 09:00 PM, Philippe Verdy wrote: The font tables themselves contain only ASCII characters I presume. No. The lookup tables contain sequences of numeric glyph ids (16 bit integers in TrueType and OpenType). Which are also not the

Re: RTL PUA?

2011-08-22 Thread Philippe Verdy
2011/8/22 Shriramana Sharma samj...@gmail.com: True -- so if someone wanted a PUA script to be handled properly in sorting etc one would have to prepare collation tables which would obviously go *outside* the font. Collation tables can already be tailored very easily with existing technologies.

Re: Code pages and Unicode

2011-08-22 Thread Ken Whistler
On 8/22/2011 9:58 AM, Jean-François Colson wrote: I wonder whether you aren’t a little too optimistic. No. If anything I'm assuming that the folks working on proposals will be amazingly assiduous during the next decade. Have you considered the unencoded ideographic scripts? Why, yes I

Re: RTL PUA?

2011-08-22 Thread John H. Jenkins
William_J_G Overington 於 2011年8月22日 下午12:36 寫道: On Monday 22 August 2011, John H. Jenkins jenk...@apple.com wrote: Forgive my asking, but this reference to the description section of the Macintosh Roman section of a TrueType font has me puzzled, because I don't know what you're talking

RE: RTL PUA?

2011-08-22 Thread Doug Ewell
There is more to displaying characters than LTR versus RTL, and there is more to handling characters than just displaying them. This point continues to be lost on several people responding to this thread. -- Doug Ewell | Thornton, Colorado, USA | RFC 5645, 4645, UTN #14 www.ewellic.org |

RE: RTL PUA?

2011-08-22 Thread Doug Ewell
Philippe Verdy verdy underscore p at wanadoo dot fr wrote: Depending on how you count, there are already two to four fonts that support Ewellic in the PUA. There are probably many more that support Tengwar or Cirth or Klingon. First, these fonts can work fine with the default LTR

RE: RTL PUA?

2011-08-22 Thread Doug Ewell
Shriramana Sharma samjnaa at gmail dot com wrote: Right, so if you embed that table in an OT font, the information is not available to a system that uses a font technology other than OT. I don't understand why you would say so -- assuming we are all talking about TrueType fonts, AAT just

Re: RTL PUA?

2011-08-22 Thread Philippe Verdy
2011/8/22 William_J_G Overington wjgo_10...@btinternet.com: Having selected a platform, one may view the text content of various fields for that platform, such as font family name and copyright notice, version string and postscript name. There is then a button that is labelled Advanced...

ALM (was: Re: RTL PUA?)

2011-08-22 Thread Ken Whistler
On 8/21/2011 3:31 PM, Richard Wordingham wrote: I expect ARABIC LANGUAGE MARK would not go down well - has it already been proposed and rejected? ARABIC *LETTER* MARK, not *LANGUAGE* mark. (And suggested to just be renamed to AL MARK.) Proposed? Yes. Discussed? Yes. Rejected? No. The last

Re: RTL PUA?

2011-08-22 Thread Richard Wordingham
On Mon, 22 Aug 2011 07:51:22 -0700 Doug Ewell d...@ewellic.org wrote: Some PUA properties, like glyph shapes and maybe directionality, can be stored in a font. Others, like numeric values and casing, might not or cannot. An interchangeable format needs to be agreed upon for the properties

RE: RTL PUA?

2011-08-22 Thread Doug Ewell
Richard Wordingham richard dot wordingham at ntlworld dot com wrote: One reason for associating properties with a font is that text that is to be displayed is at that point tentatively associated with a font. I thought John said fonts dealt with glyph IDs, not characters per se. Another is

Re: RTL PUA?

2011-08-22 Thread N. Ganesan
On Sat, Aug 20, 2011 at 7:08 AM, Shriramana Sharma samj...@gmail.com wrote: On 08/20/2011 01:57 PM, Martin Hosken wrote: D49 states that all properties of PUA characters are overridable by a higher protocol. But in 'normal' implementations, there are no higher level protocols to override the

Re: Code pages and Unicode

2011-08-22 Thread Richard Wordingham
On Mon, 22 Aug 2011 14:06:00 +0100 (BST) William_J_G Overington wjgo_10...@btinternet.com wrote: On Monday 22 August 2011, Andrew West andrewcw...@gmail.com wrote: Can anyone think of a way to extend UTF-16 without adding new surrogates or inventing a new general category? Andrew

Re: Code pages and Unicode

2011-08-22 Thread Ken Whistler
On 8/22/2011 3:15 PM, Richard Wordingham wrote: On Monday 22 August 2011, Andrew West andrewcw...@gmail.com wrote: Can anyone think of a way to extend UTF-16 without adding new surrogates or inventing a new general category? Andrew How about a triple sequence of two

Re: Implement BIDI algorithm by line

2011-08-22 Thread li bo
Yes, this is the algorithm I have read. http://unicode.org/reports/tr9/ But I don't know why one must take a paragraph as the unit for determining the embedding levels. Why can't I shape the text first, then wrap the lines, and then determine the embedding levels for the characters within each line?
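One way to see why the order matters is a deliberately simplified Python sketch of how neutral characters are resolved. It is not an implementation of UAX #9: it handles only three invented class labels ("L", "R" and "N" for neutral), ignores embeddings, numbers and weak types, and takes the text boundaries to have the base direction. The function name `resolve_neutrals` is made up for this example. Under those assumptions it mimics the idea behind rules N1 and N2: a run of neutrals takes the direction of the strong characters on both sides if they agree, and the base direction otherwise.

```python
def resolve_neutrals(classes: list[str], base: str = "L") -> list[str]:
    """Toy neutral resolution: classes are 'L', 'R' or 'N' (assumed labels)."""
    out = list(classes)
    n = len(out)
    i = 0
    while i < n:
        if out[i] != "N":
            i += 1
            continue
        # Find the end of this run of neutrals.
        j = i
        while j < n and out[j] == "N":
            j += 1
        # Text boundaries are treated as having the base direction.
        before = out[i - 1] if i > 0 else base
        after = out[j] if j < n else base
        fill = before if before == after else base
        out[i:j] = [fill] * (j - i)
        i = j
    return out


# Two RTL words separated by a space, in an LTR paragraph.
paragraph = ["R", "R", "N", "R", "R"]

# Resolved over the whole paragraph, the space sits between two R runs.
whole = resolve_neutrals(paragraph)

# Resolved per line after wrapping at the space, the context is lost.
line1 = resolve_neutrals(paragraph[:3])
line2 = resolve_neutrals(paragraph[3:])

print(whole)
print(line1 + line2)
```

In this toy case the space resolves differently depending on whether the following word is visible, so resolving per line would make the result depend on where the line happens to wrap. UAX #9 resolves levels per paragraph and only reorders per line.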

Re: Implement BIDI algorithm by line

2011-08-22 Thread li bo
Sorry, Asmus, what do you mean? On Tue, Aug 23, 2011 at 2:44 AM, Asmus Freytag asm...@ix.netcom.com wrote: Huh? What context is this in? On 8/22/2011 11:18 AM, CE Whitehead wrote: Hi. I think many line breaks within paragraphs are soft line breaks but that embedding levels have to be

Re: RTL PUA?

2011-08-22 Thread Shriramana Sharma
On 08/23/2011 03:29 AM, N. Ganesan wrote: Hope a new proposal or a UTN from UC will make things clear, and RTL community benefits. Dear Ganesan, I wonder if you have actually understood all the issues here. As usual you have done your copy-paste from somebody else's post. Please say

Re: Code pages and Unicode

2011-08-22 Thread Jean-François Colson
On 23/08/11 00:15, Richard Wordingham wrote: The problem is that a search for the character represented by the code unit sequence (H2,L3) would also pick up the sequence (H1,H2,L3). While there is no ambiguity, it does make searching more complicated to code. The same issue applies to the
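The search problem described above can be reproduced with a small Python sketch over lists of 16-bit code units. The triple-surrogate scheme (H1, H2, L3) is the hypothetical extension floated in this thread, not part of UTF-16, and the function names are invented for illustration. A naive code-unit search for the pair (H2, L3) also matches the tail of the triple (H1, H2, L3); a boundary-aware search has to inspect the preceding code unit to reject that hit.

```python
def is_high(unit: int) -> bool:
    """True if the code unit is a high (leading) surrogate."""
    return 0xD800 <= unit <= 0xDBFF


def naive_find(haystack: list[int], needle: list[int]) -> list[int]:
    """Return every index where needle occurs as a raw code-unit subsequence."""
    n = len(needle)
    return [i for i in range(len(haystack) - n + 1) if haystack[i:i + n] == needle]


def boundary_aware_find(haystack: list[int], needle: list[int]) -> list[int]:
    """Drop hits whose leading surrogate is itself preceded by a high surrogate,
    i.e. hits that are really the tail of a hypothetical (H1, H2, L3) triple."""
    return [
        i for i in naive_find(haystack, needle)
        if not (is_high(needle[0]) and i > 0 and is_high(haystack[i - 1]))
    ]


H1, H2, L3 = 0xD800, 0xD801, 0xDC00
# 'A', a hypothetical triple, 'B', then a genuine (H2, L3) pair.
text = [0x41, H1, H2, L3, 0x42, H2, L3]

print(naive_find(text, [H2, L3]))
print(boundary_aware_find(text, [H2, L3]))
```

The extra look-behind is cheap, but it means existing UTF-16 search code would no longer be correct as written, which is the complication being pointed out.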