On 8/21/2011 7:34 PM, Doug Ewell wrote:
So what you are asking about is a directional control character that would
assign subsequent characters a BC of 'AL', right?
You don't want to call this a LANGUAGE MARK or anything else that implies language
identification, because of the existence of
I don't buy the assumption that all the world is either AAT, Graphite or
Uniscribe.
Anyhow, this discussion is going off topic, the issue is should Unicode specify
an RTL PUA area, not whether some products, however respectable, provide a
bypass.
Jony
-Original Message-
From:
On 22 Aug 2011, at 03:57, Peter Constable wrote:
From: unicode-bou...@unicode.org [mailto:unicode-bou...@unicode.org] On
Behalf Of Asmus Freytag
Treating PUA characters as ON is very problematic
As would be changing the default property of PUA characters from L to ON.
Which is why that
On 22 Aug 2011, at 05:53, Shriramana Sharma wrote:
While I don't know much about RTL scripts, if the logic order is ALEF +
LAMED, but the presentation order is LAMED + ALEF *because of the RTL nature*
do you write the rule as ALEF + LAMED = ALEF_LAMED_LIGATURE or LAMED + ALEF =
On Aug 17, 2011, at 4:38 PM, Andrew West wrote:
Unless you can show evidence that C1 control pictures are currently in
use and that there is a clear demand from the user community to
On Aug 21, 2011, at 10:13 AM, Doug Ewell wrote:
Perhaps it would help for you to do a quick survey of
On Mon, Aug 22, 2011 at 10:42:05AM +0530, Shriramana Sharma wrote:
On 08/22/2011 08:24 AM, Peter Constable wrote:
I'm not saying that there shouldn't be _some_ software that can do
what you expect. But there will likely be some different views on
what ought to be included within that some.
On 21 August 2011 02:14, Richard Wordingham
richard.wording...@ntlworld.com wrote:
On Fri, 19 Aug 2011 17:03:41 -0700
Ken Whistler k...@sybase.com wrote:
O.k., so apparently we have a while to go before we have to start
worrying about the Y2K or IPv4 problem for Unicode. Call me again in
the
On 08/22/2011 03:05 PM, Andrew West wrote:
Can anyone think of a way to extend UTF-16 without adding new
surrogates or inventing a new general category?
Why would anyone *need* to do so? UTF-16 can represent all codepoints
up to Plane 16, right?
--
Shriramana Sharma
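For readers following the UTF-16 sub-thread, here is a minimal sketch of the standard surrogate-pair arithmetic that the question above relies on. The function name `to_utf16_units` is made up for illustration; the point is only that two 10-bit halves cover exactly the range U+10000..U+10FFFF, which is why Plane 16 is the ceiling.

```python
def to_utf16_units(cp):
    """Return the UTF-16 code units (as ints) for one Unicode scalar value."""
    if not 0 <= cp <= 0x10FFFF or 0xD800 <= cp <= 0xDFFF:
        raise ValueError("not a Unicode scalar value")
    if cp < 0x10000:
        # BMP characters are a single code unit.
        return [cp]
    # Supplementary characters: split the 20-bit offset into two 10-bit halves.
    offset = cp - 0x10000
    high = 0xD800 | (offset >> 10)    # high (lead) surrogate
    low = 0xDC00 | (offset & 0x3FF)   # low (trail) surrogate
    return [high, low]


# The last code point of Plane 16 uses the last high and last low surrogate.
print([hex(u) for u in to_utf16_units(0x10FFFF)])
```

Because both surrogate ranges are used up at U+10FFFF, any extension beyond Plane 16 would need something other than an ordinary pair, which is the premise of Andrew West's question.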
On 08/22/2011 04:34 PM, Behdad Esfahbod wrote:
On 08/22/11 06:53, Shriramana Sharma wrote:
While I don't know much about RTL scripts, if the logic order is ALEF +
LAMED,
but the presentation order is LAMED + ALEF *because of the RTL nature* do you
write the rule as ALEF + LAMED =
On 08/22/2011 05:26 PM, Behdad Esfahbod wrote:
OpenType tables contain entries in the logical order of the script in
question. I.e., Arabic tables are always RTL.
Yes I understand, but still, to clarify:
The font tables themselves contain only ASCII characters I presume. In
it do you write:
On 22 August 2011 12:51, Shriramana Sharma samj...@gmail.com wrote:
On 08/22/2011 03:05 PM, Andrew West wrote:
Can anyone think of a way to extend UTF-16 without adding new
surrogates or inventing a new general category?
Why would anyone *need* to do so? UTF-16 can represent all codepoints
On 08/22/2011 12:21 PM, Jonathan Rosenne wrote:
I don't buy the assumption that all the world is either AAT, Graphite
or Uniscribe.
Nobody asserted that either. It is only pointed out that major
implementations are able to provide what you seek.
Anyhow, this discussion is going off topic,
Um... Computers are hardware, and don't understand a thing. What I think you
mean is computer _software_. (I know, I'm being pedantic, but with good
reason.)
Sorry, I just can’t resist pointing out that the difference between
hardware and software is only the fact that the former is material,
On 08/22/2011 08:26 AM, Shriramana Sharma wrote:
On 08/22/2011 05:26 PM, Behdad Esfahbod wrote:
OpenType tables contain entries in the logical order of the script in
question. I.e., Arabic tables are always RTL.
Yes I understand, but still, to clarify:
The font tables themselves contain only
2011/8/22 Peter Constable peter...@microsoft.com:
From: ver...@gmail.com [mailto:ver...@gmail.com] On Behalf Of Philippe Verdy
As I explained in an earlier message, the layout engine doesn't use
the default property value but the resolved bidi level.
Once again, you refuse to understand my
2011/8/22 Peter Constable peter...@microsoft.com:
From: unicode-bou...@unicode.org [mailto:unicode-bou...@unicode.org] On
Behalf Of Asmus Freytag
Treating PUA characters as ON is very problematic
As would be changing the default property of PUA characters from L to ON.
I also agree with
2011/8/22 Shriramana Sharma samj...@gmail.com:
On 08/22/2011 12:01 AM, Peter Constable wrote:
If you mean a rule to substitute [g1 g2] with [g3] won't apply if the
sequence processed by the OpenType Layout lookup processor is [g2
g1],
Peter, actually I suspect Philippe is thinking that in
srivas sinnathurai sisrivas at blueyonder dot co dot uk wrote:
The true lifting of UTF-16 would be to UTF-32.
Leave the UTF-16 untouched and make the new half as versatile as possible.
I think any other solution is just a patch-up for the time being.
There is no evidence whatsoever that this
It's actually quite easy to convince Uniscribe to treat specific characters as
RTL, others as LTR, and, in general, with whatever classifications you desire.
Pass a preprocessed string to Uniscribe's ScriptItemize(). RichEdit has used
that approach to some degree starting with RichEdit 3.0
2011/8/22 Mark E. Shoulson m...@kli.org:
I'm not certain I understand the question, but if I have it right... The
logic order is ALEF + LAMED, and the presentation... places those in a
right-to-left sequence, shall we say (since talking about the presentation
*order* is confusing here). The
2011/8/22 Shriramana Sharma samj...@gmail.com:
Hi Behdad. I only asked whether the OT *tables* would contain the entries in
the logical order or the visual order. Clearly it would still be the visual
order (but Philippe Verdy seemed to imagine/suggest otherwise).
No! I've not imagined that.
2011/8/22 Shriramana Sharma samj...@gmail.com:
On 08/22/2011 05:26 PM, Behdad Esfahbod wrote:
OpenType tables contain entries in the logical order of the script in
question. I.e., Arabic tables are always RTL.
Yes I understand, but still, to clarify:
The font tables themselves contain only
Christoph Päper 於 2011年8月20日 上午2:31 寫道:
Mark Davis ☕:
Under the original design principles of Unicode, the goal was a bit more
limited; we envisioned […] a generative mechanism for infrequent CJK
ideographs,
I'd still like having that as an option.
Et voilà! We have Ideographic
2011/8/22 Joó Ádám a...@jooadam.hu:
Um... Computers are hardware, and don't understand a thing. What I think you
mean is computer _software_. (I know, I'm being pedantic, but with good
reason.)
Sorry, I just can’t resist pointing out that the difference between
hardware and software is only
Hi all,
I have a question about implementing the Bidi algorithm. The algorithm
says that one must resolve the embedding levels of a paragraph before breaking
the paragraph into lines. I don't understand why. Should we first break the
paragraph into lines and remember the paragraph level, and then
Philippe Verdy verdy underscore p at wanadoo dot fr wrote:
As well, the small properties files can be embedded, in a very compact
form, in the PUA font.
As soon as you embed all the information in the font, you require
different solutions for systems that use different font technologies.
I was
On 08/20/2011 10:54 AM, Shriramana Sharma wrote:
On 08/19/2011 10:05 PM, Mark Davis ☕ wrote:
All of the property assignments to PUA characters (except the GC) are
purely informative.
I just now noticed that you had excepted the GC in the above. Why is
that? How are applications supposed to
On 08/22/2011 05:20 PM, Shriramana Sharma wrote:
Hi Behdad. I only asked whether the OT *tables* would contain the
entries in the logical order or the visual order. Clearly it would still
be the visual order
My mistake: I should have said *logical* order.
(but Philippe Verdy seemed to
On 08/22/2011 09:00 PM, Philippe Verdy wrote:
The font tables themselves contain only ASCII characters I presume.
No. The lookup tables contain sequences of numeric glyph ids (16 bit
integers in TrueType and OpenType). Which are also not the code point
values, and not the character names or
On 08/22/2011 09:31 PM, Doug Ewell wrote:
Philippe Verdy verdy underscore p at wanadoo dot fr wrote:
As well, the small properties files can be embedded, in a very compact
form, in the PUA font.
As soon as you embed all the information in the font, you require
different solutions for systems
I would like to ask Frank for a bit of help here (and, to the extent that
Ken thinks that the proposal is reasonable, some affirmation that the
uses/demonstration of demand will be seen as acceptable to the Unicode
people). Specifically, can Frank help identify, and possibly provide
On Monday 22 August 2011, Andrew West andrewcw...@gmail.com wrote:
Can anyone think of a way to extend UTF-16 without adding new surrogates or
inventing a new general category?
Andrew
How about a triple sequence of two high surrogates followed by one low
surrogate?
I suggest this as a
Shriramana Sharma samjnaa at gmail dot com wrote:
As soon as you embed all the information in the font, you require
different solutions for systems that use different font technologies.
Why? In the end all the systems base upon the character properties
specified by the standard. For the
On Mon, Aug 22, 2011 at 07:51:22AM -0700, Doug Ewell wrote:
Some PUA properties, like glyph shapes and maybe directionality, can be
stored in a font. Others, like numeric values and casing, might not or
cannot. An interchangeable format needs to be agreed upon for the
Why not?
P.T.
--
On 08/22/2011 10:12 PM, Doug Ewell wrote:
Right, so if you embed that table in an OT font, the information is not
available to a system that uses a font technology other than OT.
I don't understand why you would say so -- assuming we are all talking
about TrueType fonts, AAT just uses some
Shriramana Sharma wrote:
The font tables themselves contain only ASCII characters I presume.
OpenType Layout tables use Glyph IDs. OTL development tools typically
use glyph names, which may be particular to the tool or the same names
used in the post or CFF tables.
OTL tables work on
Petr Tomasek tomasek at etf dot cuni dot cz wrote:
Some PUA properties, like glyph shapes and maybe directionality, can
be stored in a font. Others, like numeric values and casing, might
not or cannot. An interchangeable format needs to be agreed upon for
Why not?
Where does one store
Shriramana Sharma wrote:
I was just noting
that the glyph tables themselves don't *use* the actual codepoints of
the characters getting ligated (while they *refer* to them).
Characters are mapped to glyph IDs in the font cmap tables.
Glyph IDs are mapped to other glyph IDs (one-to-one,
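A toy sketch of the two-step mapping described in this part of the thread may help: characters go to glyph IDs via a cmap, and substitution rules then operate on glyph IDs only. This is not real OpenType code; the dictionaries, the glyph IDs 10, 11 and 20, and the function `shape` are all hypothetical stand-ins. The rule is keyed in logical order, following what Behdad Esfahbod states above.

```python
# Hypothetical cmap: code point -> glyph ID (the IDs are made up).
CMAP = {
    0x05D0: 10,  # HEBREW LETTER ALEF
    0x05DC: 11,  # HEBREW LETTER LAMED
}

# Hypothetical ligature rule, keyed by glyph IDs in logical order:
# ALEF glyph followed by LAMED glyph -> one ligature glyph.
LIGATURES = {(10, 11): 20}


def shape(text):
    """Map characters to glyph IDs, then apply pairwise ligature rules."""
    glyphs = [CMAP[ord(ch)] for ch in text]
    out = []
    i = 0
    while i < len(glyphs):
        pair = tuple(glyphs[i:i + 2])
        if pair in LIGATURES:
            out.append(LIGATURES[pair])  # replace two glyphs with one
            i += 2
        else:
            out.append(glyphs[i])
            i += 1
    return out


print(shape("\u05d0\u05dc"))  # logical order ALEF, LAMED
print(shape("\u05dc\u05d0"))  # logical order LAMED, ALEF: rule does not match
```

Note that the rule table never mentions a code point or a character name, only glyph IDs, which is the distinction being drawn in the replies to Shriramana Sharma.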
On 22/08/11 16:55, Doug Ewell wrote:
srivas sinnathurai sisrivas at blueyonder dot co dot uk wrote:
The true lifting of UTF-16 would be to UTF-32.
Leave the UTF-16 untouched and make the new half as versatile as possible.
I think any other solution is just a patch-up for the time being.
There
On Monday 22 August 2011, Philippe Verdy verd...@wanadoo.fr wrote:
So there are only two options:
[snipped]
... : this requires an approval either by the UTC WG2 (solution 1) or by
the OpenType working group (solution 2).
Would a third option work?
In the Description section of the
On 20/08/11 02:03, Ken Whistler wrote:
O.k., so apparently we have a while to go before we have to start worrying
about the Y2K or IPv4 problem for Unicode. Call me again in the
year 2851, and we'll still have 5 years left to design a new scheme
and plan
for the transition. ;-)
--Ken
I
Doug Ewell 於 2011年8月22日 上午10:59 寫道:
Petr Tomasek tomasek at etf dot cuni dot cz wrote:
Some PUA properties, like glyph shapes and maybe directionality, can
be stored in a font. Others, like numeric values and casing, might
not or cannot. An interchangeable format needs to be agreed upon
True -- so if someone wanted a PUA script to be handled properly in sorting
etc one would have to prepare collation tables which would obviously go
*outside* the font.
If a proper definition of an unencoded script needs additional
properties which cannot be stored in the font anyway, why would
On 08/22/2011 10:55 PM, Joó Ádám wrote:
If a proper definition of an unencoded script needs additional
properties which cannot be stored in the font anyway, why would you
want to store part of it in OT tables? It’s just not the right place.
Fonts’ sole purpose is to display already defined
William_J_G Overington 於 2011年8月22日 上午10:49 寫道:
In the Description section of the Macintosh Roman section of a TrueType font,
include a line of text in a plain text format of which the following line of
text is an example.
On Monday 22 August 2011, John H. Jenkins jenk...@apple.com wrote:
Forgive my asking, but this reference to the description section of the
Macintosh Roman section of a TrueType font has me puzzled, because I don't
know what you're talking about. What table contains this string?
When I
2011/8/22 Doug Ewell d...@ewellic.org:
Depending on how you count, there are already two to four fonts that
support Ewellic in the PUA. There are probably many more that support
Tengwar or Cirth or Klingon.
First, these fonts can work fine with the default LTR directionality.
So there's no
Huh? What context is this in?
On 8/22/2011 11:18 AM, CE Whitehead wrote:
Hi.
I think many line breaks within paragraphs are soft line breaks but
that embedding levels have to be taken into account when deciding the
width of the glyphs; that's as near as I can tell.
Here is the description
2011/8/22 Shriramana Sharma samj...@gmail.com:
On 08/22/2011 09:00 PM, Philippe Verdy wrote:
The font tables themselves contain only ASCII characters I presume.
No. The lookup tables contain sequences of numeric glyph ids (16 bit
integers in TrueType and OpenType). Which are also not the
2011/8/22 Shriramana Sharma samj...@gmail.com:
True -- so if someone wanted a PUA script to be handled properly in sorting
etc one would have to prepare collation tables which would obviously go
*outside* the font.
Collation tables can already be tailored very easily with existing
technologies.
On 8/22/2011 9:58 AM, Jean-François Colson wrote:
I wonder whether you aren’t a little too optimistic.
No. If anything I'm assuming that the folks working on proposals will
be amazingly assiduous during the next decade.
Have you considered the unencoded ideographic scripts?
Why, yes I
William_J_G Overington 於 2011年8月22日 下午12:36 寫道:
On Monday 22 August 2011, John H. Jenkins jenk...@apple.com wrote:
Forgive my asking, but this reference to the description section of the
Macintosh Roman section of a TrueType font has me puzzled, because I don't
know what you're talking
There is more to displaying characters than LTR versus RTL, and there is
more to handling characters than just displaying them. This point
continues to be lost on several people responding to this thread.
--
Doug Ewell | Thornton, Colorado, USA | RFC 5645, 4645, UTN #14
www.ewellic.org |
Philippe Verdy verdy underscore p at wanadoo dot fr wrote:
Depending on how you count, there are already two to four fonts that
support Ewellic in the PUA. There are probably many more that
support Tengwar or Cirth or Klingon.
First, these fonts can work fine with the default LTR
Shriramana Sharma samjnaa at gmail dot com wrote:
Right, so if you embed that table in an OT font, the information is not
available to a system that uses a font technology other than OT.
I don't understand why you would say so -- assuming we are all talking
about TrueType fonts, AAT just
2011/8/22 William_J_G Overington wjgo_10...@btinternet.com:
Having selected a platform, one may view the text content of various fields
for that platform, such as font family name and copyright notice, version
string and postscript name. There is then a button that is labelled
Advanced...
On 8/21/2011 3:31 PM, Richard Wordingham wrote:
I expect ARABIC LANGUAGE MARK would not go down well
- has it already been proposed and rejected?
ARABIC *LETTER* MARK, not *LANGUAGE* mark. (And suggested
to just be renamed to AL MARK.)
Proposed? Yes.
Discussed? Yes.
Rejected? No.
The last
On Mon, 22 Aug 2011 07:51:22 -0700
Doug Ewell d...@ewellic.org wrote:
Some PUA properties, like glyph shapes and maybe directionality, can
be stored in a font. Others, like numeric values and casing, might
not or cannot. An interchangeable format needs to be agreed upon for
the properties
Richard Wordingham richard dot wordingham at ntlworld dot com wrote:
One reason for associating properties with a font is that text that is
to be displayed is at that point tentatively associated with a font.
I thought John said fonts dealt with glyph IDs, not characters per se.
Another is
On Sat, Aug 20, 2011 at 7:08 AM, Shriramana Sharma samj...@gmail.com
wrote:
On 08/20/2011 01:57 PM, Martin Hosken wrote:
D49 states that all properties of PUA characters are overridable by a
higher protocol. But in 'normal' implementations, there are no higher
level protocols to override the
On Mon, 22 Aug 2011 14:06:00 +0100 (BST)
William_J_G Overington wjgo_10...@btinternet.com wrote:
On Monday 22 August 2011, Andrew West andrewcw...@gmail.com wrote:
Can anyone think of a way to extend UTF-16 without adding new
surrogates or inventing a new general category?
Andrew
On 8/22/2011 3:15 PM, Richard Wordingham wrote:
On Monday 22 August 2011, Andrew West andrewcw...@gmail.com wrote:
Can anyone think of a way to extend UTF-16 without adding new
surrogates or inventing a new general category?
Andrew
How about a triple sequence of two
Yes, this is the algorithm I have read. http://unicode.org/reports/tr9/
But I don't know why the user must take a paragraph as the unit for determining
the embedding levels. Why can't I shape the text first, then wrap the lines,
and then determine the embedding levels for the characters within each line?
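One way to see why the order matters is a toy experiment. The sketch below is not an implementation of UAX #9: it ignores explicit embeddings, numbers, weak types and line-level reordering, and the function `resolve_directions` is invented for illustration. It only mimics the idea behind rules N1 and N2, where a run of neutral characters takes the direction of the strong characters on both sides if they agree, and the paragraph direction otherwise.

```python
import unicodedata


def resolve_directions(text, base="L"):
    """Toy neutral resolution: return one 'L' or 'R' per character."""
    classes = []
    for ch in text:
        bc = unicodedata.bidirectional(ch)
        if bc == "L":
            classes.append("L")
        elif bc in ("R", "AL"):
            classes.append("R")
        else:
            classes.append(None)  # treat everything else as neutral

    resolved = list(classes)
    n = len(classes)
    i = 0
    while i < n:
        if classes[i] is not None:
            i += 1
            continue
        j = i
        while j < n and classes[j] is None:
            j += 1
        # Text boundaries count as the base direction.
        before = classes[i - 1] if i > 0 else base
        after = classes[j] if j < n else base
        fill = before if before == after else base
        for k in range(i, j):
            resolved[k] = fill
        i = j
    return "".join(resolved)


paragraph = "\u05d0\u05d1! \u05d2\u05d3"  # Hebrew, "!", space, Hebrew
first_line = paragraph[:4]                # pretend the line breaks here

print(resolve_directions(paragraph)[:4])  # resolved with paragraph context
print(resolve_directions(first_line))     # resolved after breaking the line
```

With the whole paragraph visible, the exclamation mark sits between two right-to-left letters and resolves as right-to-left. Resolved on the first line alone, it has lost the context that follows it and falls back to the paragraph direction, so it would be placed on the other side of the Hebrew word. Resolving per paragraph keeps the result independent of where the line happens to wrap.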
Sorry, Asmus, what do you mean?
On Tue, Aug 23, 2011 at 2:44 AM, Asmus Freytag asm...@ix.netcom.com wrote:
Huh? What context is this in?
On 8/22/2011 11:18 AM, CE Whitehead wrote:
Hi.
I think many line breaks within paragraphs are soft line breaks but that
embedding levels have to be
On 08/23/2011 03:29 AM, N. Ganesan wrote:
Hope a new proposal or a UTN from UC will make things clear, and RTL
community benefits.
Dear Ganesan,
I wonder if you have actually understood all the issues here. As usual
you have done your copy-paste from somebody else's post. Please say
On 23/08/11 00:15, Richard Wordingham wrote:
The problem is that a search for the character represented by the code
unit sequence (H2,L3) would also pick up the sequence (H1,H2,L3).
While there is no ambiguity, it does make searching more complicated
to code. The same issue applies to the
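The searching problem described here can be shown with a short sketch. It assumes the hypothetical triple scheme floated earlier in the thread (two high surrogates followed by one low surrogate), which is not part of UTF-16; the code unit values and the helper names `find_naive` and `find_guarded` are made up for illustration.

```python
H1, H2, L3 = 0xD800, 0xD801, 0xDC02  # arbitrary example code units


def is_high_surrogate(unit):
    return 0xD800 <= unit <= 0xDBFF


def find_naive(units, needle):
    """Plain subsequence search over code units."""
    n = len(needle)
    return [i for i in range(len(units) - n + 1) if units[i:i + n] == needle]


def find_guarded(units, needle):
    """Reject matches that start inside a hypothetical (H, H, L) triple."""
    hits = []
    for i in find_naive(units, needle):
        starts_inside_triple = (
            i > 0
            and is_high_surrogate(units[i - 1])
            and is_high_surrogate(needle[0])
        )
        if not starts_inside_triple:
            hits.append(i)
    return hits


# "A", the triple (H1, H2, L3), "B", then the ordinary pair (H2, L3).
text = [0x41, H1, H2, L3, 0x42, H2, L3]
print(find_naive(text, [H2, L3]))    # also reports the tail of the triple
print(find_guarded(text, [H2, L3]))  # only the genuine pair
```

The extra look-behind in `find_guarded` is the kind of added complexity the message refers to: the encoding stays unambiguous, but a plain code-unit search is no longer enough.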