At 06:19 PM 15-08-02, James Kass wrote:
Does anyone know of a writing system which actually uses the
Latin letter t with a bona-fide cedilla?
The newish Gagauz Turkish Latin-script orthography derives from both
Turkish and Romanian models. This has led to a peculiar hybrid, in which the
Kenneth Whistler wrote as follows about my idea.
It occurs to me that it is possible to introduce a convention, either as a
matter included in the Unicode specification, or as just a known about
thing, that if one has a plain text Unicode file with a file name that has
some particular
From: William Overington [EMAIL PROTECTED]
Could this be discussed at the Unicode Technical Committee
meeting next week please?
whoosh
William,
Please read Ken's message again. He was *talking* about HTML, and pointing
out how all of these things are supported in browsers already.
You will
Yes, yes, I think this is an idea which could fly.
--Ken
Good. It is a solution which could be very useful for people writing
programs in Java, Pascal and C and so on which programs take in plain text
files and process them for such purposes as producing a desktop publishing
package.
Uhh,
William Overington wrote,
No, it is a story about an artist who wanted to paint a picture of a horse
and a picture of a dog and, since he knew that the horse and the dog were
great friends and liked to be together and also that he only had one canvas
upon which to paint, the artist
William,
So let me see if I understand this correctly.
Let's take 2 perfectly good standards, Unicode and HTML, and make some
very minor tweaks to them, such as changing the meaning of U+FFFC and a
special format for filenames in the beginning of the file and a new
extension, so we have
On 08/14/2002 05:53:58 AM James Kass wrote:
Once a meaning like
INTERLINEAR ANNOTATION ANCHOR has been assigned to
a code point, any application which chooses to use that code
point for any other purpose would be at fault.
Since it's for internal use only, nobody would ever know. Unicode
On 08/13/2002 10:08:00 AM William Overington wrote:
I've been ignoring the list for a few days, but come back to find that not
much has changed.
2) Superscript, subscript, combining above, and other forms of
identifying placement of characters, are better left to markup or other
rendering
On 08/14/2002 02:36:37 PM William Overington wrote:
U+0360 COMBINING DOUBLE TILDE
U+035D COMBINING DOUBLE BREVE
U+035E COMBINING DOUBLE MACRON
U+035F COMBINING DOUBLE LOW LINE
I also note U+0361 COMBINING DOUBLE INVERTED BREVE and U+0362 COMBINING
DOUBLE RIGHTWARDS ARROW BELOW in the code
On 08/14/2002 10:52:32 AM Michael Everson wrote:
I'm saying I WANT to use these characters. They solve an apparent
need of mine
They only *appear* to you to solve that need, but in fact do not offer a
good solution. Markup is recommended for your need.
- Peter
On 08/14/2002 02:04:50 PM William Overington wrote:
As this concerns the U+FFFC character and the Unicode Technical Committee is
due to meet next week, I think it might be helpful if this idea is discussed
before the meeting as a straightforward idea like this might mean that the
possibility
On 08/14/2002 04:34:27 PM Doug Ewell wrote:
Broad ranges of Planes 0 and 1 have been tentatively blocked out on the
Roadmap for RTL scripts.
Oh? I was somewhat sharply rebuked a few years ago for suggesting that such a
thing be done. References to relevant documentation, please?
- Peter
On 08/14/2002 01:16:29 AM starner wrote:
That seems to be basically what William Overington is proposing,
except these characters only handle furigana, instead of all markup.
Not quite. WO has proposed characters to be used in interchange. These are
only intended for internal use by programmers.
At 09:38 +0100 2002-08-16, [EMAIL PROTECTED] wrote:
On 08/14/2002 04:34:27 PM Doug Ewell wrote:
Broad ranges of Planes 0 and 1 have been tentatively blocked out on the
Roadmap for RTL scripts.
Oh? I was somewhat sharply rebuked a few years ago for suggesting that such a
thing be done. References to
Michael Everson wrote,
Appropriate font technology for Latin ligature display exists,
but it isn't enabled yet in Microsoft's Uniscribe.*
That doesn't mean that this particular cataloguing of ligatures in
the PUA is a good idea.
The Golden Ligatures Collection simply offers font
Mark Davis wrote:
There is a new version of Unicode Technical Report #29: Text Boundaries on
http://www.unicode.org/reports/tr29/, covering grapheme-cluster, word and
sentence boundaries. There are significant modifications to this version;
for a summary, see
[EMAIL PROTECTED] wrote:
On 08/14/2002 12:45:22 AM Kenneth Whistler wrote:
But even at the time, as the record of the deliberations would
show, if we had a more perfect record, the proponents were clear
that the interlinear annotation characters were to solve an
internal anchor point
Kenneth Whistler replied to my posting as follows.
An interesting point for consideration is as to whether the following
sequence is permitted in interchanged documents.
U+FFF9 U+FFFC U+FFFA Temperature variation with time. U+FFFB
That is, the annotated text is an object replacement
John Hudson scripsit:
The newish Gagauz Turkish Latin-script orthography derives from both
Turkish and Romanian models. This has led to a peculiar hybrid, in which
the cedilla is used for the s and the commaaccent is used for the t.
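
For readers who want to see the distinction in code points, here is a minimal Python sketch. It assumes the hybrid described above corresponds to s with cedilla (U+015F) and t with comma below (U+021B); the `LETTERS` table and the `describe` helper are illustrative names, not anything from the thread.

```python
import unicodedata

# The four lowercase letters at issue: cedilla forms and comma-below forms.
LETTERS = ["\u015F", "\u0163", "\u0219", "\u021B"]


def describe(ch: str) -> tuple[str, str, list[str]]:
    """Return the code point, character name, and canonical decomposition."""
    decomposed = unicodedata.normalize("NFD", ch)
    return (
        f"U+{ord(ch):04X}",
        unicodedata.name(ch),
        [f"U+{ord(c):04X}" for c in decomposed],
    )


if __name__ == "__main__":
    for ch in LETTERS:
        code, name, parts = describe(ch)
        print(code, name, "->", " ".join(parts))
```

The cedilla forms decompose to the base letter plus U+0327 COMBINING CEDILLA, while the comma-below forms decompose to the base letter plus U+0326 COMBINING COMMA BELOW, so the two are distinct characters even where fonts draw them alike.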
ME's remarks in _The Alphabets of Europe_ seem
Tex Texin scripsit:
At the time (in the discussion), I don't think we had many examples of
what the uses would be, and it wasn't clear that many were needed, since
the functionality could be arrived at with higher level protocols.
One application that has always seemed obvious to me is
I believe, Eric is talking about the characters on the attached page 8 of
the OCR standard.
Regards
Arnold
-Original Message-
From: Eric Muller [mailto:[EMAIL PROTECTED]]
Sent: Thursday, August 15, 2002 7:44 PM
To: [EMAIL PROTECTED]
Subject: OCR characters
In our OCR fonts, we
On Fri, 16 Aug 2002, John Cowan wrote:
John Hudson scripsit:
The newish Gagauz Turkish Latin-script orthography derives from both
Turkish and Romanian models. This has led to a peculiar hybrid, in which
the cedilla is used for the s and the commaaccent is used for the t.
ME's
The Times Atlas of the World uses t-cedilla, d-cedilla, and h-cedilla
in transcriptions of Yemen placenames.
--
Michael Everson *** Everson Typography *** http://www.evertype.com
Eric Muller had written:
In our OCR fonts, we have two glyphs named erase [...]
and grouperase [...] I suspect those are mandated by these
standards. On the other hand, and I can't find traces of those in
Unicode,
Arnold F. Winkler wrote:
I believe, Eric is talking about the
Michael Everson scripsit:
The Times Atlas of the World uses t-cedilla, d-cedilla, and h-cedilla
in transcriptions of Yemen placenames.
But is it correct? The National Geographic map on my wall uses s-cedilla
in Romanian place names, and that's definitely wrong.
--
Knowledge studies others
At 10:58 -0400 2002-08-16, John Cowan wrote:
Michael Everson scripsit:
The Times Atlas of the World uses t-cedilla, d-cedilla, and h-cedilla
in transcriptions of Yemen placenames.
But is it correct? The National Geographic map on my wall uses s-cedilla
in Romanian place names, and that's
James Kass wrote as follows.
William Overington wrote,
No, it is a story about an artist who wanted to paint a picture of a horse
and a picture of a dog and, since he knew that the horse and the dog were
great friends and liked to be together and also that he only had one canvas
upon which
Tex Texin wrote as follows.
William,
So let me see if I understand this correctly.
Let's take 2 perfectly good standards, Unicode and HTML,
Yes.
and make some
very minor tweaks to them,
No.
such as changing the meaning of U+FFFC and a
special format for filenames in the beginning of the
Peter_Constable at sil dot org wrote:
Broad ranges of Planes 0 and 1 have been tentatively blocked out on
the Roadmap for RTL scripts.
Oh? I was somewhat sharply rebuked a few years ago for suggesting that
such a thing be done. References to relevant documentation, please?
It looks like the
John,
Why would you want them to be for internal-use only? When you exchange
regular expressions wouldn't you want operators such as any character
to be passed as well, and standardized so that there is agreement on the
meaning of the expression?
It is also not clear to me that it is desirable
Arsa Deborah Goldsmith [EMAIL PROTECTED]:
There is lots of good news about keyboards in Mac OS X 10.2, none of
Thank you for that rapid, if intriguing response, Deborah.
which I'm allowed to discuss until August 24, unfortunately. If you
have signed an Apple non-disclosure agreement, write me
At 06:57 AM 16-08-02, Michael Everson wrote:
The Times Atlas of the World uses t-cedilla, d-cedilla, and h-cedilla in
transcriptions of Yemen placenames.
I would expect those cedillas to be dots below the letters for standard
Arabic transliteration.
John Hudson
Tiro Typeworks
Tex Texin scripsit:
Why would you want them to be for internal-use only? When you exchange
regular expressions wouldn't you want operators such as any character
to be passed as well, and standardized so that there is agreement on the
meaning of the expression?
Regular expressions are
***
Register now! Just 3 weeks to go Register now! Just 3 weeks to go
***
Twenty-second International Unicode Conference (IUC22)
John Cowan wrote:
Tex Texin scripsit:
Why would you want them to be for internal-use only? When you exchange
regular expressions wouldn't you want operators such as any character
to be passed as well, and standardized so that there is agreement on the
meaning of the expression?
Proposed unknown and missing character representation. This would be an
alternative to the method currently described in 5.3.
The missing or unknown character would be represented as a series of
vertical hex digit pairs for each byte of the character. BMP characters
would be represented with 4 hex
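
A rough Python sketch of the digit-pair idea follows, not the proposal's actual specification, since the message is cut off here. It covers only the splitting of a code point into two-digit hex pairs; the vertical layout of the glyph is not modeled. The function name `hex_pairs` is hypothetical, and the use of 6 hex digits for characters beyond the BMP is an assumption, as the excerpt states only the 4-digit BMP case.

```python
def hex_pairs(code_point: int) -> list[str]:
    """Split a code point into two-digit hex pairs, one pair per byte."""
    if not 0 <= code_point <= 0x10FFFF:
        raise ValueError("not a Unicode code point")
    # BMP characters use 4 hex digits (stated in the proposal excerpt).
    # ASSUMPTION: supplementary characters use 6 hex digits.
    width = 4 if code_point <= 0xFFFF else 6
    digits = f"{code_point:0{width}X}"
    return [digits[i:i + 2] for i in range(0, width, 2)]


if __name__ == "__main__":
    print(hex_pairs(0x20AC))   # EURO SIGN, a BMP character
    print(hex_pairs(0x1D11E))  # MUSICAL SYMBOL G CLEF, beyond the BMP
```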
Otto,
I am looking at ISO 1073/II-1976:
The two erase characters are the only members of set #5, reference numbers
are 120 and 121. The Remarks column is empty. 6.4 says: Application
advise is given in the column Remarks, where it is indicated, inter alia,
which characters are included for
Folks, that is my VERY LAST post on this VERY OLD subject:
In the L2 document register I found L2/98-397
http://www.unicode.org/L2/L2/98396.pdf which is a proposal for ISO/IEC TR
15907, a Type 3 TR for the revision of ISO 1073/II:1976.
On page 18 is a note that says:
NOTE – The glyphs
This is a reminder.
The Unicode.ORG system (web services, ftp, and mail lists) will be taken
off-line sometime today for maintenance and upgrades. We will keep the
downtime as short as possible.
You will receive another note when the system comes back up, but
it may not be possible to warn
On 08/15/2002 06:41:59 AM William Overington wrote:
In essence, though not formally, U+FFF9..U+FFFC are non-characters as
well, and the Unicode semantics just tells what programs *may* find them
useful for. Unicode 4.0 editors: it might be a good idea to emphasize
the close relationship of