On 08/17/2002 09:29:00 AM William Overington wrote:
Peter Constable wrote as follows.
The standard already specifies that FFFC should not be exported from an
application or interchanged.
As far as I am aware that is not presently the case.
If you still say that that is correct, could you
On 08/16/2002 04:58:58 PM William Overington wrote:
The DVB-MHP (Digital Video Broadcasting - Multimedia Home Platform)
system
(details at http://www.mhp.org ) which implements my telesoftware
invention.
A Java program which has been broadcast can read a Unicode plain text
file
and act upon the
Kenneth Whistler wrote as follows about my idea.
It occurs to me that it is possible to introduce a convention, either as
a
matter included in the Unicode specification, or as just a known about
thing, that if one has a plain text Unicode file with a file name that
has
some particular
Yes, yes, I think this is an idea which could fly.
--Ken
Good. It is a solution which could be very useful for people writing
programs in Java, Pascal and C and so on which programs take in plain text
files and process them for such purposes as producing a desktop publishing
package.
Uhh,
William Overington wrote,
No, it is a story about an artist who wanted to paint a picture of a horse
and a picture of a dog and, since he knew that the horse and the dog were
great friends and liked to be together and also that he only had one canvas
upon which to paint, the artist
William,
So let me see if I understand this correctly.
Let's take 2 perfectly good standards, Unicode and HTML, and make some
very minor tweaks to them, such as changing the meaning of U+FFFC and a
special format for filenames in the beginning of the file and a new
extension, so we have
but doesn't let any symptoms of that leak outside, I haven't
violated any conformance requirement.
In other words, if these characters are to be used internally for
Japanese Ruby (furigana), etc., then they ought to be able to
be used externally, as well.
They simply aren't adequate for anything
On 08/14/2002 10:52:32 AM Michael Everson wrote:
I'm saying I WANT to use these characters. They solve an apparent
need of mine
They only *appear* to you to solve that need, but in fact do not offer a
good solution. Markup is recommended for your need.
- Peter
On 08/14/2002 02:04:50 PM William Overington wrote:
As this concerns the U+FFFC character and the Unicode Technical Committee
is
due to meet next week, I think it might be helpful if this idea is
discussed
before the meeting as a straightforward idea like this might mean that
the
possibility
On 08/14/2002 01:16:29 AM starner wrote:
That seems to be basically what William Overington is proposing,
except these characters only handle furigana, instead of all markup.
Not quite. WO has proposed characters to be used in interchange. These are
only intended for internal use by programmers
[EMAIL PROTECTED] wrote:
On 08/14/2002 12:45:22 AM Kenneth Whistler wrote:
But even at the time, as the record of the deliberations would
show, if we had a more perfect record, the proponents were clear
that the interlinear annotation characters were to solve an
internal anchor point
Kenneth Whistler replied to my posting as follows.
An interesting point for consideration is as to whether the following
sequence is permitted in interchanged documents.
U+FFF9 U+FFFC U+FFFA Temperature variation with time. U+FFFB
That is, the annotated text is an object replacement
Tex Texin scripsit:
At the time (in the discussion), I don't think we had many examples of
what the uses would be, and it wasn't clear that many were needed, since
the functionality could be arrived at with higher level protocols.
One application that has always seemed obvious to me is
James Kass wrote as follows.
William Overington wrote,
No, it is a story about an artist who wanted to paint a picture of a
horse
and a picture of a dog and, since he knew that the horse and the dog were
great friends and liked to be together and also that he only had one
canvas
upon which
Tex Texin wrote as follows.
William,
So let me see if I understand this correctly.
Let's take 2 perfectly good standards, Unicode and HTML,
Yes.
and make some
very minor tweaks to them,
No.
such as changing the meaning of U+FFFC and a
special format for filenames in the beginning of the
John,
Why would you want them to be for internal-use only? When you exchange
regular expressions wouldn't you want operators such as any character
to be passed as well, and standardized so that there is agreement on the
meaning of the expression?
It is also not clear to me that it is desirable
Tex Texin scripsit:
Why would you want them to be for internal-use only? When you exchange
regular expressions wouldn't you want operators such as any character
to be passed as well, and standardized so that there is agreement on the
meaning of the expression?
Regular expressions are
John Cowan wrote:
Tex Texin scripsit:
Why would you want them to be for internal-use only? When you exchange
regular expressions wouldn't you want operators such as any character
to be passed as well, and standardized so that there is agreement on the
meaning of the expression?
On 08/15/2002 06:41:59 AM William Overington wrote:
In essence, though not formally, U+FFF9..U+FFFC are non-characters as
well, and the Unicode semantics just tells what programs *may* find
them
useful for. Unicode 4.0 editors: it might be a good idea to emphasize
the close relationship of
On Wed, 14 Aug 2002, James Kass wrote:
One, the use of *.html clearly violates the standard file naming
convention of eight uppercase ASCII letters followed by a period
followed by a *three* letter uppercase ASCII file name extension.
I was wondering if the capitalization, ASCII, is for
An interesting point for consideration is as to whether the following
sequence is permitted in interchanged documents.
U+FFF9 U+FFFC U+FFFA Temperature variation with time. U+FFFB
That is, the annotated text is an object replacement character and the
annotation is a caption for a
that, instead of just throwing
the annotation characters away, I should attempt to display them
directly above (and smaller than) the normal text, the way furigana
are displayed above kanji.
This would work not only for typical Japanese ruby, but also for
Michael's English-or-Swedish-over-Bliss
the annotation characters away, I should attempt to display them
directly above (and smaller than) the normal text, the way furigana
are displayed above kanji.
This would work not only for typical Japanese ruby, but also for
Michael's English-or-Swedish-over-Bliss scenario. It might even
At 16:35 -0700 2002-08-13, Murray Sargent wrote:
Michael Everson said Well then they [interlinear annotation characters]
oughtn't to have been encoded.
Michael, you aren't an implementer.
I'm not the kind of implementor you are. I do implement things. :-)
When you implement things
At 17:59 -0700 2002-08-13, Kenneth Whistler wrote:
And Microsoft has others of such beasties hiding internally as
anchors for you-don't-wanna-know-what -- also not interchanged.
I am ***NOT*** bashing MS here, but what is everyone saying? That
these characters should be annotated in the
James Kass scripsit:
Once a meaning like
INTERLINEAR ANNOTATION ANCHOR has been assigned to
a code point, any application which chooses to use that code
point for any other purpose would be at fault.
But a purely nominal one, since any use of these three codepoints
should be behind the
Michael Everson scripsit:
Excuse me, this makes no sense whatsoever. If your company, for
instance, needed INTERNAL code points to attach to higher level
protocols, why did you not use the Private Use Area?
Well, suppose I wanted to use a codepoint internally to a program for
some
At 20:09 -0700 2002-08-12, Doug Ewell wrote:
Everybody will welcome the new conventional, graphical-type
characters and scripts that are coming with Unicode 4.0. But maybe
before standardizing another COMBINING GRAPHEME JOINER or other
control-type character, it would be prudent to study the
Doug Ewell wrote:
I'll have to check with Adelphia and see who or what is trying to
protect me from myself.
Those automatic b*llsh*ts!
A few years ago I was temporarily assigned to the central national office of
my previous employer. It was when the Unicode list was discussing something
about
furigana
are displayed above kanji.
This would work not only for typical Japanese ruby, but also for
Michael's English-or-Swedish-over-Bliss scenario. It might even be
useful in assisting beleaguered Azerbaijanis, for example, by annotating
Latin-script text with its Cyrillic equivalent. (Just
William Overington teased us all unmercifully with:
It occurs to me that it is possible to introduce a convention, either as a
matter included in the Unicode specification, or as just a known about
thing, that if one has a plain text Unicode file with a file name that has
some particular
Kenneth Whistler wrote in response to William Overington,
...or to pick an extension, more or less at random, say .html
The file story7.uof could thus be used with a file named story.txt so as to
indicate which objects were intended to be used for three uses of U+FFFC in
the file
John Cowan wrote as follows.
In essence, though not formally, U+FFF9..U+FFFC are non-characters as
well, and the Unicode semantics just tells what programs *may* find them
useful for. Unicode 4.0 editors: it might be a good idea to emphasize
the close relationship of this small repertoire with
? That if I have a text all nice and marked up
with furigana in Quark I can't export it to Word and reimport it in
InDesign and expect my nice marked up text to still be marked up?
Surely all Unicode/10646 characters are expected to be preserved in
interchange. What have I got wrong, Ken?
--
Michael
At 19:59 +0900 2002-08-08, Dan Kogai wrote:
On Thursday, August 8, 2002, at 04:17 , Michael Everson wrote:
Where do I start looking for information about implementing
furigana? Can you have more than one gloss attached to a word? We
are considering implementing this for Blissymbols.
What do
and only exports them for RichEdit-specific
contexts.
Murray
-Original Message-
From: Michael Everson [mailto:[EMAIL PROTECTED]]
Sent: Tuesday, August 13, 2002 7:52 AM
To: [EMAIL PROTECTED]
Cc: Ken Whistler
Subject: Re: Furigana
At 12:11 -0700 2002-08-08, Kenneth Whistler wrote:
Ah
I want to be able to send a Blissymbol string with a gloss in English
or Swedish attached. Nothing to do with Japanese whatsoever.
Basically, as for all things annotational or interlineating, this
is an excellent application for markup.
--Ken
Hi Michael,
ME I want to be able to send a Blissymbol string with a gloss in
ME English or Swedish attached.
Do you need this in plain text? If I understand Blissymbols correctly,
this is just to give an explanation of the Blissymbol string, much
like giving the Pinyin pronunciation to a Han
At 14:16 -0700 2002-08-13, Kenneth Whistler wrote:
I want to be able to send a Blissymbol string with a gloss in English
or Swedish attached. Nothing to do with Japanese whatsoever.
Basically, as for all things annotational or interlineating, this
is an excellent application for markup.
At 23:50 +0200 2002-08-13, Philipp Reichmuth wrote:
Hi Michael,
ME I want to be able to send a Blissymbol string with a gloss in
ME English or Swedish attached.
Do you need this in plain text?
We are exploring what to do.
If I understand Blissymbols correctly,
this is just to give an
be removed...
^
The Japanese national body was very clear about this, and was opposed
to these going into the standard unless such clarifications were made,
to ensure that these were not intended for plain text interchange
of furigana (or other similar annotations).
--Ken
At 16:00 -0700 2002-08-13, Kenneth Whistler wrote
The Japanese national body was very clear about this, and was opposed
to these going into the standard unless such clarifications were made,
to ensure that these were not intended for plain text interchange
of furigana (or other similar
Michael Everson (in training as a curmudgeon) harrumpfed ;-)
The Japanese national body was very clear about this, and was opposed
to these going into the standard unless such clarifications were made,
to ensure that these were not intended for plain text interchange
of furigana (or other
Michael Everson said Well then they [interlinear annotation characters]
oughtn't to have been encoded.
Michael, you aren't an implementer. When you implement things
unambiguously, you may need internal code points in your plain-text
stream to attach higher-level protocols (such as formatting
only handle furigana, instead of all markup.
Murray,
It's true implementers need some place to attach higher level
protocols, but they don't need specific points for specific
implementations of internal protocols. If they weren't good enough to be
used for exchange, then simply having some unpurposed code points
available for internal use
that these were not intended for plain text interchange
of furigana (or other similar annotations).
--Ken
--
-
Tex Texin cell: +1 781 789 1898 mailto:[EMAIL PROTECTED]
Xen Master http://www.i18nGuy.com
Tex asked:
But does the standard address their removal by receivers (or
intermediaries) , and does removing them include removing the contained
annotation?
Yes and yes. p. 326:
On input, a plain text receiver should either preserve all characters
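The exchange above (Tex's question and Ken's "Yes and yes") says that a receiver which does not preserve the annotation characters removes them together with the annotation they enclose. A minimal Python sketch of that removal option follows. The function name `strip_annotations` is hypothetical, and the handling of stray or unbalanced annotation characters (simply dropping them) is an assumption, not something stated in the thread.

```python
import re

# U+FFF9 INTERLINEAR ANNOTATION ANCHOR
# U+FFFA INTERLINEAR ANNOTATION SEPARATOR
# U+FFFB INTERLINEAR ANNOTATION TERMINATOR
_ANNOTATING_PART = re.compile(r"\uFFFA[^\uFFF9-\uFFFB]*\uFFFB")
_LEFTOVER_CONTROLS = re.compile(r"[\uFFF9-\uFFFB]")


def strip_annotations(text):
    """Remove annotation characters and the annotating text, keep the base text."""
    # Drop separator + annotating text + terminator as one unit.
    text = _ANNOTATING_PART.sub("", text)
    # Drop the anchors and any unbalanced annotation characters that remain
    # (assumption: stray controls are simply discarded).
    return _LEFTOVER_CONTROLS.sub("", text)
```

For example, `strip_annotations("\uFFF9千年\uFFFAミレニアム\uFFFB")` keeps only the annotated text 千年.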
uses them for table-row delimiters, which have
nothing to do with Furigana. Instead, RichEdit 5.0 uses codes from the
U+FDD0 - U+FDEF block for Furigana and various 2D math objects.
Thanks
Murray
-Original Message-
From: Tex Texin [mailto:[EMAIL PROTECTED]]
Sent: Tuesday, August 13, 2002
Thanks Ken. I don't know how I missed the text on 326 when I scanned it
before I mailed.
tex
Kenneth Whistler wrote:
Tex asked:
But does the standard address their removal by receivers (or
intermediaries) , and does removing them include removing the contained
annotation?
Yes and
Kenneth Whistler wrote,
The interlinear annotation characters fall in a gray zone, since
they are not noncharacters, but by rights ought to have been.
Since they are standard characters though, the standard has to
provide some guidelines -- and it is simply safer, if you encounter
and
James Kass scripsit:
Should a character encoding standard ever encode a non-character?
Non-characters aren't encoded, they're reserved either for specific
purposes or for any desired purpose.
Is there such a thing as a non-character with a specific semantic
meaning?
Why not?
Can't apps
, if these characters are to be used internally for
Japanese Ruby (furigana), etc., then they ought to be able to
be used externally, as well.
I understand that having common internal use code points might
be considered handy from an implementer's point of view, but
suggest that such conventions should
.
What does this mean? That if I have a text all nice and marked up
with furigana in Quark I can't export it to Word and reimport it in
InDesign and expect my nice marked up text to still be marked up?
Yes, among other things.
Surely all Unicode/10646 characters are expected to be preserved
likely result you will get is:
anchor1textanchor2annotationanchor3
where the anchors will just be blorts. You should not expect that
the whole annotation *framework* will be implemented, and certainly
not that these three characters will suffice for nice[ly] marked
up... furigana.
I don't have
Stefan wrote:
Many Japanese word processors already have that capability. HTML4 has
ruby tag exactly for that purpose.
And Unicode has characters for that purpose, too.
Unicode: U+FFF9 kanji U+FFFA furigana U+FFFB
HTML4: <RUBY><RB> kanji </RB><RT> furigana </RT></RUBY>
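The correspondence above can be sketched as a small Python converter from the interlinear annotation characters to ruby markup. This is only an illustration under stated assumptions: the function name `annotations_to_ruby` is hypothetical, the output uses the `rb` (ruby base) and `rt` (ruby text) element names, and nested or unbalanced annotations are left unconverted.

```python
import html
import re

# U+FFF9 anchor, base text, U+FFFA separator, annotation, U+FFFB terminator.
_ANNOTATION = re.compile(
    r"\uFFF9([^\uFFF9-\uFFFB]*)\uFFFA([^\uFFF9-\uFFFB]*)\uFFFB"
)


def annotations_to_ruby(text):
    """Convert simple (non-nested) annotation sequences to ruby markup."""
    pieces = []
    pos = 0
    for match in _ANNOTATION.finditer(text):
        # Escape the ordinary text between annotations so it is safe in HTML.
        pieces.append(html.escape(text[pos:match.start()]))
        base = html.escape(match.group(1))
        gloss = html.escape(match.group(2))
        pieces.append("<ruby><rb>%s</rb><rt>%s</rt></ruby>" % (base, gloss))
        pos = match.end()
    pieces.append(html.escape(text[pos:]))
    return "".join(pieces)
```

Going through markup at the point of interchange fits the advice repeated in this thread that annotation is better carried by markup than by these characters in plain text.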
In my Love Hina vol 7, 千年 has furigana ミレニアム.
Just thought you might wanna know.
- Original Message -
From: ろ ろ〇〇〇 [EMAIL PROTECTED]
To: [EMAIL PROTECTED]
Cc: [EMAIL PROTECTED]
Sent: den 25 januari 2002 23:23
Subject: Furigana can be katakana
In my Love Hina vol 7, 千年 has furigana ミレニアム.
In cases such as ?瑞典?スウェーデン? (is the furigana encoded correctly
At 7:05 AM -0700 6/10/01, ÇÇÒǫǧÇËÇÇ§Ç wrote:
If you want to see some weird furigana, try a Rurouni Kenshin album I've got.
í¡çâà with furigana ÇÍÇǢǶÇÞ
The weirdest I know of is
kin mon bashi
goruden geeto briji
--
Edward Cherlin
Generalist
A knot! exclaimed Alice. Oh, do let me
Daniel Biddle wrote:
On Wed, 5 Jul 2000, Rick McGowan wrote:
iRck
I thought this was a typo until I saw your address. U263A
It's not a typo: Rick's signature has passed through an Indic renderer, so
the "i" was reordered. U+FF1AU+FF0DU+FF09
_ Maco`
Will someone PLEASE send this boy a book!?
iRck
Begin forwarded message:
From: [EMAIL PROTECTED]
Date: Sat, 01 Jul 2000 02:49:30 -0800 (GMT-0800)
To: Unicode List [EMAIL PROTECTED]
Subject: Furigana codes?
X-UML-Sequence: 14481 (2000-07-01 10:49:31 GMT
On Tue, 4 Jul 2000, Nelson H. F. Beebe wrote:
However, Gary Kildall's CP/M did use 1A SUB for end-of-file marks, and
as far as I know, Microsoft/IBM DOS borrowed that practice, and many
other things, from it.
Correct. I think that CP/M got it from RT-11, the single-process PDP-11
OS that
" and nobody would suggest doing away with the plain-text
characters needed to control those functions.
In the case of furigana, the need was to have a set of codes that control
these functions for *internal* use in algorithms, rather than *external*
use in interchange. Another example o
On Tue, Jul 04, 2000 at 11:26:59AM -0800, John Cowan wrote:
On Mon, 3 Jul 2000, Edward Cherlin wrote:
*Some* computer system designers, noticing
that the demands of printing terminals were not requirements on
system file internals, chose to use either CR alone or LF alone for
line
On 07/02/2000 09:16:36 AM [EMAIL PROTECTED] wrote:
The problem with the phrase "plain text ceases to be plain if you decide
that
layout information needs to be encoded" is the word "layout." In the
broadest
sense, line and paragraph separation could be considered "layout," and
nobody
would
11-Digit Boy [EMAIL PROTECTED] wrote:
John Hudson [EMAIL PROTECTED] wrote:
Note that this is a text tagging issue, not a Unicode issue, unless
you feel that there is some need to indicate Ruby/Furigana in plain
text. At some point, plain text ceases to be plain if you decide
At 02:37 PM 7/1/00 -0800, Michael \(michka\) Kaplan wrote:
Well, it's not entirely fair to say that Furigana is another way of saying
Ruby in OpenType, since Furigana predates OpenType entirely, as well as the
HTML/DHTML RUBY element.
They do provide the same functionality though... Furigana
"John Hudson" [EMAIL PROTECTED] wrote:
... In any case, Furigana is definitely what Adobe had
in mind when they registered the ruby feature, as
is evident from the feature description.
Is this OT ruby feature to be applied when e.g. a <ruby></ruby>
tag is encountered in
From: [EMAIL PROTECTED]
Are there furigana codes? If not, there darn well need to be.
Like: BEGIN WHAT THE FURIGANA IS FOR, then START FURIGANA, then END
FURIGANA.
AFAIK, Furigana is not made up of separate code points; it is text that
can be Hiragana, Katakana, or Romaji
On Sat, 1 Jul 2000 [EMAIL PROTECTED] wrote:
Furigana codes would simply mark certain text as furigana, meaning to
the text-display device, "These characters are not to be displayed on
the main line of text, but rather above it and in smaller type". There
ought to be
Furigana codes would simply mark certain text as furigana, meaning to
the text-display device, "These characters are not to be displayed on
the main line of text, but rather above it and in smaller type". There
ought to be <furi kana=""> and </furi> codes, or the
At 04:04 AM 7/1/00 -0800, [EMAIL PROTECTED] wrote:
Furigana codes would simply mark certain text as furigana, meaning to
the text-display device, "These characters are not to be displayed on
the main line of text, but rather above it and in smaller type". There
ought to be
--
[EMAIL PROTECTED] - email
(917) 421-3909 x1133 - voicemail/fax
John Hudson [EMAIL PROTECTED] wrote:
At 04:04 AM 7/1/00 -0800, you wrote:
Furigana codes would simply mark certain text as furigana, meaning
to
the text-display device, "These characters are not to be disp