Unicode cannot be the arbiter of mathematical (or other) notation, but,
within limits, you could ask for some annotations if this would help
ensure that there's some uniformity in how people pick symbols for
certain purposes.
Why not contact the relevant publishers and find out what they are
On 7/29/2013 4:25 PM, Ilya Zakharevich wrote:
On Wed, Jul 10, 2013 at 04:24:36AM +, Murray Sargent wrote:
Ilya asked, Are there any other ways to show Unicode on Windows?
You can download Unibook (http://www.unicode.org/unibook/) and set up your
fonts for the ranges. That's the way The
On 7/30/2013 11:39 AM, Buck Golemon wrote:
I shudder to imagine the circumstances that forced you to learn this
information.
I shudder to imagine the state of mind that prompted you to make this
valuable contribution.
A./
On 7/30/2013 12:26 PM, Doug Ewell wrote:
Buck Golemon buck at yelp dot com replied to Richard Wordingham
richard dot wordingham at ntlworld dot com:
There are no Unicode code pages.
Just to be pedantic, there are several on Windows. They encode the
coding form (Unicode codes being best
On 7/30/2013 2:15 PM, Doug Ewell wrote:
Asmus Freytag asmusf at ix dot netcom dot com wrote:
A code page is not, in general,
the same as an encoding scheme.
What is, then, the proper definition of a code page?
I might not be able to do better than Potter Stewart here. I think of a
code page
On 8/27/2013 9:34 PM, Stephan Stiller wrote:
All good replies
It means the program needs to go back (a.k.a. back up)
but I'd say backtracking would make for better wording in TUS.
I tend to disagree, because back up seems to me the one expression
that people dealing in code point conversion
On 8/28/2013 4:15 PM, Stephan Stiller wrote:
To appease the nit pickers:
I totally didn't know there's nitpickers on this list, like, those
that reply to and pick on each other. Interesting!
It's called life-long learning.
A./
On 8/28/2013 3:29 PM, Xue Fuqiao wrote:
I see. Thanks for all your replies!
BTW I have a further question:
On Wed, Aug 28, 2013 at 1:44 PM, Philippe Verdy verd...@wanadoo.fr wrote:
- in UTF-8, you'll need to look backward between 1 to 3 positions before
your start position to find the
On 8/28/2013 1:00 PM, Stephan Stiller wrote:
For Web formats (HTML, etc.), the answer is no.
The obvious follow-up to the list: It'd be interesting to know where
the answer is yes.
People will occasionally mention ISO/IEC 2022, which can be thought of
as a meta-encoding or encoding template or
On 8/28/2013 5:19 PM, Doug Ewell wrote:
Actually 0xC2, according to the rules of UTF-8.
Hmm. What you are referring to is that 0xC0 and 0xC1 don't occur because
of the requirement for minimal length encoding. However, a check for
>=0xC0 will give the correct result for backing up, assuming
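The backing-up step discussed in this thread can be sketched as follows. This is only an illustration, assuming well-formed UTF-8 input; the function name `start_of_code_point` is hypothetical and does not come from any message. It relies on the fact that trailing bytes lie in 0x80..0xBF, so any byte below 0x80 (ASCII) or at 0xC0 and above (a lead byte, 0xC2 and above in well-formed UTF-8) ends the backward scan.

```python
def start_of_code_point(buf: bytes, pos: int) -> int:
    """Back up from pos to the first byte of the UTF-8 sequence containing it.

    Assumes buf is well-formed UTF-8 and 0 <= pos < len(buf).
    """
    start = pos
    # Trailing bytes have the bit pattern 10xxxxxx (0x80..0xBF).
    # A sequence is at most 4 bytes long, so look back at most 3 positions.
    while start > 0 and pos - start < 3 and 0x80 <= buf[start] <= 0xBF:
        start -= 1
    return start


# 'a' is 1 byte, U+20AC is 3 bytes, U+1D11E is 4 bytes.
data = "a\u20ac\U0001d11e".encode("utf-8")
print(start_of_code_point(data, 3))  # last byte of U+20AC
print(start_of_code_point(data, 7))  # last byte of U+1D11E
```

The scan is bounded at three steps, so on malformed input it stops early rather than walking back through an arbitrarily long run of trailing bytes.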
On 8/28/2013 6:25 PM, Karl Williamson wrote:
On 08/28/2013 06:52 PM, Asmus Freytag wrote:
On 8/28/2013 5:19 PM, Doug Ewell wrote:
Actually 0xC2, according to the rules of UTF-8.
Hmm. What you are referring to is that 0xC0 and 0xC1 don't occur because
of the requirement for minimal length
On 8/28/2013 6:31 PM, Doug Ewell wrote:
He didn't ask if such a practice was common, or confusing, or a good
idea, though perhaps those were underlying questions.
The answer may well have depended on the underlying question.
But until he comes back with an elaboration, the discussion might
On 9/2/2013 5:08 PM, Doug Ewell wrote:
I asked because, as Philippe said, an octet is the same as an 8-bit byte.
Yes, that's the standard definition of octet, er 8-bit byte.
Never having encountered a non-8-bit byte anywhere in the wild, I've
always ceded the field of octets to nitpickers.
On 9/2/2013 6:47 PM, Doug Ewell wrote:
In any case, there is nothing about multi-octet versus multi-byte
that makes one fixed-length and the other variable-length.
Yep.
A./
Good question, Jean-François.
I seem to recall that typographers may make a distinction between
black-letter and fraktur forms, but even if they do, the differences
are typographical, not essential. For the purpose of *character*
encoding, one would need to make a very strong rationale for
On 9/10/2013 11:05 AM, Michael Everson wrote:
On 10 Sep 2013, at 18:01, Asmus Freytag asm...@ix.netcom.com wrote:
This rationale is absent in document WG2 N3907 that requests these characters.
Therefore, it seems these two additions should not have been made.
I disagree. The mathematical
On 9/10/2013 12:09 PM, Michael Everson wrote:
On 10 Sep 2013, at 20:04, Asmus Freytag asm...@ix.netcom.com wrote:
The proper thing would be to deprecate these accidental duplications forthwith.
Nonsense. And blackletter isn't identical to Fraktur.
It is not different enough to base
On 9/11/2013 1:13 PM, Michael Everson wrote:
Nonsense. And blackletter isn't identical to Fraktur.
It is not different enough to base a character encoding distinction on it. Why don't we code
times and garamond shapes then as characters as well.
The Mathematical Alphanumeric Symbols block
On 9/11/2013 9:50 PM, Charlie Ruland ☘ wrote:
One final remark: Thinking about it I have the impression that the
blackletter vs. antiqua distinction once made in German very much
resembles that made between Hiragana and Katakana in Japanese. In both
cases the underlying systems of the
On 9/12/2013 1:36 AM, Gerrit Ansmann wrote:
On Thu, 12 Sep 2013 06:50:23 +0200, Charlie Ruland ☘
rul...@luckymail.com wrote:
One final remark: Thinking about it I have the impression that the
blackletter vs. antiqua distinction once made in German very much
resembles that made between
On 9/13/2013 10:54 AM, Whistler, Ken wrote:
Stephan Stiller noted:
Maybe ... and the origin of the single-glyph ellipsis remains a mystery
to me.
As Philippe surmised, it is a compatibility character, originally included
in the Unicode 1.0 repertoire for cross-mapping to existing legacy
On 9/14/2013 6:24 AM, Michael Everson wrote:
On 14 Sep 2013, at 14:16, Stephan Stiller stephan.stil...@gmail.com wrote:
Books never used it. The tradition in typing was developed to assist
typesetters to navigate the typewritten text they were setting. The typesetters
never put two spaces
On 9/14/2013 12:19 PM, Michael Everson wrote:
And as a book designer and publisher, I think that having large spaces after a
full stop is both unnecessary and vulgar.
Quote from the blog:
While the modern convention is the single space, it is no less
arbitrary than any other, and if
On 9/16/2013 1:41 PM, Doug Ewell wrote:
This has nothing to do with UTF-Anything or Normalization Form Anything.
But all with keeping the discussion alive for any reason, however
insignificant :)
A./
On 9/16/2013 2:18 PM, Doug Ewell wrote:
Asmus Freytag asmusf at ix dot netcom dot com wrote:
On 9/16/2013 1:41 PM, Doug Ewell wrote:
This has nothing to do with UTF-Anything or Normalization Form
Anything.
But all with keeping the discussion alive for any reason, however
insignificant :)
I
On 9/16/2013 2:05 PM, Steffen Daode Nurpmeso wrote:
But this may be a sign that the Unicode Consortium is about to have
its own status changed to become a non-profit charity foundation
dedicated to worldwide promotion of education and culture. Thanks. But
this should be clear, and some
On 9/16/2013 3:01 PM, Philippe Verdy wrote:
Please stop, I've enough replies about the Unicode Consortium status.
But my questions about consequences of **dedicated** grants remain as
it affects how you'll organize works and manage it, within a limited
timeframe. We've not seen this discussed
It seems this post is a bit inappropriate for this forum in its content
and, given its rather bizarre immaturity of interaction with other
members, seems altogether more fitting for a kindergarten playground in .
It would be nice if such posts could be kept off this list.
A./
On 9/17/2013 8:15
On 9/17/2013 2:55 PM, Stephan Stiller wrote:
[AF:]
It is the wording in your posts that adds to the confusion.
My fundamental point is, has been, and continues to be that whenever
people use the more general word code point instead of the more
appropriate scalar value, that will add to the
On 9/17/2013 8:40 PM, Philippe Verdy wrote:
In what way does UTF-16 use surrogate code /points/? An encoding
form is a mapping. Let's look at this mapping:
* One _inputs_ scalar values (not surrogate code points).
In fact the input is one code point.
Then only if that code
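The mapping under discussion can be written down directly. The sketch below is illustrative only and the name `utf16_code_units` is hypothetical. It takes the view that the input is a Unicode scalar value, so surrogate code points are rejected as input, and it produces surrogate code units only for values above U+FFFF.

```python
def utf16_code_units(scalar: int) -> list[int]:
    """Map one Unicode scalar value to its UTF-16 code unit sequence."""
    # Scalar values are the code points 0..10FFFF minus the surrogates.
    if not (0 <= scalar <= 0x10FFFF) or 0xD800 <= scalar <= 0xDFFF:
        raise ValueError("not a Unicode scalar value")
    if scalar < 0x10000:
        # BMP: one code unit, numerically equal to the scalar value.
        return [scalar]
    # Supplementary planes: split the 20-bit offset into two 10-bit halves.
    offset = scalar - 0x10000
    return [0xD800 + (offset >> 10), 0xDC00 + (offset & 0x3FF)]


print([hex(u) for u in utf16_code_units(0x1D11E)])
```

Whether the rejected inputs are described as "surrogate code points" or as "code points that are not scalar values" is exactly the terminology question being argued here; the code behaves the same either way.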
On 9/18/2013 2:42 AM, Philippe Verdy wrote:
There are scalar values used in so many other unrelated domains
(notably in mathematics, where a scalar value is an identifiable
object that remains constant in relation with some operations and
independent of its context, unlike functions,
On 9/18/2013 3:14 PM, Philippe Verdy wrote:
I would propose exactly the opposite of what you want: avoid using
scalar value alone. But only speak about 'Unicode scalar value
character property.
If it is a property, it would be a code point property...
Still, I support your general point.
A superscript glyph would in my view
normally be larger than a glyph for a combining superscript
character. The reason is that the former just has to appear raised
and smaller, while the latter has to fit somehow in the space
above x-height.
The
On 10/20/2013 1:47 AM, Jukka K. Korpela wrote:
2013-10-20 2:38, Richard Wordingham wrote:
Is a sequence of a U+25CC DOTTED CIRCLE plus a combining mark plain
text?
Well, is <h1>hello</h1> plain text? The answer is that any string of
characters may be considered as plain text and any string of
On 10/20/2013 3:45 PM, Philippe Verdy wrote:
2013/10/20 Asmus Freytag asm...@ix.netcom.com
mailto:asm...@ix.netcom.com
Incidentally, the dotted circle shown in the Unicode Code charts
is *not* 25CC, and if I were to implement a show dotted circle
feature in a program I would
On 10/22/2013 11:38 AM, Jean-François Colson wrote:
Hello.
I know that in some Japanese encodings (JIS, EUC), \ was replaced by a ¥.
On my computer, there are some Japanese fonts where the characters
seems coded following Unicode, except for the \ which remained a ¥.
Is that acceptable from
On 12/12/2013 2:25 PM, Leo Broukhis wrote:
Hmmm... As a person with Russian as the first language I can assure
you that from any literate Russian-speaking person's perspective
italic ū is an unacceptable and *WRONG* representation of п (because
in Russian, unlike Serbian, there is й). Should
the circle.
That's exactly right.
Leo
On Thu, Dec 12, 2013 at 2:52 PM, Asmus Freytag asm...@ix.netcom.com
mailto:asm...@ix.netcom.com wrote:
On 12/12/2013 2:25 PM, Leo Broukhis wrote:
Hmmm... As a person with Russian as the first language I can
assure you that from any literate
On 12/12/2013 6:38 PM, Leo Broukhis wrote:
Italic is not plain text.
Is this the only thing that would have stopped you from advocating
disunification?
Yeah. To heck with the end user and their pathetic preferences.
Is a preference to have traditional and simplified CJK characters
[mailto:unicode-bou...@unicode.org] *On behalf of *Marc Blanchet
*Sent:* 13 December 2013 00:00
*To:* Asmus Freytag
*Cc:* verd...@wanadoo.fr; William_J_G Overington; Michael Everson;
unicode Unicode Discussion
*Subject:* Re: The Ruble sign has been approved
On 2013-12-12 at 13:42, Asmus
I find it unhelpful to consider 2052 as the italic variant of 00F7, and
further find the evidence for that not all that germane.
Both are variants of the - sign, and so ipso facto are variants of
each other.
However, to identify something as italic to me would require that
one form is used in
Halvard Silli
Asmus Freytag, Wed, 15 Jan 2014 23:17:46 -0800:
I find it unhelpful to consider 2052 as the italic variant of 00F7, and
further find the evidence for that not all that germane.
Both are variants of the - sign, and so ipso facto are variants of
each other.
However, to identify
I agree, the use of nobreak markup is more appropriate to the problem.
This is not a plain text issue and it even fails the smell test for
an issue that is more elegantly solved by format characters than markup.
A./
On 2/5/2014 2:27 PM, Jukka K. Korpela wrote:
2014-02-05 23:44, Rhavin Grobert
On 2/27/2014 2:32 AM, Shriramana Sharma wrote:
Given that Unicode encodes scripts and not languages, how appropriate
is it to call the BMP and the SMP as the multi*lingual* planes?
Isn't it lovely how these things work?
A./
On 3/16/2014 9:05 AM, William_J_G Overington wrote:
So, everyone, can the Romanized Singhala system be used with a QWERTY keyboard
to produce Unicode-encoded text, thereby producing a good combined system?
Could this be achieved if a text-processing software package were produced that
could
On 3/18/2014 1:57 PM, Tom Gewecke wrote:
On Mar 18, 2014, at 1:48 PM, Marc Durdin wrote:
Can anyone who is more knowledgeable in Unicode Sinhala tell me which
is the correct rendering? See graphic below.
image002.png
The OS X version is the most correct according to my limited knowledge of
On 3/19/2014 9:17 PM, J. Leslie Turriff wrote:
Perhaps it might be useful to be able to distinguish between an editing
mode and a composition mode: editing mode would be active when a document
is first loaded into the editor, when the editor has no keystroke history to
consult, and in
On 3/21/2014 8:22 AM, Jan Velterop wrote:
But are the chances nil?
Essentially you are trying to create a symbol for "this material is
placed in the public domain". If you get that symbol adopted by similar
authorities as those that created ©, then you would see it encoded in
due time. If
On managing some types of spacing between elements in running text:
On 3/27/2014 8:04 AM, Jukka K. Korpela wrote:
2014-03-27 15:10, Kalvesmaki, Joel wrote:
William, try the U+2000..U+200A glyphs under General Punctuation--I
think
that's what you're looking for to manage precise widths of
I think this calls for an implementation note on UAX#9 along these lines.
-
During line breaking, if a line is broken at the location of a SHY, the
text around the line break may change. A common case is the replacement
of the invisible SHY by a visible HYPHEN, but see
On 4/1/2014 4:12 PM, Jonathan Rosenne wrote:
The use of soft hyphen is a cultural matter. In Hebrew, Classic and Israeli,
soft hyphens are not used.
More to the point, how does software render a soft hyphen included in
inserted LTR text, when the outer text is Hebrew? Would it always be
On 4/2/2014 12:36 AM, Richard Wordingham wrote:
On Tue, 1 Apr 2014 23:41:48 +
Whistler, Ken ken.whist...@sap.com wrote:
Is it legitimate to truncate the context to a single line? The BiDi
algorithm is attempting to interpret unlabelled text as embedded
text
(it's not an arbitrary dance),
On 4/2/2014 1:42 AM, Christopher Fynn wrote:
Rather than Emoji it might be better if people learnt Han ideographs
which are also compact (and a far more developed system of
communication than emoji). One CJK character can also easily replace
dozens of Latin characters - which is what is being
On 4/2/2014 4:05 AM, Koji Ishii wrote:
On Apr 2, 2014, at 7:19 PM, Asmus Freytag asm...@ix.netcom.com wrote:
On 4/2/2014 1:42 AM, Christopher Fynn wrote:
Rather than Emoji it might be better if people learnt Han ideographs
which are also compact (and a far more developed system
On 4/20/2014 3:24 AM, Eli Zaretskii wrote:
Would someone please help understand the following subtleties and
obscure language in the UBA document found at
http://www.unicode.org/reports/tr9/? Thanks in advance.
Eli,
I've tried to give you some explanations - in some places, I concur with
On 4/20/2014 6:54 PM, James Clark wrote:
On Mon, Apr 21, 2014 at 2:58 AM, Asmus Freytag asm...@ix.netcom.com
mailto:asm...@ix.netcom.com wrote:
On 4/20/2014 3:24 AM, Eli Zaretskii wrote:
Would someone please help understand the following subtleties and
obscure language in the UBA
On 4/21/2014 1:33 AM, Eli Zaretskii wrote:
Date: Sun, 20 Apr 2014 23:03:20 -0700
From: Asmus Freytag asm...@ix.netcom.com
CC: Eli Zaretskii e...@gnu.org, unicode@unicode.org,
Kenneth Whistler k...@unicode.org
Note that the current embedding level is not changed by this rule
On 4/21/2014 12:55 AM, Eli Zaretskii wrote:
in some places, I concur with you that the wording could be improved
and that such improved wording should be proposed to the UTC (or its
editorial committee) for incorporation into a future update.
How do we do that?
You file a problem report using
algorithms in a conformance test case.
2014-04-21 16:32 GMT+02:00 Asmus Freytag asm...@ix.netcom.com
mailto:asm...@ix.netcom.com:
On 4/21/2014 1:33 AM, Eli Zaretskii wrote:
Date: Sun, 20 Apr 2014 23:03:20 -0700
From: Asmus Freytag asm...@ix.netcom.com mailto:asm...@ix.netcom.com
to be less dependent on the sample
implementation.
2014-04-21 19:48 GMT+02:00 Asmus Freytag asm...@ix.netcom.com
mailto:asm...@ix.netcom.com:
Philippe,
I fail to understand how your post contributes to the topic.
The issue was unclear wording of the specification, not
deficiencies
On 4/21/2014 11:14 AM, Doug Ewell wrote:
From: Asmus Freytag asmusf at ix dot netcom dot com wrote:
In general, I heartily dislike specifications that just narrate a
particular implementation...
I agree completely. I see this with CLDR as well; there is a more or
less implicit assumption
On 4/21/2014 2:47 AM, William_J_G Overington wrote:
I am hoping to attach images showing the designs to other posts in this thread.
Please find attached an image of the designs of the colourful glyphs.
The language I would use for my reaction to this, is just too colorful
to reproduce here
On 4/21/2014 1:54 PM, Philippe Verdy wrote:
My intent was not to demonstrate a bug in the algorithm, I have not
even claimed that, but to make sure that (less common) usages of
paired brackets that do not obey a pure hierarchy (because these
notations use different type of brackets, they
Ilya,
I appreciate your taking the time to take apart Philippe's message. That
aspect of it was not obvious to me.
A./
PS: more comments below
On 4/21/2014 4:41 PM, Ilya Zakharevich wrote:
On Mon, Apr 21, 2014 at 02:44:14PM -0700, Asmus Freytag wrote:
On 4/21/2014 1:54 PM, Philippe Verdy
On 4/21/2014 5:44 PM, Whistler, Ken wrote:
So one may ask: what will be the result of the CURRENT UNICODE parsing
applied
to Philippe’s example?
This is an [«] example [»] for demonstration only.
That is easily answered. Let's crank up the bidi reference code with
a shorter example
On 4/21/2014 8:32 PM, Ilya Zakharevich wrote:
On Mon, Apr 21, 2014 at 06:08:12PM -0700, Asmus Freytag wrote:
Here's the text I supplied, with numbers added for discussion. It
definitely needs some
editing, but the point of the exercise would be to see what:
1. A bracket pair is a pair
On 4/22/2014 2:19 AM, Ilya Zakharevich wrote:
I think the crucial problem is with
1( 2[ 3( 4] 5) 5b] 6)
I have two possible interpretations: one matches 2 with 5b, another
leaves 2 unmatched.
Ilya,
if you read UAX#9, the way the algorithm works is by pushing openers on
a stack,
On 4/22/2014 9:02 AM, Eli Zaretskii wrote:
an resolve it, so we match 1) and 6).
But that's wrong, isn't it?
Yes, brain fart.
I agree, but let me try to say the same more concisely:
A bracket pair is a pair of an opening paired bracket and a closing
paired bracket characters
On 4/22/2014 10:11 AM, Eli Zaretskii wrote:
Date: Tue, 22 Apr 2014 09:52:43 -0700
From: Asmus Freytag asm...@ix.netcom.com
CC: nospam-ab...@ilyaz.org, verd...@wanadoo.fr, k...@unicode.org,
j...@jclark.com, unicode@unicode.org
I agree, but let me try to say the same more concisely
On 4/22/2014 2:17 PM, Ilya Zakharevich wrote:
On Tue, Apr 22, 2014 at 07:08:56PM +0300, Eli Zaretskii wrote:
Sorry, I do not see any definition here. Just a collection of words
which looks like a definition, but only locally…
Any definition is just a collection of words, of course. Can you
On 4/23/2014 12:35 AM, Ilya Zakharevich wrote:
On Tue, Apr 22, 2014 at 09:06:27AM -0700, Asmus Freytag wrote:
if you read UAX#9, the way the algorithm works is by pushing openers
on a stack, then, on finding the first closer, going down the stack
and attempting to locate a match
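The stack procedure described here can be sketched in a few lines. This is a simplified illustration, not the reference implementation: it handles only the ASCII pairs () and [], ignores canonical equivalents and any limit on stack depth, and the names `PAIRS` and `bracket_pairs` are hypothetical. The sample string is modeled on the one discussed earlier in the thread, `( [ ( ] ) ] )`.

```python
# Closing bracket -> the opener it pairs with (simplified to two ASCII pairs).
PAIRS = {")": "(", "]": "["}
OPENERS = set(PAIRS.values())


def bracket_pairs(text: str) -> list[tuple[int, int]]:
    """Return (opener_index, closer_index) pairs, sorted by opener position."""
    stack = []  # entries are (index, opening character)
    pairs = []
    for i, ch in enumerate(text):
        if ch in OPENERS:
            stack.append((i, ch))
        elif ch in PAIRS:
            # Go down the stack from the top looking for a matching opener.
            for depth in range(len(stack) - 1, -1, -1):
                if stack[depth][1] == PAIRS[ch]:
                    pairs.append((stack[depth][0], i))
                    # Pop the matched opener and everything above it.
                    del stack[depth:]
                    break
            # No match found: the closer is simply left unpaired.
    return sorted(pairs)


print(bracket_pairs("([(])])"))
```

On this input the first `]` pairs with the `[` and discards the inner `(` above it, the following `)` pairs with the outermost `(`, and the last two closers find an empty stack and stay unmatched.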
On 4/23/2014 4:41 PM, Ilya Zakharevich wrote:
On Wed, Apr 23, 2014 at 09:21:04AM -0700, Asmus Freytag wrote:
a parsing is good if it satisfies all conditions below:
0) Some delimiters in the string are marked as “non-matching”; the rest
is broken into disjoint “matched” pairs
On 4/23/2014 7:37 PM, Philippe Verdy wrote:
Thanks for the clear reply, now I know that my example in a prior
message would work appropriately with UBA:
This is an [«] ARABIC EXAMPLE [»] for demonstration only.
Because:
- the opening guillemet is not stripped out of the context stack when
On 4/24/2014 8:20 AM, Eli Zaretskii wrote:
So nothing (at least not the reason of the GC which is just an intermediate
but incomplete helper) forbids the guillemets to be listed in
BidiBrackets.txt.
They don't satisfy the conditions for that. From BidiBrackets.txt:
Philippe is incorrect once
On 4/24/2014 7:39 AM, Eli Zaretskii wrote:
This is _*incorrect*_, see the text in blue/bold in the definition
copied below.
The second bullet in item 3 of the second second-level bullet of the
third top-level bullet of BD16 clearly says that all elements that are
above the matched element are
On this side show, Philippe finally is correct, because I received his
message without ASCII-i-fication; he cc'd me directly, and I never saw
the mangled text. It's a bit embarrassing for a Unicode mail list to not
even be able to let guillemets through unmolested.
But this shall not distract
This has seen off-line discussion with the mail manager and we're good.
A./
On 5/1/2014 3:44 PM, Richard Wordingham wrote:
On Thu, 24 Apr 2014 17:19:57 -0700
Asmus Freytag asm...@ix.netcom.com wrote:
On this side show, Philippe finally is correct, because I received
his message without ASCII
On 5/8/2014 9:09 AM, catherine butler wrote:
We're struggling to master the intricacies of proposing new Unicode characters specific
to the James Joyce masterpiece Finnegans Wake.
http://fwpages.blogspot.com/2014/05/unicode-for-james-joyce-needed.html
There are somewhere from two to two-dozen
On 5/9/2014 10:45 AM, catherine butler wrote:
What is needed is an authoritative and complete inventory of these,
using *images* from the works and notes to show their shapes (and a few
images to document that they are indeed part of running text).
I don't have access to the manuscript
On 5/9/2014 6:32 PM, Shriramana Sharma wrote:
Dear Richard,
It is true that Vowel_Independent can behave like Consonant
characters. Given that consonant letters also have an inherent vowel
in these scripts, IMO there is not really much to distinguish
*technically*. At least in *Indian* Indic
On 5/30/2014 11:26 AM, Karl Williamson wrote:
I'm having a problem with this
http://www.unicode.org/versions/corrigendum9.html
You are not alone.
Some people now think it means that noncharacters are really no
different from private-use characters, and should be treated very
similarly if
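For readers following this thread, the set of code points at issue is small and fixed: U+FDD0..U+FDEF plus the last two code points of each of the 17 planes. A minimal check, with the hypothetical name `is_noncharacter`, might look like this. It says nothing about how such code points should be handled, which is the question being debated.

```python
def is_noncharacter(cp: int) -> bool:
    """True if cp is one of the 66 Unicode noncharacter code points."""
    if not 0 <= cp <= 0x10FFFF:
        return False
    # U+FDD0..U+FDEF, plus U+nFFFE and U+nFFFF in every plane n = 0..16.
    return 0xFDD0 <= cp <= 0xFDEF or (cp & 0xFFFE) == 0xFFFE


print(sum(is_noncharacter(cp) for cp in range(0x110000)))
```

Note that U+FEFF (the byte order mark) and the private-use code points are not in this set.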
On 5/31/2014 4:09 AM, Philippe Verdy wrote:
2014-05-30 20:49 GMT+02:00 Asmus Freytag asm...@ix.netcom.com
mailto:asm...@ix.netcom.com:
This might have been possible at the time these were added, but
now it is probably not feasible. One of the reasons is that block
names are exposed
On 5/31/2014 12:36 PM, Philippe Verdy wrote:
May be; but there's real doubt that a regular expression that would
need this property would be severely broken if that property was
corrected. There are many other properties that are more useful (and
much more used) whose associated set of
On 5/31/2014 10:06 PM, Philippe Verdy wrote:
I've not proposed to move these characters elsewhere (or to reencode
them), why do you think that?
I just challenge your statement that a block cannot be discontinuous,
Well, go ahead and challenge that.
As implemented in the current nameslist
On 6/1/2014 9:07 AM, Markus Scherer wrote:
On Sun, Jun 1, 2014 at 7:49 AM, Karl Williamson
pub...@khwilliamson.com mailto:pub...@khwilliamson.com wrote:
Thanks, I had not thought about that. I'm thinking wording
something like this is more appropriate
Noncharacters may be openly
On 6/2/2014 9:27 AM, Mark Davis ☕️ wrote:
On Mon, Jun 2, 2014 at 6:21 PM, Shawn Steele
shawn.ste...@microsoft.com mailto:shawn.ste...@microsoft.com wrote:
The “problem” is now that previously these characters were illegal
The problem was that we were inconsistent in standard and
On 6/2/2014 9:08 AM, Mark Davis ☕️ wrote:
The problem is where to draw the line. In today's world, what's an
app? You may have a cooperating system of apps, where it is
perfectly reasonable to interchange sentinel values (for example).
The way to draw the line is to insist on there being an
On 6/2/2014 9:38 AM, Shawn Steele wrote:
I agree with Markus; I think the FAQ is pretty clear. (And if not,
that's where we should make it clearer.)
But the formal wording of the standard should reflect that clarity, right?
I don't tend to read the FAQ :)
FAQ's are useful, but they are not
On 6/2/2014 2:53 PM, Markus Scherer wrote:
On Mon, Jun 2, 2014 at 1:32 PM, David Starner prosfil...@gmail.com
mailto:prosfil...@gmail.com wrote:
I would especially discourage any web browser from handling
these; they're noncharacters used for unknown purposes that are
undisplayable
On 6/2/2014 3:08 PM, Asmus Freytag wrote:
On 6/2/2014 2:53 PM, Markus Scherer wrote:
On Mon, Jun 2, 2014 at 1:32 PM, David Starner prosfil...@gmail.com
mailto:prosfil...@gmail.com wrote:
I would especially discourage any web browser from handling
these; they're noncharacters used
Michelle,
Unicode normally does not document all known usages of symbols.
Occasionally, if a symbol is used in ways that might be unexpected from
its name, the standard may add an alias or annotation. This is done in
particular, when there is a question of whether a given symbol is the
Nicely put.
A./
On 6/3/2014 12:09 AM, Martin J. Dürst wrote:
On 2014/06/03 07:08, Asmus Freytag wrote:
On 6/2/2014 2:53 PM, Markus Scherer wrote:
On Mon, Jun 2, 2014 at 1:32 PM, David Starner prosfil...@gmail.com
mailto:prosfil...@gmail.com wrote:
I would especially discourage any web
On 6/3/2014 10:17 AM, Jukka K. Korpela wrote:
On the practical side, it might be in order to warn against usage that
relies on some particular interpretation like that. What I mean is
that it is OK to use WARNING SIGN as warning about risk of personal
injury, but questionable to expect that
On 6/4/2014 11:26 AM, Doug Ewell wrote:
Sorry, I left out an important detail.
I wrote:
3. U+FEFF at the beginning of a stream (note: not packet or
arbitrary cutoff point)
I meant U+FEFF as a zero-width no-break space. Obviously it is very
common to see U+FEFF as a signature or BOM.
My
On 6/4/2014 12:21 PM, Richard Wordingham wrote:
On Wed, 04 Jun 2014 11:40:11 -0700
Asmus Freytag asm...@ix.netcom.com wrote:
On 6/4/2014 11:26 AM, Doug Ewell wrote:
I meant U+FEFF as a zero-width no-break space. Obviously it is very
common to see U+FEFF as a signature or BOM.
The semantics
On 6/7/2014 9:19 PM, Karl Williamson wrote:
On 06/02/2014 11:00 AM, Shawn Steele wrote:
To further my understanding, can someone provide examples of how
these are used in actual practice? I can't think of any offhand and
the closest I get is like the old escape characters to get a dot
matrix
On 6/29/2014 11:44 AM, Koji Ishii wrote:
Surrogate code points, private-use characters, and control characters are not
given the Default_Ignorable_Code_Point property. To avoid security problems,
such characters or code points, when not interpreted and not displayable by
normal rendering,
On 6/30/2014 10:55 PM, Koji Ishii wrote:
Thanks for the reply. It’s very likely that the page contains images, borders,
background, etc., so I can recognize that all the text is missing. But neither
missing text nor garbled text suggests to me how to fix it. I’d try another
browser, then give up
On 7/2/2014 8:02 AM, Karl Williamson wrote:
Corrigendum #9 has changed this so much that people are coming to me
and saying that inputs may very well have non-characters, and that the
default should be to pass them through. Since we have no published
wording for how the TUS will absorb
On 7/3/2014 11:02 AM, Richard COOK wrote:
On Jul 2, 2014, at 8:02 AM, Karl Williamson pub...@khwilliamson.com wrote:
Corrigendum #9 has changed this so much that people are coming to me and saying
that inputs may very well have non-characters, and that the default should be
to pass them