[2] Attachment: screenshot-of-minuses.png
Leif Halvard Silli
Asmus Freytag, Wed, 15 Jan 2014 23:17:46 -0800:
I find it unhelpful to consider 2052 as the italic variant of 00F7, and
further find the evidence
in the dark as some strange traditions without any roots.
Also, I don't want the commercial minus to live a life as if it is such
a unique thing. Let us document things properly.
Leif Halvard Silli
Sent: Thursday, 16 January 2014 at 04:43
From: Leif Halvard Silli xn--mlform-iua@målform.no
Asmus Freytag, Thu, 16 Jan 2014 07:24:45 -0800:
On 1/16/2014 5:34 AM, Leif Halvard Silli wrote:
when looking at my message in Firefox [1], the commercial minus
looks like a “handwritten” variant of the division sign.
the fact that the slant is reverse, rather than forward,
is contrary
are
found - for instance, in calendar calculations.
[1] http://www.unicode.org/mail-arch/unicode-ml/y2012-m07/0053.html
[2]
https://archive.org/stream/kaufmnnischeari00schegoog#page/n229/mode/2up
--
leif halvard silli
error, if the xml encoding
declaration conflicts with the external method - BOM or HTTP.
[1] http://www.w3.org/TR/REC-xml/#NT-document
[2]
http://tools.ietf.org/html/draft-ietf-appsawg-xml-mediatypes-06#section-3.3
[3] http://www.w3.org/TR/REC-xml/#charencoding
--
leif halvard silli
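
A rough sketch of the kind of conflict mentioned above, covering only the BOM case (not HTTP). The function names and the 400-byte look-ahead are my own choices for illustration, UTF-32 is ignored, and a real XML parser does considerably more than this:

```python
import codecs
import re

# Byte-order marks and the encoding family each one signals (UTF-32 left out).
BOMS = [
    (codecs.BOM_UTF8, "utf-8"),
    (codecs.BOM_UTF16_LE, "utf-16"),
    (codecs.BOM_UTF16_BE, "utf-16"),
]

# Matches the encoding pseudo-attribute of an XML declaration.
DECL = re.compile(r'^<\?xml[^>]*?encoding\s*=\s*["\']([A-Za-z][A-Za-z0-9._-]*)["\']')


def bom_encoding(data: bytes):
    """Return (encoding signalled by the BOM, BOM length), or (None, 0)."""
    for bom, name in BOMS:
        if data.startswith(bom):
            return name, len(bom)
    return None, 0


def declared_encoding(data: bytes):
    """Return the encoding named in the XML declaration, if any."""
    name, skip = bom_encoding(data)
    if name == "utf-16":
        head = data[:400].decode("utf-16", errors="replace")
    elif name == "utf-8":
        head = data[skip:400].decode("utf-8", errors="replace")
    else:
        head = data[:400].decode("ascii", errors="replace")
    match = DECL.match(head)
    return match.group(1) if match else None


def has_conflict(data: bytes) -> bool:
    """True when both a BOM and a declaration exist and they disagree."""
    name, _ = bom_encoding(data)
    declared = declared_encoding(data)
    return name is not None and declared is not None and declared.lower() != name
```

With this, a UTF-8 BOM followed by encoding="iso-8859-1" is flagged, while the same declaration without a BOM is not, since there is nothing external to conflict with in that sketch.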
be useful in the Western … ?
Cheers to the Russians’ hard work on liberating the Armenian keh!
Michel
PS It was a good joke, some didn’t get the memo
Leif Halvard Silli
Sounds like “Bush hid the facts”:
http://en.wikipedia.org/wiki/Bush_hid_the_facts
Per the charset decoding algorithm of HTML5, the charset label
'unicode' ought to be interpreted as synonymous with 'UTF-16'.
The baffling thing, per the same algorithm, is that if the HTML parser
sees the label
have HYPHEN-MINUS in its
NamesList?
--
Leif Halvard Silli
Markus Scherer, Fri, 22 Feb 2013 10:07:15 -0800:
I suggest you report this via http://www.unicode.org/reporting.html
With your encouragement, I have reported it. Thanks!
--
leif h silli
halvard silli
…
In constructing the example I realize that I had the wrong final 'm',
but that does not affect the point.
If there is more than one word, the order of words IS correct, but
the order of characters in each word is reversed.
--
leif halvard silli
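
To make the described effect concrete, here is a toy sketch that keeps the word order but reverses the characters inside each word. It is only an illustration of the symptom, not of how any renderer works, and the function name is made up. It reverses by code point, so combining marks would end up misplaced:

```python
def reverse_chars_per_word(text: str) -> str:
    """Keep the order of words, reverse the characters within each word."""
    return " ".join(word[::-1] for word in text.split(" "))


print(reverse_chars_per_word("abc def"))
```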
the file name during upload.
Leif Halvard Silli
Costello, Roger L. wrote:
Hi Folks,
The W3C recommends [1] text sent out over the Internet be in
Normalized Form C (NFC):
This document therefore chooses NFC as the
base for Web-related early normalization.
So why would one ever
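
For readers who want to see what NFC does in practice, a small sketch using Python's standard unicodedata module. The helper name to_nfc is mine; the sample string is an arbitrary choice:

```python
import unicodedata


def to_nfc(text: str) -> str:
    """Return the text in Normalization Form C."""
    return unicodedata.normalize("NFC", text)


decomposed = "e\u0301"          # 'e' followed by COMBINING ACUTE ACCENT
composed = to_nfc(decomposed)   # single precomposed code point U+00E9

print(len(decomposed), len(composed))
```

The two strings render the same, but a byte-wise or code-point-wise comparison only treats them as equal after both have been normalized to the same form.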
Jukka K. Korpela, Wed, 09 Jan 2013 11:03:28 +0200:
2013-01-09 2:55, Leif Halvard Silli wrote:
The benefit of doing such a comparison is that we then get to
count both the HTML page *plus* all the extra fonts that are included in
the romanized Singhala file. Thus, we get a more *real* basis
Singhala seems grossly exaggerated, if at all true, based as they are
on a test of two files which aren't actually equal when it comes to the
extra CSS stuff that they embed.
[1] http://en.wikipedia.org/wiki/Webarchive
[2] http://yslow.org/
--
leif halvard silli
Doug Ewell, Sun, 6 Jan 2013 20:57:58 -0700:
We are pretty much going round and round on this. The bottom line for
me is, it would be nice if there were a shorthand way of saying
big-endian UTF-16, and many people (including you?) feel that
UTF-16BE is that way, but it is not. That term has
,
which may not be stripped.
I believe I understand this reasonably well. I think we are looking for
a term is unaffected by how we label it.
leif halvard silli
Doug Ewell, Sun, 6 Jan 2013 17:58:38 -0700:
Leif Halvard Silli wrote:
I believe that even the U+FEFF *itself* is either UTF-32LE or UTF-32BE.
Thus, there is, per se, no implication of lack of byte-order mark in
Martin’s statement.
By definition, data in the UTF-nBE or UTF-nLE encoding
, then it is an umbrella label or a
macro label (hint: macro language) which covers the two *real*
encodings - UTF-32LE and UTF-32BE.
Just my 5 øre.
--
leif halvard silli
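
A short sketch of the byte-level point, using Python's codecs as a stand-in (how Python names and treats these encodings is Python's own convention, not a statement about the standard's terminology). U+FEFF serializes differently in the two byte orders, the explicit LE/BE codecs keep it as a character, and the umbrella "utf-32" codec consumes it as a signature:

```python
import codecs


def serialize_bom(encoding: str) -> bytes:
    """Encode the single character U+FEFF in the given encoding."""
    return "\ufeff".encode(encoding)


le = serialize_bom("utf-32-le")   # FF FE 00 00
be = serialize_bom("utf-32-be")   # 00 00 FE FF

data = le + "A".encode("utf-32-le")

# The explicit little-endian codec keeps U+FEFF as an ordinary character ...
kept = data.decode("utf-32-le")
# ... while the generic codec uses it to pick the byte order and strips it.
stripped = data.decode("utf-32")

print(le.hex(), be.hex(), repr(kept), repr(stripped))
```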
Asmus Freytag, Mon, 31 Dec 2012 06:44:44 -0800:
On 12/31/2012 3:27 AM, Leif Halvard Silli wrote:
Asmus Freytag, Sun, 30 Dec 2012 17:05:56 -0800:
The Web archive for this very list, needs a fix as well …
The way to formally request any action by the Unicode Consortium is
via the contact
Asmus Freytag, Sun, 30 Dec 2012 17:05:56 -0800:
On 12/30/2012 3:19 PM, Leif Halvard Silli wrote:
My feeling is that interoperability is getting better everywhere.
But one field which lags behind is e-mail. Especially Web archives of
e-mail (for instance, take the WHATwg.org’s web archive
that the tool defaults to UTF-8.)
The most productive approach is probably to file bugs against each and
every tool that doesn’t default to UTF-8.
--
leif halvard silli
#Russian
--
leif halvard silli
Leo Broukhis, Fri, 21 Dec 2012 08:57:11 -0800:
On Fri, Dec 21, 2012 at 4:56 AM, Leif Halvard Silli wrote:
You say that the difference is primary in the beginning of a word but
elsewhere secondary. And yes, that orthographic dictionary that you
link to above, looks as you describe.
However
Jukka K. Korpela, Fri, 21 Dec 2012 21:35:16 +0200:
2012-12-21 21:05, Leif Halvard Silli wrote:
My Moscow Russian-Norwegian from 1987 and my Pocket Oxford Russian
Dictionary from 2003 agree that both list words on Ё and Е under the
same category – namely, under the letter Е.
This appears
meaning.
[1] http://en.wikipedia.org/wiki/Vladimir_Dal
С праздником!
--
leif halvard silli
Leo Broukhis, Fri, 21 Dec 2012 13:43:14 -0800:
On Fri, Dec 21, 2012 at 1:08 PM, Leif Halvard Silli
xn--mlform-...@xn--mlform-iua.no wrote:
In «Tolkovïj slovar’ sovremennogo russkogo jazïka» from 2005
(«Dictionary of the contemporary Russian language»), has located words on Ё
in its a separate
Andreas Prilop, Thu, 20 Dec 2012 15:41:28 +0100 (CET):
On Thu, 20 Dec 2012, Jukka K. Korpela wrote:
http://www.ling.helsinki.fi/filt/info/mes2/
Unicode names have certain restrictions (capital ASCII letters, etc).
This Finnish list even uses non-ASCII characters but sticks to
capital
/xhtml1/#media
[2] http://www.w3.org/TR/xhtml1/#C_9
--
leif halvard silli
/html;
charset=utf-8 /)'
[1] http://www.w3.org/TR/xhtml1/#media
[2] http://www.w3.org/TR/xhtml1/#C_9
[3] http://www.w3.org/TR/xhtml11/xhtml11.html#strict
[4] http://www.w3.org/TR/xhtml-media-types/#text-html
[5] http://www.w3.org/TR/xhtml-media-types/#C_9
--
leif halvard silli
have notified the developers of
Unicorn about the shortcoming, asking them to issue a warning.
[1] http://www.w3.org/TR/xhtml-media-types/#C_1
[2] http://www.w3.org/TR/xhtml-media-types/#C_9
--
leif halvard silli
, is
not going to be met with enthusiasm.
--
leif halvard silli
Philippe Verdy, Thu, 29 Nov 2012 19:11:42 +0100:
2012/11/29 Leif Halvard Silli:
Philippe Verdy, Thu, 29 Nov 2012 16:27:13 +0100:
<?html version="5.0" encoding="utf-8"?>
Thus I can guarantee you that your idea about a method number 9 is
not going to be met with enthusiasm.
- Method 5 is where
resort, fall back to UTF-8.
--
leif halvard silli
.)
So, for what it is worth - and with reference to your pages, I filed a
bug against Firefox, to make it start to use the encoding declaration of
the XML prologue, when nothing else is available:
https://bugzilla.mozilla.org/show_bug.cgi?id=815279
--
leif halvard silli
also blame the history of how HTML developed.
--
leif halvard silli
Philippe Verdy, Wed, 28 Nov 2012 01:10:45 +0100:
2012/11/27 Leif Halvard Silli
The fact that XHTML 1 permits the XML prolog regardless how the
document is served, is just a shortcoming of the XHTML 1 specification.
No, it was by design. Making HTML an application of XML. Only XML
Philippe Verdy, Wed, 28 Nov 2012 04:23:10 +0100:
2012/11/28 Leif Halvard Silli xn--mlform-...@xn--mlform-iua.no
For
a new version of the validator, that asks more of those questions,
please try http://validator.w3.org/nu/ - it happens to for the most
part be developed by one of the Firefox
result
- then final syntax - becomes simple to understand, without too
complicated and convoluted rules.
Just my two cents, about how I see it.
--
leif halvard silli
for. Thus, I don't think
she meant to correct his language. So, she took a wild guess and
concluded that he did not ask about transliteration into a Cyrillic
alphabet, for instance.
--
leif halvard silli
Bill Poser, Wed, 5 Sep 2012 15:15:37 -0700:
It is also at least logically possible for there to be transliterations
from Semitic writing systems to non-Roman writing systems. I'm not aware of
such a thing, but one can imagine, for example, Russian work using a
Cyrillic-based transliteration.
. But it is a helpful explanation
of the terms, I think.
The word Roman, can also refer to Greek. So it is best to avoid
that term. ;-)
--
leif halvard silli
If you had added
border='1'
to the table elements of those pages, then the tables would also be
*human*-readable in simplistic Web browsers without support for CSS (or
with CSS disabled etc). Consider that a proposal.
--
Leif Halvard Silli
.
His overall approach is to map public script usage to language to
regions. If the map had tried to document religion related script
cultivation, then the map would have looked different - more diverse, and
with more 'dead scripts' coming alive.
--
leif halvard silli
Robert Wheelock, Tue, 21 Aug 2012 16:56:26 -0400:
On Tue, Aug 21, 2012 at 10:34 AM, Jonathan Rosenne wrote:
I can state that for Israel the scripts in common use are Hebrew, Latin
(mainly for English but also for several other languages), Arabic and
Cyrillic.
—Reply—
I do believe that
you select alternate stylesheets
then you can't change fonts quickly but only slowly :-)
Safari > Preferences > Appearance > Standard font
Or he can download iCab - http://www.icab.de - it lets one change
stylesheets.
--
Leif Halvard Silli
publication I have to make.
On my Norwegian Mac keyboard, I must type Option+Shift+A to get the ◊.
And the difficult shortcut is another indication that it is not used
very often.
--
leif halvard silli
Michael Everson, Mon, 13 Aug 2012 15:38:48 +0100:
On 13 Aug 2012, at 15:20, Leif Halvard Silli wrote:
Less so than the ƒ, but many of us learnt to use the ƒ for our folder
names.
I too learned to use the ƒ for folder names. But while I learned to do
it, I seldom did it as it had
. The modern German name for diamond
cards, Karo, goes back to Latin quadrum "quadrangle, square".
http://de.wikipedia.org/wiki/Karo_(Farbe)
--
Leif Halvard Silli
Otto Stolz, Mon, 13 Aug 2012 22:14:17 +0200:
On 2012-08-13 20:48, Leif Halvard Silli wrote:
Norwegian 'rute' may refer to a cell in a (data) table or in a square
board for chess. Such a 'rute' is of course a square. Perhaps German
'Raute' has a similar possibility of being interpreted
John W Kennedy, Wed, 18 Jul 2012 14:48:15 -0400:
On Jul 18, 2012, at 4:21 AM, Leif Halvard Silli wrote:
On my OS X 10.7 computer, TextEdit does sniff UTF-8 (without the
BOM).
It does indeed have a sniffing feature, though it also appears to use
the com.apple.TextEncoding extended
Martin J. Dürst, Wed, 18 Jul 2012 11:00:42 +0900:
On 2012/07/17 23:11, Leif Halvard Silli wrote:
Martin J. Dürst, Tue, 17 Jul 2012 18:49:47 +0900:
On 2012/07/17 17:22, Leif Halvard Silli wrote:
that a page with strict ASCII characters inside could still
contain character entities/references
Martin,
Martin J. Dürst, Wed, 18 Jul 2012 10:05:40 +0900:
On 2012/07/18 4:35, Leif Halvard Silli wrote:
But is the Windows Notepad really to blame?
Pretty much so. There may have been other products from Microsoft
that also did it, but with respect to forcing browsers and XML
parsers
Martin J. Dürst, Wed, 18 Jul 2012 17:20:31 +0900:
On 2012/07/18 16:35, Leif Halvard Silli wrote:
Martin J. Dürst, Wed, 18 Jul 2012 11:00:42 +0900:
The best reason is simply that nobody should be using
crutches as long as they can walk with their own legs.
Crutches, in that sense, is only
Steven Atreju, Wed, 18 Jul 2012 13:40:30 +0200:
Except that the internet is almost unusable without cookies
and scripting, lynx(1) works very well, too, if the ncursesw
library is linked against (and the terminal font supports
Unicode characters). Funny that it writes garbage for
Philippe Verdy, Tue, 17 Jul 2012 03:40:37 +0200:
2012/7/16 Leif Halvard Silli:
HTML5:
(ASCII is considered now an alias of Windows-1252, also for
compatibility reasons, even if strict US-ASCII resources could be
interpreted without changes as UTF-8)
I agree that HTML5 ought to ask UAs
Martin J. Dürst, Tue, 17 Jul 2012 18:49:47 +0900:
On 2012/07/17 17:22, Leif Halvard Silli wrote:
And an argument was put forward in the WHATWG mailinglist
earlier this year/end of previous year, that a page with strict ASCII
characters inside could still contain character entities/references
Jukka K. Korpela, Tue, 17 Jul 2012 17:31:46 +0300:
2012-07-17 17:11, Leif Halvard Silli wrote:
For instance, early on in 'the Web', some appeared to think
that all non-ASCII had to be represented as entities.
Yes indeed. There's still some such stuff around. It's mostly
unnecessary
Hi Martin,
Martin J. Dürst, Tue, 17 Jul 2012 19:02:01 +0900:
On 2012/07/13 20:44, Leif Halvard Silli wrote:
Martin J. Dürst, Fri, 13 Jul 2012 18:17:05 +0900:
On 2012/07/13 0:12, Leif Halvard Silli wrote:
Doug Ewell, Wed, 11 Jul 2012 09:12:46 -0600:
HTML5-parsers MUST support UTF-8. They do
Steven Atreju, Mon, 16 Jul 2012 13:35:04 +0200:
Doug Ewell d...@ewellic.org wrote:
And:
Q: Is the UTF-8 encoding scheme the same irrespective of whether
the underlying processor is little endian or big endian?
...
Where a BOM is used with UTF-8, it is only used as an encoding
in their
back-compatibility mantra.)
--
Leif Halvard Silli
Leif Halvard Silli, Fri, 13 Jul 2012 13:44:42 +0200:
I do at least not think that user agents that
want to be conforming pre-HTML5 user agents have any justification for
ignoring the BOM.
* The effect of the BOM - as encoding signature - is not discussed
anywhere in HTML4 or in the 'text
their thing well
can be plugged together to achieve complex tasks. Unicode is
very, very important. Really.
In the future simple things like '$ cat File1 File2 File3' will
no longer work that easily.
I guess you get the same problem with UTF-16 files also, then?
--
Leif Halvard Silli
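
A small sketch of the '$ cat File1 File2 File3' concern, with byte strings standing in for the files. The two helper names are invented for this illustration. A plain byte-wise join leaves the second file's signature in the middle of the stream, where a decoder sees it as the character U+FEFF rather than as a signature:

```python
import codecs


def concatenate(chunks):
    """Byte-wise concatenation, the way cat joins its input files."""
    return b"".join(chunks)


def concatenate_stripping_boms(chunks):
    """Drop a leading UTF-8 signature from every chunk before joining."""
    cleaned = []
    for chunk in chunks:
        if chunk.startswith(codecs.BOM_UTF8):
            chunk = chunk[len(codecs.BOM_UTF8):]
        cleaned.append(chunk)
    return b"".join(cleaned)


files = [codecs.BOM_UTF8 + b"one\n", codecs.BOM_UTF8 + b"two\n"]

# 'utf-8-sig' removes only the signature at the very start of the data.
naive = concatenate(files).decode("utf-8-sig")
clean = concatenate_stripping_boms(files).decode("utf-8")

print(repr(naive), repr(clean))
```

The same thing would presumably happen with BOM-prefixed UTF-16 files, with the added risk that the files differ in byte order.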
Naena Guru, Tue, 10 Jul 2012 01:40:19 -0500:
As I said, I use HTML-Kit (and Tools).
Your problem appears to be that HTML-Kit does not directly support
UTF-8. But are you aware that you can still work with UTF-8 with it?
You only need to use UnicodePad in the Unicode menu of the Tools menu,
Doug Ewell, Wed, 11 Jul 2012 09:12:46 -0600:
and people who want to create or modify UTF-8 files which will
be consumed by a process that is intolerant of the signature
should not use Notepad. That goes for HTML (pre-5) pages [snip]
HTML5-parsers MUST support UTF-8. They do not need to
Philippe Verdy, Wed, 11 Jul 2012 07:36:56 +0200:
2012/7/11 Leif Halvard Silli:
In VIM, you set or unset the BOM via the commands
set bomb
set nobomb
Should these commands specify if your computer will explode when saving
the file ?
:'o
Probably signals the weird fear
copy of a classic Norwegian book from 1971 on time reckoning, calendar
etc,[1] and he used both the – and the ÷ as minus, but predominantly
the ÷, it seems.
[1] http://books.google.no/books/about/?id=kHgyQwAACAAJ
--
Leif Halvard Silli
Leif Halvard Silli, Wed, 11 Jul 2012 03:01:53 +0200:
Btw, the venerable Danish Salomonsens conversational encyclopedia, the
1924 edition, says that subtraction, quote: is written a – b or a ÷
b, where the – and the ÷ are called the minus sign. [7] So it sounds as
if it saw it as shapes
Philippe Verdy, Wed, 11 Jul 2012 14:15:39 +0200:
2012/7/11 Jean-François Colson j...@colson.eu
If your document only contains
<?php
header("location:http://unicode.org");
?>
but you save it with a BOM, the BOM will be sent and you’ll get an
error message like
Warning: Cannot modify
be needed?
--
Leif Halvard Silli
Philippe Verdy, Tue, 10 Jul 2012 13:50:03 +0200:
2012/7/10 Leif Halvard Silli:
Asmus Freytag, Mon, 09 Jul 2012 19:32:47 -0700:
The European use (this is not limited to Scandinavia)
Thanks. It seems to me that this tradition is not without a link
to the (also) European tradition
Jukka K. Korpela, Tue, 10 Jul 2012 15:52:27 +0300:
2012-07-10 15:33, Andrew West wrote:
On 10 July 2012 11:50, Leif Halvard Silli wrote:
My candidate characters, this round, are:
DIVISION SIGN (÷) as minus sign.
COLON (:) as division sign.
MIDDLE DOT
Hans Aberg, Tue, 10 Jul 2012 22:41:26 +0200:
On 10 Jul 2012, at 21:30, Asmus Freytag wrote:
On 7/10/2012 3:50 AM, Leif Halvard Silli wrote:
Asmus Freytag, Mon, 09 Jul 2012 19:32:47 -0700:
The European use (this is not limited to Scandinavia)
Thanks. It seems to me that this tradition
Mark E. Shoulson on Tue, 10 Jul 2012 21:22:36 -0400:
I'd not heard of using it as a subtraction symbol before, but it feels
to me like someone thought that the normal minus sign was too confusable
with an ordinary hyphen or something, maybe in a mixed presentation with
ordinary text and
Asmus Freytag, Tue, 10 Jul 2012 15:22:32 -0700:
Here's my summary of the annotations that we've been discussing so far:
General: I'm OK with the 'preferred' word. I don't think it spreads
'guilt' to say so. E.g. I know that «» and “” are preferred, but I use
e.g. because I once heard an
Naena Guru, Tue, 10 Jul 2012 01:40:19 -0500:
HTML5 assumes UTF-8 as the character set if you do not declare one
explicitly. My current pages are in HTML 4.
There is in principle no difference between what HTML5-parsers assume
and what HTML4-parsers assume: All of them default to the default
)#Notation
--
Leif halvard silli
Hi David and Jukka,
Jukka K. Korpela, Mon, 09 Jul 2012 10:04:08 +0300:
2012-07-09 8:19, Leif Halvard Silli wrote:
Thanks for letting me know that the '÷' is used as a minus in the Finnish
context too. I'm sure there is some interesting story around this ...
Btw, I can say that when using
Jukka K. Korpela, Mon, 09 Jul 2012 15:14:56 +0300:
2012-07-09 11:39, Leif Halvard Silli wrote:
In practice, it’s always a symbol of division in calculators.
It wasn't always like that. Take the Danish Contex calculators:
* Contex mechanical calculator from the 1960s
using 'Div' instead