in the future?
--
David Starner - [EMAIL PROTECTED]
http/ftp: dvdeug.dhis.org
I knew all of the floors in my high school, and none of the ceilings.
- Chris Painter
-
Linux-UTF8: i18n of Linux on all levels
Archive: http://mail.nl.linux.org/lists/
;
that is, treat U+2060..U+2069 and U+E..U+E1000 as Cf, and everything
else unassigned as printing.
--
David Starner - [EMAIL PROTECTED]
http/ftp: dvdeug.dhis.org
And crawling, on the planet's face, some insects called the human race.
Lost in space, lost in time, and meaning.
-- RHPS
.
What size is that word ?
In order to hold all the bytes of a maximum-size UTF-8 sequence, they
need an 8-byte word.
Um, 6 if you need characters beyond 16#10#, 4 if you don't.
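To make the byte counts concrete, here is a minimal sketch of how long a UTF-8 sequence gets for a given code point. The function name utf8_length is made up for illustration; the ranges are the ones from the original 31-bit UTF-8 definition, where only values above 0x10FFFF need the 5- and 6-byte forms.

```python
def utf8_length(code_point: int) -> int:
    """Return the number of bytes UTF-8 needs for one code point.

    Values above 0x10FFFF fall back to the 5- and 6-byte forms of the
    original 31-bit definition; current Unicode never needs more than 4.
    """
    if code_point < 0:
        raise ValueError("code point must be non-negative")
    # Largest value representable by a sequence of 1, 2, ... 6 bytes.
    limits = [0x7F, 0x7FF, 0xFFFF, 0x1FFFFF, 0x3FFFFFF, 0x7FFFFFFF]
    for length, limit in enumerate(limits, start=1):
        if code_point <= limit:
            return length
    raise ValueError("value does not fit in 31 bits")
```

So a 4-byte buffer is enough for anything up to U+10FFFF, and a 6-byte buffer covers the full 31-bit range; an 8-byte word is only a convenience for alignment.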
--
David Starner - [EMAIL PROTECTED]
http://dvdeug.dhis.org
If you wish to strive for peace of soul then believe
for English and Fraktur for German as if
they were written in different scripts.
--
David Starner - [EMAIL PROTECTED]
Pointless website: http://dvdeug.dhis.org
"I don't care if Bill personally has my name and reads my email and
laughs at me. In fact, I'd be rather honored." - Joseph_Greg
-
entry comfortably are available.
Ack! Why on earth would Project Gutenberg use ISO 6429? If you want
richtext, use HTML. It's easy to write, can be viewed on far more
platforms, and can be read as plain text without problems.
--
David Starner - [EMAIL PROTECTED]
Pointless website: http
does this relate to w3mmee?
--
David Starner - [EMAIL PROTECTED]
Pointless website: http://dvdeug.dhis.org
"I don't care if Bill personally has my name and reads my email and
laughs at me. In fact, I'd be rather honored." - Joseph_Greg
GNU
Unifont (at least the bdf version) into single and double
width fonts?
--
David Starner - [EMAIL PROTECTED]
Pointless website: http://dvdeug.dhis.org
"I don't care if Bill personally has my name and reads my email and
laughs at me. In fact, I'd be rather honored." - Joseph_Greg
-
claim those) aren't going to give any hint that you
received a Unicode file rather than a SJIS file - it will just
convert it to whatever encoding is convenient (probably Unicode,
on Windows) and display it.
--
David Starner - [EMAIL PROTECTED]
Pointless website: http://dvdeug.dhis.org
"I
On Tue, Apr 10, 2001 at 02:41:12PM +0200, Bruno Haible wrote:
Now all that will remain to be done is to fix 'more' and 'less' to
correctly display the resulting UTF-8 encoded output.
What has to be done with less? If LESSCHARSET=utf-8, it should handle
UTF-8 correctly already.
--
David Starner
annot handle Unicode text.
Are you sure? I know Windows NT and I believe Windows ME (98?) can
handle Unicode in notepad and wordpad just fine. (
--
David Starner - [EMAIL PROTECTED]
Pointless website: http://dvdeug.dhis.org
"I don't care if Bill personally has my name and reads my email a
and expect them to be
translated to koi8-r, or carry both, or what?
--
David Starner - [EMAIL PROTECTED]
Pointless website: http://dvdeug.dhis.org
"I don't care if Bill personally has my name and reads my email and
laughs at me. In fact, I'd be rather honored." - Joseph_Greg
ion has gotten even worse."
(page 7). Later he explains how the decreasing quality of The Art of
Computer Programming's typesetting got him to design TeX.
--
David Starner - [EMAIL PROTECTED]
Pointless website: http://dvdeug.dhis.org
"I don't care if Bill personally has my name and r
for the EURO SIGN
will look somewhat ridiculous for EU customers by the end of this year.
Well, Netscape 4.77 doesn't get it right, Konqueror does it a little better
(but ignores the HTTP headers) and Mozilla gets it perfect (if only it would
render other stuff a little better on my system.)
--
David
broader scope than Unicode and is (at least in the context of
communication with ISO 6429 terminals) the preferred reference.
Are you sure this isn't a deviation of ISO 10646? I thought they
removed the 5 and 6-byte UTF-8 sequences in the latest stuff.
--
David Starner - [EMAIL PROTECTED
their business. Anything external that claims to be UTF-8 should be valid
UTF-8, so any program that handles UTF-8 can handle it, even if we're
just talking wordwrapping or normalization.
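As a rough illustration of what "valid UTF-8" means at the boundary of a program, the sketch below checks a byte string with a strict decoder before anything else touches it. The helper name is_valid_utf8 is hypothetical; it leans on Python's built-in strict UTF-8 codec, which rejects overlong forms, encoded surrogates, and the old 5- and 6-byte sequences.

```python
def is_valid_utf8(data: bytes) -> bool:
    """Return True only if data is well-formed UTF-8."""
    try:
        # The strict decoder raises on any malformed sequence.
        data.decode("utf-8", errors="strict")
    except UnicodeDecodeError:
        return False
    return True


# A few inputs that a lenient decoder might let through:
overlong_nul = b"\xc0\x80"            # overlong encoding of U+0000
lone_surrogate = b"\xed\xa0\x80"      # U+D800 encoded directly
five_byte_form = b"\xf8\x88\x80\x80\x80"
```

A program that validates once on input can then hand the text to word wrapping or normalization code without each of those having to defend against garbage.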
--
David Starner [EMAIL PROTECTED]
copyright owner. If Juliusz and the maintainer of screen wanted better
integration, I would be surprised if the FSF was the obstructing factor.
Of course, it's entirely possible that Juliusz or the screen maintainer is
not interested in such a project. Then this is a moot point.
--
David Starner
If I recall correctly, an international treaty on copyright states
that a citizen of country X gets the same rights in country Y as a
citizen of country Y, so it doesn't make any difference that you're an
American. Your work won't be in the public domain everywhere unless
you say so.
Fine.
The new release can be found at
http://people.debian.org/unifont-dvdeug-1.0.tar.gz
Try http://people.debian.org/~dvdeug/unifont-dvdeug-1.0.tar.gz instead.
--
David Starner - [EMAIL PROTECTED], [EMAIL PROTECTED]
The pig - belongs - to all mankind! - Invader Zim
Bruno Haible [EMAIL PROTECTED] writes:
David Starner writes:
Btw, what is the license of the unifont? Is it suitable for inclusion
in XFree86?
License was included:
License:
All of my works you find here are freeware. ...
It's not clear whether this license covers only your
.
--
David Starner - [EMAIL PROTECTED]
The pig -- belongs -- to _all_ mankind! - Invader Zim
Latin-script
the same way, or at least choose between a standard, language-neutral
'correct' sort and an efficient sort.
--
David Starner - [EMAIL PROTECTED]
The pig -- belongs -- to _all_ mankind! - Invader Zim
From: Florian Weimer [EMAIL PROTECTED]
I don't think it's worth the trouble. Collation will always be
confusing if you're mixing two or more languages.
I realize that, but we should minimize confusion. If there is no national
standard (formal or informal) on sorting of certain characters,
The charsets (7) page was a little outdated, and contained several errors.
So I forwarded a new copy to the Debian maintainer, who should send it
upstream. That copy is attached for your use or edification.
--
David Starner - [EMAIL PROTECTED]
The pig -- belongs -- to _all_ mankind! - Invader Zim
+1, U+10001, etc., not U+D800 U+DC00, U+D800 U+DC01,
etc.
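A small sketch of the distinction, assuming the point is that characters beyond the BMP are single code points and that the D800/DC00 pairs are only a UTF-16 storage detail. The function name utf16_surrogates is made up for illustration.

```python
def utf16_surrogates(code_point: int) -> tuple[int, int]:
    """Split a supplementary code point into its UTF-16 surrogate pair."""
    if not 0x10000 <= code_point <= 0x10FFFF:
        raise ValueError("only U+10000..U+10FFFF use surrogate pairs")
    offset = code_point - 0x10000
    high = 0xD800 + (offset >> 10)    # top 10 bits
    low = 0xDC00 + (offset & 0x3FF)   # bottom 10 bits
    return high, low


# UTF-8 encodes the code point itself, in one 4-byte sequence,
# rather than encoding the two surrogates separately.
utf8_of_u10000 = chr(0x10000).encode("utf-8")
```

Here utf16_surrogates(0x10001) gives (0xD800, 0xDC01), while the character itself is still named U+10001 everywhere outside UTF-16.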
--
David Starner - [EMAIL PROTECTED]
Pointless website: http://dvdeug.dhis.org
I saw a daemon stare into my face, and an angel touch my breast; each
one softly calls my name . . . the daemon scares me less.
- Disciple, Stuart Davis
clarify, but I don't understand where
you're coming from at all.
--
David Starner - [EMAIL PROTECTED]
Pointless website: http://dvdeug.dhis.org
I saw a daemon stare into my face, and an angel touch my breast; each
one softly calls my name . . . the daemon scares me less.
- Disciple, Stuart Davis
the file up with garbage.
Again, there are two bugs here. If you won't fix the first one, please
realize that I can't open up a UTF-8 file by using the encoded-text
dialog, either.
--
David Starner - [EMAIL PROTECTED]
Pointless website: http://dvdeug.dhis.org
I saw a daemon stare into my face
shell scripts that spend 90% of their time in fork().
CPU time is cheap. Rewriting shell scripts in another language has
always been a good means of optimization, but for run-once or even
run-rarely, shell is fine.
--
David Starner - [EMAIL PROTECTED]
Pointless website: http://dvdeug.dhis.org
I
Unix's to adopt BSD versions of tools.)
--
David Starner - [EMAIL PROTECTED]
Pointless website: http://dvdeug.dhis.org
I saw a daemon stare into my face, and an angel touch my breast; each
one softly calls my name . . . the daemon scares me less.
- Disciple, Stuart Davis
should support, and a consensus, then we can start
claiming they're broken and fixing them.
--
David Starner - [EMAIL PROTECTED]
Pointless website: http://dvdeug.dhis.org
I saw a daemon stare into my face, and an angel touch my breast; each
one softly calls my name . . . the daemon scares me
. Maybe it's a half solution; but a
half solution covers a whole lot more problems than a ten percent
solution. A 100 percent solution won't be here until at least Buhid and
Shavian are part of Unicode; need we wait until then to start building
Unicode tools?
--
David Starner - [EMAIL PROTECTED
. (I think Debian woody upgraded to the
beta mutt at some point.)
--
David Starner - [EMAIL PROTECTED], ICQ #61271672
Pointless website: http://dvdeug.dhis.org
When the aliens come, when the deathrays hum, when the bombers bomb,
we'll still be freakin' friends. - Freakin' Friends
.)
ISO646-DE users did it. So did ISO646-DK, ISO646-ES and all the rest of
the 7-bit codes. Why is it so different for CP932?
--
David Starner - [EMAIL PROTECTED], dvdeug/jabber.com (Jabber)
Pointless website: http://dvdeug.dhis.org
When the aliens come, when the deathrays hum, when the bombers bomb
On Sun, Jan 13, 2002 at 08:43:40PM -0500, Glenn Maynard wrote:
On Sun, Jan 13, 2002 at 06:06:11PM -0600, David Starner wrote:
Because that's not portable. Read
http://www.debian.or.jp/~kubota/unicode-symbols.html.
I know the problem. It still doesn't mean that every file format
have read them at all.
--
David Starner - [EMAIL PROTECTED], dvdeug/jabber.com (Jabber)
Pointless website: http://dvdeug.dhis.org
When the aliens come, when the deathrays hum, when the bombers bomb,
we'll still be freakin' friends. - Freakin' Friends
standard, in
the same way that Perl has a standard - it's called glibc. If you feel
compelled to write a formal standard, you have to write one that defines
what the standard implementation does.
--
David Starner - [EMAIL PROTECTED], dvdeug/jabber.com (Jabber)
Pointless website: http://dvdeug.dhis.org
the whole herd of cats thing.
David == David Starner [EMAIL PROTECTED] writes:
David Why all the IBM code pages? glibc currently supports two -
David 1251 (be_BY, bg_BG) and 1255 (yi_US).
What do you mean by support? For code pages, I would say iconv is
the relevant functionality.
I
alphabet has 26 letters, but English does not have 26 alphabets.
--
David Starner - [EMAIL PROTECTED], dvdeug/jabber.com (Jabber)
Pointless website: http://dvdeug.dhis.org
What we've got is a blue-light special on truth. It's the hottest thing
with the youth. -- Information Society, Peace and Love, inc
-R for KOI8-R, SHIFTJIS for Shift_JIS, CP-437 for
CP437, and a bunch more along that line.
Hmm, more painful than I thought at first glance. Gratuitous differences
all over the place.
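One way to paper over spelling differences like these is to compare charset names loosely, ignoring case and punctuation. This is only a sketch: charset_key is a hypothetical helper, not what glibc or iconv actually does, and it will not help where two systems use genuinely different names for the same charset.

```python
import re


def charset_key(name: str) -> str:
    """Reduce a charset name to lowercase letters and digits only."""
    return re.sub(r"[^0-9a-z]", "", name.lower())


def same_charset_name(a: str, b: str) -> bool:
    """Loose comparison: 'CP-437' and 'CP437' compare equal."""
    return charset_key(a) == charset_key(b)
```

With this, SHIFTJIS matches Shift_JIS and CP-437 matches CP437, which covers the purely cosmetic part of the mismatch.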
--
David Starner - [EMAIL PROTECTED], dvdeug/jabber.com (Jabber)
Pointless website: http://dvdeug.dhis.org
What
and Rosetta to be interesting, but I can't even
say that about Bytext. Maybe after some serious editing, some
interesting ideas might surface, but the complexity would still make it
unusable.
--
David Starner - [EMAIL PROTECTED], dvdeug/jabber.com (Jabber)
Pointless website: http://dvdeug.dhis.org
bytewise in Bytext?
--
David Starner - [EMAIL PROTECTED], dvdeug/jabber.com (Jabber)
Pointless website: http://dvdeug.dhis.org
What we've got is a blue-light special on truth. It's the hottest thing
with the youth. -- Information Society, Peace and Love, Inc.
:
FEFF13B013B50020...
How about an example? Say, ᎰᎵ hat Musik gut gehört. What does that
look like bytewise in Bytext?
After reading the Bytext standard three times, I still don't know how to
encode that in Bytext.
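For comparison, the same sentence is easy to dump bytewise in an existing Unicode encoding. The sketch below assumes the quoted hex is big-endian UTF-16 with a leading byte order mark, and that the two Cherokee letters are U+13B0 and U+13B5; utf16_hex is a made-up helper name. It says nothing about what Bytext would produce.

```python
# The example sentence, with the Cherokee letters written as escapes.
SENTENCE = "\u13B0\u13B5 hat Musik gut geh\u00F6rt."


def utf16_hex(text: str) -> str:
    """Hex dump of text as big-endian UTF-16, prefixed with a BOM."""
    return ("\ufeff" + text).encode("utf-16-be").hex().upper()


dump = utf16_hex(SENTENCE)
```

The dump starts with FEFF13B013B50020, that is, the BOM, the two Cherokee letters, and a space, each as one 16-bit unit.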
--
David Starner - [EMAIL PROTECTED], dvdeug/jabber.com (Jabber)
Pointless website
when my language was being implemented for
computers. They asked people who used and implemented RTL scripts.
As was mentioned on the Unicode list, a reversible BIDI algorithm is
not feasible with explicit BIDI markers, which some scripts that can be
written both ways need.
--
David Starner
and RTLba/RTL both render to ab.)
--
David Starner / Давид Старнэр - [EMAIL PROTECTED]
Pointless website: http://dvdeug.dhis.org
What we've got is a blue-light special on truth. It's the hottest thing
with the youth. -- Information Society, Peace and Love, Inc.
, but it's a lot better than ISO 2022.
So as I said at first glance there seemed to be a contradiction
because some people did accept there is a problem:
On Tue, 5 Feb 2002 David Starner [EMAIL PROTECTED] wrote
Why would that fix the problem?
But this morning - as I was watching on TV that the US
of people will be using curved quotes - another thing it's bloody
impossible to get from the keyboard.
--
David Starner / Давид Старнэр - [EMAIL PROTECTED]
What we've got is a blue-light special on truth. It's the hottest thing
with the youth. -- Information Society, Peace and Love, Inc
. :)
No. I'd get rid of the neutral quotes, the apostrophe and backtick. I
don't know about everyone else, but I could live with switching between
a programmer's/Unix keyboard, with #'`~^*_\/| on it and one that has,
say, curved quotes, Euro, dead keys for French and German, and daggers.
--
David
subset of human communication. It was provided to Debian under
the GPL, but the upstream author doesn't include a license upstream. (I
emailed him personally to get a license.) If anyone else wants to take a
look at it, here it is.
--
David Starner / Давид Старнэр - [EMAIL PROTECTED]
What we've got
adapt to this - the original single font is wrong, since
the charcell X fonts should be monowidth, not biwidth. The two font
method works right with xterm, and the proportional fonts the new
release builds work right with yudit.
--
David Starner - [EMAIL PROTECTED]
It's not a habit; it's cool; I
On Tue, Mar 05, 2002 at 04:02:15PM -0600, David Starner wrote:
I have made a new release of the GNU Unifont, version 2. It includes a
much improved Arabic block, Syriac and Bopomofo, as well as various
other improvements. It makes yet another attempt to get the whole X font
thing right
may
question _their_ usability.
--
David Starner - [EMAIL PROTECTED]
It's not a habit; it's cool; I feel alive.
If you don't have it you're on the other side.
- K's Choice (probably referring to the Internet)
, and any
explanation should include that point. It's certainly no better to
encourage everyone to use en_GB.UTF-8 than to use en_US.UTF-8.
--
David Starner - [EMAIL PROTECTED]
It's not a habit; it's cool; I feel alive.
If you don't have it you're on the other side.
- K's Choice (probably referring
, letter or A4, ignoring the possibility that someone might
have, say, legal size paper loaded.
--
David Starner - [EMAIL PROTECTED]
It's not a habit; it's cool; I feel alive.
If you don't have it you're on the other side.
- K's Choice (probably referring to the Internet)
On Tue, Apr 30, 2002 at 11:09:55PM -0400, Jungshik Shin wrote:
However, to me overriding the default at the command line is a perfectly
good solution.
Every time you use a program? Stuff like that gets real tiring, real fast
to me.
--
David Starner - [EMAIL PROTECTED]
It's not a habit; it's
for a
problem, I know where to find it.
The paper situation is even worse. I see no benefits accruing to me from
switching paper sizes; and I don't want to switch out all my notebooks
or end up mixing letter sized and A4 sized paper.
--
David Starner - [EMAIL PROTECTED]
It's not a habit; it's cool; I
with hundreds of charsets which results in corrupted data. I'd have
done it years ago had it not been for all the complaints from people
with archaic mailreaders.
Just tell them that they have to upgrade. The world needs trailblazers
for any new standard to be accepted.
--
David Starner - [EMAIL PROTECTED
to
displaying that charset properly, which on the console means converting
it to the locale charset. A webbrowser that didn't do that would
certainly be considered broken.
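A minimal sketch of that conversion step, assuming the incoming data carries a declared charset and the output should be in the locale charset. to_locale_charset is a hypothetical helper; characters the target charset cannot represent are replaced with "?" rather than raising an error.

```python
import locale
from typing import Optional


def to_locale_charset(data: bytes, source_charset: str,
                      target_charset: Optional[str] = None) -> bytes:
    """Re-encode data from its declared charset into the locale charset."""
    if target_charset is None:
        # Ask the C library which charset the current locale uses.
        target_charset = locale.getpreferredencoding(False)
    text = data.decode(source_charset)
    return text.encode(target_charset, errors="replace")
```

For example, UTF-8 input shown on a koi8-r console would go through to_locale_charset(data, "utf-8", "koi8-r") before being written out.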
--
David Starner - [EMAIL PROTECTED]
What we've got is a blue-light special on truth. It's the hottest thing
with the youth
clear on what you need to handle how. Make sure
that wordinspect handles the standard right, and let dict and dictd deal
with their own problems.
--
David Starner - [EMAIL PROTECTED]
What we've got is a blue-light special on truth. It's the hottest thing
with the youth. -- Information Society
On Thu, Oct 10, 2002 at 10:25:36AM +0400, Vadim Plessky wrote:
On Monday 07 October 2002 9:18 am, David Starner wrote:
|
| There are some rather good fonts available *for free* for
| Latin+Greek+Cyrillic alphabet.
|
| Which is but a small subset of Unicode. That doesn't cover the 1200
are the worst; how do you reliably convert “wait 'till they are clear,
and run them 'thro a strainer”? I’ve seen a number of errors from
Microsoft systems automatically doing the conversion, and it’s too
closely related to the hard AI problem for me to think we’ll do much
better.
--
David Starner
accessible on a UK keyboard. But all these can only be considered to be
guru hacks.
Okay, what are your guru hacks?
--
David Starner - [EMAIL PROTECTED]
Great is the battle-god, great, and his kingdom--
A field where a thousand corpses lie.
-- Stephen Crane, War is Kind
in a document which wishes to show the
difference between English and German conventions of the 1920's. Does that
mean that Fraktur and Antiqua should have been encoded separately?
David Starner - [EMAIL PROTECTED]
([EMAIL PROTECTED] may be disappearing soon - [EMAIL PROTECTED] will work
. Fonts for one culture may not be considered beautiful or even very
readable to another.
David Starner - [EMAIL PROTECTED]
([EMAIL PROTECTED] may be disappearing soon - [EMAIL PROTECTED] will work,
but is not suitable for high-volume traffic.)
(available online) describes the separation
of the glyphs in a 3 dimensional array, with the Z-axis being typeface.
David Starner - [EMAIL PROTECTED]
([EMAIL PROTECTED] may be disappearing soon - [EMAIL PROTECTED] will work,
but is not suitable for high-volume traffic.)
On 8/17/06, Rich Felker [EMAIL PROTECTED] wrote:
This is nothing but glibc being idiotic. Yes it's _allowed_ to do this
according to POSIX (POSIX makes no requirements about correspondence
of the values returned to any other standard) but it's obviously
incorrect for the width of À to be
On 9/1/06, Rich Felker [EMAIL PROTECTED] wrote:
IMO the answer is common sense. Languages that have a low information
per character density (lots of letters/marks per word, especially
Indic) should be in 2-byte range and those with high information
density (especially ideographic) should be in
On 9/5/06, Rich Felker [EMAIL PROTECTED] wrote:
On Mon, Sep 04, 2006 at 11:44:26PM -0500, David Starner wrote:
Once you compress the data with a decent compression scheme, you may
as well store the data by writing out the full Unicode name (e.g.
LATIN CAPITAL LETTER OU); the final result
On 3/27/07, Rich Felker [EMAIL PROTECTED] wrote:
This is not a simple task at all, and in fact it's a task that a
computer should (almost) never do...
Of course. Why shouldn't an editor go through and change 257 headings
to titlecase by hand? Humans are known for their abilities to do such
On 3/27/07, Rich Felker [EMAIL PROTECTED] wrote:
On Tue, Mar 27, 2007 at 06:44:42PM -0500, David Starner wrote:
On 3/27/07, Rich Felker [EMAIL PROTECTED] wrote:
This is one of the very few
places where a computer should ever perform case mappings: in a
powerful editor or word processor
Just
On 3/28/07, Rich Felker [EMAIL PROTECTED] wrote:
Matching equivalence classes (including case and
other equivalences) is trivial and mostly language-independent. Case
mapping is ugly (think German SS/ß) and language-dependent (think
Turkish I/ı and İ/i).
In Turkish, I and i should be in
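The difference between matching and mapping can be seen with Python's built-in string methods, which apply the default Unicode case rules with no language tailoring. This is only a sketch; same_ignoring_case is a made-up name, and the Turkish behaviour shown is the untailored default, not what a Turkish-aware mapping would do.

```python
def same_ignoring_case(a: str, b: str) -> bool:
    """Caseless match using Unicode case folding."""
    return a.casefold() == b.casefold()


# Mapping is lossy: sharp s uppercases to two letters and does not
# come back when lowercased again.
upper_sharp_s = "\u00df".upper()            # sharp s
round_trip = "stra\u00dfe".upper().lower()

# Without Turkish tailoring, dotless i uppercases to plain I, and
# plain i also uppercases to plain I.
upper_dotless_i = "\u0131".upper()
upper_plain_i = "i".upper()
```

Folding lets "Straße" and "STRASSE" compare equal without ever deciding which spelling is the right uppercase form, while the mapping itself loses the distinction and needs to know the language.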
On 3/30/07, Rich Felker [EMAIL PROTECTED] wrote:
My point was that, had the mistake of introducing ISO-8859 support not
been made (i.e. if bytes 128-255 had remained considered as
unprintable at the time), there would have been both much more
incentive to get UTF-8 working quickly, and much less