[conlang] Digest Number 4802

conlang Tue, 10 Jan 2006 01:57:25 -0800

There are 25 messages in this issue.

Topics in this digest:

      1. Re: OT coins and currency
           From: R A Brown <[EMAIL PROTECTED]>
      2. Re: Conlang Wiki
           From: "Jonathyn Bet'nct" <[EMAIL PROTECTED]>
      3. Re: OT coins and currency
           From: Andreas Johansson <[EMAIL PROTECTED]>
      4. Conlangs flag in actual cloth - last reminder (deadline is Sat. 14th)
           From: Sai Emrys <[EMAIL PROTECTED]>
      5. Conlang flag in actual cloth - final colors?
           From: Sai Emrys <[EMAIL PROTECTED]>
      6. Re: OT: Unicode 5.0
           From: John Vertical <[EMAIL PROTECTED]>
      7. Re: OT: Unicode 5.0
           From: "Jonathyn Bet'nct" <[EMAIL PROTECTED]>
      8. Re: OT: Unicode 5.0
           From: Andreas Johansson <[EMAIL PROTECTED]>
      9. Re: OT: Unicode 5.0
           From: Paul Bennett <[EMAIL PROTECTED]>
     10. Re: OT: Unicode 5.0
           From: Tim May <[EMAIL PROTECTED]>
     11. Re: OT: Unicode 5.0
           From: John Vertical <[EMAIL PROTECTED]>
     12. Re: [Theory] Types of numerals
           From: "Ph.D." <[EMAIL PROTECTED]>
     13. Re: OT: Unicode 5.0
           From: [EMAIL PROTECTED]
     14. Re: OT: Unicode 5.0
           From: Tim May <[EMAIL PROTECTED]>
     15. Re: OT: Unicode 5.0
           From: Tristan McLeay <[EMAIL PROTECTED]>
     16. Re: [Theory] Types of numerals; bases in natlangs.
           From: Thomas Hart Chappell <[EMAIL PROTECTED]>
     17. Re: OT: Unicode 5.0
           From: "Mark J. Reed" <[EMAIL PROTECTED]>
     18. Re: OT: Unicode 5.0
           From: Henrik Theiling <[EMAIL PROTECTED]>
     19. Re: OT: Unicode 5.0
           From: Paul Bennett <[EMAIL PROTECTED]>
     20. Re: OT: Unicode 5.0
           From: "Mark J. Reed" <[EMAIL PROTECTED]>
     21. Re: OT: Unicode 5.0
           From: Paul Bennett <[EMAIL PROTECTED]>
     22. Re: Conlang flag in actual cloth - final colors?
           From: Paul Bennett <[EMAIL PROTECTED]>
     23. Re: OT: Unicode 5.0
           From: "Jonathyn Bet'nct" <[EMAIL PROTECTED]>
     24. Re: OT: Unicode 5.0
           From: Isaac Penzev <[EMAIL PROTECTED]>
     25. Re: Conlang Wiki
           From: Peter Bleackley <[EMAIL PROTECTED]>

________________________________________________________________________
________________________________________________________________________

Message: 1         
   Date: Mon, 9 Jan 2006 19:21:53 +0000
   From: R A Brown <[EMAIL PROTECTED]>
Subject: Re: OT coins and currency

Isaac Penzev wrote:
> R A Brown jazdy:
> 
> 
>>It will be interesting to see what the venerable Commissioners lay down
>>as the "correct" Cyrillic form of the name.
> 
> 
> There are no universal values for Cyrillic characters, e.g. |и| is [i] in
> Russian but [I] in Ukrainian. 

I know.  There are also no universal values for Roman characters either!!

But that has not prevented the Commissioners laying down that the
official Roman script form is _euro_ (written EURO on the banknotes).

> For now the only form I saw is |евро| in
> Russian and |євро| in Ukrainian. Both indeclinable masculine.

The Russians & Ukrainians are not members of the EU, so they can do as
they please. But I have no doubt whatsoever that when a country using
the Cyrillic alphabet joins the EU, then a "correct and official" 
Cyrillic form will be laid down   :)

-- 
Ray
==================================
[EMAIL PROTECTED]
http://www.carolandray.plus.com
==================================
MAKE POVERTY HISTORY

________________________________________________________________________
________________________________________________________________________

Message: 2         
   Date: Mon, 9 Jan 2006 13:38:28 -0800
   From: "Jonathyn Bet'nct" <[EMAIL PROTECTED]>
Subject: Re: Conlang Wiki

On 1/9/06, Aaron Morse <[EMAIL PROTECTED]> wrote:
> I'd be happy to. . .but I can't see how to sign up.

I think you just send him an email about it. That's what I did.

--
Hasta la pasta,
Jonathyn Bet'nct.
------------------------------------------------------------
I tried the real world once; didn't really care for it.

Hey, free iPods! - http://www.mp3players4free.com/default.aspx?r=543572

Beth: Lisa, all dogs are boys, all cats are girls. Is that right, Max?
Max: Exactly.
Lisa: Well, I'm sorry to have to tell you this, but Daisy is
obviously, and I mean obviously, a girl.
Max: Oh we're not disputing that. It's not a question of sex, but of gender.
Lisa: Sex and gender are the same thing.
Max: Uh, not so. I would much rather have sex than gender.
Lisa: Since you have neither that must be very sad for you.

________________________________________________________________________
________________________________________________________________________

Message: 3         
   Date: Mon, 9 Jan 2006 22:53:30 +0100
   From: Andreas Johansson <[EMAIL PROTECTED]>
Subject: Re: OT coins and currency

Quoting "Mark J. Reed" <[EMAIL PROTECTED]>:

> On 1/9/06, Andreas Johansson <[EMAIL PROTECTED]> wrote:
> > Back in the day, some Swedes used an English-inspired [j8\:ru], but that
> > pronunciation, thankfully, seems to have died out.
>
> Yeah, thank goodness.  The last thing you want is to sound like those
> ignorant Anglophones. :)

Indeed. :p

Seriously, apart from the fact I personally find that pronunciation ugly, it's
totally disconnected from how we pronounce related words like _Europa_,
_eurocentrisk_, etc.

Quoting Benct Philip Jonsson <[EMAIL PROTECTED]>:

> Andreas Johansson skrev:
>
>
> > The treatment of |eu| in Greek-derived Swedish words is erratic, but [Ev] is
> > probably the commonest.
> >
> > Back in the day, some Swedes used an English-inspired [j8\:ru], but that
> > pronunciation, thankfully, seems to have died out.
> >
> >                                           Andreas
> >
> >
>
> IME Swedish Euro-supporters say [j8\:ru] while Euro-opponents say
> [Evru] or [evru].  There may be exceptions, but by and large the
> picture holds true.

Really? I've not heard [j8\:ru] in a long time, and most younger people I know
are pro-euro.

I suppose we need a usage survey at this point, but I find it hard to believe
that most pro-euro people say [j8\:ru].

Looking around at some usage advice sites, I find some telling me to pronounce
it with "[Eu] as in _Europa_". Considering how I pronounce the name of the
continent, I suppose I should start referring to the currency as [e:ru] ...

                                          Andreas

________________________________________________________________________
________________________________________________________________________

Message: 4         
   Date: Mon, 9 Jan 2006 14:29:25 -0800
   From: Sai Emrys <[EMAIL PROTECTED]>
Subject: Conlangs flag in actual cloth - last reminder (deadline is Sat. 14th)

In case you didn't see it the first time or weren't sure:

I'm coordinating a bulk order of Conlang flags
[wiki.frath.net/Image:Conflag_med.png]. 3'x5' poly, 1ply screenprint,
grommeted.

Current price (I bargained it down further, yay) is just under $27.50
per for an order of 12 - currently, I have 13 people on the list as
ordering (though 4 of them are for max prices <$25... which if those
four stick to that and decline the current price, means it'll be about
$35 per instead.) 36-piece price is $16 per, so there's still room for
that to go down if I get more in before Saturday. Shipping is $8.50
worldwide.

Anywho: if you're interested, email me by the 14th, 'cause that's when
I'm initiating the order.

'ta.
 - Sai

________________________________________________________________________
________________________________________________________________________

Message: 5         
   Date: Mon, 9 Jan 2006 14:40:13 -0800
   From: Sai Emrys <[EMAIL PROTECTED]>
Subject: Conlang flag in actual cloth - final colors?

Taliesin says (below) the colors:
black: Black 6 2X
yellow: PMS 123
purple: PMS 2592 2X or PMS 527   [I think I prefer 527 m'self...]

are the most suitable. That agreed? Any vexillologists in the house?

 - Sai

( http://www.the-flag-makers.com/pantone-color-chart.htm is your palette.)

---------- Forwarded message ----------
From: taliesin the storyteller <[EMAIL PROTECTED]>
Date: Dec 29, 2005 12:09 PM
Subject: Re: [CONLANG] Conlang flag in actual cloth
To: [EMAIL PROTECTED]

* Sai Emrys said on 2005-12-29 08:13:13 +0100
> And please make sure that the colors conform to this palette:
> http://www.the-flag-makers.com/pantone-color-chart.htm
>
> Maybe argue over (er, I mean... discuss) which exactly is the right
> color for the sun and the sky, since I'm not the heraldry expert
> around here. :-P

There were even several colors for black, but I'd say:

black: Black 6 2X
yellow: PMS 123
purple: PMS 2592 2X or PMS 527

t.

________________________________________________________________________
________________________________________________________________________

Message: 6         
   Date: Tue, 10 Jan 2006 01:21:32 +0200
   From: John Vertical <[EMAIL PROTECTED]>
Subject: Re: OT: Unicode 5.0

Paul Bennett wrote:
>On Sun, 08 Jan 2006 19:07:49 -0500, Herman Miller wrote:
>>Hmm.... I can't seem to find the specifics about what's new in 5.0.
>>What sorts of characters are included in Latin Extended C & D?
>
>See the Roadmaps at http://www.unicode.org/roadmaps/

Many interesting new ones. I think my favourites are the "squirrel-tail p" 
and the Norse digraphs.

...At risk of threadjack accusations, I'll use the opening to also fire a 
question that's been bothering me for a while - Why does Unicode include 
several characters multiple times? There are 6561 different ways to write 
"THAI POEM". If capital alpha is different from capital ay just because it's 
used in a different alphabet to write a different language, isn't (eg) 
Icelandic "A" also a different character then? Are they really purposely 
randomly tagging unnecessary etymological/usage information to symbols, or 
is it that they just fudged it up initially (for whatever political reasons) 
and can't fix it at this stage any more?
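
The 6561 figure can be checked with a minimal sketch, assuming each of the
eight letters in "THAI POEM" has exactly three look-alike capitals (Latin,
Greek, Cyrillic). The LOOKALIKES table and the spellings() helper are
illustrative names, not anything defined by Unicode:

```python
from itertools import product

# Assumed look-alike capitals per letter, in the order Latin, Greek, Cyrillic.
LOOKALIKES = {
    "T": "T\u03a4\u0422",  # T, GREEK TAU, CYRILLIC TE
    "H": "H\u0397\u041d",  # H, GREEK ETA, CYRILLIC EN
    "A": "A\u0391\u0410",  # A, GREEK ALPHA, CYRILLIC A
    "I": "I\u0399\u0406",  # I, GREEK IOTA, CYRILLIC BYELORUSSIAN-UKRAINIAN I
    "P": "P\u03a1\u0420",  # P, GREEK RHO, CYRILLIC ER
    "O": "O\u039f\u041e",  # O, GREEK OMICRON, CYRILLIC O
    "E": "E\u0395\u0415",  # E, GREEK EPSILON, CYRILLIC IE
    "M": "M\u039c\u041c",  # M, GREEK MU, CYRILLIC EM
}


def spellings(word):
    """Yield every spelling of word that mixes the look-alike capitals."""
    # Characters without an entry (such as the space) have a single choice.
    choices = [LOOKALIKES.get(ch, ch) for ch in word]
    for combo in product(*choices):
        yield "".join(combo)


variants = set(spellings("THAI POEM"))
print(len(variants))  # 3 ** 8 = 6561
```

All 6561 strings look the same in most fonts but are distinct code point
sequences.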

John Vertical

________________________________________________________________________
________________________________________________________________________

Message: 7         
   Date: Mon, 9 Jan 2006 15:42:41 -0800
   From: "Jonathyn Bet'nct" <[EMAIL PROTECTED]>
Subject: Re: OT: Unicode 5.0

On 1/9/06, John Vertical <[EMAIL PROTECTED]> wrote:
> ...At risk of threadjack accusations, I'll use the opening to also fire a
> question that's been bothering me for a while - Why does Unicode include
> several characters multiple times? There are 6561 different ways to write
> "THAI POEM". If capital alpha is different from capital ay just because it's
> used in a different alphabet to write a different language, isn't (eg)
> Icelandic "A" also a different character then? Are they really purposely
> randomly tagging unnecessary etymological/usage information to symbols, or
> is it that they just fudged it up initially (for whatever political reasons)
> and can't fix it at this stage any more?

This is because Icelandic uses the same /script/ as English. Greek
uses a different /script/, therefore capital alpha gets its own
encoding, while Icelandic ay is encoded as the same as English ay.
Unicode stresses the distinctions between script, language (many of
which may use the same script), and glyph variants (which are left to
the realm of fonts, not text encodings).

Unicode certainly has fudged a bunch of stuff up initially, and
unfortunately they can't fix it now. (One thing in particular, I think
they should have encoded small caps a long time ago. One of the
proposals that was linked to included a small-cap F and S, and
mentioned that the only other small caps left unencoded were Q and X.
Interesting, I thought, so I went on a hunt for all the small caps
(other than F, Q, S, and X). I could only find a handful of them, and
they're randomly dotted all over the place: Latin Extended A, IPA
Extensions, Letterlike Symbols, etc. But anyway, enough of my rant.)

--
Hasta la pasta,
Jonathyn Bet'nct.

________________________________________________________________________
________________________________________________________________________

Message: 8         
   Date: Tue, 10 Jan 2006 01:13:11 +0100
   From: Andreas Johansson <[EMAIL PROTECTED]>
Subject: Re: OT: Unicode 5.0

Quoting Jonathyn Bet'nct <[EMAIL PROTECTED]>:

> On 1/9/06, John Vertical <[EMAIL PROTECTED]> wrote:
> > ...At risk of threadjack accusations, I'll use the opening to also fire a
> > question that's been bothering me for a while - Why does Unicode include
> > several characters multiple times? There are 6561 different ways to write
> > "THAI POEM". If capital alpha is different from capital ay just because it's
> > used in a different alphabet to write a different language, isn't (eg)
> > Icelandic "A" also a different character then? Are they really purposely
> > randomly tagging unnecessary etymological/usage information to symbols, or
> > is it that they just fudged it up initially (for whatever political reasons)
> > and can't fix it at this stage any more?
>
> This is because Icelandic uses the same /script/ as English. Greek
> uses a different /script/, therefore capital alpha gets its own
> encoding, while Icelandic ay is encoded as the same as English ay.
> Unicode stresses the distinctions between script, language (many of
> which may use the same script), and glyph variants (which are left to
> the realm of fonts, not text encodings).

Icelandic is sometimes considered a separate script from Latin, presumably since
it includes the Runic-derived thorn. Now, I think the Unicoders took the right
decision not to treat it as separate, but the distinction between variants of
the same script and different scripts is not necessarily unambiguous.

                                                  Andreas

________________________________________________________________________
________________________________________________________________________

Message: 9         
   Date: Mon, 9 Jan 2006 19:19:09 -0500
   From: Paul Bennett <[EMAIL PROTECTED]>
Subject: Re: OT: Unicode 5.0

On Mon, 09 Jan 2006 18:42:41 -0500, Jonathyn Bet'nct <[EMAIL PROTECTED]>  
wrote:

> Unicode stresses the distinctions between script, language (many of
> which may use the same script), and glyph variants (which are left to
> the realm of fonts, not text encodings).

See also the Variation Selectors, which tell a different story, and the  
Rubric brackets proposed for Egyptian.

> Unicode certainly has fudged a bunch of stuff up initially, and
> unfortunately they can't fix it now.

They *could* fix it, by the same act of administrative fiat that created  
Unicode in the first place: make up a new standard with a new name. If  
it's superior enough, it will become preferred (if I hear one person so  
much as mutter the word "qwerty" from the peanut gallery, I shall smite  
thee, for that is an utter fabrication).

My own suggestions?

Purge all characters that are transparently a base character plus one or  
more combining diacritics, obviously allowing fonts to store precomposed  
versions of any combination the font author desires, just not at  
codepoints within the defined standard -- some of this goes on already,  
but it ought to be the rule rather than the exception.

Likewise, use ZWJ, ZWNJ and Variation Selectors to encode ligatures,  
digraphs, and presentation forms, and encode the composed forms outside  
the standard.

Having purged the needless characters, order all remaining glyphs by  
script name (alphabetically), and by glyph name (alphabetically) within  
each script, including combining characters and spacing modifier letters  
(which should have a less silly name). Leave at least one full row (plus a  
fractional row to bring the total range to a full number of rows) at the  
end of each script, just in case.

Replace the U+FFFE / U+FEFF byteorder/start mark with a mark that encodes  
the version number of the standard being adhered to, to allow for future  
bugfixes.
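
For reference, the existing standard already defines the decomposition that
the first suggestion relies on. A minimal sketch using Python's standard
unicodedata module (the variable names are illustrative only) shows a
precomposed letter splitting into base plus combining mark, and the two byte
orders of the U+FEFF mark:

```python
import unicodedata

precomposed = "\u00c2"  # LATIN CAPITAL LETTER A WITH CIRCUMFLEX

# NFD splits the letter into a base character plus a combining diacritic.
decomposed = unicodedata.normalize("NFD", precomposed)
print([f"U+{ord(c):04X}" for c in decomposed])  # ['U+0041', 'U+0302']

# NFC puts the pieces back together into the precomposed code point.
recomposed = unicodedata.normalize("NFC", decomposed)
print(recomposed == precomposed)  # True

# U+FEFF serialised in both UTF-16 byte orders; a reader that sees the
# bytes FF FE knows the stream is little-endian.
bom_be = "\ufeff".encode("utf-16-be")
bom_le = "\ufeff".encode("utf-16-le")
print(bom_be.hex(), bom_le.hex())  # feff fffe
```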

Paul

________________________________________________________________________
________________________________________________________________________

Message: 10        
   Date: Tue, 10 Jan 2006 00:26:10 +0000
   From: Tim May <[EMAIL PROTECTED]>
Subject: Re: OT: Unicode 5.0

Jonathyn Bet'nct wrote at 2006-01-09 15:42:41 (-0800) 
 > On 1/9/06, John Vertical <[EMAIL PROTECTED]> wrote:
 > > ...At risk of threadjack accusations, I'll use the opening to
 > > also fire a question that's been bothering me for a while - Why
 > > does Unicode include several characters multiple times? There are
 > > 6561 different ways to write "THAI POEM". If capital alpha is
 > > different from capital ay just because it's used in a different
 > > alphabet to write a different language, isn't (eg) Icelandic "A"
 > > also a different character then? Are they really purposely
 > > randomly tagging unnecessary etymological/usage information to
 > > symbols, or is it that they just fudged it up initially (for
 > > whatever political reasons) and can't fix it at this stage any
 > > more?
 > 
 > This is because Icelandic uses the same /script/ as English. Greek
 > uses a different /script/, therefore capital alpha gets its own
 > encoding, while Icelandic ay is encoded as the same as English ay.

Furthermore they have different lower-case forms, which can cause
similar situations even within scripts.  Witness U+00D0 LATIN CAPITAL
LETTER ETH vs. U+0110 LATIN CAPITAL LETTER D WITH STROKE vs. 
U+0189 LATIN CAPITAL LETTER AFRICAN D.
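
A short sketch of that point, using only Python's built-in str.lower() and
the standard unicodedata module (the lower_forms name is illustrative):

```python
import unicodedata

# Three capitals that look alike in many fonts but lower-case differently.
CAPITALS = "\u00d0\u0110\u0189"

# Map each capital to its lower-case partner.
lower_forms = {ch: ch.lower() for ch in CAPITALS}

for upper, lower in lower_forms.items():
    print(f"U+{ord(upper):04X} {unicodedata.name(upper)}"
          f" -> U+{ord(lower):04X} {unicodedata.name(lower)}")
# U+00D0 LATIN CAPITAL LETTER ETH -> U+00F0 LATIN SMALL LETTER ETH
# U+0110 LATIN CAPITAL LETTER D WITH STROKE -> U+0111 LATIN SMALL LETTER D WITH STROKE
# U+0189 LATIN CAPITAL LETTER AFRICAN D -> U+0256 LATIN SMALL LETTER D WITH TAIL
```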

 > Unicode stresses the distinctions between script, language (many of
 > which may use the same script), and glyph variants (which are left to
 > the realm of fonts, not text encodings).
 > 
 > Unicode certainly has fudged a bunch of stuff up initially, and
 > unfortunately they can't fix it now. (One thing in particular, I
 > think they should have encoded small caps a long time ago. One of
 > the proposals that was linked to included a small-cap F and S, and
 > mentioned that the only other small caps left unencoded were Q and
 > X.  Interesting, I thought, so I went on a hunt for all the small
 > caps (other than F, Q, S, and X). I could only find a handful of
 > them, and they're randomly dotted all over the place: Latin
 > Extended A, IPA Extensions, Letterlike Symbols, etc. But anyway,
 > enough of my rant.)
 > 

U+0262 LATIN LETTER SMALL CAPITAL G
U+026A LATIN LETTER SMALL CAPITAL I
U+0274 LATIN LETTER SMALL CAPITAL N
U+0280 LATIN LETTER SMALL CAPITAL R
U+028F LATIN LETTER SMALL CAPITAL Y
U+0299 LATIN LETTER SMALL CAPITAL B
U+029C LATIN LETTER SMALL CAPITAL H
U+029F LATIN LETTER SMALL CAPITAL L
U+1D00 LATIN LETTER SMALL CAPITAL A
U+1D04 LATIN LETTER SMALL CAPITAL C
U+1D05 LATIN LETTER SMALL CAPITAL D
U+1D07 LATIN LETTER SMALL CAPITAL E
U+1D0A LATIN LETTER SMALL CAPITAL J
U+1D0B LATIN LETTER SMALL CAPITAL K
U+1D0D LATIN LETTER SMALL CAPITAL M
U+1D0F LATIN LETTER SMALL CAPITAL O
U+1D18 LATIN LETTER SMALL CAPITAL P
U+1D1B LATIN LETTER SMALL CAPITAL T
U+1D1C LATIN LETTER SMALL CAPITAL U
U+1D20 LATIN LETTER SMALL CAPITAL V
U+1D21 LATIN LETTER SMALL CAPITAL W
U+1D22 LATIN LETTER SMALL CAPITAL Z

(None of which are, actually, in Latin Extended A (you may be thinking
of U+0138 LATIN SMALL LETTER KRA) or Letterlike Symbols (which don't
count as letters).  But I can certainly agree that it would have been
more convenient to have encoded them all together at the beginning)
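
A list like the one above can be regenerated by scanning character names. This
is only a sketch: the small_caps() helper is an illustrative name, and the
result depends on the Unicode version bundled with the Python in use, so later
versions may report letters that were unencoded at the time of writing.

```python
import re
import sys
import unicodedata

# Match names that end in a single plain letter, e.g. "... SMALL CAPITAL G".
PATTERN = re.compile(r"LATIN LETTER SMALL CAPITAL ([A-Z])$")


def small_caps():
    """Return {letter: code point} for every single-letter small capital."""
    found = {}
    for cp in range(sys.maxunicode + 1):
        # Unnamed code points fall back to the empty string.
        name = unicodedata.name(chr(cp), "")
        match = PATTERN.match(name)
        if match:
            found[match.group(1)] = cp
    return found


caps = small_caps()
for letter in sorted(caps):
    print(f"U+{caps[letter]:04X} LATIN LETTER SMALL CAPITAL {letter}")
```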

________________________________________________________________________
________________________________________________________________________

Message: 11        
   Date: Tue, 10 Jan 2006 02:30:43 +0200
   From: John Vertical <[EMAIL PROTECTED]>
Subject: Re: OT: Unicode 5.0

>On 1/9/06, John Vertical <[EMAIL PROTECTED]> wrote:
> > ...At risk of threadjack accusations, I'll use the opening to also fire a
> > question that's been bothering me for a while - Why does Unicode include
> > several characters multiple times? There are 6561 different ways to write
> > "THAI POEM". If capital alpha is different from capital ay just because it's
> > used in a different alphabet to write a different language, isn't (eg)
> > Icelandic "A" also a different character then? Are they really purposely
> > randomly tagging unnecessary etymological/usage information to symbols, or
> > is it that they just fudged it up initially (for whatever political reasons)
> > and can't fix it at this stage any more?
>
>This is because Icelandic uses the same /script/ as English. Greek
>uses a different /script/, therefore capital alpha gets its own
>encoding, while Icelandic ay is encoded as the same as English ay.

My argument is that Latin, Cyrillic and Greek capital letters are 
essentially one and the same script.
...Not that I see a point in differentiating by script anyway. I would just 
stick with defining glyphs (shapes) and let the users sort out the meaning.

>Unicode certainly has fudged a bunch of stuff up initially, and
>unfortunately they can't fix it now. (One thing in particular, I think
>they should have encoded small caps a long time ago. One of the
>proposals that was linked to included a small-cap F and S, and
>mentioned that the only other small caps left unencoded were Q and X.
>Interesting, I thought, so I went on a hunt for all the small caps
>(other than F, Q, S, and X). I could only find a handful of them, and
>they're randomly dotted all over the place: Latin Extended A, IPA
>Extensions, Letterlike Symbols, etc. But anyway, enough of my rant.)
>
>--
>Hasta la pasta,
>Jonathyn Bet'nct.

But aren't there a lot of letters (OSVWXZ) which are exactly the same in 
small caps and lower case? If they're randomly dotted all over the place 
anyway, there isn't even the benefit of having the whole set in one place.

John Vertical

________________________________________________________________________
________________________________________________________________________

Message: 12        
   Date: Mon, 9 Jan 2006 19:50:15 -0500
   From: "Ph.D." <[EMAIL PROTECTED]>
Subject: Re: [Theory] Types of numerals

Tristan McLeay wrote:
> 
> Ph.D. wrote:
> > 
> > That's essentially the reason the government gives for 
> > not having colored bills. If people barely glanced at 
> > them, it would be easier to pass counterfeits. In reality, 
> > it's easy to see the large numbers in the corners of each 
> > note. I can count through a stack of US bills very quickly. 
> 
> Or, they could make it difficult to counterfeit them by (e.g.) 
> making them plastic. Anyone can run paper through a 
> printer and make something passable-offable for real 
> money at a glance, it's a lot harder to make polymer notes 
> with the transparent windows.

Yes, I've seen the Australian notes, and they're pretty cool. 
I like the transparent window. I'm surprised that the United
States and other countries haven't adopted polymer money.

A friend of mine teaches a printing class in a high school. 
They have all the latest high-resolution color scanners and 
printers. One boy in one of his classes scanned in and 
printed a one-dollar bill. He then took it to the school cafeteria
and put it in the change machine. It was accepted. Realizing
that he might get in trouble, he told his teacher, who contacted
the authorities to explain the situation. Government agents 
came to the school and thoroughly checked all the computer
equipment. Normal policy here in the United States is to 
destroy any equipment which has been used to make counterfeit
money. But my friend was able to talk them out of it because
it was a public school and a one-time thing. He told me that 
the agents went to the boy's home and confiscated his home
computer.

--Ph. D. 

________________________________________________________________________
________________________________________________________________________

Message: 13        
   Date: Mon, 9 Jan 2006 16:54:44 -0800
   From: [EMAIL PROTECTED]
Subject: Re: OT: Unicode 5.0

Probably the easiest way to design a useful standard is to get rid of
duplicated entities.  Due to accent marks, tones, etc., we have about 50
"o" characters.  If the accent mark were a separate "modifier"
character, that could significantly reduce the number of characters
and make the set more orderly.
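
The number of accented "o" characters can be estimated with a rough sketch
like the following. It assumes that "an o character" means a single code point
whose canonical decomposition starts with a plain "o" or "O", and it scans
only below U+2000 (Latin-1 through Latin Extended Additional). The helper name
is illustrative, and the exact count depends on the Unicode version bundled
with Python:

```python
import unicodedata


def precomposed_with_base(base, limit=0x2000):
    """List code points below limit that decompose to base + combining marks."""
    found = []
    for cp in range(limit):
        ch = chr(cp)
        nfd = unicodedata.normalize("NFD", ch)
        # Longer than one character means the code point is precomposed.
        if len(nfd) > 1 and nfd[0] == base:
            found.append(ch)
    return found


lower_o = precomposed_with_base("o")
upper_o = precomposed_with_base("O")
print(len(lower_o), len(upper_o))
print(" ".join(lower_o))
```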

On 1/9/06, John Vertical <[EMAIL PROTECTED]> wrote:
> >On 1/9/06, John Vertical <[EMAIL PROTECTED]> wrote:
> > > ...At risk of threadjack accusations, I'll use the opening to also fire a
> > > question that's been bothering me for a while - Why does Unicode include
> > > several characters multiple times? There are 6561 different ways to write
> > > "THAI POEM". If capital alpha is different from capital ay just because it's
> > > used in a different alphabet to write a different language, isn't (eg)
> > > Icelandic "A" also a different character then? Are they really purposely
> > > randomly tagging unnecessary etymological/usage information to symbols, or
> > > is it that they just fudged it up initially (for whatever political reasons)
> > > and can't fix it at this stage any more?
> >
> >This is because Icelandic uses the same /script/ as English. Greek
> >uses a different /script/, therefore capital alpha gets its own
> >encoding, while Icelandic ay is encoded as the same as English ay.
>
> My argument is that Latin, Cyrillic and Greek capital letters are
> essentially one and the same script.
> ...Not that I see a point in differentiating by script anyway. I would just
> stick with defining glyphs (shapes) and let the users sort out the meaning.
>
>
> >Unicode certainly has fudged a bunch of stuff up initially, and
> >unfortunately they can't fix it now. (One thing in particular, I think
> >they should have encoded small caps a long time ago. One of the
> >proposals that was linked to included a small-cap F and S, and
> >mentioned that the only other small caps left unencoded were Q and X.
> >Interesting, I thought, so I went on a hunt for all the small caps
> >(other than F, Q, S, and X). I could only find a handful of them, and
> >they're randomly dotted all over the place: Latin Extended A, IPA
> >Extensions, Letterlike Symbols, etc. But anyway, enough of my rant.)
> >
> >--
> >Hasta la pasta,
> >Jonathyn Bet'nct.
>
> But aren't there a lot of letters (OSVWXZ) which are exactly the same in
> small caps and lower case? If they're randomly dotted all over the place
> anyway, there isn't even the benefit of having the whole set in one place.
>
> John Vertical


________________________________________________________________________
________________________________________________________________________

Message: 14        
   Date: Tue, 10 Jan 2006 00:52:38 +0000
   From: Tim May <[EMAIL PROTECTED]>
Subject: Re: OT: Unicode 5.0

Herman Miller wrote at 2006-01-08 18:07:49 (-0600) 
 > Paul Bennett wrote:
 > > Am I alone in being rather excited by Unicode 5.0?
 > > 
 > 
 > Hmm.... I can't seem to find the specifics about what's new in 5.0. What 
 > sorts of characters are included in Latin Extended C & D?

See here:
http://babelstone.blogspot.com/2005/11/whats-new-in-unicode-50.html
for some details and links on 5.0

Code charts for all but 4 of the 1,369 new characters in Unicode 5.0
(see above page for details) are in this document:
http://std.dkuug.dk/jtc1/sc2/wg2/docs/N2991.pdf

________________________________________________________________________
________________________________________________________________________

Message: 15        
   Date: Tue, 10 Jan 2006 12:13:13 +1100
   From: Tristan McLeay <[EMAIL PROTECTED]>
Subject: Re: OT: Unicode 5.0

John Vertical wrote:
> Paul Bennett wrote:
> 
>> On Sun, 08 Jan 2006 19:07:49 -0500, Herman Miller wrote:
>>
>>> Hmm.... I can't seem to find the specifics about what's new in 5.0.
>>> What sorts of characters are included in Latin Extended C & D?
>>
>>
>> See the Roadmaps at http://www.unicode.org/roadmaps/
> 
> 
> Many interesting new ones. I think my favourites are the "squirrel-tail 
> p" and the Norse digraphs.

Indeed, the inclusion of AO/ao means I can now write Føtisk in Unicode 
:) (I think we're still lacking a capital UE ligature though.)

--
Tristan.

________________________________________________________________________
________________________________________________________________________

Message: 16        
   Date: Mon, 9 Jan 2006 20:14:46 -0500
   From: Thomas Hart Chappell <[EMAIL PROTECTED]>
Subject: Re: [Theory] Types of numerals; bases in natlangs.

http://scholar.google.com/scholar?q=author:"Hammarstrom"%20intitle:"Number%20Bases,%20Frequencies%20and%20Lengths%20Cross-Linguistically"

and

[PS] Number Bases, Frequencies and Lengths Cross-Linguistically Harald ...
www.cs.chalmers.se/~harald2/numericals.ps

[PDF] Number Bases, Frequencies and Lengths Cross-Linguistically
www.cs.chalmers.se/~harald2/utrechtabs.pdf

discuss base-and-place systems in the world's languages, among other topics.

He mentions the existence of, and provides references to find out about, 
natlang systems that are (or, in some cases, were) binary, ternary, 
quaternary, octal, base-six, base-five, base-ten, base-twenty, and base-
twelve.

He says: "The different-natured body-tally counting systems of Papua New 
Guinea can have cycles of sizes from eighteen to seventy-four, with twenty-
seven the commonest; but it is not clear in what sense they should be 
equated with 'bases'..."

He looks at the "commonest numbers" in several languages (specifically, the 
frequency of numbers from zero to one-hundred in corpora from 100 different 
languages).  Low numbers and round numbers (powers of bases, and low-number-
multiples of powers of bases) tend to be used with greater frequency in 
every language.

He also looks at the "length" (in segments) of number-words in these 
languages to test his hypothesis that the more frequent numbers tend to be 
shorter than the less frequent numbers.  He looks more closely at a decimal 
language (English) and a vigesimal language (Danish).

Interestingly, he reports that, among the numbers between eleven and 
nineteen, the most frequently used numbers in seven languages (English, 
French, Japanese, Kannada, Dutch, Catalan, and Spanish) are twelve and 
fifteen.  Twelve is a "round number" in bases two, three, four, six, and of 
course twelve; but not in bases five, eight, ten, or twenty.  Fifteen is 
a "round number" in bases three and five; but not in bases two, four, six, 
eight, ten, twelve, or twenty.  
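
The two lists above can be reproduced with a small sketch, assuming "round in
base b" here simply means "a multiple of b", which is the reading that matches
the bases named in this paragraph. The round_in() helper and the BASES list
are illustrative, not taken from the paper:

```python
# Bases mentioned in the paragraph above.
BASES = [2, 3, 4, 5, 6, 8, 10, 12, 20]


def round_in(n, bases=BASES):
    """Return the bases in which n is a multiple of the base."""
    return [b for b in bases if n % b == 0]


for n in (12, 15):
    yes = round_in(n)
    no = [b for b in BASES if b not in yes]
    print(f"{n}: round in bases {yes}, not in {no}")
# 12: round in bases [2, 3, 4, 6, 12], not in [5, 8, 10, 20]
# 15: round in bases [3, 5], not in [2, 4, 6, 8, 10, 12, 20]
```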

I do not know the number systems of any of those languages except English 
and French (and I may or may not remember that much Spanish), so I don't know 
what bases they use.  English is decimal, and French is vigesimal from 
seventy up to ninety-nine, decimal otherwise.  Hints in the manuscript seem 
to indicate that the other five languages are also each mostly-decimal, at 
least for most of the range from zero to one-hundred.

If a language does not have base three or base five, why is "fifteen" a 
common numeral?  If a language has base five or base eight or base ten or 
base twenty, why is "twelve" a common numeral?

-----

Extant, as opposed to extinct, base-four and base-eight languages are 
(says the author) hard to get corpora for, as is the only "bona-fide 
base-six system" he knows about.  He says all of the base-five systems he knows 
about convert to base-ten or base-twenty for numbers above twenty.  He 
mentions some base-twelve systems that stay base-twelve up to twelve-cubed 
(1728).

ObConLang: Human natlang numeral systems based on multiplication and addition 
seem to have bases of seventy-four or less.  Would a conlang for non-human 
speakers work well with a base of eighty-four, ninety, ninety-six, 
one-hundred-eight, one-hundred-twenty, or one-hundred-sixty-eight?  What sort 
of rationale would make this plausible?

-----

Tom H.C. in MI

________________________________________________________________________
________________________________________________________________________

Message: 17        
   Date: Mon, 9 Jan 2006 20:34:17 -0500
   From: "Mark J. Reed" <[EMAIL PROTECTED]>
Subject: Re: OT: Unicode 5.0

> > Unicode certainly has fudged a bunch of stuff up initially, and
> > unfortunately they can't fix it now.

> They *could* fix it, by the same act of administrative fiat that created
> Unicode in the first place:

<cough>Ido<cough>

It took an age for Unicode to even start to catch on; now that it's
practically mainstream it would be pure foolhardiness to try to switch
to a new standard.  Everyone would cry foul and go their own way.

I see no harm in letting Unicode have extra, unneeded characters.  You
need rules for dealing with composing and decomposing anyway, so it
doesn't really hurt to have a LATIN CAPITAL LETTER A WITH CIRCUMFLEX
in addition to LATIN CAPITAL LETTER A and COMBINING CIRCUMFLEX ABOVE
(or whatever the actual names are).
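
As a rough illustration of the composing and decomposing rules mentioned above, here is a small Python sketch using the standard unicodedata module. It is only a demonstration of normalization on this one letter, not a statement about how any particular application handles it. For the record, the combining character's formal name is COMBINING CIRCUMFLEX ACCENT (U+0302).

```python
import unicodedata

precomposed = "\u00C2"   # LATIN CAPITAL LETTER A WITH CIRCUMFLEX
decomposed = "A\u0302"   # LATIN CAPITAL LETTER A + COMBINING CIRCUMFLEX ACCENT

# The two spellings are different code point sequences...
print(precomposed == decomposed)

# ...but normalization maps each onto the other.
nfc = unicodedata.normalize("NFC", decomposed)    # compose
nfd = unicodedata.normalize("NFD", precomposed)   # decompose
print(nfc == precomposed, nfd == decomposed)

# Formal character names, as recorded in the Unicode character database.
names = [unicodedata.name(ch) for ch in decomposed]
print(unicodedata.name(precomposed))
print(names)
```

Comparing strings after normalizing both to the same form (NFC or NFD) is the usual way to treat the two spellings as equal.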

Also, a lot of the redundancy comes from the design goal of round-trip
preservation of the contents of documents in a national character set
when converted to Unicode and then back.  I think that's a worthwhile
goal from a computational standpoint.

And a lot of the acceptability of Unicode comes from the fact that it's
a strict superset of Latin-1, which is, in turn, a strict superset of
ASCII.  Your rearrangement loses that and thereby instantly loses many
current Unicode adopters.
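
The superset and round-trip points can be checked with a few lines of Python. This is a minimal sketch that relies on Python's built-in "latin-1" and "ascii" codecs; the variable names are invented for the example. Note that "superset" here refers to code point numbers, not to the bytes of any particular Unicode encoding such as UTF-8.

```python
# Every possible byte value, read as Latin-1 (ISO 8859-1).
all_bytes = bytes(range(256))
text = all_bytes.decode("latin-1")

# Each Latin-1 byte value equals the Unicode code point it decodes to.
same_numbers = all(ord(ch) == b for ch, b in zip(text, all_bytes))

# Round trip: Latin-1 -> Unicode -> Latin-1 returns the original bytes.
round_trip = text.encode("latin-1") == all_bytes

# ASCII occupies the first 128 positions, and both codecs agree on them.
ascii_bytes = bytes(range(128))
ascii_agrees = ascii_bytes.decode("ascii") == ascii_bytes.decode("latin-1")

print(same_numbers, round_trip, ascii_agrees)
```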

--
Mark J. Reed <[EMAIL PROTECTED]>

________________________________________________________________________
________________________________________________________________________

Message: 18        
   Date: Tue, 10 Jan 2006 02:49:31 +0100
   From: Henrik Theiling <[EMAIL PROTECTED]>
Subject: Re: OT: Unicode 5.0

Hi!

Tristan McLeay <[EMAIL PROTECTED]> writes:
>...
> Indeed, the inclusion of AO/ao means I can now write Føtisk in Unicode
> :) (I think we're still lacking a capital UE ligature though.)

That's indeed nice.  The usual digraph I'm using for [O] in Lower
German is _ao_.  People tend to read that as [ao] or [aU].  And using
_ô_ is unusual for German eyes and, therefore, not really intuitive.
The _ao_ ligature will help here.  Very nice inclusion.

**Henrik

________________________________________________________________________
________________________________________________________________________

Message: 19        
   Date: Mon, 9 Jan 2006 20:49:53 -0500
   From: Paul Bennett <[EMAIL PROTECTED]>
Subject: Re: OT: Unicode 5.0

On Mon, 09 Jan 2006 20:34:17 -0500, Mark J. Reed <[EMAIL PROTECTED]>  
wrote:

>> > Unicode certainly has fudged a bunch of stuff up initially, and
>> > unfortunately they can't fix it now.
>
>> They *could* fix it, by the same act of administrative fiat that created
>> Unicode in the first place:
>
> <cough>Ido<cough>
>
> It took an age for Unicode to even start to catch on; now that it's
> practically mainstream it would be pure foolhardiness to try to switch
> to a new standard.  Everyone would cry foul and go their own way.

I'm sure that's what somebody said about 7-bit USASCII at some point, or  
indeed the original Baudot codes. However, the rest of your post is well  
reasoned, and well enough argued. There's always room for a sufficiently  
superior method to supplant an established standard, though. I do rather  
suspect the key is in the word "sufficiently", to be sure.

Paul

________________________________________________________________________
________________________________________________________________________

Message: 20        
   Date: Mon, 9 Jan 2006 20:57:14 -0500
   From: "Mark J. Reed" <[EMAIL PROTECTED]>
Subject: Re: OT: Unicode 5.0

On 1/9/06, Paul Bennett <[EMAIL PROTECTED]> wrote:
> I'm sure that's what somebody said about 7-bit USASCII at some point, or
> indeed the original Baudot codes.

Perhaps, but there was a clear need for more capability than those
supplied. I was addressing a suggestion for a new system supplying the
same capabilities rearranged, which is far less compelling. :)

> However, the rest of your post is well
> reasoned, and well enough argued. There's always room for a sufficiently
> superior method to supplant an established standard, though. I do rather
> suspect the key is in the word "sufficiently", to be sure.
>
>
>
>
>
> Paul
>

--
Mark J. Reed <[EMAIL PROTECTED]>

________________________________________________________________________
________________________________________________________________________

Message: 21        
   Date: Mon, 9 Jan 2006 21:01:35 -0500
   From: Paul Bennett <[EMAIL PROTECTED]>
Subject: Re: OT: Unicode 5.0

On Mon, 09 Jan 2006 20:49:31 -0500, Henrik Theiling <[EMAIL PROTECTED]>  
wrote:

> Hi!
>
> Tristan McLeay <[EMAIL PROTECTED]> writes:
>> ...
>> Indeed, the inclusion of AO/ao means I can now write Føtisk in Unicode
>> :) (I think we're still lacking a capital UE ligature though.)
>
> That's indeed nice.  The usual digraph I'm using for [O] in Lower
> German is _ao_.  People tend to read that as [ao] or [aU].  And using
> _ô_ is unusual for German eyes and, therefore, not really intuitive.
> The _ao_ ligature will help here.  Very nice inclusion.

The full set of digraphs in Latin Extended-D is:

AA, AO, AU, AV, AV-barred, AY, OO and VY, all in upper and lower case.

Also of note are the insular forms of D, F, R, S and T (some only in  
lower), Middle Welsh forms of LL, D and V, the R Rotunda (looks like a 2),  
O-loop, Vend, Broken L and Visigothic Z (the precursor to C-cedilla), as  
well as a huge set of Brachygraphic signs and abbreviations.

Paul

________________________________________________________________________
________________________________________________________________________

Message: 22        
   Date: Mon, 9 Jan 2006 22:21:30 -0500
   From: Paul Bennett <[EMAIL PROTECTED]>
Subject: Re: Conlang flag in actual cloth - final colors?

On Mon, 09 Jan 2006 17:40:13 -0500, Sai Emrys <[EMAIL PROTECTED]> wrote:

> Taliesin says (below) the colors:
> black: Black 6 2X
> yellow: PMS 123
> purple: PMS 2592 2X or PMS 527   [I think I prefer 527 m'self...]
>
> are the most suitable. That agreed? Any vexillologists in the house?

Naming Pantone colors is verging on pointless, unless each one of us has a  
well-calibrated monitor, or a well-cared-for Pantone book. There's too  
much variation between displays and display adaptors without calibration  
to make Pantone much more effective than descriptions like "oh, no, a bit  
redder than that, and maybe with less grey in it".

That said, they look okay to me. I'd possibly prefer something closer to  
heraldic gold, maybe something in the 107/108 range instead of 123,  
which seems a bit orange. It's not a big enough deal for me to draw a line  
in the sand, though. 527 is nice. Why not straight-up, no messin' around,  
Pantone Black for the Black? It's the utter definition of 100%  
pigmentation, as a Black should be, IMO.

Paul

________________________________________________________________________
________________________________________________________________________

Message: 23        
   Date: Mon, 9 Jan 2006 20:04:18 -0800
   From: "Jonathyn Bet'nct" <[EMAIL PROTECTED]>
Subject: Re: OT: Unicode 5.0

On 1/9/06, Tim May <[EMAIL PROTECTED]> wrote:
> U+0262 LATIN LETTER SMALL CAPITAL G
> ...
> U+1D00 LATIN LETTER SMALL CAPITAL A
> ...
> U+1D22 LATIN LETTER SMALL CAPITAL Z

Ah, that's where the rest of them ended up.

> (None of which are, actually, in Latin Extended A (you may be thinking
> of U+0138 LATIN SMALL LETTER KRA)

Yes, that's exactly what I was thinking of. I actually doubted if that
was the small-cap K when I went looking for it.

> or Letterlike Symbols (which don't
> count as letters).

I must have been thinking of something else (maybe the cursive letters).
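
The code points quoted above can be checked against the Unicode character database that ships with Python. This is just a sketch: the dictionary holds only the four characters named in this exchange, and in_latin_extended_a is a helper made up for the illustration, using the block range U+0100 to U+017F.

```python
import unicodedata

# Code points and names as quoted in the message above.
quoted = {
    0x0262: "LATIN LETTER SMALL CAPITAL G",
    0x1D00: "LATIN LETTER SMALL CAPITAL A",
    0x1D22: "LATIN LETTER SMALL CAPITAL Z",
    0x0138: "LATIN SMALL LETTER KRA",
}


def in_latin_extended_a(cp):
    """True if the code point lies in the Latin Extended-A block (U+0100..U+017F)."""
    return 0x0100 <= cp <= 0x017F


# Look each code point up and compare with the quoted name.
results = {cp: unicodedata.name(chr(cp)) == name for cp, name in quoted.items()}

for cp, name in quoted.items():
    status = "ok" if results[cp] else "MISMATCH"
    block = "Latin Extended-A" if in_latin_extended_a(cp) else "another block"
    print(f"U+{cp:04X} {name}: {status}, {block}")
```

Of the four, only U+0138 (kra) falls inside Latin Extended-A, which fits the correction quoted above.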

--
Hasta la pasta,
Jonathyn Bet'nct.
------------------------------------------------------------
I tried the real world once; didn't really care for it.

Hey, free iPods! - http://www.mp3players4free.com/default.aspx?r=543572

Beth: Lisa, all dogs are boys, all cats are girls. Is that right, Max?
Max: Exactly.
Lisa: Well, I'm sorry to have to tell you this, but Daisy is
obviously, and I mean obviously, a girl.
Max: Oh we're not disputing that. It's not a question of sex, but of gender.
Lisa: Sex and gender are the same thing.
Max: Uh, not so. I would much rather have sex than gender.
Lisa: Since you have neither that must be very sad for you.

________________________________________________________________________
________________________________________________________________________

Message: 24        
   Date: Tue, 10 Jan 2006 11:44:36 +0200
   From: Isaac Penzev <[EMAIL PROTECTED]>
Subject: Re: OT: Unicode 5.0

I'm also happy since they plan to include 6 additional Cyrillic characters
used in Paleoasian langs, especially CYRILLIC LETTER EL WITH HOOK that I
needed badly.

-- Yitzik

________________________________________________________________________
________________________________________________________________________

Message: 25        
   Date: Tue, 10 Jan 2006 09:48:26 +0000
   From: Peter Bleackley <[EMAIL PROTECTED]>
Subject: Re: Conlang Wiki

staving Aaron Morse:
>I'd be happy to. . .but I can't see how to sign up.
>
>Peter Bleackley <[EMAIL PROTECTED]> wrote:
>The conlang wiki http://www.talideon.com/concultures/wiki/ has been locked
>for some time, due to vandalism. Keith has stated that he wants to set up
>an editors table - hopefully if enough people sign up for this, he'll be
>able to resurrect the wiki. I've volunteered already, and I think it would
>be a good thing if other people did. Please have a look at the wiki, and
>sign up if you think you can make a contribution to bringing a useful
>resource back to life.
>
>Pete

Click on the "Contact me" link from the front page.

Pete

________________________________________________________________________
________________________________________________________________________
