Re: unicode

2016-09-17 Thread Timo Paulssen
On 17/09/16 13:34, Moritz Lenz wrote: >> Searching further I found the ucd2c.pl program in the MoarVM tools >> directory. This generates the unicode_db.c somewhere else in the >> rakudo tree. I ran this program myself on the Unicode 9.0.0 >> database and comparing the generated files shows many dif

Re: unicode

2016-09-17 Thread MT
Hi, I am looking forward to it. Thanks, Marcel On Sat, Sep 17, 2016 at 01:34:45PM +0200, Moritz Lenz wrote: Hi, On 17.09.2016 13:12, MT wrote: The date found in the unicode_db.c file is 2012-07-20, which is about Unicode version 6.1.0. So the content in that file is not getting updated whe

Re: unicode

2016-09-17 Thread Nicholas Clark
On Sat, Sep 17, 2016 at 01:34:45PM +0200, Moritz Lenz wrote: > Hi, > > On 17.09.2016 13:12, MT wrote: > > The date found in the unicode_db.c file is 2012-07-20, which is > > about Unicode version 6.1.0. So the content in that file is not getting updated when the shipped Unicode version is u

Re: unicode

2016-09-17 Thread MT
Searching further I found the ucd2c.pl program in the MoarVM tools directory. This generates the unicode_db.c somewhere else in the rakudo tree. I ran this program myself on the Unicode 9.0.0 database and comparing the generated files shows many differences between the one in the rakudo tree a

Re: unicode

2016-09-17 Thread Moritz Lenz
Hi, On 17.09.2016 13:12, MT wrote: > Searching further I found the ucd2c.pl program in the MoarVM tools > directory. This generates the unicode_db.c somewhere else in the rakudo > tree. I ran this program myself on the Unicode 9.0.0 database and > comparing the generated files shows many diffe

Re: Unicode Categories

2010-11-12 Thread karl williamson
Tom Christiansen wrote: Patrick wrote: : > * Almost. E.g. isL would be nice to have as well. : : Those exist also: : : $ ./perl6 : > say 'abCD34' ~~ / / : a : > say 'abCD34' ~~ / / : 3 : > They may exist, but I'm not certain it's a good idea to encourage the Is_XXX approach on *anything

Re: Unicode Categories

2010-11-11 Thread Tom Christiansen
>The 'Is' prefix can be used on any property in 5.12 for which there is >no naming conflict. The only naming conflicts are certain of the block >properties, such as Arabic. IsArabic means the Arabic script. InArabic >means the base Arabic block. Personally, I find Is and In unintuitive, >an

Re: Unicode Categories

2010-11-10 Thread Tom Christiansen
Patrick wrote: : > * Almost. E.g. isL would be nice to have as well. : : Those exist also: : : $ ./perl6 : > say 'abCD34' ~~ / / : a : > say 'abCD34' ~~ / / : 3 : > They may exist, but I'm not certain it's a good idea to encourage the Is_XXX approach on *anything* except Script=XXX proper

Re: Unicode Categories

2010-11-10 Thread Tom Christiansen
Patrick wrote at 12:15pm CST on Wednesday, 10 November 2010: >> Sorry if this is the wrong forum. I was wondering if there was a way to >> specify unicode >> categories in >> a regular expression (and hence a grammar), or if there would be

Re: Unicode Categories

2010-11-10 Thread Chase Albert
Even awesomer, thank you again. On Wed, Nov 10, 2010 at 13:28, Patrick R. Michaud wrote: > On Wed, Nov 10, 2010 at 01:21:57PM -0500, Chase Albert wrote: > > That's exactly what I was looking for*. Awesome, thank you. > > > > * Almost. E.g. isL would be nice to have as well. > > Those exist also:

Re: Unicode Categories

2010-11-10 Thread Patrick R. Michaud
On Wed, Nov 10, 2010 at 01:21:57PM -0500, Chase Albert wrote: > That's exactly what I was looking for*. Awesome, thank you. > > * Almost. E.g. isL would be nice to have as well. Those exist also: $ ./perl6 > say 'abCD34' ~~ / / a > say 'abCD34' ~~ / / 3 > Pm
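
As a rough illustration of the kind of test this thread is about (matching a character by its Unicode general category), here is a minimal Python sketch. It is not the Perl 6 syntax discussed above; the helper name first_in_category is hypothetical, and the only library call used is unicodedata.category from the Python standard library.

```python
import unicodedata


def first_in_category(text, prefix):
    """Return the first character of text whose Unicode general category
    starts with prefix (for example "L", "Lu", or "N"), or None."""
    for ch in text:
        if unicodedata.category(ch).startswith(prefix):
            return ch
    return None


# "L" covers every letter category, "Lu" only uppercase letters,
# and "N" every number category.
print(first_in_category("abCD34", "L"))
print(first_in_category("abCD34", "Lu"))
print(first_in_category("abCD34", "N"))
```

Matching on a prefix keeps the one-letter major class ("L") and the two-letter subcategory ("Lu") in a single helper.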

Re: Unicode Categories

2010-11-10 Thread Chase Albert
That's exactly what I was looking for*. Awesome, thank you. ~Cheers * Almost. E.g. isL would be nice to have as well. On Wed, Nov 10, 2010 at 13:15, Patrick R. Michaud wrote: > "Unicode > properties are always available with a prefix" >

Re: Unicode Categories

2010-11-10 Thread Patrick R. Michaud
On Wed, Nov 10, 2010 at 01:03:26PM -0500, Chase Albert wrote: > Sorry if this is the wrong forum. I was wondering if there was a way to > specify unicode > categories in > a regular expression (and hence a grammar), or if there would be any

Re: "Unicode in 'NFG' formation" ?

2009-05-20 Thread John M. Dlugosz
Larry Wall larry-at-wall.org |Perl 6| wrote: On Mon, May 18, 2009 at 11:11:32AM +0200, Helmut Wollmersdorfer wrote: [1] Open questions: 1) Will graphemes have an unique charname? e.g. GRAPHEME LATIN SMALL LETTER A WITH DOT BELOW AND DOT ABOVE Yes, presumably that comes with the "no

Re: "Unicode in 'NFG' formation" ?

2009-05-20 Thread Helmut Wollmersdorfer
Larry Wall wrote: On Mon, May 18, 2009 at 11:11:32AM +0200, Helmut Wollmersdorfer wrote: 2) Can I use Unicode property matching safely with graphemes? If yes, who or what maintains the necessary tables? Good question. My assumption is that adding marks to a character doesn't change its

Re: "Unicode in 'NFG' formation" ?

2009-05-18 Thread Brandon S. Allbery KF8NH
On May 18, 2009, at 21:54 , Larry Wall wrote: On Mon, May 18, 2009 at 07:59:31PM -0500, John M. Dlugosz wrote: No, a few million code points in the Unicode standard can produce an arbitrary number of unique grapheme clusters, since you can apply as many modifiers as you like to each different ba

Re: "Unicode in 'NFG' formation" ?

2009-05-18 Thread Larry Wall
On Mon, May 18, 2009 at 07:59:31PM -0500, John M. Dlugosz wrote: > No, a few million code points in the Unicode standard can produce an > arbitrary number of unique grapheme clusters, since you can apply as > many modifiers as you like to each different base character. If you > allow multipl

Re: "Unicode in 'NFG' formation" ?

2009-05-18 Thread John M. Dlugosz
Larry Wall larry-at-wall.org |Perl 6| wrote: into *uint16 as long as they don't synthesize codepoints. And we can always resort to *uint32 and *int32 knowing that the Unicode consortium isn't going to use the top bit any time in the foreseeable future. (Unless, of course, they endorse something

Re: "Unicode in 'NFG' formation" ?

2009-05-18 Thread John M. Dlugosz
Larry Wall larry-at-wall.org |Perl 6| wrote: Sure, but this is a weak argument, since you can already write complete ord/chr nonsense at the codepoint level (even in ASCII), and all we're doing here is making graphemes work more like codepoints in terms of storage and indexing. If people abuse i

Re: "Unicode in 'NFG' formation" ?

2009-05-18 Thread John M. Dlugosz
Mark J. Reed markjreed-at-gmail.com |Perl 6| wrote: On Mon, May 18, 2009 at 9:11 AM, Austin Hastings wrote: If you haven't read the PDD, it's a good start. I get all that, really. I still question the necessity of mapping each grapheme to a single integer. A single *value*, sure.

Re: "Unicode in 'NFG' formation" ?

2009-05-18 Thread Austin Hastings
Larry Wall wrote: Which is a very interesting topic, with connections to type theory, scope/domain management, and security issues (such as the possibility of a DoS attack on the translation tables). I think that a DoS attack on Unicode would be called "IBM/Windows Code Pages." The rest of

Re: "Unicode in 'NFG' formation" ?

2009-05-18 Thread Austin Hastings
Brandon S. Allbery KF8NH wrote: On May 18, 2009, at 14:16 , Larry Wall wrote: On Mon, May 18, 2009 at 11:11:32AM +0200, Helmut Wollmersdorfer wrote: 3) Details of 'life-time', round-trip. Which is a very interesting topic, with connections to type theory, scope/domain management, and security

Re: "Unicode in 'NFG' formation" ?

2009-05-18 Thread Brandon S. Allbery KF8NH
On May 18, 2009, at 14:16 , Larry Wall wrote: On Mon, May 18, 2009 at 11:11:32AM +0200, Helmut Wollmersdorfer wrote: 3) Details of 'life-time', round-trip. Which is a very interesting topic, with connections to type theory, scope/domain management, and security issues (such as the possibility

Re: "Unicode in 'NFG' formation" ?

2009-05-18 Thread Larry Wall
On Mon, May 18, 2009 at 02:16:17PM -0400, Mark J. Reed wrote: : Surrogates are just weird, since they have assigned code points even : though they're purely an encoding mechanism. As such, they straddle : the line between abstract characters and an encoding form. I assume : that if text comes in a

Re: "Unicode in 'NFG' formation" ?

2009-05-18 Thread Larry Wall
On Mon, May 18, 2009 at 11:11:32AM +0200, Helmut Wollmersdorfer wrote: > [1] Open questions: > > 1) Will graphemes have an unique charname? >e.g. GRAPHEME LATIN SMALL LETTER A WITH DOT BELOW AND DOT ABOVE Yes, presumably that comes with the "normalization" part of NFG. We're not aiming for rou

Re: "Unicode in 'NFG' formation" ?

2009-05-18 Thread Mark J. Reed
> On Mon, May 18, 2009 at 12:37:49PM -0400, Brandon S. Allbery KF8NH wrote: >> I would argue that if you are working with a grapheme cluster >> ("grapheme"), arithmetic on individual grapheme values is undefined. Yup, that was exactly what I was arguing. >> In short, I think the only remotely san

Re: "Unicode in 'NFG' formation" ?

2009-05-18 Thread Larry Wall
On Mon, May 18, 2009 at 12:37:49PM -0400, Brandon S. Allbery KF8NH wrote: > On May 18, 2009, at 09:21 , Mark J. Reed wrote: >> If you're doing arithmetic with the code points or scalar values of >> characters, then the specific numbers would seem to matter. I'm > > > I would argue that if you are

Re: "Unicode in 'NFG' formation" ?

2009-05-18 Thread Brandon S. Allbery KF8NH
On May 18, 2009, at 09:21 , Mark J. Reed wrote: If you're doing arithmetic with the code points or scalar values of characters, then the specific numbers would seem to matter. I'm I would argue that if you are working with a grapheme cluster ("grapheme"), arithmetic on individual grapheme v

Re: "Unicode in 'NFG' formation" ?

2009-05-18 Thread Austin Hastings
Mark J. Reed wrote: On Mon, May 18, 2009 at 9:11 AM, Austin Hastings wrote: If you haven't read the PDD, it's a good start. I get all that, really. I still question the necessity of mapping each grapheme to a single integer. A single *value*, sure. length($weird_grapheme) should a

Re: "Unicode in 'NFG' formation" ?

2009-05-18 Thread Mark J. Reed
On Mon, May 18, 2009 at 9:11 AM, Austin Hastings wrote: > If you haven't read the PDD, it's a good start. I get all that, really. I still question the necessity of mapping each grapheme to a single integer. A single *value*, sure. length($weird_grapheme) should always be 1, absolutely. But w

Re: "Unicode in 'NFG' formation" ?

2009-05-18 Thread Austin Hastings
If you haven't read the PDD, it's a good start. To summarize, probably oversimplifying badly: 1. A grapheme is a character *as seen on the page.* That is, if composing "a" + "dot above" + "dot below" produces an a with dots above and below it, then THAT is the grapheme. 2. Unicode has a lot
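
The "a" + "dot above" + "dot below" example in point 1 can be sketched in Python. This is only an approximation under stated assumptions: it is not NFG and not the full grapheme cluster segmentation from Unicode Standard Annex #29. The hypothetical helper grapheme_count simply treats every code point with a non-zero canonical combining class as belonging to the preceding base character.

```python
import unicodedata

# "a" followed by COMBINING DOT BELOW (U+0323) and COMBINING DOT ABOVE (U+0307):
# three code points that a reader sees as one character on the page.
decomposed = "a\u0323\u0307"

# NFC composes as far as precomposed characters exist, so the number of
# code points can shrink while the visible character stays the same.
composed = unicodedata.normalize("NFC", decomposed)


def grapheme_count(text):
    """Rough grapheme count: each code point with canonical combining
    class 0 is assumed to start a new cluster (a simplification)."""
    return sum(1 for ch in text if unicodedata.combining(ch) == 0)


print(len(decomposed), grapheme_count(decomposed))
print(len(composed), grapheme_count(composed))
```

The code point length depends on the normalization form, while the approximate grapheme count does not, which is the property the thread wants from a "one grapheme counts as one character" string model.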

Re: "Unicode in 'NFG' formation" ?

2009-05-18 Thread Mark J. Reed
Do we really need to be able to map arbitrary graphemes to integers, or is it enough to have an opaque value returned by ord() that, when fed to chr(), returns the same grapheme? If the latter, a list of code points (in one of the official Normalization Formats) would seem to be sufficient. On 5/1

Re: "Unicode in 'NFG' formation" ?

2009-05-18 Thread Helmut Wollmersdorfer
Darren Duncan wrote: Since you seem eager, I recommend you start with porting the Parrot PDD 28 to a new Perl 6 Synopsis 15, and continue from there. IMHO we need some people for a broad discussion on the details first. Helmut Wollmersdorfer

Re: "Unicode in 'NFG' formation" ?

2009-05-18 Thread Helmut Wollmersdorfer
John M. Dlugosz wrote: I was going over S02, and found it opens with, "By default Perl presents Unicode in "NFG" formation, where each grapheme counts as one character." I looked up NFG, and found it to be an invention of this group, but didn't find any details when I tried to chase down the l

Re: "Unicode in 'NFG' formation" ?

2009-05-16 Thread Darren Duncan
John M. Dlugosz wrote: I was going over S02, and found it opens with, "By default Perl presents Unicode in "NFG" formation, where each grapheme counts as one character." I looked up NFG, and found it to be an invention of this group, but didn't find any details when I tried to chase down the l

Re: Unicode bracketing spec question

2009-04-24 Thread Timothy S. Nelson
On Thu, 23 Apr 2009, Helmut Wollmersdorfer wrote: Timothy S. Nelson wrote: I note that S02 says that the unicode classes Ps/Pe are blessed to act as opening and closing quotes. Is there a reason that we can't have Pi/Pf blessed too? I ask because there are quotation marks in the Pi/Pf se

Re: Unicode bracketing spec question

2009-04-23 Thread Helmut Wollmersdorfer
Timothy S. Nelson wrote: I note that S02 says that the unicode classes Ps/Pe are blessed to act as opening and closing quotes. Is there a reason that we can't have Pi/Pf blessed too? I ask because there are quotation marks in the Pi/Pf set that are called "Substitution" and "Transposition

[perl #61394] Re: unicode and macosx

2008-12-16 Thread via RT
# New Ticket Created by Stephane Payrard # Please include the string: [perl #61394] # in the subject line of all future correspondence about this issue. # http://rt.perl.org/rt3/Ticket/Display.html?id=61394 > my $s = " "; say $s.chars # now returns 1 Note : the bug was reported on macinte

[Patch] Re: Unicode Operators cheatsheet, please!

2005-06-02 Thread Kevin Puetz
Rob Kinyon wrote: > On 5/31/05, Sam Vilain <[EMAIL PROTECTED]> wrote: >> Rob Kinyon wrote: >> > I would love to see a document (one per editor) that describes the >> > Unicode characters in use and how to make them. The Set implementation >> > in Pugs uses (at last count) 20 different Unicode cha

Re: Unicode Operators cheatsheet, please!

2005-06-01 Thread Rob Kinyon
On 5/31/05, Sam Vilain <[EMAIL PROTECTED]> wrote: > Rob Kinyon wrote: > > I would love to see a document (one per editor) that describes the > > Unicode characters in use and how to make them. The Set implementation > > in Pugs uses (at last count) 20 different Unicode characters as > > operators.

Re: Unicode Operators cheatsheet, please!

2005-05-31 Thread Sam Vilain
Rob Kinyon wrote: I would love to see a document (one per editor) that describes the Unicode characters in use and how to make them. The Set implementation in Pugs uses (at last count) 20 different Unicode characters as operators. I have updated the unicode quickref, and started a Perlmonks dis

Re: Unicode Operators cheatsheet, please!

2005-05-27 Thread Gaal Yahas
On Fri, May 27, 2005 at 10:29:39AM -0400, Rob Kinyon wrote: > I would love to see a document (one per editor) that describes the > Unicode characters in use and how to make them. The Set implementation > in Pugs uses (at last count) 20 different Unicode characters as > operators. Good idea. A mode

Re: Unicode Support - ICU Optional

2004-08-05 Thread Andy Dougherty
On Thu, 5 Aug 2004, Nicholas Clark wrote: > On Wed, Aug 04, 2004 at 04:10:56AM -0700, Joshua Gatcomb wrote: > > > WRT improving the ease of use of ICU. My suggestion > > is that a representative from each platform that > > Parrot is currently being built on download the latest > > stable version

Re: Unicode Support - ICU Optional

2004-08-05 Thread Nicholas Clark
On Wed, Aug 04, 2004 at 04:10:56AM -0700, Joshua Gatcomb wrote: > WRT improving the ease of use of ICU. My suggestion > is that a representative from each platform that > Parrot is currently being built on download the latest > stable version of ICU source, build it, and note x86 Debian builds a

Re: Unicode Support - ICU Optional

2004-08-05 Thread Nicholas Clark
On Wed, Aug 04, 2004 at 04:10:56AM -0700, Joshua Gatcomb wrote: > WRT improving the ease of use of ICU. My suggestion > is that a representative from each platform that > Parrot is currently being built on download the latest > stable version of ICU source, build it, and note > anything "special"

Re: Unicode Support - ICU Optional

2004-08-05 Thread Nicholas Clark
On Thu, Aug 05, 2004 at 10:51:46AM +0100, Nicholas Clark wrote: > It's this one again. Solaris 10 seems too new for it. OK, Solaris 10 is in > beta but this is the same pain as before. I should report this to the > ICU people. Reported as bug #4047 ICU 3 will build, pass all tests and install if

Re: Unicode Support - ICU Optional

2004-08-05 Thread Nicholas Clark
On Thu, Aug 05, 2004 at 10:51:46AM +0100, Nicholas Clark wrote: > On Wed, Aug 04, 2004 at 04:10:56AM -0700, Joshua Gatcomb wrote: > > > WRT improving the ease of use of ICU. My suggestion > > is that a representative from each platform that > > Parrot is currently being built on download the late

Re: Unicode Support - ICU Optional

2004-08-05 Thread Nicholas Clark
On Wed, Aug 04, 2004 at 04:10:56AM -0700, Joshua Gatcomb wrote: > WRT improving the ease of use of ICU. My suggestion > is that a representative from each platform that > Parrot is currently being built on download the latest > stable version of ICU source, build it, and note > anything "special"

Re: Unicode Support - ICU Optional

2004-08-04 Thread Dan Sugalski
At 4:10 AM -0700 8/4/04, Joshua Gatcomb wrote: All: After speaking with Dan in #parrot last night, I either had originally misunderstood his position or he has changed it (paraphrased): We will ship Parrot with unicode support, but: A. The unicode support does not necessarily need to be limited t

Re: Unicode step by step

2004-04-13 Thread Leopold Toetsch
Dan Sugalski <[EMAIL PROTECTED]> wrote: > At 6:22 PM +0200 4/13/04, Leopold Toetsch wrote: >>Marcus Thiesen wrote: >>>. Seems as if something doesn't get cleaned up in icu with a parrot >>>realclean. >> >>Yep. I've removed cleaning icu from clean/realclean[1]. > I think we need to put that back fo

Re: Unicode step by step

2004-04-13 Thread Dan Sugalski
At 6:22 PM +0200 4/13/04, Leopold Toetsch wrote: Marcus Thiesen wrote: . Seems as if something doesn't get cleaned up in icu with a parrot realclean. Yep. I've removed cleaning icu from clean/realclean[1]. I think we need to put that back for a bit, but with this: [1] If anyone puts that in again

Re: Unicode step by step

2004-04-13 Thread Leopold Toetsch
Marcus Thiesen wrote: . Seems as if something doesn't get cleaned up in icu with a parrot realclean. Yep. I've removed cleaning icu from clean/realclean[1]. $ make help | grep clean ... icu.clean: ... And there is always "make cvsclean". Have fun, Marcus leo [1] If anyone puts that in

Re: Unicode step by step

2004-04-13 Thread Marcus Thiesen
On Tuesday 13 April 2004 13:28, luka frelih wrote: > just a confirmation... > my i386 debian linux gives the same error repeatedly after make > realclean, > if i make again, it compiles a broken parrot which fails (too) many > tests... > > also it seems (to me) that parrot's configured choice of co

Re: Unicode step by step

2004-04-13 Thread luka frelih
just a confirmation... my i386 debian linux gives the same error repeatedly after make realclean, if i make again, it compiles a broken parrot which fails (too) many tests... also it seems (to me) that parrot's configured choice of compiler, linker, ... is not used in building icu? does icu ha

Re: Unicode step by step

2004-04-13 Thread Jeff Clites
BTW, it doesn't compile on any platform at the moment, after a realclean on the first "make" it complains about ../data/locales/ja.txt:15: parse error. Stopped parsing with U_INVALID_FORMAT_ERROR couldn't parse the file ja.txt. Error:U_INVALID_FORMAT_ERROR make[1]: *** [../data/out/build/icudt26l_

Re: Unicode step by step

2004-04-13 Thread Marcus Thiesen
On Saturday 10 April 2004 15:13, Leopold Toetsch wrote: > There is of course still the question: Should we really have ICU in the > tree. This needs tracking updates and patching (again) to make it build > and so on. For the sake of platform independence I'd say to keep it there. It's far easier i

Re: [PATCH] Re: Unicode step by step

2004-04-11 Thread Leopold Toetsch
Jeff Clites <[EMAIL PROTECTED]> wrote: > Here's a patch to src/pf_items.c, and a ppc t/native_pbc/number_3.pbc. Works. > If it's working correctly, the attached strings-and-byte-order.* should > both do the same thing--output the Angstrom symbol. If it's wrong, then > the pbc version should outp

Re: Unicode step by step

2004-04-11 Thread Leopold Toetsch
Jeff Clites <[EMAIL PROTECTED]> wrote: > On Apr 10, 2004, at 6:13 AM, Leopold Toetsch wrote: >> 2) String PBC layout. The internal string type has changed. This >> currently breaks native_pbc tests (that have strings) as well as some >> "parrot xx.pbc" tests related to strings. > These are workin

[PATCH] Re: Unicode step by step

2004-04-10 Thread Jeff Clites
On Apr 10, 2004, at 1:12 PM, Jeff Clites wrote: On Apr 10, 2004, at 6:13 AM, Leopold Toetsch wrote: 2) String PBC layout. The internal string type has changed. This currently breaks native_pbc tests (that have strings) as well as some "parrot xx.pbc" tests related to strings. These are working

Re: Unicode step by step

2004-04-10 Thread Jeff Clites
On Apr 10, 2004, at 6:13 AM, Leopold Toetsch wrote: 2) String PBC layout. The internal string type has changed. This currently breaks native_pbc tests (that have strings) as well as some "parrot xx.pbc" tests related to strings. These are working for me (which tests are failing for you?)--I did

Re: Unicode support in Emacs

2004-03-24 Thread Calle Dybedahl
> "Karl" == Karl Brodowsky <[EMAIL PROTECTED]> writes: > I get the impression that Unicode-support has kind of gone on top of > this stuff and I must admit that the way I am currently using > Unicode is to edit the stuff with \ucafe\ubabe-kind of replacements > and run perlscripts to convert f

Re: Unicode in Emacs (was: Semantics of vector operations)

2004-02-04 Thread Kurt Starsinic
On Feb 03, David Wheeler wrote: > On Feb 3, 2004, at 7:13 AM, Kurt Starsinic wrote: > > >No joke. You'll need to have the "mule-ucs" module installed. > >A quick Google search turns up plenty of sources. > > Oh, I have Emacs 21.3.50. Mule is gone. I'm afraid you're on your own, then. I

Re: Unicode under Windows (was RE: Semantics of vector operations)

2004-01-30 Thread Rod Adams
Austin Hastings wrote: From: Rod Adams [mailto:[EMAIL PROTECTED] Question in all this: What does one do when they have to _debug_ some code that was written with these lovely Unicode ops, all while stuck in an ASCII world? That's why I suggested a standard script for Unicode2Ascii be shi

RE: Unicode under Windows (was RE: Semantics of vector operations)

2004-01-29 Thread Austin Hastings
> -Original Message- > From: Austin Hastings [mailto:[EMAIL PROTECTED] > > From: Rod Adams [mailto:[EMAIL PROTECTED] > > > > Question in all this: What does one do when they have to _debug_ some > > code that was written with these lovely Unicode ops, all while stuck in > > an ASCII w

Re: Unicode, internationalization, C++, and ICU

2004-01-16 Thread Jonathan Worthington
> > > This page give instructions for building on Windows--it doesn't seem to > require installing bash or anything: > > http://oss.software.ibm.com/cvs/icu/~checkout~/icu/ > readme.html#HowToBuildWindows > > I assume that on Windows you don't need to run the configure script. > Thanks for that, I

Re: Unicode, internationalization, C++, and ICU

2004-01-16 Thread Jeff Clites
On Jan 15, 2004, at 3:33 PM, Jonathan Worthington wrote: - Original Message - From: "Dan Sugalski" <[EMAIL PROTECTED]> To: <[EMAIL PROTECTED]> Sent: Thursday, January 15, 2004 8:09 PM Subject: Unicode, internationalization, C++, and ICU Now, assuming there's still anyone left reading this

Re: Unicode, internationalization, C++, and ICU

2004-01-16 Thread Dan Sugalski
At 10:40 AM +0100 1/16/04, Michael Scott wrote: Maybe we can use someone else's solution... http://lists.ximian.com/archives/public/mono-list/2003-November/016731.html Could be handy. We really ought to detect a system-installed ICU and use that rather than our local copy at configure time, if it

Re: Unicode, internationalization, C++, and ICU

2004-01-16 Thread Michael Scott
Maybe we can use someone else's solution... http://lists.ximian.com/archives/public/mono-list/2003-November/ 016731.html On 16 Jan 2004, at 00:33, Jonathan Worthington wrote: - Original Message - From: "Dan Sugalski" <[EMAIL PROTECTED]> To: <[EMAIL PROTECTED]> Sent: Thursday, January 15

Re: Unicode, internationalization, C++, and ICU

2004-01-15 Thread Jonathan Worthington
- Original Message - From: "Dan Sugalski" <[EMAIL PROTECTED]> To: <[EMAIL PROTECTED]> Sent: Thursday, January 15, 2004 8:09 PM Subject: Unicode, internationalization, C++, and ICU > Now, assuming there's still anyone left reading this message... > > We've been threatening to build ICU in

Re: Unicode, internationalization, C++, and ICU

2004-01-15 Thread Dan Sugalski
At 10:34 PM +0100 1/15/04, Michael Scott wrote: Well I did originally have this in mind, but the more I looked into it the more I thought it needed someone with unicode experience. It seems to me that the unicode world is full of "ah but in North Icelandic Yiddish aleph is considered to be an in

Re: Unicode, internationalization, C++, and ICU

2004-01-15 Thread Michael Scott
Well I did originally have this in mind, but the more I looked into it the more I thought it needed someone with unicode experience. It seems to me that the unicode world is full of "ah but in North Icelandic Yiddish aleph is considered to be an infinitely composite character" and other such

Re: Unicode operators

2002-11-07 Thread Kurt D. Starsinic
On Nov 07, Dan Sugalski wrote: > Lacking a decent C++ compiler isn't necessarily a strike against > VMS--to be a strike against, there'd actually have to *be* a decent > C++ compiler... Doesn't VMS have a /bin/false? - Kurt

Re: Unicode operators

2002-11-07 Thread Dan Sugalski
At 1:27 PM -0800 11/6/02, Brad Hughes wrote: Flaviu Turean wrote: [...] 5. if you want to wait for the computing platforms before programming in p6, then there is quite a wait ahead. how about platforms which will never catch up? VMS, anyone? Not to start an OS war thread or anything, but why d

Re: Unicode operators

2002-11-07 Thread Brad Hughes
Flaviu Turean wrote: [...] 5. if you want to wait for the computing platforms before programming in p6, then there is quite a wait ahead. how about platforms which will never catch up? VMS, anyone? Not to start an OS war thread or anything, but why do people still have this mistaken impression o

vote no - Re: Unicode operators [Was: Re: UTF-8 and Unicode FAQ, demos]

2002-11-06 Thread David Dyck
The first message had many of the following characters viewable in my telnet window, but the repost introduced a 0xC2 prefix to the 0xA7 character. I have this feeling that many people would vote against posting all these funny characters, as it does make reading the perl6 mailing lists difficult

Re: Unicode operators [Was: Re: UTF-8 and Unicode FAQ, demos]

2002-11-05 Thread Damian Conway
Michael Lazzaro proposed: It's up to Larry, and he knows where we're all coming from. Unless anyone has any _new_ observations, I propose we pause the debate until a decision is reached? I second the motion! Damian

Re: Unicode operators [Was: Re: UTF-8 and Unicode FAQ, demos]

2002-11-05 Thread Damian Conway
Scott Duff wrote: I'm all for one or two unicode operators if they're chosen properly (and I trust Larry to do that since he's done a stellar job so far), but what's the mechanism to generate unicode operators if you don't have access to a unicode-aware editor/terminal/font/etc.? IS the only rec

Re: Unicode operators [Was: Re: UTF-8 and Unicode FAQ, demos]

2002-11-05 Thread Michael Lazzaro
As one of the instigators of this thread, I submit that we've probably argued about the Unicode stuff enough. The basic issues are now known, and it's known that there's no general agreement on any of this stuff, nor will there ever be. To wit: -- Extended glyphs might be extremely useful in

Re: Unicode operators

2002-11-05 Thread Flaviu Turean
one more data point from a person who lived, travelled and used computers in a few countries (Romania, France, Germany, Belgium, UK, Canada, US, Holland, Italy). paraphrasing: rule 1: if it's not on my keyboard, it doesn't exist; rule 2: if it's not on everybody's keyboard, it doesn't exist. lon

Re: Unicode operators [Was: Re: UTF-8 and Unicode FAQ, demos]

2002-11-05 Thread Richard Proctor
On Tue 05 Nov, Smylers wrote: > Richard Proctor wrote: > > > I am sitting at a computer that is operating in native Latin-1 and is > > quite happy - there is no likelihood that UTF* is ever likely to reach > > it. > > > > ... Therefore the only additional characters that could be used, that > > wil

Re: Unicode operators [Was: Re: UTF-8 and Unicode FAQ, demos]

2002-11-05 Thread Smylers
Richard Proctor wrote: > I am sitting at a computer that is operating in native Latin-1 and is > quite happy - there is no likelihood that UTF* is ever likely to reach > it. > > ... Therefore the only additional characters that could be used, that > will work under UTF8 and Latin-1 and Windows ...

Re: Unicode operators [Was: Re: UTF-8 and Unicode FAQ, demos]

2002-11-05 Thread Smylers
Dan Kogai wrote: > We already have source filters in perl5 and I'm pretty much sure > someone will just invent yet another 'use operators => "ascii";' kind > of stuff in perl6. I think that's backwards to have operators being funny characters by default but requiring explicit declaration to use w

Re: Unicode operators [Was: Re: UTF-8 and Unicode FAQ, demos]

2002-11-05 Thread Jonathan Scott Duff
I'm all for one or two unicode operators if they're chosen properly (and I trust Larry to do that since he's done a stellar job so far), but what's the mechanism to generate unicode operators if you don't have access to a unicode-aware editor/terminal/font/etc.? IS the only recourse to use the "n

Re: Unicode operators [Was: Re: UTF-8 and Unicode FAQ, demos]

2002-11-05 Thread Michael Lazzaro
Thanks, I've been hoping for someone to post that list. Taking it one step further, we can assume that the only chars that can be used are those which: -- don't have an obvious meaning that needs to be reserved -- appear decently on all platforms -- are distinct and recognizable in the tiny fon

Re: Unicode operators [Was: Re: UTF-8 and Unicode FAQ, demos]

2002-11-05 Thread Richard Proctor
This UTF discussion has got silly. I am sitting at a computer that is operating in native Latin-1 and is quite happy - there is no likelihood that UTF* is ever likely to reach it. The guillemets are coming through fine, but most of the other hieroglyphs leave a lot to be desired. Let's consider the

Re: Unicode thoughts...

2002-03-30 Thread Jeff
Dan Sugalski wrote: > > At 10:07 AM -0500 3/30/02, Josh Wilmes wrote: > >Someone said that ICU requires a C++ compiler. That's concerning to me, > >as is the issue of how we bootstrap our build process. We were planning > >on a platform-neutral miniparrot, and IMHO that can't include ICU (as i'

Re: Unicode thoughts...

2002-03-30 Thread Dan Sugalski
At 10:07 AM -0500 3/30/02, Josh Wilmes wrote: >Someone said that ICU requires a C++ compiler. That's concerning to me, >as is the issue of how we bootstrap our build process. We were planning >on a platform-neutral miniparrot, and IMHO that can't include ICU (as i'm >sure it's not going to be wr

Re: Unicode thoughts...

2002-03-30 Thread Josh Wilmes
Someone said that ICU requires a C++ compiler. That's concerning to me, as is the issue of how we bootstrap our build process. We were planning on a platform-neutral miniparrot, and IMHO that can't include ICU (as i'm sure it's not going to be written in pure ansi C) --Josh At 8:45 on 03/3

RE: Unicode thoughts...

2002-03-30 Thread Dan Sugalski
At 4:32 PM -0800 3/25/02, Brent Dax wrote: >I *really* strongly suggest we include ICU in the distribution. I >recently had to turn off mod_ssl in the Apache 2 distro because I >couldn't get OpenSSL downloaded and configured. FWIW, ICU in the distribution is a given if we use it. Parrot will re

Re: Unicode thoughts...

2002-03-25 Thread Jeff
Jeff wrote: > > Hong Zhang wrote: > > > > I think it will be relatively easy to deal with different compilers > > and different operating systems. However, ICU does contain some > > C++ code. It will make life much harder, since current Parrot > > only assumes ANSI C (even a subset of it). > > > > Hon

Re: Unicode thoughts...

2002-03-25 Thread Jeff
Hong Zhang wrote: > > I think it will be relatively easy to deal with different compilers > and different operating systems. However, ICU does contain some > C++ code. It will make life much harder, since current Parrot > only assumes ANSI C (even a subset of it). > > Hong > > > This is rather conce

RE: Unicode thoughts...

2002-03-25 Thread Hong Zhang
I think it will be relatively easy to deal with different compilers and different operating systems. However, ICU does contain some C++ code. It will make life much harder, since current Parrot only assumes ANSI C (even a subset of it). Hong > This is rather concerning to me. As I understand it, on

Re: Unicode thoughts...

2002-03-25 Thread Josh Wilmes
This is rather concerning to me. As I understand it, one of the goals for parrot was to be able to have a usable subset of it which is totally platform-neutral (pure ANSI C). If we start to depend too much on another library which may not share that goal, we could have trouble with the par

RE: Unicode thoughts...

2002-03-25 Thread Charles Bunders
> We also need to make sure ICU will work everywhere. And I do mean > *everywhere*. Will it work on VMS? Palm OS? Crays? Nope, nope, and nope. From their site - Operating system | Compiler | Testing frequency; Windows 98/NT/2000 | Microsoft Visual C++ 6.0 | Reference pl

RE: Unicode thoughts...

2002-03-25 Thread Brent Dax
Jeff: # This will likely open yet another can of worms, but Unicode has been # delayed for too long, I think. It's time to add the Unicode libraries # (In our case, the ICU libraries at , # which Larry has now blessed) to Parrot. string.c already has # (admittedly

Re: Unicode sorting...

2001-06-08 Thread Bryan C . Warnock
On Friday 08 June 2001 02:17 pm, NeonEdge wrote: > > Another example is that Chinese has no definite > > sorting order, period. The commonly used schemes are > > phonetic-based or stroke-based. Since many characters > > have more than one pronunciation (context sensitive) > > and more than one for

Re: Unicode sorting...

2001-06-08 Thread Jarkko Hietaniemi
> The A-Z syntax is really a shorthand for "All the uppercase letters". > (Originally at least) I won't argue the problems with sorting various sets > of characters in various locales, but for regexes at least it's not an > issue, because the point isn't sorting or ordering, it's identifying >

RE: Unicode sorting...

2001-06-08 Thread Dan Sugalski
At 11:29 AM 6/8/2001 -0700, Hong Zhang wrote: > > If this is the case, how would a regex like "^[a-zA-Z]" work (or other, >more > > sensitive characters)? If just about anything can come between A and Z, >and > > letters that might be there in a particular locale aren't in another >locale, > > th

Re: Unicode sorting...

2001-06-08 Thread Jarkko Hietaniemi
> > If this is the case, how would a regex like "^[a-zA-Z]" work (or other, > more > > sensitive characters)? If just about anything can come between A and Z, > and > > letters that might be there in a particular locale aren't in another > locale, > > then how will regex engine make the distinctio

RE: Unicode sorting...

2001-06-08 Thread Hong Zhang
> If this is the case, how would a regex like "^[a-zA-Z]" work (or other, more > sensitive characters)? If just about anything can come between A and Z, and > letters that might be there in a particular locale aren't in another locale, > then how will regex engine make the distinction? This synt
