Re: new sigil

2005-10-22 Thread Autrijus Tang
Juerd wrote:
 I do not see why $ and @ couldn't be both a sigil and an infix
 operator, and the same goes for whatever ASCII equivalent ¢ gets.
 
 ^ and | are available for sigil use. (All the closing brackets are too,
 but that would be very confusing because we tend to visually parse those
 in pairs.)
 
 Using an infix operator's symbol as a sigil is not weird, not wrong,
 not confusing and, most of all, not a new idea.

Indeed.  Somehow I think this makes some sense:

sub Bool eqv (|T $x, |T $y) { ... }

Thanks,
/Autrijus/


Re: new sigil

2005-10-22 Thread Nicholas Clark
On Fri, Oct 21, 2005 at 09:42:00AM +0100, Carl Franks wrote:
 Where did you get ALT-155 from?

Code page 437:

http://www.kostis.net/charsets/cp437.htm

On Fri, Oct 21, 2005 at 06:07:47AM -0500, Steve Peters wrote:
 On Fri, Oct 21, 2005 at 09:42:00AM +0100, Carl Franks wrote:
  Where did you get ALT-155 from?
  
  I've just checked the windows Character Map, and ¢ (cent) is ALT-0162
  ( If it's not in your startmenu, do start - run - charmap )
 
 Actually, both work.  That's where the issue with the documentation starts.

what he says

This is going to be hard to document well.

For example, *I* know why the leading zero is significant on ALT-0162, but
how many people are going to assume that it's not?
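
To make the leading-zero point concrete, here is a minimal sketch. It assumes the commonly described Windows behaviour: an ALT code without a leading zero is looked up in the OEM code page (taken here to be CP437), and one with a leading zero in the "ANSI" code page (taken here to be Windows-1252). The function name alt_code is hypothetical, and the model is simplified (it ignores the special glyphs CP437 shows for values below 32).

```python
# Sketch only: models which code page an ALT code is looked up in.
OEM_CODE_PAGE = "cp437"     # assumption: the DOS/OEM code page
ANSI_CODE_PAGE = "cp1252"   # assumption: the Windows "ANSI" code page


def alt_code(digits: str) -> str:
    """Return the character a Windows ALT code would produce (simplified)."""
    value = int(digits)
    if not 0 <= value <= 255:
        raise ValueError("ALT codes modelled here are limited to 0..255")
    # A leading zero selects the ANSI code page; otherwise the OEM one.
    codec = ANSI_CODE_PAGE if digits.startswith("0") else OEM_CODE_PAGE
    return bytes([value]).decode(codec)


cent_via_oem = alt_code("155")     # cent sign, byte 0x9B in CP437
cent_via_ansi = alt_code("0162")   # cent sign, byte 0xA2 in Windows-1252
without_zero = alt_code("162")     # byte 0xA2 in CP437 is a different letter
print(cent_via_oem, cent_via_ansi, without_zero)
```

Under these assumptions both ALT-155 and ALT-0162 give the cent sign, while dropping the zero from ALT-0162 gives an accented "o" instead.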

Anyone care to save to a file called AUX.TXT on Windows?

And for anyone who says upgrade, please note that many firms in the real
world are still forcing a base perl version of 5.005_03 or 5.6.1 for
development. Still.

The active perl community is not wholly representative of the global usage
of perl, and would do well to remember this. For example, see
http://use.perl.org/~barbie/journal/27098

Nicholas Clark


Re: new sigil

2005-10-22 Thread John Adams
-Original Message-
From: Nicholas Clark [EMAIL PROTECTED]

 And for anyone who says upgrade, please note that many firms in the real
world are still forcing a base perl version of 5.005_03 or 5.6.1 for
development. Still.

My weekend project is to demonstrate that you are an optimist. Really.


Re: new sigil

2005-10-22 Thread Nicholas Clark
At the risk of reinforcing my apparent optimism.

On Thu, Oct 20, 2005 at 04:02:10PM -0700, Darren Duncan wrote:

 that the next best one to exploit is ¤ (euro; 
 unicode=20AC; utf8=E282AC), and the next best is 

Woah. You've just demonstrated why Euro is far worse than any of the other
Unicode characters so far suggested. Your mail headers say:

Content-Type: text/plain; charset=iso-8859-1 ; format=flowed

The symbol in your message *as sent* is the international currency symbol,
U+00A4. The Euro symbol is not part of ISO-8859-1.
(ISO-8859-15 yes, but that's about 10 years more recent)
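
The mix-up can be reproduced in a few lines. This is a minimal sketch using Python's standard codecs; the helper name fits_latin1 is hypothetical. It shows that the byte 0xA4 is the currency sign under ISO-8859-1 but the Euro sign under ISO-8859-15, and that the Euro sign cannot be encoded in ISO-8859-1 at all.

```python
# Sketch only: one byte, two meanings, depending on the declared charset.
RAW = b"\xa4"

as_latin1 = RAW.decode("iso-8859-1")    # U+00A4 CURRENCY SIGN
as_latin9 = RAW.decode("iso-8859-15")   # U+20AC EURO SIGN
euro_utf8 = "\u20ac".encode("utf-8")    # the three bytes E2 82 AC


def fits_latin1(text: str) -> bool:
    """Report whether text can be sent unchanged as ISO-8859-1."""
    try:
        text.encode("iso-8859-1")
    except UnicodeEncodeError:
        return False
    return True


print(hex(ord(as_latin1)), hex(ord(as_latin9)), euro_utf8.hex())
print(fits_latin1("\u00a4"), fits_latin1("\u20ac"))
```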

ISO-8859-1 has been the default standard for the character set on most
Internet protocols for a long time, and many systems for the past 10+ years
have supported it by default (most Unix variants, Windows 3.1, I think.
DOS boxes were CP437, but native Windows was (extended) ISO-8859-1)

The same cannot be said for ISO-8859-15. Given how widespread ISO-8859-1
support is, I can see little reason why any currently operational system
would be incapable of correctly displaying the ISO-8859-1 operators in
scripts or CPAN modules, even if the editor constrains the maintenance
programmer (or sysadmin) to entering the ASCII digraphs.

But there will be a lot of systems out there where this is not true for the
Euro symbol, and the assumption of ISO-8859-1 defaults will mean that this
won't be the last time that Euro symbols are going to get mangled during
transit, with all the ensuing pain, frustration, losses and defections to
other languages that this will cause.

Perl 5 runs everywhere: http://www.cpan.org/ports/index.html

Perl 6 is intended to be an improvement on Perl 5. It would be a shame to
design in restrictions on portability.

Nicholas Clark



Re: new sigil

2005-10-22 Thread Aaron Crane
Kaoru Maeda writes:
 Darren Duncan wrote:
  the next best is £
 Isn't that 0x23 in UK?  I imagine that someday all the comment lines 
 cause syntax errors in UK...

U+00A3 POUND SIGN is at 0x23 in ISO 646-GB (aka BS 4730), true.
Fortunately, that character set is almost never used.  I think the last
time I encountered it was on a dot-matrix printer manufactured in the
1980s.

Hmmm.  Encode.pm doesn't seem to have support available for any of the
ISO 646 character sets.  I feel a patch coming on.
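
For illustration, a minimal sketch of what decoding such text involves, written in Python rather than as an Encode.pm patch. The function name decode_iso646_gb is hypothetical, and the sketch models only the 0x23 substitution mentioned above; any other differences between ISO 646-GB and ASCII are deliberately left out.

```python
# Sketch only: treats ISO 646-GB as ASCII with 0x23 meaning POUND SIGN.
POUND_SIGN = "\u00a3"


def decode_iso646_gb(data: bytes) -> str:
    """Decode 7-bit ISO 646-GB text (partial model: only byte 0x23 differs)."""
    text = data.decode("ascii")           # rejects any byte above 0x7F
    return text.replace("#", POUND_SIGN)  # byte 0x23 is the pound sign here


decoded = decode_iso646_gb(b"# this would no longer be a comment")
print(decoded)
```

Read this way, a line that starts with byte 0x23 begins with a pound sign rather than a comment marker, which is the failure Kaoru Maeda imagines.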

-- 
Aaron Crane


Re: new sigil

2005-10-22 Thread John Macdonald
On Fri, Oct 21, 2005 at 09:35:12AM -0400, Rob Kinyon wrote:
 On 10/21/05, Steve Peters [EMAIL PROTECTED] wrote:
  On Fri, Oct 21, 2005 at 02:37:09PM +0200, Juerd wrote:
   Steve Peters skribis 2005-10-21  6:07 (-0500):
    Older versions of Eclipse are not able to enter these characters.
    That's where the copy and paste comes in.
  
   That's where upgrades come in.
  
  That's where lots of money to update to the next version of WSAD becomes the
  limiting factor.
 
 So, you are proposing that the Perl of the Unicode era be limited to
 ASCII because a 15 year old editor cannot handle the charset? That's
 like suggesting that operating systems should all be bootable from a
 single floppy because not everyone has access to a CD drive.

Um, that's not what I'm hearing.

To type in a Unicode character requires machinations beyond just
hitting a labelled key on the keyboard.  There are no standards
for these machinations - what must be done is different for
Windows vs. Linux, and different for specific applications
(text-mode mutt vs. xvi vs. Eclipse vs. ...).

So, a book can't just show code and expect the reader to be
able to use it, and no book is going to be able to tell all
of its users how to type the characters because there are so
many different ways.

Any serious programmer will be able to sort out how to do
things but casual programmers won't be typing the extended
characters enough to learn how to do it without looking it
up each time.  Programmers that use many different computers
and applications will have difficulty remembering which of
the various incantations happen to work on the system they're
currently using.  People who do sort out a good working
environment will be at a loss when they occasionally have to do
something on a different system and no longer know how to type
basic characters.  (But since in their normal environment they
do know how, they may never have known the ASCII workarounds,
so they'll have to look them up.)  I've gotten away from
programming enough that I often have to look up a function
or operator definition to check on details; but that is much
less disruptive to the thought process than having to look up
how to type a character.

I think that the reasons for using Unicode characters are good
ones and that there is no good alternative.  However, doing
so does make Perl less accessible for casual programmers.
(While we may deride the "Learn to Web Program in 5 Minutes"
crowd, that did get many people involved with Perl, and I'm
sure some of them evolved beyond those limited roots, just
as some of an earlier generation of programmers started
with Basic and nonetheless became competent and knowledgeable
craftsmen.)

We need to have a "Why Unicode is the lesser of evils" document
to refer to whenever this issue rises up again.  The genuine
problems involved ensure that the issue will continue to arise,
so we can't just get mad at the people who raise it.

-- 


Re: new sigil

2005-10-22 Thread Darren Duncan

At 3:26 PM +0100 10/22/05, Nicholas Clark wrote:

At the risk of reinforcing my apparent optimism.

On Thu, Oct 20, 2005 at 04:02:10PM -0700, Darren Duncan wrote:


 that the next best one to exploit is ¤ (euro;
 unicode=20AC; utf8=E282AC), and the next best is


Woah. You've just demonstrated why Euro is far worse than any of the other
Unicode characters so far suggested. Your mail headers say:

Content-Type: text/plain; charset=iso-8859-1 ; format=flowed

The symbol in your message *as sent* is the international currency symbol,
U+00A4. The Euro symbol is not part of ISO-8859-1.
(ISO-8859-15 yes, but that's about 10 years more recent)


Actually, what you point out in my message is a limitation of my email
client, which I didn't realize existed until now.

I then did a bit of research, and apparently the newest Eudora doesn't
support customizing the character set that messages are composed with;
it always sends them as ISO-8859-1.  This is apparently an issue that
many Eudora users have asked to have fixed but that hasn't been addressed.

That said, sending UTF-8 files as email attachments, rather than UTF-8 in
the message body, still works fine, AFAIK, as does transmitting them by
other means such as http or ftp.

And my normal text editor handles UTF-8 correctly.

Also, apparently some other email clients handle UTF-8 properly.

So my email client failed me, but my point still stands that Unicode
characters should be embraced in Perl 6.  I just need to replace my email
client if I want to type them into the message body.


-- Darren Duncan