date:20090130

Re: r25102 - docs/Perl6/Spec

2009-01-30 Thread Carl Mäsak

Mark (>), Moritz (>>), Larry via commit bot (>>>):
>>> +PERL # Lexical symbols in the standard "perlude"
>>
>> Did you mean prelude instead?
>
> I took the quotation marks to indicate an intentional
> misspelling/coinage:  perl + prelude = perlude.

At which point one might ask oneself whether it is more important that
the synopses be amusing and punny, or that they clearly specify what
is expected of a conforming Perl 6 implementation.

Now, just so you don't think I'm all cranky and humour-impaired: I got
the pun, I smiled a bit at it -- but I already know what a standard
prelude is. Those who don't are going to be confused in two ways when
they read the above, making the explanatory comment essentially
useless.

// Carl

Re: r25122 - docs/Perl6/Spec

2009-01-30 Thread Darren Duncan


pugs-comm...@feather.perl6.nl wrote:

 In the abstract, Perl is written in Unicode, and has consistent Unicode
-semantics regardless of the underlying text representations.
+semantics regardless of the underlying text representations.  By default
+Perl presents Unicode in NFG formation, where each grapheme counts as
+one character.  A grapheme is what the novice user would think of as a
+character in their normal everyday life, including any diacritics.


What's with this NFG / Normal Form G that you refer to?  I don't see any mention 
of that in http://unicode.org/reports/tr15/ ... did you mean NFC?


For that matter, is it possible for all realistic combinations of diacritics and 
base letters to be represented by a single Unicode codepoint, including all 
language-dependent graphemes?


I thought NFC sort of did one codepoint per grapheme but there were a few 
exceptions ... I could be wrong on that point.


-- Darren Duncan

Re: r25122 - docs/Perl6/Spec

2009-01-30 Thread Mark J. Reed

On Fri, Jan 30, 2009 at 6:30 AM, Darren Duncan dar...@darrenduncan.net wrote:
 pugs-comm...@feather.perl6.nl wrote:

 By default Perl presents Unicode in NFG formation, where each grapheme
 counts as one character.  A grapheme is what the novice user would think
 of as a character in their normal everyday life, including any diacritics.

 What's with this NFG / Normal Form G that you refer to?  I don't see any
 mention of that in http://unicode.org/reports/tr15/ ... did you mean NFC?

As far as I can tell, NFG isn't an official Unicode Normalization
Form; it's an HLL thing, and it has nothing to do with code points.
When you ask Perl6 for one character, what you get back (by default)
is one grapheme - presumably as defined by UAX #29 - which may be
one or more code points, and who knows how many bytes it winds up
encoded as in memory.

Applescript 2.0 takes this approach as well.

So are there any non-opaque, non-string grapheme representations?
Does ord() work on them?  In AS, the equivalent function is allowed to
return a list of numbers instead of just a single number; in either
case, the value can be passed to the chr() equivalent to get the same
grapheme back.

 For that matter, is it possible for all realistic combinations of diacritics
 and base letters to be represented by a single Unicode codepoint, including
 all language-dependent graphemes?

Absolutely not.  Again, nobody said anything about code points.
We're talking about Perl6's idea of characters.

-- 
Mark J. Reed markjr...@gmail.com
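
The distinction Mark draws above (one character is one grapheme, which may span several code points and still more bytes) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the helper names graphemes, grapheme_ord and grapheme_chr are hypothetical, and the segmentation rule (a base character plus any following combining marks) is a simplification of UAX #29, not the full algorithm.

```python
import unicodedata


def graphemes(text):
    """Split text into simplified grapheme clusters.

    Assumption: a cluster is one base character plus any following
    combining marks. Full UAX #29 segmentation has more rules.
    """
    clusters = []
    for ch in text:
        if clusters and unicodedata.combining(ch):
            clusters[-1] += ch      # attach the mark to the current cluster
        else:
            clusters.append(ch)     # start a new cluster
    return clusters


def grapheme_ord(cluster):
    """Return the code points of one grapheme as a list of integers."""
    return [ord(cp) for cp in cluster]


def grapheme_chr(codes):
    """Rebuild a grapheme from the list that grapheme_ord() returned."""
    return "".join(chr(c) for c in codes)


text = "re\u0301sume\u0301"             # "resume" with two decomposed acute accents
print(len(graphemes(text)))             # characters as a reader counts them
print(len(text))                        # code points
print(len(text.encode("utf-8")))        # bytes in one particular encoding
print(grapheme_ord(graphemes(text)[1])) # an ord() that returns a list
```

As in the AppleScript behaviour described above, the list returned by grapheme_ord can be passed straight back to grapheme_chr to recover the same grapheme.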

Re: r25102 - docs/Perl6/Spec

2009-01-30 Thread Larry Wall

On Fri, Jan 30, 2009 at 10:49:13AM +0100, Carl Mäsak wrote:
: Mark (>), Moritz (>>), Larry via commit bot (>>>):
: >>> +PERL # Lexical symbols in the standard "perlude"
: >>
: >> Did you mean prelude instead?
: >
: > I took the quotation marks to indicate an intentional
: > misspelling/coinage:  perl + prelude = perlude.
: 
: At which point one might ask oneself whether it is more important that
: the synopses be amusing and punny, or that they clearly specify what
: is expected of a conforming Perl 6 implementation.
: 
: Now, just so you don't think I'm all cranky and humour-impaired: I got
: the pun, I smiled a bit at it -- but I already know what a standard
: prelude is. Those who don't are going to be confused in two ways when
: they read the above, making the explanatory comment essentially
: useless.

You must understand that part of the reason I wrote that is to
remind folks that we're *not* talking about a standard prelude here.
The prelude metaphor says that it's something that comes before your
program, but that's not what we want.  We want something that comes
outside your program, that is, a lexical scope that *surrounds* the
file scope.  We don't have a good word for that: circumlude?  ambilude?

So that's why I said perlude.  Well, that, and it was a pun. :)

The concept here is that any lexical scope can parse a token that
says "snapshot me here at this depth", and then there's a mechanism
for inserting the new main program in that lexical scope at startup.
It not only gives us the standard outerlude, but allows us to start
up the parser in any language we care to specify by snapshot name.
Special cases might even have their own switches, which is why S19
talks about implementing -n and -p by substituting a different prelude.

But then it's not just a prelude, because it's supplying an implicit
loop around the main code as part of the definition of the language
you're using.

So I'm open to suggestions for what we ought to call that envelope
if we don't call it the prelude or the perlude.  Locale is bad,
environs is bad, context is bad...the wrapper?  But we have dynamic
wrappers already, so that's bad.  Maybe the setting, like a jewel?
That has a nice static feeling about it at least, as well as a feeling
of surrounding.

Or we could go with a more linguistic contextual metaphor.  Argot,
lingo, whatever...

So anyway, just because other languages call it a prelude doesn't
mean that we have to.  Perl is the tail that's always trying to
wag the dog...

What is the sound of one tail wagging?

Larry
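
As a rough analogy for the mechanism Larry describes (a named outer scope into which the main program is inserted at startup, with the -n case supplying an implicit loop), here is a small Python sketch. Everything in it is an assumption made for illustration: the names SETTINGS, plain_setting, n_setting and run are hypothetical, and a dictionary of names stands in for a real surrounding lexical scope, which Python cannot graft onto separately written code.

```python
def plain_setting(main, lines, out):
    """Default wrapper: supply the outer names, then run main once."""
    env = {"say": out.append}       # names the main program sees from outside
    main(env)


def n_setting(main, lines, out):
    """Wrapper in the spirit of -n: an implicit loop around the main code."""
    env = {"say": out.append}
    for line in lines:
        env["line"] = line          # the current input line, set by the wrapper
        main(env)


# Wrappers are looked up by name, much like picking a snapshot by name.
SETTINGS = {"default": plain_setting, "-n": n_setting}


def run(main, setting="default", lines=()):
    """Insert the main program into the chosen wrapper and collect its output."""
    out = []
    SETTINGS[setting](main, lines, out)
    return out


def shout(env):
    env["say"](env["line"].upper())


print(run(shout, setting="-n", lines=["one", "two"]))
print(run(lambda env: env["say"]("hello")))
```

The main program never mentions the loop; which wrapper surrounds it decides whether it runs once or once per line, which is the sense in which the wrapper is more than something that merely comes before the program.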

Re: r25102 - docs/Perl6/Spec

2009-01-30 Thread Mark J. Reed

On Fri, Jan 30, 2009 at 11:30 AM, Larry Wall la...@wall.org wrote:
  We want something that comes
 outside your program, that is, a lexical scope that *surrounds* the
 file scope.  We don't have a good word for that: circumlude?  ambilude?
[...]
 Or we could go with a more linguistic contextual metaphor.  Argot,
 lingo, whatever...

If we're being all linguistical, how about circumlect?

-- 
Mark J. Reed markjr...@gmail.com

Re: r25102 - docs/Perl6/Spec

2009-01-30 Thread Jon Lang

Larry Wall wrote:
 So I'm open to suggestions for what we ought to call that envelope
 if we don't call it the prelude or the perlude.  Locale is bad,
 environs is bad, context is bad...the wrapper?  But we have dynamic
 wrappers already, so that's bad.  Maybe the setting, like a jewel?
 That has a nice static feeling about it at least, as well as a feeling
 of surrounding.

 Or we could go with a more linguistic contextual metaphor.  Argot,
 lingo, whatever...

 So anyway, just because other languages call it a prelude doesn't
 mean that we have to.  Perl is the tail that's always trying to
 wag the dog...

 What is the sound of one tail wagging?

whoosh, whoosh.

I tend to like setting, because it makes me think of the setting of
a play, in which the actors (i.e., objects) perform their assigned
roles in following the script.

-- 
Jonathan Dataweaver Lang

Re: r25102 - docs/Perl6/Spec

2009-01-30 Thread Patrick R. Michaud

On Fri, Jan 30, 2009 at 08:30:25AM -0800, Larry Wall wrote:
 So anyway, just because other languages call it a prelude doesn't
 mean that we have to.  Perl is the tail that's always trying to
 wag the dog...
 
 What is the sound of one tail wagging?

For my dog Sally, the sound of one tail wagging is regularly
used to indicate that she believes I'm in desperate need of taking
her on a walk.

Pm

Re: r25122 - docs/Perl6/Spec

2009-01-30 Thread Larry Wall

On Fri, Jan 30, 2009 at 03:30:02AM -0800, Darren Duncan wrote:
 pugs-comm...@feather.perl6.nl wrote:
  In the abstract, Perl is written in Unicode, and has consistent Unicode
 -semantics regardless of the underlying text representations.
 +semantics regardless of the underlying text representations.  By default
 +Perl presents Unicode in NFG formation, where each grapheme counts as
 +one character.  A grapheme is what the novice user would think of as a
 +character in their normal everyday life, including any diacritics.

 What's with this NFG / Normal Form G that you refer to?  I don't see any 
 mention of that in http://unicode.org/reports/tr15/ ... did you mean NFC?

Nope, this is a Perl/Parrot idea.  It started out with a notion of mine a
year ago.  Search for 'grapheme' in

http://use.perl.org/~chromatic/journal/35461

We named it NFG about the time Simon Cozens wrote a PDD for it for parrot.
At the moment it's much better specced in Parrotland than in P6land.  See

http://www.parrotcode.org/docs/pdd/pdd28_strings.html

NFG stands for Normalization Form G, where the G is short for
grapheme.  And before anyone asks, yes, we were aware of the other
gloss for NFG when we picked it.  :)

 For that matter, is it possible for all realistic combinations of 
 diacritics and base letters to be represented by a single Unicode 
 codepoint, including all language-dependent graphemes?

No, that is the vision of NFC, but there are potentially an infinite number of
graphemes that can be composed in Unicode.  NFG aims to represent each
of those locally as a single integer, and translate back out to a more
standard normalization form on output.

 I thought NFC sort of did one codepoint per grapheme but there were a few 
 exceptions ... I could be wrong on that point.

You are correct, NFC doesn't do all that we want.

By the way, we could use someone to write the Perl 6 Unicode synopsis,
based on PDD 28.

Larry
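
The idea of representing each grapheme locally as a single integer, and translating back to a standard form on output, might look roughly like the following Python sketch. It is not the PDD 28 design: the class name GraphemeTable, the use of negative integers for graphemes that have no single NFC code point, and the simplified clustering rule (base character plus following combining marks) are all assumptions chosen to keep the example short.

```python
import unicodedata


class GraphemeTable:
    """Map graphemes to single integers within one process."""

    def __init__(self):
        self._by_cluster = {}   # composed cluster -> synthetic id
        self._by_id = {}        # synthetic id -> composed cluster

    def encode(self, text):
        """Return one integer per grapheme."""
        ids = []
        for cluster in self._clusters(text):
            composed = unicodedata.normalize("NFC", cluster)
            if len(composed) == 1:
                ids.append(ord(composed))           # NFC already has one code point
            else:
                ids.append(self._synthetic_id(composed))
        return ids

    def decode(self, ids):
        """Translate integers back out to an NFC string."""
        return "".join(chr(i) if i >= 0 else self._by_id[i] for i in ids)

    def _synthetic_id(self, cluster):
        # Allocate ids on first sight: -1, -2, ... (an illustrative choice).
        if cluster not in self._by_cluster:
            new_id = -(len(self._by_cluster) + 1)
            self._by_cluster[cluster] = new_id
            self._by_id[new_id] = cluster
        return self._by_cluster[cluster]

    @staticmethod
    def _clusters(text):
        # Simplified segmentation, not full UAX #29.
        clusters = []
        for ch in text:
            if clusters and unicodedata.combining(ch):
                clusters[-1] += ch
            else:
                clusters.append(ch)
        return clusters


table = GraphemeTable()
source = "e\u0301q\u0301"       # e + acute composes in NFC; q + acute does not
ids = table.encode(source)
print(ids)
print(table.decode(ids) == unicodedata.normalize("NFC", source))
```

Because ids are handed out in the order graphemes are first seen, the same grapheme can receive a different number in a different run, which is why the numbers are only meaningful locally.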

Re: r25122 - docs/Perl6/Spec

2009-01-30 Thread Geoffrey Broadwell

On Fri, 2009-01-30 at 08:12 +0100, pugs-comm...@feather.perl6.nl wrote:
 @@ -103,7 +106,7 @@
  =item *
  
  POD sections may be used reliably as multiline comments in Perl 6.
 -Unlike in Perl 5, POD syntax now requires that C<=begin comment>
 +Unlike in Perl 5, POD syntax now lets you use C<=begin comment>
  and C<=end comment> delimit a POD block correctly without the need
  for C<=cut>.  (In fact, C<=cut> is now gone.)  The format name does
  not have to be C<comment> -- any unrecognized format name will do

I believe that with this change in wording the next line needs to use
'to delimit' rather than just 'delimit'.


-'f

Re: r25122 - docs/Perl6/Spec

2009-01-30 Thread Larry Wall

On Fri, Jan 30, 2009 at 10:28:43AM -0800, Geoffrey Broadwell wrote:
: On Fri, 2009-01-30 at 08:12 +0100, pugs-comm...@feather.perl6.nl wrote:
:  @@ -103,7 +106,7 @@
:   =item *
:   
:   POD sections may be used reliably as multiline comments in Perl 6.
:  -Unlike in Perl 5, POD syntax now requires that C<=begin comment>
:  +Unlike in Perl 5, POD syntax now lets you use C<=begin comment>
:   and C<=end comment> delimit a POD block correctly without the need
:   for C<=cut>.  (In fact, C<=cut> is now gone.)  The format name does
:   not have to be C<comment> -- any unrecognized format name will do
: 
: I believe that with this change in wording the next line needs to use
: 'to delimit' rather than just 'delimit'.

You've got a commit bit, I believe.  :)

Larry

Re: r25122 - docs/Perl6/Spec

2009-01-30 Thread Darren Duncan


Larry Wall wrote:

On Fri, Jan 30, 2009 at 03:30:02AM -0800, Darren Duncan wrote:
What's with this NFG / Normal Form G that you refer to?  I don't see any 
mention of that in http://unicode.org/reports/tr15/ ... did you mean NFC?


Nope, this is a Perl/Parrot idea.  It started out with a notion of mine a
year ago.  Search for 'grapheme' in

http://use.perl.org/~chromatic/journal/35461

We named it NFG about the time Simon Cozens wrote a PDD for it for parrot.
At the moment it's much better specced in Parrotland than in P6land.  See

http://www.parrotcode.org/docs/pdd/pdd28_strings.html


Okay, I understand now.  NFG is designed purely as a temporary, in-process normal 
form: the number assigned to a given character can't be relied on to stay the 
same over the long term, unlike NFC/D/KC/KD/etc.


It does occur to me, though, that as long as we include the generated lookup 
table (not required for NFC/etc), NFG can be serialized as is and be 
unambiguously understood by NFG-savvy programs over the long term.


Much how LZW (name?) compression works, that includes its own lookup table.

So as long as this nature of NFG is understood, and if necessary any serialized 
forms will include a spec version num / etc as protection in the face of 
upgrades, this could also stand to be a standard beyond Perl/Parrot/etc.


I wonder if the Unicode consortium would be interested in adopting an NFG-alike, 
or whether that would be beyond their scope?



By the way, we could use someone to write the Perl 6 Unicode synopsis,
based on PDD 28.


Well, if someone else doesn't do it first, I don't think it would be too 
difficult for me to do this, at least the initial based-on-PDD-28 cut; however 
it would likely be a few weeks before I get around to it, partly since I don't 
have a Pugs repo checkout in place ... maybe when I port the new Set::Relation 
to Perl 6, requiring such a checkout, I may do that too ... but don't wait for me.


By the way, in the meantime, someone should update that reference to NFG in S02 
to include a link to PDD 28, so other people encountering it don't have to 
ask the same question I did.


-- Darren Duncan
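
Darren's suggestion above, that NFG data could be serialized as is provided the generated lookup table and a spec version number travel with it, can be sketched in Python as follows. The JSON layout, the field names version, table and chars, and the function names dump_nfg and load_nfg are all hypothetical; no such format is defined in the thread or, as far as this sketch assumes, anywhere else.

```python
import json

FORMAT_VERSION = 1  # hypothetical version number guarding against later changes


def dump_nfg(ids, table):
    """Serialize grapheme ids together with the table entries they rely on.

    `table` maps each synthetic (negative) id to its code point sequence.
    """
    used = {str(i): [ord(c) for c in table[i]]
            for i in sorted(set(ids)) if i < 0}
    return json.dumps({"version": FORMAT_VERSION, "table": used, "chars": ids})


def load_nfg(payload):
    """Decode a payload using only the table it carries."""
    doc = json.loads(payload)
    if doc["version"] != FORMAT_VERSION:
        raise ValueError("unsupported format version: %r" % doc["version"])
    table = {int(k): "".join(chr(c) for c in v)
             for k, v in doc["table"].items()}
    return "".join(chr(i) if i >= 0 else table[i] for i in doc["chars"])


# Two writers that happened to assign different ids to the same grapheme.
payload_a = dump_nfg([233, -1, 33], {-1: "q\u0301"})
payload_b = dump_nfg([233, -7, 33], {-7: "q\u0301"})
print(load_nfg(payload_a) == load_nfg(payload_b))
```

The reader never needs the writer's in-process table: the ids differ between the two payloads, yet both decode to the same text because each payload is self-describing.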
