Re: continuation markers for long literals (was Re: r29931 - docs/Perl6/Spec)

2010-03-04 Thread Darren Duncan

Larry Wall wrote:

Dealing with antediluvian displays sounds like a good spot for that
ancient technology, the preprocessor.

: I also figured that this would be a fairly simple thing to do.

Well, it will be simple, once we have macros; in fact, textual macros
can be regarded simply as scoped preprocessors, with all the rights,
privileges, and responsibilities pertaining thereto.  I think macros
will provide enough language support for this sort of "hard things
should be possible" escape hatch.  And remember you can always override
the grammar if you have special reasons for doing so.  That's what
Perl 6 is all about.  It's not about foreseeing every possible twinge
of misgiving that anyone may come to feel in the next 100 years...

Sure, we're trying to create a gigantic sweet spot in Perl 6, but
Willy Wonka knows you can't have the whole world, and if you could,
you can't have it now.  :)


FYI,

As a follow-up, this Perl 6 discussion has inspired me to apply some of these 
ideas to my Muldis D language (PTMD_STD dialect), so then there should be an 
earlier example of the concept I proposed.


The feature that I went and specced out today, and named unspace after its 
Perl 6 inspiration, basically is a token consisting of a pair of backslashes 
separated by whitespace.  Currently this token can be placed in the middle of 
either a numeric or a stringy literal (including quoted formats of identifiers), 
and in the latter case *inside* the quotes.


Examples:

  $bar := 7;55084\ \4222\ \7677  # an integer expressed in octal #

  $some_pi := 3.141592653589793238462643383279\
\5028841971693993751058209749445923078164\
\0628620899862803482534211706798214808651\
\3282306647093844609550582231725359408128

  $a_rat := 48111745028410270193\
\8521105559 / 64462294895493038196\
\442881097566593344612847564823

  $a_string := 'hello this world\
\ how are you today'

  $a_string := 'hello this world'
~ ' how are you today'

Those, except the last, are direct translations of the pseudo-Perl-6 I gave 
earlier in the thread.


Note that since Muldis D's use of backslashes differs in some ways from how Perl 
6 uses them (backslashes are only otherwise used for spelling escaped 
characters), the above syntax doesn't conflict with anything there.


One detail unlike Perl 6's unspace is that I put a backslash at both ends of the 
stuff to ignore, so when used for the above purpose the code isn't end-weighted 
and a human can parse what's going on faster.  Also, my version of unspace is 
simpler in that you don't put any comments in it.
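
As a rough illustration of the rule described above, here is a minimal sketch in Python (not Muldis D or Perl 6) of a pass that deletes every backslash, whitespace, backslash token from the text of a literal. The name strip_unspace and the regular expression are assumptions made for this sketch; it is deliberately naive and makes no attempt to tell an unspace token apart from an escaped backslash that happens to be followed by whitespace.

```python
import re

# Hypothetical helper, not part of Muldis D or Perl 6: matches a backslash,
# one or more whitespace characters (newlines included), then a backslash.
UNSPACE = re.compile(r"\\\s+\\")


def strip_unspace(literal_text: str) -> str:
    """Remove every backslash-whitespace-backslash token from the text."""
    return UNSPACE.sub("", literal_text)


# Digits of the octal example, with the unspace tokens on one line.
octal_digits = strip_unspace(r"55084\ \4222\ \7677")

# The string example, with the token spanning a line break inside the quotes.
wrapped = strip_unspace("'hello this world\\\n\\ how are you today'")

print(octal_digits)
print(wrapped)
```

Because the token carries a backslash at both ends, a single pattern is enough to find it, and the space that belongs to the string value (before "how") survives since it sits after the closing backslash.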


Now if naming my concept unspace may confuse people, as it is partially like 
that of Perl 6 but partly not, I could name it something else, but that name 
seemed the best for now, because it is all about letting a programmer put some 
whitespace in their code which the parser would then treat as if it wasn't there 
at all.


-- Darren Duncan


continuation markers for long literals (was Re: r29931 - docs/Perl6/Spec)

2010-03-03 Thread Darren Duncan

pugs-comm...@feather.perl6.nl wrote:

Modified:
   docs/Perl6/Spec/S02-bits.pod
Log:
[S02] remove 1/2 and +2-3i literal forms, now rely on angle dwimmery for 
literals,
or constant folding otherwise.

<snip>

I find this an interesting change, and I can see how it would simplify some 
things, even though I would miss the old behavior.


But this reminds me of what I see as a tangential issue, which I want to raise.

How would Perl 6 support someone wanting to write a numeric literal that is so 
long that they would want to split it over multiple source code lines, such as a 
very long integer that takes a few hundred or thousand characters to write, or 
an X/Y rational composed of 2 such integers, but they want to keep their source 
code under the 80 chars per line mark?


I'm not currently aware that Perl 6 provides some kind of continuation marker 
that one could put between pieces of such a literal, so that they could split 
those pieces otherwise with whitespace but then the parser would treat the code 
as if said whitespace wasn't there, but I think Perl 6 should have this.  It 
would need to work both outside any quoting constructs as well as inside any 
angle dwimmery.


On one hand I would think the mnemonics of ~, which are stitching things 
together, would work great for a continuation marker, but that ~ seems to 
already be established in Perl 6 as indicating a string data context, such that 
it is used for casting things into Str or catenating 2 strings.  However, I will 
use ~ below for the sake of illustration.


  my $some_pi = 3.141592653589793238462643383279
~ 5028841971693993751058209749445923078164
~ 0628620899862803482534211706798214808651
~ 3282306647093844609550582231725359408128;
  my $a_rat = 48111745028410270193
~ 8521105559/64462294895493038196
~ 442881097566593344612847564823;

As a slight extension to this, one should be able to use that same continuation 
character between 2 consecutive string literals so that they are parsed as if 
they were one string literal, so that one could also split those over source 
code lines, without the vagaries of source code line endings affecting the 
value of the string like a here-doc or literal line breaks would.  I grant that 
this could be redundant with regular constant folding of the already defined ~ 
operator, but using the continuation marker instead for this could spare concern 
about precedence issues same as <1/2> does versus 1/2 after today's changes.


  my $a_string = 'hello this world'
~ ' how are you today';

Now I think in the wider world some precedent exists for using the logical-not 
character ¬ as a continuation marker, but that isn't an ASCII symbol and we 
would want something ASCII for the continuation marker.  Also I think using the 
backslash for such a marker would be a bad idea.


While this isn't an operator per se, if it had to be put in the precedence 
table, I would think it would have the highest possible precedence; it would be 
eliminated during one of the earliest parsing phases, during tokenization I 
believe, and then all the other parsing rules would come into effect following 
that elimination, except for the big one that any literal continuation chars 
inside a quoted string are taken as normal characters as usual.


So can we please have this continuation marker thing, and what do you think it 
should look like?


Thank you in advance.

-- Darren Duncan



Re: continuation markers for long literals (was Re: r29931 - docs/Perl6/Spec)

2010-03-03 Thread Darren Duncan

Mark J. Reed wrote:

Doesn't unspace work for this?


It would seem that S02 says otherwise:

Although we say that the unspace hides the whitespace from the parser, it 
does not hide whitespace from the lexer.  As a result, unspace is not allowed 
within a token.


So, assuming that an integer literal at least, and maybe also an angle dwimmery, 
is a single token, then that wouldn't work.


If unspace did the job, I should be able to say this:

  my $foo = 3.1415926535897\
93238462643383279;

or:

  my $foo = 3.1415926535897\ 93238462643383279;

and it would be interpreted the same way as if I said:

  my $foo = 3.141592653589793238462643383279;

Now I think there are good reasons for unspace not being allowed in a token, in 
which case we'd need some other syntax for the continuation marker that I want.


As for supporting long rational literals expressed as X/Y, I can live with being 
required to say (136\ 5634/42442\ 555) and depend on constant folding rather 
than <136\ 5634/42442\ 555> doing the same, if that would make things easier.


However, the likes of this needs to work:

  my $bar = :8<55084\ 4222\ 7677>;

... same as this does:

  my $baz = 564345\ 242432;

Thank you.

-- Darren Duncan


Re: continuation markers for long literals (was Re: r29931 - docs/Perl6/Spec)

2010-03-03 Thread Damian Conway

Surely this is not a common-enough requirement to warrant a special
syntax.

At 80-columns, you can represent integers up to ninety-nine
quinvigintillion, nine hundred ninety-nine quattuorvigintillion, nine
hundred ninety-nine trevigintillion, nine hundred ninety-nine
duovigintillion, nine hundred ninety-nine unvigintillion, nine hundred
ninety-nine vigintillion, nine hundred ninety-nine novemdecillion, nine
hundred ninety-nine octodecillion, nine hundred ninety-nine
septendecillion, nine hundred ninety-nine sexdecillion, nine hundred ninety-
nine quindecillion, nine hundred ninety-nine quattuordecillion, nine
hundred ninety-nine tredecillion, nine hundred ninety-nine duodecillion,
nine hundred ninety-nine undecillion, nine hundred ninety-nine
decillion, nine hundred ninety-nine nonillion, nine hundred ninety-nine
octillion, nine hundred ninety-nine septillion, nine hundred ninety-nine
sextillion, nine hundred ninety-nine quintillion, nine hundred ninety-
nine quadrillion, nine hundred ninety-nine trillion, nine hundred ninety-
nine billion, nine hundred ninety-nine million, nine hundred ninety-nine
thousand, and nine hundred ninety-nine.

Surely that's enough for the vast majority of users, isn't it?

And if you *do* need anything bigger (perhaps to represent the burgeoning
U.S. national debt) then there's always some variation on:

my $debt = +(
123456789012345678901234567890123456789012345678901234
  ~ 567890123456789012345678901234567890123456789012345678
  ~ 901234567890123456789012345678901
);

or even:

my $debt = +(
123_456_789_012_345_678_901_234_567_890_123_456_789_012_345_678_901_234
  ~ 567_890_123_456_789_012_345_678_901_234_567_890_123_456_789_012_345_678
  ~ 901_234_567_890_123_456_789_012_345_678_901
);

if you like to group your thousands for better readability.

With adequate constant folding, both of those are still compile-time constants.

Damian


Re: continuation markers for long literals (was Re: r29931 - docs/Perl6/Spec)

2010-03-03 Thread Mark J. Reed

On Wed, Mar 3, 2010 at 6:26 PM, Darren Duncan dar...@darrenduncan.net wrote:
 Mark J. Reed wrote:

 Doesn't unspace work for this?

 It would seem that S02 says otherwise:

    Although we say that the unspace hides the whitespace from the parser, it
 does not hide whitespace from the lexer.  As a result, unspace is not
 allowed within a token.

D'oh, indeed.  Never mind.


On Wed, Mar 3, 2010 at 7:00 PM, Damian Conway dam...@conway.org wrote:
 At 80-columns, you can represent integers up to ninety-nine
 quinvigintillion, [...]

Assuming the short scale.  On the long scale, that's ninety-nine
tredecillion, nine hundred ninety-nine thousand nine hundred
ninety-nine duodecillion, etc. :)
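
Both namings can be checked with a little arithmetic. The following is a minimal sketch in Python (not Perl 6); the helper names short_scale and long_scale are made up for this illustration, and it assumes the usual conventions that the n-th "-illion" is 10^(3n+3) on the short scale and 10^(6n) on the long scale.

```python
# Largest integer that can be written with 80 decimal digits.
largest = 10**80 - 1


def short_scale(n: int) -> int:
    """Value of the n-th '-illion' on the short scale: 10^(3n + 3)."""
    return 10 ** (3 * n + 3)


def long_scale(n: int) -> int:
    """Value of the n-th '-illion' on the long scale: 10^(6n)."""
    return 10 ** (6 * n)


quinvigintillion = short_scale(25)  # 10^78 on the short scale
tredecillion = long_scale(13)       # 10^78 on the long scale
duodecillion = long_scale(12)       # 10^72 on the long scale

print(len(str(largest)))                # 80 digits
print(largest // quinvigintillion)      # 99 quinvigintillion (short scale)
print(largest // tredecillion)          # 99 tredecillion (long scale)
print(largest // duodecillion % 10**6)  # 999999 duodecillion (long scale)
```

Short-scale quinvigintillion and long-scale tredecillion are both 10^78, which is why the leading group is ninety-nine under either naming.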

 there's always some variation on:

    my $debt = +(
        123456789012345678901234567890123456789012345678901234
      ~ 567890123456789012345678901234567890123456789012345678
      ~ 901234567890123456789012345678901
    );

Serviceable, but feels a bit hackish.  Reminds me of faking P5 qw in
PHP by using split(' ', 'words like this').  But with a reasonably
intelligent compiler, as you say, at least it still compiles to a
literal.

I note that Rakudo alpha turns the above into Inf, which seems apropos. :)

-- 
Mark J. Reed markjr...@gmail.com


Re: continuation markers for long literals (was Re: r29931 - docs/Perl6/Spec)

2010-03-03 Thread Darren Duncan

Damian Conway wrote:

Surely this is not a common-enough requirement to warrant a special
syntax.

At 80-columns, you can represent integers up to

<snip>

Surely that's enough for the vast majority of users, isn't it?


Well, 80 columns was an example, albeit the most common, but the principal idea 
was to support writing code that fit into very narrow spaces (such as may result 
from having the 80-col constraint plus a whole bunch of code indent levels) 
while being able to keep the code easily readable and nicely formatted.


I also figured that this would be a fairly simple thing to do.

Part of the idea was that one could also wrap any long identifiers as well to 
fit in a narrow space.


Now, granted that expressing everything which might become long as a string 
literal could probably work, it seemed somewhat inelegant, though maybe the 
problem is uncommon enough that this is an acceptable sacrifice.



And if you *do* need anything bigger (perhaps to represent the burgeoning
U.S. national debt) then there's always some variation on:

my $debt = +(
123456789012345678901234567890123456789012345678901234
  ~ 567890123456789012345678901234567890123456789012345678
  ~ 901234567890123456789012345678901
);

or even:

my $debt = +(
123_456_789_012_345_678_901_234_567_890_123_456_789_012_345_678_901_234
  ~ 567_890_123_456_789_012_345_678_901_234_567_890_123_456_789_012_345_678
  ~ 901_234_567_890_123_456_789_012_345_678_901
);

if you like to group your thousands for better readability.

With adequate constant folding, both of those are still compile-time constants.


That sounds half-reasonable, though it would seem to me that you'd have to quote 
each piece of the number to make it work right if you were using anything other 
than base 10.  And we're assuming that +(...) isn't producing a Num instead of 
an Int or Rat as the case may be, as if the rules for +(...) were the same as 
the parser's rules for what kind of number it makes.
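
To make the two concerns concrete, here is a minimal sketch in Python (not Perl 6), where Python's int and float merely stand in for an arbitrary-precision Int and a floating-point Num. It says nothing about what any Perl 6 implementation actually does with +(...); it only shows why the distinction matters for a number this long, and why the pieces have to be joined as quoted digit strings before a non-decimal base is applied.

```python
# The three pieces of the example, kept as quoted strings so that they
# are joined as text rather than interpreted as separate numbers.
pieces = [
    "123456789012345678901234567890123456789012345678901234",
    "567890123456789012345678901234567890123456789012345678",
    "901234567890123456789012345678901",
]
joined = "".join(pieces)

as_int = int(joined)      # exact: every digit is preserved
as_float = float(joined)  # approximate: only about 17 significant digits

print(as_int % 1000)            # the last three digits survive as an integer
print(int(as_float) == as_int)  # False: the float has lost the low digits

# For a base other than 10 the digits must be joined first and only then
# interpreted in that base; "DEAD" and "BEEF" are arbitrary sample pieces.
hex_value = int("DEAD" + "BEEF", 16)
print(hex(hex_value))
```

If the numeric conversion went through a floating-point type, the result would still be a compile-time constant but no longer the number that was written.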


So if we leave things as is, then hopefully the examples you raised will be 
commonly supported as compile-time constants in Perl 6 implementations.


-- Darren Duncan


Re: continuation markers for long literals (was Re: r29931 - docs/Perl6/Spec)

2010-03-03 Thread Larry Wall
On Wed, Mar 03, 2010 at 05:39:58PM -0800, Darren Duncan wrote:
: Damian Conway wrote:
: Surely this is not a common-enough requirement to warrant a special
: syntax.
: 
: At 80-columns, you can represent integers up to
: <snip>
: Surely that's enough for the vast majority of users, isn't it?
: 
: Well, 80 columns was an example, albeit the most common, but the
: principal idea was to support writing code that fit into very narrow
: spaces (such as may result from having the 80-col constraint plus a
: whole bunch of code indent levels) while being able to keep the code
: easily readable and nicely formatted.

Dealing with antediluvian displays sounds like a good spot for that
ancient technology, the preprocessor.

: I also figured that this would be a fairly simple thing to do.

Well, it will be simple, once we have macros; in fact, textual macros
can be regarded simply as scoped preprocessors, with all the rights,
privileges, and responsibilities pertaining thereto.  I think macros
will provide enough language support for this sort of "hard things
should be possible" escape hatch.  And remember you can always override
the grammar if you have special reasons for doing so.  That's what
Perl 6 is all about.  It's not about foreseeing every possible twinge
of misgiving that anyone may come to feel in the next 100 years...

Sure, we're trying to create a gigantic sweet spot in Perl 6, but
Willy Wonka knows you can't have the whole world, and if you could,
you can't have it now.  :)

Larry