Re: One-pass parsing and forward type references

2010-02-02 Thread Larry Wall
On Mon, Feb 01, 2010 at 06:12:16PM -0800, Jon Lang wrote:
: Larry Wall wrote:
: > But also note that there are several other ways to predeclare
: > types implicitly.  The 'use', 'require', and 'need' declarations
: > all introduce a module name that is assumed to be a type name.
: 
: Just to clarify: it's possible to define a module within a file,
: rather than as a file; and in fact the usual means of defining classes
: and roles is an example of this, since they are specialized kinds of
: modules.  Correct?  So if I'm understanding this correctly, you should
: be able to say something like:
: 
: use Foo;
: class Bar { ... has Foo $x ... }
: class Foo { ... }
: 
: ...where the dots are stand-ins for irrelevant code.  In effect, use
: tells the compiler that Foo is a noun, so that the parser knows the
: proper way to handle it.  It also looks for the definition of Foo; but
: will it start screaming bloody murder if it can't find the definition
: right away?  Have I failed to correctly tell it where to look for the
: definition?  (i.e., do I need to say something like use ::Foo to let
: the parser know that the definition is in this file?)

You should use 'class Foo {...}' for a forward declaration of a class
in the same file, not a vacuous 'use' that implies strongly but wrongly
that the definitions are in another file.  To look at it another way,
'use' is a real predeclaration, not a pre-non-declaration such as we
mean when we talk about forward declarations.

In other words, the above should really be giving you an illegal
redeclaration of class Foo, if Foo.pm did as advertised and created
a Foo type.  You'd have to use 'augment' to do monkey typing like that.

There has been some speculation that we could allow proto and multi
classes when the intent is to scatter the definition around.  But that
hasn't yet been determined to be a Good Thing.  The compiler would like
to know that it can compose a class when it sees the trailing curly,
in case someone wants to use it in a subsequent BEGIN.  Allowing multi
classes would necessarily subvert that at least till CHECK time, if
not till first use.  You'd generally like to catch method name conflicts
no later than CHECK time.

Larry


Re: One-pass parsing and forward type references

2010-02-01 Thread Moritz Lenz
Carl Mäsak wrote:
> But on another level, the level of types, Perl 6 makes it fairly
> *un*natural that the type C<Foo> refers to the type C<Bar>, which in
> turn refers to the type C<Foo>.

True, and that has also been bothering me quite a bit.

The solution is to always write ::Typename instead of Typename
except when it isn't a solution.

First of all, in signatures ::T actually means a type capture; secondly, I
guess that some constructs really want to resolve type names at compile
time -- for example the multi dispatcher needs to know the inheritance
structure of a type in order to do its pre-sorting of signatures.

The first problem could be solved by introducing another syntax for type
captures (perhaps  :foo   or so?); about the second I know too little to
really comment on it.
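
For reference, a minimal sketch of the type-capture reading of ::T in a signature, assuming a current Rakudo. The sub name same-type is made up for illustration and is not from the thread:

```raku
# ::T in a signature captures the type of the first argument;
# the bare T that follows then constrains the second argument to it.
sub same-type(::T $first, T $second) {
    T.^name        # name of the captured type
}

say same-type(1, 2);
say same-type('a', 'b');
```

This is why ::Typename cannot simply mean "look this name up later" inside a signature: the same spelling already binds a new type name there.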

Cheers,
Moritz


Re: One-pass parsing and forward type references

2010-02-01 Thread Carl Mäsak
Moritz (>), Carl (>>):
>> But on another level, the level of types, Perl 6 makes it fairly
>> *un*natural that the type C<Foo> refers to the type C<Bar>, which in
>> turn refers to the type C<Foo>.
>
> True, and that has also been bothering me quite a bit.
>
> The solution is to always write ::Typename instead of Typename
> except when it isn't a solution.
>
> First of all, in signatures ::T actually means a type capture; secondly, I
> guess that some constructs really want to resolve type names at compile
> time -- for example the multi dispatcher needs to know the inheritance
> structure of a type in order to do its pre-sorting of signatures.
>
> The first problem could be solved by introducing another syntax for type
> captures (perhaps  :foo   or so?); about the second I know too little to
> really comment on it.

I had half an idea about the second one as I wrote the first mail. It
may or may not be useful, but here goes:

Some keyword, on the level of 'is' and 'does', which allows one to use
a not-yet-defined typename within another type. It wouldn't solve the
core problem -- the one about having to think about circularities --
but it would allow one to create cycles. So this would work:

  class A precedes-but-refers-to B { ... B ... }
  class B { ... A ... }

Just an idea.

// Carl


Re: One-pass parsing and forward type references

2010-02-01 Thread Patrick R. Michaud
On Sun, Jan 31, 2010 at 06:35:14PM +0100, Carl Mäsak wrote:
> I found two ways. Either one uses C<augment> (the language construct
> formerly known as C<is also>):
>
>   class B {}
>   class A { sub foo { B::bar } }
>   augment class B { sub bar { A::foo } }
>
> ...or one may use the C<::> notation to index a type using a string value:
>
>   class A { sub foo { ::B::bar() } }
>   class B { sub bar { A::foo } }

There's a third way:

    class B { ... }    # introduce B as a class name without definition
    class A { sub foo { B::bar } }

    class B { sub bar { A::foo } }

The first line is a literal ... in the body of the class -- it
indicates that we're only declaring the name as being a type,
and that something else will fill in the details later.
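
A slightly fuller sketch of the same stub idiom, assuming a current Rakudo. It uses typed attributes rather than subs so that the forward reference really is a type reference; the attribute name partner and the method name are illustrative only:

```raku
class B { ... }                    # stub: B is now known to be a type name

class A {
    has B $.partner is rw;         # forward reference to the stubbed B
    method name() { 'A' }
}

class B {                          # the real definition fills in the stub
    has A $.partner is rw;
    method name() { 'B' }
}

my $left  = A.new;
my $right = B.new(partner => $left);
$left.partner = $right;            # close the cycle

say $left.partner.name;
say $right.partner.name;
```

Without the first line, the declaration of A would fail to compile, because B would not yet be known as a type when the attribute is parsed.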

Pm


Re: One-pass parsing and forward type references

2010-02-01 Thread Carl Mäsak
Patrick (>), Carl (>>):
>> I found two ways. Either one uses C<augment> (the language construct
>> formerly known as C<is also>):
>>
>>   class B {}
>>   class A { sub foo { B::bar } }
>>   augment class B { sub bar { A::foo } }
>>
>> ...or one may use the C<::> notation to index a type using a string value:
>>
>>   class A { sub foo { ::B::bar() } }
>>   class B { sub bar { A::foo } }
>
> There's a third way:
>
>    class B { ... }    # introduce B as a class name without definition
>    class A { sub foo { B::bar } }
>
>    class B { sub bar { A::foo } }
>
> The first line is a literal ... in the body of the class -- it
> indicates that we're only declaring the name as being a type,
> and that something else will fill in the details later.

That's nice. Seems like a decent way to avoid 'use MONKEY_TYPING'.

Is it allowed to do 'class B { ... }' several times in different files
before finally declaring the real B? If so, then I'd consider it
equivalent to my proposed keyword, and thus there'd be no need for the
latter.

// Carl


Re: One-pass parsing and forward type references

2010-02-01 Thread Jan Ingvoldstad
On Mon, Feb 1, 2010 at 17:46, Patrick R. Michaud pmich...@pobox.com wrote:

> There's a third way:
>
>     class B { ... }    # introduce B as a class name without definition
>     class A { sub foo { B::bar } }
>
>     class B { sub bar { A::foo } }
>
> The first line is a literal ... in the body of the class -- it
> indicates that we're only declaring the name as being a type,
> and that something else will fill in the details later.


It seems to me that this doesn't really solve the problems that will occur
when people start making packages independently of each other.

Of course it can be solved by submitting patches to the other developer's
code, but it seems inelegant.
-- 
Jan


Re: One-pass parsing and forward type references

2010-02-01 Thread Larry Wall
Please don't assume that rakudo's idiosyncrasies and design fossils
are canonical.  STD does better namespace management in some respects,
particularly in accepting the approved predeclaration form:

    class Foo {...}

(and rakudo might now accept this too).

You don't want to use augment for this, because augment is only for
*cheating*, and will eventually require a 'use MONKEY_TYPING;' before it.

You are correct that the one-pass parsing is non-negotiable; this is
how humans think, even when dealing with unknown names.  However, human
language has ways of telling the listener whether a new word is a noun
or a verb, and we generally choose not to require articles or verb
markers in Perl 6, so class names (nouns) are officially ambiguous
with function names (verbs).  We could, as has been suggested before,
have an optional noun marker for undeclared classes, but this tends
to insidiously creep into lots of places, when a single predeclaration
would be clearer and more efficient.

The other reason we'll stick with one-pass parsing is that you can
only do multiple pass parsing if you know exactly which language
you're parsing all the time.  This works against extensibility.
Multiple passes introduce linguistic race conditions any time you're
not sure what language you're in; see source filters for a good bad
example of why you don't want to do multi-pass parsing.

Because type names are nouns in Perl 6, we want to parse them like
other nouns, such as variables and constants, not like verbs.
Knowing whether something is a noun or a verb is very important in
maintaining the two-terms-in-a-row prohibition, which is the
primary way in which Perl 6 expressions are self-clocking.  Break
that clock and your error messages will turn to mush.  Some of
the smartest error messages that STD emits are actually the
two-terms message transmogrified into something more specific
by context.

Therefore, a word that is declared as a sigilless noun must change
the parsing from there on down so that we know to expect an infix
after it, not a term, which a verb functioning as a listop would
expect.

    constant Int pi = 3;
    pi 'foo';   # syntax error

(Notice how the fact that Int is predeclared also allows us to
correctly parse the constant declaration, since it allows us to
distinguish the existing type from the name to be declared.)
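
A small sketch of the noun/verb distinction described above, assuming a current Rakudo. The names answer and shout are invented for illustration:

```raku
constant answer = 42;            # sigilless noun: the parser expects an infix next
sub shout($text) { $text.uc }    # verb: usable as a listop, so a term may follow

say answer + 1;                  # noun followed by an infix: fine
say shout 'foo';                 # verb followed by a term: fine

# answer 'foo';   # left commented out: a noun followed by a term is
#                 # rejected at compile time as two terms in a row
```

The commented-out line is the case the two-terms rule exists to catch: once answer is known to be a noun, a following term can only be a mistake.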

Now that we have basic philosophy out of the way, on to other
arguments.  You might take the predeclaration policy as subtle
encouragement to use the late binding provided by method calls,
which don't require predeclaration.  But Perl 6 is not religiously
OO to the exclusion of FP, so that's a weak argument, and we're not
trying to discourage people from doing cross-module function calls.

You make an analogy between type recursion and functional recursion,
which is true up to a point, but types and other unsigilled names
tend to be rarer than function names, so it makes sense, if we see
an unknown name, to assume (if we choose to assume) that it is a
postdeclared verb rather than a postdeclared noun.  We don't want to
break function recursion in order to support type recursion.  We can't
guess it both ways since we wouldn't know what to expect next.  Perl 5
did lookahead and tried to guess in some cases, but this generally
turned out to be a mistake.  It was difficult to document and only
inspired mistrust in the parser.  Plus it degrades the two-term rule.

But also note that there are several other ways to predeclare
types implicitly.  The 'use', 'require', and 'need' declarations
all introduce a module name that is assumed to be a type name.

We can also take any use of Foo::Bar::baz to mean that
there exists a Foo::Bar type.  STD does this already, though
perhaps only as a lexically scoped name at the moment.  We might
assume it to (at least potentially) be global.

A 'use' also predeclares any types in its own space and makes the
public ones available as predeclaration.  From the standpoint of a
typical midsized application that is spread across multiple files,
most of the predeclarations should arrive like this, and not require
a yadayada predeclaration.  And in Perl 6 you shouldn't generally
be using types without a 'use' of the definitions of those types,
or you might not import some desirable multi into your lexical scope.

Of course, we have to deal with 'use' cycles.  Perl 5 simply assumes
that it doesn't need to load a module that it knows is already being
loaded, even if that loading is not complete or would result in
something not being defined that ought to be.  We could take the same
agnostic approach and require the user to break use cycles explicitly
with yada.

Or we could detect that exact situation and relax the rules slightly
to do a bit of dwimmery on type names that *start* with the
partially compiled module.  In that case, it might be reasonable,
if we've got a suspended 'use Foo', to assume:

Foo::Bar::baz   # likely a function
Foo::Bar::Baz   # likely 

Re: One-pass parsing and forward type references

2010-02-01 Thread Matthew Wilson
On Mon, Feb 1, 2010 at 9:32 AM, Larry Wall la...@wall.org wrote:
> But I also think that type recursion is likelier to indicate a design
> error than function recursion, so I'm not sure how far down this road
> we want to go.  We could, for instance, create a new type name every

I was going to say I use self-referential and cyclical type definitions in
Sprixel's (stage0) C# source... (because I did at one point), but then I
realized I stopped doing that; it worked, but it was a tad too tricksy to
maintain.

So, point taken.


Re: One-pass parsing and forward type references

2010-02-01 Thread yary
A slight digression on a point of fact-

On Mon, Feb 1, 2010 at 9:32 AM, Larry Wall la...@wall.org wrote:
...
> You are correct that the one-pass parsing is non-negotiable; this is
> how humans think, even when dealing with unknown names.

It's common for people to read a passage twice when encountering
something unfamiliar. That's on the large level. And even on the small
level of reading for the first time, people don't read completely
linearly, skilled readers make regressions back to material already
read about 15 percent of the time. -
http://en.wikipedia.org/wiki/Eye_movement_in_language_reading (and I
read about that elsewhere years ago, wikipedia happens to be the most
convenient reference.)

I'm not arguing against 1-pass parsing for Perl6, just reminding that
humans are complicated. And Larry's quote is "how humans think"
whereas the research on eye jumps is about "how humans read", which are
not exactly the same...

-y


Re: One-pass parsing and forward type references

2010-02-01 Thread Larry Wall
On Mon, Feb 01, 2010 at 10:10:11AM -0800, yary wrote:
: A slight digression on a point of fact-
: 
: On Mon, Feb 1, 2010 at 9:32 AM, Larry Wall la...@wall.org wrote:
: ...
: > You are correct that the one-pass parsing is non-negotiable; this is
: > how humans think, even when dealing with unknown names.
: 
: It's common for people to read a passage twice when encountering
: something unfamiliar. That's on the large level. And even on the small
: level of reading for the first time, people don't read completely
: linearly, skilled readers make regressions back to material already
: read about 15 percent of the time. -
: http://en.wikipedia.org/wiki/Eye_movement_in_language_reading (and I
: read about that elsewhere years ago, wikipedia happens to be the most
: convenient reference.)
: 
: I'm not arguing against 1-pass parsing for Perl6, just reminding that
: humans are complicated. And Larry's quote is "how humans think"
: whereas the research on eye jumps is about "how humans read", which are
: not exactly the same...

True enough; I was thinking primarily about the parsing of spoken
speech, where one generally doesn't have the option to replay beyond
what you can remember.  And Perl 6 is arguably more textual than aural.

Larry


Re: One-pass parsing and forward type references

2010-02-01 Thread Patrick R. Michaud
On Mon, Feb 01, 2010 at 05:55:47PM +0100, Carl Mäsak wrote:
> Is it allowed to do 'class B { ... }' several times in different files
> before finally declaring the real B? If so, then I'd consider it
> equivalent to my proposed keyword, and thus there'd be no need for the
> latter.

Yes.  And declaring the real B doesn't have to be final, nor
does it have to occur at all (as long as none of the features needed
from B are ever needed).

Pm


Re: One-pass parsing and forward type references

2010-02-01 Thread Solomon Foster
On Mon, Feb 1, 2010 at 3:46 PM, Patrick R. Michaud pmich...@pobox.com wrote:
> On Mon, Feb 01, 2010 at 05:55:47PM +0100, Carl Mäsak wrote:
>> Is it allowed to do 'class B { ... }' several times in different files
>> before finally declaring the real B? If so, then I'd consider it
>> equivalent to my proposed keyword, and thus there'd be no need for the
>> latter.
>
> Yes.  And declaring the real B doesn't have to be final, nor
> does it have to occur at all (as long as none of the features needed
> from B are ever needed).

And just to finish it off... are you allowed to do 'class B { ... }'
even after declaring the real B?

-- 
Solomon Foster: colo...@gmail.com
HarmonyWare, Inc: http://www.harmonyware.com


Re: One-pass parsing and forward type references

2010-02-01 Thread Patrick R. Michaud
On Mon, Feb 01, 2010 at 05:56:09PM +0100, Jan Ingvoldstad wrote:
> On Mon, Feb 1, 2010 at 17:46, Patrick R. Michaud pmich...@pobox.com wrote:
>> There's a third way:
>>
>>     class B { ... }    # introduce B as a class name without definition
>>     class A { sub foo { B::bar } }
>>
>>     class B { sub bar { A::foo } }
>
> It seems to me that this doesn't really solve the problems that will occur
> when people start making packages independently of each other.
>
> Of course it can be solved by submitting patches to the other developer's
> code, but it seems inelegant.

I see it as not being much different than what already happens now
in most languages I deal with.

Assume the above lines of code are in different files -- one for A
and one for B.  Presumably A has a reason for saying 'class B { ... }'
instead of the more likely 'use B;' -- i.e., the author of A knows
that it is using B, and that B is likely to refer back to A.

And in the above example, I'd expect the file containing the definition
of B to likewise have either a 'use A;' or 'class A { ... }'
declaration.

It ultimately comes down to the fact that Perl expects each module
to declare class names before they get used, unless the class
names are part of CORE.

Pm


Re: One-pass parsing and forward type references

2010-02-01 Thread Larry Wall
On Mon, Feb 01, 2010 at 03:55:15PM -0500, Solomon Foster wrote:
: On Mon, Feb 1, 2010 at 3:46 PM, Patrick R. Michaud pmich...@pobox.com wrote:
: > On Mon, Feb 01, 2010 at 05:55:47PM +0100, Carl Mäsak wrote:
: >> Is it allowed to do 'class B { ... }' several times in different files
: >> before finally declaring the real B? If so, then I'd consider it
: >> equivalent to my proposed keyword, and thus there'd be no need for the
: >> latter.
: >
: > Yes.  And declaring the real B doesn't have to be final, nor
: > does it have to occur at all (as long as none of the features needed
: > from B are ever needed).
: 
: And just to finish it off... are you allowed to do 'class B { ... }'
: even after declaring the real B?

STD does not currently allow it because you have to install the name
immediately in case of references in the traits, even before the block.
The block is too late to say "whoops, didn't mean it really".  Pretty
much the same reason we changed 'is also' to 'augment'.  We want to
look up the name right now and know whether it should exist without
doing lookahead.

Larry


Re: One-pass parsing and forward type references

2010-02-01 Thread Carl Mäsak
Larry (>):
> [Long exposition on the philosophy of predeclaration]
>
> Hope this helps, or I just wasted a lot of time.  :-)

It did help. Thanks.

A comment on one part, though:

> But I also think that type recursion is likelier to indicate a design
> error than function recursion [...]

I do too. A month or so ago I would have considered type recursion to
always indicate a design error. That was before I started trying to
port type cycles in a program making sensible use of them. :)

As far as I can see, the two cases of type recursion in PGE I outlined
do not indicate a design error. I'd be happy to chat with anyone who
has an idea about how they could be simplified away and replaced by
something non-ugly.

Another thing I started thinking about: if Perl 6 professes to be able
to put on the hat -- syntactically and semantically -- of most any
other programming language out there, through the use of a simple 'use
Language::Java' or 'use Language::Ruby' -- how will Perl 6 compensate
for the fact that its parser is one-pass whereas most other languages
do two passes or more? Specifically, will some programs in those other
languages fail to compile under a Perl 6 language module due to the
fact that a type keyword was referred to before it was declared? If
multiple passes introduce linguistic race conditions, what about
outright linguistic infelicities due to the Perl 6 limitation of
one-pass parsing?

// Carl


Re: One-pass parsing and forward type references

2010-02-01 Thread Larry Wall
On Tue, Feb 02, 2010 at 12:23:50AM +0100, Carl Mäsak wrote:
: Another thing I started thinking about: if Perl 6 professes to be able
: to put on the hat -- syntactically and semantically -- of most any
: other programming language out there, through the use of a simple 'use
: Language::Java' or 'use Language::Ruby' -- how will Perl 6 compensate
: for the fact that its parser is one-pass whereas most other languages
: do two passes or more? Specifically, will some programs in those other
: languages fail to compile under a Perl 6 language module due to the
: fact that a type keyword was referred to before it was declared? If
: multiple passes introduce linguistic race conditions, what about
: outright linguistic infelicities due to the Perl 6 limitation of
: one-pass parsing?

The one-pass ideal is only for standard Perl 6 and other languages
that make that commitment.  Once you switch to another language, you
should use whatever kind of recognizer you need for that language.
Since Perl is Turing complete, it can (in theory) emulate any automaton
in the right column in the table at the end of:

http://en.wikipedia.org/wiki/Unrestricted_grammar

In short, it's simple only from the standpoint of the *user* of the
module.  Module creators, on the other hand, should be acquainted
with the concept of vicarious suffering before they begin.

Larry


Re: One-pass parsing and forward type references

2010-02-01 Thread Jon Lang
Larry Wall wrote:
> But also note that there are several other ways to predeclare
> types implicitly.  The 'use', 'require', and 'need' declarations
> all introduce a module name that is assumed to be a type name.

Just to clarify: it's possible to define a module within a file,
rather than as a file; and in fact the usual means of defining classes
and roles is an example of this, since they are specialized kinds of
modules.  Correct?  So if I'm understanding this correctly, you should
be able to say something like:

use Foo;
class Bar { ... has Foo $x ... }
class Foo { ... }

...where the dots are stand-ins for irrelevant code.  In effect, use
tells the compiler that Foo is a noun, so that the parser knows the
proper way to handle it.  It also looks for the definition of Foo; but
will it start screaming bloody murder if it can't find the definition
right away?  Have I failed to correctly tell it where to look for the
definition?  (i.e., do I need to say something like use ::Foo to let
the parser know that the definition is in this file?)

-- 
Jonathan Dataweaver Lang


One-pass parsing and forward type references

2010-01-31 Thread Carl Mäsak
There's one thing that bugs me ever so slightly. I'll just air it and
happily accept whatever feedback it produces.

This email is somewhat of a third-strike thing: looking back, I've
been muttering over this itch both on IRC and on Twitter during the
past year.

<masak> sometimes one-pass parsing annoys me to no end.
<moritz_> masak: multi pass parsing is even more annoying in the long run ;-)
<masak> I think it's very restrictive that you can't refer to a class
name before it's been declared.
<masak> it's unlike many other languages I'm familiar with.
<jnthn> You can always declare a stub and define it later.
<masak> true.
<masak> in other circumstances, that's called 'cruft' or 'boilerplate'.
<masak> code required to cater to a language's oddities.
<carlmasak> I sometimes run into the #perl6 restriction that you have
to declare types textually above you use them. (So no cycles.) It
feels arbitrary.
<quietfanatic> @carlmasak I think It's required to differentiate
between types and subs. I guess you could invent a way to declare a
stub type...
<quietfanatic> @carlmasak Oh, but you'd think the process that checks
ahead for sub declarations could also check for type declarations.
<carlmasak> @quietfanatic I haven't yet been seeking ways in which
classes wouldn't have to be declared beforehand; it just bothers me
that they do.

Just to be clear: I don't expect to sway anyone as to whether Perl 6
should do parsing in more than one pass -- the one-pass parsing is
here to stay in the language. I mostly want to make the case that
sometimes the one-pass restriction causes the programmer to have to
resort to subtle contortions or experience strange errors.

Before I go into specifics, here's my general opinion: historically,
it took a while before compiler writers were convinced that recursion
within and between subroutines was useful enough that they built the
requisite complexity into the compilers. (FORTRAN 77 doesn't have
recursion, for example.) Nowadays, it feels completely natural that
C<foo> may call C<bar>, which may call C<foo> again.

But on another level, the level of types, Perl 6 makes it fairly
*un*natural that the type C<Foo> refers to the type C<Bar>, which in
turn refers to the type C<Foo>.

Quick, write a program where C<A::foo> calls C<B::bar> which calls C<A::foo>!

I found two ways. Either one uses C<augment> (the language construct
formerly known as C<is also>):

  class B {}
  class A { sub foo { B::bar } }
  augment class B { sub bar { A::foo } }

...or one may use the C<::> notation to index a type using a string value:

  class A { sub foo { ::B::bar() } }
  class B { sub bar { A::foo } }

In either case, one has to mentally acknowledge that there's a
dependency cycle, and manually apply a circularity saw somewhere early
in the code to fix it.
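
As a rough sketch of the string-indexed variant, assuming a current Rakudo, where a runtime lookup by name is spelled ::('B'). The method names make-b and name are invented for illustration:

```raku
class A {
    # ::('B') looks the name up at run time, so B need not be
    # declared yet when this method body is compiled.
    method make-b() { ::('B').new }
}

class B {
    method name() { 'B' }
}

say A.new.make-b.name;
```

The price is the one mentioned above: because the name is only a string until run time, the compiler cannot check that B exists.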

While this in itself is not much of a problem, it becomes one as the
code base grows. The design of Perl 6 stigmatizes type cycles, and
introduces boilerplate of the above type, whereas in other languages
no special treatment at all is necessary. Also, when everything is
confined to one file, it's not so bad. The real pains start when types
in different files need to refer to each other. Do I put a stub class
definition in the 'wrong' file? Or do I turn off the compiler type
checking by putting types in strings?

I didn't see it that way a month or so ago, but now I think of
mutually defined classes as no more unusual than mutually recursive
functions. Here are two naturally-occurring examples from my current
medium-sized project GGE. If the details weigh you down rather than
inform, feel free to skip them. I just want to show that these kinds
of cycles do happen:

* The regex class R occasionally calls out to an optable parser O to parse
  a regex string into an AST. The class O can be set up in such a way as to
  call provided subroutines, including -- if one wants -- subroutines inside
  the class R. However, one of the O tests sends in a whole R object into
  O, expecting it to match as an ordinary regex. Question: How should O
  detect whether an R was sent in?

* The '<before foo\dbar>' syntax in S05 allows any regular expression to
  occur after the word 'before' and a space. In this case, 'foo\dbar' would
  be sent as a string to an ordinary method C<.before> in the match class
  M. Thus, M needs to (recursively) invoke the regex class R to parse the
  string into an AST. Only... R uses M heavily, so it's an A::foo-B::bar
  situation. Question: How should M call out to R when R already calls out
  to M?

The one-pass answer to both these questions is: "Well, you simply
need to force your types into a tree structure, and take special care
every time there's a forwards reference somewhere in all your modules.
Either define a type 'too early' and re-open it when you really want
to define it, or use weaker string references to circumvent the
compiler."

The two-pass answer to both these questions is: "Huh? What's the problem?"

And that's what bothers me. There shouldn't, ideally, *be*