Re: Apoc2 - concerns

Nathan Wiger Sat, 05 May 2001 15:00:02 -0700
Ok, this is long, so here goes...

> I expect the real choice is between <$FOO and <$FOO>.  I can convince
> myself pretty easily that a unary < is just another name for "next", or
> "more", or something.  On the other hand <$FOO> has history.  And if
> one special-cases <$...>, we could also have <foo bar baz> as a qw()
> replacement.  All we'd be doing is taking that syntax away from
> globbing, and giving it to qw, more or less.  (Less, because I'd like
> <foo bar baz($xyz)> to mean {foo => 1, bar => 1, baz => $xyz} or some
> such, where the 1's are negotiable.)

I think we should avoid "special cases" as much as possible. This is
already why people hate <> currently: it does both glob() and
readline().

I would say that <> having history is actually a good thing. It's a
foundation, really, since readline() is an iterator of sorts. All we'd be
doing is generalizing the notion. Not only does it apply to files, but it's
a shortcut to more() wherever you feel like using it.

As for <> as a qw() replacement, I think there are really two issues here.
First, you're not really talking about a "replacement", since you're
mentioning different semantics. So qw() will still be widely used. I suggest
that we simply create another q-op to do the qw-ish things you're proposing.
Perhaps qi() for "interpolate" or something else. Plus <> has the same
terrible problem with embedded > chars that the POD C<> stuff does. The really nice
thing about the q's is you can choose any bracket you want. I think fleshing
out this series of constructs makes the most sense.

> For a circumfix, you could just treat <> as funny parens:
>
>     $line = <somefunc()>;
>
> On the other hand, I wouldn't want to go so far as to require
>
>     <1..10>
>
> merely to make the iterator iterate.  Saying
>
>     @foo[1..10]
>
> ought to be enough clue.

Yes, I think that <> could just be a shortcut to wherever you wanted more()
called. Just like .. with a different notation:

   @foo = <$BAR>;
   @foo = @bar[1..10];
   @foo[0..4] = <$BAR>;

This is nice because it looks Perlish, but fundamentally it could be reduced
to a single iterator concept.

> But does that mean that
>
>     %foo{$STDIN}
>
> should read one value or all of them?

I would say that this should return only one thing - the $STDIN variable. No
automatic iteration should be done. Otherwise you run into problems with how
to pass filehandles around as variables. I think iteration needs to be
explicit.

If you want iteration:

   %foo{<$STDIN>};       # poll the iterator
   %foo{$STDIN.more};    # same thing

The iterator is just a member function, and you have to call it if you want
it.

   $FOO = open "<bar";
   &do_stuff($FOO);            # filehandle passed, not contents
   &do_more_stuff(<$FOO>);     # now filehandle is iterated (lazily)
   &do_more_stuff($FOO.more);  # same thing
   %holding_stuff{$FOO.more};  # same thing, just into a var
   close $FOO;

I think these semantics make sense.
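
None of the syntax above runs anywhere yet, so here is a rough model of the same idea in Python, just to make the distinction concrete. The class name `Handle`, the optional count argument to `more`, and the bodies of `do_stuff` and `do_more_stuff` are all made up for illustration; only the name more() comes from this thread. It also returns plain lists rather than anything lazy, to keep the sketch short.

```python
from itertools import islice


class Handle:
    """Hypothetical stand-in for a filehandle such as $FOO."""

    def __init__(self, lines):
        self._it = iter(lines)

    def more(self, n=None):
        """Explicitly poll the iterator: the next n items, or all that remain.

        Returned as a list for simplicity (an assumption of this sketch).
        """
        return list(islice(self._it, n))


def do_stuff(handle):
    # Receives the handle itself; nothing is read from it.
    return handle


def do_more_stuff(items):
    # Receives whatever the caller pulled out explicitly.
    return items


foo = Handle(["a\n", "b\n", "c\n"])
passed = do_stuff(foo)                # like &do_stuff($FOO)
first = do_more_stuff(foo.more(1))    # like &do_more_stuff($FOO.more), one line wanted
rest = do_more_stuff(foo.more())      # everything that is left
```

The point is only that passing `foo` never touches the underlying data; reading happens when, and only when, the caller asks for it.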

> Obviously, given a list flattener *, we'd expect
>
>     %foo{*$STDIN}
>
> to return all the values.  Maybe then
>
>     %foo{<$STDIN}
>
> returns exactly one value, and
>
>     %foo{$STDIN}
>
> guesses.

I'd do this differently, as hinted at above:

   %foo{<$STDIN>};    # return all values
   %foo{<$STDIN};     # return one value
   %foo{$STDIN};      # pass the $STDIN variable

This is assuming that we want < to exist and have different semantics. But
I'm not sure that's a good idea. For one thing, you'd have to mention it
several times if you want a couple values. I think there's got to be a
better way to request the number of lines based on context.

   %foo{<$STDIN>};              # the whole thing
   %foo{ (1..2) = <$STDIN> };   # anonymous list request?
   %foo{ <$STDIN>[1..2] };      # or notate it as a list?
   %foo{ ($STDIN.more)[1..2] }; # same thing

The last one seems to make sense (it's got those (localtime)[2,3] roots),
with the third one as a shortcut.

> Looking at it from the iterator object end, there might really be three
> methods:
>
>     $STDIN.next # Return one element regardless of context.
>     $STDIN.more # Return number of element wanted by context.
>     $STDIN.all # Return all element regardless of context.
>
> Or maybe there's only a "more" method, and you simply have to force the
> context if you don't want it to guess.

I think one method is the way to go. Let the subs and other contexts request
the number of elements to get back using lazy evaluation:

   @foo[1..10] = <$STDIN>;     # get 10 iterations
   $bar = <$STDIN>;            # next one
   &lazy_sub(<$STDIN>);        # lazily

Assuming:

   sub lazy_sub ($a, $b, $c, $d) {   }

Then the last line above would lazily grab the next four lines.
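
As a sanity check that "let the callee's signature decide how many to pull" is at least implementable, here is a small Python sketch. It is only a model under assumptions: the helper name `call_with_wanted` is invented here, and counting parameters with `inspect.signature` stands in for whatever mechanism Perl 6 would actually use to tell the iterator how many elements are wanted.

```python
import inspect
from itertools import islice


def call_with_wanted(func, iterator):
    """Pull only as many items as func declares parameters, then call it.

    Hypothetical helper: models a sub's signature requesting N elements.
    """
    wanted = len(inspect.signature(func).parameters)
    args = list(islice(iterator, wanted))
    return func(*args)


def lazy_sub(a, b, c, d):
    # Four parameters, so four items should be consumed.
    return [a, b, c, d]


stdin = (f"line {i}" for i in range(1, 11))   # stand-in for $STDIN, ten lines
got = call_with_wanted(lazy_sub, stdin)       # like &lazy_sub(<$STDIN>)
remaining = list(stdin)                       # lines the call did not consume
```

Here the call consumes four lines and leaves the other six on the iterator. If fewer than four lines remained, this sketch would simply fail with a TypeError; what the real thing should do in that case is an open question.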

> We don't actually have a good
> notation for forcing a scalar context yet, let alone a scalar context
> wanting a certain number of arguments.

Personally, I'd look at it differently. I don't think that getting a number
of arguments out of a scalar context makes sense. Rather, I think you need
to call a member function or do whatever to get a list, then lazily evaluate
that.

   @a = $STDIN;    # @a gets a single element - $STDIN
   @b = <$STDIN>;  # @b gets the entire contents of $STDIN,
                   # iterating via calls to more()

   @c[0..3] = <$STDIN>;    # assuming we hadn't already emptied
                           # it, this would lazily get 4 lines
   @c[0..3] = more $STDIN; # same thing

> Doing violence to our current
> notions of what various prefix operators currently mean, we might want:
>
>     $$STDIN # Return one element regardless of context.
>     @$STDIN # Return number of element wanted by context.
>     *$STDIN # Return all element regardless of context.

Again, I think this approach is barking up the wrong tree, respectfully. I
don't think it makes sense. Which elements are you grabbing? The ones from
more()? Why? Why would you automagically grab those instead of the ones from
bob() or jim()? Ok, we could define it that way, but I'd rather see:

   @a[0,1,2,3] = $jim.bob;
   @b[4..5] = $foo.bar;
   @c[9,10,11..13] = $jeff.more;    # just happens to have a <> shortcut

Basically, <> is left as direct access to the iterator, but it's not
magically called except where it clearly makes sense (and I don't think
normal variable manip like passing into subs and hashes should be in this
category).

>     $<$STDIN # Return one element regardless of context.
>      <$STDIN # Return number of element wanted by context.
>     @<$STDIN # Return all element regardless of context.
>
> or some other casting mechanism yet to be devised.

I'd do a variation on the above. Looking from a functional perspective:

     $STDIN.more          # context-dependent
    ($STDIN.more)[0..3]   # just selected elements (lazily)
    ($STDIN.more)[0..-1]  # forced all

Then we'd have the following shortcuts as a side-effect:

    <$STDIN>              # context-dependent
   (<$STDIN>)[0..3]       # just selected elements (lazily)
   (<$STDIN>)[0..-1]      # forced all

Seems to be easy and clean, without lots of new syntax.
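
For what it's worth, the "selected elements (lazily)" versus "forced all" split can be modeled in Python with `itertools.islice`. This is only an analogy, not a claim about how Perl 6 would implement it; the generator `numbered_lines` is a made-up endless source, chosen so that the slice is only safe if it really is lazy.

```python
from itertools import islice


def numbered_lines():
    """Hypothetical endless source; only a lazy slice can terminate on it."""
    n = 0
    while True:
        n += 1
        yield f"line {n}"


source = numbered_lines()
selected = list(islice(source, 0, 4))   # like (<$STDIN>)[0..3]
next_one = next(source)                 # the iterator resumes after the slice

finite = iter(["a", "b", "c"])
everything = list(finite)               # like (<$STDIN>)[0..-1], forced all
```

The slice takes exactly four items and leaves the source positioned at the fifth, which is the behavior I'd want from the bracketed forms above.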

> The basic underlying question is, what is the context that should fire
> off an iterator?  Everyone thinks @foo[1..10] should just do the right
> thing, whatever that is.  Assuming the following makes an iterator,
> and doesn't set $iter to 1 the first time through:
>
>     $iter = 1..10;
>
> How many of these work?

Hmmm, I'm not sure about the "iterator object" you're implicitly proposing
above. Is it going to be standard fare to assume certain types of objects
are created from certain constructs? What would this do?

   $iter = (1..10);

The same thing as above? What about:

   $iter = (1,2,3,4);
   $iter = (5,7..10);
   $iter = 1, 2, 3;
   $iter = a, b, c;
   $iter = 1, 4 .. 10;

What would these do?

Assuming, however, that we had an iterator object concept, I would say:

>     while ($x = @foo[$iter]) { ... }

Works.

>     while ($x = <$iter>) { ... }

Works, same as:

   while ($x = $iter.more) { ... }

>     while ($x = $iter) { ... }

Passes the iterator object verbatim. That is, now $x is a copy of $iter.

>     for ($iter) { ... }

This is tricky. I see two ways. First, you could expand more()
automatically, then go through its arguments. But I don't think that
necessarily makes sense. Consider in Perl 5:

   $a = \@b;
   for ($a) { ... }

This only gets one thing. And I know we're redoing semantics, but I think
this is sensible. I don't want auto-element-grabbing-conversion. If I want
to iterate, I think I want to say:

   for (<$iter>) { ... }                # whole thing ($iter.more)
   for ( ($iter.more)[0..4] ) { ... }   # lazy evaluation
   for my $n (<$iter>[0..4]) { ... }    # same thing

> I think that iterators must be dereferenced by something explicit, and
> we will have to be very clear as to what is explicit enough.  Subscripts
> may be explicit enough:
>
>     @foo[$iter] # okay?

Yes, I think that's ok. Including the <> in this context would be redundant.

Phew! My brain hurts. ;-)

-Nate