date:20000926

Re: RFC 231 (v1) Data: Multi-dimensional arrays/hashes and slices

2000-09-26 Thread Ilya Zakharevich


On Mon, Sep 25, 2000 at 06:30:22PM -0400, Karl Glazebrook wrote:
  Well, this shows that you entirely miss the problem of cryptocontexts.
  Context is determined by the "environment" of the operation, not by
  the operation.  Context is propagated:
  
the-left-hand-side-of-assignment --- the-right-hand-side-of-assignment
 
 
 so what is wrong with the statement '@y = 3*@x;' then ?

That other constructs *also* create an array context, in which the
behaviour of multiplication you propose is not appropriate.
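
To make the scalar-context point concrete, here is a small Perl 5 sketch. The variable names and values are illustrative and not from the thread. In current Perl 5 the multiplication imposes scalar context on @x, so the assignment receives one number rather than an element-wise product; element-wise scaling has to be written out, for instance with map.

```perl
use strict;
use warnings;

my @x = (10, 20, 30, 40);

# '*' puts @x in scalar context, so @x yields its element count (4).
my @y = 3 * @x;

# Element-wise scaling has to be spelled out, for example with map.
my @z = map { 3 * $_ } @x;

print "@y\n";   # 12
print "@z\n";   # 30 60 90 120
```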

  Changing Perl in this respect will make one particular mode of
  operation a tiny bit simpler, but (without major changes to
  cryptocontexting - <PLUG> see for example my interview on perl.com
  </PLUG>) will make life much harder in other modes of operation.

 I think major changes are what we are talking about here.

I did not see any viable proposal on changing things in a major way.
To design such a change is a *major* work.  We need to keep a lot of
possible combinations with other features in mind, and understand all
the ramifications and desired/undesired interaction.  We need
insight.  We need to balance the tradeoffs.

I do not think we made *any* step in the correct direction yet.

  Remember: do you do your system maintenance in Mathematica?  Why?
  Remember that Wolfram *wanted* you to do this?  Perl5 is much better
  balanced.  You are pulling the blanket to your side of the bed.
 
 I am not sure what point you are trying to make about Mathematica? I
 have read interviews with Wolfram, he is clearly an egomaniac and
 thinks everything should be an expression, but I am not sure he
 was arguing for system management in Mathematica.

I did not mean interviews.  10 years ago I read the manual.  It was
clearly there.

Ilya

Re: RFC 204 (v2) Arrays: Use list reference for multidimensional array access

2000-09-26 Thread Bart Lateur


On Mon, 25 Sep 2000 19:26:38 -0700, Nathan Wiger wrote:

 I agree with both of you. It would be nice if @$ precedence worked as Bart
 specified, but I still think that arrays should be arrays.

The problem is that

   $name = "myarray";
   @$name = (1,2,3);
   print @$name[0,1];  # 1,2

is very consistent currently. Change one and you have to change the
precedence and parsing of all symbolic refs.

You are suggesting to keep a weird precedence rule, just to ease
symbolic dereferencing!?! That's... obscene.

-- 
Bart.

Re: RFC 48 (v4) Replace localtime() and gmtime() with date() and utcdate()

2000-09-26 Thread Russ Allbery


Jonathan Scott Duff [EMAIL PROTECTED] writes:

 Do you mean local time now or local time for all time?  The former is
 easy, the latter hard.  Well, it's not hard for those places where the
 offset from UTC has remained (mostly) constant, but there are some
 places that have an offset from UTC that is a function of time more
 complex than daylight savings.

 Or would C<date()> just use C<localtime()> and punt to the OS/C
 RTL/etc.?

It should just punt.  ANSI/ISO C already requires that the C localtime
call deal with all of this.  We can look at providing our own localtime if
the system is grossly deficient in this respect, but that's an internals
rather than a language issue.
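
As a rough illustration of what punting looks like from Perl 5 today: the built-in localtime and gmtime turn an epoch value into broken-down time, with the local-time rules coming from the system's time zone data rather than from the language. The epoch value below is an arbitrary example chosen for this sketch. Only the UTC line is predictable, since the local line depends on the machine's configuration.

```perl
use strict;
use warnings;
use POSIX qw(strftime);

my $epoch = 969_926_400;    # example value: 2000-09-26 00:00:00 UTC

my $utc   = strftime('%Y-%m-%d %H:%M:%S', gmtime($epoch));
my $local = strftime('%Y-%m-%d %H:%M:%S', localtime($epoch));

print "UTC:   $utc\n";      # 2000-09-26 00:00:00
print "local: $local\n";    # depends on the system's time zone data
```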

-- 
Russ Allbery ([EMAIL PROTECTED]) http://www.eyrie.org/~eagle/

RFC 178 (v5) Lightweight Threads

2000-09-26 Thread Perl6 RFC Librarian


This and other RFCs are available on the web at
  http://dev.perl.org/rfc/

=head1 TITLE

Lightweight Threads

=head1 VERSION

  Maintainer: Steven McDougall [EMAIL PROTECTED]
  Date: 30 Aug 2000
  Last Modified: 26 Sep 2000
  Mailing List: [EMAIL PROTECTED]
  Number: 178
  Version: 5
  Status: Frozen

=head1 ABSTRACT

A lightweight thread model for Perl.

=over 4

=item *

All threads see the same compiled subroutines

=item *

All threads share the same global variables

=item *

Threads can create thread-local storage by C<local>izing global variables

=item *

All threads share the same file-scoped lexicals

=item *

Each thread gets its own copy of block-scoped lexicals upon execution
of C<my>

=item *

Threads can share block-scoped lexicals by passing a reference to a
lexical into a thread, by declaring one subroutine within the scope of
another, or with closures.

=item *

Open code can only be executed by a thread that compiles it

=item * 

The language guarantees atomic data access. Everything else is the
user's problem.

=back 

=over 4

=item Perl

Swiss-army chain saw

=item Perl with threads

juggling chain saws

=back

=head1 CHANGES

=head2 v5

Frozen

=head2 v4

=over 4

=item *

Traded in data coherence for L</Atomic data access>. Added examples 16 and
17. 

=item *

Traded in Primitive operations for L</Locking>

=item *

Dropped L</local> section

=item *

Revised L</Performance> section

=back

=head2 v3

=over 4

=item *

Simplified example 9

=item *

Added L</Performance> section

=back

=head2 v2

=over 4

=item *

Added section on sharing block-scoped lexicals between threads

=item *

Added examples 9, 10, and 11. (N.B. renumbered following examples)

=item *

Fixed some typos

=back


=head1 FROZEN

There was substantial--if somewhat disjointed--discussion of thread
models on perl6-internals. The consensus among those with internals
experience is that this RFC shares too much data between threads, and
that the CPU cost of acquiring a lock for every variable access will
be prohibitive.

Dan Sugalski discussed some of the tradeoffs and sketched an alternate
threading model at

http://www.mail-archive.com/perl6-internals%40perl.org/msg01272.html

however, this has not been submitted as an RFC.


=head1 DESCRIPTION

The overriding design principle in this model is that there is one
program executing in multiple threads. One body of code; one set of
global variables; many threads of execution. I like this model because

=over 4

=item *

I understand it

=item *

It does what I want

=item *

I think it can be implemented

=back


=head2 Notation

=over 4

=item I<main> and I<spawned> threads

We'll call the first thread that executes in a program the I<main>
thread. It isn't distinguished in any other way. All other threads are
called I<spawned> threads.

=item I<open code>

Code that isn't contained in a BLOCK.

=back

Examples are written in Perl5, and use the thread programming model
documented in C<Thread.pm>. Discussions of performance and
implementation are based on the Perl5 internals; obviously, these are
subject to change.


=head2 All threads see the same compiled subroutines

Subroutines are typically defined during the initial compilation of a
program. C<use>, C<require>, C<do>, and C<eval> can later define
additional subroutines or redefine existing ones. Regardless, at any
point in its execution, a program has one and only one collection of
defined subroutines, and all threads see this collection.

Example 1

    sub foo      { print 1 }
    sub hack_foo { eval 'sub foo { print 2 }' }
    foo();
    Thread->new(\&hack_foo)->join;
    foo();

Output: 12. The main thread executes C<foo>; the spawned thread
redefines C<foo>; the main thread executes the redefined subroutine.


Example 2

    sub foo      { print 1 }
    sub hack_foo { eval 'sub foo { print 2 }' }
    foo();
    Thread->new(\&hack_foo);
    foo();

Output: 11 or 12, according as the main thread does or does not make
the second call to C<foo()> before the spawned thread redefines it. If
the user cares which happens first, then they are responsible for
doing their own synchronization, for example, with C<join>, as shown
in Example 1.

Code refs (like all Perl data objects) are reference counted. Threads
increment the reference count upon entry to a subroutine, and
decrement it upon exit. This ensures that the op tree won't be garbage
collected while the thread is executing it.


=head2 All threads share the same global variables

Example 3

    #!/my/path/to/perl
    $a = 1;
    Thread->new(\&foo)->join;
    print $a;

    sub foo { $a++ }

Output: 2. C<$a> is a global, and it is the I<same> global in both the
main thread and the spawned thread.


=head2 Threads can create thread-local storage by C<local>izing global
variables

Example 4

    #!/my/path/to/perl
    $a = 1;
    Thread->new(\&foo);
    print $a;

    sub foo { local $a = 2 }

Output: 1. The spawned thread gets its own copy of C<$a>. The copy of
C<$a> in the main thread is unaffected. It

RFC 185 (v3) Thread Programming Model

2000-09-26 Thread Perl6 RFC Librarian


This and other RFCs are available on the web at
  http://dev.perl.org/rfc/

=head1 TITLE

Thread Programming Model

=head1 VERSION

  Maintainer: Steven McDougall [EMAIL PROTECTED]
  Date: 31 Aug 2000
  Last Modified: 26 Sep 2000
  Mailing List: [EMAIL PROTECTED]
  Number: 185
  Version: 3
  Status: Frozen

=head1 ABSTRACT

This RFC describes the programming interface to Perl6 threads. 
It documents the function calls, operators, classes, methods, or
whatever else the language provides for programming with threads.

=head1 CHANGES

=head2 v3

Frozen

=head2 v2

=over 4

=item * 

Added SYNOPSIS, and wrote a proper ABSTRACT

=item *

Detailed C<async>

=item *

Detailed sharing of lexicals between threads

=item *

Traded Mutexes back for C<lock>, C<try>, and C<unlock>

=item *

Pushed C<Semaphore>, C<Event>, and C<Timer> down into C<Thread::>

=item *

Specified readable, writable and failure to return Events

=item *

Reworked the wait functions

=item *

Added C<Queue>

=back


=head1 FREEZE

There was little, if any, further discussion after version 2.


=head1 SYNOPSIS

  use Thread;
  
  $sub = sub { ... };  
  
  $thread  = new Thread \&func      , @args;
  $thread  = new Thread  $sub       , @args;
  $thread  = new Thread   sub { ... }, @args;
             async { ... };
  $result  = join $thread;
   
  $thread  = this Thread;
  @threads = all  Thread;
  
  $thread1 == $thread2 and ...
  Thread::yield();

  critical { ... };   # one thread at a time in this block
  
  lock $scalar;
  lock @array;
  lock %hash;
  lock &sub;
  
  $ok = try $scalar;
  $ok = try @array;
  $ok = try %hash;
  $ok = try &sub;
  
  unlock $scalar;
  unlock @array;
  unlock %hash;
  unlock &sub;
  
  $event = auto   Thread::Event;
  $event = manual Thread::Event;
           set    $event;
           reset  $event;
           wait   $event;
  
  $semaphore = new Thread::Semaphore $initial;
  $ok        = $semaphore->up($n);
               $semaphore->down;
  $count     = $semaphore->count;
  
  $timer = Thread::Timer->delay($seconds);
  $timer = Thread::Timer->alarm($time);
  $timer->wait;
  
  $event = $fh->readable;
  $event = $fh->writable;
  $event = $fh->failure;
  
  $ok = wait_all(@references);
  $i  = wait_any(@references);
  
  $queue = new Thread::Queue;
           $queue->enqueue($a);
  $a     = $queue->dequeue;
  $empty = $queue->empty;


=head1 DESCRIPTION

=head2 Thread

=over 4

=item I<$thread> = C<new> C<Thread> \&I<func>, I<@args>

Executes I<func>(I<@args>) in a separate thread. The return value is
a reference to the C<Thread> object that manages the thread.

The subroutine executes in its enclosing lexical context. This means
that lexical variables declared in that context may be shared between
threads. See RFC 178 for examples.


=item I<$thread> = C<new> C<Thread> I<$sub>, I<@args>

=item I<$thread> = C<new> C<Thread> C<sub { ... }>, I<@args>

Executes an anonymous subroutine in a separate thread, passing it
I<@args>. The return value is a reference to the C<Thread> object that
manages the thread.

The subroutine is a closure. References to variables in its lexical
context are bound when the C<sub> operator executes. See RFC 178 for
examples.


=item C<async> BLOCK

Executes BLOCK in a separate thread. Syntactically, C<async> BLOCK
works like C<do> BLOCK. C<async> creates a C<Thread> object to manage
the thread, but it does not return a reference to it. If you want the
C<Thread> object, use one of the C<new> C<Thread> forms shown above.

The BLOCK executes in its enclosing lexical context. This means that
lexical variables declared in that context may be shared between
threads.


=item I<$thread> = C<this> C<Thread>

Returns a reference to the C<Thread> object that manages the current
thread.


=item I<@threads> = C<all> C<Thread>

Returns a list of references to all existing C<Thread> objects in the
program. This includes C<Thread> objects created for C<async> blocks.


=item I<$result> = C<join> I<$thread>

=item I<@result> = C<join> I<$thread>

Blocks until I<$thread> terminates. May be called repeatedly,
by any number of threads.

Returns the last expression evaluated in I<$thread>. This expression
is evaluated in list context inside the thread.

If C<join> is called in list context, it returns the entire list; if
C<join> is called in scalar context, it returns the first element of
the list.


=item I<$thread1> == I<$thread2>

Evaluates to true iff I<$thread1> and I<$thread2> reference the same
C<Thread> object.


=item C<Thread::yield()>

Gives the interpreter an opportunity to switch to another thread. The
interpreter is not obligated to take this opportunity, and the calling
thread may regain control after an arbitrarily short period of time.

=back
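
The return convention described for C<join> above (the entire list in list context, the first element in scalar context) can be sketched in plain Perl 5 with C<wantarray>. This is only an illustration of the context handling: the subroutine name C<join_result> and the stored list are hypothetical stand-ins, and no thread is involved.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the list a finished thread evaluated last.
my @thread_result = ('alpha', 'beta', 'gamma');

# Returns the whole list in list context, the first element otherwise.
sub join_result {
    return wantarray ? @thread_result : $thread_result[0];
}

my @all   = join_result();   # list context
my $first = join_result();   # scalar context

print "@all\n";     # alpha beta gamma
print "$first\n";   # alpha
```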


=head2 Critical section

C<critical> is a new keyword. Syntactically, it works like C<do>. 

  critical { ... }; 

The interpreter guarantees that only one thread at a time can execute
a C<critical> block.


=head2 Lock

=over 4

=item C<lock> I<$scalar>

=item C<lock> I<@array>

=item C<lock> I<%hash>

=item C<lock> I<&sub>

Applies a lock to a variable.

If there are no locks applied to

RFC 239 (v2) IO: Standardization of Perl IO Functions to use Indirect Objects

2000-09-26 Thread Perl6 RFC Librarian


This and other RFCs are available on the web at
  http://dev.perl.org/rfc/

=head1 TITLE

IO: Standardization of Perl IO Functions to use Indirect Objects

=head1 VERSION

  Maintainer: Nathan Wiger [EMAIL PROTECTED]
  Date: 15 Sep 2000
  Last Modified: 26 Sep 2000
  Mailing List: [EMAIL PROTECTED]
  Number: 239
  Version: 2
  Status: Frozen

=head1 ABSTRACT

Currently, Perl IO functions follow a C-like style, twiddling values
passed to them and then returning 1 or 0. 

This RFC takes after RFC 14's modifications to open() and proposes
similar modifications to other IO operations, in an attempt to make them
both more internally consistent and also more flexible. These
modifications allow increased modularity and function namespace as well.
In most cases the changes are relatively minor, simply requiring the
user to drop the "," after the first argument.

=head1 DESCRIPTION

I'm running short on time here, so here goes. The following functions
should be modified into the new syntaxes shown below:

  $FH  =  open $file, [@args];
  $ret =  seek $FH $pos;                        # $FH->seek
  $ret =  read $FH $scalar, $len, $offset;      # $FH->read
  $ret =  tell $FH;                             # $FH->tell
  $ret =  ioctl $FH $fun, $scalar;              # $FH->ioctl
  $ret =  flock $FH $op;                        # $FH->flock
  $ret =  fcntl $FH $func, $scalar;             # $FH->fcntl

  $DH  =  open dir $dir;                        # dir->open
  $ret =  seek $DH $pos;                        # $DH->seek
  $ret =  tell $DH;                             # $DH->tell
  $ret =  rewind $DH;                           # $DH->rewind

  $FH  =  sysopen  $file, $mode, $mask;         # "open sys"?
  $ret =  sysread  $FH $scalar, $len, $offset;  # $FH->sysread
  $ret =  syswrite $FH $scalar, $len, $offset;  # $FH->syswrite
  $ret =  sysseek  $FH $pos, $whence;           # $FH->sysseek
   
  $SH  =  open socket $dom, $type, $proto;      # socket->open
  $ret =  connect $SH $name;                    # $SH->connect
  $ret =  recv $SH $scalar, $len, $flags;       # $SH->recv
  $ret =  setsockopt $SH $lev, $opt, $val;      # $SH->setsockopt
  $ret =  socketshut $SH $how;                  # $SH->socketshut
 ($S1,$S2) =  open socket $dom, $type, $proto, 2;   # (socketpair)

 ($R, $W)  =  pipe;                             # "open pipe"?

If you read your Camel, most of these changes simply involve dropping
the "," after the first argument to take advantage of the indirect
object syntax.
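
For readers who have not met the indirect object syntax, here is a small Perl 5 sketch of the one builtin that already uses it, print, next to the equivalent method call from IO::Handle. The in-memory filehandle and the variable names are choices made for this sketch, not part of the proposal.

```perl
use strict;
use warnings;
use IO::Handle;

# Write to an in-memory file so the example needs no files on disk.
my $buffer = '';
open(my $fh, '>', \$buffer) or die "cannot open in-memory handle: $!";

print $fh "indirect\n";      # indirect object syntax: no comma after $fh
$fh->print("method\n");      # the same operation as a method call

close($fh) or die "close failed: $!";
print $buffer;
```

Both calls append to the same handle, which is the equivalence the proposed syntaxes rely on.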

This is not pure sugar. It buys us many important benefits:

   1. Fewer functions. As RFC 14 starts to note, gone are the *dir
  functions, as well as many specialized functions. They simply
  become member functions, increasing function namespace. So,
  tell() can now be used on files, directories, and web docs,
  without having to invent new function names.

   2. Seamless integration with extended file types. Using this
  syntax, we can now do this:

 use http;
 $WEB = open http "http://www.yahoo.com", POST;
 flock $WEB $op;

  And $WEB->flock can die as "unimplemented", without this
  having to be anywhere even remotely in core. It can exist
  as an external module, but to the user it looks like the
  same function used here:

 $FH = open "/etc/motd" or die;
 flock $FH $op;

  Even though it's vastly different.  Users see a clean,
  coherent interface, without having to worry about the
  nuts and bolts or calling special methods.

   3. More consistent syntax. The most frequently-used IO function,
  print, already uses an indirect object syntax for its handles.
  This RFC simply follows the lead and extends this to other
  IO functions as well.

   4. More modular design. This tightly integrates with the idea
  of moving certain functions out of core, like socket(). Now
  you can simply say:

 use socket;  # import socket class
 $SH = open socket $dom, $type, $proto;
 recv $SH $scalar, $len, $offset;  

  Bingo. It walks and talks like an extensible open() thanks
  to the lovely indirect object syntax, but all the functions
  are actually member functions of $SH. 

   5. Less stuff in core. Hand in hand with the above, stuff like 
  recv() doesn't even have to be in the same zip code as core
  anymore. But it walks and talks just like it was.

  *Plus*, there's less namespace pollution because everything's
  a member function. As such, no extensive checks like "Arg 1
  not a socket handle" have to be done. Running a bad function
  yields a standard error:

 recv $not_a_socket, $bad, $args;
 Can't find object method "recv" via "$not_a_socket" ...

  See below for suggestions on a better error message.

   6. Can extend use of the default filehandle. Thanks to this
  approach, we

Re: RFC 319 (v1) Transparently integrate C<tie>

2000-09-26 Thread Piers Cawley


Perl6 RFC Librarian [EMAIL PROTECTED] writes:

 This and other RFCs are available on the web at
   http://dev.perl.org/rfc/
 
 =head1 TITLE
 
 Transparently integrate C<tie>

On the whole I think I'm liking this. But it needs work.

my packed $a;       # just an assertion, RFC 218
$a = get_binary;    # packed->TIESCALAR($a); $a->STORE(..);

I'm not sure there's any reason to delay the tie; It kind of depends
on what happens to stuff like 'defined($a)'. RFC 218 was brought
forward because I didn't want magic to be happening in cases where
C<my Foo $bar> was using a Foo that was a superclass, and there was no
mechanism for knowing that at the time. 

I'm kind of curious to know what you think would happen with the
following. I've commented where I'm confident...

interface Number;
sub TIESCALAR;
sub STORE;
sub FETCH;

package integer implements Number; # I really like this notation
sub TIESCALAR {...};
sub STORE {...};
sub FETCH {...};

my Number $i;        # Number is an interface, so just an assertion
my integer $n;       # integer->TIESCALAR($n);
my non_tied $object; # Just an assertion
defined($n);         # Should be false

$n = 5;
$i = $n;
$n = 10;
print $i;   
$i = $object;   # Assertion fails

$a = more_data; # $a->STORE(...);
$a++;           # $a->STORE($a->PLUS(1));
undef $a;       # $a->DESTROY;
 
my int @b :64bit;   # again, just an assertion

Asserting what? That's not valid syntax at the moment.

@c = @b;    # empty list passed still
@b = (1,2); # int->TIEARRAY(@a, '64bit'); @b->CLEAR(...);
...

Hmm... I think this is somewhat ugly. Assuming that you want
C<my int @b> to imply C<UNIVERSAL::isa(all(@a), 'int')> then tying the
entire array seems a bit weird.

 Note that the C<TIE*> methods will only be called if they exist, just
 like currently. If a given C<TIE*> method does not exist, then the
 appropriate error should be spit out:
 
my Pet @spot = ("fluffy");
Can't locate method "TIEARRAY" via package "Pet"
 
 In this case, the package C<Pet> has declared that it can't handle
 arrays, which is just fine.

Er... You seem to be implying here that *all* classes should have TIE
methods. Which is not good. It's especially not good in cases where
the class is actually an interface. Hmm... maybe we should have the tie
behaviour only happen if the package is declared as implementing an
appropriate 'Tie::Foo' interface:

package integer implements Tie::Scalar;

Note that this would imply that the Tie::Foo modules in the standard
library become interfaces, but I'm not sure that that would be any
great loss...

 =head2 Optimization and Inheritance
 
 One of the main goals behind doing something like this is being able to
 create custom variable types that can take advantage of optimizations,
 and having these variables walk and talk like builtins.
 
 For this reason, it is further proposed that all variable types be
 handled through basic method inheritance in Perl 6. Essentially,
 everything becomes an object and is fully overrideable and redefineable.
 So, for example:

Whoa, now you're stretching.

 [...]

 =head2 Type checking
 
 Nat's upcoming RFC on type checking will propose a C<use strict 'types'>
 pragma. Type checking would be trivial to implement by combining aspects
 of this RFC with the C<use optimize> concept:
 
package Pet : interface;   # RFC 265
use optimize types => ['Dog', 'Cat'];
 
 With this declaration, Perl is now told that anything of type C<Pet> can
 be either a C<Dog> or a C<Cat>. 

Err... Specifying which classes implement an interface in the
interface specification is Wrong Wrong Wrong.

 This means that in your main code:
 
use strict 'types';
my Pet $spot = new Camel;   # oops!
 
 The second line would raise a syntax error.

If your client code wants to insist that the only pets it's interested
in are Dogs or Cats then you should make that assertion somewhere.
You certainly shouldn't assert it in the interface declaration.
Check out the 

my Pet $spot : isa(any(qw/Dog Cat/)) = new Camel; # oops!

style that I proposed elsewhere.

 =head2 The C<:tie> attribute
 
 Making C<tie> this seamless may scare some people. In this case, we may
 wish to add an C<:tie> attribute that can be specified on the
 C<package>:
 
package Pet : tie; # will be auto-tied

Can I just point out that nobody has yet proposed that you can attach
attributes to a package?

 
 Placing this on the package, and not individual subs, makes more sense
 because it dictates how all the package's methods interact.
 
 The idea here is that by fully integrating these concepts, a separate
 C<tie> function will no longer be necessary and will instead be replaced
 by a simple C<:tie> package attribute (or no attribute at all).

I'm not entirely sure what you're driving at here. I thought you were
arguing that *all* packages that created objects would use tie magic,
in

Re: RFC 319 (v1) Transparently integrate C<tie>

2000-09-26 Thread Nathan Wiger


 I'm kind of curious to know what you think would happen with the
 following. I've commented where I'm confident...
 
 interface Number;
 sub TIESCALAR;
 sub STORE;
 sub FETCH;
 
 package integer implements Number; # I really like this notation

Tangentially, yes, it is nice to read, but it prevents multiple
interface specifications. "use interface" is more consistent.

 sub TIESCALAR {...};
 sub STORE {...};
 sub FETCH {...};
 
 my Number $i;        # Number is an interface, so just an assertion
 my integer $n;       # integer->TIESCALAR($n);
 my non_tied $object; # Just an assertion
 defined($n);         # Should be false

Yes. The only potential gotcha is if the user decides to do something
Really Evil and stores a value as part of their TIESCALAR method. Then
$n->FETCH will return that value and defined($n) will be true.

However, this is not the purpose of tie, and I think an appropriate
response is: Don't Do That. I agree with both you and Damian that TIE*
should be called on declaration for consistency. If a person doesn't
know how to use tie, well, that's not our problem. ;-)

 $n = 5;
 $i = $n;
 $n = 10;
 print $i;
 $i = $object;   # Assertion fails

Assuming you've set up your C<use optimize> restrictions appropriately,
then yes. The key is really what 'non_tied' is set up as. If this is a
builtin type that optimizes itself to be a string object, then yes it
will fail. However, if 'non_tied' is just a synonym for 'float' (for
some odd reason) then the last line will be ok.

 my int @b :64bit;   # again, just an assertion
 
 Asserting what? That's not valid syntax at the moment.

But it will be. :-) See RFC 279.
 
 @c = @b;    # empty list passed still
 @b = (1,2); # int->TIEARRAY(@a, '64bit'); @b->CLEAR(...);
 
 Hmm... I think this is somewhat ugly. Assuming that you want
 C<my int @b> to imply C<UNIVERSAL::isa(all(@a), 'int')> then tying the
 entire array seems a bit weird.

Not necessarily. The key is: *how* would you implement the assertion
check? 

If you use tie, then your int class STORE method can do something like
this:

   package int;
   use base 'var';
   # take defaults from var class
   STORE {
      if ( self->isa($_[1]) ) {
         SUPER->STORE($_[0], $_[1]);   # internally store
      } else {
         die "Bad data $_[1]" if ( $under_strict_types );
      }
   }

Now, there's a difference between _could_ do this and _must_ do this.
The idea here is that you could do this, and users wouldn't see any
difference between your custom types and builtin types.
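
For comparison, roughly the same check can be written in Perl 5 today with an explicit tie. This is a sketch under assumptions rather than what the RFC specifies: the class name IntScalar and the integer test are invented for illustration, and the tie has to be requested by hand instead of following from the declaration.

```perl
use strict;
use warnings;

package IntScalar;    # hypothetical class name, for illustration only

sub TIESCALAR {
    my ($class) = @_;
    my $value;                       # starts undefined
    return bless \$value, $class;
}

sub STORE {
    my ($self, $new) = @_;
    die "Bad data $new\n" unless defined $new && $new =~ /\A-?\d+\z/;
    $$self = $new;
}

sub FETCH {
    my ($self) = @_;
    return $$self;
}

package main;

tie my $n, 'IntScalar';              # Perl 5 needs the explicit tie
my $was_defined = defined $n;        # false: FETCH returns undef

$n = 5;                              # calls STORE
my $stored_ok = eval { $n = 'fluffy'; 1 };   # STORE dies, $n keeps 5

print "defined before store: ", ($was_defined ? 'yes' : 'no'), "\n";
print "bad store accepted:   ", ($stored_ok   ? 'yes' : 'no'), "\n";
print "value: $n\n";
```

A freshly tied variable reads as undefined here, which matches the "Should be false" comment quoted above, provided TIESCALAR does not store a value itself.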

 Er... You seem to be implying here that *all* classes should have TIE
 methods. Which is not good.

No, I wasn't trying to imply that, I'll clarify this point. TIE methods
are still completely optional. 

 Err... Specifying which classes implement an interface in the
 interface specification is Wrong Wrong Wrong.

Yes, you're right, and it's way outside the scope of this RFC,
actually. Your idea:

 my Pet $spot : isa(any(qw/Dog Cat/)) = new Camel; # oops!

is much better for this application.
 
 Can I just point out that nobody has yet proposed that you can attach
 attributes to a package?

Didn't Damian propose C<package Foo : interface> already? ;-)

 I'm not entirely sure what you're driving at here. I thought you were
 arguing that *all* packages that created objects would use tie magic,
 in which case the new attribute becomes unnecessary. And if you're not
 proposing that then :tie is too general in the cases where the module
 can only tie to specific variable types. I think you get better
 granularity with interfaces, which are way more general than a special
 new attribute.

No, let me back up a little. The idea is to make it so that tied
interfaces - which are really different beasts from OO interfaces
altogether because of their purpose - should be more closely integrated
into Perl 6. This would allow you to create custom, optimized,
strongly-typed variables that would function just like builtins:

   my Matrix @a = ([1,2,3], [4,5,6]);
   my NISMap %map :passwd = read_passwd_file();
   my Apache::Session %session :transaction;

However, this is not to overshadow OO interfaces, which are needed for
functional methods, as you note.

The :tie attribute is a poorly chosen name. The original name was
:implicit, and was going to be attached to the TIE subs:

   package Demo;
   sub TIESCALAR : implicit { ... }
   sub TIEHASH : implicit { ... }
   sub TIEARRAY { ... }

So in this example, a user can say:

   my Demo $x;
   my Demo %x;
   my Demo @x;   # TIEARRAY not called

However, after thinking about this, I decided this was not a worthwhile
distinction. How often would you want this behavior? So I decided to
attach the attribute to the package:

   package Demo : implicit;

But that really didn't connote what was going on. So I changed it to:

   package Demo : autotie;

But then decided the 'auto' was redundant. However, the more

Re: RFC 279 (v1) my() syntax extensions and attribute declarations

2000-09-26 Thread Alan Gutierrez


On 24 Sep 2000, Perl6 RFC Librarian wrote:

I still hope that it doesn't get as complicated as all this. I know
there are arguments out there for specifying integer size and signedness
but I can't imagine that adding this stuff is a good thing.

 Note that multiple types cannot be specified on the same line. To
 declare variables of multiple types, you must use separate statements:
 
my int ($x, $y, $z) :64bit;
my string ($firstname, $lastname :long);
 
Not so bad if I don't have to worry about 64bit or long. I'd rather not
worry about integer versus string, but I assume there are some
performance gains in doing so.


$a[0] :32bit = get_val;            # 32-bit
$r->{name} :private = "Nate";      # privatize single value
$s->{VAL} :laccess('data') = "";   # lvalue autoaccessor

Here I'd prefer to see private and laccess as functions.

private $r->{name} = 'Nate';
laccess $s->{VAL} = '';

And as far as the :shared modifier goes I much prefer the our keyword.

Alan Gutierrez

Re: RFC 308 (v1) Ban Perl hooks into regexes

2000-09-26 Thread Piers Cawley


Perl6 RFC Librarian [EMAIL PROTECTED] writes:

 This and other RFCs are available on the web at
   http://dev.perl.org/rfc/
 
 =head1 TITLE
 
 Ban Perl hooks into regexes
 
 =head1 VERSION
 
   Maintainer: Simon Cozens [EMAIL PROTECTED]
   Date: 25 Sep 2000 
   Mailing List: [EMAIL PROTECTED]
   Number: 308
   Version: 1
   Status: Developing
 
 =head1 ABSTRACT
 
 Remove C<?{ code }>, C<??{ code }> and friends.
 
 =head1 DESCRIPTION
 
 The regular expression engine may well be rewritten from scratch or
 borrowed from somewhere else. One of the scarier things we've seen
 recently is that Perl's engine casts back its Kraken tentacles into Perl
 and executes Perl code. This is spooky, tangled, and incestuous.
 (Although admittedly fun.)

It's *loads* of fun. Though admittedly, I've not used it in any *real*
code yet...

 It would be preferable to keep the regular expression engine as
 self-contained as possible, if nothing else to enable it to be used
 either outside Perl or inside standalone translated Perl programs
 without a Perl runtime.
 
 To do this, we'll have to remove the bits of the engine that call 
 Perl code. In short: C<?{ code }> and C<??{ code }> must die.

You don't *have* to remove 'em. You can just throw an exception during
compilation if some hypothetical 'no regex subs' pragma is there.

-- 
Piers
'063039183598121887134041122600:1917131105:Jaercunrlkso tPh.'=~/^(.{6})*
(.{6})[^:]*:(..)*(..).*:(??{'.{'.$2%$4.'}'})(.)(??{print$5})/x;print"\n"

Re: RFC 170 (v2) Generalize =~ to a special apply-to assignment operator

2000-09-26 Thread Simon Cozens


On Sun, Sep 17, 2000 at 05:41:57AM -, Perl6 RFC Librarian wrote:
. Some criticized it as being too sugary, since this:
 
$string =~ quotemeta;# $string = quotemeta $string;
 
 Is not as clear as the original. However, there is fairly similar
 precedent in:
 
$x += 5; # $x = $x + 5;

Looks great on scalars, but...

@foo =~ shift;   # @foo = $foo[0]  ?
@foo =~ unshift; # @foo = $foo[-1] ?

Although I have to admit I like:

@foo =~ grep !/\S/;

But I'm not very keen on the idea of

%foo =~ keys;
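
For reference, the Perl 5 spellings that the proposed shorthand would replace look like this. The sample data is made up for illustration, and the comments only restate the equivalences given in the quoted RFC text.

```perl
use strict;
use warnings;

my $string = 'a.b*c';
my @foo    = ('x', ' ', '', 'y z');

$string = quotemeta $string;        # what '$string =~ quotemeta' would mean
@foo    = grep { !/\S/ } @foo;      # what '@foo =~ grep !/\S/' would mean

print "$string\n";                  # a\.b\*c
print scalar(@foo), "\n";           # 2 (the blank and the empty string)
```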

-- 
A formal parsing algorithm should not always be used.
-- D. Gries

Re: RFC 308 (v1) Ban Perl hooks into regexes

2000-09-26 Thread Bart Lateur


On 25 Sep 2000 20:14:52 -, Perl6 RFC Librarian wrote:

Remove C<?{ code }>, C<??{ code }> and friends.

I'm putting the finishing touches on an RFC to drop (?{...}) and replace
it with something far more localized, hence cleaner: assertions, also in
Perl code. That way,

/(?<!\d)(\d+)(?{$1 < 256})/

would only match integers between 0 and 255.

Communications between Perl code snippets inside a regex would be
strongly discouraged.

-- 
Bart.

Re: RFC 308 (v1) Ban Perl hooks into regexes

2000-09-26 Thread Michael Maraist



 On 25 Sep 2000 20:14:52 -, Perl6 RFC Librarian wrote:

 Remove C<?{ code }>, C<??{ code }> and friends.

 I'm putting the finishing touches on an RFC to drop (?{...}) and replace
 it with something far more localized, hence cleaner: assertions, also in
 Perl code. That way,

 /(?<!\d)(\d+)(?{$1 < 256})/

 would only match integers between 0 and 255.

 Communications between Perl code snippets inside a regex would be
 strongly discouraged.

I can't believe that there currently isn't a means of killing a back-track
based on perl-code.  Looking through perlre it seems like you're right.  I'm
not really crazy about breaking backward compatibility like this though.  It
shouldn't be too hard to find another character sequence to perform your
above job.

Beyond that, there's a growing rift between reg-ex extenders and purifiers.
I assume the functionality you're trying to produce above is to find the
first bare number that is less than 256 (your above would match the 25 in
256).. Easily fixed by inserting (?!\d) between the second and third
aggregates.  If you were to be more strict, you could more simply apply
\b(\d+)\b...

In any case, the above is not very intuitive to the casual observers as
might be

while ( /(\d+)/g ) {
  if ( $1 < 256 ) {
    $answer = $1;
    last;
  }
}

Likewise, complex matching tokens are the realm of a parser (I'm almost
getting tired of saying that).  Please be kind to your local maintainer,
don't proliferate n'th order code complexities such as recursive or
conditional reg-ex's.  Yes, I can mandate that my work doesn't use them, but
it doesn't mean that CPAN won't (and I often have to reverse engineer CPAN
modules to figure out why something isn't working).

That said, nobody should touch the various relative reg-ex operators.  I
look at reg-ex as a tokenizer, and things like (?...) which optimizes
reading, and (?!..), etc are very useful in this realm.

Just my $0.02

-Michael

Re: RFC 308 (v1) Ban Perl hooks into regexes

2000-09-26 Thread Bart Lateur


On Tue, 26 Sep 2000 13:32:37 -0400, Michael Maraist wrote:



I can't believe that there currently isn't a means of killing a back-track
based on perl-code.  Looking through perlre it seems like you're right.

There is, but as MJD wrote: "it ain't pretty". Now, semantic checks or
assertions would be the only reason why I'd expect to be able to execute
perl code every time a part of a regex is successfully parsed. Simply
look at RFC 197: a syntactic extension to regexes just to check if a
number is within a range! That is absurd, isn't it? Would a simple way
to include localized tests, *any* test, make more sense?

I'm
not really crazy about breaking backward compatibility like this though.  It
shouldn't be too hard to find another character sequence to perform your
above job.

Me neither. But many prominent people in the Perl World have expressed
their amazement when they found out that the purpose of embedding Perl
in a regex wasn't aimed to just do this kind of tests. (?{...}) hasn't
even been tried out yet by many people, let alone that they'd use it in
production code. (?{...}) is notorious for dumping core. I can't see why
it can't be recycled. After all, it still executes Perl code.

Beyond that, there's a growing rift between reg-ex extenders and purifiers.
I assume the functionality you're trying to produce above is to find the
first bare number that is less than 256 (your above would match the 25 in
256).. 

You're forgetting about greediness. This test simply answers the
question: "will this do?" If the answer is always yes, the regex will
*always* match the same thing as it would do without this assertion.
Compare it to other assertions, such as /\b/, anchors (/^/ and /$/), and
lookahead and lookbehind. These too don't really control what it would
match. They can only express their veto.

In any case, the above is not very intuitive to the casual observers as
might be

while ( /(\d+)/g ) {
  if ( $1 < 256 ) {
    $answer = $1;
    last;
  }
}

Maybe for this simple example. But the same can be said of lookahead and
lookbehind. It takes a *bit* of getting used to, but it's very simple,
and very powerful. IMO.

Likewise, complex matching tokens are the realm of a parser (I'm almost
getting tired of saying that).  Please be kind to your local maintainer,
don't proliferate n'th order code complexities such as recursive or
conditional reg-ex's.

I said nothing of recursive regexes. Again, just look at RFC 197, and
see what complex rules people would like to cram into a regex. Or look
at the examples in Friedl's book, to see what contortions people put
themselves through, just to make sure that they only match numbers
between 0 and 23:

/[01]?[0-9]|2[0-3]/
/[01]?[4-9]|[012]?[0-3]/

So you think these are easy on the maintainer? I think not. A simple
boolean expression, "match a number and it must be 23 or less", is far
simpler, at least to me.
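
A minimal sketch of the contrast, using today's Perl 5 rather than the proposed assertion syntax. The sub names are hypothetical, and the pure-regex version assumes the Friedl-style pattern `[01]?[0-9]|2[0-3]`, anchored at both ends.

```perl
use strict;
use warnings;

# Pure-regex test for an hour in 0..23.
sub is_hour_regex {
    my ($s) = @_;
    return $s =~ /^(?:[01]?[0-9]|2[0-3])$/ ? 1 : 0;
}

# "Match a number and it must be 23 or less": one or two digits, then a
# plain numeric comparison.
sub is_hour_check {
    my ($s) = @_;
    return ($s =~ /^(\d{1,2})$/ && $1 <= 23) ? 1 : 0;
}

for my $s (qw(0 7 09 19 23 24 99)) {
    printf "%-3s regex=%d check=%d\n", $s, is_hour_regex($s), is_hour_check($s);
}
```

Changing the upper bound in the second version means editing one number; in the first it means redesigning the character classes.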

-- 
Bart.

Re: RFC 308 (v1) Ban Perl hooks into regexes

2000-09-26 Thread Michael Maraist


 There is, but as MJD wrote: "it ain't pretty". Now, semantic checks or
 assertions would be the only reason why I'd expect to be able to execute
 perl code every time a part of a regex is successfully parsed. Simply
 look at RFC 197: a syntactic extension to regexes just to check if a
 number is within a range! That is absurd, isn't it? Would a simple way
 to include localized tests, *any* test, make more sense?

I'm trying to stick to a general philosophy of what's in a reg-ex, and I can
almost justify assertions since as you say, \d, ^, $, (?=), etc are these
very sort of things.  I've been avoiding most of this discussion because
it's been so odd, I can't believe they'll ultimately get accepted.  Given
the argument that it's unlikely that (?{code}) has been implemented in
production, I can almost see changing its semantics.  From what I
understand, the point would be to run some sort of perl-code and return
defined / undefined, where undefined forces a back-track.

As you said, we shouldn't encourage full-fledged execution (since core dumps
are common).  I can definitely see simple optimizations such as (?{$1 op
const}), though other interesting things such as (?{exists $keywords{ $1 }})
might proliferate.  That would expand to the general purpose (?{
isKeyword( $1 ) }), which then allows function calls within the reg-ex,
which is just asking for trouble.

One restriction might be to disallow various op-codes within the reg-ex
assertion.  Namely user-function calls, reg-ex's, and most OS or IO
operations.

A very common thing could be an optimal /(?>\d+)(?{MIN < $1 && $1 < MAX})/,
where MIN and MAX are constants.

-Michael

Re: RFC 308 (v1) Ban Perl hooks into regexes

2000-09-26 Thread Hugo


In 005501c027eb$43bafe60$[EMAIL PROTECTED], "Michael Maraist" writes:
:As you said, we shouldn't encourage full-fledged execution (since core dumps
:are common).

Let's not redefine the language just because there are bugs to fix.
Surely it is better to concentrate first on fixing the bugs so that
we can then more fairly judge whether the feature is useful enough
to justify its existence.

:One restriction might be to disallow various op-codes within the reg-ex
:assertion.  Namely user-function calls, reg-ex's, and most OS or IO
:operations.

That seems quite unreasonable. Why do you _want_ to restrict someone
from calling isKeyword($1) within the regexp, which will then read
the keyword patterns from a file and check $1 against those patterns
using regexps? It seems like an entirely reasonable and useful thing
to do.
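
A minimal sketch of that kind of check in Perl 5 as it stands, using a conditional code block followed by `(?!)` to veto the match. Assumptions: the keyword list is hard-coded here rather than read from a file, and `isKeyword` is just the name used in this thread, not an existing function.

```perl
use strict;
use warnings;

# Assumption: a fixed keyword list stands in for patterns read from a file.
my %keywords = map { $_ => 1 } qw(if else while);

sub isKeyword {
    my ($word) = @_;
    return exists $keywords{$word};
}

# Match a whole lowercase word, then veto it unless isKeyword() accepts it.
my $keyword = qr/\b([a-z]+)\b(?(?{ !isKeyword($1) })(?!))/;

print "found: $1\n" if "foo while bar" =~ $keyword;
```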

Hugo

Re: RFC 308 (v1) Ban Perl hooks into regexes

2000-09-26 Thread Hugo


In [EMAIL PROTECTED], Bart Lateur writes:
:On 25 Sep 2000 20:14:52 -, Perl6 RFC Librarian wrote:
:
:Remove C<?{ code }>, C<??{ code }> and friends.
:
:I'm putting the finishing touches on an RFC to drop (?{...}) and replace
:it with something far more localized, hence cleaner: assertions, also in
:Perl code. That way,
:
:   /(?<!\d)(\d+)(?{$1 < 256})/
:
:would only match integers between 0 and 255.

I'd like to suggest an alternative semantic for this: rename
(??{ code }) to (?{ code }), and use the newly freed (??{ code })
for the assertions. (I was about to write an RFC for just that, so
I'm glad I can save a bit of time. :)

Hugo

Re: RFC 170 (v2) Generalize =~ to a special apply-to assignment operator

2000-09-26 Thread Nathan Wiger


Simon Cozens wrote:
 
 Looks great on scalars, but...
 
 @foo =~ shift;   # @foo = $foo[0]  ?
 @foo =~ unshift; # @foo = $foo[-1] ?

Yes, if you wanted to do something that twisted. :-) It probably makes
more sense to do something like these:

   @array =~ reverse;
   @vals =~ sort { $a <=> $b };
   @file =~ grep !/^#/;
 
 Although I have to admit I like:
 
 @foo =~ grep !/\S/;

Exactly!
 
 But I'm not very keen on the idea of
 
 %foo =~ keys;

Again, that depends on whether or not you're Really Evil. ;-)
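
For reference, a minimal sketch of how the examples in this thread are spelled in Perl 5 today, assuming the third one is meant to drop comment lines. The sample data is made up.

```perl
use strict;
use warnings;

my @array = (1, 2, 3);
my @vals  = (10, 9, 100);
my @file  = ("# comment\n", "code();\n");
my @foo   = ('a', 'b', 'c');

@array = reverse @array;              # proposed: @array =~ reverse;
@vals  = sort { $a <=> $b } @vals;    # proposed: @vals =~ sort { $a <=> $b };
@file  = grep { !/^#/ } @file;        # proposed: @file =~ grep !/^#/;
@foo   = shift @foo;                  # proposed: @foo =~ shift;  leaves ('a')

print "@array | @vals | @foo\n";
```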

-Nate

Re: RFC 299 (v1) C<@STACK> - a modifyable C<caller()>

2000-09-26 Thread Piers Cawley


Perl6 RFC Librarian [EMAIL PROTECTED] writes:

 This and other RFCs are available on the web at
   http://dev.perl.org/rfc/
 
 =head1 TITLE
 
 C<@STACK> - a modifyable C<caller()>

Why am I having bad thoughts along the lines of:

   local @STACK = @SAVED_STACK

I don't know what it'd do, but it'd be fun to find out...

-- 
Piers

Re: RFC 288 (v1) First-Class CGI Support

2000-09-26 Thread Nathan Wiger


 The http_header() is a straw man intended to demonstrate that there
 are issues with shoving all of the outgoing HTTP headers into a simple
 variable.  Not insoluble problems, but problems.

Agreed.
 
 I do like the idea of stacking HTTP headers and queueing them up
 before the first print statement, where that print is a little
 more magical than every subsequent print.  I'd imagine that if the
 HTTP headers were blank, it would Do The Right Thing (tm) and emit
 a "Content-type: text/html\n\n", before replacing itself with the
 standard print builtin.

Don't get me wrong, I like this idea too.
 
 And if HTTP headers are segregated from document content, it's
 easier to determine when the headers are finished and when the
 content starts.  That aids in determining when '\n\n' is appropriate.

Yes.
 
 Robust input parsing: yes.
 
 General purpose output formatting: no, nyet, nein, non, "over my dead body".

I'm guessing you mean "nyet" to "general purpose formatting *only*". :-)

After sending that last email, I was sitting here drinking a beer, and
it occurred to me that tons of headers all use the same format:

   Tag-With-Hyphens: unspecified value of some type

This is true for HTTP, Mail, and lots of other applications.

Maybe what we need is a general "headers()" function, which could
produce a stack of these headers, something like:

   @http_headers = headers(content-type => 'text/html',
                           last-modified => $date);

   @mail_headers = headers(from => '[EMAIL PROTECTED]',
                           to => '[EMAIL PROTECTED]');

This function could always be present in Perl. I could see *lots* of
uses for such a function. Then, perhaps the "use cgi" pragma could
simply alter the semantics of this builtin to stack the headers given to
it by the headers() function and auto-output them on print.

That way, we're changing the semantics of headers(), not importing it.
Just like 'use tristate' changes the semantics of undef() without
importing a new function. And 'headers()' makes just as much sense as
having 'quotemeta()' in core - there's many potential uses for it.
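
A minimal sketch of what such a function might look like in plain Perl 5. `headers()` is hypothetical, not an existing builtin; capitalizing each hyphen-separated word is an assumed convention, and the tags are quoted because Perl 5 does not auto-quote hyphenated barewords before `=>`.

```perl
use strict;
use warnings;

# Hypothetical headers(): turn tag/value pairs into "Tag-With-Hyphens: value" lines.
sub headers {
    my @pairs = @_;
    my @out;
    while (my ($tag, $value) = splice @pairs, 0, 2) {
        (my $name = $tag) =~ s/(\w+)/\u\L$1/g;   # content-type -> Content-Type
        push @out, "$name: $value";
    }
    return @out;
}

my @http_headers = headers('content-type'  => 'text/html',
                           'last-modified' => 'Tue, 26 Sep 2000 00:00:00 GMT');
print "$_\n" for @http_headers;
```

Returning a list keeps the order of the headers and leaves it to the caller to decide when they are actually printed.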

How's something like that sound?

-Nate

Re: RFC 288 (v1) First-Class CGI Support

2000-09-26 Thread Adam Turoff


On Tue, Sep 26, 2000 at 05:02:02PM +1100, iain truskett wrote:
 Is there much point having a lightweight CGI module? If you say 'I want
 it to load quickly', I say 'get mod_perl'.

There's more to it than just loading quickly.  It should load quickly
as in "load everything that's absolutely necessary but nothing more."
Taint mode is necessary, and half the reason for this proposal.

That should make it faster than it is today (ms) and faster than 
it is today in mod_perl (us).

  I do like the idea of stacking HTTP headers and queueing them up
  before the first print statement, where that print is a little more
  magical than every subsequent print.  I'd imagine that if the HTTP
  headers were blank, it would Do The Right Thing (tm) and emit a
  "Content-type: text/html\n\n", before replacing itself with the
  standard print builtin.
 
 This would be useful. I'd be more inclined to have it with the CGI
 module though. But then, it would need to be an option since sometimes
 programs use CGI bits and pieces but don't run in a CGI context (such
 as mod_perl scripts, and the odd script I have for generating
 semi-static documents).

This has nothing to do with CGI context.
 
  Robust input parsing: yes.
 
  General purpose output formatting: no, [...]
 
  Rudimentary HTTP header emission: probably.
 
 I think it all belongs in the CGI module really. 

Then you're probably against this proposal.  No biggie.

 The assorted Apache modules (and HTML::Embperl and similar) and the CGI
 module really do provide proper facilities for CGI operations. Having
 "rudimentary" features in the core would be duplication and hence a
 waste. 

Even if the core implementations were so lightweight that they could
be reused as-is across all the other modules?

Z.

Re: RFC 143 (v2) Case ignoring eq and cmp operators

2000-09-26 Thread Markus Peter


"David L. Nicol" wrote:
 
  Perl currently only has C<eq> and C<cmp> operators which work case-sensitively.
  It would be a useful addition to add case-insensitive equivalents.
 
 As I recall, the consensus the last time this came up was that C<cmpi> and
 C<eqi> would be perfect examples w/in a RFC proposing a way to declare
 a function to take its arguments in infix instead of prefix manner.

Well - it only came to the list again as I retired the RFC as most
people thought this was not important enough :-)

-- 
Markus Peter - SPiN GmbH
[EMAIL PROTECTED]

Re: RFC 288 (v1) First-Class CGI Support

2000-09-26 Thread Alan Gutierrez


On Tue, 26 Sep 2000, iain truskett wrote:

 * Adam Turoff ([EMAIL PROTECTED]) [26 Sep 2000 17:15]:
  On Tue, Sep 26, 2000 at 05:02:02PM +1100, iain truskett wrote:
   Is there much point having a lightweight CGI module? If you say 'I want
   it to load quickly', I say 'get mod_perl'.

Agreed. The bottleneck in standard CGI is not parsing the query string
or form post, it is the fork.
 
  There's more to it than just loading quickly.  It should load quickly
  as in "load everything that's absolutely necessary but nothing more."
  Taint mode is necessary, and half the reason for this proposal.

Make it so. Find a way to turn tainting on for CGI. But don't clutter
the core with application specific functions. Doesn't everyone want to
remove the sockets library from the core?

 But is anything else? I like what Nathan Wiger suggested --- that of the
 common headers function, but I also believe that belongs as a separate
 module. Hence:

<snip> Nice code Mr. Truskett.

Agreed. Split CGI into component parts. If all you want is headers, or
cookies, then you can use those modules without incurring the penalty of
loading all of CGI.pm.

Robust input parsing: yes.
   
General purpose output formatting: no, [...]
   
Rudimentary HTTP header emission: probably.

So this is the definition of first-class? As I've said before,
first-class CGI to me means a language where I can focus on the HTML or
XML I am creating. An example of a first-class CGI language is ASP or my
beloved HTML::Embperl. I don't bother with CGI anymore, and when I did I
was content with CGI.pm.

   I think it all belongs in the CGI module really.
 
  Then you're probably against this proposal.  No biggie.
 
 Yeah. I contemplated saying that outright in the previous one but was
 having trouble phrasing it in a friendly sense =)

It seems like we are talking about pulling some functions from a module
to the core. And for no real good reason. Is query string parsing or
header processing so time consuming that it must be implemented in C?

For any sizeable application input and headers will not be enough.
You'll need cookies and redirection certainly. At which point you will
load CGI.pm anyway (if you are bothering to create this in classic CGI).

Tainting sure. Functions no.

Alan Gutierrez

Re: RFC 283 (v1) C<tr///> in array context should return a histogram

2000-09-26 Thread Paris Sinclair


On Mon, 25 Sep 2000, Simon Cozens wrote:

 On Mon, Sep 25, 2000 at 09:55:38AM +0100, Richard Proctor wrote:
  While this may be a fun thing to do - why?  what is the application?
 
 I think I said in the RFC, didn't I? It's extending the counting use of tr///
 to allow you to count several different letters at once. For instance, letter
 frequencies in text is an important metric for linguists, codebreakers and
 others; think about how you'd get letter frequency from a string:
 
 $as = $string =~ tr/a//;
 $bs = $string =~ tr/b//;
 $cs = $string =~ tr/c//;
 ...
 $zs = $string =~ tr/z//;
 
 Ugh.
 
 (%alphabet) = $string =~ tr/a-z//;
 
 Yum.

also a little more concise (and certainly more efficient...) than

%alphabet = map { $_ => eval "\$string =~ tr/$_//" } ('a'..'z');

Context is beautiful; it's at least 50% of the reason I love Perl. I would
love to see it extended here.
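
Until something like that exists, a minimal sketch of a one-pass histogram in Perl 5, without `eval` and without 26 separate `tr` calls. The helper name `letter_histogram` is hypothetical.

```perl
use strict;
use warnings;

# Hypothetical helper: count how often each character in a-z occurs.
sub letter_histogram {
    my ($string) = @_;
    my %count;
    $count{$_}++ for $string =~ /[a-z]/g;
    return %count;
}

my %alphabet = letter_histogram("just another perl hacker");
print "$_: $alphabet{$_}\n" for sort keys %alphabet;
```

Unlike the `map` version, letters that never occur are simply absent from the hash rather than present with a count of 0.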

Paris Sinclair|4a75737420416e6f74686572
[EMAIL PROTECTED]|205065726c204861636b6572
www.sinclairinternetwork.com

Re: RFC 288 (v1) First-Class CGI Support

2000-09-26 Thread Adam Turoff


On Tue, Sep 26, 2000 at 04:41:21AM -0400, Alan Gutierrez wrote:
 Robust input parsing: yes.

 General purpose output formatting: no, [...]

 Rudimentary HTTP header emission: probably.
 
 So this is the definition of first-class? 

Have you read the RFC?

Have you read the message you're quoting here?  In full?  In context?

Here's the abstract again:

Perl is frequently used in CGI environments.  It should be
as easy to write CGI programs with perl as it is to write
commandline text filters.

For the purposes of this discussion *this* is what first class means.

 As I've said before,
 first-class CGI to me means a language where I can focus on the HTML or
 XML I am creating. An example of a first-class CGI language is ASP or my
 beloved HTML::Embperl. I don't bother with CGI anymore, and when I did I
 was content with CGI.pm.

Have you been reading this thread?  I've already said in at least
one occasion that playing favorites between embperl, mason, template 
toolkit, text::template, Format, autoformat, xml::writer and such 
is *NOT* the intent of this RFC.

Please stop inferring that this should be a way of getting *your* own
*personal* favorite CGI module included into the core, especially
when you say later it wouldn't make sense.

I'm not saying any of that.  
 
 It seems like we are talking about pulling some functions from a module
 to the core. And for no real good reason. Is query string parsing or
 header processing so time consuming that it must be implemented in C?

It seems you are mistaken.  We are not talking about implementing 
string or header processing in C.  Please re-read this thread.

 For any sizeable application input and headers will not be enough.
 You'll need cookies and redirection certainly. At which point you will
 load CGI.pm anyway (if you are bothering to create this in classic CGI).

Please re-read the CGI specification as well.  Cookies and redirection
are both HTTP headers, and do not require loading CGI.pm.
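
A minimal sketch of that point: a redirect that also sets a cookie, written as plain header lines from a CGI program. The URL and cookie value are made-up placeholders.

```perl
use strict;
use warnings;

# Hypothetical values throughout. A CGI program emits header lines,
# then a blank line, then the body (empty here).
my @head = (
    "Status: 302 Found",
    "Location: http://www.example.com/next",
    "Set-Cookie: session=abc123; path=/",
);
my $response = join("\n", @head) . "\n\n";
print $response;
```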

Z.

RE: RFC 264 (v1) Provide a standard module to simplify the creation of source filters

2000-09-26 Thread Paul Marquess


From: Damian Conway [mailto:[EMAIL PROTECTED]]

...

 No. That's my point. I want to match BANG followed by maximal whitespace
 followed by another BANG. But a line-by-line filter fails dismally if that
 maximal whitespace contains a newline.

 Admittedly this particular example is contrived for effect, but I have
 now used your excellent module in numerous projects when constructs *can*
 cross newline boundaries and it's always painful to do (hence the new
 module). In fact, I would claim that filtering constructs across
 newline boundaries is the *norm*, since newlines are just whitespace most
 places in Perl.

Aaah! I see where you are coming from now. Is that mentioned in the RFC and
module docs? If not, it really needs to be emphasised. It completely passed
me by.
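
A minimal sketch of the failure mode, using an ordinary substitution in place of a real source filter. The replacement text `DOUBLEBANG` is made up for the illustration.

```perl
use strict;
use warnings;

my $source = "BANG   \n   BANG rest\n";

# Whole-text filter: \s* happily spans the newline.
(my $whole = $source) =~ s/BANG\s*BANG/DOUBLEBANG/g;

# Line-by-line filter: each line is rewritten in isolation, so the
# construct that straddles the newline is never seen.
my $by_line = join '', map {
    (my $line = $_) =~ s/BANG\s*BANG/DOUBLEBANG/g;
    $line;
} split /^/, $source;

print $whole   eq $source ? "whole: unchanged\n"   : "whole: rewritten\n";
print $by_line eq $source ? "by line: unchanged\n" : "by line: rewritten\n";
```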

If you like, next time I do a Filters release, I'll update my documentation
to mention your module.

Paul



Re: RFC 288 (v1) First-Class CGI Support

2000-09-26 Thread Jonathan Scott Duff


On Tue, Sep 26, 2000 at 12:04:50AM -0400, Adam Turoff wrote:
 On Mon, Sep 25, 2000 at 07:50:28AM +0100, Richard Proctor wrote:
  On Mon 25 Sep, Perl6 RFC Librarian wrote:
   Turn on tainting
  
  What would it do on a platform that does not support Tainting?
 
 Is this a real issue?  Is there a platform where tainting isn't
 supported?

I wouldn't think so.  Tainting is a Perl thing.  Perl does its best
to mark "unsafe" things as tainted.  What's unsafe and Perl's best
vary from platform to platform, but tainting still happens.  (But this
is from a guy who has only used but 6 or 7 of the OSes that Perl has
been ported to)

-Scott
-- 
Jonathan Scott Duff
[EMAIL PROTECTED]

Re: RFC 290 (v1) Remove -X

2000-09-26 Thread Jonathan Scott Duff


On Tue, Sep 26, 2000 at 12:34:00AM -0400, Adam Turoff wrote:
 Making '@permissions = -rwx $filename;' work is an interesting new
 suggestion.

Yep.

 Of course, I should say that I've been hanging out with some
 snake-hearders recently.  

Hey, we could learn a thing or two from some snake herders.  (Watch
how they get bit and don't do the same ;-)

 I'll revise the RFC to add 'readable()', 'writable()', and such
 synonyms for -r and -w that are more like 'use english' and less like
 'use English'.

Excellent.

-Scott
-- 
Jonathan Scott Duff
[EMAIL PROTECTED]

Re: RFC 320 (v1) Allow grouping of -X file tests and add C<filetest> builtin

2000-09-26 Thread Jonathan Scott Duff


On Tue, Sep 26, 2000 at 05:53:13AM -, Nate Wiger wrote:
 Currently, file tests cannot be grouped, resulting in very long
 expressions when one wants to check to make sure some thing is a
 readable, writeable, executable directory:
 
if ( -d $file && -r $file && -w $file && -x $file ) { ... }

Non-novice perl programmers would probably write this as you have
below with the special _ filehandle.  Perhaps you should move that to
the fore and comment on its unreadability and general ickiness :-)

 It would be really nice if these could be grouped instead:
 
if ( -drwx $file ) { ... }

Indeed it would.
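
For reference, a minimal sketch of the `_` filehandle form in current Perl 5. The helper name `is_usable_dir` is hypothetical.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Hypothetical helper: true if $file is a readable, writable, searchable directory.
sub is_usable_dir {
    my ($file) = @_;
    # -d does the stat; "_" reuses that stat buffer for the remaining tests.
    return (-d $file && -r _ && -w _ && -x _) ? 1 : 0;
}

my $dir = tempdir(CLEANUP => 1);
print is_usable_dir($dir) ? "usable\n" : "not usable\n";
```

It saves the repeated stat calls, but the intent is no more obvious to a reader than the long form.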

-Scott
-- 
Jonathan Scott Duff
[EMAIL PROTECTED]

Re: RFC 320 (v1) Allow grouping of -X file tests and add C<filetest> builtin

2000-09-26 Thread John L. Allen




On 26 Sep 2000, Perl6 RFC Librarian wrote:

 =head1 TITLE
 
 Allow grouping of -X file tests and add C<filetest> builtin

Nice summary.  Thanks.

 =head1 IMPLEMENTATION
 
 This would involve making C<-[a-zA-Z]+> a special token in all contexts,
 serving as a shortcut for the C<filetest> builtin.
 
 =head1 MIGRATION
 
 There is a subtle trap if you are negating subroutines:
 
$result = -drwx $file;
 
 And expect this to be parsed like this:
 
$result = - drwx($file);
 
 However, usage such as this is exceedingly unlikely, and can simply be
 resolved by the p52p6 translator looking for C<-([a-zA-Z]{2,})> and
 replacing it with C<- $1>, since injecting a single space will break up
 the token.

I can't believe that special-casing the token -[rwxoRWXOezsfdlpSbctugkTBMAC]+
is an acceptable solution.  I mean think of all the existing perl keywords 
that that already matches: -pos, -cos, -lc, -uc, -fork, -use, -pop, -exp, 
-oct, -log, -ord + others!.  A lot of these already have a legitimate 
not-filetest meaning, others not :-)  So it still seems to me that some 
sort of disambiguating syntax is needed, if not actually findable, to 
save this filetest grouping idea of mine from the scrap heap.  I guess 
the p5-to-p6 converter could still resolve this as you stated, but I just 
don't like it...
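
A minimal sketch of how to check which names collide, using the letter set quoted above. The list of builtins is a hand-picked sample, not an exhaustive one.

```perl
use strict;
use warnings;

# The filetest letters from the token above.
my $filetest_token = qr/^[rwxoRWXOezsfdlpSbctugkTBMAC]+$/;

# A hand-picked sample of Perl builtins, not an exhaustive list.
my @builtins = qw(pos cos lc uc fork use pop exp oct log ord print shift keys);

my @collide = grep { $_ =~ $filetest_token } @builtins;
print "would collide as -NAME: @collide\n";
```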

John.

Re: RFC 290 (v1) Remove -X

2000-09-26 Thread Uri Guttman


 "JSD" == Jonathan Scott Duff [EMAIL PROTECTED] writes:

   I'll revise the RFC to add 'readable()', 'writable()', and such
   synonyms for -r and -w that are more like 'use english' and less like
   'use English'.


i have a minor problem with the names readable and writeable. i am
currently using them as method names in a major project i have
created. they are callbacks (see my recent callback rfc) which mean the
socket is readable/writeable. maybe put some sort of prefix on them to
designate them as file test operators so we don't clutter up the
namespace so much. also the common prefix will make it clear that they
are part of the family of file test ops.

here are some ideas which you can shoot down:

a minus prefix like the current -r

-readable
-tty

or has_ as the file has read permission:

has_readable
has_executable

or is_ if the file is a text file:

is_text
is_sticky
is_writable

also you have to differentiate -R from -r.

is_euid_readable
is_ruid_writeable

and then names for -M/A/C don't work with is/has:

file_mod_time
file_access_time


i don't mind having the longer names as an option, but it should be a
pragma/module and the namespace has to be clean.
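
a minimal sketch of the module route in current perl 5, keeping the long names in their own package so nothing lands in the caller's namespace. the package name and every sub name here are hypothetical.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical package and sub names, following the is_/file_ ideas above.
{
    package FileTest::Names;
    sub is_euid_readable { return -r $_[0] ? 1 : 0 }   # -r: effective uid/gid
    sub is_ruid_readable { return -R $_[0] ? 1 : 0 }   # -R: real uid/gid
    sub is_text          { return -T $_[0] ? 1 : 0 }   # -T: looks like a text file
    sub file_mod_time    { return -M $_[0] }           # -M: days since modification,
                                                       #     relative to script start
}

my ($fh, $path) = tempfile(UNLINK => 1);
print {$fh} "plain text\n";
close $fh;

print "readable\n" if FileTest::Names::is_euid_readable($path);
```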

uri

-- 
Uri Guttman  -  [EMAIL PROTECTED]  --  http://www.sysarch.com
SYStems ARCHitecture, Software Engineering, Perl, Internet, UNIX Consulting
The Perl Books Page  ---  http://www.sysarch.com/cgi-bin/perl_books
The Best Search Engine on the Net  --  http://www.northernlight.com

Re: RFC 320 (v1) Allow grouping of -X file tests and add C<filetest> builtin

2000-09-26 Thread John Porter


John L. Allen wrote:
 
 I can't believe that special-casing the token -[rwxoRWXOezsfdlpSbctugkTBMAC]+
 is an acceptable solution.  I mean think of all the existing perl keywords 
 that that already matches: -pos, -cos, -lc, -uc, -fork, -use, -pop, -exp, 
 -oct, -log, -ord + others!.  A lot of these already have a legitimate 
 not-filetest meaning, others not :-)  

Yeah, not to mention the fact that many modules, notably CGI.pm,
are arranged to allow unquoted strings of the form -name:

print textfield( -name => 'description' );
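
A minimal sketch of why that idiom works today: unary minus applied to a bareword yields the string "-name", even under `use strict`. The sub `textfield_args` is a hypothetical stand-in, not CGI.pm's `textfield`.

```perl
use strict;
use warnings;

# Hypothetical stand-in for a CGI.pm-style function taking -name => value pairs.
sub textfield_args {
    my %args = @_;
    return join ',', map { "$_=$args{$_}" } sort keys %args;
}

# Unary minus on the bareword gives the string "-name", even under strict.
print textfield_args( -name => 'description' ), "\n";
```

Any scheme that turns `-name` into a file test token would have to leave this usage alone.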

-- 
John Porter

Aus des Weltalls ferne  funken Radiosterne.

Re: RFC 292 (v1) Extensions to the perl debugger

2000-09-26 Thread Johan Vromans


Perl6 RFC Librarian [EMAIL PROTECTED] writes:

 The ability to easily retrieve and edit your N most recent commands to the
 debugger (much like a bash_history).

and

 A better default pager.  The default pager should assume a 24x80 term
 window ...

To me, these clearly indicate that the debugger should be run as a
subsystem of some other tool that takes care of this. It is not part
of the debugger itself. For example, take a look at how emacs runs the
debugger.

-- Johan

Re: perl6storm #0050

2000-09-26 Thread Johan Vromans


Philip Newton [EMAIL PROTECTED] writes:

 so fewer "cluttering"
 parentheses are needed to make things readable while still being correct.

Since when do parentheses make things less readable?
What is your definition of readable?

-- Johan

Re: perl6storm #0050

2000-09-26 Thread Johan Vromans


Philip Newton [EMAIL PROTECTED] writes:

 so fewer "cluttering"
 parentheses are needed to make things readable while still being correct.

By the same reasoning, you can reduce the use of curlies by using
indentation to define block structure.

-- Johan

Re: RFC 290 (v1) Remove -X

2000-09-26 Thread Clayton Scott


"John L. Allen" wrote:
 The use of a caret was to prevent decimation of the user's namespace,

 perl -e 'print -^rwx $_'
 syntax error at -e line 1, near "-^"
 Execution of -e aborted due to compilation errors.

The only problem I have with a caret is that to me the proposal 
 doesn't "jibe" with its current use of negating a character class in
 a regexp and its proposed use for currying.

Clayton

Re: proposed RFC. lindex() function...

2000-09-26 Thread Webmaster


Dear Iain,
I had a few moments, so I tried to put together a subroutine that would
express what I was thinking. It's attached with the script that I used to
test it.
Grant M.
[EMAIL PROTECTED]


 lndxexmp.pl
 lindex.pl

Re: RFC 292 (v1) Extensions to the perl debugger

2000-09-26 Thread Dave Storrs


On 26 Sep 2000, Johan Vromans wrote:

 Perl6 RFC Librarian [EMAIL PROTECTED] writes:
 
  The ability to easily retrieve and edit your N most recent commands to the
  debugger (much like a bash_history).
 and
  A better default pager.  The default pager should assume a 24x80 term
  window ...
 
 To me, these clearly indicates that the debugger should be run as a
 subsystem of some other tool that takes care of this. It is not part
 of the debugger itself. For example, take a look at how emacs runs the
 debugger.


I'm confused...are you suggesting that the debugger should no
longer be integrated into perl?  If so, I disagree...I absolutely insist
that, no matter what pathological distribution someone may put together
for Perl, I will get the debugger so at least I have a chance of figuring
out what's wrong. The only way to absolutely ensure this is to have it
built into the interpreter itself.

As to the history file...this is something that I've wanted since
I first touched the debugger, and I suspect others would like it as well.
As to the pager...the "default pager" is currently "no pager", which is
silly.  I distinctly remember having the following thought pattern, from
back when I was learning how to use the debugger:


"Ok, what commands are available?  Let's type 'h' "

"Hmm...it scrolls off the screen...aha! look, down here at the
bottom it says I can do '|dbcmd' to pipe the output of a debugger command
through the current pager!  Excellent, that's just what I need."

" '|h' " [it runs off the screen again]

"DOH!"


All humor aside, there is too much information in the debugger
help screen to fit in 50 lines.  That means that anyone trying to use the
debugger through a DOS window, or a fixed-size telnet client, can't see
the majority of the information.

Dave

Re: RFC 290 (v1) Remove -X

2000-09-26 Thread Adam Turoff


On Tue, Sep 26, 2000 at 02:13:41PM -0400, Uri Guttman wrote:
 
 and if the file test names are only loaded via a pragma it should be
 ok. it is not clear to me that you want that. 

It's not clear that I want that either.

This is probably a plea for a subset of 'use english;', possibly
'use english "filetests";'

But I wouldn't want that pragma to override any other aspect of the
core library, such as async I/O.

 also i think a common
 prefix idea is still valid as it will make it more apparent that these
 are from the file test family. 

That's a stone's throw away from:

import english
from english import filetest

result = filetest.readable("/dev/null")

I think the common prefix idea is a nonstarter.  There must be a way
to come up with sensible names for all of -X that don't conflict
with the core library.  Besides, AIO has a requirement on 
'sub Foo::readable()',  which means that main::readable is still 
accessible and doesn't conflict.  No, that's not desirable, but AIO
behavior looks more malleable to me.

 i have not seen an attempt to name all of
 the -X ops yet. 

v2 fast approaching.  ;-)

Z.

Re: perl6storm #0050

2000-09-26 Thread Simon Cozens


On Tue, Sep 26, 2000 at 02:06:47PM -0400, John Porter wrote:
  Since when do parentheses make things less readable?
 
 Can you say "lisp"?

"lisp".

(defun Schwartzian (func list)
  (mapcar
   (lambda (x) (car x))
   (sort
    (mapcar
     (lambda (x) (cons x (funcall func x)))
     list
     )
    (lambda (x y) (< (cdr x) (cdr y)))
    )
   )
  )

Maybe you'd prefer this:

defun Schwartzian func list mapcar lambda x car x sort mapcar 
lambda x cons x funcall func x list lambda x y < cdr x cdr y
 
I know which I'd rather read.
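
For comparison, a minimal sketch of the same transform in Perl 5, where most of the grouping is done by braces and brackets instead. The sub name simply mirrors the Lisp one.

```perl
use strict;
use warnings;

# Sort @list by the numeric value of $func->($item), computing it once per item.
sub schwartzian {
    my ($func, @list) = @_;
    return map  { $_->[0] }
           sort { $a->[1] <=> $b->[1] }
           map  { [ $_, $func->($_) ] } @list;
}

print join(' ', schwartzian(sub { length $_[0] }, qw(ccc a bb))), "\n";
```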
   

-- 
God gave man two ears and one tongue so that we listen twice as much as
we speak.
-- Arab proverb

Re: RFC 290 (v1) Remove -X

2000-09-26 Thread Nathan Wiger


Adam Turoff wrote:

 That's a stone's throw away from:
 
 import english
 from english import filetest
 
 result = filetest.readable("/dev/null")
 
 I think the common prefix idea is a nonstarter.  There must be a way
 to come up with sensible names for all of -X that don't conflict
 with the core library. 

I think perhaps that Uri was suggesting more a common letter prefix,
such as:

  freadable($file);
  fwritable($file);
  fexecutable($file);

Than a piece of bastardized Pythonesque syntax. ;-)
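
For illustration only, the letter-prefix idea can be approximated in today's Perl 5 with thin wrappers around the existing -r, -w and -x operators. This is a minimal sketch, not something proposed in the RFC: the sub names are taken from the message above, and returning 1 or 0 is an assumption made here.

```perl
use strict;
use warnings;

# Hypothetical wrappers: each named test delegates to the matching -X operator.
sub freadable   { my ($file) = @_; return (-r $file) ? 1 : 0 }
sub fwritable   { my ($file) = @_; return (-w $file) ? 1 : 0 }
sub fexecutable { my ($file) = @_; return (-x $file) ? 1 : 0 }

# $0 is the name of the running script, which is normally readable.
print "readable\n" if freadable($0);
```

Because these are ordinary subs, they would collide with any other sub of the same name in the same package, which is the namespace concern raised in this thread.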

-Nate

Re: RFC 290 (v1) Remove -X

2000-09-26 Thread John L. Allen




On Tue, 26 Sep 2000, Nathan Wiger wrote:

 I think perhaps that Uri was suggesting more a common letter prefix,
 such as:
 
   freadable($file);
   fwritable($file);
   fexecutable($file);
 
 Than a piece of bastardized Pythonesque syntax. ;-)

Was that what the foo.bar("baz") syntax was?  I thought that was in another 
RFC I had to hunt down and read :-)  But I think I like this anyway:

  -f.r($file);    # same as -r $file
  -f.rwx($file);  # same as -rwx $file

etc.  Or leave off the -, or even the -f ... oh well, I guess there are 
syntax possibilities ad nauseam, but none very satisfying.

John.

Re: perl6storm #0011: interactive perl mode

2000-09-26 Thread Nicola Meade


Russ, you can use "perl -" to punch/paste into that window.
But "foo | perl" would not be affected as you would not
be running interactively.  Essentially, only if there
are no arguments and stdin (and stdout) are a tty would you
do that.
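
The condition described here could be sketched roughly as follows. The helper name wants_interactive is made up for this example; -t is Perl's existing "is this filehandle a tty" test.

```perl
use strict;
use warnings;

# Hypothetical helper: go interactive only when there are no command-line
# arguments and both stdin and stdout are terminals.
sub wants_interactive {
    my ($arg_count, $stdin_is_tty, $stdout_is_tty) = @_;
    return ($arg_count == 0 && $stdin_is_tty && $stdout_is_tty) ? 1 : 0;
}

my $interactive = wants_interactive(
    scalar @ARGV,
    (-t STDIN)  ? 1 : 0,
    (-t STDOUT) ? 1 : 0,
);
print $interactive ? "interactive\n" : "batch\n";
```

With "foo | perl", stdin is a pipe rather than a tty, so this check reports batch mode, matching the behaviour described above.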

--tom, posting blind

Visit our website at http://www.ubswarburg.com

This message contains confidential information and is intended only 
for the individual named.  If you are not the named addressee you 
should not disseminate, distribute or copy this e-mail.  Please 
notify the sender immediately by e-mail if you have received this 
e-mail by mistake and delete this e-mail from your system.

E-mail transmission cannot be guaranteed to be secure or error-free 
as information could be intercepted, corrupted, lost, destroyed, 
arrive late or incomplete, or contain viruses.  The sender therefore 
does not accept liability for any errors or omissions in the contents 
of this message which arise as a result of e-mail transmission.  If 
verification is required please request a hard-copy version.  This 
message is provided for informational purposes and should not be 
construed as a solicitation or offer to buy or sell any securities or 
related financial instruments.

Re: PERL6STORM - tchrist's brainstorm list for perl6

2000-09-26 Thread Nicola Meade


Yes, while still allowing an explicit A()->B(), of course.
I just meant that A->B means A::->B(), or, if you would, "A"->B().
But A()->B would not change in meaning.
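
For context, here is a small sketch of the Perl 5 behaviour this suggestion would change. The package and sub names are invented for the example. When a sub with the same name as a class is visible, Perl 5 treats the bareword before the arrow as a call to that sub.

```perl
use strict;
use warnings;

package Dog;
sub new { my ($class) = @_; return bless {}, $class }

package Cat;
sub new { my ($class) = @_; return bless {}, $class }

package main;

# A sub that happens to share its name with the class Dog.
sub Dog { return 'Cat' }

my $via_sub   = Dog->new;      # Perl 5 parses this as Dog()->new
my $via_class = Dog::->new;    # a trailing :: forces the class name
my $quoted    = 'Dog'->new;    # a quoted string is always a class name

print join(" ", ref($via_sub), ref($via_class), ref($quoted)), "\n";
```

Under the rule suggested in this thread, the first form would behave like the other two, and only an explicit Dog()->new would call the sub.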

--tom, posting blind(ly)


perl6storm #0010: kill all defaults

2000-09-26 Thread Nicola Meade


Yes, Phil, I mean things like abs() meaning abs($_) and
localtime() meaning localtime(time).
Actually, combined with the paren requirement thingie, it means
localtime(time()), and localtime has to be written localtime().
These are two different suggestions, though.
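
As a rough illustration of the defaults in question, the following compares the implicit Perl 5 forms with the fully spelled-out forms that the two suggestions together would require. The variable names are made up for this sketch.

```perl
use strict;
use warnings;

# Perl 5 today: abs with no argument operates on $_.
my @implicit;
for (-3, 4, -5) {
    push @implicit, abs;               # implicit: abs($_)
}

# With defaults removed, the argument is always written out.
my @explicit = map { abs($_) } (-3, 4, -5);

my @now_implicit = localtime;          # implicit: localtime(time)
my @now_explicit = localtime(time());  # explicit argument and parens

print "@implicit\n";
print "@explicit\n";
```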

This is an attempt at sending from an account I can't get mail from, or
to, or read.
So I can only read this if it makes the public list.



perl6storm #0073

2000-09-26 Thread Nicola Meade


No, not for 

   use 'strict';

That is not a bareword.  Hard to say why (have short time).
Only "$a = fred" is a bareword.  But "require Module" is not,
as it has another meaning, and is accommodated in the grammar.
Likewise, a prototype of sub fn(*) is not a bareword when
you call fn(Whatever).
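
A short sketch of the distinction, as it behaves in Perl 5: a (*) prototype slot accepts a bareword even under strict, while a bareword used as a plain value is rejected. The sub body and the string eval are additions made for this example; the eval is there only to capture the compile-time error.

```perl
use strict;
use warnings;

# The * prototype lets the first argument be written as a bareword.
sub fn(*) {
    my ($name) = @_;
    return "got $name";
}

print fn(Whatever), "\n";

# A bareword used as an ordinary value fails to compile under strict subs.
my $compiled = eval q{ my $x = fred; 1 };
print "rejected: $@" unless $compiled;
```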


Re: RFC 283 (v1) C<tr///> in array context should return a histogram

2000-09-26 Thread Bennett Todd


2000-09-26-05:18:57 Paris Sinclair:
  (%alphabet) = $string =~ tr/a-z//;
 
 also a little more concise (and certainly more efficient...) than
 
   %alphabet = map { $_ => eval "\$string =~ tr/$_//" } (a..z);

However, compared to say

$hist[ord($_)]++ for split //, $string;

the performance edge might not be quite so dramatic. Then again,
maybe it would be, I dunno.
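
To make the comparison concrete, here is a small runnable version of the split-based count, once with an array indexed by ord() as in the one-liner above and once with a hash keyed by the character. The sample string is invented for this sketch.

```perl
use strict;
use warnings;

my $string = "hello world";

# Array indexed by character code.
my @hist;
$hist[ord($_)]++ for split //, $string;

# The same count keyed by the character itself.
my %hist;
$hist{$_}++ for split //, $string;

printf "'%s': %d\n", $_, $hist{$_} for sort keys %hist;
```

Neither form needs eval, and neither says anything about relative speed; that would have to be measured.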

-Bennett


Re: perl6storm #0050

2000-09-26 Thread Robert Mathews


Simon Cozens wrote:
 (defun Schwartzian (func list)
   (mapcar
    (lambda (x) (car x))
    (sort
     (mapcar
      (lambda (x) (cons x (funcall func x)))
      list
      )
     (lambda (x y) (< (cdr x) (cdr y)))
     )
    )
   )
 
 Maybe you'd prefer this:
 
 defun Schwartzian func list mapcar lambda x car x sort mapcar
 lambda x cons x funcall func x list lambda x y < cdr x cdr y
 
 I know which I'd rather read.

Ok, you've proved that lisp doesn't make sense without all those
annoying parentheses.  Congratulations.  Fortunately, perl isn't lisp.

Why does removing the parens force you to jam everything together on two
lines?  The only reason I can think of is that you're using some fancy
autoindenting lisp editor.  Fortunately, perl doesn't force you to use a
special editor just to tame the obvious shortcomings of the language
syntax.

Perl *lets* you include parentheses, or not, whichever makes the code
easier to read.  Yeah, you can write ugly or broken code by leaving too
many out.  So?

-- 
Robert Mathews
Software Engineer
Excite@Home

Re: perl6storm #0050

2000-09-26 Thread John Porter


Simon Cozens wrote:
 
 Maybe you'd prefer this:
 
 defun Schwartzian func list mapcar lambda x car x sort mapcar 
 lambda x cons x funcall func x list lambda x y < cdr x cdr y

What happened to the newlines?

Also, "no parens" is not the only alternative to having parens.
Other punctuation is available.  One of the improvements
ML makes over Lisp is the use of different bracketers to
signify semantically different kinds of lists.


-- 
John Porter

Aus des Weltalls ferne  funken Radiosterne.

Re: RFC 290 (v1) Remove -X

2000-09-26 Thread Uri Guttman


 "AT" == Adam Turoff [EMAIL PROTECTED] writes:

  AT On Tue, Sep 26, 2000 at 02:13:41PM -0400, Uri Guttman wrote:
   
  AT But I wouldn't want that pragma to override any other aspect of the
  AT core library, such as async I/O.

agreed. but we can reconcile the name spaces then. or let larry do
it. :)


  AT That's a stone's throw away from:

  AT   import english
  AT   from english import filetest

  AT   result = filetest.readable("/dev/null")

blecchh!

  AT I think the common prefix idea is a nonstarter.  There must be a way
  AT to come up with sensible names for all of -X that don't conflict
  AT with the core library.  Besides, AIO has a requirement on 
  AT 'sub Foo::readable()',  which means that main::readable is still 
  AT accessible and doesn't conflict.  No, that's not desirable, but AIO
  AT behavior looks more malleable to me.

what about my idea for is_ or has_? actually that could be used for
callbacks as well. we need some semantic distance from the socket/handle
is readable right now (buffers have data) and the file can be read
if/when it is opened (you can test an open handle too IIRC).

   i have not seen an attempt to name all of
   the -X ops yet. 

  AT v2 fast approaching.  ;-)

awaiting with my whip so i can shred your names. :)

uri

-- 
Uri Guttman  -  [EMAIL PROTECTED]  --  http://www.sysarch.com
SYStems ARCHitecture, Software Engineering, Perl, Internet, UNIX Consulting
The Perl Books Page  ---  http://www.sysarch.com/cgi-bin/perl_books
The Best Search Engine on the Net  --  http://www.northernlight.com

Re: RFC 290 (v1) Remove -X

2000-09-26 Thread Uri Guttman


 "NW" == Nathan Wiger [EMAIL PROTECTED] writes:

  NW I think perhaps that Uri was suggesting more a common letter prefix,
  NW such as:

  NW   freadable($file);
  NW   fwritable($file);
  NW   fexecutable($file);

  NW Than a piece of bastardized Pythonesque syntax. ;-)

basically correct. even a - as a prefix will be nicer and closer to the
original -X stuff. but that might conflict with the -rwx ideas (which i
am not fond of anyway).

if ( -readable( $filename ) ) {

not the best. would that be confused with a sub readable and a leading
unary negation? in fact how does perl parse -r now vs - r()?

uri

-- 
Uri Guttman  -  [EMAIL PROTECTED]  --  http://www.sysarch.com
SYStems ARCHitecture, Software Engineering, Perl, Internet, UNIX Consulting
The Perl Books Page  ---  http://www.sysarch.com/cgi-bin/perl_books
The Best Search Engine on the Net  --  http://www.northernlight.com

Re: RFC 290 (v1) Remove -X

2000-09-26 Thread Nathan Wiger


Uri Guttman wrote:
 
 not the best. would that be confused with a sub readable and a leading
 unary negation? in fact how does perl parse -r now vs - r()?

Yes, it would. Here's how Perl parses these right now:

perl -w -e '
  sub r { local $\; print "r(@_) : "; }
  $\ = "\n";
  print "-r" if -r "/etc/motd";
  print "-r()" if -r("/etc/motd");
  print "- r" if - r "/etc/motd";
  print "- r()" if - r("/etc/motd");
'

Prints out:

  -r
  -r()
  r(/etc/motd) : - r
  r(/etc/motd) : - r()

Basically, the -r filetest always wins. Injecting a space makes the r
sub always win, as would be expected.

-Nate

Re: RFC 244 (v1) Method calls should not suffer from the action on a distance

2000-09-26 Thread Ilya Zakharevich


On Mon, Sep 25, 2000 at 09:10:49PM -0700, Nathan Wiger wrote:
    if ( want->{count} > 2 ) { return $one, $two }
 
 Will that be interpreted as:
 
    'want'->{count}
    want()->{count}
 
 To be consistent, it should mean the first one. That is, the infix
 operator -> should always autoquote the bareword to the left. Am I
 correct in assuming that's what you meant?

Yes.  Use want()->{count} instead.  Or, better, 

  use want;
  wantCount;

Ilya

Re: perl6storm #0050

2000-09-26 Thread Simon Cozens


On Tue, Sep 26, 2000 at 12:43:07PM -0700, Robert Mathews wrote:
 Ok, you've proved that lisp doesn't make sense without all those
 annoying parentheses.  Congratulations.  Fortunately, perl isn't lisp.

Correct, John bringing lisp into the discussion *was* a canard.

-- 
Writing software is more fun than working.

Re: RFC 283 (v1) C<tr///> in array context should return a histogram

2000-09-26 Thread Paris Sinclair


On Tue, 26 Sep 2000, Bennett Todd wrote:

 2000-09-26-05:18:57 Paris Sinclair:
   (%alphabet) = $string =~ tr/a-z//;
  
  also a little more concise (and certainly more efficient...) than
  
  %alphabet = map { $_ => eval "\$string =~ tr/$_//" } (a..z);
 
 However, compared to say
 
   $hist[ord($_)]++ for split //, $string;
 
 the performance edge might not be quite so dramatic. Then again,
 maybe it would be, I dunno.

But would that technique work with Unicode? What if I am just counting some 
Bulgarian characters? Most encodings put these in the extended ASCII
range. Making an array of 250 items for a count of 5 items isn't going to
be more efficient. Also, it requires jumping through more hoops, and doing
more conversions, to figure out which index is which letter. A table could
be built, but if it maps to an array index, based on ord(), then I
couldn't support both KOI-8 and windows cyrillic encodings in the same
@hist structure. Using a hash, the only limits are the more general
language supports in Perl, and I can still convert and store KOI8 and
cp1251, and store the results without needing to know which coding it
originated in; only needing to have a symbol for the character.
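
A minimal sketch of that idea follows. It relies on the Encode module, which ships with later Perl 5 releases and is not something discussed in this thread, so treat it as an assumption. The count_chars helper and the sample bytes are invented for the example.

```perl
use strict;
use warnings;
use Encode qw(decode);

# Hypothetical helper: decode bytes from a named encoding and count the
# resulting characters in one shared hash, keyed by the character itself.
sub count_chars {
    my ($hist, $encoding, $bytes) = @_;
    $hist->{$_}++ for split //, decode($encoding, $bytes);
    return;
}

my %hist;
count_chars(\%hist, 'koi8-r', "\xC1\xC2\xC1");  # Cyrillic a, be, a in KOI8-R
count_chars(\%hist, 'cp1251', "\xE0\xE1");      # Cyrillic a, be in Windows-1251

printf "U+%04X: %d\n", ord($_), $hist{$_} for sort keys %hist;
```

Because the keys are characters rather than byte values, counts from both encodings land in the same buckets, which an array indexed by raw byte value could not do.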

There seem to be lots of beneficial side effects of extending context,
that allow for general solutions that are much more powerful than any of
the specific solutions.

Paris Sinclair|4a75737420416e6f74686572
[EMAIL PROTECTED]|205065726c204861636b6572
www.sinclairinternetwork.com

Re: RFC 283 (v1) C<tr///> in array context should return a histogram

2000-09-26 Thread Bennett Todd


2000-09-26-20:29:22 Paris Sinclair:
 On Tue, 26 Sep 2000, Bennett Todd wrote:
  $hist[ord($_)]++ for split //, $string;
 
 But would that technique work with Unicode?

Beats me. I've never tried programming against Unicode; as I don't
speak any language other than English, I don't expect I will do so in
the future either. I expect the answer to your question depends
partly on details of the encoding, and partly on the implementations
of split and ord in a unicode-infested world.

Could be someone would try and feed some kanji through it or
something and produce a sparse array a trillion bytes long, for all
I know. If you're worried about a scary sparse alphabet, switch the
[] to {} and use a hash:-).

 What if I am just counting some Bulgarian characters? Most
 encodings put these in the extended ascii range. Making an array
 of 250 items for a count of 5 items isn't going to be more
 efficient.

I'd expect it would; an array of 250 items is teensy.

 Also, it requires jumping through more hoops, and doing
 more conversions, to figure out which index is which letter.

Yup, I'm a sick little monkey who truly doesn't care about anything
other than US-ASCII, and doesn't mind the mildly extended encodings
like ISO 8859-1 because they include ASCII as their 7-bit subset; if
I get a text file and it's not in ASCII I can't read it anyway, so I
toss it.

 A table could be built, but if it maps to an array index, based on
 ord(), then I couldn't support both KOI-8 and windows cyrillic
 encodings in the same @hist structure.

If you're gonna have both KOI-8 and windows cyrillic encodings in
the same single string being passed to split, I am really really
glad I don't share your problems. I'll stand way, way back, thanks.

If you're getting from different sources, you could map them as you
consolidate them. But I think most folks would go for a single
common encoding before they even began examining the contents.

 Using a hash, the only limits are the more general language
 supports in Perl, and I can still convert and store KOI8 and
 cp1251, and store the results without needing to know which coding
 it originated in; only needing to have a symbol for the character.

If the purpose for including histogram-generation as a builtin to
perl, as a context-triggered side-effect of tr///, is to support
i18n, let's do please make that very explicit in the RFC. If we
don't, the requirements to make that work might not get thought
completely through and the desired i18n might not actually work. Oh,
and if the implementation is going to have to do all the right
brilliant stuff for i18n in the face of every conceivable encoding,
I expect it's not gonna be faster than the hash-based equivalent
construct:

$hist{ord($_)}++ for split //, $string;

which only requires that split// and ord do something appropriately
consistent across encodings.

But when people claim i18n benefits for things I tend to just go
away to my corner and get quiet; since I don't plan on writing
multilingual code or working with multilingual data, I don't feel
qualified to hold an opinion.

-Bennett


Re: RFC 283 (v1) C<tr///> in array context should return a histogram

2000-09-26 Thread Paris Sinclair


On Tue, 26 Sep 2000, Bennett Todd wrote:

 Yup, I'm a sick little monkey who truly doesn't care about anything
 other than US-ASCII

Please keep your fetishes and/or geocentricism to yourself. There is no
need to propose that others should share them. If Perl is going to exist
into the future, if Perl is going to be a great programming language for
Humans, then it needs to support the different ways that Humans
communicate.

It's doing a better job at it all the time. Extending the context of
C<tr///> is an excellent general solution to many problems, in many
languages. While it has been suggested that C<tr///> isn't for
counting... well, the p5 manual says it IS for counting, among other
things. If it is a general language tool that makes counting easy as a
side effect, this is wonderful. And if making it a more general tool by
extending its context makes it even better for counting, who does this
hurt? There are certainly those of us it would help.

And yes, a list of 250 items to store 5 items is HUGE. There is no way to
know how many items I will have. O(N*50) is never going to make me
happy. Which is why right now I would have to use a funky C<map>
and C<eval>. Or a map and a match and an index, but that's a lot of
frivolous temp variables.

Paris Sinclair|4a75737420416e6f74686572
[EMAIL PROTECTED]|205065726c204861636b6572
www.sinclairinternetwork.com

Re: RFC 283 (v1) C<tr///> in array context should return a histogram

2000-09-26 Thread Bennett Todd


2000-09-26-21:11:53 Paris Sinclair:
 Please keep your fetishes and/or geocentricism to yourself.

They get all ingrown and infested if I don't take 'em out and
air 'em out occasionally:-).

 There is no need to propose that others should share them.

No indeedy! I'm not opposed to i18n support in perl, or anywhere
else. In much the same way that I'm not opposed to various other
anti-discrimination measures even though I'm not in any of the
discriminated-against populations they aim to aid.

 If Perl is going to exist into the future, if Perl is going to be
 a great programming language for Humans, then it needs to support
 the different ways that Humans communicate.

That sounds positively noble when you put it that way. I can
actually hear choirs of cherubim providing atmosphere.

 Extending the context of C<tr///> is an excellent general
 solution to many problems, in many languages.

That's an interesting claim, and for all I know it could be true. If
folks believe it, and think it's a justification for the proposed
behavior in tr///, let's get this claim made nice and explicit in
the RFC, is all I'm saying.

 And yes, a list of 250 items to store 5 items is HUGE. There is no way to
 know how many items I will have.

Yup, but as long as you're working with 8-bit encodings the array
will never get bigger than 256.

 O(N*50) is never going to make me happy.

O(1) should make you happy. It's got a small fixed upper bound.
Unless, of course, split// and ord get interesting in the face of
UTF-32 or something and the data is no longer bounded, in which case
(as I said) your only hope is to change the [] to {}, at which point
it's probably as fast as the hyper-sexy hash-building-tr///.

 Which is why right now I would have to use a funky C<map>
 and C<eval>.

You've lost me. If you want a hash, what's wrong with:

$hist{$_}++ for split //, $string;

What's all this about eval?

-Bennett


Re: RFC 283 (v1) C<tr///> in array context should return a histogram

2000-09-26 Thread Paris Sinclair


On Tue, 26 Sep 2000, Bennett Todd wrote:
 That sounds positively noble when you put it that way. I can
 actually hear choirs of cherubim providing atmosphere.

I heard them also, but I thought it was the radio.

  And yes, a list of 250 items to store 5 items is HUGE. There is no way to
  know how many items I will have.
 
 Yup, but as long as you're working with 8-bit encodings the array
 will never get bigger than 256.

Who says I'm working with 8-bit encodings?!
Perl5 already has rudimentary support for multibyte encodings. So far I
haven't used them, but this is only because I'm dealing with my multibyte
input as binary data, and just passing it along. Presumably I will want
to make some sanity checks once I learn enough to know what I'm checking
for.

When the Martians come out of hiding, we're going to have to add 13bit
fonts, so maybe we should keep our arbitrary character restrictions in the
core, in just one place, to make it easier to accomodate this inevitable
circumstance. If we make everything else general enough, we will be able
to meet their demands quicker, saving the world and bringing a new age of
prosperity to humankind.
 
  O(N*50) is never going to make me happy.
 
 O(1) should make you happy. It's got a small fixed upper bound.
 Unless, of course, split// and ord get interesting in the face of
 UTF-32 or something and the data is no longer bounded, in which case
 (as I said) your only hope is to change the [] to {}, at which point
 it's probably as fast as the hyper-sexy hash-building-tr///.

A "small" fixed upper bound? It is N that is bounded, that doesn't stop it
from using N*50 variables to represent N, or N*150 variables if I'm only
matching vs 2 characters. Perhaps instead of using O() I should have just
said, "it is 0 to 150 times slower." The overall algorithm, that is; I am
assuming that this list is going to be iterated over. Making this monster
list would add inefficiencies to each step in the algorithm. In any case,
that solution doesn't seem to work, because of its reliance on an
arbitrary set of conditions that are smaller than the conditions in the
problem domain. What's the upper bound in a 16bit language? Or does that
case just have to break? "Sorry, you're not European. Please be
assimilated before using this tool. Resistance is futile."

 What's all this about eval?

That was in reference to my previous map example, which is the best
general way I've seen proposed to handle the specified counting in p5.
Ugly as it is, there is hopefully a better way, but not one that is
obvious (to me). But given the changes proposed in RFC 283, it would not
only be easier, it would be more efficient, and fully compatible with
whatever character encodings Perl supports, now and into the future.

Paris Sinclair|4a75737420416e6f74686572
[EMAIL PROTECTED]|205065726c204861636b6572
www.sinclairinternetwork.com

Re: RFC 283 (v1) C<tr///> in array context should return a histogram

2000-09-26 Thread Bennett Todd


2000-09-26-21:56:04 Paris Sinclair:
 A "small" fixed upper bound? It is N that is bounded, that doesn't
 stop it from using N*50 variables to represent N, or N*150
 variables if I'm only matching vs 2 characters.

In big-O notation, the N is the size of the problem; in this case,
it could be the size of the alphabet, or the length of the string.
The rest of big-O notation describes the order of growth, ignoring
constant factors, of the resources consumed by the solution as a
function of the size of the input.

I was coding for reasonable-sized alphabets, so I used a simple
linear array; as I pointed out, if you have an alphabet of
preposterous size, where it's impractical to e.g. build a single
copy of it in memory, and any document's characters from that
alphabet are sparse, then use a hash.

 Perhaps instead of using O() I should have just said, "it is 0 to
 150 times slower."

You're either going to build a dense array, which you can directly
subscript with some kind of ordinal value of the "letters", or
you're going to build a sparse array holding the chr => count map, a
hash. You can do either one inside the implementation of tr///. You
can do either one outside. There may be a significant performance
difference, there may not, that's not obvious to me.

 The overall algorithm, that is, I am assuming that this list is
 going to be iterated over.

You mean using something like 0..$#hist or keys(%hist)? Yup, I'd
expect that too. If the alphabet is reasonable sized, 0..$#hist is
so teensy it's free. If it's monster big, then keys(%hist) will be
scaling with the size of your input. Whether you do this with a loop
over split// incrementing array/hash buckets or whether it's done by
tr///, the O of the algorithms and their data structures are the
same.
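
The size difference being argued over can be seen in a small sketch, using a made-up four-character input. The dense array grows to the highest character code seen plus one, while the hash holds one entry per distinct character.

```perl
use strict;
use warnings;

my $input = "abba";

# Dense: one slot per character code up to the highest one seen.
my @dense;
$dense[ord($_)]++ for split //, $input;

# Sparse: one entry per distinct character.
my %sparse;
$sparse{$_}++ for split //, $input;

my $dense_slots  = scalar @dense;          # highest code seen, plus one
my $sparse_slots = scalar keys %sparse;    # number of distinct characters

print "dense slots: $dense_slots, sparse entries: $sparse_slots\n";
```

Both loops visit each input character once, so the two approaches differ in how many slots a later iteration has to walk, not in how the counting pass scales with the length of the input.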

 What's the upper bound in a 16bit language? Or does that case just
 have to break? "Sorry, you're not European. Please be assimilated
 before using this tool. Resistance is futile."

Lordie lordie lordie, you're one of the persecuted minority, and
a brand-waving rioter too. I've clearly stepped on a corn, not to
mention picked the wrong person to persecute. I'll go speak english
to other bigots who only speak english, and leave the future of the
civilized universe in your responsible hands.

Only kidding, I'm not about to let you off the hook that easy.

If you've got a variable-length coding where the values produced by
split // are themselves strings, and ord() returns bignums, then
you need to

$hist{$_}++ for split //, $input;

and if you don't care to collect all the data, you're only
interested in xyz, then you can do various tricks; if someone forced
me to work in such a distressing environment, I might express it
something like

$hist{$_} = 0 for split //, "xyz";
for (split //, $input) { $hist{$_}++ if exists $hist{$_} }

or then again I might

for ("$input") {
    tr/xyz//cd; $hist{$_}++ for split //;
}

I dunno. If you expect some tr invocation to do something
reasonable and appropriate with weird multibyte encodings of
gigantic and sparse alphabets, do let us get this stated explicitly
in the RFC; otherwise it's not a reasonable argument for trying to
add this feature.
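
One way to package the restricted count as a reusable Perl 5 sub is sketched below. The name char_histogram and its interface are invented here, and it is only a guess at what the proposed list-context tr/xyz// might return.

```perl
use strict;
use warnings;

# Hypothetical helper: count only the characters listed in $wanted.
sub char_histogram {
    my ($input, $wanted) = @_;
    my %hist = map { $_ => 0 } split //, $wanted;
    for my $char (split //, $input) {
        $hist{$char}++ if exists $hist{$char};
    }
    return %hist;
}

my %hist = char_histogram("xyzzy plugh", "xyz");
print "$_ => $hist{$_}\n" for sort keys %hist;
```

Characters that never occur still get a zero entry, which is a design choice of this sketch rather than anything the RFC specifies.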

  What's all this about eval?
 
 That was in reference to my previous map example, which is the best
 general way I've seen proposed to handle the specified counting in p5.

Wow, I'd skipped that, now that you force me to review it, I see
why. Maybe it looks better when interpreted as UTF-32.

Perhaps neither of my previous constructs were as clean as

%hist = $input =~ tr/xyz//;

but then too, it's not awfully obvious just what that does, and it's
not something people need to do every day. If it's really cheap to
toss into the bucket, what the heck, I can't think of anything else
that the syntax would be better for. But I've also yet to hear
anything like a strong case made for it. For sure the ability to gen
up an equivalent expression that uses eval doesn't seem like
appealing grounds.

-Bennett


Re: RFC 283 (v1) C<tr///> in array context should return a histogram

2000-09-26 Thread Paris Sinclair


On Tue, 26 Sep 2000, Bennett Todd wrote:

  What's the upper bound in a 16bit language? Or does that case just
  have to break? "Sorry, you're not European. Please be assimilated
  before using this tool. Resistance is futile."
 
 Lordie lordie lordie, you're one of the persecuted minority, and
 a brand-waving rioter too. I've clearly stepped on a corn, not to
 mention picked the wrong person to persecute. I'll go speak english
 to other bigots who only speak english, and leave the future of the
 civilized universe in your responsible hands.

That's really ridiculous. How do you know if I'm a minority? Mandarin is
the majority language, and it doesn't use 8bits. Not to say I speak
Mandarin. But, if you have to make assumptions about me to disagree with
my points, then it proves that your argument is flawed. And, if my being
or not being a minority is something that would affect the value of my
position, then you are even more dangerous than I had suspected.

As for a rioter, that is funny. I am not rioting, I am giving arguments in
support of an RFC. Am I "rioting" because I disagree with you?

Paris Sinclair|4a75737420416e6f74686572
[EMAIL PROTECTED]|205065726c204861636b6572
www.sinclairinternetwork.com

Re: RFC 283 (v1) C<tr///> in array context should return a histogram

2000-09-26 Thread Paris Sinclair


Could you please start from the assumption that we're all interested in
supporting the full Unicode space to the greatest degree possible?  None
of us are trying to force an ASCII-only alphabet on anyone (although some
of us are interested in keeping ASCII-only operations fast and efficient
since that's most of what we do).

I will start with no assumption. If my claim that what was said wasn't
compatible with Unicode and other encodings is false, pointing that out
would be more constructive than telling me to start making assumptions.

  And, if my being or not being a minority is something that would affect
  the value of my position, then you are even more dangerous than I had
  suspected.
 
 Comments like this don't help the discussion any.

Oh, I see, the problem isn't

   you're one of the persecuted minority

after all.

What a bunch of hogwash.

You don't like my comments? That is fine with me. I am only a user, and
you are something-or-other, and so you have the market cornered on the
right to be offended.

But as soon as a person labels me a minority, and implies that because I
have been labeled such that I am a rioter, and that my opinions are based
upon this label, then your choices are to filter me, or to listen to me
protest.

Yes, my aggressiveness is probably annoying to some people. Just like
passive-aggressive sarcasm is annoying to me. I am sorry that this is
the case.

Anyhow, I will not bother you anymore.

Re: RFC 283 (v1) C<tr///> in array context should return a histogram

2000-09-26 Thread Russ Allbery


Paris Sinclair [EMAIL PROTECTED] writes:

 But as soon as a person labels me a minority, and implies that because I
 have been labeled such that I am a rioter, and that my opinions are
 based upon this label, then your choices are to filter me, or to listen
 to me protest.

Then perhaps you shouldn't have labelled him Euro-centric if you didn't
want a sarcastic response in kind.

I'd just prefer that we discussed the technical issues without this
pointless bickering.  If you were offended, fine; say you were offended
and move on.  I was offended by your implication that people who don't
agree with you are saying that only European scripts matter.  But please
don't escalate the argument as part of being offended.

I'll now stop replying to this thread.  Sorry for sticking my nose in; it
really bugs me when this happens in i18n discussions.

-- 
Russ Allbery ([EMAIL PROTECTED]) http://www.eyrie.org/~eagle/
