[Raku/old-design-docs] 07ac3b: rm broken link
Branch: refs/heads/master Home: https://github.com/Raku/old-design-docs Commit: 07ac3bb2ab509a5143e5f99a0e9c32b823828fae https://github.com/Raku/old-design-docs/commit/07ac3bb2ab509a5143e5f99a0e9c32b823828fae Author: librasteve <40125330+librast...@users.noreply.github.com> Date: 2024-01-18 (Thu, 18 Jan 2024) Changed paths: M README.md Log Message: --- rm broken link Commit: a4c36c683dafdad0bb55996e78b66a1f48fc703c https://github.com/Raku/old-design-docs/commit/a4c36c683dafdad0bb55996e78b66a1f48fc703c Author: librasteve <40125330+librast...@users.noreply.github.com> Date: 2024-01-18 (Thu, 18 Jan 2024) Changed paths: M README.md Log Message: --- Merge pull request #128 from librasteve/master rm broken link Compare: https://github.com/Raku/old-design-docs/compare/63e44c363518...a4c36c683daf
Re: Correct enum incantation?
Thank you Fernando! (still on Rakudo 2020.10 here). On Wed, May 5, 2021 at 10:24 AM Fernando Santagata < nando.santag...@gmail.com> wrote: > On Wed, May 5, 2021 at 6:00 PM William Michels via perl6-users < > perl6-us...@perl.org> wrote: > >> Hello, >> >> I've been reading over an interesting Answer on StackOverflow by wamba: >> >> https://stackoverflow.com/a/67324175/7270649 >> >> I started trying to explore enums on my own and quickly realized that >> method calls on enums are different from simple key/value pairs. For enums, >> calling a `.key` or `.value` or `.kv` method won't work. Instead one must >> use something like `.^enum_values` or `.^enum_value_list`. >> > > Probably you need the new and shiny compiler ;) > > > $*RAKU.compiler.version > v2021.04 > > Month.keys > (oct dec aug jun mar apr feb nov jul may sep jan) > > Month.values > (11 6 9 2 5 7 3 12 8 4 10 1) > > .say for Month.kv > nov > 11 > jul > 7 > sep > 9 > jan > 1 > oct > 10 > mar > 3 > jun > 6 > apr > 4 > aug > 8 > dec > 12 > feb > 2 > may > 5 > -- > Fernando Santagata >
Correct enum incantation?
Hello,

I've been reading over an interesting Answer on StackOverflow by wamba:

https://stackoverflow.com/a/67324175/7270649

I started trying to explore enums on my own and quickly realized that method calls on enums are different from simple key/value pairs. For enums, calling a `.key` or `.value` or `.kv` method won't work. Instead one must use something like `.^enum_values` or `.^enum_value_list`.

Can someone explain how `.^enum_values` or `.^enum_value_list` came to be the **correct** incantation for enums, and why there is no equivalent `.^enum_keys` or `.^enum_key_list` commands?

Thank you, Bill.

> # In the Raku REPL (wamba's enum example 'Month')
> enum Month (jan => 1, |<feb mar apr may jun jul aug sep oct nov dec>);
Map.new((apr => 4, aug => 8, dec => 12, feb => 2, jan => 1, jul => 7, jun => 6, mar => 3, may => 5, nov => 11, oct => 10, sep => 9))
>
> Month.^enum_values
{apr => 4, aug => 8, dec => 12, feb => 2, jan => 1, jul => 7, jun => 6, mar => 3, may => 5, nov => 11, oct => 10, sep => 9}
> Month.^enum_value_list
(jan feb mar apr may jun jul aug sep oct nov dec)
>
> Month.^enum_keys
No such method 'enum_keys' for invocant of type 'Perl6::Metamodel::EnumHOW'
  in block at line 4
> Month.^enum_key_list
No such method 'enum_key_list' for invocant of type 'Perl6::Metamodel::EnumHOW'
  in block at line 1
>
> #Random keyword attempts:
> Month.key
Invocant of method 'key' must be an object instance of type 'Mu', not a type object of type 'Month'. Did you forget a '.new'?
  in block at line 1
> Month>>.key
Invocant of method 'key' must be an object instance of type 'Mu', not a type object of type 'Month'. Did you forget a '.new'?
  in block at line 1
> Month.keys
()
> Month>>.keys
(())
[Raku/old-design-docs] 63e44c: S22: Clarify how system specific values work and c...
Branch: refs/heads/master Home: https://github.com/Raku/old-design-docs Commit: 63e44c36351887f1eb76500d7102f0db44848d27 https://github.com/Raku/old-design-docs/commit/63e44c36351887f1eb76500d7102f0db44848d27 Author: niner Date: 2020-10-01 (Thu, 01 Oct 2020) Changed paths: M S22-package-format.pod Log Message: --- S22: Clarify how system specific values work and correct the list of variables
[Raku/old-design-docs] b13e78: Update dependency specifications in S22
Branch: refs/heads/master Home: https://github.com/Raku/old-design-docs Commit: b13e78fe5b9dc10bfdacb0122ea40a77b6037ac9 https://github.com/Raku/old-design-docs/commit/b13e78fe5b9dc10bfdacb0122ea40a77b6037ac9 Author: Stefan Seifert Date: 2020-09-30 (Wed, 30 Sep 2020) Changed paths: M S22-package-format.pod Log Message: --- Update dependency specifications in S22 An attempt at implementation has shown that using lists for dependency alternatives causes confusion and is not self-documenting enough as the depends section itself consists of a list. Using an object with an any key is more clear and lends itself to future expansion. Collapsing of system specific values has always been meant to also cover dependencies. This is now stated explicitly and the details on what's available described. The build-depends and test-depends keys are deprecated. They should not have been part of the v1 META spec. Leaving them in was just an oversight.
Fwd: unflattering flat
Migrating this question over from perl6-users : Can someone explain why in the second and third REPL code lines below, line 2 returns a List while line 3 returns a Seq? And is there a general rule to remember which code returns which data structure? 1> my %hash-with-arrays = a => [1,2], b => [2,3]; {a => [1 2], b => [2 3]} > 2> %hash-with-arrays.values>>.map({$_}) #makes a List ((1 2) (2 3)) > 3> %hash-with-arrays.values.map({.flat}) #makes a Seq ((1 2) (2 3)) > Thank you in advance, Bill. -- Forwarded message - Date: Wed, Apr 22, 2020 at 8:58 AM Subject: Re: unflattering flat To: perl6-users Regarding a recent discussion on flattening hash values. Here are some posted answers: >#Konrad Bucheli > my %hash-with-arrays = a => [1,2], b => [2,3]; {a => [1 2], b => [2 3]} > %hash-with-arrays.values>>.map({$_}) #makes a List ((1 2) (2 3)) > %hash-with-arrays.values>>.map({$_}).flat (2 3 1 2) >#ElizabethMattijsen > %hash-with-arrays.values.map( { $_.Slip } ) (1 2 2 3) > say %hash-with-arrays.values.map: |* (1 2 2 3) >#Larry Wall > %hash-with-arrays.values»[].flat (2 3 1 2) > say gather %hash-with-arrays.values.deepmap: { .take } (1 2 2 3) I played with Raku/Perl6 a bit and found two more constructs on my own. The first one (A) with {.self} flattens the hash values as intended. The second one (B) produces a "Sequence of sequences", which I don't think I've seen before: >#Me A> %hash-with-arrays.values.map({.self}).flat (1 2 2 3) > B> %hash-with-arrays.values.map({.flat}) #makes a Seq ((1 2) (2 3)) > dd(%hash-with-arrays.values.map({.flat})) ((1, 2).Seq, (2, 3).Seq).Seq Nil > %hash-with-arrays.values.map({.flat}).WHAT (Seq) > Question: why not a "List of sequences" instead? Any precedent in Perl5? What is a "Sequence of sequences" useful for (above and beyond a "List of sequences")? How can I predict a priori whether a particular line of code will return a Seq or a List? 
In fact, in comparison to the List generated from Konrad's code, the auto-printed REPL ".gists" are identical between the .Seq and List objects above, and it's only upon calling ".elems" that a user sees that one returns 4 elements while the other returns 2 elements (which could be confusing). Thx, Bill. On Mon, Apr 6, 2020 at 6:20 PM Brad Gilbert wrote: > > [*] is also a meta prefix op > > say [*] 4, 3, 2; # 24 > > But it also looks exactly the same as the [*] postfix combination of operators > > my @a = 1,2,3; > > say @a[*]; # (1 2 3) > > There is supposed to be one that looks like [**] > > my @b = [1,], [1,2], [1,2,3]; > > say @b[**]; # (1 1 2 1 2 3) > > Really @a[*] is > > say postfix:« [ ] »( @a, Whatever ) > > And @b[**] is > > say postfix:« [ ] »( @b, HyperWhatever ) > > --- > > say *.WHAT.^name; # Whatever > > say **.WHAT.^name; # HyperWhatever > > On Mon, Apr 6, 2020 at 7:05 PM yary wrote: >> >> Question- what am I missing from the below two replies? >> >> Larry's answer came through my browser with munged Unicode, it looks like >> this >> >> >> - with the Chinese character for "garlic" after the word "values" >> >> Then Ralph says "[**] will be a wonderful thing when it's implemented" but >> as far as I can tell, [**] is exponentiation (math) as a hyper-op, nothing >> to do with flattening. From https://docs.raku.org/language/operators >> >> say [**] 4, 3, 2; # 4**3**2 = 4**(3**2) = 262144 >> >> >>
Re: Announce: french perl workshop (Aka Journées Perl)
Hello Mark, I was thinking about submitting a talk on the P6 Object System, but I could also change it to Grammars (or do both). Cheers, Laurent. Le mar. 21 mai 2019 à 10:30, Marc Chantreux a écrit : > hello perl6 people, > > we hope there will be some events around the French Perl Worshop > (aka journées perl) > > https://journeesperl.fr/jp2019/ > > there will be at least a "perl6 modules hackathon" (trying to contribute > to the perl6 ecosystem). however i really would like to see a talk or a > workshop about perl6. some ideas of topics that can be appreciated by > the audience as well as trivial for some of you: > > * perl6 module path (and bytecoded version) > * getting started with > * zef and mi6 > * Cro > * NativeCall > * Grammars > > so we'll be pleased to see you all and if you think you can give a talk: > don't be shy and show us your perl6 :) > > regards > marc >
Re: [PATCH] multiple heredoc beginning in the same line
it's not so difficult, really. you just gotta know the trick: - cd into the repository you want, then run "git am". - Use your mail client's "view source" function (ctrl-u in thunderbird for example). - copy the complete mail source including headers - paste it into the terminal that's running git am - hit return if necessary, and ctrl-d done. I pushed the patch to our repository. Thanks for your work, francois! - Timo
enhanced open-funktion
Hello,

I have a wish for Perl 6. I would like the open function to be able to open a file only if it does not already exist. Of course, I can first test whether the file exists:

    if (-e $filename) {
        print "file already exists!";
    }
    else {
        open (FH, ">$filename")
    }

My suggestion is to have a character among the open-mode characters that indicates that the file should not already exist. For example:

    open (FH, "?$filename") || die "can't open $filename: $!";

Gerd
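For comparison, Perl 5 can already express "create only if absent" through sysopen and the O_EXCL flag. The following is only a sketch of that existing facility, not the proposed open-mode character; the helper name open_new is made up for illustration.

```perl
use strict;
use warnings;
use Fcntl qw(O_WRONLY O_CREAT O_EXCL);

# Hypothetical helper: create $filename for writing, but fail if it
# already exists. O_CREAT | O_EXCL makes the existence check and the
# creation one atomic step, unlike a separate -e test followed by open.
sub open_new {
    my ($filename) = @_;
    sysopen(my $fh, $filename, O_WRONLY | O_CREAT | O_EXCL)
        or return;    # $! holds the reason for the failure
    return $fh;
}
```

A caller would write something like `my $fh = open_new($filename) or die "can't create $filename: $!";`.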
more than one modifier
Hello, I have a wish for Perl6. I think it would be nice to have the possibility for more than one modifier after a simple statement. For example: print $a+$b if $a if $b for 1..3; Gerd Pokorra E-Mail: [EMAIL PROTECTED]
RFC 207 (v3) Arrays: Efficient Array Loops
ymous looping indices.

|@ and |@foo are allowed only within array indices. Because |@ and |@foo are of variable length, it must be possible to determine from the expressions how large they are, knowing the shape of the arrays. Practically, this means that |@ can be used only once in an index, because it is impossible to tell which dimension C<|i> is in an expression like C<$a[|@;|i;|@]>.

An index may use multiple named index groups, as long as other parts of the statement provide enough context to provide the bounds of the index groups. As yet another way to take a tensor product:

  $product[|@a;|@b] = $factor1[|@a] * $factor2[|@b];

demonstrates this. Because the shape of @factor1 and @factor2 determined the number and bounds of the looping index groups |@a and |@b, both can be used within the index for @product.

=head3 Looping indices outside of array indices.

Looping indices aren't restricted to being used solely as array indices, as the "unriffle" example showed. But each looping index has to be used in an array index for at least one array.

  # find $nth triangular number
  my $triangle = 0;
  $triangle += |i=(0..$n); # compile-time error: |i not used as index

  # Fill a multiplication table
  my @multtable : shape(12,12);
  $multtable[|i;|j] = |i*|j; # OK

=head Lazy Evaluation

Assuming that lazy evaluation is used in other parts of Perl6, it would be nice if these loops could also be evaluated lazily. In list context, this could be done by creating an anonymous function to evaluate the looped expression at the desired indices:

  $a[|i]*$b[|j] # in list context
  # becomes
  sub { my ($i,$j) = @_; $a[$i]*$b[$j]; }

This anonymous function can be TIEd to the resulting anonymous array, so all array lookups would invoke this function. Since TIEing is supposed to be improved in Perl6, this would be a reasonable way to do it. If other lazy evaluation mechanisms work in Perl6, they could be used instead. I am uncertain if lazy evaluation makes sense in void context.

=head2 Examples:

  $t[[|i,|j]] = $a[[|j,|i]]; # transpose 2-d @a

would be equivalent to:

  {
    my $i;
    my $j;
    for $i (0..) { # last if out-of-bounds
      for $j (0..) { # last if out-of-bounds
        $t[[$i,$j]] = $a[[$j,$i]];
      }
    }
  }

This notation also allows (as a specific use) an alternative notation to the RFC 82 element-wise syntax.

  #compute pairwise sum, pairwise product, pairwise difference...
  @sum = @a[[|i,|j,|k,|l]] + @b[[|i;|j;|k;|l]]; # RFC82: @sum = @a + @b
  @prod= @a[[|i,|j,|k,|l]] * @b[[|i;|j;|k;|l]]; #@prod = @a * @b
  @diff= @a[[|i,|j,|k,|l]] - @b[[|i;|j;|k;|l]]; #@diff = @a - @b

RFC 82 syntax is simpler, but this is perl, so There Is More Than One Way To Do It, as tensor multiplication demonstrates.

Note that if the "Lazy Evaluation" schema mentioned above is adopted, then these sums, products, and differences could be automagically lazy as well.

=head1 IMPLEMENTATION

The simplest implementation would be to convert at compile-time (or parse time) void-context looped iterator scopes to loops analogous to the above examples, and convert list-context looped iterator scopes to valued do-blocks or invoked anonymous subroutines:

  $dotproduct = reduce {^_+^_},0,$a[|i]*$b[|i];
  # would be transformed into
  $dotproduct = reduce {^_+^_},0, sub {
    my $i;
    my @r;
    for $i (0..min($#a,$#b)) {
      $r[$i] = $a[$i] * $b[$i];
    }
    return @r;
  }->();

A more sophisticated, preferred, implementation would take advantage of the static, known nature of the data to create a highly optimized version of the loop. Possible optimizations include: common sub-expression elimination, encoding internally to some non-interpreted looping construct, etc.

If special 'numeric functions' are provided in Perl, then expressions with just unoverloaded operators and numeric functions could be optimised into tight compiled loops, as occurs for example with fromfunction() and ufuncs in Numeric Python:

  http://starship.python.net/~da/numtut/array.html#SEC8
  http://starship.python.net/~da/numtut/array.html#SEC13

For lazy evaluation, the value of the expression at any given set of indices is easy to calculate. However the lazy evaluation mechanism works, it can use this property to calculate the appropriate values.

=head1 REFERENCES

RFC 203: Notation for declaring and creating arrays

RFC 204: Notation for indexing arrays with an LOL as an index

RFC 205: New operator ';' for creating array slices
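The loop expansions described under Examples and IMPLEMENTATION can be written out in today's Perl 5. This is only a sketch of what the proposed |i notation would stand for, assuming rectangular data held as arrays of array references; the sub names transpose and dot_product are illustrative and do not come from the RFC.

```perl
use strict;
use warnings;
use List::Util qw(min);

# Transpose a rectangular 2-D array (an array of array refs):
# $t->[$i][$j] = $m->[$j][$i], with both loops written out by hand.
sub transpose {
    my ($m) = @_;
    my @t;
    for my $j (0 .. $#$m) {
        for my $i (0 .. $#{ $m->[$j] }) {
            $t[$i][$j] = $m->[$j][$i];
        }
    }
    return \@t;
}

# Dot product over the index range that both arrays share.
sub dot_product {
    my ($x, $y) = @_;
    my $sum = 0;
    $sum += $x->[$_] * $y->[$_] for 0 .. min($#$x, $#$y);
    return $sum;
}
```

The explicit index bounds are what the RFC wants the compiler to infer from the array shapes.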
RFC 1 (v1) Improvement needed in error messages (both internal errors and die function).
This and other RFCs are available on the web at
  http://dev.perl.org/rfc/

=head1 TITLE

Improvement needed in error messages (both internal errors and die function).

=head1 VERSION

  Maintainer: S. A. Janet [EMAIL PROTECTED]
  Date: 30 Sep 2000
  Mailing List: [EMAIL PROTECTED]
  Number: 1
  Version: 1
  Status: Developing

=head1 ABSTRACT

Error messages should contain the word `ERROR' in them. Messages due to internal errors should contain `INTERNAL ERROR'. Error messages should also contain the program name, or if that is not known, the word `perl'.

=head1 DESCRIPTION

Internal error messages should be improved. For example,

  % perl -le 'print "PRIME" if (1 x shift) !~ /^(11+)\1+$/' 373403020102920303
  Out of memory!

is a poor message. This is better: "perl: FATAL ERROR: out of memory".

=head1 IMPLEMENTATION

This should require very minor improvements to die and the addition of a function e.g. setprogramname() to register the program name internally:

  % cat foo.pl
  setprogramname( $0 );
  die "filenames expected" if ( $#ARGV < 0 );
  ...
  $ perl foo.pl
  foo.pl: FATAL ERROR: filenames expected

=head1 REFERENCES

None.
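The message shape the RFC asks for can be approximated in Perl 5 without changing die itself. The sketch below is an assumption-laden illustration: setprogramname() does not exist in Perl, so the program name is taken from $0, and the helpers format_fatal and fatal are invented names.

```perl
use strict;
use warnings;
use File::Basename qw(basename);

# Hypothetical helper, not part of the RFC: build a message of the form
# "program: FATAL ERROR: text", falling back to "perl" when the program
# name is not known.
sub format_fatal {
    my ($program, $text) = @_;
    $program = 'perl' unless defined $program && length $program;
    chomp $text;
    return "$program: FATAL ERROR: $text\n";
}

# Die with the formatted message. The trailing newline stops Perl from
# appending "at FILE line N." to it.
sub fatal {
    my ($text) = @_;
    die format_fatal(basename($0), $text);
}
```

In foo.pl from the RFC, `fatal("filenames expected") if $#ARGV < 0;` would then produce a message in the requested style.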
RFC 359 (v1) Improvement needed in error messages (both internal errors and die function).
This and other RFCs are available on the web at
  http://dev.perl.org/rfc/

=head1 TITLE

Improvement needed in error messages (both internal errors and die function).

=head1 VERSION

  Maintainer: S. A. Janet [EMAIL PROTECTED]
  Date: 30 Sep 2000
  Mailing List: [EMAIL PROTECTED]
  Number: 359
  Version: 1
  Status: Developing

=head1 ABSTRACT

Error messages should contain the word `ERROR' in them. Messages due to internal errors should contain `INTERNAL ERROR'. Error messages should also contain the program name, or if that is not known, the word `perl'.

=head1 DESCRIPTION

Internal error messages should be improved. For example,

  % perl -le 'print "PRIME" if (1 x shift) !~ /^(11+)\1+$/' 373403020102920303
  Out of memory!

is a poor message. This is better: "perl: FATAL ERROR: out of memory".

=head1 IMPLEMENTATION

This should require very minor improvements to die and the addition of a function e.g. setprogramname() to register the program name internally:

  % cat foo.pl
  setprogramname( $0 );
  die "filenames expected" if ( $#ARGV < 0 );
  ...
  $ perl foo.pl
  foo.pl: FATAL ERROR: filenames expected

=head1 REFERENCES

None.
RFC 88 (v3) Omnibus Structured Exception/Error Handling Mechanism
This and other RFCs are available on the web at
  http://dev.perl.org/rfc/

=head1 TITLE

Omnibus Structured Exception/Error Handling Mechanism

=head1 VERSION

  Maintainer: Tony Olekshy [EMAIL PROTECTED]
  Date: 8 Aug 2000
  Last Modified: 1 Oct 2000
  Mailing List: [EMAIL PROTECTED]
  Number: 88
  Version: 3
  Status: Frozen

=head1 NOTES

  RFC 88 as HTML    http://www.avrasoft.com/perl6/rfc88.htm
  RFC 88 as Text    http://www.avrasoft.com/perl6/rfc88.txt
  RFC 88 as POD     http://www.avrasoft.com/perl6/rfc88-pod.txt
  Perl 5 Try.pm     http://www.avrasoft.com/perl6/try6-ref5.txt
  Regression Test   http://www.avrasoft.com/perl6/try-tests.htm

=head1 ABSTRACT

"The Encyclopedia of Software Engineering" [ESE-1994] says (p.847):

=over 4

=item Z<>

Inevitably, no matter how carefully a programmer behaves when writing a program and no matter how thoroughly its verification is carried out, errors remain in the program and program execution may result in a failure. [...] The programming language may provide a framework for detecting and then handling faults, so that the program either fails gracefully or continues to work after some remedial action has been taken to recover from the error. Such a linguistic framework is usually called exception handling.

=back

The I<structured> exception handling mechanism described in this RFC satisfies the following requirements.

=over 4

=item 1

It does not, by default, interfere with traditional Perl programming styles. When explicitly used, it simply adds functionality to better support other programming styles, which currently have to undertake some contortions to get the desired effect.

Z<>

=item 1

It is simple in the cases in which it is commonly used.

Z<>

=item 1

It is capable enough to handle the needs of production applications, frameworks, and modules, including the ability to hook into the mechanism itself.

Z<>

=item 1

It is suitable for "error" handling via exceptions.

Z<>

=item 1

It is suitable for light-weight exceptions that may not involve errors at all.

=back

This RFC describes a collection of changes and additions to Perl, which together support a built-in base class for Exception objects, and exception/error handling code like this:

    exception 'Alarm';

    try {
        throw Alarm "a message", tag => "ABC.1234", ... ;
        }

    catch Alarm => { ... }

    catch Error::DB, Error::IO => { ... }

    catch $@ =~ /divide by 0/ => { ... }

    catch { ... }

    finally { ... }

Any exceptions that are raised within an enclosing try, catch, or finally block, where the enclosing block can be located anywhere up the subroutine call stack, are trapped and processed according to the semantics described in this RFC.

The new built-in Exception base class is designed to be used by Perl for raising exceptions for failed operators or functions, but this RFC can be used with the Exception base class whether or not that happens.

Readers who are not familiar with the technique of using exception handling to handle errors should refer to the L<CONVERSION> section of this document first.

It is not the intent of this RFC to interfere with traditional Perl scripts; the intent is only to facilitate the availability of a more controllable, pragmatic, and yet robust mechanism when such is found to be appropriate.

=over 4

=item *

Nothing in this RFC impacts the tradition of simple Perl scripts.

=item *

C<eval {die "Can't foo."}; print $@;> continues to work as before.

=item *

There is no need to use C<try>, C<throw>, C<catch>, or C<finally> at all, if one doesn't want to.

=item *

This RFC does not require core Perl functions to use exceptions for signalling errors.

=back

=head1 DEFINITIONS

B<raise>

=over 4

=item Z<>

An exception is raised to begin call-stack unwinding according to the semantics described herein. This provides for controlled non-local flow-control. This is what C<die> does.

=back

B<propagate>

=over 4

=item Z<>

The passing of an exception up the call stack for further processing is called propagation. Raising an exception starts propagation. Propagation stops when the exception is trapped.

=back

B<unwinding>

=over 4

=item Z<>

The overall process of handling the propagation of an exception, from the point it is raised until the point it is trapped, is called unwinding.

=back

B<trap>

=over 4

=item Z<>

The termination of unwinding, for the purpose of attempting further processing using local flow-control semantics, is called trapping. This is what C<eval> does, as do C<try>, C<catch>, and C<finally>.

=back

B<cleanly caught>

=over 4

=item Z<>

This means the trapping and handling of an exception did not itself raise an exception. Cleanly handling exceptions raised while handling exceptions is difficult, tedious, and error-prone given only C<eval>.

=back

B<exception>

=over 4

=item Z<>

An exception is a collection of information about a particular non-local goto, captured at raise-time, for
RFC 96 (v2) A Base Class for Exception Objects
This and other RFCs are available on the web at
  http://dev.perl.org/rfc/

=head1 TITLE

A Base Class for Exception Objects

=head1 VERSION

  Maintainer: Tony Olekshy [EMAIL PROTECTED]
  Date: 12 Aug 2000
  Last Modified: 1 Oct 2000
  Mailing List: [EMAIL PROTECTED]
  Number: 96
  Version: 2
  Status: Withdrawn

=head1 ABSTRACT

The contents of this RFC are now covered by RFC 88 and RFC 80.

=head1 DESCRIPTION

A number of topics in the Perl 6 RFC discussions have touched on the concept of a sanctioned or canonical base class for exception objects. This RFC considers a basic proposal for a base class for exception objects that can be used with core Perl, traditional eval, die, and $@ functionality, and with exception handling mechanisms like RFC 63 and RFC 88.

One of the attributes of an exception object is its class name. This can be changed via inheritance, and a hierarchy of isa relationships can be arranged. Exception handling mechanisms can arrange to have catch behaviour depend on these relationships.

=head2 Instance Variables

The following exception object instance variables are supported by this base class.

tag

    This is a string which module developers can use to assign a unique "identifier" to each exception object constructor invocation in the module. What should the default be if not otherwise specified?

severity

    This is some sort of "priority" (such as info vs. fatal) on which handling can be based. The details need to be worked out. What should the default be if not otherwise specified?

message

    This is a description of the exception in language intended for the "end user". This is the only required ivar, so as such the constructor treats it specially.

debug

    This is a place for additional description that is not intended for the end user (because it is "too technical" or "sensitive").

file

    The file from which the constructor was called.

line

    The line from which the constructor was called.

object

    If the exception is related to an object, it can be specified here. Should this be a weak reference?

sysmsg

    This is a place for the internal exceptions raised by Perl to record system information, along the lines of $!.

Methods

    sub new
    {
        my ($C, $msg, %A) = @_;
        exists $A{file}   or $A{file}   = caller ...
        exists $A{line}   or $A{line}   = caller ...
        exists $A{sysmsg} or $A{sysmsg} = magic ...
        bless {message => $msg, %A}, ref $C || $C;
    }

    sub tag      { @_ > 1 and $_[0]->{tag}      = $_[1]; $_[0]->{tag}      }
    sub severity { @_ > 1 and $_[0]->{severity} = $_[1]; $_[0]->{severity} }
    sub message  { @_ > 1 and $_[0]->{message}  = $_[1]; $_[0]->{message}  }
    sub debug    { @_ > 1 and $_[0]->{debug}    = $_[1]; $_[0]->{debug}    }
    sub file     { @_ > 1 and $_[0]->{file}     = $_[1]; $_[0]->{file}     }
    sub line     { @_ > 1 and $_[0]->{line}     = $_[1]; $_[0]->{line}     }
    sub object   { @_ > 1 and $_[0]->{object}   = $_[1]; $_[0]->{object}   }
    sub sysmsg   { @_ > 1 and $_[0]->{sysmsg}   = $_[1]; $_[0]->{sysmsg}   }

=head1 IMPACT

Once a standard set of attributes is decided on, RFC 88 can be revised to provide exception tests like:

    except tag => "ABC.1234" => catch { ... }

    except severity => "Fail" => catch { ... }

=head1 ISSUES

How to extend ivars and control namespace?

How to extend methods and control namespace?

Default values for tag and severity?

How to categorize severity?

How to arrange the exception class hierarchy for the Perl core?

How to tag exceptions in the Perl core?

What assertions should be placed on the instance variables?

What should stringification return?

=head1 REFERENCES

RFC 70: Allow exception-based error-reporting.

RFC 80: Exception objects and classes for builtins.

RFC 63: Exception handling syntax proposal (Error.pm)

RFC 88: Structured Exception Handling Mechanism (Try.pm)
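To illustrate how such a base class combines with traditional eval, die, and $@, here is a minimal Perl 5 sketch. The class names My::Exception and My::Exception::IO and the sub classify are hypothetical, only three of the accessors are shown (read-only), and the RFC's elided defaults ("caller ...", "magic ...") are filled in with a plain caller call purely as an assumption.

```perl
use strict;
use warnings;
use Scalar::Util qw(blessed);

package My::Exception;    # hypothetical name; the RFC does not fix one

sub new {
    my ($class, $message, %args) = @_;
    my (undef, $file, $line) = caller;    # assumed default for file/line
    $args{file} = $file unless exists $args{file};
    $args{line} = $line unless exists $args{line};
    return bless { message => $message, %args }, ref $class || $class;
}

sub message  { $_[0]->{message} }
sub tag      { $_[0]->{tag} }
sub severity { $_[0]->{severity} }

package My::Exception::IO;    # hypothetical subclass
our @ISA = ('My::Exception');

package main;

# Raise with die, trap with eval, then dispatch on the isa hierarchy.
sub classify {
    my ($code) = @_;
    eval { $code->(); 1 } and return 'ok';
    my $err = $@;
    return 'io:' . $err->tag if blessed($err) && $err->isa('My::Exception::IO');
    return 'other'           if blessed($err) && $err->isa('My::Exception');
    return 'plain';           # an ordinary string passed to die
}
```

Dispatching on isa is what lets a catch clause for the base class also trap the derived classes.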
RFC 331 (v2) Consolidate the $1 and C\1 notations
This and other RFCs are available on the web at
  http://dev.perl.org/rfc/

=head1 TITLE

Consolidate the $1 and C<\1> notations

=head1 VERSION

  Maintainer: David Storrs [EMAIL PROTECTED]
  Date: 28 Sep 2000
  Last Modified: 30 Sep 2000
  Mailing List: [EMAIL PROTECTED]
  Number: 331
  Version: 2
  Status: Frozen

=head1 ABSTRACT

Currently, C<\1> and $1 have only slightly different meanings within a regex. It is possible to consolidate them without losing any functionality and, in the process, we gain intuitiveness.

=head1 CHANGES

v1-v2: A major rewrite:

=over 4

=item *

Reformatted the argument into "The Problem" and "The Solution" sections

=item *

Added "Some Examples" section

=item *

Added "Why do this?" section

=item *

Added "P526 migration" section

=item *

Proposed the @/ variable

=item *

Various trivial edits and typo fixes

=back

=head1 DESCRIPTION

Note: For convenience, I am going to talk about C<\1> and $1 in this RFC. In actuality, these notations extend indefinitely: C<\1..\n> and C<$1..$n>. Take it as read that anything which applies to $1 also applies to C<$2, $3>, etc.

=head2 The Problem

In current versions of Perl, C<\1> and C<$1> mean different things. Specifically, C<\1> means "whatever was matched by the first set of grouping parens I<in this regex match>." $1 means "whatever was matched by the first set of grouping parens I<in the previously-run regex match>." For example:

=over 4

=item *

C</(foo)_$1_bar/>

=item *

C</(foo)_\1_bar/>

=back

the second will match 'foo_foo_bar', while the first will match 'foo_[SOMETHING]_bar' where [SOMETHING] is whatever was captured in the B<previous> match...which could be a long, long way away, possibly even in some module that you didn't even realize you were including (because it was included by a module that was included by a module that was included by a...).

The primary reason for this distinction is s///, in which the left hand side is a pattern while the right hand side is a string (assuming no 'e' modifier). Therefore:

=over 4

=item *

C<s/(foo)$1/$1bar/> # changes "foo???" to "foobar" where ??? is from the last match

=item *

C<s/(foo)\1/$1bar/> # changes "foofoo" to "foobar"

=back

Note that, in the first example, the two $1s refer to different things, whereas in the second example, $1 and C<\1> refer to the same thing. This is counterintuitive and non-Perlish; Perl should be intuitive and DWIMish.

A separate, though less important, problem with the way backreferences are currently implemented is that it is difficult for a human to tell at a glance whether \10 means "escape character 10" or "backreference 10"...the only way to tell is to count the number of captured elements and see if there actually are ten of them, in which case \10 is a backreference and otherwise it is an escape character. In general, this isn't a problem because most patterns don't have ten sets of capturing parens.

=head2 The Solution

Ok, so the problem is that $1 and C<\1> are counterintuitive. How do we make them intuitive without losing any functionality?

First, let's get rid of the C<\1> form for backreferences.

Second, let's say that $n refers to the nth captured subelement of the pattern match which occurred in this B<statement>--note that this is distinct from "in this pattern match." That means that, in C<s/(foo)$1/$1bar/>, both $1s refer to the same thing (the string 'foo'), even though one of them occurred inside a pattern and one occurred inside a string. (See note [1] in the IMPLEMENTATION section.)

Third, let's create a new special variable, @/ (mnemonic: the / is the default delimiter for a pattern match; if the English module remains extant, then @/ could have the long name of @LAST_MATCH, but there are currently several threads concerning removal of the English module). Much like the current C<$1, $2...> variables, this array will only be created (and hence, the speed price will only be paid), if you access its members. The 0th element of @/ will contain the qr()d form of the last pattern match, while successive elements refer to the captured subelements.

Fourth, let's change when we update the variables which store the captures (the current C<$1, $2>, etc). @/ will only be updated when the entire statement which contains a pattern match has finished running (e.g., when the entire s/// is completed), rather than as soon as the pattern match is done (and therefore before the substitution happens).

=head2 Some Examples

=over 4

=item 1

If you did the following:

C<"Bilbo Baggins" =~ /((\w+)\s+(\w+))/>

Then @/ would contain the following:

C<$/[0]> the compiled equivalent of C</((\w+)\s+(\w+))/>,

C<$/[1]> the string "Bilbo Baggins"

C<$/[2]> the string "Bilbo"

C<$/[3]> the string "Baggins"

Note that after the match, C<$/[1]>, C<$/[2]>, and C<$/[3]> contain exactly what C<$1>, $2, and C<$3> would contain with present-day syntax. Furthermore, the compiled form of the match is available so if you want to repeat the match later (or insert it into a larger regex), you can
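
The difference between the two notations described under "The Problem" can be observed directly in current Perl 5. The snippet below is a small demonstration of existing behaviour, not of the proposed @/ variable; the variable names are illustrative.

```perl
use strict;
use warnings;

# \1 is a backreference into the match that is currently running.
my $backref = "foo_foo_bar" =~ /(foo)_\1_bar/ ? 1 : 0;

# $1 inside a pattern is interpolated before matching starts, so it still
# holds the capture from the previous successful match ("baz" here).
"baz" =~ /(baz)/ or die "setup match failed";
my $interp = "foo_baz_bar" =~ /(foo)_$1_bar/ ? 1 : 0;

# On the right-hand side of s///, $1 refers to the current match.
(my $text = "foofoo") =~ s/(foo)\1/$1bar/;

print "backref=$backref interp=$interp text=$text\n";
```

The second match succeeds only because an unrelated earlier match happened to leave "baz" in $1, which is the action at a distance the RFC objects to.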
RFC 347 (v2) Remove long-deprecated $* (aka $MULTILINE_MATCHING)
This and other RFCs are available on the web at
  http://dev.perl.org/rfc/

=head1 TITLE

Remove long-deprecated $* (aka $MULTILINE_MATCHING)

=head1 VERSION

  Maintainer: Hugo van der Sanden [EMAIL PROTECTED]
  Date: 29 Sep 2000
  Last Modified: 30 Sep 2000
  Mailing List: [EMAIL PROTECTED]
  Number: 347
  Version: 2
  Status: Frozen

=head1 ABSTRACT

The magic $* variable (known in English as $MULTILINE_MATCHING) has been deprecated for years. It is time to kill it.

=head1 DESCRIPTION

In days of yore, you would set $* to 1 to achieve in all regexps the same as you can now achieve on a per-regexp basis with the /m flag. Nowadays, when most perl programmers have never heard of it, it is an accident waiting to happen and requires ugly additional cruft for the defensive programmer to avoid.

The particular danger of $* is its 'action at a distance' effect: as a global variable, its effect reaches into and out of scopes that we normally expect to protect us.

=head1 MIGRATION

The long deprecation cycle helps here. p52p6 should complain and die if it sees any attempt to set $* or $MULTILINE_MATCHING to a non-zero value, or any attempt to alias it other than in English. It should silently (or maybe with a warning) ignore any attempt to set it to a zero value, and silently (or maybe with a warning) replace any attempt to read it with a constant undef.

=head1 IMPLEMENTATION

This only simplifies the regexp engine, and should help fix some longstanding bugs in the scope of /m. There is a bit of work to do to extricate it, but nothing seriously difficult.

=head1 REFERENCES

perlvar manpage for discussion of $*
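As a reminder of the per-regexp replacement mentioned in the DESCRIPTION, the following small Perl 5 sketch shows the /m flag changing how ^ and $ anchor within a single pattern. The variable names are illustrative only.

```perl
use strict;
use warnings;

my $text = "first\nsecond\n";

# Without /m, ^ anchors only at the very start of the string.
my $plain = $text =~ /^second$/ ? 1 : 0;

# With /m, ^ and $ also match next to embedded newlines, and the effect
# is limited to this one regexp rather than being global like $*.
my $multi = $text =~ /^second$/m ? 1 : 0;

print "plain=$plain multi=$multi\n";
```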
RFC 360 (v1) Allow multiply matched groups in regexes to return a listref of all matches
This and other RFCs are available on the web at
  http://dev.perl.org/rfc/

=head1 TITLE

Allow multiply matched groups in regexes to return a listref of all matches

=head1 VERSION

  Maintainer: Kevin Walker [EMAIL PROTECTED]
  Date: 30 Sep 2000
  Mailing List: [EMAIL PROTECTED]
  Number: 360
  Version: 1
  Status: Developing

=head1 DESCRIPTION

Since the October 1 RFC deadline is nigh, this will be pretty informal.

Suppose you want to parse text which looks like:

  name: John Abajace
  children: Tom, Dick, Harry
  favorite colors: red, green, blue

  name: I. J. Reilly
  children: Jane, Gertrude
  favorite colors: black, white

  ...

Currently, this takes two passes:

  while ($text =~ /name:\s*(.*?)\n\s*
                   children:\s*(.*?)\n\s*
                   favorite\ colors:\s*(.*?)\n/sigx) {
      # now second pass for $2 ( = "Tom, Dick, Harry") and $3, yielding
      # list of children and favorite colors
  }

If we introduce a new construction, (?@ ... ), which means "spit out a
list ref of all matches, not just the last match", then this could be
done in one pass:

  while ($text =~ /name:\s*(.*?)\n\s*
                   children:\s*(?:(?@\S+)[, ]*)*\n\s*
                   favorite\ colors:\s*(?:(?@\S+)[, ]*)*\n/sigx) {
      # now we have:
      #  $1 = "John Abajace";
      #  $2 = ["Tom", "Dick", "Harry"]
      #  $3 = ["red", "green", "blue"]
  }

Although the above example is contrived, I have very often felt the need
for this feature in real-world projects.

=head1 IMPLEMENTATION

Unknown.

=head1 REFERENCES

None.
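For comparison, the two-pass approach the RFC describes as the current state of affairs might look like the following in Perl 5. This is only a sketch: the record layout, the `@records` structure and the choice of `split /,\s*/` for the second pass are assumptions, since the RFC leaves the second pass as a comment.

```perl
use strict;
use warnings;

# Sample input laid out as in the RFC's description (layout assumed).
my $text = <<'END';
name: John Abajace
children: Tom, Dick, Harry
favorite colors: red, green, blue

name: I. J. Reilly
children: Jane, Gertrude
favorite colors: black, white
END

my @records;
while ($text =~ /name:\s*(.*?)\n\s*
                 children:\s*(.*?)\n\s*
                 favorite\ colors:\s*(.*?)\n/sigx) {
    my ($name, $children, $colors) = ($1, $2, $3);

    # Second pass: break the comma-separated captures into lists.
    push @records, {
        name     => $name,
        children => [ split /,\s*/, $children ],
        colors   => [ split /,\s*/, $colors ],
    };
}

for my $record (@records) {
    print "$record->{name}: @{ $record->{children} }\n";
}
```

The proposed (?@ ... ) construct would make the two `split` calls unnecessary by having the match itself return the array references.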
RFC 112 (v4) Assignment within a regex
This and other RFCs are available on the web at
  http://dev.perl.org/rfc/

=head1 TITLE

Assignment within a regex

=head1 VERSION

  Maintainer: Richard Proctor [EMAIL PROTECTED]
  Date: 16 Aug 2000
  Last Modified: 1 Oct 2000
  Mailing List: [EMAIL PROTECTED]
  Number: 112
  Version: 4
  Status: Frozen

=head1 ABSTRACT

Provide a simple way of naming and picking out information from a regex
without having to count the brackets.

=head1 DESCRIPTION

If a regex is complex, counting the bracketed sub-expressions to find the
ones you wish to pick out can be messy. It is also prone to maintainability
problems if and when you wish to add to the expression. Using (?:) to
suppress capturing brackets helps, but it still gets "complex". I would
sometimes rather just pick out the bits I want within the regex itself.

Suggested syntax: (?$foo= ... ) would assign the string that is matched by
the pattern ... to $foo when the pattern matches. These assignments would be
made left to right after the match has succeeded but before processing a
replacement or other results (or prior to any (?{...}) or (??{...}) code).
There may be whitespace between the $foo and the "=".

Potentially the $foo could be any scalar LHS, as in (?$foo{$bar}= ... ),
likewise the '=' could be any assignment operator.

The camel and the docs include this example:

  if (/Time: (..):(..):(..)/) {
      $hours = $1;
      $minutes = $2;
      $seconds = $3;
  }

This then becomes:

  /Time: (?$hours=..):(?$minutes=..):(?$seconds=..)/

This is more maintainable than counting the brackets and easier to
understand for a complex regex. And one does not have to worry about the
scope of $1 etc.

=head2 When does the assignment actually happen?

In general all assignments should wait to the very end, and then be made
together. However, before code callouts (?{...}) and friends, the named
assignments that are currently defined should be made so that the code can
refer to them by name.

It may be appropriate for any assignments made before a code callout to be
localised so they can be unrolled should the expression finally fail.

=head2 Named Backrefs

The first versions of this RFC did not allow for backrefs. I now think this
was a shortcoming. It can be done with (??{quotemeta $foo}), but I find this
clumsy; a better way of using a named back ref might be (?\$foo).

=head2 Scoping

The question of scoping for these assignments has been raised, but I don't
currently have a feel for the "best" way to handle this. Input welcome.

Hugo:

  I think it should be defined to act the same as in (??{...}), whenever
  we get around to defining that.

=head2 Brackets

When using this method to capture wanted content, it might be desirable to
stop ordinary brackets capturing, without needing to use (?:...). I
therefore suggest, as an enhancement to regexes, a /b (bracket?) flag under
which ordinary brackets just group, without capture - in effect they all
behave as (?:...).

=head1 CHANGES

V3 - added bit about backrefs, and brackets.

V4 - Clarified a few things and froze

=head1 IMPLEMENTATION

Currently all $scalars in regexes are expanded before the main regex
compiler gets to analyse the syntax. This problem also affects several other
RFCs (166 for example). For these (and other) RFCs, the expansion of
variables in regexes needs to be driven from within the regex compiler so
that the regex can expand as and where appropriate. Changing this should not
affect any existing behaviour.

=head1 REFERENCES

I brought this up on p5p a couple of years ago, but it was lost in the
noise...

RFC 166

Perlstorm #0040
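As a point of comparison for the Time: example in the RFC above, here are two ways of avoiding explicit $1, $2, $3 bookkeeping in Perl 5. Neither is the (?$foo= ... ) syntax the RFC proposes. The first is plain list assignment; the second uses named captures and the %+ hash, which exist in Perl 5.10 and later. The input string and the `%time` hash are assumptions made for the sketch.

```perl
use strict;
use warnings;

my $line = 'Time: 12:34:56';   # illustrative input

# List assignment: captures are assigned left to right on success.
my ($hours, $minutes, $seconds) = $line =~ /Time: (..):(..):(..)/;

# Named captures (Perl 5.10+): no bracket counting when the regex grows.
my %time;
if ($line =~ /Time: (?<hours>..):(?<minutes>..):(?<seconds>..)/) {
    %time = %+;   # copy the named captures out of the match
}

print "$hours:$minutes:$seconds\n";
print join(':', @time{qw(hours minutes seconds)}), "\n";
```

List assignment still depends on bracket order, so it shares the maintainability concern the RFC raises; named captures remove that dependency but keep the results in a hash rather than in arbitrary scalars.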
RFC 166 (v4) Alternative lists and quoting of things
This and other RFCs are available on the web at
  http://dev.perl.org/rfc/

=head1 TITLE

Alternative lists and quoting of things

=head1 VERSION

  Maintainer: Richard Proctor [EMAIL PROTECTED]
  Date: 27 Aug 2000
  Last Modified: 1 Oct 2000
  Mailing List: [EMAIL PROTECTED]
  Number: 166
  Version: 4
  Status: Frozen

=head1 ABSTRACT

Expand Alternate Lists from Arrays and Quote the contents of things
inside regexes.

=head1 DESCRIPTION

These are a couple of constructs to make it easy to build up regexes
from other things.

=head2 Alternative Lists from arrays

The basic idea is to expand an array as a list of alternatives. There are
two possible syntaxes: (?@foo) and just plain @foo. Plain @foo might (just)
have existing uses, so I prefer the (?@foo) syntax.

(?@foo) is just syntactic sugar for (?:(??{ join('|',@foo) })), a bracketed
list of alternatives. If it is built at regex compile time, maybe it's
@{[ join('|',@foo) ]}.

=head2 Quoting the contents of things

If a regex uses $foo or @bar there are problems if the contents of the
variables contain special characters. What is needed is a way of \Quoting
the content of scalars $foo or arrays (?@foo).

Suggested syntax:

(?Q$foo) Quotes the contents of the scalar $foo - equivalent to
(??{ quotemeta $foo }).

(?Q@foo) Quotes each item in a list (as above); this is equivalent to
(?:(??{ join ('|', map quotemeta, @foo)})).

In this syntax the Q is used as it represents a more intelligent \Quot\E.

It is recognised that (?Q$foo) is equivalent to \Q$foo\E, but that does not
make it a bad idea to add it at the same time as (?Q@foo), for reasons of
symmetry and Perl DWIM.

It is recognised that (?Q might be reserved for control of a hypothetical Q
flag, but this does feel "appropriate" as it's about \Quoting.

=head2 Comments

Hugo:

  (?@foo) and (?Q@foo) are both things I've wanted before now. I'm not sure
  if this is the right syntax, particularly if RFC 112 is adopted: it would
  be confusing to have (?@foo) have so different a meaning from
  (?$foo=...), and even more so if the latter is ever extended to allow
  (?@foo=...).

  I see no reason that implementation should cause any problems since this
  is purely a regexp-compile time issue.

Me: I can't see any reasonable meaning for (?@foo=...), so this seems an
appropriate syntax, but I am open to other suggestions.

=head1 CHANGES

V1 of this RFC had three ideas, one has been dropped, the other is now part
of RFC 198.

V2 Expands the list expansion and quoting with quoting of scalars and
implementation issues.

V3 In error, what should have been 165 V2 was issued as 166 V2, so this is
V3 with a change in (?Q$foo). This is in a pre-frozen state.

V4 Added a couple of minor changes from Hugo and frozen.

=head1 MIGRATION

(?@foo) and (?Q...) are additions without any compatibility issues.

The option of just @foo for list expansion might represent a small problem
if people already use the construct.

=head1 IMPLEMENTATION

Both of these changes are regex compile-time issues.

Generating lists from arrays almost works by localising $" as '|' for the
regex and just using @foo.

MJD has demonstrated implementing (?@foo) as (?\@foo) by means of an
overload of regexes; this slight change was necessary because of the
expansion of @foo - see below.

Both of these changes are currently affected by the expansion of variables
in the regex before the regex compiler gets to work on the regex. This
problem also affects several other RFCs. For these (and other) RFCs, the
expansion of variables in regexes needs to be driven from within the regex
compiler so that the regex can expand as and where appropriate. Changing
this should not affect any existing behaviour.

=head1 REFERENCES

RFC 198: Boolean Regexes
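The expansions that the RFC above gives for (?@foo) and (?Q@foo) can be written out by hand in Perl 5, which also shows why the quoting variant matters. This is a minimal sketch; the array contents and the anchored `qr//` wrappers are assumptions chosen so the difference is visible.

```perl
use strict;
use warnings;

# Illustrative list containing regex metacharacters.
my @foo = ('a.b', 'c|d', 'e');

# Hand-written equivalent of (?@foo): items are used as regex source.
my $raw    = join '|', @foo;
my $raw_re = qr/^(?:$raw)$/;

# Hand-written equivalent of (?Q@foo): each item is quoted first.
my $quoted    = join '|', map { quotemeta } @foo;
my $quoted_re = qr/^(?:$quoted)$/;

for my $candidate ('a.b', 'axb', 'c', 'c|d') {
    printf "%-4s raw=%s quoted=%s\n", $candidate,
        ($candidate =~ $raw_re    ? 'match' : 'no'),
        ($candidate =~ $quoted_re ? 'match' : 'no');
}
```

With the unquoted join, the '.' and '|' inside the items are interpreted as regex syntax, so 'axb' and 'c' match even though neither is in the list.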
RFC 324 (v2) Extend AUTOLOAD functionality to AUTOGLOB
le release of perl6, but remove it in later releases. This will give people sufficient time to heed the warnings (we do heed warnings, right?) and update their code. =head1 REFERENCES RFC 8: The AUTOLOAD subroutine should be able to decline a request RFC 190: Objects : NEXT pseudoclass for method redispatch RFC 232: Replace AUTOLOAD by a more flexible mechanism
RFC 356 (v2) Dominant Value Expressions
This and other RFCs are available on the web at http://dev.perl.org/rfc/ =head1 TITLE Dominant Value Expressions =head1 VERSION Maintainer: Glenn Linderman [EMAIL PROTECTED] Date: 29 Sep 2000 Last Modified: 30 Sep 2000 Mailing List: [EMAIL PROTECTED] Number: 356 Version: 2 Status: Frozen =head1 ABSTRACT An aid to determining if an input value has an impact on the result of an expression whole program. Can also be used for Perl poetry. =head1 CHANGES Version 2 adds output function, examples, and freezes. =head1 DESCRIPTION This is an optional feature; to turn it on, "use domination;" is suggested. When use domination is in scope, two new functions are available, and new rules for expression evaluation obtain. Each of these is described in a subsection below. =head2 Domination pragma The "use domination 'output-function-name';" pragma enables the rest of the functionality. It should be scoped, affecting the current and nested blocks. domination;", but allows user specification of a method to use to convert dominant values to output strings. This is implicitly called only when an output stream is passed a dominant value. If an output-function-name is not supplied with "use domination;", the following function is implied: sub output-dominant-value { return sprintf "DOMINANT(%g)", dominant_weight($_[0]); } The "no domination;" pragma would turn off the effect of "use domination;" for the current and nested blocks. If a dominant value is encountered while "no domination;" is in effect, it is treated as "undef" by all scalar operators. =head2 Dominant operation The dominant operation takes a scalar argument, which is considered to be a weight parameter. The scalar argument is converted to numeric, if possible, with the resultant positive weight parameter producing a dominant value with that given weight. A scalar argument that cannot be automatically converted to numeric, or that produces a negative or zero numeric value produces undef as a result. 
dominant 47# produces a dominant value of weight 47 dominant 0 # produces undef dominant -47 # produces undef dominant "ab" # produces undef dominant "14" # produces a dominant value of weight 14 dominant undef # produces undef (and a warning if "use warnings") =head2 Dominant_weight operation The dominant_weight operation takes a scalar argument. If the scalar argument is a dominant value, it returns its weight as a positive number. If the scalar argument is not a dominant value, the return of the dominant_weight operation is zero. dominant_weight dominant 47 # produces 47 dominant_weight dominant 0# produces 0 dominant_weight dominant -47 # produces 0 dominant_weight dominant "ab" # produces 0 dominant_weight dominant "14" # produces 14 dominant_weight 47# produces 0 dominant_weight 0 # produces 0 dominant_weight "ab" # produces 0 dominant_weight "14"# produces 0 dominant_weight undef # produces 0 (and a warning if "use warnings") =head2 Expressions involving dominant values All scalar operations are affected by the presence of dominant values. If a scalar operation other than the dominant_weight operation involves one or more dominant values, the result of the operation is the heaviest (by weight) dominant value among the operands. If no dominant values are supplied to the operation, the result of the operation is the same as it would be according to the usual definition of the operation. use domination; $w = dominant 3; $x = dominant 47; $y = 33; $z = "abc"; $x + $y# produces dominant 47 $w . 
$z# produces dominant 3 $w - $x# produces dominant 47 "$z $x"# produces dominant 47 $z =~ m/$w/; # produces dominant 3 $x =~ m/$z/; # produces dominant 47 defined $w # produces dominant 3 $x == $x # produces dominant 47 $w eq $w # produces dominant 3 $w 17# produces dominant 3 $x 17# produces dominant 47 print "Show me: $w\n" # same result as: print "DOMINANT(3)" if ( $w ) # considered false if ( dominant_weight $w 17 ) # false if ( dominant_weight $x 17 ) # true =head1 COMPATIBILITY New functionality, no compatibility issues. The new functionality only obtains if the "use domination;" pragma is in effect. =head1 IMPLEMENTATION The impact of dominant value expressions is pervasive, affecting all builtin scalar operators in a minor way. Any operators doing item by item operations on each scalar of a list or hash would be similarly affected (various RFCs exist for extending scalar operations to work on lists of values in "corresponding item" fashion). =head1 REFERENCES
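The propagation rule described above (the heaviest dominant operand wins) can be approximated in present-day Perl with operator overloading. This is only a rough sketch under stated assumptions: the `Dominant` package, its hash layout and the `_heaviest` helper are invented for illustration, only `+`, `-` and `.` are overloaded, and nothing here implements the `use domination` pragma or the interpreter-wide behaviour the RFC asks for.

```perl
package Dominant;   # hypothetical package name, not from the RFC
use strict;
use warnings;
use Scalar::Util qw(blessed looks_like_number);

# Only three binary operators are covered; the RFC covers all scalar ops.
use overload
    '+'  => \&_heaviest,
    '-'  => \&_heaviest,
    '.'  => \&_heaviest,
    '""' => sub { sprintf 'DOMINANT(%g)', $_[0]{weight} };

# Returns a dominant value for a positive numeric weight, else undef.
sub dominant {
    my ($weight) = @_;
    return undef
        unless defined $weight && looks_like_number($weight) && $weight > 0;
    return bless { weight => $weight + 0 }, __PACKAGE__;
}

# Returns the weight of a dominant value, or 0 for anything else.
sub dominant_weight {
    my ($value) = @_;
    return blessed($value) && $value->isa(__PACKAGE__) ? $value->{weight} : 0;
}

# Result of any overloaded operation: the heavier dominant operand.
sub _heaviest {
    my ($left, $right) = @_;
    return dominant_weight($right) > dominant_weight($left) ? $right : $left;
}

my $w = dominant(3);
my $x = dominant(47);

my $sum  = $x + 33;      # dominant operand wins over a plain number
my $diff = $w - $x;      # heavier of two dominant operands wins
my $text = "abc $x";     # interpolation goes through the '.' overload

print $sum, "\n";
print dominant_weight($diff), "\n";
```

The default output format mirrors the `DOMINANT(%g)` function quoted in the RFC. A real implementation would have to hook every scalar operator, which is why the RFC calls the impact pervasive.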
RFC 357 (v1) Perl should use XML for documentation instead of POD
This and other RFCs are available on the web at http://dev.perl.org/rfc/ =head1 TITLE Perl should use XML for documentation instead of POD =head1 VERSION Maintainer: Frank Tobin [EMAIL PROTECTED] Date: 30 Sep 2000 Mailing List: [EMAIL PROTECTED] Number: 357 Version: 1 Status: Developing =head1 ABSTRACT Perl documentation should move to using XML as the formatting language, instead of using POD. XML has many advantages over POD, and would address several problems POD is reaching as it expands beyond its original designs. =head1 DESCRIPTION POD, described in Lperlpod, is currently the de-facto language for Perl-related documentation. It is a simple language, with simple tags. According to Lperlpod/"The Intent", simplicity was behind the intended design of POD. There exist several tools to convert POD into HTML, manpages, text, and other languages. However, POD can be confusing to learn, and has many limitations. XML, on the other hand, is a document language that is designed for high extensibility, and little ambiguity. XML is flexible, relying on Document Type Definitions, DTD's, (recently, also XML Schemas), which define the document structure being used by the author. For example, a copy of the DTD for XHTML may be found at Chttp://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd. Knowledge of how to write or work with DTD's is not necessary for an XML-author; the author need only know the effect of the DTD or XML Schema. In the following few sections I will try to compare various aspects of POD versus XML. =head2 Syntax POD's in-line tags use the general form CtagEltbodyEgt, while Primarily, XML uses balanced tags, e.g., CElttagEgtbodyElt/tagEgt. POD's in-line tags tend to be single-letters, and this can be thought of an argument for the readability of POD in it's un-rendered form, but an XML DTD for Perl documentation could also define single-letter tags to minimize this. POD's tagging, althrough simple, can produce confusing code. 
For example, to render the following (CtagEltbodyEgt) I need to have (CElttagEEltltEgtbodyEEltgtEgtEgt), which is horrendous in its own (look at the source to this document). It is especially hard to understand, since one would expect CbodyE to be an inseparable string; yet, reading, one needs to use look-ahead to see what is following the E. XML, on the other hand, uses as the escaping mechanism, helping a reader sort-out deeply-nested escapings. =head2 Use of Tags and Rendering Several of POD's tags do not necessarily relate to the meaning of the text, but how the text is rendered. For instance, CIElttextEgt is used to italicize text, not give meaning, although Lperlpod mentions that this should be used for emphasis or variables. This sort of style in having tags say how something should be rendered is something XHTML and friends have tried to get away from, instead having the presentation of the document separate from the structuring/tagging of it. XHTML and XML user-agents use stylesheets to render documents separately from the document itself; this allows the developer to merely give Imeaning to the document, instead of deciding how it will be rendered; the choice of how to render is left to the user, with the use of a stylesheet. The use of using tags solely to give meaning also helps accessibility problems for the visually-impaired; for example, having bold text (e.g., BEltfoobarEgt) is not quite as meaningful as strongly-emphasized text (e.g, EltstrongEgtUsing meaning makes things accessible!Elt/strongEgt). The use of stylesheets along with XML allows documents to be rendered very well in a variety of circumstances, such as manpages, a continuous-display browser (e.g., web-browser), and in printed form. =head2 Whitespace This can be confusing for many, as whitespace in most languages does not have functionality. However, in POD, lines beginning with a whitespace character are treated as pre-formatted text. 
This has already caused an RFC to take effect, RFC 216. XML, on the other hand, uses properties to determine the whitespace handling between different types of tags; for example, a tag in a Perl XML DTD tag such as EltperlEgt could be used to surround pre-formatted Perl code. =head2 Learning Curve Who knows what languages people will base their knowledge off of in 2 years? Noone really does, but HTML-style, balanced-tag languages are a good guess though, given the web's popularity, and easy-to-grasp notion of having balanced-tags. On the other hand, POD will be Yet Another Language to learn, distinct from other typing systems the user will know. While this may not seem like a problem since POD is designed to just be a simple language, will likely become more complicated in the future (see L"Extensibility"). =head2 Author Effort POD has the advantage in that it has syntax that is quick and easy to type, such as CElt$var++Egt instead of an XML/XHTML EltcodeEgtvar++EltcodeEgt. However, author effort pays off immensely
RFC 361 (v1) Simplifying split()
This and other RFCs are available on the web at
  http://dev.perl.org/rfc/

=head1 TITLE

Simplifying split()

=head1 VERSION

  Maintainer: Sean M. Burke [EMAIL PROTECTED]
  Date: 30 Sep 2000
  Mailing List: [EMAIL PROTECTED]
  Number: 361
  Version: 1
  Status: Developing

=head1 ABSTRACT

Perl 5's C<split> function is messy, and should be simplified.

=head1 DESCRIPTION

Perl 5 split does five things that I think are just annoying, and
which I suggest be removed:

=over

=item 1.

The first argument to split is currently interpreted as a regexp,
regardless of whether or not it actually is one. (Yes, C<split '.', $foo>
doesn't split on dot -- it's currently the same as C<split /./, $foo>.)
I suggest that split be changed to treat only regexps as regexps, and
everything else as literals.

=item 2.

Empty trailing fields are currently suppressed (although a -1 as the
third argument disables this). I suggest that empty trailing fields be
retained by default.

=item 3.

When not in list context, split currently splits into @_. I suggest that
this side-effect be removed.

=item 4.

split ?pat? in any context currently splits into @_. I suggest that this
side-effect be removed.

=item 5.

split ' ' (but not split / /) currently splits on whitespace, but also
removes leading empty fields. I suggest that this irregularity be removed.

=back

The last three of the above points speak for themselves. I will focus on
the first two.

Most notably, I suggest that Perl 6 C<split('|', ...)> should work as most
people expect -- splitting on a literal bar. (Under Perl 5,
C<split('|', ...)> is synonymous with C<split(/|/, ...)> -- i.e., split on
nullstring or nullstring [sic].) So I suggest:

  Perl 5:  split /\|/, ...

be synonymous with (and be better written as)

  Perl 6:  split '|', ...    # altho split /\|/, $bar... remains valid

And as to the second point, the removal of trailing blanks, I suggest:

  Perl 5:  @x = split /:/, $bar, -1;

be synonymous with

  Perl 6:  @x = split ':', $bar;

If you want to remove trailing fields, under Perl 6 you should have to do
it explicitly:

  Perl 5:  @x = split /:/, $bar;

be synonymous with

  Perl 6:  @x = split ':', $bar;
           while(@x and !length $x[-1]) { pop @x }

I believe that the current behavior of removing trailing empty fields is
unintuitive and surprising to learners; nothing about the concept of
splitting a string into a list suggests removing trailing empties.
(Moreover, I find that when I need to remove empties, it's not just the
trailing ones; so the current behavior is rarely just what I want.)

=head1 IMPLEMENTATION

I'll leave the C-coding details to the usual, capable implementers. But I
will note one minor complication with my first suggestion (that literals
and regexps be distinguished). Consider:

  Perl 6:  @x = split $foo, $bar;

I suggest that the correct approach is to treat $foo's value as a literal,
unless it holds an object of class Regexp (or a class derived from it?),
in which case it should be treated as if the above were:

  Perl 6:  @x = split qr/$foo/, $bar;

In other words, in such cases it is not possible to know at compile time
whether a given "split" operator means literal-split or regexp-split. I
note that such cases are rare.

=head1 ALTERNATIVE APPROACH

In conclusion, I'll note that there is a conservative alternative approach
possible: if any of the above features of Perl 5 split seem really worth
keeping, my suggestion for a "clean split" can be implemented as a separate
operator called, for example, "cleave". (Consider the precedent of Perl 5
chomp being added alongside Perl 4 chop, not replacing it.) I would
consider this suboptimal, though; I think that an operator with as
straightforward and intuitive a name as "split" should behave in a
straightforward and intuitive way.

=head1 REFERENCES

Nil.
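The Perl 5 behaviours criticised in points 1 and 2 above are easy to observe, and the "cleave" alternative can be sketched as an ordinary subroutine. The `cleave` body below is an assumption about how such an operator might behave (literal separators unless given a `qr//` object, trailing empty fields kept); the RFC only suggests the name.

```perl
use strict;
use warnings;

# Point 1: a string first argument is still compiled as a regexp.
my @chars  = split '|', 'a|b';    # /|/ matches the empty string
my @fields = split /\|/, 'a|b';   # escaped bar splits on the literal

# Point 2: trailing empty fields are dropped unless the limit is -1.
my @trimmed = split /:/, 'a:b::';
my @kept    = split /:/, 'a:b::', -1;

# Hypothetical "cleave": literal unless given a Regexp object,
# and trailing empty fields are always retained.
sub cleave {
    my ($separator, $string) = @_;
    my $pattern = ref($separator) eq 'Regexp'
        ? $separator
        : quotemeta $separator;
    return split /$pattern/, $string, -1;
}

my @cleaved = cleave('|', 'a|b|');
my @by_re   = cleave(qr/\s*,\s*/, 'x , y');

print scalar(@chars), " pieces from split '|'\n";
print scalar(@trimmed), " vs ", scalar(@kept), " fields\n";
```

The run-time `ref` check corresponds to the complication noted under IMPLEMENTATION: whether a call is a literal split or a regexp split cannot always be known at compile time.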
RFC 358 (v1) Keep dump capability.
This and other RFCs are available on the web at
  http://dev.perl.org/rfc/

=head1 TITLE

Keep dump capability.

=head1 VERSION

  Maintainer: S. A. Janet [EMAIL PROTECTED]
  Date: 30 Sep 2000
  Mailing List: [EMAIL PROTECTED]
  Number: 358
  Version: 1
  Status: Developing

=head1 ABSTRACT

To simplify distribution of programs in binary form, support for dump
should be kept.

=head1 DESCRIPTION

This would immensely aid distribution of code from one Linux, Windows, etc.
machine to others without requiring all the recipients to be able to
install Perl, compile and install modules required by the program, and
configure their hosts so that Perl finds the modules.

There are also times when pre-loading and pre-processing large amounts of
data are desirable.

=head1 IMPLEMENTATION

RFC 267 wants dump eliminated mainly because it is a common name for user
subroutines, but also because it can be accomplished with a kill signal.
I really do not care if dump is renamed, but I believe keeping the
capability is in perl's interest for greater acceptance and use.

=head1 REFERENCES

None.
RFC 287 (v2) Improve Perl Persistance
This and other RFCs are available on the web at
  http://dev.perl.org/rfc/

=head1 TITLE

Improve Perl Persistance

=head1 VERSION

  Maintainer: Adam Turoff [EMAIL PROTECTED]
  Date: 24 Sep 2000
  Last Modified: 30 Sep 2000
  Mailing List: [EMAIL PROTECTED]
  Number: 287
  Version: 2
  Status: Frozen

=head1 ABSTRACT

Many mechanisms exist to make perl code and data persistent. They should
be cleaned up, unified, and documented widely within the core
documentation.

=head1 DESCRIPTION

Tom Christiansen proposed this in his perl6storm message:

=item perl6storm #0022

make marshalling easy. core module? would this allow for easy persistence
of data structures other than dbm files? general persistence is hard,
right? can this be an attribute?

Python offers one way to make code/data persistent: the C<pickle>
interface. More complex serialization can be accomplished through the
'shelve' interface or DBM files. This capability is quite useful, widely
known and easily used.

Perl, by comparison, offers Data::Dumper, which can serialize Perl objects
that are rather asymmetrically reconstituted by using C<eval> or C<do>.
Perl also offers solid, simple interfaces into DBM and Berkeley DB files,
which offer a well-known, low-level serialization mechanism.

CPAN offers many other serialization modules that are only slightly
different from Data::Dumper. This plethora of serialization mechanisms
confuses users and adds to code bloat when multiple modules each use
different serialization mechanisms that are all substantially similar.

Something similar to Python's C<pickle> interface should be added into
Perl as a builtin; this feature should have a symmetric "restore" builtin
(e.g. save()/restore(), freeze()/thaw(), dump()/undump()...).

Furthermore, Perl's low-level serialization machinery (DBM, SDBM, GDBM,
Berkeley DB) should be unified into a single core module, where the
underlying DBM implementations are pluggable drivers, like DBI's DBD
infrastructure.
=head1 IMPLEMENTATION First, the issue of adding builtin serialization functions needs to be addressed. This is a language issue because serialization should be more visible than it is today, and the best way to accomplish that is to include this feature as a pair of builtin functions. If this feature is implemented through a core module, that module might best be presented as a pragmatic module. Finally, although this proposal describes a simple matter of programming, some of the issues (such as pluggable interfaces) are best hashed out at a language-design level, so that they may be used elsewhere, easily. =head1 REFERENCES Python Pocket Reference, Chapter 12 perl6storm
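The symmetric save/restore pairing the RFC asks for can be illustrated with the Storable module, whose `freeze` and `thaw` functions form one such pair. Storable is an existing module, not the builtin the RFC proposes, and the `%config` data is invented for the sketch.

```perl
use strict;
use warnings;
use Storable qw(freeze thaw);

# Illustrative nested data structure.
my %config = (
    name  => 'example',
    ports => [ 80, 443 ],
);

# Serialize to a byte string, then reconstitute with the matching call.
my $frozen = freeze(\%config);
my $copy   = thaw($frozen);

print "restored $copy->{name} with ports @{ $copy->{ports} }\n";
```

Unlike Data::Dumper output restored through `eval`, the serializing and restoring calls here mirror each other, which is the symmetry the RFC describes.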
RFC 290 (v3) Better english names for -X
This and other RFCs are available on the web at http://dev.perl.org/rfc/ =head1 TITLE Better english names for -X =head1 VERSION Maintainer: Adam Turoff [EMAIL PROTECTED] Date: 24 Sep 2000 Last Modified: 30 Sep 2000 Mailing List: [EMAIL PROTECTED] Number: 290 Version: 3 Status: Frozen =head1 ABSTRACT Many programmers who have not used Perl's -X (or sh's -X) file tests find them to be bizarre, arcane and confusing. They deserve better 'use english;' names. =head1 NOTES The first version of this RFC proposed removing -X entirely, since it was a throwback to Perl's roots in sh. That proposal was quite vigorously shot down. The discussion centered around creating good descriptive names around the -X tests. While discussing -X, the idea came about to stack multiple tests into a single tests, e.g. mutate -r -w -x into -rwx $file (or something). See RFC 320 for details. =head1 DESCRIPTION Tom Christiansen proposed this in his perl6storm message: =item perl6storm #0101 Just like the "use english" pragma (the modern not-yet-written version of "use English" module), make something for legible fileops. is_readable(file) is really -r(file) note that these are hard to write now due to -s(FH)/2 style parsing bugs and prototype issues on handles vs paths. Here is a list of possible 'use english;' names for -X: -r freadable() -w fwriteable() -x fexecable() -o fowned() -R Freadable() -W Fwriteable() -X Fexecable() -O Fowned() -e fexists() -z fzero() -s fsize() -f ffile() -d fdir() -l flink() -p fpipe() -S fsocket() -b fblock() -c fchar() -t ftty() -u fsetuid() -g fsetgid() -k fsticky() -T ftext() -B fbinary() -M fage() -A faccessed() -C fchanged() =head1 MIGRATION ISSUES None. New symbolic names for -X are being added. =head1 IMPLEMENTATION Add appropriate hooks into 'use english;', and possibly export them as 'use english "filetests";' =head1 REFERENCES RFC 320: perl6storm
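A few of the proposed names can be tried out today as thin wrappers around the existing -X operators. This is a minimal sketch that assumes plain subroutines rather than the 'use english;' hooks the RFC describes; only four names from the RFC's list are shown, and the temporary file is created purely for the demonstration.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical wrappers using names from the RFC's list.
sub fexists   { my ($path) = @_; return -e $path }
sub freadable { my ($path) = @_; return -r $path }
sub fsize     { my ($path) = @_; return -s $path }
sub fdir      { my ($path) = @_; return -d $path }

# Create a small scratch file to test against.
my ($fh, $path) = tempfile(UNLINK => 1);
print {$fh} 'hello';
close $fh or die "close failed: $!";

printf "exists=%d size=%d dir=%d\n",
    (fexists($path) ? 1 : 0), fsize($path), (fdir($path) ? 1 : 0);
```

Wrappers like these take only a path; the parsing and prototype issues with filehandles that the perl6storm note mentions are not addressed here.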
RFC 18 (v2) Immediate subroutines
This and other RFCs are available on the web at http://dev.perl.org/rfc/ =head1 TITLE Immediate subroutines =head1 VERSION Maintainer: Jean-Louis Leroy Date: 4 Aug 2000 Last Modified: 1 Oct 2000 Mailing List: [EMAIL PROTECTED] Number: 18 Version: 2 Status: Frozen =head1 ABSTRACT This very simple construct, inspired by the Forth language, makes the parser extensible by Perl code, providing powerful macro capabilities, multi-line comments, inline functions and conditional compilation. CHANGES Made code examples start at column 4 for proper HTML rendering. =head1 DESCRIPTION When the parser sees a subroutine that has been marked as 'immediate', it calls it immediately. The call's arguments are implicitly quoted as with q{} and the resulting strings are passed to the subroutine. The entire call is removed from the parse stream and replaced with the subroutine's return value. =head1 SYNTAX use immediate qw( compileif ); # mark subroutines as immediate =head1 EXAMPLES # multiline comments sub comment { return ''; } use immediate 'comment'; sub foo { # ... comment { this is a multiline comment; the call to comment is executed at parse time and returns an empty string that replaces the whole call in the parse stream }; } # conditional compilation sub compileif { my ($condition, $body) = @_; return eval($condition) ? $body : ''; } use immediate qw( compileif ); # mark subroutines as immediate sub bar { compileif -e 'state' { do 'state'; } compileif $Module::VERSION 1.23, { # blah blah blah } } # macros sub square { my $arg = shift; my $gensym = $arg . '_'; $gensym .= '_' while $arg =~ /$gensym/; return "do { $gensym = $arg; $gensym * $gensym }"; } =head1 IMPLEMENTATION A flag is associated with the data structure associated to a subroutine by the parser. The pragmatic module 'immediate' is used to turn the flag on. When the parser recognizes a subroutine call, it checks the flag and if it's true, proceeds as described above. =head1 REFERENCES The Forth standard. 
"Starting Forth", by Leo Brodie
RFC 162 (v2) Heredoc contents
This and other RFCs are available on the web at http://dev.perl.org/rfc/ =head1 TITLE Heredoc contents =head1 VERSION Maintainer: Richard Proctor [EMAIL PROTECTED] Date: 27 Aug 2000 Last Modified: 1 Oct 2000 Mailing List: [EMAIL PROTECTED] Number: 162 Version: 2 Status: Frozen =head1 ABSTRACT The content of a Heredoc is normally included into the program verbatim. RFC 111 allows whitespace (and comments) on the terminator. This RFC covers the content. It introduces the Enhanced Heredoc that removes whitespace and discusses the provision of other dequoting options in the library and documentation enhacements that should follow. =head1 DESCRIPTION =head2 Preamble I originally wanted to remove leading whitespace from the lines in a heredoc, several other people wanted to remove whitespace equivalent to the shortest span of whitespace at the start of lines, or the whitespace from the first line. TomC pointed out ways to achieve the removal of the whitespace in current perl, although this sort of works (as long as the user is consistent about use of spaces and tabs). I would like to make life easier. This attempts to bring all these ideas together. =head2 Discussion of options There are several possible ways that have been discussed: a) No Indenting - this is the current behaviour of . b) Remove all leading whitespace from all lines of input. This was not popular - no longer supported in this RFC. c) Remove whitespace equivalent to the first line of the Heredoc This was not popular - it did not fit many peoples requirements. d) Remove whitespace equialent to the smallest whitespace - a Realistic option, this can be performed by using regexes and the dequote function. e) Remove whitespace equialent to the terminator - a realistic option. This takes the whitespace off the content equivalent to that on the terminator and removes that amount of whitespace from the content. (This is now proposed for ). f) Using a Heredoc and a regex to remove unwanted whitespace. 
TomC provided some examples showing how this would work, and how this could handle many of the options above.

g) Using a Heredoc and a function to handle the dequoting of the content. This is essentially the same as a regex, but allows common types of dequoting to be written once.

=head2 Agreements

There are three things that have been agreed:

=head3 Enhanced Heredoc

There will be two types of heredocs: the simple POD C<< << >>, which just includes the content up to the POD terminator, and an enhanced POD C<< <<< >>, which removes whitespace equivalent to that on the terminator from each line of the content (case e above). (Note that the enhancements to the terminator in RFC 111 apply in both cases.)

=head3 Distribute a collection of dequote() mutations with perl

These are a set of enhanced dequoting options that can strip off leading whitespace in all the ways mentioned above, and provide treatment for variable expansion and perhaps procedure call expansion. These would be part of the standard library. Names and content to be discussed. [ NOT as part of this RFC ]

=head3 Mention the s/// tricks in the documentation

In the discussion that followed this RFC, various ways using regexes were shown that could achieve most of what people want. Some of these should be included as examples in the documentation.

=head2 Tabs

Some debate took place on tabs in the whitespace. There were two considerations:

a) The problem comes with mixing editors - some use tabs for indented material, some don't, some reduce files using tabs, etc. [I move between too many editors]. Perl should DWIM. I think that treating tabs=8 as the default would work for most people, even those who set tabs at other values, as long as they are consistent - a "use tabs 4" could be used by them if they want to get the same behaviour when they mix tabs and spaces.

b) Tabs are easy, don't expand them. Consider them as a literal character.
This assumes that the code author is going to use the same keystrokes to indent their here-doc text as the terminator, about as safe an assumption as any for tabs.

There was more support for the second case than the first.

=head2 dequoting example

TomC in the debate provided this example, which works as long as there are no inconsistent tabs in the whitespace.

    $poem = dequote<<EVER_ON_AND_ON;
        Now far ahead the Road has gone,
        And I must follow, if I can,
        Pursuing it with eager feet,
        Until it joins some larger way
        Where many paths and errands meet.
        And whither then? I cannot say.
        --Bilbo in /usr/src/perl/pp_ctl.c
    EVER_ON_AND_ON
    print "Here's your poem:\n\n$poem\n";

The following C<dequote> function handles all these cases. It expects to be called with a here document as its argument. It looks to see whether each line begins with a common substring, and if so,
RFC 337 (v2) Common attribute system to allow user-defined, extensible attributes
This and other RFCs are available on the web at
  http://dev.perl.org/rfc/

=head1 TITLE

Common attribute system to allow user-defined, extensible attributes

=head1 VERSION

  Maintainer: Nathan Wiger [EMAIL PROTECTED]
  Date: 28 Sep 2000
  Last Modified: 1 Oct 2000
  Mailing List: [EMAIL PROTECTED]
  Number: 337
  Version: 2
  Status: Frozen

=head1 ABSTRACT

Camel-3 and others have proposed a syntax for declaring variables like so:

    my type $var :attr1 :attr2 = $val;

However, nobody has firmly nailed down what C<:attr1> and C<:attr2> are supposed to do. This takes a shot at it, since this could simplify the implementation of B<RFC 188>, B<RFC 336>, B<RFC 163>, and others.

Currently, the C<attributes> module can be used to set and retrieve these attributes. However, the current C<MODIFY_ATTRIBUTES> and C<FETCH_ATTRIBUTES> subs are not really fine-grained enough to give really good control. It would be nice to be able to call specialized, delegateable and inheritable subs to handle attributes. Furthermore, these attribute handlers should be able to alter Perl's internals somewhat, perhaps through the C<use optimize> pragma.

=head1 DESCRIPTION

Attributes should be able to access some type of core hooks so that they can be defined by the user and extended arbitrarily. For example, in pseudocode, a user should be able to say something like this:

    package Dog;

    attr fluffy {
        causes numeric contexts to fail;
        results in stringification appending "--and fluffy";
    }

Then, if a user does the following:

    my Dog $spot :fluffy = "happy";
    print "$spot";    # "happy--and fluffy";
    $spot++;          # fails

So, the declaration of C<:fluffy> would trigger the execution of a specific attribute handler, which could make the necessary changes to the variable.

This would allow B<RFC 188>, which proposes new C<private> and C<public> keywords, to instead be implemented as attributes.
This is perhaps more appropriate since these do not alter lexical scope (unlike C<my> and C<our>), but rather change properties of the variables themselves:

    package __ALL__;   # some type of builtin global declaration

    attr private {
        attachable to hashes and hash keys only;
        marks entire hash as non-autovivifying;
        marks specific entry as private to the package;
        allows duplication of keys in different packages;
        leaves any other entries public;
    }

    attr public {
        attachable to hashes and hash keys only;
        marks specific entry as accessible by all packages;
        would take the entry out of the package symbol table;
    }

In your code, then, you could use the C<:private> and C<:public> attributes to modify your variables:

    sub new {
        my ($class, %self) = @_;
        bless \%self :private, $class;
        $self{seed} = rand;            # dies, can't autovivify
        $self{seed} :private = rand;   # okay
        $self{seed} = rand;            # now okay
    }

In addition, perhaps we have a BigInt class that we need to be able to modify:

    package BigInt;

    attr 128bit {
        attachable to any variable;
        causes exception if 128 bits not supported;
        results in huge memory preallocation;
        does other neato stuff too;
    }

Again, in your code:

    my BigInt $x :128bit;

This would invoke the attribute to modify the variable's properties internally.

By having some type of attribute-specific declaration method, the attribute system could be modifiable at will, allowing for native access to variable manipulation without the need to compile a new version of Perl. This would allow those who want to warp Perl OO into Java or Python or C++ to do so without these features having to be either widely used or embedded in core.

A base class could simply define attribute handlers which other classes could then inherit from. Attributes would be inherited just like subs.

Attributes are not necessarily tied to C<my> or C<our> declarations; see B<RFC 279> for details.
=head1 IMPLEMENTATION

The easiest way I see is to change the C<attributes> pragma into a pre-declaration pragma instead, something like:

    package Foo;

    use attributes fluffy  => 'DEFAULT',
                   size    => \&SIZE_ATTR,
                   UNKNOWN => \&ATTR_HANDLER;

So, the declaration of C<:fluffy> on an instance of C<Foo> would just result in it being stored as text retrievable via C<attributes::get> or some other means.

On declaration of C<:size('big')>, though, the C<SIZE_ATTR(\$var, 'big')> sub from the class would be called. That is, a reference to the variable being altered would be the first arg, and the attribute arguments would be passed in by value.

Any unknown attributes would be thrown to the C<ATTR_HANDLER> sub, which would take the name of the attribute as the first arg, and then the other args would look like any other handler. For example:

    my Foo $bar :baz('likely');   # Foo->ATTR_HANDLER('baz', \$bar,
                                  #
RFC 344 (v2) Elements of @_ should be read-only by default
This and other RFCs are available on the web at
  http://dev.perl.org/rfc/

=head1 TITLE

Elements of @_ should be read-only by default

=head1 VERSION

  Maintainer: John Tobey [EMAIL PROTECTED]
  Date: 28 Sep 2000
  Last Modified: 1 Oct 2000
  Mailing List: [EMAIL PROTECTED]
  Number: 344
  Version: 2
  Status: Frozen

=head1 ABSTRACT

Unprototyped subs should not be allowed to modify their callers' data simply by assigning to elements of the arg array.

=head1 COMMENTS ON FREEZE

This RFC generated no discussion in 3 days.

=head1 DESCRIPTION

In Perl 5, you can modify a caller's value by assigning directly to elements of C<@_>, like this:

    sub perp {
        $_[0] = 'ha!';
    }

    sub victim {
        my $num = 42;
        perp($num);
        print $num;    # prints ha!
    }

This form of passing arguments by reference is obsolete now that Perl has hard references and C<\$> in prototypes.

The feature is surprising. At least, I was surprised when I first learned you could modify a caller's value by assigning to $_[0].

The feature is confusing, since it means that the recommended convention for naming parameters (by assigning C<@_> to a C<my> list) alters semantics. For example, the following subs do I<not> have the same effect as above:

    sub perp1 {
        my $arg = shift;
        $arg = 'ha!';
    }

    sub perp2 {
        my ($arg) = @_;
        $arg = 'ha!';
    }

The Perl 5 (and older) behavior may preclude optimizations in compiled code. If a compiler knows that arguments are passed by value, it may generate code that places the value directly in a register or on the stack. With the Perl 5 semantics, it would normally have to pass a pointer to each scalar and dereference the pointer to obtain the argument's value.

I think this change should not affect subs with a prototype, so the examples in L<perlsub/Prototypes> would still work. People who bother to use prototypes and sub attributes should know what they are getting into. The prototype and attribute system will, I think, give plenty of opportunities to specify compiler optimizations. Only the default case should be changed.
=head1 IMPLEMENTATION

A fascist implementation would emit a compile-time error any time C<@_> or one of its elements was assigned to, had a reference taken to it, etc. A friendlier version would automatically copy the arg value to a new temporary.

This change would mean a moderate Perl 5 compatibility breakage. The Perl-5-to-Perl-6 converter could insert whatever trick is used to obtain the Perl 5 behavior (perhaps a C<(@)> prototype) when it detects (or suspects) argument modification.

=head1 REFERENCES

RFC 154: Simple assignment lvalue subs should be on by default

L<perlsub>

The C-- homepage - http://www.cminusminus.org/
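
The difference between the aliasing behaviour and the "friendlier" copying behaviour described above can be seen in present-day Perl 5. The following is only an illustrative sketch: C<perp_copy> is a hypothetical name, and the copy is done by hand rather than by the interpreter as the RFC proposes.

```perl
use strict;
use warnings;

# Perl 5 behaviour: elements of @_ are aliases to the caller's variables.
sub perp {
    $_[0] = 'ha!';
}

# The copying semantics, emulated by hand (hypothetical helper name).
sub perp_copy {
    my @args = @_;       # copies the values; @args is not aliased
    $args[0] = 'ha!';
    return $args[0];
}

my $num = 42;

perp_copy($num);
my $after_copy = $num;   # caller's value is untouched

perp($num);
my $after_alias = $num;  # caller's value has been overwritten

print "after copy: $after_copy, after alias: $after_alias\n";
```

Under the proposal, the first call's behaviour would become the default for unprototyped subs.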
RFC 160 (v3) Function-call named parameters (with compiler optimizations)
This and other RFCs are available on the web at
  http://dev.perl.org/rfc/

=head1 TITLE

Function-call named parameters (with compiler optimizations)

=head1 VERSION

  Maintainer: Michael Maraist [EMAIL PROTECTED]
  Date: 25 Aug 2000
  Last Modified: 30 Sep 2000
  Mailing List: [EMAIL PROTECTED]
  Number: 160
  Version: 3
  Status: Frozen

=head1 CHANGES

Finalized various features by removing many of the options (greatly simplified the RFC). Unified the goals with those of RFC 176 and RFC 273.

=head1 ABSTRACT

Function parameters and their positions can be ambiguous in function-oriented programming. Hashes offer tremendous help in this realm, except that error checking can be very tedious. Also, hashes, in general, take a performance hit.

The goal is to enhance functionality / convenience / performance where possible in regard to named parameters, with a minimum of changes. And, at the same time, allow this to be a completely optional and virtually transparent process.

The following is an in-depth analysis of various ways of accomplishing these goals.

=head1 DESCRIPTION

The current method of parameter prototypes only fulfills a tiny niche, which is mainly to offer compile-time checking and to disambiguate context ( as in sub foo($) { }, or sub foo($) { } ). No support, however, is given to hashes, even though they are one of perl's greatest strengths. We see them pop up in parameterized function calls all over the place (CGI, tk, SQL wrapper functions, etc). As above, however, it is left to the coder to check the existence of required parameters, since in this realm the current prototypes are of no help. It should not be much additional work to provide an extension to prototypes that allows the definition of hashes.

The following is a complex example of robust code:

    #!/usr/bin/perl -w
    use strict;

    # IN: hash:
    #   a => '...' # req
    #   b => '...' # req, defined
    #   c => '...' # req, 0 <= c <= MAX_C
    #   d => '..'  # opt
    #   e => '..'  # opt
    #   f => '..'  # opt
    # OUT: xxx
    sub foo {
        my $self = shift;
        my %args = @_;

        # Requires $a
        my $a;
        die "No a provided" unless exists $args{a};
        $a = $args{a};

        # Requires non-null $b
        my $b;
        die "invalid b" unless exists $args{b} && defined ($b = $args{b});

        # Requires non-null and bounded $c
        my $c;
        die "Invalid c" unless exists $args{c} && defined ($c = $args{c})
            && ($c >= 0 && $c < $MAX_C);

        my ( $d, $e, $f ) = @args{ qw( d e f ) };
        ...
    } # end foo

Becomes:

    sub foo($%) : method required_fields(a b c) fields(d e f) doc(<<EOS) {
    # IN: hash:
    #   a => '...' # req; Do some A
    #   b => '...' # req, defined; Do some B
    #   c => '...' # req, 0 <= c <= MAX_C; Do some C
    #   d => '..'  # opt; Do some D
    #   e => '..'  # opt; Do some E
    #   f => '..'  # opt; Do some F
    # OUT: xxx
    EOS
        my $self = shift;
        my %args : fields(a b c d e f) = @_;
        # produce optimized hash that is already pre-allocated at compile-time.

        # Requires non-null $args{b}
        die "invalid b" unless defined $args{b};

        # Requires non-null and bounded $args{c}
        die "invalid c" unless defined $args{c}
            && ($args{c} >= 0 && $args{c} < $MAX_C);
        ...
    } # end foo

    $obj->foo( c => 3, b => 2, f => 8, a => 1 );
    # Note the out-of-order arguments, and the mixture of optional fields

    foo( $obj, a => 1, b => 2, c => 3 );   # still totally legal
    foo( a => 1, b => 2 );                 # compiler-error (invalid num-args)
    foo( 1,2,3,4,5,6,7 );                  # compiler-error, missing args a, b and c
    foo(a,1,b,2,c,3,$obj);                 # compiler-error, missing args a, b and c
                                           # (since they're offset by one)

    my @args = ( a => 1, b => 2, c => 3 );
    $obj->foo( @args );      # checking deferred to run-time. Will be ok.

    my @bad_args = ( b => 8, e => 4 );
    $obj->foo( @bad_args );  # checking deferred to run-time. Will fail.

Essentially, perl's compiler can be put to use for hashed-function calls in much the same way as pseudo-hashes work for structs/objects. Making this a compile-time check would drastically reduce run-time errors in code (that used hash-based parameters). It would also make the code both more readable AND more efficient.
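
For comparison, the checks that the proposed C<required_fields> and C<fields> attributes would perform at compile time can be approximated today at run time. The sketch below is an assumption-laden illustration only: C<check_args> is a hypothetical helper, not part of Perl or of this RFC, and it gives none of the compile-time or performance benefits argued for above.

```perl
use strict;
use warnings;

# Hypothetical helper: validate a named-argument list against lists of
# required and optional field names, dying on the first problem found.
sub check_args {
    my ($required, $optional, @pairs) = @_;
    die "odd number of named arguments\n" if @pairs % 2;
    my %args    = @pairs;
    my %allowed = map { $_ => 1 } @$required, @$optional;
    for my $name (@$required) {
        die "missing required argument '$name'\n" unless exists $args{$name};
    }
    for my $name (sort keys %args) {
        die "unknown argument '$name'\n" unless $allowed{$name};
    }
    return %args;
}

sub foo {
    my $self = shift;
    my %args = check_args([qw(a b c)], [qw(d e f)], @_);
    return join ',', map { "$_=$args{$_}" } sort keys %args;
}

# Out-of-order arguments with one optional field: accepted.
my $ok = foo('obj', c => 3, b => 2, f => 8, a => 1);

# Missing required fields: rejected, but only when the call is executed.
my $err = eval { foo('obj', b => 8, e => 4); 1 } ? '' : $@;

print "$ok\n$err";
```

The second call corresponds to the C<@bad_args> case above, where checking is necessarily deferred to run time.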
For readability, perl can be queried for the list of allowable options as well as general documentation. In the above, the listing of input options would have been redundant, for both the code reader and the run-time query, but was provided for completeness.

Note also that the above is compatible with the existing structure. In fact, foo required the old-style prototype to distinguish the "self" variable from the general hash arguments. The use of the attribute "method" was optional, and could be used in the auto-generation of a $SELF variable. At the very least, it allows a run-time description of what the first argument really is.

An important thing to note is that we're not changing the functionality of execution. Perl subs still look and feel like old-style subs to the user. They simply act as if additional
RFC 115 (v3) Overloadable parentheses for objects
This and other RFCs are available on the web at
  http://dev.perl.org/rfc/

=head1 TITLE

Overloadable parentheses for objects

=head1 VERSION

  Maintainer: pdl-porters team [EMAIL PROTECTED]
  Date: 16 Aug 2000
  Last Modified: 30 Sep 2000
  Mailing List: [EMAIL PROTECTED]
  Number: 115
  Version: 3
  Status: Frozen

=head1 DISCUSSION

There was hardly any response to this RFC. That can be interpreted in a number of ways C<;)>. RFC 231 suggests an alternative way to get what we tried to get from this RFC: syntactical ease of making slices of multidim array objects. RFC 231 suggests

    $a->[1..3; 4] * $b->[];

From our point of view this is not too bad, but we would like to get rid of the requirement for C<-E<gt>>. An extended C<tie> mechanism with the possibility of overloaded operations on those arrays might do it. But then again, if we get all those polymorphic methods we could also allow the one suggested by this RFC -- TIMTOWTDI.

=head1 ABSTRACT

This RFC proposes syntactic support for a polymorphic method that can be defined for blessed references. It would allow the parentheses C<()> to be used for a variety of convenient syntaxes for objects. This is principally motivated by PDL's need for a simple syntax for PDL object slicing.

=head1 DESCRIPTION

=head2 Motivation

Currently, PDL objects have to use quite an unwieldy syntax for the all-important slicing and indexing. In perl5 the syntax is

    $n1 = $n-1;                          # since we need to stringify
    $y = $x->slice("0:$n1:4");
    $y = $x->slice("0:${\($n-1)}:4");    # even more horrible

This should be contrasted with the less cluttered syntax offered by numerical Python and commercial systems such as Matlab and IDL:

    y = x[0:n-1:4];

In perl we desire to say:

    $y = $x(0:$n-1,4);
    # or, depending on the choice of separator
    $y = $x(0:$n-1;4);   # see also RFC 169

Note that we need to keep l-value subs in perl6 to avoid related types of syntactical clumsiness if C<$x()> can invoke a subroutine (see below).
    $x(0:$n-1:4) *= 2;

should be allowed, as well as the long form

    $x->slice(0:$n-1:4) *= 2;

L<RFC 81> and related RFCs propose introducing ranges as part of the syntax cleanup; this RFC proposes C<()> overloading for objects.

=head2 Overloading C<()>'s

If classes allowed the definition of a method that is invoked with a syntax akin to the one used with sub refs (but without the need for the C<&> dereferencing), we could have our cake and eat it too. The parentheses method notion seems general enough to be useful for other classes.

=head2 Examples:

A possible scenario could be as follows. The class PDL defines a default method that is invoked when a syntax like

    $variable_name(args)

is used, where C<$E<lt>variable_nameE<gt>> is supposed to be an instance of the class in question (here PDL).

The PDL package would contain the definition of the method C<PARENTHESES> (a polymorphic method along the lines of RFC 159):

    package PDL;

    sub PARENTHESES {
        my $this = shift;
        $this->slice(@_);
    }

This would allow for the creation of a variety of powerful syntaxes for different kinds of objects. For example, in PDL we might wish to use C<()> to slice by index value, and in a derived class PDL::World C<()> would slice by physical real-value coordinate.

One could think of this as saying that C<()> is an operator which can be overloaded.

Note that we still would like to have such a feature even if Perl 6 provided its own multi-dim array type. It would give us the freedom to provide an object-oriented interface to these arrays and/or derive classes from it I<and> have convenient slicing syntax.

=head1 IMPLEMENTATION

Changes to the parser to allow the new syntax.

We are not aware of any other conflicts with existing or proposed language features.
=head1 SEE ALSO

RFC 159: True Polymorphic Objects

RFC 117: Perl syntax support for ranges

RFC 81: Lazily evaluated list generation functions

RFC 169: Proposed syntax for matrix element access and slicing

RFC 231: Data: Multi-dimensional arrays/hashes and slices

L<PDL> (http://pdl.sourceforge.net/PDLdocs)

http://pdl.perl.org

Numerical Python: http://starship.python.net/~da/numtut/
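
The closest existing Perl 5 mechanism to the C<PARENTHESES> method is overloading the code-dereference operator C<&{}>, which lets an object be called as C<< $x->(...) >>. It still needs the arrow that this RFC wants to remove. The following is a minimal sketch under that assumption; the C<Slicer> class and its simple C<slice> method are hypothetical stand-ins for PDL, not PDL's real API.

```perl
package Slicer;   # hypothetical stand-in for a PDL-like class
use strict;
use warnings;

# When the object is used as a code reference, hand back a closure
# that forwards its arguments to slice().
use overload '&{}' => sub {
    my $self = shift;
    return sub { $self->slice(@_) };
};

sub new {
    my ($class, @data) = @_;
    return bless { data => [@data] }, $class;
}

# Return the elements from $start to $end (inclusive), every $step-th one.
sub slice {
    my ($self, $start, $end, $step) = @_;
    $step ||= 1;
    my @picked;
    for (my $i = $start; $i <= $end; $i += $step) {
        push @picked, $self->{data}[$i];
    }
    return @picked;
}

package main;

my $x = Slicer->new(0 .. 9);
my @y = $x->(0, 8, 4);    # calls $x->slice(0, 8, 4)
print "@y\n";
```

Because the object is a blessed hash, accessing C<< $self->{data} >> inside the class does not trigger the C<&{}> overload.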
RFC 278 (v2) Additions to 'use strict' to fix syntactic ambiguities
This and other RFCs are available on the web at
  http://dev.perl.org/rfc/

=head1 TITLE

Additions to 'use strict' to fix syntactic ambiguities

=head1 VERSION

  Maintainer: Nathan Wiger [EMAIL PROTECTED]
  Date: 24 Sep 2000
  Last Modified: 30 Sep 2000
  Mailing List: [EMAIL PROTECTED]
  Number: 278
  Version: 2
  Status: Frozen

=head1 ABSTRACT

Several RFCs and many people have voiced concerns with different parts of Perl's syntax. Most take issue with syntactic ambiguities and the inability to easily tokenize Perl.

This RFC shows how these all boil down to three central issues, and how they can be solved with some simple additions to C<use strict>.

By default, Perl should remain as flexible as possible. By adding these flags to C<use strict>, those who desire them can have all the benefits of a stricter syntax, without hurting those that like these features.

=head1 DESCRIPTION

=head2 The Problems

=head3 Indirect Objects

RFC 244 proposes eliminating the bareword indirect object syntax because this:

    print STDERR @stuff;

Can be parsed as either of these:

    STDERR->print(@stuff);
    print(STDERR(@stuff));

Depending on your usage of C<STDERR> in other places in your program. However, many of us like writing:

    $q = new CGI;

Quite a bit, and consider this DWIMish.

=head3 Barewords vs. Functions

RFC 244 and others mention several problems with barewords such as:

    Class->stuff(@args);    # Class()->stuff or 'Class'->stuff ?

Again, the fact that Perl can figure this out correctly is quite DWIMish, and this functionality should not be removed by default.

=head3 Special Cases

Many special cases abound, such as the bare C<//> mentioned in RFC 135. Again, this is stuff that makes Perl fun, and should not be taken out of the language.

=head2 The Solutions

At first, these may not seem related. However, they very much are, and in fact all boil down to only three issues which can be resolved with additions to C<use strict>.
=head3 Function Parens - C<use strict 'words'>

This imposes a very simple restriction: barewords are not allowed. They must be either quoted or specified with parens to indicate they are functions. Note this solves the C<%SIG> problem from Camel:

    use strict 'words';
    $SIG{PIPE} = Plumber;      # syntax error
    $SIG{PIPE} = "Plumber";    # use main::Plumber
    $SIG{PIPE} = Plumber();    # call Plumber()

In addition, this also forces users to disambiguate certain functions:

    use strict 'words';
    name->stuff(@args);        # syntax error
    'name'->stuff(@args);      # 'name'->stuff
    name::->stuff(@args);      # ok too, same thing
    name()->stuff(@args);      # name()->stuff

    $result = value + 42;      # syntax error
    $result = value() + 42;    # value() + 42
    $result = value( + 42);    # value(42)
    $result = 'value' + 42;    # ok, if you think this is Java...

It's simple: barewords are not allowed.

=head3 Indirect Objects - C<use strict 'objects'>

Another major problem is ambiguous indirect objects. Under C<use strict 'objects'>, the indirect object I<must> be surrounded by braces:

    use strict 'objects';
    no strict 'words';
    print STDERR @stuff;        # print(STDERR(@stuff))
    print 'STDERR' @stuff;      # syntax error
    print {'STDERR'} @stuff;    # 'STDERR'->print(@stuff)
    print $fh @junk;            # syntax error
    print {$fh} @junk;          # $fh->print(@junk)

This eliminates the possibility of ambiguity with indirect objects. When combined with C<strict 'words'>, code becomes even less ambiguous:

    use strict qw(words objects);
    $q = new 'CGI';               # syntax error
    $q = new {'CGI'};             # 'CGI'->new
    $q = new ('CGI');             # new('CGI')
    $q = new (CGI());             # new(CGI())
    $q = new 'CGI' @args;         # syntax error
    $q = new {'CGI'} (@args);     # 'CGI'->new(@args)
    $q = new (CGI (@args));       # new(CGI(@args))

=head3 Syntactic Problems - C<use strict 'syntax'>

There are many other "little ambiguities" throughout Perl. Adding C<strict 'syntax'> would remove these and require the user to specify them explicitly. In this category fits the bare C<//> problem mentioned in RFC 135, as well as several common "bugs" (mistakes).
Under this rule, the following would apply:

   1. No more // by itself, you must use m//

   2. Trailing conditionals would require parens

   3. Precedence other than for basic math and boolean ops would not apply

This is designed to force you to write clean, unambiguous code that borders on being non-Perlish:

    use strict 'syntax';
    next if /^#/ || /^$/;           # syntax error
    next if m/^#/ || m/^$/;         # syntax error
    next if (m/^#/ || m/^$/);       # ok

    use strict 'syntax';
    $data = $a + $b / $c - $d || $default or die;        # no way
    $data = ($a + $b / $c - $d) || $default or die;      # nope
    ($data = ($a + $b / $c - $d) || $default) or die;    # ok

Basically, the idea is to impose a truly unambiguous style so that people don't get carried away with precedence and special cases.

=head2 Combining all these together

Let's look at
RFC 279 (v2) my() syntax extensions and attribute declarations
This and other RFCs are available on the web at
  http://dev.perl.org/rfc/

=head1 TITLE

my() syntax extensions and attribute declarations

=head1 VERSION

  Maintainer: Nathan Wiger [EMAIL PROTECTED]
  Date: 24 Sep 2000
  Last Modified: 30 Sep 2000
  Mailing List: [EMAIL PROTECTED]
  Number: 279
  Version: 2
  Status: Frozen

=head1 ABSTRACT

This RFC fleshes out variable declarations with C<my>, and also proposes a way to assign attributes without the need for a C<my> anywhere. Much of this stuff has been hinted at, so this is just a formalization.

Please note this entire document is B<optional>, intended for those that need this type of control. Perl is not a B&D language (by default, at least :).

=head1 DESCRIPTION

Camel-3 shows some interesting hints of what's been proposed for C<my> declarations:

    my type $var :attr1 :attr2 = $value;

And we all know that you can use C<my> to declare a group of variables:

    my($x, $y, $z);

Here are the issues:

   1. How do the two jibe together?

   2. Should it be possible to assign attributes to individual elements of hashes/arrays? (yes)

=head2 Cohesive C<my> syntax

This RFC proposes that you be able to group multiple variables of the same type within parens:

    my int ($x, $y, $z);
    my int ($x :64bit, $y :32bit, $z);

It seems most logical that:

   1. The type will be the same across variables; this is common usage in other languages because it makes sense.

   2. The attributes will be different for different variables.

As such, multiple attributes can be assigned and grouped flexibly:

    my int ($x, $y, $z) :64bit;               # all are 64-bit
    my int ($x, $y, $z :unsigned) :64bit;     # plus $z is unsigned

Note that multiple types cannot be specified on the same line. To declare variables of multiple types, you must use separate statements:

    my int ($x, $y, $z) :64bit;
    my string ($firstname, $lastname :long);

This is consistent with other languages and also makes parsing realistic.

=head2 Assigning attributes to individual elements of hashes/arrays

This is potentially very useful.
":laccess", ":raccess", ":public", ":private", and others spring to mind as potential candidates for this. This RFC proposes that in addition to attributes being assigned to a whole entity on declaration:

    my int @a :64bit;      # makes each element a 64-bit int
    my string %h :long;    # each key/val is long string

They can also be declared on individual elements, without the need for C<my> or C<our>:

    $a[0] :32bit = get_val;               # 32-bit
    $r->{name} :private = "Nate";         # privatize single value
    $s->{VAL} :laccess('data') = "";      # lvalue autoaccessor

Assigning attributes to individual elements has the advantage over keywords of allowing them to be grouped:

    $self->{name} :public :roaccess('getname') = "Nathan Wiger";

However, a problem arises in how to assign types to singular elements, since this requires a C<my>:

    my int $a[0] :64bit;         # just makes that single element
                                 # a lexically-scoped 64-bit int?
    my string $h{name} = "";     # cast $h{name} to string, rescope %h?

Currently, lexical scope has no meaning for individual elements of hashes and arrays. However, assigning attributes and even types to individual elements seems useful. There are two ways around this that I see:

   1. On my'ing of an individual hash/array element, the entire hash/array is rescoped to the nearest block.

   2. Only the individual element is rescoped, similar to what happens when you do this:

         my $x = 5;
         {
            my $x = 10;
         }

Either of these solutions is acceptable, and they both have their pluses and minuses. The second one seems more consistent, but is potentially extremely difficult to implement.

=head1 IMPLEMENTATION

Hold on.

=head1 MIGRATION

None. This introduces a more flexible syntax but does not break old ones.

=head1 REFERENCES

RFC 337: Common attribute system to allow user-defined, extensible attributes

RFC 319: Transparently integrate C<tie>

Camel for the C<my> syntax.

C<attributes> man page for details on attributes.
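
Part of the grouping described above already works in Perl 5 through the C<attributes> pragma: an attribute list after a parenthesised C<my> list is applied to every variable in the group. The sketch below assumes made-up attribute names (C<Wide>, C<Unsigned>) that only get recorded in a hash; it does not implement types, per-variable attributes inside the parens, or attributes on individual elements, none of which Perl 5 supports.

```perl
use strict;
use warnings;

my %seen;   # variable reference (stringified) => attributes applied to it

# Perl 5 calls this hook for each scalar declared with attributes in this
# package. Returning an empty list means every attribute was accepted.
sub MODIFY_SCALAR_ATTRIBUTES {
    my ($package, $ref, @attrs) = @_;
    push @{ $seen{$ref} }, @attrs;
    return;
}

my ($x, $y) : Wide;          # one attribute applied to the whole group
my $z : Wide : Unsigned;     # several attributes on a single variable

my @x_attrs = @{ $seen{\$x} };
my @y_attrs = @{ $seen{\$y} };
my @z_attrs = @{ $seen{\$z} };

print "x: @x_attrs; y: @y_attrs; z: @z_attrs\n";
```

The attribute names start with an upper-case letter because all-lower-case names are reserved for Perl itself and draw a warning.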
RFC 320 (v2) Allow grouping of -X file tests and add C<filetest> builtin
This and other RFCs are available on the web at
  http://dev.perl.org/rfc/

=head1 TITLE

Allow grouping of -X file tests and add C<filetest> builtin

=head1 VERSION

  Maintainer: Nathan Wiger [EMAIL PROTECTED]
  Date: 25 Sep 2000
  Last Modified: 30 Sep 2000
  Mailing List: [EMAIL PROTECTED]
  Number: 320
  Version: 2
  Status: Frozen

=head1 ABSTRACT

Currently, file tests cannot be grouped, resulting in very long expressions when one wants to check to make sure something is a readable, writeable, executable directory:

    if ( -d $file && -r $file && -w $file && -x $file ) { ... }

It would be really nice if these could be grouped instead:

    if ( -drwx $file ) { ... }

Notice how much easier this is to read and write.

=head1 NOTES ON FREEZE

Everyone liked it (I even heard Tom say so :). The only concern raised was that it makes the direct negation of subs impossible because of the way C<-[a-zA-Z]+> would have to be tokenized. See the MIGRATION and REFERENCES sections for details on this.

=head1 DESCRIPTION

=head2 File Test Grouping

See above. Multiple file tests, when grouped, should be ANDed together. This RFC does not propose a way to OR them, since usage like this:

    if ( -d $file || -r $file || -w $file || -x $file ) { ... }

Is highly uncommon, to say the least. Notice this has the nice side effect of eliminating the need for C<_> in many cases, since this:

    if ( -d $file && -r _ && -w _ && -x _ ) { ... }

Can simply be written as a single grouped file test, as shown above. If you need to check for more complex logic, you still have to do that separately:

    if ( -drwx $file and ! -h $file ) { ... }

This is the simplest and also probably the clearest way to implement this.

=head2 New C<filetest> Builtin

This RFC also proposes a new C<filetest> builtin that is actually what is used for these tests. The C<-[a-zA-Z]+> form is simply a shortcut to this builtin, just like C<< <> >> is a shortcut to C<readline>. So:

    if ( -rwdx $file ) { ... }

Is really just a shortcut to the C<filetest> builtin:

    if ( filetest $file, 'rwdx' ) { ... }

Either form could be used, depending on the user's preferences (just like C<readline>). Note that this C<filetest> builtin is intended to make the implementation of this easier, but if it doesn't, then it's unnecessary and should not be added.

=head1 IMPLEMENTATION

This would involve making C<-[a-zA-Z]+> a special token in all contexts, serving as a shortcut for the C<filetest> builtin. This means that you will no longer be able to directly negate subroutine calls; see below.

=head1 MIGRATION

There is a subtle trap if you are negating subroutines. You might write:

    $result = -drwx $file;

And expect this to be parsed like this:

    $result = - drwx($file);

However, usage such as this is not common, since negating subs only makes sense in a very few cases. In Perl 6, instead of writing:

    $num = -exp($foo);

You would have to write:

    $num = - exp($foo);

The author personally feels this is not too much of a burden for the benefit of grouped filetests. Note that this is already required for people that use subs named r, w, d, or any other filetest character.

To fix this issue, the p52p6 translator simply has to look for C<-([a-zA-Z]{2,})> and replace it with C<- $1>, since injecting a single space will break up the token. See the links below for more details on the discussions of this.

=head1 REFERENCES

http://www.mail-archive.com/perl6-language%40perl.org/msg04649.html

http://www.mail-archive.com/perl6-language%40perl.org/msg04658.html
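
The semantics proposed for C<filetest> (every listed test must pass) can be sketched as an ordinary Perl 5 subroutine. This is an illustration only: the sub below is a hypothetical user-level function, not a builtin, it supports just a handful of test letters, and it needs parentheses around its arguments. Perl 5.10 later added stacked file tests (C<-d -r -w -x $file>), which cover the same need with a different spelling.

```perl
use strict;
use warnings;

# Map each supported test letter to the corresponding Perl 5 file test.
my %TEST = (
    d => sub { -d $_[0] },
    e => sub { -e $_[0] },
    f => sub { -f $_[0] },
    r => sub { -r $_[0] },
    w => sub { -w $_[0] },
    x => sub { -x $_[0] },
);

# Hypothetical filetest(): true only if every listed test passes (AND).
sub filetest {
    my ($file, $tests) = @_;
    for my $letter (split //, $tests) {
        my $check = $TEST{$letter}
            or die "unsupported file test '-$letter'\n";
        return 0 unless $check->($file);
    }
    return 1;
}

# The current directory is a directory but not a plain file.
my $is_dir          = filetest('.', 'd');
my $is_dir_and_file = filetest('.', 'df');

print "d: $is_dir, df: $is_dir_and_file\n";
```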
RFC 327 (v3) C<\v> for Vertical Tab
define isPSXSPC(c)(isSPACE(c) || (c) == '\v')
+	((c) == ' ' || (c) == '\t' || (c) == '\n' || (c) =='\r' || (c) == '\f' \
+	|| (c) == '\v')
+#define isPSXSPC(c)	isSPACE(c)
 #define isBLANK(c)	((c) == ' ' || (c) == '\t')
 #define isDIGIT(c)	((c) >= '0' && (c) <= '9')
 #ifdef EBCDIC
--- t/op/pat.t.orig	Tue Aug 29 13:54:13 2000
+++ t/op/pat.t	Tue Sep 26 14:27:14 2000
@@ -1064,15 +1064,14 @@
 	cr	=> "\r",
 	lf	=> "\n",
 	ff	=> "\f",
-# The vertical tabulator seems miraculously be 12 both in ASCII and EBCDIC.
-	vt	=> chr(11),
+	vt	=> "\v",
 	false	=> "space"
 	);

 my @space0 = sort grep { $space{$_} =~ /\s/ } keys %space;
 my @space1 = sort grep { $space{$_} =~ /[[:space:]]/ } keys %space;
 my @space2 = sort grep { $space{$_} =~ /[[:blank:]]/ } keys %space;

-print "not " unless "@space0" eq "cr ff lf spc tab";
+print "not " unless "@space0" eq "cr ff lf spc tab vt";
 print "ok $test\n";
 $test++;

=back

To be strict, the perl5 to perl6 converter would need to

=over 4

=item *

replace C<\v> with C<v> in interpolated strings.

=item *

replace C<\s> with C<[\t\n\r\f ]> and C<\S> with C<[^\t\n\r\f ]> in regular expressions.

=back

It might be considered acceptable to omit either or both conversions if the number of programs that would break were negligible.

=head1 REFERENCES

perlop manpage for interpolation

perlre manpage for \s and \S
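
The effect of the second conversion can be checked with explicit character classes, which behave the same regardless of what C<\s> means in a given perl. This is a small sketch that assumes an ASCII platform, where the vertical tab is code point 11; the variable names are illustrative only.

```perl
use strict;
use warnings;

my $vt = "\x0B";    # vertical tab, code point 11 in ASCII

# The class the converter would substitute for the old \s ...
my $space_without_vt = qr/[\t\n\r\f ]/;
# ... and the same class extended with the vertical tab.
my $space_with_vt    = qr/[\t\n\r\f\x0B ]/;

my $old = $vt =~ $space_without_vt ? 1 : 0;
my $new = $vt =~ $space_with_vt    ? 1 : 0;

print "without VT: $old, with VT: $new\n";
```

Code that relied on a vertical tab not counting as whitespace keeps its behaviour with the first class.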
RFC 355 (v1) Leave $[ alone.
This and other RFCs are available on the web at http://dev.perl.org/rfc/ =head1 TITLE Leave $[ alone. =head1 VERSION Maintainer: Fred Heutte [EMAIL PROTECTED] Date: 29 September 2000 Mailing List: [EMAIL PROTECTED] Number: 355 Version: 1 Status: Developing =head1 ABSTRACT The array base directive $[ is not just deprecated, it is dissed. But setting $[ = 1 is the mathematically correct method for array addressing and makes it easier for ordinary mortals to do basic tasks with Csubstr(), array addressing and the like. On the other hand, the strong C legacy of starting every integral sequence with 0 is culturally bound and should not be disturbed. Therefore, the correct approach is to make it explicit policy that the status quo will continue. =head1 DESCRIPTION The first Camel states (p. 68), The special variable $[ is the current array base, ordinarily 0, as we mentioned. You can change it to 1 if you prefer the FORTRAN approach, and then $#whatever will be equal to @whatever. Most Perl programmers prefer to leave $[ at 0. Those like myself who use Perl extensively alongside database programming find it highly annoying to have two different array bases when dealing with the same data. $[ = 1 provides a nice out for us, since we like to $zipcode = $myaddressline[7] when the Zip Code really is the 7th element, or $lastname = substr($myrecord, 26, 15) for surnames starting at the 26th character of a string. C programmers find this highly annoying since they are attuned since time immemorial to thinking in binary or vector terms where the count always starts with 0. Given both the machine and cultural realities, it is correct for Perl to have this as the default behavior. However, $[ always existed for us lesser mortals who prefer the ordinary usage with Crindex(), array subscripts and other similar work. This allows us to align our work across the landscape of work contexts from Perl to, say, SQL. (For those who are old-fashioned, from 3GL to 4GL and beyond). 
The second Camel takes things much further. $[ is not only deprecated, it is dissed in two separate footnotes on p. 49: *For historical reasons, the special variable can be used to change the array base. Its use is not recommended, however. In fact, this is the last we'll even mention it. Just don't use it. +Unless you've diddled the deprecated $[ variable. Er, *this* is the last time we'll mention it ... This qualifies as a uniquely dogmatic position in a language and culture otherwise refreshingly free of rigid dogmatism. For those who prefer the array base to be 0 at all times, the issue may be one of purity, and the value to those of us who like $[ = 1 to be mere convenience. But convenience is a virtue too. If I want to grab the 16th member of an array, I don't want to have to remember always to subtract one in my head and do something with $myarray[15]. This feels unnatural, which is a clue that perhaps it is. This kind of arithmetic is what machines are good for, and it is an inefficient use of human attention to do so, at least for some of us. Furthermore, it is mathematically more valid to use an option base of 1 for many common tasks. In Joe Celko's book "Data Databases: Concepts in Practice" (Morgan Kaufmann, 1999), which I highly recommend in any event, he has this to say (p. 83): An ordinal number represents a position (first, second, third, ...) in an ordering ... This question of position leads to another debate: Is there such a thing as the zeroth ordinal number? Computer people like to have a zeroth position because it is handy in implementing data structures with relative positioning. For example, in the C language and many versions of BASIC, arrays start with the element zero. This allows the compiler to locate an array element with the displacement formula: base address + (element size * array index) The idea of a zeroth ordinal number is a mild mathematical heresy. 
To be in the zeroth position in a queue is to have arrived in the serving line before the first person in line. In the end, the correct implementation is the one Perl has always had: make array base 0 the default, for the benefit of programmers and systems with that perspective, but allow a different option base for those who prefer it, partly for convenience and productivity reasons, and partly to retain a state of grace mathematically speaking. =head1 IMPLEMENTATION None. We won't even mind snide asides and patronizing glances if you just leave $[ alone. =head1 BUGS Something changed in the behaviors related to $[ in 5.6, at least in my usual world (ActiveState). Specifically, $[ is silently overridden in the following case: $[ = 1 ; @files = ($ARGV[1]) ; Then substr("ABCDE", 3, 1) returns "D". Ouch. I mentioned this to Gurusamy Sarathy at Usenix in San Diego and he didn't know about it, but it's replicable, and whatever causes this should be fixed or
RFC 356 (v1) Dominant Value Expressions
This and other RFCs are available on the web at http://dev.perl.org/rfc/ =head1 TITLE Dominant Value Expressions =head1 VERSION Maintainer: Glenn Linderman [EMAIL PROTECTED] Date: 24 Sep 2000 Mailing List: [EMAIL PROTECTED] Number: 356 Version: 1 Status: Developing =head1 ABSTRACT An aid to determining if an input value has an impact on the result of an expression whole program. Can also be used for Perl poetry. =head1 DESCRIPTION This is an optional feature; to turn it on, "use domination;" is suggested. When use domination is in scope, two new functions are available, and new rules for expression evaluation obtain. Each of these is described in a subsection below. =head2 Domination pragma The "use domination;" pragma enables the rest of the functionality. It should be scoped, affecting the current and nested blocks. The "no domination;" pragma would turn off the effect of "use domination;" for the current and nested blocks. If a dominant value is encountered while "no domination;" is in effect, it is treated as "undef" by all scalar operators. =head2 Dominant function The dominant function takes a scalar argument, which is considered to be a weight parameter. The scalar argument is converted to numeric, if possible, with the resultant positive weight parameter producing a dominant value with that given weight. A scalar argument that cannot be automatically converted to numeric, or that produces a negative or zero numeric value produces undef as a result. =head2 Weight function The weight function takes a scalar argument. If the scalar argument is a dominant value, it returns its weight as a positive number. If the scalar argument is not a dominant value, the return of the weight function is zero. =head2 Expressions involving dominant values All scalar operations are affected by the presence of dominant values. If a scalar operation involves one or more dominant values, the result of the operation is the heaviest (by weight) dominant value among the operands. 
If no dominant values are supplied to the operation, the result of the operation is the same as it would be according to the usual definition of the operation. =head1 COMPATIBILITY New functionality, no compatibility issues. The new functionality only obtains if the "use domination;" pragma is in effect. =head1 IMPLEMENTATION The impact of dominant value expressions is pervasive, affecting all builtin scalar operators in a minor way. Any operators doing item by item operations on each scalar of a list or hash would be similarly affected (various RFCs exist for extending scalar operations to work on lists of values in "corresponding item" fashion). =head1 REFERENCES RFC 263: Add null() keyword and fundamental data type
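
The evaluation rule described above can be sketched in plain Perl 5. This is only an illustration under stated assumptions: the C<Dominant> class and the C<dom_apply> helper are hypothetical names that do not appear in the RFC, "converted to numeric, if possible" is approximated with C<Scalar::Util::looks_like_number>, and the operation is passed in as a code reference rather than hooked into the builtin operators.

```perl
use strict;
use warnings;

package Dominant;    # hypothetical container class, not part of the RFC

sub new { my ($class, $weight) = @_; return bless { weight => $weight }, $class }

package main;

use Scalar::Util qw(blessed looks_like_number);

# dominant(): a positive numeric argument yields a dominant value of that
# weight; anything else yields undef, as the RFC describes.
sub dominant {
    my ($w) = @_;
    return undef unless defined $w && looks_like_number($w) && $w > 0;
    return Dominant->new($w + 0);
}

# weight(): the weight of a dominant value, or zero for any other scalar.
sub weight {
    my ($v) = @_;
    return ( blessed($v) && $v->isa('Dominant') ) ? $v->{weight} : 0;
}

# dom_apply(): hypothetical stand-in for "any scalar operation". If any
# operand is dominant, the heaviest one is the result; otherwise the
# ordinary operation runs.
sub dom_apply {
    my ($op, @operands) = @_;
    my ($heaviest) = sort { weight($b) <=> weight($a) }
                     grep { weight($_) > 0 } @operands;
    return defined $heaviest ? $heaviest : $op->(@operands);
}

my $add = sub { $_[0] + $_[1] };
print dom_apply($add, 2, 3), "\n";                                # ordinary addition
print weight( dom_apply($add, dominant(3), dominant(7)) ), "\n";  # heaviest operand wins
```

A real implementation would have to sit inside every builtin scalar operator, which is the pervasive impact the IMPLEMENTATION section mentions; the explicit C<dom_apply> call here only makes the rule visible.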
RFC 328 (v3) Single quotes don't interpolate \' and \\
This and other RFCs are available on the web at http://dev.perl.org/rfc/ =head1 TITLE Single quotes don't interpolate \' and \\ =head1 VERSION Maintainer: Nicholas Clark [EMAIL PROTECTED] Date: 28 Sep 2000 Last Updated: 30 Sep 2000 Mailing List: [EMAIL PROTECTED] Number: 328 Version: 3 Status: Frozen =head1 CHANGES Reissued on [EMAIL PROTECTED] - I goofed the list. Clarified the description slightly; by single quoted string I mean '' and q() Updated discussion section Frozen not withdrawn (see discussion section) =head1 DISCUSSION I'm in two minds as to whether to freeze or retract this RFC Reaction was strongly polarised; three strongly against and one strongly for. People valued their ability to use single quotes to easily make strings containing single quotes. Michael Fowler expresses Whew. Disallowing escapes in a single-quote string does not make easy things easier and hard things possible. I'm not arguing that we should keep it simply because people are used to it, but instead we should keep it because it's useful. My view was that the majority are against the change, but views were from Bexisting perl users [who do you expect as the majority on perl6 lists? :-)]. The change would penalise existing perl users, but benefit new perl users (and presumably people teaching perl). However, I'm wrong on that. Hildo Biersma states Now, I have been teaching perl for a number of years, and nobody's ever had trouble with understanding how single quotes and the two escapes work. Plenty of people find double-quotes either too powerful or too limited (see the various RFCs), but I think single quotes are fine However, there was no comment on the secondary issue of how single quotes treat unrecognised escapes. To me the following seems wrong: $ perl -lwe "print q(Quoted \( \\ \) Not \' \t \z)" Quoted ( \ ) Not \' \t \z And from the archives: Does it strike anyone else as odd that 'foo\\bar' eq 'foo\bar'? 
(Steve Fink, http://www.mail-archive.com/perl6-language@perl.org/msg04008.html) I conclude =over 4 =item 1 Removing C\ escaping of C\ and the delimiter from '' and q() would make perl less useful to the majority =item 2 How '' and q() deal with C\ followed by a non-escaped character isn't a major concern to perl programmers =back hence escaping as is should remain, but it would be possible to change the unrecognised escape behaviour (say C\z maps to Cz like a "" string) without causing pain, if this change were deemed sensible. My view is that (2) should be considered, hence I freeze rather than withdraw the RFC. =head1 ABSTRACT Remove all interpolation within single quotes and the Cq() operator, to make single quotes 100% shell-like. C\ rather than C\\ gives a single backslash; use double quotes or Cq() if you need a single quote in your string. =head1 DESCRIPTION Camel III (page 7) says "Double quotation marks (double quotes) do Ivariable interpolation and Ibackslash interpolation while single quotes suppress interpolation." Page 60 qualifies this with "except for C\' and C\\". In perl single quotes are used to generate strings. Double quotes also generate strings. In C single quotes are used to make character constants. Double quotes are used to make string constants. Backslash interpretation is performed in single quotes in C. While multi-character constants are allowed by C, they are strongly discouraged as they are non-portable, and a character constant in C is a type distinct from a string constant. Hence double quotes and single quotes signify different things. In shell, single quotes are used to make strings. Double quotes also make strings. Within single quotes backslashes are ordinary characters, and do not quote anything. As one can't quote a C' with a C\ there is no way to interpolate a single quote within a single quoted string, but a workaround such as C'don'\''t' relying on the concatenation of C'don' C\' and C't' achieves the desired results. 
Hence perl's single quoted strings are analogous to shell's single quoted strings, not C's. However, they're not identical, as perl allows C\\ to mean an embedded C\, C\' to mean an embedded C'. This RFC argues that the exception is confusing and proposes to remove it. This makes perl more regular in shell terms, and slightly more easy to learn for the shell programmer. It also makes perl internally simpler more regular. Currently the behaviour for Cq() strings is that C\( C\) and C\\ map to 1 character, C\I? for all other I? maps to 2 characters. Cqq() differs as C\I? maps to 1 character both when I? is recognised as a backslash escapes, Band when it is unrecognised. A further irregularity is that currently single quoted here docs don't interpolate C\\ or C\'. The consequence of this is that currently 'foo\\bar' eq 'foo\bar' which sure looks odd. With this RFC it is proposed that in a sin
RFC 142 (v2) Enhanced Pack/Unpack
This and other RFCs are available on the web at http://dev.perl.org/rfc/

=head1 TITLE

Enhanced Pack/Unpack

=head1 VERSION

  Maintainer: Edwin Wiles [EMAIL PROTECTED]
  Date: 22 Aug 2000
  Last Modified: 30 Sep 2000
  Mailing List: [EMAIL PROTECTED]
  Number: 142
  Version: 2
  Status: Frozen
  Submitted by: Glenn Linderman [EMAIL PROTECTED]

=head1 ABSTRACT

Pack and Unpack are perceived as being difficult to use, and possibly missing desirable features.

=head2 Notes on freeze

Edwin is on vacation; the co-author deems it appropriate to freeze without changes. Most discussion happened with "draft" RFCs before the original submittal, anyway.

=head1 DESCRIPTION

The existing pack and unpack methods depend upon a simple grammar which leads to opaque format specifications, which are often difficult to get right, and which carry no information regarding variable names.

A more descriptive grammar, which includes variable name associations, would make pack and unpack easier to use.

=head1 IMPLEMENTATION

Given the expressed desire to shrink the overall size of the perl executable, this should be implemented as a separate module, included with the core distribution.

=head2 Definition

    $foo = new Structure(...definition...);

Define a new structure type. Can use previously defined user data types. See the section on definitions.

    Structure::define( $typename, \&from_sub, \&to_sub );

Define a user data type. The 'from_sub' extracts data from the packed form. The 'to_sub' puts data back into the packed form. See the section on user defined types.

=head2 Input

    $foo->read(INPUT);    # sysread binary data from given IO reference.
    $foo->set($var);      # accept binary data from normal perl
                          # variable.
    $foo->append($var);   # append binary data to the existing data in
                          # the structure.

=head2 Output

    $foo->write(OUTPUT);  # syswrite binary data to given IO reference.
    $var = $foo->get();   # output binary data to normal perl variable.
=head2 Manipulation

    $foo->{'name'} = $val;   # set "name" to value
    $val = $foo->{'name'};   # get value of "name"

[Note: There is an alternative method, using the Class::Class method of exposing the variables via their names. That is still a possibility, but this is deemed easier to implement at this time.]

=head2 Data Definition

    DEFINITION  := '[' ELEMENTS ']'

    ELEMENTS    := ELEMENT [',' ELEMENTS]

    ELEMENT     := NAME '=' TYPE

    NAME        := Text used to identify the variable for further use.
                   You may not use 'array', it is reserved.
                   You may not embed whitespace.

    TYPE        := ''' BASETYPE [ '/' MODIFIERS ] '''
                 | '[' ARRAYDEF ']'
                 | USERDEFINED [ '/' UDEFARGS ]

    BASETYPE    := 'short' | 'long' | 'int' | 'double' | 'float'
                 | 'char' | 'byte'

                   In each of the above, unless otherwise modified, the
                   type defaults to the signedness, endianness, and bit
                   length that your system normally uses. The one
                   exception to this is 'char', which Unicode may cause
                   to be larger than a single byte, even if your system
                   normally considers a 'char' to be a single byte.

                 | 'chars'

                   A null terminated string of characters. If Unicode
                   is used, this may be more than one byte per
                   character. For use with indefinite length strings,
                   where a "count" is not provided. If the array
                   modifier is used, then you're expecting that many
                   null terminated strings.

                 | 'bytes'

                   A null terminated string of bytes. If the array
                   modifier is used, then you're expecting that many
                   null terminated strings of bytes.

[Note: Other basetypes desired can certainly be added. It would be best if they were added at this phase. Inform me of any additional base types desired, with justifications.]

    UDEFARGS    := UDEFARG [ ',' UDEFARGS ]

    UDEFARG     := User defined argument, meaning dependent upon user
                   defined code. Pretty much any legal Perl constant.
                   At least, by the time it hits this module, it had
                   better be constant.

    USERDEFINED := A user defined type name, see the section on user
                   defined data types.
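
To make the motivation concrete, here is a minimal sketch of what "variable name associations" buy over a bare template, written against the existing core C<pack> and C<unpack> only. The field names, the C<@layout> table, and the two helper subs are assumptions made for this illustration; none of them is the C<Structure> interface proposed above.

```perl
use strict;
use warnings;

# Hypothetical record layout: each entry pairs a field name with the
# ordinary pack template letter for that field.
my @layout = (
    [ id    => 'n'  ],    # unsigned 16-bit, big-endian
    [ flags => 'C'  ],    # unsigned 8-bit
    [ name  => 'Z*' ],    # null terminated string
);

# Build the template once from the layout, so the opaque string
# "n C Z*" never has to be written or decoded by hand.
my $template = join ' ', map { $_->[1] } @layout;

# Pack a hash of named values in layout order.
sub pack_record {
    my ($rec) = @_;
    return pack $template, map { $rec->{ $_->[0] } } @layout;
}

# Unpack binary data back into a hash keyed by field name.
sub unpack_record {
    my ($bytes) = @_;
    my %rec;
    @rec{ map { $_->[0] } @layout } = unpack $template, $bytes;
    return \%rec;
}

my $packed = pack_record( { id => 513, flags => 7, name => 'perl' } );
my $fields = unpack_record($packed);
print "$fields->{name} has id $fields->{id}\n";
```

The caller never sees the template string, only names, which is the usability gain the DESCRIPTION argues for; the proposed module would add user defined types and array definitions on top of this idea.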
RFC 350 (v1) Advanced I/O (AIO)
This and other RFCs are available on the web at http://dev.perl.org/rfc/

=head1 TITLE

Advanced I/O (AIO)

=head1 VERSION

  Maintainer: Uri Guttman [EMAIL PROTECTED]
  Date: 29 Sept 2000
  Mailing List: [EMAIL PROTECTED]
  Number: 350
  Version: 1
  Status: Developing

=head1 ABSTRACT

This RFC describes a pragma and module which support an advanced I/O subsystem. It is meant to be a centralized subsystem that supports a wide range of I/O requirements, many of which are covered in other RFCs. It doesn't add any new syntax or break any existing code.

=head1 DESCRIPTION

There are many ways that coders want to do I/O in Perl and many I/O sources as well. RFC 14 discusses ways to make open smarter and to have handlers for each flavor. The goal is good, but it requires new syntax and semantics for open. You effectively have multiple versions of open, each with its own argument order. It also doesn't address asynchronous I/O or events and callbacks. I don't see how to make a socket or http connection and not block on the request.

This RFC addresses those issues in a Perl5 way by creating a new class with attribute based methods. This allows us to add new ways to do open and I/O without having to learn specialized argument formats. The style is very similar to IO:: now but more generalized and tightly integrated in the core. This is why I choose to call it Advanced I/O (AIO).

The main pragma will be 'use aio' and it will allow the coder to control and load various components. I won't have the time to specify all of them but think along the lines of RFC 14, LWP, libnet, IO::*, LWP::Parallel, etc. Supporting them all under this single paradigm is simple as you don't need to know more than the minimum attributes to get your job done. Also, as I have covered in RFC 47, many of the attributes will have well chosen defaults which allow for shorter argument lists for common situations.
For example, instead of

    $sock = IO::Socket::INET->new(PeerAddr => 'www.perl.org',
                                  PeerPort => 'http(80)',
                                  Proto    => 'tcp');

you would do:

    $sock = AIO::Open( Host => 'www.perl.org',
                       Port => 80 ) ;

TCP would be the default protocol as it is much more common than UDP, and it would know it is a socket connection because of the Host/Port attributes.

Similarly for LWP you would just do:

    $sock = AIO::Open( Url => 'http://www.perl.org' ) ;

If you refer to a special attribute, it can cause the appropriate module to be loaded at runtime. In the above case the AIO::LWP module (or whatever it is called) would get loaded and then it would be passed the open call.

One critical feature of AIO is direct support of asynchronous I/O (including connections, server accepts, and real asynchronous file I/O). There is no special interface required; you just specify a callback attribute. That makes the request automatically non-blocking and registers the event for you. You must have previously specified an event dispatch method (with use event or use aio) or it is a runtime error.

Assuming an object $obj in package Foo, the above calls can be done with asynchronous callbacks (see RFC 321 for more on callbacks) like this:

    $event = AIO::Open( Host     => 'www.perl.org',
                        Port     => 80,
                        Callback => $obj ) ;

    $event = AIO::Open( Url      => 'http://www.perl.org/index.html',
                        Callback => $obj ) ;

    package Foo ;

    sub Connected {

        my( $self, $socket ) = @_ ;
        print "i am connected to perl.org\n" ;
    }

    sub Url_gotten {

        my( $self, $socket ) = @_ ;
        print $socket->read( 8096 ) ;
    }

Here is an asynchronous file read:

    $event = AIO::Read( Fd       => $fh,
                        Count    => 1000,
                        Callback => \&read_handler ) ;

    sub read_handler {

        my( $read_data ) = @_ ;
        print "i read [$read_data]\n" ;
    }

The callback method names are the default ones (we can pick better ones but I am under deadline pressure!) or you can choose your own.
Timeouts can also be set with their own method names or default ones:

    $event = AIO::Open( Url            => 'http://www.perl.org/index.html',
                        Callback       => $obj,
                        Method         => 'Url_Ready',
                        Timeout        => 10,
                        Timeout_Method => 'URL_timeout' ) ;

So you see, the same simple syntax and a consistent API let you do blocking and non-blocking connects, I/O and timers. No need to learn IO::, LWP, LWP::Parallel and IO::Select. All of them are covered under this approach, and adding new attributes is easy and won't conflict with other code written to this specification.

There is much more than can be described and I hope to have a
RFC 328 (v2) Single quotes don't interpolate \' and \\
This and other RFCs are available on the web at http://dev.perl.org/rfc/ =head1 TITLE Single quotes don't interpolate \' and \\ =head1 VERSION Maintainer: Nicholas Clark [EMAIL PROTECTED] Date: 28 Sep 2000 Last Updated: 29 Sep 2000 Mailing List: [EMAIL PROTECTED] Number: 328 Version: 2 Status: Developing =head1 CHANGES Reissued on [EMAIL PROTECTED] - I goofed the list. Added discussion section Clarified the description slightly; by single quoted string I mean '' and q() =head1 DISCUSSION Limited discussion so far because I wrongly issued the RFC to [EMAIL PROTECTED] The only responses were from two people both of whom valued their ability to use single quotes to make strings including making strings containing single quotes. Here docs already provide a means to get "quote nothing, everything is literal". One argued that Single-quoted strings need to be able to contain single quotes, which means some escape mechanism is required. which I did not agree with. My view is that single quotes are a means to an end, a way to create string constants. String constants need to contain C' (and every other character). There's more than just '' to make a string. Although consensus so far is against the change, views were from Bexisting perl users [who do you expect as the majority on perl6 lists? :-)]. The change would penalise existing perl users, but benefit new perl users (and presumably people teaching perl). =head1 ABSTRACT Remove all interpolation within single quotes and the Cq() operator, to make single quotes 100% shell-like. C\ rather than C\\ gives a single backslash; use double quotes or Cq() if you need a single quote in your string. =head1 DESCRIPTION Camel III (page 7) says "Double quotation marks (double quotes) do Ivariable interpolation and Ibackslash interpolation while single quotes suppress interpolation." Page 60 qualifies this with "except for C\' and C\\". In perl single quotes are used to generate strings. Double quotes also generate strings. 
In C single quotes are used to make character constants. Double quotes are used to make string constants. Backslash interpretation is performed in single quotes in C. While multi-character constants are allowed by C, they are strongly discouraged as they are non-portable, and a character constant in C is a type distinct from a string constant. Hence double quotes and single quotes signify different things. In shell, single quotes are used to make strings. Double quotes also make strings. Within single quotes backslashes are ordinary characters, and do not quote anything. As one can't quote a C' with a C\ there is no way to interpolate a single quote within a single quoted string, but a workaround such as C'don'\''t' relying on the concatenation of C'don' C\' and C't' achieves the desired results. Hence perl's single quoted strings are analogous to shell's single quoted strings, not C's. However, they're not identical, as perl allows C\\ to mean an embedded C\, C\' to mean an embedded C'. This RFC argues that the exception is confusing and proposes to remove it. This makes perl more regular in shell terms, and slightly more easy to learn for the shell programmer. It also makes perl internally simpler more regular. Currently the behaviour for Cq() strings is that C\( C\) and C\\ map to 1 character, C\I? for all other I? maps to 2 characters. Cqq() differs as C\I? maps to 1 character both when I? is recognised as a backslash escapes, Band when it is unrecognised. A further irregularity is that currently single quoted here docs don't interpolate C\\ or C\'. The consequence of this is that currently 'foo\\bar' eq 'foo\bar' which sure looks odd. With this RFC it is proposed that in a single quoted string and the q() operator C\ is not special. Hence C\I? always maps to 2 characters (C\ then I?) unless I? is the closing terminator, in which case the string terminates with that C\ . 
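
The current Perl 5 behaviour that the paragraphs above describe can be checked with a short script. This is only a demonstration of today's rules, not of the proposed change; the variable names are arbitrary.

```perl
use strict;
use warnings;

my $doubled = 'foo\\bar';   # \\ collapses to a single backslash
my $single  = 'foo\bar';    # \b is not an escape here, so the backslash stays
my $quote   = 'don\'t';     # \' yields a bare single quote
my $in_q    = q(don\'t);    # ' is not the delimiter of q(), so the backslash stays

# The two spellings produce the same seven-character string.
print $doubled eq $single ? "equal\n" : "different\n";

# 'don\'t' is 5 characters long, q(don\'t) is 6.
print length($quote), " ", length($in_q), "\n";
```

Under the proposal, C<'foo\\bar'> would keep both backslashes and C<'don\'t'> would no longer be a single string, since the first closing quote would end it.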
Single quoted strings behave like single quoted here docs, and like shell single quoted strings. You don't lose any functionality, as 'don\'t implement this RFC, the benefits don\'t outweigh the confusion' can still be written q(don't implement this RFC, the benefits don't outweigh the confusion) which is actually less typing. =head1 IMPLEMENTATION Modify the tokeniser/lexer not to treat C\ as special, hence the first end delimiter ends the string. For 5.7's toke.c this doesn't appear that simple. it looks like modifications would be needed to CS_tokeq, Cscan_str and CPerl_yylex (for a quoted string at the start of curlies). There are probably more; the code that makes single quoted strings interpolate C\' and C\\ appear to be deeply ingrained into the core. The perl5 to perl6 convertor would need to convert single quoted strings and Cq() operators containing C\' to the shortest (clearest?) equivalent of: =over 4 =item * double quoted string with C\Q...C\E or C\ escapes
RFC 259 (v1) Builtins : Make use of hashref context for garrulous builtins
This and other RFCs are available on the web at http://dev.perl.org/rfc/ =head1 TITLE Builtins : Make use of hashref context for garrulous builtins =head1 VERSION Maintainer: Damian Conway [EMAIL PROTECTED] Date: 19 Sep 2000 Last Modified: 29 Sep 2000 Mailing List: [EMAIL PROTECTED] Number: 259 Version: 1 Status: Frozen Frozen since: v3 =head1 ABSTRACT This RFC proposes the builtin functions that return a large number of values in an array context should also detect hashref contexts (see RFC 21) and return their data in a kinder, gentler format. =head1 DESCRIPTION It's hard to remember the sequence of values that the following builtins return: stat/lstat caller localtime/gmtime get* and though it's easy to look them up, it's a pain to look them up Every Single Time. Moreover, code like this is far from self-documenting: if ((stat $filename)[7] 1000) {...} if ((lstat $filename)[10] time()-1000) {...} if ((localtime(time))[3] 5) {...} if ($usage (getpwent)[4]) {...} @host{qw(name aliases addrtype length addrs)} = gethostbyname $name; warn "Problem at " . join(":", @{[caller(0)]}[3,1,2]) . "\n"; It is proposed that, when one of these subroutines is called in the new HASHREF context (RFC 21), it should return a reference to a hash of values, with standardized keys. For example: if (stat($filename)-{size} 1000) {...} if (lstat($filename)-{ctime} time()-1000) {...} if (localtime(time)-{mday} 5) {...} if ($usage getpwent()-{quota}) {...} %host = %{gethostbyname($name)}; warn "Problem at " . join(":", @{caller(0)}{qw(sub file line)} . 
"\n"; =head2 Standardized keys The standardized keys for these functions would be: =over 4 =item Cstat/Clstat 'dev' Device number of filesystem 'ino' Inode number 'mode' File mode (type and permissions) 'nlink' Number of (hard) links to the file 'uid' Numeric user ID of file's owner 'gid' Numeric group ID of file's owner 'rdev' The device identifier (special files only) 'size' Total size of file, in bytes 'atime' Last access time in seconds since the epoch 'mtime' Last modify time in seconds since the epoch 'ctime' Inode change time in seconds since the epoch 'blksize' Preferred block size for file system I/O 'blocks'Actual number of blocks allocated =item Clocaltime/Cgmtime 'sec' Second 'min' Minute 'hour' Hour 'mon' Month 'year' Year 'mday' Day of the month 'wday' Day of the week 'yday' Day of the year 'isdst' Is daylight savings time in effect (localtime only) =item Ccaller 'package' Name of the package from which sub was called 'file' Name of the file from which sub was called 'line' Line in the file from which sub was called 'sub' Name by which sub was called 'args' Was sub called with args? 'want' Hash of values returned by want() 'eval' Text of EXPR within eval EXPR 'req' Was sub called from a Crequire (or Cuse)? 'hints' Pragmatic hints with which sub was compiled 'bitmask' Bitmask with which sub was compiled =item Cgetpw* 'name' Username 'passwd'Crypted password 'uid' User ID 'gid' Group ID 'quota' Disk quota 'comment' Administrative comments 'gcos' User information 'dir' Home directory 'shell' Native shell 'expire'Expiry date of account of password =item Cgetgr* 'name' Group name 'passwd'Group password 'gid' Group id 'members' Group members =item Cgethost* 'name' Official host name 'aliases' Other host names 'addrtype' Host address type 'length'Length of address 'addrs' Anonymous array of raw addresses in 'C4' format =item Cgetnet* 'name' Official name of netwwork 'aliases'
RFC 264 (v3) Provide a standard module to simplify the creation of source filters
This and other RFCs are available on the web at http://dev.perl.org/rfc/

=head1 TITLE

Provide a standard module to simplify the creation of source filters

=head1 VERSION

  Maintainer: Damian Conway [EMAIL PROTECTED]
  Date: 20 Sep 2000
  Last Modified: 29 Sep 2000
  Mailing List: [EMAIL PROTECTED]
  Number: 264
  Version: 3
  Status: Frozen
  Frozen since: v2

=head1 ABSTRACT

This RFC proposes that the interface to Perl's source filtering facilities be made much easier to use.

=head1 DESCRIPTION

Source filtering is an immensely powerful feature of recent versions of Perl. It allows one to extend the language itself (e.g. the Switch module), to simplify the language (e.g. Language::Pythonesque), or to completely recast the language (e.g. Lingua::Romana::Perligata). Effectively, it allows one to use the full power of Perl as its own, recursively applied, macro language.

The Filter::Util::Call module (by Paul Marquess) provides a usable Perl interface to source filtering, but it is not nearly as simple as it could be.

To use the module it is necessary to do the following:

=over 4

=item 1.

Download, build, and install the Filter::Util::Call module.

=item 2.

Set up a module that does a C<use Filter::Util::Call>.

=item 3.

Within that module, create an C<import> subroutine.

=item 4.

Within the C<import> subroutine do a call to C<filter_add>, passing it a subroutine reference.

=item 5.

Within the subroutine reference, call C<filter_read> or C<filter_read_exact> to "prime" $_ with source code data from the source file that will C<use> your module. Check the status value returned to see if any source code was actually read in.

=item 6.

Process the contents of $_ to change the source code in the desired manner.

=item 7.

Return the status value.

=item 8.

If the act of unimporting your module (via a C<no>) should cause source code filtering to cease, create an C<unimport> subroutine, and have it call C<filter_del>.
Make sure that the call to C<filter_read> or C<filter_read_exact> in step 5 will not accidentally read past the C<no>. Effectively this limits source code filters to line-by-line operation, unless the C<import> subroutine does some fancy pre-pre-parsing of the source code it's filtering.

=back

This last requirement is often the stumbling block. Line-by-line source filters are not difficult to set up using Filter::Util::Call, but line-by-line filtering is the exception, rather than the norm. Since a newline is just whitespace throughout much of a Perl program, most useful source filters have to make allowance for components that may span two or more newlines. And that complicates the filtering code enormously.

For example, here is a minimal source code filter in a module named BANG.pm. It simply converts every occurrence of the sequence C<BANG\s+BANG> (which may include newlines) to the sequence C<die 'BANG' if $BANG> in any piece of code following a C<use BANG;> statement (until the next C<no BANG;> statement, if any):

    package BANG;

    use Filter::Util::Call ;

    sub import {
        filter_add( sub {
            my $caller = caller;
            my ($status, $no_seen, $data);
            # Accumulate source lines until end of file or a "no" line.
            while ($status = filter_read()) {
                if (/^\s*no\s+$caller\s*;\s*$/) {
                    $no_seen=1;
                    last;
                }
                $data .= $_;
                $_ = "";
            }
            $_ = $data;
            s/BANG\s+BANG/die 'BANG' if \$BANG/g
                unless $status < 0;
            # Put back the "no" line that was consumed above.
            $_ .= "no $caller;\n" if $no_seen;
            return 1;
        })
    }

    sub unimport {
        filter_del();
    }

    1 ;

Given this level of complexity, it's perhaps not surprising that source code filtering is not commonly used.

This RFC proposes that a new standard module -- Filter::Simple -- be provided, to vastly simplify the task of source code filtering, at least in common cases.

=head2 The Filter::Simple module

Instead of the above process, it is proposed that the Filter::Simple module would simplify the creation of source code filters to the following steps:

=over 4

=item 1.

Set up a module that does a C<use Filter::Simple sub { ... }>.

=item 2.
Within the anonymous subroutine passed to C<use Filter::Simple>, process the
contents of $_ to change the source code in the desired manner.

=back

In other words, the previous example would become:

    package BANG;

    use Filter::Simple sub {
        s/BANG\s+BANG/die 'BANG' if \$BANG/g;
    };

    1 ;

=head2 Module semantics

This drastic simplification is achieved by having the standard
Filter::Simple module export into the package that C<use>s it (e.g. package
"BANG" in the above example) two automagically constructed subroutines --
C<import> and C<unimport> -- which take care of all the nasty details.

In addition, the generated C<import> subroutine
RFC 119 (v4) Object neutral error handling via exceptions
    t error type 2, handle it
    }
    catch ( ... )
    {
      if ( handle1 ) close ( handle1 );
      if ( handle2 ) close ( handle2 );
      if ( handle3 ) close ( handle3 );
      throw;
    }

Or you could:

    void help_clean ( FILE * handle1, FILE * handle2, FILE * handle3 )
    {
      if ( handle1 ) close ( handle1 );
      if ( handle2 ) close ( handle2 );
      if ( handle3 ) close ( handle3 );
    }

    FILE *handle1 = NULL, *handle2 = NULL, *handle3 = NULL;
    try
    {
      handle1 = fopen ( ... );
      handle2 = fopen ( ... );
      handle3 = fopen ( ... );
      ...
    }
    catch ( error_type_1 )
    {
      help_clean ( handle1, handle2, handle3 );
      // ... report error type 1, handle it
    }
    catch ( error_type_2 )
    {
      help_clean ( handle1, handle2, handle3 );
      // ... report error type 2, handle it
    }
    catch ( ... )
    {
      help_clean ( handle1, handle2, handle3 );
      throw;
    }

This removes the error handling code even further from the setup code, still
requires redundancy among the catch phrases, and introduces new functions
dealing only with cleanup.

Adding an RFC 88 finally clause to C++ would help if and only if (if I
understand it correctly) the handles can be closed _at the end_ of the
cleanup process. That would produce:

    FILE *handle1 = NULL, *handle2 = NULL, *handle3 = NULL;
    try
    {
      handle1 = fopen ( ... );
      handle2 = fopen ( ... );
      handle3 = fopen ( ... );
      ...
    }
    catch ( error_type_1 )
    {
      // ... report error type 1, handle it
    }
    catch ( error_type_2 )
    {
      // ... report error type 2, handle it
    }
    catch ( ... )
    {
      throw;
    }
    finally
    {
      if ( handle1 ) close ( handle1 );
      if ( handle2 ) close ( handle2 );
      if ( handle3 ) close ( handle3 );
    }

I'm not sure how the "catch ( ... )"'s rethrow would interact with the
finally clause; that seems to be an area of discussion regarding the
differences between RFC 63 and RFC 88.

=head2 Goals of exceptions

This is my list so far, feel free to suggest more.

In the examples thus far, each "fopen" call could independently fail, but the
overall program appears to need to open all three, or none, in a somewhat
atomic manner.
While the code to deal with a single fopen call and the possibility that it
fails is straightforward, the complexity of the situation comes from the
polynomial explosion of code and branches as the number of operations grows.
This is my justification for the first 6 items on the list.

While I have nothing against OO techniques (I've found C++ OO features useful
for a compiled language), it is somewhat cumbersome to deal with OO for small
projects. Perhaps some of the "make everything an object" RFCs for Perl6 will
sidestep that cumbersomeness, and moot this point. However, until or unless
that is achieved, I'd rather not be forced to use objects to achieve
exception handling. On the other hand, when building a large system, having
an exception object might be helpful. This is my justification for item 7.

  1) Keep the cleanup code near the setup code, to keep it understandable.
  2) Keep the cleanup code in the same scope as the setup code, to avoid
     hoisting variables into higher scopes.
  3) Avoid redundancy and complex control flow in the visible cleanup code
     paths.
  4) Achieve a structured form of non-local goto to allow exiting multiple
     levels of subroutine calls without coding tests of error conditions at
     every level within the stack.
  5) Achieve good default reporting of uncaught exceptions.
  6) Make exception handling the default (or only) method of operation for
     Perl code.
  7) Permit use of exception objects, but don't require them.

=head2 Techniques for exceptions

=head3 Technique for goals 1-3

Add new except and always clauses that can modify a statement or a block:

    statement1 except statement2 always statement3;

Any of statement1, statement2 or statement3 can be made into blocks, with the
result that scoping problems resurface, but often they wouldn't need to be
blocks.

If execution of the containing scope reaches statement1, it is executed as
normal.
Because it contains an always clause, statement3 is pushed on the stack of
cleanup code to be executed when the scope exits, and because it contains an
except clause, statement2 is pushed on the stack of cleanup code to be
executed if an exception occurs. There is logically only one stack of cleanup
code, so the order of execution of the cleanup statements is always
consistent, although some of it is conditional. For the statement above,
statement2 would always be executed before statement3 if an exception occurs.
Statement2 is omitted if no exception occurs. Statement3 is omitted only if
the scope exits prior to statement1 being executed.

For example (I'll use Perl language examples henceforth):

    $handle1 = open ( "file1" ) always close ( $handle1 );
    throw "
RFC 340 (v1) with takes a context
=head1 TITLE

with takes a context

=head1 VERSION

  Maintainer: David Nicol [EMAIL PROTECTED]
  Date: 28 Sep 2000
  Mailing List: [EMAIL PROTECTED]
  Number: 340
  Version: 1
  Status: Developing

=head1 ABSTRACT

"Call frames" become useful as objects. The current one is always available
as C<with> with no argument, and you can get into an arbitrary one by using
C<with $whatever BLOCK> in either forward or backward form, like C<if>.

=head1 DESCRIPTION

=over 4

=item description by commented example

    sub A {
        my $A = shift;
        return with;
    };

    $context1 = A(3);
    print "$context1";             # something like CONTEXT(0XF0GD0G)
    print "$A" with $context1;     # prints 3
    with $context1 {print "$A"};   # same thing

=item coexistence w/ pascal-like with

The context-access C<with> takes a scalar argument; the pascal-like with
takes a hash. If the pascal-like with is considered as describing aliases to
defined variables, the two have deep similarities.

=item CONTEXT objects in other contexts

In an array context, one of these CONTEXT things will expand into key-value
pairs if it can.

=item warning about memory leaks

Using C<with> to store contexts may adversely affect memory recycling.

=back

=head1 IMPLEMENTATION

Perl5 has the hooks required to do this: the closure stuff. This proposed
C<with> keyword makes access into such things more explicit.

=head1 REFERENCES

None.
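
The IMPLEMENTATION note says Perl 5 closures already provide the needed
hooks. A minimal Perl 5 sketch of that idea follows. The helper name
C<make_context> and the hash-of-references layout are assumptions made for
illustration; the RFC itself does not specify either.

```perl
use strict;
use warnings;

# Hypothetical stand-in for "return with;": hand back references to the
# lexicals of one call so that they outlive the call itself.
sub make_context {
    my $A = shift;
    return { A => \$A };
}

my $context1 = make_context(3);
my $context2 = make_context(7);

# Rough equivalent of: print "$A" with $context1;
my $first  = ${ $context1->{A} };
my $second = ${ $context2->{A} };
print "$first $second\n";

# The captured variable stays alive and writable through the reference,
# which is also why stored contexts can delay memory recycling.
${ $context1->{A} }++;
my $bumped = ${ $context1->{A} };
print "$bumped\n";
```

Each call gets its own C<$A>, so the two contexts stay independent, which is
the property a real C<with> would need to preserve.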
RFC 277 (v2) Method calls SHOULD suffer from ambiguity by default
=head1 TITLE

Method calls SHOULD suffer from ambiguity by default

=head1 VERSION

  Maintainer: Nathan Wiger [EMAIL PROTECTED]
  Date: 24 Sep 2000
  Last Modified: 28 Sep 2000
  Mailing List: [EMAIL PROTECTED]
  Number: 277
  Version: 2
  Status: Developing

=head1 CHANGES

The first version was too cute, trying to prove a point but failing
miserably. So here it is, in plain English.

=head1 ABSTRACT

RFC 244 proposes special-casing C<< -> >> to quote the left operand, and
eliminating the bareword indirect object syntax entirely. These are both Bad
Ideas, and will likely cause a large number of longtime Perl hackers to run
away screaming.

Many people, myself included, feel that the ability to type:

   $q = new CGI;
   $val = shift->{fullname};

and have Perl DWIM is what makes Perl Perl, and fun. Forcing me to write:

   $q = new {'CGI'};
   $val = shift()->{fullname};

because this ambiguity upsets a few people is just plain silly. And B<very>
un-fun.

Tightening this syntax by default makes no sense. Rather, this is something
that can easily be added to C<use strict>. See B<RFC 278> for details on
C<use strict 'words'> and C<use strict 'objects'>.

=head1 DESCRIPTION

In Perl 5, there are already plenty of ways in which people can
B<voluntarily> disambiguate the above:

   $q = new 'CGI';                 # main::new('CGI');
   $q = CGI::->new;                # or 'CGI'->new;
   $val = shift::->{fullname};     # or 'shift'->{fullname}
   $val = shift()->{fullname};     # main::shift()->{fullname}

And if people want this to be enforced, then they can easily
C<use strict 'words'> as specified in RFC 278.

Perl 6 should be a value add to Perl 5. It should not take away established,
widely-used syntax just because of a little ambiguity. This is hardly the
only syntactic ambiguity in Perl.

=head1 IMPLEMENTATION

Perl 6 should tokenize the above just like Perl 5.

=head1 MIGRATION

None. Why would we want there to be?

=head1 REFERENCES

RFC 28: Perl should stay Perl.

RFC 278: Additions to 'use strict' to fix syntactic ambiguities

RFC 244: Method calls should not suffer from the action on a distance
RFC 178 (v5) Lightweight Threads
=head1 TITLE

Lightweight Threads

=head1 VERSION

  Maintainer: Steven McDougall [EMAIL PROTECTED]
  Date: 30 Aug 2000
  Last Modified: 26 Sep 2000
  Mailing List: [EMAIL PROTECTED]
  Number: 178
  Version: 5
  Status: Frozen

=head1 ABSTRACT

A lightweight thread model for Perl.

=over 4

=item *

All threads see the same compiled subroutines

=item *

All threads share the same global variables

=item *

Threads can create thread-local storage by C<local>izing global variables

=item *

All threads share the same file-scoped lexicals

=item *

Each thread gets its own copy of block-scoped lexicals upon execution of
C<my>

=item *

Threads can share block-scoped lexicals by passing a reference to a lexical
into a thread, by declaring one subroutine within the scope of another, or
with closures.

=item *

Open code can only be executed by a thread that compiles it

=item *

The language guarantees atomic data access. Everything else is the user's
problem.

=back

=over 4

=item Perl

Swiss-army chain saw

=item Perl with threads

juggling chain saws

=back

=head1 CHANGES

=head2 v5

Frozen

=head2 v4

=over 4

=item *

Traded in data coherence for L<Atomic data access>. Added examples 16 and 17.

=item *

Traded in Primitive operations for L<Locking>

=item *

Dropped L</local> section

=item *

Revised L</Performance> section

=back

=head2 v3

=over 4

=item *

Simplified example 9

=item *

Added L</Performance> section

=back

=head2 v2

=over 4

=item *

Added section on sharing block-scoped lexicals between threads

=item *

Added examples 9, 10, and 11. (N.B. renumbered following examples)

=item *

Fixed some typos

=back

=head1 FROZEN

There was substantial--if somewhat disjointed--discussion of thread models on
perl6-internals. The consensus among those with internals experience is that
this RFC shares too much data between threads, and that the CPU cost of
acquiring a lock for every variable access will be prohibitive.
Dan Sugalski discussed some of the tradeoffs and sketched an alternate
threading model at

    http://www.mail-archive.com/perl6-internals%40perl.org/msg01272.html

However, this has not been submitted as an RFC.

=head1 DESCRIPTION

The overriding design principle in this model is that there is one program
executing in multiple threads. One body of code; one set of global variables;
many threads of execution. I like this model because

=over 4

=item *

I understand it

=item *

It does what I want

=item *

I think it can be implemented

=back

=head2 Notation

=over 4

=item I<main> and I<spawned> threads

We'll call the first thread that executes in a program the I<main> thread. It
isn't distinguished in any other way. All other threads are called I<spawned>
threads.

=item I<open code>

Code that isn't contained in a BLOCK.

=back

Examples are written in Perl5, and use the thread programming model
documented in C<Thread.pm>. Discussions of performance and implementation are
based on the Perl5 internals; obviously, these are subject to change.

=head2 All threads see the same compiled subroutines

Subroutines are typically defined during the initial compilation of a
program. C<use>, C<require>, C<do>, and C<eval> can later define additional
subroutines or redefine existing ones. Regardless, at any point in its
execution, a program has one and only one collection of defined subroutines,
and all threads see this collection.

Example 1

    sub foo      { print 1 }
    sub hack_foo { eval 'sub foo { print 2 }' }
    foo();
    Thread->new(\&hack_foo)->join;
    foo();

Output: 12. The main thread executes C<foo>; the spawned thread redefines
C<foo>; the main thread executes the redefined subroutine.

Example 2

    sub foo      { print 1 }
    sub hack_foo { eval 'sub foo { print 2 }' }
    foo();
    Thread->new(\&hack_foo);
    foo();

Output: 11 or 12, depending on whether the main thread makes the second call
to C<foo()> before or after the spawned thread redefines it.
If the user cares which happens first, then they are responsible for doing
their own synchronization, for example, with C<join>, as shown in Example 1.

Code refs (like all Perl data objects) are reference counted. Threads
increment the reference count upon entry to a subroutine, and decrement it
upon exit. This ensures that the op tree won't be garbage collected while the
thread is executing it.

=head2 All threads share the same global variables

Example 3

    #!/my/path/to/perl
    $a = 1;
    Thread->new(\&foo)->join;
    print $a;

    sub foo { $a++ }

Output: 2. C<$a> is a global, and it is the I<same> global in both the main
thread and the spawned thread.

=head2 Threads can create thread-local storage by C<local>izing global variables

Example 4

    #!/my/path/to/perl
    $a = 1;
    Thread->new(\&foo);
    print $a;

    sub foo { local $a = 2 }

Output: 1. The spawned thread gets its own copy of C<$a>. The copy of C<$a>
in the main thread is unaffected
RFC 185 (v3) Thread Programming Model
=head1 TITLE

Thread Programming Model

=head1 VERSION

  Maintainer: Steven McDougall [EMAIL PROTECTED]
  Date: 31 Aug 2000
  Last Modified: 26 Sep 2000
  Mailing List: [EMAIL PROTECTED]
  Number: 185
  Version: 3
  Status: Frozen

=head1 ABSTRACT

This RFC describes the programming interface to Perl6 threads. It documents
the function calls, operators, classes, methods, or whatever else the
language provides for programming with threads.

=head1 CHANGES

=head2 v3

Frozen

=head2 v2

=over 4

=item *

Added SYNOPSIS, and wrote a proper ABSTRACT

=item *

Detailed C<async>

=item *

Detailed sharing of lexicals between threads

=item *

Traded Mutexes back for C<lock>, C<try>, and C<unlock>

=item *

Pushed C<Semaphore>, C<Event>, and C<Timer> down into C<Thread::>

=item *

Specified readable, writable and failure to return Events

=item *

Reworked the wait functions

=item *

Added C<Queue>

=back

=head1 FREEZE

There was little, if any, further discussion after version 2.

=head1 SYNOPSIS

    use Thread;

    $sub = sub { ... };

    $thread = new Thread \&func,      @args;
    $thread = new Thread $sub,        @args;
    $thread = new Thread sub { ... }, @args;

    async { ... };

    $result = join $thread;

    $thread  = this Thread;
    @threads = all Thread;

    $thread1 == $thread2 and ...

    Thread::yield();

    critical { ... };   # one thread at a time in this block

    lock $scalar;       lock @array;       lock %hash;       lock &sub;
    $ok = try $scalar;  $ok = try @array;  $ok = try %hash;  $ok = try &sub;
    unlock $scalar;     unlock @array;     unlock %hash;     unlock &sub;

    $event = auto Thread::Event;
    $event = manual Thread::Event;
    set   $event;
    reset $event;
    wait  $event;

    $semaphore = new Thread::Semaphore $initial;
    $ok    = $semaphore->up($n);
    $semaphore->down;
    $count = $semaphore->count;

    $timer = Thread::Timer->delay($seconds);
    $timer = Thread::Timer->alarm($time);
    $timer->wait;

    $event = $fh->readable;
    $event = $fh->writable;
    $event = $fh->failure;

    $ok = wait_all(@references);
    $i  = wait_any(@references);

    $queue = new Thread::Queue;
    $queue->enqueue($a);
    $a     = $queue->dequeue;
    $empty = $queue->empty;

=head1 DESCRIPTION

=head2 Thread

=over 4

=item I<$thread> = C<new> C<Thread> \&I<func>, I<@args>

Executes I<func>(I<@args>) in a separate thread. The return value is a
reference to the C<Thread> object that manages the thread.

The subroutine executes in its enclosing lexical context. This means that
lexical variables declared in that context may be shared between threads. See
RFC 178 for examples.

=item I<$thread> = C<new> C<Thread> I<$sub>, I<@args>

=item I<$thread> = C<new> C<Thread> C<sub { ... }>, I<@args>

Executes an anonymous subroutine in a separate thread, passing it I<@args>.
The return value is a reference to the C<Thread> object that manages the
thread.

The subroutine is a closure. References to variables in its lexical context
are bound when the C<sub> operator executes. See RFC 178 for examples.

=item C<async> BLOCK

Executes BLOCK in a separate thread. Syntactically, C<async> BLOCK works like
C<do> BLOCK. C<async> creates a C<Thread> object to manage the thread, but it
does not return a reference to it. If you want the C<Thread> object, use one
of the C<new> C<Thread> forms shown above.

The BLOCK executes in its enclosing lexical context. This means that lexical
variables declared in that context may be shared between threads.
=item I<$thread> = C<this> C<Thread>

Returns a reference to the C<Thread> object that manages the current thread.

=item I<@threads> = C<all> C<Thread>

Returns a list of references to all existing C<Thread> objects in the
program. This includes C<Thread> objects created for C<async> blocks.

=item I<$result> = C<join> I<$thread>

=item I<@result> = C<join> I<$thread>

Blocks until I<$thread> terminates. May be called repeatedly, by any number
of threads.

Returns the last expression evaluated in I<$thread>. This expression is
evaluated in list context inside the thread. If C<join> is called in list
context, it returns the entire list; if C<join> is called in scalar context,
it returns the first element of the list.

=item I<$thread1> == I<$thread2>

Evaluates to true iff I<$thread1> and I<$thread2> reference the same
C<Thread> object.

=item C<Thread::yield()>

Gives the interpreter an opportunity to switch to another thread. The
interpreter is not obligated to take this opportunity, and the calling thread
may regain control after an arbitrarily short period of time.

=back

=head2 Critical section

C<critical> is a new keyword. Syntactically, it works like C<do>.

    critical { ... };

The interpreter guarantees that only one thread at a time can execute a
C<critical> block.

=head2 Lock

=over 4

=item C<lock> I<$scalar>

=item C<lock> I<@array>

=item C<lock> I<%hash>

=item C<lock> I<&sub>

Applies a lock to a variable.

If there are no locks applied
RFC 239 (v2) IO: Standardization of Perl IO Functions to use Indirect Objects
=head1 TITLE

IO: Standardization of Perl IO Functions to use Indirect Objects

=head1 VERSION

  Maintainer: Nathan Wiger [EMAIL PROTECTED]
  Date: 15 Sep 2000
  Last Modified: 26 Sep 2000
  Mailing List: [EMAIL PROTECTED]
  Number: 239
  Version: 2
  Status: Frozen

=head1 ABSTRACT

Currently, Perl IO functions follow a C-like style, twiddling values passed
to them and then returning 1 or 0.

This RFC takes after RFC 14's modifications to open() and proposes similar
modifications to other IO operations, in an attempt to make them both more
internally consistent and also more flexible. These modifications allow
increased modularity and function namespace as well.

In most cases the changes are relatively minor, simply requiring the user to
drop the "," after the first argument.

=head1 DESCRIPTION

I'm running short on time here, so here goes. The following functions should
be modified into the new syntaxes shown below:

   $FH  = open $file, [@args];
   $ret = seek $FH $pos;                        # $FH->seek
   $ret = read $FH $scalar, $len, $offset;      # $FH->read
   $ret = tell $FH;                             # $FH->tell
   $ret = ioctl $FH $fun, $scalar;              # $FH->ioctl
   $ret = flock $FH $op;                        # $FH->flock
   $ret = fcntl $FH $func, $scalar;             # $FH->fcntl

   $DH  = open dir $dir;                        # dir->open
   $ret = seek $DH $pos;                        # $DH->seek
   $ret = tell $DH;                             # $DH->tell
   $ret = rewind $DH;                           # $DH->rewind

   $FH  = sysopen $file, $mode, $mask;          # "open sys"?
   $ret = sysread $FH $scalar, $len, $offset;   # $FH->sysread
   $ret = syswrite $FH $scalar, $len, $offset;  # $FH->syswrite
   $ret = sysseek $FH $pos, $whence;            # $FH->sysseek

   $SH  = open socket $dom, $type, $proto;      # socket->open
   $ret = connect $SH $name;                    # $SH->connect
   $ret = recv $SH $scalar, $len, $flags;       # $SH->recv
   $ret = setsockopt $SH $lev, $opt, $val;      # $SH->setsockopt
   $ret = socketshut $SH $how;                  # $SH->socketshut
   ($S1,$S2) = open socket $dom, $type, $proto, 2;   # (socketpair)

   ($R, $W) = pipe;                             # "open pipe"?
If you read your Camel, most of these changes simply involve dropping the ","
after the first argument to take advantage of the indirect object syntax.
This is not pure sugar. It buys us many important benefits:

1. Fewer functions. As RFC 14 starts to note, gone are the *dir functions,
as well as many specialized functions. They simply become member functions,
increasing function namespace. So, tell() can now be used on files,
directories, and web docs, without having to invent new function names.

2. Seamless integration with extended file types. Using this syntax, we can
now do this:

   use http;
   $WEB = open http "http://www.yahoo.com", POST;
   flock $WEB $op;

And $WEB->flock can die as "unimplemented", without this having to be
anywhere even remotely in core. It can exist as an external module, but to
the user it looks like the same function used here:

   $FH = open "/etc/motd" or die;
   flock $FH $op;

Even though it's vastly different. Users see a clean, coherent interface,
without having to worry about the nuts and bolts or calling special methods.

3. More consistent syntax. The most frequently-used IO function, print,
already uses an indirect object syntax for its handles. This RFC simply
follows the lead and extends this to other IO functions as well.

4. More modular design. This tightly integrates with the idea of moving
certain functions out of core, like socket(). Now you can simply say:

   use socket;     # import socket class
   $SH = open socket $dom, $type, $proto;
   recv $SH $scalar, $len, $offset;

Bingo. It walks and talks like an extensible open() thanks to the lovely
indirect object syntax, but all the functions are actually member functions
of $SH.

5. Less stuff in core. Hand in hand with the above, stuff like recv()
doesn't even have to be in the same zip code as core anymore. But it walks
and talks just like it was. *Plus*, there's less namespace pollution because
everything's a member function.
As such, no extensive checks like "Arg 1 not a socket handle" have to be
done. Running a bad function yields a standard error:

   recv $not_a_socket, $bad, $args;
   Can't find object method "recv" via "$not_a_socket" ...

See below for suggestions on a better error message.

6. Can extend use of the default filehandle. Thanks to this approach, we
RFC 120 (v4) Implicit counter in for statements, possibly $#.
=head1 TITLE

Implicit counter in for statements, possibly $#.

=head1 VERSION

  Maintainer: John McNamara [EMAIL PROTECTED]
  Date: 16 Aug 2000
  Last Modified: 25 Sep 2000
  Mailing List: [EMAIL PROTECTED]
  Number: 120
  Version: 4
  Status: Frozen
  Frozen since: v3

=head1 ABSTRACT

The syntax of the Perl style C<for> statement could be augmented by the
introduction of an implicit counter variable. The deprecated variable C<$#>
could be used for this purpose due to its mnemonic association with
C<$#array>.

Other alternatives are also proposed: an explicit counter returned by a
function; an explicit counter defined after foreach; an explicit counter
defined by a scoping statement.

=head1 DESCRIPTION

The use of C<for> and C<foreach> statements in conjunction with the range
operator, C<..>, is generally seen as good idiomatic Perl:

    @array = qw(sun moon stars rain);

    foreach $item (@array) {
        print $item, "\n";
    }

as opposed to the "endearing attachment to C" style:

    for ($i = 0; $i <= $#array; $i++) {
        print $array[$i], "\n";
    }

In particular, the foreach statement provides a useful level of abstraction
when iterating over an array of objects:

    foreach $object (@array) {
        $object->getline;
        $object->parseline;
        $object->printline;
    }

However, the abstraction breaks down as soon as there is a need to access the
index as well as the variable:

    for ($i = 0; $i <= $#array; $i++) {  # Note
        $array[$i]->index = $i;
        $array[$i]->getline;
        $array[$i]->parseline;
        $array[$i]->printline;
    }

    # Note - same applies to: foreach $i (0..$#array)

Here we are dealing with array variables and indexes instead of objects.

The addition of an implicit counter variable in C<for> statements would lead
to a more elegant syntax. It is proposed the deprecated variable C<$#> should
be used for this purpose due to its mnemonic association with C<$#array>.
For example:

    foreach $item (@array) {
        print $item, " is at index ", $#, "\n";
    }

=head1 ALTERNATIVE METHODS

Following discussion of this proposal on perl6-language-flow the following
suggestions were made:

=head2 Alternative 1 : Explicit counter returned by a function

This was proposed by Mike Pastore who suggested reusing pos() and by Hildo
Biersma who suggested using position():

    foreach $item (@array) {
        print $item, " is at index ", pos(@array), "\n";
    }

    # or:

    foreach $item (@array) {
        $index = some_counter_function();
        print $item, " is at index ", $index, "\n";
    }

=head2 Alternative 2 : Explicit counter defined after foreach

This was proposed by Chris Madsen and Tim Jenness; Jonathan Scott Duff made a
similar pythonesque suggestion:

    foreach $item, $index (@array) {
        print $item, " is at index ", $index, "\n";
    }

Glenn Linderman added this could also be used for hashes:

    foreach $item $key ( %hash ) {
        print "$item is indexed by $key\n";
    }

Ariel Scolnicov suggested a variation on this through an extension of
C<each()>:

    foreach (($item, $index) = each(@array)) {
        print $item, " is at index ", $index, "\n";
    }

With this in mind Johan Vromans suggested the use of C<keys()> and
C<values()> on arrays.

A variation on this is an explicit counter after C<@array>. This was alluded
to by Jonathan Scott Duff:

    foreach $item (@array) $index {
        print $item, " is at index ", $index, "\n";
    }

=head2 Alternative 3 : Explicit counter defined by a scoping statement

This was proposed by Nathan Torkington. This behaves somewhat similarly to
Tie::Counter.

    foreach $item (@array) {
        my $index : static = 0; # initialized each time foreach loop starts
        print "$item is at index $index\n";
        $index++;
    }

    # or:

    foreach $item (@array) {
        my $index : counter = 0; # initialized to 0 first time
                                 # incremented by 1 subsequently
        print "$item is at index $index\n";
    }

=head1 IMPLEMENTATION

There was no discussion about how this might be implemented.
It was pointed out by more than one person that it would inevitably incur an
overhead.

=head1 REFERENCES

perlvar

Alex Rhomberg proposed an implicit counter variable on clpm:
http://x53.deja.com/getdoc.xp?AN=557218804&fmt=text and
http://x52.deja.com/threadmsg_ct.xp?AN=580369190.1&fmt=text

Craig Berry suggested C<$#>:
http://x52.deja.com/threadmsg_ct.xp?AN=580403316.1&fmt=text
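
For comparison with the proposals in this RFC, here is a small sketch of two
ways current Perl 5 code can get at both the item and its index. It is an
illustration only, not part of the proposal; it assumes Perl 5.12 or later
for C<each> on arrays.

```perl
use 5.012;    # each() on arrays needs Perl 5.12 or later
use strict;
use warnings;

my @array = qw(sun moon stars rain);

# Index-based loop: the index is primary and the item is looked up.
my @report;
for my $index (0 .. $#array) {
    push @report, "$array[$index] is at index $index";
}

# each() on an array returns (index, value) pairs, close in spirit to
# the "foreach $item, $index (@array)" suggestion in Alternative 2.
my @pairs;
while (my ($index, $item) = each @array) {
    push @pairs, "$item is at index $index";
}

print "$_\n" for @report;
```

Both loops build the same list of strings; the second keeps the item as a
named variable at the cost of an explicit C<while>.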
RFC 321 (v1) Common Callback API for all AIO calls.
=head1 TITLE

Common Callback API for all AIO calls.

=head1 VERSION

  Maintainer: Uri Guttman [EMAIL PROTECTED]
  Date: 25 Sep 2000
  Mailing List: [EMAIL PROTECTED]
  Number: 321
  Version: 1
  Status: Developing

=head1 ABSTRACT

This RFC addresses the way callbacks are requested in the Advanced I/O (AIO)
system and how they get called. The goal is to have a common callback style
across the board.

=head1 DESCRIPTION

There are several places in Perl where callbacks are needed. These include
socket I/O events, asynchronous file I/O, signals, timers, plain events, etc.
This RFC proposes a common API for requesting those callbacks in order to
keep this consistent in all those places.

=head1 IMPLEMENTATION

A callback can currently be registered in %SIG with a code ref, as Perl/Tk
and Event.pm do. But there are times when you want an object oriented
callback, and that requires an object and a method. Event.pm handles this
with an anon list instead of the code ref. Its first element is the object
and the second is the method.

A solution which simplifies this and makes callback code more consistent is
to allow either a code ref or an object as the single primary argument to the
callback API. The method called with the object defaults to a name that is
appropriate for the type of callback. That method can be overridden with an
optional argument. For example, a read event would default to a method named
'readable', and a socket connect event would default to 'connected'.
Here are two possible APIs, the first based on I/O handle objects from RFC 14
and the second on the AIO syntax:

	$obj = bless {}, 'Foo' ;

	# these are I/O handle object based calls

	$fh->read_event( 'cb' => $obj ) ;

	$fh->write_event( 'cb' => $obj, 'method' => 'my_write_method' ) ;

	# this is an AIO class call

	AIO::read_event( 'fh' => $fh, 'cb' => \&my_callback ) ;

	sub my_callback {
		my ( $fh ) = @_ ;
		print "i can read $fh from package main::\n" ;
	}

	package Foo ;

	sub readable {
		my ( $self, $fh ) = @_ ;
		print "i can read $fh\n" ;
	}

	sub my_write_method {
		my ( $self, $fh ) = @_ ;
		print "i can write $fh\n" ;
	}

Most callbacks will also allow an optional timeout, which would use the
'timed_out' method by default. You can override that method name with the
'timeout_method' argument.

	# 10 second timeout

	AIO::read_event( 'fh' => $fh, 'cb' => $obj, 'timeout' => 10 ) ;

	package Foo ;

	sub readable {
		my ( $self, $fh ) = @_ ;
		print "i can read $fh\n" ;
	}

	sub timed_out {
		my ( $self, $fh ) = @_ ;
		print "$fh has had no data in a while\n" ;
	}

=head1 IMPACT

None.

=head1 UNKNOWNS

None.

=head1 REFERENCES

Event.pm: XS based event loop module.

RFC 14: Modify open() to support FileObjects and Extensibility

RFC 47: Universal Asynchronous I/O

RFC 60: Safe Signals

RFC 86: IPC Mailboxes for Threads and Signals

RFC 87: Timers and Timeouts
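
The dispatch rule described in the IMPLEMENTATION section (code ref or
object, with a per-event default method that an optional argument overrides)
can be sketched in plain Perl 5. The function C<dispatch_callback> and its
C<default_method> argument are hypothetical names used only for this
illustration; the RFC does not define how the AIO system would do this
internally.

```perl
use strict;
use warnings;

# Hypothetical dispatcher: 'cb' may be a code ref or an object. For an
# object, call 'method' if given, otherwise the event type's default.
sub dispatch_callback {
    my (%arg) = @_;
    my ($cb, $fh) = @arg{qw(cb fh)};

    return $cb->($fh) if ref($cb) eq 'CODE';

    my $method = $arg{method} // $arg{default_method};
    return $cb->$method($fh);
}

package Foo;

sub readable        { my ($self, $fh) = @_; return "i can read $fh" }
sub my_write_method { my ($self, $fh) = @_; return "i can write $fh" }

package main;

my $obj = bless {}, 'Foo';

# A read event falls back to the default method name 'readable'.
my $by_default = dispatch_callback(
    cb => $obj, fh => 'FH', default_method => 'readable');

# A write event with the method name overridden by the caller.
my $by_override = dispatch_callback(
    cb => $obj, fh => 'FH', default_method => 'writable',
    method => 'my_write_method');

# A plain code ref is simply called with the handle.
my $by_coderef = dispatch_callback(
    cb => sub { "code ref got $_[0]" }, fh => 'FH');

print "$_\n" for $by_default, $by_override, $by_coderef;
```

Keeping the default method name with the event type, rather than with the
object, is what lets one object serve several kinds of event without extra
registration arguments.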
RFC 101 (v3) Apache-like Event and Dispatch Handlers
easily chain classes and methods together with a couple key benefits over an
inline C<||>:

   1. Each handler can partially handle the request, but still
      return undef, deferring to the next one in line.

   2. The handlers can be reordered internally at-will without
      the main C<open http> code having to be redone.

   3. Different class open() methods can use internal rules,
      such as "only open .com URLs", without you having to
      put checks for this all over the place in the top-level
      program.

Note that C<open()> is the name of the method called on each class because
that is the name of the method called on the C<http> handler. If:

   http->bob(@stuff);

was called, then C<MyHTTP::bob> and C<LWP::UserAgent::bob> would be
attempted, in that order.

=head2 Removing Handlers

In addition to handlers being added, they need to be removed as well. This is
where C<no handler> comes in:

   no handler 'http' => 'MyHTTP';   # remove MyHTTP from list
   no handler 'http';               # remove http handler

The first example removes C<MyHTTP> from the list of classes used by the
C<http> handler. The second syntax removes the C<http> handler entirely,
meaning that this call:

   $fo = open http "http://www.yahoo.com";

will result in the familiar error:

   Can't locate object method "open" via package "http"

This should obey blocks as well (like C<strict>), allowing you to say:

   {
      # force LWP::UserAgent to be used
      no handler 'http' => 'MyHTTP';
      $fo = open http "http://www.yahoo.com";
   }
   $fo2 = open http "https://www.etrade.com";

=head2 Automatic Handler Registration and Deregistration

When a class is imported, it should be able to automatically register as a
member of a certain C<handler>. For example, the above code would be better
written as:

   use MyHTTP;             # these register as 'http'
   use LWP::UserAgent;     # handlers automatically
   $fo = open http "http://www.yahoo.com";

This means that there needs to be some mechanism for a module to execute the
equivalent of a 'use handler' statement, but have it take effect in the
package C<main>.
The easiest way, it seems, is to simply qualify the full package name you
want to affect:

   package MyHTTP;
   use handler 'main::http' => 'MyHTTP';

This borders on scary action-at-a-distance, though, and should be used with
care.

=head1 IMPLEMENTATION

A complete Perl 5 implementation of this can be found as Class::Handler

   http://www.perl.com/CPAN/authors/id/N/NW/NWIGER/Class-Handler-1.03.tar.gz

The Perl 5 implementation uses two functions, C<handler> and C<nohandler>,
instead of the pragmatic style proposed in the RFC. This style may be more
appropriate, depending how these are used. One problem with pragmas is that
they are compile-time-only, meaning that dynamically changing handler lists
is tricky to say the least.

A module may remain the best implementation for this; the only problems are
with speed (since the Perl 5 version requires AUTOLOAD) and with using this
mechanism for core methods (like the new C<open> from RFC 14).

=head1 REFERENCES

RFC 14: Modify open() to support FileObjects and Extensibility

RFC 8: The AUTOLOAD subroutine should be able to decline a request

http://www.mail-archive.com/perl6-language-io@perl.org/msg00086.html
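
The chaining behaviour described in this RFC (try each registered class in
order, and let a handler decline by returning undef) can be sketched without
AUTOLOAD or pragmas. This is only an illustration under stated assumptions:
the C<%handlers> registry, the C<call_handler> function and the C<Fallback>
class are hypothetical and are not the Class::Handler interface.

```perl
use strict;
use warnings;

# Hypothetical registry: handler name => ordered list of classes.
my %handlers = (http => [ 'MyHTTP', 'Fallback' ]);

# Try $method on each registered class in order. The first defined
# return value wins; undef defers to the next class in line.
sub call_handler {
    my ($name, $method, @args) = @_;
    for my $class (@{ $handlers{$name} || [] }) {
        next unless $class->can($method);
        my $result = $class->$method(@args);
        return $result if defined $result;
    }
    return;
}

package MyHTTP;

# Internal rule: only handle .com URLs, decline everything else.
sub open {
    my ($class, $url) = @_;
    return $url =~ /\.com\b/ ? "MyHTTP:$url" : undef;
}

package Fallback;

sub open {
    my ($class, $url) = @_;
    return "Fallback:$url";
}

package main;

my $com = call_handler(http => 'open', 'http://www.yahoo.com');
my $org = call_handler(http => 'open', 'http://www.perl.org');
print "$com\n$org\n";
```

Reordering or removing entries in the registry changes which class answers,
while the calling code stays the same, which is the benefit the RFC claims
over an inline C<||> chain.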
RFC 308 (v1) Ban Perl hooks into regexes
This and other RFCs are available on the web at
  http://dev.perl.org/rfc/

=head1 TITLE

Ban Perl hooks into regexes

=head1 VERSION

  Maintainer: Simon Cozens [EMAIL PROTECTED]
  Date: 25 Sep 2000
  Mailing List: [EMAIL PROTECTED]
  Number: 308
  Version: 1
  Status: Developing

=head1 ABSTRACT

Remove C<?{ code }>, C<??{ code }> and friends.

=head1 DESCRIPTION

The regular expression engine may well be rewritten from scratch or
borrowed from somewhere else. One of the scarier things we've seen
recently is that Perl's engine casts back its Kraken tentacles into Perl
and executes Perl code. This is spooky, tangled, and incestuous.
(Although admittedly fun.)

It would be preferable to keep the regular expression engine as
self-contained as possible, if nothing else to enable it to be used
either outside Perl or inside standalone translated Perl programs
without a Perl runtime.

To do this, we'll have to remove the bits of the engine that call Perl
code. In short: C<?{ code }> and C<??{ code }> must die.

=head1 IMPLEMENTATION

It's more of an unimplementation really.

=head1 REFERENCES

None.
RFC 317 (v1) Access to optimisation information for regular expressions
This and other RFCs are available on the web at
  http://dev.perl.org/rfc/

=head1 TITLE

Access to optimisation information for regular expressions

=head1 VERSION

  Maintainer: Hugo van der Sanden ([EMAIL PROTECTED])
  Date: 25 September 2000
  Mailing List: [EMAIL PROTECTED]
  Number: 317
  Version: 1
  Status: Developing

=head1 ABSTRACT

Currently you can see optimisation information for a regexp only by
running with -Dr in a debugging perl and looking at STDERR. There should
be an interface that allows us to read this information programmatically
and possibly to alter it.

=head1 DESCRIPTION

At its core, the regular expression matcher knows how to check whether a
pattern matches a string starting at a particular location. When the
regular expression is compiled, perl may also look for optimisation
information that can be used to rule out some or all of the possible
starting locations in advance.

Currently you can find out about the optimisation information captured
for a particular regexp only in a perl built with DEBUGGING, by turning
on -Dr:

  % perl -Dr -e 'qr{test.*pattern}'
  Compiling REx `test.*pattern'
  size 8 first at 1
  rarest char p at 0
  rarest char s at 2
     1: EXACT <test>(3)
     3: STAR(5)
     4:   REG_ANY(0)
     5: EXACT <pattern>(8)
     8: END(0)
  anchored `test' at 0 floating `pattern' at 4..2147483647 (checking floating) minlen 11
  Omitting $` $& $' support.

  EXECUTING...

  Freeing REx: `test.*pattern'
  %

For some purposes it would help to be able to get at this information
programmatically: the test suite could take advantage of this (to test
that optimisations occur as expected), and it could also be useful for
enhanced development tools, such as a graphical regexp debugger.

Additionally, there are times when the programmer is able to supply
optimisation information that the regexp engine cannot discover for
itself. While we could consider making it possible to modify these
values, it is important to remember that these are only hints: the
regexp engine is free to ignore them. So there is a danger that people
will misuse writable optimisation information to move part of the logic
out of the regexp, and then blame us when it breaks.

Suggested example usage:

  % perl -wl
  use re;
  $a = qr{test.*pattern};
  print join ':', $a->fixed_string, $a->floating_string, $a->minlen;
  __END__
  test:pattern:11
  %

.. but perhaps a single new method returning a hashref would be cleaner
and more extensible:

  $opt = $a->optimisation;
  print join ':', @$opt{qw/ fixed_string floating_string minlen /};

=head1 IMPLEMENTATION

Straightforward: add interface functions within the perl core to give
access to read and/or write the optimisation values; add methods in
re.pm that use XS code to reach the internal functions.

=head1 REFERENCES

Prompted by discussion of RFC 72:

RFC 72: Variable-length lookbehind: the regexp engine should also go backward.
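For comparison, later Perl 5 releases (5.10 onwards) expose a small, read-only part of this information through C<re::regmust>. The sketch below uses that function rather than the C<fixed_string>/C<floating_string>/C<minlen> methods proposed in RFC 317, which remain hypothetical; it reports only the anchored and floating strings.

```perl
use strict;
use warnings;
use re qw(regmust);   # available in the core 're' module since Perl 5.10

my $re = qr{test.*pattern};

# regmust() returns the anchored and floating substrings recorded by
# the optimiser for this pattern; either value may be undef.
my ($anchored, $floating) = regmust($re);

printf "anchored: %s\n", defined $anchored ? $anchored : '(none)';
printf "floating: %s\n", defined $floating ? $floating : '(none)';
```

As the RFC cautions for its own interface, these values are hints about what the optimiser found, so code should not rely on them for correctness.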
RFC 160 (v2) Function-call named parameters (with compiler optimizations)
This and other RFCs are available on the web at
  http://dev.perl.org/rfc/

=head1 TITLE

Function-call named parameters (with compiler optimizations)

=head1 VERSION

  Maintainer: Michael Maraist [EMAIL PROTECTED]
  Date: 25 Aug 2000
  Last Modified: 25 Sep 2000
  Mailing List: [EMAIL PROTECTED]
  Number: 160
  Version: 2
  Status: Developing

=head1 CHANGES

Finalized various features by removing many of the options (greatly
simplifying the RFC). Unified the goals with those of RFC 176 and
RFC 273.

=head1 ABSTRACT

Function parameters and their positions can be ambiguous in
function-oriented programming. Hashes offer tremendous help in this
realm, except that error checking can be very tedious. Also, hashes, in
general, take a performance hit.

The goal is to enhance functionality / convenience / performance where
possible with regard to named parameters, with a minimum of changes,
and at the same time to keep this a completely optional and virtually
transparent process. The following is an in-depth analysis of various
ways of accomplishing these goals.

=head1 DESCRIPTION

The current method of parameter prototypes only fulfills a tiny niche,
which is mainly to offer compile-time checking and to disambiguate
context ( as in sub foo($) { }, or sub foo($) { } ). No support,
however, is given to hashes, even though they are one of perl's greatest
strengths. We see them pop up in parameterized function calls all over
the place (CGI, tk, SQL wrapper functions, etc). As above, however, it
is left to the coder to check the existence of required parameters,
since in this realm, the current prototypes are of no help. It should
not be much additional work to provide an extension to prototypes that
allows the definition of hashes.

The following is a complex example of robust code:

   #!/usr/bin/perl -w
   use strict;

   # IN: hash:
   #   a => '...'  # req
   #   b => '...'  # req, defined
   #   c => '...'  # req, 0 <= c <= MAX_C
   #   d => '..'   # opt
   #   e => '..'   # opt
   #   f => '..'   # opt
   # OUT: xxx
   sub foo {
     my $self = shift;
     my %args = @_;

     # Requires $a
     my $a;
     die "No a provided"
       unless exists $args{a};
     $a = $args{a};

     # Requires non-null $b
     my $b;
     die "invalid b"
       unless exists $args{b} && defined ($b = $args{b});

     # Requires non-null and bounded $c
     my $c;
     die "Invalid c"
       unless exists $args{c} && defined ($c = $args{c})
           && ($c >= 0 && $c <= $MAX_C);

     my ( $d, $e, $f ) = @args{ qw( d e f ) };
     ...
   } # end foo

Becomes:

   sub foo($%) : method required_fields(a b c) fields(d e f) doc(<<EOS) {
   # IN: hash:
   #   a => '...'  # req; Do some A
   #   b => '...'  # req, defined; Do some B
   #   c => '...'  # req, 0 <= c <= MAX_C; Do some C
   #   d => '..'   # opt; Do some D
   #   e => '..'   # opt; Do some E
   #   f => '..'   # opt; Do some F
   # OUT: xxx
   EOS
     my $self = shift;
     my %args : fields(a b c d e f) = @_;
     # produce optimized hash that is already pre-allocated at compile-time.

     # Requires non-null $args{b}
     die "invalid b"
       unless defined $args{b};

     # Requires non-null and bounded $args{c}
     die "invalid c"
       unless defined $args{c} && ($args{c} >= 0 && $args{c} <= $MAX_C);
     ...
   } # end foo

   $obj->foo( c => 3, b => 2, f => 8, a => 1 );
   # Note the out-of-order arguments, and the mixture of optional fields

   foo( $obj, a => 1, b => 2, c => 3 ); # still totally legal
   foo( a => 1, b => 2 );   # compiler-error (invalid num-args)
   foo( 1,2,3,4,5,6,7);     # compiler-error, missing args a, b and c
   foo(a,1,b,2,c,3,$obj);   # compiler-error, missing args a, b and c
                            # (since they're offset by one)

   my @args = ( a => 1, b => 2, c => 3);
   $obj->foo( @args );      # checking deferred to run-time. Will be ok.

   my @bad_args = ( b => 8, e => 4 );
   $obj->foo( @bad_args );  # checking deferred to run-time. Will fail.

Essentially, perl's compiler can be put to use for hashed-function calls
in much the same way as pseudo-hashes work for structs/objects. Making
this a compile-time check would drastically reduce run-time errors in
code (that used hash-based parameters). It would also make the code both
more readable AND more efficient. For readability, perl can be queried
for the list of allowable options as well as general documentation. In
the above, the listing of input options would have been redundant, for
both the code-reader and the run-time query, but was provided for
completeness.

Note also that the above is compatible with the existing structure. In
fact, foo required the old-style prototype to distinguish the "self"
variable from the general-hash arguments. The use of the attribute
"method" was optional, and could be used in the auto-generation of a
$SELF variable. At the very least, it allows a run-time description of
what the first argument really is.

An important thing to note is that we're not changing the functionality
of execution. Perl subs still look and feel like old-style subs to the
user. They simply act as if
RFC 255 (v3) Fix iteration of nested hashes
This and other RFCs are available on the web at
  http://dev.perl.org/rfc/

=head1 TITLE

Fix iteration of nested hashes

=head1 VERSION

  Maintainer: Damian Conway [EMAIL PROTECTED]
  Date: 18 Sep 2000
  Last Modified: 25 Sep 2000
  Mailing List: [EMAIL PROTECTED]
  Number: 255
  Version: 3
  Status: Retracted

=head1 NOTE ON RETRACTION

The thread:

    http://www.mail-archive.com/perl6-language@perl.org/index.html#04190

points out some serious problems that the proposal did not address. As I
do not have time to find/invent good solutions, I am forced to withdraw
the proposal.

Anyone wishing to take up the cudgels against this annoying problem has
my encouragement to pick whatever they like from the bones of this
document.

=head1 ABSTRACT

This RFC proposes that the internal cursor iterated by the C<each>
function be stored in the pad of the block containing the C<each>,
rather than being stored within the hash being iterated.

=head1 DESCRIPTION

Currently, nesting two C<each> iterations on the same hash leads to
unexpected behaviour, because both C<each>s advance the same internal
cursor within the hash. For example:

    %desc = ( blue => "moon", green => "egg", red => "Baron" );

    while ( my ($key1,$value1) = each %desc ) {
        while ( my ($key2,$value2) = each %desc ) {
            print "$value2 is not $key1\n"
                unless $key1 eq $key2;
        }
    }
    print "(finished)\n";

It is proposed that each C<each> maintain its own cursor (stored in the
pad of the block containing it) so that the above example DWIMs.

=head1 MIGRATION ISSUES

Minimal. No-one nests iterators now because it doesn't work.

Usages such as:

    $x = each %hash;
    $y = each %hash;
    @z = each %hash;

would change their behaviour, but could be translated if p52p6 defined:

    sub p5_each(\%) { each %{$_[0]} }

and globally replaced each Perl 5 C<each> by C<p5_each>.

There would not (necessarily) be any effect on the use of FIRSTKEY and
NEXTKEY in tied hashes, since the compiler could still determine which
should be called. However, tied hashes that use an internal cursor might
behave differently, if nested.

=head1 IMPLEMENTATION

Store the cursor in the pad of the block in which the C<each> is
defined, rather than within the hash.

=head1 REFERENCES

RFC 136: (Implementation of hash iterators) suggests separate iterators
for C<each> and C<keys>/C<values>.
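Until something like this proposal exists, the usual Perl 5 workaround for the nested iteration in RFC 255 is to iterate over C<keys> instead of C<each>, since each C<keys> call produces its own list. The sketch below is a workaround, not the per-block cursor the RFC proposes; the C<@lines> array and the sorting are additions made here so that the output order is predictable.

```perl
use strict;
use warnings;

my %desc = ( blue => "moon", green => "egg", red => "Baron" );

my @lines;
# keys() builds an independent list for each loop, so the two loops
# no longer compete for the hash's single internal cursor.
for my $key1 (sort keys %desc) {
    for my $key2 (sort keys %desc) {
        next if $key1 eq $key2;
        push @lines, "$desc{$key2} is not $key1";
    }
}

print "$_\n" for @lines;
print "(finished)\n";
```

The trade-off is memory: each C<keys> call materialises the full key list, which is exactly what C<each> is meant to avoid on large hashes.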
RFC 320 (v1) Allow grouping of -X file tests and add C<filetest> builtin
This and other RFCs are available on the web at
  http://dev.perl.org/rfc/

=head1 TITLE

Allow grouping of -X file tests and add C<filetest> builtin

=head1 VERSION

  Maintainer: Nathan Wiger [EMAIL PROTECTED]
  Date: 25 Sep 2000
  Mailing List: [EMAIL PROTECTED]
  Number: 320
  Status: Developing

=head1 ABSTRACT

Currently, file tests cannot be grouped, resulting in very long
expressions when one wants to check that something is a readable,
writeable, executable directory:

   if ( -d $file && -r $file && -w $file && -x $file ) { ... }

It would be really nice if these could be grouped instead:

   if ( -drwx $file ) { ... }

Notice how much easier this is to read and write.

=head1 DESCRIPTION

=head2 File Test Grouping

See above. Multiple file tests, when grouped, should be ANDed together.
This RFC does not propose a way to OR them, since usage like this:

   if ( -d $file || -r $file || -w $file || -x $file ) { ... }

is highly uncommon, to say the least. Notice this has the nice side
effect of eliminating the need for C<_> in many cases, since this:

   if ( -d $file && -r _ && -w _ && -x _ ) { ... }

can simply be written as a single grouped file test, as shown above.

If you need to check for more complex logic, you still have to do that
separately:

   if ( -drwx $file and ! -h $file ) { ... }

This is the simplest and also probably the clearest way to implement
this.

=head2 New C<filetest> Builtin

This RFC also proposes a new C<filetest> builtin that is actually what
is used for these tests. The C<-[a-zA-Z]+> form is simply a shortcut to
this builtin, just like C<< <> >> is a shortcut to C<readline>. So:

   if ( -rwdx $file ) { ... }

is really just a shortcut to the C<filetest> builtin:

   if ( filetest $file, 'rwdx' ) { ... }

Either form could be used, depending on the user's preferences (just
like C<readline>).

=head1 IMPLEMENTATION

This would involve making C<-[a-zA-Z]+> a special token in all contexts,
serving as a shortcut for the C<filetest> builtin.

=head1 MIGRATION

There is a subtle trap if you are negating subroutines. You might write:

   $result = -drwx $file;

and expect this to be parsed like this:

   $result = - drwx($file);

However, usage such as this is exceedingly unlikely, and can simply be
resolved by the p52p6 translator looking for C<-([a-zA-Z]{2,})> and
replacing it with C<- $1>, since injecting a single space will break up
the token.

=head1 REFERENCES

This grew out of a discussion on RFC 290 between myself, John Allen,
Clayton Scott, Bart Lateur, and others.
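The C<filetest $file, 'rwdx'> form from RFC 320 can be approximated today as an ordinary Perl 5 subroutine. The sketch below is an assumption-laden illustration: the C<%FILE_TEST> table is a name invented here, only six test letters are covered, and it is a user-level function rather than the builtin the RFC asks for.

```perl
use strict;
use warnings;

# Map each supported single-letter test onto the Perl 5 operator.
my %FILE_TEST = (
    e => sub { -e $_[0] },
    f => sub { -f $_[0] },
    d => sub { -d $_[0] },
    r => sub { -r $_[0] },
    w => sub { -w $_[0] },
    x => sub { -x $_[0] },
);

# Hypothetical stand-in for the proposed builtin: every listed test
# must pass (the tests are ANDed together, as the RFC specifies).
sub filetest {
    my ($file, $tests) = @_;
    for my $letter (split //, $tests) {
        my $check = $FILE_TEST{$letter}
            or die "Unsupported file test '-$letter'\n";
        return 0 unless $check->($file);
    }
    return 1;
}

print filetest('.', 'de') ? "current directory exists\n" : "no such directory\n";
```

Perl 5.10 and later also accept stacked file tests such as C<-d -r -w -x $file>, which covers the ANDed case with a slightly different spelling from the C<-drwx> token proposed here.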
RFC 272 (v2) Arrays: transpose()
This and other RFCs are available on the web at
  http://dev.perl.org/rfc/

=head1 TITLE

Arrays: transpose()

=head1 VERSION

  Maintainer: Jeremy Howard [EMAIL PROTECTED]
  Date: 22 Sep 2000
  Last Modified: 24 Sep 2000
  Mailing List: [EMAIL PROTECTED]
  Number: 272
  Version: 2
  Status: Frozen

=head1 DISCUSSION

This RFC was modified to incorporate the functionality of PDL's xchg()
and mv(), which are useful for acting on arbitrary dimensions of
multidimensional arrays.

Implementing aliasing was discussed in more detail than for RFCs 90, 91,
and 148, including suggestions to learn from PDL's implementation
(outlined in RFC 116), which is more sophisticated than simply keeping a
list of mapped indices, instead actually storing information about the
specific operations that have occurred.

=head1 ABSTRACT

It is proposed that a new function C<transpose> be added to Perl.
C<transpose($dim1, $dim2, @list)> would return @list with $dim1 and
$dim2 switched. C<transpose(\@order, @list)> would return @list with
dimensions in the order specified by @order. C<transpose> would return
an alias into the original list, not a copy of the elements.

=head1 DESCRIPTION

=head2 Swapping Dimensions

It is proposed that Perl implement a function called C<transpose> that
transposes two dimensions of an array, and is evaluated lazily.
L<RFC 202> gives an overview of the proposed multidimensional arrays
that C<transpose> works with. For instance:

  @a = ([1,2],[3,4],[5,6]);
  @transposed_list = transpose(0,1,@a);   # ([1,3,5],[2,4,6])

This is different to C<reshape> (see L<RFC 148>) which does not reorder
its elements:

  @a = ([1,2],[3,4],[5,6]);
  @reshaped_list = reshape([3,2],@a);   # ([1,2,3],[4,5,6])

C<transpose> is its own inverse:

  @transposed_list = transpose(0,1,@a);   # ([1,3,5],[2,4,6])
  @orig_list = transpose(0,1,@transposed_list);   # ([1,2],[3,4],[5,6])
  @a == @orig_list;   # true

If C<transpose> refers to a dimension that does not exist, empty
dimensions autovivify as necessary:

  @row_vector = (1,2,3,4);
  @col_vector = transpose(0,1,@row_vector);   # ([1],[2],[3],[4])

=head2 Reordering Dimensions

An alternative form of C<transpose> uses the first argument as a list
ref to specify a new order for the dimensions:

  transpose [0,3,4,1,2], @arr;

If some dimensions are not specified in the first argument, those
dimensions are left in their current order:

  # Where @arr is a rank 5 array...
  transpose ([3], @arr) == transpose ([3,0,1,2,4], @arr);
  transpose ([0,3], @arr) == transpose ([0,3,1,2,4], @arr);

This syntax allows multidimensional arrays to be reduced along any
dimension:

  @sumover_1st_dim = reduce ^_ + ^_, @arr[ 0..; |i; * ];
  @sumover_3rd_dim = reduce ^_ + ^_, transpose([3],@arr)[0..; |i; * ];

Note that C<reduce> is from RFC 76, and C<|i> is from RFC 207.

=head2 Aliasing

C<transpose> does not make a copy of the elements of its arguments; it
simply creates an alias:

  @row_vector = (1,2,3,4);
  @col_vector = transpose(0,1,@row_vector);   # ([1],[2],[3],[4])
  $col_vector[[0,1]] = 0;
  @row_vector == (1,0,3,4);   # True

=head2 Optional Extra: Dimension Insert

To move a dimension and insert it before some other dimension, the
following syntax may be used:

  transpose ({3=>2}, @arr) == transpose ([0,1,3,2,4], @arr);

which inserts dimension 3 in front of dimension 2.

=head1 IMPLEMENTATION

RFC 90 discusses possible approaches to implementing aliasing.

=head1 REFERENCES

RFC 76: Builtin: reduce

RFC 90: Arrays: merge() and unmerge()

RFC 148: Arrays: Add reshape() for multi-dimensional array reshaping

RFC 207: Arrays: Efficient Array Loops
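To make the two-dimensional case of RFC 272 concrete, here is a minimal Perl 5 sketch of swapping dimensions 0 and 1 of a list of array references. It deliberately differs from the proposal: C<transpose2d> is a hypothetical helper named here for illustration, it copies the elements instead of returning a lazy alias, and it handles only two dimensions.

```perl
use strict;
use warnings;

# Copying two-dimensional transpose. This is NOT the lazy alias that
# the RFC proposes; changes to the result do not affect the input.
sub transpose2d {
    my @rows = @_;
    my @out;
    for my $i (0 .. $#rows) {
        for my $j (0 .. $#{ $rows[$i] }) {
            $out[$j][$i] = $rows[$i][$j];
        }
    }
    return @out;
}

my @a = ([1,2],[3,4],[5,6]);
my @t = transpose2d(@a);
print join(' ', map { '[' . join(',', @$_) . ']' } @t), "\n";   # [1,3,5] [2,4,6]
```

Applying C<transpose2d> twice gives back a copy of the original rows, matching the "its own inverse" property described above.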
RFC 152 (v2) Replace invocant in @_ with self() builtin
This and other RFCs are available on the web at
  http://dev.perl.org/rfc/

=head1 TITLE

Replace invocant in @_ with self() builtin

=head1 VERSION

  Maintainer: Nathan Wiger [EMAIL PROTECTED]
  Date: 24 Aug 2000
  Last Modified: 24 Sep 2000
  Mailing List: [EMAIL PROTECTED]
  Number: 152
  Version: 2
  Status: Frozen

=head1 ABSTRACT

Currently, the invocant is passed into a sub as the first element of @_,
leading to the familiar construct:

   my $self = shift;

However, this is a big PITA. In particular, if you support several
different calling forms (like CGI.pm), you have to check whether $_[0]
is a ref or class name, etc.

This RFC, therefore, proposes a new builtin called C<self()> which will
return the correct invocant information. This has the added advantage
that it is consistent with C<caller()>, C<want()>, C<ref()>, and other
context functions.

=head1 DESCRIPTION

=head2 Syntax

The new function C<self()> would be called in the following way:

   sub fullname {
       my $self = self;
       @_ ? $self->{STATE}->{fullname} = $_[0]
          : $self->{STATE}->{fullname};
   }

   sub my_junk {
       my $this = self;
       $this->fork_o_matic(@_);
   }

   # Or even...

   sub error {
       carp @_ if self->config('VerboseErrors');
   }

   sub uid {
       @_ ? self->{uid} = $_[0] : self->{uid};
   }

The return value of C<self()> would be similar to the current invocant
in $_[0], with increased flexibility. In particular, it can be called
anywhere and everywhere, not just within a method. Depending on the
context it's called in, the return value of C<self()> will be:

   1. A reference to the object, within an object method

   2. The name of the package, within a package

   3. Undef, if a sub is not called as a method

These different return values give us the ability to call C<self()>
anywhere within Perl 6 code:

   package MyPackage;

   # ... many other functions ...

   sub do_stuff {
       print "Hello, @_" if self->config('Yep');
   }

   self->do_stuff;           # MyPackage->do_stuff

   package main;

   my $mp = new MyPackage;
   $mp->config('Yep') = 1;
   $mp->do_stuff('Nate');    # prints "Hello, Nate"

In addition, having a routine called C<self()> has the major advantage
that it hides the internal magic and scoping from the user. Just like
using C<want()> instead of a special variable called C<$WANT>,
C<self()> makes using and comprehending contexts easy, simply changing
the Perl 5 rule:

   "The invocant is passed into subs as $_[0] in OO contexts"

to the simpler still:

   "The invocant is always gotten by calling self()"

This provides a consistent interface, since C<self()> can be called
anywhere, just like C<caller()>, C<want()>, and other context functions.

=head2 Arguments against C<use invocant>

This RFC was released prior to, and remains in opposition to, RFC 233,
which proposes a C<use invocant> pragma that provides the flexibility to
name the invocant anything you want.

As many have noted, Perl is already hard enough. C<use invocant> only
gives us multiple ways to do something without adding value, only
confusion, by promoting an inconsistent interface. Like providing a
means to rename C<@ARGV> and C<STDIN> because a person prefers C<@args>
and C<output>, C<use invocant> further complicates an issue which should
only be made easier.

The author of this RFC B<loves> Perl and loves its flexibility. However,
just like choosing a name for C<caller>, C<want>, C<print>, C<@ARGV>,
and so forth, we need to choose a name for C<self> as well to ease the
burden on the programmer. "Choosing an interface" does not amount to
"being un-Perlish" as some might purport to suggest. In fact, just the
opposite: we're decreasing the amount of time a user has to spend
decoding somebody else's invocant naming scheme by providing a very
Perlishly-named function. B<This makes things easier.>

If it is vital that the invocant must be named something specific, then
a person can always use a sub wrapper, tie, or a typeglob to rename it
appropriately. Actually, they don't even have to go to these extremes
since they can still do this:

   sub getdata {
       my $this = self;
       return $this->{DATA}->{$_[0]};
   }

(that is, assign to a custom variable) anywhere they want to.

Finally, the author would be more than happy to settle for the selection
of something different than C<self>, such as C<this()>, C<$SELF>, or
even C<$ME>. The main point is that we need to choose something, because
doing so makes the language more consistent and easier (combatting two
widespread criticisms of Perl).

=head1 IMPLEMENTATION

Replace the invocant usually included in $_[0] with C<self()>. Stop
passing the invocant in @_.

=head1 MIGRATION

Backwards compatibility is simple. Subs can simply have the expression:

   unshift @_, self if self;

added as the first line of the sub, since C<self()> will return undef if
not in an OO context.

=head1 REFERENCES

Critique of the C<use invocant> pragma: http://www.mai
RFC 279 (v1) my() syntax extensions and attribute declarations
This and other RFCs are available on the web at
  http://dev.perl.org/rfc/

=head1 TITLE

my() syntax extensions and attribute declarations

=head1 VERSION

  Maintainer: Nathan Wiger [EMAIL PROTECTED]
  Date: 24 Sep 2000
  Mailing List: [EMAIL PROTECTED]
  Number: 279
  Version: 1
  Status: Developing

=head1 ABSTRACT

This RFC fleshes out variable declarations with C<my>, and also proposes
a way to assign attributes without the need for a C<my> anywhere.

=head1 DESCRIPTION

Camel-3 shows some interesting hints of what's been proposed for C<my>
declarations:

   my type $var :attribute = $value;

And we all know that you can use C<my> to declare a group of variables:

   my($x, $y, $z);

Here are the issues:

   1. How do the two jibe together?

   2. Should it be possible to assign attributes to individual
      elements of hashes/arrays? (yes)

=head2 Cohesive C<my> syntax

This RFC proposes that you be able to group multiple variables of the
same type within parens:

   my int ($x, $y, $z);
   my int ($x :64bit, $y :32bit, $z);

It seems most logical that:

   1. The type will be the same across variables; this is common
      usage in other languages because it makes sense.

   2. The attributes will be different for different variables.

As such, multiple attributes can be assigned and grouped flexibly:

   my int ($x, $y, $z) :64bit;              # all are 64-bit
   my int ($x, $y, $z :unsigned) :64bit;    # plus $z is unsigned

Note that multiple types cannot be specified on the same line. To
declare variables of multiple types, you must use separate statements:

   my int ($x, $y, $z) :64bit;
   my string ($firstname, $lastname :long);

This is consistent with other languages and also makes parsing
realistic.

=head2 Assigning attributes to individual elements of hashes/arrays

This is potentially very useful. ":laccess", ":raccess", ":public",
":private", and others spring to mind as potential candidates for this.
This RFC proposes that in addition to attributes being assignable to a
whole entity:

   my int @a :64bit;       # makes each element a 64-bit int
   my string %h :long;     # each key/val is long string

they can also be declared on individual elements, without the need for
C<my>:

   $a[0] :32bit = get_val;            # 32-bit
   $r->{name} :private = "Nate";      # privatize single value
   $s->{VAL} :laccess('data') = "";   # lvalue autoaccessor

However, a problem arises in how to assign types to singular elements,
since this requires a C<my>:

   my int $a[0] :64bit;       # just makes that single element
                              # a lexically-scoped 64-bit int?
   my string $h{name} = "";   # cast $h{name} to string, rescope %h?

Currently, lexical scope has no meaning for individual elements of
hashes and arrays. However, assigning attributes and even types to
individual elements seems useful. There are two ways around this that I
see:

   1. On my'ing of an individual hash/array element, the
      entire hash/array is rescoped to the nearest block.

   2. Only the individual element is rescoped, similar
      to what happens when you do this:

         my $x = 5;
         {
            my $x = 10;
         }

Either of these solutions is acceptable, and they both have their pluses
and minuses. The second one seems more consistent, but is potentially
extremely difficult to implement.

=head1 IMPLEMENTATION

Hold on.

=head1 MIGRATION

None. This introduces a more flexible syntax but does not break old
ones.

=head1 REFERENCES

Camel for the C<my> syntax.
RFC 276 (v1) Localising Paren Counts in qr()s.
This and other RFCs are available on the web at
  http://dev.perl.org/rfc/

=head1 TITLE

Localising Paren Counts in qr()s.

=head1 VERSION

  Maintainer: Richard Proctor [EMAIL PROTECTED]
  Date: 24 Sep 2000
  Mailing List: [EMAIL PROTECTED]
  Number: 276
  Version: 1
  Status: Developing

=head1 ABSTRACT

The paren counts and backreferences should be localised in each qr(), to
prevent surprises when qr()s are used in combination.

=head1 DESCRIPTION

TomC's perlstorm #0040 has:

    Figure out way to do /$e1 $e2/ safely, where $e1 might have
    '(foo) \1' in it, and $e2 might have '(bar) \1' in it. Those
    won't work.

=head2 DISCUSSION

Me: If e1 and e2 are qr// type things, the answer might be to localise
the backref numbers in each qr// expression. Use of assignment in a
regex and named backrefs (RFC 112) would make this a lot safer.

Hugo: I think it is reasonable to ask whether the current handling of
qr{} subpatterns is correct:

  perl -wle '$a=qr/(a)\1/; $b=qr/(b).*\1/; /$a($b)/g and print join ":", $1, pos for "aabbac"'
  a:5

I'm tempted to suggest it isn't; that the paren count should be local to
each qr{}, so that the above prints 'bb:4'. I think that most people
currently construct their qr{} patterns as if they are going to be
handled in isolation, without regard to the context in which they are
embedded - why else do they override the embedder's flags if not to
achieve that?

The problem then becomes: do we provide a mechanism to access the nested
backreferences outside of the qr{} in which they were referenced, and if
so what syntax do we offer to achieve that? I don't have an answer to
the latter, which tempts me to answer 'no' to the former for all the
wrong reasons. I suspect (and suggest) that complication is the only
reason we don't currently have the behaviour I suggest the rest of the
semantics warrant - that backreferences are localised within a qr().

I lie: the other reason qr{} currently doesn't behave like that is that
when we interpolate a compiled regexp into a context that requires it be
recompiled, we currently ignore the compiled form and act only on the
original string. Perhaps this is also an insufficiently intelligent
thing to do.

MJD: Interpolated qr() items shouldn't be recompiled anyway. They should
be treated as subroutine calls. Unfortunately, this requires a reentrant
regex engine, which Perl doesn't have. But I think it's the right way to
go, and it would solve the backreference problem, as well as many other
related problems.

Me: You can access the nested backreferences outside of the qr{} in
which they were referenced by use of the named backref; see RFC 112.

=head2 AGREEMENTS

The paren count in each qr() is localised to that qr().

There is no way to access the nested backreferences outside of the qr()
by number; they may be accessed by name (see RFC 112).

The regex engine must be made re-entrant.

The regex compiler should not need to recompile qr()s when used as part
of another regex.

=head1 IMPLEMENTATION

The regex engine must be made re-entrant.

The expansion of variables in regexes must be driven by the regex
compiler (same problem as for RFCs 112, 166 ...).

=head1 REFERENCES

Perlstorm #0040 from TomC.

RFC 112: Assignment within a regex
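The C</$e1 $e2/> problem from perlstorm #0040 can be worked around in Perl 5.10 and later with relative backreferences, which name a group by its distance rather than its absolute number. This is an existing Perl 5 feature used here as a stand-in; it is not the localised paren counting that RFC 276 proposes, and the test strings are invented for illustration.

```perl
use strict;
use warnings;

# \g{-1} (Perl 5.10+) refers to the nearest preceding capture group,
# so each fragment stays correct wherever it is interpolated.
my $e1 = qr/(foo) \g{-1}/;
my $e2 = qr/(bar) \g{-1}/;

for my $text ("foo foo bar bar", "foo foo bar foo") {
    if ($text =~ /^$e1 $e2$/) {
        print "'$text' matches: \$1=$1 \$2=$2\n";
    }
    else {
        print "'$text' does not match\n";
    }
}
```

Note that the capture numbering seen from outside is still global (the second fragment's group is C<$2>, not C<$1>), which is the part of the problem that only the RFC's localisation would remove.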
RFC 280 (v1) Tweak POD's CE<lt>E<gt>
This and other RFCs are available on the web at
  http://dev.perl.org/rfc/

=head1 TITLE

Tweak POD's CE<lt>E<gt>

=head1 VERSION

  Maintainer: Simon Cozens [EMAIL PROTECTED]
  Date: 24 Sep 2000
  Mailing List: [EMAIL PROTECTED]
  Number: 280
  Version: 1
  Status: Developing

=head1 ABSTRACT

CE<lt>E<gt> is not as intuitive as it could be.

=head1 DESCRIPTION

In Perl 5.6.0, we altered the behaviour of the CE<lt>E<gt> construct in
POD, so that you could say C<CE<lt>E<lt> ... E<gt>E<gt>> to avoid the
problem that C<< $foo->bar >> would be ended by the arrow.

However, this isn't too Perlish, and there's an easier solution: give
CE<lt>E<gt> the same semantics as a single-quoted string with C<q//>.
That is:

=over 1

=item *

Do away with the need to escape stuff inside CE<lt>E<gt>, because that
stops you cutting and pasting the code.

=item *

Allow the use of alternate delimiters to avoid the arrow problem.

    C<$xyz>
    C/$foo->bar/

=back

=head1 IMPLEMENTATION

Just some tweaks to C<POD::Parser>, haha.

=head1 REFERENCES

None.
RFC 282 (v1) Open-ended slices
This and other RFCs are available on the web at
  http://dev.perl.org/rfc/

=head1 TITLE

Open-ended slices

=head1 VERSION

  Maintainer: Simon Cozens [EMAIL PROTECTED]
  Date: 24 Sep 2000
  Mailing List: [EMAIL PROTECTED]
  Number: 282
  Version: 1
  Status: Developing

=head1 ABSTRACT

The dreaded C<@array[$foo...]> rears its ugly head again.

=head1 DESCRIPTION

How many times have you wanted B<just> the last two return values from a
function? And how many times have you got frustrated that you can't work
out how many things there are in a list and you have to decant it to an
array:

    @thingy = function();
    for (@thingy[3..$#thingy]) {
        ...
    }

Horrible, isn't it? People want something better.

I thought about it last year or so, and produced a couple of patches. It
seemed then that the right syntax was not, for instance:

    (function())[3...-1]

because sometimes you want C<$x..$y> to return the empty list, but
actually:

    (function())[3...]

(Or C<[3..]>. It doesn't matter.) Someone else on Perl5-Porters wanted
this recently too, so it isn't just me.

=head1 IMPLEMENTATION

It's new syntax, so it isn't going to break anything, and I did produce
patches against 5.6, so it is possible. It's a question of adding
another rule to the grammar, which flags that the slice should be
computed at run time.

=head1 REFERENCES

None.
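For reference, this is how far plain Perl 5 gets without the new syntax from RFC 282. Negative subscripts already cover "the last two values", while "everything from index 3 onwards" still needs the list length, which the sketch hides inside a small helper. C<slice_from> and the sample C<function> are hypothetical names made up for this illustration.

```perl
use strict;
use warnings;

# Sample list-returning function, for illustration only.
sub function { return (10, 20, 30, 40, 50, 60) }

# Negative subscripts already give the last two values of a list.
my @last_two = (function())[-2, -1];

# Hypothetical helper: everything from index $from to the end.
# The length is only known once the list has been copied into @list.
sub slice_from {
    my ($from, @list) = @_;
    return @list[$from .. $#list];
}
my @tail = slice_from(3, function());

print "@last_two\n";   # 50 60
print "@tail\n";       # 40 50 60
```

When the start index is past the end of the list, the range C<$from .. $#list> is empty and the helper returns the empty list, which is the behaviour the RFC wants an open-ended slice to have.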
RFC 283 (v1) C<tr///> in array context should return a histogram
This and other RFCs are available on the web at
  http://dev.perl.org/rfc/

=head1 TITLE

C<tr///> in array context should return a histogram

=head1 VERSION

  Maintainer: Simon Cozens [EMAIL PROTECTED]
  Date: 24 Sep 2000
  Mailing List: [EMAIL PROTECTED]
  Number: 283
  Version: 1
  Status: Developing

=head1 ABSTRACT

C<tr///> in array context should return a histogram explaining the
number of matches for each letter in the pattern.

=head1 DESCRIPTION

This has been on the Perl 5 to-do list for ages and ages. The idea is
that when you're transliterating a bunch of things, you want to know how
many of each of them matched in your original string. For instance,
while C<tr/x//> will count the x's, C<tr/xy//> will count both x's and
y's - you don't know how many of each.

So, the proposal is that C<tr> in array context should return a hash,
like this:

    (%foo) = "xyzzy" =~ tr/xyz//
    # %foo is ( x => 1, y => 2, z => 2 );

=head1 IMPLEMENTATION

I posted a patch to Perl 5.6 to do this some time back; it's a very
simple matter of constructing the hash and incrementing the values every
time you do a transliteration of a character.

Of course, since we don't know what Perl 6's transliteration operator is
going to look like, it's hard to know how to implement an extension to
it...

=head1 REFERENCES

None.
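The histogram described in RFC 283 is easy to emulate in Perl 5 with an explicit loop, which also makes the expected result for the C<"xyzzy"> example easy to check. C<tr_histogram> is a hypothetical helper written for this sketch; it takes a plain list of characters and does not parse real C<tr///> ranges such as C<a-z>.

```perl
use strict;
use warnings;

# Hypothetical helper: count how often each listed character occurs.
sub tr_histogram {
    my ($string, $chars) = @_;
    my %count = map { $_ => 0 } split //, $chars;
    for my $ch (split //, $string) {
        $count{$ch}++ if exists $count{$ch};
    }
    return %count;
}

my %foo = tr_histogram("xyzzy", "xyz");
print join(', ', map { "$_ => $foo{$_}" } sort keys %foo), "\n";
# x => 1, y => 2, z => 2
```

The total of the values equals what a scalar C<tr/xyz//> on the same string would return, here 5.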
RFC 284 (v1) Change C<$SIG{__WARN__}> and C<$SIG{__DIE__}> to magic subs
This and other RFCs are available on the web at
  http://dev.perl.org/rfc/

=head1 TITLE

Change C<$SIG{__WARN__}> and C<$SIG{__DIE__}> to magic subs

=head1 VERSION

  Maintainer: Simon Cozens [EMAIL PROTECTED]
  Date: 24 Sep 2000
  Mailing List: [EMAIL PROTECTED]
  Number: 284
  Version: 1
  Status: Developing

=head1 ABSTRACT

It sounds really stoopid to say C<$SIG{__WARN__}> on a machine which
doesn't have signals.

=head1 DESCRIPTION

Perl 6 is going to be portable to all kinds of systems, just like Perl 5.
Some of those systems won't have signals, so it's time to question why
the warn and die hooks are implemented as signal handlers.

Instead, let's implement them as magic subroutines C<WARN> and C<DIE>
like C<BEGIN> and C<END>. This seems more consistent anyway. Well, to me.

=head1 IMPLEMENTATION

Call subroutines C<WARN> and C<DIE> instead of the signal handler
versions. Everything else stays the same.

=head1 REFERENCES

None.
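For reference, a short sketch of the Perl 5 spelling this RFC wants to replace. The message texts are invented; the point is only that both hooks live in C<%SIG> although no signal is delivered.

```perl
use strict;
use warnings;

my @seen;
{
    # The warn hook is installed like a signal handler.
    local $SIG{__WARN__} = sub { push @seen, $_[0] };
    warn "disk almost full\n";
}

my $died_with;
{
    # The die hook runs before the exception propagates to the eval.
    local $SIG{__DIE__} = sub { $died_with = $_[0] };
    eval { die "out of cheese\n" };
}

print "warned: $seen[0]";
print "died:   $died_with";
```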
RFC 287 (v1) Improve Perl Persistance
This and other RFCs are available on the web at
  http://dev.perl.org/rfc/

=head1 TITLE

Improve Perl Persistance

=head1 VERSION

  Maintainer: Adam Turoff [EMAIL PROTECTED]
  Date: 24 Sep 2000
  Mailing List: [EMAIL PROTECTED]
  Number: 287
  Version: 1
  Status: Developing

=head1 ABSTRACT

Many mechanisms exist to make perl code and data persistent. They should
be cleaned up, unified, and documented widely within the core
documentation.

=head1 DESCRIPTION

Tom Christiansen proposed this in his perl6storm message:

    =item perl6storm #0022

    make marshalling easy. core module? would this allow for easy
    persistence of data structures other than dbm files?
    general persistence is hard, right? can this be an attribute?

Python offers one way to make code/data persistent: the C<pickle>
interface. More complex serialization can be accomplished through the
'shelve' interface or DBM files. This capability is quite useful, widely
known and easily used.

Perl, by comparison, offers Data::Dumper, which can serialize Perl
objects that are rather asymmetrically reconstituted by using C<eval> or
C<do>. Perl also offers solid, simple interfaces into DBM and Berkeley DB
files, and offers a well known, low-level serialization mechanism.

CPAN offers many other serialization modules that are only slightly
different from Data::Dumper. This plethora of serialization mechanisms
confuses users and adds to code bloat when multiple modules each use
different serialization mechanisms that are all substantially similar.

Something similar to Python's C<pickle> interface should be added into
Perl as a builtin; this feature should have a symmetric "restore" builtin
(eg save()/restore(), freeze()/thaw(), dump()/undump()...).

Furthermore, Perl's low level serialization machinery (DBM, SDBM, GDBM,
Berkeley DB) should be unified into a single core module, where the
underlying DBM implementations are pluggable drivers, like DBI's DBD
infrastructure.

=head1 IMPLEMENTATION

First, the issue of adding builtin serialization functions needs to be
addressed. This is a language issue because serialization should be more
visible than it is today, and the best way to accomplish that is to
include this feature as a pair of builtin functions.

If this feature is implemented through a core module, that module might
best be presented as a pragmatic module.

Finally, although this proposal describes a simple matter of
programming, some of the issues (such as pluggable interfaces) are best
hashed out at a language-design level, so that they may be used
elsewhere, easily.

=head1 REFERENCES

Python Pocket Reference, Chapter 12

perl6storm
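As an illustration of the symmetric save/restore pair asked for above, here is a sketch using the Storable module's C<freeze> and C<thaw>. Storable is an assumption on my part, chosen because it ships with later Perl 5 releases; the RFC itself does not name it, and the data structure is invented.

```perl
use strict;
use warnings;
use Storable qw(freeze thaw);

# An invented nested structure to round-trip.
my %config = (name => 'camel', humps => [1, 2], nested => { ok => 1 });

my $frozen = freeze(\%config);   # serialize to a byte string
my $copy   = thaw($frozen);      # the symmetric restore

print "name=$copy->{name} humps=@{ $copy->{humps} } ok=$copy->{nested}{ok}\n";
```

Unlike the Data::Dumper plus C<eval> route, both directions here are ordinary function calls and no code is evaluated on restore.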
RFC 288 (v1) First-Class CGI Support
This and other RFCs are available on the web at
  http://dev.perl.org/rfc/

=head1 TITLE

First-Class CGI Support

=head1 VERSION

  Maintainer: Adam Turoff [EMAIL PROTECTED]
  Date: 24 Sep 2000
  Mailing List: [EMAIL PROTECTED]
  Number: 288
  Version: 1
  Status: Developing

=head1 ABSTRACT

Perl is frequently used in CGI environments. It should be as easy to
write CGI programs with perl as it is to write commandline text filters.

=head1 DESCRIPTION

Tom Christiansen proposed this in his perl6storm message:

    =item perl6storm #0025

    Make -T the default when operating in a CGI env. That is, taintmode.
    Will this kill us? Close to it. Tough. Insecurity through idiocy is
    a problem. Make them *add* a switch to make it insecure, like -U,
    if that's what they mean, to disable tainting instead.

and this:

    =item perl6storm #0026

    Make CGI programming easier. Make as first class as
    @ARGV and %ENV for CLI progging.

It should be *easier* to write CGI programs in Perl6 than in Perl5. One
way to accomplish this is to add a C<-cgi> option to Perl, so that all of
the mechanical setup is done automatically. That setup could also be
done through a C<use cgi;> pragma.

To make CGI programming easier, this option/pragma should:

=over 4

=item *

Turn on tainting

=item *

Parse the CGI context, returning CGI variables into %CGI

=item *

Offer simple functions to set HTTP headers (e.g. content type, result codes)

=item *

Load quickly

=item *

Not take up gobs of memory

=back

All of the other features offered by Lincoln Stein's CGI.pm should
remain, but should not be deeply integrated into Perl6.

=head1 IMPLEMENTATION

Write a very small cgi.pm module that does as little as possible,
probably based on Lincoln's code.

Add a C<-cgi> commandline switch, and/or turn on tainting through a
C<use cgi> pragma.

=head1 REFERENCES

CGI.pm

perl6storm
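A rough sketch of the "parse the CGI context into %CGI" step, to show how little code the minimal case needs. The helper name C<parse_query> and the sample query string are hypothetical; a real cgi.pm would also have to handle POST bodies, repeated keys and tainting.

```perl
use strict;
use warnings;

# Hypothetical helper: decode a query string into a hash like the proposed %CGI.
sub parse_query {
    my ($query) = @_;
    my %cgi;
    for my $pair (split /[&;]/, $query) {
        next unless length $pair;
        my ($key, $value) = split /=/, $pair, 2;
        $value = '' unless defined $value;
        for ($key, $value) {
            tr/+/ /;                              # '+' encodes a space
            s/%([0-9A-Fa-f]{2})/chr(hex($1))/ge;  # decode %XX escapes
        }
        $cgi{$key} = $value;                      # last value wins in this sketch
    }
    return %cgi;
}

# In a real CGI program the string would come from $ENV{QUERY_STRING}.
my %CGI = parse_query('name=Larry+Wall&lang=perl%206&debug');

print "$_=$CGI{$_}\n" for sort keys %CGI;
```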
RFC 290 (v1) Remove -X
This and other RFCs are available on the web at
  http://dev.perl.org/rfc/

=head1 TITLE

Remove -X

=head1 VERSION

  Maintainer: Adam Turoff [EMAIL PROTECTED]
  Date: 24 Sep 2000
  Mailing List: [EMAIL PROTECTED]
  Number: 290
  Version: 1
  Status: Developing

=head1 ABSTRACT

File tests (-r/-w/-x/...) made sense when Perl's shellness was an
attribute. Most new Perl programmers are not coming from a shell
programming background, and the -X syntax is opaque and bizarre.
It should be removed.

=head1 DESCRIPTION

Tom Christiansen proposed this in his perl6storm message:

    =item perl6storm #0101

    Just like the "use english" pragma (the modern not-yet-written
    version of "use English" module), make something for legible
    fileops.

        is_readable(file) is really -r(file)

    note that these are hard to write now due to -s(FH)/2 style
    parsing bugs and prototype issues on handles vs paths.

Aside from providing parsing bugs and prototype issues, the -X syntax is
strange and confusing to many Perl programmers who are completely
unfamiliar with Bourne shell syntax.

The preferred mechanism for file tests should be more legible, using
terms like 'readable(FOO)' and 'writeable(FOO)' instead of the opaque
'-r FOO' and '-w FOO'.

Furthermore, these tests should remain useable where appropriate on any
I/O mechanism, not just files.

=head1 MIGRATION ISSUES

p52p6 would convert instances of -X to the appropriate legible test.

Perl programmers happy with the -X syntax will need to get used to the
lengthier replacement.

=head1 IMPLEMENTATION

None required.

=head1 REFERENCES

perl6storm
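The legible names can be sketched in today's Perl as thin wrappers over the existing operators. The sub names follow the RFC's C<readable()>/C<writeable()>; C<executable()> and the missing path are my own illustrative additions, and File::Temp is used only to have a real file to test.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical wrappers around the -X file tests, normalised to 1 or 0.
sub readable   { my ($file) = @_; return -r $file ? 1 : 0 }
sub writeable  { my ($file) = @_; return -w $file ? 1 : 0 }
sub executable { my ($file) = @_; return -x $file ? 1 : 0 }

my ($fh, $path) = tempfile(UNLINK => 1);   # a file we just created
my $missing     = "$path.does-not-exist";  # a path that should not exist

printf "%s readable=%d writeable=%d\n", 'tempfile', readable($path), writeable($path);
printf "%s readable=%d\n", 'missing', readable($missing);
```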
RFC 278 (v1) Additions to 'use strict' to fix syntactic ambiguities
This and other RFCs are available on the web at
  http://dev.perl.org/rfc/

=head1 TITLE

Additions to 'use strict' to fix syntactic ambiguities

=head1 VERSION

  Maintainer: Nathan Wiger [EMAIL PROTECTED]
  Date: 24 Sep 2000
  Mailing List: [EMAIL PROTECTED]
  Number: 278
  Version: 1
  Status: Developing

=head1 ABSTRACT

Several RFCs and many people have voiced concerns with different parts
of Perl's syntax. Most take issue with syntactic ambiguities and the
inability to easily tokenize Perl.

This RFC shows how these all boil down to three central issues, and how
they can be solved with some simple additions to C<use strict>. By
default, Perl should remain as flexible as possible. By adding these
flags to C<use strict>, those who desire them can have all the benefits
of a stricter syntax, without hurting those that like these features.

=head1 DESCRIPTION

=head2 The Problems

=head3 Indirect Objects

RFC 244 proposes eliminating the bareword indirect object syntax because
this:

   print STDERR @stuff;

Can be parsed as either of these:

   STDERR->print(@stuff);
   print(STDERR(@stuff));

Depending on your usage of C<STDERR> other places in your program.
However, some of us like writing:

   $q = new CGI;

Quite a bit, and consider this DWIMish.

=head3 Barewords vs. Functions

RFC 244 and others mention several problems with barewords such as:

   name->stuff(@args);    # name()->stuff or 'name'->stuff ?

Again, the fact that Perl can figure this out correctly is quite
DWIMish, and this functionality should not be removed by default.

=head3 Special Cases

Many special cases abound, such as the bare C<//> mentioned in RFC 135.
Again, this is stuff that makes Perl fun, and should not be taken out of
the language.

=head2 The Solutions

At first, these may not seem related. However, they very much are, and
in fact all boil down to only three issues which can be resolved with
additions to C<use strict>.

=head3 Function Parens - C<use strict 'words'>

This imposes a very simple restriction: barewords are not allowed. They
must be either quoted or specified with parens to indicate they are
functions. Note this solves the C<%SIG> problem from Camel:

   use strict 'words';
   $SIG{PIPE} = Plumber;      # syntax error
   $SIG{PIPE} = "Plumber";    # use main::Plumber
   $SIG{PIPE} = Plumber();    # call Plumber()

In addition, this also forces users to disambiguate certain functions:

   use strict 'words';
   name->stuff(@args);        # syntax error
   'name'->stuff(@args);      # 'name'->stuff
   name::->stuff(@args);      # ok too, same thing
   name()->stuff(@args);      # name()->stuff

   $result = value + 42;      # syntax error
   $result = value() + 42;    # value() + 42
   $result = value( + 42);    # value(42)
   $result = 'value' + 42;    # ok, if you think this is Java...

It's simple: barewords are not allowed.

=head3 Indirect Objects - C<use strict 'objects'>

Another major problem is ambiguous indirect objects. Under
C<use strict 'objects'>, the indirect object I<must> be surrounded by
braces:

   use strict 'objects';
   no strict 'words';
   print STDERR @stuff;       # print(STDERR(@stuff))
   print "STDERR" @stuff;     # syntax error
   print {"STDERR"} @stuff;   # 'STDERR'->print(@stuff)
   print $fh @junk;           # syntax error
   print {$fh} @junk;         # $fh->print(@junk)

This eliminates the possibility of ambiguity with indirect objects. When
combined with C<strict 'words'>, code becomes even less ambiguous:

   use strict qw(words objects);
   $q = new 'CGI';            # syntax error
   $q = new {'CGI'};          # 'CGI'->new
   $q = new ('CGI');          # new('CGI')
   $q = new (CGI());          # new(CGI())

   $q = new 'CGI' @args;      # syntax error
   $q = new {'CGI'} (@args);  # 'CGI'->new(@args)
   $q = new (CGI (@args));    # new(CGI(@args))

=head3 Syntactic Problems - C<use strict 'syntax'>

There are many other "little ambiguities" throughout Perl. Adding
C<strict 'syntax'> would remove these and require the user to specify
them explicitly.

In this category fits the bare C<//> problem mentioned in RFC 135, as
well as several common "bugs" (mistakes). Under this rule, the following
would apply:

   1. No more // by itself, you must use m//

   2. Trailing conditionals would require parens

   3. Precedence other than for basic math and boolean
      ops would not apply

This is designed to force you to write clean, unambiguous code that
borders on being non-Perlish:

   use strict 'syntax';
   next if /^#/ || /^$/;        # syntax error
   next if m/^#/ || m/^$/;      # syntax error
   next if (m/^#/ || m/^$/);    # ok

   use strict 'syntax';
   $data = $a + $b / $c - $d || $default or die;       # no way
   $data = ($a + $b / $c - $d) || $default or die;     # nope
   ($data = ($a + $b / $c - $d) || $default) or die;   # ok

Basically, the idea is to impose a truly unambiguous style so that
people don't get carried away with precedence and special cases.

=head2 Combining all these together

Let's look at an example of how all these
RFC 112 (v3) Asignment within a regex
This and other RFCs are available on the web at
  http://dev.perl.org/rfc/

=head1 TITLE

Asignment within a regex

=head1 VERSION

  Maintainer: Richard Proctor [EMAIL PROTECTED]
  Date: 16 Aug 2000
  Last Modified: 23 Sep 2000
  Mailing List: [EMAIL PROTECTED]
  Number: 112
  Version: 3
  Status: Developing

=head1 ABSTRACT

Provide a simple way of naming and picking out information from a regex
without having to count the brackets.

=head1 DESCRIPTION

If a regex is complex, counting the bracketed sub-expressions to find
the ones you wish to pick out can be messy. It is also prone to
maintainability problems if and when you wish to add to the expression.
Using (?:) to suppress capturing helps, but it still gets "complex".
I would sometimes rather just pick out the bits I want within the regex
itself.

Suggested syntax: (?$foo= ... ) would assign the string that is matched
by the pattern ... to $foo when the pattern matches. These assignments
would be made left to right after the match has succeeded but before
processing a replacement or other results (or prior to some (?{...}) or
(??{...}) code). There may be whitespace between the $foo and the "=".

Potentially the $foo could be any scalar LHS, as in (?$foo{$bar}= ... )!,
likewise the '=' could be any assignment operator.

The camel and the docs include this example:

    if (/Time: (..):(..):(..)/) {
        $hours = $1;
        $minutes = $2;
        $seconds = $3;
    }

This then becomes:

    /Time: (?$hours=..):(?$minutes=..):(?$seconds=..)/

This is more maintainable than counting the brackets and easier to
understand for a complex regex. And one does not have to worry about the
scope of $1 etc.

=head2 Named Backrefs

The first versions of this RFC did not allow for backrefs. I now think
this was a shortcoming. It can be done with (??{quotemeta $foo}), but I
find this clumsy; a better way of using a named back ref might be
(?\$foo).

=head2 Scoping

The question of scoping for these assignments has been raised, but I
don't currently have a feel for the "best" way to handle this. Input
welcome.

=head2 Brackets

Using this method for capturing wanted content, it might be desirable to
stop ordinary brackets capturing, and so avoid needing to use (?:...).
I therefore suggest, as an enhancement to regexes, a /b (bracket?)
modifier under which ordinary brackets just group, without capture - in
effect they all behave as (?:...).

=head1 CHANGES

V3 - added bit about backrefs, and brackets.

=head1 IMPLEMENTATION

Currently all $scalars in regexes are expanded before the main regex
compiler gets to analyse the syntax. This problem also affects several
other RFCs (166 for example). The expansion of variables in regexes
needs for these (and other RFCs) to be driven from within the regex
compiler so that the regex can expand as and where appropriate. Changing
this should not affect any existing behaviour.

=head1 REFERENCES

I brought this up on p5p a couple of years ago, but it was lost in the
noise...

RFC 166: Alternative lists and quoting of things

Perlstorm #0040
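For comparison, two ways to avoid C<$1>, C<$2>, C<$3> in Perl 5. The first is the list assignment that already works; the second uses named captures, which are not part of this RFC but were added to Perl 5 later (5.10) and cover much of the same ground. The input line is invented.

```perl
use strict;
use warnings;

my $line = "Time: 12:34:56";

# List assignment: no $1/$2/$3, but the captures are still positional.
my ($hours, $minutes, $seconds) = $line =~ /Time: (..):(..):(..)/;

# Named captures (Perl 5.10 and later): the names live inside the pattern
# and the values are read back from the %+ hash.
my %t;
if ($line =~ /Time: (?<hours>..):(?<minutes>..):(?<seconds>..)/) {
    %t = %+;
}

print "$hours $minutes $seconds\n";
print "$t{hours} $t{minutes} $t{seconds}\n";
```

Adding another bracketed group to the second pattern does not renumber anything, which is the maintainability point made above.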
RFC 103 (v3) Fix C<$pkg::$var> precedence issues with parsing of C<::>
This and other RFCs are available on the web at
  http://dev.perl.org/rfc/

=head1 TITLE

Fix C<$pkg::$var> precedence issues with parsing of C<::>

=head1 VERSION

  Maintainer: Nathan Wiger [EMAIL PROTECTED]
  Date: 14 Aug 2000
  Last Modified: 23 Sep 2000
  Mailing List: [EMAIL PROTECTED]
  Number: 103
  Version: 3
  Status: Retracted
  Frozen since: v2

=head1 ABSTRACT

Currently, trying to dynamically assign to unnamed classes is very
difficult:

   $pkg::$var = $val;            # error
   ${pkg}::$var = $val;          # nope
   ${$pkg::$var} = $val;         # you wish
   ${${pkg}::$var} = $val;       # sorry
   ${"${pkg}::$var"} = $val;     # works, but bleeech :-)

The precedence and parsing of the :: operator should be fixed to allow
easy access to anonymous package operations.

=head1 NOTES ON RETRACTION

I don't see any easy way of getting this to work without causing
potentially really hairy problems with precedence. In particular check
out:

   http://www.mail-archive.com/perl6-language%40perl.org/msg04058.html

Which is actually a reply to Schwern's post, but that appears to be gone
from the mail archives forever...

=head1 DESCRIPTION

In a perfect world, these should work in Perl 6:

   $var = 'RaiseError';
   $DBI::$var = 1 ;               # $DBI::RaiseError = 1

   $pkg = 'Class';
   $var = 'DEBUG';
   ${${pkg}::$var} = 1;           # $Class::DEBUG = 1

   $subpkg = 'Special';
   $class = $pkg . '::' . $subpkg;
   require $class;                # require Class::Special

   $mypkg = 'Some::Package::Name';
   $ret = $mypkg::do_stuff(@a);   # is &{"${mypkg}::do_stuff"}(@a) now

Currently, the precedence of :: does not allow these operations. Some of
the above examples may still require additional braces, but they
shouldn't require the types of contortions currently needed.

=head1 IMPLEMENTATION

Unfortunately, I don't have the time to think this part up yet. :-(

I will gladly contribute to the precedence and parsing rules discussions
that will ensue in the future if this RFC is accepted.

=head1 REFERENCES

Programming Perl, 2ed, for the ${"${pkg}::$var"} syntax

RFC 222: Interpolation of method calls
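A runnable sketch of the form that does work in Perl 5, for readers who have not met it. The package C<Class>, the variable name and the sub C<do_stuff> are illustrative only; the C<can()> call is an alternative I am suggesting for the subroutine case, not something the RFC mentions.

```perl
use strict;
use warnings;

my ($pkg, $var) = ('Class', 'DEBUG');

{
    no strict 'refs';            # symbolic references must be allowed here
    ${"${pkg}::$var"} = 1;       # sets the package variable $Class::DEBUG
}

# Read it back the same way.
my $value = do { no strict 'refs'; ${"${pkg}::$var"} };

# For subroutines, can() looks the name up without disabling strict.
sub Class::do_stuff { return "got @_" }
my $ret = $pkg->can('do_stuff')->(1, 2);

print "$value / $ret\n";
```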
RFC 275 (v1) Add 'tristate' pragma to allow undef to take on NULL semantics
This and other RFCs are available on the web at
  http://dev.perl.org/rfc/

=head1 TITLE

Add 'tristate' pragma to allow undef to take on NULL semantics

=head1 VERSION

  Maintainer: Nathan Wiger [EMAIL PROTECTED]
  Date: 23 Sep 2000
  Mailing List: [EMAIL PROTECTED]
  Number: 275
  Version: 1
  Status: Developing

=head1 ABSTRACT

RFC 263 proposed the introduction of a C<null> keyword for introducing
tristate logic into Perl 6. However, that was abandoned in favor of the
approach specified here, a C<tristate> pragma.

=head1 DESCRIPTION

The C<tristate> pragma allows for undef to take on the RDBMS concept of
C<NULL>, in particular:

   1. Any math or string operation between a NULL and any
      other value results in NULL

   2. No NULL value is equal to any other NULL

   3. A NULL value is neither defined nor undefined

The C<tristate> pragma is lexically scoped, so that it obeys code blocks:

   $a = undef;
   $b = 1;
   $c = $a + $b;        # 1
   {
      use tristate;
      $d = $a + $b;     # undef
   }
   $e = $c + $d;        # 1

For more details on theoretical issues, please see the references or RFC
263.

=head1 IMPLEMENTATION

No idea, too burned out.

=head1 MIGRATION

None, unless some dumbass has a custom C<tristate> module that they
wrote to navigate the tristate area of New York, New Jersey, and
Connecticut. But that should be C<Tristate> anyways.

=head1 REFERENCES

RFC 263: Add null() keyword and fundamental data type

http://www.sitelite.nl/mysql/manual_Problems.html#IDX666

http://www.unb.ca/web/transpo/mynet/mtx19.htm#r2
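Rules 1 and 2 can be approximated today with explicit helper subs, which may make the intended behaviour easier to see. The names C<null_add> and C<null_eq> are hypothetical and this is only an emulation: a real pragma would change the built-in operators inside its lexical scope.

```perl
use strict;
use warnings;

# Rule 1: any operation involving NULL (undef) yields NULL.
sub null_add {
    my ($x, $y) = @_;
    return undef unless defined $x && defined $y;
    return $x + $y;
}

# Rule 2: a comparison involving NULL is itself unknown, never true.
sub null_eq {
    my ($x, $y) = @_;
    return undef unless defined $x && defined $y;
    return $x == $y ? 1 : 0;
}

my ($x, $y) = (undef, 1);

# Ordinary Perl: undef is treated as 0 (with a warning under 'use warnings').
my $plain = do { no warnings 'uninitialized'; $x + $y };

my $tri  = null_add($x, $y);      # undef under tristate rules
my $same = null_eq(undef, undef); # undef: NULL is not equal to NULL

print "plain=$plain tri=", (defined $tri ? $tri : 'NULL'),
      " same=", (defined $same ? $same : 'NULL'), "\n";
```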
RFC 269 (v2) Perl should not abort when a required file yields a false value
This and other RFCs are available on the web at
  http://dev.perl.org/rfc/

=head1 TITLE

Perl should not abort when a required file yields a false value

=head1 VERSION

  Maintainer: Dominus [EMAIL PROTECTED]
  Date: 21 Sep 2000
  Last Modified: 23 Sep 2000
  Mailing List: [EMAIL PROTECTED]
  Number: 269
  Version: 2
  Status: Withdrawn
  See Also: RFC 55

=head1 STATUS

This proposal is withdrawn because it duplicates RFC 55, "Compilation:
Remove requirement for final true value in require-d and do-ed files"

=head1 ABSTRACT

Modules should not have to end with C<1;>. It is silly and confusing.

=head1 DESCRIPTION

Modules typically contain subroutine definitions. A module may contain
initialization code also. If the initialization code fails, the module
can return a false value to its caller, which aborts the compilation.

In Perl 5, a module that contains nothing but subroutine definitions
will return false by default, necessitating a 1; at the bottom. If the
C<1;> is omitted, Perl emits the error

        Foo.pm did not return a true value...

In spite of plenty of documentation, people Frequently Ask what this
error means.

Some languages like to have the compiler emit annoying messages to
announce you forgot to include some pointless code whose only purpose is
to stop the compiler from emitting the annoying message. Perl is mostly
free of such nonfeatures.

I propose that this unfeature be dropped entirely. No useful
functionality is lost. If a Perl 6 module wants to indicate an
initialization failure by throwing a fatal exception, it can simply call
C<die>. If the calling module wants to abort when a C<require>d file
returns a false value, it is free to do that.

The 'module initialization' feature is little-used. 99 of the 102 files
in Perl 5.6 lib/*.{pl,pm} end with C<1;>. AnyDBM_File invokes 'die'
explicitly. The only real exceptions are diagnostics.pm and
timelocal.pl.

=head1 IMPLEMENTATION

'require' should execute code in a file and return the result, as
before, but it should not call Perl_die when the result is false.

However, see below.

=head1 MIGRATION

In 98% of cases, no translation is necessary. The first version of the
translator can ignore the issue entirely. Strategies to cover the other
2% follow:

In general, direct source translation of this feature of Perl 5 modules
would probably be impossible. It's tempting to say that the translator
should simply translate the last statement or block in the module from
this:

        STATEMENT

to this:

        unless (do {STATEMENT}) {
          require Carp;
          Carp::croak "... did not return a true value";
        }

However, I think that is impractical. The module might contain code that
looks like this:

        if (something()) {
          return $v1;
        }
        ...
        $v2;

In this case the 'return $v1' statement would I<also> have to be
translated. In general, there might be many, many statements that would
need to be translated. This would look awful.

I think that if complete coverage is desired, the best choice would be
to introduce a new pragma, which would enable the old behavior. A
translated module would begin with

        package Foo;
        use perl5 'require/use semantics';
        ...

When this file was C<require>d, the pragma would set a flag. The
C<pp_require> opcode would check the flag after compiling the file, and
would call C<Perl_die> as before if the file returned a false value and
if the flag was set. If Foo C<require>d any other modules, the flag
would be cleared before loading them, and restored again afterwards.
(That is, the flag would have file scope.)

=head1 REFERENCES

Perl on-line manuals
RFC 158 (v3) Regular Expression Special Variables
This and other RFCs are available on the web at
  http://dev.perl.org/rfc/

=head1 TITLE

Regular Expression Special Variables

=head1 VERSION

  Maintainer: Uri Guttman [EMAIL PROTECTED]
  Date: 25 Aug 2000
  Last Modified: 22 Sep 2000
  Mailing List: [EMAIL PROTECTED]
  Number: 158
  Version: 3
  Status: Frozen
  Frozen since: v2

=head1 ABSTRACT

This RFC addresses ways to make the regex special variables $`, $& and
$' no longer the pariahs they are now.

=head1 CHANGES

I dropped the local scoping of $`, $& and $' as they are already
localized now.

=head1 DESCRIPTION

$`, $& and $' are useful variables which are never used by any
experienced Perl hacker since they have well known problems with
efficiency. Since they are globals, any use of them anywhere in your
code forces all regexes to copy their data for potential later
referencing by one of them. I will describe some ideas to make this
issue go away and return these variables back into the toolbox where
they belong.

=head1 IMPLEMENTATION

The copy all regex data problem is solved by a new modifier k (for
keep). This tells the regex to do the copy so the 3 vars will work
properly. So you would use code like this:

        $str = 'prefoopost' ;
        if ( $str =~ /foo/k ) {
                print "pre is [$`]\n" ;
                print "match is [$&]\n" ;
                print "post is [$']\n" ;
        }

=head1 IMPACT

None

=head1 UNKNOWNS

None

=head1 REFERENCES

None.
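Perl 5 itself later gained something very close to this: from Perl 5.10 the C</p> modifier asks the engine to preserve the matched string, and the three parts are read from C<${^PREMATCH}>, C<${^MATCH}> and C<${^POSTMATCH}>. That is not the C</k> proposed here, but the sketch below shows the same example under that later syntax.

```perl
use strict;
use warnings;

my $str = 'prefoopost';
my ($pre, $match, $post);

# /p (Perl 5.10 and later) plays the role of the proposed /k: only this
# match pays for the copy, and the ${^...} variables read it back.
if ($str =~ /foo/p) {
    ($pre, $match, $post) = (${^PREMATCH}, ${^MATCH}, ${^POSTMATCH});
}

print "pre is [$pre]\n";
print "match is [$match]\n";
print "post is [$post]\n";
```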
RFC 165 (v3) Allow Varibles in tr///
This and other RFCs are available on the web at
  http://dev.perl.org/rfc/

=head1 TITLE

Allow Varibles in tr///

=head1 VERSION

  Maintainer: Richard Proctor [EMAIL PROTECTED]
  Date: 27 Aug 2000
  Last Modified: 22 Sep 2000
  Mailing List: [EMAIL PROTECTED]
  Number: 165
  Version: 3
  Status: Frozen

=head1 ABSTRACT

Allow variables in a tr///. At present the only way to do a
tr/$foo/$bar/ is to wrap it up in an eval. I don't like using evals for
this sort of thing.

=head1 DESCRIPTION

Suggested syntax: tr/$foo/$bar/e

With a /e, tr will expand both the LHS and RHS of the translate
function. Either or both could be variables. I am suggesting /e as it
is sort of like /e for s///e.

These words from MJD:

The way tr/// works is that a 256-byte table is constructed at compile
time that says for each input character what output character is
produced. Then when it's time to apply the tr/// to a string, Perl
iterates over the string one character at a time, looks up each
character in the table, and replaces it with the corresponding character
from the table.

With tr///e, you would have to generate the table at run-time.

This would suggest that you want the same sorts of optimizations that
Perl applies when it encounters a regex that contains variables:

   1. Perl should examine the strings to see if they have
      changed since the last time it executed the code

   2. It should rebuild the tables only if the strings changed

   3. There should be a /o modifier that promises Perl that
      the variables will never change.

The implementation could be analogous to the way m/.../o is implemented,
with two separate op nodes: One that tells Perl 'construct the tables'
and one that tells Perl 'transform the string'. The 'construct the
tables' node would remove itself from the op tree if it saw that the
tr//o modifier was used.

Hugo wrote:

Definitely. Should be easy to implement. There is a potential for
confusion, since it makes the tr/ lists look even more like m/ and s/
patterns, but I think it can only be less confusion than the current
state of affairs.

It is tempting to make it the default, and have a flag to turn it off
(or just backwhack the dagnabbed dollar), and auto-translation of
existing scripts would be pretty easy, except that it would presumably
fail exactly where people are using the current workaround, by way of
eval.

Comments by me:

Therefore tr///o might be a good idea as well.

If Hugo's idea of making this the normal behaviour is adopted, the
problem of existing evals is avoided by p52p6 changing the eval to a
perl5_eval which acts accordingly. (One of MJD's ideas).

=head1 IMPLEMENTATION

Hugo: Should be easy to implement.

Me: Should not be too complicated, this is just a case of doing existing
things in a different context.

=head1 CHANGES

V2 - Added words from MJD and Hugo - This hopefully in a pre freeze
state.

V3 - re issued due to an error in posting V2 and now frozen

=head1 REFERENCES

None yet.
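The eval workaround mentioned in the abstract looks roughly like this; the sample strings are invented. The second form builds the transliteration once and reuses it as a closure, which is my own way of mimicking the "construct the tables only once" behaviour of the proposed /o. In both cases the variable contents become Perl source, so they must be trusted.

```perl
use strict;
use warnings;

my ($from, $to) = ('abc', 'xyz');
my $str = 'aabbcc';

# tr/// does not interpolate, so the whole statement is built as a string
# and compiled at run time. $from and $to are interpolated into the code.
my $changed = eval "\$str =~ tr/$from/$to/";
die $@ if $@;

# Build the table once and reuse it: the string eval runs only here,
# each later call just applies the already compiled tr///.
my $tr = eval "sub { my \$s = shift; \$s =~ tr/$from/$to/; return \$s }"
    or die $@;
my $again = $tr->('cab');

print "$str ($changed changed), $again\n";
```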
RFC 166 (v3) Alternative lists and quoting of things
This and other RFCs are available on the web at
  http://dev.perl.org/rfc/

=head1 TITLE

Alternative lists and quoting of things

=head1 VERSION

  Maintainer: Richard Proctor [EMAIL PROTECTED]
  Date: 27 Aug 2000
  Last Modified: 22 Sep 2000
  Mailing List: [EMAIL PROTECTED]
  Number: 166
  Version: 3
  Status: Developing

=head1 ABSTRACT

Expand Alternate Lists from Arrays and Quote the contents of things
inside regexes.

=head1 DESCRIPTION

These are a couple of constructs to make it easy to build up regexes
from other things.

=head2 Alternative Lists from arrays

The basic idea is to expand an array as a list of alternatives. There
are two possible syntaxes, (?@foo) and just plain @foo. @foo might just
have existing uses (just), therefore I prefer the (?@foo) syntax.

(?@foo) is just syntactic sugar for (?:(??{ join('|',@foo) })), a
bracketed list of alternatives.

=head2 Quoting the contents of things

If a regex uses $foo or @bar there are problems if the content of the
variables contains special characters. What is needed is a way of
\Quoting the content of scalars $foo or arrays (?@foo).

Suggested syntax:

(?Q$foo) Quotes the contents of the scalar $foo - equivalent to
(??{ quotemeta $foo }).

(?Q@foo) Quotes each item in a list (as above); this is equivalent to
(?:(??{ join ('|', map quotemeta, @foo)})).

In this syntax the Q is used as it represents a more intelligent
\Quot\E.

It is recognised that (?Q$foo) is equivalent to \Q$foo\E, but that does
not make it a bad idea to add it at the same time as (?Q@foo), for
reasons of symmetry and perl DWIM.

=head2 Comments

Hugo:

(?@foo) and (?Q@foo) are both things I've wanted before now. I'm not
sure if this is the right syntax, particularly if RFC 112 is adopted: it
would be confusing to have (?@foo) to have so different a meaning from
(?$foo=...), and even more so if the latter is ever extended to allow
(?@foo=...).

I see no reason that implementation should cause any problems since this
is purely a regexp-compile time issue.

Me: I can't see any reasonable meaning for (?@foo=...), so this seems an
appropriate syntax, but I am open for others to be suggested.

=head1 CHANGES

V1 of this RFC had three ideas, one has been dropped, the other is now
part of RFC 198.

V2 Expands the list expansion and quoting with quoting of scalars and
Implementation issues.

V3 In an error what should have been 165 V2 was issued as 166 V2 so this
is V3 with a change in (?Q$foo). This is in a pre-frozen state.

=head1 MIGRATION

As (?@foo) and (?Q...) these are additions without any compatibility
issues.

The option of just @foo for list expansion might represent a small
problem if people already use the construct.

=head1 IMPLEMENTATION

Both of these changes are regex compile time issues.

Generating lists from arrays almost works by localising $" as '|' for
the regex and just using @foo.

MJD has demonstrated implementing (?@foo) as (?\@foo) by means of an
overload of regexes; this slight change was necessary because of the
expansion of @foo - see below.

Both of these changes are currently affected by the expansion of
variables in the regex before the regex compiler gets to work on the
regex. This problem also affects several other RFCs. The expansion of
variables in regexes needs for these (and other RFCs) to be driven from
within the regex compiler so that the regex can expand as and where
appropriate. Changing this should not affect any existing behaviour.

=head1 REFERENCES

RFC 198
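What (?Q@foo) is sugar for can already be written out by hand in Perl 5, which may help when judging the proposal. The array contents and test strings below are invented; the pattern is built with C<join> and C<quotemeta> exactly as in the expansion given in the DESCRIPTION, but done before the regex is compiled.

```perl
use strict;
use warnings;

# Items containing regex metacharacters that must match literally.
my @foo = ('a.b', 'c+d', 'e|f');

# Hand-written equivalent of the proposed (?Q@foo).
my $alt = join '|', map { quotemeta } @foo;
my $re  = qr/^(?:$alt)$/;

my @candidates = ('a.b', 'axb', 'c+d', 'ccd', 'e', 'e|f');
my @hits = grep { /$re/ } @candidates;

print "@hits\n";
```

Without the C<quotemeta>, 'axb', 'ccd' and 'e' would also match, because '.', '+' and '|' would keep their special meanings.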
RFC 198 (v2) Boolean Regexes
But not dependent on how many brackets have been used already. If
expressed this way the code needs to deliver success and failure. I am
not sure how greedy the capture element should be, i.e. whether it
should be (.*) or (.*?) or (.+) or (.+?)] but I think (.*) is sufficient
given that it may be bound by other regexes in a boolean context to
reduce the context sufficiently.

=head3 A Failure Token

\F if reached as part of a pattern is always a failure. This can be used
outside of the (? construct. This is not dependent on the Boolean regex
concept and could be used for other things.

(In RFC 198 V1 this was proposed as (?F) but I think \F is more in
keeping with existing and intended syntax.)

There might be a case for a complementary Success Token \T ?? though I
am not sure it is needed.

=head2 Applications and examples

Matching both foo and bar in a string.

    $string =~ /(? foo bar)/x;

Matching a string that does not contain baz that is at least 20 chars
between foo and bar.

    $string =~ /foo (? .{20,} ! baz ) bar/x;

Does a html image have both an alt and a src, and what are they?

    $string =~ /img(? \s alt=(?$Alt=(".*?"|\S*)) \s src=(?$Src=(".*?"|\S*)) )/ix;

It might be possible to have a regex that simply matches valid perl6 out
of this. (though it would be large...)!

=head1 IMPLEMENTATION

Implementation detail is not appropriate for this stage in the
development of this RFC. If the concepts gain approval then detailed
implementation issues become relevant.

There are two aspects to regexes - compiling and executing:

Compiling of these extended forms should be relatively straightforward,
but would need some extensions to recognise the regex as being within
(?) state to handle the extended syntax.

Executing - No thoughts at all at present.

=head1 REFERENCES

RFC 166 V1.

RFC 112 - Assignment within a regex

RFC 150 - Hash assignment from regexs

RFC 145 - (or at least the discussion that followed it)
RFC 274 (v1) Generalised Additions to Regexs
This and other RFCs are available on the web at http://dev.perl.org/rfc/ =head1 TITLE Generalised Additions to Regexs =head1 VERSION Maintainer: Richard Proctor [EMAIL PROTECTED] Date: 22 Sep 2000 Mailing List: [EMAIL PROTECTED] Number: 274 Version: 1 Status: Developing =head1 ABSTRACT This proposes a way for generalised additions to regex capabilities. =head1 DESCIPTION Given that expansion of regexes could include (+...) and (*...) I have been thinking about providing a general purpose way of adding functionality. Hence I propose that the entire (+...) syntax is kept free from formal specification for this. (+ = addition) A module or anything that wants to support some enhanced syntax registers something that handles "regex enhancements". At regex compile time, if and when (+foo) is found perl calls each of the registered regex enhancements in turn, these: 1) Are passed the foo string as a parameter exactly as is. (There is an issue of actually finding the end of the generic foo.) 2) The regex enhancement can either recognise the content or not. 3) If not the enhancement returns undef and perl goes to the next regex enhancement (Does it handle the enhancements as a stack (Last checked first) or a list (First checked first?) how are they scoped? Job here for the OO/scoping fanatics) 4) If perl runs out of registered regex enhancements it reports an error. 5) if an enhancement recognises the content it could do either of: a) return replacement expanded regex using existing capabilities perl will then pass this back through the regex compiler. b) return a coderef that is called at run time when the regex gets to this point. The referenced code needs to have enough access to the regex internals to be able to see the current sub-expression, request more characters, access to relevant flags and visability of greediness. It may also need a coderef that is simarly called when the regex is being unwound when it backtracks. 
These features would also be of interest to the existing code inside regexes
as well.

Thinking from that - the last case should be generalised (it is sort of like
my (?*{...}) from RFC 198 or an enhancement to (??{...})). If so, cases (a)
and (b) are the same, as case (b) is just a case of returning (?*{...}) with
the appropriate code.

Following on, if (?{...}) etc code is evaluated in forward match, it would
be a good idea to likewise support some code block that is ignored on a
forward match but is executed when the code is unwound due to backtracking.
Thus

    (?{ foo })(?\{ bar })

executes foo on the forward case and bar if it unwinds. I don't care at the
moment what the syntax is - what about the concepts? Think about foo putting
something on a stack (e.g. the bracket to match [RFC 145]) and bar taking it
off, for example.

Note: I don't consider this RFC complete, but after posting this on the
regex list to no effect I am making it an RFC to see if it gets a little
more feedback...

=head1 MIGRATION

This is a new feature - no compatibility problems.

=head1 IMPLEMENTATION

This has not been looked at in detail, but the description above provides
some views as to how it may operate.

=head1 REFERENCES

RFC 145 - Bracket matching

RFC 198 - Boolean Regexes
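The compile-time dispatch in steps 1 to 5(a) can be sketched as a small
registry. This is only an illustration in Python, not a proposed
implementation: register, expand and hex_byte are hypothetical names, the
handlers are tried last-registered-first (the RFC leaves the order open),
and the end of the (+...) body is found by assuming it contains no nested
parentheses, which sidesteps the issue raised in step 1. Case 5(b), the
run-time coderef, is not modelled.

```python
import re

# Registered "regex enhancements": callables taking the text between
# "(+" and ")" and returning a replacement regex string, or None.
_handlers = []


def register(handler):
    """Add a regex enhancement to the registry."""
    _handlers.append(handler)


def expand(pattern):
    """Replace every (+foo) in pattern using the registered enhancements."""

    def replace(match):
        body = match.group(1)
        # Assumption: most recently registered enhancement is asked first.
        for handler in reversed(_handlers):
            result = handler(body)
            if result is not None:
                return result
        # Step 4: nobody recognised the content.
        raise ValueError(f"no regex enhancement recognises (+{body})")

    # Assumption: the body of (+...) contains no parentheses.
    return re.sub(r"\(\+([^()]*)\)", replace, pattern)


def hex_byte(body):
    """Example enhancement: (+hexbyte) expands to two hex digits."""
    return "[0-9A-Fa-f]{2}" if body == "hexbyte" else None


register(hex_byte)
```

With this in place, expand("id=(+hexbyte)(+hexbyte)") yields an ordinary
pattern that can be handed to the existing regex compiler.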
RFC 184 (v3) Perl should support an interactive mode.
This and other RFCs are available on the web at
  http://dev.perl.org/rfc/

=head1 TITLE

Perl should support an interactive mode.

=head1 VERSION

  Maintainer: Ariel Scolnicov [EMAIL PROTECTED]
  Date: 31 Aug 2000
  Last Modified: 22 Sep 2000
  Mailing List: [EMAIL PROTECTED]
  Number: 184
  Version: 3
  Status: Frozen

=head1 DISCUSSION

Very little discussion was generated by this RFC. Several people noted that
C<perl -de 42> and the Perl shell C<psh> already provide some of what the
RFC requests; this is noted in the RFC.

The RFC is not being withdrawn, since 2 other people expressed (mild)
interest in it.

No changes have been made since the last posted version (version 2 of 3 Sep
2000), other than the addition of this "DISCUSSION".

=head1 ABSTRACT

Perl5 does not have an interactive mode. The debugger is fine for testing a
single line, but it is inadequate for running a set of commands
interactively. The Perl6 parser (and possibly the language) should contain
hooks to allow full interactive environments to be written.

=head1 DESCRIPTION

Perl does not have an interactive mode. It has C<perl -de 42>, but that is
not the same. An interactive mode is useful not only for a debugger, but
also for exploring the capabilities of a module, or even for performing
simple "one-off" programming tasks.

The most serious obstacle to easy interaction is the difficulty in typing
multiple line commands to a Perl debugger (see below). However, the Perl
debugger also limits this use in other ways, notably by evaluating each line
in a separate C<eval>. This too has unfortunate consequences.

Languages which include better interactive capabilities than Perl's include
Python and zsh.

=head2 Example

Observe an interaction with another language whose name begins with a `P':

    Python 1.5.1 (#1, Jul 28 1998, 22:02:27)  [GCC 2.7.2.3] on sunos5
    Copyright 1991-1995 Stichting Mathematisch Centrum, Amsterdam
    >>> def fact(x):
    ...     if x <= 1:
    ...         return 1
    ...     else:
    ...         return x*fact(x-1)
    ...
    >>> fact(10)
    3628800
    >>> ^D

Note in particular the definition of C<fact>, which spans multiple lines.
With Perl5, it doesn't work:

    bioserv 108 [13:31] ~ perl -de 42

    Loading DB routines from perl5db.pl version 1.0402
    Emacs support available.

    Enter h or `h h' for help.

    main::(-e:1):   42
      DB<1> sub fact {
    Missing right bracket at (eval 5) line 4, at end of line
    syntax error at (eval 5) line 4, at EOF

    [ oops... must fit it all on one line ]

    shift; if ($x < 2) { return 1 } else { return $x*fact($x-1) }
    Missing right bracket at (eval 12) line 4, at end of line
    syntax error at (eval 12) line 4, at EOF

    [ can't see the beginning of the line I'm editing, and forgot a
      close brace; might as well forget about any indentation to help
      remind me ]

    if ($x < 2) { return 1 } else { return $x*fact($x-1) } }

    [ Finally!  But I can't even see what I typed! ]

      DB<4> print fact(10)
    3628800

Michael Maraist and Tom Christiansen point out that the debugger allows
explicit marking of continuation lines by backslashes:

    selena 150 [14:37] ~ perl -de 42

    Loading DB routines from perl5db.pl version 1.0402
    Emacs support available.

    Enter h or `h h' for help.

    main::(-e:1):   42
      DB<1> sub fact { \
      cont:   my $x = shift; \
      cont:   if ($x < 2) {\
      cont:     return 1 \
      cont:   } else { \
      cont:     return $x*fact($x-1) \
      cont:   }\
      cont: }
      DB<2> x fact 10
    0  3628800

This is inconvenient. Syntax in an interactive mode should mirror normal
Perl syntax as far as possible; C<perldoc perldebug> goes so far as to say

    Note that this business of escaping a newline is specific to
    interactive commands typed into the debugger.

=head2 Separate eval()s

C<my> and C<local> variables don't work in the debugger as one would expect;
their scope does not propagate between lines:

    bioserv 112 [14:08] ~ perl -de 42

    Loading DB routines from perl5db.pl version 1.0402
    Emacs support available.

    Enter h or `h h' for help.

    main::(-e:1):   42
      DB<1> $x = 17
      DB<2> my $x=5
      DB<3> x $x
    0  17

The ability to create variables is essential for serious interactive use of
Perl.
What causes all this is that the debugger evaluates every line in a separate
C<eval>; this is not what is desired in an interactive environment. This is
another limitation on using the debugger for interactive work.

For another example, it is impossible to change packages persistently:

      DB<3> package foo
      DB<4> $x = 2
      DB<5> x $foo::x
    0  undef
      DB<6> x $x
    0  2
      DB<7> x $main::x
    0  2

=head2 Possible uses

=over 4

=item *

The Perl debugger (and other Perl debuggers)

=item *

Interaction environments (e.g. C<perldl>)

=item *

"Super" calculators

=item *

Perl shell
RFC 81 (v4) Lazily evaluated list generation functions
This and other RFCs are available on the web at
  http://dev.perl.org/rfc/

=head1 TITLE

Lazily evaluated list generation functions

=head1 VERSION

  Maintainer: Jeremy Howard [EMAIL PROTECTED]
  Date: 10 Aug 2000
  Last Modified: 22 Sep 2000
  Mailing List: [EMAIL PROTECTED]
  Number: 81
  Version: 4
  Status: Frozen

=head1 DISCUSSION

Not surprisingly, the controversial point of discussion for this RFC was
about viability and efficiency of implementation. These points were more
about the use of lazy evaluation in general, rather than generated lists in
particular. The viability of lazy evaluation has been proven in other
languages (both functional and procedural). The efficiency of generated
lists will obviously depend on the implementation, but this RFC suggests
some obvious optimisations for frequently used constructs (e.g. stepped
slices).

The second major point of discussion was around syntax. Other languages that
provide list comprehension do so with quite different syntax (for example,
Haskell and Python v2). However, the syntax in these languages is not at all
Perlish. The proposed syntax incorporates Perl 5 syntax and extends it using
minimal additional notation.

=head1 ABSTRACT

This RFC proposes that the existing C<..> operator produce a lazily
evaluated list. In addition, a new operation C<:> is proposed that allows
for the generation of lazily evaluated lists based on any Perl expression.

This proposal only discusses these operators in a list context. The current
meaning of '..' in a scalar context is not affected.

=head1 CHANGES

=head2 Since v3

=over 4

=item *

Clarified how introspection of list generation parameters would work for
anonymously concatenated lists, and other more complex structures

=back

=head2 Since v2

=over 4

=item *

Clarified the order of arguments passed to the list generation function

=item *

Made ':' an alias for '..'
=back

=head2 Since v1

=over 4

=item *

Changed notation to generate lists using previous element value from
I<(@start:gen:$num_steps)> to I<(@start..gen:$num_steps)>

=item *

Made I<(@start:gen:$num_steps)> create a list that does not require
intermediate values to be calculated

=back

=head1 DESCRIPTION

This RFC proposes that Perl incorporate a broader tool box of list
generation techniques:

=over 8

=item *

Lazy evaluation of generated lists

=item *

Generation of arbitrary lists from a function

=back

These techniques would allow programs written in Perl to follow a structure
familiar to programmers used to numerical programming environments. It would
provide a more compact notation for many common mathematical algorithms, and
give Perl important information to make key optimisations.

=head2 Lazy evaluation of generated lists

The C<..> of previous Perls is a I<list generation operator>, which creates
a list based on its parameters:

    ($start..$stop);  # ($start, $start+inc, $start+2*inc, ... $stop)

where 'inc' is 1 if $start<$stop, or -1 otherwise.

The list is generated as soon as it is declared. This makes some code rather
inefficient:

    @a = (1..1000000);  # One million element list generated here
    print $a[99];

creates a one million element list despite only using one element of it.
Under I<lazy evaluation>, elements of the list are only created when they
are required, and saved for later use. In the previous example only $a[99]
would be calculated by interpolation (not sequentially) and stored when
using lazy evaluation.

Lists, whether generated lazily or not, are assumed to be I<stable>. That
is, the value of $a[99] will be the same everywhere in a program, unless @a
itself is modified. This means that lazily evaluated lists provide a handy
notation for memoization, as we will see later. It is proposed that once an
element has been calculated in a list, it is cached for later use rather
than recalculated each time.
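The behaviour described above (an element computed by interpolation on first
access, then cached) can be modelled with a small class. This is a sketch in
Python under stated assumptions, not the proposed Perl implementation:
LazyRange and its cache attribute are hypothetical names, and only the
plain ($start..$stop) form with an increment of 1 or -1 is covered.

```python
class LazyRange:
    """Sketch of a lazily evaluated ($start..$stop) list with caching."""

    def __init__(self, start, stop):
        self.start = start
        # 'inc' is 1 if start < stop, or -1 otherwise, as in the RFC text.
        self.step = 1 if start < stop else -1
        self.length = abs(stop - start) + 1
        self.cache = {}  # index -> value, filled only on demand

    def __len__(self):
        return self.length

    def __getitem__(self, index):
        if not 0 <= index < self.length:
            raise IndexError(index)
        if index not in self.cache:
            # Interpolate directly; no earlier element is generated.
            self.cache[index] = self.start + index * self.step
        return self.cache[index]
```

After a = LazyRange(1, 1000000) and a single read of a[99], the cache holds
exactly one entry, which is the saving the RFC is after.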
Lazy list elements get calculated when they are output, or used in an
expression that is output. If list elements are not output then they are
never calculated.

=head2 Introduction of C<:> to generate arbitrary lists

It is proposed that a new operator be added to Perl's list generation
arsenal, C<:>. The ':' character is chosen because it reflects standard
notation for array slicing, which is an important use of this operator.
C<:> is only meaningful when called in a list context, generating a lazily
evaluated list in one of 3 ways.

=over 4

=item 1. I<($start..$end:$step)>

Although earlier Perls could create ascending and descending lists
incrementing by one, other increments required an unwieldy map:

    @threes = map {3*$_} (1..5);  # (3,6,9,12,15)

which was also less than intuitive to those used to the simple slicing
notation of numerical programming languages such as Matlab and IDL.

This proposed use of C<:> is identical to C<..> without C<:>, except that it
increments by $step rather than 1. Specifically, it returns a list ($start,
$start+$step, $start+2*$step, ... $end). If $step does not go into
($end-$start)
RFC 272 (v1) Arrays: transpose()
This and other RFCs are available on the web at
  http://dev.perl.org/rfc/

=head1 TITLE

Arrays: transpose()

=head1 VERSION

  Maintainer: Jeremy Howard [EMAIL PROTECTED]
  Date: 22 Sep 2000
  Mailing List: [EMAIL PROTECTED]
  Number: 272
  Version: 1
  Status: Developing

=head1 ABSTRACT

It is proposed that a new function C<transpose> be added to Perl.
C<transpose($dim1, $dim2, @list)> would return @list with $dim1 and $dim2
switched. It would return an alias into the original list, not a copy of the
elements.

=head1 DESCRIPTION

It is proposed that Perl implement a function called C<transpose> that
transposes two dimensions of an array, and is evaluated lazily. L<RFC 202>
gives an overview of the proposed multidimensional arrays that C<transpose>
works with. For instance:

    @a = ([1,2],[3,4],[5,6]);
    @transposed_list = transpose(0,1,@a);   # ([1,3,5],[2,4,6])

This is different to C<reshape> (see L<RFC 148>) which does not reorder its
elements:

    @a = ([1,2],[3,4],[5,6]);
    @reshaped_list = reshape([3,2],@a);   # ([1,2,3],[4,5,6])

C<transpose> is its own inverse:

    @transposed_list = transpose(0,1,@a);   # ([1,3,5],[2,4,6])
    @orig_list = transpose(0,1,@transposed_list);   # ([1,2],[3,4],[5,6])
    @a == @orig_list;   # True

If C<transpose> refers to a dimension that does not exist, empty dimensions
autovivify as necessary:

    @row_vector = (1,2,3,4);
    @col_vector = transpose(0,1,@row_vector);   # ([1],[2],[3],[4])

C<transpose> does not make a copy of the elements of its arguments; it
simply creates an alias:

    @row_vector = (1,2,3,4);
    @col_vector = transpose(0,1,@row_vector);   # ([1],[2],[3],[4])
    $col_vector[[0,1]] = 0;
    @row_vector == (1,0,3,4);   # True

=head1 IMPLEMENTATION

RFC 90 discusses possible approaches to implementing aliasing.

=head1 REFERENCES

RFC 90: Arrays: merge() and unmerge()

RFC 148: Arrays: Add reshape() for multi-dimensional array reshaping
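To make the aliasing requirement concrete, here is a minimal Python sketch
of a transposed view over a two-dimensional list of lists. It is an
illustration only: TransposedView and to_lists are hypothetical names, only
the (0,1) case on rectangular data is handled, and autovivification of
missing dimensions is not modelled.

```python
class TransposedView:
    """Sketch of transpose(0, 1, @a) as an alias, not a copy."""

    def __init__(self, rows):
        self._rows = rows  # the original list of lists, kept by reference

    def __getitem__(self, index):
        i, j = index
        return self._rows[j][i]  # swap the two dimensions on read

    def __setitem__(self, index, value):
        i, j = index
        self._rows[j][i] = value  # writes go through to the original

    def to_lists(self):
        """Materialise the transposed data as a new list of lists."""
        width = len(self._rows[0])
        return [[row[i] for row in self._rows] for i in range(width)]
```

Because the view holds a reference to the original rows, assigning through
the view changes the original data, mirroring the aliasing semantics that
the RFC asks for.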
RFC 90 (v4) Arrays: merge() and unmerge()
This and other RFCs are available on the web at
  http://dev.perl.org/rfc/

=head1 TITLE

Arrays: merge() and unmerge()

=head1 VERSION

  Maintainer: Jeremy Howard [EMAIL PROTECTED]
  Date: 10 Aug 2000
  Last Modified: 21 Sep 2000
  Mailing List: [EMAIL PROTECTED]
  Number: 90
  Version: 4
  Status: Frozen

=head1 DISCUSSION

Two major issues were discussed. One was that merge and unmerge were not
deserving of being placed in the core. In the end this is a judgement call,
but merge() in particular is fundamental to programming in a functional
style, to manipulating multidimensional matrices, and for iterating through
multiple arrays simultaneously. Furthermore, implementing optimised aliasing
behaviour as proposed may not be possible in a module (depending on what
extension mechanisms are in Perl 6).

The other issue discussed was whether the aliasing behaviour is appropriate,
and achievable. A section has been added to this RFC discussing this.

=head1 ABSTRACT

It is proposed that two new functions, C<merge> and C<unmerge>, be added to
Perl. C<merge(@list1, @list2, ...)> would return a list that interleaved its
arguments. C<unmerge($num_lists, @list)> would reverse this operation. Both
functions would return an alias into the original list, not a copy of the
elements.

=head1 CHANGES

=head2 Since v3

=over 4

=item *

Name change from C<demerge> to C<unmerge>

=item *

Added discussion of options for aliasing behaviour

=back

=head2 Since v2

=over 4

=item *

Described aliasing behaviour of C<merge> and C<unmerge>

=back

=head2 Since v1

=over 4

=item *

Moved list to [EMAIL PROTECTED]

=item *

Changed name from zip/unzip

=item *

Pass lists directly in examples, not references

=item *

Change 2nd argument from $list_size to $num_lists

=back

=head1 DESCRIPTION

It is proposed that Perl implement a function called C<merge> that
interleaves the arguments of arrays together, and is evaluated lazily.
For instance:

    @a = (1,3,5);
    @b = (2,4,6);
    @merged_list = merge(@a,@b);   # (1,2,3,4,5,6)

This makes it easy to operate on multiple lists using flexible reduction
functions:

    $sum_xy = sub {reduce ^last+^x*^y, merge($_[0], $_[1])};
    print $sum_xy->(@a, @b);   # Prints '44', i.e. 1*2+3*4+5*6

In order to reverse this operation we need an C<unmerge> function:

    @merged_list = merge(@a,@b);   # (1,2,3,4,5,6)
    @unmerged_list = unmerge(2, @merged_list);   # ([1,3,5], [2,4,6])

The first argument to C<unmerge> is the number of lists that are to be
created (i.e. the number of lists that would have been C<merge>d for this to
reverse the operation).

If the list to be unmerged is not an exact multiple of the partition size,
the final list references are not padded--their length is one less than the
list size. For example:

    @list = (1..7);
    @unmerged_list2 = unmerge(3, @list);   # ([1,4,7], [2,5], [3,6])

Neither C<merge> nor C<unmerge> makes a copy of the elements of its
arguments; they simply create an alias to them:

    @a = (1,3,5);
    @b = (2,4,6);
    @merged_list = merge(@a,@b);   # (1,2,3,4,5,6)
    $merged_list[1] = 0;
    @b == (0,4,6);   # True

=head1 IMPLEMENTATION

The C<merge> and C<unmerge> functions should be evaluated lazily.

C<merge> and C<unmerge> return an alias into the original list, not a copy
of the elements.

Effectively, C<merge> creates an iterator over multiple lists. If used as
part of a reduction, the actual interleaved list need never be created. For
instance:

    $sum_xy = sub {reduce ^last+^x*^y, merge($_[0], $_[1])};
    $answer = $sum_xy->(@a, @b);

should be evaluated as if it read:

    $answer = 0;
    $answer += $a[$_] * $b[$_] for (0..$#a);

which does not need to create an intermediate list.

=head2 Aliasing implementation

The proposed aliasing behaviour is also proposed for part()/flatten() (see
L<RFC 91>) and reshape() (see L<RFC 148>). This requires some more thought.
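As a concrete picture of what "an alias into the original list" means for
merge(), the following Python sketch maps each index of the merged sequence
back onto the source lists instead of copying. MergedView is a hypothetical
name, the sketch assumes all source lists have the same length, and the
unmerge shown here returns copies rather than aliases.

```python
class MergedView:
    """Sketch of merge(@a, @b, ...) as an alias over equal-length lists."""

    def __init__(self, *lists):
        if len({len(lst) for lst in lists}) > 1:
            raise ValueError("this sketch assumes lists of equal length")
        self._lists = lists

    def __len__(self):
        return sum(len(lst) for lst in self._lists)

    def _locate(self, index):
        if not 0 <= index < len(self):
            raise IndexError(index)
        count = len(self._lists)
        # Element k of the merged list is element k // count of list k % count.
        return self._lists[index % count], index // count

    def __getitem__(self, index):
        source, position = self._locate(index)
        return source[position]

    def __setitem__(self, index, value):
        source, position = self._locate(index)
        source[position] = value  # write through to the original list


def unmerge(num_lists, merged):
    """Split an interleaved sequence back into num_lists lists (copies)."""
    items = list(merged)
    return [items[i::num_lists] for i in range(num_lists)]
```

Assigning to element 1 of a MergedView over a and b changes b[0], matching
the $merged_list[1] = 0 example above.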
Perl's current slicing operation C<@a[$x1, $x2]> only aliases when used in
an lvalue context:

    @a = (4,5,6);
    @b = @a[1,2];
    @b[0] = 9;          # @a == (4,5,6)
    @a[1,2] = (8,9);    # @a == (4,8,9)

We could do the same for merge() and friends. The downside is that:

    @transpose = part(
        # Find the size of each column
        scalar @list_of_lists,
        # Interleave the rows
        merge(@list_of_lists)
    );

and similar expressions would do an awful lot of copying. Ideally if merge()
didn't alias in an rvalue context, Perl would still optimise away multiple
merge()s, part()s, slices, and so forth so that only one copy occurred.

If the aliasing behaviour is implemented, then assigning an aliased array to
another array should result in a copy being created:

    @a = (1,3,5);
    @b = (2,4,6);
    @merged_list = merge(@a,@b);   # Just an alias into @a and @b
    @copied_list = @merged_list;   # Does an actual array copy

Alternatively, a copy-on-write optimisation could allow some of the
efficiency of full aliasing to be combined with the simplicity of Perl 5's
alias-in-lvalue behaviour.

=head1 REFERENCES

RFC 23: Higher order
RFC 91 (v4) Arrays: part and flatten
This and other RFCs are available on the web at
  http://dev.perl.org/rfc/

=head1 TITLE

Arrays: part and flatten

=head1 VERSION

  Maintainer: Jeremy Howard [EMAIL PROTECTED]
  Date: 10 Aug 2000
  Last Modified: 21 Sep 2000
  Mailing List: [EMAIL PROTECTED]
  Number: 91
  Version: 4
  Status: Frozen

=head1 DISCUSSION

The discussion points for this RFC were the same as those for RFC 90.

=head1 ABSTRACT

It is proposed that two new functions, C<part> and C<flatten>, be added to
Perl. C<part($part_size, @list, $skip)> would return @list broken into
references to sub-lists, each one $part_size in size, offset from the end of
the previous sub-list by $skip. C<flatten(@list_of_lists, $nest_level)>
would take a list of lists and dereference the elements recursively, up to
$nest_level levels deep. Both functions would return an alias into the
original list, not a copy of the elements.

=head1 CHANGES

=head2 Since v2

=over 4

=item *

Described aliasing behaviour of C<flatten> and C<part>

=item *

Added C<flatten> builtin

=back

=head2 Since v1

=over 4

=item *

Moved list to [EMAIL PROTECTED]

=item *

Changed name from C<partition>

=item *

Add optional third argument to allow skipping elements

=item *

Make difference between C<unmerge> and C<part> explicit

=back

=head1 DESCRIPTION

In order to work with lists of arbitrary size, it is often necessary to
split a list into equal sized sub-lists. A C<part> function is proposed that
achieves this:

    @list = (1,2,3,4,5,6);
    @parted_list = part(2, @list);   # ([1,2],[3,4],[5,6])

This is useful to provide tuples to functions that can operate on lists of
lists, for instance:

    @sum_pairs = map {$_->[0] + $_->[1]} @parted_list;   # (3,7,11)

The optional third argument can be used to skip over elements of the list:

    @list = (1,2,3,4,5,6,7,8);
    @parted_list = part(2, @list, 1);   # ([1,2],[4,5],[7,8])

If the list to be parted is not an exact multiple of the part size, the
final list reference is not padded.
For example:

    @list2 = (1..7);
    @parted_list2 = part(3, @list2);   # ([1,2,3], [4,5,6], [7])

A list that has been parted can be flattened again using the C<flatten>
function:

    @parted_list2 = ([1,2,3], [4,5,6], [7]);
    @list2 = flatten(@parted_list2);   # (1,2,3,4,5,6,7)

C<flatten> can work on arrays of greater than two dimensions:

    my @some_LOL = ([[1,2], [3,4]], [[5,6], [7,8]]);
    my @flat_LOL = flatten(@some_LOL);   # ([1,2,3,4], [5,6,7,8])

The innermost list of lists is flattened first. To flatten more levels of
nesting, use the optional second argument:

    my @some_LOL = ([[1,2], [3,4]], [[5,6], [7,8]]);
    my @flat_LOL = flatten(@some_LOL,2);   # (1,2,3,4,5,6,7,8)

Neither C<part> nor C<flatten> makes a copy of the elements of its
arguments; they simply create an alias to them:

    @parted_list2 = ([1,2,3], [4,5,6], [7]);
    @list2 = flatten(@parted_list2);   # (1,2,3,4,5,6,7)
    $list2[1] = 0;
    @parted_list2 == ([1,0,3], [4,5,6], [7]);   # True

C<merge> (see RFC 90) and C<part> can work together to allow manipulation of
arbitrary sized lists.
For instance, we can extend the $sum_xy function used as an example in the
C<merge> RFC, which takes two lists and returns the sum of them multiplied
together component-wise:

    $sum_xy = sub {reduce ^last+^x*^y, merge($_[0], $_[1])};

to a function $sum_mult that does the same with an arbitrary number of
lists:

    # Multiply all the elements of a list together, returning the result
    $apply_times = reduce (^total * ^element, @^multiplicands);

    # Swap the rows and columns of a list of lists
    $transpose = part(
        # Find the size of each column
        scalar @^list_of_lists,
        # Interleave the rows
        merge(@^list_of_lists)
    );

    # Take a list of references to lists, multiply them component-wise,
    # and return their sum
    $sum_mult = reduce (
        ^total + $apply_times->( @^next_list ),
        $transpose->(^list_of_lists),
    );

    # Example usage of $sum_mult
    @a = (1,3,5);
    @b = (2,4,6);
    @c = (-1,1,-1);
    $answer = $sum_mult->(\@a, \@b, \@c);   # 1*2*-1+3*4*1+5*6*-1 = -20

A common usage of C<part> and C<unmerge> is to access the rows and columns
of a matrix stored as a flat list:

    @array = ( a1, a2, a3,
               b1, b2, b3,
               c1, c2, c3 );
    @columns = unmerge(3,@array);   # Return the columns
    @rows = part(3,@array);         # Return the rows

=head1 IMPLEMENTATION

The C<part> function should be evaluated lazily. Because it is used in
common operations such as the transposition of a matrix, its efficiency is
particularly important.

C<part> and C<flatten> return an alias into the original list, not a copy of
the elements.

C<part> and C<flatten> are just special cases of C<reshape> (from RFC 148).

=head1 REFERENCES

RFC 23: Higher order functions

RFC 76: Builtin: reduce

RFC 90: Builtins: merge() and unmerge()

RFC 148: Add reshape() for multi-dimensional array reshaping

=head1 ACKNOWLEDGEMENTS

Damian Conway: Numerous comments on first draft
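The value-level behaviour of part() and flatten() in the examples of this
RFC can be reproduced with a short Python sketch. It is illustrative only:
it returns copies (the RFC proposes aliases), the helper _depth is a
hypothetical name, and nesting depth is measured from the first element, so
evenly nested input is assumed.

```python
def part(size, items, skip=0):
    """Break items into sub-lists of `size`, skipping `skip` items between."""
    stride = size + skip
    return [items[i:i + size] for i in range(0, len(items), stride)]


def _depth(value):
    """Nesting depth, measured along the first element (assumption)."""
    depth = 0
    while isinstance(value, list) and value:
        depth += 1
        value = value[0]
    return depth


def _flatten_once(lol):
    depth = _depth(lol)
    if depth <= 1:
        return list(lol)  # already flat
    if depth == 2:
        return [item for sub in lol for item in sub]
    # Deeper structures: the innermost list of lists is flattened first.
    return [_flatten_once(sub) for sub in lol]


def flatten(lol, levels=1):
    """Flatten `levels` levels of nesting, innermost first."""
    for _ in range(levels):
        lol = _flatten_once(lol)
    return lol
```

For a matrix stored as a flat list, part(3, array) then yields the rows, as
in the usage example above.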
RFC 148 (v3) Arrays: Add reshape() for multi-dimensional array reshaping
Reply-To: [EMAIL PROTECTED]

This and other RFCs are available on the web at
  http://dev.perl.org/rfc/

=head1 TITLE

Arrays: Add reshape() for multi-dimensional array reshaping

=head1 VERSION

  Maintainer: Jeremy Howard [EMAIL PROTECTED]
  Date: 24 Aug 2000
  Last Modified: 21 Sep 2000
  Mailing List: [EMAIL PROTECTED]
  Number: 148
  Version: 3
  Status: Frozen

=head1 CHANGES

Changed semantics to match PDL, NumPy, and J.

Changed maintainer from Nathan Wiger [EMAIL PROTECTED].

=head1 ABSTRACT

Currently, there is no easy way to reshape existing arrays into multiple
arrays or matrices. This makes nifty array manipulation and complex math
hard. A general-purpose tool that can do arbitrary multi-dimensional array
reshaping, from which other array manipulation functions can be derived,
makes data manipulation easier.

=head1 DESCRIPTION

Let's jump in. This RFC proposes a C<reshape> builtin that takes an array to
reshape as the second parameter, and a list of dimensions to reshape the
array to as the first parameter:

    my int @a = ([1,2,3], [4,5,6]);
    @b = reshape([2,3], @a);     # ([1,2],[3,4],[5,6])
    @c = reshape([1,6], @a);     # ([1],[2],[3],[4],[5],[6])
    @d = reshape([6,1], @a);     # (1,2,3,4,5,6)
    @e = reshape([1,2,3], @a);   # ([[1],[2]],[[3],[4]],[[5],[6]])

The dimensions specified in the first argument are the same ones used by the
C<:shape> array attribute described in L<RFC 203>. L<RFC 202> gives an
overview of the proposed multidimensional arrays that C<reshape> works with.

We only need one C<reshape> since it is a multipurpose tool that works in
any direction, serving as its own inverse.
The dimensions used are subject to the following properties:

=over 4

=item 1

Less data than specified causes C<reshape> to repeat the data as many times
as necessary to fill the new structure (like C<$> in the J language)

=item 2

More data than specified is silently discarded

=back

Any one (but no more than one) element of the list of dimensions can be
'-1', which indicates that that dimension should be made as large as
necessary to fill in the array:

    my int @a = ([1,2,3], [4,5,6]);
    @b = reshape([-1,3], @a);   # ([1,2],[3,4],[5,6])
    @c = reshape([-1], @a);     # (1,2,3,4,5,6)

The semantics of C<reshape> match those of PDL's reshape(), NumPy's
reshape(), and J's verb C<$>. See the references.

C<reshape> creates an alias to the original array, not a copy (this is like
merge/demerge/part/flatten). See L<RFC 90> for discussion of aliasing
behaviour that would apply to C<reshape>.

=head1 IMPLEMENTATION

For simple typed arrays (RFC 203) it is simply a case of changing the
dimension attributes stored internally. For standard lists of lists, the
actual references and arrays will have to be rejigged, which will be a slow
operation. However, C<reshape> should rarely be used on arrays that are not
stored compactly, since standard lists of lists are unlikely to be used for
heavy data crunching.

=head1 MIGRATION

None. This introduces new functionality.

=head1 REFERENCES

RFC 81: Lazily evaluated list generation functions

RFC 90: Arrays: Builtins: merge() and demerge()

RFC 202: Arrays: Overview of multidimensional array RFCs (RFC 203 through
RFC 207)

RFC 203: Arrays: Notation for declaring and creating arrays

Thanks to Uri Guttman for suggesting the APL "reshape" name

The '$' verb in J, described in The J Primer (provided as a help file with
the J Language, available from http://www.jsoftware.com/)

reshape() in NumPy: http://starship.python.net/~da/numtut/array.html#SEC3

reshape() in PDL: http://pdl.sourceforge.net/PDLdocs/Core.html#reshape
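The rules above (repeat short data, discard excess data, allow a single -1
dimension) can be sketched in Python as follows. This is an illustration
under assumptions, not the proposed builtin: it returns a copy rather than
an alias, it follows the RFC's examples in treating the first dimension as
the innermost one, and it does not try to reproduce the [6,1] example, for
which it returns one nested row rather than a flat list. The helper
_flatten_all is a hypothetical name.

```python
from itertools import cycle, islice
from math import prod


def _flatten_all(value):
    """Yield the scalar leaves of an arbitrarily nested list."""
    if isinstance(value, list):
        for item in value:
            yield from _flatten_all(item)
    else:
        yield value


def reshape(dims, data):
    """Rearrange data into nested lists; dims[0] is the innermost size."""
    flat = list(_flatten_all(data))
    dims = list(dims)
    if dims.count(-1) > 1:
        raise ValueError("at most one dimension may be -1")
    if -1 in dims:
        known = prod(d for d in dims if d != -1)
        # Make the free dimension as large as necessary (ceiling division).
        dims[dims.index(-1)] = -(-len(flat) // known)
    total = prod(dims)
    # Repeat the data if there is too little; drop the excess otherwise.
    items = list(islice(cycle(flat), total))
    # Group from the innermost dimension outwards.
    for size in dims[:-1]:
        items = [items[i:i + size] for i in range(0, len(items), size)]
    return items
```

Since every call starts by flattening its input, applying reshape again with
the original dimensions restores the original structure, which is the sense
in which it serves as its own inverse.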
RFC 203 (v2) Arrays: Notation for declaring and creating arrays
This and other RFCs are available on the web at
  http://dev.perl.org/rfc/

=head1 TITLE

Arrays: Notation for declaring and creating arrays

=head1 VERSION

  Maintainer: Jeremy Howard [EMAIL PROTECTED]
  Date: 8 Sep 2000
  Last Modified: 21 Sep 2000
  Mailing List: [EMAIL PROTECTED]
  Number: 203
  Version: 2
  Status: Frozen

=head1 DISCUSSION

No objections were noted to the proposals in this RFC. A change of name from
:bounds to :shape was accepted.

=head1 ABSTRACT

RFC 202 described the need to be able to declare a data structure that
contains elements of the same type stored contiguously in memory, which is
called an I<array>. This RFC outlines the syntax to declare and create
arrays.

The syntax to create arrays is identical to that to create lists of lists
(described in L<perllol> in the Perl 5 documentation). The syntax to declare
the type of elements is the standard type syntax. RFC 204 describes a syntax
for multidimensional indexing of arrays. A syntax to declare the bounds of
the dimensions is described in this RFC using a new ':shape' attribute.

=head1 DESCRIPTION

=head2 Compact arrays

It is proposed that if a list is declared that specifies a simple type for
its elements:

    my int @a;

then that list be stored as an array--that is, contiguously in memory.

These arrays support I<all> the same syntax as lists. Therefore any
description of syntax for a 'list' also applies to an 'array', and vice
versa. However, their implementation is very different.

=head2 :shape attribute

Furthermore, it is proposed that lists accept a new C<:shape> attribute:

    my @a :shape(3);

that defines the number of elements in a list. This is equivalent to:

    my @a;
    $#a = 3-1;

except that an attempt at accessing beyond $a[2] would result in an error if
:shape(3) is set.
Specifically, adding a :shape attribute to a declaration has two effects on
the array:

=over 4

=item 1

Adds range checking (since exceptions are defined for access outside the
range)

=item 2

Allows the block of memory to be preallocated

=back

(2) can also be achieved by simply setting the bottom right element to
undef, or by setting @#array. The behaviour of (1) removes autovivification
of new elements, since an exception is raised instead.

:shape doesn't actually reshape. If the returned array overshoots specified
bounds of :shape, an exception is raised. Reshaping is done with reshape()
(RFC 148), merge()/demerge() (RFC 90), and part()/flatten() (RFC 91).

The :shape attribute can also accept a list:

    my @b :shape(3,3);

The second element of the list is the number of elements in the list, as
before. The first element is the number of lists that are referenced as
elements of the list. Therefore

    my @b :shape(3,3);
    $b[3][4] = 0;   # Error: access beyond bounds of @b

would result in an error. :shape can take as many arguments as required--an
n-element list declares a list with at most n levels of nesting, with the
maximum index at level x being (n-x). Because lists of lists support
multidimensional indexing (see RFC 204) the :shape attribute effectively
specifies the bounds of a multidimensional structure.

The parameters of :shape are optional if an array is assigned in the
declaration. Therefore:

    my int @array :shape = @rvalue;

is equivalent to:

    my int @array :shape(@#rvalue) = @rvalue;

The bounds of an array or list can be specified at run time, of course:

    my @t1 :shape(@dimList) = getFromSomeplace();

=head2 Combining compact storage and :shape attribute

Efficient multidimensional arrays can be declared by combining a fixed
simple type with the :shape attribute:

    my int @b :shape(4,4);

Perl in this case would set aside enough room for sixteen ints, and store an
attribute with @b that it had two dimensions, each indexed by (0..3).
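The combination described here (contiguous typed storage plus range
checking) can be pictured with a small Python sketch. It is only a model of
the idea: ShapedArray is a hypothetical name, the array module's "q" type
code (signed 64-bit integer) stands in for C<int>, and row-major layout is
an assumption that this RFC does not specify.

```python
from array import array


class ShapedArray:
    """Sketch of `my int @b :shape(4,4)`: preallocated, bounds-checked."""

    def __init__(self, *shape):
        self.shape = shape
        size = 1
        for extent in shape:
            size *= extent
        # One contiguous block, preallocated and zero-filled.
        self.data = array("q", [0] * size)

    def _offset(self, index):
        if not isinstance(index, tuple):
            index = (index,)
        if len(index) != len(self.shape):
            raise IndexError("wrong number of indices")
        offset = 0
        for position, extent in zip(index, self.shape):
            # Range checking replaces autovivification with an error.
            if not 0 <= position < extent:
                raise IndexError(f"{index} is outside shape {self.shape}")
            offset = offset * extent + position  # row-major (assumed)
        return offset

    def __getitem__(self, index):
        return self.data[self._offset(index)]

    def __setitem__(self, index, value):
        self.data[self._offset(index)] = value
```

ShapedArray(4, 4) reserves room for sixteen integers up front, and any index
outside (0..3) in either dimension raises an error instead of growing the
structure.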
Because @b here is stored as an array, and supports multidimensional
indexing (see RFC 204), it is a true multidimensional array.

Although @b looks just like a normal list of lists that happens to have a
type and an attribute, it is implemented as a multidimensional array.
Therefore

    my int @b :shape(4,4);
    @b = ([1,2,3,4],
          [5,6,7,8],
          [9,10,11,12],
          [13,14,15,16]);

creates a multidimensional array @b that contains all sixteen ints in a
contiguous block of memory, but can be accessed using standard list of lists
syntax, along with the extensions proposed in RFC 204.

Where the type and bounds of an array can be derived at run time, it is not
necessary to specify them explicitly:

    my int @t1 :shape(@dimList) = getFromSomeplace();
    my int @t2 :shape(@dimList) = getFromSomeplaceElse();
    my @prod = @t1 * @t2;
    # @prod magically has type (int) and :shape(@dimList)

Note that this is using an element-wise multiplication operation, described
in RFC 82. If either @t1 or @t2 was unbounded (i.e. had no :shape attribute)
then @prod would also be unbounded.

A list (of lists...) that contains elements of the same type can be
converted to an
RFC 82 (v4) Arrays: Apply operators element-wise in a list context
This and other RFCs are available on the web at
  http://dev.perl.org/rfc/

=head1 TITLE

Arrays: Apply operators element-wise in a list context

=head1 VERSION

  Maintainer: Jeremy Howard [EMAIL PROTECTED]
  Date: 10 Aug 2000
  Last Modified: 21 Sep 2000
  Mailing List: [EMAIL PROTECTED]
  Number: 82
  Version: 4
  Status: Frozen

=head1 DISCUSSION

The first source of discussion was around whether there is any consistent
meaning to array operations. The source of this was that some felt that
other people may want C<*> to mean something other than element-wise
multiplication by default (e.g. matrix inner product). However no-one
actually said that I<they> wanted it to work this way, only that I<others>
may prefer it, particularly mathematicians. The standard use of element-wise
operations in mathematical programming languages such as Mathematica and J
suggests that this is unlikely to be a source of confusion in practice.

The second source of discussion was around whether C<||> and C<&&> should be
an exception, as specified in RFC 45. This is discussed in detail in the
CONFLICTS section.

=head1 ABSTRACT

It is proposed that in a list context, operators are applied element-wise to
their arguments. Furthermore, it is proposed that this behaviour be extended
to functions that do not provide a specific list context.

=head1 CHANGES

=head2 Since v3

=over 4

=item *

Added discussion of conflict with RFC 45

=back

=head2 Since v2

=over 4

=item *

Extended to work with multidimensional arrays

=item *

Added ability to broadcast vectors across multidimensional arrays

=item *

Made operating on non-equal sized lists an error

=back

=head2 Since v1

=over 4

=item *

Added the ability to apply an operator to a scalar and a list.

=item *

Added more examples, including text processing examples.
=back

=head1 DESCRIPTION

Currently, operators applied to lists in a list context behave counter-intuitively:

  @b = (1,2,3);
  @c = (2,4,6);
  @d = @b * @c;   # Returns (9) == scalar @b * scalar @c

This RFC proposes that operators in a list context should be applied element-wise to the elements of their arguments:

  @d = @b * @c;   # Returns (2,8,18)

If the lists are not of equal length, an error is raised.

=head2 Multidimensional array operations

RFC 202 describes multidimensional arrays in Perl 6. Element-wise list operations also apply to multidimensional arrays:

  my int @mat1 = ([1,2], [3,4]);
  my int @mat2 = ([2,2], [1,1]);
  my @mat3 = @mat1 * @mat2;   # ([2,4],[3,4])

An error is raised if the two arrays do not have equal dimensions.

=head2 Broadcasting

If an operator is used in a list context with one list (or multidimensional array), and one or more scalars, the scalars are treated as if they were an array of that scalar with the same dimensions as the array:

  my int @mat1 = ([1,2], [3,4]);
  @e = @mat1 * 2;         # ([2,4],[6,8])
  @f = @mat1 * (2,2,2);   # Same thing

If one operand is a vector of the same bounds as the equivalent dimension of the other operand, the vector's elements are 'broadcast' across every other dimension of the other operand:

  my int @mat1 = ([1,2], [3,4]);
  my int @vec1 = (2,3);       # 1st dimension
  @g = @mat1 * @vec1;         # ([1*2,2*2],[3*3,4*3]) == ([2,4],[9,12])
  my int @vec2 = ([2],[3]);   # 2nd dimension
  @h = @mat1 * @vec2;         # ([2,6],[6,12])

If the operands are a column vector and a row vector, the elements of each vector are combined into a two dimensional array:

  my int @vec1 = (2,3);       # 1st dimension
  my int @vec2 = ([2],[3]);   # 2nd dimension
  @i = @vec1*@vec2;           # ([2*2,3*2],[2*3,3*3]) == ([4,6],[6,9])

Equivalent combinatorial broadcasting occurs if the operands are perpendicular planes (creating a cube), and so forth for higher dimensional arrays.
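The element-wise and scalar-broadcasting rules above can be sketched in runnable form. This is only an illustration in Python (the syntax proposed in this RFC cannot be run), using nested lists in place of Perl arrays. The function name elementwise is an assumption made for this sketch, and vector-across-dimension broadcasting is deliberately left out.

```python
import operator


def elementwise(op, a, b):
    """Apply a binary operator element-wise over nested lists.

    Scalars are broadcast against lists; two lists must have equal length
    at every nesting level, otherwise an error is raised.
    """
    a_is_list = isinstance(a, list)
    b_is_list = isinstance(b, list)
    if not a_is_list and not b_is_list:
        return op(a, b)  # two scalars: plain operator application
    if a_is_list and b_is_list:
        if len(a) != len(b):
            raise ValueError("operands do not have equal dimensions")
        return [elementwise(op, x, y) for x, y in zip(a, b)]
    if a_is_list:
        return [elementwise(op, x, b) for x in a]  # broadcast scalar b
    return [elementwise(op, a, y) for y in b]      # broadcast scalar a


print(elementwise(operator.mul, [1, 2, 3], [2, 4, 6]))
print(elementwise(operator.mul, [[1, 2], [3, 4]], 2))
```

The two calls correspond to the @d and @e examples in the text.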
=head2 Element-wise functions

Functions that do not return a list should be treated in the same way:

  @e = (-1,1,-3);
  @f = abs(@e);   # Returns (1,1,3)

=head1 EXAMPLES

=head2 Text processing

If @first_names contains a list of people's first names, and @surnames contains their surnames, this creates a new list that concatenates the elements of the two lists:

  @full_names = @first_names . @surnames;

To quote a number of lines of a message by prefixing them all with '> ':

  @quoted_lines = '> ' . @raw_lines;

To create a histogram for a list of scores:

  @people = ('adam', 'eve ', 'bob ');
  @scores = (7,9,5);            # Score for each person
  @histogram = '#' x @scores;   # Returns ('#######','#########','#####')
  print join("\n", @people . ' ' . @histogram);

  adam #######
  eve  #########
  bob  #####

=head2 Number crunching

This snippet multiplies two arrays element-wise, adds a third, takes the absolute value of each element, and sums the results, in a very efficient way:

  @b = (1,2,3);
  @c = (2,4,6);
  @d = (-2,-4,-6);
  $sum = reduce ^_+^_, abs(@b * @c + @d);

Lists can be reordered or sliced with list generation functions (RFC 81) allowing flexible data
RFC 207 (v2) Arrays: Efficient Array Loops
  ($a[|i], $b[|i]) = ($c[ 2*|i ], $c[ 2*|i + 1 ]);

  $average[|i] = ($a[|i-1] + $a[|i] + $a[|i+1])/3;

=back

In the first example, |i will never take on values that would cause 2*|i+1 to be out of bounds for $c.

As mentioned above, using multiple looping indices will cause a nested loop. The order of nesting the loops is not specified here, but any interdependencies among the indices must be satisfied. In most of the examples above, the loops caused by the multiple iterators are independent. However, in the "upper triangle" example, since the range of |j depends on the current value of |i, |i must be the "outer loop".

In expressions containing looping indices and RFC205-style Cartesian-product array slices (e.g., $matrix[|x;@y]), each explicit (or implicit) non-singleton argument to ; acts as if it were an anonymous iterator using the explicit (or implicit (0..)) range list. Each anonymous iterator is independent of the others. This can be very powerful, especially when combined with the * operand to ;:

=over 4

  # Generalized tensor multiplication:
  @product = $a[|i;*] * $b[|j;*];

=back

The use of ; also makes it easier to express long lists of looping indices. $array[[|i,|j,|k]] is equivalent to $array[|i;|j;|k], but doesn't use as much punctuation.

Looping indices aren't restricted to being used solely as array indices, as the "unriffle" example showed. But each looping index has to be used in an array index for at least one array.

=over 4

  # find $nth triangular number
  my $triangle = 0;
  $triangle += |i=(0..$n);   # compile-time error: |i not used as index

  # Fill a multiplication table
  my @multtable : shape(12,12);
  $multtable[|i;|j] = |i*|j;   # OK

=back

=head2 Lazy Evaluation

Assuming that lazy evaluation is used in other parts of Perl6, it would be nice if these loops could also be evaluated lazily.
In list context, this could be done by creating an anonymous function to evaluate the looped expression at the desired indices:

=over 4

  $a[|i]*$b[|j]    # in list context

  # becomes
  sub { my ($i,$j) = @_; $a[$i]*$b[$j]; }

=back

This anonymous function can be TIEd to the resulting anonymous array, so all array lookups would invoke this function. Since TIEing is supposed to be improved in Perl6, this would be a reasonable way to do it. If other lazy evaluation mechanisms work in Perl6, they could be used instead.

I am uncertain if lazy evaluation makes sense in void context.

=head2 Examples:

=over 2

  $t[[|i,|j]] = $a[[|j,|i]];   # transpose 2-d @a

=back

would be equivalent to:

=over 2

  {
    my $i; my $j;
    for $i (0..) {       # last if out-of-bounds
      for $j (0..) {     # last if out-of-bounds
        $t[[$i,$j]] = $a[[$j,$i]];
      }
    }
  }

=back

This notation also allows (as a specific use) an alternative notation to the RFC 82 element-wise syntax.

=over 2

  # compute pairwise sum, pairwise product, pairwise difference...
  @sum = @a[[|i,|j,|k,|l]] + @b[[|i;|j;|k;|l]];   # RFC82: @sum = @a + @b
  @prod= @a[[|i,|j,|k,|l]] * @b[[|i;|j;|k;|l]];   # @prod = @a * @b
  @diff= @a[[|i,|j,|k,|l]] - @b[[|i;|j;|k;|l]];   # @diff = @a - @b

=back

RFC 82 syntax is simpler, but this is perl, so There Is More Than One Way To Do It.

Note that if the "Lazy Evaluation" schema mentioned above is adopted, then these sums, products, and differences could be automagically lazy as well.
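As a rough, runnable illustration of the two ideas above (an expression evaluated only at the indices actually requested, and the nested loop that the transpose notation stands for), here is a small Python sketch. Python is used only because the proposed syntax cannot be run. The names LazyProduct and transpose are hypothetical, and transpose assumes a non-empty rectangular list of lists.

```python
class LazyProduct:
    """Table of a[i] * b[j] whose entries are computed only when looked up."""

    def __init__(self, a, b):
        self.a = a
        self.b = b
        self.evaluations = 0  # number of entries actually computed so far

    def __getitem__(self, index):
        i, j = index
        if not (0 <= i < len(self.a) and 0 <= j < len(self.b)):
            raise IndexError("index out of bounds")
        self.evaluations += 1
        return self.a[i] * self.b[j]


def transpose(a):
    """Eager nested-loop equivalent of  $t[[|i,|j]] = $a[[|j,|i]]."""
    rows, cols = len(a), len(a[0])
    t = [[None] * rows for _ in range(cols)]
    for i in range(cols):
        for j in range(rows):
            t[i][j] = a[j][i]
    return t


table = LazyProduct([1, 2, 3], [10, 20])
print(table[2, 1], table.evaluations)
print(transpose([[1, 2, 3], [4, 5, 6]]))
```

Only one product is computed for the single lookup, which is the property a tied, lazily evaluated array would provide.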
=head1 IMPLEMENTATION

The simplest implementation would be to convert at compile-time (or parse time) void-context looped iterator scopes to loops analogous to the above examples, and convert list-context looped iterator scopes to valued do-blocks or invoked anonymous subroutines:

=over 4

  $dotproduct = reduce {^_+^_},0,$a[|i]*$b[|i];

  # would be transformed into
  $dotproduct = reduce {^_+^_},0,
      sub {
          my $i;
          my @r;
          for $i (0..min($#a,$#b)) {
              $r[$i] = $a[$i] * $b[$i];
          }
          return @r;
      }->();

=back

A more sophisticated, preferred, implementation would take advantage of the static, known nature of the data to create a highly optimized version of the loop. Possible optimizations include: common sub-expression elimination, encoding internally to some non-interpreted looping construct, etc.

If special 'numeric functions' are provided in Perl, then expressions with just unoverloaded operators and numeric functions could be optimised into tight compiled loops, as occurs for example with fromfunction() and ufuncs in Numeric Python:

  http://starship.python.net/~da/numtut/array.html#SEC8
  http://starship.python.net/~da/numtut/array.html#SEC13

For lazy evaluation, the value of the expression at any given set of indices is easy to calculate. However the lazy evaluation mechanism works, it can use this property to calculate the appropriate values.

=head1 REFERENCES

RFC 203: Notation for declaring and creating arrays

RFC 204: Notation for indexing arrays with an LOL as an index

RFC 205: New operator ';' for creating array slices
RFC 83 (v3) Make constants look like variables
This and other RFCs are available on the web at
  http://dev.perl.org/rfc/

=head1 TITLE

Make constants look like variables

=head1 VERSION

  Maintainer: Jeremy Howard [EMAIL PROTECTED]
  Date: 10 Aug 2000
  Last Modified: 21 Sep 2000
  Mailing List: [EMAIL PROTECTED]
  Number: 83
  Version: 3
  Status: Frozen

=head1 DISCUSSION

The syntax was widely accepted. Some posters preferred to see constants expanded to work within complex data structures. Others preferred to keep things simple. This RFC keeps things simple. A counter-RFC has not been submitted.

=head1 ABSTRACT

This RFC proposes that the current constant.pm module be removed, and replaced with a syntax allowing any variable to be marked as constant.

=head1 CHANGES

=head2 Since v1

=over 4

=item *

Changed notation to be consistent with other value attributes

=item *

Specified behaviour of references, arrays, and lists

=item *

Added definition

=back

=head1 DESCRIPTION

A constant is a value that can not be changed once it is declared. Once declared, it behaves just as if it was the actual literal value that it contains.

Currently, constants are created in Perl using the constant.pm module:

  use constant PI => 3.1415926;

which creates an inlined subroutine:

  sub PI () {3.1415926;}

This method of creating constants has three serious drawbacks:

=over 8

=item Can not be interpolated in strings

Whereas variables can be interpolated into strings (e.g. "PI is $PI"), subroutines can not be. This makes using constants inconvenient, since string concatenation must be used.

=item Inconsistent syntax

The sudden appearance of barewords can be quite unsettling to new users. After being told that 'arrays are @name, scalars are $name, ...', the rule suddenly stops working just because the programmer wants the value to stay constant.

=item Redundant warnings

In persistent Perl environments such as mod_perl, inlined subroutines often create the redundant warning 'Constant subroutine PI redefined'.
This has been a frequent source of confusion amongst new mod_perl users.

=back

It is proposed that a new syntax for declaring constants be introduced:

  my $PI : constant = 3.1415926;
  my @FIB : constant = (1,1,2,3,5,8,13,21);
  my %ENG_ERRORS : constant = (E_UNDEF=>'undefined', E_FAILED=>'failed');

Constants can be lexically or globally scoped (or any other new scoping level yet to be defined).

If an array or hash is marked constant, it cannot be assigned to, and its elements can not be assigned to:

  @FIB = (1,2,3);                     # Compile time error
  @FIB[0] = 2;                        # Compile time error
  %ENG_ERRORS=();                     # Compile time error
  %ENG_ERRORS{E_UNDEF=>'No problem'}  # Compile time error

To create a reference to a constant use the reference operator:

  my $ref_pi = \$PI;

To create a constant reference use a reference operator in the declaration:

  my $a = 'Nothing to declare';
  my $const_ref : constant = \$a;

Note that this does not make the scalar referenced become constant:

  $$const_ref = 'Jewellery';   # No problems
  $const_ref = \4;             # Compile time error

=head1 IMPLEMENTATION

Constants should have the same behaviour as they do now. They should be inlined, and constant expressions should be calculated at compile time.

=head1 EXTENSIONS

It may be desirable to have a way to remove constness from a value. This will not be covered in this RFC--if it is required a separate RFC should be written referencing this one.

=head1 REFERENCES

perldoc constant

perldoc perlsub (for constant subroutines)
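The distinction drawn in the DESCRIPTION between a constant container and the data a constant reference points to can be tried out with a loose Python analogy. This is not the proposed mechanism: Python rejects the stores at run time rather than compile time, and it does not stop the names themselves from being rebound. The helper try_store and the variable names are assumptions made for this sketch.

```python
from types import MappingProxyType

# Read-only stand-ins for @FIB and %ENG_ERRORS.
FIB = (1, 1, 2, 3, 5, 8, 13, 21)
ENG_ERRORS = MappingProxyType({"E_UNDEF": "undefined", "E_FAILED": "failed"})


def try_store(container, key, value):
    """Return True if container[key] = value succeeds, False if it is rejected."""
    try:
        container[key] = value
    except TypeError:
        return False
    return True


# A "constant reference": the holder cannot be re-pointed,
# but the thing it refers to stays mutable.
target = ["Nothing to declare"]
CONST_REF = (target,)

print(try_store(FIB, 0, 2))                          # element of a constant array
print(try_store(ENG_ERRORS, "E_UNDEF", "No problem"))  # element of a constant hash
print(try_store(CONST_REF[0], 0, "Jewellery"))       # store through the reference
print(try_store(CONST_REF, 0, [4]))                  # re-pointing the reference
```

The third store succeeds and changes target, mirroring the '$$const_ref' case, while the other three are refused.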
RFC 269 (v1) Perl should not abort when a required file yields a false value
This and other RFCs are available on the web at
  http://dev.perl.org/rfc/

=head1 TITLE

Perl should not abort when a required file yields a false value

=head1 VERSION

  Maintainer: Dominus [EMAIL PROTECTED]
  Date: 21 Sep 2000
  Mailing List: [EMAIL PROTECTED]
  Number: 269
  Version: 1
  Status: Developing

=head1 ABSTRACT

Modules should not have to end with C<1;>. It is silly and confusing.

=head1 DESCRIPTION

Modules typically contain subroutine definitions. A module may contain initialization code also. If the initialization code fails, the module can return a false value to its caller, which aborts the compilation.

In Perl 5, a module that contains nothing but subroutine definitions will return false by default, necessitating a 1; at the bottom. If the C<1;> is omitted, Perl emits the error

  Foo.pm did not return a true value...

In spite of plenty of documentation, people Frequently Ask what this error means.

Some languages like to have the compiler emit annoying messages to announce you forgot to include some pointless code whose only purpose is to stop the compiler from emitting the annoying message. Perl is mostly free of such nonfeatures.

I propose that this unfeature be dropped entirely. No useful functionality is lost. If a Perl 6 module wants to indicate an initialization failure by throwing a fatal exception, it can simply call C<die>. If the calling module wants to abort when a C<require>d file returns a false value, it is free to do that.

The 'module initialization' feature is little-used. 99 of the 102 files in Perl 5.6 lib/*.{pl,pm} end with C<1;>. AnyDBM_File invokes 'die' explicitly. The only real exceptions are diagnostics.pm and timelocal.pl.

=head1 IMPLEMENTATION

'require' should execute code in a file and return the result, as before, but it should not call Perl_die when the result is false. However, see below.

=head1 MIGRATION

In 98% of cases, no translation is necessary. The first version of the translator can ignore the issue entirely.
Strategies to cover the other 2% follow:

In general, direct source translation of this feature of Perl 5 modules would probably be impossible. It's tempting to say that the translator should simply translate the last statement or block in the module from this:

  STATEMENT

to this:

  unless (do {STATEMENT}) {
    require Carp;
    Carp::croak "... did not return a true value";
  }

However, I think that is impractical. The module might contain code that looks like this:

  if (something()) {
    return $v1;
  }
  ...
  $v2;

In this case the 'return $v1' statement would I<also> have to be translated. In general, there might be many, many statements that would need to be translated. This would look awful.

I think that if complete coverage is desired, the best choice would be to introduce a new pragma, which would enable the old behavior. A translated module would begin with

  package Foo;
  use perl5 'require/use semantics';
  ...

When this file was C<require>d, the pragma would set a flag. The C<pp_require> opcode would check the flag after compiling the file, and would call C<Perl_die> as before if the file returned a false value and if the flag was set.

If Foo C<require>d any other modules, the flag would be cleared before loading them, and restored again afterwards. (That is, the flag would have file scope.)

=head1 REFERENCES

Perl on-line manuals
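The file-scoped flag described in the MIGRATION section can be modelled in a few lines. The following Python sketch is only a toy model under stated assumptions: files are represented as tuples of (uses the compatibility pragma, files it requires, value it returns), and the Loader class and its names are invented for illustration. It is not how pp_require is implemented.

```python
class Loader:
    """Toy model of 'require' with a file-scoped compatibility flag."""

    def __init__(self, files):
        # files maps a name to (uses_pragma, dependencies, result); hypothetical model
        self.files = files
        self.strict = False  # the flag the pragma would set

    def require(self, name):
        uses_pragma, dependencies, result = self.files[name]
        saved = self.strict
        self.strict = uses_pragma  # cleared on entry unless this file sets it
        try:
            for dep in dependencies:
                self.require(dep)
            # Old Perl 5 behaviour only when the flag is set for *this* file.
            if self.strict and not result:
                raise RuntimeError(f"{name} did not return a true value")
            return result
        finally:
            self.strict = saved  # restore the requiring file's flag


loader = Loader({
    "Old": (True, ["New"], 1),   # translated Perl 5 module, requires a new-style one
    "New": (False, [], 0),       # new-style module that happens to return false
    "Broken": (True, [], 0),     # translated module whose initialization failed
})
print(loader.require("Old"))
```

Loading "Old" succeeds even though "New" returns a false value, because the flag does not leak into the files that "Old" requires.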
RFC 208 (v3) crypt() default salt
This and other RFCs are available on the web at
  http://dev.perl.org/rfc/

=head1 TITLE

crypt() default salt

=head1 VERSION

  Maintainer: Mark Dominus [EMAIL PROTECTED]
  Date: 11 Sep 2000
  Last Modified: 21 Sep 2000
  Mailing List: [EMAIL PROTECTED]
  Number: 208
  Version: 3
  Status: Developing

=head1 ABSTRACT

A frequently-asked question is how to generate an appropriate random salt for password encryption. I propose that Perl generate the salt automatically if the salt argument is omitted in the call to crypt().

=head1 DESCRIPTION

At present, crypt() requires two arguments:

  crypt PLAINTEXT,SALT

It then passes these arguments directly to the C library crypt() function. When encrypting a new password, the programmer is required to generate a salt at random:

  @letters = ('A' .. 'Z', 'a' .. 'z', '0' .. '9', '/', '.');
  $salt = $letters[rand@letters] . $letters[rand@letters];
  $passwd = crypt($passwd, $salt);

This is inconvenient and nonportable. It's also nonobvious: people frequently ask in the newsgroups how to do it.

I propose that if the SALT argument is omitted, Perl should generate an appropriate salt internally and use that.

  $passwd = crypt($passwd);   # Same as above

On systems where the password format is different, Perl can do the appropriate thing.

=head1 IMPLEMENTATION

For the standard DES-based crypt, the implementation is straightforward. Perl already has many functions that take an optional argument, and the C internals of the random-salt generator are well-known.

Details will vary for systems using alternative password hashing schemes. On some systems, no salt need be generated. These can be taken care of with a suitably ifdef'ed section of code if necessary.

If the random number generator has not yet been seeded, Perl should seed it.

Michael Schwern has developed a partial demonstration implementation in pure Perl.
It is available from

  http://www.pobox.com/~schwern/src/RFC-Prototype-0.01.tar.gz

It has been suggested that C<crypt()> should have a private random number generator, to avoid interfering with the sequence of numbers produced by rand(). This would significantly complicate the implementation, and I believe it is probably unnecessary. See the REFERENCES for details.

=head1 MIGRATION

C<crypt()> with only one argument is presently a compile-time error, so there are probably few translation issues.

The meaning of this program will change:

  $" = ', ';
  $code = "crypt(@ARGV)";
  eval $code;
  die $@ if $@;

But I don't think this is anything to worry about---it should fall into the "other 5%" category.

=head1 REFERENCES

perlfunc manpage for discussion of crypt()

crypt(3)

http://dev.perl.org/archive?35:mss:4500:29:lmemkmdbnocclmnnijmc
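For readers who want to see the salt-generation step from the DESCRIPTION in isolation, here is a minimal Python sketch of a two-character salt drawn from the same 64-character alphabet. It is an illustration only, not the proposed internals: the name random_salt is hypothetical, and the sketch swaps Perl's rand() for Python's secrets module, which has its own generator and so does not disturb any other random sequence. No call to crypt() is made.

```python
import secrets
import string

# Same alphabet as the @letters list in the Perl example: A-Z, a-z, 0-9, '/', '.'
SALT_ALPHABET = string.ascii_uppercase + string.ascii_lowercase + string.digits + "/."


def random_salt(length=2):
    """Return a random salt of the given length (2 for traditional DES crypt)."""
    return "".join(secrets.choice(SALT_ALPHABET) for _ in range(length))


print(len(SALT_ALPHABET))
print(random_salt())
```

Systems with other password formats would need a different length and possibly a different alphabet, which is the portability problem the RFC wants Perl to absorb.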
RFC 270 (v1) Replace XS with the C<Inline> module as the standard way to extend Perl.
This and other RFCs are available on the web at
  http://dev.perl.org/rfc/

=head1 TITLE

Replace XS with the C<Inline> module as the standard way to extend Perl.

=head1 VERSION

  Maintainer: Brian Ingerson [EMAIL PROTECTED]
  Date: 21 Sep 2000
  Mailing List: [EMAIL PROTECTED]
  Number: 270
  Version: 1
  Status: Developing

=head1 REPLACES

RFC 61 - Interfaces for linking C objects into perlsubs

=head1 ABSTRACT

Extending Perl with XS is too hard. First, there is a hefty learning curve for even simple extensions. Also, the resulting code gets spread out over several files, making it hard to maintain.

In the spirit of Perl itself, C<Inline.pm> makes extending Perl easy for easy things, and possible for harder things. A Perl programmer can write their first C<Inline> extension in minutes. They can learn more difficult maneuvers as needed. All of the extension code can be in the same file as the Perl script or module.

This one-liner is a complete perl extension:

  perl -e 'print add(2,2);use Inline C=>"int add(int x,int y){return x+y;}"'

The first time you run it, you'll notice a pause (compiling). Subsequent runs are lightning fast, as long as the C component isn't modified.

=head1 DESCRIPTION

C<Inline.pm> is basically a user friendly abstraction over XS. You feed it a snippet of (C) source and it performs the following steps:

  1) Determine if the code snippet has already been compiled. If so, goto 5.
  2) Parse function definitions to determine how code should bind to Perl.
  3) Generate XS glue code.
  4) Build the extension and install it in some known place.
  5) DynaLoader the extension.

If the extension is a user script or one-liner, the extension will get built and installed in a place that the user has access to. C<Inline> chooses a reasonable default. The default can easily be over-ridden.

If the extension is part of a distributed (ie CPAN) module, the code gets compiled during "make test" and permanently installed in the "installsitearch" during "make install".
C<Inline> is intended to replace 80-90% of the current functionality of XS. Although it does not need to be built over XS, doing so makes C<Inline> more robust, helps towards backwards compatibility, and provides an easy "out" if a project grows to exceed C<Inline>'s capabilities.

In perl6, something like XS should still exist, but just as a foundation for C<Inline>. Savvy hackers could defeat C<Inline> and write glue code themselves, but this would not be the standard.

=head1 IMPLEMENTATION

All of this is currently functional in C<Inline> v0.25 (on CPAN now). C<Inline> seems to work on any machine that has access to the same environment that was used to build Perl itself. Success has been achieved on platforms including MSWin32 and most *nixes.

Version 0.25 provides bindings to the following types: C<int>, C<long>, C<double>, C<char *>, C<SV *>. (For anything else the user must pass the argument as a C<SV *> and do their own type conversion.) There is also support for passing and returning lists.

Version 0.30 (not yet released) has no default types. It gets all of its types from XS C<typemap> files. These files are parsed for their types, which are dumped into the grammar to parse C. Since Perl comes with a generic C<typemap> file, this is used as the default. It contains all the types listed above and more.

This allows XS programmers to use their old typemaps when switching to C<Inline>. It also allows other modules (like Event.pm) which have an XS API, to publish that API to C<Inline> seamlessly, using a syntax like "use Inline with => 'Event';"

Version 0.30 will also support the following syntax:

  use Inline;
  print add(2, 4);
  __END__

  =pod

  blah blah blah

  =cut

  __C__
  int add(int x, int y) {
      return x + y;
  }

If so desired, perl6 might be able to simply recognize the C<__C__> marker, and not require the C<use Inline;> at all.
The code would look like:

  print add(2, 4);
  __C__
  int add(int x, int y) {
      return x + y;
  }
  __END__

  =pod

  blah blah blah

  =cut

=head1 REFERENCES

RFC 61: Interfaces for linking C objects into perlsubs

Inline.pm documentation and tutorial:

  http://search.cpan.org/doc/INGY/Inline-0.25/lib/Inline.pod
  http://search.cpan.org/doc/INGY/Inline-0.25/lib/Inline/Config.pod
  http://search.cpan.org/doc/INGY/Inline-0.25/lib/Inline/C/Tutorial.pod
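The five steps listed in the DESCRIPTION amount to a build cache: compile a snippet once, then reuse the result until the source changes. A minimal Python sketch of that control flow follows. It is not the module's code. The text does not say how "already compiled" is detected, so keying on a digest of the source is an assumption here, the build callable is a stand-in for steps 2 to 4, and the name ExtensionCache is hypothetical.

```python
import hashlib


class ExtensionCache:
    """Build each distinct source snippet once and reuse the result afterwards."""

    def __init__(self, build):
        self._build = build   # stand-in for: parse, generate glue, compile, install
        self._loaded = {}     # digest of source -> built extension
        self.builds = 0       # how many real builds have happened

    def load(self, source):
        key = hashlib.sha256(source.encode("utf-8")).hexdigest()
        if key not in self._loaded:      # step 1: has this snippet been compiled?
            self._loaded[key] = self._build(source)
            self.builds += 1
        return self._loaded[key]         # step 5: hand back the loaded extension


cache = ExtensionCache(build=lambda src: f"extension built from {len(src)} bytes")
snippet = "int add(int x,int y){return x+y;}"
cache.load(snippet)
cache.load(snippet)   # second call reuses the first build
print(cache.builds)
```

Any edit to the snippet changes the digest and triggers a fresh build, which matches the remark that later runs are fast only while the C component is unmodified.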
RFC 271 (v1) Subroutines : Pre- and post- handlers for subroutines
This and other RFCs are available on the web at
  http://dev.perl.org/rfc/

=head1 TITLE

Subroutines : Pre- and post- handlers for subroutines

=head1 VERSION

  Maintainer: Damian Conway [EMAIL PROTECTED]
  Date: 21 September 2000
  Mailing List: [EMAIL PROTECTED]
  Number: 271
  Version: 1
  Status: Developing

=head1 ABSTRACT

In response to the desiderata set out in RFC 194 and in the Class::Contract module, this RFC proposes a generic "handler" mechanism that can install behaviours around (before or after) a subroutine invocation.

=head1 DESCRIPTION

=head2 Overall semantics

It is proposed to provide a mechanism which allows two sequences of "handlers" to be associated with a specific subroutine or built-in function (hereafter referred to as the I<primary>).

One sequence of handlers (the I<prefix> sequence) would be called whenever the primary is invoked, but I<before> the body of the primary is executed. Each handler would itself be a subroutine, and each would be called with the same argument list as the primary it prefixes, plus an extra argument (specifically, $_[-1]) representing a slot for the eventual return value of the primary. This last argument would normally have the value C<undef>.

The second sequence of handlers would be called I<after> the body of the primary has executed, but I<before> the return value is returned to the invoking scope. Once again, each handler would itself be a subroutine, and each would be called with the same argument list as the primary it postfixes, plus the return value(s) argument ($_[-1]). For a postfix handler, this extra argument would hold a reference to an array containing the value(s) actually returned from its primary.

Normally prefix handlers would be prepended to the appropriate prefix sequence, whilst postfix handlers would be appended to the appropriate postfix sequence, thereby preserving the symmetry of pre- and post-fix handlers.

The mechanisms for setting up such sequences are described in L<General Syntax of Handler Installers>.
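The calling order described under "Overall semantics" can be illustrated with a small wrapper. The sketch below is in Python, since the proposed pre/post syntax cannot be run, and it covers only what this section states: prefix handlers are prepended, postfix handlers are appended, and every handler receives the primary's arguments plus one extra slot (None before the call, a list holding the returned value afterwards). The class name Handled and its methods are hypothetical.

```python
class Handled:
    """Wrap a primary callable with prefix and postfix handler sequences."""

    def __init__(self, primary):
        self.primary = primary
        self.prefix = []
        self.postfix = []

    def pre(self, handler):
        self.prefix.insert(0, handler)  # newest prefix handler runs first
        return handler

    def post(self, handler):
        self.postfix.append(handler)    # newest postfix handler runs last
        return handler

    def __call__(self, *args):
        for handler in self.prefix:
            handler(*args, None)        # return slot is empty before the call
        value = self.primary(*args)
        for handler in self.postfix:
            handler(*args, [value])     # slot now holds the returned value(s)
        return value


log = []


@Handled
def add(x, y):
    log.append("primary")
    return x + y


add.pre(lambda x, y, slot: log.append(("pre-1", slot)))
add.pre(lambda x, y, slot: log.append(("pre-2", slot)))
add.post(lambda x, y, slot: log.append(("post", slot)))

print(add(2, 3))
print(log)
```

The log shows the later-installed prefix handler running first, then the primary, then the postfix handler with the result in its slot.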
=head2 Prefix Handler Semantics

A prefix handler may do anything that any other subroutine may do. A typical action might be to trace or log the invocation of the primary (the C<pre> syntax is used to set up a handler, and is explained in L<General Syntax of Handler Installers>).

  package Foo;

  # set up a tracing prefix handler for the subroutine Foo::bar...
  pre bar {
      local $" = ',';
      my @caller = caller;
      print "Called foo with args (@_[0..$#_-1])\n",
            "from @caller[1,2]\n",
            "in contexts: ", join(", ",want);
  }

Note that this (correctly) implies that a handler receives the same information from C<caller> and C<want> (RFC ???) as its primary would.

Another common usage would be to acquire resources that the primary will use:

  pre read_file {
      flock $_[0], LOCK_EX;
  }

(For the obvious complement of this usage, see L<Postfix Handler Semantics>).

Note that, in all cases, the return value of the handler is ignored.

=head3 Special semantics: changes to arguments

Each handler receives the same argument list: the one that the primary was called with. If a handler changes one of the original arguments through the aliases in its @_ array, those changes are passed on to subsequent handlers and to the primary itself. For example:

  pre tax_payable_on {
      $_[0] -= 20.00;   # routinely underquote sales price
  }

  sub tax_payable_on {  # now sees prices $20 less than
                        # specified argument
      printf("Tax: %.2lf", $_[0] * 0.1);
  }

  tax_payable_on(99.95);   # prints 8.00
  tax_payable_on(29.95);   # prints 0.99
  tax_payable_on( 9.95);   # prints -1.01 (a profit!)

=head3 Special semantics: changes to return value slot

If a handler changes the value of the (normally C<undef>) return value slot (i.e. $_[-1]), then the remaining handlers (prefix and postfix) are still called, but the primary itself is I<not> called. Instead, the defined value in $_[-1] is used as the return value for the primary.
This allows techniques such as memoization to be implemented:

  my %sin_cache = (
      0               =>  0,
      1.5707963267949 =>  1,
      3.1415926535898 =>  0,
      4.7123889803847 => -1,
      6.2831853071796 =>  0,
  );

  pre CORE::sin {
      $_[-1] = $sin_cache{$_[0]};   # short-circuit if cache
                                    # value defined
  }

One might also use this feature to short-circuit upon failure to acquire resources:

  pre read_file {
      flock($_[0], LOCK_EX|LOCK_NB)
          or $_[-1] = "";   # Can't lock file so...
RFC 39 (v4) Perl should have a print operator
This and other RFCs are available on the web at
  http://dev.perl.org/rfc/

=head1 TITLE

Perl should have a print operator

=head1 VERSION

  Maintainer: Jon Ericson [EMAIL PROTECTED]
  Date: 5 Aug 2000
  Last Modified: 20 Sept 2000
  Mailing List: [EMAIL PROTECTED]
  Number: 39
  Version: 4
  Status: Retracted

=head1 ABSTRACT

Perl supplies an operator for line input - angle brackets. There is no analogous operator for output. I propose "inverse angle brackets":

  >"Print this line.\n"<;

=head1 NOTES ON RETRACTION

It seems that I am alone in loving the proposed syntax. It's short, it works the way I want, it fits into my brain. As a matter of fact, I've found myself trying to use it in code that I am currently working on. But this RFC suffers a fatal flaw - perl already has a perfectly good print operator. Perl is a language designed to be spoken by people, so it should be comfortable to people (even if they don't think exactly like me :).

=head1 DESCRIPTION

=head2 Easy things should be easy

Output is already easy in Perl, but it could be easier. For one thing, it doesn't nest well in statements:

  while (<>){
      print;
      push @line, $_;
  };

This could be written:

  push @line, >$_< while <>;

Printing to STDOUT and some other file ala tee(1):

  print $fh >"This also goes to stdout.\n"<;

Another problem with print is that the ()s are optional. perlop points out the following traps:

  print $foo, exit;
  print ($foo & 255) + 1, "\n";

They could be correctly written as:

  >$foo<, exit;
  >($foo & 255) + 1, "\n"<;

=head2 Ugly as a virtue

A representative comment of this RFC is "Ick!" -- Jonathan Scott Duff [EMAIL PROTECTED]

This RFC doesn't mind (nor does its maintainer). The print operator should be quick and dirty - used as an afterthought or side-effect. When you are looking for it, the print operator should stick out. When you are looking for something else (and have gotten used to the syntax), it should blend into the sea of punctuation.

Do you remember when you first saw <FH>, or i++ (in C)?
Compact syntax with side-effects, such as the print operator, should be ugly. This operator _will_ be misused, just as `STRING` (qx/STRING/) is misused. It will cause confusion just as the conditional operator (?:) causes confusion. It will be as jarring as =~ is to those who have never seen it. Perl is operator rich whether you like it or not.

=head2 print will still be there

Not all output is suited for inverse angle brackets. Most output will still go through print. Prints to files should use 'print FH LIST' so that the return value can be checked (and the filehandle specified). Long documents should be printed with the expanded form on their own lines so that they are emphasised. 'print "Hello world\n";' should remain the canonical 'first Perl script'. We still need print for practical and stylistic reasons.

=head1 IMPLEMENTATION

Let:

  >LIST<

print LIST to the default output filehandle (normally STDOUT) and return LIST. It should have the same precedence as other list operators.

=head2 Migration from Perl 5

Inverse angle brackets are currently a syntax error, so no translation will be needed.

=head1 Changes

=over

=item v4

Retracted

=item v3

Added "Developing" status

Operator now returns its arguments

Changed DESCRIPTION to respond to concerns voiced about previous versions

=item v2

Changed title

Added other symbols section

Added migration section

Added RFC 51 reference

=back

=head1 REFERENCES

RFC 2: Request For New Pragma: Implicit

RFC 34: Angle brackets should not be used for file globbing

RFC 51: Angle brackets should accept filenames and lists

perlop

perlfunc/print

perldebug/"Debugger Commands"/{p,x}
RFC 200 (v2) Objects: Revamp tie to support extensibility (Massive tie changes)
KED}}) {
          croak "Fatal: Attempt to clear hash with locked keys!";
      }
      undef $self->{DATA};
   }

   # Want to override what each() and keys() do
   # Mostly stolen from Camel-3 p. 383
   sub FIRSTKEY {
      my $self = self;
      my $temp = keys %{$self->{DATA}};
      return scalar each %{$self->{DATA}};
   }

   sub NEXTKEY {
      my $self = self;
      return scalar each %{$self->{DATA}};
   }

   # Override addition just for demonstration purposes
   sub ADD {
      my $self = self;
      $self->{DATA}->{$_[0]} += (rand * $_[1]);
   }

   # Now add any Perl or custom functions that we want these
   # objects to be able to handle
   sub lock {
      my $self = self;
      $self->{LOCKED}->{$_[0]} = 1;
   }

   sub unlock {
      my $self = self;
      carp "Warning: Key $_[0] already unlocked"
         unless $self->{LOCKED}->{$_[0]};
      delete $self->{LOCKED}->{$_[0]};
   }

   sub unlock_all {
      my $self = self;
      carp "Notice: All values unlocked" unless $self->{LOCKED};
      undef $self->{LOCKED};
   }

   # Warn if we have locked values still
   sub DESTROY {
      my $self = self;
      if (keys %{$self->{LOCKED}}) {
          carp "Warning: Destroying transaction with locked keys!";
      }
      undef $self->{LOCKED};
      undef $self->{DATA};
   }

   # Use our Transaction class
   package main;

   use CGI;
   my $cgi = new CGI;

   tie Transaction %trans;   # Transaction->TIEHASH (thru UNIVERSAL::tie)

   # Generate our session id
   # Yes I know this is massively insecure ;-)
   srand;
   $trans{session} = rand;

   # All of these call $obj->STORE($var)
   $trans{name}   = $cgi->param('name');
   $trans{email}  = $cgi->param('email');
   $trans{cc}     = $cgi->param('cc');
   $trans{amount} = $cgi->param('amount');

   # Lock our amount while we're charging the card...
   lock $trans{cc};       # $obj->lock('cc');
   lock $trans{amount};   # $obj->lock('amount');

   for ($try = 0; $try < 3; $try++) {
      # Attempt to charge them
      next unless charge_card($trans{cc}, $trans{amount});
      $trans{chargedate} = localtime;
   }
   unlock $trans{cc};     # $obj->unlock('cc');

   # Check if we were successful
   die "Could not charge card $trans{cc}" unless $trans{chargedate};

   # Increment our session id
   # ++$trans{session} calls $obj->STORE($obj->ADD('session', 1))
   $cgi->param('session') = ++$trans{session};

   # Kill our transaction
   unlock_all %trans;     # $obj->unlock_all;

Note how we are easily able to add three new methods, C<lock>, C<unlock>, and C<unlock_all>, which are directly translated for us, meaning we don't have to mix OO and tied variable calls. This provides true object transparency. Note also how our overloaded C<ADD> operator is used to increment our session number as well, all transparently to the user.

=head1 IMPLEMENTATION

Conceptually, implementation is straightforward, but quite different from tie's current form:

   1. Drop C<tie> builtin and replace with C<UNIVERSAL::tie>.

   2. Drop hardwired internal function translation and instead
      add the C<use tie> pragma to overload arbitrary functions.

   3. Add C<UNTIE> method called by C<untie>. Looking at C<pp_sys.c>
      it appears this may be in 5.7 already.

   4. Drop C<TIEHANDLE> method.

I'm in the process of coming up with a "real" implementation section, but I'm so short on time I doubt this will happen by the time this RFC freezes.

=head1 MIGRATION

To keep complete backwards compatibility, the p52p6 translator could simply add a line like this:

   use tie push => \&PUSH, pop => \&POP, shift => \&SHIFT ...

which would include all of the Perl 5 methods for an array. Similar lines could be added for hashes. No translation would have to occur for scalars, since data methods remain automatically invoked per C<RFC 159>.

Many of the changes in this RFC build on and add power to C<tie>, so do not require translation because they are new.
=head1 NOTES

[1] http://www.mail-archive.com/perl6-language@perl.org/msg02087.html

[2] Camel-3 p. 395 has an excellent description of this problem.

=head1 REFERENCES

RFC 159: True Polymorphic Objects

RFC 152: Replace invocant in @_ with self() builtin

RFC 189: Objects : Hierarchical calls to initializers and destructors

RFC 130: Transaction-enabled variables for Perl6

Camel-3 Chapter on C<tie>, p363-398

Thanks to Nathan Torkington for his input and support
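For comparison with the Transaction example above, here is a minimal sketch of key locking written against the existing Perl 5 tie interface, not the syntax proposed in this RFC. The class name C<LockedHash> and the methods C<lock_key> and C<unlock_key> are hypothetical names chosen for illustration. Because Perl 5 does not translate extra functions into method calls on the tied object, the locking methods have to be called on the object returned by C<tie>.

```perl
use strict;
use warnings;

package LockedHash;    # hypothetical class name, not part of the RFC
use Carp;

# Constructor called by tie(): keep the data and the set of locked keys.
sub TIEHASH {
    my ($class) = @_;
    return bless { DATA => {}, LOCKED => {} }, $class;
}

sub FETCH  { my ($self, $key) = @_; return $self->{DATA}{$key} }
sub EXISTS { my ($self, $key) = @_; return exists $self->{DATA}{$key} }

# Refuse to overwrite a key while it is locked.
sub STORE {
    my ($self, $key, $value) = @_;
    croak "Fatal: Attempt to store to locked key '$key'"
        if $self->{LOCKED}{$key};
    return $self->{DATA}{$key} = $value;
}

# Refuse to delete a key while it is locked.
sub DELETE {
    my ($self, $key) = @_;
    croak "Fatal: Attempt to delete locked key '$key'"
        if $self->{LOCKED}{$key};
    return delete $self->{DATA}{$key};
}

sub FIRSTKEY {
    my ($self) = @_;
    my $reset = keys %{ $self->{DATA} };    # resets the each() iterator
    return scalar each %{ $self->{DATA} };
}

sub NEXTKEY { my ($self) = @_; return scalar each %{ $self->{DATA} } }

# Extra methods: in Perl 5 these must be called on the tied object itself.
sub lock_key   { my ($self, $key) = @_; $self->{LOCKED}{$key} = 1; return }
sub unlock_key { my ($self, $key) = @_; delete $self->{LOCKED}{$key}; return }

package main;

my $obj = tie my %trans, 'LockedHash';
$trans{amount} = 42;

$obj->lock_key('amount');
my $stored = eval { $trans{amount} = 99; 1 };    # croaks inside STORE
print "store refused: $@" unless $stored;

$obj->unlock_key('amount');
$trans{amount} = 99;                             # succeeds once unlocked
print "amount is now $trans{amount}\n";
```

The need to keep C<$obj> around and call C<< $obj->lock_key(...) >> explicitly is the mixing of OO and tied variable calls that the RFC aims to remove.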
RFC 110 (v6) counting matches
This and other RFCs are available on the web at
  http://dev.perl.org/rfc/

=head1 TITLE

counting matches

=head1 VERSION

  Maintainer: Richard Proctor [EMAIL PROTECTED]
  Date: 16 Aug 2000
  Last Modified: 20 Sep 2000
  Mailing List: [EMAIL PROTECTED]
  Number: 110
  Version: 6
  Status: Frozen

=head1 ABSTRACT

Provide a simple way of giving a count of matches of a pattern.

=head1 DESCRIPTION

Have you ever wanted to count the number of matches of a pattern? s///g
returns the number of matches it finds. m//g just returns 1 for
matching. Counts can be made using s//$&/g, but this is wasteful, or by
putting a counting loop around m//g. All of this seems rather messy.

TomC (and a couple of others) have said that it can also be done as:

    $count = () = $string =~ /pattern/g;

However, many people do not like this construct. Here are a couple of
quotes:

jhi: Which I find cute as a demonstration of the Perl's context concept,
but ugly as hell from usability viewpoint.

Bart Lateur: '()=' is not perfect. It is also butt ugly. It is a "dirty
hack".

This construct is also likely to be inefficient, as perl will have to
build up a list of all the matches, store them somewhere, count them,
then throw them away.

Therefore I would like a way of counting matches.

=head2 Proposal

m//gt (or m//t see below) would be defined to do the match, and return
the count of matches. This leaves all existing uses consistent and
unaffected. /t is suggested for "counT", as /c is already taken.

Relationship of m//t and m//g - there are three possibilities:

My original: m//gt, where /t adds counting to a group match (/t without
/g would just return 0 or 1). However \G loses its meaning.

The Alternative By Uri: m//t and m//g are mutually exclusive and m//gt
should be regarded as an error.

Hugo: I like this too. I'd suggest /t should mean a) return a scalar of
the number of matches and b) don't set any special variables.
Then /t without /g would return 0 or 1, but be faster since no extra
information need be captured (except internally for (.)\1 type matching -
compile time checks could determine if these are needed, though (?{..})
and (??{..}) patterns would require disabling of that optimisation).
/tg would give a scalar count of the total number of matches. \G would
retain its meaning.

I think Hugo's wording about the relationship makes the best sense, and
this is the suggested way forward.

=head1 CHANGES

RFC110 V1 - Original posting to perl6-language

RFC110 V2 - Reposted to perl6-language-regex

RFC110 V3 - Added Uri's alternative m//t

RFC110 V4 - Added notes about $count = () = $string =~ /pattern/g

RFC110 V5 - Added Hugo's wording about /g and /t relationship, suggested
this is the way forward.

RFC110 V6 - Frozen

=head1 IMPLEMENTATION

Hugo: Implementation should be fairly straightforward, though ensuring
that optimisations occurred precisely when they are safe would probably
involve a few bug-chasing cycles.

=head1 REFERENCES

I brought this up on p5p a couple of years ago, but it was lost in the
noise...
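The existing Perl 5 ways of counting matches that the DESCRIPTION mentions can be sketched side by side as follows. The sample string, the pattern and the variable names are assumptions made for illustration. The proposed /t modifier is not shown, since it does not exist in Perl 5.

```perl
use strict;
use warnings;

my $string  = "the cat sat on the mat with the hat";   # hypothetical sample data
my $pattern = qr/the/;

# 1. The "()=" idiom: the match runs in list context, the list is assigned
#    to an empty list, and that assignment in scalar context yields the
#    number of elements matched.
my $count_list = () = $string =~ /$pattern/g;

# 2. An explicit counting loop around a scalar-context m//g.
my $count_loop = 0;
$count_loop++ while $string =~ /$pattern/g;

# 3. s///g returns the number of substitutions made. Replacing each match
#    with itself ($&) counts matches, at the cost of rewriting the string.
my $copy        = $string;
my $count_subst = ($copy =~ s/$pattern/$&/g);

print "list idiom:   $count_list\n";
print "loop:         $count_loop\n";
print "substitution: $count_subst\n";
```

All three approaches agree on the count. They differ only in how much incidental work perl does to get it, which is the cost the RFC wants to avoid.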
RFC 121 (v2) linkable output mode
This and other RFCs are available on the web at
  http://dev.perl.org/rfc/

=head1 TITLE

linkable output mode

=head1 VERSION

  Maintainer: David Nicol [EMAIL PROTECTED]
  Date: 17 Aug 2000
  Last Modified: 20 Sep 2000
  Mailing List: [EMAIL PROTECTED]
  Number: 121
  Version: 2
  Status: Frozen

=head1 changes

Addition of sentences concerning "perl without perl"

=head1 ABSTRACT

Perl5 offers a clunky interface for those who wish to call perl
subroutines from within C programs. Herein is suggested a vastly
simplified application programmer's interface: a C<-o> command line
switch identical to that used in C compilers to produce a linkable
object file.

=head1 DESCRIPTION

Two command line switches, -o and -oh, are added to perl6's invocation
syntax.

Perl invoked with the -o switch does not run its program, but rather
pukes out an "object file" the same way gcc would if given a file full
of C code.

Perl invoked with the -oh switch does not run its program, but rather
pukes out a "header file" suitable for inclusion into a C program,
containing the correct linkage definitions for the object file created
by the -o switch.

The resulting object file must be linked with the perl library to work.

The point is, the perl internals are effectively hidden from the
programmer who wishes to use a feature available in perl from within a
C program.

Also, it becomes easier to generate a "stand-alone" deliverable which
will work without a full Perl installation, by linking the output of
C<perl -o> into a simple main() and delivering the resulting linked
binary along with a perl shared object library. Or by just doing
something like this:

    perl -o deliverme.o deliverme.pl
    ld deliverme.o -o deliverme

=head1 IMPLEMENTATION

Given a nonprototyped subroutine, C<perl6 -o> will generate suitable
wrapper code for all subroutines in the input file, as described in the
L<perlcall> and L<perlembed> perldoc pages, and then pass this code (via
a temporary file) to the same C compiler that was used to build perl.
Given a perl6 subroutine with a fully described prototype, which amounts
to a C struct structure, that structure (with its names if any) can be
used as the parameter types of the resulting function call. A restricted
return type described in terms of basic C data types can function as a C
function return type.

Perl functions that are already using restricted parameter lists and
restricted return types are effectively doing their own type
conversions, except between SV{STRING} and char*, but allowing C access
to the SV{STRING} data type and functions can't be anything but good.

Furthermore, the porting team will need to get very chummy with the
linking system on the platform.

=head1 REFERENCES

my imagination

perldoc perlembed

perldoc perlcall
RFC 168 (v3) Built-in functions should be functions
This and other RFCs are available on the web at
  http://dev.perl.org/rfc/

=head1 TITLE

Built-in functions should be functions

=head1 VERSION

  Maintainer: Johan Vromans [EMAIL PROTECTED]
  Date: 27 Aug 2000
  Last Modified: 20 Sep 2000
  Mailing List: [EMAIL PROTECTED]
  Number: 168
  Version: 3
  Status: Frozen

=head1 NOTE

See CHANGES for wrap-up info.

=head1 ABSTRACT

RFC 26 proposes to eliminate the distinction between functions and
operators from a language perspective.

This RFC proposes that all Perl built-in functions should be usable in
all ways normal functions can be used. It is part of a big conspiracy to
reduce the number of cases with exceptional behaviour in Perl.

=head1 DESCRIPTION

Named operators, like C<abs>, can be called like functions, in which
case they behave like functions. However, that's where the similarity
ends. You cannot override most builtins, and cannot take a reference to
them.

There is no reason why the built-ins should be treated differently. A
famous Perl saying reads "if it looks like a function, it B<is> a
function." So be it.

In particular, it is desired that every built-in

=over 4

=item *

can be overridden by a user defined subroutine;

=item *

can have a reference taken;

=item *

has a useful prototype.

=back

=head2 Overriding

The principle of least surprise dictates that

    sub I<foo> { return 10 }
    print I<foo>();

should call I<foo>() and print "10" for all I<foo>. Currently, most
built-ins are excluded from this. For example:

    sub system { return 10 }
    print system();

Instead of calling the user defined system(), the built-in is used. The
second line may give a warning, but only if warnings are enabled:

    Ambiguous call resolved as CORE::system(), qualify as such or use &

=head2 References

You can call a built-in, but not take a reference.

    $a = \&system;
    print $a->(-1)

This gives an error:

    Undefined subroutine &main::system called

This should return a reference to the built-in instead.
Since C<I<foo>> implicitly refers to the current package, it would be
acceptable to require

    $a = \&CORE::system;

Note that this currently (5.7.0 DEVEL6806) results in the error:

    Undefined subroutine &CORE::system called

which is surprising, if not misleading.

=head2 Prototypes

Currently, several built-ins do not provide prototype information.

    prototype("CORE::abs") == ;$
    prototype("CORE::shift") == undef

This must be fixed. One might even call this a bug, although the current
prototype mechanism is not powerful enough to cope with all built-ins.

=head1 CHANGES

=head2 Version 3

Added CHANGES.

=head2 Version 3, 20 September

Frozen after some discussions on the mailing list. People seem to like
the idea, but worry about the prototypes. Other RFCs will deal with
that.

=head2 Version 2, 28 Aug 2000

Add Status indicator.

=head1 REFERENCES

RFC 26: Named operators versus functions

Tom Christiansen in 12231.967154045@chthon (perl6-internals, Aug 24,
2000).
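Where a reference to a built-in is wanted in Perl 5, one common workaround is to wrap the built-in in an anonymous subroutine, which can then be stored and passed like any other code reference. The sketch below assumes a hypothetical dispatch table C<%builtin> and helper C<call_builtin>; neither is part of this RFC or of Perl itself.

```perl
use strict;
use warnings;

# Hypothetical dispatch table: each built-in is wrapped in an anonymous
# sub so that it can be handled as an ordinary code reference.
my %builtin = (
    abs    => sub { abs $_[0] },
    length => sub { length $_[0] },
    uc     => sub { uc $_[0] },
);

# Hypothetical helper: look up a wrapped built-in by name and call it
# through its reference.
sub call_builtin {
    my ($name, @args) = @_;
    my $code = $builtin{$name} or die "No wrapper for '$name'";
    return $code->(@args);
}

print call_builtin('abs', -1),      "\n";
print call_builtin('length', 'RFC'), "\n";
print call_builtin('uc', 'perl'),   "\n";
```

Each wrapper has to be written by hand and fixes the argument handling in advance, which is the kind of special-casing this RFC argues should be unnecessary.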
RFC 259 (v2) Builtins : Make use of hashref context for garrulous builtins
This and other RFCs are available on the web at
  http://dev.perl.org/rfc/

=head1 TITLE

Builtins : Make use of hashref context for garrulous builtins

=head1 VERSION

  Maintainer: Damian Conway [EMAIL PROTECTED]
  Date: 19 Sep 2000
  Last Updated: 20 Sep 2000
  Mailing List: [EMAIL PROTECTED]
  Number: 259
  Version: 2
  Status: Developing

=head1 ABSTRACT

This RFC proposes that the builtin functions that return a large number
of values in an array context should also detect hashref contexts (see
RFC 21) and return their data in a kinder, gentler format.

=head1 DESCRIPTION

It's hard to remember the sequence of values that the following
builtins return:

    stat/lstat
    caller
    localtime/gmtime
    get*

and though it's easy to look them up, it's a pain to look them up Every
Single Time.

Moreover, code like this is far from self-documenting:

    if ((stat $filename)[7] < 1000) {...}

    if ((lstat $filename)[10] > time()-1000) {...}

    if ((localtime(time))[3] > 5) {...}

    if ($usage > (getpwent)[4]) {...}

    @host{qw(name aliases addrtype length addrs)} = gethostbyname $name;

    warn "Problem at " . join(":", @{[caller(0)]}[3,1,2]) . "\n";

It is proposed that, when one of these subroutines is called in the new
HASHREF context (RFC 21), it should return a reference to a hash of
values, with standardized keys. For example:

    if (stat($filename)->{size} < 1000) {...}

    if (lstat($filename)->{ctime} > time()-1000) {...}

    if (localtime(time)->{mday} > 5) {...}

    if ($usage > getpwent()->{quota}) {...}

    %host = %{gethostbyname($name)};

    warn "Problem at " . join(":", @{caller(0)}{qw(sub file line)}) . "\n";

=head2 Standardized keys

The standardized keys for these functions would be:

=over 4

=item C<stat>/C<lstat>

    'dev'       Device number of filesystem
    'ino'       Inode number
    'mode'      File mode (type and permissions)
    'nlink'     Number of (hard) links to the file
    'uid'       Numeric user ID of file's owner
    'gid'       Numeric group ID of file's owner
    'rdev'      The device identifier (special files only)
    'size'      Total size of file, in bytes
    'atime'     Last access time in seconds since the epoch
    'mtime'     Last modify time in seconds since the epoch
    'ctime'     Inode change time in seconds since the epoch
    'blksize'   Preferred block size for file system I/O
    'blocks'    Actual number of blocks allocated

=item C<localtime>/C<gmtime>

    'sec'       Second
    'min'       Minute
    'hour'      Hour
    'mon'       Month
    'year'      Year
    'mday'      Day of the month
    'wday'      Day of the week
    'yday'      Day of the year
    'isdst'     Is daylight savings time in effect (localtime only)

=item C<caller>

    'package'   Name of the package from which sub was called
    'file'      Name of the file from which sub was called
    'line'      Line in the file from which sub was called
    'sub'       Name by which sub was called
    'args'      Was sub called with args?
    'want'      Hash of values returned by want()
    'eval'      Text of EXPR within eval EXPR
    'req'       Was sub called from a C<require> (or C<use>)?
    'hints'     Pragmatic hints with which sub was compiled
    'bitmask'   Bitmask with which sub was compiled

=item C<getpw*>

    'name'      Username
    'passwd'    Crypted password
    'uid'       User ID
    'gid'       Group ID
    'quota'     Disk quota
    'comment'   Administrative comments
    'gcos'      User information
    'dir'       Home directory
    'shell'     Native shell
    'expire'    Expiry date of account or password

=item C<getgr*>

    'name'      Group name
    'passwd'    Group password
    'gid'       Group id
    'members'   Group members

=item C<gethost*>

    'name'      Official host name
    'aliases'   Other host names
    'addrtype'  Host address type
    'length'    Length of address
    'addrs'     Anonymous array of raw addresses in 'C4' format

=item C<getnet*>

    'name'      Official name of network
    'aliases'   Other names