Re: reshape() (was Re: Fw: Wrapup time)
Nathan Wiger wrote:

Jeremy Howard wrote:

RFC 203 defines a :bounds attribute that defines the maximum index of each dimension of an array. RFC 206 provides the syntax @#array which returns these maximum indexes. For consistency, the arguments to reshape() should be the maximum index of each dimension. A maximum index of '0' would mean that that dimension is 1 element wide. Therefore '0' cannot be special in reshape(). Therefore we should use '-1'.

I agree with Christian, if you're going to use bounds(), this should be equal to the number of elements, NOT the number of the last element. So you would say "3" for 3 elements, even though they're numbered 0..2. This is the way other similar Perl ops work already:

    $size = @a;     # 3
    $last = $#a;    # 2

OK, I'm convinced. I'll change :bounds. I agree with Christian that it should be renamed :shape in that case.
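The count-versus-last-index distinction above can be sketched in Perl 5 for a nested (rectangular) array. This is only an illustration: shape_of() and max_indexes_of() are hypothetical helper names, not part of RFC 203 or RFC 206, and the sketch assumes every row has the same length.

```perl
use strict;
use warnings;

# Element counts per dimension of a rectangular nested array
# (the ":shape" meaning: 3 elements are reported as 3).
sub shape_of {
    my ($array) = @_;
    my @shape;
    while (ref $array eq 'ARRAY') {
        push @shape, scalar @$array;
        last unless @$array;          # empty dimension: nothing to descend into
        $array = $array->[0];         # assumes all rows have the same length
    }
    return @shape;
}

# Maximum index per dimension (the original ":bounds" meaning).
sub max_indexes_of {
    return map { $_ - 1 } shape_of(@_);
}

my $grid = [ [1, 2, 3], [4, 5, 6] ];            # 2 rows, 3 columns
print join(',', shape_of($grid)), "\n";         # 2,3
print join(',', max_indexes_of($grid)), "\n";   # 1,2
```

With counts, a dimension of one element is written as 1 and 0 remains free for other uses, which is the point Nathan makes above.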
RFC 225 (v1) Data: Superpositions
This and other RFCs are available on the web at
http://dev.perl.org/rfc/

=head1 TITLE

Data: Superpositions

=head1 VERSION

  Maintainer: Damian Conway [EMAIL PROTECTED]
  Date: 14 September 2000
  Mailing List: [EMAIL PROTECTED]
  Number: 225
  Version: 1
  Status: Developing

=head1 ABSTRACT

This RFC (seriously) proposes Perl 6 provide C<any> and C<all> operators, and, thereby, conjunctive and disjunctive superpositional types.

=head1 DESCRIPTION

The advantages and possibilities of superpositional programming were demonstrated in well-received presentations at both YAPC'19100 and TPC 4.0.

It is proposed that the C<any>, C<all>, and C<eigenstates> operators proposed in those talks be added to Perl 6.

Adding them to the core is suggested because the use of superpositions changes the nature of subroutine and operator invocations that have superpositions as arguments. This is currently impossible to reproduce in a module.

Furthermore, the fundamental utility of being able to write:

    if (any(@value) < 10) { ... }

or:

    die unless all(@tests)->($data);

ought to be available to all Perl users.

Inclusion in the core would also allow the current module-based pure Perl implementation to be greatly optimized (perhaps even parallelized on suitable SIMD or other multiprocessing platforms).

A paper proposing the full semantics of superpositions (including their effect when used as subroutine arguments and operator operands) will soon be available from:

    http://www.csse.monash.edu.au/~damian/papers/#Superpositions

=head1 MIGRATION ISSUES

The C<any> and C<all> functions may collide with existing user-defined or module-exported subroutine names.

=head1 IMPLEMENTATION

See the Quantum::Superpositions module.

=head1 REFERENCES

[1] Bohr, N., On the Constitution of Atoms and Molecules, Philosophical Magazine, s.6, v.24, pp.1-25, 1913.

[2] Einstein, A., Über einen die Erzeugung und Verwandlung des Lichtes betreffenden heuristischen Gesichtspunkt ("On a Heuristic Viewpoint Concerning the Production and Transformation of Light"), Annalen der Physik, v.17, pp.132-148, 1905.

[3] Lewis, G.N., The Conservation of Photons, Nature, v.118(2), pp.874-875, 1926.

[4] Monroe, C., Meekhof, D.M., King, B.E., Itano, W.M. & Wineland, D.J., Demonstration of a Fundamental Quantum Logic Gate, Phys. Rev. Lett. v.75, pp.4714-4717, 1995.

[7] Cirac, J.I. & Zoller, P., Quantum Computations with Cold Trapped Ions, Phys. Rev. Lett. v.74, pp.4091-4096, 1995.

[8] Gershenfeld, N. & Chuang, I.L., Bulk Spin-resonance Quantum Computation, Science v.275, pp.350-356, 1997.

[9] Cory, D.G., Fahmy, A.F. & Havel, T.F., Ensemble Quantum Computing by NMR Spectroscopy, Proc. Natl Acad. Sci. USA 94, pp.1634-1639, 1997.

[10] Deutsch, D., Quantum Theory, the Church-Turing Principle and the Universal Quantum Computer, Proc. R. Soc. Lond., v.A400, pp.97-117, 1985.

[11] Deutsch, D. & Jozsa, R., Rapid Solution of Problems by Quantum Computation, Proc. R. Soc. Lond., v.A439, pp.553-558, 1992.

[12] Shor, P., Algorithms for Quantum Computation: Discrete Logarithms and Factoring, Proc. 35th Symp. on Found'ns of Computer Science, pp.124-134, 1994.

[13] Grover, L.K., A Fast Quantum Mechanical Algorithm for Database Search, Proc. 28'th ACM Symp. on the Theory of Computing, pp.212-219, 1996.

[14] Wallace, J., Quantum Computer Simulators - A Review, Technical Report 387, School of Engineering and Computer Science, University of Exeter, June 1999.
Re: RFC 225 (v1) Data: Superpositions
Perl6 RFC Librarian (aka Damian Conway) wrote:

This RFC (seriously) proposes Perl 6 provide C<any> and C<all> operators, and, thereby, conjunctive and disjunctive superpositional types.

Great to see this RFC'd--this will make lots of data crunching code _way_ easier.

Now, I haven't quite finished reading all the references yet ;-) but I'm wondering about the efficiency of any() and all(). I've heard it said that they work in constant time, but I assume that would only be if they are implemented on a quantum computer (which are currently existentially-challenged ;-) Do any() and all() have some magic around how they are implemented in von Neumann computers that makes them faster than standard CS searching techniques?

The RFC mentions the opportunity to parallelise these operators, if they are included in the core. The same is true of a number of other -data RFCs, such as element-wise array operations, and implicit loops. Is there a generic approach we could propose that would allow parallelising of user algorithms without having to rely only on a subset of 'parallel-enabled' builtins in the core?
Re: RFC 225 (v1) Data: Superpositions
Jeremy Howard wrote: Do any() and all() have some magic around how they are implemented in von Neumann computers that make them faster than standard CS searching techniques? I'm probably naive here but shortcuts in a non-parallelized (classical) implementation rely on the usual shortcircuiting: the first true allows any to return, the first false can terminate all. How can you do better? Christian
Re: RFC 225 (v1) Data: Superpositions
Do any() and all() have some magic around how they are implemented in von Neumann computers that make them faster than standard CS searching techniques? I'm probably naive here but shortcuts in a non-parallelized (classical) implementation rely on the usual shortcircuiting: the first true allows any to return, the first false can terminate all. How can you do better? You can't, in serial implementation. But on a parallel architecture or, better still, on a quantum device, you can run all the computations in parallel. Damian
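The serial short-circuiting Christian describes can be sketched in Perl 5 as below. The helpers any_true() and all_true() are hypothetical names chosen for illustration; they are plain list tests, not the superposition operators proposed in RFC 225 and not the Quantum::Superpositions API.

```perl
use strict;
use warnings;

# Returns 1 as soon as one element passes the test, so later
# elements are never examined.
sub any_true(&@) {
    my $test = shift;
    for (@_) { return 1 if $test->($_) }
    return 0;
}

# Returns 0 as soon as one element fails the test.
sub all_true(&@) {
    my $test = shift;
    for (@_) { return 0 unless $test->($_) }
    return 1;
}

my $calls = 0;
my $found = any_true { $calls++; $_ > 2 } 1 .. 1000;
print "found=$found after $calls tests\n";    # found=1 after 3 tests

my $all_even = all_true { $_ % 2 == 0 } 2, 4, 5, 6;
print "all_even=$all_even\n";                 # all_even=0
```

In the worst case (no element matches for any, every element matches for all) the serial version still inspects the whole list, which is why Damian points to parallel hardware for anything better.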
Re: RFC 99 (v3) Standardize ALL Perl platforms on UNIX epoch
Chaim Frenkel [EMAIL PROTECTED] writes:

"AD" == Andy Dougherty [EMAIL PROTECTED] writes:

AD> In my humble opinion, I think perl's time() ought to just call the C
AD> library's time() function and not waste time mucking with the return
AD> value. Instead, if the time is to be stored externally for later use by
AD> another program, the programmer should be responsible for converting the
AD> time into a suitably useful and portable format. Any unilateral choice
AD> made by Perl6 in that regard isn't going to be of any help unless everyone
AD> else (Java, Python, C, etc.) follows along.

From the perspective of a non-Unix user that has been fighting this battle for a few years now, you can't just take the C library time() on all systems. And yes, the programmer does need to take responsibility for external formats, regardless of the internal time representation used, and regardless of what "everyone else is doing".

On at least some non-Unix systems, the time() function is itself an attempt to emulate Posix functionality...note that I say "attempt". And also note that Posix != Unix. One can have library functions in perfect compliance with published Posix standards, but when Unix programmers see the result they say "that's evil".

An example: gmtime() on my system returns 0. Why? Because the system runs on "local time" and has no idea of the offset to UT. It's a perfectly Posix-compliant result, but causes no end of problems with programs written for Unix. Ditto for the epoch of 1 Jan 1970 0h in local time rather than UT - programmers make assumptions based on what they're used to.

Possibly a few functions to make it easy.

    $Perl::EpochOffset
        0            on a unix box
        966770660    on a Mac
    (Lifted from pudge's previous email)
    etc.

I don't know how many places there's an implicit "Epoch=0" embedded deeply in Perl modules and apps, but I've run into plenty of them.

Adding a module to define epochs will sure help, but as long as the "default" is so simple, plenty of programmers will just assume Epoch=0 and build it into their code.

So my suggestion is that either Epoch=0 for everyone, or make it distinctly non-zero for everyone so lazy programmers have to use $Perl::EpochOffset everywhere.

One other that might be useful is have strftime() (or something similar) built-in without having to use POSIX; and the default should be MMDDHHMMSS.fff, (the ISO format)

Sounds really good to me.

I personally prefer to pass around the string representation, more than perl and unix systems need to handle datetime. (And I find it easier to read the ISO version than a time in seconds)

Agreed, and having built in functions that can convert to/from ISO format to internal representation would be a great help.

Just to stir things up a bit: if you want to consider abandoning the "unix standard time()" to get higher resolution, etc., take a closer look at the VMS native time, modified to be UT rather than local time. It has 64 bits for a huge range, with "ticks" of 100ns. The epoch is 17-Nov-1858 so no problems with pre-1970 dates in databases...and I *believe* that epoch date comes from the "modified Julian Day" epoch (MJD = JD - 2,400,000.5) so there's a very simple conversion to Julian Day.

--
Chuck Lane, Drexel University, Particle Physics
(215) 895-1545, FAX: (215) 895-5934, [EMAIL PROTECTED]
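The epoch arithmetic mentioned above can be sketched in Perl 5 as follows. This is an illustration under stated assumptions, not VMS library code: the function names are made up, the tick counts are assumed to be in UT already (as the message suggests), leap seconds are ignored, and the example literal is small enough that it works whether or not perl was built with 64-bit integers. It uses the standard definition MJD = JD - 2,400,000.5, under which MJD 0 is 17-Nov-1858 00:00 and the Unix epoch is MJD 40587.

```perl
use strict;
use warnings;

use constant UNIX_EPOCH_MJD  => 40_587;       # 1 Jan 1970 00:00 UT as a Modified Julian Day
use constant SECONDS_PER_DAY => 86_400;
use constant TICKS_PER_SEC   => 10_000_000;   # one tick = 100ns

# Unix seconds -> Modified Julian Day (fractional days allowed).
sub unix_to_mjd {
    my ($unix_seconds) = @_;
    return UNIX_EPOCH_MJD + $unix_seconds / SECONDS_PER_DAY;
}

# Modified Julian Day -> Julian Day.
sub mjd_to_jd {
    my ($mjd) = @_;
    return $mjd + 2_400_000.5;
}

# 100ns ticks since 17-Nov-1858 00:00 UT -> Unix seconds.
sub ticks_to_unix {
    my ($ticks) = @_;
    return $ticks / TICKS_PER_SEC - UNIX_EPOCH_MJD * SECONDS_PER_DAY;
}

print unix_to_mjd(0), "\n";                            # 40587
print mjd_to_jd(unix_to_mjd(0)), "\n";                 # 2440587.5
print ticks_to_unix(35_067_168_000_000_000), "\n";     # 0
```

The last line shows why the scheme is attractive for databases: dates before 1970 are simply smaller positive tick counts rather than negative offsets.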
Re: RFC 99 (v3) Standardize ALL Perl platforms on UNIX epoch
Andy Dougherty [EMAIL PROTECTED] writes:

On Thu, 14 Sep 2000, Charles Lane wrote:

On at least some non-Unix systems, the time() function is itself an attempt to emulate Posix functionality...note that I say "attempt". And also note

Do you mean that the following program might not print '5' (well, about 5, given sleep's uncertainties ...)

    #include <stdio.h>
    #include <sys/types.h>
    #include <time.h>       /* May need <sys/time.h> instead */
    #include <unistd.h>     /* for sleep() */

    int main(void)
    {
        time_t start, stop;

        start = time((time_t *) 0);
        sleep(5);
        stop = time((time_t *) 0);
        /* Cast so the argument matches %ld whatever time_t is. */
        printf("The difference is %ld seconds.\n", (long) (stop - start));
        return 0;
    }

If you mean the above program won't print '5', then I don't see how changing the epoch could possibly help. More radical surgery is required.

Run that program during a DST change on my system, and you'll get +-3600 instead of 5. Yes, more radical surgery IS required, and you can't just take the C library's "time()" function and expect it to be suitable for Perl's time(). There's code to deal with this in Perl5's vms/vms.c.

So my suggestion is that either Epoch=0 for everyone, or make it distinctly non-zero for everyone so lazy programmers have to use $Perl::EpochOffset everywhere.

This is not a simple either-or.

Of course not, it was stating a preference for either of two (out of many) possible alternatives. Of those two, I prefer Epoch=0.

If Perl5 on VMS just used the time() function from the C library we'd have Epoch=14400 part of the year and Epoch=18000 the rest of the year. Varying from year to year. Think this is a good way to do it? I don't.

Suppose you are using a Mac and that perl6 decides to use the Unix epoch. Suppose you want to communicate with other Mac programs or library functions about time (e.g. perhaps by calling things through the XS interface or by storing to a file). Since the perl6 time() and C time() will now disagree about what time it is, even the non-lazy programmer will have to use $Perl::EpochOffset everywhere.

Whenever you communicate outside of Perl and you use "low level" data you have to deal with the possibility of mismatches in things like formats and representations (number endianness, floating point binary representations, and yes, times). When you're writing code to port to other applications you *should* be looking out for stuff like this.

The problems that I ran into with Epoch != 0 were purely internal to Perl...NOT the place where you'd expect to have to mess around with EpochOffsets. That's where programmers get sloppy.

In sum, *either* approach works in some situations and fails in others. There is no universal solution. Epoch=0 everywhere will *not* solve all problems. Nor will Epoch=native. Neither is perfect. Each has problems.

Agreed.

So we have to pick a default. (Yes, I'm sure everyone agrees that making it easy to make the other choice via some nice module is nice too.) But we have to pick a default. And I vote for perl's time() simply calling the system time().

Andy, those of us using Perl on VMS for some years have had both defaults. So there is some experience with this. I've been in the Epoch !=0 mode and it sucked. I vote for Epoch=0 as the default.

--
Chuck Lane, Drexel University, Particle Physics
(215) 895-1545, FAX: (215) 895-5934, [EMAIL PROTECTED]
Re: RFC 99 (v3) Standardize ALL Perl platforms on UNIX epoch
At 11:01 -0400 2000.09.14, Andy Dougherty wrote: On Thu, 14 Sep 2000, Chris Nandor wrote: There's also the possibility of time accepting an argument, where 0 would be perl's time and 1 would be native time, or something. Now that's a clever idea. Hmm. I think I like it as a solution to the specific issue at hand better than the proposed time()/systime() pair. I think I'll "borrow" your suggestion for a longer posting on perl6-language :-). Be my guest. I am not favoring one solution or another, just throwing out ideas. I had tossed around the idea of drawing up several competing RFCs. :-) BTW, Nat noted that a moratorium on RFCs is FAST approaching, that Larry will make his draft feature set on Oct. 1, and his final on Oct. 14, so get any new RFCs in now. See http://use.perl.org/ for links to what Nat said. On the other hand, I'm not sure I like it too much as a general solution to the broader portable vs. native issue, since other functions with arguments can't be handled so cleanly. True. Yes, I know it's very very unfair to take your time() suggestion and try to apply it to a question like "Should unlink() on VMS unlink all previous versions (i.e. like a command line PURGE) or should it behave more like DEL and only delete the latest version? But the unlink() question is also a valid one with many of the same underlying issues. I'm currently trying to think about how to encourage us collectively to handle similar issues in similar ways. Well, I don't think it is entirely unfair. Also, what values do -T and stat() return? Well, -T is not so much of a problem as long as the epoch is still in seconds, but stat() sure is a problem. We could make stat take an optional second parameter. I don't think any other builtin would have the problem, but what about modules like Time::Local? -- Chris Nandor [EMAIL PROTECTED]http://pudge.net/ Open Source Development Network[EMAIL PROTECTED] http://osdn.com/
Re: RFC 99 (v3) Standardize ALL Perl platforms on UNIX epoch
At 11:15 -0400 2000.09.14, Charles Lane wrote: I've been in the Epoch !=0 mode and it sucked. I vote for Epoch=0 as the default. Well, Perl is about making things easy. What is the most common case, needing an arbitrary value of time that may or may not be used to transfer between platforms, or needing a value of time that is specific to a given platform? As noted, I can rarely rely on the Mac value anyway, since it changes with time zone. And I cannot think of a time when I have needed the actual Mac value to communicate with another Mac process or library. If the Perl epoch were used for MacPerl, I don't think I would _ever_ need to get the Mac OS epoch (though I would certainly want it available if necessary). I can't say the same for VMS, or for other Mac users. -- Chris Nandor [EMAIL PROTECTED]http://pudge.net/ Open Source Development Network[EMAIL PROTECTED] http://osdn.com/
Re: RFC 99 (v3) Standardize ALL Perl platforms on UNIX epoch
On Thu, 14 Sep 2000, Chris Nandor wrote: Well, Perl is about making things easy. What is the most common case, needing an arbitrary value of time that may or may not be used to transfer between platforms, or needing a value of time that is specific to a given platform? And I cannot think of a time when I have needed the actual Mac value to communicate with another Mac process or library. Well, my entire experimental temperature control system currently relies on just this communication. But I suspect my personal experience here is probably not the most common :-). -- Andy Dougherty [EMAIL PROTECTED] Dept. of Physics Lafayette College, Easton PA 18042
Re: RFC 99 (v3) Standardize ALL Perl platforms on UNIX epoch
"CN" == Chris Nandor [EMAIL PROTECTED] writes:

CN> No, that won't really work. When my offset from GMT changes for daylight
CN> savings time, it will break. The point of having a module is that epoch
CN> conversions are more complicated than that. For example, Mac OS epoch
CN> begins at Jan 1 1904 00:00:00 _local time_. That is why the timezone
CN> offset from GMT was passed to the Time::Epoch functions.

I'm confused. How do you expect the program to know the timezone if the OS doesn't?

And if the program knows it and can track it, then we can hand off the responsibility to Perl. Then the epoch would 'vary' according to whatever nonsense is necessary.

But if the values wander so badly, what does the OS use? If perl has to convert away, then it can easily use Unix epoch.

CN> Also, you might want to convert between other epochs; what if you get an
CN> epoch value FROM Mac OS on a Unix box, and want to convert it?

That's a different problem than we are trying to solve.

This is a wider problem than a fixed epoch for perl. Let's turn this around. What if we are on a platform that doesn't use perl's epoch and we need to write a value to a file?

I think I've just gotten very confused.

chaim
--
Chaim Frenkel, Nonlinear Knowledge, Inc.
[EMAIL PROTECTED] +1-718-236-0183
Re: RFC 99 (v3) Standardize ALL Perl platforms on UNIX epoch
Bart Lateur [EMAIL PROTECTED] writes: Now, on those platforms without 64 bit support, a double float has a lot more mantissa bits than 32, typically 50-something (on a total of 64 bits). This means that all integers with up to more than 50 significant bits can exactly be represented. That would be a lot better than the current situation of 32 bits. Everything I've heard from anyone who's done work on time handling libraries is that you absolutely never want to use floating point for time. Even if you think that the precision will precisely represent it, you don't want to go there; floating point rounding *will* find a way to come back and bite you. Seconds since epoch is an integral value; using floating point to represent an integral value is asking for it. As an aside, I also really don't understand why people would want to increase the precision of the default return value of time to more than second precision. Sub-second precision *isn't available* on quite a few platforms, so right away you have portability problems. It's not used by the vast majority of applications that currently use time, and I'm quite sure that looking at lots of real-world Perl code will back me up on this. It may be significantly more difficult, complicated, or slower to get at on a given platform than the time in seconds. I just really don't see the gain. Sure, we need an interface to sub-second time for some applications, but please let's not try to stuff it into a single number with seconds since epoch. -- Russ Allbery ([EMAIL PROTECTED]) http://www.eyrie.org/~eagle/
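The rounding concern above is easy to reproduce once sub-second values are involved. The sketch below assumes a perl whose floating-point numbers are ordinary IEEE doubles (the usual build); it is an illustration, not a statement about any particular time library. The Time::HiRes call at the end is the existing module interface that keeps seconds and microseconds as two separate integers rather than one fractional number.

```perl
use strict;
use warnings;
use Time::HiRes qw(gettimeofday);

# Ten steps of a tenth of a second, kept as a floating-point count of seconds.
my $float_seconds = 0;
$float_seconds += 0.1 for 1 .. 10;

# The same ten steps, kept as an integer count of tenths.
my $tenths = 0;
$tenths += 1 for 1 .. 10;

# 0.1 has no exact binary representation, so the float sum drifts.
print "float sum equals 1: ",  ($float_seconds == 1 ? "yes" : "no"), "\n";
print "integer sum equals 10: ", ($tenths == 10      ? "yes" : "no"), "\n";

# Sub-second time without a fractional epoch value: two integers.
my ($seconds, $microseconds) = gettimeofday();
printf "%d seconds and %d microseconds since the epoch\n", $seconds, $microseconds;
```

On an IEEE-double perl the first line reports "no" and the second "yes", even though print would display the float sum as 1.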
Re: RFC 99 (v3) Standardize ALL Perl platforms on UNIX epoch
At 17:47 -0400 2000.09.14, Chaim Frenkel wrote:

"CN" == Chris Nandor [EMAIL PROTECTED] writes:

CN> No, that won't really work. When my offset from GMT changes for daylight
CN> savings time, it will break. The point of having a module is that epoch
CN> conversions are more complicated than that. For example, Mac OS epoch
CN> begins at Jan 1 1904 00:00:00 _local time_. That is why the timezone
CN> offset from GMT was passed to the Time::Epoch functions.

I'm confused. How do you expect the program to know the timezone if the OS doesn't?

I am not sure what you mean by "the program." If you mean perl, well, perl often doesn't. Figuring out the correct time zones is sometimes quite hard. And yes, sometimes the OS is completely lacking in knowledge of a time zone. If you'll note, the Time::Epoch example asked that the time zone differential be one of the arguments, because we don't want to rely on guessing (but we fall back to guessing if none is supplied).

Assuming the OS does know, and perl can find out from the OS, then perhaps a variable would work. But I, for the most part, despise the idea of adding more global variables to Perl. I would much rather call a simple function.

And if the program knows it and can track it, then we can hand off the responsibility to Perl.

My program knows the timezone difference because I hardcoded it in. :)

CN> Also, you might want to convert between other epochs; what if you get an
CN> epoch value FROM Mac OS on a Unix box, and want to convert it?

That's a different problem than we are trying to solve.

I don't think so. What we are trying to solve is the problem of different system epochs.

This is a wider problem than a fixed epoch for perl. Let's turn this around. What if we are on a platform that doesn't use perl's epoch and we need to write a value to a file?

Yes. What if? That's what we're addressing. Right now, you need to use something like Time::Epoch to do a conversion, or you use a non-ambiguous representation, such as you get with Date::Manip (which, BTW, I believe is broken in respect to MacPerl's epoch; that is, I think I needed to convert to Unix epoch before doing something with it).

--
Chris Nandor [EMAIL PROTECTED] http://pudge.net/
Open Source Development Network [EMAIL PROTECTED] http://osdn.com/
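A minimal Perl 5 sketch of the conversion under discussion, with the time zone differential passed in explicitly as described above. The function names and argument order are hypothetical and are not the Time::Epoch interface; the only facts relied on are the ones in the thread (Mac OS counts seconds from 1 Jan 1904 00:00:00 local time) plus the day count between the two epochs.

```perl
use strict;
use warnings;

# Seconds from 1904-01-01 to 1970-01-01: 66 years including 17 leap days,
# i.e. 24107 days of 86400 seconds.
use constant MAC_TO_UNIX_SECONDS => 2_082_844_800;

# $tz_offset is local time minus UT in seconds (for example -18000 for UT-5).
# It has to be supplied by the caller because the Mac value is in local time.
sub mac_to_unix {
    my ($mac_seconds, $tz_offset) = @_;
    return $mac_seconds - MAC_TO_UNIX_SECONDS - $tz_offset;
}

sub unix_to_mac {
    my ($unix_seconds, $tz_offset) = @_;
    return $unix_seconds + MAC_TO_UNIX_SECONDS + $tz_offset;
}

my $tz  = -5 * 3600;                 # assumed offset for the example
my $mac = unix_to_mac(0, $tz);
print "$mac\n";                      # 2082826800
print mac_to_unix($mac, $tz), "\n";  # 0
```

The offset argument is exactly the part that changes at a daylight saving transition, which is why a single fixed constant does not cover the Mac case.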
Re: RFC 99 (v3) Standardize ALL Perl platforms on UNIX epoch
"CN" == Chris Nandor [EMAIL PROTECTED] writes:

This is a wider problem than a fixed epoch for perl. Let's turn this around. What if we are on a platform that doesn't use perl's epoch and we need to write a value to a file?

CN> Yes. What if? That's what we're addressing. Right now, you need to use
CN> something like Time::Epoch to do a conversion, or you use a non-ambiguous
CN> representation, such as you get with Date::Manip (which, BTW, I believe is
CN> broken in respect to MacPerl's epoch; that is, I think I needed to convert
CN> to Unix epoch before doing something with it).

You misunderstood me. You have to know several different facts: the current epoch, the machine epoch, and the epoch that the file requires.

I really don't see that we need more than the difference between the timestamp returned from the syscalls and the unix (or whatever) epoch.

If you want to adjust for timezones just calculate the constant. Which, since you are giving it in HHMM format, you might as well just calculate directly.

So what am I missing?

chaim
--
Chaim Frenkel, Nonlinear Knowledge, Inc.
[EMAIL PROTECTED] +1-718-236-0183
RFC 80 (v3) Exception objects and classes for builtins
This and other RFCs are available on the web at
http://dev.perl.org/rfc/

=head1 TITLE

Exception objects and classes for builtins

=head1 VERSION

  Maintainer: Peter Scott [EMAIL PROTECTED]
  Date: 9 Aug 2000
  Last Modified: 14 Sep 2000
  Mailing List: [EMAIL PROTECTED]
  Number: 80
  Version: 3
  Status: Developing

=head1 ABSTRACT

This RFC proposes that builtins that throw exceptions throw them as objects belonging to a set of standard classes. This would enable an exception type to be easily recognized by user code. The behavior if the exception were not trapped should be identical to the current behavior (error message with optional line number output to STDERR and exit with non-zero exit code).

=head1 DESCRIPTION

This RFC is tightly bound with RFC 88, which proposes an exception handling mechanism based upon exceptions-as-objects, and in particular specifies that fatal exceptions thrown by the core will be objects with certain instance attributes. We assume here that these aspects of RFC 88 are implemented.

Builtins experiencing fatal errors currently call C<die>, which is to say, they throw an exception. Builtins experiencing non-fatal errors return a variety of error codes. RFC 70 proposes that these be trappable exceptions if C<use Fatal> is in effect.

This RFC proposes that both exceptions be objects blessed into a standard set of classes which can be checked for by the user.

=head2 Object Attributes

The exception object will have attributes filled in by perl. The applicable attributes from RFC 88 will be used, including:

=over 4

=item tag

RFC 88 eschews numeric codes in favor of alphanumeric tags. A system exception should place the symbolic errno constant here, e.g., C<EINVAL>, for system errors; something will have to be made up for errors that don't have associated errnos.

=item message

The text of the exception, e.g., "Out of memory".

=item severity

Relative level of fatality. Chosen from some TBD enumeration, e.g., "Warning", "Fatal", "Information".

=item sysmsg

Additional information about the exception, the kind of thing currently put in C<$^E>.

=item files

This and the next two attributes are used to track the program locations the exception has passed through to this point since it was thrown. This attribute returns an array of filenames, starting with the one in which the exception was thrown. This is to preserve C<caller>-type information for the catcher to be able to see. RFC 88 proposes the method C<show> and option C<trace> to retrieve filenames and line numbers.

=item lines

An array of line numbers of program locations the exception has passed through between being thrown and being caught.

=item subs

An array of (package-qualified) subroutine names the exception has passed through between being thrown and being caught.

=back

Stringifying the object itself will yield the C<message> attribute. A C<facility> attribute was suggested to indicate what part of perl is throwing the exception: IMO that is part of the exception class. In a numeric context, the value will be the C<errno> if it corresponds to one, otherwise up to the implementor.

=head2 Classes

This is a strawman exception class enumeration. The merits of this RFC do not depend on this being a good list, only on it being possible to find a reasonable one. A common prefix like C<Exception::> is elided for readability.

Note: conceivably, the implementation could allow an exception to belong to more than one class at a time through multiple inheritance (e.g., C<Regex> and C<Recursion>). I haven't explored the ramifications of that.

These class names can be specified in calls to C<Fatal.pm> (and appropriate language currently appears in RFC 70), qualified with a C<:> to distinguish them from core function names. This allows the user to change the fatality or otherwise of whole classes of exceptions. It would be possible (whether it would also be I<desirable> is another matter) for a user to say, e.g., C<no Fatal qw(:Reference)> and thereby excise the usual core exception upon an incorrect dereference operation, as though they had wrapped it in an C<eval>. This makes the operation of C<Fatal.pm> consistent over the broadest possible application.

=over 4

=item Arithmetic

Divide by zero and friends.

=item Memory

C<malloc> failed, request too large, that sort of thing.

=item Eval

A compilation error occurred in C<eval>, C</e>, or C<(?{ ... })>. Possible candidate for subdividing.

=item Regex

A syntax error occurred in a regex (built at run-time). Possible candidate for subdivision.

=item IO

An I/O error occurred. Almost certainly should be subdivided, perhaps parallel to the C<IO::> hierarchy.

=item Format

Error in format given to C<pack>, C<printf>, octal/hex/binary number etc. Could use a better name.

=item Thread

Some goof in threading.

=item Object

Tried to call non-existent method, that kind of thing.

=item System

Attempt to interact
Re: $a in @b (RFC 199)
David L. Nicol wrote:

This ability to jump to "the right place" is exactly what exception handling is for, as I understand it. Exceptions allow us to have one kind of block and any number of kinds of exit mechanisms. If qw(last die return) are all exceptions, they can travel up the call stack until they find the appropriate handler.

Kinda. "Exceptions" are supposed to be for exceptional situations only; return is none such. last/next/redo isn't really, either. And I strongly oppose having perl handle user-raised exceptions. But the "longjump" idea is right; so I propose that we lump these things together not as "exceptions" (though they may be implemented internally that way), but as "jumps".

But I think the point is important, that the various kinds of blocks, and their respective, yea, defining, exit mechanisms, not be confused or conflated. We just need to clear up what kind of block grep/map use: either a true sub (which I favor), or a distinct kind, with its own early exit keyword(s).

--
John Porter

We're building the house of the future together.
Re: $a in @b (RFC 199)
'John Porter' wrote:

David L. Nicol wrote:

"Randal L. Schwartz" wrote:

I think we need a distinction between "looping" blocks and "non-looping" blocks. And further, it still makes sense to distinguish "blocks that return values" (like subroutines and map/grep blocks) from either of those. But I'll need further time to process your proposal to see the counterarguments now.

In the odd parallel universe where most perl 6 flow control is handled by the throwing and catching of exceptions, the next/last/redo controls are macros for throwing next/last/redo exceptions. Loop control structures catch these objects and throw them again if they are labeled and the label does not match a label the loop control structure recognizes as its own.

...

In a nutshell, there are different kinds of blocks, and their escape mechanisms are triggered by different keywords. By unifying the block types, and making the keywords work across all of them, I'm afraid we would lose this ability to jump up through the layers of scope to "the right place".

This ability to jump to "the right place" is exactly what exception handling is for, as I understand it. Exceptions allow us to have one kind of block and any number of kinds of exit mechanisms. If qw(last die return) are all exceptions, they can travel up the call stack until they find the appropriate handler.

The "traveling up the call stack" can even be optimized to a per-thread table of what the appropriate handler is for the most commonly used types.

--
David Nicol 816.235.1187 [EMAIL PROTECTED]
perl -e'map{sleep print$w[rand@w]}@w=<>' ~/nsmail/Inbox
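The "parallel universe" scheme described above, in which loop controls are exceptions that a loop either handles or rethrows by label, can be imitated in Perl 5 with die and eval. Everything here is a hypothetical sketch: the LoopControl class and loop_over() function are invented for illustration and say nothing about how Perl 6 would implement it.

```perl
use strict;
use warnings;
use Scalar::Util qw(blessed);

package LoopControl;

# Throw a control object: $kind is 'last' or 'next', $label names the target loop.
sub throw {
    my ($class, $kind, $label) = @_;
    die bless { kind => $kind, label => $label }, $class;
}

package main;

# A labelled loop that catches control objects thrown from its body.
sub loop_over {
    my ($label, $items, $body) = @_;
    for my $item (@$items) {
        next if eval { $body->($item); 1 };    # body finished normally
        my $err = $@;
        # Ordinary errors are not ours to handle.
        die $err unless blessed($err) && $err->isa('LoopControl');
        # Labelled for some other loop: throw it again, up the call stack.
        die $err if defined $err->{label} && $err->{label} ne $label;
        last if $err->{kind} eq 'last';
        # kind 'next': simply continue with the following item
    }
    return;
}

my @seen;
loop_over('OUTER', [1 .. 3], sub {
    my ($i) = @_;
    loop_over('INNER', [1 .. 3], sub {
        my ($j) = @_;
        LoopControl->throw('last', 'OUTER') if $i == 2 && $j == 2;
        push @seen, "$i$j";
    });
});
print "@seen\n";    # 11 12 13 21
```

The inner loop sees the object first, finds that the label is not its own, and rethrows; the outer loop then treats it as its own `last`. An unlabelled object is handled by the nearest enclosing loop.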
Cross-referencing RFC 186 with RFC 183 and RFC 79
RFC 186 is another interesting -io RFC, even though I'm not on the -io list. I couldn't find any discussion in the mail archive, so here's some to start it. Please copy me on the discussion. Sorry for cross posting, but this is attempting to unify RFCs from different lists; I've bcc'd two of the lists, directing followup discussion to -io (seems most appropriate for now). Could the RFC authors respond by adding the other RFCs to their cross-reference lists and republishing their RFCs, or explaining why I'm all wet in seeing these relationships. And as always, other comments welcome. Perl6 RFC Librarian wrote: =head1 TITLE Standard support for opening i/o handles on scalars and arrays-of-scalars It's extremely useful to be able to open an i/o handle on common in-core data structures, such as scalars or arrays-of-lines. The CPAN modules IO::Scalar, IO::ScalarArray, and IO::Lines currently provide some of this functionality, but their pure-Perl implementation (chosen for portability) is not as fast or memory-efficient as a native implementation could be. Additionally, since they are not part of the standard Perl distribution, many developers are either unaware of their existence or unwilling to obtain and install them. This RFC proposes that support for "in-core i/o" be folded into the Perl distribution as a standard extension module, making use of native C code for speed. I have a number of scripts that use this sort of facility, using push/shift to populate/read the array "file". These could be made simpler and more general by wrapping the array as a file. Perhaps the open "handler" stuff could be used to implement this? Efficiently? Perhaps a technique like this could be used to implement RFC 79? Perhaps these RFCs should each reference the other, to preserve this notion? Perl's first pass through the file would read it and interpret all lines via the POD rules, plopping each line into the appropriate memory "file" (array) for each type of active handler? 
So compiling normal perl would create (minimally) a "perl source array", and a "perl data array". After the file is completely read, the perl compiler would be turned loose on the file populated by the perl source array, and the perl data array would eventually populate whatever the DATA handle becomes in perl6. A pod processor would declare its type and get the set of lines appropriate to that type of pod processor. Or maybe (if it is cheap, or as a pod processor helper command line option, we read all the lines anyway) a file handle/memory array would be created for each type of pod processor mentioned in the source code. Then (1) programs could access the pod data via those handles, (2) a pod processor written in perl could just use the handle for the type of processor it is, ignoring the others. This RFC also seems to be related to RFC 183... using POD for testing. Now the model of use apparently envisioned for RFC 183 is to have the tests inside the POD, and then use a preprocessor to hack them out and put them in separate files. Wouldn't it be better to skip that step? Just use the "pod helper command line option" mentioned in the above paragraph, or a variation, to cause perl's first pass to (1) obtain the source for the program or module, (2) also obtain the source for the test module, (3) obtain one or more data handles for test input data and validation data, (4) compile (1) and (2) as perl source code, and (5) launch the tests, which can then use the appropriate data handles. But when compiled normally (without the test switches), all the test files simply don't get included. -- Glenn There are two kinds of people, those who finish what they start, and so on... -- Robert Byrne
Re: Cross-referencing RFC 186 with RFC 183 and RFC 79
On Wed, Sep 13, 2000 at 11:21:25PM -0700, Glenn Linderman wrote:

> This RFC also seems to be related to RFC 183... using POD for testing. Now the model of use apparently envisioned for RFC 183 is to have the tests inside the POD, and then use a preprocessor to hack them out and put them in separate files. Wouldn't it be better to skip that step? Just use the "pod helper command line option" mentioned in the above paragraph, or a variation, to cause perl's first pass to (1) obtain the source for the program or module, (2) also obtain the source for the test module, (3) obtain one or more data handles for test input data and validation data, (4) compile (1) and (2) as perl source code, and (5) launch the tests, which can then use the appropriate data handles. But when compiled normally (without the test switches), all the test files simply don't get included.

RFC 79 would only affect how the tests are extracted from the code. No alteration of the proposed syntax would be necessary, "=for/=begin/=end testing" will still work fine.

As for adding the test extraction to the core as a command switch, I think that's unnecessary. Extracting the testing source is trivial, with or without RFC 79. Once extracted, a module can deal with it just as easily, and with much more flexibility, than a core patch to perl can. Besides, .t files aren't going anywhere and we'll still need external (ie. MakeMaker) support to deal with them.

There's no compelling reason to muddle the perl core with test switches, it can be done more easily as a module.

-- Michael G Schwern http://www.pobox.com/~schwern/ [EMAIL PROTECTED] Just Another Stupid Consultant Perl6 Kwalitee Ashuranse BOFH excuse #39: terrorist activities
Re: Cross-referencing RFC 186 with RFC 183 and RFC 79
Glenn Linderman wrote: I have a number of scripts that use this sort of facility, using push/shift to populate/read the array "file". These could be made simpler and more general by wrapping the array as a file. Perhaps the open "handler" stuff could be used to implement this? Efficiently? I think this is a definite possibility: $FH = open scalar $myvar; print $FH "stuff"; Then the scalar handler would just have to provide the necessary print() et al methods to do scalar manipulation. Theoretically this can be done modularly, without having to wedge it into the core binary, while still being able to get speed benefits. -Nate
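For comparison, here is a minimal sketch of the same idea as it can be written in later Perl 5 releases (5.8 onward), where open() accepts a reference to a scalar as the "file". This is not the handler syntax proposed above; the helper name capture_output is hypothetical and exists only for this illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: run $writer against a filehandle that is backed by
# an in-memory scalar, then return whatever was printed to it.
sub capture_output {
    my ($writer) = @_;
    my $buffer = '';

    # A reference to a scalar in place of a filename opens an
    # in-memory filehandle (Perl 5.8 and later).
    open(my $fh, '>', \$buffer) or die "Cannot open in-memory handle: $!";
    $writer->($fh);
    close($fh) or die "Cannot close in-memory handle: $!";

    return $buffer;
}

my $myvar = capture_output(sub {
    my ($fh) = @_;
    print {$fh} "stuff";
});
print "captured: $myvar\n";
```

The code that prints only ever sees an ordinary filehandle, which is the property the thread is after: filehandle-centric code can be reused on in-core data.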
Re: Cross-referencing RFC 186 with RFC 183 and RFC 79
Michael, Thanks for the explanation. So you see, I'm one of those people that go around looking for redundancies to eliminate. So when I hear that you want to extract a .t file from perl source (as specified by RFC 183), it makes me wonder 1) why extract it if it could potentially be used in place 2) if it cannot be used in place, then why bundle it. So I guess RFC 183 leaves me not understanding its goals. If there is a benefit to the bundling, then RFC 183 would seem to be only half the solution, the other half would be to use it in place. Regarding my comment about "junk" below, I didn't mean that the content was junk, but the file, since it would (given RFC 183) become redundant with the same content embedded in the module source, is, because it is redundant, junk. If it had to be extracted to be used, then it should be cleaned up after it is used, because it is redundant. If it could be used in place, then it doesn't need to be extracted, nor does it need to be cleaned up. Nothing you've said below makes me think that it would be impossible to use the data in-place, given a unification of the concepts in RFCs 186, 183, and 79. Whether or not it is valuable is another issue, but, in the never-ending quest to eliminate redundancy, I'll claim that if it is valuable to embed the content in the first place (the point of RFC 183), then it would be valuable to use it in-place rather than extracting it. Michael G Schwern wrote: On Thu, Sep 14, 2000 at 12:01:03AM -0700, Glenn Linderman wrote: Once extracted, a module can deal with it just as easily, and with much more flexibility, than a core patch to perl can. Who cleans up all the junk files later? Nobody does, they're not junk. They go into the t/ directory of the module/code distribution. More on that later. And if you have to extract them to use them, why bundle them in the first place? Because it's all done for you as part of "make test". 
Besides, .t files aren't going anywhere and we'll still need external (ie. MakeMaker) support to deal with them. Oh? I'm no QA guy, right? So I have no clue what a .t file is (sorry). A .t file is simply a Perl program which outputs a series of "ok" or "not ok" for each test run. Nothing more. They typically reside in a t/ directory of a module's source code (that or there's one test.pl file). Download any module from CPAN and have a look. A simple example would be:

    #!/usr/bin/perl -w
    print 2+2 == 4 ? "ok 1\n" : "not ok 1\n";
    print 2*2 == 4 ? "ok 2\n" : "not ok 2\n";

When you run "make test" on a module it looks for test.pl and t/*.t, runs them, and counts up the "ok"s and "not ok"s. Simple. There are no plans to change this AFAIK. So all RFC 183 needs to change is that when "make test" happens, pod2tests would be run over each source file generating a .t file from any embedded tests found (lib/Some/Module.pm's tests would go in t/Some-Module-embedded.t) and the rest is as it is now. The .t files are run and oks counted. Simple, with a minimum of modification to the existing system. Sorry if I took it for granted that people would understand the testing system. And anyone for whom the above was new information, I expect you FRONT AND CENTER at the "Writing Solid Perl" tutorial at YAPC::Europe.

-- Michael G Schwern http://www.pobox.com/~schwern/ [EMAIL PROTECTED] Just Another Stupid Consultant Perl6 Kwalitee Ashuranse MORONS!

-- Glenn There are two kinds of people, those who finish what they start, and so on... -- Robert Byrne
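To make the "counts up the oks and not oks" step concrete, here is a rough sketch of that counting. It is only an illustration: the function name count_results is hypothetical, and this is not how the real test harness is implemented, just the idea described above applied to a string of test output.

```perl
use strict;
use warnings;

# Hypothetical helper: count the "ok" and "not ok" lines in the text
# printed by a test script.
sub count_results {
    my ($output) = @_;
    my ($pass, $fail) = (0, 0);
    for my $line (split /\n/, $output) {
        # Check "not ok" first, since a plain /ok/ would match it too.
        if    ($line =~ /^not ok\b/) { $fail++ }
        elsif ($line =~ /^ok\b/)     { $pass++ }
    }
    return ($pass, $fail);
}

# Build the kind of output a small .t file would print; the second test
# is deliberately wrong so that both counters are exercised.
my $output = join '',
    (2 + 2 == 4 ? "ok 1\n" : "not ok 1\n"),
    (2 * 2 == 5 ? "ok 2\n" : "not ok 2\n");

my ($pass, $fail) = count_results($output);
print "passed: $pass, failed: $fail\n";
```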
RFC 30 (v4) STDIN, STDOUT, STDERR, ARGV, and DATA should become scalars
This and other RFCs are available on the web at http://dev.perl.org/rfc/

=head1 TITLE

STDIN, STDOUT, STDERR, ARGV, and DATA should become scalars

=head1 VERSION

  Maintainer: Nathan Wiger [EMAIL PROTECTED]
  Date: 04 Aug 2000
  Last-Modified: 14 Sep 2000
  Mailing List: [EMAIL PROTECTED]
  Number: 30
  Version: 4
  Status: Frozen

=head1 ABSTRACT

Consensus has been reached that filehandles (currently barewords) will be revamped to become true $scalars, to make them consistent with other Perl variables. C<STDIN>, C<STDOUT>, C<STDERR>, and C<DATA> should follow suit and be renamed C<$STDIN>, C<$STDOUT>, C<$STDERR>, and C<$DATA>, becoming full-fledged scalar B<fileobjects>.

In addition, C<ARGV> should become C<$ARGV> as well. The old function of C<$ARGV> (the currently open filename) will be available via polymorphism.

=head1 NOTES ON FREEZE

This was pretty much accepted by everyone, since it follows logically from filehandles becoming scalars. A few clarifications were added, but it otherwise remains the same from the previous version.

=head1 DESCRIPTION

=head2 $STDIN, $STDOUT, $STDERR

Currently, filehandles are barewords, such as FILE and PIPE. However, for Perl 6 these are planned to be renamed to true "single-whatzitz" types (thanks Tom) and prefixed with a $. So, the current:

   print FILE "$stuff\n";

Will become something like:

   print $FILE "$stuff\n";

STDIN, STDOUT, and STDERR need to follow suit. We should change

   print STDERR "$stuff\n";

to:

   print $STDERR "$stuff\n";

This makes them consistent with other Perl variables, such as @ARGV, %ENV, $VERSION, etc, all of which have the correct distinguishing prefix for their type.

=head2 $DATA

C<DATA> should follow suit, becoming C<$DATA>.

=head2 $ARGV

In addition, C<ARGV> should be renamed to C<$ARGV>. However, this overlaps with the already-existing C<$ARGV> (currently open filename), appearing to cause problems. But never fear!
Polymorphic objects to the rescue:

   while (<$ARGV>) {                    # used as fileobject
      next if ($ARGV eq $lastfile);     # $ARGV->STRING, filename
      print "Now reading $ARGV";        # $ARGV->STRING, filename
      dostuff($_);
      $lastfile = $ARGV;                # copies object, but that's ok
                                        # because will have ->STRING too
   }

This means that $ARGV will be both the filehandle *and* the name of the file, but it will automatically morph to suit your needs depending on context.

Additionally, C<ARGVOUT> should either follow suit, or be wrapped into C<$ARGV> ($ARGV->OUT?), whichever makes more sense.

=head1 IMPLEMENTATION

All references to these bareword filehandles will have to be changed.

In addition, $STDIN, $STDOUT, and $STDERR should be standard, read-write variables. If a person wants to do this:

   $STDOUT = $myfilehandle;
   print "Watch out!";

They should be able to. The same should (probably) go for C<$DATA> and C<$ARGV>.

=head1 MIGRATION

The p52p6 translator needs to be able to spot instances of barewords and globs and translate them to scalars:

   print STDERR @foo;      ->   print $STDERR @foo;
   dostuff(\*STDIN);       ->   dostuff($STDIN);
   print while(<ARGV>);    ->   print while(<$ARGV>);
   select(STDERR);         ->   $DEFOUT = $STDERR;     # RFC 129
   tie *STDOUT, 'Apache';  ->   tie Apache $STDOUT;    # RFC 200

A similar process will have to be done with all other filehandle conversions as well, so this may well be handled implicitly by the more general conversion. We may be able to ignore globs since these should handle scalars implicitly as aliases.

=head1 REFERENCES

http://www.mail-archive.com/perl6-language@perl.org/msg03279.html

RFC 14: Modify open() to support FileObjects and Extensibility

RFC 159: True Polymorphic Objects

RFC 129: Replace default filehandle/select with $DEFOUT, $DEFERR, $DEFIN

RFC 200: Objects: Revamp tie to support extensibility (Massive tie changes)
RFC 186 (v2) Standard support for opening i/o handles on scalars and arrays-of-scalars
Reply-To: [EMAIL PROTECTED]

This and other RFCs are available on the web at http://dev.perl.org/rfc/

=head1 TITLE

Standard support for opening i/o handles on scalars and arrays-of-scalars

=head1 VERSION

  Maintainer: Eryq (Erik Dorfman) [EMAIL PROTECTED]
  Date: 23 Aug 2000
  Last Modified: 14 Sep 2000
  Mailing List: [EMAIL PROTECTED]
  Number: 186
  Version: 2
  Status: Developing

=head1 ABSTRACT

Support the ability to open an i/o handle on a scalar, and also on an array-of-scalars. Implement this in C (for speed), and provide it in a "standard" extension module (for universal availability).

=head1 DESCRIPTION

It's extremely useful to be able to open an i/o handle on a common in-core data structure, such as a scalar or an array-of-lines. Such a capability breaks down the artificial boundary between core-based and disk-based data processing, allowing a developer to...

=over 4

=item *

Use memory to hold small "temporary files" (fast, secure, portable),

=item *

Reuse filehandle-centric code in non-file-based domains.

=back

The CPAN modules IO::Scalar, IO::ScalarArray, and IO::Lines currently provide some of this functionality, but their pure-Perl implementation (chosen for portability) is not as fast or memory-efficient as a native implementation could be. Additionally, since they are not part of the standard Perl distribution, many developers are either unaware of their existence or unwilling to obtain and install them.

This RFC proposes that support for "in-core i/o" be folded into the Perl distribution as a standard extension module, making use of native C code for speed.

=head1 IMPLEMENTATION

As described above. The following i/o handle classes are proposed as minimally necessary; they are taken from existing Perl5 CPAN modules with the same names:

=over 4

=item IO::Scalar

An i/o handle which can be opened on a scalar (string) variable. We simply treat the bytes of the scalar as a "virtual file".

=item IO::ScalarArray

An i/o handle which can be opened on an array of scalar (string) variables. Here, the "virtual file" is defined as the concatenation of the scalars in the array. One very common way to obtain such a data structure is to slurp a file into an array.

=back

If Perl6 follows Java's example of distinguishing "bytes" from "characters", then it should be understood that the proposed i/o handles manipulate I<bytes>, not characters. That is, the Java equivalents are classes like C<java.io.ByteArrayInputStream>.

Character-based i/o should be handled by some additional conversion mechanism which is wrapped around byte-based i/o; this mechanism should be applicable to I<any> i/o stream. A look at the Java implementation of byte-oriented "input/output streams" versus character-oriented "readers and writers" is worthwhile for this.

=head1 REFERENCES

IO::Scalar (CPAN)

IO::ScalarArray (CPAN)
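As a rough illustration of the "virtual file is the concatenation of the scalars" definition, the sketch below reads lines across element boundaries. It does not use IO::ScalarArray itself; it assumes the in-memory open() available in Perl 5.8 and later, and the helper name open_on_array is hypothetical. Unlike the proposed native handle, it copies the data into one string rather than reading the array in place.

```perl
use strict;
use warnings;

# Hypothetical helper: return a read handle whose contents are the
# concatenation of the strings in @$chunks. This copies the data, so
# later changes to the array are not visible through the handle.
sub open_on_array {
    my ($chunks) = @_;
    my $data = join '', @$chunks;
    open(my $fh, '<', \$data) or die "Cannot open in-memory handle: $!";
    return $fh;
}

# Line boundaries need not line up with array element boundaries.
my @chunks = ("ab", "c\nd", "ef\n");
my $fh = open_on_array(\@chunks);
while (my $line = <$fh>) {
    print "line: $line";
}
close($fh);
```

The three elements yield two lines, which is the behaviour the definition implies: the handle sees one byte stream, not three records.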
Re: Cross-referencing RFC 186 with RFC 183 and RFC 79
On Thu, Sep 14, 2000 at 12:15:28PM -0700, Glenn Linderman wrote:

> 1) why extract it if it could potentially be used in place
> 2) if it cannot be used in place, then why bundle it
>
> So I guess RFC 183 leaves me not understanding its goals. If there is a benefit to the bundling, then RFC 183 would seem to be only half the solution, the other half would be to use it in place.

The benefit to allowing inline tests is stated in the RFC.

The main reason why I want to extract the tests into a .t file is because then it can benefit from the testing system already in place and that everyone's familiar with (for limited values of everyone) and we know works. Otherwise we'd have to completely rebuild the testing system, and I don't really see any good reason for that. It does its job well and it's one less piece of magic to worry about.

It also means it will work in Perl 5. Very important, since I don't want to wait two years before I can start using this.

Even if we do manage to do it purely as Perl 5, it will still require that certain modules and utilities are installed in order for the tests to be extracted and run. When I distribute my program, I do not want to guarantee that the user has these incidental files (since they have nothing to do with the running of the module itself, and users have little enough tolerance for downloading additional modules as it is). So I can simply run pod2tests over my code, extract the .t files and distribute them directly.

Also, because embedded tests are meant to supplement, *not* supplant, traditional tests, if you're going to have .t files around anyway, one more isn't going to hurt.

> Nothing you've said below makes me think that it would be impossible to use the data in-place, given a unification of the concepts in RFCs 186, 183, and 79. Whether or not it is valuable is another issue, but, in the never-ending quest to eliminate redundancy, I'll claim that if it is valuable to embed the content in the first place (the point of RFC 183), then it would be valuable to use it in-place rather than extracting it.

Now, there's no reason pod2tests couldn't have an "inplace" switch which would do what you propose, either by keeping them entirely in memory or by cleaning up after itself afterwards. I've been thinking largely for the case of testing modules, which already have a testing harness around them. You're probably thinking of individual programs.

    $ pod2tests --inplace foo.pl foo.pl.ok

Something like that would make sense.

Eliminating redundancy is one thing, eliminating TMTOWTDI is another.

-- Michael G Schwern http://www.pobox.com/~schwern/ [EMAIL PROTECTED] Just Another Stupid Consultant Perl6 Kwalitee Ashuranse But why? It's such a well designed cesspool of C++ code. Why wouldn't you want to hack mozilla? -- Ziggy
Re: RFC 30 (v4) STDIN, STDOUT, STDERR, ARGV, and DATA should become scalars
"David L. Nicol" wrote:

> File handles work perfectly well right now as undecorated terms with well defined characteristics

Perfectly well?

* Have to use ugly globref syntax to pass them around reliably.
* Not first-class objects, so you can't subclass them.
* Special syntax to reassign STDOUT, instead of just $STDOUT = $foo.
* Stupid "gensym" tricks to create unique names.
* Do they even *get* garbage-collected the way real objects do???
* STDOUT->flush barfs if you don't "use FileHandle" first; sheesh.
* The "tiehandle" mechanism (blech).

I don't know. If Perl didn't work this way and I proposed **adding** such a monstrosity to the language, the community would laugh in my face, spit on my camel, and revoke my PAUSE account.

You can't even say that there's more typing involved to make the change, since both columns have the same number of characters, and the righthand-side is a lot easier to understand:

    print STDOUT "Hi";      print $STDOUT "Hi";
    foo(\*STDOUT);          foo($STDOUT);

Scalars hold references to objects. Filehandles should, ultimately, be objects, as should directory handles.

Just my 10 centimes,
Eryq
Re: RFC 30 (v4) STDIN, STDOUT, STDERR, ARGV, and DATA should become scalars
Nathan Wiger wrote: (in response to an assertion of preference for undecorated filehandles) Well, I think you might be overlooking a couple of important things about filehandles. First, having them NOT be scalars caused many problems: 1. You must use globs to pass them in and out of functions This could be resolved by allowing undecorated types to be passed 2. You cannot assign to them easily, even when it makes sense This could be resolved by allowing undecorated types to have assignment operators associated with their type 3. There's no way to have them interpolated in strings into something potentially useful. The string value of a file handle. Hmm. Is it's internal file descriptor number? Is it the name of the file? What if it's an anonymous stream? Is it the contents of the file? How far? If interpolation implies reading, how much is read, and can it be shoved back in to be read again? I like file handles that belong to an uninterpolable, undecorated type. I don't want anything interpolating in my doublequotes except specifically designated expressions starting with $ or @. 4. There's no easy way to maintain them as objects without problematic contortions. Not if we expand "object" to something that can handle them. Can the problematic contortions be hidden? The only downside is: 1. They don't standout as well anymore But you can make up for this by just keeping them in ALLCAPS: $FILE = open "/etc/motd"; print $FILE @stuff; Plus you get all the other benefits, which really are big huge benefits, especially when you do lots of work with filehandles. This is not mere decoration, it is fundamentally upgrading filehandles to first-class types - finally! While it does take a little getting used to, type it a few times and I think you'll find it looks just as natural - in fact more natural when you consider $STDOUT is now a true object containing true properties. I do not know what your vehemence refers to. STDOUT didn't truly refer to file descriptor #1 before? 
Historically, it has been implicitly learned that filehandles are "different thingies", but they're not, no different than a handle to a web doc or ftp server - which are scalars currently. Mine aren't open WEBPAGE, "hose $server 80 -slave tmpdata/getexp$$ |"; And a hidden ftp session is something a lot more complex than a file handle. This change is not about just making handles look more scalar-ish, it's about letting handles act and exist as first-class types. -Nate So why not allow undecorated variables a larger existence? This way allows more flexibility, allowing definition of an infinite variety of holes and pegs instead of sticking with our familiar round hole and peg. With lvalue subroutines we have the ability, in perl 5.6 and up, to have an accessor function that returns a reference to something. Which, means we can now have myriad types. Why not keep the file handle as a canonical example of an undecorated type, that we can do new things with in perl 6 pass as an argument Assign to using some means other than the special accessor function for the filehandle type, Copen -- Al Bundy 816.235.1187 [EMAIL PROTECTED] http://www.weihenstephan.org/~joaraue/img/snf.gif
Re: RFC 30 (v4) STDIN, STDOUT, STDERR, ARGV, and DATA should become scalars
"David L. Nicol" wrote:

>> 1. You must use globs to pass them in and out of functions
>
> This could be resolved by allowing undecorated types to be passed

This is already allowed. It's called "passing in a bareword". And barewords are just strings. Are you proposing that "a bareword should now mean a filehandle", so that

    copydata(STDIN, STDOUT);

means something different from

    copydata('STDIN', 'STDOUT');

or

    copydata(STDIN => "STDOUT");

?
RFC - Interpolation of method calls
=head1 TITLE

Interpolation of method calls

=head1 VERSION

  Maintainer: Michael G Schwern [EMAIL PROTECTED]
  Date: 14 Sep 2000
  Version: 1
  Mailing List: [EMAIL PROTECTED]

=head1 ABSTRACT

Method calls should interpolate in double-quoted strings, and similar locations.

    print "Today's weather will be $weather->temp degrees and sunny.";

Would deparse to:

    print 'Today\'s weather will be '.$weather->temp().' degrees and sunny.';

=head1 DESCRIPTION

=head2 The Current Problem With OO Interpolation

Object-oriented programming encourages data-hiding, and one of the most basic tools for this is the accessor method. For reasons which should be obvious, C<< $obj->foo() >> is usually better than C<< $obj->{foo} >>. However, there are several barriers to using an accessor method as simply as one does a hash lookup. Other RFCs deal with most of the current issues, but a basic one still remains.

    print "Today's weather will be $weather->temp degrees and sunny.";

This does not DWIM. Instead of interpolating C<< $weather->temp >> as a method call, it comes out as C<< $weather.'->temp' >> and is usually followed immediately by the question "What does 'Weather=HASH(0x80d4174)->temp' mean??" Most programmers learning OO Perl expect this to work and are surprised to find that it does not.

Workarounds abound:

    # If I wanted printf(), I'd have written it in C.
    printf "Today's weather will be %d degrees and sunny.", $weather->temp;

    my $temp = $weather->temp;
    print "Today's weather will be $temp degrees and sunny.";

    print "Today's weather will be @{[$weather->temp]} degrees and sunny.";

    print "Today's weather will be ".$weather->temp." degrees and sunny.";

None are as simple and as obvious as:

    print "Today's weather will be $weather->{temp} degrees and sunny.";

and because of this users groan at having to use accessor methods and are often tempted to violate encapsulation for ease of use.
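The current Perl 5 behaviour described above can be reproduced with a small script. This is only a sketch: the C<Weather> class, its constructor, and the value 72 are made up for the illustration and are not part of the proposal.

```perl
use strict;
use warnings;

# A made-up class with a single accessor, for illustration only.
package Weather;
sub new  { my ($class, %args) = @_; return bless { temp => $args{temp} }, $class }
sub temp { my ($self) = @_; return $self->{temp} }

package main;

my $weather = Weather->new(temp => 72);

# Perl 5 interpolates only $weather here; "->temp" stays literal text,
# so the result looks like "Temp: Weather=HASH(0x...)->temp".
my $naive = "Temp: $weather->temp";

# The @{[ ... ]} workaround evaluates the method call inside the string.
my $workaround = "Temp: @{[ $weather->temp ]}";

print "$naive\n";
print "$workaround\n";
```

The hexadecimal address in the first line of output varies from run to run, which is why the first line is the surprising result rather than the temperature.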
=head2 Proposed Solution - Interpolate Methods

Therefore, it is proposed that direct object method calls be interpolated inside double quoted strings and similar constructs.

    print "Today's weather will be $weather->temp degrees and sunny.";

should parse out as:

    print 'Today\'s weather will be '.$weather->temp().' degrees and sunny.';

thus returning DWIMness to methods and strings and removing one barrier to the acceptance of accessor methods over hash lookups for objects.

Methods will be run in scalar context. A method which returns a single scalar is treated normally. If a list is returned, it should be treated the same as array interpolation. The list separator will be applied. In effect, the deparsing will really work out as follows:

    print 'Today\'s weather will be '.join($", $weather->temp()).
          ' degrees and sunny.';

However if temp() calls wantarray(), the result will be FALSE (scalar).

(For the remainder of the RFC, the join() will be assumed when discussing deparsing for brevity.)

Should it be decided that a formal distinction be made between accessor methods and other types (RFC 95), method interpolation should interpolate B<any> method.

=head2 Argument passing

Interpolation should also handle passing arguments to methods in a string:

    print "Today's weather will be $weather->temp("F") degrees and sunny.";

This should deparse to:

    print 'Today\'s weather will be '.$weather->temp("F").
          ' degrees and sunny.';

The arguments to the method are considered as normal expressions, thus:

    print "There is $obj->foo(this => $yar, that => 2 + 2) in my head.";

deparses as:

    print 'There is '.$obj->foo(this => $yar, that => 2 + 2).
          ' in my head.';

=head1 CAVEATS

Indirect object syntax, being already ambiguous, cannot easily be distinguished in a string from normal text and should not be interpolated. This is ok, since accessors are rarely called with indirect object syntax.

Are there any contexts besides double quotes ("", qq{}, <<"EOF") where this need be applied? What about inside regexes? And if so, left and/or right hand side?

Normally, whitespace is allowed between tokens of a method call.

    $obj -> bar ("this");

and

    $obj->bar("this");

are equivalent. Whitespace between the object, '->', method name and opening paren should be disallowed when interpolated. This will avoid many ambiguous cases.

Should the method not exist, Perl will throw an exception/die as usual.

C<< "$var->{this}[2]{is}->{complex}->method" >> should also be interpolated. Also C<< "$obj->method->{key}" >> for the case where a method returns a reference.

=head1 IMPLEMENTATION

The behavior of the parser to check for embedded variables would have to be altered, namely the case where an embedded variable is being dereferenced. A case would be added to allow method calls as well as hash and array index dereferences. Otherwise, parsing should remain as normal.

=head1 REFERENCES

RFC 95 - Object Classes (proposes automatic accessor methods)

RFC 163 - Automatic
Re: RFC - Interpolation of method calls
This topic is actually covered, albeit far less in-depth and lumped with an unrelated change, by Nathan Wiger's RFC 103, just in case you weren't aware.

On Thu, Sep 14, 2000 at 03:57:41AM -0400, Michael G Schwern wrote:

> Methods will be run in scalar context. A method which returns a single scalar is treated normally. If a list is returned, it should be treated same as array interpolation. The list separator will be applied. In effect, the deparsing will really work out as follows:
>
>     print 'Today\'s weather will be '.join($", $weather->temp()).
>           ' degrees and sunny.';
>
> However if temp() calls wantarray(), the result will be FALSE (scalar).

Ok, this is very confusing. You're saying the method is called in scalar context, but then proceed to call it in list context, meanwhile tricking it into thinking it's scalar context. Interpolated method calls should behave the same as every other method call, without extra magic.

The only decision, then, is to decide which context to use; if it deparses to concatenation then it seems logical to use scalar context. This also makes sense in that you can force list context with @{[ $weather->temp ]} if you really wanted it.

> =head2 Argument passing
>
> Interpolation should also handle passing arguments to methods in a string:
>
>     print "Today's weather will be $weather->temp("F") degrees and sunny.";
>
> This should deparse to:
>
>     print 'Today\'s weather will be '.$weather->temp("F").
>           ' degrees and sunny.';
>
> The arguments to the method are considered as normal expressions, thus:
>
>     print "There is $obj->foo(this => $yar, that => 2 + 2) in my head.";
>
> deparses as:
>
>     print 'There is '.$obj->foo(this => $yar, that => 2 + 2).
>           ' in my head.';

Now perl is parsing full statements within strings. I -really- don't like this, not only because perl is now reaching into strings to parse yet more, but also because it's already beginning to look very difficult for me, personally, to parse. Not only that, it gives me the heebie-jeebies (which of course means you should all immediately agree with me :).

I'd say keep it simple, allow only simple, non-parenthetical method calls.

    "foo $foo->bar bar"    -->  "foo " . $foo->bar . " bar"
    "foo $foo->bar() bar"  -->  "foo " . $foo->bar . "() bar"

Granted, it may confuse the newbies, but I think it makes things much easier on everyone.

> Normally, whitespace is allowed between tokens of a method call.
>
>     $obj -> bar ("this");
>
> and
>
>     $obj->bar("this");
>
> are equivalent. Whitespace between the object, '->', method name and opening paren should be disallowed when interpolated. This will avoid many ambiguous cases.

This is a good idea, and has precedent (as I just discovered answering someone's question about it in #Perl as I was writing this email, weird..):

    "$hash -> {'foo'}"  -->  "HASH(0x8bbf0b8) -> {k1}"

Michael
-- Administrator www.shoebox.net Programmer, System Administrator www.gallanttech.com --
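The context question raised in this reply can be checked in current Perl 5 with a short script. This is a sketch for illustration; the function name context is made up here. It shows that concatenation calls its operand in scalar context, while the @{[ ... ]} idiom calls it in list context.

```perl
use strict;
use warnings;

# Report the context this function was called in.
sub context { return wantarray ? "list" : "scalar" }

# The concatenation operator imposes scalar context on its operands.
my $concatenated = "context: " . context();

# The anonymous array constructor inside @{[ ... ]} imposes list context.
my $interpolated = "context: @{[ context() ]}";

print "$concatenated\n";   # prints "context: scalar"
print "$interpolated\n";   # prints "context: list"
```

So a deparse to plain concatenation and a deparse through join() would genuinely differ for any method that consults wantarray().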
Re: Draft RFC: new pragma: C<use namespace>
I would suggest that anyone wanting to contribute to this discussion should first read the thread about the addition of this pragma to perl5 in the perl5-porters archives http://www.xray.mpe.mpg.de/cgi-bin/w3glimpse/perl5-porters?query=use+namespace+pragma&errors=0&case=on&maxfiles=100&maxlines=30 Graham.
Re: Draft RFC: new pragma: C<use namespace>
Graham Barr [EMAIL PROTECTED] writes:

> I would suggest that anyone wanting to contribute to this discussion should first read the thread about the addition of this pragma to perl5 in the perl5-porters archives http://www.xray.mpe.mpg.de/cgi-bin/w3glimpse/perl5-porters?query=use+namespace+pragma&errors=0&case=on&maxfiles=100&maxlines=30

Ah. Sorry I didn't post that url with the RFC.

-- Piers
Re: RFC 218 (v1) C<my Dog $spot> is just an assertion
Perl6 RFC Librarian [EMAIL PROTECTED] writes:

> This and other RFCs are available on the web at http://dev.perl.org/rfc/
>
> =head1 TITLE
>
> C<my Dog $spot> is just an assertion
>
> =head1 VERSION
>
>   Maintainer: Piers Cawley [EMAIL PROTECTED]
>   Date: 13th September 2000
>   Mailing List: [EMAIL PROTECTED]
>   Number: 218
>   Version: 1
>   Status: Developing
>
> =head1 ABSTRACT
>
> The behaviour of the my Dog $spot syntax should simply be an assertion of the invariant:
>
>     (!defined($spot) || (ref($spot) && $spot->isa('Dog')))
>
> =head1 DESCRIPTION
>
> The syntax
>
>     my Dog $spot = Dog->new();
>
> currently carries little weight with Perl, often failing to do what one expects:
>
>     $ perl -wle 'my Dog::$spot; print "ok"'
>     No such class Dog at -e line 1, near "my Dog"
>     Execution of -e aborted due to compilation errors.
>     $ perl -wle 'sub Dog::new; my Dog $spot; print "ok"'
>     ok
>     $ perl -wle 'sub Dog::new; my Dog $spot = 1'
>     ok
>
> The first example is obvious, as is the second. The third one is I<weird>.

Actually, it's *very* weird given that there's no code to print 'ok'. Which is bad.

    $ perl -wle 'sub Dog::new; my Dog $spot = 1; print "ok"'

That's better.

-- Piers
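The invariant from the abstract can be written as an ordinary runtime check in Perl 5. The sketch below is only an illustration of the assertion, not the RFC's implementation: the helper name satisfies and the Puppy and Cat classes are hypothetical, and it uses Scalar::Util's blessed() in place of ref() so that an unblessed reference fails the check instead of dying on the isa call.

```perl
use strict;
use warnings;
use Scalar::Util qw(blessed);

# Made-up classes for the illustration.
package Dog;   sub new { return bless {}, shift }
package Puppy; our @ISA = ('Dog');          # a subclass of Dog
package Cat;   sub new { return bless {}, shift }

package main;

# Hypothetical helper: true if $value may be stored in a variable
# declared as "my $class $var", i.e. it is undefined or an object
# of $class (or of a subclass).
sub satisfies {
    my ($value, $class) = @_;
    return 1 if !defined $value;
    return (blessed($value) && $value->isa($class)) ? 1 : 0;
}

print "undef:  ", satisfies(undef,      'Dog'), "\n";
print "Dog:    ", satisfies(Dog->new,   'Dog'), "\n";
print "Puppy:  ", satisfies(Puppy->new, 'Dog'), "\n";
print "Cat:    ", satisfies(Cat->new,   'Dog'), "\n";
print "number: ", satisfies(1,          'Dog'), "\n";
```

Under this check the third transcript above (assigning 1 to a Dog variable) would be rejected, which is the behaviour the RFC asks for.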
Re: RFC 218 (v1) C<my Dog $spot> is just an assertion
Nathan Torkington [EMAIL PROTECTED] writes:

> Perl6 RFC Librarian writes:
> > I therefore propose that C<my Dog $spot> comes to mean that C<$spot>
> > is restricted to being either undefined or a reference to a C<Dog>
> > object (or any subclasses of Dog). Simply having this implicit
> > assertion can be useful to the programmer, but I would argue that
> > its main advantage is that the compiler knows the object's
> > interface at compile time and can potentially use this fact to
> > speed up method dispatch.
>
> Yes! I mentioned the hypothetical
>
>     use strict 'types';
>
> which would require all variables assigned to/from an object, and all
> variables upon which method calls are made, to be typed like this.
> Then the compiler can:
>
>   (a) optimize
>   (b) check at compile-time
>
> Your sample implementation is done through a Tie class, which is only
> runtime. The big win is that if you check types like this (and it's a
> window into C's type-checking hell) then Perl knows types at
> compile-time. If there's a formal interface specification for classes,
> the compiler can use this to check whether a method call is valid or
> not. You can build calls to the correct subroutine into the optree
> instead of delaying that lookup until runtime.
>
> Every compile-time check comes at the cost of a run-time freedom,
> though. All bets would be off if you modified @ISA, reblessed, or
> passed objects through non-strict-types-compliant code.
>
> Polymorphic types also become a problem: how to say that it's okay
> for a variable to hold a Dog *or* a Cat, because we know that both of
> them have a "pet()" method?
>
> I'd love to see these suggestions incorporated into your RFC. I was
> going to do it myself, but I have a lot of other things to RFC.

TBH, I'm not sure I want to go too far down that road in this RFC. And tbh they seem more like internals issues to me. The runtime behaviour this change grants is good enough for me and I don't want to see the proposal bogged down in flamage about strict types.

Of course, given this RFC it's possible to add other RFCs that deal with specific dependent language proposals and optimizations.

> > =head1 MIGRATION
> >
> > Migration issues should be minor, the only problem arising when
> > people have assigned things that aren't objects of the appropriate
> > type to typed variables, but they deserve to lose anyway.
>
> Not if you made the checks and optimizations enabled by a pragma. Old
> programs wouldn't have it, so they could continue to do their stupid
> things and be fine.

Again, I'm not sure that I'd want to encourage such bogosity. After all, if perl5 could introduce "@array" interpolation which broke stuff I don't see why this one can't go through as the default.

--
Piers
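The run-time-only, Tie-based approach mentioned in this exchange can be sketched roughly as follows. This is not the RFC's sample implementation (which is not shown in the thread); the C<TypedScalar> and C<Puppy> names are hypothetical, and the sketch only enforces the invariant from the RFC abstract on each assignment.

```perl
use strict;
use warnings;

package Dog;
sub new { return bless {}, shift }

package Puppy;               # hypothetical subclass, to show isa() acceptance
our @ISA = ('Dog');

package TypedScalar;         # hypothetical tie class, not part of any RFC
use Scalar::Util qw(blessed);
use Carp qw(croak);

sub TIESCALAR {
    my ($class, $type) = @_;
    return bless { type => $type, value => undef }, $class;
}

sub FETCH { return $_[0]{value} }

# Enforce: !defined($value) || (ref($value) && $value->isa($type))
sub STORE {
    my ($self, $value) = @_;
    croak "Value is not a $self->{type} object"
        unless !defined($value)
            || (blessed($value) && $value->isa($self->{type}));
    $self->{value} = $value;
    return;
}

package main;

tie my $spot, 'TypedScalar', 'Dog';
$spot = Dog->new;            # accepted: a Dog
$spot = Puppy->new;          # accepted: a subclass of Dog
$spot = undef;               # accepted: undefined
my $ok = eval { $spot = 1; 1 };
print $ok ? "assignment allowed\n" : "rejected: $@";
```

As the thread notes, a check like this happens only at run time, so it gives the compiler nothing to optimize against.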
Re: RFC 218 (v1) C<my Dog $spot> is just an assertion
Michael G Schwern [EMAIL PROTECTED] writes:

> On Wed, Sep 13, 2000 at 08:43:43PM -, Perl6 RFC Librarian wrote:
> > The behaviour of the C<my Dog $spot> syntax should simply be an
> > assertion of the invariant:
> >
> >   (!defined($spot) || (ref($spot) && $spot->isa('Dog')))
>
> What about the current behavior of typed pseudohashes?
>
>     package Dog;
>     use fields qw(this night up);
>
>     my Dog $ph = [];
>     $ph->{this} = "that";

That works? I thought you had to do:

    my Dog $self = fields::new('Dog');

(In which case, as far as I can see, the proposal makes no difference.)

--
Piers
Re: RFC - Interpolation of method calls
> This topic is actually covered, albeit far less in-depth and lumped
> with an unrelated change, by Nathan Wiger's RFC 103, just in case you
> weren't aware.

Yeah, I've got to split those up. I was trying to cut down on the flood of RFCs that poor Larry has to sift through :-(, but they are both complex issues. Schwern, if you want to take this one over, it's all yours.

There has already been some discussion on this here:

http://www.mail-archive.com/perl6-language@perl.org/msg02169.html

I would encourage people to read the thread. In particular, to be truly consistent we should interpolate class as well as instance methods, i.e.:

    "Hello, Class->name";

But there are a lot of problems with this and I'm not sure it's a good idea.

> >   print 'Today\'s weather will be '.join($", $weather->temp()).
> >         ' degrees and sunny.';
> >
> > However if temp() calls wantarray(), the result will be FALSE
> > (scalar).
>
> Ok, this is very confusing.

I think what he's trying to get at is that these should all work the same:

    print "Here's some @stuff";
    print "Here's some $h->{stuff}";
    print "Here's some $r->stuff";

> >   print 'There is '.$obj->foo(this => $yar, that => 2 + 2).
> >         ' in my head.';
>
> Now perl is parsing full statements within strings.

This already happens with:

    print "Here's some $h->{$stuff}";
    print "Here's some $a->[$stuff + $MIN]";

So it's not that confusing, and quite consistent.

> I'd say keep it simple, allow only simple, non-parenthetical method
> calls.

No, this makes it impossible to do this:

    print "Your name is $cgi->param('name')";

And it's also inconsistent with how hashrefs and arrayrefs work already.

-Nate
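The consistency argument above can be checked against Perl 5 directly. The following is a small sketch (the C<Weather> class and all variable names are made up for illustration) showing that subscript expressions are already evaluated inside double-quoted strings while a method call is not, together with two of the workarounds quoted from the RFC.

```perl
use strict;
use warnings;

package Weather;             # hypothetical class for illustration
sub new  { my ($class, %args) = @_; return bless {%args}, $class }
sub temp { return $_[0]{temp} }

package main;

my $weather = Weather->new(temp => 72);
my $href    = { stuff => 'hash value' };
my $aref    = [10, 20, 30];
my ($key, $i, $MIN) = ('stuff', 0, 1);

# Expressions inside subscripts are already evaluated within strings.
my $from_hash  = "Here's some $href->{$key}";
my $from_array = "Here's some $aref->[$i + $MIN]";

# A method call is not: the object stringifies and "->temp" stays literal.
my $literal = "Temp: $weather->temp";

# Two workarounds that do call the method.
my $via_list   = "Temp: @{[ $weather->temp ]}";
my $via_concat = "Temp: " . $weather->temp;

print "$_\n" for $from_hash, $from_array, $literal, $via_list, $via_concat;
```

The third line of output contains something like C<Weather=HASH(0x...)-E<gt>temp>, which is the behaviour RFC 222 sets out to change.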
Re: Draft RFC: new pragma: C<use namespace>
Nathan Wiger wrote:

> >     use namespace 'Big::Long::Prefix';
> >     my ::Class $object = ::Class->new;
>
> Assuming repairing :: precedence is a reality I don't think this
> proposal buys us anything.

backtracking...That being said, I'm not necessarily against it. I'm just against bloat. I hadn't paid too much attention to the actual patch the first time reading it through, but it looks like a simple thing to add. And the one nice thing it does add is that these:

    $::stuff
    $main::stuff

don't always mean the same thing, which at least gives some real reason for having the first version around.

If this change is made, though, then I think it should work the same way in packages:

    package Foo;
    $::bar = "stuff";    # $Foo::bar = "stuff"

For consistency. That way the rule is:

"If no prefix is found to ::, then the current package namespace is used. You can change the current namespace either via 'package' (if you want to encapsulate a new package), or 'use namespace' (if you simply want a shortcut way of referring to variables)".

This might have been discussed, but I didn't see it in the threads (I could have missed it, though). For anyone looking for a direct link to the start of the patch thread:

http://www.xray.mpe.mpg.de/mailing-lists/perl5-porters/2000-07/msg00490.html

-Nate
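For reference, this is how the empty prefix behaves in Perl 5 today, which is the baseline the proposal above would change. A minimal sketch, with made-up variable names:

```perl
use strict;
use warnings;

package Foo;

our $bar = 'Foo value';      # this is $Foo::bar
$::bar   = 'main value';     # in Perl 5 an empty prefix always means main

package main;

print "\$main::bar = $main::bar\n";
print "\$Foo::bar  = $Foo::bar\n";
```

Under the rule suggested in the message, the second assignment would instead have targeted C<$Foo::bar>.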
Re: RFC - Interpolation of method calls
Michael G Schwern [EMAIL PROTECTED] writes:

>   print "Today's weather will be $weather->temp degrees and sunny.";
>
> This does not DWIM. Instead of interpolating C<$weather->temp> as a
> method call, it comes out as C<$weather.'->temp'> and is usually
> followed immediately by the question "What does
> 'Weather=HASH(0x80d4174)->temp' mean??" Most programmers learning OO
> Perl expect this to work and are surprised to find that it does not.

I think $- is unlikely enough in a string that this is worth considering.

> Work arounds abound:

    print $weather->report;

being the one I like best - avoids the unmerited assumption it will be sunny ;-)

--
Nick Ing-Simmons [EMAIL PROTECTED]
Via, but not speaking for: Texas Instruments Ltd.
Re: RFC 218 (v1) C<my Dog $spot> is just an assertion
Piers Cawley writes:

> TBH, I'm not sure I want to go too far down that road in this RFC.
> And tbh they seem more like internals issues to me. The runtime
> behaviour this change grants is good enough for me and I don't want
> to see the proposal bogged down in flamage about strict types.
>
> Of course, given this RFC it's possible to add other RFCs that deal
> with specific dependent language proposals and optimizations.

Ok, I'll work on the RFC for the type-checking.

Nat
Re: RFC - Interpolation of method calls
On Thu, Sep 14, 2000 at 07:49:32AM -0700, Nathan Wiger wrote:

> > >   print 'Today\'s weather will be '.join($", $weather->temp()).
> > >         ' degrees and sunny.';
> > >
> > > However if temp() calls wantarray(), the result will be FALSE
> > > (scalar).
>
> I think what he's trying to get at is that these should all work the
> same:
>
>     print "Here's some @stuff";
>     print "Here's some $h->{stuff}";
>     print "Here's some $r->stuff";

That may be, but I don't think calling it in one context, accepting another, and lying about it all is the right way to go. I think I'll let the author try to explain what he had intended.

> > >   print 'There is '.$obj->foo(this => $yar, that => 2 + 2).
> > >         ' in my head.';
> >
> > Now perl is parsing full statements within strings.
>
> This already happens with:
>
>     print "Here's some $h->{$stuff}";
>     print "Here's some $a->[$stuff + $MIN]";
>
> So it's not that confusing, and quite consistent.

Well, I still find it confusing (most probably because of the length of the examples), but it is indeed consistent. Stupid mistake on my part.

Michael
--
Administrator                      www.shoebox.net
Programmer, System Administrator   www.gallanttech.com
--
RFC 49 (v3) Objects should have builtin stringifying STRING method
This and other RFCs are available on the web at
  http://dev.perl.org/rfc/

=head1 TITLE

Objects should have builtin stringifying STRING method

=head1 VERSION

  Maintainer: Nathan Wiger [EMAIL PROTECTED]
  Date: 06 Aug 2000
  Last Modified: 14 Sep 2000
  Mailing List: [EMAIL PROTECTED]
  Number: 49
  Version: 3
  Status: Frozen

=head1 ABSTRACT

Currently, $ single-whatzitz types in Perl can hold several different things. Among the things these commonly hold are objects, such as:

   $q = new CGI;
   $r = Apache->request;

Unfortunately, there is no easy way to tell these are actually objects without lots of annoying ref checks throughout your code. So if you say:

   print "$q";

This prints out something like this:

   CGI=HASH(0x80ba4e8)

Which isn't very useful. This RFC attempts to fix this by providing a builtin special method C<STRING> which is automatically called when an object is "stringified".

While this can be accomplished through the use of 'use overload', a more automatic, object-specific method has certain advantages. For more details on this, please see RFC 159.

=head1 NOTES ON FREEZE

This RFC goes into details on the uses of C<STRING>, but what you really want to read is L<RFC 159: True Polymorphic Objects>, which extends these concepts to other Perl operators and contexts.

=head1 DESCRIPTION

Currently, there is no way to easily distinguish between these two syntaxes:

   $scalar = date;    # scalar ctime date, same as localtime()
   $object = date;    # date object with accessor functions

As such, there is no easy way to have the C<date()> function return both - it must decide what to return within the general scalar context. Damian's excellent RFC 21 on C<want()> addresses several specific cases, and several people have suggested alternate syntaxes, such as:

   my Date $object = date;    # return object of class 'Date'
   my tm $object = date;      # return object of struct 'tm'

However, this doesn't solve the problem, since printing out either of these in a scalar context still results in "garbage".

I suggest that objects provide a default method called C<STRING> that determines what they produce in a string context. When stringified, an object would automatically call its C<STRING> function and return the correct value. For example, RFC 48 describes a new C<date()> interface. In a scalar context, it could produce a date object always:

   $date = date;

However, when you went to do anything with it in a string context, it would call the appropriate method:

   print "$date";    # calls $date->STRING, which in this case would
                     # print out a ctime formatted date string

The call to C<$object->STRING> would be a decision made by Perl, similar to the way that C<tie()> works. The object simply has to provide the method; Perl does the rest.

This gives us several other really neat side effects. First, we can now return a list of objects and have them act the same as a "regular old list":

   (@objects) = Class->new;

Since, in a stringifying context, each of these objects would call their C<STRING> methods:

   print "@objects";    # calls $objects[0]->STRING, $objects[1]->STRING,
                        # and so on for the whole array, thus making it
                        # look like a plain old list

As such, we no longer have to distinguish between objects and "true" scalars. Objects are automatically converted when appropriate.

=head1 IMPLEMENTATION

All core objects should be modified to include a C<STRING> function. This function may just be a typeglob pointing to another function, or it may be an actual separate function.

Hooks will have to be put in Perl's string context so that if something is an object, then that object's C<STRING> method is called automatically if it exists.

=head1 MIGRATION

None. This introduces new functionality.

=head1 REFERENCES

RFC 159: True Polymorphic Objects

RFC 21: Replace wantarray with a generic want function

RFC 48: Replace localtime() and gmtime() with date() and utcdate()

Lots of people on perl6-language for great input, thanks!
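The abstract notes that the same effect can be had in Perl 5 with C<use overload>. A minimal sketch of that existing mechanism follows; the C<SimpleDate> class is hypothetical (it is not the C<date()> interface of RFC 48), and naming the handler C<STRING> is only a nod to the RFC, since Perl 5 attaches no meaning to that name.

```perl
use strict;
use warnings;

package SimpleDate;          # hypothetical class for illustration

use overload
    '""'     => \&STRING,    # explicit registration; the RFC would make this implicit
    fallback => 1;

sub new {
    my ($class, $epoch) = @_;
    return bless { epoch => $epoch }, $class;
}

# Called by Perl whenever the object is used in string context.
sub STRING {
    my ($self) = @_;
    return scalar gmtime($self->{epoch});
}

package main;

my $date = SimpleDate->new(0);
print "$date\n";             # a ctime-style string rather than SimpleDate=HASH(...)

# A list of objects interpolates like a plain list of strings.
my @objects = map { SimpleDate->new($_ * 86_400) } 0 .. 2;
print "@objects\n";
```

The difference the RFC argues for is that the object would only need to define the method, with no explicit C<use overload> registration.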
Re: RFC 218 (v1) C<my Dog $spot> is just an assertion
On Thu, Sep 14, 2000 at 02:19:38PM +0100, Piers Cawley wrote:

> Michael G Schwern [EMAIL PROTECTED] writes:
> >     package Dog;
> >     use fields qw(this night up);
> >
> >     my Dog $ph = [];
> >     $ph->{this} = "that";
>
> That works? I thought you had to do:
>
>     my Dog $self = fields::new('Dog');

Nope. fields::new() basically just does C<bless [\%{"$class\::FIELDS"}], $class>, but the current pseudohash implementation doesn't care if something is an object or not. It just cares about either A) its type or B) what's in $ph->[0]. I don't know if this is a good thing or a bad thing, but there's nothing on the table to change it (yet).

    my Dog $ph = [];
    $ph->{this} = "that";

deparses at compile-time to:

    my Dog $ph = [];
    $ph->[$Dog::FIELDS{this}] = "that";    # actually the %FIELDS lookup is also
                                           # done at compile time, but I left
                                           # it in for illustration.

--
Michael G Schwern      http://www.pobox.com/~schwern/      [EMAIL PROTECTED]
Just Another Stupid Consultant                      Perl6 Kwalitee Ashuranse
Sometimes these hairstyles are exaggerated beyond the laws of physics
          - Unknown narrator speaking about Anime
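The message above describes the pseudohash-based C<fields> of the time. Pseudohashes were removed in Perl 5.10, where C<fields::new()> returns a restricted hash instead, so the C<my Dog $ph = []> form no longer applies. A small sketch of the C<fields::new()> form on a current Perl, assuming 5.10 or later (the C<new> constructor is added here for illustration):

```perl
use strict;
use warnings;

package Dog;
use fields qw(this night up);

sub new {
    my Dog $self = fields::new(shift);   # blessed hash restricted to the declared fields
    $self->{this} = 'that';              # key is checked against %Dog::FIELDS at compile time
    return $self;
}

package main;

my Dog $spot = Dog->new;
print "this = $spot->{this}\n";

# Through an untyped variable the check is deferred to run time,
# where the restricted hash rejects undeclared keys.
my $untyped = $spot;
my $ok = eval { $untyped->{bogus} = 1; 1 };
print $ok ? "no check happened\n" : "rejected: $@";
```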
RFC 222 (v1) Interpolation of method calls
This and other RFCs are available on the web at
  http://dev.perl.org/rfc/

=head1 TITLE

Interpolation of method calls

=head1 VERSION

  Maintainer: Michael G Schwern [EMAIL PROTECTED]
  Date: 14 Sep 2000
  Mailing List: [EMAIL PROTECTED]
  Number: 222
  Version: 1
  Status: Developing

=head1 ABSTRACT

Method calls should interpolate in double-quoted strings, and similar locations.

    print "Today's weather will be $weather->temp degrees and sunny.";

Would deparse to:

    print 'Today\'s weather will be '.$weather->temp().' degrees and sunny.';

=head1 DESCRIPTION

=head2 The Current Problem With OO Interpolation

Object-oriented programming encourages data-hiding, and one of the most basic tools for this is the accessor method. For reasons which should be obvious, C<$obj->foo()> is usually better than C<$obj->{foo}>. However, there are several barriers to using an accessor method as simply as one does a hash lookup. Other RFCs deal with most of the current issues, but a basic one still remains.

    print "Today's weather will be $weather->temp degrees and sunny.";

This does not DWIM. Instead of interpolating C<$weather->temp> as a method call, it comes out as C<$weather.'->temp'> and is usually followed immediately by the question "What does 'Weather=HASH(0x80d4174)->temp' mean??" Most programmers learning OO Perl expect this to work and are surprised to find that it does not.

Work arounds abound:

    # If I wanted printf(), I'd have written it in C.
    printf "Today's weather will be %d degrees and sunny.", $weather->temp;

    my $temp = $weather->temp;
    print "Today's weather will be $temp degrees and sunny.";

    print "Today's weather will be @{[$weather->temp]} degrees and sunny.";

    print "Today's weather will be ".$weather->temp." degrees and sunny.";

None are as simple and as obvious as:

    print "Today's weather will be $weather->{temp} degrees and sunny.";

and because of this users groan at having to use accessor methods and are often tempted to violate encapsulation for ease of use.

=head2 Proposed Solution - Interpolate Methods

Therefore, it is proposed that direct object method calls be interpolated inside double quoted strings and similar constructs.

    print "Today's weather will be $weather->temp degrees and sunny.";

should parse out as:

    print 'Today\'s weather will be '.$weather->temp().' degrees and sunny.';

thus returning DWIMness to methods and strings and removing one barrier to accessor method's acceptance over hash lookups for objects.

Methods will be run in scalar context. A method which returns a single scalar is treated normally. If a list is returned, it should be treated the same as array interpolation. The list separator will be applied. In effect, the deparsing will really work out as follows:

    print 'Today\'s weather will be '.join($", $weather->temp()).
          ' degrees and sunny.';

However if temp() calls wantarray(), the result will be FALSE (scalar).

(For the remainder of the RFC, the join() will be assumed when discussing deparsing for brevity.)

Should it be decided that a formal distinction be made between accessor methods and other types (RFC 95), method interpolation should interpolate B<any> method.

=head2 Argument passing

Interpolation should also handle passing arguments to methods in a string:

    print "Today's weather will be $weather->temp("F") degrees and sunny.";

This should deparse to:

    print 'Today\'s weather will be '.$weather->temp("F").
          ' degrees and sunny.';

The arguments to the method are considered as normal expressions, thus:

    print "There is $obj->foo(this => $yar, that => 2 + 2) in my head.";

deparses as:

    print 'There is '.$obj->foo(this => $yar, that => 2 + 2).
          ' in my head.';

=head1 CAVEATS

Indirect object syntax, being already ambiguous, cannot easily be distinguished in a string from normal text and should not be interpolated. This is ok, since accessors are rarely called with indirect object syntax.

Are there any contexts besides double quotes ("", qq{}, <<"EOF") where this need be applied? What about inside regexes? And if so, left and/or right hand side?

Normally, whitespace is allowed between tokens of a method call.

    $obj -> bar ("this");

and

    $obj->bar("this");

are equivalent. Whitespace between the object, '->', method name and opening paren should be disallowed when interpolated. This will avoid many ambiguous cases.

Should the method not exist, Perl will throw an exception/die as usual.

C<"$var->{this}[2]{is}->{complex}->method"> should also be interpolated. Also C<"$obj->method->{key}"> for the case where a method returns a reference.

=head1 IMPLEMENTATION

The behavior of the parser to check for embedded variables would have to be altered, namely the case where an embedded variable is being dereferenced. A case would be added to allow method calls as well as hash and array index dereferences. Otherwise, parsing should remain as normal.

=head1
Re: RFC 222 (v1) Interpolation of method calls
First of all, I think this is a great idea.

On 14 Sep 2000, Perl6 RFC Librarian wrote:

> Are there any contexts besides double quotes ("", qq{}, <<"EOF")
> where this need be applied? What about inside regexes? And if so,
> left and/or right hand side?

Regexes are enough like double quoted strings that I think _not_ interpolating here would be confusing. The same goes for backticks (`)/qx, which is double-quote interpolated as well.

-dave

/*==
www.urth.org
We await the New Sun
==*/
RFC 189 (v2) Objects : Hierarchical calls to initializers and destructors
This and other RFCs are available on the web at
  http://dev.perl.org/rfc/

=head1 TITLE

Objects : Hierarchical calls to initializers and destructors

=head1 VERSION

  Maintainer: Damian Conway [EMAIL PROTECTED]
  Date: 1 September 2000
  Mailing List: [EMAIL PROTECTED]
  Number: 189
  Version: 2
  Status: Developing

=head1 ABSTRACT

This RFC proposes a new special method called C<BUILD> that is invoked automagically whenever an object is created. Furthermore, it proposes that both C<BUILD> and C<DESTROY> methods should be invoked hierarchically in all base classes.

=head1 DESCRIPTION

One of the major limitations of object-oriented Perl is that, unlike most other OO languages, it does not automatically invoke the initializers and destructors of base classes, when initializing or destructing an object of a derived class.

This leads to tediously complex code in constructors and destructors in order to manually achieve the same effect. More often, it leads to bugs.

It is proposed that Perl 6 introduce an automatic object initialization mechanism, analogous to the automatic object clean-up mechanism provided by C<DESTROY> methods. It is further proposed that both the initialization and destruction mechanisms automatically call their corresponding base class versions to ensure that complete initialization and destruction of derived objects occurs correctly.

=head2 The C<BUILD> method

It is proposed that, if a class has a method named C<BUILD>, that method will be invoked automatically during any call to C<bless>. It is further proposed that C<bless> be extended to take an optional argument list after its second argument, and that this list would be passed to any C<BUILD> method invoked by the C<bless>.

The typical constructor would then be reduced to:

    package MyClass;

    sub new { bless {}, @_ }

with initialization handled in a separate C<BUILD> routine:

    sub BUILD {
        my ($self, @ctor_data) = @_;
        # initialization of object referred to by $self occurs here
    }

=head2 Hierarchical C<BUILD> calls

It is proposed that when an object is blessed, I<all> of the C<BUILD> methods in any of its base classes are also called, and passed the argument list appended to the invocation of C<bless>. C<BUILD> methods would be called in depth-first, left-most order (i.e. ancestral C<BUILD> methods would be called before derived ones). Any given C<BUILD> method would only be called once for the same object, no matter how many separate paths its class might be inherited through.

For example, given the following class hierarchy:

    package Base1;
    sub new { bless {}, @_ }
    sub BUILD { print "Base1::BUILD : @_\n" }

    package Base2;
    sub BUILD { print "Base2::BUILD : @_\n" }

    package Base3;
    sub BUILD { print "Base3::BUILD : @_\n" }

    package Derived1;
    use base qw(Base1 Base2);
    sub BUILD { print "Derived1::BUILD : @_\n" }

    package Derived2;
    use base qw(Base2 Base3);
    sub BUILD { print "Derived2::BUILD : @_\n" }

    package Rederived1;
    use base qw(Derived1 Derived2);
    sub BUILD { print "Rederived1::BUILD : @_\n" }

then the call to:

    $obj = Rederived1->new(1..3)

would print:

    Base1::BUILD : 1 2 3
    Base2::BUILD : 1 2 3
    Derived1::BUILD : 1 2 3
    Base3::BUILD : 1 2 3
    Derived2::BUILD : 1 2 3
    Rederived1::BUILD : 1 2 3

Note in particular that C<Base2::BUILD> is only called once (as early as possible), even though class Rederived1 inherits it through two distinct paths.

=head2 Hierarchical C<DESTROY> calls

It is further proposed that when an object's destructor is invoked, all inherited destructors would also be invoked, in depth-I<last>, right-most order. Again, each C<DESTROY> for an object would be called exactly once, regardless of how many different paths it is inherited through.

For example, given the following class hierarchy (with the same topology as the example for C<BUILD> above):

    package Base1;
    sub new { bless {}, @_ }
    sub DESTROY { print "Base1::DESTROY\n" }

    package Base2;
    sub DESTROY { print "Base2::DESTROY\n" }

    package Base3;
    sub DESTROY { print "Base3::DESTROY\n" }

    package Derived1;
    use base qw(Base1 Base2);
    sub DESTROY { print "Derived1::DESTROY\n" }

    package Derived2;
    use base qw(Base2 Base3);
    sub DESTROY { print "Derived2::DESTROY\n" }

    package Rederived1;
    use base qw(Derived1 Derived2);
    sub DESTROY { print "Rederived1::DESTROY\n" }

then the destruction of an object:

    $obj = "something else";

would print:

    Rederived1::DESTROY
    Derived2::DESTROY
    Base3::DESTROY
    Derived1::DESTROY
    Base2::DESTROY
    Base1::DESTROY

Note that C<Base2::DESTROY> is only called once (as late as
RFC 224 (v1) Objects : Rationalizing C<ref>, C<attribute::reftype>, and C<builtin::blessed>
This and other RFCs are available on the web at
  http://dev.perl.org/rfc/

=head1 TITLE

Objects : Rationalizing C<ref>, C<attribute::reftype>, and C<builtin::blessed>

=head1 VERSION

  Maintainer: Damian Conway [EMAIL PROTECTED]
  Date: 14 September 2000
  Mailing List: [EMAIL PROTECTED]
  Number: 224
  Version: 1
  Status: Developing

=head1 ABSTRACT

This RFC proposes that rather than three separate mechanisms (in three separate namespaces) to determine object typing information, Perl 6 simply extend the C<ref> function to return all the necessary information in a list context.

=head1 DESCRIPTION

In Perl 5, the class into which an object is blessed is returned by calling C<ref> on a reference to that object. To determine the underlying implementation type of the object, C<attribute::reftype> is used. To determine whether or not a reference refers to a blessed object, C<builtin::blessed> is used.

It is proposed that the behaviour of C<ref> be altered in Perl 6 so that in a list context it returns up to two values: the underlying implementation type of the object (always returned), and the class into which the object is blessed (only if the object I<is> blessed).

Thus:

    if (builtin::blessed $ref) {
        $type  = attribute::reftype $ref;
        $class = ref $ref;
    }
    else {
        $type  = ref $ref;
        $class = "no class";
    }
    print "Object of type $type, blessed into $class\n";

Would become:

    ($type, $class) = ref($ref);
    $class ||= "no class";
    print "Object of type $type, blessed into $class\n";

=head1 MIGRATION ISSUES

All existing calls to C<ref> in a list context would have to be translated to C<scalar ref>.

=head1 IMPLEMENTATION

Trivial.

=head1 REFERENCES

None.
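The proposed list-context behaviour can be approximated in Perl 5 with a small wrapper. The sketch below assumes the C<reftype> and C<blessed> functions from the core Scalar::Util module rather than the C<attribute::> and C<builtin::> spellings used in the RFC text; C<ref_info> is a hypothetical helper name.

```perl
use strict;
use warnings;
use Scalar::Util qw(blessed reftype);

# Hypothetical helper: returns (type) for a plain reference and
# (type, class) for a blessed one, as the RFC proposes for ref().
sub ref_info {
    my ($ref) = @_;
    my $type = reftype($ref);
    return unless defined $type;         # not a reference at all
    my $class = blessed($ref);
    return defined $class ? ($type, $class) : ($type);
}

for my $thing ([1, 2], bless({}, 'Dog'), \"text") {
    my ($type, $class) = ref_info($thing);
    $class ||= "no class";
    print "Object of type $type, blessed into $class\n";
}
```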
Re: RFC 218 (v1) C<my Dog $spot> is just an assertion
Piers wrote:

> I'm kind of tempted to look at adding another pragma to go with 'use
> base' along the lines of:
>
>     use implements 'Interface';
>
> Which is almost entirely like C<use base 'Interface'> but with
> 'Interface' consisting of nothing but:
>
>     package Interface;
>
>     sub virtual_method;
>     sub virtual_method2 (#prototype);
>     ...
>     1;

You and I must have been separated at birth, Piers. Here's what I wrote to Nat just yesterday:

There would be an C<interface> pragma or keyword (let's go with keyword) that creates pseudo-packages with which lexicals can be typed.

    interface Fetcher;

    sub fetch;

Interface specifications can only contain subroutine (method) declarations, which describe what behaviours the interface requires.

Lexicals typed into interfaces (as opposed to packages) only require that the objects assigned to them can satisfy the interface. I.e. they don't care about the class of the object, only what it C<can> do.

    my Fetcher $x;
    my Dog $spot;
    my NetSnarfer $z;

    $x = $z;        # ok because $z can fetch
    $x = $spot;     # ok because $spot can fetch
    $x->fetch();    # ok because Fetcher->can('fetch')
    $x->bark();     # not ok because ! Fetcher->can('bark')

Interfaces might also act like pure abstract base classes when inherited, so that:

    package Dog;
    use base 'Fetcher';

would cause a compile-time error if Dog failed to actually provide a C<fetch> method.

If you'd like to run with it, be my guest (but check with Nat first, in case he wants it).

Damian
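The "only what it can do" rule can be illustrated at run time in Perl 5 with C<can>. This is a sketch only: the C<%INTERFACE> table and the C<satisfies> helper are hypothetical stand-ins for the proposed C<interface> keyword, and nothing here is checked at compile time.

```perl
use strict;
use warnings;
use Scalar::Util qw(blessed);

# Hypothetical registry standing in for "interface Fetcher; sub fetch;"
my %INTERFACE = (Fetcher => [qw(fetch)]);

# True if $obj is an object providing every method the interface lists.
sub satisfies {
    my ($obj, $interface) = @_;
    return 0 unless blessed($obj) && exists $INTERFACE{$interface};
    for my $method (@{ $INTERFACE{$interface} }) {
        return 0 unless $obj->can($method);
    }
    return 1;
}

package Dog;
sub new   { return bless {}, shift }
sub fetch { return 'stick' }
sub bark  { return 'Woof' }

package Cat;
sub new { return bless {}, shift }

package main;

for my $pet (Dog->new, Cat->new) {
    printf "%s %s Fetcher\n", ref($pet),
        satisfies($pet, 'Fetcher') ? 'satisfies' : 'does not satisfy';
}
```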
Re: RFC 218 (v1) C<my Dog $spot> is just an assertion
At 08:13 AM 9/15/00 +1100, Damian Conway wrote:

> Piers wrote:
> > I'm kind of tempted to look at adding another pragma to go with
> > 'use base' along the lines of:
> >
> >     use implements 'Interface';
> >
> > Which is almost entirely like C<use base 'Interface'> but with
> > 'Interface' consisting of nothing but:
> >
> >     package Interface;
> >
> >     sub virtual_method;
> >     sub virtual_method2 (#prototype);
> >     ...
> >     1;
>
> You and I must have been separated at birth, Piers. Here's what I
> wrote to Nat just yesterday:
>
> <snip>
>
> Interfaces might also act like pure abstract base classes when
> inherited, so that:
>
>     package Dog;
>     use base 'Fetcher';
>
> would cause a compile-time error if Dog failed to actually provide a
> C<fetch> method.

I don't like that at all... Currently, Object implementations can be changed at will at runtime. In particular, it is possible to create new methods and change the class hierarchy at run time.

    package TextFilter;
    use CryptInterface;
    @ISA = qw(CryptInterface);

    sub AUTOLOAD {
        my $self = shift;
        ($name = $AUTOLOAD) =~ s/.*://;
        # Cryptography is very hairy and we don't want to load it if we don't
        # have to.
        if ($name eq any(qw(setkey setcrypt encode decode))) {
            require Crypt;
            import Crypt;
            push @ISA, "Crypt";
            $self->$name(@_);    # I've not tried this, it may be wrong.
        }
    }

I'd hate to have that break because TextFilter isn't derived from Crypt unless it needs to be. I think calling a method declared in an inherited interface but not implemented would be a good reason to have a descriptive run-time error, like:

    Method getkey in interface Crypt not implemented in TextFilter object at line

Well, perhaps written better...

> If you'd like to run with it, be my guest (but check with Nat first,
> in case he wants it).
>
> Damian
Re: RFC 222 (v1) Interpolation of method calls
On Thu, Sep 14, 2000 at 06:37:22PM -0500, David L. Nicol wrote:

> A possibility that does not appear in RFC222.1 is to put the whole
> accessor expression inside curlies:
>
>     print "Today's weather will be ${weather->temp} degrees and sunny.";
>
> which would follow the "You want something funny in your interpolated
> scalar's name or reference, you put it in curlies" rule. Since the
> contents of that expression is not strictly \w+ it does not get
> interpreted as the symbol table lookup ${'weather->temp'}, so that is
> not a problem, nor are the whitespace situations listed in the CAVEATS
> section.

Currently, ${weather->temp} means dereference the return value of the method 'temp' in the 'weather' class as a scalar reference (or a symbolic reference, if it's not a scalar reference and you're not using strict). Do you intend for this meaning to be taken away entirely, or to be special-cased within interpolated strings?

In either case I would have to heartily disagree with you; it's inconsistent, and while special cases can be a good thing when it comes to DWIM, I think we should DWIM on the side of interpolating method calls, rather than taking away existing syntax.

Michael
--
Administrator                      www.shoebox.net
Programmer, System Administrator   www.gallanttech.com
--
Re: RFC 189 (v2) Objects : Hierarchical calls to initializers and destructors
> =head2 The C<BUILD> method
> =head3 The C<REBUILD> method

Hey! You left out the alternative names NEW / RENEW and BLESS / REBLESS that we all like! :-(

-Nate
Re: RFC 189 (v2) Objects : Hierarchical calls to initializers and destructors
> > =head2 The C<BUILD> method
> > =head3 The C<REBUILD> method
>
> Hey! You left out the alternative names NEW / RENEW and BLESS /
> REBLESS that we all like! :-(

Oops. You're correct. I will rectify that.

Damian
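Whatever the method ends up being called, the call order RFC 189 proposes (ancestors first, depth-first and left-most, each class once) can be emulated in Perl 5 by walking C<@ISA> by hand. The following is a rough sketch under that assumption; C<build_order> and C<construct> are hypothetical helpers, not part of the RFC, and the class hierarchy mirrors the RFC's example.

```perl
use strict;
use warnings;

# Hypothetical helper: a class and its ancestors, ancestors first
# (depth-first, left-most), with each class listed only once.
sub build_order {
    my ($class, $seen) = @_;
    $seen ||= {};
    return if $seen->{$class}++;
    no strict 'refs';
    my @parents = @{"${class}::ISA"};
    return ((map { build_order($_, $seen) } @parents), $class);
}

# Hypothetical constructor: bless, then call each class's own BUILD once.
sub construct {
    my ($class, @args) = @_;
    my $self = bless {}, $class;
    for my $pkg (build_order($class)) {
        no strict 'refs';
        next unless defined &{"${pkg}::BUILD"};   # only BUILDs defined in $pkg itself
        my $build = \&{"${pkg}::BUILD"};
        $self->$build(@args);
    }
    return $self;
}

# The hierarchy from the RFC, with each BUILD recording its call.
my @LOG;
for my $pkg (qw(Base1 Base2 Base3 Derived1 Derived2 Rederived1)) {
    no strict 'refs';
    *{"${pkg}::BUILD"} = sub { my $self = shift; push @LOG, "${pkg}::BUILD : @_" };
}
@Derived1::ISA   = qw(Base1 Base2);
@Derived2::ISA   = qw(Base2 Base3);
@Rederived1::ISA = qw(Derived1 Derived2);

my $obj = construct('Rederived1', 1 .. 3);
print "$_\n" for @LOG;
```

Reversing the list returned by C<build_order> gives the depth-last, right-most order the RFC proposes for C<DESTROY>.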
RFC 222 (v1) Interpolation of method calls
This and other RFCs are available on the web at http://dev.perl.org/rfc/ =head1 TITLE Interpolation of method calls =head1 VERSION Maintainer: Michael G Schwern [EMAIL PROTECTED] Date: 14 Sep 2000 Mailing List: [EMAIL PROTECTED] Number: 222 Version: 1 Status: Developing =head1 ABSTRACT Method calls should interpolate in double-quoted strings, and similar locations. print "Today's weather will be $weather-temp degrees and sunny."; Would deparse to: print 'Today\'s weather will be '.$weather-temp().' degrees and sunny.'; =head1 DESCRIPTION =head2 The Current Problem With OO Interpolation Object-oriented programming encourages data-hiding, and one of the most basic tool for this is the accessor method. For reasons which should be obvious, C$obj-foo() is usually better than C$obj-{foo}. However, there are several barriers to using an accessor method as simply as one does a hash lookup. Other RFCs deal with most of the current issues, but a basic one still remains. print "Today's weather will be $weather-temp degrees and sunny."; This does not DWIM. Instead of interpolating C$weather-temp as a method call, it comes out as C$weather.'-temp' and is usually followed immediately by the question "What does 'Weather=HASH(0x80d4174)-temp' mean??" Most programmers learning OO Perl expect this to work and are surprised to find that it does not. Work arounds abound: # If I wanted printf(), I'd have written it in C. printf "Today's weather will be %d degrees and sunny.", $weather-temp; my $temp = $weather-temp; print "Today's weather will be $temp degrees and sunny."; print "Today's weather will be @{[$weather-temp]} degrees and sunny."; print "Today's weather will be ".$weather-temp." degrees and sunny."; None are as simple and as obvious as: print "Today's weather will be $weather-{temp} degrees and sunny."; and because of this users groan at having to use accessor methods and are often tempted to violate encapsulation for ease of use. 
=head2 Proposed Solution - Interpolate Methods

Therefore, it is proposed that direct object method calls be interpolated inside double quoted strings and similar constructs.

    print "Today's weather will be $weather->temp degrees and sunny.";

should parse out as:

    print 'Today\'s weather will be '.$weather->temp().' degrees and sunny.';

thus returning DWIMness to methods and strings and removing one barrier to accessor methods' acceptance over hash lookups for objects.

Methods will be run in scalar context. A method which returns a single scalar is treated normally. If a list is returned, it should be treated the same as array interpolation. The list separator will be applied. In effect, the deparsing will really work out as follows:

    print 'Today\'s weather will be '.join($", $weather->temp()).
          ' degrees and sunny.';

However if temp() calls wantarray(), the result will be FALSE (scalar).

(For the remainder of the RFC, the join() will be assumed when discussing deparsing for brevity.)

Should it be decided that a formal distinction be made between accessor methods and other types (RFC 95), method interpolation should interpolate B<any> method.

=head2 Argument passing

Interpolation should also handle passing arguments to methods in a string:

    print "Today's weather will be $weather->temp("F") degrees and sunny.";

This should deparse to:

    print 'Today\'s weather will be '.$weather->temp("F").
          ' degrees and sunny.';

The arguments to the method are considered as normal expressions, thus:

    print "There is $obj->foo(this => $yar, that => 2 + 2) in my head.";

deparses as:

    print 'There is '.$obj->foo(this => $yar, that => 2 + 2).
          ' in my head.';

=head1 CAVEATS

Indirect object syntax, being already ambiguous, cannot easily be distinguished in a string from normal text and should not be interpolated. This is ok, since accessors are rarely called with indirect object syntax.

Are there any contexts besides double quotes ("", qq{}, <<"EOF") where this need be applied? What about inside regexes?
And if so, left and/or right hand side?

Normally, whitespace is allowed between tokens of a method call.

    $obj -> bar ("this");

and

    $obj->bar("this");

are equivalent. Whitespace between the object, '->', method name and opening paren should be disallowed when interpolated. This will avoid many ambiguous cases.

Should the method not exist, Perl will throw an exception/die as usual.

C<< "$var->{this}[2]{is}->{complex}->method" >> should also be interpolated. Also C<< "$obj->method->{key}" >> for the case where a method returns a reference.

=head1 IMPLEMENTATION

The behavior of the parser to check for embedded variables would have to be altered, namely the case where an embedded variable is being dereferenced. A case would be added to allow method calls as well as hash and array index dereferences. Otherwise, parsing should remain as normal.

=head1
Re: RFC 164 (v2) Replace =~, !~, m//, s///, and tr// with match(), subst(), and trade()
On 30 Aug 2000 02:13:38 -, Perl6 RFC Librarian wrote:

> Replace =~, !~, m//, s///, and tr// with match(), subst(), and trade()

Why? What's next, replace the regex syntax with something that more closely resembles the rest of Perl?

Regexes are a language within the language. And not a tiny one.

So, if regexes are such a completely different sublanguage, I can see the m// and s/// syntax as just a link between these two entirely different worlds. I don't care that it has a weird syntax itself. That, by itself, simply stresses the fact that regexes are indeed "different".

-- 
Bart.
Re: RFC 164 (v2) Replace =~, !~, m//, s///, and tr// with match(), subst(), and trade()
On Thu, 14 Sep 2000 08:47:24 -0700, Nathan Wiger wrote:

> One thing to remember is that regex's are already used in a number of
> functions:
>
>    @results = grep /^VAR=\w+/, @values;

You are being misled. You might just as well say that length() is being used in other functions:

    @results = grep length, @values;

This is not a regex as a parameter. It is an expression. Execution is postponed, just as for length(). The code is equivalent to

    @results = grep { /^VAR=\w+/ } @values;

a block containing an expression. Now, to be consistent, you should turn this into:

    @results = grep { match /^VAR=\w+/ } @values;

or

    @results = grep match /^VAR=\w+/, @values;

Er... with the problem that you no longer know which function the argument list @values belongs to.

    @results = grep match(/^VAR=\w+/), @values;

Is this really worth it?

>    @array = split /\s*:\s*/, $input;

You are correct here.

I'm opposed to an obligation to replace m// and s///. I won't mind the ability to give a prototype of "regex" to functions, or even *additional* functions, match and subst. Bare regexes should stay.

As for tr///: this doesn't even use a regex. It looks a bit like one, but it's an entirely different thing. And to me, the argument of tr/// is actually two arguments: a source character list, and a replacement character list.

As for s///: same thing, two arguments, a regex, and a substitution string or function. Your example function

    OLD: ($str = $_) =~ s/\d+/func/ge;
    NEW: $str = subst /\d+/func/ge;

should really be

    $str = subst /\d+/g, \&func;

although I have the impression that the //g modifier is in the wrong place.

-- 
Bart.
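Bart's two points (that the first argument of grep is an expression rather than a regex parameter, and that tr/// does not take a regex) can be checked in current Perl 5. The sketch below uses made-up sample data; nothing in it depends on the proposed match()/subst() builtins.

```perl
use strict;
use warnings;

my @values = ('VAR=one', 'OTHER=two', 'VAR=three', '');

# The expression form and the block form are the same operation:
# the pattern is matched against $_ for each element.
my @expr_form  = grep /^VAR=\w+/, @values;
my @block_form = grep { /^VAR=\w+/ } @values;

# Any expression works in that slot, not only a regex.
my @non_empty = grep length, @values;

# tr/// takes two character lists, not a regex: "." is a literal dot here.
(my $dotted = 'a.b.c') =~ tr/./_/;

print "@expr_form\n@block_form\n", scalar(@non_empty), "\n$dotted\n";
```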
Re: RFC 164 (v2) Replace =~, !~, m//, s///, and tr// with match(), subst(), and trade()
> I'm opposed to an obligation to replace m// and s///. I won't mind the
> ability to give a prototype of "regex" to functions, or even
> *additional* functions, match and subst.

As the RFC basically proposes. The idea is that s///, tr///, and m// would stay, seemingly unchanged. But they'd actually just be shortcuts to the new builtins.

These new builtins can act on lists, be prototyped/overridden, be more easily chained together without in-betweener variables. Basically, they get all the benefits normal functions get, while still being 100% backwards compatible.

-Nate
negative variable-length lookbehind example
In RFC 72, Peter Heslin gives this example:

:Imagine a very long input string containing data such as this:
:
:... GCAAGAATTGAACTGTAG ...
:
:If you want to match text that matches /GA+C/, but not when it
:follows /G+A+T+/, you cannot at present do so easily.

I haven't tried to work it out exactly, but I think you can achieve this (and fairly efficiently) with something like:

    /
      (?: ^ |                    # else we won't match at start
        (?:
          (?> G+ A+ T+) | (.)
        )*
        (?(1) | . )
      )
      G A+ C
    /x

This requires that the regexp engine reliably leaves $1 unset if we took the G+A+T+ branch last time through the (...)*, which has been an area of many bugs and no little discussion in perl5; I'm not sure of the status of that in latest perls.

It isn't particularly relevant to this proposal since there are other combinations that can't be resolved in this way; I thought it might be of interest nonetheless.

Hugo
RFC 128 (v3) Subroutines: Extend subroutine contexts to include name parameters and lazy arguments
This and other RFCs are available on the web at
  http://dev.perl.org/rfc/

=head1 TITLE

Subroutines: Extend subroutine contexts to include name parameters and lazy arguments

=head1 VERSION

  Maintainer: Damian Conway [EMAIL PROTECTED]
  Date: 17 August 2000
  Last Modified: 14 September 2000
  Mailing List: [EMAIL PROTECTED]
  Number: 128
  Version: 3
  Status: Developing

=head1 ABSTRACT

This RFC proposes that subroutine argument context specifiers be extended in several ways, including allowing parameters to be typed and named, and that a syntax be provided for binding arguments to named parameters.

=head1 CHANGES

Added section describing named parameter interaction with named higher-order function placeholders.

=head1 DESCRIPTION

It is proposed that the existing subroutine "prototype" mechanism be replaced by optional formal parameter lists that allow parameters to be named and their contexts specified.

The syntax for this would be:

    sub subname ( type context(s) parameter_name : parameter_attributes ,
                  type context(s) parameter_name : parameter_attributes ,
                  type context(s) parameter_name : parameter_attributes ;
                  # end of required parameters
                  type context(s) parameter_name : parameter_attributes ,
                  # etc.
                ) : subroutine_attributes
    {
        body
    }

Each of the four components of a parameter specification -- type, context, name, and attributes -- would be optional.

=head2 Contexts

The context specifiers would be:

    $       parameter is scalar
    @       parameter is array (eats remaining args)
    %       parameter is hash (eats remaining args)
    /       parameter is qr'd string
    &       parameter is subroutine reference or block
    *       parameter is typeglob (assuming they still exist)
    ""      parameter is bareword or character string
    ()      parameter is an explicitly parenthesized list

Note that any of these specifiers may appear in any position in a parameter list (especially C<&>, which would no longer be constrained to the first position).
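For comparison, here is how the existing Perl 5 prototype mechanism treats C<&> today. This is a sketch of current behavior only, not of the proposed syntax; the sub names apply_to_each and each_then_apply are made up for the example. In Perl 5 the bare-block calling form is available only when C<&> is the first prototype character; in any other position the caller has to pass an explicit code reference.

```perl
use strict;
use warnings;

# Perl 5 prototype: a leading & lets the caller pass a bare block,
# and @ swallows the remaining arguments.
sub apply_to_each(&@) {
    my ($code, @items) = @_;
    return map { $code->($_) } @items;
}

my @doubled = apply_to_each { $_[0] * 2 } 1, 2, 3;

# When & is not in the first position, the caller must pass an explicit
# code reference (sub {...} or \&name); a bare block is not accepted.
sub each_then_apply(\@&) {
    my ($items, $code) = @_;
    return map { $code->($_) } @$items;
}

my @nums    = (1, 2, 3);
my @squared = each_then_apply @nums, sub { $_[0] ** 2 };

print "@doubled\n@squared\n";
```

The second sub also shows the Perl 5 form of the C<\@> en-referencing coercion: the caller writes @nums and the sub receives an array reference.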
The following prefix context modifier would be available:

    \       parameter must be a reference, argument is magically
            en-referenced if necessary

The following context attributes would be available:

    :lazy           argument is lazily evaluated
    :uncurried      (& only) terminate curry propagation on argument
    :noautoviv      argument that is a (possibly nested) hash element or
                    array element is not autovivified
    :repeat{m,n}    argument is variadic within the specified range

The following subsections describe each of these in detail.

The following grouping operator would also be available:

    (...)   specifies that the argument(s) are to be treated
            collectively (i.e. by modifiers and attributes)

=head3 Automagically en-referenced arguments

The C<\> modifier causes the modified parameter to automagically convert its corresponding argument to a reference without list flattening. The most common usage is in passing hashes and arrays as a single argument.

Note that the semantics of the C<\> modifier would be altered slightly from those of Perl 5, so that a reference is I<always> passed for that parameter. It would, of course, retain its magical en-referencing coercion:

    \$      argument must be scalar ref or start with $,
            scalar var magically en-referenced

    \@      argument must be array ref or start with @,
            array var magically en-referenced

    \%      argument must be hash ref or start with %,
            hash var magically en-referenced

    \/      argument must be qr'd string or /.../ or m/.../,
            /.../ or m/.../ magically qr'd to en-reference

    \&      arg must be sub reference, curried function, or block,
            block converted to anonymous sub ref

    \*      argument must be typeglob ref or start with *,
            typeglob magically en-referenced

    \""     argument must be a string reference or a bareword,
            bareword magically stringified and en-referenced

    \()     argument must be a parenthesized list or an anonymous
            list constructor,
            parenthesized list is magically en-referenced

=head3 Lazy evaluation

If the C<lazy> attribute is used for a particular parameter, that parameter is lazily evaluated.
This means that it is only evaluated when the corresponding named parameter (see below) -- or the corresponding element of @_ -- is first accessed in some way, after which the evaluated value is stored in the element in the usual way. Passing the parameter to another subroutine or returning it as an lvalue does not count as an access. Evaluating it in an C<eval>
Re: Uninitialized value warnings should die (was Re: RFC 212 (v1) Make length(@array) work)
On Wed, 13 Sep 2000 19:57:28 -0700, Nathan Wiger wrote: Perl should stop nagging about this. Perl should be smart and not bother you about uninitialized values in the following situations: - string concat - string comparison Allow me to disagree. In my case, I mostly use variables in output (= used as strings), far less in numerical context, and these warnings tell me that there are some situations that I had overlooked in my script. Most of the time, ignoring these warnings would make me rely on incorrect results. -- Bart.
Re: RFC 111 (v3) Here Docs Terminators (Was Whitespace and Here Docs)
Amen to the below. So can we have an RFC 111 (v4) that gets rid of allowing stuff after the terminator? Even the ";" afterward seems useless... the ";" should be at the end of the statement, not the end of the here doc. The only improvement to here docs I see in this RFC is to allow whitespace before/after the here doc terminator. The rest is handled adequately and consistently today, and Tom's dequote is adequate to eliminate leading white space... especially among people who cannot agree that a tab in a file means "mod 8" (which it does).

Michael G Schwern wrote:

> I can't think of much else I'd want to comment about the end of a here-doc than "this is the end of the here-doc" which is about as useful as "$i++ # add one to $i".
>
> There's a big difference. Every code block ends with a '}'. Every here doc ends with its own custom tag. Thus to state:
>
>     print <<EOF;
>     Four score and seven years ago...
>     EOF  # end of print <<EOF line 23
>
> can currently be better written as:
>
>     print <<GETTYSBURG_ADDRESS;
>     Four score and seven years ago...
>     GETTYSBURG_ADDRESS
>
> The tag itself describes what the text is, similar to the way a well-named variable describes what's inside of it and removes the need for a descriptive comment. At a glance one can tell that 'GETTYSBURG_ADDRESS' closes the here-doc containing the Gettysburg Address, without having to maintain a comment. (I guarantee the line number mentioned in the comment will not be maintained.)
>
> Another reason for wanting to comment the closing of a code block is nesting. Simply searching for the previous '{' will not work. Here-docs cannot be nested and do not have this problem. Simply searching backwards for your here-doc tag will always work.

-- 
Glenn
There are two kinds of people, those who finish what they start, and so on... -- Robert Byrne
Re: Conversion of undef() to string user overridable for easy debugging
This would HAVE to be a very optional feature. I rely on undef converting to a null string in many, many programs. Surely in those programs you don't have -w turned on, because you wouldn't want to see all those warning messages. So here is another idea: -w causes string interpolation of variable that evaluate to undef to be cancelled, leaving the variable name in place, as well as giving the warning. I don't know about this. What if someone writes: print "You owe me $2, $name.\n"; With -w it'll print out the "correct" version? You owe me $2, Nate. But without it it won't? You owe me , Nate. As a beginning user, I'd be really confused. And then what if your regexp accidentally matched, and you were relying on $2 to print out verbatim? You owe me maingly name="dangly", Nate. Seems really really scary, as does the #UNDEF# idea. I think Nat's RFC 214 on getting more user-accurate error messages should actually help solve this more than these approaches. -Nate
Re: Conversion of undef() to string user overridable for easy debugging
Nathan Wiger wrote:

> I don't know about this. What if someone writes:
>
>     print "You owe me $2, $name.\n";
>
> With -w it'll print out the "correct" version?

With a warning, because $2 isn't defined.

>     You owe me $2, Nate.
>
> But without it it won't?
>
>     You owe me , Nate.

You turn off warnings, and you gets what you gets... better write perfect code in that case... which this isn't.

> As a beginning user, I'd be really confused. And then what if your regexp accidentally matched, and you were relying on $2 to print out verbatim?
>
>     You owe me maingly name="dangly", Nate.

This is pretty obvious what went wrong.

> Seems really really scary, as does the #UNDEF# idea. I think Nat's RFC 214 on getting more user-accurate error messages should actually help solve this more than these approaches.

I have no problem with getting more user-accurate error messages. But this seems not to hurt (you ignored the existence of the warning messages in your analysis above), and might help a bit, even if RFC 214 is too hard. But never fear, I'll not RFC it, it isn't worth that much.

> -Nate

-- 
Glenn
There are two kinds of people, those who finish what they start, and so on... -- Robert Byrne
Re: RFC 111 (v3) Here Docs Terminators (Was Whitespace and Here Docs)
On Wed, Sep 13, 2000 at 11:34:20PM -0700, Glenn Linderman wrote:

> The rest is handled adequately and consistently today, and Tom's dequote is adequate to eliminate leading white space... especially among people who cannot agree that a tab in a file means "mod 8" (which it does).

Damnit, I'm going to continue beating this horse until it stops twitching.

Tom and I had an extensive off-list discussion about this, and here's about where it left off (hopefully I'll get everything right). We have three major problems and three proposed solutions:

Problems:

    1   Allowing here-docs to be indented without affecting the output.
    2   Preserving sub-indentation.
    3   Preserving the output of the here-doc regardless of how its
        overall indentation is changed (i.e. shifted left and right)

Solutions:

    1   <<POD =~ s/some_regex//
    2   dequote(<<POD)
    3   indentation of the end-tag

Each solution has its strengths and weaknesses. Regexes can handle problem #1 but only #2 xor #3. However, they cover a wide variety of more general problems. dequote has the same problem. #1 is fine, but it can only do #2 xor #3. Not both.

The current stumper, which involves problems 1, 2 and 3, is this:

    if( $is_fitting && $is_just ) {
        die <<POEM;
        The old lie
          Dulce et decorum est
              Pro patria mori.
        POEM
    }

I propose that this work out to

    "The old lie\n  Dulce et decorum est\n      Pro patria mori.\n"

and always work out to that, no matter how far left or right the expression be indented.

    { { { { {
        if( $is_fitting && $is_just ) {
            die <<POEM;
            The old lie
              Dulce et decorum est
                  Pro patria mori.
            POEM
        }
    } } } } }

Four spaces, two spaces, six spaces. Makes sense, everything lines up. So far I have yet to see a regex or dequote() style proposal which can accommodate this.

So solution #1 is powerful, solution #2 is simple, solution #3 solves a set of common problems which the others do not (but doesn't provide the others' flexibility). All are orthogonal. All are fairly simple and fairly obvious. Allow all three.
My most common case for needing indented here-docs is this:

    { { { {   # I'm nested
        if($error) {
            warn "So there's this problem with the starboard warp coupling and oh shit I just ran off the right margin.";
        }
    } } } }

Usually I wind up doing this:

    { { { {   # I'm nested
        if($error) {
            warn "So there's this problem with the starboard ".
                 "warp coupling and oh shit I just ran off the ".
                 "right margin.";
        }
    } } } }

I'd love it if I could do this instead:

    { { { {   # I'm nested
        if($error) {
            warn <<ERROR =~ s/\n/ /;
            So there's this problem with the starboard warp coupling
            and hey, now I have lots of room to pummel you with
            technobabble!
            ERROR
        }
    } } } }

By combining two of the solutions, my problem is solved. I can indent my here-docs and yet keep the output a single line.

Show me where this fails and I'll shut up about it.

-- 
Michael G Schwern http://www.pobox.com/~schwern/ [EMAIL PROTECTED]
Just Another Stupid Consultant
Perl6 Kwalitee Ashuranse
Sometimes these hairstyles are exaggerated beyond the laws of physics - Unknown narrator speaking about Anime
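For reference, the closest thing available in plain Perl 5 is solution 1 applied after the fact: post-process the here-doc with regexes. The sketch below is an assumption-laden illustration, not the proposed syntax; fold_to_line is a hypothetical helper name. Note that the end tag still has to sit at the left margin of the source file, which is exactly the limitation the end-tag indentation proposal is meant to remove.

```perl
use strict;
use warnings;

# Hypothetical helper: strip per-line indentation, then join the lines
# into a single line while keeping the final newline.
sub fold_to_line {
    my ($text) = @_;
    $text =~ s/^[ \t]+//mg;    # drop leading spaces/tabs on every line
    $text =~ s/\n(?=.)/ /g;    # replace every newline except the last
    return $text;
}

my $message = fold_to_line(<<'ERROR');
        So there's this problem with the starboard
        warp coupling and hey, now I have lots of room
        to pummel you with technobabble!
ERROR

print $message;
```

This handles problem 1 (indented body) at the cost of problem 2: any sub-indentation inside the text is thrown away along with the rest.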
Re: RFC 215 (v2) More defaulting to $_
On 13 Sep 2000, Perl6 RFC Librarian wrote: An inconsistency between "Cprint" and "" bugs me: "Cprint;" means "Cprint $_;" so it seems like "" should mean "C$_ = ". This would break code prompting for "Press any key" and wasting the input. I suggest again: s/""/"C "/g; s/C$_ = /C $_ = /; Cheers, Philip -- Philip Newton [EMAIL PROTECTED]
Re: RFC 111 (v3) Here Docs Terminators (Was Whitespace and Here Docs)
Michael G Schwern [EMAIL PROTECTED] writes:

[...]

> I propose that this work out to
>
>     "The old lie\n  Dulce et decorum est\n      Pro patria mori.\n"
>
> and always work out to that, no matter how far left or right the expression be indented.
>
>     { { { { {
>         if( $is_fitting && $is_just ) {
>             die <<POEM;
>             The old lie
>               Dulce et decorum est
>                   Pro patria mori.
>             POEM
>         }
>     } } } } }
>
> Four spaces, two spaces, six spaces. Makes sense, everything lines up. So far I have yet to see a regex or dequote() style proposal which can accommodate this.

I really like this.

[...]

> Show me where this fails and I'll shut up about it.

Here are 2 problems I can think of. But please don't "shut up about it" -- I like the solution, but these need to be sorted out!

1. It requires the perl parser know about indentation. Of course we all know that tabs are 8 characters wide (I myself make a point of bludgeoning anyone who says otherwise), but do we really want to open this can of worms?

2. Existing practice for here docs will have the contents of the here doc on the left margin. People might want to preserve that. For instance, it makes sense if you're here-docking a bunch of 80 char lines.

(2) can be solved, and the ambiguous "no matter how far left or right the expression be indented" resolved, by saying that indentation of the here doc is relative to the terminator (*not* the statement that launched it). This might also make slightly better sense when you have 2 here docs in one line:

    print <<FIRST_HERE_DOC; print <<SECOND_HERE_DOC;
    This is on the left margin.
     This is indented one char.
    FIRST_HERE_DOC
         This is indented one char.
        This is on the left margin.
        SECOND_HERE_DOC

But (1) needs to be resolved (and don't say "use tabs 8"!).

-- 
Ariel Scolnicov        |"GCAAGAATTGAACTGTAG"             | [EMAIL PROTECTED]
Compugen Ltd.          |Tel: +972-2-5713025 (Jerusalem)  \ We recycle all our Hz
72 Pinhas Rosen St.    |Tel: +972-3-7658514 (Main office)`-
Tel-Aviv 69512, ISRAEL |Fax: +972-3-7658555    http://3w.compugen.co.il/~ariels
Re: RFC 215 (v2) More defaulting to $_
On 13 Sep 2000, Perl6 RFC Librarian wrote: =head1 DESCRIPTION $_ is the default scalar for a small set of operations and functions. Most of these operations and functions are ones that use C=~. Er, no. Quite a few others also use $_. The ones I found: -X filehandle tests (except for -t), abs, alarm, chomp, chop, chr, chroot, cos, defined, eval, exp, glob, hex, int, lc, lcfirst, length, log, lstat, oct, ord, pos, print, quotemeta, readlink, ref, require, rmdir, sin, split, sqrt, stat, study, uc, ucfirst, unlink. None of those uses C=~. Problem: since has a different behavior in a list context, does "C( C);" break, or does it mean something like "C@_ = c;" ? Eek. This is really ugly. I propose: since C has a different behavior in a list context, does "C (); " break, or does it mean something like "C @_ = ; " ? I especially wonder about your c; escape. =head2 3: For Functions In General "Cstat;", "Clength;", and many others could use C$_. Er, they already do. man perlfunc, and/or see my list above. Cheers, Philip -- Philip Newton [EMAIL PROTECTED]
Re: RFC 111 (v3) Here Docs Terminators (Was Whitespace and Here Docs)
I've implemented a prototype of the indented here-doc tag I'm proposing.

http://www.pobox.com/~schwern/src/RFC-Prototype-0.02.tar.gz

It's RFC::Prototype::111, which is probably the wrong number.

I'll have to add <<POD =~ s/// syntax.

Also, if anyone's good with filters: I couldn't quite get the prototype working with Filter::Util::Call. I found myself needing to work line-by-line, and that whole "build up $_" was getting in my way, so I switched to Filter::Util::Exec and it works, but it makes debugging really hard.

=head1 NAME

RFC::Prototype::111 - Implements Perl 6 RFC 111

=head1 SYNOPSIS

    use RFC::Prototype::111;

    if( $is_fitting && $is_just ) {
        die <<"POEM";
        The old lie
          Dulce et decorum est
              pro patria mori
        POEM
    }

=head1 DESCRIPTION

Two changes.

1. Allows POD end tags to be indented. The amount of space a tag is indented is the amount which will be clipped off of each line of the here-doc. Tabs will B<NOT> be expanded.

2. POD end tags may now be followed by trailing whitespace

-- 
Michael G Schwern http://www.pobox.com/~schwern/ [EMAIL PROTECTED]
Just Another Stupid Consultant
Perl6 Kwalitee Ashuranse
When faced with desperate circumstances, we must adapt. - Seven of Nine
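The clipping rule in change 1 is small enough to sketch directly. The code below is not taken from RFC::Prototype::111 (whose source is not shown here); clip_by_tag_indent is a hypothetical helper that assumes the here-doc body and the whitespace in front of the end tag have already been captured as strings.

```perl
use strict;
use warnings;

# Hypothetical helper: remove from every line the exact whitespace prefix
# that precedes the end tag. The prefix is compared literally, so tabs
# are not expanded, matching the rule stated above.
sub clip_by_tag_indent {
    my ($body, $tag_indent) = @_;
    $body =~ s/^\Q$tag_indent\E//mg;
    return $body;
}

my $tag_indent = ' ' x 4;    # as if the end tag were indented four spaces
my $body = join '',
    "    The old lie\n",
    "      Dulce et decorum est\n",
    "          Pro patria mori.\n";

my $poem = clip_by_tag_indent($body, $tag_indent);
print $poem;
```

Because only the tag's own prefix is removed, the relative indentation of the second and third lines survives, and a line that does not start with that prefix is left untouched.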
Re: RFC 111 (v3) Here Docs Terminators (Was Whitespace and Here Docs)
Michael,

I just noticed your post (I am at work). This is beginning to get there (maybe I should not have split the original 111).

In the prototype you only cover use of " quotes.

    if( ($pre_code, $quote_type, $curr_tag, $post_code) = $_ =~ m/(.*)\<\<(")(\w+)"(.*)/ )

It needs to match

    (.*)((["'`])(\w+)\2)|(\w+))(.*)

or something like that.

Richard Proctor
Can we improve the missing paren error message?
    use diagnostics;
    my $i=1;
    print 'hi' if ($i=1;

Running this with perl -wc (v 5.004, unix), I get

    perl -wc x.pl
    syntax error at x.pl line 3, near "1;"
    x.pl had compilation errors (#1)
        (F) The final summary message when a perl -c fails.
    Uncaught exception from user code:
        x.pl had compilation errors.

If I'm missing a "}" the compiler tells me it's missing. That's also a syntax error, but it reports the actual missing "}". Why not do the same for ")"?

In this simple case it's easy to spot, but with thousands of possible "syntax errors" it's not always this easy. I've learned from experience that usually this message implies a missing ")", at least for me, but why not explicitly state it?

The compiler should provide as much error detail as possible, particularly if using -w and "use diagnostics".

-Ed
Re: RFC 111 (v3) Here Docs Terminators (Was Whitespace and Here Docs)
Glenn Linderman wrote: Amen to the below. So can we have an RFC 111 (v4) that gets rid of allowing stuff after the terminator? Even the ";" afterward seems useless... the ";" should be at the end of the statement, not the end of the here doc. The only improvement to here docs I see in this RFC is to allow whitespace before/after the here doc terminator. The rest is handled adequately and consistently today, and Tom's dequote is adequate to eliminate leading white space... especially among people who cannot agree that a tab in a file means "mod 8" (which it does). The semicolon, as you point out, belongs on the statement at the head of the here doc. The proposal to allow a semicolon at the end is mere window-dressing. Aesthetics only. Personally, I have used editors and pretty-printers that could handle here-docs except that they thought that the "statement" without a semicolon meant that all subsequent lines should be indented. I have had to resort to: $foo = HERE; ... HERE ; other_statements(); Yes, the obvious solution is to get a better editor/pretty printer. Not always an option. But, as I said, it's mere aesthetics. Perhaps not worth changing the language to accommodate the minority of people who have inferior tools. But why not allow a comment? Can't think of a use for one? Michael Schwern, whom you quote, points out that the here doc tag ought to be self-documenting, and he is 100% correct. But comments are used for more than documentation. Ever write a note to yourself or to the next programmer in a comment? $foo = TABLE_OF_GOODS; ... TABLE_OF_GOODS # must combine with TABLE_OF_SUPPLIES, below, someday Sure, you can put that comment in a different place, with little harm. But as long as we're proposing allowing whitespace before/after the doc tag, comments are a Good Thing, imho. -- Eric J. Roode, [EMAIL PROTECTED] print scalar reverse sort Senior Software Engineer'tona ', 'reh', 'ekca', 'lre', Myxa Corporation'.r', 'h ', 'uj', 'p ', 'ts';
Re: Can we improve the missing paren error message?
On Thu, 14 Sep 2000 12:16:51 GMT, Ed Mills wrote: If I'm missing a "}" the compiler tells me its missing. That's also a syntax error, but it reports the actual missing "}". Why not do the same for ")"? That reminds me: if Perl reports a missing "}", "]" or ")", it would also be very nice if it also told me on what line the opening "{", "[" or "(" was. A bare error message like Missing "}" at EOF is silly. -- Bart.
Re: RFC 217 (v2) POD needs a reorder command.
> Sometimes I want a chunk of documentation to hang out near a chunk of code, but the order of the code is not always a good order for a man page.

Are you familiar with the C<divert> m4 command? L<m4>.

I'm not saying it would be the way to go; but it might be useful to see how another preprocessing language handles this issue.

-- 
John Porter
Re: RFC 213 (v1) rindex and index should return undef on failure
Chaim Frenkel wrote:

> Removing -1 as a valid result, could be a breakage (if someone is doing something weird with a negative result)

What, like using it as an index into a substr? Bad Code is its own reward, my friend.

    $foo = "flabergasted";
    substr($foo, index($foo, 'abc'), 20);   # Returns undef

One should never do this, regardless of what index() returns on failure.

Now, if index() threw an exception on failure, you'd be o.k. But I don't think we want that...

-- 
John Porter

We're building the house of the future together.
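For reference, this is what the unchecked idiom does in Perl 5 as it stands, where index() returns -1 on failure. The snippet is a small sketch of current behavior, not of what RFC 213 proposes.

```perl
use strict;
use warnings;

my $foo = "flabergasted";

# Perl 5: index() returns -1 when the substring is not found.
my $pos = index($foo, 'abc');

# Passing -1 straight to substr() counts from the end of the string,
# so this silently yields the last character instead of failing.
my $oops = substr($foo, $pos, 20);

# Checking the result first avoids relying on the failure value at all.
my $safe = $pos >= 0 ? substr($foo, $pos, 20) : undef;

print "pos=$pos oops=$oops safe=", defined $safe ? $safe : 'undef', "\n";
```

Code written with the explicit check keeps working whichever failure value index() ends up returning, which is the point of John's remark.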
Re: RFC 111 (v3) Here Docs Terminators (Was Whitespace and Here Docs)
Ariel Scolnicov wrote: 1. It requires the perl parser know about indentation. Of course we all know that tabs are 8 characters wide (I myself make a point of bludgeoning anyone who says otherwise), but do we really want to open this can of worms? Not so fast with those 8-column tabs. (But, I do NOT want to start a religious war here). At my company, we're required to have one tab stop, no spaces, between indentation levels. Boss likes 8 columns, which to my mind is way too much -- it doesn't take too many levels for your code to march off the right side of the screen. I prefer four columns. No problem -- I make my tab settings four columns. Which, for purposes of here docs and this proposal, works just as well. The REAL sinners are those who mix spaces and tabs. THAT's evil. :-) -- Eric J. Roode, [EMAIL PROTECTED] print scalar reverse sort Senior Software Engineer'tona ', 'reh', 'ekca', 'lre', Myxa Corporation'.r', 'h ', 'uj', 'p ', 'ts';
Re: RFC 111 (v3) Here Docs Terminators (Was Whitespace and Here Docs)
In Michael Schwern's prototype, expansion to treat both semicolons and comments at the end tag is possible by changing

    /^(\s*)$curr_tag\s*$/

to

    /^(\s*)$curr_tag\s*(;\s*)?(#.*)?$/

Richard
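The extended pattern can be tried out on a few candidate end-tag lines. This is a standalone sketch rather than a patch to the prototype: the tag name POEM and the sample lines are made up, and \Q...\E is added around the tag as a precaution that the original pattern does not have.

```perl
use strict;
use warnings;

my $curr_tag = 'POEM';    # hypothetical tag name for the demonstration

# Optional indentation, the tag, then an optional semicolon and an
# optional trailing comment.
my $end_tag = qr/^(\s*)\Q$curr_tag\E\s*(;\s*)?(#.*)?$/;

my @accepted = grep { $_ =~ $end_tag } (
    'POEM',
    '    POEM',
    '    POEM ;',
    'POEM # end of poem',
    'POEM; # both',
);

my @rejected = grep { $_ !~ $end_tag } (
    'POEMS',
    'POEM extra text',
    'my POEM',
);

print scalar(@accepted), " accepted, ", scalar(@rejected), " rejected\n";
```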
Re: $a in @b (RFC 199)
David L. Nicol wrote:

> "Randal L. Schwartz" wrote:
>
> > I think we need a distinction between "looping" blocks and "non-looping" blocks. And further, it still makes sense to distinguish "blocks that return values" (like subroutines and map/grep blocks) from either of those. But I'll need further time to process your proposal to see the counterarguments now.
>
> In the odd parallel universe where most perl 6 flow control is handled by the throwing and catching of exceptions, the next/last/redo controls are macros for throwing next/last/redo exceptions. Loop control structures catch these objects and throw them again if they are labeled and the label does not match a label the loop control structure recognizes as its own.

The more I think about this, and about why I like the way perl does it currently, the more I think it would be a Bad Idea to unify the various block types as I (and others) have previously suggested. And it all boils down to the scope of returns, including non-local returns (last and die).

It is hard to argue that perl's current setup is not powerful.

    sub foo {
        eval {
            for (...) {
                # all these go to different places:
                last;
                die;
                return;
            }
        };
    }

    sub foo {
        # and these as well:
        last;
        die;
        return;
    }
    for (...) {
        eval {
            foo();
        };
    }

to give but two possible combinations.

In a nutshell, there are different kinds of blocks, and their escape mechanisms are triggered by different keywords. By unifying the block types, and making the keywords work across all of them, I'm afraid we would lose this ability to jump up through the layers of scope to "the right place".

The issue we've been struggling with is essentially the fact that map/grep blocks don't have a similar early-exit mechanism. One approach is to make them the same as one of our other block types (sub, loop, eval); another is to add a new keyword to implement the early exit.
Most folks seem to think that a grep block is more like a loop block, and so want to use C<last>; I have been more of the opinion that a grep block is more like a sub, and so should use C<return>. In the other camp, C<yield> has been suggested; but the conflation of that with its thread-related semantics may not be such a good idea.

-- 
John Porter

We're building the house of the future together.
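To make the missing early exit concrete, here is a Perl 5 sketch of the "$a in @b" test done three ways. The data and variable names are invented for the example; List::Util's first() is a core-module function and is shown only as one existing way to stop at the first match, not as what RFC 199 proposes.

```perl
use strict;
use warnings;
use List::Util qw(first);

my @b      = (3, 8, 15, 8, 42);
my $wanted = 8;

# grep always visits every element, even after a match has been found.
my $visited_by_grep = 0;
my $found_by_grep   = grep { $visited_by_grep++; $_ == $wanted } @b;

# A plain loop can stop early with last.
my $visited_by_loop = 0;
my $found_by_loop   = 0;
for my $item (@b) {
    $visited_by_loop++;
    if ($item == $wanted) { $found_by_loop = 1; last }
}

# List::Util::first also stops at the first matching element.
my $first_big = first { $_ > 10 } @b;

print "grep visited $visited_by_grep, loop visited $visited_by_loop\n";
```

In scalar context grep returns the number of matches, so the grep version answers the membership question but only after walking the whole list.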
Re: $a in @b (RFC 199)
David L. Nicol wrote:
> "Randal L. Schwartz" wrote:
> > I think we need a distinction between "looping" blocks and "non-looping" blocks. And further, it still makes sense to distinguish "blocks that return values" (like subroutines and map/grep blocks) from either of those. But I'll need further time to process your proposal to see the counterarguments now.
>
> In the odd parallel universe where most perl 6 flow control is handled by the throwing and catching of exceptions, the next/last/redo controls are macros for throwing next/last/redo exceptions. Loop control structures catch these objects and throw them again if they are labeled and the label does not match a label the loop control structure recognizes as its own.

I find this urge to push exceptions everywhere quite sad.

> Most folks seem to think that a grep block is more like a loop block, and so want to use C<last>; I have been more of the opinion that a grep block is more like a sub, and so should use C<return>. In the other camp, C<yield> has been suggested; but the conflation of that with its thread-related semantics may not be such a good idea.

C<pass>.

-- 
$jhi++; # http://www.iki.fi/jhi/
        # There is this special biologist word we use for 'stable'.
        # It is 'dead'. -- Jack Cohen
RE: (RFC 199) Alternative Early-Exit Mechanisms for Blocks?
David L. Nicol wrote:
> "Randal L. Schwartz" wrote:
> > I think we need a distinction between "looping" blocks and "non-looping" blocks. And further, it still makes sense to distinguish "blocks that return values" (like subroutines and map/grep blocks) from either of those. But I'll need further time to process your proposal to see the counterarguments now.
>
> In the odd parallel universe where most perl 6 flow control is handled by the throwing and catching of exceptions, the next/last/redo controls are macros for throwing next/last/redo exceptions. Loop control structures catch these objects and throw them again if they are labeled and the label does not match a label the loop control structure recognizes as its own.

Sounds like there is a significant consensus that we are still waiting for some better mechanism to be proposed. Missing that, most would prefer to leave it alone. Is this correct?

Does anyone have an alternative mechanism or keywords to propose?

Could someone shoot down or prop up the following:

* Subroutines automatically get their name as a label
* Either anonymous code blocks can't short-circuit, or we use something like "LABEL undef" for the closest code block
* Allow "code" blocks to catch next/last/redo exceptions with explicit labels.
* To support both short-circuiting and returning a value, use: return $value && next LABEL
* We stop calling them loop control statements and start referring to them as short-circuit statements commonly used with loop control.

I believe this gets us what we're after with RFC 199: Short-circuiting C<grep> and C<map> with C<last>.

If there is interest in allowing "loop" and "bare" blocks to C<return> and/or C<yield>, I'll defer that to another RFC.

I'm looking to firm up RFC 199 in as positive a light as possible or, failing that, withdraw it. I would like to return to being a lurker ;)

Garrett
RE: (RFC 199) Alternative Early-Exit Mechanisms for Blocks?
Garrett Goebel wrote:
> Could someone shoot down or prop up the following:
>
> * Subroutines automatically get their name as a label

Ick! Shades of Pascal! Why don't we just replace "return $value" with "subroutine_name = $value;"!

Seriously, what is the point of

    sub func1
    {
        func2();
    }

    sub func2
    {
        last func1;
    }

? Imho, it is a BAD THING for functions to know who called them, and to vary their behavior accordingly.

-- 
Eric J. Roode,  [EMAIL PROTECTED]           print  scalar  reverse  sort
Senior Software Engineer                'tona ', 'reh', 'ekca', 'lre',
Myxa Corporation                        '.r', 'h ', 'uj', 'p ', 'ts';
Re: $a in @b (RFC 199)
Jarkko Hietaniemi wrote:
> > In the other camp, C<yield> has been suggested; but the conflation of that with its thread-related semantics may not be such a good idea.
>
> C<pass>.

Well, "pass" might be o.k.; but it usually means something going *into* a sub, not coming out...

-- 
John Porter
Re: (RFC 199) Alternative Early-Exit Mechanisms for Blocks?
Eric Roode wrote:
>     sub func1
>     {
>         func2();
>     }
>
>     sub func2
>     {
>         last func1;
>     }
>
> ? Imho, it is a BAD THING for functions to know who called them, and to vary their behavior accordingly.

Yes. This is a serious downside to the proposal, even though it was intended to allow last'ing out of some other kind of nested scope, e.g.

    sub func1
    {
        while(1)
        {
            last func1;
        }
    }

But if we keep the block types distinct, as I now believe we should, one would simply use C<return> there...

-- 
John Porter
We're building the house of the future together.
Re: $a in @b (RFC 199)
On Thu, Sep 14, 2000 at 11:46:31AM -0400, 'John Porter' wrote:
> Jarkko Hietaniemi wrote:
> > > In the other camp, C<yield> has been suggested; but the conflation of that with its thread-related semantics may not be such a good idea.
> >
> > C<pass>.
>
> Well, "pass" might be o.k.; but it usually means something going *into* a sub, not coming out...

I'll pass that remark.

-- 
$jhi++; # http://www.iki.fi/jhi/
        # There is this special biologist word we use for 'stable'.
        # It is 'dead'. -- Jack Cohen
Re: (RFC 199) Alternative Early-Exit Mechanisms for Blocks?
Garrett Goebel wrote:
> * Subroutines automatically get their name as a label

My concern here is whether it introduces a problem with C<goto foo> vs. C<goto &foo>. If, as you propose, subs do get their name as label, I would like to conflate these two forms. But the semantics of C<goto &foo> specifies that @_ is passed as is to the sub entrance. I suppose it would be o.k. to define C<goto foo> identically; just not sure. Any thoughts?

-- 
John Porter
We're building the house of the future together.
Re: RFC 208 (v2) crypt() default salt
=head1 TITLE crypt() default salt =head1 VERSION Maintainer: Mark Dominus [EMAIL PROTECTED] Date: 11 Sep 2000 Last Modified: 13 Sep 2000 Mailing List: [EMAIL PROTECTED] Number: 208 Version: 2 Status: Developing If there are no objections, I will freeze this in twenty-four hours.
RE: (RFC 199) Alternative Early-Exit Mechanisms for Blocks?
From: John Porter [mailto:[EMAIL PROTECTED]]
> Eric Roode wrote:
> >     sub func1
> >     {
> >         func2();
> >     }
> >
> >     sub func2
> >     {
> >         last func1;
> >     }
> >
> > ? Imho, it is a BAD THING for functions to know who called them, and to vary their behavior accordingly.

I'm after a next/last/redo mechanism for the subroutine to short-circuit itself. Besides, you could apply the "bad coding" example to the current implementation of short-circuiting loops.

    OUTER: while (1) {
        INNER: while (1) {
            func1();
        }
    }

    sub func1 {
        last OUTER;
    }

My proposal would only allow such "bad coding" when someone does so with an explicit label; otherwise the status quo is preserved. Let people who explicitly choose to write bad code do so. That's their choice. The default behaviour would remain, and I'll be able to short-circuit C<grep> and C<map>.

> Yes. This is a serious downside to the proposal, even though it was intended to allow last'ing out of some other kind of nested scope, e.g.
>
>     sub func1
>     {
>         while(1)
>         {
>             last func1;
>         }
>     }
>
> But if we keep the block types distinct, as I now believe we should, one would simply use C<return> there...

This wouldn't help C<grep>. You'd be returning from the context of the code block... and not C<grep>. I want to short-circuit the built-in, and my own subroutines that take code blocks.

    sub mygrep (&@) {
        my ($coderef, @list, @results) = @_;
        &$coderef and push(@results, $_) foreach (@list);
        @results;
    }

    mygrep {$_ == 1 and return $_} (1..1_000_000);

vs.

    mygrep {$_ == 1 and return $_ && last mygrep} (1..1_000_000);

Garrett
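The short-circuit asked for here can be approximated in Perl 5 by signalling the exit with an exception, much as the "parallel universe" quoted earlier in the thread imagines. The following is only a sketch under stated assumptions: mygrep, stop_grep and the StopGrep class are hypothetical names invented for this illustration, not part of RFC 199 or of any existing module.

```perl
use strict;
use warnings;

# Hypothetical helper: abort the enclosing mygrep().
# A true argument means "keep the current element before stopping".
sub stop_grep {
    my ($keep) = @_;
    die bless({ keep => $keep }, 'StopGrep');
}

# Hypothetical grep-alike whose block may call stop_grep() to end the scan early.
sub mygrep (&@) {
    my ($code, @list) = @_;
    my @results;
    for (@list) {
        my $keep = eval { $code->() };
        if (my $err = $@) {
            # Anything that is not our sentinel is a real error: rethrow it.
            die $err unless ref $err && ref $err eq 'StopGrep';
            push @results, $_ if $err->{keep};
            last;
        }
        push @results, $_ if $keep;
    }
    return @results;
}

my $calls = 0;
my @found = mygrep { $calls++; $_ == 3 ? stop_grep(1) : 0 } 1 .. 1_000;
print "found @found after $calls calls\n";
```

The block runs only until the first match instead of visiting all 1,000 elements. The cost is one eval per element, which is part of why a dedicated keyword is attractive.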
Re: RFC 111 (v3) Here Docs Terminators (Was Whitespace and HereDocs)
On 14 Sep 2000, Ariel Scolnicov wrote: 1. It requires the perl parser know about indentation. Of course we all know that tabs are 8 characters wide (I myself make a point of bludgeoning anyone who says otherwise), but do we really want to open this can of worms? No, because for every person such as yourself who is into eight spaces, there is someone else like me who wants four. And don't even get me started on the really strong guys who want three spaces...you know, the Threeite Musclemen. print FIRST_HERE_DOC; print SECOND_HERE_DOC; This is on the left margin. This is indented one char. FIRST_HERE_DOC This is indented one char. This is on the left margin. SECOND_HERE_DOC RFC 111 specifically disallows statements after the terminator because it is too confusing. I would say that the same logic should apply to the start of the here doc; I'm not sure, just from looking at it, if the example above is meant to be two interleaved heredocs, one heredoc after another, or what. Dave
Drop here docs altogether? (was Re: RFC 111 (v3) Here Docs Terminators (Was Whitespace and Here Docs))
> Show me where this fails and I'll shut up about it.

Actually, to me this thread underscores how broken here docs are themselves. We already have q//, qq//, and qx// which duplicate their functions far more flexibly. Question: Do we really need here docs? Before you scream "Bloody murder", please read on...

> The current stumper, which involves problems 1, 2 and 3 is this:
>
>     if( $is_fitting && $is_just ) {
>         die <<POEM;
>         The old lie
>          Dulce et decorum est
>          Pro patria mori.
>         POEM
>     }
>
> I propose that this work out to
>
>     "The old lie\n Dulce et decorum est\n Pro patria mori.\n"

Let's look at what happens if we ignore here docs and use qq// instead:

    if( $is_fitting && $is_just ) {
        die qq/
        The old lie
         Dulce et decorum est
         Pro patria mori.
        /;
    }

Solves problem #1, indented terminator, except that it adds two newlines (more later). However, it leaves 2 and 3. Let's try adding in a regexp:

    if( $is_fitting && $is_just ) {
        (my $mesg = qq/
        The old lie
         Dulce et decorum est
         Pro patria mori.
        /) =~ s/\s{8}(.*?\n)/$1/g;
        die $mesg;
    }

But the dang =~ operator makes that ugly and hard to read, and requires a $mesg variable. So let's try RFC 164's approach to patterns then:

    if( $is_fitting && $is_just ) {
        die subst /\s{8}(.*?\n)/$1/g, qq/
        The old lie
         Dulce et decorum est
         Pro patria mori.
        /;
    }

Seems to work for me (and yes I'm working on a prototype of RFC 164's functions).

I think we're trying to jam a lot of stuff into here docs that maybe shouldn't be jammed in, especially since Perl already has the q// alternatives that are much more flexible. Don't get me wrong, I like here docs and all, but I wonder if it isn't time for them to go?

I think I'd actually much rather see a new qh// "quoted here doc" operator that solves these problems than trying to jam them all into the existing shell-like syntax, which is a leftover oddity, really.

-Nate
Re: Drop here docs altogether? (was Re: RFC 111 (v3) Here Docs Terminators (Was Whitespace and Here Docs))
At 10:52 AM 9/14/00 -0700, Nathan Wiger wrote: Actually, to me this thread underscores how broken here docs are themselves. We already have q//, qq//, and qx// which duplicate their functions far more flexibly. Question: Do we really need here docs? I have thought this before, but I think the answer is yes, for the circumstance of when the quoted material does or may contain the terminator character. No matter what you pick, you still only have one character as a terminator, and if you're quoting something big and sufficiently general (think Perl code), then it's a pain to check it each time to see if you've stuck in the terminator by mistake. At any rate, this is what I tell my students when they realize that "..." can contain newlines and start to wonder about the raison d'etre of here documents. I think I'd actually much rather see a new qh// "quoted here doc" operator that solves these problems than trying to jam them all into the existing shell-like syntax, which is a leftover oddity, really. -- Peter Scott Pacific Systems Design Technologies
Re: RFC 111 (v3) Here Docs Terminators (Was Whitespace and Here Docs)
This whole debate has got silly.

RFC 111 V1 covered both the whitespace on the terminator and the indenting. There was a lot of debate that this was two things: more were in favour of the terminator and there was more debate on the indenting. Therefore I split this into two RFCs, leaving RFC 111 just dealing with the terminator.

RFC 111 V3 represents what I believe was rough consensus (a la IETF meaning) on the terminator issue. (The debate had been quiet for several weeks.) Michael Schwern has gone as far as doing a prototype that almost covers it, and with the few things I have posted earlier today it could be extended to handle all cases.

Next comes the issue of removing whitespace on the left of the content. There are several possibilities; these are now mostly in RFC 162. These are:

1) There is no processing of the input (current state)
2) All whitespace to the left is removed (my original idea)
3) Whitespace equivalent to the first line is removed (not a good solution)
4) Whitespace equivalent to the terminator is removed if possible (a la Michael's prototype) - this could be workable.
5) Whitespace equivalent to the smallest amount of the content is removed (current RFC 162 preferred solution)

When measuring whitespace how does the system treat tabs? (be realistic and don't FLAME)

So where do we go from here?

A) Do we want one syntax or two? (HERE and THERE)? I would prefer one but would accept two.
B) Is there rough consensus on the terminator issue at least?
C) Which of the 5 cases of handling the whitespace in the content might be agreed upon?
D) Decide how to treat tabs in the indenting. (Suggest =8 spaces plus allow pragma to override)
E) If the answer to A) is one and we have B) and we agree on 4) or 5) for the whitespace and some treatment of tabs, then I should cancel RFC 162 and just put everything back into RFC 111 (including Michael's prototype), and let's try and freeze it and move on to other things.

Peace!

Richard
-- 
[EMAIL PROTECTED]
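Option 5 in the list above (remove the smallest indentation found in the content) can be sketched as an ordinary Perl 5 function. This is an illustration only, not the RFC 162 implementation: strip_common_indent is a hypothetical name, and the sketch assumes each tab or space counts as one character, which sidesteps rather than answers the tab question in D).

```perl
use strict;
use warnings;

# Hypothetical helper: remove the smallest leading indentation shared by
# all non-blank lines. Assumption: a tab counts as a single character.
sub strip_common_indent {
    my ($text) = @_;
    my $min;
    for my $line (split /\n/, $text) {
        next unless $line =~ /\S/;              # blank lines do not count
        my ($ws) = $line =~ /^([ \t]*)/;
        $min = length $ws if !defined $min || length $ws < $min;
    }
    return $text unless $min;                   # nothing to strip
    $text =~ s/^[ \t]{$min}//mg;
    return $text;
}

my $poem = join '',
    ' ' x 8, "The old lie\n",
    ' ' x 9, "Dulce et decorum est\n",
    ' ' x 9, "Pro patria mori.\n";

print strip_common_indent($poem);
```

Because only the common prefix is removed, the relative indentation of the second and third lines survives, which is the "sub-indentation" concern raised elsewhere in the thread.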
Re: Drop here docs altogether? (was Re: RFC 111 (v3) Here Docs Terminators (Was Whitespace and Here Docs))
Nathan Wiger wrote:
> Actually, to me this thread underscores how broken here docs are themselves. We already have q//, qq//, and qx// which duplicate their functions far more flexibly. Question: Do we really need here docs?

Yes. Try generating lots of HTML, Javascript, Postscript, or other languages without here docs. Example:

    print <<CODE_SNIPPET;
        // this is a javascript function
        function valid(s)
        {
            ...
            if (var2 = '"'))
            {
                // rest of code to be generated later.
    CODE_SNIPPET

There's a chunk of code for which '', "", qq//, qq, qq{}, are all inadequate. This kind of code happens A LOT in web programming. I do not want to have to examine all of my generated strings to see what quoting character I can use this time around, and I do not want to risk breaking my program whenever I change the text in a code snippet ("oops! I added a bracket. gotta change the quoting character!").

-- 
Eric J. Roode,  [EMAIL PROTECTED]           print  scalar  reverse  sort
Senior Software Engineer                'tona ', 'reh', 'ekca', 'lre',
Myxa Corporation                        '.r', 'h ', 'uj', 'p ', 'ts';
Re: RFC 111 (v3) Here Docs Terminators (Was Whitespace and Here Docs)
Richard Proctor made some excellent comments, and asked:
> When measuring whitespace how does the system treat tabs? (be realistic and don't FLAME)

I suggest that there be NO tab/space conversion. Not 8 columns, not 4 columns, nothing. If the here doc terminator has four tabs preceding it, then four tabs should be stripped from each of the lines in the string. If the terminator has one tab and four spaces, then one tab and four spaces should be stripped from each of the lines.

Mixing spaces and tabs is basically evil, but if you're consistent about it, it's your own rope for you to trip over or hang yourself.

I set my tab stops to four columns; at least one of my coworkers sets his tab stops to eight columns. We edit the same code with no problems.

-- 
Eric J. Roode,  [EMAIL PROTECTED]           print  scalar  reverse  sort
Senior Software Engineer                'tona ', 'reh', 'ekca', 'lre',
Myxa Corporation                        '.r', 'h ', 'uj', 'p ', 'ts';
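The "no conversion" rule suggested here amounts to removing the terminator's leading whitespace byte for byte. A minimal sketch in Perl 5 follows; strip_prefix is a hypothetical name, and leaving lines that do not begin with the exact prefix untouched is this sketch's own choice, since the message does not say what should happen to them.

```perl
use strict;
use warnings;

# Hypothetical helper: strip the terminator's literal leading whitespace
# ($prefix) from every line that starts with exactly that sequence.
# No tab/space conversion is attempted.
sub strip_prefix {
    my ($prefix, $text) = @_;
    $text =~ s/^\Q$prefix\E//mg;
    return $text;
}

# Terminator assumed to be preceded by two tabs.
my $doc = "\t\tline one\n\t\t  line two\n\t    mixed\n";
my $out = strip_prefix("\t\t", $doc);
print $out;
```

The third line starts with one tab and four spaces rather than two tabs, so it is passed through unchanged: inconsistent mixing shows up in the output instead of being silently reinterpreted.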
Re: RFC 111 (v3) Here Docs Terminators (Was Whitespace and Here Docs)
Eric Roode wrote:
> I suggest that there be NO tab/space conversion.

I also suggest that no whitespace stripping/appending/etc/etc be done at all. If I write:

    if ( $its_all_good ) {
        print <<EOF;
                Thank goodness this text is centered!
        EOF
    }

That should print out:

                Thank goodness this text is centered!

Without forcing me to left-justify my EOF marker. Tying space-stripping to the placement of EOF is a Bad Idea, IMO. Do this if you want:

    if ( $its_all_good ) {
        (my $s = <<EOF) =~ s/\s{8}(.*?\n)/$1/g; print $s;
                Thank goodness this text isn't centered!
        EOF
    }

But this shouldn't be implicit in the language.

-Nate
Re: RFC 111 (v3) Here Docs Terminators (Was Whitespace and Here Docs)
Nathan Wiger wrote:
> I also suggest that no whitespace stripping/appending/etc/etc be done at all. If I write:
> [...deletia...]
> But this shouldn't be implicit in the language.

That's a good argument for having a separate operator for these "enhanced here docs", say , rather than chucking the whole idea out the window.

-- 
Eric J. Roode,  [EMAIL PROTECTED]           print  scalar  reverse  sort
Senior Software Engineer                'tona ', 'reh', 'ekca', 'lre',
Myxa Corporation                        '.r', 'h ', 'uj', 'p ', 'ts';
Re: RFC 111 (v3) Here Docs Terminators (Was Whitespace and Here Docs)
Michael G Schwern wrote:
> On Wed, Sep 13, 2000 at 11:34:20PM -0700, Glenn Linderman wrote:
> > The rest is handled adequately and consistently today, and Tom's dequote is adequate to eliminate leading white space... especially among people who cannot agree that a tab in a file means "mod 8" (which it does).
>
> Damnit, I'm going to continue beating this horse until it stops twitching.

That's fine, but it could have been done politely. I'm all for solving problems, and this message attempts to specify 3 problems, but it needs more specification. You describe three problems, but it is not clear what the problems are, exactly, because the words you used to describe them must not describe the problem universally. Let me attempt to describe the problems more completely, and when I diverge onto the wrong problem, you can clarify it, and then maybe we'll be communicating. I think you've also omitted some of the problems. Maybe they shouldn't be classified as major, but since they are related, and get in the way of some of the possible solutions, I think we should mention them all, so I've continued numbering.

> We have three major problems and three proposed solutions:
>
> Problems:
> 1 Allowing here-docs to be indented without affecting the output.

This is the problem that currently here-doc content must be relative to the left margin, so doesn't look nice with respect to nearby indented code.

> 2 Preserving sub-indentation.

This is not _currently_ a problem. Perl _currently_ preserves indentation in here-docs. It is not until some other "solution" gets in the way that this problem is a problem. If problem 1 were solved by independently eliminating all leading white space from each line of the HERE document, then this problem suddenly appears. So what this "problem" is trying to state is that problem #1 cannot be solved (using your "current stumper" example below) by

    die <<POEM =~ s/^\s*//m;

because that affects the relative horizontal relationships between characters on different lines. So this problem only needs to be avoided when solving other problems, rather than being a problem today.

> 3 Preserving the output of the here-doc regardless of how its overall indentation is changed (ie. shifted left and right)

This problem appears to be attempting to address what happens when indenting large blocks of code, with something equivalent to

    $code =~ s/^/^   /m;  # N.B. that's 3 spaces after the 2nd ^ character

The effect of the indentation is desirable, but the current semantics of here documents result in two problems: your number 3, which actually subsumes your problem number 1, that the text result of the here document is different than it was before the indentation took place, and also the first additional problem below.

Additional problems:

4 An indented here-doc terminator is not recognized, because perl6 requires the here-doc terminator to be at the left boundary.

5 Because white space is not visible, white space after the here-doc terminator, which perl6 requires must be followed by end-of-line, can cause apparent here-doc terminators to not be recognized.

6 Because indenting a tab character with non-tab characters changes its starting point, its apparent size also changes, thus affecting the horizontal relationship between characters on different lines of a here-doc.

7 Because people don't all subscribe to the universal definition of the ASCII tab character as meaning proceed to the next (mod 8) horizontal boundary, the appearance of here-docs containing tabs in various environments differs in the horizontal relationship between characters on different lines of a here-doc. This can be particularly significant if there are different numbers of leading tabs on a line, or a mixture of tabs and spaces at the front of some lines, or tabs found after non-white space characters.

> Solutions
> 1 <<POD =~ s/some_regex//
> 2 dequote(<<POD)
> 3 indentation of the end-tag
>
> Each solution has their strengths and weaknesses. Regexes can handle problem #1 but only #2 xor #3. However, they cover a wide variety of more general problems. dequote has the same problem. #1 is fine, but it can only do #2 xor #3. Not both.

Agreed that there is unlikely to be a single solution that solves all the problems. So can we look at solutions to each of the problems, and then attempt to pick a set of solutions to make available in perl6 that covers the problem space? Before I do that, let's analyze the current stumper in terms of the problems above, to make sure we are talking about the same problems.

> The current stumper, which involves problems 1, 2 and 3 is this:
>
>     if( $is_fitting && $is_just ) {
>         die <<POEM;
>         The old lie
>          Dulce et decorum est
>          Pro patria mori.
>         POEM
>     }
>
> I propose that this work out to
>
>     "The old lie\n Dulce et decorum est\n Pro patria mori.\n"
Re: RFC 208 (v2) crypt() default salt
On Thu, 14 Sep 2000 11:58:46 -0400, Mark-Jason Dominus wrote:
> If there are no objections, I will freeze this in twenty-four hours.

Oh, I have a small one: I feel that this pseudo-random salt should NOT affect the standard random generator.

I'll clarify: by default, if you feed the pseudo-random generator with a certain number, you'll get the same sequence of output numbers, every single time. There are applications for this. I think that any call to crypt() should NEVER change this sequence of numbers; in particular, it should not skip a number every time crypt() is called with one parameter.

Therefore, crypt() should have its own pseudo-random generator. A simple task, really: same code, but a different seed variable.

-- 
Bart.
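The separation asked for here can be illustrated in Perl 5 with a generator that keeps its own state and never touches rand() or srand(). This is a sketch only: private_salt is a hypothetical name, the linear congruential constants are the familiar textbook ones, the two-character salt length is an assumption, and nothing here is suitable for real password hashing or taken from RFC 208.

```perl
use strict;
use warnings;

{
    # Private generator state, seeded independently of srand().
    # Assumes a perl built with 64-bit integers.
    my $state = time() ^ $$;
    my @chars = ('a' .. 'z', 'A' .. 'Z', '0' .. '9', '.', '/');

    # Hypothetical helper: return a two-character salt without calling rand().
    sub private_salt {
        my $salt = '';
        for (1 .. 2) {
            $state = ($state * 1103515245 + 12345) % 2147483648;
            $salt .= $chars[ ($state >> 16) % @chars ];
        }
        return $salt;
    }
}

# The user's seeded sequence is identical with and without salt generation.
srand(42);
my @plain = map { int(rand(1000)) } 1 .. 5;

srand(42);
my @mixed;
for (1 .. 5) {
    my $salt = private_salt();       # would be passed to crypt($password, $salt)
    push @mixed, int(rand(1000));
}

print "sequences ", ("@plain" eq "@mixed" ? "match" : "differ"), "\n";
```

Had private_salt called rand() internally, the second sequence would have skipped values, which is exactly the side effect objected to above.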
Re: Drop here docs altogether? (was Re: RFC 111 (v3) Here Docs Terminators (Was Whitespace and Here Docs))
On Thu, 14 Sep 2000 10:52:16 -0700, Nathan Wiger wrote: We already have q//, qq//, and qx// which duplicate their functions far more flexibly. Question: Do we really need here docs? With your above functions, you always need to be able to escape the string end delimiter. Therefore, you will always have to escape backslashes. You don't need to escape backslashes, or anything else, in a single-quoted here-doc. Here-docs are extremely handy if you have to incorporate text from an external file, which perl is supposed to print out verbatim. Their disadvantage is that they'll always end with a newline. -- Bart.
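The difference described here is easy to see in Perl 5. In the sketch below the sample text is made up for illustration: a single-quoted here-doc keeps every backslash as typed, while q() still treats a doubled backslash as an escape.

```perl
use strict;
use warnings;

# Single-quoted here-doc: nothing is interpolated or unescaped.
my $verbatim = <<'END';
C:\temp\new costs $5 at @home \\ done
END

# q() also skips interpolation, but "\\" collapses to a single backslash.
my $quoted = q(C:\temp\new costs $5 at @home \\ done);

print "here-doc: $verbatim";
print "q():      $quoted\n";

# The here-doc always carries a trailing newline; chomp removes it if unwanted.
chomp(my $trimmed = $verbatim);
```

The here-doc result is one backslash and one newline longer than the q() result, which covers both the advantage and the disadvantage mentioned above.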
Re: Conversion of undef() to string user overridable for easy debugging
This reminds me of a related but rather opposite desire I have had more than once: a quotish context that would be otherwise like q() but with some minimal extra typing I could mark a scalar or an array to be expanded as in qq(). I have wanted that also, although I don't remember why just now. (I think have some notes somewhere about it.) I will RFC it if you want. Note that there's prior art here: It's like Lisp's backquote operator. Reminds me of m4's changequote. $command = q$^"PATH=$PATH:^installdir dosomething"; equiv to $command = "PATH=\$PATH:$installdir dosomething"; q$![That'll be $20, !name.]; Hm... no, that wouldn't handle @ or %. qp(^*)[string with scalar ^var, array var, hash *var] (p for prefix) I guess your suggestion would look something like quote("That'll be $20, $title $name", qw(name title));
Re: RFC 111 (v3) Here Docs Terminators (Was Whitespace and Here Docs)
On Thu, Sep 14, 2000 at 11:49:18AM -0700, Glenn Linderman wrote:
> I'm all for solving problems, and this message attempts to specify 3 problems, but it needs more specification. You describe three problems, but it is not clear what the problems are

Since we've been charging back and forth over this ground like a troop of doughboys over No Man's Land for the past month, I figured everyone knew the problem and proposed solutions. Your review accurately lays everything out.

>     { { { { {
>         if( $is_fitting && $is_just ) {
>             die dequote_like('!', <<POEM);
>             !The old lie
>             ! Dulce et decorum est
>             ! Pro patria mori.
>     POEM
>         }       # this } had been omitted
>     } } } } }

Things like this have come up, and to my eyes and fingers it's unacceptable. Some people like the explicit demarcation of the left boundary; I find it ugly and don't like the extra typing. It doesn't win me much over:

    die 'The old lie'.
        ' Dulce et decorum est'.
        ' Pro patria mori.';

I'd prefer if here-docs just DWIM.

So we may want to add Yet Another problem. I forget what number you got up to, but it's basically "You shouldn't have to add anything but whitespace to the here-doc for indenting".

An additional problem with dequote() style solutions is they are not as efficient. <<DOC =~ s/// and the terminator indentation can both be applied at compile time and deparse the whole mess into a simple string (as the prototype does), while the dequote() routine must be run over and over again at run-time. This can get nasty in hot loops.

    #!/usr/bin/perl -w

    use strict;
    use Benchmark;

    sub dequote_like {
        local $_ = shift;
        my ($leader);  # common white space and common leading string
        if (/^\s*(?:([^\w\s]+).*\n)(?:\s*\1.*\n)+$/) {
            $leader = quotemeta($1);
        } else {
            $leader = '';
        }
        s/^\s*$leader//gm;
        return $_;
    }

    my $foo;
    timethese(shift || -3,
              {
               dequote => sub {
                   $foo = dequote_like('!', <<POEM);
                   !The old lie
                   ! Dulce et decorum est
                   ! Pro patria mori.
    POEM
               },
               terminator => sub {
                   use RFC::Prototype::111;
                   $foo = <<"POEM";
                   The old lie
                    Dulce et decorum est
                    Pro patria mori.
                   POEM
               },
              });

    Benchmark: running dequote, terminator, each for at least 3 CPU seconds...
       dequote:  2 wallclock secs ( 3.00 usr +  0.01 sys =  3.01 CPU) @ 39857.81/s (n=119972)
    terminator:  3 wallclock secs ( 3.00 usr +  0.02 sys =  3.02 CPU) @ 268209.93/s (n=809994)

dequote() comes out nearly seven times slower than the terminator approach (which is basically dequote() vs a plain string). So that's another problem to add to the list: "here-docs should be no slower than the equivalent string, indented or otherwise".

> The syntax for <<POEM =~ s/regex/subst/; generally returns 1, and introducing a special case to make it return the string if the left hand side is a here-doc seems to be a pointless inconsistency.

I think it's considered closer to the current trick of doing:

    print ($var = <<POEM) =~ s/regex/subst/;  # or something like that

Another suggestion was <<POEM =~ m/re(ge)x/. The match would be run over each line and $1 used to generate the here-doc. Honestly, I'm not really the one who should be evangelizing this technique.

> but these subs [dequote] work in perl 5 today, so don't really need to be part of the RFC

They most definitely do. If we're going to propose them as a solution to the indented here-doc problem, it would be best to distribute a collection of commonly used ones as a module with perl.

-- 
Michael G Schwern      http://www.pobox.com/~schwern/      [EMAIL PROTECTED]
Just Another Stupid Consultant                      Perl6 Kwalitee Ashuranse
slick and shiny crust
over my hairy anus
constipation sucks
    -- Ken Flagg
Re: RFC 213 (v1) rindex and index should return undef on failure
John Porter wrote:
> Chaim Frenkel wrote:
> > Removing -1 as a valid result, could be a breakage (if someone is doing something weird with a negative result)
>
> What, like using it as an index into a substr? Bad Code is its own reward, my friend.
>
> >     $foo = "flabergasted";
> >     substr($foo, index($foo, 'abc'), 20);   # Returns undef
>
> One should never do this, regardless of what index() returns on failure. Now, if index() threw an exception on failure, you'd be o.k. But I don't think we want that...

That's exactly why it would be nice if index _did_ throw an exception on failure: then you could write code this way, and catch the failures without needing to check return values for the error code case before proceeding with the real case.

-- 
Glenn
=
There are two kinds of people, those who finish what they start,
and so on... -- Robert Byrne
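The exception-throwing behaviour wished for here can be tried in Perl 5 with a small wrapper. The sketch below is an illustration, not a proposal from the thread: index_or_die is a hypothetical name. It also shows what the unchecked form does today, where the -1 failure value is silently read by substr() as "start at the last character".

```perl
use strict;
use warnings;

# Hypothetical wrapper: like index(), but dies instead of returning -1.
sub index_or_die {
    my ($str, $substr, $pos) = @_;
    my $i = index($str, $substr, defined $pos ? $pos : 0);
    die "substring '$substr' not found\n" if $i < 0;
    return $i;
}

my $foo = "flabergasted";

# Unchecked Perl 5 form: index() returns -1, so substr() starts at the end.
my $silent = substr($foo, index($foo, 'abc'), 20);

# Checked form: the success path needs no test of the return value...
my $hit = substr($foo, index_or_die($foo, 'gast'), 20);

# ...and the failure is caught in one place.
my $miss = eval { substr($foo, index_or_die($foo, 'abc'), 20) };
my $err  = $@;

print "silent='$silent' hit='$hit' error=$err";
```

Whether the core index() should behave this way is the open question in the thread; the wrapper only shows that the calling code reads as described.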
Re: Drop here docs altogether? (was Re: RFC 111 (v3) Here Docs Terminators (Was Whitespace and Here Docs))
Nathan Wiger wrote:
> Solves problem #1, indented terminator, except that it adds two newlines (more later).

I never found anything later about these extra newlines... so if this idea has merit, it needs to be finished.

> However, it leaves 2 and 3. Let's try adding in a regexp:
>
>     if( $is_fitting && $is_just ) {
>         (my $mesg = qq/
>         The old lie
>          Dulce et decorum est
>          Pro patria mori.
>         /) =~ s/\s{8}(.*?\n)/$1/g;
>         die $mesg;
>     }

I think $mesg winds up with the value of "1" the way you've coded it. You cured that issue with the RFC 164 syntax for subst, of course; it could be cured without that, but that does require a temp var.

> I think we're trying to jam a lot of stuff into here docs that maybe shouldn't be jammed in

Yes, all we need is to recognize the terminator when embedded in white space on its line, and the rest can be handled with "here doc postprocessing functions", per my somewhat longer reply to Michael Schwern.

I agree with the need for a multiple character termination sequence for easy to write here docs.

-- 
Glenn
=
There are two kinds of people, those who finish what they start,
and so on... -- Robert Byrne
RFC 223 (v1) Objects: C<use invocant> pragma
This and other RFCs are available on the web at http://dev.perl.org/rfc/

=head1 TITLE

Objects: C<use invocant> pragma

=head1 VERSION

  Maintainer: Damian Conway [EMAIL PROTECTED]
  Date: 14 September 2000
  Mailing List: [EMAIL PROTECTED]
  Number: 223
  Version: 1
  Status: Developing

=head1 ABSTRACT

This RFC proposes that, as in Perl 5, the invocant of a method should normally be available to the method as $_[0], but that it can be automatically stripped from @_ and accessed via either a subroutine or a variable, using the C<use invocant> pragma.

=head1 DESCRIPTION

It is proposed that Perl 6 methods continue to receive their invocant (i.e. a reference to the object on which they were invoked) via $_[0]. This mechanism has served Perl 5 well and preserving it as the default will greatly simplify migration of OO Perl code.

It is, however, tedious that all Perl methods are thereby forced to start with:

    sub method {
        my ($self, @args) = @_;
        ...
    }

or its various moral equivalents.

The problem may be eased somewhat in Perl 6, if methods can have proper parameter lists:

    sub method ($self, @args) {
        ...
    }

and this may, indeed, be sufficient.

However, it might also be useful if the invocant could be passed to a method via an entirely separate mechanism -- either a magic variable, or a magic subroutine.

The problem is: how to avoid the religious wars over which of these two alternative mechanisms is better, and what the chosen mechanism should be called.

It is therefore proposed that Perl 6 provide a pragma -- C<invocant> -- that takes the name of either a scalar variable or a subroutine and causes that variable or subroutine to evaluate to the invocant of any method within the lexical scope of the pragma. Furthermore, where the pragma is in effect, methods would I<not> receive their invocant as $_[0].
For example:

    use invocant '$ME';

    sub method {
        $ME->{calls}++;
        $ME->SUPER::method(@_);
    }

or:

    use invocant 'self';

    sub method {
        self->{calls}++;
        self->SUPER::method(@_);
    }

Note that there is no need to C<shift @_> before passing it to the ancestral method, since the invocant is not prepended to the argument list in the first place.

=head1 MIGRATION ISSUES

None. That's the point.

=head1 IMPLEMENTATION

Pragma adds a trivial wrapper around affected methods. For example:

    use invocant '$ME';

    sub method {
        $ME->{calls}++;
        $ME->SUPER::method(@_);
    }

becomes:

    sub method {
        my $ME = shift;
        do {
            $ME->{calls}++;
            $ME->SUPER::method(@_);
        }
    }

whilst:

    use invocant 'self';

    sub method {
        self->{calls}++;
        self->SUPER::method(@_);
    }

becomes:

    sub method {
        local *self = do { my $_invocant = shift; sub { $_invocant } };
        do {
            self->{calls}++;
            self->SUPER::method(@_);
        }
    }

=head1 REFERENCES

RFC 128 : Subroutines: Extend subroutine contexts to include name parameters and lazy arguments

RFC 137 : Overview: Perl OO should I<not> be fundamentally changed.

RFC 152 : Replace $self in @_ with self() builtin (not $ME)
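The two wrapper forms shown under IMPLEMENTATION can be tried by hand in Perl 5. The following is only a sketch under stated assumptions: the classes Base, WithVar and WithSub are hypothetical names invented for the demonstration, and self() is written with explicit parentheses so that it parses unambiguously as a subroutine call under strict.

```perl
use strict;
use warnings;

package Base;    # hypothetical ancestor class for the demonstration
sub new    { my ($class) = @_; return bless { calls => 0 }, $class }
sub method { my ($self, @args) = @_; return 'Base::method(' . join(',', @args) . ')' }

package WithVar;
our @ISA = ('Base');

# Hand-written equivalent of a method under:  use invocant '$ME';
sub method {
    my $ME = shift;                 # invocant is removed from @_
    do {
        $ME->{calls}++;
        $ME->SUPER::method(@_);     # no need to shift @_ again
    }
}

package WithSub;
our @ISA = ('Base');

# Hand-written equivalent of a method under:  use invocant 'self';
sub method {
    # Install a temporary self() that returns the invocant.
    local *self = do { my $_invocant = shift; sub { $_invocant } };
    do {
        self()->{calls}++;
        self()->SUPER::method(@_);
    }
}

package main;

for my $class (qw(WithVar WithSub)) {
    my $obj = $class->new;
    print "$class: ", $obj->method(1, 2), " calls=$obj->{calls}\n";
}
```

In both forms the ancestral method receives only the real arguments after its own invocant, which matches the note in the RFC that no extra C<shift @_> is required.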