Re: RFC 213 (v1) rindex and index should return undef on failure
Chaim Frenkel wrote: "GL" == Glenn Linderman [EMAIL PROTECTED] writes: Neither is EOF on a file, or working with an empty list. Adding all these exceptions for non-exceptional and quite common scenerios is bothersome. I don't know where this idea of a _normal_ situation is considered exceptional. The idea of a _normal_ situation being considered exceptional is raised when the code written inappropriately handles some of the normal return values. The original example of bad code John Porter wrote: $foo = "flabergasted"; substr($foo, index($foo, 'abc'), 20); # Returns undef contains errors. Clearly the current index, or even index modified to return undef, doesn't produce the desired results. The inappropriate value of -1 (or undef) passed as the 2nd parameter to substr will produce erroneous results. In order to make the above line useful, index would need to alter the _normal_ flow of control by throwing an exception. However, this is a different index function than we have today. The author of lines like the above should use a wrapper around index that throws exceptions for the normal cases that are not desirable for his application. Then the above terse code could be usefully employed. The real question boils down to who decides what is normal and what is exceptional. This is the conundrum of programming today. I don't want to sprinkle my code with try/catch just to handle a 'normal' situation. I don't want you too, either. (1) RFC 119 only requires the "catch", not the "try". This solves half the problem :) (2) I want you to use the appropriate code in the appropriate ways, so that your catch blocks only catch the abnormal (exceptional) situations. And for abnormal situations, the default catch handlers might serve you just fine, solving the other half of the problem, much of the time. GL I agree with your concern that exception handling is (generally) GL more expensive than error codes. 
However, I see it as a good
GL expenditure of the fast CPUs of today, as a tradeoff towards
GL reliable processing. And maybe in Perl6 exception handling could
GL be less expensive than it is (by comparison to error codes) in
GL other languages? That's a question for the internals guys, of
GL course.

A cycle here a cycle there, and soon the program becomes bloatware.

Right. Different people have different opinions about this, too: where to spend those cycles most usefully. My preference (but only a preference) is to see cycles expended toward making simple things more reliable. Smart programmers can solve the complex problems, but in doing so often overlook or assume away error handling. Making error handling noisier (via exceptions) but also moving it out of the main code path (via catch blocks) produces a result that pleases me, and avoids cluttering the main path of code with the handling of abnormal things. The cost then is paid only when something abnormal happens, not for every normal thing that might happen but wasn't expected here.

--
Glenn
=
Even if you're on the right track, you'll get run over if you just sit there. -- Will Rogers
Re: RFC 213 (v1) rindex and index should return undef on failure
Chaim Frenkel wrote:

What about a hypothetical, use tristate. This would give undef some extra special powers.

There is a difference between "undefined" and "unknown". SQL NULL, and the resultant tristate operators used in SQL, are specifically based on NULL representing the "unknown" value. Perl undefined is a different concept--that of an uninitialized variable. This is proven from its earliest versions, where the value is coerced to 0 or '' (specific values) when used (without warnings on). Some Perl programs and modules (including DBI) attempt to correlate NULL and undefined, for lack of a better match of concepts (Perl is missing the concept of NULL, SQL is missing the concept of undefined, but that doesn't correctly imply that the concepts each language _does_ have are correlated, or should be).

If you want NULL, RFC it as a new concept. DBI could then be ported to Perl 6, and the power of using NULL in its operators (perhaps together with transactional variables) could make Perl an extremely powerful database manipulation language, complementary to and augmenting SQL in ways no other language currently does. Do not attempt to further the inappropriate correlation between undefined and NULL.

Any OO language with full operator overloading could write objects/operators that behave like SQL values, and implement tristate logic for those objects, just like SQL does. Perhaps you should attempt that, and RFC the failures. I would recommend, however, that you not attempt to use the concept of undefined to implement the concept of NULL, at least not visibly...

--
Glenn
=
Even if you're on the right track, you'll get run over if you just sit there. -- Will Rogers
Re: RFC 213 (v1) rindex and index should return undef on failure
Glenn Linderman wrote:

The idea of a _normal_ situation being considered exceptional is raised when the code written inappropriately handles some of the normal return values.

You would throw exceptions at the problem of bad coding practice. I think it's better to correct the bad coding practice.

    $foo = "flabergasted";
    substr($foo, index($foo, 'abc'), 20); # Returns undef

contains errors. Clearly the current index, or even index modified to return undef, doesn't produce the desired results. The inappropriate value of -1 (or undef) passed as the 2nd parameter to substr will produce erroneous results.

Right. But this is not so much an argument for making index throw, as for encouraging programmers to write good code, i.e.

    $foo = "flabergasted";
    if ( defined( my $i = index($foo, 'abc') ) ) {
        substr( $foo, $i, 20 );
    }
    else {
        # do what you want with this condition.
    }

The whole point, IMHO, is that index() should return a value which cannot be used as an index. -1 clearly does not meet this criterion. If it returns undef, that can be used since it will get coerced to 0, but at least it will elicit a warning from perl. Perhaps under some kind of very-strict it will elicit an error instead.

--
John Porter
We're building the house of the future together.
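[Editorial sketch] The difference between the unguarded and guarded forms can be tried in today's Perl 5, where index() still returns -1 on a miss. The helper name slice_from_match is hypothetical and is not part of the RFC or of anyone's posted code; it only illustrates checking the offset before substr sees it.

```perl
use strict;
use warnings;

# Return up to $len characters starting at the first $needle in $text,
# or undef when $needle is absent. (Hypothetical helper name.)
sub slice_from_match {
    my ($text, $needle, $len) = @_;
    my $i = index($text, $needle);
    return undef if $i < 0;          # Perl 5 signals "not found" with -1
    return substr($text, $i, $len);
}

my $foo = "flabergasted";

# Unguarded: -1 is a valid offset, so substr quietly returns the last character.
print substr($foo, index($foo, 'abc'), 20), "\n";   # prints "d"

# Guarded: the miss is detected before substr is ever called.
my $slice = slice_from_match($foo, 'abc', 20);
print defined($slice) ? "found: $slice\n" : "not found\n";
```

The unguarded call produces no warning at all, which is the hazard being argued about in this message.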
Re: RFC 213 (v1) rindex and index should return undef on failure
Glenn Linderman wrote: There is a difference between "undefined" and "unknown". Can you explain this difference, briefly? If not, could you give me something off-list? Thanks, John Porter
Re: RFC 213 (v1) rindex and index should return undef on failure
John Porter wrote: Glenn Linderman wrote: The idea of a _normal_ situation being considered exceptional is raised when the code written inappropriately handles some of the normal return values. You would throw exceptions at the problem of bad coding practice. Not the goal. There are, no doubt, many possible interpretations of why your example line of code was bad, all couched in different justifications for using it. I think it's better to correct the bad coding practice. That was actually my goal too. $foo = "flabergasted"; substr($foo, index($foo, 'abc'), 20); # Returns undef contains errors. Clearly the current index, or even index modified to return undef, doesn't produce the desired results. The inappropriate value of -1 (or undef) passed as the 2nd parameter to substr will produce erroneous results. Right. But this is not so much an argument for making index throw, as for encouraging programmers to write good code, i.e. $foo = "flabergasted"; if ( defined my $i = index($foo, 'abc') ) { substr( $foo, $i, 20 ); } else { # do what you want with this condition. } Clearly with the $foo = "flabergasted" ; line in place, the whole example could be replaced with substr ($foo, -1, 20 ); or even undef; So removing that line and allowing for variability in the possible values for $foo, the question is whether the programmer so strongly believes that 'abc' will be found that he wishes to return it and the next 17 characters, and take extreme risk with his code if it is not found, or whether there is really a useful alternative action to be done in the "normal" case that 'abc' is not found in $foo. 
If the intention is that 'abc' is really expected to be found in all $foo, and is just the demarcation of the next 17 useful characters, then your "better code" costs several extra lines, and the exception which wouldn't be taken most of the time, resulting in little additional cost or complexity, would allow the single line solution you originally proposed, which is nice and concise for the case where not finding 'abc' is unexpected.

Only you, of course, can supply the intentions behind your example, but by omitting the "do what you want with this condition" part, you clearly left it up for grabs to be interpreted as abnormal.

The whole point, IMHO, is that index() should return a value which cannot be used as an index. -1 clearly does not meet this criterion.

I totally agree that having index return undef on failure to find the string would be an improvement to index. Then it could be wrapped by Fatal.pm, or Throw.pm, etc., and we could all have and eat our cake.

--
Glenn
=
Even if you're on the right track, you'll get run over if you just sit there. -- Will Rogers
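[Editorial sketch] The "wrapper around index that throws" can be approximated in Perl 5 with die and eval. This is only an illustration under stated assumptions: index_or_die and next_twenty are hypothetical names, and eval stands in for the catch blocks of RFC 119, whose syntax does not exist in Perl 5.

```perl
use strict;
use warnings;

# Hypothetical wrapper: behaves like index(), but dies when the needle is absent.
sub index_or_die {
    my ($text, $needle) = @_;
    my $i = index($text, $needle);
    die "substring '$needle' not found\n" if $i < 0;
    return $i;
}

# The terse one-liner from the thread, now safe because a miss never reaches substr.
sub next_twenty {
    my ($text) = @_;
    return substr($text, index_or_die($text, 'abc'), 20);
}

for my $foo ("xx abc and the rest", "flabergasted") {
    my $got = eval { next_twenty($foo) };
    if (defined $got) { print "got: $got\n" }
    else              { print "handled: $@" }   # abnormal case, off the main path
}
```

The main path stays a single expression, and the cost of the exception is paid only when 'abc' is actually missing.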
Re: RFC 213 (v1) rindex and index should return undef on failure
At this point, I think the whole thread on functions throwing exceptions should either be: (a) turned into an RFC or (b) abandoned. This discussion is going around and around like a piece of toilet paper in a weakly-flushing toilet. Nat
Re: RFC 213 (v1) rindex and index should return undef on failure
"GL" == Glenn Linderman [EMAIL PROTECTED] writes: GL There is a difference between "undefined" and "unknown". GL Perl undefined is a different concept--that of an uninitialized GL variable. This is proven from its earliest versions where the GL value is coerced to 0 or '' (specific values) when used (without GL warnings on). Sorry, as far as I'm concerned $foo = undef and select @foo = NULL Are both initialized. And what do you consider sub foo { ; return } $status = foo; Uninitialized? Very clearly initialized. And lets look at the name and functions defined($foo) undef($foo) Both seem clearly to mean _undefined_ or perhaps unknown or NULL The use of undef meaning 0 or '' is quite useful. But under some programing styles having tristate logic and NULL propogation would make some programming task a bit more straightforward. chaim -- Chaim FrenkelNonlinear Knowledge, Inc. [EMAIL PROTECTED] +1-718-236-0183
Re: RFC 213 (v1) rindex and index should return undef on failure
"JP" == John Porter [EMAIL PROTECTED] writes: JP Chaim Frenkel wrote: Removing -1 as a valid result, could be a breakage (if someone is doing something weird with a negative result) JP What, like using it as an index into a substr? JP Bad Code is its own reward, my friend. Is that a for or an against. $foo = "flabergasted"; substr($foo, index($foo, 'abc'), 20);# Returns undef JP One should never do this, regardless of what index() returns on JP failure. Now, if index() threw an exception on failture, you'd JP be o.k. But I don't think we want that... I do this _all_ the time. (Well in my SQL.) The correct translation for untranslatable items is NULL (or undef in perl-speak). Yes, sometimes it isn't, for those extra coding is required. Having substr (or other functions) generate an undef is a quite reasonable way to handle this scenerio. This isn't any different than$bar = $hash{$foo} wher $foo doesn't exist. If you must have a value, then check for it. If an undef is acceptable then check for that. I would find checking the final result somehow much clearer to read. chaim -- Chaim FrenkelNonlinear Knowledge, Inc. [EMAIL PROTECTED] +1-718-236-0183
Re: RFC 213 (v1) rindex and index should return undef on failure
"GL" == Glenn Linderman [EMAIL PROTECTED] writes: GL That's exactly why it would be nice if index _did_ throw an exception on GL failure, then you could write code this way, and catch the failures GL without needing to check return values for the error code case before GL proceeding with the real case. But you would still have to catch the exception. Not a nice thing to terminate the program just because an expected mismatch occured. Not finding something is not exceptional. Neither is EOF on a file, or working with an empty list. Adding all these exceptions for non-exceptional and quite common scenerios is bothersome. chaim -- Chaim FrenkelNonlinear Knowledge, Inc. [EMAIL PROTECTED] +1-718-236-0183
Re: RFC 213 (v1) rindex and index should return undef on failure
Chaim Frenkel wrote: But you would still have to catch the exception. Not a nice thing to terminate the program just because an expected mismatch occured. Sometimes it is, sometimes it isn't. Not finding something is not exceptional. Sometimes it is, sometimes it isn't. Why were you looking for it, if you didn't expect to find it? Neither is EOF on a file, or working with an empty list. Adding all these exceptions for non-exceptional and quite common scenerios is bothersome. This is truly one of the conundrums of programming. Let me use sequential file reading as the example scenario. The "normal" thing to do in such a program is to read the next record, and process it somehow. The processing sometimes gets quite involved, of course, but let's not dwell on that here. So we have the following loop, not fully structured programming, and no error handling, exception handling, or any such thing. readloop: { $length = read ( FILE, $buffer, $want ); if ( $lawyer || $bill_collector ) { due_process ( $buffer, $length ); } else { do_process ( $buffer, $length ); } goto readloop; } Now the loop works fine, but somehow, we need to get out of the loop when we encounter an error, or end-of-file. Many programs, including many perl programs, actually do this incorrectly, but it is "good enough" most of the time: they treat _all_ errors during file reading as end-of-file. This is _not_ good enough all of the time, but we're not writing airplane controls most of the time either, but we should stay aware of the issue, even though we usually ignore it. So there are alternatives: (1) we could check $length for zero, and decide that there was nothing left to read. (2) we could modify the goto to have a modifier: "if ! eof ( FILE )". (3) we could check $! just after the call to read to determine if there was an error [should do this in any case], and if it was EOF, could branch out of the loop. I posit that the "correct" solution should be (2)... 
that you _should not_ as a matter of course, interpret error codes as a way of choosing the control flow of the program. Error codes (and exceptions, when used to report errors) should alter the control flow of the program in the sense of reporting the error so that appropriate, but unusual, action can be taken. So we should check $length, and we should check $!, but finding $length < $want, and $! != 0, should both be treated as out-of-the-ordinary conditions-- errors should not happen in well designed programs.

Of course, most people would write the loop

    while ( $length = read ( ... )) { ... }

or, for line oriented stuff (rather than record oriented)

    while ( <FILE> ) { ... }

Such techniques build in a check for errors, but inappropriately mistreat _any_ error as EOF. To avoid mistreating errors that way, such a loop should be followed by

    if ( $! != 0 ) {
        # handle error #
    }

Do you often see that coded? Admittedly, non-EOF disk file errors are rare in these days of reliable storage. But not all files are disk files, and not all errors are EOF. So there is quite a bit of sloppiness in most code, regarding error handling. That's not a nice thing either.

So if (optionally, pragma Throw, similar to pragma Fatal, see RFC 119) all errors could be thrown as exceptions, it would allow programmers to force themselves to be less sloppy about error handling when (like airplane controls) they really should be precise. And actually, once you get the habit of coding that way, it really isn't even any harder.

Of the three choices above, none are particularly hard to code. (2) is not harder than (1), or harder than (3). In fact, it might be easier. But of course (2) doesn't add error checking, but neither does (1) or (3)... they add "if something goes wrong, then pretend it was EOF and exit the loop". Clearly, if DASD were less reliable, we'd see fewer programs written those ways, and we'd get better diagnostics when something goes wrong, rather than just less output from our programs.
To modify the above loop to do error handling properly would take several additional lines of code, no matter what technique was used to code it. Using the syntax of RFC 119, and assuming that all errors turn into thrown exceptions, you can separate the normal logic flow and the error logic flow as follows: while ( ! eof ( FILE )) { $length = read ( FILE, $buffer, $want ); if ( $lawyer || $bill_collector ) { due_process ( $buffer, $length ); } else { do_process ( $buffer, $length ); } } catch FileError { # handle error # }; (N.B. I didn't write the several lines, just used the placeholder: # handle error #) So really, the point is that it is nice to precheck the conditions and avoid error handling, whether it be via error codes, or via exception handling. File handling is pretty obvious: usually you get data, not EOF, so it is pretty obvious where the exception handling should go. But
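[Editorial sketch] The separation of end-of-file from genuine read errors can be expressed in Perl 5 using the documented return values of read(): the number of bytes read, 0 at end-of-file, and undef on error. The function name read_records is hypothetical, and die with eval stands in for the RFC 119 catch syntax, which Perl 5 does not have.

```perl
use strict;
use warnings;

# Read fixed-size records from $fh, calling $process on each one.
# Returns the record count; dies on a read error instead of treating it as EOF.
sub read_records {
    my ($fh, $want, $process) = @_;
    my $count = 0;
    while (1) {
        my $length = read($fh, my $buffer, $want);
        die "read failed: $!\n" unless defined $length;  # genuine error
        last if $length == 0;                            # genuine end-of-file
        $process->($buffer, $length);
        $count++;
    }
    return $count;
}

# Usage with an in-memory filehandle so the sketch runs anywhere.
my $data = "abcdefgh";
open(my $fh, '<', \$data) or die "open: $!";
my @seen;
my $n = eval { read_records($fh, 3, sub { push @seen, $_[0] }) };
if (defined $n) { print "$n records: @seen\n" }   # 3 records: abc def gh
else            { print "error path: $@" }
close $fh;
```

The loop body contains only the normal case; a short final record ("gh") is still processed, and an error can no longer be mistaken for the end of the file.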
Re: RFC 213 (v1) rindex and index should return undef on failure
Chaim Frenkel writes: I would like to have an undef returned. Ah, I see. You want subroutines to return undef if they're given it for any of their arguments. That'd break the lazy programmer practice of passing undef expecting it to become "" or 0. They don't have warnings on, of course. Nat
Re: RFC 213 (v1) rindex and index should return undef on failure
Chaim Frenkel wrote:

Removing -1 as a valid result, could be a breakage (if someone is doing something weird with a negative result)

What, like using it as an index into a substr? Bad Code is its own reward, my friend.

    $foo = "flabergasted";
    substr($foo, index($foo, 'abc'), 20); # Returns undef

One should never do this, regardless of what index() returns on failure. Now, if index() threw an exception on failure, you'd be o.k. But I don't think we want that...

--
John Porter
We're building the house of the future together.
Re: RFC 213 (v1) rindex and index should return undef on failure
John Porter wrote:

Chaim Frenkel wrote:

Removing -1 as a valid result, could be a breakage (if someone is doing something weird with a negative result)

What, like using it as an index into a substr? Bad Code is its own reward, my friend.

    $foo = "flabergasted";
    substr($foo, index($foo, 'abc'), 20); # Returns undef

One should never do this, regardless of what index() returns on failure. Now, if index() threw an exception on failure, you'd be o.k. But I don't think we want that...

That's exactly why it would be nice if index _did_ throw an exception on failure, then you could write code this way, and catch the failures without needing to check return values for the error code case before proceeding with the real case.

--
Glenn
=
There are two kinds of people, those who finish what they start, and so on... -- Robert Byrne
Re: RFC 213 (v1) rindex and index should return undef on failure
"NT" == Nathan Torkington [EMAIL PROTECTED] writes: NT Chaim Frenkel writes: Somehow I find if (40 == ($foo = substr($bar, index($bar, 'xyz' { } NT I don't understand your hypothetical code. substr() returns the NT substring of $bar from the position retutned by index, onward. NT When would this be 40, if index is going to return the position NT of 'xyz'? NT I guess I can't understand your idea of safe failure until I NT see an example, and this doesn't seem to be it. Whoops, I was tired. $to = "010 020 030 047"; $from="AAA BBB CCC DDD"; print substr($to,index($from,"BBB"),3); print substr($to,index($from,"XXX"),3); __END__ 020 7 I would like to have an undef returned. (Now it would have been interesting if it returned "047", then having index return an undef and then having substr() propgate the undef would make things workable.) If you are familiar with Sybase's version of sql. Invalid arguments to various functions return NULL. chaim -- Chaim FrenkelNonlinear Knowledge, Inc. [EMAIL PROTECTED] +1-718-236-0183
RFC 213 (v1) rindex and index should return undef on failure
This and other RFCs are available on the web at
  http://dev.perl.org/rfc/

=head1 TITLE

rindex and index should return undef on failure

=head1 VERSION

  Maintainer: Nathan Torkington [EMAIL PROTECTED]
  Date: Sep 12 2000
  Mailing List: [EMAIL PROTECTED]
  Number: 213
  Version: 1
  Status: Developing

=head1 ABSTRACT

index() and rindex() should return C<undef> if their second argument is not a substring of their first argument.

=head1 DESCRIPTION

In perl5, index() and rindex() return -1 if the substring isn't found. This seems out of step with the rest of Perl's functions, which return C<undef> on error. I propose changing index() and rindex() to return C<undef> if the substring isn't found.

This would also cause warnings to be issued when programmers use the results of index() or rindex() assuming the substring was found.

This suggestion doesn't rely on RFC 53, "Built-ins: Merge and generalize C<index> and C<rindex>", and works regardless of whether 53 is accepted or not.

=head1 IMPLEMENTATION

The perl526 translator could turn index($a,$b) calls into

  do { my $tmp = index($a,$b); defined($tmp) ? $tmp : -1 }

=head1 REFERENCES

RFC 53: Built-ins: Merge and generalize C<index> and C<rindex>

perlfunc manpage for information on index() and rindex()
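[Editorial sketch] The translator idea in the IMPLEMENTATION section can be modelled in Perl 5 by pairing a stand-in for the proposed index() with the compatibility expression the RFC gives. The names index6 and index5_compat are hypothetical, and the stand-in ignores index()'s optional third (position) argument for brevity.

```perl
use strict;
use warnings;

# Stand-in for the proposed behaviour: undef on a miss. (Hypothetical name.)
sub index6 {
    my ($text, $needle) = @_;
    my $i = index($text, $needle);
    return $i < 0 ? undef : $i;
}

# What translated Perl 5 code would effectively see: the old -1 convention,
# restored with the same expression the RFC proposes.
sub index5_compat {
    my ($text, $needle) = @_;
    my $tmp = index6($text, $needle);
    return defined($tmp) ? $tmp : -1;
}

print index5_compat("flabergasted", "gast"), "\n";   # 6
print index5_compat("flabergasted", "abc"),  "\n";   # -1
```

Existing code that compares the result against -1 keeps working through the shim, while new code can test the result with defined().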
Re: RFC 213 (v1) rindex and index should return undef on failure
Speaking of failure-mode, all syscalls should return false on failure, not ever -1. Right now, wait and waitpid work the other way. They should go the undef vs "0 but true" route that ioctl, fcntl, and sysread take. --tom
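[Editorial sketch] The undef versus "0 but true" convention mentioned here can be shown with a small adapter. The function status_or_undef is a hypothetical name that maps a "-1 on failure" result onto that convention; it does not call wait or waitpid itself.

```perl
use strict;
use warnings;

# Map a "-1 on failure" return value onto the undef / "0 but true" convention.
sub status_or_undef {
    my ($rv) = @_;
    return undef if $rv == -1;               # failure: false and undefined
    return $rv == 0 ? "0 but true" : $rv;    # a genuine zero stays true
}

my $zero = status_or_undef(0);
print "success\n" if $zero;                  # true in boolean context
print "value: ", $zero + 0, "\n";            # numifies to 0, with no warning
print "failure\n" unless defined(status_or_undef(-1));
```

The string "0 but true" is special-cased by Perl so that using it as a number does not trigger the usual "isn't numeric" warning, which is what makes a plain truth test on the return value safe.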
Re: RFC 213 (v1) rindex and index should return undef on failure
"PRL" == Perl6 RFC Librarian [EMAIL PROTECTED] writes: PRL In perl5, index() and rindex() return -1 if the PRL substring isn't found. This seems out of step with PRL the rest of Perl's functions, which return Cundef PRL on error. I propose changing index() and rindex() PRL to return Cundef if the substring isn't found. PRL This would also cause warnings to be issued when PRL programmers use the results of index() or rindex() PRL assuming the substring was found. Removing -1 as a valid result, could be a breakage (if someone is doing something weird with a negative result) Would it be reasonable to ask that passing undef into the offset or start of substr have substr return an undef? This would break the undef == 0 under normal circumstance, but it would prevent an error from propogating. $foo = "flabergasted"; substr($foo, index($foo, 'abc'), 20); # Returns undef If this is too much breakage what about only if it is the argument? $foo = "flabergasted"; $x = index($foo, 'abc'); substr($foo, $x, 20); # starts from the end chaim -- Chaim FrenkelNonlinear Knowledge, Inc. [EMAIL PROTECTED] +1-718-236-0183
Re: RFC 213 (v1) rindex and index should return undef on failure
Chaim Frenkel writes: Removing -1 as a valid result, could be a breakage (if someone is doing something weird with a negative result) I don't care, so long as the perl526 translator can wrap perl6's index/rindex. And I gave sample code for it to do that. Would it be reasonable to ask that passing undef into the offset or start of substr have substr return an undef? Wouldn't you get a warning anyway, if you were treating undef like a number? Nat
Re: RFC 213 (v1) rindex and index should return undef on failure
"NT" == Nathan Torkington [EMAIL PROTECTED] writes: Would it be reasonable to ask that passing undef into the offset or start of substr have substr return an undef? NT Wouldn't you get a warning anyway, if you were treating undef like NT a number? Aha, but I don't want a warning, I want the code to 'fail' reasonably. Somehow I find if (40 == ($foo = substr($bar, index($bar, 'xyz' { } much nicer than if (defined ($offset = index($bar, 'xyz')) (40 == substr($bar, $offset))) { } I use this style of safe failure when working in SQL. chaim -- Chaim FrenkelNonlinear Knowledge, Inc. [EMAIL PROTECTED] +1-718-236-0183
Re: RFC 213 (v1) rindex and index should return undef on failure
Chaim Frenkel writes:

Somehow I find

    if (40 == ($foo = substr($bar, index($bar, 'xyz')))) { }

I don't understand your hypothetical code. substr() returns the substring of $bar from the position returned by index, onward. When would this be 40, if index is going to return the position of 'xyz'?

I guess I can't understand your idea of safe failure until I see an example, and this doesn't seem to be it.

Nat