Re: RFC 213 (v1) rindex and index should return undef on failure

2000-09-18 Thread Glenn Linderman

Chaim Frenkel wrote:

  "GL" == Glenn Linderman [EMAIL PROTECTED] writes:

  Neither is EOF on a file, or working with an empty list. Adding all these
  exceptions for non-exceptional and quite common scenarios is bothersome.

 I don't know where this idea of a _normal_ situation is considered
 exceptional.

The idea of a _normal_ situation being considered exceptional is raised when the
code written inappropriately handles some of the normal return values.

The original example of bad code John Porter wrote:

$foo = "flabergasted";
substr($foo, index($foo, 'abc'), 20);   # Returns undef

contains errors.  Clearly the current index, or even index modified to return
undef, doesn't produce the desired results.  The inappropriate value of -1 (or
undef) passed as the 2nd parameter to substr will produce erroneous results.

In order to make the above line useful, index would need to alter the _normal_
flow of control by throwing an exception.  However, this is a different index
function than we have today.  The author of lines like the above should use a
wrapper around index that throws exceptions for the normal cases that are not
desirable for his application.  Then the above terse code could be usefully
employed.
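
A minimal sketch of such a wrapper, assuming Perl 5's index() (which returns -1 on a miss). The name index_or_die is hypothetical, not a core function, and die/eval stands in for whatever exception mechanism Perl 6 ends up with:

```perl
use strict;
use warnings;

# Hypothetical helper (not a core function): behaves like index(), but
# dies when the substring is absent instead of returning -1.
sub index_or_die {
    my ($string, $substring) = @_;
    my $pos = index($string, $substring);
    die "substring '$substring' not found\n" if $pos < 0;
    return $pos;
}

my $foo = "flabergasted";

# The terse one-liner is now safe: a miss aborts the expression
# before substr() ever sees a bogus offset.
my $piece = eval { substr($foo, index_or_die($foo, 'abc'), 20) };
print defined $piece ? "got: $piece\n" : "miss: $@";
```

Code that treats a miss as normal keeps calling plain index(); only code that treats a miss as abnormal opts into the wrapper.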

The real question boils down to who decides what is normal and what is
exceptional.  This is the conundrum of programming today.

 I don't want to sprinkle my code with try/catch just to handle a
 'normal' situation.

I don't want you to, either.

(1) RFC 119 only requires the "catch", not the "try".  This solves half the
problem :)

(2) I want you to use the appropriate code in the appropriate ways, so that your
catch blocks only catch the abnormal (exceptional) situations.  And for abnormal
situations, the default catch handlers might serve you just fine, solving the
other half of the problem, much of the time.

 GL I agree with your concern that exception handling is (generally)
 GL more expensive than error codes.  However, I see it as a good
 GL expenditure of the fast CPUs of today, as a tradeoff towards
 GL reliable processing.  And maybe in Perl6 exception handling could
 GL be less expensive than it is (by comparison to error codes) in
 GL other languages?  That's a question for the internals guys, of
 GL course.

 A cycle here a cycle there, and soon the program becomes bloatware.

Right.  Different people have different opinions about where to spend those
cycles most usefully.  My preference (but only a preference) is to see cycles
expended toward making simple things more reliable.  Smart programmers can
solve the complex problems, but in doing so they often overlook or assume away
error handling.  Making error handling noisier (via exceptions) while also
moving it out of the main code path (via catch blocks) produces a result that
pleases me, and avoids cluttering the main path of code with the handling of
abnormal things.  The cost is then paid only when something abnormal happens,
not for every normal thing that might happen but wasn't expected here.

--
Glenn
=
Even if you're on the right track,
you'll get run over if you just sit there.
   -- Will Rogers






Re: RFC 213 (v1) rindex and index should return undef on failure

2000-09-18 Thread Glenn Linderman

Chaim Frenkel wrote:

 What about a hypothetical, use tristate. This would give undef some
 extra special powers.

There is a difference between "undefined" and "unknown".

SQL NULL, and the resultant tristate operators used in SQL, are specifically
based on NULL representing the "unknown" value.

Perl undefined is a different concept--that of an uninitialized variable.  This is
proven from its earliest versions where the value is coerced to 0 or '' (specific
values) when used (without warnings on).

Some Perl programs & modules (including DBI) attempt to correlate NULL and
undefined, for lack of a better match of concepts (Perl is missing the concept of
NULL, SQL is missing the concept of undefined, but that doesn't correctly imply
that the concepts each language _does_ have are correlated, or should be).

If you want NULL, RFC it as a new concept.  DBI could then be ported to Perl 6,
and the power of using NULL in its operators (perhaps together with transactional
variables) could make Perl an extremely powerful database manipulation language
and would make the language complementary to and augmenting SQL in ways no other
language currently does.

Do not attempt to further the inappropriate correlation between undefined and
NULL.

Any OO language with full operator overloading could write objects/operators that
behave like SQL values, and implement tristate logic for those objects, just like
SQL does.  Perhaps you should attempt that, and RFC the failures.  I would
recommend, however, that you not attempt to use the concept of undefined to
implement the concept of NULL, at least not visibly...
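
A rough sketch of that idea in Perl 5, using the overload pragma. The Tri package and its FALSE/NULL/TRUE constructors are hypothetical names invented for illustration. Since Perl's && and || cannot be overloaded, the sketch uses &, | and ! instead. NULL is a distinct object here, deliberately not undef:

```perl
package Tri;
use strict;
use warnings;
use List::Util qw(min max);

# Three-valued (Kleene) logic: rank 0 = FALSE, 1 = NULL (unknown), 2 = TRUE.
# AND is the minimum rank, OR is the maximum rank, NOT mirrors around NULL.
use overload
    '&'  => sub { Tri->from_rank( min($_[0]{rank}, $_[1]{rank}) ) },
    '|'  => sub { Tri->from_rank( max($_[0]{rank}, $_[1]{rank}) ) },
    '!'  => sub { Tri->from_rank( 2 - $_[0]{rank} ) },
    '""' => sub { ('FALSE', 'NULL', 'TRUE')[ $_[0]{rank} ] };

sub from_rank {
    my ($class, $rank) = @_;
    return bless { rank => $rank }, $class;
}

sub FALSE { __PACKAGE__->from_rank(0) }
sub NULL  { __PACKAGE__->from_rank(1) }
sub TRUE  { __PACKAGE__->from_rank(2) }

package main;

my $and = Tri::TRUE() & Tri::NULL();   # unknown stays unknown
my $or  = Tri::TRUE() | Tri::NULL();   # TRUE dominates OR
print "TRUE & NULL = $and\n";
print "TRUE | NULL = $or\n";
```

The sketch only handles operands that are both Tri objects; mixing in plain Perl scalars would need extra checks.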

--
Glenn
=
Even if you're on the right track,
you'll get run over if you just sit there.
   -- Will Rogers





Re: RFC 213 (v1) rindex and index should return undef on failure

2000-09-18 Thread John Porter

Glenn Linderman wrote:
 
 The idea of a _normal_ situation being considered exceptional is raised when the
 code written inappropriately handles some of the normal return values.

You would throw exceptions at the problem of bad coding practice.  
I think it's better to correct the bad coding practice.



 $foo = "flabergasted";
 substr($foo, index($foo, 'abc'), 20);   # Returns undef
 
 contains errors.  Clearly the current index, or even index modified to return
 undef, doesn't produce the desired results.  The inappropriate value of -1 (or
 undef) passed as the 2nd parameter to substr will produce erroneous results.

Right. But this is not so much an argument for making index throw, as for 
encouraging programmers to write good code, i.e.

$foo = "flabergasted";
if ( defined( my $i = index($foo, 'abc') ) ) {
    substr( $foo, $i, 20 );
}
else {
    # do what you want with this condition.
}

The whole point, IMHO, is that index() should return a value which cannot
be used as an index.  -1 clearly does not meet this criterion.

If it returns undef, that can be used since it will get coerced to 0, but
at least it will elicit a warning from perl.  Perhaps under some kind of
very-strict it will elicit an error instead.
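
A small illustration of that coercion in today's Perl 5. The undef assigned to $i stands in for the failure value the RFC proposes; the __WARN__ handler is only there to make the warning observable:

```perl
use strict;
use warnings;

my $foo = "flabergasted";

# Collect warnings instead of printing them to STDERR.
my @warnings;
local $SIG{__WARN__} = sub { push @warnings, $_[0] };

# Stand-in for an index() that reports failure as undef (the RFC 213 proposal).
my $i = undef;

my $piece = substr($foo, $i, 3);   # undef offset is coerced to 0
print "piece: $piece\n";
print "warnings seen: ", scalar(@warnings), "\n";
```

The call still succeeds and returns the first three characters, which is why the warning is the only signal that something went wrong.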

-- 
John Porter

We're building the house of the future together.




Re: RFC 213 (v1) rindex and index should return undef on failure

2000-09-18 Thread John Porter

Glenn Linderman wrote:
 
 There is a difference between "undefined" and "unknown".

Can you explain this difference, briefly?
If not, could you give me something off-list?

Thanks,
John Porter




Re: RFC 213 (v1) rindex and index should return undef on failure

2000-09-18 Thread Glenn Linderman

John Porter wrote:

 Glenn Linderman wrote:
 
  The idea of a _normal_ situation being considered exceptional is raised when the
  code written inappropriately handles some of the normal return values.

 You would throw exceptions at the problem of bad coding practice.

Not the goal.  There are, no doubt, many possible interpretations of why your example
line of code was bad, all couched in different justifications for using it.

 I think it's better to correct the bad coding practice.

That was actually my goal too.

  $foo = "flabergasted";
  substr($foo, index($foo, 'abc'), 20);   # Returns undef
 
  contains errors.  Clearly the current index, or even index modified to return
  undef, doesn't produce the desired results.  The inappropriate value of -1 (or
  undef) passed as the 2nd parameter to substr will produce erroneous results.

 Right. But this is not so much an argument for making index throw, as for
 encouraging programmers to write good code, i.e.

 $foo = "flabergasted";
 if ( defined( my $i = index($foo, 'abc') ) ) {
     substr( $foo, $i, 20 );
 }
 else {
     # do what you want with this condition.
 }

Clearly with the  $foo = "flabergasted" ;  line in place, the whole example could be
replaced with

  substr ($foo, -1, 20 );
or even
  undef;

So removing that line and allowing for variability in the possible values for $foo,
the question is whether the programmer so strongly believes that 'abc' will be found
that he wishes to return it and the next 17 characters, and take extreme risk with his
code if it is not found, or whether there is really a useful alternative action to be
done in the "normal" case that 'abc' is not found in $foo.

If the intention is that 'abc' is really expected to be found in all $foo, and is just
the demarcation of the next 17 useful characters, then your "better code" costs
several extra lines, and the exception which wouldn't be taken most of the time,
resulting in little additional cost or complexity, would allow the single line
solution you originally proposed, which is nice and concise for the case where not
finding 'abc' is unexpected.

Only you, of course, can supply the intentions behind your example, but by omitting
the "do what you want with this condition" part, you clearly left it up for grabs to
be interpreted as abnormal.

 The whole point, IMHO, is that index() should return a value which cannot
 be used as an index.  -1 clearly does not meet this criterion.

I totally agree that having index return undef on failure to find the string would be
an improvement to index.  Then it could be wrapped by Fatal.pm, or Throw.pm, etc., and
we could all have and eat our cake.

--
Glenn
=
Even if you're on the right track,
you'll get run over if you just sit there.
   -- Will Rogers





Re: RFC 213 (v1) rindex and index should return undef on failure

2000-09-18 Thread Nathan Torkington

At this point, I think the whole thread on functions throwing
exceptions should either be:
 (a) turned into an RFC
or
 (b) abandoned.

This discussion is going around and around like a piece of toilet
paper in a weakly-flushing toilet.

Nat



Re: RFC 213 (v1) rindex and index should return undef on failure

2000-09-18 Thread Chaim Frenkel

 "GL" == Glenn Linderman [EMAIL PROTECTED] writes:

GL There is a difference between "undefined" and "unknown".

GL Perl undefined is a different concept--that of an uninitialized
GL variable.  This is proven from its earliest versions where the
GL value is coerced to 0 or '' (specific values) when used (without
GL warnings on).

Sorry, as far as I'm concerned

$foo = undef
and
select @foo = NULL

Are both initialized.

And what do you consider

sub foo { ; return }
$status = foo;

Uninitialized? Very clearly initialized.

And let's look at the name and functions

defined($foo)
undef($foo)

Both seem clearly to mean _undefined_ or perhaps unknown or NULL

The use of undef meaning 0 or '' is quite useful.

But under some programming styles having tristate logic and NULL propagation
would make some programming tasks a bit more straightforward.

chaim
-- 
Chaim Frenkel    Nonlinear Knowledge, Inc.
[EMAIL PROTECTED]   +1-718-236-0183



Re: RFC 213 (v1) rindex and index should return undef on failure

2000-09-17 Thread Chaim Frenkel

 "JP" == John Porter [EMAIL PROTECTED] writes:

JP Chaim Frenkel wrote:
 
 Removing -1 as a valid result, could be a breakage (if someone is
 doing something weird with a negative result)

JP What, like using it as an index into a substr?
JP Bad Code is its own reward, my friend.

Is that a for or an against?

 $foo = "flabergasted";
 substr($foo, index($foo, 'abc'), 20);# Returns undef

JP One should never do this, regardless of what index() returns on
JP failure.  Now, if index() threw an exception on failure, you'd
JP be o.k.  But I don't think we want that...

I do this _all_ the time. (Well in my SQL.) The correct translation
for untranslatable items is NULL (or undef in perl-speak). Yes,
sometimes it isn't, for those extra coding is required.

Having substr (or other functions) generate an undef is a quite
reasonable way to handle this scenario. This isn't any different
than $bar = $hash{$foo} where $foo doesn't exist.

If you must have a value, then check for it. If an undef is acceptable
then check for that.

I would find checking the final result somehow much clearer to read.

chaim
-- 
Chaim Frenkel    Nonlinear Knowledge, Inc.
[EMAIL PROTECTED]   +1-718-236-0183



Re: RFC 213 (v1) rindex and index should return undef on failure

2000-09-17 Thread Chaim Frenkel

 "GL" == Glenn Linderman [EMAIL PROTECTED] writes:

GL That's exactly why it would be nice if index _did_ throw an exception on
GL failure, then you could write code this way, and catch the failures
GL without needing to check return values for the error code case before
GL proceeding with the real case.

But you would still have to catch the exception. Not a nice thing to 
terminate the program just because an expected mismatch occurred.

Not finding something is not exceptional.

Neither is EOF on a file, or working with an empty list. Adding all these
exceptions for non-exceptional and quite common scenarios is bothersome.

chaim
-- 
Chaim Frenkel    Nonlinear Knowledge, Inc.
[EMAIL PROTECTED]   +1-718-236-0183



Re: RFC 213 (v1) rindex and index should return undef on failure

2000-09-17 Thread Glenn Linderman

Chaim Frenkel wrote:

 But you would still have to catch the exception. Not a nice thing to
 terminate the program just because an expected mismatch occurred.

Sometimes it is, sometimes it isn't.

 Not finding something is not exceptional.

Sometimes it is, sometimes it isn't.  Why were you looking for it, if you didn't
expect to find it?

 Neither is EOF on a file, or working with an empty list. Adding all these
 exceptions for non-exceptional and quite common scenarios is bothersome.

This is truly one of the conundrums of programming.  Let me use sequential file
reading as the example scenario.  The "normal" thing to do in such a program is to
read the next record, and process it somehow.  The processing sometimes gets quite
involved, of course, but let's not dwell on that here.  So we have the following
loop, not fully structured programming, and no error handling, exception handling,
or any such thing.

readloop:
  { $length = read ( FILE, $buffer, $want );
if ( $lawyer  ||  $bill_collector )
{  due_process ( $buffer, $length );
} else
{  do_process ( $buffer, $length );
}
goto readloop;
  }

Now the loop works fine, but somehow, we need to get out of the loop when we
encounter an error, or end-of-file.  Many programs, including many perl programs,
actually do this incorrectly, but it is "good enough" most of the time: they treat
_all_ errors during file reading as end-of-file.  This is _not_ good enough all of
the time, but we're not writing airplane controls most of the time either, but we
should stay aware of the issue, even though we usually ignore it.

So there are alternatives:

(1) we could check $length for zero, and decide that there was nothing left to
read.
(2) we could modify the goto to have a modifier: "if ! eof ( FILE )".
(3) we could check $! just after the call to read to determine if there was an
error [should do this in any case], and if it was EOF, could branch out of the
loop.

I posit that the "correct" solution should be (2)... that you _should not_ as a
matter of course, interpret error codes as a way of choosing the control flow of
the program.  Error codes (and exceptions, when used to report errors) should
alter the control flow of the program in the sense of reporting the error so that
appropriate, but unusual, action can be taken.

So we should check $length, and we should check $!, but finding $length < $want,
and $! != 0 should both be treated as out-of-the-ordinary conditions-- errors
should not happen in well designed programs.

Of course, most people would write the loop

   while ( $length = read ( ... ))
   { ... }

or, for line oriented stuff (rather than record oriented)

   while ( <FILE> ) { ... }

Such techniques build in a check for errors, but inappropriately mistreat _any_
error as EOF.  To avoid mistreating errors that way, such a loop should be
followed by

   if ( $!  != 0 ) { # handle error # }

Do you often see that coded?  Admittedly, non-EOF disk file errors are rare these
days of reliable storage.  But not all files are disk files, and not all errors
are EOF.
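
One way to keep the two cases apart, sketched under the assumption that checking read()'s return value is acceptable in place of the $! test: perlfunc documents read() as returning 0 at end-of-file and undef on error. The helper name read_records and the in-memory filehandle are illustrative only:

```perl
use strict;
use warnings;

# Hypothetical helper: reads $fh in $want-byte records and hands each one
# to $process. read() returns 0 at end-of-file and undef on a real error,
# so the two cases can be told apart without guessing.
sub read_records {
    my ($fh, $want, $process) = @_;
    while (1) {
        my $length = read($fh, my $buffer, $want);
        die "read failed: $!\n" unless defined $length;   # genuine error
        last if $length == 0;                              # plain EOF
        $process->($buffer, $length);
    }
    return;
}

# Usage with an in-memory filehandle standing in for FILE.
my $data = "abcdefgh";
open(my $fh, '<', \$data) or die "open failed: $!\n";

my @records;
read_records($fh, 3, sub { push @records, $_[0] });
print join(',', @records), "\n";
```

A short final record is passed through as-is; whether that counts as an error depends on the file format.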

So there is quite a bit of sloppiness in most code, regarding error handling.
That's not a nice thing either.

So if (optionally, pragma Throw, similar to pragma Fatal, see RFC 119) all errors
could be thrown as exceptions, it would allow programmers to force themselves to
be less sloppy about error handling when (like airplane controls) they really
should be precise.

And actually, once you get the habit of coding that way, it really isn't even any
harder.  Of the three choices above, none are particularly hard to code.  (2) is
not harder than (1), or harder than (3).  In fact, it might be easier.  But of
course (2) doesn't add error checking, but neither does (1) or (3)... they add "if
something goes wrong, then pretend it was EOF and exit the loop".  Clearly if DASD
were less reliable, we'd see fewer programs written those ways, and we'd get
better diagnostics when something goes wrong, rather than just less output from
our programs.

To modify the above loop to do error handling properly would take several
additional lines of code, no matter what technique was used to code it.  Using the
syntax of RFC 119, and assuming that all errors turn into thrown exceptions, you
can separate the normal logic flow and the error logic flow as follows:

   while ( ! eof ( FILE ))
   { $length = read ( FILE, $buffer, $want );
  if ( $lawyer  ||  $bill_collector )
  {  due_process ( $buffer, $length );
  } else
  {  do_process ( $buffer, $length );
  }
   }
   catch FileError { # handle error # };

(N.B.  I didn't write the several lines, just used the placeholder: # handle
error #)

So really, the point is that it is nice to precheck the conditions and avoid error
handling, whether it be via error codes, or via exception handling.

File handling is pretty obvious: usually you get data, not EOF, so it is pretty
obvious where the exception handling should go.  But 

Re: RFC 213 (v1) rindex and index should return undef on failure

2000-09-15 Thread Nathan Torkington

Chaim Frenkel writes:
 I would like to have an undef returned.

Ah, I see.  You want subroutines to return undef if they're given it
for any of their arguments.  That'd break the lazy programmer practice
of passing undef expecting it to become "" or 0.  They don't have
warnings on, of course.

Nat



Re: RFC 213 (v1) rindex and index should return undef on failure

2000-09-14 Thread John Porter

Chaim Frenkel wrote:
 
 Removing -1 as a valid result, could be a breakage (if someone is
 doing something weird with a negative result)

What, like using it as an index into a substr?
Bad Code is its own reward, my friend.


   $foo = "flabergasted";
   substr($foo, index($foo, 'abc'), 20);   # Returns undef

One should never do this, regardless of what index() returns on
failure.  Now, if index() threw an exception on failure, you'd
be o.k.  But I don't think we want that...

-- 
John Porter

We're building the house of the future together.




Re: RFC 213 (v1) rindex and index should return undef on failure

2000-09-14 Thread Glenn Linderman

John Porter wrote:

 Chaim Frenkel wrote:
 
  Removing -1 as a valid result, could be a breakage (if someone is
  doing something weird with a negative result)

 What, like using it as an index into a substr?
 Bad Code is its own reward, my friend.

$foo = "flabergasted";
substr($foo, index($foo, 'abc'), 20);   # Returns undef

 One should never do this, regardless of what index() returns on
 failure.  Now, if index() threw an exception on failure, you'd
 be o.k.  But I don't think we want that...

That's exactly why it would be nice if index _did_ throw an exception on
failure, then you could write code this way, and catch the failures
without needing to check return values for the error code case before
proceeding with the real case.

--
Glenn
=
There  are two kinds of people, those
who finish  what they start,  and  so
on... -- Robert Byrne






Re: RFC 213 (v1) rindex and index should return undef on failure

2000-09-14 Thread Chaim Frenkel

 "NT" == Nathan Torkington [EMAIL PROTECTED] writes:

NT Chaim Frenkel writes:
 Somehow I find
 if (40 == ($foo = substr($bar, index($bar, 'xyz')))) {
 }

NT I don't understand your hypothetical code.  substr() returns the
NT substring of $bar from the position returned by index, onward.
NT When would this be 40, if index is going to return the position
NT of 'xyz'?

NT I guess I can't understand your idea of safe failure until I
NT see an example, and this doesn't seem to be it.

Whoops, I was tired.

$to = "010 020 030 047";
$from="AAA BBB CCC DDD";

print substr($to,index($from,"BBB"),3);
print substr($to,index($from,"XXX"),3);
__END__
020
7

I would like to have an undef returned.

(Now it would have been interesting if it returned "047", then having
index return an undef and then having substr() propagate the undef
would make things workable.)

If you are familiar with Sybase's version of sql. Invalid arguments
to various functions return NULL.
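
The behaviour asked for here can be approximated in Perl 5 today with a small helper. This is only a sketch of the idea: translate is a hypothetical name, and the fixed width of 3 is taken from the example above:

```perl
use strict;
use warnings;

# Hypothetical helper: find $key in $from and return the 3 characters at
# the same offset in $to, or undef when $key is absent.
sub translate {
    my ($to, $from, $key) = @_;
    my $pos = index($from, $key);
    return undef if $pos < 0;       # propagate the failure instead of using -1
    return substr($to, $pos, 3);
}

my $to   = "010 020 030 047";
my $from = "AAA BBB CCC DDD";

for my $key ("BBB", "XXX") {
    my $code = translate($to, $from, $key);
    print "$key => ", (defined $code ? $code : 'undef'), "\n";
}
```

The caller checks only the final result, which is the style of "safe failure" described for SQL.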

chaim
-- 
Chaim Frenkel    Nonlinear Knowledge, Inc.
[EMAIL PROTECTED]   +1-718-236-0183



RFC 213 (v1) rindex and index should return undef on failure

2000-09-13 Thread Perl6 RFC Librarian

This and other RFCs are available on the web at
  http://dev.perl.org/rfc/

=head1 TITLE

rindex and index should return undef on failure

=head1 VERSION

  Maintainer: Nathan Torkington [EMAIL PROTECTED]
  Date: Sep 12 2000
  Mailing List: [EMAIL PROTECTED]
  Number: 213
  Version: 1
  Status: Developing

=head1 ABSTRACT

index() and rindex() should return C<undef> if their
second argument is not a substring of their first
argument.

=head1 DESCRIPTION

In perl5, index() and rindex() return -1 if the
substring isn't found.  This seems out of step with
the rest of Perl's functions, which return C<undef>
on error.  I propose changing index() and rindex()
to return C<undef> if the substring isn't found.

This would also cause warnings to be issued when
programmers use the results of index() or rindex()
assuming the substring was found.

This suggestion doesn't rely on RFC 53, "Built-ins: Merge and
generalize C<index> and C<rindex>", and works regardless
of whether 53 is accepted or not.

=head1 IMPLEMENTATION

The perl526 translator could turn index($a,$b) calls into

  do { my $tmp = index($a,$b); defined($tmp) ? $tmp : -1 }

=head1 REFERENCES

RFC 53: Built-ins: Merge and generalize C<index> and C<rindex>

perlfunc manpage for information on index() and rindex()




Re: RFC 213 (v1) rindex and index should return undef on failure

2000-09-13 Thread Tom Christiansen

Speaking of failure-mode, all syscalls should return false on failure, not
ever -1.  Right now, wait and waitpid work the other way.  They should
go the undef vs "0 but true" route that ioctl, fcntl, and sysread take.
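
For readers unfamiliar with the idiom: the string "0 but true" is true in boolean context, numifies to 0, and is exempt from Perl's "isn't numeric" warning. The sketch below just demonstrates those properties on a literal string; it does not call ioctl or fcntl:

```perl
use strict;
use warnings;

# Collect warnings so the absence of one can be shown.
my @warnings;
local $SIG{__WARN__} = sub { push @warnings, $_[0] };

my $ok = "0 but true";          # success value that also means "zero"

my $is_true = $ok ? 1 : 0;      # true in boolean context
my $sum     = $ok + 5;          # numifies to 0 without a numeric warning

print "true=$is_true sum=$sum warnings=", scalar(@warnings), "\n";
```

A caller can therefore write "or die" after the call and still use the result as a number.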

--tom



Re: RFC 213 (v1) rindex and index should return undef on failure

2000-09-13 Thread Chaim Frenkel

 "PRL" == Perl6 RFC Librarian [EMAIL PROTECTED] writes:

PRL In perl5, index() and rindex() return -1 if the
PRL substring isn't found.  This seems out of step with
PRL the rest of Perl's functions, which return C<undef>
PRL on error.  I propose changing index() and rindex()
PRL to return C<undef> if the substring isn't found.

PRL This would also cause warnings to be issued when
PRL programmers use the results of index() or rindex()
PRL assuming the substring was found.

Removing -1 as a valid result, could be a breakage (if someone is
doing something weird with a negative result)

Would it be reasonable to ask that passing undef into the offset
or start of substr have substr return an undef?

This would break the undef == 0 under normal circumstance, but
it would prevent an error from propagating.

$foo = "flabergasted";
substr($foo, index($foo, 'abc'), 20);   # Returns undef

If this is too much breakage what about only if it is the argument?

$foo = "flabergasted";
$x = index($foo, 'abc');
substr($foo, $x, 20);   # starts from the end

chaim
-- 
Chaim Frenkel    Nonlinear Knowledge, Inc.
[EMAIL PROTECTED]   +1-718-236-0183



Re: RFC 213 (v1) rindex and index should return undef on failure

2000-09-13 Thread Nathan Torkington

Chaim Frenkel writes:
 Removing -1 as a valid result, could be a breakage (if someone is
 doing something weird with a negative result)

I don't care, so long as the perl526 translator can wrap perl6's
index/rindex.  And I gave sample code for it to do that.

 Would it be reasonable to ask that passing undef into the offset
 or start of substr have substr return an undef?

Wouldn't you get a warning anyway, if you were treating undef like
a number?

Nat



Re: RFC 213 (v1) rindex and index should return undef on failure

2000-09-13 Thread Chaim Frenkel

 "NT" == Nathan Torkington [EMAIL PROTECTED] writes:

 Would it be reasonable to ask that passing undef into the offset
 or start of substr have substr return an undef?

NT Wouldn't you get a warning anyway, if you were treating undef like
NT a number?

Aha, but I don't want a warning, I want the code to 'fail' reasonably.

Somehow I find
if (40 == ($foo = substr($bar, index($bar, 'xyz')))) {
}

much nicer than

if (defined ($offset = index($bar, 'xyz')) &&
(40 == substr($bar, $offset))) {
}

I use this style of safe failure when working in SQL.

chaim
-- 
Chaim Frenkel    Nonlinear Knowledge, Inc.
[EMAIL PROTECTED]   +1-718-236-0183



Re: RFC 213 (v1) rindex and index should return undef on failure

2000-09-13 Thread Nathan Torkington

Chaim Frenkel writes:
 Somehow I find
   if (40 == ($foo = substr($bar, index($bar, 'xyz')))) {
   }

I don't understand your hypothetical code.  substr() returns the
substring of $bar from the position returned by index, onward.
When would this be 40, if index is going to return the position
of 'xyz'?

I guess I can't understand your idea of safe failure until I
see an example, and this doesn't seem to be it.

Nat