Re: [Haskell-cafe] Pattern matching, and bugs

2009-12-19 Thread wren ng thornton

Ketil Malde wrote:
 András Mocsáry amo...@gmail.com writes:

  Now we have a problem, which is most generally fixed in these ways:
  C-like:

  switch ( x )
  {
  Case 0:
    Unchecked
  Case 1:
    Checked
  Case 2:
    Unknown
  Default:
    Nothing
  }

 This is not a fix, this is a workaround for a design bug, namely that
 x is of a type that allows meaningless data.


Indeed. In C-like languages it is common practice to use integers to 
mean just about anything. Often there are fewer of anything than there 
are integers, and so hacks like this are necessary to work around the 
ensuing problems. But ultimately, this is a *type error*.


The proper response, in Haskell, to type errors like this is not to add 
hacks catching bad values, but rather to change the types so that bad 
values cannot be constructed. As a few others have mentioned, rather 
than using an integer, you should define a new type like:


data X = Unchecked | Checked | Unknown

and then require that x is of type X. Thus the pattern-checker can 
verify that all possible values of x will match some case option.
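
A minimal sketch of what that buys you. The function describe and the
derived instances are illustrative additions, not part of the original
suggestion:

```haskell
-- The type suggested above: exactly three values and nothing else.
data X = Unchecked | Checked | Unknown
  deriving (Show, Eq, Enum, Bounded)

-- Hypothetical consumer of X. There is no catch-all equation, and none
-- is needed: every value of X is covered by one of the three cases.
describe :: X -> String
describe Unchecked = "not checked yet"
describe Checked   = "checked"
describe Unknown   = "state unknown"
```

There is no way to hand describe a "fourth" value, so the question of
what to do with one never comes up at this point in the program.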


The reason Haskell (or other typed functional languages) makes proving 
correctness so much easier than other languages do is that we create new 
types which correctly and precisely match the set of values we want to 
belong to that type[1]. Pushing as much of the correctness logic as 
possible up into the type layer frees us from needing to check 
things at the term layer, since the type-checker can catch type errors 
automatically. But it can't catch things we haven't told it are errors.



[1] For some complex sets of values this isn't possible with Haskell's 
type system, which is one of the reasons for the recent interest in 
dependently typed languages. However, for most familiar sets of values 
(and quite a few unfamiliar ones) Haskell's type system is more than enough.


--
Live well,
~wren
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


[Haskell-cafe] Pattern matching, and bugs

2009-12-18 Thread András Mocsáry
Hello,
I was advised respectfully to post my query here.
Please, read the whole letter before you do anything, because I tried to
construct the problem step by step.
Also keep in mind, that the problem I query here is more general, and
similar cases occur elsewhere, not just in this particular example I present
below.

*Intro story* (skip if you are in a hurry)
I'm participating in several open-source server development projects. Most
of them are written in C, and some of them have C++ code hidden here and
there.
These programs are developed by many people, in many different ways, so more
often than not it is very hard to determine the 'real cause' of a bug.
This almost always leads to 'bugfixes' which 'treat the crash'. Sometimes
they are a few lines of extra checks, like IFs.
Sometimes they are complex, and even surprisingly clever, hacks.
Understanding their code is challenging, and the end result is
a pile of ... hacks fixing bugs fixing hacks fixing bugs, which were
themselves put there to fix yet other bugs.

When I started to learn functional programming, I was told that
the correctness of a functional program can be proved a lot more easily, in
fact in a straightforward mathematical way.

*My concern*
is about predictable failure of software written in Haskell.
To illustrate it, let's look at a pattern matching example:

Let's say I have defined some states my object could be in, and I handle
them with a switch in some C-like language:

switch ( x )
{
Case 0:
  Unchecked
Case 1:
  Checked
Case 2:
  Unknown
}

And in Haskell pattern matching:

switch 1 =  Unchecked
switch 2 =  Checked
switch 3 =  Unknown


Let's say these are clearly defined states of some object.
Then let's say something unexpected happens: x gets a value other than 0, 1
or 2.
Now we have a problem, which is most generally fixed in these ways:
C-like:

switch ( x )
{
Case 0:
  Unchecked
Case 1:
  Checked
Case 2:
  Unknown
Default:
  Nothing
}

Haskell like:

switch 1 =  Unchecked
switch 2 =  Checked
switch 3 =  Unknown
switch x =  Nothing

These general approaches do avoid this particular crash, but they do something
really bad to the code, in my opinion.

Below are some ways x can go wrong:

*1. The bad data we got as 'x' could have come from another part of our
very own program, which is the REAL CAUSE of the crash, but we successfully
hide it.*
This makes it harder to fix later, and thus eventually means the death of
the software product. Eventually someone has to rewrite it,
which is economically bad for the company, since rewriting implies increased
costs.

2. The bad data we got as 'x' could also have come from a real-world
object which we have underestimated, or which changed in the meantime.

3. This 'x' could have been corrupted over a network, or by 'mingling', or by
other faulty software or something.

Point 1:
If we allow ourselves such general bugfixes, we eventually kill the ability of
the project to 'evolve'.

Point 2:
Programmers eventually take up such 'arguably bad' habits, thus making
such bugs harder to find.

Thus it would be wiser to tell my people never to write Default cases and
such general pattern matching cases.

Which leads to the very reason I wrote to you:

I want to propose this for Haskell prime:

I would like to have a way for Haskell not to crash when my coders write
pattern matches without the above-mentioned general case.
Like having the compiler auto-include those general cases for us,
but when those cases get hit, then *instead of crashing*, it *should report
some error* on *stdout* or *stderr*.
(It would be even nicer if it could be traced.)


This is very much like warning suppression, just that it's crash
suppression, with the need of a report message of course.

*I would like to hear your opinion on this.*

I also think that there are many similar cases in Haskell where not
crashing, just reporting an error, would be far more beneficial.
In my case that is server software, where network-corrupted data
(and data which has been 'tampered with' by some 'good guy' who thinks he's
Robin Hood if he can 'hack' the server)
is an everyday reality.


Thanks for your time reading my 'storm'.

Greets,

Andrew


Re: [Haskell-cafe] Pattern matching, and bugs

2009-12-18 Thread Jochem Berndsen
András Mocsáry wrote:
 *My concern*
 is about predictable failure of software written in Haskell.
 To illustrate it, let's look at a pattern matching example:

 And in Haskell pattern matching:

 switch 1 =  Unchecked
 switch 2 =  Checked
 switch 3 =  Unknown

 Let's say these are clearly defined states of some object.
 Then let's say something unexpected happens: x gets a value other than 0, 1
 or 2.
 Now we have a problem, which is most generally fixed in these ways:

 switch 1 =  Unchecked
 switch 2 =  Checked
 switch 3 =  Unknown
 switch x =  Nothing

 These general approaches do avoid this particular crash, but they do something
 really bad to the code, in my opinion.

Agreed. The real cause of the problem is that the programmer didn't
prove that x is in {1,2,3} when calling switch.

 Below are some ways x can go wrong:
 *1. The bad data we got as 'x' could have come from another part of our
 very own program, which is the REAL CAUSE of the crash, but we successfully
 hide it.*
 This makes it harder to fix later, and thus eventually means the death of
 the software product. Eventually someone has to rewrite it,
 which is economically bad for the company, since rewriting implies increased
 costs.

Yes.

 2. The bad data we got as 'x' could also have come from a real-world
 object which we have underestimated, or which changed in the meantime.

You should not assume that your input is correct in fault-tolerant programs.

 3. This 'x' could have been corrupted over a network, or by 'mingling' or by
 other faulty software or something.

Unlikely. There is nothing you can do about this, though.


 Point 1:
 If we allow ourselves such general bugfixes, we eventually kill the ability of
 the project to 'evolve'.

 Point 2:
 Programmers eventually take up such 'arguably bad' habits, thus making
 such bugs harder to find.

 Thus it would be wiser to tell my people never to write Default cases and
 such general pattern matching cases.

It is a better idea to use the type system to prevent this kind of bug.
In this particular case, it's better to have a datatype like
data X = One | Two | Three

 Which leads to the very reason I wrote to you:

 I want to propose this for Haskell prime:

 I would like to have a way for Haskell not to crash when my coders write
 pattern matches without the above-mentioned general case.
 Like having the compiler auto-include those general cases for us,
 but when those cases get hit, then *instead of crashing*, it *should report
 some error* on *stdout* or *stderr*.
 (It would be even nicer if it could be traced.)

And, how would it continue?
Suppose that we have the function
head :: [a] -> a
head (x:_) = x

What would you propose should happen if I call head [] ? Print an
error on stderr, say, but what should it return? Surely it cannot make
a value of type a out of thin air?

Nowadays, head [] crashes the program, and you get an error message to
standard error.
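
The usual ways out move the problem into the types instead of
suppressing the crash. A minimal sketch: safeHead and firstOf are
illustrative names, and Data.List.NonEmpty is assumed to be available
(it ships with recent versions of base, not with the base of 2009):

```haskell
import Data.List.NonEmpty (NonEmpty (..))
import qualified Data.List.NonEmpty as NE

-- Option 1: admit in the result type that there may be no answer.
-- The caller is forced to handle the Nothing case.
safeHead :: [a] -> Maybe a
safeHead []      = Nothing
safeHead (x : _) = Just x

-- Option 2: demand a list that cannot be empty, so the function is
-- total and no value has to be conjured out of thin air.
firstOf :: NonEmpty a -> a
firstOf = NE.head
```

For example, safeHead "abc" is Just 'a', safeHead "" is Nothing, and
firstOf (1 :| [2, 3]) is 1.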

 This is very much like warning suppression, just that it's crash
 suppression, with the need of a report message of course.

This is already possible with exception handling.
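
A minimal sketch of that, assuming GHC's Control.Exception. The
deliberately partial switch is the one from the original post; the names
Status and safeSwitch are made up for the example:

```haskell
import Control.Exception (PatternMatchFail, evaluate, try)

data Status = Unchecked | Checked | Unknown
  deriving (Show, Eq)

-- Deliberately partial, as in the original post: no equation matches
-- anything other than 1, 2 or 3.
switch :: Int -> Status
switch 1 = Unchecked
switch 2 = Checked
switch 3 = Unknown

-- evaluate forces the result to weak head normal form inside IO, so a
-- failed pattern match is raised here, where try can turn it into a
-- Left instead of letting it terminate the program.
safeSwitch :: Int -> IO (Either PatternMatchFail Status)
safeSwitch n = try (evaluate (switch n))
```

The caller still has to decide what to do with the Left case, for
instance print it to stderr and drop the request. That decision is
explicit in the code rather than inserted silently by the compiler.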

 *I would like to hear your opinion on this.*

I don't think it can be implemented in a sane way, or that it's a good
idea to suppress this silently, when an explicit solution already exists.

 I also think that there are many similar cases in Haskell where not
 crashing, just reporting an error, would be far more beneficial.
 In my case that is server software, where network-corrupted data
 (and data which has been 'tampered with' by some 'good guy' who thinks he's
 Robin Hood if he can 'hack' the server)
 is an everyday reality.

You should validate your data in any case. You may even turn a DoS
attack into a real security problem with your solution.

Cheers, Jochem

-- 
Jochem Berndsen | joc...@functor.nl | joc...@牛在田里.com


Re: [Haskell-cafe] Pattern matching, and bugs

2009-12-18 Thread Serguey Zefirov
 I would like to have a way for Haskell not to crash when my coders write
 pattern matches without the above-mentioned general case.
 Like having the compiler auto-include those general cases for us,
 but when those cases get hit, then instead of crashing, it should report
 some error on stdout or stderr.
 (It would be even nicer if it could be traced.)

You don't need to wait for Haskell prime. What you are asking for is already
in the libraries: 
http://www.haskell.org/ghc/docs/latest/html/libraries/base-4.2.0.0/Control-Exception.html


Re: [Haskell-cafe] Pattern matching, and bugs

2009-12-18 Thread Ketil Malde
András Mocsáry amo...@gmail.com writes:

 Now we have a problem, which is most generally fixed in these ways:
 C-like:

 switch ( x )
 {
 Case 0:
   Unchecked
 Case 1:
   Checked
 Case 2:
   Unknown
 Default:
   Nothing
 }

This is not a fix, this is a workaround for a design bug, namely that
x is of a type that allows meaningless data.

 Haskell like:

 switch 1 =  Unchecked
 switch 2 =  Checked
 switch 3 =  Unknown
 switch x =  Nothing

 These general approaches do avoid this particular crash, but they do something
 really bad to the code, in my opinion.

Yes.  The type of the parameter should be designed so that it only allows
the three legal values.  Using an integer value when there are only
three real options is bad design.

 *1. The bad data we got as 'x' could have come from another part of our
 very own program, which is the REAL CAUSE of the crash, but we successfully
 hide it.*

 2. The bad data we got as 'x' could also have come from a real-world
 object which we have underestimated, or which changed in the meantime.

I think these two are the same - in case 2, it is the parser that
should produce an error.  If it may fail, there's a plethora of
solutions, for instance, wrapping the result in a Maybe.  This will force
users of the data to take failure into account.
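
A minimal sketch of such a parser. The names Status, parseStatus and
report are illustrative, and the integer codes 1, 2, 3 are simply taken
from the Haskell example in the original post:

```haskell
data Status = Unchecked | Checked | Unknown
  deriving (Show, Eq)

-- Runs once, at the boundary where untrusted integers enter the
-- program. The final equation is a catch-all, but unlike the
-- criticised "switch x = Nothing" it does not hide the bad value:
-- the failure shows up in the result type.
parseStatus :: Int -> Maybe Status
parseStatus 1 = Just Unchecked
parseStatus 2 = Just Checked
parseStatus 3 = Just Unknown
parseStatus _ = Nothing

-- A user of the parsed data cannot get at a Status without saying
-- what happens when there is none.
report :: Int -> String
report n = case parseStatus n of
  Just s  -> "state: " ++ show s
  Nothing -> "rejected invalid state code " ++ show n
```

Everything downstream of parseStatus works with Status and never sees
an out-of-range integer.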

 3. This 'x' could have been corrupted over a network, or by 'mingling' or by
 other faulty software or something.

I'm not really sure how a Haskell program would react to memory errors
or similar.  Badly, I expect.

 I would like to have a way for Haskell not to crash when my coders write
 pattern matches without the above-mentioned general case.

I think this just encourages sloppy programming, and as has been pointed
out, you can catch the exception if you think you can deal with it
reasonably. 

GHC warns you if you do incomplete pattern matching, and I think it's a
good idea to adhere to the warnings.
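
For reference, a small sketch of switching the warning on per module.
The flag is spelled -Wincomplete-patterns in current GHC releases; older
releases, including those current when this thread was written, spell it
-fwarn-incomplete-patterns. Status and toCode are illustrative names:

```haskell
{-# OPTIONS_GHC -Wincomplete-patterns #-}

data Status = Unchecked | Checked | Unknown

-- Complete as written, so GHC stays silent.
toCode :: Status -> Int
toCode Unchecked = 1
toCode Checked   = 2
toCode Unknown   = 3
-- Deleting any one of the three equations above makes GHC warn that
-- the pattern match is non-exhaustive and name the missing constructor.
```

Compiling with -Werror in addition turns the warning into a build
failure, which is one way to enforce the rule across a team.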

 In my case that is server software, where network-corrupted data
 (and data which has been 'tampered with' by some 'good guy' who thinks he's
 Robin Hood if he can 'hack' the server) is an everyday reality.

I like the Erlang approach: let it (the subsystem) crash, catch the
exception, report it, and restart.
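
A very rough sketch of that idea in Haskell, assuming Control.Exception
from base. The name superviseN and the bounded retry count are made up
for the example, and this is nowhere near a real supervision tree:

```haskell
import Control.Exception (SomeException, try)
import System.IO (hPutStrLn, stderr)

-- Run a worker action. If it throws, report the exception on stderr
-- and restart it, giving up after the given number of attempts.
-- Note: catching SomeException also catches asynchronous exceptions;
-- a real supervisor would want to be more selective.
superviseN :: Int -> IO a -> IO (Maybe a)
superviseN attempts worker
  | attempts <= 0 = pure Nothing
  | otherwise = do
      result <- try worker
      case result of
        Right value -> pure (Just value)
        Left e -> do
          hPutStrLn stderr ("worker crashed: " ++ show (e :: SomeException))
          superviseN (attempts - 1) worker
```

The worker itself stays free of defensive catch-all cases; the crash is
handled, and logged, in exactly one place.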

-k
-- 
If I haven't seen further, it is by standing in the footprints of giants