Re: Some clarity please! (was Re: [Haskell-cafe] Re: (flawed?) benchmark : sort)

2008-03-13 Thread Roman Leshchinskiy

Adrian Hey wrote:


I would ask for any correct Eq instance something like the law:
  (x==y) = True implies x=y (and vice-versa)
which implies f x = f y for all definable f
which implies (f x == f y) = True (for expression types which are
instances of Eq). This pretty much requires structural equality
for concrete types. For abstract types you can do something different
provided functions which can give different answers for two "equal"
arguments are not exposed.


How do you propose something like this to be specified in the language 
definition? The report doesn't (and shouldn't) know about abstract 
types. It only talks about things which are exported and things which 
are not. The distinction between implementation modules and client 
modules is made by the programmer, not by the language. So you can 
either require your law to hold everywhere, which IMO isn't a good idea, 
or you don't require it to hold. From the language definition point of 
view, I don't see any middle ground here.


Also, when you talk about definable functions, do you include things 
like I/O? What if I want to store things (such as a Set) on a disk? If 
the same abstract value can have multiple representations, do I have to 
convert them all to some canonical representation before writing them to 
a file? This might be rather slow and is, IMO, quite unnecessary.


From a more philosophical points of view, I'd say that the appropriate 
concept of equality depends a lot on the problem domain. Personally, I 
quite strongly disagree with restricting Eq instances in the way you 
propose. I have never found much use for strict structural equality (as 
opposed to domain-specific equality which may or may not coincide with 
the structural one).



On the subject of ambiguity, bugs and maxima, I see we have in Data.List

[...]

So I believe I'm correct in saying that maximumBy returns the last
of several possible maximum elements of the list. This obviously
needs specifying in the Haddock.

Because maximumBy documentation is ambiguous in this respect, so is the
overloaded maximum documentation. At least you can't figure it out from
the Haddock.


Why not simply say that maximumBy returns some maximum element from the 
list but it's not specified which one? That's how I always understood 
the spec. Code which needs a particular maximum element can't use 
maximumBy but such code is rare. I don't see how this is ambiguous, it 
just leaves certain things unspecified which is perfectly ok.


Roman

___
Haskell-prime mailing list
Haskell-prime@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-prime


Re: Some clarity please! (was Re: [Haskell-cafe] Re: (flawed?) benchmark : sort)

2008-03-13 Thread Conor McBride

Hi

On 13 Mar 2008, at 23:42, [EMAIL PROTECTED] wrote:


Conor McBride <[EMAIL PROTECTED]> responded to my comment:


(mapMonotonic should of course be removed, too,
 or specified to fail (preferably in some MonadZero)
 if the precondition is violated,
 which should still be implementable in linear time.)


What's wrong with mapMonotonic that isn't wrong
with, say, sortBy?, or Eq instances for parametrized
types?


Prelude> :m + Data.Set
Prelude Data.Set> toAscList $ mapMonotonic (10 -) (fromList [1 .. 5])
[9,8,7,6,5]
Prelude Data.Set> 5 `member`  mapMonotonic (10 -) (fromList [1 .. 5])
False


Something's certainly wrong there!


But nothing out of the ordinary: garbage in,
garbage out. Happens all the time, even in
Haskell. Why pick on mapMonotonic?

Prelude Data.List> sortBy (\_ _ -> GT) [1,3,2,5,4]
[4,5,2,3,1]
Prelude Data.List> sortBy (\_ _ -> GT) [4,5,3,2,1]
[1,2,3,5,4]

I guess there's a question of what we might
call "toxic waste"---junk values other than
undefined. I think undefined is bad enough
already. So the type system can't express
the spec. I don't think we should be casual
about that: we should be precise in
documentation about the obligations which
fall on the programmer. Some dirt is
pragmatically necessary: we shouldn't pretend
that it ain't so; we shouldn't pretend that dirt
is clean.







Before we can talk about what == should return,
can we settle what we mean by = ?


``='' is not in the Haskell interface!  ;-)


No, but "is" is in the human interface!




Therefore, I talked only about (==).


Ah, but you talked about things. Which
things? Is one of the things you talked
about the same as (==)? the same as
(flip (==))?



The best way to include ``='' seems to be the semantic equality of  
P-logic

[Harrison-Kieburtz-2005], which is quite a heavy calibre,
and at least in that paper, classes are not yet included.


I expect it's hard work. It's hard work in
much better behaved systems. My point is that
it's worth it, in order to facilitate more
meaningful discussions.

All the best

Conor

___
Haskell-prime mailing list
Haskell-prime@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-prime


Re: Some clarity please! (was Re: [Haskell-cafe] Re: (flawed?) benchmark : sort)

2008-03-13 Thread kahl
Conor McBride <[EMAIL PROTECTED]> responded to my comment:

 > > (mapMonotonic should of course be removed, too,
 > >  or specified to fail (preferably in some MonadZero)
 > >  if the precondition is violated,
 > >  which should still be implementable in linear time.)
 > 
 > What's wrong with mapMonotonic that isn't wrong
 > with, say, sortBy?, or Eq instances for parametrized
 > types?

Prelude> :m + Data.Set
Prelude Data.Set> toAscList $ mapMonotonic (10 -) (fromList [1 .. 5])
[9,8,7,6,5]
Prelude Data.Set> 5 `member`  mapMonotonic (10 -) (fromList [1 .. 5])
False


Something's certainly wrong there!


 > 
 > Before we can talk about what == should return,
 > can we settle what we mean by = ?

``='' is not in the Haskell interface!  ;-)

Therefore, I talked only about (==).

The best way to include ``='' seems to be the semantic equality of P-logic
[Harrison-Kieburtz-2005], which is quite a heavy calibre,
and at least in that paper, classes are not yet included.



Wolfram




@Article{Harrison-Kieburtz-2005,
  author =   {William L. Harrison and Richard B. Kieburtz},
  title ={The logic of demand in {Haskell}},
  journal =  JFP,
  year = 2005,
  volume =   15,
  number =   6,
  pages ={837--891},
  abstract = {Haskell is a functional programming language whose
 evaluation is lazy by default. However, Haskell also provides
 pattern matching facilities which add a modicum of eagerness to
 its otherwise lazy default evaluation. This mixed or
 ``non-strict'' semantics can be quite difficult to reason
 with. This paper introduces a programming logic, P-logic, which
 neatly formalizes the mixed evaluation in Haskell
 pattern-matching as a logic, thereby simplifying the task of
 specifying and verifying Haskell programs. In p-logic, aspects of
 demand are reflected or represented within both the predicate
 language and its model theory, allowing for expressive and
 comprehensible program verification.}
}
___
Haskell-prime mailing list
Haskell-prime@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-prime


Re: Some clarity please! (was Re: [Haskell-cafe] Re: (flawed?) benchmark : sort)

2008-03-13 Thread Conor McBride

Hi folks

I'm late into this thread, so apologies if
I'm being dim.

On 13 Mar 2008, at 16:17, [EMAIL PROTECTED] wrote:


Adrian Hey <[EMAIL PROTECTED]> wrote:


I would ask for any correct Eq instance something like the law:
   (x==y) = True implies x=y (and vice-versa)


I wish I knew what = meant here. Did somebody say?
I don't think it's totally obvious what equational
propositions should mean. Nor do I think it's
desirable to consider only those propositions
expressible in QuickCheck.

If = is reflexive and distinguishes undefined from
True, then

  x = y  implies  (x == y) = True

will be tricky to satisfy for quite a lot of types.
What about

  undefined == undefined

or

  repeat 'x' == repeat 'x'

? For some suitable (slightly subtle) definition
of "finite", you might manage

  x finite and x = y  implies  (x == y) = True

One rather intensional definition of x = y might be
"x and y have a common n-step reduct" with respect
to some suitable operational semantics. I don't think
this is what Adrian had in mind, but it certainly
falls foul of Wolfram's objection.




The easiest counterexample are sets (or finite maps)
provided as an abstract data type (i.e., exported without access to  
the

implementation, in particular constructors).



Different kinds of balanced trees typically do not produce the same
representation for the same set constructed by different construction
expressions.


This suggests that we should seek to define x = y
on a type by type basis, to mean "x and y support
the same observations", for some suitable notion
of observation, which might depend crucially on
what operations are exported from the (notional
or actual) module which defines the type. If so,
it's clearly crucial to allow some observations
which rely on waiting for ever, in order to
avoid _|_-induced collapse.

Something of the sort should allow this...



Therefore, (==) on sets will be expected to produce equality of sets,
which will only be an equivalence on set representations.


...but this...




which implies (f x == f y) = True (for expression types which are
instances of Eq).


This specifies that (==) is a congurence for f, and is in my opinion
the right specification:  (==) should be a congurence
on all datatypes with respect to all purely defineable functions.


...is more troublesome. Take f = repeat. Define
f = f. I'd hope x == y = True would give us x = y,
and that x == y would be defined if at least one
of x and y is finite. That implies f x = f y, which
should guarantee that f x == f y is not False.


But at least nowadays people occasionally do export functions
that allow access to representation details,


[..]

I consider this as an argument to remove showTree from the  
interface of

Data.Set, and to either specify toList to produce an ordered list
(replacing toAscList), or to remove it from the interface as well.


Perhaps that's a little extreme but I agree with the
sentiment. How about designating such abstraction-
breaking functions "nosy", in the same way that
functions which might break purity are "unsafe".


(mapMonotonic should of course be removed, too,
 or specified to fail (preferably in some MonadZero)
 if the precondition is violated,
 which should still be implementable in linear time.)


What's wrong with mapMonotonic that isn't wrong
with, say, sortBy?, or Eq instances for parametrized
types?





but if so would like to appeal to movers and shakers in the
language definition to please decide exactly what the proper
interpretation of standard Eq and Ord "class laws" are and in the
next version of the report give an explanation of these


Strongly seconded, inserting ``precise'' before ``explanation'' ;-)

(And I'd expect equivalences and congruences to be accessible
 on the basis of standard first-year math...)


Before we can talk about what == should return,
can we settle what we mean by = ? I think we need
to pragmatic about breaking the rules, given
suitable documentation and maybe warnings.

We should at least aspire to some principles,
which means we should try to know what we're
talking about and to know what we're doing,
even if we don't always do what we're talking
about.

I'll shut up now.

Potatoes to peel

Conor

___
Haskell-prime mailing list
Haskell-prime@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-prime


Re: Some clarity please! (was Re: [Haskell-cafe] Re: (flawed?) benchmark : sort)

2008-03-13 Thread kahl
Adrian Hey <[EMAIL PROTECTED]> wrote:

 > I would ask for any correct Eq instance something like the law:
 >(x==y) = True implies x=y (and vice-versa)

The easiest counterexample are sets (or finite maps)
provided as an abstract data type (i.e., exported without access to the
implementation, in particular constructors).

Different kinds of balanced trees typically do not produce the same
representation for the same set constructed by different construction
expressions.

Therefore, (==) on sets will be expected to produce equality of sets,
which will only be an equivalence on set representations.

 > which implies (f x == f y) = True (for expression types which are
 > instances of Eq).

This specifies that (==) is a congurence for f, and is in my opinion
the right specification:  (==) should be a congurence
on all datatypes with respect to all purely defineable functions.

But at least nowadays people occasionally do export functions
that allow access to representation details,
for example Set.toList is not specified to be representation independent,
and Set.showTree is outright specified to be implementation dependent.

Prelude> :m + Data.Set
Data.Set> showTree (fromList [1,2,3,4]) == showTree (fromList [4,3,2,1])
False
Data.Set> fromList [1,2,3,4] == fromList [4,3,2,1]
True

I consider this as an argument to remove showTree from the interface of
Data.Set, and to either specify toList to produce an ordered list
(replacing toAscList), or to remove it from the interface as well.
(mapMonotonic should of course be removed, too,
 or specified to fail (preferably in some MonadZero)
 if the precondition is violated,
 which should still be implementable in linear time.)

 > but if so would like to appeal to movers and shakers in the
 > language definition to please decide exactly what the proper
 > interpretation of standard Eq and Ord "class laws" are and in the
 > next version of the report give an explanation of these

Strongly seconded, inserting ``precise'' before ``explanation'' ;-)

(And I'd expect equivalences and congruences to be accessible
 on the basis of standard first-year math...)



Wolfram
___
Haskell-prime mailing list
Haskell-prime@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-prime


Re: Some clarity please!

2008-03-13 Thread Aaron Denney
On 2008-03-13, Ketil Malde <[EMAIL PROTECTED]> wrote:
> Aaron Denney <[EMAIL PROTECTED]> writes:
>
>> Well, the way the report specifies that max's default definition
>> is.  I'd actually favor making that not an instance function at
>> all, and instead have max and min be external functions.
>
> If you permit a naïve question:
>
> Prelude> :i Ord
> class (Eq a) => Ord a where
>   compare :: a -> a -> Ordering
>   (<) :: a -> a -> Bool
>   (>=) :: a -> a -> Bool
>   (>) :: a -> a -> Bool
>   (<=) :: a -> a -> Bool
>   max :: a -> a -> a
>   min :: a -> a -> a
>
> ..while all functions could be easily derived from 'compare'.  Or from
> the Eq instance's (==) and (<), say.
>
> What is the reason for this?  Efficiency?  (Which couldn't be handled
> equally well by RULES?)  Otherwise, it looks like an invitation for
> writing inconsistent instances.

My impression (which may not be entirely accurate) is not primarily for
efficiency (though that is one reason), but for ease of implementation.

It may be easier in some cases to think through the various cases of
compare, or to just figure out what (<=) is.  Either of these is
sufficient (perhaps in combination with (==) from the superclass).

You can write things so that any of (<), (<=), (>), or (>=) are
sufficient, but for writing the default compare, it's easiest to know
ahead of time which you are basing it on, so definitions don't get
circular.

max and min seem to have neither justification going for them, although
I suppose it's technically possible to write compare in terms of them
and (==).

I don't think GHC's RULES were around when Haskell 98 was being formalized,
nor is it clear that one compiler's method should let other efficiency
concerns go by the wayside.

Of course, it would be nice to be able to write (==) in terms of
compare.  While doable manually there's no way to default it to that
"smartly".  There are similar issues with Functor and Monad.  ISTR
some discussion about this on the list previously.

-- 
Aaron Denney
-><-

___
Haskell-prime mailing list
Haskell-prime@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-prime


Re: Some clarity please! (was Re: Re: (flawed?) benchmark : sort)

2008-03-13 Thread Aaron Denney
On 2008-03-13, Adrian Hey <[EMAIL PROTECTED]> wrote:
> Hello All,
>
> I'm top posting because I'm getting bored and frustrated with this
> thread and I don't want to respond detail to everything Aaron has said
> below.
>
> Aaron: Where are you getting this equivalence stuff from?

Not from the prose in the report, but from what the code in the report
seems designed to support.  There are several places where the code
seems to take a (small -- much usually isn't needed) bit of care to
support equivalencies.

> So I don't know if all this is really is the correct reading of the
> report, but if so would like to appeal to movers and shakers in the
> language definition to please decide exactly what the proper
> interpretation of standard Eq and Ord "class laws" are and in the
> next version of the report give an explanation of these in plain
> English using terms that people without a Mathematics degree are
> likely to understand.

I agree that the prose of the report should be clarified.

Luke Palmer's message in haskell-cafe captures why I think that "Eq
means equivalence, not strict observational equality" is a more
generally useful notion.  It's harder to guarantee observational
equality, thus harder to use code that requires it of your types.
Almost all the time (in my experience) equivalencies are all that's
generically needed.

My comments on this particular message are below.

> Because maximumBy documentation is ambiguous in this respect, so is the
> overloaded maximum documentation. At least you can't figure it out from
> the Haddock.

True.

> Despite this ambiguity, the statement that maximum is a special case of
> maximumBy is true *provided* max in the Ord instance is defined the way
> Aaron says is should be: (x==y = True) implies max x y = y.

Well, the way the report specifies that max's default definition
is.  I'd actually favor making that not an instance function at
all, and instead have max and min be external functions.

> AFAICT, the original report authors either did not perceive an
> ambiguity in maximum, or just failed to notice and resolve it.
> If there is no ambiguity this could be for 2 reasons.
>
> 1 - It doesn't matter which maximum is returned because:
>   (x==y) = True implies x=y
>
> 2 - It does matter, and the result is guaranteed to be the
> last maximum in all cases because:
>   (x==y) = True implies max x y = y

The second holds, so long as max isn't redefined.  I'd be rather
surprised at any redefinitino of max, as it's not part of any minimum
definition for Ord, and I can't think of an actual optimization case
for it.

> Regarding the alleged "max law" this too is not mentioned in the
> Haddock for the Ord class, nor is it a "law" AFAICT from reading the
> report. The report (page 83) just says that the default methods are
> "reasonable", but presumably not manditory in any semantic sense.
> This interpretation also seems to be the intent of this from the
> second para. of Chapter 8:

Agreed.  Elevating this to a "law" in my previous message was a mistake
on my part.  I still think this default in combination with the comment
is very suggestive that (min x y, max x y) should preserve distinctness of
elements.

If you're unwilling to count on this holding for arbitrary Ord
instances, why are you willing to count on (/=) and (==) returning
opposite answers for arbitrary Eq instances?

> I wouldn't dispute that the default definition is reasonable, but it's
> certainly not clear to me from the report that it's something that I
> can safely assume for all Ord instances. In fact AFAICS the report
> quite clearly telling me *not* to assume this. But I have to assume
> *something* for maximum to make sense, so I guess that must be:
>   (x==y) = True implies x=y
> IOW "I have no idea if it's the first or last maximum that is returned,
> but who cares?"

Well, you have to assume something for maximum to do what it says it
does, which isn't quite the same thing as "making sense"...

> I'm not saying some people are not right to want classes with more
> mathematically inspired "laws", but I see nothing in the report to
> indicate to me that Eq/Ord are those classes and consequently that
> the "naive" programmers interpretation of (==) is incorrect. Rather
> the contrary in fact.

It's not a question of more or less mathematically inspired, it's a
question of more or less useful.  Yes, it's slightly harder to write
code that can handle any equivalency correctly.  But code that only
handles observational equality correctly is less reuseable.  The
compilers cannot and do not check if the various laws are obeyed.  They
are purely "social" interfaces, in that the community of code writers
determines the real meaning, and what can be depended on.  The community
absolutely should come to a consensus of what these meanings are and
document them better than they are currently.

Currently, if you write code assuming a stricter meaning of Eq a, the
consequences are:
(a) it's easier for you 

Some clarity please! (was Re: [Haskell-cafe] Re: (flawed?) benchmark : sort)

2008-03-13 Thread Adrian Hey

Hello All,

I'm top posting because I'm getting bored and frustrated with this
thread and I don't want to respond detail to everything Aaron has said
below.

Aaron: Where are you getting this equivalence stuff from?

Searching the report for the word "equivalence" the only remotely
relevant section seems to be in para. 17.6..

 "When the “By” function replaces an Eq context by a binary predicate,
  the predicate is assumed to define an equivalence"

Which is fair enough, but this is talking about the argument of "By"
functions.

The Haskell wiki refers me to wikipedia, which contains the words
 "In Haskell, a class Eq intended to contain types that admit equality
  would be declared in the following way"
 http://en.wikipedia.org/wiki/Type_class

Not that this is necessarily authoritative, but it seems to be
contaradicting some peoples interpretation.

Also, on page 60 of the report I find the words
 "Even though the equality is taken at the list type.."

So I don't know if all this is really is the correct reading of the
report, but if so would like to appeal to movers and shakers in the
language definition to please decide exactly what the proper
interpretation of standard Eq and Ord "class laws" are and in the
next version of the report give an explanation of these in plain
English using terms that people without a Mathematics degree are
likely to understand.

Aaron's interpretation may indeed be very correct and precise, but
I fear in reality this is just going to be incomprehensible to many
programmers and a terrible source of bugs in "real world" code. I cite
previous "left biasing" bugs Data.Map as evidence.

I would ask for any correct Eq instance something like the law:
  (x==y) = True implies x=y (and vice-versa)
which implies f x = f y for all definable f
which implies (f x == f y) = True (for expression types which are
instances of Eq). This pretty much requires structural equality
for concrete types. For abstract types you can do something different
provided functions which can give different answers for two "equal"
arguments are not exposed.

Anything else is just wrong (according to the language specification,
even if it can be right in some mathematical sense). Before anyone jumps
down my throat, I remind you that this is a request, not an assertion! :-)

On the subject of ambiguity, bugs and maxima, I see we have in Data.List

-- | 'maximum' returns the maximum value from a list,
-- which must be non-empty, finite, and of an ordered type.
-- It is a special case of 'Data.List.maximumBy', which allows the
-- programmer to supply their own comparison function.
maximum :: (Ord a) => [a] -> a
maximum []  =  errorEmptyList "maximum"
maximum xs  =  foldl1 max xs

-- | The 'maximumBy' function takes a comparison function and a list
-- and returns the greatest element of the list by the comparison function.
-- The list must be finite and non-empty.
maximumBy   :: (a -> a -> Ordering) -> [a] -> a
maximumBy _ []  =  error "List.maximumBy: empty list"
maximumBy cmp xs=  foldl1 max xs
where
   max x y = case cmp x y of
GT -> x
_  -> y

So I believe I'm correct in saying that maximumBy returns the last
of several possible maximum elements of the list. This obviously
needs specifying in the Haddock.

Because maximumBy documentation is ambiguous in this respect, so is the
overloaded maximum documentation. At least you can't figure it out from
the Haddock.

Despite this ambiguity, the statement that maximum is a special case of
maximumBy is true *provided* max in the Ord instance is defined the way
Aaron says is should be: (x==y = True) implies max x y = y.

But it could be be made unconditionally true using..

maximum :: Ord a => [a] -> a
maximum [] = error "List.maximum: empty list"
maximum xs = maximumBy compare xs

AFAICT, the original report authors either did not perceive an
ambiguity in maximum, or just failed to notice and resolve it.
If there is no ambiguity this could be for 2 reasons.

1 - It doesn't matter which maximum is returned because:
 (x==y) = True implies x=y

2 - It does matter, and the result is guaranteed to be the
last maximum in all cases because:
 (x==y) = True implies max x y = y

But without either of the above, it is unsafe to assume
 maximum = maximumBy compare

Regarding the alleged "max law" this too is not mentioned in the
Haddock for the Ord class, nor is it a "law" AFAICT from reading the
report. The report (page 83) just says that the default methods are
"reasonable", but presumably not manditory in any semantic sense.
This interpretation also seems to be the intent of this from the
second para. of Chapter 8:

"The default method definitions, given with class declarations,
 constitute a specification only of the default method. They do not
 constitute a specification of the mea