Re: Pattern Match Success Changes Types
On 28-May-1998, Adrian Hey [EMAIL PROTECTED] wrote: A strong type system is supposed to prevent runtime errors. Thus it makes sense to disallow anything that might result in an attempt to access an unbound type. Yes, but in the cases we've been talking about we know that there will be no access to the unbound type since the reason the type is unbound is that it isn't referenced by expressions such as (Right something). That does not imply that the type won't be used. As shown in my example program, the type may be used for method dispatch, even if there aren't any instances of the type (i.e. values which have that type) around. as a general rule, I completely disregard Haskell's class system when ever possible. If this makes me a naive user, then I must plead guilty. You shouldn't disregard it "when ever possible"; you should only disregard it in cases where you will often want to apply different methods to the same data type. I know using Haskell Classes will probably result in neater code a lot of the time. But when I write modules I like to think that they will last me 'forever', so I try to make them as general purpose as possible in the hope that they might come in handy again some day. Given the choice between a neat but non-general way of doing things and a messy but general way I am always inclined towards the latter. Ah, but the type class method is general. It's just a question of where you put the messiness. With type classes, the common case is elegant, but things get messy (you have to use newtype, etc.) if you want to use operations other than the default for the type. With the alternatives that you are advocating, using different operations is elegant, but things are a little messy (you have to pass stuff explicitly) in the common case. I suppose I could probably get around this problem by by making different types using newtype, but I this would no doubt cause other complications elsewhere. Yes, perhaps. See Alistair Reid's proposal for newtype typecasts to avoid some of these complications. I think I've seen this. It didn't seem to go down to well if my memory is correct. True, but maybe considerations such as these will sway the balance a bit! ;-) -- Fergus Henderson [EMAIL PROTECTED] | "I have always known that the pursuit WWW: http://www.cs.mu.oz.au/~fjh | of excellence is a lethal habit" PGP: finger [EMAIL PROTECTED]| -- the last words of T. S. Garp.
Re: Pattern Match Success Changes Types
On Thu 28 May, Fergus Henderson wrote: Unbound type variables can lead to runtime errors. Its difficult to see how they can be always avoided without writing explicit type signatures to do this. A strong type system is supposed to prevent runtime errors. Thus it makes sense to disallow anything that might result in an attempt to access an unbound type. Yes, but in the cases we've been talking about we know that there will be no access to the unbound type since the reason the type is unbound is that it isn't referenced by expressions such as (Right something). As soon as the (Right something) becomes an alternative result to (Left somethingelse) then the type variable will no longer be unbound. What is the ultimate purpose of type checking a program? The answer has to be, to avoid run-time errors. Beyond this, the type system seems to serve no purpose and so should not reject definitions such as that above, in an ideal world. The Haskell type system is used for much more than just avoiding run-time errors. In particular, the Haskell system of type classes means that types are part of the semantics. I'm begining to understand the significance of this point. I suppose it also means that I couldn't be more wrong in my belief that Haskell is intended as a target language for code synthesis tools in general. What these really need is a non-type checked language, ideally (assuming the synthesised code had somehow been proven to be type safe). Some people have argued against this, arguing for a system where types do not affect the operational semantics, only the legality. But this leads to a less expressive language (for example, you can't have a polymorphic `read'). I think I agree with these people. :-( as a general rule, I completely disregard Haskell's class system when ever possible. If this makes me a naive user, then I must plead guilty. You shouldn't disregard it "when ever possible"; you should only disregard it in cases where you will often want to apply different methods to the same data type. I know using Haskell Classes will probably result in neater code a lot of the time. But when I write modules I like to think that they will last me 'forever', so I try to make them as general purpose as possible in the hope that they might come in handy again some day. Given the choice between a neat but non-general way of doing things and a messy but general way I am always inclined towards the latter. Perhaps its just my temperament rather than logic that makes me do this :-) In many situations, the methods you want are usually fixed for each type, and in those sort of situations, it makes sense to use type classes. Arithmetic being the obvious example. Although, even here it is conceivable that one might want to employ different multiplication algorithms (for example) on different occasions, even if the number representation didn't change. But this is probably rare, I must conceed. I suppose I could probably get around this problem by by making different types using newtype, but I this would no doubt cause other complications elsewhere. Yes, perhaps. See Alistair Reid's proposal for newtype typecasts to avoid some of these complications. I think I've seen this. It didn't seem to go down to well if my memory is correct. (I don't mean that as critisism of the Alistair Reid's proposal, but as a reflection of the fact that there appears to be considerable diversity of opinion on these issues.) Regards -- Adrian Hey
Re: Pattern Match Success Changes Types
On 11-May-1998, Simon L Peyton Jones [EMAIL PROTECTED] wrote: Since GHC keeps the types right through the compiler, it really can't do CSE on two terms of type Either Int Int Either Bool Int even if they are both applications of Right. Actually, GHC does finally discard type information right at the end, so we could do an extra bit of CSE there, but frankly I doubt it would buy very much. A simpler way of doing this is for the CSE pass to just insert a call to a compiler builtin function `$unsafe_cast' if the types don't match. At the very end when you finally discard type information you can then optimize away the call to `$unsafe_cast'. This provides the same benefits without the need for an extra pass. It probably doesn't buy a lot, but IMHO it's easy enough that it's worth doing anyway. Incidentally, I don't think it would be sensible to change the type system to allow the demo1 :: (a - b) - Either a c - Either b c demo1 f (Left a) = Left (f a) demo1 _ r@(Right c) = r What type does r have? Either a c. What type does the result of the fn have? Either b c. Different types. I agree, this code should be disallowed. Note that the different types can lead to different semantics. Consider the following code, which is similar to the code above: foo :: (Int - Float) - Either Int Char - Either Float Char foo f (Left a) = Left (f a) foo _ r@(Right c) = classmethod r class Demo t where classmethod :: t - Either Float Char instance Demo Either Int Char where classmethod = m1 instance Demo Either Float Char where classmethod = m2 Here `r' has type `Either Int Char', not `Either a Char', and this determines which class method is called. Note that writing foo _ r@(Right c) = classmethod (Right c) would result in compile error (or worse) due to an uninstantiated type variable. -- Fergus Henderson [EMAIL PROTECTED] | "I have always known that the pursuit WWW: http://www.cs.mu.oz.au/~fjh | of excellence is a lethal habit" PGP: finger [EMAIL PROTECTED]| -- the last words of T. S. Garp.
Re: Pattern Match Success Changes Types
Hello, Thanks to everybody who replied on this thread. I'm afraid I've got to go away for a couple of weeks, so I can't join in any more :-( Here are my views on the most recent postings... On Mon 11 May, Jon Mountjoy wrote: -- I would guess 'sometimes'. In some cases of course is it, but in other cases you might increase the scope of an expression and thereby worsen the space behaviour. Have there been any attempts to identify/quantify this? My first reaction to this was to wonder how it could ever be advantageous to reduce the same expression several times instead of just once. Then, on reflection I thought Gosh!, I see what you mean, thats never occured to me before. Lazyness seems so blooming complicated in practice, perhaps I'd better revert back to ML :-) On Mon 11 May, Olaf Chitil wrote: - See my paper: Common Subexpressions are Uncommon in Lazy Functional Languages I think I will to, looks interesting. On Mon 11 May, Simon L Peyton Jones wrote: -- My advice would be: write clear code, and let the compiler do the CSE. I take it you mean avoid using 'as patterns'. I don't suppose there's any possibility of expunging them from the language in due course. No? I thought not. If I remember right, David Turners Miranda seemed to be able to live without 'as patterns', didn't it?. I never got to use Miranda, I only read about it. If it doesn't, complain to the compiler writers. You have good reason to believe that it should. Do you think they will take any notice of me :-) (I don't know what "_" is) but rather Just for the record, I use _ as a 'wildcard' instead of a type variable which occurs only once. Perhaps this is ambiguous wrt quantification? Actually, GHC does finally discard type information right at the end, so we could do an extra bit of CSE there, but frankly I doubt it would buy very much. But I'm willing to stand corrected. I don't think you can say this. Granted in this trivial example we are only talking about wasting constructor record per 'demo expression'. But in other more complex examples we could be talking several. Also the total heap space that gets wasted this way is not an intrinsic property of the demo function. It depends on the number of 'demo expressions' (or similar) which get reduced. I don't think you can predict how many this will be with any generality. In some programs it could conceivably be huge Nos. Couldn't it?. It would be nice to re-use the existing constructors, even if the type checker says thats illegal. I think its reasonable to assume that this is _operationally_ safe in most Haskell implementations, even if the Type Checker does reject it. I imagine that there isn't a single Haskell implemention where the representation of (Right c) is in any way dependent on the type of the argument of the 'Left' constructor. Its difficult to see how it can be, because the type of the 'Left' constructors argument may not be known at compile time. Also, such dependency implies compliation of numerous different versions of 'demo' in order to cope with the multitude of possibilities. If I'm wrong, please correct me. (After all, I've had nothing to do with any Haskell implementation so it is quite possible that my assumptions are invalid.) Also, I note that the problem Jon Mountjoy mentioned is not relevant to this particular optimisation. This optimisation will have no effect on the sharing of expressions which are bound to the 'terminal' pattern variables (Is that correct terminology? Anyway, you know what I mean.) The only issue is do existing constructor records get re-used, or are new ones created. I think the first option has to be more efficient, in both time space. Incidentally, I don't think it would be sensible to change the type system to allow the demo1 :: (a - b) - Either a c - Either b c demo1 f (Left a) = Left (f a) demo1 _ r@(Right c) = r What type does r have? Either a c. I don't think I agree, thats the whole point of my argument. Either a c is the type of the second argument of demo1. Things are different for 'r' because it it bound to the pattern/expression (Right c), so we have more information, and may (subject to certain constraints) 'take liberties' with the type assigned to 'r'. I would say that, for all practical purposes, (r == Right c). Its seems to me that an important feature of Haskell is that constructors are supposed to behave like functions, are they not? So, given that Right is a (hopefully well behaved) function, the 'value' of (Right c) is dependent only on the 'value' of c. It seems to be absurd to suggest that the type of (Right c) is, in some way, also dependent on the type of an expression which is not referenced at all. This smells strongly of 'side effects' to me, type wise at least. What type does the result of the fn have? Either b c. Different types. It would be
Re: Pattern Match Success Changes Types
On 12-May-1998, Frank A. Christoph [EMAIL PROTECTED] wrote: With regard to merging Either instances, I agree with Simon that for most programs this will not buy you much, but there are two common kinds of programs where one could expect a significant effect on performance, just because of sheer scale. The first is any program which uses an error/exception monad on a program-wide scale. Another common example, or perhaps an instance of your example, is parsers. Parsers often use something similar to an error/exception monad for propagating parse errors. Parsing applications are very common, so it's probably worth optimizing these cases. -- Fergus Henderson [EMAIL PROTECTED] | "I have always known that the pursuit WWW: http://www.cs.mu.oz.au/~fjh | of excellence is a lethal habit" PGP: finger [EMAIL PROTECTED]| -- the last words of T. S. Garp.
RE: Pattern Match Success Changes Types
On Tue, 12 May 1998, Koen Claessen wrote: Frank A. Christoph wrote: | With regard to merging Either instances, I agree with Simon that for most | programs this will not buy you much, but there are two common kinds of | programs where one could expect a significant effect on performance, just | because of sheer scale. It is not only Either instances who suffer from this. Consider the following definition of "map", which could be made by a naive user: map :: (a - b) - [a] - [b] map f (x:xs) = f x : map f xs map f xs = xs Same problem here. Where is the CSE in theis def of map? Why is it naive? (Hugs ghc define map on lists exactly like that.) Maybe I'm naive... -- m --- Mariano Suarez Alvarez The introduction of Departamento de Matematica numbers as coordinates Universidad Nacional de Rosario [...] is an act of violence Pellegrini 250 A. Weyl 2000 Rosario - Argentina e-mail: [EMAIL PROTECTED] ---
RE: Pattern Match Success Changes Types
On Tue, 12 May 1998, Mariano Suarez Alvarez wrote: | On Tue, 12 May 1998, Koen Claessen wrote: | |map :: (a - b) - [a] - [b] |map f (x:xs) = f x : map f xs |map f xs = xs | | Where is the CSE in theis def of map? Why is it naive? (Hugs ghc define | map on lists exactly like that.) Maybe I'm naive... Sorry, I was too brief. I meant this: the third line map f xs = xs does not type check. It should be: map f [] = [] Regards, Koen. -- Koen Claessen, [EMAIL PROTECTED], http://www.cs.chalmers.se/~koen, Chalmers University of Technology.
RE: Pattern Match Success Changes Types
On Tue 12 May, Frank A. Christoph wrote: I'm experiencing a little bout of deja vu here, so excuse me if it turns out that I'm repeating myself. (I could swear I already posted this, but it's not in my "Messages Sent" folder...) With regard to merging Either instances, I agree with Simon that for most programs this will not buy you much, but there are two common kinds of programs where one could expect a significant effect on performance, just because of sheer scale. The first is any program which uses an error/exception monad on a program-wide scale. The second is a program that uses Chalmers' fudgets library since, as I recall, the type Either plays a prominent role in the "messaging" system. I just caught this one... Precisely, I myself have written a parser in which functions similar to 'demo' appear all over the place. Either is a common data type to use to represent Success or Failure/Exceptions. The same problem is also apparent with the use of Maybe. In fact, just about every algebraic type I can think of will have some constructors that don't reference every type argument, and can therefore be regarded as 'more polymorphic' when they appear in patterns. Regards -- Adrian Hey
Re: Pattern Match Success Changes Types
Yes, GHC does some CSE stuff, but not very much. I don't think it has a large performance impact, but (as luck would have it) but I plan to work on it a bit in the newt few months. My advice would be: write clear code, and let the compiler do the CSE. If it doesn't, complain to the compiler writers. You have good reason to believe that it should. The exception is the case you've been discussing. The type of Right is not Right :: b - Either _ b (I don't know what "_" is) but rather Right :: forall a,b. b - Either a b Since GHC keeps the types right through the compiler, it really can't do CSE on two terms of type Either Int Int Either Bool Int even if they are both applications of Right. Actually, GHC does finally discard type information right at the end, so we could do an extra bit of CSE there, but frankly I doubt it would buy very much. But I'm willing to stand corrected. Incidentally, I don't think it would be sensible to change the type system to allow the demo1 :: (a - b) - Either a c - Either b c demo1 f (Left a) = Left (f a) demo1 _ r@(Right c) = r What type does r have? Either a c. What type does the result of the fn have? Either b c. Different types. It would be hard (I believe) to specify crisply when it was legitimate for two terms with different types to be as'd together. Simon
Re: Pattern Match Success Changes Types
While on this topic I would like to ask the question: Is CSE very useful for Haskell programs? I would guess 'sometimes'. In some cases of course is it, but in other cases you might increase the scope of an expression and thereby worsen the space behaviour. Have there been any attempts to identify/quantify this? Jon
Re: Pattern Match Success Changes Types
On Fri 08 May, Victor M. Gulias wrote: I have found an example that seems to be related to this thread. The map function behaves like this: map f [a1,a2,...,an] = [f a1, f a2, ..., f an] Suppose a function f defined as: f :: Maybe a - Int f Nothing = 1 f _ = 0 the following expression is type checked successfully: sum [f Nothing, f (Just True), f (Just "hi")] it counts the number of "Nothing"s appearing in the list (1). Replacing the above expression folding map, that is, sum (map f [Nothing, Just True, Just "hi"]) or even defining sum_nothings = sum . map f, sum_nothings [Nothing, Just True, Just "hi"] obviously does not type check. However, the expression has the same meaning as the one unfolding the map. Well, suppose that you actually *type* the list as [Nothing, Just True, Just "hi"] :: [Maybe _] where _ is a *who-knows* type (heterogeneous type?). Now, the expression can be typed because f :: Maybe a - Int does not need to know what polymorphic "a" really is ("a" is a *who-cares* type). Or supposing [True, "hi"] :: [_] sum (map f (map Just [True, "hi"])) also would be ok. I can imagine some kind of typing rule somehow __ __ | |- x :: A| |- xs :: [B] --- A/=B \/ A=_ \/ B=_ __ | |- x:xs :: [_] does it make any sense? is it related to other stuff like dynamic or existential typing? I'm not sure if this is related to to my original question about typing of successful pattern matches. Expecting a type checker to allow sub- expressions which are badly typed (even if the result of evaluating that expression would be ok type wise) seems to be asking too much. Couldn't allowing this sort of thing also make some possible program transformations fail at run-time? I think I'm getting way out of my depth here (I haven't a clue what a 'heterogeneous type' is), so I'll say no more about your idea, other than it seems to require a pretty radical new type system which is probably beyond my comprehension :-) My original point (that a variable which is bound to a particular pattern should be typed the same way as the corresponding expression when the variable is used in expressions) seems to me to be entirely consistent with the current Haskell type system and doesn't require any new theory about typing. Only the type checker needs to be modified to accomodate this. Regards -- Adrian Hey
Re: Pattern Match Success Changes Types
On 07-May-1998, [EMAIL PROTECTED] [EMAIL PROTECTED] wrote: Adrian Hey writes: Consider the following function.. demo1 :: (a - b) - Either a c - Either b c demo1 f (Left a) = Left (f a) demo1 _ (Right c) = Right c My first question is, can programmers safely assume that the compiler is sufficiently smart to avoid duplicating the expression (Right c) in the event that the second match above succeeds? One might reasonably hope that the run time implementation of this function will simply return the functions second argument in this case. Sadly I don't think one can assume this at all, and for a fairly good reason. The argument (Right c :: Either a c) and the result (Right c :: Either b c) need not have the same runtime representation. You can't be _certain_ that they will have the same runtime representations, but I think this is likely to be true for most implementations. Otherwise, how can the compiler generate (efficient) code for a polymorphic function that takes an argument of type `Either t1 t2'? In Mercury, such cases do have the same runtime representation and the Mercury compiler will do the common subexpression elimination in those cases. In a strict language like ML, for example, we might well have sizeof (Either a c) = sizeof tag + max (sizeof a, sizeof c) sizeof (Either b c) = sizeof tag + max (sizeof b, sizeof c) and there is no reason for these to be the same. If you're not doing destructive update, the size of the type isn't going to matter -- all you need is for the data to have the right representation. For example, sizeof(Either a c) might be 8, and sizeof(Either b c) might be 12, but sizeof(Right c) would be the same in either case (and = 8). Indeed, if you have a discriminated union of things of different sizes, then it is an important optimization to be able to allocate a size smaller than the maximum for a type for data items that don't occupy the maximum size. No current Haskell implementations do destructive update optimization, and I'm not aware of any plans to add it to any of the existing implementations, but even for implementations that do perform that kind of optimization, I think the advantage of the space-saving optimization mentioned above would probably outweigh any increased potential for destructive update optization that might be enabled by always allocating the full size even for data that doesn't occupy the full size. Thus I think it unlikely that an implementation would adopt such a strategy. -- Fergus Henderson [EMAIL PROTECTED] | "I have always known that the pursuit WWW: http://www.cs.mu.oz.au/~fjh | of excellence is a lethal habit" PGP: finger [EMAIL PROTECTED]| -- the last words of T. S. Garp.
Re: Pattern Match Success Changes Types
On Thu 07 May, [EMAIL PROTECTED] wrote: Adrian Hey writes: Consider the following function.. demo1 :: (a - b) - Either a c - Either b c demo1 f (Left a) = Left (f a) demo1 _ (Right c) = Right c My first question is, can programmers safely assume that the compiler is sufficiently smart to avoid duplicating the expression (Right c) in the event that the second match above succeeds? One might reasonably hope that the run time implementation of this function will simply return the functions second argument in this case. Sadly I don't think one can assume this at all, and for a fairly good reason. The argument (Right c :: Either a c) and the result (Right c :: Either b c) need not have the same runtime representation. In a strict language like ML, for example, we might well have Yes, I suppose I should qualify my remark (about returning the second argument) by saying that programmer can only rely on this if the two representations are identical. However, in a lazy language it seems to me that it is practically certain that (Right c) is always represented by a tag field and pointer to the expression 'c', but maybe I'm wrong. I suppose my original message basically boil down to 2 questions, one simple, the other contentious. Simple Question --- Is there anything to be gained by using 'as patterns' (in those circumstances where the type checker permits), or can the programmer safely assume that a good compiler will re-use any patterns which also appear in expressions? Contentious Question Does an expression of form (Right c) always have the same type, regardless of the type of any associated expressions of form (Left a)? I think it does. If this is so then shouldn't the type checker take account of the fact that after a successful match and the expression is known to be of form (Right c), then the type of the expression then becomes somewhat less specific (more polymorphic) than it was before the match occurred. My argument would go something like this: 1- In Haskell the constructors of algebraic data types are supposed to behave like any other function when they appear in expressions. In particular the 'Right' constructor has type.. Right :: b - Either _ b 2- Referential transparency demands that any expression of form (Right b) is dependent only on the value of b. Likewise the type of this expression is only dependent on the type of b.. e = Right b b :: tb e :: Either _ tb To say that in some circumstances an expression of form (e = Right b) might have type.. e :: Either Int tb (for example) but at other times might have type.. e :: Either Char tb seems to me to be highly illogical, downright perverse in fact. I would say it _always_ has type.. e :: Either _ tb Now, it may well be true that some implementations might use different representations for e (= Right c) on different occasions, but if this is so its the responsibility of the compiler implementer to deal with this transparently. It doesn't seem reasonable (to me at least) to introduce nasty language semantics (such as equal values having un-equal types) for no other reason than to cater for this possibility. Sorry, I probably haven't expressed my opinion with the precision that computer scientists normally expect, but I'm sure you get the gist of my argument. I'd be really interested in what anybody else thinks about this, because I'm about to try to implement a polymorphic type checker of my own. I haven't started designing it yet though. I'm still trying to get my head round some of the subtleties of type semantics and safe inference methods. Regards -- Adrian Hey
Pattern Match Success Changes Types
Consider the following function.. demo1 :: (a - b) - Either a c - Either b c demo1 f (Left a) = Left (f a) demo1 _ (Right c) = Right c My first question is, can programmers safely assume that the compiler is sufficiently smart to avoid duplicating the expression (Right c) in the event that the second match above succeeds? One might reasonably hope that the run time implementation of this function will simply return the functions second argument in this case. In general, is it a requirement that Haskell compilers implement an optimisation which re-uses patterns on the left hand side which also appear in expressions on the right hand side? If not, shouldn't this optimisation be mandatory? The reason I suggest that it should be is apparent when you look at my next example. A prudent programmer might decide not to rely on this optimisation and try to write something like this.. demo2 :: (a - b) - Either a c - Either b c demo2 f (Left a) = Left (f a) demo2 _ x@(Right c) = x Unfortunately the Type Checker rejects the above 'as pattern'. Hugs 1.3 returns the following error: *** Expression: demo2 *** Declared type : (a - b) - Either a c - Either b c *** Inferred type : (a - a) - Either a b - Either a b I'm afraid I don't know what other Haskell implementations do, but I've encountered this problem with every ML compiler I've tried. This suggests that the Type Checkers used by most compilers are less than ideal, in that they don't take account of the fact that a successful match may change the type of an 'as pattern' variable or function argument. Perhaps there is some fundamental theoretical reason why this can't be done, but I certainly can't think of one. So, what is recommended practice for programmers in this situation? Should programmers use 'as patterns' (assuming the type checker will allow them) or not? Which Haskell compilers currently implement the optimisation I mentioned above? All? Any? None? 'As patterns' look awfully messy and it would be nice to not to have to use them, but will there be efficiency penalties to pay if they're not used? If Haskell requires programmers to program defensively (as far as this problem is concerned) then shouldn't the Type Checkers allow more liberal use of 'as patterns'? Regards -- Adrian Hey