Re: [Haskell-cafe] The difficulty of designing a sequence class
On 31.07 16:27, Brian Hulley wrote: None of the above type classes would be compatible with Data.ByteString! (You mentioned this issue before wrt Data.Edison.Seq but it just clicked with me now for the above refactoring.) For compatibility, the element type would need to appear also thus: class Foldable f_a a | f_a - a where fold :: (a - b - b) - b - f_a - b With the new System FC (when it is merged) we could make these classes nicer. class ElementType c a | c - a instance ElementType [a] a instance ElementType ByteString Char instance IArray a e = ElementType (a i e) e class Foldable c where fold :: ElementType c a = (a - b - b) - b - c - b This won't work at the moment due to limitations in GHC, but seems like a cleaner solution. - Einar Karttunen ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re[2]: [Haskell-cafe] The difficulty of designing a sequence class
Hello Brian, Tuesday, August 1, 2006, 4:23:53 AM, you wrote: That's a tough call to make. Changing the kind of Sequence to * from * - * means losing the Functor, Monad, and MonadPlus superclasses and all the various maps and zips. But there's no option if you want to be able to support non-polymorphic sequences like Data.ByteString etc. I think the Functor class is just fundamentally too limited - it assumes the whole world is polymorphic and it isn't. it's possible, at least in principle, to make ByteString parametric class: data PlainSequence a = ... type ByteString = PlainSequence Word8 and then rewrite all ByteString functions so that they will work with elements of any type, not just Word8 -- Best regards, Bulatmailto:[EMAIL PROTECTED] ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re[2]: [Haskell-cafe] The difficulty of designing a sequence class
Hello Brian, Tuesday, August 1, 2006, 4:43:23 AM, you wrote: As you've pointed out, there are 2 separate issues that are in danger of being confused: 1) Forcing all sequence instances to support all operations 2) Bundling all the ops into a single huge class Collections library (darcs get --partial http://darcs.haskell.org/packages/collections/) defines good hierarchy of collection classes -- Best regards, Bulatmailto:[EMAIL PROTECTED] ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re[2]: [Haskell-cafe] The difficulty of designing a sequence class
Hello John, Tuesday, August 1, 2006, 6:27:29 AM, you wrote: It is best to think of haskell primitives as something completely new, they reuse some naming conventions from OO programming, but that doesn't mean they suffer from the same limitations. It took me a few trys to wrap my brain around it. I liken learning haskell to tipping over a vending machine. you can't just push it, you gotta rock it back and forth a few times building up momentum until bam! suddenly the flash of insight hits and it all makes sense. btw, http://homepages.cwi.nl/~ralf/gpce06/ can help on this way. i think that there is also other papers that can help too but i also agree with Brian - there is some constraints in Haskell comparing to C++ and vice versa. while developing Streams library i was many times bitten by these restrictions -- Best regards, Bulatmailto:[EMAIL PROTECTED] ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re[2]: [Haskell-cafe] The difficulty of designing a sequence class
Hello Einar, Tuesday, August 1, 2006, 1:58:30 PM, you wrote: class ElementType c a | c - a class Foldable c where fold :: ElementType c a = (a - b - b) - b - c - b i love it! will it be possible to write smth like this: class Stream m h | h-m data T h = (Stream m h) = C (m Int) ? currently, i need to pass 'm' parameter to T type constructor too, because GHC can't guess what 'm' is determined by 'h' -- Best regards, Bulatmailto:[EMAIL PROTECTED] ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] The difficulty of designing a sequence class
On Jul 31, 2006, at 10:27 PM, John Meacham wrote: [snip] It is best to think of haskell primitives as something completely new, they reuse some naming conventions from OO programming, but that doesn't mean they suffer from the same limitations. It took me a few trys to wrap my brain around it. I liken learning haskell to tipping over a vending machine. you can't just push it, you gotta rock it back and forth a few times building up momentum until bam! suddenly the flash of insight hits and it all makes sense. Do you have a lot of personal experience attaining zen-like insight by tipping over vending machines? I'll have to try that some time ;-) *chucke* Thanks for making me laugh this morning. John Rob Dockins Speak softly and drive a Sherman tank. Laugh hard; it's a long way to the bank. -- TMBG ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: Re[2]: [Haskell-cafe] The difficulty of designing a sequence class
Bulat Ziganshin wrote: Hello Brian, Tuesday, August 1, 2006, 4:43:23 AM, you wrote: As you've pointed out, there are 2 separate issues that are in danger of being confused: 1) Forcing all sequence instances to support all operations 2) Bundling all the ops into a single huge class Collections library (darcs get --partial http://darcs.haskell.org/packages/collections/) defines good hierarchy of collection classes Hi Bulat - Thanks for the link to the collections repository. I've at last taken the plunge and installed darcs and managed to get this onto my computer (although with a strange warning from darcs: 'plink: unknown option -O '). I'll have to have a proper look into it. On superficial inspection, there are some unusual choices - for example putting (size) and (null) into Foldable instead of Collection, and calling null null instead of isEmpty (considering that Collection has a method called isSingleton which follows the usual convention of starting unary predicates with is). In any case it's interesting to see another possible factoring of the concept of collections to compare with Edison and the existing base collections. Regards, Brian. -- Logic empowers us and Love gives us purpose. Yet still phantoms restless for eras long past, congealed in the present in unthought forms, strive mightily unseen to destroy us. http://www.metamilk.com ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: Re[2]: [Haskell-cafe] The difficulty of designing a sequence class
On Tue, 2006-08-01 at 14:37 +0400, Bulat Ziganshin wrote: Hello Brian, Tuesday, August 1, 2006, 4:23:53 AM, you wrote: That's a tough call to make. Changing the kind of Sequence to * from * - * means losing the Functor, Monad, and MonadPlus superclasses and all the various maps and zips. But there's no option if you want to be able to support non-polymorphic sequences like Data.ByteString etc. I think the Functor class is just fundamentally too limited - it assumes the whole world is polymorphic and it isn't. it's possible, at least in principle, to make ByteString parametric class: data PlainSequence a = ... type ByteString = PlainSequence Word8 and then rewrite all ByteString functions so that they will work with elements of any type, not just Word8 Much of the performance advantages that we can get are due to the special representation and knowing the types involved. This would not work for arbitrary types. For one thing all the performance advantages of using a packed representation would be lost as we would have to use pointers to boxed objects rather than flat arrays of bytes. We can use low level standard C functions like memchr because we know the representation. There is certainly a place for a general strict sequence type but there is also a need for efficient handling of big chunks of bytes. It'd be even better if the special case representation collections could fit into a general framework. Duncan ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] The difficulty of designing a sequence class
John Meacham wrote: On Tue, Aug 01, 2006 at 02:56:21AM +0100, Brian Hulley wrote: Now the problem is that person C may come along and notice that there is a useful abstraction to be made by inheriting both from ClassA and ClassB. But both of these define foo and there is no mechanism in the language to resolve this. This is not true at all. every name in haskell can be uniquely specified. module ClassA where class ClassA a where foo :: a - Int module ClassB where class classB a where foo :: a - String import ClassA import ClassB class (ClassA a, ClassB a) = ClassC a where bar :: a - (Int, String) bar x = (ClassA.foo x, ClassB.foo x) Hi John - Thanks for pointing this out. The only thing that I find slightly uneasy about it is that you have to keep the module names and class names in sync so that ClassA.foo which means the value foo exposed by the module ClassA is in truth the same as the foo method of class ClassA. The language doesn't enforce this correspondence (indeed perhaps such a correspondence would be undesirable but it's difficult to get used to this need to keep different entities in sync instead of just being able to name the concept once in one place). I'm also wondering if it would be a good idea to be able to declare some class methods as final, so they don't clutter up the dictionary at runtime, and so that we could end the dubious practice of declaring some functions which are conceptually part of a class as top level functions outside the class just to save space/time in the dictionary and therefore needing the physical module to create the conceptual grouping instead of using the language level grouping provided by the class name. I think a fundamental thing you are missing is that Haskell classes are not like C# or Java or other OO classes. Not because of implementation, but rather they are actually fundamentally different things. The reasons people don't place certain functions in classes has nothing to do with the size of class dicionaries. Heck, jhc doesn't even use dictionaries at all, there is no cost for adding methods to a class. People place them in top level functions because it makes more sense. of course, sometimes it is gotten wrong, and something would have been better off as a class method, but in general there are different concerns when dealing with haskell classes than OO classes. An OO class could be considered equivalent to a triplet of a Haskell data type, a Haskell existential with a class constraint, and a class with the resriction the class type can _only_ appear as the first argument to each method. In haskell all of these things are separate independent tools and are much more general and powerful than the limited and conjoined form that OO programming provides. I think the transition from OOP to Haskell is difficult because although OOP is less powerful for the reasons you've mentioned, many of the decisions you have to make when writing Haskell code just don't exist in OOP eg in C# there are no modules to worry about, every class belongs to a file with the same name, you don't need to decide where to put the object argument and there are no top level functions or values. In contrast, in Haskell you have to juggle class declarations, instance declarations, types, values which may or may not be part of a class, and decide which combinations of the above should go into which modules and what the names of the modules should be and manually remember to keep some module names and class names (or type names eg Data.Set and data Set a = ...) in sync. Anyway these are probably more long term ideas but I mentioned them now just to hopefully start the ball rolling (the above should not be taken as a criticism of Haskell, I'm just saying that at some point we need all the normal mechanisms that everyone else (Java, C#) takes for granted because there's no point waiting till we encounter the same well-known software engineering problems that already have well established good solutions). It is best to think of haskell primitives as something completely new, they reuse some naming conventions from OO programming, but that doesn't mean they suffer from the same limitations. Hopefully my neural pathways will reconnect themselves soon to take advantage of all this new power!!! :-) It took me a few trys to wrap my brain around it. I liken learning haskell to tipping over a vending machine. you can't just push it, you gotta rock it back and forth a few times building up momentum until bam! suddenly the flash of insight hits and it all makes sense. Great image! We need a film or video of a day in the life of Haskell with these kind of things in it eg camera cutting from scene of vending machine satori to animated folds rippling through space gathering structure like a new planet forming, then another cut to a silent robed figure casting objects from a sack
Re: [Haskell-cafe] The difficulty of designing a sequence class
On Jul 30, 2006, at 5:28 PM, Brian Hulley wrote: Robert Dockins wrote: On Sunday 30 July 2006 07:47, Brian Hulley wrote: Another option, is the Edison library which uses: class (Functor s, MonadPlus s) = Sequence s where so here MonadPlus is used instead of Monoid to provide empty and append. So I've got three main questions: 1) Did Edison choose MonadPlus just because this fitted in with the lack of multi-parameter typeclasses in H98? Edison's design hails from a time when MPTCs were not only non-standard (as they still are), but also not widely used, and before fundeps were avaliable (I think). So the answer to this one is pretty much yes. [snip] Hi - Thanks for the answers to this and my other questions. One thing I just realised is that there doesn't seem to be any instance declarations anywhere in the standard libs relating Monoid to MonadPlus so it's a bit unsettling to have to make a random choice on the question of what kind of object a Sequence is... I tried: class (forall a. Monoid s a) = Sequence s where ... but of course that doesn't work, so I suppose MonadPlus is the only option when 'a' doesn't appear as a type variable arg of the class being defined. BTW, for what purpose are you desiging a new sequence class? You are clearly aware of other efforts in this area; in what ways to they not meet your needs? The existing sequence and collection classes I've looked at don't do enough. For example, when I tried to represent the text in an edit widget, I realised I needed a sequence of characters that could also be considered to be a sequence of lines, and it is necessary to be able to index the sequence by character position as well as by line position, as well as keeping track of the total number of characters, the total number of lines, and the maximum number of characters on any one line (so as to be able to calculate the x,y extents when laying out the widget, assuming a fixed width font (tabs ignored!)), with very efficient split and append operations. So, what you want is a sequence of sequences that can be transparently converted to a flattened sequence and vice versa? And you also want to keep track of the total number of lines and characters within each line. Additionally, you want to keep track of the maximum number of characters in any one line. I managed to get a good representation by using a FingerTree of lines where each line uses a ByteString. I made my own FingerTree class based on the one referenced in the paper at http://www.soi.city.ac.uk/~ross/papers/FingerTree.html but without the symbolic names which I find totally unreadable and confusing, and also so I could get full control of the strictness of the implementation, and also as a way of understanding them since I'd never come across such a complicated data structure before. (I highly recommend this paper to anyone who wants to learn about FingerTrees, Monoids and other very useful concepts.) So one thing existing sequence classes don't have (apart from FingerTree) is the concept of measurement which is essential when you want efficient updates. Eg in my text buffer, the measurement maintained for a sequence is the number of chars and number of lines and maximum line length. Edison has support for transparently keeping track of the size of a sequence. http://www.eecs.tufts.edu/~rdocki01/docs/edison/Data-Edison-Seq- SizedSeq.html It may well be possible to create a slightly generalized wrapper that keeps track of arbitrary measures. (If they can be computed by a function which is associative, commutative and has a unit). Humm, sort of an incremental fold I like it. Then I needed a structure for a Trie widget a bit like (details omitted): data Node = Expanded Value T | Collapsed Value T | Leaf Value newtype T = T (FingerTree (Key, Node)) where objects of type T could be regarded as a finite map (eg from hierarchical module names to modules) as well as a flattened linear sequence indexed by line number (for display on the screen in a widget given the current scroll bar position), and which also needed to keep track of the total horizontal and vertical extent of the Trie as it would appear in the widget's font. There are several different kinds of measurement going on in this data structure, as well as the complexity of the extra recursion through the leaf to a new level. Existing sequence abstractions don't seem to provide the operations needed to treat a nested data structure as a single sequence. In summary: 1) Often a complex data structure must be able to be simultaneously regarded as a single flattened sequence 2) Measurements are needed for efficient updates (may need to keep track of several at once) 3) Indexing and size are sometimes needed relative to the flattened sequence not just the top level 4) It is useful to have a finite map that can also be
Re: [Haskell-cafe] The difficulty of designing a sequence class
Robert Dockins wrote: On Jul 30, 2006, at 5:28 PM, Brian Hulley wrote: Robert Dockins wrote: So, what you want is a sequence of sequences that can be transparently converted to a flattened sequence and vice versa? Yes as long as the conversion between them takes no time at all - the sequence of sequences and flattened sequence must coexist simultaneously. The concrete data structure is a sequence of sequences and the flattened sequence is a particular view of it. And you also want to keep track of the total number of lines and characters within each line. Additionally, you want to keep track of the maximum number of characters in any one line. Edison has support for transparently keeping track of the size of a sequence. http://www.eecs.tufts.edu/~rdocki01/docs/edison/Data-Edison-Seq- SizedSeq.html I used this in an initial version of an edit buffer when I just used a SizedSeq wrapping a BinaryRandList to store the text as a sequence of chars. But the lack of ability to also index by line number and keep track of max line length was the problem that led me to use a finger tree instead. Of course I could have used a sequence of chars, a sequence of line lengths, and a bag of line lengths to keep track of everything, and kept them in sync, but after reading the FingerTree paper I was seduced by the idea of being able to represent all this stuff at once in a single data structure. It may well be possible to create a slightly generalized wrapper that keeps track of arbitrary measures. (If they can be computed by a function which is associative, commutative and has a unit). Humm, sort of an incremental fold I like it. I got this from the FingerTree paper. A finger tree supports any measurement that is a Monoid (so it needs to be associative but not commutative (if it had to be commutative it would be impossible to use a sequence as a set or map, which I needed for my Trie structure)). Well, I guess I'd suggest you attempt to identify specific problems with already existing packages and attempt to work with those who maintain such packages before reinventing something as basic (and difficult to get right!) as data structure abstractions. The problem is that some people will be using Data.Edison.Seq at the moment and will naturally not want it to change. However I'd suggest that all the common operations be factored out into separate classes eg: class Foldable f where fold :: (a - b - b) - b - f a - b foldL :: ... class Reduce f where -- based on FingerTree paper reduceR :: (a - b - b) - (f a - b - b) reduceL :: (b - a - b) - (b - f a - b) class TakeDrop f where take :: Int - f a - f a takeWhile :: (a - Bool) - f a - f a drop ... class Filter f where filter :: (a - Bool) - f a - f a partition :: (a - Bool) - f a - (f a, f a) class Indexable f where length :: f a - Int at :: Int - f a - f a -- (*) splitAt :: Int - f a - (f a, f a) (*) Data.ByteString.index puts the Int arg second. It's not at all clear to me which is best, because I often wish that the Int arg of take and drop was second also so I could write take g $! x+1 instead of (take $! x + 1) g though it's consistent with the arg order for takeWhile etc. I know you don't agree with the no-exception-camel-case idea, but I still would argue that this is essential if you want to have a consistent naming convention. I find it extremely confusing that a word like reducer is supposed to be read as reduceR because the word reducer means to me something which reduces. It seems to me that a restructuring of the usual fold, reduce ops into classes is a great opportunity to perfect the naming of these functions to make life easier for generations to come... :-) Such maintainers may be willing to accept patches and/or implement requested features in order to reduce fragmentation in this space *hint, hint* :-) Point taken, although in the case of the above refactoring idea, I think it really is a Haskell-wide task because although there appears to be a defacto standard use of names like take, drop, splitAt etc, it's not nearly so clear which ops belong together and which should be separated out, and I personally don't have enough experience of Haskell yet to be able to recommend a solution. soapbox type=Edison plug I personally think that Edison is a great piece of work, and I took up maintainership because I felt it was a great shame that no one was using it. My ultimate goal is to make Edison the package that everyone thinks of first when they discover they need a Haskell datastructure for some purpose. Even if Edison does not fill that need, I want every Haskeller to compare his needs against what Edison provides before striking out on his own, and I want that to be a decision made with some hesitation. Over time I hope to make the cases where Edison doesn't cut the mustard fewer and
Re: [Haskell-cafe] The difficulty of designing a sequence class
G'day all. Quoting Robert Dockins [EMAIL PROTECTED]: Edison's design hails from a time when MPTCs were not only non-standard (as they still are), but also not widely used, and before fundeps were avaliable (I think). Yes. Chris Okasaki's original version of Edison was standard H98. I've considered reformulating the Sequence class to be more similar to the Collection classes (which use MPTCs, fundeps and mention the element type), The redesign of the Collection hierarchy was from my tree. The main reason why I changed it was that ternary tries couldn't really be typed properly. (Chris' implementation of Patricia trees used a phantom key type along with a stern warning to only define the Int instance. That didn't work for ternary tries, since the key type is polymorphic.) I didn't get around to fixing Sequence because there wasn't a need for it yet, but yes, it should be done. Cheers, Andrew Bromage ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] The difficulty of designing a sequence class
G'day all. Quoting Brian Hulley [EMAIL PROTECTED]: The problem is that some people will be using Data.Edison.Seq at the moment and will naturally not want it to change. However I'd suggest that all the common operations be factored out into separate classes eg: While I think the huge typeclass is unfortunate, one of Edison's greatest strengths is that every sequence supports every sequence operation. (The catch, of course, is that the operation may be inefficient.) This was a deliberate design decision, and I'd be sorry to see it go. Many is the time in C++ when I started, say, with a std::stack, then discovered soon after that I needed to peer at the top few elements on the stack, only to find that std::stack doesn't support that. Supporting all operations supports exploratory/agile programming. You don't have to decide up front what operations you need to be fast. You can discover this as you go. Yes, this is orthogonal to breaking up the huge typeclass, but I thought I'd just mention it. Cheers, Andrew Bromage ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] The difficulty of designing a sequence class
[EMAIL PROTECTED] writes: G'day all. Quoting Robert Dockins [EMAIL PROTECTED]: I've considered reformulating the Sequence class to be more similar to the Collection classes (which use MPTCs, fundeps and mention the element type), The redesign of the Collection hierarchy was from my tree. The main reason why I changed it was that ternary tries couldn't really be typed properly. (Chris' implementation of Patricia trees used a phantom key type along with a stern warning to only define the Int instance. That didn't work for ternary tries, since the key type is polymorphic.) I didn't get around to fixing Sequence because there wasn't a need for it yet, but yes, it should be done. That's a tough call to make. Changing the kind of Sequence to * from * - * means losing the Functor, Monad, and MonadPlus superclasses and all the various maps and zips. I guess you could separate those into an auxiliary class, class (Functor s, MonadPlus s) = SeqFunctor s where mapWithIndex :: (Int - a - b) - s a - s b zip :: s a - s b - s (a,b) ... and require that any instance of SeqFunctor also be an instance of Sequence. A pity we can't do something like, class (Functor s, MonadPlus s, forall a. Sequence (s a) a) = SeqFunctor s where ... -- David Menendez [EMAIL PROTECTED] | In this house, we obey the laws http://www.eyrie.org/~zednenem |of thermodynamics! ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] The difficulty of designing a sequence class
Brian Hulley writes: 1) Did Edison choose MonadPlus just because this fitted in with the lack of multi-parameter typeclasses in H98? Instances of Monoid (and your ISeq) have kind *. Instances of MonadPlus (and Edison's Sequence) have kind * - *. Functions like map, zip, and their variants are best defined in terms of type constructors. With Sequence, you have zipWith :: (Sequence s) = (a - b - c) - s a - s b - s c With ISeq, you'd have to do something like zipWith :: (ISeq s1 a, ISeq s2 b, ISeq s3 c) = (a - b - c) - s1 - s2 - s3 which isn't able to make any assumptions about s1, s2, and s3 having the same structure. 3) Is it worth bothering to derive ISeq from Monoid (with the possible extra inefficiency of the indirection through the definitions for append = mappend etc or does the compiler completely optimize this out)? I would expect the compiler to inline append. 4) Would it be worth reconsidering the rules for top level names so that class methods could always be local to their class (ditto for value constructors and field names being local to their type constructor). Qualified module imports are the way to go, here. Do you really want to start writing if x Eq/== y Num/+ 1 then ... ? -- David Menendez [EMAIL PROTECTED] | In this house, we obey the laws http://www.eyrie.org/~zednenem |of thermodynamics! ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] The difficulty of designing a sequence class
G'day all. Quoting David Menendez [EMAIL PROTECTED]: That's a tough call to make. Changing the kind of Sequence to * from * - * means losing the Functor, Monad, and MonadPlus superclasses and all the various maps and zips. And on the other hand, containers that need extra constraints (e.g. sets, which need their members to be Eq at the very least) can't be Functors or Monads anyway. Perhaps Functor/Monad/etc are the culprits here. Cheers, Andrew Bromage ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Fw: [Haskell-cafe] The difficulty of designing a sequence class
David Menendez wrote: Brian Hulley writes: 1) Did Edison choose MonadPlus just because this fitted in with the lack of multi-parameter typeclasses in H98? Instances of Monoid (and your ISeq) have kind *. Instances of MonadPlus (and Edison's Sequence) have kind * - *. Functions like map, zip, and their variants are best defined in terms of type constructors. With Sequence, you have zipWith :: (Sequence s) = (a - b - c) - s a - s b - s c With ISeq, you'd have to do something like zipWith :: (ISeq s1 a, ISeq s2 b, ISeq s3 c) = (a - b - c) - s1 - s2 - s3 which isn't able to make any assumptions about s1, s2, and s3 having the same structure. On the other hand it's more powerful because they can now have different structures ie was there any reason not to have: zipWith :: (Sequence s1, Sequence s2, Sequence s3) = (a - b - c) - s1 a - s2 b - s3 c 3) Is it worth bothering to derive ISeq from Monoid (with the possible extra inefficiency of the indirection through the definitions for append = mappend etc or does the compiler completely optimize this out)? I would expect the compiler to inline append. Thanks - that's good news. I' probably still too much in C++ mode. 4) Would it be worth reconsidering the rules for top level names so that class methods could always be local to their class (ditto for value constructors and field names being local to their type constructor). Qualified module imports are the way to go, here. Do you really want to start writing if x Eq/== y Num/+ 1 then ... ? I'm beginning to see that qualified module imports are indeed the only way to go, because the methods in a type class are only the virtual methods - often there are many other methods which are put outside the class to save space in the dictionary but which conceptually belong to the class thus putting the class + these extra functions in a single module wraps everything up into a conceptual unit. eg: module Foldable ( Foldable(..) , reduceR ) where class Foldable c a | c - a where foldR :: (a - b - b) - b - [a] - b -- ... reduceR :: Foldable c a = (a - b - b) - (c - b - b) reduceR f xs y = foldR f y xs forms the single conceptual unit to use Foldable.foldR, Foldable.reduceR etc so I'll have to retract my suggestion as regards classes... (Although I'm still concerned about value constructors and field names being in the top level instead of local to their type but changing this would require some changes to type inference (so that the constructors and field names could be used unqualified when the type at the given position is known eg by a top level type signature for the function or value) so that's more of a long term idea.) Regards, Brian. -- Logic empowers us and Love gives us purpose. Yet still phantoms restless for eras long past, congealed in the present in unthought forms, strive mightily unseen to destroy us. http://www.metamilk.com ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] The difficulty of designing a sequence class
David Menendez wrote: [EMAIL PROTECTED] writes: I didn't get around to fixing Sequence because there wasn't a need for it yet, but yes, it should be done. That's a tough call to make. Changing the kind of Sequence to * from * - * means losing the Functor, Monad, and MonadPlus superclasses and all the various maps and zips. But there's no option if you want to be able to support non-polymorphic sequences like Data.ByteString etc. I think the Functor class is just fundamentally too limited - it assumes the whole world is polymorphic and it isn't. Also, MPTC's mean we would gain Monoid. Regards, Brian. -- Logic empowers us and Love gives us purpose. Yet still phantoms restless for eras long past, congealed in the present in unthought forms, strive mightily unseen to destroy us. http://www.metamilk.com ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] The difficulty of designing a sequence class
[EMAIL PROTECTED] wrote: G'day all. Quoting Brian Hulley [EMAIL PROTECTED]: The problem is that some people will be using Data.Edison.Seq at the moment and will naturally not want it to change. However I'd suggest that all the common operations be factored out into separate classes eg: While I think the huge typeclass is unfortunate, one of Edison's greatest strengths is that every sequence supports every sequence operation. (The catch, of course, is that the operation may be inefficient.) This was a deliberate design decision, and I'd be sorry to see it go. Many is the time in C++ when I started, say, with a std::stack, then discovered soon after that I needed to peer at the top few elements on the stack, only to find that std::stack doesn't support that. As an aside, if I was needing any kind of sequence in C++ I'd use a std::vector because it supplies all the ops you need (and is usually fast enough for exploratory programming). I've never seen any point in stack or deque etc because they're far too limited. Supporting all operations supports exploratory/agile programming. You don't have to decide up front what operations you need to be fast. You can discover this as you go. Yes, this is orthogonal to breaking up the huge typeclass, but I thought I'd just mention it. As you've pointed out, there are 2 separate issues that are in danger of being confused: 1) Forcing all sequence instances to support all operations 2) Bundling all the ops into a single huge class I'd suggest that while 1) may be useful for the classes that are there at present, there are many ops that they don't yet support and also some ops that are never needed. Also, surely as long as there is one concrete type that supports everything that should be good enough for exploratory programming (I'm thinking of FingerTrees which seem to be able to do absolutely anything in logarithmic time!!! :-) ) For 2), you could still have a Sequence class to gather all the separate functionality together but I'd make it inherit from all the separate pieces of functionality rather than being the place where all the functionality is defined eg: class Viewable c a | c - a where viewL :: Monad m = c - m (a, c) viewR :: Monad m = c - m (c, a) atL :: c - a -- must be called on non-empty sequence atR :: c - a class Indexable c a | c - a where length :: c - Int at :: Int - c - a -- index must be in range splitAt :: Int - c - (c, c) -- in same module as Indexable take :: Indexable c a = Int - c - c take i c = let (l, _) = splitAt i c in l class (Monoid c, Foldable c a, Indexable c a, Filterable c a, Viewable c a) = Sequence c a This way, we'd get the advantages of being able to write (Sequence c a) as well as the advantages of being able to supply a sequence to a function that needed a Foldable - at the moment the fold methods of sequence are invisible to the rest of Haskell because they're trapped inside the Sequence class. Regards, Brian. -- Logic empowers us and Love gives us purpose. Yet still phantoms restless for eras long past, congealed in the present in unthought forms, strive mightily unseen to destroy us. http://www.metamilk.com ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] The difficulty of designing a sequence class
Brian Hulley wrote: David Menendez wrote: Brian Hulley writes: 4) Would it be worth reconsidering the rules for top level names so that class methods could always be local to their class (ditto for value constructors and field names being local to their type constructor). Qualified module imports are the way to go, here. Do you really want to start writing if x Eq/== y Num/+ 1 then ... ? I'm beginning to see that qualified module imports are indeed the only way to go, One reason I forgot: Suppose person A writes ClassA which uses foo as a method name, and somewhere else, person B writes ClassB which also uses foo as a method name, and both classes become widely used for several years. Now the problem is that person C may come along and notice that there is a useful abstraction to be made by inheriting both from ClassA and ClassB. But both of these define foo and there is no mechanism in the language to resolve this. The language C#, which was designed from the outset for programming in the large, already had a solution in the very first release of C#, namely that the interface name could be used to qualify a method name in cases of ambiguity, so transposing this to Haskell, you'd have something like: class ClassA a where foo :: a - Int class classB a where foo :: a - String class (ClassA a, ClassB a) = ClassC a where bar :: a - (Int, String) bar x = (ClassA#foo x, ClassB#foo x) As I see it, Haskell, great and innovative as it is, is still stuck in programming in the small and some of the mechanisms needed for programming in the large are not yet available - it is as impossible to ensure that there will never be conflicts between names of class methods as it is to ensure that there will never be conflicts between module names in packages written by different groups of people, and languages like Java and C# solved these problems right at the beginning but Haskell for some reason has ignored the issues, only recently just starting to address the package module name conflict problem for example even though the language has been around for more than a decade. I'm also wondering if it would be a good idea to be able to declare some class methods as final, so they don't clutter up the dictionary at runtime, and so that we could end the dubious practice of declaring some functions which are conceptually part of a class as top level functions outside the class just to save space/time in the dictionary and therefore needing the physical module to create the conceptual grouping instead of using the language level grouping provided by the class name. Anyway these are probably more long term ideas but I mentioned them now just to hopefully start the ball rolling (the above should not be taken as a criticism of Haskell, I'm just saying that at some point we need all the normal mechanisms that everyone else (Java, C#) takes for granted because there's no point waiting till we encounter the same well-known software engineering problems that already have well established good solutions). Regards, Brian. -- Logic empowers us and Love gives us purpose. Yet still phantoms restless for eras long past, congealed in the present in unthought forms, strive mightily unseen to destroy us. http://www.metamilk.com ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] The difficulty of designing a sequence class
On 7/31/06, [EMAIL PROTECTED] [EMAIL PROTECTED] wrote: G'day all. Quoting David Menendez [EMAIL PROTECTED]: That's a tough call to make. Changing the kind of Sequence to * from * - * means losing the Functor, Monad, and MonadPlus superclasses and all the various maps and zips. Perhaps Functor/Monad/etc are the culprits here. Indeed. See Oleg's message from a few months back where he shows that we can get John Hughes Restricted Data Types (Set is a Monad) by adding parameters to type classes: http://www.haskell.org//pipermail/haskell-prime/2006-February/000498.html http://hackage.haskell.org/trac/haskell-prime/ticket/98 Jim ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] The difficulty of designing a sequence class
On Tue, Aug 01, 2006 at 02:56:21AM +0100, Brian Hulley wrote: Now the problem is that person C may come along and notice that there is a useful abstraction to be made by inheriting both from ClassA and ClassB. But both of these define foo and there is no mechanism in the language to resolve this. This is not true at all. every name in haskell can be uniquely specified. module ClassA where class ClassA a where foo :: a - Int module ClassB where class classB a where foo :: a - String import ClassA import ClassB class (ClassA a, ClassB a) = ClassC a where bar :: a - (Int, String) bar x = (ClassA.foo x, ClassB.foo x) I'm also wondering if it would be a good idea to be able to declare some class methods as final, so they don't clutter up the dictionary at runtime, and so that we could end the dubious practice of declaring some functions which are conceptually part of a class as top level functions outside the class just to save space/time in the dictionary and therefore needing the physical module to create the conceptual grouping instead of using the language level grouping provided by the class name. I think a fundamental thing you are missing is that Haskell classes are not like C# or Java or other OO classes. Not because of implementation, but rather they are actually fundamentally different things. The reasons people don't place certain functions in classes has nothing to do with the size of class dicionaries. Heck, jhc doesn't even use dictionaries at all, there is no cost for adding methods to a class. People place them in top level functions because it makes more sense. of course, sometimes it is gotten wrong, and something would have been better off as a class method, but in general there are different concerns when dealing with haskell classes than OO classes. An OO class could be considered equivalent to a triplet of a Haskell data type, a Haskell existential with a class constraint, and a class with the resriction the class type can _only_ appear as the first argument to each method. In haskell all of these things are separate independent tools and are much more general and powerful than the limited and conjoined form that OO programming provides. Anyway these are probably more long term ideas but I mentioned them now just to hopefully start the ball rolling (the above should not be taken as a criticism of Haskell, I'm just saying that at some point we need all the normal mechanisms that everyone else (Java, C#) takes for granted because there's no point waiting till we encounter the same well-known software engineering problems that already have well established good solutions). It is best to think of haskell primitives as something completely new, they reuse some naming conventions from OO programming, but that doesn't mean they suffer from the same limitations. It took me a few trys to wrap my brain around it. I liken learning haskell to tipping over a vending machine. you can't just push it, you gotta rock it back and forth a few times building up momentum until bam! suddenly the flash of insight hits and it all makes sense. John -- John Meacham - ⑆repetae.net⑆john⑈ ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
[Haskell-cafe] The difficulty of designing a sequence class
Hi - Part 1 of 2 - Monoid versus MonadPlus === I've just run into a troublesome question when trying to design a sequence class: class ISeq c a | c - a where empty :: c single :: a - c append :: c - c - c However I've noticed that people sometimes separate the empty and append operations out as sequences with these ops form a Monoid therefore: class Monoid c = ISeq c a | c - a where single :: a - c -- now outside the class append :: ISeq c a = c - c - c append = mappend empty :: ISeq c a = c empty = mempty Another option, is the Edison library which uses: class (Functor s, MonadPlus s) = Sequence s where so here MonadPlus is used instead of Monoid to provide empty and append. So I've got three main questions: 1) Did Edison choose MonadPlus just because this fitted in with the lack of multi-parameter typeclasses in H98? 2) Are there any reasons to prefer the Edison design over the MPTC design (apart from H98 compatibility) or vice versa? 3) Is it worth bothering to derive ISeq from Monoid (with the possible extra inefficiency of the indirection through the definitions for append = mappend etc or does the compiler completely optimize this out)? and a fourth more long term question: 4) Would it be worth reconsidering the rules for top level names so that class methods could always be local to their class (ditto for value constructors and field names being local to their type constructor). For example it would be nice imho to be able to write: class Monoid c = ISeq c a | c - a where length :: c - Int f x y = Monoid/append x y -- or ISeq/append x y g x = ISeq/length x instead of having all names collide in the top level of a module, since at the moment it is difficult to think of method names that don't collide with the Prelude, and it's not nice to have to write mempty in place of empty. Part 2 of 2 - Monad versus Ancillary result type Another issue relates to left and right views of a sequence. If a sequence is non-empty, the left view is just the leftmost element and the rest of the sequence. The problem arises when the sequence is empty. In the Edison library, this is solved by returning a monadic value ie: lview :: Monad m = s a - m (a, s a) whereas from the paper Finger trees: a simple general purpose data structure by Ralf Hinze and Ross Paterson they use an ancillary data type to store the result of a view: data ViewL s a = NilL | ConsL a (s a) viewL :: FingerTree a - ViewL FingerTree a So my question here is: what's the best choice? I can see that the monadic version has the advantage that it could be used in do notation in cases where you expect the sequence to be non-empty, but has the disadvantage that it treats the empty sequence as something special (resulting in Monad/fail) and an extra indirection to find the components when the sequence is non-empty. Anyway these are my main questions for now - any feedback appreciated ;-) Thanks, Brian. -- Logic empowers us and Love gives us purpose. Yet still phantoms restless for eras long past, congealed in the present in unthought forms, strive mightily unseen to destroy us. http://www.metamilk.com ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] The difficulty of designing a sequence class
On Sunday 30 July 2006 07:47, Brian Hulley wrote: Hi - Part 1 of 2 - Monoid versus MonadPlus === I've just run into a troublesome question when trying to design a sequence class: class ISeq c a | c - a where empty :: c single :: a - c append :: c - c - c However I've noticed that people sometimes separate the empty and append operations out as sequences with these ops form a Monoid therefore: class Monoid c = ISeq c a | c - a where single :: a - c -- now outside the class append :: ISeq c a = c - c - c append = mappend empty :: ISeq c a = c empty = mempty Another option, is the Edison library which uses: class (Functor s, MonadPlus s) = Sequence s where so here MonadPlus is used instead of Monoid to provide empty and append. So I've got three main questions: 1) Did Edison choose MonadPlus just because this fitted in with the lack of multi-parameter typeclasses in H98? Edison's design hails from a time when MPTCs were not only non-standard (as they still are), but also not widely used, and before fundeps were avaliable (I think). So the answer to this one is pretty much yes. I've considered reformulating the Sequence class to be more similar to the Collection classes (which use MPTCs, fundeps and mention the element type), but for the 1.2 version I wanted to make as few changes as I thought I could to the overall design decisions. In fact, I will likely make this change at some point. It would allow, eg, making Don's ByteString (or whatever it's called now, I forget) an instance of Sequence, which is currently impossible. OTOH, it would require sacrificing the Functor, Monad and MonadPlus instances... 2) Are there any reasons to prefer the Edison design over the MPTC design (apart from H98 compatibility) or vice versa? H98 is probably the big one. I'm currently in wait-and-see mode concerning MPTCs and especially fundeps. As Edison maintainer, I've tried to use them sparingly in the hope that Edison can be made Haskell' compliant (crosses fingers). Aside: I hope the Haskell' committee makes some sort of decision here soonish. 3) Is it worth bothering to derive ISeq from Monoid (with the possible extra inefficiency of the indirection through the definitions for append = mappend etc or does the compiler completely optimize this out)? Not sure about this one. I suspect, however, that the appropriate SPECIALIZE pragmas could cover any cases that you really care about. and a fourth more long term question: 4) Would it be worth reconsidering the rules for top level names so that class methods could always be local to their class (ditto for value constructors and field names being local to their type constructor). [snip more question] No comment. Part 2 of 2 - Monad versus Ancillary result type Another issue relates to left and right views of a sequence. If a sequence is non-empty, the left view is just the leftmost element and the rest of the sequence. The problem arises when the sequence is empty. In the Edison library, this is solved by returning a monadic value ie: lview :: Monad m = s a - m (a, s a) whereas from the paper Finger trees: a simple general purpose data structure by Ralf Hinze and Ross Paterson they use an ancillary data type to store the result of a view: data ViewL s a = NilL | ConsL a (s a) viewL :: FingerTree a - ViewL FingerTree a So my question here is: what's the best choice? I can see that the monadic version has the advantage that it could be used in do notation in cases where you expect the sequence to be non-empty, but has the disadvantage that it treats the empty sequence as something special (resulting in Monad/fail) and an extra indirection to find the components when the sequence is non-empty. Well, the empty sequence IS special, when it comes to looking the leftmost (resp. righmost) element. You have to deal somehow with the fact that the operation is a partial function. I think the arbitrary monad option is better. It gives the user more flexibility about how to use the operation. Really the only way to use ViewL is to put it inside a case of. With a monad you can use any of the plethora of monadic operations and, as you mentioned, the do notation. In addition, if you want the use case of ViewL, you can always use the Maybe monad. I'm not inclined to worry too much about the extra indirection, which seems like a viable target for being erased by the compiler (I've not tested if this happens, however). Anyway these are my main questions for now - any feedback appreciated ;-) BTW, for what purpose are you desiging a new sequence class? You are clearly aware of other efforts in this area; in what ways to they not meet your needs? Thanks, Brian. -- Rob Dockins Talk softly and drive a Sherman tank. Laugh hard,
Re: [Haskell-cafe] The difficulty of designing a sequence class
Robert Dockins wrote: On Sunday 30 July 2006 07:47, Brian Hulley wrote: Another option, is the Edison library which uses: class (Functor s, MonadPlus s) = Sequence s where so here MonadPlus is used instead of Monoid to provide empty and append. So I've got three main questions: 1) Did Edison choose MonadPlus just because this fitted in with the lack of multi-parameter typeclasses in H98? Edison's design hails from a time when MPTCs were not only non-standard (as they still are), but also not widely used, and before fundeps were avaliable (I think). So the answer to this one is pretty much yes. [snip] Hi - Thanks for the answers to this and my other questions. One thing I just realised is that there doesn't seem to be any instance declarations anywhere in the standard libs relating Monoid to MonadPlus so it's a bit unsettling to have to make a random choice on the question of what kind of object a Sequence is... I tried: class (forall a. Monoid s a) = Sequence s where ... but of course that doesn't work, so I suppose MonadPlus is the only option when 'a' doesn't appear as a type variable arg of the class being defined. BTW, for what purpose are you desiging a new sequence class? You are clearly aware of other efforts in this area; in what ways to they not meet your needs? The existing sequence and collection classes I've looked at don't do enough. For example, when I tried to represent the text in an edit widget, I realised I needed a sequence of characters that could also be considered to be a sequence of lines, and it is necessary to be able to index the sequence by character position as well as by line position, as well as keeping track of the total number of characters, the total number of lines, and the maximum number of characters on any one line (so as to be able to calculate the x,y extents when laying out the widget, assuming a fixed width font (tabs ignored!)), with very efficient split and append operations. I managed to get a good representation by using a FingerTree of lines where each line uses a ByteString. I made my own FingerTree class based on the one referenced in the paper at http://www.soi.city.ac.uk/~ross/papers/FingerTree.html but without the symbolic names which I find totally unreadable and confusing, and also so I could get full control of the strictness of the implementation, and also as a way of understanding them since I'd never come across such a complicated data structure before. (I highly recommend this paper to anyone who wants to learn about FingerTrees, Monoids and other very useful concepts.) So one thing existing sequence classes don't have (apart from FingerTree) is the concept of measurement which is essential when you want efficient updates. Eg in my text buffer, the measurement maintained for a sequence is the number of chars and number of lines and maximum line length. Then I needed a structure for a Trie widget a bit like (details omitted): data Node = Expanded Value T | Collapsed Value T | Leaf Value newtype T = T (FingerTree (Key, Node)) where objects of type T could be regarded as a finite map (eg from hierarchical module names to modules) as well as a flattened linear sequence indexed by line number (for display on the screen in a widget given the current scroll bar position), and which also needed to keep track of the total horizontal and vertical extent of the Trie as it would appear in the widget's font. There are several different kinds of measurement going on in this data structure, as well as the complexity of the extra recursion through the leaf to a new level. Existing sequence abstractions don't seem to provide the operations needed to treat a nested data structure as a single sequence. In summary: 1) Often a complex data structure must be able to be simultaneously regarded as a single flattened sequence 2) Measurements are needed for efficient updates (may need to keep track of several at once) 3) Indexing and size are sometimes needed relative to the flattened sequence not just the top level 4) It is useful to have a finite map that can also be regarded as a linear sequence 5) Such finite maps may also be nested (when the keys are hierarchical) but this nesting should be hidden from the user... 6) I want a design that can allow complex data structures to be built up easily and instanced to the appropriate interfaces 7) Also naming conventions in the existing libs are a bit irregular and burdened with old fashioned lisp-isms eg in Data.Edison.Seq there are functions lview and reducel but I'd argue that there must be one and only one way of forming any identifier in any program namely that the function should appear first followed by qualifiers (so that related functionality always appears together in a lexicographical listing of functions) and it must use camel case with no exceptions at all, thus viewL and reduceL (and foldL). 8) More factoring needs to