Re: [Haskell-cafe] The difficulty of designing a sequence class

Brian Hulley Mon, 31 Jul 2006 07:50:38 -0700

Robert Dockins wrote:

On Jul 30, 2006, at 5:28 PM, Brian Hulley wrote:

Robert Dockins wrote:

So, what you want is a sequence of sequences that can be
transparently converted to a "flattened" sequence and vice versa?

Yes as long as the conversion between them takes no time at all - thesequence of sequences and flattened sequence must coexist simultaneously.The concrete data structure is a sequence of sequences and the flattenedsequence is a particular view of it.

And you also want to keep track of the total number of lines and
characters within each line.  Additionally, you want to keep track of
the maximum number of characters in any one line.

Edison has support for transparently keeping track of the size of a
sequence.

http://www.eecs.tufts.edu/~rdocki01/docs/edison/Data-Edison-Seq-
SizedSeq.html

I used this in an initial version of an edit buffer when I just used aSizedSeq wrapping a BinaryRandList to store the text as a sequence of chars.But the lack of ability to also index by line number and keep track of maxline length was the problem that led me to use a finger tree instead.

Of course I could have used a sequence of chars, a sequence of line lengths,and a bag of line lengths to keep track of everything, and kept them insync, but after reading the FingerTree paper I was seduced by the idea ofbeing able to represent all this stuff at once in a single data structure.


It may well be possible to create a slightly generalized wrapper that
keeps track of arbitrary "measures".  (If they can be computed by a
function which is associative, commutative and has a unit).
Humm, sort of an incremental fold.... I like it.

I got this from the FingerTree paper. A finger tree supports any measurementthat is a Monoid (so it needs to be associative but not commutative (if ithad to be commutative it would be impossible to use a sequence as a set ormap, which I needed for my Trie structure)).

Well, I guess I'd suggest you attempt to identify specific problems
with already existing packages and attempt to work with those who
maintain such packages before reinventing something as basic (and
difficult to get right!) as data structure abstractions.

The problem is that some people will be using Data.Edison.Seq at the momentand will naturally not want it to change. However I'd suggest that all thecommon operations be factored out into separate classes eg:


   class Foldable f  where
        fold :: (a -> b -> b) -> b -> f a -> b
        foldL :: ...

   class Reduce f where -- based on FingerTree paper
       reduceR :: (a -> b -> b) -> (f a -> b -> b)
       reduceL :: (b -> a -> b) -> (b -> f a -> b)

   class TakeDrop f where
       take :: Int -> f a -> f a
       takeWhile :: (a -> Bool) -> f a -> f a
       drop ...

   class Filter f where
       filter :: (a -> Bool) -> f a -> f a
       partition :: (a -> Bool) -> f a -> (f a, f a)

   class Indexable f where
      length :: f a -> Int
      at :: Int -> f a -> f a -- (*)
      splitAt :: Int -> f a -> (f a, f a)

(*) Data.ByteString.index puts the Int arg second. It's not at all clear tome which is best, because I often wish that the Int arg of take and drop wassecond also so I could write take g $! x+1 instead of (take $! x + 1) gthough it's consistent with the arg order for takeWhile etc.

I know you don't agree with the no-exception-camel-case idea, but I stillwould argue that this is essential if you want to have a consistent namingconvention. I find it extremely confusing that a word like "reducer" issupposed to be read as "reduceR" because the word "reducer" means to me"something which reduces". It seems to me that a restructuring of the usualfold, reduce ops into classes is a great opportunity to perfect the namingof these functions to make life easier for generations to come... :-)


Such maintainers may be willing to accept patches and/or implement
requested features in order to reduce fragmentation in this space
*hint, hint*  :-)

Point taken, although in the case of the above refactoring idea, I think itreally is a Haskell-wide task because although there appears to be a defactostandard use of names like take, drop, splitAt etc, it's not nearly so clearwhich ops belong together and which should be separated out, and Ipersonally don't have enough experience of Haskell yet to be able torecommend a solution.


<soapbox type="Edison plug">
I personally think that Edison is a great piece of work, and I took
up maintainership because I felt it was a great shame that no one was
using it.  My ultimate goal is to make Edison the package that
everyone thinks of first when they discover they need a Haskell
datastructure for some purpose.  Even if Edison does not fill that
need, I want every Haskeller to compare his needs against what Edison
provides before striking out on his own, and I want that to be a
decision made with some hesitation.  Over time I hope to make the
cases where Edison doesn't cut the mustard fewer and further between.

So, if you've ever looked at Edison, or ever do so in the future, and
decide it isn't what you need, please let me know why so I can make
it better for the next time.  After all, squeaky wheels get the
grease, but only if I can hear the squeaking!
</soapbox>


Best regards,
Brian.

--
Logic empowers us and Love gives us purpose.
Yet still phantoms restless for eras long past,
congealed in the present in unthought forms,
strive mightily unseen to destroy us.

http://www.metamilk.com

_______________________________________________
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe

Re: [Haskell-cafe] The difficulty of designing a sequence class

Reply via email to