Robert Dockins wrote:
On Jul 30, 2006, at 5:28 PM, Brian Hulley wrote:
Robert Dockins wrote:
So, what you want is a sequence of sequences that can be
transparently converted to a "flattened" sequence and vice versa?

Yes as long as the conversion between them takes no time at all - the sequence of sequences and flattened sequence must coexist simultaneously. The concrete data structure is a sequence of sequences and the flattened sequence is a particular view of it.

And you also want to keep track of the total number of lines and
characters within each line.  Additionally, you want to keep track of
the maximum number of characters in any one line.

Edison has support for transparently keeping track of the size of a
sequence.

http://www.eecs.tufts.edu/~rdocki01/docs/edison/Data-Edison-Seq-
SizedSeq.html

I used this in an initial version of an edit buffer when I just used a SizedSeq wrapping a BinaryRandList to store the text as a sequence of chars. But the lack of ability to also index by line number and keep track of max line length was the problem that led me to use a finger tree instead.

Of course I could have used a sequence of chars, a sequence of line lengths, and a bag of line lengths to keep track of everything, and kept them in sync, but after reading the FingerTree paper I was seduced by the idea of being able to represent all this stuff at once in a single data structure.


It may well be possible to create a slightly generalized wrapper that
keeps track of arbitrary "measures".  (If they can be computed by a
function which is associative, commutative and has a unit).
Humm, sort of an incremental fold.... I like it.

I got this from the FingerTree paper. A finger tree supports any measurement that is a Monoid (so it needs to be associative but not commutative (if it had to be commutative it would be impossible to use a sequence as a set or map, which I needed for my Trie structure)).

Well, I guess I'd suggest you attempt to identify specific problems
with already existing packages and attempt to work with those who
maintain such packages before reinventing something as basic (and
difficult to get right!) as data structure abstractions.

The problem is that some people will be using Data.Edison.Seq at the moment and will naturally not want it to change. However I'd suggest that all the common operations be factored out into separate classes eg:

   class Foldable f  where
        fold :: (a -> b -> b) -> b -> f a -> b
        foldL :: ...

   class Reduce f where -- based on FingerTree paper
       reduceR :: (a -> b -> b) -> (f a -> b -> b)
       reduceL :: (b -> a -> b) -> (b -> f a -> b)

   class TakeDrop f where
       take :: Int -> f a -> f a
       takeWhile :: (a -> Bool) -> f a -> f a
       drop ...

   class Filter f where
       filter :: (a -> Bool) -> f a -> f a
       partition :: (a -> Bool) -> f a -> (f a, f a)

   class Indexable f where
      length :: f a -> Int
      at :: Int -> f a -> f a -- (*)
      splitAt :: Int -> f a -> (f a, f a)

(*) Data.ByteString.index puts the Int arg second. It's not at all clear to me which is best, because I often wish that the Int arg of take and drop was second also so I could write take g $! x+1 instead of (take $! x + 1) g though it's consistent with the arg order for takeWhile etc.

I know you don't agree with the no-exception-camel-case idea, but I still would argue that this is essential if you want to have a consistent naming convention. I find it extremely confusing that a word like "reducer" is supposed to be read as "reduceR" because the word "reducer" means to me "something which reduces". It seems to me that a restructuring of the usual fold, reduce ops into classes is a great opportunity to perfect the naming of these functions to make life easier for generations to come... :-)


Such maintainers may be willing to accept patches and/or implement
requested features in order to reduce fragmentation in this space
*hint, hint*  :-)


Point taken, although in the case of the above refactoring idea, I think it really is a Haskell-wide task because although there appears to be a defacto standard use of names like take, drop, splitAt etc, it's not nearly so clear which ops belong together and which should be separated out, and I personally don't have enough experience of Haskell yet to be able to recommend a solution.


<soapbox type="Edison plug">
I personally think that Edison is a great piece of work, and I took
up maintainership because I felt it was a great shame that no one was
using it.  My ultimate goal is to make Edison the package that
everyone thinks of first when they discover they need a Haskell
datastructure for some purpose.  Even if Edison does not fill that
need, I want every Haskeller to compare his needs against what Edison
provides before striking out on his own, and I want that to be a
decision made with some hesitation.  Over time I hope to make the
cases where Edison doesn't cut the mustard fewer and further between.

So, if you've ever looked at Edison, or ever do so in the future, and
decide it isn't what you need, please let me know why so I can make
it better for the next time.  After all, squeaky wheels get the
grease, but only if I can hear the squeaking!
</soapbox>

Best regards,
Brian.

--
Logic empowers us and Love gives us purpose.
Yet still phantoms restless for eras long past,
congealed in the present in unthought forms,
strive mightily unseen to destroy us.

http://www.metamilk.com
_______________________________________________
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe

Reply via email to