Mon, 05 Jun 2000 16:34:10 -0500, Matt Harden <[EMAIL PROTECTED]> writes:

> I think I like the idea, but if we do this, do we lose the ability to
> pattern-match strings?

No. As with overloaded numeric literals, string patterns would be
matched using overloaded (==). But patterns like 'f':'o':'o':_ would
match only [Char] (unless characters are overloaded too, but IMHO
that's unnecessary).
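
To illustrate the analogy, here is a small sketch of how numeric literal
patterns already behave. Mod3, isZero and startsWithFoo are names made up
for this example. The literal pattern 0 is matched by calling (==), so it
works for any type with Eq and Num instances, while the cons pattern is
tied to the list constructor.

```haskell
-- Hypothetical type for illustration: integers compared modulo 3.
newtype Mod3 = Mod3 Integer deriving Show

-- A deliberately non-structural equality, to make the use of (==) visible.
instance Eq Mod3 where
  Mod3 a == Mod3 b = (a - b) `mod` 3 == 0

instance Num Mod3 where
  Mod3 a + Mod3 b = Mod3 (a + b)
  Mod3 a * Mod3 b = Mod3 (a * b)
  negate (Mod3 a) = Mod3 (negate a)
  abs             = id
  signum _        = Mod3 1
  fromInteger     = Mod3

-- The literal pattern 0 means: compare the argument with fromInteger 0
-- using the overloaded (==).
isZero :: (Eq a, Num a) => a -> Bool
isZero 0 = True
isZero _ = False

-- This pattern names the constructor (:), so it can only match [Char].
startsWithFoo :: String -> Bool
startsWithFoo ('f':'o':'o':_) = True
startsWithFoo _               = False
```

Here isZero (Mod3 3) succeeds although the stored number is 3, because
the match goes through the Eq instance. An overloaded string pattern
would presumably work the same way.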

> Or should we consider an automatic (or explicit) conversion to
> [Char] when needed?

There is no such thing as implicit conversion in Haskell. I don't know
if it could be added in a sensible way without losing type inference.
If so, it would be a more general change, and quite a big one, I think.

> If "class String" makes sense, why not "class List"?  There have
> been times when I wanted to block together list elements similarly
> to packedStrings.  Some of the same benefits can be gained from
> alternate representations of [Int], for instance, as from [Char].
> It doesn't seem right to make Strings a special case.

Use the Sequence class from Edison, or unboxed arrays from IArray.
Unfortunately these two are quite separate, and Edison doesn't use arrays.

Code that uses lists usually depends too much on their representation
for them to be overloaded. Unless we have views (something that looks
like a pattern but in fact calls arbitrary functions), x:xs cannot be
anything but a match on a constructor called (:). But views won't help:
using pattern matching with recursive functions is inefficient for
arrays. Many functions that construct or deconstruct lists would have
to be changed completely.

It's not that simple to have alternative strings. Consider:

main = do
    s <- getContents
    putStr . unlines . map processLine . lines $ s

It works when processLine :: String -> String. But suppose that
processLine :: PackedString -> PackedString. Now, if all the functions
above (getContents, lines etc.) are overloaded for various string
variants, the program loses laziness: the whole file is read in
before any processing starts. If
    lines :: (StringLike s1, StringLike s2, Sequence l) => s1 -> l s2
then it gets messy. Many simple functions must now deal with
conversions. It's hard to tell which should be methods, which should
have a common implementation in terms of other methods, how many
implicit conversions to have, and where to put functions that deal
with more than one type of string at a time.

There are a lot of arbitrary decisions to make, which is a bad thing.
The signature of dropWhile or concat should be "obvious"!

Either there are a lot of ambiguities for intermediate types, or
they are resolved using extended defaulting rules and then strings
are possibly unnecessarily converted to the default representation.

To work with packed strings, the whole program must be changed:
either, less efficiently, by just putting conversions around
processLine, or, better, by making it more imperative, using the
PackedString variant of getLine and catching end-of-file.
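
A minimal sketch of the first option, with the conversions wrapped
around processLine. Packed, pack and unpack are toy stand-ins invented
for this example (a real packed string would be an array, not a wrapped
list), and the body of processLine is arbitrary. The outer pipeline
still works on ordinary lazy Strings.

```haskell
import Data.Char (toUpper)

-- Toy stand-in for a packed string type (an assumption for illustration;
-- it just wraps a String).
newtype Packed = Packed String

pack :: String -> Packed
pack = Packed

unpack :: Packed -> String
unpack (Packed s) = s

-- Hypothetical per-line work: reverse the line and upper-case it.
processLine :: Packed -> Packed
processLine (Packed s) = Packed (map toUpper (reverse s))

-- Explicit conversions around processLine; lines, map and unlines stay
-- at type String, so the input is still consumed lazily, line by line.
processAll :: String -> String
processAll = unlines . map (unpack . processLine . pack) . lines
```

It could be run as main = interact processAll. Each line is converted
twice, which is the inefficiency mentioned above.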

Numbers are atomic. It's not too hard to have them overloaded, because
they are always treated as a whole and rarely converted. Operations
don't change the type of numbers. OTOH strings are often processed
as lists, mapped to [[Char]] and then concatenated etc. It gets very
complicated when everything is overloaded.

Unless somebody shows how to do it well, IMHO the best thing that
we can have now is a separate family of functions for PackedString
with explicit conversions. Packed strings are usually processed
differently anyway (e.g. iterated over by index rather than by
recursive pattern matching).

Orthogonally, there is a problem with Unicode in PackedStrings.

If Edison and IArray/MArray get more integrated with the rest of
the library, with a carefully chosen amount of overloading, and work
together, possibly replacing the List and Array modules in the
standard, we might try to put PackedString there as well, as a kind
of unboxed array of characters. But I don't see a complete unified
solution yet.

-- 
 __("<    Marcin Kowalczyk * [EMAIL PROTECTED] http://qrczak.ids.net.pl/
 \__/              GCS/M d- s+:-- a23 C+++$ UL++>++++$ P+++ L++>++++$ E-
  ^^                  W++ N+++ o? K? w(---) O? M- V? PS-- PE++ Y? PGP+ t
QRCZAK                  5? X- R tv-- b+>++ DI D- G+ e>++++ h! r--%>++ y-