Re: [Haskell-cafe] Re: Can we do better than duplicate APIs? [was: Data.CompactString 0.3]

2007-03-28 Thread Robert Dockins


On Mar 28, 2007, at 2:44 PM, Benjamin Franksen wrote:


Robert Dockins wrote:
After taking a look at the Haddock docs, I was impressed by the amount of
repetition in the APIs. Not only does Data.CompactString duplicate the whole
Data.ByteString interface (~100 functions, adding some more for encoding
and decoding), the whole interface is again repeated another four times,
once for each supported encoding.


I'd like to mention that as maintainer of Edison, I face similar
difficulties. The data structure interfaces have scores of functions and
there are about 20 different concrete implementations of various sorts.
Even minor interface changes require a lot of tedious editing to make sure
that everything stays in sync.


But... you have the type of all functions nailed down in classes. Thus, even
if a change in the API means a lot of tedious work adapting the concrete
implementations, at least the compiler helps you to check that the
implementations will conform to the interface (class);


This is true.


and users have to
consult only the API docs, and not every single function in all 20
implementations. With ByteString and friends there is (yet) no common
interface laid down anywhere. All the commonality is based on custom and
good sense and the willingness and ability of the developers to make their
interfaces compatible to those of others.


One could use code
generation or macro expansion to alleviate this, but IMO the necessity to
use extra-language pre-processors points to a weakness in the language; it
would be much less complicated and more satisfying to use a language feature
that avoids the repetition instead of generating code to facilitate it.


I've considered something like this for Edison.  Actually, I've considered
going even further and building the Edison concrete implementations in a
theorem prover to prove correctness and then extracting the Haskell source.
Some sort of in-language or extra-language support for mechanically
producing the source files for the full API from the optimized core API
would be quite welcome.  Handling export lists,


How so? I thought in Edison the API is a set of type classes. Doesn't that
mean export lists can be empty (since instances are exported
automatically)?


No.  Edison allows you to directly import the module and bypass the  
typeclass APIs if you wish.  Also, some implementations have special  
functions that are not part of the general API, and are only  
available via the module exports.


One could make typeclasses the only way to access the main API, but I  
rather suspect there would be performance implications.  I get the  
impression that typeclass specialization is less advanced than  
intermodule inlining (could be wrong though).
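
To make the performance concern concrete, here is a minimal sketch of an overloaded function together with a GHC SPECIALIZE pragma for it. The class Sequence, the type ListSeq, and the function fromL are hypothetical names chosen for illustration and are not Edison's actual API. Whether the pragma recovers the speed of a direct call in a given program is something to measure rather than assume.

```haskell
-- Hypothetical sequence interface (illustrative only, not Edison's API).
class Sequence s where
  emptySeq :: s a
  consSeq  :: a -> s a -> s a
  toL      :: s a -> [a]

-- One concrete implementation, backed by a plain list.
newtype ListSeq a = ListSeq [a]

instance Sequence ListSeq where
  emptySeq               = ListSeq []
  consSeq x (ListSeq xs) = ListSeq (x : xs)
  toL (ListSeq xs)       = xs

-- Overloaded code: callers go through the class dictionary unless the
-- compiler inlines or specializes the function.
fromL :: Sequence s => [a] -> s a
fromL = foldr consSeq emptySeq

-- Ask GHC to also compile a copy of fromL fixed to ListSeq.
{-# SPECIALIZE fromL :: [a] -> ListSeq a #-}
```

The pragma has to be repeated for every function and every implementation one cares about, which is one more list to keep in sync by hand.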




haddock comments,


I thought all the documentation would be in the API classes, not in the
concrete implementations.


It is now, but I've gotten complaints about that (which are at least
semi-justified, I feel).  Also, the various implementations have
different time bounds which must be documented in the individual
modules.  Ideally, I'd like to have the function documentation string
and the time bounds on each function in each concrete
implementation.  I've not done this because it's just too painful to
maintain manually.




typeclass instances,
etc, are quite tedious.

I have to admit, I'm not sure what an in-language mechanism for doing
something like this would look like.  Template Haskell is an option, I
suppose, but it's pretty hard to work with and highly non-portable.  It also
wouldn't produce Haddock-consumable source files.  ML-style first-class
modules might fit the bill, but I'm not sure anyone is seriously interested
in bolting that onto Haskell.


As I explained to SPJ, I am less concerned with duplicated work when
implementing concrete data structures, than with the fact that there is still
no (compiler checkable) common interface for e.g. string-like thingies,
apart from convention to use similar names for similar features.



Fair enough.  I guess my point is that typeclasses (as per Edison)
are only a partial solution to this problem, even if you can stretch
them sufficiently (with e.g. MPTC+fundeps+whatever other extension) to
make them cover all your concrete implementations.
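
For readers unfamiliar with the extensions mentioned, the following is a minimal sketch of a compiler-checked interface for string-like types using multi-parameter type classes and a functional dependency. The class StringLike, its methods, and the Bytes type are hypothetical; no such class is defined by Data.ByteString, Data.CompactString, or Edison, and Bytes is only a stand-in for a packed byte string.

```haskell
{-# LANGUAGE MultiParamTypeClasses, FunctionalDependencies, FlexibleInstances #-}

import Data.Word (Word8)

-- Hypothetical interface for string-like types (illustrative only).
-- The fundep "s -> e" says the container type fixes the element type.
class StringLike s e | s -> e where
  emptyS  :: s
  consS   :: e -> s -> s
  unconsS :: s -> Maybe (e, s)

-- Ordinary lists: the element type is the list's parameter.
instance StringLike [a] a where
  emptyS         = []
  consS          = (:)
  unconsS []     = Nothing
  unconsS (x:xs) = Just (x, xs)

-- A monomorphic container (stand-in for a packed byte string).
newtype Bytes = Bytes [Word8] deriving (Eq, Show)

instance StringLike Bytes Word8 where
  emptyS                 = Bytes []
  consS b (Bytes bs)     = Bytes (b : bs)
  unconsS (Bytes [])     = Nothing
  unconsS (Bytes (b:bs)) = Just (b, Bytes bs)

-- Written once against the class, usable with every instance.
mapS :: StringLike s e => (e -> e) -> s -> s
mapS f s = case unconsS s of
  Nothing        -> emptyS
  Just (x, rest) -> consS (f x) (mapS f rest)
```

The class pins down the types of the shared operations, but each instance still has to be written, exported, and documented separately, which is the part the class does not help with.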




Cheers
Ben



Rob Dockins

Speak softly and drive a Sherman tank.
Laugh hard; it's a long way to the bank.
  -- TMBG



___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


[Haskell-cafe] Re: Can we do better than duplicate APIs?

2007-03-28 Thread Benjamin Franksen
Robert Dockins wrote:
 Some sort of in-language or extra-language support for mechanically
 producing the source files for the full API from the optimized core API
 would be quite welcome.

Have you considered using DrIFT? IIRC it is more portable and easier to use
than TH.

 Handling export lists, 

 How so? I thought in Edison the API is a set of type classes. Doesn't that
 mean export lists can be empty (since instances are exported
 automatically)?
 
 No.  Edison allows you to directly import the module and bypass the  
 typeclass APIs if you wish.

Ah, I didn't know that.

 Also, some implementations have special   
 functions that are not part of the general API, and are only  
 available via the module exports.

Ok.

 One could make typeclasses the only way to access the main API, but I  
 rather suspect there would be performance implications.  I get the  
 impression that typeclass specialization is less advanced than  
 intermodule inlining (could be wrong though).

No idea. Experts?

 haddock comments,

 I thought all the documentation would be in the API classes, not in the
 concrete implementations.
 
 It is now, but I've gotten complaints about that (which are at least
 semi-justified, I feel).  Also, the various implementations have
 different time bounds which must be documented in the individual
 modules.

Yes, I forgot about that. Hmmm.

 Ideally, I'd like to have the function documentation string
 and the time bounds on each function in each concrete
 implementation.  I've not done this because it's just too painful to
 maintain manually.

I can relate to that. The more so since establishing such time bounds with
confidence is not trivial even if the code looks simple. BTW, code
generation (of whatever sort) wouldn't help with that, right?

I wonder: would it be worthwhile to split the package into smaller parts
that could be upgraded in a somewhat less synchronous way? (so that the
maintenance effort can be spread over a longer period)

 I have to admit, I'm not sure what an in-language mechanism for doing
 something like this would look like.  Template Haskell is an option, I
 suppose, but it's pretty hard to work with and highly non-portable.  It also
 wouldn't produce Haddock-consumable source files.  ML-style first-class
 modules might fit the bill, but I'm not sure anyone is seriously interested
 in bolting that onto Haskell.

 As I explained to SPJ, I am less concerned with duplicated work when
 implementing concrete data structures, than with the fact that there is still
 no (compiler checkable) common interface for e.g. string-like thingies,
 apart from convention to use similar names for similar features.
 
 Fair enough.  I guess my point is that typeclasses (as per Edison)
 are only a partial solution to this problem, even if you can stretch
 them sufficiently (with e.g. MPTC+fundeps+whatever other extension) to
 make them cover all your concrete implementations.

Yes, and I think these problems would be worth some more research effort.

Besides, I dearly hope that we can soon experiment with associated type
synonyms...
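
As a rough idea of what an interface built on associated type synonyms could look like, here is a minimal sketch. It assumes a GHC that supports the TypeFamilies extension, and the class StringLike, the synonym Elem, and the function appendS are hypothetical names used only for illustration.

```haskell
{-# LANGUAGE TypeFamilies #-}

-- Hypothetical interface; the names are illustrative only.
class StringLike s where
  -- Associated type synonym: each instance names its own element type.
  type Elem s
  emptyS  :: s
  consS   :: Elem s -> s -> s
  unconsS :: s -> Maybe (Elem s, s)

instance StringLike [a] where
  type Elem [a] = a
  emptyS         = []
  consS          = (:)
  unconsS []     = Nothing
  unconsS (x:xs) = Just (x, xs)

-- Generic code mentions only the container type s; the element type
-- is recovered as Elem s wherever it is needed.
appendS :: StringLike s => s -> s -> s
appendS xs ys = case unconsS xs of
  Nothing        -> ys
  Just (x, rest) -> consS x (appendS rest ys)
```

Compared with a two-parameter class, signatures such as that of appendS carry a single class parameter, with the element type computed from it.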

Cheers
Ben



Re: [Haskell-cafe] Re: Can we do better than duplicate APIs?

2007-03-28 Thread Robert Dockins
On Wednesday 28 March 2007 17:08, Benjamin Franksen wrote:
 Robert Dockins wrote:
  Some sort of in-language or extra-language support for mechanically
  producing the source files for the full API from the optimized core API
  would be quite welcome.

 Have you considered using DrIFT? IIRC it is more portable and easier to use
 than TH.

DrIFT only works on datatype declarations (AFAIK) and doesn't really cover the 
use cases in question.

[snip]

  haddock comments,
 
  I thought all the documentation would be in the API classes, not in the
  concrete implementations.
 
  It is now, but I've gotten complaints about that (which are at least
  semi-justified, I feel).  Also, the various implementations have
  different time bounds which must be documented in the individual
  modules.

 Yes, I forgot about that. Hmmm.

  Ideally, I'd like to have the function documentation string
  and the time bounds on each function in each concrete
  implementation.  I've not done this because it's just too painful to
  maintain manually.

 I can relate to that. The more so since establishing such time bounds with
 confidence is not trivial even if the code looks simple. BTW, code
 generation (of whatever sort) wouldn't help with that, right?

Well, I can't imagine any tool that would prove the bounds for me unless 
automatic proof techniques have improved a _lot_ in the last week or so ;-)

However, if I could record the bounds once somewhere for each implementation 
and then have them auto merged with the documentation for each function, that 
would be great.
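
A minimal sketch of the kind of merge step described here, assuming the shared descriptions and the per-implementation bounds are kept as simple lookup tables. Every name in it (sharedDocs, listBounds, haddockFor, and the cons/snoc entries) is illustrative and not taken from Edison's sources; a real tool would also have to splice the result into the module source.

```haskell
-- All function names, descriptions, and tables here are illustrative;
-- they are not taken from Edison's sources.
type FunctionName = String

-- Written once: what each API function does.
sharedDocs :: [(FunctionName, String)]
sharedDocs =
  [ ("cons", "Add an element to the front of the sequence.")
  , ("snoc", "Add an element to the rear of the sequence.")
  ]

-- Written once per implementation: its time bounds.
-- These are the familiar bounds for a plain singly linked list.
listBounds :: [(FunctionName, String)]
listBounds =
  [ ("cons", "O(1)")
  , ("snoc", "O(n)")
  ]

-- Merge the shared description with one implementation's bound into
-- Haddock comment text. Nothing if either table lacks the function.
haddockFor :: [(FunctionName, String)] -> FunctionName -> Maybe String
haddockFor bounds name = do
  doc   <- lookup name sharedDocs
  bound <- lookup name bounds
  return (unlines ["-- | " ++ doc, "--", "-- Time: " ++ bound])
```

For example, haddockFor listBounds "snoc" yields a three-line comment holding the shared description followed by "Time: O(n)".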

 I wonder: would it be worthwhile to split the package into smaller parts
 that could be upgraded in a somewhat less synchronous way? (so that the
 maintenance effort can be spread over a longer period)

Perhaps, but that only amortizes the effort rather than reducing it.


[snip]

  As I explained to SPJ, I am less concerned with duplicated work when
  implementing concrete data structures, than with the fact that there
  is still no (compiler checkable) common interface for e.g. string-like
  thingies, apart from convention to use similar names for similar features.
 
  Fair enough.  I guess my point is that typeclasses (as per Edison)
  are only a partial solution to this problem, even if you can stretch
  them sufficiently (with e.g. MPTC+fundeps+whatever other extension) to
  make them cover all your concrete implementations.

 Yes, and I think these problems would be worth some more research effort.

Agreed.

 Besides, I dearly hope that we can soon experiment with associated type
 synonyms...

 Cheers
 Ben


Rob Dockins


Re: [Haskell-cafe] Re: Can we do better than duplicate APIs? [was: Data.CompactString 0.3]

2007-03-28 Thread Duncan Coutts
On Wed, 2007-03-28 at 20:44 +0200, Benjamin Franksen wrote:

 But... you have the type of all functions nailed down in classes. Thus, even
 if a change in the API means a lot of tedious work adapting the concrete
 implementations, at least the compiler helps you to check that the
 implementations will conform to the interface (class); and users have to
 consult only the API docs, and not every single function in all 20
 implementations. With ByteString and friends there is (yet) no common
 interface laid down anywhere. All the commonality is based on custom and
 good sense and the willingness and ability of the developers to make their
 interfaces compatible to those of others.

Remember that there's more to an API than a bunch of types. The type
class only ensures common types.

You must still rely on the good sense and ability of the developers to
ensure other properties like strictness, time complexity and simply what
the functions should do.
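
A small sketch of this point, using a hypothetical Container class and two made-up instances: both type-check, but only one behaves as a user would expect, and the difference shows up only in a separate behavioural check.

```haskell
-- Hypothetical interface and instances, for illustration only.
class Container c where
  contents :: c a -> [a]
  cfilter  :: (a -> Bool) -> c a -> c a

newtype Good a = Good [a]
newtype Bad  a = Bad  [a]

instance Container Good where
  contents (Good xs)  = xs
  cfilter p (Good xs) = Good (filter p xs)

-- Same types, wrong behaviour: keeps the elements that FAIL the
-- predicate. The compiler accepts this instance without complaint.
instance Container Bad where
  contents (Bad xs)  = xs
  cfilter p (Bad xs) = Bad (filter (not . p) xs)

-- A behavioural check the class cannot express: cfilter must agree
-- with the list filter. `wrap` picks the implementation under test.
filterAgrees :: Container c => ([Int] -> c Int) -> [Int] -> Bool
filterAgrees wrap xs =
  contents (cfilter even (wrap xs)) == filter even xs
```

Here filterAgrees Good [1..10] holds while filterAgrees Bad [1..10] does not, even though both instances satisfy the class. Strictness and time complexity are harder still to check this way.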

Duncan



[Haskell-cafe] Re: Can we do better than duplicate APIs? [was: Data.CompactString 0.3]

2007-03-26 Thread Benjamin Franksen
Jean-Philippe Bernardy wrote:
 Please look at
http://darcs.haskell.org/packages/collections/doc/html/Data-Collections.html
 for an effort to make most common operation on bulk types fit in a
 single framework.

The last time I looked at this (shortly after you started the project) I
wasn't sure if I would want to use it. Now it seems like an oasis in a
desert to me. I am pretty much impressed; for instance, you managed to
unify all nine existing 'filter' types into a common type class. Cool.

The only hair in the (otherwise very tasty) soup is Portability: MPTC, FD,
undecidable instances, which doesn't sound like it is going to replace the
Prelude any time soon ;-) Never mind: I definitely consider using this
instead of importing all these different Data.XYZ modules directly (and,
heaven forbid, having to import them qualified whenever I need two of them
in the same module).

Do you foresee any particular obstacle to an integration (=providing the
appropriate instances) of e.g. CompactStrings? I would even try to do this
myself, as an exercise of sorts. How difficult is it in practice to work
with 'undecidable instances'? Are there special traps one has to be careful
to walk around?

 Also, we expect indexed types to solve, or at least alleviate, some
 problems you mention in your rant.
 http://haskell.org/haskellwiki/GHC/Indexed_types

I have been hoping for that to resolve (some of) our troubles, but have been
confused by all the back and forth among the experts about whether they
offer more, or less, or the same, as MPTCs+fundeps+whatever (and that they
will probably not go into Haskell').

BTW, any reason I didn't find your collections library in the HackageDB
(other than stupidity on my part)? (Just interested, I already found the
darcs repo.)

Cheers
Ben

PS: Since I read and post to the Haskell lists via gmane and a news client:
Do mail clients usually respect the follow-up header, such as I insert when
cross-posting, so as to restrict follow-ups to the intended list?
