Re: [Haskell-cafe] Diving into the records swamp (possible GSoC project)

2013-04-27 Thread Adam Gundry
Hi AntC,

Thanks for the feedback!


On 26/04/13 09:55, AntC wrote:
 Adam Gundry adam.gundry at strath.ac.uk writes:
 
 Hi,
 
 I am hoping to do a GSoC project this year working on GHC, and
 have been pointed in the direction of the records issue (in
 particular, the desire to overload field names).
 
 Heck you're brave!

Or possibly stupid. ;-) That's what I'm trying to figure out. There are
certainly other, less controversial, things I might work on!

Your questions are the kind of detailed issue that I'm tempted to
postpone for the time being, although I guess I can state my prejudices:

 Are you sure you want to step into the aggravated issue of changing 
 the dot operator from being function composition?

I'd prefer to leave dot alone and add a new operator for record
field projection, mostly because I don't relish trying to modify the
parser, but I am open to other opinions.

 Are you going to use explicit type application? (The type of get is 
 very odd.)

GHC desperately needs explicit type application, but that's another
syntactic minefield; I don't think it matters much for this proposal, in
that it's an implementation detail rather than something that should be
exposed to the user.

 Are you going to handle type-changing update?

I think we need to, though the right solution might well be to punt for
now and implement non-overloaded update alone (i.e. require the user to
explicitly specify the record type being updated when there is
ambiguity). This also works for multiple update, which is an issue
otherwise.


 The plan would be to implement a solution to the narrow issue of
  overloaded field names, along the lines of Simon PJ's SORF 
 proposal
 
 So has someone decided that SORF is the best of those many 
 proposals? I guess it's because it comes with the SPJ ring of 
 confidence?

I should have phrased my message more carefully. By along the lines of
SORF I didn't mean to exclude adapting it to take into account your work
on DORF/TPDORF. As I understand it, they both set out to solve the same
basic problem (that Johan outlines so clearly) but differ in the
implementation details. I rather meant to contrast the SORF/DORF
approach with an approach based solely on name-spacing.

Perhaps I should have mentioned that SPJ suggested I look at this,
although I don't think we want to be dictatorial about the final result.


 Before jumping to that decision, I suggest you/your sponsors
 consider the implications of the NewAxioms stuff in GHC Head [2] to
 support 'controlled' overlap.
 
 I think overlap is the only extra feature needed to support the DORF 
 or TPDORF proposals. (Plus the syntactic sugar already outlined in 
 that proposal.)

 This would provide a basis for experimentation with first-class
 record types.

 No: first-class record types needs much more than the SORF proposal.
 In particular it needs a way to extend an existing record to make a
 new one; project a subset of fields; and most important to merge two
 records with some fields in common avoiding doubling-up those fields
 (aka Relational Natural Join).

 The DORF/TPDORF proposals are aimed much better as a step towards
 first- class record types. [IMO **]

 Oleg/Ralf's HList paper covers all the ground for first-class
 records. It depends heavily on overlaps, which is why the NewAxioms
 stuff would work in really well.

While the NewAxioms stuff looks interesting, it's at an early stage.
Moreover, I'm worried that a records implementation based on desugaring
to an encoding (in the style of HList) will inevitably reveal details of
that encoding to the user. Petr rightly points out the need for good
error messages: these are much easier to generate if the constraints
arising from records are solved in GHC, rather than a library like HList.

There is certainly a long way from this proposal to full-blown
extensible records, and I don't expect to get there in a summer.


 AntC
 
 
 [2] http://hackage.haskell.org/trac/ghc/wiki/NewAxioms
 
 [**] Declaration of interest: I wrote the DORF and TPDORF proposals.

Thanks again for your efforts. From reading the list archives I can see
that a lot of thought and work has gone in to the different proposals,
which definitely will inform the final design.  It would be good if we
could finally make progress on the implementation!

Best regards,

Adam

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Diving into the records swamp (possible GSoC project)

2013-04-27 Thread Adam Gundry
Hi Petr,


On 26/04/13 19:53, Petr Pudlák wrote:
 Hi Adam,
 
 very nice idea. As the others, I'm curious why you chose to implement
 SORF in favor of the other ideas?

As I've commented in a message just now [1], by mentioning SORF I didn't
mean to exclude taking on board the other proposals (particularly
DORF/TPDORF). I do think a type-based approach is the way forward, with
a new sort of constraint that gives a polymorphic type to record
projection. The plans based on changing the name-spacing rules either
need type-based resolution as well, or require field names to be
prefixed with the relevant data type (so they don't fully solve the
problem).

Also, I'm a type theorist, so obviously the type-based solution is the
best one. ;-)


 I just read the SORF proposal, and I'm a bit concerned about what error
 messages would GHC issue when someone would type incorrect code
 involving such records. Currently Haskell's error messages already pose
 a barrier for newcomers (like No instance for (Num (a - a))), and if
 records are converted into those very complicated `Has` instances, type
 errors would be probably undecipherable even for moderate skilled
 Haskell users. Considering that records are a basic feature of Haskell
 and something that  people with OOP background are familiar with, this
 could result in a feature that would without doubts deter many (if not
 most) newcomers. So do you think it would be possible to implement it in
 such a way that users get sensible type error messages?

You're right to raise this as an issue. I think it is essential to get
good error messages, or at least no worse than we have already! Rather
than No instance for Has r l t we want it to say something like Type
r has no field l.  This justifies special treatment of Has constraints
by the compiler, rather than simply desugaring to an encoding.
Constraint-based type inference means this shouldn't be too difficult.

The current description of SORF doesn't really distinguish between the
user's perspective and the implementer's. This is something that we
should change (so it's clear how much users will need to understand);
but perhaps I'll wait and see if my GSoC proposal is accepted first!


   Best regards,
   Petr

Thanks for your comments,

Adam

[1] http://www.haskell.org/pipermail/haskell-cafe/2013-April/107876.html

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Diving into the records swamp (possible GSoC project)

2013-04-27 Thread Adam Gundry
Hi Johan,


On 26/04/13 20:46, Johan Tibell wrote:
 Hi Adam,
 
 Since we have already had *very* long discussions on this topic, I'm
 worried that I might open a can of worms be weighing in here, but the
 issue is important enough to me that I will do so regardless.

I'm the one busily opening this particular can. It's good to know it's
an important one though!

Thanks for characterising the problem so neatly:

 Instead of endorsing one of the listed proposals directly, I will
 emphasize the problem, so we don't lose sight of it. The problem people
 run into *in practice* and complain about in blog posts, on Google+, or
 privately when we chat about Haskell over beer, is that they would like
 to write a record definition like this one:
 
 data Employee = Employee { id :: Int, name :: String }
 
 printId :: Employee - IO ()
 printId emp = print $ id emp
 
 but since that doesn't work well in Haskell today due to name
 collisions, the best practice today is to instead write something like:
 
 data Employee = Employee { employeeId :: Int, employeeName :: String }
 
 printId :: Employee - IO ()
 printId emp = print $ employeeId emp
 
 The downsides of the latter have been discussed elsewhere, but briefly
 they are:
 
  * Overly verbose when there's no ambiguity.
  * Ad-hoc prefix is hard to predict (i.e. sometimes abbreviations of the
 data type name are used).
 
 The important requirement, which might seem a bit obvious, is that any
 solution to this problem better not be *even more* verbose than the
 second code snippet above. If I understand the SORF proposal correctly,
 you would write:
 
 data Employee = Employee { id :: Int, name :: String }
 
 printId :: Employee - IO ()
 printId emp = print $ emp.id http://emp.id
 
 Is that correct or do you have to replace 'Employee' with 'r { id :: Int
 }' in the type signature of 'printId'?

That's correct. The most general type (inferred if the annotation is
omitted) will be something like

printId :: r { id :: Int } = r - IO ()

but you are free to declare a more specific type in the usual way, much
as if the constraint was 'Show r', say.


 The discussions about an overhauled record system also involve lots of
 talk about record sub-typing, extensible records, and other more
 advanced features. I'd like to point out that there doesn't seem to be a
 great demand for these features. They might be nice-to-haves or might
 fall out naturally from a solution to the namespacing problem above, but
 they are in fact not needed to solve the common problem people have with
 the Haskell record system.

Thanks, I take your point. My proposal is to implement a good solution
to the problem you've outlined; I don't think we should go all the way
to extensible records just yet, if at all.


 Cheers,
   Johan
 

All the best,

Adam

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Diving into the records swamp (possible GSoC project)

2013-04-27 Thread AntC
 Johan Tibell johan.tibell at gmail.com writes:
 
 Instead of endorsing one of the listed proposals directly, I will 
emphasize the problem, so we don't lose sight of it. The problem people 
run into *in practice* and complain about in blog posts, on Google+, or 
privately when we chat about Haskell over beer, is that they would like to 
write a record definition like this one:
 
     data Employee = Employee { id :: Int, name :: String }
 
     printId :: Employee - IO ()
     printId emp = print $ id emp
 
 but since that doesn't work well in Haskell today due to name
 collisions, ...

[I've a bit more to say on that record definition below.]


Thank you Johan, I agree we should keep clear sight of the problem. So 
let's be a bit more precise: it's not exactly the record declaration that 
causes the name collisions, it's the field selector function that gets 
created automatically. (Note that we can use xDisambiguateRecordFields to 
access fields to, errm, disambiguate.)

So I did put in a separate proposal [3] (and ticket) on that very narrow 
issue. (Simon M pointed out that I probably didn't name it very well!)

Even if we do nothing to advance the records swamp, PLEASE can we 
provide a compiler option to suppress that function.

I envisage it might facilitate a 'cottage industry' of Template Haskell 
solutions (generating Has instances), which would be a cheap and cheerful 
way to experiment in the design space.


[3] 
http://hackage.haskell.org/trac/ghc/wiki/Records/DeclaredOverloadedRecordFi
elds/NoMonoRecordFields
(There are bound to be some fishhooks, especially around export/import of 
names from a module with no selector functions to one that's expecting 
them.)



[cont from above]
 ... the best practice today is to instead write something like:
 
     data Employee = Employee { employeeId :: Int, employeeName :: 
String }
 
     printId :: Employee - IO ()
     printId emp = print $ employeeId emp
 
 The downsides of the latter have been discussed elsewhere, but briefly 
they are:
 
  * Overly verbose when there's no ambiguity.
  * Ad-hoc prefix is hard to predict (i.e. sometimes abbreviations of the 
data type name are used).

I don't entirely agree with your analysis.
 * fields named `id' or `name' are very likely to clash,
   so that's a bad design (_too_ generic).
 * If you've normalised your data model [**],
   you are very likely to want exactly the same field
   in several records
   (for example employeeId in EmployeeNameAddress,
and in EmployeePay and in EmployeeTimeSheet.)

[And this use case is what TP/DORF is primarily aimed at.]

[**] Do I need to explain what data model normalisation is? I fear that so-
called XML 'databases' mean academics don't get taught normalisation any 
more(?)

AntC



___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Diving into the records swamp (possible GSoC project)

2013-04-27 Thread AntC
 Johan Tibell johan.tibell at gmail.com writes:
 
 The discussions about an overhauled record system also involve lots of 
talk about record sub-typing, extensible records, and other more advanced 
features. I'd like to point out that there doesn't seem to be a great 
demand for these features. ...

Sorry, Johan, I really have to disagree with that.

There's lot's of Haskell to SQL interfaces that build on HList and its 
extensible record ideas (HDBC for example).

But the usability is not good (as Petr points out, and as Oleg/Ralf 
admitted back in the paper). The type error messages are long and obscure. 



 ... They might be nice-to-haves or might fall out naturally from a 
solution to the namespacing problem above, but they are in fact not needed 
to solve the common problem people have with the Haskell record system.

the common problem people have is that the record system is unusable 
[IMO] so doesn't get 'stretched' to see what other difficulties it has. 
There are all sorts of alternative systems (including Lenses) built with 
Template Haskell (and chewing gum and gaffer tape: that's how desperately 
bad is the current situation ;-).

I'm saying that many people find the Haskell record system 'as is' so 
dysfunctional that they give up on it! I feel strongly that as soon as we 
get past the name collissions, there'll be other blockages to using it.

I'd be interested to hear if there are any who can remember the Trex 
system, and how (un)usable it was?

AntC



___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Diving into the records swamp (possible GSoC project)

2013-04-26 Thread AntC
 Adam Gundry adam.gundry at strath.ac.uk writes:
 
 Hi,
 
 I am hoping to do a GSoC project this year working on GHC, and have been
 pointed in the direction of the records issue (in particular, the desire
 to overload field names).

Heck you're brave!

Are you sure you want to step into the aggravated issue of changing the 
dot operator from being function composition?
Are you going to use explicit type application? (The type of get is very 
odd.)
Are you going to handle type-changing update?

 
 The plan would be to implement a solution to the narrow issue of
 overloaded field names, along the lines of Simon PJ's SORF proposal

So has someone decided that SORF is the best of those many proposals? I 
guess it's because it comes with the SPJ ring of confidence?

Before jumping to that decision, I suggest you/your sponsors consider the 
implications of the NewAxioms stuff in GHC Head [2] to 
support 'controlled' overlap.

I think overlap is the only extra feature needed to support the DORF or 
TPDORF proposals. (Plus the syntactic sugar already outlined in that 
proposal.)


  This would provide a basis for experimentation with
 first-class record types.

No: first-class record types needs much more than the SORF proposal. In 
particular it needs a way to extend an existing record to make a new one; 
project a subset of fields; and most important to merge two records with 
some fields in common avoiding doubling-up those fields (aka Relational 
Natural Join).

The DORF/TPDORF proposals are aimed much better as a step towards first-
class record types. [IMO **]

Oleg/Ralf's HList paper covers all the ground for first-class records. It 
depends heavily on overlaps, which is why the NewAxioms stuff would work 
in really well.

AntC


[2] http://hackage.haskell.org/trac/ghc/wiki/NewAxioms

[**] Declaration of interest: I wrote the DORF and TPDORF proposals.




___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Diving into the records swamp (possible GSoC project)

2013-04-26 Thread Aleksandar Dimitrov
Slightly, off-topic, but just because I've been spending my last couple of days
trying to shoehorn an inheritance-based subytping type system into Haskell
(without full OO-power, so no methods or mutable state.)

 Oleg/Ralf's HList paper covers all the ground for first-class records. It 
 depends heavily on overlaps, which is why the NewAxioms stuff would work 
 in really well.

I've been kicking around the idea of re-implementing HList on the basis of the
new DataKinds [1] extension. I don't know if there'd be much of a need for that,
though. More generally, I'm wondering what can be done on extensible records and
inheritance-based type systems with FC-pro under one's belt…

Here's one thing I don't like about the current way HList-based extensible
record are represented (and used in OOHaskell [2]): the access time is linear in
the number of records a certain type has. Somehow just the thought of reorder
the records in your constructors to make your program go faster makes me cringe
a little. Oleg and Ralf do hint at implementing some binary search tree based
technique to alleviate the problem, but I have no idea how to go about it,
except extending the compiler.

~ A.

References

[1] Yorgey, Brent A., et al. Giving Haskell a promotion. Proceedings of the
8th ACM SIGPLAN workshop on Types in language design and implementation. ACM,
2012.
[2] Kiselyov, Oleg, and Ralf Lämmel. Haskell's overlooked object system. arXiv
preprint cs/0509027 (2005).


signature.asc
Description: Digital signature
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Diving into the records swamp (possible GSoC project)

2013-04-26 Thread adam vogt
Hi Aleksandar,

This library for extensible records does use -XDataKinds:
http://hackage.haskell.org/package/vinyl. It doesn't have as many
definitions as HList, but that might be because more recent extensions
are more powerful. Many other libraries are listed
http://www.haskell.org/haskellwiki/Extensible_record#Libraries_on_hackage.
I don't know of a comparison between all of them.

--
Adam

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Diving into the records swamp (possible GSoC project)

2013-04-26 Thread Petr Pudlák
Hi Adam,

very nice idea. As the others, I'm curious why you chose to implement SORF
in favor of the other ideas?

I just read the SORF proposal, and I'm a bit concerned about what error
messages would GHC issue when someone would type incorrect code involving
such records. Currently Haskell's error messages already pose a barrier for
newcomers (like No instance for (Num (a - a))), and if records are
converted into those very complicated `Has` instances, type errors would be
probably undecipherable even for moderate skilled Haskell users.
Considering that records are a basic feature of Haskell and something that
 people with OOP background are familiar with, this could result in a
feature that would without doubts deter many (if not most) newcomers. So do
you think it would be possible to implement it in such a way that users get
sensible type error messages?

  Best regards,
  Petr



2013/4/26 Adam Gundry adam.gun...@strath.ac.uk

 Hi,

 I am hoping to do a GSoC project this year working on GHC, and have been
 pointed in the direction of the records issue (in particular, the desire
 to overload field names). This has been discussed on-and-off for years,
 and while there are lots of ideas [1], little has been implemented in
 GHC itself.

 The plan would be to implement a solution to the narrow issue of
 overloaded field names, along the lines of Simon PJ's SORF proposal (on
 the wiki). This would provide a basis for experimentation with
 first-class record types. While there are still design issues to
 resolve, the broad plan is clear and I'm confident it can be implemented
 in a summer and without overly restricting future record system designs.

 Does this sound like a reasonable strategy? I'd appreciate comments and
 criticism, although arguing about the details of the design should
 perhaps wait.

 (A little about me: I'm a PhD student working on type inference for
 Haskell and dependent types, with about four years' Haskell experience
 including work on big type-system related projects. I am familiar with
 the theory behind GHC, but I haven't worked on the code before.)

 Thanks,

 Adam Gundry


 [1] http://hackage.haskell.org/trac/ghc/wiki/Records

 ___
 Haskell-Cafe mailing list
 Haskell-Cafe@haskell.org
 http://www.haskell.org/mailman/listinfo/haskell-cafe

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Diving into the records swamp (possible GSoC project)

2013-04-26 Thread Johan Tibell
Hi Adam,

Since we have already had *very* long discussions on this topic, I'm
worried that I might open a can of worms be weighing in here, but the issue
is important enough to me that I will do so regardless.

Instead of endorsing one of the listed proposals directly, I will emphasize
the problem, so we don't lose sight of it. The problem people run into *in
practice* and complain about in blog posts, on Google+, or privately when
we chat about Haskell over beer, is that they would like to write a record
definition like this one:

data Employee = Employee { id :: Int, name :: String }

printId :: Employee - IO ()
printId emp = print $ id emp

but since that doesn't work well in Haskell today due to name collisions,
the best practice today is to instead write something like:

data Employee = Employee { employeeId :: Int, employeeName :: String }

printId :: Employee - IO ()
printId emp = print $ employeeId emp

The downsides of the latter have been discussed elsewhere, but briefly they
are:

 * Overly verbose when there's no ambiguity.
 * Ad-hoc prefix is hard to predict (i.e. sometimes abbreviations of the
data type name are used).

The important requirement, which might seem a bit obvious, is that any
solution to this problem better not be *even more* verbose than the second
code snippet above. If I understand the SORF proposal correctly, you would
write:

data Employee = Employee { id :: Int, name :: String }

printId :: Employee - IO ()
printId emp = print $ emp.id

Is that correct or do you have to replace 'Employee' with 'r { id :: Int }'
in the type signature of 'printId'?

The discussions about an overhauled record system also involve lots of talk
about record sub-typing, extensible records, and other more advanced
features. I'd like to point out that there doesn't seem to be a great
demand for these features. They might be nice-to-haves or might fall
out naturally from a solution to the namespacing problem above, but they
are in fact not needed to solve the common problem people have with the
Haskell record system.

Cheers,
  Johan
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe