Re: [Haskell-cafe] RFC: demanding lazy instances of Data.Binary

2007-11-20 Thread Duncan Coutts
On Mon, 2007-11-19 at 20:22 -0600, Nicolas Frisby wrote:
 On Nov 19, 2007 4:16 PM, Duncan Coutts [EMAIL PROTECTED] wrote:
 
  On Mon, 2007-11-19 at 13:39 -0800, Don Stewart wrote:
   nicolas.frisby:
 
 *snip*
 
   
   1) The fact that serialisation is fully strict for 32760 bytes but 
not for
   32761 makes the direct application of strictCheck intractable. Do 
you have
   any ideas how to circumvent that?
 
  Test using a much smaller chunk size. I'd test sizes from 1 to something
  like one more than the machine word size.
 
 
 Let me clarify "circumvent that". strictCheck uses a bounded search
 starting at 1 and proceeds to some limit. The Binary instance used in
 the example was the fully lazy one for lists: a get/putWord8 for each
 constructor. Even so, it was effectively spine-strict up to 32k bytes
 (which was 32k elements b/c of the use of unit) because (I think that)
 the first chunk of the lazy bytestring wasn't being generated by
 encode until it was full. If you asked strictCheck to go from 1 to
 32k, I don't think it would finish. So by "circumvent", I mean: How can
 we apply the essential ideas of strictCheck when our chunks are so
 big?

We don't. We test it with a variety of small chunk sizes. That is the
sensible thing to do.

 Obviously, the iterative search cannot just proceed by one
 element at a time; but then we lose the obvious meaning of add one
 more _|_. I don't see an obvious candidate for how to alter the
 _|_-ridden test vector generation. Moreover, its proposed output is
 wrong when considered from the "Big Chunks" perspective--we don't
 necessarily want Chitil's least strictness.

Indeed. As I've said, Data.Binary is lazy but in a chunky way where
within each chunk it is strict.

  Sequences are like spine strict lists. Sets are strict in as much as the
  element type's (==) function is strict.
 
 Let me refine how I posed that question. A predicate: if you enter
 "Package.fromList [1..]" at the ghci prompt and you get no
 intermediate results, then that was a strict type. 

Right, because (==) for Int is strict, and Set.fromList uses (==) on
each element. Sorry, I was just being pedantic by saying that it depends
on the strictness of (==).

 I'm assuming that if the Show instance doesn't produce intermediate
 results, then the serialisation technique can't handle intermediate
 results (i.e. chunking) either--at least not in a general enough way
 to include it in a library.

So if you did this test with my proposed list instance (and you somehow
slowed your computer right down so you could see what was going on)
you'd see it wait a sec, then print out 32k of serialised list elements,
then wait again and emit another chunk. So lazy, but in strict chunks.
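
A quick way to observe that behaviour without slowing the machine down is to
print the size of each strict chunk as the lazy bytestring is produced. A
minimal sketch, assuming the Lazily wrapper and the lazy list instance from
Nicolas's draft are in scope:

  import Data.Binary (encode)
  import qualified Data.ByteString as S
  import qualified Data.ByteString.Lazy as L

  -- Each length is printed as soon as its chunk has been serialised, so
  -- the output should appear one ~32k lump at a time.
  main :: IO ()
  main = mapM_ (print . S.length)
               (L.toChunks (encode (Lazily [1 .. 1000000 :: Int])))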

Duncan


Re: [Haskell-cafe] RFC: demanding lazy instances of Data.Binary

2007-11-20 Thread Duncan Coutts
On Mon, 2007-11-19 at 20:06 -0600, Nicolas Frisby wrote:
 In light of this discussion, I think the "fully spine-strict list
 instance does more good than bad" argument is starting to sound like a
 premature optimization. Consequently, using a newtype to treat the
 necessarily lazy instances as special cases is an inappropriate
 bandaid. 

I agree.

 My current opinion: If Data.Binary makes both a fully strict list
 instance (not []) and a fully lazy list instance (this would be the
 default for []) available, then that will also make available all of
 the strictnesses in between. I'll elaborate that a bit. If the
 user defines a function appSpecificSplit :: MyDataType -> [StrictList
 a], then the user can control the compactness and laziness of the
 serialisation by tuning that splitting function. Neil's 255 schema
 fits as one particular case, the split255 :: [a] -> [StrictList a]
 function. I would hesitate to hard code a number of elements, since it
 certainly depends on the application and only exposing it as a
 parameter maximizes the reusability of the code. 

Fully lazy is the wrong default here I think. But fully strict is also
not right. What would fit best with the style of the rest of the
Data.Binary library is to be lazy in a lumpy way. This can give
excellent performance whereas being fully lazy cannot (because the
chunk size becomes far too small, which increases the overhead).

Has anyone actually said they want the list serialisation to be fully
lazy? Is there a need for anything more than just not being fully
strict? If there is, I don't see it. If it really is needed it can be
added just by flushing after serialising each element.
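
For concreteness, here is a sketch of what "flushing after serialising each
element" could look like, using the per-element tag byte scheme discussed
elsewhere in this thread (Data.Binary.Put does export flush; the function
name here is illustrative):

  import Data.Binary (Binary, put)
  import Data.Binary.Put (Put, putWord8, flush)

  -- Maximally lazy list serialisation: one tag byte per cons cell, and a
  -- flush after every element so that each element ends up in its own
  -- lazy bytestring chunk.
  putListLazy :: Binary a => [a] -> Put
  putListLazy xs = do
    mapM_ (\x -> putWord8 1 >> put x >> flush) xs
    putWord8 0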

 "Reaching for the sky" idea: Does the Put monad offer enough
 information for an instance to be able to recognize when it has filled
 a lazy bytestring's first chunk? It could tailor its strictness (i.e.
 vary how much of the spine is forced before any output is generated)
 in order to best line up with the chunks of lazy bytestring it is
 producing. This might be trying to fit too much into the interface.
 And it might even make Put an actual monad ;) 

That is something I've considered. Serialise just as much of the list as
is necessary to fill the remainder of a chunk. Actually we'd always fill
just slightly more than a chunk, because we don't know how big each list
element will be; we only know when we've gone over.
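
The Put monad does not expose how many bytes have been written, so this can
only be sketched against a hypothetical probe (bytesWritten below is
invented purely for illustration; it is not part of Data.Binary.Put):

  import Data.Binary (Binary, put)
  import Data.Binary.Put (Put, PutM, putWord8, flush)
  import Data.Int (Int64)

  -- Hypothetical: bytes written since the last flush. No such primitive
  -- exists in Data.Binary.Put.
  bytesWritten :: PutM Int64
  bytesWritten = undefined

  -- Serialise elements until the current chunk overflows, then flush and
  -- start a new one; we always overshoot the boundary by one element
  -- because an element's size is only known after it has been written.
  putAligned :: Binary a => Int64 -> [a] -> Put
  putAligned chunkSize = go
    where
      go []       = putWord8 0
      go (x : xs) = do
        putWord8 1
        put x
        n <- bytesWritten
        if n >= chunkSize then flush >> go xs else go xs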

Duncan


Re: [Haskell-cafe] RFC: demanding lazy instances of Data.Binary

2007-11-19 Thread Nicolas Frisby
I've got a first draft with the newtype and just an instance for list.



If you'd prefer fewer questions, please let me know ;)



0) I've cabalised it (lazy-binary), but I don't have anywhere to host it.
Would it be appropriate to host on darcs.haskell.org or HackageDB (yet?).
Suggestions?

1) The fact that serialisation is fully strict for 32760 bytes but not for
32761 makes the direct application of strictCheck intractable. Do you have
any ideas how to circumvent that?

  ((unLazy . decode . encode . Lazy) `asTypeOf` id) (take 32760 (repeat ())
++ undefined)
= undefined

  ((unLazy . decode . encode . Lazy) `asTypeOf` id) (take 32761 (repeat ())
++ undefined)
   = lots of ()s and then undefined

2) Also, any suggestions for other datatypes to provide default instances
for? Tree type structures immediately raise the question of which traversal
should be the default. I'm leaning towards providing none since the goal of
constant space usage actually depends on the serialisation order matching
how the deserialised tree will be traversed.

3) I don't think it is applicable in any way whatsoever to strict types like
Int, Data.Set, and Data.Sequence? Counter-arguments?

4) Perhaps the tight correspondence between serialisation and traversal
necessary for constant space usage does indeed mean that the instance for
the lazy list is the only appropriate one to provide. Perhaps the Chunks
data type and a function splitN :: Int -> [a] -> Chunks [a] would also be
helpful.

Thanks!

On Nov 16, 2007 12:45 PM, Don Stewart [EMAIL PROTECTED] wrote:

 dons:
  nicolas.frisby:
   I've noticed a few posts on the cafe, including my own experience,
   where the spine-strictness of the Binary instance for lists caused
   some confusion. I'd like to suggest an approach to preventing this
   confusion in the future, or at least making it easier to resolve.
  
   Having decided that it is indeed appropriate for some standard
   instances to be strict by default [1], I think it would be beneficial
   to standardize an approach for expressing that a lazy instance is
   expected. I propose the following newtype be added to Data.Binary. A
   demonstration immediately follows.
  
newtype Lazily a = Lazily { unLazily :: a }
  
-- example instance
instance Binary a => Binary (Lazily [a]) where
-- lazy get and put
  
   Now
  
[1..] = (unLazily . decode . encode . Lazily) [1..]
  
   instead of
  
_|_ = (decode . encode) [1..]
  
   This email is a request for comments on this concept. I think it is a
   minimal way of expressing the intent that the serialisation be lazy
   [2]. Please share any concerns or suggestions. I'll submit a patch
   once the discussion is complete... or if it becomes inactive ;)
 
  I think this is a good compromise: allowing laziness for those who need
  it, in a clean manner. How about we provide
 
  Data.Binary.Lazy
 
  with the Lazy newtype, and lazy instances for types that make sense to
  be so?
 
  For now, this can be developed as a single module depending on
  Data.Binary. What do you think, Nick?

 I'd like to also use strictCheck, as we did for the stream fusion
 library, to state and check strictness properties for the instances,
 since getting this clear and correct seems to be a common FAQ with
 Binary.

 -- Don



Re: [Haskell-cafe] RFC: demanding lazy instances of Data.Binary

2007-11-19 Thread Don Stewart
nicolas.frisby:
I've got a first draft with the newtype and just an instance for list.
 
If you'd prefer fewer questions, please let me know ;)
 
0) I've cabalised it (lazy-binary), but I don't have anywhere to host
it. Would it be appropriate to host on darcs.haskell.org or HackageDB
(yet?). Suggestions?

You can host it on code.haskell.org, ask for an account here:

http://community.haskell.org/admin/account_request.html

and a project:

http://community.haskell.org/admin/

 
1) The fact that serialisation is fully strict for 32760 bytes but not for
32761 makes the direct application of strictCheck intractable. Do you have
any ideas how to circumvent that?
 
  ((unLazy . decode . encode . Lazy) `asTypeOf` id) (take 32760 (repeat
()) ++ undefined)
= undefined
 
  ((unLazy . decode . encode . Lazy) `asTypeOf` id) (take 32761 (repeat
()) ++ undefined)
   = lots of ()s and then undefined

What's happening here?

2) Also, any suggestions for other datatypes to provide default instances
for? Tree type structures immediately raise the question of which
traversal should be the default. I'm leaning towards providing none since
the goal of constant space usage actually depends on the serialisation
order matching how the deserialised tree will be traversed.

Lazy Arrays?
  
3) I don't think it is applicable in any way whatsoever to strict types
like Int, Data.Set, and Data.Sequence? Counter-arguments?

Well, atomic types like Int, I don't think it makes sense, but Sets and
Sequence are lazy, aren't they?

4) Perhaps the tight correspondence between serialisation and traversal
necessary for constant space usage does indeed mean that the instance for
the lazy list is the only appropriate one to provide. Perhaps the Chunks
data type and a function splitN :: Int -> [a] -> Chunks [a] would also be
helpful.

Yes, it is probably the only lazy instance anyone cares about, anyway.

 

-- Don


Re: [Haskell-cafe] RFC: demanding lazy instances of Data.Binary

2007-11-19 Thread Duncan Coutts
On Mon, 2007-11-19 at 13:39 -0800, Don Stewart wrote:
 nicolas.frisby:
 I've got a first draft with the newtype and just an instance for list.
  
 If you'd prefer fewer questions, please let me know ;)
  
 0) I've cabalised it (lazy-binary), but I don't have anywhere to host
 it. Would it be appropriate to host on darcs.haskell.org or HackageDB
 (yet?). Suggestions?
 
 You can host it on code.haskell.org, ask for an account here:

I think we should first consider whether a lazier serialisation of lists
shouldn't be the default, before thinking about forking the whole library.

It depends on how much laziness you want. We could certainly make it so
that this is true:

(decode . encode) [1..] = [1..]

rather than giving _|_. However the approach of Data.Binary is lazy
serialisation but in chunks, big chunks. So while the above may be true,
this would not be:

(head . decode . encode) [1, _|_] = 1

because we generate 32k chunks of data when serialising. But do you
really need it to be this lazy? Would it be enough for it to be lazy in
chunks?

There is a good argument I think that the current fully strict
serialisation is bad just from a performance perspective, and that
instead we should serialise lists semi-lazily, using a chunked
representation. For example, Neil's serialisation library uses
length-prefixed chunks with a maximum chunk size of 255. The end of a list
is denoted by a zero-length final chunk. This has the advantage that we
only have to force a limited number of elements (to find the length)
before serialising.
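
Under those assumptions, the scheme might be sketched as follows (the names
are illustrative, not taken from Neil's library):

  import Control.Monad (replicateM)
  import Data.Binary (Binary, get, put)
  import Data.Binary.Get (Get, getWord8)
  import Data.Binary.Put (Put, putWord8)

  -- Length-prefixed chunks of at most 255 elements; a zero-length chunk
  -- terminates the list, so only up to 255 elements are forced (to count
  -- them) before a chunk is emitted.
  putChunked :: Binary a => [a] -> Put
  putChunked [] = putWord8 0
  putChunked xs = do
    let (chunk, rest) = splitAt 255 xs
    putWord8 (fromIntegral (length chunk))
    mapM_ put chunk
    putChunked rest

  getChunked :: Binary a => Get [a]
  getChunked = do
    n <- getWord8
    if n == 0
      then return []
      else do
        chunk <- replicateM (fromIntegral n) get
        rest  <- getChunked
        return (chunk ++ rest)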

If you want it really lazy then it would have to flush after each
element to create a new lazy bytestring chunk. Note that flushing this
often loses many of the performance advantages of the Data.Binary
stuff.

  
 1) The fact that serialisation is fully strict for 32760 bytes but not 
  for
 32761 makes the direct application of strictCheck intractable. Do you 
  have
 any ideas how to circumvent that?

Test using a much smaller chunk size. I'd test sizes from 1 to something
like one more than the machine word size.

 2) Also, any suggestions for other datatypes to provide default instances
 for? Tree type structures immediately raise the question of which
 traversal should be the default. I'm leaning towards providing none 
  since
 the goal of constant space usage actually depends on the serialisation
 order matching how the deserialised tree will be traversed.
 
 Lazy Arrays?
   
 3) I don't think it is applicable in any way whatsoever to strict types
 like Int, Data.Set, and Data.Sequence? Counter-arguments?
 
 Well, atomic types like Int, I don't think it makes sense, but Sets and
 Sequence are lazy, aren't they?

Sequences are like spine strict lists. Sets are strict in as much as the
element type's (==) function is strict.

 4) Perhaps the tight correspondence between serialisation and traversal
 necessary for constant space usage does indeed mean that the instance for
 the lazy list is the only appropriate one to provide. Perhaps the Chunks
 data type and a function splitN :: Int -> [a] -> Chunks [a] would also be
 helpful.
 
 Yes, it is probably the only lazy instance anyone cares about, anyway.

Yes. So I think we should be clear about what we want and see if we
can't just fix the default.

Duncan


Re: [Haskell-cafe] RFC: demanding lazy instances of Data.Binary

2007-11-19 Thread Nicolas Frisby
In light of this discussion, I think the "fully spine-strict list instance
does more good than bad" argument is starting to sound like a premature
optimization. Consequently, using a newtype to treat the necessarily lazy
instances as special cases is an inappropriate bandaid.

My current opinion: If Data.Binary makes both a fully strict list instance
(not []) and a fully lazy list instance (this would be the default for [])
available, then that will also make available all of the strictnesses in
between. I'll elaborate that a bit. If the user defines a function
appSpecificSplit :: MyDataType -> [StrictList a], then the user can control
the compactness and laziness of the serialisation by tuning that splitting
function. Neil's 255 schema fits as one particular case, the
split255 :: [a] -> [StrictList a] function. I would hesitate to hard code a
number of elements, since it certainly depends on the application and only
exposing it as a parameter maximizes the reusability of the code.

Related action: Thus I think that instead of a standard newtype to denote
laziness expectations, there ought to be a standard strict list data type
and Binary instance AND some documentation popularizing this as a possible
optimization whenever the generated bytestrings from []s are too big. Is
Data.Sequence suitable for use as StrictList? Or perhaps something from the
strict library? I'm not fully savvy to the head-strictness/unpacking
differences and trade-offs.
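
To make that concrete, one possible shape for the strict list and Neil's
splitter, as a sketch (whether the fields should also be unpacked is part
of the trade-off question above):

  -- A list that is strict in both the spine and (to WHNF) the elements.
  data StrictList a = Nil | Cons !a !(StrictList a)

  toStrict :: [a] -> StrictList a
  toStrict = foldr Cons Nil

  -- Neil's 255 schema as one particular app-specific splitter.
  split255 :: [a] -> [StrictList a]
  split255 [] = []
  split255 xs = let (chunk, rest) = splitAt 255 xs
                in toStrict chunk : split255 rest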

"Reaching for the sky" idea: Does the Put monad offer enough information
for an instance to be able to recognize when it has filled a lazy
bytestring's first chunk? It could tailor its strictness (i.e. vary how much
of the spine is forced before any output is generated) in order to best line
up with the chunks of lazy bytestring it is producing. This might be trying
to fit too much into the interface. And it might even make Put an actual
monad ;)


On Nov 19, 2007 4:16 PM, Duncan Coutts [EMAIL PROTECTED] wrote:

 On Mon, 2007-11-19 at 13:39 -0800, Don Stewart wrote:
  nicolas.frisby:
  I've got a first draft with the newtype and just an instance for
 list.
  
  If you'd prefer fewer questions, please let me know ;)
  
  0) I've cabalised it (lazy-binary), but I don't have anywhere to
 host
  it. Would it be appropriate to host on darcs.haskell.org or
 HackageDB
  (yet?). Suggestions?
 
  You can host it on code.haskell.org, ask for an account here:

 I think we should first consider whether a lazier serialisation of lists
 shouldn't be the default, before thinking about forking the whole library.

 It depends on how much laziness you want. We could certainly make it so
 that this is true:

 (decode . encode) [1..] = [1..]

 rather than giving _|_. However the approach of Data.Binary is lazy
 serialisation but in chunks, big chunks. So while the above may be true,
 this would not be:

 (head . decode . encode) [1, _|_] = 1

 because we generate 32k chunks of data when serialising. But do you
 really need it to be this lazy? Would it be enough for it to be lazy in
 chunks?

 There is a good argument I think that the current fully strict
 serialisation is bad just from a performance perspective, and that
 instead we should serialise lists semi-lazily, using a chunked
 representation. For example, Neil's serialisation library uses
 length-prefixed chunks with a maximum chunk size of 255. The end of a list
 is denoted by a zero-length final chunk. This has the advantage that we
 only have to force a limited number of elements (to find the length)
 before serialising.

 If you want it really lazy then it would have to flush after each
 element to create a new lazy bytestring chunk. Note that flushing this
 often loses many of the performance advantages of the Data.Binary
 stuff.

  
  1) The fact that serialisation is fully strict for 32760 bytes but
 not for
  32761 makes the direct application of strictCheck intractable. Do
 you have
  any ideas how to circumvent that?

 Test using a much smaller chunk size. I'd test sizes from 1 to something
 like one more than the machine word size.

  2) Also, any suggestions for other datatypes to provide default
 instances
  for? Tree type structures immediately raise the question of which
  traversal should be the default. I'm learning towards providing
 none since
  the goal of constant space usage actually depends on the
 serialisation
  order matching how the deserialised tree will be traversed.
 
  Lazy Arrays?
 
  3) I don't think it is applicable in any way whatsoever to strict
 types
  like Int, Data.Set, and Data.Sequence? Counter-arguments?
 
  Well, atomic types like Int, I don't think it makes sense, but Sets and
  Sequence are lazy, aren't they?

 Sequences are like spine strict lists. Sets are strict in as much as the
 element type's (==) function is strict.

  4) Perhaps the tight correspondence between serialisation and
 traversal
  necessary for constant 

Re: [Haskell-cafe] RFC: demanding lazy instances of Data.Binary

2007-11-19 Thread Nicolas Frisby
On Nov 19, 2007 4:16 PM, Duncan Coutts [EMAIL PROTECTED] wrote:

 On Mon, 2007-11-19 at 13:39 -0800, Don Stewart wrote:
  nicolas.frisby:

*snip*

  
  1) The fact that serialisation is fully strict for 32760 bytes but not 
   for
  32761 makes the direct application of strictCheck intractable. Do you 
   have
  any ideas how to circumvent that?

 Test using a much smaller chunk size. I'd test sizes from 1 to something
 like one more than the machine word size.


Let me clarify "circumvent that". strictCheck uses a bounded search
starting at 1 and proceeds to some limit. The Binary instance used in
the example was the fully lazy one for lists: a get/putWord8 for each
constructor. Even so, it was effectively spine-strict up to 32k bytes
(which was 32k elements b/c of the use of unit) because (I think that)
the first chunk of the lazy bytestring wasn't being generated by
encode until it was full. If you asked strictCheck to go from 1 to
32k, I don't think it would finish. So by "circumvent", I mean: How can
we apply the essential ideas of strictCheck when our chunks are so
big? Obviously, the iterative search cannot just proceed by one
element at a time; but then we lose the obvious meaning of add one
more _|_. I don't see an obvious candidate for how to alter the
_|_-ridden test vector generation. Moreover, its proposed output is
wrong when considered from the "Big Chunks" perspective--we don't
necessarily want Chitil's least strictness.

(more new text below)


  2) Also, any suggestions for other datatypes to provide default 
   instances
  for? Tree type structures immediately raise the question of which
  traversal should be the default. I'm leaning towards providing none 
   since
  the goal of constant space usage actually depends on the serialisation
  order matching how the deserialised tree will be traversed.
 
  Lazy Arrays?
 
  3) I don't think it is applicable in any way whatsoever to strict types
  like Int, Data.Set, and Data.Sequence? Counter-arguments?
 
  Well, atomic types like Int, I don't think it makes sense, but Sets and
  Sequence are lazy, aren't they?

 Sequences are like spine strict lists. Sets are strict in as much as the
 element type's (==) function is strict.



Let me refine how I posed that question. A predicate: if you enter
"Package.fromList [1..]" at the ghci prompt and you get no
intermediate results, then that was a strict type. I'm assuming that
if the Show instance doesn't produce intermediate results, then the
serialisation technique can't handle intermediate results (i.e.
chunking) either--at least not in a general enough way to include it
in a library.
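
For instance (an illustrative ghci session, reconstructed rather than
copied from a real transcript):

  -- ghci> [1,2,3] ++ undefined
  -- [1,2,3*** Exception: Prelude.undefined   -- lazy: partial output first
  -- ghci> Data.Set.fromList ([1,2,3] ++ undefined)
  -- *** Exception: Prelude.undefined         -- strict: no elements printed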

*snip*


Re: [Haskell-cafe] RFC: demanding lazy instances of Data.Binary

2007-11-16 Thread Don Stewart
nicolas.frisby:
 I've noticed a few posts on the cafe, including my own experience,
 where the spine-strictness of the Binary instance for lists caused
 some confusion. I'd like to suggest an approach to preventing this
 confusion in the future, or at least making it easier to resolve.
 
 Having decided that it is indeed appropriate for some standard
 instances to be strict by default [1], I think it would be beneficial
 to standardize an approach for expressing that a lazy instance is
 expected. I propose the following newtype be added to Data.Binary. A
 demonstration immediately follows.
 
  newtype Lazily a = Lazily { unLazily :: a }
 
  -- example instance
   instance Binary a => Binary (Lazily [a]) where
  -- lazy get and put
 
 Now
 
  [1..] = (unLazily . decode . encode . Lazily) [1..]
 
 instead of
 
  _|_ = (decode . encode) [1..]
 
 This email is a request for comments on this concept. I think it is a
 minimal way of expressing the intent that the serialisation be lazy
 [2]. Please share any concerns or suggestions. I'll submit a patch
 once the discussion is complete... or if it becomes inactive ;)

I think this is a good compromise: allowing laziness for those who need
it, in a clean manner. How about we provide

Data.Binary.Lazy

with the Lazy newtype, and lazy instances for types that make sense to
be so?

For now, this can be developed as a single module depending on
Data.Binary. What do you think, Nick?

 
 1 - One solution is to make all standard Binary instances lazy
 wherever possible, but I presume that in most cases it's not needed
 and the compactness gained through default strictness (such as the []
 instance) is quite significant.

Yes, that's the argument.

-- Don


Re: [Haskell-cafe] RFC: demanding lazy instances of Data.Binary

2007-11-16 Thread Don Stewart
dons:
 nicolas.frisby:
  I've noticed a few posts on the cafe, including my own experience,
  where the spine-strictness of the Binary instance for lists caused
  some confusion. I'd like to suggest an approach to preventing this
  confusion in the future, or at least making it easier to resolve.
  
  Having decided that it is indeed appropriate for some standard
  instances to be strict by default [1], I think it would be beneficial
  to standardize an approach for expressing that a lazy instance is
  expected. I propose the following newtype be added to Data.Binary. A
  demonstration immediately follows.
  
   newtype Lazily a = Lazily { unLazily :: a }
  
   -- example instance
   instance Binary a => Binary (Lazily [a]) where
   -- lazy get and put
  
  Now
  
   [1..] = (unLazily . decode . encode . Lazily) [1..]
  
  instead of
  
   _|_ = (decode . encode) [1..]
  
  This email is a request for comments on this concept. I think it is a
  minimal way of expressing the intent that the serialisation be lazy
  [2]. Please share any concerns or suggestions. I'll submit a patch
  once the discussion is complete... or if it becomes inactive ;)
 
 I think this is a good compromise: allowing laziness for those who need
 it, in a clean manner. How about we provide
 
 Data.Binary.Lazy
 
 with the Lazy newtype, and lazy instances for types that make sense to
 be so?
 
 For now, this can be developed as a single module depending on
 Data.Binary. What do you think, Nick?

I'd like to also use strictCheck, as we did for the stream fusion
library, to state and check strictness properties for the instances,
since getting this clear and correct seems to be a common FAQ with
Binary.

-- Don


[Haskell-cafe] RFC: demanding lazy instances of Data.Binary

2007-11-16 Thread Nicolas Frisby
I've noticed a few posts on the cafe, including my own experience,
where the spine-strictness of the Binary instance for lists caused
some confusion. I'd like to suggest an approach to preventing this
confusion in the future, or at least making it easier to resolve.

Having decided that it is indeed appropriate for some standard
instances to be strict by default [1], I think it would be beneficial
to standardize an approach for expressing that a lazy instance is
expected. I propose the following newtype be added to Data.Binary. A
demonstration immediately follows.

 newtype Lazily a = Lazily { unLazily :: a }

 -- example instance
 instance Binary a => Binary (Lazily [a]) where
 -- lazy get and put
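
One way the stub might be filled in, assuming the one tag byte per
constructor encoding discussed elsewhere in the thread (a sketch only;
lazy decoding additionally relies on the laziness of the Get monad):

  import Data.Binary (Binary (..))
  import Data.Binary.Get (getWord8)
  import Data.Binary.Put (putWord8)

  instance Binary a => Binary (Lazily [a]) where
    -- One tag byte per cons cell: 1 = an element follows, 0 = end of list.
    put (Lazily [])       = putWord8 0
    put (Lazily (x : xs)) = putWord8 1 >> put x >> put (Lazily xs)
    get = do
      tag <- getWord8
      case tag of
        0 -> return (Lazily [])
        _ -> do x         <- get
                Lazily xs <- get
                return (Lazily (x : xs))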

Now

 [1..] = (unLazily . decode . encode . Lazily) [1..]

instead of

 _|_ = (decode . encode) [1..]

This email is a request for comments on this concept. I think it is a
minimal way of expressing the intent that the serialisation be lazy
[2]. Please share any concerns or suggestions. I'll submit a patch
once the discussion is complete... or if it becomes inactive ;)

Thanks for your time,
Nick



1 - One solution is to make all standard Binary instances lazy
wherever possible, but I presume that in most cases it's not needed
and the compactness gained through default strictness (such as the []
instance) is quite significant.



2 - A more involved option is to recognize that serialisation always
maps to a sequence, and so create another standard data type and
class.

 data Chunks a = Empty | Chunk !a (Chunks a) -- a la lazy bytestrings

 instance Binary a => Binary (Chunks a) where
   -- lazy put and get

 class Chunky a chunk | a -> chunk where
   toChunks :: a -> Chunks chunk
   fromChunks :: Chunks chunk -> a
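
One way the lazy put and get stub might be filled in, mirroring the list
instance sketched above (again a sketch only):

 instance Binary a => Binary (Chunks a) where
   -- 1 = another chunk follows, 0 = Empty; decoding can proceed chunk by
   -- chunk provided the Get monad is sufficiently lazy.
   put Empty          = putWord8 0
   put (Chunk x rest) = putWord8 1 >> put x >> put rest
   get = do
     tag <- getWord8
     case tag of
       0 -> return Empty
       _ -> do x    <- get
               rest <- get
               return (Chunk x rest)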

All of this machinery, however, may prematurely emphasize
problem-specific concerns. Thus it obfuscates the point: we just want
sufficient laziness. Whatever ends up happening, the Chunks data type
may be a route so common for Lazily instances that it makes sense to
provide it in a library (?
http://hackage.haskell.org/cgi-bin/hackage-scripts/package/strict-0.2).