Re: Persistant (as in on disk) data

2003-03-09 Thread diatchki
hello,

> I'm not convinced that the binary library should natively support
> cyclic data.  I think that if saying:
>
>   print x
>
> would not terminate, then there's no reason that
>
>   puts bh x
>
> should terminate.  I like to think of puts as a binary version of
> print.  (That is, of course, unless the instance writer for the
> Binary/Show instances of the type of x is smart enough to not keep
> writing the same thing over and over again.)  I would challenge the
> interested party to write a Show instance of String which wouldn't loop
> indefinitely on repeat 'x'.
well, it is your choice to think of it as you like, but this is not what
my original mail was about.  i think the ability to make data persistent
is a useful one and it should be as transparent to the programmer as
possible.  when i write something like:

   ones = 1 : ones

i don't think of printing infinitely many ones in memory and i don't see
why i should start thinking of it that way just because i want to make the
object persistent.  after all, one can think of the disk as a very slow
memory.
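
To make the "disk as slow memory" view concrete, here is a minimal sketch of how a cyclic list could have a finite stored form.  It is not taken from any of the Binary libraries discussed in this thread: the Cell table, decode and onesCells are hypothetical names invented for the illustration, and the sketch only covers reading the data back, since producing the table from a live Haskell value is exactly the part that needs help from the compiler.

```haskell
-- Hypothetical on-disk form of a list: a table of cells, where each cell
-- holds a value and the index of the next cell (Nothing ends the list).
type Cell a = (a, Maybe Int)

-- Rebuild the list that starts at cell 0.  A cell whose "next" index points
-- back into the table yields a cyclic list, tied lazily through `nodes`.
-- Indices are assumed to be in range; there is no error handling here.
decode :: [Cell a] -> [a]
decode cells = case nodes of
    []      -> []
    (n : _) -> n
  where
    nodes = map build cells
    build (v, next) = v : maybe [] (nodes !!) next

-- The table for `ones = 1 : ones`: a single cell that points at itself.
onesCells :: [Cell Int]
onesCells = [(1, Just 0)]
```

With these definitions, take 5 (decode onesCells) gives [1,1,1,1,1] even though the stored table has only one cell.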

> If the user has some cyclic data structure and they want to be able to
> write it in binary form, it should be on their shoulders to do it
> correctly, not on the library's.
why is that?  i thought the whole point of having nice tools is that you
don't need to write mindless stuff and can concentrate on the important
bits of your program.  i don't have to worry much about sharing and cyclic
data when i program in Haskell (i.e. it just happens), so why should i
suddenly start to worry about that if i want to make something persistent
across executions of my program?


> So essentially, I believe 'deriving Binary' should work identically to
> 'deriving Show', except using a binary rep instead of a string rep.
something like that could be useful, but with DrIFT and the ATerm library
one can already do some of that.  and the ATerm library is a reasonably
portable way to represent terms.  this is definitely not what i had in
mind in my original post.

> > it in Haskell, as presumably sharing is not observable from within
> > the language.  this is why the deriving bit seems essential - the
> > compiler can perform some magic.

> I assume you mean something like:
>
>   let x = ...some really large structure...
>       y = [x,x]
>   in  puts bh y
>
> then the size of what is written is |x|+c, not 2|x|, for some small c?  If
> so, then I don't believe this can be implemented in the language; it
> would have to be in the compiler.
this is what i meant by compiler magic.

> I see this as unlikely to happen
> because it would mean that all compilers would have to implement this
> identically and some might not handle sharing in the same manner.
different implementations do not need to implement sharing in the same
way.  they need to understand a common format.  i am not saying designing
such a format is easy, in fact things like:

   nats = 0 : map (+1) nats

seem tricky as they involve functions.  but persistence is useful.

in fact as a beginning i was hoping for something that works in say GHC,
and won't be too hard to implement.  actually i thought it might already
exist, but i guess not.

bye
iavor




___
Haskell mailing list
[EMAIL PROTECTED]
http://www.haskell.org/mailman/listinfo/haskell


Re: Persistant (as in on disk) data

2003-03-07 Thread Iavor S. Diatchki
hello,

thanks for your replies.  i browsed through the discussion on the
libraries list, but it mainly seems to discuss if one should use bits or
bytes in the binary representation.  not that this is not important (my
personal preference is to be fast rather than small, within reason), but
i was more interested in what these functions should do.  unfortunately
i couldn't quite figure that out from the discussion there.

in particular, i was thinking that this dumping facility should preserve
sharing and support cyclic data.  as such, i don't think one can write
it in Haskell, as presumably sharing is not observable from within the
language.  this is why the deriving bit seems essential - the compiler
can perform some magic.

bye
iavor

Simon Peyton-Jones wrote:
|   (c) how do we derive instances of Binary?

If you guys can agree an interface that GHC, nhc and Hugs can all
support, I'll gladly do the 'deriving' stuff to make 'deriving Binary'
work for GHC.  What's always inhibited me is that there isn't a single
agreed interface.

For most users, having a library that works across all Haskell
implementations and platforms is much more important than having the
most efficient possible library. But (the possibility of an) efficient
implementation has to be a goal, just not the only goal.  If GHC can't
use it directly for interface files, so be it.

Simon



--
==
| Iavor S. Diatchki, Ph.D. student   |
| Department of Computer Science and Engineering |
| School of OGI at OHSU  |
| http://www.cse.ogi.edu/~diatchki   |
==


Re: Persistant (as in on disk) data

2003-03-07 Thread Hal Daume III
I'd not been following this discussion, but now it seems it's gotten to
instances of the Binary module.  I figured I'd chime in briefly:

> thanks for your replies.  i browsed through the discussion on the
> libraries list, but it mainly seems to discuss if one should use bits or
> bytes in the binary representation.  not that this is not important (my

The bits/bytes argument was largely because the NHC library supported Bits
and the GHC library supported Bytes.  In order to have a common library,
we wanted to support both.

> personal preference is to be fast rather than small, within reason), but
> i was more interested in what these functions should do.  unfortunately
> i couldn't quite figure that out from the discussion there.

Basically, write arbitrary data to a file in a binary fashion, or to a
memory location (as in BinMem).

> in particular, i was thinking that this dumping facility should preserve
> sharing and support cyclic data.  as such, i don't think one can write

I'm not convinced that the binary library should natively support cyclic
data.  I think that if saying:

  print x

would not terminate, then there's no reason that

  puts bh x

should terminate.  I like to think of puts as a binary version of
print.  (That is, of course, unless the instance writer for the
Binary/Show instances of the type of x is smart enough to not keep writing
the same thing over and over again.)  I would challenge the interested
party to write a Show instance of String which wouldn't loop indefinitely
on repeat 'x'.

If the user has some cyclic data structure and they want to be able to
write it in binary form, it should be on their shoulders to do it
correctly, not on the library's.

So essentially, I believe 'deriving Binary' should work identically to
'deriving Show', except using a binary rep instead of a string rep.
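
A minimal sketch of that analogy follows.  It is not the interface of the GHC or NHC Binary library: the class Bin, its method putB, and the Shape type are hypothetical names made up for the illustration, and output is modelled as a plain list of bytes rather than a handle.

```haskell
import Data.Word (Word8)

-- Hypothetical class, named Bin to avoid suggesting it is the GHC or NHC
-- Binary API: a value is flattened to a list of bytes.
class Bin a where
  putB :: a -> [Word8]

instance Bin Bool where
  putB False = [0]
  putB True  = [1]

-- Lists are written cell by cell: tag 1 for a cons cell, tag 0 for the end.
instance Bin a => Bin [a] where
  putB []       = [0]
  putB (x : xs) = 1 : putB x ++ putB xs

-- What a derived instance might look like: one tag byte per constructor,
-- followed by the fields in order, much as derived Show prints the
-- constructor name followed by its arguments.
data Shape = Dot Bool | Pair Bool Bool
  deriving Show

instance Bin Shape where
  putB (Dot b)    = 0 : putB b
  putB (Pair a b) = 1 : putB a ++ putB b
```

Under this scheme putB (repeat True) is an endless byte stream, in the same way that show (repeat 'x') is an endless string; laziness still lets a consumer take a finite prefix of either.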

> it in Haskell, as presumably sharing is not observable from within the
> language.  this is why the deriving bit seems essential - the compiler
> can perform some magic.

I assume you mean something like:

  let x = ...some really large structure...
      y = [x,x]
  in  puts bh y

then the size of what is written is |x|+c, not 2|x|, for some small c?  If
so, then I don't believe this can be implemented in the language; it would
have to be in the compiler.  I see this as unlikely to happen because
it would mean that all compilers would have to implement this identically
and some might not handle sharing in the same manner.  It might be nice, but
again, I see this as something you could do yourself if you really want it
(i.e., replace this function with:

  let x = ...
  in  puts bh 2 >> puts bh x

or something like that, when you can -- and obviously you won't always be
able to.)
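
The do-it-yourself approach described above (write a count, then the value once) can be sketched as follows.  The output is modelled as a plain list of Ints rather than a binary handle, and putList, putNaive, putCopies and getCopies are hypothetical helpers for the illustration, not functions of any Binary library.

```haskell
-- Encode one list of Ints: its length, then its elements.
putList :: [Int] -> [Int]
putList xs = length xs : xs

-- Naive encoding of a list of lists: every element is written in full,
-- so [x, x] costs roughly twice the size of x.
putNaive :: [[Int]] -> [Int]
putNaive xss = length xss : concatMap putList xss

-- Hand-written encoding for the case where the caller already knows the
-- outer list is n copies of one value: write n, then the value once.
putCopies :: Int -> [Int] -> [Int]
putCopies n x = n : putList x

-- Matching decoder for putCopies; malformed input decodes to [].
getCopies :: [Int] -> [[Int]]
getCopies (n : len : rest) = replicate n (take len rest)
getCopies _                = []
```

For x = [1..100], putNaive [x, x] produces 203 numbers while putCopies 2 x produces 102, which is the |x|+c versus 2|x| difference in miniature.  The saving relies on the programmer knowing about the repetition in advance; nothing here detects sharing automatically.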

 - Hal



RE: Persistant (as in on disk) data

2003-03-06 Thread Simon Marlow

> a recent post reminded me of a feature i'd like.
> for all i know it is already implemented in GHC so pointers
> are welcome.
>
> i'd like to be able to dump data structures to disk, and later load
> them.

A Binary library was discussed recently on the libraries list.  The
thread starts here:

http://www.haskell.org/pipermail/libraries/2002-November/000691.html

It's currently stalled.  There are several implementations of Binary:
one that comes with NHC and is described in a paper (sorry, don't have a
link to hand), a port of this library to GHC by Sven Panne (suffers from
bitrot), a derived/simplified version used in GHC which is heavily
hacked for speed, and a further derived version of this library by Hal
Daume who is adapting it to support bit-by-bit serialisation.

I think the outstanding issues are

  (a) is the API for GHC's Binary library acceptable, or do we need
  the extra bells and whistles that the NHC version has?

  (b) can we make a version of Binary that uses a bit-by-bit
  rather than byte-by-byte serialisation of the data that is
  as fast (or nearly as fast) as the current byte-by-byte
  implementation?  Perhaps performance isn't that important
  to the majority of people: please comment if you have
  an opinion.

  (c) how do we derive instances of Binary?

IMHO: something is better than nothing, so I'd be in favour of just
plugging in the Binary library from GHC, and marking it experimental.

Cheers,
Simon