Re: [Rd] Save and serialize

2011-02-07 Thread Prof Brian Ripley

On Mon, 7 Feb 2011, Hadley Wickham wrote:


Hi all,

Is there any relationship between save and serialize?  Do they use the
same algorithm?


See the R-internals manual: there is more info in the R-devel version, 
not least because saveRDS() is added to the mix.


But basically serialize() and saveRDS() use the same format, and 
save() writes a header and then serializes a pairlist of the objects 
given.


'The same algorithm' is somewhat misleading here: strictly no, as they 
manage to use four entry points to the code base.
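
A minimal sketch of the relationship described above, using base R only. The object names and temporary files are made up for illustration, and the 5-byte magic string in front of the save() payload is an assumption about uncompressed, non-ASCII save files rather than something stated in this thread:

```r
x <- list(a = 1:3, b = "text")
y <- 42

# saveRDS() without compression: the file content can be handed to unserialize()
rds <- tempfile(fileext = ".rds")
saveRDS(x, rds, compress = FALSE)
rds_bytes <- readBin(rds, what = "raw", n = file.size(rds))
from_rds  <- unserialize(rds_bytes)

# save() without compression: a short magic header, then a serialized pairlist
rda <- tempfile(fileext = ".RData")
save(x, y, file = rda, compress = FALSE)
rda_bytes <- readBin(rda, what = "raw", n = file.size(rda))
magic <- rawToChar(rda_bytes[1:5])        # assumed to be 5 bytes, e.g. "RDX2\n" or "RDX3\n"
objs  <- unserialize(rda_bytes[-(1:5)])   # pairlist tagged with the object names

identical(from_rds, x)
is.pairlist(objs)
names(objs)
```

Compressed files (the default for both functions) would need to be read through gzfile() or a similar connection before the bytes can be compared this way.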




Hadley

--
Assistant Professor / Dobelman Family Junior Chair
Department of Statistics / Rice University
http://had.co.nz/

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel



--
Brian D. Ripley,                  rip...@stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595



Re: [Rd] Save and serialize

2011-02-07 Thread Henrik Bengtsson
Also, if it adds any value to what you are looking for, the output
of serialize() also has header information, cf. R-devel thread 'Small
inconsistency in serialize() between R versions and implications on
digest()' started March 7, 2007:

  http://www.mail-archive.com/r-devel@r-project.org/msg07931.html

It caused us some headaches when trying to generate identical output
of the same input using different versions of R.  It was solved in
that thread.  See code for digest::digest() on how to skip/ignore that
header.
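
To make the header concrete, here is a rough sketch in base R. It assumes the binary (XDR) format at serialization version 2, where the header is 14 bytes long; `read_int` is a hypothetical helper, and this is not the actual digest code:

```r
x <- list(a = 1:3, b = "text")

# Pin the format version so the header layout is the 14-byte one assumed here
bytes <- serialize(x, connection = NULL, version = 2)

header  <- bytes[1:14]
payload <- bytes[-(1:14)]    # what is left once the header is skipped

# Hypothetical helper: decode a 4-byte big-endian integer from a raw vector
read_int <- function(b) readBin(b, what = "integer", size = 4, endian = "big")

format_mark    <- rawToChar(header[1:2])   # "X\n" marks the XDR binary format
format_version <- read_int(header[3:6])    # serialization format version
writer_version <- read_int(header[7:10])   # R version that wrote the bytes
min_reader     <- read_int(header[11:14])  # minimal R version able to read them

# The writer version is packed as major * 65536 + minor * 256 + patch
this_r <- sum(unlist(getRversion()) * c(65536, 256, 1))
writer_version == this_r
```

Because the writer version sits in the header, two R versions can produce different bytes for the same object; hashing only `payload` avoids that difference.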

/Henrik




Re: [Rd] Save and serialize

2011-02-07 Thread Hadley Wickham
Thanks to you both for the information - that's exactly the level of
detail I was looking for.  I ask because I want to play around with a
function to automatically cache expensive operations to disk, in a way
that can be lazy loaded on the next run.
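
One possible shape for such a function, as a sketch only. `cache_lazy` and `slow_square` are hypothetical names, and the combination of saveRDS() with delayedAssign() is an assumption for illustration, not the design that was eventually used:

```r
# Hypothetical helper: run `expr` once and store the result with saveRDS();
# afterwards bind `name` to a promise that reads the file on first access.
cache_lazy <- function(name, expr, dir = tempdir(), envir = parent.frame()) {
  path <- file.path(dir, paste0(name, ".rds"))
  if (!file.exists(path)) {
    saveRDS(expr, path)   # forces `expr`, i.e. runs the expensive operation
  }
  # `expr` stays unevaluated when the cache file already exists
  delayedAssign(name, readRDS(path), assign.env = envir)
  invisible(path)
}

# Usage: count how often the "expensive" function actually runs
calls <- 0
slow_square <- function(n) { calls <<- calls + 1; n^2 }

cache_dir <- tempfile("cache-demo")
dir.create(cache_dir)

cache_lazy("res", slow_square(12), dir = cache_dir)  # computes and writes the file
cache_lazy("res", slow_square(12), dir = cache_dir)  # file exists: nothing is recomputed
res     # read from disk on first access
calls
```

The cache key here is just the variable name; a real implementation would more likely key on a hash of the expression and its inputs.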

Hadley


-- 
Assistant Professor / Dobelman Family Junior Chair
Department of Statistics / Rice University
http://had.co.nz/



Re: [Rd] Save and serialize

2011-02-07 Thread Henrik Bengtsson
On Mon, Feb 7, 2011 at 3:15 PM, Hadley Wickham had...@rice.edu wrote:
 Thanks to you both for the information - that's exactly the level of
 detail I was looking for.  I ask because I want to play around with a
 function to automatically cache expensive operations to disk, in a way
 that can be lazy loaded on the next run.

So starting with digest v0.3.0 (April 2007), the digest() method can
be considered consistent across R versions (in addition to across R
sessions).


FYI, recently in R-devel serialize(), which digest() relies on, gained
a 'version' argument reserved for future usage.  From NEWS:

- serialize() and unserialize() are no longer described as
‘experimental’. The interface is now regarded as stable, although the
serialization format may well change in future releases. (serialize()
has a new argument version which would allow the current format to be
written if that happens.)

I've tested, and the introduction of this argument was done such that
the serialized object is identical to what it was before (R <= 2.12.x).  Thus,
digest() will generate the same output also in R v2.13.0.  At some
point, we will add an option to digest() for specifying what 'version'
value should be passed to serialize(), but it doesn't sound like it is
too urgent to add that.  Any updates to digest() will also be backward
compatible, so as long as you use digest() you shouldn't have to worry
about consistency.
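
For readers on later R releases, a small sketch of what the 'version' argument does. It assumes R >= 3.5.0, where a format version 3 exists next to version 2 (it became the default in R 3.6.0); `format_version` is a hypothetical helper:

```r
x <- c(a = 1, b = 2)

# Ask for a specific serialization format explicitly
v2 <- serialize(x, connection = NULL, version = 2)
v3 <- serialize(x, connection = NULL, version = 3)   # assumes R >= 3.5.0

# Hypothetical helper: in the binary format the version number is stored
# as a big-endian integer in bytes 3 to 6
format_version <- function(bytes) {
  readBin(bytes[3:6], what = "integer", size = 4, endian = "big")
}

format_version(v2)
format_version(v3)

# Both formats restore the same object
identical(unserialize(v2), unserialize(v3))
```

Pinning `version` is what keeps the serialized bytes, and anything hashed from them, stable when the default format changes.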

/Henrik

