Re: [R] save(), load(), saveRDS(), and readRDS()

2023-10-06 Thread Micha Silver



Jim always went beyond just posting answers. He helped me along in 
learning R, always showing solutions together with insightful 
explanations. His patience and good humor were remarkable.


Condolences to his family.


On 05/10/2023 1:36, Jim Lemon wrote:

Hello,
I am very sad to let you know that my husband Jim died on 18th September. I
apologise for not letting you know earlier but I had trouble finding the
password for his phone.
Kind regards,
Juel

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.




--
Micha Silver
Ben Gurion Univ.
Sde Boker, Remote Sensing Lab
cell: +972-523-665918



Re: [R] save(), load(), saveRDS(), and readRDS()

2023-10-04 Thread Jim Lemon
Hello,
I am very sad to let you know that my husband Jim died on 18th September. I
apologise for not letting you know earlier but I had trouble finding the
password for his phone.
Kind regards,
Juel




Re: [R] save(), load(), saveRDS(), and readRDS()

2023-10-02 Thread Greg Snow
One more function to consider using and teaching is the attach
function.  If you use `attach` with the name of a file that was
created using `save` then it creates a new, empty environment, `load`s
the contents of the file into the environment, and attaches the
environment to the search path (by default in position 2).  This means
that the objects are all available to use, but will not overwrite any
objects of the same name in your workspace.  The command `ls(2)`
quickly shows the names of the objects that were read in.  You can use
simple assignment to copy and optionally rename any of the objects
into your workspace, or just leave them in the attached workspace
(just recognize what will happen if you have multiple objects with the
same name).  Once you have copied or used the objects of interest, you
can simply `detach` the environment.

If you are going to teach the use of `attach` I would suggest
emphasizing the 2nd paragraph under the heading "Good practice" on the
help page for attach.
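
For anyone who wants to try this, here is a minimal sketch of the workflow described above. The file, the object names and the search-path name "saved_objects" are made up for illustration; the rest is base R as I understand it.

```r
# Hypothetical objects and file, created only so the example is self-contained.
rda <- tempfile(fileext = ".RData")
x <- 1:3
y <- "from file"
save(x, y, file = rda)
x <- "workspace value"          # the workspace copy now differs from the file

# attach() loads the file into a new environment at position 2 of the
# search path; `name` is optional and chosen here for readability.
attach(rda, name = "saved_objects")

attached_names <- ls(2)         # names of the objects read from the file
x_now <- x                      # still the workspace value; the attached x is masked
x_file <- get("x", pos = 2)     # copy (and rename) the attached object

# Detach once the objects of interest have been copied or used.
detach("saved_objects", character.only = TRUE)
```

R prints a note that `x` is masked by the global environment, which is a useful reminder of which copy a bare `x` refers to.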




-- 
Gregory (Greg) L. Snow Ph.D.
538...@gmail.com



Re: [R] save(), load(), saveRDS(), and readRDS()

2023-09-29 Thread Jorgen Harmse via R-help
Ivan Krylov points out that load(file, e <- new.env()) is cumbersome. I put it 
into a function.

Regards,
Jorgen Harmse.


#' Save & load lists & environments
#'
#' \code{\link{save}} has to be told what to save from an environment, and the obvious way
#' to save a structure creates an extra layer. \code{\link{load}} with default settings
#' clobbers the current environment. \code{save.env} saves a list or environment without an
#' extra layer, and by default saves everything. \code{load.env} loads into an environment,
#' and \code{load.list} loads into a list.
#'
#' @param S something that can be coerced to an environment, e.g. a named \code{list}
#' @param file,envir inputs to \code{save} or \code{load}
#' @param list input to \code{save}
#' @param skip variables in \code{envir} that should not be saved, ignored if \code{list}
#' is provided
#' @param ... inputs to \code{load.env} or additional inputs to \code{save}
#'
#' @return \code{invisible} from \code{save.env}; an \code{environment} from \code{load.env};
#' a \code{list} from \code{load.list}
#'
#' @export

save.env <- function( S, file, list = setdiff(ls(envir),skip),
                      envir = if(missing(S)) parent.frame() else as.environment(S),
                      skip=NULL, ...
)
{ save(list=list, file=file, envir=envir, ...)}


#' @rdname save.env
#'
#' @param keep,remove names of variables to keep or to remove
#' @param absent what to do if variables named in \code{keep} are absent
#' @param parent input to \code{\link{new.env}}
#'
#' @note \code{remove} is forced after the file is loaded, so the default works correctly.
#'
#' @export

load.env <- function( file, keep, remove = if(!missing(keep)) setdiff(ls(envir),keep),
                      absent=c('warn','ignore','stop'),
                      envir=new.env(parent=parent),
                      parent=parent.frame()
)
{ load(file,envir)
  # 'remove' is evaluated lazily here, i.e. after the objects have been loaded
  rm(list=remove,envir=envir)
  if ( !missing(keep) && (match.arg(absent) -> absent) != 'ignore'
       && length(keep.absent <- setdiff(keep,ls(envir))) > 0L )
  { print(keep.absent)
    # compare against 'warn', the value offered in the 'absent' choices
    if (absent=='warn')
      warning('The variables listed above are absent from the file.')
    else
      stop('The variables listed above are absent from the file.')
  }
  return(envir)
}


#' @rdname save.env
#'
#' @param all.names input to \code{\link{as.list}}
#'
#' @export

load.list <- function(..., all.names=TRUE) as.list(all.names=all.names, load.env(...))





Re: [R] save(), load(), saveRDS(), and readRDS()

2023-09-29 Thread Ivan Krylov
On Thu, 28 Sep 2023 23:46:45 +0800
Shu Fai Cheung  wrote:

> In my personal work, I prefer using saveRDS() and readRDS() as I
> don't like the risk of overwriting anything in the global
> environment.

There's the load(file, e <- new.env()) idiom, but that's potentially
a lot to type.
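
Spelled out, the idiom looks roughly like this. It is only a sketch; the file and object names are invented for the example.

```r
# Hypothetical file and object, created only so the example runs on its own.
rda <- tempfile(fileext = ".RData")
x <- 1:3
save(x, file = rda)
x <- "workspace value"       # pretend we have since reused the name

# Load into a fresh environment instead of the calling environment.
e <- new.env()
load(rda, envir = e)

loaded_names <- ls(e)        # what the file contained
x_from_file <- e$x           # pick an object out under a name of your choosing
objs <- as.list(e)           # or take everything as a named list
```

The workspace `x` is left alone, and the loaded objects are only reachable through `e`.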

Confusingly, ?save also says:

>> For saving single R objects, ‘saveRDS()’ is mostly preferable to
>> ‘save()’, notably because of the _functional_ nature of ‘readRDS()’,
>> as opposed to ‘load()’.

> The files produced by save have a header
> identifying the file type and so are better protected against
> erroneous use."

This header is also mentioned elsewhere in ?saveRDS:

>> ‘save’ writes a single line header (typically ‘"RDXs\n"’)

The difference between the save() header and the serialize() header is
that the save() header is designed to be read independently from the
machine running the code: it's exactly 5 bytes; some precisely defined
combinations of those 5 bytes identify how the rest of the file should
be interpreted (nowadays, it's likely either "XDR format version 2" or
"XDR format version 3"), and the rest of them cause an error.

The serialize() header does contain enough information describing it
(there's the first byte choosing between ASCII/XDR/native binary and a
number of encoded integers describing the format version and the
version of R you need to parse it), but it's stored in terms of
serialized objects, so if you cannot for some reason decode them
properly, you won't be able to read the header. A little bit of
Catch-22.
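
To look at the two headers directly, something like the following should work. It is a sketch that assumes the default gzip compression of save() and saveRDS(); `first_bytes` is a hypothetical helper, not part of R.

```r
# Hypothetical files and object, for illustration only.
f_rda <- tempfile(fileext = ".RData")
f_rds <- tempfile(fileext = ".rds")
obj <- list(a = 1:3)
save(obj, file = f_rda)
saveRDS(obj, file = f_rds)

# Read the first n bytes of the (decompressed) file contents.
# gzfile() decompresses gzip input and also reads uncompressed files.
first_bytes <- function(path, n = 5L) {
  con <- gzfile(path, "rb")
  on.exit(close(con))
  readBin(con, what = "raw", n = n)
}

hdr_rda <- first_bytes(f_rda)   # the 5-byte save() magic string
hdr_rds <- first_bytes(f_rds)   # starts straight away with the serialize() header

rawToChar(hdr_rda)              # expected to be "RDX2\n" or "RDX3\n", depending on the R version
rawToChar(hdr_rds[1:2])         # format marker of the serialization, "X\n" for XDR
```

The .rds file has no magic string of its own: its first bytes already belong to the serialization stream.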

> When will the problem mentioned in the warning occur? That is, when
> will a file saved by saveRDS() not be read correctly?

One example I can offer is when a dataset is saved using serialize(xdr
= FALSE) (which is not reachable using saveRDS()). The resulting file
format would be dependent on the native byte order of the CPU in your
computer. (Nowadays it's really hard to encounter a CPU that doesn't
use little-endian byte order, so this is doubly unlikely to happen in
practice.) Both save() and saveRDS() set xdr = TRUE and convert the
data to "network byte order" (big-endian) when saving, and back again
when loading.
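
A small sketch of the difference, using serialize() to a raw vector rather than to a file. The vector `x` is arbitrary; the first byte of the stream is the format marker.

```r
x <- c(1.5, 2.5)                # arbitrary example data

# XDR (big-endian) serialization, the format save() and saveRDS() use.
xdr_bytes <- serialize(x, connection = NULL, xdr = TRUE)

# Native binary serialization: byte order follows the CPU running R.
native_bytes <- serialize(x, connection = NULL, xdr = FALSE)

rawToChar(xdr_bytes[1:2])       # "X\n" marks XDR
rawToChar(native_bytes[1:2])    # "B\n" marks native binary
.Platform$endian                # byte order of this machine

# On the machine that wrote them, both streams read back the same.
identical(unserialize(xdr_bytes), unserialize(native_bytes))
```

Only the native stream depends on the byte order of the writing machine, so a problem could only show up when such a stream is moved to a machine with the opposite byte order.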

The warning is relatively fresh (May 2021). Perhaps Prof. Brian D.
Ripley (who made that change) will be able to explain it better.

-- 
Best regards,
Ivan



[R] save(), load(), saveRDS(), and readRDS()

2023-09-28 Thread Shu Fai Cheung
Hi All,

There is a thread about the use of save(), load(), saveRDS(), and
readRDS(). It led me to think about a question regarding them.

In my personal work, I prefer using saveRDS() and readRDS() as I don't like
the risk of overwriting anything in the global environment. I also like the
freedom to name an object when reading it from a file.
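
To make the concern concrete, here is a minimal sketch with made-up object and file names:

```r
# Hypothetical object and files, for illustration only.
rda <- tempfile(fileext = ".RData")
rds <- tempfile(fileext = ".rds")

dat <- data.frame(id = 1:3)
save(dat, file = rda)
saveRDS(dat, file = rds)

dat <- "something else I care about"   # the name is reused later on

# load() restores objects under their saved names, so `dat` is replaced
# without any warning. It returns the restored names invisibly.
loaded_names <- load(rda)
is.data.frame(dat)

# readRDS() just returns the object; the caller picks the name.
my_copy <- readRDS(rds)
```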

However, for teaching, I have to teach save() and load() because, in my
discipline, it is common for researchers to share their datasets on the
internet using the format saved by save(), and so students need to know how
to use load() and what will happen when using it. Actually, I can't recall
encountering datasets shared by the .rds format. I have been wondering why
save() was usually used in that case.

That discussion led me to read the help pages again and I noticed the
following warning, from the help page of saveRDS():

"Files produced by saveRDS (or serialize to a file connection) are not
suitable as an interchange format between machines, for example to download
from a website. The files produced by save have a header identifying
the file type and so are better protected against erroneous use."

When will the problem mentioned in the warning occur? That is, when will a
file saved by saveRDS() not be read correctly? Saved in Linux and then read
in Windows? Is it possible to create a reproducible error?

Regards,
Shu Fai
