Re: [R] Accessing objects manipulated in a function

2016-05-14 Thread Bert Gunter
"... wonton (as in "users will change my variables this way") use of
that operator will inevitably lead to surprises and puzzlement later.
"

Is this related to the myriad of choices in a Chinese menu? ;-)

(The spelling is "wanton" -- ah the joys of English!)


Cheers,
Bert


Bert Gunter

"The trouble with having an open mind is that people keep coming along
and sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )


On Sat, May 14, 2016 at 9:45 AM, Jeff Newmiller wrote:
> I think you have boxed yourself into a corner, much like someone painting
> the floor and failing to work their way toward an exit.
>
> The whole premise that you will call "unknown code" that does "unknown
> things" that change the global environment is a maintenance disaster. Give
> up on at least one of these requirements.
>
> Think of the "lm" function. It returns a list of related results, augmented
> with a "class" attribute so that methods like "print.lm" and "summary.lm"
> will work. The actions performed need not be reviewed in detail if the user
> knows what to do with the returned object. The "class" attribute is a bit of
> syntactic sugar, but the concept of putting your results into a list that
> the caller doesn't have to mix items in with their own is critical.
>
> Using <<- has to be handled in a very controlled manner... wonton (as in
> "users will change my variables this way") use of that operator will
> inevitably lead to surprises and puzzlement later.
>
> I also happen to think that standardizing on calling "source" inside
> functions is a big mistake... the functions defined by the user should be
> set up and handed off to your "master architecture" code as parameters or
> elements within lists that are parameters.
>
> On Sat, 14 May 2016, Fisher Dennis wrote:
>
>> R 3.2.4
>> OS X and Windows
>>
>> Colleagues,
>>
>> I distribute some code to co-workers and I am trying to simplify their
>> task.  The issue is as follows:
>>
>> 1.  The code automates an extensive set of processes.  Many of the steps
>> are standardized.  However, some of the steps may require that users write
>> snippets of code, stored in R scripts.
>>
>> 2.  If users write their own code, it might appear in files named
>> UserCode1.R, UserCode2.R, etc.  The master code checks for the existence of
>> this code, then executes
>> source("/path/to/UserCode1.R")
>> This can occur at many different points in the master code (each time
>> sourcing a different file). In addition to the command above, there are a
>> variety of other commands testing whether the file exists and whether it
>> contains certain commands that I don't allow the user to execute.
>> In order to simplify the code, these commands are embedded in a function
>> (which I will call MODIFYCODE for the moment).
>>
>> 3.  Assume that an object within the master code is named TEMP.  The user
>> might add a column to TEMP.  Since this occurs within a function, there are
>> two ways to get this modification back to the original environment:
>> a.  within the function: TEMP <<- TEMP
>> b.  use the return value from the function:
>> TEMP <- MODIFYCODE()
>>
>> 4.  There are disadvantages to each of these:
>> a.  The user needs to know that the "<<-" command must be invoked.
>> If they don't do so, the changes within the function are not available in
>> the master environment
>> b.  I don't know what code will be written by the user, i.e., they
>> might manipulate TEMP or they might create a new object or something else.
>> So, I don't know a priori what to return.
>>
>> So, my question is: is there some way to manipulate environments such that
>> the changes within the function are AUTOMATICALLY transferred to the
>> environment outside the function?
>>
>> Dennis
>>
>> Dennis Fisher MD
>> P < (The "P Less Than" Company)
>> Phone / Fax: 1-866-PLessThan (1-866-753-7784)
>> www.PLessThan.com 
>>
>> __
>> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide
>> http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.

Re: [R] Accessing objects manipulated in a function

2016-05-14 Thread Dennis Fisher
Jeff

Thanks for your insights.  I suspected that this was the case but I was hoping 
for a work-around.  Regardless, I have modified the code so that the source() 
command is no longer in a function — all the pre- and post-commands are now in 
two functions, executed before and after the source command.  Problem solved.
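
As a minimal sketch of that arrangement (not the actual code): the helper names pre_check() and post_check(), the list of forbidden calls, and the temporary file standing in for UserCode1.R are all assumptions made for illustration.

```r
# Hypothetical helper, run before source(): TRUE only if the file exists
# and contains none of the forbidden calls.
pre_check <- function(path, forbidden = c("system", "unlink", "setwd")) {
  if (!file.exists(path)) return(FALSE)
  code <- readLines(path, warn = FALSE)
  pattern <- paste0("\\b(", paste(forbidden, collapse = "|"), ")\\s*\\(")
  !any(grepl(pattern, code, perl = TRUE))
}

# Hypothetical helper, run after source(): checks that TEMP is still usable.
post_check <- function(x) {
  stopifnot(is.data.frame(x))
  invisible(TRUE)
}

# Master code, at top level rather than inside a function
TEMP <- data.frame(id = 1:3)
user_file <- tempfile(fileext = ".R")        # stands in for UserCode1.R
writeLines("TEMP$extra <- TEMP$id * 2", user_file)

if (pre_check(user_file)) {
  # local = TRUE evaluates the script in the calling environment; at top
  # level that is the same workspace a plain source() call would use
  source(user_file, local = TRUE)
  post_check(TEMP)
}
print(TEMP)
```

Because source() is no longer wrapped in a function, whatever the user script assigns is visible to the master code without "<<-" or a return value.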

Dennis

Dennis Fisher MD
P < (The "P Less Than" Company)
Phone / Fax: 1-866-PLessThan (1-866-753-7784)
www.PLessThan.com






Re: [R] Plot trajectories using ggplot?

2016-05-14 Thread Mike Smith
Thanks very much for the pointer - that was spot on. The key thing was to add

df$case <- rownames(df)

to generate a case identifier for each row before melting the columns. As I've
since found out, either of these will work:

ggplot(ds, aes(x = as.numeric(variable), y = value, colour = case)) +
  geom_point () + geom_line()

ggplot(ds, aes(x = variable, y = value, group = case)) +
  geom_point () + geom_line()

Much appreciated!

Saturday, May 14, 2016, 10:13:15 AM, you wrote:

US> You can introduce the row number as a  case number, you can group
US> by case and plot the connecting lines


US> #Read raw data
US> df =
US> read.table("http://www.lecturematerials.co.uk/data/sample.csv",
US> header=TRUE, sep=",", dec=".", na.strings=c("NA"))
US> names(df)<-c("1","2","3","4")
US> df$case <- rownames(df)


US> #Turn data from wide to long
US> ds<-melt(df, id.vars = "case")


US> ggplot(ds, aes(x = variable, y = value, group = case)) +
US>   geom_point () + geom_line()


US> Hope this helps,
US> Ulrik
US> On Sat, 14 May 2016 at 10:20 Mike Smith  wrote:

US> Hi
US>  
US>  I've got stuck using the code below to try to plot trajectories -
US> columns are data recorded at time points, rows are cases. I've used
US> melt to turn the data long, allowing me to group by time point and
US> then plot using geom_point, but I now need to join the points based
US> upon the correct case (i.e. the first row in the original
US> dataset). geom_segment allows me to specify start-end but I need to
US> do this over multiple time periods
US>  
US>  Any help much appreciated
US>  
US>  thanks
US>  
US>  mike
US>  
US>  
US>  library(reshape2)
US>  library(ggplot2)
US>  library(ggthemes)
US>  library(cowplot)
US>  
US>  #Read raw data
US>  df =
US> read.table("http://www.lecturematerials.co.uk/data/sample.csv",
US> header=TRUE, sep=",", dec=".", na.strings=c("NA"))
US>  names(df)<-c("1","2","3","4")
US>  
US>  #Turn data from wide to long
US>  ds<-melt(df)
US>  
US>  ggplot(ds, aes(x = variable, y = value)) +
US> geom_point (shape=19, size=5, fill="black")
US>  
US>  
US>  
US>  ---
US>  Mike Smith


---
Mike Smith



Re: [R] Accessing objects manipulated in a function

2016-05-14 Thread Jeff Newmiller
I think you have boxed yourself into a corner, much like someone painting 
the floor and failing to work their way toward an exit.


The whole premise that you will call "unknown code" that does "unknown 
things" that change the global environment is a maintenance disaster. Give 
up on at least one of these requirements.


Think of the "lm" function. It returns a list of related results, 
augmented with a "class" attribute so that methods like "print.lm" and 
"summary.lm" will work. The actions performed need not be reviewed in 
detail if the user knows what to do with the returned object. The "class" 
attribute is a bit of syntactic sugar, but the concept of putting your 
results into a list that the caller doesn't have to mix items in with 
their own is critical.
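
A minimal sketch of that pattern, with hypothetical names throughout (run_step(), the "step_result" class, and the ratio column are assumptions made for illustration, not part of anyone's actual code):

```r
# Hypothetical analysis step: everything it produces goes into one classed list
run_step <- function(data) {
  data$ratio <- data$y / data$x
  result <- list(data = data,
                 n = nrow(data),
                 mean_ratio = mean(data$ratio))
  class(result) <- "step_result"
  result
}

# S3 print method, dispatched through the "class" attribute
print.step_result <- function(x, ...) {
  cat("step_result:", x$n, "rows, mean ratio", format(x$mean_ratio), "\n")
  invisible(x)
}

res <- run_step(data.frame(x = c(1, 2, 4), y = c(2, 4, 8)))
print(res)
TEMP <- res$data   # the caller decides which pieces to keep
```

Nothing in the caller's workspace changes unless the caller assigns it, which is the point of returning a list.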


Using <<- has to be handled in a very controlled manner... wonton (as in 
"users will change my variables this way") use of that operator will 
inevitably lead to surprises and puzzlement later.


I also happen to think that standardizing on calling "source" inside 
functions is a big mistake... the functions defined by the user should be 
set up and handed off to your "master architecture" code as parameters or 
elements within lists that are parameters.
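
One way this could look, as a sketch only: run_master() and add_double() are hypothetical names, and the rule that every step takes and returns a data frame is an assumption chosen to keep the example short.

```r
# Hypothetical master routine: user steps arrive as a named list of functions
run_master <- function(data, user_steps = list()) {
  for (step_name in names(user_steps)) {
    step <- user_steps[[step_name]]
    stopifnot(is.function(step))
    data <- step(data)                      # each step returns the new data
    if (!is.data.frame(data)) {
      stop("step '", step_name, "' must return a data frame")
    }
  }
  data
}

# A user-written step: takes the data, returns the modified data
add_double <- function(d) {
  d$double <- d$value * 2
  d
}

TEMP <- run_master(data.frame(value = 1:3), list(add_double = add_double))
print(TEMP)
```

The user code never touches the master environment; its only channel back is the return value.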



---
Jeff Newmiller                        The     .....       .....  Go Live...
DCN:                                  Basics: ##.#.       ##.#.  Live Go...
                                      Live:   OO#.. Dead: OO#..  Playing
Research Engineer (Solar/Batteries            O.O#.       #.O#.  with
/Software/Embedded Controllers)               .OO#.       .OO#.  rocks...1k



[R] Accessing objects manipulated in a function

2016-05-14 Thread Fisher Dennis
R 3.2.4
OS X and Windows

Colleagues,

I distribute some code to co-workers and I am trying to simplify their task.  
The issue is as follows:

1.  The code automates an extensive set of processes.  Many of the steps are 
standardized.  However, some of the steps may require that users write snippets 
of code, stored in R scripts.

2.  If users write their own code, it might appear in files named UserCode1.R, 
UserCode2.R, etc.  The master code checks for the existence of this code, then 
executes
source("/path/to/UserCode1.R")
This can occur at many different points in the master code (each time sourcing 
a different file).  
In addition to the command above, there are a variety of other commands testing 
whether the file exists and whether it contains certain commands that I don’t 
allow the user to execute.
In order to simplify the code, these commands are embedded in a function (which 
I will call MODIFYCODE for the moment).

3.  Assume that an object within the master code is named TEMP.  The user might 
add a column to TEMP.  Since this occurs within a function, there are two ways 
to get this modification back to the original environment:
a.  within the function: TEMP <<- TEMP
b.  use the return value from the function:
TEMP <- MODIFYCODE()

4.  There are disadvantages to each of these:
a.  The user needs to know that the "<<-" command must be invoked.  If 
they don’t do so, the changes within the function are not available in the 
master environment
b.  I don’t know what code will be written by the user, i.e., they 
might manipulate TEMP or they might create a new object or something else.  So, 
I don’t know a priori what to return.

So, my question is: is there some way to manipulate environments such that the 
changes within the function are AUTOMATICALLY transferred to the environment 
outside the function?
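
For reference, source() accepts an environment in its "local" argument, so one possible mechanism can be sketched as below. This is only an illustration under assumptions: the body of MODIFYCODE, the master() wrapper, and the temporary file standing in for UserCode1.R are hypothetical, and the sketch makes no claim that the approach is advisable.

```r
# Hypothetical version of MODIFYCODE: 'envir' is where the user script runs
MODIFYCODE <- function(path, envir = parent.frame()) {
  force(envir)                         # fix the caller's environment up front
  if (!file.exists(path)) return(invisible(FALSE))
  source(path, local = envir)          # evaluate the script in that environment
  invisible(TRUE)
}

# Hypothetical master code, itself running inside a function
master <- function() {
  TEMP <- data.frame(id = 1:3)
  user_file <- tempfile(fileext = ".R")   # stands in for UserCode1.R
  writeLines(c("TEMP$extra <- TEMP$id + 10",
               "NEWOBJ <- 'created by user'"), user_file)
  MODIFYCODE(user_file)
  list(TEMP = TEMP, NEWOBJ = NEWOBJ)      # both changes are visible here
}

out <- master()
print(out)
```

Both the added column and the new object appear in the environment that called MODIFYCODE, with no "<<-" in the user script.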

Dennis

Dennis Fisher MD
P < (The "P Less Than" Company)
Phone / Fax: 1-866-PLessThan (1-866-753-7784)
www.PLessThan.com 






Re: [R] Plot trajectories using ggplot?

2016-05-14 Thread Ulrik Stervbo
You can introduce the row number as a case number, then group by case
and plot the connecting lines:

#Read raw data
df = read.table("http://www.lecturematerials.co.uk/data/sample.csv",
header=TRUE, sep=",", dec=".", na.strings=c("NA"))
names(df)<-c("1","2","3","4")
df$case <- rownames(df)

#Turn data from wide to long
ds<-melt(df, id.vars = "case")

ggplot(ds, aes(x = variable, y = value, group = case)) +
  geom_point () + geom_line()
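
The same steps as a self-contained sketch that runs without the download. The small data frame is a made-up stand-in for sample.csv (3 cases at 4 time points), and the colour mapping is an optional addition.

```r
library(reshape2)
library(ggplot2)

# Hypothetical stand-in for sample.csv: rows are cases, columns are time points
df <- data.frame(t1 = c(1.0, 2.0, 3.0),
                 t2 = c(1.5, 2.5, 2.0),
                 t3 = c(2.5, 2.0, 3.5),
                 t4 = c(3.0, 4.0, 3.0))
names(df) <- c("1", "2", "3", "4")
df$case <- rownames(df)              # one identifier per original row

# Wide to long: one row per case and time point
ds <- melt(df, id.vars = "case")

# group = case tells geom_line() which points belong to the same trajectory
p <- ggplot(ds, aes(x = variable, y = value, group = case, colour = case)) +
  geom_point() +
  geom_line()
print(p)
```

Without the group aesthetic, a discrete x axis puts each time point in its own group, so no lines are drawn between time points.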

Hope this helps,
Ulrik




Re: [R] apply formula over columns by subset of rows in a dataframe (to get a new dataframe)

2016-05-14 Thread Massimo Bressan
thank you, what a nice compact solution with ave() 

I learned something new about the subtleties of R 

let me summarize the alternative solutions here, just in case someone else might 
be interested... 

thanks, bye 

# 

# my user function (an example) 
mynorm <- function(x) {(x - min(x, na.rm=TRUE))/(max(x, na.rm=TRUE) - min(x, 
na.rm=TRUE))} 

# my dataframe to apply the formula by blocks 
mydf<-data.frame(blocks=rep(c("a","b","c"),each=5), 
v1=round(runif(15,10,25),0), v2=round(rnorm(15,30,5),0)) 

# blocks (factors) to be used for splitting 
b <- mydf$blocks 

# 1 - split-lapply-unsplit with anonymous function to return a new df 
s <- split(mydf, b) 
l<- lapply(s, function(x) data.frame(x, v1mod=mynorm(x$v1))) 
mydf_new <- unsplit(l, mydf$blocks) 

# 2 - split-lapply-unsplit with function transform to return a new df 
l <- split(mydf, b) 
l <- lapply(l, transform, v1.mod = mynorm(v1)) 
mydf_new <- unsplit(l, b) 

# 3 - ave() encapsulating split-lapply-unsplit approach 
mydf_new<-transform(mydf, v1.mod = ave(v1, blocks, FUN=mynorm)) 

# 
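
A quick check that the split-lapply-unsplit route and the ave() route agree. This is a sketch: fixed illustrative values (an assumption) replace the random draws so the result is repeatable, and the lapply step uses an explicit anonymous function.

```r
mynorm <- function(x) {
  (x - min(x, na.rm = TRUE)) / (max(x, na.rm = TRUE) - min(x, na.rm = TRUE))
}

# Fixed illustrative values instead of runif(), so results repeat
mydf <- data.frame(blocks = rep(c("a", "b", "c"), each = 5),
                   v1 = c(14, 16, 19, 24, 13, 23, 24, 20, 19, 11,
                          13, 13, 20, 16, 22))
b <- mydf$blocks

# split-lapply-unsplit
l <- lapply(split(mydf, b), function(d) transform(d, v1.mod = mynorm(v1)))
by_split <- unsplit(l, b)

# ave() does the split, apply, and reassembly in one call
by_ave <- transform(mydf, v1.mod = ave(v1, blocks, FUN = mynorm))

# Same column either way, scaled to [0, 1] within each block
print(all.equal(by_split$v1.mod, by_ave$v1.mod))
print(tapply(by_ave$v1.mod, by_ave$blocks, range))
```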





From: "William Dunlap"  
To: "Massimo Bressan"  
Cc: "David L Carlson" , "r-help"  
Sent: Friday, 13 May 2016 19:22:21 
Subject: Re: [R] apply formula over columns by subset of rows in a dataframe 
(to get a new dataframe) 

ave() encapsulates the split/lapply/unsplit stuff so 
transform(mydf, v1.mod = ave(v1, blocks, FUN=mynorm)) 
also gives what you got above. 

Bill Dunlap 
TIBCO Software 
wdunlap tibco.com 

On Fri, May 13, 2016 at 7:44 AM, Massimo Bressan < 
massimo.bres...@arpa.veneto.it > wrote: 


yes, thanks 

you pointed me in the right direction: split/unsplit was the trick 

I had completely overlooked that possibility! 

here is the final version 

 

mynorm <- function(x) {(x - min(x, na.rm=TRUE))/(max(x, na.rm=TRUE) - min(x, 
na.rm=TRUE))} 

mydf<-data.frame(blocks=rep(c("a","b","c"),each=5), 
v1=round(runif(15,10,25),0), v2=round(rnorm(15,30,5),0)) 

g <- mydf$blocks 
l <- split(mydf, g) 
l <- lapply(l, transform, v1.mod = mynorm(v1)) 
mydf_new <- unsplit(l, g) 

 

thanks again 

massimo 






-- 

 
Massimo Bressan 

ARPAV 
Agenzia Regionale per la Prevenzione e 
Protezione Ambientale del Veneto 

Dipartimento Provinciale di Treviso 
Via Santa Barbara, 5/a 
31100 Treviso, Italy 

tel: +39 0422 558545 
fax: +39 0422 558516 
e-mail: massimo.bres...@arpa.veneto.it 
 


[R] Plot trajectories using ggplot?

2016-05-14 Thread Mike Smith
Hi

I've got stuck using the code below to try to plot trajectories - columns are 
data recorded at time points, rows are cases. I've used melt to turn the data 
long, allowing me to group by time point and then plot using geom_point, but I 
now need to join the points based upon the correct case (i.e. the first row in 
the original dataset). geom_segment allows me to specify start-end but I need to 
do this over multiple time periods

Any help much appreciated

thanks

mike


library(reshape2)
library(ggplot2)
library(ggthemes)
library(cowplot)

#Read raw data
df = read.table("http://www.lecturematerials.co.uk/data/sample.csv", 
header=TRUE, sep=",", dec=".", na.strings=c("NA"))
names(df)<-c("1","2","3","4")

#Turn data from wide to long
ds<-melt(df)

ggplot(ds, aes(x = variable, y = value)) +
   geom_point (shape=19, size=5, fill="black")



---
Mike Smith



[R] R 3.3.0 Crashing: Error in readRDS(nsInfoFilePath) : unknown input format

2016-05-14 Thread Amitava Mukherjee
Dear All,

Greetings. I hope you will be able to provide kind help with the following:

I am facing a strange problem ever since I have started working with R
3.3.0.

I downloaded it and worked with it, and it was fine. Then, when I shut it down
and reopened it, it stopped working properly.

I am getting the following message:

Error in readRDS(nsInfoFilePath) : unknown input format

R version 3.3.0 (2016-05-03) -- "Supposedly Educational"
Copyright (C) 2016 The R Foundation for Statistical Computing
Platform: x86_64-w64-mingw32/x64 (64-bit)

R is free software and comes with ABSOLUTELY NO WARRANTY.
You are welcome to redistribute it under certain conditions.
Type 'license()' or 'licence()' for distribution details.

  Natural language support but running in an English locale

R is a collaborative project with many contributors.
Type 'contributors()' for more information and
'citation()' on how to cite R or R packages in publications.

Type 'demo()' for some demos, 'help()' for on-line help, or
'help.start()' for an HTML browser interface to help.
Type 'q()' to quit R.

Warning message:
package "methods" in options("defaultPackages") was not found
[Previously saved workspace restored]

Error in readRDS(nsInfoFilePath) : unknown input format
During startup - Warning message:
package ‘methods’ in options("defaultPackages") was not found
> ?mean
Error in readRDS(nsInfoFilePath) : unknown input format
>


I have uninstalled and reinstalled it three times. Every time, it works
perfectly right after the first installation.

Then, when I close the R window and reopen it, the problem starts.

Kindly suggest what I should do. I look forward to hearing from you,

Best regards,
Amitava





Dr. Amitava Mukherjee, Ph.D.
Associate Professor,
Production, Operations and Decision Sciences Area,
XLRI-Xavier School of Management, India.


[R] aggregate.data.frame(drop=FALSE) in R 3.3.0

2016-05-14 Thread Suharto Anggono Suharto Anggono via R-help
From NEWS: The data frame and formula methods for aggregate() gain a drop 
argument.

Here, I highlight behavior of 'aggregate.data.frame' with drop=FALSE in R 3.3.0.

Example 1, modified from "example with character variables and NAs" in 
"Example" in R help on 'aggregate':
> testDF <- data.frame(v1 = c(1,3,5,7,8,3,5,NA,4,5,7,9),
+                      v2 = c(11,33,55,77,88,33,55,NA,44,55,77,99) )
> by1 <- c("red", "blue", 1, 2, NA, "big", 1, 2, "red", 1, NA, 12)
> by2 <- c("wet", "dry", 99, 95, NA, "damp", 95, 99, "red", 99, NA, NA)
> str(aggregate(x = testDF, by = list(by1, by2), FUN = "mean", drop = FALSE))
'data.frame':   30 obs. of  4 variables:
 $ Group.1: Factor w/ 5 levels "1","2","big",..: 1 2 3 4 5 1 2 3 4 5 ...
 $ Group.2: Factor w/ 6 levels "95","99","damp",..: 1 1 1 1 1 2 2 2 2 2 ...
 $ v1     : num  5 7 NaN NaN NaN 5 NA NaN NaN NaN ...
 $ v2     : num  55 77 NaN NaN NaN 55 NA NaN NaN NaN ...
 - attr(*, "out.attrs")=List of 2
  ..$ dim     : Named int  5 6
  .. ..- attr(*, "names")= chr  "Group.1" "Group.2"
  ..$ dimnames:List of 2
  .. ..$ Group.1: chr  "Group.1=1" "Group.1=2" "Group.1=big" "Group.1=blue" ...
  .. ..$ Group.2: chr  "Group.2=95" "Group.2=99" "Group.2=damp" "Group.2=dry" ...
> str(aggregate(x = testDF, by = list(by1, by2), FUN = "mean"))
'data.frame':   8 obs. of  4 variables:
 $ Group.1: chr  "1" "2" "1" "2" ...
 $ Group.2: chr  "95" "95" "99" "99" ...
 $ v1     : num  5 7 5 NA 3 3 4 1
 $ v2     : num  55 77 55 NA 33 33 44 11

The result of 'aggregate.data.frame' with drop=FALSE has an attribute "out.attrs"; 
the result of the default 'aggregate.data.frame' (drop=TRUE) doesn't.
A character grouping variable becomes a factor in the result of 
'aggregate.data.frame' with drop=FALSE; it stays character in the result of the 
default 'aggregate.data.frame' (drop=TRUE).

Example 2, modified from "Compute the averages according to region and the 
occurrence of more than 130 days of frost" in "Examples" in R help on 
'aggregate':
> aggregate(state.x77,
+           list(Region = state.region,
+                Cold = state.x77[,"Frost"] > 130),
+           mean, drop = FALSE)
         Region  Cold Population   Income Illiteracy Life Exp    Murder
1     Northeast FALSE  8802.8000 4780.400  1.1800000 71.12800  5.580000
2         South FALSE  4208.1250 4011.938  1.7375000 69.70625 10.581250
3 North Central FALSE  7233.8333 4633.333  0.7833333 70.95667  8.283333
4          West FALSE  4582.5714 4550.143  1.2571429 71.70000  6.828571
5     Northeast  TRUE  1360.5000 4307.500  0.7750000 71.43500  3.650000
6         South  TRUE        NaN      NaN        NaN      NaN       NaN
7 North Central  TRUE  2372.1667 4588.833  0.6166667 72.57667  2.266667
8          West  TRUE   970.1667 4880.500  0.7500000 70.69167  7.666667
   HS Grad    Frost      Area
1 52.06000 110.6000  21838.60
2 44.34375  64.6250  54605.12
3 53.36667 120.0000  56736.50
4 60.11429  51.0000  91863.71
5 56.35000 160.5000  13519.00
6      NaN      NaN       NaN
7 55.66667 157.6667  68567.50
8 64.20000 161.8333 184162.17
> aggregate(state.x77,
+           list(Region = state.region,
+                Cold = state.x77[,"Frost"] > 130),
+           mean)
         Region  Cold Population   Income Illiteracy Life Exp    Murder
1     Northeast FALSE  8802.8000 4780.400  1.1800000 71.12800  5.580000
2         South FALSE  4208.1250 4011.938  1.7375000 69.70625 10.581250
3 North Central FALSE  7233.8333 4633.333  0.7833333 70.95667  8.283333
4          West FALSE  4582.5714 4550.143  1.2571429 71.70000  6.828571
5     Northeast  TRUE  1360.5000 4307.500  0.7750000 71.43500  3.650000
6 North Central  TRUE  2372.1667 4588.833  0.6166667 72.57667  2.266667
7          West  TRUE   970.1667 4880.500  0.7500000 70.69167  7.666667
   HS Grad    Frost      Area
1 52.06000 110.6000  21838.60
2 44.34375  64.6250  54605.12
3 53.36667 120.0000  56736.50
4 60.11429  51.0000  91863.71
5 56.35000 160.5000  13519.00
6 55.66667 157.6667  68567.50
7 64.20000 161.8333 184162.17

Unlike 'tapply', 'aggregate.data.frame' with drop=FALSE also applies the function 
(mean in example 2 above) to the subset corresponding to a combination of 
grouping variables that doesn't appear in the data.

Example 3, modified from 
http://stackoverflow.com/questions/22523131/dplyr-summarise-equivalent-of-drop-false-to-keep-groups-with-zero-length-in :
> DF <- data.frame(a=rep(1:3,4), b=factor(rep(1:2,6), levels=1:3))
> aggregate(DF["a"], DF["b"], length, drop=FALSE)
  b a
1 1 6
2 2 6

Unlike 'interaction' with drop=FALSE, or 'tapply', 'aggregate.data.frame' with 
drop=FALSE omits levels of a factor grouping variable that never appear in the 
data (in example 3 above, "3" in 'b').
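
For comparison, a short sketch of the 'tapply' side of example 3, using the same DF. The 'aggregate' call is repeated without expected output, on the assumption that its drop=FALSE behavior may differ between R versions.

```r
DF <- data.frame(a = rep(1:3, 4), b = factor(rep(1:2, 6), levels = 1:3))

# tapply keeps the unused level "3" and reports NA for it,
# without calling length() on the empty group
by_tapply <- tapply(DF$a, DF$b, length)
print(by_tapply)

# table() likewise keeps the unused level, with a count of zero
print(table(DF$b))

# aggregate with drop = FALSE, for comparison on the running R version
print(aggregate(DF["a"], DF["b"], length, drop = FALSE))
```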


Re: [R] physical constraint with gam

2016-05-14 Thread Simon Wood
On 12/05/16 02:29, Dominik Schneider wrote:
> Hi again,
> I'm looking for some clarification on 2 things.
> 1. On that last note, I realize that s(x1,x2) would be the other 
> obvious interaction to compare with - and I see that you recommend 
> te(x1,x2) if they are not on the same scale.
- yes that's right, s(x1,x2) gives an isotropic smooth, which is usually 
only appropriate if x1 and x2 are naturally on the same scale.
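
A rough sketch of that comparison on simulated data, for anyone who wants to
try it. The data, the scales of x1 and x2, and the names m_iso and m_te are all
made up for illustration; only s(), te(), gam() and AIC() are real functions.

```r
library(mgcv)

set.seed(2)
n  <- 300
x1 <- runif(n)             # roughly on a 0..1 scale
x2 <- runif(n, 0, 1000)    # deliberately on a much larger scale
y  <- sin(2 * pi * x1) + (x2 / 1000)^2 + rnorm(n, sd = 0.2)

# Isotropic thin plate smooth: one smoothing parameter, so the result
# depends on the relative scaling of x1 and x2.
m_iso <- gam(y ~ s(x1, x2), method = "REML")

# Tensor product smooth: a separate smoothing parameter per margin, so
# the fit does not depend on the units of x1 and x2.
m_te <- gam(y ~ te(x1, x2), method = "REML")

# One way to compare the two fits.
print(AIC(m_iso, m_te))
```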

> 2. If s(x1,by=x1) gives you a "parameter" value similar to a GLM when 
> you plot s(x1):x1, why does my function above return the same yhat as 
> predict(mdl,type='response') ? Shouldn't each of the terms need to be 
> multiplied by the variable value before applying 
> rowSums()+attr(sterms,'constant') ??
predict returns s(x1)*x1 (plot.gam just plots s(x1), because in general 
s(x1,by=x2) is not smooth). If you want to get s(x1) on its own you need 
to do something like this:

library(mgcv)
x2 <- x1 ## copy x1
m <- gam(y~s(x1,by=x2)) ## model implementing s(x1,by=x1) using copy of x1
## now predicted s(x1)*x2 = s(x1)
predict(m,data.frame(x1=x1,x2=rep(1,length(x2))),type="terms")
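
A self-contained sketch of the same idea on simulated data, in case it helps
to check the behaviour. The function f, the data frame dat and the object
names are hypothetical and not from the original example.

```r
library(mgcv)

set.seed(1)
n  <- 200
x1 <- runif(n, 0.1, 1)
# Hypothetical truth: a coefficient f(x1) that varies smoothly with x1.
f  <- function(x) 2 + sin(2 * pi * x)
dat <- data.frame(x1 = x1, x2 = x1)   # x2 is a plain copy of x1
dat$y <- f(dat$x1) * dat$x1 + rnorm(n, sd = 0.1)

m <- gam(y ~ s(x1, by = x2), data = dat)

# Terms at the observed data: the single smooth column is s(x1) * x2.
tm_by  <- predict(m, newdata = dat, type = "terms")
# Terms with x2 fixed at 1: the same column is now s(x1) on its own.
tm_one <- predict(m, newdata = transform(dat, x2 = 1), type = "terms")

# Summing the terms and adding the intercept reproduces the usual
# prediction, which is why no extra multiplication by x1 is needed.
yhat <- rowSums(tm_by) + attr(tm_by, "constant")
print(all.equal(as.numeric(yhat), as.numeric(predict(m, newdata = dat))))

# Multiplying the x2 = 1 terms by x1 recovers the by-variable terms.
print(all.equal(as.numeric(tm_one) * dat$x1, as.numeric(tm_by)))
```

Plotting as.numeric(tm_one) against dat$x1 then shows the varying
coefficient itself rather than its product with x1.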

best,
Simon

> Thanks again
> Dominik
>
> On Wed, May 11, 2016 at 10:11 AM, Dominik Schneider wrote:
>
> Hi Simon, Thanks for this explanation.
> To make sure I understand, another way of explaining the y axis in
> my original example is that it is the contribution to snowdepth
> relative to the other variables (the example only had fsca, but my
> actual case has a couple others). i.e. a negative s(fsca) of -0.5
> simply means snowdepth 0.5 units below the intercept+s(x_i), where
> s(x_i) could also be negative in the case where total snowdepth is
> less than the intercept value.
>
> The use of by=fsca is really useful for interpreting the marginal
> impact of the different variables. With my actual data, the term
> s(fsca):fsca is never negative, which is much more intuitive. Is
> it appropriate to compare magnitudes of e.g. s(x1):x1 / mean(x1)
> and s(x2):x2 / mean(x2), where mean(x_i) is the mean of the
> actual data?
>
> Lastly, how would these two differ: s(x1,by=x2); or
> s(x1,by=x1)*s(x2,by=x2) since interactions are surely present and
> I'm not sure if a linear combination is enough.
>
> Thanks!
> Dominik
>
>
> On Wed, May 11, 2016 at 3:11 AM, Simon Wood wrote:
>
> The spline having a positive value is not the same as a glm
> coefficient having a positive value. When you plot a smooth,
> say s(x), that is equivalent to plotting the line 'beta * x'
> in a GLM. It is not equivalent to plotting 'beta'. The smooths
> in a gam are (usually) subject to `sum-to-zero'
> identifiability constraints to avoid confounding via the
> intercept, so they are bound to be negative over some part of
> the covariate range. For example, if I have a model y ~ s(x) +
> s(z), I can't estimate the mean level for s(x) and the mean
> level for s(z) as they are completely confounded, and
> confounded with the model intercept term.
>
> I suppose that if you want to interpret the smooths as glm
> parameters varying with the covariate they relate to then you
> can do, by setting the model up as a varying coefficient
> model, using the `by' argument to 's'...
>
> gam(snowdepth~s(fsca,by=fsca),data=dat)
>
>
> this model is `snowdepth_i = f(fsca_i) * fsca_i + e_i' .
> s(fsca,by=fsca) is not confounded with the intercept, so no
> constraint is needed or applied, and you can now interpret the
> smooth like a local GLM coefficient.
>
> best,
> Simon
>
>
>
>
> On 11/05/16 01:30, Dominik Schneider wrote:
>
> Hi,
> Just getting into using GAM using the mgcv package. I've generated
> some models and extracted the splines for each of the variables and
> started visualizing them. I'm noticing that one of my variables is
> physically unrealistic.
>
> In the example below, my interpretation of the following plot is
> that the y-axis is basically the equivalent of a "parameter" value
> of a GLM; in GAM this value can change as the functional
> relationship changes between x and y. In my case, I am predicting
> snowdepth based on the fractional snow covered area. In no case
> will snowdepth realistically decrease for a unit increase in fsca
> so my question is: *Is there a way to constrain the spline to
> positive values?*
>
> Thanks
> Dominik
>
> library(mgcv)
>