[R] Panel Data--filling in missing dates in a span only

2015-03-10 Thread Steven Archambault
Hi folks,

I have this panel data (below), with observations missing in each of the 
panels. I want to fill in years for the missing data, but only those years 
within the span of the existing data. For instance, BC-0002 needs one year, 
1995. I do not want any years after the last observation.

wells <- structure(list(ID = structure(c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 
2L, 2L, 2L, 2L, 2L, 2L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L), .Label = 
c("BC-0002", 
"BC-0003", "BC-0004"), class = "factor"), Date = c(1989L, 1990L, 
1991L, 1992L, 1993L, 1994L, 1996L, 1989L, 1990L, 1991L, 1992L, 
1993L, 1994L, 1996L, 1995L, 1996L, 1997L, 1998L, 2000L, 1994L, 
1993L, 1999L, 1998L), DepthtoWater_bgs = c(317.85, 317.25, 321.25, 
312.31, 313.01, 330.41, 321.01, 166.58, 167.55, 168.65, 168.95, 
169.25, 168.85, 169.75, 260.6, 261.65, 262.15, 265.45, 266.15, 
265.25, 265.05, 266.95, 267.75)), .Names = c("ID", "Date", "DepthtoWater_bgs"
), class = "data.frame", row.names = c(NA, -23L))


I have been using this code to expand the entire panels, but it is not 
exactly what I want.

fexp <- expand.grid(ID=unique(wells$ID), Date=unique(wells$Date))
merge(fexp, wells, all=TRUE) 

Any help would be much appreciated!

Thanks,
Steve
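
One possible approach, sketched here in base R: build the run of years from 
the first to the last observation separately for each ID, then merge the 
observations onto that per-ID grid. The data frame below is only a trimmed 
copy of the posted data (BC-0002 and the first five BC-0004 rows), and the 
names spans and filled are made up for this illustration.

```r
# Trimmed copy of the posted data, named `wells` as in the code above.
wells <- data.frame(
  ID   = c(rep("BC-0002", 7), rep("BC-0004", 5)),
  Date = c(1989:1994, 1996, 1995:1998, 2000),
  DepthtoWater_bgs = c(317.85, 317.25, 321.25, 312.31, 313.01, 330.41, 321.01,
                       260.6, 261.65, 262.15, 265.45, 266.15)
)

# For each ID, list every year between its first and last observation.
# (If ID is a factor with unused levels, drop them first with droplevels().)
spans <- do.call(rbind, lapply(split(wells, wells$ID), function(d) {
  data.frame(ID = d$ID[1], Date = seq(min(d$Date), max(d$Date)))
}))

# Keep every row of the per-ID grid; years with no observation get NA.
filled <- merge(spans, wells, by = c("ID", "Date"), all.x = TRUE)
filled <- filled[order(filled$ID, filled$Date), ]
```

Because the grid is built per ID rather than with expand.grid() over all IDs 
and all years, no well gets rows outside its own first-to-last range.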

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] svg2swf - controlling the looping of flash files

2015-03-10 Thread Yixuan Qiu
Hi Paul,
One workaround for this problem is to manually edit the HTML that
contains the Flash animation, by adding a "loop" parameter in the
 tag:




This works for Firefox at least.



Best,
Yixuan

2015-03-10 12:09 GMT-04:00 Paul Sweeting :
> Hi Yixuan
>
>
>
> Thanks for your reply. I think it would be useful to have the option of a
> “loop = FALSE” option in this function.  However, I’m not sure how long a
> shelf life swf files will have, given everything seems to be moving away
> from flash…
>
>
>
> Paul
>
>
>
> From: Yixuan Qiu [mailto:yixuan@cos.name]
> Sent: 10 March 2015 00:20
> To: Paul Sweeting
> Cc: r-help
> Subject: Re: [R] svg2swf - controlling the looping of flash files
>
>
>
> Hello Paul,
>
> So far there is no way to stop the animation after its first run. If this
> feature is needed I could try to implement it in the future version of
> R2SWF.
>
> Best,
>
> Yixuan
>
>
>
> 2015-03-09 18:33 GMT-04:00 Paul Sweeting :
>
> Hi
>
>
>
> I'm using svg2swf to collate a number of svg outputs into an swf file.  I've
> got this working (mainly.) except that I can't control the looping behaviour
> of the swf file.  In other words, when it's loaded into html it loops
> continuously.  Is there any way to stop the animation looping, so it just
> plays through once when loaded?  The code I use is (broadly):
>
>
>
> svg("testplot%d.svg", onefile = FALSE)
> for (j in 1:360) {
>   print(cloud(x ~ y * z, groups = tail,
>               data = norm_dots_chart, screen = list(z = 0, x = 0, y = j)))
> }
> dev.off()
> output = svg2swf(sprintf("testplot%d.svg", 1:360), interval = 0.04)
> swf2html(output)
>
>
>
> Thank you!
>
>
>
>
>
>
> --
>
> Yixuan Qiu 
> Department of Statistics,
> Purdue University



-- 
Yixuan Qiu 
Department of Statistics,
Purdue University


Re: [R] .Rprofile vs. First (more of an opinion question)

2015-03-10 Thread Jeff Newmiller
I concur with Rolf.

.RData files (the ones with nothing before the period) are just traps for your 
future self, with no documentation. I avoid them like the plague. I refer to 
specifically-named Something.RData files in my .R/.Rnw/.Rmd files to cache 
results of long computations, but they are optional in my workflow because I 
always have R code that can regenerate them.
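
A rough sketch of that caching pattern, for illustration only: the file name, 
the object name slow_result and the function compute_slow_result() are all 
made up here, and the "long computation" is a stand-in.

```r
# Hypothetical cache file; a real script would use a path inside the project.
cache_file <- file.path(tempdir(), "slow_result.RData")

# Stand-in for a long computation that can always be re-run from code.
compute_slow_result <- function() {
  Sys.sleep(0.1)
  cumsum(1:10)
}

if (file.exists(cache_file)) {
  load(cache_file)                      # restores the object `slow_result`
} else {
  slow_result <- compute_slow_result()  # regenerate from scratch
  save(slow_result, file = cache_file)  # cache under an explicit name
}
```

Deleting the cache file costs only time, because the script regenerates it on 
the next run.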

.Rprofile files offer consistency of behavior  regardless of which working 
directory you use, and you can comment them.
---
Jeff Newmiller
Research Engineer (Solar/Batteries/Software/Embedded Controllers)
---
Sent from my phone. Please excuse my brevity.

On March 10, 2015 3:38:20 PM PDT, Rolf Turner  wrote:
>On 11/03/15 11:17, Erin Hodgess wrote:
>> Hello again
>>
>> I am using R-3.1.2 on Windows 7.
>>
>> I am the only one using this particular computer.
>>
>> My question is probably more of an opinion question.
>>
>> I want to set a "repos" with the options.  Also, I want to setwd and load a
>> particular workspace.
>>
>> Am I better off to put everything into .Rprofile, please?  Or .First?
>>
>> Or put the options into .Rprofile and everything else into .First, please?
>>
>> Thanks for any help.
>
>How do you create your .First() function and get it into your workspace?
>
>I may be confused here, but I think that you would need to make sure 
>that this is done in each workspace (in each working directory) that you 
>use.  It may be the case that you use only a single working directory, 
>but it is generally good practice to use a different working directory 
>for each separate project that you engage in.
>
>In contrast, putting your settings in .Rprofile causes them to be 
>applied in any working directory in which you start R.
>
>I also think that there's more danger of .RData getting lost or 
>over-written --- it is getting used all the time, whereas .Rprofile just 
>sits there and does its thing once it's been created --- than there is 
>of .Rprofile getting lost or over-written.
>
>Consequently my 2 bob's worth is:  Use .Rprofile.
>
>Of course, this is a case of the blind leading the blind.  Caveat lector.
>
>cheers,
>
>Rolf



Re: [R] problem applying the same function twice

2015-03-10 Thread Curtis Burkhalter
Sarah,

I realized what I was saying after I pressed send on the email. It makes
perfect sense now, thanks so much for your help and patience.
On Mar 10, 2015 5:57 PM, "Sarah Goslee"  wrote:

> I think you're kind of missing the way this works:
>
> the data frame created by expand.grid() should ONLY have site, year,
> sample (with the exact names used in the data itself).
> Then the merged data frame will have the full site,year,sample
> combinations, along with ALL the data variables. Your animal example
> only had one measured variable, but the same method will work with any
> number.
> Reading ?merge might help you understand.
>
> Sarah
>
> On Tue, Mar 10, 2015 at 5:35 PM, Curtis Burkhalter
>  wrote:
> >
> > Thanks Sarah, one of my column names was missing a letter so it was
> > throwing things off. It works super fast now and is exactly what I needed.
> > My actual data set has about 6 other ancillary response data columns. Is
> > there a way to combine the 'full' data set I just created with the original
> > in case I need any of the other response variables? E.g.
> >
> > FULL:
> > site  year  sample
> > 1     1     10
> > 1     1     12
> > 1     1     NA
> >
> > Original:
> > site  year  sample  color  shape
> > 1     1     10      blue   diamond
> > 1     1     12      green  pyramid
> >
> > Combined:
> > site  year  sample  color  shape
> > 1     1     10      blue   diamond
> > 1     1     12      green  pyramid
> > 1     1     NA      NA     NA
> >
> > Thanks
> >
> > On Tue, Mar 10, 2015 at 3:12 PM, Sarah Goslee 
> > wrote:
> >>
> >> Yeah, that's tiny:
> >>
> >> > fullout <- expand.grid(site=1:669, year=1:7, sample=1:3)
> >> > dim(fullout)
> >> [1] 14049 3
> >>
> >>
> >> Almost certainly the problem is that your expand.grid result doesn't
> >> have the same column names as your actual data file, so merge() is
> >> trying to make an enormous result. Note how when I made outgrid in the
> >> example I named the columns.
> >>
> >> Make sure that the names are identical!
> >>
> >>
> >> On Tue, Mar 10, 2015 at 4:57 PM, Curtis Burkhalter
> >>  wrote:
> >> > Sarah,
> >> >
> >> > I have 669 sites and each site has 7 years of data, so if I'm thinking
> >> > correctly then there should be 4683 possible combinations of site x
> >> > year.
> >> > For each year though I need 3 sampling periods so that there is
> >> > something
> >> > like the following:
> >> >
> >> > site 1  year1  sample 1
> >> > site 1  year1  sample 2
> >> > site 1  year1  sample 3
> >> > site 2  year1  sample 1
> >> > site 2  year1  sample 2
> >> > site 2  year1  sample 3.
> >> > site 669   year7  sample 1
> >> > site 669   year7 sample 2
> >> > site 669   year7 sample 3.
> >> >
> >> > I have my max memory allocation set to the amount of RAM (8GB) on my
> >> > laptop,
> >> > but it still 'times out' due to memory problems.
> >> >
> >> > On Tue, Mar 10, 2015 at 2:50 PM, Sarah Goslee  >
> >> > wrote:
> >> >>
> >> >> You said your data only had 14000 rows, which really isn't many.
> >> >>
> >> >> How many possible combinations do you have, and how many do you need
> to
> >> >> add?
> >> >>
> >> >> On Tue, Mar 10, 2015 at 4:35 PM, Curtis Burkhalter
> >> >>  wrote:
> >> >> > Sarah,
> >> >> >
> >> >> > This strategy works great for this small dataset, but when I
> attempt
> >> >> > your
> >> >> > method with my data set I reach the maximum allowable memory
> >> >> > allocation
> >> >> > and
> >> >> > the operation just stalls and then stops completely before it is
> >> >> > finished.
> >> >> > Do you know of a way around this?
> >> >> >
> >> >> > Thanks
> >> >> >
> >> >> > On Tue, Mar 10, 2015 at 2:04 PM, Sarah Goslee
> >> >> > 
> >> >> > wrote:
> >> >> >>
> >> >> >> Hi,
> >> >> >>
> >> >> >> I didn't work through your code, because it looked overly
> >> >> >> complicated.
> >> >> >> Here's a more general approach that does what you appear to want:
> >> >> >>
> >> >> >> # use dput() to provide reproducible data please!
> >> >> >> comAn <- structure(list(animals = c("bird", "bird", "bird",
> "bird",
> >> >> >> "bird",
> >> >> >> "bird", "dog", "dog", "dog", "dog", "dog", "dog", "cat", "cat",
> >> >> >> "cat", "cat"), animalYears = c(1L, 1L, 1L, 2L, 2L, 2L, 1L, 1L,
> >> >> >> 1L, 2L, 2L, 2L, 1L, 1L, 1L, 2L), animalMass = c(29L, 48L, 36L,
> >> >> >> 20L, 34L, 34L, 21L, 28L, 25L, 35L, 18L, 11L, 46L, 33L, 48L, 21L
> >> >> >> )), .Names = c("animals", "animalYears", "animalMass"), class =
> >> >> >> "data.frame", row.names = c("1",
> >> >> >> "2", "3", "4", "5", "6", "7", "8", "9", "10", "11", "12", "13",
> >> >> >> "14", "15", "16"))
> >> >> >>
> >> >> >>
> >> >> >> # add reps to comAn
> >> >> >> # assumes comAn is already sorted on animals,

Re: [R] problem applying the same function twice

2015-03-10 Thread Sarah Goslee
I think you're kind of missing the way this works:

the data frame created by expand.grid() should ONLY have site, year,
sample (with the exact names used in the data itself).
Then the merged data frame will have the full site,year,sample
combinations, along with ALL the data variables. Your animal example
only had one measured variable, but the same method will work with any
number.
Reading ?merge might help you understand.
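
A small sketch of that point, using made-up data rather than the real data 
set: the grid carries only the key columns, yet the merged result picks up 
every other column of the original. Here sample is assumed to be a sampling 
period index (1 to 3), and color and shape stand in for the ancillary 
response columns.

```r
# Hypothetical observations: three key columns plus two ancillary columns.
original <- data.frame(
  site   = c(1L, 1L),
  year   = c(1L, 1L),
  sample = c(1L, 2L),
  color  = c("blue", "green"),
  shape  = c("diamond", "pyramid"),
  stringsAsFactors = FALSE
)

# The grid holds only the key columns, named exactly as in `original`.
grid <- expand.grid(site = 1L, year = 1L, sample = 1:3)

# all.x = TRUE keeps every grid row; the unmatched row (sample 3)
# gets NA in color and shape.
combined <- merge(grid, original, all.x = TRUE)
```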

Sarah

On Tue, Mar 10, 2015 at 5:35 PM, Curtis Burkhalter
 wrote:
>
> Thanks Sarah, one of my column names was missing a letter so it was throwing
> things off. It works super fast now and is exactly what I needed. My actual
> data set has about 6 other ancillary response data columns. Is there a
> way to combine the 'full' data set I just created with the original in case
> I need any of the other response variables? E.g.
>
> FULL:
> site  year  sample
> 1     1     10
> 1     1     12
> 1     1     NA
>
> Original:
> site  year  sample  color  shape
> 1     1     10      blue   diamond
> 1     1     12      green  pyramid
>
> Combined:
> site  year  sample  color  shape
> 1     1     10      blue   diamond
> 1     1     12      green  pyramid
> 1     1     NA      NA     NA
>
> Thanks
>
> On Tue, Mar 10, 2015 at 3:12 PM, Sarah Goslee 
> wrote:
>>
>> Yeah, that's tiny:
>>
>> > fullout <- expand.grid(site=1:669, year=1:7, sample=1:3)
>> > dim(fullout)
>> [1] 14049 3
>>
>>
>> Almost certainly the problem is that your expand.grid result doesn't
>> have the same column names as your actual data file, so merge() is
>> trying to make an enormous result. Note how when I made outgrid in the
>> example I named the columns.
>>
>> Make sure that the names are identical!
>>
>>
>> On Tue, Mar 10, 2015 at 4:57 PM, Curtis Burkhalter
>>  wrote:
>> > Sarah,
>> >
>> > I have 669 sites and each site has 7 years of data, so if I'm thinking
>> > correctly then there should be 4683 possible combinations of site x
>> > year.
>> > For each year though I need 3 sampling periods so that there is
>> > something
>> > like the following:
>> >
>> > site 1  year1  sample 1
>> > site 1  year1  sample 2
>> > site 1  year1  sample 3
>> > site 2  year1  sample 1
>> > site 2  year1  sample 2
>> > site 2  year1  sample 3.
>> > site 669   year7  sample 1
>> > site 669   year7 sample 2
>> > site 669   year7 sample 3.
>> >
>> > I have my max memory allocation set to the amount of RAM (8GB) on my
>> > laptop,
>> > but it still 'times out' due to memory problems.
>> >
>> > On Tue, Mar 10, 2015 at 2:50 PM, Sarah Goslee 
>> > wrote:
>> >>
>> >> You said your data only had 14000 rows, which really isn't many.
>> >>
>> >> How many possible combinations do you have, and how many do you need to
>> >> add?
>> >>
>> >> On Tue, Mar 10, 2015 at 4:35 PM, Curtis Burkhalter
>> >>  wrote:
>> >> > Sarah,
>> >> >
>> >> > This strategy works great for this small dataset, but when I attempt
>> >> > your
>> >> > method with my data set I reach the maximum allowable memory
>> >> > allocation
>> >> > and
>> >> > the operation just stalls and then stops completely before it is
>> >> > finished.
>> >> > Do you know of a way around this?
>> >> >
>> >> > Thanks
>> >> >
>> >> > On Tue, Mar 10, 2015 at 2:04 PM, Sarah Goslee
>> >> > 
>> >> > wrote:
>> >> >>
>> >> >> Hi,
>> >> >>
>> >> >> I didn't work through your code, because it looked overly
>> >> >> complicated.
>> >> >> Here's a more general approach that does what you appear to want:
>> >> >>
>> >> >> # use dput() to provide reproducible data please!
>> >> >> comAn <- structure(list(animals = c("bird", "bird", "bird", "bird",
>> >> >> "bird",
>> >> >> "bird", "dog", "dog", "dog", "dog", "dog", "dog", "cat", "cat",
>> >> >> "cat", "cat"), animalYears = c(1L, 1L, 1L, 2L, 2L, 2L, 1L, 1L,
>> >> >> 1L, 2L, 2L, 2L, 1L, 1L, 1L, 2L), animalMass = c(29L, 48L, 36L,
>> >> >> 20L, 34L, 34L, 21L, 28L, 25L, 35L, 18L, 11L, 46L, 33L, 48L, 21L
>> >> >> )), .Names = c("animals", "animalYears", "animalMass"), class =
>> >> >> "data.frame", row.names = c("1",
>> >> >> "2", "3", "4", "5", "6", "7", "8", "9", "10", "11", "12", "13",
>> >> >> "14", "15", "16"))
>> >> >>
>> >> >>
>> >> >> # add reps to comAn
>> >> >> # assumes comAn is already sorted on animals, animalYears
>> >> >> comAn$reps <- unlist(sapply(rle(do.call("paste",
>> >> >> comAn[,1:2]))$lengths, seq_len))
>> >> >>
>> >> >> # create full set of combinations
>> >> >> outgrid <- expand.grid(animals=unique(comAn$animals),
>> >> >> animalYears=unique(comAn$animalYears), reps=unique(comAn$reps),
>> >> >> stringsAsFactors=FALSE)
>> >> >>
>> >> >> # combine with comAn
>> >> >> comAn.full <- merge(outgrid, comAn, all.x=TRUE)
>> >> >>
>> >> >> > comAn.fu

Re: [R] .Rprofile vs. First (more of an opinion question)

2015-03-10 Thread Rolf Turner

On 11/03/15 11:17, Erin Hodgess wrote:

Hello again

I am using R-3.1.2 on Windows 7.

I am the only one using this particular computer.

My question is probably more of an opinion question.

I want to set a "repos" with the options.  Also, I want to setwd and load a
particular workspace.

Am I better off to put everything into .Rprofile, please?  Or .First?

Or put the options into .Rprofile and everything else into .First, please?

Thanks for any help.


How do you create your .First() function and get it into your workspace?

I may be confused here, but I think that you would need to make sure 
that this is done in each workspace (in each working directory) that you 
use.  It may be the case that you use only a single working directory, 
but it is generally good practice to use a different working directory 
for each separate project that you engage in.


In contrast, putting your settings in .Rprofile causes them to be 
applied in any working directory in which you start R.


I also think that there's more danger of .RData getting lost or 
over-written --- it is getting used all the time, whereas .Rprofile just 
sits there and does its thing once it's been created --- than there is 
of .Rprofile getting lost or over-written.


Consequently my 2 bob's worth is:  Use .Rprofile.
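
For illustration only, a user-level .Rprofile along these lines might look 
like the sketch below. The project directory and workspace file are 
hypothetical placeholders, and the mirror URL is just one choice; defining 
.First() inside .Rprofile is one way to get it into every session.

```r
# Sketch of a user-level .Rprofile; the paths below are hypothetical.

# Set the CRAN mirror without disturbing any other configured repositories.
local({
  r <- getOption("repos")
  r["CRAN"] <- "https://cloud.r-project.org"
  options(repos = r)
})

# R calls .First() at the end of startup, after .Rprofile has been sourced
# and any saved workspace in the starting directory has been restored.
.First <- function() {
  project_dir <- "C:/projects/myproject"                   # hypothetical
  workspace   <- file.path(project_dir, "myproject.RData") # hypothetical
  if (dir.exists(project_dir)) setwd(project_dir)
  if (file.exists(workspace)) load(workspace, envir = .GlobalEnv)
}
```

The existence checks mean the profile does nothing harmful when the 
directory or workspace file is missing.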

Of course, this is a case of the blind leading the blind.  Caveat lector.

cheers,

Rolf


--
Rolf Turner
Technical Editor ANZJS
Department of Statistics
University of Auckland
Phone: +64-9-373-7599 ext. 88276
Home phone: +64-9-480-4619



[R] .Rprofile vs. First (more of an opinion question)

2015-03-10 Thread Erin Hodgess
Hello again

I am using R-3.1.2 on Windows 7.

I am the only one using this particular computer.

My question is probably more of an opinion question.

I want to set a "repos" with the options.  Also, I want to setwd and load a
particular workspace.

Am I better off to put everything into .Rprofile, please?  Or .First?

Or put the options into .Rprofile and everything else into .First, please?

Thanks for any help,
Sincerely,
Erin


-- 
Erin Hodgess
Associate Professor
Department of Mathematical and Statistics
University of Houston - Downtown
mailto: erinm.hodg...@gmail.com



Re: [R] problem applying the same function twice

2015-03-10 Thread Jeff Newmiller
You may find it beneficial to investigate packages dplyr, data.table, or a 
combination of the two for handling large data sets in memory. Or, perhaps 
dplyr with a SQL back end for working on disk (I have not tried that myself 
yet).

I do find your excuse for manufacturing data records uncompelling, though. If 
the information necessary to draw valid conclusions is absent, the results you 
obtain by doing so are going to be questionable at best.
---
Jeff Newmiller
Research Engineer (Solar/Batteries/Software/Embedded Controllers)
---
Sent from my phone. Please excuse my brevity.

On March 10, 2015 1:57:14 PM PDT, Curtis Burkhalter 
 wrote:
>Sarah,
>
>I have 669 sites and each site has 7 years of data, so if I'm thinking
>correctly then there should be 4683 possible combinations of site x
>year.
>For each year though I need 3 sampling periods so that there is
>something
>like the following:
>
>site 1  year1  sample 1
>site 1  year1  sample 2
>site 1  year1  sample 3
>site 2  year1  sample 1
>site 2  year1  sample 2
>site 2  year1  sample 3.
>site 669   year7  sample 1
>site 669   year7 sample 2
>site 669   year7 sample 3.
>
>I have my max memory allocation set to the amount of RAM (8GB) on my
>laptop, but it still 'times out' due to memory problems.
>
>On Tue, Mar 10, 2015 at 2:50 PM, Sarah Goslee 
>wrote:
>
>> You said your data only had 14000 rows, which really isn't many.
>>
>> How many possible combinations do you have, and how many do you need
>to
>> add?
>>
>> On Tue, Mar 10, 2015 at 4:35 PM, Curtis Burkhalter
>>  wrote:
>> > Sarah,
>> >
>> > This strategy works great for this small dataset, but when I
>attempt your
>> > method with my data set I reach the maximum allowable memory
>allocation
>> and
>> > the operation just stalls and then stops completely before it is
>> finished.
>> > Do you know of a way around this?
>> >
>> > Thanks
>> >
>> > On Tue, Mar 10, 2015 at 2:04 PM, Sarah Goslee
>
>> > wrote:
>> >>
>> >> Hi,
>> >>
>> >> I didn't work through your code, because it looked overly
>complicated.
>> >> Here's a more general approach that does what you appear to want:
>> >>
>> >> # use dput() to provide reproducible data please!
>> >> comAn <- structure(list(animals = c("bird", "bird", "bird",
>"bird",
>> >> "bird",
>> >> "bird", "dog", "dog", "dog", "dog", "dog", "dog", "cat", "cat",
>> >> "cat", "cat"), animalYears = c(1L, 1L, 1L, 2L, 2L, 2L, 1L, 1L,
>> >> 1L, 2L, 2L, 2L, 1L, 1L, 1L, 2L), animalMass = c(29L, 48L, 36L,
>> >> 20L, 34L, 34L, 21L, 28L, 25L, 35L, 18L, 11L, 46L, 33L, 48L, 21L
>> >> )), .Names = c("animals", "animalYears", "animalMass"), class =
>> >> "data.frame", row.names = c("1",
>> >> "2", "3", "4", "5", "6", "7", "8", "9", "10", "11", "12", "13",
>> >> "14", "15", "16"))
>> >>
>> >>
>> >> # add reps to comAn
>> >> # assumes comAn is already sorted on animals, animalYears
>> >> comAn$reps <- unlist(sapply(rle(do.call("paste",
>> >> comAn[,1:2]))$lengths, seq_len))
>> >>
>> >> # create full set of combinations
>> >> outgrid <- expand.grid(animals=unique(comAn$animals),
>> >> animalYears=unique(comAn$animalYears), reps=unique(comAn$reps),
>> >> stringsAsFactors=FALSE)
>> >>
>> >> # combine with comAn
>> >> comAn.full <- merge(outgrid, comAn, all.x=TRUE)
>> >>
>> >> > comAn.full
>> >>animals animalYears reps animalMass
>> >> 1 bird   11 29
>> >> 2 bird   12 48
>> >> 3 bird   13 36
>> >> 4 bird   21 20
>> >> 5 bird   22 34
>> >> 6 bird   23 34
>> >> 7  cat   11 46
>> >> 8  cat   12 33
>> >> 9  cat   13 48
>> >> 10 cat   21 21
>> >> 11 cat   22 NA
>> >> 12 cat   23 NA
>> >> 13 dog   11 21
>> >> 14 dog   12 28
>> >> 15 dog   13 25
>> >> 16 dog   21 35
>> >> 17 dog   22 18
>> >> 18 dog   23 11
>> >> >
>> >>
>> >> On Tue, Mar 10, 2015 at 3:43 PM, Curtis Burkhalter
>> >>  wrote:
>> >> > Hey everyone,
>> >> >
>> >> > I've written a function that adds NAs to a dataframe where data
>is
>> >> > missing
>> >> > and it seems to work great if I only need to run it once, but if
>I run
>> >> > it
>> >> > two times in a row I run into problems. I've created a workable
>> example
>> >> > to
>> >> > explain w

Re: [R] problem applying the same function twice

2015-03-10 Thread Curtis Burkhalter
Thanks Sarah, one of my column names was missing a letter so it was
throwing things off. It works super fast now and is exactly what I needed.
My actual data set has about 6 other ancillary response data columns. Is
there a way to combine the 'full' data set I just created with the
original in case I need any of the other response variables? E.g.

FULL:
site  year  sample
1     1     10
1     1     12
1     1     NA

Original:
site  year  sample  color  shape
1     1     10      blue   diamond
1     1     12      green  pyramid

Combined:
site  year  sample  color  shape
1     1     10      blue   diamond
1     1     12      green  pyramid
1     1     NA      NA     NA

Thanks

On Tue, Mar 10, 2015 at 3:12 PM, Sarah Goslee 
wrote:

> Yeah, that's tiny:
>
> > fullout <- expand.grid(site=1:669, year=1:7, sample=1:3)
> > dim(fullout)
> [1] 14049 3
>
>
> Almost certainly the problem is that your expand.grid result doesn't
> have the same column names as your actual data file, so merge() is
> trying to make an enormous result. Note how when I made outgrid in the
> example I named the columns.
>
> Make sure that the names are identical!
>
>
> On Tue, Mar 10, 2015 at 4:57 PM, Curtis Burkhalter
>  wrote:
> > Sarah,
> >
> > I have 669 sites and each site has 7 years of data, so if I'm thinking
> > correctly then there should be 4683 possible combinations of site x year.
> > For each year though I need 3 sampling periods so that there is something
> > like the following:
> >
> > site 1  year1  sample 1
> > site 1  year1  sample 2
> > site 1  year1  sample 3
> > site 2  year1  sample 1
> > site 2  year1  sample 2
> > site 2  year1  sample 3.
> > site 669   year7  sample 1
> > site 669   year7 sample 2
> > site 669   year7 sample 3.
> >
> > I have my max memory allocation set to the amount of RAM (8GB) on my
> laptop,
> > but it still 'times out' due to memory problems.
> >
> > On Tue, Mar 10, 2015 at 2:50 PM, Sarah Goslee 
> > wrote:
> >>
> >> You said your data only had 14000 rows, which really isn't many.
> >>
> >> How many possible combinations do you have, and how many do you need to
> >> add?
> >>
> >> On Tue, Mar 10, 2015 at 4:35 PM, Curtis Burkhalter
> >>  wrote:
> >> > Sarah,
> >> >
> >> > This strategy works great for this small dataset, but when I attempt
> >> > your
> >> > method with my data set I reach the maximum allowable memory
> allocation
> >> > and
> >> > the operation just stalls and then stops completely before it is
> >> > finished.
> >> > Do you know of a way around this?
> >> >
> >> > Thanks
> >> >
> >> > On Tue, Mar 10, 2015 at 2:04 PM, Sarah Goslee  >
> >> > wrote:
> >> >>
> >> >> Hi,
> >> >>
> >> >> I didn't work through your code, because it looked overly
> complicated.
> >> >> Here's a more general approach that does what you appear to want:
> >> >>
> >> >> # use dput() to provide reproducible data please!
> >> >> comAn <- structure(list(animals = c("bird", "bird", "bird", "bird",
> >> >> "bird",
> >> >> "bird", "dog", "dog", "dog", "dog", "dog", "dog", "cat", "cat",
> >> >> "cat", "cat"), animalYears = c(1L, 1L, 1L, 2L, 2L, 2L, 1L, 1L,
> >> >> 1L, 2L, 2L, 2L, 1L, 1L, 1L, 2L), animalMass = c(29L, 48L, 36L,
> >> >> 20L, 34L, 34L, 21L, 28L, 25L, 35L, 18L, 11L, 46L, 33L, 48L, 21L
> >> >> )), .Names = c("animals", "animalYears", "animalMass"), class =
> >> >> "data.frame", row.names = c("1",
> >> >> "2", "3", "4", "5", "6", "7", "8", "9", "10", "11", "12", "13",
> >> >> "14", "15", "16"))
> >> >>
> >> >>
> >> >> # add reps to comAn
> >> >> # assumes comAn is already sorted on animals, animalYears
> >> >> comAn$reps <- unlist(sapply(rle(do.call("paste",
> >> >> comAn[,1:2]))$lengths, seq_len))
> >> >>
> >> >> # create full set of combinations
> >> >> outgrid <- expand.grid(animals=unique(comAn$animals),
> >> >> animalYears=unique(comAn$animalYears), reps=unique(comAn$reps),
> >> >> stringsAsFactors=FALSE)
> >> >>
> >> >> # combine with comAn
> >> >> comAn.full <- merge(outgrid, comAn, all.x=TRUE)
> >> >>
> >> >> > comAn.full
> >> >>animals animalYears reps animalMass
> >> >> 1 bird   11 29
> >> >> 2 bird   12 48
> >> >> 3 bird   13 36
> >> >> 4 bird   21 20
> >> >> 5 bird   22 34
> >> >> 6 bird   23 34
> >> >> 7  cat   11 46
> >> >> 8  cat   12 33
> >> >> 9  cat   13 48
> >> >> 10 cat   21 21
> >> >> 11 cat   22 NA
> 

Re: [R] problem applying the same function twice

2015-03-10 Thread Sarah Goslee
Yeah, that's tiny:

> fullout <- expand.grid(site=1:669, year=1:7, sample=1:3)
> dim(fullout)
[1] 14049 3


Almost certainly the problem is that your expand.grid result doesn't
have the same column names as your actual data file, so merge() is
trying to make an enormous result. Note how when I made outgrid in the
example I named the columns.

Make sure that the names are identical!
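
To see why the names matter, here is a toy illustration with made-up data 
(the objects obs, good and bad are hypothetical): when the two data frames 
share no column names, merge() has nothing to match on and returns every 
pairing of rows.

```r
# Hypothetical observations with three key columns and one measurement.
obs <- data.frame(
  site   = c(1L, 1L, 2L),
  year   = c(1L, 2L, 1L),
  sample = c(1L, 1L, 1L),
  count  = c(5, 7, 3)
)

# Grid whose column names match the data exactly.
good <- expand.grid(site = 1:2, year = 1:2, sample = 1:2)

# Same grid with misspelled names; no column is shared with `obs`.
bad <- expand.grid(Site = 1:2, yr = 1:2, smpl = 1:2)

full  <- merge(good, obs, all.x = TRUE)  # one row per grid combination
blown <- merge(bad,  obs, all.x = TRUE)  # every grid row paired with every obs row

nrow(full)   # 8
nrow(blown)  # 8 * 3 = 24
```

With 14049 grid rows and roughly 14000 data rows, the same mistake would ask 
for close to 200 million rows, which would explain running out of memory.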


On Tue, Mar 10, 2015 at 4:57 PM, Curtis Burkhalter
 wrote:
> Sarah,
>
> I have 669 sites and each site has 7 years of data, so if I'm thinking
> correctly then there should be 4683 possible combinations of site x year.
> For each year though I need 3 sampling periods so that there is something
> like the following:
>
> site 1  year1  sample 1
> site 1  year1  sample 2
> site 1  year1  sample 3
> site 2  year1  sample 1
> site 2  year1  sample 2
> site 2  year1  sample 3.
> site 669   year7  sample 1
> site 669   year7 sample 2
> site 669   year7 sample 3.
>
> I have my max memory allocation set to the amount of RAM (8GB) on my laptop,
> but it still 'times out' due to memory problems.
>
> On Tue, Mar 10, 2015 at 2:50 PM, Sarah Goslee 
> wrote:
>>
>> You said your data only had 14000 rows, which really isn't many.
>>
>> How many possible combinations do you have, and how many do you need to
>> add?
>>
>> On Tue, Mar 10, 2015 at 4:35 PM, Curtis Burkhalter
>>  wrote:
>> > Sarah,
>> >
>> > This strategy works great for this small dataset, but when I attempt
>> > your
>> > method with my data set I reach the maximum allowable memory allocation
>> > and
>> > the operation just stalls and then stops completely before it is
>> > finished.
>> > Do you know of a way around this?
>> >
>> > Thanks
>> >
>> > On Tue, Mar 10, 2015 at 2:04 PM, Sarah Goslee 
>> > wrote:
>> >>
>> >> Hi,
>> >>
>> >> I didn't work through your code, because it looked overly complicated.
>> >> Here's a more general approach that does what you appear to want:
>> >>
>> >> # use dput() to provide reproducible data please!
>> >> comAn <- structure(list(animals = c("bird", "bird", "bird", "bird",
>> >> "bird",
>> >> "bird", "dog", "dog", "dog", "dog", "dog", "dog", "cat", "cat",
>> >> "cat", "cat"), animalYears = c(1L, 1L, 1L, 2L, 2L, 2L, 1L, 1L,
>> >> 1L, 2L, 2L, 2L, 1L, 1L, 1L, 2L), animalMass = c(29L, 48L, 36L,
>> >> 20L, 34L, 34L, 21L, 28L, 25L, 35L, 18L, 11L, 46L, 33L, 48L, 21L
>> >> )), .Names = c("animals", "animalYears", "animalMass"), class =
>> >> "data.frame", row.names = c("1",
>> >> "2", "3", "4", "5", "6", "7", "8", "9", "10", "11", "12", "13",
>> >> "14", "15", "16"))
>> >>
>> >>
>> >> # add reps to comAn
>> >> # assumes comAn is already sorted on animals, animalYears
>> >> comAn$reps <- unlist(sapply(rle(do.call("paste",
>> >> comAn[,1:2]))$lengths, seq_len))
>> >>
>> >> # create full set of combinations
>> >> outgrid <- expand.grid(animals=unique(comAn$animals),
>> >> animalYears=unique(comAn$animalYears), reps=unique(comAn$reps),
>> >> stringsAsFactors=FALSE)
>> >>
>> >> # combine with comAn
>> >> comAn.full <- merge(outgrid, comAn, all.x=TRUE)
>> >>
>> >> > comAn.full
>> >>animals animalYears reps animalMass
>> >> 1 bird   11 29
>> >> 2 bird   12 48
>> >> 3 bird   13 36
>> >> 4 bird   21 20
>> >> 5 bird   22 34
>> >> 6 bird   23 34
>> >> 7  cat   11 46
>> >> 8  cat   12 33
>> >> 9  cat   13 48
>> >> 10 cat   21 21
>> >> 11 cat   22 NA
>> >> 12 cat   23 NA
>> >> 13 dog   11 21
>> >> 14 dog   12 28
>> >> 15 dog   13 25
>> >> 16 dog   21 35
>> >> 17 dog   22 18
>> >> 18 dog   23 11
>> >> >
>> >>
>> >> On Tue, Mar 10, 2015 at 3:43 PM, Curtis Burkhalter
>> >>  wrote:
>> >> > Hey everyone,
>> >> >
>> >> > I've written a function that adds NAs to a dataframe where data is
>> >> > missing
>> >> > and it seems to work great if I only need to run it once, but if I
>> >> > run
>> >> > it
>> >> > two times in a row I run into problems. I've created a workable
>> >> > example
>> >> > to
>> >> > explain what I mean and why I would do this.
>> >> >
>> >> > In my dataframe there are areas where I need to add two rows of NAs
>> >> > (b/c
>> >> > I
>> >> > need to have 3 animal x year combos and for cat in year 2 I only have
>> >> > one)
>> >> > so I thought that I'd just run my code twice using the function in
>> >> > the
>> >> > code
>> >> > below. Everything works great when I run it the first time, but when
>> >> > I
>> >> > run
>> >> > it again it says that the value returned to the list 'x' is of length
>> >> > 0.
>> >> > I
>> >> > don't understand why the function wo

Re: [R] problem applying the same function twice

2015-03-10 Thread Curtis Burkhalter
Sarah,

I have 669 sites and each site has 7 years of data, so if I'm thinking
correctly then there should be 4683 possible combinations of site x year.
For each year though I need 3 sampling periods so that there is something
like the following:

site 1  year1  sample 1
site 1  year1  sample 2
site 1  year1  sample 3
site 2  year1  sample 1
site 2  year1  sample 2
site 2  year1  sample 3.
site 669   year7  sample 1
site 669   year7 sample 2
site 669   year7 sample 3.

I have my max memory allocation set to the amount of RAM (8GB) on my
laptop, but it still 'times out' due to memory problems.
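
For scale, the full site x year x sample grid described above has 669 * 7 * 3 = 14049 rows, which is small, so the memory problem probably comes from the merge step rather than from the grid itself. One possible cause (an assumption, since the real code is not shown here) is a merge whose key columns do not uniquely identify rows, so that matching rows multiply. A minimal sketch with made-up column names (site, year, sample, mass) and simulated data, passing the keys explicitly via by=:

```r
# Hypothetical stand-in for the real data: 669 sites, 7 years, 3 samples each
sites <- sprintf("site%03d", 1:669)
full <- expand.grid(site = sites, year = 1:7, sample = 1:3,
                    stringsAsFactors = FALSE)
nrow(full)   # 669 * 7 * 3 = 14049

# Simulated observations: 14000 of the 14049 possible rows are present
set.seed(1)
obs <- full[sort(sample(nrow(full), 14000)), ]
obs$mass <- rnorm(nrow(obs))

# Guard: the key columns must identify each observed row uniquely,
# otherwise merge() can multiply rows and exhaust memory
stopifnot(!anyDuplicated(obs[c("site", "year", "sample")]))

# Left join on the explicit keys; missing combinations get NA for 'mass'
filled <- merge(full, obs, by = c("site", "year", "sample"), all.x = TRUE)
missing_n <- sum(is.na(filled$mass))
```

If the real data lacks a within-year sample counter, one would have to be created before the merge so that the three key columns are unique.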

On Tue, Mar 10, 2015 at 2:50 PM, Sarah Goslee 
wrote:

> You said your data only had 14000 rows, which really isn't many.
>
> How many possible combinations do you have, and how many do you need to
> add?
>
> On Tue, Mar 10, 2015 at 4:35 PM, Curtis Burkhalter
>  wrote:
> > Sarah,
> >
> > This strategy works great for this small dataset, but when I attempt your
> > method with my data set I reach the maximum allowable memory allocation
> and
> > the operation just stalls and then stops completely before it is
> finished.
> > Do you know of a way around this?
> >
> > Thanks
> >
> > On Tue, Mar 10, 2015 at 2:04 PM, Sarah Goslee 
> > wrote:
> >>
> >> Hi,
> >>
> >> I didn't work through your code, because it looked overly complicated.
> >> Here's a more general approach that does what you appear to want:
> >>
> >> # use dput() to provide reproducible data please!
> >> comAn <- structure(list(animals = c("bird", "bird", "bird", "bird",
> >> "bird",
> >> "bird", "dog", "dog", "dog", "dog", "dog", "dog", "cat", "cat",
> >> "cat", "cat"), animalYears = c(1L, 1L, 1L, 2L, 2L, 2L, 1L, 1L,
> >> 1L, 2L, 2L, 2L, 1L, 1L, 1L, 2L), animalMass = c(29L, 48L, 36L,
> >> 20L, 34L, 34L, 21L, 28L, 25L, 35L, 18L, 11L, 46L, 33L, 48L, 21L
> >> )), .Names = c("animals", "animalYears", "animalMass"), class =
> >> "data.frame", row.names = c("1",
> >> "2", "3", "4", "5", "6", "7", "8", "9", "10", "11", "12", "13",
> >> "14", "15", "16"))
> >>
> >>
> >> # add reps to comAn
> >> # assumes comAn is already sorted on animals, animalYears
> >> comAn$reps <- unlist(sapply(rle(do.call("paste",
> >> comAn[,1:2]))$lengths, seq_len))
> >>
> >> # create full set of combinations
> >> outgrid <- expand.grid(animals=unique(comAn$animals),
> >> animalYears=unique(comAn$animalYears), reps=unique(comAn$reps),
> >> stringsAsFactors=FALSE)
> >>
> >> # combine with comAn
> >> comAn.full <- merge(outgrid, comAn, all.x=TRUE)
> >>
> >> > comAn.full
> >>    animals animalYears reps animalMass
> >> 1     bird           1    1         29
> >> 2     bird           1    2         48
> >> 3     bird           1    3         36
> >> 4     bird           2    1         20
> >> 5     bird           2    2         34
> >> 6     bird           2    3         34
> >> 7      cat           1    1         46
> >> 8      cat           1    2         33
> >> 9      cat           1    3         48
> >> 10     cat           2    1         21
> >> 11     cat           2    2         NA
> >> 12     cat           2    3         NA
> >> 13     dog           1    1         21
> >> 14     dog           1    2         28
> >> 15     dog           1    3         25
> >> 16     dog           2    1         35
> >> 17     dog           2    2         18
> >> 18     dog           2    3         11
> >> >
> >>
> >> On Tue, Mar 10, 2015 at 3:43 PM, Curtis Burkhalter
> >>  wrote:
> >> > Hey everyone,
> >> >
> >> > I've written a function that adds NAs to a dataframe where data is
> >> > missing
> >> > and it seems to work great if I only need to run it once, but if I run
> >> > it
> >> > two times in a row I run into problems. I've created a workable
> example
> >> > to
> >> > explain what I mean and why I would do this.
> >> >
> >> > In my dataframe there are areas where I need to add two rows of NAs
> (b/c
> >> > I
> >> > need to have 3 animal x year combos and for cat in year 2 I only have
> >> > one)
> >> > so I thought that I'd just run my code twice using the function in the
> >> > code
> >> > below. Everything works great when I run it the first time, but when I
> >> > run
> >> > it again it says that the value returned to the list 'x' is of length
> 0.
> >> > I
> >> > don't understand why the function works the first time around and adds
> >> > an
> >> > NA to the 'animalMass' column, but won't do it again. I've used
> >> > (print(str(dataframe)) to see if there is a change in class or type
> when
> >> > the function runs through the original dataframe and there is for
> >> > 'animalYears', but I just convert it back before rerunning the
> function
> >> > for
> >> > second time.
> >> >
> >> > Any thoughts on this would be greatly appreciated b/c my actual data
> >> > dataframe I have to input into WinBUGS is 14000x12, so it's not a
> >> > trivial
> >> > thing to just add in an NA here or there.
> >> >
> >> >>comAn
> >> >animals

Re: [R] problem applying the same function twice

2015-03-10 Thread Curtis Burkhalter
William,

You say not to use apply here, but what would you use in its place?

Thanks
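
One possible replacement for apply() here, sketched under assumptions: iterate over the columns of 'missing' in parallel with mapply(), so each column keeps its own type instead of being coerced to a character matrix. The 'target' argument is a hypothetical addition (the original genRows hard-codes 2), and the one-row 'missing' data frame below is made up for illustration.

```r
# Hypothetical 'missing' table: one under-sampled combination with 1 row present
missing <- data.frame(vals = "cat:2", count = 1L, stringsAsFactors = FALSE)

# Build the NA rows needed to bring one combination up to 'target' rows
genRows <- function(val, count, target) {
  parts <- strsplit(val, ":", fixed = TRUE)[[1]]
  n <- target - count
  data.frame(animals     = rep(parts[1], n),
             animalYears = rep(as.integer(parts[2]), n),
             animalMass  = rep(NA_integer_, n),
             stringsAsFactors = FALSE)
}

# mapply() passes vals as character and count as integer, unlike apply()
x <- mapply(genRows, missing$vals, missing$count,
            MoreArgs = list(target = 3L),
            SIMPLIFY = FALSE, USE.NAMES = FALSE)
newRows <- do.call(rbind, x)
```

The result is a list of data frames, as do.call(rbind, x) expects, and no as.numeric() conversion of the count is needed.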

On Tue, Mar 10, 2015 at 2:13 PM, William Dunlap  wrote:

> The key to your problem may be that
>x<-apply(missing,1,genRows)
> converts 'missing' to a matrix, with the same type for all columns
> then makes x either a list or a matrix but never a data.frame.
> Those features of apply may mess up the rest of your calculations.
>
> Don't use apply().
>
>
> Bill Dunlap
> TIBCO Software
> wdunlap tibco.com
>
> On Tue, Mar 10, 2015 at 12:43 PM, Curtis Burkhalter <
> curtisburkhal...@gmail.com> wrote:
>
>> Hey everyone,
>>
>> I've written a function that adds NAs to a dataframe where data is missing
>> and it seems to work great if I only need to run it once, but if I run it
>> two times in a row I run into problems. I've created a workable example to
>> explain what I mean and why I would do this.
>>
>> In my dataframe there are areas where I need to add two rows of NAs (b/c I
>> need to have 3 animal x year combos and for cat in year 2 I only have one)
>> so I thought that I'd just run my code twice using the function in the
>> code
>> below. Everything works great when I run it the first time, but when I run
>> it again it says that the value returned to the list 'x' is of length 0. I
>> don't understand why the function works the first time around and adds an
>> NA to the 'animalMass' column, but won't do it again. I've used
>> (print(str(dataframe)) to see if there is a change in class or type when
>> the function runs through the original dataframe and there is for
>> 'animalYears', but I just convert it back before rerunning the function
>> for
>> second time.
>>
>> Any thoughts on this would be greatly appreciated b/c my actual data
>> dataframe I have to input into WinBUGS is 14000x12, so it's not a trivial
>> thing to just add in an NA here or there.
>>
>> >comAn
>>animals animalYears animalMass
>> 1 bird   1 29
>> 2 bird   1 48
>> 3 bird   1 36
>> 4 bird   2 20
>> 5 bird   2 34
>> 6 bird   2 34
>> 7  dog   1 21
>> 8  dog   1 28
>> 9  dog   1 25
>> 10 dog   2 35
>> 11 dog   2 18
>> 12 dog   2 11
>> 13 cat   1 46
>> 14 cat   1 33
>> 15 cat   1 48
>> 16 cat   2 21
>>
>> So every animal has 3 measurements per year, except for the cat in year
>> two
>> which has only 1. I run the code below and get:
>>
>> #combs defines the different combinations of
>> #animals and animalYears
>> combs<-paste(comAn$animals,comAn$animalYears,sep=':')
>> #counts defines how long the different combinations are
>> counts<-ave(1:nrow(comAn),combs,FUN=length)
>> #missing defines the combs that have length less than one and puts it in
>> #the data frame missing
>> missing<-data.frame(vals=combs[counts<2],count=counts[counts<2])
>>
>> genRows<-function(dat){
>> vals<-strsplit(dat[1],':')[[1]]
>> #not sure why dat[2] is being converted to a string
>> newRows<-2-as.numeric(dat[2])
>> newDf<-data.frame(animals=rep(vals[1],newRows),
>>   animalYears=rep(vals[2],newRows),
>>   animalMass=rep(NA,newRows))
>> return(newDf)
>> }
>>
>>
>> x<-apply(missing,1,genRows)
>> comAn=rbind(comAn,
>> do.call(rbind,x))
>>
>> > comAn
>>animals animalYears animalMass
>> 1 bird   1 29
>> 2 bird   1 48
>> 3 bird   1 36
>> 4 bird   2 20
>> 5 bird   2 34
>> 6 bird   2 34
>> 7  dog   1 21
>> 8  dog   1 28
>> 9  dog   1 25
>> 10 dog   2 35
>> 11 dog   2 18
>> 12 dog   2 11
>> 13 cat   1 46
>> 14 cat   1 33
>> 15 cat   1 48
>> 16 cat   2 21
>> 17 cat   2   
>>
>> So far so good, but then I adjust the code so that it reads (**notice the
>> change in the specification in 'missing' to counts<3**):
>>
>> #combs defines the different combinations of
>> #animals and animalYears
>> combs<-paste(comAn$animals,comAn$animalYears,sep=':')
>> #counts defines how long the different combinations are
>> counts<-ave(1:nrow(comAn),combs,FUN=length)
>> #missing defines the combs that have length less than one and puts it in
>> #the data frame missing
>> missing<-data.frame(vals=combs[counts<3],count=counts[counts<3])
>>
>> genRows<-function(dat){
>> vals<-strsplit(dat[1],':')[[1]]
>> #not sure why dat[2] is being converted to a string
>> newRows<-2-as.numeric(dat[2])
>> newDf<-data.frame(

Re: [R] problem applying the same function twice

2015-03-10 Thread Sarah Goslee
Hi,

I didn't work through your code, because it looked overly complicated.
Here's a more general approach that does what you appear to want:

# use dput() to provide reproducible data please!
comAn <- structure(list(animals = c("bird", "bird", "bird", "bird", "bird",
"bird", "dog", "dog", "dog", "dog", "dog", "dog", "cat", "cat",
"cat", "cat"), animalYears = c(1L, 1L, 1L, 2L, 2L, 2L, 1L, 1L,
1L, 2L, 2L, 2L, 1L, 1L, 1L, 2L), animalMass = c(29L, 48L, 36L,
20L, 34L, 34L, 21L, 28L, 25L, 35L, 18L, 11L, 46L, 33L, 48L, 21L
)), .Names = c("animals", "animalYears", "animalMass"), class =
"data.frame", row.names = c("1",
"2", "3", "4", "5", "6", "7", "8", "9", "10", "11", "12", "13",
"14", "15", "16"))


# add reps to comAn
# assumes comAn is already sorted on animals, animalYears
comAn$reps <- unlist(sapply(rle(do.call("paste",
comAn[,1:2]))$lengths, seq_len))

# create full set of combinations
outgrid <- expand.grid(animals=unique(comAn$animals),
animalYears=unique(comAn$animalYears), reps=unique(comAn$reps),
stringsAsFactors=FALSE)

# combine with comAn
comAn.full <- merge(outgrid, comAn, all.x=TRUE)

> comAn.full
   animals animalYears reps animalMass
1     bird           1    1         29
2     bird           1    2         48
3     bird           1    3         36
4     bird           2    1         20
5     bird           2    2         34
6     bird           2    3         34
7      cat           1    1         46
8      cat           1    2         33
9      cat           1    3         48
10     cat           2    1         21
11     cat           2    2         NA
12     cat           2    3         NA
13     dog           1    1         21
14     dog           1    2         28
15     dog           1    3         25
16     dog           2    1         35
17     dog           2    2         18
18     dog           2    3         11
>
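
The reps step above assumes comAn is sorted on animals and animalYears. A possible variant that does not depend on row order, sketched with a small made-up data frame in deliberately shuffled order, uses ave() to number the rows within each animal x year group:

```r
# Hypothetical unsorted subset of the example data
comAn <- data.frame(
  animals     = c("cat", "bird", "cat", "bird", "cat"),
  animalYears = c(1L, 1L, 1L, 1L, 2L),
  animalMass  = c(46L, 29L, 33L, 48L, 21L),
  stringsAsFactors = FALSE
)

# Number the rows within each animals x animalYears group,
# regardless of where the rows sit in the data frame
comAn$reps <- ave(seq_len(nrow(comAn)),
                  comAn$animals, comAn$animalYears,
                  FUN = seq_along)
comAn$reps   # 1 1 2 2 1
```

The expand.grid() and merge() steps can then be used unchanged.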

On Tue, Mar 10, 2015 at 3:43 PM, Curtis Burkhalter
 wrote:
> Hey everyone,
>
> I've written a function that adds NAs to a dataframe where data is missing
> and it seems to work great if I only need to run it once, but if I run it
> two times in a row I run into problems. I've created a workable example to
> explain what I mean and why I would do this.
>
> In my dataframe there are areas where I need to add two rows of NAs (b/c I
> need to have 3 animal x year combos and for cat in year 2 I only have one)
> so I thought that I'd just run my code twice using the function in the code
> below. Everything works great when I run it the first time, but when I run
> it again it says that the value returned to the list 'x' is of length 0. I
> don't understand why the function works the first time around and adds an
> NA to the 'animalMass' column, but won't do it again. I've used
> (print(str(dataframe)) to see if there is a change in class or type when
> the function runs through the original dataframe and there is for
> 'animalYears', but I just convert it back before rerunning the function for
> second time.
>
> Any thoughts on this would be greatly appreciated b/c my actual data
> dataframe I have to input into WinBUGS is 14000x12, so it's not a trivial
> thing to just add in an NA here or there.
>
>>comAn
>animals animalYears animalMass
> 1 bird   1 29
> 2 bird   1 48
> 3 bird   1 36
> 4 bird   2 20
> 5 bird   2 34
> 6 bird   2 34
> 7  dog   1 21
> 8  dog   1 28
> 9  dog   1 25
> 10 dog   2 35
> 11 dog   2 18
> 12 dog   2 11
> 13 cat   1 46
> 14 cat   1 33
> 15 cat   1 48
> 16 cat   2 21
>
> So every animal has 3 measurements per year, except for the cat in year two
> which has only 1. I run the code below and get:
>
> #combs defines the different combinations of
> #animals and animalYears
> combs<-paste(comAn$animals,comAn$animalYears,sep=':')
> #counts defines how long the different combinations are
> counts<-ave(1:nrow(comAn),combs,FUN=length)
> #missing defines the combs that have length less than one and puts it in
> #the data frame missing
> missing<-data.frame(vals=combs[counts<2],count=counts[counts<2])
>
> genRows<-function(dat){
> vals<-strsplit(dat[1],':')[[1]]
> #not sure why dat[2] is being converted to a string
> newRows<-2-as.numeric(dat[2])
> newDf<-data.frame(animals=rep(vals[1],newRows),
>   animalYears=rep(vals[2],newRows),
>   animalMass=rep(NA,newRows))
> return(newDf)
> }
>
>
> x<-apply(missing,1,genRows)
> comAn=rbind(comAn,
> do.call(rbind,x))
>
>> comAn
>animals animalYears animalMass
> 1 bird   1 29
> 2 bird   1 48
> 3 bird   1 36
> 4 bird   2  

[R] problem applying the same function twice

2015-03-10 Thread Curtis Burkhalter
Hey everyone,

I've written a function that adds NAs to a dataframe where data is missing
and it seems to work great if I only need to run it once, but if I run it
two times in a row I run into problems. I've created a workable example to
explain what I mean and why I would do this.

In my dataframe there are areas where I need to add two rows of NAs (b/c I
need to have 3 animal x year combos and for cat in year 2 I only have one)
so I thought that I'd just run my code twice using the function in the code
below. Everything works great when I run it the first time, but when I run
it again it says that the value returned to the list 'x' is of length 0. I
don't understand why the function works the first time around and adds an
NA to the 'animalMass' column, but won't do it again. I've used
(print(str(dataframe)) to see if there is a change in class or type when
the function runs through the original dataframe and there is for
'animalYears', but I just convert it back before rerunning the function for
second time.

Any thoughts on this would be greatly appreciated b/c my actual data
dataframe I have to input into WinBUGS is 14000x12, so it's not a trivial
thing to just add in an NA here or there.

>comAn
   animals animalYears animalMass
1 bird   1 29
2 bird   1 48
3 bird   1 36
4 bird   2 20
5 bird   2 34
6 bird   2 34
7  dog   1 21
8  dog   1 28
9  dog   1 25
10 dog   2 35
11 dog   2 18
12 dog   2 11
13 cat   1 46
14 cat   1 33
15 cat   1 48
16 cat   2 21

So every animal has 3 measurements per year, except for the cat in year two
which has only 1. I run the code below and get:

#combs defines the different combinations of
#animals and animalYears
combs<-paste(comAn$animals,comAn$animalYears,sep=':')
#counts defines how long the different combinations are
counts<-ave(1:nrow(comAn),combs,FUN=length)
#missing defines the combs that have length less than one and puts it in
#the data frame missing
missing<-data.frame(vals=combs[counts<2],count=counts[counts<2])

genRows<-function(dat){
vals<-strsplit(dat[1],':')[[1]]
#not sure why dat[2] is being converted to a string
newRows<-2-as.numeric(dat[2])
newDf<-data.frame(animals=rep(vals[1],newRows),
  animalYears=rep(vals[2],newRows),
  animalMass=rep(NA,newRows))
return(newDf)
}


x<-apply(missing,1,genRows)
comAn=rbind(comAn,
do.call(rbind,x))

> comAn
   animals animalYears animalMass
1 bird   1 29
2 bird   1 48
3 bird   1 36
4 bird   2 20
5 bird   2 34
6 bird   2 34
7  dog   1 21
8  dog   1 28
9  dog   1 25
10 dog   2 35
11 dog   2 18
12 dog   2 11
13 cat   1 46
14 cat   1 33
15 cat   1 48
16 cat   2 21
17 cat   2   

So far so good, but then I adjust the code so that it reads (**notice the
change in the specification in 'missing' to counts<3**):

#combs defines the different combinations of
#animals and animalYears
combs<-paste(comAn$animals,comAn$animalYears,sep=':')
#counts defines how long the different combinations are
counts<-ave(1:nrow(comAn),combs,FUN=length)
#missing defines the combs that have length less than one and puts it in
#the data frame missing
missing<-data.frame(vals=combs[counts<3],count=counts[counts<3])

genRows<-function(dat){
vals<-strsplit(dat[1],':')[[1]]
#not sure why dat[2] is being converted to a string
newRows<-2-as.numeric(dat[2])
newDf<-data.frame(animals=rep(vals[1],newRows),
  animalYears=rep(vals[2],newRows),
  animalMass=rep(NA,newRows))
return(newDf)
}


x<-apply(missing,1,genRows)
comAn=rbind(comAn,
do.call(rbind,x))

The result for 'x' then reads:

> x
[[1]]
[1] animals animalYears animalMass
<0 rows> (or 0-length row.names)

Any thoughts on why it might be doing this instead of adding an additional
row to get the result:

> comAn
   animals animalYears animalMass
1 bird   1 29
2 bird   1 48
3 bird   1 36
4 bird   2 20
5 bird   2 34
6 bird   2 34
7  dog   1 21
8  dog   1 28
9  dog   1 25
10 dog   2 35
11 dog   2 18
12 dog   2   
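
A likely explanation for the empty result, offered as a sketch rather than a definitive diagnosis: genRows still computes newRows as 2 minus the count, so after the first pass (cat in year 2 now has 2 rows) it asks for 2 - 2 = 0 new rows, even though the filter was changed to counts<3. The data below reproduce the state after the first pass; needed_old and needed_new are names introduced here for illustration.

```r
# State of comAn after the first pass: cat in year 2 has 2 rows (one is NA)
comAn <- data.frame(
  animals     = c(rep("bird", 6), rep("dog", 6), rep("cat", 5)),
  animalYears = c(1, 1, 1, 2, 2, 2,  1, 1, 1, 2, 2, 2,  1, 1, 1, 2, 2),
  animalMass  = c(29, 48, 36, 20, 34, 34, 21, 28, 25, 35, 18, 11,
                  46, 33, 48, 21, NA),
  stringsAsFactors = FALSE
)

combs  <- paste(comAn$animals, comAn$animalYears, sep = ":")
counts <- ave(seq_len(nrow(comAn)), combs, FUN = length)

# Keep one entry per under-sampled combination (the original keeps duplicates)
short <- !duplicated(combs) & counts < 3

needed_old <- 2 - counts[short]   # hard-coded 2: zero rows requested
needed_new <- 3 - counts[short]   # matching the threshold: one row requested
```

Using the same number in the filter and in the subtraction (3 in both places) would request the one remaining row for cat in year 2.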

Re: [R] Error: cannot allocate vector of size 64.0 Mb When Using Read.zoo()

2015-03-10 Thread 李倩雯
I don't think so. I removed all variables except the data I wanted to use
and tried gc() to free some memory, but the error still occurred.
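
A rough way to judge whether the result can fit in memory, assuming (as the read.zoo documentation appears to describe) that split=3 produces a wide object with one row per distinct date and one column per distinct value of the third column. The tiny 'origin' below is made up; the 8 bytes per cell assumes numeric storage and ignores intermediate copies made during the conversion.

```r
# Hypothetical miniature of 'origin': date string, numeric value, series label
origin <- data.frame(
  date  = c("01/02/2015", "01/02/2015", "01/03/2015", "01/04/2015"),
  value = c(1.5, 2.5, 3.5, 4.5),
  id    = c("A", "B", "A", "B"),
  stringsAsFactors = FALSE
)

n_dates  <- length(unique(origin[[1]]))   # rows of the wide result
n_series <- length(unique(origin[[3]]))   # columns of the wide result

# Lower bound on the size of the wide numeric matrix, in bytes
est_bytes <- n_dates * n_series * 8

# Duplicate (date, series) pairs are worth checking before reshaping
dup_keys <- any(duplicated(origin[c(1, 3)]))
```

With many distinct series the wide result can be far larger than the long data frame, even when the input itself fits comfortably.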

Regards,
Jasmine
On 10 Mar, 2015 10:49 pm, "Uwe Ligges" 
wrote:

>
>
> On 10.03.2015 04:16, 李倩雯 wrote:
>
>> Hi all,
>>
>> *Problem Description*
>> I encountered the *Error: cannot allocate vector of size 64.0 Mb* when I
>> was using read.zoo to convert a data.frame called 'origin' to zoo object
>> named 'target'
>>
>> *About the Data & Code*
>> My data frame(origin) contains 5340191 obs. of 3 variables[Data,
>> Numeric,Character]
>> The code looks like
>> *target<-read.zoo(origin,format="%m/%d/%Y",index.column=1,split=3)*
>>
>> *SessionInfo:*
>> R version 3.1.2 (2014-10-31)
>> Platform: x86_64-w64-mingw32/x64 (64-bit)
>> Installed memory: 4.00 GB (3.82 GB usable)
>> Result of memory.size() : 3812.85
>>
>
> I guess you have lots of stuff in your workspace? Clean that up and try
> again.
>
> Best,
> Uwe Ligges
>
>
>
>> I tried to calculate the required memory, but I don't know which
>> operations the conversion involves. I therefore have no idea whether my
>> data is too large to handle or whether I am using an inefficient method.
>> Can anyone help me with this problem?
>>
>> By the way, as this is the first time I have turned to a mailing list for
>> help, I am not sure whether I am asking in the right manner. Please tell
>> me if you have any suggestions. Thank you.
>>
>>
>> Best regards,
>> Jasmine
>>
>> [[alternative HTML version deleted]]
>>
>> __
>> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide http://www.R-project.org/
>> posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>>
>>

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] How to access https page

2015-03-10 Thread Jeroen Ooms
On Mon, Mar 9, 2015 at 3:39 PM, Hui Du  wrote:

> > readLines(url)
> Error in file(con, "r") : cannot open the connection
> In addition: Warning message:
> In file(con, "r") : unsupported URL scheme
>

Try:

library(curl)
readLines(curl(url))



Re: [R] svg2swf - controlling the looping of flash files

2015-03-10 Thread Paul Sweeting
Hi Yixuan

 

Thanks for your reply. I think it would be useful to have a “loop = FALSE”
option in this function.  However, I’m not sure how long a shelf life
swf files will have, given everything seems to be moving away from flash…

 

Paul

 

From: Yixuan Qiu [mailto:yixuan@cos.name] 
Sent: 10 March 2015 00:20
To: Paul Sweeting
Cc: r-help
Subject: Re: [R] svg2swf - controlling the looping of flash files

 

Hello Paul,

So far there is no way to stop the animation after its first run. If this 
feature is needed I could try to implement it in the future version of R2SWF.



Best,

Yixuan

 

2015-03-09 18:33 GMT-04:00 Paul Sweeting <paul.j.sweet...@gmail.com>:

Hi



I'm using svg2swf to collate a number of svg outputs into an swf file.  I've
got this working (mainly.) except that I can't control the looping behaviour
of the swf file.  In other words, when it's loaded into html it loops
continuously.  Is there any way to stop the animation looping, so it just
plays through once when loaded?  The code I use is (broadly):



   svg("testplot%d.svg", onefile = FALSE)
   for(j in 1:360){
      print(cloud(x~y*z, groups=tail, data=norm_dots_chart,
                  screen=list(z=0,x=0,y=j)))
   }
   dev.off()
   output = svg2swf(sprintf("testplot%d.svg", 1:360), interval = 0.04)
   swf2html(output)



Thank you!






-- 

Yixuan Qiu <yixuan@cos.name>
Department of Statistics,
Purdue University



Re: [R] Error: cannot allocate vector of size 64.0 Mb When Using Read.zoo()

2015-03-10 Thread Uwe Ligges



On 10.03.2015 04:16, 李倩雯 wrote:

Hi all,

*Problem Description*
I encountered the *Error: cannot allocate vector of size 64.0 Mb* when I
was using read.zoo to convert a data.frame called 'origin' to zoo object
named 'target'

*About the Data & Code*
My data frame(origin) contains 5340191 obs. of 3 variables[Data,
Numeric,Character]
The code looks like
*target<-read.zoo(origin,format="%m/%d/%Y",index.column=1,split=3)*

*SessionInfo:*
R version 3.1.2 (2014-10-31)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Installed memory: 4.00 GB (3.82 GB usable)
Result of memory.size() : 3812.85


I guess you have lots of stuff in your workspace? Clean that up and try
again.


Best,
Uwe Ligges




I tried to calculate the required memory, but I don't know which
operations the conversion involves. I therefore have no idea whether my data
is too large to handle or whether I am using an inefficient method. Can anyone
help me with this problem?

By the way, as this is the first time I have turned to a mailing list for
help, I am not sure whether I am asking in the right manner. Please tell me
if you have any suggestions. Thank you.


Best regards,
Jasmine





Re: [R] Error in svychisq and svyttest with svrepdesign

2015-03-10 Thread Anthony Damico
hi anabela, please provide a complete reproducible example.  you need to
use ?dput  -- we are not able to import "dadosSPSS.sav" so we cannot
recreate your problem in order to help you.  thanks!

http://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example



On Tue, Mar 10, 2015 at 9:31 AM, Anabela Afonso  wrote:

> Dear Forum,
>
> I’m working with a complex sample and with replication weights. I defined
> my design svrepdesign function. I’m trying to run svychisq and
> svyttest function
> from the survey package and I get the error:
>
>
>
> Error in crossprod(x, y) :
>
>   requires numeric/complex matrix/vector arguments
>
>
>
> I can’t understand this error. I kindly ask if someone can help me out.
>
>
>
> Thanks in advance,
>
>
>
>
> Here is my code and some output:
>
> > library(foreign); library(survey)
>
> > dados<-read.spss("dadosSPSS.sav", use.value.labels=T, to.data.frame=T)
>
> > class(dados)
>
> [1] "data.frame"
>
> > str(dados)
>
> 'data.frame':7624 obs. of  4 variables:
>
>  $ Sex : Factor w/ 2 levels "Male","Female": 1 1 1 1 1 1 1 1 1 1 ...
>
>  $ Computer: Factor w/ 2 levels "Yes","NO": 1 1 2 1 1 1 1 1 1 2 ...
>
>  $ Color   : Factor w/ 3 levels "Red","Green",..: 1 1 1 1 1 1 1 1 1 1 ...
>
>  $ Number  : num  2 1 0 2 1 2 1 2 1 0 ...
>
>  $ final.w : num  1267 596 1143 1069 542 ...
>
> # Note: Variable Color with NA
>
> > repdes<-svrepdesign(data=dados, repweights=rep.w, scale=1, rscales=r.sc,
> type="JKn", weights=~final.w, combined.weights=F)
>
> > summary(repdes)
>
> Call: svrepdesign.default(data = dados, repweights = rep.w, scale = 1,
>
> rscales = r.sc, type = "JKn", weights = ~final.w, combined.weights =
> F)
>
> Stratified cluster jackknife (JKn) with 428 replicates.
>
> Variables:
>
> [1] "Sex"  "Computer" "Color""Number"   "final.w"
>
>
>
> > svytable(~Sex+Computer, repdes)
>
> Computer
>
> SexYesNO
>
>   Male   1501598.7 1063055.3
>
>   Female 1485933.1  810557.9
>
>
>
> > svytable(~Sex+Color, repdes)  # NA are ignored
>
> Color
>
> SexRed GreenYellow
>
>   Male   2060708.5  219678.4  286038.6
>
>   Female 1840511.7  229763.8  22.0
>
>
>
> > svychisq(~Sex+Computer, repdes)
>
> Error in crossprod(x, y) :
>
>   requires numeric/complex matrix/vector arguments
>
>
>
> > svychisq(~Sex+Color, repdes)
>
> Error in crossprod(x, y) :
>
>   requires numeric/complex matrix/vector arguments
>
>
>
> > svyttest(Number ~Sex, repdes)
>
> Error in crossprod(x, y) :
>
>   requires numeric/complex matrix/vector arguments
>
> [[alternative HTML version deleted]]
>
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.


[R] Error in svychisq and svyttest with svrepdesign

2015-03-10 Thread Anabela Afonso
Dear Forum,

I’m working with a complex sample and with replication weights. I defined
my design with the svrepdesign function. I’m trying to run the svychisq and
svyttest functions from the survey package and I get the error:



Error in crossprod(x, y) :

  requires numeric/complex matrix/vector arguments



I can’t understand this error. I kindly ask if someone can help me out.



Thanks in advance,




Here is my code and some output:

> library(foreign); library(survey)
> dados<-read.spss("dadosSPSS.sav", use.value.labels=T, to.data.frame=T)
> class(dados)
[1] "data.frame"
> str(dados)
'data.frame':   7624 obs. of  4 variables:
 $ Sex     : Factor w/ 2 levels "Male","Female": 1 1 1 1 1 1 1 1 1 1 ...
 $ Computer: Factor w/ 2 levels "Yes","NO": 1 1 2 1 1 1 1 1 1 2 ...
 $ Color   : Factor w/ 3 levels "Red","Green",..: 1 1 1 1 1 1 1 1 1 1 ...
 $ Number  : num  2 1 0 2 1 2 1 2 1 0 ...
 $ final.w : num  1267 596 1143 1069 542 ...
# Note: Variable Color with NA

> repdes<-svrepdesign(data=dados, repweights=rep.w, scale=1, rscales=r.sc,
+ type="JKn", weights=~final.w, combined.weights=F)
> summary(repdes)
Call: svrepdesign.default(data = dados, repweights = rep.w, scale = 1,
    rscales = r.sc, type = "JKn", weights = ~final.w, combined.weights = F)
Stratified cluster jackknife (JKn) with 428 replicates.
Variables:
[1] "Sex"      "Computer" "Color"    "Number"   "final.w"

> svytable(~Sex+Computer, repdes)
        Computer
Sex            Yes        NO
  Male   1501598.7 1063055.3
  Female 1485933.1  810557.9

> svytable(~Sex+Color, repdes)  # NA are ignored
        Color
Sex            Red     Green    Yellow
  Male   2060708.5  219678.4  286038.6
  Female 1840511.7  229763.8      22.0

> svychisq(~Sex+Computer, repdes)
Error in crossprod(x, y) :
  requires numeric/complex matrix/vector arguments

> svychisq(~Sex+Color, repdes)
Error in crossprod(x, y) :
  requires numeric/complex matrix/vector arguments

> svyttest(Number ~Sex, repdes)
Error in crossprod(x, y) :
  requires numeric/complex matrix/vector arguments


[R] C#

2015-03-10 Thread Keith S Weintraub
I will keep this short as this might be the wrong list:

I have found one (beta) project that allows R to interface with C#.

Are there others? Any favorites?

Best,
KW



Re: [R] Add sum line to plot of multiple x values

2015-03-10 Thread PIKAL Petr
Hi

Yes, your understanding is correct

the same can be achieved by:

p+geom_point(data=d.ag,aes(x=Group.1,y=x), size=5)+
geom_line(data=d.ag,aes(x=1:3, y=x))

because a factor given as the x aesthetic is placed at positions 1 to the
number of levels.
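
The factor-to-position mapping can be checked in base R without plotting. This is only a sketch using the example data from this thread, with the date column created as a factor explicitly (stringsAsFactors = TRUE is set here so the result does not depend on the R version's default):

```r
people <- c("alice", "bob", "carol")
d <- data.frame(
  user  = rep(people, 3),
  files = c(18, 5, 21, 22, 9, 14, 26, 3, 22),
  date  = c(rep("2013-09-15", 3), rep("2013-09-08", 3), rep("2013-09-01", 3)),
  stringsAsFactors = TRUE
)

# Sum of files over all users for each date; Group.1 holds the date factor
d.ag <- aggregate(d$files, list(d$date), sum)

# Integer codes of the factor: these are the x positions 1..nlevels
pos <- as.integer(d.ag$Group.1)
pos      # 1 2 3
d.ag$x   # 51 45 44 (2013-09-01, 2013-09-08, 2013-09-15)
```

This is why x=1:3 lines up with the three dates on the discrete axis.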

The whole code is

library(ggplot2)
people <- c("alice","bob","carol")
user <- c(rep(people,3))
files <- c(18,5,21,22,9,14,26,3,22)
date <- c(rep("2013-09-15",3),rep("2013-09-08",3),rep("2013-09-01",3))
d <- data.frame(user=user,files=files,date=date)
p <- ggplot()
p <- p + geom_line(data=d,aes(x=date,y=files,group=user,colour=user))
d.ag<-aggregate(d$files, list(d$date), sum)
p+geom_point(data=d.ag,aes(x=Group.1,y=x), size=5)+
geom_line(data=d.ag,aes(x=1:3, y=x))

Cheers
Petr

> -Original Message-
> From: R-help [mailto:r-help-boun...@r-project.org] On Behalf Of Loris
> Bennett
> Sent: Tuesday, March 10, 2015 1:43 PM
> To: r-h...@stat.math.ethz.ch
> Subject: Re: [R] Add sum line to plot of multiple x values
>
> Loris Bennett  writes:
>
> > Hi Petr,
> >
> > See inline.
> >
> > PIKAL Petr  writes:
> >
> >> Hi
> >>
> >> see inline
> >>
> >>> -Original Message-
> >>> From: R-help [mailto:r-help-boun...@r-project.org] On Behalf Of
> >>> Loris Bennett
> >>> Sent: Monday, March 09, 2015 4:35 PM
> >>> To: r-h...@stat.math.ethz.ch
> >>> Subject: Re: [R] Add sum line to plot of multiple x values
> >>>
> >>> PIKAL Petr  writes:
> >>>
> >>> > Hi
> >>> >
> >>> > Not extremely clear what do you want to plot. Do you want to add
> a
> >>> > line which marks total number of files each day regardless of
> user?
> >>> Or
> >>> > a total number of files regardless of date coloured by user?
> >>>
> >>> Sorry, I was unclear.  I meant that I would like to plot the
> following:
> >>>
> >>> 1. For each user: the number of files for each date (my code does
> >>>    this)
> >>> 2. The sum of files of all users for each date (this is what I still
> >>>    need)
> >>>
> >>> > In each case you shall search functions geom_hline or geom_abline
> >>> >
> >>> > http://stackoverflow.com/questions/13254441/add-a-horizontal-line-to-plot-and-legend-in-ggplot2
> >>>
> >>> So I don't want a straight line
> >>
> >> but in your code is
> >>
> >> geom_line(data=d,aes(x=date,y=sum(files),group=date),colour='black')
> >>
> >> so you apparently want some sort of line.
> >
> > Yes, but see below.
> >
> >> anyway, if I do
> >>
> >> d.ag<-aggregate(d$files, list(d$date), sum)
> >>
> >> I can add
> >>
> >> p+geom_point(data=d.ag,aes(x=Group.1,y=x), size=5)
> >>
> >> and I get summary points.
> >
> > Thanks, this works.
> >
> >> If you want lines you can do
> >>
> >> p+geom_hline(data=d.ag,aes(yintercept=x, colour=Group.1))
> >>
> >> or you can fiddle with geom_segment
> >
> > I don't want an hline, just a line joining the dots I get using
> > geom_point.  I thought something like
> >
> >   p + geom_line(data=d.ag,aes(x=as.character(Group.1),y=x))
> >
> > would work.  However, while I get a plot with axes labelled in the
> > correct ranges, no line is plotted.  Explicitly setting the colour
> > with
> >
> >   p +
> > geom_line(data=d.ag,aes(x=as.character(Group.1),y=x),colour="red")
> >
> > doesn't help.  What am I doing wrong?
>
> I found the answer by googling "geom_line doesn’t draw lines".  The
> following does what I want:
>
>   ggplot(data=d.ag,aes(x=as.character(Group.1),y=x,group=1)) + geom_line()
>
> My understanding is that, because 'Group.1' is a factor, the 'x' values
> are not considered as belonging to the same group and geom_line only
> connects points within a group.
>
> >>> > ggplot is rather complicated but very flexible
> >>>
> >>> I don't mind ggplot being complicated, but I find the documentation
> >>> a little impenetrable.
> >>
> >> You can find plenty of help when you just try to google on the item
> >> searching. Actually this is what I do when the solution is not
> >> obvious or requires some hidden instruction.
> >
> > This is what I normally resort to with varying degrees of success.
> It
> > just seems a bit of a shame the some of the documentation for such a
> > good piece of software does indeed appear to be rather "hidden".
> >
> >> Cheers
> >> Petr
> >>
> >>>
> >>> Cheers,
> >>>
> >>> Loris
> >>>
> >>>
> >>> >> -Original Message-
> >>> >> From: R-help [mailto:r-help-boun...@r-project.org] On Behalf Of
> >>> Loris
> >>> >> Bennett
> >>> >> Sent: Monday, March 09, 2015 2:56 PM
> >>> >> To: r-h...@stat.math.ethz.ch
> >>> >> Subject: [R] Add sum line to plot of multiple x values
> >>> >>
> >>> >> Hi,
> >>> >>
> >>> >> Here are my data:
> >>> >>
> >>> >> > d
> >>> >>    user files       date
> >>> >> 1 alice    18 2013-09-15
> >>> >> 2   bob     5 2013-09-15
> >>> >> 3 carol    21 2013-09-15
> >>> >> 4 alice    22 2013-09-08
> >>> >> 5   bob     9 2013-09-08
> >>> >> 6 carol    14 2013-09-08
> >>> >> 7 alice    26 2013-09-01
> >>> >> 8   bob     3 2013-09-01
> >>> >> 9 carol    22 2013-09-01
> >>> >>
> >>> >> I would like to plot the number of file

Re: [R] Help with optim() to maximize log-likelihood

2015-03-10 Thread Prof J C Nash (U30A)
1) It helps to include the require statements for those of us who work
outside your particular box.
   lme4 and (as far as I can guess) fastGHQuad
are needed.

2) Most nonlinear functions have domains where they cannot be
evaluated. I'd be richer than Warren Buffett if I got $5 for
each time someone said "your optimizer doesn't work" and I
found   f(start, ...) was NaN or Inf, as in this case, i.e.,

 start <- c(plogis(sum(Y/m)),log(sigma2H))
 cat("starting params:")
 print(start)
 tryf0 <- ll(start,Y,m)
 print(tryf0)


It really is worthwhile actually computing your function at the initial
parameters EVERY time. (Or turn on the trace etc.)
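
A minimal sketch of that habit, using a toy normal log-likelihood rather
than the poster's ll(). The helper name check_start is hypothetical and is
not part of optim() or any package; it only shows the idea of evaluating the
objective once before handing it to the optimizer.

```r
# Toy data and a negative log-likelihood in (mean, sd); sd must be > 0.
set.seed(1)
y <- rnorm(50, mean = 3, sd = 2)
negll <- function(par) -sum(dnorm(y, mean = par[1], sd = par[2], log = TRUE))

# Hypothetical helper: evaluate fn at the start and refuse to continue
# unless the result is a single finite number.
check_start <- function(fn, start, ...) {
  val <- suppressWarnings(fn(start, ...))
  if (length(val) != 1L || !is.finite(val)) {
    stop("objective is not finite at the starting values: ",
         paste(format(start), collapse = ", "))
  }
  val
}

bad_start  <- c(0, -1)            # negative sd: dnorm() returns NaN
good_start <- c(mean(y), sd(y))

res_bad <- tryCatch(check_start(negll, bad_start), error = function(e) e)
f_good  <- check_start(negll, good_start)
fit     <- optim(good_start, negll)   # only run once the start is usable
```

With the bad start the check stops with an informative message instead of
optim()'s "function cannot be evaluated at initial parameters"; with the
good start optim() runs as usual.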

JN

On 15-03-10 07:00 AM, r-help-requ...@r-project.org wrote:
> Message: 12
> Date: Mon, 9 Mar 2015 16:18:06 +0200
> From: Sophia Kyriakou 
> To: r-help@r-project.org
> Subject: [R] Help with optim() to maximize log-likelihood
> 
> hello, I am using the optim function to maximize the log likelihood of a
> generalized linear mixed model and I am trying to replicate glmer's
> estimated components. If I set both the sample and subject size to q=m=100
> I replicate glmer's results for the random intercept model with parameters
>  beta=-1 and sigma^2=1. But if I change beta to 2 glmer works and optim
> gives me the error message "function cannot be evaluated at initial
> parameters".
> 
> If anyone could please help?
> Thanks
> 
>  # likelihood function
>  ll <- function(x,Y,m){
>  beta <- x[1]
>  psi <- x[2]
>  q <- length(Y)
>   p <- 20
>  rule20 <- gaussHermiteData(p)
>  wStar <- exp(rule20$x * rule20$x + log(rule20$w))
>  # Integrate over(-Inf, +Inf) using adaptive Gauss-Hermite quadrature
>  g <- function(alpha, beta, psi, y, m) {-y+m*exp(alpha + beta)/(1 +
> exp(alpha + beta)) + alpha/exp(psi)}
>  DDfLik <- deriv(expression(-y+m*exp(alpha + beta)/(1 + exp(alpha + beta))
> + alpha/exp(psi)),
>  namevec = "alpha", func = TRUE,function.arg = c("alpha", "beta", "psi",
> "y", "m"))
>int0 <- rep(NA,q)
>  piYc_ir <- matrix(NA,q,p)
>  for (i in 1:q){
>  muHat <- uniroot(g, c(-10, 10),extendInt ="yes", beta = beta, psi = psi, y
> = Y[i], m = m)$root
>  jHat <- attr(DDfLik(alpha = muHat, beta, psi, Y[i], m), "gradient")
>  sigmaHat <- 1/sqrt(jHat)
>  z <- muHat + sqrt(2) * sigmaHat * rule20$x
>  piYc_ir[i,] <-
> choose(m,Y[i])*exp(Y[i]*(z+beta))*exp(-z^2/(2*exp(psi)))/((1+exp(z+beta))^m*sqrt(2*pi*exp(psi)))
>  int0[i] <- sqrt(2)*sigmaHat*sum(wStar*piYc_ir[i,])
>  }
>  ll <- -sum(log(int0))
>  ll
>  }
> 
>  beta <- 2
>  sigma2 <- 1
>  m <- 100
>  q <- 100
> 
>  cl <- seq.int(q)
>  tot <- rep(m,q)
> 
>  set.seed(123)
>  alpha <- rnorm(q, 0, sqrt(sigma2))
>  Y <- rbinom(q,m,plogis(alpha+beta))
> 
>  dat <- data.frame(y = Y, tot = tot, cl = cl)
>  f1 <- glmer(cbind(y, tot - y) ~ 1 + (1 | cl), data = dat,family =
> binomial(),nAGQ = 20)
>  betaH <- summary(f1)$coefficients[1]
>  sigma2H <- as.numeric(summary(f1)$varcor)
>  thetaglmer <- c(betaH,sigma2H)
> 
>  logL <- function(x) ll(x,Y,m)
>  thetaMLb <- optim(c(plogis(sum(Y/m)),log(sigma2H)),fn=logL)$par
>  Error in optim(c(plogis(sum(Y/m)), log(sigma2H)), fn = logL) :  function
> cannot be evaluated at initial parameters
> 
> thetaglmer
> [1] 2.1128529 0.8311484
>  (thetaML <- c(thetaMLb[1],exp(thetaMLb[2])))
> 


Re: [R] logit in "car" package

2015-03-10 Thread John Fox
Dear Kathryn,

On Tue, 10 Mar 2015 03:24:03 -0700 (PDT)
 kat123  wrote:
> I have run a logit data transformation in R using the logit function in the
> package car.
> 
> http://cran.r-project.org/web/packages/car/car.pdf
> 
> If I run logit on a column of data that contains a 0 value, it makes an
> adjustment of 0.025, according to the literature.
> 
> I thought this meant that it was running the transformation as 
> 
> log((p+0.025)/ (1-(p+0.025)))

It's not that simple -- think what would happen if there were both 0s and 1s in 
the data:

> p <- 0:1

> log((p+0.025)/ (1-(p+0.025)))
[1] -3.663562   NaN
Warning message:
In log((p + 0.025)/(1 - (p + 0.025))) : NaNs produced

> logit(p)
[1] -3.663562  3.663562
Warning message:
In logit(p) : proportions remapped to (0.025, 0.975)

> log(c(0.025, 0.975)/(1 - c(0.025, 0.975)))
[1] -3.663562  3.663562


To see what logit() does simply print it by typing logit at the command prompt.
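
One remapping that is consistent with the output above is a linear squeeze
of [0, 1] onto [0.025, 0.975]. The sketch below is an assumption made for
illustration, not a copy of car's source; remap_logit is a hypothetical
name, and printing logit remains the way to confirm what car actually does.

```r
# Hypothetical re-implementation for illustration; not car's source code.
remap_logit <- function(p, adjust = 0.025) {
  stopifnot(all(p >= 0 & p <= 1))
  a <- 1 - 2 * adjust
  q <- 0.5 + a * (p - 0.5)   # maps 0 -> adjust, 1 -> 1 - adjust, 0.5 -> 0.5
  log(q / (1 - q))
}

p <- c(0, 0.25, 0.5, 1)
out_remap <- remap_logit(p)
# The "add 0.025" formula from the question, for comparison; it fails at p = 1.
out_naive <- suppressWarnings(log((p + 0.025) / (1 - (p + 0.025))))
```

The two versions agree only at p = 0. For every other value the remapped
proportion differs from p + 0.025, which would explain why individual
values run through the simple formula do not match.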

I hope this helps,
 John


John Fox, Professor
McMaster University
Hamilton, Ontario, Canada
http://socserv.mcmaster.ca/jfox/

> 
> However, if I run individual values through this equation they do not match
> up to the output of the logit function.
> 
> Any suggestions?
> 
>  
> 
> 
> 
> --
> View this message in context: 
> http://r.789695.n4.nabble.com/logit-in-car-package-tp4704408.html
> Sent from the R help mailing list archive at Nabble.com.
> 


Re: [R] Are there any implemented function for A/B testing?

2015-03-10 Thread Sarah Goslee
You already asked this, and show no signs of having either read the
responses or put any effort into trying to find out yourself.

Go to
http://rseek.org
and search for
"a/b testing"

Read the results, try out the examples. After that, if you still have
specific R questions, this list is the place to come for help. But you've
got to put some effort in yourself.

Sarah

On Tuesday, March 10, 2015, Namratha K  wrote:

> Is there any method or built-in function for implementing A/B testing in
> R? Have any functions been developed to implement A/B testing in R?
>


-- 
Sarah Goslee
http://www.stringpage.com
http://www.sarahgoslee.com
http://www.functionaldiversity.org



Re: [R] Add sum line to plot of multiple x values

2015-03-10 Thread Loris Bennett
Loris Bennett  writes:

> Hi Petr,
>
> See inline.
>
> PIKAL Petr  writes:
>
>> Hi
>>
>> see inline
>>
>>> -Original Message-
>>> From: R-help [mailto:r-help-boun...@r-project.org] On Behalf Of Loris
>>> Bennett
>>> Sent: Monday, March 09, 2015 4:35 PM
>>> To: r-h...@stat.math.ethz.ch
>>> Subject: Re: [R] Add sum line to plot of multiple x values
>>>
>>> PIKAL Petr  writes:
>>>
>>> > Hi
>>> >
>>> > Not extremely clear what do you want to plot. Do you want to add a
>>> > line which marks total number of files each day regardless of user?
>>> Or
>>> > a total number of files regardless of date coloured by user?
>>>
>>> Sorry, I was unclear.  I meant that I would like to plot the following:
>>>
>>> 1. For each user: the number of files for each date (my code does this)
>>> 2. The sum of files of all users for each date (this is what I still
>>>need)
>>>
>>> > In each case you shall search functions geom_hline or geom_abline
>>> >
>>> > http://stackoverflow.com/questions/13254441/add-a-horizontal-line-to-plot-and-legend-in-ggplot2
>>>
>>> So I don't want a straight line
>>
>> but in your code is
>>
>>>> geom_line(data=d,aes(x=date,y=sum(files),group=date),colour='black')
>>
>> so you apparently want some sort of line.
>
> Yes, but see below.
>
>> anyway, if I do
>>
>> d.ag<-aggregate(d$files, list(d$date), sum)
>>
>> I can add
>>
>> p+geom_point(data=d.ag,aes(x=Group.1,y=x), size=5)
>>
>> and I get summary points.
>
> Thanks, this works.
>
>> If you want lines you can do
>>
>> p+geom_hline(data=d.ag,aes(yintercept=x, colour=Group.1))
>>
>> or you can fiddle with geom_segment
>
> I don't want an hline, just a line joining the dots I get using
> geom_point.  I thought something like
>
>   p + geom_line(data=d.ag,aes(x=as.character(Group.1),y=x))
>
> would work.  However, while I get a plot with axes labelled in the
> correct ranges, no line is plotted.  Explicitly setting the colour with
>
>   p + geom_line(data=d.ag,aes(x=as.character(Group.1),y=x),colour="red")
>
> doesn't help.  What am I doing wrong?

I found the answer by googling "geom_line doesn’t draw lines".  The
following does what I want:

  ggplot(data=d.ag,aes(x=as.character(Group.1),y=x,group=1)) + geom_line()

My understanding is that, because 'Group.1' is a factor, the 'x' values
are not considered as belonging to the same group and geom_line only
connects points within a group.
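
A sketch of how this fits into the full plot, reusing the data from the
thread. The requireNamespace() guard is an addition so that the data step
still runs where ggplot2 is not installed; with a discrete x axis, group=1
is what tells geom_line to join all the totals into a single line.

```r
d <- data.frame(
  user  = rep(c("alice", "bob", "carol"), 3),
  files = c(18, 5, 21, 22, 9, 14, 26, 3, 22),
  date  = rep(c("2013-09-15", "2013-09-08", "2013-09-01"), each = 3)
)

# Total files per date; aggregate() names the columns Group.1 and x.
d.ag <- aggregate(d$files, list(d$date), sum)

if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  p <- ggplot() +
    # one line per user
    geom_line(data = d, aes(x = date, y = files, group = user, colour = user)) +
    # group = 1 puts all totals into one group, so they are connected
    geom_line(data = d.ag, aes(x = Group.1, y = x, group = 1)) +
    geom_point(data = d.ag, aes(x = Group.1, y = x), size = 5)
  # print(p) draws the plot
}
```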

>>> > ggplot is rather complicated but very flexible
>>>
>>> I don't mind ggplot being complicated, but I find the documentation a
>>> little impenetrable.
>>
>> You can find plenty of help when you just try to google on the item
>> searching. Actually this is what I do when the solution is not obvious
>> or requires some hidden instruction.
>
> This is what I normally resort to with varying degrees of success.  It
> just seems a bit of a shame the some of the documentation for such a
> good piece of software does indeed appear to be rather "hidden".
>
>> Cheers
>> Petr
>>
>>>
>>> Cheers,
>>>
>>> Loris
>>>
>>>
>>> >> -Original Message-
>>> >> From: R-help [mailto:r-help-boun...@r-project.org] On Behalf Of
>>> Loris
>>> >> Bennett
>>> >> Sent: Monday, March 09, 2015 2:56 PM
>>> >> To: r-h...@stat.math.ethz.ch
>>> >> Subject: [R] Add sum line to plot of multiple x values
>>> >>
>>> >> Hi,
>>> >>
>>> >> Here are my data:
>>> >>
>>> >> > d
>>> >>    user files       date
>>> >> 1 alice    18 2013-09-15
>>> >> 2   bob     5 2013-09-15
>>> >> 3 carol    21 2013-09-15
>>> >> 4 alice    22 2013-09-08
>>> >> 5   bob     9 2013-09-08
>>> >> 6 carol    14 2013-09-08
>>> >> 7 alice    26 2013-09-01
>>> >> 8   bob     3 2013-09-01
>>> >> 9 carol    22 2013-09-01
>>> >>
>>> >> I would like to plot the number of files against date for all users,
>>> so
>>> >> I have:
>>> >>
>>> >>   library(ggplot2)
>>> >>
>>> >>   people <- c("alice","bob","carol")
>>> >>   user <- c(rep(people,3))
>>> >>   files <- c(18,5,21,22,9,14,26,3,22)
>>> >>   date <- c(rep("2013-09-15",3),rep("2013-09-08",3),rep("2013-09-01",3))
>>> >>   d <- data.frame(user=user,files=files,date=date)
>>> >>
>>> >>   p <- ggplot()
>>> >>   p <- p + geom_line(data=d,aes(x=date,y=files,group=user,colour=user))
>>> >>
>>> >> I would now like to add a line to show the total number of files as
>>> a
>>> >> function of date.  I tried
>>> >>
>>> >>   p <- p +
>>> >> geom_line(data=d,aes(x=date,y=sum(files),group=date),colour='black')
>>> >>
>>> >> I don't get a black line, but the plot is scaled such that I can see
>>> >> that sum(file) for all values of 'file', rather than those for each
>>> >> date, is being used.
>>> >>
>>> >> I would like to know how to do this correctly, but I would rather be
>>> >> able to work it out for myself.  However, if I decide, say, that I
>>> >> don't
>>> >> know exactly what the 'group' argument does, how do I find it out?
>>> >>
>>> >> ?geom_line doesn't have it, although the examples there use it.
>>> ?ggplot
>>> >> doesn't mention it. ?group gives me stuff abou

[R] Are there any implemented function for A/B testing?

2015-03-10 Thread Namratha K
Is there any method or built-in function for implementing A/B testing in
R? Have any functions been developed to implement A/B testing in R?



Re: [R] Alpha not working in geom_rect

2015-03-10 Thread adel daoud
Thanks for the info Jeff. I will stick to using annotate()
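
The overplotting Jeff describes (quoted below) can be sketched numerically.
Assuming standard alpha blending of identical rectangles, the opacity after
n layers is 1 - (1 - alpha)^n; stacked_alpha is a hypothetical helper name.
The second part shows one possible alternative to annotate(): giving
geom_rect its own one-row data frame so that the rectangle is drawn once.
The toy df stands in for prediction_df.

```r
# Opacity after drawing the same alpha-blended rectangle n times.
stacked_alpha <- function(alpha, n) 1 - (1 - alpha)^n

once <- stacked_alpha(0.5, 1)    # one rectangle, as with annotate()
many <- stacked_alpha(0.5, 50)   # one rectangle per data row: effectively opaque

if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  df   <- data.frame(x = -10:10, y = seq(0.2, 0.6, length.out = 21))
  rect <- data.frame(xmin = 2, xmax = 10, ymin = 0, ymax = 1)  # a single row
  p <- ggplot(df, aes(x = x, y = y)) +
    geom_line() +
    # The rectangle gets its own data, so it is painted exactly once.
    geom_rect(data = rect,
              aes(xmin = xmin, xmax = xmax, ymin = ymin, ymax = ymax),
              inherit.aes = FALSE, fill = "black", alpha = 0.5)
  # print(p) draws the plot
}
```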


--

Adel Daoud, PhD, Researcher



The New School for Social Research,

Visiting Scholar in the Economics Department,

6 East 16th Street New York, NY 10003,

dao...@newschool.edu





University of Gothenburg

Department of Sociology and Work Science,

Box 720

405 30, Göteborg, Sweden

Visiting address: Sprängkullsgatan 25, room F411

Sprängkullsgatan 25, room K109

+46 031-786 41 73

adel.da...@sociology.gu.se

On Mon, Mar 9, 2015 at 9:42 PM, Jeff Newmiller 
wrote:

> I have run into this a couple of times ... If you generate the rectangles
> once per row of your data, the fill gets more and more "dense" so your
> alpha seems to not work. The annotate call only paints the rectangle once
> so you don't have this problem.
>
> On March 9, 2015 3:24:23 PM PDT, adel daoud  wrote:
> >Hi Jim,
> >
> >Thanks for the input, but that did not work. I am using RStudio, by the
> >way, and I guess that has a better device that would support ggplot output.
> >
> >The annotate option works, but that does not explain why the geom_area
> >does not work:
> >annotate("rect", xmin=2, xmax=10, ymin=0,  ymax=1, fill="black",
> >alpha=0.5)
> >
> >Best
> >Adel
> >
> >
> >On Sun, Mar 8, 2015 at 12:46 AM, Jim Lemon 
> >wrote:
> >
> >> Hi Adel,
> >> Almost certainly because the device you were using doesn't support
> >> transparency. Try it with a PDF device and check the resulting file in
> >> a PDF reader:
> >>
> >> pdf("ad.pdf")
> >> print(p)
> >> dev.off()
> >>
> >> Jim
> >>
> >>
> >> On Sun, Mar 8, 2015 at 4:39 AM, Adel  wrote:
> >>
> >>> Hi
> >>> I am trying to get the alpha argument to work, but for some reason it
> >>> does not cooperate. Does anybody have an idea why?
> >>>
> >>>
> >>> p <- ggplot(data = prediction_df, aes(x=x, y=prediction, fill=threshold)) +
> >>>   geom_area(colour="black", size=.2, alpha=.4) +
> >>>   scale_fill_brewer(palette="Set1",
> >>>                     breaks=rev(levels(prediction_df$threshold)))
> >>> p + geom_rect(aes(xmin=2, xmax=10, ymin=(0), ymax=(1)),
> >>>               fill="black", alpha=0.5)
> >>>
> >>>
> >>> prediction_df
> >>>  x prediction  threshold
> >>> 1  -10  0.5694161   noAF
> >>> 2   -9  0.5700513   noAF
> >>> 3   -8  0.5706863   noAF
> >>> 4   -7  0.5713211   noAF
> >>> 5   -6  0.5719556   noAF
> >>> 6   -5  0.5725899   noAF
> >>> 7   -4  0.5732240   noAF
> >>> 8   -3  0.5738578   noAF
> >>> 9   -2  0.5744914   noAF
> >>> 10  -1  0.5751247   noAF
> >>> 11   0  0.5757578   noAF
> >>> 12   1  0.5763906   noAF
> >>> 13   2  0.5770232   noAF
> >>> 14   3  0.5776556   noAF
> >>> 15   4  0.5782876   noAF
> >>> 16   5  0.5789195   noAF
> >>> 17   6  0.5795510   noAF
> >>> 18   7  0.5801823   noAF
> >>> 19   8  0.5808134   noAF
> >>> 20   9  0.5814441   noAF
> >>> 21  10  0.5820747   noAF
> >>> 22 -10  0.2359140   singleAF
> >>> 23  -9  0.2356847   singleAF
> >>> 24  -8  0.2354550   singleAF
> >>> 25  -7  0.2352249   singleAF
> >>> 26  -6  0.2349943   singleAF
> >>> 27  -5  0.2347634   singleAF
> >>> 28  -4  0.2345321   singleAF
> >>> 29  -3  0.2343003   singleAF
> >>> 30  -2  0.2340682   singleAF
> >>> 31  -1  0.2338356   singleAF
> >>> 32   0  0.2336027   singleAF
> >>> 33   1  0.2333694   singleAF
> >>> 34   2  0.2331357   singleAF
> >>> 35   3  0.2329016   singleAF
> >>> 36   4  0.2326671   singleAF
> >>> 37   5  0.2324322   singleAF
> >>> 38   6  0.2321969   singleAF
> >>> 39   7  0.2319613   singleAF
> >>> 40   8  0.2317253   singleAF
> >>> 41   9  0.2314889   singleAF
> >>> 42  10  0.2312522   singleAF
> >>> 43 -10  0.1946699 multipleAF
> >>> 44  -9  0.1942640 multipleAF
> >>> 45  -8  0.1938587 multipleAF
> >>> 46  -7  0.1934540 multipleAF
> >>> 47  -6  0.1930500 multipleAF
> >>> 48  -5  0.1926467 multipleAF
> >>> 49  -4  0.1922440 multipleAF
> >>> 50  -3  0.1918419 multipleAF
> >>

Re: [R] Add sum line to plot of multiple x values

2015-03-10 Thread Loris Bennett
Hi Petr,

See inline.

PIKAL Petr  writes:

> Hi
>
> see inline
>
>> -Original Message-
>> From: R-help [mailto:r-help-boun...@r-project.org] On Behalf Of Loris
>> Bennett
>> Sent: Monday, March 09, 2015 4:35 PM
>> To: r-h...@stat.math.ethz.ch
>> Subject: Re: [R] Add sum line to plot of multiple x values
>>
>> PIKAL Petr  writes:
>>
>> > Hi
>> >
>> > Not extremely clear what do you want to plot. Do you want to add a
>> > line which marks total number of files each day regardless of user?
>> Or
>> > a total number of files regardless of date coloured by user?
>>
>> Sorry, I was unclear.  I meant that I would like to plot the following:
>>
>> 1. For each user: the number of files for each date (my code does this)
>> 2. The sum of files of all users for each date (this is what I still
>>need)
>>
>> > In each case you shall search functions geom_hline or geom_abline
>> >
>> > http://stackoverflow.com/questions/13254441/add-a-horizontal-line-to-plot-and-legend-in-ggplot2
>>
>> So I don't want a straight line
>
> but in your code is
>
>>> geom_line(data=d,aes(x=date,y=sum(files),group=date),colour='black')
>
> so you apparently want some sort of line.

Yes, but see below.

> anyway, if I do
>
> d.ag<-aggregate(d$files, list(d$date), sum)
>
> I can add
>
> p+geom_point(data=d.ag,aes(x=Group.1,y=x), size=5)
>
> and I get summary points.

Thanks, this works.

> If you want lines you can do
>
> p+geom_hline(data=d.ag,aes(yintercept=x, colour=Group.1))
>
> or you can fiddle with geom_segment

I don't want an hline, just a line joining the dots I get using
geom_point.  I thought something like

  p + geom_line(data=d.ag,aes(x=as.character(Group.1),y=x))

would work.  However, while I get a plot with axes labelled in the
correct ranges, no line is plotted.  Explicitly setting the colour with

  p + geom_line(data=d.ag,aes(x=as.character(Group.1),y=x),colour="red")

doesn't help.  What am I doing wrong?

>>
>> > ggplot is rather complicated but very flexible
>>
>> I don't mind ggplot being complicated, but I find the documentation a
>> little impenetrable.
>
> You can find plenty of help when you just try to google on the item
> searching. Actually this is what I do when the solution is not obvious
> or requires some hidden instruction.

This is what I normally resort to, with varying degrees of success.  It
just seems a bit of a shame that some of the documentation for such a
good piece of software does indeed appear to be rather "hidden".

> Cheers
> Petr
>
>>
>> Cheers,
>>
>> Loris
>>
>>
>> >> -Original Message-
>> >> From: R-help [mailto:r-help-boun...@r-project.org] On Behalf Of
>> Loris
>> >> Bennett
>> >> Sent: Monday, March 09, 2015 2:56 PM
>> >> To: r-h...@stat.math.ethz.ch
>> >> Subject: [R] Add sum line to plot of multiple x values
>> >>
>> >> Hi,
>> >>
>> >> Here are my data:
>> >>
>> >> > d
>> >>    user files       date
>> >> 1 alice    18 2013-09-15
>> >> 2   bob     5 2013-09-15
>> >> 3 carol    21 2013-09-15
>> >> 4 alice    22 2013-09-08
>> >> 5   bob     9 2013-09-08
>> >> 6 carol    14 2013-09-08
>> >> 7 alice    26 2013-09-01
>> >> 8   bob     3 2013-09-01
>> >> 9 carol    22 2013-09-01
>> >>
>> >> I would like to plot the number of files against date for all users,
>> so
>> >> I have:
>> >>
>> >>   library(ggplot2)
>> >>
>> >>   people <- c("alice","bob","carol")
>> >>   user <- c(rep(people,3))
>> >>   files <- c(18,5,21,22,9,14,26,3,22)
>> >>   date <- c(rep("2013-09-15",3),rep("2013-09-08",3),rep("2013-09-01",3))
>> >>   d <- data.frame(user=user,files=files,date=date)
>> >>
>> >>   p <- ggplot()
>> >>   p <- p + geom_line(data=d,aes(x=date,y=files,group=user,colour=user))
>> >>
>> >> I would now like to add a line to show the total number of files as
>> a
>> >> function of date.  I tried
>> >>
>> >>   p <- p +
>> >> geom_line(data=d,aes(x=date,y=sum(files),group=date),colour='black')
>> >>
>> >> I don't get a black line, but the plot is scaled such that I can see
>> >> that sum(file) for all values of 'file', rather than those for each
>> >> date, is being used.
>> >>
>> >> I would like to know how to do this correctly, but I would rather be
>> >> able to work it out for myself.  However, if I decide, say, that I
>> >> don't
>> >> know exactly what the 'group' argument does, how do I find it out?
>> >>
>> >> ?geom_line doesn't have it, although the examples there use it.
>> ?ggplot
>> >> doesn't mention it. ?group gives me stuff about formatting text
>> >> arguments. ??group only leads me to ?ggplot2::add_group, which also
>> >> does
>> >> not seem to help.
>> >>
>> >> Am I at fault for trying to learn R in an ad hoc manner, to which
>> the
>> >> documentation of R does not lend itself, or am I missing something?
>> >>
>> >> Cheers,
>> >>
>> >> Loris
>> >>
>> >> --
>> >> This signature is currently under construction.
>> >>

-- 
This signature is currently under construction.


[R] logit in "car" package

2015-03-10 Thread kat123
I have run a logit data transformation in R using the logit function in the
package car.

http://cran.r-project.org/web/packages/car/car.pdf

If I run logit on a column of data that contains a 0 value, it makes an
adjustment of 0.025, according to the literature.

I thought this meant that it was running the transformation as 

log((p+0.025)/ (1-(p+0.025)))

However, if I run individual values through this equation they do not match
up to the output of the logit function.

Any suggestions?

 



--
View this message in context: 
http://r.789695.n4.nabble.com/logit-in-car-package-tp4704408.html
Sent from the R help mailing list archive at Nabble.com.



Re: [R] svg2swf - controlling the looping of flash files

2015-03-10 Thread Yixuan Qiu
Hello Paul,
So far there is no way to stop the animation after its first run. If this
feature is needed, I could try to implement it in a future version of
R2SWF.


Best,
Yixuan

2015-03-09 18:33 GMT-04:00 Paul Sweeting :

> Hi
>
>
>
> I'm using svg2swf to collate a number of svg outputs into an swf file.
> I've
> got this working (mainly.) except that I can't control the looping
> behaviour
> of the swf file.  In other words, when it's loaded into html it loops
> continuously.  Is there any way to stop the animation looping, so it just
> plays through once when loaded?  The code I use is (broadly):
>
>
>
>    svg("testplot%d.svg", onefile = FALSE)
>    for (j in 1:360) {
>      print(cloud(x~y*z, groups=tail, data=norm_dots_chart,
>                  screen=list(z=0,x=0,y=j)))
>    }
>    dev.off()
>    output = svg2swf(sprintf("testplot%d.svg", 1:360), interval = 0.04)
>    swf2html(output)
>
>
>
> Thank you!
>
>
>



-- 
Yixuan Qiu 
Department of Statistics,
Purdue University



[R] Error: cannot allocate vector of size 64.0 Mb When Using Read.zoo()

2015-03-10 Thread 李倩雯
Hi all,

*Problem Description*
I encountered the *Error: cannot allocate vector of size 64.0 Mb* when I
was using read.zoo to convert a data.frame called 'origin' to zoo object
named 'target'

*About the Data & Code*
My data frame(origin) contains 5340191 obs. of 3 variables[Data,
Numeric,Character]
The code looks like
*target<-read.zoo(origin,format="%m/%d/%Y",index.column=1,split=3)*

*SessionInfo:*
R version 3.1.2 (2014-10-31)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Installed memory: 4.00 GB (3.82 GB usable)
Result of memory.size() : 3812.85

I tried to calculate the required memory, but I don't know which operations
such a conversion involves. Therefore I have no idea whether my data is too
large to handle or whether I am using an inefficient method. Can anyone
help me with this problem?

By the way, as this is the first time I have turned to a mailing list for
help, I am not sure if I am asking in the right manner. Please tell me if you
have any suggestions. Thank you.


Best regards,
Jasmine



[R] pcrfit model for qpcR package

2015-03-10 Thread Luigi Marongiu
Dear all,
I have been trying to apply the Cy0 algorithm of the package qpcR by
creating an object "obj" with the normalized fluorescent data from a
384 plate whose characteristics were TaqMan chemistry and 45 cycles.
The import of the object was successful but when I implemented the
pcrfit model (indicating in column 1 the number of cycles and in the
successive columns the actual data) I obtained an error in
model.frame.default. Yet the dataframe is composed by 384 columns (+ 1
for the cycles) and each by 45 rows (+ 1 for the column titles).
Would you have some tips on how to debug this problem?
Many thanks,
Best regards,
Luigi


>>>
Here is a sample of the code I have written (the result database is
attached for further reference)

> obj<-pcrimport2(
+   file="cq.data.txt",
+   sep="\t",
+   dec=".",
+   header=TRUE,
+   colClasses="numeric",
+   quote=""
+   )
# j<-2:385 to be implemented with a loop cycle, to read obj columns
# i<-1:384 to be implemented with a loop cycle, to write the results
# Cy is an object with 384 values
>   model<-pcrfit(obj, cyc=1, j, model=l4, do.optim=TRUE, robust=FALSE)
>   Cy[i]<-Cy0(model, plot=FALSE)


[1] Error in model.frame.default(formula = ~Fluo + Cycles, data =
DATA, weights = WEIGHTS,  :
  variable lengths differ (found for '(do.optim)')



Re: [R] Show all elements

2015-03-10 Thread PIKAL Petr
Hi

The source of your problem is most probably that aggregate converts the
grouping variable to a factor, and therefore the empty level is lost.

From the aggregate help page:

by: a list of grouping elements, each as long as the variables in the data
frame x. The elements are coerced to factors before use.

The only option I can come up with is

> sapply(split(dados[,c(1,3)], dados$var), g1)
  A   B   C   D   E
 0.02589377  0.37123239 -0.57820359 NaN  0.39584514

but maybe there is some trick I did not find.
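
A self-contained sketch of the split() approach with a small made-up data
set. Base R's weighted.mean() is used here as a stand-in for Hmisc's
wtd.mean(), which is an assumption made so that the example runs without
extra packages. split() keeps empty factor levels by default (drop = FALSE),
so level D survives and yields NaN.

```r
dados <- data.frame(
  valor = c(1, 3, 2, 4, 10, 20),
  peso  = c(1, 1, 1, 3, 2, 2),
  var   = factor(c("A", "A", "B", "B", "C", "E"), levels = LETTERS[1:5])
)

# Weighted mean of valor with weights peso, for one chunk of the data.
g <- function(y) weighted.mean(y$valor, y$peso)

# One chunk per factor level, including the empty level D.
res <- sapply(split(dados[, c("valor", "peso")], dados$var), g)

# Same shape as the summarize() output, but with all five levels present.
out <- data.frame(var = names(res), med = unname(res))
```

If NA is preferred over NaN for the empty level, out$med[is.nan(out$med)] <- NA
converts it afterwards.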

Cheers
Petr

> -Original Message-
> From: R-help [mailto:r-help-boun...@r-project.org] On Behalf Of Leandro
> Marino
> Sent: Monday, March 09, 2015 2:48 PM
> To: r-help@r-project.org
> Subject: [R] Show all elements
>
> Hi,
>
> Look to the following code:
>
> set.seed(1)
> dados = data.frame(valor=rnorm(100),
>                    var=sample(LETTERS[c(1,2,3,5)],100,replace=TRUE),
>                    peso=rpois(100,2))
> dados[1:10,]
> dados$var <- factor(dados$var,levels=LETTERS[1:5])
> table(dados$var)
>  A  B  C  D  E
> 31 31 19  0 19
>
> When I try to use summarize, Hmisc package it shows me the result
> without D category.
>
> g1 <- function(y) wtd.mean(y[,1],y[,2])
> summarize(dados[,c(1,3)], llist(var=dados$var), g1,stat.name = 'med')
>   var med
> 1   A  0.02589377
> 2   B  0.37123239
> 3   C -0.57820359
> 4   E  0.39584514
>
> How do I get med = NA or something else with summarize?
>
> I really need the function to return all factor levels in var, even if
> they correspond to an empty set.
>
> thanks in advance.
>
> leandro
>



Re: [R] Add sum line to plot of multiple x values

2015-03-10 Thread PIKAL Petr
Hi

see inline

> -----Original Message-----
> From: R-help [mailto:r-help-boun...@r-project.org] On Behalf Of Loris
> Bennett
> Sent: Monday, March 09, 2015 4:35 PM
> To: r-h...@stat.math.ethz.ch
> Subject: Re: [R] Add sum line to plot of multiple x values
>
> PIKAL Petr  writes:
>
> > Hi
> >
> > It is not entirely clear what you want to plot. Do you want to add a
> > line which marks the total number of files each day regardless of
> > user? Or a total number of files regardless of date, coloured by user?
>
> Sorry, I was unclear.  I meant that I would like to plot the following:
>
> 1. For each user: the number of files for each date (my code does this)
> 2. The sum of files of all users for each date (this is what I still
>    need)
>
> > In either case you should look up the functions geom_hline or geom_abline
> >
> > http://stackoverflow.com/questions/13254441/add-a-horizontal-line-to-plot-and-legend-in-ggplot2
>
> So I don't want a straight line

but in your code there is

>> geom_line(data=d,aes(x=date,y=sum(files),group=date),colour='black')

so you apparently want some sort of line.

anyway, if I do

d.ag<-aggregate(d$files, list(d$date), sum)

I can add

p+geom_point(data=d.ag,aes(x=Group.1,y=x), size=5)

and I get summary points.

If you want lines you can do

p+geom_hline(data=d.ag,aes(yintercept=x, colour=Group.1))

or you can fiddle with geom_segment
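
If the goal is a connected line of per-date totals rather than points or horizontal lines, one possible sketch follows. It rests on assumptions that are not in the thread: date is converted to class Date so the x axis is continuous, the totals are stored under the hypothetical name `d.sum`, and `group = 1` is used to tell ggplot2 to join all totals into a single line.

```r
library(ggplot2)

d <- data.frame(
  user  = rep(c("alice", "bob", "carol"), 3),
  files = c(18, 5, 21, 22, 9, 14, 26, 3, 22),
  # Assumption: dates stored as Date rather than as character/factor
  date  = as.Date(rep(c("2013-09-15", "2013-09-08", "2013-09-01"), each = 3))
)

# One row per date holding the total over all users.
# Writing sum(files) inside aes() would instead give the grand total
# (140 here), which is why the plot was rescaled in the original attempt.
d.sum <- aggregate(files ~ date, data = d, FUN = sum)

p <- ggplot(d, aes(x = date, y = files)) +
  geom_line(aes(colour = user, group = user)) +              # one line per user
  geom_line(data = d.sum, aes(group = 1), colour = "black")  # line of totals
print(p)
```

The second geom_line() layer inherits x = date and y = files from the ggplot() call, so the aggregated data frame only needs columns with those names.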

>
> > ggplot is rather complicated but very flexible
>
> I don't mind ggplot being complicated, but I find the documentation a
> little impenetrable.

You can find plenty of help if you just google the item you are looking for. 
Actually this is what I do when the solution is not obvious or requires some 
obscure option.

Cheers
Petr

>
> Cheers,
>
> Loris
>
>
> >> -----Original Message-----
> >> From: R-help [mailto:r-help-boun...@r-project.org] On Behalf Of
> Loris
> >> Bennett
> >> Sent: Monday, March 09, 2015 2:56 PM
> >> To: r-h...@stat.math.ethz.ch
> >> Subject: [R] Add sum line to plot of multiple x values
> >>
> >> Hi,
> >>
> >> Here are my data:
> >>
> >> > d
> >>    user files       date
> >> 1 alice    18 2013-09-15
> >> 2   bob     5 2013-09-15
> >> 3 carol    21 2013-09-15
> >> 4 alice    22 2013-09-08
> >> 5   bob     9 2013-09-08
> >> 6 carol    14 2013-09-08
> >> 7 alice    26 2013-09-01
> >> 8   bob     3 2013-09-01
> >> 9 carol    22 2013-09-01
> >>
> >> I would like to plot the number of files against date for all users,
> >> so I have:
> >>
> >>   library(ggplot2)
> >>
> >>   people <- c("alice","bob","carol")
> >>   user <- c(rep(people,3))
> >>   files <- c(18,5,21,22,9,14,26,3,22)
> >>   date <- c(rep("2013-09-15",3),rep("2013-09-08",3),rep("2013-09-01",3))
> >>   d <- data.frame(user=user,files=files,date=date)
> >>
> >>   p <- ggplot()
> >>   p <- p + geom_line(data=d,aes(x=date,y=files,group=user,colour=user))
> >>
> >> I would now like to add a line to show the total number of files as
> >> a function of date.  I tried
> >>
> >>   p <- p + geom_line(data=d,aes(x=date,y=sum(files),group=date),colour='black')
> >>
> >> I don't get a black line, but the plot is scaled such that I can see
> >> that sum(files) over all values of 'files', rather than over those
> >> for each date, is being used.
> >>
> >> I would like to know how to do this correctly, but I would rather be
> >> able to work it out for myself.  However, if I decide, say, that I
> >> don't know exactly what the 'group' argument does, how do I find it
> >> out?
> >>
> >> ?geom_line doesn't have it, although the examples there use it.
> >> ?ggplot doesn't mention it. ?group gives me stuff about formatting
> >> text arguments. ??group only leads me to ?ggplot2::add_group, which
> >> also does not seem to help.
> >>
> >> Am I at fault for trying to learn R in an ad hoc manner, to which
> >> the documentation of R does not lend itself, or am I missing
> >> something?
> >>
> >> Cheers,
> >>
> >> Loris
> >>
> >> --
> >> This signature is currently under construction.
> >>
> >
> > 

