Re: [R] Confused by error message: Error in assign(".popath", popath, .BaseNamespaceEnv)

2022-04-12 Thread Ivan Krylov
On Tue, 12 Apr 2022 15:59:35 +1200, Tiffany Vidal wrote:

> devtools::install_github("MikkoVihtakari/ggOceanMapsData")
> 
> Error in assign(".popath", popath, .BaseNamespaceEnv) :
> cannot change value of locked binding for '.popath'
> Calls: local ... eval.parent -> eval -> eval -> eval -> eval -> assign

A full output of traceback() just after the error could be very useful
here. This error message might indicate a bug in devtools or its
dependencies (perhaps remotes). I don't know if the developers of
remotes or devtools lurk here, but it should be possible to reach them
at GitHub: https://github.com/r-lib/devtools/issues

You could also check whether devtools or the package itself is to blame
by downloading the code (either using git clone or by downloading the
zip archive of the head of the master branch), then running R CMD build
and R CMD INSTALL on the contents.
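Those steps might look like this (a sketch: the repository URL is taken from the install_github() call above, and the exact tarball name depends on the package version):

```shell
# Fetch the source and build/install it without devtools
git clone https://github.com/MikkoVihtakari/ggOceanMapsData.git
R CMD build ggOceanMapsData
R CMD INSTALL ggOceanMapsData_*.tar.gz   # version number will vary
```

If this succeeds, the problem is likely in devtools/remotes rather than in the package itself.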

-- 
Best regards,
Ivan

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Confused about using data.table package,

2017-02-21 Thread P Tennant
aggregate(), tapply(), do.call(), rbind(), etc. are extremely useful
functions that have been available in R for a long time. They remain
useful regardless of what plotting approach you use - base graphics,
lattice or the more recent ggplot2.


Philip


On 22/02/2017 8:40 AM, C W wrote:

Hi Carl,

I have not fully learned dplyr, but it seems harder than tapply() and the
*apply() family in general.

Almost every ggplot2 data I have seen is manipulated using dplyr. Something
must be good about dplyr.

aggregate(), tapply(), do.call(), rbind() will be sorely missed! :(

Thanks!

On Tue, Feb 21, 2017 at 4:21 PM, Carl Sutton  wrote:


Hi

I have found that:
A)  Hadley's new book to be wonderful on how to use dplyr, ggplot2 and his
other packages.  Read this and using as a reference saves major frustration.
b)  Data Camps courses on ggplot2 are also wonderful.  GGPLOT2 has more
capability than I have mastered or needed.  To be an expert with ggplot2
will take some effort.  To just get run of the mill helpful, beautiful
plots, no major time needed for that.

I use both of these sources regularly, especially when what is in my grey
matter memory banks is not working.  Refreshers are sometimes needed.

If your data sets are large and available memory limited, then data.table
is the package I use.   I am amazed at the difference of memory usage with
data.table versus other packages.  My laptop has 16gb ram, and tidyr maxed
it but data.table melt used less than 6gb(if I remember correctly) on my
current work.  Since discovering fread and fwrite, read.table, read.csv,
and write have been benched.   Every script I have includes
library(data.table)

Carl Sutton





Re: [R] Confused about using data.table package,

2017-02-21 Thread C W
Hi Carl,

I have not fully learned dplyr, but it seems harder than tapply() and the
*apply() family in general.

Almost every ggplot2 dataset I have seen was manipulated using dplyr. Something
must be good about dplyr.

aggregate(), tapply(), do.call(), rbind() will be sorely missed! :(

Thanks!

On Tue, Feb 21, 2017 at 4:21 PM, Carl Sutton  wrote:

> Hi
>
> I have found that:
> A)  Hadley's new book to be wonderful on how to use dplyr, ggplot2 and his
> other packages.  Read this and using as a reference saves major frustration.
> b)  Data Camps courses on ggplot2 are also wonderful.  GGPLOT2 has more
> capability than I have mastered or needed.  To be an expert with ggplot2
> will take some effort.  To just get run of the mill helpful, beautiful
> plots, no major time needed for that.
>
> I use both of these sources regularly, especially when what is in my grey
> matter memory banks is not working.  Refreshers are sometimes needed.
>
> If your data sets are large and available memory limited, then data.table
> is the package I use.   I am amazed at the difference of memory usage with
> data.table versus other packages.  My laptop has 16gb ram, and tidyr maxed
> it but data.table melt used less than 6gb(if I remember correctly) on my
> current work.  Since discovering fread and fwrite, read.table, read.csv,
> and write have been benched.   Every script I have includes
> library(data.table)
>
> Carl Sutton
>




[R] Confused about using data.table package,

2017-02-21 Thread Carl Sutton via R-help
Hi

I have found that:
A) Hadley's new book is wonderful on how to use dplyr, ggplot2 and his
other packages. Reading it and using it as a reference saves major
frustration.
B) DataCamp's courses on ggplot2 are also wonderful. ggplot2 has more
capability than I have mastered or needed. Becoming an expert with ggplot2
will take some effort, but getting run-of-the-mill helpful, beautiful
plots takes no major time.

I use both of these sources regularly, especially when what is in my
grey-matter memory banks is not working. Refreshers are sometimes needed.

If your data sets are large and available memory limited, then data.table
is the package I use. I am amazed at the difference in memory usage between
data.table and other packages. My laptop has 16 GB of RAM, and tidyr maxed
it out, but data.table's melt used less than 6 GB (if I remember correctly)
on my current work. Since discovering fread and fwrite, read.table,
read.csv, and write have been benched. Every script I have includes
library(data.table).
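As a rough, self-contained sketch of the fread()/fwrite()/melt() workflow mentioned above (assuming data.table is installed; the column names are made up for illustration):

```r
library(data.table)

dt <- data.table(id = 1:3, a = rnorm(3), b = rnorm(3))
long <- melt(dt, id.vars = "id")   # wide -> long without loading tidyr

tmp <- tempfile(fileext = ".csv")
fwrite(long, tmp)                  # fast, multi-threaded writer
back <- fread(tmp)                 # fast, multi-threaded reader
dim(back)                          # 6 rows (3 ids x 2 measure columns), 3 columns
```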

Carl Sutton


Re: [R] Confused about using data.table package,

2017-02-21 Thread peter dalgaard
Just. Don't. Do. This. (Hint: Threading mail readers.)

On 21 Feb 2017, at 03:53 , C W  wrote:

> Thanks Hadley!
> 
> While I got your attention, what is a good way to get started on ggplot2? ;)

-- 
Peter Dalgaard, Professor,
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Office: A 4.23
Email: pd@cbs.dk  Priv: pda...@gmail.com



Re: [R] Confused about using data.table package,

2017-02-21 Thread Jeff Newmiller
I suspect Hadley would recommend reading his new book, R for Data Science 
(r4ds.had.co.nz), in particular Chapter 3. You don't need plyr, but it won't 
take long before you will want to be using dplyr and tidyr, which are covered 
in later chapters.
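For the "graph conditioned on species, by grade" question asked earlier in the thread, one possible starting point is ggplot2 facets (a sketch, not a recommendation from the thread; assumes ggplot2 is installed):

```r
library(ggplot2)

# reproduce the poster's data: iris plus a random grade column
iris2 <- cbind(iris, grade = sample(1:5, 150, replace = TRUE))

# one panel per Species, grade on the x axis
ggplot(iris2, aes(factor(grade), Sepal.Length)) +
  geom_boxplot() +
  facet_wrap(~ Species)
```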
-- 
Sent from my phone. Please excuse my brevity.

On February 20, 2017 6:53:29 PM PST, C W  wrote:
>Thanks Hadley!
>
>While I got your attention, what is a good way to get started on
>ggplot2? ;)
>
>My impression is that I first need to learn plyr, dplyr, AND THEN
>ggplot2.
>That's A LOT!
>
>Suppose i have this:
>iris
>iris2 <- cbind(iris, grade = sample(1:5, 150, replace = TRUE))
>iris2
>
>I want to have some kind of graph conditioned on species, by grade .
>What's
>a good lead to learn about plotting this?
>
>Thank you!
>
>
>
>On Mon, Feb 20, 2017 at 11:12 AM, Hadley Wickham 
>wrote:
>
>> On Sun, Feb 19, 2017 at 3:01 PM, David Winsemius
>
>> wrote:
>> >
>> >> On Feb 19, 2017, at 11:37 AM, C W  wrote:
>> >>
>> >> Hi R,
>> >>
>> >> I am a little confused by the data.table package.
>> >>
>> >> library(data.table)
>> >>
>> >> df <- data.frame(w=rnorm(20, -10, 1), x= rnorm(20, 0, 1),
>y=rnorm(20,
>> 10, 1),
>> >> z=rnorm(20, 20, 1))
>> >>
>> >> df <- data.table(df)
>> >
>> >   df <- setDT(df) is preferred.
>>
>> Don't you mean just
>>
>> setDT(df)
>>
>> ?
>>
>> setDT() modifies by reference.
>>
>> >>
>> >> df_3 <- df[, a := x-y] # created new column a using x minus y, why
>are
>> we
>> >> using colon equals?
>> >
>> > You need to do more study of the extensive documentation. The
>behavior
>> of the ":=" function is discussed in detail there.
>>
>> You can get to that documentation with ?":="
>>
>> Hadley
>>
>> --
>> http://hadley.nz
>>
>



Re: [R] Confused about using data.table package,

2017-02-20 Thread C W
Thanks Hadley!

While I got your attention, what is a good way to get started on ggplot2? ;)

My impression is that I first need to learn plyr, dplyr, AND THEN ggplot2.
That's A LOT!

Suppose I have this:

iris
iris2 <- cbind(iris, grade = sample(1:5, 150, replace = TRUE))
iris2

I want to have some kind of graph conditioned on species, by grade. What's
a good lead to learn about plotting this?

Thank you!



On Mon, Feb 20, 2017 at 11:12 AM, Hadley Wickham 
wrote:

> On Sun, Feb 19, 2017 at 3:01 PM, David Winsemius 
> wrote:
> >
> >> On Feb 19, 2017, at 11:37 AM, C W  wrote:
> >>
> >> Hi R,
> >>
> >> I am a little confused by the data.table package.
> >>
> >> library(data.table)
> >>
> >> df <- data.frame(w=rnorm(20, -10, 1), x= rnorm(20, 0, 1), y=rnorm(20,
> 10, 1),
> >> z=rnorm(20, 20, 1))
> >>
> >> df <- data.table(df)
> >
> >   df <- setDT(df) is preferred.
>
> Don't you mean just
>
> setDT(df)
>
> ?
>
> setDT() modifies by reference.
>
> >>
> >> df_3 <- df[, a := x-y] # created new column a using x minus y, why are
> we
> >> using colon equals?
> >
> > You need to do more study of the extensive documentation. The behavior
> of the ":=" function is discussed in detail there.
>
> You can get to that documentation with ?":="
>
> Hadley
>
> --
> http://hadley.nz
>




Re: [R] Confused about using data.table package,

2017-02-20 Thread David Winsemius

> On Feb 20, 2017, at 8:12 AM, Hadley Wickham  wrote:
> 
> On Sun, Feb 19, 2017 at 3:01 PM, David Winsemius  
> wrote:
>> 
>>> On Feb 19, 2017, at 11:37 AM, C W  wrote:
>>> 
>>> Hi R,
>>> 
>>> I am a little confused by the data.table package.
>>> 
>>> library(data.table)
>>> 
>>> df <- data.frame(w=rnorm(20, -10, 1), x= rnorm(20, 0, 1), y=rnorm(20, 10, 
>>> 1),
>>> z=rnorm(20, 20, 1))
>>> 
>>> df <- data.table(df)
>> 
>>  df <- setDT(df) is preferred.
> 
> Don't you mean just
> 
> setDT(df)
> 
> ?
> 
> setDT() modifies by reference.

Thanks for the correction.


> 
>>> 
>>> df_3 <- df[, a := x-y] # created new column a using x minus y, why are we
>>> using colon equals?
>> 
>> You need to do more study of the extensive documentation. The behavior of 
>> the ":=" function is discussed in detail there.
> 
> You can get to that documentation with ?":="

That's a good place to start reading, but I was thinking of the
datatable-faq and datatable-intro vignettes, which are listed on the
vignettes page reached via help(package = "data.table").

> 
> Hadley
> 
> -- 
> http://hadley.nz

David Winsemius
Alameda, CA, USA



Re: [R] Confused about using data.table package,

2017-02-20 Thread Hadley Wickham
On Sun, Feb 19, 2017 at 3:01 PM, David Winsemius  wrote:
>
>> On Feb 19, 2017, at 11:37 AM, C W  wrote:
>>
>> Hi R,
>>
>> I am a little confused by the data.table package.
>>
>> library(data.table)
>>
>> df <- data.frame(w=rnorm(20, -10, 1), x= rnorm(20, 0, 1), y=rnorm(20, 10, 1),
>> z=rnorm(20, 20, 1))
>>
>> df <- data.table(df)
>
>   df <- setDT(df) is preferred.

Don't you mean just

setDT(df)

?

setDT() modifies by reference.

>>
>> df_3 <- df[, a := x-y] # created new column a using x minus y, why are we
>> using colon equals?
>
> You need to do more study of the extensive documentation. The behavior of the 
> ":=" function is discussed in detail there.

You can get to that documentation with ?":="

Hadley

-- 
http://hadley.nz



Re: [R] Confused about using data.table package,

2017-02-19 Thread David Winsemius

> On Feb 19, 2017, at 11:37 AM, C W  wrote:
> 
> Hi R,
> 
> I am a little confused by the data.table package.
> 
> library(data.table)
> 
> df <- data.frame(w=rnorm(20, -10, 1), x= rnorm(20, 0, 1), y=rnorm(20, 10, 1),
> z=rnorm(20, 20, 1))
> 
> df <- data.table(df)

  df <- setDT(df) is preferred.
> 
> #drop column w
> 
> df_1 <- df[, w := NULL] # I thought you are supposed to do: df_1 <- df[, -w]

Nope. The "[.data.table" function is very different from the "[.data.frame"
function. As you should be able to see, an expression in the `j` position for
"[.data.table" gets evaluated in the environment of the data.table object, so
unquoted column names get returned after application of any function. Here it's
just a unary minus.

Actually, "nope" on two accounts: you cannot use a unary minus for column names
in `[.data.frame` either. It would have needed to be
df[, !colnames(df) %in% "w"]  # logical indexing


> 
> df_2 <- df[x 
> df_3 <- df[, a := x-y] # created new column a using x minus y, why are we
> using colon equals?

You need to do more study of the extensive documentation. The behavior of the 
":=" function is discussed in detail there.

> 
> I am a bit confused by this syntax.

It's non-standard for R but many people find the efficiencies of the package 
worth the extra effort to learn what is essentially a different evaluation 
strategy.
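A minimal sketch of what that evaluation strategy looks like in practice (assuming data.table is installed):

```r
library(data.table)

dt <- data.table(x = 1:5, y = 5:1)
dt[, a := x - y]    # ':=' adds column a by reference -- no copy is made
dt[x > 2]           # unquoted column names are evaluated inside the table
dt[, x := NULL]     # ':=' with NULL drops a column, again by reference
names(dt)           # "y" "a"
```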


> 
> Thanks!
> 

R-help is a plain-text mailing list.

-- 
David

David Winsemius
Alameda, CA, USA



[R] Confused about using data.table package,

2017-02-19 Thread C W
Hi R,

I am a little confused by the data.table package.

library(data.table)

df <- data.frame(w=rnorm(20, -10, 1), x= rnorm(20, 0, 1), y=rnorm(20, 10, 1),
z=rnorm(20, 20, 1))

df <- data.table(df)

#drop column w

df_1 <- df[, w := NULL] # I thought you are supposed to do: df_1 <- df[, -w]

df_2 <- df[x

[R] Confused by dlnorm - densities do not match histogram

2014-09-22 Thread Terran Melconian
Good evening!  I'm running into some surprising behavior with dlnorm() and
trying to understand it.

To set the stage, I'll plot the density and overlay a normal distribution.
This works exactly as expected; the two graphs align quite closely:

qplot(data=data.frame(x=rnorm(1e5,4,2)),x=x,stat='density',geom='area') +
stat_function(fun=dnorm,args=list(4,2),colour='blue')

But then I change to a log-normal distribution and the behaviour gets
odd; the distribution looks nothing like the density plot:

qplot(data=data.frame(x=rlnorm(1e5,4,2)),x=x,log='x',stat='density',geom='area')
 + stat_function(fun=dlnorm,args=list(4,2),colour='blue')

I thought the issue might be scale transformation - if dlnorm is giving the
density per unit x this is not the same as the density after transforming
to log(x).  So I tried to effect this scale transformation manually by
dividing by the derivative of log(x) - i.e. by multiplying by x - but this
also did not match:

qplot(data=data.frame(x=rlnorm(1e5,4,2)),x=x,log='x',stat='density',geom='area')
 + 
stat_function(fun=function(x,...){dlnorm(x,...)*x},args=list(4,2),colour='blue')

I also tried plotting without the log scale to eliminate that
transformation as a source of discrepancy, and they still don't match:

qplot(data=data.frame(x=rlnorm(1e5,4,2)),x=x,stat='density',geom='area',xlim=c(0,50))
 + stat_function(fun=dlnorm,args=list(4,2),colour='blue')

I'd appreciate any help in understanding what I'm missing.
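A base-graphics sanity check of the underlying fact (if X ~ lognormal(4, 2), then log(X) ~ N(4, 2)) may help isolate whether the discrepancy comes from ggplot2's scale transformation rather than from dlnorm():

```r
set.seed(1)
x <- rlnorm(1e5, meanlog = 4, sdlog = 2)

# the density of log(x) should match dnorm(., 4, 2) closely
plot(density(log(x)), main = "density of log(x) vs N(4, 2)")
curve(dnorm(x, 4, 2), add = TRUE, col = "blue")
```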



Re: [R] Confused by code?

2012-09-24 Thread Rui Barradas

Hello,

It is pretty basic, and it is deceptively simple. The worst of all :)
When you index a matrix 'x' by another matrix 'z', the index can be a
logical matrix of the same dimensions (or recyclable to the dims of 'x'),
it can be a matrix with only two columns - a row-numbers column and a
column-numbers one - or it can be your case.

In your case, 'z' is coerced to a vector, and the values in 'z' are taken
to be indexes into 'x'. But since you only have two distinct values and
one of them is zero, it will only return x[1] three times (there are
three 1s in 'z'). The same goes for 'y'.
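That coercion can be seen directly, using the matrices from the question below:

```r
x <- matrix(c(1,0,0,0,1,0,0,0,1), nrow = 3)
z <- matrix(c(0,1,0,0,1,0,1,0,0), nrow = 3)

# z is coerced to a numeric index vector: the 0s are dropped,
# and each 1 selects x[1], so x[1] is returned three times
x[z]
```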


Correct:

# Create an index matrix
z.inx <- which(z == 1, arr.ind = TRUE)
z.inx

# Test
x1 <- x2 <- x3 <- x # Use copies to test
x1[z == 1] <- y[z == 1]
x2[z.inx] <- y[z.inx]
# 1 and 0 to T/F
x3[as.logical(z)] <- y[as.logical(z)]

x1
identical(x1, x2)
identical(x1, x3)


Hope this helps,

Rui Barradas

Em 23-09-2012 21:52, Bazman76 escreveu:

x <- matrix(c(1,0,0,0,1,0,0,0,1), nrow=3)
y <- matrix(c(0,0,0,1,0,0,1,1,0), nrow=3)
z <- matrix(c(0,1,0,0,1,0,1,0,0), nrow=3)
x[z] <- y[z]

The resultant matrix x is all zeros except for the last two diagonal cells,
which are 1's, while y is lower-triangular 0's with the remaining cells all
ones.

I really don't understand how this deceptively simple-looking piece of code
is giving that result; can someone explain, please? I'm obviously missing
something pretty basic, so please keep your answer suitably basic.








Re: [R] Confused by code?

2012-09-24 Thread Bazman76
Thanks Rui Barrudas and Peter Alspach,

I understand better now:

x <- matrix(c(1,0,0,0,2,0,0,0,2), nrow=3)
y <- matrix(c(7,8,9,1,5,10,1,1,0), nrow=3)
z <- matrix(c(0,1,0,0,0,0,6,0,0), nrow=3)
x[z] <- y[z]
viewData(x)

produces an x matrix

7   0   0
0   2   0
0   10  2

which makes sense: the first element of y (7) is inserted into slot x[1],
and the 6th element of y (10) is slotted into x[6].


However, the original code runs like this:

mI <- mRU(de.d, de.nP) > de.CR
mPV[mI] <- mP[mI]

where mPV and mP are both (de.d, de.nP) matrices.

and

mRU <- function(m, n){
 return(array(runif(m*n), dim=c(m,n)))
}

i.e. it returns an array of m*n random numbers uniformly distributed between
0 and 1.

de.CR is a fixed value say 0.8.

So mI <- mRU(de.d, de.nP) > de.CR returns a de.d*de.nP array where each
element is 1 if it's more than 0.8 and zero otherwise.

So in this case element mPV[1] will be repeatedly filled with the value of
mP[1], and all other elements will remain unaffected?

Is this correct?

If so, I am still confused, as this is not what I thought was supposed to
be happening, but I know that the code overall does its job correctly.






Re: [R] Confused by code?

2012-09-24 Thread Rui Barradas

Hello,

Inline.
Em 24-09-2012 15:31, Bazman76 escreveu:

Thanks Rui Barrudas and Peter Alspach,

I understand better now:

x <- matrix(c(1,0,0,0,2,0,0,0,2), nrow=3)
y <- matrix(c(7,8,9,1,5,10,1,1,0), nrow=3)
z <- matrix(c(0,1,0,0,0,0,6,0,0), nrow=3)
x[z] <- y[z]
viewData(x)
  viewData(x)

produces an x matrix

7   0   0
0   2   0
0   10 2

which makes sense: the first element of y (7) is inserted into slot x[1],
and the 6th element of y (10) is slotted into x[6].


However the original code runs like this:

mI <- mRU(de.d, de.nP) > de.CR
mPV[mI] <- mP[mI]

where mPV and mP are both (de.d, de.nP) matrices.

and

mRU <- function(m, n){
  return(array(runif(m*n), dim=c(m,n)))
}

i.e. it returns an array of m*n random numbers uniformly distributed between
0 and 1.

de.CR is a fixed value say 0.8.

So mI <- mRU(de.d, de.nP) > de.CR returns a de.d*de.nP array where each
element is 1 if it's more than 0.8 and zero otherwise.

So in this case element mPV[1] will be repeatedly filled with the value of
mP[1] and all other elements will remain unaffected?

Is this correct?


Yes and no: it should return a logical matrix, not a numeric one. Since
it seems to be returning numbers 0/1, you can use as.logical like I've
shown in my first post, or, maybe better,

mI <- which(mRU(de.d, de.nP) > de.CR, arr.ind = TRUE)

This way you'll have an index matrix, whose purpose is precisely what its
name says: to index matrices.
(I'm also a bit confused as to why the logical condition is returning
numbers; are you sure of that?)


Anyway, the right way would be to index 'mPV' using a logical or an 
index matrix.
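To illustrate the arr.ind suggestion with a self-contained sketch (mRU, de.d, etc. are from the poster's code and not reproduced here):

```r
set.seed(42)
m  <- matrix(runif(9), nrow = 3)
mI <- which(m > 0.8, arr.ind = TRUE)  # two-column (row, col) index matrix
m[mI]                                  # exactly the entries above 0.8
```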


Hope this helps,

Rui Barradas


If so I am still confused as this is not what I thought was supposed to by
happening but I know that the code overall does its job correctly?








Re: [R] Confused by code?

2012-09-24 Thread Rui Barradas

I've just reread my answer and it's not very clear. Not at all. Inline.
Em 24-09-2012 18:34, Rui Barradas escreveu:

Hello,

Inline.
Em 24-09-2012 15:31, Bazman76 escreveu:

Thanks Rui Barrudas and Peter Alspach,

I understand better now:

x <- matrix(c(1,0,0,0,2,0,0,0,2), nrow=3)
y <- matrix(c(7,8,9,1,5,10,1,1,0), nrow=3)
z <- matrix(c(0,1,0,0,0,0,6,0,0), nrow=3)
x[z] <- y[z]
viewData(x)

produces an x matrix

7   0   0
0   2   0
0   10 2

which makes sense: the first element of y (7) is inserted into slot x[1],
and the 6th element of y (10) is slotted into x[6].


However the original code runs like this:

mI <- mRU(de.d, de.nP) > de.CR
mPV[mI] <- mP[mI]

where mPV and mP are both (de.d, de.nP) matrices.

and

mRU <- function(m, n){
  return(array(runif(m*n), dim=c(m,n)))
}

i.e. it returns an array of m*n random numbers uniformly distributed
between 0 and 1.

de.CR is a fixed value say 0.8.

So mI <- mRU(de.d, de.nP) > de.CR returns a de.d*de.nP array where each
element is 1 if it's more than 0.8 and zero otherwise.

So in this case element mPV[1] will be repeatedly filled with the value of
mP[1] and all other elements will remain unaffected?

Is this correct?


Yes and no,


Yes, it is absolutely correct. As is, the matrix mI is coerced to a vector
first, and then, since it only has the values 0 and 1, the element mPV[1]
will be repeatedly filled with the same value of mP[1].


The rest of my answer is right, though. But 'it', the very first word in
my post after this comment, refers to what? To the condition that creates
the index matrix mI, but this is not at all as clear as it should be:
"it should return a logical matrix, not a numeric one. Since it seems
to be returning numbers 0/1, you can use as.logical like I've shown in
my first post, or, maybe better,"


mI <- which(mRU(de.d, de.nP) > de.CR, arr.ind = TRUE)


Use this suggestion. It can't go wrong.

Rui Barradas


This way you'll have an index matrix, whose purpose is precisely what its
name says: to index matrices.
(I'm also a bit confused as to why the logical condition is returning
numbers; are you sure of that?)


Anyway, the right way would be to index 'mPV' using a logical or an 
index matrix.


Hope this helps,

Rui Barradas


If so, I am still confused, as this is not what I thought was supposed to
be happening, but I know that the code overall does its job correctly.










[R] Confused by code?

2012-09-23 Thread Bazman76
x <- matrix(c(1,0,0,0,1,0,0,0,1), nrow=3)
y <- matrix(c(0,0,0,1,0,0,1,1,0), nrow=3)
z <- matrix(c(0,1,0,0,1,0,1,0,0), nrow=3)
x[z] <- y[z]

The resultant matrix x is all zeros except for the last two diagonal cells,
which are 1's, while y is lower-triangular 0's with the remaining cells all
ones.

I really don't understand how this deceptively simple-looking piece of code
is giving that result; can someone explain, please? I'm obviously missing
something pretty basic, so please keep your answer suitably basic.






Re: [R] Confused by code?

2012-09-23 Thread Peter Alspach
Tena koe

I think you probably meant:

x[as.logical(z)] <- y[as.logical(z)]

i.e., choosing those elements of x and y where z is 1 (TRUE as logical).
Whereas what you have written:

x[z] <- y[z]

references the 0th element (by default indexing starts at 1, so this is
empty; see x[0]) and the first element of x and y (repeatedly).
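Putting that correction together with the original matrices:

```r
x <- matrix(c(1,0,0,0,1,0,0,0,1), nrow = 3)
y <- matrix(c(0,0,0,1,0,0,1,1,0), nrow = 3)
z <- matrix(c(0,1,0,0,1,0,1,0,0), nrow = 3)

x[as.logical(z)] <- y[as.logical(z)]  # copy y into x wherever z == 1
x
```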

Hope this helps 

Peter Alspach

-Original Message-
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On 
Behalf Of Bazman76
Sent: Monday, 24 September 2012 8:53 a.m.
To: r-help@r-project.org
Subject: [R] Confused by code?

x <- matrix(c(1,0,0,0,1,0,0,0,1), nrow=3)
y <- matrix(c(0,0,0,1,0,0,1,1,0), nrow=3)
z <- matrix(c(0,1,0,0,1,0,1,0,0), nrow=3)
x[z] <- y[z]

The resultant matrix x is all zeros except for the last two diagonal cells,
which are 1's, while y is lower-triangular 0's with the remaining cells all
ones.

I really don't understand how this deceptively simple-looking piece of code
is giving that result; can someone explain, please? I'm obviously missing
something pretty basic, so please keep your answer suitably basic.








[R] Confused about multiple imputation with rms or Hmisc packages

2012-07-05 Thread Mohiuddin, Jahan
Hello,
I'm working on a Cox proportional hazards model for a cancer data set that has
missing values for the categorical variable Grade in less than 10% of the
observations. I'm not a statistician, but based on my reading of Frank
Harrell's book, it seems to be a candidate for multiple imputation
techniques. I understand the concepts behind imputation, but using the
functions in rms and Hmisc is confounding me - for instance, whether to use
transcan or aregImpute.

Here is a sample of my data: https://dl.dropbox.com/u/1852742/sample.csv

Drawing from Chapter 8 of Harrell's book, this is what I've been toying with:

#recurfree_survival_fromsx  is survival time, rf_obs_sx codes for events as a 
binary variable.

#The CPH model I would like to fit, using Ograde_dx as the variable for overall 
grade at
#diagnosis, ord_nodes as an ordinal variable for the # lymph nodes involved.
obj=with(mydata, Surv(recurfree_survival_fromsx,rf_obs_sx))
mod=cph(obj~ord_nodes+Ograde_dx+ERorPR+HER2_Sum,data=mydata,x=T,y=T)
#Impute missing data
mydata.transcan=transcan(~Ograde_dx+tumorsize+ord_nodes+simp_stage_path+afam+
Menopause+Age,imputed=T,n.impute=10)
summary(mydata.transcan)

The issues I have are:

a)  In your opinion(s), should I even be imputing this data?  Is it 
appropriate here?

b)  Even after reading the help pages and Harrell's book, I'm not sure I 
used the correct imputation method, and whether I should be using transcan or 
aregImpute.

c)   In the output of summary(transcan), is R-squared the best value to 
describe how reliably the function could predict Ograde_dx?  What is an 
acceptable level?

d)  Do I use the function fit.mult.impute to fit my final cph model?

I appreciate your help with this as it is a somewhat confusing topic.  I hope I 
gave you all the information you need to answer my questions.

Sincerely,
Jahan










[R] confused with indexing

2012-05-22 Thread Alaios
Dear all,
I have a code that looks like the following (I am sorry that this is not a 
reproducible example)


    indexSkipped <- NULL

    ## ... code skipped here that might alter indexSkipped ...

    if (length(indexSkipped) == 0)
        spatial_structure <- spatial_structures_from_measurements(DataList[[i]]$Lon, DataList[[i]]$Lat, meanVector)
    else
        spatial_structure <- spatial_structures_from_measurements(DataList[[i]]$Lon[-indexSkipped], DataList[[i]]$Lat[-indexSkipped], meanVector)



What I am doing here is processing files. Every file has a measurement table 
and Longitude and Latitude fields. If a file is marked as invalid, I keep the 
skipped indices so that I can remove those elements from the Longitude and 
Latitude vectors.

1) That works correctly; I was just wondering whether I can remove the if 
statement by initialising indexSkipped in such a way that 

the DataList[[i]]$Lon[-indexSkipped], DataList[[i]]$Lat[-indexSkipped] do 
nothing, i.e. remove no elements, in case indexSkipped remains unchanged (at 
its initial value).

2) When you define a variable as empty, I usually use NULL. How can I check 
afterwards whether it still holds that value? If I use 

 (indexSkipped == NULL)
logical(0)

this does not return TRUE or FALSE. How can I do that check?


I would like to thank you in advance for your help.

B.R
Alex


Re: [R] confused with indexing

2012-05-22 Thread Jim Holtman
use is.null for the test

if (is.null(indexSkipped))
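To see why the `==` test fails, and why the if() in question 1 cannot simply be dropped, here is a small sketch (the vector x and index idx are made up for illustration):

```r
x <- c(10, 20, 30)
idx <- NULL

# Comparison with NULL yields a zero-length logical, useless inside if():
length(idx == NULL)   # 0
is.null(idx)          # TRUE -- the reliable test

# The hoped-for shortcut does not work: an empty negative index
# selects nothing rather than everything.
x[-integer(0)]        # numeric(0)

# A logical mask sidesteps the if() entirely:
keep <- !(seq_along(x) %in% idx)
x[keep]               # all of x whenever idx is NULL or empty
```

So the if() (or a logical mask as above) really is needed: negative indexing with an empty index is a well-known trap.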

Sent from my iPad

On May 22, 2012, at 2:10, Alaios ala...@yahoo.com wrote:

 Dear all,
 I have a code that looks like the following (I am sorry that this is not a 
 reproducible example)
 
 
 indexSkipped <- NULL
 
 ## ... code skipped here that might alter indexSkipped ...
 
 if (length(indexSkipped) == 0)
     spatial_structure <- spatial_structures_from_measurements(DataList[[i]]$Lon, DataList[[i]]$Lat, meanVector)
 else
     spatial_structure <- spatial_structures_from_measurements(DataList[[i]]$Lon[-indexSkipped], DataList[[i]]$Lat[-indexSkipped], meanVector)
 
 
 
 What I am doing here is that I am processing files. Every files has a 
 measurement table and Longtitude and Latitude fields. If one file is marked 
 as invalid I keep a number of of the skipped index so to remove the element 
 of the Longtitude and Latitide vectors.
 
 1) That works correct, I was just wondering if it would be possible to remove 
 some how the given if statement and initialize the indexSkipped in such a way 
 so 
 
 the DataList[[i]]$Lon[-indexSkipped],DataList[[i]]$Lat[-indexSkipped] do 
 nothing, aka remove no element, in case the indexSkipped remains unchanged 
 (in its initial value).
 
 2) When u define a variable as empty, I usually use NULL, how I can check 
 afterwords if that holds or not. If I use  the 
 
  (indexSkipped==NULL)
 logical(0)
 
 this does not return true or false. How I can do that check?
 
 
 Iwould like to thank you in advance for your help
 
 B.R
 Alex


[R] Confused: Inconsistent result?

2012-02-20 Thread Ajay Askoolum
This is copy & paste from my session:

> xyz <- as.vector(c(ls(), as.matrix(lapply(ls(), class))))
> dim(xyz) <- c(length(xyz)/2, 2)
> 
> allobj <- function(){
+ xyz <- as.vector(c(ls(), as.matrix(lapply(ls(), class))));
+ dim(xyz) <- c(length(xyz)/2, 2);
+ return(xyz)
+ }
> xyz
  [,1]  [,2]   
 [1,] a   character
 [2,] aa  character
 [3,] abc character
 [4,] AirPassengers   character
 [5,] allobj  character
 [6,] allObjects  character
 [7,] allObjects2 character
 [8,] arrayFromAPL    character
 [9,] classes character
[10,] myCharVector    character
[11,] myDateVector    character
[12,] myNumericVector character
[13,] newArrayFromAPL character
[14,] obj character
[15,] objClass    character
[16,] x   character
[17,] xyz character
[18,] y   character
> allobj()
 [,1] [,2]
 

As far as I can see, the function allobj has the same expressions as those 
executed from the command line. Why are the results different?


Re: [R] Confused: Inconsistent result?

2012-02-20 Thread David Winsemius


On Feb 20, 2012, at 10:07 AM, Ajay Askoolum wrote:


This is copy & paste from my session:


> xyz <- as.vector(c(ls(), as.matrix(lapply(ls(), class))))
> dim(xyz) <- c(length(xyz)/2, 2)

> allobj <- function(){
+ xyz <- as.vector(c(ls(), as.matrix(lapply(ls(), class))));
+ dim(xyz) <- c(length(xyz)/2, 2);
+ return(xyz)
+ }

> xyz

  [,1]  [,2]
 [1,] a   character
 [2,] aa  character
 [3,] abc character
 [4,] AirPassengers   character
 [5,] allobj  character
 [6,] allObjects  character
 [7,] allObjects2 character
 [8,] arrayFromAPLcharacter
 [9,] classes character
[10,] myCharVectorcharacter
[11,] myDateVectorcharacter
[12,] myNumericVector character
[13,] newArrayFromAPL character
[14,] obj character
[15,] objClasscharacter
[16,] x   character
[17,] xyz character
[18,] y   character

> allobj()

 [,1] [,2]




As far as I can see, the function allobj has the same expressions as  
those executed from the command line. Why are the results different?


The ls function looks only in the local environment if not supplied  
with specific directions about where to look.
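A small sketch of the difference (object names invented for illustration):

```r
g_var <- 2                                  # lives in the workspace
f <- function() {
  local_only <- 1
  list(inside = ls(),                       # f's own frame: just "local_only"
       global = ls(envir = .GlobalEnv))     # the workspace, as seen at top level
}
res <- f()
res$inside                 # "local_only"
"g_var" %in% res$global    # TRUE
```

Passing `envir = .GlobalEnv` (or `ls(.GlobalEnv)`) makes a function look at the workspace rather than at its own evaluation frame.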





David Winsemius, MD
West Hartford, CT



Re: [R] Confused: Inconsistent result?

2012-02-20 Thread R. Michael Weylandt
Sorry, just checked it and you need to add .GlobalEnv to both ls() calls.


Michael

On Mon, Feb 20, 2012 at 10:17 AM, R. Michael Weylandt
michael.weyla...@gmail.com wrote:
 Short answer, environments -- ls() looks (by default) in its current
 environment, which is not the same as the global environment when
 being called inside a function.

 This would (I think) give the same answer but I haven't checked it. :

 allobj <- function(){
 + xyz <- as.vector(c(ls(.GlobalEnv), as.matrix(lapply(ls(), class))));
 + dim(xyz) <- c(length(xyz)/2, 2);
 + return(xyz)
 + }

 On Mon, Feb 20, 2012 at 10:07 AM, Ajay Askoolum aa2e...@yahoo.co.uk wrote:
 This is copy  paste from my session:

 xyz <- as.vector(c(ls(), as.matrix(lapply(ls(), class))))
 dim(xyz) <- c(length(xyz)/2, 2)

 allobj <- function(){
 + xyz <- as.vector(c(ls(), as.matrix(lapply(ls(), class))));
 + dim(xyz) <- c(length(xyz)/2, 2);
 + return(xyz)
 + }
 xyz
   [,1]  [,2]
  [1,] a   character
  [2,] aa  character
  [3,] abc character
  [4,] AirPassengers   character
  [5,] allobj  character
  [6,] allObjects  character
  [7,] allObjects2 character
  [8,] arrayFromAPL    character
  [9,] classes character
 [10,] myCharVector    character
 [11,] myDateVector    character
 [12,] myNumericVector character
 [13,] newArrayFromAPL character
 [14,] obj character
 [15,] objClass    character
 [16,] x   character
 [17,] xyz character
 [18,] y   character
 allobj()
  [,1] [,2]


 As far as I can see, the function allobj has the same expressions as those 
 executed from the command line. Why are the results different?


Re: [R] Confused: Inconsistent result?

2012-02-20 Thread R. Michael Weylandt
Short answer, environments -- ls() looks (by default) in its current
environment, which is not the same as the global environment when
being called inside a function.

This would (I think) give the same answer but I haven't checked it. :

 allobj <- function(){
+ xyz <- as.vector(c(ls(.GlobalEnv), as.matrix(lapply(ls(), class))));
+ dim(xyz) <- c(length(xyz)/2, 2);
+ return(xyz)
+ }

On Mon, Feb 20, 2012 at 10:07 AM, Ajay Askoolum aa2e...@yahoo.co.uk wrote:
 This is copy  paste from my session:

 xyz <- as.vector(c(ls(), as.matrix(lapply(ls(), class))))
 dim(xyz) <- c(length(xyz)/2, 2)

 allobj <- function(){
 + xyz <- as.vector(c(ls(), as.matrix(lapply(ls(), class))));
 + dim(xyz) <- c(length(xyz)/2, 2);
 + return(xyz)
 + }
 xyz
   [,1]  [,2]
  [1,] a   character
  [2,] aa  character
  [3,] abc character
  [4,] AirPassengers   character
  [5,] allobj  character
  [6,] allObjects  character
  [7,] allObjects2 character
  [8,] arrayFromAPL    character
  [9,] classes character
 [10,] myCharVector    character
 [11,] myDateVector    character
 [12,] myNumericVector character
 [13,] newArrayFromAPL character
 [14,] obj character
 [15,] objClass    character
 [16,] x   character
 [17,] xyz character
 [18,] y   character
 allobj()
  [,1] [,2]


 As far as I can see, the function allobj has the same expressions as those 
 executed from the command line. Why are the results different?


Re: [R] Confused: Inconsistent result?

2012-02-20 Thread Petr PIKAL
Hi
 
 This is copy  paste from my session:
 
  xyz <- as.vector(c(ls(), as.matrix(lapply(ls(), class))))
  dim(xyz) <- c(length(xyz)/2, 2)
  
  allobj <- function(){
 + xyz <- as.vector(c(ls(), as.matrix(lapply(ls(), class))));
 + dim(xyz) <- c(length(xyz)/2, 2);
 + return(xyz)
 + }
  xyz
   [,1]  [,2]   
  [1,] a   character
  [2,] aa  character
  [3,] abc character
  [4,] AirPassengers   character
  [5,] allobj  character
  [6,] allObjects  character
  [7,] allObjects2 character
  [8,] arrayFromAPLcharacter
  [9,] classes character
 [10,] myCharVectorcharacter
 [11,] myDateVectorcharacter
 [12,] myNumericVector character
 [13,] newArrayFromAPL character
 [14,] obj character
 [15,] objClasscharacter
 [16,] x   character
 [17,] xyz character
 [18,] y   character
  allobj()
  [,1] [,2]
  
 
 As far as I can see, the function allobj has the same expressions as 
those
 executed from the command line. Why are the results different?

Probably due to environment handling. 

Do you really want to check that ls behaves as intended and produces a 
character vector? Or is your intention a little more ambitious, i.e. you 
want to know what objects you have?

If the latter, I recommend this function:

function (pos = 1, pattern, order.by) 
{
    napply <- function(names, fn) sapply(names, function(x) fn(get(x, 
        pos = pos)))
    names <- ls(pos = pos, pattern = pattern)
    obj.class <- napply(names, function(x) as.character(class(x))[1])
    obj.mode <- napply(names, mode)
    obj.type <- ifelse(is.na(obj.class), obj.mode, obj.class)
    obj.size <- napply(names, object.size)
    obj.dim <- t(napply(names, function(x) as.numeric(dim(x))[1:2]))
    vec <- is.na(obj.dim)[, 1] & (obj.type != "function")
    obj.dim[vec, 1] <- napply(names, length)[vec]
    out <- data.frame(obj.type, obj.size, obj.dim)
    names(out) <- c("Type", "Size", "Rows", "Columns")
    if (!missing(order.by)) 
        out <- out[order(out[[order.by]]), ]
    out
}









[R] Confused with Student's sleep data description

2012-01-27 Thread Олег Девіняк
I am confused whether Student's sleep data show the effect of two
soporific drugs or of Control against Treatment (one drug). The reason
is the following:
> require(stats)
> data(sleep)
> attach(sleep)
> extra[group==1]
numeric(0)
> group
 [1] Ctl Ctl Ctl Ctl Ctl Ctl Ctl Ctl Ctl Ctl Trt Trt Trt Trt Trt Trt Trt Trt Trt
[20] Trt
Levels: Ctl Trt
> sleep$group
 [1] 1 1 1 1 1 1 1 1 1 1 2 2 2 2 2 2 2 2 2 2
Levels: 1 2

Does some package overwrite my attach()? I am mostly worried about my
code performing correctly for others. So should attach() be avoided?
Thanks for answers!



Re: [R] Confused with Student's sleep data description

2012-01-27 Thread R. Michael Weylandt michael.weyla...@gmail.com
It doesn't have anything to do with attach (which is naughty in other ways!); 
rather, it's the internal representation of categorical variables (R speak: 
factors), which store each level as an integer for memory efficiency but print 
with string labels so they look nice to the user. 

You'll note there's a 1-to-1 match between Ctl-1 an Trt-2 in your data. 
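The code/label split is easy to see directly (a minimal sketch):

```r
g <- factor(c(1, 1, 2, 2), labels = c("Ctl", "Trt"))
g               # prints: Ctl Ctl Trt Trt / Levels: Ctl Trt
as.integer(g)   # 1 1 2 2 -- the internal codes
levels(g)       # "Ctl" "Trt" -- the labels used for printing
```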

The funny business (best I can reckon) is the use of $, which down-grades your 
data to its internal representation as a numeric (integer) vector. 

But yes, you should avoid attach anyway. 

M

On Jan 27, 2012, at 6:03 AM, Олег Девіняк o.devin...@gmail.com wrote:

 I am confused whether Student's sleep data show the effect of two
 soporific drugs or Control against Treatment (one drug). The reason
 is the next:
 require(stats)
 data(sleep)
 attach(sleep)
 extra[group==1]
 numeric(0)
 group
 [1] Ctl Ctl Ctl Ctl Ctl Ctl Ctl Ctl Ctl Ctl Trt Trt Trt Trt Trt Trt Trt Trt 
 Trt
 [20] Trt
 Levels: Ctl Trt
 sleep$group
 [1] 1 1 1 1 1 1 1 1 1 1 2 2 2 2 2 2 2 2 2 2
 Levels: 1 2
 
 Does some package overwrite my attach()? I am worried mostly in the
 right performance of my code by others. So have the attach() to be
 avoided?
 Thanks for answers!
 


Re: [R] Confused with Student's sleep data description

2012-01-27 Thread peter dalgaard

On Jan 27, 2012, at 17:18 , R. Michael Weylandt wrote:

 It doesn't have anything to do with attach (which is naughty in other ways!)  
 rather it's the internal representation of categorical variables (R speak: 
 factors) that store each level as an integer for memory efficiency but print 
 things with string levels so they look nice to the user. 
 
 You'll note there's a 1-to-1 match between Ctl-1 an Trt-2 in your data. 
 
 The funny business (best I reckon) is that use of $ which down-grades your 
 data to its internal representation as a numeric (integer) vector. 

Rubbish! There must be more to this:

> data(sleep)
> sleep$group
 [1] 1 1 1 1 1 1 1 1 1 1 2 2 2 2 2 2 2 2 2 2
Levels: 1 2
> attach(sleep)
> group
 [1] 1 1 1 1 1 1 1 1 1 1 2 2 2 2 2 2 2 2 2 2
Levels: 1 2
 
Presumably there's a group variable with different factor levels sitting in 
the global environment. $ certainly will not down-grade data to integers 
(much less keep them as factors but modify the level set).

-pd 
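That diagnosis is easy to reproduce: the global environment sits before the attached data frame on the search path, so a leftover group in the workspace masks the attached column. A hypothetical reconstruction of the situation:

```r
d <- data.frame(group = factor(c(1, 1, 2, 2)))       # levels "1" "2"
group <- factor(d$group, labels = c("Ctl", "Trt"))   # relabelled copy in the workspace
attach(d)
levels(group)     # "Ctl" "Trt" -- the workspace copy wins
levels(d$group)   # "1" "2"     -- the data frame is unchanged
detach(d)
```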

 
 But yes, you should avoid attach anyways. 
 
 M
 
 On Jan 27, 2012, at 6:03 AM, Олег Девіняк o.devin...@gmail.com wrote:
 
 I am confused whether Student's sleep data show the effect of two
 soporific drugs or Control against Treatment (one drug). The reason
 is the next:
 require(stats)
 data(sleep)
 attach(sleep)
 extra[group==1]
 numeric(0)
 group
 [1] Ctl Ctl Ctl Ctl Ctl Ctl Ctl Ctl Ctl Ctl Trt Trt Trt Trt Trt Trt Trt Trt 
 Trt
 [20] Trt
 Levels: Ctl Trt
 sleep$group
 [1] 1 1 1 1 1 1 1 1 1 1 2 2 2 2 2 2 2 2 2 2
 Levels: 1 2
 
 Does some package overwrite my attach()? I am worried mostly in the
 right performance of my code by others. So have the attach() to be
 avoided?
 Thanks for answers!
 

-- 
Peter Dalgaard, Professor,
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Email: pd@cbs.dk  Priv: pda...@gmail.com



Re: [R] Confused with an error message related to plotrix library in the newer versions of R.

2011-11-14 Thread Jim Lemon

On 11/14/2011 05:59 PM, Prasanth V P wrote:

require(plotrix)



 xy.pop <- c(17,15,13,11,9,8,6,5,4,3,2,2,1,3)
 xx.pop <- c(17,14,12,11,11,8,6,5,4,3,2,2,2,3)
 agelabels <- c("0-4","5-9","10-14","15-19","20-24","25-29","30-34",
                "35-39","40-44","45-49","50-54","55-59","60-64","65+")
 xycol <- color.gradient(c(0,0,0.5,0.15),c(0.25,0.5,0.5,1.75),c(0.5,1.5,1,0),18)
 xxcol <- color.gradient(c(0,1,0.5,1),c(0.25,0.5,0.5,1.25),c(0.5,0.25,0.5,1.5),18)
 par(mar=pyramid.plot(xy.pop,xx.pop,labels=agelabels, labelcex=1.125,
     main="Population Pyramid -- Malawi", xycol=xycol, xxcol=xxcol))


Hi Prasanth V P,
Just a typo. Try this:

par(mar=pyramid.plot(xy.pop,xx.pop,labels=agelabels,labelcex=1.125,
    main="Population Pyramid -- Malawi", lxcol=xycol, rxcol=xxcol))

Nice plot.

Jim



Re: [R] Confused with an error message related to plotrix library in the newer versions of R.

2011-11-14 Thread Prasanth V P
Hi Jim,

It's working perfectly fine with the rxcol parameter. I am just
wondering how I could have missed that!
By the way, many thanks for pointing it out... Otherwise, I would have
kept using the old version of R just to get the required plot.

Much Appreciated,
Prasanth.

-Original Message-
From: Jim Lemon [mailto:j...@bitwrit.com.au]
Sent: 14 November 2011 13:39
To: Prasanth V P
Cc: r-help@r-project.org
Subject: Re: [R] Confused with an error message related to plotrix
library in the newer versions of R.

On 11/14/2011 05:59 PM, Prasanth V P wrote:
 require(plotrix)



  xy.pop <- c(17,15,13,11,9,8,6,5,4,3,2,2,1,3)
  xx.pop <- c(17,14,12,11,11,8,6,5,4,3,2,2,2,3)
  agelabels <- c("0-4","5-9","10-14","15-19","20-24","25-29","30-34",
                 "35-39","40-44","45-49","50-54","55-59","60-64","65+")
  xycol <- color.gradient(c(0,0,0.5,0.15),c(0.25,0.5,0.5,1.75),c(0.5,1.5,1,0),18)
  xxcol <- color.gradient(c(0,1,0.5,1),c(0.25,0.5,0.5,1.25),c(0.5,0.25,0.5,1.5),18)
  par(mar=pyramid.plot(xy.pop,xx.pop,labels=agelabels, labelcex=1.125,
      main="Population Pyramid -- Malawi", xycol=xycol, xxcol=xxcol))

Hi Prasanth V P,
Just a typo. Try this:

par(mar=pyramid.plot(xy.pop,xx.pop,labels=agelabels,labelcex=1.125,
    main="Population Pyramid -- Malawi", lxcol=xycol, rxcol=xxcol))

Nice plot.

Jim



[R] Confused with an error message related to plotrix library in the newer versions of R.

2011-11-13 Thread Prasanth V P
Dear R Users,



Greetings!



I am confused by an error message related to the plotrix library in the
newer versions of R.

I used to run an R script without fail in an earlier version of R (2.8.1),
but the same script now throws an error message in the newer versions
(I now have R 2.13.0 & R 2.14.0).



Herewith I am furnishing the same code for your perusal. I would appreciate
it if somebody could look into this matter and explain in detail.



require(plotrix)

xy.pop <- c(17,15,13,11,9,8,6,5,4,3,2,2,1,3)
xx.pop <- c(17,14,12,11,11,8,6,5,4,3,2,2,2,3)
agelabels <- c("0-4","5-9","10-14","15-19","20-24","25-29","30-34",
               "35-39","40-44","45-49","50-54","55-59","60-64","65+")

xycol <- color.gradient(c(0,0,0.5,0.15),c(0.25,0.5,0.5,1.75),c(0.5,1.5,1,0),18)
xxcol <- color.gradient(c(0,1,0.5,1),c(0.25,0.5,0.5,1.25),c(0.5,0.25,0.5,1.5),18)

par(mar=pyramid.plot(xy.pop,xx.pop,labels=agelabels, labelcex=1.125,
    main="Population Pyramid -- Malawi", xycol=xycol, xxcol=xxcol))



Much Appreciated,

Prasanth, V.P.
Global Manager – Biometrics





Re: [R] Confused about a warning message

2011-07-07 Thread David Winsemius


On Jul 7, 2011, at 8:47 PM, Gang Chen wrote:

I define the following function to convert a t-value with degrees of  
freedom

DF to another t-value with different degrees of freedom fullDF:

tConvert <- function(tval, DF, fullDF) ifelse(DF >= 1, qt(pt(tval, DF),
fullDF), 0)

It works as expected with the following case:


tConvert(c(2,3), c(10,12), 12)

[1] 1.961905 3.00

However, it gives me warning for the example below although the  
output is

still as intended:


tConvert(c(2,3), c(0,12), 12)

[1] 0 3
Warning message:
In pt(q, df, lower.tail, log.p) : NaNs produced

I'm confused about the warning especially considering the fact that  
the

following works correctly without such warning:


tConvert(2, 0, 12)

[1] 0

What am I missing?


The fact that ifelse evaluates both the consequent and the  
alternative in full.
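A minimal illustration:

```r
x <- c(4, -1)
ifelse(x >= 0, sqrt(x), 0)   # the result is fine, but sqrt() is evaluated on
                             # *all* of x, so sqrt(-1) still warns "NaNs produced"
```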




Thanks,
Gang



David Winsemius, MD
West Hartford, CT



[R] Confused about a warning message

2011-07-07 Thread Gang Chen
I define the following function to convert a t-value with degrees of freedom
DF to another t-value with different degrees of freedom fullDF:

tConvert <- function(tval, DF, fullDF) ifelse(DF >= 1, qt(pt(tval, DF),
fullDF), 0)

It works as expected with the following case:

> tConvert(c(2,3), c(10,12), 12)
[1] 1.961905 3.000000

However, it gives me warning for the example below although the output is
still as intended:

> tConvert(c(2,3), c(0,12), 12)
[1] 0 3
Warning message:
In pt(q, df, lower.tail, log.p) : NaNs produced

I'm confused about the warning especially considering the fact that the
following works correctly without such warning:

> tConvert(2, 0, 12)
[1] 0

What am I missing?

Thanks,
Gang



Re: [R] Confused about a warning message

2011-07-07 Thread David Winsemius


On Jul 7, 2011, at 8:52 PM, David Winsemius wrote:



On Jul 7, 2011, at 8:47 PM, Gang Chen wrote:

I define the following function to convert a t-value with degrees  
of freedom

DF to another t-value with different degrees of freedom fullDF:

tConvert <- function(tval, DF, fullDF) ifelse(DF >= 1, qt(pt(tval, DF),
fullDF), 0)

It works as expected with the following case:


tConvert(c(2,3), c(10,12), 12)

[1] 1.961905 3.00

However, it gives me warning for the example below although the  
output is

still as intended:


tConvert(c(2,3), c(0,12), 12)

[1] 0 3
Warning message:
In pt(q, df, lower.tail, log.p) : NaNs produced

I'm confused about the warning especially considering the fact that  
the

following works correctly without such warning:


tConvert(2, 0, 12)

[1] 0

What am I missing?


The fact that ifelse evaluates both sides of the consequent and  
alternative.


I also think you should update your R to the most recent version, since  
a current version does not issue that warning.



--
David Winsemius, MD
West Hartford, CT



Re: [R] Confused about a warning message

2011-07-07 Thread Gang Chen
Thanks for the help! Are you sure R version plays a role in this case? My R
version is 2.13.0

Your suggestion prompted me to look into the help content of ifelse, and a
similar example exists there:

> x <- c(6:-4)
> sqrt(x)  #- gives warning
> sqrt(ifelse(x >= 0, x, NA))  # no warning

> ## Note: the following also gives the warning !
> ifelse(x >= 0, sqrt(x), NA)

Based on the above example, now I have a solution for my situation:

tConvert2 <- function(tval, DF, fullDF) qt(pt(ifelse(DF >= 1, tval, 0),
ifelse(DF >= 1, DF, 1)), fullDF)

> tConvert2(c(2,3), c(0,12), 12)
[1] 0 3

However, I feel my solution is a little kludged. Any better idea?

Thanks,
Gang
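One warning-free alternative to juggling ifelse(): compute qt(pt()) only on the elements where DF >= 1 and leave the rest at zero. A sketch (tConvert3 is just an illustrative name):

```r
tConvert3 <- function(tval, DF, fullDF) {
  out <- numeric(length(tval))   # stays 0 wherever DF < 1
  ok  <- DF >= 1
  out[ok] <- qt(pt(tval[ok], DF[ok]), fullDF)
  out
}
tConvert3(c(2, 3), c(0, 12), 12)   # same result as tConvert2, no warning
```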



On Thu, Jul 7, 2011 at 9:04 PM, David Winsemius dwinsem...@comcast.netwrote:


 On Jul 7, 2011, at 8:52 PM, David Winsemius wrote:


 On Jul 7, 2011, at 8:47 PM, Gang Chen wrote:

  I define the following function to convert a t-value with degrees of
 freedom
 DF to another t-value with different degrees of freedom fullDF:

 tConvert <- function(tval, DF, fullDF) ifelse(DF >= 1, qt(pt(tval, DF),
 fullDF), 0)

 It works as expected with the following case:

  tConvert(c(2,3), c(10,12), 12)

 [1] 1.961905 3.00

 However, it gives me warning for the example below although the output is
 still as intended:

  tConvert(c(2,3), c(0,12), 12)

 [1] 0 3
 Warning message:
 In pt(q, df, lower.tail, log.p) : NaNs produced

 I'm confused about the warning especially considering the fact that the
 following works correctly without such warning:

  tConvert(2, 0, 12)

 [1] 0

 What am I missing?


 The fact that ifelse evaluates both sides of the consequent and
 alternative.


 I also think you should update your R to the most recent version since a
 current version does not issue that warning.


 --
 David Winsemius, MD
 West Hartford, CT





Re: [R] Confused about a warning message

2011-07-07 Thread David Winsemius


On Jul 7, 2011, at 10:17 PM, Gang Chen wrote:

Thanks for the help! Are you sure R version plays a role in this  
case? My R version is 2.13.0


I'm not sure, but my version is 2.13.1



Your suggestion prompted me to look into the help content of ifelse,  
and a similar example exists there:


 x <- c(6:-4)
 sqrt(x)  #- gives warning
 sqrt(ifelse(x >= 0, x, NA))  # no warning


 The x variable gets converted to c( 6:0, NA,NA,NA, NA)

Notice the differences here:
 sqrt(NA)
[1] NA
 sqrt(-1)
[1] NaN
Warning message:
In sqrt(-1) : NaNs produced

 qt(.5, 0)
[1] NaN
Warning message:
In qt(p, df, lower.tail, log.p) : NaNs produced

 qt(.5, NA)
[1] NA




 ## Note: the following also gives the warning !
 ifelse(x >= 0, sqrt(x), NA)

Based on the above example, now I have a solution for my situation:

tConvert2 <- function(tval, DF, fullDF) qt(pt(ifelse(DF >= 1, tval,  
0), ifelse(DF >= 1, DF, 1)), fullDF)


 tConvert2(c(2,3), c(0,12), 12)
[1] 0 3

However, I feel my solution is a little kludged. Any better idea?

Thanks,
Gang



On Thu, Jul 7, 2011 at 9:04 PM, David Winsemius dwinsem...@comcast.net 
 wrote:


On Jul 7, 2011, at 8:52 PM, David Winsemius wrote:


On Jul 7, 2011, at 8:47 PM, Gang Chen wrote:

I define the following function to convert a t-value with degrees of  
freedom

DF to another t-value with different degrees of freedom fullDF:

tConvert <- function(tval, DF, fullDF) ifelse(DF >= 1, qt(pt(tval, DF),
fullDF), 0)

It works as expected with the following case:

tConvert(c(2,3), c(10,12), 12)
[1] 1.961905 3.000000

However, it gives me warning for the example below although the  
output is

still as intended:

tConvert(c(2,3), c(0,12), 12)
[1] 0 3
Warning message:
In pt(q, df, lower.tail, log.p) : NaNs produced

I'm confused about the warning especially considering the fact that  
the

following works correctly without such warning:

tConvert(2, 0, 12)
[1] 0

What am I missing?

The fact that ifelse evaluates both the consequent and the  
alternative in full.


I also think you should update your R to the most recent version  
since a current version does not issue that warning.



--
David Winsemius, MD
West Hartford, CT




David Winsemius, MD
West Hartford, CT



Re: [R] confused by lapply

2011-02-17 Thread Peter Ehlers

On 2011-02-16 09:42, Sam Steingold wrote:

Description:

  'lapply' returns a list of the same length as 'X', each element of
  which is the result of applying 'FUN' to the corresponding element
  of 'X'.

I expect that when I do

lapply(vec,f)

f would be called _once_ for each component of vec.

this is not what I see:

parse.num <- function (s) {
   cat("parse.num1\n"); str(s)
   s <- as.character(s)
   cat("parse.num2\n"); str(s)
   if (s == "N/A") return(s);
   as.numeric(gsub("M$", "e6", gsub("B$", "e9", s)));
}



vec

  mcap
1  200.5B
2   19.1M
3  223.7B
4  888.0M
5  141.7B
6  273.5M
7 55.649B

str(vec)

'data.frame':   7 obs. of  1 variable:
  $ mcap: Factor w/ 7 levels "141.7B","19.1M",..: 3 2 4 7 1 5 6

vec <- lapply(vec, parse.num)

parse.num1
  Factor w/ 7 levels "141.7B","19.1M",..: 3 2 4 7 1 5 6
parse.num2
  chr [1:7] "200.5B" "19.1M" "223.7B" "888.0M" "141.7B" "273.5M" ...
Warning message:
In if (s == "N/A") return(s) :
   the condition has length > 1 and only the first element will be used

i.e., somehow parse.num is called on the whole vector vec, not its
components.

what am I doing wrong?


Your 'vec' is NOT a vector. As your str(vec) clearly
shows, you have a *data.frame*. The components of a
data.frame are the columns (variables) of which you
have only one and your function is applied to that.
If you had two columns, parse.num would be applied
to each column.

So do this:

 lapply(vec[, 1], parse.num)


Peter Ehlers
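Peter's point can be sketched in a few lines: lapply() over a data.frame visits its columns, so a one-column frame means one call receiving the whole column; extract the column first to map over individual values. (The parsing one-liner below mirrors the thread's gsub() idea; the variable names are illustrative.)

```r
# A one-column data.frame: lapply() makes one call per *column*.
vec <- data.frame(mcap = c("200.5B", "19.1M", "888.0M"))
length(lapply(vec, nchar))          # 1 result, because there is 1 column

# Map over the values of the column instead:
parsed <- sapply(as.character(vec$mcap),
                 function(s) as.numeric(gsub("M$", "e6", gsub("B$", "e9", s))))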







[R] confused by lapply

2011-02-16 Thread Sam Steingold
Description:

 'lapply' returns a list of the same length as 'X', each element of
 which is the result of applying 'FUN' to the corresponding element
 of 'X'.

I expect that when I do
 lapply(vec,f)
f would be called _once_ for each component of vec.

this is not what I see:

parse.num <- function (s) {
  cat("parse.num1\n"); str(s)
  s <- as.character(s)
  cat("parse.num2\n"); str(s)
  if (s == "N/A") return(s);
  as.numeric(gsub("M$", "e6", gsub("B$", "e9", s)));
}


 vec
 mcap
1  200.5B
2   19.1M
3  223.7B
4  888.0M
5  141.7B
6  273.5M
7 55.649B
 str(vec)
'data.frame':   7 obs. of  1 variable:
 $ mcap: Factor w/ 7 levels "141.7B","19.1M",..: 3 2 4 7 1 5 6
 vec <- lapply(vec, parse.num)
parse.num1
 Factor w/ 7 levels "141.7B","19.1M",..: 3 2 4 7 1 5 6
parse.num2
 chr [1:7] "200.5B" "19.1M" "223.7B" "888.0M" "141.7B" "273.5M" ...
Warning message:
In if (s == "N/A") return(s) :
  the condition has length > 1 and only the first element will be used

i.e., somehow parse.num is called on the whole vector vec, not its
components.

what am I doing wrong?

-- 
Sam Steingold (http://sds.podval.org/) on CentOS release 5.3 (Final)
http://dhimmi.com http://mideasttruth.com http://truepeace.org
http://camera.org http://memri.org http://palestinefacts.org http://iris.org.il
Despite the raising cost of living, it remains quite popular.



[R] Confused

2011-02-07 Thread Joel

Hi

I'm confused by one thing, and if someone can explain it I would be happy.

 rev(strsplit("hej", NULL))
[[1]]
[1] "h" "e" "j"

 lapply(strsplit("hej", NULL), rev)
[[1]]
[1] "j" "e" "h"

Why doesn't the first one work? What is it in R that fails, so to speak,
such that you need to use lapply to get the correct output?



Re: [R] Confused

2011-02-07 Thread Peter Ehlers

On 2011-02-07 00:18, Joel wrote:


Hi

I'm confused by one thing, and if someone can explain it I would be happy.


rev(strsplit("hej", NULL))

[[1]]
[1] "h" "e" "j"


lapply(strsplit("hej", NULL), rev)

[[1]]
[1] "j" "e" "h"

Why doesn't the first one work? What is it in R that fails, so to speak,
such that you need to use lapply to get the correct output?


See if this helps to see what's happening in the first case:

 L <- list(fruit = c("apple", "orange"))
 L
 rev(L)

 L <- list(fruit = c("apple", "orange"), nuts = c("pecan", "almond"))
 L
 rev(L)

 lapply(L, rev)

For your second case, lapply() applies FUN to the pieces
of the list.

Peter Ehlers



Re: [R] Confused

2011-02-07 Thread Ted Harding
On 07-Feb-11 08:18:49, Joel wrote:
 Hi
 I'm confused by one thing, and if someone can explain it I would be
 happy.
 
 rev(strsplit("hej", NULL))
 [[1]]
 [1] "h" "e" "j"
 
 lapply(strsplit("hej", NULL), rev)
 [[1]]
 [1] "j" "e" "h"
 
 Why doesn't the first one work? What is it in R that fails,
 so to speak, such that you need to use lapply to get the
 correct output?
 -- 

What's causing the confusion in your example is that the
result of strsplit("hej", NULL) consists of only one element.
This is because (see ?strsplit) the value of strsplit is
a *list*. For example, if you submit a character *vector*
(with 2 elements "hej" and "nej") to your rev(strsplit(...)):

  strsplit(c("hej","nej"), NULL)
  # [[1]]
  # [1] "h" "e" "j"
  # 
  # [[2]]
  # [1] "n" "e" "j"

  rev(strsplit(c("hej","nej"), NULL))
  # [[1]]
  # [1] "n" "e" "j"
  # 
  # [[2]]
  # [1] "h" "e" "j"

you now get a list with 2 elements [[1]] and [[2]], and rev()
now outputs these in reverse order. With your character vector
"hej", which has only one element, you get a list with only
one element, and the rev() of this is exactly the same.

Your lapply(strsplit("hej", NULL), rev) applies rev() to each
element of the list returned by strsplit, so even if it only
has one element, that element gets its contents reversed.

  lapply(strsplit(c("hej","nej"), NULL), rev)
  # [[1]]
  # [1] "j" "e" "h"
  # 
  # [[2]]
  # [1] "j" "e" "n"

Hoping this helps!
Ted.
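Ted's explanation can be condensed into a two-line demonstration: strsplit() always returns a list (one element per input string), so rev() on that list reorders whole elements, while lapply(..., rev) reverses the characters inside each element.

```r
# strsplit() returns a list with one element per input string.
s <- strsplit(c("hej", "nej"), NULL)

rev(s)[[1]]            # the *elements* are swapped: first is now "n" "e" "j"
lapply(s, rev)[[1]]    # the *characters* are reversed: "j" "e" "h"
```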


E-Mail: (Ted Harding) ted.hard...@wlandres.net
Fax-to-email: +44 (0)870 094 0861
Date: 07-Feb-11   Time: 08:56:55
-- XFMail --



[R] Confused: Looping in dataframes

2010-06-25 Thread phani kishan
Hey,
I have a data frame x which consists of say 10 vectors. I essentially want
to find out the best fit exponential smoothing for each of the vectors.

The problem while I'm getting results when i say
 lapply(x,ets)

I am getting an error when I say
 myprint
function(x)
{
for(i in 1:length(x))
{
ets(x[i], model = "AZZ", opt.crit = c("amse"))
}
}

The error message is: *Error in ets(x[i], model = "AZZ", opt.crit =
c("amse")) :
  y should be a univariate time series*

Could someone please explain why this is happening? I also want to be able
to extract data like coef's, errors (MAPE,MSE etc.)

Thanks and regards,
Phani
-- 
A. Phani Kishan
3rd Year B.Tech
Dept. of Computer Science & Engineering
IIT MADRAS
Ph: +919962363545




Re: [R] Confused: Looping in dataframes

2010-06-25 Thread Paul Hiemstra

On 06/25/2010 10:02 AM, phani kishan wrote:

Hey,
I have a data frame x which consists of say 10 vectors. I essentially want
to find out the best fit exponential smoothing for each of the vectors.

The problem while I'm getting results when i say
   

lapply(x,ets)
 

I am getting an error when I say
   

myprint
   

function(x)
{
for(i in 1:length(x))
{
ets(x[i], model = "AZZ", opt.crit = c("amse"))
   

Hi,

Please provide a reproducible example, as stated in the posting guide. 
My guess is that replacing x[i] by x[[i]] would solve the problem. 
Double brackets return a vector instead of a data.frame that has just 
column i.


cheers,
Paul
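Paul's single-versus-double-bracket distinction is the crux: `[` on a data.frame keeps the data.frame class (so ets() would see a data.frame, not a univariate series), while `[[` returns the underlying vector. A minimal sketch:

```r
# Single brackets keep the data.frame wrapper; double brackets unwrap it.
x <- data.frame(SKU1 = c(583.8, 441.7, 454.2))
class(x[1])      # "data.frame"
class(x[[1]])    # "numeric"
```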

}
}

The error message is that* Error in ets(x[i], model = AZZ, opt.crit =
c(amse)) :
   y should be a univariate time series*

Could someone please explain why this is happening? I also want to be able
to extract data like coef's, errors (MAPE,MSE etc.)

Thanks and regards,
Phani
   



--
Drs. Paul Hiemstra
Department of Physical Geography
Faculty of Geosciences
University of Utrecht
Heidelberglaan 2
P.O. Box 80.115
3508 TC Utrecht
Phone:  +3130 253 5773
http://intamap.geo.uu.nl/~paul
http://nl.linkedin.com/pub/paul-hiemstra/20/30b/770



Re: [R] Confused: Looping in dataframes

2010-06-25 Thread phani kishan
On Fri, Jun 25, 2010 at 1:54 PM, Paul Hiemstra p.hiems...@geo.uu.nl wrote:

 On 06/25/2010 10:02 AM, phani kishan wrote:

 Hey,
 I have a data frame x which consists of say 10 vectors. I essentially want
 to find out the best fit exponential smoothing for each of the vectors.

 The problem while I'm getting results when i say


 lapply(x,ets)


 I am getting an error when I say


 myprint


 function(x)
 {
 for(i in 1:length(x))
 {
  ets(x[i], model = "AZZ", opt.crit = c("amse"))


 Hi,

  Please provide a reproducible example, as stated in the posting guide. My
  guess is that replacing x[i] by x[[i]] would solve the problem. Double
  brackets return a vector instead of a data.frame that has just column i.

Hey Paul,
As requested.
My example data frame

sdata:
SKU1SKU2   SKU3   SKU4
1   583.8 574.6  1106.9   648.1
2   441.7 552.8  1021.3   353.6
3   454.2 555.7   998.3   306.4
4   569.7 507.6   811.1   360.7
5   512.3 620.0  1046.3   713.9
6   580.8 668.2   732.0   490.9
7   648.5 766.9   653.4   422.1
8   617.4 657.1   602.1   190.8
9   826.8 767.3   640.5   324.1
10 1163.0 657.6   429.6   181.1
11  643.5 788.9   569.1   331.9
12  846.9 568.6   425.1   224.6
13  580.7 582.9   434.2   226.9

now when I apply
lapply(sdata,ets)
I get a result as:
$SKU1
ETS(A,N,N)

Call:
 ets(y = x, model = "AZZ")

  Smoothing parameters:
alpha = 0.3845

  Initial states:
l = 533.3698

  sigma:  181.7615

 AIC AICc  BIC
172.6144 173.8144 173.7443

$SKU2
ETS(A,N,N)

Call:
 ets(y = x, model = "AZZ")

  Smoothing parameters:
alpha = 0.5026

  Initial states:
l = 567.821

  sigma:  86.7074

 AIC AICc  BIC
153.3704 154.5704 154.5003

$SKU3
ETS(A,A,N)

Call:
 ets(y = x, model = "AZZ")

  Smoothing parameters:
alpha = 1e-04
beta  = 1e-04

  Initial states:
l = 1189.2221
b = -64.3776

  sigma:  85.4153

 AIC AICc  BIC
156.9800 161.9800 159.2398

$SKU4
ETS(A,A,N)

Call:
 ets(y = x, model = "AZZ")

  Smoothing parameters:
alpha = 1e-04
beta  = 1e-04

  Initial states:
l = 566.9001
b = -27.8818

  sigma:  127.2654

 AIC AICc  BIC
167.3475 172.3475 169.6073

Now when I run the same using:
myfun <- function(x)
{
for(i in 1:length(x))
{
ets(x[i])
}
}
I got the error as mentioned before. Now on modifying it to
myfun <- function(x)
{
for(i in 1:length(x))
{
return(ets(x[[i]]))
}
}
I only got the output as
ETS(A,N,N)

Call:
 ets(y = x[[i]], model = "AZZ", opt.crit = c("amse"))

  Smoothing parameters:
alpha = 0.3983

  Initial states:
l = 516.188

  sigma:  181.8688

 AIC AICc  BIC
172.6298 173.8298 173.7597

I think it's considering the whole dataframe as a series.
As said, my objective is to essentially come up with a best exponential model
for each of the SKUs in the dataframe. However I want to be able to extract
information like MSE, MAPE etc. later. So kindly suggest.

Thanks in advance,
Phani



 cheers,
 Paul

  }
 }

  The error message is that* Error in ets(x[i], model = "AZZ", opt.crit =
  c("amse")) :
    y should be a univariate time series*

 Could someone please explain why this is happening? I also want to be able
 to extract data like coef's, errors (MAPE,MSE etc.)

 Thanks and regards,
 Phani




 --
 Drs. Paul Hiemstra
 Department of Physical Geography
 Faculty of Geosciences
 University of Utrecht
 Heidelberglaan 2
 P.O. Box 80.115
 3508 TC Utrecht
 Phone:  +3130 253 5773
 http://intamap.geo.uu.nl/~paul
 http://nl.linkedin.com/pub/paul-hiemstra/20/30b/770




-- 
A. Phani Kishan
3rd Year B.Tech
 Dept. of Computer Science & Engineering
IIT MADRAS
Ph: +919962363545




Re: [R] Confused: Looping in dataframes

2010-06-25 Thread David Winsemius


On Jun 25, 2010, at 7:09 AM, phani kishan wrote:

On Fri, Jun 25, 2010 at 1:54 PM, Paul Hiemstra  
p.hiems...@geo.uu.nl wrote:



On 06/25/2010 10:02 AM, phani kishan wrote:


Hey,
I have a data frame x which consists of say 10 vectors. I  
essentially want
to find out the best fit exponential smoothing for each of the  
vectors.


The problem while I'm getting results when i say



lapply(x,ets)



I am getting an error when I say



myprint




function(x)

{
for(i in 1:length(x))
{
ets(x[i], model = "AZZ", opt.crit = c("amse"))



Hi,

Please provide a reproducible example, as stated in the posting  
guide. My
guess is that replacing x[i] by x[[i]] would solve the problem.  
Double
brackets return a vector instead of a data.frame that has just  
column i.



Hey Paul,
As requested.
My example data frame

sdata:
    SKU1    SKU2    SKU3    SKU4
1   583.8  574.6  1106.9   648.1
2   441.7  552.8  1021.3   353.6
3   454.2  555.7   998.3   306.4
4   569.7  507.6   811.1   360.7
5   512.3  620.0  1046.3   713.9
6   580.8  668.2   732.0   490.9
7   648.5  766.9   653.4   422.1
8   617.4  657.1   602.1   190.8
9   826.8  767.3   640.5   324.1
10 1163.0  657.6   429.6   181.1
11  643.5  788.9   569.1   331.9
12  846.9  568.6   425.1   224.6
13  580.7  582.9   434.2   226.9


now when I apply
lapply(sdata,ets)
I get a result as:
$SKU1
ETS(A,N,N)

Call:
ets(y = x, model = "AZZ")

 Smoothing parameters:
   alpha = 0.3845

 Initial states:
   l = 533.3698

 sigma:  181.7615

AIC AICc  BIC
172.6144 173.8144 173.7443

$SKU2
ETS(A,N,N)

Call:
ets(y = x, model = "AZZ")

 Smoothing parameters:
   alpha = 0.5026

 Initial states:
   l = 567.821

 sigma:  86.7074

AIC AICc  BIC
153.3704 154.5704 154.5003

$SKU3
ETS(A,A,N)

Call:
ets(y = x, model = "AZZ")

 Smoothing parameters:
   alpha = 1e-04
   beta  = 1e-04

 Initial states:
   l = 1189.2221
   b = -64.3776

 sigma:  85.4153

AIC AICc  BIC
156.9800 161.9800 159.2398

$SKU4
ETS(A,A,N)

Call:
ets(y = x, model = "AZZ")

 Smoothing parameters:
   alpha = 1e-04
   beta  = 1e-04

 Initial states:
   l = 566.9001
   b = -27.8818

 sigma:  127.2654

AIC AICc  BIC
167.3475 172.3475 169.6073

Now when I run the same using:
myfun <- function(x)
{
for(i in 1:length(x))
{
ets(x[i])
}
}
I got the error as mentioned before. Now on modifying it to
myfun <- function(x)
{
for(i in 1:length(x))
{
return(ets(x[[i]]))
}
}
I only got the output as
ETS(A,N,N)

Call:
ets(y = x[[i]], model = "AZZ", opt.crit = c("amse"))

 Smoothing parameters:
   alpha = 0.3983

 Initial states:
   l = 516.188

 sigma:  181.8688

AIC AICc  BIC
172.6298 173.8298 173.7597

I think it's considering the whole dataframe as a series.


Doubtful. It is quietly calculating all of the requested models but  
you did not do anything with them inside the loop (which is a  
function). You could have assigned them to something permanent or  
printed them (or both):


ets_x <- list()

for(i in 1:length(x))
{
print(ets(x[[i]])); ets_x <- c(ets_x, ets(x[[i]]))
}


ets_x

As said, my objective is to essentially come up with a best  
exponential model
for each of the SKUs in the dataframe. However I want to be able to  
extract

information like MSE, MAPE etc. later. So kindly suggest.

Thanks in advance,
Phani






Re: [R] Confused: Looping in dataframes

2010-06-25 Thread phani kishan
Hey,
I only got the output once because I was returning from the function at the
end of the first loop iteration.
I set that right and I have printed the values.

function being used by me now is:
function(x)
{
for(i in 1:length(x))
{
print(names(x[i]))
print(myets(x[[i]]))
}
}

where myets is my customized exponential smoothing model. However, the
problem is that if I run my myets function individually on each of the SKUs
I get values of MAPE, MSE etc. However, by running the above loop I don't get
the values. How do I store the values so that I can look at them later?

There are minor (not significant) changes in the values of the parameters from
applying the above function as opposed to lapply. Why could that be?

Phani
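The store-and-extract pattern asked about here can be sketched with base R's lm() standing in for forecast::ets() (an assumption: forecast may not be installed; the shape of the solution is the same either way). Fit one model per column, keep the fits in a named list, then pull out whatever accuracy measure you need later:

```r
# Fit one model per column and keep every fit; nothing is lost after the loop.
sdata <- data.frame(SKU1 = c(583.8, 441.7, 454.2, 569.7),
                    SKU2 = c(574.6, 552.8, 555.7, 507.6))
fits <- lapply(sdata, function(y) lm(y ~ seq_along(y)))   # named list of fits
mse  <- sapply(fits, function(f) mean(residuals(f)^2))    # extract a measure
names(mse)    # one entry per series, still available after fitting
```

With forecast's ets() the extraction step would instead read fields such as the fitted model's accuracy measures from each stored object.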



-- 
A. Phani Kishan
3rd Year B.Tech
Dept. of Computer Science & Engineering
IIT MADRAS
Ph: +919962363545




[R] confused on model.frame evaluation

2010-04-30 Thread Erik Iverson

Hello!

I'm reading through a logistic regression book and using R to replicate 
the results.  Although my question is not directly related to this, it's 
the context I discovered it in, so here we go.


Consider these data:

interco <- structure(list(white = c(1, 1, 0, 0), male = c(1, 0, 1, 0), 
yes = c(43, 26, 29, 22), no = c(134, 149, 23, 36), total = c(177, 175, 
52, 58)), .Names = c("white", "male", "yes", "no", "total"), row.names = 
c(NA, -4L), class = "data.frame")


We can use logistic regression to analyze this table, using glm's syntax 
 for successes/failures described on the top of page 191 in MASS 4th 
edition.


summary(glm(as.matrix(interco[c("yes", "no")]) ~ white + male,
data = interco, family = binomial))


The output prints out, no problem!

Now, another data set; note the identifying feature of this one is that 
it contains a column with the same name as the object (i.e., "working").


working <- structure(list(france = c(1, 1, 1, 1, 0, 0, 0, 0), manual = 
c(1, 1, 0, 0, 1, 1, 0, 0), famanual = c(1, 0, 1, 0, 1, 0, 1, 0), total = 
c(107, 65, 66, 171, 87, 65, 85, 148), working = c(85, 44, 24, 17, 24,
22, 1, 6), no = c(22, 21, 42, 154, 63, 43, 84, 142)), .Names = 
c("france", "manual", "famanual", "total", "working", "no"), row.names = 
c(NA, -8L), class = "data.frame")


summary(glm(as.matrix(working[c("working", "no")]) ~ france + manual + 
famanual, data = working, family = binomial))


Error in model.frame.default(formula = as.matrix(working[c("working",  :
  variable lengths differ (found for 'france')

Well, this error goes away simply by renaming the "working" variable in 
the data.frame "working" to something else.  I found the eval line in 
model.frame that's throwing the error, but I'm still confused as to why.


I'm sure it's not a bug, but could someone point to a thread or offer 
some gentle advice on what's happening?  I think it's related to:


test <- data.frame(name1 = 1:5, name2 = 6:10, test = 11:15)
eval(expression(test[c("name1", "name2")]))
eval(expression(interco[c("name1", "test")]))


Thanks!

--Erik



Re: [R] confused on model.frame evaluation

2010-04-30 Thread Erik Iverson

snip


I'm sure it's not a bug, but could someone point to a thread or offer 
some gentle advice on what's happening?  I think it's related to:


test <- data.frame(name1 = 1:5, name2 = 6:10, test = 11:15)
eval(expression(test[c("name1", "name2")]))
eval(expression(interco[c("name1", "test")]))


scratch that last one, obviously a typo was causing my confusion there! 
 The model.frame stuff remains a mystery to me though...




Re: [R] confused on model.frame evaluation

2010-04-30 Thread Marc Schwartz
On Apr 30, 2010, at 4:57 PM, Erik Iverson wrote:

 snip
 I'm sure it's not a bug, but could someone point to a thread or offer some 
 gentle advice on what's happening?  I think it's related to:
 test <- data.frame(name1 = 1:5, name2 = 6:10, test = 11:15)
 eval(expression(test[c("name1", "name2")]))
 eval(expression(interco[c("name1", "test")]))
 
 scratch that last one, obviously a typo was causing my confusion there!  The 
 model.frame stuff remains a mystery to me though...


Hi Erik,

It's late on a Friday, it's grey and raining here in Minneapolis and I am short 
on caffeine, but, that being said, consider the following :-)


 working
  france manual famanual total working  no
1  1  11   107  85  22
2  1  1065  44  21
3  1  0166  24  42
4  1  00   171  17 154
5  0  1187  24  63
6  0  1065  22  43
7  0  0185   1  84
8  0  00   148   6 142


 as.matrix(working[c("working", "no")])
 working  no
[1,]  85  22
[2,]  44  21
[3,]  24  42
[4,]  17 154
[5,]  24  63
[6,]  22  43
[7,]   1  84
[8,]   6 142


 with(working, as.matrix(working[c("working", "no")]))
 [,1]
[1,]   NA
[2,]   NA


For the incantations of model.frame(), the formula terms are evaluated first 
within the scope of the data frame indicated for the 'data' argument.

Thus, in the second case, I am asking for the as.matrix(...) call to be 
evaluated within the scope of the 'working' data frame, which returns a matrix 
with only two rows, one NA for each column that was asked for and not found, 
which is different than the number of rows in 'working', thus you get the error 
as soon as the 'france' column is evaluated in the formula to create the model 
frame:

Error in model.frame.default(formula = as.matrix(working[c("working",  :
 variable lengths differ (found for 'france')


2 rows in the response matrix versus 8 rows for 'france'...


It is kind of like you are asking for:

 as.matrix(working$working[c("working", "no")])
 [,1]
[1,]   NA
[2,]   NA



Now, try this:

 with(working, matrix(c(working, no), ncol = 2))
 [,1] [,2]
[1,]   85   22
[2,]   44   21
[3,]   24   42
[4,]   17  154
[5,]   24   63
[6,]   22   43
[7,]1   84
[8,]6  142


and then:

 summary(glm(matrix(c(working, no), ncol = 2) ~ france + manual + famanual, 
 data = working, family = binomial))

Call:
glm(formula = matrix(c(working, no), ncol = 2) ~ france + manual + 
famanual, family = binomial, data = working)

Deviance Residuals: 
   1 2 3 4 5 6 7  
 0.09316  -0.14108   2.38028  -1.91838  -1.48196   1.84993  -1.61864  
   8  
 1.16747  

Coefficients:
Estimate Std. Error z value Pr(|z|)
(Intercept)  -3.6902 0.2547 -14.489   2e-16 ***
france1.9474 0.2162   9.008   2e-16 ***
manual2.5199 0.2168  11.625   2e-16 ***
famanual  0.5522 0.2017   2.738  0.00618 ** 
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1 

(Dispersion parameter for binomial family taken to be 1)

Null deviance: 308.329  on 7  degrees of freedom
Residual deviance:  18.976  on 4  degrees of freedom
AIC: 60.162

Number of Fisher Scoring iterations: 4



Does that help to clarify?

Regards,

Marc Schwartz
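Marc's scoping point can be shown without glm() at all: names in a formula (or inside with()) are resolved in the data frame's scope first, so a column named like the data frame itself shadows the object during evaluation. A minimal sketch with illustrative names:

```r
# A data frame whose name collides with one of its own columns.
working <- data.frame(x = 1:4, working = 5:8)

# Inside the data frame's scope, `working` is the *column*, not the object:
with(working, working)   # 5 6 7 8
```

Renaming either the object or the column removes the collision, which is exactly why Erik's error disappeared after a rename.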



[R] confused with yearmon, xts and maybe zoo

2010-04-18 Thread simeon duckworth
R-listers,

I am using xts with a yearmon index, but am getting some inconsistent
results with the date index when i drop observations (for example by using
na.omit).

The issue is illustrated in the example below.  If I start with a monthly
zooreg series starting in 2009, yearmon converts this to Dec-2008.  Not
such a worry for my example, but strange.  Having converted to xts, i drop
the first observation.  The index shows jan 2009.  But if i create a new
variable with this index, it shifts the series back to dec 2008.

No doubt i am doing something wrong.  very grateful for any tips

library(xts)

z <- zooreg(1:24, frequency = 12, start = c(2009,1))  # monthly data starting 2009
x <- xts(z, as.yearmon(index(z)))        # starts Dec 2008
xx <- x[-1, ]                            # drop first obs (eg through na.omit)
index(xx)                                # starts Jan 2009
xxx <- xts(NA[1:length(xx)], index(xx))  # back to Dec 2008

periodicity(x)
periodicity(xx)
periodicity(xxx)




Re: [R] confused with yearmon, xts and maybe zoo

2010-04-18 Thread Gabor Grothendieck
On Sun, Apr 18, 2010 at 8:25 AM, simeon duckworth
simeonduckwo...@gmail.com wrote:
 R-listers,

 I am using xts with a yearmon index, but am getting some inconsistent
 results with the date index when i drop observations (for example by using
 na.omit).

 The issue is illustrated in the example below.  If I start with a monthly
 zooreg series starting in 2009, yearmon converts this to Dec-2008.  Not
 such a worry for my example, but strange.  Having converted to xts, i drop
 the first observation.  The index shows jan 2009.  But if i create a new
 variable with this index, it shifts the series back to dec 2008.

 No doubt i am doing something wrong.  very grateful for any tips

 library(xts)

 z <- zooreg(1:24, frequency = 12, start = c(2009,1))  # monthly data starting 2009
 x <- xts(z, as.yearmon(index(z)))                     # starts Dec 2008

Not for me.  It starts in January 2009 for me.

Also please show your code in such a way that it can be pasted into a
session.  Either comment out the output using # or else preface input
lines with > so it's clear what is input and what is output.  And show
what versions of the software and R you are using and what platform.

 z <- zooreg(1:24, frequency = 12, start = c(2009,1))  # monthly data starting 2009
 head(z)
2009(1) 2009(2) 2009(3) 2009(4) 2009(5) 2009(6)
   1   2   3   4   5   6
 x <- xts(z, as.yearmon(index(z)))
 head(x)
 x
Jan 2009 1
Feb 2009 2
Mar 2009 3
Apr 2009 4
May 2009 5
Jun 2009 6
 R.version.string
[1] R version 2.10.1 (2009-12-14)
 win.version()
[1] Windows Vista (build 6002) Service Pack 2
 packageDescription("zoo")$Version
[1] 1.6-3
 packageDescription("xts")$Version
[1] 0.7-0

I also tried older versions zoo 1.6-0, xts 0.6-8 and R 2.9.2 and got
the same result as I got here.
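One plausible explanation for a month slipping back, sketched in base R only (no zoo/xts needed, and this is an illustration of the general floating-point hazard, not a claim about zoo's actual internals): converting a fractional-year index to a month involves `floor(12 * frac)`, and `2009 + k/12` cannot be represented exactly in binary, so the product can land a hair below the exact month boundary. A small fuzz before floor() keeps the month from slipping:

```r
# Fractional-year timestamps, as a zooreg-style monthly index would produce.
t <- 2009 + (0:3)/12

# Adding a small fuzz before floor() guards against 12*frac coming out
# as 0.999999... instead of exactly 1.  (The 1e-4 constant is illustrative.)
months_fuzzed <- floor(12 * (t - floor(t)) + 1e-4) + 1
months_fuzzed   # months 1..4 as intended
```

If two pieces of software (or two versions) apply different rounding here, the same numeric index can map to adjacent months, which would look exactly like the Dec-2008/Jan-2009 discrepancy in this thread.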



Re: [R] confused with yearmon, xts and maybe zoo

2010-04-18 Thread simeon duckworth
Hi Gabor

That's odd. I still get the same problem with the same versions of the
software in your mail ... viz. as.yearmon converts 2009(1) to Dec-2008,
and although xts is indexed at Jan 2009 in xx, using that index to create
another xts object reverts to Dec-2008.

grateful for any suggestions

## code ##
library(xts)
z <- zooreg(1:24, frequency = 12, start = c(2009,1))
x <- xts(z, as.yearmon(index(z)))
xx <- x[-1, ]
index(xx)
xxx <- xts(NA[1:length(xx)], index(xx))
periodicity(x)
periodicity(xx)
periodicity(xxx)

## results ###
 periodicity(x)
Monthly periodicity from Dec 2008 to Nov 2010
 periodicity(xx)
Monthly periodicity from Jan 2009 to Nov 2010
 periodicity(xxx)
Monthly periodicity from Dec 2008 to Oct 2010

 R.version.string
[1] R version 2.10.1 (2009-12-14)
 win.version()
[1] Windows XP (build 2600) Service Pack 3
 packageDescription("xts")$Version
[1] 0.7-0
 Sys.time()
[1] 2010-04-18 19:37:26 BST



On Sun, Apr 18, 2010 at 1:25 PM, simeon duckworth simeonduckwo...@gmail.com
 wrote:

 R-listers,

 I am using xts with a yearmon index, but am getting some inconsistent
 results with the date index when i drop observations (for example by using
 na.omit).

 The issue is illustrated in the example below.  If I start with a monthly
 zooreg series starting in 2009, yearmon converts this to Dec-2008.  Not
 such a worry for my example, but strange.  Having converted to xts, i drop
 the first observation.  The index shows jan 2009.  But if i create a new
 variable with this index, it shifts the series back to dec 2008.

 No doubt i am doing something wrong.  very grateful for any tips

 library(xts)

 z <- zooreg(1:24, frequency = 12, start = c(2009,1))  # monthly data starting 2009
 x <- xts(z, as.yearmon(index(z)))        # starts Dec 2008
 xx <- x[-1, ]                            # drop first obs (eg through na.omit)
 index(xx)                                # starts Jan 2009
 xxx <- xts(NA[1:length(xx)], index(xx))  # back to Dec 2008

 periodicity(x)
 periodicity(xx)
 periodicity(xxx)





Re: [R] confused with yearmon, xts and maybe zoo

2010-04-18 Thread Gabor Grothendieck
On Sun, Apr 18, 2010 at 2:51 PM, simeon duckworth
simeonduckwo...@gmail.com wrote:
 Hi Gabor

 Thats odd. I still get the same problem with the same versions of the
 software in your mail ... viz as.yearmon converts 2009(1) to Dec-2008

We can't conclude that it's in as.yearmon based on the output shown.
What is the output of:

   index(z)
   as.yearmon(index(z))
   x

This is what I get:

 index(z)
 [1] 2009.000 2009.083 2009.167 2009.250 2009.333 2009.417 2009.500 2009.583
 [9] 2009.667 2009.750 2009.833 2009.917 2010.000 2010.083 2010.167 2010.250
[17] 2010.333 2010.417 2010.500 2010.583 2010.667 2010.750 2010.833 2010.917
 as.yearmon(index(z))
 [1] Jan 2009 Feb 2009 Mar 2009 Apr 2009 May 2009 Jun 2009
 [7] Jul 2009 Aug 2009 Sep 2009 Oct 2009 Nov 2009 Dec 2009
[13] Jan 2010 Feb 2010 Mar 2010 Apr 2010 May 2010 Jun 2010
[19] Jul 2010 Aug 2010 Sep 2010 Oct 2010 Nov 2010 Dec 2010
 head(x)
 x
Jan 2009 1
Feb 2009 2
Mar 2009 3
Apr 2009 4
May 2009 5
Jun 2009 6



 and although xts is indexed at Jan 2009 in xx, using it to create another
 xts object with that index reverts to Dec-2008.

 grateful for any suggestions

 ## code ##
 library(xts)
 z <- zooreg(1:24, frequency=12, start=c(2009,1))
 x <- xts(z, as.yearmon(index(z)))
 xx <- x[-1, ]
 index(xx)
 xxx <- xts(NA[1:length(xx)], index(xx))
 periodicity(x)
 periodicity(xx)
 periodicity(xxx)

 ## results ###
 periodicity(x)
 Monthly periodicity from Dec 2008 to Nov 2010
 periodicity(xx)
 Monthly periodicity from Jan 2009 to Nov 2010
 periodicity(xxx)
 Monthly periodicity from Dec 2008 to Oct 2010

 R.version.string
 [1] R version 2.10.1 (2009-12-14)
 win.version()
 [1] Windows XP (build 2600) Service Pack 3
 packageDescription("xts")$Version
 [1] 0.7-0
 Sys.time()
 [1] 2010-04-18 19:37:26 BST



 On Sun, Apr 18, 2010 at 1:25 PM, simeon duckworth simeonduckwo...@gmail.com
 wrote:

 R-listers,

 I am using xts with a yearmon index, but am getting some inconsistent
 results with the date index when i drop observations (for example by using
 na.omit).

 The issue is illustrated in the example below.  If I start with a monthly
 zooreg series starting in 2009, yearmon converts this to Dec-2008.  Not
 such a worry for my example, but strange.  Having converted to xts, i drop
 the first observation.  The index shows jan 2009.  But if i create a new
 variable with this index, it shifts the series back to dec 2008.

 No doubt i am doing something wrong.  very grateful for any tips

 library(xts)

 z <- zooreg(1:24, frequency=12, start=c(2009,1))  # monthly data starting 2009
 x <- xts(z, as.yearmon(index(z)))                 # starts Dec 2008
 xx <- x[-1, ]                                     # drop first obs (eg through na.omit)
 index(xx)                                         # starts Jan 2009
 xxx <- xts(NA[1:length(xx)], index(xx))           # back to Dec 2008

 periodicity(x)
 periodicity(xx)
 periodicity(xxx)






Re: [R] confused with yearmon, xts and maybe zoo

2010-04-18 Thread simeon duckworth
... forgot to post this back to the r-list.

it seems that the problem is with xts rather than zoo and yearmon per se, i.e.
using yearmon to index xts gives inconsistent results.

grateful for any help anyone can offer.

thanks




On Sun, Apr 18, 2010 at 8:15 PM, simeon duckworth simeonduckwo...@gmail.com
 wrote:

 Hi Gabor

 It seems as though the issue is in working with yearmon in xts.  The command
 as.yearmon(index(z)) works in the same way as yours, but not when used to
 index the xts object.




 ## code
 library(xts)
 z <- zooreg(1:24, frequency=12, start=c(2009,1))
 x <- xts(z, as.yearmon(index(z)))
 xx <- x[-1, ]
 index(xx)
 xxx <- xts(NA[1:length(xx)], index(xx))

 index(z)
 as.yearmon(index(z))
 head(x,3)
 head(xx,3)
 head(xxx,3)


 ## output

  index(z)
  [1] 2009.000 2009.083 2009.167 2009.250 2009.333 2009.417 2009.500
 2009.583
  [9] 2009.667 2009.750 2009.833 2009.917 2010.000 2010.083 2010.167
 2010.250
 [17] 2010.333 2010.417 2010.500 2010.583 2010.667 2010.750 2010.833
 2010.917
  as.yearmon(index(z))
  [1] Jan 2009 Feb 2009 Mar 2009 Apr 2009 May 2009 Jun 2009
  [7] Jul 2009 Aug 2009 Sep 2009 Oct 2009 Nov 2009 Dec 2009
 [13] Jan 2010 Feb 2010 Mar 2010 Apr 2010 May 2010 Jun 2010
 [19] Jul 2010 Aug 2010 Sep 2010 Oct 2010 Nov 2010 Dec 2010
  head(x,3)
  x
 Dec 2008 1
 Jan 2009 2
 Feb 2009 3
  head(xx,3)
  x
 Jan 2009 2
 Feb 2009 3
 Mar 2009 4
  head(xxx,3)
  [,1]
 Dec 2008   NA
 Jan 2009   NA
 Feb 2009   NA
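One thing worth checking (an assumption on my part, not something confirmed in this thread): xts stores a yearmon index internally as a point in time, so under a non-UTC timezone the first instant of a month can shift into the previous month when it is converted back for display. A minimal sketch, assuming the zoo and xts packages are installed:

```r
library(zoo)
library(xts)

# Hedged sketch: pin the session to GMT before building the index, so the
# yearmon -> internal-time round trip inside xts cannot cross a month boundary.
Sys.setenv(TZ = "GMT")

z <- zooreg(1:24, frequency = 12, start = c(2009, 1))
x <- xts(z, as.yearmon(index(z)))
head(index(x), 1)   # under GMT this should be Jan 2009, not Dec 2008
```

If setting TZ changes the behaviour, the Dec-2008 shift is a timezone artifact rather than a bug in zoo's yearmon class itself.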





 On Sun, Apr 18, 2010 at 8:00 PM, Gabor Grothendieck 
 ggrothendi...@gmail.com wrote:

 On Sun, Apr 18, 2010 at 2:51 PM, simeon duckworth
 simeonduckwo...@gmail.com wrote:
  Hi Gabor
 
   That's odd. I still get the same problem with the same versions of the
   software in your mail ... viz as.yearmon converts 2009(1) to Dec-2008

  We can't conclude that it's in as.yearmon based on the output shown.
 What is the output of:

   index(z)
   as.yearmon(index(z))
   x

 This is what I get:

  index(z)
  [1] 2009.000 2009.083 2009.167 2009.250 2009.333 2009.417 2009.500
 2009.583
  [9] 2009.667 2009.750 2009.833 2009.917 2010.000 2010.083 2010.167
 2010.250
 [17] 2010.333 2010.417 2010.500 2010.583 2010.667 2010.750 2010.833
 2010.917
  as.yearmon(index(z))
  [1] Jan 2009 Feb 2009 Mar 2009 Apr 2009 May 2009 Jun 2009
  [7] Jul 2009 Aug 2009 Sep 2009 Oct 2009 Nov 2009 Dec 2009
 [13] Jan 2010 Feb 2010 Mar 2010 Apr 2010 May 2010 Jun 2010
 [19] Jul 2010 Aug 2010 Sep 2010 Oct 2010 Nov 2010 Dec 2010
  head(x)
 x
 Jan 2009 1
 Feb 2009 2
 Mar 2009 3
 Apr 2009 4
 May 2009 5
 Jun 2009 6



  and although xts is indexed at Jan 2009 in xx, using it to create
 another
  xts object with that index reverts to Dec-2008.
 
  grateful for any suggestions
 
  ## code ##
  library(xts)
  z <- zooreg(1:24, frequency=12, start=c(2009,1))
  x <- xts(z, as.yearmon(index(z)))
  xx <- x[-1, ]
  index(xx)
  xxx <- xts(NA[1:length(xx)], index(xx))
  periodicity(x)
  periodicity(xx)
  periodicity(xxx)
 
  ## results ###
  periodicity(x)
  Monthly periodicity from Dec 2008 to Nov 2010
  periodicity(xx)
  Monthly periodicity from Jan 2009 to Nov 2010
  periodicity(xxx)
  Monthly periodicity from Dec 2008 to Oct 2010
 
  R.version.string
  [1] R version 2.10.1 (2009-12-14)
  win.version()
  [1] Windows XP (build 2600) Service Pack 3
  packageDescription("xts")$Version
  [1] 0.7-0
  Sys.time()
  [1] 2010-04-18 19:37:26 BST
 
 
 
  On Sun, Apr 18, 2010 at 1:25 PM, simeon duckworth 
 simeonduckwo...@gmail.com
  wrote:
 
  R-listers,
 
  I am using xts with a yearmon index, but am getting some inconsistent
  results with the date index when i drop observations (for example by
 using
  na.omit).
 
  The issue is illustrated in the example below.  If I start with a
 monthly
  zooreg series starting in 2009, yearmon converts this to Dec-2008.
  Not
  such a worry for my example, but strange.  Having converted to xts, i
 drop
  the first observation.  The index shows jan 2009.  But if i create a
 new
  variable with this index, it shifts the series back to dec 2008.
 
  No doubt i am doing something wrong.  very grateful for any tips
 
  library(xts)

  z <- zooreg(1:24, frequency=12, start=c(2009,1))  # monthly data starting 2009
  x <- xts(z, as.yearmon(index(z)))                 # starts Dec 2008
  xx <- x[-1, ]                                     # drop first obs (eg through na.omit)
  index(xx)                                         # starts Jan 2009
  xxx <- xts(NA[1:length(xx)], index(xx))           # back to Dec 2008

  periodicity(x)
  periodicity(xx)
  periodicity(xxx)
 
 
 






Re: [R] confused by classes and methods.

2010-03-09 Thread Albert-Jan Roskam
Hi Rob,
 
I just started reading about classes (and also learning R), so I apologize if 
the following code is confusing you more. I simplified the code somewhat in 
order to better understand what's going  on. I was wondering: are you 
deliberately reimplementing the builtin update() function?
 
setClass(Class="StatisticInfo",
    representation( oldData = "data.frame",
                    newData = "data.frame"
                   )
 )
# declare the update method, even though it exists already.
setGeneric(
  name="update",
  def=function(object){standardGeneric("update")}
)

setMethod(f="update", signature("StatisticInfo"),
  definition = function(object){
    min = min(object@newData, object@oldData, na.rm=T)
    avg = mean(mean(cbind(object@newData, object@oldData)))
    max = max(object@newData, object@oldData, na.rm=T)
    return(list(min, avg, max))
    }
)
old <- data.frame(runif(10, 1, 10))
new <- data.frame(runif(10, 1, 9))
instance <- new(Class="StatisticInfo", oldData=old, newData=new)
update(instance)

Does this make sense to you?

Cheers!!
Albert-Jan

~~
In the face of ambiguity, refuse the temptation to guess.
~~

--- On Tue, 3/9/10, Rob Forler rfor...@uchicago.edu wrote:


From: Rob Forler rfor...@uchicago.edu
Subject: [R] confused by classes and methods.
To: r-help@r-project.org
Date: Tuesday, March 9, 2010, 12:09 AM


Hello, I have a simple class that looks like:

setClass("statisticInfo",
        representation( max = "numeric",
                        min = "numeric",
                        beg = "numeric",
                        current = "numeric",
                        avg = "numeric",
                        obs = "vector"
                       )
         )

and the following function

updateStatistic <- function(statistic, newData){
    statistic@obs = c(statistic@obs, newData)
    statistic@max = max(newData, statistic@max, na.rm=T)
    statistic@min = min(newData, statistic@min, na.rm=T)
    statistic@avg = mean(statistic@obs)
    statistic@current = newData
    if(length(statistic@obs)==1 || is.na(statistic@beg)){
        statistic@beg = newData
    }
    return(statistic)
}

Firstly,

I know you can use methods which seems to add some value. I looked at
http://developer.r-project.org/methodDefinition.html but I try

setMethod("update", signature(statistic="statisticInfo", newData="numeric"),

function(statistic, newData){
    statistic@obs = c(statistic@obs, newData)
    statistic@max = max(newData, statistic@max, na.rm=T)
    statistic@min = min(newData, statistic@min, na.rm=T)
    statistic@avg = mean(statistic@obs)
    statistic@current = newData
    if(length(statistic@obs)==1 || is.na(statistic@beg)){
        statistic@beg = newData
    }
    return(statistic)
}
)

Creating a new generic function for "update" in ".GlobalEnv"
Error in match.call(fmatch, fcall) :
  unused argument(s) (statistic = "statisticInfo", newData = "numeric")
1: source("tca.init.R", chdir = T)
2: eval.with.vis(ei, envir)
3: eval.with.vis(expr, envir, enclos)
4: source("../../studies/tca.tradeClassifyFuncs.R")
5: eval.with.vis(ei, envir)
6: eval.with.vis(expr, envir, enclos)
7: setMethod("update", signature(statistic = "statisticInfo", newData =
"numeric"), function(statistic, newData) {
8: isSealedMethod(f, signature, fdef, where = where)
9: getMethod(f, signature, optional = TRUE, where = where, fdef = fGen)
10: matchSignature(signature, f

I don't understand this; any help would be appreciated.

Secondly, can anyone give any examples of where methods are used that make
sense besides just checking the class inputs?

Thirdly, I've looked into passing by reference in R, and some options come
up, but in general they seem to be fairly complicated.

I would like update to work more like my update function, without
having to return a new object.

Something like
 statList = list(new("statisticInfo"))
 updateStatistic(statList[[1]], 3)
 statList[[1]]

#this would then have the updated one and not the old one.

Anyways,
The main reason I'm asking these questions is because I can't really find a
good online resource for this. Any help would be greatly appreciated.

Thanks,
Rob




  


[R] confused by classes and methods.

2010-03-08 Thread Rob Forler
Hello, I have a simple class that looks like:

setClass("statisticInfo",
representation( max = "numeric",
min = "numeric",
beg = "numeric",
current = "numeric",
avg = "numeric",
obs = "vector"
   )
 )

and the following function

updateStatistic <- function(statistic, newData){
statistic@obs = c(statistic@obs, newData)
statistic@max = max(newData, statistic@max, na.rm=T)
statistic@min = min(newData, statistic@min, na.rm=T)
statistic@avg = mean(statistic@obs)
statistic@current = newData
if(length(statistic@obs)==1 || is.na(statistic@beg)){
statistic@beg = newData
}
return(statistic)
}

Firstly,

I know you can use methods which seems to add some value. I looked at
http://developer.r-project.org/methodDefinition.html but I try

setMethod("update", signature(statistic="statisticInfo", newData="numeric"),

function(statistic, newData){
statistic@obs = c(statistic@obs, newData)
statistic@max = max(newData, statistic@max, na.rm=T)
statistic@min = min(newData, statistic@min, na.rm=T)
statistic@avg = mean(statistic@obs)
statistic@current = newData
if(length(statistic@obs)==1 || is.na(statistic@beg)){
statistic@beg = newData
}
return(statistic)
}
)

Creating a new generic function for "update" in ".GlobalEnv"
Error in match.call(fmatch, fcall) :
  unused argument(s) (statistic = "statisticInfo", newData = "numeric")
 1: source("tca.init.R", chdir = T)
 2: eval.with.vis(ei, envir)
 3: eval.with.vis(expr, envir, enclos)
 4: source("../../studies/tca.tradeClassifyFuncs.R")
 5: eval.with.vis(ei, envir)
 6: eval.with.vis(expr, envir, enclos)
 7: setMethod("update", signature(statistic = "statisticInfo", newData =
"numeric"), function(statistic, newData) {
 8: isSealedMethod(f, signature, fdef, where = where)
 9: getMethod(f, signature, optional = TRUE, where = where, fdef = fGen)
10: matchSignature(signature, f

I don't understand this; any help would be appreciated.
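A hedged sketch of what that error is likely about: stats::update() already exists with formal arguments (object, ...), and setMethod() on an existing function builds the generic from those formals, so the signature and method must use the generic's argument name (object) rather than new names such as statistic. A minimal illustration (assumption: plain S4 via the methods package that ships with R; the class and slot below are simplified stand-ins):

```r
library(methods)

setClass("statisticInfo", representation(obs = "numeric"))

# stats::update() has formals (object, ...); a method for it must use the
# argument name `object`. Extra values travel through `...`.
setMethod("update", "statisticInfo", function(object, ...) {
  newData <- c(...)
  object@obs <- c(object@obs, newData)
  object
})

s <- new("statisticInfo", obs = c(1, 2, 3))
s <- update(s, 4)
s@obs   # 1 2 3 4
```

Renaming the method's arguments to match the generic makes matchSignature() happy without defining a competing generic.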

Secondly, can anyone give any examples of where methods are used that make
sense besides just checking the class inputs?

Thirdly, I've looked into passing by reference in R, and some options come
up, but in general they seem to be fairly complicated.

I would like update to work more like my update function, without
having to return a new object.

Something like
 statList = list(new("statisticInfo"))
 updateStatistic(statList[[1]], 3)
 statList[[1]]

#this would then have the updated one and not the old one.
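On the pass-by-reference point, one base-R route (an illustrative sketch with hypothetical helper names, not the poster's code) is to keep the statistic's state in an environment, since environments are the one structure R does not copy on assignment:

```r
# Environments mutate in place, so the caller sees updates without
# reassigning the returned value.
makeStatistic <- function() {
  stat <- new.env()
  stat$obs <- numeric(0)
  stat
}

updateStatistic <- function(stat, newData) {
  stat$obs <- c(stat$obs, newData)
  stat$max <- max(stat$obs)
  stat$min <- min(stat$obs)
  stat$avg <- mean(stat$obs)
  invisible(stat)
}

statList <- list(makeStatistic())
updateStatistic(statList[[1]], 3)   # no reassignment needed
statList[[1]]$obs   # 3
```

Because the list element holds a reference to the environment, the update is visible through statList[[1]] exactly as the post asks for.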

Anyways,
The main reason I'm asking these questions is because I can't really find a
good online resource for this. Any help would be greatly appreciated.

Thanks,
Rob



[R] Confused about appending to list behavior...

2010-02-19 Thread Jason Rupert

Through help from the list and a little trial and error (mainly error) I think 
I figured out a couple of ways to append to a list.  Now I am trying to access 
the data that I appended to the list.   The example below shows where I'm 
trying to access that information via two different methods.  It turns out that 
trying to access the data the way one would elements in a data.frame does not 
work.  However, using the standard way of accessing data from a list, a la 
[[...]], seems to provide an answer.   

By any chance is there more documentation out there on lists and this behavior, 
as I would like to try to better understand what is really going on and why one 
approach works and another doesn't.   

Thank you again for all the help and feedback. I love lists, especially 
the fact that (unlike data.frames) you can store different types of data and 
arrays of different lengths.   They are great. 




> example_list <- list(tracking<-c("house"), house_type<-c("brick", "wood"), 
+ sizes<-c(1600, 1800, 2000, 2400))
> example_list
[[1]]
[1] "house"

[[2]]
[1] "brick" "wood" 

[[3]]
[1] 1600 1800 2000 2400

> 
> cost_limits <- c(20.25, 350010.15)
> 
> example_list[[4]] <- cost_limits
> 
> example_list
[[1]]
[1] "house"

[[2]]
[1] "brick" "wood" 

[[3]]
[1] 1600 1800 2000 2400

[[4]]
[1] 20.2 350010.2

> c(example_list, list(CostStuff=cost_limits))
[[1]]
[1] "house"

[[2]]
[1] "brick" "wood" 

[[3]]
[1] 1600 1800 2000 2400

[[4]]
[1] 20.2 350010.2

$CostStuff
[1] 20.2 350010.2

> example_list$CostStuff
NULL
> example_list[[5]]
Error in example_list[[5]] : subscript out of bounds
> example_list[[4]]
[1] 20.2 350010.2



Re: [R] Confused about appending to list behavior...

2010-02-19 Thread Dieter Menne


JustADude wrote:
 
 
 ...
 
 By any chance is there more documentation out there on lists and this
 behavior, as I would like to try to better understand what is really going
 on and why one approach works and another doesn't.   
 
 ...
  Example reproduced below
 

You forgot an assignment. 

Dieter

example_list <- list(tracking<-c("house"), 
house_type<-c("brick", "wood"), sizes<-c(1600, 1800, 2000, 2400))
example_list
cost_limits <- c(20.25, 350010.15)
example_list[[4]] <- cost_limits
example_list
# you forgot the left side here
# c(example_list, list(CostStuff=cost_limits))
# Should be
example_list <- c(example_list, list(CostStuff=cost_limits))
example_list$CostStuff
example_list[[5]]
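There is a second gotcha lurking in the original example (a small base-R sketch, not from the thread itself): inside list(), `<-` performs an assignment in the calling frame and leaves the element unnamed, while `=` creates a named element. That is why example_list$CostStuff returned NULL until the named element was assigned back:

```r
unnamed <- list(tracking <- c("house"))   # `<-` assigns `tracking` and leaves the element nameless
named   <- list(tracking = c("house"))    # `=` creates a named element

names(unnamed)   # NULL
names(named)     # "tracking"

# Appending by name both adds the element and keeps the result:
named$CostStuff <- c(20.25, 350010.15)
named$CostStuff  # 20.25 350010.15
```

So named access works only when the elements were given names in the first place; positional access with [[...]] works either way.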




-- 
View this message in context: 
http://n4.nabble.com/Confused-about-appending-to-list-behavior-tp1561547p1561723.html
Sent from the R help mailing list archive at Nabble.com.



[R] Confused on using expand.grid(), array(), image() and npudens(np) in my case

2009-12-13 Thread rusers.sh
Hi all,
  I want to use the npudens() function in the np package (multivariate
kernel density estimation), but was confused by the several functions in the
following codes,expand.grid(),array(),image() and npudensbw().
This confusion will only be generated in >= 3 dimensions. I marked the four
places with confusion1-4. I think there should be some kind
of correspondence in those four places,but cannot figure them out.Thanks
very much for chewing on this.
#simulated dataset: d
x1<-c(runif(100,0,1),runif(50,0.67,1)); y1<-c(runif(100,0,1),runif(50,0.67,1)); d1<-data.frame(x1,y1); colnames(d1)<-c("x","y")
x2<-c(runif(100,0,1),runif(50,0.33,0.67)); y2<-c(runif(100,0,1),runif(50,0.33,0.67)); d2<-data.frame(x2,y2); colnames(d2)<-c("x","y")
x3<-c(runif(100,0,1),runif(50,0,0.33)); y3<-c(runif(100,0,1),runif(50,0,0.33)); d3<-data.frame(x3,y3); colnames(d3)<-c("x","y")
d<-rbind(d1,d2,d3)
d$tf<-c(rep(1,150),rep(2,150),rep(3,150))
plot(d1); points(d2,col="red"); points(d3,col="green")
attach(d)

#Confusion1: how to specify the formula in the npudensbw() correctly? I find
#the sequence of ordered(tf)+x+y is important and here I may have a wrong
#specification

bw <- npudensbw(formula=~ordered(tf)+x+y, bwmethod="cv.ml")  #confusion1

year.seq <- sort(unique(d$tf))  # length is 3
x.seq <- seq(0,1,0.02)          # length is 51
y.seq <- seq(0,1,0.02)          # length is 51

#Confusion2: what is the correct sequence for the three variables
#(year.seq, x.seq and y.seq) in expand.grid()

data.eval <- expand.grid(tf=year.seq, x=x.seq, y=y.seq)  #confusion2

fhat <- fitted(npudens(bws=bw, newdata=data.eval))

#Confusion3: what is the correct sequence for the three variables in the c()
#options of array()

f <- array(fhat, c(51,51,3))  # number of year.seq is 3, and numbers of x.seq
                              # and y.seq are 51; confusion3

brks <- quantile(f, seq(0,1,0.05)); cols <- heat.colors(length(brks)-1); oldpar <- par(mfrow=c(1,3))

#Confusion4: what is the correct sequence for the three variables (tf, x and y)
#in the image()

for (i in 1:3) image(x.seq, y.seq, f[,,i], asp=1, xlab="", ylab="", main=i,
                     breaks=brks, col=cols)  #confusion4

par(oldpar)

#This is also confusing in 4, 5 and more dimensions.
  Any help or suggestions are greatly appreciated.
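The four confusions reduce to one rule, which a self-contained sketch can check with tiny stand-in data instead of npudens() output (the values below are hypothetical, chosen only so each cell encodes its own coordinates): expand.grid() varies its first argument fastest, and array() fills its first dimension fastest, so the dimension order in array() must mirror the argument order in expand.grid(), and the slice index must sit in tf's position:

```r
tf <- 1:3; xs <- 1:2; ys <- 1:2

g <- expand.grid(tf = tf, x = xs, y = ys)   # tf varies fastest
vals <- g$tf * 100 + g$x * 10 + g$y         # stand-in for fitted densities

# The first dimension of the array must be the fastest-varying grid variable:
a <- array(vals, c(length(tf), length(xs), length(ys)))

a[2, , ]   # the x-by-y surface for tf == 2 (slice on the FIRST index here)
```

With tf first in expand.grid(), a reshape of c(51, 51, 3) and a slice of f[,,i] would mismatch; either put tf last in the grid, or reshape as c(3, 51, 51) and slice on the first index.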
-- 
-
Jane Chang
Queen's



[R] Confused - better empirical results with error in data

2009-09-07 Thread Noah Silverman

Hi,

I have a strange one for the group.

We have a system that predicts probabilities using a fairly standard svm 
(e1071).  We are looking at probabilities of a binary outcome.


The input data is generated by a perl script that calculates a bunch of 
things, fetches data from a database, etc.


We train the system on 30,000 examples and then test the system on an 
unseen set of 5,000 records.


The real world results on the test set looked VERY good.  We were 
really happy with our model.


Then, we noticed that there was a big error in our data generation script 
and one of the values (an average of sorts.) was being calculated 
incorrectly.  (The perl script failed to clear two iterators, so they 
both grew with every record.)


As a quick experiment, we removed that item from our data set and 
re-ran the process.  The results were not very good.  Perhaps 75% as 
good as training with the wrong factor included.


So, this is really a philosophical question.  Do we:
1) Shrug and say, who cares, the SVM figured it out and likes 
that bad data item for some inexplicable reason
2) Tear into the math and try to figure out WHY the SVM is 
predicting more accurately


Any opinions??

Thanks!



Re: [R] Confused - better empirical results with error in data

2009-09-07 Thread S Ellison
Predicting whilst confused is unlikely to produce sound predictions...
my vote is for finding out why before believing anything.

 Noah Silverman n...@smartmediacorp.com 09/07/09 8:33 PM 
Hi,

I have a strange one for the group.

We have a system that predicts probabilities using a fairly standard svm

(e1071).  We are looking at probabilities of a binary outcome.

The input data is generated by a perl script that calculates a bunch of 
things, fetches data from a database, etc.

We train the system on 30,000 examples and then test the system on an 
unseen set of 5,000 records.

The real world results on the test set looked VERY good.  We were 
really happy with our model.

Then, we noticed that there was a big error in our data generation script

and one of the values (an average of sorts.) was being calculated 
incorrectly.  (The perl script failed to clear two iterators, so they 
both grew with every record.)

As a quick experiment, we removed that item from our data set and 
re-ran the process.  The results were not very good.  Perhaps 75% as 
good as training with the wrong factor included.

So, this is really a philosophical question.  Do we:
 1) Shrug and say, who cares, the SVM figured it out and likes 
that bad data item for some inexplicable reason
 2) Tear into the math and try to figure out WHY the SVM is 
predicting more accurately

Any opinions??

Thanks!



***
This email and any attachments are confidential. Any use...{{dropped:8}}



Re: [R] Confused - better empirical results with error in data

2009-09-07 Thread Mark Knecht
On Mon, Sep 7, 2009 at 12:33 PM, Noah Silvermann...@smartmediacorp.com wrote:
SNIP

 So, this is really a philosophical question.  Do we:
    1) Shrug and say, who cares, the SVM figured it out and likes that bad
 data item for some inexplicable reason
    2) Tear into the math and try to figure out WHY the SVM is predicting
 more accurately

 Any opinions??

 Thanks!


Boy, I'd sure think you'd want to know why it worked with the 'wrong'
calculations. It's not that the math is wrong, really, but rather that
it wasn't what you thought it was. I cannot see why you wouldn't want
to know why this mistake helped. Won't future project benefit?

Just my 2 cents,
Mark



Re: [R] Confused - better empirical results with error in data

2009-09-07 Thread Noah Silverman

You both make good points.

Ideally, it would be nice to know WHY it works.

Without digging into too much verbiage, the system is designed to 
predict the outcome of certain events.  The broken model predicts 
outcomes correctly much more frequently than one with the broken data 
withheld. So, to answer Mark's question, we say it's better because we 
see much better results with our broken model when applied to 
real-world data used for testing.


I have one theory.

The data is listed in our CSV file from newest to oldest.  We are 
supposed to calculate a value that is an average of some items.  We 
loop through some queries to our database and increment two variables - 
$total_found and $total_score.  The final value is simply $total_score / 
$total_found.


Our programmer forgot to reset both $total_score and $total_found back 
to zero for each record we process.  So both grow.


I think that this may, in a way, be some warped form of a recency 
weighted score.  The newer records will have a score more affected by 
their contribution to the wrongly growing totals.  A record that is 
closer to the end of the data set will be starting with HUGE values for 
$total_score and $total_found, so addition of its values will have very 
little effect.
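That theory is easy to state in miniature (a hedged reconstruction of the described Perl bug, sketched in R with made-up numbers): never-reset accumulators turn a per-record average into a cumulative one, whose later entries barely move when a new record arrives:

```r
score <- c(5, 1, 9, 3)   # per-record totals the script should have used
found <- c(1, 1, 1, 1)

per_record <- score / found                  # intended feature
cumulative <- cumsum(score) / cumsum(found)  # what the unreset iterators produce

per_record   # 5 1 9 3
cumulative   # 5.0 3.0 5.0 4.5 -- later records are damped by earlier history
```

The cumulative column is exactly the "recency-damped" feature described above, which is consistent with the model accidentally learning from a smoothed history rather than from noisy per-record values.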


We've done the following so far today  (Note, scores are just relative 
to indicate performance. Higher is better)

1) Run with bad data = 6.9
2) Run with bad data missing = 5.5
3) Run with correct data = ?? (We're running now, will take a few 
hours to compute.)



I might also try to plot the bad data.  It would be interesting to see 
what shape it has...











On 9/7/09 1:05 PM, Mark Knecht wrote:

On Mon, Sep 7, 2009 at 12:33 PM, Noah Silvermann...@smartmediacorp.com  wrote:
SNIP
   

So, this is really a philosophical question.  Do we:
1) Shrug and say, who cares, the SVM figured it out and likes that bad
data item for some inexplicable reason
2) Tear into the math and try to figure out WHY the SVM is predicting
more accurately

Any opinions??

Thanks!

 

Boy, I'd sure think you'd want to know why it worked with the 'wrong'
calculations. It's not that the math is wrong, really, but rather that
it wasn't what you thought it was. I cannot see why you wouldn't want
to know why this mistake helped. Won't future project benefit?

Just my 2 cents,
Mark







Re: [R] Confused - better empirical results with error in data

2009-09-07 Thread Mark Knecht
On Mon, Sep 7, 2009 at 1:22 PM, Noah Silvermann...@smartmediacorp.com wrote:
SNIP

 The data is listed in our CSV file from newest to oldest.  We are supposed
 to calculated a valued that is an average of some items.  We loop through
 some queries to our database and increment two variables - $total_found and
 $total_score.  The final value is simply $total_score / $total_found.

SNIP

This does seem like it's rife with possibilities for non-causal
action. (Assuming you process from newest toward oldest which is what
I think you say you are doing...) I'm pretty sure that if I knew that
the Dow was going to be higher 3 months from now then my day trading
results would tend toward long vs short and I'd do better.
Unfortunately I don't know where it will be and cannot really do that.

Have you considered processing the data in the other direction? Not in
R, but rather reversing the data frame or better yet writing the csv
file in date order?

Cheers,
Mark



Re: [R] Confused - better empirical results with error in data

2009-09-07 Thread Noah Silverman
Interesting point.

Our data is NOT continuous.  Sure, some of the test examples are older 
than others, but there is no relationship between them. (More 
Markov-like in behavior.)

When creating a specific record, we actually account for this in our SQL 
queries which tend to be along the lines of:
select x from table where id=1234 and date < '2008-05-01'

This way, whatever data we're looking at, we set things so the current 
and future data doesn't exist yet.

My understanding was that an SVM wouldn't care about the order of the 
data input as long as the examples are independent.

Regardless of all this, we look at real-world test for our evaluation.
 1) We trained the system on examples prior to a certain date.
 2) We test the system with unseen examples after that date.

We take the approach of: if we had used this model, what would our 
portfolio be at the end of the test period?  Sure, we also look at 
things like AUC and R2 (from applying the model to the TEST data).  
Generally, we see a correlation between AUC, R2, and our final result, 
but not a perfect one.  A model with a SLIGHTLY lower R2 actually 
produced better results in a few cases.  This process should produce 
solid results, as we are eliminating any chance of over-fitting when 
measuring performance.

So one could argue that whatever gives the best results on the test 
data is the best model, regardless of the correctness of the theory.

Just for fun, I'll see if I can schedule a few hours to run the same 
experiment with the training data order reversed.  If I'm correct, the 
results should be the same.
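[Editor's note: a minimal sketch of that order-reversal check in R; the data frame name `train` and its columns are hypothetical stand-ins for the poster's data, not from the thread.]

```r
# Build a toy "training set" ordered newest-to-oldest
train <- data.frame(date = as.Date("2009-09-07") - 0:4, score = 5:1)

# Reverse the row order so the oldest rows come first
train_rev <- train[rev(seq_len(nrow(train))), ]

# An order-independent learner (e.g. an SVM on independent examples)
# should give the same fit on `train` and `train_rev`
```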

Thanks!

--
N


On 9/7/09 2:34 PM, Mark Knecht wrote:
 On Mon, Sep 7, 2009 at 1:22 PM, Noah Silvermann...@smartmediacorp.com  
 wrote:
 SNIP

 The data is listed in our CSV file from newest to oldest.  We are supposed
 to calculate a value that is an average of some items.  We loop through
 some queries to our database and increment two variables - $total_found and
 $total_score.  The final value is simply $total_score / $total_found.

  
 SNIP

 This does seem like it's rife with possibilities for non-causal
 action. (Assuming you process from newest toward oldest which is what
 I think you say you are doing...) I'm pretty sure that if I knew that
 the Dow was going to be higher 3 months from now then my day trading
 results would tend toward long vs short and I'd do better.
 Unfortunately I don't know where it will be and cannot really do that.

 Have you considered processing the data in the other direction? Not in
 R, but rather by reversing the data frame or, better yet, writing the csv
 file in date order?

 Cheers,
 Mark




Re: [R] Confused about behavior of an S4 object containing a ts object

2009-01-22 Thread Lyman, Mark
I posted the question below about a month ago but received no response.
I still have not been able to figure out what is happening.

I also noticed another oddity. When the data part of the object is a
multivariate time series, it doesn't show up in the structure, but it
can be treated as a multivariate time series. Is this a bug in str?

> setClass("tsExtended", representation = representation(description
+ = "character"), contains = "ts")
[1] "tsExtended"
> tmp <- new("tsExtended", matrix(1:20, ncol=2), description = "My Time
Series")
> tsp(tmp) <- c(1, 5.5, 2)
> tmp
Object of class "tsExtended"
Time Series:
Start = c(1, 1) 
End = c(5, 2) 
Frequency = 2 
    Series 1 Series 2
1.0        1       11
1.5        2       12
2.0        3       13
2.5        4       14
3.0        5       15
3.5        6       16
4.0        7       17
4.5        8       18
5.0        9       19
5.5       10       20
Slot "description":
[1] "My Time Series"

> str(tmp)
Formal class 'tsExtended' [package ".GlobalEnv"] with 4 slots
  ..@ .Data  : int [1:20] 1 2 3 4 5 6 7 8 9 10 ...
  ..@ description: chr "My Time Series"
  ..@ tsp: num [1:3] 1 5.5 2
  ..@ .S3Class   : chr "ts"
> tmp[,1]
Time Series:
Start = c(1, 1) 
End = c(5, 2) 
Frequency = 2 
 [1]  1  2  3  4  5  6  7  8  9 10
> plot(tmp[,2])

Mark Lyman


-Original Message-
From: Lyman, Mark 
Sent: Thursday, December 18, 2008 1:02 PM
To: 'r-help@r-project.org'
Subject: Confused about behavior of an S4 object containing a ts object

I am trying to define an S4 class that contains a ts class object, a
simple 
example is shown in the code below. However, when I try to create a new
object 
of this class the tsp part is ignored, see below. Am I doing something
wrong, 
or is this just a peril of mixing S3 and S4 objects?

> setClass("tsExtended", representation = representation(description
= "character"), contains = "ts")
[1] "tsExtended"
> new("tsExtended", ts(1:10, frequency = 2), description = "My Time
Series")
Object of class "tsExtended"
Time Series:
Start = 1
End = 10
Frequency = 1
 [1]  1  2  3  4  5  6  7  8  9 10
Slot "description":
[1] "My Time Series"

> # This however seems to work
> tmp <- new("tsExtended", 1:10, description = "My Time Series")
> tsp(tmp) <- c(1, 5.5, 2)
> tmp
Object of class "tsExtended"
Time Series:
Start = c(1, 1)
End = c(5, 2)
Frequency = 2
 [1]  1  2  3  4  5  6  7  8  9 10
Slot "description":
[1] "My Time Series"
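
[Editor's note: one workaround sketch, not from the thread. Since constructing from a ts object drops the tsp attribute while setting tsp() afterwards works, a small helper constructor can reapply it; the name `makeTsExtended` is made up, and this handles the univariate case only.]

```r
# Hypothetical helper: build a tsExtended from an existing ts object,
# reapplying the time-series attributes that new() appears to drop
makeTsExtended <- function(x, description) {
  obj <- new("tsExtended", as.vector(x), description = description)
  tsp(obj) <- tsp(x)  # restore start/end/frequency from the source ts
  obj
}

# e.g. makeTsExtended(ts(1:10, frequency = 2), "My Time Series")
```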

Mark Lyman, Statistician
Engineering Systems & Integration, ATK



[R] Confused about behavior of an S4 object containing a ts object

2008-12-18 Thread Lyman, Mark
I am trying to define an S4 class that contains a ts class object, a
simple 
example is shown in the code below. However, when I try to create a new
object 
of this class the tsp part is ignored, see below. Am I doing something
wrong, 
or is this just a peril of mixing S3 and S4 objects?

> setClass("tsExtended", representation = representation(description
= "character"), contains = "ts")
[1] "tsExtended"
> new("tsExtended", ts(1:10, frequency = 2), description = "My Time
Series")
Object of class "tsExtended"
Time Series:
Start = 1
End = 10
Frequency = 1
 [1]  1  2  3  4  5  6  7  8  9 10
Slot "description":
[1] "My Time Series"

> # This however seems to work
> tmp <- new("tsExtended", 1:10, description = "My Time Series")
> tsp(tmp) <- c(1, 5.5, 2)
> tmp
Object of class "tsExtended"
Time Series:
Start = c(1, 1)
End = c(5, 2)
Frequency = 2
 [1]  1  2  3  4  5  6  7  8  9 10
Slot "description":
[1] "My Time Series"

Mark Lyman, Statistician
Engineering Systems & Integration, ATK



[R] Confused with default device setup

2008-10-15 Thread Gang Chen
When invoking dev.new() on my Mac OS X 10.4.11, I get an X11 window
instead of quartz, which I would find more desirable.  So I'd like to set
the default device to quartz. However, I'm confused because of the
following:

> Sys.getenv("R_DEFAULT_DEVICE")
R_DEFAULT_DEVICE
        "quartz"

> getOption("device")
[1] "X11"

What's going on?

Also, is the file Renviron under /Library/Frameworks/R.framework/Resources/
etc/ppc/ the one I should modify if I want to change some environment
variables? But I don't see R_DEFAULT_DEVICE there.
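
[Editor's note: a hedged sketch of one way to get the desired default; setting this in ~/.Rprofile is an assumption about the reader's setup, not something stated in the thread.]

```r
# Make dev.new() consult the "device" option and open quartz for this
# session; the same line can go in ~/.Rprofile to apply at startup
options(device = "quartz")

dev.new()  # should now open a quartz window rather than X11
```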

TIA,
Gang



Re: [R] Confused with default device setup

2008-10-15 Thread Prof Brian Ripley

This was also posted on R-sig-mac, and I've answered it there.
Please don't cross-post.

On Wed, 15 Oct 2008, Gang Chen wrote:


When invoking dev.new() on my Mac OS X 10.4.11, I get an X11 window
instead of quartz, which I would find more desirable.  So I'd like to set
the default device to quartz. However, I'm confused because of the
following:


Sys.getenv("R_DEFAULT_DEVICE")

R_DEFAULT_DEVICE
        "quartz"


getOption("device")

[1] "X11"

What's going on?

Also is file Renviron under /Library/Frameworks/R.framework/Resources/
etc/ppc/ the one I should modify if I want to change some environment
variables? But I don't see R_DEFAULT_DEVICE there.

TIA,
Gang


--
Brian D. Ripley,  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel:  +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UKFax:  +44 1865 272595



[R] confused about CORREP cor.LRtest

2008-03-07 Thread Mark W Kimpel
After some struggling with the data format, non-standard in 
BioConductor, I have gotten cor.balance in package CORREP to work. My 
desire was to obtain maximum-likelihood p-values from the same data 
object using cor.LRtest, but it appears that this function wants 
something different, which I can't figure out from the documentation.

Briefly, my dataset consists of 36 samples from 12 conditions and I have 
  497 genes of interest to be correlated. The following works:

M <- cor.balance(stddata, m = 3, G = 497)

The following does not:
M.p <- cor.LRtest(stddata, m1 = 3, m2 = 3)

Do I need to do something to stddata between example 1 and 2 or does m 
stand for something different in the two examples?

sessionInfo follows. Thanks, Mark

> sessionInfo()
R version 2.7.0 Under development (unstable) (2008-03-05 r44683)
x86_64-unknown-linux-gnu

locale:
LC_CTYPE=en_US.UTF-8;LC_NUMERIC=C;LC_TIME=en_US.UTF-8;LC_COLLATE=en_US.UTF-8;LC_MONETARY=en_US.UTF-8;LC_MESSAGES=en_US.UTF-8;LC_PAPER=en_US.UTF-8;LC_NAME=C;LC_ADDRESS=C;LC_TELEPHONE=C;LC_MEASUREMENT=en_US.UTF-8;LC_IDENTIFICATION=C

attached base packages:
[1] grid  tools stats graphics  grDevices datasets  utils
[8] methods   base

other attached packages:
  [1] rat2302_2.0.1Rgraphviz_1.17.13graph_1.17.17
  [4] igraph_0.5   CORREP_1.5.0 e1071_1.5-17
  [7] class_7.2-41 affy_1.17.8  preprocessCore_1.1.5
[10] affyio_1.7.13Biobase_1.99.1

loaded via a namespace (and not attached):
[1] cluster_1.11.10
-- 

Mark W. Kimpel MD  ** Neuroinformatics ** Dept. of Psychiatry
Indiana University School of Medicine

15032 Hunter Court, Westfield, IN  46074

(317) 490-5129 Work,  Mobile  VoiceMail
(317) 204-4202 Home (no voice mail please)

mwkimpelatgmaildotcom



[R] Confused about Tukey mult. comp. after ANCOVA

2007-10-11 Thread Chabot Denis
Hi,

I am reposting this as I fear my original post (on Oct. 4th) got  
buried by all the excitement of the R 2.6 release...

I had a first occasion to try multiple comparisons (of intercepts, I  
suppose) following a significant result in an ANCOVA. As until now I  
was doing this with JMP, I compared my results and the post-hoc  
comparisons were different between R and JMP.

I chose to use an example data set from JMP because it was small, so  
I can show it here. It is not the best example for an ANCOVA because  
the factor Drug does not have a significant effect, but it will do.

> drug$x
 [1] 11  8  5 14 19  6 10  6 11  3  6  6  7  8 18  8 19  8  5 15 16 13 11  9 21 16 12
[28] 12  7 12

> drug$y
 [1]  6  0  2  8 11  4 13  1  8  0  0  2  3  1 18  4 14  9  1  9 13 10 18  5 23 12  5
[28] 16  1 20
> drug$Drug
 [1] a a a a a a a a a a d d d d d d d d d d f f f f f f f f f f
Levels: a d f

I did not manage to get TukeyHSD to work if I fitted the ANCOVA with  
lm, so I used aov:

my.anc <- aov(y ~ x + Drug, data = drug)

> summary(my.anc)
            Df Sum Sq Mean Sq F value    Pr(>F)
x            1 802.94  802.94 50.0393 1.639e-07 ***
Drug         2  68.55   34.28  2.1361    0.1384
Residuals   26 417.20   16.05
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

I tried this to compare the Drugs, correcting for the effect of x.

> TukeyHSD(my.anc, "Drug")
   Tukey multiple comparisons of means
 95% family-wise confidence level

Fit: aov(formula = y ~ x + Drug, data = drug)

$Drug
          diff       lwr      upr     p adj
d-a 0.03131758 -4.420216 4.482851 0.9998315
f-a 3.04677613 -1.404758 7.498310 0.2239746
f-d 3.01545855 -1.436075 7.466992 0.2305187

Warning message:
non-factors ignored: x in: replications(paste("~", xx), data = mf)

I am not sure about the Warning; maybe it is the reason the
differences shown here are different from those shown in JMP for the
same analysis. Maybe TukeyHSD is not meant to be used with
non-factors (i.e. not valid for ANCOVAs)?

I just found the package multcomp and am not sure I understand it  
well yet, but its Tukey comparisons gave the same results as JMP.

> summary(glht(m3, linfct = mcp(Drug = "Tukey")))

 Simultaneous Tests for General Linear Hypotheses

Multiple Comparisons of Means: Tukey Contrasts


Fit: aov(formula = y ~ x + Drug, data = drug)

Linear Hypotheses:
           Estimate Std. Error t value p value
d - a == 0    0.109      1.795   0.061   0.998
f - a == 0    3.446      1.887   1.826   0.181
f - d == 0    3.337      1.854   1.800   0.189
(Adjusted p values reported)


I would very much like to understand why these two Tukey tests gave  
different results in R.
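
[Editor's note: a sketch for reference, not from the thread. glht() also accepts lm fits, so the same covariate-adjusted Tukey contrasts can be requested without going through aov at all.]

```r
# Fit the same ANCOVA with lm and ask multcomp for Tukey contrasts on Drug;
# these are adjusted for the covariate x, unlike TukeyHSD on raw means
library(multcomp)

m.lm <- lm(y ~ x + Drug, data = drug)
summary(glht(m.lm, linfct = mcp(Drug = "Tukey")))
```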

Thanks in advance,

Denis



[R] Confused about Tukey mult. comp. after ANCOVA

2007-10-04 Thread Chabot Denis
Hi,

I had a first occasion to try multiple comparisons (of intercepts, I  
suppose) following a significant result in an ANCOVA. As until now I  
was doing this with JMP, I compared my results and the post-hoc  
comparisons were different between R and JMP.

I chose to use an example data set from JMP because it was small, so  
I can show it here. It is not the best example for an ANCOVA because  
the factor Drug does not have a significant effect, but it will do.

> drug$x
 [1] 11  8  5 14 19  6 10  6 11  3  6  6  7  8 18  8 19  8  5 15 16 13 11  9 21 16 12
[28] 12  7 12

> drug$y
 [1]  6  0  2  8 11  4 13  1  8  0  0  2  3  1 18  4 14  9  1  9 13 10 18  5 23 12  5
[28] 16  1 20
> drug$Drug
 [1] a a a a a a a a a a d d d d d d d d d d f f f f f f f f f f
Levels: a d f

I did not manage to get TukeyHSD to work if I fitted the ANCOVA with  
lm, so I used aov:

my.anc <- aov(y ~ x + Drug, data = drug)

> summary(my.anc)
            Df Sum Sq Mean Sq F value    Pr(>F)
x            1 802.94  802.94 50.0393 1.639e-07 ***
Drug         2  68.55   34.28  2.1361    0.1384
Residuals   26 417.20   16.05
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

I tried this to compare the Drugs, correcting for the effect of x.

> TukeyHSD(my.anc, "Drug")
   Tukey multiple comparisons of means
 95% family-wise confidence level

Fit: aov(formula = y ~ x + Drug, data = drug)

$Drug
          diff       lwr      upr     p adj
d-a 0.03131758 -4.420216 4.482851 0.9998315
f-a 3.04677613 -1.404758 7.498310 0.2239746
f-d 3.01545855 -1.436075 7.466992 0.2305187

Warning message:
non-factors ignored: x in: replications(paste("~", xx), data = mf)

I am not sure about the Warning; maybe it is the reason the
differences shown here are different from those shown in JMP for the
same analysis. Maybe TukeyHSD is not meant to be used with
non-factors (i.e. not valid for ANCOVAs)?

I just found the package multcomp and am not sure I understand it  
well yet, but its Tukey comparisons gave the same results as JMP.

> summary(glht(m3, linfct = mcp(Drug = "Tukey")))

 Simultaneous Tests for General Linear Hypotheses

Multiple Comparisons of Means: Tukey Contrasts


Fit: aov(formula = y ~ x + Drug, data = drug)

Linear Hypotheses:
           Estimate Std. Error t value p value
d - a == 0    0.109      1.795   0.061   0.998
f - a == 0    3.446      1.887   1.826   0.181
f - d == 0    3.337      1.854   1.800   0.189
(Adjusted p values reported)


I would very much like to understand why these two Tukey tests gave  
different results in R.

Thanks in advance,

Denis
