date:20110331


Hi,
I had a piece of code which looped over a decimal vector like this:


for( i in where ){
  thisdata - subset(herde, herde$mlr = i)
  # do stuff with thisdata..
}

'where' is a vector like seq(-1, 1, by=0.1)

My problem was: 'nrow(thisdata)' in loop repetition 0.4 was different if 
'where' was seq(-1, 1, by=0.1) than when 'where' was seq(-0.8, 1, by=0.1)

It went away after I changed the first line to:

  thisdata - subset(herde, herde$mlr = round(i, digits=1))

This is that floating point trap the R inferno pdf talked about, 
right? That file talked about the problem, but didn't offer a solution.


Similar things happened when I created a table() from a vector with 
values in seq(-1, 1, by=0.1)


Do I really have to round every float at every occurence from now on, or 
is there another solution? I only found all.equal() and identical(), but 
I want to subset for observations with a value /greater/ than something.


Thanks in advance,
 Alex

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] VECM with UNRESTRICTED TREND

2011-03-31 Thread Grzegorz Konat

Hello Bernhard,

thank You so much one again! Now I (more or less) understand the idea, but
still have problem with its practical application.

I do (somewhat following example 8.1 in your textbook):

library(urca)
data(my.data)
names(my.data)
attach(my.data)
dat1 - my.data[, c(dY, X, dM)]
dat2 - cbind(time)
args('ca.jo')
yxm.vecm - ca.jo(dat1, type = trace, ecdet = trend, K = 2, spec =
longrun, dumvar=dat2)

The above code produces following output:

Error in r[i1, , drop = FALSE] - r[-nrow(r):-(nrow(r) - lag + 1L), , drop =
FALSE] :
  non-numeric argument to binary operator

What does that mean? Should I use cbind command to dat1 as well? And doesn't
it transform the series into series of integer numbers?

Thank you once again (especially for your patience).

Best,
Greg



2011/3/31 Pfaff, Bernhard Dr. bernhard_pf...@fra.invesco.com

  Hello Greg,

 you include your trend as a (Nx1) matrix and use this for 'dumvar'. The
 matrix 'dumvar' is just added to the VECM as deterministic regressors and
 while you are referring to case 5, this is basically what you are after, if
 I am not mistaken. But we aware that this implies a quadratic trend for the
 levels.

 Best,
 Bernhard

  --
 *Von:* Grzegorz Konat [mailto:grzegorz.ko...@ibrkk.pl]
 *Gesendet:* Mittwoch, 30. März 2011 20:50
 *An:* Pfaff, Bernhard Dr.; r-help@r-project.org
 *Betreff:* Re: [R] VECM with UNRESTRICTED TREND

 Hello Bernhard,

 Thank You very much. Unfortunately I'm still not really sure how should I
 use dummy vars in this context...
 If I have a system of three variables (x, y, z), lag order = 2 and 1
 cointegrating relation, what should I do? I mean, what kind of 'pattern'
 should be used to create those dummy variables, what should they represent
 and how many of them do I need?

 Many thanks in advance!

 Best,
 Greg

 2011/3/30 Pfaff, Bernhard Dr. bernhard_pf...@fra.invesco.com

 Hello Greg,

 you can exploit the argument 'dumvar' for this. See ?ca.jo

 Best,
 Bernhard

  -Ursprüngliche Nachricht-
  Von: r-help-boun...@r-project.org
  [mailto:r-help-boun...@r-project.org] Im Auftrag von Grzegorz Konat
  Gesendet: Mittwoch, 30. März 2011 16:46
  An: r-help@r-project.org
  Betreff: [R] VECM with UNRESTRICTED TREND
  
  Dear All,
 
  My question is:
 
  how can I estimate VECM system with unrestricted trend (aka
  case 5) option as a deterministic term?
 
  As far as I know, ca.jo in urca package allows for restricted trend
  only [vecm
  - ca.jo(data, type = trace/eigen, ecdet = trend, K =
  n, spec = transitory/longrun)].
  Obviously, I don't have to do this in urca, so if another
  package gives the possibility, please let me know too!
 
  Thanks in advance!
 
  Greg
 
[[alternative HTML version deleted]]
 
  __
  R-help@r-project.org mailing list
  https://stat.ethz.ch/mailman/listinfo/r-help
  PLEASE do read the posting guide
  http://www.R-project.org/posting-guide.html
  and provide commented, minimal, self-contained, reproducible code.
 
 *
 Confidentiality Note: The information contained in this message,
 and any attachments, may contain confidential and/or privileged
 material. It is intended solely for the person(s) or entity to
 which it is addressed. Any review, retransmission, dissemination,
 or taking of any action in reliance upon this information by
 persons or entities other than the intended recipient(s) is
 prohibited. If you received this in error, please contact the
 sender and delete the material from any computer.
 *




[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] That dreaded floating point trap

2011-03-31 Thread Ted Harding

On 31-Mar-11 11:24:01, Alexander Engelhardt wrote:
 Hi,
 I had a piece of code which looped over a decimal vector like this:
 
 for( i in where ){
thisdata - subset(herde, herde$mlr = i)
# do stuff with thisdata..
 }
 
 'where' is a vector like seq(-1, 1, by=0.1)
 
 My problem was: 'nrow(thisdata)' in loop repetition 0.4 was different
 if 'where' was seq(-1, 1, by=0.1) than when 'where' was
 , when you wan
 It went away after I changed the first line to:
 
thisdata - subset(herde, herde$mlr = round(i, digits=1))
 
 This is that floating point trap the R inferno pdf talked about, 
 right? That file talked about the problem, but didn't offer a solution.
 
 Similar things happened when I created a table() from a vector with 
 values in seq(-1, 1, by=0.1)
 
 Do I really have to round every float at every occurence from now on,
 or is there another solution? I only found all.equal() and identical(),
 but I want to subset for observations with a value /greater/ than
 something.
 
 Thanks in advance,
   Alex

A very straightforward way to avoid this problem is to construct the
sequence by multiplying a sequence of integers by an approriate
constant. E.g. for your first example:

  for( i in where ){
 thisdata - subset(herde, herde$mlr = i)
 # do stuff with thisdata..
  }
 
'where' is a vector like 0.1*((-10):10)
[ instead of seq(-1, 1, by=0.1) ]

and then, when you want to change to seq(-0.8, 1, by=0.1),
use instead 0.1*(-80,10).

Hoping this helps,
Ted.


E-Mail: (Ted Harding) ted.hard...@wlandres.net
Fax-to-email: +44 (0)870 094 0861
Date: 31-Mar-11   Time: 12:52:35
-- XFMail --

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] how about a p- operator?

2011-03-31 Thread John Fox

Dear Ken and Bill,

Thanks for the comments. I obviously didn't think about the precedence issue
and was unaware that one could define :=.

Best,
 John

 -Original Message-
 From: Kenn Konstabel [mailto:lebats...@gmail.com]
 Sent: March-31-11 7:12 AM
 To: William Dunlap
 Cc: John Fox; Carl Witthoft; r-help@r-project.org
 Subject: Re: [R] how about a p- operator?
 
 In addition to %...% operators one can define := (I haven't seen this
 possibility documented anywhere but it's used in a package) which seems to
 have different precedence.
 
  `:=`-`%-%`   # the %-% by John Fox
  a
 [1] -1
  a := 1000 + 1
 [1] 1001
  a
 [1] 1001
  a %-% 1000 + 1
 [1] 1001
  a
 [1] 1000
 
 Regards,
 Kenn
 
 Kenn Konstabel
 National Institute for Health Development Hiiu 42 Tallinn
 
 On Thu, Mar 31, 2011 at 2:50 AM, William Dunlap wdun...@tibco.com wrote:
  The %...% operators are not a panacea.
  they have the same precedence as `*`
  and `/` (I think) so you get things like:
 
    x %-% 10 - 8  # %-% has higher precedence than -
   [1] 2
    x # not what you thought it would be
   [1] 10
 
    x %-% 10 ^3 # but lower than ^
   [1] 1000
    x # this is what you expected
   [1] 1000
 
  It isn't that hard to write a package with your own parser in it.
  Just have it generate the call tree from your input text and call
  eval() on it.
 
  Bill Dunlap
  Spotfire, TIBCO Software
  wdunlap tibco.com
 
  -Original Message-
  From: r-help-boun...@r-project.org
  [mailto:r-help-boun...@r-project.org] On Behalf Of John Fox
  Sent: Wednesday, March 30, 2011 4:34 PM
  To: 'Carl Witthoft'
  Cc: r-help@r-project.org
  Subject: Re: [R] how about a p- operator?
 
  Dear Carl,
 
  I think that the following does what you want:
 
   `%-%` - function(e1, e2){
  +   e1 - deparse(substitute(e1))
  +   env - parent.frame()
  +   assign(e1, e2, envir=env)
  +   e2
  + }
 
   x %-% 10
  [1] 10
 
   x
  [1] 10
 
  But, as has been pointed out, it's probably easier just to
  parenthesize the usual assignment command.
 
  Regards,
   John
 
  
  John Fox
  Senator William McMaster
    Professor of Social Statistics
  Department of Sociology
  McMaster University
  Hamilton, Ontario, Canada
  http://socserv.mcmaster.ca/jfox
 
 
 
 
   -Original Message-
   From: r-help-boun...@r-project.org
  [mailto:r-help-boun...@r-project.org]
   On Behalf Of Carl Witthoft
   Sent: March-30-11 7:00 PM
   To: r-help@r-project.org
   Subject: [R] how about a p- operator?
  
   I was cursing Matlab again today (what else is new) because
  the default
   action for every Matlab command is to spew the result to
  the console, and
   one must remember to put that darn ;  at the end of every line.
  
   So I just wondered:  was there ever a discussion as to
  providing some
   modified version of the - and - operators in R to do
  the reverse?
     That is, since R does not print the values of a command
  to the console,
   what if there were  an operator such that
  
  
     newobject p- somefunction()
  
   would do the same as
  
   print(newobject - somefunction())
  
  
   Any thoughts?
   Carl
  
   __
   R-help@r-project.org mailing list
   https://stat.ethz.ch/mailman/listinfo/r-help
   PLEASE do read the posting guide http://www.R-project.org/posting-
   guide.html and provide commented, minimal, self-contained,
   reproducible code.
 
  __
  R-help@r-project.org mailing list
  https://stat.ethz.ch/mailman/listinfo/r-help
  PLEASE do read the posting guide
  http://www.R-project.org/posting-guide.html
  and provide commented, minimal, self-contained, reproducible code.
 
 
  __
  R-help@r-project.org mailing list
  https://stat.ethz.ch/mailman/listinfo/r-help
  PLEASE do read the posting guide
  http://www.R-project.org/posting-guide.html
  and provide commented, minimal, self-contained, reproducible code.
 

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Italicized title from index

Try this:

plot(1, main = bquote(Yield  for  ~ italic(.(spp)) ~ in management
region ~ .(region)))

On Thu, Mar 31, 2011 at 2:35 AM, Jeremy Newman jnewm...@uottawa.ca wrote:
 Hi all!

 I've written a handy script that uses a for loop to allow me to generate a
 large number of figures and statistical outputs for a large dataset.

 I am using indexing to retrieve a species name for the title of my graphs-
 which worked fine. However, I need to italicize these species names.

 I originally used the paste function, and had no problems with indexing:

 *main=paste(Yield for , testsub[1,3],  in management region ,
 testsub[1,2])*

 The title looks like: *Yield for Gadus morhua in management region 4X5Y*

 I tried bquote from a related help thread, I tried to emulate it:

 *spp-testsub[1,3]*
 *region-testsub[1,2]*
 *
 *
 *main=bquote(Yield ~ for ~ italic(.(spp)) ~ in ~ management ~ region ~
 .(region))*

 Which doesn't seem to work at all, but when I try not putting anything after
 the italic:

 *main=bquote(Yield ~ for ~ italic(.(spp)))*

 I get: *Yield for 1*
 While *spp=Gadus morhua*

 I'm at wit's end, I tried to read about substitute, expression, and eval
 functions in the hopes I can figure it out, but I am lost!

 Thanks for any help!

 Cheers,

 -Jeremy N

 Undergraduate Researcher in Macroecology
 University of Ottawa
 Department of Biology

 Ad astra per alia porci!

        [[alternative HTML version deleted]]

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.




-- 
Henrique Dallazuanna
Curitiba-Paraná-Brasil
25° 25' 40 S 49° 16' 22 O

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Italicized title from index

Better:

plot(1, main = bquote(Yield  for  ~ italic(.(as.character(spp))) ~
in management region ~ .(region)))

You have a factor, so you need convert it to character


On Thu, Mar 31, 2011 at 8:59 AM, Henrique Dallazuanna www...@gmail.com wrote:
 Try this:

 plot(1, main = bquote(Yield  for  ~ italic(.(spp)) ~ in management
 region ~ .(region)))

 On Thu, Mar 31, 2011 at 2:35 AM, Jeremy Newman jnewm...@uottawa.ca wrote:
 Hi all!

 I've written a handy script that uses a for loop to allow me to generate a
 large number of figures and statistical outputs for a large dataset.

 I am using indexing to retrieve a species name for the title of my graphs-
 which worked fine. However, I need to italicize these species names.

 I originally used the paste function, and had no problems with indexing:

 *main=paste(Yield for , testsub[1,3],  in management region ,
 testsub[1,2])*

 The title looks like: *Yield for Gadus morhua in management region 4X5Y*

 I tried bquote from a related help thread, I tried to emulate it:

 *spp-testsub[1,3]*
 *region-testsub[1,2]*
 *
 *
 *main=bquote(Yield ~ for ~ italic(.(spp)) ~ in ~ management ~ region ~
 .(region))*

 Which doesn't seem to work at all, but when I try not putting anything after
 the italic:

 *main=bquote(Yield ~ for ~ italic(.(spp)))*

 I get: *Yield for 1*
 While *spp=Gadus morhua*

 I'm at wit's end, I tried to read about substitute, expression, and eval
 functions in the hopes I can figure it out, but I am lost!

 Thanks for any help!

 Cheers,

 -Jeremy N

 Undergraduate Researcher in Macroecology
 University of Ottawa
 Department of Biology

 Ad astra per alia porci!

        [[alternative HTML version deleted]]

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.




 --
 Henrique Dallazuanna
 Curitiba-Paraná-Brasil
 25° 25' 40 S 49° 16' 22 O




-- 
Henrique Dallazuanna
Curitiba-Paraná-Brasil
25° 25' 40 S 49° 16' 22 O

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] That dreaded floating point trap


A very straightforward way to avoid this problem is to construct the
sequence by multiplying a sequence of integers by an approriate
constant. E.g. for your first example:

   for( i in where ){
  thisdata- subset(herde, herde$mlr= i)
  # do stuff with thisdata..
   }

'where' is a vector like 0.1*((-10):10)
[ instead of seq(-1, 1, by=0.1) ]

and then, when you want to change to seq(-0.8, 1, by=0.1),
use instead 0.1*(-80,10).



Hi,

this helps, thank you.
But if this code is in a function, and some user supplies a vector, I 
will still have to round it in the function, I guess.


It's weird how 0.1 is different from round(0.1, digits=1) , but I don't 
want to read that 90 page long floating point paper which was referenced 
somewhere :)


Thanks,
 Alex

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] VECM with UNRESTRICTED TREND

2011-03-31 Thread Pfaff, Bernhard Dr.


 

Hello Bernhard, 


thank You so much one again! Now I (more or less) understand the idea, 
but still have problem with its practical application.


I do (somewhat following example 8.1 in your textbook):


library(urca)
data(my.data)
names(my.data)
attach(my.data)
dat1 - my.data[, c(dY, X, dM)]
dat2 - cbind(time)
 
What is 'time'? Just employ matrix(seq(1:nrow(dat1)), ncol = 1) for 
creating the trend variable.
 
Best,
Bernhard
 
 
 args('ca.jo')
yxm.vecm - ca.jo(dat1, type = trace, ecdet = trend, K = 2, spec = 
longrun, dumvar=dat2)

The above code produces following output:

Error in r[i1, , drop = FALSE] - r[-nrow(r):-(nrow(r) - lag + 1L), , 
drop = FALSE] : 
  non-numeric argument to binary operator

What does that mean? Should I use cbind command to dat1 as well? And 
doesn't it transform the series into series of integer numbers?

Thank you once again (especially for your patience).

Best,
Greg



2011/3/31 Pfaff, Bernhard Dr. bernhard_pf...@fra.invesco.com


Hello Greg,
 
you include your trend as a (Nx1) matrix and use this for 
'dumvar'. The matrix 'dumvar' is just added to the VECM as deterministic 
regressors and while you are referring to case 5, this is basically what you 
are after, if I am not mistaken. But we aware that this implies a quadratic 
trend for the levels.
 
Best,
Bernhard




Von: Grzegorz Konat [mailto:grzegorz.ko...@ibrkk.pl] 
Gesendet: Mittwoch, 30. März 2011 20:50
An: Pfaff, Bernhard Dr.; r-help@r-project.org
Betreff: Re: [R] VECM with UNRESTRICTED TREND


Hello Bernhard, 

Thank You very much. Unfortunately I'm still not really 
sure how should I use dummy vars in this context...
If I have a system of three variables (x, y, z), lag 
order = 2 and 1 cointegrating relation, what should I do? I mean, what kind of 
'pattern' should be used to create those dummy variables, what should they 
represent and how many of them do I need?

Many thanks in advance!

Best,
Greg


2011/3/30 Pfaff, Bernhard Dr. 
bernhard_pf...@fra.invesco.com


Hello Greg,

you can exploit the argument 'dumvar' for this. 
See ?ca.jo

Best,
Bernhard

 -Ursprüngliche Nachricht-
 Von: r-help-boun...@r-project.org
 [mailto:r-help-boun...@r-project.org] Im 
Auftrag von Grzegorz Konat
 Gesendet: Mittwoch, 30. März 2011 16:46
 An: r-help@r-project.org
 Betreff: [R] VECM with UNRESTRICTED TREND


 Dear All,

 My question is:

 how can I estimate VECM system with 
unrestricted trend (aka
 case 5) option as a deterministic term?

 As far as I know, ca.jo in urca package 
allows for restricted trend
 only [vecm
 - ca.jo(data, type = trace/eigen, ecdet 
= trend, K =
 n, spec = transitory/longrun)].
 Obviously, I don't have to do this in urca, 
so if another
 package gives the possibility, please let me 
know too!

 Thanks in advance!

 Greg


   [[alternative HTML version deleted]]

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help

Re: [R] VECM with UNRESTRICTED TREND

2011-03-31 Thread Grzegorz Konat

'time' was a trend variable from my.data set. Equivalent to the output of
the command 'matrix' you just gave me.

So now I did:

library(urca)
data(my.data)
names(my.data)
attach(my.data)
dat1 - my.data[, c(dY, X, dM)]
mat1 - matrix(seq(1:nrow(dat1)), ncol = 1)
args('ca.jo')
yxm.vecm - ca.jo(dat1, type = trace, ecdet = const, K = 2, spec =
longrun, dumvar=mat1)

and the output is:

Error in r[i1, , drop = FALSE] - r[-nrow(r):-(nrow(r) - lag + 1L), , drop =
FALSE] :
  non-numeric argument to binary operator
In addition: Warning message:
In ca.jo(dat1, type = trace, ecdet = const, K = 2, spec = longrun,  :
No column names in 'dumvar', using prefix 'exo' instead.

What do I do wrong?

Best,
Greg


2011/3/31 Pfaff, Bernhard Dr. bernhard_pf...@fra.invesco.com




 Hello Bernhard,

 thank You so much one again! Now I (more or less) understand the idea, but
 still have problem with its practical application.

 I do (somewhat following example 8.1 in your textbook):

  library(urca)
 data(my.data)
 names(my.data)
 attach(my.data)
 dat1 - my.data[, c(dY, X, dM)]
 dat2 - cbind(time)

 What is 'time'? Just employ matrix(seq(1:nrow(dat1)), ncol = 1) for
 creating the trend variable.

 Best,
 Bernhard


  args('ca.jo')
 yxm.vecm - ca.jo(dat1, type = trace, ecdet = trend, K = 2, spec =
 longrun, dumvar=dat2)

 The above code produces following output:

  Error in r[i1, , drop = FALSE] - r[-nrow(r):-(nrow(r) - lag + 1L), , drop
 = FALSE] :
   non-numeric argument to binary operator

 What does that mean? Should I use cbind command to dat1 as well? And
 doesn't it transform the series into series of integer numbers?

 Thank you once again (especially for your patience).

 Best,
 Greg



 2011/3/31 Pfaff, Bernhard Dr. bernhard_pf...@fra.invesco.com

  Hello Greg,

 you include your trend as a (Nx1) matrix and use this for 'dumvar'. The
 matrix 'dumvar' is just added to the VECM as deterministic regressors and
 while you are referring to case 5, this is basically what you are after, if
 I am not mistaken. But we aware that this implies a quadratic trend for the
 levels.

 Best,
 Bernhard

  --
 *Von:* Grzegorz Konat [mailto:grzegorz.ko...@ibrkk.pl]
 *Gesendet:* Mittwoch, 30. März 2011 20:50
 *An:* Pfaff, Bernhard Dr.; r-help@r-project.org
 *Betreff:* Re: [R] VECM with UNRESTRICTED TREND

   Hello Bernhard,

 Thank You very much. Unfortunately I'm still not really sure how should I
 use dummy vars in this context...
 If I have a system of three variables (x, y, z), lag order = 2 and 1
 cointegrating relation, what should I do? I mean, what kind of 'pattern'
 should be used to create those dummy variables, what should they represent
 and how many of them do I need?

 Many thanks in advance!

 Best,
 Greg

 2011/3/30 Pfaff, Bernhard Dr. bernhard_pf...@fra.invesco.com

 Hello Greg,

 you can exploit the argument 'dumvar' for this. See ?ca.jo

 Best,
 Bernhard

  -Ursprüngliche Nachricht-
  Von: r-help-boun...@r-project.org
  [mailto:r-help-boun...@r-project.org] Im Auftrag von Grzegorz Konat
  Gesendet: Mittwoch, 30. März 2011 16:46
  An: r-help@r-project.org
  Betreff: [R] VECM with UNRESTRICTED TREND
  
  Dear All,
 
  My question is:
 
  how can I estimate VECM system with unrestricted trend (aka
  case 5) option as a deterministic term?
 
  As far as I know, ca.jo in urca package allows for restricted trend
  only [vecm
  - ca.jo(data, type = trace/eigen, ecdet = trend, K =
  n, spec = transitory/longrun)].
  Obviously, I don't have to do this in urca, so if another
  package gives the possibility, please let me know too!
 
  Thanks in advance!
 
  Greg
 
[[alternative HTML version deleted]]
 
  __
  R-help@r-project.org mailing list
  https://stat.ethz.ch/mailman/listinfo/r-help
  PLEASE do read the posting guide
  http://www.R-project.org/posting-guide.html
  and provide commented, minimal, self-contained, reproducible code.
 
 *
 Confidentiality Note: The information contained in this message,
 and any attachments, may contain confidential and/or privileged
 material. It is intended solely for the person(s) or entity to
 which it is addressed. Any review, retransmission, dissemination,
 or taking of any action in reliance upon this information by
 persons or entities other than the intended recipient(s) is
 prohibited. If you received this in error, please contact the
 sender and delete the material from any computer.
 *





[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] That dreaded floating point trap

2011-03-31 Thread Sarah Goslee

On Thu, Mar 31, 2011 at 8:14 AM, Alexander Engelhardt
 this helps, thank you.
 But if this code is in a function, and some user supplies a vector, I will
 still have to round it in the function, I guess.

 It's weird how 0.1 is different from round(0.1, digits=1) , but I don't want
 to read that 90 page long floating point paper which was referenced
 somewhere :)


Or you could try the much shorter R FAQ 7.31. Turns out it
isn't weird at all, if you are a computer.

Sarah

-- 
Sarah Goslee
http://www.functionaldiversity.org

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] fonts in mosaic

2011-03-31 Thread Michael Friendly


If you are doing multiple plots like this, there is no way I know of
to specify the fonts for labeling *once* for all such plots.  However,
you can do something like this to save typing and keep things consistent:

my.largs - list(
gp_labels = gpar(fontsize = 12, fontfamily = calibri),
gp_varnames = gpar(fontsize = 16, fontfamily = calibri))

mosaic(UCBAdmissions, labeling_args=my.largs)
mosaic(UCBAdmissions, labeling_args=my.largs, shade=TRUE)

etc.


On 3/30/2011 2:42 PM, Henrique Dallazuanna wrote:

Try this:

windowsFonts(calibri = windowsFont(Calibri))

mosaic(UCBAdmissions, labeling_args = list(
   gp_labels = gpar(fontsize = 12, fontfamily = calibri),
   gp_varnames = gpar(fontsize = 16, fontfamily = calibri)
))

On Wed, Mar 30, 2011 at 3:25 PM, Erich Neuwirth
erich.neuwi...@univie.ac.at  wrote:

Achim
I simply want to replace the font R uses on mosaic (whatever it is)
by a font of my choice (say Calibri or Arial)
because I need to embed the R charts in a PowerPoint
presentation and want the fonts to match.
And I want the most simple way of accomplishing this.
I worked my way through the strucplot vignette,
but I could not extract enough information there.
Is there some information about the proper font names
to use in R?



Personally, I simply change the size of the device I'm plotting on. When
I plot on a large device, the fonts will be relatively smaller, and vice

versa. This is what I do when including graphics in PDF files (papers,
slides, reports, etc.).

For fine control, you can set the arguments of the labeling function
employed. ?strucplot shows that the default is ?labeling_border which
has several arguments. For example you can set the graphical parameters
of the labels (gp_labels) or the graphical parameters of the variable
names (gp_varnames). Both arguments take ?gpar lists (grid graphical
parameters). For example you may do



__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.








--
Michael Friendly Email: friendly AT yorku DOT ca
Professor, Psychology Dept.
York University  Voice: 416 736-5115 x66249 Fax: 416 736-5814
4700 Keele StreetWeb:   http://www.datavis.ca
Toronto, ONT  M3J 1P3 CANADA

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] VECM with UNRESTRICTED TREND

2011-03-31 Thread Pfaff, Bernhard Dr.

Well, without further information, I do not know, but try the following
 
library(urca)
example(ca.jo)
trend - matrix(1:nrow(sjf), ncol = 1)
colnames(trend) - trd
ca.jo(sjf, type = trace, ecdet = const, K = 2, spec = longrun, dumvar = 
trend)
 
Best,
Bernhard
 
 




Von: Grzegorz Konat [mailto:grzegorz.ko...@ibrkk.pl] 
Gesendet: Donnerstag, 31. März 2011 14:40
An: Pfaff, Bernhard Dr.; r-help@r-project.org
Betreff: Re: [R] VECM with UNRESTRICTED TREND


'time' was a trend variable from my.data set. Equivalent to the output 
of the command 'matrix' you just gave me. 


So now I did:


library(urca)
data(my.data)
names(my.data)
attach(my.data)
dat1 - my.data[, c(dY, X, dM)]
mat1 - matrix(seq(1:nrow(dat1)), ncol = 1)
args('ca.jo')
yxm.vecm - ca.jo(dat1, type = trace, ecdet = const, K = 2, spec = 
longrun, dumvar=mat1)

and the output is:

Error in r[i1, , drop = FALSE] - r[-nrow(r):-(nrow(r) - lag + 1L), , 
drop = FALSE] : 
  non-numeric argument to binary operator
In addition: Warning message:
In ca.jo(dat1, type = trace, ecdet = const, K = 2, spec = 
longrun,  :
No column names in 'dumvar', using prefix 'exo' instead.

What do I do wrong?

Best,
Greg


2011/3/31 Pfaff, Bernhard Dr. bernhard_pf...@fra.invesco.com



 

Hello Bernhard, 


thank You so much one again! Now I (more or less) 
understand the idea, but still have problem with its practical application.


I do (somewhat following example 8.1 in your textbook):


library(urca)
data(my.data)
names(my.data)
attach(my.data)
dat1 - my.data[, c(dY, X, dM)]
dat2 - cbind(time)
 
What is 'time'? Just employ matrix(seq(1:nrow(dat1)), 
ncol = 1) for creating the trend variable.
 
Best,
Bernhard
 
 
 args('ca.jo')
yxm.vecm - ca.jo(dat1, type = trace, ecdet = 
trend, K = 2, spec = longrun, dumvar=dat2)

The above code produces following output:

Error in r[i1, , drop = FALSE] - r[-nrow(r):-(nrow(r) - 
lag + 1L), , drop = FALSE] : 
  non-numeric argument to binary operator

What does that mean? Should I use cbind command to dat1 
as well? And doesn't it transform the series into series of integer numbers?

Thank you once again (especially for your patience).

Best,
Greg



2011/3/31 Pfaff, Bernhard Dr. 
bernhard_pf...@fra.invesco.com


Hello Greg,
 
you include your trend as a (Nx1) matrix and 
use this for 'dumvar'. The matrix 'dumvar' is just added to the VECM as 
deterministic regressors and while you are referring to case 5, this is 
basically what you are after, if I am not mistaken. But we aware that this 
implies a quadratic trend for the levels.
 
Best,
Bernhard




Von: Grzegorz Konat 
[mailto:grzegorz.ko...@ibrkk.pl] 
Gesendet: Mittwoch, 30. März 2011 20:50
An: Pfaff, Bernhard Dr.; 
r-help@r-project.org
Betreff: Re: [R] VECM with UNRESTRICTED 
TREND


Hello Bernhard, 

Thank You very much. Unfortunately I'm 
still not really sure how should I use dummy vars in this context...
If I have a system of three variables 
(x, y, z), lag order = 2 and 1 cointegrating relation, what should I do? I 
mean, what kind of 'pattern' should be used to create those dummy variables, 
what should they represent and how many of them do I need?

Many thanks in advance!

Best,
Greg

Re: [R] That dreaded floating point trap


Am 31.03.2011 14:41, schrieb Sarah Goslee:

On Thu, Mar 31, 2011 at 8:14 AM, Alexander Engelhardt

this helps, thank you.
But if this code is in a function, and some user supplies a vector, I will
still have to round it in the function, I guess.

It's weird how 0.1 is different from round(0.1, digits=1) , but I don't want
to read that 90 page long floating point paper which was referenced
somewhere :)



Or you could try the much shorter R FAQ 7.31. Turns out it
isn't weird at all, if you are a computer.


You're a computer! :)

But yes.. the FAQ entry was where I found all.equal and the referenced 
90-page-paper. But I didn't find out how to do a subset with 'somevector 
 0.4'.


I think I'll have to round the numbers every time now.. or use some 
other not-so-pretty workaround like 'somevector  0.4 - 0.05' for 
0.1-binned data.


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Lists of tables and conditional statements

2011-03-31 Thread Herbert, Alan G

Thanks Henrique - that worked  like a charm - I had tried lots of other 
combinations before seeing your reply - wished I had asked sooner!

Alan 

-Original Message-
From: David Winsemius [mailto:dwinsem...@comcast.net] 
Sent: Wednesday, March 30, 2011 10:09 PM
To: Herbert, Alan G
Cc: r-help@r-project.org Help
Subject: Re: [R] Lists of tables and conditional statements

On Mar 30, 2011, at 7:27 PM, Henrique Dallazuanna wrote:

 Try this:

 lapply(l, function(x)x[x[,'Sum'] == 3,])

If this is the right answer, you should send a solved message. The dput 
extract was incomplete.

--
David.

 On Wed, Mar 30, 2011 at 7:38 PM, Herbert, Alan G aherb...@bu.edu
 wrote:
 Hi R-users,

 I have a list containing numeric tables of differing row length. I 
 want to make a new list that contains only rows from tables with a 
 Sum greater than 3, plus the names of each table. I was wondering 
 whether there is an elegant way to do this using apply of related 
 functions as this list has many thousands of such tables.

 Here is an example of the list

 $AACS

POOL

 INFO pool1 pool2 pool6 pool7 pool8 pool.all Sum

  12:125561133:novel 0 0 0 0 10   1

  12:125570904:novel 0 0 0 0 10   1

  12:125571014:novel 0 1 0 0 00   1

  12:125571038:novel 0 0 0 1 00   1

  12:125575996:novel 0 0 0 1 00   1

  12:125591844:rs2297478 1 0 1 0 01   3

  12:125599114:novel 0 0 0 1 00   1

  12:125612668:novel 0 0 0 0 10   1

  12:125612839:rs900411  1 0 1 0 11   4

  12:125626650:novel 0 0 0 0 10   1

  12:125626737:novel 0 0 0 1 00   1

 $AADAC

POOL

 INFO pool1 pool2 pool5 pool6 pool7 pool8  
 pool.all Sum

  3:151542411:novel  0 0 0 0 1 0 
 1   2

  3:151542412:novel  0 0 0 0 1 0 
 1   2

  3:151542643:novel  0 1 0 0 0 0 
 0   1

  3:151545322:rs2410836  0 1 0 0 0 0 
 1   2

  3:151545323:rs62272918 0 1 0 0 0 0 
 1   2

  3:151545509:novel  0 0 1 0 0 0 
 1   2

  3:151545601:rs1803155  1 1 1 1 1 1 
 1   7

  3:151545721:novel  0 0 1 0 0 0 
 0   1

  3:151545802:novel  0 0 0 0 1 0 
 0   1

  3:151545824:novel  0 1 0 0 0 0 
 0   1

 Thanks for your help

[[alternative HTML version deleted]]

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide 
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.

 --
 Henrique Dallazuanna
 Curitiba-Paraná-Brasil
 25° 25' 40 S 49° 16' 22 O

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide 
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.

David Winsemius, MD
Heritage Laboratories
West Hartford, CT

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] That dreaded floating point trap

2011-03-31 Thread Kenn Konstabel

n Thu, Mar 31, 2011 at 3:56 PM, Alexander Engelhardt
a...@chaotic-neutral.de wrote:
 Am 31.03.2011 14:41, schrieb Sarah Goslee:

 On Thu, Mar 31, 2011 at 8:14 AM, Alexander Engelhardt

 this helps, thank you.
 But if this code is in a function, and some user supplies a vector, I
 will
 still have to round it in the function, I guess.

 It's weird how 0.1 is different from round(0.1, digits=1) , but I don't
 want
 to read that 90 page long floating point paper which was referenced
 somewhere :)


 Or you could try the much shorter R FAQ 7.31. Turns out it
 isn't weird at all, if you are a computer.

 You're a computer! :)

 But yes.. the FAQ entry was where I found all.equal and the referenced
 90-page-paper. But I didn't find out how to do a subset with 'somevector 
 0.4'.

 I think I'll have to round the numbers every time now.. or use some other
 not-so-pretty workaround like 'somevector  0.4 - 0.05' for 0.1-binned data.

You could define your own %% and %% to do rounding and comparison in
one step.

Kenn

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] VECM with UNRESTRICTED TREND

2011-03-31 Thread Grzegorz Konat

The code you gave me works fine with Finland, but the same for my data -
does not!
I do:

library(urca)
data(my.data)
dat1 - my.data[, c(dY, X, dM)]
trend - matrix(1:nrow(dat1), ncol = 1)
colnames(trend) - trd
yxm.vecm - ca.jo(dat1, type = trace, ecdet = const, K = 2, spec =
longrun, dumvar = trend)

and the result is again:

Error in r[i1, , drop = FALSE] - r[-nrow(r):-(nrow(r) - lag + 1L), , drop =
FALSE] :
  non-numeric argument to binary operator

I attach my dataset in xls format. If you have 5 minutes and wish to check
it out, I'd be extremely grateful!

Best,
Greg



2011/3/31 Pfaff, Bernhard Dr. bernhard_pf...@fra.invesco.com

  Well, without further information, I do not know, but try the following

 library(urca)
 example(ca.jo)
 trend - matrix(1:nrow(sjf), ncol = 1)
 colnames(trend) - trd
 ca.jo(sjf, type = trace, ecdet = const, K = 2, spec = longrun,
 dumvar = trend)

 Best,
 Bernhard



  --
 *Von:* Grzegorz Konat [mailto:grzegorz.ko...@ibrkk.pl]
 *Gesendet:* Donnerstag, 31. März 2011 14:40

 *An:* Pfaff, Bernhard Dr.; r-help@r-project.org
 *Betreff:* Re: [R] VECM with UNRESTRICTED TREND

 'time' was a trend variable from my.data set. Equivalent to the output of
 the command 'matrix' you just gave me.

 So now I did:

  library(urca)
 data(my.data)
 names(my.data)
 attach(my.data)
 dat1 - my.data[, c(dY, X, dM)]
 mat1 - matrix(seq(1:nrow(dat1)), ncol = 1)
 args('ca.jo')
 yxm.vecm - ca.jo(dat1, type = trace, ecdet = const, K = 2, spec =
 longrun, dumvar=mat1)

 and the output is:

  Error in r[i1, , drop = FALSE] - r[-nrow(r):-(nrow(r) - lag + 1L), , drop
 = FALSE] :
   non-numeric argument to binary operator
 In addition: Warning message:
 In ca.jo(dat1, type = trace, ecdet = const, K = 2, spec = longrun,
  :
 No column names in 'dumvar', using prefix 'exo' instead.

 What do I do wrong?

 Best,
 Greg


 2011/3/31 Pfaff, Bernhard Dr. bernhard_pf...@fra.invesco.com




  Hello Bernhard,

 thank You so much one again! Now I (more or less) understand the idea, but
 still have problem with its practical application.

 I do (somewhat following example 8.1 in your textbook):

  library(urca)
 data(my.data)
 names(my.data)
 attach(my.data)
 dat1 - my.data[, c(dY, X, dM)]
 dat2 - cbind(time)

 What is 'time'? Just employ matrix(seq(1:nrow(dat1)), ncol = 1) for
 creating the trend variable.

 Best,
 Bernhard


  args('ca.jo')
 yxm.vecm - ca.jo(dat1, type = trace, ecdet = trend, K = 2, spec =
 longrun, dumvar=dat2)

 The above code produces following output:

  Error in r[i1, , drop = FALSE] - r[-nrow(r):-(nrow(r) - lag + 1L), ,
 drop = FALSE] :
   non-numeric argument to binary operator

 What does that mean? Should I use cbind command to dat1 as well? And
 doesn't it transform the series into series of integer numbers?

 Thank you once again (especially for your patience).

 Best,
 Greg



 2011/3/31 Pfaff, Bernhard Dr. bernhard_pf...@fra.invesco.com

  Hello Greg,

 you include your trend as a (Nx1) matrix and use this for 'dumvar'. The
 matrix 'dumvar' is just added to the VECM as deterministic regressors and
 while you are referring to case 5, this is basically what you are after, if
 I am not mistaken. But we aware that this implies a quadratic trend for the
 levels.

 Best,
 Bernhard

  --
 *Von:* Grzegorz Konat [mailto:grzegorz.ko...@ibrkk.pl]
 *Gesendet:* Mittwoch, 30. März 2011 20:50
 *An:* Pfaff, Bernhard Dr.; r-help@r-project.org
 *Betreff:* Re: [R] VECM with UNRESTRICTED TREND

   Hello Bernhard,

 Thank You very much. Unfortunately I'm still not really sure how should I
 use dummy vars in this context...
 If I have a system of three variables (x, y, z), lag order = 2 and 1
 cointegrating relation, what should I do? I mean, what kind of 'pattern'
 should be used to create those dummy variables, what should they represent
 and how many of them do I need?

 Many thanks in advance!

 Best,
 Greg

 2011/3/30 Pfaff, Bernhard Dr. bernhard_pf...@fra.invesco.com

 Hello Greg,

 you can exploit the argument 'dumvar' for this. See ?ca.jo

 Best,
 Bernhard

  -Ursprüngliche Nachricht-
  Von: r-help-boun...@r-project.org
  [mailto:r-help-boun...@r-project.org] Im Auftrag von Grzegorz Konat
  Gesendet: Mittwoch, 30. März 2011 16:46
  An: r-help@r-project.org
  Betreff: [R] VECM with UNRESTRICTED TREND
  
  Dear All,
 
  My question is:
 
  how can I estimate VECM system with unrestricted trend (aka
  case 5) option as a deterministic term?
 
  As far as I know, ca.jo in urca package allows for restricted trend
  only [vecm
  - ca.jo(data, type = trace/eigen, ecdet = trend, K =
  n, spec = transitory/longrun)].
  Obviously, I don't have to do this in urca, so if another
  package gives the possibility, please let me know too!
 
  Thanks in advance!
 
  Greg
 
[[alternative HTML version deleted]]
 
  __
  R-help@r-project.org mailing list
  https://stat.ethz.ch/mailman/listinfo/r-help

Re: [R] That dreaded floating point trap

2011-03-31 Thread ONKELINX, Thierry

Dear Alexander,

Instead of testing 'somevector  0.4', test 'abs(somevector - 0.4)  
some.small.number'

Best regards,

Thierry


ir. Thierry Onkelinx
Instituut voor natuur- en bosonderzoek
team Biometrie  Kwaliteitszorg
Gaverstraat 4
9500 Geraardsbergen
Belgium

Research Institute for Nature and Forest
team Biometrics  Quality Assurance
Gaverstraat 4
9500 Geraardsbergen
Belgium

tel. + 32 54/436 185
thierry.onkel...@inbo.be
www.inbo.be

To call in the statistician after the experiment is done may be no more than 
asking him to perform a post-mortem examination: he may be able to say what the 
experiment died of.
~ Sir Ronald Aylmer Fisher

The plural of anecdote is not data.
~ Roger Brinner

The combination of some data and an aching desire for an answer does not ensure 
that a reasonable answer can be extracted from a given body of data.
~ John Tukey
  

 -Oorspronkelijk bericht-
 Van: r-help-boun...@r-project.org 
 [mailto:r-help-boun...@r-project.org] Namens Alexander Engelhardt
 Verzonden: donderdag 31 maart 2011 14:57
 Aan: r-help@r-project.org
 Onderwerp: Re: [R] That dreaded floating point trap
 
 Am 31.03.2011 14:41, schrieb Sarah Goslee:
  On Thu, Mar 31, 2011 at 8:14 AM, Alexander Engelhardt
  this helps, thank you.
  But if this code is in a function, and some user supplies 
 a vector, I 
  will still have to round it in the function, I guess.
 
  It's weird how 0.1 is different from round(0.1, digits=1) , but I 
  don't want to read that 90 page long floating point paper 
 which was 
  referenced somewhere :)
 
 
  Or you could try the much shorter R FAQ 7.31. Turns out it 
 isn't weird 
  at all, if you are a computer.
 
 You're a computer! :)
 
 But yes.. the FAQ entry was where I found all.equal and the 
 referenced 90-page-paper. But I didn't find out how to do a 
 subset with 'somevector   0.4'.
 
 I think I'll have to round the numbers every time now.. or 
 use some other not-so-pretty workaround like 'somevector  
 0.4 - 0.05' for 0.1-binned data.
 
 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide 
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.
 
__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] how to do t-test in r for difference of mean

2011-03-31 Thread arkajyoti jana

I am trying to do t-test to test whether the mean of one one column of the
data frame is greater then another. please help me out.

-- 
Arkajyoti Jana
M. Phil/ 2nd semester
Centre for Economic Studies and planning
School of Social Sciences
Jawaharlal Nehru University
New Delhi-67

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] summing values by week - based on daily dates - but with somedates missing

2011-03-31 Thread Dimitri Liakhovitski

Thank you so much, everyone, for your help.
Extremely valuable suggestions and extremely valuable learnings!
Dimitri

On Thu, Mar 31, 2011 at 5:03 AM, Martyn Byng martyn.b...@nag.co.uk wrote:
 Hi,

 Yep, that was what it was doing. For a sum across week, try something like

 get.week.flag - function(dd) {
  ## get weekday from the date in dd and code it as Monday = 1, Tuesday = 2 etc
  idd = 
 factor(weekdays(dd),levels=c(Monday,Tuesday,Wednesday,Thursday,Friday,Saturday,Sunday))

  ## convert to numeric
  ndd = as.numeric(idd)

  ## flag entries where weekday code gets less (this will flag changes in week)
  wflag = c(FALSE,(ndd[-length(idd)]  ndd[-1]))

  ## cumulative sum to get the week flag
  cumsum(wflag) + 1
 }

 to get a week flag (this is assuming that your data is sorted by date, if not 
 you'll have to sort it first). If you want the week to start on a different 
 day, just change the ordering of the weekdays in the levels statement.

 data.frame(date=myframe$date,day=weekdays(myframe$date),week=get.week.flag(myframe$date))

 seems to indicate that the function is doing what it should, so you can then 
 amend the previous code to use get.week.flag instead of weekdays, as in

 sum.by.week - function(ff) {
  by.day - split(ff$value,get.week.flag(ff$dates))
  lapply(by.day,sum)
 }

 by.grp - split(myframe,myframe$group)
 lapply(by.grp,sum.by.week)

 Martyn

 -Original Message-
 From: Dimitri Liakhovitski [mailto:dimitri.liakhovit...@gmail.com]
 Sent: 30 March 2011 18:03
 To: Martyn Byng
 Cc: r-help
 Subject: Re: [R] summing values by week - based on daily dates - but with 
 somedates missing

 Thank you, Martyn.
 But it looks like this way we are getting sums by day - i.e., across
 all Mondays, all Tuesdays, etc.
 Maybe I did not explain well, sorry! The desired output would contain
 sums for each WHOLE week - across all days that comprise that week -
 Monday through Sunday.
 Makes sense?
 Dimitri

 On Wed, Mar 30, 2011 at 12:53 PM, Martyn Byng martyn.b...@nag.co.uk wrote:
 Hi,

 How about something like:

 sum.by.day - function(ff) {
  by.day - split(ff$value,weekdays(ff$dates))
  lapply(by.day,sum)
 }

 by.grp - split(myframe,myframe$group)

 lapply(by.grp,sum.by.day)


 Martyn

 -Original Message-
 From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org]
 On Behalf Of Dimitri Liakhovitski
 Sent: 30 March 2011 15:23
 To: r-help
 Subject: [R] summing values by week - based on daily dates - but with
 somedates missing

 Dear everybody,

 I have the following challenge. I have a data set with 2 subgroups,
 dates (days), and corresponding values (see example code below).
 Within each subgroup: I need to aggregate (sum) the values by week -
 for weeks that start on a Monday (for example, 2008-12-29 was a
 Monday).
 I find it difficult because I have missing dates in my data - so that
 sometimes I don't even have the date for some Mondays. So, I can't
 write a proper loop.
 I want my output to look something like this:
 group   dates   value
 group.1 2008-12-29  3.0937
 group.1 2009-01-05  3.8833
 group.1 2009-01-12  1.362
 ...
 group.2 2008-12-29  2.250
 group.2 2009-01-05  1.4057
 group.2 2009-01-12  3.4411
 ...

 Thanks a lot for your suggestions! The code is below:
 Dimitri

 ### Creating example data set:
 mydates-rep(seq(as.Date(2008-12-29), length = 43, by = day),2)
 myfactor-c(rep(group.1,43),rep(group.2,43))
 set.seed(123)
 myvalues-runif(86,0,1)
 myframe-data.frame(dates=mydates,group=myfactor,value=myvalues)
 (myframe)
 dim(myframe)

 ## Removing same rows (dates) unsystematically:
 set.seed(123)
 removed.group1-sample(1:43,size=11,replace=F)
 set.seed(456)
 removed.group2-sample(44:86,size=11,replace=F)
 to.remove-c(removed.group1,removed.group2);length(to.remove)
 to.remove-to.remove[order(to.remove)]
 myframe-myframe[-to.remove,]
 (myframe)



 --
 Dimitri Liakhovitski
 Ninah Consulting
 www.ninah.com

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.

 
 This e-mail has been scanned for all viruses by Star.
 

 
 The Numerical Algorithms Group Ltd is a company registered in England
 and Wales with company number 1249803. The registered office is:
 Wilkinson House, Jordan Hill Road, Oxford OX2 8DR, United Kingdom.

 This e-mail has been scanned for all viruses by Star. The service is
 powered by MessageLabs.
 




 --
 Dimitri Liakhovitski
 Ninah Consulting
 www.ninah.com

 
 This

Re: [R] how about a p- operator?

2011-03-31 Thread Duncan Murdoch


On 11-03-30 7:00 PM, Carl Witthoft wrote:

I was cursing Matlab again today (what else is new) because the default
action for every Matlab command is to spew the result to the console,
and one must remember to put that darn ;  at the end of every line.

So I just wondered:  was there ever a discussion as to providing some
modified version of the - and - operators in R to do the reverse?
   That is, since R does not print the values of a command to the
console,  what if there were  an operator such that


   newobjectp- somefunction()

would do the same as

print(newobject- somefunction())


Any thoughts?


Others have given alternatives.  Just some comments on this particular 
proposal:


We already have a problem that

x-3

is slightly ambiguous:  does it mean

x - 3   # yes

or

x  -3   # no

I think the compound operator p- would just repeat this problem.  What does

xp-3

mean?  If the amount of typing in print(x - 3) is a problem, that's a 
user interface issue, so it should be addressed at the user interface 
level, not by changing the language.  A front-end could make it easy to 
transform x - 3 into the longer expression (or an equivalent), or just 
make it easy to examine the .Last.value variable.


Duncan Murdoch

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] That dreaded floating point trap

2011-03-31 Thread Duncan Murdoch


On 11-03-31 7:24 AM, Alexander Engelhardt wrote:

Hi,
I had a piece of code which looped over a decimal vector like this:


for( i in where ){
thisdata- subset(herde, herde$mlr= i)
# do stuff with thisdata..
}

'where' is a vector like seq(-1, 1, by=0.1)


The solution to this problem is to take steps by representable numbers, 
not by numbers like 0.1 that can't be represented exactly.  For example, 
seq(-1, 1, by=0.25) has exact entries, because fractions with small 
powers of 2 in the denominator are all exactly representable.  (Small 
depends on the numerator, but for fractions between 0 and 1 it's about 
52, so not really so small.)


Duncan Murdoch




My problem was: 'nrow(thisdata)' in loop repetition 0.4 was different if
'where' was seq(-1, 1, by=0.1) than when 'where' was seq(-0.8, 1, by=0.1)
It went away after I changed the first line to:

thisdata- subset(herde, herde$mlr= round(i, digits=1))

This is that floating point trap the R inferno pdf talked about,
right? That file talked about the problem, but didn't offer a solution.

Similar things happened when I created a table() from a vector with
values in seq(-1, 1, by=0.1)

Do I really have to round every float at every occurence from now on, or
is there another solution? I only found all.equal() and identical(), but
I want to subset for observations with a value /greater/ than something.

Thanks in advance,
   Alex

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Simple lattice question

2011-03-31 Thread Peter Ehlers


On 2011-03-31 03:39, Rubén Roa wrote:


DeaR ComRades,

require(lattice)
data- data.frame(SP=sort(rep(as.factor(c('A','B','C','D','E')),12)),
x=rpois(60,10),
y=rep(c(rep(0,4),rep(10,4),rep(20,4)),5),
z=rep(1:4,15))
xyplot(x~y|SP,data=data,groups=z,layout=c(2,3),pch=1:4,lty=1:4,col='black',type='b')

How do I put a legend for the grouping variable in the empty upper-right panel?


See the help page for xyplot for an example using the iris data.
You just need to add something like

 auto.key = list(x = .6, y = .7, corner = c(0, 0))

to your lattice call. See the 'key' entry on the ?xyplot page.

Peter Ehlers



TIA

Rubén



Dr. Rubén Roa-Ureta
AZTI - Tecnalia / Marine Research Unit
Txatxarramendi Ugartea z/g
48395 Sukarrieta (Bizkaia)
SPAIN

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] another statistical question

2011-03-31 Thread Anna Lee

Dear List!

I have a unverse (basic population) which is not normally distributed.
Now from this universe I take some subsets. Each subset is normally
distributed within itself. I now want to compare the subsets and see
if they differ significantly. So what is my assumption - normal
distributed data and therefore comparison of means, or non normal
distributed data and therefore comparison of medians?


Cheers, Anna
-- 



Der Inhalt dieser E-Mail ist vertraulich. Sollte Ihnen die E-Mail
irrtümlich zugesandt worden sein, bitte ich Sie, mich unverzüglich zu
benachrichtigen und die E-Mail zu löschen.

This e-mail is confidential. If you have received it in error, please
notify me immediately and delete it from your system.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Simple lattice question

2011-03-31 Thread Rubén Roa

Thanks Peters!

Just a few minor glitches now:

require(lattice)
data - data.frame(SP=sort(rep(as.factor(c('A','B','C','D','E')),12)),
   x=rpois(60,10),
   y=rep(c(rep(0,4),rep(10,4),rep(20,4)),5),
   z=rep(1:4,15))
xyplot(x~y|SP,data=data,groups=z,layout=c(2,3),pch=1:4,lty=1:4,col='black',type='b',
   key=list(x = .65, y = .75, corner = c(0, 0), points=TRUE,
  lines=TRUE, pch=1:4, lty=1:4, type='b',
  text=list(lab = as.character(unique(data$z)

I have an extra column of symbols on the legend,

and,

would like to add a title to the legend. Such as 'main' for plots.

Any suggestions?

Rubén


 

Dr. Rubén Roa-Ureta
AZTI - Tecnalia / Marine Research Unit
Txatxarramendi Ugartea z/g
48395 Sukarrieta (Bizkaia)
SPAIN



 -Mensaje original-
 De: Peter Ehlers [mailto:ehl...@ucalgary.ca] 
 Enviado el: jueves, 31 de marzo de 2011 15:41
 Para: Rubén Roa
 CC: r-help@r-project.org
 Asunto: Re: [R] Simple lattice question
 
 On 2011-03-31 03:39, Rubén Roa wrote:
 
  DeaR ComRades,
 
  require(lattice)
  data- 
 data.frame(SP=sort(rep(as.factor(c('A','B','C','D','E')),12)),
  x=rpois(60,10),
  y=rep(c(rep(0,4),rep(10,4),rep(20,4)),5),
  z=rep(1:4,15))
  
 xyplot(x~y|SP,data=data,groups=z,layout=c(2,3),pch=1:4,lty=1:4,col='bl
  ack',type='b')
 
  How do I put a legend for the grouping variable in the 
 empty upper-right panel?
 
 See the help page for xyplot for an example using the iris data.
 You just need to add something like
 
   auto.key = list(x = .6, y = .7, corner = c(0, 0))
 
 to your lattice call. See the 'key' entry on the ?xyplot page.
 
 Peter Ehlers
 
 
  TIA
 
  Rubén
 
  
 __
  __
 
  Dr. Rubén Roa-Ureta
  AZTI - Tecnalia / Marine Research Unit Txatxarramendi Ugartea z/g
  48395 Sukarrieta (Bizkaia)
  SPAIN
 
  __
  R-help@r-project.org mailing list
  https://stat.ethz.ch/mailman/listinfo/r-help
  PLEASE do read the posting guide 
  http://www.R-project.org/posting-guide.html
  and provide commented, minimal, self-contained, reproducible code.
 
 

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] another statistical question


Am 31.03.2011 15:46, schrieb Anna Lee:

Dear List!

I have a unverse (basic population) which is not normally distributed.
Now from this universe I take some subsets. Each subset is normally
distributed within itself. I now want to compare the subsets and see
if they differ significantly. So what is my assumption - normal
distributed data and therefore comparison of means, or non normal
distributed data and therefore comparison of medians?


If you want to do an ANOVA, the assumption is normality /within/ the 
groups. That is, Y given X=x is normal distributed. Also, you want the 
same variance within each group (group = your subset = factor value).


Short answer: Means.
(I am only 95% certain)

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] choosing best 'match' for given factor

2011-03-31 Thread Murali.Menon

Folks,

I have a 'matching' matrix between variables A, X, L, O:

 a - structure(c(1, 0.41, 0.58, 0.75, 0.41, 1, 0.6, 0.86, 0.58, 
0.6, 1, 0.83, 0.75, 0.86, 0.83, 1), .Dim = c(4L, 4L), .Dimnames = list(
c(A, X, L, O), c(A, X, L, O)))

 a
  A X L O
A  1.00  0.41  0.58  0.75
X  0.41  1.00  0.60  0.86
L  0.58  0.75  1.00  0.83
O  0.60  0.86  0.83  1.00

And I have a search vector of variables

 v - c(X, O)

I want to write a function bestMatch(searchvector, matchMat) such that for each 
variable in searchvector, I get the variable that it has the highest match to - 
but searching only among variables to the left of it in the 'matching' matrix, 
and not matching with any variable in searchvector itself.

So in the above example, although X has the highest match (0.86) with O, I 
can't choose O as it's to the right of X (and also because O is in the 
searchvector v already); I'll have to choose A.

For O, I will choose L, the variable it's best matched with - as it can't 
match X already in the search vector.

My function bestMatch(v, a) will then return c(A, L)

My matrix a is quite large, and I have a long list of search vectors v, so I 
need an efficient method.

I wrote this:

bestMatch - function(searchvector,  matchMat) {
sapply(searchvector, function(cc) {
 y - matchMat[!(rownames(matchMat) %in% 
searchvector)  (index(rownames(matchMat))  match(cc, rownames(matchMat))), 
cc, drop = FALSE];
 rownames(y)[which.max(y)]
})   
}

Any advice?

Thanks,

Murali

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] error in recode.defalt ....object '.data' not found

2011-03-31 Thread Simon Kiss

Dear colleagues, working with the data frame below, trying to reverse two 
variables I the error message below.
i searched through the help list but could not find any postings which could 
help me solve the situation. I tried attaching and detaching the data frame to 
no avail.
Yours, Simon Kiss

*DATA FRAME
'data.frame':   1569 obs. of  9 variables:
 $ equal : num  3 4 3 2 3 4 2 3 2 2 ...
 $ disc  : num  3 2 3 3 2 2 3 3 3 3 ...
 $ family: num  3 2 2 2 3 2 2 1 2 1 ...
 $ special   : num  3 3 4 4 3 3 4 4 3 4 ...
 $ immigrants: num  3 8 3 8 3 3 4 1 1 2 ...
 $ wedlock   : num  3 3 3 3 3 2 2 8 2 3 ...
 $ crime : num  3 2 2 1 2 3 1 8 2 1 ...
 $ breakdown : num  3 3 3 2 2 4 8 2 2 4 ...
 $ nonwhites : num  2 4 3 3 2 2 3 4 3 3 ...

*RECODE
social$nonwhites-recode(social$nonwhites, 1=4; 2=3; 3=2; 4=1; 8=NA; -9=NA)

*ERROR
Error in recode.default(social$nonwhites, 1=4; 2=3; 3=2; 4=1; 8=NA; -9=NA) : 
  object '.data' not found


*
Simon J. Kiss, PhD
Assistant Professor, Wilfrid Laurier University
73 George Street
Brantford, Ontario, Canada
N3T 2C9
Cell: +1 519 761 7606

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] error in recode.defalt ....object '.data' not found

2011-03-31 Thread John Fox

Dear Simon,

Your intention appears to be to call the recode() function in the car package. 
That function is not generic, and consequently has no default method, leading 
me to believe that you've called a recode() function in some other package.

I hope this helps,
 John


John Fox
Sen. William McMaster Prof. of Social Statistics
Department of Sociology
McMaster University
Hamilton, Ontario, Canada
http://socserv.mcmaster.ca/jfox/

On Thu, 31 Mar 2011 11:01:42 -0400
 Simon Kiss sjk...@gmail.com wrote:
 Dear colleagues, working with the data frame below, trying to reverse two 
 variables I the error message below.
 i searched through the help list but could not find any postings which could 
 help me solve the situation. I tried attaching and detaching the data frame 
 to no avail.
 Yours, Simon Kiss
 
 *DATA FRAME
 'data.frame': 1569 obs. of  9 variables:
  $ equal : num  3 4 3 2 3 4 2 3 2 2 ...
  $ disc  : num  3 2 3 3 2 2 3 3 3 3 ...
  $ family: num  3 2 2 2 3 2 2 1 2 1 ...
  $ special   : num  3 3 4 4 3 3 4 4 3 4 ...
  $ immigrants: num  3 8 3 8 3 3 4 1 1 2 ...
  $ wedlock   : num  3 3 3 3 3 2 2 8 2 3 ...
  $ crime : num  3 2 2 1 2 3 1 8 2 1 ...
  $ breakdown : num  3 3 3 2 2 4 8 2 2 4 ...
  $ nonwhites : num  2 4 3 3 2 2 3 4 3 3 ...
 
 *RECODE
 social$nonwhites-recode(social$nonwhites, 1=4; 2=3; 3=2; 4=1; 8=NA; -9=NA)
 
 *ERROR
 Error in recode.default(social$nonwhites, 1=4; 2=3; 3=2; 4=1; 8=NA; -9=NA) 
 : 
   object '.data' not found
 
 
 *
 Simon J. Kiss, PhD
 Assistant Professor, Wilfrid Laurier University
 73 George Street
 Brantford, Ontario, Canada
 N3T 2C9
 Cell: +1 519 761 7606
 
 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] AEM package

2011-03-31 Thread Fernanda Melo Carneiro

I'm trying to do AEM eigenfunctions, but I have a different number of points 
into each reservoir (e.g.: reservoir 1-4 points, reservoir 2-6 points, 
reservoir 3-2 points). Can someone help me?Besides this in the Legendre's 
book exemple in the part:length.edge-vector(length=nrow(edges.b))for (i in 
1:nrow(edges.b)){length.edge[i]-D1.mat[edges.b[i,1], edges.b[i,2]]}There is 
some problem with the lenght argument.Thanks 
Fernanda Melo Carneiro contato: (62) 3521-1480 e 8121-7374www.ecoevol.ufg.br
Laboratório de Ecologia Teórica e Síntese (UFG)  
[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] fonts in mosaic

2011-03-31 Thread Erich Neuwirth

The easiest way of changing the font used for labels by the windows
graphics device (opened by a call to windows()) seems to be the following:

Let us assume we want to use the font Consolas for all labels:

windowsFonts(myfont=Consolas)
par(family=myfont)

If one later on wants to change the default font for graphics,
one has to run both commands again. Just issueing a new windowsFonts
command will not change the font used by the graphics device.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] ANCOVA for linear regressions without intercept

If you haven't already received an answer, a careful reading of

?formula

will provide it.

-- Bert

On Wed, Mar 30, 2011 at 11:42 PM, Yusuke Fukuda yusuke.fuk...@nt.gov.auwrote:


 Hello R experts

 I have two linear regressions for sexes (Male, Female, Unknown). All have a
 good correlation between body length (response variable) and head length
 (explanatory variable). I know it is not recommended, but for a good
 practical reason (the purpose of study is to find a single conversion factor
 from head length to body length), the regressions need to go through the
 origin (0 intercept).

 Is it possible to do ANCOVA for these regressions without intercepts? When
 I do

 summary(lm(body length ~ sex*head length))

 this will include the intercepts as below


 Coefficients:
   Estimate Std. Error t value Pr(|t|)
 (Intercept)-6.496971.68497  -3.856 0.000118 ***
 sexMale-9.393401.97760  -4.750 2.14e-06 ***
 sexUnknown -1.337912.35453  -0.568 0.569927
 head_length 7.123070.05503 129.443   2e-16 ***
 sexMale:head_length 0.316310.06246   5.064 4.37e-07 ***
 sexUnknown:head_length  0.199370.07022   2.839 0.004556 **
 ---

 Is there any way I can remove the intercepts so that I can simply compare
 the slopes with no intercept taken into account?

 Thanks for help in advance.

 Yusuke Fukuda

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.htmlhttp://www.r-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.




-- 
Men by nature long to get on to the ultimate truths, and will often be
impatient with elementary studies or fight shy of them. If it were possible
to reach the ultimate truths without the elementary studies usually prefixed
to them, these would not be preparatory studies but superfluous diversions.

-- Maimonides (1135-1204)

Bert Gunter
Genentech Nonclinical Biostatistics

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] how to do t-test in r for difference of mean

2011-03-31 Thread Ista Zahn

Hi Arkajyoti,
Please learn to help yourself. Start with

help.search(t-test)

Also read one of the many introductory R tutorials on the web.

Best,
Ista

On Thu, Mar 31, 2011 at 9:07 AM, arkajyoti jana arkajyo...@gmail.com wrote:
 I am trying to do t-test to test whether the mean of one one column of the
 data frame is greater then another. please help me out.

 --
 Arkajyoti Jana
 M. Phil/ 2nd semester
 Centre for Economic Studies and planning
 School of Social Sciences
 Jawaharlal Nehru University
 New Delhi-67

        [[alternative HTML version deleted]]

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.




-- 
Ista Zahn
Graduate student
University of Rochester
Department of Clinical and Social Psychology
http://yourpsyche.org

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] choosing best 'match' for given factor

2011-03-31 Thread Nick Sabbe

Hi Murali.
I haven't compared, but this is what I would do:

bestMatch-function(searchVector, matchMat)
{
searchRow-unique(sort(match(searchVector, colnames(matchMat #if
you're sure, you could drop unique
cat(Original row indices:)
print(searchRow)
matchMat-matchMat[, -searchRow, drop=FALSE] #avoid duplicates
altogether
cat(Corrected Matrix:\n)
print(matchMat)
correctedRows-searchRow - seq_along(searchRow) + 1 #works because
of the sort above
cat(Corrected row indices:)
print(correctedRows)
sapply(correctedRows, function(cr){
lookWhere-matchMat[cr, seq(cr-1)]
cat(Will now look into:\n)
print(lookWhere)
cc-which.max(lookWhere)
cat(Max at position, cc, \n)
colnames(matchMat)[cc]
})
}
I don't think there's that much difference. Depending on specific sizes, it
may be more or less costly to first shrink the search matrix like I do. And
similarly depending, I may be better still if you remove the rows that
you're not interested in as well (some more but similar index trickery
required then.

HTH,


Nick Sabbe
--
ping: nick.sa...@ugent.be
link: http://biomath.ugent.be
wink: A1.056, Coupure Links 653, 9000 Gent
ring: 09/264.59.36

-- Do Not Disapprove





-Original Message-
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On
Behalf Of murali.me...@avivainvestors.com
Sent: donderdag 31 maart 2011 16:46
To: r-help@r-project.org
Subject: [R] choosing best 'match' for given factor

Folks,

I have a 'matching' matrix between variables A, X, L, O:

 a - structure(c(1, 0.41, 0.58, 0.75, 0.41, 1, 0.6, 0.86, 0.58, 
0.6, 1, 0.83, 0.75, 0.86, 0.83, 1), .Dim = c(4L, 4L), .Dimnames = list(
c(A, X, L, O), c(A, X, L, O)))

 a
  A X L O
A  1.00  0.41  0.58  0.75
X  0.41  1.00  0.60  0.86
L  0.58  0.75  1.00  0.83
O  0.60  0.86  0.83  1.00

And I have a search vector of variables

 v - c(X, O)

I want to write a function bestMatch(searchvector, matchMat) such that for
each variable in searchvector, I get the variable that it has the highest
match to - but searching only among variables to the left of it in the
'matching' matrix, and not matching with any variable in searchvector
itself.

So in the above example, although X has the highest match (0.86) with O,
I can't choose O as it's to the right of X (and also because O is in the
searchvector v already); I'll have to choose A.

For O, I will choose L, the variable it's best matched with - as it
can't match X already in the search vector.

My function bestMatch(v, a) will then return c(A, L)

My matrix a is quite large, and I have a long list of search vectors v, so I
need an efficient method.

I wrote this:

bestMatch - function(searchvector,  matchMat) {
sapply(searchvector, function(cc) {
 y - matchMat[!(rownames(matchMat) %in%
searchvector)  (index(rownames(matchMat))  match(cc, rownames(matchMat))),
cc, drop = FALSE];
 rownames(y)[which.max(y)]
})   
}

Any advice?

Thanks,

Murali

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] another statistical question

2011-03-31 Thread Peter Ehlers


On 2011-03-31 07:44, Alexander Engelhardt wrote:

Am 31.03.2011 15:46, schrieb Anna Lee:

Dear List!

I have a unverse (basic population) which is not normally distributed.
Now from this universe I take some subsets. Each subset is normally
distributed within itself. I now want to compare the subsets and see
if they differ significantly. So what is my assumption - normal
distributed data and therefore comparison of means, or non normal
distributed data and therefore comparison of medians?


If you want to do an ANOVA, the assumption is normality /within/ the
groups. That is, Y given X=x is normal distributed. Also, you want the
same variance within each group (group = your subset = factor value).


The equal-variances assumption is not strictly required.
R has the multiple-groups equivalent of the two-sample t-test.
See ?oneway.test.

Peter Ehlers



Short answer: Means.
(I am only 95% certain)

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] How to put line linking two plots

Here is a way to do it using just base graphics:

layout(matrix(c(1,1,2,3), 2, 2, byrow = TRUE))
plot(runif(10), type='b', ylim=c(0,1))
x.tmp - grconvertX(4, to='ndc')
y.tmp - grconvertY(0.9, to='ndc')
plot(runif(20), type='l', ylim=c(0,1))
par(xpd=NA)
segments( 10, 1,
 grconvertX(x.tmp,  from='ndc'), grconvertY(y.tmp, from='ndc'),
col='red' )
plot(runif(20), type='l')



-- 
Gregory (Greg) L. Snow Ph.D.
Statistical Data Center
Intermountain Healthcare
greg.s...@imail.org
801.408.8111


 -Original Message-
 From: r-help-boun...@r-project.org [mailto:r-help-bounces@r-
 project.org] On Behalf Of Paul Murrell
 Sent: Wednesday, March 30, 2011 4:29 PM
 To: Mario Valle
 Cc: R-help@r-project.org
 Subject: Re: [R] How to put line linking two plots
 
 Hi
 
 On 30/03/2011 10:54 p.m., Mario Valle wrote:
  Hello!
  Suppose I have three charts like below. The top chart is a general
  overview and the bottom charts are related so some point of this
 chart.
  To make clear this relationship I want to draw a line between (4,0.9)
 in
  the top chart and (10,1) in the bottom-left one.
  Currently I add it manually using Inkscape on the resulting pdf file.
  Is it possible to add it inside R? Should I switch to other charting
  packages?
 
 You'll have your work cut out using traditional graphics, but this is
 doable in grid-based graphics.  For example, ...
 
 library(grid)
 library(lattice)
 
 set.seed(123)
 print(xyplot(runif(10)~1:10, type=b),
   position=c(0, .5, 1, 1),
   prefix=top,
   more=TRUE)
 print(xyplot(runif(20)~1:20, type=l),
   position=c(0, 0, .5, .5),
   prefix=left,
   more=TRUE)
 print(xyplot(runif(20)~1:20, type=l),
   position=c(.5, 0, 1, .5),
   prefix=right)
 trellis.focus(panel, 1, 1, prefix=top)
 grid.move.to(unit(4, native), unit(.9, native))
 trellis.unfocus()
 trellis.focus(panel, 1, 1, prefix=left, clip.off=TRUE)
 grid.line.to(unit(10, native), unit(1, native))
 trellis.unfocus()
 
 Paul
 
  Thanks for the advice!
mario
 
  set.seed(123)
  pdf(test.pdf, width=14, height=7)
  layout(matrix(c(1,1,2,3), 2, 2, byrow = TRUE))
  plot(runif(10), type='b')
  plot(runif(20), type='l')
  plot(runif(20), type='l')
  dev.off()
 
  R 2.12.2 on Windows 7 (32bits)
 
 
 --
 Dr Paul Murrell
 Department of Statistics
 The University of Auckland
 Private Bag 92019
 Auckland
 New Zealand
 64 9 3737599 x85392
 p...@stat.auckland.ac.nz
 http://www.stat.auckland.ac.nz/~paul/
 
 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-
 guide.html
 and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Italicized title from index

2011-03-31 Thread Jeremy Newman

Hi,

Thank you *very* much. I'll remember that in the future!

Cheers,

-Jeremy N

Undergraduate Researcher in Macroecology
University of Ottawa
Department of Biology

Ad astra per alia porci!



On Thu, Mar 31, 2011 at 8:01 AM, Henrique Dallazuanna www...@gmail.comwrote:

 Better:

 plot(1, main = bquote(Yield  for  ~ italic(.(as.character(spp))) ~
 in management region ~ .(region)))

 You have a factor, so you need convert it to character


 On Thu, Mar 31, 2011 at 8:59 AM, Henrique Dallazuanna www...@gmail.com
 wrote:
  Try this:
 
  plot(1, main = bquote(Yield  for  ~ italic(.(spp)) ~ in management
  region ~ .(region)))
 
  On Thu, Mar 31, 2011 at 2:35 AM, Jeremy Newman jnewm...@uottawa.ca
 wrote:
  Hi all!
 
  I've written a handy script that uses a for loop to allow me to generate
 a
  large number of figures and statistical outputs for a large dataset.
 
  I am using indexing to retrieve a species name for the title of my
 graphs-
  which worked fine. However, I need to italicize these species names.
 
  I originally used the paste function, and had no problems with indexing:
 
  *main=paste(Yield for , testsub[1,3],  in management region ,
  testsub[1,2])*
 
  The title looks like: *Yield for Gadus morhua in management region
 4X5Y*
 
  I tried bquote from a related help thread, I tried to emulate it:
 
  *spp-testsub[1,3]*
  *region-testsub[1,2]*
  *
  *
  *main=bquote(Yield ~ for ~ italic(.(spp)) ~ in ~ management ~ region ~
  .(region))*
 
  Which doesn't seem to work at all, but when I try not putting anything
 after
  the italic:
 
  *main=bquote(Yield ~ for ~ italic(.(spp)))*
 
  I get: *Yield for 1*
  While *spp=Gadus morhua*
 
  I'm at wit's end, I tried to read about substitute, expression, and eval
  functions in the hopes I can figure it out, but I am lost!
 
  Thanks for any help!
 
  Cheers,
 
  -Jeremy N
 
  Undergraduate Researcher in Macroecology
  University of Ottawa
  Department of Biology
 
  Ad astra per alia porci!
 
 [[alternative HTML version deleted]]
 
  __
  R-help@r-project.org mailing list
  https://stat.ethz.ch/mailman/listinfo/r-help
  PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html
  and provide commented, minimal, self-contained, reproducible code.
 
 
 
 
  --
  Henrique Dallazuanna
  Curitiba-ParanÃ¡-Brasil
  25Â° 25' 40 S 49Â° 16' 22 O
 



 --
 Henrique Dallazuanna
 Curitiba-ParanÃ¡-Brasil
 25Â° 25' 40 S 49Â° 16' 22 O


[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Graph many points without hiding some

Just a note, Base graphics does support transparency as long as the device 
plotting to supports it.

-- 
Gregory (Greg) L. Snow Ph.D.
Statistical Data Center
Intermountain Healthcare
greg.s...@imail.org
801.408.8111


 -Original Message-
 From: r-help-boun...@r-project.org [mailto:r-help-bounces@r-
 project.org] On Behalf Of Dennis Murphy
 Sent: Thursday, March 31, 2011 1:36 AM
 To: Samuel Dennis
 Cc: R-help@r-project.org
 Subject: Re: [R] Graph many points without hiding some
 
 Hi:
 
 I can think of a couple: (1) size reduction of the points; (2) alpha
 transparency; (3)  (1) + (2)
 
 From your original plot in base graphics, I reduced cex to 0.2 and it
 didn't
 look too bad:
 
 plot(rnorm(x,mean=19),rnorm(x),col=3,xlim=c(16,24), cex = 0.2)
 points(rnorm(x,mean=20),rnorm(x),col=1, cex = 0.2)
 points(rnorm(x,mean=21),rnorm(x),col=2, cex = 0.2)
 
 AFAIK, base graphics doesn't have alpha transparency available, but the
 ggplot2 package does. One approach is to adjust the alpha transparency
 on
 default size points; another is to combine reduced point size with
 alpha
 transparency. Here is your example rehashed for ggplot2.
 
 require(ggplot2)
 d - data.frame(x1 = rnorm(1, mean = 19), x2 = rnorm(1, mean =
 20),
 x3 = rnorm(1, mean = 21), x = rnorm(1))
 # Basically stacking x1 - x3, creating two new vars named variable and
 value
 dm - melt(d, id = 'x')   # from reshape package, loads with ggplot2
 # Alpha transparency is set to a low level with default point size,
 # but the colors in the legend are muted by the level of transparency
 ggplot(dm, aes(x = x, y = value, colour = variable)) + theme_bw() +
geom_point(alpha = 0.05) +
scale_colour_manual(values = c('x1' = 'black',
   'x2' = 'red', 'x3' = 'green'))
 
 # A tradeoff is to reduce the point size and increase alpha a bit, but
 these
 changes will
 # also be reflected in the legend.
 
 ggplot(dm, aes(x = x, y = value, colour = variable)) + theme_bw() +
geom_point(alpha = 0.15, size = 1) +
scale_colour_manual(values = c('x1' = 'black',
   'x2' = 'red', 'x3' = 'green'))
 
 You may well find the legend to be useless for this example, so to get
 rid
 of it,
 
 ggplot(dm, aes(x = x, y = value, colour = variable)) + theme_bw() +
geom_point(alpha = 0.15, size = 1) +
scale_colour_manual(values = c('x1' = 'black',
   'x2' = 'red', 'x3' = 'green')) +
opts(legend.position = 'none')
 
 The nice thing about the ggplot2 graph is that you can adjust the point
 size
 and alpha transparency to your tastes. The default point size is 2 and
 the
 default alpha = 1 (no transparency).
 
 HTH,
 Dennis
 
 On Wed, Mar 30, 2011 at 10:04 PM, Samuel Dennis sjdenn...@gmail.com
 wrote:
 
  I have a very large dataset with three variables that I need to graph
 using
  a scatterplot. However I find that the first variable gets masked by
 the
  other two, so the graph looks entirely different depending on the
 order of
  variables. Does anyone have any suggestions how to manage this?
 
  This code is an illustration of what I am dealing with:
 
  x - 1
  plot(rnorm(x,mean=20),rnorm(x),col=1,xlim=c(16,24))
  points(rnorm(x,mean=21),rnorm(x),col=2)
  points(rnorm(x,mean=19),rnorm(x),col=3)
 
  gives an entirely different looking graph to:
 
  x - 1
  plot(rnorm(x,mean=19),rnorm(x),col=3,xlim=c(16,24))
  points(rnorm(x,mean=20),rnorm(x),col=1)
  points(rnorm(x,mean=21),rnorm(x),col=2)
 
  despite being identical in all respects except for the order in which
 the
  variables are plotted.
 
  I have tried using pch=., however the colours are very difficult to
  discern. I have experimented with a number of other symbols with no
 real
  solution.
 
  The only way that appears to work is to iterate the plot with a for
 loop,
  and progressively add a few numbers from each variable, as below.
 However
  although I can do this simply with random numbers as I have done
 here, this
  is an extremely cumbersome method to use with real datasets.
 
  plot(1,1,xlim=c(16,24),ylim=c(-4,4),col=white)
  x - 100
  for (i in 1:100) {
  points(rnorm(x,mean=19),rnorm(x),col=3)
  points(rnorm(x,mean=20),rnorm(x),col=1)
  points(rnorm(x,mean=21),rnorm(x),col=2)
  }
 
  Is there some function in R that could solve this through
 automatically
  iterating my data as above, using transparent symbols, or something
 else?
  Is
  there some other way of solving this issue that I haven't thought of?
 
  Thankyou,
 
  Samuel Dennis
 
 [[alternative HTML version deleted]]
 
  __
  R-help@r-project.org mailing list
  https://stat.ethz.ch/mailman/listinfo/r-help
  PLEASE do read the posting guide
  http://www.R-project.org/posting-guide.html
  and provide commented, minimal, self-contained, reproducible code.
 
 
   [[alternative HTML version deleted]]

Re: [R] Simple lattice question

2011-03-31 Thread Peter Ehlers


On 2011-03-31 06:58, Rubén Roa wrote:

Thanks Peters!

Just a few minor glitches now:

require(lattice)
data- data.frame(SP=sort(rep(as.factor(c('A','B','C','D','E')),12)),
x=rpois(60,10),
y=rep(c(rep(0,4),rep(10,4),rep(20,4)),5),
z=rep(1:4,15))
xyplot(x~y|SP,data=data,groups=z,layout=c(2,3),pch=1:4,lty=1:4,col='black',type='b',
key=list(x = .65, y = .75, corner = c(0, 0), points=TRUE,
   lines=TRUE, pch=1:4, lty=1:4, type='b',
   text=list(lab = as.character(unique(data$z)

I have an extra column of symbols on the legend,

and,

would like to add a title to the legend. Such as 'main' for plots.

Any suggestions?


for key(), make 'lines' into a list:

 xyplot(x~y|SP,data=data,groups=z,layout=c(2,3),
pch=1:4,lty=1:4,col='black',type='b',
key=list(x = .65, y = .75, corner = c(0, 0),
 title=title here, cex.title=.9, lines.title=3,
 lines=list(pch=1:4, lty=1:4, type='b'),
 text=list(lab = as.character(unique(data$z)

Peter Ehlers



Rubén



Dr. Rubén Roa-Ureta
AZTI - Tecnalia / Marine Research Unit
Txatxarramendi Ugartea z/g
48395 Sukarrieta (Bizkaia)
SPAIN




-Mensaje original-
De: Peter Ehlers [mailto:ehl...@ucalgary.ca]
Enviado el: jueves, 31 de marzo de 2011 15:41
Para: Rubén Roa
CC: r-help@r-project.org
Asunto: Re: [R] Simple lattice question

On 2011-03-31 03:39, Rubén Roa wrote:


DeaR ComRades,

require(lattice)
data-

data.frame(SP=sort(rep(as.factor(c('A','B','C','D','E')),12)),

 x=rpois(60,10),
 y=rep(c(rep(0,4),rep(10,4),rep(20,4)),5),
 z=rep(1:4,15))


xyplot(x~y|SP,data=data,groups=z,layout=c(2,3),pch=1:4,lty=1:4,col='bl

ack',type='b')

How do I put a legend for the grouping variable in the

empty upper-right panel?

See the help page for xyplot for an example using the iris data.
You just need to add something like

   auto.key = list(x = .6, y = .7, corner = c(0, 0))

to your lattice call. See the 'key' entry on the ?xyplot page.

Peter Ehlers



TIA

Rubén



__

__

Dr. Rubén Roa-Ureta
AZTI - Tecnalia / Marine Research Unit Txatxarramendi Ugartea z/g
48395 Sukarrieta (Bizkaia)
SPAIN

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.





__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] choosing best 'match' for given factor

Folks:

I think the following may be somewhat faster, as it avoids sorting:

bmat - function(mx,vec)
{
  nm - colnames(mx)
  ivec - match(vec,nm)
  sapply(ivec,function(k){
   if(k==1)NA  else {
lookat - setdiff(seq_len(k-1),ivec) ## only those to left and not
in search vector ##
nm[lookat[which.max(mx[lookat,k] )]]
   }
  }
 )
}

-- Bert

On Thu, Mar 31, 2011 at 8:30 AM, Nick Sabbe nick.sa...@ugent.be wrote:

 Hi Murali.
 I haven't compared, but this is what I would do:

 bestMatch-function(searchVector, matchMat)
 {
        searchRow-unique(sort(match(searchVector, colnames(matchMat #if
 you're sure, you could drop unique
        cat(Original row indices:)
        print(searchRow)
        matchMat-matchMat[, -searchRow, drop=FALSE] #avoid duplicates
 altogether
        cat(Corrected Matrix:\n)
        print(matchMat)
        correctedRows-searchRow - seq_along(searchRow) + 1 #works because
 of the sort above
        cat(Corrected row indices:)
        print(correctedRows)
        sapply(correctedRows, function(cr){
                        lookWhere-matchMat[cr, seq(cr-1)]
                        cat(Will now look into:\n)
                        print(lookWhere)
                        cc-which.max(lookWhere)
                        cat(Max at position, cc, \n)
                        colnames(matchMat)[cc]
                })
 }
 I don't think there's that much difference. Depending on specific sizes, it
 may be more or less costly to first shrink the search matrix like I do. And
 similarly depending, I may be better still if you remove the rows that
 you're not interested in as well (some more but similar index trickery
 required then.

 HTH,


 Nick Sabbe
 --
 ping: nick.sa...@ugent.be
 link: http://biomath.ugent.be
 wink: A1.056, Coupure Links 653, 9000 Gent
 ring: 09/264.59.36

 -- Do Not Disapprove





 -Original Message-
 From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On
 Behalf Of murali.me...@avivainvestors.com
 Sent: donderdag 31 maart 2011 16:46
 To: r-help@r-project.org
 Subject: [R] choosing best 'match' for given factor

 Folks,

 I have a 'matching' matrix between variables A, X, L, O:

  a - structure(c(1, 0.41, 0.58, 0.75, 0.41, 1, 0.6, 0.86, 0.58,
 0.6, 1, 0.83, 0.75, 0.86, 0.83, 1), .Dim = c(4L, 4L), .Dimnames = list(
    c(A, X, L, O), c(A, X, L, O)))

  a
      A     X     L     O
 A  1.00  0.41  0.58  0.75
 X  0.41  1.00  0.60  0.86
 L  0.58  0.75  1.00  0.83
 O  0.60  0.86  0.83  1.00

 And I have a search vector of variables

  v - c(X, O)

 I want to write a function bestMatch(searchvector, matchMat) such that for
 each variable in searchvector, I get the variable that it has the highest
 match to - but searching only among variables to the left of it in the
 'matching' matrix, and not matching with any variable in searchvector
 itself.

 So in the above example, although X has the highest match (0.86) with O,
 I can't choose O as it's to the right of X (and also because O is in the
 searchvector v already); I'll have to choose A.

 For O, I will choose L, the variable it's best matched with - as it
 can't match X already in the search vector.

 My function bestMatch(v, a) will then return c(A, L)

 My matrix a is quite large, and I have a long list of search vectors v, so I
 need an efficient method.

 I wrote this:

 bestMatch - function(searchvector,  matchMat) {
        sapply(searchvector, function(cc) {
                             y - matchMat[!(rownames(matchMat) %in%
 searchvector)  (index(rownames(matchMat))  match(cc, rownames(matchMat))),
 cc, drop = FALSE];
                             rownames(y)[which.max(y)]
        })
 }

 Any advice?

 Thanks,

 Murali

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.



--
Men by nature long to get on to the ultimate truths, and will often
be impatient with elementary studies or fight shy of them. If it were
possible to reach the ultimate truths without the elementary studies
usually prefixed to them, these would not be preparatory studies but
superfluous diversions.

-- Maimonides (1135-1204)

Bert Gunter
Genentech Nonclinical Biostatistics

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] Sequential multiple regression

2011-03-31 Thread Tyler Rinker


Hello,
 
In the past I have tended to reside more in the ANOVA camp but am trying to 
become more familiar with regression techniques in R.  I would like to get the 
F change from a model as I take away factors:
 
SO...
 
mod1-lm(y~x1+x2+x3)...mod2-lm(y~x1,x2)...mod3-lm(y~x1)
 
 
I can do this by hand by running several models in R and taking the MSr1/MSe1, 
MSr2/MSe2...  This is slow and I know there's a better way.
 
In SPSS (which I no longer use) I could easily obtain these results (F-change) 
as documented by Professor Andy Fields:
http://www.statisticshell.com/multireg.pdf
 
You can see the F changes for his two IV model yielding 2 F changes.  Maybe 
it's the language I'm using (sequential multiple regression) that yields me 
poor results in searching the archives and Rseek.  The results tend to be 
around hierarchal regression (I'm not familiar with this terminology being in 
the ANOVA camp).  When I look at the hier.part package and run the examples it 
doesn't seem to give me the F change I'm looking for.  The step function in the 
base program reduces the model but takes away the non sig. IV's (which is a 
great approach but I'm really after those F changes).  As is the usually the 
case I'm sure R does this simply and beautifully, I'm just not experienced with 
the statistical vocabulary and techniques around regression to find what I'm 
looking for.
 
F-change values with R:  Any help would be appreciated.
 
Thank you in advance,
Tyler 
[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Sequential multiple regression

?drop1

-- Bert

On Thu, Mar 31, 2011 at 9:24 AM, Tyler Rinker tyler_rin...@hotmail.com wrote:

 Hello,

 In the past I have tended to reside more in the ANOVA camp but am trying to 
 become more familiar with regression techniques in R.  I would like to get 
 the F change from a model as I take away factors:

 SO...

 mod1-lm(y~x1+x2+x3)...mod2-lm(y~x1,x2)...mod3-lm(y~x1)


 I can do this by hand by running several models in R and taking the 
 MSr1/MSe1, MSr2/MSe2...  This is slow and I know there's a better way.

 In SPSS (which I no longer use) I could easily obtain these results 
 (F-change) as documented by Professor Andy Fields:
 http://www.statisticshell.com/multireg.pdf

 You can see the F changes for his two IV model yielding 2 F changes.  Maybe 
 it's the language I'm using (sequential multiple regression) that yields me 
 poor results in searching the archives and Rseek.  The results tend to be 
 around hierarchal regression (I'm not familiar with this terminology being in 
 the ANOVA camp).  When I look at the hier.part package and run the examples 
 it doesn't seem to give me the F change I'm looking for.  The step function 
 in the base program reduces the model but takes away the non sig. IV's (which 
 is a great approach but I'm really after those F changes).  As is the usually 
 the case I'm sure R does this simply and beautifully, I'm just not 
 experienced with the statistical vocabulary and techniques around regression 
 to find what I'm looking for.

 F-change values with R:  Any help would be appreciated.

 Thank you in advance,
 Tyler
        [[alternative HTML version deleted]]

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.




-- 
Men by nature long to get on to the ultimate truths, and will often
be impatient with elementary studies or fight shy of them. If it were
possible to reach the ultimate truths without the elementary studies
usually prefixed to them, these would not be preparatory studies but
superfluous diversions.

-- Maimonides (1135-1204)

Bert Gunter
Genentech Nonclinical Biostatistics

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] statistical question

You could do bootstrapping for confidence intervals or permutation tests for 
just comparing the medians.  There are other tools as well, just be sure that 
you understand what you are testing and what assumptions are being made in 
whatever method you choose.

-- 
Gregory (Greg) L. Snow Ph.D.
Statistical Data Center
Intermountain Healthcare
greg.s...@imail.org
801.408.8111


 -Original Message-
 From: r-help-boun...@r-project.org [mailto:r-help-bounces@r-
 project.org] On Behalf Of Anna Lee
 Sent: Thursday, March 31, 2011 3:15 AM
 To: r-help@r-project.org
 Subject: [R] statistical question
 
 Dear List!
 
 I want to compare medians of  non normal distributed data. Is it
 possible and usefull to calculate 95% confidence intervals for
 medians? And if so - how can this be achieved in R?
 
 Thanks a lot!
 Anna
 
 --
 
 
 
 Der Inhalt dieser E-Mail ist vertraulich. Sollte Ihnen die E-Mail
 irrtümlich zugesandt worden sein, bitte ich Sie, mich unverzüglich zu
 benachrichtigen und die E-Mail zu löschen.
 
 This e-mail is confidential. If you have received it in error, please
 notify me immediately and delete it from your system.
 
 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-
 guide.html
 and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] choosing best 'match' for given factor

Try this:

bestMatch - function(search, match) {
colnames(match)[pmax(apply(match[,search], 2, which.max) - 1, 1)]
}


On Thu, Mar 31, 2011 at 11:46 AM,  murali.me...@avivainvestors.com wrote:
 Folks,

 I have a 'matching' matrix between variables A, X, L, O:

 a - structure(c(1, 0.41, 0.58, 0.75, 0.41, 1, 0.6, 0.86, 0.58,
 0.6, 1, 0.83, 0.75, 0.86, 0.83, 1), .Dim = c(4L, 4L), .Dimnames = list(
    c(A, X, L, O), c(A, X, L, O)))

 a
      A     X     L     O
 A  1.00  0.41  0.58  0.75
 X  0.41  1.00  0.60  0.86
 L  0.58  0.75  1.00  0.83
 O  0.60  0.86  0.83  1.00

 And I have a search vector of variables

 v - c(X, O)

 I want to write a function bestMatch(searchvector, matchMat) such that for 
 each variable in searchvector, I get the variable that it has the highest 
 match to - but searching only among variables to the left of it in the 
 'matching' matrix, and not matching with any variable in searchvector itself.

 So in the above example, although X has the highest match (0.86) with O, 
 I can't choose O as it's to the right of X (and also because O is in the 
 searchvector v already); I'll have to choose A.

 For O, I will choose L, the variable it's best matched with - as it can't 
 match X already in the search vector.

 My function bestMatch(v, a) will then return c(A, L)

 My matrix a is quite large, and I have a long list of search vectors v, so I 
 need an efficient method.

 I wrote this:

 bestMatch - function(searchvector,  matchMat) {
        sapply(searchvector, function(cc) {
                             y - matchMat[!(rownames(matchMat) %in% 
 searchvector)  (index(rownames(matchMat))  match(cc, rownames(matchMat))), 
 cc, drop = FALSE];
                             rownames(y)[which.max(y)]
        })
 }

 Any advice?

 Thanks,

 Murali

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.




-- 
Henrique Dallazuanna
Curitiba-Paraná-Brasil
25° 25' 40 S 49° 16' 22 O

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Sequential multiple regression

2011-03-31 Thread Tyler Rinker


Bert and anyone else with info,
 
First, Bert thank you for your quick reply.  drop1 gives the results as a type 
II anova.  Is there a way to make drop1 give you type I anova (the args don't 
appear to have a way to do so)?  Another package/function perhaps?
 
Tyler 
 
 Date: Thu, 31 Mar 2011 09:32:02 -0700
 Subject: Re: [R] Sequential multiple regression
 From: gunter.ber...@gene.com
 To: tyler_rin...@hotmail.com
 CC: r-help@r-project.org
 
 ?drop1
 
 -- Bert
 
 On Thu, Mar 31, 2011 at 9:24 AM, Tyler Rinker tyler_rin...@hotmail.com 
 wrote:
 
  Hello,
 
  In the past I have tended to reside more in the ANOVA camp but am trying to 
  become more familiar with regression techniques in R.  I would like to get 
  the F change from a model as I take away factors:
 
  SO...
 
  mod1-lm(y~x1+x2+x3)...mod2-lm(y~x1,x2)...mod3-lm(y~x1)
 
 
  I can do this by hand by running several models in R and taking the 
  MSr1/MSe1, MSr2/MSe2...  This is slow and I know there's a better way.
 
  In SPSS (which I no longer use) I could easily obtain these results 
  (F-change) as documented by Professor Andy Fields:
  http://www.statisticshell.com/multireg.pdf
 
  You can see the F changes for his two IV model yielding 2 F changes.  Maybe 
  it's the language I'm using (sequential multiple regression) that yields me 
  poor results in searching the archives and Rseek.  The results tend to be 
  around hierarchal regression (I'm not familiar with this terminology being 
  in the ANOVA camp).  When I look at the hier.part package and run the 
  examples it doesn't seem to give me the F change I'm looking for.  The step 
  function in the base program reduces the model but takes away the non sig. 
  IV's (which is a great approach but I'm really after those F changes).  As 
  is the usually the case I'm sure R does this simply and beautifully, I'm 
  just not experienced with the statistical vocabulary and techniques around 
  regression to find what I'm looking for.
 
  F-change values with R:  Any help would be appreciated.
 
  Thank you in advance,
  Tyler
 [[alternative HTML version deleted]]
 
  __
  R-help@r-project.org mailing list
  https://stat.ethz.ch/mailman/listinfo/r-help
  PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
  and provide commented, minimal, self-contained, reproducible code.
 
 
 
 
 -- 
 Men by nature long to get on to the ultimate truths, and will often
 be impatient with elementary studies or fight shy of them. If it were
 possible to reach the ultimate truths without the elementary studies
 usually prefixed to them, these would not be preparatory studies but
 superfluous diversions.
 
 -- Maimonides (1135-1204)
 
 Bert Gunter
 Genentech Nonclinical Biostatistics
  
[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] nls.profile

2011-03-31 Thread Douglas Bates

On Thu, Mar 31, 2011 at 4:02 AM, Daniel Kaschek
daniel.kasc...@physik.uni-freiburg.de wrote:
 Hello,

 I use nls.profile to compute confidence intervals of parameter estimates
 of a non-linear model. When computing the profiles, the model function
 produces an error for certain parameter combinations. Therefore nls
 fails and so does nls.profile.

 Is there a way to tell nls.profile to ignore errors and to proceed with
 the next parameter value?

Not easily.  What I tend to do is to adjust (increase) the alphamax
parameter (which is a misnomer, it should have been alphamin but I got
confused) until the profile can succeed.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] one question about bioconductor

2011-03-31 Thread wang peter

dear lady and gentalmen:
  i am gaoshan from kansas university.
i used such coding to deal with gel data

data - ReadAffy()
 Warning messages:
1: In file(out, wt) :
  cannot open file
'C:\Users\gaoshan\AppData\Local\Temp\RtmpvsyXOV\Rhttpd3f0b2e85': No such
file or directory
2: In file(out, wt) :
  cannot open file
'C:\Users\gaoshan\AppData\Local\Temp\RtmpvsyXOV\Rhttpd1d7427a'
eset - rma(data)
Error in gzfile(file, r) : cannot open the connection
In addition: Warning message:
In gzfile(file, r) :
  cannot open compressed file
'C:\Users\gaoshan\AppData\Local\Temp\RtmpvsyXOV\file5f6a6cb5', probable
reason 'No such file or directory'
 can you help me to fix it?
thank u very much
gao shan

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] How to update R?

2011-03-31 Thread Li, Yunfei

Hi,

My R version is old, and I would like to update it. How can I update R and keep 
all the libraries? Thanks

Best,

Yunfei Li
--
Research Assistant
Department of Statistics 
School of Molecular Biosciences
Biotechnology Life Sciences Building 427
Washington State University
Pullman, WA 99164-7520
Phone: 509-339-5096
http://www.wsu.edu/~ye_lab/people.html


[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] one question about bioconductor

2011-03-31 Thread Philipp Pagel

On Thu, Mar 31, 2011 at 09:32:06AM -0700, wang peter wrote:
 dear lady and gentalmen:
   i am gaoshan from kansas university.
 i used such coding to deal with gel data
 
 data - ReadAffy()
  Warning messages:
 1: In file(out, wt) :
   cannot open file
 'C:\Users\gaoshan\AppData\Local\Temp\RtmpvsyXOV\Rhttpd3f0b2e85': No such
 file or directory

As the message says: there is something wrong with the path. 

In order to get more helpful replies, you should show the actual code
you used and also give a hint about the spoecific packages you were
using. E.g. ReadAffy most certainly requires at least a filename which
seems to be missing from your comamnd above. 

In addition, I recommend to post your question on the bioconductor
mailing list.

cu
Philipp


-- 
Dr. Philipp Pagel
Lehrstuhl für Genomorientierte Bioinformatik
Technische Universität München
Wissenschaftszentrum Weihenstephan
Maximus-von-Imhof-Forum 3
85354 Freising, Germany
http://webclu.bio.wzw.tum.de/~pagel/

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] Cluster analysis, factor variables, large data set

2011-03-31 Thread Hans Ekbrand

Dear R helpers,

I have a large data set with 36 variables and about 50.000 cases. The
variabels represent labour market status during 36 months, there are 8
different variable values (e.g. Full-time Employment, Student,...)

Only cases with at least one change in labour market status is
included in the data set.

To analyse sub sets of the data, I have used daisy in the
cluster-package to create a distance matrix and then used pam (or pamk
in the fpc-package), to get a k-medoids cluster-solution. Now I want
to analyse the whole set.

clara is said to cope with large data sets, but the first step in the
cluster analysis, the creation of the distance matrix must be done by
another function since clara only works with numeric data.

Is there an alternative to the daisy - clara route that does not
require as much RAM?

What functions would you recommend for a cluster analysis of this kind
of data on large data set?


regards,

Hans Ekbrand

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] read password-protected files

Hi list,

I have a bunch of .csv files that are password-protected.  I wonder if there is 
a way to read them in in R without manually removing the password protection 
for 
each file?

Thank you very much!


...Tao
[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] How to update R?

This question has been asked by many people already.  The easiest way is:

1) install the new version
2) copy all or the libraries that you installed later from the library folder 
of older version to the new version
3) uninstall the old version
4) do a library update in the new version

Done!

...Tao






From: Li, Yunfei yunfei...@wsu.edu
To: r-help@r-project.org
Sent: Thu, March 31, 2011 10:16:23 AM
Subject: [R] How to update R?

Hi,

My R version is old, and I would like to update it. How can I update R and keep 
all the libraries? Thanks

Best,

Yunfei Li
--

Research Assistant
Department of Statistics 
School of Molecular Biosciences
Biotechnology Life Sciences Building 427
Washington State University
Pullman, WA 99164-7520
Phone: 509-339-5096
http://www.wsu.edu/~ye_lab/people.html


[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] How to update R?

2011-03-31 Thread Joshua Wiley

Hi Yunfei,

It really depends to some extent on the version of R (and your OS) you
were using. Here is a link to a blog post that discusses some
potential strategies:
http://www.r-statistics.com/2010/04/changing-your-r-upgrading-strategy-and-the-r-code-to-do-it-on-windows/

There are also a number of messages if you search the R-help archive.
However, old version of libraries may not be portable to the most
recent R, so you may need to update R and update all your libraries
too.

Cheers,

Josh

On Thu, Mar 31, 2011 at 10:16 AM, Li, Yunfei yunfei...@wsu.edu wrote:
Hi,

My R version is old, and I would like to update it. How can I update R and
keep all the libraries? Thanks

Best,

Yunfei Li
--
Research Assistant
Department of Statistics
School of Molecular Biosciences
Biotechnology Life Sciences Building 427
Washington State University
Pullman, WA 99164-7520
Phone: 509-339-5096
http://www.wsu.edu/~ye_lab/people.html

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

--
Joshua Wiley
Ph.D. Student, Health Psychology
University of California, Los Angeles
http://www.joshuawiley.com/

Re: [R] Cluster analysis, factor variables, large data set

2011-03-31 Thread Christian Hennig


Dear Hans,

clara doesn't require a distance matrix as input (and therefore doesn't 
require you to run daisy), it will work with the raw data matrix using

Euclidean distances implicitly.
I can't tell you whether Euclidean distances are appropriate in this 
situation (this depends on the interpretation and variables and 
particularly on how they are scaled), but they may be fine at least after 
some transformation and standardisation of your variables.


Hope this helps,
Christian

On Thu, 31 Mar 2011, Hans Ekbrand wrote:


Dear R helpers,

I have a large data set with 36 variables and about 50.000 cases. The
variabels represent labour market status during 36 months, there are 8
different variable values (e.g. Full-time Employment, Student,...)

Only cases with at least one change in labour market status is
included in the data set.

To analyse sub sets of the data, I have used daisy in the
cluster-package to create a distance matrix and then used pam (or pamk
in the fpc-package), to get a k-medoids cluster-solution. Now I want
to analyse the whole set.

clara is said to cope with large data sets, but the first step in the
cluster analysis, the creation of the distance matrix must be done by
another function since clara only works with numeric data.

Is there an alternative to the daisy - clara route that does not
require as much RAM?

What functions would you recommend for a cluster analysis of this kind
of data on large data set?


regards,

Hans Ekbrand

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.



*** --- ***
Christian Hennig
University College London, Department of Statistical Science
Gower St., London WC1E 6BT, phone +44 207 679 1698
chr...@stats.ucl.ac.uk, www.homepages.ucl.ac.uk/~ucakche

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] How to update R?

Forgot to mention that my way of updating is for Windows only.

...Tao






From: Joshua Wiley jwiley.ps...@gmail.com
To: Li, Yunfei yunfei...@wsu.edu
Cc: r-help@r-project.org
Sent: Thu, March 31, 2011 11:05:29 AM
Subject: Re: [R] How to update R?

Hi Yunfei,

It really depends to some extent on the version of R (and your OS) you
were using.  Here is a link to a blog post that discusses some
potential strategies:
http://www.r-statistics.com/2010/04/changing-your-r-upgrading-strategy-and-the-r-code-to-do-it-on-windows/


There are also a number of messages if you search the R-help archive.
However, old version of libraries may not be portable to the most
recent R, so you may need to update R and update all your libraries
too.

Cheers,

Josh

On Thu, Mar 31, 2011 at 10:16 AM, Li, Yunfei yunfei...@wsu.edu wrote:
 Hi,

 My R version is old, and I would like to update it. How can I update R and 
 keep 
all the libraries? Thanks

 Best,

 Yunfei Li
--
-
 Research Assistant
 Department of Statistics 
 School of Molecular Biosciences
 Biotechnology Life Sciences Building 427
 Washington State University
 Pullman, WA 99164-7520
 Phone: 509-339-5096
 http://www.wsu.edu/~ye_lab/people.html


[[alternative HTML version deleted]]

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.




-- 
Joshua Wiley
Ph.D. Student, Health Psychology
University of California, Los Angeles
http://www.joshuawiley.com/

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] read password-protected files

2011-03-31 Thread Joshua Wiley

Hi Tao,

You will need to give us more details.  CSV files are pure text.  How
are they password protected?  Are they Excel files?  In a password
protected compressed format?  Encrypted?

Best Regards,

Josh

On Thu, Mar 31, 2011 at 11:00 AM, Shi, Tao shida...@yahoo.com wrote:
 Hi list,

 I have a bunch of .csv files that are password-protected.  I wonder if there 
 is
 a way to read them in in R without manually removing the password protection 
 for
 each file?

 Thank you very much!


 ...Tao
        [[alternative HTML version deleted]]

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.




-- 
Joshua Wiley
Ph.D. Student, Health Psychology
University of California, Los Angeles
http://www.joshuawiley.com/

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] read password-protected files

2011-03-31 Thread Prof Brian Ripley


On Thu, 31 Mar 2011, Shi, Tao wrote:


Hi list,

I have a bunch of .csv files that are password-protected.  I wonder if there is


There is nothing in the CSV standard about password protection.  I 
very much doubt that these actually CSV files.  So please follow the 
posting guide [*]



a way to read them in in R without manually removing the password protection for
each file?

Thank you very much!


...Tao
[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[*] as above: and as you sent HTML you are either full of contempt for 
the helpeRs or failed to do your homework.



--
Brian D. Ripley,  rip...@stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel:  +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UKFax:  +44 1865 272595

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] read password-protected files

2011-03-31 Thread Philipp Pagel

On Thu, Mar 31, 2011 at 11:00:48AM -0700, Shi, Tao wrote:
 Hi list,
 
 I have a bunch of .csv files that are password-protected.  I wonder if there 
 is 
 a way to read them in in R without manually removing the password protection 
 for 
 each file?

I doubt that there is such a thing as a password protected csv file.
They are just text files, after all. So I guess you have something
else. How or what did the presumed pasword protection?

cu
Philipp

-- 
Dr. Philipp Pagel
Lehrstuhl für Genomorientierte Bioinformatik
Technische Universität München
Wissenschaftszentrum Weihenstephan
Maximus-von-Imhof-Forum 3
85354 Freising, Germany
http://webclu.bio.wzw.tum.de/~pagel/

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] read password-protected files

Hi Josh and Prof. Ripley,

Thanks for the quick replies!

Sorry about the HTML email.  It's the default for my yahoo account and forgot 
to 
switch.

The question was asked by a colleague of mine.  After double-checking, yes, 
they 
are actually Excel files.  Any idea on how to approach this?

Thanks!

...Tao




From: Prof Brian Ripley rip...@stats.ox.ac.uk

Cc: r-help@r-project.org
Sent: Thu, March 31, 2011 11:15:35 AM
Subject: Re: [R] read password-protected files

On Thu, 31 Mar 2011, Shi, Tao wrote:

 Hi list,

 I have a bunch of .csv files that are password-protected.  I wonder if there 
is

There is nothing in the CSV standard about password protection.  I 
very much doubt that these actually CSV files.  So please follow the 
posting guide [*]

 a way to read them in in R without manually removing the password protection 
for
 each file?

[[elided Yahoo spam]]


 ...Tao
 [[alternative HTML version deleted]]

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.

[*] as above: and as you sent HTML you are either full of contempt for 
the helpeRs or failed to do your homework.


-- 
Brian D. Ripley,  rip...@stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel:  +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UKFax:  +44 1865 272595

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] read password-protected files

2011-03-31 Thread Joshua Wiley

Hi Tao,

There are a number of R packages to work with reading/writing Excel
files (e.g., xlsReadWrite).  If Excel is on the system in question, as
long as you have opened the file in Excel, you should be able to use
RODBC with  RDCOMClient or rcom, but I am not sure those will work if
it is password protected unless it is currently unlocked in Excel.

HTH,

Josh

On Thu, Mar 31, 2011 at 11:33 AM, Shi, Tao shida...@yahoo.com wrote:
 Hi Josh and Prof. Ripley,

 Thanks for the quick replies!

 Sorry about the HTML email.  It's the default for my yahoo account and forgot 
 to
 switch.

 The question was asked by a colleague of mine.  After double-checking, yes, 
 they
 are actually Excel files.  Any idea on how to approach this?

 Thanks!

 ...Tao



 
 From: Prof Brian Ripley rip...@stats.ox.ac.uk
 To: Shi, Tao shida...@yahoo.com
 Cc: r-help@r-project.org
 Sent: Thu, March 31, 2011 11:15:35 AM
 Subject: Re: [R] read password-protected files

 On Thu, 31 Mar 2011, Shi, Tao wrote:

 Hi list,

 I have a bunch of .csv files that are password-protected.  I wonder if there
 is

 There is nothing in the CSV standard about password protection.  I
 very much doubt that these actually CSV files.  So please follow the
 posting guide [*]

 a way to read them in in R without manually removing the password protection
for
 each file?

 Thank you very much!


 ...Tao
     [[alternative HTML version deleted]]

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.

 [*] as above: and as you sent HTML you are either full of contempt for
 the helpeRs or failed to do your homework.


 --
 Brian D. Ripley,                  rip...@stats.ox.ac.uk
 Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
 University of Oxford,             Tel:  +44 1865 272861 (self)
 1 South Parks Road,                     +44 1865 272866 (PA)
 Oxford OX1 3TG, UK                Fax:  +44 1865 272595

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] read password-protected files

Thanks, Josh.  I use xlsReadWrite routinely.  It would be nice if it has a 
password option.

...Tao




- Original Message 
 From: Joshua Wiley jwiley.ps...@gmail.com
 To: Shi, Tao shida...@yahoo.com
 Cc: r-help@r-project.org
 Sent: Thu, March 31, 2011 11:50:27 AM
 Subject: Re: [R] read password-protected files
 
 Hi Tao,
 
 There are a number of R packages to work with reading/writing  Excel
 files (e.g., xlsReadWrite).  If Excel is on the system in  question, as
 long as you have opened the file in Excel, you should be able to  use
 RODBC with  RDCOMClient or rcom, but I am not sure those will work  if
 it is password protected unless it is currently unlocked in  Excel.
 
 HTH,
 
 Josh
 
 On Thu, Mar 31, 2011 at 11:33 AM, Shi, Tao  shida...@yahoo.com wrote:
  Hi  Josh and Prof. Ripley,
 
  Thanks for the quick  replies!
 
  Sorry about the HTML email.  It's the default for my  yahoo account and 
forgot to
  switch.
 
  The question was  asked by a colleague of mine.  After double-checking, 
  yes, 
they
  are  actually Excel files.  Any idea on how to approach this?
 
   Thanks!
 
  ...Tao
 
 
 
   
  From: Prof Brian Ripley rip...@stats.ox.ac.uk
  To:  Shi, Tao shida...@yahoo.com
  Cc: r-help@r-project.org
  Sent: Thu,  March 31, 2011 11:15:35 AM
  Subject: Re: [R] read password-protected  files
 
  On Thu, 31 Mar 2011, Shi, Tao wrote:
 
   Hi list,
 
  I have a bunch of .csv files that are  password-protected.  I wonder if 
there
  is
 
  There is  nothing in the CSV standard about password protection.  I
  very much  doubt that these actually CSV files.  So please follow the
  posting guide  [*]
 
  a way to read them in in R without manually removing the  password 
protection
 for
  each  file?
 
  Thank you very  much!
 
 
  ...Tao
  [[alternative  HTML version deleted]]
 
   __
  R-help@r-project.org mailing  list
  https://stat.ethz.ch/mailman/listinfo/r-help
   PLEASE do read the posting guide  
http://www.R-project.org/posting-guide.html
  and provide commented,  minimal, self-contained, reproducible code.
 
  [*] as above: and as  you sent HTML you are either full of contempt for
  the helpeRs or failed  to do your homework.
 
 
  --
  Brian D. Ripley,   rip...@stats.ox.ac.uk
  Professor  of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
  University of  Oxford, Tel:  +44 1865 272861 (self)
  1 South Parks Road,  +44 1865 272866 (PA)
  Oxford OX1 3TG, UK Fax:  +44 1865 272595


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] read password-protected files

Using RDCOMCliente:

library(RDCOMClient)
eApp - COMCreate(Excel.Application)
wk - eApp$Workbooks()$Open(Filename=your_file,Password=your_password)
tf - tempfile()
wk$Sheets(1)$SaveAs(tf, 3)

DF - read.table(sprintf(%s.txt, tf), header = TRUE, sep = \t)

On Thu, Mar 31, 2011 at 3:33 PM, Shi, Tao shida...@yahoo.com wrote:
 Hi Josh and Prof. Ripley,

 Thanks for the quick replies!

 Sorry about the HTML email.  It's the default for my yahoo account and forgot 
 to
 switch.

 The question was asked by a colleague of mine.  After double-checking, yes, 
 they
 are actually Excel files.  Any idea on how to approach this?

 Thanks!

 ...Tao



 
 From: Prof Brian Ripley rip...@stats.ox.ac.uk

 Cc: r-help@r-project.org
 Sent: Thu, March 31, 2011 11:15:35 AM
 Subject: Re: [R] read password-protected files

 On Thu, 31 Mar 2011, Shi, Tao wrote:

 Hi list,

 I have a bunch of .csv files that are password-protected.  I wonder if there
 is

 There is nothing in the CSV standard about password protection.  I
 very much doubt that these actually CSV files.  So please follow the
 posting guide [*]

 a way to read them in in R without manually removing the password protection
for
 each file?

 [[elided Yahoo spam]]


 ...Tao
     [[alternative HTML version deleted]]

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.

 [*] as above: and as you sent HTML you are either full of contempt for
 the helpeRs or failed to do your homework.


 --
 Brian D. Ripley,                  rip...@stats.ox.ac.uk
 Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
 University of Oxford,             Tel:  +44 1865 272861 (self)
 1 South Parks Road,                     +44 1865 272866 (PA)
 Oxford OX1 3TG, UK                Fax:  +44 1865 272595

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.




-- 
Henrique Dallazuanna
Curitiba-Paraná-Brasil
25° 25' 40 S 49° 16' 22 O

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Cluster analysis, factor variables, large data set

2011-03-31 Thread Hans Ekbrand

On Thu, Mar 31, 2011 at 07:06:31PM +0100, Christian Hennig wrote:
 Dear Hans,
 
 clara doesn't require a distance matrix as input (and therefore
 doesn't require you to run daisy), it will work with the raw data
 matrix using
 Euclidean distances implicitly.
 I can't tell you whether Euclidean distances are appropriate in this
 situation (this depends on the interpretation and variables and
 particularly on how they are scaled), but they may be fine at least
 after some transformation and standardisation of your variables.

The variables are unordered factors, stored as integers 1:9, where 

1 means Full-time employment
2 means Part-time employment
3 means Student
4 means Full-time self-employee
...

Does euclidean distances make sense on unordered factors coded as
integers?

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] transparent grays?

2011-03-31 Thread Frostygoat

Is there a grayscale equivalent to alpha levels in rgb?

Example:  I have the following to make red transparent circles overlap
with previously plotted blue symbols.

symbols(x=sites$long,y=sites$lat,circles=log(sites$prop.nem
+1),add=T,inches=F,bg=rgb(red=1,green=0,blue=0,
alpha=0.5),fg=rgb(red=1,green=0,blue=0, alpha=0.5))

I'm having a hard time coming up with a grayscale equivalent.

Thanks!

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Cluster analysis, factor variables, large data set

2011-03-31 Thread Hans Ekbrand

On Thu, Mar 31, 2011 at 08:48:02PM +0200, Hans Ekbrand wrote:
 On Thu, Mar 31, 2011 at 07:06:31PM +0100, Christian Hennig wrote:
  Dear Hans,
  
  clara doesn't require a distance matrix as input (and therefore
  doesn't require you to run daisy), it will work with the raw data
  matrix using
  Euclidean distances implicitly.
  I can't tell you whether Euclidean distances are appropriate in this
  situation (this depends on the interpretation and variables and
  particularly on how they are scaled), but they may be fine at least
  after some transformation and standardisation of your variables.
 
 The variables are unordered factors, stored as integers 1:9, where 
 
 1 means Full-time employment
 2 means Part-time employment
 3 means Student
 4 means Full-time self-employee
 ...
 
 Does euclidean distances make sense on unordered factors coded as
 integers?

To be clear, here is an extract

 my.df.full[900:910, 16:19]
PL210F.first.year PL210G.first.year PL210H.first.year PL210I.first.year
900 2 2 1 2
901 1 1 1 1
902 1 1 1 1
903 2 2 2 2
904 1 1 1 1
905 2 2 2 2
906 7 8 2 7
907 5 5 5 5
908 1 1 1 1
909 1 1 1 1
910 1 1 1 1

 class(my.df.full[,16])
[1] integer

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Cluster analysis, factor variables, large data set

2011-03-31 Thread Peter Langfelder

On Thu, Mar 31, 2011 at 11:48 AM, Hans Ekbrand h...@sociologi.cjb.net wrote:

 The variables are unordered factors, stored as integers 1:9, where

 1 means Full-time employment
 2 means Part-time employment
 3 means Student
 4 means Full-time self-employee
 ...

 Does euclidean distances make sense on unordered factors coded as
 integers?

It probably doesn't. You said you have some 36 observations for each
case, correct? You can turn these 36 observations into a vector of
length 36 * 9 on which Euclidean distance will make some sense, namely
k changes will produce a distance of sqrt(2*k). For each observation
with value p (p between 1 and 9), create a vector r = c(0,0,1,0,...0)
where the entry 1 is in the p-th component. Hence, if values p1 and p2
are the same, euclidean distance between r1 and r2 is zero; if they
are not the same, Euclidan distance is sqrt(2).

Here's some possible R code:


transform = function(obsVector, maxVal)
{
  templateMat = matrix(0, maxVal, maxVal);
  diag(templateMat) = 1;

  return(as.vector(templateMat[, obsVector]));
}

set.seed(10)
n = 4;
m = 5;
max = 4;
data = matrix(sample(c(1:max), n*m, replace = TRUE), m, n);

 data
 [,1] [,2] [,3] [,4]
[1,]3312
[2,]1332
[3,]3324
[4,]1242
[5,]4141


trafoData = apply(data, 2, transform, maxVal = max);

 trafoData
  [,1] [,2] [,3] [,4]
 [1,]0010
 [2,]0001
 [3,]1100
 [4,]0000
 [5,]1000
 [6,]0001
 [7,]0110
 [8,]0000
 [9,]0000
[10,]0010
[11,]1100
[12,]0001
[13,]1000
[14,]0101
[15,]0000
[16,]0010
[17,]0101
[18,]0000
[19,]0000
[20,]1010



The code assumes that cases are in columns and observations in rows of
data. Examine data and trafoData to see how the transformation works.
Once you have the transformed data, simply apply your favorite
clustering method that uses Euclidean distance.

HTH,

Peter


 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] rank of Matrix

2011-03-31 Thread Carson Farmer

Dear list,

Can anyone tell me how to obtain the rank of a sparse Matrix, for
example from package Matrix (class dgCMatrix)? Here is an example of
QR decomposition of a sparse matrix (from the sparseQR class help).

library(Matrix)
data(KNex)
mm - KNex$mm
str(mmQR - qr(mm))

Similarly, using the functions/classes from the relatively new
MatrixModels package:

library(MatrixModels)
str(trial - data.frame(counts=c(18,17,15,20,10,20,25,13,12),
outcome=gl(3,1,9,labels=LETTERS[1:3]),
treatment=gl(3,3,labels=letters[1:3])))
glmS - glm4(counts ~ 0+outcome + treatment, poisson, trial,
  verbose = TRUE, sparse = TRUE)
str(glmS)
str(X - glmS@pred@X)
str(QR - qr(X))

I'm not very familiar with matrix decomposition, but is the 'p' slot
from a sparseQR object the same as $pivot from base qr? Additionally,
how do I obtain the rank of the input matrix? qr from the base package
produces an object with several components, two of which are rank, and
pivot, is there an equivalent way of getting at these values/vectors
for sparse matrices?
Basically, I am trying to get at the variance-covariance matrix in
order to compute t-values for the coefficients of a (sparse) GLM (i.e.
a summary function for class glpModel). Am I going about this the
right way? Are they perhaps more efficient ways of doing this when
working with sparse matrices. Is there a preferred way of calculating
t-statistics? Note that 'rankMatrix' from package Matrix appears to
densify the matrix before use, so this is not a viable option.

I have asked a similar question before [1], though I am now trying to
'answer' it myself by digging in and actually coding a bit more
myself, however, as you can see I'm still a bit stuck :-(

Thanks for any suggestions,

Carson

[1] http://www.mail-archive.com/r-help@r-project.org/msg130145.html

-- 
Carson J. Q. Farmer
ISSP Doctoral Fellow
National Centre for Geocomputation
National University of Ireland, Maynooth,
http://www.carsonfarmer.com/

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] transparent grays?

Try this:

with(trees, symbols(Height, Volume, circles=Girth/24, inches=FALSE, fg
= gray30,
 bg = sapply(apply(rbind(col2rgb(gray(seq(0,1, l = 10)))/255, alpha =
0.5), 2, as.list), do.call, what = rgb)))


On Thu, Mar 31, 2011 at 2:52 PM, Frostygoat frostyg...@gmail.com wrote:
 Is there a grayscale equivalent to alpha levels in rgb?

 Example:  I have the following to make red transparent circles overlap
 with previously plotted blue symbols.

 symbols(x=sites$long,y=sites$lat,circles=log(sites$prop.nem
 +1),add=T,inches=F,bg=rgb(red=1,green=0,blue=0,
 alpha=0.5),fg=rgb(red=1,green=0,blue=0, alpha=0.5))

 I'm having a hard time coming up with a grayscale equivalent.

 Thanks!

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.




-- 
Henrique Dallazuanna
Curitiba-Paraná-Brasil
25° 25' 40 S 49° 16' 22 O

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] generate random numbers

2011-03-31 Thread Anna Lee

Hey List,

does anyone know how I can generate a vector of random numbers from a
given distribution? Something like rnorm just for non normal
distributions???

Thanks a lot!
Anna

-- 



Der Inhalt dieser E-Mail ist vertraulich. Sollte Ihnen die E-Mail
irrtümlich zugesandt worden sein, bitte ich Sie, mich unverzüglich zu
benachrichtigen und die E-Mail zu löschen.

This e-mail is confidential. If you have received it in error, please
notify me immediately and delete it from your system.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Sequential multiple regression

See

http://www.stats.ox.ac.uk/pub/MASS3/Exegeses.pdf

?update.lm

-- Bert

On Thu, Mar 31, 2011 at 10:27 AM, Tyler Rinker tyler_rin...@hotmail.com wrote:
 Bert and anyone else with info,

 First, Bert thank you for your quick reply.  drop1 gives the results as a
 type II anova.  Is there a way to make drop1 give you type I anova (the args
 don't appear to have a way to do so)?  Another package/function perhaps?

 Tyler

 Date: Thu, 31 Mar 2011 09:32:02 -0700
 Subject: Re: [R] Sequential multiple regression
 From: gunter.ber...@gene.com
 To: tyler_rin...@hotmail.com
 CC: r-help@r-project.org

 ?drop1

 -- Bert

 On Thu, Mar 31, 2011 at 9:24 AM, Tyler Rinker tyler_rin...@hotmail.com
 wrote:
 
  Hello,
 
  In the past I have tended to reside more in the ANOVA camp but am trying
  to become more familiar with regression techniques in R.  I would like to
  get the F change from a model as I take away factors:
 
  SO...
 
  mod1-lm(y~x1+x2+x3)...mod2-lm(y~x1,x2)...mod3-lm(y~x1)
 
 
  I can do this by hand by running several models in R and taking the
  MSr1/MSe1, MSr2/MSe2...  This is slow and I know there's a better way.
 
  In SPSS (which I no longer use) I could easily obtain these results
  (F-change) as documented by Professor Andy Fields:
  http://www.statisticshell.com/multireg.pdf
 
  You can see the F changes for his two IV model yielding 2 F changes.
   Maybe it's the language I'm using (sequential multiple regression) that
  yields me poor results in searching the archives and Rseek.  The results
  tend to be around hierarchal regression (I'm not familiar with this
  terminology being in the ANOVA camp).  When I look at the hier.part package
  and run the examples it doesn't seem to give me the F change I'm looking
  for.  The step function in the base program reduces the model but takes 
  away
  the non sig. IV's (which is a great approach but I'm really after those F
  changes).  As is the usually the case I'm sure R does this simply and
  beautifully, I'm just not experienced with the statistical vocabulary and
  techniques around regression to find what I'm looking for.
 
  F-change values with R:  Any help would be appreciated.
 
  Thank you in advance,
  Tyler
         [[alternative HTML version deleted]]
 
  __
  R-help@r-project.org mailing list
  https://stat.ethz.ch/mailman/listinfo/r-help
  PLEASE do read the posting guide
  http://www.R-project.org/posting-guide.html
  and provide commented, minimal, self-contained, reproducible code.
 



 --
 Men by nature long to get on to the ultimate truths, and will often
 be impatient with elementary studies or fight shy of them. If it were
 possible to reach the ultimate truths without the elementary studies
 usually prefixed to them, these would not be preparatory studies but
 superfluous diversions.

 -- Maimonides (1135-1204)

 Bert Gunter
 Genentech Nonclinical Biostatistics




-- 
Men by nature long to get on to the ultimate truths, and will often
be impatient with elementary studies or fight shy of them. If it were
possible to reach the ultimate truths without the elementary studies
usually prefixed to them, these would not be preparatory studies but
superfluous diversions.

-- Maimonides (1135-1204)

Bert Gunter
Genentech Nonclinical Biostatistics

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] generate random numbers

There are several non-normal distributions built in that you could just use 
their generationg functions.  There are packages (most with dist somewhere in 
the name) that allow for functions on standard distributions including 
combining together distributions.

If your distribution is not one of the above, but has a well defined inverse 
cumulative distribution then you can just generate random uniforms (runif) and 
put them into the inverse.

If none of the above works then there are still many tools available such as 
rejection sampling or metropolis-hastings sampling.

The logspline package can approximate a distribution from a dataset and 
generate values from the approximation.

We would need to know more about the actual distribution that you are 
interested in to be of more specific help.

-- 
Gregory (Greg) L. Snow Ph.D.
Statistical Data Center
Intermountain Healthcare
greg.s...@imail.org
801.408.8111


 -Original Message-
 From: r-help-boun...@r-project.org [mailto:r-help-bounces@r-
 project.org] On Behalf Of Anna Lee
 Sent: Thursday, March 31, 2011 1:24 PM
 To: r-help@r-project.org
 Subject: [R] generate random numbers
 
 Hey List,
 
 does anyone know how I can generate a vector of random numbers from a
 given distribution? Something like rnorm just for non normal
 distributions???
 
 Thanks a lot!
 Anna
 
 --
 
 
 
 Der Inhalt dieser E-Mail ist vertraulich. Sollte Ihnen die E-Mail
 irrtümlich zugesandt worden sein, bitte ich Sie, mich unverzüglich zu
 benachrichtigen und die E-Mail zu löschen.
 
 This e-mail is confidential. If you have received it in error, please
 notify me immediately and delete it from your system.
 
 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-
 guide.html
 and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] generate random numbers

2011-03-31 Thread Ted Harding

On 31-Mar-11 19:23:33, Anna Lee wrote:
 Hey List,
 does anyone know how I can generate a vector of random numbers
 from a given distribution? Something like rnorm just for non
 normal distributions???
 
 Thanks a lot!
 Anna

SUppose we give your distribution the name Dist.

The generic approach would start by defining a function for
the inverse of its cumulative distribution. Call this qDist.
Then

  qDist(runif(1000))

would generate 1000 values from the distribution Dist.

As a ready-made example, qnorm is the inverse of pnorm,
the cumulative distribution function of the Normal distribution.
Then

  qnorm(runif(1000))

would act just like rnorm(1000), though the sequence of values
would be different (a different algorithm) -- and also rnorm()
would be more efficient (being specially written).

Depending on what your desired distribution is, you may find
that an rDist has already been written for it. There are
many distributions already in R for which the family of
functions dDist, pDist, qDist and rDist are provided.

For more specific advice, please give us information about
the specific distribution you want to sample from!

Ted.


E-Mail: (Ted Harding) ted.hard...@wlandres.net
Fax-to-email: +44 (0)870 094 0861
Date: 31-Mar-11   Time: 20:50:52
-- XFMail --

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] fit.mult.impute() in Hmisc

2011-03-31 Thread Yuelin Li

I tried multiple imputation with aregImpute() and
fit.mult.impute() in Hmisc 3.8-3 (June 2010) and R-2.12.1.

The warning message below suggests that summary(f) of
fit.mult.impute() would only use the last imputed data set.
Thus, the whole imputation process is ignored.

  Not using a Design fitting function; summary(fit) 
   will use standard errors, t, P from last imputation only.  
   Use vcov(fit) to get the correct covariance matrix, 
   sqrt(diag(vcov(fit))) to get s.e.

But the standard errors in summary(f) agree with the values
from sqrt(diag(vcov(f))) to the 4th decimal point.  It would
seem that summary(f) actually adjusts for multiple
imputation?

Does summary(f) in Hmisc 3.8-3 actually adjust for MI?

If it does not adjust for MI, then how do I get the
MI-adjusted coefficients and standard errors?

I can't seem to find answers in the documentations, including
rereading section 8.10 of the Harrell (2001) book  Googling
located a thread in R-help back in 2003, which seemed dated.
Many thanks in advance for the help,

Yuelin.
http://idecide.mskcc.org
---
 library(Hmisc)
Loading required package: survival
Loading required package: splines
 data(kyphosis, package = rpart)
 kp - lapply(kyphosis, function(x) 
+   { is.na(x) - sample(1:length(x), size = 10); x })
 kp - data.frame(kp)
 kp$kyp - kp$Kyphosis == present
 set.seed(7)
 imp - aregImpute( ~ kyp + Age + Start + Number, dat = kp, n.impute = 10, 
+  type = pmm, match = closest)
Iteration 13 
 f - fit.mult.impute(kyp ~ Age + Start + Number, fitter=glm, xtrans=imp, 
+ family = binomial, data = kp)

Variance Inflation Factors Due to Imputation:

(Intercept) Age   Start  Number 
   1.061.281.171.12 

Rate of Missing Information:

(Intercept) Age   Start  Number 
   0.060.220.140.10 

d.f. for t-distribution for Tests of Single Coefficients:

(Intercept) Age   Start  Number 
2533.47  193.45  435.79  830.08 

The following fit components were averaged over the 10 model fits:

  fitted.values linear.predictors 

Warning message:
In fit.mult.impute(kyp ~ Age + Start + Number, fitter = glm, xtrans = imp,  :
  Not using a Design fitting function; summary(fit) will use
standard errors, t, P from last imputation only.  Use vcov(fit) to get the
correct covariance matrix, sqrt(diag(vcov(fit))) to get s.e.


 f

Call:  fitter(formula = formula, family = binomial, data = completed.data)

Coefficients:
(Intercept)  AgeStart   Number  
-3.6971   0.0118  -0.1979   0.6937  

Degrees of Freedom: 80 Total (i.e. Null);  77 Residual
Null Deviance:  80.5 
Residual Deviance: 58   AIC: 66 
 sqrt(diag(vcov(f)))
(Intercept) Age   Start  Number 
  1.5444782   0.0063984   0.0652068   0.2454408 
 -0.1979/0.0652068
[1] -3.0350
 summary(f)

Call:
fitter(formula = formula, family = binomial, data = completed.data)

Deviance Residuals: 
   Min  1Q  Median  3Q Max  
-1.240  -0.618  -0.288  -0.109   2.409  

Coefficients:
Estimate Std. Error z value Pr(|z|)
(Intercept)  -3.6971 1.5445   -2.39   0.0167
Age   0.0118 0.00641.85   0.0649
Start-0.1979 0.0652   -3.03   0.0024
Number0.6937 0.24542.83   0.0047

(Dispersion parameter for binomial family taken to be 1)

Null deviance: 80.508  on 80  degrees of freedom
Residual deviance: 57.965  on 77  degrees of freedom
AIC: 65.97

Number of Fisher Scoring iterations: 5

 
 =
 
 Please note that this e-mail and any files transmitted with it may be 
 privileged, confidential, and protected from disclosure under 
 applicable law. If the reader of this message is not the intended 
 recipient, or an employee or agent responsible for delivering this 
 message to the intended recipient, you are hereby notified that any 
 reading, dissemination, distribution, copying, or other use of this 
 communication or any of its attachments is strictly prohibited.  If 
 you have received this communication in error, please notify the 
 sender immediately by replying to this message and deleting this 
 message, any attachments, and all copies and backups from your 
 computer.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Simple lattice question

2011-03-31 Thread David Winsemius



On Mar 31, 2011, at 8:41 AM, Peter Ehlers wrote:


On 2011-03-31 03:39, Rubén Roa wrote:


DeaR ComRades,

require(lattice)
data- data.frame(SP=sort(rep(as.factor(c('A','B','C','D','E')),12)),
   x=rpois(60,10),
   y=rep(c(rep(0,4),rep(10,4),rep(20,4)),5),
   z=rep(1:4,15))
xyplot(x~y| 
SP 
,data 
=data,groups=z,layout=c(2,3),pch=1:4,lty=1:4,col='black',type='b')


How do I put a legend for the grouping variable in the empty upper- 
right panel?


See the help page for xyplot for an example using the iris data.
You just need to add something like

auto.key = list(x = .6, y = .7, corner = c(0, 0))

to your lattice call. See the 'key' entry on the ?xyplot page.


On my machine the default color scheme is TRUE, so the auto.key comes  
out with colored circles. This incantation gets the key to match the  
plotting symbols and lines:


xyplot(x~y|SP, data=data,groups=z, layout=c(2,3),  
par.settings=simpleTheme(pch=1:4,lty=1:4,col='black'), type=b,

auto.key = list(x = .6, y = .7, lines=TRUE, corner = c(0, 0)) )



Peter Ehlers



TIA

Rubén



Dr. Rubén Roa-Ureta
AZTI - Tecnalia / Marine Research Unit
Txatxarramendi Ugartea z/g
48395 Sukarrieta (Bizkaia)
SPAIN

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


David Winsemius, MD
Heritage Laboratories
West Hartford, CT

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] Effects - plot the marginal effect

2011-03-31 Thread Tomii

Hello,

I try to plot the marginal effect by using package effects (example of the
graph i want to get is in the attached picture).
All variables are continuous.

Here is regression function, results and error effect function gives:

 mreg01 = lm(a90$enep1 ~ a90$enpres + a90$proximity1 + (a90$enpres * 
 a90$proximity1), data=a90) summary(mreg01)
Call:
lm(formula = a90$enep1 ~ a90$enpres + a90$proximity1 + (a90$enpres *
a90$proximity1), data = a90)

Residuals:
Min  1Q  Median  3Q Max
-2.3173 -1.3349 -0.5713  0.8938  8.1084

Coefficients:
  Estimate Std. Error t value Pr(|t|)
(Intercept) 4.2273 0.3090  13.683   2e-16 ***
a90$enpres  0.4225 0.2319   1.822 0.072250 .
a90$proximity1 -3.8797 1.0984  -3.532 0.000696 ***
a90$enpres:a90$proximity1   0.8953 0.4101   2.183 0.032025 *
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 2.029 on 78 degrees of freedom
Multiple R-squared: 0.2128, Adjusted R-squared: 0.1826
F-statistic: 7.031 on 3 and 78 DF,  p-value: 0.0003029
 plot(effect(a90$enpres:a90$proximity1, mreg01))Warning messages:1: In 
 a90$enpres:a90$proximity1 :
  numerical expression has 82 elements: only the first used2: In
a90$enpres:a90$proximity1 :
  numerical expression has 82 elements: only the first used3: In
analyze.model(term, mod, xlevels, default.levels) :
  0 does not appear in the modelError in
plot(effect(a90$enpres:a90$proximity1, mreg01)) :
  error in evaluating the argument 'x' in selecting a method for function 'plot'



Thanks in advance.
Tomas
attachment: graph.png__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] nlme for exact binomial model

2011-03-31 Thread Nonyane, Bareng A.S.

Hi
I am conducting a meta-analysis on proportions and I'm interested in using the 
exact binomial likeliood for the within-study effects rather than the 
approximate approaches  and I don't want to use the quasi-likelihood approach 
in R's glmmPQL.
Does anyone know how I can do this using nlme? I can do it in SAS nlmixed but  
I would rather use R -this is my SAS code:

proc nlmixed data=work.temp   ;
parms  mup=2 vp= 08;  /*initial values*/
rawprop=1/(1+exp(-truep)); /*unkown true p logit proportion*/
model  numberofisolates~binomial(sumisolatestudy, rawprop) ;
random truep~normal(mup,vp) subject=studyid;
run;

Thanks

B. Aletta Nonyane, PhD
Assistant Scientist
Department of International Health
Johns Hopkins Bloomberg School of Public Health
615 N. Wolfe Street
Baltimore, MD, 21205


[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] VECM with UNRESTRICTED TREND

2011-03-31 Thread renoir vieira

Dear Pfaff,

Would that be possible to fit a Time varying VECM using urca?

Yours,
Renoir

On Thursday, March 31, 2011, Grzegorz Konat grzegorz.ko...@ibrkk.pl wrote:
 The code you gave me works fine with Finland, but the same for my data -
 does not!
 I do:

 library(urca)
 data(my.data)
 dat1 - my.data[, c(dY, X, dM)]
 trend - matrix(1:nrow(dat1), ncol = 1)
 colnames(trend) - trd
 yxm.vecm - ca.jo(dat1, type = trace, ecdet = const, K = 2, spec =
 longrun, dumvar = trend)

 and the result is again:

 Error in r[i1, , drop = FALSE] - r[-nrow(r):-(nrow(r) - lag + 1L), , drop =
 FALSE] :
   non-numeric argument to binary operator

 I attach my dataset in xls format. If you have 5 minutes and wish to check
 it out, I'd be extremely grateful!

 Best,
 Greg



 2011/3/31 Pfaff, Bernhard Dr. bernhard_pf...@fra.invesco.com

  Well, without further information, I do not know, but try the following

 library(urca)
 example(ca.jo)
 trend - matrix(1:nrow(sjf), ncol = 1)
 colnames(trend) - trd
 ca.jo(sjf, type = trace, ecdet = const, K = 2, spec = longrun,
 dumvar = trend)

 Best,
 Bernhard



  --
 *Von:* Grzegorz Konat [mailto:grzegorz.ko...@ibrkk.pl]
 *Gesendet:* Donnerstag, 31. März 2011 14:40

 *An:* Pfaff, Bernhard Dr.; r-help@r-project.org
 *Betreff:* Re: [R] VECM with UNRESTRICTED TREND

 'time' was a trend variable from my.data set. Equivalent to the output of
 the command 'matrix' you just gave me.

 So now I did:

  library(urca)
 data(my.data)
 names(my.data)
 attach(my.data)
 dat1 - my.data[, c(dY, X, dM)]
 mat1 - matrix(seq(1:nrow(dat1)), ncol = 1)
 args('ca.jo')
 yxm.vecm - ca.jo(dat1, type = trace, ecdet = const, K = 2, spec =
 longrun, dumvar=mat1)

 and the output is:

  Error in r[i1, , drop = FALSE] - r[-nrow(r):-(nrow(r) - lag + 1L), , drop
 = FALSE] :
   non-numeric argument to binary operator
 In addition: Warning message:
 In ca.jo(dat1, type = trace, ecdet = const, K = 2, spec = longrun,
  :
 No column names in 'dumvar', using prefix 'exo' instead.

 What do I do wrong?

 Best,
 Greg


 2011/3/31 Pfaff, Bernhard Dr. bernhard_pf...@fra.invesco.com




  Hello Bernhard,

 thank You so much one again! Now I (more or less) understand the idea, but
 still have problem with its practical application.

 I do (somewhat following example 8.1 in your textbook):

  library(urca)
 data(my.data)
 names(my.data)
 attach(my.data)
 dat1 - my.data[, c(dY, X, dM)]
 dat2 - cbind(time)

 What is 'time'? Just employ matrix(seq(1:nrow(dat1)), ncol = 1) for
 creating the trend variable.

 Best,
 Bernhard


  args('ca.jo')
 yxm.vecm - ca.jo(dat1, type = trace, ecdet = trend, K = 2, spec =
 longrun, dumvar=dat2)

 The above code produces following output:

  Error in r[i1, , drop = FALSE] - r[-nrow(r):-(nrow(r) - lag + 1L), ,
 drop = FALSE] :
   non-numeric argument to binary operator

 What does that mean? Should I use cbind command to dat1 as well? And
 doesn't it transform the series into series of integer numbers?

 Thank you once again (especially for your patience).

 Best,
 Greg



 2011/3/31 Pfaff, Bernhard Dr. bernhard_pf...@fra.invesco.com

  Hello Greg,

 you include your trend as a (Nx1) matrix and use this for 'dumvar'. The
 matrix 'dumvar' is just added to the VECM as deterministic regressors and
 while you are referring to case 5, this is basically what you are after, if
 I am not mistaken. But we aware that this implies a quadratic trend for the
 levels

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] Create Variable names dynamically

2011-03-31 Thread Noah Silverman

Hi,

I want to create variable names from within my code, but can't find any 
documentation for this.

An example is probably the best way to illustrate. I am reading data in from a 
file, doing a bunch of stuff, and want to generate variables with my output.  
(I could make a list of lists and name all the elements, but I really want 
separate variables.)


#
#This is just a dummy example, please excuse any shortcuts...

data - read.table(file, )
animals - (data[,animal])  
animals
 cat, dog, horse  # Not known what these are before I read the data file

# do a bunch of stuff

mean_cat - abc
var_cat - dfd
mean_dog - 123
var_dog - 453
etc..
##

I thought of trying to use the paste() function to create the variable name, 
but that doesn't work:
for( animal in animals){
paste(mean, animal _) - 123
}

Any ideas???

Thanks


--
Noah Silverman

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Create Variable names dynamically

Take a look in ?assign

On Thu, Mar 31, 2011 at 5:42 PM, Noah Silverman n...@smartmediacorp.com wrote:
 Hi,

 I want to create variable names from within my code, but can't find any 
 documentation for this.

 An example is probably the best way to illustrate. I am reading data in from 
 a file, doing a bunch of stuff, and want to generate variables with my 
 output.  (I could make a list of lists and name all the elements, but I 
 really want separate variables.)


 #
 #This is just a dummy example, please excuse any shortcuts...

 data - read.table(file, )
 animals - (data[,animal])
 animals
 cat, dog, horse  # Not known what these are before I read the data file

 # do a bunch of stuff

 mean_cat - abc
 var_cat - dfd
 mean_dog - 123
 var_dog - 453
 etc..
 ##

 I thought of trying to use the paste() function to create the variable name, 
 but that doesn't work:
 for( animal in animals){
        paste(mean, animal _) - 123
 }

 Any ideas???

 Thanks


 --
 Noah Silverman

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.




-- 
Henrique Dallazuanna
Curitiba-Paraná-Brasil
25° 25' 40 S 49° 16' 22 O

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] read password-protected files