date:20100713

On Tue, Jul 13, 2010 at 3:20 AM, Lorenzo Cattarino
l.cattar...@uq.edu.au wrote:
 Hello R-users,



 I have a very large vector (a) containing elements consisting of numbers
 and letters, e.g.:

 a

 [1] 1.11.2a     1.11.2d     1.11.2e     1.11.2f     1.11.2x1
 1.11.2x1b

[...]

 How can I remove from each record everything that is after the number
 after the second dot? E.g.:

 1.11.2a becomes 1.11.2, 1.12.1x4 becomes 1.12.1, 1.9.1a becomes
 1.9.1...and so forth.

If they are all of the form shown, then the question is equivalent to
removing the first alphabetic character, [[:alpha:]], and everything
thereafter (.*), which is just this:

   sub("[[:alpha:]].*", "", a)
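
As a quick check of that pattern, here is a minimal sketch. The short vector `a` below is only a stand-in for the poster's much larger vector, built from values quoted in the question, and the stricter second pattern is an assumption about what "number after the second dot" should mean if letters could ever appear earlier in an element.

```r
# Small stand-in for the poster's vector; values taken from the question.
a <- c("1.11.2a", "1.11.2d", "1.11.2x1", "1.11.2x1b", "1.12.1x4", "1.9.1a")

# Drop the first alphabetic character and everything after it.
trimmed <- sub("[[:alpha:]].*", "", a)
print(trimmed)

# Stricter alternative: keep only the three leading dot-separated numbers.
strict <- sub("^([0-9]+\\.[0-9]+\\.[0-9]+).*", "\\1", a)
print(strict)
```

On data of the form shown, both calls give the same result; they would differ only for elements that do not follow that form.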

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Matrix Column Names

2010-07-13 Thread Martin Maechler

 DN == David Neu da...@davidneu.com
 on Mon, 12 Jul 2010 18:15:04 -0400 writes:

DN Hi, Is there a way to create a matrix in which the
DN column names are not checked to see if they are valid
DN variable names?

Why do you need that if you are really using a matrix, not a
data frame?

DN I'm looking something similar to the check.names
DN argument to data.frame.  

That's a good idea.
The relevant code inside data.frame() is simply

    if (check.names)
        vnames <- make.names(vnames, unique = TRUE)

DN If so, would such an approach work for the sparse matrix
DN classes in the Matrix package.

Using function  make.names(), yes of course.
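
A small base-R sketch of the difference may help here. The column names are made up for illustration, and the example uses an ordinary matrix rather than a sparse Matrix class.

```r
# Column names that are not syntactically valid R names (made up for illustration).
nms <- c("a b", "1x", "a b")

# A plain matrix keeps its column names exactly as given; nothing is checked.
m <- matrix(1:6, nrow = 2, dimnames = list(NULL, nms))
print(colnames(m))

# make.names() is what data.frame() applies when check.names = TRUE.
valid <- make.names(nms, unique = TRUE)
print(valid)

# data.frame() leaves the names alone when check.names = FALSE.
df <- data.frame(m, check.names = FALSE)
print(names(df))
```

So the checking only has to be requested explicitly, via make.names(), if valid and unique names are actually wanted on a matrix.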

(But I'm still puzzled *why* you need this.)
If you only want somewhat *short* names,
I'd rather use

    vnames <- abbreviate(vnames, 8)

or variations of that such as

    vnames <- abbreviate(vnames, 8, method = "both.sides")

or also

    vnames <- abbreviate(vnames, 8, strict = FALSE)

DN Many thanks!

you're welcome!

Martin Maechler, ETH Zurich


Re: [R] How to mean, min lists and numbers

2010-07-13 Thread Jim Lemon


On 07/13/2010 01:10 AM, g...@ucalgary.ca wrote:

I would like to sum/mean/min a list of lists and numbers to return the
related lists.

-1+2*c(1,1,0)+2+c(-1,10,-1) returns c(2,13,0) but
sum(1,2*c(1,1,0),2,c(-1,10,-1)) returns 15 not a list.
Using the suggestions of Gabor Grothendieck,
Reduce('+',list(-1,2*c(1,1,0),2,c(-1,10,-1))) returns what we want,
c(2,13,0).

However, it seems that this way does not work to mean/min.
So, how to mean/min a list of lists and numbers to return a list? Thanks,


Hi James,
If you really have a list, and not a vector as in your example, look at 
the rapply function in the base package.
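
For the example in the question, where every element is either a single number or a numeric vector of one common length, an element-wise approach can be sketched as follows. This is not the rapply route mentioned above; it relies on R's recycling of the single numbers, and the name `xs` is introduced here only for illustration.

```r
# The elements from the question: single numbers and length-3 vectors.
xs <- list(-1, 2 * c(1, 1, 0), 2, c(-1, 10, -1))

# Element-wise sum, as in the Reduce() suggestion quoted in the question.
elementwise_sum <- Reduce(`+`, xs)

# Element-wise mean: the element-wise sum divided by the number of elements.
elementwise_mean <- Reduce(`+`, xs) / length(xs)

# Element-wise minimum and maximum via the parallel versions of min/max.
elementwise_min <- do.call(pmin, xs)
elementwise_max <- do.call(pmax, xs)

print(elementwise_sum)
print(elementwise_mean)
print(elementwise_min)
print(elementwise_max)
```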


Jim


[R] Regarding R -installation

2010-07-13 Thread venkatesh bandaru

Dear R-help  Team Members,

I am Venkatesh, a student at the University of Hyderabad. While installing R from
the specified servers, I encountered the following problem. Please help me
with this; I need R for my project.
 Thanking you.


*Problem* :

Cannot access installation media
http://download.opensuse.org/repositories/devel:languages:R:patched/openSUSE_11.2
 (Medium 1).
Check whether the server is accessible

Download (curl) error for '
http://download.opensuse.org/repositories/devel:languages:R:patched/openSUSE_11.2/repodata/repomd.xml
':
Error code: Connection failed
Error message: couldn't connect to host


yours truly,
B.venkatesh,
University of Hyderabad
India.


[R] Regarding accesing R- Repositories at servers

2010-07-13 Thread venkatesh bandaru

Dear R-help team ,
I am Venkatesh, a student at the University of Hyderabad, India. I am unable to
access the R repositories at your specified servers. It gives an error saying
that the media could not be accessed. Can you please help me with this?

I look forward to your reply. Thank you.

Wishes and regards,
B.venkatesh,
University of Hyderabad,
India
9440186746




Re: [R] How can i draw a graph with high and low data points

2010-07-13 Thread Nathaniel Saxe


I have 5 columns- Trial.Group, Mean, Standard Deviation, Upper percentile,
Lower percentile.

Trial.Group 41 subjects: 3 to 4 yrs-Male
Mean 444
SD 25
upper 494
lower 393

All the data is like that, and I wish to recreate this Excel table:
http://r.789695.n4.nabble.com/file/n2287158/untitled.GIF



The problem with my code is that it doesn't put Trial.Group on the x axis.


Thanks for the help



Re: [R] Question about food sampling analysis

2010-07-13 Thread Richard Weeks

Hi Sarah,

We regularly undertake work in the food sector and have developed many
custom-built solutions. To be more specific, the statistics we employ are
those of sensory analysis, and we regularly use the SensoMineR package in
R.

Regards,

Richard Weeks

Mangosolutions
data analysis that delivers


Mail: rwe...@mango-solutions.com
T: +44 (0)1249 767700
F: +44 (0)1249 767707
M: +44 (0)7500 040365

Unit 2 Greenways Business Park
Bellinger Close
Chippenham
Wilts
SN15 1BN
UK
 

-----Original Message-----
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org]
On Behalf Of Sarah Henderson
Sent: 12 July 2010 22:42
To: R List
Subject: [R] Question about food sampling analysis

Greetings to all, and my apologies for a question that is mostly about
statistics and secondarily about R.  I have just started a new job that
(this week, apparently) requires statistical knowledge beyond my training
(as an epidemiologist).

The problem:

- We have 57 food production facilities in three categories
- Samples of 4-6 different foods were tested for listeria at each facility
- I need to describe the presence of listeria in food (1) overall and
  (2) by facility category.

I know that samples within each facility cannot be treated as independent,
so I need an approach that accounts for (1) clustering within facilities
and (2) the different number of samples taken at each facility.  If someone
could kindly point me towards the right type of analysis for this and/or its
associated R functions/packages, I would greatly appreciate it.

Many thanks,

Sarah


[R] Barplots

2010-07-13 Thread Ravi S. Shankar

Hi R,

 

I am examining the mean returns 10 days before and 10 days after an
event. I have several events and the corresponding pre- and post-event
10-day mean returns, something like this:

 

   Pre_Start   Pre_End     Pre_Mean      Pre_SD       Post_Start  Post_End    Post_Mean      Post_SD
1  2002-02-22  2002-03-08   0.004968027  0.017443954  2002-03-12  2002-03-25   0.0004099697  0.012529438
2  2002-04-25  2002-05-08  -0.006371706  0.011008257  2002-05-10  2002-05-23  -0.0022429404  0.007736497
3  2002-07-24  2002-08-06   0.005083225  0.015508255  2002-08-08  2002-08-21   0.0048237816  0.008116529
4  2002-07-24  2002-08-06   0.005083225  0.015508255  2002-08-08  2002-08-21   0.0048237816  0.008116529
5  2003-01-08  2003-01-21   0.004439480  0.012310963  2003-01-23  2003-02-05  -0.0064620002  0.012731789

 

I obtained a barplot using the below

layout(matrix(c(1,1,2,2),ncol=2,byrow=T))

barplot(rnorm(10),main="Pre_Event_Returns",col="red")

barplot(rnorm(10),main="Post_Event_Returns",col="blue")

 

However I would like to know if it is possible to do the following-
merge the two barplots i.e. a single barplot which will include both the
pre and post event returns 

Any suggestions would be appreciated

Ravi


[R] Odp: in continuation with the earlier R puzzle

2010-07-13 Thread Petr PIKAL

Hi

r-help-boun...@r-project.org wrote on 12.07.2010 16:09:30:

 When I just run a for loop it works. But if I am going to run a for loop
 every time for large vectors I might as well use C or any other language.
 The reason R is powerful is because it can handle large vectors without each
 element being manipulated? Please let me know where I am wrong.
 
 for(i in 1:length(news1o)){
 + if(news1o[i]>s2o[i])
 + s[i]<-1
 + else
 + s[i]<--1
 + }

Think in R, not in C. Why use loops when you can use the whole object
directly? It is like drinking beer from snifters: it is possible, but using
pints is preferable and more convenient.

news1o>s2o

gives you a logical vector of the same length,

and you can use it directly for further selection or computation. You can
treat FALSE as 0 and TRUE as 1 and use it as a numeric vector,
so

x<-runif(10)
y<-runif(10)

c(-1,1)[(x>y)+1]

selects -1 when FALSE and 1 when TRUE.

Or you can use it in a mathematical operation directly:

(x>y)*2-1
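
To make the equivalence concrete, here is a small sketch that runs the loop and the vectorised forms side by side. It assumes the comparison in the question is "greater than"; the seed, the vector names `s_loop`, `s_index`, `s_arith` and `s_ifelse`, and the extra ifelse() variant are additions for illustration.

```r
set.seed(1)  # arbitrary seed so the comparison is repeatable
x <- runif(10)
y <- runif(10)

# Element-by-element loop, as in the question.
s_loop <- numeric(length(x))
for (i in seq_along(x)) {
  if (x[i] > y[i]) s_loop[i] <- 1 else s_loop[i] <- -1
}

# Vectorised: index into c(-1, 1) with the logical vector (FALSE -> 1, TRUE -> 2).
s_index <- c(-1, 1)[(x > y) + 1]

# Vectorised: treat the logical vector as 0/1 and rescale to -1/1.
s_arith <- (x > y) * 2 - 1

# Vectorised: ifelse() states the intent most directly.
s_ifelse <- ifelse(x > y, 1, -1)

print(all(s_loop == s_index, s_loop == s_arith, s_loop == s_ifelse))
```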

Regards
Petr

 

[R] Odp: Barplots

2010-07-13 Thread Petr PIKAL

Hi

r-help-boun...@r-project.org wrote on 13.07.2010 12:18:21:

 Hi R,
 
 
 
 I am examining the mean returns 10 days before and 10 days after a
 event. Now I have several events the corresponding pre and post event 10
 day mean returns... something like this
 
 
 
    Pre_Start   Pre_End     Pre_Mean      Pre_SD       Post_Start  Post_End    Post_Mean      Post_SD
 1  2002-02-22  2002-03-08   0.004968027  0.017443954  2002-03-12  2002-03-25   0.0004099697  0.012529438
 2  2002-04-25  2002-05-08  -0.006371706  0.011008257  2002-05-10  2002-05-23  -0.0022429404  0.007736497
 3  2002-07-24  2002-08-06   0.005083225  0.015508255  2002-08-08  2002-08-21   0.0048237816  0.008116529
 4  2002-07-24  2002-08-06   0.005083225  0.015508255  2002-08-08  2002-08-21   0.0048237816  0.008116529
 5  2003-01-08  2003-01-21   0.004439480  0.012310963  2003-01-23  2003-02-05  -0.0064620002  0.012731789
 
 
 
 I obtained a barplot using the below
 
 layout(matrix(c(1,1,2,2),ncol=2,byrow=T))
 
 barplot(rnorm(10),main="Pre_Event_Returns",col="red")
 
 barplot(rnorm(10),main="Post_Event_Returns",col="blue")

Maybe

barplot(rbind(rnorm(10), rnorm(10)), main="Event_Returns",
        col=c("red","blue"), beside=TRUE)

+ appropriate legend
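
With the Pre_Mean and Post_Mean columns from the question in place of rnorm(), the suggestion might look like the sketch below. The object names `pre`, `post`, `returns` and `mids` and the "Event n" labels are illustrative assumptions; only the numbers come from the posted table.

```r
# Mean returns copied from the Pre_Mean and Post_Mean columns of the question.
pre  <- c(0.004968027, -0.006371706, 0.005083225, 0.005083225, 0.004439480)
post <- c(0.0004099697, -0.0022429404, 0.0048237816, 0.0048237816, -0.0064620002)

# One row per series and one column per event; beside = TRUE then draws
# the pre and post bars of each event next to each other.
returns <- rbind(Pre = pre, Post = post)
colnames(returns) <- paste("Event", seq_along(pre))

mids <- barplot(returns, beside = TRUE, col = c("red", "blue"),
                main = "Event_Returns", ylab = "10-day mean return",
                legend.text = rownames(returns),
                args.legend = list(x = "topright", bty = "n"))
```

barplot() returns the bar midpoints, here as a 2 x 5 matrix, which is handy if error bars for Pre_SD and Post_SD are to be added later.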

Regards
Petr


 
 
 

Re: [R] Fast string comparison

I see. I did not get this performance because I did not compare the
vectors directly but ran seemingly expensive for-loops to do it
element by element... :(

R.









On Tue, Jul 13, 2010 at 1:42 AM, Hadley Wickham had...@rice.edu wrote:
 strings <- replicate(1e5, paste(sample(letters, 100, rep = T), collapse = ""))
 system.time(strings[-1] == strings[-1e5])
 #   user  system elapsed
 #  0.016   0.000   0.017

 So it takes ~1/100 of a second to do ~100,000 string comparisons. You
 need to provide a reproducible example that illustrates why you think
 string comparisons are slow.
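
The difference between the vectorised comparison and an explicit loop can be sketched as follows. The vector is smaller than in the timing above so that the loop finishes quickly, and the seed and the names `vec_equal` and `loop_equal` are illustrative; timings will differ from machine to machine, so none are shown.

```r
set.seed(42)  # arbitrary seed for repeatability
n <- 1e4
strings <- replicate(n, paste(sample(letters, 100, replace = TRUE), collapse = ""))

# Vectorised: compare each string with its predecessor in a single call.
vec_equal <- strings[-1] == strings[-n]

# Loop: the same comparisons done one element at a time.
loop_equal <- logical(n - 1)
for (i in seq_len(n - 1)) {
  loop_equal[i] <- strings[i + 1] == strings[i]
}

# Both approaches give the same answer; wrap either one in system.time()
# to compare their cost.
print(identical(vec_equal, loop_equal))
```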

 Hadley


 On Tue, Jul 13, 2010 at 6:52 AM, Ralf B ralf.bie...@gmail.com wrote:
 I am asking this question because String comparison in R seems to be
 awfully slow (based on profiling results) and I wonder if perhaps '=='
 alone is not the best one can do. I did not ask for anything
 particular and I don't think I need to provide a self-contained source
 example for the question. So, to re-phrase my question, are there more
 (runtime) effective ways to find out if two strings (about 100-150
 characters long) are equal?

 Ralf






 On Sun, Jul 11, 2010 at 2:37 PM, Sharpie ch...@sharpsteen.net wrote:


 Ralf B wrote:

 What is the fastest way to compare two strings in R?

 Ralf


 Which way is not fast enough?

 In other words, are you asking this question because profiling showed one of
 R's string comparison operations is causing a massive bottleneck in your
 code? If so, which one and how are you using it?

 -Charlie

 -
 Charlie Sharpsteen
 Undergraduate-- Environmental Resources Engineering
 Humboldt State University




 --
 Assistant Professor / Dobelman Family Junior Chair
 Department of Statistics / Rice University
 http://had.co.nz/



[R] Modify the plotting parameters for Vennerable obj.

2010-07-13 Thread Fabian Grammes


Dear List,

I would like to modify the settings for plotting a Vennerable object,
but I don't know how... so if anyone has an idea I would be really
grateful.


best, Fabian

some R code to illustrate my problem:

library(Vennerable)

ven <- compute.Venn(Venn(SetNames=c("A", "B"), Weight=c(0,111,106, 26)))

# Now my problem is that whenever I plot the object, the plot appears
# in a box, and for cosmetic reasons I would like to get rid of that.

plot(ven)

sessionInfo()
R version 2.10.0 (2009-10-26)
i386-apple-darwin8.11.1

locale:
[1] C

attached base packages:
[1] grid  stats graphics  grDevices utils datasets  methods
[8] base

other attached packages:
 [1] Vennerable_2.0     RColorBrewer_1.0-2 lattice_0.18-3     RBGL_1.22.0
 [5] graph_1.24.0       ggplot2_0.8.5      digest_0.4.2       reshape_0.8.3
 [9] plyr_0.1.9         proto_0.3-8        limma_3.2.1

loaded via a namespace (and not attached):
[1] tools_2.10.0




Re: [R] in continuation with the earlier R puzzle

2010-07-13 Thread Petr PIKAL

Hi

I do not use any of the mentioned libraries, so I cannot answer directly. I
would try debug(expr.frame) to see at what point the error is
thrown.

I have no idea why you got the error. Try to evaluate the code in pieces,
e.g. what is the result of

list(MAf=quote(SMA(Last, 20)), MAs=quote(SMA(Last, 50)))

and look for differences between the results from the spx data and the nifty data.

Regards
Petr



Raghu r.raghura...@gmail.com wrote on 13.07.2010 13:17:42:

 Many many thanks to all of you. The beer cleared the air of doubts!
 Pls look at the following lines of code. This is taken from the example of
 tradesys documentation. When I run the given example using the data.frame spx
 it works just very fine but while I use some other data.frame (here nifty) it
 crashes. Now I can intuit that the total rows in the column named Last are
 3637 and if i do a 20d MA and a 50d MA the respective rows for each of them
 are 3618 and 3588. Why does expr.frame crash for one data.frame and not for
 the other? I have given str() for both below for your kind perusal.
 
 library(tradesys)
 library(TTR)
 x=nifty[,c("Open","Last")]
 d <- expr.frame(x, list(MAf=quote(SMA(Last, 20)), MAs=quote(SMA(Last, 50))))
 Error in data.frame(c(1000, 1001.53, 987.17, 976.28, 960.32, 951.93, 949.29,  :
   arguments imply differing number of rows: 3637, 3618, 3588
 
 
 str(nifty)
 'data.frame':   3637 obs. of  6 variables:
  $ Date..GMT.: Factor w/ 3637 levels "01/01/1996","01/01/1997",..: 321 687 807 929 1052 1172 1537 1650 1764 1886 ...
  $ Open  : num  1000 1002 987 976 960 ...
  $ High  : num  1000 1002 987 976 960 ...
  $ Low   : num  1000 989 977 963 952 ...
  $ Last  : num  1000 989 978 964 953 ...
  $ Date  : num  321 687 807 929 1052 ...
  str(spx)
 'data.frame':   14940 obs. of  5 variables:
  $ Open  : num  16.7 16.9 16.9 17 17.1 ...
  $ High  : num  16.7 16.9 16.9 17 17.1 ...
  $ Low   : num  16.7 16.9 16.9 17 17.1 ...
  $ Close : num  16.7 16.9 16.9 17 17.1 ...
  $ Volume: num  126 189 255 201 252 216 263 
 297 333 146 ...
 
 
 Thanks 
 Raghu
 


[R] Zoo - bug ???

2010-07-13 Thread sayan dasgupta

Hi folks,

I am confused whether the following is a bug or it is fine

Here is the explanation

a <- zoo(c(NA,1:9),1:10)

Now If I do

rollapply(a,FUN=mean,width=3,align="right")

I get
> rollapply(a,FUN=mean,width=3,align="right")
 3  4  5  6  7  8  9 10
NA NA NA NA NA NA NA NA

But I shouldn't be getting NA, right? I.e. for index 10 I should get
(1/3)*(9+8+7)

Similarly

> rollapply(a,FUN=mean,width=3)
 2  3  4  5  6  7  8  9
NA NA NA NA NA NA NA NA


Zoo version:

> installed.packages()["zoo","Version"]
[1] "1.6-3"
[1] 1.6-3



My machine details

 sessionInfo()
R version 2.10.1 (2009-12-14)
i386-pc-intel32

locale:
[1] LC_COLLATE=English_India.1252  LC_CTYPE=English_India.1252
LC_MONETARY=English_India.1252 LC_NUMERIC=C
[5] LC_TIME=English_India.1252

attached base packages:
[1] stats graphics  grDevices datasets  utils methods   base

other attached packages:
[1] zoo_1.6-3  rcom_2.2-1 rscproxy_1.3-1 Revobase_3.2.0

loaded via a namespace (and not attached):
[1] grid_2.10.1lattice_0.18-3 tools_2.10.1



Re: [R] in continuation with the earlier R puzzle

2010-07-13 Thread Raghu

Many many thanks to all of you. The beer cleared the air of doubts!
Pls look at the following lines of code. This is taken from the example of
tradesys documentation. When I run the given example using the data.frame
spx it works just very fine but while I use some other data.frame (here
nifty) it crashes. Now I can intuit that the total rows in the column named
Last are 3637 and if i do a 20d MA and a 50d MA the respective rows for
each of them are 3618 and 3588. Why does expr.frame crash for one data.frame
and not for the other? I have given str() for both below for your kind
perusal.

library(tradesys)
library(TTR)
x=nifty[,c("Open","Last")]
d <- expr.frame(x, list(MAf=quote(SMA(Last, 20)), MAs=quote(SMA(Last, 50))))
Error in data.frame(c(1000, 1001.53, 987.17, 976.28, 960.32, 951.93, 949.29,  :
  arguments imply differing number of rows: 3637, 3618, 3588


str(nifty)
'data.frame':   3637 obs. of  6 variables:
 $ Date..GMT.: Factor w/ 3637 levels "01/01/1996","01/01/1997",..: 321 687 807 929 1052 1172 1537 1650 1764 1886 ...
 $ Open  : num  1000 1002 987 976 960 ...
 $ High  : num  1000 1002 987 976 960 ...
 $ Low   : num  1000 989 977 963 952 ...
 $ Last  : num  1000 989 978 964 953 ...
 $ Date  : num  321 687 807 929 1052 ...
 str(spx)
'data.frame':   14940 obs. of  5 variables:
 $ Open  : num  16.7 16.9 16.9 17 17.1 ...
 $ High  : num  16.7 16.9 16.9 17 17.1 ...
 $ Low   : num  16.7 16.9 16.9 17 17.1 ...
 $ Close : num  16.7 16.9 16.9 17 17.1 ...
 $ Volume: num  126 189 255 201 252 216 263
297 333 146 ...


Thanks
Raghu



[R] Three-way Panel Data Analysis

2010-07-13 Thread Millo Giovanni

Dear Danice,

as far as I know, three-way panels are not considered in the
econometrics literature (two dimensions make things complicated enough
already). They are also not implemented in plm.

You might find support for more elaborate nesting structures in the nlme
and lme4 packages. Yet, as the empirical question is not clear from what
you say, you might as well want to use separate regressions,
destination/origin/time dummies (possibly interacted with coefficients)
and so on.
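
As a rough illustration of the dummy-variable route, the sketch below fits an ordinary regression with origin, destination and time dummies. The data are simulated purely for illustration, and the names `panel`, `spend` and `income` are hypothetical; nothing here comes from the poster's data set, and whether such a specification suits the empirical question is a separate matter.

```r
set.seed(1)  # arbitrary seed; the data below are simulated, not real

# 8 origins x 4 destinations x 5 years, one observation per cell.
panel <- expand.grid(origin      = paste0("O", 1:8),
                     destination = paste0("D", 1:4),
                     year        = 2001:2005)

# Hypothetical regressor and response.
panel$income <- rnorm(nrow(panel))
panel$spend  <- 1 + 0.5 * panel$income + rnorm(nrow(panel))

# Origin, destination and time effects entered as dummies via factor().
fit <- lm(spend ~ income + factor(origin) + factor(destination) + factor(year),
          data = panel)
print(length(coef(fit)))
```

The fit has one intercept, one slope and 7 + 3 + 4 dummy coefficients; interactions such as factor(origin):factor(destination) could be added in the same formula.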

Best wishes,
Giovanni

--- original message 

Message: 111
Date: Tue, 13 Jul 2010 07:14:06 +0100
From: danice ng danice...@gmail.com
To: r-help@r-project.org
Subject: [R] Three-way Panel Data Analysis
Message-ID:
aanlktime7a8oydgotqr-qw0mauw8iqn7rvknc9csx...@mail.gmail.com
Content-Type: text/plain

Dear R users,

I have panel data on the amount of money spent by travellers from 8 origin
countries in 4 destinations. I would like to carry out analysis for
destinations, origins and time. However, it seems to me that the package
plm can only estimate two-way panel data (indexed by a two-dimensional
array). Any suggestions would be greatly appreciated.

Thank you.

Best regards,
Danice




--

Giovanni Millo
Research Dept.,
Assicurazioni Generali SpA
Via Machiavelli 4, 
34132 Trieste (Italy)
tel. +39 040 671184 
fax  +39 040 671160 


[R] StartsWith over vector of Strings?

Given vectors of strings of arbitrary length

content <- c("abc", "def")
searchset <- c("a", "abc", "abcdef", "d", "def", "defghi")

Is it possible to determine the content string set that matches the
searchset in the sense of 'startswith'? This would be a vector of all
strings in content that start with any of the strings in
the searchset. In the little example here, this would be:

result <- c("abc", "abc", "def", "def")

Best,
Ralf
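
One possible way to get that result with base R is sketched below: for each search string, compare it with the leading characters of every content string. The name `matches` is introduced for illustration, and this is only one of several ways to do it (more recent versions of R also provide a base function startsWith()).

```r
content <- c("abc", "def")
searchset <- c("a", "abc", "abcdef", "d", "def", "defghi")

# For each prefix, keep the content strings whose first nchar(prefix)
# characters equal the prefix; prefixes longer than a string never match.
matches <- unlist(lapply(searchset, function(prefix) {
  content[substr(content, 1, nchar(prefix)) == prefix]
}))
print(matches)
```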


Re: [R] Regarding R -installation

2010-07-13 Thread Gavin Simpson

On Tue, 2010-07-13 at 13:39 +0530, venkatesh bandaru wrote:
 Dear R-help  Team Members,
 
 I am venkatesh , Student of university of Hyderabad, while Installing R from
 the specified servers, I encountered the following problem. please help me
 regarding. i need this to do my project .
  Thanking you.
 
 
 *Problem* :
 
 Cannot access installation media
 http://download.opensuse.org/repositories/devel:languages:R:patched/openSUSE_11.2
  (Medium 1).
 Check whether the server is accessible
 
 Download (curl) error for '
 http://download.opensuse.org/repositories/devel:languages:R:patched/openSUSE_11.2/repodata/repomd.xml
 ':
 Error code: Connection failed
 Error message: couldn't connect to host

Those are the package repositories for your distribution of Linux
(openSUSE), and have nothing to do with R, the R Foundation and its CRAN
servers AFAIK.

If you want to download the R sources, try:

http://cran.r-project.org/mirrors.html

Choose one near you and then look at the R Sources and R Binaries
entries in the menu on the left.

As for the problem with openSUSE, you might need to try their help
forums.

HTH

G

 
 

-- 
%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%
 Dr. Gavin Simpson [t] +44 (0)20 7679 0522
 ECRC, UCL Geography,  [f] +44 (0)20 7679 0565
 Pearson Building, [e] gavin.simpsonATNOSPAMucl.ac.uk
 Gower Street, London  [w] http://www.ucl.ac.uk/~ucfagls/
 UK. WC1E 6BT. [w] http://www.freshwaters.org.uk
%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%


Re: [R] Zoo - bug ???

2010-07-13 Thread Gavin Simpson

On Tue, 2010-07-13 at 17:13 +0530, sayan dasgupta wrote:
 Hi folks,
 
 I am confused whether the following is a bug or it is fine
 
 Here is the explanation
 
 a <- zoo(c(NA,1:9),1:10)
 
 Now If I do
 
 rollapply(a,FUN=mean,width=3,align="right")

mean() has argument na.rm which defaults to FALSE. As such, if NA are in
the computation the mean is undefined and the answer will be NA. If you
pass na.rm = TRUE to rollapply, mean ignores the NA and works on the
remaining values:

> rollapply(a,FUN=mean,width=3,align="right", na.rm = TRUE)
  3   4   5   6   7   8   9  10 
1.5 2.0 3.0 4.0 5.0 6.0 7.0 8.0
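
The effect of na.rm on a single window can be seen without zoo. The sketch below uses a plain vector and a hand-written right-aligned rolling mean (`roll_mean` is a hypothetical helper, not a zoo function), so it only illustrates how mean() treats NA and says nothing about how rollapply is implemented internally.

```r
a <- c(NA, 1:9)
width <- 3

# Right-aligned rolling mean over a plain vector; extra arguments such as
# na.rm are passed straight on to mean().
roll_mean <- function(x, width, ...) {
  ends <- width:length(x)
  sapply(ends, function(i) mean(x[(i - width + 1):i], ...))
}

with_na    <- roll_mean(a, width)                # first window holds the NA
without_na <- roll_mean(a, width, na.rm = TRUE)  # NA dropped before averaging

print(with_na)
print(without_na)
```

With na.rm = TRUE the first window averages only the values 1 and 2, which is where the 1.5 in the output above comes from.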

HTH

G

 

-- 
%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%
 Dr. Gavin Simpson [t] +44 (0)20 7679 0522
 ECRC, UCL Geography,  [f] +44 (0)20 7679 0565
 Pearson Building, [e] gavin.simpsonATNOSPAMucl.ac.uk
 Gower Street, London  [w] http://www.ucl.ac.uk/~ucfagls/
 UK. WC1E 6BT. [w] http://www.freshwaters.org.uk
%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%


Re: [R] print.trellis draw.in - plaintext (gmail mishap)

2010-07-13 Thread Mark Connolly


That helped.  I continued to have issues with

draw.in=vplayout(2,2)$name


(I guess I still don't understand its use), but the following positions 
the plot on the grid where I want it.


grid.newpage()
pushViewport(viewport(layout=grid.layout(2,2)))
vp <- vplayout(2,2)

pushViewport(vp)
print(p,newpage=FALSE)
upViewport()

The reason the ggplot instances just worked is that the ggplot2 package 
is doing the viewport traversal for me in ggplot2::print.ggplot.
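
A minimal sketch of the underlying point, using grid only: a viewport can
be looked up by name only after it has been pushed. The helper name cell
is made up here (it plays the role of vplayout), and the lookup uses
downViewport(), the call that appears in the error message quoted further
down.

```r
library(grid)

# Same helper idea as vplayout() in the thread; the name `cell` is made up.
cell <- function(row, col) viewport(layout.pos.row = row, layout.pos.col = col)

grid.newpage()
pushViewport(viewport(layout = grid.layout(2, 2)))
vp <- cell(2, 2)

# Before pushing, the viewport is not in the tree, so a lookup by name fails.
before <- try(downViewport(vp$name), silent = TRUE)
found_before <- !inherits(before, "try-error")

pushViewport(vp)   # add it to the tree ...
upViewport()       # ... and step back up to the layout viewport

# Now the same lookup succeeds; go back up by the depth we descended.
depth <- downViewport(vp$name)
upViewport(depth)
found_after <- depth >= 1

c(found_before = found_before, found_after = found_after)
```

Pushing the viewport and printing with newpage=FALSE, as above, avoids
the lookup by name altogether.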



Thanks for the help!
Mark


On 07/12/2010 07:29 PM, Felix Andrews wrote:

The problem is that you have not pushed your viewport so it doesn't
exist in the plot. (You only pushed the layout viewport).

   

grid.ls(viewports = TRUE)
 

ROOT
   GRID.VP.82

Try this:

vp <- vplayout(2,2)
pushViewport(vp)
upViewport()
grid.ls(viewports = TRUE)
#ROOT
#  GRID.VP.82
#GRID.VP.86
print(p, newpage = FALSE, draw.in = vp$name)


-Felix


On 13 July 2010 01:22, Mark Connolly wmcon...@ncsu.edu wrote:
   

require(grid)
require(lattice)
fred = data.frame(x=1:5,y=runif(5))
vplayout <- function (x,y) viewport(layout.pos.row=x, layout.pos.col=y)
grid.newpage()
pushViewport(viewport(layout=grid.layout(2,2)))
p = xyplot(y~x,fred)
print(  p,newpage=FALSE,draw.in=vplayout(2,2)$name)


On Mon, Jul 12, 2010 at 8:58 AM, Felix Andrews fe...@nfrac.org wrote:
 

PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
 

Yes, please, reproducible code.



On 10 July 2010 00:49, Mark Connolly wmcon...@ncsu.edu wrote:
   

I am attempting to plot a trellis object on a grid.

vplayout = function(x,y) viewport(layout.pos.row=x, layout.pos.col=y)

grid.newpage()
pushViewport(viewport(layout=grid.layout(2,2)))

g1 = ggplot() ...
g2 = ggplot() ...
g3 = ggplot() ...
p = xyplot() ...

# works as expected
print(g1, vp=vplayout(1,1))
print(g2, vp=vplayout(1,2))
print(g3, vp=vplayout(2,1))

# does not work
print(  p,
 newpage=FALSE,
 draw.in=vplayout(2,2)$name)

Error in grid.Call.graphics(L_downviewport, name$name, strict) :
  Viewport 'GRID.VP.112' was not found


What am I doing wrong?

Thanks!


 



--
Felix Andrews / 安福立
http://www.neurofractal.org/felix/

   
 







Re: [R] Zoo - bug ???

2010-07-13 Thread sayan dasgupta

On Tue, Jul 13, 2010 at 5:27 PM, Gavin Simpson gavin.simp...@ucl.ac.uk wrote:

 On Tue, 2010-07-13 at 17:13 +0530, sayan dasgupta wrote:
  Hi folks,
 
  I am confused whether the following is a bug or it is fine
 
  Here is the explanation
 
  a <- zoo(c(NA,1:9),1:10)
 
  Now if I do
 
  rollapply(a,FUN=mean,width=3,align="right")

 mean() has argument na.rm which defaults to FALSE. As such, if NA are in
 the computation the mean is undefined and the answer will be NA. If you
 pass na.rm = TRUE to rollapply, mean ignores the NA and works on the
 remaining values:

  rollapply(a,FUN=mean,width=3,align="right", na.rm = TRUE)
   3   4   5   6   7   8   9  10
 1.5 2.0 3.0 4.0 5.0 6.0 7.0 8.0


This is fine, but logically, when doing rollapply, only the first 2 values
should be NA. For index 10, as I mentioned, rollapply should return the mean
of 9, 8 and 7, and there is no NA there, so it should not return NA.








[R] Define package-wide character constants

2010-07-13 Thread Daniel Nüst

Dear list!

I develop a package for R and wonder how I can best define
package-wide constants (both character strings or named vectors of
strings) which are used throughout different classes and methods. I'm
new to R and wonder if there is some kind of “best practice” that I
just haven't read of yet. My main programming language is Java, so if
that helps anyone to understand my thinking: I mean values that I
would normally put into a class like Constants.java as “public static
final” variables, or into a .properties file.

A concrete example: I deal with XML files, both parsing and encoding.
Right now I have several classes representing documents which I
handle, and in each of the encoding methods there is a character
string for the schema location. If I want to change that location then
I have to change it several times (neglecting search and replace), but
I'd rather have a single point of change for that.

I've read documentation on environments, <<- and the assign
function, BUT am still not sure how to approach my problem BEST.
assign could most probably do the job, but do I use/create a certain
environment (like myConstants)? Or should I just use my package's
environment?


Thanks for any experiences, (alternative) ideas, or pointers at
existing discussions!

Regards,
Daniel


[R] Substring function?

Hi all,

I would like to detect all strings in the vector 'content' that
contain the strings from the vector 'search'. Here is a code example:

content <- data.frame(urls=c(
"http://www.google.com/search?source=ighl=enrlz==q=stuffaq=faqi=g10aql=oq=gs_rfai=CrrIS3",
"http://search.yahoo.com/search;_ylt=Atvki9MVpnxuEcPmXLEWgMqbvZx4?p=stufftoggle=1"
))
search <- data.frame(signatures=c("http://www.google.com/search"))
subset(content, search$signatures %in% content$urls)

I am getting an error:

[1] urls
0 rows (or 0-length row.names)


What I would like to achieve is the return of
http://www.google.com/search?source=ighl=enrlz==q=stuffaq=faqi=g10aql=oq=gs_rfai=CrrIS3.
Is that possible? In practice I would like to run this over 1000s of
strings in 'content' and 100s of strings in 'search'. Could I run into
performance issues with this approach and, if so, are there better
ways?

Best,
Ralf


Re: [R] Define package-wide character constants

2010-07-13 Thread Duncan Murdoch


On 13/07/2010 8:20 AM, Daniel Nüst wrote:

Dear list!

I develop a package for R and wonder how I can best define
package-wide constants (both character strings or named vectors of
strings) which are used throughout different classes and methods. I'm
new to R and wonder if there is some kind of “best practice” that I
just haven't read of yet. My main programming language is Java, so if
that helps anyone to understand my thinking: I mean values that I
would normally put into a class like Constants.java as “public static
final” variables, or into a .properties file.

A concrete example: I deal with XML files, both parsing and encoding.
Right now I have several classes representing documents which I
handle, and in each of the encoding methods there is a character
string for the schema location. If I want to change that location then
I have to change it several times (neglecting search and replace), but
I'd rather have a single point of change for that.

I've read documentation on environments, <<- and the assign
function, BUT am still not sure how to approach my problem BEST.
assign could most probably do the job, but do I use/create a certain
environment (like myConstants)? Or should I just use my package's
environment?


Thanks for any experiences, (alternative) ideas, or pointers at
existing discussions!


If they are constants, you won't need to use <<- or assign() to change 
them:  just define them at top level in one of the source files for your 
package.

For example,

errorMsg <- "You have made an error."

Then you can refer to them in functions in your package, e.g.

foo <- function() {
 stop(errorMsg)
}

If you use a NAMESPACE file you can limit the visibility of these 
objects to your package by not exporting them.
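
Applied to the schema-location case from the question, a minimal sketch
might look like the following. All names and the example.org URLs are
made up for illustration; only the pattern (define once at top level,
refer to it by name from the functions) comes from the advice above.

```r
# Constants defined once at top level of a package source file,
# e.g. R/constants.R. Names and example.org URLs are hypothetical.
schemaLocation <- "http://example.org/schema/1.0 http://example.org/schema/1.0/doc.xsd"
xmlNamespaces <- c(xsi = "http://www.w3.org/2001/XMLSchema-instance",
                   xlink = "http://www.w3.org/1999/xlink")

# Functions elsewhere in the package refer to the constants by name,
# so the location has a single point of change.
encodeHeader <- function() {
  paste0('xsi:schemaLocation="', schemaLocation, '"')
}

encodeHeader()
```

A named character vector such as xmlNamespaces covers the "named vectors
of strings" case in the same way.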


Duncan Murdoch


Re: [R] Zoo - bug ???

2010-07-13 Thread Gavin Simpson

On Tue, 2010-07-13 at 17:41 +0530, sayan dasgupta wrote:
 
 
 On Tue, Jul 13, 2010 at 5:27 PM, Gavin Simpson
 gavin.simp...@ucl.ac.uk wrote:
 On Tue, 2010-07-13 at 17:13 +0530, sayan dasgupta wrote:
  Hi folks,
 
  I am confused whether the following is a bug or it is fine
 
  Here is the explanation
 
  a <- zoo(c(NA,1:9),1:10)
 
  Now if I do
 
  rollapply(a,FUN=mean,width=3,align="right")
 
 mean() has argument na.rm which defaults to FALSE. As such, if NA are in
 the computation the mean is undefined and the answer will be NA. If you
 pass na.rm = TRUE to rollapply, mean ignores the NA and works on the
 remaining values:
 
  rollapply(a,FUN=mean,width=3,align="right", na.rm = TRUE)
    3   4   5   6   7   8   9  10
  1.5 2.0 3.0 4.0 5.0 6.0 7.0 8.0
 
 This is fine, but logically, when doing rollapply, only the first 2
 values should be NA. For index 10, as I mentioned, rollapply should
 return the mean of 9, 8 and 7, and there is no NA there, so it should
 not return NA.

Indeed, there seems to be something odd happening here: consider,

> rollapply(a,FUN=mean,width=3)
 2  3  4  5  6  7  8  9 
NA NA NA NA NA NA NA NA 
> rollapply(a,FUN=mean,width=3, na.rm = FALSE)
 2  3  4  5  6  7  8  9 
NA  2  3  4  5  6  7  8
> rollapply(a,FUN=mean,width=3, na.rm = TRUE)
  2   3   4   5   6   7   8   9 
1.5 2.0 3.0 4.0 5.0 6.0 7.0 8.0 

and if you debug zoo:::rollapply.zoo, the top one gets passed off to
rollmean early on in the code, whilst the second (na.rm = FALSE) is
handled by rollapply itself. And I see why this is happening. If ...
contains anything at all, then the code will not enter the switch
statement which passes off control to functions like rollmean() (in this
case). This explains the difference between the first call and the second
call with na.rm = FALSE.

And of course, this is mentioned on ?rollapply. Must read the help!!!

So, as rollmean doesn't accept an na.rm argument or pass it on, you need
to do

rollapply(a,FUN=mean,width=3, na.rm = FALSE)

This is not a bug, as ?rollapply tells you what it does and points you
to ?rollmean, which states that it doesn't work for inputs with NAs. To
get the behaviour you want though, you have to do the somewhat odd
workaround and force computation via rollapply by providing an extra
argument, even a gibberish one, e.g.:

rollapply(a,FUN=mean,width=3, foo = 1)

will work.
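
The effect can be mimicked in base R with a toy function. This is only a
sketch and NOT zoo's actual code: roll_sketch is a made-up name, and the
cumulative-sum "fast path" is an assumption standing in for rollmean,
chosen because a running sum is one way a single NA can spoil every
window.

```r
# Toy stand-in for the dispatch described above; this is NOT zoo's code.
roll_sketch <- function(x, width, FUN, ...) {
  if (length(list(...)) == 0 && identical(FUN, mean)) {
    # "Fast path" (assumed): running sums, so an NA spoils all later windows
    cs <- cumsum(c(0, x))
    n_win <- length(x) - width + 1
    return((cs[(width + 1):length(cs)] - cs[1:n_win]) / width)
  }
  # General path: apply FUN to each window separately
  windows <- embed(x, width)   # one row per window
  apply(windows, 1, FUN, ...)
}

x <- c(NA, 1:9)
roll_sketch(x, 3, mean)                 # all NA, like the first call
roll_sketch(x, 3, mean, na.rm = FALSE)  # NA 2 3 4 5 6 7 8
roll_sketch(x, 3, mean, na.rm = TRUE)   # 1.5 2 3 4 5 6 7 8
```

Any extra argument makes length(list(...)) non-zero, which is why even a
gibberish one switches to the per-window path.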

HTH

G

 
 
 
 
  
 

Re: [R] Zoo - bug ???

On Tue, Jul 13, 2010 at 7:43 AM, sayan dasgupta kitt...@gmail.com wrote:
 Hi folks,

 I am confused whether the following is a bug or it is fine

 Here is the explanation

 a <- zoo(c(NA,1:9),1:10)

 Now if I do

 rollapply(a,FUN=mean,width=3,align="right")

 I get
 rollapply(a,FUN=mean,width=3,align="right")
  3  4  5  6  7  8  9 10
 NA NA NA NA NA NA NA NA

 But I shouldn't be getting NA right ? i.e for index 10 I should get
 (1/3)*(9+8+7)

 Similarly

 rollapply(a,FUN=mean,width=3)
  2  3  4  5  6  7  8  9
 NA NA NA NA NA NA NA NA

This is documented behavior (thanks to Gavin for pointing this out)
but I agree that it is undesirable and we will consider how to address
this.  In the meantime use
rollapply(a, 3, mean)
so that it does not use rollmean or if you want NAs removed when doing
the mean calculation use na.rm = TRUE as Gavin suggested.


Re: [R] Continuing on with a loop when there's a failure



On Jul 13, 2010, at 8:47 AM, Josh B wrote:


Thanks again, David.

...but, alas, I still can't get it to work! Here's what I'm trying now:

for (i in 1:2) {
mod.poly3 <- try(lrm(x[,i] ~ pol(x1, 3) + pol(x2, 3), data=x))
results[1,i] <- anova(mod.poly3)[1,3]
}


You need to do some programming. You did not get an error from the lrm  
but rather from the anova call because you tried to give the results  
of the try function to anova without first checking to see if an error  
had occurred.
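
A minimal sketch of that check, in base R only. fit_stat and the data
frame dat are made up: fit_stat stands in for the lrm() plus anova()
pair and simply fails on an invariant column, and its "statistic" is a
placeholder, not a real test.

```r
# Made-up stand-in for a model fit that fails on an invariant response;
# in the thread this role is played by lrm() followed by anova().
fit_stat <- function(y) {
  if (length(unique(y)) < 2) stop("response is invariant")
  mean(y == y[1])   # placeholder "statistic", not a real test
}

dat <- data.frame(y1 = c("a", "b", "a", "b"),
                  y2 = c("b", "b", "b", "b"),
                  y3 = c("a", "a", "b", "a"),
                  stringsAsFactors = FALSE)

results <- matrix(NA_real_, nrow = 1, ncol = 3,
                  dimnames = list(NULL, names(dat)))

for (i in seq_along(dat)) {
  res <- try(fit_stat(dat[[i]]), silent = TRUE)
  if (inherits(res, "try-error")) next   # leave the NA and move on
  results[1, i] <- res
}
results
```

The y2 column fails, its cell stays NA, and the loop still reaches y3.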


--
David.


Here's what happens (from the console):

Error in fitter(X, Y, penalty.matrix = penalty.matrix, tol = tol, weights = weights,  :
  NA/NaN/Inf in foreign function call (arg 1)
Error in UseMethod("anova") :
  no applicable method for 'anova' applied to an object of class "try-error"


...so I still can't make my results matrix. Could I ask you for some  
specific code to make this work? I'm not that familiar with the  
syntax for try or tryCatch, and the help files for them are pretty  
bad, in my humble opinion.


I should clarify that I actually don't care about the failed runs  
per se. I just want R to keep going in spite of them and give me my  
results matrix.


From: David Winsemius dwinsem...@comcast.net
To: Josh B josh...@yahoo.com
Cc: R Help r-help@r-project.org
Sent: Mon, July 12, 2010 8:09:03 PM
Subject: Re: [R] Continuing on with a loop when there's a failure


On Jul 12, 2010, at 6:18 PM, Josh B wrote:

 Hi R sages,

 Here is my latest problem. Consider the following toy example:

 x <- read.table(textConnection("y1 y2 y3 x1 x2
 indv.1 bagels donuts bagels 4 6
 indv.2 donuts donuts donuts 5 1
 indv.3 donuts donuts donuts 1 10
 indv.4 donuts donuts donuts 10 9
 indv.5 bagels donuts bagels 0 2
 indv.6 bagels donuts bagels 2 9
 indv.7 bagels donuts bagels 8 5
 indv.8 bagels donuts bagels 4 1
 indv.9 donuts donuts donuts 3 3
 indv.10 bagels donuts bagels 5 9
 indv.11 bagels donuts bagels 9 10
 indv.12 bagels donuts bagels 3 1
 indv.13 donuts donuts donuts 7 10
 indv.14 bagels donuts bagels 2 10
 indv.15 bagels donuts bagels 9 6"), header = TRUE)

 I want to fit a logistic regression of y1 on x1 and x2. Then I want to run a
 logistic regression of y2 on x1 and x2. Then I want to run a logistic regression
 of y3 on x1 and x2. In reality I have many more Y columns than simply y1,
 y2, and y3, so I must design a loop. Notice that y2 is invariant and thus it
 will fail. In reality, some y columns will fail for much more subtle reasons.
 Simply screening my data to eliminate invariant columns will not eliminate the
 problem.

 What I want to do is output a piece of the results from each run of the loop to
 a matrix. I want it to try each of my y columns, and not give up and stop
 running simply because a particular y column is bad. I want it to give me NA
 or something similar in my results matrix for the bad y columns, but I want it
 to keep going and give me good data for the good y columns.

 For instance:
 results <- matrix(nrow = 1, ncol = 3)
 colnames(results) <- c("y1", "y2", "y3")

 for (i in 1:2) {
 mod.poly3 <- lrm(x[,i] ~ pol(x1, 3) + pol(x2, 3), data=x)
 results[1,i] <- anova(mod.poly3)[1,3]
 }

 If I run this code, it gives up when fitting y2 because the y2 is bad. It
 doesn't even try to fit y3. Here's what my console shows:

 results
             y1 y2 y3
 [1,] 0.6976063 NA NA

 As you can see, it gave up before fitting y3, which would have worked.

 How do I force my code to keep going through the loop, despite the rotten apples
 it encounters along the way?

?try

http://cran.r-project.org/doc/FAQ/R-FAQ.html#How-can-I-capture-or-ignore-errors-in-a-long-simulation_003f

(Doesn't only apply to simulations.)

 Exact code that gets the job done is what I am interested in. I am a
 post-doc -- I am not taking any classes. I promise this is not a
 homework assignment!

--
David Winsemius, MD
West Hartford, CT





David Winsemius, MD
West Hartford, CT


Re: [R] Continuing on with a loop when there's a failure



On Jul 13, 2010, at 9:04 AM, David Winsemius wrote:



On Jul 13, 2010, at 8:47 AM, Josh B wrote:


Thanks again, David.

...but, alas, I still can't get it to work!


(BTW, it did work.)


Here's what I'm trying now:

for (i in 1:2) {
   mod.poly3 <- try(lrm(x[,i] ~ pol(x1, 3) + pol(x2, 3), data=x))
   results[1,i] <- anova(mod.poly3)[1,3]
}


You need to do some programming.


(Or I suppose you could wrap both the lrm and the anova calls in try.)

You did not get an error from the lrm but rather from the anova call  
because you tried to give the results of the try function to anova  
without first checking to see if an error had occurred.
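
The "wrap both calls" alternative could be sketched with tryCatch as
below, in base R only. fit_step and stat_step are made-up stand-ins for
lrm() and anova(), and safe_stat and cols are hypothetical names; an
error in either step is turned into NA.

```r
# Made-up two-step computation standing in for lrm() followed by anova().
fit_step <- function(y) {
  if (length(unique(y)) < 2) stop("cannot fit an invariant response")
  table(y)
}
stat_step <- function(fit) max(fit) / sum(fit)

# Wrap both steps: a failure in either one yields NA instead of stopping.
safe_stat <- function(y) {
  tryCatch(stat_step(fit_step(y)),
           error = function(e) NA_real_)
}

cols <- list(y1 = c("a", "b", "a", "a"),
             y2 = rep("b", 4),
             y3 = c("a", "b", "b", "b"))
sapply(cols, safe_stat)
```

Here the loop is replaced by sapply, so the result is a named vector with
NA in the position of the failed column.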


--
David.


Here's what happens (from the console):

Error in fitter(X, Y, penalty.matrix = penalty.matrix, tol = tol, weights = weights,  :
  NA/NaN/Inf in foreign function call (arg 1)
Error in UseMethod("anova") :
  no applicable method for 'anova' applied to an object of class "try-error"


...so I still can't make my results matrix. Could I ask you for  
some specific code to make this work? I'm not that familiar with  
the syntax for try or tryCatch, and the help files for them are  
pretty bad, in my humble opinion.


I should clarify that I actually don't care about the failed runs  
per se. I just want R to keep going in spite of them and give me my  
results matrix.




David Winsemius, MD
West Hartford, CT


[R] Generate groups with random size but given total sample size

2010-07-13 Thread Arne Schulz

Dear list,
I am currently doing some simulation studies where I want to compare different 
scenarios.
In particular, two scenarios should be compared: 10.000 cases in 100 groups 
with 100 cases per group and 10.000 cases in 100 groups with random group size 
(ranging from 5 to 500).

The first part is no problem:
 id <- seq(1,10000)
 group <- sort(rep(seq(1,100),100))

But I don't get along with the second scenario. Using sample does give me 100 
groups with random cases, but generates more than 10.000 cases:
 set.seed(13)
 sum(sample(5:500, 100))
[1] 24583

Another way could be generating one sample at a time and summing the cases. But
this would end up in trial & error to fit the 10.000 cases. Maybe it would
break rules of probability, too.

I'm convinced that there should be another (and even better) way to handle this 
problem in R... :-)
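
One possible sketch, added here for illustration and not taken from any
reply: give every group the minimum size, distribute the remaining cases
with random weights, and redraw if a group exceeds the maximum. The
exponential weights are an assumption; the question does not say how the
group sizes should be distributed.

```r
set.seed(13)
n_groups <- 100; total <- 10000; lo <- 5; hi <- 500

repeat {
  w <- rexp(n_groups)                          # random relative group weights
  extra <- rmultinom(1, total - n_groups * lo, prob = w)[, 1]
  sizes <- lo + extra                          # every group gets at least `lo`
  if (max(sizes) <= hi) break                  # redraw if a group is too large
}

group <- rep(seq_len(n_groups), times = sizes) # group id for each case
id <- seq_len(total)
```

By construction the sizes always sum to the requested total, so no trial
and error on the total is needed.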


Best regards,
Arne Schulz


Re: [R] Custom nonlinear self starting function w/ 2 covariates

2010-07-13 Thread Sebastien Guyader


Dear all

I finally found the way to do it. nlme accepts simpler functions than
selfStart:

# Defining my function
Myfun <- function(x1,x2,Tmax,Topt,B,E) {
(((Tmax-x1)/(Tmax-Topt))*(x1/Topt)^(Topt/(Tmax-Topt)))*exp(-exp(B*(log(x2)-log(abs(E)))))
}

# Calling nlme
nlmefit3 <- nlme( y ~ Myfun(x1,x2,Tmax,Topt,B,E), data, fixed=Tmax+Topt+B+E
~ 1, random=Tmax+Topt+B+E ~ 1, start=list(fixed=(c(Topt=25.206, Tmax=36.085,
B=-0.825, E=6.435))) )

Unfortunately, with nlmer I'm stuck with the error message "gradient
attribute of evaluated model must be a numeric matrix", but it's good that
it works with nlme.
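
As a quick sanity check of the function's shape (a sketch, assuming the
formula reads as a temperature term multiplied by a Weibull-type time
term), it can be evaluated at the starting values. The names temp_part,
time_part and p are introduced only for this illustration.

```r
Myfun <- function(x1, x2, Tmax, Topt, B, E) {
  temp_part <- ((Tmax - x1) / (Tmax - Topt)) * (x1 / Topt)^(Topt / (Tmax - Topt))
  time_part <- exp(-exp(B * (log(x2) - log(abs(E)))))
  temp_part * time_part
}

# Starting values quoted in the post
p <- c(Topt = 25.206, Tmax = 36.085, B = -0.825, E = 6.435)

# At x1 = Topt the temperature term is 1, and at x2 = E the time term is exp(-1)
at_optimum <- Myfun(p[["Topt"]], p[["E"]], p[["Tmax"]], p[["Topt"]], p[["B"]], p[["E"]])

# At x1 = Tmax the response drops to zero, whatever the time
at_tmax <- Myfun(p[["Tmax"]], 10, p[["Tmax"]], p[["Topt"]], p[["B"]], p[["E"]])

c(at_optimum = at_optimum, at_tmax = at_tmax)
```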



Sebastien Guyader wrote:
 
 Hello,
 
 I'm trying to adjust a non linear model in which the biological response
 variable (ratio of germinated fungus spores) is dependent on 2 covariates
 (temperature and time). The response to temperature is modeled by a kind
 of beta function with 2 parameters (optimal and maximum temperatures) and
 the time function is a 2-parameter Weibull. Adjustments with nls or gnls
 work, but I need to do mixed-effects modeling.
 
 It seems like nlme or nlmer need self starting functions, but so far I
 can't find a way to code a selfstart function with 2 x covariates. Is it
 just impossible? Is there another way?
 
 Thanks
 

-- 
View this message in context: 
http://r.789695.n4.nabble.com/Custom-nonlinear-self-starting-function-w-2-covariates-tp2286099p2287391.html
Sent from the R help mailing list archive at Nabble.com.


Re: [R] Fast string comparison

2010-07-13 Thread Matt Shotwell

On Tue, 2010-07-13 at 01:42 -0400, Hadley Wickham wrote:
 strings <- replicate(1e5, paste(sample(letters, 100, rep = T), collapse = ""))
 system.time(strings[-1] == strings[-1e5])
 #   user  system elapsed
 #  0.016   0.000   0.017
 
 So it takes ~1/100 of a second to do ~100,000 string comparisons. You
 need to provide a reproducible example that illustrates why you think
 string comparisons are slow.

Here's a vectorized alternative to '==' for strings, with minimal
argument checking or result conversion. I haven't looked at the
corresponding R source code, it may be similar:

library(inline)
code <- "
SEXP ans;
int i, len, *cans;
if(!isString(s1) || !isString(s2))
error(\"invalid arguments\");
len = length(s1)>length(s2)?length(s2):length(s1);
PROTECT(ans = allocVector(INTSXP, len));
cans = INTEGER(ans);
for(i = 0; i < len; i++)
cans[i] = strcmp(CHAR(STRING_ELT(s1,i)), CHAR(STRING_ELT(s2,i)));
UNPROTECT(1);
return ans;
"
sig <- signature(s1="character", s2="character")
strcmp <- cfunction(sig, code)


 system.time(strings[-1] == strings[-1e5])
   user  system elapsed 
  0.036   0.000   0.035 
 system.time(strcmp(strings[-1], strings[-1e5]))
   user  system elapsed 
  0.032   0.000   0.034 

That's pretty fast, though I seem to be working with a slower system
than Hadley. It's hard to see how this could be improved, except maybe
by caching results of string comparisons. 

-Matt

 
 Hadley
 
 
 On Tue, Jul 13, 2010 at 6:52 AM, Ralf B ralf.bie...@gmail.com wrote:
  I am asking this question because String comparison in R seems to be
  awfully slow (based on profiling results) and I wonder if perhaps '=='
  alone is not the best one can do. I did not ask for anything
  particular and I don't think I need to provide a self-contained source
  example for the question. So, to re-phrase my question, are there more
  (runtime) effective ways to find out if two strings (about 100-150
  characters long) are equal?
 
  Ralf
 
 
 
 
 
 
  On Sun, Jul 11, 2010 at 2:37 PM, Sharpie ch...@sharpsteen.net wrote:
 
 
  Ralf B wrote:
 
  What is the fastest way to compare two strings in R?
 
  Ralf
 
 
  Which way is not fast enough?
 
  In other words, are you asking this question because profiling showed one of
  R's string comparison operations is causing a massive bottleneck in your
  code? If so, which one and how are you using it?
 
  -Charlie
 
  -
  Charlie Sharpsteen
  Undergraduate-- Environmental Resources Engineering
  Humboldt State University
  --
  View this message in context: 
  http://r.789695.n4.nabble.com/Fast-string-comparison-tp2285156p2285409.html
  Sent from the R help mailing list archive at Nabble.com.
 
 
 
 
 
 
 
-- 
Matthew S. Shotwell
Graduate Student
Division of Biostatistics and Epidemiology
Medical University of South Carolina
http://biostatmatt.com


Re: [R] Substring function?

2010-07-13 Thread Nikhil Kaza

Well, %in% checks whether an element is in a set; it is not a
substring operator.


To get the result you want, try

content[grepl(search$signatures, content$urls),]

For multiple operations you could try

sapply(search$signatures, grepl, x=content$urls)
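
A small self-contained sketch of the multiple-signature case, with
shortened made-up URLs and a second made-up signature. Adding
fixed = TRUE is a suggestion of this sketch rather than part of the
reply above: it makes grepl treat each signature as a literal string, so
characters such as "." and "?" in a URL are not read as regular
expression syntax.

```r
content <- data.frame(urls = c("http://www.google.com/search?q=stuff",
                               "http://search.yahoo.com/search?p=stuff"),
                      stringsAsFactors = FALSE)
search <- data.frame(signatures = c("http://www.google.com/search",
                                    "http://www.bing.com/search"),
                     stringsAsFactors = FALSE)

# One column per signature, one row per URL; fixed = TRUE treats the
# signature as a literal string rather than a regular expression.
hits <- sapply(search$signatures, grepl, x = content$urls, fixed = TRUE)

# Keep the URLs that contain at least one signature.
matched <- content$urls[rowSums(hits) > 0]
matched
```

With 1000s of URLs and 100s of signatures this is one vectorised grepl
call per signature.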




Nikhil Kaza
Asst. Professor,
City and Regional Planning
University of North Carolina

nikhil.l...@gmail.com

On Jul 13, 2010, at 8:22 AM, Ralf B wrote:


Hi all,

I would like to detect all strings in the vector 'content' that
contain the strings from the vector 'search'. Here a code example:

content <- data.frame(urls=c(
	"http://www.google.com/search?source=ighl=enrlz==q=stuffaq=faqi=g10aql=oq=gs_rfai=CrrIS3",
	"http://search.yahoo.com/search;_ylt=Atvki9MVpnxuEcPmXLEWgMqbvZx4?p=stufftoggle=1"
))
search <- data.frame(signatures=c("http://www.google.com/search"))
subset(content, search$signatures %in% content$urls)

I am getting an error:

[1] urls
0 rows (or 0-length row.names)


What I would like to achieve is the return of
http://www.google.com/search?source=ighl=enrlz==q=stuffaq=faqi=g10aql=oq=gs_rfai=CrrIS3 
.

Is that possible? In practice I would like to run this over 1000s of
strings in 'content' and 100s of strings in 'search'. Could I run into
performance issues with this approach and, if so, are there better
ways?

Best,
Ralf



__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] RODBC and Excel 2010 xlsx

2010-07-13 Thread Rodrigo Aluizio

Hi List, just to know if the issue is only a problem of mine or if it is a
general issue due to the new MS Office pack. I'm using R 2.11.1 32 bits in a
Windows 7 x64 with the MS office 2010 x64 installed. I can import .xls files
normally (the same way I did with my Excel 2007 32 bits). But the function
odbcConnectExcel2007 isn't able to import .xlsx files now that I have the
new version of the Office package.

It gives me the following warning messages, which make importing through
sqlFetch impossible:
Warning messages:
1: In odbcDriverConnect(con, tabQuote = c("[", "]"), ...) :
  [RODBC] ERROR: state IM002, code 0, message [Microsoft][ODBC Driver
Manager] Nome da fonte de dados não encontrado e nenhum driver padrão
especificado (Source name not found and no default driver specified)
2: In odbcDriverConnect(con, tabQuote = c("[", "]"), ...) :
  ODBC connection failed

For now I'm working around it by converting my .xlsx files to .xls.
The question is simple: is this an expected issue, like the one when
the xlsx format was released, that will be worked out, or am I having a
specific problem with one of my system components (drivers)?

Thank you very much for the attention.

Rodrigo Aluizio

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] SAS Proc summary/means as a R function

2010-07-13 Thread Roger Deangelis


Thanks Richard and Erik,

I hate to buy the book and not find the solution to the following:

proc.means <- function() {
   deparse(match.call()[-1])
}

proc.means(this is a sentence)

Error: unexpected symbol in "proc.means(this is"

One possible solution would be to 'peek' into the memory buffer that holds
the
function arguments. 

It is easy to replicate the 'dataset' output for many SAS procs(ie
transpose, freq, summary, means...)
I am not interested in 'report writing in R'.

The hard part is parsing the SAS syntax, I wish R had a drop down to PERL.

perl on;

   some perl code

perl off;

also

sas on;

  some SAS code

sas off;

The purpose of parmbuff is to turn off R's scanning and resolution of
function arguments
and just provide the bare text between '('  and ')' in the function call.

This is a very powerful construct.

A function would provide something like

sas.on(


)
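
As a hedged illustration of the limits discussed above: R parses a call before the function ever runs, so text that is not valid R syntax can only reach a function as a quoted string. Arguments that are syntactically valid R can, however, be captured unevaluated. The function names below are hypothetical and not part of any package:

```r
# Capture syntactically valid arguments as text, without evaluating them
show_args <- function(...) {
  exprs <- as.list(substitute(list(...)))[-1]
  vapply(exprs, function(e) paste(deparse(e), collapse = " "), character(1))
}

# 'x', 'a' and 'b' do not need to exist: nothing is evaluated
captured <- show_args(mean(x), a + b)

# Free-form text has to be passed as a string and split by the function itself
split_words <- function(txt) strsplit(txt, " ", fixed = TRUE)[[1]]
words <- split_words("this is a sentence")
```

This does not reproduce parmbuff; it only shows how far substitute() and deparse() go before the parser gets in the way.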

-- 
View this message in context: 
http://r.789695.n4.nabble.com/SAS-Proc-summary-means-as-a-R-function-tp2286888p2287350.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] Regarding installation from ROracle

2010-07-13 Thread vikrant


I am using Windows XP and R 2.10. I tried to install the ROracle package and
got the following error:
 

This application has failed to start because orasql9.dll was not found.
Re-installing 
the application may fix this problem

I have already installed the dependency package DBI.

Please help me...

-- 
View this message in context: 
http://r.789695.n4.nabble.com/Regarding-installation-from-ROracle-tp2287331p2287331.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Accessing files on password-protected FTP sites

2010-07-13 Thread Cliff Clive

Thanks for the tip. From the link you posted:

| You can embed the user id and password into the URL. For example:
|
| http://userid:passw...@www.anywhere.com/
| ftp://userid:passw...@ftp.anywhere.com/

I'm still having issues, though. I am trying to fetch some csv files from a
storage site used by my company, and I've tried the read.csv and
download.file commands. These are the error messages that pop up:

read.csv("ftp://userid:passw...@ftp.anywhere.com/data.csv")
Error in file(file, "rt") : cannot open the connection
download.file("ftp://userid:passw...@ftp.anywhere.com/data.csv",
"C:/data.csv")
trying URL 'ftp://userid:passw...@ftp.anywhere.com/data.csv'
Error in download.file("ftp://userid:passw...@ftp.anywhere.com/data.csv", :
cannot open URL 'ftp://userid:passw...@ftp.anywhere.com/data.csv'

Am I leaving out any important options from these commands, that would allow
me to access the site if I include them? When I type the URL into Firefox
the same way I have entered it into R, I get the files I need. But for my
particular project, I am going to have to automate the process.

Obviously these are not my real userID, password, or website name. In case
it is relevant, I am trying to access files that store information on the
positions in my company's stock portfolio; these files are stored on our
brokerage firm's website.

--
View this message in context:
http://r.789695.n4.nabble.com/Accessing-files-on-password-protected-FTP-sites-tp2286862p2287373.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] SARIMA model

2010-07-13 Thread FMH

Dear All,

Could someone please advise me on the appropriate package for fitting a SARIMA
model?


Thanks
Fir




__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Continuing on with a loop when there's a failure

2010-07-13 Thread Josh B

Thanks again, David.

...but, alas, I still can't get it to work! Here's what I'm trying now:

for (i in 1:2) {
mod.poly3 <- try(lrm(x[,i] ~ pol(x1, 3) +  pol(x2, 3), data=x))
results[1,i] <- anova(mod.poly3)[1,3]
}


Here's what happens (from the console):

Error in fitter(X, Y, penalty.matrix = penalty.matrix, tol = tol, weights = weights,  : 
  NA/NaN/Inf in foreign function call (arg 1)
Error in UseMethod("anova") : 
  no applicable method for 'anova' applied to an object of class "try-error"

...so I still can't make my results matrix. Could I ask you for some specific 
code to make this work? I'm not that familiar with the syntax for try or 
tryCatch, and the help files for them are pretty bad, in my humble opinion.

I should clarify that I actually don't care about the failed runs per se. I just
want R to keep going in spite of them and give me my results matrix.




From: David Winsemius dwinsem...@comcast.net

Cc: R Help r-help@r-project.org
Sent: Mon, July 12, 2010 8:09:03 PM
Subject: Re: [R] Continuing on with a loop when there's a failure


On Jul 12, 2010, at 6:18 PM, Josh B wrote:

 Hi R sages,
 
 Here is my latest problem. Consider the following toy example:
 
 x <- read.table(textConnection("y1 y2 y3 x1 x2
 indv.1 bagels donuts bagels 4 6
 indv.2 donuts donuts donuts 5 1
 indv.3 donuts donuts donuts 1 10
 indv.4 donuts donuts donuts 10 9
 indv.5 bagels donuts bagels 0 2
 indv.6 bagels donuts bagels 2 9
 indv.7 bagels donuts bagels 8 5
 indv.8 bagels donuts bagels 4 1
 indv.9 donuts donuts donuts 3 3
 indv.10 bagels donuts bagels 5 9
 indv.11 bagels donuts bagels 9 10
 indv.12 bagels donuts bagels 3 1
 indv.13 donuts donuts donuts 7 10
 indv.14 bagels donuts bagels 2 10
 indv.15 bagels donuts bagels 9 6"), header = TRUE)
 
 I want to fit a logistic regression of y1 on x1 and x2. Then I want to run a
 logistic regression of y2 on x1 and x2. Then I want to run a logistic regression
 of y3 on x1 and x2. In reality I have many more Y columns than simply y1,
 y2, and y3, so I must design a loop. Notice that y2 is invariant and thus it
 will fail. In reality, some y columns will fail for much more subtle reasons.
 Simply screening my data to eliminate invariant columns will not eliminate the
 problem.
 
 What I want to do is output a piece of the results from each run of the loop to
 a matrix. I want the loop to try each of my y columns, and not give up and stop
 running simply because a particular y column is bad. I want it to give me NA
 or something similar in my results matrix for the bad y columns, but I want it
 to keep going and give me good data for the good y columns.
 
 For instance:
 results <- matrix(nrow = 1, ncol = 3)
 colnames(results) <- c("y1", "y2", "y3")
 
 for (i in 1:2) {
 mod.poly3 <- lrm(x[,i] ~ pol(x1, 3) + pol(x2, 3), data=x)
 results[1,i] <- anova(mod.poly3)[1,3]
 }
 
 If I run this code, it gives up when fitting y2 because the y2 is bad. It
 doesn't even try to fit y3. Here's what my console shows:
 
 results
y1 y2 y3
 [1,] 0.6976063 NA NA
 
 As you can see, it gave up before fitting y3, which would have worked.
 
 How do I force my code to keep going through the loop, despite the rotten apples
 it encounters along the way?

?try

http://cran.r-project.org/doc/FAQ/R-FAQ.html#How-can-I-capture-or-ignore-errors-in-a-long-simulation_003f


(Doesn't only apply to simulations.)

 Exact code that gets the job done is what I am
 interested in. I am a post-doc -- I am not taking any classes. I promise this 
is
[[elided Yahoo spam]]

--
David Winsemius, MD
West Hartford, CT


  
[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Continuing on with a loop when there's a failure

2010-07-13 Thread Josh B

In my opinion the try and tryCatch commands are written and documented rather 
poorly. Thus I am not sure what to program exactly.

For instance, I could query mod.poly3 and use an if/then statement to proceed, 
but querying mod.poly3 is weird. For instance, here's the output when it fails:

> mod.poly3 <- try(lrm(x[,2] ~ pol(x1, 3) + pol(x2, 3), data=x))
Error in fitter(X, Y, penalty.matrix = penalty.matrix, tol = tol, weights = weights,  : 
  NA/NaN/Inf in foreign function call (arg 1)
> mod.poly3
[1] "Error in fitter(X, Y, penalty.matrix = penalty.matrix, tol = tol, weights = weights,  : \n  NA/NaN/Inf in foreign function call (arg 1)\n"
attr(,"class")
[1] "try-error"

...and here's the output when it succeeds:
> mod.poly3 <- try(lrm(x[,1] ~ pol(x1, 3) + pol(x2, 3), data=x))
> mod.poly3

Logistic Regression Model

lrm(formula = x[, 1] ~ pol(x1, 3) + pol(x2, 3), data = x)


Frequencies of Responses
bagels donuts 
10  5 

   Obs  Max Deriv Model L.R.   d.f.  P  C 
15  4e-04   3.37  6 0.7616   0.76 
   Dxy  Gamma  Tau-a R2  Brier  g 
  0.52   0.52  0.248  0.279  0.183  1.411 
gr gp 
   4.1  0.261 

  Coef S.E.Wald Z P 
Intercept -5.68583 5.23295 -1.09  0.2772
x1 1.87020 2.14635  0.87  0.3836
x1^2  -0.42494 0.48286 -0.88  0.3788
x1^3   0.02845 0.03120  0.91  0.3618
x2 3.49560 3.54796  0.99  0.3245
x2^2  -0.94888 0.82067 -1.16  0.2476
x2^3   0.06362 0.05098  1.25  0.2121

...so what exactly would I query to design my if/then statement?
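
One possible answer to the question above, offered as a hedged sketch: the thing to query is the class of the object returned by try(), using inherits(fit, "try-error"). To stay self-contained the example uses glm() from base R rather than lrm(), rebuilds the toy data inline, and introduces a hypothetical helper fit_one() that stops on an invariant response. None of these substitutions come from the thread, and deviance() merely stands in for whichever statistic is wanted:

```r
# Toy data with the same shape as in the thread; y2 is invariant
x <- data.frame(
  y1 = c("bagels", "donuts", "donuts", "donuts", "bagels", "bagels", "bagels",
         "bagels", "donuts", "bagels", "bagels", "bagels", "donuts", "bagels",
         "bagels"),
  y2 = rep("donuts", 15),
  x1 = c(4, 5, 1, 10, 0, 2, 8, 4, 3, 5, 9, 3, 7, 2, 9),
  x2 = c(6, 1, 10, 9, 2, 9, 5, 1, 3, 9, 10, 1, 10, 10, 6)
)
x$y3 <- x$y1

# Hypothetical stand-in for the lrm() call: errors when the response is invariant
fit_one <- function(yname, data) {
  y <- as.integer(data[[yname]] == "donuts")
  if (length(unique(y)) < 2) stop("response '", yname, "' is invariant")
  glm(y ~ x1 + x2, data = data, family = binomial)
}

ynames <- c("y1", "y2", "y3")
results <- matrix(NA_real_, nrow = 1, ncol = length(ynames),
                  dimnames = list(NULL, ynames))

for (yn in ynames) {
  fit <- try(fit_one(yn, x), silent = TRUE)
  # This is the check: a failed fit comes back as a "try-error" object
  if (inherits(fit, "try-error")) next   # leave NA in place and move on
  results[1, yn] <- deviance(fit)
}
results
```

The y2 entry stays NA while y1 and y3 are filled in, so the loop runs to the end despite the failure.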





From: David Winsemius dwinsem...@comcast.net
To: David Winsemius dwinsem...@comcast.net

Sent: Tue, July 13, 2010 9:09:04 AM
Subject: Re: [R] Continuing on with a loop when there's a failure


On Jul 13, 2010, at 9:04 AM, David Winsemius wrote:

 
 On Jul 13, 2010, at 8:47 AM, Josh B wrote:
 
 Thanks again, David.
 
[[elided Yahoo spam]]

(BTW, it did work.)

 Here's what I'm trying now:
 
 for (i in 1:2) {
mod.poly3 <- try(lrm(x[,i] ~ pol(x1, 3) + pol(x2, 3), data=x))
results[1,i] <- anova(mod.poly3)[1,3]
 }
 
 You need to do some programming.

(Or I suppose you could wrap both the lrm and the anova calls in try.)

 You did not get an error from the lrm but rather from the anova call because 
you tried to give the results of the try function to anova without first 
checking to see if an error had occurred.
 
 --David.
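
A hedged sketch of the other route mentioned above, wrapping the whole "fit, then extract" step so that any error turns into NA. It uses tryCatch() from base R; fit_and_extract() is a hypothetical stand-in for the lrm() plus anova() pair and fails on purpose for column 2:

```r
# Hypothetical stand-in for "fit the model, then pull out one number"
fit_and_extract <- function(i) {
  if (i == 2) stop("fit failed for column ", i)
  i * 10
}

results <- matrix(NA_real_, nrow = 1, ncol = 3,
                  dimnames = list(NULL, c("y1", "y2", "y3")))

for (i in 1:3) {
  # Any error raised inside fit_and_extract() is replaced by NA
  results[1, i] <- tryCatch(fit_and_extract(i), error = function(e) NA_real_)
}
results
```

Because the error handler returns a value, no separate check on the result is needed before storing it.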
 
 Here's what happens (from the console):
 
 Error in fitter(X, Y, penalty.matrix = penalty.matrix, tol = tol, weights = weights,  :
  NA/NaN/Inf in foreign function call (arg 1)
 Error in UseMethod("anova") :
  no applicable method for 'anova' applied to an object of class "try-error"
 
 ...so I still can't make my results matrix. Could I ask you for some specific
 code to make this work? I'm not that familiar with the syntax for try or
 tryCatch, and the help files for them are pretty bad, in my humble opinion.
 
 I should clarify that I actually don't care about the failed runs per se. I
 just want R to keep going in spite of them and give me my results matrix.
 
 From: David Winsemius dwinsem...@comcast.net

 Cc: R Help r-help@r-project.org
 Sent: Mon, July 12, 2010 8:09:03 PM
 Subject: Re: [R] Continuing on with a loop when there's a failure
 
 
 On Jul 12, 2010, at 6:18 PM, Josh B wrote:
 
  Hi R sages,
 
  Here is my latest problem. Consider the following toy example:
 
  x <- read.table(textConnection("y1 y2 y3 x1 x2
  indv.1 bagels donuts bagels 4 6
  indv.2 donuts donuts donuts 5 1
  indv.3 donuts donuts donuts 1 10
  indv.4 donuts donuts donuts 10 9
  indv.5 bagels donuts bagels 0 2
  indv.6 bagels donuts bagels 2 9
  indv.7 bagels donuts bagels 8 5
  indv.8 bagels donuts bagels 4 1
  indv.9 donuts donuts donuts 3 3
  indv.10 bagels donuts bagels 5 9
  indv.11 bagels donuts bagels 9 10
  indv.12 bagels donuts bagels 3 1
  indv.13 donuts donuts donuts 7 10
  indv.14 bagels donuts bagels 2 10
  indv.15 bagels donuts bagels 9 6"), header = TRUE)
 
  I want to fit a logistic regression of y1 on x1 and x2. Then I want to run a
  logistic regression of y2 on x1 and x2. Then I want to run a logistic regression
  of y3 on x1 and x2. In reality I have many more Y columns than simply y1,
  y2, and y3, so I must design a loop. Notice that y2 is invariant and thus it
  will fail. In reality, some y columns will fail for much more subtle reasons.
  Simply screening my data to eliminate invariant columns will not eliminate the
  problem.
 
  What I want to do is output a piece of the results from each run of the loop to
  a matrix. I want the loop to try each of my y columns, and not give up and stop
  running simply because a particular y column is bad. I want it to give me NA
  or something similar in my results matrix for the bad y columns, but I want it
  to keep going and give me good data for the good y columns.
 
  For

Re: [R] RODBC and Excel 2010 xlsx

On Tue, Jul 13, 2010 at 9:31 AM, Rodrigo Aluizio r.alui...@gmail.com wrote:
 Hi List, just to know if the issue is only a problem of mine or if it is a
 general issue due to the new MS Office pack. I'm using R 2.11.1 32 bits in a
 Windows 7 x64 with the MS office 2010 x64 installed. I can import .xls files
 normally (the same way I did with my Excel 2007 32 bits). But the function
 odbcConnectExcel2007 isn't able to import .xlsx files now that I have the
 new version of the Office package.

 It gives me the following warning message, which make impossible the
 importing process through sqlFetch:
 Warning messages:
 1: In odbcDriverConnect(con, tabQuote = c("[", "]"), ...) :
  [RODBC] ERROR: state IM002, code 0, message [Microsoft][ODBC Driver
 Manager] Nome da fonte de dados não encontrado e nenhum driver padrão
 especificado (Source name not found and no default driver specified)
 2: In odbcDriverConnect(con, tabQuote = c("[", "]"), ...) :
  ODBC connection failed

 I'm obviously bypassing it converting my .xlsx files to .xls.
 Well the question is simple. Is this an expected issue, like the one when
 the xlsx format was released and it will be worked out, or I'm having and
 specific problem at one of my system components (drivers)?

 Thank you very much for the attention.


I suspect it's a driver issue, but you might want to look over the variety
of methods for reading in Excel spreadsheets here:
http://rwiki.sciviews.org/doku.php?id=tips:data-io:ms_windowss=excel

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] interpretation of svm models with the e1071 package

2010-07-13 Thread Steve Lianoglou

Hi,

On Sat, Jul 10, 2010 at 12:35 AM, Noah Silverman
n...@smartmediacorp.com wrote:
 Steve,

 Couldn't he also just use the decision.value property to see the equivalent
 of t(x) %*% b for each row?

I don't follow what you're saying. What is this the equivalent of?

What's b here? The bias/offset?

-- 
Steve Lianoglou
Graduate Student: Computational Systems Biology
 | Memorial Sloan-Kettering Cancer Center
 | Weill Medical College of Cornell University
Contact Info: http://cbio.mskcc.org/~lianos/contact

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Fast string comparison

2010-07-13 Thread Romain Francois



Hi Matt,

I think there are some confusing factors in your results.


system.time(strcmp(strings[-1], strings[-1e5]))

would also include the time required to perform both subscripting 
(strings[-1] and strings[-1e5] ) which actually takes some time.
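
To make the point about subscripting concrete, here is a small base-R sketch that times the two steps separately. It uses 1e4 strings instead of the 1e5 in the thread so that it runs quickly; that size and the seed are arbitrary choices for the illustration, and timings will differ from machine to machine:

```r
set.seed(42)
n <- 1e4  # smaller than the 1e5 used in the thread, to keep the sketch quick
strings <- replicate(n, paste(sample(letters, 100, replace = TRUE),
                              collapse = ""))

# Time the subscripting on its own ...
t_subset <- system.time({
  lhs <- strings[-1]
  rhs <- strings[-n]
})

# ... and the comparison on its own, on vectors that already exist
t_compare <- system.time(eq <- lhs == rhs)

print(t_subset)
print(t_compare)
```

Comparing pre-built vectors keeps the cost of creating them out of the number reported for the comparison.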



Also, you do have a bit of overhead due to the use of STRING_ELT and the 
write barrier.



I've included below a version that uses R internals so that you get the 
fast (but you have to understand the risks, etc ...) version of 
STRING_ELT using the plugin system of inline.


library(inline)
code <- "
SEXP ans;
int i, len, *cans;
if(!isString(s1) || !isString(s2))
error(\"invalid arguments\");
len = length(s1)>length(s2)?length(s2):length(s1);
PROTECT(ans = allocVector(INTSXP, len));
cans = INTEGER(ans);
for(i = 0; i < len; i++)
cans[i] = strcmp(CHAR(STRING_ELT(s1,i)),
 CHAR(STRING_ELT(s2,i)));
UNPROTECT(1);
return ans;
"
sig <- signature(s1="character", s2="character")
strcmp <- cfunction(sig, code)

strings <- replicate(1e5, paste(sample(letters, 100, rep = T), collapse = ""))



lhs <- strings[-1]
rhs <- strings[-1e5]
system.time( lhs == rhs )
system.time(strcmp( lhs, rhs) )

library(inline)
settings <- getPlugin( "default" )
settings$includes <- paste( "#define USE_RINTERNALS", settings$includes, 
collapse = "\n" )

code2 <- "
SEXP ans;
int i, len, *cans;
if(!isString(s1) || !isString(s2))
error(\"invalid arguments\");
len = length(s1)>length(s2)?length(s2):length(s1);
PROTECT(ans = allocVector(INTSXP, len));
cans = INTEGER(ans);
for(i = 0; i < len; i++)
cans[i] = strcmp(CHAR(STRING_ELT(s1,i)),
 CHAR(STRING_ELT(s2,i)));
UNPROTECT(1);
return ans;
"
sig <- signature(s1="character", s2="character" )
strcmp2 <- cxxfunction(sig, code2, settings = settings)
system.time(strcmp2( lhs, rhs) )



I get:

$ Rscript strings.R
Loading required package: methods
   user  system elapsed
  0.002   0.000   0.002
   user  system elapsed
  0.004   0.000   0.005
   user  system elapsed
  0.003   0.000   0.003

Romain


On 13/07/10 15:24, Matt Shotwell wrote:


On Tue, 2010-07-13 at 01:42 -0400, Hadley Wickham wrote:

strings <- replicate(1e5, paste(sample(letters, 100, rep = T), collapse = ""))
system.time(strings[-1] == strings[-1e5])
#   user  system elapsed
#  0.016   0.000   0.017

So it takes ~1/100 of a second to do ~100,000 string comparisons. You
need to provide a reproducible example that illustrates why you think
string comparisons are slow.


Here's a vectorized alternative to '==' for strings, with minimal
argument checking or result conversion. I haven't looked at the
corresponding R source code, it may be similar:

library(inline)
code <- "
 SEXP ans;
 int i, len, *cans;
 if(!isString(s1) || !isString(s2))
 error(\"invalid arguments\");
 len = length(s1)>length(s2)?length(s2):length(s1);
 PROTECT(ans = allocVector(INTSXP, len));
 cans = INTEGER(ans);
 for(i = 0; i < len; i++)
 cans[i] = strcmp(CHAR(STRING_ELT(s1,i)),
  CHAR(STRING_ELT(s2,i)));
 UNPROTECT(1);
 return ans;
"
sig <- signature(s1="character", s2="character")
strcmp <- cfunction(sig, code)



system.time(strings[-1] == strings[-1e5])

user  system elapsed
   0.036   0.000   0.035

system.time(strcmp(strings[-1], strings[-1e5]))

user  system elapsed
   0.032   0.000   0.034

That's pretty fast, though I seem to be working with a slower system
than Hadley. It's hard to see how this could be improved, except maybe
by caching results of string comparisons.

-Matt



Hadley


On Tue, Jul 13, 2010 at 6:52 AM, Ralf B ralf.bie...@gmail.com wrote:

I am asking this question because String comparison in R seems to be
awfully slow (based on profiling results) and I wonder if perhaps '=='
alone is not the best one can do. I did not ask for anything
particular and I don't think I need to provide a self-contained source
example for the question. So, to re-phrase my question, are there more
(runtime) effective ways to find out if two strings (about 100-150
characters long) are equal?

Ralf






On Sun, Jul 11, 2010 at 2:37 PM, Sharpie ch...@sharpsteen.net wrote:



Ralf B wrote:


What is the fastest way to compare two strings in R?

Ralf



Which way is not fast enough?

In other words, are you asking this question because profiling showed one of
R's string comparison operations is causing a massive bottleneck in your
code? If so, which one and how are you using it?

-Charlie



--
Romain Francois
Professional R Enthusiast
+33(0) 6 28 91 30 30
http://romainfrancois.blog.free.fr
|- http://bit.ly/bc8jNi : Rcpp 0.8.4
|- http://bit.ly/dz0RlX : bibtex 0.2-1
`- http://bit.ly/a5CK2h : Les estivales 2010

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the

[R] Extract Clusters from Biclust Object - writeBiclusterResults

2010-07-13 Thread delfin13


## Update: Solution ##
Dear all,
just in case someone has the same question: I found the method
writeBiclusterResults. It prints the results of all modules together in one
file, containing the gene names and array/experiment names. It does not
contain the values, however, so these have to be parsed by yourself from the
original data file.
-- 
View this message in context: 
http://r.789695.n4.nabble.com/Extract-Clusters-from-Biclust-Object-tp2286066p2287441.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] Dicretizing a normal distribution to predefined bands

2010-07-13 Thread stefano.m


Dear useRs,

I am facing the following problem in R and hope you can help me. I want to
discretize a normal distribution to 4 predefined bands. The  bands are
1,2,10 and 20. In order to maintain the symmetric shape and the mean of the
density I need to cut off all negative values and the corresponding part on
the positive axis and allocate the mass taken away proportionally on the
remaining support. I tried the function discretize from the actuar package,
but I am not sure I properly define the step and range. I am sorry for the
maybe trivial question and thanks in advance for any help!
mary

-- 
View this message in context: 
http://r.789695.n4.nabble.com/Dicretizing-a-normal-distribution-to-predefined-bands-tp2287455p2287455.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] interpretation of svm models with the e1071 package

2010-07-13 Thread Steve Lianoglou

Hi,

On Mon, Jul 12, 2010 at 4:55 AM, manuel.martin
manuel.mar...@orleans.inra.fr wrote:
 On 07/10/2010 04:11 AM, Steve Lianoglou wrote:
 On Fri, Jul 9, 2010 at 12:15 PM, manuel.martin
 manuel.mar...@orleans.inra.fr  wrote:
snip


 Dear all,

 after having calibrated a svm model through the svm() command of the
 e1071
 package, is there a way to
 i) represent the modeled relationships between the y and X variables
 (response variable vs. predictors)?


 Can you explain a bit more ... how do you want them represented?


 I was thinking of a simple ŷ = fi(Xi) plot, with fi resulting from the fitted svm
 model. Xi is the predictor, among the whole set of predictors X, whose
 relationship with the response one wishes to see.
 For boosted regression trees, which I am more familiar with, this fi
 function is estimated by averaging the effects of all predictors but Xi, and
 plotting how ŷ varies as Xi does.

I still think you might be able to get some mileage out of calculating
your W vector and looking at the values in each of its
coordinates/bins.

I think one problem with trying to come up with the plot you are after
is that, depending on the choice of kernel used for your SVM, rigging
up such an fi(Xi) plot might not be as straightforward as you might
think, since kernels can manipulate your feature space in fun ways.

There's some literature out there about how to extract
meaning/features from an SVM model. Perhaps you can search through
some of that to help get some ideas.

-- 
Steve Lianoglou
Graduate Student: Computational Systems Biology
 | Memorial Sloan-Kettering Cancer Center
 | Weill Medical College of Cornell University
Contact Info: http://cbio.mskcc.org/~lianos/contact

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] Batch file export

2010-07-13 Thread Michael Haenlein

Dear all,

I have a code that generates data vectors within R. For example assume:
z <- rlnorm(1000, meanlog = 0, sdlog = 1)

Every time a vector has been generated I would like to export it into a csv
file. So my idea is something as follows:

for (i in 1:100) {
z <- rlnorm(1000, meanlog = 0, sdlog = 1)
write.csv(z, "c:/z_i.csv")
}

Where z_i.csv is a filename that is related to the run (e.g. z_001.csv,
z_002.csv, ...).

Could anyone please advise me on the most convenient way of doing this?

Thanks very much in advance,

Michael

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] SAS Proc summary/means as a R function

2010-07-13 Thread Duncan Murdoch


On 13/07/2010 8:39 AM, Roger Deangelis wrote:

Thanks Richard and Erik,

I hate to buy the book and not find the solution to the following:

proc.means <- function() {
   deparse(match.call()[-1])
}

proc.means(this is a sentence)

Error: unexpected symbol in "proc.means(this is"


One possible solution would be to 'peek' into the memory buffer that holds
the
function arguments. 


It is easy to replicate the 'dataset' output for many SAS procs(ie
transpose, freq, summary, means...)
I am not interested in 'report writing in R'.

The hard part is parsing the SAS syntax, I wish R had a drop down to PERL.

perl on;

   some perl code

perl off;
  


It would not be hard to write something like that.  The syntax would be

perl(
   some perl code
)

where the function is something like

perl <- function(code) {
  f <- tempfile()
  writeLines(code, f)
  system(paste("perl", f))
}

You do need to watch out for escapes in the text, or be careful about 
what quotes you use, e.g.


> perl('
+   print "Hello World\n";
+ ')
Hello World

Similarly for SAS, but I don't know how you tell SAS to process a file.

Duncan Murdoch


also

sas on;

  some SAS code

sas off;

The purpose of parmbuff is to turn off R's scanning and resolution of
function arguments
and just provide the bare text between '('  and ')' in the function call.

This is a very powerful construct.

A function would provide something like

sas.on(


)




__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Batch file export

2010-07-13 Thread Nikhil Kaza


write.csv(z, paste("c:/z_", i, ".csv", sep=''))

You will have to modify this to prepend 0s.
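
One way to get the zero-padded names, as a minimal sketch: sprintf() with the "%03d" format pads the run number to three digits. The example writes to a temporary directory and runs only three iterations; both are assumptions made so that it runs anywhere, not part of the original question:

```r
out_dir <- tempdir()  # assumption: a temporary directory instead of c:/

for (i in 1:3) {
  z <- rlnorm(1000, meanlog = 0, sdlog = 1)
  # sprintf("z_%03d.csv", 1) gives "z_001.csv"
  write.csv(z, file.path(out_dir, sprintf("z_%03d.csv", i)))
}

list.files(out_dir, pattern = "^z_[0-9]{3}\\.csv$")
```

For more than 999 runs the width in the format would need to grow, for example "%04d".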


Nikhil Kaza
Asst. Professor,
City and Regional Planning
University of North Carolina

nikhil.l...@gmail.com

On Jul 13, 2010, at 10:03 AM, Michael Haenlein wrote:


Dear all,

I have a code that generates data vectors within R. For example  
assume:

z <- rlnorm(1000, meanlog = 0, sdlog = 1)

Every time a vector has been generated I would like to export it  
into a csv

file. So my idea is something as follows:

for (i in 1:100) {
z <- rlnorm(1000, meanlog = 0, sdlog = 1)
write.csv(z, "c:/z_i.csv")
}

Where z_i.csv is a filename that is related to the run (e.g.  
z_001.csv,

z_002.csv, ...).

Could anyone please advice me on the most convenient way of doing  
this?


Thanks very much in advance,

Michael

[[alternative HTML version deleted]]



__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] RODBC and Excel 2010 xlsx

2010-07-13 Thread Prof Brian Ripley


On Tue, 13 Jul 2010, Rodrigo Aluizio wrote:


Hi List, just to know if the issue is only a problem of mine or if it is a
general issue due to the new MS Office pack. I'm using R 2.11.1 32 bits in a


It's a Microsoft muddle, covered in the RODBC manual for the next 
release.  Note that R-sig-db is the right list for questions about 
RODBC, not R-help.


Simply, if you have MS Office 2007/2010 installed, you can only have 
its ODBC drivers of the same architecture installed.  Nothing to do 
with RODBC nor R, and something MS omits to mention on the download 
page for the drivers.


Also, the ODBC data sources manager for x64 Windows tells you only 
about 64-bit drivers and DSNs.  There is a different manager for 
32-bit, but it is rather hidden.


The simplest thing to do is to use 64-bit R.  I found this out the 
other way round: I have 32-bit Office installed and could not install 
the 64-bit ODBC drivers.
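
Because the ODBC drivers have to match the architecture of the R process, it can help to confirm which build of R is running before trying to connect. A minimal sketch using base R only (no RODBC calls); the variable names are illustrative.

```r
# Report whether the running R process is a 32- or 64-bit build.
# .Machine$sizeof.pointer is 4 on 32-bit builds and 8 on 64-bit builds.
r_bits <- .Machine$sizeof.pointer * 8
r_arch <- R.version$arch   # e.g. "x86_64" or "i386"
cat("R is running as a", r_bits, "bit process on", r_arch, "\n")
```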


The pre-release of RODBC at
http://www.stats.ox.ac.uk/pub/R/RODBC_1.3-2.tar.gz
is only available as a source package, but you can unpack it and read 
the updated manual.



Windows 7 x64 with the MS office 2010 x64 installed. I can import .xls files
normally (the same way I did with my Excel 2007 32 bits). But the function
odbcConnectExcel2007 isn't able to import .xlsx files now that I have the
new version of the Office package.

It gives me the following warning messages, which make importing
through sqlFetch impossible:
Warning messages:
1: In odbcDriverConnect(con, tabQuote = c("[", "]"), ...) :
 [RODBC] ERROR: state IM002, code 0, message [Microsoft][ODBC Driver
Manager] Nome da fonte de dados não encontrado e nenhum driver padrão
especificado (Source name not found and no default driver specified)
2: In odbcDriverConnect(con, tabQuote = c("[", "]"), ...) :
 ODBC connection failed

I'm obviously working around it by converting my .xlsx files to .xls.
The question is simple: is this an expected issue, like the one when
the xlsx format was released, that will be worked out, or am I having a
specific problem with one of my system components (drivers)?

Thank you very much for the attention.

Rodrigo Aluizio





--
Brian D. Ripley,                  rip...@stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595


Re: [R] How to select the column header with \Sexpr{}

2010-07-13 Thread Ista Zahn

Hi Felipe,
The problem has nothing to do with Sweave or \Sexpr. The problem is
that by the time you call \Sexpr report is a matrix, and you cannot
access the column names of a matrix with names(). You need to use
colnames() or convert the matrix to a data.frame.

Perhaps a true useR can write R code in a Sweave file without checking
it, but for mere mortals it is best to evaluate the R code in an
interactive session to make sure it works before asking Sweave to
insert it into your .tex file. If you had tried to evaluate
names(report)[1] in an interactive session you would have discovered
your problem immediately.
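
The difference between names() and colnames() can be checked interactively with a small stand-in for the report object. This is a sketch only: the two-column data frame is illustrative, as.matrix() stands in for the t(apply(...)) step, and the column-wise sub() at the end is one assumed way to keep the object a data frame.

```r
# Small stand-in for the report data frame (values are illustrative).
report <- data.frame(ID_Date = c("3/12/2010", "3/13/2010"),
                     Run1    = c("33 (119 ? 119)", "n (0 ? 0)"),
                     stringsAsFactors = FALSE)
names(report)[1]          # works on a data frame

# After conversion to a matrix, names() no longer gives column names.
m <- as.matrix(report)
names(m)                  # NULL
colnames(m)[1]            # use colnames() for a matrix

# Alternative: replace "?" column by column and stay a data frame.
fixed <- report
fixed[] <- lapply(report, function(col) sub("?", "-", col, fixed = TRUE))
names(fixed)[1]
```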

Best,
Ista

On Tue, Jul 13, 2010 at 4:15 AM, Felipe Carrillo
mazatlanmex...@yahoo.com wrote:
 I had tried that earlier and didn't work either, I probably have \Sexpr in the
 wrong place. See example:
 Column one header gets blank:

 \documentclass[11pt]{article}
 \usepackage{longtable,verbatim,ctable}
 \usepackage{longtable,pdflscape}
 \usepackage{fmtcount,hyperref}
 \usepackage{fullpage}
 \title{United States}
 \begin{document}
 \setkeys{Gin}{width=1\textwidth}
 \maketitle
 <<echo=F,results=hide>>=
 report <- structure(list(Date = c("3/12/2010", "3/13/2010", "3/14/2010",
 "3/15/2010"), Run1 = c("33 (119 ? 119)", "n (0 ? 0)", "893 (110 ? 146)",
 "140 (111 ? 150)"), Run2 = c("33 (71 ? 71)", "n (0 ? 0)",
 "337 (67 ? 74)", "140 (68 ? 84)"), Run3 = c("890 (32 ? 47)",
 "n (0 ? 0)", "10,602 (32 ? 52)", "2,635 (34 ? 66)"), Run4 = c("0 ( ? )",
 "n (0 ? 0)", "0 ( ? )", "0 ( ? )"), Run4 = c("0 ( ? )", "n (0 ? 0)",
 "0 ( ? )", "0 ( ? )")), .Names = c("ID_Date", "Run1", "Run2",
 "Run3", "Run4", "Run5"), row.names = c(NA, 4L), class = "data.frame")
 require(stringr)
 report <- t(apply(report, 1, function(x) {str_replace(x, "\\?", "-")}))
 #report
 #latex(report,file="")
 @
 \begin{landscape}
 \begin{table}[!tbp]
  \begin{center}
  \begin{tabular}{ll}\hline\hline
 \multicolumn{1}{c}{\Sexpr{names(report)[1]}}&   # Using \Sexpr here
 \multicolumn{1}{c}{Run1}&
 \multicolumn{1}{c}{Run2}&
 \multicolumn{1}{c}{Run3}&
 \multicolumn{1}{c}{Run4}&
 \multicolumn{1}{c}{Run5}\tabularnewline
 \hline
 1&3/12/2010&33 (119 ? 119)&33 (71 ? 71)&890 (32 ? 47)&0 ( ? )&0 ( ? )\tabularnewline
 2&3/13/2010&n (0 ? 0)&n (0 ? 0)&n (0 ? 0)&n (0 ? 0)&n (0 ? 0)\tabularnewline
 3&3/14/2010&893 (110 ? 146)&337 (67 ? 74)&10,602 (32 ? 52)&0 ( ? )&0 ( ? )\tabularnewline
 4&3/15/2010&140 (111 ? 150)&140 (68 ? 84)&2,635 (34 ? 66)&0 ( ? )&0 ( ? )\tabularnewline
 \hline
 \end{tabular}
 \end{center}
 \end{table}
 \end{landscape}
 \end{document}

 Felipe D. Carrillo
 Supervisory Fishery Biologist
 Department of the Interior
 US Fish & Wildlife Service
 California, USA



 - Original Message 
 From: David Winsemius dwinsem...@comcast.net
 To: Felipe Carrillo mazatlanmex...@yahoo.com
 Cc: Duncan Murdoch murdoch.dun...@gmail.com; r-h...@stat.math.ethz.ch
 Sent: Mon, July 12, 2010 3:14:49 PM
 Subject: Re: [R] How to select the column header with \Sexpr{}


 On Jul 12, 2010, at 5:45 PM, Felipe Carrillo wrote:

  Thanks for the quick reply, Duncan.
  I don't think I explained myself well. I have a dataset named report,
  and my column headers are run1, run2, run3, run4 and so on.
 
  I know how to access the data below those columns with
  \Sexpr{report[1,1]}, \Sexpr{report[1,2]} and so on, but I can't access
  my column headers with \Sexpr{} because I can't find a way to reference
  run1, run2, run3 and run4.
  Sorry if I am not explaining myself very well.

 Wouldn't this just be:

 \Sexpr{names(report)}  # ?  or perhaps you want specific items in that
 vector?

 \Sexpr{names(report)[1]}, \Sexpr{names(report)[2]}, etc

 --David.
 
 
 
 
  - Original Message 
  From: Duncan Murdoch murdoch.dun...@gmail.com
  To: Felipe Carrillo mazatlanmex...@yahoo.com
  Cc: r-h...@stat.math.ethz.ch
  Sent: Mon, July 12, 2010 2:18:15 PM
  Subject: Re: [R] How to select the column header with \Sexpr{}
 
  On 12/07/2010 5:10 PM, Felipe Carrillo wrote:
  Hi:
  Since I work with a few different fish runs, my column headers change
  every time I start a new year. I have been using \Sexpr{} for my rows
  and columns, and now I am trying to use it for my report column headers.
  \Sexpr{1,1} is row 1 column 1; what can I use for headers? I tried
  \Sexpr{0,1} but Sweave didn't like it. Thanks in advance for any hints.
 
 
  \Sexpr takes an R expression, and inserts the first element of the result
  into your text.  Using just "0,1" (not including the quotes) is not a
  valid R expression.
 
  You need to use paste() or some other function to construct the label you
  want to put in place, e.g. \Sexpr{paste(0,1,sep=",")} will give you "0,1".
 
  Duncan Murdoch
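
Both halves of this point can be checked at the console. A minimal sketch with base R only; the object names bare and label are illustrative.

```r
# "0,1" on its own does not parse as an R expression ...
bare <- tryCatch(parse(text = "0,1"), error = function(e) "parse error")
bare

# ... whereas paste() builds the label as a character string.
label <- paste(0, 1, sep = ",")
label
```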
 
 
 
 
 

 David Winsemius, MD
 West Hartford, CT






Re: [R] Continuing on with a loop when there's a failure



On Jul 13, 2010, at 9:24 AM, Josh B wrote:

In my opinion the try and tryCatch commands are written and  
documented rather poorly. Thus I am not sure what to program exactly.


Didn't you see the silent parameter? It seems to be documented
fairly clearly to me.


The testing of try at the console is not going to be very  
illuminating, since it really only has value within a function that  
you want to continue despite an error. try() WILL provide that  
facility but _you_ need to decide what you do with the information it  
returns, which in the case of its use with the default silent=FALSE is  
just the error message itself.
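
What try() hands back can be inspected directly. A small sketch with base R only; log("a") is merely a stand-in for any call that fails, and the names bad and good are illustrative.

```r
# A failing call wrapped in try() returns an object of class "try-error"
# instead of stopping; silent = TRUE suppresses printing of the message.
bad  <- try(log("a"), silent = TRUE)   # log() of a string is an error
good <- try(log(10),  silent = TRUE)   # succeeds, so good is just a number

class(bad)
inherits(bad, "try-error")
inherits(good, "try-error")
cat(bad)                               # the captured error message text
```

Testing the class of the returned object is what lets surrounding code decide whether to use the result or skip it.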





For instance, I could query mod.poly3 and use an if/then statement  
to proceed,


So why didn't you? A good result would be signaled by:
"lrm" %in% class(mod.poly3)


--
David.

but querying mod.poly3 is weird. For instance, here's the output  
when it fails:


> mod.poly3 <- try(lrm(x[,2] ~ pol(x1, 3) + pol(x2, 3), data=x))
Error in fitter(X, Y, penalty.matrix = penalty.matrix, tol = tol,
weights = weights,  :
  NA/NaN/Inf in foreign function call (arg 1)
> mod.poly3
[1] "Error in fitter(X, Y, penalty.matrix = penalty.matrix, tol = tol, weights = weights,  : \n  NA/NaN/Inf in foreign function call (arg 1)\n"
attr(,"class")
[1] "try-error"

...and here's the output when it succeeds:
> mod.poly3 <- try(lrm(x[,1] ~ pol(x1, 3) + pol(x2, 3), data=x))
> mod.poly3

Logistic Regression Model

lrm(formula = x[, 1] ~ pol(x1, 3) + pol(x2, 3), data = x)


Frequencies of Responses
bagels donuts
    10      5

       Obs  Max Deriv Model L.R.       d.f.          P          C
        15      4e-04       3.37          6     0.7616       0.76
       Dxy      Gamma      Tau-a         R2      Brier          g
      0.52       0.52      0.248      0.279      0.183      1.411
        gr         gp
       4.1      0.261

          Coef     S.E.    Wald Z P
Intercept -5.68583 5.23295 -1.09  0.2772
x1         1.87020 2.14635  0.87  0.3836
x1^2      -0.42494 0.48286 -0.88  0.3788
x1^3       0.02845 0.03120  0.91  0.3618
x2         3.49560 3.54796  0.99  0.3245
x2^2      -0.94888 0.82067 -1.16  0.2476
x2^3       0.06362 0.05098  1.25  0.2121

...so what exactly would I query to design my if/then statement?

From: David Winsemius dwinsem...@comcast.net
To: David Winsemius dwinsem...@comcast.net
Cc: Josh B josh...@yahoo.com; R Help r-help@r-project.org
Sent: Tue, July 13, 2010 9:09:04 AM
Subject: Re: [R] Continuing on with a loop when there's a failure


On Jul 13, 2010, at 9:04 AM, David Winsemius wrote:


 On Jul 13, 2010, at 8:47 AM, Josh B wrote:

 Thanks again, David.

 ...but, alas, I still can't get it work!

(BTW, it did work.)

 Here's what I'm trying now:

 for (i in 1:2) {
     mod.poly3 <- try(lrm(x[,i] ~ pol(x1, 3) + pol(x2, 3), data=x))
     results[1,i] <- anova(mod.poly3)[1,3]
 }

 You need to do some programming.

(Or I suppose you could wrap both the lrm and the anova calls in try.)

 You did not get an error from the lrm but rather from the anova  
call because you tried to give the results of the try function to  
anova without first checking to see if an error had occurred.
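
The check described here can be sketched as follows. Since lrm() comes from an add-on package, the sketch substitutes a hypothetical stand-in, fit_one(), that fails when the response has only one level (as y2 does in the toy data); the small data frame and the name loop_results are likewise assumptions for illustration.

```r
# Hypothetical stand-in for lrm(): errors when the response has one level.
fit_one <- function(y) {
  if (length(unique(y)) < 2) stop("response has only one level")
  mean(y == "bagels")   # placeholder for the statistic of interest
}

x <- data.frame(y1 = c("bagels", "donuts", "donuts", "bagels"),
                y2 = rep("donuts", 4),
                stringsAsFactors = FALSE)

loop_results <- matrix(NA_real_, nrow = 1, ncol = 2,
                       dimnames = list(NULL, c("y1", "y2")))
for (i in 1:2) {
  fit <- try(fit_one(x[, i]), silent = TRUE)
  if (inherits(fit, "try-error")) next   # skip failed fits, keep looping
  loop_results[1, i] <- fit              # only reached when the fit worked
}
loop_results
```

Failed runs simply leave NA in the results matrix, so the loop finishes regardless of how many fits fail.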


 --David.

 Here's what happens (from the console):

 Error in fitter(X, Y, penalty.matrix = penalty.matrix, tol = tol,
 weights = weights,  :
   NA/NaN/Inf in foreign function call (arg 1)
 Error in UseMethod("anova") :
   no applicable method for 'anova' applied to an object of class
 "try-error"


 ...so I still can't make my results matrix. Could I ask you for  
some specific code to make this work? I'm not that familiar with the  
syntax for try or tryCatch, and the help files for them are pretty  
bad, in my humble opinion.


 I should clarify that I actually don't care about the failed runs  
per se. I just want R to keep going in spite of them and give me my  
results matrix.
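
For the case where failed runs can simply be ignored, tryCatch() offers a compact alternative: the error handler returns NA and the loop carries on. This is a sketch under assumptions, not the code used in the thread: fit_stat() is a hypothetical stand-in for the lrm()/anova() step, and the three short response vectors are illustrative.

```r
# Stand-in for a model fit that can fail (hypothetical helper, not lrm()).
fit_stat <- function(y) {
  if (length(unique(y)) < 2) stop("response has only one level")
  mean(y == "bagels")
}

responses <- list(y1 = c("bagels", "donuts", "donuts", "bagels"),
                  y2 = rep("donuts", 4),
                  y3 = c("bagels", "bagels", "donuts", "bagels"))

# tryCatch() turns an error into NA, so every response gets an entry.
results <- vapply(responses, function(y) {
  tryCatch(fit_stat(y), error = function(e) NA_real_)
}, numeric(1))
results
```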


 From: David Winsemius dwinsem...@comcast.net
 To: Josh B josh...@yahoo.com
 Cc: R Help r-help@r-project.org
 Sent: Mon, July 12, 2010 8:09:03 PM
 Subject: Re: [R] Continuing on with a loop when there's a failure


 On Jul 12, 2010, at 6:18 PM, Josh B wrote:

  Hi R sages,
 
  Here is my latest problem. Consider the following toy example:
 
  x <- read.table(textConnection("y1 y2 y3 x1 x2
  indv.1 bagels donuts bagels 4 6
  indv.2 donuts donuts donuts 5 1
  indv.3 donuts donuts donuts 1 10
  indv.4 donuts donuts donuts 10 9
  indv.5 bagels donuts bagels 0 2
  indv.6 bagels donuts bagels 2 9
  indv.7 bagels donuts bagels 8 5
  indv.8 bagels donuts bagels 4 1
  indv.9 donuts donuts donuts 3 3
  indv.10 bagels donuts bagels 5 9
  indv.11 bagels donuts bagels 9 10
  indv.12 bagels donuts bagels 3 1
  indv.13 donuts donuts donuts 7 10
  indv.14 bagels donuts bagels 2 10
  indv.15 bagels donuts bagels 9 6"), header = TRUE)
 
  I want to fit a logistic regression of y1 on x1 and x2. Then I want to
  run a logistic regression of y2 on x1 and x2. Then I want to run a
  logistic regression of y3 on x1 and x2. In reality I have many more Y

Re: [R] Continuing on with a loop when there's a failure



On Jul 13, 2010, at 10:26 AM, David Winsemius wrote:



On Jul 13, 2010, at 9:24 AM, Josh B wrote:

In my opinion the try and tryCatch commands are written and  
documented rather poorly. Thus I am not sure what to program exactly.


Didn't you see the silent parameter? It seems to be documented
fairly clearly to me.


The testing of try at the console is not going to be very  
illuminating, since it really only has value within a function that  
you want to continue despite an error. try() WILL provide that  
facility but _you_ need to decide what you do with the information  
it returns, which in the case of its use with the default  
silent=FALSE is just the error message itself.





For instance, I could query mod.poly3 and use an if/then statement  
to proceed,


So why didn't you? A good result would be signaled by:


rather: "lrm" %in% class(mod.poly3)



--
David.

[R] how to extract information from anova results

2010-07-13 Thread Luis Borda de Agua

Hi, 

I have used the instruction aov in the following manner:

res <- aov(qwe ~ asd)

when I typed res I get:
_
Call:
   aov(formula = qwe ~ asd)

Terms:
                      asd Residuals
Sum of Squares  0.0708704 0.5255957
Deg. of Freedom         1         8

Residual standard error: 0.2563191
Estimated effects may be unbalanced
_

I need to access the value of the Sum of Squares (i.e. I want another variable
to be equal to it, e.g. myvar <- Sum.of.Squares).
I tried names(res) to see which values are accessible, but I couldn't find the
Sum of Squares. I had a similar problem when I tried to access the p-value,
which can be readily SEEN using summary(res).

In general, is there an easy way to access the values generated by an R 
function?

Thank you, 

LBA


Re: [R] how to extract information from anova results



On Jul 13, 2010, at 10:35 AM, Luis Borda de Agua wrote:


Hi,

I have used the instruction aov in the following manner:

res <- aov(qwe ~ asd)

when I typed res I get:
_
Call:
   aov(formula = qwe ~ asd)

Terms:
                      asd Residuals
Sum of Squares  0.0708704 0.5255957
Deg. of Freedom         1         8

Residual standard error: 0.2563191
Estimated effects may be unbalanced
_

I need to access the value of the Sum of Squares (i.e. I want
another variable to be equal to it, e.g. myvar <- Sum.of.Squares).
I tried names(res) to see which values are accessible, but I
couldn't find the Sum of Squares. I had a similar problem when I
tried to access the p-value, which can be readily SEEN using
summary(res).


In general, is there an easy way to access the values generated by  
an R function?


When you typed res, the interpreter determined that it was of class
aov and dispatched it to the print method for objects of that class.
The list of print methods is accessed with methods(print), and it's a
long list. print.aov is asterisked (it is not exported), so you either
need to look at the function with:


getAnywhere(print.aov)

or, perhaps more directly, assign summary(res) to an object and
access its SS values.
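
A minimal sketch of the second route, assuming a single error stratum. The simulated qwe and asd below are stand-ins, since the original data were not posted, and the names tab, ss and p are illustrative.

```r
# Illustrative data: the original qwe and asd were not posted.
set.seed(1)
asd <- gl(2, 5)            # two-level factor, 5 observations each
qwe <- rnorm(10)           # response
res <- aov(qwe ~ asd)

tab <- summary(res)[[1]]   # first (here: only) table in the summary list
ss  <- tab[["Sum Sq"]]     # sums of squares: term first, residuals last
p   <- tab[["Pr(>F)"]][1]  # p-value for the asd term
ss
p
```

Indexing by column name and row position avoids depending on how the row labels of the summary table are formatted.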


--
David.



Thank you,

LBA



David Winsemius, MD
West Hartford, CT


Re: [R] How to select the column header with \Sexpr{}

2010-07-13 Thread Felipe Carrillo

Thanks Ista:
I see your point. I should extract the column names when the
dataset is first read, because at that point it is a data frame:
> report <- structure(list(Date = c("3/12/2010", "3/13/2010", "3/14/2010",
 "3/15/2010"), Run1 = c("33 (119 ? 119)", "n (0 ? 0)", "893 (110 ? 146)",
 "140 (111 ? 150)"), Run2 = c("33 (71 ? 71)", "n (0 ? 0)",
 "337 (67 ? 74)", "140 (68 ? 84)"), Run3 = c("890 (32 ? 47)",
 "n (0 ? 0)", "10,602 (32 ? 52)", "2,635 (34 ? 66)"), Run4 = c("0 ( ? )",
 "n (0 ? 0)", "0 ( ? )", "0 ( ? )"), Run4 = c("0 ( ? )", "n (0 ? 0)",
 "0 ( ? )", "0 ( ? )")), .Names = c("ID_Date", "Run1", "Run2",
 "Run3", "Run4", "Run5"), row.names = c(NA, 4L), class = "data.frame")
> str(report)
'data.frame':   4 obs. of  6 variables:
 $ ID_Date: chr  "3/12/2010" "3/13/2010" "3/14/2010" "3/15/2010"
 $ Run1   : chr  "33 (119 ? 119)" "n (0 ? 0)" "893 (110 ? 146)" "140 (111 ? 150)"
 $ Run2   : chr  "33 (71 ? 71)" "n (0 ? 0)" "337 (67 ? 74)" "140 (68 ? 84)"
 $ Run3   : chr  "890 (32 ? 47)" "n (0 ? 0)" "10,602 (32 ? 52)" "2,635 (34 ? 66)"
 $ Run4   : chr  "0 ( ? )" "n (0 ? 0)" "0 ( ? )" "0 ( ? )"
 $ Run5   : chr  "0 ( ? )" "n (0 ? 0)" "0 ( ? )" "0 ( ? )"
> names(report)[1]  # I can extract the column name here
[1] "Date"

But after I use stringr to convert the character '?' to '-',
'report' is no longer a data frame, and names() returns NULL when I try
to extract the column names.
I was not aware that \Sexpr{} only works on data frames; thanks for your help.




Re: [R] Substring function?

The high-level concept you need is called Regular Expressions.  R 
supports these through several functions, see ?regex .
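
For fixed substrings such as URL prefixes, grepl() with fixed = TRUE avoids having to escape regular-expression metacharacters like "?" and ".". The sketch below assumes shortened stand-in URLs and combines the matches for several signatures with a logical OR; the name matched is illustrative.

```r
# Shortened stand-in URLs; the real ones in the question are longer.
content <- data.frame(urls = c("http://www.google.com/search?q=stuff",
                               "http://search.yahoo.com/search?p=stuff"),
                      stringsAsFactors = FALSE)
search <- data.frame(signatures = c("http://www.google.com/search"),
                     stringsAsFactors = FALSE)

# One logical vector per signature, combined element-wise with OR.
matched <- Reduce(`|`, lapply(search$signatures, function(sig) {
  grepl(sig, content$urls, fixed = TRUE)   # literal substring match
}))
content[matched, , drop = FALSE]
```

Each signature costs one vectorised pass over the URLs, so thousands of strings against hundreds of signatures should be manageable, although that has not been measured here.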


Ralf B wrote:

Hi all,

I would like to detect all strings in the vector 'content' that
contain the strings from the vector 'search'. Here a code example:

content <- data.frame(urls=c(
"http://www.google.com/search?source=ighl=enrlz==q=stuffaq=faqi=g10aql=oq=gs_rfai=CrrIS3",
"http://search.yahoo.com/search;_ylt=Atvki9MVpnxuEcPmXLEWgMqbvZx4?p=stufftoggle=1"
))
search <- data.frame(signatures=c("http://www.google.com/search"))
subset(content, search$signatures %in% content$urls)

I am getting an error:

[1] urls
<0 rows> (or 0-length row.names)


What I would like to achieve is the return of
"http://www.google.com/search?source=ighl=enrlz==q=stuffaq=faqi=g10aql=oq=gs_rfai=CrrIS3".
Is that possible? In practice I would like to run this over 1000s of
strings in 'content' and 100s of strings in 'search'. Could I run into
performance issues with this approach and, if so, are there better
ways?

Best,
Ralf


Re: [R] how to extract information from anova results

2010-07-13 Thread Joshua Wiley

I think the easiest way is from calling anova() on your aov class
object.  For instance

y <- 1:10
x <- runif(10)
my.aov <- aov(y ~ x)
anova(my.aov)["Residuals", "Sum Sq"]
anova(my.aov)["x", "Pr(>F)"]

You can also extract these values from a call to summary(my.aov), but
that output is a list (even for an ANOVA with a single error stratum),
so you'd have to add [[1]] selecting the first (or if there were more
than one whichever you wanted) element of the list.

summary(my.aov)[[1]]["Residuals", "Sum Sq"]
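
On the more general question of finding what a fitted object contains, names() and colnames() are a reasonable first stop (str() gives a fuller dump). A small sketch reusing the toy model above; the seed and the name tab are assumptions added for reproducibility.

```r
set.seed(42)                 # assumed seed, only for reproducibility
y <- 1:10
x <- runif(10)
my.aov <- aov(y ~ x)

names(my.aov)                # components stored in the fitted object
tab <- anova(my.aov)         # a data frame with one row per term
colnames(tab)                # column labels to index by
myvar <- tab["Residuals", "Sum Sq"]
myvar
```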

Cheers,

Josh

-- 
Joshua Wiley
Ph.D. Student, Health Psychology
University of California, Los Angeles
http://www.joshuawiley.com/


[R] MplusAutomation

2010-07-13 Thread Gushta, Matthew

R list-
i have begun using the MplusAutomation package while piloting a large-scale
simulation (~200,000 replications). since the package takes advantage of the
DOS batch mode available in Mplus, each replication starts and activates a new
instance of a command prompt window. this effectively locks me out of my
computer for the duration of the simulation.

my question is this: can anyone suggest how i might pass the quiet command to
the DOS program? is there a way to generally specify this from R? or any
specific recommendations/experience with this package?

thanks,

--matthew
..
matthew m. gushta
american institutes for research
202.403.5079





Re: [R] Regarding R -installation

2010-07-13 Thread venkatesh bandaru

Dear Nuncio,


 My internet connection is working properly and I am running YaST as a
superuser, but I am getting the following error:



 *Problem* :

 Cannot access installation media

 http://download.opensuse.org/repositories/devel:languages:R:patched/openSUSE_11.2
  (Medium 1).
 Check whether the server is accessible

 Download (curl) error for '
 http://download.opensuse.org/repositories/devel:languages:R:patched/openSUSE_11.2/repodata/repomd.xml
 ':
 Error code: Connection failed
 Error message: couldn't connect to host


 yours truly,
 B.venkatesh,
 University of Hyderabad
 India.




Re: [R] isdst warning when rounding a range of time data: fix or suppress?

2010-07-13 Thread philippgrueber


Dear Clay, 
dear list,

I face the same problem when rounding POSIXct objects. Have you (or has
anybody) found an explanation meanwhile, or a way to work around this issue? 

Example:
opt<-options(digits.secs=3)
ts1<-as.POSIXct(c("2006-11-01 09:00:00.03", "2006-11-01 09:00:01",
"2006-11-01 09:00:01.0245", "2006-11-01 09:00:01.11","2006-11-01 09:00:03"),
tz="GMT")
ra1<-seq(2,6,1)
data<-data.frame(ts1,ra1)
data$lo1<-data$ts1==round.POSIXt(data$ts1,"secs")
data

Even though in this example all results are correct, is there a chance that
incorrect results are returned?

Thanks,Phil
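
One possible way around the warning, offered as a sketch rather than a confirmed fix, is to round the underlying seconds-since-epoch value and convert back, which sidesteps round.POSIXt() altogether. The names op, rounded and is_whole_second are illustrative.

```r
op <- options(digits.secs = 3)
ts1 <- as.POSIXct(c("2006-11-01 09:00:00.03", "2006-11-01 09:00:01",
                    "2006-11-01 09:00:01.0245", "2006-11-01 09:00:01.11",
                    "2006-11-01 09:00:03"), tz = "GMT")

# Round on the numeric representation, then rebuild the POSIXct object.
rounded <- as.POSIXct(round(as.numeric(ts1)), origin = "1970-01-01",
                      tz = "GMT")
is_whole_second <- ts1 == rounded
data.frame(ts1, rounded, is_whole_second)

options(op)  # restore the previous digits.secs setting
```

The same idea extends to hours by dividing and multiplying by 3600, although that only lines up with clock hours in time zones whose UTC offset is a whole number of hours. Wrapping the original call in suppressWarnings() is the blunter alternative.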







Clay Heaton wrote:
 
 Hi, I'm working with timeseries data. The values are every 5 seconds and
 each series can last up to 4-5 days.
 
 To generate the x-axis labels, I'm doing the following:
 
 =
 # Variable for displaying hours on the x-axis
 rtime <- as.POSIXct(round(range(timedata), "hours"))
 
 # Variable for displaying days on the x-axis
 stime <- as.POSIXct(round(range(timedata), "days"))
 
 # Plot the hours on the x-axis
 axis.POSIXct(1, at=seq(rtime[1], rtime[2], by="hour"), format="%H",
 cex.axis=.6, lwd=0, lwd.ticks=1, hadj=0.2, las=2, tck=-0.02)
 
 # Plot the days on the x-axis
 axis.POSIXct(1, at=seq(stime[1], stime[2], by="day"), format="%A",
 cex.axis=.7, line=1, lty=0, padj=-1.4)
 =
 
 The data generated and the plots look fine. R issues a warning on the
 round() function when rtime is set, though. It looks like this:
 
 > round(range(cgmtime), "hours")
 [1] "2003-11-04 14:00:00 EST" "2003-11-07 11:00:00 EST"
 Warning message:
 In if (isdst == -1) { :
   the condition has length > 1 and only the first element will be used
 
 Am I approaching this incorrectly? Is there another way to achieve the
 same result without the warning? Or is there a way I can suppress the
 warning?
 
 Thanks in advance,
 Clay
 
 
 

-- 
View this message in context: 
http://r.789695.n4.nabble.com/isdst-warning-when-rounding-a-range-of-time-data-fix-or-suppress-tp1680540p2287574.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Robust regression error: Too many singular resamples

2010-07-13 Thread Alexandra Denby


You could try rlm in the MASS package; it doesn't use the resampling
step.

That seems to do the trick.  Thank you!
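
For readers landing on this thread, a minimal sketch of the suggestion above. It uses the built-in stackloss data rather than the poster's data, which is not shown in the thread, so the formula and data set are assumptions for illustration only.

```r
library(MASS)   # recommended package that ships with R

# Robust linear regression by M-estimation; no resampling step is involved
fit <- rlm(stack.loss ~ ., data = stackloss)
summary(fit)
```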

-- 
View this message in context: 
http://r.789695.n4.nabble.com/Robust-regression-error-Too-many-singular-resamples-tp2286585p2287468.html
Sent from the R help mailing list archive at Nabble.com.


Re: [R] distributing a value for a given month across the number of weeks in that month

2010-07-13 Thread Dimitri Liakhovitski

Actually,
I realized that my task is a bit more complicated: I have different
(let's call them) Markets, and the dates repeat themselves across
markets. The original code from Gabor gives an error because the
dates repeat themselves and apparently zoo cannot handle that. So I
had to program a way around it (below). It works.
However, I am wondering if there is a shorter/more elegant way of doing it?
Thank you!
Dimitri

### My original data frame is a bit more complicated - dates repeat
### themselves for 2 markets:
monthly <- data.frame(month=c(20100301,20100401,20100501,20100301,20100401,20100501),
  monthly.value=c(100,200,300,10,20,30),
  market=c("Market A","Market A","Market A","Market B","Market B","Market B"))
monthly$month <- as.character(monthly$month)
monthly$month <- as.Date(monthly$month,"%Y%m%d")
(monthly)

library(zoo)
# pull in development version of na.locf.zoo
source("http://r-forge.r-project.org/scm/viewvc.php/*checkout*/pkg/zoo/R/na.locf.R?revision=725&root=zoo")

# convert to zoo
my.z.list <- NULL
for(i in 1:length(levels(monthly$market))){
  my.frame <- monthly[monthly$market %in% levels(monthly$market)[i],1:2]
  my.z.list[[i]] <- with(my.frame, zoo(monthly.value, month))
}

# get sequence of all dates and from that get mondays
all.dates <- seq(start(my.z.list[[1]]),
  as.Date(as.yearmon(end(my.z.list[[1]])), frac = 1), by = "day")
mondays <- all.dates[weekdays(all.dates) == "Monday"]
(mondays)

# use na.locf to fill in mondays and ave to distribute them
weekly <- NULL
for(i in 1:length(levels(monthly$market))){
  weekly[[i]] <- na.locf(my.z.list[[i]], xout = mondays)
  weekly[[i]][] <- ave(weekly[[i]], as.yearmon(mondays),
    FUN = function(x) x[1]/length(x))
}
(weekly)

### Creating a data frame with markets stacked on top of each other -
### like in the original monthly data frame:
for(i in 1:length(weekly)){
  weekly[[i]] <- as.data.frame(weekly[[i]])
  weekly[[i]]$week <- row.names(weekly[[i]])
  names(weekly[[i]])[1] <- "weekly.value"
  weekly[[i]]$market <- levels(monthly$market)[i]
}
weekly.data <- do.call(rbind,weekly)

That's it.
Dimitri

On Fri, Jul 9, 2010 at 10:22 AM, Gabor Grothendieck
ggrothendi...@gmail.com wrote:
 On Fri, Jul 9, 2010 at 9:35 AM, Dimitri Liakhovitski
 dimitri.liakhovit...@gmail.com wrote:
 Hello!

 Any hint would be greatly appreciated.
 I have a data frame that contains (a) monthly dates and (b) a value
 that corresponds to each month - see the data frame monthly below:

 monthly <- data.frame(month=c(20100301,20100401,20100501),monthly.value=c(100,200,300))
 monthly$month <- as.character(monthly$month)
 monthly$month <- as.Date(monthly$month,"%Y%m%d")
 (monthly)

 I need to split each month into weeks, e.g., weeks that start on
 Monday (it could as well be Sunday - it does not really matter) and
 distribute the monthly value evenly across weeks. So, if a month has 5
 Mondays, then the monthly value should be divided by 5, but if a
 month has only 4 weeks, then the monthly value should be divided by 4.

 The output I need is like this:

 week          weekly.value
 2010-03-01   20
 2010-03-08   20
 2010-03-15   20
 2010-03-22   20
 2010-03-29   20
 2010-04-05   50
 2010-04-12   50
 2010-04-19   50
 2010-04-26   50
 2010-05-03   60
 2010-05-10   60
 2010-05-17   60
 2010-05-24   60
 2010-05-31   60



 There is new functionality in na.locf in the development version
 of zoo that makes it particularly convenient to do this.

 First create a zoo object z from monthly and get a vector of all
 the mondays.  Then use na.locf to place the monthly value in each
 monday and ave to distribute them out.


 library(zoo)

 # pull in development version of na.locf.zoo
 source("http://r-forge.r-project.org/scm/viewvc.php/*checkout*/pkg/zoo/R/na.locf.R?revision=725&root=zoo")

 # convert to zoo
 z <- with(monthly, zoo(monthly.value, month))

 # get sequence of all dates and from that get mondays
 all.dates <- seq(start(z), as.Date(as.yearmon(end(z)), frac = 1), by = "day")
 mondays <- all.dates[weekdays(all.dates) == "Monday"]

 # use na.locf to fill in mondays and ave to distribute them
 weeks <- na.locf(z, xout = mondays)
 weeks[] <- ave(weeks, as.yearmon(mondays), FUN = function(x) x[1]/length(x))

 # show output in a few different formats
 weeks
 as.data.frame(weeks)
 data.frame(Monday = as.Date(time(weeks)), value = weeks)
 data.frame(Monday = as.Date(time(weeks)), value = weeks, row.names = NULL)
 plot(weeks)

 The output looks like this:

 weeks
 2010-03-01 2010-03-08 2010-03-15 2010-03-22 2010-03-29 2010-04-05 2010-04-12
        20         20         20         20         20         50         50
 2010-04-19 2010-04-26 2010-05-03 2010-05-10 2010-05-17 2010-05-24 2010-05-31
        50         50         60         60         60         60         60
 as.data.frame(weeks)
           weeks
 2010-03-01    20
 2010-03-08    20
 2010-03-15    20
 2010-03-22    20
 2010-03-29    20
 2010-04-05    50
 2010-04-12    50
 2010-04-19    50
 2010-04-26    50
 2010-05-03    60
 2010-05-10    60
 2010-05-17    60
 2010-05-24    60
 2010-05-31    60

Re: [R] Accessing files on password-protected FTP sites




Is it possible to download data from password-protected ftp sites?  I saw
another thread with instructions for uploading files using RCurl, but I
could not find information for downloading them in the RCurl documentation.


Did you try the ?getURL function in RCurl?  See the  `Test the 
passwords` example in the examples on the help page...
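
As a hedged illustration of that hint: the RCurl function getURL passes extra arguments through to libcurl as options, and `userpwd` supplies credentials in "user:password" form. The helper name `fetch_ftp_text`, the host, the path and the credentials below are all placeholders invented for this sketch, not details from the thread.

```r
# Hypothetical helper: download a text file from a password-protected FTP site
fetch_ftp_text <- function(url, user, password) {
  if (!requireNamespace("RCurl", quietly = TRUE))
    stop("the RCurl package is required")
  # userpwd is forwarded to libcurl as "user:password"
  RCurl::getURL(url, userpwd = paste(user, password, sep = ":"))
}

# Usage with placeholder values (not run here, since it needs a real server):
# txt <- fetch_ftp_text("ftp://ftp.example.org/pub/data.csv", "myuser", "mypassword")
# dat <- read.csv(text = txt)
```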



Re: [R] What is the degrees of freedom in an nlme model

If the curves are sufficiently close to sine (cosine) curves and you know the 
period, then this can be restructured as a linear model and you can avoid all 
the complexities that come with non-linear models.  Further, from your 
description, it does not sound like you really gain much from using the mixed 
effects vs. just fixed effects, so this could be reduced to a simple use of lm 
and anova.
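
A minimal sketch of that restructuring, on simulated data, since the original data are not shown in the thread. With a known period, A*sin(w*t + phi) equals A*cos(phi)*sin(w*t) + A*sin(phi)*cos(w*t), so the model is linear in the two coefficients and lm can fit it. The period, amplitude, phase and noise level below are assumptions chosen for illustration.

```r
set.seed(1)
period <- 24                      # assumed known period
t <- seq(0, 72, by = 1)
y <- 10 + 3 * sin(2 * pi * t / period + 0.5) + rnorm(length(t), sd = 0.5)

# Linear model in the sine and cosine terms
fit <- lm(y ~ sin(2 * pi * t / period) + cos(2 * pi * t / period))
b <- coef(fit)

# Recover amplitude and phase from the two coefficients
amplitude <- unname(sqrt(b[2]^2 + b[3]^2))
phase     <- unname(atan2(b[3], b[2]))

anova(fit)
```

Comparing curves between groups then becomes a matter of adding group interactions to the formula and comparing nested fits with anova.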

Hope this helps,

-- 
Gregory (Greg) L. Snow Ph.D.
Statistical Data Center
Intermountain Healthcare
greg.s...@imail.org
801.408.8111


 -Original Message-
 From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-
 project.org] On Behalf Of Jun Shen
 Sent: Monday, July 12, 2010 4:57 PM
 To: Bert Gunter
 Cc: R-help
 Subject: Re: [R] What is the degrees of freedom in an nlme model
 
 Hi, Bert,
 
 Thanks for your thoughtful explanation. I think the problem is quite
 over my head and maybe I should leave it to the experts :)
 
 The situation is I have a group of sigmoid curves (let's say, they are
 supposed to be the same) but occasionally you will see a few curves
 kind of different. So how do we say they are actually different or not
 from the majority curves in a statistical way? The original idea was
 proposed by Monson and Rodbard in 1978 (Am J Physiol. 1978
 Aug;235(2):E97-102). The paper is freely available. The idea is to fit
 the curves individually and obtain the residual sum of squares and
 then fit the curves altogether somehow constraining some parameters
 and then you have another residual sum of squares. Then you can do a
 F-test. In my case, I wonder if I can use a mixed-effect modeling to
 do the simultaneous fitting job. Now you see, the problem is the
 degrees of freedom. Based on your explanation, it seems no reliable
 calculation of df for nonlinear models. However I can still see the df
 reported in nlme or nls models. Now I am not even sure if I should use
 them.
 
 Another thing I observed is that even when I added more random effects to
 the nlme model, the denominator df did not seem to change. Is that correct?
 Thanks again.
 
 Jun
 
 On Mon, Jul 12, 2010 at 4:00 PM, Bert Gunter gunter.ber...@gene.com
 wrote:
  Jun:
 
  Short answer: There is no such thing as df for a nonlinear model
  (whether or not mixed effects).
 
  Longer answer: df is the dimension of the null space when the data are
  projected on the linear subspace of the model matrix of a **linear model**.
  So, strictly speaking, no linear model, no df.
 
  HOWEVER... nonlinear models are usually (always??) fit by successive
  linear approximations, and approximate df are obtained from these
  approximating subspaces.
 
  However, the problem with this is that there is no guarantee that the
  relevant residual distributions are sufficiently chisq with the
  approximate df to give reasonable answers. In fact, lots of people much
  smarter than I have spent lots of time trying to figure out what sorts of
  approximations one should use to get trustworthy results. The thing is,
  in nonlinear models, it can DEPEND on the exact form of the model --
  indeed, that's what distinguishes nonlinear models from linear ones! So
  this turns out to be really hard and afaik these smart people don't agree
  on what should be done.
 
  To see what one of the smartest people have to say about this, search the
  archives for Doug Bates's comments on this w.r.t. lmer (he won't compute
  such distributions nor provide P values because he doesn't know how to do
  it reliably. Doug -- please correct me if I have it wrong).
 
  A stock way to extricate oneself from this dilemma is: bootstrap!
  Unfortunately, this is also probably too facile: for one thing, with a
  nondiagonal covariance matrix (as in mixed effects models), how do you
  resample to preserve the covariance structure? I believe this is an area
  of active research in the time series literature, for example. For
  another, this may be too computationally demanding to be practicable due
  to convergence issues.
 
  Bottom line: there may be no good way to do what you want.
 
  Note to experts: Please view this post as an invitation to correct my
  errors and provide authoritative info.
 
  Cheers to all,
 
  Bert
 
  Bert Gunter
  Genentech Nonclinical Biostatistics
 
 
 
  -Original Message-
  From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-
 project.org] On
  Behalf Of Jun Shen
  Sent: Monday, July 12, 2010 12:34 PM
  To: R-help
  Subject: [R] What is the degrees of freedom in an nlme model
 
  Dear all,
 
  I want to do a F test, which involves calculation of the degrees of
  freedom for the residuals. Now say, I have a nlme object mod.nlme.
 I
  have two questions
 
  1.How do I extract the degrees of freedom?
  2.How is this degrees of freedom calculated in an nlme model?
 
  Thanks.
 
  Jun Shen
 
  Some sample code and data
  =
  mod.nlme-nlme(RESP~E0+(Emax-

Re: [R] Fast string comparison

2010-07-13 Thread Matt Shotwell

Good idea Romain, there is quite a bit of type testing in the function
versions of STRING_ELT and CHAR, not to mention the function call
overhead. Since the types are checked explicitly, I believe this
function is safe. All together now...

 system.time(strings[-1] == strings[-1e5])
   user  system elapsed 
  0.032   0.000   0.035 
 system.time(strcmp(strings[-1], strings[-1e5]))
   user  system elapsed 
  0.032   0.000   0.034 
 system.time(strcmp2(strings[-1], strings[-1e5]))
   user  system elapsed 
  0.024   0.000   0.026 
 system.time(lhs==rhs)
   user  system elapsed 
  0.012   0.000   0.013 
 system.time(strcmp(lhs, rhs))
   user  system elapsed 
  0.012   0.000   0.011 
 system.time(strcmp2(lhs, rhs))
   user  system elapsed 
  0.004   0.000   0.004

It looks like you can squeeze out more speed using the macro versions of
STRING_ELT and CHAR.
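
The point about subscripting being part of the measured time can be checked in plain R, without compiling anything. This is a small sketch with a shorter vector than in the thread (1e4 strings instead of 1e5, an assumption made to keep it quick); timings will vary by machine, so none are quoted here.

```r
set.seed(42)
strings <- replicate(1e4, paste(sample(letters, 100, replace = TRUE),
                                collapse = ""))
n <- length(strings)

# Subscripting and comparison timed together
t_both <- system.time(res1 <- strings[-1] == strings[-n])

# Subscripting done up front, so that only the comparison is timed
lhs <- strings[-1]
rhs <- strings[-n]
t_cmp <- system.time(res2 <- lhs == rhs)

identical(res1, res2)   # both approaches compute the same comparisons
rbind(t_both, t_cmp)
```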

On Tue, 2010-07-13 at 09:48 -0400, Romain Francois wrote:
 Hi Matt,
 
 I think there are some confusing factors in your results.
 
 
 system.time(strcmp(strings[-1], strings[-1e5]))
 
 would also include the time required to perform both subscripting 
 (strings[-1] and strings[-1e5] ) which actually takes some time.
 
 
 Also, you do have a bit of overhead due to the use of STRING_ELT and the 
 write barrier.
 
 
 I've include below a version that uses R internals so that you get the 
 fast (but you have to understand the risks, etc ...) version of 
 STRING_ELT using the plugin system of inline.
 
 library(inline)
 code <- "
  SEXP ans;
  int i, len, *cans;
  if(!isString(s1) || !isString(s2))
  error(\"invalid arguments\");
  len = length(s1)>length(s2)?length(s2):length(s1);
  PROTECT(ans = allocVector(INTSXP, len));
  cans = INTEGER(ans);
  for(i = 0; i < len; i++)
  cans[i] = strcmp(CHAR(STRING_ELT(s1,i)),\
   CHAR(STRING_ELT(s2,i)));
  UNPROTECT(1);
  return ans;
 "
 sig <- signature(s1="character", s2="character")
 strcmp <- cfunction(sig, code)
 
 strings <- replicate(1e5, paste(sample(letters, 100, rep = T), collapse 
 = ""))
 
 
 lhs <- strings[-1]
 rhs <- strings[-1e5]
 system.time( lhs == rhs )
 system.time(strcmp( lhs, rhs) )
 
 library(inline)
 settings <- getPlugin( "default" )
 settings$includes <- paste( "#define USE_RINTERNALS", settings$includes, 
 collapse = "\n" )
 code2 <- "
  SEXP ans;
  int i, len, *cans;
  if(!isString(s1) || !isString(s2))
  error(\"invalid arguments\");
  len = length(s1)>length(s2)?length(s2):length(s1);
  PROTECT(ans = allocVector(INTSXP, len));
  cans = INTEGER(ans);
  for(i = 0; i < len; i++)
  cans[i] = strcmp(CHAR(STRING_ELT(s1,i)),\
   CHAR(STRING_ELT(s2,i)));
  UNPROTECT(1);
  return ans;
 "
 sig <- signature(s1="character", s2="character" )
 strcmp2 <- cxxfunction(sig, code2, settings = settings)
 system.time(strcmp2( lhs, rhs) )
 
 
 
 I get:
 
 $ Rscript strings.R
 Le chargement a nécessité le package : methods
 utilisateur système  écoulé
0.002   0.000   0.002
 utilisateur système  écoulé
0.004   0.000   0.005
 utilisateur système  écoulé
0.003   0.000   0.003
 
 Romain
 
 
 Le 13/07/10 15:24, Matt Shotwell a écrit :
 
  On Tue, 2010-07-13 at 01:42 -0400, Hadley Wickham wrote:
  strings <- replicate(1e5, paste(sample(letters, 100, rep = T), collapse = ""))
  system.time(strings[-1] == strings[-1e5])
  #   user  system elapsed
  #  0.016   0.000   0.017
 
  So it takes ~1/100 of a second to do ~100,000 string comparisons. You
  need to provide a reproducible example that illustrates why you think
  string comparisons are slow.
 
  Here's a vectorized alternative to '==' for strings, with minimal
  argument checking or result conversion. I haven't looked at the
  corresponding R source code, it may be similar:
 
  library(inline)
  code <- "
   SEXP ans;
   int i, len, *cans;
   if(!isString(s1) || !isString(s2))
   error(\"invalid arguments\");
   len = length(s1)>length(s2)?length(s2):length(s1);
   PROTECT(ans = allocVector(INTSXP, len));
   cans = INTEGER(ans);
   for(i = 0; i < len; i++)
   cans[i] = strcmp(CHAR(STRING_ELT(s1,i)),\
    CHAR(STRING_ELT(s2,i)));
   UNPROTECT(1);
   return ans;
  "
  sig <- signature(s1="character", s2="character")
  strcmp <- cfunction(sig, code)
 
 
  system.time(strings[-1] == strings[-1e5])
  user  system elapsed
 0.036   0.000   0.035
  system.time(strcmp(strings[-1], strings[-1e5]))
  user  system elapsed
 0.032   0.000   0.034
 
  That's pretty fast, though I seem to be working with a slower system
  than Hadley. It's hard to see how this could be improved, except maybe
  by caching results of string comparisons.
 
  -Matt
 
 
  Hadley
 
 
  On Tue, Jul 13, 2010 at 6:52 AM, Ralf Bralf.bie...@gmail.com  wrote:
  I am asking this question because String comparison in R seems to be
  awfully slow (based on

Re: [R] How to select the column header with \Sexpr{}

2010-07-13 Thread Ista Zahn

Hi Felipe,
See in line below.

On Tue, Jul 13, 2010 at 11:04 AM, Felipe Carrillo
mazatlanmex...@yahoo.com wrote:
 Thanks Izta:
 I see your point, then I should extract the column names when the
 dataset is first read, because it is a data frame:

That might work, but it's definitely not how I would do it.

  report <- structure(list(Date = c("3/12/2010", "3/13/2010", "3/14/2010",
  "3/15/2010"), Run1 = c("33 (119 ? 119)", "n (0 ? 0)", "893 (110 ? 146)",
  "140 (111 ? 150)"), Run2 = c("33 (71 ? 71)", "n (0 ? 0)",
  "337 (67 ? 74)", "140 (68 ? 84)"), Run3 = c("890 (32 ? 47)",
  "n (0 ? 0)", "10,602 (32 ? 52)", "2,635 (34 ? 66)"), Run4 = c("0 ( ? )",
  "n (0 ? 0)", "0 ( ? )", "0 ( ? )"), Run4 = c("0 ( ? )", "n (0 ? 0)",
 "0 ( ? )", "0 ( ? )")), .Names = c("ID_Date", "Run1", "Run2",
 "Run3", "Run4", "Run5"), row.names = c(NA, 4L), class = "data.frame")
 str(report)
 'data.frame':   4 obs. of  6 variables:
  $ ID_Date: chr  3/12/2010 3/13/2010 3/14/2010 3/15/2010
  $ Run1   : chr  33 (119 ? 119) n (0 ? 0) 893 (110 ? 146) 140 (111 ?
 150)
  $ Run2   : chr  33 (71 ? 71) n (0 ? 0) 337 (67 ? 74) 140 (68 ? 84)
  $ Run3   : chr  890 (32 ? 47) n (0 ? 0) 10,602 (32 ? 52) 2,635 (34 ?
 66)
  $ Run4   : chr  0 ( ? ) n (0 ? 0) 0 ( ? ) 0 ( ? )
  $ Run5   : chr  0 ( ? ) n (0 ? 0) 0 ( ? ) 0 ( ? )
  names(report)[1]  # I can extract the column name here
 [1] Date

 But after I use 'stringr to convert the character '?' to '-'
 'report' is not a dataframe anymore and returns a NULL when trying to extract
 the
 column names.

No, it will not report NULL when extracting _column names_. Try
colnames(report). It will report NULL when trying to extract the
_names_ using names(report), because matrices have colnames and
rownames but not names.

 I was not aware that \Sexpr{} only work on dataframes, thanks for your help.

The problem is _not with \Sexpr_. The problem is that you are asking
for the names() of a matrix, which do not exist in R. You can use
colnames() like this

\Sexpr{colnames(report)[1]}

or you can convert report to a data.frame and use names, like this

\Sexpr{names(as.data.frame(report))[1]}
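
The difference is easy to verify in an interactive session before it goes into the Sweave file. This is a minimal sketch on a two-column stand-in for report; the object names `report_df` and `report_mat` are made up for the illustration.

```r
# Small stand-in for the report data frame
report_df <- data.frame(ID_Date = c("3/12/2010", "3/13/2010"),
                        Run1    = c("33 (119 - 119)", "n (0 - 0)"),
                        stringsAsFactors = FALSE)

# A matrix built from it keeps column names but has no names attribute
report_mat <- as.matrix(report_df)

names(report_df)[1]                     # works on a data frame
names(report_mat)                       # NULL: matrices have no names()
colnames(report_mat)[1]                 # the matrix equivalent
names(as.data.frame(report_mat))[1]     # or convert back first
```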

HTH,
Ista




 - Original Message 
 From: Ista Zahn iz...@psych.rochester.edu
 To: Felipe Carrillo mazatlanmex...@yahoo.com
 Cc: David Winsemius dwinsem...@comcast.net; r-h...@stat.math.ethz.ch
 Sent: Tue, July 13, 2010 7:13:39 AM
 Subject: Re: [R] How to select the column header with \Sexpr{}

 Hi Felipe,
 The problem has nothing to do with Sweave or \Sexpr. The problem is
 that by the time you call \Sexpr report is a matrix, and you cannot
 access the column names of a matrix with names(). You need to use
 colnames() or convert the matrix to a data.frame.

 Perhaps a true useR can write R code in a Sweave file without checking
 it, but for mere mortals it is best to evaluate the R code in an
 interactive session to make sure it works before asking Sweave to
 insert it into your .tex file. If you had tried to evaluate
 names(report)[1] in an interactive session you would have discovered
 your problem immediately.

 Best,
 Ista

 On Tue, Jul 13, 2010 at 4:15 AM, Felipe Carrillo
 mazatlanmex...@yahoo.com wrote:
  I had tried that earlier and didn't work either, I probably have \Sexpr in
the
  wrong place. See example:
  Column one header gets blank:
 
  \documentclass[11pt]{article}
  \usepackage{longtable,verbatim,ctable}
  \usepackage{longtable,pdflscape}
  \usepackage{fmtcount,hyperref}
  \usepackage{fullpage}
  \title{United States}
  \begin{document}
  \setkeys{Gin}{width=1\textwidth}
  \maketitle
  <<echo=F,results=hide>>=
  report <- structure(list(Date = c("3/12/2010", "3/13/2010", "3/14/2010",
  "3/15/2010"), Run1 = c("33 (119 ? 119)", "n (0 ? 0)", "893 (110 ? 146)",
  "140 (111 ? 150)"), Run2 = c("33 (71 ? 71)", "n (0 ? 0)",
  "337 (67 ? 74)", "140 (68 ? 84)"), Run3 = c("890 (32 ? 47)",
  "n (0 ? 0)", "10,602 (32 ? 52)", "2,635 (34 ? 66)"), Run4 = c("0 ( ? )",
  "n (0 ? 0)", "0 ( ? )", "0 ( ? )"), Run4 = c("0 ( ? )", "n (0 ? 0)",
  "0 ( ? )", "0 ( ? )")), .Names = c("ID_Date", "Run1", "Run2",
  "Run3", "Run4", "Run5"), row.names = c(NA, 4L), class = "data.frame")
  require(stringr)
  report <- t(apply(report, 1, function(x) {str_replace(x, "\\?", "-")}))
  #report
  #latex(report,file="")
  @
  \begin{landscape}
  \begin{table}[!tbp]
   \begin{center}
   \begin{tabular}{ll}\hline\hline
  \multicolumn{1}{c}{\Sexpr{names(report)[1]}}   # Using \Sexpr here
  \multicolumn{1}{c}{Run1}
  \multicolumn{1}{c}{Run2}
  \multicolumn{1}{c}{Run3}
  \multicolumn{1}{c}{Run4}
  \multicolumn{1}{c}{Run5}\tabularnewline
  \hline
  13/12/201033 (119 ? 119)33 (71 ? 71)890 (32 ? 47)0 ( ? )0 ( ?
  )\tabularnewline
  23/13/2010n (0 ? 0)n (0 ? 0)n (0 ? 0)n (0 ? 0)n (0 ? 
  0)\tabularnewline
  33/14/2010893 (110 ? 146)337 (67 ? 74)10,602 (32 ? 52)0 ( ? )0 ( ?
  )\tabularnewline
  43/15/2010140 (111 ? 150)140 (68 ? 84)2,635 (34 ? 66)0 ( ? )0 ( ?
  )\tabularnewline
  \hline
  \end{tabular}
  \end{center}
  \end{table}
  \end{landscape}
  \end{document}
 
  Felipe D. Carrillo
  Supervisory Fishery Biologist
  Department of the Interior
  US Fish & Wildlife Service

Re: [R] How can i draw a graph with high and low data points

There are several functions in several packages for plotting intervals that 
will give you plots much better than the excel one.  The RSiteSearch function 
or the sos package may help you find those. 

But it is also easy to create such plots using just a few lines of R code and 
base graphics.  Read the help pages for plot.default (look at the ylim 
argument) and the segments function (the order and seq functions may also be of 
use).
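
A minimal base-graphics sketch along those lines. Only the first row (mean 444, lower 393, upper 494) comes from the post quoted below; the other two rows and the short group labels are made-up values for illustration.

```r
d <- data.frame(
  group = c("3-4 yrs M", "3-4 yrs F", "5-6 yrs M"),   # hypothetical labels
  mean  = c(444, 430, 470),
  lower = c(393, 380, 420),
  upper = c(494, 480, 525),
  stringsAsFactors = FALSE
)
x <- seq_len(nrow(d))

# Means as points; ylim widened so that the full intervals fit
plot(x, d$mean, ylim = range(d$lower, d$upper), xaxt = "n", pch = 19,
     xlab = "Trial group", ylab = "Value")

# One vertical segment per group, from lower to upper percentile
segments(x, d$lower, x, d$upper)

# Put the group labels on the x axis
axis(1, at = x, labels = d$group)
```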

-- 
Gregory (Greg) L. Snow Ph.D.
Statistical Data Center
Intermountain Healthcare
greg.s...@imail.org
801.408.8111


 -Original Message-
 From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-
 project.org] On Behalf Of Nathaniel Saxe
 Sent: Tuesday, July 13, 2010 2:36 AM
 To: r-help@r-project.org
 Subject: Re: [R] How can i draw a graph with high and low data points
 
 
 I have 5 columns- Trial.Group, Mean, Standard Deviation, Upper
 percentile,
 Lower percentile.
 
 Trial.Group 41 subjects: 3 to 4 yrs-Male
 Mean 444
 SD 25
 upper 494
 lower 393
 
 and all the data is like that.
 
 and i wish to recreate this excel table.
 http://r.789695.n4.nabble.com/file/n2287158/untitled.GIF untitled.GIF
 
 
 
 problem with my code- doesn't put Trial.Group on the x axis
 
 
 Thanks for the help
 
 
 --
 View this message in context: http://r.789695.n4.nabble.com/How-can-i-
 draw-a-graph-with-high-and-low-data-points-tp2282524p2287158.html
 Sent from the R help mailing list archive at Nabble.com.
 


Re: [R] StartsWith over vector of Strings?

content[na.omit(pmatch(searchset, content,,TRUE))]
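
Spelled out on the data from the question quoted below: pmatch treats each search string as a candidate prefix of the entries in content, and the fourth argument (duplicates.ok = TRUE) lets the same entry be matched more than once. One caveat, stated as my understanding of pmatch: a search string that is a prefix of several different entries is ambiguous and yields NA. The second variant, using substr, is an assumption of mine for that case and is not from the thread.

```r
content   <- c("abc", "def")
searchset <- c("a", "abc", "abcdef", "d", "def", "defghi")

# Index into content for each search string; NA where nothing matches
idx <- pmatch(searchset, content, nomatch = NA_integer_, duplicates.ok = TRUE)
result <- content[na.omit(idx)]
result

# Variant that also copes with prefixes shared by several entries
hits <- lapply(searchset,
               function(s) content[substr(content, 1, nchar(s)) == s])
result2 <- unlist(hits)
result2
```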

-- 
Gregory (Greg) L. Snow Ph.D.
Statistical Data Center
Intermountain Healthcare
greg.s...@imail.org
801.408.8111


 -Original Message-
 From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-
 project.org] On Behalf Of Ralf B
 Sent: Tuesday, July 13, 2010 5:47 AM
 To: r-help@r-project.org
 Subject: [R] StartsWith over vector of Strings?
 
 Given vectors of strings of arbitrary length
 
 content <- c("abc", "def")
 searchset <- c("a", "abc", "abcdef", "d", "def", "defghi")
 
 Is it possible to determine the content String set that matches the
 searchset in the sense of 'startswith' ? This would be a vector of all
 strings in content that start with the string of any of the strings in
 the searchset. In the little example here, this would be:
 
 result <- c("abc", "abc", "def", "def")
 
 Best,
 Ralf
 


Re: [R] What is the degrees of freedom in an nlme model

2010-07-13 Thread Bert Gunter

...

  Aug;235(2):E97-102). The paper is freely available. The idea is to fit
  the curves individually and obtain the residual sum of squares and
  then fit the curves altogether somehow constraining some parameters
  and then you have another residual sum of squares. Then you can do a
  F-test. 

-- No you can't. This paper was apparently written by someone who doesn't
sufficiently understand the statistical issues. This is not uncommon -- even
papers in statistical journals sometimes get it wrong.


In my case, I wonder if I can use a mixed-effect modeling to
  do the simultaneous fitting job.

-- You need to consult with your local statistician. This forum is not the
appropriate venue for difficult statistical questions that require intimate
familiarity with the data and an understanding of the scientific questions
of interest.

-- Bert Gunter
Genentech Nonclinical Statistics


Re: [R] Generate groups with random size but given total sample size

For one definition of random:

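ss <- rexp(100)
ss <- ss/sum(ss)

# 100 groups of at least 5 cases each, about 10000 cases in total
ss <- 5 + round( ss*9500 )

cnt <- 0
# d is the current surplus (or shortfall) relative to the target of 10000
while( ( d <- sum(ss) - 10000 ) != 0 ) {

    # push the whole difference onto one randomly chosen group
    tmpid <- sample.int(100,1)
    ss[tmpid] <- ss[tmpid] - d

    # keep every group size within [5, 500]
    ss[ ss > 500 ] <- 500
    ss[ ss < 5 ] <- 5

    cnt <- cnt + 1
    if (cnt > 100) {
        cat('problems finding a solution, stopping after 100 iterations\n')
        break
    }
}

group <- rep( 1:100, ss )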


Hope this helps,

-- 
Gregory (Greg) L. Snow Ph.D.
Statistical Data Center
Intermountain Healthcare
greg.s...@imail.org
801.408.8111


 -Original Message-
 From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-
 project.org] On Behalf Of Arne Schulz
 Sent: Tuesday, July 13, 2010 7:10 AM
 To: r-help@r-project.org
 Subject: [R] Generate groups with random size but given total sample
 size
 
 Dear list,
 I am currently doing some simulation studies where I want to compare
 different scenarios.
 In particular, two scenarios should be compared: 10.000 cases in 100
 groups with 100 cases per group and 10.000 cases in 100 groups with
 random group size (ranging from 5 to 500).
 
 The first part is no problem:
  id <- seq(1,10000)
  group <- sort(rep(seq(1,100),100))
 
 But I don't get along with the second scenario. Using sample does give
 me 100 groups with random cases, but generates more than 10.000 cases:
  set.seed(13)
  sum(sample(5:500, 100))
 [1] 24583
 
 Another way could be generating one sample at a time and sum the cases.
 But this would end up in trial & error to fit the 10.000 cases. Maybe
 it would break rules of probability, too.
 
 I'm convinced that there should be another (and even better) way to
 handle this problem in R... :-)
 
 
 Best regards,
 Arne Schulz
 


Re: [R] MplusAutomation

2010-07-13 Thread Jeff Newmiller

Create a shortcut that targets your batch file and edit its properties to open 
minimized, and call the shortcut from R rather than calling the batch file 
directly.

Gushta, Matthew mgus...@air.org wrote:

R list-
i have begun using the MplusAutomation while piloting a large-scale 
simulation (~200,000 replications). since the package takes advantage of the 
DOS batch mode available in Mplus, each replication starts and activates a new 
instance of a command prompt window. this effectively locks me out of my 
computer for the duration of the simulation.

my question is this: can anyone suggest how i might pass the quiet command 
to the DOS program? is there a way to generally specify this from R? or any 
specific recommendations/experience with this package?

thanks,

--matthew
..
matthew m. gushta
american institutes for research
202.403.5079





---
Jeff NewmillerThe .   .  Go Live...
DCN:jdnew...@dcn.davis.ca.usBasics: ##.#.   ##.#.  Live Go...
  Live:   OO#.. Dead: OO#..  Playing
Research Engineer (Solar/BatteriesO.O#.   #.O#.  with
/Software/Embedded Controllers)   .OO#.   .OO#.  rocks...1k
---
Sent from my phone. Please excuse my brevity.


Re: [R] distributing a value for a given month across the number of weeks in that month

On Tue, Jul 13, 2010 at 11:19 AM, Dimitri Liakhovitski
dimitri.liakhovit...@gmail.com wrote:
 Actually,
 I realized that my task was a bit more complicated as I have different
 (let's call them) Markets and the dates repeat themselves across
 markets. And the original code from Gabor gives an error - because
 dates repeate themselves and apparently zoo cannot handle it. So, I
 had to do program a way around it (below). It works.
 However, I am wondering if there is a shorter/more elegant way of doing it?
 Thank you!
 Dimitri

 ### My original data frame is a bit more complicated - dates repeat
 themselves for 2 markets:
 monthly-data.frame(month=c(20100301,20100401,20100501,20100301,20100401,20100501),monthly.value=c(100,200,300,10,20,30),market=c(Market
 A,Market A, Market A,Market B,Market B, Market B))
 monthly$month-as.character(monthly$month)
 monthly$month-as.Date(monthly$month,%Y%m%d)
 (monthly)


Assuming the dates for each market are the same we split them into a
zoo object with one market per column and following the same approach
as last time we use by in place of ave.  The lines marked ## are same
as last time.
Be sure you are using zoo 1.6-4 from CRAN since it makes use of the
na.locf features added in that version.

> z <- read.zoo(monthly, split = "market")
> all.dates <- seq(start(z), as.Date(as.yearmon(end(z)), frac = 1), by = "day") ##
> mondays <- all.dates[weekdays(all.dates) == "Monday"] ##
> weeks <- na.locf(z, xout = mondays) ##
> do.call(rbind, by(weeks, as.yearmon(mondays),
+ function(x) zoo(x/nrow(x), rownames(x))))
           Market.A Market.B
2010-03-01       20        2
2010-03-08       20        2
2010-03-15       20        2
2010-03-22       20        2
2010-03-29       20        2
2010-04-05       50        5
2010-04-12       50        5
2010-04-19       50        5
2010-04-26       50        5
2010-05-03       60        6
2010-05-10       60        6
2010-05-17       60        6
2010-05-24       60        6
2010-05-31       60        6
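
For comparison only, and not part of the zoo solution above: the same split can be sketched in base R, producing the stacked (long) layout that the question started from. The column names follow the question; the intermediate names (`first`, `last`, `days`, `ym`) are made up for the illustration, and `%u` is used so that the Monday test does not depend on the locale.

```r
monthly <- data.frame(
  month = as.Date(rep(c("2010-03-01", "2010-04-01", "2010-05-01"), 2)),
  monthly.value = c(100, 200, 300, 10, 20, 30),
  market = rep(c("Market A", "Market B"), each = 3),
  stringsAsFactors = FALSE
)

# All Mondays between the first month start and the end of the last month
first <- min(monthly$month)
last  <- seq(max(monthly$month), by = "month", length.out = 2)[2] - 1
days  <- seq(first, last, by = "day")
mondays <- days[format(days, "%u") == "1"]   # %u: 1 = Monday

# One row per (market, Monday), joined on the year-month key
weekly <- merge(
  data.frame(week = mondays, ym = format(mondays, "%Y-%m")),
  transform(monthly, ym = format(month, "%Y-%m")),
  by = "ym"
)

# Divide each monthly value by the number of Mondays in that month
weekly$weekly.value <- weekly$monthly.value /
  ave(weekly$monthly.value, weekly$market, weekly$ym, FUN = length)

weekly <- weekly[order(weekly$market, weekly$week),
                 c("market", "week", "weekly.value")]
weekly
```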


[R] Time Variable and Historical Interest Rates

2010-07-13 Thread Aaditya Nanduri

Guys, I wrote to the finance mailing list earlier with my questions but was
directed here.

Sorry for the repeat.

---
library(quantmod)

now <- Sys.time()

midnight <- strptime()    # I want to make this a static variable that will be
                          # equal to 12:00:00 am, but I don't know what to put
                          # here. I keep getting NA for everything I try.

if(now == midnight) {
getFX("EUR/USD", from = Sys.Date() - 1, to = Sys.Date() - 1)
write.table(EURUSD, "~Documents/stat arb/project/eurusd.csv", append = TRUE,
row.names = FALSE, col.names = FALSE)

}


---

Also, append is ignored when I use write.csv. I had to resort to using
write.table. Is this always the case?

As for the historical interest rates, thank you all very much for providing
me with the information (Finance mailing list).
I used the fImport package and called the method fredSeries to download
DPRIME data for the same time frame as currency data I have (Thank you,
Mr. Gallon).

But that is only data for US. What about other countries?

I was talking to a professor and he said that there was a way to read data
from a website into R if you know the url. Would this help in getting the
interest rates of other countries? (I believe the function is aptly named
url). Could someone provide an example, please?

All help is very much appreciated.

Sincerely,
Aaditya Nanduri


Re: [R] Calculate confidence interval of the mean based on ANOVA

2010-07-13 Thread Paul


Paul wrote:
I am trying to recreate an analysis that has been done by another group 
(in SAS I believe).  I'm stuck on one part, I think because my stats 
knowledge is lacking, and while it's OT, I'm hoping someone here can help.


Given this dataframe;

snip
Well, that will teach me to read the question ! The previous analysis 
stated (quite clearly) that they calculated confidence intervals using 
number of runs - 1 degrees of freedom, so doing my t quantile over 5 df 
instead of 17 produced the right answer.


Paul.


[R] hclust information in a table

2010-07-13 Thread Ralph Modjesch

I am trying to show the result of a cluster analysis (hclust, method="ward")
in a table with the following information:

first column: height
second column: number of clusters
third column: clustering information


0,041  |  20  |  (3)-(5)
0,111  |  19  |  (6)-(11)
0,211  |  18  |  (3,5)-(9)
0,402  |  17  |  (6,11)-(16)
...

is there any function or code to do this?
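
A table of this shape can be assembled from the `merge` and `height` components that `hclust()` returns. The following is a minimal sketch, not a ready-made function: the data are random, the names `get.members`, `members`, `info` and `tab` are invented for illustration, and `method = "average"` is used only because the name of the Ward method differs between R versions.

```r
# Hypothetical data: 10 observations, 4 variables
set.seed(1)
x  <- matrix(rnorm(40), nrow = 10)
hc <- hclust(dist(x), method = "average")

n <- nrow(hc$merge) + 1            # number of observations

# In hc$merge a negative entry is a single observation,
# a positive entry refers to the cluster formed at that earlier step.
get.members <- function(k, members) if (k < 0) -k else members[[k]]

members <- vector("list", n - 1)
info    <- character(n - 1)
for (i in seq_len(n - 1)) {
  a <- get.members(hc$merge[i, 1], members)
  b <- get.members(hc$merge[i, 2], members)
  info[i] <- sprintf("(%s)-(%s)", paste(a, collapse = ","), paste(b, collapse = ","))
  members[[i]] <- c(a, b)          # remember who is in the new cluster
}

tab <- data.frame(height   = hc$height,
                  clusters = (n - 1):1,   # clusters remaining after each merge
                  merge    = info)
tab
```

Each row describes one merge step, so the first row leaves n - 1 clusters and the last row leaves a single cluster.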


-- 
Kind regards


Re: [R] distributing a value for a given month across the number of weeks in that month

2010-07-13 Thread Dimitri Liakhovitski

Thank you very much, Gabor!
Dimitri

On Tue, Jul 13, 2010 at 12:25 PM, Gabor Grothendieck
ggrothendi...@gmail.com wrote:
 On Tue, Jul 13, 2010 at 11:19 AM, Dimitri Liakhovitski
 dimitri.liakhovit...@gmail.com wrote:
 Actually,
 I realized that my task was a bit more complicated as I have different
 (let's call them) Markets and the dates repeat themselves across
 markets. And the original code from Gabor gives an error because the
 dates repeat themselves and apparently zoo cannot handle that. So, I
 had to program a way around it (below). It works.
 However, I am wondering if there is a shorter/more elegant way of doing it?
 Thank you!
 Dimitri

 ### My original data frame is a bit more complicated - dates repeat
 ### themselves for 2 markets:
 monthly <- data.frame(month=c(20100301,20100401,20100501,20100301,20100401,20100501),
   monthly.value=c(100,200,300,10,20,30),
   market=c("Market A","Market A","Market A","Market B","Market B","Market B"))
 monthly$month <- as.character(monthly$month)
 monthly$month <- as.Date(monthly$month, "%Y%m%d")
 (monthly)


 Assuming the dates for each market are the same we split them into a
 zoo object with one market per column and following the same approach
 as last time we use by in place of ave.  The lines marked ## are same
 as last time.
 Be sure you are using zoo 1.6-4 from CRAN since it makes use of the
 na.locf features added in that version.

 > z <- read.zoo(monthly, split = "market")
 > all.dates <- seq(start(z), as.Date(as.yearmon(end(z)), frac = 1), by = "day") ##
 > mondays <- all.dates[weekdays(all.dates) == "Monday"] ##
 > weeks <- na.locf(z, xout = mondays) ##
 > do.call(rbind, by(weeks, as.yearmon(mondays),
 + function(x) zoo(x/nrow(x), rownames(x))))
           Market.A Market.B
 2010-03-01       20        2
 2010-03-08       20        2
 2010-03-15       20        2
 2010-03-22       20        2
 2010-03-29       20        2
 2010-04-05       50        5
 2010-04-12       50        5
 2010-04-19       50        5
 2010-04-26       50        5
 2010-05-03       60        6
 2010-05-10       60        6
 2010-05-17       60        6
 2010-05-24       60        6
 2010-05-31       60        6




-- 
Dimitri Liakhovitski
Ninah Consulting
www.ninah.com


Re: [R] SAS Proc summary/means as a R function

2010-07-13 Thread schuster


Hello, 

Are you trying to parse SAS code (or lightly modified SAS code) and run it in R?

Then you are right: the hard part is parsing the code. I don't believe that's 
possible without a custom parser, and even then it's really hard to parse all 
the SAS sub languages right: data step, macro code and macro variables, IML, 
SAS Procedures etc. 
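
One way around R's own parser, sketched here as an assumption rather than anything proposed in the thread, is to hand the foreign code to R as a character string, which R does not try to parse. The function name `sas.on` echoes the quoted message below; the statement splitting is a deliberately naive illustration and not a SAS parser.

```r
# Hypothetical helper: accept SAS-like code as one string and split it
# into statements on ';' (no real SAS parsing is attempted).
sas.on <- function(code) {
  stopifnot(is.character(code), length(code) == 1)
  stmts <- trimws(strsplit(code, ";", fixed = TRUE)[[1]])
  stmts[nzchar(stmts)]          # drop empty pieces
}

stmts <- sas.on("proc means data=foo; var x y; run;")
print(stmts)
```

Anything beyond splitting statements (macro variables, data steps, procedure options) would still need the custom parser described above.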



On Tuesday 13 July 2010 02:39:22 pm Roger Deangelis wrote:
 Thanks Richard and Erik,
 
 I hate to buy the book and not find the solution to the following:
 
 proc.means <- function() {
 deparse(match.call()[-1])
 }
 
 proc.means(this is a sentence)
 
 unexpected symbol in "proc.means(this is"
 
 One possible solution would be to 'peek' into the memory buffer that holds
 the
 function arguments.
 
 It is easy to replicate the 'dataset' output for many SAS procs(ie
 transpose, freq, summary, means...)
 I am not interested in 'report writing in R'.
 
 The hard part is parsing the SAS syntax, I wish R had a drop down to PERL.
 
 perl on;
 
some perl code
 
 perl off;
 
 also
 
 sas on;
 
   some SAS code
 
 sas off;
 
 The purpose of parmbuff is to turn off R's scanning and resolution of
 function arguments
 and just provide the bare text between '('  and ')' in the function call.
 
 This is a very powerful construct.
 
 A function would provide something like
 
 sas.on(
 
 
 )
 

-- 

Friedrich Schuster
Dompfaffenweg 6
69123 Heidelberg


[R] Wrap column headers caption

2010-07-13 Thread Felipe Carrillo

Hi:
Using this dataframe with quite long column headers, how can I wrap the 
text so that the columns are narrower. I was trying to use strwrap without 
success. Thanks

reportDF <- structure(list(IDDate = c("3/12/2010", "3/13/2010", "3/14/2010",
"3/15/2010"), FirstRunoftheYear = c("33 (119 ? 119)", "n (0 ? 0)", "893 (110 ? 146)",
"140 (111 ? 150)"), SecondRunoftheYear = c("33 (71 ? 71)", "n (0 ? 0)",
"337 (67 ? 74)", "140 (68 ? 84)"), ThirdRunoftheYear = c("890 (32 ? 47)",
"n (0 ? 0)", "10,602 (32 ? 52)", "2,635 (34 ? 66)"), FourthRunoftheYear = c("0 ( ? )",
"n (0 ? 0)", "0 ( ? )", "0 ( ? )"), LastRunoftheYear = c("0 ( ? )", "n (0 ? 0)",
"0 ( ? )", "0 ( ? )")), .Names = c("IDDate", "First Run of the Year",
"Second Run of the Year", "Third Run of the Year", "Fourth Run of the Year",
"Last Run of the Year"), row.names = c(NA, 4L), class = "data.frame")
 


Felipe D. Carrillo
Supervisory Fishery Biologist
Department of the Interior
US Fish & Wildlife Service
California, USA





Re: [R] Calculate confidence interval of the mean based on ANOVA

2010-07-13 Thread Joshua Wiley

N_runs -1 seems a bit of an odd df to choose to calculate the CI for a
mean.  To answer your question, I think that t.test() is the easiest
way to get a CI in R.  That said, you can use the MS_residuals from
ANOVA to take advantage of variance calculated on groups and pooled.
Something like:

foo <- structure(list(OBS = structure(1:18, .Label = c("1", "2", "3",
"4", "5", "6", "7", "8", "9", "10", "11", "12", "13", "14", "15",
"16", "17", "18", "19", "20", "21", "22", "23", "24", "25", "26",
"27", "28", "29", "30", "31", "32", "33", "34", "35", "36", "37",
"38", "39", "40", "41", "42", "43", "44", "45", "46", "47", "48",
"49", "50", "51", "52", "53", "54"), class = "factor"), NOM =
structure(c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L), .Label = c("0.05", "0.1", "1"), class = "factor"),
RUN = structure(c(1L, 1L, 1L, 2L, 2L, 2L, 3L, 3L, 3L, 4L, 4L, 4L, 5L,
5L, 5L, 6L, 6L, 6L), .Label = c("1", "2", "3", "4", "5", "6"), class =
"factor"), CALC = c(0.04989, 0.04872, 0.04544, 0.05645, 0.06516,
0.0622, 0.04868, 0.05006, 0.04746, 0.05574, 0.04442, 0.04742, 0.05508,
0.0593, 0.04898, 0.06373, 0.05537, 0.04674)), .Names = c("OBS", "NOM",
"RUN", "CALC"), row.names = c(NA, 18L), class = "data.frame")

foo.aov <- aov(CALC ~ RUN, data = foo)

sdpooled.calc <- sqrt(anova(foo.aov)["Residuals", "Mean Sq"])
mcalc <- mean(foo$CALC)
ncalc <- length(foo$CALC)
t.crit <- qt(p = .05/2, df = 12, lower.tail=FALSE)

#then if memory serves the CI for means formula is

mcalc - ((t.crit * sdpooled.calc)/sqrt(ncalc))
mcalc + ((t.crit * sdpooled.calc)/sqrt(ncalc))

#rm(foo, foo.aov, sdpooled.calc, mcalc, ncalc, t.crit)
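
To see how much the choice of degrees of freedom matters, here is a small sketch that computes a t-based interval for the same CALC values with 17 df (n - 1) and with 5 df (number of runs - 1). The helper `ci.mean` is hypothetical, and it assumes the plain sample standard deviation; the original SAS analysis may well have used a different variance estimate.

```r
# CALC values copied from the data frame above
calc <- c(0.04989, 0.04872, 0.04544, 0.05645, 0.06516, 0.0622,
          0.04868, 0.05006, 0.04746, 0.05574, 0.04442, 0.04742,
          0.05508, 0.0593, 0.04898, 0.06373, 0.05537, 0.04674)

# Hypothetical helper: t-based CI for a mean with an explicit df
ci.mean <- function(x, df, level = 0.95) {
  t.crit <- qt(1 - (1 - level) / 2, df = df)
  half   <- t.crit * sd(x) / sqrt(length(x))
  c(lower = mean(x) - half, upper = mean(x) + half)
}

ci.17 <- ci.mean(calc, df = length(calc) - 1)  # usual n - 1 = 17
ci.5  <- ci.mean(calc, df = 6 - 1)             # number of runs - 1 = 5
rbind(ci.17, ci.5)
```

The interval with 5 df is wider because the t quantile grows as the degrees of freedom shrink.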

Btw, it helps if you send plaintext emails rather than html.

Best regards,

Josh


On Tue, Jul 13, 2010 at 10:14 AM, Paul p...@paulhurley.co.uk wrote:
 Paul wrote:

 I am trying to recreate an analysis that has been done by another group
 (in SAS I believe).  I'm stuck on one part, I think because my stats
 knowledge is lacking, and while it's OT, I'm hoping someone here can help.

 Given this dataframe;

 snip

 Well, that will teach me to read the question ! The previous analysis stated
 (quite clearly) that they calculated confidence intervals using number of
 runs - 1 degrees of freedom, so doing my t quantile over 5 df instead of 17
 produced the right answer.

 Paul.





-- 
Joshua Wiley
Ph.D. Student, Health Psychology
University of California, Los Angeles
http://www.joshuawiley.com/


Re: [R] SAS Proc summary/means as a R function

2010-07-13 Thread Frank E Harrell Jr

What is the original intent?  The bandwidth:productivity ratio is not 
looking encouraging for this problem.


Frank


--
Frank E Harrell Jr   Professor and Chairman        School of Medicine
                     Department of Biostatistics   Vanderbilt University


Re: [R] Wrap column headers caption

2010-07-13 Thread Henrique Dallazuanna

You can try this:

sapply(names(reportDF), toString, width = 10)

abbreviate(names(reportDF))


-- 
Henrique Dallazuanna
Curitiba-Paraná-Brasil
25° 25' 40" S 49° 16' 22" O


Re: [R] [R-pkgs] New package list for analyzing list surveyexperiments

2010-07-13 Thread Raubertas, Richard

I agree that 'list' is a terrible package name, but only secondarily 
because it is a data type.  The primary problem is that it is so generic
as to be almost totally uninformative about what the package does.  

For some reason package writers seem to prefer maximally uninformative 
names for their packages.  To take some examples of recently announced 
packages, can anyone guess what packages 'FDTH', 'rtv', or 'lavaan' 
do?  Why the aversion to informative names along the lines of
'Freq_dist_and_histogram', 'RandomTimeVariables', and 
'Latent_Variable_Analysis', respectively? 

R.Raubertas

 -Original Message-
 From: r-help-boun...@r-project.org 
 [mailto:r-help-boun...@r-project.org] On Behalf Of Jeffrey J. Hallman
 Sent: Monday, July 12, 2010 10:09 AM
 To: r-h...@stat.math.ethz.ch
 Subject: Re: [R] [R-pkgs] New package list for analyzing 
 list surveyexperiments
 
 I know nothing about your package, but list is a terrible 
 name for it,
 as list is also the name of a data type in R. 
 -- 
 Jeff
 

Re: [R] Wrap column headers caption



Using this dataframe with quite long column headers, how can I wrap the 
text so that the columns are narrower. I was trying to use strwrap without 
success. Thanks



I could be wrong here, but I don't think there's a way to do that as 
print.data.frame is currently defined.  You might find the print.gap 
argument of some use, it ultimately gets passed down to print.default 
and will affect the output.


I can't think of a way to do this, hopefully someone else will have an 
idea.
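
One possible workaround, offered as a rough sketch rather than a feature of print.data.frame, is to wrap the names with strwrap() and stack the pieces as extra rows on top of a character matrix. The helper name `wrap.header` and the two-column `demo` data frame are made up for illustration.

```r
# Hypothetical helper, not part of base R: stack wrapped header lines above the data
wrap.header <- function(df, width = 10) {
  pieces <- lapply(names(df), strwrap, width = width)  # split each name into short lines
  depth  <- max(lengths(pieces))                       # tallest wrapped header
  # pad shorter headers with blanks on top so that all headers are bottom-aligned
  hdr <- matrix(
    unlist(lapply(pieces, function(p) c(rep("", depth - length(p)), p))),
    nrow = depth
  )
  body <- as.matrix(format(df))                        # data as a character matrix
  out  <- rbind(hdr, body)
  dimnames(out) <- list(rep("", nrow(out)), rep("", ncol(out)))
  noquote(out)
}

demo <- data.frame(a = c("33 (119 ? 119)", "n (0 ? 0)"),
                   b = c("33 (71 ? 71)", "n (0 ? 0)"),
                   stringsAsFactors = FALSE)
names(demo) <- c("First Run of the Year", "Second Run of the Year")

wrapped <- wrap.header(demo)
print(wrapped)
```

The result is a character matrix meant for display only; the original data frame and its names are left untouched.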



Re: [R] StartsWith over vector of Strings?

When running the combined code with your suggested line:

content <- data.frame(urls=c(

"http://www.google.com/search?source=ighl=enrlz==q=stuffaq=faqi=g10aql=oq=gs_rfai=CrrIS3VU8TJqcMJHuzASm9qyBBgAAAKoEBU_QsmVh",

"http://search.yahoo.com/search;_ylt=Atvki9MVpnxuEcPmXLEWgMqbvZx4?p=stufftoggle=1cop=mssei=UTF-8fr=yfp-t-701")
)
searchset <- data.frame(signatures=c("http://www.google.com/search"))
content[na.omit(pmatch(searchset, content$urls))]
print(content)

I am getting both URLs as results, but in fact, would expect only the
first URL. Am I overlooking something?


Ralf

On Tue, Jul 13, 2010 at 12:03 PM, Greg Snow greg.s...@imail.org wrote:
 content[na.omit(pmatch(searchset, content,,TRUE))]

 --
 Gregory (Greg) L. Snow Ph.D.
 Statistical Data Center
 Intermountain Healthcare
 greg.s...@imail.org
 801.408.8111


 -Original Message-
 From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-
 project.org] On Behalf Of Ralf B
 Sent: Tuesday, July 13, 2010 5:47 AM
 To: r-help@r-project.org
 Subject: [R] StartsWith over vector of Strings?

 Given vectors of strings of arbitrary length

 content <- c("abc", "def")
 searchset <- c("a", "abc", "abcdef", "d", "def", "defghi")

 Is it possible to determine the content String set that matches the
 searchset in the sense of 'startswith' ? This would be a vector of all
 strings in content that start with the string of any of the strings in
 the searchset. In the little example here, this would be:

 result <- c("abc", "abc", "def", "def")

 Best,
 Ralf


Re: [R] StartsWith over vector of Strings?

My solution was based on using vectors (which were your original example), now 
you are using data frames.  The actual result is NA, then you just print 
content again (which my code never modified) so you are going to see the full 
content data frame.

Try:

content[na.omit(pmatch(searchset$signatures, content$urls)),]

then look at all the pieces (starting from inside out) to see what is happening 
at each step to understand what is going on.
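
As an alternative to pmatch(), the "starts with" test can be written out directly with substr(). This is a sketch on the plain character vectors from the original question; the names `hits` and `result` are illustrative, and later versions of R also provide a base function startsWith() that could replace the substr() comparison.

```r
content   <- c("abc", "def")
searchset <- c("a", "abc", "abcdef", "d", "def", "defghi")

# hits[i, j] is TRUE when content[i] starts with searchset[j]
hits <- outer(content, searchset,
              function(s, p) substr(s, 1, nchar(p)) == p)

# One entry of content per matching (content, prefix) pair,
# taken in the order of searchset
result <- content[row(hits)[hits]]
result
```

For the data frame version, the same comparison would be applied to content$urls and searchset$signatures, and the logical result used to index the rows of content.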


-- 
Gregory (Greg) L. Snow Ph.D.
Statistical Data Center
Intermountain Healthcare
greg.s...@imail.org
801.408.8111



[R] RJSONIO install problem

2010-07-13 Thread christiaan pauw

Hi everybody

I am trying to install RJSONIO from source on Mac OS X 10.5.8. I used the
Package Installer.

The message and sessionInfo is attached below

Can someone help me to understand the error message and maybe give a hint
towards solving the problem?

thanks in advance
Christiaan

Message:
The downloaded packages are in
/private/var/folders/ub/ubvWLUkKHf8WAywv5rmtcE+++TI/-Tmp-/RtmpZflYon/downloaded_packages
* installing *source* package RJSONIO ...
** libs
*** arch - i386
gcc -arch i386 -std=gnu99
-I/Library/Frameworks/R.framework/Resources/include
-I/Library/Frameworks/R.framework/Resources/include/i386
 -I/usr/local/include -fPIC  -g -O2 -c ConvertUTF.c -o ConvertUTF.o
i686-apple-darwin9-gcc-4.0.1: installation problem, cannot exec 'as': No
such file or directory
make: *** [ConvertUTF.o] Error 1
ERROR: compilation failed for package RJSONIO
* removing
/Library/Frameworks/R.framework/Versions/2.11/Resources/library/RJSONIO

sessionInfo is

R version 2.11.1 (2010-05-31)
x86_64-apple-darwin9.8.0

locale:
[1] en_US.UTF-8/en_US.UTF-8/C/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] stats graphics  grDevices utils datasets  methods   base


[R] Boxplot: Scale outliers

2010-07-13 Thread Robert Peter


 Hello!

I am trying to scale the outliers in a boxplot. I am passing pars = 
list(boxwex=0.1, staplewex=0.1, outwex=0.1) to the boxplot command. The 
boxes are scaled correctly, but the circles (outliers) are not scaled at 
all, and thus pretty big compared to the boxes scaled with 0.1.

Am I missing something?

Thanks in advance!
Robert


Re: [R] Time Variable and Historical Interest Rates

2010-07-13 Thread Joshua Wiley

On Tue, Jul 13, 2010 at 9:54 AM, Aaditya Nanduri
aaditya.nand...@gmail.com wrote:
 Guys, I wrote to the finance mailing list earlier with my questions but was
 directed here.

 Sorry for the repeat.

 ---
 library(quantmod)
 
 now <- Sys.time()

 midnight <- strptime()        #  I want to make this a static variable
 that will be equal to 12:00:00 am but I don't know what to put here. I keep
 getting NA for everything I do

The key to what I did was format().  I am turning the output of
Sys.time() to something that can be compared to the character vector
'midnight'.  Also, I would use 24 hour time.

#assign midnight and now
midnight <- "00:00:00"
now <- format(Sys.time(), format = "%H:%M:%S")

#Look at the structure for midnight and now
str(midnight)
str(now)

#print to screen
midnight
now


 if(now == midnight) {

This test seems prone to failure.  There is a one second period when
'now' must be assigned or it will fail.
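
A less fragile variant, given here only as a sketch, is to test whether the current time falls within a short window after midnight instead of matching one exact second. The function names and the 60 second default are assumptions for illustration.

```r
# Hypothetical helpers: seconds elapsed since local midnight, and a window test
seconds.since.midnight <- function(t = Sys.time()) {
  lt <- as.POSIXlt(t)
  lt$hour * 3600 + lt$min * 60 + lt$sec
}

near.midnight <- function(t = Sys.time(), window = 60) {
  seconds.since.midnight(t) < window
}

# Fixed times so the example does not depend on when it is run
t1 <- as.POSIXct("2010-07-13 00:00:30", tz = "UTC")
t2 <- as.POSIXct("2010-07-13 12:00:00", tz = "UTC")
near.midnight(t1)
near.midnight(t2)
```

A script run by a scheduler once a day would avoid the timing test altogether, but that is outside what R itself provides.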

 getFX("EUR/USD", from = Sys.Date() - 1, to = Sys.Date() - 1)
 write.table(EURUSD, "~Documents/stat arb/project/eurusd.csv", append = TRUE,
 row.names = FALSE, col.names = FALSE)
        
 }

 
 ---

 Also, append is ignored when I use write.csv. I had to resort to using
 write.table. Is this always the case?

write.csv() is a convenience wrapper for write.table().  It is also
clearly stated in the documentation for ?write.csv

"These [write.csv, write.csv2] wrappers are deliberately inflexible:
they are designed to ensure that the correct conventions are used to
write a valid file.  Attempts to change 'append', 'col.names', 'sep',
'dec' or 'qmethod' are ignored, with a warning."

So yes, it is always the case.  If you want to use write.table() to
make a comma separated file, you might consider adding the argument
sep = "," to your write.table() call.
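
A small self-contained sketch of that pattern follows. The rates are invented placeholder values and the file is a temporary one, so nothing here touches the real eurusd.csv.

```r
f <- tempfile(fileext = ".csv")

# Hypothetical daily records (the values are placeholders, not real quotes)
day1 <- data.frame(date = "2010-07-12", rate = 1.2594)
day2 <- data.frame(date = "2010-07-13", rate = 1.2725)

# First write creates the file with a header row
write.table(day1, f, sep = ",", row.names = FALSE, col.names = TRUE)

# Later writes append data rows only
write.table(day2, f, sep = ",", row.names = FALSE, col.names = FALSE,
            append = TRUE)

back <- read.csv(f)
back
```

Writing the header only on the first call keeps the file readable with read.csv() after any number of appends.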


 As for the historical interest rates, thank you all very much for providing
 me with the information (Finance mailing list).
 I used the fImport package and called the method fredSeries to download
 DPRIME data for the same time frame as currency data I have (Thank you,
 Mr. Gallon).

 But that is only data for US. What about other countries?

 I was talking to a professor and he said that there was a way to read data
 from a website into R if you know the url. Would this help in getting the
 interest rates of other countries? (I believe the function is aptly named
 url). Could someone provide an example, please?

I imagine it would help if websites provide different countries
interest rates in a convenient file.  In fact, in general you would
not even have to use url().  Here is an example.  On my website I have
a tab delimited data file.  I can access it from R by:

read.table(
file = "http://www.joshuawiley.com/psyc211/Psyc211-hw1-part1.txt",
header = TRUE, sep = "\t")

It is also possible to enter user names and passwords into the URL.
This general pattern also works for ftp sites.  For secure http
(https) I only know how to access them through R in Windows.

Cheers,

Josh


 All help is very much appreciated.

 Sincerely,
 Aaditya Nanduri





-- 
Joshua Wiley
Ph.D. Student, Health Psychology
University of California, Los Angeles
http://www.joshuawiley.com/


[R] working out main effect variance when different parameterization is used and interaction term exists

2010-07-13 Thread Adaikalavan Ramasamy


Dear all,

Apologies if this question is bit theoretical and for the longish email. 
I am meta-analyzing the coefficients and standard errors from multiple 
studies where the raw data is not available.


Each study analyst runs a model that includes an interaction term 
between, say, sex and smoking, plus age.


Here is an illustrative example for one study:

 set.seed(1066)

 status <- rbinom( 1000, 1, 0.2 )
 males  <- rbinom( 1000, 1, 0.6 )
 smoke  <- rbinom( 1000, 1, 0.3 )
 age    <- runif(1000, min=20, max=80)

 coef( summary( f1 <- glm( status ~ males*smoke + age,
   family=binomial ) ) )
 #                 Estimate  Std. Error    z value     Pr(>|z|)
 # (Intercept) -1.520399871 0.284464584 -5.3447774 9.052825e-08
 # males        0.213851446 0.201717381  1.0601538 2.890746e-01
 # smoke       -0.123103049 0.292346483 -0.4210861 6.736922e-01
 # age         -0.001056007 0.004612947 -0.2289223 8.189293e-01
 # males:smoke  0.283775173 0.362821438  0.7821345 4.341355e-01


Now, unfortunately some analysts coded sex as females instead of males. 
Using the same dataset, I get the following output with females:


 females <- 1 - males
 coef( summary( f1 <- glm( status ~ females*smoke + age,
   family=binomial )) )
 #                   Estimate   Std. Error    z value     Pr(>|z|)
 # (Intercept)   -1.306548425 0.262573162* -4.9759405 6.493160e-07
 # females       -0.213851446 0.201717381* -1.0601538 2.890746e-01
 # smoke          0.160672124 0.214923130*  0.7475795 4.547138e-01
 # age           -0.001056007 0.004612947  -0.2289223 8.189293e-01
 # females:smoke -0.283775173 0.362821438  -0.7821345 4.341355e-01


I have worked out algebraically (and numerically) the following:

 Beta(females)       =  -Beta(males)
 Var(females)        =  Var(males)

 Beta(females:smoke) =  -Beta(males:smoke)
 Var(females:smoke)  =  Var(males:smoke)

 Beta(smoke | fit1)  =  Beta(smoke | fit2) + Beta(females:smoke)
                     =  0.160672124 - 0.283775173
                     =  -0.1231030

How can I calculate the Var(smoke | fit1) from Var(smoke | fit2) ?

I tried to derive this algebraically but ended up with a covariance term 
which I could not solve. If I could cleverly convert Var(smoke | fit2) 
to Var(smoke | fit1) then I could avoid going back to each analyst, since 
this particular analysis is only one of many hundreds we run and it 
would be annoying for each analyst to use the same parameterisation.
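
Since Beta(smoke | fit1) is the sum of two fit2 coefficients, its variance is Var(smoke | fit2) + Var(females:smoke | fit2) + 2 Cov(smoke, females:smoke | fit2), so the covariance term appears to be unavoidable and the standard errors alone do not determine it. The sketch below checks the identity numerically on the simulated data from this post; it assumes each analyst could report that one covariance from vcov(), and the object names `fit1`, `fit2`, `V2` and `var.smoke.fit1` are chosen here for illustration.

```r
set.seed(1066)
status  <- rbinom(1000, 1, 0.2)
males   <- rbinom(1000, 1, 0.6)
smoke   <- rbinom(1000, 1, 0.3)
age     <- runif(1000, min = 20, max = 80)
females <- 1 - males

fit1 <- glm(status ~ males * smoke + age,   family = binomial)
fit2 <- glm(status ~ females * smoke + age, family = binomial)

# Variance of a sum of two coefficients needs their covariance
V2 <- vcov(fit2)
var.smoke.fit1 <- V2["smoke", "smoke"] +
                  V2["females:smoke", "females:smoke"] +
                  2 * V2["smoke", "females:smoke"]

# Compare with the variance reported directly by the males parameterisation
c(from.fit2 = var.smoke.fit1, from.fit1 = vcov(fit1)["smoke", "smoke"])
```

If the covariance cannot be obtained from the analysts, the conversion cannot be completed from the coefficient tables alone.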


Any suggestions is much appreciated. Many thanks in advance.

Regards, Adai


Re: [R] [R-pkgs] New package list for analyzing list surveyexperiments