Re: [R] cointegration analysis

2007-08-09 Thread gyadav
I got this error. I don't remember what the cause was, but my workaround was to look at the example in the manual pages of ca.po... etc and try to put your date in the same format. Also check whether the functions will take so many columns as a parameter; I have not checked it.

Re: [R] cointegration analysis

2007-08-09 Thread gyadav
Regrets, typo error: please read 'date' as 'data'. Regards, Gaurav Yadav

Re: [R] Reading time/date string

2007-08-09 Thread Matthew Walker
Thanks Mark, that was very helpful. I'm now so close! Can anyone tell me how to extract the value from an instance of a difftime class? I can see the value, but how can I place it in a dataframe? time_string1 <- "10:17:07 02 Aug 2007" time_string2 <- "13:17:40 02 Aug 2007" time1 <-

Re: [R] Reading time/date string

2007-08-09 Thread Prof Brian Ripley
On Thu, 9 Aug 2007, Matthew Walker wrote: Thanks Mark, that was very helpful. I'm now so close! Can anyone tell me how to extract the value from an instance of a difftime class? I can see the value, but how can I place it in a dataframe? as.numeric(time_delta) Hint: you want the
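A minimal sketch of the as.numeric() suggestion, using the two time strings from the question. The format string, the UTC time zone, and the data frame column names are assumptions made for illustration, not part of the original posts.

```r
# Use the C locale so that %b matches English month abbreviations such as "Aug".
Sys.setlocale("LC_TIME", "C")

# Time strings as given in the question.
time_string1 <- "10:17:07 02 Aug 2007"
time_string2 <- "13:17:40 02 Aug 2007"

# Assumed format: hours:minutes:seconds, day, abbreviated month, year.
fmt <- "%H:%M:%S %d %b %Y"
time1 <- as.POSIXct(time_string1, format = fmt, tz = "UTC")
time2 <- as.POSIXct(time_string2, format = fmt, tz = "UTC")

# Fix the units explicitly so the number does not depend on automatic selection.
time_delta <- difftime(time2, time1, units = "secs")

# as.numeric() drops the difftime class and leaves a plain number
# that can be stored in a data frame column.
df <- data.frame(start = time1, end = time2,
                 delta_secs = as.numeric(time_delta))
```

Asking difftime() for fixed units avoids the case where one difference comes back in minutes and another in hours.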

Re: [R] cointegration analysis

2007-08-09 Thread Pfaff, Bernhard Dr.
Hello Dorina, if you apply ca.jo to a system with more than five variables, a *warning* is issued that no critical values are provided. This is not an error, but documented in ?ca.jo. In the seminal paper of Johansen, only cv for up to five variables are provided. Hence, you need to refer to a

Re: [R] R memory usage

2007-08-09 Thread Prof Brian Ripley
See ?gc ?Memory-limits On Wed, 8 Aug 2007, Jun Ding wrote: Hi All, I have two questions in terms of the memory usage in R (sorry if the questions are naive, I am not familiar with this at all). 1) I am running R in a linux cluster. By reading the R helps, it seems there are no default

Re: [R] tcltk error on Linux

2007-08-09 Thread Prof Brian Ripley
On Thu, 9 Aug 2007, Mark W Kimpel wrote: I am having trouble getting tcltk package to load on openSuse 10.2 running R-devel. I have specifically put my /usr/share/tcl directory in my PATH, but R doesn't seem to see it. I also have installed tk on my system. Any ideas on what the problem is?

[R] Odp: Successively eliminating most frequent elemets

2007-08-09 Thread Petr PIKAL
Hi, your construction is quite complicated, so instead of refining it I tried to do the task a different way. If I understand what you want to do, you can use set.seed(1) T <- matrix(trunc(runif(20)*10), nrow=10, ncol=2) T [,1] [,2] [1,] 2 2 [2,] 3 1 [3,] 5 6 [4,]

[R] using data() and determining data types

2007-08-09 Thread Edna Bell
Hi R Gurus: I'm using the data() function to get the list of data sets for a package. I would like to find the class for each data set; i.e.,data.frame, etc. Using str(), I can find the name of the data set. However, when I try the class function on the str output, I get character, since the

[R] data() problem solved

2007-08-09 Thread Edna Bell
Problem solved: sapply(data(package="car")$results[,3], function(x) class(get(x))) Sorry for the silliness. Hi R Gurus: I'm using the data() function to get the list of data sets for a package. I would like to find the class for each data set; i.e., data.frame, etc. Using str(), I can find the
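A hedged sketch of the same idea against the datasets package, which ships with R, rather than car. It assumes the data set names sit in the "Item" column of the results matrix and that some entries carry a parenthetical suffix that has to be removed before the lookup; the variable names are illustrative only.

```r
library(datasets)  # normally attached already; made explicit here

# The "Item" column (column 3) holds the data set names. Some entries look
# like "BJsales.lead (BJsales)", so keep only the part before the first space.
items <- data(package = "datasets")$results[, "Item"]
items <- sub(" .*$", "", items)

# class() can return several values (e.g. for time series matrices),
# so collapse them into one string per data set.
classes <- vapply(items, function(x) {
  obj <- get(x, envir = as.environment("package:datasets"))
  paste(class(obj), collapse = "/")
}, character(1))

head(classes)
```

Looking the objects up in the package environment, instead of calling get() on the search path, avoids picking up a user object that happens to share a name with a data set.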

Re: [R] Change in R**2 for block entry regression

2007-08-09 Thread Chuck Cleland
David Kaplan wrote: Hi all, I'm demonstrating a block entry regression using R for my regression class. For each block, I get the R**2 and the associated F. I do this with separate regressions adding the next block in and then get the results by writing separate summary() statements

[R] substrings

2007-08-09 Thread Edna Bell
Hello again! I have a set of character results. If one of the characters is a blank space, followed by other characters, I want to end at the blank space. I tried strsplit, but it picks up again after the blank. Any help would be much appreciated. TIA, Edna

[R] Countvariable for id by date

2007-08-09 Thread David Gyllenberg
Best R-users, Here’s a newbie question. I have tried to find an answer to this via help and the “ave(x,factor(),FUN=function(y) rank (z,tie=’first’)”-function, but without success. I have a dataframe (~8000 observations, registerdata) with four columns: id,

Re: [R] substrings

2007-08-09 Thread Vladimir Eremeev
Is this what you want? a <- c("a b c", "1 2 3", "q - 5") a [1] "a b c" "1 2 3" "q - 5" sapply(strsplit(a, "[[:blank:]]"), function(x) x[1]) [1] "a" "1" "q" Edna Bell wrote: I have a set of character results. If one of the characters is a blank space, followed by other characters, I want to end at the blank

Re: [R] substrings

2007-08-09 Thread Vladimir Eremeev
One more, shorter, solution. a [1] "a b c" "1 2 3" "q- 5" gsub("\\s.+", "", a) [1] "a" "1" "q-" Edna Bell wrote: I have a set of character results. If one of the characters is a blank space, followed by other characters, I want to end at the blank space. I tried strsplit, but it picks up again
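The two replies can be condensed into a single sub() call. This is a minimal sketch with made-up input; the fourth element is added here only to show a string that ends in a blank.

```r
# Example strings; the last one ends in a blank with nothing after it.
a <- c("a b c", "1 2 3", "q - 5", "trailing ")

# Delete everything from the first blank (space or tab) to the end of the string.
first_part <- sub("[[:blank:]].*$", "", a)
first_part
```

Using `.*` rather than `.+` after the blank also trims a string whose blank is the final character, which a pattern requiring at least one following character would leave untouched.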

[R] Help on R performance using aov function

2007-08-09 Thread Francoise PFIFFELMANN
Hi, I’m trying to replace some SAS statistical functions with R (batch calling). But I’ve seen that calling R in batch mode (under Unix) takes about 2 or 3 times longer than SAS. So it’s a big performance problem for me. Here is an extract of the calculation:

Re: [R] Help on R performance using aov function

2007-08-09 Thread Prof Brian Ripley
aov() will handle multiple responses and that would be considerably more efficient than running separate fits as you seem to be doing. Your code is nigh unreadable: please use your spacebar and remove the redundant semicolons: `Writing R Extensions' shows you how to tidy up your code to make

[R] Discriminant scores plot

2007-08-09 Thread Dani Valverde
Hello, How can I plot the discriminant scores resulting from prediction using a dataset and an lda model and its decision boundaries using GGobi and rggobi? Best regards, Dani Daniel Valverde Saubí Grup d'Aplicacions Biomèdiques de la Ressonància Magnètica Nuclear (GABRMN) Departament de

[R] GLMM: MEEM error due to dichotomous variables

2007-08-09 Thread Elva Robinson
I am trying to run a GLMM on some binomial data. My fixed factors include 2 dichotomous variables, day, and distance. When I run the model: modelA <- glmmPQL(Leaving~Trial*Day*Dist,random=~1|Indiv,family=binomial) I get the error: iteration 1 Error in MEEM(object, conLin, control$niterEM) :

[R] Subject: Re: how to include bar values in a barplot?

2007-08-09 Thread Ted . Harding
Greg, I'm going to join issue with you here! Not that I'll go near advocating Excel-style graphics (abominable, and the Patrick Burns URL which you cite is remarkable in its restraint). Also, I'm aware that this is potential flame-war territory -- again, I want to avoid that too. However, this

[R] Term Structure Estimation using Kalman Filter

2007-08-09 Thread Bernardo Ribeiro
Long time reader, first time poster, I'm working on a paper regarding a term structure estimation using the Kalman Filter Algorithm. The model in question is the Generalized Vasicek, and since there are coupon-bonds being estimated, I'm supposed to make some changes on the Kalman Filter. Does

Re: [R] Countvariable for id by date

2007-08-09 Thread jim holtman
This should do what you want: x <- read.table(textConnection("id;dg1;dg2;date; + 1;F28;;1997-11-04; + 1;F20;F702;1998-11-09; + 1;F20;;1997-12-03; + 1;F208;;2001-03-18; + 2;F32;;1999-03-07; + 2;F29;F32;2000-01-06; + 2;F32;;2003-07-05; +

[R] ARIMA fitting

2007-08-09 Thread laura
Hello, I'm trying to fit an ARIMA process, using the arima function from the stats package. Can I expect that a fitted model, with any parameters, is stationary, causal and invertible? Thanks __ R-help@stat.math.ethz.ch mailing list

[R] interior branch test

2007-08-09 Thread Nora Muda
Dear R users, Does anyone know which package provides an interior branch test for phylogenetic trees with a distance-based method? Any help is really appreciated. Thank you. Nora.

[R] Lo and MacKinlay variance ratio test (Lo.Mac)

2007-08-09 Thread Cassius2
Hi all, I am trying to calculate the variance ratio of a time series under heteroskedasticity. So I know that the variance ratio should be calculated as a weighted average of autocorrelations. But I don't find the same results when I calculate the variance ratio manually and when I compute the

Re: [R] Countvariable for id by date

2007-08-09 Thread Gabor Grothendieck
Try this: Lines <- "id;dg1;dg2;date; 1;F28;;1997-11-04; 1;F20;F702;1998-11-09; 1;F20;;1997-12-03; 1;F208;;2001-03-18; 2;F32;;1999-03-07; 2;F29;F32;2000-01-06; 2;F32;;2003-07-05; 2;F323;F2800;2000-02-05;" # replace textConnection(Lines) with actual file name DF <- read.csv2(textConnection(Lines),

Re: [R] Rcmdr window border lost

2007-08-09 Thread Andy Weller
OK, I tried completely removing and reinstalling R, but this has not worked - I am still missing window borders for Rcmdr. I am certain that everything is installed correctly and that all dependencies are met - there must be something trivial I am missing?! Thanks in advance, Andy Andy Weller

Re: [R] Help using gPath

2007-08-09 Thread Paul Murrell
Hi Emilio Gagliardi wrote: Hi everyone,I'm trying to figure out how to use gPath and the documentation is not very helpful :( I have the following plot object: plot-surrounds:: background plot.gTree.378:: background guide.gTree.355:: (background.rect.345,

Re: [R] Subject: Re: how to include bar values in a barplot?

2007-08-09 Thread Frank E Harrell Jr
[EMAIL PROTECTED] wrote: Greg, I'm going to join issue with you here! Not that I'll go near advocating Excel-style graphics (abominable, and the Patrick Burns URL which you cite is remarkable in its restraint). Also, I'm aware that this is potential flame-war territory -- again, I want to

Re: [R] Regsubsets statistics

2007-08-09 Thread Thomas Lumley
On Wed, 8 Aug 2007, Markus Brugger wrote: Dear R-help, I have used the regsubsets function from the leaps package to do subset selection of a logistic regression model with 6 independent variables and all possible ^2 interactions. As I want to get information about the statistics behind

Re: [R] Subject: Re: how to include bar values in a barplot?

2007-08-09 Thread Gabor Grothendieck
You could put the numbers inside the bars in which case it would not add to the height of the bar: x <- 1:5 names(x) <- letters[1:5] bp <- barplot(x) text(bp, x - .02 * diff(par("usr")[3:4]), x) On 8/9/07, Frank E Harrell Jr [EMAIL PROTECTED] wrote: [EMAIL PROTECTED] wrote: Greg, I'm going to join

[R] Help with Filtering (interest rates related)

2007-08-09 Thread Bernardo Ribeiro
Dear, r-help, Long time reader, first time poster, I'm working on a paper regarding a term structure estimation using the Kalman Filter Algorithm. The model in question is the Generalized Vasicek, and since there are coupon-bonds being estimated, I'm supposed to make some changes on the Kalman

Re: [R] Subject: Re: how to include bar values in a barplot?

2007-08-09 Thread Frank E Harrell Jr
Gabor Grothendieck wrote: You could put the numbers inside the bars in which case it would not add to the height of the bar: I think the Cleveland/Tufte prescription would be much different: horizontal dot charts with the numbers in the right margin. I do this frequently with great effect.

Re: [R] ARIMA fitting

2007-08-09 Thread Prof Brian Ripley
On Tue, 7 Aug 2007, [EMAIL PROTECTED] wrote: Hello, I'm trying to fit an ARIMA process, using STATS package, arima function. Can I expect, that fitted model with any parameters is stationary, causal and invertible? Please read ?arima: it answers all your questions, and points out that the

[R] Interpret impulse response functions from irf in MSBVAR library

2007-08-09 Thread sj
Hello, I am wondering if anyone knows how to interpret the values returned by irf function in the MSBVAR library. Some of the literature I have read indicates that impulse responses in the dependent variables are often based on a 1 unit change in the independent variable, but other sources

[R] AlgDesign expand.formula()

2007-08-09 Thread S Ellison
Can anyone explain why AlgDesign's expand.formula help and output differ? #From help: # quad(A,B,C) makes ~(A+B+C)^2+I(A^2)+I(B^2)+I(C^2) expand.formula(~quad(A+B+C)) #actually gives ~(A + B + C)^2 + I(A + B + C^2) They don't _look_ the same... Steve E

[R] Mac OSX fonts in R plots

2007-08-09 Thread Fernando Diaz
I had been looking for information about including OSX fonts in R plots for a long time and never quite found the answer. I spent an hour or so gathering together the following solution which, as far as I have tested, works. I'm posting this for feedback and archiving. I'd be interested in

[R] Memory problem

2007-08-09 Thread Gang Chen
I got a long list of error message repeating with the following 3 lines when running the loop at the end of this mail: R(580,0xa000ed88) malloc: *** vm_allocate(size=327680) failed (error code=3) R(580,0xa000ed88) malloc: *** error: can't allocate region R(580,0xa000ed88) malloc: *** set a

Re: [R] Rcmdr window border lost

2007-08-09 Thread Peter Dalgaard
Andy Weller wrote: OK, I tried completely removing and reinstalling R, but this has not worked - I am still missing window borders for Rcmdr. I am certain that everything is installed correctly and that all dependencies are met - there must be something trivial I am missing?! Thanks in

Re: [R] tcltk error on Linux

2007-08-09 Thread Seth Falcon
Hi Mark, Prof Brian Ripley [EMAIL PROTECTED] writes: On Thu, 9 Aug 2007, Mark W Kimpel wrote: I am having trouble getting tcltk package to load on openSuse 10.2 running R-devel. I have specifically put my /usr/share/tcl directory in my PATH, but R doesn't seem to see it. I also have

[R] Systematically biased count data regression model

2007-08-09 Thread Matthew and Kim Bowser
Dear all, I am attempting to explain patterns of arthropod family richness (count data) using a regression model. It seems to be able to do a pretty good job as an explanatory model (i.e. demonstrating relationships between dependent and independent variables), but it has systematic problems as

Re: [R] Subject: Re: how to include bar values in a barplot?

2007-08-09 Thread Greg Snow
Ted, Thanks for your thoughts. I don't take it as the start of a flame war (I don't want that either). My original intent was to get the original posters out of the mode of thinking they want to match what the spreadsheet does and into thinking about what message they are trying to get across.

Re: [R] Subject: Re: how to include bar values in a barplot?

2007-08-09 Thread Greg Snow
Gabor, Putting the numbers in the bars is an improvement over putting them over the bars, but if the numbers are large relative to the bars, this could still create a fuzzy top to the bars making them harder to compare. This also has the problem of the poorly laid out table, numbers are easiest

Re: [R] Systematically biased count data regression model

2007-08-09 Thread paulandpen
Matthew it is possible that your results are suffering from heterogeneity, it may be that your model performs well at the aggregate level and this would explain good aggregate fit levels and decent predictive performance etc, you could perhaps look at a 'latent' approach to modelling your

Re: [R] small sample techniques

2007-08-09 Thread Nair, Murlidharan T
Thanks, that discussion was helpful. Well, I have another question. I am comparing two proportions for their deviation from the hypothesized difference of zero. My manually calculated z ratio is 1.94. But, when I calculate it using prop.test, it uses Pearson's chi-squared test and the X-squared

Re: [R] Help using gPath

2007-08-09 Thread Emilio Gagliardi
Hi Paul, I'm sorry for not posting code, I wasn't sure if it would be helpful without the data...should I post the code and a sample of the data? I will remember to do that next time! grid.gedit(gPath("ylabel.text.382"), gp=gpar(fontsize=16)) OK, I think my confusion comes from the notation

[R] How to apply functions over rows of multiple matrices

2007-08-09 Thread Johannes Hüsing
Dear ExpRts, I would like to perform a function with two arguments over the rows of two matrices. There are a couple of *applys (including mApply in Hmisc) but I haven't found out how to do it straightforward. Applying to row indices works, but looks like a poor hack to me: sens <- function(test,

[R] Seasonality

2007-08-09 Thread Alberto Monteiro
I have a time series x = f(t), where t is taken for each month. What is the best function to detect if _x_ has a seasonal variation? If there is such seasonal effect, what is the best function to estimate it? Function arima has a seasonal parameter, but I guess this is too complex to be useful.

Re: [R] How to apply functions over rows of multiple matrices

2007-08-09 Thread Gabor Grothendieck
Is sens really what you want? The denominator is the indexes, e.g. if a row in goldstandard were c(0, 0, 1, 1) then you would be dividing by 3+4. Also test[which(gold == 1)] is the same as test[gold == 1] which is the same test * gold since gold has only 0 and 1's in it. Perhaps what you really
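One way to apply a two-argument function over matching rows of two matrices without indexing by row number is to split both matrices by row and hand the pieces to mapply(). The sketch below is only an illustration: the `sens` function is a hypothetical stand-in (hits among gold-standard positives), not the poster's truncated definition, and the small 0/1 matrices are made up.

```r
# Made-up 0/1 matrices: rows are cases, columns are items.
test <- matrix(c(1, 0, 1, 1,
                 0, 0, 1, 0), nrow = 2, byrow = TRUE)
gold <- matrix(c(1, 1, 0, 1,
                 0, 1, 1, 0), nrow = 2, byrow = TRUE)

# Hypothetical two-argument function: share of gold positives that test also flags.
sens <- function(test_row, gold_row) sum(test_row * gold_row) / sum(gold_row)

# split(m, row(m)) yields one vector per row; mapply() pairs them up.
by_row <- mapply(sens, split(test, row(test)), split(gold, row(gold)))

# For this particular function a fully vectorised form gives the same answer.
vectorised <- rowSums(test * gold) / rowSums(gold)
```

The mapply() form works for any function of two row vectors; the rowSums() form is only available when the function reduces to elementwise arithmetic, as it does here.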

[R] Need Help: Installing/Using xtable package

2007-08-09 Thread M. Jankowski
Hi all, Let me know if I need to ask this question of the bioconductor group. I used the bioconductor utility to install this package and also the CRAN package.install function. My computer crashed a week ago. Today I reinstalled all my bioconductor/R packages. One of my scripts is giving me the

Re: [R] Memory Experimentation: Rule of Thumb = 10-15 Times the Memory

2007-08-09 Thread Michael Cassin
Hi, I've been having similar experiences and haven't been able to substantially improve the efficiency using the guidance in the I/O Manual. Could anyone advise on how to improve the following scan()? It is not based on my real file, please assume that I do need to read in characters, and can't

Re: [R] Memory Experimentation: Rule of Thumb = 10-15 Times the Memory

2007-08-09 Thread Gabor Grothendieck
If we add quote = FALSE to the write.csv statement it's twice as fast reading it in. On 8/9/07, Michael Cassin [EMAIL PROTECTED] wrote: Hi, I've been having similar experiences and haven't been able to substantially improve the efficiency using the guidance in the I/O Manual. Could anyone

Re: [R] Need Help: Installing/Using xtable package

2007-08-09 Thread Peter Dalgaard
M. Jankowski wrote: Hi all, Let me know if I need to ask this question of the bioconductor group. I used the bioconductor utility to install this package and also the CRAN package.install function. My computer crashed a week ago. Today I reinstalled all my bioconductor/R packages. One of

Re: [R] Memory Experimentation: Rule of Thumb = 10-15 Times the Memory

2007-08-09 Thread Michael Cassin
Thanks for looking, but my file has quotes. It's also 400MB, and I don't mind waiting, but don't have 6x the memory to read it in. On 8/9/07, Gabor Grothendieck [EMAIL PROTECTED] wrote: If we add quote = FALSE to the write.csv statement it's twice as fast reading it in. On 8/9/07, Michael

Re: [R] Need Help: Installing/Using xtable package

2007-08-09 Thread Seth Falcon
Peter Dalgaard [EMAIL PROTECTED] writes: M. Jankowski wrote: Hi all, Let me know if I need to ask this question of the bioconductor group. I used the bioconductor utility to install this package and also the CRAN package.install function. My computer crashed a week ago. Today I

Re: [R] Need Help: Installing/Using xtable package

2007-08-09 Thread M. Jankowski
Ok, I got it now. Just: print(xtable(...),) Thanks! Matt On 8/9/07, Seth Falcon [EMAIL PROTECTED] wrote: Peter Dalgaard [EMAIL PROTECTED] writes: M. Jankowski wrote: Hi all, Let me know if I need to ask this question of the bioconductor group. I used the bioconductor utility to

Re: [R] small sample techniques

2007-08-09 Thread Nordlund, Dan (DSHS/RDA)
-Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Nair, Murlidharan T Sent: Thursday, August 09, 2007 9:19 AM To: Moshe Olshansky; Rolf Turner; r-help@stat.math.ethz.ch Subject: Re: [R] small sample techniques Thanks, that discussion was helpful.

Re: [R] Memory Experimentation: Rule of Thumb = 10-15 Times the Memory

2007-08-09 Thread Gabor Grothendieck
Another thing you could try would be reading it into a data base and then from there into R. The devel version of sqldf has this capability. That is it will use RSQLite to read the file directly into the database without going through R at all and then read it from there into R so its a

Re: [R] Memory Experimentation: Rule of Thumb = 10-15 Times the Memory

2007-08-09 Thread Gabor Grothendieck
Just one other thing. The command in my prior post reads the data into an in-memory database. If you find that is a problem then you can read it into a disk-based database by adding the dbname argument to the sqldf call naming the database. The database need not exist. It will be created by

Re: [R] Memory Experimentation: Rule of Thumb = 10-15 Times the Memory

2007-08-09 Thread Michael Cassin
I really appreciate the advice and this database solution will be useful to me for other problems, but in this case I need to address the specific problem of scan and read.* using so much memory. Is this expected behaviour? Can the memory usage be explained, and can it be made more efficient?

Re: [R] small sample techniques

2007-08-09 Thread Nair, Murlidharan T
n=300. 30% taking A get relief from pain; 23% taking B get relief from pain. Question: if there is no difference, are we likely to get a 7% difference? Hypothesis H0: p1-p2=0, H1: p1-p2!=0 (not equal to). 1. Weighted average of the two sample proportions: (300(0.30)+300(0.23)) / (300+300) = 0.265

Re: [R] Memory Experimentation: Rule of Thumb = 10-15 Times the Memory

2007-08-09 Thread Gabor Grothendieck
One other idea. Don't use byrow = TRUE. Matrices are stored in column order so that might be more efficient. You can always transpose it later. Haven't tested it to see if it helps. On 8/9/07, Michael Cassin [EMAIL PROTECTED] wrote: I really appreciate the advice and this database solution

Re: [R] Memory Experimentation: Rule of Thumb = 10-15 Times the Memory

2007-08-09 Thread Charles C. Berry
On Thu, 9 Aug 2007, Michael Cassin wrote: I really appreciate the advice and this database solution will be useful to me for other problems, but in this case I need to address the specific problem of scan and read.* using so much memory. Is this expected behaviour? Can the memory usage be

[R] depreciation of $ for atomic vectors

2007-08-09 Thread Ido M. Tamir
Dear All, I would like to know why $ was deprecated for atomic vectors and what I can use instead. I got used to the following idiom for working with data frames: df <- data.frame(start=1:5,end=10:6) apply(df,1,function(row){ return(row$start + row$end) }) I have a data.frame with named columns

Re: [R] depreciation of $ for atomic vectors

2007-08-09 Thread Gabor Grothendieck
Try this: DF <- data.frame(start=1:5,end=10:6) # apply(DF,1,function(row){ return(row$start + row$end) }) DF$start + DF$end apply(DF, 1, function(row) row[["start"]] + row[["end"]]) apply(DF, 1, function(row) row["start"] + row["end"]) On 8/9/07, Ido M. Tamir [EMAIL PROTECTED] wrote: Dear All, I
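A short sketch of why the replacements work, using the data frame from the thread. apply() converts the data frame to a matrix, so each row reaches the function as a named atomic vector, where `[[` and `[` index by name and `$` does not apply.

```r
DF <- data.frame(start = 1:5, end = 10:6)

# Column-wise arithmetic needs no apply() at all.
vectorised <- DF$start + DF$end

# Inside apply(), each row is a named atomic vector, so index it with [[ ]].
by_row <- apply(DF, 1, function(row) row[["start"]] + row[["end"]])
```

Both give 11 for every row here; the vectorised form is usually the simpler choice when the operation is plain arithmetic on columns.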

Re: [R] Memory Experimentation: Rule of Thumb = 10-15 Times the Memory

2007-08-09 Thread Gabor Grothendieck
Try it as a factor: big2 <- rep(letters,length=1e6) object.size(big2)/1e6 [1] 4.000856 object.size(as.factor(big2))/1e6 [1] 4.001184 big3 <- paste(big2,big2,sep='') object.size(big3)/1e6 [1] 36.2 object.size(as.factor(big3))/1e6 [1] 4.001184 On 8/9/07, Charles C. Berry [EMAIL

Re: [R] Memory Experimentation: Rule of Thumb = 10-15 Times the Memory

2007-08-09 Thread Charles C. Berry
I do not see how this helps Mike's case: res <- (as.character(1:1e6)) object.size(res) [1] 3624 object.size(as.factor(res)) [1] 4224 Anyway, my point was that if two character vectors for which all.equal() yields TRUE can differ by almost an order of magnitude in object.size(), and

Re: [R] tcltk error on Linux

2007-08-09 Thread Mark W Kimpel
Seth and Brian, Today I downloaded and installed the latest R-devel and tcltk now works. My suspicion is that Tcl was not on my path when R-devel was installed previously. BTW, I had thought that it was a courtesy to cc: the maintainers of a package when writing either R-devel or R-help

[R] RMySQL loading error

2007-08-09 Thread Clara Anton
Hi, I am having problems loading RMySQL. I am using MySQL 5.0, R version 2.5.1, and RMySQL with Windows XP. When I try to load rMySQL I get the following error: require(RMySQL) Loading required package: RMySQL Error in dyn.load(x, as.logical(local), as.logical(now)) : unable to load

Re: [R] Memory Experimentation: Rule of Thumb = 10-15 Times the Memory

2007-08-09 Thread Prof Brian Ripley
On Thu, 9 Aug 2007, Charles C. Berry wrote: On Thu, 9 Aug 2007, Michael Cassin wrote: I really appreciate the advice and this database solution will be useful to me for other problems, but in this case I need to address the specific problem of scan and read.* using so much memory. Is this

[R] S4 based package giving strange error at install time, but not at check time

2007-08-09 Thread Rajarshi Guha
Hi, I have an S4 based package that was loading fine on R 2.5.0 on both OS X and Linux. I was checking the package against 2.5.1 and doing R CMD check does not give any warnings. So I next built the package and installed it. Though the package installed fine I noticed the following

Re: [R] small sample techniques

2007-08-09 Thread Greg Snow
30 is not 30% of 300 (it is 10%), so your prop.test below is testing something different from your hand calculations. Try: prop.test(c(.30,.23)*300,c(300,300), correct=FALSE) 2-sample test for equality of proportions without continuity correction data: c(0.3, 0.23) * 300 out
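The hand calculation and the prop.test() call can be lined up in a few lines. This is a sketch using the figures from the thread (90 and 69 successes out of 300 each); the variable names are illustrative. Without the continuity correction, the X-squared statistic is the square of the pooled two-sample z ratio.

```r
n1 <- 300; n2 <- 300
x1 <- 90;  x2 <- 69          # 30% and 23% of 300
p1 <- x1 / n1
p2 <- x2 / n2

# Pooled (weighted average) proportion and the standard error under H0.
p_pool <- (x1 + x2) / (n1 + n2)
se <- sqrt(p_pool * (1 - p_pool) * (1 / n1 + 1 / n2))
z <- (p1 - p2) / se          # about 1.94, as in the hand calculation

# Same test via prop.test(), with the continuity correction switched off.
res <- prop.test(c(x1, x2), c(n1, n2), correct = FALSE)
x_squared <- unname(res$statistic)
```

With the default `correct = TRUE`, prop.test() applies Yates' continuity correction, which is why its statistic is smaller than the square of the hand-computed z.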

Re: [R] Memory Experimentation: Rule of Thumb = 10-15 Times the Memory

2007-08-09 Thread Gabor Grothendieck
The examples were just artificially created data. We don't know what the real case is but if each entry is distinct then factors won't help; however, if they are not distinct then there is a huge potential savings. Also if they are really numeric, as in your example, then storing them as numeric

Re: [R] RMySQL loading error

2007-08-09 Thread Prof Brian Ripley
On Thu, 9 Aug 2007, Clara Anton wrote: Hi, I am having problems loading RMySQL. I am using MySQL 5.0, R version 2.5.1, and RMySQL with Windows XP. More exact versions would be helpful. When I try to load rMySQL I get the following error: require(RMySQL) Loading required package:

Re: [R] RMySQL loading error

2007-08-09 Thread Gabor Grothendieck
This was just discussed: https://www.stat.math.ethz.ch/pipermail/r-help/2007-August/138142.html On 8/9/07, Clara Anton [EMAIL PROTECTED] wrote: Hi, I am having problems loading RMySQL. I am using MySQL 5.0, R version 2.5.1, and RMySQL with Windows XP. When I try to load rMySQL I get the

[R] Tukey HSD

2007-08-09 Thread Kurt Debono
Hi, I was wondering if you could help me: The following are the first few lines of my data set: subject group condition depvar s1 c ver 114.87 s1 c feet 114.87 s1 c body 114.87 s2 c ver 73.54 s2 c feet 64.32 s2 c

Re: [R] S4 based package giving strange error at install time, but not at check time

2007-08-09 Thread Prof Brian Ripley
On Thu, 9 Aug 2007, Rajarshi Guha wrote: Hi, I have an S4 based package that was loading fine on R 2.5.0 on both OS X and Linux. I was checking the package against 2.5.1 and doing R CMD check does not give any warnings. So I next built the package and installed it. Though the package

Re: [R] Memory problem

2007-08-09 Thread Gang Chen
It seems the problem lies in this line: try(fit.lme <- lme(Beta ~ group*session*difficulty+FTND, random = ~1|Subj, Model), tag <- 1); As lme fails for most iterations in the loop, the 'try' function catches one error message for each failed iteration. But the puzzling part is, why does the
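A minimal sketch of the same loop pattern written with tryCatch(), where a failed fit is turned into a missing value by an explicit error handler. The `fit_once` function is a hypothetical stand-in that fails on even iterations; it is not lme() and says nothing about the memory messages reported in this thread.

```r
# Hypothetical stand-in for a model fit that fails for some inputs.
fit_once <- function(i) {
  if (i %% 2 == 0) stop("fit failed for iteration ", i)
  i * 10
}

n <- 4
estimates <- numeric(n)
failed <- logical(n)

for (i in seq_len(n)) {
  # On error, return NA instead of printing the message and stopping.
  estimates[i] <- tryCatch(fit_once(i), error = function(e) NA_real_)
  failed[i] <- is.na(estimates[i])
}
```

Keeping a separate `failed` vector makes it easy to count or revisit the iterations that did not converge after the loop has finished.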

Re: [R] Systematically biased count data regression model

2007-08-09 Thread Matthew and Kim Bowser
Dear Paul, Thank you very much for your comment. I will apply the 'latent' approach you suggested. Sincerely, Matthew Bowser On 8/9/07, paulandpen [EMAIL PROTECTED] wrote: Matthew it is possible that your results are suffering from heterogeneity, it may be that your model performs well at

[R] a question on lda{MASS}

2007-08-09 Thread Weiwei Shi
hi, assume val is the test data while m is lda model value by using CV=F x = predict(m, val) val2 = val[, 1:(ncol(val)-1)] # the last column is class label # col is sample, row is variable then I am wondering if x$x == (apply(val2*m$scaling), 2, sum) i.e., the scaling (is it coeff vector?)

Re: [R] plot table with sapply - labeling problems

2007-08-09 Thread jim holtman
Here is a modified script that should work. In many cases where you want the names of the element of the list you are processing, you should work with the names: test <- as.data.frame(cbind(round(runif(50,0,5)), round(runif(50,0,3)), round(runif(50,0,4)))) sapply(test, table) -> vardist sapply(test,

Re: [R] Systematically biased count data regression model

2007-08-09 Thread Matthew and Kim Bowser
Dear all, I received a very helpful response from someone who requested anonymity, but to whom I am grateful. PLEASE do not quote my name or email (I am trying to stay off spam lists) Matthew: I think this is just a reflection of the fact the model does not fit perfectly. The example below is

[R] odfWeave processing error, file specific

2007-08-09 Thread Aric Gregson
Hello, I hope there is a simple explanation for this. I have been using odfWeave with great satisfaction in R 2.5.0. Unfortunately, I cannot get beyond the following error message with a particular file. I have copied and pasted into new files and the same error pops up. It looks like the error

[R] Subsetting by number of observations in a factor

2007-08-09 Thread Ron Crump
Hi, I generally do my data preparation externally to R, so this is a bit unfamiliar to me, but a colleague has asked me how to do certain data manipulations within R. Anyway, basically I can get his large file into a dataframe. One of the columns is a management group code (mg). There may be

Re: [R] Systematically biased count data regression model

2007-08-09 Thread Gabor Grothendieck
Perhaps you don't really need to predict the precise count. Maybe its good enough to predict whether the count is above or below average. In that case the model is 74% correct on a holdout sample of the last 54 points based on a model of the first 200 points. # create model on first 200 and

Re: [R] Systematically biased count data regression model

2007-08-09 Thread paulandpen
Matthew, In response to that post, I am afraid I have to disagree. I think a poor model fit (eg 16%) is a reflection of a lot of unmeasured factors and therefore random error in the model. This would explain why overall predictive performance is poor (eg a lot of error in the model) Your

Re: [R] Systematically biased count data regression model

2007-08-09 Thread Gabor Grothendieck
I guess I should not have been so quick to make that conclusion since it seems that 74% of the values in the holdout set are FALSE so simply guessing FALSE for each one would give us 74% accuracy: table(DD[201:254]) FALSE TRUE 40 14 40/54 [1] 0.7407407 On 8/9/07, Gabor Grothendieck

Re: [R] small sample techniques

2007-08-09 Thread Moshe Olshansky
Hi Murli, First of all, regarding prop.test, you made a typo: you should have used prop.test(c(69,90),c(300,300)) which gives you the squared value of 3.4228, and its square root is 1.85 which is not too far from 1.94. I would use Fisher Exact Test (fisher.test). Two sided test has a p-value

Re: [R] Subsetting by number of observations in a factor

2007-08-09 Thread jim holtman
Does this do what you want? It creates a new dataframe with those 'mg' that have at least a certain number of observations. set.seed(2) # create some test data x <- data.frame(mg=sample(LETTERS[1:4], 20, TRUE), data=1:20) # split the data into subsets based on 'mg' x.split <- split(x, x$mg)

Re: [R] Seasonality

2007-08-09 Thread Felix Andrews
?monthplot ?stl On 8/10/07, Alberto Monteiro [EMAIL PROTECTED] wrote: I have a time series x = f(t), where t is taken for each month. What is the best function to detect if _x_ has a seasonal variation? If there is such seasonal effect, what is the best function to estimate it? Function
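A brief sketch of what the two help pages point to, run on the AirPassengers series that ships with R as a stand-in for the poster's monthly data. The log transform and the "periodic" seasonal window are choices made for this illustration.

```r
# Monthly series shipped with R; the log steadies the seasonal amplitude.
x <- log(AirPassengers)

# Seasonal-trend decomposition by loess; "periodic" forces a fixed seasonal shape.
fit <- stl(x, s.window = "periodic")
seasonal <- fit$time.series[, "seasonal"]

# Rough check: size of the seasonal swing compared with the remainder noise.
seasonal_range <- diff(range(seasonal))
remainder_sd <- sd(fit$time.series[, "remainder"])

# monthplot(fit) would draw the seasonal component month by month.
```

A seasonal swing that is large relative to the remainder suggests a real seasonal effect; plot(fit) shows all three components for a visual check.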

Re: [R] Systematically biased count data regression model

2007-08-09 Thread Gabor Grothendieck
Here is one other idea. Since we are not doing that well with the entire data set lets look at a portion and see if we can do better there. This line of code seems to show that D is related to T: plot(data) so lets try conditioning D ~ T on all combos of the factor levels library(lattice)

Re: [R] odfWeave processing error, file specific

2007-08-09 Thread Kuhn, Max
Aric, Can you send me a reproducible example (code and odt file) plus the results of sessionInfo()? Thanks, Max -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Aric Gregson Sent: Thursday, August 09, 2007 6:56 PM To: r-help@stat.math.ethz.ch Subject:

Re: [R] Systematically biased count data regression model

2007-08-09 Thread Steven McKinney
Hi Matthew, You may be experiencing the classic 'regression towards the mean' phenomenon, in which case shrinkage estimation may help with prediction (extremely low and high values need to be shrunk back towards the mean) Here's a reference that discusses the issue in a manner somewhat related

[R] compute ROC curve?

2007-08-09 Thread gallon li
Hello, I have continuous test results for diseased and nondiseased subjects, say X and Y. Both are vectors of numbers. Is there any R function which can generate the step function of the ROC curve automatically? Thanks!

Re: [R] Subsetting by number of observations in a factor

2007-08-09 Thread Ron Crump
Jim, Does this do what you want? It creates a new dataframe with those 'mg' that have at least a certain number of observation. Looks good. I also have an alternative solution which appears to work, so I'll see which is quicker on the big data set in question. My solution: mgsize <-

[R] error message: image not found

2007-08-09 Thread Jungeun Song
I have a R version 2.4 and I installed R version 2.5(current version) on Mac OS X 10.4.10. I tried dyn.load to load a object code compiled from C source. I got the following error message: Error in dyn.load(x, as.logical(local), as.logical(now)) : unable to load shared library

Re: [R] Subsetting by number of observations in a factor

2007-08-09 Thread jim holtman
Here is an even faster way: # faster way x.mg.size <- table(x$mg) # count occurrences x.mg.5 <- names(x.mg.size)[x.mg.size > 5] # select greater than 5 x.new1 <- subset(x, x$mg %in% x.mg.5) # use in the subset x.new1 mg data 1 A 1 4 A 4 5 D 5 6 D 6 7 A 7 8 D 8
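Another way to express the same filter is to attach each row's group size with ave() and subset on it directly. The sketch below uses small fixed data rather than the random sample from the thread, and the threshold of 5 and the variable names are assumptions for illustration.

```r
# Fixed example data: group A has 6 rows, B has 2, C has 4.
x <- data.frame(mg = rep(c("A", "B", "C"), times = c(6, 2, 4)),
                data = 1:12)

# ave() returns, for every row, the number of rows in that row's group.
group_size <- ave(seq_along(x$mg), x$mg, FUN = length)

# Keep only rows whose group has at least 5 observations.
min_obs <- 5
kept <- x[group_size >= min_obs, ]
```

This avoids building an intermediate vector of group names; whether it is faster than the table() approach on a large data frame would need to be timed.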

Re: [R] Tukey HSD

2007-08-09 Thread Richard M. Heiberger
Please see the R-help message http://finzi.psych.upenn.edu/R/Rhelp02a/archive/105165.html

Re: [R] small sample techniques

2007-08-09 Thread Daniel Nordlund
-Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Nair, Murlidharan T Sent: Thursday, August 09, 2007 12:02 PM To: Nordlund, Dan (DSHS/RDA); r-help@stat.math.ethz.ch Subject: Re: [R] small sample techniques n=300 30% taking A relief from pain