I got this error too; I don't remember the cause, but the workaround
that worked for me was to look at the example in the manual pages of the ca.po... etc.
and to put your data in the same format. Also check whether the
functions will take that many columns as a parameter; I have not checked it.
Regrets
---
Regards,
Gaurav Yadav (mobile: +919821286118)
Assistant Manager, CCIL, Mumbai (India)
mailto:[EMAIL PROTECTED]
mailto:[EMAIL PROTECTED]
Profile: http://www.linkedin.com/in/gydec25
Keep in touch and keep mailing
Thanks Mark, that was very helpful. I'm now so close!
Can anyone tell me how to extract the value from an instance of a
difftime class? I can see the value, but how can I place it in a
dataframe?
time_string1 <- "10:17:07 02 Aug 2007"
time_string2 <- "13:17:40 02 Aug 2007"
time1 <-
On Thu, 9 Aug 2007, Matthew Walker wrote:
Thanks Mark, that was very helpful. I'm now so close!
Can anyone tell me how to extract the value from an instance of a
difftime class? I can see the value, but how can I place it in a
dataframe?
Hint: you want the
as.numeric(time_delta)
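Building on the as.numeric() hint, a minimal sketch using the time strings from the earlier post (the format string is my assumption about how they should be parsed):

```r
# parse the two timestamps; %b is locale-dependent ("Aug" works in English/C locales)
t1 <- strptime("10:17:07 02 Aug 2007", format = "%H:%M:%S %d %b %Y")
t2 <- strptime("13:17:40 02 Aug 2007", format = "%H:%M:%S %d %b %Y")
time_delta <- difftime(t2, t1, units = "secs")
# as.numeric() strips the difftime class, so the value fits in a data frame
df <- data.frame(elapsed_secs = as.numeric(time_delta))
df$elapsed_secs  # 10833 seconds
```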
Hello Dorina,
if you apply ca.jo to a system with more than five variables, a
*warning* is issued that no critical values are provided. This is not an
error, but documented in ?ca.jo. In the seminal paper of Johansen, only
critical values for up to five variables are provided. Hence, you need to refer to a
See
?gc
?Memory-limits
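As a minimal illustration of the two pointers above (object name invented):

```r
big <- numeric(1e6)                    # ~8 MB of doubles
print(object.size(big), units = "Mb")  # report the object's footprint
rm(big)                                # drop the reference...
gc()                                   # ...and let R return the memory to the pool
```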
On Wed, 8 Aug 2007, Jun Ding wrote:
Hi All,
I have two questions in terms of the memory usage in R
(sorry if the questions are naive, I am not familiar
with this at all).
1) I am running R in a Linux cluster. By reading the R
help pages, it seems there are no default
On Thu, 9 Aug 2007, Mark W Kimpel wrote:
I am having trouble getting tcltk package to load on openSuse 10.2
running R-devel. I have specifically put my /usr/share/tcl directory in
my PATH, but R doesn't seem to see it. I also have installed tk on my
system. Any ideas on what the problem is?
Hi
your construction is quite complicated so instead of refining it I tried
to do such task different way. If I understand what you want to do you can
use
set.seed(1)
T <- matrix(trunc(runif(20)*10), nrow=10, ncol=2)
T
      [,1] [,2]
 [1,]    2    2
 [2,]    3    1
 [3,]    5    6
 [4,]
Hi R Gurus:
I'm using the data() function to get the list of data sets for a package.
I would like to find the class for each data set; i.e.,data.frame, etc.
Using str(), I can find the name of the data set.
However, when I try the class function on the str output, I get
character, since the
Problem solved:
sapply(data(package = "car")$results[, 3], function(x) class(get(x)))
sorry for the silliness
Hi R Gurus:
I'm using the data() function to get the list of data sets for a package.
I would like to find the class for each data set; i.e.,data.frame, etc.
Using str(), I can find the
David Kaplan wrote:
Hi all,
I'm demonstrating a block entry regression using R for my regression
class. For each block, I get the R**2 and the associated F. I do this
with separate regressions adding the next block in and then get the
results by writing separate summary() statements
Hello again!
I have a set of character results. If one of the characters is a
blank space, followed by other characters, I want to end at the blank
space.
I tried strsplit, but it picks up again after the blank.
Any help would be much appreciated.
TIA,
Edna
Best R-users,
Here's a newbie question. I have tried to find an answer to this via
help and the ave(x, factor(), FUN = function(y) rank(z, tie = "first")) function,
but without success.
I have a dataframe (~8000 observations, registerdata) with four columns:
id,
Is this what you want?
a <- c("a b c", "1 2 3", "q - 5")
a
[1] "a b c" "1 2 3" "q - 5"
sapply(strsplit(a, "[[:blank:]]"), function(x) x[1])
[1] "a" "1" "q"
Edna Bell wrote:
I have a set of character results. If one of the characters is a
blank space, followed by other characters, I want to end at the blank
one more, shorter, solution.
a
[1] "a b c" "1 2 3" "q- 5"
gsub("\\s.+", "", a)
[1] "a"  "1"  "q-"
Edna Bell wrote:
I have a set of character results. If one of the characters is a
blank space, followed by other characters, I want to end at the blank
space.
I tried strsplit, but it picks up again
Hi,
I'm trying to replace some SAS statistical functions with R (batch calling).
But I've seen that calling R in batch mode (under Unix) takes about 2 or 3
times longer than SAS software, so it's a big performance problem for me.
Here is an extract of the calculation:
aov() will handle multiple responses and that would be considerably more
efficient than running separate fits as you seem to be doing.
Your code is nigh unreadable: please use your spacebar and remove the
redundant semicolons: `Writing R Extensions' shows you how to tidy up
your code to make
Hello,
How can I plot the discriminant scores resulting from prediction using a
dataset and an lda model and its decision boundaries using GGobi and
rggobi?
Best regards,
Dani
Daniel Valverde Saubí
Grup d'Aplicacions Biomèdiques de la Ressonància Magnètica Nuclear
(GABRMN)
Departament de
I am trying to run a GLMM on some binomial data. My fixed factors include 2
dichotomous variables, day, and distance. When I run the model:
modelA <- glmmPQL(Leaving ~ Trial*Day*Dist, random = ~1|Indiv, family = binomial)
I get the error:
iteration 1
Error in MEEM(object, conLin, control$niterEM) :
Greg, I'm going to join issue with you here! Not that I'll go near
advocating Excel-style graphics (abominable, and the Patrick Burns
URL which you cite is remarkable in its restraint). Also, I'm aware
that this is potential flame-war territory -- again, I want to avoid
that too.
However, this
Long time reader, first time poster,
I'm working on a paper regarding a term structure estimation using the
Kalman Filter Algorithm. The model in question is the Generalized Vasicek,
and since there are coupon-bonds being estimated, I'm supposed to make some
changes on the Kalman Filter.
Does
This should do what you want:
x <- read.table(textConnection("id;dg1;dg2;date;
+ 1;F28;;1997-11-04;
+ 1;F20;F702;1998-11-09;
+ 1;F20;;1997-12-03;
+ 1;F208;;2001-03-18;
+ 2;F32;;1999-03-07;
+ 2;F29;F32;2000-01-06;
+ 2;F32;;2003-07-05;
+
Hello,
I'm trying to fit an ARIMA process, using the stats package's arima function.
Can I expect that a fitted model with any parameters is stationary, causal,
and invertible?
Thanks
__
R-help@stat.math.ethz.ch mailing list
Dear R users,
Does anyone know which package provide interior branch test for phylogenetic
tree with distance based method?
Any helps are really appreciated.
Thank you.
Nora.
Hi all,
I am trying to calculate the variance ratio of a time series under
heteroskedasticity.
So I know that the variance ratio should be calculated as a weighted average
of autocorrelations.
But I don't find the same results when I calculate the variance ratio
manually and when I compute the
Try this:
Lines <- "id;dg1;dg2;date;
1;F28;;1997-11-04;
1;F20;F702;1998-11-09;
1;F20;;1997-12-03;
1;F208;;2001-03-18;
2;F32;;1999-03-07;
2;F29;F32;2000-01-06;
2;F32;;2003-07-05;
2;F323;F2800;2000-02-05;"
# replace textConnection(Lines) with actual file name
DF <- read.csv2(textConnection(Lines),
OK, I tried completely removing and reinstalling R, but this has not
worked - I am still missing window borders for Rcmdr. I am certain that
everything is installed correctly and that all dependencies are met -
there must be something trivial I am missing?!
Thanks in advance, Andy
Andy Weller
Hi
Emilio Gagliardi wrote:
Hi everyone, I'm trying to figure out how to use gPath and the documentation
is not very helpful :(
I have the following plot object:
plot-surrounds::
  background
  plot.gTree.378::
    background
    guide.gTree.355:: (background.rect.345,
[EMAIL PROTECTED] wrote:
Greg, I'm going to join issue with you here! Not that I'll go near
advocating Excel-style graphics (abominable, and the Patrick Burns
URL which you cite is remarkable in its restraint). Also, I'm aware
that this is potential flame-war territory -- again, I want to
On Wed, 8 Aug 2007, Markus Brugger wrote:
Dear R-help,
I have used the regsubsets function from the leaps package to do subset
selection of a logistic regression model with 6 independent variables and
all possible ^2 interactions. As I want to get information about the
statistics behind
You could put the numbers inside the bars in which
case it would not add to the height of the bar:
x <- 1:5
names(x) <- letters[1:5]
bp <- barplot(x)
text(bp, x - .02 * diff(par("usr")[3:4]), x)
On 8/9/07, Frank E Harrell Jr [EMAIL PROTECTED] wrote:
[EMAIL PROTECTED] wrote:
Greg, I'm going to join
Dear, r-help,
Long time reader, first time poster,
I'm working on a paper regarding a term structure estimation using the
Kalman Filter Algorithm. The model in question is the Generalized Vasicek,
and since there are coupon-bonds being estimated, I'm supposed to make some
changes on the Kalman
Gabor Grothendieck wrote:
You could put the numbers inside the bars in which
case it would not add to the height of the bar:
I think the Cleveland/Tufte prescription would be much different:
horizontal dot charts with the numbers in the right margin. I do this
frequently with great effect.
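A minimal sketch of that dot-chart-with-margin-numbers idea (the data and margin settings are invented for illustration):

```r
x <- sort(c(a = 2, b = 7, c = 4, d = 9))   # toy counts, sorted as Cleveland suggests
op <- par(mar = c(4, 4, 1, 5))             # widen the right margin for the numbers
dotchart(x, xlab = "count")                # horizontal dot chart
# print the values in the right margin, one per row (rows sit at y = 1..n)
mtext(format(x), side = 4, at = seq_along(x), las = 1, line = 1)
par(op)
```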
On Tue, 7 Aug 2007, [EMAIL PROTECTED] wrote:
Hello,
I'm trying to fit an ARIMA process, using the stats package's arima function.
Can I expect that a fitted model with any parameters is stationary, causal,
and invertible?
Please read ?arima: it answers all your questions, and points out that the
Hello,
I am wondering if anyone knows how to interpret the values returned by irf
function in the MSBVAR library. Some of the literature I have read indicates
that impulse responses in the dependent variables are often based on a 1
unit change in the independent variable, but other sources
Can anyone explain why AlgDesign's expand.formula help and output differ?
#From help:
# quad(A,B,C) makes ~(A+B+C)^2+I(A^2)+I(B^2)+I(C^2)
expand.formula(~quad(A+B+C))
#actually gives ~(A + B + C)^2 + I(A + B + C^2)
They don't _look_ the same...
Steve E
I had been looking for information about including OSX fonts in R
plots for a long time and never quite found the answer. I spent an
hour or so gathering together the following solution which, as far as
I have tested, works. I'm posting this for feedback and
archiving. I'd be interested in
I got a long list of error message repeating with the following 3
lines when running the loop at the end of this mail:
R(580,0xa000ed88) malloc: *** vm_allocate(size=327680) failed (error
code=3)
R(580,0xa000ed88) malloc: *** error: can't allocate region
R(580,0xa000ed88) malloc: *** set a
Andy Weller wrote:
OK, I tried completely removing and reinstalling R, but this has not
worked - I am still missing window borders for Rcmdr. I am certain that
everything is installed correctly and that all dependencies are met -
there must be something trivial I am missing?!
Thanks in
Hi Mark,
Prof Brian Ripley [EMAIL PROTECTED] writes:
On Thu, 9 Aug 2007, Mark W Kimpel wrote:
I am having trouble getting tcltk package to load on openSuse 10.2
running R-devel. I have specifically put my /usr/share/tcl directory in
my PATH, but R doesn't seem to see it. I also have
Dear all,
I am attempting to explain patterns of arthropod family richness
(count data) using a regression model. It seems to be able to do a
pretty good job as an explanatory model (i.e. demonstrating
relationships between dependent and independent variables), but it has
systematic problems as
Ted,
Thanks for your thoughts. I don't take it as the start of a flame war
(I don't want that either).
My original intent was to get the original posters out of the mode of
thinking they want to match what the spreadsheet does and into thinking
about what message they are trying to get across.
Gabor,
Putting the numbers in the bars is an improvement over putting them over
the bars, but if the numbers are large relative to the bars, this could
still create a fuzzy top to the bars making them harder to compare.
This also has the problem of the poorly laid out table, numbers are
easiest
Matthew
it is possible that your results are suffering from heterogeneity. It may be
that your model performs well at the aggregate level; this would explain
good aggregate fit levels and decent predictive performance etc.
you could perhaps look at a 'latent' approach to modelling your
Thanks, that discussion was helpful. Well, I have another question
I am comparing two proportions for their deviation from the hypothesized
difference of zero. My manually calculated z ratio is 1.94.
But, when I calculate it using prop.test, it uses Pearson's chi-squared
test and the X-squared
Hi Paul,
I'm sorry for not posting code, I wasn't sure if it would be helpful without
the data...should I post the code and a sample of the data? I will remember
to do that next time!
grid.gedit(gPath("ylabel.text.382"), gp = gpar(fontsize = 16))
OK, I think my confusion comes from the notation
Dear ExpRts,
I would like to perform a function with two arguments
over the rows of two matrices. There are a couple of
*applys (including mApply in Hmisc) but I haven't found
out how to do it straightforwardly.
Applying to row indices works, but looks like a poor hack
to me:
sens <- function(test,
I have a time series x = f(t), where t is taken for each
month. What is the best function to detect if _x_ has a seasonal
variation? If there is such seasonal effect, what is the
best function to estimate it?
Function arima has a seasonal parameter, but I guess this is
too complex to be useful.
Is sens really what you want? The denominator is the indexes,
e.g. if a row in goldstandard were c(0, 0, 1, 1) then you would
be dividing by 3+4. Also test[which(gold == 1)] is the same
as test[gold == 1], which is the same as test * gold since gold
has only 0's and 1's in it. Perhaps what you really
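On that reading, a sensitivity conditioned on the gold standard might look like this (a sketch; the function and argument names are taken from the thread, the body is my assumption):

```r
# proportion of gold-standard positives that the test also flags
sens <- function(test, gold) sum(test[gold == 1]) / sum(gold == 1)

sens(test = c(0, 1, 1, 0), gold = c(0, 0, 1, 1))  # 0.5
```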
Hi all,
Let me know if I need to ask this question of the bioconductor group.
I used the bioconductor utility to install this package and also the
CRAN package.install function.
My computer crashed a week ago. Today I reinstalled all my
bioconductor/R packages. One of my scripts is giving me the
Hi,
I've been having similar experiences and haven't been able to
substantially improve the efficiency using the guidance in the I/O
Manual.
Could anyone advise on how to improve the following scan()? It is not
based on my real file, please assume that I do need to read in
characters, and can't
If we add quote = FALSE to the write.csv statement, it's twice as fast
reading it in.
On 8/9/07, Michael Cassin [EMAIL PROTECTED] wrote:
Hi,
I've been having similar experiences and haven't been able to
substantially improve the efficiency using the guidance in the I/O
Manual.
Could anyone
M. Jankowski wrote:
Hi all,
Let me know if I need to ask this question of the bioconductor group.
I used the bioconductor utility to install this package and also the
CRAN package.install function.
My computer crashed a week ago. Today I reinstalled all my
bioconductor/R packages. One of
Thanks for looking, but my file has quotes. It's also 400MB, and I don't
mind waiting, but don't have 6x the memory to read it in.
On 8/9/07, Gabor Grothendieck [EMAIL PROTECTED] wrote:
If we add quote = FALSE to the write.csv statement its twice as fast
reading it in.
On 8/9/07, Michael
Peter Dalgaard [EMAIL PROTECTED] writes:
M. Jankowski wrote:
Hi all,
Let me know if I need to ask this question of the bioconductor group.
I used the bioconductor utility to install this package and also the
CRAN package.install function.
My computer crashed a week ago. Today I
Ok, I got it now.
Just: print(xtable(...),)
Thanks!
Matt
On 8/9/07, Seth Falcon [EMAIL PROTECTED] wrote:
Peter Dalgaard [EMAIL PROTECTED] writes:
M. Jankowski wrote:
Hi all,
Let me know if I need to ask this question of the bioconductor group.
I used the bioconductor utility to
-Original Message-
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] On Behalf Of Nair,
Murlidharan T
Sent: Thursday, August 09, 2007 9:19 AM
To: Moshe Olshansky; Rolf Turner; r-help@stat.math.ethz.ch
Subject: Re: [R] small sample techniques
Thanks, that discussion was helpful.
Another thing you could try would be reading it into a data base and then
from there into R.
The devel version of sqldf has this capability. That is, it will use RSQLite
to read the file directly into the database without going through R at all,
and then read it from there into R, so it's a
Just one other thing.
The command in my prior post reads the data into an in-memory database.
If you find that is a problem then you can read it into a disk-based
database by adding the dbname argument to the sqldf call
naming the database. The database need not exist. It will
be created by
I really appreciate the advice and this database solution will be useful to
me for other problems, but in this case I need to address the specific
problem of scan and read.* using so much memory.
Is this expected behaviour? Can the memory usage be explained, and can it be
made more efficient?
n = 300
30% taking A report relief from pain
23% taking B report relief from pain
Question: if there is no difference, are we likely to get a 7% difference?
Hypothesis
H0: p1 - p2 = 0
H1: p1 - p2 != 0 (not equal to)
1. Weighted average of the two sample proportions:
(300(0.30) + 300(0.23)) / (300 + 300) = 0.265
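The hand calculation above, transcribed into R (numbers taken from the post):

```r
p1 <- 0.30; p2 <- 0.23; n1 <- 300; n2 <- 300
p_bar <- (n1 * p1 + n2 * p2) / (n1 + n2)        # pooled proportion, 0.265
se <- sqrt(p_bar * (1 - p_bar) * (1/n1 + 1/n2)) # pooled standard error
z <- (p1 - p2) / se                             # about 1.94, the manual z ratio
```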
One other idea. Don't use byrow = TRUE. Matrices are stored in column
order so that might be more efficient. You can always transpose it later.
Haven't tested it to see if it helps.
On 8/9/07, Michael Cassin [EMAIL PROTECTED] wrote:
I really appreciate the advice and this database solution
On Thu, 9 Aug 2007, Michael Cassin wrote:
I really appreciate the advice and this database solution will be useful to
me for other problems, but in this case I need to address the specific
problem of scan and read.* using so much memory.
Is this expected behaviour? Can the memory usage be
Dear All,
I would like to know why $ was deprecated for atomic vectors and
what I can use instead.
I got used to the following idiom for working with
data frames:
df <- data.frame(start = 1:5, end = 10:6)
apply(df, 1, function(row){ return(row$start + row$end) })
I have a data.frame with named columns
Try this:
DF <- data.frame(start = 1:5, end = 10:6)
# apply(DF, 1, function(row){ return(row$start + row$end) })
DF$start + DF$end
apply(DF, 1, function(row) row[["start"]] + row[["end"]])
apply(DF, 1, function(row) row["start"] + row["end"])
On 8/9/07, Ido M. Tamir [EMAIL PROTECTED] wrote:
Dear All,
I
Try it as a factor:
big2 <- rep(letters, length = 1e6)
object.size(big2)/1e6
[1] 4.000856
object.size(as.factor(big2))/1e6
[1] 4.001184
big3 <- paste(big2, big2, sep = '')
object.size(big3)/1e6
[1] 36.2
object.size(as.factor(big3))/1e6
[1] 4.001184
On 8/9/07, Charles C. Berry [EMAIL
I do not see how this helps Mike's case:
res <- as.character(1:1e6)
object.size(res)
[1] 3624
object.size(as.factor(res))
[1] 4224
Anyway, my point was that if two character vectors for which all.equal()
yields TRUE can differ by almost an order of magnitude in object.size(),
and
Seth and Brian,
Today and downloaded and installed the latest R-devel and tcltk now
works. My suspicion is that Tcl was not on my path when R-devel was
installed previously.
BTW, I had thought that it was a courtesy to cc: the maintainers of a
package when writing either R-devel or R-help
Hi,
I am having problems loading RMySQL.
I am using MySQL 5.0, R version 2.5.1, and RMySQL with Windows XP.
When I try to load RMySQL I get the following error:
require(RMySQL)
Loading required package: RMySQL
Error in dyn.load(x, as.logical(local), as.logical(now)) :
unable to load
On Thu, 9 Aug 2007, Charles C. Berry wrote:
On Thu, 9 Aug 2007, Michael Cassin wrote:
I really appreciate the advice and this database solution will be useful to
me for other problems, but in this case I need to address the specific
problem of scan and read.* using so much memory.
Is this
Hi, I have an S4-based package that was loading fine on R
2.5.0 on both OS X and
Linux. I was checking the package against 2.5.1 and doing R CMD check
does not give any warnings. So I next built the package and installed
it. Though the package installed fine I noticed the following
30 is not 30% of 300 (it is 10%), so your prop.test below is testing
something different from your hand calculations. Try:
prop.test(c(.30,.23)*300,c(300,300), correct=FALSE)
2-sample test for equality of proportions without continuity
correction
data: c(0.3, 0.23) * 300 out
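For the record, the X-squared that prop.test reports (without continuity correction) is the square of the usual pooled z statistic, so the two approaches agree; the counts below are 30% and 23% of 300:

```r
# 90 = 0.30 * 300 successes in group A, 69 = 0.23 * 300 in group B
res <- prop.test(c(90, 69), c(300, 300), correct = FALSE)
z <- sqrt(res$statistic)  # recovers the hand-computed z of about 1.94
```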
The examples were just artificially created data. We don't know what the
real case is but if each entry is distinct then factors won't help; however,
if they are not distinct then there is a huge potential savings. Also
if they are
really numeric, as in your example, then storing them as numeric
On Thu, 9 Aug 2007, Clara Anton wrote:
Hi,
I am having problems loading RMySQL.
I am using MySQL 5.0, R version 2.5.1, and RMySQL with Windows XP.
More exact versions would be helpful.
When I try to load rMySQL I get the following error:
require(RMySQL)
Loading required package:
This was just discussed:
https://www.stat.math.ethz.ch/pipermail/r-help/2007-August/138142.html
On 8/9/07, Clara Anton [EMAIL PROTECTED] wrote:
Hi,
I am having problems loading RMySQL.
I am using MySQL 5.0, R version 2.5.1, and RMySQL with Windows XP.
When I try to load rMySQL I get the
Hi,
I was wondering if you could help me:
The following are the first few lines of my data set:
subject group condition depvar
s1 c ver 114.87
s1 c feet114.87
s1 c body114.87
s2 c ver 73.54
s2 c feet64.32
s2 c
On Thu, 9 Aug 2007, Rajarshi Guha wrote:
Hi, I have an S4-based package that was loading fine on R
2.5.0 on both OS X and
Linux. I was checking the package against 2.5.1 and doing R CMD check
does not give any warnings. So I next built the package and installed
it. Though the package
It seems the problem lies in this line:
try(fit.lme <- lme(Beta ~ group*session*difficulty + FTND, random =
~1|Subj, Model), tag <- 1);
As lme fails for most iterations in the loop, the 'try' function
catches one error message for each failed iteration. But the puzzling
part is, why does the
Dear Paul,
Thank you very much for your comment. I will apply the 'latent'
approach you suggested.
Sincerely,
Matthew Bowser
On 8/9/07, paulandpen [EMAIL PROTECTED] wrote:
Matthew
it is possible that your results are suffering from heterogeneity, it may be
that your model performs well at
hi,
assume
val is the test data while m is the lda model fitted with CV=F
x = predict(m, val)
val2 = val[, 1:(ncol(val)-1)] # the last column is the class label
# col is sample, row is variable
then I am wondering if
x$x == apply(val2 * m$scaling, 2, sum)
i.e., the scaling (is it coeff vector?)
Here is a modified script that should work. In many cases where you
want the names of the element of the list you are processing, you
should work with the names:
test <- as.data.frame(cbind(round(runif(50,0,5)), round(runif(50,0,3)), round(runif(50,0,4))))
sapply(test, table) -> vardist
sapply(test,
Dear all,
I received a very helpful response from someone who requested
anonymity, but to whom I am grateful.
PLEASE do not quote my name or email (I am trying to stay off spam lists)
Matthew: I think this is just a reflection of
the fact that the model does not fit perfectly. The
example below is
Hello,
I hope there is a simple explanation for this. I have been using
odfWeave with great satisfaction in R 2.5.0. Unfortunately, I cannot
get beyond the following error message with a particular file. I have
copied and pasted into new files and the same error pops up. It looks
like the error
Hi,
I generally do my data preparation externally to R, so
this is a bit unfamiliar to me, but a colleague has asked
me how to do certain data manipulations within R.
Anyway, basically I can get his large file into a dataframe.
One of the columns is a management group code (mg). There may be
Perhaps you don't really need to predict the precise count.
Maybe it's good enough to predict whether the count is above
or below average. In that case the model is 74% correct on a
holdout sample of the last 54 points based on a model of the
first 200 points.
# create model on first 200 and
Matthew,
In response to that post, I am afraid I have to disagree. I think a poor model
fit (eg 16%) is a reflection of a lot of unmeasured factors and therefore
random error in the model. This would explain why overall predictive
performance is poor (e.g. a lot of error in the model). Your
I guess I should not have been so quick to make that conclusion since
it seems that 74% of the values in the holdout set are FALSE so simply
guessing FALSE for each one would give us 74% accuracy:
table(DD[201:254])
FALSE  TRUE
   40    14
40/54
[1] 0.7407407
On 8/9/07, Gabor Grothendieck
Hi Murli,
First of all, regarding prop.test, you made a typo:
you should have used prop.test(c(69,90),c(300,300))
which gives you the squared value of 3.4228, and its
square root is 1.85, which is not too far from 1.94.
I would use Fisher Exact Test (fisher.test). Two
sided test has a p-value
Does this do what you want? It creates a new dataframe with those
'mg' that have at least a certain number of observations.
set.seed(2)
# create some test data
x <- data.frame(mg = sample(LETTERS[1:4], 20, TRUE), data = 1:20)
# split the data into subsets based on 'mg'
x.split <- split(x, x$mg)
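The fragment above stops after the split; a plausible continuation (the threshold and toy data are my invention) keeps only the groups with enough rows and reassembles them:

```r
# toy data: group A has 7 rows, group B has 3
x <- data.frame(mg = rep(c("A", "B"), c(7, 3)), data = 1:10)
x.split <- split(x, x$mg)
# keep only the subsets with more than 5 observations, then rbind them back
x.big <- do.call(rbind, x.split[sapply(x.split, nrow) > 5])
nrow(x.big)  # 7 (only group A survives)
```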
?monthplot
?stl
On 8/10/07, Alberto Monteiro [EMAIL PROTECTED] wrote:
I have a time series x = f(t), where t is taken for each
month. What is the best function to detect if _x_ has a seasonal
variation? If there is such seasonal effect, what is the
best function to estimate it?
Function
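Expanding the ?stl pointer with the built-in AirPassengers series (the choice of example data and the log transform are mine):

```r
# seasonal-trend decomposition of a monthly series by loess
fit <- stl(log(AirPassengers), s.window = "periodic")
seasonal <- fit$time.series[, "seasonal"]  # the estimated seasonal effect
# monthplot(seasonal) then shows the month-by-month pattern
```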
Here is one other idea. Since we are not doing that well with
the entire data set, let's look at a portion and see if we can do
better there. This line of code seems to show that D is related
to T:
plot(data)
so let's try conditioning D ~ T on all combos of the factor levels
library(lattice)
Aric,
Can you send me a reproducible example (code and odt file) plus the
results of sessionInfo()?
Thanks,
Max
-Original Message-
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] On Behalf Of Aric Gregson
Sent: Thursday, August 09, 2007 6:56 PM
To: r-help@stat.math.ethz.ch
Subject:
Hi Matthew,
You may be experiencing the classic
'regression towards the mean' phenomenon,
in which case shrinkage estimation may help
with prediction (extremely low and high values
need to be shrunk back towards the mean)
Here's a reference that discusses the issue
in a manner somewhat related
Hello,
I have continuous test results for diseased and nondiseased subjects, say X
and Y. Both are vectors of numbers.
is there any R function which can generate the step function of ROC curve
automatically?
Thanks!
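Absent a package suggestion, an empirical ROC step curve can be sketched directly in base R (the function name and scoring convention are my assumptions):

```r
# X: scores for diseased, Y: scores for nondiseased (higher = more diseased)
roc_points <- function(X, Y) {
  cuts <- sort(unique(c(-Inf, X, Y, Inf)), decreasing = TRUE)
  data.frame(fpr = sapply(cuts, function(k) mean(Y >= k)),   # false positive rate
             tpr = sapply(cuts, function(k) mean(X >= k)))   # true positive rate
}
# plot(tpr ~ fpr, data = roc_points(X, Y), type = "s") draws the step function
```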
Jim,
Does this do what you want? It creates a new dataframe with those
'mg' that have at least a certain number of observations.
Looks good. I also have an alternative solution which appears to work,
so I'll see which is quicker on the big data set in question.
My solution:
mgsize <-
I have R version 2.4 and I installed R version 2.5 (current version)
on Mac OS X 10.4.10. I tried dyn.load to load object code
compiled from C source. I got the following error message:
Error in dyn.load(x, as.logical(local), as.logical(now)) :
unable to load shared library
Here is an even faster way:
# faster way
x.mg.size <- table(x$mg)                   # count occurrences
x.mg.5 <- names(x.mg.size)[x.mg.size > 5]  # select those greater than 5
x.new1 <- subset(x, x$mg %in% x.mg.5)      # use in the subset
x.new1
  mg data
1  A    1
4  A    4
5  D    5
6  D    6
7  A    7
8  D    8
Please see the R-help message
http://finzi.psych.upenn.edu/R/Rhelp02a/archive/105165.html
-Original Message-
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] On Behalf Of Nair,
Murlidharan T
Sent: Thursday, August 09, 2007 12:02 PM
To: Nordlund, Dan (DSHS/RDA); r-help@stat.math.ethz.ch
Subject: Re: [R] small sample techniques
n=300
30% taking A relief from pain