[EMAIL PROTECTED] wrote:
Hi,
My question is as follows:
I would like to have a robust regression line. The data I have are
mostly clustered around a small range, so
the regression line tends to be influenced strongly by outlier points
(with large Cook's distance). From the applicati
"Michael Benjamin" <[EMAIL PROTECTED]> writes:
> I was just looking ahead two or three years--where is all this genomic
> array research headed? I guess I'm concerned about scalability.
Me too -- but at least in the near future, data will be growing more
than the capacity to process it.
> Is
Hi--
While I agree that we cannot agree on the ideal algorithms, we should be
taking practical steps to implement microarrays in the clinic. I think
we can all agree that our algorithms have some degree of efficacy over
and above conventional diagnostic techniques. If patients are dying
from lac
I suspect that the only way that adding a point at (0,0) would 'improve
the fit' is by giving R^2 a boost. But this would be a spurious measure
of fit, including as it does the invented point. The residual sum of
squares calculated over the actual data would be increased, probably
only by a mod
It is likely that the "true" relationship is nonlinear. There is no a priori knowledge
about linearity. In the small range where we do have enough data, the relationship
looks linear. Outside that range, the data are very scarce and also very
noisy.
This is why adding (0,0) to the d
Not a good idea, unless the regression function is *known* to be linear.
More likely it is only approximately linear over small ranges.
Murray Jorgensen
Wiener, Matthew wrote:
If you know that the line should pass through (0,0), would it make sense to
do a regression without an intercept? You
Hi,
Speed is an issue and large data sets are problematic. But I don't
think that they are the entire problem here. Much more of the problem
is that we don't yet know how to efficiently normalize microarrays
and to estimate gene expression data. We're still trying to get it
right rather than g
If you know that the line should pass through (0,0), would it make sense to
do a regression without an intercept? You can do that by putting "-1" in
the formula, like: lm(y ~ x - 1).
Hope this helps,
Matt
Matthew Wiener
RY84-202
Applied Computer Science & Mathematics Dept.
Merck Research Labs
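A minimal sketch of the no-intercept fit suggested above, on simulated data. The data, the seed, and the names fit_free and fit_origin are assumptions made for illustration; they are not from the original thread.

```r
# Simulated data whose true line passes through the origin (an assumption).
set.seed(1)
x <- runif(50, 0, 10)
y <- 2 * x + rnorm(50)

fit_free   <- lm(y ~ x)      # intercept is estimated
fit_origin <- lm(y ~ x - 1)  # "- 1" removes the intercept

coef(fit_free)    # two coefficients: (Intercept) and x
coef(fit_origin)  # one coefficient: x
```

Note that summary() computes R^2 for a no-intercept fit against a baseline of zero rather than the mean of y, so the R^2 values of the two fits are not directly comparable.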
On 3 Dec 2003 at 16:07, Richard Bonneau wrote:
Could you tell us where you found gl1ce?
Kjetil Halvorsen
> Hi,
>
> I'm using gl1ce with family=binomial like so:
> >yy
> succ fail
> [1,] 76 23
> [2,] 32 67
> [3,] 56 43
> ...
> [24,] 81 18
>
> >xx
> c1
Hi, all--
I wanted to start a (new) thread on R speed/benchmarking. There is a
nice R benchmarking overview at
http://www.sciviews.org/other/benchmark.htm, along with a free script so
you can see how your machine stacks up.
Looks like R is substantially faster than S-plus.
My problem is this
Richard Bonneau <[EMAIL PROTECTED]> writes:
> Hi,
>
> I'm using gl1ce with family=binomial like so:
> >yy
> succ fail
> [1,] 76 23
> [2,] 32 67
> [3,] 56 43
> ...
> [24,] 81 18
>
> >xx
> c1219 c643
> X1 0.04545455 0.64274145
> X2 0.17723669 0.90
Hi,
I'm using gl1ce with family=binomial like so:
>yy
succ fail
[1,] 76 23
[2,] 32 67
[3,] 56 43
...
[24,] 81 18
>xx
c1219 c643
X1 0.04545455 0.64274145
X2 0.17723669 0.90392792
...
X24 0.80629054 0.12239320
>test.gl1ce <- gl1ce(yy ~ xx, family = binomial
What is the context? What do the "outliers" represent? If you
think carefully about the context, you may find the answer.
hope this helps. spencer graves
p.s. I know statisticians who worked for HP before the split and who
still work for either HP or Agilent, I'm not certain which
Hi,
This is more a statistics question rather than R question. But I thought people on
this list may have some pointers.
My question is as follows:
I would like to have a robust regression line. The data I have are mostly clustered
around a small range, so
the regression line tends to b
I have a 3D irregular grid of a closed surface.
I would like to calculate the volume enclosed by this surface.
Can this be done in R?
Any help is very much appreciated.
best regards
karim
Karim
__
[EMAIL PROTECTED] mailing list
https://www.stat
Rajarshi Guha <[EMAIL PROTECTED]> writes:
> apply(x, c(2), funtion(v1,v2){ identical(v1,v2) }, v2=c(1,4,2))
>
> The above gives me a syntax error. I also tried:
No wonder! Try with `function' instead of `funtion'.
--
Bjørn-Helge Mevik
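For reference, a small sketch of the corrected call with `function` spelled correctly. The matrix is rebuilt column by column from the values shown in the thread; the name matches is an assumption for illustration.

```r
# The 3 x 3 matrix from the thread, filled column by column.
x <- matrix(c(1, 4, 2,
              3, 5, 7,
              3, 4, 6), nrow = 3)

# Compare every column of x with a fixed reference vector v2.
matches <- apply(x, 2, function(v1, v2) identical(v1, v2), v2 = c(1, 4, 2))
matches  # TRUE FALSE FALSE: only the first column equals c(1, 4, 2)
```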
> From: Rajarshi Guha
> On Wed, 2003-12-03 at 13:18, J.R. Lockwood wrote:
>
> > list will come up with something clever. the other issues
> is that you
> > need to be careful when doing equality comparisons with
> floating point
> > numbers. unless your matrix consists of characters or intege
It looks like you are trying to fit a Schaefer model (a special case of the
Pella-Tomlinson general production model) to the data. Such models can be
solved in a completely general way using AD Model Builder, and an example of
the general production model application can be found at
http://otter-
On Wed, 3 Dec 2003, Rajarshi Guha wrote:
> Hi,
> I have an apply statement that looks like:
>
> > check.cols <- function(v1, v2) {
> + return( identical(v1,v2) );
> + }
> > x
>      [,1] [,2] [,3]
> [1,]    1    3    3
> [2,]    4    5    4
> [3,]    2    7    6
> > apply(x, c(2), check.cols
On Wed, 2003-12-03 at 12:06, Rajarshi Guha wrote:
> Hi,
> I have a rectangular matrix and I need to check whether any columns
> are identical or not. Currently I'm looping over the columns and
> checking each column with all the others with identical().
>
> However, as experience has shown me, g
On Wed, 2003-12-03 at 13:18, J.R. Lockwood wrote:
> list will come up with something clever. the other issues is that you
> need to be careful when doing equality comparisons with floating point
> numbers. unless your matrix consists of characters or integers,
> you'll need to think about some l
Hi,
I have an apply statement that looks like:
> check.cols <- function(v1, v2) {
+ return( identical(v1,v2) );
+ }
> x
     [,1] [,2] [,3]
[1,]    1    3    3
[2,]    4    5    4
[3,]    2    7    6
> apply(x, c(2), check.cols, v2=c(7,8,9))
[1] FALSE FALSE FALSE
Is it possible to make the
Hi,
I have a rectangular matrix and I need to check whether any columns
are identical or not. Currently I'm looping over the columns and
checking each column with all the others with identical().
However, as experience has shown me, getting rid of loops is a good idea
:) Would anybody have any s
Hmmm, thanks for your suggestions. I share the
opinion that it may be a subsetting problem, but the curious thing is
that my model works with library(gbm) or a simple lm.
My task is to find out the weights/importance values
for the attributes, and I would like to compare the results between
the randomFores
Thank you Frank and Gabor for the fixes and checking and rechecking!
Everything seems to work well with the Hmisc functions tried--upData, describe
and summary.
To summarize:
1. Add the testDateTime and formatDateTime functions (copied from Frank's
messages) to the Hmisc file (or run prior to l
Christian --
You don't provide enough information (like a call) to answer this. I
suspect, though, that you may be subsetting in a way that passes
randomForest no data.
I'm not aware offhand of an easy way to get this error from randomForest. I
tried creating some data superficially similar to
I have been using a little function I wrote myself; look at
http://www.unc.edu/home/aperrin/tips/src/icc.R for the code. Not pretty,
but it works.
ap
--
Andrew J Perrin - http://www.unc.edu/~aperrin
Assistant Professor of Sociol
Thomas Stabla wrote:
>
> Hello,
>
> I have defined a new class
>
> > setClass("myclass", representation(min = "numeric", max = "numeric"))
>
> and want to write accessor functions, so that for
>
> > foo = new("myclass", min = 0, max = 1)
> > min(foo) # prints 0
> > max(foo) # prints 1
>
> At
What I did was, in presence of equal values distances, to randomize the
selection of them, and compute the distortion of the solution using
cophenetic correlation.
I computed 1 "random" trees for each of three methods: average, single
and complete linkage.
Among the "randomly" selected solution
Hi,
Brian Ripley already replied "don't use average linkage"... You
may think about k-medoid (pam) in package cluster instead.
However, often average linkage is not such a bad choice, and if you really
want to use it for your data, you may try the following:
Among the hierarchical methods, single
On Wed, 3 Dec 2003, Bruno Giordano wrote:
> Hi,
> I'm clustering objects defined by categorical variables with a hierarchical
> algorithm - average linkage.
> My distance matrix (general dissimilarity coefficient) includes several
> distances with exactly the same values.
> As I see, a standard ag
Bruno -
Many people add a tiny random number to each of the distances,
or deliberately randomize the input order. This means that
any clustering is not reproducible, unless you go back to the
original randoms, but it forces you not to pay attention to
minor differences.
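A rough sketch of the tie-breaking idea described above. The data, the noise scale eps, and all object names are assumptions for illustration; setting a seed is one way to keep the "original randoms" so the result can be reproduced.

```r
set.seed(42)  # keeps the random jitter reproducible

# Hypothetical 0/1 data; Manhattan distances on it are small integers, so ties are certain.
m <- matrix(sample(0:1, 40, replace = TRUE), nrow = 10)
d <- dist(m, method = "manhattan")

# Add noise far smaller than the smallest gap between distinct distances (here 1).
eps   <- 1e-6
d_jit <- d + runif(length(d), min = 0, max = eps)

hc <- hclust(d_jit, method = "average")

any(duplicated(as.vector(d)))      # ties are present in the original distances
any(duplicated(as.vector(d_jit)))  # ties are broken after jittering
```

Rerunning with a different seed and comparing the resulting trees gives a sense of how much the solution depends on the arbitrary tie-breaking.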
Ah, I think you're askin
On Wed, 2003-12-03 at 14:34, [EMAIL PROTECTED] wrote:
> Is there a test for independence available based on a multidimensional
> contingency table?
> I've about 300 processes, and for each of them I get numbers for failures and
> successes. I've two or more conditions under which I test these proce
Hi,
what am I doing wrong?
I'm using a data.frame with ~ 90,000 instances
and 7 attributes; 5 are binary recoded,
1 independent variable is real-valued,
and the target is real-valued, too.
The distributions of the dummy variables are not very skewed,
but the real variables contain ~ 60,000
zero valu
Hi,
I'm clustering objects defined by categorical variables with a hierarchical
algorithm - average linkage.
My distance matrix (general dissimilarity coefficient) includes several
distances with exactly the same values.
As I see, a standard agglomerative procedure ignores this problem, simply
sel
Thomas -
"sup" stands for "supremum" or "maximum". The criterion for
complete linkage clustering is that the two groups with the
smallest maximum distance between any of their members will
be joined at each stage.
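The criterion can be checked by hand on a toy example. This is only a sketch; the four points and the group assignment are made up for illustration.

```r
# Four points on a line: two tight pairs that are far apart.
pts <- matrix(c(0, 0,
                1, 0,
                5, 0,
                6, 0), ncol = 2, byrow = TRUE)
d <- as.matrix(dist(pts))

g1 <- 1:2  # first group
g2 <- 3:4  # second group

# Complete linkage: the largest distance between any member of g1 and any member of g2.
complete_dist <- max(d[g1, g2])  # 6, between the points at 0 and 6

# hclust() joins the two pairs first, then joins the groups at that maximum distance.
hc <- hclust(dist(pts), method = "complete")
max(hc$height)  # 6
```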
(I dare say you will have received many similar responses
already, but none on-li
On Wed, 3 Dec 2003, Arend P. van der Veen wrote:
> Your recommendations have worked great. I have found both cut and
> ifelse to be useful.
>
> I have one more question. When should I use factors over a character
> vector? I know that they have different uses. However, I am still
> trying to f
If you were using the colours in a model matrix,
a factor would be best, but since your case
is plotting, a character vector is best.
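A small sketch of the distinction, with made-up colour values (cols_chr, cols_fac and mm are names assumed for illustration). A factor stores integer codes plus a set of levels, which is what a model matrix needs, while plotting functions can take the colour names directly as a character vector.

```r
cols_chr <- c("red", "blue", "red", "green")  # character vector of colour names
cols_fac <- factor(cols_chr)                  # levels sorted: blue, green, red

as.integer(cols_fac)  # 3 1 3 2: the underlying integer codes

# In a model, the factor is expanded into indicator columns.
mm <- model.matrix(~ cols_fac)
colnames(mm)  # "(Intercept)" "cols_facgreen" "cols_facred"

# For plotting, the character vector is passed straight to col.
plot(seq_along(cols_chr), col = cols_chr, pch = 19)
```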
Date: 03 Dec 2003 08:26:05 -0500
From: Arend P. van der Veen <[EMAIL PROTECTED]>
To: R HELP <[EMAIL PROTECTED]>
Subject: Re: [R] Vector Assignments
Yo
Hi,
Can R calculate an intraclass correlation coefficient for clustered data,
when the outcome variable is dichotomous?
By now I calculate it by hand, estimating between- and intracluster variance
by one-way ANOVA - however I don't feel very comfortable about this, since
the distributional assump
Hello,
Is there a test for independence available based on a multidimensional
contingency table?
I've about 300 processes, and for each of them I get numbers for failures and
successes. I've two or more conditions under which I test these processes.
If I had just one process to test I could just
Your recommendations have worked great. I have found both cut and
ifelse to be useful.
I have one more question. When should I use factors over a character
vector? I know that they have different uses. However, I am still
trying to figure out how I can best take advantage of factors.
The foll
On Wed, 3 Dec 2003, Lars Peters wrote:
> Hello,
>
> I've got a big problem. I'm using R for geostatistical analyses, especially
> the fields package.
> I try to generate plots after the kriging process with help of
> image.plot(..., col=terrain.colors, ...). Everything works fine, but I want
> to
Hello,
I've got a big problem. I'm using R for geostatistical analyses, especially
the fields package.
I try to generate plots after the kriging process with help of
image.plot(..., col=terrain.colors, ...). Everything works fine, but I want
to reverse the color-palettes (heat.colors, topo.colors o
On Wed, 3 Dec 2003 10:08:04 + (GMT), you wrote:
>Hi
>
>How can one simulate correlated distributions in R for windows?
I'm not sure exactly what you're asking, but maybe the MASS function
mvrnorm() is what you want.
Duncan Murdoch
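A minimal sketch of mvrnorm() for two correlated normal variables. The target correlation of 0.8, the sample size, and the object names are assumptions chosen for illustration.

```r
library(MASS)  # provides mvrnorm()

set.seed(123)
# Covariance matrix with unit variances, so the off-diagonal is the correlation.
Sigma <- matrix(c(1.0, 0.8,
                  0.8, 1.0), nrow = 2)

sims <- mvrnorm(n = 1000, mu = c(0, 0), Sigma = Sigma)
cor(sims)[1, 2]  # close to 0.8, up to sampling error

# empirical = TRUE makes the sample mean and covariance match mu and Sigma exactly.
sims_exact <- mvrnorm(n = 1000, mu = c(0, 0), Sigma = Sigma, empirical = TRUE)
cor(sims_exact)[1, 2]
```

For non-normal marginals a different approach would be needed; this sketch covers only the multivariate normal case that mvrnorm() handles.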
Thanks!
I think the minor differences when taking the values from
rnorm result from the homogeneous distribution without an
effect. But the results of aov and lme should be
similar for data with effects, too (at least for
simple and balanced designs).
Karl
--- "Pascal A. Niklaus" <[EMAIL PROTECTED]>
schri
Hi,
I'm trying to understand the complete linkage method in hclust. Can anyone provide a
breakdown of the formula (p9 of the pdf documentation) or tell me what the "sup"
operator does/means?
thanks in advance
Tom
Assuming you are measuring Y and you have factor A fixed and factor B
random, I would create a model like:
mod <- lme(Y ~ A, random = ~1|B/A, data = mydata)
VarCorr(mod)
the term "random=~1|B" tells the model that B is a random factor, adding
the "/A" to get "random =~1|B/A" tells the model you want the
in
In which OS and on which graphics device?
For postscript/pdf see the `family' argument.
On Windows devices see the Rdevga file (?Rdevga for details) and the
README.
For X11() on Unix you will need to compile up the R-devel version (see the
FAQ for how to get it) as it is an upcoming feature.
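For the pdf case, a short sketch of the `family' argument. The output file is a temporary file and the iris data set is used purely as an assumed example.

```r
# Write a pairs plot to a PDF that uses the Times font family.
out <- tempfile(fileext = ".pdf")
pdf(out, family = "Times")
pairs(iris[, 1:4])  # axis and variable labels are drawn in Times
dev.off()
```

postscript() takes the same family argument; screen devices are configured differently, as described above.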
Hello,
Can anyone let me know how I might change the font of the axis names in a pairs
plot? I want the font "Times New Roman".
thanks.
__
[EMAIL PROTECTED] mailing list
https://www.stat.math.ethz.ch/mailman/listinfo/r-help
Hi
How can one simulate correlated distributions in R for windows?
Coomaren P. Vencatasawmy
"Philippe Grosjean" <[EMAIL PROTECTED]> writes:
> Well, writing a quick and dirty help for a function with a few lines of
> comment above or below the function code ("a la Matlab") should be nice. I
> don't think that it should be a good idea to provide a complex alternative
> solution for documen
Does anybody have a tuned Rblas.dll compiled against ATLAS for a dual Xeon
system?
Unfortunately, we have very strict security that does not allow compilers of
any sort on production desktops - we only have Pentium III development PCs.
Regards,
John Marsland
>"Wolski" <[EMAIL PROTECTED]> wrote:
>> I have seen the output and it does not matter to me anymore if prompt
>> or package.skeleton works on any platform. I hope it wasn't a too big
>> heresy. If someone would ask me what are the weak points of R, then
>> the only one that pops up immediately, is