Re: [R] Q: Mean, median and confidence intervals with functions summary boxplot.stats

2007-08-30 Thread S Ellison
If you look at ?boxplot.stats, you will find that the confidence interval it reports is centred on the median: the notches (if requested) extend to '+/-1.58 IQR/sqrt(n)'. If you have skewed data it is very possible (as you have found) that the mean is outside median +/- 1.58 IQR/sqrt(n).
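A minimal sketch of the point above, using made-up right-skewed data (the vector x is an assumption for illustration, not from the thread). It recomputes the notch limits by hand from the hinges that boxplot.stats() returns and checks whether the mean falls inside them.

```r
# Hypothetical right-skewed sample (illustrative values only)
x <- c(1, 1, 2, 2, 3, 3, 4, 5, 8, 13, 21, 40, 90)

bs <- boxplot.stats(x)

# boxplot.stats() takes its "IQR" as the spread between the two hinges
hinge_spread <- bs$stats[4] - bs$stats[2]
manual_conf  <- median(x) + c(-1.58, 1.58) * hinge_spread / sqrt(length(x))

# Does the hand calculation agree with the reported interval?
conf_matches <- isTRUE(all.equal(bs$conf, manual_conf))

# Is the mean inside the notch interval around the median?
mean_inside <- mean(x) >= bs$conf[1] && mean(x) <= bs$conf[2]

print(bs$conf)
print(mean(x))
```

With data this skewed the mean is pulled well above the upper notch limit, which is the behaviour described in the post.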

Re: [R] Experimental Design with R

2007-08-28 Thread S Ellison
A full factorial can be generated using expand.grid. Try AlgDesign, BHH2 and conf.design for various fractional and D-optimal design generators; they all work. I haven't tried the experiment package yet, though the synopsis looks interesting. The 'Design' package itself I found disappointing for
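As a small illustration of the expand.grid suggestion, here is a sketch of a two-level full factorial in three factors. The factor names A, B, C and the -1/+1 coding are assumptions chosen for the example.

```r
# Two-level full factorial in three hypothetical factors A, B, C
design <- expand.grid(A = c(-1, 1), B = c(-1, 1), C = c(-1, 1))

# One row per treatment combination: 2^3 runs
n_runs <- nrow(design)
print(design)
```

The first factor varies fastest, so the rows come out in standard (Yates) order when the levels are given as c(-1, 1).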

Re: [R] for plots

2007-08-17 Thread S Ellison
?plot.gam or ?preplot.gam says there is an ask= option, slightly bizarrely set to F by default. If ask=T, you'll get a menu of options for plotting. On Windows, hitting Esc will get you out of the menu. Example: gam(Kyphosis ~ s(Age,4) + Number, family = binomial, data=kyphosis, trace=TRUE)

Re: [R] Odp: Very new - beginners questions

2007-08-13 Thread S Ellison
Petr PIKAL [EMAIL PROTECTED] 13/08/2007 14:10:40 How do I read one row of data so as to load site2 into a variable called site2? ?read.table and other read. commands. Don't forget scan(). But Petr is right, you probably do not need to read one row at a time. Once I plot a graph

[R] AlgDesign expand.formula()

2007-08-09 Thread S Ellison
Can anyone explain why AlgDesign's expand.formula help and output differ? #From help: # quad(A,B,C) makes ~(A+B+C)^2+I(A^2)+I(B^2)+I(C^2) expand.formula(~quad(A+B+C)) #actually gives ~(A + B + C)^2 + I(A + B + C^2) They don't _look_ the same... Steve E

Re: [R] Opening a script with the R editor via file association (on Windows)

2007-08-07 Thread S Ellison
Does ?Rscript help? Christopher Green [EMAIL PROTECTED] 06/08/2007 23:13:47 On Fri, 3 Aug 2007, Duncan Murdoch wrote: On 03/08/2007 7:16 PM, Christopher Green wrote: Is there an easy way to open an R script file in the R Editor found in more recent versions of R on Windows via a file

Re: [R] Function for trim blanks from a string(s)?

2007-08-07 Thread S Ellison
Stripping either or both ends can be achieved by the somewhat tortuous but fairly general stripspace <- function(x) sub("^\\s*([^ ]*.*[^ ])\\s*$", "\\1", x) which uses a replacement buffer to keep the useful part of the string. Prof Brian Ripley [EMAIL PROTECTED] 06/08/2007 21:23:49 I am sure Marc

Re: [R] Sorting data frame by a string variable

2007-07-18 Thread S Ellison
Dimitri, If you try order(c("b","a","c")) [1] 2 1 3 or sort(c("b","a","c")) [1] "a" "b" "c" you will see that sort() and order() DO respect character order. Your problem could be that your data frame variable is not a character but a factor (the default for read.table, for example). Check the class of the
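A short sketch of the character-versus-factor distinction described above. The level order given to factor() is an assumption picked to make the difference visible; a factor is ordered by its level codes, not by its labels.

```r
x <- c("b", "a", "c")

# Character vector: ordered alphabetically
ord_chr <- order(x)

# Factor with a deliberately non-alphabetical level order (illustrative)
f <- factor(x, levels = c("c", "b", "a"))

# Factor: ordered by the underlying level codes
ord_fac <- order(f)

# Converting back to character restores alphabetical ordering
ord_back <- order(as.character(f))

print(class(f))
print(ord_chr)
print(ord_fac)
```

Checking class() of the data frame column, as the post suggests, shows which of the two behaviours applies.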

Re: [R] type III ANOVA for a nested linear model

2007-07-12 Thread S Ellison
The aliasing problem arises in alias(), which is what Anova uses to detect aliasing. It is simply the fact that anova more or less blithely ignores the NA's that makes anova behave apparently more 'sensibly' than Anova. But like Carsten, I found this difficult to understand. Unordered factors

Re: [R] A More efficient method?

2007-07-04 Thread S Ellison
#Given Cat=c('a','a','a','b','b','b','a','a','b') # Categorical variable #and defining coding <- array(c(-1,1), dimnames=list(unique(Cat))) #(ie an array of values corresponding to your character array levels, and with names set to those levels) coding[Cat] #does what you want. Keith
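Laid out as a runnable snippet, the lookup-by-name idea from the post reads as follows. The variable recoded and the final as.vector() call are additions for illustration.

```r
Cat <- c('a', 'a', 'a', 'b', 'b', 'b', 'a', 'a', 'b')  # categorical variable

# One value per level, named by the level it stands for
coding <- array(c(-1, 1), dimnames = list(unique(Cat)))

# Indexing by name maps every element of Cat to its code in one step
recoded <- as.vector(coding[Cat])
print(recoded)
```

No loop or ifelse() chain is needed because character indexing does the matching.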

Re: [R] working with matrix

2007-07-04 Thread S Ellison
Yemi, Try which.max(apply(X,1,prod)) (or possibly abs(apply(X,1,prod)) if you're only interested in the unsigned product max). S Yemi Oyeyemi [EMAIL PROTECTED] 04/07/2007 17:18:41 I am new in R and I want to solve this problem; I have a matrix X (with n-rows and p-columns) my problem is to
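A small worked example of the row-product suggestion. The matrix X below is made up for illustration; its row products are 6, -20 and 8.

```r
# Hypothetical 3 x 3 matrix, filled row by row
X <- matrix(c( 1, 2, 3,
              -4, 5, 1,
               2, 2, 2), nrow = 3, byrow = TRUE)

row_prod <- apply(X, 1, prod)          # product of each row

best_signed   <- which.max(row_prod)       # row with the largest product
best_unsigned <- which.max(abs(row_prod))  # row with the largest |product|

print(row_prod)
```

The two answers differ here because the row with the largest magnitude has a negative product.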

Re: [R] : regular expressions: escaping a dot

2007-06-29 Thread S Ellison
EOF from a keyboard on Windows is often Ctrl+Z. But you're right; it's platform dependent. Ctrl+Z in Unix has less desirable effects on readLines(). And on your running R process... Drat. Peter Dalgaard [EMAIL PROTECTED] 29/06/2007 12:42:41 S Ellison wrote: Wouldn't it be nice if the Help

[R] Boxplot issues

2007-06-22 Thread S Ellison
Boxplot and bxp seem to have changed behaviour a bit of late (R 2.4.1). Or maybe I am mis-remembering. An annoying feature is that while at=3:6 will work, there is no way of overriding the default xlim of 0.5 to n+0.5. That prevents plotting boxes on, for example, interval scales - a useful

Re: [R] abline plots at wrong abscissae after boxplot

2007-06-22 Thread S Ellison
Boxplot positions and labels are not the same thing. You have groups 'called' 2, 3, 4. As factors - which is what boxplot will turn them into - they will be treated as arbitrary labels and _numbered_ 1:3 (try as.numeric(factor(x))). So your lm() used 2:4, but your plot (and abline) uses 1:3
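The label-versus-position point can be checked directly. The grouping vector x below is an assumption standing in for the poster's data.

```r
# Hypothetical grouping variable with groups 'called' 2, 3, 4
x <- c(2, 3, 4, 2, 3, 4)

f <- factor(x)

# Positions used on the boxplot axis: the level codes 1, 2, 3
pos <- as.numeric(f)

# The original values have to be recovered through the labels
vals <- as.numeric(as.character(f))

print(pos)
print(vals)
```

A line fitted against vals therefore sits one unit to the right of the boxes drawn at pos.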

Re: [R] help w/ nonlinear regression

2007-06-19 Thread S Ellison
Your B coefficient differs by a suspicious-looking factor of 2.30... (ln(10)). Does SPSS log() mean log10 or ln? R log(x) uses ln(x). S Eduardo Esteves [EMAIL PROTECTED] 19/06/2007 17:19:35 Dear All, I'd like to fit a kind of logistic model to small data-set using nonlinear least-squares
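A quick check of where the factor of about 2.30 comes from: for any positive value, the natural log and the base-10 log differ by the constant ln(10). The value 100 is arbitrary.

```r
v <- 100  # arbitrary positive value

# In R, log() is the natural logarithm; log10() is base 10
ratio <- log(v) / log10(v)

print(ratio)
print(log(10))
```

A coefficient multiplying log(x) will therefore change by this same factor when the base of the logarithm changes.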

Re: [R] Legend outside plotting area

2007-05-29 Thread S Ellison
Judith, Haven't tried it in anger myself, but two things suggest themselves. The first is to use the lattice package, which seems to draw keys (autokey option) outside the plot region by default. Look at the last couple of examples in ?xyplot. May save a lot of hassle... In classical R

Re: [R] hierarhical cluster analysis of groups of vectors

2007-05-29 Thread S Ellison
Anders; If you want to _test_ for differences, ANOVA applied to the (typically) first principal component scores for each object would give a fairly quick indication of whether there was a case to answer (though scaling is an issue to be aware of; a low-variance variable might differ

Re: [R] shell.exec() on Windows, unexpected behavior

2007-05-14 Thread S Ellison
Richard; Windows file open behaviour is dictated by the complete set of file associations in the windows registry. You can inspect them in Explorer via tools|folder options|File types, by finding the file type and looking at the advanced options. I would suspect that installing acrobat and

Re: [R] Make sign test show test statistics

2007-05-14 Thread S Ellison
Johan, Tests return objects of class htest; see ?t.test for a description. binom.test(59,100)$statistic confirms that Ted Harding is right about the test statistic; it's just the number of successes. Steve Ellison Johan A. Stenberg [EMAIL PROTECTED] 14/05/2007 11:07:53 When I perform a
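A brief sketch of pulling components out of an htest object, using the same call quoted in the post. The object name bt is an assumption for illustration.

```r
bt <- binom.test(59, 100)

# The result is a list of class "htest"; names() shows what it holds
print(class(bt))
print(names(bt))

# The 'statistic' component is simply the number of successes
stat <- bt$statistic
print(stat)

# Other components are extracted the same way
pval <- bt$p.value
```

The same approach works for other tests that return htest objects, such as t.test().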

Re: [R] Allocating shelf space

2007-05-11 Thread S Ellison
Isn't this rather closely related to the (more or less classic) bin packing and knapsack problems? The 'hardness' of the problem is combinatoric (NP) and that is statistical, but the answers aren't particularly statistical. Both have been studied quite a lot for pragmatic reasons (packing

Re: [R] Weighted least squares

2007-05-09 Thread S Ellison
Adaikalavan Ramasamy [EMAIL PROTECTED] 09/05/2007 01:37:31 ..the variance of means of each row in table above is ZERO because the individual elements that comprise each row are identical. ... Then is it valid then to use lm( y ~ x, weights=freq ) ? ermmm... probably not, because if that

Re: [R] Bad optimization solution

2007-05-08 Thread S Ellison
Paul, You have picked a function that is not smoothly differentiable and also started at one of many 'stationary' points in a system with multiple solutions. In practice, I think it'll get a zero gradient as the algorithm does things numerically and you have a symmetric function. It probably

Re: [R] Weighted least squares

2007-05-08 Thread S Ellison
Doubling the length of the data doubles the apparent number of observations. You would expect the standard error to reduce by sqrt(2) (which it just about does, though I'm not clear on why it's not exact here). Weights are not as simple as they look. You have given all your data the same weight,
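One possible reading of the "not exact" remark, sketched under the assumption that the comparison was between an unweighted lm() fit and the same fit on literally duplicated rows (the data below are made up). Duplication doubles both the residual sum of squares and the spread of x, but the residual degrees of freedom go from n - 2 to 2n - 2, so the ratio of standard errors is sqrt(2 * (n - 1) / (n - 2)) rather than sqrt(2).

```r
# Hypothetical data, n = 10
d1 <- data.frame(x = 1:10,
                 y = c(2.1, 3.9, 6.2, 7.8, 10.1, 12.2, 13.8, 16.1, 18.0, 20.3))
d2 <- rbind(d1, d1)  # every observation duplicated

se_slope <- function(d) coef(summary(lm(y ~ x, data = d)))["x", "Std. Error"]

n <- nrow(d1)
ratio    <- se_slope(d1) / se_slope(d2)       # observed reduction
expected <- sqrt(2 * (n - 1) / (n - 2))       # allows for the change in df

print(c(ratio = ratio, expected = expected, sqrt2 = sqrt(2)))
```

The discrepancy from sqrt(2) shrinks as n grows, which fits the description of the reduction as "just about" sqrt(2).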

Re: [R] Weighted least squares

2007-05-08 Thread S Ellison
Hadley, You asked .. what is the usual way to do a linear regression when you have aggregated data? Least squares generally uses inverse variance weighting. For aggregated data fitted as mean values, you just need the variances for the _means_. So if you have individual means x_i and sd's

Re: [R] Tail area of sum of Chi-square variables

2007-03-29 Thread S Ellison
I was wondering if there are any R functions that give the tail area of a sum of chisquare distributions of the type: a_1 X_1 + a_2 X_2 where a_1 and a_2 are constants and X_1 and X_2 are independent chi-square variables with different degrees of freedom. You might also check out

Re: [R] Creating q and p functions from a self-defined distribut

2007-03-15 Thread S Ellison
[EMAIL PROTECTED] 15/03/2007 13:26:52 On 15-Mar-07 12:09:42, Eli Gurarie wrote: Hello all, I am fishing for some suggestions on efficient ways to make qdist and pdist type functions from an arbitrary distribution whose probability density function I've defined myself. Ted Harding

[R] Distinct combinations for bootstrapping small sets

2007-03-06 Thread S Ellison
Small data sets (6-12 values, or a similarly small number of groups) which don't look nice and symmetric are quite common in my field (analytical chemistry and biological variants thereof), and often contain outliers or at least stragglers that I cannot simply discard. One of the things I