If you look at ?boxplot.stats, you will find that the confidence interval it
reports is centred on the median:
The notches (if requested) extend to '+/-1.58 IQR/sqrt(n)'.
If you have skewed data it is very possible (as you have found) that the mean
is outside median+/-1.58 IQR/sqrt(n).
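A minimal sketch of that point, using a made-up right-skewed sample (the values in x are illustrative and not from the original thread). It assumes only base R, and it rebuilds the interval by hand from the hinges that fivenum() returns, which is what boxplot.stats() uses for its IQR.

```r
# Hypothetical right-skewed sample (illustrative values only)
x <- c(1, 1, 2, 2, 2, 3, 3, 4, 5, 50)

bs <- boxplot.stats(x)

# Rebuild the interval by hand: median +/- 1.58 * (hinge spread) / sqrt(n)
hinges <- fivenum(x)
manual <- median(x) + c(-1.58, 1.58) * (hinges[4] - hinges[2]) / sqrt(length(x))

print(bs$conf)   # interval reported by boxplot.stats()
print(manual)    # the same interval, computed by hand
print(mean(x))   # the mean, pulled upwards by the single large value

# TRUE when the mean falls outside the notch interval
mean_outside <- mean(x) < bs$conf[1] || mean(x) > bs$conf[2]
print(mean_outside)
```

With a sample like this the mean sits well above the upper notch limit, because the interval describes the median and not the mean.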
A full factorial can be generated using expand.grid.
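As a minimal sketch of the expand.grid approach, assuming three two-level factors coded -1/+1 (the factor names A, B and C are illustrative):

```r
# Full 2^3 factorial: every combination of the levels of A, B and C
design <- expand.grid(A = c(-1, 1), B = c(-1, 1), C = c(-1, 1))

print(design)
print(nrow(design))  # one row per treatment combination
```

Each argument can hold any number of levels, so mixed-level full factorials work the same way.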
Try AlgDesign, BHH2 and conf.design for various fractional and D-optimal design
generators; they all work.
I haven't tried the experiment package yet, though the synopsis looks
interesting.
The 'Design' package itself I found disappointing for
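?plot.gam or ?preplot.gam says there is an ask= option, slightly bizarrely set
to FALSE by default.
If ask=TRUE, you'll get a menu of options for plotting. In Windows, hitting Esc
will get you out of the menu.
Example:
library(gam); data(kyphosis)  # assumes the 'gam' package, which supplies s() and the kyphosis data
fit <- gam(Kyphosis ~ s(Age,4) + Number, family = binomial, data=kyphosis,
           trace=TRUE)
plot(fit, ask=TRUE)  # brings up the plotting menu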
Petr PIKAL [EMAIL PROTECTED] 13/08/2007 14:10:40
How do I read one row of data so as to load site2 into a variable
called site2?
See ?read.table and the other read.* functions.
Don't forget scan(). But Petr is right, you probably do not need to read one
row at a time.
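A rough sketch of both routes, assuming a small whitespace-separated table with a header row. The file contents and the column names site1 to site3 are hypothetical, and the data are inlined through the text= argument so the example is self-contained.

```r
# Hypothetical file contents (inlined so no file is needed)
txt <- "site1 site2 site3
1.2 3.4 5.6
2.1 4.3 6.5"

# Read the whole table once, then pull out the piece you want by name
dat <- read.table(text = txt, header = TRUE)
site2 <- dat$site2
print(site2)

# If a single row really is needed, scan() can skip the header and stop after one line
first_row <- scan(text = txt, skip = 1, nlines = 1, quiet = TRUE)
print(first_row)
```

With a real file, the text= argument would be replaced by the file name.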
Once I plot a graph
Can anyone explain why AlgDesign's expand.formula help and output differ?
#From help:
# quad(A,B,C) makes ~(A+B+C)^2+I(A^2)+I(B^2)+I(C^2)
expand.formula(~quad(A+B+C))
#actually gives ~(A + B + C)^2 + I(A + B + C^2)
They don't _look_ the same...
Steve E
Does ?Rscript help?
Christopher Green [EMAIL PROTECTED] 06/08/2007 23:13:47
On Fri, 3 Aug 2007, Duncan Murdoch wrote:
On 03/08/2007 7:16 PM, Christopher Green wrote:
Is there an easy way to open an R script file in the R Editor found in
more recent versions of R on Windows via a file
Stripping either or both ends can be achieved by the somewhat tortuous but
fairly general
stripspace <- function(x) sub("^\\s*([^ ]*.*[^ ])\\s*$", "\\1", x)
which uses a replacement buffer (the \\1 backreference) to keep the useful part
of the string.
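A short usage sketch of the function above, with made-up test strings. For comparison it also calls trimws(), which was added to base R in version 3.2.0 and so was not available when this message was written.

```r
# The function from the message, with the assignment arrow and quotes in place
stripspace <- function(x) sub("^\\s*([^ ]*.*[^ ])\\s*$", "\\1", x)

# Hypothetical inputs: padded on both ends, right only, left only, not at all
x <- c("  hello world  ", "left   ", "   right", "none")

print(stripspace(x))
print(trimws(x))  # base R equivalent in R >= 3.2.0
```

Internal spaces are kept; only the padding at the ends is removed.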
Prof Brian Ripley [EMAIL PROTECTED] 06/08/2007 21:23:49
I am sure Marc
Dimitri
If you try
order(c("b","a","c"))
[1] 2 1 3
or
sort(c("b","a","c"))
[1] "a" "b" "c"
you will see that sort() and order() DO respect character order.
Your problem could be that your data frame variable is not a character but a
factor (the default for read.table before R 4.0.0, for example).
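A minimal sketch of how a factor can sort differently from the same values held as character. The level order below is a hypothetical one chosen to make the difference visible; by default factor levels are created in sorted order.

```r
chr <- c("b", "a", "c")

# Hypothetical factor whose levels are deliberately not in alphabetical order
fac <- factor(chr, levels = c("c", "b", "a"))

print(class(chr))  # "character"
print(class(fac))  # "factor"

print(sort(chr))                 # sorted as character strings
print(as.character(sort(fac)))   # sorted by level order instead
print(sort(as.character(fac)))   # convert first to get character order back
```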
Check the class of the
The aliasing problem arises in alias(), which is what Anova uses to detect
aliasing. It is simply the fact that anova more or less blithely ignores the
NA's that makes anova behave apparently more 'sensibly' than Anova.
But like Carsten, I found this difficult to understand. Unordered factors
#Given
Cat=c('a','a','a','b','b','b','a','a','b') # Categorical variable
#and defining
coding <- array(c(-1,1), dimnames=list(unique(Cat)))
#(ie an array of values corresponding to your character array levels, and with
# names set to those levels)
coding[Cat]
#does what you want.
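The same lookup-by-name idea can be sketched with a plain named vector, which makes the level-to-value mapping explicit. The mapping a = -1, b = 1 is the one implied by the example above; unname() is only there to drop the names from the result.

```r
Cat <- c('a', 'a', 'a', 'b', 'b', 'b', 'a', 'a', 'b')  # categorical variable

# Named vector: names are the levels, values are the numeric codes
coding <- c(a = -1, b = 1)

# Indexing by the character vector looks up each element's code by name
coded <- unname(coding[Cat])
print(coded)
```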
Keith
Yemi,
Try
which.max(apply(X,1,prod))
(or possibly which.max(abs(apply(X,1,prod))) if you're only interested in the
largest unsigned product).
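A small worked sketch, using a made-up 3 x 3 matrix, of how the signed and unsigned versions can pick different rows:

```r
# Hypothetical matrix; the row products are 6, -120 and 8
X <- matrix(c( 1, 2, 3,
              -4, 5, 6,
               2, 2, 2), nrow = 3, byrow = TRUE)

row_prod <- apply(X, 1, prod)
print(row_prod)

signed_max   <- which.max(row_prod)       # row with the largest signed product
unsigned_max <- which.max(abs(row_prod))  # row with the largest absolute product
print(signed_max)
print(unsigned_max)
```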
S
Yemi Oyeyemi [EMAIL PROTECTED] 04/07/2007 17:18:41
I am new in R and I want to solve this problem;
I have a matrix X (with n rows and p columns); my problem is to
EOF from a keyboard on Windows is often Ctrl+Z.
But you're right; it's platform dependent. Ctrl+Z in Unix has less desirable
effects on readLines(). And on your running R process...
Drat.
Peter Dalgaard [EMAIL PROTECTED] 29/06/2007 12:42:41
S Ellison wrote:
Wouldn't it be nice if the Help
Boxplot and bxp seem to have changed behaviour a bit of late (R 2.4.1). Or
maybe I am mis-remembering.
An annoying feature is that while at=3:6 will work, there is no way of
overriding the default xlim of 0.5 to n+0.5. That prevents plotting boxes on,
for example, interval scales - a useful
Boxplot positions and labels are not the same thing.
You have groups 'called' 2, 3, 4. As factors - which is what boxplot will
turn them into - they will be treated as arbitrary labels and _numbered_ 1:3
(try as.numeric(factor(x))).
So your lm() used 2:4, but your plot (and abline) uses 1:3.
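A minimal sketch of the renumbering, assuming a grouping variable x that takes the values 2, 3 and 4 (the data are made up):

```r
x <- c(2, 2, 3, 3, 4, 4)   # hypothetical group values
f <- factor(x)

codes  <- as.numeric(f)                # internal codes: positions of the levels
values <- as.numeric(as.character(f))  # the original numeric values

print(levels(f))
print(codes)
print(values)
```

The boxes are drawn at the positions in codes, while a model fitted to the numeric x works on the scale in values, so a line from one will not sit correctly on a plot of the other without shifting it.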
Your B coefficient differs by a suspicious-looking factor of 2.30... (ln(10)).
Does SPSS log() mean log10 or ln? R's log(x) is the natural log, ln(x).
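A quick check of the factor in base R (the value of x is arbitrary):

```r
x <- 250  # arbitrary positive value

print(log(10))           # natural log of 10, about 2.3026
print(log(x))            # natural log
print(log10(x))          # base-10 log
print(log(x, base = 10)) # same thing via the base argument

# The two scales differ by exactly the factor ln(10)
ratio <- log(x) / log10(x)
print(ratio)
```

So a coefficient estimated against log10(x) will be ln(10) times the one estimated against ln(x).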
S
Eduardo Esteves [EMAIL PROTECTED] 19/06/2007 17:19:35
Dear All,
I'd like to fit a kind of logistic model to small data-set using nonlinear
least-squares
Judith,
Haven't tried it in anger myself, but two things suggest themselves. The first
is to use the lattice package, which seems to draw keys (auto.key option)
outside the plot region by default. Look at the last couple of examples in
?xyplot. May save a lot of hassle...
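A minimal sketch of the lattice route, using the built-in iris data as a stand-in for the poster's data (the choice of variables is purely illustrative):

```r
library(lattice)

# auto.key = TRUE asks lattice to build a legend from the groups= variable;
# by default it is placed above the panel, outside the plotting region
p <- xyplot(Sepal.Length ~ Petal.Length, data = iris,
            groups = Species, auto.key = TRUE)
print(p)
```

auto.key also accepts a list, for example list(space = "right"), to move the key to another side.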
In classical R
Anders;
If you want to _test_ for differences, ANOVA applied to the (typically)
first principal component scores for each object would give a fairly quick
indication of whether there was a case to answer (though scaling is an issue to
be aware of; a low-variance variable might differ
Richard;
Windows file open behaviour is dictated by the complete set of file
associations in the Windows registry. You can inspect them in Explorer via
Tools|Folder Options|File Types, by finding the file type and looking at the
advanced options.
I would suspect that installing Acrobat and
Johan,
Tests return objects of class htest; see ?t.test for a description.
binom.test(59,100)$statistic confirms that Ted Harding is right about the test
statistic; it's just the number of successes.
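A short sketch of inspecting the returned object, using the same call as above:

```r
bt <- binom.test(59, 100)

print(class(bt))     # "htest"
print(names(bt))     # components available via $
print(bt$statistic)  # the number of successes
print(bt$p.value)
```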
Steve Ellison
Johan A. Stenberg [EMAIL PROTECTED] 14/05/2007 11:07:53
When I perform a
Isn't this rather closely related to the (more or less classic) bin packing and
knapsack problems? The 'hardness' of the problem is combinatoric (NP) and that
is statistical, but the answers aren't particularly statistical. Both have been
studied quite a lot for pragmatic reasons (packing
Adaikalavan Ramasamy [EMAIL PROTECTED] 09/05/2007 01:37:31
..the variance of means of each row in table above is ZERO because
the individual elements that comprise each row are identical.
... Then is it valid then to use lm( y ~ x, weights=freq ) ?
ermmm... probably not, because if that
Paul,
You have picked a function that is not smoothly differentiable and also started
at one of many 'stationary' points in a system with multiple solutions. In
practice, I think it'll get a zero gradient as the algorithm does things
numerically and you have a symmetric function. It probably
Doubling the length of the data doubles the apparent number of observations.
You would expect the standard error to reduce by sqrt(2) (which it just about
does, though I'm not clear on why it's not exact here).
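One likely reason for the small discrepancy, offered as a sketch and not as a diagnosis of the original data: duplicating the data doubles both the residual sum of squares and the sum of squares of x, but the residual degrees of freedom go from n - 2 to 2n - 2. For a straight-line fit the standard error of the slope therefore shrinks by sqrt((2n - 2)/(n - 2)), which only approaches sqrt(2) as n grows. The simulated data below are made up.

```r
set.seed(1)
n <- 10
d1 <- data.frame(x = 1:n)
d1$y <- 2 + 0.5 * d1$x + rnorm(n)  # hypothetical straight-line data
d2 <- rbind(d1, d1)                # the same data, duplicated

se1 <- coef(summary(lm(y ~ x, data = d1)))["x", "Std. Error"]
se2 <- coef(summary(lm(y ~ x, data = d2)))["x", "Std. Error"]

ratio    <- se1 / se2
expected <- sqrt((2 * n - 2) / (n - 2))

print(ratio)     # observed reduction in the slope's standard error
print(expected)  # degrees-of-freedom argument
print(sqrt(2))   # the naive expectation
```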
Weights are not as simple as they look. You have given all your data the same
weight,
Hadley,
You asked
.. what is the usual way to do a linear
regression when you have aggregated data?
Least squares generally uses inverse variance weighting. For aggregated data
fitted as mean values, you just need the variances for the _means_.
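A minimal sketch of that weighting, assuming each aggregated row carries a group mean, a within-group standard deviation and a group size. The data frame agg and its column names are hypothetical. The variance of each mean is s^2/n, so the inverse-variance weight is n/s^2.

```r
# Hypothetical aggregated data: one row per group
agg <- data.frame(
  x    = c(1, 2, 3, 4),
  ybar = c(2.1, 3.9, 6.2, 7.8),   # group means
  s    = c(0.5, 0.6, 0.4, 0.7),   # within-group standard deviations
  n    = c(5, 8, 6, 10)           # group sizes
)

# Inverse of the variance of each mean: 1 / (s^2 / n)
agg$w <- agg$n / agg$s^2

fit <- lm(ybar ~ x, data = agg, weights = w)
print(coef(fit))
```

Note that lm() still estimates a residual scale from the fit, so the reported standard errors are not fixed by the supplied variances alone.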
So if you have individual means x_i and sd's
I was wondering if there are any R functions that give the tail area
of a sum of chisquare distributions of the type:
a_1 X_1 + a_2 X_2
where a_1 and a_2 are constants and X_1 and X_2 are independent
chi-square variables with different degrees of freedom.
You might also check out
[EMAIL PROTECTED] 15/03/2007 13:26:52
On 15-Mar-07 12:09:42, Eli Gurarie wrote:
Hello all,
I am fishing for some suggestions on efficient ways to make qdist and
pdist type functions from an arbitrary distribution whose probability
density function I've defined myself.
Ted Harding
Small data sets (6-12 values, or a similarly small number of groups) which
don't look nice and symmetric are quite common in my field (analytical
chemistry and biological variants thereof), and often contain outliers or at
least stragglers that I cannot simply discard. One of the things I