date:20100929

Re: [R] ramdom sampling from a dataset

2010-09-29 Thread Dennis Murphy

Try this:

x - c(2,5,9,4,5,6,7,8)
u - replicate(20, sample(x, replace = TRUE))
t(u)

The first argument of replicate() is the number of times to iterate the
process.
I believe you'll find that it does indeed do random sampling; each row
represents
a nonparametric bootstrap sample of x. There are other ways of doing this;
one is

v - matrix(sample(x, 160, replace = TRUE), ncol = 8, byrow = TRUE)

HTH,
Dennis

On Tue, Sep 28, 2010 at 2:03 PM, Michael Larkin mlar...@rsmas.miami.eduwrote:

 I am trying to get R to pick random integers from my dataset (i.e.
 bootstrapping) with replacement.  However, my attempts at this have been
 unsuccessful.  Here is a basic example of what I am doing:

 I have a data vector of 8 integers (data= 2,5,9,4,5,6,7,8).  I used the
 sample function and it worked but it only repeated my values in the exact
 same order.  It did not randomly sample them.  Here is my code:

 sample(data, replace=TRUE)

 Any advice would be greatly appreciated.

 Mike






[[alternative HTML version deleted]]

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.


[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] boxplot

2010-09-29 Thread Joshua Wiley

Dear Felipe,

Assuming you are not interested in the exact formulae that are used to
calculate each of these, from top to bottom it is:

minimum, lower quartile, median, upper quartile, maximum

If there are dots, these usually indicate outliers.

Please read ?boxplot  for more details on what R will do when.  If you
are interested in the specifics of the computations, read
?boxplot.stats

Hope that helps,

Josh

See inline comments to match these things to your terms.

On Tue, Sep 28, 2010 at 10:06 PM, Luis Felipe Parra
felipe.pa...@quantil.com.co wrote:

 Hello, does somebody know in a boxplot, what does each element in the
 boxplot represent?

 1. lines at the extremes of the dotted lines?

min and max (unless there are outliers AND you have not set range = 0)

 2. Extremes of the boxes

lower  upper quartile

 3. Black line in the middle of the box?

median

 4. notches?

 Thank you

 Felipe Parra

        [[alternative HTML version deleted]]

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.



--
Joshua Wiley
Ph.D. Student, Health Psychology
University of California, Los Angeles
http://www.joshuawiley.com/

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] boxplot

2010-09-29 Thread Dennis Murphy

http://www.netmba.com/statistics/plot/box/

...found via Google search.

HTH,
Dennis

On Tue, Sep 28, 2010 at 10:06 PM, Luis Felipe Parra 
felipe.pa...@quantil.com.co wrote:

 Hello, does somebody know in a boxplot, what does each element in the
 boxplot represent?

 1. lines at the extremes of the dotted lines?
 2. Extremes of the boxes
 3. Black line in the middle of the box?
 4. notches?

 Thank you

 Felipe Parra

[[alternative HTML version deleted]]

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.


[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] A problem with plotting a long expression in ylab ?

2010-09-29 Thread Tal Galili

My honor.

A short question: if there is something in the device that is sensitive to
the overlapping of the text, then is it possible to add a warning massage
output when the length of the text is longer then the device dimensions?

With much respect,
Tal

Contact
Details:---
Contact me: tal.gal...@gmail.com |  972-52-7275845
Read me: www.talgalili.com (Hebrew) | www.biostatistics.co.il (Hebrew) |
www.r-statistics.com (English)
--




On Wed, Sep 29, 2010 at 1:24 AM, Paul Murrell p.murr...@auckland.ac.nzwrote:

 Hi

 It is a bug.  A fix has been committed.

 Thanks for the report!

 Paul


 On 29/09/2010 10:15 a.m., Ben Bolker wrote:

 Barry Rowlingsonb.rowlingsonat  lancaster.ac.uk  writes:

  My point is that in regular text, ylab plots it where it then goes
 outside
 the borders.
 With the use of expressions - the text just doesn't show up.
 Originally I thought it was because of my miss-use of expressions, until
 I
 figured it was the level of cex.lab I was using.
 The problem is that when you can't see the text, you don't have a sense
 of
 how much to decrease the cex.lab so the text will fit.
 I hope I was now clearer.


 Gotcha. Seems to only affect ylab though. Do this:

t =  expression(paste(test
 loo(% of 360 *degree,
 )))
plot(1,xlab=t,ylab=t,main=t)

 then if I shrink my graphics window I can make the ylab disappear but
 not the xlab or title.

  Seems to affect any rotated expressions:

  plot(1)
 text(1,1,t,srt=90)
 text(1,1,t,srt=0)
 text(1,1,t,srt=45)


  Now shrink window and watch the rotated expressions vanish! They
 disappear when they start (or finish) out of the entire graphics
 device, not the plot region...

  I cant find anything relating to clipping in the help, and I am on
 Linux, so see if there's any news about it, try it with R-patched or
 R-devel and then report a bug after having read all the other stuff
 about R bug reporting!

 Barry



I don't claim to understand it, but there is something quite
 fundamental about the properties of the X11() graphics device in R
 that makes labels that would otherwise overlap, disappear -- if
 you do 'extreme resizing' with the graphics above, you can see that
 otherwise-overlapping x- and y-axis tick labels disappear as the
 graph gets scrunched.  This is (apparently) true of X11 graphics
 on MacOS as well -- Quartz window has a different behavior.
 Trying with pdf() as well -- for height=2, width=2, only 1 y-axis
 and 2 x-axis tick labels survive, *but* the x and y labels and the
 title are all still present (but clipped, of course).
[Hmmm. Take my reports above with a grain of salt, I wasn't
 always using expression()s.]

   So I would guess that if you reported this as a bug you would
 be told that it was a poorly documented property of R's X11
 graphics model, rather than a bug ...

   I have no idea where to start looking for more information
 about what defines this behavior -- if I were desperate to know
 I would probably try asking Paul Murrell ...

   I would be very interested to see this discussed on r-devel,
 if anyone bit ...

Ben Bolker

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.


 --
 Dr Paul Murrell
 Department of Statistics
 The University of Auckland
 Private Bag 92019
 Auckland
 New Zealand
 64 9 3737599 x85392
 p...@stat.auckland.ac.nz
 http://www.stat.auckland.ac.nz/~paul/


 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.


[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] get response from a glm

2010-09-29 Thread Michael Bedward

Hello,

The predict.glm is your friend.

Type ?predict.glm to see the help page.

Michael


On 29 September 2010 12:45, xinxin xx xxgr...@hotmail.com wrote:

 Hi everyone:

  I am new to R and I have a really basic question.
     I have already got a generalized linear model from some dataset, say y=b0 
 + b1X1 + b2X2.  Then I want to get the value of y provided, say, X1=1, X2=2.  
 And the confidence Intervals of this y.
     I know I can just calculate that since I know the model already. But is 
 there some code that can calculate those automatically?


 Thank you very much!

        [[alternative HTML version deleted]]


 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.



__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] String split and concatenation

2010-09-29 Thread Steven Kang

x - rep(letters[1:3], 2)

Are there any ways to transform  assign the above as the one shown below
to an object? (in exact format; i.e length of 1  class of character),
i.e
x
('a', 'b', 'c', 'a', 'b', 'c')

Highly appreciate for any advice.



On Wed, Sep 29, 2010 at 3:33 PM, bill.venab...@csiro.au wrote:

 dump(x, file = x.R)
 file.show(x.R)

 will get you most of the way.

 -Original Message-
 From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org]
 On Behalf Of Steven Kang
 Sent: Wednesday, 29 September 2010 3:11 PM
 To: r-help@r-project.org
 Subject: [R] String split and concatenation

 Hi R users,


 I desire to transform the following vector consisting of repeated
 characters

 x - rep(letters, 3)
 into this exact format (i.e a single string containing each characters in
 quotation mark separated by comma between each; al ).

 (a, b, c, d, a, b, c, d, ..., a, b,
 c,
 d, .z)

 Any advice would be much appreciated.



 --
 Steven

 [[alternative HTML version deleted]]

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.htmlhttp://www.r-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.




-- 
Steven

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] name ONLY one column // empty.dimnames() in 'sfsmisc'

2010-09-29 Thread Martin Maechler

 IC == Ivan Calandra ivan.calan...@uni-hamburg.de
 on Mon, 27 Sep 2010 16:59:31 +0200 writes:

IC Hi,
IC I'm not sure it's even possible (and if it is I don't know how, but I'm 
IC no expert).

yes it is possible (in some way), see below.

IC But I think it doesn't make much sense to have only one named column. 
IC Just give it a vector:
IC vect_names - c(myname1, myname2, myname3)
IC colnames(my_matrix) - vect_names

yes, exactly,  ... but read on

IC HTH,
IC Ivan

IC Le 9/27/2010 10:26, Lorenzo Cattarino a écrit :
 Hi R-users
 
 I can not change the name of one column only of my matrix.
 
 my_matrix - matrix (1:12,ncol=3)

 colnames(my_matrix)[1] - 'myname'
 
 Error in dimnames(x)- dn :
 length of 'dimnames' [2] not equal to array extent

Just read ... and maybe re-read this error message. It is
exactly on target:

The colnames  are the dimnames(.)[2] and they have the wrong length.
{ I'm too often disappointed that many R users read only the
  first, or actually the zero-th word of the 'very fine' error
  messages that we provide you with (most of the time):
  Such useRs only read up to Error  .. and are put off, saying
  It does not work or  Why can't I ...  or  
}

Now back to the original question:
What you want is to give empty column names to all but the
first column.

  mat - matrix(1:12, 3)
  colnames(mat) - c(myname, , )

And a note: giving empty column- and row-names is sometimes
desirable if just for printing.
For that reason, our  'sfsmisc' CRAN package has contained the function
empty.dimnames() for many years now (actually longer than R
exists; we had it as an S+ function in our goodies
collection). Excerpt from its help page:

 empty.dimnames package:sfsmisc R Documentation

 Empty Dimnames of an Array.

 Description:
  `Remove' all dimension names from an array for compact printing.

 Usage:
  empty.dimnames(a)
 
 Arguments:
a: an ‘array’, especially a matrix.

 Value:
  Returns ‘a’ with its dimnames replaced by empty character strings.

 Author(s):
  Bill Venables / Martin Maechler, Sept 1993.


and here the examples we provide:

 example(empty.dimnames)

empty. empty.dimnames(diag(5)) # looks much nicer
  
 1 0 0 0 0
 0 1 0 0 0
 0 0 1 0 0
 0 0 0 1 0
 0 0 0 0 1

empty. (a - matrix(-9:10, 4,5))
 [,1] [,2] [,3] [,4] [,5]
[1,]   -9   -5   -137
[2,]   -8   -4048
[3,]   -7   -3159
[4,]   -6   -226   10

empty. empty.dimnames(a) # nicer, right?
  
 -9 -5 -1 3  7
 -8 -4  0 4  8
 -7 -3  1 5  9
 -6 -2  2 6 10


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] Validation / Training - test data

2010-09-29 Thread Sam

Dear List,

I have developed two models i want to use to predict a response, one with a 
binary response and one with a ordinal response.

My original plan was to divide the data into test (300 entries) and training 
(1000 entries) and check the power of the model by looking at the % correct 
predictions. However i have been told my a colleague  that 1300 entries is far 
too little to partition the data set and i should use the whole data set, and 
determine the power of the model with scores such as c-value and Brier score 
and use bootstrapping.

I understand how to bootstrap in R however i have never used it on predicted 
values.

My questions are -

1. Using the boot() command how do i use this to test the power of my 
predictive model?
2. Is it possible to bootstrap brier score or is this not necessary?
3. ( This is a separate point i am struggling with, i thought i would include 
it here instead of posting again!) I have selected the most likely model with 
AIC criteria from a set of candidate GLMM models, however as GLMM has no 
predict function i have used the best model and excluded the random effects and 
ran it as a glm and used the predict function from here - is this OK?

Thanks

Sam

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] drop.terms and [.terms ignores intercept

2010-09-29 Thread Niels Richard Hansen


Dear R-help list

The functions drop.terms and [.terms ignores if the intercept
has been explicitly removed. Is that a deliberate feature?

For instance,

drop.terms(terms(~a+b-1),1)
terms(~a+b-1)[2]

R version 2.12.0 Under development (unstable) (2010-09-13 r52905)

Best, Niels

--
Niels Richard Hansen Web:   www.math.ku.dk/~richard 
Associate Professor  Email: niels.r.han...@math.ku.dk
Department of Mathematical Sciences nielsrichardhan...@gmail.com
University of Copenhagen Skype: nielsrichardhansen.dk   
Universitetsparken 5 Phone: +45 353 20783 (office)  
2100 Copenhagen Ø   +45 2859 0765 (mobile)
Denmark

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] drop.terms and [.terms ignores intercept

2010-09-29 Thread peter dalgaard


On Sep 29, 2010, at 10:29 , Niels Richard Hansen wrote:

 The functions drop.terms and [.terms ignores if the intercept
 has been explicitly removed. Is that a deliberate feature?

Perhaps rather an unimplemented one. The root cause is that both functions use 
reformulate() on the term.labels attribute, and there is no way to specify 
that you want to reformulate into a no-intercept formula. On the other hand, 
the modeling code will happily proceed with a no-intercept model even if there 
is no -1 in formula part of a terms object, e.g. 

 x - terms(y~a+b)
 attr(x,intercept) - 0
 lm(x)

Call:
lm(formula = x)

Coefficients:
 a   b  
0.2263  0.4178  

 formula(x)
y ~ a + b

so I suppose that there is no really good excuse not to carry the intercept 
attribute over. As usual, with code as old as this, there is always the risk 
that something actually relies on current behavior. 


-- 
Peter Dalgaard
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Email: pd@cbs.dk  Priv: pda...@gmail.com

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] constrained optimization -which package?

2010-09-29 Thread Hans W. Borchers

Leonardo Monasterio leonardo.monasterio at gmail.com writes:

 
 Dear R users,
 
 In the function bellow I  want to find the maximum value of v,
 subject to the constrain that the sum of x is equal to 1.
  I want to maximize:
 v-t(x)%*%distance%*%x
 
 Subject to:
 sum(x)=1


I do not see why you would take the trouble to use constraint optimization 
here. Use x_n as a dependent variable, x_n - 1 - sum(x), x an (n-1)-dim. 
vector, define another function f1(x) - f(c(x, 1-sum(x))) appropriately, and 
apply optim() without constraints.

Hans Werner


 Where:
 x is a vector n X 1
 distance is a matrix n*n and it is given.
 (In practice, the number of n can go up to hundreds.)
 
 I have a bit of experience using R but not much on optimization
 techniques or on R optimization packages.  I have taken a look at
 optim and nlminb, but I am quite confused.
  Do you have any suggestion on how to do it?
 Thank you very much,
 Leo  Monasterio.
 


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Use R in Visual Basic Environment

2010-09-29 Thread Christophe Bouffioux

Hi Soumen,

it depends what you mean by embedded
what i have done once is to run a R script in batch mode from an access
application,
the R code import an access table, with the RODBC package, produce some
table and graph
which are compiled on the fly in html output, using HWRITER or R2HTML
package
and i can open this html output from an access form, or it open
automatically


kind regards
Christophe




On Tue, Sep 28, 2010 at 12:39 PM, Soumen Pal soumen.4...@gmail.com wrote:

 I need your kind help regarding the following:

 I wish to know is there any way to use R in Visual Basic environment. I
 want
 to develop a VB application where R can be embedded (R will work as a back
 end statistical engine). If available, please provide me some source of
 study materials/articles available on the internet related to this.

 --
 Thanks  Regards,

 Soumen Pal

[[alternative HTML version deleted]]

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.htmlhttp://www.r-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.


[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] drop.terms and [.terms ignores intercept

2010-09-29 Thread Niels Richard Hansen



On 29/09/10 11.13, peter dalgaard wrote:


On Sep 29, 2010, at 10:29 , Niels Richard Hansen wrote:


The functions drop.terms and [.terms ignores if the intercept has been
explicitly removed. Is that a deliberate feature?


Perhaps rather an unimplemented one. The root cause is that both functions
use reformulate() on the term.labels attribute, and there is no way to
specify that you want to reformulate into a no-intercept formula. On the
other hand, the modeling code will happily proceed with a no-intercept model
even if there is no -1 in formula part of a terms object, e.g.


x- terms(y~a+b) attr(x,intercept)- 0 lm(x)


Call: lm(formula = x)

Coefficients: a   b 0.2263  0.4178


formula(x)

y ~ a + b

so I suppose that there is no really good excuse not to carry the intercept
attribute over. As usual, with code as old as this, there is always the risk
that something actually relies on current behavior.


Thanks Peter. I expected something like that.

As I see it, the intercept (or more often, removal of the intercept)
is treated differently from other terms, and information on the presence
of an intercept is stored in an attribute of the terms object rather
than the formula. This attribute is not copied when using drop.terms
or [.terms. I believe it should be, but as you say Peter, some may rely
on the way the functions behave now.

So basically, you should use code like

x - terms(y~a+b-1)
z - x[2]
attr(z, intercept) -  attr(x, intercept)

to make sure that the intercept information is preserved.

- Niels






--
Niels Richard Hansen Web:   www.math.ku.dk/~richard 
Associate Professor  Email: niels.r.han...@math.ku.dk
Department of Mathematical Sciences nielsrichardhan...@gmail.com
University of Copenhagen Skype: nielsrichardhansen.dk   
Universitetsparken 5 Phone: +45 353 20783 (office)  
2100 Copenhagen Ø   +45 2859 0765 (mobile)
Denmark

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] how to rotate the x axis lable to an interested angle?

2010-09-29 Thread komais


  
http://r.789695.n4.nabble.com/file/n2718591/QQ%E6%88%AA%E5%9B%BE%E6%9C%AA%E5%91%BD%E5%90%8D--1.png
 

I am  a beginner for R,and i could not solve this problem.who can help me to
solve it ?
how to rotate the x axis lable to an interested angle as the picture show? 
-- 
View this message in context: 
http://r.789695.n4.nabble.com/how-to-rotate-the-x-axis-lable-to-an-interested-angle-tp2718591p2718591.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] 3D Graphics

2010-09-29 Thread JoH


Dear All,

I have made a scatter plot and placed a plane within it using scatterplot3d.
However, I have been asked for the data points to be a surface plot or have
the plane more closely resemble the data rather than show trends.

I have since tried to use the rgl package. Why doesn't this package use the
window which already contains graphics?

I have one axis with decreasing values and two others with increasing
values, hence I get an error message when I've tried using pers3d.

Are there anyother ways in which I can create a surface plot?

Thank you,

Jo 

-- 
View this message in context: 
http://r.789695.n4.nabble.com/3D-Graphics-tp2718658p2718658.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Splitting data in to multiple boxplots

2010-09-29 Thread deadlyspider


Ok, I don't think I was specific enough.

The data originally came in this form


1  a  12
2  b  4
3  a  3
4  c  54
5  a  12
6  b  11
7  c  9
8  c  2
.  .  .
.  .  .
.  .  .



Where I sorted by the second column (NB the second column is the categories
and they have long names). I would then like separate boxplots for each
category.

The loop idea would be nice but unfortunately I do not fully understand the
answer given.

Thanks.
-- 
View this message in context: 
http://r.789695.n4.nabble.com/Splitting-data-in-to-multiple-boxplots-tp2717491p2718659.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] 95% confidence intercal with glm

2010-09-29 Thread Sam

I am looking to do the same but am a bit confused 

 and apply the inverse link function for your model.

i understand up to this point and i understand what this means, however i am 
unsure why it needs to be done and how you do it - i.e i use family=binomial 
is this wrong if i use this method to calculate 95% CI?

Thanks

Sam
On 28 Sep 2010, at 14:50, Ben Bolker wrote:

zozio32 remy.pascal at gmail.com writes:

 
 
 Hi
 
 I had to use a glm instead of my basic lm on some data due to unconstant
 variance.
 
 now, when I plot the model over the data, how can I easily get the 95%
 confidence interval that sormally coming from:
 
 yv - predict(modelVar,list(aveLength=xv),int=c)
 matlines(xv,yv,lty=c(1,2,2))
 
 There is no interval argument to pass to the predict function when using a
 glm, so I was wondering if I had to use an other function
 

 You need to use predict with se=TRUE; construct the confidence
intervals by computing predicted values +- 1.96 times the standard
error returned; and apply the inverse link function for your model.

 If heteroscedasticity is your main problem, and not a specific
(known) non-normal distribution, you might consider using the gls
function from the nlme package with an appropriate 'weights' argument.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Regular expressions: offsets of groups

2010-09-29 Thread Titus von der Malsburg

Bill, Michael,

good to see I'm not the only one who sees potential for improvements
in the regexpr domain.  Adding a subpattern argument is certainly a
step in the right direction and would make my life much easier.
However, in my application I need to know not only the position of one
group but also the position of the overall match in the original
string.  The ideal solution would provide positions and match lengths
for the whole pattern and for all groups if desired.  Only this would
solve all related issues.  One possibility is to have a subpattern
argument that accepts a vector of numbers (0 refers to the whole
pattern):

   gregexpr(a+(b+), abcdaabbc, subpattern=c(0,1))
 [[1]]:
 [[1]][[1]]:
 [1] 1 5
 attr(, match.length):
 [1] 2 4
 [[1]][[2]]:
 [1] 2 7
 attr(, match.length):
 [1] 1 2

A weakness of this solution is that the structure of the return values
changes if length(subpattern)1.  An alternative is to have a separate
function, say ggregepxr for group gregexpr, that returns a list of
lists as in the above example.  This function would always return
positions and match lengths of the whole pattern (group 0) and all
groups.  The original gregexpr could still have the subpattern
argument but it would only accept single numbers.  This way the return
format of gregexpr remains the same.

Best,

  Titus


On Wed, Sep 29, 2010 at 2:42 AM, Michael Bedward
michael.bedw...@gmail.com wrote:
 Ah, that's interesting - thanks Bill. That's certainly on the right
 track for me (Titus, you too ?) especially if the subpattern argument
 accepted a vector of multiple group indices.

 As you say, this is straightforward in C. I'd be happy to (try to)
 make a patch for the R sources if there was some consensus on the best
 way to implement it, ie. as a new R function or by extending existing
 function(s).

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] svg plot and dashed lines

2010-09-29 Thread Ivan Calandra

  Dear users,

When I boxplot(), the lines of the whiskers are dashed. However, when I 
save in an svg file, the dashed lines of the whiskers are not dashed 
anymore.
How can I have the dashed lines in the svg file?
I don't have this problem with a ps file, but I cannot edit such file as 
easily as an svg file. That's why I'd like to stick to the svg format.

Thanks in advance,
Ivan


df - structure(list(a = structure(c(1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 
2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L), .Label = c(A, B), class 
= factor), b = c(0.904439748839731, -0.855322875817714, 
-0.957288625102814, 0.130401502975395, -1.27765131101282, 
-2.08861064654457, 1.10234256081394, -2.05533035069656, 
-1.04529859053820, -0.0847903566670016, 1.02553030160793, 
0.321170740199536, 1.87419854190502, -0.891404432182873, 
0.968745913802415, -0.85229752730528, 0.641555656821046, 
1.72455661053506, -0.523097596614304, 1.26729031187194)), .Names = 
c(a, b), row.names = c(NA, -20L), class = data.frame)

library(RSvgDevice)
devSVG(file=test.svg)
boxplot(b~a, data=df)
dev.off()


-- 
Ivan CALANDRA
PhD Student
University of Hamburg
Biozentrum Grindel und Zoologisches Museum
Abt. Säugetiere
Martin-Luther-King-Platz 3
D-20146 Hamburg, GERMANY


[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Splitting data in to multiple boxplots

2010-09-29 Thread Martyn Byng

Hi,

Something like

bb =
data.frame(label=c(a,b,a,b,c,a,b,c),val=c(4,2,1,6,4,3,2,
1))
l = split(bb,bb$label)
par(mfrow=c(2,2))
lapply(l,function(a) {boxplot(a$val)})

might be what you are looking for

Martyn

-Original Message-
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org]
On Behalf Of deadlyspider
Sent: 29 September 2010 11:02
To: r-help@r-project.org
Subject: Re: [R] Splitting data in to multiple boxplots


Ok, I don't think I was specific enough.

The data originally came in this form


1  a  12
2  b  4
3  a  3
4  c  54
5  a  12
6  b  11
7  c  9
8  c  2
.  .  .
.  .  .
.  .  .



Where I sorted by the second column (NB the second column is the
categories
and they have long names). I would then like separate boxplots for each
category.

The loop idea would be nice but unfortunately I do not fully understand
the
answer given.

Thanks.
-- 
View this message in context:
http://r.789695.n4.nabble.com/Splitting-data-in-to-multiple-boxplots-tp2
717491p2718659.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


This e-mail has been scanned for all viruses by Star.\ _...{{dropped:12}}

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] Problems extracting p-value from summary(aov(...))

2010-09-29 Thread syrvn


Hi,

I did a aov and used summary to obtain the p-value. I tried many ways to
extract the p-value from
the summary result but failed. Among others I tried the following:



 test.summary -
 summary(aov(data[,1]~time.points+Error(subject/time.points)))
 test.summary

Error: subject
  Df  Sum Sq  Mean Sq F value Pr(F)
Residuals  9 0.27467 0.030518   

Error: subject:time.points
Df   Sum Sq   Mean Sq F value  Pr(F)  
time.points  2 0.018563 0.0092814  3.1777 0.06578 .
Residuals   18 0.052574 0.0029208  
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1 
 
 as.matrix(test.summary[[1]][,5])
Error in `[.default`(test.summary[[1]], , 5) : 
  incorrect number of dimensions
 test.summary$Error: Within[[1]]$Pr(F)
NULL
 test.summary[[2]][,5]
Error in `[.default`(test.summary[[2]], , 5) : 
  incorrect number of dimensions
 


Any advise?
Cheers
-- 
View this message in context: 
http://r.789695.n4.nabble.com/Problems-extracting-p-value-from-summary-aov-tp2718726p2718726.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] matrix plot

2010-09-29 Thread hairryharry


Hi,

Fairly new to R - have done basic plots but now faced with plotting a 
matrix/table of results -I know what I want but cannot find out how to 
do it.


Basically have individual questions ( x) to which an organization can 
rate themselves 1-10 (y) what I want to show is a matrix/density type 
plot (like the matrix corrolation plots I have seen on R graph site) 
showing for eac question (x) a circle or shape of varying size and/or 
colour to represent the number of organisations rating themselves for 
each value of y.


Hope this makes sense, any advice would be gratefully received.

HH

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] Adding two data.frames.

2010-09-29 Thread Joel


Hi

I have two data frames that contains the same sort of data and I want do add
them together to get one big data frame, anyone know how to do that?

ex:
data.frame1
A B C
1 2 3
4 5 6 
7 8 9

and
data.frame2
A B C
9 8 7
6 5 4
3 2 1

Would then become one big set:

data.frame3
A B C
1 2 3
4 5 6
7 8 9
9 8 7
6 5 4
3 2 1


Thx for your help

//Joel
-- 
View this message in context: 
http://r.789695.n4.nabble.com/Adding-two-data-frames-tp2718771p2718771.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Adding two data.frames.

2010-09-29 Thread Ivan Calandra

  Hi,

Take a look at rbind()

HTH,
Ivan

Le 9/29/2010 13:24, Joel a écrit :
 Hi

 I have two data frames that contains the same sort of data and I want do add
 them together to get one big data frame, anyone know how to do that?

 ex:
 data.frame1
 A B C
 1 2 3
 4 5 6
 7 8 9

 and
 data.frame2
 A B C
 9 8 7
 6 5 4
 3 2 1

 Would then become one big set:

 data.frame3
 A B C
 1 2 3
 4 5 6
 7 8 9
 9 8 7
 6 5 4
 3 2 1


 Thx for your help

 //Joel

-- 
Ivan CALANDRA
PhD Student
University of Hamburg
Biozentrum Grindel und Zoologisches Museum
Abt. Säugetiere
Martin-Luther-King-Platz 3
D-20146 Hamburg, GERMANY
+49(0)40 42838 6231
ivan.calan...@uni-hamburg.de

**
http://www.for771.uni-bonn.de
http://webapp5.rrz.uni-hamburg.de/mammals/eng/mitarbeiter.php


[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Regular expressions: offsets of groups

2010-09-29 Thread Michael Bedward

I'd definitely be a customer for it Titus. And it does seem like an
obvious hole in regex processing in R that cries out to be filled.

Um, ggregexpr isn't the sexiest of function names :)  Perhaps we can
think of something a little easier ?

How is your C coding ? Bill ? Anyone else ?  I could have a got at
writing some prototype code to test in the next few days, though if
someone else with decent C skills is itching to do it please speak up.

Michael

On 29 September 2010 20:08, Titus von der Malsburg malsb...@gmail.com wrote:
 Bill, Michael,

 good to see I'm not the only one who sees potential for improvements
 in the regexpr domain.  Adding a subpattern argument is certainly a
 step in the right direction and would make my life much easier.
 However, in my application I need to know not only the position of one
 group but also the position of the overall match in the original
 string.  The ideal solution would provide positions and match lengths
 for the whole pattern and for all groups if desired.  Only this would
 solve all related issues.  One possibility is to have a subpattern
 argument that accepts a vector of numbers (0 refers to the whole
 pattern):

   gregexpr(a+(b+), abcdaabbc, subpattern=c(0,1))
  [[1]]:
  [[1]][[1]]:
  [1] 1 5
  attr(, match.length):
  [1] 2 4
  [[1]][[2]]:
  [1] 2 7
  attr(, match.length):
  [1] 1 2

 A weakness of this solution is that the structure of the return values
 changes if length(subpattern)1.  An alternative is to have a separate
 function, say ggregepxr for group gregexpr, that returns a list of
 lists as in the above example.  This function would always return
 positions and match lengths of the whole pattern (group 0) and all
 groups.  The original gregexpr could still have the subpattern
 argument but it would only accept single numbers.  This way the return
 format of gregexpr remains the same.

 Best,

  Titus


 On Wed, Sep 29, 2010 at 2:42 AM, Michael Bedward
 michael.bedw...@gmail.com wrote:
 Ah, that's interesting - thanks Bill. That's certainly on the right
 track for me (Titus, you too ?) especially if the subpattern argument
 accepted a vector of multiple group indices.

 As you say, this is straightforward in C. I'd be happy to (try to)
 make a patch for the R sources if there was some consensus on the best
 way to implement it, ie. as a new R function or by extending existing
 function(s).


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Regular expressions: offsets of groups

2010-09-29 Thread Titus von der Malsburg

On Wed, Sep 29, 2010 at 1:58 PM, Michael Bedward
michael.bedw...@gmail.com wrote:
 How is your C coding ? Bill ? Anyone else ?  I could have a got at
 writing some prototype code to test in the next few days, though if
 someone else with decent C skills is itching to do it please speak up.

We have a skilled C- and R-programmer who could work on it. I'll talk to him.

   Titus

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] 3D Graphics

2010-09-29 Thread Duncan Murdoch


 On 29/09/2010 6:00 AM, JoH wrote:

Dear All,

I have made a scatter plot and placed a plane within it using scatterplot3d.
However, I have been asked for the data points to be a surface plot or have
the plane more closely resemble the data rather than show trends.

I have since tried to use the rgl package. Why doesn't this package use the
window which already contains graphics?


The usual graphics drivers don't support OpenGL, which rgl uses.  It 
would probably be possible to implement a standard R graphics device on 
an rgl canvas, but I don't know anyone who has done that.



I have one axis with decreasing values and two others with increasing
values, hence I get an error message when I've tried using pers3d.


You can always plot against negative x without labels, and add the 
labels for positive x later (using persp3d and axis3d).


Duncan Murdoch


Are there anyother ways in which I can create a surface plot?

Thank you,

Jo



__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Problems extracting p-value from summary(aov(...))

2010-09-29 Thread Dennis Murphy

Hi:

Try this:

test.summary[[1]][, 5]

It should return a vector of p-values, the last being NA. In your case,
since there is only one non-NA p-value, it is enough to do

test.summary[[1]][, 5][1]

You mean that wasn't obvious?   :)

Explanation:
summary(aovobj) actually returns a list, but that's not obvious from looking
at the print method. If you ask for its names, as in names(summary(aovobj)),
it returns NULL, further adding to the confusion. However, since it is a
list,
summary(aovobj)[[1]]  returns the ANOVA table, and ANOVA tables in R's
modeling functions are typically matrices. The model terms are the rownames
of the matrix (although that isn't obvious at first, either). That being the
case, the p-values are in column 5 (whose column name is 'Pr(F)' ), so we
can extract that column with
summary(aovobj)[[1]][, 5]   orsummary(aovobj)[[1]][, 'Pr(F)']
To get the first element of that p-value vector, use
summary(aovobj)[[1]][, 5][1]

HTH,
Dennis

On Wed, Sep 29, 2010 at 3:47 AM, syrvn ment...@gmx.net wrote:


 Hi,

 I did a aov and used summary to obtain the p-value. I tried many ways to
 extract the p-value from
 the summary result but failed. Among others I tried the following:



  test.summary -
  summary(aov(data[,1]~time.points+Error(subject/time.points)))
  test.summary

 Error: subject
  Df  Sum Sq  Mean Sq F value Pr(F)
 Residuals  9 0.27467 0.030518

 Error: subject:time.points
Df   Sum Sq   Mean Sq F value  Pr(F)
 time.points  2 0.018563 0.0092814  3.1777 0.06578 .
 Residuals   18 0.052574 0.0029208
 ---
 Signif. codes:  0 *** 0.001 ** 0.01 * 0.05 . 0.1   1
 
  as.matrix(test.summary[[1]][,5])
 Error in `[.default`(test.summary[[1]], , 5) :
  incorrect number of dimensions
  test.summary$Error: Within[[1]]$Pr(F)
 NULL
  test.summary[[2]][,5]
 Error in `[.default`(test.summary[[2]], , 5) :
  incorrect number of dimensions
 


 Any advise?
 Cheers
 --
 View this message in context:
 http://r.789695.n4.nabble.com/Problems-extracting-p-value-from-summary-aov-tp2718726p2718726.html
 Sent from the R help mailing list archive at Nabble.com.

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.


[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] how to rotate the x axis lable to an interested angle?

2010-09-29 Thread Jim Lemon


On 09/29/2010 07:19 PM, komais wrote:



http://r.789695.n4.nabble.com/file/n2718591/QQ%E6%88%AA%E5%9B%BE%E6%9C%AA%E5%91%BD%E5%90%8D--1.png

I am  a beginner for R,and i could not solve this problem.who can help me to
solve it ?
how to rotate the x axis lable to an interested angle as the picture show?

Hi komais,
Have a look at the staxlab function in the plotrix package.

Jim

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Adding two data.frames.

2010-09-29 Thread Joel


Thx for the help works very well for me.
-- 
View this message in context: 
http://r.789695.n4.nabble.com/Adding-two-data-frames-tp2718771p2718901.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] matrix plot

2010-09-29 Thread Dennis Murphy

Hi:

Are you thinking of a levelplot/heatmap? Some places to look:

?levelplot in the lattice package
?geom_tile in the ggplot2 package
?heatmap2 in the gplots package

You can find many more heatmap functions in R; try

library(sos)# install first if you don't have it - *very* handy
for quick searches
findFn('heatmap')

There are more than a few packages with heatmap functions...

HTH,
Dennis

On Wed, Sep 29, 2010 at 3:55 AM, hairryharry hairryha...@tesco.net wrote:

 Hi,

 Fairly new to R - have done basic plots but now faced with plotting a
 matrix/table of results -I know what I want but cannot find out how to do
 it.

 Basically have individual questions ( x) to which an organization can rate
 themselves 1-10 (y) what I want to show is a matrix/density type plot (like
 the matrix corrolation plots I have seen on R graph site) showing for eac
 question (x) a circle or shape of varying size and/or colour to represent
 the number of organisations rating themselves for each value of y.

 Hope this makes sense, any advice would be gratefully received.

 HH

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.


[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Validation / Training - test data

2010-09-29 Thread Frank Harrell


Split sample validation is highly unstable with your sample size.

The rms package can help with bootstrapping or cross-validation, assuming
you have all modeling steps repreated for each resample.

Frank

-
Frank Harrell
Department of Biostatistics, Vanderbilt University
-- 
View this message in context: 
http://r.789695.n4.nabble.com/Validation-Training-test-data-tp2718523p2718905.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] 95% confidence intercal with glm

2010-09-29 Thread Ben Bolker

-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1


## from ?glm
counts - c(18,17,15,20,10,20,25,13,12)
outcome - gl(3,1,9)
treatment - gl(3,3)
d.AD - data.frame(treatment, outcome, counts)
glm.D93 - glm(counts ~ outcome + treatment, family=poisson,
   data=d.AD)

## predict on 'link' or 'linear predictor' scale, with SEs
pp - predict(glm.D93,se.fit=TRUE)
etaframe -
with(pp,cbind(fit,lower=fit-1.96*se.fit,upper=fit+1.96*se.fit))
pframe - exp(etaframe)  ## inverse-link
## picture
pframe - as.data.frame(cbind(obs=d.AD$counts,pframe))
library(plotrix)
plot(pframe$obs,ylim=c(5,30))
with(pframe,plotCI(1:9,fit,li=lower,ui=upper,col=2,add=TRUE))

  If you're using a binomial model you need 'plogis' rather than 'exp'
as your
inverse link, and you may need to multiply by the binomial N as
appropriate.

On 10-09-29 06:07 AM, Sam wrote:
 I am looking to do the same but am a bit confused

 and apply the inverse link function for your model.

 i understand up to this point and i understand what this means,
 however i am unsure why it needs to be done and how you do it - i.e
 i use family=binomial is this wrong if i use this method to
 calculate 95% CI?

 Thanks

 Sam On 28 Sep 2010, at 14:50, Ben Bolker wrote:

 zozio32 remy.pascal at gmail.com writes:



 Hi

 I had to use a glm instead of my basic lm on some data due to
 unconstant variance.

 now, when I plot the model over the data, how can I easily get
 the 95% confidence interval that sormally coming from:

 yv - predict(modelVar,list(aveLength=xv),int=c)
 matlines(xv,yv,lty=c(1,2,2))

 There is no interval argument to pass to the predict function
 when using a glm, so I was wondering if I had to use an other
 function


 You need to use predict with se=TRUE; construct the confidence
 intervals by computing predicted values +- 1.96 times the standard
 error returned; and apply the inverse link function for your
 model.

 If heteroscedasticity is your main problem, and not a specific
 (known) non-normal distribution, you might consider using the gls
 function from the nlme package with an appropriate 'weights'
 argument.

 __ R-help@r-project.org
 mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do
 read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.


-BEGIN PGP SIGNATURE-
Version: GnuPG v1.4.10 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/

iEYEARECAAYFAkyjNKQACgkQc5UpGjwzenNW9gCgjfXin/9dI2y0rnk9wZWu97P8
TVgAn3WsG2ATva5WpLZXS1PXUTMRTTMi
=gvAc
-END PGP SIGNATURE-

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] plotting multiple animal tracks against Date/Time

2010-09-29 Thread Gabor Grothendieck

On Wed, Sep 29, 2010 at 7:52 AM, Struve, Juliane
j.str...@imperial.ac.uk wrote:
 I will post the example again to see if its readable now. My question is why
 does read.zoo(file=filenames,) work and  lapply(filenames, read.zoo,...) 
 does not ? Since I am reading the same file in both statements I just do not 
 know how to interpret Error in strptime(x, format, tz = tz) : invalid 'x' 
 argument.

 Thank you for all help.

 Juliane

  library(chron)
  library(zoo)
  #Generate example file
  Fish_ID=1646
  Date - 01/01/2004 00:01:00
  Date - as.POSIXct(strptime(Date,format=%m/%d/%Y %H:%M:%S))
  R2sqrt -100
  Test - data.frame(Fish_ID=Fish_ID,Date=Date,R2sqrt=R2sqrt)
  write.csv(Test,file=Test)
  #Read in example file
  filenames=Test
  read.zoo(file=filenames, header = TRUE, FUN = as.chron, sep = ,, 
 colClasses = c(NULL, NULL, character, numeric))
  lapply(filenames, read.zoo, header = TRUE, FUN = as.chron, sep = ,, 
 colClasses = c(NULL, NULL, character, numeric))

FUN is an argument of lapply so what is actually running is

   lapply(filenames, FUN = as.chron, ...)

rather than

   lapply(filenames, FUN = read.zoo, ...).

It seems the short form usage of lapply won`t work here.  Try this instead:

   lapply(filenames, function(F) read.zoo(F, header = TRUE, sep = ,,
FUN = as.chron,
 colClasses = c(NULL, NULL, character, numeric)))

-- 
Statistics  Software Consulting
GKX Group, GKX Associates Inc.
tel: 1-877-GKX-GROUP
email: ggrothendieck at gmail.com

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] matrix plot

2010-09-29 Thread Thomas Stewart

HH-

I'm not familiar with the plots you mention, but the following is a quick
attempt to create the plot you describe.

data-data.frame(
  org=1:10,
  q1=sample(1:10,replace=T),
  q2=sample(1:10,replace=T),
  q3=sample(1:10,replace=T))

#  This generates a random data set like the one you describe.  It looks
like this:

#   org q1 q2 q3
#11  9  1  4
#22  1  2  1
#33  1  3  7
#44 10  9  6
# ...

n-ncol(data)-1 # NUMBER OF ORGANIZATIONS

new.mat-apply(data[,-1],2,function(x){xtabs(~factor(x,levels=1:10))})

plot(rep(1:n,each=10),rep(1:10,n),
  pch=16,
  cex=c(new.mat),
  xaxt=n,
  xlab=Question,
  ylab=Score)
axis(1,at=1:n)

Instead of cex=c(new.mat), you can supply a function of new.mat, say
cex=f(new.mat) to specify the size of the dots.

Hope that helps.
-tgs




On Wed, Sep 29, 2010 at 6:55 AM, hairryharry hairryha...@tesco.net wrote:

 Hi,

 Fairly new to R - have done basic plots but now faced with plotting a
 matrix/table of results -I know what I want but cannot find out how to do
 it.

 Basically have individual questions ( x) to which an organization can rate
 themselves 1-10 (y) what I want to show is a matrix/density type plot (like
 the matrix corrolation plots I have seen on R graph site) showing for eac
 question (x) a circle or shape of varying size and/or colour to represent
 the number of organisations rating themselves for each value of y.

 Hope this makes sense, any advice would be gratefully received.

 HH

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.


[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Problems extracting p-value from summary(aov(...))

2010-09-29 Thread peter dalgaard


On Sep 29, 2010, at 14:25 , Dennis Murphy wrote:

 
 test.summary[[1]][, 5][1]
 
 You mean that wasn't obvious?   :)

Worse, it doesn't actually work...

 test.summary - summary(npk.aovE)
 test.summary[[1]][, 5]
Error in `[.default`(test.summary[[1]], , 5) : 
  incorrect number of dimensions
 test.summary$Error: block[[1]][, 5]
[1] 0.5252361NA
 test.summary[[1]][[1]][, 5]
[1] 0.5252361NA

I.e., you need an extra list extraction operator. The data structure goes

List of 2
 $ Error: block :List of 1
  ..$ :Classes ‘anova’ and 'data.frame':2 obs. of  5 variables:
  .. ..$ Df : num [1:2] 1 4
  .. ..$ Sum Sq : num [1:2] 37 306
...
 and the intermediate list has class (summary.aov,listof). I believe the 
point is that you can have a matrix LHS and get a table for each of its columns.

-- 
Peter Dalgaard
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Email: pd@cbs.dk  Priv: pda...@gmail.com

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Validation / Training - test data

2010-09-29 Thread Sam

Thanks for this,

I had used 

 validate(model0, method=boot,B=200)

To get a index.corrected Brier score, 

However i am also wanting to bootstrap the predicted probabilities  output from 
 predict(model1, type = response) to get a idea of confidence, or am i best 
just using se.fit = TRUE and then calculating the 95%CI? Does what i want to do 
make sense?

Thanks


On 29 Sep 2010, at 13:38, Frank Harrell wrote:


Split sample validation is highly unstable with your sample size.

The rms package can help with bootstrapping or cross-validation, assuming
you have all modeling steps repreated for each resample.

Frank

-
Frank Harrell
Department of Biostatistics, Vanderbilt University
-- 
View this message in context: 
http://r.789695.n4.nabble.com/Validation-Training-test-data-tp2718523p2718905.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] short captions for xtable?

2010-09-29 Thread cuz


Hi,

For my dissertation, I've made copious use of xtable. I've just gotten
stumped however. I'm a fan of extended captions explaining the table, but
now I have to assemble a a list of tables and the captions are unwieldy. I
presume xtable calls LaTeX's \caption function. Is there a way to include
short captions (a la \caption[short caption]{long caption}?

Thanks in advance for you help.

Cuz
-- 
View this message in context: 
http://r.789695.n4.nabble.com/short-captions-for-xtable-tp2719000p2719000.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Splitting data in to multiple boxplots

2010-09-29 Thread Joshua Wiley

Hi,

The form of the data is not terribly important neither is whether it
was sorted as boxplots are not order dependent are category a is a
sorted or not.  See below for individual plots with your new data.


# read in data
dat - read.table(textConnection(
1  a  12
2  b  4
3  a  3
4  c  54
5  a  12
6  b  11
7  c  9
8  c  2
), header = FALSE)
closeAllConnections()

# add names since you said they lack them
names(dat) - c(id, cat, value)

# this makes it so it will ask for input before
# changing plots
par(ask = TRUE)

# run the function boxplot for different values by() each level in dat$cat.
# if par(ask = TRUE) was set, it should make you click or hit
# return before changing each plot
by(dat$value, dat$cat, boxplot)



HTH,

Josh

On Wed, Sep 29, 2010 at 3:01 AM, deadlyspider wrcst...@gmail.com wrote:

 Ok, I don't think I was specific enough.

 The data originally came in this form


 1  a  12
 2  b  4
 3  a  3
 4  c  54
 5  a  12
 6  b  11
 7  c  9
 8  c  2
 .  .  .
 .  .  .
 .  .  .



 Where I sorted by the second column (NB the second column is the categories
 and they have long names). I would then like separate boxplots for each
 category.

 The loop idea would be nice but unfortunately I do not fully understand the
 answer given.

 Thanks.
 --
 View this message in context: 
 http://r.789695.n4.nabble.com/Splitting-data-in-to-multiple-boxplots-tp2717491p2718659.html
 Sent from the R help mailing list archive at Nabble.com.

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.




-- 
Joshua Wiley
Ph.D. Student, Health Psychology
University of California, Los Angeles
http://www.joshuawiley.com/

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] 3D Graphics

2010-09-29 Thread Ben Bolker

Duncan Murdoch murdoch.duncan at gmail.com writes:

 
   On 29/09/2010 6:00 AM, JoH wrote:
  Are there anyother ways in which I can create a surface plot?

  persp() (in base R) is less flexible than rgl::persp3d,
doesn't do hidden line removal, etc., but does appear in the
regular R graphics window and can be added to in a similar way
to scatterplot3d (see the examples).  (I believe it will have
the same complaint as persp3d about x, y needing to be in increasing
order)

  See also wireframe in the lattice package.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] Fitting a half-ellipse curve

2010-09-29 Thread Niklaus Hurlimann

Dear mailing list,

I have following array:

   X2 Y2
[1,] 422.7900  6.0
[2,] 469.8007  10.5
[3,] 483.9428  11.0
[4,] 532.4917  25.5
[5,] 596.1942  33.5
[6,] 630.8496  40.5
[7,] 733.2996  45.0
[8,] 946.4779  32.0
[9,] 996.8068  35.5
[10,] 1074.3310  23.0

I do afterwards the following:

plot.new()

plot.window(xlim=c(min(X1)-50,max(X1)+50),
ylim=c(min(Y1)-25,max(Y1)+25))

axis(2, cex.axis=1.2)
axis(1, cex.axis=1.2)

points(X2, Y2, type=p, pch=0, cex=1.2, col=black)

title(main=Dyke Geometry Along Strike, cex.main=1.2, font.main=4)
title(xlab=distance [m], cex.lab=1.2)
title(ylab=half-thickness [m], cex.lab=1.2)

box()


I would like curve fitting where I proceeded with a non
linear-regression with the according function below:

nls(formula = Y2 ~ abs(b*abs(1-X2^2/a^2)^(1/2)), start = list( a=282,
b=40))

The formula should give the y-positive part of an ellipse in my opinion
or I might be completely wrong.

In the end I would like to fit a curve of half an ellipse through the
data.  I might could do this as well with a 2nd order polynomial fit. I
am grateful for any suggestions and comments to my problem.


Cheers



-- 
Niklaus HÃ¼rlimann
Doctorant-PhD

UniversitÃ© de Lausanne  
Institut de MinÃ©ralogie et GÃ©ochimie 
L'Anthropole 
CH-1015 Lausanne 
Suisse

E-mail: niklaus.hurlim...@unil.ch
Tel:+41(0)21 692 4452 

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] how to rotate the x axis lable to an interested angle?

2010-09-29 Thread Ben Bolker

komais liutao1986 at yahoo.com.cn writes:

 how to rotate the x axis lable to an interested angle as the picture show? 


http://cran.r-project.org/doc/FAQ/R-FAQ.html#How-can-I-create-rotated-axis-labels_003f

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] how to rotate the x axis lable to an interested angle?

2010-09-29 Thread komais


Thank you very much. 
-- 
View this message in context: 
http://r.789695.n4.nabble.com/how-to-rotate-the-x-axis-lable-to-an-interested-angle-tp2718591p2718963.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Problems extracting p-value from summary(aov(...))

2010-09-29 Thread Dennis Murphy

Hi:

On Wed, Sep 29, 2010 at 5:56 AM, peter dalgaard pda...@gmail.com wrote:


 On Sep 29, 2010, at 14:25 , Dennis Murphy wrote:

 
  test.summary[[1]][, 5][1]
 
  You mean that wasn't obvious?   :)

 Worse, it doesn't actually work...

  test.summary - summary(npk.aovE)
  test.summary[[1]][, 5]
 Error in `[.default`(test.summary[[1]], , 5) :
  incorrect number of dimensions
  test.summary$Error: block[[1]][, 5]
 [1] 0.5252361NA
  test.summary[[1]][[1]][, 5]
 [1] 0.5252361NA

 I.e., you need an extra list extraction operator. The data structure goes

 List of 2
  $ Error: block :List of 1
  ..$ :Classes anova and 'data.frame':2 obs. of  5 variables:
  .. ..$ Df : num [1:2] 1 4
  .. ..$ Sum Sq : num [1:2] 37 306
 ...
  and the intermediate list has class (summary.aov,listof). I believe
 the point is that you can have a matrix LHS and get a table for each of its
 columns.


I presume we're using about the npk data in the MASS package.

 library(MASS)
 npk.aov - aov(yield ~ block + N*P*K, data = npk)
 npk.aovE - summary(npk.aov)
 str(npk.aovE)
List of 1
 $ :Classes anova and 'data.frame':   8 obs. of  5 variables:
  ..$ Df : num [1:8] 5 1 1 1 1 1 1 12
  ..$ Sum Sq : num [1:8] 343.3 189.3 8.4 95.2 21.3 ...
  ..$ Mean Sq: num [1:8] 68.7 189.3 8.4 95.2 21.3 ...
  ..$ F value: num [1:8] 4.447 12.259 0.544 6.166 1.378 ...
  ..$ Pr(F) : num [1:8] 0.01594 0.00437 0.4749 0.0288 0.26317 ...
 - attr(*, class)= chr [1:2] summary.aov listof
 npk.aovE[[1]][, 5]
[1] 0.015938790 0.004371812 0.474904093 0.028795054 0.263165283 0.168647879
[7] 0.862752086  NA
 npk.aovE[[1]][, 5][1]
[1] 0.01593879
 npk.aovE[[1]][, 'Pr(F)']
[1] 0.015938790 0.004371812 0.474904093 0.028795054 0.263165283 0.168647879
[7] 0.862752086  NA
 npk.aovE[[1]][, 'Pr(F)'][1]
[1] 0.01593879

What am I missing?

Regards,
Dennis

--
 Peter Dalgaard
 Center for Statistics, Copenhagen Business School
 Solbjerg Plads 3, 2000 Frederiksberg, Denmark
 Phone: (+45)38153501
 Email: pd@cbs.dk  Priv: pda...@gmail.com



[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] plotting multiple animal tracks against Date/Time

2010-09-29 Thread Struve, Juliane

I will post the example again to see if its readable now. My question is why 
does read.zoo(file=filenames,) work and  lapply(filenames, read.zoo,...) 
does not ? Since I am reading the same file in both statements I just do not 
know how to interpret Error in strptime(x, format, tz = tz) : invalid 'x' 
argument.

Thank you for all help.

Juliane 

 library(chron)
 library(zoo)
 #Generate example file
 Fish_ID=1646
 Date - 01/01/2004 00:01:00
 Date - as.POSIXct(strptime(Date,format=%m/%d/%Y %H:%M:%S))
 R2sqrt -100
 Test - data.frame(Fish_ID=Fish_ID,Date=Date,R2sqrt=R2sqrt)
 write.csv(Test,file=Test)
 #Read in example file 
 filenames=Test
 read.zoo(file=filenames, header = TRUE, FUN = as.chron, sep = ,, colClasses 
= c(NULL, NULL, character, numeric)) 
 lapply(filenames, read.zoo, header = TRUE, FUN = as.chron, sep = ,, 
colClasses = c(NULL, NULL, character, numeric))


From: Gabor Grothendieck [ggrothendi...@gmail.com]
Sent: 29 September 2010 00:09
To: Struve, Juliane
Cc: r-help@r-project.org
Subject: Re: [R] plotting multiple animal tracks against Date/Time

On Tue, Sep 28, 2010 at 9:30 AM, Struve, Juliane
j.str...@imperial.ac.uk wrote:

 Hi,

 in this self-contained example the file the same error message appears as 
 when I read in my original results files.

 library (zoo)
 library(chron)
 #generate example data
 Fish_ID=1646
  Date - 01/01/2004 00:01:00
  Date - as.POSIXct(strptime(Date,format=%m/%d/%Y %H:%M:%S))
  R2sqrt -100
 #put into dataframe
 Test - data.frame(Fish_ID=Fish_ID,Date=Date,R2sqrt=R2sqrt)
 # write .csv file
 write.csv(Test,file=Test)
 #generate list of files
 filenames=Test
 #read file(s) into zoo object
 read.zoo(file=filenames, header = TRUE, FUN = as.chron, sep = ,, colClasses 
 = c(NULL, NULL, character, numeric)) #works fine
 #read list of files into zoo.object
 lapply(filenames, read.zoo, header = TRUE, FUN = as.chron, sep = ,, 
 colClasses = c(NULL, NULL, character, numeric))# error
 Error in strptime(x, format, tz = tz) : invalid 'x' argument

 Am I missing something ?

 Thank you for your time and patience.

Self contained means anyone else can just copy your code and paste it
into their session and see the error message you see.

Its likely that your file does not contain what you think it does.



--
Statistics  Software Consulting
GKX Group, GKX Associates Inc.
tel: 1-877-GKX-GROUP
email: ggrothendieck at gmail.com
__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] 95% confidence intercal with glm

2010-09-29 Thread Sam

Hi Ben and list,

Sorry to be a pain! I have followed your code, and modified it -

 pp - predict(model1,se.fit=TRUE, type = response)
 etaframe -
 + with(pp,cbind(fit,lower=fit-1.96*se.fit,upper=fit+1.96*se.fit))
 pframe - plogis(etaframe) 
 pframe

My response variable is 0 or 1, the probabilities are all high above my cut-off 
point of 0.445, i think this may have something to do with 

 you may need to multiply by the binomial N as
 appropriate.

However how do i multiply if my response is 0 or 1?

Furthermore, will the approach above work for a ordinal response?

Thanks

Sam

On 29 Sep 2010, at 13:44, Ben Bolker wrote:

-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1


## from ?glm
counts - c(18,17,15,20,10,20,25,13,12)
outcome - gl(3,1,9)
treatment - gl(3,3)
d.AD - data.frame(treatment, outcome, counts)
glm.D93 - glm(counts ~ outcome + treatment, family=poisson,
 data=d.AD)

## predict on 'link' or 'linear predictor' scale, with SEs
pp - predict(glm.D93,se.fit=TRUE)
etaframe -
with(pp,cbind(fit,lower=fit-1.96*se.fit,upper=fit+1.96*se.fit))
pframe - exp(etaframe)  ## inverse-link
## picture
pframe - as.data.frame(cbind(obs=d.AD$counts,pframe))
library(plotrix)
plot(pframe$obs,ylim=c(5,30))
with(pframe,plotCI(1:9,fit,li=lower,ui=upper,col=2,add=TRUE))

If you're using a binomial model you need 'plogis' rather than 'exp'
as your
inverse link, and you may need to multiply by the binomial N as
appropriate.

On 10-09-29 06:07 AM, Sam wrote:
 I am looking to do the same but am a bit confused
 
 and apply the inverse link function for your model.
 
 i understand up to this point and i understand what this means,
 however i am unsure why it needs to be done and how you do it - i.e
 i use family=binomial is this wrong if i use this method to
 calculate 95% CI?
 
 Thanks
 
 Sam On 28 Sep 2010, at 14:50, Ben Bolker wrote:
 
 zozio32 remy.pascal at gmail.com writes:
 
 
 
 Hi
 
 I had to use a glm instead of my basic lm on some data due to
 unconstant variance.
 
 now, when I plot the model over the data, how can I easily get
 the 95% confidence interval that sormally coming from:
 
 yv - predict(modelVar,list(aveLength=xv),int=c)
 matlines(xv,yv,lty=c(1,2,2))
 
 There is no interval argument to pass to the predict function
 when using a glm, so I was wondering if I had to use an other
 function
 
 
 You need to use predict with se=TRUE; construct the confidence
 intervals by computing predicted values +- 1.96 times the standard
 error returned; and apply the inverse link function for your
 model.
 
 If heteroscedasticity is your main problem, and not a specific
 (known) non-normal distribution, you might consider using the gls
 function from the nlme package with an appropriate 'weights'
 argument.
 
 __ R-help@r-project.org
 mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do
 read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.
 

-BEGIN PGP SIGNATURE-
Version: GnuPG v1.4.10 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/

iEYEARECAAYFAkyjNKQACgkQc5UpGjwzenNW9gCgjfXin/9dI2y0rnk9wZWu97P8
TVgAn3WsG2ATva5WpLZXS1PXUTMRTTMi
=gvAc
-END PGP SIGNATURE-

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] 95% confidence intercal with glm

2010-09-29 Thread Sam

Hi Ben and list,

Sorry to be a pain! I have followed your code, and modified it -

 pp - predict(model1,se.fit=TRUE, type = response)
 etaframe -
 + with(pp,cbind(fit,lower=fit-1.96*se.fit,upper=fit+1.96*se.fit))
 pframe - plogis(etaframe) 
 pframe

My response variable is 0 or 1, the probabilities are all high above my cut-off 
point of 0.445, i think this may have something to do with 

 you may need to multiply by the binomial N as
 appropriate.

However how do i multiply if my response is 0 or 1?

Furthermore, will the approach above work for a ordinal response?

Thanks

Sam

On 29 Sep 2010, at 13:44, Ben Bolker wrote:

-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1


## from ?glm
counts - c(18,17,15,20,10,20,25,13,12)
outcome - gl(3,1,9)
treatment - gl(3,3)
d.AD - data.frame(treatment, outcome, counts)
glm.D93 - glm(counts ~ outcome + treatment, family=poisson,
 data=d.AD)

## predict on 'link' or 'linear predictor' scale, with SEs
pp - predict(glm.D93,se.fit=TRUE)
etaframe -
with(pp,cbind(fit,lower=fit-1.96*se.fit,upper=fit+1.96*se.fit))
pframe - exp(etaframe)  ## inverse-link
## picture
pframe - as.data.frame(cbind(obs=d.AD$counts,pframe))
library(plotrix)
plot(pframe$obs,ylim=c(5,30))
with(pframe,plotCI(1:9,fit,li=lower,ui=upper,col=2,add=TRUE))

If you're using a binomial model you need 'plogis' rather than 'exp'
as your
inverse link, and you may need to multiply by the binomial N as
appropriate.

On 10-09-29 06:07 AM, Sam wrote:
 I am looking to do the same but am a bit confused
 
 and apply the inverse link function for your model.
 
 i understand up to this point and i understand what this means,
 however i am unsure why it needs to be done and how you do it - i.e
 i use family=binomial is this wrong if i use this method to
 calculate 95% CI?
 
 Thanks
 
 Sam On 28 Sep 2010, at 14:50, Ben Bolker wrote:
 
 zozio32 remy.pascal at gmail.com writes:
 
 
 
 Hi
 
 I had to use a glm instead of my basic lm on some data due to
 unconstant variance.
 
 now, when I plot the model over the data, how can I easily get
 the 95% confidence interval that sormally coming from:
 
 yv - predict(modelVar,list(aveLength=xv),int=c)
 matlines(xv,yv,lty=c(1,2,2))
 
 There is no interval argument to pass to the predict function
 when using a glm, so I was wondering if I had to use an other
 function
 
 
 You need to use predict with se=TRUE; construct the confidence
 intervals by computing predicted values +- 1.96 times the standard
 error returned; and apply the inverse link function for your
 model.
 
 If heteroscedasticity is your main problem, and not a specific
 (known) non-normal distribution, you might consider using the gls
 function from the nlme package with an appropriate 'weights'
 argument.
 
 __ R-help@r-project.org
 mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do
 read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.
 

-BEGIN PGP SIGNATURE-
Version: GnuPG v1.4.10 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/

iEYEARECAAYFAkyjNKQACgkQc5UpGjwzenNW9gCgjfXin/9dI2y0rnk9wZWu97P8
TVgAn3WsG2ATva5WpLZXS1PXUTMRTTMi
=gvAc
-END PGP SIGNATURE-

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] FW: creating a custom background

2010-09-29 Thread Arenson, Ethan


Hi.

I want to create a plot with Pantone654 as the background. The RGB for this 
color is (0,61,121), which corresponds to a hex of #003D79. How do I specify 
the bg parameter for this?

All Best,
Ethan


[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] FW: creating a custom background

2010-09-29 Thread Henrique Dallazuanna

Try this:

par(bg = '#003D79')

On Wed, Sep 29, 2010 at 11:14 AM, Arenson, Ethan earen...@nbome.org wrote:


 Hi.

 I want to create a plot with Pantone654 as the background. The RGB for this
 color is (0,61,121), which corresponds to a hex of #003D79. How do I specify
 the bg parameter for this?

 All Best,
 Ethan


[[alternative HTML version deleted]]

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.




-- 
Henrique Dallazuanna
Curitiba-Paraná-Brasil
25° 25' 40 S 49° 16' 22 O

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] 95% confidence intercal with glm

2010-09-29 Thread Ben Bolker

On 10-09-29 10:04 AM, Sam wrote:
 Hi Ben and list,

 Sorry to be a pain! I have followed your code, and modified it -


  **You should not use type=response here.**
  The point is that the (symmetric) confidence intervals are computed on
the link/linear predictor
scale, and then inverse-link-transformed (in this case, along with the
fitted values) -- they then
become asymmetric.

 pp - predict(model1,se.fit=TRUE, type = response)
 etaframe -
 + with(pp,cbind(fit,lower=fit-1.96*se.fit,upper=fit+1.96*se.fit))
 pframe - plogis(etaframe)
 pframe

 My response variable is 0 or 1, the probabilities are all high above my
cut-off point of 0.445, i think this may have something to do with

 you may need to multiply by the binomial N as
 appropriate.

 However how do i multiply if my response is 0 or 1?
 

  if 'size=1' (i.e. Bernoulli trials) then you would be multiplying by
1, so don't bother.


 On 29 Sep 2010, at 13:44, Ben Bolker wrote:


 ## from ?glm
 counts - c(18,17,15,20,10,20,25,13,12)
 outcome - gl(3,1,9)
 treatment - gl(3,3)
 d.AD - data.frame(treatment, outcome, counts)
 glm.D93 - glm(counts ~ outcome + treatment, family=poisson,
  data=d.AD)

 ## predict on 'link' or 'linear predictor' scale, with SEs
 pp - predict(glm.D93,se.fit=TRUE)
 etaframe -
 with(pp,cbind(fit,lower=fit-1.96*se.fit,upper=fit+1.96*se.fit))
 pframe - exp(etaframe)  ## inverse-link
 ## picture
 pframe - as.data.frame(cbind(obs=d.AD$counts,pframe))
 library(plotrix)
 plot(pframe$obs,ylim=c(5,30))
 with(pframe,plotCI(1:9,fit,li=lower,ui=upper,col=2,add=TRUE))

 If you're using a binomial model you need 'plogis' rather than 'exp'
 as your
 inverse link, and you may need to multiply by the binomial N as
 appropriate.

 On 10-09-29 06:07 AM, Sam wrote:
  I am looking to do the same but am a bit confused

  and apply the inverse link function for your model.

  i understand up to this point and i understand what this means,
  however i am unsure why it needs to be done and how you do it - i.e
  i use family=binomial is this wrong if i use this method to
  calculate 95% CI?

  Thanks

  Sam On 28 Sep 2010, at 14:50, Ben Bolker wrote:

  zozio32 remy.pascal at gmail.com writes:

 
 
  Hi
 
  I had to use a glm instead of my basic lm on some data due to
  unconstant variance.
 
  now, when I plot the model over the data, how can I easily get
  the 95% confidence interval that sormally coming from:
 
  yv - predict(modelVar,list(aveLength=xv),int=c)
  matlines(xv,yv,lty=c(1,2,2))
 
  There is no interval argument to pass to the predict function
  when using a glm, so I was wondering if I had to use an other
  function
 

  You need to use predict with se=TRUE; construct the confidence
  intervals by computing predicted values +- 1.96 times the standard
  error returned; and apply the inverse link function for your
  model.

  If heteroscedasticity is your main problem, and not a specific
  (known) non-normal distribution, you might consider using the gls
  function from the nlme package with an appropriate 'weights'
  argument.

  __ R-help@r-project.org
  mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do
  read the posting guide http://www.R-project.org/posting-guide.html
  and provide commented, minimal, self-contained, reproducible code.



__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] generalized additive mixed models for ordinal data

2010-09-29 Thread Camarda, Carlo Giovanni

Dear R-users,

Is there any R-function for fitting generalized additive mixed
models for ordinal data? Do they actually make some sense? I can fit a
generalized linear mixed model for ordinal data using the function
clmm(ordinal) and I'm able to cope with generalized additive model for
ordinal data within the package VGAM.
But I would like to fit something like: 

g(\gamma_{ij}) = \theta_{j}  +  x_{i1} \beta_1  +  f(x_{2i})  +  u_{i}, 

where \gamma_{ij} denote the cumulative probability that the i-th
observation falls in the j-th category or below. 

Sorry for the rather out-of-R question,
Carlo Giovanni Camarda







--
This mail has been sent through the MPI for Demographic ...{{dropped:10}}

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] short captions for xtable?

2010-09-29 Thread Marc Schwartz

On Sep 29, 2010, at 8:31 AM, cuz wrote:

 
 Hi,
 
 For my dissertation, I've made copious use of xtable. I've just gotten
 stumped however. I'm a fan of extended captions explaining the table, but
 now I have to assemble a a list of tables and the captions are unwieldy. I
 presume xtable calls LaTeX's \caption function. Is there a way to include
 short captions (a la \caption[short caption]{long caption}?
 
 Thanks in advance for you help.
 
 Cuz


I don't see any obvious way to do that with xtable. You may wish to contact the 
package author directly and perhaps ask about adding that functionality as an 
enhancement.

In Frank's Hmisc package, there is the latex() function, which has separate 
arguments for 'caption' and 'caption.lot', so you may wish to look at that 
approach.

HTH,

Marc Schwartz

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] nlminb and optim

2010-09-29 Thread Doran, Harold

I am using both nlminb and optim to get MLEs from a likelihood function I have 
developed. AFAIK, the model I has not been previously used in this way and so I 
am struggling a bit to unit test my code since I don't have another data set to 
compare this kind of estimation to.

The likelihood I have is (in tex below)

\begin{equation}
\label{eqn:marginal}
L(\beta) = \prod_{s=1}^N \int \prod_{i=1}^K\frac{e^{x_{is}(\theta_s-\beta_i)}} 
{x_{is}!e^{e^(\theta_s-\beta_i)}} f(\theta)d(\theta)
\end{equation}

Where I view $\theta$ as a nuisance parameter and so I integrate it out of the 
likelihood. The goal is to get parameter estimates for $\beta$. The integral 
cannot be easily evaluated so I approximate it as:

\begin{equation}
\label{eqn:marginal2}
L(\beta) \approx \prod_{s=1}^N \sum_{q=1}^Q 
\prod_{i=1}^K\frac{e^{x_{is}(\theta_{q}-\beta_i)}} 
{x_{is}!e^{e^(\theta_{q}-\beta_i)}} w_q
\end{equation}

\noindent where $\theta_{q}$ is the node at quadrature point $q = 1, \ldots, Q$ 
and $w_q$ is the weight at quadrature point $q$.

For now, I am assuming $f(\theta)$ is Uniform but this may change and that is 
not a major issue. Now, I have written a function using both nlminb and optim. 
I have copied my function below where the arguments are the data set, Q for the 
number of quadrature points, and an option for providing starting values. I am 
running into some flow control problems and so I hack it a bit and fix a lower 
bound as you may see in the function.

Here is the issue at hand. First, nlminb and optim give different parameter 
estimates. Optim tells me it converged but nlminb tells me I get relative 
convergence. If I use different number of quadrature points in optim, I get 
very different parameter estimates and this should not happen. But, varying the 
number of quadrature points in nlminb yields the same parameter estimates, but 
I am not sure the model converged.

I have also provided the data set I am working with below as well as 
sessionInfo(). There may be many issues going on here and am grateful for any 
reactions you all may have. I *think* my likelihood is written properly and I 
*think* I am handling flow control issues in a relatively decent way. I may be 
misinterpreting optim or nlminb reports.

In any event, should anyone take the time to review this and offers pointers I 
would be grateful.
Harold



fit - function(data, Q, startVal = NULL, ...){
if(!is.null(startVal)  length(startVal) != ncol(data) ){
stop(Length of argument startVal not equal to 
the number of parameters estimated)
}
if(!is.null(startVal)){
startVal - startVal
} else {
startVal - rep(0, ncol(data))
}
#qq - gauss.quad.prob(Q, dist = 'normal', mu = 0, sigma=1)
qq - gauss.quad.prob(Q, dist = 'uniform', alpha = -5, beta=5)
rr1 - matrix(0, nrow = Q, ncol = nrow(data))
data - as.matrix(data)
L - nrow(data)
C - ncol(data)
fn - function(b){
b - b[1:C]
for(j in 1:Q){
for(i in 1:L){
rr1[j,i] - 
prod((exp(data[i,]*(qq$nodes[j]-b))) / (factorial(data[i,]) * 
exp(exp(qq$nodes[j]-b *
qq$weights[j]
}
}
rr1[rr1==0] - 1e-10
-sum(log(colSums(rr1)))
}
#opt - optim(startVal, fn, method = BFGS, hessian = FALSE)
opt - nlminb(startVal, fn)
#list(coefficients = opt$par, LogLik = -opt$value, 
Std.Error = sqrt(diag(solve(opt$hessian
#list(coefficients = opt$par, LogLik = -opt$value)
opt
}

fit(data, Q = 10)



 data
   cindyR cjR ohsR sdhpR
  [1,] 24  146 9
  [2,] 19  14610
  [3,] 19  14610
  [4,] 17  186 8
  [5,] 19  175 8
  [6,] 17  176 9
  [7,] 13  15511
  [8,] 19  136 8
  [9,] 21  10411
 [10,] 16  155 9
 [11,] 14  16510
 [12,] 14  13510
 [13,] 17  155 8
 [14,] 16  184 8
 [15,] 16  145 8
 [16,] 18  135 8
 [17,] 15  146 7
 [18,] 16  175 7
 [19,] 18  125 8
 [20,] 18   94 9
 [21,] 17   9410
 [22,] 18  124 8
 [23,] 15  155 7
 [24,] 16   95 8
 [25,] 18  115 7
 [26,] 18  104 9
 [27,] 15  154 7
 [28,] 16  114 9
 [29,] 11  165 8
 [30,] 11  165 8
 [31,] 16

Re: [R] 95% confidence intercal with glm

2010-09-29 Thread Chris Mcowen

Right, that makes sense, thanks

The reason i used type= response was i wanted to convert the predicted 
probabilities to the response scale, as surely this is the scale at which a 
95CI value is most useful for?

I.e 

 pp - predict(model1,se.fit=TRUE, type = response)


1  0.68

Probability of point 1 having a response of 1 = 0.68 - # this is above the 
cutoff therefore this has a response of 1

Then it would be of most use to get the 95CI on this scale to see if the 
probability straddles the cutoff value?

Maybe i am missing something?

Thanks

 
On 29 Sep 2010, at 15:27, Ben Bolker wrote:

On 10-09-29 10:04 AM, Sam wrote:
 Hi Ben and list,
 
 Sorry to be a pain! I have followed your code, and modified it -
 

 **You should not use type=response here.**
 The point is that the (symmetric) confidence intervals are computed on
the link/linear predictor
scale, and then inverse-link-transformed (in this case, along with the
fitted values) -- they then
become asymmetric.

 pp - predict(model1,se.fit=TRUE, type = response)
 etaframe -
 + with(pp,cbind(fit,lower=fit-1.96*se.fit,upper=fit+1.96*se.fit))
 pframe - plogis(etaframe)
 pframe
 
 My response variable is 0 or 1, the probabilities are all high above my
cut-off point of 0.445, i think this may have something to do with
 
 you may need to multiply by the binomial N as
 appropriate.
 
 However how do i multiply if my response is 0 or 1?
 

 if 'size=1' (i.e. Bernoulli trials) then you would be multiplying by
1, so don't bother.

 
 On 29 Sep 2010, at 13:44, Ben Bolker wrote:
 
 
 ## from ?glm
 counts - c(18,17,15,20,10,20,25,13,12)
 outcome - gl(3,1,9)
 treatment - gl(3,3)
 d.AD - data.frame(treatment, outcome, counts)
 glm.D93 - glm(counts ~ outcome + treatment, family=poisson,
 data=d.AD)
 
 ## predict on 'link' or 'linear predictor' scale, with SEs
 pp - predict(glm.D93,se.fit=TRUE)
 etaframe -
 with(pp,cbind(fit,lower=fit-1.96*se.fit,upper=fit+1.96*se.fit))
 pframe - exp(etaframe)  ## inverse-link
 ## picture
 pframe - as.data.frame(cbind(obs=d.AD$counts,pframe))
 library(plotrix)
 plot(pframe$obs,ylim=c(5,30))
 with(pframe,plotCI(1:9,fit,li=lower,ui=upper,col=2,add=TRUE))
 
 If you're using a binomial model you need 'plogis' rather than 'exp'
 as your
 inverse link, and you may need to multiply by the binomial N as
 appropriate.
 
 On 10-09-29 06:07 AM, Sam wrote:
 I am looking to do the same but am a bit confused
 
 and apply the inverse link function for your model.
 
 i understand up to this point and i understand what this means,
 however i am unsure why it needs to be done and how you do it - i.e
 i use family=binomial is this wrong if i use this method to
 calculate 95% CI?
 
 Thanks
 
 Sam On 28 Sep 2010, at 14:50, Ben Bolker wrote:
 
 zozio32 remy.pascal at gmail.com writes:
 
 
 
 Hi
 
 I had to use a glm instead of my basic lm on some data due to
 unconstant variance.
 
 now, when I plot the model over the data, how can I easily get
 the 95% confidence interval that sormally coming from:
 
 yv - predict(modelVar,list(aveLength=xv),int=c)
 matlines(xv,yv,lty=c(1,2,2))
 
 There is no interval argument to pass to the predict function
 when using a glm, so I was wondering if I had to use an other
 function
 
 
 You need to use predict with se=TRUE; construct the confidence
 intervals by computing predicted values +- 1.96 times the standard
 error returned; and apply the inverse link function for your
 model.
 
 If heteroscedasticity is your main problem, and not a specific
 (known) non-normal distribution, you might consider using the gls
 function from the nlme package with an appropriate 'weights'
 argument.
 
 __ R-help@r-project.org
 mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do
 read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.
 
 

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] getDLLVersion

2010-09-29 Thread Kyle Covington

Hi,

I'm just learning to write R extensions in C and to embed R in C.

I was trying to get through the example in the help page on calling the .dll
directly (
http://cran.r-project.org/doc/manuals/R-exts.html#Calling-R_002edll-directly
).

When I compile I consistently get the error that getDLLVersion() as well as
get_R_HOME() and getRUser() are undefined references.

These are defined in Rembedded.h ln60 as:
extern char *getDLLVersion(void), *getRUser(void), *get_R_HOME(void);

I included Rembedded.h from the example.

I searched the entire R directory for the string getDLLVersion and only
found it in Rembedded.h and the help documentation (for the .dll directly
example).

Am I missing some key file or just not understanding how these functions
work?  I'm just learning C so this may be a very basic question.  If anyone
could point me to a good reference on this it would be very helpful.

Technical Notes:
OS: win7 32bit
Compiler: mingw32
R: 2.11.1

Thanks
Kyle

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] FW: creating a custom background

2010-09-29 Thread Thomas Stewart

Ethan-

You need to be more explicit about what you mean by 'background'.  Do you
mean:
(a) the entire plot including margins?, or
(b) only the plotting area?, or
(c) a different color for both margins and plotting area?

If you want (a), the solution is par(bg = '#003D79').

If you want (b), the solution is a little trick I picked up from the R help
archives:

plot.area.col-function(col){
  ttt-par()$usr
  rect(ttt[1],ttt[3],ttt[2],ttt[4],col=col)
  }

plot(1,1,
  panel.first=plot.area.col('#003D79')
  )

If you want (c), you can combine both (a) and (b) as follows:

par(bg = 'blue')

plot(1,1,
  panel.first=plot.area.col('red')
  )

Hope that helps.
-tgs



If you are using the plot function, and you only want to change the plot
area colored

On Wed, Sep 29, 2010 at 10:14 AM, Arenson, Ethan earen...@nbome.org wrote:


 Hi.

 I want to create a plot with Pantone654 as the background. The RGB for this
 color is (0,61,121), which corresponds to a hex of #003D79. How do I specify
 the bg parameter for this?

 All Best,
 Ethan


[[alternative HTML version deleted]]

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.


[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] What's the meaning of Species ~ . in IRIS data

2010-09-29 Thread Gundala Viswanath

I am refering to a function call like this:

data(iris)
x - svmlight(Species ~ ., data = iris)

I tried to see the content of it by typing:

 Species ~ .

but it gives nothing.  How can I see it's content ?

- P.Dubois

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] What's the meaning of Species ~ . in IRIS data

2010-09-29 Thread Duncan Murdoch


 On 29/09/2010 10:51 AM, Gundala Viswanath wrote:

I am refering to a function call like this:

data(iris)
x- svmlight(Species ~ ., data = iris)

I tried to see the content of it by typing:

  Species ~ .

but it gives nothing.  How can I see it's content ?


str(Species ~ .) will tell you that it's a formula.  The help page for 
the svmlight() function should tell you how it is used.


Duncan Murdoch

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] What's the meaning of Species ~ . in IRIS data

2010-09-29 Thread Prof Brian Ripley


On Wed, 29 Sep 2010, Duncan Murdoch wrote:


On 29/09/2010 10:51 AM, Gundala Viswanath wrote:

I am refering to a function call like this:

data(iris)
x- svmlight(Species ~ ., data = iris)

I tried to see the content of it by typing:

  Species ~ .

but it gives nothing.  How can I see it's content ?


str(Species ~ .) will tell you that it's a formula.  The help page for the 
svmlight() function should tell you how it is used.


If this is svmlight in klaR, it seems not to.  But ?formula will 
definitely tell you how to interpret the formula.


--
Brian D. Ripley,  rip...@stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel:  +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UKFax:  +44 1865 272595

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] R crashes when loading rgl package before minqa package

2010-09-29 Thread Gaspard Lequeux



Hej,

Calling newuoa (from the minqa package) makes R crash when the package rgl 
is loaded first. This however only on certain selected data.


The data used for testing (saved to 'bugs.R'):


xvals = 
c(1,2,4,5,7,8,9,10,11,12,14,15,16,18,19,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36)

yvals = 
c(857.7597,975.8624,978.2655,979.3034,965.5919,983.8946,992.2512,992.1178,979.5379,974.4269,968.4113,991.5210,977.3361,985.7800,975.5220,974.6880,973.8102,980.7295,982.0034,984.7993,978.4948,970.4351,969.0718,983.7892,976.3637,980.7833,987.1665,976.6000,975.1332,971.0757,989.4693)

initpar = c(-5.1471384, -3861.8905839, 979.2616002, 0.2572355, 27.5705764)

optimft - function(x) {
  yft = x[2] + (x[3] - x[2])/((1 + exp(x[1] * (log(xvals) - log(x[4]^x[5])
  return(sum((yvals - yft)^2))
}


Sequence of commands needed to make the bug appear:

Start R
source('bugs.R')
library(minqa)
library(rgl)
newuoa(initpar, optimft)
 = OK

Start R
source('bugs.R')
library(rgl)
library(minqa)
newuoa(initpar, optimft)
  = Crash: segfault: address 0x18, cause 'memory not mapped'

I found the bug using the package qpcR, where rgl is loaded when loading 
qpcR while minqa is only loaded later, when needed.



Running on Debian squeeze 64 bit.
R version: R version 2.11.1 (2010-05-31) x86_64-pc-linux-gnu
rgl version: 0.91
minqa version:  1.1.9
Rcpp version: 0.8.6 (loaded by minqa)

Kind regards,

Gaspard Lequeux

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] generalized additive mixed models for ordinal data

2010-09-29 Thread Dennis Murphy

Hi:

You could look into the gamm4 package. Its description is:

Fit generalized additive mixed models via a version of mgcv's gamm function,
using lme4 for estimation via Fabian Scheipl's trick.

HTH,
Dennis

On Wed, Sep 29, 2010 at 7:28 AM, Camarda, Carlo Giovanni 
cama...@demogr.mpg.de wrote:

 Dear R-users,

Is there any R-function for fitting generalized additive mixed
 models for ordinal data? Do they actually make some sense? I can fit a
 generalized linear mixed model for ordinal data using the function
 clmm(ordinal) and I'm able to cope with generalized additive model for
 ordinal data within the package VGAM.
 But I would like to fit something like:

 g(\gamma_{ij}) = \theta_{j}  +  x_{i1} \beta_1  +  f(x_{2i})  +  u_{i},

 where \gamma_{ij} denote the cumulative probability that the i-th
 observation falls in the j-th category or below.

 Sorry for the rather out-of-R question,
 Carlo Giovanni Camarda







 --
 This mail has been sent through the MPI for Demographic ...{{dropped:10}}

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.


[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] (OT) Change of email address

2010-09-29 Thread Ted Harding

Apologies for bothering anyone who may not be interested in
this, but some of you will, for instance, have my current
email address in your address books, etc.

As  result of a new policy by Manchester University, retired
former staff who no longer contribute actively to research
in their former departments are to have their email accounts
closed.

The address from which I have been mailing to R-help (etc.):
  ted.hard...@manchester.ac.uk
will therefore soon be closed down.

I have set up (and subscribed to R-help, R-devel, R-announce)
a new email address:

  ted.hard...@wlandres.net

If you have kept a record of my current (manchester.ac.uk)
address, please modify this to the new address.

Messages sent to the current address will continue to arrive,
for another week or two.

With thanks, and best wishes to all,
Ted.


E-Mail: (Ted Harding) ted.hard...@wlandres.net
Fax-to-email: +44 (0)870 094 0861
Date: 29-Sep-10   Time: 16:58:18
-- XFMail --

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] short captions for xtable?

2010-09-29 Thread cuz


Thanks. I didn't see any obvious way to do it either. It looks like the
caption option has room for additional input that are as yet not designated.
I'll take a look at the latex() possibility.
Best,
cuz
-- 
View this message in context: 
http://r.789695.n4.nabble.com/short-captions-for-xtable-tp2719000p2719284.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] R Spracheinstellung

2010-09-29 Thread Keller Tobias (kelleto0)

sehr geehrte Damen und Herren

Für unseren Statistik Unterricht brauchen wir das R-Programm.
Ich habe das Programm heruntergeladen, deutsch gewählt und installiert. Das 
Problem ist nach 3mal neu installieren, dass das Programm immer auf englisch 
kommt.

Können Sie mir helfen?

Vg

Tobias Keller

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] R Graphic - Tellis as a potential

2010-09-29 Thread Jeevan Duggempudi

Hello all, 

I have been meaning to learn R for a while and have  just subscribed to this 
list. I am planning to give R a shot at one of  my live projects. I am looking 
to explore graphical features of R on my  data below. 


Sample Data:
Cat1 - Cat2 - Cat3 - Cat4 - NumPeople - Salary
H - L - H - L - 100 - 5
L - L - L - L - 40 - 3
 - H - H - - 100 - 45000

Cat1 through Cat4 are categorical variables containing High, Medium, Low or 
Blank values and the last two variables are continuous. 


1.  Is there a good way of graphically representing this data in R? I am  
looking for total population and average salary within each combination  of 
values. 

2. Cat1 and Cat2 are related while  Cat3, Cat4 are related. I have a grid in 
Excel that summarizes by  having (Cat1 + Cat2) on the left and (Cat3 + Cat4) on 
the top. Could we  draw this kind of matrix using R? I started looking up this 
morning and  see that this could be done using trellis objects. I will 
appreciate if  anyone has examples similar to this. 


Thanks,


  
[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Problems extracting p-value from summary(aov(...))

2010-09-29 Thread Dennis Murphy

Hi:

Someone off-list (Josh Wiley - thank you) mentioned the Error() term in the
OP's ANOVA, which I missed in responding to the post - sorry for the
misinformation. Using the npk example with code that Josh showed me, we
have, for the following model,

npk.aov2 - aov(yield ~ N*P*K + Error(block/P), data = npk)
npk.aovE2 - summary(npk.aov2)
str(npk.aovE2)
List of 3
 $ Error: block  :List of 1
  ..$ :Classes anova and 'data.frame':2 obs. of  5 variables:
  .. ..$ Df : num [1:2] 1 4
  .. ..$ Sum Sq : num [1:2] 37 306
  .. ..$ Mean Sq: num [1:2] 37 76.6
  .. ..$ F value: num [1:2] 0.483 NA
  .. ..$ Pr(F) : num [1:2] 0.525 NA
  ..- attr(*, class)= chr [1:2] summary.aov listof
 $ Error: block:P:List of 1
  ..$ :Classes anova and 'data.frame':3 obs. of  5 variables:
  .. ..$ Df : num [1:3] 1 1 4
  .. ..$ Sum Sq : num [1:3] 8.4 33.1 38.3
  .. ..$ Mean Sq: num [1:3] 8.4 33.14 9.57
  .. ..$ F value: num [1:3] 0.878 3.464 NA
  .. ..$ Pr(F) : num [1:3] 0.402 0.136 NA
  ..- attr(*, class)= chr [1:2] summary.aov listof
 $ Error: Within :List of 1
  ..$ :Classes anova and 'data.frame':5 obs. of  5 variables:
  .. ..$ Df : num [1:5] 1 1 1 1 8
  .. ..$ Sum Sq : num [1:5] 189.282 95.202 21.282 0.482 147.023
  .. ..$ Mean Sq: num [1:5] 189.282 95.202 21.282 0.482 18.378
  .. ..$ F value: num [1:5] 10.2994 5.1802 1.158 0.0262 NA
  .. ..$ Pr(F) : num [1:5] 0.0124 0.0524 0.3133 0.8754 NA
  ..- attr(*, class)= chr [1:2] summary.aov listof
 - attr(*, class)= chr summary.aovlist

In this case, the p-values can be gotten from each component of the list,
but one has to read the output from str() carefully. For example, under each
$Error component is another component, and it is the subcomponent that needs
to be subsetted to grab the p-values:

Top level:
 npk.aovE2[[1]][[1]][, 5]
[1] 0.5252361NA

Mid-level:
 npk.aovE2[[2]][[1]][, 5]
[1] 0.4017266 0.1362185NA

Bottom level:
 npk.aovE2[[3]][[1]][, 5]
[1] 0.01243794 0.05239664 0.31325902 0.87540509 NA

BTW, this is one reason why str() is such an important function in R.

Thanks to Peter and Josh for pointing out my error; apologies for the
thickheadedness.

Dennis

On Wed, Sep 29, 2010 at 5:56 AM, peter dalgaard pda...@gmail.com wrote:


 On Sep 29, 2010, at 14:25 , Dennis Murphy wrote:

 
  test.summary[[1]][, 5][1]
 
  You mean that wasn't obvious?   :)

 Worse, it doesn't actually work...

  test.summary - summary(npk.aovE)
  test.summary[[1]][, 5]
 Error in `[.default`(test.summary[[1]], , 5) :
  incorrect number of dimensions
  test.summary$Error: block[[1]][, 5]
 [1] 0.5252361NA
  test.summary[[1]][[1]][, 5]
 [1] 0.5252361NA

 I.e., you need an extra list extraction operator. The data structure goes

 List of 2
  $ Error: block :List of 1
  ..$ :Classes anova and 'data.frame':2 obs. of  5 variables:
  .. ..$ Df : num [1:2] 1 4
  .. ..$ Sum Sq : num [1:2] 37 306
 ...
  and the intermediate list has class (summary.aov,listof). I believe
 the point is that you can have a matrix LHS and get a table for each of its
 columns.

 --
 Peter Dalgaard
 Center for Statistics, Copenhagen Business School
 Solbjerg Plads 3, 2000 Frederiksberg, Denmark
 Phone: (+45)38153501
 Email: pd@cbs.dk  Priv: pda...@gmail.com



[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] [R-pkgs] caret package version 4.63

2010-09-29 Thread KuhnA03

Version 4.63 of the caret package is now on CRAN.

caret can be used to tune the parameters of predictive models using
resampling, estimate variable importance and visualize the results.
There are also various modeling and helper functions that can be
useful for training models.

caret has wrappers to over 99 different models for classification
and regression. See the package vignettes or:

  http://user2010.org/slides/Kuhn.pdf
  http://www.jstatsoft.org/v28/i05

for more details. 


Since the last posting to the list:

 - wrappers for a number of new models were added, notably gam
   models (from both the gam and mgcv packages) and logic trees

 - when resampling with train(), class probabilities can now be
   used to calculate performance (such as the AUC of an ROC curve).
   A basic summary function, twoClassSummary(), can be used to
   calculate sensitivity, specificity and the ROC AUC.

 - repeated k-fold CV and the bootstrap 632 technique are available
   in train()

 - pre-processing can not be used within each resampling iteration
   within train().

 - a function for independent component regression (icr) was added

 - the class for aggregating and visualization resampling results
   (resamples) has been enhanced with more visualization methods.
   The class can also work with caret's feature selection routines
   (rfe() and sbf())

 - the print method for train() has been improved

 - functions can be now be passed to the tuneGrid argument in
   train()

 - an existing function that catalogs the existing models available
   within train(), called modelLookup(), is now available to the
   users

 - when parallel processing, more computations are being executed
   in the worker processes than previously (e.g. performance calcs)

The NEWS file has the blow-by-blow list of changes.


The package homepage is

  https://r-forge.r-project.org/projects/caret/

Send questions, collaborations, comments etc to max.kuhn at pfizer.com.

Max

___
R-packages mailing list
r-packa...@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-packages

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] plotting multiple animal tracks against Date/Time

2010-09-29 Thread Dennis Murphy

Hi:

This 'works' on the lapply end:

# Function to read one file from the list:
g - function(x) read.zoo(file = x, header = TRUE, FUN = as.chron, sep =
,,
 colClasses = c(NULL, NULL, character,
numeric))
# Apply to all files in list:
lapply(filenames, g)
[[1]]
(01/01/04 00:01:00)
100

# Just to prove to myself that this works on a list of input files,

write.csv(Test,file=Test2)
write.csv(Test,file=Test3)
filenames - c('Test', 'Test2', 'Test3')
lapply(filenames, g)
[[1]]
(01/01/04 00:01:00)
100

[[2]]
(01/01/04 00:01:00)
100

[[3]]
(01/01/04 00:01:00)
100

Hope this helps,
Dennis

On Wed, Sep 29, 2010 at 4:52 AM, Struve, Juliane j.str...@imperial.ac.ukwrote:

 I will post the example again to see if its readable now. My question is
 why
 does read.zoo(file=filenames,) work and  lapply(filenames,
 read.zoo,...) does not ? Since I am reading the same file in both statements
 I just do not know how to interpret Error in strptime(x, format, tz = tz) :
 invalid 'x' argument.

 Thank you for all help.

 Juliane

  library(chron)
  library(zoo)
  #Generate example file
  Fish_ID=1646
  Date - 01/01/2004 00:01:00
  Date - as.POSIXct(strptime(Date,format=%m/%d/%Y %H:%M:%S))
  R2sqrt -100
  Test - data.frame(Fish_ID=Fish_ID,Date=Date,R2sqrt=R2sqrt)
  write.csv(Test,file=Test)
  #Read in example file
  filenames=Test
  read.zoo(file=filenames, header = TRUE, FUN = as.chron, sep = ,,
 colClasses = c(NULL, NULL, character, numeric))
  lapply(filenames, read.zoo, header = TRUE, FUN = as.chron, sep = ,,
 colClasses = c(NULL, NULL, character, numeric))

 
 From: Gabor Grothendieck [ggrothendi...@gmail.com]
 Sent: 29 September 2010 00:09
 To: Struve, Juliane
 Cc: r-help@r-project.org
 Subject: Re: [R] plotting multiple animal tracks against Date/Time

 On Tue, Sep 28, 2010 at 9:30 AM, Struve, Juliane
 j.str...@imperial.ac.uk wrote:
 
  Hi,
 
  in this self-contained example the file the same error message appears as
 when I read in my original results files.
 
  library (zoo)
  library(chron)
  #generate example data
  Fish_ID=1646
   Date - 01/01/2004 00:01:00
   Date - as.POSIXct(strptime(Date,format=%m/%d/%Y %H:%M:%S))
   R2sqrt -100
  #put into dataframe
  Test - data.frame(Fish_ID=Fish_ID,Date=Date,R2sqrt=R2sqrt)
  # write .csv file
  write.csv(Test,file=Test)
  #generate list of files
  filenames=Test
  #read file(s) into zoo object
  read.zoo(file=filenames, header = TRUE, FUN = as.chron, sep = ,,
 colClasses = c(NULL, NULL, character, numeric)) #works fine
  #read list of files into zoo.object
  lapply(filenames, read.zoo, header = TRUE, FUN = as.chron, sep = ,,
 colClasses = c(NULL, NULL, character, numeric))# error
  Error in strptime(x, format, tz = tz) : invalid 'x' argument
 
  Am I missing something ?
 
  Thank you for your time and patience.

 Self contained means anyone else can just copy your code and paste it
 into their session and see the error message you see.

 Its likely that your file does not contain what you think it does.



 --
 Statistics  Software Consulting
 GKX Group, GKX Associates Inc.
 tel: 1-877-GKX-GROUP
 email: ggrothendieck at gmail.com
 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.


[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] (OT) Change of email address

2010-09-29 Thread Vojtěch Zeisek

Hello,
go to https://stat.ethz.ch/mailman/options/r-help/y...@email and if You do not 
remember Your password, use Password reminder (down) to mail it to Your 
address. Then login and do all needed changes. You can replace r-hep in link 
above with r-sig-phylo, r-announce and so on. And of course replace y...@email 
above with probable ted.hard...@wlandres.net.
Have a nice day!
Vojtěch

Dne St 29. září 2010 10:58:22 ted.hard...@wlandres.net napsal(a):
 Apologies for bothering anyone who may not be interested in
 this, but some of you will, for instance, have my current
 email address in your address books, etc.
 
 As  result of a new policy by Manchester University, retired
 former staff who no longer contribute actively to research
 in their former departments are to have their email accounts
 closed.
 
 The address from which I have been mailing to R-help (etc.):
   ted.hard...@manchester.ac.uk
 will therefore soon be closed down.
 
 I have set up (and subscribed to R-help, R-devel, R-announce)
 a new email address:
 
   ted.hard...@wlandres.net
 
 If you have kept a record of my current (manchester.ac.uk)
 address, please modify this to the new address.
 
 Messages sent to the current address will continue to arrive,
 for another week or two.
 
 With thanks, and best wishes to all,
 Ted.
 
 
 E-Mail: (Ted Harding) ted.hard...@wlandres.net
 Fax-to-email: +44 (0)870 094 0861
 Date: 29-Sep-10   Time: 16:58:18
 -- XFMail --
 
 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html and provide commented,
 minimal, self-contained, reproducible code.
-- 
Vojtěch Zeisek

Department of Botany, Faculty of Science, Charles Uni., Prague, CZ
Institute of Botany, Academy of Science, Czech Republic
Community of the openSUSE GNU/Linux

https://www.natur.cuni.cz/faculty-en?set_language=en
http://www.ibot.cas.cz/?p=indexamp;site=en
http://www.opensuse.org/
http://web.natur.cuni.cz/~zeisek/


signature.asc
Description: This is a digitally signed message part.
__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] R Spracheinstellung

2010-09-29 Thread Berwin A Turlach

G'day Tobias,

On Wed, 29 Sep 2010 14:01:10 +
Keller Tobias (kelleto0) kelle...@students.zhaw.ch wrote:

 Für unseren Statistik Unterricht brauchen wir das R-Programm.
 Ich habe das Programm heruntergeladen, deutsch gewählt und
 installiert. Das Problem ist nach 3mal neu installieren, dass das
 Programm immer auf englisch kommt.
 
 Können Sie mir helfen?

R for Windows FAQ 3.2:

http://cran.ms.unimelb.edu.au/bin/windows/base/rw-FAQ.html#I-want-R-in-English_0021

HTH.

Cheers,

Berwin

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] 95% confidence intercal with glm

2010-09-29 Thread Sam

Dear List and Ben

( I apologise if this has been sent twice, but it is not showing in my sent 
folder and i have been having trouble with my email of late)

Right, that makes sense, thanks

The reason i used type= response was i wanted to convert the predicted 
probabilities to the response scale, as surely this is the scale at which a 
95CI value is most useful for?

I.e 

 pp - predict(model1,se.fit=TRUE, type = response)


1  0.68

Probability of point 1 having a response of 1 = 0.68 - # this is above the 
cutoff therefore this has a response of 1

Then it would be of most use to get the 95CI on this scale to see if the 
probability straddles the cutoff value?

Maybe i am missing something?

Thanks


On 29 Sep 2010, at 15:27, Ben Bolker wrote:

On 10-09-29 10:04 AM, Sam wrote:
 Hi Ben and list,
 
 Sorry to be a pain! I have followed your code, and modified it -
 

**You should not use type=response here.**
The point is that the (symmetric) confidence intervals are computed on
the link/linear predictor
scale, and then inverse-link-transformed (in this case, along with the
fitted values) -- they then
become asymmetric.

 pp - predict(model1,se.fit=TRUE, type = response)
 etaframe -
 + with(pp,cbind(fit,lower=fit-1.96*se.fit,upper=fit+1.96*se.fit))
 pframe - plogis(etaframe)
 pframe
 
 My response variable is 0 or 1, the probabilities are all high above my
cut-off point of 0.445, i think this may have something to do with
 
 you may need to multiply by the binomial N as
 appropriate.
 
 However how do i multiply if my response is 0 or 1?
 

if 'size=1' (i.e. Bernoulli trials) then you would be multiplying by
1, so don't bother.

 
 On 29 Sep 2010, at 13:44, Ben Bolker wrote:
 
 
 ## from ?glm
 counts - c(18,17,15,20,10,20,25,13,12)
 outcome - gl(3,1,9)
 treatment - gl(3,3)
 d.AD - data.frame(treatment, outcome, counts)
 glm.D93 - glm(counts ~ outcome + treatment, family=poisson,
data=d.AD)
 
 ## predict on 'link' or 'linear predictor' scale, with SEs
 pp - predict(glm.D93,se.fit=TRUE)
 etaframe -
 with(pp,cbind(fit,lower=fit-1.96*se.fit,upper=fit+1.96*se.fit))
 pframe - exp(etaframe)  ## inverse-link
 ## picture
 pframe - as.data.frame(cbind(obs=d.AD$counts,pframe))
 library(plotrix)
 plot(pframe$obs,ylim=c(5,30))
 with(pframe,plotCI(1:9,fit,li=lower,ui=upper,col=2,add=TRUE))
 
 If you're using a binomial model you need 'plogis' rather than 'exp'
 as your
 inverse link, and you may need to multiply by the binomial N as
 appropriate.
 
 On 10-09-29 06:07 AM, Sam wrote:
 I am looking to do the same but am a bit confused
 
 and apply the inverse link function for your model.
 
 i understand up to this point and i understand what this means,
 however i am unsure why it needs to be done and how you do it - i.e
 i use family=binomial is this wrong if i use this method to
 calculate 95% CI?
 
 Thanks
 
 Sam On 28 Sep 2010, at 14:50, Ben Bolker wrote:
 
 zozio32 remy.pascal at gmail.com writes:
 
 
 
 Hi
 
 I had to use a glm instead of my basic lm on some data due to
 unconstant variance.
 
 now, when I plot the model over the data, how can I easily get
 the 95% confidence interval that sormally coming from:
 
 yv - predict(modelVar,list(aveLength=xv),int=c)
 matlines(xv,yv,lty=c(1,2,2))
 
 There is no interval argument to pass to the predict function
 when using a glm, so I was wondering if I had to use an other
 function
 
 
 You need to use predict with se=TRUE; construct the confidence
 intervals by computing predicted values +- 1.96 times the standard
 error returned; and apply the inverse link function for your
 model.
 
 If heteroscedasticity is your main problem, and not a specific
 (known) non-normal distribution, you might consider using the gls
 function from the nlme package with an appropriate 'weights'
 argument.
 
 __ R-help@r-project.org
 mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do
 read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.
 
 

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] 95% confidence intercal with glm

2010-09-29 Thread Ben Bolker

  [I'm a little confused: are you Sam Smith or Chris Mcowen ... ?]

  This is admittedly a bit confusing, but the best scale on which to
compute standard errors is the link scale.
   It turns out (I hadn't realized this) that predict.glm does give
you not-crazy answers when you ask for
se.fit=TRUE on the response scale, in which case you can (a)  just
take +/- 1.96 SE around the response-scale fit and be done.
However, these are less accurate than doing what I suggested ((b)
computing fit +/- 1.96 SE on the link scale and
inverse-link-transforming).  Among other things, method a produces
confidence intervals that are exactly symmetric (generally
not true) and that are not necessarily bounded within the appropriate
range -- for example, the lower 95% CI bound for females
in the example below is slightly  0 ...

## extended example from ?predict.glm
require(graphics)

## example from Venables and Ripley (2002, pp. 190-2.)
ldose - rep(0:5, 2)
numdead - c(1, 4, 9, 13, 18, 20, 0, 2, 6, 10, 12, 16)
sex - factor(rep(c(M, F), c(6, 6)))
SF - cbind(numdead, numalive=20-numdead)
budworm.lg - glm(SF ~ sex*ldose, family=binomial)
summary(budworm.lg)

ci.scale - 1.96
plot(c(1,32), c(0,1), type = n, xlab = dose,
 ylab = prob, log = x, las=1, bty=l)
text(2^ldose, numdead/20, as.character(sex), col=c(red,black)[sex])
ld - seq(0, 5, 0.1)
pM - predict(budworm.lg,
  data.frame(ldose=ld,
 sex=factor(rep(M, length(ld)),
   levels=levels(sex))),
  type = response,se.fit=TRUE)
lines(2^ld, pM$fit )
matlines(2^ld,pM$fit+outer(pM$se.fit,ci.scale*c(-1,1)),lty=2,col=1)
pF - predict(budworm.lg,
  data.frame(ldose=ld,
 sex=factor(rep(F, length(ld)),
   levels=levels(sex))),
  type = response,se.fit=TRUE)
lines(2^ld, pF$fit,col=2)
matlines(2^ld,pF$fit+outer(pF$se.fit,ci.scale*c(-1,1)),lty=2,col=2)

pM_ex - predict(budworm.lg,
  data.frame(ldose=ld,
 sex=factor(rep(M, length(ld)),
   levels=levels(sex))),
  type = link,se.fit=TRUE)
with(pM_ex,

matlines(2^ld,plogis(fit+outer(se.fit,ci.scale*c(-1,1))),lty=3,col=1))

pF_ex - predict(budworm.lg,
  data.frame(ldose=ld,
 sex=factor(rep(F, length(ld)),
   levels=levels(sex))),
  type = link,se.fit=TRUE)
with(pF_ex,

matlines(2^ld,plogis(fit+outer(se.fit,ci.scale*c(-1,1))),lty=3,col=2))

legend(bottomright,
   c(prediction,
 +/- 2 SE (response scale),
 +/- 2 SE (link scale)),
   lty=1:3)
  



On 10-09-29 10:38 AM, Chris Mcowen wrote:
 Right, that makes sense, thanks

 The reason i used type= response was i wanted to convert the predicted
probabilities to the response scale, as surely this is the scale at
which a 95CI value is most useful for?

 I.e

 pp - predict(model1,se.fit=TRUE, type = response)


 1  0.68

 Probability of point 1 having a response of 1 = 0.68 - # this is above
the cutoff therefore this has a response of 1

 Then it would be of most use to get the 95CI on this scale to see if
the probability straddles the cutoff value?

 Maybe i am missing something?

 Thanks

 
 On 29 Sep 2010, at 15:27, Ben Bolker wrote:

 On 10-09-29 10:04 AM, Sam wrote:
 Hi Ben and list,

 Sorry to be a pain! I have followed your code, and modified it -


  **You should not use type=response here.**
  The point is that the (symmetric) confidence intervals are computed on
 the link/linear predictor
 scale, and then inverse-link-transformed (in this case, along with the
 fitted values) -- they then
 become asymmetric.

 pp - predict(model1,se.fit=TRUE, type = response)
 etaframe -
 + with(pp,cbind(fit,lower=fit-1.96*se.fit,upper=fit+1.96*se.fit))
 pframe - plogis(etaframe)
 pframe

 My response variable is 0 or 1, the probabilities are all high above my
 cut-off point of 0.445, i think this may have something to do with

 you may need to multiply by the binomial N as
 appropriate.

 However how do i multiply if my response is 0 or 1?


  if 'size=1' (i.e. Bernoulli trials) then you would be multiplying by
 1, so don't bother.


 On 29 Sep 2010, at 13:44, Ben Bolker wrote:


 ## from ?glm
 counts - c(18,17,15,20,10,20,25,13,12)
 outcome - gl(3,1,9)
 treatment - gl(3,3)
 d.AD - data.frame(treatment, outcome, counts)
 glm.D93 - glm(counts ~ outcome + treatment, family=poisson,
 data=d.AD)

 ## predict on 'link' or 'linear predictor' scale, with SEs
 pp - predict(glm.D93,se.fit=TRUE)
 etaframe -
 with(pp,cbind(fit,lower=fit-1.96*se.fit,upper=fit+1.96*se.fit))
 pframe - exp(etaframe)  ## inverse-link
 ## picture
 pframe - as.data.frame(cbind(obs=d.AD$counts,pframe))
 library(plotrix)
 plot(pframe$obs,ylim=c(5,30))
 with(pframe,plotCI(1:9,fit,li=lower,ui=upper,col=2,add=TRUE))

 If you're using a binomial model you need 'plogis' rather than 'exp'
 as your
 inverse link, and

Re: [R] 95% confidence intercal with glm

2010-09-29 Thread Sam

Thats great thanks very much for your help



On 29 Sep 2010, at 17:30, Ben Bolker wrote:

 [I'm a little confused: are you Sam Smith or Chris Mcowen ... ?]

 This is admittedly a bit confusing, but the best scale on which to
compute standard errors is the link scale.
  It turns out (I hadn't realized this) that predict.glm does give
you not-crazy answers when you ask for
se.fit=TRUE on the response scale, in which case you can (a)  just
take +/- 1.96 SE around the response-scale fit and be done.
However, these are less accurate than doing what I suggested ((b)
computing fit +/- 1.96 SE on the link scale and
inverse-link-transforming).  Among other things, method a produces
confidence intervals that are exactly symmetric (generally
not true) and that are not necessarily bounded within the appropriate
range -- for example, the lower 95% CI bound for females
in the example below is slightly  0 ...

## extended example from ?predict.glm
require(graphics)

## example from Venables and Ripley (2002, pp. 190-2.)
ldose - rep(0:5, 2)
numdead - c(1, 4, 9, 13, 18, 20, 0, 2, 6, 10, 12, 16)
sex - factor(rep(c(M, F), c(6, 6)))
SF - cbind(numdead, numalive=20-numdead)
budworm.lg - glm(SF ~ sex*ldose, family=binomial)
summary(budworm.lg)

ci.scale - 1.96
plot(c(1,32), c(0,1), type = n, xlab = dose,
ylab = prob, log = x, las=1, bty=l)
text(2^ldose, numdead/20, as.character(sex), col=c(red,black)[sex])
ld - seq(0, 5, 0.1)
pM - predict(budworm.lg,
 data.frame(ldose=ld,
sex=factor(rep(M, length(ld)),
  levels=levels(sex))),
 type = response,se.fit=TRUE)
lines(2^ld, pM$fit )
matlines(2^ld,pM$fit+outer(pM$se.fit,ci.scale*c(-1,1)),lty=2,col=1)
pF - predict(budworm.lg,
 data.frame(ldose=ld,
sex=factor(rep(F, length(ld)),
  levels=levels(sex))),
 type = response,se.fit=TRUE)
lines(2^ld, pF$fit,col=2)
matlines(2^ld,pF$fit+outer(pF$se.fit,ci.scale*c(-1,1)),lty=2,col=2)

pM_ex - predict(budworm.lg,
 data.frame(ldose=ld,
sex=factor(rep(M, length(ld)),
  levels=levels(sex))),
 type = link,se.fit=TRUE)
with(pM_ex,

matlines(2^ld,plogis(fit+outer(se.fit,ci.scale*c(-1,1))),lty=3,col=1))

pF_ex - predict(budworm.lg,
 data.frame(ldose=ld,
sex=factor(rep(F, length(ld)),
  levels=levels(sex))),
 type = link,se.fit=TRUE)
with(pF_ex,

matlines(2^ld,plogis(fit+outer(se.fit,ci.scale*c(-1,1))),lty=3,col=2))

legend(bottomright,
  c(prediction,
+/- 2 SE (response scale),
+/- 2 SE (link scale)),
  lty=1:3)




On 10-09-29 10:38 AM, Chris Mcowen wrote:
 Right, that makes sense, thanks
 
 The reason i used type= response was i wanted to convert the predicted
probabilities to the response scale, as surely this is the scale at
which a 95CI value is most useful for?
 
 I.e
 
 pp - predict(model1,se.fit=TRUE, type = response)
 
 
 1  0.68
 
 Probability of point 1 having a response of 1 = 0.68 - # this is above
the cutoff therefore this has a response of 1
 
 Then it would be of most use to get the 95CI on this scale to see if
the probability straddles the cutoff value?
 
 Maybe i am missing something?
 
 Thanks
 
 
 On 29 Sep 2010, at 15:27, Ben Bolker wrote:
 
 On 10-09-29 10:04 AM, Sam wrote:
 Hi Ben and list,
 
 Sorry to be a pain! I have followed your code, and modified it -
 
 
 **You should not use type=response here.**
 The point is that the (symmetric) confidence intervals are computed on
 the link/linear predictor
 scale, and then inverse-link-transformed (in this case, along with the
 fitted values) -- they then
 become asymmetric.
 
 pp - predict(model1,se.fit=TRUE, type = response)
 etaframe -
 + with(pp,cbind(fit,lower=fit-1.96*se.fit,upper=fit+1.96*se.fit))
 pframe - plogis(etaframe)
 pframe
 
 My response variable is 0 or 1, the probabilities are all high above my
 cut-off point of 0.445, i think this may have something to do with
 
 you may need to multiply by the binomial N as
 appropriate.
 
 However how do i multiply if my response is 0 or 1?
 
 
 if 'size=1' (i.e. Bernoulli trials) then you would be multiplying by
 1, so don't bother.
 
 
 On 29 Sep 2010, at 13:44, Ben Bolker wrote:
 
 
 ## from ?glm
 counts - c(18,17,15,20,10,20,25,13,12)
 outcome - gl(3,1,9)
 treatment - gl(3,3)
 d.AD - data.frame(treatment, outcome, counts)
 glm.D93 - glm(counts ~ outcome + treatment, family=poisson,
data=d.AD)
 
 ## predict on 'link' or 'linear predictor' scale, with SEs
 pp - predict(glm.D93,se.fit=TRUE)
 etaframe -
 with(pp,cbind(fit,lower=fit-1.96*se.fit,upper=fit+1.96*se.fit))
 pframe - exp(etaframe)  ## inverse-link
 ## picture
 pframe - as.data.frame(cbind(obs=d.AD$counts,pframe))
 library(plotrix)
 plot(pframe$obs,ylim=c(5,30))
 with(pframe,plotCI(1:9,fit,li=lower,ui=upper,col=2,add=TRUE))
 
 If you're using a

[R] Simplify, several Ave or aggregates

2010-09-29 Thread skan


Hello.

How can I write this all in one line?

mydata is a zoo series, limit is a numeric vector of the same size

tmp - ave(coredata(mydata),as.Date(index(mydata)),FUN = function(x) (
(cummax(x)-x )) )
tmp - (tmp  limit)
final - ave(coredata(tmp),as.Date(index(mydata)),FUN = function(x) cumprod(
x) )

I've tried to use two vectors as argument to ave(...) but it seems to accept
just one even if I join them into a maxtrix.

This is just an example, but any other function could be use.

Here I need to compare the value of cummax(mydata)-mydata with a numeric
vector and once it surpasses it I'll keep zeros till the end of the day. The
cummax is calculated from the beginning of each day.

If limit were a single number instead of a vector (with different possible
numbers) I could write it:

ave(coredata(mydata),as.Date(index(mydata)),FUN = function(x) cumprod(
(cummax(x)-x )  limit) )

But I can't introduce there a vector longer than x (with the length of each
day) and I don't know how to introduce it as another argument in ave()

cheers
-- 
View this message in context: 
http://r.789695.n4.nabble.com/Simplify-several-Ave-or-aggregates-tp2719363p2719363.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] R Spracheinstellung

2010-09-29 Thread peter dalgaard


On Sep 29, 2010, at 18:21 , Berwin A Turlach wrote:

 G'day Tobias,
 
 On Wed, 29 Sep 2010 14:01:10 +
 Keller Tobias (kelleto0) kelle...@students.zhaw.ch wrote:
 
 Für unseren Statistik Unterricht brauchen wir das R-Programm.
 Ich habe das Programm heruntergeladen, deutsch gewählt und
 installiert. Das Problem ist nach 3mal neu installieren, dass das
 Programm immer auf englisch kommt.
 
 Können Sie mir helfen?
 
 R for Windows FAQ 3.2:
 
 http://cran.ms.unimelb.edu.au/bin/windows/base/rw-FAQ.html#I-want-R-in-English_0021
 

For different values of en, of course (presumably de). And assuming Windows 
OS.

-p

 HTH.
 
 Cheers,
 
   Berwin
 
 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.

-- 
Peter Dalgaard
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Email: pd@cbs.dk  Priv: pda...@gmail.com

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Validation / Training - test data

2010-09-29 Thread Frank Harrell


It all depends on the ultimate use of the results.

Frank

-
Frank Harrell
Department of Biostatistics, Vanderbilt University
-- 
View this message in context: 
http://r.789695.n4.nabble.com/Validation-Training-test-data-tp2718523p2719370.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] Ranked Set Sampling

2010-09-29 Thread Ahmed Albatineh

Dear All;

I have searched to see if any code in R was written to calculate mean and
variance of a Ranked Set Sample, but did not find any. Is there any package
for RSS, or kindly can somebody share a code he/she wrote, I am very
grateful and willing to acknowledge that in my work. Thanks much

Ahmed

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] move colorkey

2010-09-29 Thread Peter Ehlers


On 2010-09-28 18:39, Marlin Keith Cox wrote:

When using a wireframe, I need to move the colorkey from the right
position (default0 towards the plot.  I have also needed to adjust the
height and used the code

colorkey=list(T,space='right',height=.5)

I have looked at documents (within levelplot) but cannot find a way to
move the colorkey other than right, left, bottom and top.  I do not
understand corner interacts with x, y; unimplemented.  Is this a way
to place a colorkey.

keith



Try adding a layout.widths setting to your wireframe call:

  par.settings = list(layout.widths = list(key.right = 0.8))


You might then have to adjust the space on the right with

  par.settings = list(layout.widths = list(key.right = 0.8,
   right.padding = 2))


  -Peter Ehlers

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] (OT) Change of email address

2010-09-29 Thread Kevin E. Thorpe


Ted is well aware of how to change his list email.  He was advising
people on the list who who have his old email address in their address
books to remove it.

On 09/29/2010 12:16 PM, Vojtěch Zeisek wrote:

Hello,
go to https://stat.ethz.ch/mailman/options/r-help/y...@email and if You do not
remember Your password, use Password reminder (down) to mail it to Your
address. Then login and do all needed changes. You can replace r-hep in link
above with r-sig-phylo, r-announce and so on. And of course replace y...@email
above with probable ted.hard...@wlandres.net.
Have a nice day!
Vojtěch

Dne St 29. září 2010 10:58:22 ted.hard...@wlandres.net napsal(a):

Apologies for bothering anyone who may not be interested in
this, but some of you will, for instance, have my current
email address in your address books, etc.

As  result of a new policy by Manchester University, retired
former staff who no longer contribute actively to research
in their former departments are to have their email accounts
closed.

The address from which I have been mailing to R-help (etc.):
   ted.hard...@manchester.ac.uk
will therefore soon be closed down.

I have set up (and subscribed to R-help, R-devel, R-announce)
a new email address:

   ted.hard...@wlandres.net

If you have kept a record of my current (manchester.ac.uk)
address, please modify this to the new address.

Messages sent to the current address will continue to arrive,
for another week or two.

With thanks, and best wishes to all,
Ted.




--
Kevin E. Thorpe
Biostatistician/Trialist, Knowledge Translation Program
Assistant Professor, Dalla Lana School of Public Health
University of Toronto
email: kevin.tho...@utoronto.ca  Tel: 416.864.5776  Fax: 416.864.3016

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] drawing samples based on a matching variable

2010-09-29 Thread Charles C. Berry


On Tue, 28 Sep 2010, L Brown wrote:


Hi, everyone. I have what I hope will be a simple coding question. It seems
this is a common job, but so far I've had trouble finding the answer in
searches.

I have two matrices (x and y) with a different number of observations in
each. I need to draw a random sample without replacement of observations
from x, and then, using a matching variable, draw a sample of equal size
from y. It is the matching variable that is hanging me up.

For example--


# example matrices. lets assume seed always equals 1. (lets also assume I

have assigned variable names A and B to my columns..)

set.seed(1)
x-cbind(1:10,sample(1:5,10,rep=T))
x

 [A] [B]
[1,]12
[2,]22
[3,]33
[4,]45
[5,]52
[6,]65
[7,]75
[8,]84
[9,]94
[10,]   101



Looks like set.seed(1) was invoked here, too.


y-cbind(1:14,sample(1:5,14,rep=T))
y

 [A] [B]
[1,]12
[2,]22
[3,]33
[4,]45
[5,]52
[6,]65
[7,]75
[8,]84
[9,]94
[10,]   101
[11,]   112
[12,]   121
[13,]   134
[14,]   142


#draw random sample of n=4 without replacement from matrix x.
x.samp-x[sample(10,4,replace=F),]
x.samp

[A] [B]
[1,]33
[2,]45
[3,]52
[4,]75

Next, I would need to draw four observations from matrix y (without
replacement) so that the distribution of y$B is identical to x.samp$B.

I'd appreciate any help, and sorry to post such a basic question!



Break it down like this:


x.levels - sort( unique(x[,2]) )
x.samp.tab - table( factor( x.samp[,2], x.levels ) )
y.rows - split(1:nrow(y),factor( y[,2], x.levels ) )
unlist( mapply( sample, y.rows, x.samp.tab ),use.names=FALSE )


In some cases sample might fail if

length( y.rows[[i]] )  x.samp.tab[ i ]

you can trace which element that was using 'traceback()' or write a 
wrapper for sample() that checks that condition.


HTH,

Chuck



LB

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.



Charles C. Berry(858) 534-2098
Dept of Family/Preventive Medicine
E mailto:cbe...@tajo.ucsd.edu   UC San Diego
http://famprevmed.ucsd.edu/faculty/cberry/  La Jolla, San Diego 92093-0901

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] eha aftreg overall p-value

2010-09-29 Thread T. Smithson

Dear useRs,

I am currently fitting an advanced failure time model using Göran
Broström's excellent eha library with the aftreg command.

My question: How do I interpret the Overall p-value, that is
reported at the very bottom of the output? I already figured out it
must be a chi-square test, but I am wondering what a p-value  0.01
means:

Does it mean my model fits the actual data correctly OR does it mean
my model DOES NOT fit the data correctly? In other words: Is it
better to have a small or a large overall p-value?

EXAMPLE OUTPUT:
Events230
Total time at risk  3340
Max. log. likelihood  -380.34
LR test statistic 16.7
Degrees of freedom1
Overall p-value   4.41898e-05

Any advice is highly appreciated!!

Thanks
Thomas

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] subtraction based on two groups in a dataframe

2010-09-29 Thread 1Rnwb


Thanks for the help.
Sharad

On Mon, Sep 27, 2010 at 9:12 PM, Remko Duursma [via R] 
ml-node+2716469-935075351-6...@n4.nabble.comml-node%2b2716469-935075351-6...@n4.nabble.com
 wrote:

 Try something like this:



 dfr - read.table(textConnection(plate.id well.id   Group  HYB
  rlt1
 1 P1  A1 Control SKOV3hyb 0.190
 2 P1  A2 Control SKOV3hyb 0.210
 3 P1  A3 Control SKOV3hyb 0.205
 4 P1  A4 Control SKOV3hyb 0.206
 5 P1  A5 Control SKOV3hyb 0.184
 385   P1  A1ovca SKOV3hyb 0.184
 386   P1  A2ovca SKOV3hyb 0.229
 387   P1  A3ovca SKOV3hyb 0.214
 388   P1  A4ovca SKOV3hyb 0.226
 389   P1  A5ovca SKOV3hyb 0.217 ))


 difs - lapply(split(dfr,dfr$plate.id), function(x)x$rlt1[x$Group ==
 Control] - x$rlt1[x$Group == ovca])
 dfr$Diff - Reduce(c,difs)



 greetings,
 Remko

 --
  View message @
 http://r.789695.n4.nabble.com/subtraction-based-on-two-groups-in-a-dataframe-tp2716104p2716469.html
 To unsubscribe from subtraction based on two groups in a dataframe, click
 herehttp://r.789695.n4.nabble.com/template/TplServlet.jtp?tpl=unsubscribe_by_codenode=2716104code=c2JwdXJvaGl0QGdtYWlsLmNvbXwyNzE2MTA0fDU4ODg0MTYwOQ==.




-- 
View this message in context: 
http://r.789695.n4.nabble.com/subtraction-based-on-two-groups-in-a-dataframe-tp2716104p2719364.html
Sent from the R help mailing list archive at Nabble.com.

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Table with different digit number

2010-09-29 Thread Nicola Sturaro Sommacal (Quantide srl)

Thank you very much for your solution but it works only in a dataframe
object. If I am using an ftable object, it doesn't run.

I use, as a workaround, to fill with blank spaces the left of each number,
so when I print the table, it appears aligned to right.

But, obviously, this doesn't work for the HTML table, because HTML ignores
multiple white spaces .

Any suggestion?

Thanks in advance,
Nicola



2010/9/28 Henrique Dallazuanna www...@gmail.com

 Try this:

  df[1,] - as.character(df[1,])

 On Tue, Sep 28, 2010 at 8:48 AM, Nicola Sturaro Sommacal (Quantide srl) 
 mailingl...@sturaro.net wrote:

 Hi!

 I have a table representing both absolute and relative frequency, for
 example (code to get example data under the signature):

   ItalyGermany
 absolute100 105
 relative 40.51 41.18

 How can I print a different number of decimal digits? I try to transform
 to
 as.character, but cells result aligned to left and I don't like this
 solution. At the end of my work I need to export the table to HTML, so
 this
 can be do also with xtable package.

 Thanks in advance for your help.

 Nicola Sturaro Sommacal

 --
 Quantide srl
 http://www.quantide.com

 

 This is the code to get the data.frame with the data above:

 df = data.frame(italy = c(100,40.51), germany = c(105, 41.18))
 row.names(df) = c(absolute, relative)

[[alternative HTML version deleted]]

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.




 --
 Henrique Dallazuanna
 Curitiba-Paraná-Brasil
 25° 25' 40 S 49° 16' 22 O


[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] executing loop

2010-09-29 Thread cassie jones

Dear All,


I am trying to define a loop for a m*n matrix, where i=1:n and j=1:m. Unlike
the usual for loop, i should go in the following way:

For j=1,
i=1,2,3,n
For j=2,
i=n,n-1,n-2,..,1
For j=3,
i=1,2,3,.n etc.
which means i should go in either increasing or decreasing order
alternatively.
Can anyone please help me in doing this?


Thanks,
Cassie

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] Converting a line by line program into an array to perform summary stats

2010-09-29 Thread George Coyle

I am trying to turn several lines of information into a variable.  I used
the filx function to input my file then the readlines to qualify what I
want.  Essentially I have data in a file every 10 minutes through a day for
several years down a column:

date time value
9/28/10   02:00  13
9/28/10   02:10  15
9/28/10   02:20  12
9/28/10   02:30  11
etc.

I then use if statements to segment the day into various portions (2:00 to
2:30 for example) and find the average or maximum value for the time.  Next
I compare this value to another time of the day to see if the price was
reached/broken.  For example the 2:00 to 2:30 low from above is 11 at 2:30.
I would then check the 8:00 to 13:00 time period to see if the 2:00 to 2:30
low occured from 8:00 to 13:00.  If it did occur I would call this day a
qualifier and check the time 2:30 price occured in the 8:00 to 13:00 period
until the price at 13:00.  So another example if 2:30 was 11, 8:00 was 13,
8:45 was 11, and 13:00 was 20 I would say this day qualified and the price
from the am was first acheived at 8:45 in the 8:00 to 13:00 period and the
change from that price to the 13:00 period was 20-11=+9.

This example was all background to my problem.  Using the filx/readlines
methods I wind up with output for the days which qualify, but when I try to
create an array variable using the total of each line for a given column
(aka summarizing every day that my given criteria were met), something about
having used the line by line method doesn't allow me to create an array
variable.  It only gives me the last value.  The variables I declared to
fidn the qualifying days are specific to each day vs across all days that
qualified.  Is it possible to convert this line by line data into an array
wih some R code I don't know?

I considered converting the data example above into a matrix, but then I run
into tricky problems of shortened days and days missing data (the code I
used for the readlines method enables me to specify the new days without
this problem...I don't have to fill in blanks it just changes days when it
hits a new day).  Given how specific this problem is I can't find much on
the R help logs.

Perhaps outputting the line by line output to a file, then reinputting it
using a read.csv method would enable me to achieve my desired result but
then I have the laborious task of dumping into excel, converting file types,
saving as csv, reuploading to R, writing a separate program, etc.

Is there an easy fix to my problem?

Thanks

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] executing loop

2010-09-29 Thread Peter Langfelder

for (j in 1:n)
{
   if (j%%2==0)
   {
 iRange = c(n:1)
} else
  iRange = c(1:n)
for (i in iRange)
{
   your code
}
}

Peter


On Wed, Sep 29, 2010 at 10:40 AM, cassie jones cassiejone...@gmail.com wrote:
 Dear All,


 I am trying to define a loop for a m*n matrix, where i=1:n and j=1:m. Unlike
 the usual for loop, i should go in the following way:

 For j=1,
 i=1,2,3,n
 For j=2,
 i=n,n-1,n-2,..,1
 For j=3,
 i=1,2,3,.n etc.
 which means i should go in either increasing or decreasing order
 alternatively.


for (j in 1:n)
{
   if (j%%2==0)
   {
 iRange = c(n:1)
} else
  iRange = c(1:n)
for (i in iRange)
{
   your code
}
}

Peter

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Use R in Visual Basic Environment

2010-09-29 Thread Liviu Andronic

Hello

On Tue, Sep 28, 2010 at 1:39 PM, Soumen Pal soumen.4...@gmail.com wrote:
 I need your kind help regarding the following:

 I wish to know is there any way to use R in Visual Basic environment. I want
 to develop a VB application where R can be embedded (R will work as a back
 end statistical engine). If available, please provide me some source of
 study materials/articles available on the internet related to this.

Have you tried searching 'r vba' on Google [1]? At least the second
and third links seem relevant. I think you'll want to search for 'vba
rexcel' on either Google or Rseek. Regards
Liviu

[1] 
http://www.google.com/search?hl=ensource=hpq=r+vbabtnG=Google+Searchaq=faqi=aql=oq=gs_rfai=

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] Exponential integral

2010-09-29 Thread Mishra, Srikanta

Is there an R function for evaluating the exponential integral

Ei(x) = INT(-inf,x) exp(t)/t dt


[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] resampling issue

2010-09-29 Thread Michael Larkin

I am trying to get R to resample my dataset of two columns of age and length
data for fish.  I got it to work, but it is not resampling every replicate.
Instead, it resamples my data once and then repeated it 5 times.  

Here is my dataset of 9 fish samples with an age and length for each one:   
Age Length
2   200
5   450
6   600
7   702
8   798
5   453
4   399
1   120
2   202


Here is my code which resamples my data to produce up to 9 different samples
and creates a new dataset of 12 samples: 

testdat-growth[sample(9,12,replace=T),]


Now I want R to repeat this procedure 5 times. Here is my code: 


testdat2 - replicate(5, sample(testdat), simplify=F)
testdat2

Here is my output showing that it did it once and then just repeated the
values: 


 testdat2
[[1]]
Age Length
1 2200
9 2202
8 1120
5 8798
4 7702
6 5453
1.1   2200
4.1   7702
4.2   7702
5.1   8798
4.3   7702
6.1   5453

[[2]]
Age Length
1 2200
9 2202
8 1120
5 8798
4 7702
6 5453
1.1   2200
4.1   7702
4.2   7702
5.1   8798
4.3   7702
6.1   5453

[[3]]
Age Length
1 2200
9 2202
8 1120
5 8798
4 7702
6 5453
1.1   2200
4.1   7702
4.2   7702
5.1   8798
4.3   7702
6.1   5453

[[4]]
Length Age
1  200   2
9  202   2
8  120   1
5  798   8
4  702   7
6  453   5
1.1200   2
4.1702   7
4.2702   7
5.1798   8
4.3702   7
6.1453   5

[[5]]
Length Age
1  200   2
9  202   2
8  120   1
5  798   8
4  702   7
6  453   5
1.1200   2
4.1702   7
4.2702   7
5.1798   8
4.3702   7
6.1453   5


Any advice on how to get R to resample each time would be greatly
appreciated?  

Mike

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Exponential integral

2010-09-29 Thread Ben Bolker

Mishra, Srikanta MISHRAS at battelle.org writes:

 
 Is there an R function for evaluating the exponential integral
 
 Ei(x) = INT(-inf,x) exp(t)/t dt


  Try the gsl package.
  Also library(sos); findFn({exponential integral})
although admittedly it doesn't find the Expint page
in the gsl package ...

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] resampling issue

2010-09-29 Thread Henrique Dallazuanna

I think that you want:

replicate(5, growth[sample(9,12,replace=T),], simplify = FALSE)


On Wed, Sep 29, 2010 at 3:19 PM, Michael Larkin mlar...@rsmas.miami.eduwrote:

 I am trying to get R to resample my dataset of two columns of age and
 length
 data for fish.  I got it to work, but it is not resampling every replicate.
 Instead, it resamples my data once and then repeated it 5 times.

 Here is my dataset of 9 fish samples with an age and length for each one:
 Age Length
 2   200
 5   450
 6   600
 7   702
 8   798
 5   453
 4   399
 1   120
 2   202


 Here is my code which resamples my data to produce up to 9 different
 samples
 and creates a new dataset of 12 samples:

 testdat-growth[sample(9,12,replace=T),]


 Now I want R to repeat this procedure 5 times. Here is my code:


 testdat2 - replicate(5, sample(testdat), simplify=F)
 testdat2

 Here is my output showing that it did it once and then just repeated the
 values:


  testdat2
 [[1]]
Age Length
 1 2200
 9 2202
 8 1120
 5 8798
 4 7702
 6 5453
 1.1   2200
 4.1   7702
 4.2   7702
 5.1   8798
 4.3   7702
 6.1   5453

 [[2]]
Age Length
 1 2200
 9 2202
 8 1120
 5 8798
 4 7702
 6 5453
 1.1   2200
 4.1   7702
 4.2   7702
 5.1   8798
 4.3   7702
 6.1   5453

 [[3]]
Age Length
 1 2200
 9 2202
 8 1120
 5 8798
 4 7702
 6 5453
 1.1   2200
 4.1   7702
 4.2   7702
 5.1   8798
 4.3   7702
 6.1   5453

 [[4]]
Length Age
 1  200   2
 9  202   2
 8  120   1
 5  798   8
 4  702   7
 6  453   5
 1.1200   2
 4.1702   7
 4.2702   7
 5.1798   8
 4.3702   7
 6.1453   5

 [[5]]
Length Age
 1  200   2
 9  202   2
 8  120   1
 5  798   8
 4  702   7
 6  453   5
 1.1200   2
 4.1702   7
 4.2702   7
 5.1798   8
 4.3702   7
 6.1453   5


 Any advice on how to get R to resample each time would be greatly
 appreciated?

 Mike

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.




-- 
Henrique Dallazuanna
Curitiba-Paraná-Brasil
25° 25' 40 S 49° 16' 22 O

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] spplot cuts

2010-09-29 Thread bbolker



  Hmmm.  Maybe a documentation typo in ?spplot.
  If you follow the documentation through to ?levelplot, you find
that 

cuts: number of levels the range of ‘z’ would be divided into

(no mention of actual breakpoints) but:

 at: numeric vector giving breakpoints along the range of ‘z’.
  Contours (if any) will be drawn at these heights, and the
  regions in between would be colored using ‘col.regions’.  In
  the latter case, values outside the range of ‘at’ will not be
  drawn at all.  This serves as a way to limit the range of the
  data shown, similar to what a ‘zlim’ argument might have been
  used for.  However, this also means that when supplying ‘at’
  explicitly, one has to be careful to include values outside
  the range of ‘z’ to ensure that all the data are shown.

 So you might try 'at' instead of 'cuts'.



-- 
View this message in context: 
http://r.789695.n4.nabble.com/spplot-cuts-tp2716237p2719533.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Converting a line by line program into an array to perform summary stats

2010-09-29 Thread Dieter Menne



George Coyle wrote:
 
 I am trying to turn several lines of information into a variable.  I used
 the filx function to input my file then the readlines to qualify what I
 want.  Essentially I have data in a file every 10 minutes through a day
 for
 several years down a column:
 
 date time value
 9/28/10   02:00  13
 9/28/10   02:10  15
 9/28/10   02:20  12
 9/28/10   02:30  11
 etc.
 
 I then use if statements to segment the day into various portions (2:00 to
 2:30 for example) and find the average or maximum value for the time. 
 Next
 .
 

George,

it is very difficult to understand what you want, so let' try to chop it
down. Let's say you want to find the best way to do some time of processing
on what generally called a time series.

First, reading this into a matrix will not work, because you have three
different column types. The standard workhorse for this type of data is a
dataframe, and read.csv gives a data frame.

Next, note that your structure is even simpler: you have a time series,
probably not an regular one, because I assume your stock quotes are not
available all the time with equal intervals, but will stop at night.

So the general idea would be: 
 read in a data frame with read.csv
 convert to a time series. See package zoo for many examples
 do the processing (which I do not understand) on the time series

Dieter


-- 
View this message in context: 
http://r.789695.n4.nabble.com/Converting-a-line-by-line-program-into-an-array-to-perform-summary-stats-tp2719489p2719542.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] String split and concatenation

2010-09-29 Thread Greg Snow

 paste( '(', paste( ', rep(letters[1:3],2), ', sep=, collapse=','), ')', 
 sep= ) 
[1] ('a','b','c','a','b','c')

If you need the space after the comma then just change ',' to ', '.

The outer paste can be replaced with sprintf (and that may be more readable).

-- 
Gregory (Greg) L. Snow Ph.D.
Statistical Data Center
Intermountain Healthcare
greg.s...@imail.org
801.408.8111


 -Original Message-
 From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-
 project.org] On Behalf Of Steven Kang
 Sent: Wednesday, September 29, 2010 2:16 AM
 To: bill.venab...@csiro.au
 Cc: r-help@r-project.org
 Subject: Re: [R] String split and concatenation
 
 x - rep(letters[1:3], 2)
 
 Are there any ways to transform  assign the above as the one shown
 below
 to an object? (in exact format; i.e length of 1  class of character),
 i.e
 x
 ('a', 'b', 'c', 'a', 'b', 'c')
 
 Highly appreciate for any advice.
 
 
 
 On Wed, Sep 29, 2010 at 3:33 PM, bill.venab...@csiro.au wrote:
 
  dump(x, file = x.R)
  file.show(x.R)
 
  will get you most of the way.
 
  -Original Message-
  From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-
 project.org]
  On Behalf Of Steven Kang
  Sent: Wednesday, 29 September 2010 3:11 PM
  To: r-help@r-project.org
  Subject: [R] String split and concatenation
 
  Hi R users,
 
 
  I desire to transform the following vector consisting of repeated
  characters
 
  x - rep(letters, 3)
  into this exact format (i.e a single string containing each
 characters in
  quotation mark separated by comma between each; al ).
 
  (a, b, c, d, a, b, c, d, ..., a,
 b,
  c,
  d, .z)
 
  Any advice would be much appreciated.
 
 
 
  --
  Steven
 
  [[alternative HTML version deleted]]
 
  __
  R-help@r-project.org mailing list
  https://stat.ethz.ch/mailman/listinfo/r-help
  PLEASE do read the posting guide
  http://www.R-project.org/posting-guide.htmlhttp://www.r-
 project.org/posting-guide.html
  and provide commented, minimal, self-contained, reproducible code.
 
 
 
 
 --
 Steven
 
   [[alternative HTML version deleted]]
 
 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-
 guide.html
 and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] New package GAD (General ANOVA Design)

2010-09-29 Thread Maurício Camargo

Hi, everyone.

GAD package analyses complex ANOVA models with any combination of
orthogonal/nested and fixed/random factors, as described by Underwood
(1997). There are two restrictions: (i) data must be balanced; (ii)
fixed nested factors are not allowed. Homogeneity of variances is
checked using Cochran's C test and a posteriori comparisons of means
are done using Student-Newman-Keuls (SNK) procedure.

Best,

Maurício G. Camargo

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] matrix plot

2010-09-29 Thread hairryharry


Thanks,

Thats great just what I was trying to do.

HH


Thomas Stewart wrote:

HH-

I'm not familiar with the plots you mention, but the following is a 
quick attempt to create the plot you describe.


data-data.frame(
  org=1:10,
  q1=sample(1:10,replace=T),
  q2=sample(1:10,replace=T),
  q3=sample(1:10,replace=T))

#  This generates a random data set like the one you describe.  It 
looks like this:


#   org q1 q2 q3
#11  9  1  4
#22  1  2  1
#33  1  3  7
#44 10  9  6
# ...

n-ncol(data)-1 # NUMBER OF ORGANIZATIONS

new.mat-apply(data[,-1],2,function(x){xtabs(~factor(x,levels=1:10))})

plot(rep(1:n,each=10),rep(1:10,n),
  pch=16,
  cex=c(new.mat),
  xaxt=n,
  xlab=Question,
  ylab=Score)
axis(1,at=1:n)

Instead of cex=c(new.mat), you can supply a function of new.mat, say 
cex=f(new.mat) to specify the size of the dots.


Hope that helps.
-tgs




On Wed, Sep 29, 2010 at 6:55 AM, hairryharry hairryha...@tesco.net 
mailto:hairryha...@tesco.net wrote:


Hi,

Fairly new to R - have done basic plots but now faced with
plotting a matrix/table of results -I know what I want but cannot
find out how to do it.

Basically have individual questions ( x) to which an organization
can rate themselves 1-10 (y) what I want to show is a
matrix/density type plot (like the matrix corrolation plots I have
seen on R graph site) showing for eac question (x) a circle or
shape of varying size and/or colour to represent the number of
organisations rating themselves for each value of y.

Hope this makes sense, any advice would be gratefully received.

HH

__
R-help@r-project.org mailto:R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.




__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] Trying to avoid loop structure

2010-09-29 Thread Mathieu Beaulieu


Dear R-helpers,
I'm trying to associate linear coefficients (intercept and slope) to tens of 
thousands of observations based on a table with benchmark values.
#Example - Value table and their corresponding coefficients (intercept and 
slope)
coef = data.frame(cbind(st=c(1:5),b = runif(5,0.3,5),a = 
seq(0.5,5,1)))print(coef)
#Example of observations to be computedobs = runif(20,1,5)print(obs)

#This should be fairly simple - but I can't get the loop structure out of my 
head. I don't know how to tell the software if the observation is greater than 
st[i-1] but smaller than st[i], use a[i] and b[i] - otherwise go to the next 
line.So far, I've tried this:
for(t in 1:length(obs)) {
for(i in 1:length(coef$st)) if(coef$st[i-1]  obs[t]  coef$st[i]) x[t] 
= coef$b[i] + (coef.$a[i] * obs[t])
}#  
Doesn't work - if statement do not accept double conditions structure (y[-1]  
x  y) and I should put an else statement - but i'm stuck there. I have the 
feeling I should look into lapply but it doesn't seem like lapply works with 
if statements.
Any tips would be much appreciated,
Thanks,


Mathieu Beaulieu,
Soil, Water  Environment Laboratory
Institute for Resources, Environment and Sustainability
University of British-Columbia
429-2202 Main Mall,
Vancouver, BC,
V6T 1Z4
Canada





  
[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Trying to avoid loop structure

2010-09-29 Thread Phil Spector


Mathieu -
   First of all, you can combine as many conditions as 
you want in an if statement, using  (and) and || (or).

So to say
coef$st[i-1]  obs[t]  coef$st[i]
use
coef$st[i-1]  obs[t]  obs[t]  coef$st[i]

So following your logic, you could use

x = numeric(length(obs))
for(t in 1:length(obs)) {
  for(i in 1:length(coef$st))
 if(coef$st[i-1]  obs[t]  obs[t]  coef$st[i])
   x[t] = coef$b[i] + (coef$a[i] * obs[t])
}

Alternatively, you could solve the problem using R:

mapply(function(o,i)coef$b[i] + coef$a[i] * o, obs,
cut(obs,c(coef$st,6),labels=FALSE) + 1)

- Phil Spector
 Statistical Computing Facility
 Department of Statistics
 UC Berkeley
 spec...@stat.berkeley.edu





On Wed, 29 Sep 2010, Mathieu Beaulieu wrote:



Dear R-helpers,
I'm trying to associate linear coefficients (intercept and slope) to tens of 
thousands of observations based on a table with benchmark values.
#Example - Value table and their corresponding coefficients (intercept and 
slope)
coef = data.frame(cbind(st=c(1:5),b = runif(5,0.3,5),a = 
seq(0.5,5,1)))print(coef)
#Example of observations to be computedobs = runif(20,1,5)print(obs)

#This should be fairly simple - but I can't get the loop structure out of my head. I 
don't know how to tell the software if the observation is greater than st[i-1] but 
smaller than st[i], use a[i] and b[i] - otherwise go to the next line.So far, I've 
tried this:
for(t in 1:length(obs)) {
for(i in 1:length(coef$st)) if(coef$st[i-1]  obs[t]  coef$st[i]) x[t] 
= coef$b[i] + (coef.$a[i] * obs[t])
}#
Doesn't work - if statement do not accept double conditions structure (y[-1]  x  y) and 
I should put an else statement - but i'm stuck there. I have the feeling I should look into 
lapply but it doesn't seem like lapply works with if statements.
Any tips would be much appreciated,
Thanks,


Mathieu Beaulieu,
Soil, Water  Environment Laboratory
Institute for Resources, Environment and Sustainability
University of British-Columbia
429-2202 Main Mall,
Vancouver, BC,
V6T 1Z4
Canada






[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.



__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] A problem with plotting a long expression in ylab ?

2010-09-29 Thread Paul Murrell


Hi

On 29/09/2010 8:17 p.m., Tal Galili wrote:

My honor.

A short question: if there is something in the device that is sensitive
to the overlapping of the text, then is it possible to add a warning
massage output when the length of the text is longer then the device
dimensions?


The graphics engine does a lot of this sort of clipping (not just text), 
but all of it is silent.  You certainly wouldn't want to print all 
clipping warnings to screen and there is no current plan to return 
clipping information from the graphics engine.  The current expectation 
is that the interested useR/developeR will use the R-level tools, such 
as strwidth(), to check for overlap, etc.


Paul


With much respect,
Tal

Contact
Details:---
Contact me: tal.gal...@gmail.com mailto:tal.gal...@gmail.com |
972-52-7275845
Read me: www.talgalili.com http://www.talgalili.com (Hebrew) |
www.biostatistics.co.il http://www.biostatistics.co.il (Hebrew) |
www.r-statistics.com http://www.r-statistics.com (English)
--




On Wed, Sep 29, 2010 at 1:24 AM, Paul Murrell p.murr...@auckland.ac.nz
mailto:p.murr...@auckland.ac.nz wrote:

Hi

It is a bug.  A fix has been committed.

Thanks for the report!

Paul


On 29/09/2010 10:15 a.m., Ben Bolker wrote:

Barry Rowlingsonb.rowlingsonat lancaster.ac.uk
http://lancaster.ac.uk  writes:

My point is that in regular text, ylab plots it where it
then goes outside
the borders.
With the use of expressions - the text just doesn't show up.
Originally I thought it was because of my miss-use of
expressions, until I
figured it was the level of cex.lab I was using.
The problem is that when you can't see the text, you
don't have a sense of
how much to decrease the cex.lab so the text will fit.
I hope I was now clearer.


Gotcha. Seems to only affect ylab though. Do this:

   t =  expression(paste(test
loo(% of 360
*degree,
)))
   plot(1,xlab=t,ylab=t,main=t)

then if I shrink my graphics window I can make the ylab
disappear but
not the xlab or title.

  Seems to affect any rotated expressions:

plot(1)
text(1,1,t,srt=90)
text(1,1,t,srt=0)
text(1,1,t,srt=45)


  Now shrink window and watch the rotated expressions
vanish! They
disappear when they start (or finish) out of the entire graphics
device, not the plot region...

  I cant find anything relating to clipping in the help, and
I am on
Linux, so see if there's any news about it, try it with
R-patched or
R-devel and then report a bug after having read all the
other stuff
about R bug reporting!

Barry



I don't claim to understand it, but there is something quite
fundamental about the properties of the X11() graphics device in R
that makes labels that would otherwise overlap, disappear -- if
you do 'extreme resizing' with the graphics above, you can see that
otherwise-overlapping x- and y-axis tick labels disappear as the
graph gets scrunched.  This is (apparently) true of X11 graphics
on MacOS as well -- Quartz window has a different behavior.
Trying with pdf() as well -- for height=2, width=2, only 1 y-axis
and 2 x-axis tick labels survive, *but* the x and y labels and the
title are all still present (but clipped, of course).
[Hmmm. Take my reports above with a grain of salt, I wasn't
always using expression()s.]

   So I would guess that if you reported this as a bug you would
be told that it was a poorly documented property of R's X11
graphics model, rather than a bug ...

   I have no idea where to start looking for more information
about what defines this behavior -- if I were desperate to know
I would probably try asking Paul Murrell ...

   I would be very interested to see this discussed on r-devel,
if anyone bit ...

Ben Bolker

__
R-help@r-project.org mailto:R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


--
Dr Paul Murrell
Department of Statistics
The

[R] Understanding linear contrasts in Anova using R

2010-09-29 Thread David Howell


 #I am trying to understand how R fits models for contrasts in a
#simple one-way anova. This is an example, I am not stupid enough to want
#to simultaneously apply all of these contrasts to real data. With a few
#exceptions, the tests that I would compute by hand (or by other software)
#will give the same t or F statistics. It is the contrast estimates that 
R produces

#that I can't seem to understand.
#
# In searching for answers to this problem, I found a great PowerPoint 
slide (I think by John Fox).
# The slide pointed to the coefficients, said something like these are 
coeff. that no one could love, and
#then suggested looking at the means to understand where they came from. 
I have stared

# and stared at his means and then my means, but can't find a relationship.

# The following code and output illustrates the problem.

# Various examples of Anova using R

dv - c(1.28,  1.35,  3.31,  3.06,  2.59,  3.25,  2.98,  1.53, -2.68,  
2.64,  1.26,  1.06,
   -1.18,  0.15,  1.36,  2.61,  0.66,  1.32,  0.73, -1.06,  0.24,  
0.27,  0.72,  2.28,
   -0.41, -1.25, -1.33, -0.47, -0.60, -1.72, -1.74, -0.77, -0.41, 
-1.20, -0.31, -0.74,
   -0.45,  0.54, -0.98,  1.68,  2.25, -0.19, -0.90,  0.78,  0.05,  
2.69,  0.15,  0.91,
2.01,  0.40,  2.34, -1.80,  5.00,  2.27,  6.47,  2.94,  0.47,  
3.22,  0.01, -0.66)


group - factor(rep(1:5, each = 12))


# Use treatment contrasts to compare each group to the first group.
options(contrasts = c(contr.treatment,contr.poly))  # The default
model2 - lm(dv ~ group)
summary(model2)
  # Summary table is the same--as it should be
  # Intercept is Group 1 mean and other coeff. are deviations from that.
  # This is what I would expect.
  #summary(model1)
  #  Df Sum Sq Mean Sq F valuePr(F)
  #  group4  62.46 15.6151  6.9005 0.0001415 ***
  #  Residuals   55 124.46  2.2629
  #Coefficients:
  #Estimate Std. Error t value Pr(|t|)
  #(Intercept)  1.802500.43425   4.151 0.000116 ***
  #group2  -1.127500.61412  -1.836 0.071772 .
  #group3  -2.715000.61412  -4.421 4.67e-05 ***
  #group4  -1.258330.61412  -2.049 0.045245 *
  #group5   0.086670.61412   0.141 0.888288


# Use sum contrasts to compare each group against grand mean.
options(contrasts = c(contr.sum,contr.poly))
model3 - lm(dv ~ group)
summary(model3)

  # Again, this is as expected. Intercept is grand mean and others are 
deviatoions from that.

  #Coefficients:
  #  Estimate Std. Error t value Pr(|t|)
  #  (Intercept)   0.7997 0.1942   4.118 0.000130 ***
  #  group11.0028 0.3884   2.582 0.012519 *
  #  group2   -0.1247 0.3884  -0.321 0.749449
  #  group3   -1.7122 0.3884  -4.408 4.88e-05 ***
  #  group4   -0.2555 0.3884  -0.658 0.513399

#SO FAR, SO GOOD

# IF I wanted polynomial contrasts BY HAND I would use
#a(i) =  -2   -1   0   1   2   for linear contrast(or some 
linear function of this )

#Effect = Sum(a(j)M(i))# where M = mean
#Effect(linear) = -2(1.805) -1(0.675) +0(-.912) +1(.544) +2(1.889) = 
0.043

#SS(linear) = n*(Effect(linear)^2)/Sum((a(j)^2))  = 12(.043)/10 = .002
#F(linear) = SS(linear)/MS(error) = .002/2.263 = .001
#t(linear) = sqrt(.001) = .031

# To do this in R I would use
order.group - ordered(group)
model4 - lm(dv~order.group)
summary(model4)
#  This gives:
#Coefficients:
#  Estimate Std. Error t value Pr(|t|)
#(Intercept)0.799670.19420   4.118 0.000130 ***
#order.group.L  0.013440.43425   0.031 0.975422
#order.group.Q  2.135190.43425   4.917 8.32e-06 ***
#order.group.C  0.110150.43425   0.254 0.800703
#order.group^4 -0.796020.43425  -1.833 0.072202 .

# The t value for linear is same as I got (as are others) but I don't 
understand
# the estimates. The intercept is the grand mean, but I don't see the 
relationship

# of other estimates to that or to the ones I get by hand.
# My estimates are the sum of (coeff times means) i.e.  0 (intercept), 
.0425, 7.989, .3483, -6.66

# and these are not a linear (or other nice pretty) function of est. from R.

# Just as a check, I forced intercept through zero with with deviation 
scores or -1 in model.

#  Now force intercept to 0 by using deviation scores

devdv - dv-mean(dv)
model5 - lm(devdv~order.group)
summary(model5)
#Same as above except intercept = 0

# Now do it by removing the intercept
model6 - lm(dv~order.group -1)
summary(model6)
#  Weird because coeff = cell means
# Estimate Std. Error t value Pr(|t|)
#order.group1   1.8025 0.4342   4.151 0.000116 ***
#order.group2   0.6750 0.4342   1.554 0.125824
#order.group3  -0.9125 0.4342  -2.101 0.040208 *
#order.group4   0.5442 0.4342   1.253 0.215464
#order.group5   1.8892 0.4342   4.350 5.94e-05 ***

# BUT note that row labels would suggest that we were not solving for 
polynomials,

# but the contrast matrix is made

1 2 >

1 - 100 of 146 matches

Mail list logo