Re: [R] S curve via R

2006-07-26 Thread Johannes Hüsing
 Hello sir:
 How can I get an S curve function via R?
 For SPSS, the function is: y = exp(b0 + b1/x)


I am not sure if this is the answer you want, but

Scurve <- function(x, b0 = 0, b1 = 1) {
    exp(b0 + b1/x)
}

should do what you request.
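
For example (parameter values chosen purely for illustration), you can inspect the characteristic S shape with curve():

```r
Scurve <- function(x, b0 = 0, b1 = 1) {
    exp(b0 + b1/x)
}

## with b1 < 0 the curve rises in an S shape for x > 0
curve(Scurve(x, b0 = 0, b1 = -5), from = 0.1, to = 20,
      xlab = "x", ylab = "y", main = "y = exp(b0 + b1/x)")
```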

Greetings


Johannes

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] [Way OT] New hardware

2006-07-26 Thread Martin Maechler
 Sean == Sean Davis [EMAIL PROTECTED]
 on Tue, 25 Jul 2006 17:16:02 -0400 writes:

Sean Can anyone share experience with opteron versus the
Sean xeon (woodcrest) for R under linux?  I am looking at
Sean using 16-32Gb of ram in a workstation (as opposed to a
Sean server).

Hmm, not that I'd be an expert...
If you want to use so much RAM you want to use a 64-bit
architecture and software (OS, libraries, compilers,...), right?
AFAIK, that's been known to work well with Opterons and different
flavors of Linuxen (e.g. we have dual
Opterons, one with Redhat Enterprise and two with Ubuntu 6.06).

Now I read that there are 64-bit Xeons with EM64T (which is
said to be Intel's implementation of AMD64), so in principle the same
versions of Linux and R should run there as well.
Since I haven't heard of any success stories,
I'm interested as well in reports from R users.

Martin Maechler, ETH Zurich



Re: [R] command completion in R-WinEdt

2006-07-26 Thread Uwe Ligges
Franco Mendolia wrote:
 Hello!
 
 Is there any possibility to use command completion in R-WinEdt?

You can make use of the Command Completion Wizard for WinEdt available 
at http://www.winedt.org/Plugins/.
A list of function names ships with the RWiNEdt package (file R.lst). 
Simply make it known to the Command Completion Wizard.

Uwe Ligges



 Thanks
 Franco
 



Re: [R] [Way OT] New hardware

2006-07-26 Thread Prof Brian Ripley
On Wed, 26 Jul 2006, Martin Maechler wrote:

  Sean == Sean Davis [EMAIL PROTECTED]
  on Tue, 25 Jul 2006 17:16:02 -0400 writes:
 
 Sean Can anyone share experience with opteron versus the
 Sean xeon (woodcrest) for R under linux?  I am looking at
 Sean using 16-32Gb of ram in a workstation (as opposed to a
 Sean server).
 
 Hmm, not that I'd be an expert...
 If you want to use so much RAM you want to use a 64-bit
 architecture and software (OS, libraries, compilers,...), right?
 AFAIK, that's been known to work well with Opterons and different
 flavors of Linuxen (e.g. we have dual
 Opterons, one with Redhat Enterprise and two with Ubuntu 6.06).
 
 Now I read that there are 64-bit Xeons with EM64T (which is
 said to be Intel's implementation of AMD64), so in principle the same
 versions of Linux and R should run there as well.
 Since I haven't heard of any success stories
 I'm interested as well, in reports from R users.

There have been several posted here or on R-devel.  Things do change, but
every time we have had a formal test, Opterons were considerably better
than Xeons on performance/£.

[BTW, I don't think you will get a workstation motherboard that takes 32Gb 
of RAM (although things change fast):  my own machine has a server 
motherboard in a small tower case.]

-- 
Brian D. Ripley,  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel:  +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UK    Fax:  +44 1865 272595


Re: [R] Overplotting: plot() invocation looks ugly ... suggestions?

2006-07-26 Thread hadley wickham
 And if lattice is ok then try this:

 library(lattice)
 xyplot(Consumption ~ Quarter, group = Year, data, type = "o")

Or you can use ggplot:

install.packages("ggplot")
library(ggplot)
qplot(Quarter, Consumption, data = data, type = c("point", "line"), id = data$Year)

Unfortunately this has uncovered a couple of small bugs for me to fix
(no automatic legend, and having to specify the data frame explicitly).

The slightly more verbose example below shows you what it should look like.

data$Year <- factor(data$Year)
p <- ggplot(data, aes = list(x = Quarter, y = Consumption, id = Year, colour = Year))
ggline(ggpoint(p), size = 2)

Regards,

Hadley



Re: [R] convert decimals to fractions - sorted

2006-07-26 Thread JeeBee

Hi Muhammad,

How about this?

library(MASS)  # provides as.fractions()
at <- read.table(textConnection(a))
at2 <- cbind(at, jeebee = as.character(as.fractions(as.numeric(at[,2]))))

sort.order <- order(at2$V2)

at2[sort.order,]
at2[sort.order,c(1,3)]

JeeBee.



Re: [R] Overplotting: plot() invocation looks ugly ... suggestions?

2006-07-26 Thread Constantinos Antoniou
Hello,

I would like to ask a question regarding the use of a grey background
(by ggplot in this case, but also in other settings - I seem to
remember a relevant lattice discussion). It seems that it is generally
discouraged by journals. I guess one practical reason is that it makes
photocopying difficult (in the sense that it may lead to low contrast
situations). It might have to do with printing costs, as it leads to
higher coverage of the page, but I do not know about that.

[Disclaimer: it does look nice, though.]

Any comments?

Thanks,

Costas

On 7/26/06, hadley wickham [EMAIL PROTECTED] wrote:
  And if lattice is ok then try this:
 
  library(lattice)
  xyplot(Consumption ~ Quarter, group = Year, data, type = "o")

 Or you can use ggplot:

 install.packages("ggplot")
 library(ggplot)
 qplot(Quarter, Consumption, data = data, type = c("point", "line"), id = data$Year)

 Unfortunately this has uncovered a couple of small bugs for me to fix
 (no automatic legend, and having to specify the data frame explicitly).

 The slightly more verbose example below shows you what it should look like.

 data$Year <- factor(data$Year)
 p <- ggplot(data, aes = list(x = Quarter, y = Consumption, id = Year, colour = Year))
 ggline(ggpoint(p), size = 2)

 Regards,

 Hadley





[R] Power tests for ROC analysis

2006-07-26 Thread Robinson, Peter
Dear List,

please forgive this question from a hobby statistician, but I was wondering if 
there is any way of doing power calculations to estimate how much data is 
needed so that the sensitivity/specificity values along a ROC curve will be 
within a certain confidence interval? I am not aware of any such method, but 
was recently asked how much data would be needed to perform ROC analysis for a 
study.

Thanks a lot,
Peter

Dr. med. Peter Robinson, MSc.
Institut für Medizinische Genetik
Universitätsklinikum Charité 
Humboldt-Universität
Augustenburger Platz 1
13353 Berlin
Phone: ++49-30-450 569124
Fax: ++49-30-450 569915
[EMAIL PROTECTED]
http://www.charite.de/ch/medgen/robinson



Re: [R] Overplotting: plot() invocation looks ugly ... suggestions?

2006-07-26 Thread Karl Ove Hufthammer
Constantinos Antoniou wrote:

 I would like to ask a question regarding the use of a grey background
 (by ggplot in this case, but also in other settings - I seem to
 remember a relevant lattice discussion). It seems that it is generally
 discouraged by journals. I guess one practical reason is that it makes
 photocopying difficult (in the sense that it may lead to low contrast
 situations). It might have to do with printing costs, as it leads to
 higher coverage of the page, but I do not know about that.
 
 [Disclaimer: it does look nice, though.]

 Any comments?

Just a small one: The grey background used by ggplot does look nice;
the one used by earlier versions of lattice did not. All IMHO, of course.

-- 
Karl Ove Hufthammer
E-mail and Jabber: [EMAIL PROTECTED]



Re: [R] Overplotting: plot() invocation looks ugly ... suggestions?

2006-07-26 Thread hadley wickham
 I would like to ask a question regarding the use of a grey background
 (by ggplot in this case, but also in other settings - I seem to
 remember a relevant lattice discussion). It seems that it is generally
 discouraged by journals. I guess one practical reason is that it makes
 photocopying difficult (in the sense that it may lead to low contrast
 situations). It might have to do with printing costs, as it leads to
 higher coverage of the page, but I do not know about that.

 [Disclaimer: it does look nice, though.]

 Any comments?

It is very easy to change to the usual black on white grid lines (see
?ggopt and ?ggsave), so if your journal does require it, it's easy to
turn off.

Here are a few reasons I like the gray background (in no particular order):

 * you can then use white gridlines, which minimally impinge on the
plot, but still aid lookup to the relevant axis

 * the color of the plot more closely matches the color (in the
typographic sense) of the text, so that the plot fits into a printed
document without drawing so much attention to itself.

 * the contrast between the plot surface and the points is a little
lower, which makes it a bit more pleasant to read

Of course the big disadvantage is if you don't have a high-quality
printer, or are looking at a photocopy of a photocopy, etc.  This
disadvantage should go away with time as the quality of printed output
steadily improves.

Regards,

Hadley



Re: [R] Axis Title in persp() Overlaps with Axis Labels

2006-07-26 Thread Jose Claudio Faria
Dear Kilian,

Also have a look at:
http://wiki.r-project.org/rwiki/doku.php?id=graph_gallery:new-graphics

You will see a new and very flexible function for 3D plots.

Regards,
__
Jose Claudio Faria
Brasil/Bahia/Ilheus/UESC/DCET
Estatística Experimental/Prof. Adjunto
mails: [EMAIL PROTECTED]
[EMAIL PROTECTED]
[EMAIL PROTECTED]

Paul Murrell p.murrell at auckland.ac.nz writes:

 
  Hi
 
  Kilian Plank wrote:
   Good morning,
  
   in a 3D plot based on persp() the axis title (of dimension z) 
overlaps with
   the axis labels.
   How can the distance (between axis labels and axis title) be increased?
 

  Paul

Another way to do it: get the perspective matrix
back from persp() and use trans3d() to redo essentially
the same calculations that persp() does to decide where
to put the label:

x <- seq(-10, 10, length = 30)
y <- x
f <- function(x, y) { r <- sqrt(x^2 + y^2); 10 * sin(r)/r }
z <- outer(x, y, f)
z[is.na(z)] <- 1

par(mfrow = c(2, 2))
persp(x, y, z, theta = 30, phi = 30, expand = 0.5,
      col = "lightblue", ticktype = "detailed")

persp(x, y, z, theta = 30, phi = 30, expand = 0.5,
      col = "lightblue", ticktype = "detailed",
      zlab = "\n\n\n\nz")

p1 <- persp(x, y, z, theta = 30, phi = 30, expand = 0.5,
            col = "lightblue", ticktype = "detailed", zlab = "")

ranges <- t(sapply(list(x, y, z), range))
means <- rowMeans(ranges)

## label offset distance, as a fraction of the plot width
labelspace <- 0.12  ## tweak this until you like the result

xpos <- min(x) - diff(range(x)) * labelspace
ypos <- min(y) - diff(range(y)) * labelspace
labelbot3d <- c(xpos, ypos, min(z))
labeltop3d <- c(xpos, ypos, max(z))
labelmid3d <- c(xpos, ypos, mean(range(z)))

trans3dfun <- function(v) { trans3d(v[1], v[2], v[3], p1) }
labelbot2d <- trans3dfun(labelbot3d)
labelmid2d <- trans3dfun(labelmid3d)
labeltop2d <- trans3dfun(labeltop3d)
labelang <-
  180/pi * atan2(labeltop2d$y - labelbot2d$y, labeltop2d$x - labelbot2d$x)
par(xpd = NA, srt = labelang)  ## disable clipping and set string rotation
text(labelmid2d$x, labelmid2d$y, "z label")



[R] Main title of plot

2006-07-26 Thread Marco Boks
I am a newbie, and I am afraid this may be a rather trivial question. However I 
could not find the answer anywhere.

 

I am plotting a series of plots with different values for p. In the main title 
of a plot I have used the following code:

 

 

plot(a, b, type = "l", ylim = c(0, 1), xlab = "freq", ylab = "power",
     main = c("maximum gain=", p))

 

That works fine. However, the value of p is plotted on a new line, instead of
just after the "=".

 

Is there any way to print the value of p on the same line?

 

Thanks 

 
Marco




Re: [R] Main title of plot

2006-07-26 Thread Gabor Grothendieck
This was just discussed yesterday.  See the thread:

https://www.stat.math.ethz.ch/pipermail/r-help/2006-July/109931.html

On 7/26/06, Marco Boks [EMAIL PROTECTED] wrote:
 I am a newbie, and I am afraid this may be a rather trivial question. However 
 I could not find the answer anywhere.



 I am plotting a series of plots with different values for p. In the main 
 title of a plot I have used the following code:





  plot(a, b, type = "l", ylim = c(0, 1), xlab = "freq", ylab = "power",
  main = c("maximum gain=", p))



  That works fine. However, the value of p is plotted on a new line, instead of
  just after the "=".



  Is there any way to print the value of p on the same line?



 Thanks


 Marco






[R] randomForest question

2006-07-26 Thread Arne.Muller
Hello,

I have a question regarding randomForest (from the package of the same name). I have
16 features (nominal), 159 positive and 318 negative cases that I'd like to
classify (binary classification).

Using the tuning from the e1071 package it turns out that the best performance
is reached when using all 16 features per tree (mtry=16). However, the
documentation of randomForest suggests taking sqrt(#features), i.e. 4. How
can I explain this difference? When using all features this is the same as a
classical decision tree, with the difference that the tree is built and tested
with different data sets, right?

example (I've tried different configurations, incl. changing ntree):
 param <- try(tune(randomForest, class ~ ., data = d.all318,
     range = list(mtry = c(4, 8, 16), ntree = c(1000))))

 summary(param)

Parameter tuning of `randomForest':

- sampling method: 10-fold cross validation 

- best parameters:
 mtry ntree
   16  1000

- best performance: 0.1571809 

- Detailed performance results:
  mtry ntree     error
1    4  1000 0.1928635
2    8  1000 0.1634752
3   16  1000 0.1571809

thanks a lot for your help,

kind regards,



Re: [R] convert decimals to fractions - sorted

2006-07-26 Thread Muhammad Subianto
Dear all,
Thanks for your help.
I played with your suggestion, but it still doesn't give the sorted summary I need.

 t(table(at2[sort.order,c(1,3)]))
   V1
jeebee  -1 1
  0  0 4
  11/21  0 1
  1/2    1 0
  1/21   1 1
  13/42  1 1
  17/42  0 2
  2/21   0 3
  3/14   1 2
  5/42   0 1
  8/21   0 1


I need the summary in sorted (increasing) order, like:

  -1 1
  0/42   0 4
  2/42   1 1
  4/42   0 3
  5/42   0 1
  9/42   1 2
  13/42  1 1
  16/42  0 1
  17/42  0 2
  21/42  1 0
  22/42  0 1
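
One way to get that ordering (a sketch building on the frac.fun() helper quoted below; untested here): order the table rows by the numeric value of the fraction rather than by its character representation, e.g.

```r
tab <- table(frac.fun(as.numeric(df[,2]), 42), df[,1])
## row names are "n/42" strings, so order by the numerator n
num <- as.numeric(sub("/42$", "", rownames(tab)))
tab[order(num), ]
```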

Thanks very much for any suggestions.
Greetings & Regards, Muhammad Subianto


On 7/26/06, JeeBee [EMAIL PROTECTED] wrote:

 Hi Muhammad,

 How about this?

 library(MASS)  # provides as.fractions()
 at <- read.table(textConnection(a))
 at2 <- cbind(at, jeebee = as.character(as.fractions(as.numeric(at[,2]))))

 sort.order <- order(at2$V2)

 at2[sort.order,]
 at2[sort.order,c(1,3)]

 JeeBee.



On 7/25/06, Muhammad Subianto [EMAIL PROTECTED] wrote:
 Dear all,
 Based on my question a few months ago
 https://stat.ethz.ch/pipermail/r-help/2006-January/086952.html
 and solved with
 https://stat.ethz.ch/pipermail/r-help/2006-January/086955.html
 https://stat.ethz.ch/pipermail/r-help/2006-January/086956.html
 and from
 https://stat.ethz.ch/pipermail/r-help/2006-January/086958.html

 frac.fun <- function(x, den){
 dec <- seq(0, den) / den
 nams <- paste(seq(0, den), den, sep = "/")
 sapply(x, function(y) nams[which.min(abs(y - dec))])
 }
 ###
 frac.fun(c(0, 1, 0.827, .06, 0.266), 75)

 Now, I have a dataset something like this:

 a <- "-1 0
  1 0.095238095238095
  1 0.214285714285714
 -1 0.5
  1 0.309523809523810
 -1 0.0476190476190476
  1 0.404761904761905
  1 0.119047619047619
 -1 0.214285714285714
 -1 0.309523809523810
  1 0
  1 0
  1 0.404761904761905
  1 0.095238095238095
  1 0.047619047619047
  1 0.380952380952381
  1 0.214285714285714
  1 0.523809523809524
  1 0
  1 0.095238095238095"

 First, I convert the values to fractions and then sort them.
 I have played around trying to make it sort, but without success.

 df <- read.table(textConnection(a))
 library(MASS)
 as.fractions(as.numeric(df[,2]))
 cbind(table(df[,2], df[,1]), summary(as.factor(df[,2])))
 table(frac.fun(as.numeric(df[,2]), 42), df[,1])

 -1 1
   0/42   0 4
   13/42  1 1
   16/42  0 1
   17/42  0 2
   21/42  1 0
   22/42  0 1
   2/42   1 1
   4/42   0 3
   5/42   0 1
   9/42   1 2
 

 How can I sort the result in increasing (numeric) order, like this:

 -1 1
   0/42   0 4
   2/42   1 1
   4/42   0 3
   5/42   0 1
   9/42   1 2
   13/42  1 1
   16/42  0 1
   17/42  0 2
   21/42  1 0
   22/42  0 1

 Thanks for any help.

 Best, Muhammad Subianto




[R] Sweave and tth

2006-07-26 Thread Kuhn, Max
Dr. Harrell,

 I tried odfWeave to create an OpenOffice file and found that it
 exhausted the memory of my large linux machine and took a long time
 to run.

Do you have any details about the problem that you encountered? A bug
that someone else had pointed out might be the culprit. I have the
default image format as png, but since a lot of linux systems don't have
that device automatically available to them, I have a switch for the
device in odfWeaveControl:

 plotDevice = ifelse(.Platform$OS.type == "windows", "png", "bitmap"),

The bitmap device units are in inches and the bmp device is in pixels.
The bug is that the default image size is 480 inches (whoops).

Can you try using:

  odfWeaveControl(
    plotHeight = 5,
    plotWidth = 5,
    dispHeight = 5,
    dispWidth = 5)

in your odfWeave call and let me know if this was the issue? I was able
to reproduce the error on our linux systems and this fix worked (strange
that the package passes R CMD check though).

If this works, Section 7 of the odfWeave manual lists two command-line
tools (not included in my package) that can do the conversion from odt
to Word (or other formats).

 I really appreciate Max Kuhn's efforts with odfWeave and hope to keep up
 with its development.

No problem. I'll be releasing a bug fix version to solve the device
units issue. Also, others have reported problems with locales. I believe
I have a fix for this issue too.

 
 Thanks.
 -- 
 Frank E Harrell Jr   Professor and Chair           School of Medicine
                      Department of Biostatistics   Vanderbilt University

Max
--



Re: [R] Main title of plot

2006-07-26 Thread Marc Schwartz
Gabor,

I think that this is actually different, since it does not involve
plotmath.

The issue here is the use of c() in:

  main = c("maximum gain=", p)

rather than:

  main = paste("maximum gain =", p)

Marco, try this:

plot(a, b, type = "l", ylim = c(0, 1), xlab = "freq", ylab = "power",
     main = paste("maximum gain =", p))

See ?paste for concatenating multiple vectors into a single character
vector (string).
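
The difference is easy to see at the console: c() keeps two elements, which plot() draws on separate lines, while paste() collapses them into a single string:

```r
p <- 0.5
c("maximum gain=", p)       # a 2-element character vector
paste("maximum gain =", p)  # the single string "maximum gain = 0.5"
```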

HTH,

Marc Schwartz



On Wed, 2006-07-26 at 07:29 -0400, Gabor Grothendieck wrote:
 This was just discussed yesterday.  See the thread:
 
 https://www.stat.math.ethz.ch/pipermail/r-help/2006-July/109931.html
 
 On 7/26/06, Marco Boks [EMAIL PROTECTED] wrote:
  I am a newbie, and I am afraid this may be a rather trivial
 question. However I could not find the answer anywhere.
 
 
 
  I am plotting a series of plots with different values for p. In the
 main title of a plot I have used the following code:
 
 
 
 
 
  plot(a, b, type = "l", ylim = c(0, 1), xlab = "freq", ylab = "power",
 main = c("maximum gain=", p))
 
 
 
  That works fine. However, the value of p is plotted on a new line,
 instead of just after the "=".
 
 
 
  Is there any way to print the value of p on the same line?
 
 
 
  Thanks
 
 
  Marco
 



[R] the first and last case

2006-07-26 Thread Mauricio Cardeal
Hi all

Sometime ago I asked for a solution about how to aggregate data, and the
help was wonderful. Now, I'd like to know how to extract, for each
individual case below, the first and the last observation, to obtain this:

ind  y
1    8
1    9
2    7
2   11
3    9
3   10
4    8
4    5

# Below the example:

ind <- c(1,1,1,2,2,3,3,3,4,4,4,4)
y <- c(8,10,9,7,11,9,9,10,8,7,6,5)
dat <- as.data.frame(cbind(ind, y))
dat
attach(dat)
mean.ind <- aggregate(dat$y, by = list(dat$ind), mean)
mean.ind

Thanks
Mauricio



[R] Faster alternative to by?

2006-07-26 Thread michael watson \(IAH-C\)
Hi

I have a data.frame, two columns, 12304 rows.  Both columns are factors.
I want to do an equivalent of an SQL group by statement, and count the
number of rows in the data frame for each unique value of the second
column.

I have:

countl <- by(mapped, mapped$col2, nrow)

Now, mapped$col2 has 10588 levels, so this statement takes a really long
time to run.  Is there a more efficient way of doing this in R?

Thanks

Mick



Re: [R] Faster alternative to by?

2006-07-26 Thread Jacques VESLOT
table(mapped$col2)
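
A minimal sketch of the idea (the toy data frame here is made up for illustration; the poster's real one has 12304 rows):

```r
mapped <- data.frame(col1 = factor(letters[1:6]),
                     col2 = factor(c("x", "y", "x", "z", "y", "x")))

## one tabulation pass over the factor instead of one nrow() call per level
counts <- table(mapped$col2)
counts
## x y z
## 3 2 1
```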
---
Jacques VESLOT

CNRS UMR 8090
I.B.L (2ème étage)
1 rue du Professeur Calmette
B.P. 245
59019 Lille Cedex

Tel : 33 (0)3.20.87.10.44
Fax : 33 (0)3.20.87.10.31

http://www-good.ibl.fr
---


michael watson (IAH-C) a écrit :
 Hi
 
 I have a data.frame, two columns, 12304 rows.  Both columns are factors.
 I want to do an equivalent of an SQL group by statement, and count the
 number of rows in the data frame for each unique value of the second
 column.
 
 I have:
 
 countl <- by(mapped, mapped$col2, nrow)
 
 Now, mapped$col2 has 10588 levels, so this statement takes a really long
 time to run.  Is there a more efficient way of doing this in R?
 
 Thanks
 
 Mick
 




Re: [R] the first and last case

2006-07-26 Thread Jacques VESLOT
do.call(rbind,lapply(split(dat, dat$ind), function(x) x[c(1,nrow(x)),]))
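
A quick check of this split()/lapply() idea on the toy data from the original post:

```r
ind <- c(1,1,1,2,2,3,3,3,4,4,4,4)
y <- c(8,10,9,7,11,9,9,10,8,7,6,5)
dat <- data.frame(ind, y)

## first and last row within each level of ind, bound back together
do.call(rbind, lapply(split(dat, dat$ind), function(x) x[c(1, nrow(x)), ]))
## y values per ind: 8,9  7,11  9,10  8,5
```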
---
Jacques VESLOT

CNRS UMR 8090
I.B.L (2ème étage)
1 rue du Professeur Calmette
B.P. 245
59019 Lille Cedex

Tel : 33 (0)3.20.87.10.44
Fax : 33 (0)3.20.87.10.31

http://www-good.ibl.fr
---


Mauricio Cardeal a écrit :
 Hi all
 
 Sometime ago I asked for a solution about how to aggregate data, and the
 help was wonderful. Now, I'd like to know how to extract, for each
 individual case below, the first and the last observation, to obtain this:
 
 ind  y
 1    8
 1    9
 2    7
 2   11
 3    9
 3   10
 4    8
 4    5
 
 # Below the example:
 
 ind <- c(1,1,1,2,2,3,3,3,4,4,4,4)
 y <- c(8,10,9,7,11,9,9,10,8,7,6,5)
 dat <- as.data.frame(cbind(ind, y))
 dat
 attach(dat)
 mean.ind <- aggregate(dat$y, by = list(dat$ind), mean)
 mean.ind
 
 Thanks
 Mauricio
 




[R] R: the first and last case

2006-07-26 Thread Guazzetti Stefano
could it be

 dat[unlist(tapply(1:nrow(dat), ind, range)),]
?

stefano



    -----Original Message-----
    From: [EMAIL PROTECTED]
    [mailto:[EMAIL PROTECTED] on behalf of
    Mauricio Cardeal
    Sent: 26 July, 2006 14:22
    To: r-help@stat.math.ethz.ch
    Subject: [R] the first and last case
   
   
   Hi all
   
   Sometime ago I asked for a solution about how to aggregate 
   data and the 
    help was wonderful. Now, I'd like to know how to extract for each
   individual case below the first and the last observation to 
   obtain this:
   
    ind  y
    1    8
    1    9
    2    7
    2   11
    3    9
    3   10
    4    8
    4    5
   
   # Below the example:
   
    ind <- c(1,1,1,2,2,3,3,3,4,4,4,4)
    y <- c(8,10,9,7,11,9,9,10,8,7,6,5)
    dat <- as.data.frame(cbind(ind, y))
    dat
    attach(dat)
    mean.ind <- aggregate(dat$y, by = list(dat$ind), mean)
    mean.ind
   
   Thanks
   Mauricio
   



Re: [R] the first and last case

2006-07-26 Thread Carlos J. Gil Bellosta
Dear Jacques,

I believe you need dat ordered by ind and y before you apply your solution,
right?

Sincerely,

Carlos J. Gil Bellosta
http://www.datanalytics.com
http://www.data-mining-blog.com

Quoting Jacques VESLOT [EMAIL PROTECTED]:

 do.call(rbind,lapply(split(dat, dat$ind), function(x) x[c(1,nrow(x)),]))
 ---
 Jacques VESLOT

 CNRS UMR 8090
 I.B.L (2ème étage)
 1 rue du Professeur Calmette
 B.P. 245
 59019 Lille Cedex

 Tel : 33 (0)3.20.87.10.44
 Fax : 33 (0)3.20.87.10.31

 http://www-good.ibl.fr
 ---


 Mauricio Cardeal a écrit :
 Hi all

 Sometime ago I asked for a solution about how to aggregate data and the
 help was wonderful. Now, I'd like to know how to extract for each
 individual case below the first and the last observation to obtain this:

 ind  y
 1    8
 1    9
 2    7
 2   11
 3    9
 3   10
 4    8
 4    5

 # Below the example:

 ind <- c(1,1,1,2,2,3,3,3,4,4,4,4)
 y <- c(8,10,9,7,11,9,9,10,8,7,6,5)
 dat <- as.data.frame(cbind(ind, y))
 dat
 attach(dat)
 mean.ind <- aggregate(dat$y, by = list(dat$ind), mean)
 mean.ind

 Thanks
 Mauricio







Re: [R] [RODBC] ERROR: Could not SQLExecDirect

2006-07-26 Thread Dieter Menne
Peter Eiger Peter.Eiger at gmx.net writes:

 I've got a problem with RODBC and saving (sqlSave) of a dataframe in Access.
 R 2.0.1 is running on windows XP.
 
 When executing the examples in the R help for the USArrests data set, sqlSave
works fine, but running
 sqlSave() for a dataframe Adat 
 
  str(Adat)
 `data.frame':   1202 obs. of  18 variables: 
 
 containing 18 columns and ca. 1200 rows fails.
 
 I get the following error message:
 
  sqlSave(channel, Adat)
 Error in sqlSave(channel, Adat) : [RODBC] ERROR: Could not SQLExecDirect
 
 The data was fetched from the same Access database before and was not
manipulated before the attempt to save.

Try setting rownames = FALSE in sqlSave; it's TRUE by default, which I believe
is a bit unfortunate. And probably append=TRUE. It's also good to try with
fast=FALSE first.

When I get an error of that type, I first save to a non-existing table, and do a
compare of what comes out with the original table.
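
Putting Dieter's suggestions together, a minimal sketch might look like this
(the database path and the target table name are invented for illustration;
sqlSave() and odbcConnectAccess() are from the RODBC package):

```r
library(RODBC)

## Hypothetical connection; "mydb.mdb" is a placeholder path.
channel <- odbcConnectAccess("mydb.mdb")

## Save to a non-existing table first, without the rownames column;
## fast = FALSE does row-by-row INSERTs, which give clearer errors.
sqlSave(channel, Adat, tablename = "Adat_check",
        rownames = FALSE, fast = FALSE)

close(channel)
```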

Dieter



Re: [R] the first and last case

2006-07-26 Thread Gabor Grothendieck
Try these:

# 1
library(Hmisc)
summary(y ~ ind, dat, fun = range, overall = FALSE)

# 2
# or with specified column names
f <- function(x) c(head = head(x,1), tail = tail(x,1))
summary(y ~ ind, dat, fun = f, overall = FALSE)

# 3
# another approach using by - same f as above
do.call(rbind, by(dat$y, dat$ind, f))

# 4
# same but with with an ind column
g <- function(x) c(ind = x$ind[1], head = head(x$y,1), tail = tail(x$y,1))
do.call(rbind, by(dat, dat$ind, g))
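
# 5
# yet another base-R sketch of the same idea: duplicated() in both
# directions picks the first and last row per individual
first <- dat[!duplicated(dat$ind), ]                   # first row per ind
last  <- dat[!duplicated(dat$ind, fromLast = TRUE), ]  # last row per ind
res   <- rbind(first, last)
res[order(res$ind), ]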


On 7/26/06, Mauricio Cardeal [EMAIL PROTECTED] wrote:
 Hi all

 Sometime ago I asked for a solution about how to aggregate data and the
 help was wonderful. Now, I'd like to know how to extract for each
 individual case below the first and the last observation to obtain this:

 ind  y
 1    8
 1    9
 2    7
 2   11
 3    9
 3   10
 4    8
 4    5

 # Below the example:

 ind <- c(1,1,1,2,2,3,3,3,4,4,4,4)
 y <- c(8,10,9,7,11,9,9,10,8,7,6,5)
 dat <- as.data.frame(cbind(ind,y))
 dat
 attach(dat)
 mean.ind <- aggregate(dat$y, by=list(dat$ind), mean)
 mean.ind

 Thanks
 Mauricio





[R] SURVEY PREDICTED SEs: Problem

2006-07-26 Thread mo2259

Hello R-list,
I'm attempting to migrate from Stata to R for my complex survey
work.It has been straight-forward so far except for the
following problem:

I have some code below, but first I'll describe the problem.

When I compute predicted logits from a logistic regression, the
standard errors of the predicted logits are way off (but the
predicted logits are fine).  Furthermore, the model logit 
coefficients have appropriate SEs. As a comparison, I ran the same
model without the survey design; the predicted SEs come out fine.

Here is example code (first no survey design model and predictions;
then survey design model and predictions):

 #MODEL COEF. ESTIMATES (NO SURVEY DESIGN)
 model.l.nosvy <- glm(qn58~t8l,data=all.stratum,family=binomial)
 summary(model.l.nosvy)

Call:
glm(formula = qn58 ~ t8l, family = binomial, data = all.stratum)

Deviance Residuals:
    Min      1Q  Median      3Q     Max
 -1.310  -1.245   1.050   1.111   1.158

Coefficients:
             Estimate Std. Error z value Pr(>|z|)
(Intercept)  0.175890   0.006176   28.48   <2e-16 ***
t8l         -0.018643   0.001376  -13.55   <2e-16 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

(Dispersion parameter for binomial family taken to be 1)

Null deviance: 145934  on 105857  degrees of freedom
Residual deviance: 145750  on 105856  degrees of freedom
AIC: 145754

Number of Fisher Scoring iterations: 3


 #PREDICTED SEs
 phat.l.se.logit.nosvy <- predict(model.l.nosvy,se=TRUE)
 as.matrix(table(phat.l.se.logit.nosvy$se.fit))
 [,1]
0.00632408017609573 14456
0.00633130215261306 15188
0.00741988836010757 12896
0.00743834214717549 10392
0.00923404822144662 13207
0.00925875968615561 15864
0.0114294663004145  12235
0.0114574202170594  11620

 #MODEL COEF. ESTIMATES (SURVEY DESIGN)
 model.l <- svyglm(qn58~t8l,design=all.svy,family=binomial)
 summary(model.l)

Call:
svyglm(qn58 ~ t8l, design = all.svy, family = binomial)

Survey design:
svydesign(id = ~psu, strata = ~stratum, weights = ~weight, data =
all.stratum,
nest = T)

Coefficients:
             Estimate Std. Error t value Pr(>|t|)
(Intercept) -0.016004   0.023267  -0.688    0.492
t8l         -0.024496   0.004941  -4.958 1.13e-06 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

(Dispersion parameter for binomial family taken to be 0.934964)

Number of Fisher Scoring iterations: 2


 #PREDICTED SEs
 phat.l.logit.se <- predict(model.l,se=TRUE)
 as.matrix(table(phat.l.logit.se$se.fit))
  [,1]
2.04867522818685 15188
2.05533753780321 14456
2.39885304369985 10392
2.41588959524594 12896
2.98273190185571 15864
3.00556161422958 13207
3.69102305734136 11620
3.71685978156846 12235

#THESE SEs are too large.



Re: [R] randomForest question [Broadcast]

2006-07-26 Thread Liaw, Andy
When mtry is equal to total number of features, you just get regular bagging
(in the R package -- Breiman & Cutler's Fortran code samples variables with
replacement, so you can't do bagging with that).  There are cases when
bagging will do better than random feature selection (i.e., RF), even in
simulated data, but I'd say not very often.
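
To illustrate Andy's point, a small sketch (the data frame and its columns
are invented for illustration): with mtry equal to the number of predictors,
randomForest() fits bagged trees.

```r
library(randomForest)
set.seed(1)

## Toy two-class data; "d" and its column names are made up.
d <- data.frame(x1 = rnorm(200), x2 = rnorm(200), x3 = rnorm(200))
d$class <- factor(ifelse(d$x1 + rnorm(200) > 0, "pos", "neg"))

p   <- ncol(d) - 1                                  # 3 predictors
bag <- randomForest(class ~ ., data = d, mtry = p)  # mtry = p: bagging
rf  <- randomForest(class ~ ., data = d)            # default mtry ~ sqrt(p)
```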

HTH,
Andy


From: [EMAIL PROTECTED]
 
 Hello,
 
 I've a question regarding randomForest (from the package with 
 same name). I've 16 features (nominal), 159 positive and 
 318 negative cases that I'd like to classify (binary classification).
 
 Using the tuning from the e1071 package it turns out that the 
 best performance is reached when using all 16 features per 
 tree (mtry=16). However, the documentation of randomForest 
 suggests to take the sqrt(#features), i.e. 4. How can I 
 explain this difference? When using all features this is the 
 same as a classical decision tree, with the difference that 
 the tree is built and tested with different data sets, right?
 
 example (I've tried different configurations, incl. changing ntree):
  param <- try(tune(randomForest, class ~ ., data=d.all318, 
  range=list(mtry=c(4, 8, 16), ntree=c(1000))))
 
  summary(param)
 
 Parameter tuning of `randomForest':
 
 - sampling method: 10-fold cross validation 
 
 - best parameters:
  mtry ntree
16  1000
 
 - best performance: 0.1571809 
 
 - Detailed performance results:
    mtry ntree     error
  1    4  1000 0.1928635
  2    8  1000 0.1634752
  3   16  1000 0.1571809
 
   thanks a lot for your help,
 
   kind regards,
 
 




[R] Moving Average

2006-07-26 Thread ricardosilva
Dear R-Users,

How can I compute simple moving averages from a time series in R?
Note that I do not want to estimate a MA model, just compute the MA's 
given a length (as Excel does).

Thanks

Ricardo Gonçalves Silva, M. Sc.
Apoio aos Processos de Modelagem Matemática
Econometria & Inadimplência
Serasa S.A.
(11) - 6847-8889
[EMAIL PROTECTED]

**
The information contained in this message and its attachment(s) is 
addressed exclusively to the person(s) and/or institution(s) indicated 
above and may contain confidential data, which may not, under any form 
or pretext, be used, disclosed, altered, printed or copied, in whole or 
in part, by unauthorized persons. If you are not the intended recipient, 
please delete it and notify the sender immediately. Improper use will be 
handled according to company policy and applicable law.
This message expresses the personal position of its author and does not 
necessarily reflect the opinion of Serasa.
**



[R] String frequencies in rows

2006-07-26 Thread Mario Falchi
Hi All,
 
I'm trying to evaluate the frequency of different strings in each row of a 
data.frame :
INPUT:
ID G1 G2 G3 G4 ... GN
1 AA BB AB AB ... 
2 BB AB AB AA ... 
3 AC CC AC AA ... 
4 BB BB BB BB ... 

The number of different strings can vary in each row.
 
My solution has been:
for (i in 1:length(INPUT[,1])){
 b=as.data.frame(table(t(INPUT[i,2:5])))
some operations using the string values and frequencies
(e.g. b for i==1 is:
 AA 1
 BB 1
 AB 2 )
} 

However my dataframe contains thousands of rows and this script takes a lot of 
time.
Could someone suggest me a faster way?
 
Thank you very much,
Mario Falchi



Re: [R] Moving Average

2006-07-26 Thread Gabor Grothendieck
See

?filter - simple and exponential are special cases
?runmean - in package caTools (the fastest)
?rollmean - in zoo package
?embed - can write your own using embed as basis
?sma - in package fSeries, also see ewma in same package

Probably other functions in other packages too.
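
For instance, a simple k-period moving average via filter() might look like
this (the function name ma is my own):

```r
## Equally weighted moving average of window length k, centred on
## each point; filter() returns NA at the ends of the series.
ma <- function(x, k) stats::filter(x, rep(1/k, k), sides = 2)

x <- c(1, 2, 3, 4, 5, 6)
ma(x, 3)   # NA 2 3 4 5 NA
```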

On 7/26/06, [EMAIL PROTECTED] [EMAIL PROTECTED] wrote:
 Dear R-Users,

 How can I compute simple moving averages from a time series in R?
 Note that I do not want to estimate a MA model, just compute the MA's
 given a length (as Excel does).

 Thanks
 
 Ricardo Gonçalves Silva, M. Sc.
 Apoio aos Processos de Modelagem Matemática
 Econometria  Inadimplência
 Serasa S.A.
 (11) - 6847-8889
 [EMAIL PROTECTED]










Re: [R] Moving Average

2006-07-26 Thread markleeds
From: [EMAIL PROTECTED]
Date: 2006/07/26 Wed AM 09:29:27 CDT
To: r-help@stat.math.ethz.ch
Subject: [R] Moving Average

i think it was mave() in Splus. probably something similar
in R. do RSiteSearch("moving average") at an R prompt.


Dear R-Users,

How can I compute simple moving averages from a time series in R?
Note that I do not want to estimate a MA model, just compute the MA's 
given a length (as Excel does).

Thanks

Ricardo Gonçalves Silva, M. Sc.
Apoio aos Processos de Modelagem Matemática
Econometria  Inadimplência
Serasa S.A.
(11) - 6847-8889
[EMAIL PROTECTED]







Re: [R] String frequencies in rows

2006-07-26 Thread Ben Bolker
Mario Falchi mariofalchi at yahoo.com writes:

 I'm trying to evaluate the frequency of different strings in each row of a
data.frame :
 INPUT:
 ID G1 G2 G3 G4 ... GN
 1 AA BB AB AB ... 

  Something like

z <- data[,-1]
table(z,row(z))

  ?

  Ben Bolker



Re: [R] Sweave and tth

2006-07-26 Thread Frank E Harrell Jr
Kuhn, Max wrote:
 Dr. Harrell,
 
 I tried odfWeave to create an OpenOffice file and found that it 
 exhausted the memory of my large linux machine and took a long time to
 
 run. 
 
 Do you have any details about the problem that you encountered? A bug
 that someone else had pointed out might be the culprit. I have the
 default image format as png, but since a lot of linux systems don't have
 that device automatically available to them, I have a switch for the
 device in odfWeaveControl:
 
  plotDevice = ifelse(.Platform$OS.type == "windows", "png", "bitmap"),
 
 The bitmap device units are in inches and the bmp device is in pixels.
 The bug is that the default image size is 480 inches (whoops).
 
 Can you try using:
 
   odfWeaveControl(
plotHeight = 5,
plotWidth = 5,
dispHeight = 5,
dispWidth = 5)
 
 in your odfWeave call and let me know if this was the issue? I was able
 to reproduce the error on our linux systems and this fix worked (strange
 that the package passes R CMD check though).

Max,

I should have contacted you first - sorry about that.  png is working 
fine on my debian linux system, so I just ran

library(odfWeave)
odfWeave('/usr/local/lib/R/site-library/odfWeave/examples/examples.odt',
         '/tmp/out.odt',
         control = odfWeaveControl(plotHeight = 5, plotWidth = 5,
                                   dispHeight = 5, dispWidth = 5))

and it ran extremely fast, creating out.odt that loaded extremely fast 
into OpenOffice writer, unlike the first out.odt I had tried.

If you develop a way to include high-resolution graphics that will be 
even better.

I have updated http://biostat.mc.vanderbilt.edu/SweaveConvert accordingly.

 
 If this works, Section 7 of the odfWeave manual lists two command line
 tools (not included in my package) that can do the conversion from odt
 to Word (or other formats)

Excellent

Thanks!
Frank

 
 I really appreciate Max Kuhn's efforts with odfWeave and hope to keep
 up 
 with its development.
 
 No problem. I'll be releasing a bug fix version to solve the device
 units issue. Also, others have reported problems with locales. I believe
 I have a fix for this issue too.
 
 Thanks.
 -- 
 Frank E Harrell Jr   Professor and Chair   School of Medicine
   Department of Biostatistics   Vanderbilt
 University



[R] Branching on 'grep' returns...

2006-07-26 Thread Allen S. Rout



Greetings, all.

I'm fiddling with some text manipulation in R, and I've found
something which feels counterintuitive to my PERL-trained senses; I'm
hoping that I can glean new R intuition about the situation.

Here's an example, as concise as I could make it. 


trg <- c("this","that")

# these two work as I'd expected.
if ( grep("this",trg) ) { cat("Y\n") } else { cat("N\n") } 
if ( grep("that",trg) ) { cat("Y\n") } else { cat("N\n") } 

# These all fail with error 'argument is of length zero'
# if ( grep("other",trg) ) { cat("Y\n") } else { cat("N\n") } 
# if ( grep("other",trg) == TRUE) { cat("Y\n") } else { cat("N\n") } 
# if ( grep("other",trg) == 1) { cat("Y\n") } else { cat("N\n") } 


# This says that the result is a numeric zero.   Shouldn't I be able
#  to if on that, or at least compare it with a number?
grep("other", trg)

# I eventually decided this worked, but felt odd to me.
if ( any(grep("other",trg))) { cat("Y\n") } else { cat("N\n") } 


So, is the 'Wrap it in an any()' just normal R practice, and I'm too
new to know it?  Is there a more fundamental dumb move I'm making?




- Allen S. Rout



Re: [R] String frequencies in rows

2006-07-26 Thread Liaw, Andy
It's usually faster to operate on columns of data frames, rather
than rows, so the following might help:

R> x
  G1 G2 G3 G4
1 AA BB AB AB
2 BB AB AB AA
3 AC CC AC AA
4 BB BB BB BB
R> xt <- as.data.frame(t(x))
R> sapply(xt, table)
$`1`

AA AB BB 
 1  2  1 

$`2`

AA AB BB 
 1  2  1 

$`3`

AA AC CC 
 1  2  1 

$`4`

BB 
 4 
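
A fully vectorized variant (a sketch, using the same x as above) gets all
the per-row counts with a single cross-tabulation of values against their
row index:

```r
## Same toy data as in the example above.
x <- data.frame(G1 = c("AA","BB","AC","BB"),
                G2 = c("BB","AB","CC","BB"),
                G3 = c("AB","AB","AC","BB"),
                G4 = c("AB","AA","AA","BB"))
m <- as.matrix(x)
table(row(m), m)   # one row of counts per input row
```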

Andy 

From: Mario Falchi
 
 Hi All,
  
 I'm trying to evaluate the frequency of different strings 
 in each row of a data.frame :
 INPUT:
 ID G1 G2 G3 G4 ... GN
 1 AA BB AB AB ...
 2 BB AB AB AA ...
 3 AC CC AC AA ...
 4 BB BB BB BB ... 
 
 The number of different strings can vary in each row.
  
 My solution has been:
 for (i in 1:length(INPUT[,1])){
  b=as.data.frame(table(t(INPUT[i,2:5])))
 some operations using the string values and frequencies 
 (e.g. b for i==1 is:
  AA 1
  BB 1
  AB 2 )
 } 
 
 However my dataframe contains thousands of rows and this script 
 takes a lot of time.
 Could someone suggest me a faster way?
  
 Thank you very much,
 Mario Falchi
 




Re: [R] Branching on 'grep' returns...

2006-07-26 Thread Gabor Grothendieck
If you are using grep then I think you have it right.  Note that

   "this" %in% trg

is also available.

On 26 Jul 2006 11:16:25 -0400, Allen S. Rout [EMAIL PROTECTED] wrote:



 Greetings, all.

 I'm fiddling with some text manipulation in R, and I've found
 something which feels counterintuitive to my PERL-trained senses; I'm
 hoping that I can glean new R intuition about the situation.

 Here's an example, as concise as I could make it.


 trg <- c("this","that")

 # these two work as I'd expected.
 if ( grep("this",trg) ) { cat("Y\n") } else { cat("N\n") }
 if ( grep("that",trg) ) { cat("Y\n") } else { cat("N\n") }

 # These all fail with error 'argument is of length zero'
 # if ( grep("other",trg) ) { cat("Y\n") } else { cat("N\n") }
 # if ( grep("other",trg) == TRUE) { cat("Y\n") } else { cat("N\n") }
 # if ( grep("other",trg) == 1) { cat("Y\n") } else { cat("N\n") }


 # This says that the result is a numeric zero.   Shouldn't I be able
 #  to if on that, or at least compare it with a number?
 grep("other", trg)

 # I eventually decided this worked, but felt odd to me.
 if ( any(grep("other",trg))) { cat("Y\n") } else { cat("N\n") }


 So, is the 'Wrap it in an any()' just normal R practice, and I'm too
 new to know it?  Is there a more fundamental dumb move I'm making?




 - Allen S. Rout





Re: [R] PCA with not non-negative definite covariance

2006-07-26 Thread Quin Wills
Thanks.

I suppose that another option could be just to use classical
multi-dimensional scaling. By my understanding this is (if based on
Euclidian measure) completely analogous to PCA, and because it's based
explicitly on distances, I could easily exclude the variables with NA's on a
pairwise basis when calculating the distances.

Quin

-Original Message-
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] 
Sent: 25 July 2006 09:24 AM
To: Quin Wills
Cc: r-help@stat.math.ethz.ch
Subject: Re: [R] PCA with not non-negative definite covariance

Hi , hi all,

 Am I correct to understand from the previous discussions on this topic (a
 few years back) that if I have a matrix with missing values my PCA options
 seem dismal if:
 (1) I don't want to impute the missing values.
 (2) I don't want to completely remove cases with missing values.
 (3) I do cov() with use="pairwise.complete.obs", as this produces
 negative eigenvalues (which it has in my case!).

(4) Maybe you can use the Non-linear Iterative Partial Least Squares
(NIPALS)
algorithm (intensively used in chemometry). S. Dray proposes a version of
this
procedure at http://pbil.univ-lyon1.fr/R/additifs.html.


Hope this help :)


Pierre



--
This message was sent from the IMP (Internet Messaging Program) webmail

-- 
No virus found in this incoming message.


 

--



Re: [R] implementing user defined covariance matirx

2006-07-26 Thread Thilo Kellermann
maybe the help of one of the corrStruct classes is of interest to you:
?corCompSymm
?corSymm
?corAR1
?corCar1
?corARMA
?corExp
?corGaus
?corLin
?corRatio
?corSphere
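
As a hedged sketch of how such a structure is plugged in (the data, the
variable names tim and peep, and the model are invented to mirror the
poster's form ~ tim/peep), e.g. corAR1 with gls from nlme:

```r
library(nlme)
set.seed(42)

## Toy longitudinal data: 10 subjects (peep), 5 time points (tim) each.
mydata <- data.frame(peep = rep(1:10, each = 5),
                     tim  = rep(1:5, times = 10))
mydata$y <- 0.5 * mydata$tim + rnorm(50)

## AR(1) within-subject correlation; a custom corStruct would be
## passed to `correlation` the same way.
fit <- gls(y ~ tim, data = mydata,
           correlation = corAR1(form = ~ tim | peep))
summary(fit)
```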

Good luck,
Thilo


On Wednesday 26 July 2006 17:34, Jonathan Smith wrote:
 I am trying to implement my own covariance matrix in R, then be able
 to use it in gls or lme or nlme to analyze some data.  I simply want to use
 corr=mymodel(form ~ tim/peep) the same way one can use corr=corAR1 and many
 others.  Having a terrible time trying to figure out how to do this.  I have
 found documentation saying that you can do this but can't find out how.  Any
 suggestions would be greatly appreciated.

 My covariance matrix is:
 | 1     s1^g  s2^g  s3^g  ... |
 | s1^g  1     s1^g  s2^g  ... |
 | s2^g  s1^g  1     s1^g  ... |
 | s3^g  s2^g  s1^g  1     ... |

 Thankyou kindly for your time
 Jonathan Smith

-- 

Thilo Kellermann
Department of Psychiatry und Psychotherapy
RWTH Aachen University
Pauwelstr. 30
52074 Aachen
Tel.: +49 (0)241 / 8089977
Fax.: +49 (0)241 / 8082401
E-Mail: [EMAIL PROTECTED]



Re: [R] PCA with not non-negative definite covariance

2006-07-26 Thread hadley wickham
 I suppose that another option could be just to use classical
 multi-dimensional scaling. By my understanding this is (if based on
 Euclidian measure) completely analogous to PCA, and because it's based
 explicitly on distances, I could easily exclude the variables with NA's on a
 pairwise basis when calculating the distances.

I don't think it as straightforward as that because distances
calculated on observations with missing values will be smaller than
other distances.  I suspect adjusting for this would be in some way
equivalent to imputation.

Exactly what do you want a low-dimensional representation of your data
set for?  (And why are you concerned about negative eigenvalues?)

Hadley



Re: [R] Multcomp

2006-07-26 Thread Nair, Murlidharan T
Let me clarify with a simpler example what I want to accomplish
library(multcomp)
data(recovery)
Dcirec <- simint(minutes~blanket, data=recovery, conf.level=0.9,
alternative="less")
out.data.mat <- with(Dcirec, data.frame(estimate, conf.int, p.value.raw =
c(p.value.raw), p.value.bon, p.value.adj))


I want to generate the same type of plot using out.data.mat that I get
by plot(Dcirec) 

How do I specify the plot method how the data in out.data.mat is to be
plotted? 

I am interested in doing this because, I am running about 1500 different
comparisons, which creates 1500 different objects. I need to analyze
them and combine significant ones into one plot. 






-Original Message-
From: Greg Snow [mailto:[EMAIL PROTECTED] 
Sent: Tuesday, July 25, 2006 12:12 PM
To: Nair, Murlidharan T
Subject: RE: [R] Multcomp

Doing:

 str( fungus.cirec )

Suggests that fungus.cirec$conf.int contains the confidence intervals,
you can manually plot the subset that you are intereseted in (and label
them whatever you want)
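
A minimal base-graphics sketch of that idea (the data frame `ci` with
estimate/lower/upper columns and its labels are invented; a real subset
would come out of the simint object's conf.int component):

```r
## Hypothetical subset of significant comparisons.
ci <- data.frame(label    = c("Birch-Oak", "Birch-Hornbeam"),
                 estimate = c(-6.2, 2.8),
                 lower    = c(-9.1, 0.1),
                 upper    = c(-3.3, 5.5))

n <- nrow(ci)
plot(ci$estimate, 1:n, xlim = range(ci$lower, ci$upper),
     yaxt = "n", xlab = "difference in means", ylab = "")
segments(ci$lower, 1:n, ci$upper, 1:n)       # the intervals
axis(2, at = 1:n, labels = ci$label, las = 1)
abline(v = 0, lty = 2)                       # reference line at zero
```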


-- 
Gregory (Greg) L. Snow Ph.D.
Statistical Data Center
Intermountain Healthcare
[EMAIL PROTECTED]
(801) 408-8111
 

-Original Message-
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] On Behalf Of Nair, Murlidharan
T
Sent: Saturday, July 22, 2006 11:00 AM
To: R-help@stat.math.ethz.ch
Subject: [R] Multcomp

Here it is again, hope this is more clear
 
I am using the following data (only a small subset is given):
 
Habitat Fungus.yield
Birch 20.83829053
Birch 22.9718181
Birch 22.28216829
Birch 24.23136797
Birch 22.32147961
Birch 20.30783598
Oak 27.24047258
Oak 29.7730014
Oak 30.12608508
Oak 25.76088669
Oak 30.14750974
Hornbeam 17.05307949
Hornbeam 15.32805111
Hornbeam 18.26920177
Hornbeam 21.30987049
Hornbeam 21.7173223

I am using the multcomp package to do multiple comparisons as follows 

library(multcomp) # loads the package

fungus <- read.table("fungi.txt", header=TRUE)  # Reads the data from file
saved as fungi.txt


fungus.cirec <- simint(Fungus.yield~Habitat,
data=fungus, conf.level=0.95, type=c("Tukey"))  # Computes simultaneous
intervals using Tukey's method


plot(fungus.cirec)   # plots the data

The plot function plots all the comparisons, I want to plot only part of
the data since it clutters the graph. 

How do I plot only part of the data ?

How do I tell it to mark the significant comparisons?

How do I get rid of the field names in the plot? For eg. The plot labels
are HabitatBirch-HabitatOak, I want it to be labeled as Birch-Oak.

 

Hope I have posted it according to the guidelines, let me know
otherwise. 

Cheers .../Murli




Re: [R] PCA with not non-negative definite covariance

2006-07-26 Thread Berton Gunter
Not sure what "completely analogous" means; MDS is nonlinear, PCA is linear.

In any case, the bottom line is that if you have high dimensional data with
many missing values, you cannot know what the multivariate distribution
looks like -- and you need a **lot** of data with many variables to usefully
characterize it anyway. So you must either make some assumptions about what
the distribution could be (including imputation methodology) or use any of
the many exploratory techniques available to learn what you can.
Thermodynamics holds -- you can't get something for nothing (you can't fool
Mother Nature).

-- Bert Gunter
Genentech Non-Clinical Statistics
South San Francisco, CA
 
The business of the statistician is to catalyze the scientific learning
process.  - George E. P. Box
 
 

 -Original Message-
 From: [EMAIL PROTECTED] 
 [mailto:[EMAIL PROTECTED] On Behalf Of Quin Wills
 Sent: Wednesday, July 26, 2006 8:44 AM
 To: [EMAIL PROTECTED]
 Cc: r-help@stat.math.ethz.ch
 Subject: Re: [R] PCA with not non-negative definite covariance
 
 Thanks.
 
 I suppose that another option could be just to use classical
 multi-dimensional scaling. By my understanding this is (if based on
 Euclidian measure) completely analogous to PCA, and because it's based
 explicitly on distances, I could easily exclude the variables 
 with NA's on a
 pairwise basis when calculating the distances.
 
 Quin
 
 -Original Message-
 From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] 
 Sent: 25 July 2006 09:24 AM
 To: Quin Wills
 Cc: r-help@stat.math.ethz.ch
 Subject: Re: [R] PCA with not non-negative definite covariance
 
 Hi , hi all,
 
  Am I correct to understand from the previous discussions on 
 this topic (a
  few years back) that if I have a matrix with missing values 
 my PCA options
  seem dismal if:
  (1) I don't want to impute the missing values.
  (2) I don't want to completely remove cases with missing values.
  (3) I do cov() with use="pairwise.complete.obs", as 
 this produces
  negative eigenvalues (which it has in my case!).
 
 (4) Maybe you can use the Non-linear Iterative Partial Least Squares
 (NIPALS)
 algorithm (intensively used in chemometry). S. Dray proposes 
 a version of
 this
 procedure at http://pbil.univ-lyon1.fr/R/additifs.html.
 
 
 Hope this help :)
 
 
 Pierre
 
 
 
 --
 
 This message was sent from the IMP (Internet 
 Messaging Program) webmail
 
 -- 
 No virus found in this incoming message.
 
 
  
 
 --
 




[R] odesolve loading problem

2006-07-26 Thread Ariel Chernomoretz
Hi,
I get the following error message when loading the package odesolve ( R 
2.2.1 - odesolve 0.5.14 - AMD64 - Linux Fedora Core 4 ) :

  library(odesolve)
Error in library.dynam(lib,package,package.lib) :
   shared library 'TRUE' not found
Error: package/namespace load failed for 'odesolve'

Any help would be greatly appreciated

Ariel./



Re: [R] Branching on 'grep' returns...

2006-07-26 Thread Thomas Lumley
On Wed, 26 Jul 2006, Allen S. Rout wrote:
 # These all fail with error 'argument is of length zero'
 # if ( grep("other",trg) ) { cat("Y\n") } else { cat("N\n") }
 # if ( grep("other",trg) == TRUE) { cat("Y\n") } else { cat("N\n") }
 # if ( grep("other",trg) == 1) { cat("Y\n") } else { cat("N\n") }


 # This says that the result is a numeric zero.   Shouldn't I be able
 #  to if on that, or at least compare it with a number?
 grep("other", trg)

It is numeric(0), that is, a zero-length vector of numbers. If you
compare it with a number you get a zero-length logical vector. You can't
get TRUE or FALSE, because a zero-length vector of 1s looks just like a
zero-length vector of 0s (or a zero-length vector of any other number).

In handling zero-length vectors (and in other vectorization contexts) it 
is useful to distinguish between vectorized functions, which return a 
vector of the same length as the input, and reducing functions, which 
return a vector of length 1.

The == operator is vectorized, but if() requires a condition of length 1, 
so they don't match.  The solution is to apply some reducing function. 
Two possible options are length() and (as you found) any().
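
For instance, a minimal illustration of both reducing approaches (the
vector trg below is a made-up stand-in for the original poster's data):

```r
# Hypothetical data standing in for the poster's 'trg'
trg <- c("alpha", "other", "beta")

# grep() returns integer(0) on no match, so reduce it to length 1 first:
if (length(grep("other", trg)) > 0) cat("Y\n") else cat("N\n")  # prints Y
if (any(grep("missing", trg) == 1)) cat("Y\n") else cat("N\n")  # prints N
```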

-thomas

Thomas Lumley   Assoc. Professor, Biostatistics
[EMAIL PROTECTED]   University of Washington, Seattle



Re: [R] Multcomp

2006-07-26 Thread Nair, Murlidharan T
Let me clarify with a simpler example what I want to accomplish
library(multcomp)
data(recovery)
Dcirec <- simint(minutes ~ blanket, data = recovery, conf.level = 0.9,
alternative = "less")
out.data.mat <- with(Dcirec, data.frame(estimate,
conf.int, p.value.raw = c(p.value.raw), p.value.bon, p.value.adj))


I want to generate the same type of plot using out.data.mat that I get
by plot(Dcirec) 

How do I specify the plot method how the data in out.data.mat is to be
plotted? 

I am interested in doing this because, I am running about 1500 different
comparisons, which creates 1500 different objects. I need to analyze
them and combine significant ones into one plot.

-Original Message-
From: Greg Snow [mailto:[EMAIL PROTECTED] 
Sent: Tuesday, July 25, 2006 12:12 PM
To: Nair, Murlidharan T
Subject: RE: [R] Multcomp

Doing:

> str(fungus.cirec)

Suggests that fungus.cirec$conf.int contains the confidence intervals;
you can manually plot the subset that you are interested in (and label
them whatever you want)


-- 
Gregory (Greg) L. Snow Ph.D.
Statistical Data Center
Intermountain Healthcare
[EMAIL PROTECTED]
(801) 408-8111
 

-Original Message-
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] On Behalf Of Nair, Murlidharan
T
Sent: Saturday, July 22, 2006 11:00 AM
To: R-help@stat.math.ethz.ch
Subject: [R] Multcomp

Here it is again, hope this is more clear
 
I am using the following data (only a small subset is given):
 
Habitat Fungus.yield
Birch 20.83829053
Birch 22.9718181
Birch 22.28216829
Birch 24.23136797
Birch 22.32147961
Birch 20.30783598
Oak 27.24047258
Oak 29.7730014
Oak 30.12608508
Oak 25.76088669
Oak 30.14750974
Hornbeam 17.05307949
Hornbeam 15.32805111
Hornbeam 18.26920177
Hornbeam 21.30987049
Hornbeam 21.7173223

I am using the multcomp package to do multiple comparisons as follows 

library(multcomp) # loads the package

fungus <- read.table("fungi.txt", header = TRUE)  # reads the data from
# the file saved as "fungi.txt"


fungus.cirec <- simint(Fungus.yield ~ Habitat, data = fungus,
conf.level = 0.95, type = "Tukey")  # computes simultaneous
# intervals using Tukey's method


plot(fungus.cirec)   # plots the data

The plot function plots all the comparisons; I want to plot only part of
the data, since the full set clutters the graph.

How do I plot only part of the data ?

How do I tell it to mark the significant comparisons?

How do I get rid of the field names in the plot? E.g. the plot labels
are HabitatBirch-HabitatOak; I want them labeled as Birch-Oak.

 

Hope I have posted it according to the guidelines, let me know
otherwise. 

Cheers .../Murli

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.



Re: [R] Multcomp

2006-07-26 Thread Gabor Grothendieck
Look through
   multcomp:::plot.hmtest
to find out which components of an hmtest object are actually used.
Now look at what an hmtest object looks like by doing this

dput(Dcirec)

or looking through the source of the function that produces hmtest
objects.  With this information in hand we can construct one from
out.data.mat:

my.hmtest <- structure(list(
  estimate = t(t(structure(out.data.mat[, "estimate"],
 .Names = rownames(out.data.mat)))),
  conf.int = out.data.mat[, 2:3],
  ctype = "Dunnett"),
  class = "hmtest")
plot(my.hmtest)

Note that this is a bit fragile: changes to the internal
representation of hmtest objects could cause your
object to cease working, although as long as those
changes do not affect the three components we are
using it should be OK.  By the way, I hard-coded
"Dunnett" above since ctype is not available
in out.data.mat .

On 7/26/06, Nair, Murlidharan T [EMAIL PROTECTED] wrote:
 Let me clarify with a simpler example what I want to accomplish
 library(multcomp)
 data(recovery)
 Dcirec <- simint(minutes ~ blanket, data = recovery, conf.level = 0.9,
 alternative = "less")
 out.data.mat <- with(Dcirec, data.frame(estimate,
 conf.int, p.value.raw = c(p.value.raw), p.value.bon, p.value.adj))


 I want to generate the same type of plot using out.data.mat that I get
 by plot(Dcirec)

 How do I specify the plot method how the data in out.data.mat is to be
 plotted?

 I am interested in doing this because, I am running about 1500 different
 comparisons, which creates 1500 different objects. I need to analyze
 them and combine significant ones into one plot.

 -Original Message-
 From: Greg Snow [mailto:[EMAIL PROTECTED]
 Sent: Tuesday, July 25, 2006 12:12 PM
 To: Nair, Murlidharan T
 Subject: RE: [R] Multcomp

 Doing:

  str( fungus.cirec )

 Suggests that fungus.cirec$conf.int contains the confidence intervals;
 you can manually plot the subset that you are interested in (and label
 them whatever you want)


 --
 Gregory (Greg) L. Snow Ph.D.
 Statistical Data Center
 Intermountain Healthcare
 [EMAIL PROTECTED]
 (801) 408-8111


 -Original Message-
 From: [EMAIL PROTECTED]
 [mailto:[EMAIL PROTECTED] On Behalf Of Nair, Murlidharan
 T
 Sent: Saturday, July 22, 2006 11:00 AM
 To: R-help@stat.math.ethz.ch
 Subject: [R] Multcomp

 Here it is again, hope this is more clear

 I am using the following data (only a small subset is given):

 Habitat Fungus.yield
 Birch 20.83829053
 Birch 22.9718181
 Birch 22.28216829
 Birch 24.23136797
 Birch 22.32147961
 Birch 20.30783598
 Oak 27.24047258
 Oak 29.7730014
 Oak 30.12608508
 Oak 25.76088669
 Oak 30.14750974
 Hornbeam 17.05307949
 Hornbeam 15.32805111
 Hornbeam 18.26920177
 Hornbeam 21.30987049
 Hornbeam 21.7173223

 I am using the multcomp package to do multiple comparisons as follows

 library(multcomp) # loads the package

 fungus <- read.table("fungi.txt", header = TRUE)  # reads the data from
 # the file saved as "fungi.txt"


 fungus.cirec <- simint(Fungus.yield ~ Habitat, data = fungus,
 conf.level = 0.95, type = "Tukey")  # computes simultaneous
 # intervals using Tukey's method


 plot(fungus.cirec)   # plots the data

 The plot function plots all the comparisons, I want to plot only part of
 the data since it clutters the graph.

 How do I plot only part of the data ?

 How do I tell it to mark the significant comparisons?

 How do I get rid of the field names in the plot? For eg. The plot labels
 are HabitatBirch-HabitatOak, I want it to be labeled as Birch-Oak.



 Hope I have posted it according to the guidelines, let me know
 otherwise.

 Cheers .../Murli

 __
 R-help@stat.math.ethz.ch mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.

 __
 R-help@stat.math.ethz.ch mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.




Re: [R] odesolve loading problem

2006-07-26 Thread Setzer . Woodrow
Hi Ariel,
The problem is that I specified the wrong dependency in the Depends
field of the package's DESCRIPTION file: I specified an R version of
at least 2.2.1, but that should have been 2.3.1.  You have two choices
-- upgrade your R to 2.3.1, or install odesolve 0.5.13.  I will send an
updated version of the package to CRAN, with a note to the CRAN
maintainers about the problem, but that won't help you if you need to
use R version 2.2.1.  There is an archive of older R packages on CRAN,
linked to at the bottom of the Contributed Packages page on CRAN.
Please accept my apology for the inconvenience -- I rushed through a
change requested by a user, and did not take time to fully appreciate
the consequences.
Woody

R. Woodrow Setzer, Ph. D.
National Center for Computational Toxicology
US Environmental Protection Agency
Mail Drop B205-01/US EPA/RTP, NC 27711
Ph: (919) 541-0128Fax: (919) 541-1194



From: Ariel Chernomoretz <[EMAIL PROTECTED]>
Sent by: [EMAIL PROTECTED]
To: r-help@stat.math.ethz.ch
Subject: [R] odesolve loading problem
Date: 07/28/2006 01:30 AM
Hi,
I get the following error message when loading the package odesolve ( R
2.2.1 - odesolve 0.5.14 - AMD64 - Linux Fedora Core 4 ) :

> library(odesolve)
Error in library.dynam(lib, package, package.lib) :
   shared library 'TRUE' not found
Error: package/namespace load failed for 'odesolve'

Any help would be greatly appreciated

Ariel./

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.



Re: [R] Multcomp

2006-07-26 Thread Gabor Grothendieck
Here is a minor simplication:

my.hmtest <- structure(list(
    estimate = t(t(out.data.mat[, "estimate", drop = FALSE])),
    conf.int = out.data.mat[, 2:3],
    ctype = "Dunnett"),
  class = "hmtest")
plot(my.hmtest)

On 7/26/06, Gabor Grothendieck [EMAIL PROTECTED] wrote:
 Look through
   multcomp:::plot.hmtest
 to find out which components of an hmtest object are actually used.
 Now look at what an hmtest object looks like by doing this

 dput(Dcirec)

 or looking through the source of the function that produces hmtest
 objects.  With this information in hand we can construct one from
 out.data.mat:

 my.hmtest <- structure(list(
   estimate = t(t(structure(out.data.mat[, "estimate"],
  .Names = rownames(out.data.mat)))),
   conf.int = out.data.mat[, 2:3],
   ctype = "Dunnett"),
   class = "hmtest")
 plot(my.hmtest)

 Note that this is a bit fragile since changes to the internal
 representation of hmtest objects could cause your
 object to cease working although as long as those
 changes do not affect the three components we are
 using it should be ok.  By the way I hard coded
 Dunnett above since ctype is not available
 in out.data.mat .

 On 7/26/06, Nair, Murlidharan T [EMAIL PROTECTED] wrote:
  Let me clarify with a simpler example what I want to accomplish
  library(multcomp)
  data(recovery)
  Dcirec <- simint(minutes ~ blanket, data = recovery, conf.level = 0.9,
  alternative = "less")
  out.data.mat <- with(Dcirec, data.frame(estimate,
  conf.int, p.value.raw = c(p.value.raw), p.value.bon, p.value.adj))
 
 
  I want to generate the same type of plot using out.data.mat that I get
  by plot(Dcirec)
 
  How do I specify the plot method how the data in out.data.mat is to be
  plotted?
 
  I am interested in doing this because, I am running about 1500 different
  comparisons, which creates 1500 different objects. I need to analyze
  them and combine significant ones into one plot.
 
  -Original Message-
  From: Greg Snow [mailto:[EMAIL PROTECTED]
  Sent: Tuesday, July 25, 2006 12:12 PM
  To: Nair, Murlidharan T
  Subject: RE: [R] Multcomp
 
  Doing:
 
   str( fungus.cirec )
 
  Suggests that fungus.cirec$conf.int contains the confidence intervals;
  you can manually plot the subset that you are interested in (and label
  them whatever you want)
 
 
  --
  Gregory (Greg) L. Snow Ph.D.
  Statistical Data Center
  Intermountain Healthcare
  [EMAIL PROTECTED]
  (801) 408-8111
 
 
  -Original Message-
  From: [EMAIL PROTECTED]
  [mailto:[EMAIL PROTECTED] On Behalf Of Nair, Murlidharan
  T
  Sent: Saturday, July 22, 2006 11:00 AM
  To: R-help@stat.math.ethz.ch
  Subject: [R] Multcomp
 
  Here it is again, hope this is more clear
 
  I am using the following data (only a small subset is given):
 
  Habitat Fungus.yield
  Birch 20.83829053
  Birch 22.9718181
  Birch 22.28216829
  Birch 24.23136797
  Birch 22.32147961
  Birch 20.30783598
  Oak 27.24047258
  Oak 29.7730014
  Oak 30.12608508
  Oak 25.76088669
  Oak 30.14750974
  Hornbeam 17.05307949
  Hornbeam 15.32805111
  Hornbeam 18.26920177
  Hornbeam 21.30987049
  Hornbeam 21.7173223
 
  I am using the multcomp package to do multiple comparisons as follows
 
  library(multcomp) # loads the package
 
  fungus <- read.table("fungi.txt", header = TRUE)  # reads the data from
  # the file saved as "fungi.txt"
 
 
  fungus.cirec <- simint(Fungus.yield ~ Habitat, data = fungus,
  conf.level = 0.95, type = "Tukey")  # computes simultaneous
  # intervals using Tukey's method
 
 
  plot(fungus.cirec)   # plots the data
 
  The plot function plots all the comparisons, I want to plot only part of
  the data since it clutters the graph.
 
  How do I plot only part of the data ?
 
  How do I tell it to mark the significant comparisons?
 
  How do I get rid of the field names in the plot? For eg. The plot labels
  are HabitatBirch-HabitatOak, I want it to be labeled as Birch-Oak.
 
 
 
  Hope I have posted it according to the guidelines, let me know
  otherwise.
 
  Cheers .../Murli
 
  __
  R-help@stat.math.ethz.ch mailing list
  https://stat.ethz.ch/mailman/listinfo/r-help
  PLEASE do read the posting guide
  http://www.R-project.org/posting-guide.html
  and provide commented, minimal, self-contained, reproducible code.
 
  __
  R-help@stat.math.ethz.ch mailing list
  https://stat.ethz.ch/mailman/listinfo/r-help
  PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
  and provide commented, minimal, self-contained, reproducible code.
 




[R] Bootstrap within litter

2006-07-26 Thread Antonio_Paredes
Hello everyone.

I have 6 to 10 strata with 6 to 12 subjects within each stratum. I would
like to do a bootstrap to compute a confidence interval for an estimator
which is a function of the Wilcoxon rank-sum test. Is there any function
in R to do this? Any reference would be helpful.
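
A sketch of one way to do this with the boot package, which resamples
within strata when its strata argument is given. The data, column names,
and the choice of the Wilcoxon rank-sum statistic W below are illustrative
assumptions, not the poster's actual setup:

```r
library(boot)  # ships with the standard R distribution

# Made-up example data: 8 strata, a two-level group, and a response
set.seed(1)
d <- data.frame(stratum = rep(1:8, each = 8),
                group   = factor(rep(c("A", "B"), 32)),
                y       = rnorm(64))

# Statistic recomputed on each resample: the Wilcoxon rank-sum statistic W
wstat <- function(data, i) {
  b <- data[i, ]
  unname(wilcox.test(y ~ group, data = b)$statistic)
}

# strata = d$stratum makes boot() resample within each stratum
bs <- boot(d, wstat, R = 199, strata = d$stratum)
boot.ci(bs, type = "perc")  # percentile bootstrap confidence interval
```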

Thank you

Tony.


[[alternative HTML version deleted]]



[R] memory problems when combining randomForests

2006-07-26 Thread Eleni Rapsomaniki
Dear all,

I am trying to train a randomForest using all my control data (12,000 cases, ~
20 explanatory variables, 2 classes). Because of memory constraints, I have
split my data into 7 subsets and trained a randomForest for each, hoping that
using combine() afterwards would solve the memory issue. Unfortunately,
combine() still runs out of memory. Is there anything else I can do? (I am not
using the formula version)
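
For reference, the split-and-combine pattern being described looks
roughly like this (the data and subset sizes are made up; this sketch
illustrates the pattern but does not by itself fix the memory problem):

```r
library(randomForest)

# Made-up two-class data standing in for the real control set
set.seed(1)
x <- data.frame(a = rnorm(300), b = rnorm(300))
y <- factor(x$a + rnorm(300) > 0)

# Train one forest per subset, then merge them with combine()
idx <- split(seq_len(nrow(x)), rep(1:3, length.out = nrow(x)))
forests <- lapply(idx, function(i) randomForest(x[i, ], y[i], ntree = 100))
rf <- do.call(combine, forests)
rf$ntree  # total number of trees in the merged forest
```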

Many Thanks
Eleni Rapsomaniki



[R] bug?

2006-07-26 Thread Patrick Jahn
Dear All,
if you generate a sequence on a fine lattice like:

x <- seq(0, 1, 0.005)

and you ask, for every point of this lattice, how many points lie in a
neighbourhood of radius 0.01 of that point:

v <- rep(0, length(x))
for (i in 1:length(x)) {
  v[i] <- length(x[abs(x - x[i]) < 0.01])
}

then the answer should be v = (2, 3, 3, 3, 3, ..., 3, 3, 3, 3, 2), because
every point except the two border points has 3 points in its
0.01-neighbourhood.

but v also contains many 4s and even 5s:

> v
  [1] 2 4 3 4 4 3 4 4 3 4 4 3 4 4 4 4 5 4 4 5 4 4 5 4 4 4 3 4 4 4 4 3 3 3 4 4 4
 [38] 4 3 3 4 4 4 4 3 3 4 4 4 4 3 3 3 3 3 3 4 4 4 4 3 3 3 3 3 3 3 3 3 4 4 4 4 3
 [75] 3 3 3 3 3 3 3 3 4 4 4 4 3 3 3 3 3 3 3 3 4 4 4 4 3 3 3 3 3 3 3 3 3 3 3 3 3
[112] 3 3 3 4 4 4 4 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 4 4 4 4 3 3 3 3 3
[149] 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 4 4 4 4 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3
[186] 3 3 3 3 3 4 4 4 4 3 3 3 3 3 3 2

Could anyone explain this fact and help me to compute this exactly on general data?

Thank you very much,
Patrick Jahn



[R] R vs. Stata

2006-07-26 Thread Hamilton, Cody

I have read some very good reviews comparing R (or Splus) to SAS.  Does
anyone know if there are any reviews comparing R (or Splus) to Stata?  I
am trying to get others to try R in my department, and I have never used
Stata.



Regards, -Cody



Cody Hamilton, Ph.D

Institute for Health Care Research and Improvement

Baylor Health Care System

(214) 265-3618





This e-mail, facsimile, or letter and any files or attachments transmitted with 
it contains information that is confidential and privileged. This information 
is intended only for the use of the individual(s) and entity(ies) to whom it is 
addressed. If you are the intended recipient, further disclosures are 
prohibited without proper authorization. If you are not the intended recipient, 
any disclosure, copying, printing, or use of this information is strictly 
prohibited and possibly a violation of federal or state law and regulations. If 
you have received this information in error, please notify Baylor Health Care 
System immediately at 1-866-402-1661 or via e-mail at [EMAIL PROTECTED] Baylor 
Health Care System, its subsidiaries, and affiliates hereby claim all 
applicable privileges related to this information.
[[alternative HTML version deleted]]



Re: [R] bug?

2006-07-26 Thread Thomas Lumley
On Wed, 26 Jul 2006, Patrick Jahn wrote:

 Dear All,
 if you generate a sequence on a fine lattice like:

 x <- seq(0, 1, 0.005)

 and you ask for all points of this lattice how many points are in a 
 neighbourhood with radius 0.01
 of each point:

 v <- rep(0, length(x))
 for (i in 1:length(x)) {
   v[i] <- length(x[abs(x - x[i]) < 0.01])
 }

 then the answer should be:  v = (2, 3, 3, 3, 3, ..., 3, 3, 3, 3, 2),
 because every point except the borders
 has 3 points in a 0.01-neighbourhood.

 but v contains also many 4 and also 5:

 v
  [1] 2 4 3 4 4 3 4 4 3 4 4 3 4 4 4 4 5 4 4 5 4 4 5 4 4 4 3 4 4 4 4 3 3 3 4 4 4
 [38] 4 3 3 4 4 4 4 3 3 4 4 4 4 3 3 3 3 3 3 4 4 4 4 3 3 3 3 3 3 3 3 3 4 4 4 4 3
 [75] 3 3 3 3 3 3 3 3 4 4 4 4 3 3 3 3 3 3 3 3 4 4 4 4 3 3 3 3 3 3 3 3 3 3 3 3 3
 [112] 3 3 3 4 4 4 4 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 4 4 4 4 3 3 3 3 
 3
 [149] 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 4 4 4 4 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 
 3
 [186] 3 3 3 3 3 4 4 4 4 3 3 3 3 3 3 2

 Could anyone explain this fact and help me to compute exactly on general 
 data.


Yes and no.

The fact is easily explained: 0.005 and 0.01 are not exactly representable 
in floating point, and so it will not be true for all x that 
x + 0.005 + 0.005 == x + 0.01. This is a FAQ.

For this problem an easy solution is to multiply by 200 (or 1000) and work 
with integers, which can be exactly represented.  There is no solution for 
general data, although software for arbitrary precision floating point may 
come close (there was a message yesterday from someone trying to interface 
pari/gp, which does this, with R).
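
Concretely, the integer rescaling looks like this (a sketch of the
workaround, not a change to the original poster's code):

```r
x <- seq(0, 1, 0.005)
v <- sapply(x, function(xi) sum(abs(x - xi) < 0.01))  # inexact: 4s and 5s appear

k  <- round(x * 200)                                  # exact integers 0..200
v2 <- sapply(k, function(ki) sum(abs(k - ki) < 2))    # radius 0.01 becomes 2
range(v2)  # now only 2 (at the two borders) and 3 (everywhere else)
```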



-thomas

Thomas Lumley   Assoc. Professor, Biostatistics
[EMAIL PROTECTED]   University of Washington, Seattle



Re: [R] (robust) mixed-effects model with covariate

2006-07-26 Thread Giuseppe Pagnoni
Dear Thilo,

many thanks for your reply.  I realized that there was an error in my 
formula which should have been:

aov(y ~ Group * (Time + Age) + Error (Subj/Time), data=df1)

or alternatively:

lme(RVP.A ~ Group*(Time+Age), random = ~ 1|Subj/Time,data=df1)

but I get different results in each case, and different again from the 
results of another stats program (JMP).
The problem is that I am not sure which one (if any) is correct!

Also, in the model you proposed:

lme(y ~ Group*Time, random = ~ age | Subj, data = df1)

it appears that age is not among the effects of interest, so I do not 
get an estimate of the significance of the Age or the Age*Group effect.

I have Pinheiro & Bates, and I read the first chapter, but it didn't seem 
to provide an example analogous to my case.  Also, it looks like it 
would take me some months to study the book thoroughly, and frankly that 
seems a bit excessive for such an (apparently?) simple problem; I was 
hoping somebody would magically provide the correct syntax :-)  !

thanks again anyway for your help

best regards

   giuseppe



Thilo Kellermann wrote:
 On Monday 24 July 2006 20:16, Giuseppe Pagnoni wrote:
   
 Dear all,

 First of all I apologize if you received this twice: I was checking the
 archive and I noticed that the text was scrubbed from the message,
 probably due to some setting in my e-mail program.


 I am unsure about how to specify a model in R and I thought of asking
 some advice to the list. I have two groups (Group= A, B) of subjects,
 with each subject undertaking a test before and after a certain
 treatment (Time= pre, post). Additionally, I want to enter
 the age of the subject as a covariate (the performance on the test is
 affected by age), and I also want to allow different slopes for the
 effect of age in the two groups of subjects (age might affect the
 performance of the two groups differentially).

 Is the right model to use something like the following?

 aov (y ~ Group*Time + Group*Age + Error(Subj/Group), data=df1 )

 (If I enter that command, within summary, I get the following:
 Error() model is singular in: aov(y ~ Group * Time + Group * Age +
 Error(Subj/Group), data = df1))

 
 try:
 aov(y~Group*Time*Age + Error(Subj*Time*Age), data = df1)
 which specifies an ANOVA (but not with mixed effects) with three main effects 
 and all interaction terms plus an error term that is independent between 
 groups (!) and relates to within subjects variability.

 For a real mixed effects analysis you should use the (n)lme function from 
 the nlme package and one possible model could look like this:

 lme(y ~ Group*Time, random = ~ age | Subj, data = df1)

 but the exact specification depends on your assumptions, in which it is 
 possible to specify two or three models and compare their fits with anova(). 
 For more information on mixed effects you should consult:
 Jose C. Pinheiro & Douglas M. Bates (2000) Mixed-Effects Models in S and 
 S-PLUS. Springer, New York.

 Good luck,
 Thilo

   
 As a second question: I have an outlier in one of the two groups. The
 outlier is not due to a measurement error but simply to the performance
 of the subject (possibly related to his medical history, but I have no
 way to determine that with certainty). This subject is
 signaled to be an outlier within its group: averaging the pre and post
 values for the performance of the subjects in his group, the Grubbs test
 yields a probability of 0.002 for the subject to be an outlier (the
 subject is marked as a significant outlier also if I
 perform the test separately on the pre and the post data).

 If I remove this subject from its group, I get significant effects of
 Group and Group X Age (not using the R formula above, but another stat
 software), but if I leave the subject in those effects disappear. Since
 I understand that removing outliers is always worrysome, I would like to
 know if it is possible in R to estimate a model similar to that outlined
 above but in a resistant/robust fashion, and what would be the actual
 syntax to do that. I will very much appreciate any help or suggestion
 about this.

 thanks in advance and best regards

 giuseppe
 

   


-- 
-
Giuseppe Pagnoni
Psychiatry and Behavioral Sciences
Emory University School of Medicine
1639 Pierce Drive, Suite 4000
Atlanta, GA, 30322
tel: 404.712.8431
fax: 404.727.3233



Re: [R] R vs. Stata

2006-07-26 Thread Patrick Burns
There is some discussion in:

http://www.burns-stat.com/pages/Tutor/R_relative_statpack.pdf

which can also be found at the UCLA website.

Patrick Burns
[EMAIL PROTECTED]
+44 (0)20 8525 0696
http://www.burns-stat.com
(home of "S Poetry" and "A Guide for the Unwilling S User")

Hamilton, Cody wrote:

I have read some very good reviews comparing R (or Splus) to SAS.  Does
anyone know if there are any reviews comparing R (or Splus) to Stata?  I
am trying to get others to try R in my department, and I have never used
Stata.



Regards, -Cody



Cody Hamilton, Ph.D

Institute for Health Care Research and Improvement

Baylor Health Care System

(214) 265-3618





This e-mail, facsimile, or letter and any files or attachme...{{dropped}}



[R] residual df in lmer and simulation results

2006-07-26 Thread Bill Shipley
Hello.  Douglas Bates has explained in a previous posting to R why he does
not output residual degrees of freedom, F values and probabilities in the
mixed model (lmer) function:  because the usual degrees of freedom (obs -
fixed df -1) are not exact and are really only upper bounds.  I am
interpreting what he said but I am not a professional statistician, so I
might be getting this wrong...
Does anyone know of any more recent results, perhaps from simulations, that
quantify the degree of bias that using such upper bounds for the denominator
degrees of freedom produces?  Is it possible to calculate a lower bound for
such degrees of freedom?

Thanks for any help.

Bill Shipley
North American Editor, Annals of Botany
Editor, Population and Community Biology series, Springer Publishing
Département de biologie, Université de Sherbrooke,
Sherbrooke (Québec) J1K 2R1 CANADA
[EMAIL PROTECTED]
http://pages.usherbrooke.ca/jshipley/recherche/



[R] How to split the left and right hand terms of a formula

2006-07-26 Thread Daniel Gerlanc
Hello All,

I've sent a few messages to the list regarding splitting a formula
into its right and left hand terms.  Thanks to everyone who has
responded.

I believe that the best way to extract the left and right hand terms
as character vectors follows:

library(nlme)

formula <- y ~ x + z

left.term  <- all.vars(getResponseFormula(formula))
covariates <- all.vars(getCovariateFormula(formula))
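
A base-R alternative, for what it's worth: a formula is a call whose
second element is the left-hand side and third element the right-hand
side, so nlme isn't strictly needed:

```r
f <- y ~ x + z
all.vars(f[[2]])  # left-hand-side variables:  "y"
all.vars(f[[3]])  # right-hand-side variables: "x" "z"
```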

Thanks!

Dan Gerlanc
Williams College



[R] arima() function - issues

2006-07-26 Thread Sachin J
Hi,
   
  My query is related to the arima() function in the stats package.
While searching the time-series literature I found the following link, which 
highlights a discrepancy in the arima() function when dealing with 
differenced time series. Is there a substitute for the sarima function 
mentioned on that website implemented in R? Any pointers would 
be of great help. 
   
  http://lib.stat.cmu.edu/general/stoffer/tsa2/Rissues.htm
   
  Thanx in advance.
  Sachin
   



[[alternative HTML version deleted]]



[R] ks.test exact p-value

2006-07-26 Thread Kyle Hall
R enthusiasts!

I have been simulating daily in-stream bacteria concentrations using a variety 
of scenarios.  I am using the ks.test (two sample,two-sided) for analysis. My 
data sets are both of equal size (n=64).

My question is this:  For the two sample, two sided ks.test, how is the exact 
P-value calculated?

I have not been able to find an explicit citation of how the P-value is 
calculated.  I have read the help file which cites three publications and 
assigns those pubs to one version or another of the ks.test (eg. one sample, 
one-sided, etc.).  I have also read the Conover book (referenced but not 
cited).  Conover's tables are adapted from a Birnbaum and Hall (1960) paper, 
and I have also found tables by Kim and Jennrich, but there seems to be 
some disagreement between these sources (with regard to critical D values and 
p-values).  I would like to compare these with the method used by R version 
2.3.0 on Windows XP.
Can anybody tell me the exact method for calculating the p-value for a 
two-sample, two sided ks.test?
Any help would greatly appreciated!

Kyle Hall
Graduate Research Assistant
Biological Systems Engineering
Virginia Tech
(540) 231-2083



[R] configure fails for R 2.3.1 on SunOS 5.8

2006-07-26 Thread Benjamin Tyner
Does this mean I need to use '--with-readline=no'? configure says:

checking build system type... sparc-sun-solaris2.8
checking host system type... sparc-sun-solaris2.8
loading site script './config.site'
loading build specific script './config.site'
checking for pwd... /usr/bin/pwd
checking whether builddir is srcdir... yes
checking for working aclocal... found
checking for working autoconf... found
checking for working automake... found
checking for working autoheader... found
checking for working makeinfo... found
checking for gawk... gawk
checking for egrep... egrep
checking whether ln -s works... yes
checking for ranlib... ranlib
checking for bison... bison -y
checking for ar... ar
checking for a BSD-compatible install... tools/install-sh -c
checking for sed... /usr/xpg4/bin/sed
checking for more... /usr/bin/more
checking for perl... no
checking for false... /usr/bin/false
configure: WARNING: you cannot build the object documentation system
checking for dvips... no
checking for tex... no
checking for latex... no
configure: WARNING: you cannot build DVI versions of the R manuals
checking for makeindex... no
checking for pdftex... no
checking for pdflatex... no
configure: WARNING: you cannot build PDF versions of the R manuals
checking for makeinfo... /p/gnu/makeinfo
checking for install-info... /p/gnu/install-info
checking for unzip... /usr/bin/unzip
checking for zip... /usr/bin/zip
checking for gzip... /usr/bin/gzip
checking for firefox... /p/firefox/firefox
using default browser ... /p/firefox/firefox
checking for acroread... no
checking for acroread4... no
checking for xpdf... no
checking for gv... /p/X11/gv
checking for gcc... gcc
checking for C compiler default output file name... a.out
checking whether the C compiler works... yes
checking whether we are cross compiling... no
checking for suffix of executables...
checking for suffix of object files... o
checking whether we are using the GNU C compiler... yes
checking whether gcc accepts -g... yes
checking for gcc option to accept ANSI C... none needed
checking how to run the C preprocessor... gcc -E
checking whether gcc needs -traditional... no
checking how to run the C preprocessor... gcc -E
checking for f95... f95
checking whether we are using the GNU Fortran 77 compiler... no
checking whether f95 accepts -g... yes
checking for g++... g++
checking whether we are using the GNU C++ compiler... yes
checking whether g++ accepts -g... yes
checking how to run the C++ preprocessor... g++ -E
checking whether __attribute__((visibility())) is supported... no
checking whether gcc accepts -fvisibility... no
checking whether f95 accepts -fvisibility... yes
checking for a sed that does not truncate output... /usr/xpg4/bin/sed
checking for ld used by gcc... /usr/ccs/bin/ld
checking if the linker (/usr/ccs/bin/ld) is GNU ld... no
checking for /usr/ccs/bin/ld option to reload object files... -r
checking for BSD-compatible nm... /usr/ccs/bin/nm -p
checking how to recognise dependent libraries... pass_all
checking for ANSI C header files... yes
checking for sys/types.h... yes
checking for sys/stat.h... yes
checking for stdlib.h... yes
checking for string.h... yes
checking for memory.h... yes
checking for strings.h... yes
checking for inttypes.h... yes
checking for stdint.h... no
checking for unistd.h... yes
checking dlfcn.h usability... yes
checking dlfcn.h presence... yes
checking for dlfcn.h... yes
checking the maximum length of command line arguments... 262144
checking command to parse /usr/ccs/bin/nm -p output from gcc object... ok
checking for objdir... .libs
checking for ranlib... (cached) ranlib
checking for strip... strip
checking if gcc static flag  works... yes
checking if gcc supports -fno-rtti -fno-exceptions... yes
checking for gcc option to produce PIC... -fPIC
checking if gcc PIC flag -fPIC works... yes
checking if gcc supports -c -o file.o... yes
checking whether the gcc linker (/usr/ccs/bin/ld) supports shared 
libraries... yes
checking whether -lc should be explicitly linked in... yes
checking dynamic linker characteristics... solaris2.8 ld.so
checking how to hardcode library paths into programs... immediate
checking whether stripping libraries is possible... no
checking if libtool supports shared libraries... yes
checking whether to build shared libraries... yes
checking whether to build static libraries... no
configure: creating libtool
appending configuration tag CXX to libtool
checking for ld used by g++... /usr/ccs/bin/ld
checking if the linker (/usr/ccs/bin/ld) is GNU ld... no
checking whether the g++ linker (/usr/ccs/bin/ld) supports shared 
libraries... yes
checking for g++ option to produce PIC... -fPIC
checking if g++ PIC flag -fPIC works... yes
checking if g++ supports -c -o file.o... yes
checking whether the g++ linker (/usr/ccs/bin/ld) supports shared 
libraries... yes
checking dynamic linker characteristics... solaris2.8 ld.so
checking how to hardcode library paths into programs... immediate
checking whether stripping libraries is 

[R] creating a color display matrix

2006-07-26 Thread Kartik Pappu
Hello all,

I am trying to use R to create a colored data matrix.  I have my data 
which (after certain steps of normalization and log transformation) 
gives me an x by y matrix of numbers between 0 and -5.  I want to be 
able to create from this matrix of numbers an x by y image (box) that 
contains x*y squares and somehow uses the value in the original matrix 
to come up with a color that corresponds to the number.  Hence each box 
will be colored on a scale between 0 and -5.  For example my data could 
look like this:

          X1.FcH      X2.FcH      X3.FcH     X4.FcH      X5.FcH    X6.FcH       X7.FcH
1-AP  0.09667593 -4.66298640 -1.28299697 -4.8739017 -4.95862831 -5.178603 -4.878524750
2-AP -4.69186869 -0.08547776 -4.56495440 -4.8348255 -4.80256152 -5.121531 -4.894347108
3-AP -1.71380667 -4.52626124 -0.06810053 -4.8703810 -4.65657593 -5.024595 -4.824712621
4-AP -4.47968850 -4.48604718 -4.44314403 -0.1569536 -4.86436977 -4.988196 -4.550416356
5-AP -4.64616469 -4.5307     -4.78163386 -4.9162949 -0.01729274 -5.061663 -0.769960777
6-AP -4.61047573 -4.60917414 -4.72514817 -5.0084772 -4.87797740 -0.284934 -1.782745357
7-AP -4.48157167 -4.61850313 -4.72241281 -4.8694868 -1.66122821 -3.887898 -0.002522157

How do I make a 7 x 7 box that has 49 squares, where each square has a 
color in the RGB spectrum that corresponds to the value?  So, for 
example, in the matrix above the biggest number, 0.0967 (at 
position 1, 1), could be set to RED and the smallest number, -5.0085 (at 
position 6, 4), could be set to BLUE, with all the other numbers 
shaded on a scale going from Red to Blue.

I hope this problem makes sense.  I am rather new to R and was 
wondering if there was a function or solution to this problem out 
there.

Thanks
Kartik


--
IMPORTANT WARNING:  This email (and any attachments) is only intended for the 
use of the person or entity to which it is addressed, and may contain 
information that is privileged and confidential.  You, the recipient, are 
obligated to maintain it in a safe, secure and confidential manner.  
Unauthorized redisclosure or failure to maintain confidentiality may subject 
you to federal and state penalties. If you are not the recipient, please 
immediately notify us by return email, and delete this message from your 
computer.

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] creating a color display matrix

2006-07-26 Thread davidr
?image has worked for me.

David L. Reiner
Rho Trading Securities, LLC
Chicago  IL  60605
312-362-4963
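A minimal sketch of the image() approach, with a simulated 7 x 7 matrix standing in for the poster's data (the palette endpoints and the orientation fix are assumptions to adapt):

```r
# Simulated stand-in for a 7 x 7 matrix of values in [-5, 0]
set.seed(1)
m <- matrix(runif(49, -5, 0), nrow = 7)
# 100-step palette: smallest values map to blue, largest to red
pal <- colorRampPalette(c("blue", "red"))(100)
# image() draws columns left-to-right from the bottom row up, so
# transpose and reverse the rows to match the printed orientation
image(t(m[nrow(m):1, ]), col = pal, axes = FALSE)
```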


__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] configure fails for R 2.3.1 on SunOS 5.8

2006-07-26 Thread Peter Dalgaard
Benjamin Tyner [EMAIL PROTECTED] writes:

 Does this mean I need to use '--with-readline=no'? configure says:


 configure: error: --with-readline=yes (default) and headers/libs are not 
 available

Yes, or that you need to install readline headers/libs (in a
sufficiently recent version), or that you have installed them, but not
where R looks for them, so that you need to specify the location.
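For example (a sketch; the /usr/local paths are assumptions for a typical setup with readline installed in a non-default location):

```shell
# point configure at a non-default readline installation
CPPFLAGS="-I/usr/local/include" LDFLAGS="-L/usr/local/lib" ./configure
# or build without readline support entirely
./configure --with-readline=no
```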

-- 
   O__   Peter Dalgaard Øster Farimagsgade 5, Entr.B
  c/ /'_ --- Dept. of Biostatistics PO Box 2099, 1014 Cph. K
 (*) \(*) -- University of Copenhagen   Denmark  Ph:  (+45) 35327918
~~ - ([EMAIL PROTECTED])  FAX: (+45) 35327907

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] configure fails for R 2.3.1 on SunOS 5.8

2006-07-26 Thread Benjamin Tyner
Thanks; configure completes successfully using --with-readline=no. 
However, toward the end of running make, it reports

mkdir ../share/locale/[EMAIL PROTECTED]
mkdir ../share/locale/[EMAIL PROTECTED]/LC_MESSAGES
  [EMAIL PROTECTED]
you should 'make docs' now ...
*** Error code 255
make: Fatal error: Command failed for target `R.1'
Current working directory ~/btyner/R-2.3.1/doc
*** Error code 1 (ignored)
building all R object docs (text, HTML, LaTeX, examples)
you need Perl version 5 to build the R object docs
*** Error code 1
make: Fatal error: Command failed for target `help-indices'
Current working directory ~/btyner/R-2.3.1/src/library
*** Error code 1
make: Fatal error: Command failed for target `docs'
Current working directory ~/btyner/R-2.3.1/src/library
*** Error code 1 (ignored)
begin installing recommended package VR

I do not have perl installed, as I thought one could install sans 
documentation.

Ben


Peter Dalgaard wrote:

 Benjamin Tyner [EMAIL PROTECTED] writes:

  Does this mean I need to use '--with-readline=no'? configure says:

  configure: error: --with-readline=yes (default) and headers/libs are not 
  available

 Yes, or that you need to install readline headers/libs (in a
 sufficiently recent version), or that you have installed them, but not
 where R looks for them, so that you need to specify the location.

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] PCA with not non-negative definite covariance

2006-07-26 Thread Quin Wills
My apologies (in response to the last 2 replies). I should write sensibly -
including subject titles that make grammatical sense.

(1) By 'analogous', I mean that using classical MDS with a Euclidean distance is
equivalent to plotting the first k principal components.
(2) Agreed re. distribution assumptions.
(3) Agreed re. the need to use some kind of imputation for calculating
distances. I'm thinking pairwise exclusion for correlation.

Re. why I want to do this is simply for graphically representing my data.

Quin
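For concreteness, the MDS route described in (1) and (3) can be sketched as follows on simulated data (the NA guard is an addition of this sketch, since cmdscale() itself does not accept missing distances):

```r
set.seed(1)
X <- matrix(rnorm(100), nrow = 20)    # 20 cases, 5 variables
X[sample(length(X), 10)] <- NA        # sprinkle in missing values
# dist() excludes NAs pairwise and rescales by the columns actually used
d <- dist(X)
d[is.na(d)] <- max(d, na.rm = TRUE)   # guard: cmdscale() cannot take NA
mds <- cmdscale(d, k = 2)             # first two principal coordinates
plot(mds, xlab = "Coord 1", ylab = "Coord 2")
```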



-Original Message-
From: Berton Gunter [mailto:[EMAIL PROTECTED] 
Sent: 26 July 2006 05:10 PM
To: 'Quin Wills'; [EMAIL PROTECTED]
Cc: r-help@stat.math.ethz.ch
Subject: RE: [R] PCA with not non-negative definite covariance

Not sure what completely analogous means; MDS is nonlinear, PCA is linear.

In any case, the bottom line is that if you have high dimensional data with
many missing values, you cannot know what the multivariate distribution
looks like -- and you need a **lot** of data with many variables to usefully
characterize it anyway. So you must either make some assumptions about what
the distribution could be (including imputation methodology) or use any of
the many exploratory techniques available to learn what you can.
Thermodynamics holds -- you can't get something for nothing (you can't fool
Mother Nature).

-- Bert Gunter
Genentech Non-Clinical Statistics
South San Francisco, CA
 
The business of the statistician is to catalyze the scientific learning
process.  - George E. P. Box
 
 

 -Original Message-
 From: [EMAIL PROTECTED] 
 [mailto:[EMAIL PROTECTED] On Behalf Of Quin Wills
 Sent: Wednesday, July 26, 2006 8:44 AM
 To: [EMAIL PROTECTED]
 Cc: r-help@stat.math.ethz.ch
 Subject: Re: [R] PCA with not non-negative definite covariance
 
 Thanks.
 
 I suppose that another option could be just to use classical
 multi-dimensional scaling. By my understanding this is (if based on a
 Euclidean measure) completely analogous to PCA, and because it's based
 explicitly on distances, I could easily exclude the variables 
 with NA's on a
 pairwise basis when calculating the distances.
 
 Quin
 
 -Original Message-
 From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] 
 Sent: 25 July 2006 09:24 AM
 To: Quin Wills
 Cc: r-help@stat.math.ethz.ch
 Subject: Re: [R] PCA with not non-negative definite covariance
 
 Hi , hi all,
 
  Am I correct to understand from the previous discussions on 
 this topic (a
  few years back) that if I have a matrix with missing values 
 my PCA options
  seem dismal if:
  (1) I don’t want to impute the missing values.
  (2) I don’t want to completely remove cases with missing values.
  (3) I do cov() with use="pairwise.complete.obs", as this produces
  negative eigenvalues (which it has in my case!).
 
 (4) Maybe you can use the Non-linear Iterative Partial Least Squares
 (NIPALS)
 algorithm (intensively used in chemometry). S. Dray proposes 
 a version of
 this
 procedure at http://pbil.univ-lyon1.fr/R/additifs.html.
 
 
 Hope this help :)
 
 
 Pierre
 
 
 
 --
 
  This message was sent from the IMP (Internet Messaging Program) webmail
 
 -- 
 No virus found in this incoming message.
 
 
  
 
 --
 
 __
 R-help@stat.math.ethz.ch mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide 
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.
 


__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] RODBC on linux

2006-07-26 Thread Armstrong, Whit
Anyone out there using Linux RODBC and unixODBC to connect to a
Microsoft SQL server?

If possible can someone post a sample .odbc.ini file?

I saw a few discussions on the archives a few years ago, but no config
file details were available.

Thanks,
Whit





This e-mail message is intended only for the named recipient(s) above. It may 
contain confidential information. If you are not the intended recipient you are 
hereby notified that any dissemination, distribution or copying of this e-mail 
and any attachment(s) is strictly prohibited. If you have received this e-mail 
in error, please immediately notify the sender by replying to this e-mail and 
delete the message and any attachment(s) from your system. Thank you.


[[alternative HTML version deleted]]

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Codes; White's heteroscedasticity test and GARCH models

2006-07-26 Thread Spiros Mesomeris
Hello,
   
  I have just recently started using R and was wondering whether anybody had a 
code written for White's heteroscedasticity correction for standard errors.
   
  Also, can anybody share a code for the GARCH(1,1) and GARCH-in-mean models 
for modelling regression residuals?
   
   
  Thanks a lot in advance,
  Spyros


-

[[alternative HTML version deleted]]

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Codes; White's heteroscedasticity test and GARCH models

2006-07-26 Thread Kerpel, John
Check tseries and fSeries packages for GARCH

-Original Message-
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] On Behalf Of Spiros Mesomeris
Sent: Wednesday, July 26, 2006 5:00 PM
To: r-help@stat.math.ethz.ch
Subject: [R] Codes; White's heteroscedasticity test and GARCH models

Hello,
   
  I have just recently started using R and was wondering whether anybody
had a code written for White's heteroscedasticity correction for
standard errors.
   
  Also, can anybody share a code for the GARCH(1,1) and GARCH-in-mean
models for modelling regression residuals?
   
   
  Thanks a lot in advance,
  Spyros


-

[[alternative HTML version deleted]]

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Codes; White's heteroscedasticity test and GARCH models

2006-07-26 Thread Achim Zeileis
Spyros:

   I have just recently started using R and was wondering whether anybody
   had a code written for White's heteroscedasticity correction for
   standard errors.

See package sandwich, particularly functions vcovHC() and sandwich().
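For instance (a sketch, assuming the sandwich and lmtest packages are installed; the lm fit on the built-in cars data is just a stand-in for the poster's regression):

```r
library(sandwich)
library(lmtest)
fit <- lm(dist ~ speed, data = cars)
# White's heteroskedasticity-consistent covariance matrix
V <- vcovHC(fit, type = "HC0")
coeftest(fit, vcov = V)   # coefficient tests with corrected SEs
```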

   Also, can anybody share a code for the GARCH(1,1) and GARCH-in-mean
   models for modelling regression residuals?

See function garch() in package tseries.
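A sketch of a garch() call on a simulated GARCH(1,1) series (tseries must be installed; note garch() fits GARCH(p, q) but, to my knowledge, not GARCH-in-mean):

```r
library(tseries)
set.seed(123)
n <- 1000
e <- rnorm(n)
x <- numeric(n); h <- numeric(n)
h[1] <- 0.1; x[1] <- e[1] * sqrt(h[1])
for (t in 2:n) {                       # simulate a GARCH(1,1) process
  h[t] <- 0.01 + 0.10 * x[t - 1]^2 + 0.80 * h[t - 1]
  x[t] <- e[t] * sqrt(h[t])
}
g <- garch(x, order = c(1, 1), trace = FALSE)
coef(g)   # estimated omega (a0), alpha1 (a1), beta1 (b1)
```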

Furthermore, the econometrics and finance task views might be helpful for
you:
  http://CRAN.R-project.org/src/contrib/Views/Econometrics.html
  http://CRAN.R-project.org/src/contrib/Views/Finance.html

hth,
Z

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Overplotting: plot() invocation looks ugly ... suggestions?

2006-07-26 Thread John McHenry
Hi Hadley,

Thanks for your suggestion.

The description of ggplot states:

Description:   "... It combines the advantages of both base and lattice
   graphics ... and you can still build up a plot step by
   step from multiple data sources"

So I thought I'd try to enhance the plot by adding in the means from each 
quarter (this is snagged directly from ESS):

   qplot(Quarter, Consumption, data=data, type=c("point","line"), id=data$Year)
   ( mean.per.quarter <- with(data, tapply(Consumption, Quarter, mean)) )
   points(mean.per.quarter, pch="+", cex=2.0)

> qplot(Quarter, Consumption, data=data, type=c("point","line"), id=data$Year)
> ( mean.per.quarter <- with(data, tapply(Consumption, Quarter, mean)) )
    1     2     3     4 
888.2 709.2 616.4 832.8 
> points(mean.per.quarter, pch="+", cex=2.0)
Error in plot.xy(xy.coords(x, y), type = type, ...) : 
	plot.new has not been called yet
>
>

Now I'm wet behind the ears when it comes to R, so I'm guessing that there is 
some major conflict between base graphics and lattice graphics, which I thought 
ggplot avoided, given the library help blurb.

I'm assuming that there must be a way to add points / lines to lattice / ggplot 
graphics (in the latter case it seems to be via ggpoint, or some such)? But is 
there a way that allows me to add via:

points(mean.per.quarter, pch="+", cex=2.0)

and similar, or do I have to learn the lingo for lattice / ggplot?

Thanks,

Jack.



hadley wickham [EMAIL PROTECTED] wrote:  And if lattice is ok then try this:

 library(lattice)
 xyplot(Consumption ~ Quarter, group = Year, data, type = "o")

Or you can use ggplot:

install.packages("ggplot")
library(ggplot)
qplot(Quarter, Consumption, data=data, type=c("point","line"), id=data$Year)

Unfortunately this has uncovered a couple of small bugs for me to fix
(no automatic legend, and have to specify the data frame explicitly)

The slightly more verbose example below shows you what it should look like.

data$Year <- factor(data$Year)
p <- ggplot(data, aes=list(x=Quarter, y=Consumption, id=Year, colour=Year))
ggline(ggpoint(p), size=2)

Regards,

Hadley



-

[[alternative HTML version deleted]]

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] R vs. Stata

2006-07-26 Thread Hamilton, Cody

Thanks Patrick! 
-Cody

-Original Message-
From: Patrick Burns [mailto:[EMAIL PROTECTED]
Sent: Wednesday, July 26, 2006 13:35 PM
To: Hamilton, Cody
Cc: r-help@stat.math.ethz.ch
Subject: Re: [R] R vs. Stata

There is some discussion in:

http://www.burns-stat.com/pages/Tutor/R_relative_statpack.pdf

which can also be found at the UCLA website.

Patrick Burns
[EMAIL PROTECTED]
+44 (0)20 8525 0696
http://www.burns-stat.com
(home of S Poetry and A Guide for the Unwilling S User)

Hamilton, Cody wrote:

I have read some very good reviews comparing R (or Splus) to SAS.  Does
anyone know if there are any reviews comparing R (or Splus) to Stata?  I
am trying to get others to try R in my department, and I have never used
Stata.



Regards, -Cody



Cody Hamilton, Ph.D

Institute for Health Care Research and Improvement

Baylor Health Care System

(214) 265-3618





This e-mail, facsimile, or letter and any files or attachments
transmitted with it contains information that is confidential and
privileged. This information is intended only for the use of the
individual(s) and entity(ies) to whom it is addressed. If you are the
intended recipient, further disclosures are prohibited without proper
authorization. If you are not the intended recipient, any disclosure,
copying, printing, or use of this information is strictly prohibited and
possibly a violation of federal or state law and regulations. If you
have received this information in error, please notify Baylor Health
Care System immediately at 1-866-402-1661 or via e-mail at
[EMAIL PROTECTED] Baylor Health Care System, its subsidiaries,
and affiliates hereby claim all applicable privileges related to this
information.
   [[alternative HTML version deleted]]

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


 


This e-mail, facsimile, or letter and any files or attachmen...{{dropped}}

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] memory problems when combining randomForests [Broadcast]

2006-07-26 Thread Liaw, Andy
You need to give us more details, like how you call randomForest, versions
of the package and R itself, etc.  Also, see if this helps you:
http://finzi.psych.upenn.edu/R/Rhelp02a/archive/32918.html

Andy
 
From: Eleni Rapsomaniki
 
 Dear all,
 
 I am trying to train a randomForest using all my control data 
 (12,000 cases, ~ 20 explanatory variables, 2 classes). 
 Because of memory constraints, I have split my data into 7 
 subsets and trained a randomForest for each, hoping that 
 using combine() afterwards would solve the memory issue. 
 Unfortunately,
 combine() still runs out of memory. Is there anything else I 
 can do? (I am not using the formula version)
 
 Many Thanks
 Eleni Rapsomaniki
 
 __
 R-help@stat.math.ethz.ch mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide 
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.
 


__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Overplotting: plot() invocation looks ugly ... suggestions?

2006-07-26 Thread Gabor Grothendieck
With the lattice package it would be done like this (where
the panel.points function places large red pluses on
the plot):

xyplot(Consumption ~ Quarter, group = Year, data, type = "o")
trellis.focus("panel", 1, 1)
panel.points(1:4, mean.per.quarter, pch = "+", cex = 2, col = "red")
trellis.unfocus()




__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] seq unexpected behavior

2006-07-26 Thread Vries, Han de
seq(0.1, 0.9 - 0.8, by = 0.1) gives the following error message:

Error in seq.default(0.1, 0.9 - 0.8, by = 0.1) : 
wrong sign in 'by' argument

but seq(0.1, 0.8 - 0.7, by = 0.1) gives
[1] 0.1
(no error message)

Why do I get an error message in the first case?
Han



 sessionInfo()
R version 2.2.1, 2005-12-20, i386-pc-mingw32

attached base packages:
[1] methods   stats graphics  grDevices utils
datasets 
[7] base   

(NB I also tried version 2.3.1 and got the same result - both versions
are precompiled)
  
 Sys.getlocale()
[1] LC_COLLATE=English_United States.1252;LC_CTYPE=English_United
States.1252;LC_MONETARY=English_United
States.1252;LC_NUMERIC=C;LC_TIME=English_United States.1252




This email message is for the sole use of the intended recip...{{dropped}}

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] seq unexpected behavior

2006-07-26 Thread Marc Schwartz
On Wed, 2006-07-26 at 18:35 -0700, Vries, Han de wrote:
 seq(0.1, 0.9 - 0.8, by = 0.1) gives the following error message:
 
 Error in seq.default(0.1, 0.9 - 0.8, by = 0.1) : 
 wrong sign in 'by' argument
 
 but seq(0.1, 0.8 - 0.7, by = 0.1) gives
 [1] 0.1
 (no error message)
 
 Why do I get an error message in the first case?
 Han


See R FAQ 7.31 Why doesn't R think these numbers are equal?

> print(0.9 - 0.8, 20)
[1] 0.099999999999999977796

> print(0.8 - 0.7, 20)
[1] 0.10000000000000008882


In the first case, the result of the subtraction is slightly less than
0.1, resulting in a negative interval. In the second case, it is
slightly greater than 0.1, which is OK.
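The effect is easy to check directly, and all.equal() is the tolerance-based comparison FAQ 7.31 recommends:

```r
# exact binary comparison fails because 0.9 - 0.8 is not exactly 0.1
(0.9 - 0.8) == 0.1                     # FALSE
# comparison with a tolerance succeeds
isTRUE(all.equal(0.9 - 0.8, 0.1))      # TRUE
# adding a little slack to 'to' makes the original seq() call work
seq(0.1, 0.9 - 0.8 + 1e-8, by = 0.1)   # 0.1
```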

HTH,

Marc Schwartz

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Non-parametric four-way interactions?

2006-07-26 Thread Paul Smith
Dear All

I am trying to study four-way interactions in an ANOVA problem.
However, qqnorm+qqline result

(at http://phhs80.googlepages.com/qqnorm.png)

is not promising regarding the normality of data (960 observations).
The result of Shapiro-Wilk test is also not encouraging:

W = 0.9174, p-value < 2.2e-16

(I am aware of the fact that normality tests tend to reject normality
for large samples.)

By the way, the histogram is at:

http://phhs80.googlepages.com/hist.png

To circumvent the problem, I looked for non-parametric tests, but I
found nothing except the article:

http://www.pgia.ac.lk/socs/asasl/journal_papers/PDFformat/g.bakeerathanpaper-2.pdf

Finally, my question is: does R implement non-parametric tests that
avoid the normality assumption required to study four-way
interactions?

Thanks in advance,

Paul



Re: [R] RODBC on linux

2006-07-26 Thread Marc Schwartz
On Wed, 2006-07-26 at 17:52 -0400, Armstrong, Whit wrote:
> Anyone out there using Linux RODBC and unixODBC to connect to a
> Microsoft SQL server?
>
> If possible can someone post a sample .odbc.ini file?
>
> I saw a few discussions on the archives a few years ago, but no config
> file details were available.
>
> Thanks,
> Whit

Whit,

Do you have a Linux ODBC driver for SQL Server?  unixODBC is simply the
driver manager, not the driver itself.

MS does not offer (not surprisingly) an ODBC driver for Unix/Linux.
There are resources available however and these might be helpful:

http://www.sommarskog.se/mssql/unix.html

Note that Easysoft provides (at a cost) an ODBC-ODBC bridge for
Unix/Linux platforms which supports ODBC connections to SQL Server:

http://www.easysoft.com/products/data_access/odbc_odbc_bridge/index.html

I am using RODBC to connect from a FC5 system to an Oracle 10g server
running on RHEL, however Oracle provides the ODBC driver for Linux that
can work with the unixODBC facilities.

Also, note that there is a R-sig-DB e-mail list:

https://stat.ethz.ch/mailman/listinfo/r-sig-db

HTH,

Marc Schwartz



Re: [R] RODBC on linux

2006-07-26 Thread Dirk Eddelbuettel

On 26 July 2006 at 20:56, Marc Schwartz wrote:
| On Wed, 2006-07-26 at 17:52 -0400, Armstrong, Whit wrote:
| > Anyone out there using Linux RODBC and unixODBC to connect to a
| > Microsoft SQL server?
[...]
| Do you have a Linux ODBC driver for SQL Server?  unixODBC is simply the
| driver manager, not the driver itself.
|
| MS does not offer (not surprisingly) an ODBC driver for Unix/Linux.

But there is the FreeTDS project (package freetds-dev in Debian) with its
associated ODBC driver (package tdsodbc, from the FreeTDS sources). At some
point a few years ago, a colleague and I were trying to coax that and
unixODBC to let R (on Solaris) talk to Sybase (on Solaris) and got it to
work.  MS-SQL is (AFAIK) a descendant of Sybase code originally licensed by
MS, hence the common FreeTDS code lineage.  So it should be doable.

Luckily I haven't needed to talk to MS SQL myself so the usual grain of salt
alert...  And sorry, hence no working .odbc.ini to share.

Dirk

-- 
Hell, there are no rules here - we're trying to accomplish something. 
  -- Thomas A. Edison
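
Since the original question asked for a sample .odbc.ini, here is a
hypothetical sketch of what the FreeTDS route's configuration could look
like; every path, host name, DSN and TDS version below is a placeholder,
not a tested setup:

```ini
# /etc/odbc.ini -- hypothetical DSN using the FreeTDS ODBC driver
[mssql_example]
Description = Placeholder MS SQL Server DSN
Driver      = /usr/lib/odbc/libtdsodbc.so
Servername  = mssql_example
Database    = mydatabase

# /etc/freetds.conf -- matching server section
[mssql_example]
        host = sqlserver.example.com
        port = 1433
        tds version = 7.0
```

From R one would then try something along the lines of
odbcConnect("mssql_example", uid = "user", pwd = "pass") from RODBC;
whether it works depends entirely on the driver build.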



Re: [R] RODBC on linux

2006-07-26 Thread Marc Schwartz
On Wed, 2006-07-26 at 21:38 -0500, Dirk Eddelbuettel wrote:
> On 26 July 2006 at 20:56, Marc Schwartz wrote:
> | On Wed, 2006-07-26 at 17:52 -0400, Armstrong, Whit wrote:
> | > Anyone out there using Linux RODBC and unixODBC to connect to a
> | > Microsoft SQL server?
> [...]
> | Do you have a Linux ODBC driver for SQL Server?  unixODBC is simply the
> | driver manager, not the driver itself.
> |
> | MS does not offer (not surprisingly) an ODBC driver for Unix/Linux.
>
> But there is the FreeTDS project (package freetds-dev in Debian) with its
> associated ODBC driver (package tdsodbc, from the FreeTDS sources). At some
> point a few years ago, a colleague and I were trying to coax that and
> unixODBC to let R (on Solaris) talk to Sybase (on Solaris) and got it to
> work.  MS-SQL is (AFAIK) a descendant of Sybase code originally licensed by
> MS, hence the common FreeTDS code lineage.  So it should be doable.
>
> Luckily I haven't needed to talk to MS SQL myself so the usual grain of salt
> alert...  And sorry, hence no working .odbc.ini to share.

FreeTDS was one of the options listed on the first URL that I had
included.  :-)

Here is the direct link:

  http://www.freetds.org/ 

Regards,

Marc



Re: [R] Non-parametric four-way interactions?

2006-07-26 Thread Frank E Harrell Jr
Paul Smith wrote:
> Dear All
>
> I am trying to study four-way interactions in an ANOVA problem.
> However, qqnorm+qqline result
>
> (at http://phhs80.googlepages.com/qqnorm.png)
>
> is not promising regarding the normality of data (960 observations).
> The result of Shapiro-Wilk test is also not encouraging:
>
> W = 0.9174, p-value < 2.2e-16
>
> (I am aware of the fact that normality tests tend to reject normality
> for large samples.)
>
> By the way, the histogram is at:
>
> http://phhs80.googlepages.com/hist.png
>
> To circumvent the problem, I looked for non-parametric tests, but I
> found nothing except the article:
>
> http://www.pgia.ac.lk/socs/asasl/journal_papers/PDFformat/g.bakeerathanpaper-2.pdf
>
> Finally, my question is: does R implement non-parametric tests that
> avoid the normality assumption required to study four-way
> interactions?
>
> Thanks in advance,
>
> Paul

Yes, although I seldom want to look at 4th order interactions.  You can 
fit a proportional odds model for an ordinal response which is a 
generalization of the Wilcoxon/Kruskal-Wallis approach, and allows one 
to have N-1 intercepts in the model when there are N data points (i.e., 
it works even with no ties in the data).  However if N is large the 
matrix operations will be prohibitive and you might reduce Y to 100-tile 
groups.  The PO model uses only the ranks of Y so is monotonic 
transformation invariant.

library(Design)  # also requires library(Hmisc)
f <- lrm(y ~ a*b*c*d)
f
anova(f)

Also see the polr function in package MASS (part of the VR bundle).
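
A self-contained sketch of the polr route mentioned above (the data below
are simulated purely for illustration; the factor names a-d just follow the
example formula):

```r
library(MASS)  # provides polr() for proportional odds models
set.seed(1)
n   <- 320
# Four crossed two-level factors, 20 observations per cell:
dat <- data.frame(a = gl(2, 1, n), b = gl(2, 2, n),
                  c = gl(2, 4, n), d = gl(2, 8, n))
# An invented 5-level ordered response, shifted by factor a:
dat$y <- ordered(cut(rnorm(n) + as.numeric(dat$a), 5))
# Proportional odds fit including the four-way interaction:
fit <- polr(y ~ a * b * c * d, data = dat, Hess = TRUE)
summary(fit)  # 15 coefficients (4 main effects + 11 interactions), 4 intercepts
```

The same formula would apply with lrm() from Design; per the explanation
above, lrm additionally copes with a large number of intercepts when Y has
many distinct values.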
-- 
Frank E Harrell Jr   Professor and Chair   School of Medicine
  Department of Biostatistics   Vanderbilt University



Re: [R] RODBC on linux

2006-07-26 Thread Prof Brian Ripley
On Wed, 26 Jul 2006, Marc Schwartz wrote:

> On Wed, 2006-07-26 at 17:52 -0400, Armstrong, Whit wrote:
> > Anyone out there using Linux RODBC and unixODBC to connect to a
> > Microsoft SQL server?
> >
> > If possible can someone post a sample .odbc.ini file?
> >
> > I saw a few discussions on the archives a few years ago, but no config
> > file details were available.
> >
> > Thanks,
> > Whit
>
> Whit,
>
> Do you have a Linux ODBC driver for SQL Server?  unixODBC is simply the
> driver manager, not the driver itself.
>
> MS does not offer (not surprisingly) an ODBC driver for Unix/Linux.
> There are resources available however and these might be helpful:
>
> http://www.sommarskog.se/mssql/unix.html
>
> Note that Easysoft provides (at a cost) an ODBC-ODBC bridge for
> Unix/Linux platforms which supports ODBC connections to SQL Server:
>
> http://www.easysoft.com/products/data_access/odbc_odbc_bridge/index.html

Several people have successfully used that, from the earliest days of 
RODBC: I believe it was part of Michael Lapsley's motivation to write 
RODBC.

-- 
Brian D. Ripley,  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,           Tel:  +44 1865 272861 (self)
1 South Parks Road,                   +44 1865 272866 (PA)
Oxford OX1 3TG, UK              Fax:  +44 1865 272595
