[R] Help with reshape/reShape and indexing

2009-05-13 Thread Dana Sevak

Dear R Helpers,

I have trouble applying reShape and reshape although I read the documentation 
and several posts, so I would very much appreciate your help on the two points 
below.

I have a dataframe

df = data.frame(Name=c("a", "a", "a", "b", "b", "c"), X1=c(12, 13, 14, 
20, 25, 30), X2 = c(200, 250, 300, 600, 700, 900))

> df
  Name X1  X2
1    a 12 200
2    a 13 250
3    a 14 300
4    b 20 600
5    b 25 700
6    c 30 900

First I need to add an additional column to this dataframe that will count the 
number of rows per each Name entry.  The resulting df should look like:

df.index = data.frame(Name=c("a", "a", "a", "b", "b", "c"), X1=c(12, 13, 14, 
20, 25, 30), X2 = c(200, 250, 300, 600, 700, 900), Index=c(1,2,3,1,2,1))

> df.index
  Name X1  X2 Index
1    a 12 200     1
2    a 13 250     2
3    a 14 300     3
4    b 20 600     1
5    b 25 700     2
6    c 30 900     1

How can I do this?


Secondly, I would like to reshape this dataframe in the form:

 df2
   1  2  3
a 12 13 14
b 20 25 NA
c 30 NA NA

Since the df is sorted by Name and X2, I would need that the available X1 
values populate the resulting rows in df2 from left to right (i.e. if only one 
value is available, it is written in the first column and the remaining columns 
get NAs).  If I could generate the Index column, I think I could accomplish 
this with:

df2 = reShape(df.index$X1, id=df.index$Name, colvar=df.index$Index)
colnames(df2) = c("V1", "V2", "V3")

However, is there a way to get to df2 without using the Index column and still 
have the NAs written as described above?

Thank you so much for your help on these two issues.

With best regards,
Dana Sevak

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Overlay cdf

2009-05-13 Thread Bill.Venables
Here are some ideas you might like to consider

par(mar = c(5,4,2,4)+0.1, yaxs = "r")
Sample <- rgamma(1000,2.5,.8)
hist(Sample, main = "", freq = FALSE, ylim = c(0,1))

pu <- par("usr")[1:2]
x <- seq(pu[1], pu[2], len = 5000)
y <- pgamma(x, 2.5, 0.8)
par(new = TRUE)
plot(x, y, type = "l", axes = FALSE, ann = FALSE, col = "red")
lines(x, dgamma(x, 2.5, 0.8), col = "darkgreen")

axis(4, col = "red")
mtext(side = 4, text = "Cumulative probability", col = "red", line = 2.5)

x0 <- c(0, sort(Sample))
p0 <- 0:1000/1000
lines(x0, p0, type = "S", col = "blue")
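
One detail worth noting: the sample in the original question is shifted by 1.5, while the code above draws from an unshifted gamma. A minimal sketch of how the shift could be handled, assuming the same shape and rate (2.5 and 0.8); the names shift, theo and emp are illustrative and do not come from the thread:

```r
set.seed(1)                                  # arbitrary seed, for reproducibility only
shift  <- 1.5
Sample <- rgamma(1000, 2.5, 0.8) + shift

x    <- seq(min(Sample), max(Sample), length.out = 200)
theo <- pgamma(x - shift, 2.5, 0.8)          # CDF of the shifted gamma
emp  <- ecdf(Sample)(x)                      # empirical CDF on the same grid

hist(Sample, main = "", freq = FALSE, ylim = c(0, 1))
lines(x, theo, col = "red")                  # theoretical CDF
lines(x, emp, type = "s", col = "blue")      # empirical CDF as a step function
```

Because the density and the CDF both fit within 0 to 1 here, a single y axis is enough in this sketch; the two-axis layout above is the more general choice.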


Bill Venables
http://www.cmis.csiro.au/bill.venables/ 


-----Original Message-----
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On 
Behalf Of beetle2
Sent: Wednesday, 13 May 2009 3:23 PM
To: r-help@r-project.org
Subject: [R] Overlay cdf


Hi,
Is it possible to overlay a cumulative distribution function on a
histogram of a gamma distribution?

I have a gamma function 

Sample = rgamma(1000,2.5,.8)+1.5
hist(Sample)

regards


-- 
View this message in context: 
http://www.nabble.com/Overlay-cdf-tp23515551p23515551.html
Sent from the R help mailing list archive at Nabble.com.



Re: [R] creating a postscript file with two xyplots

2009-05-13 Thread Dieter Menne
Liati <liats80 at hotmail.com> writes:

 I would like to create one postscript file with two different xyplots (which

library(lattice)
postscript("myps.ps")
xyplot(1~1,main="Plot 1")
xyplot(2~3,main="Plot 2")
dev.off()
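
The code above relies on auto-printing at the top level. Inside a loop, a function or a sourced script, lattice objects are only drawn when print() is called on them. A small sketch of that variant; writing to tempdir() and the names plots and out are assumptions made for the example:

```r
library(lattice)

plots <- list(xyplot(1 ~ 1, main = "Plot 1"),
              xyplot(2 ~ 3, main = "Plot 2"))

out <- file.path(tempdir(), "myps.ps")   # illustrative output location
postscript(out)                          # one file, one page per plot
for (p in plots) print(p)                # explicit print() is needed inside a loop
invisible(dev.off())
```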

Dieter



Re: [R] import HTML tables

2009-05-13 Thread Dieter Menne


Dimitri Szerman-2 wrote:
 
 Hello,
 I was wondering if there is a function in R that imports tables directly
 from a HTML document.
 

The XML package can do this:

http://markmail.org/message/cyicoa3htme4gei2

Duncan Temple Lang:

The htmlParse() and htmlTreeParse() functions in the XML package use the
non-strict HTML parser in libxml2 and so the HTML document can be malformed. 


Dieter
-- 
View this message in context: 
http://www.nabble.com/import-HTML-tables-tp23504282p23517322.html
Sent from the R help mailing list archive at Nabble.com.



Re: [R] adonis help - (non-parametric (permutational) manova)

2009-05-13 Thread Gavin Simpson
On Tue, 2009-05-12 at 15:53 -0400, stephen sefick wrote:
 I am trying to apply this technique (M.J Anderson 2001) to a dataset
 of aquatic insect abundances.  There is a sample in the unrestored and
 restored segement of a stream for every time period.  I would like to
 compare the centroids of the distance matrices for the treatments up
 (unrestored) and dn (restored) to see if there is a difference in
 insect communities between the treatments.  I will not include the raw
 data in this posting as it is large for posting to the list; however,
 I would be happy to provide it off list if it would make this easier
 (and reproducible).
 
 my environmental matrix (or factor matrix I am not sure of the
 terminology) is set up like this:
 
           date site
 0104 dn   0104   dn
 0106 dn   0106   dn
 0203 dn   0203   dn
 0503 dn   0503   dn
 0704 dn   0704   dn
 0803 dn   0803   dn
 0804 dn   0804   dn
 0805 dn   0805   dn
 1005 dn   1005   dn
 1102 dn   1102   dn
 1204 dn   1204   dn
 0104 up   0104   up
 0106 up   0106   up
 0203 up   0203   up
 0503 up   0503   up
 0704 up   0704   up
 0803 up   0803   up
 0804 up   0804   up
 0805 up   0805   up
 1005 up   1005   up
 1102 up   1102   up
 1204 up   1204   up
 
 my site x species matrix is called a, so here is the call to adonis:
 
 adonis(a~site, data=b, strata=b[,"date"] ,Permutations=999)

I don't think the permutations will be stratified correctly - you want
them to represent a time series yes? 'strata' is meant to define in
vegan. Samples within the strata are permuted. So if you only have two
samples per unique time point (1 for up and 1 for dn), the effect of
setting strata to the date variable will be to permute only pairs of
samples.
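
To make the point about pairs concrete, here is a base-R sketch of what shuffling within strata amounts to. It is not vegan's implementation; permute_within is a hypothetical helper and the three dates are a cut-down version of the table above:

```r
# Hypothetical helper, not part of vegan: shuffle row indices within each stratum.
permute_within <- function(strata) {
  idx <- seq_along(strata)
  ave(idx, strata, FUN = function(i) i[sample.int(length(i))])
}

date <- rep(c("0104", "0106", "0203"), times = 2)   # one dn and one up row per date
site <- rep(c("dn", "up"), each = 3)

perm <- permute_within(date)
data.frame(date, site, permuted_site = site[perm])  # labels only move within a date
```

With two samples per date, each stratum can only be swapped or left alone, so the 11 dates in the posted design allow just 2^11 = 2048 distinct arrangements.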

Work has begun (and stalled for a little while - my fault) on providing
a wider range of restricted permutation tests. The function
permuted.index2 in vegan can generate permutations for time series (or
other ordered observations), but you'd have to a) edit adonis in place
to use permuted.index2 and b) work out how to set up the call to this
function correctly so that it returns the permutation structure adonis
wants. Then check it does what it says it does - there is at least one
bug that I know of but I'm not fixing it as the development version on
my local machine has completely changed the way the permutation schemes
are specified. Contact me off-list if you would like some help with
this, though as I'm teaching for two weeks, I won't be able to look at
it until later.

For now, perhaps you could just ignore the time-series aspects and run
the analysis without strata, but require a far lower p-value than you
might normally use to reflect the fact that the permutations do not take
into account correlations between time points.

HTH

 
 Is this the correct way of testing the null hypothesis that :
 
 There is no difference in community structure between treatments.
 
 Thank you very much in advance, and anything that you need to make
 this easier please don't hessitate to ask.
 
 regards,
 
-- 
%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%
 Dr. Gavin Simpson [t] +44 (0)20 7679 0522
 ECRC, UCL Geography,  [f] +44 (0)20 7679 0565
 Pearson Building, [e] gavin.simpsonATNOSPAMucl.ac.uk
 Gower Street, London  [w] http://www.ucl.ac.uk/~ucfagls/
 UK. WC1E 6BT. [w] http://www.freshwaters.org.uk
%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%



Re: [R] Do you use R for data manipulation?

2009-05-13 Thread Wacek Kusnierczyk
Warren Young wrote:
 Farrel Buchinsky wrote:
 Is R an appropriate tool for data manipulation and data reshaping and data
 organizing? I think so but someone who recently joined our group thinks not.
 The new recruit believes that python or another language is a far better
 tool for developing data manipulation scripts that can be then used by
 several members of our research group. Her assessment is that R is useful
 only when it comes to data analysis and working with statistical models.

 It's hard to shift people's individual preferences, but impressive
 objective comparisons are easy to come by.  Ask her how many lines it
 would take to do this trivial R task in Python:

 data <- read.csv('original-data.csv')
 write.csv(data * 10, 'scaled-data.csv')


you might want to learn that this is a question of appropriate
libraries.  in r, read.csv and write.csv reside in the package utils. 
in python, you'd use numpy:

from numpy import loadtxt, savetxt
savetxt('scaled.csv', loadtxt('original.csv', delimiter=',')*10,
delimiter=',')

this makes 2 lines, together with importing the library.



 R's ability to do something to an entire data structure -- or a slice
 of it, or some other subset -- in a single operation is very useful
 when cleaning up data for presentation and analysis.  

but this is really *hardly* r-specific.  you can do that in many, many
languages, be assured.  just peek out.

 Also point out how easy it is to get data *out* of R, as above, not
 just into it, so you can then hack on it in Python, if that's the
 better language for further manipulation.

 If she gives you static about how a few more lines are no big deal,
 remind her that it's well established that bug count is always a
 simple function of line count.  This fact has been known since the 70's.

that's a slogan, esp. when you think of how compact (but unreadable, and
thus error-prone) can code written in perl be.  often, more lines of
code make it easier to maintain, and thus avoid bugs.



 While making your points, remember that she has a good one, too: R is
 not the only good language out there.  You should learn Python while
 she's learning R.

+1



Re: [R] adonis help - (non-parametric (permutational) manova)

2009-05-13 Thread Gavin Simpson
Apologies, I seem to have deleted the important part of a sentence
below:

On Wed, 2009-05-13 at 09:37 +0100, Gavin Simpson wrote:
<snip />
  adonis(a~site, data=b, strata=b[,"date"] ,Permutations=999)
 
 I don't think the permutations will be stratified correctly - you want
 them to represent a time series yes? 'strata' is meant to define in
 vegan. 

Should have said:

'strata' is meant to define groups of samples, or blocks, in vegan.

G

-- 
%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%
 Dr. Gavin Simpson [t] +44 (0)20 7679 0522
 ECRC, UCL Geography,  [f] +44 (0)20 7679 0565
 Pearson Building, [e] gavin.simpsonATNOSPAMucl.ac.uk
 Gower Street, London  [w] http://www.ucl.ac.uk/~ucfagls/
 UK. WC1E 6BT. [w] http://www.freshwaters.org.uk
%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%



Re: [R] survival curves for time dependent covariates (was consultation)

2009-05-13 Thread Heinz Tuechler

At 14:50 12.05.2009, Terry Therneau wrote:

*I'm writing to ask you how can I do Survivals Curves using Time-dependent
*covariates? Which packages I need to Install?*

  This is a very difficult problem statistically.  That is, there are not many
good ideas for what SHOULD be done.  Hence, there are no packages. Almost
everything you find in an applied paper (e.g. a medical journal) is wrong.

 Terry Therneau



Dear Terry,

if it is not too much work for you, could you give some references to
examples of wrong applications in applied medical papers?


Thanks,
Heinz



[R] Segmentation fault in package rJava on CentOS server

2009-05-13 Thread Carlos J. Gil Bellosta
Hello,

I just installed rJava on

[r...@ug13 ~]# R --version
R version 2.9.0 (2009-04-17)

running on a

[r...@ug13 ~]# cat /etc/redhat-release
CentOS release 5.3 (Final)

This is the output of

[r...@ug13 ~]# R CMD javareconf
Java interpreter : /usr/bin/java
Java version : 1.4.2_18
Java home path   : /usr/java/j2sdk1.4.2_18/jre
Java compiler: /usr/bin/javac
Java headers gen.: /usr/bin/javah
Java archive tool: /usr/bin/jar
Java library path:
$(JAVA_HOME)/lib/i386/client:$(JAVA_HOME)/lib/i386:$(JAVA_HOME)/../lib/i386
JNI linker flags : -L$(JAVA_HOME)/lib/i386/client
-L$(JAVA_HOME)/lib/i386 -L$(JAVA_HOME)/../lib/i386 -ljvm
JNI cpp flags: -I$(JAVA_HOME)/../include -I$(JAVA_HOME)/../include/linux

Package rJava got properly installed (there were a number of warnings,
though, in the installation process). However,

> library(rJava)
> .jinit()

 *** caught segfault ***
address 0xc, cause 'memory not mapped'

Traceback:
 1: .External("RinitJVM", boot.classpath, parameters, PACKAGE = "rJava")
 2: .jinit()

Whenever I try to interact with Java from R --I am interested in the
RJDBC package--, I get the same segmentation fault at the .jinit call.
In particular, when .jinit calls RinitJVM.

Any ideas?

Best regards,

Carlos J. Gil Bellosta
http://www.datanalytics.com



Re: [R] Help with reshape/reShape and indexing

2009-05-13 Thread Uwe Ligges



Dana Sevak wrote:

Dear R Helpers,

I have trouble applying reShape and reshape although I read the documentation 
and several posts, so I would very much appreciate your help on the two points 
below.

I have a dataframe

df = data.frame(Name=c("a", "a", "a", "b", "b", "c"), X1=c(12, 13, 14, 20, 25, 
30), X2 = c(200, 250, 300, 600, 700, 900))

> df
  Name X1  X2
1    a 12 200
2    a 13 250
3    a 14 300
4    b 20 600
5    b 25 700
6    c 30 900

First I need to add an additional column to this dataframe that will count the 
number of rows per each Name entry.  The resulting df should look like:

df.index = data.frame(Name=c("a", "a", "a", "b", "b", "c"), X1=c(12, 13, 14, 
20, 25, 30), X2 = c(200, 250, 300, 600, 700, 900), Index=c(1,2,3,1,2,1))

> df.index
  Name X1  X2 Index
1    a 12 200     1
2    a 13 250     2
3    a 14 300     3
4    b 20 600     1
5    b 25 700     2
6    c 30 900     1

How can I do this?


Secondly, I would like to reshape this dataframe in the form:


df2

   1  2  3
a 12 13 14
b 20 25 NA
c 30 NA NA



This does it more or less your way:

ds <- split(df, df$Name)
ds <- lapply(ds, function(x){x$Index <- seq_along(x[,1]); x})
df2 <- unsplit(ds, df$Name)
tapply(df2$X1, df2[,c("Name", "Index")], function(x) x)

although there may exist much easier ways ...
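
One candidate for such an easier way, which also skips the Index column altogether, is to split X1 by Name and pad each group with NA. This is only a sketch in base R; the names groups, width and wide are made up for the example, and lengths() assumes R 3.2.0 or later:

```r
df <- data.frame(Name = c("a", "a", "a", "b", "b", "c"),
                 X1 = c(12, 13, 14, 20, 25, 30),
                 X2 = c(200, 250, 300, 600, 700, 900))

groups <- split(df$X1, df$Name)          # one vector of X1 values per Name
width  <- max(lengths(groups))           # size of the largest group

# Indexing past the end of a vector yields NA, which pads the short groups.
wide <- do.call(rbind, lapply(groups, function(v) v[seq_len(width)]))
colnames(wide) <- seq_len(width)
wide
```

The values fill each row from left to right in the order they appear in df, so the data frame has to be sorted as desired beforehand.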

Uwe Ligges






Re: [R] Response surface plot

2009-05-13 Thread Tim Carnus
HI,

thank you for that. I had come across a while ago a presentation outlining the 
ideas for such a function but can't remember who or where.

Thanks again,

Tim

----- Original Message -----
From: Duncan Murdoch murd...@stats.uwo.ca
Date: Tuesday, May 12, 2009 4:26 pm
Subject: Re: [R] Response surface plot
To: Tim Carnus tim.car...@ucd.ie
Cc: r-help@r-project.org

 On 5/12/2009 8:43 AM, Tim Carnus wrote:
     Dear List,
     I am trying to plot a similar graph to attached from minitab manual in R.
     I have a response Y and three components which systematically vary in their
     proportions. I have found in R methods/packages to plot ternary plots (eg.
     plotrix) but nothing which can extend it to response surface in 3-D.
     Any help appreciated,
 
 I'm not aware of anyone who has done this.  The way to do the surface in
 rgl would be to construct a mesh of triangles using tmesh3d, and set the
 color of each vertex as part of the material argument. It's a little
 tricky to get the colors right when they vary by vertex, but the code
 below gives an example.
 
 I would construct the mesh by starting with one triangle and calling
 subdivision3d, but you may want more control over them.
 
 For example:
 
 library(rgl)
 
 # First create a flat triangle and subdivide it
 triangle <- c(0,0,0,1, 1,0,0,1, 0.5, sqrt(3)/2, 0, 1)
 mesh <- tmesh3d( triangle, 1:3, homogeneous=TRUE)
 mesh <- subdivision3d(mesh, 4, deform=FALSE, normalize=TRUE)
 
 # Now get the x and y coordinates and compute the surface height
 x <- with(mesh, vb[1,])
 y <- with(mesh, vb[2,])
 z <- x^2 + y^2
 mesh$vb[3,] <- z
 
 # Now assign colors according to the height; remember that the
 # colors need to be in the order of mesh$it, not vertex order.
 
 vcolors <- rainbow(100)[99*z+1]
 tricolors <- vcolors[mesh$it]
 mesh$material = list(color=tricolors)
 
 # Now draw the surface, and a rudimentary frame behind it.
 
 shade3d(mesh)
 triangles3d(matrix(triangle, byrow=TRUE, ncol=4), col="white")
 quads3d(matrix(c(1,0.5,0.5,1, 0,sqrt(3)/2, sqrt(3)/2,0, 0,0,1,1),
                ncol=3), col="white")
 bg3d("gray")
 
 Duncan Murdoch



Re: [R] Multiple plot margins

2009-05-13 Thread Uwe Ligges



Andre Nathan wrote:

Hello

I'm plotting 6 graphs using mfrow = c(2, 3). In these plots, only
graphs in the first column have titles for the y axis, and only the ones
in the last row have titles for the x axis.

I'd like all plots to be of the same size, and I'm trying to keep them
as near each other as possible, but I'm having the following problem.

If I make a single call to par(mar = ...), to leave room on the left and
bottom for the axes titles, a lot of space will be wasted because not
all graphs need titles; however, if I make one call of par(mar = ...)
per plot, to have finer control of the margins, the first column and
last row plots will be smaller than the rest, because the titles use up
some of their space.

I thought that setting large enough values for oma would do what I
want, but it doesn't appear to work if mar is too small.

To illustrate better what I'm trying to do:

  l +-+ +-+ +-+
  a | | | | | |
  b | | | | | |
  e | | | | | |
  l +-+ +-+ +-+
  
  l +-+ +-+ +-+

  a | | | | | |
  b | | | | | |
  e | | | | | |
  l +-+ +-+ +-+
 label   label   label

where the margins between each plot should be narrow.

Should I just plot the graphs without axis titles and then use text() to
manually position them?




Can't you do it with lattice / grid?


If not, example:

par(mfrow = c(2,3), mar = c(0,0,0,0), oma = c(5,5,0,0), xpd=NA)
plot(1, xaxt="n", xlab="", ylab="A")
plot(1, xaxt="n", yaxt="n", xlab="", ylab="")
plot(1, xaxt="n", yaxt="n", xlab="", ylab="")
plot(1, xlab="I", ylab="B")
plot(1, xlab="II", ylab="", yaxt="n")
plot(1, xlab="III", ylab="", yaxt="n")

Uwe Ligges





Thanks in advance,
Andre



[R] where does the null come from?

2009-05-13 Thread Wacek Kusnierczyk
m = matrix(1:4, 2)

apply(m, 1, cat, '\n')
# 1 2
# 3 4
# NULL

why the null?

vQ



Re: [R] Help with reshape/reShape and indexing

2009-05-13 Thread Dimitris Rizopoulos

one way is the following:

df.index <- df
df.index$Index <- ave(seq_along(df$Name), df$Name, FUN = seq_along)
df.index

df2 <- reshape(df.index[c("Name", "Index", "X1")], timevar = "Index", 
idvar = "Name", direction = "wide")

df2
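
The result of reshape() keeps Name as a column and labels the value columns X1.1, X1.2 and X1.3. If the layout from the question is wanted (Names as row names, plain 1, 2, 3 as headers), a follow-up along these lines should do; it is a sketch that rebuilds the data so it runs on its own, and the tidying steps at the end are an addition, not part of the answer above:

```r
df <- data.frame(Name = c("a", "a", "a", "b", "b", "c"),
                 X1 = c(12, 13, 14, 20, 25, 30),
                 X2 = c(200, 250, 300, 600, 700, 900))

df.index <- df
df.index$Index <- ave(seq_along(df$Name), df$Name, FUN = seq_along)

df2 <- reshape(df.index[c("Name", "Index", "X1")], timevar = "Index",
               idvar = "Name", direction = "wide")

# Tidy up: move Name into the row names and strip the "X1." prefix.
rownames(df2) <- as.character(df2$Name)
df2 <- df2[, -1]
names(df2) <- sub("^X1\\.", "", names(df2))
df2
```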


I hope it helps.

Best,
Dimitris





--
Dimitris Rizopoulos
Assistant Professor
Department of Biostatistics
Erasmus University Medical Center

Address: PO Box 2040, 3000 CA Rotterdam, the Netherlands
Tel: +31/(0)10/7043478
Fax: +31/(0)10/7043014



Re: [R] questions on rpart (tree changes when rearrange the order of covariates?!)

2009-05-13 Thread Uwe Ligges



Yuanyuan wrote:

Greetings,

I am using rpart for classification with the "class" method. The test data is
the Indian diabetes data from package mlbench.

I fitted a classification tree firstly using the original data, and then
exchanged the order of Body mass and Plasma glucose which are the
strongest/important variables in the growing phase. The second tree is a
little different from the first one. The misclassification tables are
different too. I did not change the data, but why the results are so
different?


Well, at some splits two variables may yield the same reduction of the 
splitting criterion; in that case the variable that comes first might be 
used, hence another result.
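
A small way to probe this on synthetic data is to fit the same tree twice with two identical predictors listed in different order. This is only a sketch: the data and the helper root_var are made up for the example, and which variable rpart actually reports may depend on the installed version:

```r
library(rpart)

set.seed(42)                                # arbitrary seed
x <- runif(200)
y <- factor(ifelse(x > 0.5, "pos", "neg"))

d1 <- data.frame(y = y, a = x, b = x)       # a listed first
d2 <- data.frame(y = y, b = x, a = x)       # same columns, b listed first

# Hypothetical helper: name of the variable used at the root split.
root_var <- function(fit) as.character(fit$frame$var[1])

v1 <- root_var(rpart(y ~ ., data = d1, method = "class"))
v2 <- root_var(rpart(y ~ ., data = d2, method = "class"))
c(first_order = v1, swapped_order = v2)
```

If ties go to the variable listed first, the two entries will differ even though the data are the same.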


Uwe Ligges






Does anyone know how rpart deals with ties?

Here is the code for running the two trees.


library(mlbench)
data(PimaIndiansDiabetes2)
mydata<-PimaIndiansDiabetes2
library(rpart)
fit2<-rpart(diabetes~., data=mydata,method="class")
plot(fit2,uniform=T,main="CART for original data")
text(fit2,use.n=T,cex=0.6)
printcp(fit2)
table(predict(fit2,type="class"),mydata$diabetes)
## misclassification table: rows are fitted class
#       neg pos
#   neg 437  68
#   pos  63 200
#Klimt(fit2,mydata)

pmydata<-data.frame(mydata[,c(1,6,3,4,5,2,7,8,9)])
fit3<-rpart(diabetes~., data=pmydata,method="class")
plot(fit3,uniform=T,main="CART after exchanging mass & glucose")
text(fit3,use.n=T,cex=0.6)
printcp(fit3)
table(predict(fit3,type="class"),pmydata$diabetes)
## after exchanging the order of BODY mass and PLASMA glucose
#       neg pos
#   neg 436  64
#   pos  64 204
#Klimt(fit3,pmydata)


Thanks,


--
Yuanyuan Huang



Re: [R] where does the null come from?

2009-05-13 Thread K. Elo
Hi!

Wacek Kusnierczyk wrote:
 m = matrix(1:4, 2)
 
 apply(m, 1, cat, '\n')
 # 1 2
 # 3 4
 # NULL
 
 why the null?

Could it be the return value of 'cat'. See ?cat, where:

---snip ---
Value
 None (invisible NULL).
---snip ---

Kind regards,
Kimmo



Re: [R] where does the null come from?

2009-05-13 Thread Peter Dalgaard
Wacek Kusnierczyk wrote:
 m = matrix(1:4, 2)
 
 apply(m, 1, cat, '\n')
 # 1 2
 # 3 4
 # NULL
 
 why the null?

It comes from unlist()ing a list of NULLs, which in turn are the return
values of cat().
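
A short sketch that shows the same thing at the console; the name res is only for illustration:

```r
m <- matrix(1:4, 2)

res <- apply(m, 1, cat, "\n")        # cat() prints each row and returns NULL
is.null(res)                         # the collapsed result is a single NULL
is.null(unlist(list(NULL, NULL)))    # unlist() of a list of NULLs is NULL too

invisible(apply(m, 1, cat, "\n"))    # same printing, but nothing is auto-printed
```

When apply() is called only for its side effect, wrapping it in invisible() or using a for loop avoids the stray NULL.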

It is arguably a design-buglet not to return list(NULL, NULL), but the
internal logic is to unlist() unless the first element is.recursive (and
NULL is not) or the return values have different length() (and all are
zero). It _is_, however, in accordance with the documentation (see the
Value: section):

-- 
   O__   Peter Dalgaard Øster Farimagsgade 5, Entr.B
  c/ /'_ --- Dept. of Biostatistics PO Box 2099, 1014 Cph. K
 (*) \(*) -- University of Copenhagen   Denmark  Ph:  (+45) 35327918
~~ - (p.dalga...@biostat.ku.dk)  FAX: (+45) 35327907



Re: [R] where does the null come from?

2009-05-13 Thread Wacek Kusnierczyk
Peter Dalgaard wrote:
 Wacek Kusnierczyk wrote:
   
 m = matrix(1:4, 2)

 apply(m, 1, cat, '\n')
 # 1 2
 # 3 4
 # NULL

 why the null?
 

 It comes from unlist()ing a list of NULLs, which in turn are the return
 values of cat().
   

yes;  i'd think i'd get a list of nulls, but...


 It is arguably a design-buglet not to return list(NULL, NULL), but the
 internal logic is to unlist() unless the first element is.recursive (and
 NULL is not) or the return values have different length() (and all are
 zero). It _is_, however, in accordance with the documentation (see the
 Value: section):
   

... i agree the actual outcome is appropriately explained in the docs. 
i don't think it has no merit, but it's a bit surprising at first.

thanks,
vQ



[R] Help with a cumullative Hazrd Ratio plot

2009-05-13 Thread Bernardo Rangel Tura
Hi R-masters

I need help making a modified cumulative hazard ratio plot.

I need to create a common plot, but with the number of subjects at risk at
each tick time for two different groups at the bottom of the plot (I attach
one example).

Do you know a routine for this?
Is it possible to create a routine for this? 
If so, with which commands?

Thanks in advance!
-- 
Bernardo Rangel Tura, M.D,MPH,Ph.D
National Institute of Cardiology
Brazil


[R] Multiple ANOVA tests

2009-05-13 Thread Imri

Hello!!!
I'm trying to do multiple ANOVA tests with R (testing the effect of
different factors on the same response). As a result I get many ANOVA
tables, and I want to extract a list of the Pr(>F) from all the tables. 
Maybe someone has an idea how to do this?
Thanks
Imri  
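
One possible approach, sketched with made-up data: fit one aov() per factor and pull the "Pr(>F)" column out of each summary table. The data frame dat, the factor names f1 and f2, and the response y are all hypothetical.

```r
# made-up data: one response and two candidate factors
set.seed(1)
dat <- data.frame(y  = rnorm(30),
                  f1 = gl(3, 10),
                  f2 = gl(2, 15))
factors <- c("f1", "f2")

# one one-way ANOVA per factor; keep only the p-value of the factor row
pvals <- sapply(factors, function(f) {
  fit <- aov(reformulate(f, response = "y"), data = dat)
  tab <- summary(fit)[[1]]      # the ANOVA table as a data frame
  tab[["Pr(>F)"]][1]            # row 1 is the factor, row 2 the residuals
})
pvals
```

The result is a named numeric vector, one p-value per factor, which can then be sorted or adjusted with p.adjust().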

-- 
View this message in context: 
http://www.nabble.com/Multiple-ANOVA-tests-tp23518637p23518637.html
Sent from the R help mailing list archive at Nabble.com.



Re: [R] converting numeric into character strings

2009-05-13 Thread Olivier ETERRADOSSI

Hi Melissa
unless I miss a point, you should get what you want with (for example)

y <- paste(b, collapse = ",")
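
To make the difference between sep and collapse concrete, here is a short sketch using the numbers from the question. The list name changes and the data frame layout are assumptions for illustration only.

```r
b <- c(2, 11, 12, 20, 21, 98, 99)

# sep joins *across arguments*, so a single vector still gives seven strings
paste(b, sep = ",")

# collapse joins the *elements* of the result into one string
y <- paste(b, collapse = ",")
y

# hypothetical layout: one row per person, one string per cell
changes <- list(A = c(2, 11, 12, 20, 21, 98, 99), B = c(4, 5, 89))
out <- data.frame(Person       = names(changes),
                  ChangePoints = sapply(changes, paste, collapse = ","),
                  stringsAsFactors = FALSE)
out
```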

Hope this helps. Olivier


Melissa2k9 wrote:
 
 Hi,
 
 I'm trying to put some numbers into a data frame. I have a list of numbers
 (change points in a time series) such as
 
 [1]  2 11 12 20 21 98 99
 
 but I want R to treat this as a single character string, so that it goes
 into one row and column. Ideally I want the values separated by commas, so
 I would have for example
 
 Person  Change points (seconds)
 A       2,11,12,20,21,98,99
 B       4,5,89
 
 etc. Is there any way I can get this?
 
 I've tried this:
 
 for example, if the command to get the list of numbers was
 b <- which(a != s), then I have tried
 
 as.character(b) but I just end up with
 
 [1] "2"  "11" "12" "20" "21" "98" "99"
 
 which is not what I want, as this is more than one string and is not
 separated by commas. I also tried
 
 paste(b, sep = ",") but I end up with the same thing. Sorry it's a bit
 confusing to read, but any help would be great!
 
 Melissa
 

-- 
View this message in context: 
http://www.nabble.com/converting-numeric-into-character-strings-tp23518762p23519577.html
Sent from the R help mailing list archive at Nabble.com.



Re: [R] plotting a grid with color on a map

2009-05-13 Thread Ray Brownrigg

dxc13 wrote:

Hi all,
I have posted similar questions regarding this topic, but I just can't seem
to get over the hump and find a straightforward way to do this.  I have
attached my file as a reference.
Basically, the attached file is a 5 degree by 5 degree grid of the world
(2592 cells), most of them are NA's.  I just want to be able to plot this
grid over a world map and color code the cells.  For example, if a cell has
a temperature less than 20 degrees it will be blue, 21 to 50 green color,
51-70 orange, 71+ red colored cells.  For any NAs, they should be colored
white.

I know how to create a map of the world using map() and add a grid to it
using map.grid(), but I can't color code the cells the way I need.  Is there
a way to do this in R?

Thanks again.
dxc13 http://www.nabble.com/file/p23514804/time1test.txt time1test.txt 


How about the following, which doesn't need a grid at all?

library(maps)
temp <- as.matrix(read.table("time1test.txt"))
# corner offsets of one 5 x 5 degree cell
xvals <- c(0, 0, 5, 5, 0)
yvals <- c(0, 5, 5, 0, 0)
map("world")
palette(rainbow(50))
for (lat in seq(-90, 85, 5))
  for (lon in seq(-180, 175, 5)) {
    col <- temp[(lat + 95)/5, (lon + 185)/5]
    # longitude is the x coordinate on the map, latitude the y coordinate
    if (!is.na(col)) polygon(lon + xvals, lat + yvals, col=col, border=NA)
  }
palette("default")

HTH
Ray Brownrigg



Re: [R] Overlay cdf

2009-05-13 Thread Matthieu Dubois
You might also use ?curve

# same example as Bill's
par(mar = c(5,4,2,4)+0.1, yaxs = "r")
Sample <- rgamma(1000,2.5,.8)
hist(Sample, main = "", freq = FALSE, ylim = c(0,1))

curve(pgamma(x, 2.5, 0.8), add=TRUE, col='red')
curve(dgamma(x, 2.5, 0.8), add=TRUE, col='darkgreen')

axis(4, col = "red")
mtext(side = 4, text = "Cumulative probability", col = "red", line = 2.5)

x0 <- c(0, sort(Sample))
p0 <- 0:1000/1000
lines(x0, p0, type = "S", col = "blue")
 
Regards,

Matthieu



Re: [R] Help with a cumulative Hazard Ratio plot

2009-05-13 Thread Nutter, Benjamin
?mtext

You may need to adjust the margins.  For this I recommend adjusting the
mar option in par (see ?par).
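
A rough sketch of how mtext() and a larger bottom margin might be combined for this. All numbers below (times, counts at risk, and the curve itself) are invented for illustration; only par(), plot(), axis() and mtext() from base graphics are used.

```r
# hypothetical follow-up times and numbers at risk for two groups
times     <- seq(0, 50, by = 10)
at_risk_a <- c(100, 84, 61, 40, 22, 9)
at_risk_b <- c(100, 90, 75, 58, 37, 18)
cum_hr    <- c(1.00, 1.15, 1.30, 1.38, 1.45, 1.50)  # made-up curve

# enlarge the bottom margin to leave room for two extra rows of text
op <- par(mar = c(8, 6, 2, 2) + 0.1)
plot(times, cum_hr, type = "s", xaxt = "n",
     xlab = "Time", ylab = "Cumulative hazard ratio")
axis(1, at = times)

# one row of counts per group, aligned with the tick positions
mtext(at_risk_a, side = 1, line = 4, at = times)
mtext(at_risk_b, side = 1, line = 5, at = times)

# row labels pushed into the left margin
mtext("Group A", side = 1, line = 4, at = par("usr")[1], adj = 1.2)
mtext("Group B", side = 1, line = 5, at = par("usr")[1], adj = 1.2)
par(op)
```

The line and adj values will likely need tuning for a given device size.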



-Original Message-
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org]
On Behalf Of Bernardo Rangel Tura
Sent: Wednesday, May 13, 2009 6:31 AM
To: r-help
Subject: Re: [R] Help with a cumulative Hazard Ratio plot

On Wed, 2009-05-13 at 07:19 -0300, Bernardo Rangel Tura wrote:
 Hi R-masters
 
 I need help making a modified cumulative hazard ratio plot.
 
 I need to create an ordinary plot, but with the number of subjects at risk
 at each tick time, for two different groups, shown at the bottom of the
 plot (I have attached an example).
 
 Do you know a routine for this?
 Is it possible to write a routine for this?
 If so, with which commands?
 
 Thanks in advance!

Sorry, I sent the attachment in JPEG format.
In this mail I attach it in PDF format.
-- 
Bernardo Rangel Tura, M.D,MPH,Ph.D
National Institute of Cardiology
Brazil



[R] overlap contour

2009-05-13 Thread Bala subramanian
Friends,

I have two covariance matrices (m1 and m2) of the same size (150x150). I used
the contourplot function to make contour plots of each individually (c1 and
c2). I am interested in making a single contour plot combining the two, so
that the portions of the plot above and below the diagonal represent c1 and
c2 respectively. Could someone suggest how I can do this?

Is there any way that I can combine m1 and m2, write the combined matrix
to a file, and plot it to achieve the above?
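
One way this might be done, sketched with small made-up matrices in place of the real 150x150 ones: keep the upper triangle (and diagonal) of m1 and overwrite the lower triangle with m2. The file name is a temporary placeholder, and which triangle ends up "above" the diagonal depends on how the plotting function orients rows and columns.

```r
# made-up covariance matrices standing in for the real ones
set.seed(42)
n  <- 6
m1 <- cov(matrix(rnorm(30 * n), ncol = n))
m2 <- cov(matrix(rnorm(30 * n), ncol = n))

# start from m1 (upper triangle and diagonal), then overwrite the
# lower triangle with the corresponding entries of m2
combined <- m1
combined[lower.tri(combined)] <- m2[lower.tri(m2)]

# write the combined matrix to a (temporary) file
out_file <- tempfile(fileext = ".txt")
write.table(combined, file = out_file)

# plot it if lattice is available
if (requireNamespace("lattice", quietly = TRUE)) {
  print(lattice::contourplot(combined))
}
```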

Thanks,
Bala



Re: [R] plotting a grid with color on a map

2009-05-13 Thread Jim Lemon

dxc13 wrote:

Hi all,
I have posted similar questions regarding this topic, but I just can't seem
to get over the hump and find a straightforward way to do this.  I have
attached my file as a reference.
Basically, the attached file is a 5 degree by 5 degree grid of the world
(2592 cells), most of them are NA's.  I just want to be able to plot this
grid over a world map and color code the cells.  For example, if a cell has
a temperature less than 20 degrees it will be blue, 21 to 50 green color,
51-70 orange, 71+ red colored cells.  For any NAs, they should be colored
white.

I know how to create a map of the world using map() and add a grid to it
using map.grid(), but I can't color code the cells the way I need.  Is there
a way to do this in R?
  

Hi dxc13,
This might get you started:

temp1 <- read.table("time1test.dat", header=TRUE)
mapcol <- color.scale(as.matrix(temp1[36:1,]), c(0.5,1), c(0.5,0), c(1,0))
# have to draw the map to get the user coordinates
map()
# get the limits of the map
maplim <- par("usr")
# transform the temperatures into colors, reversing the row order
color2D.matplot(temp1[36:1,], cellcolors=mapcol, axes=FALSE)
# don't erase the current plot
par(new=TRUE)
# draw an empty plot with the appropriate axes (I think)
plot(0, xlim=maplim[1:2], ylim=maplim[3:4], type="n")
# add the map over the color squares
map(add=TRUE)

This seems a bit wonky, probably because I haven't adjusted the 
coordinates. Also, I'm only getting grayscale colors, even though the 
colors in mapcol aren't gray. Don't know why yet.


Jim



Re: [R] plotting a grid with color on a map

2009-05-13 Thread Jim Lemon

Oops, forgot to include:

library(plotrix)

Jim



Re: [R] Help with reshape/reShape and indexing

2009-05-13 Thread Gabor Grothendieck
Try this:

DF$Index <- ave(1:nrow(DF), DF$Name, FUN = seq_along)
reshape(DF[-3], dir = "wide", idvar = "Name", timevar = "Index")

Also see the reshape package for another similar facility.
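
For completeness, a self-contained run of the two lines above on the data from the question (with X2 for row 6 taken as 900, as in the printed table). The object name wide is an addition for this sketch.

```r
DF <- data.frame(Name = c("a", "a", "a", "b", "b", "c"),
                 X1   = c(12, 13, 14, 20, 25, 30),
                 X2   = c(200, 250, 300, 600, 700, 900))

# running counter within each Name group
DF$Index <- ave(seq_len(nrow(DF)), DF$Name, FUN = seq_along)

# drop X2 (column 3) and spread X1 across columns, one per Index value;
# groups with fewer rows are padded with NA on the right
wide <- reshape(DF[-3], direction = "wide",
                idvar = "Name", timevar = "Index")
wide
```

The resulting columns are named X1.1, X1.2 and X1.3; they can be renamed afterwards with names() or colnames().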



On Wed, May 13, 2009 at 2:02 AM, Dana Sevak dana.se...@yahoo.com wrote:

 Dear R Helpers,

 I have trouble applying reShape and reshape although I read the documentation 
 and several posts, so I would very much appreciate your help on the two 
 points below.

 I have a dataframe

 df = data.frame(Name=c("a", "a", "a", "b", "b", "c"), X1=c(12, 13, 14,
 20, 25, 30), X2 = c(200, 250, 300, 600, 700, 900))

 df
  Name X1  X2
 1    a 12 200
 2    a 13 250
 3    a 14 300
 4    b 20 600
 5    b 25 700
 6    c 30 900

 First I need to add an additional column to this dataframe that will count 
 the number of rows per each Name entry.  The resulting df should look like:

 df.index = data.frame(Name=c("a", "a", "a", "b", "b", "c"), X1=c(12, 13, 14,
 20, 25, 30), X2 = c(200, 250, 300, 600, 700, 900), Index=c(1,2,3,1,2,1))

 df.index
   Name X1  X2 Index
 1    a 12 200     1
 2    a 13 250     2
 3    a 14 300     3
 4    b 20 600     1
 5    b 25 700     2
 6    c 30 900     1

 How can I do this?


 Secondly, I would like to reshape this dataframe in the form:

 df2
   1  2  3
 a 12 13 14
 b 20 25 NA
 c 30 NA NA

 Since the df is sorted by Name and X2, I would need that the available X1 
 values populate the resulting rows in df2 from left to right (i.e. if only 
 one value is available, it is written in the first column and the remaining 
 columns get NAs).  If I could generate the Index column, I think I could 
 accomplish this with:

 df2 = reShape(df.index$X1, id=df.index$Name, colvar=df.index$Index)
 colnames(df2) = c("V1", "V2", "V3")

 However, is there a way to get to df2 without using the Index column and 
 still have the NAs written as described above?

 Thank you so much for your help on these two issues.

 With best regards,
 Dana Sevak



Re: [R] questions on rpart (tree changes when rearrange the order of covariates?!)

2009-05-13 Thread Liaw, Andy
From: Uwe Ligges
 
 Yuanyuan wrote:
  Greetings,
  
  I am using rpart for classification with the "class" method. The test
  data is the Indian diabetes data from package mlbench.
  
  I fitted a classification tree first using the original data, and then
  exchanged the order of body mass and plasma glucose, which are the
  strongest/most important variables in the growing phase. The second tree
  is a little different from the first one. The misclassification tables
  are different too. I did not change the data, so why are the results so
  different?
 
 Well, at some splits the variable that comes first and yields the
 same reduction of the entropy criterion as another one might be used,
 hence another result.
 
 Uwe Ligges

I recently tried writing adaboost.m1 using rpart, and was surprised that
with a very small training set (say n=10 or 20), I get a large improvement
in test set accuracy if I randomly shuffle the columns in the data at
every adaboost iteration.  (With twonorm data, we're talking about 25%
error vs. 19%, using an n=2000 test set.)  It turned out to be the way
rpart deals with ties: first come, first win.  Without shuffling the
columns, rpart almost never picks any variable beyond the 10th.  (In
twonorm, all variables are equally important, so one would expect
roughly equal selection frequency.)

I've gotten some pointers from Terry Therneau about where in the code to
check.  I may try to implement breaking ties at random (as I've done in
randomForest).  No promises, though...

Andy
 
 
 
 
  
  Does anyone know how rpart deals with ties?
  
  Here is the code for running the two trees.
  
  
  library(mlbench)
  data(PimaIndiansDiabetes2)
  mydata <- PimaIndiansDiabetes2
  library(rpart)
  fit2 <- rpart(diabetes~., data=mydata, method="class")
  plot(fit2, uniform=TRUE, main="CART for original data")
  text(fit2, use.n=TRUE, cex=0.6)
  printcp(fit2)
  table(predict(fit2, type="class"), mydata$diabetes)
  ## misclassification table: rows are fitted class
  ##       neg pos
  ##   neg 437  68
  ##   pos  63 200
  #Klimt(fit2,mydata)
  
  pmydata <- data.frame(mydata[,c(1,6,3,4,5,2,7,8,9)])
  fit3 <- rpart(diabetes~., data=pmydata, method="class")
  plot(fit3, uniform=TRUE, main="CART after exchanging mass & glucose")
  text(fit3, use.n=TRUE, cex=0.6)
  printcp(fit3)
  table(predict(fit3, type="class"), pmydata$diabetes)
  ## after exchanging the order of body mass and plasma glucose
  ##       neg pos
  ##   neg 436  64
  ##   pos  64 204
  #Klimt(fit3,pmydata)
  
  
  Thanks,
  
  
  
 --
 
  Yuanyuan Huang
  


Re: [R] plotting a grid with color on a map

2009-05-13 Thread Jim Lemon
It was the NAs that fooled color2D.matplot. This gets your colors, 
although not exactly what you want. Look at the help for color2D.matplot 
to get that. I think fiddling with the x and y limits on the map() call 
will get the positions right.


temp1 <- read.table("time1test.dat", header=TRUE)
library(plotrix)
# reverse the row order, as color2D.matplot reverses it
color2D.matplot(temp1[36:1,], c(0.5,1), c(0.5,0), c(1,0), axes=FALSE)
# don't erase the above plot
par(new=TRUE)
# do a ghost plot with just the axes
# (maplim is par("usr") as saved after map() in the earlier message)
plot(0, xlim=maplim[1:2], ylim=maplim[3:4], type="n")
# add the map on top in black
map(add=TRUE, col="black")

Jim



Re: [R] Help with reshape/reShape and indexing

2009-05-13 Thread hadley wickham
 This does it more or less your way:

 ds <- split(df, df$Name)
 ds <- lapply(ds, function(x){x$Index <- seq_along(x[,1]); x})
 df2 <- unsplit(ds, df$Name)
 tapply(df2$X1, df2[,c("Name", "Index")], function(x) x)

 although there may exist much easier ways ...

Here's one way with the plyr and reshape packages:

library(plyr)
df.index <- ddply(df, .(Name), transform, Index = seq_along(X1))

library(reshape)
cast(df.index, Name ~ Index, value = "X1")

Hadley



-- 
http://had.co.nz/



Re: [R] silhouette: clustering labels have to be consecutive integers starting

2009-05-13 Thread Martin Maechler
 TS == Tao Shi shi...@hotmail.com
 on Wed, 10 Oct 2007 06:15:53 + writes:

TS Thank you very much, Benilton and Prof. Ripley, for the
TS speedy replies!

TS Looking forward to the fix!
TS Tao

I have finally re-stumbled onto this e-mail thread,
and have indeed fixed the problem.

Version 1.12.0 of 'cluster' should become visible within a few days,
and will allow calling

silhouette(g, dis)

on a grouping vector of k different integer values which need
*not* necessarily be in 1:k.
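
For anyone still on an older 'cluster', the workaround quoted further down in this thread can be sketched as follows: relabel the groups as consecutive integers before calling silhouette(). The variable name g_consec is made up for this sketch.

```r
library(cluster)

set.seed(1)
x <- rnorm(100)
g <- sample(2:4, 100, replace = TRUE)   # labels 2, 3, 4: not in 1:k

# map the labels onto 1:k via a factor, keeping their order
g_consec <- as.integer(factor(g))
table(g, g_consec)

sil <- silhouette(g_consec, dist(x))
summary(sil)
```

The relabelling does not change which observations share a cluster, so the silhouette widths should be unaffected by it.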

Martin Maechler,
ETH Zurich


 From: Prof Brian Ripley rip...@stats.ox.ac.uk
 To: Benilton Carvalho bcarv...@jhsph.edu
 CC: Tao Shi shi...@hotmail.com, maech...@stat.math.ethz.ch,
 r-help@r-project.org
 Subject: Re: [R] silhouette: clustering labels have to be consecutive 
 intergers starting from 1?
 Date: Wed, 10 Oct 2007 05:33:03 +0100 (BST)
 
 It is a C-level problem in package cluster: valgrind gives
 
 ==11377== Invalid write of size 8
 ==11377==at 0xA4015D3: sildist (sildist.c:35)
 ==11377==by 0x4706D8: do_dotCode (dotcode.c:1750)
 
 This is a matter for the package maintainer (Cc:ed here), not R-help.
 
 On Tue, 9 Oct 2007, Benilton Carvalho wrote:
 
 that happened to me with R-2.4.0 (alpha) and was fixed on R-2.4.0
 (final)...
 
 http://tolstoy.newcastle.edu.au/R/e2/help/06/11/5061.html
 
 then i stopped using... now, the problem seems to be back. The same
 examples still apply.
 
 This fails:
 
 require(cluster)
 set.seed(1)
 x <- rnorm(100)
 g <- sample(2:4, 100, replace=TRUE)
 for (i in 1:100){
 print(i)
 tmp <- silhouette(g, dist(x))
 }
 
 and this works:
 
 require(cluster)
 set.seed(1)
 x <- rnorm(100)
 g <- sample(2:4, 100, replace=TRUE)
 for (i in 1:100){
 print(i)
 tmp <- silhouette(as.integer(factor(g)), dist(x))
 }
 
 and here's the sessionInfo():
 
  sessionInfo()
 R version 2.6.0 (2007-10-03)
 x86_64-unknown-linux-gnu
 
 locale:
 LC_CTYPE=en_US.UTF-8;LC_NUMERIC=C;LC_TIME=en_US.UTF-8;LC_COLLATE=en_US.U
 TF-8;LC_MONETARY=en_US.UTF-8;LC_MESSAGES=en_US.UTF-8;LC_PAPER=en_US.UTF-
 8;LC_NAME=C;LC_ADDRESS=C;LC_TELEPHONE=C;LC_MEASUREMENT=en_US.UTF-8;LC_ID
 ENTIFICATION=C
 
 attached base packages:
 [1] stats graphics  grDevices utils datasets  methods   base
 
 other attached packages:
 [1] cluster_1.11.9
 
 
 (Red Hat EL 2.6.9-42 smp - AMD opteron 848)
 
 b
 
 On Oct 9, 2007, at 8:35 PM, Tao Shi wrote:
 
 Hi list,
 
 When I was using 'silhouette' from the 'cluster' package to
 calculate clustering performances, R crashed.  I traced the problem
 to the fact that my clustering labels only have 2's and 3's.  when
 I replaced them with 1's and 2's, the problem was solved.  Is the
 function purposely written in this way so when I have clustering
 labels, 2 and 3, for example, the function somehow takes the
 'missing' cluster 2 into account when it calculates silhouette
 widths?
 
 Thanks,
 
 Tao
 
 ##
 ## sorry about the long attachment
 
 R.Version()
 $platform
 [1] i386-pc-mingw32
 
 $arch
 [1] i386
 
 $os
 [1] mingw32
 
 $system
 [1] i386, mingw32
 
 $status
 [1] 
 
 $major
 [1] 2
 
 $minor
 [1] 5.1
 
 $year
 [1] 2007
 
 $month
 [1] 06
 
 $day
 [1] 27
 
 $`svn rev`
 [1] 42083
 
 $language
 [1] R
 
 $version.string
 [1] R version 2.5.1 (2007-06-27)
 
 library(cluster)
 cl1   ## clustering labels
 [1] 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 2 2 2 2 2 2 2 2 2
 [30] 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2
 [59] 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2
 [88] 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2
 [117] 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2
 [146] 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2
 [175] 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2
 [204] 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2
 x1  ## 1-d input vector
 [1] 1.5707963 1.5707963 1.5707963 1.5707963 1.5707963
 [6] 1.5707963 1.5707963 1.5707963 1.5707963 1.5707963
 [11] 1.5707963 1.5707963 1.5707963 1.5707963 1.5707963
 [16] 1.5707963 1.5707963 1.5707963 1.5707963 1.5707963
 [21] 1.0163758 0.7657763 0.7370084 0.6999689 0.7366476
 [26] 0.7883921 0.6925395 0.7729240 0.7202391 0.7910149
 [31] 0.7397698 0.7958092 0.6978596 0.7350255 0.7294362
 [36] 0.6125713 0.7174000 0.7413046 0.7044205 0.7568104
 [41] 0.7048469 0.7334515 0.7143170 0.7002311 0.7540981
 [46] 0.7627527 0.7712762 0.8193611 0.7801148 

Re: [R] AFT-model with time-dependent covariates

2009-05-13 Thread Terry Therneau
 The coding for an AFT model with time-dependent covariates will be very hard,
and I don't know of anyone who has done it.  (But I don't keep watch of other
survival packages, so something might be there.)
 
  In a Cox model, a subject's risk depends only on the current value of his/her
covariates; in an AFT model the risk depends on the entire covariate history.
(My 'accelerated age' is the sum of all the extra years I have ever gained.)
Coding this is not theoretically complex, but would be a pain-in-the-rear
amount of bookkeeping.
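
A toy illustration of the distinction, not a model fit. It assumes a single subject whose covariate z changes at known times, a made-up coefficient beta, and the common convention that time spent at covariate value z is scaled by exp(beta * z); sign conventions differ between texts and packages.

```r
# hypothetical covariate history for one subject, in (start, stop] intervals
start <- c(0, 2, 5)
stop  <- c(2, 5, 9)
z     <- c(0, 1, 0)
beta  <- 0.5          # assumed coefficient

# Cox-type view: the risk multiplier now depends only on the current z
cox_multiplier_now <- exp(beta * z[length(z)])

# AFT-type view: 'accelerated age' accumulates over the whole history,
# each interval contributing its length times its own acceleration factor
accelerated_age <- sum((stop - start) * exp(beta * z))

cox_multiplier_now
accelerated_age
```

Here the subject is back at z = 0, so the Cox-type multiplier is 1, while the accelerated age still carries the three years spent at z = 1. Keeping that running total for every subject and every candidate beta is the bookkeeping referred to above.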
  
Terry Therneau



Re: [R] plotCI line types for line and for bar

2009-05-13 Thread lehe

Thank you!
Yes, I am using the plotCI from gplots and I want the line connecting the
centers to be dashed, just as for the bars. However, changing the type to
"p" as you said does not give a dashed line but no line at all (only points).




lehe wrote:
 
 Anyone has some clue to this question?
 Thanks in advance!
 
 
 lehe wrote:
 
 Hi,
 I was wondering how to specify the line type for line instead of for bar.
 Here is my code:
 plotCI(x=mcra1avg, uiw=stdev1, type="l", col=2, lty=2)
 This way, I will have the bar line as dashed lty=2 and red col=2, and
 the line connecting the centers of the bars is also red col=2 but solid 
 lty=1. How to make the line connecting the bar centers have the same
 solid lty as the bar?
 Thanks and regards!
 
 
 
   You neglected to say that you were using the plotCI from gplots (not the
 one from plotrix, which has slightly different behaviors).  Here's my
 solution (with some data made up -- you didn't give a reproducible
 example).
 I assume that you meant above that you wanted the line connecting the
 centers to be dashed?
 
 mcra1avg <- 1:3
 stdev1 <- c(0.2,0.1,0.4)
 library(gplots)
 plotCI(x=mcra1avg, uiw=stdev1, type="p", col=2, lty=2)
 lines(mcra1avg, col=2, lty=2)
 
   By the way, it's not at all uncommon to have to wait more than 12 hours for
 a response on the R list -- the variability is very high ... I would say
 it's generally good to wait at least 24 hours before bumping ... 
 
   Ben Bolker
 

-- 
View this message in context: 
http://www.nabble.com/plotCI-line-types-for-line-and-for-bar-tp23501900p23520615.html
Sent from the R help mailing list archive at Nabble.com.



Re: [R] questions on rpart (tree changes when rearrange the order of covariates)

2009-05-13 Thread Terry Therneau
 If two variables have exactly the same split importance, then rpart will use 
the one that was first in the model statement.  So if
rpart(group ~ age + height + weight + sex)
and at some split point both age and weight gave a split with 20 correct and 9 
incorrect, then age would be used to split at that node.

  Even though the error of the age and weight splits are the same, the set of 9 
subjects that were incorrect may be different, i.e., they don't send exactly 
the 
same observations to the left and the right.  Thus, the rest of the tree from 
that point on may be different, giving a different fit.
  
  For continuous y this rarely happens -- that two splits have exactly the same 
R^2 -- but it is not uncommon in classification problems.  
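
The first-come rule can be seen in a deliberately artificial sketch: x2 is an exact copy of x1, so both give identical splits and only the order in the formula decides which one is used. The data and variable names are invented; a real tie would usually involve two different variables that misclassify different subjects, which is why the rest of the tree can diverge.

```r
library(rpart)

# made-up data: x1 separates the classes perfectly, x2 duplicates x1
x1 <- (1:40) / 40
d  <- data.frame(y  = factor(ifelse(x1 > 0.5, "pos", "neg")),
                 x1 = x1,
                 x2 = x1)

fit_a <- rpart(y ~ x1 + x2, data = d, method = "class")
fit_b <- rpart(y ~ x2 + x1, data = d, method = "class")

# variable used at the root node of each tree
as.character(fit_a$frame$var[1])
as.character(fit_b$frame$var[1])
```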
  
Terry Therneau



Re: [R] Looking for a quick way to combine rows in a matrix

2009-05-13 Thread Rocko22

Hello,

I reviewed my code and this will work now for any number of successive TA,
I hope:

b <- matrix(1:64, ncol = 4)
rownames(b) <- rep(c("AA", "AT", "TA", "TT"), each = 4)
key <- rownames(b)
key[key == "AT"] <- "TA"
c <- b
rownames(c) <- key

for (i in 2:nrow(c)) {
  if (rownames(c)[i] == "TA" && rownames(c)[i - 1] == "TA") {
    # sum the two rows and replace the used row with NA values
    c[i, ] <- colSums(c[i:(i - 1), ])
    c[i - 1, ] <- NA
  }
}
# remove the rows with NA values
c <- c[apply(c, 1, function(x) any(!is.na(x))), ]
c
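
A loop-free variant of the same idea, offered as a sketch: give every run of successive "TA" rows a common group number and let rowsum() add the rows within each group. The names new_grp, grp and res are made up here; rowsum() is base R.

```r
b <- matrix(1:64, ncol = 4)
key <- rep(c("AA", "AT", "TA", "TT"), each = 4)
key[key == "AT"] <- "TA"
n <- length(key)

# a row starts a new group unless it and the previous row are both "TA"
new_grp <- c(TRUE, !(key[-1] == "TA" & key[-n] == "TA"))
grp <- cumsum(new_grp)

# sum the rows within each group, keeping the original row order
res <- rowsum(b, grp, reorder = FALSE)
rownames(res) <- key[new_grp]
res
```

On this example it gives nine rows, with the eight successive "TA" rows collapsed into one.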

Rock



Rocko22 wrote:
 
 In the first reply, what was calculated was the overall means by group
 (amino acids). It does not work for a larger database.
 I am quite really new to R, and I worked on your question just to learn
 how to manipulate data with R.
 The following seems to work. The code could be made a lot more elegant and
 straightforward, but it works only when there is no more than two
 successive TA:
 
 Let's try with a matrix b that contains more rows than in your example:
 
 b <- matrix(1:32, ncol = 4)
 rownames(b) <- rep(c("AA", "AT", "TA", "TT"), 2)
 key <- rownames(b)
 key[key == "AT"] <- "TA"
 rownames(b) <- key
 
 for (i in 1:(nrow(b) - 1)) {
   if (rownames(b)[i] == "TA" && rownames(b)[i + 1] == "TA") {
     # sum the rows and replace the used row with NA values
     b[i, ] <- colSums(b[i:(i + 1), ])
     b[i + 1, ] <- NA
   }
 }
 b <- b[order(b[, 1], na.last = NA), ] # removes the rows with NA values
 
 Of course, the rows are reordered, and that may be not wanted. The
 ordering was just to remove the NA rows.
 
 Rock :-D
 
 

-- 
View this message in context: 
http://www.nabble.com/Looking-for-a-quick-way-to-combine-rows-in-a-matrix-tp23491348p23520900.html
Sent from the R help mailing list archive at Nabble.com.



[R] R package to fit mixture or cure survival models

2009-05-13 Thread marc bernard

Dear All,
 
I am desperately trying to find an R package that fits mixture survival
models, also known as cure models. These are survival models where the
survival function is improper, which means that a fraction of subjects
are expected never to experience the event. A huge literature has been
developed for this type of model, but I couldn't find any R package that
fits them.
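
To make "improper survival function" concrete, a minimal numerical sketch of a mixture cure model. It assumes an exponential distribution for the susceptible subjects and made-up parameter values; it is not a fitting routine.

```r
# assumed parameters, chosen only for illustration
cure_fraction <- 0.3    # proportion who never experience the event
rate          <- 0.2    # exponential hazard among the susceptible

# population survival: cured subjects survive forever,
# susceptible subjects follow an ordinary (proper) survival curve
surv_pop <- function(t) {
  cure_fraction + (1 - cure_fraction) * exp(-rate * t)
}

surv_pop(0)                 # everyone is event-free at time 0
surv_pop(c(5, 20, 100))     # levels off near cure_fraction instead of 0
```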
 
Bests
 
Marc




[R] Nagelkerkes R2N

2009-05-13 Thread Andrea Weidacher
Hello All,

As I'm new to R and survival analysis, I have a question about the
Design::validate function:

My code:
cox <- cph(Surv(t,status) ~ var1 + var2 + var3, data=data, x=TRUE, y=TRUE,
surv=TRUE)
cox.val <- validate(cox, B=10, dxy=TRUE, pr=TRUE);

My output (cox.val):
       index.orig              training                test
Dxy    -0.3639222921368090891  -0.3591157308750822175  -0.3634294047761231106
R2      1.000                   1.000                   1.000
Slope   1.000                   1.000                   1.0055508323397084336
D       0.0232804472888947744   0.0226998668193014774   0.0232190381679612834
U      -0.607553318187988      -0.610134584621832       0.254159617147094
Q       0.0233412026207135703   0.0227608802777636665   0.0231936222062465713
       optimism                index.corrected          n
Dxy     0.0043136739010409269  -0.36823596603785002657  10
R2      0.000                   1.                      10
Slope  -0.0055508323397084336   1.00555083233970843359  10
D      -0.0005191713486598047   0.02379961863755457596  10
U      -0.864294201768926       0.2567408835809379      10
Q      -0.0004327419284829055   0.02377394454919647515  10

My question is about the R2: why is the value always 1.0? That doesn't
seem like a realistic value to me.

So I tried to calculate R2 with my own formula:
LR <- -2*cox$loglik[2]
L0 <- -2*cox$loglik[1]
n <- length(data[,"ID"])
R2N <- (1-exp(-LR/n)) / (1-exp(L0/n))

R2N calculated that way is -0.00132314024559236.

Can anybody help me to understand the formula for R2 and why the
validate function results in 1.0?
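
For reference, a sketch of the usual Nagelkerke definition, written in terms of the null and fitted log-likelihoods (the two values that cox$loglik holds, in that order). The function name nagelkerke_r2 and the numbers fed to it are made up, and whether n should count subjects or events for a Cox model is a convention this sketch does not settle.

```r
# Nagelkerke's R2: the Cox-Snell R2 rescaled by its maximum attainable value
nagelkerke_r2 <- function(loglik_null, loglik_model, n) {
  lr_stat <- 2 * (loglik_model - loglik_null)   # likelihood-ratio statistic
  r2_cs   <- 1 - exp(-lr_stat / n)              # Cox-Snell R2
  r2_max  <- 1 - exp(2 * loglik_null / n)       # its upper bound
  r2_cs / r2_max
}

# hypothetical log-likelihoods and sample size
r2 <- nagelkerke_r2(loglik_null = -500, loglik_model = -480, n = 200)
r2
```

Note that the numerator uses the difference of the two log-likelihoods, and that the exponent in the denominator is negative because the null log-likelihood is negative.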

Thanks,

Andrea.



Re: [R] plotCI line types for line and for bar

2009-05-13 Thread Ben Bolker



lehe wrote:
 
 Thank you!
 Yes, I am using the plotCI from gplots and I want the line connecting the
 centers to be dashed, just as for the bars. However changing the type to
 be p as you said does not give dashed line but no line at all (only
 points). 
 
 

 Yes, but the next line

lines(mcra1avg,col=2,lty=2) 

adds a line with the desired line type.
Perhaps one idea about R graphics that would be useful to you is that one
often builds up a desired plot by adding pieces sequentially, rather than
finding a single plot command that does everything at once.
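
The same build-up can be sketched with base graphics alone, in case gplots is not at hand. Note that this swaps plotCI() for arrows(), which is an assumption of this sketch and not what the thread used; the data are the made-up values from the earlier reply.

```r
mcra1avg <- 1:3
stdev1   <- c(0.2, 0.1, 0.4)
x        <- seq_along(mcra1avg)
lower    <- mcra1avg - stdev1
upper    <- mcra1avg + stdev1

# step 1: the points, with enough vertical room for the bars
plot(x, mcra1avg, type = "p", col = 2, ylim = range(lower, upper))

# step 2: dashed error bars, drawn as double-headed flat arrows
arrows(x, lower, x, upper, angle = 90, code = 3,
       length = 0.05, col = 2, lty = 2)

# step 3: dashed line connecting the centers
lines(x, mcra1avg, col = 2, lty = 2)
```

Each call only adds to what is already on the device, so the line type of every piece can be set independently.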

  Ben Bolker

-- 
View this message in context: 
http://www.nabble.com/plotCI-line-types-for-line-and-for-bar-tp23501900p23521202.html
Sent from the R help mailing list archive at Nabble.com.



Re: [R] strucchange | weighted models

2009-05-13 Thread Achim Zeileis

On Tue, 12 May 2009, f.query wrote:


Greetings -

Am hoping to use the strucchange package to look for structural breaks in 
some messy regression data. A series of preliminary analyses indicate that 
BLUE for these data will involve some weighting the data (estimates of a 
particular population parameter) by a function of the variance of the 
estimate (say, inverse of the variance). While I've gone through the docs for 
strucchange (which are excellent, btw),


Thanks!

I don't see a simple (or obvious) way 
to apply some sort of 'weighting' to the regressions implemented in the 
package.


In the old efp()/Fstats()/breakpoints() part there is, I think, no easy
way. But in the new gefp() function you can use weights. If you want to do
breakpoint estimation, I've got some modified code which is not included
in the package... let me know if you need it.


hth,
Z

Short of diving into source (which I could do, but I'm not sure how 
the various tests would be impacted by weighting of any sort), was wondering 
if anyone had dealt with this sort of issue - either with strucchange, or 
some other approach/package?


Thanks in advance...



Re: [R] silhouette: clustering labels have to be consecutive integers starting

2009-05-13 Thread Carvalho, Benilton
Thank you very much, Martin.

Warmest regards, b

Em 13/05/2009, às 09:14, Martin Maechler
maech...@stat.math.ethz.ch escreveu:
 TS == Tao Shi shi...@hotmail.com
on Wed, 10 Oct 2007 06:15:53 + writes:

TS Thank you very much, Benilton and Prof. Ripley, for the
TS speedy replies!

TS Looking forward to the fix!
TS Tao

 I have finally re-stumbled onto this e-mail thread,
 and have indeed fixed the problem.

 Version 1.12.0 of 'cluster' should become visible within a few days,
 and will allow calling

silhouette(g, dis)

 on a grouping vector of k different integer values which need
 *not* necessarily be in 1:k.

 Martin Maechler,
 ETH Zurich


 From: Prof Brian Ripley rip...@stats.ox.ac.uk
 To: Benilton Carvalho bcarv...@jhsph.edu
 CC: Tao Shi shi...@hotmail.com, maech...@stat.math.ethz.ch,
 r-help@r-project.org
 Subject: Re: [R] silhouette: clustering labels have to be
 consecutive
 intergers starting from 1?
 Date: Wed, 10 Oct 2007 05:33:03 +0100 (BST)

 It is a C-level problem in package cluster: valgrind gives

 ==11377== Invalid write of size 8
 ==11377==at 0xA4015D3: sildist (sildist.c:35)
 ==11377==by 0x4706D8: do_dotCode (dotcode.c:1750)

 This is a matter for the package maintainer (Cc:ed here), not R-
 help.

 On Tue, 9 Oct 2007, Benilton Carvalho wrote:

 that happened to me with R-2.4.0 (alpha) and was fixed on R-2.4.0
 (final)...

 http://tolstoy.newcastle.edu.au/R/e2/help/06/11/5061.html

 then i stopped using... now, the problem seems to be back. The same
 examples still apply.

 This fails:

 require(cluster)
 set.seed(1)
 x <- rnorm(100)
 g <- sample(2:4, 100, rep=T)
 for (i in 1:100){
 print(i)
 tmp <- silhouette(g, dist(x))
 }

 and this works:

 require(cluster)
 set.seed(1)
 x <- rnorm(100)
 g <- sample(2:4, 100, rep=T)
 for (i in 1:100){
 print(i)
 tmp <- silhouette(as.integer(factor(g)), dist(x))
 }
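
The workaround in the second snippet relies on factor() mapping arbitrary labels onto consecutive integer codes. A small base-R sketch of just that relabelling step, with a made-up label vector:

```r
g <- c(3, 3, 2, 4, 2, 4, 3)           # hypothetical cluster labels, not in 1:k
g2 <- as.integer(factor(g))           # codes follow the sorted unique labels
g2
table(original = g, relabelled = g2)  # one-to-one mapping between old and new
```

The cross-table is a quick check that every original label maps to exactly one new code before the relabelled vector is passed on.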

 and here's the sessionInfo():

 sessionInfo()
 R version 2.6.0 (2007-10-03)
 x86_64-unknown-linux-gnu

 locale:
 LC_CTYPE=
 en_US.UTF-8;LC_NUMERIC=C;LC_TIME=en_US.UTF-8;LC_COLLATE=en_US.U
 TF-
 8;L
 C_MONETARY=en_US.UTF-8;LC_MESSAGES=en_US.UTF-8;LC_PAPER=en_US.UTF-
 8;L
 C_NAME=
 C;LC_ADDRESS=C;LC_TELEPHONE=C;LC_MEASUREMENT=en_US.UTF-8;LC_ID
 ENTIFICATION=C

 attached base packages:
 [1] stats graphics  grDevices utils datasets  methods
 base

 other attached packages:
 [1] cluster_1.11.9


 (Red Hat EL 2.6.9-42 smp - AMD opteron 848)

 b

 On Oct 9, 2007, at 8:35 PM, Tao Shi wrote:

 Hi list,

 When I was using 'silhouette' from the 'cluster' package to
 calculate clustering performances, R crashed.  I traced the
 problem
 to the fact that my clustering labels only have 2's and 3's.  when
 I replaced them with 1's and 2's, the problem was solved.  Is the
 function purposely written in this way so when I have clustering
 labels, 2 and 3, for example, the function somehow takes the
 'missing' cluster 2 into account when it calculates silhouette
 widths?

 Thanks,

 Tao

 ##
 ## sorry about the long attachment

 R.Version()
 $platform
 [1] i386-pc-mingw32

 $arch
 [1] i386

 $os
 [1] mingw32

 $system
 [1] i386, mingw32

 $status
 [1] 

 $major
 [1] 2

 $minor
 [1] 5.1

 $year
 [1] 2007

 $month
 [1] 06

 $day
 [1] 27

 $`svn rev`
 [1] 42083

 $language
 [1] R

 $version.string
 [1] R version 2.5.1 (2007-06-27)

 library(cluster)
 cl1   ## clustering labels
 [1] 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 2 2 2 2 2 2 2 2 2
 [30] 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2
 [59] 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2
 [88] 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2
 [117] 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2
 [146] 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2
 [175] 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2
 [204] 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2
 x1  ## 1-d input vector
 [1] 1.5707963 1.5707963 1.5707963 1.5707963 1.5707963
 [6] 1.5707963 1.5707963 1.5707963 1.5707963 1.5707963
 [11] 1.5707963 1.5707963 1.5707963 1.5707963 1.5707963
 [16] 1.5707963 1.5707963 1.5707963 1.5707963 1.5707963
 [21] 1.0163758 0.7657763 0.7370084 0.6999689 0.7366476
 [26] 0.7883921 0.6925395 0.7729240 0.7202391 0.7910149
 [31] 0.7397698 0.7958092 0.6978596 0.7350255 0.7294362
 [36] 0.6125713 0.7174000 0.7413046 0.7044205 0.7568104
 [41] 0.7048469 0.7334515 0.7143170 0.7002311 0.7540981
 [46] 0.7627527 0.7712762 0.8193611 0.7801148 0.9061762
 [51] 0.8248195 0.7932630 0.7248037 0.7423547 0.6419314
 [56] 0.6001092 0.7572272 0.7631742 0.7085384 0.8710853
 [61] 0.6589563 0.7464943 0.7487340 0.7751280 0.7946542
 [66] 0.7666081 0.8508109 0.8314308 0.7442471 0.8006093
 [71] 0.7949156 0.7852447 0.7630048 0.7104764 0.6768218
 [76] 0.6806351 0.7255355 0.7431389 0.7523627 0.7670515
 [81] 0.8118214 0.7215615 0.8186164 0.6941610 0.8285453
 [86] 0.8395170 0.8088044 0.8182706 0.7550723 0.7948639
 [91] 0.7204830 0.7109068 0.7756949 

[R] Help with reshape/reShape and indexing

2009-05-13 Thread Ista Zahn
Hi Dana,

 -- Forwarded message --
 From: Dana Sevak dana.se...@yahoo.com
 To: r-help@r-project.org
 Date: Tue, 12 May 2009 23:02:00 -0700 (PDT)
 Subject: [R] Help with reshape/reShape and indexing

 Dear R Helpers,

 I have trouble applying reShape and reshape although I read the documentation 
 and several posts, so I would very much appreciate your help on the two 
 points below.

There are usually many ways to accomplish any given task in R, and
which one you use is a matter of preference. I've settled on using the
reshape package for these kinds of tasks. If you're comfortable with
the solutions already suggested there's no need to continue reading.
Otherwise here's another approach:

 I have a dataframe

 df = data.frame(Name=c("a", "a", "a", "b", "b", "c"), X1=c(12, 13, 14, 
 20, 25, 30), X2 = c(200, 250, 300, 600, 700, 4))

  df
 Name X1  X2
 1    a 12 200
 2    a 13 250
 3    a 14 300
 4    b 20 600
 5    b 25 700
 6    c 30 900

 First I need to add an additional column to this dataframe that will count 
 the number of rows per each Name entry.  The resulting df should look like:

 df.index = data.frame(Name=c("a", "a", "a", "b", "b", "c"), X1=c(12, 13, 14, 
 20, 25, 30), X2 = c(200, 250, 300, 600, 700, 4), Index=c(1,2,3,1,2,1))

  df.index
 Name X1  X2    Index
 1    a 12 200    1
 2    a 13 250    2
 3    a 14 300    3
 4    b 20 600    1
 5    b 25 700    2
 6    c 30 900    1

 How can I do this?

Easy enough with the plyr package (loaded with reshape):

df = data.frame(Name=c("a", "a", "a", "b", "b", "c"), X1=c(12, 13,
14, 20, 25, 30), X2 = c(200, 250, 300, 600, 700, 4))
library(reshape)
df$Index <- ddply(df, "Name", colwise(seq_along))[,1]


 Secondly, I would like to reshape this dataframe in the form:

  df2
  1  2  3
 a 12 13 14
 b 20 25 NA
 c 30 NA NA

 Since the df is sorted by Name and X2, I would need that the available X1 
 values populate the resulting rows in df2 from left to right (i.e. if only 
 one value is available, it is written in the first column and the remaining 
 columns get NAs).

I don't really understand this. What happened to X2? Anyway, I would
do it like this:

 df$X2 <- NULL
 m.df <- melt(df, measure.vars="X1")
 df.final <- cast(m.df, ... ~ Index)
 df.final
  Name variable  1  2  3
1    a       X1 12 13 14
2    b       X1 20 25 NA
3    c       X1 30 NA NA

But I don't see why you want to drop X2, so I would actually do

df = data.frame(Name=c("a", "a", "a", "b", "b", "c"), X1=c(12, 13,
14, 20, 25, 30), X2 = c(200, 250, 300, 600, 700, 4))
df$Index <- ddply(df, "Name", colwise(seq_along))[,1]
df$X2 <- as.character(df$X2)
m.df <- melt(df, measure.vars=c("X1","X2"))
df.final <- cast(m.df, ... ~ Index)
df.final
  Name variable   1   2   3
1    a       X1  12  13  14
2    a       X2 200 250 300
3    b       X1  20  25  NA
4    b       X2 600 700  NA
5    c       X1  30  NA  NA
6    c       X2   4  NA  NA
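
For comparison, both steps can also be done in base R without extra packages. This is only a sketch: it uses ave() for the within-group counter and tapply() for the wide layout, and it enters X2 as 900 for the last row, as in the printed data rather than the typed data.frame() call.

```r
df <- data.frame(Name = c("a", "a", "a", "b", "b", "c"),
                 X1 = c(12, 13, 14, 20, 25, 30),
                 X2 = c(200, 250, 300, 600, 700, 900))

# running counter within each Name
df$Index <- ave(seq_along(df$Name), df$Name, FUN = seq_along)

# wide layout: one row per Name, one column per Index, NA where missing
df2 <- tapply(df$X1, list(df$Name, df$Index), function(v) v[1])
df2
```

Because tapply() fills empty Name/Index combinations with NA, the available X1 values end up in the leftmost columns as requested.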

All the best,
Ista
  If I could generate the Index column, I think I could accomplish this with:

 df2 = reShape(df.index$X1, id=df.index$Name, colvar=df.index$Index)
 colnames(df2) = c("V1", "V2", "V3")

 However, is there a way to get to df2 without using the Index column and 
 still have the NAs written as described above?

 Thank you so much for your help on these two issues.

 With best regards,
 Dana Sevak

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Overlay cdf

2009-05-13 Thread beetle2

Thanks a lot. I found this part of the code

x0 <- c(0, sort(Sample))
p0 <- 0:1000/1000
lines(x0, p0, type = "S", col = "blue")


very helpful, as it seems to plot the empirical cumulative distribution of the
sample drawn from the gamma distribution.
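
The blue step line is built by hand from the sorted sample. Base R's ecdf() gives a closely related step curve directly (the two may differ in exactly where each jump is drawn). A sketch using the same shape and rate as in the quoted code, with the seed added only for reproducibility:

```r
set.seed(1)
Sample <- rgamma(1000, shape = 2.5, rate = 0.8)

Fn <- ecdf(Sample)                       # empirical CDF as a step function
x  <- seq(0, max(Sample), length.out = 200)

plot(Fn, main = "", xlab = "Sample", ylab = "Cumulative probability",
     col = "blue", do.points = FALSE)
lines(x, pgamma(x, shape = 2.5, rate = 0.8), col = "red")  # theoretical CDF

# largest vertical gap between the two curves on this grid
max(abs(Fn(x) - pgamma(x, shape = 2.5, rate = 0.8)))
```

Since Fn is an ordinary function, it can also be evaluated at any point, for example Fn(3) for the proportion of the sample at or below 3.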



Bill.Venables wrote:
 
 Here are some ideas you might like to consider
 
 par(mar = c(5,4,2,4)+0.1, yaxs = "r")
 Sample <- rgamma(1000,2.5,.8)
 hist(Sample, main = "", freq = FALSE, ylim = c(0,1))
 
 pu <- par("usr")[1:2]
 x <- seq(pu[1], pu[2], len = 5000)
 y <- pgamma(x, 2.5, 0.8)
 par(new = TRUE)
 plot(x, y, type = "l", axes = FALSE, ann = FALSE, col = "red")
 lines(x, dgamma(x, 2.5, 0.8), col = "darkgreen")
 
 axis(4, col = "red")
 mtext(side = 4, text = "Cumulative probability", col = "red", line = 2.5)
 
 x0 <- c(0, sort(Sample))
 p0 <- 0:1000/1000
 lines(x0, p0, type = "S", col = "blue")
 
 
 Bill Venables
 http://www.cmis.csiro.au/bill.venables/ 
 
 
 -Original Message-
 From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org]
 On Behalf Of beetle2
 Sent: Wednesday, 13 May 2009 3:23 PM
 To: r-help@r-project.org
 Subject: [R] Overlay cdf
 
 
 Hi,
 Is it possible to overlay a cumulative distribution function on a
 histogram of a gamma distribution?
 
 I have a gamma sample 
 
 Sample = rgamma(1000,2.5,.8)+1.5
 hist(Sample)
 
 regards
 
 
 -- 
 View this message in context:
 http://www.nabble.com/Overlay-cdf-tp23515551p23515551.html
 Sent from the R help mailing list archive at Nabble.com.
 
 

-- 
View this message in context: 
http://www.nabble.com/Overlay-cdf-tp23515551p23517150.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] converting numeric into character strings

2009-05-13 Thread Melissa2k9

Hi,

I'm trying to put some numbers into a data frame. I have a vector of numbers
(change points in a time series) like this:

[1]  2 11 12 20 21 98 99

but I want R to treat it as a single character string so that it goes into
one row and column. Ideally the numbers would be separated by commas, so I
would have for example

Person  Change points (seconds)
A       2,11,12,20,21,98,99
B       4,5,89

etc. Is there any way I can get this?

I've tried this:

for example, if the command to get the numbers was b <- which(a!=s),
then I have tried

as.character(b) but I just end up with

[1] "2"  "11" "12" "20" "21" "98" "99"

which is not what I want, as this is more than one string and is not
separated by commas. I also tried

paste(b, sep=",") but I end up with the same thing. Sorry it's a bit
confusing to read but any help would be great!

Melissa
-- 
View this message in context: 
http://www.nabble.com/converting-numeric-into-character-strings-tp23518762p23518762.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] plotting a grid with color on a map

2009-05-13 Thread dxc13

Thanks, Jim.  This seems to be what I am looking for.  Just have to fine tune
the colors to get some distinctive greens, blues, yellows and oranges in
there and I should be good to go.


Jim Lemon-2 wrote:
 
 It was the NAs that fooled color2D.matplot. This gets your colors, 
 although not exactly what you want. Look at the help for color2D.matplot 
 to get that. I think fiddling with the x and y limits on the map() call 
 will get the positions right.
 
 temp1<-read.table("time1test.dat",header=TRUE)
 library(plotrix)
 # reverse the row order, as color2D.matplot reverses it
 color2D.matplot(temp1[36:1,],c(0.5,1),c(0.5,0),c(1,0),axes=FALSE)
 # don't erase the above plot
 par(new=TRUE)
 # do a ghost plot with just the axes
 plot(0,xlim=maplim[1:2],ylim=maplim[3:4],type="n")
 # add the map on top in black
 map(add=TRUE,col="black")
 
 Jim
 
 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.
 
 

-- 
View this message in context: 
http://www.nabble.com/plotting-a-grid-with-color-on-a-map-tp23514804p23521213.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] read multiple large files into one dataframe

2009-05-13 Thread SYKES, Jennifer
Hello

 

Apologies if this is a simple question, I have searched the help and
have not managed to work out a solution.

Does anybody know an efficient method for reading many text files of the
same format into one table/dataframe?

 

I have around 90 files that contain continuous data over 3 months but
that are split into individual days' data, and I need the whole 3 months
in one file for analysis.  Each day's file contains a large amount of
data (approx 30MB each) and so I need a memory-efficient method to merge
all of the files into one dataframe object.  From what I have read I
will probably want to avoid using for loops etc?  All files are in the
same directory, none have a header row, and each contains around 180,000
rows and the same 25 columns/variables.  Any suggested packages/routines
would be very useful.

 

Thanks

 

Jennifer

 

 



-
***If
you are not the intended recipient, please notify our Help Desk at
Email postmas...@nats.co.uk immediately. You should not copy or use
this email or attachment(s) for any purpose nor disclose their
contents to any other person. NATS computer systems may be
monitored and communications carried on them recorded, to secure
the effective operation of the system and for other lawful
purposes. Please note that neither NATS nor the sender accepts any
responsibility for viruses or any losses caused as a result of
viruses and it is your responsibility to scan or otherwise check
this email and any attachments. NATS means NATS (En Route) plc
(company number: 4129273), NATS (Services) Ltd (company number
4129270), NATSNAV Ltd (company number: 4164590) or NATS Ltd
(company number 3155567) or NATS Holdings Ltd (company number
4138218). All companies are registered in England and their
registered office is at 5th Floor, Brettenham House South,
Lancaster Place, London, WC2E 7EN.
**

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Mixture of survivals or cure models

2009-05-13 Thread marc bernard

Dear All,

 

I am desperately trying to find any R package that fits mixture survival 
models, also known as cure models. These are survival models where the survival 
function is improper, which means that a fraction of subjects is 
expected not to experience the event. A huge literature has been developed for 
this type of model, but I couldn't find any R package that fits it.

 

Bests

 

Marc

_


[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] converting numeric into character strings

2009-05-13 Thread Jorge Ivan Velez
Dear Melissa,
Try this:

 x <- c(2, 11, 12, 20, 21, 98, 99)
 paste(x, collapse=",")
[1] "2,11,12,20,21,98,99"

See ?paste for more information.
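
The difference between sep and collapse is what tripped up the original attempt. A short sketch follows; the list of change points per person is made up, and it also builds the two-column data frame asked for in the question.

```r
b <- c(2, 11, 12, 20, 21, 98, 99)

paste(b, sep = ",")        # sep only separates *arguments*: still 7 strings
paste(b, collapse = ",")   # collapse joins the elements into one string

# hypothetical change points for two people
cp <- list(A = c(2, 11, 12, 20, 21, 98, 99), B = c(4, 5, 89))
res <- data.frame(Person = names(cp),
                  ChangePoints = sapply(cp, paste, collapse = ","),
                  stringsAsFactors = FALSE)
res
```

Keeping the change points in a list and collapsing them only for display leaves the numeric values available for later calculations.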

HTH,

Jorge



On Wed, May 13, 2009 at 5:53 AM, Melissa2k9 m.mcquil...@lancaster.ac.uk wrote:


 Hi,

 Im trying to put some numbers into a dataframe , I have a list of numbers
 (change points in a time series) like such

 [1]  2 11 12 20 21 98 99

 but I want R to recognise this as just a character string so it will put it
 in one row and column, ideally I want them seperated by commas so I would
 have for example

 Person  Change points (seconds)
 A   2,11,12,20,21,98,99
 B4,5,89

 etc. Is there any way I can get this

 I've tried this:

 for example if the command to get the list of numbers was b-which(a!=s),
 then i have tried

 as.character(b) but I just end up with

 [1] 2  11 12 20 21 98 99

 which is not what I want as this is more than one string and is not
 seperated by commas, I also tried

 paste(b,sep=,) but I end up with the same thing. Sorry it's a bit
 confusing to read but any help would be great!

 Melissa
 --
 View this message in context:
 http://www.nabble.com/converting-numeric-into-character-strings-tp23518762p23518762.html
 Sent from the R help mailing list archive at Nabble.com.


[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] 3dscatter for linux

2009-05-13 Thread threshold

Hi, do you have any suggestions on how to make a 3D scatterplot under Linux?
It is worth mentioning that 'scatterplot3d' does not load under Ubuntu
8.10.
Do you know any alternatives? I tried cloud and persp, but the X, Y and Z axes are
empirical in my case and cannot be replaced by any seq(...).
Thanks in advance, robert
 
-- 
View this message in context: 
http://www.nabble.com/3dscatter-for-linux-tp23521603p23521603.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Mixture of survivals or cure models

2009-05-13 Thread Gabor Grothendieck
Check out:

http://www.math.mun.ca/~ypeng/research/

On Wed, May 13, 2009 at 8:34 AM, marc bernard
marc_bern...@hotmail.co.uk wrote:

 Dear All,



 I am desperately trying to find any R package that fits a mixture survival 
 models also know as a cure models. This  are survival model where the 
 survival function is improper  which also means that  a fraction of subjects 
 are expected not to expreience the event. A Huge literature has been 
 developed for thes type of models but I couldn't find any R package that fits 
 this type of models.



 Bests



 Marc

 _


        [[alternative HTML version deleted]]


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Mixture of survivals or cure models

2009-05-13 Thread Gabor Grothendieck
Also:

http://post.queensu.ca/~pengp/software.html

On Wed, May 13, 2009 at 9:21 AM, Gabor Grothendieck
ggrothendi...@gmail.com wrote:
 Check out:

 http://www.math.mun.ca/~ypeng/research/

 On Wed, May 13, 2009 at 8:34 AM, marc bernard
 marc_bern...@hotmail.co.uk wrote:

 Dear All,



 I am desperately trying to find any R package that fits a mixture survival 
 models also know as a cure models. This  are survival model where the 
 survival function is improper  which also means that  a fraction of subjects 
 are expected not to expreience the event. A Huge literature has been 
 developed for thes type of models but I couldn't find any R package that 
 fits this type of models.



 Bests



 Marc

 _


        [[alternative HTML version deleted]]


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] overlap contour

2009-05-13 Thread Ben Bolker



Bala subramanian-2 wrote:
 
 Friends,
 
 I have two covariance matrices (m1 and m2) of same size (150x150). I used
 contourplot function to make contour plots individually (c1 and c2). I am
 interested in making one contourplot overlapping the two individual
 contours
 so that the portion of the plot above and below the diagonal can represent
 the c1 and c2. Someone suggest me how can i do the same.
 
 Is there any way that i can combine m1 and m2 and write the combined
 matrix
 to a file and plot it to achieve the mentioned above.
 
 

I'm not quite sure what you mean (and this may be why no-one has responded
so far).

Do you mean

m2[lower.tri(m2)] <- m1[lower.tri(m1)]
contour(m2)
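
A self-contained sketch of that idea, using small made-up covariance matrices (the sizes and values are assumptions; the matrices in the question were 150x150):

```r
set.seed(1)
a  <- matrix(rnorm(20 * 5), 20, 5)
b  <- matrix(rnorm(20 * 5), 20, 5)
m1 <- cov(t(a))            # 20 x 20 stand-in for the first covariance matrix
m2 <- cov(t(b))            # 20 x 20 stand-in for the second one

comb <- m2                                  # upper triangle and diagonal from m2
comb[lower.tri(comb)] <- m1[lower.tri(m1)]  # lower triangle from m1

contour(comb)
abline(0, 1, lty = 2)      # diagonal separating the two halves
# write.table(comb, "combined.txt")   # optional: save the combined matrix
```

Working on a copy leaves m2 itself unchanged, which matters if both original matrices are still needed afterwards.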

?

  I can imagine a fancier solution where you use contourLines to
extract the contour lines, remove points where xy, and plot them,
but that seems like more work.

 Ben Bolker

-- 
View this message in context: 
http://www.nabble.com/overlap-contour-tp23520206p23521760.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Multiple ANOVA tests

2009-05-13 Thread Tal Galili
Hi Imri.
You could do this by doing something like this:

# Getting an Anova first:
utils::data(npk, package="MASS")
( npk.aov <- aov(yield ~ block + N*P*K, npk) )
summary(npk.aov) # I want the P value from this summary of aov object.

# here is the code:
summary(npk.aov)[[1]]$P
# [1] 0.015938790 0.004371812 0.474904093 0.028795054 0.263165283 0.168647879
# [7] 0.862752086          NA
# the last one is the P value for the residuals, which doesn't exist - so it
# returns NA.
# so you might want to use:
na.omit(summary(npk.aov)[[1]]$P)
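
To collect the p-values from several ANOVA tables at once, something along these lines might work. The set of formulas is invented for illustration, the npk data are taken from the datasets package that ships with R, and the column is picked by its full name "Pr(>F)" rather than by partial matching.

```r
# hypothetical set of models: same response, different factors
forms <- list(N = yield ~ block + N,
              P = yield ~ block + P,
              K = yield ~ block + K)

pvals <- lapply(forms, function(f) {
  tab <- summary(aov(f, data = npk))[[1]]   # ANOVA table of one model
  p <- tab[["Pr(>F)"]]
  names(p) <- trimws(rownames(tab))         # row names are padded with spaces
  p[!is.na(p)]                              # drop the Residuals row
})
pvals
```

The result is a named list with one named vector of p-values per model, which can be flattened with unlist() if a single vector is more convenient.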

Now you have a vector of P values, and you could do whatever you want with
it...


Cheers,
Tal







On Wed, May 13, 2009 at 1:32 PM, Imri bisr...@agri.huji.ac.il wrote:


 Hello!!!
 I'm trying to do multiple ANOVA tests with R (testing the effect of
 different factors on the same response). As a result I get many ANOVA
 tables, and I want to extract a list of the Pr(>F) values from all the tables.
 Maybe someone has an idea how to do this?
 Thanks
Imri

 --
 View this message in context:
 http://www.nabble.com/Multiple-ANOVA-tests-tp23518637p23518637.html
 Sent from the R help mailing list archive at Nabble.com.





-- 
--


My contact information:
Tal Galili
Phone number: 972-50-3373767
FaceBook: Tal Galili
My Blogs:
http://www.r-statistics.com/
http://www.talgalili.com
http://www.biostatistics.co.il

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] read multiple large files into one dataframe

2009-05-13 Thread baptiste auguie

I'd first try plyr and see if it's efficient enough,


library(plyr)

listOfFiles <- list.files(pattern = "\\.txt$")

d <- ldply(listOfFiles, read.table)
str(d)



alternatively,


d <- do.call(rbind, lapply(listOfFiles, read.table))
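
With 90 files of roughly 180,000 rows each, telling read.table() the column types and the row count up front usually helps with both speed and memory. The following is a sketch under assumptions: the files are tiny temporary stand-ins, and the three columns are invented (the real files have 25).

```r
# create three small stand-in files in a temporary directory
dir <- tempfile("days")
dir.create(dir)
for (i in 1:3) {
  write.table(data.frame(i, runif(5), rnorm(5)),
              file.path(dir, sprintf("day%02d.txt", i)),
              row.names = FALSE, col.names = FALSE)
}

files <- list.files(dir, pattern = "\\.txt$", full.names = TRUE)

read_one <- function(f) {
  read.table(f, header = FALSE,
             colClasses = c("integer", "numeric", "numeric"), # skip type guessing
             nrows = 5,            # known (or slightly overestimated) row count
             comment.char = "")    # no scanning for comments
}

d <- do.call(rbind, lapply(files, read_one))
str(d)
```

For the real data the colClasses vector would need 25 entries matching the actual column types, and nrows would be set to a value a little above 180,000.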



HTH,

baptiste


On 13 May 2009, at 12:45, SYKES, Jennifer wrote:


Hello



Apologies if this is a simple question, I have searched the help and
have not managed to work out a solution.

Does anybody know an efficient method for reading many text files of  
the

same format into one table/dataframe?



I have around 90 files that contain continuous data over 3 months but
that are split into individual days data and I need the whole 3 months
in one file for analysis.  Each days file contains a large amount of
data (approx 30MB each) and so I need a memory efficient method to  
merge
all of the files into the one dataframe object.  From what I have  
read I

will probably want to avoid using for loops etc?  All files are in the
same directory, none have a header row, and each contain around  
180,000
rows and the same 25 columns/variables.  Any suggested packages/ 
routines

would be very useful.



Thanks



Jennifer








_

Baptiste Auguié

School of Physics
University of Exeter
Stocker Road,
Exeter, Devon,
EX4 4QL, UK

Phone: +44 1392 264187

http://newton.ex.ac.uk/research/emag

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] read multiple large files into one dataframe

2009-05-13 Thread Mike Lawrence
What types of data are in each file? All numbers, or a mix of numbers
and characters? Any missing data or special NA values?

On Wed, May 13, 2009 at 7:45 AM, SYKES, Jennifer
jennifer.sy...@nats.co.uk wrote:
 Hello



 Apologies if this is a simple question, I have searched the help and
 have not managed to work out a solution.

 Does anybody know an efficient method for reading many text files of the
 same format into one table/dataframe?



 I have around 90 files that contain continuous data over 3 months but
 that are split into individual days data and I need the whole 3 months
 in one file for analysis.  Each days file contains a large amount of
 data (approx 30MB each) and so I need a memory efficient method to merge
 all of the files into the one dataframe object.  From what I have read I
 will probably want to avoid using for loops etc?  All files are in the
 same directory, none have a header row, and each contain around 180,000
 rows and the same 25 columns/variables.  Any suggested packages/routines
 would be very useful.



 Thanks



 Jennifer










-- 
Mike Lawrence
Graduate Student
Department of Psychology
Dalhousie University

Looking to arrange a meeting? Check my public calendar:
http://tr.im/mikes_public_calendar

~ Certainty is folly... I think. ~

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] 3dscatter for linux

2009-05-13 Thread Ben Bolker



threshold wrote:
 
 Hi, do you have any suggestions how to make 3D scatterplot, BUT under
 linux. Worth mentioning is the fact that 'scatterplot3d' does not load
 under Ubuntu 8.10.
 Do you know any alternatives?? I tried cloud or persp but X,Y and Z axes
 are emprical in my case, and cannot be replaced by any seq(...).
 Thanks in advance, robert
  
 

http://wiki.r-project.org/rwiki/doku.php?id=tips:graphics-3d:graphics-3d

  See esp. rgl::plot3d()

  Also, cloud() seems to work just fine with irregular x,y, z:

 d <- data.frame(x=runif(10),y=runif(10),z=runif(10))
 library(lattice)
 cloud(z~x*y,data=d)

how/why doesn't scatterplot3d load?  I can't find any reference to this on
the mailing lists, but maybe I'm missing something.  It does fine in Ubuntu
9.04 (intrepid).

  Ben Bolker

-- 
View this message in context: 
http://www.nabble.com/3dscatter-for-linux-tp23521603p23521711.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] questions on rpart (tree changes when rearrange the order of covariates?!)

2009-05-13 Thread Dimitri Liakhovitski
I wonder - isn't this issue one of the reasons to use RandomForests
rather than CART?

On Wed, May 13, 2009 at 8:03 AM, Liaw, Andy andy_l...@merck.com wrote:
 From: Uwe Ligges

 Yuanyuan wrote:
  Greetings,
 
  I am using rpart for classification with class method.
 The test data  is
  the Indian diabetes data from package mlbench.
 
  I fitted a classification tree firstly using the original
 data, and then
  exchanged the order of Body mass and Plasma glucose which are the
  strongest/important variables in the growing phase. The
 second tree is a
  little different from the first one. The misclassification
 tables are
  different too. I did not change the data, but why the results are so
  different?

 Well, at some splits the variable that comes first and yields in the
 same reduction of the entropy criterion as another one might be used,
 hence another result.

 Uwe Ligges

 I recently tried writing adaboost.m1 using rpart, and was surprised that
 with very small training set (say n=10 or 20), I get a large improvement
 in test set accuracy if I randomly shuffle the columns in the data at
 every adaboost iteration.  (With twonorm data, we're talking about 25%
 error vs. 19%, using n=2000 test set.)  It turned out to be the way
 rpart deals with ties--- first come, first win.  Without shuffling the
 columns, rpart almost never picks any variable beyond the 10th.  (In
 twonorm, all variables are equally important, so one would expect
 roughly equal selection frequency.)

 I've gotten some pointers from Terry Therneau about where in the code to
 check.  I may try to implement breaking ties at random (as I've done in
 randomForest).  No promises, though...

 Andy
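
The first-come, first-win handling of ties described above can be illustrated with a small sketch. It assumes a synthetic data set (not the Pima data) in which x2 is an exact copy of x1, so the two candidate root splits tie by construction; the names d1, d2, root1 and root2 are made up for this illustration. Which of the tied variables is reported may depend on the rpart version, so no output is shown here.

```r
library(rpart)

set.seed(1)
n  <- 200
x1 <- runif(n)

# x2 duplicates x1, so both variables give exactly the same improvement
d1 <- data.frame(y  = factor(ifelse(x1 > 0.5, "pos", "neg")),
                 x1 = x1,
                 x2 = x1)
# Same data, with the order of the two predictors exchanged
d2 <- d1[, c("y", "x2", "x1")]

fit1 <- rpart(y ~ ., data = d1, method = "class")
fit2 <- rpart(y ~ ., data = d2, method = "class")

# Variable used for the root split in each fit
root1 <- as.character(fit1$frame$var[1])
root2 <- as.character(fit2$frame$var[1])
print(c(original_order = root1, exchanged_order = root2))
```

If the two reported names differ, the only thing that changed between the fits is the column order, which is the effect discussed in this thread.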




 
  Does anyone know how rpart deals with ties?
 
  Here is the code for running the two trees.
 
 
  library(mlbench)
  data(PimaIndiansDiabetes2)
  mydata <- PimaIndiansDiabetes2
  library(rpart)
  fit2 <- rpart(diabetes~., data=mydata, method="class")
  plot(fit2, uniform=TRUE, main="CART for original data")
  text(fit2, use.n=TRUE, cex=0.6)
  printcp(fit2)
  table(predict(fit2, type="class"), mydata$diabetes)
  ## misclassification table: rows are fitted class
        neg pos
    neg 437  68
    pos  63 200
  #Klimt(fit2,mydata)
 
  pmydata <- data.frame(mydata[,c(1,6,3,4,5,2,7,8,9)])
  fit3 <- rpart(diabetes~., data=pmydata, method="class")
  plot(fit3, uniform=TRUE, main="CART after exchanging mass & glucose")
  text(fit3, use.n=TRUE, cex=0.6)
  printcp(fit3)
  table(predict(fit3, type="class"), pmydata$diabetes)
  ## after exchanging the order of BODY mass and PLASMA glucose
        neg pos
    neg 436  64
    pos  64 204
  #Klimt(fit3,pmydata)
 
 
  Thanks,
 
 
 
 --
 
  Yuanyuan Huang
 
      [[alternative HTML version deleted]]
 




-- 
Dimitri Liakhovitski
MarketTools, Inc.
dimitri.liakhovit...@markettools.com



Re: [R] import HTML tables

2009-05-13 Thread Duncan Temple Lang

Dieter Menne wrote:


Dimitri Szerman-2 wrote:

Hello,
I was wondering if there is a function in R that imports tables directly
from a HTML document.



The XML package can do this:

http://markmail.org/message/cyicoa3htme4gei2

Duncan Temple Lang:

The htmlParse() and htmlTreeParse() functions in the XML package use the
non-strict HTML parser in libxml2 and so the HTML document can be malformed. 


Indeed. Thanks Dieter.

htmlParse() reads the document; getNodeSet allows us to
easily find the table or tables of interest.
We can find the th and td entries easily using XPath also.

The less automated part is how to meaningfully process the content.
That is where a human  should be involved, deciding whether to trim
white space, how to convert text to values, dealing with missing cells.
We can do a lot by default, but ...


There is a relatively simple function at

  http://www.omegahat.org/ParseXML/readHTMLTable.R

that provides something resembling read.table.
It is not well tested because, in the past, I have simply used XPath
directly: once you know XPath, extracting content from HTML/XML is
very straightforward.

  D.
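
A minimal sketch of the XPath route described above, assuming the XML package is installed (the code is skipped otherwise). The HTML string, the column names and the object names (html, header, cells, tab) are invented for the illustration; how to trim and convert the cell text remains a decision for the user, as noted above.

```r
tab <- NULL  # stays NULL if the XML package is not available

if (requireNamespace("XML", quietly = TRUE)) {
  # A tiny in-memory document standing in for a real page
  html <- paste0(
    "<html><body><table>",
    "<tr><th>name</th><th>score</th></tr>",
    "<tr><td>a</td><td>1</td></tr>",
    "<tr><td>b</td><td>2</td></tr>",
    "</table></body></html>")

  doc <- XML::htmlParse(html, asText = TRUE)

  # Pull the header and data cells out with XPath
  header <- XML::xpathSApply(doc, "//table//th", XML::xmlValue)
  cells  <- XML::xpathSApply(doc, "//table//td", XML::xmlValue)

  # Everything arrives as text; reshape it row by row
  tab <- as.data.frame(matrix(cells, ncol = length(header), byrow = TRUE),
                       stringsAsFactors = FALSE)
  names(tab) <- header

  # The manual step: decide how each column should be converted
  tab$score <- as.numeric(tab$score)
  print(tab)
}
```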





Dieter




[R] name siz ein cluster

2009-05-13 Thread Penner, Johannes
I would like to change the size of the names in a cluster dendrogram (not
the axis or the header) (package clue). The normal things (pch,
cex.label, font) do not work here.

Thanks in advance!
Johannes



[R] access to the current element of lapply

2009-05-13 Thread Martial Sankar

Dear All, 

I would like to use the 'split' function on the dataframe elements contained in 
a list L.

For example : 

 (df <- data.frame(cbind(c(rep('A',2), rep('B',2)), rep(1:4))))
  X1 X2
1  A  1
2  A  2
3  B  3
4  B  4
 (L <- split(df, df$X1))
$A
  X1 X2
1  A  1
2  A  2

$B
  X1 X2
3  B  3
4  B  4

Now, I would like to split EACH data frame, ie, according to column 2(X2).

 lapply(L, split, df$X2)

$A
$A$`1`
  X1 X2
1  A  1

$A$`2`
  X1 X2
2  A  2

$A$`3`
[1] X1 X2
0 rows (or 0-length row.names)

$A$`4`
[1] X1 X2
0 rows (or 0-length row.names)


$B
$B$`1`
  X1 X2
3  B  3

$B$`2`
  X1 X2
4  B  4

$B$`3`
[1] X1 X2
0 rows (or 0-length row.names)

$B$`4`
[1] X1 X2
0 rows (or 0-length row.names)


Warning messages:
1: In split.default(seq_len(nrow(x)), f, drop = drop, ...) :
  data length is not a multiple of split variable
2: In split.default(seq_len(nrow(x)), f, drop = drop, ...) :
  data length is not a multiple of split variable



It works, but it's dirty.
How could I do it properly, without warnings and 0-row data frames in the output?
I thought that accessing the current element of 'lapply' to retrieve the vector 
in column 2 would work.
i.e.:

lapply(L,split, L[[current]][,2]) 


Is there a way to do something like that in R ?


Thanks in advance !

- Martial










Re: [R] Nagelkerkes R2N

2009-05-13 Thread Frank E Harrell Jr
A new version of Design will be posted to CRAN in the next 2 days. 
After that, update your system, including an update to the survival 
package.  Then re-try.


Your formula is wrong, since R2 cannot be negative.  LR should be the 
likelihood ratio chi-square statistic: -2 times the difference in the two 
loglik values.
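
As a sketch of that correction, assuming the usual Nagelkerke definition written in terms of the null and fitted log-likelihoods. The function name nagelkerke_r2 and its arguments are hypothetical and not part of Design; for a cph fit the two log-likelihoods would be cox$loglik[1] and cox$loglik[2], and the choice of n is left to the reader.

```r
# Nagelkerke's R2 from two log-likelihoods (illustrative helper only)
nagelkerke_r2 <- function(loglik_null, loglik_model, n) {
  # Likelihood-ratio chi-square: -2 times the difference in log-likelihoods
  lr <- -2 * (loglik_null - loglik_model)
  # Cox-Snell style numerator, rescaled by its maximum attainable value
  (1 - exp(-lr / n)) / (1 - exp(2 * loglik_null / n))
}

# Made-up numbers, only to show that the result stays between 0 and 1
r2_demo <- nagelkerke_r2(loglik_null = -100, loglik_model = -90, n = 50)
r2_zero <- nagelkerke_r2(loglik_null = -100, loglik_model = -100, n = 50)
print(c(r2_demo = r2_demo, r2_zero = r2_zero))
```

Because the fitted log-likelihood is never below the null one, lr is non-negative, so the value cannot drop below zero.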


Frank

Andrea Weidacher wrote:

Hello All,

as I´m new to R and survival analysis, I´ve got a question about the
Design::validate function:

My Code:
cox <- cph(Surv(t,status) ~ var1 + var2 + var3, data=data, x=TRUE, y=TRUE,
surv=TRUE)
cox.val <- validate(cox, B=10, dxy=TRUE, pr=TRUE);

My output (cox.val):
  index.orig   training   test
Dxy   -0.3639222921368090891 -0.3591157308750822175 -0.3634294047761231106
R2 1.000  1.000  1.000
Slope  1.000  1.000  1.0055508323397084336
D  0.0232804472888947744  0.0226998668193014774  0.0232190381679612834
U -0.607553318187988 -0.610134584621832  0.254159617147094
Q  0.0233412026207135703  0.0227608802777636665  0.0231936222062465713
optimism index.corrected  n
Dxy0.0043136739010409269 -0.36823596603785002657 10
R2 0.000  1. 10
Slope -0.0055508323397084336  1.00555083233970843359 10
D -0.0005191713486598047  0.02379961863755457596 10
U -0.864294201768926  0.2567408835809379 10
Q -0.0004327419284829055  0.02377394454919647515 10

My question is about the R2: why is the value always 1.0? That doesn't
seem like a realistic value to me.

And so I tried to calculate R2 with my own formula:
LR <- -2*cox$loglik[2]
L0 <- -2*cox$loglik[1]
n <- length(data[,"ID"])
R2N <- (1-exp(-LR/n)) / (1-exp(L0/n))

R2N calculated that way is -0.00132314024559236. 


Can anybody help me understand the formula for R2 and why the
validate function results in 1.0?

Thanks,

Andrea.





--
Frank E Harrell Jr   Professor and Chair   School of Medicine
 Department of Biostatistics   Vanderbilt University



[R] rpart - not for classification?

2009-05-13 Thread Dimitri Liakhovitski
Hello!
A very minor point.

I typed help.search("classification").
It found a bunch of things, including randomForest, which makes a lot of sense.

I am wondering why rpart was not found. I think it should have been, too.


-- 
Dimitri Liakhovitski
MarketTools, Inc.
dimitri.liakhovit...@markettools.com



Re: [R] name siz ein cluster

2009-05-13 Thread Simon Pickett
I'm afraid I have no experience with the clue package, but if all else fails 
you could consider the hclust() function (in the stats package).


You can change the font size in the conventional way with this.
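
A small sketch of what "the conventional way" could look like with hclust() from the stats package. The USArrests subset is only a stand-in data set, the object name hc is arbitrary, and whether the same argument works for objects from the clue package is not checked here.

```r
# Cluster the first ten rows of a built-in data set
hc <- hclust(dist(USArrests[1:10, ]))

pdf(NULL)                        # null device, so no file is written
plot(hc, cex = 0.6,              # cex scales the leaf labels
     main = "Smaller leaf labels")
invisible(dev.off())
```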

Cheers, Simon.


- Original Message - 
From: Penner, Johannes johannes.pen...@mfn-berlin.de

To: r-help@r-project.org
Sent: Wednesday, May 13, 2009 3:08 PM
Subject: [R] name siz ein cluster



I would like to change to size of the names in a cluster dendrogram (not
the axis or the header) (package clue). The normal things (pch,
cex.label, font) do not work here.

Thanks in advance!
Johannes



Re: [R] access to the current element of lapply

2009-05-13 Thread Marc Schwartz

On May 13, 2009, at 9:12 AM, Martial Sankar wrote:



Dear All,

I would like to use the 'split' function on the dataframe elements  
contained in a list L.


For example :


(df <- data.frame(cbind(c(rep('A',2), rep('B',2)), rep(1:4))))

 X1 X2
1  A  1
2  A  2
3  B  3
4  B  4

(L <- split(df, df$X1))

$A
 X1 X2
1  A  1
2  A  2

$B
 X1 X2
3  B  3
4  B  4

Now, I would like to split EACH data frame, ie, according to column  
2(X2).



lapply(L, split, df$X2)


$A
$A$`1`
 X1 X2
1  A  1

$A$`2`
 X1 X2
2  A  2

$A$`3`
[1] X1 X2
0 rows (or 0-length row.names)

$A$`4`
[1] X1 X2
0 rows (or 0-length row.names)


$B
$B$`1`
 X1 X2
3  B  3

$B$`2`
 X1 X2
4  B  4

$B$`3`
[1] X1 X2
0 rows (or 0-length row.names)

$B$`4`
[1] X1 X2
0 rows (or 0-length row.names)


Warning messages:
1: In split.default(seq_len(nrow(x)), f, drop = drop, ...) :
 data length is not a multiple of split variable
2: In split.default(seq_len(nrow(x)), f, drop = drop, ...) :
 data length is not a multiple of split variable



It works, but it's dirty.
How could I do it properly, without warnings and 0-row data frames  
in the output?
I thought that accessing the current element of 'lapply' to retrieve  
the vector in column 2 would work.

i.e.:

lapply(L,split, L[[current]][,2])


Is there a way to do something like that in R ?


Thanks in advance !

- Martial


# Split on BOTH columns and drop unused levels
L <- split(df, list(df$X1, df$X2), drop = TRUE)

 L
$A.1
  X1 X2
1  A  1

$A.2
  X1 X2
2  A  2

$B.3
  X1 X2
3  B  3

$B.4
  X1 X2
4  B  4


Is that what you want?
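
If the nested structure is wanted instead, a possible sketch is to pass lapply an anonymous function, whose argument then plays the role of the "current element". The data frame is rebuilt here with explicit column names so that the example is self-contained; d and nested are arbitrary names.

```r
df <- data.frame(X1 = c("A", "A", "B", "B"), X2 = 1:4)
L  <- split(df, df$X1)

# d is the current element of L; split it on its own X2 column
nested <- lapply(L, function(d) split(d, d$X2, drop = TRUE))

str(nested, max.level = 2)
```

Because each piece is split on its own column rather than on df$X2, the lengths match and no empty data frames are produced.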

HTH,

Marc Schwartz



Re: [R] read multiple large files into one dataframe

2009-05-13 Thread Simon Pickett

can you provide reproducible code please?

even a fake example would help.

I would

1) set up a loop to read in each file from a directory;
2) inside the loop, chop up/aggregate the data, one file at a time, and write 
each new aggregated file out to a directory using write.table(). This will 
reduce the memory needed by including only the info you want. Make sure each 
file is a data frame with the same names;
3) set up a new loop to read in each new small file and rbind them all 
together to make your new master file.
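
The steps above can be sketched roughly as follows. Everything here is made up for the illustration: the three small fake files, the column names sensor and value, and the use of a per-file mean as the aggregation. The intermediate write.table() step is skipped and the aggregated pieces are kept in a list instead.

```r
# Fake daily files in a temporary directory (stand-ins for the real data)
dir_in <- file.path(tempdir(), "daily_demo")
dir.create(dir_in, showWarnings = FALSE)
set.seed(42)
for (day in 1:3) {
  fake <- data.frame(sensor = rep(c("s1", "s2"), each = 5),
                     value  = rnorm(10))
  write.table(fake, file.path(dir_in, sprintf("day%02d.txt", day)),
              row.names = FALSE, col.names = FALSE)
}

files <- list.files(dir_in, pattern = "^day[0-9]+\\.txt$", full.names = TRUE)

# Read one file at a time and keep only the aggregated summary
pieces <- lapply(files, function(f) {
  raw <- read.table(f, header = FALSE,
                    col.names  = c("sensor", "value"),
                    colClasses = c("character", "numeric"))
  out <- aggregate(value ~ sensor, data = raw, FUN = mean)
  out$file <- basename(f)
  out
})

# Bind the small pieces together once, at the end
master <- do.call(rbind, pieces)
print(master)
```

Binding once with do.call(rbind, ...) avoids growing the master data frame inside the loop.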


The R gurus may have a more parsimonious solution.

HTH

Simon.


- Original Message - 
From: SYKES, Jennifer jennifer.sy...@nats.co.uk

To: r-help@r-project.org
Sent: Wednesday, May 13, 2009 11:45 AM
Subject: [R] read multiple large files into one dataframe



Hello



Apologies if this is a simple question, I have searched the help and
have not managed to work out a solution.

Does anybody know an efficient method for reading many text files of the
same format into one table/dataframe?



I have around 90 files that contain continuous data over 3 months but
that are split into individual days data and I need the whole 3 months
in one file for analysis.  Each days file contains a large amount of
data (approx 30MB each) and so I need a memory efficient method to merge
all of the files into the one dataframe object.  From what I have read I
will probably want to avoid using for loops etc?  All files are in the
same directory, none have a header row, and each contain around 180,000
rows and the same 25 columns/variables.  Any suggested packages/routines
would be very useful.



Thanks



Jennifer









Re: [R] name siz ein cluster

2009-05-13 Thread Penner, Johannes
I tried for example:

plot(mycluster, font=2) 

But this changes only the font size of the y-axis.

Regards
Johannes
--
Project Coordinator BIOTA West Amphibians

Museum of Natural History
Dep. of Research (Herpetology)
Invalidenstrasse 43
D-10115 Berlin
Tel: +49 (0)30 2093 8708
Fax: +49 (0)30 2093 8565

http://www.biota-africa.org
http://community-ecology.biozentrum.uni-wuerzburg.de

-Original Message-
From: Simon Pickett [mailto:simon.pick...@bto.org] 
Sent: Wednesday, May 13, 2009 16:30
To: Penner, Johannes; r-help@r-project.org
Subject: Re: [R] name siz ein cluster

I'm afraid I have no experience with the clue package, but if all else fails 
you could consider the hclust package.

You change font size in the conventional way with this.

Cheers, Simon.


- Original Message - 
From: Penner, Johannes johannes.pen...@mfn-berlin.de
To: r-help@r-project.org
Sent: Wednesday, May 13, 2009 3:08 PM
Subject: [R] name siz ein cluster


I would like to change to size of the names in a cluster dendrogram (not
 the axis or the header) (package clue). The normal things (pch,
 cex.label, font) do not work here.

 Thanks in advance!
 Johannes



Re: [R] where does the null come from?

2009-05-13 Thread Gabor Grothendieck
 out <- apply(m, 1, cat, '\n')
1 3
2 4
 out
NULL


On Wed, May 13, 2009 at 5:23 AM, Wacek Kusnierczyk
waclaw.marcin.kusnierc...@idi.ntnu.no wrote:
    m = matrix(1:4, 2)

    apply(m, 1, cat, '\n')
    # 1 2
    # 3 4
    # NULL

 why the null?

 vQ



[R] Histogram + % of cases for a given criteria

2009-05-13 Thread S. Nunes
Hi all,

I am doing some explorations using a dataset with the following
structure (id, value, flag).
For instance:

a, 2.2, 1
b, 3.0, 1
c, 2.9, 0
d, 3.1, 1
...

I have plotted a standard histogram using a simple command like:
hist(data$value)

My question:

I would like to superimpose a line ([0%-100%] scale) representing the
% of values that, for each class of the histogram, have the $flag
equal to 1.

What strategy do you recommend? Is this easily doable in R?

I hope I made myself clear. Please let me know if not.

Thanks in advance,
--
Sérgio Nunes
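
One possible strategy, offered only as a sketch: take the bin edges from hist(), assign each value to a bin with cut(), compute the share of flag == 1 per bin, and draw that share on a second 0-100 axis. The data are simulated and the names dat, h, bin and pct are arbitrary.

```r
set.seed(1)
dat <- data.frame(value = rnorm(200, mean = 3, sd = 0.5),
                  flag  = rbinom(200, size = 1, prob = 0.6))

pdf(NULL)                                   # null device, no file written
h <- hist(dat$value, main = "", xlab = "value")

# Reuse the histogram's own breaks so the bins line up exactly
bin <- cut(dat$value, breaks = h$breaks, include.lowest = TRUE)
pct <- 100 * tapply(dat$flag, bin, mean)    # NA for empty bins

# Overlay the percentage line on a separate 0-100 scale
par(new = TRUE)
plot(h$mids, pct, type = "b", col = "red",
     xlim = range(h$breaks), ylim = c(0, 100),
     axes = FALSE, xlab = "", ylab = "")
axis(4)                                     # percentage axis on the right
invisible(dev.off())
```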



Re: [R] where does the null come from?

2009-05-13 Thread Ted Harding
On 13-May-09 14:43:17, Gabor Grothendieck wrote:
 out <- apply(m, 1, cat, '\n')
 1 3
 2 4
 out
 NULL

Or, more explicitly, from ?cat :

  Value:
   None (invisible 'NULL').

Ted.
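
A short sketch of the point made above: cat() prints as a side effect and returns NULL, so the NULL at the console is the printed return value of apply(). Wrapping the call in invisible(), or assigning it, is one way to keep it from being shown; res and out are arbitrary names.

```r
m <- matrix(1:4, 2)

res <- cat("hello\n")            # prints, but the value of the call is NULL
is.null(res)

out <- apply(m, 1, cat, "\n")    # prints each row; nothing useful is returned
is.null(out)

invisible(apply(m, 1, cat, "\n"))  # same printing, no trailing NULL
```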

 On Wed, May 13, 2009 at 5:23 AM, Wacek Kusnierczyk
 waclaw.marcin.kusnierc...@idi.ntnu.no wrote:
    m = matrix(1:4, 2)

    apply(m, 1, cat, '\n')
    # 1 2
    # 3 4
    # NULL

 why the null?

 vQ



E-Mail: (Ted Harding) ted.hard...@manchester.ac.uk
Fax-to-email: +44 (0)870 094 0861
Date: 13-May-09   Time: 15:56:04
-- XFMail --



Re: [R] read multiple large files into one dataframe

2009-05-13 Thread Liaw, Andy
A few points to consider:

- If all the data are numeric, then use matrices instead of data frames.

- With either data frames or matrices, there is no way (that I'm aware
of anyway) in R to stack them without making at least one copy in
memory.

- Since none of the files has a header row, I would concatenate them
into one file outside R (e.g., on *nix, cat * > all.txt) and then read
that in.  You can also try it inside R with something like
read.table(pipe()).  You will want to make use of the colClasses
argument in read.table() to specify the column types, though, to ensure
that read.table() only goes through the input once.

- You're probably better off getting the data into a database (even
something like sqlite) and use an R interface to that database.

- 30MB x 90 = 2.7GB.  Unless you're on a 64-bit machine with lots of
RAM, you're not likely to have much fun with the data even when you
manage to get it into R in one piece.
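
Some of the points above can be sketched in a portable way. This is only an illustration under stated assumptions: three tiny all-numeric fake files stand in for the real ones, file.append() is swapped in for the shell's cat so the example does not depend on the platform, and the size estimate assumes all 25 columns are stored as 8-byte doubles.

```r
# Fake header-less numeric files (stand-ins for the real daily files)
dir_in <- file.path(tempdir(), "concat_demo")
dir.create(dir_in, showWarnings = FALSE)
for (day in 1:3) {
  write.table(matrix(day + (1:8) / 10, nrow = 4),
              file.path(dir_in, sprintf("day%02d.txt", day)),
              row.names = FALSE, col.names = FALSE)
}
files <- sort(list.files(dir_in, pattern = "^day[0-9]+\\.txt$",
                         full.names = TRUE))

# Concatenate on disk first, then read the result in a single pass
all_file <- file.path(dir_in, "all.txt")
file.create(all_file)                       # start from an empty file
for (f in files) file.append(all_file, f)

dat <- read.table(all_file, header = FALSE,
                  colClasses = rep("numeric", 2))
m <- as.matrix(dat)                         # all numeric, so a matrix will do
dim(m)

# Rough in-memory size for 90 files x 180,000 rows x 25 double columns, in GB
est_gb <- 90 * 180000 * 25 * 8 / 1e9
est_gb
```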

Andy

From: SYKES, Jennifer
 
 Hello
 
  
 
 Apologies if this is a simple question, I have searched the help and
 have not managed to work out a solution.
 
 Does anybody know an efficient method for reading many text 
 files of the
 same format into one table/dataframe?
 
  
 
 I have around 90 files that contain continuous data over 3 months but
 that are split into individual days data and I need the whole 3 months
 in one file for analysis.  Each days file contains a large amount of
 data (approx 30MB each) and so I need a memory efficient 
 method to merge
 all of the files into the one dataframe object.  From what I 
 have read I
 will probably want to avoid using for loops etc?  All files are in the
 same directory, none have a header row, and each contain 
 around 180,000
 rows and the same 25 columns/variables.  Any suggested 
 packages/routines
 would be very useful.
 
  
 
 Thanks
 
  
 
 Jennifer
 
  
 
  
 
 
 


[R] Simulation

2009-05-13 Thread Debbie Zhang


Dear R users,

Can anyone please tell me how to generate a large number of samples in R, given 
a certain distribution and sample size?

For example, if I want to generate 1000 samples of size n=100 from a N(0,1) 
distribution, how should I proceed? 

(I don't want to run rnorm(100,0,1) in R 1000 times.)

 

Thanks for help



Debbie



[R] Problems with randomly generating samples

2009-05-13 Thread Debbie Zhang

Dear R users,
Can anyone please tell me how to generate a large number of samples in R, given 
a certain distribution and sample size?
For example, if I want to generate 1000 samples of size n=100 from a N(0,1) 
distribution, how should I proceed? 
(I don't want to run rnorm(100,0,1) in R 1000 times.)
 
Thanks for help

Debbie




Re: [R] Simulation

2009-05-13 Thread Gábor Csárdi
On Wed, May 13, 2009 at 5:13 PM, Debbie Zhang debbie0...@hotmail.com wrote:


 Dear R users,

 Can anyone please tell me how to generate a large number of samples in R, 
 given certain distribution and size.

 For example, if I want to generate 1000 samples of size n=100, with a N(0,1) 
 distribution, how should I proceed?

 (Since I dont want to do rnorm(100,0,1) in R for 1000 times)

Why not? It took 0.05 seconds on my 5-year-old laptop.

Gabor



 Thanks for help



 Debbie





-- 
Gabor Csardi gabor.csa...@unil.ch UNIL DGM



Re: [R] Problems with randomly generating samples

2009-05-13 Thread Nordlund, Dan (DSHS/RDA)
 -Original Message-
 From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On
 Behalf Of Debbie Zhang
 Sent: Wednesday, May 13, 2009 8:18 AM
 To: r-help@r-project.org
 Subject: [R] Problems with randomly generating samples
 
 
 Dear R users,
 Can anyone please tell me how to generate a large number of samples in R, 
 given
 certain distribution and size.
 For example, if I want to generate 1000 samples of size n=100, with a N(0,1)
 distribution, how should I proceed?
 (Since I dont want to do rnorm(100,0,1) in R for 1000 times)
 
 Thanks for help
 
 Debbie
 
 

How about

samples <- rnorm(1000*100,0,1)
dim(samples) <- c(1000,100)

Hope this is helpful,

Dan

Daniel J. Nordlund
Washington State Department of Social and Health Services
Planning, Performance, and Accountability
Research and Data Analysis Division
Olympia, WA  98504-5204
 



Re: [R] Simulation

2009-05-13 Thread Dimitris Rizopoulos

What about putting them in a matrix, e.g.,

matrix(rnorm(1000*100), 1000, 100)


I hope it helps.

Best,
Dimitris


Debbie Zhang wrote:


Dear R users,

Can anyone please tell me how to generate a large number of samples in R, given 
certain distribution and size.

For example, if I want to generate 1000 samples of size n=100, with a N(0,1) distribution, how should I proceed? 


(Since I dont want to do rnorm(100,0,1) in R for 1000 times)

 


Thanks for help



Debbie




--
Dimitris Rizopoulos
Assistant Professor
Department of Biostatistics
Erasmus University Medical Center

Address: PO Box 2040, 3000 CA Rotterdam, the Netherlands
Tel: +31/(0)10/7043478
Fax: +31/(0)10/7043014



Re: [R] Simulation

2009-05-13 Thread Mike Lawrence
If you want k samples of size n, why not generate k*n values and put them
in a k-by-n matrix, where you can do what you want to each sample:

k = 10
n = 100
x=matrix(rnorm(k*n),k,n)
rowMeans(x)

If you need to do more complex things to each sample and if k is large
enough that you don't want the matrix sitting around in memory while
you do these things, you could also check out ?replicate .
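
Both routes can be put side by side in a short sketch, using the k = 1000 and n = 100 from the original question. The sample mean stands in for whatever statistic is actually wanted, and row_means and rep_means are arbitrary names.

```r
set.seed(123)
k <- 1000   # number of samples
n <- 100    # size of each sample

# Matrix route: one call to rnorm(), one sample per row
x <- matrix(rnorm(k * n), nrow = k, ncol = n)
row_means <- rowMeans(x)

# replicate() route: each sample is drawn, summarised and then discarded
rep_means <- replicate(k, mean(rnorm(n)))

summary(row_means)
summary(rep_means)
```

The two vectors hold different random draws, but both contain k sample means from the same N(0,1) setup.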

On Wed, May 13, 2009 at 12:13 PM, Debbie Zhang debbie0...@hotmail.com wrote:


 Dear R users,

 Can anyone please tell me how to generate a large number of samples in R, 
 given certain distribution and size.

 For example, if I want to generate 1000 samples of size n=100, with a N(0,1) 
 distribution, how should I proceed?

 (Since I dont want to do rnorm(100,0,1) in R for 1000 times)



 Thanks for help



 Debbie





-- 
Mike Lawrence
Graduate Student
Department of Psychology
Dalhousie University

Looking to arrange a meeting? Check my public calendar:
http://tr.im/mikes_public_calendar

~ Certainty is folly... I think. ~



Re: [R] limits

2009-05-13 Thread Mike Prager
Uwe Ligges lig...@statistik.tu-dortmund.de wrote:

 So you want some software that can do symbolic calculations? In that 
 case use other software. R is designed for numerical analyses.

In particular, if you are looking for good free software, you
might try Maxima.


-- 
Mike Prager, NOAA, Beaufort, NC
* Opinions expressed are personal and not represented otherwise.
* Any use of tradenames does not constitute a NOAA endorsement.



Re: [R] Simulation

2009-05-13 Thread Barry Rowlingson
On Wed, May 13, 2009 at 4:26 PM, Gábor Csárdi csa...@rmki.kfki.hu wrote:
 On Wed, May 13, 2009 at 5:13 PM, Debbie Zhang debbie0...@hotmail.com wrote:


 Dear R users,

 Can anyone please tell me how to generate a large number of samples in R, 
 given certain distribution and size.

 For example, if I want to generate 1000 samples of size n=100, with a N(0,1) 
 distribution, how should I proceed?

 (Since I dont want to do rnorm(100,0,1) in R for 1000 times)

 Why not? It took 0.05 seconds on my 5 years old laptop.

 Second-guessing the user, I think she maybe doesn't want to type in
'rnorm(100,0,1)' 1000 times...

 Soln - for loop:

  z=list()
  for(i in 1:1000){z[[i]]=rnorm(100,0,1)}

now inspect the individual bits:

  hist(z[[1]])
  hist(z[[545]])

If that's the problem, then I suggest she reads an introduction to R...

Barry



Re: [R] Simulation

2009-05-13 Thread Jorge Ivan Velez
Dear Debbie,
Here are two options:

# Parameters
N <- 1000
n <- 100

# Option 1
mys <- replicate(N, rnorm(n))
mys

# Option 2
mys2 <- matrix(rnorm(N*n),ncol=N)
mys2

HTH,

Jorge


On Wed, May 13, 2009 at 11:13 AM, Debbie Zhang debbie0...@hotmail.com wrote:



 Dear R users,

 Can anyone please tell me how to generate a large number of samples in R,
 given certain distribution and size.

 For example, if I want to generate 1000 samples of size n=100, with a
 N(0,1) distribution, how should I proceed?

 (Since I dont want to do rnorm(100,0,1) in R for 1000 times)



 Thanks for help



 Debbie



Re: [R] Help with reshape/reShape and indexing

2009-05-13 Thread Dana Sevak

To all of you who answered me: Thank you so much!
Each approach taught me something new and I really appreciate your help!

Best regards,
Dana Sevak



Re: [R] Problems with randomly generating samples

2009-05-13 Thread Ted Harding
On 13-May-09 15:18:05, Debbie Zhang wrote:
 Dear R users,
 Can anyone please tell me how to generate a large number of samples in
 R, given certain distribution and size.
 For example, if I want to generate 1000 samples of size n=100, with a
 N(0,1) distribution, how should I proceed? 
 (Since I dont want to do rnorm(100,0,1) in R for 1000 times)
  
 Thanks for help
 Debbie

One possibility is

  nsamples <- 1000
  sampsize <- 100
  Samples <- matrix(rnorm(nsamples*sampsize,0,1),nrow=nsamples)

Then each row of the matrix Samples will be a sample of size 'sampsize',
the i-th can be accessed as Samples[i,], and there are 'nsamples' rows
to choose from.

Ted.


E-Mail: (Ted Harding) ted.hard...@manchester.ac.uk
Fax-to-email: +44 (0)870 094 0861
Date: 13-May-09   Time: 16:46:05
-- XFMail --



Re: [R] Simulation

2009-05-13 Thread Linlin Yan
Does every 100 numbers in rnorm(100 * 1000, 0, 1) have the N(0,1) distribution?
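
One way to look at this question, as a small sketch: the draws from rnorm
are independent N(0,1) values, so any block of 100 of them is itself a
sample of size 100. With R's default random number generator and the same
seed, one long call should give exactly the same numbers as 1000 separate
calls of size 100. The seed and object names below are arbitrary.

```r
# Sketch: one call of size 100 * 1000 versus 1000 calls of size 100
set.seed(123)
one_call <- rnorm(100 * 1000, 0, 1)

set.seed(123)
many_calls <- unlist(lapply(1:1000, function(i) rnorm(100, 0, 1)))

identical(one_call, many_calls)

# Column j of this matrix holds draws (j - 1) * 100 + 1 to j * 100,
# i.e. what the j-th separate call produced
x <- matrix(one_call, nrow = 100)
dim(x)
```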

On Wed, May 13, 2009 at 11:13 PM, Debbie Zhang debbie0...@hotmail.com wrote:


 Dear R users,

 Can anyone please tell me how to generate a large number of samples in R, 
 given certain distribution and size.

 For example, if I want to generate 1000 samples of size n=100, with a N(0,1) 
 distribution, how should I proceed?

 (Since I dont want to do rnorm(100,0,1) in R for 1000 times)



 Thanks for help



 Debbie



[R] ode first step

2009-05-13 Thread Benoit Boulinguiez
Hi all,

I am trying to estimate the parameters (K1, K2) of a model that describes the
adsorption of a molecule onto an adsorbent.

equation: dq/dt = K1*C*(qm-q)-K2*q

I know the value of 'qm' and I experimentally measure the variables 'q',
'C', and the time 't'.

       t         C         q
1      0 144.05047 0.0000000
2    565  99.71492 0.1105625
3    988  74.99426 0.1722100
4   1415  58.65572 0.2129545
5   1833  48.34586 0.2386649
6   2257  40.29413 0.2587440
7   2675  32.92470 0.2771216
8   3105  29.57162 0.2854834
9   3552  28.01424 0.2893672
10  3986  25.62167 0.2953337
11  4415  23.62612 0.3003101
12  4841  21.95523 0.3044769
13  5264  21.08464 0.3066480
14  5698  19.68040 0.3101498
15  6509  18.31788 0.3135476
16  6950  17.65868 0.3151915
17  7403  17.00206 0.3168290
18  8130  16.38856 0.3183589
19  9001  15.58544 0.3203617
20  9928  15.27882 0.3211263
21 11899  14.46415 0.3231579
22 16354  13.91779 0.3245204
23 18926  13.82630 0.3247485
24 21602  13.66776 0.3251439
25 24413  13.98560 0.3243513
26 27056  13.87143 0.3246360
27 29844  13.64881 0.3251912

It's a differential equation, so I had a look at the 'ode' command from
the deSolve package.

I got stuck early on with 'ode' because I don't understand how to
define the function 'func' that it requires.

Any help would be appreciated.


Regards/Cordialement

-
Benoit Boulinguiez
Ph.D student
Ecole de Chimie de Rennes (ENSCR) Bureau 1.20 
Equipe CIP UMR CNRS 6226 Sciences Chimiques de Rennes
Avenue du Général Leclerc 
CS 50837 
35708 Rennes CEDEX 7 
Tel 33 (0)2 23 23 80 83
Fax 33 (0)2 23 23 81 20
http://www.ensc-rennes.fr/



Re: [R] limits

2009-05-13 Thread Gabor Grothendieck
Try the rSymPy or Ryacas packages.

In the rSymPy code below the var command defines x
as symbolic to sympy and then we perform the
computation:

> library(rSymPy)
Loading required package: rJava
> sympy("var('x')")
[1] x
> sympy("limit(x*x + x + 2, x, 2)")
[1] 8

Or using devel version define x as symbolic first to sympy
and then to R:

> library(rSymPy)
> source("http://rsympy.googlecode.com/svn/trunk/R/Sym.R")
> sympy("var('x')")
[1] x
> x <- Sym("x")
> limit(x*x + x + 2, x, 2)
[1] 8

or using Ryacas:

> library(Ryacas)
Loading required package: XML
> x <- Sym("x")
> Limit(x^2+x+2, x, 2)
[1] Starting Yacas!
expression(8)

More info is available here which you should read before
using these packages:

http://rsympy.googlecode.com
http://ryacas.googlecode.com


On Tue, May 5, 2009 at 5:39 AM, Hassan Mohamed
hassan_hany_fa...@yahoo.com wrote:
 Hey,
 what is the R function for the mathematical limit ?
 e.g. to calculate  and return the amount that the expression
 X^2 +X +2
 approach
 as X approach 2
 (X- 2)
 thanks
 hassan





[R] Centering R output in Sweave/LaTeX

2009-05-13 Thread Jean-Louis Abitbol
Good Day to All,

When sweaving the following:

\begin{table}
\centering
<<echo=FALSE>>=
ftable(ifmtm$type, ifmtm$gender, ifmtm$marche , ifmtm$nfic,
dnn=c("Type","Gender","Ambulant","Visit"))
@
\caption{Four-way cross-tabulation on all data}
\label{tab:crosstab}
\end{table}

the output of ftable is not centered while the latex caption is.

Is there a way to center the R output in this setting ?

Thanks for any help and best wishes, JL



Re: [R] Simulation

2009-05-13 Thread Wacek Kusnierczyk
Barry Rowlingson wrote:
 On Wed, May 13, 2009 at 4:26 PM, Gábor Csárdi csa...@rmki.kfki.hu wrote:
   
 On Wed, May 13, 2009 at 5:13 PM, Debbie Zhang debbie0...@hotmail.com wrote:
 
 Dear R users,

 Can anyone please tell me how to generate a large number of samples in R, 
 given certain distribution and size.

 For example, if I want to generate 1000 samples of size n=100, with a 
 N(0,1) distribution, how should I proceed?

 (Since I dont want to do rnorm(100,0,1) in R for 1000 times)
   
 Why not? It took 0.05 seconds on my 5 years old laptop.
 

  Second-guessing the user, I think she maybe doesn't want to type in
 'rnorm(100,0,1)' 1000 times...

  Soln - for loop:

   z=list()
   for(i in 1:1000){z[[i]]=rnorm(100,0,1)}

 now inspect the individual bits:

   hist(z[[1]])
   hist(z[[545]])

 If that's the problem, then I suggest she reads an introduction to R...

i'd suggest reading the r inferno by pat burns [1], where he deals with
this sort of for-looping lists the way it deserves ;)

vQ

[1] http://www.burns-stat.com/pages/Tutor/R_inferno.pdf



Re: [R] ode first step

2009-05-13 Thread Dieter Menne
Benoit Boulinguiez benoit.boulinguiez at ensc-rennes.fr writes:

 I try to assess the parameters (K1,K2) of a model that describes the
 adsorption of a molecule onto on adsorbent.
 
 equation: dq/dt = K1*C*(qm-q)-K2*q
 
 I know the value of 'qm' and I experimentally measure the variables 'q',
 'C', and the time 't'.
 
 I'm early stuck on the use of the function 'ode' cause I don't get how to
 define the function 'func' required by 'ode'

Have a look at the lsoda documentation of the earlier package odesolve,
which has easier to understand examples.
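
For what it is worth, here is a minimal sketch of what a 'func' for this
equation might look like with deSolve::ode. The values of K1, K2 and qm are
made-up placeholders (the post does not give qm), only the first few (t, C)
rows of the posted data are used, and treating C(t) as a linear
interpolation of the measurements is an assumption of this sketch rather
than part of the original model.

```r
# Sketch only: qm, K1 and K2 below are hypothetical placeholder values.
# A few (t, C) pairs copied from the posted data; C(t) is interpolated.
tC <- data.frame(t = c(0, 565, 988, 1415, 1833),
                 C = c(144.05047, 99.71492, 74.99426, 58.65572, 48.34586))
Cfun <- approxfun(tC$t, tC$C, rule = 2)

# 'func' takes (t, y, parms) and returns a list whose first element is
# the vector of derivatives, in the same order as the state vector y.
adsorb <- function(t, y, parms) {
  with(as.list(c(y, parms)), {
    dq <- K1 * Cfun(t) * (qm - q) - K2 * q   # dq/dt = K1*C*(qm-q) - K2*q
    list(dq)
  })
}

parms <- c(K1 = 1e-5, K2 = 1e-4, qm = 0.35)  # hypothetical values
y0    <- c(q = 0)                            # q starts at 0 in the data

# Solve only if deSolve is installed
if (requireNamespace("deSolve", quietly = TRUE)) {
  out <- deSolve::ode(y = y0, times = tC$t, func = adsorb, parms = parms)
  print(out)
}
```

Estimating K1 and K2 would then mean wrapping the ode() call in an objective
function (for example a sum of squared differences between simulated and
measured q) and passing that to an optimiser; that step is not shown here.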

Dieter



Re: [R] Looking for a quick way to combine rows in a matrix

2009-05-13 Thread Chris Stubben

You can automate this step
key[key == "AT"] <- "TA"

## create a function to reverse a string -- see strsplit help page for this
## strReverse function
reverse <- function(x) sapply(lapply(strsplit(x, NULL), rev), paste,
collapse="")

key <- rownames(a)
# combine rownames with reverse (rownames)
n <- cbind(key, rev=reverse(key))
 key  rev 
[1,] AA AA
[2,] AT TA
[3,] TA AT
[4,] TT TT

# Now just sort the values in the rows (apply returns column vectors, so I
# also use t()) and then run do.call on the first column
n <- t(apply(n, 1, sort))

do.call(rbind, by(a, n[,1], colSums)) 
   V1 V2 V3 V4
AA  1  5  9 13
AT  5 13 21 29
TT  4  8 12 16


I often need to combine reverse complement DNA strings, so you could do that
too 

# DNA complement
comp <- function(x) chartr("ACGT", "TGCA", x)

n <- cbind(key, rev=reverse(comp(key)))
n <- t(apply(n, 1, sort))
do.call(rbind, by(a, n[,1], colSums)) 
   V1 V2 V3 V4
AA  5 13 21 29   
AT  2  6 10 14
TA  3  7 11 15


Chris Stubben


jholtman wrote:
 
 Try this:
 
 key <- rownames(a)
 key[key == "AT"] <- "TA"
 do.call(rbind, by(a, key, colSums))
V2 V3 V4 V5
 AA  1  5  9 13
 TA  5 13 21 29
 TT  4  8 12 16
 
 
 On Mon, May 11, 2009 at 4:53 PM, Crosby, Jacy R
 jacy.r.cro...@uth.tmc.eduwrote:
 
 I'm working with genotype data in a frequency table:

  a=matrix(1:16, nrow=4)
  rownames(a)=c("AA","AT","TA","TT")
  a
    [,1] [,2] [,3] [,4]
 AA    1    5    9   13
 AT    2    6   10   14
 TA    3    7   11   15
 TT    4    8   12   16

 'AT' and 'TA' are essentially the same, and I'd like to combine (add) the
 rows to reflect this. The final matrix should be:

    [,1] [,2] [,3] [,4]
 AA    1    5    9   13
 AT    5   13   21   29
 TT    4    8   12   16

 Is there a fast way to do this?

 Thanks in advance!

 Jacy Crosby
 jacy.r.cro...@uth.tmc.edu

 

-- 
View this message in context: 
http://www.nabble.com/Looking-for-a-quick-way-to-combine-rows-in-a-matrix-tp23491348p23525634.html
Sent from the R help mailing list archive at Nabble.com.



[R] Anova

2009-05-13 Thread stephen sefick
melt.updn <- structure(list(date = structure(c(11808, 11869, 11961, 11992,
12084, 12173, 12265, 12418, 12600, 12631, 12753, 12996, 13057,
13149, 11808, 11869, 11961, 11992, 12084, 12173, 12265, 12418,
12600, 12631, 12753, 12996, 13057, 13149), class = "Date"), variable =
structure(c(1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L,
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L), .Label = c("unrestored",
"restored"), class = "factor"), value = c(1.34057641541824, 0.918021774919366,
0.905654270934854, 0.305945104043220, 0.58298856330543, 1.36580645291274,
0.874195629894938, 0.87482377014642, 0.930267689669002, 0.41753134369356,
1.09248531450337, 1.72571397293738, 0.305751868168171, 0.584498524462223,
0.983300317501076, 1.27216569968585, 0.730578393573363, 0.88361473836175,
1.16501295544266, 2.08896500025784, 0.664286881841064, 1.03859387871079,
1.39172581649833, 0.323405269371357, 1.00207568577518, 1.54383416626015,
0.611261918697393, 0.848992483196744)), .Names = c("date", "variable",
"value"), row.names = c(NA, -28L), class = "data.frame")

aov(value~variable, data=melt.updn)

I am having problems making sure that I am doing the correct analysis.
 I am trying to see if there is a difference in the mean of the
restored segment versus the unrestored segment (variable in x).  These
are repeated measures on the same treatments through time.  Is there a
way to control for the differences in time steps?  Any ideas?
thanks for the help,



-- 
Stephen Sefick

Let's not spend our time and resources thinking about things that are
so little or so large that all they really do for us is puff us up and
make us feel like gods.  We are mammals, and have not exhausted the
annoying little problems of being mammals.

-K. Mullis



Re: [R] Multiple plot margins

2009-05-13 Thread Andre Nathan
On Wed, 2009-05-13 at 11:22 +0200, Uwe Ligges wrote:
 If not, example:
 
 par(mfrow = c(2,3), mar = c(0,0,0,0), oma = c(5,5,0,0), xpd=NA)
 plot(1, xaxt="n", xlab="", ylab="A")
 plot(1, xaxt="n", yaxt="n", xlab="", ylab="")
 plot(1, xaxt="n", yaxt="n", xlab="", ylab="")
 plot(1, xlab="I", ylab="B")
 plot(1, xlab="II", ylab="", yaxt="n")
 plot(1, xlab="III", ylab="", yaxt="n")
 

Thank you. I don't know what I did wrong, but that worked.

Best regards,
Andre



[R] Calling R from .net environment

2009-05-13 Thread Arun Kumar Saha
Hi,  I am currently a .NET programmer and would like to use R as my
statistical computation engine. I have already installed RServer250.exe so
that I can call R from my .NET programming environment; unfortunately,
I cannot find RServer250.exe in the R-(D) COM Interface region. If someone
could guide me on how to add these COM components and call R code from my
application, I would be very grateful.
Regards,




Re: [R] Simulation

2009-05-13 Thread Barry Rowlingson
On Wed, May 13, 2009 at 5:36 PM, Wacek Kusnierczyk
waclaw.marcin.kusnierc...@idi.ntnu.no wrote:
 Barry Rowlingson wrote:

  Soln - for loop:

   z=list()
   for(i in 1:1000){z[[i]]=rnorm(100,0,1)}

 now inspect the individual bits:

   hist(z[[1]])
   hist(z[[545]])

 If that's the problem, then I suggest she reads an introduction to R...

 i'd suggest reading the r inferno by pat burns [1], where he deals with
 this sort of for-looping lists the way it deserves ;)

 I don't think extending a list this way is too expensive. Not like
doing 1000 foo=rbind(foo,bar)s to a matrix. The overhead for extending
a list should really only be adding a single new pointer to the list
pointer structure. The existing list data isn't copied.

Plus lists are more flexible. You can do:

 z=list()
 for(i in 1:1000){
  z[[i]]=rnorm(i,0,1)  # generate 'i' samples
}

and then you can see how the properties of samples of rnorm differ
with increasing numbers of samples.

Yes, you can probably vectorize this with lapply or something, but I
prefer clarity over concision when dealing with beginners...
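
For comparison, a sketch of the lapply version of the loop above (same
result structure, a list of 1000 vectors of increasing length; the object
name z is kept from the loop):

```r
# Sketch: lapply equivalent of the for loop that fills z[[i]]
# element i holds i draws from N(0, 1)
z <- lapply(1:1000, function(i) rnorm(i, 0, 1))

# e.g. look at how the sample mean settles down as the sample grows
sample_means <- sapply(z, mean)
plot(sample_means, type = "l", xlab = "sample size", ylab = "sample mean")
```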

Barry



Re: [R] Anova

2009-05-13 Thread Gavin Simpson
On Wed, 2009-05-13 at 12:43 -0400, stephen sefick wrote:
 melt.updn <- structure(list(date = structure(c(11808, 11869, 11961, 11992,
 12084, 12173, 12265, 12418, 12600, 12631, 12753, 12996, 13057,
 13149, 11808, 11869, 11961, 11992, 12084, 12173, 12265, 12418,
 12600, 12631, 12753, 12996, 13057, 13149), class = "Date"), variable =
 structure(c(1L,
 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L,
 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L), .Label = c("unrestored",
 "restored"), class = "factor"), value = c(1.34057641541824, 0.918021774919366,
 0.905654270934854, 0.305945104043220, 0.58298856330543, 1.36580645291274,
 0.874195629894938, 0.87482377014642, 0.930267689669002, 0.41753134369356,
 1.09248531450337, 1.72571397293738, 0.305751868168171, 0.584498524462223,
 0.983300317501076, 1.27216569968585, 0.730578393573363, 0.88361473836175,
 1.16501295544266, 2.08896500025784, 0.664286881841064, 1.03859387871079,
 1.39172581649833, 0.323405269371357, 1.00207568577518, 1.54383416626015,
 0.611261918697393, 0.848992483196744)), .Names = c("date", "variable",
 "value"), row.names = c(NA, -28L), class = "data.frame")
 
 aov(value~variable, data=melt.updn)

You can think of this as a linear model and just use lm:

lm(value~variable, data=melt.updn)

 
 I am having problems making sure that I am doing the correct analysis.
  I am trying to see if there is a difference in the mean of the
 restored segment versus the unrestored segment (variable in x).  These
 are repeated measures on the same treatments through time.  Is there a
 way to control for the differences in time steps?  Any ideas?
 thanks for the help,

One option is to fit this model using generalised least squares:

## do some plotting to look at potential differences:

require(lattice)
## time (days since the first sample) is needed by the plots and the models,
## so create it first
melt.updn$time <- as.numeric(rep(with(melt.updn[1:14,], date - date[1]) + 1, 2))
xyplot(value ~ time | variable, data = melt.updn, 
   type = c("p","smooth"))
## so perhaps some evidence of trend,
## different in the two groups possibly
bwplot(value ~ variable, data = melt.updn)
## doesn't look like there is much difference though

require(nlme)
## include fixed time effect to account for any trend for example?
## a CAR(1) structure allows for different separations in sampling times 
lmod <- gls(value ~ variable + time, data = melt.updn,
  corr = corCAR1(form = ~ time | variable))
summary(lmod)
intervals(lmod) ## fitting problems with these dummy data
## test CAR(1) structure - do we need?
lmod2 <- gls(value ~ variable + time, data = melt.updn)
anova(lmod, lmod2) ## no need for the structure here
summary(lmod2) ## looks like no difference in un/restored
anova(lmod2)

Just a few thoughts; without knowing exactly your data and design it is
difficult to say more. With only two groups, it is difficult to do more. I
also assume these are dummy data; otherwise there really doesn't look
like there is any difference between the two groups of samples.

HTH

G
-- 
%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%
 Dr. Gavin Simpson [t] +44 (0)20 7679 0522
 ECRC, UCL Geography,  [f] +44 (0)20 7679 0565
 Pearson Building, [e] gavin.simpsonATNOSPAMucl.ac.uk
 Gower Street, London  [w] http://www.ucl.ac.uk/~ucfagls/
 UK. WC1E 6BT. [w] http://www.freshwaters.org.uk
%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%



[R] Checking a (new) package - examples require other package functions

2009-05-13 Thread Rebecca Sela
I am creating an R package.  I ran R CMD check on the package, and everything 
passed until it tried to run the examples.  Then, the result was:

* checking examples ... ERROR
Running examples in REEMtree-Ex.R failed.
The error most likely occurred in:

 ### * AutoCorrelationLRtest
 
 flush(stderr()); flush(stdout())
 
 ### Name: AutoCorrelationLRtest
 ### Title: Test for autocorrelation in the residuals of a RE-EM tree
 ### Aliases: AutoCorrelationLRtest
 ### Keywords: htest tree models
 
 ### ** Examples
 
 # Estimation without autocorrelation
 simpleEMresult <- RandomEffectsTree(Y~D+t+X, data=simpleREEMdata, random=~1|ID, 
 simpleREEMdata$ID)
Error: couldn't find function RandomEffectsTree
Execution halted


The function RandomEffectsTree is defined in the R code for the package.  How 
can I refer to other functions from the package in examples?  (I have the 
Writing R-extensions PDF, so it would be enough to point me to the right 
page, if the answer is in there and I just missed it.)

Thanks!

Rebecca



[R] replace() help

2009-05-13 Thread Crosby, Jacy R
Can anyone see what I'm doing wrong here (highlighted below)? This is driving 
me crazy... probably a ')' or something equally moronic...

 genw1[,1]
A2 A3 A5 A7 A9 A00010 A00012 A00013 A00014 A00015 A00017 
A00018 A00019 A00021 A00023 A00024
CC CC CC CC CC CC CC CC CC CC CC
 CC CC CC CC CC
Etc...this is a rather large vector

 table(genw1[,1])

   ??    CC    CG
   25 10632     1

 genw2 <- mat.or.vec(nrow(genw1),ncol(genw1))
 rownames(genw2) <- rownames(genw1)
 colnames(genw2) <- colnames(genw1)

 genw2[,1] <- replace(genw1[,1],which(genw1[,1]=="CC"), "HC")

Warning message:
In `[<-.factor`(`*tmp*`, list, value = "HC") :
  invalid factor level, NAs generated


Just for error checking (this is working properly):
 which(genw1[,1]=="CC")
 [1]  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18
[19] 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36
Etc...

And it works here...

 x <- matrix(c('CC', 'CC', '??', 'CG'),nrow=2 )
 x
     [,1] [,2]
[1,] CC   ??
[2,] CC   CG

 x2 <- mat.or.vec(nrow(x), ncol(x))
 x2[,1] <- replace(x[,1],which(x[,1]=="CC"), "HC")
 x2
     [,1] [,2]
[1,] HC   0
[2,] HC   0
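
A possible reading of the warning, offered as a sketch rather than a
confirmed diagnosis: the message comes from the factor assignment method,
which suggests genw1[,1] is a factor (for instance a data frame column),
whereas the test matrix x holds plain character strings. If so, "HC" is
not an existing level and the replacement yields NA. The small vector g
below is an invented stand-in for genw1[,1].

```r
# Sketch: g stands in for genw1[,1], assuming that column is a factor
g <- factor(c("CC", "CC", "??", "CG"))

# Option 1: replace on the character values instead of the factor
g_chr <- replace(as.character(g), which(g == "CC"), "HC")
g_chr

# Option 2: keep the factor and rename the level itself
g_fac <- g
levels(g_fac)[levels(g_fac) == "CC"] <- "HC"
g_fac
```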




Re: [R] Calling R from .net environment

2009-05-13 Thread Hutchinson,David [PYR]
Take a look at this article on CodeProject:

http://www.codeproject.com/KB/cs/RtoCSharp.aspx 

Cheers,
Dave

-Original Message-
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org]
On Behalf Of Arun Kumar Saha
Sent: Wednesday, May 13, 2009 10:33 AM
To: r-h...@stat.math.ethz.ch
Subject: [R] Calling R from .net environment

Hi,  Currently I am a .net programmer and would like to use R for my
statistical computations engine. I already have installed RServer250.exe
so that I could call R from my .net programming environment, however
unfortunately, i could not be able to find RServer250.exe in the R-(D)
COM Interface region. If someone guide me how to add these COM
components and call the R-code through my application, it would be very
good to me.
Regards,




[R] Mann-Kendall test

2009-05-13 Thread Rafael Moral
Dear useRs,

I've been trying to run a Mann-Kendall test in my data in order to detect 
trends.
I studied the examples given at the Kendall package and I can understand pretty 
well how it works on time-series data.
However, my data consists of values in different sites per year, as I display 
below;

         Year 1 | Year 2 | Year 3 | ...
Site 1     x        x        x      ...
Site 2     x        x        x      ...
Site 3     x        x        x      ...
 ...      ...      ...      ...     ...
(where 'x' represents different values)

There's the MannKendall() function on package 'Kendall' and the tau() function 
on package 'pheno', and I guess they should do the trend detection I need.
The problem is I don't know how to manipulate my data in order to get the 
results.
Should I run the M-K test on each Site, on each Year or on the entire dataset?
Also, there are some probabilities I should take into account when running a 
M-K test, but I can't seem to find out how to obtain them.
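
As one possible starting point, and only a sketch: if the question is
whether each site shows a monotonic trend over the years, the test can be
run row by row. The Mann-Kendall trend test is Kendall's rank correlation
between the series and its time index, so the base R function cor.test
with method = "kendall" is used here instead of a package function. The
matrix vals is random placeholder data, not the poster's, and whether a
per-site test is the right design is left open.

```r
# Sketch: Kendall's tau of value against year, one test per site (row)
set.seed(1)   # arbitrary seed for the placeholder data
vals <- matrix(rnorm(3 * 10), nrow = 3,
               dimnames = list(paste("Site", 1:3), paste("Year", 1:10)))
years <- seq_len(ncol(vals))

res <- t(apply(vals, 1, function(x) {
  ct <- cor.test(years, x, method = "kendall")
  c(tau = unname(ct$estimate), p.value = ct$p.value)
}))
res   # one row per site: tau and its p-value
```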

Thanks in advance,
Rafael.




Re: [R] Multiple plot margins

2009-05-13 Thread Greg Snow
Here is a response to almost exactly the same question from a couple of weeks 
ago:

http://finzi.psych.upenn.edu/R/Rhelp08/2009-April/196967.html



-- 
Gregory (Greg) L. Snow Ph.D.
Statistical Data Center
Intermountain Healthcare
greg.s...@imail.org
801.408.8111


 -Original Message-
 From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-
 project.org] On Behalf Of Andre Nathan
 Sent: Tuesday, May 12, 2009 11:12 AM
 To: r-help@r-project.org
 Subject: [R] Multiple plot margins
 
 Hello
 
 I'm plotting 6 graphs using mfrow = c(2, 3). In these plots, only
 graphs in the first column have titles for the y axis, and only the
 ones
 in the last row have titles for the x axis.
 
 I'd like all plots to be of the same size, and I'm trying to keep them
 as near each other as possible, but I'm having the following problem.
 
 If I make a single call to par(mar = ...), to leave room on the left
 and
 bottom for the axes titles, a lot of space will be wasted because not
 all graphs need titles; however, if I make one call of par(mar = ...)
 per plot, to have finer control of the margins, the first column and
 last row plots will be smaller than the rest, because the titles use up
 some of their space.
 
 I thought that setting large enough values for oma would do what I
 want, but it doesn't appear to work if mar is too small.
 
 To illustrate better what I'm trying to do:
 
   l +-+ +-+ +-+
   a | | | | | |
   b | | | | | |
   e | | | | | |
   l +-+ +-+ +-+
 
   l +-+ +-+ +-+
   a | | | | | |
   b | | | | | |
   e | | | | | |
   l +-+ +-+ +-+
  label   label   label
 
 where the margins between each plot should be narrow.
 
 Should I just plot the graphs without axis titles and then use text()
 to
 manually position them?
 
 Thanks in advance,
 Andre
 


[R] simple anova question

2009-05-13 Thread AllenL

Dear R group,
Simple anova question:
I am attempting to recreate a figure (figure 10.8, from chapter 10 of Modern
Statistics for the Life Sciences).

It is an interaction diagram plotting BYIELD (continuous) as a function of
BSPACING (categorical) with different lines/colours for another categorical
variable BVARIETY. The data is replicated into four categorical BBLOCK(s).
The corresponding analysis looks like this:
BYIELD~BBLOCK+BSPACING+BVARIETY

What I want to extract from this model is simply the expected value for every
possible combination of factors. I can do this by adding the correct
combinations of model coefficients, but this seems silly. Surely there is a
one-line function for sorting this sort of thing out?
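
One way this might be done, as a sketch: build every combination of factor
levels with expand.grid and pass it to predict. The data frame d below is
fabricated placeholder data with invented level names; only the variable
names and the model formula come from the post.

```r
# Sketch with placeholder data: 4 blocks x 2 spacings x 3 varieties
set.seed(1)   # arbitrary seed
d <- expand.grid(BBLOCK   = factor(1:4),
                 BSPACING = factor(c("narrow", "wide")),
                 BVARIETY = factor(c("A", "B", "C")))
d$BYIELD <- rnorm(nrow(d), mean = 10)   # fabricated response values

fit <- lm(BYIELD ~ BBLOCK + BSPACING + BVARIETY, data = d)

# all combinations of factor levels, then the fitted value for each
combos <- expand.grid(BBLOCK   = levels(d$BBLOCK),
                      BSPACING = levels(d$BSPACING),
                      BVARIETY = levels(d$BVARIETY))
combos$fitted <- predict(fit, newdata = combos)
head(combos)
```

The fitted column can then be averaged over blocks, or plotted against
BSPACING with one line per BVARIETY, to get the interaction diagram.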

Many thanks,
Allen
-- 
View this message in context: 
http://www.nabble.com/simple-anova-question-tp23528280p23528280.html
Sent from the R help mailing list archive at Nabble.com.


