[R] set up a blank csv file and write time series to it row by row

2007-08-10 Thread Yuchen Luo
Dear Friends.

Greetings!

I have asked several times how to set up a blank file and write lists to
it, one row per list, when the number of lists is not known in advance.

I have received many beautiful solutions.  Thanks go to Professor Murdoch,
Professor Menne, Professor Grothendieck and Dr. Olshansky.  I have
organized the solutions below:



##

Set up a blank table on the hard drive and write to it row by row

# Method 1

blank = data.frame(name=character(0), wife=character(0),
                   no.children=numeric(0))

write.csv(blank, 'file1.csv', row.names=FALSE)

a1 = list(name='Tom', wife='Joy', no.children=9)

a2 = list(name='Paul', wife='Alic', no.children=5)

write.table(a1, file='file1.csv', sep=',', append=TRUE, row.names=FALSE,
            col.names=FALSE)

write.table(a2, file='file1.csv', sep=',', append=TRUE, row.names=FALSE,
            col.names=FALSE)

# Method 2

blank = data.frame(name=character(0), wife=character(0),
                   no.children=numeric(0))

write.csv(blank, 'file2.csv')

a1 = list(name='Tom', wife='Joy', no.children=9)

a2 = list(name='Paul', wife='Alic', no.children=5)

write.table(a1, file='file2.csv', sep=',', append=TRUE, row.names=2,
            col.names=FALSE)

write.table(a2, file='file2.csv', sep=',', append=TRUE, row.names=3,
            col.names=FALSE)

###

My problem now is how to write a time series (instead of a list) to a csv
file. Also, how do I set up such a csv file to accept the time series?

I know the length of each time series, but I do not know how many of them
there will be.

Examples are:

bb1=c(1:10)

bb2=c(101:110)

How do I write bb1 and bb2 to a csv file, and how do I set up a blank csv
file to accept such time series in the first place?
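
One possible approach, sketched under the assumption that each series should
become one row of the file: create an empty file with file.create(), then
append each series as a single row by transposing it with t() before calling
write.table(). The helper name append_series and the temporary file path are
made up for this illustration.

```r
# Hypothetical helper: append one series to a CSV file as a single row.
append_series <- function(x, path) {
  write.table(t(x), file = path, sep = ",", append = TRUE,
              row.names = FALSE, col.names = FALSE)
}

csv_path <- tempfile(fileext = ".csv")  # stand-in for the real file name
file.create(csv_path)                   # the "blank" file; no header is needed

bb1 <- c(1:10)
bb2 <- c(101:110)
append_series(bb1, csv_path)
append_series(bb2, csv_path)

# Read everything back: one row per series, one column per time point.
series <- read.csv(csv_path, header = FALSE)
```

Because no header is written, the file does not need to know in advance how
many series will arrive; each call simply adds one more row.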

Your help will be highly appreciated!!!

Best Wishes!

Yuchen Luo


__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Unexpected behavior in PBSmapping package

2007-08-10 Thread D. Dailey
Using R 2.5.1 on Windows XP Professional, and PBSmapping package version 
2.51, I have encountered some behavior which puzzles me.  I am including 
the package's listed maintainer on this email but also seek the thoughts 
of the R-help community.

I have a set of EventData, which I want to plot as points, and to color 
the points according to some criterion.  It turns out that some of my 
points fall outside my desired plotting region.  It looks like this 
causes the PBSmapping functions plotPoints and addPoints to incorrectly 
deal with the color assignments.

Consider the following toy example:

### Begin Example ###

library( PBSmapping )

# Define some EventData
events <- as.EventData( read.table( textConnection(
'EID X  Y   Color
1 494 1494 red
2 497 1497 blue
3 500 1500 green
4 503 1503 yellow' ), header=TRUE, stringsAsFactors=FALSE ),
proj='UTM', zone=10 )

par( mfrow=c(3,1) )

# Plot the events with plot limits large enough to show
# the full extent of all the symbols
plotPoints( events, pch=16, cex=5, col=events$Color,
   xlim=c(490,508), ylim=c(1490,1508), proj=TRUE )
with( events, text( X, Y, toupper( substr( Color, 1, 1 ) ),
   font=2, cex=2 ) )

# Normal plot extents; partial symbols cut off by edges
# of plotting region (as expected)
plotPoints( events, pch=16, cex=5, col=events$Color, proj=TRUE )
with( events, text( X, Y, toupper( substr( Color, 1, 1 ) ),
   font=2, cex=2 ) )

## Now use more-restrictive plot limits
plotPoints( events, pch=16, cex=5, col=events$Color,
   xlim=c(499,505), ylim=c(1499,1505), proj=TRUE )
with( events, text( X, Y, toupper( substr( Color, 1, 1 ) ),
   font=2, cex=2 ) )
# Note that symbols are plotted in the right places (note text labels)
# but colors are not as expected

### End example ###


For the moment, I have worked around this issue by using a with( 
events, points( ... ) ) construction, but this seems suboptimal; I 
would prefer to use addPoints (which exhibits the same problem as 
plotPoints does in the toy example above).  I would appreciate any 
insights those on the list might have.
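
The workaround mentioned above can be sketched with base graphics alone. This
is only an illustration of keeping coordinates and colors in the same rows so
that they cannot get out of step; it makes no claim about what plotPoints or
addPoints do internally. The data frame mirrors the toy example, and the names
inside and visible are made up for this sketch.

```r
events <- data.frame(
  EID   = 1:4,
  X     = c(494, 497, 500, 503),
  Y     = c(1494, 1497, 1500, 1503),
  Color = c("red", "blue", "green", "yellow"),
  stringsAsFactors = FALSE
)
xlim <- c(499, 505)
ylim <- c(1499, 1505)

# Subset whole rows, so each remaining point keeps its own color.
inside  <- with(events, X >= xlim[1] & X <= xlim[2] &
                        Y >= ylim[1] & Y <= ylim[2])
visible <- events[inside, ]

plot(NA, xlim = xlim, ylim = ylim, xlab = "X", ylab = "Y")
with(visible, points(X, Y, pch = 16, cex = 5, col = Color))
with(visible, text(X, Y, toupper(substr(Color, 1, 1)), font = 2, cex = 2))
```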

Please include me directly on any reply to the list, as I am at least a 
couple weeks behind on reading the digested version of the list.  I see 
that there have been no mentions of the PBSmapping package even in the 
digests I have not yet read.

Session info:

> sessionInfo()
R version 2.5.1 (2007-06-27)
i386-pc-mingw32

locale:
LC_COLLATE=English_United States.1252;LC_CTYPE=English_United States.1252;LC_MONETARY=English_United States.1252;LC_NUMERIC=C;LC_TIME=English_United States.1252

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base

other attached packages:
PBSmapping
 2.51



--David Dailey
Shoreline, Washington, USA



Re: [R] Help using gPath

2007-08-10 Thread Paul Murrell
Hi


Emilio Gagliardi wrote:
 Hi Paul,
 
 I'm sorry for not posting code, I wasn't sure if it would be helpful without
 the data...should I post the code and a sample of the data?  I will remember
 to do that next time!


It's important not only to post code, but also to make sure that other 
people can run it (i.e., include real data or have the code generate 
data or use one of R's predefined data sets).

Also, isn't this next time ? :)


 grid.gedit(gPath("ylabel.text.382"), gp=gpar(fontsize=16))
 
 
 OK, I think my confusion comes from the notation that current.grobTree()
 produces and what strings are required in order to make changes to the
 underlying grobs.
 But, from what you've provided, it looks like I can access each grob with
 its unique name, regardless of which parent it is nested in...that helps


Yes.  By default, grid will search the tree of all grobs to find the 
name you provide.  You can even just provide part of the name and it 
will find partial matches (depending on argument settings).  On the 
other hand, by specifying a path that names both parent and child grobs, 
you can make sure you get exactly the grob you want.


 like to remove the left border on the first panel.  I'd like to adjust the


 I'd guess you'd have to remove the grob background.rect.345 and then
 draw in just the sides you want, which would require getting to the
 right viewport, for which you'll need to study the viewport tree (see
 current.vpTree())
 
 
 I did some digging into this and it seems pretty complicated, is there an
 example anywhere that makes sense to the beginner? The whole viewport grob
 relationship is not clear to me. So, accessing viewports and removing
 objects and drawing new ones is beyond me at this point. I can get my mind
 around your example below because I can see the object I want to modify in
 the viewer, and the code changes a property of that object, click enter, and
 bang the object changes.  When you start talking external pointers and
 finding viewports and pushing and popping grobs I just get lost. I found the
 viewports for the grobTree, it looks like this:


There's a book that provides a full explanation and the (basic) grid 
chapter is online (see 
http://www.stat.auckland.ac.nz/~paul/RGraphics/rgraphics.html)


 viewport[ROOT]-(viewport[layout]-(viewport[axis_h_1_1]-(viewport[bottom_axis]-(viewport[labels],
 viewport[ticks])),
 viewport[axis_h_1_2]-(viewport[bottom_axis]-(viewport[labels],
 viewport[ticks])),
 viewport[axis_v_1_1]-(viewport[left_axis]-(viewport[labels],
 viewport[ticks])), viewport[panel_1_1], viewport[panel_1_2],
 viewport[strip_h_1_1], viewport[strip_h_1_2], viewport[strip_v_1_1]))
 
 at that point I was like, ok, I'm done. :S


Yep, the facilities for investigating the viewport and grob tree are 
basically inadequate.  Based on some work Hadley did for ggplot, the 
development version of R has a slightly better tool called grid.ls() 
that can show how the grob tree and the viewport tree intertwine.  That 
would allow you to see which viewport each grob was drawn in, which 
would help you, for example, to know which viewport you had to go to to 
replace a rectangle you want to remove.
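
A minimal sketch of the ideas above, assuming a version of R in which
grid.ls() is available. The scene and the grob names ("panel",
"background.rect", "ylabel.text") are invented for the illustration and are
not the names ggplot generates.

```r
library(grid)

grid.newpage()
# A tiny hypothetical scene: a parent gTree with two named children.
grid.draw(gTree(name = "panel",
                children = gList(rectGrob(name = "background.rect"),
                                 textGrob("y label", name = "ylabel.text"))))

grid.ls()  # lists the grobs on the display list, children indented

# Exact match, spelling out parent and child in the path.
grid.edit(gPath("panel", "ylabel.text"), gp = gpar(fontsize = 16))

# Partial match on the name: grid.gedit() is grid.edit() with
# grep = TRUE and global = TRUE.
grid.gedit("background", gp = gpar(col = "green"))

edited_text <- grid.get(gPath("panel", "ylabel.text"))
edited_rect <- grid.get("background.rect")
```

Spelling out the full path is the safer choice when several grobs share part
of a name; the partial match is convenient when the generated names carry
numeric suffixes that change from plot to plot.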


 Something like ...

 grid.gedit("geom_bar.rect", gp=gpar(col="green"))


 Again, it would really help to have some code to run.
 
 
 My apologies, I thought the grobTree was sufficient in this case.  Thanks
 very much for your help.


Sorry to harp on about it, but if I had your code I could show you an 
example of how grid.ls() might help.

Paul
-- 
Dr Paul Murrell
Department of Statistics
The University of Auckland
Private Bag 92019
Auckland
New Zealand
64 9 3737599 x85392
[EMAIL PROTECTED]
http://www.stat.auckland.ac.nz/~paul/



Re: [R] need help with pdf-plot

2007-08-10 Thread Antje
I still have this problem. Does anybody know any solution?

Antje

Antje wrote:
 Hello,
 
 I'm trying to plot a set of barplots like a matrix (2 rows, 10 columns 
 from reduced_mat) to a pdf. It works with the following parameters:
 
 pdf("test.pdf", width=ncol(reduced_mat)*2, height=nrow(reduced_mat)*2, 
     pointsize = 12)
 
 par(mfcol = c(nrow(reduced_mat),ncol(reduced_mat)), oma = c(0,0,0,0), 
     lwd=48/96, cex.axis = 0.5, las = 2, cex.main = 1.0)
 
 Then I get a long narrow page format with the square barplots.
 
 But I would like to have an A4 format in the end, with the plots not filling 
 the whole page (they should stay more or less square and not be stretched...).
 
 What should I look for to achieve this?
 
 Antje
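
One direction to look in, offered as a sketch rather than a tested answer to
this exact layout: pdf() has a paper argument, and par(pty = "s") asks for
square plotting regions. As I understand the pdf() help page, when paper is
not "special" the width and height describe a drawing region that is placed
on the chosen paper, so a 10 x 2 inch region on "a4r" (A4 landscape) should
not fill the page. The data, the number of bars per panel, and the output
file name below are made up.

```r
reduced_mat <- matrix(runif(20), nrow = 2, ncol = 10)  # made-up data
out_file <- tempfile(fileext = ".pdf")                 # stand-in for "test.pdf"

# A4 landscape page; the drawing region is smaller than the page.
pdf(out_file, paper = "a4r",
    width = ncol(reduced_mat), height = nrow(reduced_mat), pointsize = 12)
par(mfcol = c(nrow(reduced_mat), ncol(reduced_mat)),
    mar = c(2, 2, 1, 0.5), oma = c(0, 0, 0, 0),
    pty = "s",            # keep every plotting region square
    cex.axis = 0.5, las = 2)
for (i in seq_along(reduced_mat)) {
  barplot(runif(3), main = i, cex.main = 1)
}
invisible(dev.off())
```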
 




Re: [R] Memory Experimentation: Rule of Thumb = 10-15 Times the Memory

2007-08-10 Thread Michael Cassin
Thanks for all the comments,

The artificial dataset is as representative of my 440MB file as I could design.

I did my best to reduce the complexity of my problem to minimal
reproducible code as suggested in the posting guidelines.  Having
searched the archives, I was happy to find that the topic had been
covered, where Prof Ripley suggested that the I/O manuals gave some
advice.  However, I was unable to get anywhere with the I/O manuals'
advice.

I spent 6 hours preparing my post to R-help. Sorry not to have read
the 'R-Internals' manual.  I just wanted to know if I could use scan()
more efficiently.

My hurdle seems nothing to do with efficiently calling scan() .  I
suspect the same is true for the originator of this memory experiment
thread. It is the overhead of storing short strings, as Charles
identified and Brian explained.  I appreciate the investigation and
clarification you both have made.

56B overhead for a 2 character string seems extreme to me, but I'm not
complaining. I really like R, and being free, accept that
it-is-what-it-is.

In my case pre-processing is not an option, it is not a one off
problem with a particular file. In my application, R is run in batch
mode as part of a tool chain for arbitrary csv files.  Having found
cases where memory usage was as high as 20x file size, and allowing
for a copy of the loaded dataset, I'll just need to document that
it is possible that files as small as 1/40th of system memory may
consume it all.  That rules out some important datasets (US Census, UK
Office of National Statistics files, etc) for 2GB servers.

Regards, Mike


On 8/9/07, Prof Brian Ripley [EMAIL PROTECTED] wrote:
 On Thu, 9 Aug 2007, Charles C. Berry wrote:

  On Thu, 9 Aug 2007, Michael Cassin wrote:
 
  I really appreciate the advice and this database solution will be useful to
  me for other problems, but in this case I  need to address the specific
  problem of scan and read.* using so much memory.
 
  Is this expected behaviour?

 Yes, and documented in the 'R Internals' manual.  That is basic reading
 for people wishing to comment on efficiency issues in R.

  Can the memory usage be explained, and can it be
  made more efficient?  For what it's worth, I'd be glad to try to help if 
  the
  code for scan is considered to be worth reviewing.
 
  Mike,
 
  This does not seem to be an issue with scan() per se.
 
  Notice the difference in size of big2, big3, and bigThree here:
 
  > big2 <- rep(letters,length=1e6)
  > object.size(big2)/1e6
  [1] 4.000856
  > big3 <- paste(big2,big2,sep='')
  > object.size(big3)/1e6
  [1] 36.2

 On a 32-bit computer every R object has an overhead of 24 or 28 bytes.
 Character strings are R objects, but in some functions such as rep (and
 scan for up to 10,000 distinct strings) the objects can be shared.  More
 string objects will be shared in 2.6.0 (but factors are designed to be
 efficient at storing character vectors with few values).

 On a 64-bit computer the overhead is usually double.  So I would expect
 just over 56 bytes/string for distinct short strings (and that is what
 big3 gives).

 But 56Mb is really not very much (tiny on a 64-bit computer), and 1
 million items is a lot.
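
To see the effect described above on a current installation, the following
sketch compares a character vector with few distinct values, one with all
values distinct, and a factor. The variable names are made up, and the exact
byte counts depend on the R version and platform, so none are quoted here.

```r
n <- 1e5
repeated  <- rep(letters, length.out = n)   # only 26 distinct strings
distinct  <- sprintf("%06d", seq_len(n))    # n distinct short strings
as_factor <- factor(repeated)               # integer codes plus 26 levels

sizes <- c(repeated = object.size(repeated),
           distinct = object.size(distinct),
           factor   = object.size(as_factor))
print(sizes / n)   # approximate bytes per element
```

The distinct strings should come out far more expensive per element than the
shared ones, and the factor cheaper still.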

 [...]


 --
 Brian D. Ripley,  [EMAIL PROTECTED]
 Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
 University of Oxford, Tel:  +44 1865 272861 (self)
 1 South Parks Road, +44 1865 272866 (PA)
 Oxford OX1 3TG, UK                Fax:  +44 1865 272595




Re: [R] Positioning text in top left corner of plot

2007-08-10 Thread Paul Murrell
Hi


Daniel Brewer wrote:
 Thanks for the replies, but I still cannot get what I want.  I do not
 want the label inside the plot area, but in the top left of the paper, I
 suppose in the margins.  When I try to use text to do this, it does not
 seem to plot it outside the plot area.  I have also tried to use mtext,
 but that does not really cut it, as I cannot get the label in the
 correct position.  Ideally, it would be best if I could use legend but
 have it outside the plot area.
 
 Any ideas?


plot(1:10)
library(grid)
grid.text("What do we want?  Text in the corner!\nWhere do we want it?  Here!",
   x=unit(2, "mm"), y=unit(1, "npc") - unit(2, "mm"),
   just=c("left", "top"))

Paul


 Thanks
 
 Benilton Carvalho wrote:
 maybe this is what you want?

 plot(rnorm(10))
 legend("topleft", "A)", bty="n")

 ?

 b

 On Aug 7, 2007, at 11:08 AM, Daniel Brewer wrote:

 Simple question how can you position text in the top left hand corner of
 a plot?  I am plotting multiple plots using par(mfrow=c(2,3)) and all I
 want to do is label these plots a), b), c) etc.  I have been fiddling
 around with both text and mtext but without much luck.  text is fine but
  each plot has a different scale on the axis and so this makes it
 problematic.  What is the best way to do this?

 Many thanks

 Dan


-- 
Dr Paul Murrell
Department of Statistics
The University of Auckland
Private Bag 92019
Auckland
New Zealand
64 9 3737599 x85392
[EMAIL PROTECTED]
http://www.stat.auckland.ac.nz/~paul/



Re: [R] Positioning text in top left corner of plot

2007-08-10 Thread Jim Lemon
Daniel Brewer wrote:
 Thanks for the replies, but I still cannot get what I want.  I do not
 want the label inside the plot area, but in the top left of the paper, I
 suppose in the margins.  When I try to use text to do this, it does not
 seem to plot it outside the plot area.  I have also tried to use mtext,
 but that does not really cut it, as I cannot get the label in the
 correct position.  Ideally, it would be best if I could use legend but
 have it outside the plot area.
 
 Any ideas?
 
Hi Dan,

Try this:

plot(1:5)
par(xpd=TRUE)
text(0.5,5.5,"Outside")
par(xpd=FALSE)
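
For the original multi-panel case, where every panel has a different axis
scale, a variation on the same idea is to convert figure-relative coordinates
to user coordinates with grconvertX() and grconvertY() before calling text().
This is only a sketch; the helper name corner_label and the inset value are
made up.

```r
# Hypothetical helper: write a label in the top left corner of the
# current figure region, whatever the axis scales are.
corner_label <- function(label, inset = 0.02) {
  x <- grconvertX(inset, from = "nfc", to = "user")
  y <- grconvertY(1 - inset, from = "nfc", to = "user")
  text(x, y, label, adj = c(0, 1), xpd = NA, font = 2)
  invisible(c(x = x, y = y))
}

panel_labels <- paste0(letters[1:6], ")")
op <- par(mfrow = c(2, 3))
for (i in seq_along(panel_labels)) {
  plot(rnorm(10) * 10^i, main = "")   # each panel has a different scale
  corner_label(panel_labels[i])
}
par(op)
```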

Jim



Re: [R] GLMM: MEEM error due to dichotomous variables

2007-08-10 Thread lorenz.gygax
 I am trying to run a GLMM on some binomial data. My fixed 
 factors include 2 
 dichotomous variables, day, and distance. When I run the model:
 
 modelA <- glmmPQL(Leaving~Trial*Day*Dist, random=~1|Indiv, family=binomial)
 
 I get the error:
 
 iteration 1
 Error in MEEM(object, conLin, control$niterEM) :
 Singularity in backsolve at level 0, block 1
 
 From looking at previous help topics,( 
 http://tolstoy.newcastle.edu.au/R/help/02a/4473.html)
 I gather this is because of the dichotomous predictor 
 variables - what approach should I take to avoid this problem?

Are you sure? I have never had problems including factors in a glmmPQL so far. 
More likely, the combination of your explanatory variables fragments your 
response so much that some combinations of your factor levels contain only 0s 
or only 1s. Thus, your model is 'too good' (it has too many predictors given 
the amount of data). Try e.g. to fit a model without the interactions.
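
A quick way to check for this, sketched on made-up data (the real variables
may be coded differently, and Dist is treated as a two-level factor here only
for the illustration): compute the proportion of 1s in every combination of
factor levels and look for combinations where it is exactly 0 or 1.

```r
set.seed(1)
# Made-up data: 5 observations per Trial x Day x Dist combination.
d <- expand.grid(Trial = factor(1:2), Day = factor(1:2),
                 Dist = factor(c("near", "far")), rep = 1:5)
d$Leaving <- rbinom(nrow(d), 1, 0.5)
# Force one combination to contain only 0s, to mimic the suspected problem.
d$Leaving[d$Trial == "1" & d$Day == "1" & d$Dist == "near"] <- 0

# Proportion of 1s in each combination of factor levels.
cell_means <- aggregate(Leaving ~ Trial + Day + Dist, data = d, FUN = mean)
# Combinations in which the response never varies.
degenerate <- subset(cell_means, Leaving %in% c(0, 1))
print(degenerate)
```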

Cheers, Lorenz
- 
Lorenz Gygax
Centre for proper housing of ruminants and pigs
Agroscope Reckenholz-Tänikon Research Station ART
Tänikon, CH-8356 Ettenhausen / Switzerland



[R] GLM with tweedie: NA for AIC

2007-08-10 Thread laran gines
Dear R users;
 
I am modelling densities of some species of birds, so I have a problem with a 
great amount of zeros.
I have decided to try GLMs with the tweedie family, but in all the models I 
have tried I got an NA for the AIC value.
Just to check the problem I've compared a glm using the Gaussian family 
with the identity link and a glm using the tweedie family with var.power=0 and 
link.power=1. These are equal, as expected, except that the tweedie 
output gives me an NA for the AIC.
Can anyone help me with this problem?
Below you can find the two outputs I refer to.
 
Best Wishes;
 
Catarina
 
> summary(glm(formula=ACIN~DIST_REF+DIST_H2O+DIST_OST+
+ COTA+H2O_SUP+vasa,family=gaussian(link=identity)))

Call:
glm(formula = ACIN ~ DIST_REF + DIST_H2O + DIST_OST + COTA + H2O_SUP +
    vasa, family = gaussian(link = identity))

Deviance Residuals:
      Min         1Q     Median         3Q        Max
-0.112792  -0.042860  -0.021113  -0.006311   1.551824

Coefficients:
              Estimate Std. Error t value Pr(>|t|)
(Intercept) -6.625e-02  5.454e-02  -1.215   0.2256
DIST_REF     3.581e-06  1.336e-05   0.268   0.7889
DIST_H2O    -3.168e-05  1.527e-05  -2.074   0.0391 *
DIST_OST    -1.799e-05  1.953e-05  -0.921   0.3579
COTA         5.648e-04  2.470e-04   2.287   0.0230 *
H2O_SUP     -2.172e-04  3.994e-04  -0.544   0.5870
vasa         3.695e-02  4.573e-02   0.808   0.4199
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

(Dispersion parameter for gaussian family taken to be 0.02151985)

    Null deviance: 5.6028  on 257  degrees of freedom
Residual deviance: 5.4015  on 251  degrees of freedom
AIC: -249.33

Number of Fisher Scoring iterations: 2
 
 
> summary(glm(formula=ACIN~DIST_REF+DIST_H2O+DIST_OST+
+ COTA+H2O_SUP+vasa,control=glm.control(maxit=750),family=tweedie(var.power=0,
+ link.power=1)))

Call:
glm(formula = ACIN ~ DIST_REF + DIST_H2O + DIST_OST + COTA + H2O_SUP +
    vasa, family = tweedie(var.power = 0, link.power = 1), control =
    glm.control(maxit = 750))

Deviance Residuals:
      Min         1Q     Median         3Q        Max
-0.112792  -0.042860  -0.021113  -0.006311   1.551824

Coefficients:
              Estimate Std. Error t value Pr(>|t|)
(Intercept) -6.625e-02  5.454e-02  -1.215   0.2256
DIST_REF     3.581e-06  1.336e-05   0.268   0.7889
DIST_H2O    -3.168e-05  1.527e-05  -2.074   0.0391 *
DIST_OST    -1.799e-05  1.953e-05  -0.921   0.3579
COTA         5.648e-04  2.470e-04   2.287   0.0230 *
H2O_SUP     -2.172e-04  3.994e-04  -0.544   0.5870
vasa         3.695e-02  4.573e-02   0.808   0.4199
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

(Dispersion parameter for Tweedie family taken to be 0.02151985)

    Null deviance: 5.6028  on 257  degrees of freedom
Residual deviance: 5.4015  on 251  degrees of freedom
AIC: NA

Number of Fisher Scoring iterations: 2
 


[R] Cleaning up the memory

2007-08-10 Thread Monica Pisica

Hi,
 
I have 4 huge tables on which I want to do a PCA analysis and a k-means 
clustering. If I run each table individually I have no problems, but if I 
run them in a for loop I exceed the memory allocation after the second table, 
even if I save the results as a csv table and clean up all the big objects 
with the rm command. It seems to me that even when the objects no longer 
exist, the memory they used to occupy is not cleared. Is there any way to 
clear up the memory as well? I don't want to close R and start it up again. 
Also, I am running R under Windows.
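
One pattern that may help, sketched on small made-up tables: keep all large
intermediate objects inside a function so that they go out of scope when it
returns, and call gc() after rm() at the end of each iteration. Whether this
is enough in a given session is not something the sketch can guarantee. The
helper name analyse_table, the file names, and the choice of 3 clusters are
assumptions.

```r
# Two small made-up tables stand in for the four huge ones.
in_files <- replicate(2, tempfile(fileext = ".csv"))
for (f in in_files) {
  write.csv(as.data.frame(matrix(rnorm(200), ncol = 4)), f, row.names = FALSE)
}

# Hypothetical helper: everything big lives only inside this function.
analyse_table <- function(path, centers = 3) {
  tab <- read.csv(path)
  pca <- prcomp(tab, scale. = TRUE)
  km  <- kmeans(pca$x[, 1:2], centers = centers)
  data.frame(pca$x[, 1:2], cluster = km$cluster)
}

out_files <- sub("\\.csv$", "_result.csv", in_files)
for (i in seq_along(in_files)) {
  res <- analyse_table(in_files[i])
  write.csv(res, out_files[i], row.names = FALSE)
  rm(res)
  gc()   # ask R to run a garbage collection before the next table
}
```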
 
thanks,
 
Monica


[R] need help to manipulate function and time interval

2007-08-10 Thread KOITA Lassana - STAC/ACE
Hi R-users,

I have to define a noise level function L and its energy W at different 
times of the day:

if the time is between 18:00:00 and 23:59:59, then L[j] <- L[j]+5 and 
W <- 10^((L+5)/10)

if the time is between 22:00:00 and 05:59:59, then L <- L+10 and 
W <- 10^((L+10)/10)

else L = L and W = W

Could someone help me write this function, please? My proposed code is 
below; my main problem is handling the time intervals.

Best regards

###
myfunc <- function(mytab, Time, Level)
{
vect <- rep(0, length(mytab))
for(i in 1:length(vect))
   {
  for(j in 1:length(Time))
     {
    # the three tests below are pseudocode: how should they be written?
    if(Time[j] is between 18:00:00 and 23:59:59)
       {
      L[i] <- L[j]+5
      vect[i] <- 10^((L[i])/10)
       }
    else if (Time[j] is between 22:00:00 and 05:59:59)
       {
      L[i] <- L[j]+10
      vect[i] <- 10^((L[i])/10)
       }
    else
       {
      L[i] <- L[j]
      vect[i] <- 10^((L[i])/10)
       }
     }
   }
}
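
A vectorized sketch of one way to handle the intervals, under two loud
assumptions: times arrive as "HH:MM:SS" strings, and where the two intervals
overlap (22:00:00 to 23:59:59) the night rule takes precedence, so the evening
rule effectively covers 18:00:00 to 21:59:59. The function name noise_energy
is made up.

```r
# Hypothetical helper; Time is a character vector "HH:MM:SS",
# Level a numeric vector of noise levels of the same length.
noise_energy <- function(Time, Level) {
  hour    <- as.integer(substr(Time, 1, 2))
  night   <- hour >= 22 | hour < 6    # 22:00:00 to 05:59:59
  evening <- hour >= 18 & !night      # 18:00:00 to 21:59:59 (assumed)
  penalty <- ifelse(night, 10, ifelse(evening, 5, 0))
  L <- Level + penalty
  data.frame(Time = Time, L = L, W = 10^(L / 10))
}

res <- noise_energy(c("12:00:00", "19:30:00", "23:00:00", "03:15:00"),
                    c(60, 60, 60, 60))
res
```

Working on the hour alone is enough here because both interval boundaries
fall on whole hours; other boundaries would need the minutes and seconds too.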

###

Lassana KOITA 
Chargé d'Etudes de Sécurité Aéroportuaire et d'Analyse Statistique  / 
Project Engineer Airport Safety Studies  Statistical analysis
Service Technique de l'Aviation Civile (STAC) / Civil Aviation Technical 
Department 
Direction Générale de l'Aviation Civile (DGAC) / French Civil Aviation 
Headquarters
Tel: 01 49 56 80 60
Fax: 01 49 56 82 14
E-mail: [EMAIL PROTECTED]
http://www.stac.aviation-civile.gouv.fr/


Re: [R] small sample techniques

2007-08-10 Thread Nordlund, Dan (DSHS/RDA)
 -Original Message-
 From: [EMAIL PROTECTED] 
 [mailto:[EMAIL PROTECTED] On Behalf Of Nair, 
 Murlidharan T
 Sent: Thursday, August 09, 2007 12:02 PM
 To: Nordlund, Dan (DSHS/RDA); r-help@stat.math.ethz.ch
 Subject: Re: [R] small sample techniques
 
 n=300
 30% taking A relief from pain
 23% taking B relief from pain
 Question; If there is no difference are we likely to get a 7% 
 difference?
 
 Hypothesis
 H0: p1-p2=0
 H1: p1-p2!=0 (not equal to)
 
 1. Weighted average of the two sample proportions
      300(0.30)+300(0.23)
      ------------------- = 0.265
            300+300
 2. Standard error estimate of the difference between two 
    independent proportions
      sqrt((0.265 *0.735)*((1/300)+(1/300))) = 0.03603
 
 3. Evaluation of the difference between the sample proportions as a 
    deviation from the hypothesized difference of zero
      ((0.30-0.23)-(0))/0.03603 = 1.94
 
 
 z did not approach 1.96 hence H0 is not rejected. 
 
 This is what I was trying to do using prop.test. 
 
 prop.test(c(30,23),c(300,300)) 
 
 What function should I use? 
 
 

The proportion test above indicates that p1=0.1 and p2=0.0767.  But in your 
t-test you specify p1=0.3 and p2=0.23.  Which is correct?  If p1=0.3 and 
p2=0.23, then use

prop.test(c(.30*300,.23*300),c(300,300))
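
As a check, the hand calculation quoted above can be reproduced in R and set
beside prop.test. The sketch assumes the counts 90 and 69 out of 300 each, and
switches off the continuity correction so that the two results are directly
comparable; the variable names are made up.

```r
x <- c(90, 69)     # 30% and 23% of 300
n <- c(300, 300)

p_hat  <- x / n
p_pool <- sum(x) / sum(n)                           # pooled proportion, 0.265
se     <- sqrt(p_pool * (1 - p_pool) * sum(1 / n))  # about 0.036
z      <- (p_hat[1] - p_hat[2]) / se                # about 1.94

# Without the continuity correction, X-squared equals z^2.
res <- prop.test(x, n, correct = FALSE)
c(z = z, z_squared = z^2, X_squared = unname(res$statistic))
```

By default prop.test applies the continuity correction, so its reported
statistic is somewhat smaller than z^2 unless correct = FALSE is given.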

Hope this is helpful,

Dan

Daniel J. Nordlund
Research and Data Analysis
Washington State Department of Social and Health Services
Olympia, WA  98504-5204



Re: [R] write.table

2007-08-10 Thread Yinghai Deng
write.table(mydata.frame, "mydata", col.names=NA, quote=F, sep="\t") will
solve the problem.
Deng
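
A small round trip showing what col.names=NA does; the data frame contents and
the temporary file are made up for the illustration.

```r
mydata.frame <- data.frame(a = 1:3, b = c(2.5, 3.5, 4.5),
                           row.names = c("r1", "r2", "r3"))
out_file <- tempfile(fileext = ".txt")   # stand-in for "mydata"

# col.names = NA writes an empty header field above the row names.
write.table(mydata.frame, out_file, col.names = NA, quote = FALSE, sep = "\t")
header_line <- readLines(out_file)[1]

# Reading it back, with the first column used as row names.
round_trip <- read.delim(out_file, row.names = 1)
```

The header line starts with a tab, so the column names line up with the data
columns instead of sitting one position to the left.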
-Original Message-
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] Behalf Of Weiwei Shi
Sent: August 10, 2007 12:41 PM
To: r-help@stat.math.ethz.ch
Subject: [R] write.table


Hi,

I always run into this question when I try to write a data.frame with
row.names and col.names. I have to re-make the data frame so that its
first column holds the rownames, and set row.names=F, so that the
colnames are aligned correctly.

Is there a way or option in write.table to automatically do that?

thanks.

--
Weiwei Shi, Ph.D
Research Scientist
GeneGO, Inc.

Did you always know?
No, I did not. But I believed...
---Matrix III



Re: [R] Subject: Re: how to include bar values in a barplot?

2007-08-10 Thread Greg Snow


Jim Lemon Wrote:

 I also greatly enjoyed Ted's rebuttal of the "Bar charts are 
 evil and must be banned" argument. If bar charts are 
 appropriate for the audience, give 'em bar charts. One great 
 way to turn off your customers is to tell them what they can 
 and can't do with your product.

I don't remember anyone saying that barcharts are evil or that they
should be banned (3-D bar charts and pie charts on the other hand ...). 

I think that a variation on fortune(108) applies here.  While barcharts
may be appropriate for some audiences, it is also appropriate to educate
our audiences to better alternatives.


-- 
Gregory (Greg) L. Snow Ph.D.
Statistical Data Center
Intermountain Healthcare
[EMAIL PROTECTED]
(801) 408-8111



[R] Test (ignore)

2007-08-10 Thread Nordlund, Dan (DSHS/RDA)


Daniel J. Nordlund
Research and Data Analysis
Washington State Department of Social and Health Services
Olympia, WA  98504-5204



Re: [R] how to include bar values in a barplot?

2007-08-10 Thread Greg Snow
Welcome to the world of R.

I'm glad that you found the discussion enlightening. Now that you have
thought about things a bit, here is some code to try out that shows some
of the alternatives to the original plots you provided (which is best
depends on your audience and what your main question of interest is,
i.e. which comparisons are most important):

tmp <- c(34,22,77)

tmp2 <- barplot(tmp, names=LETTERS[1:3])

# put numbers at bottom of bars
axis(1, at=tmp2, labels=as.character(tmp), tick=FALSE, line = -1)

# put numbers at top of plot
axis(3, at=tmp2, labels=as.character(tmp), tick=FALSE)


# horizontal barplot

op <- par(mar=c(5,6,4,6)+0.1)
tmp2 <- barplot(tmp, names=LETTERS[1:3], horiz=TRUE)

# put numbers on the right
axis(4, at=tmp2, labels=as.character(tmp), tick=FALSE, las=1)

par(op)

# the dotplot
library(Hmisc)
dotchart2(tmp, labels=LETTERS[1:3], auxdata=tmp, xlim=range(0,tmp))


# alternatives to stacked bars

tmp1 <- c(8, 22, 60, 10,  10, 21, 59, 10)
tmp2 <- factor(rep(c('A','B'), each=4))
tmp3 <- factor(rep(1:4, 2))

dotchart2(tmp1, groups=tmp2, labels=tmp3, xlim=range(0,tmp1))
dotchart2(tmp1, groups=tmp3, labels=tmp2, xlim=range(0,tmp1))

library(lattice)
tmp <- data.frame( tmp1=tmp1, tmp2=tmp2, tmp3=tmp3 )

dotplot( tmp2~tmp1, data=tmp, groups=tmp3, pch=levels(tmp3),
scales=list(x=list(limits=range(0,tmp1))) )
dotplot( tmp3~tmp1, data=tmp, groups=tmp2, pch=levels(tmp2),
xlim=range(0,tmp1) )
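
For completeness, the literal request in the subject line (values written on
the bars themselves) can be sketched with text(), using the bar midpoints that
barplot() returns. The 15% headroom is an arbitrary choice and the name
bar_mid is made up.

```r
tmp <- c(34, 22, 77)

# Leave some headroom so the label above the tallest bar is not clipped.
bar_mid <- barplot(tmp, names.arg = LETTERS[1:3], ylim = c(0, max(tmp) * 1.15))

# barplot() returns the bar midpoints; pos = 3 places each label just
# above the coordinates it is given.
text(x = bar_mid, y = tmp, labels = tmp, pos = 3)
```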





-- 
Gregory (Greg) L. Snow Ph.D.
Statistical Data Center
Intermountain Healthcare
[EMAIL PROTECTED]
(801) 408-8111
 
 

 -Original Message-
 From: [EMAIL PROTECTED] 
 [mailto:[EMAIL PROTECTED] On Behalf Of Donatas G.
 Sent: Friday, August 10, 2007 2:15 AM
 To: r-help@stat.math.ethz.ch
 Subject: Re: [R] how to include bar values in a barplot?
 
 Quoting Greg Snow [EMAIL PROTECTED]:
 
  My original intent was to get the original posters out of 
 the mode of 
  thinking they want to match what the spreadsheet does and into 
  thinking about what message they are trying to get across.  To get 
  them (and possibly others) thinking I made the statements a 
 bit more 
  bold than my actual position (I did include a couple of qualifiers).
 
 As an original poster (and a brand new user of R), I would 
 like to comment on the educational experience I have just received. ;)
 
 The discussion was interesting and enlightening, and gives 
 some good ideas about the ways (tables, graphs, graphs with 
 numbers etc.) to get the data accross to the ones one is 
 presenting to. I see some of you guys do feel quite strongly 
 about it, which is fine for me. I do not. I usually care for 
 barplot aesthetics and informativeness more than for visual 
 simplicity. That may change in time :)
 
 I see R graphical capabilities are huge but hard to access at 
 times - that is when spreadsheet seems preferrable. For 
 example, as a user of Linux I still cannot figure out why the 
 fonts (and graphics in general) look much more ugly on R in 
 Linux than they do in R on Windows - no smoothing, sub-pixell 
 hinting, anything like that. That is what my next free time 
 homework on R will be about
 :)
 
 Sincerely
 
 Donatas Glodenis
 PhD candidate
 Department of Sociology of the Faculty of Philosophy Vilnius 
 University Lithuania
 


__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] rfImpute

2007-08-10 Thread Eric Turkheimer
I am having trouble with the rfImpute function in the randomForest package.
Here is a sample...

clunk.roughfix <- na.roughfix(clunk)

> clunk.impute <- rfImpute(CONVERT ~ ., data=clunk)
ntree  OOB  1  2
  300:  26.80%  3.83% 85.37%
ntree  OOB  1  2
  300:  18.56%  5.74% 51.22%
Error in randomForest.default(xf, y, ntree = ntree, ..., do.trace = ntree,  :
        NA not permitted in predictors

So na.roughfix works, but rfImpute doesn't.

Thanks,
Eric
 ent3c *at* virginia.edu

-- 
Eric Turkheimer, PhD
Department of Psychology
University of Virginia
PO Box 400400
Charlottesville, VA  22904-4400

434-982-4732
434-982-4766 (FAX)



Re: [R] help with counting how many times each value occur in each column

2007-08-10 Thread François Pinard
[Gabor Grothendieck]

   table(col(mat), mat)

Clever, simple, and elegant! :-)

-- 
François Pinard   http://pinard.progiciels-bpi.ca



Re: [R] Odp: having problems with factor()

2007-08-10 Thread Jabez Wilson
You've spotted it!

table(df$area)
 0  1  2  3  4  5  7 
21 27 71 46 19  3  1 
 
There are no values in area 6.

Thank you very much.

Jabez
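
For the archive, a minimal sketch of the same situation, using a short made-up `area` vector rather than the real tree data. The point is that `labels` has to match the number of levels that factor() finds, so one option is to name all eight levels explicitly with `levels=`; the object names below are only for illustration.

```r
# Illustrative data only: areas 0..7 with no observations in area 6
area <- c(0, 1, 2, 3, 4, 5, 7, 2, 2, 1)

roman <- c("0", "I", "II", "III", "IV", "V", "VI", "VII")

# Naming the levels explicitly keeps the empty level VI,
# so the eight labels line up with the eight levels
areaf <- factor(area, levels = 0:7, labels = roman)
table(areaf)

# For an anova the empty level can be dropped again
areaf2 <- droplevels(areaf)
nlevels(areaf2)
```

The alternative is simply to supply seven labels (leaving out "VI"), which is what the error message is asking for.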


- Original Message 
From: Petr PIKAL [EMAIL PROTECTED]
To: Jabez Wilson [EMAIL PROTECTED]
Cc: R-Help r-help@stat.math.ethz.ch
Sent: Friday, 10 August, 2007 1:02:21 PM
Subject: Odp: [R] having problems with factor()


Hi
[EMAIL PROTECTED] wrote on 10.08.2007 13:41:53:

 Dear R Help,
 I have a set of data of heights of trees described by area that they are 
in. 
 The areas are numerical (0 to 7).
 
      ht area
 1   320   3
 2   410   4
 3   230   2
 4   360   3
 5   126   1
 6   280   2
 7   260   2
 8   280   2
 9   280   2
 10  260   2
 ...
 180 450   4
 181  90   1
 182 120   1
 183 440   4
 184 210   2
 185 330   3
 186 210   2
 187 100   1
 188   0   0
 
 I want to convert the area column values to factors, to do an anova. 
However, if I use:
 
 df$areaf <- factor(df$area, 
labels=c("0","I","II","III","IV","V","VI","VII"))
 
 it gives the following message:
 

Hm, maybe some of the values are missing

> num <- sample(1:3, 10, replace=T)
> num
[1] 1 3 1 2 3 3 1 3 3 3
> factor(num, labels=c("O", "I", "II"))
[1] O  II O  I  II II O  II II II
Levels: O I II

> factor(num, labels=c("O", "I", "II", "III"))
Error in factor(num, labels = c("O", "I", "II", "III")) : 
invalid labels; length 4 should be 1 or 3


try

table(df$area)
to see what level you really have

Regards
Petr


 Error in factor(df$area, labels = c("0", "I", "II", "III", "IV", "V", 
"VI",  : 
 invalid labels; length 8 should be 1 or 7
 
 Can anyone help?
 
 Jabez
 
 




Re: [R] Help wit matrices

2007-08-10 Thread Ravi Varadhan
An even simpler solution is:

mat2 <- 1 * (mat1 > 0.25)
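
A small sketch of why the original attempt lost the matrix shape, assuming a 4x5 stand-in for the 1000x1000 matrix (the names `bad`, `mat3` and `mat4` are made up here). as.numeric() drops the dim attribute, whereas arithmetic on the logical matrix keeps it.

```r
set.seed(1)
mat1 <- matrix(runif(20), nrow = 4)   # small stand-in for the 1000x1000 matrix

# as.numeric() drops the dim attribute, so as.matrix() sees a plain
# vector and returns a single-column matrix
bad <- as.matrix(as.numeric(mat1 > 0.25))
dim(bad)    # 20 1

# Arithmetic on the logical matrix keeps the dimensions
mat2 <- 1 * (mat1 > 0.25)
dim(mat2)   # 4 5

# Equivalent alternatives
mat3 <- ifelse(mat1 > 0.25, 1, 0)
mat4 <- (mat1 > 0.25) + 0
```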

Ravi.


---

Ravi Varadhan, Ph.D.

Assistant Professor, The Center on Aging and Health

Division of Geriatric Medicine and Gerontology 

Johns Hopkins University

Ph: (410) 502-2619

Fax: (410) 614-9625

Email: [EMAIL PROTECTED]

Webpage:  http://www.jhsph.edu/agingandhealth/People/Faculty/Varadhan.html

 





-Original Message-
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] On Behalf Of Lanre Okusanya
Sent: Friday, August 10, 2007 2:20 PM
To: jim holtman
Cc: r-help@stat.math.ethz.ch
Subject: Re: [R] Help wit matrices

that was ridiculously simple. duh.

THanks

Lanre

On 8/10/07, jim holtman [EMAIL PROTECTED] wrote:
 Is this what you want:

 > x <- matrix(runif(100), 10)
 > round(x, 3)
        [,1]  [,2]  [,3]  [,4]  [,5]  [,6]  [,7]  [,8]  [,9] [,10]
  [1,] 0.268 0.961 0.262 0.347 0.306 0.762 0.524 0.062 0.028 0.226
  [2,] 0.219 0.100 0.165 0.131 0.578 0.933 0.317 0.109 0.527 0.131
  [3,] 0.517 0.763 0.322 0.374 0.910 0.471 0.278 0.382 0.880 0.982
  [4,] 0.269 0.948 0.510 0.631 0.143 0.604 0.788 0.169 0.373 0.327
  [5,] 0.181 0.819 0.924 0.390 0.415 0.485 0.702 0.299 0.048 0.507
  [6,] 0.519 0.308 0.511 0.690 0.211 0.109 0.165 0.192 0.139 0.681
  [7,] 0.563 0.650 0.258 0.689 0.429 0.248 0.064 0.257 0.321 0.099
  [8,] 0.129 0.953 0.046 0.555 0.133 0.499 0.755 0.181 0.155 0.119
  [9,] 0.256 0.954 0.418 0.430 0.460 0.373 0.620 0.477 0.132 0.050
 [10,] 0.718 0.340 0.854 0.453 0.943 0.935 0.170 0.771 0.221 0.929
 > ifelse(x > .5, 1, 0)
       [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
  [1,]    0    1    0    0    0    1    1    0    0     0
  [2,]    0    0    0    0    1    1    0    0    1     0
  [3,]    1    1    0    0    1    0    0    0    1     1
  [4,]    0    1    1    1    0    1    1    0    0     0
  [5,]    0    1    1    0    0    0    1    0    0     1
  [6,]    1    0    1    1    0    0    0    0    0     1
  [7,]    1    1    0    1    0    0    0    0    0     0
  [8,]    0    1    0    1    0    0    1    0    0     0
  [9,]    0    1    0    0    0    0    1    0    0     0
 [10,]    1    0    1    0    1    1    0    1    0     1


 On 8/10/07, Lanre Okusanya [EMAIL PROTECTED] wrote:
  Hello all,
 
  I am working with a 1000x1000 matrix, and I would like to return a
  1000x1000 matrix that tells me which value in the matrix is greater
  than a threshold value (1 or 0 indicator).
  i have tried
   mat2 <- as.matrix(as.numeric(mat1 > 0.25))
  but that returns a 1:10 matrix.
  I have also tried for loops, but they are grossly inefficient.
 
  THanks for all your help in advance.
 
  Lanre
 
 


 --
 Jim Holtman
 Cincinnati, OH
 +1 513 646 9390

 What is the problem you are trying to solve?




Re: [R] Cleaning up the memory

2007-08-10 Thread Prof Brian Ripley

On Fri, 10 Aug 2007, Monica Pisica wrote:



Thanks! I will look into ...

I have 4 GB RAM, and i was monitoring the memory with Windows task 
manager so i was looking how R gets more and more memory allocation 
from less than 100Mb to  1500Mb .


Then you are almost certainly fragmenting the address space.

We still don't know your OS and whether you have enabled the /3GB switch 
(if relevant to that version of Windows).   Most versions of Windows have 
a 2Gb address space, but some can be as high as 4Gb (Vista 64 which I use 
is one: the details are in the rw-FAQ for the latest versions of R, e.g. 
R-patched and R-devel).  That factor of 2 can make a big difference.


My initial tables are between 30 to 80 Mb and the resulting tables that 
incorporate the initial tables plus PCA and kmeans results are inbetween 
50 to 200MB or thereabouts!


And yes, i don't really care about memory allocation in detail - what i 
want is to free that memory after every cycle ;-)


Although, after i didn't do anything in R and it was idle for more than 
30 min. the memory allocation according to Task manager dropped to 15 Mb 
. which is good - but i cannot wait inbetween cycles half an hour 
though .


Calling gc() will reduce the memory allocation, but that is not the point.
You can have 15Mb allocated and still not a 50Mb hole in the address 
space (although that would be extremely unlucky, not having several 200Mb 
holes is quite likely).
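
As a rough sketch of the per-cycle cleanup being discussed (not a cure for address-space fragmentation, which gc() cannot undo): doing each cycle inside a function lets the large intermediates go out of scope, and gc() can then be called between cycles. The function `process_table` and its contents are hypothetical stand-ins for the real PCA and kmeans steps.

```r
set.seed(1)

# Hypothetical stand-in for one cycle: build a large table, analyse it,
# and return only a small summary
process_table <- function(n) {
  big <- matrix(rnorm(n * 10), ncol = 10)
  pca <- prcomp(big)
  km  <- kmeans(pca$x[, 1:2], centers = 3)
  km$size                      # big, pca and km go out of scope on return
}

results <- vector("list", 4)
for (i in seq_along(results)) {
  results[[i]] <- process_table(5000)
  invisible(gc())              # request a garbage collection between cycles
}
```

With objects created at top level instead, the equivalent would be rm() on the big objects followed by gc() at the end of each pass through the loop.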




Again thanks,

Monica

 Date: Fri, 10 Aug 2007 18:28:07 +0100
 From: [EMAIL PROTECTED]
 To: [EMAIL PROTECTED]
 CC: r-help@stat.math.ethz.ch
 Subject: Re: [R] Cleaning up the memory

 On Fri, 10 Aug 2007, Monica Pisica wrote:

  Hi,

  I have 4 huge tables on which i want to do a PCA analysis and a kmean
  clustering. If i run each table individually i have no problems, but if
  i want to run it in a for loop i exceed the memory allocation after the
  second table, even if i save the results as a csv table and i clean up
  all the big objects with rm command. To me it seems that even if i
  don't have the objects anymore, the memory these objects used to
  occupy is not cleared. Is there any way to clear up the memory as
  well? I don't want to close R and start it up again. Also i am
  running R under Windows.

 See ?gc, which does the clearing.

 However, unless you study the memory allocation in detail (which you
 cannot do from R code), you don't actually know that this is the
 problem. More likely is that you have fragmentation of your 32-bit
 address space: see ?Memory-limits.

 Without any idea what memory you have and what 'huge' means, we can only
 make wild guesses. It might be worth raising the memory limit (the
 --max-mem-size flag).

  thanks,

  Monica


--
Brian D. Ripley,  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel:  +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UK  Fax:  +44 1865 272595


Re: [R] Help wit matrices

2007-08-10 Thread John Kane
Will something like this help?

mm <- matrix(rnorm(100),nrow=10)
mm
nn <- mm > .5
nn

--- Lanre Okusanya [EMAIL PROTECTED] wrote:

 Hello all,
 
 I am working with a 1000x1000 matrix, and I would
 like to return a
 1000x1000 matrix that tells me which value in the
 matrix is greater
 than a threshold value (1 or 0 indicator).
 i have tried
   mat2 <- as.matrix(as.numeric(mat1 > 0.25))
 but that returns a 1:10 matrix.
 I have also tried for loops, but they are grossly
 inefficient.
 
 THanks for all your help in advance.
 
 Lanre
 


Re: [R] Help wit matrices

2007-08-10 Thread Roland Rau
I hope you don't mind that I offer also two solutions. No.1 is really 
bad. No.2 should be on par with the other ones.

Best,
Roland



mydata <- matrix(rnorm(10*10), ncol=10)

threshold.value <- 1.5

mydata2 <- matrix(0, nrow=nrow(mydata), ncol=ncol(mydata))
mydata3 <- matrix(0, nrow=nrow(mydata), ncol=ncol(mydata))


### not really the way to go:
for (i in 1:nrow(mydata)) {
  for (j in 1:ncol(mydata)) {
    if (mydata[i,j] > threshold.value) {
      mydata2[i,j] <- 1
    }
  }
}
### a better way...
mydata3[mydata > threshold.value] <- 1
mydata2
mydata3
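
A quick way to confirm that the two approaches agree, and to get a rough feel for the cost of the double loop, is sketched below. The 200x200 size and the names `by_loop` and `by_index` are arbitrary choices for illustration; timings will differ from machine to machine.

```r
set.seed(123)
m <- matrix(rnorm(200 * 200), ncol = 200)
threshold <- 1.5

# Element-by-element loop
by_loop <- matrix(0, nrow = nrow(m), ncol = ncol(m))
t_loop <- system.time(
  for (i in seq_len(nrow(m))) for (j in seq_len(ncol(m)))
    if (m[i, j] > threshold) by_loop[i, j] <- 1
)

# Logical indexing: one vectorised assignment
by_index <- matrix(0, nrow = nrow(m), ncol = ncol(m))
t_index <- system.time(by_index[m > threshold] <- 1)

identical(by_loop, by_index)
c(loop = t_loop[["elapsed"]], index = t_index[["elapsed"]])
```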



Lanre Okusanya wrote:
 Hello all,
 
 I am working with a 1000x1000 matrix, and I would like to return a
 1000x1000 matrix that tells me which value in the matrix is greater
 than a threshold value (1 or 0 indicator).
 i have tried
   mat2 <- as.matrix(as.numeric(mat1 > 0.25))
 but that returns a 1:10 matrix.
 I have also tried for loops, but they are grossly inefficient.
 
 THanks for all your help in advance.
 
 Lanre
 


[R] half-logit and glm (again)

2007-08-10 Thread Richard D. Morey
I know this has been dealt with before on this list, but the previous 
messages lacked detail, and I haven't figured it out yet.

The model is:

x_{ij} = \mu + \alpha_i + \beta_j

\alpha is a random effect (subjects), and \beta is a fixed effect 
(condition).

I have a link function:

p_{ij} = .5 + .5( 1 / (1 + exp{ -x_{ij} } ) )

Which is simply a logistic transformed to be between .5 and 1.

The data y_{ij} ~ Binomial( p_{ij}, N_{ij} )

I've generated data using this model, and I'd like to fit it. My data is 
a data frame with 3 columns, response (0/1), subject (a factor), and 
condition (another factor).

Here is my link definition:
#
halflogit=function(){
  half.logit=function(mu) qlogis(2*mu-1)
  half.logit.inv=function(eta) .5*plogis(eta)+.5
  half.logit.deriv=function(eta) .5*(exp(eta/2)+exp(-eta/2))^-2
  half.logit.inv.indicator=function(eta) TRUE
  half.logit.indicator=function(mu) mu>.5 & mu<1
  link <- "half.logit"
  structure(list(linkfun = half.logit, linkinv = half.logit.inv,
                 mu.eta = half.logit.deriv, validmu = half.logit.indicator,
                 valideta = half.logit.inv.indicator, name = link),
            class = "link-glm")
}

binomial(halflogit())

Family: binomial
Link function: half.logit
#

I based this off the help for the family() function.

So I try to call glmmPQL (based on other R-help posts, this is the 
easiest to use?)

#
glmmPQL(response ~ condition, random = ~ 1|subject, family = 
binomial(halflogit()), data = dat)

Error in if (!(validmu(mu) && valideta(eta))) stop("cannot find valid 
starting values: please specify some") :
 missing value where TRUE/FALSE needed
In addition: Warning message:
NaNs produced in: qlogis(p, location, scale, lower.tail, log.p)

#

It looks like I've misdefined something and it is going outside the 
specified domains for the functions. I can't find any reference to 
starting values in the help for glmmPQL() or lme().
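
For reference, a sketch of where the NaN may come from, shown with plain glm() and no random effect (so it says nothing about how glmmPQL handles starting values). For 0/1 responses the binomial family's default start is (y + 0.5)/2, i.e. 0.25 or 0.75, and 0.25 lies outside (0.5, 1), so qlogis(2*0.25 - 1) is NaN. The version below makes validmu return a single TRUE/FALSE and passes a valid `mustart`; the simulated data and the parameter values are made up for illustration.

```r
# Sketch only: a half-logit link whose validmu() returns a single TRUE/FALSE
halflogit <- function() {
  structure(list(
    linkfun  = function(mu) qlogis(2 * mu - 1),
    linkinv  = function(eta) 0.5 + 0.5 * plogis(eta),
    mu.eta   = function(eta) 0.5 * dlogis(eta),
    valideta = function(eta) TRUE,
    validmu  = function(mu) all(is.finite(mu)) && all(mu > 0.5 & mu < 1),
    name     = "half.logit"
  ), class = "link-glm")
}

# Simulated data (illustration only): two conditions, no subject effect
set.seed(1)
condition <- gl(2, 200)
p <- 0.5 + 0.5 * plogis(c(-0.5, 1)[condition])
response <- rbinom(length(p), 1, p)

# Supply starting means inside (0.5, 1) instead of the default 0.25/0.75
fit <- glm(response ~ condition, family = binomial(halflogit()),
           mustart = rep(0.75, length(response)))
coef(fit)
```

Note that if an observed proportion falls below 0.5 the estimate on the linear-predictor scale heads towards minus infinity, so this link can still fail on real data.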

  If anyone has any working code where they've done a user defined link 
function, it would be greatly appreciated.


Thanks,
Richard



Re: [R] write.table

2007-08-10 Thread Weiwei Shi
I had not read the CSV section of ?write.table in detail.

Thanks.

On 8/10/07, Yinghai Deng [EMAIL PROTECTED] wrote:
 write.table(mydata.frame, "mydata", col.names=NA, quote=F, sep="\t") will
 solve the problem.
 Deng
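
A small round trip showing what `col.names=NA` does, using a made-up data frame and a temporary file (the names `df`, `out` and `back` are only for this sketch):

```r
df <- data.frame(a = 1:3, b = c(2.5, 3.5, 4.5),
                 row.names = c("r1", "r2", "r3"))

out <- tempfile(fileext = ".txt")

# col.names = NA writes an empty header cell above the row names,
# so the header lines up with the data columns
write.table(df, out, col.names = NA, quote = FALSE, sep = "\t")
readLines(out)

# Reading it back: the first column holds the row names
back <- read.table(out, header = TRUE, sep = "\t", row.names = 1)
```

write.csv() uses the same header convention by default when row names are written.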
 -Original Message-
 From: [EMAIL PROTECTED]
 [mailto:[EMAIL PROTECTED] Behalf Of Weiwei Shi
 Sent: August 10, 2007 12:41 PM
 To: r-help@stat.math.ethz.ch
 Subject: [R] write.table


 Hi,

 I always run into this question when I try to write a data.frame with
 row.names and col.names. I have to re-make the data frame so that its
 first column holds the rownames and set row.names=F so that the
 colnames line up correctly.

 Is there a way or option in write.table to automatically do that?

 thanks.

 --
 Weiwei Shi, Ph.D
 Research Scientist
 GeneGO, Inc.

 Did you always know?
 No, I did not. But I believed...
 ---Matrix III





-- 
Weiwei Shi, Ph.D
Research Scientist
GeneGO, Inc.

Did you always know?
No, I did not. But I believed...
---Matrix III



Re: [R] kde2d error message

2007-08-10 Thread Prof Brian Ripley
If X or Y contains missing values, _you_ supplied missing values as the 
'lims' argument and it will be those missing values that are reported.

I do not see how you expect to be able to do density estimation with 
missing values: they are unknown and so no part of the answer is known. If 
you are prepared to omit them, you can do so but my software (if this is 
indeed kde2d from package MASS, uncredited) does not make such arbitrary 
choices for you.
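
If omitting the incomplete pairs is acceptable, that choice can be made explicitly before the call. A minimal sketch with simulated data (the vectors and the positions of the missing values are made up):

```r
library(MASS)

set.seed(42)
X <- rnorm(200)
Y <- X + rnorm(200)
X[c(3, 50)] <- NA        # made-up missing values for illustration
Y[10] <- NA

# range(X) is NA NA here, and those NAs are what end up in 'lims'
range(X)

# Explicitly keep only the complete (X, Y) pairs
ok <- complete.cases(X, Y)
fit <- kde2d(X[ok], Y[ok], n = 100,
             lims = c(range(X[ok]), range(Y[ok])))
dim(fit$z)
```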

On Fri, 10 Aug 2007, Jennifer Dillon wrote:

 Hello!

 I am trying to do a smooth with the kde2d function,

That is not what the only kde2d function I know of does.

 and I'm getting an error message about NAs.  Does anyone have any 
 suggestions?  Does this function not do well with NAs in general?

 fit <- kde2d(X, Y, n=100,lims=c(range(X),range(Y)))

 Error in if (from == to || length.out < 2) by <- 1 :
missing value where TRUE/FALSE needed


 Thanks in advance!!

 Jen

   [[alternative HTML version deleted]]

 __
 R-help@stat.math.ethz.ch mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.

PLEASE do as we ask.

-- 
Brian D. Ripley,  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel:  +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UK  Fax:  +44 1865 272595



[R] write.table

2007-08-10 Thread Weiwei Shi
Hi,

I always run into this question when I try to write a data.frame with
row.names and col.names. I have to re-make the data frame so that its
first column holds the rownames and set row.names=F so that the
colnames line up correctly.

Is there a way or option in write.table to automatically do that?

thanks.

-- 
Weiwei Shi, Ph.D
Research Scientist
GeneGO, Inc.

Did you always know?
No, I did not. But I believed...
---Matrix III



Re: [R] Help using gPath

2007-08-10 Thread Emilio Gagliardi
haha Paul,


It's important not only to post code, but also to make sure that other
 people can run it (i.e., include real data or have the code generate
 data or use one of R's predefined data sets).


Oh, I hadn't thought of using the predefined datasets, thats a good idea!

Also, isn't this next time ? :)


By next time I meant, when I ask a question in the future, I didn't think
you'd respond!

So here is some code!

library(reshape)
library(ggplot2)

theme_t <- list(grid.fill="white",grid.colour="lightgrey",background.colour=
"black",axis.colour="dimgrey")
ggtheme(theme_t)

grp <-
c(2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3)
time <-
c(1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2)
cc <- c(0.7271,0.7563,0.6979,0.8208,0.7521,0.7875,0.7563,0.7771,0.8208,
0.7938,0.8083,0.7188,0.7521,0.7854,0.7979,0.7583,0.7646,0.6938,0.6813,0.7708
,0.7375,0.8104,0.8104,0.7792,0.7833,0.8083,0.8021,0.7313,0.7958,0.7021,
0.8167,0.8167,0.7583,0.7167,0.6563,0.6896,0.7333,0.8208,0.7396,0.8063,0.7083
,0.6708,0.7292,0.7646,0.7667,0.775,0.8021,0.8125,0.7646,0.6917,0.7458,0.7833
,0.7396,0.7229,0.7708,0.7729,0.8083,0.7771,0.6854,0.8417,0.7667,0.7063,0.75,
0.7813,0.8271,0.7896,0.7979,0.625,0.7938,0.7583,0.7396,0.7583,0.7938,0.7333,
0.7875,0.8146)

data <- as.data.frame(cbind(time,grp,cc))
data$grp <- factor(data$grp,labels=c("Group A","Group B"))
data$time <- factor(data$time,labels=c("Pre-test","Post-test"))
boxplot <- qplot(grp, cc, data=data, geom="boxplot",
orientation="horizontal", ylim=c(0.5,1), main="Hello World!",
xlab="Label X", ylab="Label Y", facets=.~time, colour="red", size=2)
boxplot + geom_jitter(aes(colour="steelblue")) + scale_colour_identity() +
scale_size_identity()
grid.gedit("ylabel", gp=gpar(fontsize=16))


 There's a book that provides a full explanation and the (basic) grid
 chapter is online (see
 http://www.stat.auckland.ac.nz/~paul/RGraphics/rgraphics.html)


Awesome, I'll check that out.

Yep, the facilities for investigating the viewport and grob tree are
 basically inadequate.  Based on some work Hadley did for ggplot, the
 development version of R has a slightly better tool called grid.ls()
 that can show how the grob tree and the viewport tree intertwine.  That
 would allow you to see which viewport each grob was drawn in, which
 would help you, for example, to know which viewport you had to go to to
 replace a rectangle you want to remove.
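
A tiny grid-only sketch of listing the grob tree and editing one grob by its path. It assumes an R version that already has grid.ls(); the scene and the grob names ("panel", "border", "ylabel") are made up and have nothing to do with what ggplot actually draws.

```r
library(grid)

pdf(NULL)                      # off-screen device so the sketch runs anywhere
grid.newpage()

# A hand-built scene; the grob names are invented for this example
scene <- gTree(name = "panel",
               children = gList(rectGrob(name = "border"),
                                textGrob("Label Y", name = "ylabel")))
grid.draw(scene)

grid.ls()                      # prints the grob tree

# Address the text grob by its path and change its font size
grid.gedit(gPath("panel", "ylabel"), gp = gpar(fontsize = 16))
edited <- grid.get(gPath("panel", "ylabel"))
edited$gp$fontsize

dev.off()
```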


okie dokie, I'm ready to be amazed! hehe.
emilio



Re: [R] smoothing function for proportions

2007-08-10 Thread roger koenker
It is not entirely clear what you are using for y values in  
smooth.spline,
but it would appear that it is just the point estimates.  I would  
suggest
using instead -- at each x value -- a few equally spaced quantiles of
the estimated proportions.  Implicitly, smooth.spline expects to be  
fitting
a mean curve to data that has constant variance, so you might also
consider reweighting to approximate this, as well.
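
One possible reading of the reweighting suggestion, sketched on simulated proportions: weight each point by an approximate inverse variance n / (p(1 - p)), so that estimates based on few trials pull on the curve less. The data, the 0.5 adjustment used to keep the variances away from zero, and all object names are assumptions made for this illustration.

```r
set.seed(1)
x <- 1:40
n <- round(seq(200, 5, length.out = 40))   # fewer trials at large x
p.true <- plogis((x - 20) / 8)
k <- rbinom(length(x), n, p.true)
p.hat <- k / n

# Approximate inverse-variance weights; the adjusted proportion avoids
# zero variance when p.hat is exactly 0 or 1
p.adj <- (k + 0.5) / (n + 1)
w <- n / (p.adj * (1 - p.adj))

fit.w <- smooth.spline(x, p.hat, w = w)    # weighted fit
fit.u <- smooth.spline(x, p.hat)           # unweighted, for comparison

pred.w <- predict(fit.w, x)$y
pred.u <- predict(fit.u, x)$y
```

This does not force the curve to stay inside the confidence bounds; it only lets the precise points dominate the fit.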


url:    www.econ.uiuc.edu/~roger        Roger Koenker
email   [EMAIL PROTECTED]               Department of Economics
vox:    217-333-4558                    University of Illinois
fax:    217-244-6678                    Champaign, IL 61820


On Aug 10, 2007, at 10:23 AM, Rose Hoberman wrote:

 Sorry, forgot to attach the graph.

 On 8/10/07, Rose Hoberman [EMAIL PROTECTED] wrote:
 I am looking for a function that can fit a smooth function to a  
 vector
 of estimated proportions, such that the smoothed value is within
 specified confidence bounds of each proportion.  In other words,  
 given
 a small number of trials and large confidence intervals, I would
 prefer the function to vary smoothly, but given a large number of
 trials and small confidence intervals, I would prefer the function to
 lie within the confidence intervals, even if it is not smooth.

 I have attached a postscript file illustrating a data set I would  
 like
 to smooth.  As the figure shows, for large values of x, I have few
 data points, and so the ML estimate of the proportion varies widely,
 and the confidence intervals are very large.  When I use the
 smooth.spline function with a large value of spar (the red line), the
 function is not as smooth as desired for large values of x.  When I
 use a smaller value of spar (the green line), the function fails to
 stay within the confidence bounds of the proportions.   Is there a
 smoothing function for which I can specify upper and lower limits for
 the y value for specific values of x?

 Thanks for any suggestions,

 Rose

 smoothProportions.ps


Re: [R] a question on lda{MASS}

2007-08-10 Thread Weiwei Shi
hi,

maybe I should re-phrase my question a bit:

is there a way to get explicit formulae like Y ~ sum of CiXi from the
model build by lda{MASS} to calculate $x (value) ?

I assume scaling is the coeff and Xi is from test data and Y is $x
called LD1. But I want to confirm this.
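
For what it is worth, a check of this on the built-in iris data rather than my own: the discriminant scores appear to be the data, centred at the prior-weighted average of the group means, times `scaling`. So it is a weighted sum of the centred Xi, not of the raw Xi. The names `ctr` and `manual` are just for this sketch.

```r
library(MASS)

m <- lda(Species ~ ., data = iris)
X <- as.matrix(iris[, 1:4])

# Centre at the prior-weighted average of the group means
ctr <- colSums(m$prior * m$means)

# Linear combination of the centred variables with the scaling coefficients
manual <- scale(X, center = ctr, scale = FALSE) %*% m$scaling

# Compare with the scores returned by predict()
max(abs(manual - predict(m, iris)$x))
```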

Thanks.

Weiwei

On 8/9/07, Weiwei Shi [EMAIL PROTECTED] wrote:
 hi,

 assume
 val is the test data while m is lda model value by using CV=F

 x = predict(m, val)

 val2 = val[, 1:(ncol(val)-1)] # the last column is class label

 # col is sample, row is variable

 then I am wondering if

 x$x == (apply(val2*m$scaling), 2, sum)

 i.e., the scaling (is it coeff vector?) times val data and sum is the
 discrimant result $x?

 Thanks.

 --
 Weiwei Shi, Ph.D
 Research Scientist
 GeneGO, Inc.

 Did you always know?
 No, I did not. But I believed...
 ---Matrix III



-- 
Weiwei Shi, Ph.D
Research Scientist
GeneGO, Inc.

Did you always know?
No, I did not. But I believed...
---Matrix III



Re: [R] smoothing function for proportions

2007-08-10 Thread Rose Hoberman
Sorry, forgot to attach the graph.

On 8/10/07, Rose Hoberman [EMAIL PROTECTED] wrote:
 I am looking for a function that can fit a smooth function to a vector
 of estimated proportions, such that the smoothed value is within
 specified confidence bounds of each proportion.  In other words, given
 a small number of trials and large confidence intervals, I would
 prefer the function to vary smoothly, but given a large number of
 trials and small confidence intervals, I would prefer the function to
 lie within the confidence intervals, even if it is not smooth.

 I have attached a postscript file illustrating a data set I would like
 to smooth.  As the figure shows, for large values of x, I have few
 data points, and so the ML estimate of the proportion varies widely,
 and the confidence intervals are very large.  When I use the
 smooth.spline function with a large value of spar (the red line), the
 function is not as smooth as desired for large values of x.  When I
 use a smaller value of spar (the green line), the function fails to
 stay within the confidence bounds of the proportions.   Is there a
 smoothing function for which I can specify upper and lower limits for
 the y value for specific values of x?

 Thanks for any suggestions,

 Rose



smoothProportions.ps
Description: PostScript document


Re: [R] Help wit matrices

2007-08-10 Thread jim holtman
Is this what you want:

> x <- matrix(runif(100), 10)
> round(x, 3)
       [,1]  [,2]  [,3]  [,4]  [,5]  [,6]  [,7]  [,8]  [,9] [,10]
 [1,] 0.268 0.961 0.262 0.347 0.306 0.762 0.524 0.062 0.028 0.226
 [2,] 0.219 0.100 0.165 0.131 0.578 0.933 0.317 0.109 0.527 0.131
 [3,] 0.517 0.763 0.322 0.374 0.910 0.471 0.278 0.382 0.880 0.982
 [4,] 0.269 0.948 0.510 0.631 0.143 0.604 0.788 0.169 0.373 0.327
 [5,] 0.181 0.819 0.924 0.390 0.415 0.485 0.702 0.299 0.048 0.507
 [6,] 0.519 0.308 0.511 0.690 0.211 0.109 0.165 0.192 0.139 0.681
 [7,] 0.563 0.650 0.258 0.689 0.429 0.248 0.064 0.257 0.321 0.099
 [8,] 0.129 0.953 0.046 0.555 0.133 0.499 0.755 0.181 0.155 0.119
 [9,] 0.256 0.954 0.418 0.430 0.460 0.373 0.620 0.477 0.132 0.050
[10,] 0.718 0.340 0.854 0.453 0.943 0.935 0.170 0.771 0.221 0.929
> ifelse(x > .5, 1, 0)
      [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
 [1,]    0    1    0    0    0    1    1    0    0     0
 [2,]    0    0    0    0    1    1    0    0    1     0
 [3,]    1    1    0    0    1    0    0    0    1     1
 [4,]    0    1    1    1    0    1    1    0    0     0
 [5,]    0    1    1    0    0    0    1    0    0     1
 [6,]    1    0    1    1    0    0    0    0    0     1
 [7,]    1    1    0    1    0    0    0    0    0     0
 [8,]    0    1    0    1    0    0    1    0    0     0
 [9,]    0    1    0    0    0    0    1    0    0     0
[10,]    1    0    1    0    1    1    0    1    0     1


On 8/10/07, Lanre Okusanya [EMAIL PROTECTED] wrote:
 Hello all,

 I am working with a 1000x1000 matrix, and I would like to return a
 1000x1000 matrix that tells me which value in the matrix is greater
 than a threshold value (1 or 0 indicator).
 i have tried
  mat2 <- as.matrix(as.numeric(mat1 > 0.25))
 but that returns a 1:10 matrix.
 I have also tried for loops, but they are grossly inefficient.

 THanks for all your help in advance.

 Lanre




-- 
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem you are trying to solve?



Re: [R] help with counting how many times each value occur in each column

2007-08-10 Thread Gabor Grothendieck
Try this where we have constructed the example to illustrate that
it does handle the case where not all values are in each column:

   mat <- matrix(rep(1:6, each = 4), 6)

   table(col(mat), mat)
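
The same idea on a small made-up matrix closer to the one in the question (values -100, -50 and 0; the names `dat` and `counts` are only for illustration):

```r
# 5 x 3 matrix, mostly -100, with a few other values
dat <- matrix(-100, nrow = 5, ncol = 3)
dat[1, 2:3] <- 0
dat[4, 3]   <- -50

# col(dat) gives each cell's column index, so cross-tabulating it with the
# values counts every value within every column
counts <- table(col(dat), dat)
counts
```

Each row of `counts` is one column of `dat`, and values that never occur in a column show up as 0.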

On 8/10/07, Tom Cohen [EMAIL PROTECTED] wrote:
 Dear list,
  I have the following dataset and want to know how many times each value 
 occur in each column.
   data
        [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
  [1,] -100 -100 -100    0    0    0    0    0    0  -100
  [2,] -100 -100 -100 -100 -100 -100 -100 -100 -100  -100
  [3,] -100 -100 -100 -100 -100 -100 -100 -100 -100  -100
  [4,] -100 -100 -100 -100 -100 -100 -100 -100 -100  -100
  [5,] -100 -100 -100 -100 -100 -100 -100 -100 -100   -50
  [6,] -100 -100 -100 -100 -100 -100 -100 -100 -100  -100
  [7,] -100 -100 -100 -100 -100 -100 -100 -100 -100  -100
  [8,] -100 -100 -100 -100 -100 -100 -100 -100 -100  -100
  [9,] -100 -100 -100 -100 -100 -100 -100 -100 -100  -100
 [10,] -100 -100 -100  -50 -100 -100 -100 -100 -100  -100
 [11,] -100 -100 -100 -100 -100 -100 -100 -100 -100  -100
 [12,] -100 -100 -100 -100 -100 -100 -100 -100 -100  -100
 [13,] -100 -100 -100 -100 -100 -100 -100 -100 -100  -100
 [14,] -100 -100 -100 -100 -100 -100 -100 -100 -100  -100
 [15,] -100 -100 -100 -100 -100 -100 -100 -100 -100  -100
 [16,] -100 -100 -100 -100 -100 -100 -100 -100 -100  -100
 [17,] -100 -100 -100 -100 -100 -100 -100 -100 -100  -100
 [18,] -100 -100 -100 -100 -100 -100 -100 -100 -100  -100
 [19,] -100 -100 -100    0    0    0    0    0    0  -100
 [20,] -100 -100 -100 -100 -100 -100 -100 -100 -100  -100
  The result matrix should look like
   -100 0 -50
 [1]   20
 [2]   20
 [3]   20
 [4]   17
 [5]   18
 [6]   18
 [7]   18  and so on
 [8]
 [9]
 [10]

 How can I do this in R ?
  Thanks alot for your help,
 Tom




Re: [R] Positioning text in top left corner of plot

2007-08-10 Thread Daniel Brewer
Jim Lemon wrote:
 Daniel Brewer wrote:
 Thanks for the replies, but I still cannot get what I want.  I do not
 want the label inside the plot area, but in the top left of the paper, I
 suppose in the margins.  When I try to use text to do this, it does not
 seem to plot it outside the plot area.  I have also tried to use mtext,
 but that does not really cut it, as I cannot get the label in the
 correct position.  Ideally, it would be best if I could use legend but
 have it outside the plot area.

 Any ideas?

 Hi Dan,
 
 Try this:
 
 plot(1:5)
 par(xpd=TRUE)
 text(0.5,5.5,"Outside")
 par(xpd=FALSE)
 
 Jim

Here is what I used in the end:
par(xpd=T)
text(-0.15*(par("usr")[2]-par("usr")[1]),par("usr")[4]+0.14*(par("usr")[4]-par("usr")[3]),labels[i],cex=1.5)
par(xpd=F)

And that worked a treat.
Thanks
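
The same positioning idea can be wrapped in a small helper so that it works on panels with different scales. This is only a sketch: `corner_label` and its `dx`/`dy` arguments are hypothetical names, it assumes linear axes, and unlike the call above it measures the horizontal offset from the left edge of the plot region rather than from zero.

```r
# Hypothetical helper: draw `label` just outside the top-left corner of the
# current plot region, using fractions of the axis ranges as offsets.
corner_label <- function(label, dx = 0.15, dy = 0.14, ...) {
  usr <- par("usr")                       # c(x1, x2, y1, y2) of the plot region
  x <- usr[1] - dx * (usr[2] - usr[1])    # left of the left edge
  y <- usr[4] + dy * (usr[4] - usr[3])    # above the top edge
  old <- par(xpd = NA)                    # allow drawing outside the plot region
  on.exit(par(old))                       # restore clipping afterwards
  text(x, y, label, ...)
  invisible(c(x = x, y = y))
}

par(mfrow = c(2, 3))
for (i in 1:6) {
  plot((1:10) * 10^i)                     # each panel has a different scale
  corner_label(paste0(letters[i], ")"), cex = 1.5)
}
```

Because the offsets are fractions of `par("usr")`, the label lands in the same place relative to each panel regardless of the axis ranges.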

Dan

-- 
**
Daniel Brewer, Ph.D.
Institute of Cancer Research
Email: [EMAIL PROTECTED]
**

The Institute of Cancer Research: Royal Cancer Hospital, a charitable Company 
Limited by Guarantee, Registered in England under Company No. 534147 with its 
Registered Office at 123 Old Brompton Road, London SW7 3RP.



Re: [R] Plot legend in margin

2007-08-10 Thread Uwe Ligges


Daniel Brewer wrote:
 Hi all,
 Another plotting question I am afraid.  Is there any way of putting a
 legend for a plot in a margin rather than within the figure?  I am
 trying to plot a 3x2 plot and I want to have:
 1) One key along the bottom for all the plots
 2) A label (a,b,c) for each plot (see previous emails)
 
 Are there any websites etc. that explain this sort of thing?

Please read the posting guide.

After that, type:
   RSiteSearch("legend margin")

Currently, the fourth entry shows a solution:
http://finzi.psych.upenn.edu/R/Rhelp02a/archive/67979.html

Uwe Ligges


 Dan




Re: [R] Plot legend in margin

2007-08-10 Thread Daniel Brewer
Thanks.  That got me onto the right track.  Because it is a multiplot
and I wanted it along the bottom, I found that I had to use par(xpd=NA)
and then position it relative to the last of the multiplots.  After a
bit of trial and error I got there.

Thanks

Lauri Nikkinen wrote:
 Very simple example:
  
 opar <- par(mar = c(10, 4, 4, 4))
 plot(1:10)
 lines(1:10)
 par(xpd=TRUE)
 legend(4,-1.5,lty=1, col="black", legend="straight line")
 par(opar)
  
 -Lauri
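
For a 3x2 layout with one shared key along the bottom, one possible sketch is shown here. Note that it swaps in a different technique from the trial-and-error positioning described above: it reserves an outer margin and then overlays an empty full-device plot to hold the legend. The series and legend labels are made up for illustration.

```r
# Reserve 4 lines of outer margin at the bottom for the shared legend.
op <- par(mfrow = c(3, 2), oma = c(4, 0, 0, 0), mar = c(4, 4, 2, 1))
for (i in 1:6) {
  plot(1:10, type = "l", col = "black")
  lines(10:1, col = "red", lty = 2)
}

# Overlay an invisible plot that spans the whole device, then draw the
# legend at its bottom edge, i.e. inside the reserved outer margin.
par(fig = c(0, 1, 0, 1), oma = c(0, 0, 0, 0), mar = c(0, 0, 0, 0), new = TRUE)
plot(0, 0, type = "n", bty = "n", xaxt = "n", yaxt = "n", xlab = "", ylab = "")
lg <- legend("bottom", legend = c("rising", "falling"),
             col = c("black", "red"), lty = 1:2,
             horiz = TRUE, bty = "n", xpd = NA)  # returns the box geometry invisibly
par(op)
```

The advantage of the overlay is that the legend position does not depend on the user coordinates of the last panel.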

-- 
**
Daniel Brewer, Ph.D.
Institute of Cancer Research
Email: [EMAIL PROTECTED]
**



[R] Plot legend in margin

2007-08-10 Thread Daniel Brewer
Hi all,
Another plotting question I am afraid.  Is there any way of putting a
legend for a plot in a margin rather than within the figure?  I am
trying to plot a 3x2 plot and I want to have:
1) One key along the bottom for all the plots
2) A label (a,b,c) for each plot (see previous emails)

Are there any websites etc. that explain this sort of thing?

Dan

-- 
**
Daniel Brewer, Ph.D.

Institute of Cancer Research
Email: [EMAIL PROTECTED]
**



[R] multivariate lme or lmer?

2007-08-10 Thread François Lefevre
How can we get variance/covariance components in a linear model with 
random effects when the response is multivariate?
e.g. variance components estimates are obtained through lme or lmer in 
the univariate case but these functions do not seem to extend to the 
multivariate case. I'd like to estimate covariance components within or 
between levels of factors in a general case.
Sorry for this basic question and thank you for the help.
Francois



Re: [R] Subject: Re: how to include bar values in a barplot?

2007-08-10 Thread Jim Lemon
Gabor Grothendieck wrote:
 You could put the numbers inside the bars in which
 case it would not add to the height of the bar:
 
 x <- 1:5
 names(x) <- letters[1:5]
 bp <- barplot(x)
 text(bp, x - .02 * diff(par("usr")[3:4]), x)
 
Indeed, the boxed.labels function (in the plotrix package) makes this pretty easy.
boxed.labels(bp,x-0.2*diff(par("usr")[3:4]),x)

gives you the labels in a little white rectangle so that none are invisible.

I also greatly enjoyed Ted's rebuttal of the "Bar charts are evil and 
must be banned" argument. If bar charts are appropriate for the 
audience, give 'em bar charts. One great way to turn off your customers 
is to tell them what they can and can't do with your product.

Jim



[R] Combining two ANOVA outputs of different lengths

2007-08-10 Thread Christoph Scherber
Dear R users,

I have been trying to combine two anova outputs into one single table 
(for later publication). The outputs are of different length, and share 
only some common explanatory variables.

Using merge() or melt() (from the reshape package) did not work out.

Here are the model outputs and what I would like to have:

anova(model1)
            numDF denDF  F-value p-value
(Intercept)     1    74 0.063446  0.8018
days            1    74 6.613997  0.0121
logdiv          1    74 1.587983  0.2116
leg             1    74 4.425843  0.0388

anova(model2)
             numDF denDF   F-value p-value
(Intercept)      1    73 165.94569  <.0001
funcgr           1    73   7.91999  0.0063
grass            1    73  42.16909  <.0001
leg              1    73   4.72108  0.0330
funcgr:grass     1    73   8.49068  0.0047

#merge(anova(model1),anova(model2),...)

              F-value 1  p-value 1  F-value 2  p-value 2
(Intercept)    0.063446     0.8018  165.94569     <.0001
days           6.613997     0.0121         NA         NA
logdiv         1.587983     0.2116         NA         NA
leg            4.425843     0.0388    4.72108      0.033
funcgr               NA         NA    7.91999     0.0063
grass                NA         NA   42.16909     <.0001
funcgr:grass         NA         NA    8.49068     0.0047


I would be glad if someone would have an idea of how to do this in 
principle.

I am using R 2.5.1 on Windows XP.

Thanks very much in advance!

Best wishes
Christoph









-- 
Dr. Christoph Scherber
DNPW, Agroecology
University of Goettingen
Waldweg 26
D-37073 Goettingen
Germany

phone +49(0)551 39 8807
fax   +49(0)551 39 8806



Re: [R] Seasonality

2007-08-10 Thread Pfaff, Bernhard Dr.
Hello Alberto, hello Felix,

aside from monthplot() and stl(), there is the possibility of using Census 
X-12-ARIMA. The program can be downloaded from:

http://www.census.gov/srd/www/x12a/

It should be mentioned that this is *not* a pure R solution, but one can set up 
the relevant scripts and output files, call the program from R, and read 
the relevant numbers back into R.

Best,
Bernhard 

?monthplot

?stl
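
A minimal sketch of the two pure-R suggestions, using the `nottem` monthly temperature series that ships with R as a stand-in for the poster's data. The variance ratio at the end is only an informal indicator chosen for this illustration, not a formal test of seasonality.

```r
# nottem: monthly average air temperatures at Nottingham, 1920-1939.
x <- nottem

monthplot(x)                           # one sub-series per calendar month

# Decompose into seasonal, trend and remainder components; "periodic"
# forces the seasonal pattern to be identical in every year.
fit <- stl(x, s.window = "periodic")
plot(fit)

seasonal <- fit$time.series[, "seasonal"]
round(seasonal[1:12], 1)               # estimated effect for each month

# Informal check: share of the series variance carried by the seasonal part.
var(seasonal) / var(x)
```

A ratio near 0 would suggest little seasonal variation, while a ratio near 1 means the monthly pattern dominates the series.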


On 8/10/07, Alberto Monteiro [EMAIL PROTECTED] wrote:
 I have a time series x = f(t), where t is taken for each
 month. What is the best function to detect if _x_ has a seasonal
 variation? If there is such seasonal effect, what is the
 best function to estimate it?

 Function arima has a seasonal parameter, but I guess this is
 too complex to be useful.

 Alberto Monteiro




-- 
Felix Andrews / 安福立
PhD candidate
Integrated Catchment Assessment and Management Centre
The Fenner School of Environment and Society
The Australian National University (Building 48A), ACT 0200
Beijing Bag, Locked Bag 40, Kingston ACT 2604
http://www.neurofractal.org/felix/
voice:+86_1051404394 (in China)
mobile:+86_13522529265 (in China)
mobile:+61_410400963 (in Australia)
xmpp:[EMAIL PROTECTED]
3358 543D AAC6 22C2 D336  80D9 360B 72DD 3E4C F5D8



[R] ordering a data.frame by average rank of multiple columns

2007-08-10 Thread Tom.O

Hi 

I have run into a problem and I wonder if anyone has a smart way of doing
this.

For example, I have this data frame for 5 different test groups:

Res1 <- c(1,5,4,-0.5,3)
Res2 <- c(-1,8,2,0,3)
Mean <- c(0.5,1,1.5,-.5,2)
MyFrame <- data.frame(Res1,Res2,Mean,row.names=c("G1","G2","G3","G4","G5"))

where the first two columns are the results of two different tests, and the
third column is the mean of the group.

I want to order this data.frame by the combined rank of Res1 & Res2, but
where weights are assigned to the importance of each column. Let's assume
that Res1 is twice as important and lower values rank better.

MyRanks <- data.frame(Rank1=rank(MyFrame[,"Res1"]),Rank2=rank(MyFrame[,"Res2"]),CombR=2*rank(MyFrame[,"Res1"])+rank(MyFrame[,"Res2"]),row.names=c("G1","G2","G3","G4","G5"))

   Rank1 Rank2 CombR
G1     2     1     5
G2     5     5    15
G3     4     3    11
G4     1     2     4
G5     3     4    10


The rank of the combined score is 2,5,4,1,3, but to sort MyFrame
in that order I need to supply the vector of positions c(4,1,5,3,2). Does
anyone have a smart way of converting ranks to positions?

Tom
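
One way to get from ranks to positions, sketched on the data frame from this message, is `order()`: it returns the row positions that would sort its argument, which is what row indexing needs. The variable name `pos` is just for illustration.

```r
Res1 <- c(1, 5, 4, -0.5, 3)
Res2 <- c(-1, 8, 2, 0, 3)
Mean <- c(0.5, 1, 1.5, -0.5, 2)
MyFrame <- data.frame(Res1, Res2, Mean,
                      row.names = c("G1", "G2", "G3", "G4", "G5"))

# Weighted combined rank: Res1 counts twice, lower values rank better.
CombR <- 2 * rank(MyFrame$Res1) + rank(MyFrame$Res2)   # 5 15 11 4 10

# order() gives the positions that sort CombR ascending.
pos <- order(CombR)                                    # 4 1 5 3 2
MyFrame[pos, ]
```

Applying `order()` to a vector of ranks gives the same positions, so an intermediate `rank(CombR)` step is not needed.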




Re: [R] Memory Experimentation: Rule of Thumb = 10-15 Times the Memory

2007-08-10 Thread Prof Brian Ripley
I don't understand why one would run a 64-bit version of R on a 2GB 
server, especially if one were worried about object size.  You can run 
32-bit versions of R on x86_64 Linux (see the R-admin manual for a 
comprehensive discussion), and most other 64-bit OSes default to 32-bit 
executables.

Since most OSes limit 32-bit executables to around 3GB of address space, 
there starts to become a case for 64-bit executables at 4GB RAM but not 
much case at 2GB.

It was my intention when providing the infrastructure for it that Linux 
binary distributions on x86_64 would provide both 32-bit and 64-bit 
executables, but that has not happened.  It would be possible to install 
ix86 builds on x86_64 if -m32 was part of the ix86 compiler specification 
and the dependency checks would notice they needed 32-bit libraries. 
(I've had trouble with the latter on FC5: an X11 update removed all my 
32-bit X11 RPMs.)

On Fri, 10 Aug 2007, Michael Cassin wrote:

 Thanks for all the comments,

 The artificial dataset is as representative of my 440MB file as I could 
 design.

 I did my best to reduce the complexity of my problem to minimal
 reproducible code as suggested in the posting guidelines.  Having
 searched the archives, I was happy to find that the topic had been
 covered, where Prof Ripley suggested that the I/O manuals gave some
 advice.  However, I was unable to get anywhere with the I/O manuals
 advice.

 I spent 6 hours preparing my post to R-help. Sorry not to have read
 the 'R-Internals' manual.  I just wanted to know if I could use scan()
 more efficiently.

 My hurdle seems nothing to do with efficiently calling scan() .  I
 suspect the same is true for the originator of this memory experiment
 thread. It is the overhead of storing short strings, as Charles
 identified and Brian explained.  I appreciate the investigation and
 clarification you both have made.

 56B overhead for a 2 character string seems extreme to me, but I'm not
 complaining. I really like R, and being free, accept that
 it-is-what-it-is.

Well, there are only about 5 2-char strings in an 8-bit locale, so 
this does seem a case for using factors (as has been pointed out several 
times).

And BTW, it is not 56B overhead, but 56B total for up to 7 chars.
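
To see what the factor suggestion buys, here is a small sketch comparing the storage of a character vector with few distinct values against the equivalent factor. The figures it prints depend on the R version and on whether the build is 32-bit or 64-bit; the comparison assumes a 64-bit build, and builds that share identical strings will show a smaller gap than the one discussed in this thread.

```r
# 26 distinct two-letter strings repeated to one million elements.
big2 <- rep(letters, length.out = 1e6)
big3 <- paste(big2, big2, sep = "")

f3 <- factor(big3)   # integer codes plus a 26-element levels vector

sizes <- c(character = as.numeric(object.size(big3)),
           factor    = as.numeric(object.size(f3)))
round(sizes / 1e6, 2)   # size in MB; exact figures depend on the R build
```

The factor keeps one copy of each distinct string in its levels and stores only integer codes for the million elements.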

 In my case pre-processing is not an option, it is not a one off
 problem with a particular file. In my application, R is run in batch
 mode as part of a tool chain for arbitrary csv files.  Having found
 cases where memory usage was as high as 20x file size, and allowing
 for a copy of the loaded dataset, I'll just need to document that
 it is possible that files as small as 1/40th of system memory may
 consume it all.  That rules out some important datasets (US Census, UK
 Office of National Statistics files, etc) for 2GB servers.

 Regards, Mike


 On 8/9/07, Prof Brian Ripley [EMAIL PROTECTED] wrote:
 On Thu, 9 Aug 2007, Charles C. Berry wrote:

 On Thu, 9 Aug 2007, Michael Cassin wrote:

 I really appreciate the advice and this database solution will be useful to
 me for other problems, but in this case I  need to address the specific
 problem of scan and read.* using so much memory.

 Is this expected behaviour?

 Yes, and documented in the 'R Internals' manual.  That is basic reading
 for people wishing to comment on efficiency issues in R.

 Can the memory usage be explained, and can it be
 made more efficient?  For what it's worth, I'd be glad to try to help if 
 the
 code for scan is considered to be worth reviewing.

 Mike,

 This does not seem to be an issue with scan() per se.

 Notice the difference in size of big2, big3, and bigThree here:

 big2 <- rep(letters,length=1e6)
 object.size(big2)/1e6
 [1] 4.000856
 big3 <- paste(big2,big2,sep='')
 object.size(big3)/1e6
 [1] 36.2

 On a 32-bit computer every R object has an overhead of 24 or 28 bytes.
 Character strings are R objects, but in some functions such as rep (and
 scan for up to 10,000 distinct strings) the objects can be shared.  More
 string objects will be shared in 2.6.0 (but factors are designed to be
 efficient at storing character vectors with few values).

 On a 64-bit computer the overhead is usually double.  So I would expect
 just over 56 bytes/string for distinct short strings (and that is what
 big3 gives).

 But 56Mb is really not very much (tiny on a 64-bit computer), and 1
 million items is a lot.

 [...]


 --
 Brian D. Ripley,  [EMAIL PROTECTED]
 Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
 University of Oxford, Tel:  +44 1865 272861 (self)
 1 South Parks Road, +44 1865 272866 (PA)
 Oxford OX1 3TG, UKFax:  +44 1865 272595



-- 
Brian D. Ripley,  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel:  +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UK

Re: [R] how to include bar values in a barplot?

2007-08-10 Thread Donatas G.
Quoting Greg Snow [EMAIL PROTECTED]:

 My original intent was to get the original posters out of the mode of
 thinking they want to match what the spreadsheet does and into thinking
 about what message they are trying to get across.  To get them (and
 possibly others) thinking I made the statements a bit more bold than my
 actual position (I did include a couple of qualifiers).

As an original poster (and a brand new user of R), I would like to comment on
the educational experience I have just received. ;)

The discussion was interesting and enlightening, and gives some good ideas about
the ways (tables, graphs, graphs with numbers etc.) to get the data across to
one's audience. I see some of you guys do feel quite strongly
about it, which is fine by me. I do not. I usually care about barplot aesthetics
and informativeness more than about visual simplicity. That may change in time :)

I see R's graphical capabilities are huge but hard to access at times - that is
when a spreadsheet seems preferable. For example, as a user of Linux I still
cannot figure out why the fonts (and graphics in general) look much uglier
in R on Linux than they do in R on Windows - no smoothing, sub-pixel hinting,
or anything like that. That is what my next free-time homework on R will be about
:)

Sincerely

Donatas Glodenis
PhD candidate
Department of Sociology of the Faculty of Philosophy
Vilnius University
Lithuania



Re: [R] need help to manipulate function and time interval

2007-08-10 Thread Henrique Dallazuanna
Hi,

Try with:

if(time[j] = 18:00:00   23:59:59)
...
...

-- 
Henrique Dallazuanna
Curitiba-Paraná-Brasil
25° 25' 40 S 49° 16' 22 O

On 10/08/07, KOITA Lassana - STAC/ACE [EMAIL PROTECTED]
wrote:

 Hi R-users,

 I have to define a noise level function L and its energy W at various
 moments of the day by:

 if time is between 18:00:00 and 23:59:59 then  L[j] <- L[j]+5 and W <-
 10^((L+5)/10)

 if time is between 22:00:00 and 05:59:59 ==> L <- L+10  and W <-
 10^((L+10)/10)
 else
 L=L and W = W

 Could someone please help me implement this function? My proposed
 code follows, but my main problem is handling the time
 interval.

 Best regards

 ###
 myfunc <- function(mytab, Time, Level)
 {
 vect <- rep(0, length(mytab))
 for(i in 1:length(vect))
{
   for(j in 1:length(Time))

 if(time[j] is between 18:00:00 and 23:59:59)

 L[i] <- L[j]+5

vect[i] <- 10^((L[i])/10

 if (time[j] is between 22:00:00 and 05:59:59)

 L[i] <- L[j]+10

  vect[i] <- 10^((L[i])/10

 else

L[i] = L[j]

   vect[i] <- 10^((L[i])/10
  }
 }

 ###

 Lassana KOITA
 Chargé d'Etudes de Sécurité Aéroportuaire et d'Analyse Statistique  /
 Project Engineer Airport Safety Studies  Statistical analysis
 Service Technique de l'Aviation Civile (STAC) / Civil Aviation Technical
 Department
 Direction Générale de l'Aviation Civile (DGAC) / French Civil Aviation
 Headquarters
 Tel: 01 49 56 80 60
 Fax: 01 49 56 82 14
 E-mail: [EMAIL PROTECTED]
 http://www.stac.aviation-civile.gouv.fr/


Re: [R] Cleaning up the memory

2007-08-10 Thread Prof Brian Ripley
On Fri, 10 Aug 2007, Monica Pisica wrote:


 Hi,

 I have 4 huge tables on which I want to do a PCA analysis and a k-means 
 clustering. If I run each table individually I have no problems, but if 
 I want to run it in a for loop I exceed the memory allocation after the 
 second table, even if I save the results as a csv table and I clean up 
 all the big objects with the rm command. To me it seems that even if I don't 
 have the objects anymore, the memory these objects used to occupy is not 
 cleared. Is there any way to clear up the memory as well? I don't want 
 to close R and start it up again. Also I am running R under Windows.

See ?gc, which does the clearing.

However, unless you study the memory allocation in detail (which you 
cannot do from R code), you don't actually know that this is the problem. 
More likely is that you have fragmentation of your 32-bit address space: 
see ?Memory-limits.

Without any idea what memory you have and what 'huge' means, we can only 
make wild guesses.  It might be worth raising the memory limit (the 
--max-mem-size flag).
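
For what it is worth, a loop of the kind described might be organised as in the sketch below: the heavy work happens inside a function so that its intermediates go out of scope, only a small summary is kept, and `gc()` is called after each table. The function name `process_table`, the simulated tables and the choice of three clusters are all assumptions for illustration; as noted above, this will not help if the real problem is address-space fragmentation.

```r
# Hypothetical per-table analysis: PCA followed by k-means on the first two
# components. Only a small summary leaves the function.
process_table <- function(dat, k = 3) {
  pc <- prcomp(dat, scale. = TRUE)
  km <- kmeans(pc$x[, 1:2], centers = k, iter.max = 50)
  table(km$cluster)
}

set.seed(42)
results <- vector("list", 4)
for (i in seq_along(results)) {
  dat <- matrix(rnorm(2000 * 5), ncol = 5)   # stand-in for reading one big table
  results[[i]] <- process_table(dat)
  rm(dat)                                    # drop the reference to the big object
  gc()                                       # ask R to collect the freed memory
}
results
```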


 thanks,

 Monica


-- 
Brian D. Ripley,  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel:  +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UKFax:  +44 1865 272595



[R] [Fwd: Re: How to apply functions over rows of multiple matrices]

2007-08-10 Thread Johannes Hüsing
[Apologies to Gabor, who I sent a personal copy of the reply
erroneously instead of posting to List directly]

[...]
  Perhaps what you really intend is to
 take the average over those elements in each row of the first matrix
which correspond to 1's in the second in the corresponding
 row of the second.  In that case its just:

 rowSums(newtest * goldstandard) / rowSums(goldstandard)


Thank you for clearing my thoughts about the particular example.
My question was a bit more general though, as I have different
functions which are applied row-wise to multiple matrices. An
example that sets all values of a row of matrix A to NA after the
first occurrence of TRUE in matrix B.

fillfrom <- function(applvec, testvec=NULL) {
  if (is.null(testvec)) testvec <- applvec
  if (length(testvec) != length(applvec)) {
    stop("applvec and testvec have to be of same length!")
  } else if(any(testvec, na.rm=TRUE)) {
    applvec[min(which(testvec)) : length(applvec)] <- NA
  }
  applvec
}

fillafter <- function(applvec, testvec=NULL) {
  if (is.null(testvec)) testvec <- applvec
  fillfrom(applvec, c(FALSE, testvec[-length(testvec)]))
}

numtest <- 6
numsubj <- 20

newtest <- array(rbinom(numtest*numsubj, 1, .5),
                 dim=c(numsubj, numtest))
goldstandard <- array(rbinom(numtest*numsubj, 1, .5),
                      dim=c(numsubj, numtest))

newtest.NA <- t(sapply(1:nrow(newtest), function(i) {
  fillafter(newtest[i,], goldstandard[i,]==1)}))

My general question is if R provides some syntactic sugar
for the awkward sapply(1:nrow(A)) expression. Maybe in this
case there is also a way to bypass the apply mechanism and
my way of thinking about the problem has to be adapted. But
as the *apply calls are galore in R, I feel this is a standard
way of dealing with vectors and matrices.
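
One candidate for such sugar, offered only as a sketch, is to split each matrix into a list of its rows and let `mapply()` walk the lists in parallel. The helper name `rows` is hypothetical; the two functions are repeated from above so the block runs on its own.

```r
fillfrom <- function(applvec, testvec = NULL) {
  if (is.null(testvec)) testvec <- applvec
  if (length(testvec) != length(applvec)) {
    stop("applvec and testvec have to be of same length!")
  } else if (any(testvec, na.rm = TRUE)) {
    applvec[min(which(testvec)):length(applvec)] <- NA
  }
  applvec
}
fillafter <- function(applvec, testvec = NULL) {
  if (is.null(testvec)) testvec <- applvec
  fillfrom(applvec, c(FALSE, testvec[-length(testvec)]))
}

# Hypothetical helper: turn a matrix into a list with one vector per row.
rows <- function(m) split(m, row(m))

set.seed(1)
numtest <- 6
numsubj <- 20
newtest      <- array(rbinom(numtest * numsubj, 1, .5), dim = c(numsubj, numtest))
goldstandard <- array(rbinom(numtest * numsubj, 1, .5), dim = c(numsubj, numtest))

# Index-based version from the message above.
via_index <- t(sapply(1:nrow(newtest), function(i) {
  fillafter(newtest[i, ], goldstandard[i, ] == 1)
}))

# Same computation, pairing the rows of both matrices with mapply().
via_mapply <- t(mapply(fillafter, rows(newtest), rows(goldstandard == 1)))
```

The `mapply()` form generalises to three or more matrices by adding further `rows(...)` arguments, at the cost of building the intermediate lists.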







Re: [R] Positioning text in top left corner of plot

2007-08-10 Thread Daniel Brewer
Thanks for the replies, but I still cannot get what I want.  I do not
want the label inside the plot area, but in the top left of the paper, I
suppose in the margins.  When I try to use text to do this, it does not
seem to plot it outside the plot area.  I have also tried to use mtext,
but that does not really cut it, as I cannot get the label in the
correct position.  Ideally, it would be best if I could use legend but
have it outside the plot area.

Any ideas?

Thanks

Benilton Carvalho wrote:
 maybe this is what you want?
 
 plot(rnorm(10))
 legend("topleft", "A)", bty="n")
 
 ?
 
 b
 
 On Aug 7, 2007, at 11:08 AM, Daniel Brewer wrote:
 
 Simple question how can you position text in the top left hand corner of
 a plot?  I am plotting multiple plots using par(mfrow=c(2,3)) and all I
 want to do is label these plots a), b), c) etc.  I have been fiddling
 around with both text and mtext but without much luck.  text is fine but
  each plot has a different scale on the axis and so this makes it
 problematic.  What is the best way to do this?

 Many thanks

 Dan
-- 
**
Daniel Brewer, Ph.D.
Institute of Cancer Research
Email: [EMAIL PROTECTED]
**



[R] reading xcms files

2007-08-10 Thread Roberto Olivares Hernandez
Hi,

I am using xcms library  to read mass spectrum data. I generate objects from 
CDF files using the command line

  SME10 <- xcmsRaw("SME_10.CDF")

I have 50 CDF files with different name and I don't want to repeat the command  
for each one. Is there any option to read all the files and generate a 
corresponding object name?

In advance thank you
Roberto




Re: [R] Combining two ANOVA outputs of different lengths

2007-08-10 Thread Peter Dalgaard
Christoph Scherber wrote:
 Dear R users,

 I have been trying to combine two anova outputs into one single table 
 (for later publication). The outputs are of different length, and share 
 only some common explanatory variables.

 Using merge() or melt() (from the reshape package) did not work out.

 Here are the model outputs and what I would like to have:

 anova(model1)
             numDF denDF  F-value p-value
 (Intercept)     1    74 0.063446  0.8018
 days            1    74 6.613997  0.0121
 logdiv          1    74 1.587983  0.2116
 leg             1    74 4.425843  0.0388

 anova(model2)
              numDF denDF   F-value p-value
 (Intercept)      1    73 165.94569  <.0001
 funcgr           1    73   7.91999  0.0063
 grass            1    73  42.16909  <.0001
 leg              1    73   4.72108  0.0330
 funcgr:grass     1    73   8.49068  0.0047

 #merge(anova(model1),anova(model2),...)

               F-value 1  p-value 1  F-value 2  p-value 2
 (Intercept)    0.063446     0.8018  165.94569     <.0001
 days           6.613997     0.0121         NA         NA
 logdiv         1.587983     0.2116         NA         NA
 leg            4.425843     0.0388    4.72108      0.033
 funcgr               NA         NA    7.91999     0.0063
 grass                NA         NA   42.16909     <.0001
 funcgr:grass         NA         NA    8.49068     0.0047


 I would be glad if someone would have an idea of how to do this in 
 principle.
   
The main problems are that the merge key is the rownames and that you 
want to keep entries that are missing in one of the analyses. There are 
ways to deal with that:

  example(anova.lm)
.
  merge(anova(fit2), anova(fit4), by=0, all=T)
  Row.names Df.x  Sum Sq.x Mean Sq.x F value.x    Pr(>F).x Df.y  Sum Sq.y
1      ddpi   NA        NA        NA        NA          NA    1  63.05403
2       dpi   NA        NA        NA        NA          NA    1  12.40095
3     pop15    1 204.11757 204.11757 13.211166 0.000687868    1 204.11757
4     pop75    1  53.34271  53.34271  3.452517 0.069425385    1  53.34271
5 Residuals   47 726.16797  15.45038        NA          NA   45 650.71300
  Mean Sq.y  F value.y     Pr(>F).y
1  63.05403  4.3604959 0.0424711387
2  12.40095  0.8575863 0.3593550848
3 204.11757 14.1157322 0.0004921955
4  53.34271  3.6889104 0.0611254598
5  14.46029         NA           NA



Presumably, you can take it from here.
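
One way the remaining steps might look, sketched on the same `LifeCycleSavings` fits that `example(anova.lm)` creates: keep only the F and p columns, merge on row names, then restore the row names and a sensible term order. The suffixes and the decision to drop the Residuals row are choices made for this illustration.

```r
fit2 <- lm(sr ~ pop15 + pop75, data = LifeCycleSavings)
fit4 <- lm(sr ~ pop15 + pop75 + dpi + ddpi, data = LifeCycleSavings)

# Keep only the columns wanted in the combined table, without Residuals.
keep <- c("F value", "Pr(>F)")
a2 <- as.data.frame(anova(fit2))[, keep]
a4 <- as.data.frame(anova(fit4))[, keep]
a2 <- a2[rownames(a2) != "Residuals", ]
a4 <- a4[rownames(a4) != "Residuals", ]

# by = 0 merges on row names; all = TRUE keeps terms found in only one model.
both <- merge(a2, a4, by = 0, all = TRUE, suffixes = c(".1", ".2"))
rownames(both) <- as.character(both$Row.names)
both$Row.names <- NULL

# merge() sorts by key, so put the terms back in model order.
term_order <- union(rownames(a2), rownames(a4))
both <- both[term_order, ]
both
```

Terms that appear in only one model get `NA` in the other model's columns, as in the desired layout.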



[R] Re : compute ROC curve?

2007-08-10 Thread justin bem
see ROCR or accuracy package.
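
If a package is not at hand, the empirical ROC step function can also be sketched in a few lines of base R. This is not how the packages above implement it; the function names `roc_points` and `auc` are made up, the data are invented, and the sketch assumes that larger test values indicate disease.

```r
# x: test results for diseased subjects, y: for nondiseased subjects.
roc_points <- function(x, y) {
  cuts <- sort(unique(c(x, y)), decreasing = TRUE)
  tpr <- sapply(cuts, function(ct) mean(x >= ct))   # sensitivity at each cutoff
  fpr <- sapply(cuts, function(ct) mean(y >= ct))   # 1 - specificity
  data.frame(cutoff = c(Inf, cuts), fpr = c(0, fpr), tpr = c(0, tpr))
}

# Area under the curve by the trapezoidal rule.
auc <- function(roc) {
  sum(diff(roc$fpr) * (head(roc$tpr, -1) + tail(roc$tpr, -1)) / 2)
}

x <- c(2.5, 4, 5)   # invented diseased results
y <- c(1, 2, 3)     # invented nondiseased results
roc <- roc_points(x, y)
plot(roc$fpr, roc$tpr, type = "s",
     xlab = "False positive rate", ylab = "True positive rate")
auc(roc)
```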
 
Justin BEM
BP 1917 Yaoundé
Tél (237) 99597295
(237) 22040246

- Original message 
From: gallon li [EMAIL PROTECTED]
To: r-help r-help@stat.math.ethz.ch
Sent: Friday, 10 August 2007, 4:15:36
Subject: [R] compute ROC curve?

Hello,

I have continuous test results for diseased and nondiseased subjects, say X
and Y. Both are vectors of numbers.

is there any R function which can generate the step function of ROC curve
automatically?

Thanks!



Re: [R] reading xcms files

2007-08-10 Thread Prof Brian Ripley
On Fri, 10 Aug 2007, Roberto Olivares Hernandez wrote:

 Hi,

 I am using xcms library to read mass spectrum data. I generate objects 
 from CDF files using the command line

  SME10 <- xcmsRaw("SME_10.CDF")

 I have 50 CDF files with different name and I don't want to repeat the 
 command for each one. Is there any option to read all the files and 
 generate a corresponding object name?

Something like

for(f in Sys.glob("*.CDF")) assign(sub("\\.CDF$", "", f), xcmsRaw(f))

(untested, of course).
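
An alternative sketch keeps the objects in one named list instead of creating 50 separate variables. The function `read_all` and its arguments are hypothetical; so that the block runs without xcms or real CDF files, the demonstration writes two throwaway text files and uses `readLines` as a stand-in reader where `xcmsRaw` would go.

```r
# Hypothetical helper: apply `reader` to every matching file in `dir` and
# name each result after its file, minus the extension.
read_all <- function(dir = ".", pattern = "\\.CDF$", reader) {
  files <- list.files(dir, pattern = pattern, full.names = TRUE)
  objs <- lapply(files, reader)
  names(objs) <- sub(pattern, "", basename(files))
  objs
}

# Demonstration with dummy files; with xcms loaded one would pass
# reader = xcmsRaw and the directory holding the real CDF files.
tmp <- file.path(tempdir(), "cdf_demo")
dir.create(tmp, showWarnings = FALSE)
for (nm in c("SME_10", "SME_11")) {
  writeLines(nm, file.path(tmp, paste0(nm, ".CDF")))
}

raw <- read_all(tmp, reader = readLines)
names(raw)
```

Individual objects are then available as `raw$SME_10` or `raw[["SME_10"]]`, and the whole set can be processed with `lapply()`.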

-- 
Brian D. Ripley,  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel:  +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UKFax:  +44 1865 272595



Re: [R] need help to manipulate function and time interval

2007-08-10 Thread Uwe Ligges


Henrique Dallazuanna wrote:
 Hi,
 
 Try with:
 
 if(time[j] >= "18:00:00" & < "23:59:59")

This code is obviously wrong and does not help for the next few lines in 
the questioner's message, please do not post unsensible stuff.

Uwe Ligges


 ...
 ...
 
 
 
 
 


Re: [R] need help to manipulate function and time interval

2007-08-10 Thread Uwe Ligges


KOITA Lassana - STAC/ACE wrote:
 Hi R-users,
 
 I have to define a noise level function L and its energy in the various 
 moment of the day by:
 
  if time is between 18:00:00 and 23:59:59 then  L[j] <- L[j]+5 and W <- 
 10^((L+5)/10) 


What kind of object is time? Just a character or some Time/Date format?
Do you know the day?

If time is between 18:00:00 and 23:59:59, should the next point (time is 
between 22:00:00 and 05:59:59) be executed additionally if time is, 
e.g., 23:00:00 or is there any other condition I cannot see?

All the information is quite essential in order to help...

BTW: I don't think the rest of your code is sensible (at least, some 
braces are missing).

Uwe Ligges
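
A minimal sketch of one way to test such time intervals, assuming the times are plain character strings of the form HH:MM:SS with no date attached (the thread does not say). The helper names to_seconds and in_interval and the sample values are made up for illustration. The rule that the night adjustment wins where the two intervals overlap is an assumption, since that is exactly the open question raised above.

```r
# Convert "HH:MM:SS" strings to seconds since midnight.
to_seconds <- function(x) {
  parts <- strsplit(x, ":", fixed = TRUE)
  vapply(parts, function(p) sum(as.numeric(p) * c(3600, 60, 1)), numeric(1))
}

# TRUE where 'time' lies in [from, to]; also handles intervals that
# wrap past midnight, such as 22:00:00 to 05:59:59.
in_interval <- function(time, from, to) {
  s <- to_seconds(time)
  a <- to_seconds(from)
  b <- to_seconds(to)
  if (a <= b) s >= a & s <= b else s >= a | s <= b
}

time <- c("17:30:00", "19:15:00", "23:00:00", "03:00:00")
L    <- c(60, 60, 60, 60)

evening <- in_interval(time, "18:00:00", "23:59:59")
night   <- in_interval(time, "22:00:00", "05:59:59")

# Assumed precedence: the night penalty (+10) overrides the evening one (+5).
L_adj <- L + ifelse(night, 10, ifelse(evening, 5, 0))
W     <- 10^(L_adj / 10)
```

Because everything is vectorised, no explicit loop over i and j is needed; if the times carry dates, a date-time class would be the more natural choice.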



 if time is between 22:00:00 and 05:59:59 ==> L <- L+10  and W <- 
 10^((L+10)/10)
 else
 L=L and W = W
 
 Could someone help me to realize this function please? You will find my 
 following proposal  code, but my main problem is to handle the time 
 interval.
 
 Best regards
 
 ###
 myfunc <- function(mytab, Time, Level)
 {
 vect <- rep(0, length(mytab))
 for(i in 1:length(vect))
{
   for(j in 1:length(Time))
 
 if(time[j] is between 18:00:00 and 23:59:59) 
 
 L[i] <- L[j]+5 
  
vect[i] <- 10^((L[i])/10
 
 if (time[j] is between 22:00:00 and 05:59:59)
 
 L[i] <- L[j]+10 
  
  vect[i] <- 10^((L[i])/10
 
 else 
  
L[i] = L[j]
  
   vect[i] <- 10^((L[i])/10
  } 
 } 
 
 ###
 
 Lassana KOITA 
 Chargé d'Etudes de Sécurité Aéroportuaire et d'Analyse Statistique  / 
 Project Engineer Airport Safety Studies & Statistical analysis
 Service Technique de l'Aviation Civile (STAC) / Civil Aviation Technical 
 Department 
 Direction Générale de l'Aviation Civile (DGAC) / French Civil Aviation 
 Headquarters
 Tel: 01 49 56 80 60
 Fax: 01 49 56 82 14
 E-mail: [EMAIL PROTECTED]
 http://www.stac.aviation-civile.gouv.fr/
   [[alternative HTML version deleted]]
 
 
 
 
 


[R] smoothing function for proportions

2007-08-10 Thread Rose Hoberman
I am looking for a function that can fit a smooth function to a vector
of estimated proportions, such that the smoothed value is within
specified confidence bounds of each proportion.  In other words, given
a small number of trials and large confidence intervals, I would
prefer the function to vary smoothly, but given a large number of
trials and small confidence intervals, I would prefer the function to
lie within the confidence intervals, even if it is not smooth.

I have attached a postscript file illustrating a data set I would like
to smooth.  As the figure shows, for large values of x, I have few
data points, and so the ML estimate of the proportion varies widely,
and the confidence intervals are very large.  When I use the
smooth.spline function with a large value of spar (the red line), the
function is not as smooth as desired for large values of x.  When I
use a smaller value of spar (the green line), the function fails to
stay within the confidence bounds of the proportions.   Is there a
smoothing function for which I can specify upper and lower limits for
the y value for specific values of x?

Thanks for any suggestions,

Rose



Re: [R] Help wit matrices

2007-08-10 Thread Henrique Dallazuanna
mat2<-matrix(as.numeric(mat1>0.25), ncol=1000)
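
A small sketch of the same thresholding that keeps the matrix shape without passing ncol by hand. The matrix mat1, its size, and the threshold value 0.25 are stand-ins chosen for illustration.

```r
set.seed(1)
mat1 <- matrix(runif(12), nrow = 3)   # small stand-in for the 1000x1000 matrix
threshold <- 0.25                     # assumed threshold value

# The comparison yields a logical matrix of the same shape; adding 0L turns
# TRUE/FALSE into 1/0 while keeping the dimensions.
mat2 <- (mat1 > threshold) + 0L

dim(mat2)
```

The comparison is already vectorised over the whole matrix, so no loop is needed; as.numeric() is what drops the dimensions in the attempt quoted below.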
-- 
Henrique Dallazuanna
Curitiba-Paraná-Brasil
25° 25' 40" S 49° 16' 22" O

On 10/08/07, Lanre Okusanya [EMAIL PROTECTED] wrote:

 Hello all,

 I am working with a 1000x1000 matrix, and I would like to return a
 1000x1000 matrix that tells me which value in the matrix is greater
 than a threshold value (1 or 0 indicator).
 I have tried
   mat2<-as.matrix(as.numeric(mat1>0.25))
 but that returns a 1:10 matrix.
 I have also tried for loops, but they are grossly inefficient.

 Thanks for all your help in advance.

 Lanre



Re: [R] Plot legend in margin

2007-08-10 Thread Greg Snow
Another couple of things to think about:

You could use the layout function to set up your multiple plots and
include an extra plotting area at the bottom to place the legend in.

If you stick with the solution below then the cnvrt.coords function from
the TeachingDemos package may be useful (will help you find the
coordinates relative to the last plot).
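
A minimal sketch of the layout() suggestion, assuming a 2x2 set of plots with a shared legend strip underneath. The panel contents, the height ratios, and the legend text are placeholders; the null PDF device is used only so the example runs without opening a window.

```r
# Draw to a null PDF device so the sketch runs without creating a file.
pdf(NULL)

# Panels 1-4 form a 2x2 grid; panel 5 spans the bottom row and holds the legend.
lay <- layout(matrix(c(1, 2,
                       3, 4,
                       5, 5), nrow = 3, byrow = TRUE),
              heights = c(4, 4, 1))

for (i in 1:4) {
  plot(1:10, type = "l", main = paste("Plot", i))
}

# Empty plot in the legend strip, then a centred horizontal legend.
par(mar = c(0, 0, 0, 0))
plot.new()
legend("center", legend = "straight line", lty = 1, col = "black",
       horiz = TRUE, bty = "n")

dev.off()
```

With this arrangement the legend has its own plotting region, so no par(xpd=NA) or trial-and-error coordinates are needed.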

Hope this helps,

-- 
Gregory (Greg) L. Snow Ph.D.
Statistical Data Center
Intermountain Healthcare
[EMAIL PROTECTED]
(801) 408-8111
 
 

 -Original Message-
 From: [EMAIL PROTECTED] 
 [mailto:[EMAIL PROTECTED] On Behalf Of Daniel Brewer
 Sent: Friday, August 10, 2007 4:55 AM
 To: Lauri Nikkinen; r-help@stat.math.ethz.ch
 Subject: Re: [R] Plot legend in margin
 
 Thanks.  That got me onto the right track.  Because it is a 
 multiplot and I wanted it along the bottom, I found that I 
 had to use par(xpd=NA) and then position it relative to the 
 last of the multiplots.  After a bit of trial and error I got there.
 
 Thanks
 
 Lauri Nikkinen wrote:
  Very simple example:
   
  opar <- par(mar = c(10, 4, 4, 4))
  plot(1:10)
  lines(1:10)
  par(xpd=TRUE)
  legend(4,-1.5,lty=1, col="black", legend="straight line")
  par(opar)
   
  -Lauri
 
 --
 **
 Daniel Brewer, Ph.D.
 Institute of Cancer Research
 Email: [EMAIL PROTECTED]
 **
 
 The Institute of Cancer Research: Royal Cancer Hospital, a 
 charitable Company Limited by Guarantee, Registered in 
 England under Company No. 534147 with its Registered Office 
 at 123 Old Brompton Road, London SW7 3RP.
 
 This e-mail message is confidential and for use by the\  ...{{dropped}}



[R] Row name of empty string issue

2007-08-10 Thread adiamond

I have a data.frame with rownames taken from a database.  Unfortunately, one
of the rownames (automatically obtained from the DB) is an empty string.  I
often do computations on the DB such that the answers (rows) are indexed with
respect to the rownames, so a computation on a DB record might necessitate
indexing into the data.frame by the empty row name.  Unfortunately, that
doesn't seem to work.  Explicitly:
I have this statement (in a loop going through sourceNames which are
rownames of the data.frames CurrentRecordBlankFieldsCountSums and 
BlankFieldsCount):

BlankFieldsCount[sourceNamei,]= BlankFieldsCount[sourceNamei,] + 
CurrentRecordBlankFieldsCountSums[sourceNamei ,];

If sourceNamei is any name other than "" it works fine, but otherwise
CurrentRecordBlankFieldsCountSums[sourceNamei ,] returns a bunch of NAs
because apparently it didn't find a row named "".  IMHO, if R lets you name
a row "", then it should let you index it with the name "".

Anyway, as further proof of the setup:
> rownames(CurrentRecordBlankFieldsCountSums)[1]
[1] ""
# So the first rowname of CurrentRecordBlankFieldsCountSums is an empty
# string 

> CurrentRecordBlankFieldsCountSums[1 ,]
 ID CaseNumber Category SSN LastName FirstName
00   10 0
# So, the first row has some data (not just NAs as would be returned if that
# row didn't exist)

But if I index that same row using the rowname it doesn't find the row:
> CurrentRecordBlankFieldsCountSums[rownames(
 CurrentRecordBlankFieldsCountSums)[1] ,]
   ID CaseNumber Category SSN LastName
NA   NA   NA  NA 
# I get the same result if I do this: CurrentRecordBlankFieldsCountSums["" ,]

As a sanity check:
> "" == rownames(CurrentRecordBlankFieldsCountSums)[1]
[1] TRUE

For other rows (rownames that aren't ""), there's no problem:
> rownames(CurrentRecordBlankFieldsCountSums)[2]
[1] "FRED"
> CurrentRecordBlankFieldsCountSums[rownames(
 CurrentRecordBlankFieldsCountSums)[2] ,]
  ID CaseNumber Category SSN LastName FirstName
FRED00   00 
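
A possible workaround, sketched on a tiny made-up data frame (the column values and the helper name row_by_name are illustrative only). It relies on the documented behaviour in ?Extract that an empty character subscript matches no name, and looks the row up by position with match() instead, which does compare empty strings.

```r
# Small stand-in for the data frame above; one row name is the empty string.
sums <- data.frame(ID = c(0, 0), SSN = c(1, 0))
rownames(sums) <- c("", "FRED")

# Character subscripts never match "" (see ?Extract), so translate the
# name to a row position first and index by that position.
row_by_name <- function(df, name) {
  df[match(name, rownames(df)), , drop = FALSE]
}

row_by_name(sums, "")
row_by_name(sums, "FRED")
```

Replacing the empty name with a placeholder such as "(blank)" when the data are read from the database would avoid the issue altogether.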

-- 
View this message in context: 
http://www.nabble.com/Row-name-of-empty-string-issue-tf4250291.html#a12096455
Sent from the R help mailing list archive at Nabble.com.



Re: [R] help with counting how many times each value occur in each column

2007-08-10 Thread François Pinard
[Tom Cohen]

  I have the following dataset and want to know how many times each value 
 occurs in each column.

   data
[,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
 [1,] -100 -100 -100    0    0    0    0    0    0  -100
 [2,] -100 -100 -100 -100 -100 -100 -100 -100 -100  -100
 [3,] -100 -100 -100 -100 -100 -100 -100 -100 -100  -100
 [4,] -100 -100 -100 -100 -100 -100 -100 -100 -100  -100
 [5,] -100 -100 -100 -100 -100 -100 -100 -100 -100   -50
 [6,] -100 -100 -100 -100 -100 -100 -100 -100 -100  -100
 [7,] -100 -100 -100 -100 -100 -100 -100 -100 -100  -100
 [8,] -100 -100 -100 -100 -100 -100 -100 -100 -100  -100
 [9,] -100 -100 -100 -100 -100 -100 -100 -100 -100  -100
[10,] -100 -100 -100  -50 -100 -100 -100 -100 -100  -100
[11,] -100 -100 -100 -100 -100 -100 -100 -100 -100  -100
[12,] -100 -100 -100 -100 -100 -100 -100 -100 -100  -100
[13,] -100 -100 -100 -100 -100 -100 -100 -100 -100  -100
[14,] -100 -100 -100 -100 -100 -100 -100 -100 -100  -100
[15,] -100 -100 -100 -100 -100 -100 -100 -100 -100  -100
[16,] -100 -100 -100 -100 -100 -100 -100 -100 -100  -100
[17,] -100 -100 -100 -100 -100 -100 -100 -100 -100  -100
[18,] -100 -100 -100 -100 -100 -100 -100 -100 -100  -100
[19,] -100 -100 -100    0    0    0    0    0    0  -100
[20,] -100 -100 -100 -100 -100 -100 -100 -100 -100  -100
  The result matrix should look like
   -100 0 -50
[1]   20  
[2]   20
[3]   20
[4]   17
[5]   18
[6]   18
[7]   18  and so on 
[8] 
[9] 
[10]

Presuming that data is a matrix, one could try a sequence like this:

dataf <- factor(data)
dim(dataf) <- dim(data)
result <- t(apply(dataf, 2, tabulate, nlevels(dataf)))
colnames(result) <- levels(dataf)
result

If you want the columns sorted, you might decide the order of the levels 
on the factor() call, or explicitly reorder columns afterwards.
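
An alternative sketch using table() on each column with a fixed set of levels. The matrix is rebuilt from the post under the assumption that rows 1 and 19 hold zeros in columns 4 to 9, which is consistent with the counts of 17 and 18 given in the question.

```r
# Rebuild the matrix from the post: 20 rows, 10 columns, mostly -100.
data <- matrix(-100, nrow = 20, ncol = 10)
data[c(1, 19), 4:9] <- 0
data[10, 4] <- -50
data[5, 10] <- -50

# Fix the set of levels up front so every column is counted against
# the same values, including those that do not occur in it.
lev <- sort(unique(as.vector(data)))
result <- t(apply(data, 2, function(col) table(factor(col, levels = lev))))
result
```

Each row of result corresponds to one column of data, and each column of result to one distinct value, sorted in increasing order.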

-- 
François Pinard   http://pinard.progiciels-bpi.ca



[R] kde2d error message

2007-08-10 Thread Jennifer Dillon
Hello!

I am trying to do a smooth with the kde2d function, and I'm getting an error
message about NAs.  Does anyone have any suggestions?  Does this function
not do well with NAs in general?

fit <- kde2d(X, Y, n=100,lims=c(range(X),range(Y)))

Error in if (from == to || length.out < 2) by <- 1 :
missing value where TRUE/FALSE needed


Thanks in advance!!

Jen



Re: [R] [Fwd: Re: How to apply functions over rows of multiple matrices]

2007-08-10 Thread Johannes Hüsing
 1. matrices are stored columnwise so R is better at column-wise operations
 than row-wise.

I am seeing this by my code which contains more t() than
what seems healthy. However, the summaries are patient-wise
over repeated measurements. Out of convention, I am storing
patients in rows and measurements in columns.


 2. Here is one way to do it (although I am not sure it's better than the
 index approach):

row.apply <- function(f, a, b)
   t(mapply(f, as.data.frame(t(a)), as.data.frame(t(b))))


Ah, thank you so much. I'll take the generalization to N arguments
à la mapply() as an exercise for the reader.
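
One possible take on that exercise, as a sketch only: the N-argument signature below is an assumption about what the generalisation would look like, not code from the thread, and the matrices a and b are made-up inputs.

```r
# Row-wise apply over any number of equally shaped matrices.
row.apply <- function(f, ...) {
  # Turn each matrix into a data frame whose columns are the matrix rows.
  rows <- lapply(list(...), function(m) as.data.frame(t(m)))
  # mapply() then walks those columns in parallel, i.e. the original rows.
  t(do.call(mapply, c(list(FUN = f), rows)))
}

a <- matrix(1:6, nrow = 2)
b <- matrix(6:1, nrow = 2)

two   <- row.apply(function(x, y) x + y, a, b)
three <- row.apply(function(x, y, z) x + y + z, a, b, a)
```

The sketch assumes f returns a vector of the same length for every row; if f returns a single value per row, mapply() simplifies to a plain vector and the final t() yields a one-row matrix.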

 3. The code for the example in this post could be simplified to:

 first.1 <- apply(cbind(goldstandard, 1), 1, which.max)
 ifelse(col(newtest) > first.1, NA, newtest)


Ouch! Consider this scholar slapped.

 4. given that both examples did not inherently need row by row operations
I wonder if that is the wrong generalization in the first place?


Given that you managed to squeeze my 20 lines of code into 2 lines
AND that row.apply() does not exist in base without many people
missing it, I'll have to concede this point and eliminate the
craving for row.apply() in favour of the whole-object approach.



Re: [R] having problems with factor()

2007-08-10 Thread Rainer Hurling
I am afraid the above example will not work. In original dataset of 
Jabez Wilson numerical range is from 0..7.

So try this one:

df<-as.factor(c(0,"I","II","III","IV","V","VI","VII")[df$area+1])

Hope this is what you want,
Rainer
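
A sketch of a closely related approach that states the levels explicitly, so the eight labels line up with the codes 0 to 7 even when some codes do not occur in the data. The small data frame is a stand-in for the original data set, and the name roman is made up for illustration.

```r
# Stand-in data: not every area code from 0 to 7 occurs.
df <- data.frame(ht   = c(320, 410, 230, 360, 126, 0),
                 area = c(3, 4, 2, 3, 1, 0))

roman <- c("0", "I", "II", "III", "IV", "V", "VI", "VII")

# Declaring all eight levels makes the labels match the codes 0..7,
# whether or not every code is present.
df$areaf <- factor(df$area, levels = 0:7, labels = roman)

table(df$areaf)
```

Levels with a zero count can be removed again with droplevels(df$areaf) before fitting the anova, if empty groups are not wanted.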


Henrique Dallazuanna schrieb:
 Hi,
 
 df
     ht area
 1  320    3
 2  410    4
 3  230    2
 4  360    3
 5  126    1
 6  280    2
 7  260    2
 8  280    2
 9  280    2
 10 260    2
 
 df$area <- as.factor(df$area)
 levels(df$area) <- c("I", "II", "III", "IV")
 
 
  On 10/08/07, Jabez Wilson [EMAIL PROTECTED] wrote:
  Dear R Help,
  I have a set of data of heights of trees described by area that they 
  are in. The areas are numerical (0 to 7).
 
       ht area
  1   320   3
  2   410   4
  3   230   2
  4   360   3
  5   126   1
  6   280   2
  7   260   2
  8   280   2
  9   280   2
  10  260   2
  ...
  180 450   4
  181  90   1
  182 120   1
  183 440   4
  184 210   2
  185 330   3
  186 210   2
  187 100   1
  188   0   0
 
  I want to convert the area column values to factors, to do an anova.
  However, if I use:
 
  df$areaf <- factor(df$area,
  labels=c(0,"I","II","III","IV","V","VI","VII"))
 
  it gives the following message:
 
  Error in factor(df$area, labels = c(0, "I", "II", "III", "IV", "V",
  "VI",  :
  invalid labels; length 8 should be 1 or 7
 
  Can anyone help?
 
  Jabez



Re: [R] having problems with factor()

2007-08-10 Thread Henrique Dallazuanna
Hi,

df
    ht area
1  320    3
2  410    4
3  230    2
4  360    3
5  126    1
6  280    2
7  260    2
8  280    2
9  280    2
10 260    2

df$area <- as.factor(df$area)
levels(df$area) <- c("I", "II", "III", "IV")



-- 
Henrique Dallazuanna
Curitiba-Paraná-Brasil
25° 25' 40" S 49° 16' 22" O

On 10/08/07, Jabez Wilson [EMAIL PROTECTED] wrote:

 Dear R Help,
 I have a set of data of heights of trees described by area that they are
 in. The areas are numerical (0 to 7).

      ht area
 1   320   3
 2   410   4
 3   230   2
 4   360   3
 5   126   1
 6   280   2
 7   260   2
 8   280   2
 9   280   2
 10  260   2
 ...
 180 450   4
 181  90   1
 182 120   1
 183 440   4
 184 210   2
 185 330   3
 186 210   2
 187 100   1
 188   0   0

 I want to convert the area column values to factors, to do an anova.
 However, if I use:

 df$areaf <- factor(df$area,
 labels=c(0,"I","II","III","IV","V","VI","VII"))

 it gives the following message:

 Error in factor(df$area, labels = c(0, "I", "II", "III", "IV", "V",
 "VI",  :
 invalid labels; length 8 should be 1 or 7

 Can anyone help?

 Jabez




[R] having problems with factor()

2007-08-10 Thread Jabez Wilson
Dear R Help,
I have a set of data of heights of trees described by area that they are in. 
The areas are numerical (0 to 7).

     ht area
1   320   3
2   410   4
3   230   2
4   360   3
5   126   1
6   280   2
7   260   2
8   280   2
9   280   2
10  260   2
...
180 450   4
181  90   1
182 120   1
183 440   4
184 210   2
185 330   3
186 210   2
187 100   1
188   0   0

I want to convert the area column values to factors, to do an anova. However, 
if I use:

df$areaf <- factor(df$area, labels=c(0,"I","II","III","IV","V","VI","VII"))

it gives the following message:

Error in factor(df$area, labels = c(0, "I", "II", "III", "IV", "V", "VI",  : 
invalid labels; length 8 should be 1 or 7

Can anyone help?

Jabez




Re: [R] ordering a data.frame by average rank of multiple columns

2007-08-10 Thread Gabor Grothendieck
Try this:

positions <- order(ranks)
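
A short worked sketch of that one-liner on the data from the question quoted below. The variable ranks is assumed to hold the rank of the weighted combination (Res1 counted twice), and the Mean column is left out for brevity.

```r
Res1 <- c(1, 5, 4, -0.5, 3)
Res2 <- c(-1, 8, 2, 0, 3)
MyFrame <- data.frame(Res1, Res2, row.names = c("G1", "G2", "G3", "G4", "G5"))

# Weighted combined rank: Res1 counts twice as much as Res2.
ranks <- rank(2 * rank(MyFrame$Res1) + rank(MyFrame$Res2))

# order() returns, for each sorted slot, the row that belongs there,
# i.e. the inverse permutation of a tie-free rank vector.
positions <- order(ranks)

MyFrame[positions, ]
```

Since order() works on any numeric vector, the intermediate rank() of the combined score is optional; ordering the combined score directly gives the same positions.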

On 8/10/07, Tom.O [EMAIL PROTECTED] wrote:

 Hi

 I have run into a problem and I wonder if anyone has a smart way of doing
 this.

 For example I have this data frame for 5 different test groups:

 Res1 <- c(1,5,4,-0.5,3)
 Res2 <- c(-1,8,2,0,3)
 Mean <- c(0.5,1,1.5,-.5,2)
 MyFrame <- data.frame(Res1,Res2,Mean,row.names=c("G1","G2","G3","G4","G5"))

 where the first two columns are the results of two different tests, the
 third column is the mean of the group.

 I want to order this data.frame by the combined rank of Res1 & Res2, but
 where weights are assigned to the importance of each column. Let's assume
 that Res1 is twice as important and lower values rank better.

 MyRanks<-data.frame(Rank1=rank(MyFrame[,"Res1"]),Rank2=rank(MyFrame[,"Res2"]),CombR=2*rank(MyFrame[,"Res1"])+rank(MyFrame[,"Res2"]),row.names=c("G1","G2","G3","G4","G5"))

    Rank1 Rank2 CombR
 G1     2     1     5
 G2     5     5    15
 G3     4     3    11
 G4     1     2     4
 G5     3     4    10


 and the rank of the combined is 2,5,4,1,3, but to be able to sort MyFrame
 in that order I need to enter this vector of positions c(4,1,5,3,2). Does
 anyone have a smart way of converting ranks to positions?

 Tom


 --
 View this message in context: 
 http://www.nabble.com/ordering-a-data.frame-by-average-rank-of-multiple-columns-tf4247393.html#a12087498
 Sent from the R help mailing list archive at Nabble.com.



Re: [R] Positioning text in top left corner of plot

2007-08-10 Thread Daniel Brewer
This works fine for one plot, but if it is a multiple plot (mfrow=c(2,2)
say) then each individual label is placed in the same position i.e.
absolute top left on the canvas.  I would like it top left of each
individual plot.

Thanks anyway.  Got any idea how to fix this?

Dan

Paul Murrell wrote:
 Hi
 
 
 Daniel Brewer wrote:
 Thanks for the replies, but I still cannot get what I want.  I do not
 want the label inside the plot area, but in the top left of the paper, I
 suppose in the margins.  When I try to use text to do this, it does not
 seem to plot it outside the plot area.  I have also tried to use mtext,
 but that does not really cut it, as I cannot get the label in the
 correct position.  Ideally, it would be best if I could use legend but
 have it outside the plot area.

 Any ideas?
 
 
 plot(1:10)
 library(grid)
 grid.text("What do we want?  Text in the corner!\nWhere do we want it? Here!",
   x=unit(2, "mm"), y=unit(1, "npc") - unit(2, "mm"),
   just=c("left", "top"))
 
 Paul
 
 
 Thanks

 Benilton Carvalho wrote:
 maybe this is what you want?

 plot(rnorm(10))
 legend("topleft", "A)", bty="n")

 ?

 b

 On Aug 7, 2007, at 11:08 AM, Daniel Brewer wrote:

 Simple question how can you position text in the top left hand
 corner of
 a plot?  I am plotting multiple plots using par(mfrow=c(2,3)) and all I
 want to do is label these plots a), b), c) etc.  I have been fiddling
 around with both text and mtext but without much luck.  text is fine
 but
  each plot has a different scale on the axis and so this makes it
 problematic.  What is the best way to do this?

 Many thanks

 Dan
 
 




Re: [R] Positioning text in top left corner of plot

2007-08-10 Thread Daniel Brewer
Thanks.  That works if it is only a single plot, but if there are
multiple plots (e.g. par(mfrow=c(2,2))) it confusingly puts the label in
the absolute top left always i.e. the top left of plot one.

Dan
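
For the multi-panel case, one possibility (an assumption on the editor's part, not something proposed in the thread) is to call mtext() without outer=TRUE right after each plot, so the label goes into that panel's own top margin. The panel contents and label style below are placeholders, and the null PDF device only keeps the sketch from opening a window.

```r
pdf(NULL)                      # null device so the sketch runs headless
par(mfrow = c(2, 2))

labels <- paste0(letters[1:4], ")")
for (i in seq_along(labels)) {
  plot(rnorm(10) * 10^i)       # each panel has its own axis scale
  # Without outer = TRUE, mtext() works in the current panel's margin;
  # adj = 0 left-aligns the label with that panel's plot region.
  mtext(labels[i], side = 3, line = 1, adj = 0, font = 2)
}

dev.off()
```

Because mtext() positions text in margin lines rather than user coordinates, the differing axis scales of the panels do not matter.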

S Ellison wrote:
 Try something like
 mtext(side=3, line=-1, text="Here again?", adj=0, outer=T)
 
 This puts text just inside the top left corner.
 
 
 Jim Lemon [EMAIL PROTECTED] 10/08/2007 10:37:30 
 Daniel Brewer wrote:
 Thanks for the replies, but I still cannot get what I want.  I do not
 want the label inside the plot area, but in the top left of the paper, I
 suppose in the margins.  When I try to use text to do this, it does not
 seem to plot it outside the plot area.  I have also tried to use mtext,
 but that does not really cut it, as I cannot get the label in the
 correct position.  Ideally, it would be best if I could use legend but
 have it outside the plot area.

 Any ideas?

 Hi Dan,
 
 Try this:
 
 plot(1:5)
 par(xpd=TRUE)
 text(0.5,5.5,"Outside")
 par(xpd=FALSE)
 
 Jim




Re: [R] need help with pdf-plot

2007-08-10 Thread Ivar Herfindal
Dear Antje

I cannot see that you have got any replies yet, so I will make an 
attempt. However, I am sure others have more formally correct solutions.

When you call pdf(), you can set paper="a4" (or "a4r" for 
landscape). However, the width and the height of your plot should then 
not exceed the size of the paper (which is approximately 8.27*11.69 
inches for a4). Try (I have only tested on windows XP, R 2.5.0):

pdf("test1.pdf", width=10, height=5, paper="a4r")
par(mfrow=c(1,3), pty="s") # pty="s" gives square plotting regions
plot(rnorm(100))
plot(rnorm(100))
plot(rnorm(100))
dev.off()
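
A sketch of how the width and height could be derived for a grid of square panels, such as the 2 x 10 layout in the question quoted below. The helper fit_grid, the 0.5 inch margin, and the dummy barplot data are assumptions for illustration; the device is opened with a NULL file so nothing is written to disk.

```r
# Largest square cells for an nrow x ncol grid on a page, leaving a margin.
# Page size defaults to A4 landscape in inches; 'margin' is an assumed value.
fit_grid <- function(nrow, ncol, page = c(11.69, 8.27), margin = 0.5) {
  usable <- page - 2 * margin
  cell <- min(usable[1] / ncol, usable[2] / nrow)
  c(width = cell * ncol, height = cell * nrow)
}

size <- fit_grid(nrow = 2, ncol = 10)

# Replace NULL with a file name such as "test.pdf" to keep the output.
pdf(NULL, width = size[["width"]], height = size[["height"]], paper = "a4r")
par(mfcol = c(2, 10), pty = "s", mar = c(2, 2, 1, 0.5))
for (i in 1:20) barplot(c(3, 1, 2), cex.axis = 0.5, las = 2)
dev.off()
```

The plotting area then keeps the 10:2 aspect ratio of the grid and sits on an A4 landscape page instead of defining a long narrow page of its own.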

Hope this helps

Ivar


Antje skrev:
 I still have this problem. Does anybody know any solution?

 Antje

 Antje schrieb:
   
 Hello,

 I'm trying to plot a set of barplots like a matrix (2 rows, 10 columns 
 from reduced_mat) to a pdf. It works with the following parameters:

 pdf("test.pdf", width=ncol(reduced_mat)*2, height=nrow(reduced_mat)*2, 
 pointsize = 12)

 par(mfcol = c(nrow(reduced_mat),ncol(reduced_mat)), oma = c(0,0,0,0), 
 lwd=48/96, cex.axis = 0.5, las = 2, cex.main = 1.0)

 Then I get a long narrow page format with the square barplots.

 But I would like to have an A4 format in the end, with the plots not filling 
 the whole page (they should stay more or less square and not be stretched...).

 What shall I look for to achieve this?

 Antje



[R] help with counting how many times each value occur in each column

2007-08-10 Thread Tom Cohen
Dear list,
  I have the following dataset and want to know how many times each value occurs 
in each column.
   data
[,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
 [1,] -100 -100 -100    0    0    0    0    0    0  -100
 [2,] -100 -100 -100 -100 -100 -100 -100 -100 -100  -100
 [3,] -100 -100 -100 -100 -100 -100 -100 -100 -100  -100
 [4,] -100 -100 -100 -100 -100 -100 -100 -100 -100  -100
 [5,] -100 -100 -100 -100 -100 -100 -100 -100 -100   -50
 [6,] -100 -100 -100 -100 -100 -100 -100 -100 -100  -100
 [7,] -100 -100 -100 -100 -100 -100 -100 -100 -100  -100
 [8,] -100 -100 -100 -100 -100 -100 -100 -100 -100  -100
 [9,] -100 -100 -100 -100 -100 -100 -100 -100 -100  -100
[10,] -100 -100 -100  -50 -100 -100 -100 -100 -100  -100
[11,] -100 -100 -100 -100 -100 -100 -100 -100 -100  -100
[12,] -100 -100 -100 -100 -100 -100 -100 -100 -100  -100
[13,] -100 -100 -100 -100 -100 -100 -100 -100 -100  -100
[14,] -100 -100 -100 -100 -100 -100 -100 -100 -100  -100
[15,] -100 -100 -100 -100 -100 -100 -100 -100 -100  -100
[16,] -100 -100 -100 -100 -100 -100 -100 -100 -100  -100
[17,] -100 -100 -100 -100 -100 -100 -100 -100 -100  -100
[18,] -100 -100 -100 -100 -100 -100 -100 -100 -100  -100
[19,] -100 -100 -100    0    0    0    0    0    0  -100
[20,] -100 -100 -100 -100 -100 -100 -100 -100 -100  -100
  The result matrix should look like
   -100 0 -50
[1]   20  
[2]   20
[3]   20
[4]   17
[5]   18
[6]   18
[7]   18  and so on 
[8] 
[9] 
[10]
  
How can I do this in R ?
  Thanks a lot for your help,
Tom

   


[R] Odp: having problems with factor()

2007-08-10 Thread Petr PIKAL
Hi
[EMAIL PROTECTED] wrote on 10.08.2007 13:41:53:

 Dear R Help,
 I have a set of data of heights of trees described by area that they are 
in. 
 The areas are numerical (0 to 7).
 
      ht area
 1   320   3
 2   410   4
 3   230   2
 4   360   3
 5   126   1
 6   280   2
 7   260   2
 8   280   2
 9   280   2
 10  260   2
 ...
 180 450   4
 181  90   1
 182 120   1
 183 440   4
 184 210   2
 185 330   3
 186 210   2
 187 100   1
 188   0   0
 
 I want to convert the area column values to factors, to do an anova. 
However, if I use:
 
 df$areaf <- factor(df$area, 
labels=c(0,"I","II","III","IV","V","VI","VII"))
 
 it gives the following message:
 

Hm, maybe some of the values are missing

> num<-sample(1:3, 10, replace=T)
> num
 [1] 1 3 1 2 3 3 1 3 3 3
> factor(num, labels=c("O", "I", "II"))
 [1] O  II O  I  II II O  II II II
Levels: O I II

> factor(num, labels=c("O", "I", "II", "III"))
Error in factor(num, labels = c("O", "I", "II", "III")) : 
invalid labels; length 4 should be 1 or 3


try

table(df$area)
to see what level you really have

Regards
Petr


 Error in factor(df$area, labels = c(0, "I", "II", "III", "IV", "V", 
"VI",  : 
 invalid labels; length 8 should be 1 or 7
 
 Can anyone help?
 
 Jabez
 
 


[R] Remove redundant observations for cross-validation

2007-08-10 Thread Eleni Rapsomaniki


Hi,

This is a general statistics question that I believe occurs often so may have
some R functions/packages dedicated to it.
Suppose you want to check the accuracy of a classifier using a large training
data-set where each row represents an observation. Is there a simple approach
for removing redundant rows (rows with very similar values for all columns)
from the training data so as to obtain a realistic classification performance
upon x-validation? The only one I can think of is clustering the data into an
arbitrary number of clusters and selecting one observation from each cluster.

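e.g.
library(cluster)
x <- rbind(cbind(rnorm(10,0,0.5), rnorm(10,0,0.5)),
   cbind(rnorm(15,5,2.5), rnorm(15,5,2.5)),
   cbind(rnorm(15,15,0.5), rnorm(15,15,0.5)),
   cbind(rnorm(5,5,0.1), rnorm(5,5,0.1)))
  
pamx <- pam(x, 15)

y=array(NA, dim=c(15,ncol(x)))
for(i in 1:15){
# pick one member per cluster; index via sample.int() because
# sample(n, 1) would draw from 1:n when a cluster has a single member
members=which(pamx$clustering==i)
y[i,]=x[members[sample.int(length(members), 1)],]
}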

This seems a bit subjective though... Any better ideas?

Eleni Rapsomaniki



Re: [R] [Fwd: Re: How to apply functions over rows of multiple matrices]

2007-08-10 Thread Gabor Grothendieck
1. matrices are stored columnwise so R is better at column-wise operations
than row-wise.

2. Here is one way to do it (although I am not sure it's better than the
index approach):

   row.apply <- function(f, a, b)
  t(mapply(f, as.data.frame(t(a)), as.data.frame(t(b))))

3. The code for the example in this post could be simplified to:

first.1 - apply(cbind(goldstandard, 1), 1, which.max)
ifelse(col(newtest)  first.1, NA, newtest)

4. given that both examples did not inherently need row by row operations
   I wonder if that is the wrong generalization in the first place?
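
Points 2 and 3 can be checked against each other on small random matrices. The following is only a sketch under assumptions: the seed, the matrix sizes, and the per-row helper na.after.first are made up here for illustration and are not part of the thread.

```r
set.seed(1)                       # arbitrary seed, for reproducibility only
numtest <- 6; numsubj <- 20
newtest      <- matrix(rbinom(numtest * numsubj, 1, 0.5), numsubj, numtest)
goldstandard <- matrix(rbinom(numtest * numsubj, 1, 0.5), numsubj, numtest)

# Point 2: apply f to corresponding rows of a and b.
# as.data.frame(t(a)) turns each row of a into one list element for mapply.
row.apply <- function(f, a, b)
  t(mapply(f, as.data.frame(t(a)), as.data.frame(t(b))))

# Hypothetical per-row helper: set everything after the first 1 in testvec to NA.
na.after.first <- function(applvec, testvec) {
  first <- which(testvec == 1)[1]          # NA if the row contains no 1
  if (!is.na(first) && first < length(applvec))
    applvec[(first + 1):length(applvec)] <- NA
  applvec
}
by.row <- row.apply(na.after.first, newtest, goldstandard)

# Point 3: the same result without any row-wise loop.
first.1    <- apply(cbind(goldstandard, 1), 1, which.max)
vectorized <- ifelse(col(newtest) > first.1, NA, newtest)

all.equal(unname(by.row), vectorized)
```

If the two agree, the vectorized form is usually the one to prefer, since it works on whole columns at once.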


On 8/10/07, Johannes Hüsing [EMAIL PROTECTED] wrote:
> [Apologies to Gabor, to whom I erroneously sent a personal copy of the
> reply instead of posting to the list directly]
>
> [...]
> > Perhaps what you really intend is to
> > take the average over those elements in each row of the first matrix
> > which correspond to 1's in the corresponding
> > row of the second.  In that case it's just:
> >
> > rowSums(newtest * goldstandard) / rowSums(goldstandard)
> >
>
> Thank you for clearing my thoughts about the particular example.
> My question was a bit more general though, as I have different
> functions which are applied row-wise to multiple matrices. An
> example is one that sets all values of a row of matrix A to NA after
> the first occurrence of TRUE in matrix B.
>
> fillfrom <- function(applvec, testvec=NULL) {
>   if (is.null(testvec)) testvec <- applvec
>   if (length(testvec) != length(applvec)) {
>     stop("applvec and testvec have to be of same length!")
>   } else if(any(testvec, na.rm=TRUE)) {
>     applvec[min(which(testvec)) : length(applvec)] <- NA
>   }
>   applvec
> }
>
> fillafter <- function(applvec, testvec=NULL) {
>   if (is.null(testvec)) testvec <- applvec
>   fillfrom(applvec, c(FALSE, testvec[-length(testvec)]))
> }
>
> numtest <- 6
> numsubj <- 20
>
> newtest <- array(rbinom(numtest*numsubj, 1, .5),
>     dim=c(numsubj, numtest))
> goldstandard <- array(rbinom(numtest*numsubj, 1, .5),
>     dim=c(numsubj, numtest))
>
> newtest.NA <- t(sapply(1:nrow(newtest), function(i) {
>   fillafter(newtest[i,], goldstandard[i,]==1)}))
>
> My general question is whether R provides some syntactic sugar
> for the awkward sapply(1:nrow(A)) expression. Maybe in this
> case there is also a way to bypass the apply mechanism and
> my way of thinking about the problem has to be adapted. But
> as *apply calls abound in R, I feel this is a standard
> way of dealing with vectors and matrices.







[R] Odp: help with counting how many times each value occur in each column

2007-08-10 Thread Petr PIKAL
Hi

> mat <- sample(c(-50,0,-100), 100, replace=T)
> dim(mat) <- c(10,10)
> mat
      [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
 [1,]    0    0    0    0  -50    0    0    0    0     0
 [2,] -100 -100  -50  -50    0    0 -100  -50 -100   -50
 [3,]    0  -50 -100 -100    0  -50 -100    0    0  -100
 [4,]    0 -100    0  -50 -100 -100  -50  -50    0  -100
 [5,]  -50  -50    0    0    0 -100 -100 -100    0  -100
 [6,]    0    0  -50  -50    0    0 -100 -100  -50  -100
 [7,] -100 -100 -100  -50 -100    0 -100 -100    0  -100
 [8,] -100    0    0    0    0 -100    0 -100    0  -100
 [9,] -100    0  -50 -100  -50    0    0  -50    0  -100
[10,]  -50 -100    0    0  -50  -50  -50  -50 -100  -100

> apply(mat, 2, table)
     [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
-100    4    4    2    2    2    3    5    4    2     8
-50     2    2    3    4    3    2    2    4    1     1
0       4    4    5    4    5    5    3    2    7     1

Transposing and ordering columns is up to you.
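
The transposing and column ordering mentioned above might look like the following sketch. The seed and the object names counts and res are made up for the illustration. Note that apply() only returns a matrix here because every column happens to contain all three values; with 100 rows per column that is practically certain, but it is an assumption about the data.

```r
set.seed(42)                                   # arbitrary seed
mat <- matrix(sample(c(-50, 0, -100), 1000, replace = TRUE), nrow = 100)

counts <- apply(mat, 2, table)                 # 3 x 10: rows are the values
res <- t(counts)[, c("-100", "0", "-50")]      # one row per column of mat,
res                                            # columns in the requested order
```

Indexing by the character names "-100", "0", "-50" avoids relying on the sort order that table() uses for its rows.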

Regards
Petr

[EMAIL PROTECTED] wrote on 10.08.2007 14:01:44:

> Dear list,
>   I have the following dataset and want to know how many times each
> value occurs in each column.
> > data
>       [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
>  [1,] -100 -100 -100    0    0    0    0    0    0  -100
>  [2,] -100 -100 -100 -100 -100 -100 -100 -100 -100  -100
>  [3,] -100 -100 -100 -100 -100 -100 -100 -100 -100  -100
>  [4,] -100 -100 -100 -100 -100 -100 -100 -100 -100  -100
>  [5,] -100 -100 -100 -100 -100 -100 -100 -100 -100   -50
>  [6,] -100 -100 -100 -100 -100 -100 -100 -100 -100  -100
>  [7,] -100 -100 -100 -100 -100 -100 -100 -100 -100  -100
>  [8,] -100 -100 -100 -100 -100 -100 -100 -100 -100  -100
>  [9,] -100 -100 -100 -100 -100 -100 -100 -100 -100  -100
> [10,] -100 -100 -100  -50 -100 -100 -100 -100 -100  -100
> [11,] -100 -100 -100 -100 -100 -100 -100 -100 -100  -100
> [12,] -100 -100 -100 -100 -100 -100 -100 -100 -100  -100
> [13,] -100 -100 -100 -100 -100 -100 -100 -100 -100  -100
> [14,] -100 -100 -100 -100 -100 -100 -100 -100 -100  -100
> [15,] -100 -100 -100 -100 -100 -100 -100 -100 -100  -100
> [16,] -100 -100 -100 -100 -100 -100 -100 -100 -100  -100
> [17,] -100 -100 -100 -100 -100 -100 -100 -100 -100  -100
> [18,] -100 -100 -100 -100 -100 -100 -100 -100 -100  -100
> [19,] -100 -100 -100    0    0    0    0    0    0  -100
> [20,] -100 -100 -100 -100 -100 -100 -100 -100 -100  -100
>   The result matrix should look like
>       -100 0 -50
>  [1]   20
>  [2]   20
>  [3]   20
>  [4]   17
>  [5]   18
>  [6]   18
>  [7]   18  and so on
>  [8]
>  [9]
>  [10]
>
> How can I do this in R?
>   Thanks a lot for your help,
> Tom
 
 


Re: [R] Cleaning up the memory

2007-08-10 Thread Monica Pisica

Thanks! I will look into ...

I have 4 GB RAM, and I was monitoring the memory with Windows Task Manager, so
I could see R's memory allocation grow from less than 100 Mb
to more than 1500 Mb.

My initial tables are between 30 and 80 Mb, and the resulting tables, which
incorporate the initial tables plus the PCA and kmeans results, are between 50
and 200 Mb or thereabouts!

And yes, I don't really care about memory allocation in detail - what I want is
to free that memory after every cycle ;-)

However, after I had left R idle for more than 30 min.,
the memory allocation according to Task Manager dropped to 15 Mb, which is
good - but I cannot wait half an hour between cycles.

Again thanks,
 
Monica

> Date: Fri, 10 Aug 2007 18:28:07 +0100
> From: [EMAIL PROTECTED]
> To: [EMAIL PROTECTED]
> CC: r-help@stat.math.ethz.ch
> Subject: Re: [R] Cleaning up the memory
>
> On Fri, 10 Aug 2007, Monica Pisica wrote:
>
> > Hi,
> >
> > I have 4 huge tables on which i want to do a PCA analysis and a kmean
> > clustering. If i run each table individually i have no problems, but if
> > i want to run it in a for loop i exceed the memory allocation after the
> > second table, even if i save the results as a csv table and i clean up
> > all the big objects with rm command. To me it seems that even if i don't
> > have the objects anymore, the memory these objects used to occupy is not
> > cleared. Is there any way to clear up the memory as well? I don't want
> > to close R and start it up again. Also i am running R under Windows.
>
> See ?gc, which does the clearing.
>
> However, unless you study the memory allocation in detail (which you
> cannot do from R code), you don't actually know that this is the
> problem.  More likely is that you have fragmentation of your 32-bit
> address space:  see ?Memory-limits.
>
> Without any idea what memory you have and what 'huge' means, we can only
> make wild guesses. It might be worth raising the memory limit (the
> --max-mem-size flag).
>
> > thanks,
> >
> > Monica
>
> --
> Brian D. Ripley, [EMAIL PROTECTED]
> Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/
> University of Oxford, Tel: +44 1865 272861 (self)
> 1 South Parks Road, +44 1865 272866 (PA)
> Oxford OX1 3TG, UK Fax: +44 1865 272595
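
To make the rm() plus gc() advice concrete, here is a minimal sketch of such a loop. Everything specific in it is an assumption for illustration: the simulated tables stand in for the four large ones, the number of components and clusters is arbitrary, and the output goes to temporary files.

```r
# Toy stand-ins for the four large tables (sizes are arbitrary assumptions)
set.seed(1)
files <- character(4)
for (i in 1:4) {
  tab <- as.data.frame(matrix(rnorm(5000 * 8), ncol = 8))

  pca <- prcomp(tab, scale. = TRUE)          # PCA
  km  <- kmeans(pca$x[, 1:3], centers = 4)   # k-means on the first 3 components

  out <- cbind(tab, pca$x, cluster = km$cluster)
  files[i] <- tempfile(fileext = ".csv")     # hypothetical output location
  write.csv(out, files[i], row.names = FALSE)

  rm(tab, pca, km, out)   # drop the references to the big objects ...
  invisible(gc())         # ... then ask R to run a garbage collection now
}
file.exists(files)
```

As the reply points out, this only helps if unreleased objects are really the problem; address-space fragmentation would not be cured by gc().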


Re: [R] Help wit matrices

2007-08-10 Thread Lanre Okusanya
That was ridiculously simple. Duh.

Thanks

Lanre

On 8/10/07, jim holtman [EMAIL PROTECTED] wrote:
> Is this what you want:
>
> > x <- matrix(runif(100), 10)
> > round(x, 3)
>        [,1]  [,2]  [,3]  [,4]  [,5]  [,6]  [,7]  [,8]  [,9] [,10]
>  [1,] 0.268 0.961 0.262 0.347 0.306 0.762 0.524 0.062 0.028 0.226
>  [2,] 0.219 0.100 0.165 0.131 0.578 0.933 0.317 0.109 0.527 0.131
>  [3,] 0.517 0.763 0.322 0.374 0.910 0.471 0.278 0.382 0.880 0.982
>  [4,] 0.269 0.948 0.510 0.631 0.143 0.604 0.788 0.169 0.373 0.327
>  [5,] 0.181 0.819 0.924 0.390 0.415 0.485 0.702 0.299 0.048 0.507
>  [6,] 0.519 0.308 0.511 0.690 0.211 0.109 0.165 0.192 0.139 0.681
>  [7,] 0.563 0.650 0.258 0.689 0.429 0.248 0.064 0.257 0.321 0.099
>  [8,] 0.129 0.953 0.046 0.555 0.133 0.499 0.755 0.181 0.155 0.119
>  [9,] 0.256 0.954 0.418 0.430 0.460 0.373 0.620 0.477 0.132 0.050
> [10,] 0.718 0.340 0.854 0.453 0.943 0.935 0.170 0.771 0.221 0.929
> > ifelse(x > .5, 1, 0)
>       [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
>  [1,]    0    1    0    0    0    1    1    0    0     0
>  [2,]    0    0    0    0    1    1    0    0    1     0
>  [3,]    1    1    0    0    1    0    0    0    1     1
>  [4,]    0    1    1    1    0    1    1    0    0     0
>  [5,]    0    1    1    0    0    0    1    0    0     1
>  [6,]    1    0    1    1    0    0    0    0    0     1
>  [7,]    1    1    0    1    0    0    0    0    0     0
>  [8,]    0    1    0    1    0    0    1    0    0     0
>  [9,]    0    1    0    0    0    0    1    0    0     0
> [10,]    1    0    1    0    1    1    0    1    0     1
>
>
> On 8/10/07, Lanre Okusanya [EMAIL PROTECTED] wrote:
> > Hello all,
> >
> > I am working with a 1000x1000 matrix, and I would like to return a
> > 1000x1000 matrix that tells me which values in the matrix are greater
> > than a threshold value (1 or 0 indicator).
> > I have tried
> >   mat2 <- as.matrix(as.numeric(mat1 > 0.25))
> > but that returns a 1:10 matrix.
> > I have also tried for loops, but they are grossly inefficient.
> >
> > Thanks for all your help in advance.
> >
> > Lanre
>
> --
> Jim Holtman
> Cincinnati, OH
> +1 513 646 9390
>
> What is the problem you are trying to solve?




[R] Help wit matrices

2007-08-10 Thread Lanre Okusanya
Hello all,

I am working with a 1000x1000 matrix, and I would like to return a
1000x1000 matrix that tells me which values in the matrix are greater
than a threshold value (1 or 0 indicator).
I have tried
  mat2 <- as.matrix(as.numeric(mat1 > 0.25))
but that returns a 1:10 matrix.
I have also tried for loops, but they are grossly inefficient.

Thanks for all your help in advance.

Lanre
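
A likely reason the attempt above loses the matrix shape is that as.numeric() drops the dim attribute, so as.matrix() only sees a plain vector and builds a one-column matrix from it. A minimal sketch of some ways that keep the dimensions follows; the random matrix mat1 and the threshold thr are stand-ins assumed for the example.

```r
set.seed(1)
mat1 <- matrix(runif(1000 * 1000), nrow = 1000)   # assumed example data
thr  <- 0.25                                      # assumed threshold

ind1 <- (mat1 > thr) * 1           # logical matrix times 1 gives a 0/1 matrix
ind2 <- ifelse(mat1 > thr, 1, 0)   # same result via ifelse()

ind3 <- mat1 > thr
storage.mode(ind3) <- "double"     # convert the type in place, keeping dim

dim(as.matrix(as.numeric(mat1 > thr)))   # a single column: the shape is lost
identical(ind1, ind2) && identical(ind1, ind3)
```

All three variants are vectorized, so no explicit loop over rows or columns is needed.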



Re: [R] Seasonality

2007-08-10 Thread Roland Rau
Alberto Monteiro wrote:
> I have a time series x = f(t), where t is taken for each
> month. What is the best function to detect if _x_ has a seasonal
> variation? If there is such seasonal effect, what is the
> best function to estimate it?

From my own experience, I have the impression that there is no such thing as
a best approach to estimating the seasonal component of time series data.

Maybe it is possible for you to simulate the assumed nature of your data
(variable trend? variable seasonal pattern? count data with
overdispersion? maybe a bimodal pattern every year?) and then try
several of these methods and check whether they can recover your input
approximately correctly?
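
As an illustration of that simulate-and-check idea, here is a minimal sketch. The data-generating process (linear trend, sinusoidal seasonal pattern, Gaussian noise) and the choice of stl() as the method under test are assumptions made for the example, not a recommendation of one method over another.

```r
set.seed(123)
n.years  <- 10
month    <- rep(1:12, n.years)
seasonal <- 2 * sin(2 * pi * month / 12)            # assumed true seasonal pattern
trend    <- seq(10, 15, length.out = 12 * n.years)  # assumed slow linear trend
x <- ts(trend + seasonal + rnorm(12 * n.years, sd = 0.5), frequency = 12)

fit <- stl(x, s.window = "periodic")                # one of several possible methods
est <- fit$time.series[, "seasonal"]

cor(as.numeric(est), seasonal)   # near 1 if the seasonal input is recovered
```

Repeating this with a simulated pattern closer to the real data (for example counts, or a seasonal shape that changes over time) would show where a given method starts to fail.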

Best,
Roland



Re: [R] Help wit matrices

2007-08-10 Thread Ted Harding

On 10-Aug-07 18:05:50, Lanre Okusanya wrote:
> Hello all,
>
> I am working with a 1000x1000 matrix, and I would like to return a
> 1000x1000 matrix that tells me which values in the matrix are greater
> than a threshold value (1 or 0 indicator).
> I have tried
>   mat2 <- as.matrix(as.numeric(mat1 > 0.25))
> but that returns a 1:10 matrix.
> I have also tried for loops, but they are grossly inefficient.
>
> Thanks for all your help in advance.
>
> Lanre

Simple-minded, but:

> S<-matrix(rnorm(25),nrow=5)
> S
           [,1]        [,2]       [,3]       [,4]       [,5]
[1,] -0.9283624 -0.44418487  1.1174555  1.9040999 -0.4675796
[2,]  0.2658770 -0.28492642 -1.2271013 -0.5713291  1.8036235
[3,]  0.7010885 -0.42972262  0.7576021  0.3407972 -1.0628487
[4,] -0.2003087  0.87006841  0.6233792 -0.9974902 -0.9104270
[5,]  0.2729014  0.09781886 -1.0004486  1.5987385 -0.4747125
> T<-0*S
> T[S>0.25] <- 1+0*S[S>0.25]
> T
     [,1] [,2] [,3] [,4] [,5]
[1,]    0    0    1    1    0
[2,]    1    0    0    0    1
[3,]    1    0    1    1    0
[4,]    0    1    1    0    0
[5,]    1    0    0    1    0

Does this work OK for your big matrix?

HTH
Ted.


E-Mail: (Ted Harding) [EMAIL PROTECTED]
Fax-to-email: +44 (0)870 094 0861
Date: 10-Aug-07   Time: 19:50:37
-- XFMail --



Re: [R] help with counting how many times each value occur in eachcolumn

2007-08-10 Thread Gasper Cankar

Tom,

If all values (-100,0,-50) were present in every column, then simply

apply(data,2,table)

would work. Even if not all values occur in every column, you could
correct for that by appending additional rows that contain every value in
every column, like

data <- rbind(data,matrix(ncol=10,nrow=3,rep(c(-100,0,-50),10)))

and then do

apply(data,2,table)-1

to get correct results. But someone on the list can probably come up with a
much more elegant solution.
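
One possibly tidier way to get the zero counts, without padding the data and subtracting 1, is to tabulate each column as a factor with the full set of levels. This is only a sketch: the matrix dat is rebuilt here to resemble the posted data, and the names dat, vals and res are made up for the example.

```r
vals <- c(-100, 0, -50)                  # all values that may occur, in output order
dat <- matrix(-100, nrow = 20, ncol = 10)
dat[c(1, 19), 4:9] <- 0                  # toy data resembling the posted matrix
dat[10, 4] <- -50
dat[5, 10] <- -50

# table() on a factor reports every level, including those with count 0
res <- t(apply(dat, 2, function(x) table(factor(x, levels = vals))))
res
```

Because every column yields exactly three counts, apply() always returns a matrix here, whichever values a column happens to contain.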

Bye,

Gasper Cankar, PhD
Researcher
National Examinations Centre
Slovenia

-----Original Message-----
From: Tom Cohen [mailto:[EMAIL PROTECTED]
Sent: Friday, August 10, 2007 2:02 PM
To: r-help@stat.math.ethz.ch
Subject: [R] help with counting how many times each value occur in
eachcolumn

Dear list,
  I have the following dataset and want to know how many times each value
occurs in each column.
> data
      [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
 [1,] -100 -100 -100    0    0    0    0    0    0  -100
 [2,] -100 -100 -100 -100 -100 -100 -100 -100 -100  -100
 [3,] -100 -100 -100 -100 -100 -100 -100 -100 -100  -100
 [4,] -100 -100 -100 -100 -100 -100 -100 -100 -100  -100
 [5,] -100 -100 -100 -100 -100 -100 -100 -100 -100   -50
 [6,] -100 -100 -100 -100 -100 -100 -100 -100 -100  -100
 [7,] -100 -100 -100 -100 -100 -100 -100 -100 -100  -100
 [8,] -100 -100 -100 -100 -100 -100 -100 -100 -100  -100
 [9,] -100 -100 -100 -100 -100 -100 -100 -100 -100  -100
[10,] -100 -100 -100  -50 -100 -100 -100 -100 -100  -100
[11,] -100 -100 -100 -100 -100 -100 -100 -100 -100  -100
[12,] -100 -100 -100 -100 -100 -100 -100 -100 -100  -100
[13,] -100 -100 -100 -100 -100 -100 -100 -100 -100  -100
[14,] -100 -100 -100 -100 -100 -100 -100 -100 -100  -100
[15,] -100 -100 -100 -100 -100 -100 -100 -100 -100  -100
[16,] -100 -100 -100 -100 -100 -100 -100 -100 -100  -100
[17,] -100 -100 -100 -100 -100 -100 -100 -100 -100  -100
[18,] -100 -100 -100 -100 -100 -100 -100 -100 -100  -100
[19,] -100 -100 -100    0    0    0    0    0    0  -100
[20,] -100 -100 -100 -100 -100 -100 -100 -100 -100  -100
  The result matrix should look like
      -100 0 -50
 [1]   20
 [2]   20
 [3]   20
 [4]   17
 [5]   18
 [6]   18
 [7]   18  and so on
 [8]
 [9]
 [10]

How can I do this in R?
  Thanks a lot for your help,
Tom

   


[R] QUESTION ON R!!!!!!!!!!!1

2007-08-10 Thread lecastil
Good day. I work at a public entity that handles millions of data points
and records on several variables, distributed across several topics. For
our statistical analyses we use a statistical package that allows us to
pull the data directly from the database (ORACLE) and carry out the
statistical analysis. Everything is done quickly.

However, we would like to know whether the statistical package R can do
the same as the package we currently work with; that is, can R import
data from a database of millions of records at maximum speed and then
carry out statistical analysis on the imported data?

I would be grateful for a prompt response, as it is very urgent for us
to know this.

Luis Eduardo Castillo Méndez.



Re: [R] R-excel

2007-08-10 Thread Peter Wickham

I am running R 2.5.1 on Mac OS X 10.4.10. xlsReadWrite is a Windows
binary. Instead, install and load the packages (1) gtools and (2) gdata.
Both are available as Windows and Mac binaries. gdata depends on gtools, so
be sure to load gtools first or set the installation's dependencies
parameter. Then you can use read.xls. Thus, on a Mac:
  data <- read.xls("/Users/your name/Documents/data.xls", sheet=1)
For Windows, substitute the appropriate file path and file name in the first
argument of read.xls, e.g.
  data <- read.xls("A:/filename.xls", sheet=1)
Thanks to correspondents for their advice; I hope that this may alleviate
some of the frustration (referred to in the R Import/Export Manual)
associated with dealing with EXCEL files in R.

Erika Frigo wrote:
>
> Good morning to everybody,
> I have a problem : how can I import excel files in R???
>
> thank you very much
>
>
> Dr.sa. Erika Frigo
> Università degli Studi di Milano
> Facoltà di Medicina Veterinaria
> Dipartimento di Scienze e Tecnologie Veterinarie per la Sicurezza
> Alimentare (VSA)
>
> Via Grasselli, 7
> 20137 Milano
> Tel. 02/50318515
> Fax 02/50318501
