Re: [R] Monotone splines

2007-09-06 Thread Stephen Ellner
Servien remi ([EMAIL PROTECTED]) wrote: 
I want to use splines to estimate a function, but I want to force the 
interpolation to be monotone. Is this possible with R?

There are a few options. You can use mono.con in mgcv (see the example code in 
?pcls), or smooth.monotone in fda. In both of these you have to specify the 
smoothing parameter yourself. I've written (with some help from Simon Wood) a 
set of scripts that fit a monotone nondecreasing regression spline with the 
smoothing parameter chosen by GCV, taking account of the active constraints, 
following the approach developed in 

Wood, S.N. (1994) Monotonic smoothing splines fitted by cross validation. SIAM 
Journal on Scientific Computing 15:1126-1133.  

You can get them at  www.eeb.cornell.edu/ellner/software.html; scroll down and 
look for Gradfit. 
The function you need is gcv.rssM. It isn't very efficient, so you may have 
trouble if your data set is large. 
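
For pure interpolation (no smoothing), a minimal sketch in base R may also be enough. It assumes a recent version of R in which splinefun() accepts method = "monoH.FC"; it is not the Gradfit code or the mgcv approach mentioned above, and the data are made up for illustration.

```r
# Monotone (nondecreasing) data; values are invented for this sketch.
x <- 1:10
y <- c(0, 0.2, 0.3, 1.5, 1.5, 2.0, 4.0, 4.1, 4.1, 6.0)

# Fritsch-Carlson monotone Hermite interpolant from the stats package.
f_mono <- splinefun(x, y, method = "monoH.FC")

# Evaluate on a fine grid and check that the curve never decreases.
xx <- seq(1, 10, length.out = 200)
mono_ok <- all(diff(f_mono(xx)) >= -1e-12)
print(mono_ok)
```

The interpolant passes through every data point, so it is only appropriate when the data themselves are monotone and essentially noise-free; noisy data call for the constrained smoothing approaches described above.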

Stephen P. Ellner ([EMAIL PROTECTED])
Department of Ecology and Evolutionary Biology
Corson Hall, Cornell University, Ithaca NY 14853-2701

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] The variables combined in a table from other table and combination questions

2007-09-06 Thread Stephen Weigand
On 9/5/07, [EMAIL PROTECTED] [EMAIL PROTECTED] wrote:
 Dear All:
 I need to have some data frame objects.
 First aa object:
      pH   Formulation  time  Subject
 [1]  1.2  F            0     1
 [2]  7.4  S            1     2
 [3]       MF           2     3
 [4]                    3     4
 [5]                    n     i
 Then, I need to produce 2*3(pH*formulation) different
 tables.  This table includes column of (pH,
 Formulation, time  S1  S2  S3 …..Si) and S1= subject
 1, S2=subject 2 and so on.  For example: bb1 table
      pH   Formulation  time  S1  S2  S3 ... Si
 [1]  1.2  F            0
 [2]                    1
 [3]                    2
 [4]                    3
 [5]                    n

 For example: bb2 table
      pH   Formulation  time  S1  S2  S3 ... Si
 [1]  1.2  S            0
 [2]                    1
 [3]                    2
 [4]                    3
 [5]                    n


 Moreover, the values of pH and Formulation column are
 the combination questions.  The values of pH and
 Formulation column should be the combinations such as
 (1.2, F), (1.2, S), (1.2, MF), (7.4, F), (7.4, S),
 (7.4, MF)
 I am a beginner in R and I have no idea how to
 do this. Could anyone please help me?  Thanks a
 lot!!!

 Best regards
 Hsin Ya Lee


I don't understand exactly what you want but perhaps start with this:

expand.grid(pH = c(1.2, 7.4), Formulation = c("F", "S", "MF"))
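
If the goal is one table per (pH, Formulation) pair with a column per subject, something along these lines might work. This is only a sketch under assumptions: the long-format layout, the column name conc, and the placeholder values are all hypothetical, since the original post does not show the actual measurements.

```r
# Hypothetical long-format data: one row per pH x Formulation x time x Subject.
aa <- expand.grid(pH = c(1.2, 7.4),
                  Formulation = c("F", "S", "MF"),
                  time = 0:3,
                  Subject = 1:4,
                  stringsAsFactors = FALSE)
aa$conc <- seq_len(nrow(aa))   # placeholder measurements

# Spread subjects into columns (conc.1, conc.2, ...), one row per time point.
wide <- reshape(aa, idvar = c("pH", "Formulation", "time"),
                timevar = "Subject", v.names = "conc", direction = "wide")

# Split into 2 * 3 = 6 tables, one per (pH, Formulation) combination.
bb <- split(wide, list(wide$pH, wide$Formulation), drop = TRUE)
print(names(bb))
print(bb[[1]])
```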

Hope this helps,

Stephen

-- 
Rochester, Minn. USA



Re: [R] subscript out of bounds Error in predict.naivebayes

2007-08-28 Thread Stephen Weigand
On 8/22/07, Polly He [EMAIL PROTECTED] wrote:
 I'm trying to fit a naive Bayes model and predict on a new data set using
 the functions naivebayes and predict (package = e1071).

 R version 2.5.1 on a Linux machine

 My data set looks like this. class is the response and k1 - k3 are the
 independent variables. All of them are factors. The response has 52 levels
 and k1 - k3 have 2-6 levels. I have about 9,300 independent variables but
 omit the long list here for simple demonstration. There are no missing
 values in the observations.

class k1 k2 k3
   1  0  0  1
   8  0  0  0

 # model fitting, I also tried setting laplace=0 but didn't help
  nbmodel <- naiveBayes(class~., data=train, laplace=1)

 # predict
  nb.fit <- predict(nbmodel, x.test[,-1])

 First I had no trouble fitting the model. R also returned the predictions
 for some of my large data sets. But for some data sets, R can fit the model
 (no error message, nb.model$tables look ok). When I invoked the predict
 function, it kept giving me the following message:

 # my data set has 1 response variable and 9318 independent variables
 Error in FUN(1:9319[[4L]], ...) : subscript out of bounds
[...]

In my experience, some predict methods have trouble when
newdata does not have all levels of a factor. This seems
to be the case with predict.naiveBayes:

example(naiveBayes)
predict(model, subset(HouseVotes84, V1 == "n"))

gives

Error in object$tables[[v]] : subscript out of bounds

One workaround is to predict for a bigger data set
and retain a subset of the predictions.

Hope this helps,

Stephen


-- 
Rochester, Minn. USA



Re: [R] How to merge string to DF

2007-08-24 Thread Stephen Tucker
This seems to work:

tmp <- aggregate(DF$y, list(DF$x, DF$f), mean)

tmp2 <- aggregate(DF$conc, list(DF$x, DF$f), paste, collapse=", ")
names(tmp2)[3] <- "var1"

final <- merge(tmp,tmp2)
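
A small self-contained check of the same idea, using a made-up four-row data frame (the values are invented so the result is easy to verify by eye):

```r
# Toy data: two groups of x within a single level of f.
DF <- data.frame(x = c(1, 1, 2, 2),
                 f = factor(c("lev1", "lev1", "lev1", "lev1")),
                 y = c(1, 3, 5, 7),
                 conc = c("d1 Sat", "d2 Sun", "d3 Mon", "d4 Tue"),
                 stringsAsFactors = FALSE)

tmp  <- aggregate(DF$y, list(DF$x, DF$f), mean)                        # group means
tmp2 <- aggregate(DF$conc, list(DF$x, DF$f), paste, collapse = ", ")   # pasted labels
names(tmp2)[3] <- "var1"   # rename so the value columns do not collide

# merge() joins on the shared columns Group.1 and Group.2.
final <- merge(tmp, tmp2)
print(final)
```

Renaming the third column of tmp2 matters: both aggregate() results call their value column x, and merge() would otherwise try to match on it as well.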


--- Lauri Nikkinen [EMAIL PROTECTED] wrote:

 #Hi R-users,
 #I have an example DF like this:
 
 y1 <- rnorm(10) + 6.8
 y2 <- rnorm(10) + (1:10*1.7 + 1)
 y3 <- rnorm(10) + (1:10*6.7 + 3.7)
 y <- c(y1,y2,y3)
 x <- rep(1:3,10)
 f <- gl(2,15, labels=paste("lev", 1:2, sep=""))
 g <- seq(as.Date("2000/1/1"), by="day", length=30)
 DF <- data.frame(x=x,y=y, f=f, g=g)
 DF$wdays <- weekdays(DF$g)
 DF$conc <- paste(DF$g, DF$wdays)
 DF
 
 #Now I calculate group means
 
 tmp <- aggregate(DF$y, list(DF$x, DF$f), mean)
 tmp
 
 #After doing this, I want to merge string from DF$conc to tmp using DF$x and
 #DF$y as an identifier
 #The following DF should look like this:
 
   Group.1 Group.2         x var1
 1       1    lev1  6.607869 2000-01-01 Saturday, 2000-01-04 Tuesday, 2000-01-07 Friday, 2000-01-10 Monday etc.
 2       2    lev1  6.598861 etc.
 3       3    lev1  7.469262
 4       1    lev2 27.488734
 5       2    lev2 33.164037
 6       3    lev2 34.466359
 
 #How do I do this?
 
 #Cheers,
 #Lauri
 


[R] Vectors in R (WAS Re: Does anyone.... worth a warning?!? No warning at all)

2007-08-21 Thread Stephen Tucker
 modifies the attributes, we can predict what
class of object is returned:

> f <- factor(letters[1:5])
> append(f,factor(letters[6:10]),2)
 [1] 1 2 1 2 3 4 5 3 4 5
> replace(f,c(FALSE,FALSE,TRUE,TRUE,FALSE),NA)
[1] a    b    <NA> <NA> e   
Levels: a b c d e

And for these cases also:
> append(factor(f,levels=letters[1:10]),
+        factor(letters[6:10],levels=letters[1:10]),2)
 [1]  1  2  6  7  8  9 10  3  4  5
> replace(factor(f,levels=letters[1:10]),c(FALSE,FALSE,TRUE,TRUE,FALSE),"g")
[1] a b g g e
Levels: a b c d e f g h i j

I understand S4 classes were introduced in S partly because in S3 the
class assignment doesn't necessarily raise an error if it isn't
consistent with the rest of the attributes, but then may yield
surprising results (or an error) when you pass that object to
functions that require access to those attributes.

I suppose the reason I'm throwing this out there is that for a while I
wasn't sure (1) which functions could be invoked on which object
classes and (2) what class of object each function returns (which
depends on the class of its argument) without reading the
documentation several times over. This also made explaining the
behavior of functions to colleagues and students learning R very tough
(clearly, my own shortcoming). But seeing everything as vectors
(again, in the sense of contiguous cells) with mutable attributes
made everything more transparent: if a specific method does
not exist for, say, a data.frame object, you can still call a
function on it if you treat the data frame as a heterogeneous vector
consisting of identical-length atomic vectors, and the structure of
the output is more predictable to me if I can figure out which
attributes are potentially modified in the returned object.
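
To make the "data frame as a list of equal-length vectors" view concrete, here is a minimal sketch with a made-up data frame. It only illustrates the point about attributes being dropped or kept; the object names are hypothetical.

```r
# A data frame is a list of equal-length columns plus a few attributes.
df <- data.frame(a = 1:3, b = c("x", "y", "z"), stringsAsFactors = FALSE)
print(is.list(df))        # it is a list underneath
print(length(df))         # number of columns, not rows
print(sapply(df, class))  # list-style iteration over the columns

# append() has no data.frame method; it falls back on c(), which drops
# the class and row.names attributes and returns a plain list.
lst <- append(df, list(c = c(TRUE, FALSE, TRUE)))
print(class(lst))

# Restoring the attributes turns the list back into a data frame.
df2 <- as.data.frame(lst)
print(class(df2))
```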

I wonder if anyone has additional thoughts on this.

Stephen

P.S. I agree that R/S does have its own peculiarities, but I think having
them is not unique to R at all! But then I suppose the question turns to
addressing  severity rather than the presence/absence of them...



--- [EMAIL PROTECTED] wrote:

 On 20-Aug-07 19:55:44, Rolf Turner wrote:
  On 20/08/2007, at 9:54 PM, Tom Willems wrote:
  dear Mathew
 
  mean is a Generic function
 
  mean(x...)
 
  in wich x is a data object, like a  data frame a list
  a numeric vector...
 
  so in your example it only reads the first character
  and then reports it.
 
  try x = c(1,1,2)
  mean(x)
  
  I think you've completely missed the point. I'm sure Mathew
  now understands the syntax of the mean function. His point
  was that it would be very easy for someone to use this
  function incorrectly --- and he indicated very clearly *why*,
  by giving an example using max().
  
  If mean() could be made safer to use by incorporating a warning,  
  without unduly adding to overheads, then it would seem sensible
  to incorporate such a warning.  Or to change the mean()
  function so that mean(1,2,3) returns ``2'' --- just as max 
  (1,2,3) returns ``3'' --- as Mathew *initially* (and quite
  reasonably) expected it to do.
  
  cheers,
  Rolf Turner
 
 I think Rolf makes a very important point. There are a lot of
 idiosyncrasies in R, which in time we get used to; but learning
 about them is something of a sociological exercise, just as
 one learns that when one's friend A says X Y Z it may not mean
 the same as when one's friend B says it.
 
 Another example is in the use of %*% for matrix multiplication
 when one or both of the factors is a vector. If you came to R
 from matlab/octave, where every vector is already either a row
 vector or a column vector, you knew where you stood. But in R
 the semantics of the syntax depend on the context in a more
 complicated way. In R, x<-c(-1,1) is called a vector, but it
 does not have dimensions:
 
 x<-c(-1,1)
 dim(x)
 NULL
 
 So its relationship to matrix multiplication is ambiguous.
 
 For example:
 
 M<-matrix(c(1,2,3,4),nrow=2); M
      [,1] [,2]
 [1,]    1    3
 [2,]    2    4
 
 x%*%M
      [,1] [,2]
 [1,]    1    1
 
 and x is now coerced into a row vector, which (for that
 immediate purpose) does have dimensions (just as a row vector
 would have in matlab/octave).
 
 Similarly,
 
 M%*%x
      [,1]
 [1,]    2
 [2,]    2
 
 coerces it into a column vector. But now (asks the beginner who
 has not yet got round to looking up ?%*%) what happens with x%*%x?
 
 Will we get column vector times row vector (a 2x2 matrix) or
 row times column (a scalar)? In fact we get the latter:
 
 x%*%x
      [,1]
 [1,]    2
 
 All this is in accordance with ?%*%:
 
 Description:
  Multiplies two matrices, if they are conformable. If one argument
  is a vector, it will be coerced to a either a row or column matrix
  to make the two arguments conformable. If both are vectors it will
  return the inner product.
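
The coercion rules quoted above can be checked directly. This is a small sketch using the same x and M as in the message; the variable names for the results are made up.

```r
x <- c(-1, 1)
M <- matrix(c(1, 2, 3, 4), nrow = 2)

row_times_M <- x %*% M   # x is treated as a 1 x 2 row vector
M_times_col <- M %*% x   # x is treated as a 2 x 1 column vector
inner       <- x %*% x   # both vectors: inner product, as a 1 x 1 matrix
outer_prod  <- x %o% x   # the 2 x 2 outer product has to be requested explicitly

print(dim(row_times_M))
print(dim(M_times_col))
print(inner)
print(outer_prod)
```

When the orientation matters, giving the vector explicit dimensions first (for example with rbind(x) or cbind(x)) removes the ambiguity.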
 
 
 But now suppose y<-c(1,2,3), with x<-c(-1,1) as before.
 
 x%*%y
 Error in x %*% y : non-conformable arguments
 
 because it is trying to make the inner product of vectors of unequal

Re: [R] Stacked Bar

2007-08-21 Thread Stephen Tucker
I think you want to use the 'density' argument. For example:

barplot(1:5,col=1)
legend("topleft",fill=1,legend="text",cex=1.2)
par(new=TRUE)
barplot(1:5,density=5,col=2)
legend("topleft",fill=2,density=20,legend="text",bty="n",cex=1.2)

(if you wanted to overlay solid colors with hatching)
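
For the stacked 15-row case from the original question, a base-graphics sketch along the same lines might look like this. It assumes the same 15 x 5 matrix used elsewhere in the thread; the hatching density, the extra head room for the legend, and the choice of five legend columns are arbitrary.

```r
x <- matrix(1:75, ncol = 5,
            dimnames = list(paste("Row", 1:15, sep = ""),
                            paste("Method", 1:5, sep = "")))
cols <- rainbow(nrow(x))

# Solid stacked bars, with head room above the tallest bar for the legend.
mids <- barplot(x, col = cols, ylim = c(0, 1.5 * max(colSums(x))))

# Overlay hatching on the same bars without redrawing the axes.
barplot(x, density = 10, col = "black", add = TRUE,
        axes = FALSE, axisnames = FALSE)

# ncol lays the legend entries out in several columns.
legend("top", legend = rownames(x), fill = cols, ncol = 5, cex = 0.8)
```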

Here's the lattice alternative of the bar graph, though the help page says
'density' is currently unimplemented (Package lattice version 0.16-2). To get
the legend into columns, I followed the suggestion described here:
http://tolstoy.newcastle.edu.au/R/help/05/04/2529.html

Essentially I use mapply() and the line following to create a list with
alternating 'text' and 'rect' arguments (3 times to get 3 columns).
===
library(lattice)

x <- matrix(1:75, ncol= 5)
dimnames(x)[[2]] <- paste("Method", 1:5, sep="")
dimnames(x)[[1]] <- paste("Row", 1:15, sep="")

u <- mapply(function(x,y) list(text=list(lab=x),rect=list(col=y)),
            x = as.data.frame(matrix(levels(as.data.frame.table(x)$Var1),
              ncol=3)),
            y = as.data.frame(matrix(rainbow(nrow(x)),
              ncol=3)),
            SIMPLIFY=FALSE)
key <- c(rep=FALSE,space="bottom",unlist(`names<-`(u,NULL),rec=FALSE))

barchart(Freq ~ Var2,
         data = as.data.frame.table(x),
         groups = Var1, stack = TRUE,
         col=rainbow(nrow(x)),density=5,
         key = key )
===
(I often use tim.colors() in the 'fields' package, if you wanted other ideas
for color schemes).



--- Deb Midya [EMAIL PROTECTED] wrote:

 Jim,

   Thanks for such a quick response. It works well. Is it possible to fill
 the bars with patterns and colours?

   Regards,

   Deb
 
 Jim Lemon [EMAIL PROTECTED] wrote:
   Deb Midya wrote:
  Hi R Users!
  
  Thanks in advance.
  
  I am using R-2.5.1 on Windows XP.
  
  I am trying to do a stacked bar plot, but could not get through the
 following problem. The code is given below.
  
  1. How can I provide 15 different colors for each method with 15 Rows?
  
  2. How can I put the legend in a particular position (eg., in the top or
 bottom or right or left)? How can I put legend using a number of rows (eg.,
 using two or three rows)? 
  
 Hi Deb,
 As you have probably noticed, the integer coded colors repeat too 
 quickly for the number of colors you want. You can use the rainbow()
 function to generate colors like this:
 
 barplot(x,beside=FALSE,col=rainbow(nrow(x)))
 
 or there are lots of other color generating functions in the grDevices 
 or plotrix packages. Here's how to get your legend in an empty space for 
 your plot. There is also an emptyspace() function in the plotrix package 
 that tries to find the biggest empty space in a plot, although it 
 probably wouldn't work in this case.
 
 legend(0,1000,rownames(x),fill=rainbow(nrow(x)))
 
 Jim
 
 
 




   




[R] open/execute/call/run an external file

2007-08-21 Thread STEPHEN M POWERS
I'm trying to figure out how to trigger a process from within R. I have an 
executable file that runs a Fortran model, but ideally I would like to run it 
from R. Note that I'm not talking about importing the function at all, passing 
variables, or anything complicated like that. I basically just want a script 
that double-clicks on a particular file and opens/runs it for me.

The idea here is that the executable Fortran file, when double clicked, simply 
draws all necessary inputs from text files within the same directory and I have 
no need to change this. So I've used R to summarize some raw data and format 
these required text input files in the way the Fortran executable requires, and 
also have scripts to interpret the Fortran text file outputs and summarize/plot 
them in R. The problem is I must run the first part of the R script to send 
data from R to the model, then double click the Fortran executable, then run 
the second part of the R script to get the model outputs into R, in three 
separate steps. Given that I may be doing this hundreds of times, I'd prefer 
to do it all in one step.

Any thoughts?---steve
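
One way the three steps could be chained is with system2() (or system()), which runs an external program and waits for it to finish. The following is only a sketch under assumptions: Rscript stands in for the Fortran executable so the example can run anywhere, and the file names input.txt and output.txt are hypothetical.

```r
# Stand-in for the model executable (an assumption for this sketch):
# Rscript, which ships with R, plays the role of the Fortran program.
exe <- file.path(R.home("bin"), "Rscript")

workdir <- tempfile("model_run_")
dir.create(workdir)

# Step 1: write the input file the "model" expects (hypothetical file name).
writeLines(c("1", "2", "3"), file.path(workdir, "input.txt"))

# Step 2: run the executable from inside its directory and wait for it.
model_code <- 'writeLines(as.character(sum(scan("input.txt", quiet = TRUE))), "output.txt")'
old_wd <- setwd(workdir)
status <- system2(exe, args = c("--vanilla", "-e", shQuote(model_code)))
setwd(old_wd)

# Step 3: read the model output back into R.
result <- as.numeric(readLines(file.path(workdir, "output.txt")))
print(result)
```

With a real model, exe would be the path to the executable itself and no arguments would be needed if it reads everything from files in its own directory; the setwd() call is what makes "the same directory" true for the child process.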



Re: [R] polar.plot orientation and scale in plotrix

2007-08-20 Thread Stephen Tucker
I think that's the standard presentation for polar plots (theta measured from
positive x-axis) - that I've seen, anyway. But for customization you can
shift your origin for theta and define your own labels. For example, here is
a modification to the example in the help page for polar.plot():

library(plotrix)

testlen<-c(rnorm(36)*2+5)
testpos<-seq(0,350,by=10)
polar.plot(testlen,360-(testpos+90),
           main="Test Polar Plot",lwd=3,line.col=4,
           labels=seq(0,359,by=45)[c(3:1,8:4)],
           label.pos=seq(0,359,by=45))
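
As a point of reference, the conventional conversion between a compass bearing (clockwise from the top) and a plotting angle (counter-clockwise from the positive x-axis) can be written as a one-line helper. This is a sketch independent of plotrix; the function name is made up, and it is worth checking the result against the labels of whatever plot you draw.

```r
# Convert bearings (0 = top, increasing clockwise) to plotting angles
# (0 = positive x-axis, increasing counter-clockwise), both in degrees.
compass_to_plot_angle <- function(bearing) (90 - bearing) %% 360

angles <- compass_to_plot_angle(c(0, 90, 180, 270))
print(angles)
```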





--- Tim Sippel [EMAIL PROTECTED] wrote:

 Hello all-

 I would like to orient my polar.plot (from package plotrix) so that the
 circular scale runs clockwise and the origin (ie. 0 degrees) starts at the
 top of the plot.  The defaults of running the scale counter-clockwise and
 beginning with 90 degrees at the top of the graph seem counter-intuitive
 to me.

 I'm using R 2.5.0, and plotrix version 2.2-4.

 Many thanks,

 Tim
 
  
 
 


Re: [R] converting character string to an expression

2007-08-08 Thread Stephen Tucker

I think you're looking for

parse(text=paste(letters[1:3], collapse="+"))
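
A short sketch of what parse() returns and how it can be used; the values supplied for a, b and c are made up for the illustration.

```r
ex <- parse(text = paste(letters[1:3], collapse = "+"))
print(class(ex))   # an object of mode "expression"
print(ex[[1]])     # the unevaluated call a + b + c

# Evaluate the expression with values supplied in a list.
value <- eval(ex, list(a = 1, b = 2, c = 3))
print(value)
```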

--- Jarrod Hadfield [EMAIL PROTECTED] wrote:

 Hi Everyone,
 
 I would simply like to coerce a character string into an expression:  
 something like:
 
 as.expression(paste(letters[1:3], collapse="+"))
 
 but I can't seem to get rid of the quotes.  The only way I can get it  
 to work is using as.formula:
 
 as.expression(as.formula(paste("~", paste(letters[1:3], collapse="+"))))
 
 but this requires the expression to have a tilde, which it will not  
 always have.
 
 Thanks,
 
 Jarrod
 


Re: [R] Catch errors

2007-08-06 Thread Stephen Tucker
?try

or

?tryCatch
http://www.maths.lth.se/help/R/ExceptionHandlingInR/

for example...

tryCatch(lme(Y ~ X1*X2, random = ~1|subj, Model[i]),
 error=function(err) return(0))

(you can do something with 'err' or just return 0 as above)
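
The same pattern inside a loop, sketched with a stand-in for the model fit so it runs without nlme. The function fit_one and the three data sets are hypothetical; the all-zero case is made to fail on purpose.

```r
# Stand-in for a model fit that fails on degenerate (all-zero) data.
fit_one <- function(d) {
  if (all(d == 0)) stop("all-zero data")
  mean(d)
}

datasets <- list(c(1, 2, 3), c(0, 0, 0), c(4, 6))

# Failed fits are replaced by 0; nothing is printed for the error.
results <- sapply(datasets, function(d) {
  tryCatch(fit_one(d), error = function(err) 0)
})
print(results)
```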

--- Gang Chen [EMAIL PROTECTED] wrote:

 I run a linear mixed-effects model in a loop
 
 for (i in 1:N) {
 fit.lme <- lme(Y ~ X1*X2, random = ~1|subj, Model[i]);
 }
 
 As the data in some iterations are all (or most) 0's, leading to the  
 following error message from lme:
 
 Error in chol((value + t(value))/2) : the leading minor of order 1 is  
 not positive definite
 
 What is a good way to catch the error without spilling on the screen  
 so that I can properly stuff the corresponding output with some  
 artificial value such as 0?
 
 Thanks,
 Gang
 


Re: [R] Catch errors

2007-08-06 Thread Stephen Tucker
That's because tag <- 1 is evaluated in a local environment (of the function);
once function(err) exits, the information is lost. Try R's <<- operator:

tag <- 0
tryCatch(fit.lme <- lme(Beta ~ Trust*Sex*Freq, random = ~1|Subj,  
         Model), error=function(err) tag <<- 1)
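
A runnable sketch of the scoping point, with stop() standing in for the failing lme() call. It also shows an alternative that avoids <<- by returning the condition object and testing for it afterwards; the variable names are made up.

```r
# <<- assigns to 'tag' in the enclosing environment, so the change
# survives after the handler returns.
tag <- 0
fit <- tryCatch(stop("simulated failure"),
                error = function(err) { tag <<- 1; NULL })
print(tag)

# Alternative: return the error itself and check its class.
fit2 <- tryCatch(stop("simulated failure"), error = function(err) err)
failed <- inherits(fit2, "error")
print(failed)
```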



--- Gang Chen [EMAIL PROTECTED] wrote:

 I wanted something like this:
 
  tag <- 0;
  tryCatch(fit.lme <- lme(Beta ~ Trust*Sex*Freq, random = ~1|Subj,  
 Model), error=function(err) tag <- 1);
 
 but it seems not working because 'tag' does not get value of 1 when  
 error occurs. How can I make it work?
 
 Thanks,
 Gang
 
 


Re: [R] About grep

2007-08-06 Thread Stephen Tucker
try

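grep(paste("^",b[2],"$",sep=""),a)


Your version treats b[2] as part of the pattern (the letter b followed by
the character class [2]), so it will match "b2":

> grep("^b[2]$",c("b","b2","b3"))
[1] 2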
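
Putting the pieces together with the vectors from the question; the result names are made up, and the exact-comparison line is an alternative that sidesteps regular expressions altogether.

```r
a <- c("aa", "aba", "abac")
b <- c("ab", "aba")

# Build the pattern from the value of b[2] rather than the text "b[2]".
pattern <- paste("^", b[2], "$", sep = "")
idx_regex <- grep(pattern, a)

# For whole-string matches, plain comparison does the same job.
idx_exact <- which(a == b[2])

# The literal pattern from the question finds nothing in 'a'.
idx_literal <- grep("^b[2]$", a)

print(pattern)
print(idx_regex)
print(idx_exact)
print(idx_literal)
```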


--- Shao [EMAIL PROTECTED] wrote:

 Hi,everyone.
 
 I have a problem when using the grep.
 for example:
 a <- c("aa","aba","abac")
 b <- c("ab","aba")
 
 I want to match the whole word, so
 grep("^aba$",a)
 it returns 2
 
 but when I used it in a more useful way:
 grep("^b[2]$",a),
 it doesn't work at all, it can't find it, returning integer(0).
 
 How can I change the format in the second way?
 
 Thanks.
 
 -- 
 Shao
 


Re: [R] using loops to create multiple images

2007-08-05 Thread Stephen Tucker
I'm not sure exactly what 'results' is doing there, or what
'barplot(table(i),...)' is meant to do [see ?table],

but I think this is roughly what you want to do:

## Variable assignment
G01_01 <- 1:10
G01_02 <- 2:6

## Combine to list**
varnames <- paste("G01_",substring(100+1:2,2),sep="")
vars <- lapply(`names<-`(as.list(varnames),varnames),
               function(x) eval(parse(text=x)))
print(vars)

## Plotting
for( i in 1:length(vars) ) {
  filenm <- paste("/my/dir/barplot",i,".png",sep="")
  barplot(...)
  dev.copy(png,filename=filenm,...)
  dev.off()
}

## **Combining to list, step-by-step
## does the same thing as above
digits <- substring(100+1:2,2)
varnames <- paste("G01_",digits,sep="")
varlist <- as.list(varnames)
names(varlist) <- varnames
# convert character string of variable names to
# expressions via parse() and evaluate by eval()
vars <- lapply(varlist,function(x) eval(parse(text=x)))
print(vars)

I think in many cases paste() is your answer...
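
When the variables are columns of a data frame, the names can be generated and used directly, without parse(). The sketch below makes several assumptions: the data frame dat and its contents are simulated, the output goes to a temporary directory, and pdf() is used as the device because it is always available (png(filename = ...) can be substituted where PNG support is installed).

```r
# Generate names G01_01 ... G02_03 instead of typing them.
varnames <- sprintf("G%02d_%02d", rep(1:2, each = 3), rep(1:3, times = 2))

# Hypothetical data: 40 answers on a 1-5 scale for each variable.
set.seed(1)
dat <- as.data.frame(setNames(
  replicate(length(varnames), sample(1:5, 40, replace = TRUE), simplify = FALSE),
  varnames))

# One file per column, with the variable name inside the file name.
outdir <- tempdir()
files <- file.path(outdir, paste("barplot_", varnames, ".pdf", sep = ""))
for (i in seq_along(varnames)) {
  pdf(file = files[i], height = 6, width = 6)
  barplot(table(dat[[varnames[i]]]), xlab = varnames[i], ylab = "Count")
  dev.off()
}
print(basename(files))
```

Opening the device before plotting and closing it afterwards avoids the need for dev.copy() and works the same way in non-interactive sessions.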


--- Donatas G. [EMAIL PROTECTED] wrote:

 I have a data.frame with ~100 columns and I need a barplot for each column 
 produced and saved in some directory.
 
 I am not sure it is possible - so please help me.
 
 this is my loop that does not work...
 
 vars <- list (substitute (G01_01), substitute (G01_02), substitute (G01_03),
 substitute (G01_04))
 results <- data.frame ('Variable Name'=rep (NA, length (vars)),
 check.names=FALSE)
 for (i in 1:length (vars))  {
 barplot(table(i),xlab=i,ylab="Nuomonės")
 dev.copy(png, filename="/my/dir/barplot.i.png", height=600, width=600)
 dev.off()
 }
 
 questions: 
 
 Is it possible to use the i somewhere _within_ a file name? (like it is 
 possible in other programming or scripting languages?)
 
 Since I hate to type in all the variables (they go from G01_01 to G01_10
 and 
 then from G02_01 to G02_10 and so on), is it possible to shorten this list
 by 
 putting there another loop, applying some programming thing or so? 
 
 -- 
 Donatas Glodenis
 http://dg.lapas.info
 


Re: [R] Problems using lm in combination with predict

2007-08-04 Thread Stephen Tucker
I think you need 

predict(mod,newdate)

instead of 

predict(y,newdate)
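
Note also that predict() looks up the predictors by name, so the new data frame needs one column per model term, named exactly as in the formula. A minimal sketch with simulated data (the coefficients and value ranges are invented; only the variable names come from the original post):

```r
set.seed(42)
n <- 28
dat <- data.frame(Worktime = runif(n, 300, 350),
                  Vacation = runif(n, 100, 150),
                  Illnes = runif(n),
                  Bankholidays = runif(n))
dat$y <- with(dat, 2 + 0.5 * Worktime - 0.2 * Vacation +
                   3 * Illnes + Bankholidays + rnorm(n))

mod <- lm(y ~ Worktime + Vacation + Illnes + Bankholidays, data = dat)

# One row, with columns named like the predictors in the model.
newdate <- data.frame(Worktime = 324, Vacation = 123,
                      Illnes = 0.9, Bankholidays = 0.1)

pred <- predict(mod, newdate, interval = "confidence")
print(pred)
```

The result has the columns fit, lwr and upr. A data frame with a single column x, as in the question, gives predict() nothing to match, so it falls back on the original 28 observations and reports the row-count mismatch.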


--- Maja Schröter [EMAIL PROTECTED] wrote:

 Hello everybody,
 
 I'm trying to predict a linear regression model but it does not work.
 
 My Model: y = Worktime + Vacation + Illnes + Bankholidays
 
 My modelmatrix is of dimension  28x4
 
 Then I want to make use of the function predict because there
 confidence.intervals are include.
 
 My idea was:
 
 mod <- lm(y~Worktime+Vacation+Illnes+Bankholidays)
 
 newdate=data.frame(x=c(324,123,0.9,0.1))
 predict(y,newdate)
 
 But I always get the message:
 
 
 'newdata' had 1 rows but variable(s) found have 28 rows
 
 
 What can I do?
 
 Yours, 
 
 Maja
 


Re: [R] methods and classes and things

2007-08-04 Thread Stephen Tucker
methods(plot)
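
A few related calls, as a sketch: methods() can also list everything defined for a class, and getS3method() retrieves a method that is not exported. The result names are made up.

```r
plot_methods <- methods(plot)          # all plot.* methods currently visible
lm_methods   <- methods(class = "lm")  # all generics with a method for class "lm"

# Retrieve a specific method, even if it is hidden in a namespace.
plot_lm <- getS3method("plot", "lm")

print(head(as.character(plot_methods)))
print(is.function(plot_lm))
```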

--- Edna Bell [EMAIL PROTECTED] wrote:

 Hi R Gurus:
 
 I know that plot  has extra things like plot.ts, plot.lm
 
 How would i find out all of them, please?
 
 Thanks,
 Edna
 


Re: [R] request

2007-08-03 Thread Stephen Tucker
?cumsum
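
For example (the input vector is arbitrary):

```r
v <- c(1, 2, 3, 4)
cs <- cumsum(v)   # running total of the elements
print(cs)
```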

--- zahid khan [EMAIL PROTECTED] wrote:

  I want to calculate the cumulative sum of any numeric vector with the
 following command, but it does not work:  comsum
   My question is, how can we calculate the cumulative sum of any numeric
 vector with the above command?
   Thanks
 
 
 Zahid Khan
 Lecturer in Statistics
 Department of Mathematics
 Hazara University Mansehra.



Re: [R] t-distribution

2007-08-02 Thread Stephen Tucker
p <- seq(0.001,0.999,,1000)
x <- qt(p,df=9)
y <- dt(x,df=9)
plot(x,y,type="l")
polygon(x=c(x,rev(x)),y=c(y,rep(0,length(y))),col="gray90")

Hope this helps.

ST


--- Nair, Murlidharan T [EMAIL PROTECTED] wrote:

 Indeed, this is what I wanted, I figured it from the function you and
 Mark pointed me. Thank you both. 
 
 I am trying to plot it to illustrate the point and I tried this
 
 plot(function(x) dt(x, df = 9), -5, 5, ylim = c(0, 0.5), main="t -
 Density", yaxs="i")
 
 Is there an easy way to shade the area under the curve? 
 
 
 -Original Message-
 From: [EMAIL PROTECTED]
 [mailto:[EMAIL PROTECTED] On Behalf Of
 [EMAIL PROTECTED]
 Sent: Wednesday, August 01, 2007 3:18 PM
 To: [EMAIL PROTECTED]; r-help@stat.math.ethz.ch
 Subject: Re: [R] t-distribution
 
 Well, is t = 1.11 all that accurate in the first place?  :-)
 
 In fact, reading between the lines of the original enquiry, what the
 person probably wanted was something like
 
 ta <- pt(-1.11, 9) + pt(1.11, 9, lower.tail = FALSE)
 
 which is the two-sided t-test tail area.
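
A short sketch of the same computation, plus an illustration of why lower.tail = FALSE can matter. The extreme value 1e5 is chosen only to make the rounding effect visible; the variable names are made up.

```r
# Two-sided tail area for t = 1.11 on 9 degrees of freedom.
ta <- pt(-1.11, 9) + pt(1.11, 9, lower.tail = FALSE)

# By symmetry this is twice the lower tail at -|t|.
ta_sym <- 2 * pt(-abs(1.11), 9)
print(c(ta, ta_sym))

# Far out in the tail, 1 - pt() loses the answer to rounding,
# while lower.tail = FALSE computes the small probability directly.
naive  <- 1 - pt(1e5, 9)
direct <- pt(1e5, 9, lower.tail = FALSE)
print(c(naive, direct))
```

For a value as moderate as 1.11 the two forms agree to many digits, which is the point of the remark that follows.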
 
 The teller of the parable will usually leave some things unexplained...
 
 Bill. 
 
 
 Bill Venables
 CSIRO Laboratories
 PO Box 120, Cleveland, 4163
 AUSTRALIA
 Office Phone (email preferred): +61 7 3826 7251
 Fax (if absolutely necessary):  +61 7 3826 7304
 Mobile: +61 4 8819 4402
 Home Phone: +61 7 3286 7700
 mailto:[EMAIL PROTECTED]
 http://www.cmis.csiro.au/bill.venables/ 
 
 -Original Message-
 From: [EMAIL PROTECTED]
 [mailto:[EMAIL PROTECTED] On Behalf Of Ben Bolker
 Sent: Thursday, 2 August 2007 4:57 AM
 To: r-help@stat.math.ethz.ch
 Subject: Re: [R] t-distribution
 
  Bill.Venables at csiro.au writes:
 
  
  for the upper tail:
  
   1-pt(1.11, 9)
  [1] 0.1478873
  
wouldn't 
  pt(1.11, 9, lower.tail=FALSE)
   be more accurate?
 


Re: [R] line widths of plotting symbols in the lattice

2007-08-02 Thread Stephen Tucker
Thanks to all for the response - the grid.points() solution works well.

Stephen

(oddly I missed when this thread and its response actually got posted... was
starting to get worried)

--- Deepayan Sarkar [EMAIL PROTECTED] wrote:

 On 7/31/07, Uwe Ligges [EMAIL PROTECTED] wrote:
 
 
  Stephen Tucker wrote:
   Dear List,
  
   Sorry, this is very simple but I can't seem to find any information
 regarding
   line widths of plotting symbols in the lattice package.
  
   For instance, in traditional graphics:
  
   plot(1:10,lwd=3)
   points(10:1,lwd=2,col=3)
  
   'lwd' allows control of plotting symbol line widths.
 
 
  'lwd' is documented in ?gpar (the help page does not show up for me,
  I'll take a closer look why) and works for me:
 
  xyplot(1:10 ~ 1:10, type = "l", lwd = 5)
 
 I think the point is that lwd doesn't work for _points_, and that is a
 bug (lplot.xy doesn't pass on lwd to grid.points). I'll fix it,
 meanwhile a workaround is to use grid.points directly, e.g.
 
 library(grid)
 xyplot(1:10 ~ 1:10, cex = 2, lwd = 3,
panel = function(x, y, ...) grid.points(x, y, gp = gpar(...)))
 
 -Deepayan




Re: [R] t-distribution

2007-08-02 Thread Stephen Tucker
yes, or

p <- seq(0.001,0.999,,1000)
x <- qt(p,df=9)
y <- dt(x,df=9)
plot(x,y,type="l")

## shade the area between the curve and the x axis
f <- function(x,y,...) {
  polygon(x=c(x,rev(x)),y=c(y,rep(0,length(y))),...)
}
with(data.frame(x,y)[x >= 2.3,],f(x,y,col="gray90"))
with(data.frame(x,y)[x <= -2.3,],f(x,y,col="gray90"))


--- Nair, Murlidharan T [EMAIL PROTECTED] wrote:

 
 I tried doing it this way. 
 
 left <- -2.3
 right <- 2.3
 p <- seq(0.001,0.999,,1000)
 x <- qt(p,df=9)
 y <- dt(x,df=9)
 plot(x,y,type="l")
 x.tmp <- x
 y.tmp <- y
 a <- which(x<=left)
 polygon(x=c(x.tmp[a],rev(x.tmp[a])),y=c(y.tmp[a],rep(0,length(y.tmp[a]))),col="gray90")
 b <- which(x>=right)
 polygon(x=c(x.tmp[b],rev(x.tmp[b])),y=c(y.tmp[b],rep(0,length(y.tmp[b]))),col="gray90")
 
 Please let me know if I have made any mistakes. 
 Thanks ../Murli
 
 
 
 -Original Message-
 From: Richard M. Heiberger [mailto:[EMAIL PROTECTED]
 Sent: Thu 8/2/2007 10:25 AM
 To: Nair, Murlidharan T; Stephen Tucker; r-help@stat.math.ethz.ch
 Subject: Re: [R] t-distribution
  
 I believe you are looking for the functionality I have
 in the norm.curve function in the HH package.
 
 Download and install HH from CRAN and then look at
 
 example(norm.curve)
 
 



   


Re: [R] y axix number into horizontal direction

2007-08-02 Thread Stephen Tucker
try

par(las=1)
plot(0,0,xaxt="n",type="n", ylim=c(0,100))
mtext("35",side=2,at=35)

you can use 'las=1' in par(), plot(), axis(), etc.

more generally, you can use 'srt' in text() to rotate tick labels:

plot(1:10,1:10,xaxt="n",type="n", yaxt="n",ylim=c(0,100))
axis(1); axis(2,lab=FALSE)
text(x=par("usr")[1]-2*par("cxy")[1],y=axTicks(2),
     lab=axTicks(2),xpd=TRUE,srt=45)



--- Rebecca Ding [EMAIL PROTECTED] wrote:

 Dear R users,
 
 I used plot() and mtext() functions to draw a plot. The numbers: 0,20,35,
 40,60,80,100 were in the vertical direction. I'd like to transfer them into
 the horizontal direction.
 
 plot(0,0,xaxt="n",type="n", ylim=c(0,100))
 mtext("35",side=2,at=35)
 
 Any suggestion?
 
 Thanks.
 
 Rebecca
 


[R] passing args to R CMD BATCH in win 2000

2007-08-01 Thread Stephen . Bond
Hello and sorry to bother.

Please help.
I searched the archives but could not find out why --args is being ignored 
on Windows 2000.
I try

R CMD BATCH --slave 11.R 11.Rout --args 12

and 11.R has

x=commandArgs(trail=T)
print(x)
a=x[length(x)]
write.csv(a,file="13.out")
q("no")


the argument is not passed to the R process.
11.Rout only shows processing time and 13.out does not have the value.


Thank you all.
stephen


Re: [R] shadow between two lines in plot()

2007-08-01 Thread Stephen Tucker
see ?rect, or, for more general shapes, ?polygon

## EXAMPLES
plot(c(0,500),c(0,500),type="n",las=1)
rect(par("usr")[1],200,par("usr")[2],300,col="grey90")
points(seq(0,500,length=3),seq(0,500,length=3))

plot(c(0,500),c(0,500),type="n",las=1)
polygon((par("usr")[1:2])[c(1,1,2,2)],
        (c(200,300))[c(1,2,2,1)],col="grey90")
points(seq(0,500,length=3),seq(0,500,length=3))



--- Ding, Rebecca [EMAIL PROTECTED] wrote:

  Dear R users,
 
 I used the following code to draw a scatter plot. 
 
 plot(x,y,type="n")
 points(x,y,pch=1)
 
 And then I used the abline functions to draw two lines. I want to add
 the shadow between those two lines. 
 
 abline(h=200)
 abline(h=300)
 
 Any suggestions?
 
 Thanks
 
 Rebecca
 


Re: [R] new user question on dataframe comparisons and plots

2007-08-01 Thread Stephen Tucker
Hi Conor,

I hope I interpreted your question correctly. I think for the first one you
are looking for a conditioning plot? I am going to create and use some
nonsensical data - 'iris' comes with R so this should be reproducible on your
machine:

library(lattice)
data(iris)
x <- iris
# make some factors using cut()
x[,2:3] <- lapply(x[,2:3],cut,3)
# add column of TRUE FALSE
x <- cbind(x,TF=sample(c(TRUE,FALSE),nrow(x),replace=TRUE))
xyplot(petal.wid~petal.len | ## these are numeric
       sepal.wid*sepal.len,  ## these are factors
       groups=TF,            ## TRUE or FALSE
       panel=function(x,y,...) {
         panel.xyplot(x,y,...)
         panel.loess(x,y,...)
       },
       data=x,auto.key=TRUE)


merge() should work when you have different factors, when you specify
all=TRUE.

## get counts for TRUE and FALSE
> y <- tapply(x$species,INDEX=x$TF,
+             function(x) as.data.frame(table(x)))
## merge results
> (z <- `names<-`(merge(y$`TRUE`,y$`FALSE`,by="x",all=TRUE),
+                 c("factor","true","false")))
      factor true false
1 versicolor   29    21
2  virginica   23    27

## reshape the data frame
> library(reshape)
> melt(z,id=1)
      factor variable value
1 versicolor     true    29
2  virginica     true    23
3 versicolor    false    21
4  virginica    false    27
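
For the TRUE/FALSE ratio per factor level asked about in Question 2, a base-R sketch along these lines might also do. The data frame below is a made-up stand-in (the column names CAT and resp and the values are assumptions, not the poster's data); table() on the factor and the logical column gives both counts at once, with zeros where a level has no TRUEs:

```r
# Hypothetical stand-in data: a factor column and a logical response
df <- data.frame(
  CAT  = factor(c("BASD", "ZAQM", "PPWS", "ZAQM", "BASD", "ZAQM", "PPWS", "ZAQM")),
  resp = c(FALSE, TRUE, FALSE, FALSE, FALSE, TRUE, FALSE, TRUE)
)

# Rows are the levels of CAT; columns are named "FALSE" and "TRUE"
counts <- table(df$CAT, df$resp)

out <- data.frame(
  factor = rownames(counts),
  true   = as.vector(counts[, "TRUE"]),
  false  = as.vector(counts[, "FALSE"])
)
# TRUE count over FALSE count; levels with no TRUEs get 0
out$ratio <- out$true / out$false
print(out)
```

Levels with no FALSE rows would give Inf or NaN here, so those need handling separately if they can occur.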

Hope this helps. If it doesn't you can post a small (reproducible) piece of
data and we can maybe help you out a little better...

Best regards,

ST


--- Conor Robinson [EMAIL PROTECTED] wrote:

 I'm coming from the scipy community and have been using R on and off for
 the past week or so.  I'm still feeling out the language structure,
 but so far so good.  I apologize in advance if I pose any obvious
 questions, due to my current lack of diction when searching for my
 issue, or recognizing it if I did see it.
 
 Question 1, plots:
 
 I have a data frame with 4 type factor columns, also in the data frame
 I have one single, type logical column with the response data (T or
 F).  I would like to plot a 4*4 grid showing all the two way attribute
 interactions like with plot(data.frame) or pairs(data.frame,
 panel=panel.smooth), however show the response's True and False as
 different colors, or any other built in graphical analysis that might
 be relevant in this case.  I'm sure this is simple since this is a
 common procedure, thanks in advance for humoring me.  Also, what is
 the correct term for this type of plot?
 
 
 Question 2, data frame analysis:
 
 I have two sub data frames split by whether my logical column is T or
 F.  I want to compare the same factor column between both of the two
 sub data frames (there are a few hundred different unique possibles
 for this factor column eg  -  enumerated).  I've used table()
 on the attribute columns from each sub frame to get counts.
 
 pos <- data.frame(table(df.true$CAT))
 
   10
 BASD  0
 ZAQM 4
 ...
 
 neg <- data.frame(table(df.false$CAT))
 
  1000
 BASD  3
 ZAQM  9
 PPWS 10
 ...
 
 The TRUE sub frame has fewer unique factors than the sub frame FALSE, I
 would like an output data frame that is one column all the factors
 from the TRUE sub frame and the second column the counts from the TRUE
 attributes / counts from the corresponding FALSE attributes ie
 %response for each represented factor.  It's fine (better even) if all
 factors are included and there is just a zero for the attributes with
 no TRUEs.
 
 I've been going off making my own function and running into trouble
 with the data frame not being a vector etc etc, but I have a feeling
 there is a *much* better way ie built in function, but I've hit my
 current level of R understanding.
 
 Thank you,
 Conor
 


Re: [R] Fitting exponential curve to data points

2007-07-30 Thread Stephen Tucker
Sorry, just got back into town.

I wonder if AIC, BIC, or cross-validation scoring couldn't also be used as
criteria for model selection - I've seen it mostly in the context of variable
selection rather than 'form' selection but in principle might apply here?
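
A rough sketch of what such an AIC comparison might look like on the counts from this thread. The starting values taken from a log-linear fit and the choice of a quadratic as the competing form are my assumptions for illustration, and with only eight points the comparison is indicative at best:

```r
# Counts from the thread, with the year shifted so that 1999 is 0
dat <- data.frame(Year  = 0:7,
                  Count = c(3, 5, 9, 30, 62, 154, 245, 321))

# Starting values for nls() from a straight-line fit to log(Count)
loglin <- lm(log(Count) ~ Year, data = dat)
init <- list(b0 = unname(exp(coef(loglin)[1])),
             b1 = unname(coef(loglin)[2]))

fit_exp  <- nls(Count ~ b0 * exp(b1 * Year), data = dat, start = init)
fit_quad <- lm(Count ~ Year + I(Year^2), data = dat)

# Both models are fitted to Count on its original scale, so the AIC
# values are comparable; the smaller value is preferred
aic <- c(exponential = AIC(fit_exp), quadratic = AIC(fit_quad))
print(aic)
```

Note that the log-linear lm() fit itself cannot be put into the same comparison, because its response is log(Count) rather than Count.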


--- Dieter Menne [EMAIL PROTECTED] wrote:

 Andrew Clegg andrew.clegg at gmail.com writes:
 
  
  ... If I want to demonstrate that a non-linear curve fits
  better than an exponential, what's the best measure for that? Given
  that neither of nls() or optim() provide R-squared. 
 
 To supplement Karl's comment, try Douglas Bates' (author of nls) comments
 on the matter:
 
 http://www.ens.gu.edu.au/ROBERTK/R/HELP/00B/0399.HTML
 
 Short summary:
 * ... the lack of automatic ANOVA, R^2 and adj. R^2 from nls is a
   feature, not a bug :-)
 * My best advice regarding R^2 statistics with nonlinear models is, as
   Nancy Reagan suggested, "Just say no."
 
 Dieter
 


Re: [R] manipulating arrays

2007-07-30 Thread Stephen Tucker
I think you are looking for append(), though it won't modify the object
in-place like Python [I believe that is a product of R's 'functional
programming' philosophy].

might want to check this entertaining thread:
http://tolstoy.newcastle.edu.au/R/help/04/11/7727.html

in this example it would be like

> c(X[1], 0, X[2:5])
[1] 1 0 2 3 4 5
> append(X,0,1)
[1] 1 0 2 3 4 5
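
To make the "not in-place" point concrete, here is a minimal sketch. The helper insert_at() is a hypothetical convenience wrapper of mine, not a base R function:

```r
X <- c(1, 2, 3, 4, 5)

# append() returns a new vector; X itself is left unchanged
Y <- append(X, 0, after = 1)

# Hypothetical helper: place `value` at position `pos` of the result
insert_at <- function(x, value, pos) append(x, value, after = pos - 1)
Z <- insert_at(X, 0, 2)

print(X)
print(Y)
print(Z)
```

To "modify" X you assign the result back, e.g. X <- append(X, 0, after = 1).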


--- Henrique Dallazuanna [EMAIL PROTECTED] wrote:

 Hi, I don't know if it is the most elegant way, but:
 
 X <- c(1,2,3,4,5)
 X <- c(X[1], 0, X[2:5])
 
 
 -- 
 Henrique Dallazuanna
 Curitiba-Paraná-Brasil
 25° 25' 40 S 49° 16' 22 O
 
 On 27/07/07, Nair, Murlidharan T [EMAIL PROTECTED] wrote:
 
  Can I insert an element in an array at a particular position without
  destroying the already existing element?
 
 
 
  X <- c(1,2,3,4,5)
 
 
 
  I want to insert an element between 1 and 2.
 
 
 
  Thanks ../Murli
 
 
 
 


[R] line widths of plotting symbols in the lattice

2007-07-30 Thread Stephen Tucker
Dear List,

Sorry, this is very simple but I can't seem to find any information regarding
line widths of plotting symbols in the lattice package.

For instance, in traditional graphics:

 plot(1:10,lwd=3)
 points(10:1,lwd=2,col=3)

'lwd' allows control of plotting symbol line widths.

I've tried looking through the documentation for xyplot, panel.points,
trellis.par.set, and the R-help archives. Maybe it goes by another name?

Thanks in advance,

Stephen



Re: [R] Redirecting print output

2007-07-24 Thread Stephen Tucker
Here are two simple ways:

=== method1 ===
cat("line1","\n",file="output.txt")
cat("line2","\n",file="output.txt",append=TRUE)

=== method2 ===
sink("output.txt")
cat("line1","\n")
cat("line2","\n")
out <- lm(y~x,data=data.frame(x=1:10,y=(1:10+rnorm(10,0,0.1))))
print(out)
sink()
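
A third possibility, sketched here as a suggestion rather than something from the original reply, is capture.output() from the utils package, which writes the printed output of individual expressions to a file without redirecting everything else. The built-in cars data and the temporary file name are only for illustration:

```r
# Fit two regressions and write their printed summaries to one file
fit1 <- lm(dist ~ speed, data = cars)
fit2 <- lm(dist ~ speed + I(speed^2), data = cars)

out_file <- tempfile(fileext = ".txt")
capture.output(summary(fit1), file = out_file)
capture.output(summary(fit2), file = out_file, append = TRUE)

# Read the file back to see what was written
lines_written <- readLines(out_file)
cat(head(lines_written, 10), sep = "\n")
```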

And then there is 'Sweave'. Check out, for instance
http://www.stat.umn.edu/~charlie/Sweave/

You can embed R code, figures, and output from print methods into your latex
document.

ST
--- Stan Hopkins [EMAIL PROTECTED] wrote:

 I see a rich set of graphic device functions to redirect that output.  Are
 there commands to redirect text as well.  I have a set of functions that
 execute many linear regression tests serially and I want to capture this in
 a file for printing.
 
 Thanks,
 
 Stan Hopkins


Re: [R] persp and greek symbols in the axes labels

2007-07-24 Thread Stephen Tucker
I don't know why it doesn't work but I think people generally recommend that
you use wireframe() in lattice rather than persp(), because wireframe is more
customizable (the pdf document referred to in this post is pretty good):
http://tolstoy.newcastle.edu.au/R/e2/help/07/03/12534.html

Here's an example:

library(lattice)
library(reshape)
x <- 1:5
y <- 1:3
z <- matrix(1:15,ncol=3,dimnames=list(NULL,y))
M <- melt(data.frame(x,z,check.names=FALSE),id=1,variable="y")
wireframe(value~x*y,data=M,
          screen=list(z=45,x=-75),
          xlab=expression(kappa[lambda]),
          ylab=as.expression(substitute(paste(phi,"=",true,sigma),
              list(true=5))),
          zlab = "Z")

[you can play around with the 'screen' argument to rotate the view, analogous
to phi and theta in persp()]


--- Nathalie Peyrard [EMAIL PROTECTED] wrote:

 Hello,
 
 I am plotting a 3D function using persp and I would like to use greek 
 symbols in the axes labels.
 I  have found examples like  this one on the web:
 

 plot(0,0,xlab=expression(kappa[lambda]),
      ylab=substitute(paste(phi,"=",true,sigma),list(true=5)))
 
 this works well with plot but not with persp:
 with the command
 
 persp(M,theta = -20,phi = 0,xlab=expression(kappa[lambda]),
       ylab=substitute(paste(phi,"=",true,sigma),list(true=5)),zlab = "Z")
 
 I get the labels as in toto.eps
 
 Any suggestion? Thanks!
 
 Nathalie
 
 -- 
 ~~   
 INRA  Toulouse - Unité de Biométrie et  Intelligence Artificielle 
 Chemin de Borde-Rouge BP 52627 31326 CASTANET-TOLOSAN cedex FRANCE 
 Tel : +33(0)5.61.28.54.39 - Fax : +33(0)5.61.28.53.35
 Web :http://mia.toulouse.inra.fr/index.php?id=217
 ~~
 
 


Re: [R] Fitting exponential curve to data points

2007-07-24 Thread Stephen Tucker
I think your way is probably the easiest (shockingly). For instance, here are
some alternatives - I think in both cases you have to calculate the
coefficient of determination (R^2) manually. My understanding is that
multiple R^2 in your case is the usual R^2 because you only have one
predictor variable, and the adjusted R^2 considers the degrees of freedom and
penalizes for additional predictors. Which is better... depends? (Perhaps
more stats-savvy people can help you on that one. I'm a chemical engineer so
I unjustifiably claim ignorance).

## Data input
input <-
"Year Count
1999 3
2000 5
2001 9
2002 30
2003 62
2004 154
2005 245
2006 321"

dat <- read.table(textConnection(input),header=TRUE)
dat[,] <- lapply(dat,function(x) x-x[1])
  # shifting in origin; will need to add back in later

## Nonlinear least squares
plot(dat)
out <- nls(Count~b0*exp(b1*Year),data=dat,
           start=list(b0=1,b1=1))
lines(dat[,1],fitted(out),col=2)
out <- nls(Count~b0+b1*Year+b2*Year^2,data=dat, #polynomial
           start=list(b0=0,b1=1,b2=1))
lines(dat[,1],fitted(out),col=3)

## Optim
f <- function(.pars,.dat,.fun) sum((.dat[,2]-.fun(.pars,.dat[,1]))^2)
fitFun <- function(b,x) cbind(1,x,x^2)%*%b
expFun <- function(b,x) b[1]*exp(b[2]*x)

plot(dat)
out <- optim(c(0,1,1),f,.dat=dat,.fun=fitFun)
lines(dat[,1],fitFun(out$par,dat[,1]),col=2)
out <- optim(c(1,1),f,.dat=dat,.fun=expFun)
lines(dat[,1],expFun(out$par,dat[,1]),col=3)


--- Andrew Clegg [EMAIL PROTECTED] wrote:

 Hi folks,
 
 I've looked through the list archives and online resources, but I
 haven't really found an answer to this -- it's pretty basic, but I'm
 (very much) not a statistician, and I just want to check that my
 solution is statistically sound.
 
 Basically, I have a data file containing two columns of data, call it
 data.tsv:
 
 year  count
 1999  3
 2000  5
 2001  9
 2002  30
 2003  62
 2004  154
 2005  245
 2006  321
 
 These look exponential to me, so what I want to do is plot these
 points on a graph with linear axes, and add an exponential curve over
 the top. I also want to give an R-squared for the fit.
 
 The way I did it was like so:
 
 
 # Read in the data, make a copy of it, and take logs
 data = read.table("data.tsv", header=TRUE)
 log.data = data
 log.data$count = log(log.data$count)
 
 # Fit a model to the logs of the data
 model = lm(log.data$count ~ year, data = log.data)
 
 # Plot the original data points on a graph
 plot(data)
 
 # Draw in the exponents of the model's output
 lines(data$year, exp(fitted(model)))
 
 
 Is this the right way to do it? log-ing the data and then exp-ing the
 results seems like a bit of a long-winded way to achieve the desired
 effect. Is the R-squared given by summary(model) a valid measurement
 of the fit of the points to an exponential curve, and should I use
 multiple R-squared or adjusted R-squared?
 
 The R-squared I get from this method (0.98 multiple) seems a little
 high going by the deviation of the last data point from the curve --
 you'll see what I mean if you try it.
 
 Thanks in advance for any help!
 
 Yours gratefully,
 
 Andrew.
 


Re: [R] Fitting exponential curve to data points

2007-07-24 Thread Stephen Tucker
Well spoken. And since log transformations are nonlinear and 'compress' the
data, it's not surprising to find that the fit doesn't look so nice on the
original scale while the fit metrics tell you that the model does a good job.
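
A short sketch of that effect using the counts from this thread (the object names are mine). The R^2 is computed on the log scale, where the large counts are compressed, while the visible misfit of the last point only shows up after back-transforming:

```r
dat <- data.frame(year  = 1999:2006,
                  count = c(3, 5, 9, 30, 62, 154, 245, 321))

# Straight-line fit to log(count), as in the original question
model <- lm(log(count) ~ year, data = dat)

# Goodness of fit as reported by summary(): this is on the log scale
r2_log <- summary(model)$r.squared

# Residuals after back-transforming the fitted values to counts
resid_orig <- dat$count - exp(fitted(model))

print(round(r2_log, 3))
print(round(resid_orig, 1))
```

The last residual on the original scale is large and negative even though the log-scale R^2 is about 0.98.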

--- [EMAIL PROTECTED] wrote:

 On 24-Jul-07 01:09:06, Andrew Clegg wrote:
  Hi folks,
  
  I've looked through the list archives and online resources, but I
  haven't really found an answer to this -- it's pretty basic, but I'm
  (very much) not a statistician, and I just want to check that my
  solution is statistically sound.
  
  Basically, I have a data file containing two columns of data, call it
  data.tsv:
  
  year  count
  1999  3
  2000  5
  2001  9
  2002  30
  2003  62
  2004  154
  2005  245
  2006  321
  
  These look exponential to me, so what I want to do is plot these
  points on a graph with linear axes, and add an exponential curve over
  the top. I also want to give an R-squared for the fit.
  
  The way I did it was like so:
  
  
 # Read in the data, make a copy of it, and take logs
  data = read.table("data.tsv", header=TRUE)
  log.data = data
  log.data$count = log(log.data$count)
  
 # Fit a model to the logs of the data
  model = lm(log.data$count ~ year, data = log.data)
  
 # Plot the original data points on a graph
  plot(data)
  
 # Draw in the exponents of the model's output
  lines(data$year, exp(fitted(model)))
  
  
  Is this the right way to do it? log-ing the data and then exp-ing the
  results seems like a bit of a long-winded way to achieve the desired
  effect. Is the R-squared given by summary(model) a valid measurement
  of the fit of the points to an exponential curve, and should I use
  multiple R-squared or adjusted R-squared?
  
  The R-squared I get from this method (0.98 multiple) seems a little
  high going by the deviation of the last data point from the curve --
  you'll see what I mean if you try it.
 
 I just did. From the plot of log(count) against year, with the plot
 of the linear fit of log(count)~year superimposed, I see indications
 of a non-linear relationship.
 
 The departures of the data from the fit follow a rather systematic
 pattern. Initially the data increase more slowly than the fit,
 and lie below it. Then they increase faster and cross over above it.
 Then the data increase less fast than the fit, and the final data
 point is below the fit.
 
 There are not enough data to properly identify the non-linearity,
 but the overall appearance of the data plot suggests to me that
 you should be considering one of the growth curve models.
 
 Many such models start off with an increasing rate of growth,
 which then slows down, and typically levels off to an asymptote.
 The apparent large discrepancy of your final data point could
 be compatible with this kind of behaviour.
 
 At this point, knowledge of what kind of thing is represented
 by your count variable might be helpful. If, for instance,
 it is the count of the numbers of individuals of a species in
 an area, then independent knowledge of growth mechanisms may
 help to narrow down the kind of model you should be trying to fit.
 
 As to your question about "Is this the right way to do it"
 (i.e. fitting an exponential curve by doing a linear fit of the
 logarithm), generally speaking the answer is "Yes". But of course
 you need to be confident that exponential is the right curve
 to be fitting in the first place. If it's the wrong type of
 curve to be considering, then it's not the right way to do it!
 
 Hoping this helps,
 Ted.
 
 
 E-Mail: (Ted Harding) [EMAIL PROTECTED]
 Fax-to-email: +44 (0)870 094 0861
 Date: 24-Jul-07   Time: 10:08:33
 -- XFMail --
 


[R] Fwd: Re: Fitting exponential curve to data points

2007-07-24 Thread Stephen Tucker
Hope these help for alternatives to lm()? I show the use of a 2nd order
polynomial as an example to generalize a bit.

Sometimes from the subject line two separate responses can appear as reposts
when in fact they are not... (though there are identical reposts too). I
should probably figure a way around that.

--- Stephen Tucker [EMAIL PROTECTED] wrote:

 ## Data input
 input <-
 "Year Count
 1999 3
 2000 5
 2001 9
 2002 30
 2003 62
 2004 154
 2005 245
 2006 321"
 
 dat <- read.table(textConnection(input),header=TRUE)
 dat[,] <- lapply(dat,function(x) x-x[1])
   # shifting in origin; will need to add back in later
 
 ## Nonlinear least squares
 plot(dat)
 out <- nls(Count~b0*exp(b1*Year),data=dat,
            start=list(b0=1,b1=1))
 lines(dat[,1],fitted(out),col=2)
 out <- nls(Count~b0+b1*Year+b2*Year^2,data=dat, #polynomial
            start=list(b0=0,b1=1,b2=1))
 lines(dat[,1],fitted(out),col=3)
 
 ## Optim
 f <- function(.pars,.dat,.fun) sum((.dat[,2]-.fun(.pars,.dat[,1]))^2)
 fitFun <- function(b,x) cbind(1,x,x^2)%*%b
 expFun <- function(b,x) b[1]*exp(b[2]*x)
 
 plot(dat)
 out <- optim(c(0,1,1),f,.dat=dat,.fun=fitFun)
 lines(dat[,1],fitFun(out$par,dat[,1]),col=2)
 out <- optim(c(1,1),f,.dat=dat,.fun=expFun)
 lines(dat[,1],expFun(out$par,dat[,1]),col=3)



   



Re: [R] Data Set

2007-07-23 Thread Stephen Tucker
My bad... corrections (semantic and otherwise) always appreciated. I'm still
learning too.

I also forgot the alternative of using make.names() instead of manually
assigning 'more convenient' names.

input <-
"Mydata,S-sharif,A site
1,45,34
2,66,45
3,79,56"

> dat <- read.csv(textConnection(input),check.names=FALSE)
> dat
  Mydata S-sharif A site
1      1       45     34
2      2       66     45
3      3       79     56
> names(dat)
[1] "Mydata"   "S-sharif" "A site"
> names(dat) <- make.names(names(dat))
> names(dat)
[1] "Mydata"   "S.sharif" "A.site"

As for the data set Monsoon, I don't know how it was created originally,
but it may be convenient to reassign its names with

  names(Monsoon) <- make.names(names(Monsoon))
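
A self-contained sketch of both routes, quoting the awkward names with backticks or cleaning them with make.names(). The small data frame simply mirrors the example above:

```r
dat <- data.frame(1:3, c(45, 66, 79), c(34, 45, 56))
names(dat) <- c("Mydata", "S-sharif", "A site")

# Non-syntactic names are usable, but must be quoted with backticks
via_backticks <- dat$`S-sharif`

# make.names() replaces the offending characters with "."
clean <- dat
names(clean) <- make.names(names(clean))

print(names(clean))
print(clean$S.sharif)
```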



--- Gavin Simpson [EMAIL PROTECTED] wrote:

 On Sun, 2007-07-22 at 21:51 -0700, Stephen Tucker wrote:
  It turns out that "-" and " " (space) are not valid variable names. 
 
 They are valid names, the problem is that they aren't very convenient to
 use, as the OP discovered, because they need to be quoted.
 
 Note that if using something like read.csv or read.table, R will correct
 these problem variable names for you when you import the data. If you
 read this file in for example:
 
 Mydata,S-sharif,A site
 1,45,34
 2,66,45
 3,79,56
 
 using read.csv, you get easy to use names
 
 > dat <- read.csv("temp.csv")
 > dat
   Mydata S.sharif A.site
 1      1       45     34
 2      2       66     45
 3      3       79     56
 
 You can turn off this safety checking using the argument check.names =
 FALSE
 
 G
 
 -- 
 %~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%
  Gavin Simpson [t] +44 (0)20 7679 0522
  ECRC, UCL Geography,  [f] +44 (0)20 7679 0565
  Pearson Building, [e] gavin.simpsonATNOSPAMucl.ac.uk
  Gower Street, London  [w] http://www.ucl.ac.uk/~ucfagls/
  UK. WC1E 6BT. [w] http://www.freshwaters.org.uk
 %~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%
 
 




Re: [R] ?R: Removing white space betwen multiple plots, traditional graphics

2007-07-22 Thread Stephen Tucker
You could try
par(mar=c(0,5,0,2), mfrow = c(6,1), oma=c(5,0,2,0))
##...then, your plots...##
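
A fuller sketch of that idea, with simulated data and a plain index on the x axis instead of the chron dates in the question (both are assumptions for illustration). The inner margins are zero at top and bottom so the panels touch, and the outer margin leaves room for the single x axis at the bottom:

```r
# Six stacked panels sharing one x axis; data are simulated
set.seed(1)
x <- 1:20
means <- c(0.1, 0.2, 0.3, 0.5, 0.7, 0.8)

par(mar = c(0, 5, 0, 2),    # no inner margin above or below each panel
    mfrow = c(6, 1),
    oma = c(5, 0, 2, 0))    # outer margin holds the bottom axis and label
for (i in seq_along(means)) {
  last <- i == length(means)
  plot(x, rnorm(20, means[i], 0.1), type = "b", ylim = c(0, 1),
       xlab = "", ylab = "", xaxt = if (last) "s" else "n", las = 1)
}
mtext("index", side = 1, outer = TRUE, line = 3)
```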


--- Mr Natural [EMAIL PROTECTED] wrote:

 
 I would appreciate suggestions for removing the white space between the
 graphs in a stack:
 
 par(mar=c(2,2,1,1), mfrow = c(6,1))
 mydates <- dates(1:20,origin=c(month = 1, day = 1, year = 1986))
 plot(rnorm(20,0.1,0.1)~mydates, type="b",xlab="",ylim=c(0,1),xaxt = "n")
 plot(rnorm(20,0.2,0.1)~mydates, type="b",xlab="",ylim=c(0,1),xaxt = "n")
 plot(rnorm(20,0.3,0.1)~mydates, type="b",xlab="",ylim=c(0,1),xaxt = "n")
 plot(rnorm(20,0.5,0.1)~mydates, type="b",xlab="",ylim=c(0,1),xaxt = "n")
 plot(rnorm(20,0.7,0.1)~mydates, type="b",xlab="",ylim=c(0,1),xaxt = "n")
 plot(rnorm(20,0.8,0.1)~mydates, type="b",xlab="",ylim=c(0,1) )
 
 Thanx, Don


Re: [R] tagging results of apply

2007-07-22 Thread Stephen Tucker
Dear Bruce,
In your functions, you need to use your bound variable, 'x' [not mat1] in
your anonymous function [function(x)] as the argument to cor().

For instance, you wrote:
apply(mat1, 1, function(x) cor(mat1, mat2[1,]))
apply(mat1, 1, function(x) cor(mat1, mat2))

They should be
apply(mat1, 1, function(x) cor(x, mat2[1,]))
apply(mat1, 1, function(x) cor(x, mat2))

or
f <- function(x,y) cor(x, y)
apply(mat1, 1, f, y=mat2[1,])
apply(mat1, 1, f, y=mat2)

Then from the ?apply documentation - under section, 'Value' - the following
statement will help you predict its behavior in this case:
"If each call to FUN returns a vector of length n, then apply returns an
array of dimension c(n, dim(X)[MARGIN]) if n > 1."

[each column of your output is the output from cor(mat1[i,],mat2) in Scenario
2]. As for tagging, you can try adding dimension labels [to the object which
is passed as the 'X' argument to apply()]:

mat1 <- matrix(sample(1:500, 25), ncol = 5,
               dimnames=list(paste("row",1:5,sep=""),
                             paste("col",1:5,sep="")))
mat2 <- matrix(sample(501:1000, 25), ncol = 5)

> apply(mat1, 1, function(x,y) cor(x, y), y=mat2)
            row1       row2       row3        row4        row5
[1,]  0.39412464 -0.6241649  0.7423724  0.48391875  0.27085386
[2,] -0.22912466 -0.4123714  0.2857004 -0.52447327  0.06971423
[3,] -0.51027247  0.3256587 -0.6195050 -0.48309737  0.01699978
[4,]  0.26353316 -0.1873564  0.2121154  0.88784766 -0.02257890
[5,] -0.03771225 -0.4250040  0.3795558 -0.03372794 -0.05874675
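
For what it's worth, here is a small self-contained sketch of the dimension rule quoted from ?apply and of where the labels end up. The seed is arbitrary and only there to make the run reproducible; the names res, check and res_t are made up for illustration.

```r
set.seed(42)  # arbitrary seed, only so the sketch is reproducible

mat1 <- matrix(sample(1:500, 25), ncol = 5,
               dimnames = list(paste("row", 1:5, sep = ""),
                               paste("col", 1:5, sep = "")))
mat2 <- matrix(sample(501:1000, 25), ncol = 5)

res <- apply(mat1, 1, function(x, y) cor(x, y), y = mat2)

dim(res)       # c(n, dim(X)[MARGIN]): n = 5 values per call, 5 rows in mat1
colnames(res)  # taken from rownames(mat1)

# column "row2" holds the correlations of mat1["row2", ] with the columns of mat2
check <- all.equal(res[, "row2"], as.vector(cor(mat1["row2", ], mat2)))

# transpose if you would rather have one row of output per row of mat1
res_t <- t(res)
```

Because each call to FUN contributes one column, the labels of the MARGIN you iterate over show up as column names, which is why t() is needed to get them back as row names.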

Hope this helps,

Stephen

--- Bernzweig, Bruce (Consultant) [EMAIL PROTECTED] wrote:

 In trying to get a better understanding of vectorization I wrote the
 following code:
 
 My objective is to take two sets of time series and calculate the
 correlations for each combination of time series.
 
 mat1 <- matrix(sample(1:500, 25), ncol = 5)
 mat2 <- matrix(sample(501:1000, 25), ncol = 5)
 
 Scenario 1:
 apply(mat1, 1, function(x) cor(mat1, mat2[1,]))
 
 Scenario 2:
 apply(mat1, 1, function(x) cor(mat1, mat2))
 
 Using scenario 1, (output below) I can see that correlations are
 calculated for just the first row of mat2 against each individual row of
 mat1.
 
 Using scenario 2, (output below) I can see that correlations are
 calculated for each row of mat2 against each individual row of mat1.  
 
 Q1: The output of scenario2 consists of 25 rows of data.  Are the first
 five rows mat1 against mat2[1,], the next five rows mat1 against
 mat2[2,], ... last five rows mat1 against mat2[5,]?
 
 Q2: I assign the output of scenario 2 to a new matrix
 
   matC <- apply(mat1, 1, function(x) cor(mat1, mat2))
 
 However, I need a way to identify each row in matC as a pairing of
 rows from mat1 and mat2.  Is there a parameter I can add to apply to do
 this?
 
 Scenario 1:
 > apply(mat1, 1, function(x) cor(mat1, mat2[1,]))
            [,1]       [,2]       [,3]       [,4]       [,5]
 [1,] -0.4626122 -0.4626122 -0.4626122 -0.4626122 -0.4626122
 [2,] -0.9031543 -0.9031543 -0.9031543 -0.9031543 -0.9031543
 [3,]  0.0735273  0.0735273  0.0735273  0.0735273  0.0735273
 [4,]  0.7401259  0.7401259  0.7401259  0.7401259  0.7401259
 [5,] -0.4548582 -0.4548582 -0.4548582 -0.4548582 -0.4548582
 
 Scenario 2:
 > apply(mat1, 1, function(x) cor(mat1, mat2))
              [,1]        [,2]        [,3]        [,4]        [,5]
  [1,]  0.19394126  0.19394126  0.19394126  0.19394126  0.19394126
  [2,]  0.26402400  0.26402400  0.26402400  0.26402400  0.26402400
  [3,]  0.12923842  0.12923842  0.12923842  0.12923842  0.12923842
  [4,] -0.74549676 -0.74549676 -0.74549676 -0.74549676 -0.74549676
  [5,]  0.64074122  0.64074122  0.64074122  0.64074122  0.64074122
  [6,]  0.26931986  0.26931986  0.26931986  0.26931986  0.26931986
  [7,]  0.08527921  0.08527921  0.08527921  0.08527921  0.08527921
  [8,] -0.28034079 -0.28034079 -0.28034079 -0.28034079 -0.28034079
  [9,] -0.15251915 -0.15251915 -0.15251915 -0.15251915 -0.15251915
 [10,]  0.19542415  0.19542415  0.19542415  0.19542415  0.19542415
 [11,]  0.75107032  0.75107032  0.75107032  0.75107032  0.75107032
 [12,]  0.53042767  0.53042767  0.53042767  0.53042767  0.53042767
 [13,] -0.51163612 -0.51163612 -0.51163612 -0.51163612 -0.51163612
 [14,] -0.44396048 -0.44396048 -0.44396048 -0.44396048 -0.44396048
 [15,]  0.57018745  0.57018745  0.57018745  0.57018745  0.57018745
 [16,]  0.70480284  0.70480284  0.70480284  0.70480284  0.70480284
 [17,] -0.36674283 -0.36674283 -0.36674283 -0.36674283 -0.36674283
 [18,] -0.81826607 -0.81826607 -0.81826607 -0.81826607 -0.81826607
 [19,]  0.53145184  0.53145184  0.53145184  0.53145184  0.53145184
 [20,]  0.24568385  0.24568385  0.24568385  0.24568385  0.24568385
 [21,] -0.10610402 -0.10610402 -0.10610402 -0.10610402 -0.10610402
 [22,] -0.78650748 -0.78650748 -0.78650748 -0.78650748 -0.78650748
 [23,]  0.04269423  0.04269423  0.04269423  0.04269423  0.04269423
 [24,]  0.14704698  0.14704698  0.14704698  0.14704698  0.14704698
 [25,]  0.28340166  0.28340166  0.28340166

Re: [R] tagging results of apply

2007-07-22 Thread Stephen Tucker
Actually if you want to tag both column and row, this might also help:

## Give dimension labels to both matrices
mat1 <- matrix(sample(1:500, 25), ncol = 5,
               dimnames=list(paste("mat1row",1:5,sep=""),
                             paste("mat1col",1:5,sep="")))
mat2 <- matrix(sample(501:1000, 25), ncol = 5,
               dimnames=list(paste("mat2row",1:5,sep=""),
                             paste("mat2col",1:5,sep="")))

cor(mat1[1,],mat2)
        mat2col1   mat2col2   mat2col3  mat2col4     mat2col5
[1,] -0.06313535 -0.4679927 -0.5147084 -0.797748 -0.001457972

The column labels are there but are lost when returned from apply(), as it
says in ?apply:

"In all cases the result is coerced by as.vector to one of the basic vector
types before the dimensions are set"

> as.vector(cor(mat1[1,],mat2))
[1] -0.063135353 -0.467992672 -0.514708392 -0.797748010 -0.001457972

You lose the dimension labels in this case, so one option is to guard against
this in the following way:

> as.vector(as.data.frame(cor(mat1[1,],mat2)))
     mat2col1   mat2col2   mat2col3  mat2col4     mat2col5
1 -0.06313535 -0.4679927 -0.5147084 -0.797748 -0.001457972

Unfortunately, if you use 'as.data.frame()' in 'function(x)', apply will
return a list - but you can bind the rows of the output:

> f <- function(x,y) as.data.frame(cor(x,y))
> do.call(rbind, apply(mat1,1,f,y=mat2))
            mat2col1   mat2col2    mat2col3   mat2col4     mat2col5
mat1row1 -0.06313535 -0.4679927 -0.51470839 -0.7977480 -0.001457972
mat1row2 -0.28750363  0.1681777  0.14671484  0.8139768  0.039982028
mat1row3 -0.62017387 -0.6932731 -0.72263865 -0.7929604  0.427366680
mat1row4  0.06441894  0.1707946 -0.11444747 -0.8213577  0.526239013
mat1row5 -0.09849051  0.7024540 -0.01997228  0.3712480  0.439037838

The result is a data frame, not a matrix, and note that the columns/rows are
transposed in relation to the output of
  apply(mat1,1,f,y=mat2)

An alternative is to convert each row of mat1 into a list element [by
transposing it with t() and then feeding it to as.data.frame()] and then use
sapply():

> sapply(as.data.frame(t(mat1)),f,y=mat2)
         mat1row1     mat1row2   mat1row3   mat1row4   mat1row5
mat2col1 -0.06313535  -0.2875036 -0.6201739 0.06441894 -0.0984905
mat2col2 -0.4679927   0.1681777  -0.6932731 0.1707946  0.702454
mat2col3 -0.5147084   0.1467148  -0.7226387 -0.1144475 -0.01997228
mat2col4 -0.797748    0.8139768  -0.7929604 -0.8213577 0.371248
mat2col5 -0.001457972 0.03998203 0.4273667  0.526239   0.4390378
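
As a possible shortcut, and only a sketch rather than what was used above: when both arguments are matrices, cor() correlates the columns of the first with the columns of the second and keeps both sets of labels. Transposing mat1 therefore seems to give the fully tagged matrix in one call. The seed and the names direct, via_apply and same are made up for illustration.

```r
set.seed(42)  # arbitrary seed, only so the sketch is reproducible

mat1 <- matrix(sample(1:500, 25), ncol = 5,
               dimnames = list(paste("mat1row", 1:5, sep = ""),
                               paste("mat1col", 1:5, sep = "")))
mat2 <- matrix(sample(501:1000, 25), ncol = 5,
               dimnames = list(paste("mat2row", 1:5, sep = ""),
                               paste("mat2col", 1:5, sep = "")))

# columns of t(mat1) are the rows of mat1, so this correlates each row of
# mat1 with each column of mat2 and labels both dimensions
direct <- cor(t(mat1), mat2)
dimnames(direct)

# same numbers as the apply() route, which only keeps the mat1 labels
via_apply <- t(apply(mat1, 1, function(x) cor(x, mat2)))
same <- all.equal(unname(direct), unname(via_apply))
```

This returns a plain numeric matrix rather than a data frame or a list matrix, which may or may not be what you want downstream.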




Re: [R] Data Set

2007-07-22 Thread Stephen Tucker
Could you post the output from 

str(data)

?

Perhaps that will give us a clue.

--- amna khan [EMAIL PROTECTED] wrote:

 Sir, the station name S.Sharif exists in the data, but the "not found"
 error still occurs.
 Please help in this regard.
 
 
 On 7/22/07, Gavin Simpson [EMAIL PROTECTED] wrote:
 
  On Sun, 2007-07-22 at 03:25 -0700, amna khan wrote:
   Hi Sir
   I have made a data set having 23 stations of rainfall.
   When I use the attach function to access individual stations, the
   following error occurs.
  
   *attach(data)*
   *S.Sharif   # S.Sharif is the station name, which has 50 data values*
   *Error: object "S.Sharif" not found*
   How can I solve this problem?
 
  Then you don't have a column named exactly S.Sharif in your object
  data.
 
  What does str(data) and names(data) tell you about the columns in your
  data set? If looking at these doesn't help you, post the output from
  str(data) and names(data) and someone might be able to help.
 
  You should always check that R has imported the data in the way you
  expect; just because you think there is something in there called
  S.Sharif doesn't mean R sees it that way.
 
  You also seem to have included the R-Help email address twice in the To:
  header of your email - once is sufficient.
 
  G
 
   Thank You
   Regards
  
  --
  %~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%
  Gavin Simpson [t] +44 (0)20 7679 0522
  ECRC, UCL Geography,  [f] +44 (0)20 7679 0565
  Pearson Building, [e] gavin.simpsonATNOSPAMucl.ac.uk
  Gower Street, London  [w] http://www.ucl.ac.uk/~ucfagls/
  UK. WC1E 6BT. [w] http://www.freshwaters.org.uk
  %~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%
 
 
 
 
 
 -- 
 AMINA SHAHZADI
 Department of Statistics
 GC University Lahore, Pakistan.
 Email:
 [EMAIL PROTECTED]
 [EMAIL PROTECTED]
 [EMAIL PROTECTED]
 
   [[alternative HTML version deleted]]
 


Re: [R] Write columns from within a list to a matrix?

2007-07-22 Thread Stephen Tucker
Very close... Actually it's more like

savecol2=sapply(test, function(x) x[,1])

to get the same matrix as you showed in your for-loop (did you actually want
the first or second column?).

"when I have multiple complex lists I am trying to manage..."
For this, you can try mapply(), which goes something like

mapply(function(x,y) #...function body...#,
       x=list1,y=list2)
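
A minimal runnable sketch of both ideas, in case it helps. The object list2 and the weighted-sum function body are invented purely for illustration; only test and the column extraction come from the thread.

```r
set.seed(1)  # arbitrary seed, only so the sketch is reproducible

# build the example list without a for-loop
test <- lapply(1:3, function(i) cbind(runif(15), rnorm(15, 2)))

# one matrix column per list element
col1 <- sapply(test, function(x) x[, 1])  # first columns, 15 x 3
col2 <- sapply(test, function(x) x[, 2])  # second columns, 15 x 3

# the for-loop from the question, for comparison
loop <- matrix(0, 15, 0)
for (i in 1:3) loop <- cbind(loop, test[[i]][, 1])
same <- all.equal(col1, loop, check.attributes = FALSE)

# mapply() walks two lists in parallel; list2 is a hypothetical second list
list2 <- lapply(1:3, function(i) rnorm(15))
wsum <- mapply(function(x, y) sum(x[, 1] * y), x = test, y = list2)
```

sapply() simplifies to a matrix here because every call returns a vector of the same length (15); if the lengths differed you would get a list back instead.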


--- [EMAIL PROTECTED] wrote:

 Hello,
 
 I think I have a mental block when it comes to working with lists.  lapply
 and sapply appear to do some magical things, but I can't seem to master
 their usage.
 
 As an example, I would like to convert a column within a list to a matrix,
 with the list element corresponding to the new matrix column.
 
 #Here is a simplified example:
 test=vector("list", 3)
 for (i in 1:3){ test[[i]]=cbind(runif(15), rnorm(15,2)) }  #create example
 #list (I'm sure there is a better way to do this too).
 
 #Now, I want to get the second column back out, converting it from a list to
 #a matrix.  This works, but gets confusing/inefficient when I have multiple
 #complex lists I am trying to manage.
 
 savecol2=matrix(0,15,0)
 for (i in 1:3){
 savecol2=cbind(savecol2, test[[i]][,1])
 } 
 
 #Something like??:  (of course this doesn't work)
 savecol2=sapply(test, "[[", function(x) x[2,]) 
 
 Thank you!
 
 Jeff
 


Re: [R] Data Set

2007-07-22 Thread Stephen Tucker
It turns out that names containing "-" or " " (space) are not valid variable
names. You can get around that in two ways:

==
names(Monsoon)[2] <- "S.Sharif"
names(Monsoon)[8] <- "Islamabad.AP"
attach(Monsoon)
S.Sharif
Islamabad.AP
detach(Monsoon)

and do the same for other variable names that contain "-" or " " characters.

=
The other way is to enclose the names in ``. For instance:
attach(Monsoon)
`S-Sharif`
`Islamabad AP`
detach(Monsoon)

Here is my example in which it works:
> x <- list(1:5,6:8)
> names(x) <- c("S-Sharif","Peshawar")
> str(x)
List of 2
 $ S-Sharif: int [1:5] 1 2 3 4 5
 $ Peshawar: int [1:3] 6 7 8
> attach(x)
> `S-Sharif`
[1] 1 2 3 4 5
> detach(x)
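
If there are many such names, renaming them one at a time gets tedious. A sketch of two alternatives follows: indexing by name without attach(), and letting make.names() repair every name at once. The three-station Monsoon list below is a cut-down stand-in built from the first few values in the str() output, not the real data set, and fixed is a made-up name.

```r
# cut-down stand-in for the real Monsoon list (first few values only)
Monsoon <- list(Dir            = c(72.4, 60.7, 52.1),
                `S-Sharif`     = c(23.6, 93.5, 36.3),
                `Islamabad AP` = c(140.2, 69.3))

# 1. no attach() needed: index by the exact name
Monsoon[["S-Sharif"]]
Monsoon$`Islamabad AP`
with(Monsoon, mean(`S-Sharif`))

# 2. turn every name into a syntactically valid one in a single step
fixed <- Monsoon
names(fixed) <- make.names(names(fixed))
names(fixed)  # "Dir" "S.Sharif" "Islamabad.AP"
fixed$S.Sharif
```

make.names() replaces each invalid character with a ".", so "S-Sharif" becomes "S.Sharif" and "Islamabad AP" becomes "Islamabad.AP", matching the manual renaming above.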



--- amna khan [EMAIL PROTECTED] wrote:

 Yes Sir
  I am sending u the clue for data.
 
  str(Monsoon)
 List of 23
  $ Dir   : num [1:40] 72.4 60.7 52.1.
  $ S-Sharif  : num [1:55] 23.6 93.5 36.3  ..
  $ Peshawar  : num [1:57] 54.4 27.7 ...
  $ Kakul : num [1:54]  50.3 116.1 ...
  $ Balakot   : num [1:47] 218.2  76.5 ...
  $ Parachinar: num [1:40] 41.4 37.6 62.2...
  $ Kohat : num [1:53] 50.8 93.2 94.5 ...
  $ Islamabad AP  : num [1:48] 140.2  69.3...
  $ Murree: num [1:47] 130.0 131.3  74.4 ...
  $ Islamabad SRRC: num [1:24] 172.2  82.3 150.1   ...
  $ Mian Wali : num [1:48] 80.5 48.5 56.6 43.2  ...
  $ Jhelum: num [1:57] 111.8  82.3  53.8  94.7  ...
  $ Sialkot   : num [1:55]  62.7 126.0  90.7  ...
  $ D-I Khan  : num [1:57] 24.9 40.6 34.3  ...
  $ Faisalabad: num [1:56] 79.2 43.9 55.4 ...
  $ Lahore: num [1:60] 32.5 81.5 28.7  ...
 
 when I attach the data file and access the site S-Sharif or D-I Khan or
 Mian Wali then error messages occur.
 
 Please help in this regard.
 
 Thank You
 
 


Re: [R] how to combine presence only data sets to one presence/absence table

2007-07-18 Thread Stephen Tucker
I think you can still read as a table, just use argument fill=TRUE.

Reading from Excel in general: you can save data as 'csv' or tab-delimited
file and then use read.csv or read.delim, respectively, or use one of the
packages listed in the following post (for some reason lines breaks are
messed up but hope you can extract the content):
http://tolstoy.newcastle.edu.au/R/e2/help/07/06/19925.html

## read in data
x <- 
read.table(textConnection(
"spl_A  spl_B   spl_C
spcs1   spcs1   spcs2
spcs2   spcs3   spcs3
spcs4   spcs5
spcs5
"),fill=TRUE,header=TRUE,na.string="")

Then,

## 1. find unique
spcs <- sort(na.omit(unique(unlist(x))))
## 2. create matrix of zeros
mat <- matrix(0,ncol=ncol(x),nrow=length(spcs),
              dimnames=list(spcs,names(x))) 
## 3. assign ones to matches
for( i in 1:ncol(mat) ) mat[match(x[,i],rownames(mat)),i] <- 1

Alternatively,
## find unique
spcs <- sort(na.omit(unique(unlist(x))))
## return the matrix you want (combine steps 2 and 3 from above)
sapply(x,function(.x,spcs)
       "names<-"(ifelse(!is.na(match(spcs,.x)),1,0),spcs),spcs)
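
Another way to write the same thing, as a hedged sketch: %in% gives the presence test directly. To keep the example unambiguous, the missing cells are written out as NA in the input (an assumption about how the spreadsheet would be exported), and pa is a made-up name.

```r
# the example data, with the empty cells spelled out as NA
x <- read.table(text = "spl_A spl_B spl_C
spcs1 spcs1 spcs2
spcs2 spcs3 spcs3
spcs4 spcs5 NA
spcs5 NA NA", header = TRUE, stringsAsFactors = FALSE)

# all species seen in any sample
spcs <- sort(unique(na.omit(unlist(x))))

# one column per sample: 1 if the species occurs in that sample, else 0
pa <- sapply(x, function(col) as.integer(spcs %in% col))
rownames(pa) <- spcs
pa
```

The result is an integer matrix with species as rows and samples as columns, which is the layout asked for in the original post.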

Hope this helps.

ST

--- Patrick Zimmermann [EMAIL PROTECTED] wrote:

 Problem: I have a Set of samples each with a list of observed species
 (presence only).
 Data is stored in a excel spreadsheet and the columns (spl) have
 different numbers of observations (spcs).
 Now I want to organize the data in a species by sample matrix with
 presence/absence style in R.
 
 data style (in excel):
 
 spl_A spl_B   spl_C
 spcs1 spcs1   spcs2
 spcs2 spcs3   spcs3
 spcs4 spcs5
 spcs5
 
 desired style:
 
   spl_A   spl_B   spl_C
 spcs1 1   1   0
 spcs2 1   0   1
 spcs3 0   1   1
 .
 .
 .
 
 How and in which form do I import the data to R?
 (read.table() seems not to be appropriate, as data is not organized as a
 table)
 
 How can I create the species by sample matrix?
 
 Thanks for any help,
 Patrick Zimmermann
 


Re: [R] Drawing rectangles in multiple panels

2007-07-17 Thread Stephen Tucker
Thanks very much, Gabor - I hadn't considered this possibility. I always
enjoy your posts!

--- Gabor Grothendieck [EMAIL PROTECTED] wrote:

 Suppose ri were already defined as in the example below.
 Then panel.qrect is a bit harder to define, although with
 some work it's possible, as shown below:
 
 rectInfo <-
    list(matrix(runif(4), 2, 2),
         matrix(runif(4), 2, 2),
         matrix(runif(4), 2, 2))
 
 ri <- function(x, y, ..., rect.info) {
    ri <- rect.info[[packet.number()]]
    panel.rect(ri[1, 1], ri[1, 2], ri[2, 1], ri[2, 2],
               col = "grey86", border = NA)
    panel.xyplot(x, y, ...)
 }
 
 panel.qrect <- function(rect.info) {
    function(x, y, ...) {
       environment(ri) <- environment() ###
       ri(x, y, ..., rect.info = rect.info)
    }
 }
 
 xyplot(runif(30) ~ runif(30) | gl(3, 10),
        panel = panel.qrect(rectInfo))
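
The line marked ### is the interesting one. Here is a stripped-down sketch of what resetting a function's environment does, using plain base R and no lattice. The names lookup, make_lookup and get_rect are invented for illustration, and unlike ri above, lookup() deliberately relies on a free variable so that the effect is visible.

```r
# an existing function that refers to a free variable, rect.info
lookup <- function(i) rect.info[[i]]

make_lookup <- function(rect.info) {
  function(i) {
    # make a local copy of lookup whose enclosing environment is this call
    # frame; rect.info is then found in make_lookup()'s frame by lexical scoping
    environment(lookup) <- environment()
    lookup(i)
  }
}

get_rect <- make_lookup(list("a", "b", "c"))
get_rect(2)  # "b"
```

The assignment only changes the local copy; the original lookup() keeps its own environment, so calling it directly would still fail unless a rect.info happened to be visible from there.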
 
 
 
 On 7/14/07, Stephen Tucker [EMAIL PROTECTED] wrote:
  This is very interesting - but I'm not entirely clear on your last
  statement though about how existing functions can cause problems with the
  scoping that createWrapper() avoids... (but thanks for the tip).
 
 
  --- Gabor Grothendieck [EMAIL PROTECTED] wrote:
 
   Your approach of using closures is cleaner than that
   given below but just for comparison in:
  
   http://tolstoy.newcastle.edu.au/R/devel/06/03/4476.html
  
   there is a createWrapper function which creates a new function based
   on the function passed as its first argument by using the components
   of the list passed as its second argument to overwrite its formal
   arguments.  For example,
  
   createWrapper <- function(FUN, Params) {
      as.function(c(replace(formals(FUN), names(Params), Params),
                    body(FUN)))
   }
  
   library(lattice)
  
   rectInfo <-
      list(matrix(runif(4), 2, 2),
           matrix(runif(4), 2, 2),
           matrix(runif(4), 2, 2))
  
  
   panel.qrect <- function(x, y, ..., rect.info) {
      ri <- rect.info[[packet.number()]]
      panel.rect(ri[1, 1], ri[1, 2], ri[2, 1], ri[2, 2],
                 col = "grey86", border = NA)
      panel.xyplot(x, y, ...)
   }
  
   xyplot(runif(30) ~ runif(30) | gl(3, 10),
          panel = createWrapper(panel.qrect, list(rect.info = rectInfo)))
  
   The createWrapper approach does have an advantage in the situation
   where the function analogous to panel.qrect is existing since using
   scoping then involves manipulation of environments in the closure
   approach.
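
To see what createWrapper() does without any graphics involved, here is a small sketch on an ordinary function. scale_by and times2 are hypothetical names used only for this illustration; createWrapper itself is copied from the message above.

```r
createWrapper <- function(FUN, Params) {
  as.function(c(replace(formals(FUN), names(Params), Params),
                body(FUN)))
}

# an existing function with an argument that has no default
scale_by <- function(x, factor) x * factor

# new function: same body, but 'factor' now defaults to 2
times2 <- createWrapper(scale_by, list(factor = 2))

times2(1:3)               # 2 4 6
times2(1:3, factor = 10)  # the default can still be overridden
formals(times2)$factor    # 2
```

The original function is left untouched; the wrapper is a fresh function whose formal arguments carry the supplied values as defaults.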
  
   On 7/11/07, Stephen Tucker [EMAIL PROTECTED] wrote:
    In the Trellis approach, another way (I like) to deal with multiple
    pieces of external data sources is to 'attach' them to panel functions
    through lexical closures. For instance...
   
    rectInfo <-
       list(matrix(runif(4), 2, 2),
            matrix(runif(4), 2, 2),
            matrix(runif(4), 2, 2))
   
    panel.qrect <- function(rect.info) {
      function(x, y, ...) {
        ri <- rect.info[[packet.number()]]
        panel.rect(ri[1, 1], ri[1, 2], ri[2, 1], ri[2, 2],
                   col = "grey86", border = NA)
        panel.xyplot(x, y, ...)
      }
    }
   
    xyplot(runif(30) ~ runif(30) | gl(3, 10),
           panel = panel.qrect(rectInfo))
   
    ...which may or may not be more convenient than passing rectInfo (and
    perhaps other objects if desired) explicitly as an argument to xyplot().
   
   
--- Deepayan Sarkar [EMAIL PROTECTED] wrote:
   
 On 7/11/07, hadley wickham [EMAIL PROTECTED] wrote:
       A question/comment: I have usually found that the subscripts
       argument is what I need when passing *external* information into
       the panel function, for example, when I wish to add results from a
       fit done external to the trellis call. Fits[subscripts] gives me
       the fits (or whatever) I want to plot for each panel. It is not
       clear to me how the panel layout information from panel.number(),
       etc. would be helpful here instead. Am I correct? -- or is there a
       smarter way to do this that I've missed?
 
      This is one of things that I think ggplot does better - it's much
      easier to plot multiple data sources.  I don't have many examples of
      this yet, but the final example on
      http://had.co.nz/ggplot2/geom_abline.html illustrates the basic idea.

     That's probably true. The Trellis approach is to define a plot by
     data source + type of plot, whereas the ggplot approach (if I
     understand correctly) is to create a specification for the display
     (incrementally?) and then render it. Since the specification can be
     very general, the approach is very flexible. The downside is that you
     need to learn the language.

     On a philosophical note, I think the apparent limitations of Trellis
     in some (not all) cases is just due to the artificial importance given
     to data frames as the one true container for data. Now that we have
     proper multiple dispatch in S4, we can
Re: [R] Drawing rectangles in multiple panels

2007-07-17 Thread Stephen Tucker
Hi Deepayan, that's very hard-core... for the atmospheric science
applications (which is what I do) that I've encountered, (time-series) data
sets are often pre-aggregated before distribution (to 'average out'
instrument noise) so I haven't had the need for such requirements thus far...
but very good to know (and cool demonstrations btw). Thanks!

Stephen

--- Deepayan Sarkar [EMAIL PROTECTED] wrote:

 On 7/14/07, Stephen Tucker [EMAIL PROTECTED] wrote:
 
  I wonder what kind of objects? Are there large advantages for allowing
  lattice functions to operate on objects other than data frames - I
  couldn't find any screenshots of flowViz but I imagine those objects
  would probably be list of arrays and such? I tend to think of mapply()
  [and more recently melt()], etc. could always be applied beforehand,
  but I suppose that would undermine the case for having generic
  functions to support the rich collection of object classes in R...
 
 There's a copy of a presentation at
 

http://www.ficcs.org/meetings/ficcs3/presentations/DeepayanSarkar-flowviz.pdf
 
 and a (largish - 37M) vignette linked from
 
 http://bioconductor.org/packages/2.1/bioc/html/flowViz.html
 
 Neither of these really talk about the challenge posed by the size of
 the data. The data structure, as with most microarray-type
 experiments, is like a data frame, except that the response for every
 experimental unit is itself a large matrix. If we represented the GvHD
 data set (the one used in the examples) as a long format data frame
 that lattice would understand, it would have 585644 rows and 12
 columns (8 measurements that are different for each row, and 4
 phenotypic variables that are the same for all rows coming from a
 single sample). And this is for a smallish subset of the actual
 experiment.
 
 In practice, the data are stored in an environment to prevent
 unnecessary copying, and panel functions only access one data matrix
 at a time.
 
 -Deepayan
 
 
  --- Deepayan Sarkar [EMAIL PROTECTED] wrote:
 
   On 7/11/07, hadley wickham [EMAIL PROTECTED] wrote:
     A question/comment: I have usually found that the subscripts
     argument is what I need when passing *external* information into the
     panel function, for example, when I wish to add results from a fit
     done external to the trellis call. Fits[subscripts] gives me the fits
     (or whatever) I want to plot for each panel. It is not clear to me how
     the panel layout information from panel.number(), etc. would be
     helpful here instead. Am I correct? -- or is there a smarter way to do
     this that I've missed?
   
    This is one of things that I think ggplot does better - it's much
    easier to plot multiple data sources.  I don't have many examples of
    this yet, but the final example on
    http://had.co.nz/ggplot2/geom_abline.html illustrates the basic idea.
  
   That's probably true. The Trellis approach is to define a plot by
   data source + type of plot, whereas the ggplot approach (if I
   understand correctly) is to create a specification for the display
   (incrementally?) and then render it. Since the specification can be
   very general, the approach is very flexible. The downside is that you
   need to learn the language.
  
   On a philosophical note, I think the apparent limitations of Trellis
   in some (not all) cases is just due to the artificial importance given
   to data frames as the one true container for data. Now that we have
   proper multiple dispatch in S4, we can write methods that behave like
   traditional Trellis calls but work with more complex data structures.
   We have tried this in one bioconductor package (flowViz) with
   encouraging results.
  
   -Deepayan
 



   


Re: [R] Optimization

2007-07-17 Thread Stephen Tucker
My apologies, didn't see the boundary constraints. Try this one...

f <- function(x)
  (sqrt((x[1]*0.114434)^2+(x[2]*0.043966)^2+(x[3]*0.100031)^2)-0.04)^2

optim(par=rep(0,3),f,lower=rep(0,3),upper=rep(1,3),method="L-BFGS-B")

and check ?optim
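
Note that f above only looks for a point that satisfies the square-root condition; it does not use the objective or the x1+x2+x3=1 condition. One rough way to fold everything into a single optim() call is a quadratic penalty, sketched below. This is an assumption-laden illustration, not a recommended method: the penalty weight w = 1e3 and the starting point are arbitrary, the square-root condition is treated as an equality, and the constraints will only be met approximately. The names ret, sdv, risk and penalized are made up.

```r
ret <- c(0.021986, 0.000964, 0.02913)   # coefficients of the objective
sdv <- c(0.114434, 0.043966, 0.100031)  # coefficients inside the square root

risk <- function(x) sqrt(sum((x * sdv)^2))

# minimise: -objective + penalties for violating the two equality conditions
penalized <- function(x, w = 1e3) {
  -sum(ret * x) + w * (sum(x) - 1)^2 + w * (risk(x) - 0.04)^2
}

res <- optim(par = rep(1/3, 3), fn = penalized, method = "L-BFGS-B",
             lower = rep(0, 3), upper = rep(1, 3))

res$par        # candidate x1, x2, x3 (always inside the 0..1 bounds)
sum(res$par)   # should be close to 1
risk(res$par)  # should be close to 0.04
sum(ret * res$par)
```

It is worth checking the last three values by hand and trying a few starting points and penalty weights; for anything serious, a solver that handles equality constraints directly would be the safer choice.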

--- massimiliano.talarico [EMAIL PROTECTED] wrote:

 I'm sorry the function is 
 
 sqrt((x1*0.114434)^2+(x2*0.043966)^2+(x3*0.100031)^2)=0.04;
 
 Have you any suggests.
 
 Thanks,
 Massimiliano
 
 
 
 What is radq?
 
 --- massimiliano.talarico
 [EMAIL PROTECTED] wrote:
 
  Dear all,
  I need a suggest to obtain the max of this function:
  
  Max x1*0.021986+x2*0.000964+x3*0.02913
  
  with these conditions:
  
  x1+x2+x3=1;
 
 radq((x1*0.114434)^2+(x2*0.043966)^2+(x3*0.100031)^2)=0.04;
  x1>=0;
  x1<=1;
  x2>=0;
  x2<=1;
  x3>=0;
  x3<=1;
  
  Any suggests ?
  
  Thanks in advanced,
  Massimiliano
  


Re: [R] scaling of different data sets in ggplot

2007-07-17 Thread Stephen Tucker
Hi Hadley,

That was my initial thought as well: that having different scales on the same
figure might obscure the structure and meaning of the data. But in some
instances (e.g., publications where page limits are imposed) I think it's
desirable to condense a lot of information onto a single plot (for instance,
if two series show the same trend, even if they are not in the same units),
which means having more than one scale in the same plotting window. I haven't
checked what Tufte, Cleveland, and Wilkinson have to say about this, but in
practice I don't think it's all that uncommon.

I agree that log(z) is an operation on the data set, but representing it
graphically can be accomplished either through plotting log(z), or plotting z
on a log scale... in either case having an extra axis showing y and z [and
not log(z)] would be nice I would think.

I haven't tried it in lattice, but in the traditional graphics system it is
quite straightforward. You claim that ggplot 'tries to take the good parts of
base and lattice graphics and none of the bad parts' - just trying to hold
you to your word :).
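
For reference, a sketch of the kind of thing meant by "straightforward" in the traditional graphics system: overlay a second plot with par(new = TRUE) and draw its axis on the right. The data x, y and z are invented here, and ylog_second is only a made-up variable recording that the second plot really used a log axis.

```r
x <- 1:20
y <- x / 2       # made-up series on a linear scale
z <- exp(x / 4)  # made-up series shown on a log scale

op <- par(mar = c(4, 4, 1, 4))  # leave room for a right-hand axis

plot(x, y, type = "b", ylab = "y")

par(new = TRUE)  # draw the next plot on top of the current one
plot(x, z, type = "b", log = "y", axes = FALSE,
     xlab = "", ylab = "", col = "red")
ylog_second <- par("ylog")  # TRUE while the log-scaled plot is current

axis(4)  # right-hand axis labelled in the units of z
mtext("z (log scale)", side = 4, line = 2.5)

par(op)
```

Whether such a figure is a good idea is exactly the point under discussion; the sketch only shows that base graphics leaves the choice to the user.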

Seriously though, I think the idea of ggplot (and implementation) is really
great. Currently R has many graphics systems, of which I know traditional and
lattice - and both are really fantastic (I plan to learn grid sometime in the
future) and I am fanatical about them. But for students and colleagues who
have less programming experience, I think the learning curve for lattice (to
gain proficiency, that is) may be a tad steep... I've been playing around
with ggplot to see if it would be a gentler introduction to conditioning
plots and analysis of multivariate datasets - which, in a way, I think it
could be - so I'm currently trying to test the limits of its flexibility.
It's true that there are some plotting concepts that are generally
discouraged, but it seems to me that the ultimate discretion should lie with
the user, and the plotting system should give him/her the freedom to choose
[to make a bad plot]. Even Lee Wilkinson says in his book that his grammar
will allow someone to make meaningless plots. One example that comes to mind
is the pie chart - I know it is heavily discouraged, but in some communities
it's commonly used and therefore expected; to communicate to that particular
audience it's sometimes necessary to speak their language...

So, hope you don't mind, but I may ask some more 'can ggplot do this'
questions in the future. But keep up the good work,

Stephen


--- hadley wickham [EMAIL PROTECTED] wrote:

 Hi Stephen,
 
 You can't do that in ggplot (have two different scales) because I
 think it's generally a really bad idea.  The whole point of plotting
 the data is so that you can use your visual abilities to gain insight
 into the data.  When you have two different scales the positions of
 the two groups are essentially arbitrary - the data only have x values
 in common, not y values.  You essentially have two almost unrelated
 graphs plotted on top of each other.
 
 On the other hand, for this data, I think it would be reasonable to
 plot log(z) and y on the same scale - the data is transformed not the
 scales.
 
 Hadley
 
 On 7/14/07, Stephen Tucker [EMAIL PROTECTED] wrote:
  Dear list (but probably mostly Hadley):
 
  In ggplot, operations to modify 'guides' are accessed through grid
  objects, but I did not find mention of creating new guides or possibly
  removing them altogether using ggplot functions. I wonder if this is
  something I need to learn grid to learn more about (which I hope to do
  eventually).
 
  Also, ggplot()+geom_object() [where 'object' can be point, line, etc.]
  or layer() contains specification for the data, mappings and
  geoms/stats - but the geoms/stats can be scale-dependent [for
  instance, log]. so I wonder how different scalings can be applied to
  different data sets.
 
  Below is an example that requires both:
 
  x <- runif(100)
  y <- exp(x^2)
  z <- x^2+rnorm(100,0,0.02)
 
  par(mar=c(5,4,2,4)+0.1)
  plot(x,y,log="y")
  lines(lowess(x,y,f=1/3))
  par(new=TRUE)
  plot(x,z,col=2,pch=3,yaxt="n",ylab="")
  lines(lowess(x,z,f=1/3),col=2)
  axis(4,col=2,col.axis=2)
  mtext("z",4,line=3,col=2)
 
  In ggplot:
 
  ## data specification
  ggplot(data=data.frame(x,y,z)) +
 
    ## first set of points
    geom_point(mapping=aes(x=x,y=y)) +
    ## scale_y_log() +
 
    ## second set of points
    geom_point(mapping=aes(x=x,y=z),pch=3) +
    ## layer(mapping=aes(x=x,y=z),stat="smooth",method="loess") +
    ## scale_y_continuous()
 
  scale_y_log() and scale_y_continuous() appear to apply to both mappings
 at
  once, and I can't figure out how to associate them with the intended ones
 (I
  expect this will be a desire for size and color scales as well).
 
  Of course, I can always try to fool the system by (1) applying the
 scaling a
  priori to create a new variable, (2) plotting points from the new
 variable,
  and (3) creating a new axis with custom labels. Which

Re: [R] Optimization

2007-07-17 Thread Stephen Tucker
f <- function(x)
  (sqrt((x[1]*0.114434)^2+(x[2]*0.043966)^2+(x[3]*0.100031)^2)-0.04)^2

optim(c(0,0,0),f)

see ?optim for details on arguments, options, etc.

--- massimiliano.talarico [EMAIL PROTECTED] wrote:

 I'm sorry the function is 
 
 sqrt((x1*0.114434)^2+(x2*0.043966)^2+(x3*0.100031)^2)=0.04;
 
 Have you any suggests.
 
 Thanks,
 Massimiliano
 
 
 
 What is radq?
 
 --- massimiliano.talarico
 [EMAIL PROTECTED] wrote:
 
  Dear all,
  I need a suggest to obtain the max of this function:
  
  Max x1*0.021986+x2*0.000964+x3*0.02913
  
  with these conditions:
  
  x1+x2+x3=1;
 
 radq((x1*0.114434)^2+(x2*0.043966)^2+(x3*0.100031)^2)=0.04;
  x1>=0;
  x1<=1;
  x2>=0;
  x2<=1;
  x3>=0;
  x3<=1;
  
  Any suggests ?
  
  Thanks in advanced,
  Massimiliano
  


Re: [R] Optimization

2007-07-17 Thread Stephen Tucker
My apologies, I read the post over too quickly (even the second time). 

It's been a while since I've played around with anything other than box
constraints, but this one is conducive to a brute-force approach (employing
Berwin's suggestions). The pseudo-code would look something like this:

delta <- 1e-3   # grid spacing for x3, the smaller the better
oldvalue <- -Inf # some initial value for objective function
for( x3 in seq(0,1,by=delta) ) {
  ## calculate x1,x2 as per Berwin's response
  ## if all constraints are met, feasible <- TRUE
  ## else feasible <- FALSE
  if( !feasible ) next # if not feasible, go to next x3 value
  ## newvalue <- value of objective function with x1,x2,x3
  if( newvalue > oldvalue ) {
    oldvalue <- newvalue
    max.x1 <- x1; max.x2 <- x2; max.x3 <- x3
  }
}

You should end up with the desired values of max.x1, max.x2, max.x3. Hope
this helps,

ST
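
For reference, one way the pseudo-code could be filled in. Berwin's response
is not quoted in this thread, so the x1/x2 step here is an assumption: it
substitutes x2 = 1 - x3 - x1 into the squared sqrt constraint (treated as an
equality, as the earlier optim() reply does) and solves the resulting
quadratic in x1. The names a, r, target and best are made up for this sketch:

```r
## coefficients copied from the thread
a <- c(0.114434, 0.043966, 0.100031)   # weights inside the sqrt constraint
r <- c(0.021986, 0.000964, 0.02913)    # objective coefficients
target <- 0.04                         # assumed equality: sqrt(...) == 0.04

best <- list(value = -Inf, x = c(NA, NA, NA))
for (x3 in seq(0, 1, by = 1e-3)) {
  s <- 1 - x3                          # x1 + x2 must equal s
  ## with x2 = s - x1 the squared constraint is A*x1^2 + B*x1 + C = 0
  A <- a[1]^2 + a[2]^2
  B <- -2 * a[2]^2 * s
  C <- a[2]^2 * s^2 + a[3]^2 * x3^2 - target^2
  disc <- B^2 - 4 * A * C
  if (disc < 0) next                   # no real root: this x3 is infeasible
  for (x1 in (-B + c(-1, 1) * sqrt(disc)) / (2 * A)) {
    x <- c(x1, s - x1, x3)
    if (any(x < 0) || any(x > 1)) next # box constraints violated
    value <- sum(r * x)
    if (value > best$value) best <- list(value = value, x = x)
  }
}
best
```

Only x3 is gridded, so the answer is accurate to about the grid spacing in
x3; x1 and x2 satisfy both equality constraints exactly for each grid value.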



--- massimiliano.talarico [EMAIL PROTECTED] wrote:

 Thanks for your suggests, but I need to obtain the MAX of
 this function:
 
 Max x1*0.021986+x2*0.000964+x3*0.02913
 
 with these conditions:
 
 x1+x2+x3=1;
 
 sqrt((x1*0.114434)^2+(x2*0.043966)^2+(x3*0.100031)^2)=0.04;
 
 x1>=0;
 x2>=0;
 x3>=0;
 
 
 Thanks and again Thanks,
 Massimiliano
 
 
 
 My apologies, didn't see the boundary constraints. Try this
 one...
 
 f <- function(x)
   (sqrt((x[1]*0.114434)^2+(x[2]*0.043966)^2+(x[3]*0.100031)^2)-0.04)^2
 
 optim(par=rep(0,3),f,lower=rep(0,3),upper=rep(1,3),method="L-BFGS-B")
 
 and check ?optim
 
 --- massimiliano.talarico
 [EMAIL PROTECTED] wrote:
 
  I'm sorry the function is
 
  sqrt((x1*0.114434)^2+(x2*0.043966)^2+(x3*0.100031)^2)
 =0.04;
 
  Have you any suggests.
 
  Thanks,
  Massimiliano
 
 
 
  What is radq?
 
  --- massimiliano.talarico
  [EMAIL PROTECTED] wrote:
 
   Dear all,
   I need a suggest to obtain the max of this function:
  
   Max x1*0.021986+x2*0.000964+x3*0.02913
  
   with these conditions:
  
   x1+x2+x3=1;
  
  radq((x1*0.114434)^2+(x2*0.043966)^2+(x3*0.100031)^2)
 =0.04;
   x1>=0;
   x1<=1;
   x2>=0;
   x2<=1;
   x3>=0;
   x3<=1;
  
   Any suggests ?
  
   Thanks in advanced,
   Massimiliano
  


Re: [R] Alternative to xyplot()?

2007-07-17 Thread Stephen Tucker
What's wrong with lattice? Here's an alternative:

library(ggplot2)
ggplot(data=data.frame(x,y,grps=factor(grps)), 
   mapping=aes(x=x,y=y,colour=grps)) + # define data
  geom_point() +# points
  geom_smooth(method="lm") # regression line



--- Ben Bolker [EMAIL PROTECTED] wrote:

 Manuel Morales Manuel.A.Morales at williams.edu writes:
 
  
  Sorry. I was thinking of the groups functionality, as illustrated
  below:
  
  grps<-rep(c(1:3),10)
  x<-rep(c(1:10),3)
  y<-x+grps+rnorm(30)
  library(lattice)
  xyplot(y~x,group=grps, type=c("r","p"))
 
   The points (type p) are easy, the regression lines (type r) are a
 little
 harder. How about:
 
 
 plot(y~x,col=grps)
 invisible(mapply(function(z,col) {abline(lm(y~x,data=z),col=col)},
   split(data.frame(x,y),grps),1:3))
 
   cheers
 Ben Bolker
 


[R] polymorphic functions in ggplot? (WAS Re: Drawing rectangles in multiple panels)

2007-07-14 Thread Stephen Tucker
Regarding your earlier statement,

I tend to think in very data centric approach, where you first generate the
data (in a data frame) and then you plot it. There is very little data
creation/modification during the plotting itself...

Is the data generation and plotting truly separate and sequential? I'm
not entirely clear on this point - as statistical
transformations/operations return objects that require new variables
to be created - and this may be rooted in semantics (the verbal one,
not the computational) of the grammar of graphics - in the online book
draft of 'ggplot' it says (p. 37)

The explicit transformation stage was dropped because variable
transformations are already so easy in R: they do not need to be part
of the grammar.

In my understanding of what transformations are defined to be, they
involve statistical ones - which perhaps I'm not truly getting because
transformations are defined (by L. Wilkinson) as a mapping of elements
of one set to elements of the same set, and yet a function like
median() will accept a list (of values) and return a single
value... in any case maybe there is a distinction between a
statistical 'transformation' and a statistical 'operation' that I've
missed, but statistical 'transformations' are included in ggplot's
stat functions. L. Wilkinson also seems to include an explicit TRANS
specification at times (for example, in the case of the boxplot on
p.60) and at other times nest it into the ELEMENT specification (for
example, the histogram on p. 47).

In any case, I interpret that the following progression is achieved
through 'data operations' and 'application of algebra' in the language
of L. Wilkinson and through I/O, merge, reshape, and other functions
in R:

source object -> variables -> varset

A statistic might then be computed on the varset, which will return
another source object (true in R as well: e.g., class 'histogram' or
'lm') from which variables can again be extracted, varsets
constructed, etc. to yield a list of tuples to be associated with
geometrical and aesthetic attributes. Indeed, in the bootstrap
example, L. Wilkinson begins by extracting variables from a bootstrap
function on another variable that has not explicitly been created from
source (dataset).

So it's not clear to me that the data creation step is necessarily
distinct from the plotting, as it is more (but not completely) so in
the traditional graphics system:

## DATA specification
variable <- rnorm(100)
## TRANS specification
statsObj <- hist(variable,nclass=20,plot=FALSE)
## Transformed data is plotted (variables extracted implicitly and
## associated with default geometry/aesthetic mappings)
plot(statsObj)

Below is an analogous plot in ggplot, where the creation of the
summary object occurs as part of the grammar:

ggplot(data=data.frame(variable),mapping=aes(x=variable)) +
stat_bin(breaks=statsObj$breaks)

Since not all statistical transformations/operations are handled by
ggplot, it seems that working with non-data-frame objects (for
example, of class 'nls' or 'rlm') requires data operations (p.7) (to
extract fitted values, etc.). Of course, R provides these facilities,
but the plotting functions in the traditional graphics system
accommodate a number of object classes through polymorphic
functions. I wonder if in a similar way for ggplot, stat_bin could
accept objects of 'histogram' class [hist() allows the user to specify
'nclass', which will then compute the breaks], or stat_smooth could
accept 'rlm' objects. Of course, in the case of an 'lm' object, plot()
additionally gives diagnostic (residual and Q-Q) plots but that type of
response does not seem to fit in with the expected behavior of ggplot
functions...
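
To make the 'data operations' step concrete, here is a small sketch in base R
of pulling a summary object back into a data frame before plotting. The names
binned and fits, and the simulated x/y used for the lm fit, are only
illustrations and not part of ggplot:

```r
set.seed(42)
variable <- rnorm(100)

## TRANS step: hist() returns an object of class "histogram"
statsObj <- hist(variable, nclass = 20, plot = FALSE)

## data operation: extract what a plotting layer would need from the
## summary object and put it back into a data frame
binned <- data.frame(mid   = statsObj$mids,
                     count = statsObj$counts)

## the same idea for a model object, using the generic fitted()
x    <- seq(0, 1, length.out = 50)
y    <- 2 * x + rnorm(50, sd = 0.1)
fit  <- lm(y ~ x)
fits <- data.frame(x = x, fitted = fitted(fit))

head(binned)
head(fits)
```

Either data frame can then be handed to any plotting function that expects
one, at the cost of doing the extraction by hand for each object class.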


--- hadley wickham [EMAIL PROTECTED] wrote:

 On 7/12/07, Deepayan Sarkar [EMAIL PROTECTED] wrote:
  On 7/11/07, hadley wickham [EMAIL PROTECTED] wrote:
A question/comment: I have usually found that the subscripts argument
 is
what I need when passing *external* information into the panel
 function, for
example, when I wish to add results from a fit done external to the
 trellis
call. Fits[subscripts] gives me the fits (or whatever) I want to plot
 for
each panel. It is not clear to me how the panel layout information
 from
panel.number(), etc. would be helpful here instead. Am I correct? --
 or is
there a smarter way to do this that I've missed?
  
   This is one of things that I think ggplot does better - it's much
   easier to plot multiple data sources.  I don't have many examples of
   this yet, but the final example on
   http://had.co.nz/ggplot2/geom_abline.html illustrates the basic idea.
 
  That's probably true. The Trellis approach is to define a plot by
  data source + type of plot, whereas the ggplot approach (if I
  understand correctly) is to create a specification for the display
  (incrementally?) and then render it. Since the specification can be
  very general, the approach is very flexible. The downside is that you
  need to 

[R] scaling of different data sets in ggplot

2007-07-14 Thread Stephen Tucker
Dear list (but probably mostly Hadley):

In ggplot, operations to modify 'guides' are accessed through grid
objects, but I did not find mention of creating new guides or possibly
removing them altogether using ggplot functions. I wonder if this is
something I need to learn grid to learn more about (which I hope to do
eventually).

Also, ggplot()+geom_object() [where 'object' can be point, line, etc.]
or layer() contains specification for the data, mappings and
geoms/stats - but the geoms/stats can be scale-dependent [for
instance, log]. so I wonder how different scalings can be applied to
different data sets.

Below is an example that requires both:

x <- runif(100)
y <- exp(x^2)
z <- x^2+rnorm(100,0,0.02)

par(mar=c(5,4,2,4)+0.1)
plot(x,y,log="y")
lines(lowess(x,y,f=1/3))
par(new=TRUE)
plot(x,z,col=2,pch=3,yaxt="n",ylab="")
lines(lowess(x,z,f=1/3),col=2)
axis(4,col=2,col.axis=2)
mtext("z",4,line=3,col=2)

In ggplot:

## data specification
ggplot(data=data.frame(x,y,z)) +

  ## first set of points
  geom_point(mapping=aes(x=x,y=y)) +
  ## scale_y_log() +

  ## second set of points
  geom_point(mapping=aes(x=x,y=z),pch=3) +
  ## layer(mapping=aes(x=x,y=z),stat="smooth",method="loess") +
  ## scale_y_continuous()

scale_y_log() and scale_y_continuous() appear to apply to both mappings at
once, and I can't figure out how to associate them with the intended ones (I
expect this will be a desire for size and color scales as well).

Of course, I can always try to fool the system by (1) applying the scaling a
priori to create a new variable, (2) plotting points from the new variable,
and (3) creating a new axis with custom labels. Which then brings me back to
...how to add new guides? :)
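
A minimal sketch of that three-step workaround in base graphics, with x, y
and z regenerated so the block is self-contained. The tick values in yticks
and the axis labels are illustrative choices, not anything ggplot provides:

```r
set.seed(1)
x <- runif(100)
y <- exp(x^2)
z <- x^2 + rnorm(100, 0, 0.02)

## (1) apply the scaling beforehand to create a new variable
logy <- log(y)

## (2) plot both series against the one shared (transformed) scale
par(mar = c(5, 4, 2, 4) + 0.1)
plot(x, logy, ylim = range(logy, z), yaxt = "n", ylab = "y (log scale)")
points(x, z, col = 2, pch = 3)

## (3) custom axes: the left one is placed at log(yticks) but labelled in
##     the original units of y; the right one shows z as plotted
yticks <- c(1, 1.5, 2, 2.5)
axis(2, at = log(yticks), labels = yticks)
axis(4, col = 2, col.axis = 2)
mtext("z", 4, line = 3, col = 2)
```

Because the left axis is drawn at log(yticks) but labelled with yticks, it
reads in the units of y while both series share one set of positions.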

Thanks,

Stephen



  




Re: [R] How to read many files at one time?

2007-07-14 Thread Stephen Tucker
This should do it:

allData <- sapply(paste("Sim",1:20,sep=""),
  function(.x) read.table(paste(.x,"txt",sep=".")),
  simplify=FALSE)

see ?read.table for specification of delimiters, etc.

allData will be a list, and you can access the contents of each file by
any of the following commands:
allData[[2]]
allData[["Sim2"]]
allData$Sim2
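
A self-contained version that can be run anywhere, assuming nothing about
your files: it first writes three small stand-in files (Sim1.txt to Sim3.txt,
with made-up columns a and b) into a temporary directory and then reads them
back the same way. The names dir and ids are only for this sketch:

```r
## write three small stand-in files to a temporary directory
dir <- tempfile("sims")
dir.create(dir)
ids <- paste("Sim", 1:3, sep = "")
for (id in ids) {
  write.table(data.frame(a = 1:2, b = c(0.5, 1.5)),
              file = file.path(dir, paste(id, "txt", sep = ".")),
              row.names = FALSE)
}

## read them back into a named list, one data frame per file
allData <- sapply(ids,
                  function(.x)
                    read.table(file.path(dir, paste(.x, "txt", sep = ".")),
                               header = TRUE),
                  simplify = FALSE)

names(allData)
allData[["Sim2"]]
```

With simplify=FALSE, sapply() keeps the character vector it iterated over as
the names of the list, which is why the three access forms are equivalent.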


--- Zhang Jian [EMAIL PROTECTED] wrote:

 I want to load many files in the R. The names of the files are Sim1.txt,
 
 Sim2.txt, Sim3.txt, Sim4.txt, Sim5.txt and so on.
 Can I read them at one time? What should I do? I can give the same names in
 R.
 Thanks.
 
 For example:
  tst=paste("Sim",1:20,".txt",sep="") # the file names
  tst
  [1] Sim1.txt  Sim2.txt  Sim3.txt  Sim4.txt  Sim5.txt  Sim6.txt
  [7] Sim7.txt  Sim8.txt  Sim9.txt  Sim10.txt Sim11.txt
 Sim12.txt
 [13] Sim13.txt Sim14.txt Sim15.txt Sim16.txt Sim17.txt
 Sim18.txt
 [19] Sim19.txt Sim20.txt
 
  data.name=paste("Sim",1:20,sep="") # the file names in R
  data.name
  [1] Sim1  Sim2  Sim3  Sim4  Sim5  Sim6  Sim7  Sim8  Sim9
 [10] Sim10 Sim11 Sim12 Sim13 Sim14 Sim15 Sim16 Sim17
 Sim18
 [19] Sim19 Sim20
 


Re: [R] Drawing rectangles in multiple panels

2007-07-14 Thread Stephen Tucker

I wonder what kind of objects? Are there large advantages to allowing
lattice functions to operate on objects other than data frames? I
couldn't find any screenshots of flowViz, but I imagine those objects
would probably be lists of arrays and such. I tend to think that mapply()
[and more recently melt()], etc. could always be applied beforehand,
but I suppose that would undermine the case for having generic
functions to support the rich collection of object classes in R...


--- Deepayan Sarkar [EMAIL PROTECTED] wrote:

 On 7/11/07, hadley wickham [EMAIL PROTECTED] wrote:
   A question/comment: I have usually found that the subscripts argument
 is
   what I need when passing *external* information into the panel
 function, for
   example, when I wish to add results from a fit done external to the
 trellis
   call. Fits[subscripts] gives me the fits (or whatever) I want to plot
 for
   each panel. It is not clear to me how the panel layout information from
   panel.number(), etc. would be helpful here instead. Am I correct? -- or
 is
   there a smarter way to do this that I've missed?
 
  This is one of things that I think ggplot does better - it's much
  easier to plot multiple data sources.  I don't have many examples of
  this yet, but the final example on
  http://had.co.nz/ggplot2/geom_abline.html illustrates the basic idea.
 
 That's probably true. The Trellis approach is to define a plot by
 data source + type of plot, whereas the ggplot approach (if I
 understand correctly) is to create a specification for the display
 (incrementally?) and then render it. Since the specification can be
 very general, the approach is very flexible. The downside is that you
 need to learn the language.
 
 On a philosophical note, I think the apparent limitations of Trellis
 in some (not all) cases is just due to the artificial importance given
 to data frames as the one true container for data. Now that we have
 proper multiple dispatch in S4, we can write methods that behave like
 traditional Trellis calls but work with more complex data structures.
 We have tried this in one bioconductor package (flowViz) with
 encouraging results.
 
 -Deepayan
 


Re: [R] Drawing rectangles in multiple panels

2007-07-14 Thread Stephen Tucker
This is very interesting - but I'm not entirely clear on your last statement
though about how existing functions can cause problems with the scoping that
createWrapper() avoids... (but thanks for the tip).
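
One possible reading of that statement, sketched in base R rather than with
lattice. The functions shift, shiftFree and makeShift are hypothetical
stand-ins for an existing panel function; createWrapper is copied from the
message quoted below:

```r
createWrapper <- function(FUN, Params) {
  as.function(c(replace(formals(FUN), names(Params), Params), body(FUN)))
}

## an existing function that takes its extra data as a formal argument
shift <- function(x, offset) x + offset

## createWrapper: copy the function with `offset` filled in as a default
viaWrapper <- createWrapper(shift, list(offset = 10))

## closure, case 1: write a new function around the existing one
makeShift <- function(offset) function(x) shift(x, offset)
viaClosure <- makeShift(10)

## closure, case 2: the existing function uses `offset` as a free variable,
## so scoping only helps after its environment has been replaced by hand
shiftFree <- function(x) x + offset
env <- new.env()
assign("offset", 10, envir = env)
environment(shiftFree) <- env

viaWrapper(1:3)
viaClosure(1:3)
shiftFree(1:3)
```

All three calls give the same result; the difference is only in where the
extra value lives (a default argument, an enclosing frame, or a manually
assigned environment).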


--- Gabor Grothendieck [EMAIL PROTECTED] wrote:

 Your approach of using closures is cleaner than that
 given below but just for comparison in:
 
 http://tolstoy.newcastle.edu.au/R/devel/06/03/4476.html
 
 there is a createWrapper function which creates a new function based
 on the function passed as its first argument by using the components
 of the list passed as its second argument to overwrite its formal
 arguments.  For example,
 
 createWrapper <- function(FUN, Params) {
    as.function(c(replace(formals(FUN), names(Params), Params), body(FUN)))
 }
 
 library(lattice)
 
 rectInfo <-
    list(matrix(runif(4), 2, 2),
         matrix(runif(4), 2, 2),
         matrix(runif(4), 2, 2))
 
 
 panel.qrect <- function(x, y, ..., rect.info) {
    ri <- rect.info[[packet.number()]]
    panel.rect(ri[1, 1], ri[1, 2], ri[2, 1], ri[2, 2],
               col = "grey86", border = NA)
    panel.xyplot(x, y, ...)
 }
 
 xyplot(runif(30) ~ runif(30) | gl(3, 10),
        panel = createWrapper(panel.qrect, list(rect.info = rectInfo)))
 
 The createWrapper approach does have an advantage in the situation
 where the function analogous to panel.qrect is existing since using
 scoping then involves manipulation of environments in the closure
 approach.
 
 On 7/11/07, Stephen Tucker [EMAIL PROTECTED] wrote:
  In the Trellis approach, another way (I like) to deal with multiple
 pieces of
  external data sources is to 'attach' them to panel functions through
 lexical
  closures. For instance...
 
  rectInfo <-
     list(matrix(runif(4), 2, 2),
          matrix(runif(4), 2, 2),
          matrix(runif(4), 2, 2))
 
  panel.qrect <- function(rect.info) {
    function(x, y, ...) {
      ri <- rect.info[[packet.number()]]
      panel.rect(ri[1, 1], ri[1, 2], ri[2, 1], ri[2, 2],
                 col = "grey86", border = NA)
      panel.xyplot(x, y, ...)
    }
  }
 
  xyplot(runif(30) ~ runif(30) | gl(3, 10),
         panel = panel.qrect(rectInfo))
 
  ...which may or may not be more convenient than passing rectInfo (and
 perhaps
  other objects if desired) explicitly as an argument to xyplot().
 
 
  --- Deepayan Sarkar [EMAIL PROTECTED] wrote:
 
   On 7/11/07, hadley wickham [EMAIL PROTECTED] wrote:
 A question/comment: I have usually found that the subscripts
 argument
   is
 what I need when passing *external* information into the panel
   function, for
 example, when I wish to add results from a fit done external to the
   trellis
 call. Fits[subscripts] gives me the fits (or whatever) I want to
 plot
   for
 each panel. It is not clear to me how the panel layout information
 from
 panel.number(), etc. would be helpful here instead. Am I correct?
 -- or
   is
 there a smarter way to do this that I've missed?
   
This is one of things that I think ggplot does better - it's much
easier to plot multiple data sources.  I don't have many examples of
this yet, but the final example on
http://had.co.nz/ggplot2/geom_abline.html illustrates the basic idea.
  
   That's probably true. The Trellis approach is to define a plot by
   data source + type of plot, whereas the ggplot approach (if I
   understand correctly) is to create a specification for the display
   (incrementally?) and then render it. Since the specification can be
   very general, the approach is very flexible. The downside is that you
   need to learn the language.
  
   On a philosophical note, I think the apparent limitations of Trellis
   in some (not all) cases is just due to the artificial importance given
   to data frames as the one true container for data. Now that we have
   proper multiple dispatch in S4, we can write methods that behave like
   traditional Trellis calls but work with more complex data structures.
   We have tried this in one bioconductor package (flowViz) with
   encouraging results.
  
   -Deepayan
  

Re: [R] Drawing rectangles in multiple panels

2007-07-11 Thread Stephen Tucker
Not that Trellis/lattice was entirely easy to learn at first. :)

I've been playing around with ggplot2 and there is a plot()-like wrapper for
building a quick plot [incidentally, called qplot()], but otherwise it's my
understanding that you superpose elements (incrementally) to build up to the
graph you want. Here is the same plot in ggplot2:

rectInfo <-
    list(matrix(runif(4), 2, 2),
         matrix(runif(4), 2, 2),
         matrix(runif(4), 2, 2))

library(ggplot2)
ggopt(grid.fill = "white") # just my preference
## original plot of points
p <-
qplot(x,y,data=data.frame(x=runif(30),y=runif(30),f=gl(3,30)),facets=f~.)
# print(p)

## external data (rectangles) - in coordinates for geom_polygon 
x <- do.call(rbind,
             mapply(function(.r,.f)
                    data.frame(x=.r[c(1,1,2,2),1],y=.r[c(1,2,2,1),2],f=.f),
                    .r=rectInfo,.f=seq(along=rectInfo),SIMPLIFY=FALSE))
## add rectangle to original plot of points
p+layer(geom="polygon",data=x,mapping=aes(x=x,y=y),facets=f~.)
# will print the graphics on my windows() device
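
A rough equivalent with the present-day ggplot2 interface, offered as a
sketch rather than what the 2007 release supported: ggplot(), geom_polygon(),
geom_point() and facet_grid() are assumed to come from a current
installation, and the plotting step is skipped if the package is missing. The
names pts and corners are made up here, and gl(3, 10) is used so that each
panel gets its own ten points, as in the lattice version:

```r
set.seed(1)
rectInfo <- list(matrix(runif(4), 2, 2),
                 matrix(runif(4), 2, 2),
                 matrix(runif(4), 2, 2))

## points: 10 per panel, with the panel given by the factor f
pts <- data.frame(x = runif(30), y = runif(30), f = gl(3, 10))

## rectangles: four corners per panel in drawing order, tagged with the panel
corners <- do.call(rbind,
                   mapply(function(.r, .f)
                            data.frame(x = .r[c(1, 1, 2, 2), 1],
                                       y = .r[c(1, 2, 2, 1), 2],
                                       f = .f),
                          .r = rectInfo, .f = seq(along = rectInfo),
                          SIMPLIFY = FALSE))
corners$f <- factor(corners$f, levels = levels(pts$f))

## plot only if ggplot2 is installed
if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  p <- ggplot(pts, aes(x = x, y = y)) +
    geom_polygon(data = corners, fill = "grey86") +  # second data source
    geom_point() +
    facet_grid(f ~ .)
  print(p)
}
```

The rectangle layer simply carries its own data argument, which is the
"multiple data sources" idea discussed in this thread.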

Though lattice does seem to emphasize the 'chart type' approach to graphing,
in a way I see that it provides a similar flexibility - just that the
specifications for each element are contained in functions and objects that
are ultimately invoked by a high-level/higher-order function, instead of
being combined in the linear fashion of ggplot2.

ST

--- Deepayan Sarkar [EMAIL PROTECTED] wrote:

 On 7/11/07, hadley wickham [EMAIL PROTECTED] wrote:
   A question/comment: I have usually found that the subscripts argument
 is
   what I need when passing *external* information into the panel
 function, for
   example, when I wish to add results from a fit done external to the
 trellis
   call. Fits[subscripts] gives me the fits (or whatever) I want to plot
 for
   each panel. It is not clear to me how the panel layout information from
   panel.number(), etc. would be helpful here instead. Am I correct? -- or
 is
   there a smarter way to do this that I've missed?
 
  This is one of things that I think ggplot does better - it's much
  easier to plot multiple data sources.  I don't have many examples of
  this yet, but the final example on
  http://had.co.nz/ggplot2/geom_abline.html illustrates the basic idea.
 
 That's probably true. The Trellis approach is to define a plot by
 data source + type of plot, whereas the ggplot approach (if I
 understand correctly) is to create a specification for the display
 (incrementally?) and then render it. Since the specification can be
 very general, the approach is very flexible. The downside is that you
 need to learn the language.
 
 On a philosophical note, I think the apparent limitations of Trellis
 in some (not all) cases is just due to the artificial importance given
 to data frames as the one true container for data. Now that we have
 proper multiple dispatch in S4, we can write methods that behave like
 traditional Trellis calls but work with more complex data structures.
 We have tried this in one bioconductor package (flowViz) with
 encouraging results.
 
 -Deepayan
 


Re: [R] Drawing rectangles in multiple panels

2007-07-11 Thread Stephen Tucker
In the Trellis approach, another way (I like) to deal with multiple pieces of
external data sources is to 'attach' them to panel functions through lexical
closures. For instance...

rectInfo <-
    list(matrix(runif(4), 2, 2),
         matrix(runif(4), 2, 2),
         matrix(runif(4), 2, 2))

panel.qrect <- function(rect.info) {
  function(x, y, ...) {
    ri <- rect.info[[packet.number()]]
    panel.rect(ri[1, 1], ri[1, 2], ri[2, 1], ri[2, 2],
               col = "grey86", border = NA)
    panel.xyplot(x, y, ...)
  }
}

xyplot(runif(30) ~ runif(30) | gl(3, 10),
       panel = panel.qrect(rectInfo))

...which may or may not be more convenient than passing rectInfo (and perhaps
other objects if desired) explicitly as an argument to xyplot().


--- Deepayan Sarkar [EMAIL PROTECTED] wrote:

 On 7/11/07, hadley wickham [EMAIL PROTECTED] wrote:
   A question/comment: I have usually found that the subscripts argument
 is
   what I need when passing *external* information into the panel
 function, for
   example, when I wish to add results from a fit done external to the
 trellis
   call. Fits[subscripts] gives me the fits (or whatever) I want to plot
 for
   each panel. It is not clear to me how the panel layout information from
   panel.number(), etc. would be helpful here instead. Am I correct? -- or
 is
   there a smarter way to do this that I've missed?
 
  This is one of things that I think ggplot does better - it's much
  easier to plot multiple data sources.  I don't have many examples of
  this yet, but the final example on
  http://had.co.nz/ggplot2/geom_abline.html illustrates the basic idea.
 
 That's probably true. The Trellis approach is to define a plot by
 data source + type of plot, whereas the ggplot approach (if I
 understand correctly) is to create a specification for the display
 (incrementally?) and then render it. Since the specification can be
 very general, the approach is very flexible. The downside is that you
 need to learn the language.
 
 On a philosophical note, I think the apparent limitations of Trellis
 in some (not all) cases is just due to the artificial importance given
 to data frames as the one true container for data. Now that we have
 proper multiple dispatch in S4, we can write methods that behave like
 traditional Trellis calls but work with more complex data structures.
 We have tried this in one bioconductor package (flowViz) with
 encouraging results.
 
 -Deepayan
 


Re: [R] Drawing rectangles in multiple panels

2007-07-11 Thread Stephen Tucker
Regarding this, I meant to imply that lattice is similarly flexible in the
sense of handling multiple data sets [IMHO]; regarding other aspects of
the 'grammar of graphics', I am not qualified to comment. But
the idea and intuitiveness of graph construction in ggplot2 are very appealing
- in an hour I picked up enough to do quite a bit, just by going through
examples in the author's book http://had.co.nz/ggplot2/. It will be
interesting to see how this package is received by the community.

Stephen

--- Stephen Tucker [EMAIL PROTECTED] wrote:

 Not that Trellis/lattice was entirely easy to learn at first. :)
 
 I've been playing around with ggplot2 and there is a plot()-like wrapper for
 building a quick plot [incidentally, called qplot()], but otherwise it's my
 understanding that you superpose elements (incrementally) to build up to the
 graph you want. Here is the same plot in ggplot2:
 
 rectInfo <-
   list(matrix(runif(4), 2, 2),
        matrix(runif(4), 2, 2),
        matrix(runif(4), 2, 2))
 
 library(ggplot2)
 ggopt(grid.fill = "white") # just my preference
 ## original plot of points
 p <-
 qplot(x,y,data=data.frame(x=runif(30),y=runif(30),f=gl(3,30)),facets=f~.)
 # print(p)
 
 ## external data (rectangles) - in coordinates for geom_polygon
 x <- do.call(rbind,
              mapply(function(.r,.f)
                     data.frame(x=.r[c(1,1,2,2),1],y=.r[c(1,2,2,1),2],f=.f),
                     .r=rectInfo,.f=seq(along=rectInfo),SIMPLIFY=FALSE))
 ## add rectangle to original plot of points
 p+layer(geom="polygon",data=x,mapping=aes(x=x,y=y),facets=f~.)
 # will print the graphics on my windows() device
 
 Though lattice does seem to emphasize the 'chart type' approach to graphing,
 in a way I see that it provides a similar flexibility - just that the
 specifications for each element are contained in functions and objects that
 are ultimately invoked by a high-level/higher-order function, instead of
 being combined in the linear fashion of ggplot2.
 
 ST
 


Re: [R] Multiple Stripcharts

2007-07-07 Thread Stephen Tucker
I'm not able to make out your data but something like this?

df <- data.frame(A=rnorm(10),B=rnorm(10),C=runif(10))
stripchart(df,method="jitter")


--- Tavpritesh Sethi [EMAIL PROTECTED] wrote:

 Hi all,
 I have 205 rows with measurements for three categories of people. I want to
 generate stripplots for each of these rows. How can I do it without having
 to do them one by one. I am giving a sample dataset:-
 
  A         B         C         A         B         C         A         B         C         A         B         C
  10.34822  10.18426  9.837874  9.65047   8.020482  9.17312   6.349595  13.55664  5.286697  11.85409  2.827027  7.002696
  11.54984  12.14591  14.88955  12.26134  11.74262  11.13481  15.11849  14.97857  14.12973  14.23219  15.36582  15.4698
  10.59222  11.22417  13.34279  12.2538   11.02348  11.59403  9.933778  10.45499  8.884345  8.465186  9.72647   10.44469
 
 
 Thanks,
 Tavpritesh
 


Re: [R] color scale in rgl plots

2007-07-07 Thread Stephen Tucker
Hi David,

I'm not an expert in 'rgl', but to determine data-dependent color for points
I often use cut().

# using a very simple example,
x <- 1:2; y <- 1:2; z <- matrix(1:4,ncol=2)

# the following image will be a projection of my intended 3-D 'rgl' plot
# into 2-D space (if we don't consider color to be a dimension):
library(fields)
image.plot(x,y,z)

# the 3-D rgl plot will be as follows:
library(rgl)
df <- data.frame(x=rep(x,times=length(y)),
                 y=rep(y,each=length(x)),
                 z=as.vector(z))
plot3d(x=df,col=1:4,type="s")

## looks okay so moving onto bigger example:
x <- 1:10; y <- 1:10; z <- matrix(1:100,ncol=10)

# 2-D projection:
image.plot(x,y,z)

# 3-D plot in rgl
df <- data.frame(x=rep(x,times=length(y)),
                 y=rep(y,each=length(x)),
                 z=as.vector(z))
# This is how I determine color:
nColors <- 64
colindex <- as.integer(cut(df$z,breaks=nColors))
plot3d(x=df,type="s",col=tim.colors(nColors)[colindex])

===
tim.colors(nColors)[colindex] will return a vector of colors with one entry
per row of 'df'.

I don't think as.integer() on cut() is entirely necessary because cut()
returns a factor... in any case, I use these integers as indices for
tim.colors() [you will need the 'fields' package for this set of colors].
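
For the palette mentioned in the original question, rainbow(64,start=0.7,end=0.1), the same cut() idea can be sketched as below. The vector X is made-up illustration data, and using fixed breaks on [0,1] (rather than the observed range) is my assumption, so that a given value always maps to the same color.

```r
# Hypothetical attribute, already scaled to [0, 1]
X <- c(0, 0.25, 0.5, 0.75, 1)

# Palette from the original question
pal <- rainbow(64, start = 0.7, end = 0.1)

# Fixed breaks over [0, 1]: one bin per palette entry
brks <- seq(0, 1, length.out = length(pal) + 1)
idx  <- as.integer(cut(X, breaks = brks, include.lowest = TRUE))

# One color per data point
cols <- pal[idx]
```

The resulting 'cols' vector should then be usable as the col argument of plot3d(), in the same way as tim.colors(nColors)[colindex] above.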

Hope this helps.

ST


--- David Farrelly [EMAIL PROTECTED] wrote:

 Hello,
 
 I'm trying to make a 3d plot using rgl in which the size and color of
 each point corresponds to certain attributes of each data point. The color
 attribute, let's call it X, is scaled to go from 0 to 1. The
 rainbow(64,start=0.7,end=0.1) palette is perfect for what I want but I
 don't know how to take that palette and pick a color from it based on
 the value of X for a given data point. I'm fairly new to R and any
 suggestions would be greatly appreciated.
 
 Here's what I do - it's how to do the color mapping that has me stumped.
 

plot3d(th1,ph,th2,type='s',size=wd1,col=(rp1,0,0,1),cex=2,ylab=NULL,xlab=NULL,zlab=NULL,xlim=c(0,1),ylim=c(0,2),zlim=c(0,1),box=TRUE,axes=FALSE)
 
 I have also tried the more obvious col = rgb(a,b,c,d) where a,b,c,d are
 functions of X but I can't manage to come up with a nice looking color
 scale.
 
 Thanks in advance,
 
 David
 


Re: [R] Loop and function

2007-07-05 Thread Stephen Tucker
You do not have matching parentheses in this line
   returnlow <- gpdlow(var[,i][var[,i](p[,i][[2]])
so most likely a syntax error is halting the execution of the
assignment statement.



--- livia [EMAIL PROTECTED] wrote:

 
 Hi All, I am trying to write a loop around a function and I am using the
 following code. p and var are matrices obtained earlier. I would like
 to apply the function gpdlow for i in 1:12 and get the returnlow for i
 in 1:12. But when I ask for returnlow there are warnings and I get
 a strange result. 
 
 for (i in 1:12){  
 gpdlow - function(u){  
 p[,i]$beta -u*p[,i][[2]]
 }
 returnlow - gpdlow(var[,i][var[,i](p[,i][[2]])
 }
 
 


Re: [R] select data from large CSV file

2007-07-05 Thread Stephen C. Upton
Hi Lars,

I haven't tried this, but I believe there were a couple of messages on 
the list recently on reading large files that basically used scan with 
connections, and reading in by blocks.

see ?scan, ?connections
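
A minimal sketch of the block-reading idea, under my own assumptions: the file has no header, the location code is in the second column, and I use readLines() on an open connection plus read.csv(text=) rather than scan(). The function name read_location and the block size are made up for illustration. Because the connection stays open, each block continues where the previous one stopped, so the file is only read once.

```r
# Read a large CSV in blocks and keep only the rows for one location code.
# 'path', 'location' and 'block' are hypothetical names for this sketch.
read_location <- function(path, location, block = 10000L) {
  con <- file(path, open = "r")
  on.exit(close(con))              # close the file when the function exits
  kept <- list()
  repeat {
    lines <- readLines(con, n = block)   # next block of raw lines
    if (length(lines) == 0L) break       # end of file reached
    chunk <- read.csv(text = lines, header = FALSE, strip.white = TRUE,
                      colClasses = "character")
    # keep only the rows whose second column matches the location code
    kept[[length(kept) + 1L]] <- chunk[chunk[[2]] == location, , drop = FALSE]
  }
  do.call(rbind, kept)
}
```

Usage would be something like read_location("et.csv", "25"); all columns come back as character, so numeric columns need as.numeric() afterwards.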

HTH
steve

Lars Modig wrote:
 Hello


 I’ve got a large CSV file (500M) with statistical data. It’s divided into
 12 columns and I don’t know how many lines.
 The first column is the date and the second is a unique code for the
 location; the rest is (let's say) different weather data.  See example
 below.
 070704, 25, --,--,--,temperature, 22, --,--,30, 20,Y
 070705, 25, --,--,--,temperature, 22, --,--,30, 20,Y
 070705, 25, --,--,--,pressure, 1200, --,--,1000, 1100,N
 070705, 26, --,--,--,temperature, 22, --,--,30, 20,Y
 …
 First I tried with data - read.csv. and of course the memory got full.
 Then I found in the archive that you could use scan. So then I wrote the
 following lines below to search for location and store one location with
 all different data in one variable.

 # collect the different pnc's
  b=2#compare from second number
  alike=TRUE #Dim alike like a boolean
  stored = 910286609 #first number is known
   for(i in 1: 100){ #start counting and scaning
  data_final - matrix(unlist(scan(C:/Documents and
 Settings/modiglar/Desktop/temp/et.csv,sep=, ,
 what=list(,,,), skip=i ,
 n=12)),ncol=12, byrow=TRUE)


   a=1 #compare from the 1:th stored
   while( a  b){  #---
   #
 if(as.numeric(data_final[2] != stored[a])) #compare
   { a=a+1  #
   alike=FALSE  }   #
 else{  #
alike=TRUE  #
break } #
   }# ---

   if (alike==FALSE){   #
  stored[b]=as.numeric(data_final[2])   # Store new data
  b=b+1 #
   }
   }

 #
 # save 1 pnc at the time
 d=1
 saved_data = 1:1200 ; dim(saved_data) - c(12,100)
 save_data_nr = 1   #Stored number
   for(i in 1: 100){#start counting and scaning
  data_final - matrix(unlist(scan(C:/Documents and
 Settings/modiglar/Desktop/temp/et.csv,sep=, ,
 what=list(,,,), skip=i ,
 n=12)),ncol=12, byrow=TRUE)


   if(as.numeric(data_final[2] == stored[save_data_nr])) #compare
 { saved_data[,d] -  matrix(unlist(data_final),ncol=12,
 byrow=TRUE)  #Store new data
  d=d+1 } #
  #
  #
  }
 As you can see I’m not so familiar with R, and therefore I have probably
 done this the wrong way.

 As I understand when running this, is that scan opens up the file count
 down to the line that should be read and read it, then closing the file
 again. So when I’m starting to come to line number at 1 then it
 starting to take time. I let the computer run over night, but it was still
 far from finished when I stopped the loop.

 So how should I do this? Maybe I also need to sort on the date, and that
 is hopefully in order so then you should be able to cut the file every
 time you hit a new month but that will also take time if I do it like
 this.

 Thank you for your help in advance.

 Lars



[R] nls() lower/upper bound specification

2007-07-04 Thread Stephen Tucker
Dear all,

In optim(), all parameters of the function to be adjusted are stored in a single
vector, and lower/upper bounds can be specified by a vector of the same
length.

In nls(), is it true that if I want to specify lower/upper bounds, functions
must be re-written so that each parameter is contained in a single-valued
vector?

## data input
x <- 1:10
y <- 3*x+4*x^2+rnorm(10,250)

## this one does not work
f <- function(x)
  function(beta)
  beta[1]+ beta[2]*x+beta[3]*x^2

out <- nls(y~f(x)(beta),data=data.frame(x,y),
   alg="port",
   start=list(beta=1:3),
   lower=list(beta=rep(0,3)))

(However, this works if I do not specify a lower bound)

## this one works
g <- function(x)
  function(beta1,beta2,beta3)
  beta1+ beta2*x+beta3*x^2

out <- nls(y~g(x)(beta1,beta2,beta3),data=data.frame(x,y),
   alg="port",
   start=list(beta1=1,beta2=1,beta3=1),
   lower=list(beta1=1,beta2=1,beta3=1))
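
One thing that may be worth trying, as far as I can tell from ?nls (which describes lower/upper as vectors replicated to be as long as start): keep the vector-valued parameter but give the bound as a plain numeric vector instead of a named list. This is only a sketch under that assumption; the formula is written out directly rather than through f(), and set.seed() is added just to make it repeatable.

```r
set.seed(1)
x <- 1:10
y <- 3 * x + 4 * x^2 + rnorm(10, 250)

# 'beta' is a single vector-valued parameter of length 3; the lower bound
# is a plain numeric vector with one entry per element of 'beta'.
out <- nls(y ~ beta[1] + beta[2] * x + beta[3] * x^2,
           data = data.frame(x, y),
           algorithm = "port",
           start = list(beta = c(1, 2, 3)),
           lower = rep(0, 3))
coef(out)
```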

Thanks in advance!

Stephen



Re: [R] Problems using imported data

2007-07-03 Thread Stephen Tucker
Actually, I believe attach() and detach() are discouraged nowadays...

x <- read.delim("Filename.txt", header=TRUE)

You can access your data by column:
x[,1]
x[,c(1,3)]

or if your first column is named Col1 and the third Col3,
x[,"Col1"]
x[,c("Col1","Col3")]

and you can do the same to access by row - by indices or rownames [which you
can set with rownames<-, see help("rownames<-")]

Alternatively, with this type of data [created by the read.delim() function]
you can also access with the following syntax:
x$Col1
x$Col3
with(x,Col1)
with(x,cbind(Col1,Col3))
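
A small self-contained illustration of the above, using a made-up tab-delimited file written to a temporary location (the column names and values are invented for this sketch). The key point is that the result of read.delim() is assigned to a name, and that name is what you work with afterwards, not the file name.

```r
# Write a tiny tab-delimited file to a temporary location (illustration only)
tmp <- tempfile(fileext = ".txt")
writeLines(c("Col1\tCol2\tCol3",
             "1\ta\t10",
             "2\tb\t20"), tmp)

# Assign the imported data to a name
x <- read.delim(tmp, header = TRUE)

x[, "Col1"]                      # column by name
x$Col3                           # same idea with $
total <- with(x, Col1 + Col3)    # use columns without attach()
```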

...hope this helps

ST


--- Susie Iredale [EMAIL PROTECTED] wrote:

 
 
 
 (Repeat of previous HTML version)
 
 Hello all,
 
 I am a new R user and I have finally imported my data using
 read.delim(Filename.txt, header=TRUE) after some difficulty, by changing
 file directories (a hint to anyone who might be stuck there).
 
 However, I am now stuck trying to use my data.  When I try to use
 data.frame(filename.txt) it tells me object not found, which makes it
 difficult to use attach() or with().  How do I get R to recognize my data? 
 
 
 Thanks,
 Susie
 PhD Student UCI
 
 
 
 
  




Re: [R] regexpr

2007-06-29 Thread Stephen Tucker
I think you are looking for paste().

And you can replace your for loop with lapply(), which will apply regexpr to
every element of 'mylist' (as the first argument, which is 'pattern'). 'text'
can be a vector also:

mylist <- c("MN","NY","FL")
lapply(paste(mylist,"$",sep=""),regexpr,text="Those from MN:")
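
To see what this returns, here is a short sketch using sapply() so the match positions come back as a plain vector. Note that with the trailing colon in "Those from MN:" the pattern "MN$" cannot match, because the string no longer ends in MN; the second target string without the colon is my own variation for illustration.

```r
mylist   <- c("MN", "NY", "FL")
patterns <- paste(mylist, "$", sep = "")   # "MN$" "NY$" "FL$"

# Target ends with a colon, so none of the end-anchored patterns match
hits_colon <- sapply(patterns, regexpr, text = "Those from MN:")

# Hypothetical target that really ends in "MN"
hits <- sapply(patterns, regexpr, text = "Those from MN")

# -1 means no match; anything greater than 0 is the match position
on_list <- hits > 0
```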



--- runner [EMAIL PROTECTED] wrote:

 
 Hi, 
 
 I 'd like to match each member of a list to a target string, e.g.
 --
 mylist=c("MN","NY","FL")
 g=regexpr(mylist[1], "Those from MN:")
 if (g>0)
 {
 "On list"
 }
 --
 My question is:
 
 How to add an end-of-string symbol '$' to the to-match string? so that 'M'
 won't match.
 
 Of course, MN$ will work, but i want to use it in a loop; mylist[i] is
 what i need. I tried mylist[1]$, but didn't work. So why it doesn't
 extrapolate? How to do it?
 
 Thanks a lot!


Re: [R] Assign name to a name

2007-06-29 Thread Stephen Tucker
You can just create another variable which contains the names you want:


## let
Year <- c(rep(1999,2),rep(2000,2),rep(2001,3))

## one alternative
getYearCode1 <- function(yr) {
  # yr can be a vector
  ifelse(yr==1999,"Year1",
         ifelse(yr==2000,"Year2",
                ifelse(yr==2001,"Year3",NA)))
}

## another alternative
## more appropriate since you probably want 
## a single value returned
getYearCode2 <- function(yr) {
  # yr is a single value
  switch(as.character(yr),
         `1999` = "Year1",
         `2000` = "Year2",
         `2001` = "Year3")
}

## Application:
## single value
getYearCode1(Year[1])
getYearCode2(Year[1])
## on a vector
dataset$YearCode <- getYearCode1(Year)
# or
dataset$YearCode <- sapply(Year,getYearCode2)

## another option is match()
df <- data.frame(Year=c(1999,2000,2001),YearCode=c("Year1","Year2","Year3"))
dataset$YearCode <- df[match(Year,df[,"Year"]),"YearCode"]

##
## reading from console
subset(dataset,YearCode==scan(,what=""))
subset(dataset,
   YearCode=={x <- function() {cat("YrCode: ");readline()}; x()})

## or as a function
f <- function(x) {
  g <- function() {
    # prompt for the code, then subset the data frame passed to f()
    ask <- function() {
      cat("YearCode: ");
      readline()
    }
    subset(x,YearCode==ask())
  }
}
getSubset1 <- f(dataset)

## type at console. You will be prompted:
datayear <- getSubset1()

## but easier is
f <- function(x) {
  g <- function(y)
    subset(x,YearCode==y)
}
getSubset2 <- f(dataset)

## type at prompt
datayear <- getSubset2("Year1")
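
If the goal is only to change the year in one place, a plain variable may already be enough. A minimal sketch with a made-up 'dataset' (the columns and values are invented for illustration); note the comparison uses '==', not '='.

```r
# Hypothetical data, for illustration only
dataset <- data.frame(Year  = c(1999, 1999, 2000, 2001),
                      value = c(10, 20, 30, 40))

Year1 <- 1999                               # change the year here only
datayear <- subset(dataset, Year == Year1)  # '==' compares, '=' would not
```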


--- Spilak,Jacqueline [Edm] [EMAIL PROTECTED] wrote:

 I would like to know how I can assign a name to a name.  I have a
 dataset that has different years in it.  I am writing scripts using R
 and I would like to give a month a generic name and then use the generic
 name to do different analysis.  The reason for the generic name would be
 so that I only have to change one thing if I wanted to change the year.
 For example.
 Year1 = 1999
 datayear <- subset(dataset, Year = Year1)
 I would want  to subset for whatever year is in Year1.  I am not sure
 if R does this but it would be great if it does.  Is there also anyway
 for R to ask the user for the variable in the console without going into
 the script and then use whatever the user puts in. Thanks for the help.
 


Re: [R] Function call within a function.

2007-06-28 Thread Stephen Tucker
Dear John,

Perhaps I am mistaken in what you are trying to accomplish but it seems like
what is required is that you call lstfun() outside of ukn(). [and remove the
call to lstfun() in ukn()].

nts <- lstfun(myfile, aa, bb)
results <- ukn(dd1, "a", "b", nts$cda)

Alternatively, you can eliminate the fourth argument in ukn() and assign (via
'<-') the results of lstfun() to 'nam1' within ukn() instead of saving to
'nts'...
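
A runnable sketch of the second alternative, based on the example data in the question. Two things here are my own changes rather than part of the original code: lstfun() returns a *named* list (otherwise nts$cda would be NULL), and ukn() takes three arguments and calls lstfun() itself.

```r
# Example data from the question
cata <- c(1, 1, 6, 1, 1, 4)
catb <- c(1, 2, 3, 4, 5, 6)
id   <- c("a", "b", "b", "a", "a", "b")
dd1  <- data.frame(id, cata, catb)

# Split the data frame by the value in its first column;
# the list elements are named so they can be reached with $
lstfun <- function(file, alpha, beta) {
  list(cda = file[file[, 1] == alpha, ],
       cdb = file[file[, 1] == beta, ])
}

# Call lstfun() inside ukn() and work on the result directly
ukn <- function(file, alpha, beta) {
  nts <- lstfun(file, alpha, beta)
  nts$cda[, 3] * 5
}

results <- ukn(dd1, "a", "b")
```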

--- John Kane [EMAIL PROTECTED] wrote:

 I am trying to call a funtion within another function
 and I clearly am misunderstanding what I should do. 
 Below is a simple example.
 I know lstfun works on its own but I cannot seem to
 figure out how to get it to work within ukn. Basically
 I need to create the variable nts. I have probably
 missed something simple in the Intro or FAQ.
 
 Any help would be much appreciated.
 
 EXAMPLE

---
 # create data.frame
 cata <- c( 1,1,6,1,1,4)
 catb <- c( 1,2,3,4,5,6)
 id <- c('a', 'b', 'b', 'a', 'a', 'b')
 dd1  <-  data.frame(id, cata,catb)
 
 # function to create list from data.frame
 lstfun  <- function(file, alpha , beta ) {
 cda  <-  subset(file, file[,1] == alpha)
 cdb  <-  subset (file, file[,1]== beta)
 list1 <- list(cda,cdb)
 }
 
 # function to operate on list
 ukn  <-  function(file, alpha, beta, nam1){
 aa  <- alpha
 bb  <- beta
 myfile  <- file
 nts <- lstfun(myfile, aa, bb)
 mysum <- nam1[,3]*5
 return(mysum)
 }
 
 results <- ukn(dd1, "a", "b", nts$cda)
 


Re: [R] Looking for parallel functionality between Matlab and R

2007-06-27 Thread Stephen Tucker
This zooming function on the R-Wiki page was very neat:

http://wiki.r-project.org/rwiki/doku.php?id=tips:graphics-misc:interactive_zooming
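
For a fixed region, such as x from 0 to 5 and y from 0 to 20 in question (b), a non-interactive option is simply to redraw the plot with xlim and ylim. A minimal sketch with made-up data:

```r
# Made-up data for illustration
x <- seq(0, 10, by = 0.1)
y <- x^2

plot(x, y, type = "l")                                    # full range
plot(x, y, type = "l", xlim = c(0, 5), ylim = c(0, 20))   # "zoomed" region
```

Unlike Matlab's axis(), this redraws the whole plot rather than rescaling an existing one, so any lines() or points() added earlier have to be added again.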

Also, to answer question (a), maybe these examples might help?

## add elements to plot
plot(1:10,1:10)
lines(1:10,(1:10)/2)
points(1:10,(1:10)/1.5)

## add second y-axis
par(mar=c(5,4,2,4)+0.1)
plot(1:10,1:10)
par(new=TRUE)
plot(-20:20,20:-20,col=4,
     type="l",axes=FALSE,
     xlab="",ylab="",
     xaxs="i",
     xlim=par("usr")[1:2])
axis(4,col=4,col.axis=4)
mtext("second y-axis label",4,outer=TRUE,padj=-2,col=4)






--- Jim Lemon [EMAIL PROTECTED] wrote:

 El-ad David Amir wrote:
  I'm slowly moving my statistical analysis from Matlab to R, and find
 myself
  missing two features:
  
  a) How do I mimic Matlab's 'hold on'? (I want to show several plots
  together, when I type two plots one after the other the second overwrites
  the first)
  b) How do I mimic Matlab's 'axis'? (after drawing my plots I want to zoom
 on
  specific parts- for example, x=0:5, y=0:20).
  
 I think what you want for a) is par(ask=TRUE).
 
 There have been a few discussions of zooming on the help list - see:
 
 http://stats.math.uni-augsburg.de/iPlots/index.shtml
 
 for one solution.
 
 Jim
 


Re: [R] simultaneous actions of grep ???

2007-06-26 Thread Stephen Tucker
You can list them together using | (which stands for 'or'):

  c<-subset(c,!rownames(c) %in% grep(".1|.5|.6|.9",rownames(c),value=T))

but . means any character for regular expressions, so if you meant a
decimal place, you probably want to escape them with a \\:

  c<-subset(c,!rownames(c) %in%
            grep("\\.1|\\.5|\\.6|\\.9", rownames(c),value=T))

Another option is

  c<-subset(c,regexpr("\\.1|\\.5|\\.6|\\.9",c) < 0)

because regexpr will return -1 for elements which do not contain a match.
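
A small check of the escaping point, using a made-up data frame 'd' whose row names are invented for illustration. The row "c11" contains no literal ".1", so it should survive the escaped pattern but would be dropped by the unescaped one.

```r
# Hypothetical data frame with row names to filter on
d <- data.frame(v = 1:6,
                row.names = c("a.1", "a.2", "b.5", "b.7", "c11", "c.9"))

pat <- "\\.1|\\.5|\\.6|\\.9"   # literal dot followed by 1, 5, 6 or 9

# grep() version: drop rows whose names match
drop  <- grep(pat, rownames(d), value = TRUE)
kept  <- d[!rownames(d) %in% drop, , drop = FALSE]

# regexpr() version: keep rows where there is no match (-1)
kept2 <- d[regexpr(pat, rownames(d)) < 0, , drop = FALSE]

# Unescaped pattern for comparison: "." matches any character
drop_unescaped <- grep(".1|.5|.6|.9", rownames(d), value = TRUE)
```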


--- Ana Patricia Martins [EMAIL PROTECTED] wrote:

 Hello R-users and developers,
 
 Once again, I'm asking for your help.
 
 Is there an easier way to apply these greps simultaneously?
   
 c<-subset(c,!rownames(c) %in% grep(".1",rownames(c),value=T))
 c<-subset(c,!rownames(c) %in% grep(".5",rownames(c),value=T))
 c<-subset(c,!rownames(c) %in% grep(".6",rownames(c),value=T))
 c<-subset(c,!rownames(c) %in% grep(".9",rownames(c),value=T))
 
 Thanks in advance for helping me.
 
 Atenciosamente,
 Ana Patricia Martins
 ---
 Serviço Métodos Estatísticos
 Departamento de Metodologia Estatística
 INE - Portugal
 Telef:  218 426 100 - Ext: 3210
 E-mail: [EMAIL PROTECTED] mailto:[EMAIL PROTECTED] 
 
 


Re: [R] simultaneous actions of grep ???

2007-06-26 Thread Stephen Tucker
My mistake... last alternative should be:

   c<-subset(c,regexpr("\\.1|\\.5|\\.6|\\.9",rownames(c)) < 0)



Re: [R] changing the position of the y label (ylab)

2007-06-26 Thread Stephen Tucker
If by 'position' you mean the distance from the axes, I think 'mgp' is the
argument you are looking for (see ?par)-

You can set this in par(), plot() [which will affect both x and y labels], or
title():

par(mar=rep(6,4))
plot(NA,NA,xlim=0:1,ylim=0:1,xlab="X",ylab="")
title(ylab="Y2",mgp=c(4,1,0))

if you want to change 'position' parallel to the axis, then you probably have
to do
plot(...,xlab="",ylab="")

and set labels using mtext(); playing around with the 'adj' argument.
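
A minimal sketch of the mtext() route, with made-up plot data and label text. Here 'line' sets the distance from the axis and 'adj' the position along it (0 = bottom, 1 = top for the left-hand side); the particular values are only examples to play with.

```r
par(mar = c(5, 6, 4, 2) + 0.1)            # wider left margin
plot(1:10, 1:10, xlab = "X", ylab = "")   # suppress the default y label

# side = 2 is the left axis; adj = 1 pushes the label to the top end
mtext("Y near the top", side = 2, line = 4, adj = 1)
```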

Btw, you can use '\n' to denote new line:
title(ylab="Onset/Withdrawl\nDate",mgp=c(4,1,0))


--- Etienne [EMAIL PROTECTED] wrote:

 How can I change the position of the ylab, after
 enlarging the margins with par(mar=...)? 
 
 Here is the relevant code snippet
 
 
 par(mar=c(5.1,5.1,4.1,2.1))

 plot(c(1979,2003),c(40,50),ylim=c(1,73),lab=c(20,10,1),pch=21,col='blue',bg='blue',
      axes=FALSE,xlab="Years",ylab="Onset/Withdrawl Date",font.lab=2)
 box()
 axis(1,las=2)

axis(2,las=2,labels=c('JAN','FEB','MAR','APR','MAY','JUN','JUL','AUG','SEP','OCT','NOV','DEC','JAN'),at=seq(from=1,to=73,by=6))
 axis(3,labels=FALSE)
 axis(4,labels=FALSE,at=seq(from=1,to=73,by=6))
 
 
 Thanks
 


Re: [R] ylab at the right hand of a plot with two y axes

2007-06-26 Thread Stephen Tucker
Here are two ways:

## method 1
plot(1:100,y1)
par(new=TRUE)
plot(1:100,y2,xlab="",ylab="",col=2,axes=FALSE)
axis(4,col=2,col.axis=2)

## method 2
plot.new()
plot.window(xlim=c(1,100),ylim=range(y1))
points(1:100,y1)
axis(1)
axis(2)
title(xlab="x",ylab="y1")
plot.window(xlim=c(1,100),ylim=range(y2))
points(1:100,y2)
axis(4,col=2,col.axis=2)
box()
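
For the second question (a label on the right-hand axis), one possibility is mtext() on side 4 after widening the right margin. A sketch built on method 1, using the y1 and y2 from the question; the margin sizes and the label text are just examples.

```r
y1 <- 1:100
y2 <- c(100:1) * 10

par(mar = c(5, 4, 4, 5) + 0.1)            # leave room on the right
plot(1:100, y1, xlab = "x", ylab = "y1")
par(new = TRUE)                            # overlay on the same region
plot(1:100, y2, xlab = "", ylab = "", col = 2, axes = FALSE)
axis(4, col = 2, col.axis = 2)             # right-hand axis in y2 units
mtext("y2", side = 4, line = 3, col = 2)   # right-hand axis label
```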



--- Young Cho [EMAIL PROTECTED] wrote:

 When I try to plot two lines ( or scatterplots) with different scales, this
 is what I have been doing:
 
 Suppose: I have y1 and y2 in a very different scale
 
 y1 = 1:100
 y2 = c(100:1)*10
 
 To plot them on top of each other and denote them by different colors, I have
 to figure out the correct scale '10' and the corresponding tick vector and
 labels. Then do:
 
 plot(1:100, y1)   # I can have 'ylab' here for the left-hand side y axis.
 points(1:100, y2/10,col=2)
 ytick.vector = seq(from=0,to=100,by=20)
 ytick.label = as.character(seq(from=0,to=1000,by=200))
 axis(4,at = ytick.vector,label = ytick.label,col=2,col.axis=2)
 
 Two questions.
 
 1. Are there easier ways to plot the y1, y2 w/o figuring out the correct
 scaler, tick vectors, and labels in order to put them in one figure?
 2. How to add additional 'ylab' to the right hand side y-axis of the plot?
 Thanks a lot!
 
 -Young
 


Re: [R] R-excel

2007-06-25 Thread Stephen Tucker
There are also some notes about this in the R Data Import/Export manual: 
http://cran.r-project.org/doc/manuals/R-data.html#Reading-Excel-spreadsheets

But I've gathered the following examples from the R-help mailing list
archives [in addition to the option of saving the spreadsheet as a .csv file
and reading it in with read.csv()]. Personally, I use option 4 regularly (I
happened to have Perl installed on my Windows XP machine already) and have
had good luck with it.

Hope this helps.

= Option 1 =
# SIMPLEST OPTION
install.packages("xlsReadWrite")
library(xlsReadWrite)
data = read.xls("sampledata.xls",sheet=1)

= Option 2 =
# ALSO SIMPLE BUT MORE MANUAL WORK EACH TIME
# (1) highlight the region in Excel you want to import and copy it,
# (2) then read it from the clipboard
data = read.delim(file="clipboard",header=TRUE)
# or, if you don't have a header,
data = read.delim(file="clipboard",header=FALSE)

= Option 3 =
# RODBC IS A BIG APPLICATION, FOR INTERFACING
# WITH MANY TYPES OF FILES/SERVERS
install.packages("RODBC")
library(RODBC)
fid <- odbcConnectExcel("sampledata.xls")
data <- sqlFetch(fid,"Sheet1")
close(fid)

= Option 4 =
# REQUIRES CONCURRENT INSTALLATION OF PERL
install.packages("gdata")
library(gdata)
data = read.xls("sampledata.xls",sheet=1)
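
The .csv route mentioned at the top needs no extra package. A small sketch of the round trip, assuming the sheet has been saved from Excel as a comma-separated file; here a temporary file with made-up contents stands in for that export:

```r
# Stand-in for a sheet exported from Excel as .csv (contents are made up)
csvfile <- tempfile(fileext = ".csv")
write.csv(data.frame(id = 1:3, value = c(2.5, 3.1, 4.8)),
          file = csvfile, row.names = FALSE)

# Read it back; header = TRUE takes the column names from the first row
dat <- read.csv(csvfile, header = TRUE)
str(dat)
```

For files saved with a semicolon separator and decimal commas (common in some locales), read.csv2() is the counterpart.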

 



--- Erika Frigo [EMAIL PROTECTED] wrote:

 
 Good morning to everybody,
 I have a problem : how can I import excel files in R???
 
 thank you very much
 
 
 Dr.sa. Erika Frigo
 Università degli Studi di Milano
 Facoltà di Medicina Veterinaria
 Dipartimento di Scienze e Tecnologie Veterinarie per la Sicurezza
 Alimentare (VSA)
  
 Via Grasselli, 7
 20137 Milano
 Tel. 02/50318515
 Fax 02/50318501
 



   




Re: [R] FW: Suse RPM installation problem

2007-06-22 Thread Stephen Henderson
Yes I definitely have it. 

PC5-140:/home/rmgzshd/MAT-2 # whereis libpng12.so.0
libpng12.so: /usr/lib/libpng12.so.0 /usr/local/lib/libpng12.so 
/usr/local/lib/libpng12.so.0
PC5-140:/home/rmgzshd/MAT-2 # rpm -qf /usr/lib64/libpng12.so.0
libpng-1.2.8-19.5


I don't understand why 'whereis' does not find this file either, or why it 
lists the file three times. 
I'm also puzzled why they chose to give the two files exactly the same name.


I will next try  rpm --force as you suggest.

Stephen

-Original Message-
From: Peter Dalgaard [mailto:[EMAIL PROTECTED]
Sent: Fri 6/22/2007 10:10 AM
To: Stephen Henderson
Cc: r-help@stat.math.ethz.ch
Subject: Re: [R] FW: Suse RPM installation problem
 
Stephen Henderson wrote:
 Thanks for your help

 As you suggested I do indeed have a 64bit version called exactly the same

 PC5-140:/home/rmgzshd # rpm -qf /usr/lib/libpng12.so.0
 libpng-32bit-1.2.8-19.5
 PC5-140:/home/rmgzshd # rpm -qf /usr/lib64/libpng12.so.0
 libpng-1.2.8-19.5

 SO how do I tell rpm to find this and not the 32bit file? Or do I need to 
 edit something in the rpm file?

 Thanks

   
Odd... Do you actually _have_  /usr/lib64/libpng12.so.0 (whereis didn't
seem to find it) --- as opposed to rpm -qf telling you which package
contains the file? If not, try (re)installing libpng, possibly with
--force.


 -Original Message-
 From: Peter Dalgaard [mailto:[EMAIL PROTECTED]
 Sent: Thu 6/21/2007 6:34 PM
 To: Stephen Henderson
 Cc: r-help@stat.math.ethz.ch
 Subject: Re: [R] FW: Suse RPM installation problem
  
 Stephen Henderson wrote:
   
 Hello 

 I am trying to install the R RPM for Suse 10.0 on an x86_64 PC. However
 I am failing a dependency for  libpng12.so.0 straight away



 PC5-140:/home/rmgzshd # rpm -i R-base-2.5.0-2.1.x86_64.rpm
 error: Failed dependencies:
 libpng12.so.0(PNG12_0)(64bit) is needed by R-base-2.5.0-2.1.x86_64

 I do seem to have this file

 PC5-140:/home/rmgzshd # whereis libpng12.so.0
 libpng12.so: /usr/lib/libpng12.so.0 /usr/local/lib/libpng12.so 

 but presuming that it is not the 64bit version mentioned I went looking
 for a 64 bit version but could not find it through google.

 However reading the Installation manual I noted that libpng is mentioned
 in the context of a source build. I therefore downloaded libpng-1.2.18
 (v-1.2.8 or later is specified in the manual) and successfully compiled
 this. This did not however help with my problem.

 Any suggestions?

   
 
 I have

 viggo:~/rpm -qf /usr/lib/libpng12.so.0
 libpng-32bit-1.2.12-25
 viggo:~/rpm -qf /usr/lib64/libpng12.so.0
 libpng-1.2.12-25
 viggo:~/rpm -q R-base
 R-base-2.5.0-2.1


   
 Thanks
 Stephen Henderson
  

   
 



   


-- 
   O__   Peter Dalgaard Øster Farimagsgade 5, Entr.B
  c/ /'_ --- Dept. of Biostatistics PO Box 2099, 1014 Cph. K
 (*) \(*) -- University of Copenhagen   Denmark  Ph:  (+45) 35327918
~~ - ([EMAIL PROTECTED])  FAX: (+45) 35327907






Re: [R] FW: Suse RPM installation problem

2007-06-22 Thread Stephen Henderson
Thanks for your help

As you suggested I do indeed have a 64bit version called exactly the same

PC5-140:/home/rmgzshd # rpm -qf /usr/lib/libpng12.so.0
libpng-32bit-1.2.8-19.5
PC5-140:/home/rmgzshd # rpm -qf /usr/lib64/libpng12.so.0
libpng-1.2.8-19.5

So how do I tell rpm to find this and not the 32bit file? Or do I need to edit 
something in the rpm file?

Thanks




-Original Message-
From: Peter Dalgaard [mailto:[EMAIL PROTECTED]
Sent: Thu 6/21/2007 6:34 PM
To: Stephen Henderson
Cc: r-help@stat.math.ethz.ch
Subject: Re: [R] FW: Suse RPM installation problem
 
Stephen Henderson wrote:
 Hello 

 I am trying to install the R RPM for Suse 10.0 on an x86_64 PC. However
 I am failing a dependency for  libpng12.so.0 straight away



 PC5-140:/home/rmgzshd # rpm -i R-base-2.5.0-2.1.x86_64.rpm
 error: Failed dependencies:
 libpng12.so.0(PNG12_0)(64bit) is needed by R-base-2.5.0-2.1.x86_64

 I do seem to have this file

 PC5-140:/home/rmgzshd # whereis libpng12.so.0
 libpng12.so: /usr/lib/libpng12.so.0 /usr/local/lib/libpng12.so 

 but presuming that it is not the 64bit version mentioned I went looking
 for a 64 bit version but could not find it through google.

 However reading the Installation manual I noted that libpng is mentioned
 in the context of a source build. I therefore downloaded libpng-1.2.18
 (v-1.2.8 or later is specified in the manual) and successfully compiled
 this. This did not however help with my problem.

 Any suggestions?

   
I have

viggo:~/rpm -qf /usr/lib/libpng12.so.0
libpng-32bit-1.2.12-25
viggo:~/rpm -qf /usr/lib64/libpng12.so.0
libpng-1.2.12-25
viggo:~/rpm -q R-base
R-base-2.5.0-2.1


 Thanks
 Stephen Henderson
  

   





[R] FW: Suse RPM installation problem

2007-06-21 Thread Stephen Henderson
Hello 

I am trying to install the R RPM for Suse 10.0 on an x86_64 PC. However
I am failing a dependency for  libpng12.so.0 straight away



PC5-140:/home/rmgzshd # rpm -i R-base-2.5.0-2.1.x86_64.rpm
error: Failed dependencies:
libpng12.so.0(PNG12_0)(64bit) is needed by R-base-2.5.0-2.1.x86_64

I do seem to have this file

PC5-140:/home/rmgzshd # whereis libpng12.so.0
libpng12.so: /usr/lib/libpng12.so.0 /usr/local/lib/libpng12.so 

but presuming that it is not the 64bit version mentioned I went looking
for a 64 bit version but could not find it through google.

However reading the Installation manual I noted that libpng is mentioned
in the context of a source build. I therefore downloaded libpng-1.2.18
(v-1.2.8 or later is specified in the manual) and successfully compiled
this. This did not however help with my problem.

Any suggestions?

Thanks
Stephen Henderson
 



Re: [R] Dissimilarity Analysis

2007-06-20 Thread stephen cox
Hi Birgit - looks like you have a few issues here.

Birgit Lemcke birgit.lemcke at systbot.uzh.ch writes:

 
 Hello you all!
 
 I am a completely new user of R and I have a problem to solve.
 I am using Mac OS X on a PowerBook.
 
 I have a table that looks like this:
 
   species    X1 X2 X3 X4 X5 X6 X7 X8 X9 X10 X11 X12 X13 X14 X15 X16 X17 X18 X19 X20 X21
 1 Anth_cap1   1  0  0  1  0  1  0  0  1   0   0   0   0   0   0   0   1   0   0   0   1
 2 Anth_crin1  1  0  0  1  0  1  0  0  1   0   1   0   0   0   0   0   0   1   0   0   1
 3 Anth_eck1   1  0  0  1  0  1  0  0  1   0   0   0   0   0   0   0   0   1   0   0   1
 4 Anth_gram1  1  0  0  1  0  1  0  0  1  NA  NA  NA  NA   0   0   0   0   1   0   0   0
 5 Anth_insi1  1  0  0  1  0  1  0  0  1   0   0   0   1   0   0   0   0   1   0   0   1
 
 All columns  are binary coded characters.
 The Import was done by this
 
 Test <- read.table("TestRFemMalBivariat1.csv", header = TRUE, sep = ";")

First - you need to transpose the matrix to have species as columns.  You can do
this with:

d2 = data.frame(t(Test[,-1]))
colnames(d2) = Test[,1]  #now use d2


 
 Now I try to perform a similarity analysis with the dsvdis function  
 of the labdsv package with the sorensen-Index.
 
 My first question is if all zeros in my table are seen as missing  
 values and if it islike that how can I change without turning zero  
 into other numbers?

No - the zeros are valid observations. The NAs are the missing data.


   DisTest <- dsvdis(Test, index = "sorensen")
 
 But I always get back this error message:
 
 Warnung in symbol.For(dsvdis) :'symbol.For' is not needed: please  
 remove it
 Fehler in dsvdis(Test, index = sorensen) :
   NA/NaN/Inf in externem Funktionsaufruf (arg 1)
 Zusätzlich: Warning message:
 NAs durch Umwandlung erzeugt



Second - you have an issue with missing data.  It looks like dsvdis does not
like the NA's - so you must make a decision about what to do.  Delete that
species, delete that site, or whatever...

Finally - the warning over symbol.For is an issue with the labdsv library itself
- nothing you are doing wrong.  The results will still be valid - but the use of
symbol.For is something that will eventually need to be changed in the labdsv
library.
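
The two deletion choices mentioned above can be sketched in base R as follows. The small table is a made-up stand-in for the real data, and which option is appropriate is a judgment about the data, not something the code decides:

```r
# Made-up stand-in for the imported table (species in rows, characters in columns)
Test <- data.frame(species = c("Anth_cap1", "Anth_gram1", "Anth_insi1"),
                   X9  = c(1, 1, 1),
                   X10 = c(0, NA, 0),
                   X13 = c(0, NA, 1),
                   X21 = c(1, 0, 1))

# Transpose as suggested above, so that species become columns
d2 <- data.frame(t(Test[, -1]))
colnames(d2) <- as.character(Test[, 1])

# Option A: drop every species (column) that has at least one NA
keep_species <- d2[, colSums(is.na(d2)) == 0, drop = FALSE]

# Option B: drop every character (row) that has at least one NA
keep_chars <- d2[complete.cases(d2), , drop = FALSE]

keep_species
keep_chars
```

Either result is free of NAs and could then be passed on to the dissimilarity function.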

hth,

stephen



Re: [R] Computing time differences

2007-06-20 Thread Stephen Tucker
Here is one way:

Vector1 <- c("20080621.00","20080623.00")
Vector2 <- c("20080620.00","20080622.00")
do.call(difftime,
        c(apply(cbind(time1=Vector1,time2=Vector2),2,
                function(x) strptime(x,format="%Y%m%d.00")),
          units="hours"))

see ?strptime, ?difftime and
http://cran.r-project.org/doc/Rnews/Rnews_2004-1.pdf
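
If the dates are stored as numbers rather than as character strings, the trailing ".00" is lost on conversion and the format above no longer matches. A hedged sketch for that case, assuming whole-day values; to_date() is a hypothetical helper name:

```r
# Dates held as numbers of the form YYYYMMDD.00
Vector1 <- c(20080621.00, 20080623.00)
Vector2 <- c(20080620.00, 20080622.00)

# Hypothetical helper: print the number without decimals, then parse as a date
to_date <- function(x) as.Date(sprintf("%.0f", x), format = "%Y%m%d")

hours <- difftime(to_date(Vector1), to_date(Vector2), units = "hours")
hours
```

Using the Date class here sidesteps time zones and daylight saving, at the cost of ignoring any time-of-day information.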



--- [EMAIL PROTECTED] wrote:

 Dear R users, 
 
 I have a problem computing time differences using R. 
 
 I have dates that are given in the following format: 20080620.00, where
 the first 4 digits represent the year, the next 2 the month and the last 2
 the day. I need to compute time differences between two vectors in this
 format. 
 
 I tried to change this format into some type of time series, without
 any success. 
 
 Could some one provide me with some useful suggestion and/or tip to know
 where to look?
 
 I am using R-2.4.0 under Windows XP
 
 Thanks for your help, 
 
 Vincent
 
 
 



   



Re: [R] data type for block data?

2007-06-19 Thread Stephen Tucker
Hi Paul,

Hope this is what you're looking for:

## reading in text (the first 13 rows of cc from your posting)
## and using smaller indices [(3,8) instead of (10,40)]
## for this example
> cc <- "mode<-"(do.call(rbind,
+    strsplit(readLines(textConnection(txt))[-1],"[ ]{2,}"))[,-1],
+    "numeric")
> index <- c(3,8)

## (1) convert cc to data frame
## (2) split according to factors produced by cut()
## (3) apply data.matrix() to each element of list
## produced by split() to convert back to numeric matrix
> s <- lapply(split(as.data.frame(cc),
+                   f=cut(1:nrow(cc),breaks=c(-Inf,index,Inf))),
+             data.matrix)

## return result. now s[[1]] contains the first block,
## s[[2]] contains the second block, and so on.
> s
$`(-Inf,3]`
  V1 V2
1  1 26
2  2 27
3  3 28

$`(3,8]`
  V1 V2
4  4 29
5  5 30
6  6 31
7  7 32
8  8 33

$`(8, Inf]`
   V1 V2
9   9 34
10  1 27
11  1 28
12  2 30
13  3 34
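
The same idea as a self-contained sketch that stays with matrices throughout. It assumes, as above, that each entry of 'index' is the last row of a block; the matrix is typed in directly from the 13 rows in the posting:

```r
# The first 13 rows of 'cc' from the posting, entered directly
cc <- cbind(c(1:9, 1, 1, 2, 3),
            c(26:34, 27, 28, 30, 34))
index <- c(3, 8)   # last row of each block except the final one

# Assign every row number to a block, then split the row numbers
block <- cut(seq_len(nrow(cc)), breaks = c(-Inf, index, Inf))
s <- lapply(split(seq_len(nrow(cc)), block),
            function(rows) cc[rows, , drop = FALSE])

# Work on each block separately, e.g. column means per block
lapply(s, colMeans)
```

Splitting the row numbers rather than the data avoids the round trip through a data frame.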


--- H. Paul Benton [EMAIL PROTECTED] wrote:

 Dear All,
 
 
 I have a matrix with data that is not organised. I would like to go
 through this and extract it. Each feature has 2 vectors which express
 the data. I also have an index of the places where the data should be cut.
 eg.
 class(cc)
 matrix
 cc
   [,1] [,2]
  [1,]1   26
  [2,]2   27
  [3,]3   28
  [4,]4   29
  [5,]5   30
  [6,]6   31
  [7,]7   32
  [8,]8   33
  [9,]9   34
 [10,]1   27
 [11,]1   28
 [12,]2   30
 [13,]3   34
 ect..
  index
 [1] 10 40
 
 
 Is there a way to take cc[i:index[i-1],] into another format in which
 each block can be worked on separately, i.e. so that one block would be
 rows 1:10, the next block rows 11:40, and so on?
 
 Thanks,
 
 Paul
 
 
 
 -- 
 Research Technician
 Mass Spectrometry
o The
   /
 o Scripps
   \
o Research
   /
 o Institute
 




Re: [R] : create a PDF file (text (print list) and grafics)

2007-06-19 Thread Stephen Tucker
Hi Ana,

There are two ways in which I imagine this can be done:
(1) create a layout [using layout()] and printing the text on a blank plot;
(2) using Sweave.

## === Method 1 example... ===

pdf()
layout(matrix(c(1,1,2,3),ncol=2,byrow=TRUE),widths=c(1,1),heights=c(3,2))
par(mar=c(0,0,5,0))
plot.new(); plot.window(xlim=c(0,1),ylim=c(0,1))
title("Title",cex.main=1.5)
text(0.4,0.5,adj=c(0,0),lab=
"> print(myList)
$a
[1] 1

$b
[1] 2")
par(mar=c(5,4,1,1))
boxplot(1:10)
hist(1:10)
dev.off()

## (there must be a more elegant way than pasting the output
## of print(myList) as a character string in text() but I can't
## think of it at the moment...)
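
One possible way around pasting the printed output by hand is capture.output(), which returns what print() would show as a character vector. A sketch of Method 1 rewritten along those lines; the list contents and the output file location are placeholders:

```r
myList <- list(a = 1, b = 2)

# Collect the printed form of the list as one multi-line string
txt <- paste(capture.output(print(myList)), collapse = "\n")

out <- tempfile(fileext = ".pdf")   # placeholder output file
pdf(out)
layout(matrix(c(1, 1, 2, 3), ncol = 2, byrow = TRUE), heights = c(3, 2))
par(mar = c(0, 0, 5, 0))
plot.new(); plot.window(xlim = c(0, 1), ylim = c(0, 1))
title("Title", cex.main = 1.5)
text(0.4, 0.5, adj = c(0, 0.5), labels = txt)   # text panel
par(mar = c(5, 4, 1, 1))
boxplot(1:10)                                    # lower-left image
hist(1:10, main = "")                            # lower-right image
dev.off()
```

Long lists will run off the panel, so this suits short summaries only.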

## === Method 2 (warning: I am not too familiar with Sweave
## but I understand that this is how it *should* work; this
## Sweave document will create a '.tex' file which you can then
## compile with latex - this site was helpful:
## http://www.stat.umn.edu/~charlie/Sweave/)  ===

\documentclass[a4paper]{article}
\title{Sweave Document}
\author{}
\date{}
\begin{document}
\maketitle

Text field here

<<echo=FALSE>>=
## computations to build 'myList' here (but not for printing)
## such as
myList <- list(a=1,b=2)
@
<<reg>>=
## this is for output
print(myList)
@ 

\begin{center}
<<fig=TRUE,echo=FALSE>>=
par(mfrow=c(1,2), oma=c(0,0,3,0),cex=0.5)
#Image
hist(controlo$quope,axes=T,plot=T,col="gray",xlab="Quope",
     main="Histograma",lwd=2)
boxplot(controlo$quope,col="bisque",lty=3,medlty=1,medlwd=2.5,
        main="Boxplot") 
mtext(regiao,cex=1.5,col="blue",adj=0.5,side=3,outer=TRUE) 
@
\end{center}
\end{document}

##



--- Ana Patricia Martins [EMAIL PROTECTED] wrote:

 Dear helpers,
 
 I need help to create a PDF file like the example
 
  -----------------------
  |        Title        |
  -----------------------
  |                     |
  | Text (print a list) |
  |                     |
  -----------------------
  |          |          |
  |          |          |
  |  image   |  image   |
  |          |          |
  |          |          |
  -----------------------
 
 

 pdf(paste(getwd(),"/Output/Controlo_Pesos",regiao,trimestre,substr(ano,3,4),
           ".pdf",sep=""),height=13.7, paper="special")
 par(mfrow=c(1,2), oma=c(0,0,3,0),cex=0.5)
 
 #Text field ()
 #print(qual_pesos)# is a list
 
 #Image
 hist(controlo$quope,axes=T,plot=T,col="gray",xlab="Quope",
      main="Histograma",lwd=2)
 boxplot(controlo$quope,col="bisque",lty=3,medlty=1,medlwd=2.5,
         main="Boxplot") 
 mtext(regiao,cex=1.5,col="blue",adj=0.5,side=3,outer=TRUE) 
 dev.off()
 
 
 
 Is there another way to do the same more easily?
 Thanks in advance for helping me.
 Best regards.
 
 Atenciosamente,
 Ana Patricia Martins
 ---
 Serviço Métodos Estatísticos
 Departamento de Metodologia Estatística
 INE - Portugal
 Telef:  218 426 100 - Ext: 3210
 E-mail: [EMAIL PROTECTED] mailto:[EMAIL PROTECTED] 
 
 




[R] passing (or obtaining) index or element name of list to FUN in lapply()

2007-06-13 Thread Stephen Tucker
Hello everyone,

I wonder if there is a way to pass the index or name of a list to a
user-specified function in lapply(). For instance, my desired effect is
something like the output of 

> L <- list(jack=4098,sape=4139)
> lapply(seq(along=L),function(i,x) if(i==1) "jack" else "sape",x=L)
[[1]]
[1] "jack"

[[2]]
[1] "sape"

> lapply(seq(along=L),function(i,x) if(names(x)[i]=="jack") 1 else 2,x=L)
[[1]]
[1] 1

[[2]]
[1] 2

But by passing L as the first argument of lapply(). I thought there was a
tangentially-related post on this mailing list in the past but I don't recall
that it was ever addressed directly (and I can't seem to find it now). The
examples above are perfectly good alternatives especially if I wrap each of
the lines in "names<-"() to return lists with appropriate names assigned, but
it feels like I am essentially writing a FOR-LOOP - though I was surprised to
find that speed-wise, it doesn't seem to make much of a difference (unless I
have not selected a rigorous test):

> N <- 1
> y <- runif(N)
## looping through elements of y
> system.time(lapply(y,
+                    function(x) {
+                      set.seed(222)
+                      mean(rnorm(1e4,x,1))
+                    }))
[1] 21.00  0.17 21.29    NA    NA
## looping through indices
> system.time(lapply(1:N,
+                    function(x,y) {
+                      set.seed(222)
+                      mean(rnorm(1e4,y[x],1))
+                    },y=y))
[1] 21.09  0.14 21.26    NA    NA

In Python, there are methods for Lists and Dictionaries called enumerate(),
and iteritems(), respectively. Example applications:

## a list
L = ['a','b','c']
[x for x in enumerate(L)]
## returns index of list along with the list element
[(0, 'a'), (1, 'b'), (2, 'c')]

## a dictionary
D = {'jack': 4098, 'sape': 4139}
[x for x in D.iteritems()]
## returns element key (name) along with element contents
[('sape', 4139), ('jack', 4098)]

And this is something of the effect I was looking for...

Thanks to all,

Stephen



Re: [R] Viewing a data object

2007-06-13 Thread Stephen Tucker
Hi Horace,

I have also thought that it may be useful but I don't know of any Object
Explorer available for R.

However (you may already know this),
(1) you can view your list of objects in R with objects(), 
(2) you can view objects in a spreadsheet-like table (if they are matrices or
data frames) with invisible(edit(objectName)) [which isn't easy on the fingers].
fix(objectName) is a shorter option, but it has the side effect of
possibly changing your object when you close the data viewer. For instance,
this can happen if you mistakenly type something into a cell; it can also
change your column classes even when you don't. For example:

> options(stringsAsFactors=TRUE)
> x <- data.frame(letters[1:5],1:5)
> sapply(x,class)
letters.1.5.         X1.5 
    "factor"    "integer" 
> fix(x) # no user-changes made
> sapply(x,class)
letters.1.5.         X1.5 
    "factor"    "numeric" 

(3) I believe Deepayan Sarkar contributed the tab-completion capability at
the command line. So unless you have a lot of objects beginning with
'AuroraStoch...' you should be able to type a few letters and let the
auto-completion handle the rest.
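
For saving keystrokes, a short alias plus a small viewing helper may also be enough. The sketch below uses the data frame name from the question with made-up contents; peek() is a hypothetical helper, not a built-in function:

```r
# Made-up stand-in for the long-named data frame in the question
AuroraStochasticRunsJune1.df <- data.frame(run = 1:1000, value = sqrt(1:1000))

# A short alias (this is a copy, so later changes to the original are not reflected)
d <- AuroraStochasticRunsJune1.df

# Hypothetical helper: show n rows starting at row 'from'
peek <- function(x, from = 1, n = 10) {
  rows <- seq(from, length.out = n)
  rows <- rows[rows <= nrow(x)]      # do not run past the last row
  x[rows, , drop = FALSE]
}

mid <- peek(d, from = 400, n = 5)
mid

# Built-in quick looks
head(d, 3)   # first rows
tail(d, 3)   # last rows
str(d)       # compact structure summary
```

head(), tail() and str() are part of the standard distribution, so they work without any setup.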

Best regards,

ST


--- Horace Tso [EMAIL PROTECTED] wrote:

 Dear list,
 
 First apologize that this is trivial and just betrays my slothfulness at
 the keyboard. I'm sick of having to type a long name just to get a glimpse
 of something. For example, if my data frame is named
 'AuroraStochasticRunsJune1.df and I want to see what the middle looks
 like, I have to type
 
 AuroraStochasticRunsJune1.df[ 400:500, ]
 
 And often I'm not even sure rows 400 to 500 are what I want to see.  I
 might have to type the same line many times.
 
 Is there sort of a R-equivalence of the Object Explorer, like in Splus,
 where I could mouse-click an object in a list and a window pops up?  Short
 of that, is there any trick of saving a couple of keystrokes here and
 there?
 
 Thanks for tolerating this kind of annoying questions.
 
 H.
 
 



 



Re: [R] passing (or obtaining) index or element name of list to FUN in lapply()

2007-06-13 Thread Stephen Tucker
Hi Professor Ripley,

Thanks for the response. I apologize, my examples were not too real (though
your solutions are indeed clever)... I was trying to ask more generally
whether the element name or index of 'listObj' could be obtained by the
user-function 'myfunction' when used in lapply(X=listObj,FUN=myfunction);
below I illustrate two cases in which I have come across this desire:
(1) In 'Example 1' I essentially take the list element and do some
transformations (optionally some number-crunching), and then plot it with the
element name of the list for the title.
(2) In 'Example 2' I want to read in data from the list element and write the
contents to a file; writing a header line only when operating on the first
element of the list.

## data specification
data1 <- "var1 var2
-0.44 0.17
1.03 0.93
0.85 0.39"
data2 <- "var1 var2
-0.16 0.97
0.93 0.23
0.80 0.42"
L <- list(data1=data1,data2=data2)

##=== Example 1 (want element name) ===
## function definition
plottingfunc <- function(i,x) {
  plot(read.table(textConnection(x[[i]]),header=TRUE),main=names(x)[i])
}
## function application
par(mfrow=c(2,1))
lapply(seq(along=L),plottingfunc,x=L)

##=== Example 2 (want element index) ===
## function definition
readwritefunc <- function(i,x,fout) {
  data <- read.table(textConnection(x[[i]]),header=TRUE)
  if(i==1) cat(paste(colnames(data),collapse=","),"\n",file=fout)
  write.table(data,file=fout,sep=",",col.names=FALSE,
              row.names=FALSE,quote=FALSE,append=TRUE)
}
## function application
fout <- file("out.dat",open="w")
lapply(seq(along=L),readwritefunc,x=L,fout=fout)
close(fout)

Since the above code works, I suppose this is more of a question of
aesthetics since I thought the spirit of lapply() was to operate on the
elements of a list and not its indices - I thought perhaps there is a way to
get the index number and element name from within the user-function.
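
One option that comes close to this, sketched here as a possibility and not as the established idiom, is mapply(): it walks over several vectors in parallel, so the function can receive the element, its name and its index as ordinary arguments:

```r
L <- list(jack = 4098, sape = 4139)

# The function receives name, element and index together;
# SIMPLIFY = FALSE keeps the result as a list, as lapply() would
res <- mapply(function(name, value, i) paste(i, name, value),
              names(L), L, seq_along(L),
              SIMPLIFY = FALSE)
res
```

Because the first vector passed is the character vector of names, the result is also named by it, which removes the need for a separate names assignment.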

Also, I recall a lesson on 'loop avoidance' from an earlier version of MASS;
this was in the days of S-PLUS dominance and perhaps less applicable now to R
as you mentioned... But old habits die hard; my amygdala still invokes a fear
response at the thought of a loop... (and as of recently, I have been
infatuated with the notion of adhering, albeit loosely, to the 'functional
programming' paradigm which makes me doubly fearful of loops)

Thanks and best regards,

Stephen

--- Prof Brian Ripley [EMAIL PROTECTED] wrote:

 On Tue, 12 Jun 2007, Stephen Tucker wrote:
 
  Hello everyone,
 
  I wonder if there is a way to pass the index or name of a list to a
  user-specified function in lapply(). For instance, my desired effect is
  something like the output of
 
  L <- list(jack=4098,sape=4139)
  lapply(seq(along=L),function(i,x) if(i==1) "jack" else "sape",x=L)
  [[1]]
  [1] "jack"
 
  [[2]]
  [1] "sape"
 
 as.list(names(L))
 
  lapply(seq(along=L),function(i,x) if(names(x)[i]=="jack") 1 else 2,x=L)
  [[1]]
  [1] 1
 
  [[2]]
  [1] 2
 
 as.list(seq_along(L))
 
 lapply() can be faster than a for-loop, but usually not by much: its main 
 advantage is clarity of code.
 
 I think we need a real-life example to see what you are trying to do.
 
  But by passing L as the first argument of lapply(). I thought there was a
  tangentially-related post on this mailing list in the past but I don't
 recall
  that it was ever addressed directly (and I can't seem to find it now).
 The
  examples above are perfectly good alternatives especially if I wrap each
 of
  the lines in names-() to return lists with appropriate names assigned,
 but
 
 Try something like
 
 L[] <- lapply(seq_along(L),function(i,x) if(i==1) "jack" else "sape",x=L)
 
  it feels like I am essentially writing a FOR-LOOP - though I was
 surprised to
  find that speed-wise, it doesn't seem to make much of a difference
 (unless I
  have not selected a rigorous test):
 
  > N <- 1
  > y <- runif(N)
  ## looping through elements of y
  > system.time(lapply(y,
  +                    function(x) {
  +                      set.seed(222)
  +                      mean(rnorm(1e4,x,1))
  +                    }))
  [1] 21.00  0.17 21.29    NA    NA
  ## looping through indices
  > system.time(lapply(1:N,
  +                    function(x,y) {
  +                      set.seed(222)
  +                      mean(rnorm(1e4,y[x],1))
  +                    },y=y))
  [1] 21.09  0.14 21.26    NA    NA
 
  In Python, there are methods for Lists and Dictionaries called
 enumerate(),
  and iteritems(), respectively. Example applications:
 
  ## a list
  L = ['a','b','c']
  [x for x in enumerate(L)]
  ## returns index of list along with the list element
  [(0, 'a'), (1, 'b'), (2, 'c')]
 
  ## a dictionary
  D = {'jack': 4098, 'sape': 4139}
  [x for x in D.iteritems()]
  ## returns element key (name) along with element contents
  [('sape', 4139), ('jack', 4098)]
 
  And this is something of the effect I was looking for...
 
  Thanks to all,
 
  Stephen
 
  __
  R-help@stat.math.ethz.ch mailing list
  https://stat.ethz.ch/mailman

Re: [R] selecting characters from a line of text

2007-06-11 Thread Stephen Tucker
Maybe substring() is what you're looking for? Some examples:

> substring("textstring",1,5)
[1] "texts"
> substring("textstring",3)
[1] "xtstring"
> substring("textstring",3,nchar("textstring"))
[1] "xtstring"
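
Building on substring(), the Excel-style left() and right() from the question could be sketched as below. Both helper names and the sample lines are made up for illustration:

```r
# Hypothetical Excel-like helpers built on substr()
left  <- function(x, n) substr(x, 1, n)
right <- function(x, n) substr(x, nchar(x) - n + 1, nchar(x))

# Made-up fixed-position lines, as if read with readLines()
lines <- c("ID001 2007-06-11 42.5",
           "ID002 2007-06-12 17.0")

# Pick the fields by position and collect them in a table
tab <- data.frame(id    = left(lines, 5),
                  date  = substr(lines, 7, 16),
                  value = as.numeric(right(lines, 4)),
                  stringsAsFactors = FALSE)
tab
```

All three functions are vectorized, so a whole file's worth of lines can be handled in one call.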


--- Tim Holland [EMAIL PROTECTED] wrote:

 Is there a way in R to select certain characters from a line of text?  I
 have some data that is presently in a large number of text files, and I
 would like to be able to select elements of each text file (elements are
 always on the same line, in the same position) and organize them into a
 table.  Is there a tool to select text in this way in R?  What I am looking
 for would be somewhat similar to the left() and right() functions in Excel.
 I have looked at the parse() and scan() functions, but don't think they can
 do what I want (although I could be wrong).
 Thank you,
 Tim
 




Re: [R] How to specify the start position using plot

2007-06-10 Thread Stephen Tucker

plot(x=1:10,y=1:10,xlim=c(0,5),ylim=c(6,10))

Many of the argument descriptions for plot() are in ?par; xlim and ylim
themselves are described in ?plot.default.

--- Patrick Wang [EMAIL PROTECTED] wrote:

 Hi,
 
 How to specify the start position of Y in plot command, hopefully I can
 specify the range of X and Y axis. I checked the ?plot, it didnot mention
 I can setup the range.
 
 
 Thanks
 Pat
 
 



 



Re: [R] Tools For Preparing Data For Analysis

2007-06-10 Thread Stephen Tucker

Since R is supposed to be a complete programming language, I wonder
why these tools couldn't be implemented in R (unless speed is the
issue). Of course, it's a naive desire to have a single language that
does everything, but it seems that R currently has most of the
functions necessary to do the type of data cleaning described.

For instance, Gabor and Peter showed some snippets of ways to do this
elegantly; my [physical science] data is often not as horrendously
structured so usually I can get away with a program containing this
type of code

txtin <- scan(filename,what="",sep="\n")
filteredList <- lapply(strsplit(txtin,delimiter),FUN=filterfunction)
   # filterfunction() returns selected (and possibly transformed)
   # elements if present and NULL otherwise
   # may include calls to grep(), regexpr(), gsub(), substring(),...
   # nchar(), sscanf(), type.convert(), paste(), etc.
mydataframe <- do.call(rbind,filteredList)
   # then match(), subset(), aggregate(), etc.

In the case that the file is large, I open a file connection and scan
a single line + apply filterfunction() successively in a FOR-LOOP
instead of using lapply(). Of course, the devil is in the details of
the filtering function, but I believe most of the required text
processing facilities are already provided by R.
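
The line-at-a-time variant described above might look roughly like this. It is only a sketch: the filterfunction() shown is a hypothetical stand-in, a text connection substitutes for file(filename, "r"), and readLines(con, n = 1) is used in place of scan().

```r
# Hypothetical filter: keep only lines with exactly three fields.
filterfunction <- function(fields) {
  if (length(fields) == 3) type.convert(fields, as.is = TRUE) else NULL
}

# Stand-in for a large file; in practice: con <- file(filename, "r")
con <- textConnection(c("1,2,3", "bad line", "4,5,6"))
rows <- list()
repeat {
  line <- readLines(con, n = 1)
  if (length(line) == 0) break            # end of input reached
  kept <- filterfunction(strsplit(line, ",")[[1]])
  if (!is.null(kept)) rows[[length(rows) + 1]] <- kept
}
close(con)

result <- do.call(rbind, rows)            # one matrix row per retained line
```

Only one line is held in memory at a time, at the cost of a slower loop than the lapply() version.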

I often have tasks that involve a combination of shell-scripting and
text processing to construct the data frame for analysis; I started
out using Python+NumPy to do the front-end work but have been using R
progressively more (frankly, all of it) to take over that portion
since I generally prefer the data structures and methods in R.


--- Peter Dalgaard [EMAIL PROTECTED] wrote:

 Douglas Bates wrote:
  Frank Harrell indicated that it is possible to do a lot of difficult
  data transformation within R itself if you try hard enough but that
  sometimes means working against the S language and its whole object
  view to accomplish what you want and it can require knowledge of
  subtle aspects of the S language.

 Actually, I think Frank's point was subtly different: It is *because* of 
 the differences in view that it sometimes seems difficult to find the 
 way to do something in R that  is apparently straightforward in SAS. 
 I.e. the solutions exist and are often elegant, but may require some 
 lateral thinking.
 
 Case in point: Finding the first or the last observation for each 
 subject when there are multiple records for each subject. The SAS way 
 would be a datastep with IF-THEN-DELETE, and a RETAIN statement so that 
 you can compare the subject ID with the one from the previous record, 
 working with data that are sorted appropriately.
 
 You can do the same thing in R with a for loop, but there are better 
 ways e.g.
 subset(df,!duplicated(ID)), and subset(df, rev(!duplicated(rev(ID)))), or 
 maybe
 do.call(rbind,lapply(split(df,df$ID), head, 1)), resp. tail. Or 
 something involving aggregate(). (The latter approaches generalize 
 better to other within-subject functionals like cumulative doses, etc.).
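
 A small sketch of these idioms on made-up data (the data frame and its column names are assumptions for illustration):

```r
# Toy data: several records per subject, already sorted by ID and visit.
df <- data.frame(ID = c(1, 1, 2, 2, 2, 3), visit = c(1, 2, 1, 2, 3, 1))

# First and last record per subject via duplicated()
first <- subset(df, !duplicated(ID))
last  <- subset(df, rev(!duplicated(rev(ID))))

# The same via split(), which generalizes to other per-subject summaries
first2 <- do.call(rbind, lapply(split(df, df$ID), head, 1))
last2  <- do.call(rbind, lapply(split(df, df$ID), tail, 1))
```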
 
 The hardest cases that I know of are the ones where you need to turn one 
 record into many, such as occurs in survival analysis with 
 time-dependent, piecewise constant covariates. This may require 
 transposing the problem, i.e. for each  interval you find out which 
 subjects contribute and with what, whereas the SAS way would be a 
 within-subject loop over intervals containing an OUTPUT statement.
 
 Also, there are some really weird data formats, where e.g. the input 
 format is different in different records. Back in the 80's when 
 punched-card input was still common, it was quite popular to have one 
 card with background information on a patient plus several cards 
 detailing visits, and you'd get a stack of cards containing both kinds. 
 In R you would most likely split on the card type using grep() and then 
 read the two kinds separately and merge() them later.
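
 A rough sketch of that split-and-merge approach. The card layout (a leading "P" or "V" type code and the field names) is invented for illustration, and grepl() is used here instead of grep() to get a logical index:

```r
# Hypothetical mixed input: "P" cards hold patient background, "V" cards visits.
cards <- c("P 001 M 45",
           "V 001 2007-01-10 120",
           "V 001 2007-02-10 118",
           "P 002 F 52",
           "V 002 2007-01-15 130")

is_patient <- grepl("^P", cards)

# Read each kind of card separately with its own layout.
patients <- read.table(text = cards[is_patient],
                       col.names = c("type", "id", "sex", "age"),
                       colClasses = c("character", "character",
                                      "character", "numeric"))
visits <- read.table(text = cards[!is_patient],
                     col.names = c("type", "id", "date", "sbp"),
                     colClasses = c("character", "character",
                                    "character", "numeric"))

# One row per visit, with the patient's background attached.
merged <- merge(patients[-1], visits[-1], by = "id")
```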
 


Re: [R] Tools For Preparing Data For Analysis

2007-06-10 Thread Stephen Tucker
Embarrassingly, I don't know awk or sed but R's code seems to be
shorter for most tasks than Python, which is my basis for comparison.

It's true that R's more powerful data structures usually aren't
necessary for the data cleaning, but sometimes in the filtering
process I will pick out lines that contain certain data, in which case
I have to convert text to numbers and perform operations like
which.min(), order(), etc., so in that sense I like to have R's
vectorized notation and the objects/functions that support it.

As far as some of the tasks you described, I've tried transcribing
them to R. I know you provided only the simplest examples, but even in
these cases I think R's functions for handling these situations
exemplify their usefulness in this step of the analysis. But perhaps
you would argue that this code is too long... In any event it will
still save the trouble of keeping track of an extra (intermediate)
file passed between awk and R.

(1) the number of fields in each line, equivalent to 
cat datafile.csv | awk 'BEGIN{FS=","}{n=NF;print n}'
in awk

# R equivalent:
nFields <- count.fields("datafile.csv",sep=",")
# or 
nFields <- sapply(strsplit(readLines("datafile.csv"),","),length)

(2) which lines have the wrong number of fields, and how many fields
they have. You can similarly count how many lines there are (e.g. pipe
into wc -l).

# number of lines with wrong number of fields
nWrongFields <- length(nFields[nFields != 10])

# select only first ten fields from each line
# and return a matrix
firstTenFields <- 
  do.call(rbind,
          lapply(strsplit(readLines("datafile.csv"),","),
                 function(x) x[1:10]))

# select only those lines which contain ten fields
# and return a matrix
onlyTenFields <- 
  do.call(rbind,
          lapply(strsplit(readLines("datafile.csv"),","),
                 function(x) if(length(x) == 10) x else NULL))
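
A self-contained sketch of steps (1) and (2), using a small temporary file in place of datafile.csv and three expected fields instead of ten (both are assumptions made to keep the example short):

```r
# Write a tiny CSV with two malformed lines.
csv <- tempfile(fileext = ".csv")
writeLines(c("a,b,c", "1,2,3", "4,5", "6,7,8,9"), csv)

expected <- 3
nFields  <- count.fields(csv, sep = ",")   # number of fields on each line
wrong    <- which(nFields != expected)     # line numbers with a wrong count

# Keep only the well-formed lines and bind them into a character matrix.
fields <- strsplit(readLines(csv), ",")
good   <- do.call(rbind, fields[lengths(fields) == expected])
```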

(3)
If for instance you try to
read the following CSV into R as a dataframe:
 
1,2,.,4
2,.,4,5
3,4,.,6
 

txtC <- textConnection(
"1,2,.,4
2,.,4,5
3,4,.,6")
# using read.csv() specifying na.strings argument:
> read.csv(txtC,header=FALSE,na.strings=".")
  V1 V2 V3 V4
1  1  2 NA  4
2  2 NA  4  5
3  3  4 NA  6

# Of course, read.csv will work only if data is formatted correctly.
# More generally, using readLines(), strsplit(), etc., which are more
# flexible (re-create txtC first, since the read.csv() call above has
# already consumed it):

> do.call(rbind,
+ lapply(strsplit(readLines(txtC),","),
+        type.convert,na.strings="."))
     [,1] [,2] [,3] [,4]
[1,]    1    2   NA    4
[2,]    2   NA    4    5
[3,]    3    4   NA    6

(4) Situations where people mix ,, and ,.,!

# type.convert (and read.csv) will still work when missing values are ,,
# and ,., (automatically recognizes "" as NA and through
# specification of 'na.strings', can recognize "." as NA)

# If it is desired to convert "." to "" first, this is simple as
# well:

m <- do.call(rbind,
             lapply(strsplit(readLines(txtC),","),
                    function(x) gsub("^\\.$","",x)))
> m
     [,1] [,2] [,3] [,4]
[1,] "1"  "2"  ""   "4" 
[2,] "2"  ""   "4"  "5" 
[3,] "3"  "4"  ""   "6" 

# then
mode(m) <- "numeric"
# or
m <- apply(m,2,type.convert)
# will give
> m
     [,1] [,2] [,3] [,4]
[1,]    1    2   NA    4
[2,]    2   NA    4    5
[3,]    3    4   NA    6
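
For case (4), a minimal sketch of reading input that mixes both conventions in one pass. The three input lines are made up, and the text= argument of read.csv() is assumed to be available in the R version in use:

```r
# Missing values written both as "." and as an empty field.
lines <- c("1,2,.,4",
           "2,,4,5",
           "3,4,.,6")

# Listing both markers in na.strings maps them to NA in a single call.
df <- read.csv(text = lines, header = FALSE, na.strings = c(".", ""))
```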


--- [EMAIL PROTECTED] wrote:

 On 10-Jun-07 19:27:50, Stephen Tucker wrote:
  
  Since R is supposed to be a complete programming language,
  I wonder why these tools couldn't be implemented in R
  (unless speed is the issue). Of course, it's a naive desire
  to have a single language that does everything, but it seems
  that R currently has most of the functions necessary to do
  the type of data cleaning described.
 
 In principle that is certainly true. A couple of comments,
 though.
 
 1. R's rich data structures are likely to be superfluous.
Mostly, at the sanitisation stage, one is working with
flat files (row  column). This straightforward format
is often easier to handle using simple programs for the
kind of basic filtering needed, rather than getting into
the heavier programming constructs of R.
 
 2. As follow-on and contrast at the same time, very often
what should be a nice flat file with no rough edges is not.
If there are variable numbers of fields per line, R will
not handle it straightforwardly (you can force it in,
but it's more elaborate). There are related issues as well.
 
 a) If someone entering data into an Excel table lets their
cursor wander outside the row/col range of the table,
this can cause invisible entities to be planted in the
extraneous cells. When saved as a CSV, this file then
has variable numbers of fields per line, and possibly
also extra lines with arbitrary blank fields.
 
cat datafile.csv | awk 'BEGIN{FS=","}{n=NF;print n}'
 
will give you the numbers of fields in each line.
 
If you further pipe it into | sort -nu you will get
the distinct field-numbers. If you know (by now) how many
fields there should be (e.g. 10), then
 
cat

Re: [R] Different fonts on different axes

2007-05-31 Thread Stephen Weigand
There's also this approach

plot(runif(10), ylab=list("Red, Bold?", col = "red", font = 2),
     xlab="Black, standard?")


On 5/31/07, Greg Snow [EMAIL PROTECTED] wrote:
 Try this:

  plot(runif(10), ylab="", xlab="Black, standard?")
  mtext('Red, Bold', side=2, line=3, col='red', font=2)

 Hope this helps,

 --
 Gregory (Greg) L. Snow Ph.D.
 Statistical Data Center
 Intermountain Healthcare
 [EMAIL PROTECTED]
 (801) 408-8111



  -Original Message-
  From: [EMAIL PROTECTED]
  [mailto:[EMAIL PROTECTED] On Behalf Of Martin
  Henry H. Stevens
  Sent: Thursday, May 31, 2007 11:00 AM
  To: R-Help
  Subject: [R] Different fonts on different axes
 
  Hi Folks,
  How do I get red bold font on my y axis and black standard
  font on my x axis?
 
  plot(runif(10), ylab="Red, Bold?", xlab="Black, standard?")
 
  Any pointers or examples would be great.
  Thanks!
  Hank
 
 
  Dr. Hank Stevens, Assistant Professor
  338 Pearson Hall
  Botany Department
  Miami University
  Oxford, OH 45056
 
  Office: (513) 529-4206
  Lab: (513) 529-4262
  FAX: (513) 529-4243
  http://www.cas.muohio.edu/~stevenmh/
  http://www.muohio.edu/ecology/
  http://www.muohio.edu/botany/
 
  E Pluribus Unum
 



-- 
Rochester, Minn. USA



Re: [R] looking for the na.omit equivalent for a matrix of characters

2007-05-28 Thread Stephen Tucker
You can also use type.convert() if you did want to convert your characters to
numbers and NA's to NA's so that you can use na.omit().

> x <- matrix("0",5,5)
> x[1,3] <- x[4,4] <- "NA"
> newx <- apply(x,2,type.convert)
> newx
     [,1] [,2] [,3] [,4] [,5]
[1,]    0    0   NA    0    0
[2,]    0    0    0    0    0
[3,]    0    0    0    0    0
[4,]    0    0    0   NA    0
[5,]    0    0    0    0    0
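
To complete the picture, here is a sketch of both routes to dropping the affected rows: staying with characters, or converting first and then calling na.omit(). The as.is = TRUE argument is an addition that newer versions of R ask for.

```r
x <- matrix("0", 5, 5)
x[1, 3] <- x[4, 4] <- "NA"

# Route 1: stay with characters and drop rows containing the string "NA".
keep  <- !apply(x == "NA", 1, any)
x_chr <- x[keep, , drop = FALSE]

# Route 2: convert column by column, then let na.omit() drop the rows.
x_num <- na.omit(apply(x, 2, type.convert, as.is = TRUE))
```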

--- jim holtman [EMAIL PROTECTED] wrote:

 Since they are characters you can just compare for them.  You did not show
 what your data looks like, or what you want to do if there are NA.  Do
 you
 want the row removed?  You can use 'apply' to test a row for NA:
 
 > x <- matrix("0",5,5)
 > x[1,3] <- x[4,4] <- "NA"
 > x
      [,1] [,2] [,3] [,4] [,5]
 [1,] "0"  "0"  "NA" "0"  "0" 
 [2,] "0"  "0"  "0"  "0"  "0" 
 [3,] "0"  "0"  "0"  "0"  "0" 
 [4,] "0"  "0"  "0"  "NA" "0" 
 [5,] "0"  "0"  "0"  "0"  "0" 
 > apply(x, 1, function(z) any(z == "NA"))
 [1]  TRUE FALSE FALSE  TRUE FALSE
 > x[!apply(x, 1, function(z) any(z == "NA")),]
      [,1] [,2] [,3] [,4] [,5]
 [1,] "0"  "0"  "0"  "0"  "0" 
 [2,] "0"  "0"  "0"  "0"  "0" 
 [3,] "0"  "0"  "0"  "0"  "0" 
 
 
 
 
 On 5/28/07, Andrew Yee [EMAIL PROTECTED] wrote:
 
  I have a matrix of characters (actually numbers that have been read in as
  numbers), and I'd like to remove the NA.
 
  I'm familiar with na.omit, but is there an equivalent of na.omit when the
  NA are the actual characters "NA"?
 
  Thanks,
  Andrew
 
 
 
 
 
 -- 
 Jim Holtman
 Cincinnati, OH
 +1 513 646 9390
 
 What is the problem you are trying to solve?
 


Re: [R] lme with corAR1 errors - can't find AR coefficient in output

2007-05-25 Thread Stephen Weigand
Millo,

On 5/24/07, Millo Giovanni [EMAIL PROTECTED] wrote:

 Dear List,

 I am using the output of a ML estimation on a random effects model with
 first-order autocorrelation to make a further conditional test. My model
 is much like this (which reproduces the method on the famous Grunfeld
 data, for the econometricians out there it is Table 5.2 in Baltagi):

 library(Ecdat)
 library(nlme)
 data(Grunfeld)
 mymod <- lme(inv~value+capital,data=Grunfeld,random=~1|firm,
              correlation=corAR1(0,~year|firm))

 Embarrassing as it may be, I can find the autoregressive parameter
 ('Phi', if I get it right) in the printout of summary(mymod) but I am
 utterly unable to locate the corresponding element in the lme or
 summary.lme objects.

 Any help appreciated. This must be something stupid I'm overlooking,
 either in str(mymod) or in the help files, but it's a huge problem for
 me.



Try

coef(mymod$modelStruct$corStruct,
   unconstrained = FALSE)
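
To see what unconstrained = FALSE does without fitting a model, here is a small sketch on a stand-alone corAR1 object (the value 0.4 is arbitrary). As far as I know, nlme stores the parameter on an unconstrained scale internally, which is why the default coef() call does not return Phi itself:

```r
library(nlme)

# A stand-alone AR(1) correlation structure with Phi = 0.4 (arbitrary value).
cs <- corAR1(0.4)

raw <- coef(cs)                          # internal, unconstrained scale
phi <- coef(cs, unconstrained = FALSE)   # the AR(1) coefficient itself

# For a fitted model the same call is applied to
# fit$modelStruct$corStruct, as in the reply above.
```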

Stephen
-- 
Rochester, Minn. USA



[R] [R-pkgs] New package earth

2007-05-11 Thread Stephen Milborrow
The earth package is now available on CRAN.

Earth builds models using Friedman's MARS.

Earth's principal advantages over the existing function mda::mars are that 
it is much faster and provides plotting and printing methods.  The general 
purpose model plotting function plotmo may also be useful to people who 
are not interested in earth itself.

Example:

 > a <- earth(Volume ~ ., data = trees)
 > summary(a, digits = 2)

Call:
earth(formula = Volume ~ ., data = trees)

Expression:
  23
  +  5.7 * pmax(0,  Girth - 13)
  -  2.9 * pmax(0,  13 - Girth)
  + 0.72 * pmax(0, Height - 76)

Number of cases: 31
Selected 4 of 5 terms, and 2 of 2 predictors
Number of terms at each degree of interaction: 1 3 (additive model)
GCV: 11 RSS: 213 GRSq: 0.96 RSq: 0.97
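
To make the printed expression concrete, the hinge terms can be evaluated by hand in base R. This is only an illustration: predict_volume() is a made-up helper, and it uses the rounded coefficients shown above, so its results are approximate and not a substitute for the package's own predict method.

```r
# Hand-coded version of the printed MARS expression (rounded coefficients).
predict_volume <- function(Girth, Height) {
  23 +
    5.7  * pmax(0, Girth - 13) -    # slope above the knot at Girth = 13
    2.9  * pmax(0, 13 - Girth) +    # slope below the same knot
    0.72 * pmax(0, Height - 76)     # active only when Height > 76
}

at_knots <- predict_volume(13, 76)  # every hinge term is zero here
larger   <- predict_volume(15, 80)
```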


Regards,
Stephen Milborrow

___
R-packages mailing list
[EMAIL PROTECTED]
https://stat.ethz.ch/mailman/listinfo/r-packages



Re: [R] loop in function

2007-05-05 Thread Stephen Tucker
Actually I am not sure what you want exactly, but is it

df1 <- data.frame(b=c(1,2,3,4,5,5,6,7,8,9,10))
df2 <- data.frame(x=c(1,2,3,4,5), y=c(2,5,4,6,5), z=c(10, 8, 7, 9, 3))
df1 <- cbind(df1,
             "colnames<-"(sapply(with(df2,(x+y)/z),
                                 function(a,b) a/b,b=df1$b),
                          paste("goal",seq(nrow(df2)),sep="")))

> round(df1,2)
b goal1 goal2 goal3 goal4 goal5
1   1  0.30  0.88  1.00  1.11  3.33
2   2  0.15  0.44  0.50  0.56  1.67
3   3  0.10  0.29  0.33  0.37  1.11
4   4  0.07  0.22  0.25  0.28  0.83
5   5  0.06  0.17  0.20  0.22  0.67
6   5  0.06  0.17  0.20  0.22  0.67
7   6  0.05  0.15  0.17  0.19  0.56
8   7  0.04  0.12  0.14  0.16  0.48
9   8  0.04  0.11  0.12  0.14  0.42
10  9  0.03  0.10  0.11  0.12  0.37
11 10  0.03  0.09  0.10  0.11  0.33

Each goal column corresponds to a row of df2. Alternatively, the sapply()
call can be rewritten with apply():

apply(df2,1,
      function(a,b) (a["x"]+a["y"])/(a["z"]*b),
      b=df1$b)
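
Since every entry is just ((x + y)/z) for one row of df2 divided by one value of b, the same table can also be sketched with outer(). This is an alternative I am adding, not something from the original exchange:

```r
df1 <- data.frame(b = c(1, 2, 3, 4, 5, 5, 6, 7, 8, 9, 10))
df2 <- data.frame(x = c(1, 2, 3, 4, 5), y = c(2, 5, 4, 6, 5),
                  z = c(10, 8, 7, 9, 3))

# Rows follow df1$b, columns follow the rows of df2.
goal <- outer(1 / df1$b, with(df2, (x + y) / z))
colnames(goal) <- paste("goal", seq(nrow(df2)), sep = "")
```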

Hope this answered your question...

--- [EMAIL PROTECTED] wrote:

 Dear Mailing-List,
 I think this is a newbie question. However, i would like to integrate a
 loop in the function below. So that the script calculates for each
 variable within the dataframe df1 the connecting data in df2. Actually it
 takes only the first row. I hope that's clear. My goal is to apply the
 function for each data in df1. Many thanks in advance. An example is as
 follows:
 
 df1 <- data.frame(b=c(1,2,3,4,5,5,6,7,8,9,10))
 df2 <- data.frame(x=c(1,2,3,4,5), y=c(2,5,4,6,5), z=c(10, 8, 7, 9, 3))
 attach(df2)
 myfun = function(yxz) (x + y)/(z * df1$b)
 df1$goal <- apply(df2, 1, myfun)
 df1$goal
 
 regards,
 
 kay
 


Re: [R] Single Title for the Multiple plot page

2007-05-03 Thread Stephen Tucker
Sometimes I just overlay a blank plot and annotate with text.

par(mfrow=c(1,2), oma=c(2,0,2,0))
plot(1:10)
plot(1:10)
oldpar <- par(no.readonly=TRUE)   # save only the settable parameters
par(mfrow=c(1,1),new=TRUE,mar=rep(0,4),oma=rep(0,4))
plot.window(xlim=c(0,1),ylim=c(0,1),mar=rep(0,4))
text(0.5,c(0.98,0.02),c("Centered Overall Title","Centered Subtitle"),
     cex=c(1.4,1))
par(oldpar)


--- Chuck Cleland [EMAIL PROTECTED] wrote:

 Mohammad Ehsanul Karim wrote:
  Dear List, 
  
  In R we can plot multiple graphs in same page using
  par(mfrow = c(*,*)). In each plot we can set title
  using main and sub commands. 
  
  However, is there any way that we can place an
  universal title above the set of plots placed in the
  same page (not individual plot titles, all i need is a
  title of the whole graph page) as well as sub-titles?
  Do I need any package to do so?
  
  Thank you for your time.
 
   This is covered in a number of places in the archives and
 RSiteSearch("Overall Title") points you to relevant posts. I'm not sure
 there is an example of having both a title and subtitle, but that is
 easy enough:
 
  par(mfrow=c(1,2), oma=c(2,0,2,0))
  plot(1:10)
  plot(1:10)
  title("Centered Overall Title", outer=TRUE)
  mtext(side=1, "Centered Subtitle", outer=TRUE)
 
   You might consider upgrading to a more recent version of R.
 
  Mohammad Ehsanul Karim (R - 2.3.1 on windows)
  Institute of Statistical Research and Training
  University of Dhaka 
 
 -- 
 Chuck Cleland, Ph.D.
 NDRI, Inc.
 71 West 23rd Street, 8th floor
 New York, NY 10010
 tel: (212) 845-4495 (Tu, Th)
 tel: (732) 512-0171 (M, W, F)
 fax: (917) 438-0894
 


[R] regular expressions with grep() and negative indexing

2007-04-25 Thread Stephen Tucker
Dear R-helpers,

Does anyone know how to use regular expressions to return vector elements
that don't contain a word? For instance, if I have a vector
  x <- c("seal.0","seal.1-exclude")
I'd like to get back the elements which do not contain the word "exclude",
using something like the following (which I know doesn't work):
  grep("[^(exclude)]",x)

I can use 
  x[-grep("exclude",x)]
for this case but then if I use this expression in a recursive function, it
will not work for instances in which the vector contains no elements with
that word. For instance, if I have
  x2 <- c("dolphin.0","dolphin.1")
then
  x2[-grep("exclude",x2)]
will give me 'character(0)'

I know I can accomplish this in several steps, for instance:
  myfunc <- function(x) {
    iexclude <- grep("exclude",x)
    if(length(iexclude) > 0) x2 <- x[-iexclude] else x2 <- x
    # do stuff with x2 ...?
  }

But this is embedded in a much larger function and I am trying to minimize
intermediate variable assignment (perhaps a futile effort). But if anyone
knows of an easy solution, I'd appreciate a tip.
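
One way to sidestep the empty-index problem, sketched here on the assumption that the R version in use provides grepl() and the invert argument of grep() (both are available in current versions of R), is to index with a logical vector or to let grep() do the inversion:

```r
x  <- c("seal.0", "seal.1-exclude")
x2 <- c("dolphin.0", "dolphin.1")

# A logical index behaves sensibly even when nothing matches.
keep1 <- x[!grepl("exclude", x)]
keep2 <- x2[!grepl("exclude", x2)]

# Equivalent: ask grep() for the non-matching values directly.
keep3 <- grep("exclude", x2, invert = TRUE, value = TRUE)
```

Both forms are single expressions, so no intermediate variable is needed inside a larger function.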

Thanks very much!

Stephen



Re: [R] Sum of specific column

2007-04-25 Thread Stephen Weigand
On 4/25/07, Spilak,Jacqueline [Edm] [EMAIL PROTECTED] wrote:
 I have a data set that I have imported (not sure if that makes a
 difference) and I would like to calculate the sum of only specific
 columns.  I have tried
 colSums(dataset, by=list(dataset$col5), dims=1) and I get an error of
 unused arguments

The error message is helpful: there is no 'by' argument to colSums.
You'll just get column sums over all rows.

 I have also tried
 aggregate(dataset, by=list(dataset$col5), sum) and I get the error that
 sum is not meaningful for factors.

Instead of giving aggregate the whole dataset, you can specify certain
columns via dataset[, c(1,5)] or dataset[, c("height", "weight")].
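
A small sketch of both ideas on made-up data (the column names site, height, weight and note are assumptions, not the poster's actual columns):

```r
dataset <- data.frame(site   = c("A", "A", "B"),
                      height = c(1.5, 2.5, 3),
                      weight = c(10, 20, 30),
                      note   = c("x", "y", "z"))

# Overall sums, restricted to the numeric columns.
num_cols <- sapply(dataset, is.numeric)
totals   <- colSums(dataset[, num_cols])

# Sums within groups, again passing only the columns to be summed.
by_site <- aggregate(dataset[, c("height", "weight")],
                     by = list(site = dataset$site), FUN = sum)
```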


 I want to only calculate the sum for specific columns because some of
 the columns have words in them and I have not been able to find anything
 else that would help or why these errors are occurring.
 Jacquie


Good luck,

Stephen

-- 
Rochester, Minn. USA



Re: [R] regular expressions with grep() and negative indexing

2007-04-25 Thread Stephen Tucker
Thanks for the suggestions, guys. I come across this problem a lot, but
now I have many solutions.

Thank you,

Stephen


--- Peter Dalgaard [EMAIL PROTECTED] wrote:

 Peter Dalgaard wrote:
  Stephen Tucker wrote:

  Dear R-helpers,
 
  Does anyone know how to use regular expressions to return vector
  elements that don't contain a word? For instance, if I have a vector
    x <- c("seal.0","seal.1-exclude")
  I'd like to get back the elements which do not contain the word
  "exclude", using something like (I know this doesn't work) but:
    grep("[^(exclude)]",x)
 
  I can use 
    x[-grep("exclude",x)]
  for this case but then if I use this expression in a recursive function,
  it will not work for instances in which the vector contains no elements
  with that word. For instance, if I have
    x2 <- c("dolphin.0","dolphin.1")
  then
    x2[-grep("exclude",x2)]
  will give me 'character(0)'
 
  I know I can accomplish this in several steps, for instance:
    myfunc <- function(x) {
      iexclude <- grep("exclude",x)
      if(length(iexclude) > 0) x2 <- x[-iexclude] else x2 <- x
      # do stuff with x2 ...?
    }
 
  But this is embedded in a much larger function and I am trying to
  minimize intermediate variable assignment (perhaps a futile effort).
  But if anyone knows of an easy solution, I'd appreciate a tip.

  
  It has come up a couple of times before, and yes, it is a bit of a pain.
 
  Probably the quickest way out is
 
  negIndex <- function(i) 
     if(length(i))
         -i 
     else 
         TRUE
 

 ... which of course needs braces if typed on the command line
 
 negIndex <- function(i) 
 {
    if(length(i))
        -i 
    else 
        TRUE
 }
 
 And I should probably also have said that it works like this:
 
 > x2 <- c("dolphin.0","dolphin.1")
 > x2[-grep("exclude",x2)]
 character(0)
 > x2[negIndex(grep("exclude",x2))]
 [1] "dolphin.0" "dolphin.1"
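
 The reason the plain negative index fails is that negating an empty index still gives an empty index, which selects nothing. A short sketch that makes this visible (variable names are illustrative):

```r
negIndex <- function(i) if (length(i)) -i else TRUE

x2 <- c("dolphin.0", "dolphin.1")
no_match <- grep("exclude", x2)       # integer(0): nothing matched

still_empty <- -no_match              # negation leaves it empty
dropped <- x2[-no_match]              # selects nothing at all
kept    <- x2[negIndex(no_match)]     # TRUE selects every element
```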
 
 
 
 -- 
O__   Peter Dalgaard Øster Farimagsgade 5, Entr.B
   c/ /'_ --- Dept. of Biostatistics PO Box 2099, 1014 Cph. K
  (*) \(*) -- University of Copenhagen   Denmark  Ph:  (+45)
 35327918
 ~~ - ([EMAIL PROTECTED])  FAX: (+45)
 35327907
 




Re: [R] queries

2007-04-22 Thread Stephen Tucker
hist(rnorm(100),xlab="Data",ylab="Count",main="")
title(main="Histogram of ...",cex=0.5)

see ?par for details on xlab, ylab, main, and cex arguments.
You can call these from title() or include them in hist().
I called title(main=..) separately to control its size separately
from the rest of the text (axis and tick labels).



--- Nima Tehrani [EMAIL PROTECTED] wrote:

 Dear Help Desk,

   Is there any way to change some of the labels on R diagrams? 

   Specifically in histograms, I would like to: 

   1. change the word frequency to count. 
   2. Make the font of the title (Histogram of …) smaller.
   3. Have a different word below the histogram than the one
 occurring in the title (right now if you choose X for your variable, it
 comes both above the histogram (in the phrase Histogram of X) and below
 it).

   Thanks for your time,
   Nima
 



Re: [R] queries

2007-04-22 Thread Stephen Tucker
My apologies. The second line should be
title(main="Histogram of ...",cex.main=0.5)

Actually I just realized you can also do 
hist(rnorm(100),xlab="Data",ylab="Count",cex.main=0.5)

...this way you don't have to call title() separately.
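
Putting the three requests together in one call, as a sketch (the seed, the variable name x and the label texts are placeholders I chose):

```r
set.seed(1)
x <- rnorm(100)

# ylab replaces "Frequency", main sets the title text, cex.main shrinks it,
# and xlab gives the label below the plot its own wording.
h <- hist(x, xlab = "Data", ylab = "Count",
          main = "Histogram of x", cex.main = 0.5)
```

hist() also returns the binned counts invisibly, so h can be inspected afterwards if needed.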


--- Stephen Tucker [EMAIL PROTECTED] wrote:

 hist(rnorm(100),xlab="Data",ylab="Count",main="")
 title(main="Histogram of ...",cex=0.5)
 
 see ?par for details on xlab, ylab, main, and cex arguments.
 You can call these from title() or include them in hist().
 I called title(main=..) separately to control its size separately
 from the rest of the text (axis and tick labels).
 
 
 
 --- Nima Tehrani [EMAIL PROTECTED] wrote:
 
  Dear Help Desk,
 
Is there any way to change some of the labels on R diagrams? 
 
Specifically in histograms, I would like to: 
 
1. change the word frequency to count. 
2. Make the font of the title (Histogram of …) smaller.
3. Have a different word below the histogram than the one
  occurring in the title (right now if you choose X for your variable, it
  comes both above the histogram (in the phrase Histogram of X) and below
  it).
 
Thanks for your time,
Nima
  
 


Re: [R] Opinion on R plots: connecting X and Y

2007-04-20 Thread Stephen Tucker
Edward Tufte seems to have some opinions on this topic.

In "The Visual Display of Quantitative Information" (Chapter 6: Data-Ink
Maximization and Graphical Design - Redesign of the Scatterplot), he
presents several alternatives:

(1) the non-data-bearing frame of conventional scatterplots (equivalent to R's
bty="l"), which he argues is the common but less informative method.

(2) removing a little ink from (1) so that the axes display the range of
the data (the range-frame).

(3) or, with a slight modification of (2), axes that mark the five-number
summary of the data (cf. fivenum()).

(4) or even dot-dash-plots, in which the marginal frequency distributions are
displayed as the axes using dots and dashes.

I don't know that bty="n" with xaxs,yaxs equal to "i" or "r" meets any of
these criteria (and bty="l" is apparently less informative than his other
suggestions)...
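
A rough base-graphics approximation of alternatives (2) and (3), offered as my own sketch rather than anything from Tufte's book: suppress the default axes and draw each axis only over the data range, with ticks at the five-number summary.

```r
set.seed(1)
x <- rnorm(100)
y <- rnorm(100)

ticks_x <- fivenum(x)   # min, lower hinge, median, upper hinge, max
ticks_y <- fivenum(y)

plot(x, y, axes = FALSE)
# axis() draws its line only between the first and last tick,
# so each axis spans exactly the range of the data.
axis(1, at = ticks_x, labels = round(ticks_x, 1))
axis(2, at = ticks_y, labels = round(ticks_y, 1))
```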


--- Inman, Brant A.   M.D. [EMAIL PROTECTED] wrote:

 
 Attention R users, especially those that are experienced enough to be
 opinionated, I need your input.
 
 Consider the following simple plot:
 
 x - rnorm(100)
 y - rnorm(100)
 plot(x, y, bty='n')
 
 A colleague (and dreaded SAS user) commented that she thought that my
 plots could be cleaned up by connecting the X and Y axes.  I know that
 I can do that with bty='l' but I don't want to, I find that the plots
 look less cluttered with disjoint axes.
 
 However, I was intrigued enough by her comments that I decided to
 solicit the opinions of others on this issue.  Are there principled
 reasons why one should prefer joined axes or disjoint axes?
 
 Brant Inman
 


Re: [R] Using mean if two values are identical

2007-04-19 Thread Stephen Tucker
## making data up
# make matrix with some equal values
> mat <- cbind(x=rnorm(10),y=rnorm(10),z=rnorm(10))
> mat[c(8,9),"y"] <- mat[c(1,7),"x"]
> mat
                x          y           z
 [1,]  0.26116849  0.5823529 -0.96924020
 [2,] -0.21415406  0.1085396  2.00542549
 [3,]  0.56890081 -1.2526322  0.08539552
 [4,] -1.09096693 -1.9369088  0.03079587
 [5,] -1.31749886 -1.1437411 -0.29125624
 [6,] -0.45941172  0.2997472  0.10329381
 [7,]  0.39586456 -0.2587432 -1.29788184
 [8,] -0.05066363  0.2611685 -0.47942195
 [9,] -0.87602919  0.3958646 -0.53205231
[10,]  0.30059621 -1.9531231  0.22398194
 

## find the index of y whose value matches each value of x and
## compute the mean. for each x, the following function searches
## for matching values of y and returns the value of x, the index
## of y, and the mean of the two corresponding z values
> t(apply(mat[,c("x","z")], MARGIN=1, FUN=function(v,w) {
+   yindex <- which(abs(v["x"]-w[,"y"]) < .Machine$double.eps^0.5)
+   if(length(yindex) > 0) {
+     c(xVal=v["x"],indexOfy=yindex,meanVal=mean(c(v["z"],w[yindex,"z"])))
+   } else {
+     c(xVal=v["x"],indexOfy=NA,meanVal=NA)
+   }
+ },w=mat[,c("y","z")]))
                x indexOfy   meanVal
 [1,]  0.26116849        8 -0.724331
 [2,] -0.21415406       NA        NA
 [3,]  0.56890081       NA        NA
 [4,] -1.09096693       NA        NA
 [5,] -1.31749886       NA        NA
 [6,] -0.45941172       NA        NA
 [7,]  0.39586456        9 -0.914967
 [8,] -0.05066363       NA        NA
 [9,] -0.87602919       NA        NA
[10,]  0.30059621       NA        NA
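
If the goal is instead the one stated in the question (the mean of z wherever both x AND y repeat), a minimal sketch along the following lines may be closer. It assumes the data are held in a data frame; the names dat and zmean are made up for illustration, and only six rows of dat.dat are reproduced. aggregate() is from the stats package.

```r
# A few rows taken from the dat.dat listing in the question
dat <- data.frame(
  x = c(29, 29, 30, 29, 29, 30),
  y = c(4.5, 4.6, 0.1, 4.5, 4.6, 0.1),
  z = c(1.505713, 1.580402, 0.00096108, 1.495148, 1.568961, 0.00093699)
)

# One row per distinct (x, y) pair, with z replaced by the group mean
zmean <- aggregate(z ~ x + y, data = dat, FUN = mean)
zmean
```

This groups on exact equality of x and y, which is reasonable when the coordinates are read from a text file; coordinates that come out of floating-point arithmetic may need rounding first.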

Hope this helps.

--- Felix Wave [EMAIL PROTECTED] wrote:

 Hello,
 I have got a question.
 I've got a matrix (at the end of this mail) with the colnames x, y, z.
 This matrix contains different measurements. x and y are rising coordinates.
 
 My question: whenever the x AND y coordinates are the same, I want to
 get the mean of their z values.
 
 e.g.
 x AND y in line 1 and line 8 are identical:
 29 4.5 -- mean of 1.505713 and 1.495148
 
 
 Thanks a lot.
 Felix
 
 
 
 
 ###
 ## My R Code ##
 ###
 INPUT   <- readLines("dat.dat")
 INPUT   <- gsub("^ ", "", INPUT)
 INPUT   <- t( sapply( strsplit(INPUT, split=" "), as.numeric ) )
 colnames(INPUT) <- c("x", "y", "z" )
 
 
 # HERE STARTS my PROBLEM #
 if (duplicated(INPUT[,1]  INPUT[,2] ))
   zMEAN   <- mean(INPUT[,3] )
 
 # MATRIX with the mean-z values #
 zMATRIX <- matrix(c(INPUT[,1], INPUT[,2], INPUT[,3] ), ncol=3, byrow=FALSE)
 
 
 
 
 #
 ## dat.dat ##
 #
 29 4.5 1.505713
 29 4.6 1.580402
 29 4.7 1.656875
 29 4.8 1.735054
 30 0 0
 30 0.1 0.00096108
 30 0.2 0.00323831
 29 4.5 1.495148
 29 4.6 1.568961
 29 4.7 1.644467
 30 0 0
 30 0.1 0.00093699
 30 0.2 0.00319411
 30 0.3 0.00676619
 


Re: [R] Error with strptime

2007-04-19 Thread Stephen Tucker
you have to use the POSIXct class to include date-time objects in data
frames. strptime() returns an object of class POSIXlt. when you do the
cbind(), it automatically converts test2 into POSIXct.

you probably want
bsamp$spltime<-as.POSIXct(strptime(test,format="%d-%B-%y %H:%M"))

(but please be aware of time-zone issues when using POSIXct classes). these
documents may help:

http://cran.r-project.org/doc/Rnews/Rnews_2004-1.pdf

=== Also ===
Changes in behavior of POSIXt classes since the aforementioned R News
publication:

http://tolstoy.newcastle.edu.au/R/e2/help/07/04/13626.html
http://tolstoy.newcastle.edu.au/R/e2/help/07/04/13632.html
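
The conversion described above can be sketched as follows. This is only an illustration under a few assumptions: the three date strings and the one-column data frame are made up, the month is given numerically (%m) because %B depends on the LC_TIME locale, and the time zone is fixed to UTC so the result does not depend on the machine.

```r
# Hypothetical date-time strings in day-month-year hour:minute form
test <- c("01-03-06 11:40", "15-03-06 11:30", "29-03-06 12:50")

# strptime() returns a POSIXlt object (a list of date-time components)
lt <- strptime(test, format = "%d-%m-%y %H:%M", tz = "UTC")
class(lt)

# Convert to POSIXct (a plain numeric vector underneath) before
# storing it as a data frame column
ct <- as.POSIXct(lt)

bsamp <- data.frame(id = 1:3)   # stand-in for the real bsamp
bsamp$spltime <- ct
str(bsamp)
```

Storing the POSIXct version keeps one value per row, which is what a data frame column expects.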



--- Jean-Louis Abitbol [EMAIL PROTECTED] wrote:

 Dear All,
 
 I am trying to convert to POSIXct after pasting a date and a time in
 character format with strptime.
 
 It is probably obvious but I don't understand why I get an error message
 after
 
 bsamp$spltime<-strptime(test,format="%d-%B-%y %H:%M")
 
 whereas I can get what I want if I do it in 2 steps with cbind?
 
 Thanks and best regards, Jean-Louis
 
 This is the R console output
 
  bsamp<-read.table("bsampl2.csv",header=T,sep=";")
  names(bsamp)<-tolower(names(bsamp))
  bsamp<-upData(bsamp,drop=c("study"))
 Input object size:   23896 bytes; 15 variables
 Dropped variable study
 New object size: 23016 bytes; 14 variables
  bsamp$visitdat<-as.character(bsamp$visitdat)
  bsamp$samtime<-as.character(bsamp$samtime)
  bsamp$admtime<-as.character(bsamp$admtime)
  bsamp$delay<-as.character(bsamp$delay)
  test<-paste(bsamp$visitdat,bsamp$samtime)
  test
   [1] "01-mars-06 11:40" "15-mars-06 11:30" "15-mars-06 15:00"
   [4] "29-mars-06 11:40" "01-mars-06 11:45" "15-mars-06 11:15"
   [7] "15-mars-06 14:45" "29-mars-06 12:50" "01-mars-06 11:16"
  [10] "15-mars-06 11:10" "15-mars-06 14:30" "29-mars-06 11:50"
  [13] "01-mars-06 11:50" "15-mars-06 11:25" "15-mars-06 14:55"
  [16] "29-mars-06 11:30" "01-mars-06 11:55" "15-mars-06 11:35"
  [19] "15-mars-06 "      "29-mars-06 11:45" "01-mars-06 11:09"
  .
 
 
  bsamp$spltime<-strptime(test,format="%d-%B-%y %H:%M")
 Error in `$<-.data.frame`(`*tmp*`, "spltime", value = list(sec =
 numeric(0),  :
 replacement has 9 rows, data has 140
 
 
  test2<-strptime(test,format="%d-%B-%y %H:%M")
  bsamp<-cbind(bsamp,test2)
  bsamp$test2
   [1] 2006-03-01 11:40:00 Centre de l'Europe
   [2] 2006-03-15 11:30:00 Centre de l'Europe
   [3] 2006-03-15 15:00:00 Centre de l'Europe
   [4] 2006-03-29 11:40:00 Centre de l'Europe (heure d'été
   [5] 2006-03-01 11:45:00 Centre de l'Europe
   [6] 2006-03-15 11:15:00 Centre de l'Europe
   [7] 2006-03-15 14:45:00 Centre de l'Europe
   [8] 2006-03-29 12:50:00 Centre de l'Europe (heure d'été
   [9] 2006-03-01 11:16:00 Centre de l'Europe
  [10] 2006-03-15 11:10:00 Centre de l'Europe
  [11] 2006-03-15 14:30:00 Centre de l'Europe
  [12] 2006-03-29 11:50:00 Centre de l'Europe (heure d'été
  [13] 2006-03-01 11:50:00 Centre de l'Europe
  [14] 2006-03-15 11:25:00 Centre de l'Europe
  [15] 2006-03-15 14:55:00 Centre de l'Europe
  [16] 2006-03-29 11:30:00 Centre de l'Europe (heure d'été
  [17] 2006-03-01 11:55:00 Centre de l'Europe
  [18] 2006-03-15 11:35:00 Centre de l'Europe
  [19] NA
  ..
  sessionInfo()
 R version 2.4.1 (2006-12-18)
 i386-pc-mingw32
 
 locale:

LC_COLLATE=French_France.1252;LC_CTYPE=French_France.1252;LC_MONETARY=French_France.1252;LC_NUMERIC=C;LC_TIME=French_France.1252
 
 attached base packages:
 [1] stats graphics  grDevices utils datasets
 [6] methods   base
 
 other attached packages:
          car RColorBrewer       gplots        gdata       gtools
        1.2-1        0.2-3        2.3.2        2.3.1        2.3.1
      lattice        Hmisc      acepack      RWinEdt
      0.14-17        3.3-1      1.3-2.2        1.7-5
 
 


Re: [R] Character coerced to factor and I cannot get it back

2007-04-19 Thread Stephen Tucker
You can also set this option globally with options(stringsAsFactors = FALSE)

I believe this was added in R 2.4.0.
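
A small sketch of the two routes mentioned in this thread: keeping strings as character when the data frame is built, and converting an existing factor back with as.character(). The data frame and column names are made up for illustration. stringsAsFactors is passed explicitly because the default differs between R versions (TRUE before R 4.0.0, FALSE from 4.0.0 on).

```r
st <- c("1", "12", "4", "45")

# With stringsAsFactors = TRUE the character column becomes a factor
dd_factor <- data.frame(aa = 1:4, st = st, stringsAsFactors = TRUE)
class(dd_factor$st)

# With stringsAsFactors = FALSE it stays character
dd_char <- data.frame(aa = 1:4, st = st, stringsAsFactors = FALSE)
class(dd_char$st)

# An existing factor column can be converted back to character
dd_factor$st <- as.character(dd_factor$st)
class(dd_factor$st)
```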



--- Gabor Grothendieck [EMAIL PROTECTED] wrote:

 Try this:
 
 DF <- data.frame(let = letters[1:3], num = 1:3, stringsAsFactors = FALSE)
 str(DF)
 
 
 On 4/19/07, John Kane [EMAIL PROTECTED] wrote:
 
  --- Tyler Smith [EMAIL PROTECTED] wrote:
 
   I really need to sit down with the manual and sort
   factors and classes
   properly. In your case, I think the problem has
   something to do with
   the way a list behaves?  I'm not sure, but if you
   convert your list to
   a dataframe it seems to work ok:
  
   dd3 <- as.data.frame(dd1)
   typeof(dd3$st)
   [1] "integer"
   class(dd3$st)
   [1] "factor"
   dd3$st <- as.character(dd3$st)
   typeof(dd3$st)
   [1] "character"
   class(dd3$st)
   [1] "character"
  
   HTH,
  
   Tyler
 
  Seems to work nicely. I had forgotten about
  'as.data.frame'.
 
  I originally thought that it might be a list problem
  too but I don't think so. I set up the example as a
  list since that is the way my real data is being
  imported from csv. However after my original posting I
  went back and tried it with just a dataframe and I'm
  getting the same results. See below.
 
  I even shut down R , reloaded it and detached the two
  extra packages I usually load. Everything is working
  fine but I am doing some things with factors that I
  have never done before and this just makes me a bit
  paranoid.
 
  Thanks very much for the help.
 
 
  EXAMPLE
  dd  <- data.frame(aa <- 1:4, bb <-  letters[1:4],
  cc <- c(12345, 123456, 45678, 456789))
 
  id  <-  as.character(dd[,3]) ; id
 
  st  <- substring(id, 1,nchar(id)-4 ) ; st
  typeof (st)  ; class(st)
 
  dd1  <-  cbind(dd, st)
  names(dd1)  <- c("aa","bb","cc","st")
  dd1
  typeof(dd1$st); class(dd1$st)
 
  dd2  <-  cbind(dd, as.character(st))
  names(dd2)  <- c("aa","bb","cc","st")
  dd2
  typeof(dd2$st) ;   class(dd2$st)
 


Re: [R] importing excel-file

2007-04-18 Thread Stephen Tucker
I use gdata and it works quite well for me. It's as easy as

install.packages("gdata")
library(gdata)
data = read.xls("mydata.xls",sheet=1)

[read.xls() can take other arguments]

It requires concurrent installation of Perl, but installing Perl is also
simple. For Windows, you can get it here:
http://www.activestate.com/Products/ActivePerl/


--- Gabor Csardi [EMAIL PROTECTED] wrote:

 There is also a read.xls command in package gdata; it seems that it uses
 a perl script called 'xls2csv'. I have no idea how good this is, as I have
 never tried it.
 
 Btw, xlsReadWrite is Windows-only, so you can use it only if 
 you use windows.
 
 Gabor
 
 ps. Corinna, to be honest, I've no idea what kind of online help you've
 read, there is plenty. Next time try to be more specific please. 
 
 On Wed, Apr 18, 2007 at 03:07:51PM -0200, Alberto Monteiro wrote:
  Corinna Schmitt wrote:
   
   It is quite a stupid question but please help me. I am very 
   confused. I am able to import normal txt and mat-files to R but 
   unable to import an .xls-file
   
  I've tried two ways to import excel files, but none of them
  seems perfect.
 [...]
 
 -- 
 Csardi Gabor [EMAIL PROTECTED]MTA RMKI, ELTE TTK
 


Re: [R] Data Manipulation using R

2007-04-18 Thread Stephen Tucker
...is this what you're looking for?

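donedat <- subset(data,ID < 6000 | ID >= 7000)
findat <- donedat[-unique(rapply(donedat,function(x)
       which( x < 0 ))),,drop=FALSE]

the second line looks through each column, and finds the indices of negative
values - rapply() returns all of them as a vector; unique() removes
duplicated elements, and with negative indexing you remove these rows from
donedat. note that it checks every column, not only columns 3, 4, 6 and 8,
and that it only works when at least one negative value is present (indexing
with an empty negative vector returns no rows).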
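
A variant restricted to columns 3, 4, 6 and 8, using logical rather than negative indexing so that it also behaves when no negative values are present, might look like the sketch below. The small data frame dat and its column names are invented stand-ins for the real 17-column data; check_cols and has_negative are likewise illustrative names.

```r
# Toy data: eight columns, ID in the first one
dat <- data.frame(
  ID = c(1000, 5999, 6000, 6999, 7000, 9000),
  v2 = c(-1, 2, 3, 4, 5, 6),    # negative here is ignored (column 2)
  v3 = c(1, -4, 3, 4, 5, 6),    # negative in column 3 drops the row
  v4 = 1:6,
  v5 = 1:6,
  v6 = 1:6,
  v7 = 1:6,
  v8 = c(1, 2, 3, 4, 5, -6)     # negative in column 8 drops the row
)

# 1) drop IDs from 6000 through 6999
donedat <- dat[dat$ID < 6000 | dat$ID >= 7000, , drop = FALSE]

# 2) drop rows with a negative value in any of the checked columns
check_cols <- c(3, 4, 6, 8)
has_negative <- rowSums(donedat[, check_cols] < 0) > 0
findat <- donedat[!has_negative, , drop = FALSE]
findat
```

If the checked columns can contain NA, rowSums() would need na.rm = TRUE (or the NA rows handled separately), depending on whether such rows should be kept.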

--- Anup Nandialath [EMAIL PROTECTED] wrote:

 Dear Friends,
 
 I have data set with around 220,000 rows and 17 columns. One of the columns
 is an id variable which is grouped from 1000 through 9000. I need to
 perform the following operations. 
 
 1) Remove all the observations with id's between 6000 and 6999
 
 I tried using this method. 
 
 remdat1 <- subset(data, ID<6000)
 remdat2 <- subset(data, ID>=7000)
 donedat <- rbind(remdat1, remdat2)
 
 I check the last and first entry and found that it did not have ID values
 6000. Therefore I think that this might be correct, but is this the most
 efficient way of doing this?
 
 2) I need to remove observations within columns 3, 4, 6 and 8 when they are
 negative. For instance if the number in column 3 is -4, then I need to
 delete the entire observation. Can somebody help me with this too.
 
 Thanks and Regards
 
 Anup
 

 

