Re: [R] plot legend: combining filled boxes and lines

2007-09-10 Thread Gabor Grothendieck
Check out:
http://tolstoy.newcastle.edu.au/R/e2/help/07/05/16777.html
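
For the legend question quoted below, one possible approach is sketched here. It is not necessarily what the linked post proposes, and it assumes a version of R in which legend() accepts a vector for its border argument. The idea is to give the line entries fill = NA and border = NA so no box is drawn for them, and to give the box entries lty = NA so no line is drawn for them.

```r
set.seed(1)  # only so the sketch is reproducible
x1 <- rnorm(100)
x2 <- rnorm(100, 2)

hist(x1, main = "", col = "orange", ylab = "density", xlab = "x",
     freq = FALSE, density = 55, xlim = c(-2, 5), ylim = c(0, 0.5))
hist(x2, col = "green", density = 45, freq = FALSE, add = TRUE)
abline(v = mean(x1), col = "orange", lty = 2, lwd = 2.5)
abline(v = mean(x2), col = "green", lty = 2, lwd = 2.5)

# Entries 1-2 are boxes only, entries 3-4 are lines only.
leg <- legend(3, 0.45,
              legend = c("x1", "x2", "mean(x1)", "mean(x2)"),
              fill   = c("orange", "green", NA, NA),
              border = c("black", "black", NA, NA),
              lty    = c(NA, NA, 2, 2),
              lwd    = 2.5,
              col    = c(NA, NA, "orange", "green"))
```

The line samples are drawn in their own column to the right of the box column, so the two kinds of key do not overlap.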

On 9/10/07, Lauri Nikkinen [EMAIL PROTECTED] wrote:
 Hello,

 I have difficulties combining boxes and lines in plot legend. I
 searched previous R-posts and found this (with no solution):
 http://tolstoy.newcastle.edu.au/R/help/06/07/30248.html. Is there a
 way to avoid boxes behind the line legends?

 x1 <- rnorm(100)
 x2 <- rnorm(100, 2)
 hist(x1, main = "", col = "orange", ylab = "density", xlab = "x",
 freq = F, density = 55,  xlim = c(-2, 5), ylim = c(0, 0.5))
 par(new = T)
 hist(x2, main = "", col = "green", ylab = "", xlab = "", axes = F,
 xlim = c(-2, 5), ylim = c(0, 0.5), density = 45, freq = F)

 abline(v = mean(x1), col = "orange", lty = 2, lwd = 2.5)
 abline(v = mean(x2), col = "green", lty = 2, lwd = 2.5)
 legend(3, 0.45, legend = c("x1", "x2", "mean(x1)", "mean(x2)"),
 col = c("orange", "green"), fill = c("orange", "green", 0, 0),
 lty = c(0, 0, 2, 2), merge = T)

 Thanks
 Lauri

 __
 R-help@stat.math.ethz.ch mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.




Re: [R] off-topic: better OS for statistical computing

2007-09-10 Thread Gabor Grothendieck
You want whatever all the people you are working with are using
to make it as easy as possible to work together with them.

On 9/10/07, Wensui Liu [EMAIL PROTECTED] wrote:
 Good morning, everyone,
 I am sorry for this off-topic post but I think I can get a great answer
 from this list.
 My question is what is the best OS on PC (laptop) for statistical
 computing and why.
 I really appreciate your insight.
 Have a nice day.



Re: [R] off-topic: better OS for statistical computing

2007-09-10 Thread Gabor Grothendieck
My sense is that R users are evenly split between UNIX and Windows
users so either will do in terms of the larger community.

Some R packages may not be available on every platform or will
be available on one platform before another or there will be
certain platform-specific issues.  So in the end it's easiest to
have the same thing as everyone else you work with.

Also if you run into problems then you can ask others, whereas if
you are the lone person with something different you have no one
to turn to.

Also associated software may be, for example, Microsoft Office in
a Microsoft environment and LaTeX in a UNIX environment. And
networking will be simplified in a consistent environment too.
Certainly there is Open Office, Samba and putty but the easiest
is just not to have to worry about getting everything to work
together by just having the same thing in the first place.

Neither Linux nor Windows is superior to the other.  People
making such representations generally know one much better
than the other and it's more a reflection of their own experience
than anything else.  I personally have used both UNIX and
Windows since their inception and find that I tend to have a
slight preference for whatever I used last.  Technical merits of
one vs. the other are basically irrelevant for most purposes.

On 9/10/07, Patrick Connolly [EMAIL PROTECTED] wrote:
 On Mon, 10-Sep-2007 at 12:26PM -0400, Gabor Grothendieck wrote:

 | You want whatever all the people you are working with are using
 | to make it as easy as possible to work together with them.

 Assuming you're using R, there is negligible difficulty using a
 different OS from what your colleagues use (apart from the
 inconsistencies you get between different versions of Windows, but
 even that has little effect on R).  The standard .RData binary files
 work with Windows and Linux (and probably OS X).

 The only issue I come across is that Linux can't create WMF files as
 readily as Windows can, and that is more than made up for by the
 greater flexibility that Linux offers.  It's easier in Linux to
 produce Excel files from dataframes and matrices using a perl script
 posted to this list by Marc Schwartz.  Thanks again Marc.

 Best

 Patrick


 |
 | On 9/10/07, Wensui Liu [EMAIL PROTECTED] wrote:
 |  Good morning, everyone,
 |  I am sorry for this off-topic post but I think I can get a great answer
 |  from this list.
 |  My question is what is the best OS on PC (laptop) for statistical
 |  computing and why.
 |  I really appreciate your insight.
 |  Have a nice day.
 |

 --
 ~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.
   ___Patrick Connolly
  {~._.~} Great minds discuss ideas
  _( Y )_Middle minds discuss events
 (:_~*~_:)Small minds discuss people
  (_)-(_)   . Anon

 ~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.




Re: [R] finding the minimum positive value of some data

2007-09-10 Thread Gabor Grothendieck
Here are some solutions, each of which:
1. is a single line,
2. uses x only once, so you can just plug in a complex expression
3. leaves no temporary variables behind

min(sapply(x, function(z) if (z > 0) z else Inf))

(function(z) min(ifelse(z > 0, z, Inf))) (x)

with(list(z = x), min(z[z > 0]))

local({ z <- x; min(z[z > 0]) })
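
As a quick check of the idea, here is a small sketch that plugs the expression from the question, diff(sort(x)), into one of the forms above. The names d, min_pos and min_pos_filter are illustrative only; the Filter() variant is an additional base R alternative, not one of the solutions listed in the message.

```r
x <- c(1, 0, 1, 2, 3, 3, 4, 5, 5, 5, 6, 7, 8, 8, 9, 9, 10, 10)

# Two-step version, for reference
d <- diff(sort(x))
min(d[d > 0])

# One line, with the complex expression appearing only once
min_pos <- (function(z) min(z[z > 0]))(diff(sort(x)))

# Alternative using Filter() from base R
min_pos_filter <- min(Filter(function(z) z > 0, diff(sort(x))))
```

If the data contain no positive values at all, min() is called on an empty vector and returns Inf with a warning, so that case may need separate handling.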

On 9/10/07, dxc13 [EMAIL PROTECTED] wrote:

 useRs,

 I am looking to find the minimum positive value of some data I have.
 Currently, I am able to find the minimum of data after I apply some other
 functions to it:

 > x
  [1]  1  0  1  2  3  3  4  5  5  5  6  7  8  8  9  9 10 10

 > sort(x)
  [1]  0  1  1  2  3  3  4  5  5  5  6  7  8  8  9  9 10 10

 > diff(sort(x))
  [1] 1 0 1 1 0 1 1 0 0 1 1 1 0 1 0 1 0

 > min(diff(sort(x)))
 [1] 0

 The minimum is given as zero, which is clearly true, but I am interested in
 only the positive minimum, which is 1.  Can I find this by using only 1 line
 of code, like I have above? Thanks!

 dxc13
 --
 View this message in context: 
 http://www.nabble.com/finding-the-minimum-positive-value-of-some-data-tf4417250.html#a12599319
 Sent from the R help mailing list archive at Nabble.com.





Re: [R] what am I missing

2007-09-10 Thread Gabor Grothendieck
It's a FAQ:

http://hermes.sdu.dk/Rdoc/faq.html#Why%20does%20outer()%20behave%20strangely%20with%20my%20function%3f
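
The FAQ entry explains that outer() calls the function once with two long vectors rather than once per pair of elements, so a function that only handles scalars fails or gives wrong results. A minimal sketch of one common workaround, wrapping the scalar function in Vectorize() from base R, using the fv from the question quoted below (the local variable is renamed tr here purely for readability):

```r
a <- matrix(c(1, 2, 2, 1), 2, 2)
b <- matrix(c(2, 1, 1, 2), 2, 2)

# Scalar-only function: smallest eigenvalue of x*a + y*b
fv <- function(x, y) {
  m <- x * a + y * b
  tr <- m[1, 1] + m[2, 2]
  d <- m[1, 1] * m[2, 2] - m[1, 2]^2
  (tr - sqrt(tr^2 - 4 * d)) / 2
}

x <- seq(-1, 1, length.out = 10)
y <- seq(-1, 1, length.out = 10)

# Vectorize() makes fv loop over element pairs, which is what outer() needs
z_outer <- outer(x, y, Vectorize(fv))

# Reference result from the explicit double loop
z_loop <- matrix(0, 10, 10)
for (i in 1:10) for (j in 1:10) z_loop[i, j] <- fv(x[i], y[j])
```

Vectorize() is a convenience wrapper around mapply(), so it is no faster than the loop; rewriting the function in vectorized form, as gv does, is the efficient route.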

On 9/10/07, Jan de Leeuw [EMAIL PROTECTED] wrote:
 x<-seq(-1,1,length=10)
 y<-seq(-1,1,length=10)
 a<-matrix(c(1,2,2,1),2,2)
 b<-matrix(c(2,1,1,2),2,2)

 fv<-function(x,y) {
    m<-x*a+y*b
    t<-m[1,1]+m[2,2]; d<-m[1,1]*m[2,2]-m[1,2]^2
    return((t-sqrt(t^2-4*d))/2)
 }

 gv<-function(x,y) {
    t<-x*(a[1,1]+a[2,2])+y*(b[1,1]+b[2,2])
    d<-(x*a[1,1]+y*b[1,1])*(x*a[2,2]+y*b[2,2])-(x*a[1,2]+y*b[1,2])^2
    return((t-sqrt(t^2-4*d))/2)
 }


 now outer(x,y,gv) works as expected, outer(x,y,fv) bombs. But

 z<-matrix(0,10,10); for (i in 1:10) for (j in 1:10) z[i,j]<-fv(x[i],y[j])

 works fine. Must be something in outer().

 ==
 Jan de Leeuw, 11667 Steinhoff Rd, Frazier Park, CA 93225, 661-245-1725
 .mac: jdeleeuw ++  aim: deleeuwjan ++ skype: j_deleeuw
 homepages: http://www.cuddyvalley.org and http://gifi.stat.ucla.edu
 ==
A bath when you're born,
   a bath when you die,
how stupid.  (Issa 1763-1827)









Re: [R] SQL like function?

2007-09-08 Thread Gabor Grothendieck
Great.  Regarding the web, note that there are actually quite a few R
web projects as well:

http://www.lmbe.seu.edu.cn/CRAN/doc/FAQ/R-FAQ.html#R-Web-Interfaces

I have used rpad (www.rpad.org) which has an integrated web server right
in the R package making setup a non-issue.

On 9/8/07, Takatsugu Kobayashi [EMAIL PROTECTED] wrote:
 Hi Gabor,

 Wow, this is awesome although I eventually should learn MySQL for
 integrating it on web-based DB management using PHP or Perl, this is a
 very helpful tool for me to start with!

 Thank you very much

 Gabor Grothendieck wrote:
  Others have already pointed out %in% but regarding your comment about
  SQL, you can use SQL to manipulate R data frames using the sqldf package
  which provides an interface to lower level RSQLite (and RMySQL in the 
  future)
  routines.  The following examples use SQLite underneath:
 
  DF <- data.frame(observation = c(1,2,3,4,5))
  ID <- data.frame(ID = c(1, 3, 4))
 
  library(sqldf)
  sqldf("select observation, observation in (select * from ID) `ID?` from DF")
 
  # or
 
  sqldf("select observation, observation in (1, 3, 4) `ID?` from DF")
 
  See home page at:
 
  http://sqldf.googlecode.com
 
 
  On 9/7/07, Takatsugu Kobayashi [EMAIL PROTECTED] wrote:
 
  Hi RUsers,
 
  I am wondering if I can search for observations whose IDs match any of the
  values in another vector, as in MySQL. While I am learning MySQL for
  future database management, I would appreciate it if anyone could give me a hint.
 
  Suppose I have one 5*1 vector containing observation IDs and
  frequencies, and one 3*1 vector containing observation IDs.
 
  observation<-c(1,2,3,4,5)
  ID<-c(1,3,4)
 
  Then, I would like to program a code that returns a results showing
  matched observations like
 
  result: TRUE FALSE TRUE TRUE FALSE
 
  I am reading S programming, but I cannot find a way to do this.
 
  Thank you very much.
 
  Taka
 
 
 





Re: [R] Lisp-like primitives in R

2007-09-08 Thread Gabor Grothendieck
On 9/8/07, Peter Dalgaard [EMAIL PROTECTED] wrote:
 François Pinard wrote:
  [Roland Rau]
 
  [François Pinard]
 
 
 
  I wonder what happened, for R to hide the underlying Scheme so fully,
  at least at the level of the surface language (despite there are
  hints).
 
 
 
  To further foster portability, we chose to write R in ANSI C
 
 
  Yes, of course.  Scheme is also (often) implemented in C.  I meant that
  R might have implemented a Scheme engine (or part of a Scheme engine,
  extended with appropriate data types) with a surface language (nearly
  the S language) which is purposely not Scheme, but could have been.
 
  If the gap is not extreme, one could dare dreaming that the Scheme
  engine in R be completed, and Scheme offered as an alternate extension
  language.  If you allow me to continue dreaming awake -- they told me
  they will let me free as long as I do not get dangerous! :-) -- part
  of the interest lies in the fact there are excellent Scheme compilers.
  If we could only find or devise some kind of marriage between a mature
  Scheme and R, so to speed up the non-vectorisable parts of R scripts...
 
 
 Well, depending on what you want, this is either trivial or
 impossible... The internal storage of R is still pretty much equivalent
 to scheme. E.g. try this:

  > r2scheme <- function(e) if (!is.recursive(e))
  deparse(e) else c("(", unlist(lapply(as.list(e), r2scheme)), ")")
  > paste(r2scheme(quote(for(i in 1:4)print(i))), collapse=" ")
 [1] "( for i ( : 1 4 ) ( print i ) )"


Also see showTree in codetools:

> library(codetools)
> showTree(quote(for(i in 1:4)print(i)))
(for i (: 1 4) (print i))



Re: [R] SQL like function?

2007-09-07 Thread Gabor Grothendieck
Others have already pointed out %in% but regarding your comment about
SQL, you can use SQL to manipulate R data frames using the sqldf package
which provides an interface to lower level RSQLite (and RMySQL in the future)
routines.  The following examples use SQLite underneath:

DF <- data.frame(observation = c(1,2,3,4,5))
ID <- data.frame(ID = c(1, 3, 4))

library(sqldf)
sqldf("select observation, observation in (select * from ID) `ID?` from DF")

# or

sqldf("select observation, observation in (1, 3, 4) `ID?` from DF")

See home page at:

http://sqldf.googlecode.com
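
For reference, the %in% approach that others pointed out needs only base R. A minimal sketch using the vectors from the question quoted below (the names result and matched are illustrative):

```r
observation <- c(1, 2, 3, 4, 5)
ID <- c(1, 3, 4)

# TRUE where an observation ID appears anywhere in ID
result <- observation %in% ID
result
# [1]  TRUE FALSE  TRUE  TRUE FALSE

# The matching observations themselves
matched <- observation[result]
```

This gives the logical vector asked for in the question; the SQL route above becomes more attractive once the matching involves several columns or joins.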


On 9/7/07, Takatsugu Kobayashi [EMAIL PROTECTED] wrote:
 Hi RUsers,

 I am wondering if I can search for observations whose IDs match any of the
 values in another vector, as in MySQL. While I am learning MySQL for
 future database management, I would appreciate it if anyone could give me a hint.

 Suppose I have one 5*1 vector containing observation IDs and
 frequencies, and one 3*1 vector containing observation IDs.

 observation<-c(1,2,3,4,5)
 ID<-c(1,3,4)

 Then, I would like to program a code that returns a results showing
 matched observations like

 result: TRUE FALSE TRUE TRUE FALSE

 I am reading S programming, but I cannot find a way to do this.

 Thank you very much.

 Taka





Re: [R] help on replacing values

2007-09-07 Thread Gabor Grothendieck
Your columns are factors, not character strings.  Use as.is = TRUE as
an argument to read.table.   Also it's a bit dangerous to use T although
not wrong.  It's safer to use TRUE.
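
A minimal sketch of that advice, assuming a small made-up table (txt below is hypothetical data, not the poster's file). With as.is = TRUE the columns stay character, so replace() works on one column and a logical index works on all columns at once. The last lines show a way to rename a level if the column is kept as a factor.

```r
# Hypothetical stand-in for the contents of table.txt
txt <- "X1 X2 X3 X4
AB AC AB AC
AB AA AA AB
AA AB AA AB
AA AA AB AC"

dat <- read.table(text = txt, header = TRUE, as.is = TRUE)

# One column: works because X2 is character, not factor
x2_fixed <- replace(dat$X2, dat$X2 == "AA", "BB")

# All columns at once, however many there are
all_fixed <- dat
all_fixed[all_fixed == "AA"] <- "BB"

# If a column must stay a factor, rename the level instead
f <- factor(c("AA", "AB", "AA"))
levels(f)[levels(f) == "AA"] <- "BB"
```

In R 4.0.0 and later, read.table() no longer converts strings to factors by default, so the warning in the question mainly affects older versions or code that sets stringsAsFactors = TRUE.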

On 9/7/07, [EMAIL PROTECTED] [EMAIL PROTECTED] wrote:
 Dear List,

 I have a newbie question. I have read in a data.frame as follows:

 > data = read.table("table.txt", header = T)
 > data
  X1 X2 X3 X4
 A AB AC AB AC
 B AB AC AA AB
 C AA AB AA AB
 D AA AB AB AC
 E AB AA AA AB
 F AB AA AB AC
 B AB AC AB AA

 I would like to replace AA values by BB in column X2. I have tried
 using replace() with no success, although I am not sure this is the
 right function. This is the code I have used:

 > data$X2 <- replace(data$X2, data$X2 == "AA", "BB")
 Warning message:
 invalid factor level, NAs generated in: `[<-.factor`(`*tmp*`, list,
 value = "BB")

 What is wrong with the code? How can I get this done? how about
 changing AA values by BB in all 4 columns simultaneously? Actually
 this is a small example dataframe, the real one would have about 1000
 columns.

 Extending this, I found a similar thread dated July 2006 that used
 replace() on the iris dataset, but when I tried reproducing it I got
 the same warning message:

 > iris$Species <- replace(iris$Species, iris$Species
 == "setosa", "NewName")
 Warning message:
 invalid factor level, NAs generated in: `[<-.factor`(`*tmp*`, list,
 value = "NewName")

 Thanks in advance your help,

 David





Re: [R] variable format

2007-09-07 Thread Gabor Grothendieck
A matrix is for situations where every element is of the same class
but your columns have different classes so use a data frame:

DF <- data.frame(a = 11:15, b = letters[1:5], stringsAsFactors = FALSE)
subset(DF, a %in% 11:13)
subset(DF, a %in% c(0, 11:13)) # same

Suggest you review the Introduction to R manual and look at ?data.frame,
?subset and ?"%in%"
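
Applied to the division codes in the question quoted below, a lookup held in a data frame might look like the following sketch. The column names code and name are illustrative. Indexing the labels by match() yields NA for codes without a format (such as 0), and extra rows in the lookup table are simply unused.

```r
divisionTable <- data.frame(
  code = 1:5,
  name = c("New England", "Middle Atlantic", "East North Central",
           "West North Central", "South Atlantic"),
  stringsAsFactors = FALSE
)

divisionOld <- c(0, 1, 2, 3, 4, 5)   # 0 has no format

# match() gives the row of each code, or NA when the code is absent
divisionNew <- divisionTable$name[match(divisionOld, divisionTable$code)]
```

Here the lookup result is indexed by the match, rather than assigning into matched positions as in the question, which is why neither the missing code nor extra table rows cause trouble.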

On 9/4/07, Cory Nissen [EMAIL PROTECTED] wrote:
 Okay, I want to do something similar to SAS proc format.

 I usually do this...

 a <- NULL
 a$divisionOld <- c(1,2,3,4,5)
 divisionTable <- matrix(c(1, "New England",
                           2, "Middle Atlantic",
                           3, "East North Central",
                           4, "West North Central",
                           5, "South Atlantic"),
                         ncol=2, byrow=T)
 a$divisionNew[match(a$divisionOld, divisionTable[,1])] <- divisionTable[,2]

 But how do I handle the case where...
 a$divisionOld <- c(0,1,2,3,4,5)   #no format available for 0, this throws an error.
 OR
 divisionTable <- matrix(c(1, "New England",
                           2, "Middle Atlantic",
                           3, "East North Central",
                           4, "West North Central",
                           5, "South Atlantic",
                           6, "East South Central",
                           7, "West South Central",
                           8, "Mountain",
                           9, "Pacific"),
                         ncol=2, byrow=T)
 There are extra formats available... this throws a warning.

 Thanks

 Cory






Re: [R] ploting missing data

2007-09-07 Thread Gabor Grothendieck
Try this:

library(zoo)
plot(na.approx(zoo(as.matrix(data[-1]), data[,1])), plot.type = "single")

See ?na.approx, ?plot.zoo, ?xyplot.zoo and vignette("zoo")
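
If zoo is not available, a base R sketch of the same idea is shown below for one column of the data from the question. It is an alternative, not the method proposed above. Dropping the NA points before plotting connects the neighbouring points directly, and approx() can fill in values at just the missing positions. The names ok and filled are illustrative.

```r
sw <- c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 15)
zwanzig <- c(61.42, NA, 26.60, 23.28, NA, 24.90, 24.47, 24.53,
             26.41, 28.26, NA, 29.80, 35.49)

ok <- !is.na(zwanzig)

# Connect left and right neighbours directly by plotting only observed points
plot(sw[ok], zwanzig[ok], type = "l", xlab = "sw", ylab = "zwanzig")

# Or fill in linearly interpolated values at the missing positions only
filled <- zwanzig
filled[!ok] <- approx(sw[ok], zwanzig[ok], xout = sw[!ok])$y
```

With linear interpolation both routes draw the same line; the second also gives the interpolated numbers if they are needed elsewhere.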

On 9/7/07, Markus Schmidberger [EMAIL PROTECTED] wrote:
 Hello,

 I have this kind of dataframe and have to plot it.

 data <- data.frame(sw= c(1,2,3,4,5,6,7,8,9,10,11,12,15),
zehn =
 c(33.44,20.67,18.20,18.19,17.89,19.65,20.05,19.87,20.55,22.53,NA,NA,NA),
 zwanzig =
 c(61.42,NA,26.60,23.28,NA,24.90,24.47,24.53,26.41,28.26,NA,29.80,35.49),
 fuenfzig =
 c(162.51,66.08,49.55,43.40,NA,37.77,35.53,36.46,37.25,37.66,NA,42.29,47.80)
 )

 The plot should have lines:
 lines(fuenfzig~sw, data=data)
 lines(zwanzig~sw, data=data)

 But now I have holes in my lines for the missing values (NA). How to
 plot the lines without the holes?
 The missing values should be interpolated or the left and right point
 directly connected. The function approx interpolates the whole dataset.
 That's not my goal!
 Is there no plotting function to do this directly?

 Best
 Markus

 --
 Dipl.-Tech. Math. Markus Schmidberger

 Ludwig-Maximilians-Universität München
 IBE - Institut für medizinische Informationsverarbeitung,
 Biometrie und Epidemiologie
 Marchioninistr. 15, D-81377 Muenchen
 URL: http://ibe.web.med.uni-muenchen.de
 Mail: Markus.Schmidberger [at] ibe.med.uni-muenchen.de





Re: [R] Delete query in sqldf?

2007-09-07 Thread Gabor Grothendieck
Yes, but delete does not return anything so it's not useful.  In the devel
version of sqldf you can pass multiple commands, so try this using the
builtin data frame BOD, noting that the record with demand = 8.3 was
removed:

> library(sqldf)
Loading required package: RSQLite
Loading required package: DBI
Loading required package: gsubfn
Loading required package: proto
> # overwrite with devel version of the sqldf.R file
> source("http://sqldf.googlecode.com/svn/trunk/R/sqldf.R")
> sqldf(c("delete from BOD where demand = 8.3", "select * from BOD"))
  Time__1 demand
1       2   10.3
2       3   19.0
3       4   16.0
4       5   15.6
5       7   19.8
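
For comparison only, the same row removal can be sketched in base R without any SQL. This is not what the message proposes; it just shows which rows of the builtin BOD data frame survive. The name BOD_kept is illustrative.

```r
# BOD ships with R (datasets package); keep every row except demand = 8.3
BOD_kept <- subset(BOD, demand != 8.3)
BOD_kept
```

As in the SQL version, the original BOD object is left unchanged; only the returned data frame lacks the row.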


On 9/7/07, Paul Smith [EMAIL PROTECTED] wrote:
 Dear All,

 Is sqldf equipped with delete queries? I have tried delete queries but
 with no success.

 Thanks in advance,

 Paul





Re: [R] Automatic detachment of dependent packages

2007-09-07 Thread Gabor Grothendieck
If it's good enough just to get rid of all packages attached since startup,
you could do repeated detaches like this, making use of the fact that
search() has 9 components on startup:

replicate(length(search()) - 9, detach())
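
The count of 9 depends on the R version and startup configuration, so a variant that avoids the hard-coded number is sketched below. It records search() before loading anything and detaches whatever was added. The package splines is only a stand-in for sqldf here, and the names before and extra are illustrative.

```r
before <- search()          # snapshot of the search path at this point

library(splines)            # stand-in for library(sqldf) and its dependencies

# Everything attached since the snapshot, most recently attached first
extra <- setdiff(search(), before)
for (pkg in extra) detach(pkg, character.only = TRUE)
```

Detaching only removes the packages from the search path; their namespaces may remain loaded. Because the most recently attached packages come first in search(), a package is normally detached before the packages it depends on.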


On 9/7/07, Paul Smith [EMAIL PROTECTED] wrote:
 Dear All,

 When one loads certain packages, some other dependent packages are
 loaded as well. Is there some way of detaching them automatically when
 one detaches the first package loaded? For instance,

 > library(sqldf)
 Loading required package: RSQLite
 Loading required package: DBI
 Loading required package: gsubfn
 Loading required package: proto

 but

 > detach(package:sqldf)
 
 > search()
  [1] ".GlobalEnv"        "package:gsubfn"    "package:proto"
  [4] "package:RSQLite"   "package:DBI"       "package:stats"
  [7] "package:graphics"  "package:grDevices" "package:utils"
 [10] "package:datasets"  "package:methods"   "Autoloads"
 [13] "package:base"

 The packages

 RSQLite
 DBI
 gsubfn
 proto

 were not detached.

 Thanks in advance,

 Paul





Re: [R] Delete query in sqldf?

2007-09-07 Thread Gabor Grothendieck
All sqldf does is pass the command to sqlite and retrieve whatever it
sends back, translating in the two directions to and from R.  sqldf
does not change the meaning of any sql statements.  Perhaps the
meaning you expect is desirable but it's not how sqlite works.   If
sqlite were changed to adopt that meaning then sqldf would
automatically get it too.

Here is an example which does not involve R at all which
illustrates that delete returns nothing.

C:\> sqlite3
SQLite version 3.4.0
Enter ".help" for instructions
sqlite>
sqlite> create table t1(a,b);
sqlite> insert into T1 values(1,2);
sqlite> insert into T1 values(1,3);
sqlite> insert into T1 values(2,4);
sqlite> delete from t1 where b = 2;
sqlite> select * from t1;
1|3
2|4


On 9/7/07, Paul Smith [EMAIL PROTECTED] wrote:
 On 9/7/07, Gabor Grothendieck [EMAIL PROTECTED] wrote:
  Yes but delete does not return anything so it's not useful.  In the devel
  version of sqldf you can pass multiple commands so try this using the
  builtin data frame BOD noting that the record with demand = 8.3 was
  removed:
 
  > library(sqldf)
  Loading required package: RSQLite
  Loading required package: DBI
  Loading required package: gsubfn
  Loading required package: proto
  > # overwrite with devel version of the sqldf.R file
  > source("http://sqldf.googlecode.com/svn/trunk/R/sqldf.R")
  > sqldf(c("delete from BOD where demand = 8.3", "select * from BOD"))
    Time__1 demand
  1       2   10.3
  2       3   19.0
  3       4   16.0
  4       5   15.6
  5       7   19.8

 I see, Gabor, but I would find it more natural to have

 sqldf("delete from BOD where demand = 8.3")

 working, with no second command.

 Paul


  On 9/7/07, Paul Smith [EMAIL PROTECTED] wrote:
   Dear All,
  
   Is sqldf equipped with delete queries? I have tried delete queries but
   with no success.
  
   Thanks in advance,
  
   Paul



Re: [R] R first.id last.id function error

2007-09-07 Thread Gabor Grothendieck
A slightly easier way to construct first and last if the vector x is
sorted (as is assumed in SAS) is:

   first <- !duplicated(x)
   last <- !duplicated(x, fromLast = TRUE)

where the fromLast= argument is added in R 2.6.0.
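
A minimal sketch of the two lines above on a sorted id vector shaped like the one in the question quoted below (the vector id and the flag names are illustrative):

```r
# Sorted ids, with groups of different sizes including a single-row group
id <- c(1, 1, 1, 2, 2, 2, 3, 3, 7, 7, 8, 9, 9)

first.id <- !duplicated(id)                   # TRUE on the first row of each id
last.id  <- !duplicated(id, fromLast = TRUE)  # TRUE on the last row of each id

data.frame(id, first.id, last.id)
```

A group with a single row, such as id 8 here, is flagged as both first and last, which matches the behaviour of the SAS first. and last. variables.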


On 9/7/07, Gerard Smits [EMAIL PROTECTED] wrote:
 Hi R users,

 I have a test dataframe (file1, shown below) for which I am trying
 to create a flag for the first and last ID record (equivalent to the SAS
 first.id and last.id variables).

 Dump of file1:

 > file1
    id rx week dv1
 1   1  1    1   1
 2   1  1    2   1
 3   1  1    3   2
 4   2  1    1   3
 5   2  1    2   4
 6   2  1    3   1
 7   3  1    1   2
 8   3  1    2   3
 9   3  1    3   4
 10  4  1    1   2
 11  4  1    2   6
 12  4  1    3   5
 13  5  2    1   7
 14  5  2    2   8
 15  5  2    3   5
 16  6  2    1   2
 17  6  2    2   4
 18  6  2    3   6
 19  7  2    1   7
 20  7  2    2   8
 21  8  2    1   9
 22  9  2    1   4
 23  9  2    2   5

 I have written code that correctly assigns the first.id and last.id variables:

 require(Hmisc)  #for Lags
 #ascending order to define first dot
 file1<- file1[order(file1$id, file1$week),]
 file1$first.id <- (Lag(file1$id) != file1$id)
 file1$first.id[1]<-TRUE  #force NA to TRUE

 #descending order to define last dot
 file1<- file1[order(-file1$id,-file1$week),]
 file1$last.id  <- (Lag(file1$id) != file1$id)
 file1$last.id[1]<-TRUE   #force NA to TRUE

 #resort to original order
 file1<- file1[order(file1$id,file1$week),]



 I am now trying to get the above code to work as a function, and am
 clearly doing something wrong:

 > first.last <- function (df, idvar, sortvars1, sortvars2)
 +   {
 +   #sort in ascending order to define first dot
 +   df<- df[order(sortvars1),]
 +   df$first.idvar <- (Lag(df$idvar) != df$idvar)
 +   #force first record NA to TRUE
 +   df$first.idvar[1]<-TRUE
 +
 +   #sort in descending order to define last dot
 +   df<- df[order(-sortvars2),]
 +   df$last.idvar  <- (Lag(df$idvar) != df$idvar)
 +   #force last record NA to TRUE
 +   df$last.idvar[1]<-TRUE
 +
 +   #resort to original order
 +   df<- df[order(sortvars1),]
 +   }
  

 Function call:

 > first.last(df=file1, idvar=file1$id,
 sortvars1=c(file1$id,file1$week), sortvars2=c(-file1$id,-file1$week))

 R Error:

 Error in as.vector(x, mode) : invalid argument 'mode'
  

 I am not sure about the passing of the sort strings.  Perhaps this is
 where things are off.  Any help greatly appreciated.

 Thanks,

 Gerard





Re: [R] creat list

2007-09-06 Thread Gabor Grothendieck
Try this:

do.call(cbind, lista)
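
A small sketch of what this does, using a made-up three-element list in place of the 50 vectors (the contents of lista here are illustrative):

```r
# Stand-in for the real list of 50 vectors
lista <- list(a = 1:3, b = 4:6, c = 7:9)

# do.call() passes each list element as a separate argument,
# so this is the same as cbind(a = 1:3, b = 4:6, c = 7:9)
m <- do.call(cbind, lista)
m
```

Each list element becomes one column, and the list names become column names. If the vectors differ in length, cbind() recycles the shorter ones, with a warning when the lengths are not multiples of each other, so it may be worth checking the lengths first.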

On 9/6/07, livia [EMAIL PROTECTED] wrote:

 Hi,

 I have a list named lista, which has 50 vectors, each of length
 about 1200. I would like to create a matrix out of lista. What I do
 now is cbind(lista[[1]],lista[[2]],...,lista[[50]]). I guess there is
 an easier way of doing this. Could anyone give me some advice?
 --
 View this message in context: 
 http://www.nabble.com/creat-list-tf4391162.html#a12519637
 Sent from the R help mailing list archive at Nabble.com.





Re: [R] Excel

2007-09-06 Thread Gabor Grothendieck
On my version of Excel (Excel 2007 under Vista) using
File | Open on a file, a.txt such as:

a b
sep7 10
sep10 11

causes it to enter a wizard where it asks you for the delimiters and
column types so you can change it from what it offers as the default.
In particular, if you leave it at General it will guess Date but you can
specify Text or you can specify Date to cause it to select a
particular type.
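
On the R side, the point made in the quoted message below, that read.table() lets you fix the column classes before the file is read, can be sketched as follows. The data in txt are made up for illustration: an id column whose values look like numbers but should stay as text.

```r
# Hypothetical file contents: ids with leading zeros
txt <- "id value
007 10
010 11"

# Default: R guesses the type, so the ids become the integers 7 and 10
guessed <- read.table(text = txt, header = TRUE)

# Declaring the classes up front keeps the ids exactly as written
kept <- read.table(text = txt, header = TRUE,
                   colClasses = c("character", "numeric"))
```

The same colClasses argument is accepted by read.csv() and read.delim(), which are wrappers around read.table().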


On 9/6/07, [EMAIL PROTECTED] [EMAIL PROTECTED] wrote:
 Quoting Robert A LaBudde [EMAIL PROTECTED]:

  If you format the column as Text, you won't have this problem. By
  leaving the cells as General, you leave it up to Excel to guess at
  the correct interpretation.
 
  You will note that the conversion to a date occurs immediately in
  Excel when you enter the value. There are many formats to enter dates.
 
  Either pre-format the column as Text, or prefix the individual entry
  with an ' to indicate text.

 But the conversion is done as soon as the file is opened, _before_ you
 have the chance to format the column as text!!!
 Once the conversion is done... it's done.
 I had gene names such as SEP7 converted by Excel into a 5 digit
 number representing a date. From that number I didn't find a way to
 reconstruct SEP7. Sept-7 is not the same.

 It seems like a problem with an easy solution. But it isn't. There are
 too many variations.

  A similar problem occurs in R's read.table() function when a factor
  has levels that can be interpreted as numbers.

 at least with read.table you can specify the classes of each column
 _before_ you read the file.

 R developers are better behaved than MS Excel ones ;-)

 Jose

 
  At 10:11 PM 8/27/2007, David wrote:
 
  A common process when data is obtained in an Excel spreadsheet is to save
  the spreadsheet as a .csv file then read it into R. Experienced users
  might have learned to be wary of dates (as I have) but possibly have not
  experienced what just happened to me. I thought I might just share it with
  r-help as a cautionary tale.
 
  I received an Excel file giving patient details. Each patient had an ID
  code in the form of three letters followed by four digits. (Actually a New
  Zealand National Health Identification.) I saved the .xls file as .csv.
  Then I opened up the .csv (with Excel) to look at it. In the column of ID
  codes I saw: Aug-99. Clicking on that entry it showed 1/08/2699.
 
  In a column of character data, Excel had interpreted AUG2699 as a date.
 
  The .csv did not actually have a date in that cell, but if I had saved the
  .csv file it would have.
 
  David Scott
 
  
  Robert A. LaBudde, PhD, PAS, Dpl. ACAFS  e-mail: [EMAIL PROTECTED]
  Least Cost Formulations, Ltd.URL: http://lcfltd.com/
  824 Timberlake Drive Tel: 757-467-0954
  Virginia Beach, VA 23464-3239Fax: 757-467-2947
 
  Vere scire est per causas scire
 
  __
  R-help@stat.math.ethz.ch mailing list
  https://stat.ethz.ch/mailman/listinfo/r-help
  PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
  and provide commented, minimal, self-contained, reproducible code.
 
 



 --
 Dr. Jose I. de las Heras                    Email: [EMAIL PROTECTED]
 The Wellcome Trust Centre for Cell Biology  Phone: +44 (0)131 6513374
 Institute for Cell & Molecular Biology      Fax:   +44 (0)131 6507360
 Swann Building, Mayfield Road
 University of Edinburgh
 Edinburgh EH9 3JR
 UK

 --
 The University of Edinburgh is a charitable body, registered in
 Scotland, with registration number SC005336.



Re: [R] Excel

2007-09-06 Thread Gabor Grothendieck
That is not what happens in Excel 2007 when I tried it just now. I tried
saving the same file I displayed in my prior message as an .xls file and
as an .xlsx file and in both cases the first column came back as text,
as I had specified to the Wizard on the initial import.  I guess they fixed
the behavior in Excel 2007.

On 9/6/07, [EMAIL PROTECTED] [EMAIL PROTECTED] wrote:

 Yes, and then you save it, you open it again... same behaviour.
 The only way I found around it was to insert a character at the
 beginning of every element in such columns. An apostrophe works, but
 it looks ugly. Yes, when loading the data in R you could easily clean
 it up automatically... doable.
 You can add a space. Then it will not show, but you have to remember
 that if you ever use the data for labels etc. You shouldn't need to do
 that in the first place...

 Jose

 Quoting Erich Neuwirth [EMAIL PROTECTED]:

  There is a hack to get around the problem.
  It is definitely not a good solution, just a hack.
 
  Open the .csv file in a text editor and select everything.
  Paste it into an empty Excel sheet.
  Then use Data -> Text to Columns
 
  The third dialog box (at least it is the third one in Excel 2003)
  allows you to format each column of the data. This is the place where
  you can switch off the date interpretation of your ID column.
 
  AUG1838 probably is not interpreted as a date because Excel dates only
  start at 1/1/1900.
 
 
  Duncan Murdoch wrote:
  On 8/28/2007 3:16 AM, J Dougherty wrote:
  On Monday 27 August 2007 22:21, David Scott wrote:
  On Tue, 28 Aug 2007, Robert A LaBudde wrote:
  If you format the column as Text, you won't have this problem. By
  leaving the cells as General, you leave it up to Excel to guess at
  the correct interpretation.
  Not true actually. I had converted the column to Text because I saw the
  interpretation as a date in the .xls file. I saved the .csv file *after*
  the column had been converted to Text. Looking at the .csv file in a text
  editor, the entry is correct.
 
  I have just rechecked this.
 
  On reopening the .csv using Excel, the entry AUG2699 had been interpreted
  as a date, and was showing as Aug-99. Most bizarre is that the NHI value
  of AUG1838 has *not* been interpreted as a date.
 
 
  --
  Erich Neuwirth, University of Vienna
  Faculty of Computer Science
  Computer Supported Didactics Working Group
  Visit our SunSITE at http://sunsite.univie.ac.at
  Phone: +43-1-4277-39464 Fax: +43-1-4277-39459
 
 
 





Re: [R] Lisp-like primitives in R

2007-09-06 Thread Gabor Grothendieck
Reduce, Filter and Map are part of R 2.6.0.  Try ?Reduce
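
A rough sketch of how these line up with some Lisp idioms. The correspondences in the comments are informal, not exact equivalents, and Find() and %in% are from current base R rather than from the note above.

```r
xs <- c(3, 8, 1, 6)

total   <- Reduce(`+`, xs)                      # fold, roughly (reduce #'+ xs)
big     <- Filter(function(v) v > 2, xs)        # roughly remove-if-not
prods   <- Map(function(a, b) a * b, 1:3, 4:6)  # roughly mapcar; returns a list
first   <- Find(function(v) v > 5, xs)          # roughly find-if
present <- 6 %in% xs                            # rough analog of member
```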

On 9/6/07, Chris Elsaesser [EMAIL PROTECTED] wrote:
 I mainly program in Common Lisp and use R for statistical analysis.

 While in R I miss the power and ease of use of Lisp, especially its many
 primitives such as find, member, cond, and (perhaps a bridge too far)
 loop.

 Has anyone created a package that includes R analogs to a subset of Lisp
 functions?


 Chris Elsaesser, PhD
 Principal Scientist, Machine Learning
 SPADAC Inc.
 7921 Jones Branch Dr. Suite 600
 McLean, VA 22102

 703.371.7301 (m)
 703.637.9421 (o)



Re: [R] problems in read.table

2007-09-06 Thread Gabor Grothendieck
See ?count.fields to get a vector of how many fields are on each line.
Also fill = TRUE on read.table() can be used to fill out short lines if
that is appropriate.
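
A small sketch of both suggestions, using a made-up four-line file (not the CTPP data) in which one line is a field short. Dropping the short lines via readLines() is an extra option shown here as an assumption about what may be wanted, not part of the advice above.

```r
# Hypothetical file: the third line is one field short.
path <- tempfile(fileext = ".dat")
writeLines(c("a b c", "1 2 3", "4 5", "6 7 8"), path)

n_fields <- count.fields(path)               # fields found on each line
short    <- which(n_fields < max(n_fields))  # line numbers of the short lines

# fill = TRUE pads the short line with NA instead of stopping.
d <- read.table(path, header = TRUE, fill = TRUE)

# Alternatively, drop the short lines before parsing.
lines <- readLines(path)
d2 <- read.table(text = lines[-short], header = TRUE)
```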

On 9/6/07, [EMAIL PROTECTED] [EMAIL PROTECTED] wrote:
 Dear R-users,

 I have encountered the following problem every now and then. But I was
 dealing with a very small dataset before, so it wasn't a problem (I
 just edited the dataset in an OpenOffice spreadsheet). This time I have to
 deal with many large datasets containing commuting flow data. I
 would appreciate it if anyone could give me a hint or clue to get out of this
 problem.

 I have a .dat file called 1081.dat: 1001 means Birmingham, AL.

 I imported this .dat file using read.table like
 tmp <- read.table('CTPP3_ANSI/MPO3441_ctpp3_sumlv944.dat', header=T)

 Then I got this error message:
 Error in scan(file, what, nmax, sep, dec, quote, skip, nlines, na.strings,  :
    line 9499 did not have 209 elements

 Since I got an error message saying other rows did not have 209
 elements, I added skip=c(205,9499,9294) in the hope that R would take
 care of this problem. But I got a similar error message:
 tmp <- read.table('CTPP3_ANSI/MPO3441_ctpp3_sumlv944.dat', header=T, skip=c(205,9499,9294))
 Error in scan(file, what, nmax, sep, dec, quote, skip, nlines, na.strings,  :
    line 9294 did not have 209 elements
 In addition: Warning message:
 the condition has length > 1 and only the first element will be used
 in: if (skip > 0) readLines(file, skip)

 Is there any way to make R code automatically skip problematic
 rows? Thank you very much!

 Taka



Re: [R] 'singular gradient matrix' when using nls() and how to make the program skip nls() and run on

2007-09-06 Thread Gabor Grothendieck
In case 1 graph your function and then use optimize rather than nls.

In case 2, a and b may have the same effect on f as c, whereas in case 1
they do not vary, so it does not matter.  For example, consider
minimizing f <- function(a, b) (a + b)^2.  If a is fixed at zero then
the minimum occurs at b = 0, but if a is not fixed then increasing a
and decreasing b by the same amount causes no change in the
result, so the gradient in that direction is zero.
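
A minimal sketch of the same effect inside nls(), on made-up data. The model y ~ (a + b) * x is only an illustration of two parameters that cannot be told apart; it is not the poster's f. Wrapping each fit in tryCatch() is one possible way to keep a loop going past a failed fit; it is an assumption about what is wanted, not something from the reply above.

```r
set.seed(1)
x <- 1:20
y <- 2 * x + rnorm(20, sd = 0.1)   # made-up data, true slope 2

# Two hypothetical model specifications: in the first, a and b are redundant.
models <- list(
  redundant = list(formula = y ~ (a + b) * x, start = list(a = 1, b = 1)),
  ok        = list(formula = y ~ a * x,       start = list(a = 1))
)

# A failed fit becomes NULL instead of stopping the whole run.
fits <- lapply(models, function(m) {
  tryCatch(nls(m$formula, start = m$start),
           error = function(e) NULL)
})

sapply(fits, is.null)
```

The fits that come back NULL can be inspected or refitted afterwards.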

On 9/5/07, Yuchen Luo [EMAIL PROTECTED] wrote:
 Dear friends.

 I use nls() and encounter the following puzzling problem:



 I have a function f(a,b,c,x), a data vector x, and a vector y of
 realized values of f.



 Case1

 I tried to estimate  c with (a=0.3, b=0.5) fixed:

 nls(y~f(a,b,c,x), control=list(maxiter = 10, minFactor=0.5^2048),
     start=list(c=0.5)).

 The error message is: number of iterations exceeded maximum of 10



 Case2

 I then thought maybe the values of a and b are not reasonable. So, I let nls()
 estimate (a,b,c) altogether:

 nls(y~f(a,b,c,x), control=list(maxiter = 10, minFactor=0.5^2048),
     start=list(a=0.3,b=0.5,c=0.5)).

 The error message is:

 singular gradient matrix at initial parameter estimates.



 This is what puzzles me, if the initial parameter of (a=0.3,b=0.5,c=0.5) can
 create 'singular gradient matrix', then why doesn't this 'singular gradient
 matrix' appear in Case1?



 I have tried to change the initial value of (a,b,c) around but the problem
 persists. I am wondering if there is a way out.



 My other question is, I need to run 220 nls() calls in my program with
 different y and x. When one of the nls() calls encounters a problem, the whole
 program stops.  In my case, the 3rd nls() runs into a problem.  I would
 still need the program to run the remaining 217 nls() calls! Is there a way to
 make the program skip the problematic nls() and complete the remaining
 nls() calls?



 Your help will be highly appreciated!

 Yuchen Luo



Re: [R] Table and ftable

2007-09-04 Thread Gabor Grothendieck
Try this which gives an object of the required shape and of
class c("xtabs", "table") :

   xx <- xtabs(area ~ sic + level, DF)

You can optionally do it like this to make it class matrix

   xx <- xtabs(area ~ sic + level, DF)[]

and if you don't want the call attribute:

   attr(xx, "call") <- NULL
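
A self-contained sketch with the data from the question typed in as DF. Ordering the levels of level by first appearance is an assumption made here only so the columns come out in the order shown in the question; without it they are sorted.

```r
DF <- data.frame(
  sic   = c("a", "b", "b", "b", "c", "c", "c", "d"),
  level = c(211, 311, 322, 322, 100, 100, 242, 222),
  area  = c(2.4, 2.3, 0.2, 0.5, 3.0, 1.5, 1.5, 0.2)
)
# Keep the levels in order of first appearance, as in the desired layout.
DF$level <- factor(DF$level, levels = unique(DF$level))

# Sum area within each sic x level cell; empty cells are 0.
xx <- xtabs(area ~ sic + level, DF)
xx
```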

On 9/4/07, Giulia Bennati [EMAIL PROTECTED] wrote:
 Dear listmembers,
 I have a little question: I have my data organized as follows

 sic  level  area
 a    211    2.4
 b    311    2.3
 b    322    0.2
 b    322    0.5
 c    100    3.0
 c    100    1.5
 c    242    1.5
 d    222    0.2

 where levels and sics are factors. I'm trying to obtain a matrix like this:

       level
 sic   211  311  322  100  242  222
 a     2.4  0    0    0    0    0
 b     0    2.3  0.7  0    0    0
 c     0    0    0    4.5  1.5  0
 d     0    0    0    0    0    0.2

 I tried the table function as
 table(sic,level) but I obtained only a contingency table.
 Do you have any suggestions?
 Thank you very much,
 Giulia



Re: [R] Variable scope in a function

2007-09-04 Thread Gabor Grothendieck
environment(test_func) <- baseenv()

will allow it to access the base environment, so it can still find exists
but will not find kat.  If you issue the command

search()

then each attached package has the next as its parent and base is the
last one.

Regarding your second question, try rm().

f <- function() { x <- 1; rm(x); exists("x", environment()) }
f() # FALSE
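
A short sketch of the first point, using a slightly simplified test_func that returns its result instead of printing it (that change is only for illustration).

```r
kat <- 12

test_func <- function() {
  if (!exists("kat")) "kat is undefined" else kat
}

before <- test_func()                # finds kat through the enclosing environment

# Re-parent the function: base functions such as exists() are still visible,
# but variables defined in the workspace are not.
environment(test_func) <- baseenv()
after <- test_func()
```

Here before is 12 and after is the "kat is undefined" string, while kat itself is untouched.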


On 9/4/07, [EMAIL PROTECTED] [EMAIL PROTECTED] wrote:
 Hello,

 I apologise in advance for this question; I'm sure it is answered in
 the documentation or this mailing list many times, however the answer
 has eluded me.

 I'm trying to write a function where I don't want external variables
 to be scoped in from the parent environment. Given this function:

 test_func = function() {

     if (exists("kat") == FALSE) {
         print("kat is undefined")
     } else {
         print(kat)
     }
 }

 If I did this:

 > kat = 12
 > test_func()

 I'd like the result to be the error, but now it's 12 (which is of
 course correct according to the documentation).

 So there are two questions:

 1) How can I disregard all variables from the parent environment
 within a function? (Although from what I've read on the mailing lists
 this isn't really what I want.)

 Apparently

 environment(test_func) = NULL

 is defunct, and what I thought was its replacement

 environment(test_func) = emptyenv()

 doesn't seem to be.


 2) How can I undefine a variable, perhaps just within the context
 of my function. I'm hoping to find some line that I can put at the
 start of my function above so that the result would be:

 > kat = 12
 > test_func()
 [1] "kat is undefined"
 > kat
 [1] 12

 Thanks in advance for any help!

 Cheers,

 Demitri



Re: [R] using temporary arrays in R

2007-09-03 Thread Gabor Grothendieck
You can do it in a local, in a function or explicitly remove it.  Also
if you never assign it to a variable then it will be garbage collected as well.

# 1
local({
    print(gc())
    x <- matrix(NA, 1000, 1000)
    print(gc())
})
gc()

# 2
f <- function() {
    print(gc())
    x <- matrix(NA, 1000, 1000)
    print(gc())
}
f()
gc()

# 3
gc()
x <- matrix(NA, 1000, 1000)
gc()
rm(x)
gc()

# 4
gc()
sum(matrix(1, 1000, 1000))
gc()
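
A compact sketch of the local() variant when only a result is needed from the temporary matrix. The names total and tmp_big are made up for the example.

```r
total <- local({
  tmp_big <- matrix(1, 1000, 1000)   # exists only inside local()
  sum(tmp_big)                       # only this value is returned
})

# The temporary matrix is not left behind in the calling environment.
leftover <- exists("tmp_big", inherits = FALSE)
```

After this, total holds the sum and the matrix is eligible for garbage collection.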



On 9/3/07, dxc13 [EMAIL PROTECTED] wrote:

 useR's,

 Is there a way to create a temporary array (or matrix) in R to hold values,
 then drop or delete that temporary array from memory once I do not need it
 anymore?

 I am working with multidimensional arrays/matrices and I frequently perform
 multiple operations on the same matrix and rename it to be another object.
 I want to be able to delete the older versions of the array/matrix to free
 up space.

 Thank you.


Re: [R] Derivative of a Function Expression

2007-09-03 Thread Gabor Grothendieck
The Ryacas package can do that (but the function must be one line
and it can't have brace brackets).  The first yacas call below registers f with
yacas, then we set up a function to act as a template to hold the
derivative and then we set its body calling yacas again to take the
derivative.

library(Ryacas)
f <- function(x)  2*cos(x)^2 + 3*sin(x) +  0.5
yacas(f) # register f with yacas
Df <- f
body(Df) <- yacas(expression(deriv(f(x))))[[1]]
Df

Here is the output:

> library(Ryacas)
> f <- function(x)  2*cos(x)^2 + 3*sin(x) +  0.5
> yacas(f)
[1] "Starting Yacas!"
expression(TRUE)
> Df <- f
> body(Df) <- yacas(expression(deriv(f(x))))[[1]]
> Df
function (x)
2 * (-2 * sin(x) * cos(x)) + 3 * cos(x)

Also see:

demo("Ryacas-Function")

and the other demos, vignette and home page:
   http://ryacas.googlecode.com



On 9/3/07, Rory Winston [EMAIL PROTECTED] wrote:
 Hi

 I am currently (for pedagogical purposes) writing a simple numerical
 analysis library in R. I have come unstuck when writing a simple
 Newton-Raphson implementation, that looks like this:

 f - function(x) { 2*cos(x)^2 + 3*sin(x) +  0.5  }

 root - newton(f, tol=0.0001, N=20, a=1)

 My issue is calculating the symbolic derivative of f() inside the newton()
 function. I cant seem to get R to do this...I can of course calculate the
 derivative by calling D() with an expression object containing the inner
 function definition, but I would like to just define the function once and
 then compute the derivative of the existing function. I have tried using
 deriv() and as.call(), but I am evidently misusing them, as they dont do
 what I want. Does anyone know how I can define a function, say foo, which
 manipulates one or more arguments, and then refer to that function later in
 my code in order to calculate a (partial) derivative?

 Thanks
 Rory



Re: [R] Derivative of a Function Expression

2007-09-03 Thread Gabor Grothendieck
Actually, in thinking about this, it's pretty easy to do it without Ryacas too:

Df <- f
body(Df) <- deriv(body(f), "x")
Df
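
A related sketch, for comparison, using D() so that Df returns the plain derivative value, next to deriv() with function.arg = TRUE, which returns the function value with the derivative attached as a "gradient" attribute. Both are base R; f is the function from the original question without braces.

```r
f <- function(x) 2*cos(x)^2 + 3*sin(x) + 0.5

# D() returns the derivative as an unevaluated expression,
# which can be dropped in as the body of a copy of f.
Df <- f
body(Df) <- D(body(f), "x")

# deriv() builds a function returning f(x) with a "gradient" attribute.
g <- deriv(body(f), "x", function.arg = TRUE)

Df(0)                     # derivative at 0
attr(g(0), "gradient")    # same value, as a 1 x 1 matrix
```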




Re: [R] Derivative of a Function Expression

2007-09-03 Thread Gabor Grothendieck
The problem is that brace brackets are not in the derivatives table.
Make sure you don't have any.

On 9/3/07, Alberto Vieira Ferreira Monteiro [EMAIL PROTECTED] wrote:
 Gabor Grothendieck wrote:
 
  Actually in thinking about this its pretty easy to do it without Ryacas
  too:
 
  Df <- f
  body(Df) <- deriv(body(f), "x")
  Df
 
 This is weird.

 f <- function(x) { x^2 + 2*x+1 }
 Df <- f
 body(Df) <- deriv(body(f), "x") # error

 Also:

 f <- function(x) x^2 + 2 * x + 1
 Df <- f
 body(Df) <- deriv(body(f), "x") # ok
 D2f <- f
 body(D2f) <- deriv(body(Df), "x") # error

 Alberto Monteiro




Re: [R] Derivative of a Function Expression

2007-09-03 Thread Gabor Grothendieck
One improvement.  This returns a function directly without having
to create a template and fill in its body:

deriv(body(f), "x", func = TRUE)



Re: [R] Derivative of a Function Expression

2007-09-03 Thread Gabor Grothendieck
And if f has brace brackets surrounding the body then do this:

f <- function(x) { x*x }
deriv(body(f)[[2]], "x", func = TRUE)

If you are writing a general function you can do this:

e <- if (identical(body(f)[[1]], as.name("{"))) body(f)[[2]] else body(f)
deriv(e, "x", func = TRUE)
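
Putting the pieces together for the Newton-Raphson use case from the original question. This is only a sketch: dfun() and this newton() are made-up helpers, D() is used in place of deriv() so the derivative comes back as a plain number, only a single statement inside the braces is handled, and the starting value a = -0.5 is chosen for the example rather than taken from the question.

```r
# Hypothetical helper: derivative of a one-argument function whose body is a
# single expression, with or without braces around it.
dfun <- function(f, arg = "x") {
  e <- body(f)
  if (is.call(e) && identical(e[[1]], as.name("{"))) e <- e[[2]]
  g <- f
  body(g) <- D(e, arg)
  g
}

# Plain Newton-Raphson iteration using the symbolic derivative.
newton <- function(f, a, tol = 1e-4, N = 20) {
  df <- dfun(f)
  x <- a
  for (i in seq_len(N)) {
    step <- f(x) / df(x)
    x <- x - step
    if (abs(step) < tol) break
  }
  x
}

f <- function(x) { 2*cos(x)^2 + 3*sin(x) + 0.5 }
root <- newton(f, a = -0.5)
f(root)   # close to zero
```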




Re: [R] Comparing transform to with

2007-09-02 Thread Gabor Grothendieck
Try this version of transform.  In the first test we show
it works on your example but we have used the head of the built in
anscombe data set.  The second and third show that
it necessarily is incompatible with transform because transform
always looks up variables in DF first whereas my.transform looks
up the computed ones first.

my.transform <- function(DF, ...) {
    f <- function(){}
    formals(f) <- eval(substitute(as.pairlist(c(alist(...), DF))))
    body(f) <- substitute(modifyList(DF, data.frame(...)))
    f()
}

# test
a <- head(anscombe)
# 1
my.transform(a, sum1 = x1+x2+x3+x4, sum2 = y1+y2+y3+y4, total = sum1+sum2)
# 2
my.transform(a, y2 = y1, y3 = y2)
# 3
transform(a, y2 = y1, y3 = y2) # different
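
On the with() part of the question: with() evaluates the block and returns its value, so assignments made inside it are not written back to the data frame. Two base R alternatives are sketched below on the poster's myDF, as suggestions that assume a version of R that has within(): a second pass of transform(), or within(), which returns the modified copy.

```r
myDF <- data.frame(x1 = c(1, 3, 3), x2 = c(3, 2, 1),
                   x3 = c(2, 5, 2), x4 = c(5, 6, 9))

# Two passes: the second transform() can see sum1 and sum2.
two_pass <- transform(transform(myDF, sum1 = x1 + x2, sum2 = x3 + x4),
                      total = sum1 + sum2)

# within() lets later lines use earlier ones and returns the new data frame.
one_pass <- within(myDF, {
  sum1  <- x1 + x2
  sum2  <- x3 + x4
  total <- sum1 + sum2
})
```

Either result has to be assigned (for example back to myDF) to be kept.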


On 9/1/07, Muenchen, Robert A (Bob) [EMAIL PROTECTED] wrote:
 Hi All,

 I've been successfully using the with function for analyses and the
 transform function for multiple transformations. Then I thought, why not
 use with for both? I ran into problems & couldn't figure them out from
 help files or books. So I created a simplified version of what I'm
 doing:

 rm( list=ls() )
 x1 <- c(1,3,3)
 x2 <- c(3,2,1)
 x3 <- c(2,5,2)
 x4 <- c(5,6,9)
 myDF <- data.frame(x1,x2,x3,x4)
 rm(x1,x2,x3,x4)
 ls()
 myDF

 This creates two new variables just fine

 transform(myDF,
  sum1=x1+x2,
  sum2=x3+x4
 )

 This next code does not see sum1, so it appears that transform cannot
 see the variables that it creates. Would I need to transform new
 variables in a second pass?

 transform(myDF,
  sum1=x1+x2,
  sum2=x3+x4,
  total=sum1+sum2
 )

 Next I'm trying the same thing using with. It does not work but
 also does not generate error messages, giving me the impression that I'm
 doing something truly idiotic:

 with(myDF, {
  sum1 <- x1+x2
  sum2 <- x3+x4
  total <- sum1+sum2
 } )
 myDF
 ls()

 Then I thought, perhaps one of the advantages of transform is that it
 works on the left side of the equation without using a longer name like
 myDF$sum1. with probably doesn't do that, so I use the longer form
 below. It also does not work and generates no error messages.

 # Try it again, writing vars to myDF explicitly.
 # It generates no errors, and no results.
 with(myDF, {
  myDF$sum1 <- x1+x2
  myDF$sum2 <- x3+x4
  myDF$total <- myDF$sum1+myDF$sum2
 } )
 myDF
 ls()

 I would appreciate some advice about the relative roles of these two
 functions & why my attempts with with have failed.

 Thanks!
 Bob

 =
 Bob Muenchen (pronounced Min'-chen), Manager
 Statistical Consulting Center
 U of TN Office of Information Technology
 200 Stokely Management Center, Knoxville, TN 37996-0520
 Voice: (865) 974-5230
 FAX: (865) 974-4810
 Email: [EMAIL PROTECTED]
 Web: http://oit.utk.edu/scc,
 News: http://listserv.utk.edu/archives/statnews.html



Re: [R] Function modification: how to calculate values for every combination?

2007-09-02 Thread Gabor Grothendieck
Just to add to this, be sure you do have names if you want them
and read about vectorization in ?outer in case fun was just an
example and your actual fun is more complex:

x <- c(1,2,3)
names(x) <- x
y <- c(4,5,6)
names(y) <- y

outer(x, y, fun) # as in previous answer

# or
outer(-log(15) * x, log(10) * y, "+")
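
A self-contained sketch of both points. fun is the function from the question written on one line; g is a made-up function that only works on one pair at a time, and wrapping it in Vectorize() is one way (an assumption about the real use case) to make it usable with outer().

```r
fun <- function(x, y) log(10) * y - log(15) * x

x <- c(1, 2, 3); names(x) <- x
y <- c(4, 5, 6); names(y) <- y

# Rows are labelled by names(x), columns by names(y).
m <- outer(x, y, fun)

# Hypothetical scalar-only function: if() needs a single TRUE/FALSE.
g <- function(x, y) if (x < y) log(10) * y - log(15) * x else NA
m2 <- outer(x, y, Vectorize(g))
```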


On 9/2/07, Erich Neuwirth [EMAIL PROTECTED] wrote:
 outer(x,y,fun)

 Lauri Nikkinen wrote:
  Hello,
 
  I have a function like this:
 
  fun <- function (x, y) {
    a <- log(10)*y
    b <- log(15)*x
    extr <- a-b
    extr
  }
 
  fun(2,3)
  [1] 1.491655
 
  x <- c(1,2,3)
  y <- c(4,5,6)
  fun(x, y)
  [1] 6.502290 6.096825 5.691360
 
  How do I have to modify my function that I can calculate results using
  every combination of x and y? I would like to produce a matrix which
  includes the calculated values in every cell and names(x) and names(y)
  as row and column headers respectively. Is the outer-function a way to
  solution?
 
  Best regards,



Re: [R] Synchronzing workspaces

2007-09-02 Thread Gabor Grothendieck
You could try saving prior to quitting in the future if you want to try
those arguments.
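
A minimal sketch of an explicit save with those arguments, using made-up objects and a temporary file rather than the real .RData. The checksum line is an extra suggestion, not part of the advice above: comparing it on both machines would show whether the copy itself changed the file.

```r
a <- 1:3
b <- "some text"

f <- tempfile(fileext = ".RData")
# Plain-text, uncompressed representation of the chosen objects.
save(list = c("a", "b"), file = f, ascii = TRUE, compress = FALSE)

# Checksum to compare before and after copying the file.
chk <- unname(tools::md5sum(f))

# Reload into a separate environment to check the round trip.
e <- new.env()
load(f, envir = e)
```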

On 9/3/07, Paul August [EMAIL PROTECTED] wrote:
 Thanks for sharing your experience. In my case, the involved machines are
 Windows Vista, XP and 2000. Not sure whether it contributes to my problem or
 not. I will look into this further.

 I just noticed the two arguments ascii and compress for save. However, my
 .RData file was created by q() with "yes". The manual says that q() is
 equivalent to save(list = ls(all=TRUE), file = ".RData"). There seems to be
 no way to set ascii or compress for save through the q function, unless the q
 function is replaced explicitly with save(list = ls(all=TRUE), file =
 ".RData", ascii = T).

 Paul.


 - Original Message 
 From: Gabor Grothendieck [EMAIL PROTECTED]
 To: Paul August [EMAIL PROTECTED]
 Cc: r-help@stat.math.ethz.ch
 Sent: Thursday, August 30, 2007 11:24:31 PM
 Subject: Re: [R] Synchronzing workspaces

 I haven't had similar experience but note that save has ascii=
 and compress= arguments.  You could check if varying those
 parameter values makes a difference.

 On 8/30/07, Paul August [EMAIL PROTECTED] wrote:
  I used to work on several computers and to use a flash drive to synchronize 
  the workspace on each machine before starting to work on it. I found that 
  .RData always caused some trouble: Often it is corrupted even though there 
  is no error in copying process. Does anybody have the similar experience?
 
  Paul.
 
  - Original Message 
  From: Barry Rowlingson [EMAIL PROTECTED]
  To: Eric Turkheimer [EMAIL PROTECTED]
  Cc: r-help@stat.math.ethz.ch
  Sent: Wednesday, August 22, 2007 9:43:57 AM
  Subject: Re: [R] Synchronzing workspaces
 
  Eric Turkheimer wrote:
   How do people go about synchronizing multiple workspaces on different
   workstations?  I tend to wind up with projects spread around the various
   machines I work on.  I find that placing the directories on a server and
   reading them remotely tends to slow things down.
 
   If R were to store all its workspace data objects in individual files
  instead of one big .RData file, then you could use a revision control
  system like SVN.  Check out the data, work on it, check it in, then on
  another machine just update to get the changes.
 
   However SVN doesn't work too well for binary files - conflicts being
  hard to resolve without someone backing down - so maybe its not such a
  good idea anyway...
 
   On unix boxes and derivatives, you can keep things in sync efficiently
  with the 'rsync' command.  I think there are GUI addons for it, and
  Windows ports.
 
  Barry
 


Re: [R] by group problem

2007-08-31 Thread Gabor Grothendieck
See the examples labelled head in the examples section near the bottom of:

http://sqldf.googlecode.com/svn/trunk/man/sqldf.Rd

These show how to do it using order as well as using SQL via sqldf.
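
For readers without access to that help page, here is a minimal base-R sketch of the order-based idea. It is not the code from the sqldf examples; the data frame census and its values are made up for illustration, with column names taken from the question.

```r
# Hypothetical stand-in for the census data; column names follow the question.
census <- data.frame(
  State      = c("A", "A", "A", "B", "B", "B"),
  County     = c("a1", "a2", "a3", "b1", "b2", "b3"),
  PercentOld = c(12.5, 18.1, 15.0, 9.9, 22.3, 14.2)
)

n <- 2  # the question asks for 5; 2 keeps the toy output short

# Sort by state, then by descending PercentOld within each state.
sorted <- census[order(census$State, -census$PercentOld), ]

# Number the rows within each state and keep the first n of each.
rank_in_state <- ave(seq_along(sorted$State), sorted$State, FUN = seq_along)
top_n <- sorted[rank_in_state <= n, ]
top_n
```

Because whole rows are kept, the County column comes along with the value, which is what tapply on a single column cannot give.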

On 8/31/07, Cory Nissen [EMAIL PROTECTED] wrote:
 I am working with census data.  My columns of interest are...

 PercentOld - the percentage of people in each county that are over 65
 County - the county in each state
 State - the state in the US

 There are about 3100 rows, with each row corresponding to a county within a 
 state.

 I want to return the top five PercentOld by state.  But I want the County 
 and the Value.

 I tried this...

 topN <- function(column, n=5)
  {
    column <- sort(column, decreasing=T)
    return(column[1:n])
  }
 top5PerState <- tapply(data$percentOld, data$STATE, topN)

 But this only returns the value for percentOld per state, I also want the 
 corresponding County.

 I think I'm close, but I just can't get it...

 Thanks

 cn



Re: [R] data frame row manipulation

2007-08-31 Thread Gabor Grothendieck
Try this:

evaluation$maxVol <- ave(evaluation$vol, evaluation$name, FUN = max)

or using SQL via sqldf like this:

library(sqldf)
sqldf("select * from evaluation join
  (select name, max(vol) from evaluation group by name) using (name)")
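
As a sketch of how the ave approach extends to the relative column asked for in the question, the following rebuilds the sample data and divides each day's volume by that person's maximum. The column name relDrink comes from the question; treating it as vol divided by the per-person maximum is one reading of what was wanted.

```r
evaluation <- data.frame(
  date = 1:9,
  name = rep(c("Michael", "Steve", "Bob"), times = 3),
  vol  = c(3, 5, 4, 2, 4, 5, 7, 6, 7)
)

# Per-person maximum, repeated on every row belonging to that person.
evaluation$maxVol <- ave(evaluation$vol, evaluation$name, FUN = max)

# Daily volume relative to that person's maximum.
evaluation$relDrink <- evaluation$vol / evaluation$maxVol
evaluation
```

ave returns a vector as long as its input, so it lines up with the rows directly. The getMaxVal(evaluation$name) call in the question instead compares a 3-element vector against a 9-element one, which is not a per-row lookup.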


On 8/31/07, Calle [EMAIL PROTECTED] wrote:
 Hello,

 struggling with the very basic needs... :( any help appreciated.

 #using the package doBy
 #who drinks how much beer per day and therefore cannot calculate rowwise maxvals
 evaluation=data.frame(date=c(1,2,3,4,5,6,7,8,9),
 name=c("Michael","Steve","Bob",
 "Michael","Steve","Bob","Michael","Steve","Bob"), vol=c(3,5,4,2,4,5,7,6,7))
 evaluation #

 maxval=summaryBy(vol ~ name,data=evaluation,FUN = function(x) { c(ma=max(x))
 } )
 maxval # over all days per person

 #function
 getMaxVal=function(x) { maxval$vol.ma[maxval$name==x] }
 getMaxVal("Steve") # testing the function for one name is ok

 #we want to add a column, that shows the daily drinkingvolume in relation to
 #the persons max-vol.
 evaluation[,"relDrink"]= evaluation$vol/getMaxVal(evaluation$name)
 #
 # this brings the error:
 #
 #Warning message:
 # corrupt data frame: columns will be truncated or padded with NAs
 # in: format.data.frame(x, digits = digits, na.encode = FALSE)

 errortest= evaluation$vol/getMaxVal(evaluation$name)
 errortest
 # this brings:
 # numeric(0)


 #target was the following:
 #show in each line the daily consumed beer per person and in the next column
 #the all time max consumed beer for this person (or divided by daily vol):
 #
 #  date    name vol relDrink
 #1    1 Michael   3        7
 #2    2   Steve   5        6
 #3    3     Bob   4        7
 #4    4 Michael   2        7
 #5    5   Steve   4        7
 #6    6     Bob   5        7
 #7    7 Michael   7        7
 #8    8   Steve   6        6
 #9    9     Bob   7        7

 # who can help???



Re: [R] size limitations in R

2007-08-31 Thread Gabor Grothendieck
SAS was developed many years ago when computers were far
less powerful, so its heritage is that it is very efficient, and it's unlikely
that R or other modern software will match SAS in that respect.

The development version of the sqldf R package provides an interface
which simplifies the use of the R package RSQLite, which in turn is an
interface to the sqlite database.  The development version of
sqldf supports RSQLite's ability to read a file directly into sqlite without
going through R, and then to read all of it, or only a subset of it,
from sqlite into R.  See example 6 on the sqldf home page:

http://code.google.com/p/sqldf/
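
Separately from the sqldf route, and as general advice rather than anything stated in this thread: read.csv usually runs faster when it is told the column classes up front instead of having to guess them. A minimal sketch, using a small temporary file as a hypothetical stand-in for the 180,000-row file:

```r
# Write a small hypothetical CSV so the sketch is self-contained.
path <- tempfile(fileext = ".csv")
write.csv(data.frame(id = 1:5, x = rnorm(5), g = letters[1:5]),
          path, row.names = FALSE)

# Peek at a few rows to learn the column classes ...
probe   <- read.csv(path, nrows = 3, stringsAsFactors = FALSE)
classes <- vapply(probe, function(col) class(col)[1], character(1))

# ... then read the whole file with the classes (and a row-count hint) fixed.
big <- read.csv(path, colClasses = classes, nrows = 5)
str(big)
```

How much this helps depends on the file; it is a sketch of the technique, not a measured result for the data in the question.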

On 8/31/07, Fabiano Vergari [EMAIL PROTECTED] wrote:
 I am a SAS user currently evaluating R as a possible addition or even
 replacement for SAS. The difficulty I have come across straight away is
 R's apparent difficulty in handling relatively large data files. Whilst I
 would not expect it to handle datasets with millions of records, I still
 really need to be able to work with datasets with 100,000+ records and 100+
 variables. Yet, when reading a .csv file with 180,000 records and about 200
 variables, the software virtually ground to a halt (I stopped it after 1
 hour). Are there guidelines or maybe a limitations document anywhere that
 helps me assess the size of file that R, generally, or specific routines
 will handle? Also, mindful of the fact that I am an R novice, are there
 guidelines to make efficient use of R in terms of data handling?

 Many thanks in advance for your help.

 Regards,
 Fabiano Vergari
 [EMAIL PROTECTED]



Re: [R] R and Web Applications

2007-08-30 Thread Gabor Grothendieck
The R packages and projects for the web and R are listed here:

 http://www.lmbe.seu.edu.cn/CRAN/doc/FAQ/R-FAQ.html#R-Web-Interfaces


On 8/30/07, Chris Parkin [EMAIL PROTECTED] wrote:
 Hello,

 I'm curious to know how people are calling R from web applications (I've
 been looking for Perl but I'm open to other languages).  After doing a
 search, I came across the R package RSPerl, but I'm having difficulties
 getting it installed (on Mac OSX).  I believe the problem probably has to do
 with changes in R since the package release.  Below you will see where the
 installation process comes to an end.  Does anyone have any suggestions, or
 perhaps a direction to point me in?

 Thanks in advance for your insight!

 Chris

 * Installing to library '/Library/Frameworks/R.framework/Resources/library'
 * Installing *source* package 'RSPerl' ...
 checking for perl... /usr/bin/perl
 No support for any of the Perl modules from calling Perl from R.
 *

   Set PERL5LIB to
 /Library/Frameworks/R.framework/Versions/2.5/Resources/library/RSPerl/perl

 *
 Testing: -F/Library/Frameworks/R.framework/.. -framework R
 Using '/usr/bin/perl' as the perl executable
 Perl modules (no):
 Adding R package to list of Perl modules to enable callbacks to R from Perl
 Creating the C code for dynamically loading modules with native code for
 Perl:  R
 modules:   R; linking:
 checking for gcc... gcc
 checking for C compiler default output file name... a.out
 checking whether the C compiler works... yes
 checking whether we are cross compiling... no
 checking for suffix of executables...
 checking for suffix of object files... o
 checking whether we are using the GNU C compiler... yes
 checking whether gcc accepts -g... yes
 checking for gcc option to accept ISO C89... none needed
 Support R in Perl: yes
 configure: creating ./config.status
 config.status: creating src/Makevars
 config.status: creating inst/scripts/RSPerl.csh
 config.status: creating inst/scripts/RSPerl.bsh
 config.status: creating src/RinPerlMakefile
 config.status: creating src/Makefile.PL
 config.status: creating cleanup
 config.status: creating src/R.pm
 config.status: creating R/perl5lib.R
 making target all in RinPerlMakefile
 RinPerlMakefile:5: /Library/Frameworks/R.framework/Resources/etc/Makeconf:
 No such file or directory
 make: *** No rule to make target
 `/Library/Frameworks/R.framework/Resources/etc/Makeconf'.  Stop.



Re: [R] Month end calculations

2007-08-30 Thread Gabor Grothendieck
The zoo package includes the yearmon class to facilitate such
manipulations.  Here are a few solutions assuming you store
your series in a zoo variable:

# test data
library(zoo)
z <- zoo(1001:1100, as.Date(101:200))[-(45:55)]

# Solution 1.  tapply produces indexes of last of month
tt <- time(z)
z[ c(tapply(seq_along(tt), as.yearmon(tt), tail, 1)) ]

# If we want to create a last variable which corresponds
# to last. in SAS then do it this slightly longer way:

# Solution 2
tt <- time(z)
last <- seq_along(tt) %in% tapply(seq_along(tt), as.yearmon(tt), tail, 1)
z[last]

# Solution 3. another solution with a last variable.  f(x) is
# vector same length as x with all 0's except last element is 1.
tt <- time(z)
f <- function(x) replace(0*x, length(x), 1)
last <- ave(seq_along(tt), as.yearmon(tt), FUN = f)
z[last]

In all these solutions the last point in the series is always
included.

We have not assumed that every day is necessarily included in your
series but if every day is included then even simpler solutions
are possible.
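
For comparison, the same "last observation of each month" flag can be sketched in base R without zoo. This is an illustrative alternative, not one of the solutions above; the vectors dates and values are hypothetical and assumed to be sorted by date.

```r
# Hypothetical daily series with a gap, held as plain vectors.
dates  <- as.Date("2007-01-01") + c(0:40, 50:89)
values <- seq_along(dates)

# Month label for every observation, e.g. "2007-01".
month <- format(dates, "%Y-%m")

# TRUE for the last observation of each month: scanning from the end,
# the first time a month label is seen marks that month's last row.
last <- !duplicated(month, fromLast = TRUE)

data.frame(date = dates[last], value = values[last])
```

Here last plays the role of the SAS last. flag as a logical vector, so it can be used directly for subsetting.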

On 8/29/07, Shubha Vishwanath Karanth [EMAIL PROTECTED] wrote:
 Hi R users,



 Is there a function in R, which does some calculation only for the month
 end in a daily data?... In other words, is there a command in R,
 equivalent to last. function in SAS?



 BR, Shubha




Re: [R] Month end calculations

2007-08-30 Thread Gabor Grothendieck
The last line is wrong (see below for correction):

On 8/30/07, Gabor Grothendieck [EMAIL PROTECTED] wrote:
 The zoo package includes the yearmon class to facilitate such
 manipulations.  Here are a few solutions assuming you store
 your series in a zoo variable:

 # test data
 library(zoo)
 z <- zoo(1001:1100, as.Date(101:200))[-(45:55)]

 # Solution 1.  tapply produces indexes of last of month
 tt <- time(z)
 z[ c(tapply(seq_along(tt), as.yearmon(tt), tail, 1)) ]

 # If we want to create a last variable which corresponds
 # to last. in SAS then do it this slightly longer way:

 # Solution 2
 tt <- time(z)
 last <- seq_along(tt) %in% tapply(seq_along(tt), as.yearmon(tt), tail, 1)
 z[last]

 # Solution 3. another solution with a last variable.  f(x) is
 # vector same length as x with all 0's except last element is 1.
 tt <- time(z)
 f <- function(x) replace(0*x, length(x), 1)
 last <- ave(seq_along(tt), as.yearmon(tt), FUN = f)
 z[last]

This last line should be:

z[last == 1]



 In all these solutions the last point in the series is always
 included.

 We have not assumed that every day is necessarily included in your
 series but if every day is included then even simpler solutions
 are possible.

 On 8/29/07, Shubha Vishwanath Karanth [EMAIL PROTECTED] wrote:
  Hi R users,
 
 
 
  Is there a function in R, which does some calculation only for the month
  end in a daily data?... In other words, is there a command in R,
  equivalent to last. function in SAS?
 
 
 
  BR, Shubha
 
 


Re: [R] Month end calculations

2007-08-30 Thread Gabor Grothendieck
And one more yearmon solution.  Here z is a zoo series as before:

tt <- time(z)
aggregate(z, ave(tt, as.yearmon(tt), FUN = max), tail, 1)


On 8/30/07, Gabor Grothendieck [EMAIL PROTECTED] wrote:
 The last line is wrong (see below for correction):

 On 8/30/07, Gabor Grothendieck [EMAIL PROTECTED] wrote:
  The zoo package includes the yearmon class to facilitate such
  manipulations.  Here are a few solutions assuming you store
  your series in a zoo variable:
 
  # test data
  library(zoo)
  z <- zoo(1001:1100, as.Date(101:200))[-(45:55)]
 
  # Solution 1.  tapply produces indexes of last of month
  tt <- time(z)
  z[ c(tapply(seq_along(tt), as.yearmon(tt), tail, 1)) ]
 
  # If we want to create a last variable which corresponds
  # to last. in SAS then do it this slightly longer way:
 
  # Solution 2
  tt <- time(z)
  last <- seq_along(tt) %in% tapply(seq_along(tt), as.yearmon(tt), tail, 1)
  z[last]
 
  # Solution 3. another solution with a last variable.  f(x) is
  # vector same length as x with all 0's except last element is 1.
  tt <- time(z)
  f <- function(x) replace(0*x, length(x), 1)
  last <- ave(seq_along(tt), as.yearmon(tt), FUN = f)
  z[last]

 This last line should be:

 z[last == 1]


 
  In all these solutions the last point in the series is always
  included.
 
  We have not assumed that every day is necessarily included in your
  series but if every day is included then even simpler solutions
  are possible.
 
  On 8/29/07, Shubha Vishwanath Karanth [EMAIL PROTECTED] wrote:
   Hi R users,
  
  
  
   Is there a function in R, which does some calculation only for the month
   end in a daily data?... In other words, is there a command in R,
   equivalent to last. function in SAS?
  
  
  
   BR, Shubha
  
  


Re: [R] Synchronzing workspaces

2007-08-30 Thread Gabor Grothendieck
I haven't had similar experience but note that save has ascii=
and compress= arguments.  You could check if varying those
parameter values makes a difference.
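
A minimal sketch of what varying those arguments looks like. The objects x and y and the temporary file names are made up for illustration; whether an ASCII or uncompressed file survives the flash-drive copy any better is exactly the thing to test, not something this sketch shows.

```r
# Hypothetical objects standing in for a workspace.
x <- 1:10
y <- c(a = "alpha", b = "beta")

binary_file <- tempfile(fileext = ".RData")
ascii_file  <- tempfile(fileext = ".RData")

# Default-style save versus an uncompressed ASCII save.
save(x, y, file = binary_file)
save(x, y, file = ascii_file, ascii = TRUE, compress = FALSE)

# Load each file into a fresh environment and compare the results.
e1 <- new.env(); load(binary_file, envir = e1)
e2 <- new.env(); load(ascii_file,  envir = e2)
identical(e1$x, e2$x) && identical(e1$y, e2$y)
```

To check whether a copy was damaged in transit, comparing tools::md5sum() of the file on both machines is one option.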

On 8/30/07, Paul August [EMAIL PROTECTED] wrote:
 I used to work on several computers and to use a flash drive to synchronize 
 the workspace on each machine before starting to work on it. I found that 
 .RData always caused some trouble: Often it is corrupted even though there is 
 no error in copying process. Does anybody have the similar experience?

 Paul.

 - Original Message 
 From: Barry Rowlingson [EMAIL PROTECTED]
 To: Eric Turkheimer [EMAIL PROTECTED]
 Cc: r-help@stat.math.ethz.ch
 Sent: Wednesday, August 22, 2007 9:43:57 AM
 Subject: Re: [R] Synchronzing workspaces

 Eric Turkheimer wrote:
  How do people go about synchronizing multiple workspaces on different
  workstations?  I tend to wind up with projects spread around the various
  machines I work on.  I find that placing the directories on a server and
  reading them remotely tends to slow things down.

  If R were to store all its workspace data objects in individual files
 instead of one big .RData file, then you could use a revision control
 system like SVN.  Check out the data, work on it, check it in, then on
 another machine just update to get the changes.

  However SVN doesn't work too well for binary files - conflicts being
 hard to resolve without someone backing down - so maybe its not such a
 good idea anyway...

  On unix boxes and derivatives, you can keep things in sync efficiently
 with the 'rsync' command.  I think there are GUI addons for it, and
 Windows ports.

 Barry



Re: [R] sql query over local tables

2007-08-29 Thread Gabor Grothendieck
I assume that by local tables you mean data frames in R.

You can use the merge function in the base of R, as others have already
mentioned, or if you want to use SQL syntax you can use the sqldf
package.  See example 4 on the sqldf home page:

http://sqldf.googlecode.com
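
A minimal sketch of the merge route, with two made-up data frames (left and right, and their columns, are hypothetical) joined on a shared ID column:

```r
# Two hypothetical data frames sharing an ID column.
left  <- data.frame(ID = c(1, 2, 3), height = c(170, 182, 165))
right <- data.frame(ID = c(2, 3, 4), weight = c(80, 58, 91))

# Inner join: only IDs present in both tables.
inner <- merge(left, right, by = "ID")

# Left join: keep every row of `left`, filling missing weights with NA.
left_join <- merge(left, right, by = "ID", all.x = TRUE)

inner
left_join
```

all.x, all.y and all select the left, right and full outer joins respectively; the default is the inner join.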


On 8/28/07, Jorge Cornejo Donoso [EMAIL PROTECTED] wrote:
 Hi, I have two tables with IDs in each one.

 I want to make a join (as in SQL) by the ID. Is there any way to use the
 RODBC package (or other) on local tables (not Access, MySQL, SQL, etc.) and
 make the join?



 Thanks in advance



Re: [R] Strage result with an append/strptime combination

2007-08-29 Thread Gabor Grothendieck
Try chron:

> library(chron)
> namefile <- "070707050642.dat"    #day-month-year-hour-minute-second.dat
> x <- chron(substr(namefile, 1, 6), substr(namefile, 7, 12),
+   format = c("dmy", "hms"), out.format = c("m/d/y", "h:m:s"))
> c(x, x)
[1] (07/07/07 05:06:42) (07/07/07 05:06:42)

See R News 4/1 Help Desk article for more.
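
If staying in base R is preferred, one possible sketch (an alternative, not the chron solution above) is to convert the strptime result to POSIXct before combining. strptime returns a POSIXlt object, which is a list of date-time components, whereas POSIXct is a plain number of seconds. The tz = "UTC" setting is an assumption made here so that the result does not depend on the local time zone.

```r
namefile <- "070707050642.dat"    # day-month-year-hour-minute-second.dat

# Parse the first 12 characters and convert to POSIXct.
stamp <- as.POSIXct(strptime(substr(namefile, 1, 12), "%d%m%y%H%M%S", tz = "UTC"))

# POSIXct values are plain numbers underneath, so c() combines them cleanly.
both <- c(stamp, stamp)

# Format explicitly, here in the YYYY-MM-DD hh:mm:ss layout.
format(both, "%Y-%m-%d %H:%M:%S", tz = "UTC")
```

Formatting explicitly also avoids any dependence on how a particular class prints by default.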


On 8/29/07, Ptit_Bleu [EMAIL PROTECTED] wrote:

 Hi,

 I keep on trying to write some small scripts in order to learn R but even
 with basic scripts I have problems ...

 I start with the name of a file which is in fact the time the file has been
 generated (I cannot change the format). Then I convert namefile with
 strptime. The problem occurs when I add another time from another file with
 append. It displays some informations I don't want.

 I found a post about this problem
 (http://www.nabble.com/Error-with-strptime-tf3607942.html#a10081942) but I
 don't understand the solution. I tested as.POSIXct or as.POSIX.lt but it has
 no effect.

 Do you have some ideas to solve this problem ?
 Thank you for your help.
 Ptit Bleu.

 ---

 namefile <- "070707050642.dat"    #day-month-year-hour-minute-second.dat
 jourheure <- strptime(namefile, "%d%m%y%H%M%S")

  jourheure
 [1] 2007-07-07 05:06:42

 jourheure <- append(jourheure, jourheure)
  jourheure
 [1] 2007-07-07 05:06:42 Paris, Madrid (heure d'été) 2007-07-07 05:06:42
 Paris, Madrid (heure d'été)



Re: [R] Strage result with an append/strptime combination

2007-08-29 Thread Gabor Grothendieck
Try


fmt <- function(x) with(month.day.year(x),
    sprintf("%02d/%02d/%02d %02d:%02d:%02d", month, day, year,
    hours(x), minutes(x), seconds(x)))
fmt(x)


On 8/29/07, Ptit_Bleu [EMAIL PROTECTED] wrote:

 Thanks Gabor !

 It works.
 Just one more thing : is there a possibility to remove ( and ) before I
 copy the data to a MySQL database.

 Again thank you for the tip.
 Ptit Bleu.


 Gabor Grothendieck wrote:
 
  Try chron:
 
  > library(chron)
  > namefile <- "070707050642.dat"    #day-month-year-hour-minute-second.dat
  > x <- chron(substr(namefile, 1, 6), substr(namefile, 7, 12),
  +   format = c("dmy", "hms"), out.format = c("m/d/y", "h:m:s"))
  > c(x, x)
  [1] (07/07/07 05:06:42) (07/07/07 05:06:42)
 
  See R News 4/1 Help Desk article for more.
 
 
  On 8/29/07, Ptit_Bleu [EMAIL PROTECTED] wrote:
 
  Hi,
 
  I keep on trying to write some small scripts in order to learn R but even
  with basic scripts I have problems ...
 
  I start with the name of a file which is in fact the time the file has
  been
  generated (I cannot change the format). Then I convert namefile with
  strptime. The problem occurs when I add another time from another file
  with
  append. It displays some informations I don't want.
 
  I found a post about this problem
  (http://www.nabble.com/Error-with-strptime-tf3607942.html#a10081942) but
  I
  don't understand the solution. I tested as.POSIXct or as.POSIX.lt but it
  has
  no effect.
 
  Do you have some ideas to solve this problem ?
  Thank you for your help.
  Ptit Bleu.
 
  ---
 
  namefile <- "070707050642.dat"    #day-month-year-hour-minute-second.dat
  jourheure <- strptime(namefile, "%d%m%y%H%M%S")
 
   jourheure
  [1] 2007-07-07 05:06:42
 
  jourheure <- append(jourheure, jourheure)
   jourheure
  [1] 2007-07-07 05:06:42 Paris, Madrid (heure d'été) 2007-07-07
  05:06:42
  Paris, Madrid (heure d'été)
 


Re: [R] Excel

2007-08-29 Thread Gabor Grothendieck
You would still need the interactive GUI to get to the point where it's
at all comparable to Excel.  Using rpad you could construct such
an interface, although it's a bit of work.  Here is an example using
rpad and reshape:

http://www.rpad.org/Rpad/DataExplorer.Rpad
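
For the non-interactive part, a pivot-style cross-tabulation can also be sketched in base R. This is neither the rpad interface nor the reshape package, just an illustration with made-up data (sales and its columns are hypothetical):

```r
sales <- data.frame(
  region  = c("North", "North", "South", "South", "North", "South"),
  product = c("A", "B", "A", "B", "A", "A"),
  amount  = c(10, 20, 5, 8, 7, 3)
)

# Rows = region, columns = product, cells = summed amount.
pivot <- xtabs(amount ~ region + product, data = sales)
pivot

# Row and column totals, as a spreadsheet pivot table would show them.
addmargins(pivot)
```

What this lacks, as noted above, is the drag-and-drop interactivity that makes Excel's pivot tables convenient.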

On 8/29/07, Bert Gunter [EMAIL PROTECTED] wrote:
 Erich:

 This is not a comment either for or against the use of Excel. I only wish to
 point out that AFAICS, Hadley Wickham's reshape package offers all the pivot
 table functionality and more.

 If I am wrong about this, please let me and everyone else know.


 Bert Gunter
 Genentech Nonclinical Statistics


 -Original Message-
 From: [EMAIL PROTECTED]
 [mailto:[EMAIL PROTECTED] On Behalf Of Erich Neuwirth
 Sent: Wednesday, August 29, 2007 11:43 AM
 To: r-help
 Subject: Re: [R] Excel

 Excel bashing can be fun but also can be dangerous because
 you are making your life harder than necessary.
 Statisticians meanwhile know that the numerics of statistical
 computation can be quite bad, therefore one should not use them.
 But using our (we = Thomas Baier + Erich Neuwirth) RExcel addin either
 with the R(D)COM server or with rcom (package on CRAN) allows you to use
 all the nice features of Excel (yes, there are quite a few) and use R
 as the computational engine within Excel. The formula
 =RApply("var",A1:A1000) in an Excel cell for example will use R to
 compute the variance of the data in column A in Excel. If you change any
 of the values in the range A1:A1000, Excel will automatically recompute the
 variance.

 There is one feature in Excel which is extremely convenient, Pivot
 tables. Anybody doing any work as statistical consultant really ought to
 know about Pivot tables, and I am still surprised how many statisticians
 do not know about it. Neither Gnumeric nor OpenOffice Calc offer
 comparably convenient ways working with multidimensional tables.

 I think the answer to the question
 Excel or R of course is Excel and R.



 --
 Erich Neuwirth, University of Vienna
 Faculty of Computer Science
 Computer Supported Didactics Working Group
 Visit our SunSITE at http://sunsite.univie.ac.at
 Phone: +43-1-4277-39464 Fax: +43-1-4277-39459



Re: [R] Efficient way to parse string and construct data.frame

2007-08-28 Thread Gabor Grothendieck
Try this:

> s <- c("1 ,2 ,3", "4 ,5 ,6")
> read.csv(textConnection(s), header = FALSE)
  V1 V2 V3
1  1  2  3
2  4  5  6
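
In later versions of R the same thing can be written without an explicit connection, since read.csv accepts a text argument. A small sketch; strip.white = TRUE is added here as a precaution against the blanks around each field and is not part of the reply above:

```r
s <- c("1 ,2 ,3", "4 ,5 ,6")

# `text =` lets read.csv parse a character vector directly;
# strip.white drops the blanks around each field.
df <- read.csv(text = s, header = FALSE, strip.white = TRUE)
df
str(df)
```

Either way the parsing happens once over the whole vector, rather than once per column as in the strsplit and lapply approach from the question.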



On 8/28/07, yoo [EMAIL PROTECTED] wrote:

 Hi all,

 I have this list of strings
 [1] "1 ,2 ,3"  "4 ,5 ,6"

 Is there an efficient way to convert it to data.frame:
    V1  V2  V3
 1   1   2   3
 2   4   5   6

 Like I can use strsplit to get to a list of split strings.. and then use say
 a = strsplit(mylist, ",")
 data.frame(V1 = lapply(a, function(x){x[1]}), V2 = lapply(a,
 function(x){x[2]}),.)

 but I would loop through that list so many times.. so I'm hesitant to use
 that..

 Thanks a lot for your great help before and this time as well!!
 - boy


Re: [R] Factor levels

2007-08-28 Thread Gabor Grothendieck
You can create your own class and pass that to read.table.  In
the example below Fld2 is read in with factor levels C, A, B
in that order.


library(methods)
setClass("my.levels")
setAs("character", "my.levels",
  function(from) factor(from, levels = c("C", "A", "B")))


### test ###

Input <- "Fld1 Fld2
10 A
20 B
30 C
40 A
"
DF <- read.table(textConnection(Input), header = TRUE,
  colClasses = c("numeric", "my.levels"))
str(DF)
# or
DF <- read.table(textConnection(Input), header = TRUE,
  colClasses = list(Fld2 = "my.levels"))
str(DF)
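
If the goal is simply "levels in the order they first appear in the file", without knowing the levels in advance, one possible sketch is to read the column as character and build the factor afterwards. This is a different technique from the setAs approach above; the input below is made up so that the order of appearance (B, C, A) differs from alphabetical order.

```r
Input <- "Fld1 Fld2
10 B
20 C
30 A
40 B
"

# Read the column as plain character first ...
DF <- read.table(text = Input, header = TRUE, stringsAsFactors = FALSE)

# ... then build the factor with levels in order of first appearance.
DF$Fld2 <- factor(DF$Fld2, levels = unique(DF$Fld2))
levels(DF$Fld2)
```

The same line can be applied to every character column with lapply, which keeps the script free of dataset-specific level lists.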


On 8/28/07, Sébastien [EMAIL PROTECTED] wrote:
 Dear R-users,

 I have found this not-so-recent post in the archives  -
 http://tolstoy.newcastle.edu.au/R/devel/00a/0291.html - while I was
 looking for a particular way to reorder factor levels. The question
 addressed by the author was to know if the read.table function could be
 modified to order the levels of newly created factors according to the
 order that they appear in the data file. Exactly what I am looking for.
 As there was no reply to this post, I wonder if any move have been made
 towards the implementation of this suggestion. A quick look at
 ?read.table tells me that if this option was implemented, it was not in
 the read.table function...

 Sebastien

 PS: I am sorry to post so many messages on the list, but I am learning R
 (basically by trials & errors ;-) ) and no one around me has even a
 slight notion about it...



Re: [R] Factor levels

2007-08-28 Thread Gabor Grothendieck
It's not clear from your description what you want.
Could you be a bit more specific and include an example?

On 8/28/07, Sébastien [EMAIL PROTECTED] wrote:
 Thanks Gabor, I have two questions:

 1- Is there any difference between your code and the following one, with
 regards to Fld2 ?

 ### test ###

Input <- "Fld1 Fld2
10 A
20 B
30 C
40 A
"
DF <- read.table(textConnection(Input), header = TRUE)

DF$Fld2 <- factor(DF$Fld2, levels = c("C", "A", "B"))

 2- Do you see any way to bring flexibility to your method ? Because it
 looks to me as if, at this stage, I have to i) know the order of my levels
 before I read the table and ii) create one class per factor.
 My problem is that I am not really working on a specific dataset. My goal is
 to develop R scripts capable of handling datasets which have various
 contents but close structures. So, I really need to minimize the quantity of
 user-specific code.

 Sebastien



Re: [R] Factor levels

2007-08-28 Thread Gabor Grothendieck
It's the same principle.  Just change the function to be suitable.  This one
arranges the levels according to the input:

library(methods)
setClass("my.factor")
setAs("character", "my.factor",
 function(from) factor(from, levels = unique(from)))

Input <- "a b c
1   1 176 w
2   2 141 k
3   3 172 r
4   4 182 s
5   5 123 k
6   6 153 p
7   7 176 l
8   8 170 u
9   9 140 z
10 10 194 s
11 11 164 j
12 12 100 j
13 13 127 x
14 14 137 r
15 15 198 d
16 16 173 j
17 17 113 x
18 18 144 w
19 19 198 q
20 20 122 f"

DF <- read.table(textConnection(Input), header = TRUE,
  colClasses = list(c = "my.factor"))
str(DF)
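
For data that has already been read, the same idea can be applied after the fact. The following is only a sketch, not part of the original reply: the helper name reorder_by_appearance and the small data frame are made up for illustration, and it assumes every character or factor column should be treated this way.

```r
# Hypothetical helper: turn every character/factor column into a factor
# whose levels follow the order of first appearance in the data.
reorder_by_appearance <- function(df) {
  is_cat <- vapply(df, function(col) is.factor(col) || is.character(col),
                   logical(1))
  df[is_cat] <- lapply(df[is_cat], function(col) {
    col <- as.character(col)
    factor(col, levels = unique(col))
  })
  df
}

# Made-up data, using the first six values of column c from the example
mydata <- data.frame(a = 1:6, c = c("w", "k", "r", "s", "k", "p"),
                     stringsAsFactors = TRUE)
levels(mydata$c)                         # alphabetical order
levels(reorder_by_appearance(mydata)$c)  # order of appearance
```

No level order is hard-coded, so the same call can be used on data sets whose content is not known in advance.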


On 8/28/07, Sébastien [EMAIL PROTECTED] wrote:
 OK, I cannot send you one of my datasets since they are confidential. But
 I can produce a dummy mini dataset to illustrate my question. Let's say I
 have a csv file with 3 columns and 20 rows whose content is reproduced by
 the following line.

 > mydata <- data.frame(a = 1:20,
 +   b = sample(100:200, 20, replace = TRUE),
 +   c = sample(letters[1:26], 20, replace = TRUE))
 > mydata
 a   b c
 1   1 176 w
 2   2 141 k
 3   3 172 r
 4   4 182 s
 5   5 123 k
 6   6 153 p
 7   7 176 l
 8   8 170 u
 9   9 140 z
 10 10 194 s
 11 11 164 j
 12 12 100 j
 13 13 127 x
 14 14 137 r
 15 15 198 d
 16 16 173 j
 17 17 113 x
 18 18 144 w
 19 19 198 q
 20 20 122 f

 If I had to read the csv file, I would use something like:
 mydata <- data.frame(read.table(file = "c:/test.csv", header = TRUE))

 Now, if you look at mydata$c, the levels are alphabetically ordered.
 > mydata$c
  [1] w k r s k p l u z s j j x r d j x w q f
 Levels: d f j k l p q r s u w x z

 What I am trying to do is reorder the levels so that they are in the order
 in which they appear in the table, i.e.
 Levels: w k r s p l u z j x d q f

 Again, keep in mind that my script should be usable on datasets whose content
 is unknown to me. In my example, I have used letters for mydata$c, but my
 code may have to handle factors of numeric or character values (I need to
 transform specific columns of my dataset into factors for plotting
 purposes). My goal is to let the code scan the content of each factor of my
 data.frame during or after the read.table step and reorder their levels
 automatically, without having to ask the user to hard-code the level order.

 In a way, my problem is more related to the way the factor levels are
 ordered than to the read.table function, although I guess there is a link...


Re: [R] Nodes edges with similarity matrix

2007-08-28 Thread Gabor Grothendieck
Try this:

# test data
mat <- structure(c(1, 0.325141612, 0.002109751, 0.250153137, 0.0223676,
1, 0.342654, 0.1987485, 0.9723831, 0.9644216, 1, 0.7391222, 0.394331,
0.5460461, 0.7080224, 1), .Dim = c(4L, 4L), .Dimnames = list(
c("a", "b", "c", "d"), c("a", "b", "c", "d")))

library(sna)

# draw edges according to value
gplot(mat, edge.lwd = mat, label = rownames(mat))

# thresholding at 0.5
gplot(mat > .5, label = rownames(mat))
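
If an explicit list of nodes and edges is wanted rather than a plot, the thresholding step can also be done in base R. This is a sketch under the assumption that an edge is any off-diagonal entry above the cut-off; the names threshold and edges are illustrative only.

```r
# Same test matrix as above, rebuilt so the block is self-contained
mat <- matrix(c(1, 0.325141612, 0.002109751, 0.250153137,
                0.0223676, 1, 0.342654, 0.1987485,
                0.9723831, 0.9644216, 1, 0.7391222,
                0.394331, 0.5460461, 0.7080224, 1),
              nrow = 4, dimnames = list(letters[1:4], letters[1:4]))

threshold <- 0.5                                 # assumed cut-off
keep <- mat > threshold & row(mat) != col(mat)   # drop the diagonal
idx <- which(keep, arr.ind = TRUE)               # (row, col) index pairs

edges <- data.frame(from   = rownames(mat)[idx[, "row"]],
                    to     = colnames(mat)[idx[, "col"]],
                    weight = mat[idx])
edges
```

Because the example matrix is not symmetric, the edges are directed; a symmetric matrix would give each pair twice unless only the upper triangle is kept.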


On 8/28/07, H. Paul Benton [EMAIL PROTECTED] wrote:
 Hello,

I apologise if someone has already answered this but I searched and
 googled but didn't find anything.

I have a matrix which gives me the similarity of each item to each
 other. I would like to turn this matrix into something like what they
 have in the graph package with the nodes and edges.
 http://cran.r-project.org/doc/packages/graph.pdf . However I cannot find
 a method to convert my matrix to an object that graph can use.

 my similarity matrix looks like:
  sim[1:4,]
a  b  c  d
 [a]  1.0  0.0223676  0.9723831  0.3943310
 [b]  0.325141612  1.000  0.9644216  0.5460461
 [c]  0.002109751  0.3426540  1.000  0.7080224
 [d]  0.250153137  0.1987485  0.7391222  1.000

 Please don't get caught up with the numbers; I simply made them up to illustrate.
 I have not yet produced the code to make my similarity matrix.

 Does anyone know a method to do this, or do I have to write something? :(
 If I do, any starter code? :D jj. If I've read something wrong or
 misunderstood, my apologies.

 cheers,


 Paul


 --
 Research Technician
 Mass Spectrometry
   o The
  /
 o Scripps
  \
   o Research
  /
 o Institute



Re: [R] How to provide argument when opening RGui from an external application

2007-08-26 Thread Gabor Grothendieck
There are also some batch files that can be used with Rscript on XP and info
in the README here:

   http://batchfiles.googlecode.com
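
As a rough illustration of the Rscript route, a generated script can read its arguments with commandArgs(). The script name myscript.R, the argument layout and the helper parse_args are assumptions made for this sketch, not something taken from the batch files above.

```r
# myscript.R (hypothetical name); it could be launched from another
# application with a command line such as:
#   Rscript myscript.R data.csv 10
parse_args <- function(args) {
  if (length(args) < 1) stop("usage: Rscript myscript.R <input-file> [n-rows]")
  list(input  = args[1],
       n_rows = if (length(args) >= 2) as.integer(args[2]) else 5L)
}

# Only the arguments that follow the script name
args <- commandArgs(trailingOnly = TRUE)
if (length(args) > 0) {
  opts <- parse_args(args)
  print(head(read.csv(opts$input), opts$n_rows))
}
```

The calling application then only needs to start Rscript with the script path and its arguments; no GUI and no source() call are involved.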


On 8/26/07, Sébastien [EMAIL PROTECTED] wrote:
 Thanks for your reply.
 When you say "look into Rscript.exe", do you have a specific document in
 mind ? I tried to google it but could not find much... I forgot to
 mention in my first email that I am working under the Windows XP
 environment.

 Prof Brian Ripley a écrit :
  Look into Rscript.exe (on Windows), which is a flexible way to run
  scripts.  Neither using a GUI nor using source() are recommended.
 
  On Fri, 24 Aug 2007, Sébastien wrote:
 
  Dear R-users,
 
  I have written a small application (in Visual Basic) that automatically
  generates some R scripts. I would like to execute these scripts when my
  application is being closed.
  My problem is that I don't know how to pass the
  'source("c:/.../myscript.r")' instruction when I programmatically start
  RGui. Tinn-R is capable of doing such things, so I guess there must be a
  way to pass arguments to RGui.
 
  Any advice or link to relevant references would be greatly appreciated.
 
  Sebastien
 



Re: [R] How to make an array of data.frames?

2007-08-26 Thread Gabor Grothendieck
Is this what you want:

DF1 <- DF2 <- DF3 <- df1 <- df2 <- df3 <- head(iris)
list(a = list(DF1, DF2, DF3), b = list(df1, df2, df3))

or

x <- list()
x$a <- list(DF1, DF2, DF3)
x$b <- list(df1, df2, df3)
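
To get access by name at both levels, and to fill the top-level entries one at a time, the inner lists can be named as well. A small sketch; the data frames are stand-ins from built-in data sets, and the names dataframe1 to dataframe3 simply follow the question.

```r
x <- list()

# Fill in "a", "b", "c" one at a time; each entry is a named list of data frames
for (key in c("a", "b", "c")) {
  x[[key]] <- list(dataframe1 = head(iris),
                   dataframe2 = head(cars),
                   dataframe3 = head(mtcars))
}

names(x)                   # the top-level keys
x$a$dataframe1             # access with $
x[["b"]][["dataframe3"]]   # same idea with [[ ]], useful when names are in variables
```

Note that x["a"] (single brackets) returns a list of length one, so x[["a"]] or x$a is what is needed to reach the data frames.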


On 8/26/07, Werner Wernersen [EMAIL PROTECTED] wrote:
 Hi,

 I am still struggling with the data structures in R. I
 know how it works in C++ but how can I get such a
 structure in R?

 Here is what I want:
 x["a"]$dataframe1
 x["a"]$dataframe2
 x["a"]$dataframe3
 x["b"]$dataframe1
 x["b"]$dataframe2
 x["b"]$dataframe3
 x["c"]$dataframe1
 x["c"]$dataframe2
 x["c"]$dataframe3

 And it would be nice if I could fill in objects "a",
 "b", "c" one at a time successively.

 What is the easiest way to get such a data structure?
 It would be great if someone could give me some help
 with this.

 Many thanks and kind regards,
  Werner



Re: [R] How to make an array of data.frames?

2007-08-26 Thread Gabor Grothendieck
That gives you a list of data frames. An array is a vector with a dim
attribute, so to make it into an array add the appropriate dim attribute.

If x is the list we created before then:

dim(x) <- 2

gives us an array of length 2, each element of which is a list of 3 elements,
or

dim(x) <- 1:2

gives a 1x2 array

or

y <- list(DF1, DF2, DF3, df1, df2, df3)
dim(y) <- 3:2

gives a 3x2 array so you can write y[[1,2]], for example.
etc.
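
Adding dimnames to such a list array allows indexing by name as well as by position. A sketch only; the row and column labels and the built-in data sets are chosen for illustration.

```r
# Six data frames stored in a 3 x 2 list array (filled column by column)
dfs <- list(head(iris), head(cars), head(mtcars),
            tail(iris), tail(cars), tail(mtcars))
dim(dfs) <- c(3, 2)
dimnames(dfs) <- list(paste0("dataframe", 1:3), c("a", "b"))

dfs[[2, 2]]                # by position: second row, second column
dfs[["dataframe2", "b"]]   # the same element, by name
dim(dfs)
```

Both index forms above return the data frame produced by tail(cars), the fifth element of the underlying list.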



Re: [R] Program of matrix of seasonal dummy variable(Econometrics)

2007-08-26 Thread Gabor Grothendieck
Try this:

> kronecker(rep(1, 3), diag(4))
      [,1] [,2] [,3] [,4]
 [1,]    1    0    0    0
 [2,]    0    1    0    0
 [3,]    0    0    1    0
 [4,]    0    0    0    1
 [5,]    1    0    0    0
 [6,]    0    1    0    0
 [7,]    0    0    1    0
 [8,]    0    0    0    1
 [9,]    1    0    0    0
[10,]    0    1    0    0
[11,]    0    0    1    0
[12,]    0    0    0    1
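
The kronecker() call needs the number of rows to be a whole number of years. For an arbitrary number of observations, such as the T = 100 in the question, one possible alternative is to compute the season of each row with modular arithmetic. A sketch; the function name seasonal_dummies is made up here.

```r
# Hypothetical helper: n_obs x n_seasons matrix of 0/1 seasonal dummies
seasonal_dummies <- function(n_obs, n_seasons = 4) {
  season <- (seq_len(n_obs) - 1) %% n_seasons + 1   # 1, 2, 3, 4, 1, 2, ...
  # Compare each row's season with each column index; * 1 turns TRUE/FALSE into 1/0
  outer(season, seq_len(n_seasons), "==") * 1
}

br <- seasonal_dummies(100)
dim(br)
head(br, 8)
```

Each row contains exactly one 1, in the column of its season, and the first 12 rows agree with the kronecker() result above.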



On 8/26/07, [EMAIL PROTECTED] [EMAIL PROTECTED] wrote:
 Dear R users,
 I would like to construct a matrix of seasonal dummy variables. Such a matrix
 can be written as follows (i.e. with dimensions (T, 4)):
 1   0   0   0
 0   1   0   0
 0   0   1   0
 0   0   0   1
 1   0   0   0
 0   1   0   0
 0   0   1   0
 0   0   0   1
 1   0   0   0
 0   1   0   0
 0   0   1   0
 0   0   0   1
 .   .   .   .
 .   .   .   .
  etc.
 I have written the following small program:
 T=100
 br<-matrix(0,T,4)
 for (i in 1:T)
 {
 + for (j in 1:4)
 {
 + if i=j
 {+ br[i,j]=1
 + }
 + if else (abs(i-j)%%4==0)
 {+ br[i,j]=1
 +}
 + else
 {+ br[i,j]=0
 +}
 +}
 +}
 I obtained the following messages from the R console:
 > T=100
 > br<-matrix(0,T,4)
 > for (i in 1:T)
 +  {
 + + for (j in 1:4)
 + {
 ++ if i=j
 Error: syntax error, unexpected SYMBOL, expecting '(' in:
 
 
 
 > {+ br[i,j]=1
 + + }
 Error: syntax error, unexpected '}' in {+ br[i,j]=1
 
 + if else (abs(i-j)%%4==0)
 Error: syntax error, unexpected ELSE, expecting '(' in + if else
 > {+ br[i,j]=1
 + +}
 Error: syntax error, unexpected '}' in {+ br[i,j]=1
 + else
 Error: syntax error, unexpected ELSE in + else
 > {+ br[i,j]=0
 + +}
 Error: syntax error, unexpected '}' in {+ br[i,j]=0
 > +}
 Error: syntax error, unexpected '}' in +}
 > +}
 Error: syntax error, unexpected '}' in +}
 
 I would be grateful if you could correct my program so that I obtain this
 matrix of seasonal dummies. Many thanks in advance.


Re: [R] subset using noncontiguous variables by name (not index)

2007-08-26 Thread Gabor Grothendieck
Using the built-in data frame anscombe, try this. First we set up a data frame
anscombe.seq which has one row containing 1, 2, 3, ... .  Then select
out from that data frame and unlist it to get the desired index vector.

> anscombe.seq <- replace(anscombe[1,], TRUE, seq_along(anscombe))
> idx <- unlist(subset(anscombe.seq, select = c(x1, x3:x4, y2)))
> anscombe[idx]
   x1 x3 x4   y2
1  10 10  8 9.14
2   8  8  8 8.14
3  13 13  8 8.74
4   9  9  8 8.77
5  11 11  8 9.26
6  14 14  8 8.10
7   6  6  8 6.13
8   4  4 19 3.10
9  12 12  8 9.13
10  7  7  8 7.26
11  5  5  8 4.74
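
The reason x3:x4 works inside subset() is that the select expression is evaluated in a list where each column name stands for its position. The same mechanism can be sketched directly, which also makes it possible to keep the selection in a variable. The helper name select_cols is hypothetical, and the selection is assumed to be stored as a string.

```r
# Hypothetical helper: evaluate a selection such as "c(x1, x3:x4, y2)"
# with every column name bound to its column position.
select_cols <- function(df, selection) {
  positions <- as.list(seq_along(df))
  names(positions) <- names(df)
  idx <- eval(parse(text = selection)[[1]], positions, parent.frame())
  df[idx]
}

myVars <- "c(x1, x3:x4, y2)"           # selection stored in a variable
names(select_cols(anscombe, myVars))
head(select_cols(anscombe, myVars), 3)
```

In anscombe, x1, x3, x4 and y2 are columns 1, 3, 4 and 6, so the expression evaluates to c(1, 3, 4, 6) before the data frame is indexed.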


On 8/26/07, Muenchen, Robert A (Bob) [EMAIL PROTECTED] wrote:
 Hi All,

 I'm using the subset function to select a list of variables, some of
 which are contiguous in the data frame, and others of which are not. It
 works fine when I use the form:

 subset(mydata,select=c(x1,x3:x5,x7) )

 In reality, my list is far more complex. So I would like to store it in
 a variable to substitute in for c(x1,x3:x5,x7) but cannot get it to
 work. That use of the c function seems to violate R rules, so I'm not
 sure how it works at all. A small simulation of the problem is below.

 If the variable names & orders were really this simple, I could use
 indices like

 summary( mydata[ ,c(1,3:5,7) ] )

 but alas, they are not.

 How does the c function work this way in the first place, and how can I
 make this substitution?

 Thanks,
 Bob

 mydata <- data.frame(
  x1=c(1,2,3,4,5),
  x2=c(1,2,3,4,5),
  x3=c(1,2,3,4,5),
  x4=c(1,2,3,4,5),
  x5=c(1,2,3,4,5),
  x6=c(1,2,3,4,5),
  x7=c(1,2,3,4,5)
 )
 mydata

 # This does what I want.
 summary(
  subset(mydata,select=c(x1,x3:x5,x7) )
 )

 # Can I substitute myVars?
 attach(mydata)
 myVars1 <- c(x1,x3:x5,x7)

 # Not looking good!
 myVars1

 # This doesn't do the right thing.
 summary(
  subset(mydata,select=myVars1 )
 )

 # Total desperation on this attempt:
 myVars2 <- "x1,x3:x5,x7"
 myVars2

 # This doesn't work either.
 summary(
  subset(mydata,select=myVars2 )
 )



 =
 Bob Muenchen (pronounced Min'-chen), Manager
 Statistical Consulting Center
 U of TN Office of Information Technology
 200 Stokely Management Center, Knoxville, TN 37996-0520
 Voice: (865) 974-5230
 FAX: (865) 974-4810
 Email: [EMAIL PROTECTED]
 Web: http://oit.utk.edu/scc,
 News: http://listserv.utk.edu/archives/statnews.html



Re: [R] subset using noncontiguous variables by name (not index)

2007-08-26 Thread Gabor Grothendieck
Try this:

> "%:%" <- function(x, y) {
+    prex <- gsub("[0-9]", "", x); postx <- gsub("[^0-9]", "", x)
+    prey <- gsub("[0-9]", "", y); posty <- gsub("[^0-9]", "", y)
+    stopifnot(prex == prey)
+    paste(prex, seq(from = as.numeric(postx), to = as.numeric(posty)),
+          sep = "")
+ }
> "x2" %:% "x4"
[1] "x2" "x3" "x4"


On 8/26/07, Muenchen, Robert A (Bob) [EMAIL PROTECTED] wrote:
 Thanks Bert & Gabor for two very interesting solutions!

 It would be very handy in R if "string1":"stringN" generated
 "string1","string2"..."stringN"; it would make selections like this much
 more obvious. I know it's easy to do with the colon operator and paste
 function, but that's quite a step up in complexity compared to SAS' "x1
 x3-x4 y2" or SPSS' "x1,x3 to x4, y2". And it's complexity that beginners
 face early in learning R.

 While on the subject of the colon operator, why doesn't anscombe[[1:4]]
 select the x variables in list form as anscombe[,1:4] or anscombe[1:4]
 do in data frame form?

 Thanks,

 Bob

 =
 Bob Muenchen (pronounced Min'-chen), Manager
 Statistical Consulting Center
 U of TN Office of Information Technology
 200 Stokely Management Center, Knoxville, TN 37996-0520
 Voice: (865) 974-5230
 FAX: (865) 974-4810
 Email: [EMAIL PROTECTED]
 Web: http://oit.utk.edu/scc,
 News: http://listserv.utk.edu/archives/statnews.html
 =


  -Original Message-
  From: Bert Gunter [mailto:[EMAIL PROTECTED]
  Sent: Sunday, August 26, 2007 6:50 PM
  To: 'Gabor Grothendieck'; Muenchen, Robert A (Bob)
  Cc: r-help@stat.math.ethz.ch
  Subject: RE: [R] subset using noncontiguous variables by name (not
  index)
 
  The problem is that x3:x5 does not mean what you think it means. The only
  reason it does the right thing in subset() is because a clever trick is used
  there (read the code -- it's not hard to understand) to ensure that it does.
  Gabor has essentially mimicked that trick in his solution.
 
  However, it is not necessary to do this. You can construct the call
  directly, as you tried to do. Using the anscombe example, here's how:
 
  chooz <- "c(x1,x3:x4,y2)"  ## enclose the desired expression in quotes
  do.call (subset, list( x = anscombe, select = parse(text = chooz)))
 
  -- Bert Gunter
  Genentech Non-Clinical Statistics
  South San Francisco, CA
 
  The business of the statistician is to catalyze the scientific
  learning
  process.  - George E. P. Box
 
 
 

Re: [R] Extracting a range of elements from a vector

2007-08-25 Thread Gabor Grothendieck
See ?embed
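
For the non-overlapping blocks used in a poker test, reshaping the vector into a matrix may be simpler than embed(), which produces overlapping windows. The following is only a sketch using the bit vector from the question; the variable names are illustrative, and bits left over at the end are assumed to be discarded.

```r
bits <- c(0, 1, 1, 0, 0, 1, 0, 1, 1)
block_size <- 3

bits[1:block_size]   # a range of elements is selected with an index vector

# One block per column; drop any incomplete block at the end
n_blocks <- length(bits) %/% block_size
blocks <- matrix(bits[seq_len(n_blocks * block_size)], nrow = block_size)

# Value of each block read as a binary number (first bit most significant)
values <- as.vector(2^((block_size - 1):0) %*% blocks)

# How often each of the 2^block_size possible values occurs
counts <- tabulate(values + 1, nbins = 2^block_size)
names(counts) <- 0:(2^block_size - 1)
counts
```

Everything here is vectorised, so no explicit loop over the bits is needed, which matters when testing millions of bits.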

On 8/25/07, Otis Laws [EMAIL PROTECTED] wrote:
 Dear R users

 I am an R newbie creating a function that implements the poker test to test
 pseudo random bit generators.
 I am reading the bits from a text file (1 bit per line), which causes
 each bit to be stored in an element of a numeric vector.

 What I am trying to do is to extract a block of bits of arbitrary size
 from the original vector into a smaller numeric vector and then count
 this binary number
 (and keep repeating this until the end of the vector, so that I get a
 vector containing the number of times each binary number has occurred), e.g.

 original vector:
 0,1,1,0,0,1,0,1,1

 using a block size of 3 bits the first smaller vector becomes:
 0,1,1

 At the moment I do this by iterating through the original vector and setting
 the ith element of the smaller vector.
 I have looked at using the subset() function but it seems to operate on
 a vector's content rather than its index.

 This raises the following two main questions:
 1. Is there a way to specify a range of vector elements?
 2. Is this the most efficient method, since this could be extremely time
 consuming when used to test millions of bits?


 Thanks very much in advance

 Otis Laws



Re: [R] Character position command

2007-08-25 Thread Gabor Grothendieck
See ?regexpr to get the position; however, using sub we could remove
the dot and everything after it in one go.  See ?regexp and ?sub .
Also there are some links to info on regular expressions in the Links
box on this page:
http://gsubfn.googlecode.com

> n <- regexpr(".", "apples.pears", fixed = TRUE)
> substr("apples.pears", 1, n-1)
[1] "apples"


> sub("[.].*", "", "apples.pears")
[1] "apples"

On 8/25/07, Mitchell Hoffman [EMAIL PROTECTED] wrote:
 This is a very simple question, so I apologize I couldn't find it online:

 I want to shorten the string 'apples.pears' to 'apples'.

 string='apples.pears'
 string1=substr(string,0,x)

 For x above, I would like to have a command like charAt(string,"."), i.e.
 the position of the period in the word, but I can't seem to find a charAt
 command in R.

 Thank you.



Re: [R] How to shade vertical bands in a graph?

2007-08-24 Thread Gabor Grothendieck
There is an example using classic graphics here:

   http://www.mayin.org/ajayshah/KB/R/html/g5.html

and one using lattice graphics here:

   library(zoo)
   ?xyplot.zoo
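
For a quick version in classic graphics, one possible approach is to draw the bands with rect() before the data. This is a sketch with simulated data; the band limits in the data frame bands are invented for the example.

```r
set.seed(1)
x <- 1:100
y <- cumsum(rnorm(100))

# Assumed x-ranges to highlight
bands <- data.frame(from = c(20, 60), to = c(30, 75))

plot(x, y, type = "n")                 # set up the axes without drawing the data
usr <- par("usr")                      # plot region limits: x1, x2, y1, y2
rect(bands$from, usr[3], bands$to, usr[4], col = "yellow", border = NA)
lines(x, y)                            # draw the series on top of the bands
box()                                  # redraw the frame over the band edges
```

Drawing the bands first and the lines afterwards keeps the data visible on top of the shading.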


On 8/23/07, del pes [EMAIL PROTECTED] wrote:

 Hello,

 I would like to draw vertical yellow bands in my graph, but could not find 
 how to do that in the documentation.

 I set up a page to show what I would like to achieve: 
 http://rstudent.blogg.de/eintrag.php?id=1 (the first picture was manually 
 colored with the Gimp).

 Any help would be welcome...

 All the best,

 Delfina


Re: [R] It is possible to use a Shell command inside a R script?

2007-08-24 Thread Gabor Grothendieck
What OS was that on?

On 8/24/07, Alberto Monteiro [EMAIL PROTECTED] wrote:
 Ronaldo Reis Junior wrote:
 
  Is it possible to use a shell command inside an R script?
 
  I'm writing an R script and I would like to put some shell commands inside
  it. Something like: convert fig01.png fig01.xpm or sed ..., etc.
 
  Is it possible? How?
 
 ?system

 BTW, I found that using things directly in R is _much_
 slower than creating a batch file and then running it.

 For example, I had a directory with misnamed mp3 files,
 and I wanted to use R to rename and copy them
 to another directory. I tried to use file.copy, but it
 took too much time. Writing a batch file and then running
 it was much faster.

 Alberto Monteiro



Re: [R] It is possible to use a Shell command inside a R script?

2007-08-24 Thread Gabor Grothendieck
On 8/24/07, Dirk Eddelbuettel [EMAIL PROTECTED] wrote:
 On Fri, Aug 24, 2007 at 08:32:00AM -0400, Duncan Murdoch wrote:
  On 8/24/2007 6:58 AM, Ronaldo Reis Junior wrote:
   Hi,
  
   Is it possible to use a shell command inside an R script?
  
   I'm writing an R script and I would like to put some shell commands inside
   it. Something like: convert fig01.png fig01.xpm or sed ..., etc.
 
  The details and available functions depend on the platform, but you want
  to look at ?system, ?shell, and/or ?shell.exec.  (These all exist in
  Windows; on Unix-alikes, you probably won't have the latter two.)

 Don't forget pipes.

 R's ability to consistently work on connections that may be local
 files, remote files, program output, ... is a true treasure (and
 thanks and credits to, I believe, Brian Ripley for making it so).

 E.g. you can do this

  OD <- read.table(
    pipe("links -dump http://cran.r-project.org/src/contrib/ | awk '/tar.gz/ {print $3, $4}'"),
    header=FALSE, col.names=c("file", "date"))

 to get files and dates of files on CRAN.

 As I recall, this also works on that other operating system, provided
 you do all the legwork of installing other tools, setting PATHs etc
 to provide what works out of the box on the supposedly unfriendlier OS.


Or commonly we can just do it entirely within R.  In the example discussed
we read in the lines, grep out the tar.gz lines, split each line into
fields and select the desired columns, delete the junk and reformat it all
into a data frame:

> Lines <- readLines("http://cran.r-project.org/src/contrib/")
> tar.gz.Lines <- grep("tar.gz", Lines, value = TRUE)
> raw.fields <- do.call(rbind, strsplit(tar.gz.Lines, "</td>"))[, 2:3]
> mat <- apply(raw.fields, 2, gsub, pattern = "</a>|.*\">| *$",
+   replacement = "")
> DF <- data.frame(file = mat[,1],
+   date = strptime(mat[,2], "%d-%b-%Y %H:%M"),
+   stringsAsFactors = FALSE)
> head(DF)
                             file                date
1             ADaCGH_1.3-1.tar.gz 2007-05-14 12:04:00
2                  AIS_1.0.tar.gz 2007-07-31 16:38:00
3             AMORE_0.2-10.tar.gz 2007-04-11 10:17:00
4               ARES_1.2-2.tar.gz 2007-03-19 20:53:00
5 AcceptanceSampling_0.1-1.tar.gz 2007-07-07 20:46:00
6           AdaptFit_0.2-1.tar.gz 2007-08-04 09:51:00



Re: [R] It is possible to use a Shell command inside a R script?

2007-08-24 Thread Gabor Grothendieck
On 8/24/07, Dirk Eddelbuettel [EMAIL PROTECTED] wrote:
 On Fri, Aug 24, 2007 at 10:57:46AM -0400, Duncan Murdoch wrote:
  On 8/24/2007 10:33 AM, Dirk Eddelbuettel wrote:
  On Fri, Aug 24, 2007 at 08:32:00AM -0400, Duncan Murdoch wrote:
  On 8/24/2007 6:58 AM, Ronaldo Reis Junior wrote:
   Hi,
It is possible to use a shell command inside a R script?
I'm write a R script and I like to put somes shell commands inside to
  R.  Somethink like: convert fig01.png fig01.xpm or sed ..., etc.
  The details and available functions depend on the platform, but you want
  to look at ?system, ?shell, and/or ?shell.exec.  (These all exist in
  Windows; on Unix-alikes, you probably won't have the latter two.)
  Don't forget pipes. R's ability to consistently work on connections that
  may be local
  files, remotes files, program output, ... is a true treasure (and
  thanks and credits to, I believe, Brian Ripley to make it so).
  Eg you can do this   OD - read.table(pipe(links -dump
  http://cran.r-project.org/src/contrib/ | awk '/tar.gz/ {print $3, $4}'),
  header=FALSE, col.names=c(file, date))
  to get files and dates of files on CRAN.   As I recall, this also works on
  that other operating system, provided
  you do all the legwork of installing other tools, setting PATHs etc
  to provide what works out of the box on the supposedly unfriendlier OS.
 
  The pipe command you list doesn't work in Windows.  I'd guess this is
  because the pipe syntax | within the command is unsupported:  it tries to
  execute links, with the rest of the line passed as arguments.  But I
  haven't traced through to check on this.

 Hm, wishful thinking must have gotten the better of me then. Sorry for
 spreading misinformation about the capabilities of that other OS.

This works for me on Windows:

> tab <- read.table(pipe("lynx --nolist --dump http://cran.r-project.org/src/contrib/ | findstr tar.gz"), as.is = TRUE)
> head(tab[3:5])
                               V3          V4    V5
1             ADaCGH_1.3-1.tar.gz 14-May-2007 12:04
2                  AIS_1.0.tar.gz 31-Jul-2007 16:38
3             AMORE_0.2-10.tar.gz 11-Apr-2007 10:17
4               ARES_1.2-2.tar.gz 19-Mar-2007 20:53
5 AcceptanceSampling_0.1-1.tar.gz 07-Jul-2007 20:46
6           AdaptFit_0.2-1.tar.gz 04-Aug-2007 09:51



Re: [R] It is possible to use a Shell command inside a R script?

2007-08-24 Thread Gabor Grothendieck
On 8/24/07, Duncan Murdoch [EMAIL PROTECTED] wrote:
 On 8/24/2007 1:05 PM, Gabor Grothendieck wrote:
  On 8/24/07, Dirk Eddelbuettel [EMAIL PROTECTED] wrote:
  On Fri, Aug 24, 2007 at 10:57:46AM -0400, Duncan Murdoch wrote:
   On 8/24/2007 10:33 AM, Dirk Eddelbuettel wrote:
   On Fri, Aug 24, 2007 at 08:32:00AM -0400, Duncan Murdoch wrote:
   On 8/24/2007 6:58 AM, Ronaldo Reis Junior wrote:
Hi,
 It is possible to use a shell command inside a R script?
 I'm write a R script and I like to put somes shell commands inside 
 to
   R.  Somethink like: convert fig01.png fig01.xpm or sed ..., etc.
   The details and available functions depend on the platform, but you 
   want
   to look at ?system, ?shell, and/or ?shell.exec.  (These all exist in
   Windows; on Unix-alikes, you probably won't have the latter two.)
   Don't forget pipes. R's ability to consistently work on connections that
   may be local
   files, remotes files, program output, ... is a true treasure (and
   thanks and credits to, I believe, Brian Ripley to make it so).
   Eg you can do this   OD - read.table(pipe(links -dump
   http://cran.r-project.org/src/contrib/ | awk '/tar.gz/ {print $3, 
   $4}'),
   header=FALSE, col.names=c(file, date))
   to get files and dates of files on CRAN.   As I recall, this also works 
   on
   that other operating system, provided
   you do all the legwork of installing other tools, setting PATHs etc
   to provide what works out of the box on the supposedly unfriendlier OS.
  
   The pipe command you list doesn't work in Windows.  I'd guess this is
   because the pipe syntax | within the command is unsupported:  it tries 
   to
   execute links, with the rest of the line passed as arguments.  But I
   haven't traced through to check on this.
 
  Hm, wishful thinking must have gotten the better of me then. Sorry for
  spreading misinformation about the capabilities of that other OS.
 
  This works for me on Windows:
 
  tab <- read.table(pipe("lynx --nolist --dump http://cran.r-project.org/src/contrib/ | findstr tar.gz"), as.is = TRUE)

 Which R version is that?  It doesn't work for me in Rgui, though it does
  in Rterm, both R-devel versions.


I am using Rgui

> R.version.string
[1] "R version 2.5.1 (2007-06-27)"

on Windows XP.  lynx --version gives:

Lynx Version 2.8.5rel.1 (04 Feb 2004)
libwww-FM 2.14FM, SSL-MM 1.4.1, OpenSSL 0.9.7d-dev
Compiled by Borland C++ (Feb  5 2004 17:35:58).

Copyrights held by the University of Kansas, CERN, and other contributors.
Distributed under the GNU General Public License.
See http://lynx.isc.org/ and the online help for more information.

See http://www.moxienet.com/lynx/ for information about SSL for Lynx.
See http://www.openssl.org/ for information about OpenSSL.



Re: [R] Turning a logical vector into its indices without losing its length

2007-08-24 Thread Gabor Grothendieck
On 8/24/07, Gabor Grothendieck [EMAIL PROTECTED] wrote:
 Here are two solutions:

  logvec <- c(TRUE,FALSE,FALSE,TRUE,FALSE,FALSE,TRUE,FALSE)

  ifelse(logvec, seq_along(logvec), 0)
 [1] 1 0 0 4 0 0 7 0

  replace(logvec * 0, logvec, which(logvec))
 [1] 1 0 0 4 0 0 7 0

Actually the * 0 is not needed.  The last one could simply be:

replace(logvec, logvec, which(logvec))




 On 8/24/07, Leeds, Mark (IED) [EMAIL PROTECTED] wrote:
  I have the code below which gives me what I want for temp based on
  logvec but I was wondering if there was a shorter way ( i.e :
  a one liner ) without having to initialize temp to zeros.  This is
  purely for learning purposes. Thanks.
 
  logvec <- c(TRUE,FALSE,FALSE,TRUE,FALSE,FALSE,TRUE,FALSE)
 
  temp <- numeric(length(logvec))
  temp[logvec] <- which(logvec)
  temp
 
  [1] 1 0 0 4 0 0 7 0
 
  obviously, the code below doesn't work.
 
  temp <- which(logvec)
   temp
  [1] 1 4 7
  
 
 




Re: [R] Turning a logical vector into its indices without losing its length

2007-08-24 Thread Gabor Grothendieck
Here are two solutions:

> logvec <- c(TRUE,FALSE,FALSE,TRUE,FALSE,FALSE,TRUE,FALSE)

> ifelse(logvec, seq_along(logvec), 0)
[1] 1 0 0 4 0 0 7 0

> replace(logvec * 0, logvec, which(logvec))
[1] 1 0 0 4 0 0 7 0


On 8/24/07, Leeds, Mark (IED) [EMAIL PROTECTED] wrote:
 I have the code below which gives me what I want for temp based on
 logvec but I was wondering if there was a shorter way ( i.e :
 a one liner ) without having to initialize temp to zeros.  This is
 purely for learning purposes. Thanks.

 logvec <- c(TRUE,FALSE,FALSE,TRUE,FALSE,FALSE,TRUE,FALSE)

 temp <- numeric(length(logvec))
 temp[logvec] <- which(logvec)
 temp

 [1] 1 0 0 4 0 0 7 0

 obviously, the code below doesn't work.

 temp <- which(logvec)
  temp
 [1] 1 4 7
 





Re: [R] Turning a logical vector into its indices without losing its length

2007-08-24 Thread Gabor Grothendieck
On 8/24/07, Gabor Grothendieck [EMAIL PROTECTED] wrote:
 On 8/24/07, Gabor Grothendieck [EMAIL PROTECTED] wrote:
  Here are two solutions:
 
   logvec <- c(TRUE,FALSE,FALSE,TRUE,FALSE,FALSE,TRUE,FALSE)
 
   ifelse(logvec, seq_along(logvec), 0)
  [1] 1 0 0 4 0 0 7 0
 
   replace(logvec * 0, logvec, which(logvec))
  [1] 1 0 0 4 0 0 7 0

 Actually the * 0 is not needed.  The last one could simply be:

 replace(logvec, logvec, which(logvec))

If logvec can have NAs then this solution would not work but could
be modified to be done like this:

replace(logvec, which(logvec), which(logvec))
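
To see what the which() variant does when an NA is present, here is a minimal sketch. The vector `lv` is made up for illustration and is not from the original post.

```r
# Hypothetical test vector with one NA (not from the thread)
lv <- c(TRUE, NA, FALSE, TRUE)

# which(lv) returns only the positions that are TRUE, so the NA never
# reaches the subscript; FALSE becomes 0 and the NA is carried through
res <- replace(lv, which(lv), which(lv))
res
# [1]  1 NA  0  4
```

Whether the NA should stay NA or become 0 depends on the application; wrapping the result in ifelse(is.na(res), 0, res) would be one way to zero it out.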



Re: [R] Splitting strings

2007-08-23 Thread Gabor Grothendieck
This applies the indicated perl-style regular expression where the
first backreference (\\D+) is the non-digits and the second
backreference (\\d+) is the digits.

The two backreferences, but not the entire matched pattern itself,
are passed as arguments x and y to the function whose body is the
right hand side of the formula in the third argument.

That is then simplified using rbind to give the result.

library(gsubfn)
strapply(surgery, "(\\D+)(\\d+)", ~ list(lets = x, nums = as.numeric(y)),
   backref = -2, perl = TRUE, simplify = rbind)

More on gsubfn at
  http://gsubfn.googlecode.com
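
For comparison, the same split can be sketched in base R without gsubfn. This is an alternative technique, not the strapply approach above; it assumes every element is a run of non-digits followed by a run of digits, and the names `status` and `time` are taken from the question.

```r
surgery <- c("d48", "d67", "dnc37", "a75", "d10", "a78", "d31", "d55", "d1")

# Strip the trailing digits to get the letter code ...
status <- sub("[0-9]+$", "", surgery)
# ... and strip the leading non-digits to get the number
time <- as.numeric(sub("^[^0-9]+", "", surgery))

res <- data.frame(status, time, stringsAsFactors = FALSE)
res
```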


On 8/23/07, Gary Collins [EMAIL PROTECTED] wrote:
 I'm having a Thursday morning mental block, any suggestions on the following
 would be most appreciated...

 I have (as an example)

 surgery = c("d48", "d67", "dnc37", "a75", "d10", "a78", "d31",
 "d55", "d1")

 before each number part the possibilities are c("a", "d", "dnc"), I'm trying
 to split each element in surgery so that I have,

 status time
 d      48
 d      67
 dnc    37
 a      75
 d      10
 a      78
 d      31
 d      55
 d      1

 I've tried various strsplit approaches but nothing has done what I need.

 thanks in advance

 Gary





Re: [R] FAQ 7.x when 7 does not exist. Useability question

2007-08-23 Thread Gabor Grothendieck
Note that googling

R FAQ 7.10

will get it on the first hit.

On 8/23/07, John Kane [EMAIL PROTECTED] wrote:
 The FAQ Section 7 is a very useful place for new users
 to find out any number of R idiosyncrasies.  However
 there is no numbering on the FAQ Table of Contents or
 on the Sections' Tables of Contents.

 An R-help list reply of "Read FAQ 7.10" in response to
 a question about converting a factor to numeric is a
 bit cryptic. The only time 7.10 appears is after the
 searcher has found the entry.

 Would it be a good idea to actually number the entries
 for the FAQ Table of Contents and the Table of
 Contents for the Sections?





Re: [R] extracting duplicated elements

2007-08-23 Thread Gabor Grothendieck
Try:

 lapply(as.data.frame(t(DF)), function(x) unique(x[duplicated(x) & x > 0]))
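
Applied to the data in the question, the one-liner can be checked with the sketch below. The data frame is rebuilt from the values shown in the post; its column names (V1 to V10) are placeholders, since the original has repeated names.

```r
# Rebuild the two rows from the question
DF <- as.data.frame(rbind(
  x1 = c(1, 2, -1, -1, -1, 1, 2, -1, -1, -1),
  y1 = c(1, 2, -1, -1, -1, 1, 2,  3, -1, -1)
))

# Transpose so each original row becomes a column, then keep the positive
# values that repeat an earlier value within that row
res <- lapply(as.data.frame(t(DF)), function(x) unique(x[duplicated(x) & x > 0]))
res
# $x1
# [1] 1 2
#
# $y1
# [1] 1 2
```

The 3 in row y1 is dropped because it occurs only once in that row.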


On 8/23/07, dxc13 [EMAIL PROTECTED] wrote:

 Can anyone help me solve this problem...thanks!

 Consider a data frame, namely v, as such:
  v
   X1 X2 X3 X4 X5 X1 X2 X3 X4 X5
 x1  1  2 -1 -1 -1  1  2 -1 -1 -1
 y1  1  2 -1 -1 -1  1  2  3 -1 -1

 What I would like to do is to create an array or data frame with only the
 elements that appear in the data frame more than once and are >= 0.

 I try this...
  v[v>=0]
 [1] 1 1 2 2 1 1 2 2 3

 which returns all >= 0 elements, but they are not in their respective rows
 from the original data frame.  I have tried using the duplicated()
 function and can't seem to get it to work correctly.

 Essentially, the outcome I am trying to get is a df or array looking like:

 step 1...achieve this out of original df
 [1] 1 2 1 2
 [2] 1 2 1 2 3

 (the blank element in row 1, position 5 can be just be NA)

 step 2...take the above and get this...only the duplicated elements
 [1] 1 2
 [2] 1 2





Re: [R] read big text file into R

2007-08-23 Thread Gabor Grothendieck
Another option is to read it into a database and from there into R.
RSQLite can read certain text files directly into an SQLite database
without going through R, and from there one can read the data into R.
Alternately, this post describes how the devel version of the sqldf
package can do it:

http://www.nabble.com/Re%3A-Memory-Experimentation%3A-Rule-of-Thumb-%3D-10-15-Times-the-Memory-p12078165.html
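
A minimal sketch of the RSQLite route follows, assuming the DBI and RSQLite packages are installed. The file, the table name `big`, and the query are invented for illustration; the point is only that dbWriteTable() accepts a file name and imports the file into SQLite without first building a data frame in R.

```r
library(DBI)

# Small stand-in for the large tab-delimited file (hypothetical data)
tmp <- tempfile(fileext = ".txt")
write.table(data.frame(id = 1:5, value = letters[1:5]), tmp,
            sep = "\t", row.names = FALSE, quote = FALSE)

# In-memory database; a file path could be used instead for very large data
con <- dbConnect(RSQLite::SQLite(), ":memory:")

# Passing a file name as the value imports the file directly
dbWriteTable(con, "big", tmp, sep = "\t", header = TRUE)

# Pull only the rows that are needed into R
dd <- dbGetQuery(con, "SELECT * FROM big WHERE id <= 3")
dbDisconnect(con)
unlink(tmp)
dd
```

Filtering in SQL before the data reaches R is what keeps memory use down for files that are too slow to read with read.delim.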

On 8/23/07, Yupu Liang [EMAIL PROTECTED] wrote:
 Dear Rs:

 Hi, I am trying to read a big text file (nrows=243440, ncols=144). It
 seems the computational time of all the read methods
 (scan,readtable,read.delim) is not linear to the number of rows I
 want to read in: things became really slow once I tried to read in
 10 lines compare to 1 lines).

 If I am reading the profiling result right, I guess scan wouldn't
 help either.

 My questions are :
 1) Is this a memory issue?
 2) How to get around this?: I can't just sit around for 15 mins.
 Would write a c function help?

 Thanks!

 Here is the profiling I did:

   Rprof()
   dd = read.delim(file,skip=9,sep="\t",as.is= T,nrows=1)
   Rprof(NULL)
   summaryRprof()
 $by.self
              self.time self.pct total.time total.pct
 scan              3.56     85.2       3.56      85.2
 type.convert      0.48     11.5       0.48      11.5
 read.table        0.08      1.9       4.18     100.0
 make.names        0.02      0.5       0.02       0.5
 options           0.02      0.5       0.02       0.5
 readLines         0.02      0.5       0.02       0.5
 read.delim        0.00      0.0       4.18     100.0
 file              0.00      0.0       0.02       0.5
 getOption         0.00      0.0       0.02       0.5

 $by.total
              total.time total.pct self.time self.pct
 read.table         4.18     100.0      0.08      1.9
 read.delim         4.18     100.0      0.00      0.0
 scan               3.56      85.2      3.56     85.2
 type.convert       0.48      11.5      0.48     11.5
 make.names         0.02       0.5      0.02      0.5
 options            0.02       0.5      0.02      0.5
 readLines          0.02       0.5      0.02      0.5
 file               0.02       0.5      0.00      0.0
 getOption          0.02       0.5      0.00      0.0

 $sampling.time
 [1] 4.18

   ?Rprof()
   Rprof()
   dd = read.delim(file,skip=9,sep="\t",as.is= T,nrows=10)
   Rprof(NULL)
   summaryRprof()
 $by.self
                self.time self.pct total.time total.pct
 scan              143.12     92.7     143.12      92.7
 type.convert        9.52      6.2       9.52       6.2
 read.table          1.60      1.0     154.28      99.9
 paste               0.02      0.0       0.08       0.1
 textConnection      0.02      0.0       0.04       0.0
 .deparseOpts        0.02      0.0       0.02       0.0
 file                0.02      0.0       0.02       0.0
 make.names          0.02      0.0       0.02       0.0
 print.default       0.02      0.0       0.02       0.0
 read.delim          0.00      0.0     154.28      99.9
 doTryCatch          0.00      0.0       0.08       0.1
 gsub                0.00      0.0       0.08       0.1
 try                 0.00      0.0       0.08       0.1
 tryCatch            0.00      0.0       0.08       0.1
 tryCatchList        0.00      0.0       0.08       0.1
 tryCatchOne         0.00      0.0       0.08       0.1
 capture.output      0.00      0.0       0.06       0.0
 deparse             0.00      0.0       0.02       0.0
 eval.with.vis       0.00      0.0       0.02       0.0
 evalVis             0.00      0.0       0.02       0.0
 print               0.00      0.0       0.02       0.0

 $by.total
                total.time total.pct self.time self.pct
 read.table         154.28      99.9      1.60      1.0
 read.delim         154.28      99.9      0.00      0.0
 scan               143.12      92.7    143.12     92.7
 type.convert         9.52       6.2      9.52      6.2
 paste                0.08       0.1      0.02      0.0
 doTryCatch           0.08       0.1      0.00      0.0
 gsub                 0.08       0.1      0.00      0.0
 try                  0.08       0.1      0.00      0.0
 tryCatch             0.08       0.1      0.00      0.0
 tryCatchList         0.08       0.1      0.00      0.0
 tryCatchOne          0.08       0.1      0.00      0.0
 capture.output       0.06       0.0      0.00      0.0
 textConnection       0.04       0.0      0.02      0.0
 .deparseOpts         0.02       0.0      0.02      0.0
 file                 0.02       0.0      0.02      0.0
 make.names           0.02       0.0      0.02      0.0
 print.default        0.02       0.0      0.02      0.0
 deparse              0.02       0.0      0.00      0.0
 eval.with.vis        0.02       0.0      0.00      0.0
 evalVis              0.02       0.0      0.00      0.0
 print                0.02       0.0      0.00      0.0

 $sampling.time
 [1] 154.36

 I am using R 2.5.1 for mac on a Dual 2 

Re: [R] uneven list to matrix

2007-08-23 Thread Gabor Grothendieck
Here are two solutions.  The first repeatedly uses merge and the
second creates a zoo object from each alph component whose time
index consists of the row labels and uses zoo's multiway merge to
merge them.

# test data
m <- matrix(1:5, 5, dimnames = list(LETTERS[1:5], NULL))
alph <- list(m[1:4,,drop=F], m[c(1,3,4),,drop=F], m[c(1,4,5),,drop=F])
alph

# solution 1
out <- alph[[1]]
for(i in 2:length(alph)) {
    out <- merge(out, alph[[i]], by = 0, all = TRUE)
    row.names(out) <- out[[1]]
    out <- out[-1]
}
matrix(as.matrix(out), nrow(out), dimnames=list(rownames(out),NULL))

# solution 2
library(zoo)
z <- do.call(merge, lapply(alph, function(x) zoo(c(x), rownames(x))))
matrix(coredata(z), nrow(z), dimnames=list(time(z),NULL))
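
A third possibility, sketched here in base R only, is to look each component up against the union of all row names. This is not one of the two solutions above; it assumes every component is a one-column matrix with unique row names, and `rn` and `newmatrix` are names chosen for the illustration.

```r
# Same test data as above
m <- matrix(1:5, 5, dimnames = list(LETTERS[1:5], NULL))
alph <- list(m[1:4, , drop = FALSE],
             m[c(1, 3, 4), , drop = FALSE],
             m[c(1, 4, 5), , drop = FALSE])

# All row labels that occur anywhere, in sorted order
rn <- sort(unique(unlist(lapply(alph, rownames))))

# match() gives NA for labels a component lacks, and indexing a matrix
# row by NA yields NA, which fills the gaps
newmatrix <- sapply(alph, function(x) unname(x[match(rn, rownames(x)), 1]))
rownames(newmatrix) <- rn
newmatrix
```

Because the rows are placed by label rather than by repeated merging, the row order is fixed by `rn` regardless of how many components there are.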


On 8/23/07, Christopher Marcum [EMAIL PROTECTED] wrote:
 Hello,

 I am sure I am not the only person with this problem.

 I have a list with n elements, each consisting of a single column matrix
 with different row lengths. Each row has a name ranging from A to E. Here
 is an example:

 alph[[1]]
 A 1
 B 2
 C 3
 D 4

 alph[[2]]
 A 1
 C 3
 D 4

 alph[[3]]
 A 1
 D 4
 E 5


 I would like to create a matrix from the elements in the list with n
 columns such that the row names are preserved and NAs are inserted into
 the cells where the uneven lists do not match up based on their row names.
 Here is an example of the desired output:

 newmatrix
   [,1] [,2] [,3]
 A    1    1    1
 B    2   NA   NA
 C    3    3   NA
 D    4    4    4
 E   NA   NA    5

 Any suggestions?
 I have tried
 do.call(cbind,list)
 I also thought I was on the right track when I tried converting each
 element into a vector and then running this loop (which ultimately
 failed):

 newmat<-matrix(NA,ncol=3,nrow=5)
 colnames(newmatrix)<-c("A":"E")
 for(j in 1:3){
 for(i in 1:5){
 for(k in 1:length(list[[i]])){
 if(is.na(match(colnames(newmatrix),names(alph[[i]])))[j]==TRUE){
 newmatrix[i,j]<-NA}
 else newmatrix[i,j]<-alph[[i]][k]}}}

 Thanks,
 Chris
 UCI Sociology





Re: [R] uneven list to matrix

2007-08-23 Thread Gabor Grothendieck
On 8/24/07, Christopher Marcum [EMAIL PROTECTED] wrote:
 Hi Gabor,

 Thank you. The native solution works just fine, though there is an
 interesting side effect, namely, that with very large lists the rows of
 the output become scrambled though the corresponding columns are correctly
 sorted. The zoo package solution does not work on large lists: there is an
 error:

 Error in order(na.last, decreasing, ...) :
argument 1 is not a vector

They both work on the example data.  Please provide reproducible
examples to illustrate your comments if you would like a response.


 Gabor Grothendieck wrote:
  Here are two solutions.  The first repeatedly uses merge and the
  second creates a zoo object from each alph component whose time
  index consists of the row labels and uses zoo's multiway merge to
  merge them.
 
  # test data
  m <- matrix(1:5, 5, dimnames = list(LETTERS[1:5], NULL))
  alph <- list(m[1:4,,drop=F], m[c(1,3,4),,drop=F], m[c(1,4,5),,drop=F])
  alph
 
  # solution 1
  out <- alph[[1]]
  for(i in 2:length(alph)) {
      out <- merge(out, alph[[i]], by = 0, all = TRUE)
      row.names(out) <- out[[1]]
      out <- out[-1]
  }
  matrix(as.matrix(out), nrow(out), dimnames=list(rownames(out),NULL))
 
  # solution 2
  library(zoo)
  z <- do.call(merge, lapply(alph, function(x) zoo(c(x), rownames(x))))
  matrix(coredata(z), nrow(z), dimnames=list(time(z),NULL))
 
 
  On 8/23/07, Christopher Marcum [EMAIL PROTECTED] wrote:
  Hello,
 
  I am sure I am not the only person with this problem.
 
  I have a list with n elements, each consisting of a single column matrix
  with different row lengths. Each row has a name ranging from A to E.
  Here
  is an example:
 
  alph[[1]]
  A 1
  B 2
  C 3
  D 4
 
  alph[[2]]
  A 1
  C 3
  D 4
 
  alph[[3]]
  A 1
  D 4
  E 5
 
 
  I would like to create a matrix from the elements in the list with n
  columns such that the row names are preserved and NAs are inserted into
  the cells where the uneven lists do not match up based on their row
  names.
  Here is an example of the desired output:
 
  newmatrix
    [,1] [,2] [,3]
  A    1    1    1
  B    2   NA   NA
  C    3    3   NA
  D    4    4    4
  E   NA   NA    5
 
  Any suggestions?
  I have tried
  do.call(cbind,list)
  I also thought I was on the right track when I tried converting each
  element into a vector and then running this loop (which ultimately
  failed):
 
  newmat-matrix(NA,ncol=3,nrow=5)
  colnames(newmatrix)-c(A:E)
  for(j in 1:3){
  for(i in 1:5){
  for(k in 1:length(list[[i]])){
  if(is.na(match(colnames(newmatrix),names(alph[[i]])))[j]==TRUE){
  newmatrix[i,j]-NA}
  else newmatrix[i,j]-alph[[i]][k]}}}
 
  Thanks,
  Chris
  UCI Sociology
 
 
 






Re: [R] uneven list to matrix

2007-08-23 Thread Gabor Grothendieck
OK.  One other thought. The R merge command has a sort= argument
that you can try out.  See ?merge


On 8/24/07, Christopher Marcum [EMAIL PROTECTED] wrote:
 Hi Gabor,

 My apologies. Both solutions work just fine on large lists (n=1000,
 n[[i]]=500). A memory problem on my machine caused the error and
 fail-to-sort. Thank you!

 PS - The zoo method is slightly faster.

 Best,
 Chris

 Gabor Grothendieck wrote:
  On 8/24/07, Christopher Marcum [EMAIL PROTECTED] wrote:
  Hi Gabor,
 
  Thank you. The native solution works just fine, though there is an
  interesting side effect, namely, that with very large lists the rows of
  the output become scrambled though the corresponding columns are
  correctly
  sorted. The zoo package solution does not work on large lists: there is
  an
  error:
 
  Error in order(na.last, decreasing, ...) :
 argument 1 is not a vector
 
  They both work on the example data.  Please provide reproducible
  examples to illustrate your comments if you would like a response.
 
 
  Gabor Grothendieck wrote:
   Here are two solutions.  The first repeatedly uses merge and the
   second creates a zoo object from each alph component whose time
   index consists of the row labels and uses zoo's multiway merge to
   merge them.
  
   # test data
   m <- matrix(1:5, 5, dimnames = list(LETTERS[1:5], NULL))
   alph <- list(m[1:4,,drop=F], m[c(1,3,4),,drop=F], m[c(1,4,5),,drop=F])
   alph
  
   # solution 1
   out <- alph[[1]]
   for(i in 2:length(alph)) {
       out <- merge(out, alph[[i]], by = 0, all = TRUE)
       row.names(out) <- out[[1]]
       out <- out[-1]
   }
   matrix(as.matrix(out), nrow(out), dimnames=list(rownames(out),NULL))
  
   # solution 2
   library(zoo)
   z <- do.call(merge, lapply(alph, function(x) zoo(c(x), rownames(x))))
   matrix(coredata(z), nrow(z), dimnames=list(time(z),NULL))
  
  
   On 8/23/07, Christopher Marcum [EMAIL PROTECTED] wrote:
   Hello,
  
   I am sure I am not the only person with this problem.
  
   I have a list with n elements, each consisting of a single column
  matrix
   with different row lengths. Each row has a name ranging from A to E.
   Here
   is an example:
  
   alph[[1]]
   A 1
   B 2
   C 3
   D 4
  
   alph[[2]]
   A 1
   C 3
   D 4
  
   alph[[3]]
   A 1
   D 4
   E 5
  
  
   I would like to create a matrix from the elements in the list with n
   columns such that the row names are preserved and NAs are inserted
  into
   the cells where the uneven lists do not match up based on their row
   names.
   Here is an example of the desired output:
  
   newmatrix
     [,1] [,2] [,3]
   A    1    1    1
   B    2   NA   NA
   C    3    3   NA
   D    4    4    4
   E   NA   NA    5
  
   Any suggestions?
   I have tried
   do.call(cbind,list)
   I also thought I was on the right track when I tried converting each
   element into a vector and then running this loop (which ultimately
   failed):
  
   newmat-matrix(NA,ncol=3,nrow=5)
   colnames(newmatrix)-c(A:E)
   for(j in 1:3){
   for(i in 1:5){
   for(k in 1:length(list[[i]])){
   if(is.na(match(colnames(newmatrix),names(alph[[i]])))[j]==TRUE){
   newmatrix[i,j]-NA}
   else newmatrix[i,j]-alph[[i]][k]}}}
  
   Thanks,
   Chris
   UCI Sociology
  
  
  
 
 
 
 






Re: [R] Optimization problem

2007-08-22 Thread Gabor Grothendieck
Try this.

1. following Ben remove the Randalstown point and reset the levels of the
Location factor.

2. then replace solve with ginv so it uses the generalized inverse to calculate
the hessian:

alan2 <- subset(alan, subset = Location != "Randalstown")
alan2$Location <- factor(as.character(alan2$Location))

library(MASS)
solve <- ginv

zinb.zc <- zicounts(resp=Scars~.,x =~Location + Lar + Mass + Lar:Mass
+ Location:Mass,z =~Location + Lar + Mass + Lar:Mass + Location:Mass,
data = alan2)

rm(solve)
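
Why swapping in ginv helps can be seen on a toy case. The sketch below uses a made-up singular 2 x 2 matrix `H` as a stand-in for a degenerate Hessian; it says nothing about the actual zicounts internals.

```r
library(MASS)  # ginv() comes from MASS

# Hypothetical exactly singular matrix (second column = 2 * first column)
H <- matrix(c(1, 2, 2, 4), nrow = 2)

# solve() stops with an error on a singular matrix ...
msg <- tryCatch({ solve(H); "no error" },
                error = function(e) conditionMessage(e))

# ... while ginv() returns the Moore-Penrose generalized inverse G,
# which satisfies H %*% G %*% H == H up to rounding
G <- ginv(H)
ok <- isTRUE(all.equal(H %*% G %*% H, H))
msg
ok
```

A generalized inverse always exists, so the computation goes through, but standard errors derived from it for a singular Hessian should be read with caution.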

On 8/21/07, Ben Bolker [EMAIL PROTECTED] wrote:

  (Hope this gets threaded properly.  Sorry if it doesn't.)

   Gabor: Lac and Lacfac being the same is irrelevant, wouldn't
 produce NAs (but would produce something like a singular Hessian
 and maybe other problems) -- but they're not even specified in this
 model.

  The bottom line is that you have a location with a single
 observation, so the GLM that zicounts runs to get the initial
 parameter values has an unestimable location:mass interaction
 for one location, so it gives an NA, so optim complains.

  In gruesome detail:

 ## set up  data
 scardat = read.table("scars.dat",header=TRUE)
 library(zicounts)
 ## try to run model
 zinb.zc <- zicounts(resp=Scars~.,
    x =~Location + Lar + Mass + Lar:Mass + Location:Mass,
    z =~Location + Lar + Mass + Lar:Mass + Location:Mass,
    data=scardat)
 ## tried to debug this by dumping zicounts.R to a file, modifying
 ## it to put a trace argument in that would print out the parameters
 ## and log-likelihood for every call to the log-likelihood function.
 dump("zicounts",file="zicounts.R")
 source("zicounts.R")
 zinb.zc <- zicounts(resp=Scars~.,
    x =~Location + Lar + Mass + Lar:Mass + Location:Mass,
    z =~Location + Lar + Mass + Lar:Mass + Location:Mass,
    data=scardat,trace=TRUE)
 ## this actually didn't do any good because the negative log-likelihood
 ## function never gets called -- as it turns out optim() barfs when it
 ## gets its initial values, before it ever gets to evaluating the
 ## log-likelihood

 ## check the glm -- this is the equivalent of what zicounts does to
 ## get the initial values of the x parameters
 p1 <- glm(Scars~Location + Lar + Mass + Lar:Mass + Location:Mass,
    data=scardat,family=poisson)
 which(is.na(coef(p1)))

 ## find out what the deal is
 table(scardat$Location)

 scar2 = subset(scardat,Location!="Randalstown")
 ## first step to removing the bad point from the data set -- but ...
 table(scar2$Location)
 ## it leaves the Location factor with the same levels, so
 ##  now we have ZERO counts for one location:
 ## redefine the factor to drop unused levels
 scar2$Location <- factor(scar2$Location)
 ## OK, looks fine now
 table(scar2$Location)

 zinb.zc <- zicounts(resp=Scars~.,
    x =~Location + Lar + Mass + Lar:Mass + Location:Mass,
    z =~Location + Lar + Mass + Lar:Mass + Location:Mass,
    data=scar2)
 ## now we get another error (system is computationally singular when
 ## trying to compute Hessian -- overparameterized?)   Not in any
 ## trivial way that I can see.  It would be nice to get into the guts
 ## of zicounts and stop it from trying to invert the Hessian, which is
 ## I think where this happens.

  In the meanwhile, I have some other  ideas about this analysis (sorry,
 but you started it ...)

  Looking at the data in a few different ways:

 library(lattice)
 xyplot(Scars~Mass,groups=Location,data=scar2,jitter=TRUE,
   auto.key=list(columns=3))
 xyplot(Scars~Mass|Location,data=scar2,jitter=TRUE)

 xyplot(Scars~Lar,groups=Location,data=scar2,
   auto.key=list(columns=3))
 xyplot(Scars~Mass|Lar,data=scar2)
 xyplot(Scars~Lar|Location,data=scar2)

   Some thoughts: (1) I'm not at all sure that
 zero-inflation is necessary (see Warton 2005, Environmetrics).
 This is a fairly small, noisy data set without huge numbers
 of zeros -- a plain old negative binomial might be fine.

   I don't actually see a lot of signal here, period (although there may
 be some) ... there's not a huge range in Lar (whatever it is -- the rest
 of the covariates I think I can interpret).  It would be tempting to try
 to fit location as a random effect, because fitting all those extra
 degrees of freedom is going to kill you.
 On the other hand, GLMMs are a bit hairy.

   cheers
   Ben





Re: [R] Evaluating f(x(2,3)) on a function f- function(a,b){a+b}

2007-08-22 Thread Gabor Grothendieck
Try this:

do.call(f, as.list(x))
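
A short sketch of how this behaves, using f and x from the question. The
wrapper vec_args and the function h are hypothetical names introduced only
for illustration.

```r
f <- function(a, b) { a + b }
x <- c(2, 3)

# as.list() turns the vector into list(2, 3); do.call() then calls f(2, 3)
res <- do.call(f, as.list(x))
res

# Hypothetical helper: wrap any function so that it accepts a single vector
vec_args <- function(fun) function(v) do.call(fun, as.list(v))
g <- vec_args(f)
g(x)

# Names on the vector are matched to argument names
h <- function(a, b) a - b
named <- do.call(h, as.list(c(b = 1, a = 10)))
named
```

The wrapper is general in the sense that it does not need to know how many
arguments the wrapped function takes; the length (and names) of the vector
must simply fit the function's arguments.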

On 8/22/07, Søren Højsgaard [EMAIL PROTECTED] wrote:
 Dear list
 I have a function and a vector, say
f <- function(a,b){a+b}
x <- c(2,3)
 I want to evaluate f on x in the sense of computing f(x[1],x[2]). I would 
 like it to be so that I can write f(x). (I know I can write a wrapper 
 function g <- function(x){f(x[1],x[2])}, but this is not really what I am 
 looking for). Is there a general way doing this (programmatically)? (E.g. by 
 unpacking the elements of x and putting them in the right places when 
 calling f...)
 I've looked under formals, alist etc. but so far without luck.

 Regards
 Søren




Re: [R] Subsetting zoo object with a vector of time values.

2007-08-21 Thread Gabor Grothendieck
See ?window.zoo
e.g.

library(zoo)

# create test data
tt <- c(-50, -49.996, -49.995, -49.96, -49.956, -49.955, -49.92, -49.916,
-49.915, -49.88)
z <- zoo(seq_along(tt), tt)

window(z, c(-50, -49.96, -49.92, -49.88))

On 8/21/07, Todd Remund [EMAIL PROTECTED] wrote:
 I have a zoo object for which I would like to subset using a vector of time
 values.  For example, I have the following time values represented in my zoo
 object.

 -50.000 -49.996 -49.995 -49.960 -49.956 -49.955 -49.920 -49.916 -49.915
 -49.880

 and would like to get observations corresponding to times

 -50 -49.96 -49.92 -49.88.

 What can I do without using the lapply or which functions?

 Thank you.

 Todd Remund



Re: [R] extracting month from date in numeric form

2007-08-21 Thread Gabor Grothendieck
On 8/21/07, Gonçalo Ferraz [EMAIL PROTECTED] wrote:
 Hi,
 Anyone knows what would be a short way of extracting a month from a date in
 numeric or integer format?

 months(1979-12-20)
 returns
 December in character format.

 How could I get 12 in numeric or integer format?


Here are a few solutions:

as.numeric(format(as.Date("1979-12-20"), "%m"))

as.POSIXlt(as.Date("1979-12-20"))$mon + 1

as.numeric(substring("1979-12-20", 6, 7))

as.numeric(factor(months(as.Date("1979-12-20"), abbreviate = TRUE),
  levels = month.abb))


See R News 4/1 Help Desk article for more on dates.
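
The first two approaches also work on a whole vector of dates at once. A
small sketch, with a made-up vector called dates:

```r
dates <- as.Date(c("1979-12-20", "2007-08-21"))

# format() gives character month numbers; convert them to integer
m1 <- as.integer(format(dates, "%m"))

# POSIXlt stores the month as 0-11, so add 1
m2 <- as.POSIXlt(dates)$mon + 1L

m1
m2
```

Neither of these two depends on the month names of the current locale, which
the months()/month.abb variant does.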



Re: [R] Optimization problem

2007-08-21 Thread Gabor Grothendieck
Lac and Lacfac are the same.
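
To see why that matters, here is a small sketch with simulated data (the data
frame d and its columns are made up, and plain glm is used rather than
zicounts): putting the same information into a model twice leaves one of the
two coefficients inestimable.

```r
set.seed(1)
d <- data.frame(y = rpois(20, 3), lac = rep(0:1, 10))
d$lacfac <- factor(d$lac)      # same information as lac, stored as a factor

fit <- glm(y ~ lac + lacfac, data = d, family = poisson)
cf <- coef(fit)
cf                             # the lacfac1 coefficient is NA (aliased)
names(which(is.na(cf)))
```

glm copes by dropping the redundant column; an optimizer working directly on
the likelihood has no such safeguard, so it may be worth keeping only one of
the two columns in the data frame before using `~.`.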

On 8/21/07, Alan Harrison [EMAIL PROTECTED] wrote:
 Hello Folks,

 Very new to R so bear with me, running 5.2 on XP.  Trying to do a 
 zero-inflated negative binomial regression on placental scar data as 
 dependent.  Lactation, location, number of tick larvae present and mass of 
 mouse are independents.  Dataframe and attributes below:


  Location Lac Scars Lar Mass Lacfac
 1   Tullychurry   0 0  15 13.87  0
 2  Somerset   0 0   0 15.60  0
 3 Tollymore   0 0   3 16.43  0
 4 Tollymore   0 0   0 16.55  0
 5   Caledon   0 0   0 17.47  0
 6  Hillsborough   1 5   0 18.18  1
 7   Caledon   0 0   1 19.06  0
 8   Portglenone   0 4   0 19.10  0
 9   Portglenone   0 5   0 19.13  0
 10Tollymore   0 5   3 19.50  0
 11 Hillsborough   1 5   0 19.58  1
 12  Portglenone   0 4   0 19.76  0
 13  Caledon   0 8   0 19.97  0
 14 Hillsborough   1 4   0 20.02  1
 15  Tullychurry   0 3   3 20.13  0
 16 Hillsborough   1 5   0 20.18  1
 17   LoughNavar   1 5   0 20.20  1
 18Tollymore   0 0   1 20.24  0
 19 Hillsborough   1 5   0 20.48  1
 20  Caledon   0 4   1 20.56  0
 21  Caledon   0 3   2 20.58  0
 22Tollymore   0 4   3 20.58  0
 23Tollymore   0 0   2 20.88  0
 24 Hillsborough   1 0   0 21.01  1
 25  Portglenone   0 5   0 21.08  0
 26  Tullychurry   0 2   5 21.28  0
 27 Ballysallagh   1 4   0 21.59  1
 28  Caledon   0 0   1 21.68  0
 29 Hillsborough   1 5   0 22.09  1
 30  Tullychurry   0 5   5 22.28  0
 31  Tullychurry   1 6  75 22.43  1
 32 Ballysallagh   1 5   0 22.57  1
 33 Ballysallagh   1 4   0 22.67  1
 34   LoughNavar   1 5   3 22.71  1
 35 Hillsborough   1 4   0 23.01  1
 36  Caledon   0 0   3 23.08  0
 37   LoughNavar   1 5   0 23.53  1
 38 Ballysallagh   1 4   0 23.55  1
 39  Portglenone   1 6   0 23.61  1
 40   Mt.Stewart   0 3   0 23.70  0
 41 Somerset   0 5   0 23.83  0
 42 Ballysallagh   1 5   0 23.93  1
 43 Ballysallagh   1 5   0 24.01  1
 44  Caledon   0 0   3 24.14  0
 45   LoughNavar   0 6   0 24.30  0
 46   LoughNavar   1 5   0 24.34  1
 47 Hillsborough   1 4   0 24.45  1
 48  Caledon   0 3   2 24.55  0
 49  Tullychurry   0 5  44 24.83  0
 50 Hillsborough   1 5   0 24.86  1
 51 Ballysallagh   1 5   0 25.02  1
 52  Tullychurry   0 0   9 25.27  0
 53   Mt.Stewart   0 5   0 25.31  0
 54   LoughNavar   1 4   8 25.43  1
 55 Somerset   1 0   0 25.58  1
 56 Hillsborough   1 5   0 25.82  1
 57  Portglenone   1 2   0 26.02  1
 58 Ballysallagh   1 5   0 26.19  1
 59   Mt.Stewart   1 0   0 26.66  1
 60  Randalstown   1 0   1 26.70  1
 61 Somerset   0 4   0 27.01  0
 62   Mt.Stewart   0 4   0 27.05  0
 63 Somerset   0 3   0 27.10  0
 64 Somerset   0 6   0 27.34  0
 65 Somerset   0 0   0 27.87  0
 66   LoughNavar   1 5   1 28.01  1
 67  Tullychurry   1 6  42 28.55  1
 68 Hillsborough   1 5   0 28.84  1
 69  Portglenone   1 4   0 29.00  1
 70 Somerset   1 4   0 31.87  1
 71 Ballysallagh   1 5   0 33.06  1
 72   LoughNavar   1 4   0 33.24  1
 73 Somerset   1 4   0 33.36  1

 alan : 'data.frame':73 obs. of  6 variables:
  $ Location: Factor w/ 10 levels Ballysallagh,..: 10 8 9 9 2 3 2 6 6 9 ...
  $ Lac : int  0 0 0 0 0 1 0 0 0 0 ...
  $ Scars   : int  0 0 0 0 0 5 0 4 5 5 ...
  $ Lar : int  15 0 3 0 0 0 1 0 0 3 ...
  $ Mass: num  13.9 15.6 16.4 16.6 17.5 ...
  $ Lacfac  : Factor w/ 2 levels 0,1: 1 1 1 1 1 2 1 1 1 1 ...

 The syntax I used to create the model is:

 zinb.zc <- zicounts(resp=Scars~.,x =~Location + Lar + Mass + Lar:Mass + 
 Location:Mass,z =~Location + Lar + Mass + Lar:Mass + Location:Mass, data=alan)

 The error given is:

 Error in optim(par = parm, fn = neg.like, gr = neg.grad, hessian = TRUE,  :
non-finite value supplied by optim
 In addition: Warning message:
 fitted probabilities numerically 0 or 1 occurred in: glm.fit(zz, 1 - pmin(y, 
 1), family = binomial())

 I understand this is a problem with the model I specified, could anyone help 
 out??

 Many thanks

 Alan Harrison

 Quercus
 Queen's University Belfast
 MBC, 97 Lisburn Road
 Belfast

 BT9 7BL

 T: 02890 972219
 M: 07798615682



Re: [R] tackle memory insufficiency for large dataset using save() load()?

2007-08-21 Thread Gabor Grothendieck
See ?save .  The ... arguments are the ***names*** of the objects, not
the objects,
so you want save("d", ...whatever...) not save(d, ...whatever...) .
Also, don't use attach and detach, and read this about factors, which applies
if your factor has many levels but can be ignored if not:
http://www.mail-archive.com/r-help@stat.math.ethz.ch/msg92970.html
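
Separately from the save() call, note that load() restores the saved objects
under their original names and returns only a character vector of those
names, which would explain the "character with length 1" result in the
question. A minimal sketch with a made-up data frame and a temporary file:

```r
d <- data.frame(job_id = 1:3, sqft = c(NA, 4, 17))
path <- tempfile(fileext = ".Rdata")
save(d, file = path)
rm(d)

loaded <- load(path)   # restores the object under its original name ...
loaded                 # ... and returns that name: "d"
is.data.frame(d)       # the data frame itself is back as d

# To fetch the restored object by the name that load() returned:
newdata <- get(loaded)
summary(newdata)
```

So after load("compact_d.Rdata") the data are simply available as d again;
there is nothing to attach.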

On 8/21/07, Jessica Z [EMAIL PROTECTED] wrote:
 Hello List, i have been agonizing over this for days, any reply would be 
 greatly appreciated!

  Situation:___
 My original dataset is a .csv dataset (w/ 2M records) with 4 variables:
 job_id (Primary key, won't be used for analysis, just used for join tables),
 sector_id (categorical variable, for 19 industry sectors),
 sqft (con't variable for square footage),
 building_type (categorical, for 2 building types)
  some values of sqft were input wrong, so i'd like to set sqft<1 to NA 
 and then use aregImpute() to impute those NAs.

  Problem: the original dataset (.csv format) is too large. though i could read 
 that dataset into R, i could not get aregImpute() to run even when i set the memory 
 limit to 3G ! (yes, i did the switch in windows to reach 3G rather than 2G)

  Goal: try to find a way to slim down my dataset so as to get aregImpute() 
 running.

  What i did:
  i searched in the archive, and found someone said, as R tends to inflate 
 memory, it is a good idea to first read the original dataset into R-- then 
 save it as a more compact binary file using save() -- and then reload the 
 compact binary file back into R using load(). this way would reduce the 
 memory allocation.

  HOWEVER, after i saved my original dataset into a compact binary file using 
 save(), and used load("filename.Rdata") to reload the new compact data 
 format into R, I could not figure out how to retrieve all my variables!!! R 
 shows the new dataset is not a list, nor a matrix, nor a dataframe, but just a 
 character with length 1 !!! and there is no way i could do attach().

  i generated a 1K-row subset out of my original dataset to illustrate my 
 problem (does anyone know how to get my four variables back from this 
 compact binary new dataset? what did i do wrong?):

  data <- read.table (file.choose(),header=T,sep=",")
  summary(data)
 job_id sector_id   sqftbuilding_type
  Min.   :   1.0   Min.   : 6.000   Min.   :  0.00   Min.   :1.000
  1st Qu.: 250.8   1st Qu.: 6.000   1st Qu.:  3.00   1st Qu.:2.000
  Median : 500.5   Median :11.000   Median :  4.00   Median :2.000
  Mean   : 500.5   Mean   : 9.455   Mean   : 12.49   Mean   :1.996
  3rd Qu.: 750.3   3rd Qu.:11.000   3rd Qu.:  4.00   3rd Qu.:2.000
  Max.   :1000.0   Max.   :12.000   Max.   :192.00   Max.   :2.000
 
  attach(data)
  sqft[sqft<1] <- NA
  sector.f <- as.factor(sector_id)
  building_type.f <- as.factor (building_type)
  d <- data.frame(job_id,sector.f,sqft, building_type.f)
  summary (d)
 job_id   sector.f  sqftbuilding_type.f
  Min.   :   1.0   6 :340   Min.   :  3.00   1:  4
  1st Qu.: 250.8   11:505   1st Qu.:  4.00   2:996
  Median : 500.5   12:155   Median :  4.00
  Mean   : 500.5Mean   : 14.16
  3rd Qu.: 750.33rd Qu.: 17.00
  Max.   :1000.0Max.   :192.00
   NA's   :118.00
  save (d, file="compact_d.Rdata", ascii=FALSE)
 
  newdata <- load ("compact_d.Rdata")
 
  summary(newdata)
   Length Class  Mode
1 character character
  attach(newdata)
 Error in attach(newdata) : file 'd' not found
  is.data.frame (newdata)
 [1] FALSE
  is.list (newdata)
 [1] FALSE
  is.matrix (newdata)
 [1] FALSE
 
  _
 btw, i also tried to just save (into compact binary) and reload (the new 
 compact binary data format) (as i could do the NA stuff in sql anyhow). 
 however, i still got stucked at the same spot:
  data <- read.table (file.choose(),header=T,sep=",")
  summary(data)
 job_id sector_id   sqftbuilding_type
  Min.   :   1.0   Min.   : 6.000   Min.   :  0.00   Min.   :1.000
  1st Qu.: 250.8   1st Qu.: 6.000   1st Qu.:  3.00   1st Qu.:2.000
  Median : 500.5   Median :11.000   Median :  4.00   Median :2.000
  Mean   : 500.5   Mean   : 9.455   Mean   : 12.49   Mean   :1.996
  3rd Qu.: 750.3   3rd Qu.:11.000   3rd Qu.:  4.00   3rd Qu.:2.000
  Max.   :1000.0   Max.   :12.000   Max.   :192.00   Max.   :2.000
  save (data, file="compact_data.Rdata", ascii=FALSE)
  newdata <- load ("compact_data.Rdata")
  summary(newdata)
   Length Class  Mode
1 character character
  attach(newdata)
 Error: restore file may be empty -- no data loaded
 In addition: Warning message:
 file 'data' has magic number ''
   Use of save versions prior to 2 is deprecated
  is.data.frame (newdata)
 [1] FALSE
  is.list (newdata)
 [1] FALSE
  is.matrix (newdata)
 [1] FALSE
 





Re: [R] tackle memory insufficiency for large dataset using save() load()?

2007-08-21 Thread Gabor Grothendieck
?save says it's the names (not the objects), although I just
tried it and both save("iris", file = "/iris.Rdata") and
save(iris, file = "/iris.Rdata") seemed to work, so you are
right that it seems to work with the objects, not just the names,
although it's not documented to do so.

Usage
save(..., list = character(0),
     file = stop("'file' must be specified"),
     ascii = FALSE, version = NULL, envir = parent.frame(),
     compress = !ascii, eval.promises = TRUE)

save.image(file = ".RData", version = NULL, ascii = FALSE,
           compress = !ascii, safe = TRUE)

Arguments
... the names of the objects to be saved.
list A character vector containing the names of objects to be saved.
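
A quick way to check the three calling styles against each other. This is
only a sketch: the object names are borrowed from Rolf's example, the files
are temporary, and contents() is a hypothetical helper.

```r
melvin <- 1:3
clyde  <- letters[1:2]

f1 <- tempfile(); f2 <- tempfile(); f3 <- tempfile()
save(melvin, clyde, file = f1)                  # unquoted
save("melvin", "clyde", file = f2)              # quoted names
save(list = c("melvin", "clyde"), file = f3)    # character vector via list=

# Hypothetical helper: load a file into a fresh environment, return its contents
contents <- function(path) {
  e <- new.env()
  load(path, envir = e)
  mget(sort(ls(e)), envir = e)
}

same12 <- identical(contents(f1), contents(f2))
same23 <- identical(contents(f2), contents(f3))
same12
same23
```

Loading into a fresh environment avoids overwriting the objects in the
workspace while comparing.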

On 8/21/07, Rolf Turner [EMAIL PROTECTED] wrote:

 On 22/08/2007, at 1:48 PM, Gabor Grothendieck wrote:

  See ?save .  The ... arguments are the ***names*** of the objects, not
  the objects
  so you want save("d", ...whatever...) not save(d, ...whatever...) .

I think this is wrong.  You want the objects not their names.

If you want to make use of object names, use the list argument.

I.e.

save(melvin,clyde,file="irving")

and

save(list=c("melvin","clyde"),file="irving")

accomplish the same thing.

cheers,

Rolf Turner



Re: [R] Q: combine 2 data frames with missing values

2007-08-20 Thread Gabor Grothendieck
Try this:

Lines <- "case var1 var2 var3 var4
1   9   9   13  11
2   15  9   15  13
3   na  na  12  9
4   8   6   na  na
5   14  10  na  na
6   20  15  17  15
"

# replace with DF <- read.table("myfile.dat", header = TRUE, na.strings = "na")
DF <- read.table(textConnection(Lines), header = TRUE, na.strings = "na")

DF1 <- DF[-1]
kor <- cor(DF1, use = "pairwise")
kor

lm(var1 ~ var2, DF) # a sample regression

# mycoef calculates kth coefficient in regression of
# ith variable on jth variable
mycoef <- function(i, j, k) coef(lm(DF1[c(i, j)]))[k]

idx <- 1:ncol(DF1)
names(idx) <- names(DF1)

intercepts <- outer(idx, idx, Vectorize(mycoef), 1)
names(dimnames(intercepts)) <- c("y", "x")
intercepts

slopes <- outer(idx, idx, Vectorize(mycoef), 2)
names(dimnames(slopes)) <- c("y", "x")
slopes

# another approach to the above
# mycoef1 is like mycoef but has only one argument
# and outputs all coefs, not just a specified one
mycoef1 <- function(idx) coef(lm(DF1[idx]))
out <- t(apply(expand.grid(y = idx, x = idx), 1, mycoef1))
colnames(out) <- c("y", "x", "intercept", "slope")
out

# To perform SQL operations on data frames
# see sqldf home page at http://sqldf.googlecode.com
# and also ?sqldf for many examples
library(sqldf)
sqldf("select avg(var1), avg(var2), avg(var3), avg(var4) from DF1")
colMeans(DF1, na.rm = TRUE)  # same



On 8/20/07, Tom Willems [EMAIL PROTECTED] wrote:
 hello R users,

 i have the same problem with my data:
 for all the different variables, i have the same number of cases, but they
 are often outside the detection limits, so they produce na's.
 so the data looks like this:

 case var1 var2 var3 var4 ...
 1   9   9   13  11
 2   15  9   15  13
 3   na  na  12  9
 4   8   6   na  na
 5   14  10  na  na
 6   20  15  17  15  ..
 ..

 What i would like to do for data exploration is to compare each possible
 pair of variables, and get their correlation coefficient and the intercept
 and slope of the regression line. For every variable the measurements are
 linked through their case: it is the same sample, just a different test.

 Now i select subsets of variables out of the original dataset, and use:
  value_x1 = subset(dataset_1,select=lg_value)
  value_y1 =subset(dataset_2,select=lg_value)

 Then i try to fit an lm model, in order to get estimates for the slope and
 intercept
model_1 <- lm (value_y1[,1]~ value_x1[,1]  )

 This is what R tell's me:
Error in model.frame(formula, rownames,
 variables, varnames, extras, extranames,  :
  variable lengths differ (found for
 'value_x1[, 1]')

 Is there perhaps a way of binding the selected subsets together, still
 linked to their case, so that the na's can be discarded by R automatically?
 I have been trying to use SQLiteDF and the other sql functions of R, but i
 don't really understand them.
 If someone out there knows how to use sql in R, i'd be delighted if he or
 she could explain it to me more understandably than the manuals i find on
 the web.
 Here is what i would want sql to do.


 My data is in columns: one column holds all the case numbers, one the
 measured values, one all the test types, one the time period, and one
 the labs that performed the test. It is stored in a txt file.
 So it is a long 5-column data table.
 Now is it possible to make a cross table holding the case numbers and
 time period in 2 columns, and then have a different column for every test?
 so if there are 4 tests and 4 labs, it would give 16 columns.
 I've tried it in access, but it gave me endless loops of repeated values,
 and creating new data files is dangerous: 'little mistakes made while
 copying' or 'manipulations made to one file and not to the other'.

 kind regards,
 Tom





Re: [R] Any parser generator / code assistance for R?

2007-08-20 Thread Gabor Grothendieck
On 8/17/07, Ali - [EMAIL PROTECTED] wrote:
 Hi,

 Is there any parser generator like www.antlr.org?  Moreover, how does simple

Given the response, it looks like no one has come up with an antlr
parser for R but there are some facilities within R itself.

showTree() in the R codetools package can generate a
Lisp-style expression for any R expression:

 library(codetools)
 showTree(quote(for(i in myvec[1:3]) print(i+88*2+3*4)))
(for i ([ myvec (: 1 3)) (print (+ (+ i (* 88 2)) (* 3 4))))

Looking at the source of showTree would show you how to walk
an R parse tree.
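
For a feel of what such a walk involves, here is a minimal base-R sketch. It
is not the codetools implementation; to_lisp is a hypothetical function that
only handles calls, symbols and constants, and does not deal with empty
arguments such as in x[, 1].

```r
# Hypothetical helper: render an unevaluated R expression in prefix form
to_lisp <- function(e) {
  if (is.call(e)) {
    # a call is a list: the function first, then its arguments
    parts <- vapply(as.list(e), to_lisp, character(1))
    paste0("(", paste(parts, collapse = " "), ")")
  } else if (is.symbol(e)) {
    as.character(e)
  } else {
    paste(deparse(e), collapse = "")   # constants such as 88 or "abc"
  }
}

s1 <- to_lisp(quote(print(i + 88 * 2)))
s1
s2 <- to_lisp(quote(for (i in myvec[1:3]) print(i)))
s2
```

The recursion works because quote() returns the parsed but unevaluated call,
and as.list() on a call exposes the function and its arguments as list
elements.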

The Ryacas R package has a recursive descent R parser that is used to
process R code translating it to yacas and it also can translate OpenMath
XML code generated by yacas to R.  See:
http://ryacas.googlecode.com



Re: [R] recommended combo of apps for new user?

2007-08-19 Thread Gabor Grothendieck
Regarding RODBC vs. DBI-based packages (RSQLite, RMySQL, etc.), it's
my perception, possibly mistaken, that apart from any consideration of
the R packages themselves, ODBC (which originated in the Windows world)
is more widely used on Windows than UNIX.  Also, ODBC has the problem
that one must configure it, which puts an extra step into the process.  Clear
documentation on how to do such ODBC configuration may be difficult to find.

On the other hand the RODBC package itself seems to be maintained
very well and is typically available for new versions of R before the
DBI-based packages.

On 8/19/07, Prof Brian Ripley [EMAIL PROTECTED] wrote:
 Some additional comments on the DBMS front.

 (a) SPSS is not a DBMS, so it is not clear that you need this. But if you
 do and are storing valuable data in a DBMS a lot of further questions come
 into play, like how you are going to do backups.  I'd say PostgreSQL was
 really only for professional-level administrators.  My sysadmins recommend
 MySQL for most people.  We do also run PostgreSQL and they find it a lot
 trickier to maintain.

 'dozens of columns and thousands of rows' is not big.  A data frame with
 50 columns and 5000 rows would only take 2Mb to store, and R will easily
 handle 100x with 4GB of RAM (and if you have less, get 4GB).  So storing
 data in .rda (R's save() format) is most likely viable.  R's indexing etc
 operations make it good at data manipulation, and using a DBMS will
 involve learning SQL, a non-trivial cost.

 (b) You have a choice of interfaces to a DBMS, RODBC and the DBI+ family,
 e.g. DBI+RMySQL and DBI+RSQLite.  I'm biased, but I find RODBC more
 intuitive, and many people have reported it to be faster.  If all you want
 is non-permanent storage for manipulation of large data sets, consider
 also SQLiteDF.

 On Sat, 18 Aug 2007, Duncan Murdoch wrote:

  Martin Brown wrote:
  [i sent this message earlier but apparently should have sent it plain
  text, as follows..]
 
  Hi there,
 
  I would like some advice, not so much about how to use R, but about
  software that I need to complement R.  I've rooted around in the FAQ's
  and done a few searches on this mailing list but haven't quite found
  the perspective I need.
 
  I am an experienced data analyst in my field (forest ecology and
  ecological monitoring) but new to R. I am a long time user of SPSS and
  have gotten pretty handy with it.  However, I am frustrated with SPSS
  for several reasons:  There's the cost (I'm a freelancer; I pay for my
  software myself);  the Windows dependence (I use Kubuntu as my usual
  OS now, and switching back and forth is a pain); the horrible
  inefficiency when I do certain types of file manipulations; and the
  inability to do the kind of publication-quality graphs I want... I've
  usually ended up using a commercial graphing program (another source
  of expense and limitation).
 
  I'd like to switch to using R on Kubuntu, for all those reasons.  In
  addition I think the mathematical formality that R encourages might be
  good for me.
 
  However, reviewing the FAQ's on the R project web site makes me
  realize that I've been using SPSS as three kinds of software really:
  a DBMS; a statistical analysis package; and a graphing package.  It
  looks like moving to R might involve learning three kinds of software,
  not just one.  I wonder:
 
  1) What open-source DBMS works most seamlessly with R?  I have seen
  MySQL recommended but wonder if there are alternatives.  I sometimes
  need to handle big data files.  In fact a lot of my work involves
  exploratory and descriptive analyses of rather large and messy
  databases from ecological monitoring, rather than statistical tests
  per se.  In SPSS the data files I have been generating have dozens of
  columns and thousands of rows, often with value and variable labels
  helpful for documenting my work.

 See above.

 
  I think you won't find much difference in the R interface between MySQL,
  PostgreSQL, or SQLite.  The choice should be made based on the qualities
  of the database (and I don't know enough about the differences to give a
  recommendation.)
  2) For the purpose of creating publication-quality graphs, do R users
  typically need to go outside of the R system? If so, what open-source
  programs would you all recommend?
 
  R is great for this, but you might need to go outside for some
  specialized stuff (e.g. medical imaging).
 
  3) Any other software I need to learn that would make my work in R
  more productive? (for example, a code editor).
 
  A lot of people are happy with ESS mode in Emacs.
 
  Duncan Murdoch
 
 

 --
 Brian D. Ripley,  [EMAIL PROTECTED]
 Professor of Applied 

Re: [R] Creating a data set within a function

2007-08-19 Thread Gabor Grothendieck
Check out ?embed
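
A sketch of how embed() could be used for this. The function name ar_design
is hypothetical, and the check at the end compares against the order = 2
construction from the question.

```r
# Hypothetical helper: response and design matrix for an AR(order) regression
ar_design <- function(y, order) {
  y <- as.numeric(y)
  e <- embed(y, order + 1)     # column 1 is y[t], column k + 1 is y[t - k]
  list(y = e[, 1],
       X = cbind(1, e[, -1, drop = FALSE]))   # intercept column, then the lags
}

y <- rnorm(100)
d <- ar_design(y, 2)

# same as the hand-written order = 2 construction
n <- length(y)
x_manual <- cbind(rep(1, n - 2), y[2:(n - 1)], y[1:(n - 2)])
all.equal(d$X, x_manual)
all.equal(d$y, y[3:n])

coef(lm.fit(d$X, d$y))   # least-squares estimates for the AR(2) regression
```

Because embed() takes the number of columns as an argument, the same function
covers any order without a chain of if statements.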

On 8/19/07, Anup Nandialath [EMAIL PROTECTED] wrote:
 Dear Friends,

 I'm trying to find if there is a way to automate creation of the design 
 matrix. Suppose we are interested in say running an autoregressive model. The 
 user inputs the following data

 myfunAR <- function(y, order)
 {.
 ..
 }

 now here y is the data series and order represents the level of the process. 
 In other words if order=2 then we have an AR (2) process. Now it is easy to 
 to create the y vector within the function, but I'm not clear on how to 
 create the design matrix.

 For instance if order=2 then

 y <- as.matrix(rnorm(100))
 ynew <- as.matrix(y[3:nrow(y),1])
 x <- as.matrix(cbind(rep(1, nrow(y)-2), y[2:(nrow(y)-1),1], 
 y[1:(nrow(y)-2),1]))

 ynew and x gives me the response vector and design matrix respectively. 
 however, I'm trying to write a general function which will accomodate any 
 order. Hence given the user inputs y and the order, is there a way to program 
 the creation of the x matrix automatically.

 The long way would be

 if (order==1)
 {...}

 if (order==2)
 {...}

 but this will force me to limit at some point.Is there an alternative way to 
 program this??

 Thanks in advance
 Regards

 Anup






Re: [R] How to parse a string into the symbol for a data frame object

2007-08-19 Thread Gabor Grothendieck
You might want to store the data frames in a list to eliminate this
problem and make it more convenient to iterate over them:

L <- list(df1 = df1, df2 = df2)
rm(df1, df2)

# reduce each data frame to its first few rows
for(nm in names(L))   L[[nm]] <- head(L[[nm]])

or if you don't need to modify them or know their names:

# print first few lines of each
for(df in L) print(head(df))
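
If the data frames already exist under names like df1 and df2, get() and
mget() look objects up by a name held in a string, which is what the question
asks for. A minimal sketch using the data frames from the question:

```r
df1 <- data.frame(x = seq(0, 10), y = seq(10, 20))
df2 <- data.frame(a = seq(0, 10), b = seq(10, 20))

# get() looks up an object by the name held in a string
for (n in c(1, 2)) {
  dfString <- paste("df", n, sep = "")
  print(dim(get(dfString)))
}

# mget() collects several objects into a named list in one go
L <- mget(paste("df", 1:2, sep = ""), envir = environment())
names(L)

# the names travel with the list, so the loop knows which one it is handling
for (nm in names(L)) cat(nm, "has", nrow(L[[nm]]), "rows\n")
```

eval(dfString) just returns the string because a character value evaluates to
itself; get() is what turns the string into a lookup.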


On 8/19/07, Darren Weber [EMAIL PROTECTED] wrote:
  I have several data frames, eg:

  df1 <- data.frame(x=seq(0,10), y=seq(10,20))
  df2 <- data.frame(a=seq(0,10), b=seq(10,20))

 It is common to create loops in R like this:

  for(df in list(df1, df2)){ #etc. }

 This works fine when you know the name of the objects to put into the
 list.  I assume that the order of the objects in the list is respected
 through the loop.  Inside the loop, the objects of the list are
 'dereferenced' using 'df' but, to my knowledge, there is no way to
 tell whether 'df' is a current representation of 'df1' or 'df2'
 without some additional book keeping.

 In addition, I really want to use 'paste' within the loop to create a
 new string value that will have the symbol name of a data frame to be
 dereferenced, e.g.:

  for(n in c(1, 2)){ dfString <- paste('df', n, sep=""); 
  print(eval(dfString)) }

 [1] "df1"
 [1] "df2"

 This is not what I want.  I have read through the documentation on
 eval and similar commands like substitute and quote.  I program
 regularly, but I do not understand these constructs in R.  I do not
 understand the R framework for parsing and evaluation and I don't have
 a lot of time right now to get lost in this detail.  I could really
 use some help to get the string values in my loop to be parsed into
 symbols that refer to the data frame objects df1 and df2.  How is this
 done?

 Best, Darren



Re: [R] how to collapse a list of 1 column matrix to a matrix?

2007-08-19 Thread Gabor Grothendieck
Try this:

L <- list(`1` = matrix(1:4, 4), `2` = matrix(5:8, 4))
sapply(L, c)

Note that the list component names are kept as column names in the result
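
For comparison, a sketch of the cbind route mentioned in the question, done
without an explicit loop via do.call. The values agree with sapply(L, c), but
the list names are not carried over as column names here, since cbind takes
column names from the matrices themselves.

```r
L <- list(`1` = matrix(1:4, 4), `2` = matrix(5:8, 4))

m1 <- sapply(L, c)        # 4 x 2, columns named "1" and "2"
m2 <- do.call(cbind, L)   # 4 x 2, no column names

m1
m2
all(m1 == m2)
```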


On 8/19/07, [EMAIL PROTECTED] [EMAIL PROTECTED] wrote:
 Hi,

 I encounter a situation where I have a list whose element is a column matrix. 
 Says,

 $'1'
 [,1]
 1
 2
 3

 $'2'
 [,1]
 4
 5
 6

 Is there fast way to collapse the list into a matrix like a cbind operation 
 in this case? Meaning, the result should be a matrix that looks like:

  [,1]  [,2]
 [1,]1  4
 [2,]2  5
 [3,]3  6

 I can loop through all elements and do cbind manually. But I think there must 
 be a simpler way that I don't know. Thank you.

 - adschai



Re: [R] recommended combo of apps for new user?

2007-08-18 Thread Gabor Grothendieck
On 8/18/07, Martin Brown [EMAIL PROTECTED] wrote:
 Hi there,

 I would like some advice, not so much about how to use R, but about software
 that I need to complement R.  I've rooted around in the FAQ's and done a few
 searches on this mailing list but haven't quite found the perspective I
 need.

 I am an experienced data analyst in my field (forest ecology and ecological
 monitoring) but new to R. I am a long time user of SPSS and have gotten
 pretty handy with it.  However, I am frustrated with SPSS for several
 reasons:  There's the cost (I'm a freelancer; I pay for my software
 myself);  the Windows dependence (I use Kubuntu as my usual OS now, and
 switching back and forth is a pain); the horrible inefficiency when I do
 certain types of file manipulations; and the inability to do the kind of
 publication-quality graphs I want... I've usually ended up using a
 commercial graphing program (another source of expense and limitation).

 I'd like to switch to using R on Kubuntu, for all those reasons.  In
 addition I think the mathematical formality that R encourages might be good
 for me.

From a strictly language perspective, mathematical formality is pretty
far from R.  It's actually quite loose.  Underneath there are some Lisp/Scheme
ideas but you are not very close to that as a user.


 However, reviewing the FAQ's on the R project web site makes me realize that
 I've been using SPSS as three kinds of software really:  a DBMS; a
 statistical analysis package; and a graphing package.  It looks like moving
 to R might involve learning three kinds of software, not just one.  I
 wonder:

 1) What open-source DBMS works most seamlessly with R?  I have seen MySQL
 recommended but wonder if there are alternatives.  I sometimes need to
 handle big data files.  In fact a lot of my work involves exploratory and
 descriptive analyses of rather large and messy databases from ecological
 monitoring, rather than statistical tests per se.  In SPSS the data files I
 have been generating have dozens of columns and thousands of rows, often
 with value and variable labels helpful for documenting my work.

Databases. SQLite is the easiest to install since it's embedded rather
than client/server, so I would use that unless your application requires
client/server or other features of MySQL.  MySQL is probably the most
popular of the free databases so that would be the next one to go with.
If you intend to create a commercial application you might want to
consider Postgres instead of MySQL as the latter charges for
commercial implementations but Postgres does not.  Some heavy
Postgres users might feel that it should be considered after SQLite
rather than MySQL and there is a certain amount of arbitrariness here.
See the R packages RSQLite, RMySQL and DBI.  The R packages sqldf and
SQLiteDF are beginning to blur the boundary between R and the database.
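
To make the SQLite route a bit more concrete, here is a minimal sketch of
a round trip through DBI and RSQLite.  It assumes both packages are
installed; the table name "plots" and its columns are made up purely for
illustration.

```r
# Minimal sketch: push a data frame into an in-memory SQLite database and
# query it back with SQL.  The 'plots' table and its columns are hypothetical.
plots <- data.frame(site   = c("north", "north", "south", "south"),
                    height = c(12.1, 14.3, 9.8, 10.4))

site_means <- NULL
if (requireNamespace("DBI", quietly = TRUE) &&
    requireNamespace("RSQLite", quietly = TRUE)) {
  con <- DBI::dbConnect(RSQLite::SQLite(), ":memory:")  # nothing written to disk
  DBI::dbWriteTable(con, "plots", plots)
  site_means <- DBI::dbGetQuery(con,
    "SELECT site, AVG(height) AS mean_height
       FROM plots GROUP BY site ORDER BY site")
  DBI::dbDisconnect(con)
}
site_means
```

Replacing ":memory:" with a file name should give a database that persists
between sessions, which is probably closer to how SPSS data files get used.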

 2) For the purpose of creating publication-quality graphs, do R users
 typically need to go outside of the R system? If so, what open-source
 programs would you all recommend?

Graphics.  R should be ok.  Check out:
   http://cran.r-project.org/src/contrib/Views/Graphics.html
and also google for
   R Graphics Gallery

 3) Any other software I need to learn that would make my work in R more
 productive? (for example, a code editor).


Other.  You need to know a text editor.  I use vim but there are
many good choices here with ESS being one that is often mentioned.

http://www.sciviews.org/_rgui/projects/Editors.html
http://ess.r-project.org/

If you intend to write C routines to run with R then, of course, you
need to know C.
For certain R packages that interface with outside software (tcltk, Rgraphviz,
Ryacas, XML, etc.) you will need to know something about the interfaced-to
software if you intend to use those packages.

For package development you will need to know latex and possibly subversion,
i.e. svn, the UNIX screen program, tar and various other UNIX commands.
Certain auxiliary programs that come with and are used with R are written
in perl although it's unlikely you will need to know it.



Re: [R] names not inherited in functions

2007-08-17 Thread Gabor Grothendieck
Within a function deparse(substitute(x)) will give the name of x, as a character
variable.  Search the archives for
  deparse substitute
to find many examples.
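
As a rough sketch of how that could look for the llabel function quoted
below: to keep it free of package dependencies it reads a plain "label"
attribute with attr() as a stand-in for Hmisc's label(), so treat that
part as an assumption rather than as what Hmisc does internally.

```r
# Capture the expression supplied for 'var' before doing anything else with it.
llabel <- function(var) {
  nm  <- deparse(substitute(var))   # text of the caller's argument, e.g. "v2"
  lab <- attr(var, "label")         # base-R stand-in for Hmisc::label()
  if (!is.null(lab) && nzchar(lab)) lab else nm
}

v1 <- c(1, 2); attr(v1, "label") <- "var1"
v2 <- c(1, 2)

llabel(v1)   # a label is set, so the label is returned
llabel(v2)   # no label, so the name of the argument is returned
```

Note that the text returned is whatever expression the caller wrote, so
llabel(tablo$v2) would give "tablo$v2" rather than "v2".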

On 8/17/07, david dav [EMAIL PROTECTED] wrote:
 Dear R list,
 After a huge delay, I come back to this question. Using names of
 variables inside a function is a problem I run into quite often.
 Maybe this little example should help to get my point:
 Suppose I want to make a function llabel to get the labels of the
 variables from a data frame.
 If no label is defined, llabel should return the name of the variable.

library(Hmisc)
v1 <- c(1,2)
v2 <- c(1,2)
v3 <- c(1,3)
tablo <- data.frame(v1,v2,v3)
rm(v1,v2,v3)

label(tablo$v1) <- "var1"
attach(tablo)

 # This does the trick on one variable.
if (label(v1) != "") label(v1)   else names(data.frame(v1))
if (label(v2) != "") label(v2)   else names(data.frame(v2))

 But if I call this statement in a llabel function,

llabel <- function(var) {
if (label(var) != "")
res <- label(var)
else res <- names(data.frame(var))
return (res) }

 I just get "var" instead of the names when no label is defined:

 llabel(v1) # works
 llabel(v2) # gives "var" instead of "v2"

 Thanks for your help.

 David


 2007/6/7, Uwe Ligges [EMAIL PROTECTED]:
  Not sure what you are going to get. Can you shorten your functions and
  specify some example data? Then please tell us what your expected result is.
 
  Best,
  Uwe Ligges
 
 
 
 
  david dav wrote:
   Dear all,
  
   I 'd like to keep the names of variables when calling them in a function.
   An example might help to understand my problem :
  
   The following function puts the counts and percentages for a
   data.frame called tablo into a new data frame.
   The step  nom.chiffr[1] <- names(vari)  is useless as names from the
   original data.frame aren't kept in the function environment.
  
   Hoping I use appropriate R-vocabulary, I thank you for your help
  
   David
  
   descriptif <- function (tablo) {
 descriptifvar <- function (vari) {
 table(vari)
 length(vari[!is.na(vari)])
 chiffr <-
   cbind(table(vari),100*table(vari)/(length(vari[!is.na(vari)])))
 nom.chiffr <- rep(NA, dim(table(vari)))
 if (is.null(names(vari))) nom.chiffr[1] <- paste(i,"") else
 nom.chiffr[1] <- names(vari)
 chiffr <- data.frame (  names(table(vari)),chiffr)
 rownames(chiffr) <- NULL
 chiffr <- data.frame (nom.chiffr, chiffr)
 return(chiffr)
 }
  
 res <- rep(NA, 4)
 for (i in 1 : ncol(tablo))
 res <- rbind(res,descriptifvar(tablo[,i]))
 colnames(res) <- c("variable", "niveau", "effectif", "pourcentage")
   return(res[-1,])
   }
   # NB I used this function on a data.frame with only factors in
  


Re: [R] an easy way to construct this special matirx

2007-08-16 Thread Gabor Grothendieck
Here are two solutions.  In the first, lo is TRUE on the lower triangle
and the diagonal. Then we compute the exponents, multiplying by lo to zero
out the upper triangle.  In the second, rn is a matrix of row numbers
and rn >= t(rn) is the same as lo in the first solution.

r <- 2; n <- 5 # test data

lo <- lower.tri(diag(n), diag = TRUE)
lo * r ^ (row(lo) - col(lo) + 1)

Here is another one:

rn <- row(diag(n))
(rn >= t(rn)) * r ^ (rn - t(rn) + 1)

On 8/15/07, [EMAIL PROTECTED] [EMAIL PROTECTED] wrote:
 Hi,
 Sorry if this is a repost. I searched but found no results.
 I am wondering if it is an easy way to construct the following matrix:

 r     1    0    0    0
 r^2   r    1    0    0
 r^3   r^2  r    1    0
 r^4   r^3  r^2  r    1

 where r could be any number. Thanks.
 Wen



Re: [R] time series with quality codes

2007-08-16 Thread Gabor Grothendieck
In addition, we could create a function to.df which converts a zoo
object to a data frame assuming that any column that only contains
1:nlevels is a factor with the indicated level names.  Use to.df just
before plotting:

library(zoo)
set.seed(1)
f <- zoo(factor(sample(3, 10, replace = TRUE)))
x <- zoo(rnorm(10))
y <- zoo(rnorm(10))
z <- merge(x, y, f)

to.df <- function(z, levels = letters[1:3], time = FALSE) {
    zz <- as.data.frame(z)
    # turn every column holding only the codes 1:nlevels into a factor
    for(i in seq_len(ncol(zz)))
        if (all(zz[,i] %in% seq_along(levels)))
            zz[,i] <- factor(levels[zz[,i]])
    if (time) cbind(index = index(z), zz) else zz
}

library(lattice)
xyplot(y ~ x | f, data = to.df(z))




On 8/16/07, Achim Zeileis [EMAIL PROTECTED] wrote:
 On Thu, 16 Aug 2007, Felix Andrews wrote:

  list(...),
 
  I am working with environmental time series (eg rainfall, stream flow)
  that have attached quality codes for each data point. The quality
  codes have just a few factor levels, like good, suspect, poor,
  imputed. I use the quality codes in plots and summaries. They are
  carried through when a time series is aggregated to a longer
  time-step, according to rules like worst, median or mode.
 
  I need to support time steps of anything from hours to years. I can
  assume the data are regular time series -- they might be irregular
  initially but could be 'regularized'. But I would want to plot
  irregular time series along with regular ones.
 
  So far I have been using a data frame with a POSIXct column, a numeric
  column and a factor column. However I would like to use zoo instead,
  because of its many utility functions and easy conversion to ts. Is
  there any prospect of zoo handling such numeric + factor data? Other
  suggestions on elegant ways to do it are also welcome.

 There is some limited support for this in zoo. You can do
   z <- zoo(myfactor, myindex)
 and work with it like a zoo series and then
   coredata(z)
 will recover a factor. However, you cannot bind this to other series
 without losing the factor structure. At least not in a plain zoo series.
 But you can do
   df <- merge(z, Z, retclass = "data.frame")
 where every column of the resulting data.frame is a univariate zoo series.

 The final option would be to just have a data.frame as usual and put your
 data/index into one column. But then it's more difficult to leverage zoo's
 functionality.

 I would like to have more support for things like this, but currently this
 is what we have.

 Best,
 Z

  Felix
 
  --
  Felix Andrews
  PhD candidate
  Integrated Catchment Assessment and Management Centre
  The Fenner School of Environment and Society
  The Australian National University (Building 48A), ACT 0200
  Beijing Bag, Locked Bag 40, Kingston ACT 2604
  http://www.neurofractal.org/felix/
  xmpp:[EMAIL PROTECTED]
  3358 543D AAC6 22C2 D336  80D9 360B 72DD 3E4C F5D8
 


Re: [R] an easy way to construct this special matirx

2007-08-16 Thread Gabor Grothendieck
It was pointed out that the required matrix may not be square and
the superdiagonal was missing in my prior post.  Here is a revision:

r <- 2; nr <- 4; nc <- 5 # test data

x <- matrix(nrow = nr, ncol = nc)
x <- row(x) - col(x) + 1
(x >= 0) * r ^ x
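
For comparison, the same matrix can be sketched with outer(), which
spells out the rule cell by cell.  This is only an alternative
formulation of the code above, not something from the original question;
the name m is arbitrary.

```r
r <- 2; nr <- 4; nc <- 5   # test data, as above

# Exponent for cell (i, j) is i - j + 1; cells above the superdiagonal are 0.
m <- outer(seq_len(nr), seq_len(nc),
           function(i, j) ifelse(i - j + 1 >= 0, r^(i - j + 1), 0))
m

# Same result as the row()/col() construction
x <- row(m) - col(m) + 1
all.equal(m, (x >= 0) * r^x)
```

One difference worth knowing about: with r = 0 the multiplication form
produces NaN above the superdiagonal (0 * Inf), while the ifelse() form
returns plain zeros there.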

On 8/16/07, Gabor Grothendieck [EMAIL PROTECTED] wrote:
 Here are two solutions.  In the first, lo is TRUE on the lower triangle
 and the diagonal. Then we compute the exponents, multiplying by lo to zero
 out the upper triangle.  In the second, rn is a matrix of row numbers
 and rn >= t(rn) is the same as lo in the first solution.

 r <- 2; n <- 5 # test data

 lo <- lower.tri(diag(n), diag = TRUE)
 lo * r ^ (row(lo) - col(lo) + 1)

 Here is another one:

 rn <- row(diag(n))
 (rn >= t(rn)) * r ^ (rn - t(rn) + 1)

 On 8/15/07, [EMAIL PROTECTED] [EMAIL PROTECTED] wrote:
  Hi,
  Sorry if this is a repost. I searched but found no results.
  I am wondering if it is an easy way to construct the following matrix:
 
  r     1    0    0    0
  r^2   r    1    0    0
  r^3   r^2  r    1    0
  r^4   r^3  r^2  r    1
 
  where r could be any number. Thanks.
  Wen




Re: [R] Linear models over large datasets

2007-08-16 Thread Gabor Grothendieck
It's actually only a few lines of code to do this from first principles.
The coefficients depend only on the cross products X'X and X'y and you
can build them up easily by extending this example to read files or
a database holding x and y instead of getting them from the args.
Here we process incr rows of builtin matrix state.x77 at a time
building up the two cross products, xtx and xty, regressing
Income (variable 2) on the other variables:

mylm <- function(x, y, incr = 25) {
    start <- xtx <- xty <- 0
    while(start < nrow(x)) {
        idx <- seq(start + 1, min(start + incr, nrow(x)))
        # drop = FALSE keeps a one-row chunk as a matrix
        x1 <- cbind(1, x[idx, , drop = FALSE])
        xtx <- xtx + crossprod(x1)
        xty <- xty + crossprod(x1, y[idx])
        start <- start + incr
    }
    solve(xtx, xty)
}

mylm(state.x77[,-2], state.x77[,2])
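
As a sanity check on the idea, the sketch below accumulates the cross
products in blocks on simulated data and compares the result with
coef(lm()).  The data, the coefficients used to simulate it and the
name chunked_coef are all made up for the check.

```r
set.seed(1)
n <- 200
X <- cbind(x1 = rnorm(n), x2 = rnorm(n), x3 = rnorm(n))  # simulated predictors
y <- drop(1 + X %*% c(2, -1, 0.5) + rnorm(n))

# Accumulate X'X and X'y over blocks of 'incr' rows
chunked_coef <- function(x, y, incr = 25) {
  start <- 0
  xtx <- xty <- 0
  while (start < nrow(x)) {
    idx <- seq(start + 1, min(start + incr, nrow(x)))
    x1  <- cbind(1, x[idx, , drop = FALSE])   # add intercept column
    xtx <- xtx + crossprod(x1)
    xty <- xty + crossprod(x1, y[idx])
    start <- start + incr
  }
  drop(solve(xtx, xty))
}

b_chunked <- chunked_coef(X, y)
b_lm      <- coef(lm(y ~ X))
all.equal(unname(b_chunked), unname(b_lm))
```

Keep in mind that solving the normal equations is numerically less stable
than the QR decomposition lm() uses, so on badly scaled predictors the two
can differ more than they do here.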


On 8/16/07, Alp ATICI [EMAIL PROTECTED] wrote:
 I'd like to fit linear models on very large datasets. My data frames
 are about 200 rows x 200 columns of doubles and I am using a 64-bit
 build of R. I've googled about this extensively and went over the
 R Data Import/Export guide. My primary issue is although my data
 represented in ascii form is 4Gb in size (therefore much smaller
 considered in binary), R consumes about 12Gb of virtual memory.

 What exactly are my options to improve this? I looked into the biglm
 package but the problem with it is it uses update() function and is
 therefore not transparent (I am using a sophisticated script which is
 hard to modify). I really liked the concept behind the  LM package
 here: http://www.econ.uiuc.edu/~roger/research/rq/RMySQL.html
 But it is no longer available. How could one fit linear models to very
 large datasets without loading the entire set into memory but from a
 file/database (possibly through a connection) using a relatively
 simple modification of standard lm()? Alternatively how could one
 improve the memory usage of R given a large dataset (by changing some
 default parameters of R or even using on-the-fly compression)? I don't
 mind much higher levels of CPU time required.

 Thank you in advance for your help.



Re: [R] Combine matrix

2007-08-16 Thread Gabor Grothendieck
Try this.  We convert to data frame placing the row names in column 1, do
the merge, remove column 1 and convert back to matrix:

# test input
a <- matrix(1:25, nrow = 5,
  dimnames = list(letters[1:5], rep("A", 5)))
b <- matrix(1:40, nrow = 8,
  dimnames = list(rep(letters[1:2], each = 4), rep("B", 5)))

# 1. process
to.DF <- function(x) data.frame(rn = row.names(x), x, row.names = 1:nrow(x))
out <- as.matrix(merge(to.DF(a), to.DF(b), by = 1)[,-1])
colnames(out) <- c(colnames(a), colnames(b))
out

# 2. same but merge is done using sqldf
# assume same a, b and to.DF as before

library(sqldf)
DFa <- to.DF(a)
DFb <- to.DF(b)
out <- as.matrix(sqldf("select * from DFa join DFb using(rn)")[-1])
colnames(out) <- c(colnames(a), colnames(b))
out


# 3. same but uses sqldf and proto (which sqldf automatically loads)
# assume same a, b and to.DF as before

library(sqldf)
out <- as.matrix(sqldf("select * from a join b using(rn)",
  envir = proto(a = to.DF(a), b = to.DF(b)))[-1])
colnames(out) <- c(colnames(a), colnames(b))
out
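
If the row names of one of the matrices are unique, as they are for a in
this example, a sketch with plain indexing may be enough and it keeps the
row names in the result.  This is a different technique from the merge
above and it does not handle the many-to-many case; out2 and keep are
just illustrative names.

```r
a <- matrix(1:25, nrow = 5,
            dimnames = list(letters[1:5], rep("A", 5)))
b <- matrix(1:40, nrow = 8,
            dimnames = list(rep(letters[1:2], each = 4), rep("B", 5)))

# Keep the rows of b whose name also occurs in a, then pull the matching
# row of a for each of them by name (assumes rownames(a) are unique).
keep <- rownames(b) %in% rownames(a)
out2 <- cbind(a[rownames(b)[keep], , drop = FALSE],
              b[keep, , drop = FALSE])
out2
```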




On 8/16/07, Gianni Burgin [EMAIL PROTECTED] wrote:
 let say something like this


 a=matrix(1:25, nrow=5)

 rownames(a)=letters[1:5]
  colnames(a)=rep("A", 5)

  a
  A  A  A  A  A
 a 1  6 11 16 21
 b 2  7 12 17 22
 c 3  8 13 18 23
 d 4  9 14 19 24
 e 5 10 15 20 25

  b=matrix(1:40, nrow=8)
  rownames(b)=c(rep("a",4),rep("b",4))
  colnames(b)=rep("B", 5)

  b
  B  B  B  B  B
 a 1  9 17 25 33
 a 2 10 18 26 34
 a 3 11 19 27 35
 a 4 12 20 28 36
 b 5 13 21 29 37
 b 6 14 22 30 38
 b 7 15 23 31 39
 b 8 16 24 32 40

 As a result I would like something like

  A  A  A  A  A  B  B  B  B  B
 a 1  6 11 16 21  1  9 17 25 33
 a 1  6 11 16 21  2 10 18 26 34
 a 1  6 11 16 21  3 11 19 27 35
 a 1  6 11 16 21  4 12 20 28 36
 b 2  7 12 17 22  5 13 21 29 37
 b 2  7 12 17 22  6 14 22 30 38
 b 2  7 12 17 22  7 15 23 31 39
 b 2  7 12 17 22  8 16 24 32 40


 Is that clear? Is there a function that automates this operation?


 thank you very much!




 On 8/16/07, jim holtman [EMAIL PROTECTED] wrote:
 
  Can you provide an example of what you mean; e.g., the two input
  matrices and the desired output.
 
  On 8/16/07, Gianni Burgin [EMAIL PROTECTED] wrote:
   Hi R user,
  
   I am new to R, and I have a very simple question for you. I have two
   matrices A and B, with internally redundant rownames (but the variables
   are different). Some, but not all, of the rownames are shared between
   the two matrices. I want to create a larger matrix that combines the
   previous two and has all the possible combinations of matching rownames
   between matrices A and B.
  
   Looking for a solution I bumped into merge, but it actually works on
   data.frames, and in a data.frame there can be no redundancy in names.
  
   Can you help me?
  
  
 
 
  --
  Jim Holtman
  Cincinnati, OH
  +1 513 646 9390
 
  What is the problem you are trying to solve?
 



Re: [R] function to find coodinates in an array

2007-08-16 Thread Gabor Grothendieck
Get the indices using expand.grid and then reorder them:

set.seed(1); X <- array(rnorm(24), 2:4) # input
X # look at X

# lapply always returns a list, even if all dimensions are equal
do.call(expand.grid, lapply(dim(X), seq_len))[order(X),]
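
An alternative sketch uses arrayInd() from base R, which converts linear
indices into array coordinates directly; coords is just an illustrative
name.

```r
set.seed(1); X <- array(rnorm(24), 2:4)   # same input as above

coords <- arrayInd(order(X), dim(X))   # one row of (i, j, k) per element
head(coords)

# Indexing X with the coordinate matrix returns the values in sorted order
all(X[coords] == sort(X))
```

Because a matrix with one column per dimension can itself be used as an
index, X[coords] is a quick way to confirm the coordinates are right.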


On 8/16/07, Ana Conesa [EMAIL PROTECTED] wrote:
 Dear list,

 I am looking for a function/way to get the array coordinates of given
 elements in an array. What I mean is the following:
 - Let X be a 3D array
 - I find the ordering of the elements of X by ord - order(X) (this
 returns me a vector)
 - I now want to find the x,y,z coordinates of each element of ord

 Can anyone help me?

 Thanks!

 Ana



Re: [R] Formula in lm inside lapply

2007-08-15 Thread Gabor Grothendieck
It can't find x since the environment of formula1 and of formula2 is the Global
Environment and x is not there -- it's local to the function.

Try this:

#generating data
set.seed(1)
# group is made a factor explicitly so that levels() works
DF <- data.frame(y = rnorm(100, 1), x1 = rnorm(100, 1), x2 = rnorm(100, 1),
  group = factor(rep(c("A", "B"), c(40, 60))))

formula1 <- as.formula(y ~ x1)
lapply(levels(DF$group), function(x) {
   environment(formula1) <- environment()
   lm(formula1, DF, subset = group == x)
})

formula2 <- as.formula(y ~ x1 + x2)
lapply(levels(DF$group), function(x) {
   environment(formula2) <- environment()
   lm(formula2, DF, subset = group == x)
})
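
Another way to sidestep the problem, sketched here as an alternative and
not as what lm() requires, is to split the data first so that no subset=
argument (and hence no lookup of x) is needed at all.  The name fits is
arbitrary.

```r
set.seed(1)
DF <- data.frame(y = rnorm(100, 1), x1 = rnorm(100, 1), x2 = rnorm(100, 1),
                 group = factor(rep(c("A", "B"), c(40, 60))))

formula1 <- y ~ x1

# One data frame per group, one fit per data frame
fits <- lapply(split(DF, DF$group), function(d) lm(formula1, data = d))
sapply(fits, coef)
```

The list comes back named by group level, which makes the individual fits
easy to pick out, e.g. fits$A.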



On 8/15/07, Li, Yan (IED) [EMAIL PROTECTED] wrote:
 I am trying to run separate regressions for different groups of
 observations using the lapply function. It works fine when I write the
 formula inside the lm() function. But I would like to pass formulae into
 lm(), so I can do multiple models more easily. I got an error message
 when I tried to do that. Here is my sample code:

 #generating data
 x1 <- rnorm(100,1)
 x2 <- rnorm(100,1)
 y  <- rnorm(100,1)
 group <- rep(c("A","B"),c(40,60))
 group <- factor(group)
 df <- data.frame(y,x1,x2,group)

 #write formula inside lm--works fine
 res1 <- lapply(levels(df$group), function(x) lm(y~x1,df, subset = group
 ==x))
 res1
 res2 <- lapply(levels(df$group),function(x) lm(y~x1+x2,df, subset =
 group ==x))
 res2

 #try to pass formula into lm()--does not work
 formula1 <- as.formula(y~x1)
 formula2 <- as.formula(y~x1+x2)
 resf1 <- lapply(levels(df$group),function(x) lm(formula1,df, subset =
 group ==x))
 resf1
 resf2 <- lapply(levels(df$group),function(x) lm(formula2,df, subset =
 group ==x))
 resf2

 The error message is
 'Error in eval(expr, envir, enclos): object x not found'

 Any help is greatly appreciated!

 Yan
 



Re: [R] Formula in lm inside lapply

2007-08-15 Thread Gabor Grothendieck
Here is another solution that gets around the non-standard
way that subset= is handled in lm.  It has the advantage that unlike
the previous solution where formula1 and group == x appear literally
in the output, in this one the formula appears written out and
group == "A" and group == "B" appear:

> lapply(levels(DF$group), function(x) do.call("lm",
+    list(formula1, quote(DF), subset = bquote(group == .(x)))))
[[1]]

Call:
lm(formula = y ~ x1, data = DF, subset = group == "A")

Coefficients:
(Intercept)           x1
    1.04855      0.04585


[[2]]

Call:
lm(formula = y ~ x1, data = DF, subset = group == "B")

Coefficients:
(Intercept)           x1
    1.13593     -0.01627


On 8/15/07, Gabor Grothendieck [EMAIL PROTECTED] wrote:
 It can't find x since the environment of formula1 and of formula2 is the
 Global Environment and x is not there -- it's local to the function.

 Try this:

 #generating data
 set.seed(1)
 DF <- data.frame(y = rnorm(100, 1), x1 = rnorm(100, 1), x2 = rnorm(100, 1),
   group = factor(rep(c("A", "B"), c(40, 60))))

 formula1 <- as.formula(y ~ x1)
 lapply(levels(DF$group), function(x) {
   environment(formula1) <- environment()
   lm(formula1, DF, subset = group == x)
 })

 formula2 <- as.formula(y ~ x1 + x2)
 lapply(levels(DF$group), function(x) {
   environment(formula2) <- environment()
   lm(formula2, DF, subset = group == x)
 })



 On 8/15/07, Li, Yan (IED) [EMAIL PROTECTED] wrote:
  I am trying to run separate regressions for different groups of
  observations using the lapply function. It works fine when I write the
  formula inside the lm() function. But I would like to pass formulae into
  lm(), so I can do multiple models more easily. I got an error message
  when I tried to do that. Here is my sample code:
 
  #generating data
  x1 <- rnorm(100,1)
  x2 <- rnorm(100,1)
  y  <- rnorm(100,1)
  group <- rep(c("A","B"),c(40,60))
  group <- factor(group)
  df <- data.frame(y,x1,x2,group)
 
  #write formula inside lm--works fine
  res1 <- lapply(levels(df$group), function(x) lm(y~x1,df, subset = group
  ==x))
  res1
  res2 <- lapply(levels(df$group),function(x) lm(y~x1+x2,df, subset =
  group ==x))
  res2
 
  #try to pass formula into lm()--does not work
  formula1 <- as.formula(y~x1)
  formula2 <- as.formula(y~x1+x2)
  resf1 <- lapply(levels(df$group),function(x) lm(formula1,df, subset =
  group ==x))
  resf1
  resf2 <- lapply(levels(df$group),function(x) lm(formula2,df, subset =
  group ==x))
  resf2
 
  The error message is
  'Error in eval(expr, envir, enclos): object x not found'
 
  Any help is greatly appreciated!
 
  Yan
  
 


Re: [R] shell and shell.exec on Windows

2007-08-11 Thread Gabor Grothendieck
On Windows, the system() function has an invisible= argument.  The Ryacas
package uses system() to run yacas.  See the runYacas() and
yacasInvokeString() functions in yacas.R for examples:
   http://ryacas.googlecode.com/svn/trunk/R/yacas.R

On 8/11/07, Erich Neuwirth [EMAIL PROTECTED] wrote:
 I have an Excel workbook MyWorkbook.xls containing an Auto_Open macro
 which I want to be run from R.

 shell.exec("MyWorkbook.xls")
 does that.

 shell("start MyWorkbook.xls")
 also runs it.

 In both cases, the Excel window is visible on screen when Excel is started.
 Is there a way of opening the sheet with a hidden Excel window?
 start has some parameters (e.g. /MIN), which should allow this, but
 shell("start /MIN MyWorkbook.xls")
 also starts Excel visibly.



 --
 Erich Neuwirth, University of Vienna
 Faculty of Computer Science
 Computer Supported Didactics Working Group
 Visit our SunSITE at http://sunsite.univie.ac.at
 Phone: +43-1-4277-39464 Fax: +43-1-4277-39459



Re: [R] help with counting how many times each value occur in each column

2007-08-10 Thread Gabor Grothendieck
Try this where we have constructed the example to illustrate that
it does handle the case where not all values are in each column:

   mat <- matrix(rep(1:6, each = 4), 6)

   table(col(mat), mat)
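
Here is the same idea sketched on a matrix shaped like the one in the
question (20 rows, 10 columns, mostly -100 with a few 0 and -50 entries).
The matrix is rebuilt by hand from the posted printout, so treat dat as
an approximation of the real data.

```r
dat <- matrix(-100, nrow = 20, ncol = 10)
dat[c(1, 19), 4:9] <- 0     # rows 1 and 19 are 0 in columns 4 to 9
dat[10, 4] <- -50
dat[5, 10] <- -50

# Rows of the result are columns of dat, columns are the distinct values
counts <- table(column = col(dat), value = dat)
counts
```

Values that never occur in a given column simply show up as 0 in that
row of the table.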

On 8/10/07, Tom Cohen [EMAIL PROTECTED] wrote:
 Dear list,
  I have the following dataset and want to know how many times each value
 occurs in each column.
   data
        [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
   [1,] -100 -100 -100    0    0    0    0    0    0  -100
   [2,] -100 -100 -100 -100 -100 -100 -100 -100 -100  -100
   [3,] -100 -100 -100 -100 -100 -100 -100 -100 -100  -100
   [4,] -100 -100 -100 -100 -100 -100 -100 -100 -100  -100
   [5,] -100 -100 -100 -100 -100 -100 -100 -100 -100   -50
   [6,] -100 -100 -100 -100 -100 -100 -100 -100 -100  -100
   [7,] -100 -100 -100 -100 -100 -100 -100 -100 -100  -100
   [8,] -100 -100 -100 -100 -100 -100 -100 -100 -100  -100
   [9,] -100 -100 -100 -100 -100 -100 -100 -100 -100  -100
  [10,] -100 -100 -100  -50 -100 -100 -100 -100 -100  -100
  [11,] -100 -100 -100 -100 -100 -100 -100 -100 -100  -100
  [12,] -100 -100 -100 -100 -100 -100 -100 -100 -100  -100
  [13,] -100 -100 -100 -100 -100 -100 -100 -100 -100  -100
  [14,] -100 -100 -100 -100 -100 -100 -100 -100 -100  -100
  [15,] -100 -100 -100 -100 -100 -100 -100 -100 -100  -100
  [16,] -100 -100 -100 -100 -100 -100 -100 -100 -100  -100
  [17,] -100 -100 -100 -100 -100 -100 -100 -100 -100  -100
  [18,] -100 -100 -100 -100 -100 -100 -100 -100 -100  -100
  [19,] -100 -100 -100    0    0    0    0    0    0  -100
  [20,] -100 -100 -100 -100 -100 -100 -100 -100 -100  -100
  The result matrix should look like
   -100 0 -50
 [1]   20
 [2]   20
 [3]   20
 [4]   17
 [5]   18
 [6]   18
 [7]   18  and so on
 [8]
 [9]
 [10]

 How can I do this in R ?
  Thanks alot for your help,
 Tom




Re: [R] ordering a data.frame by average rank of multiple columns

2007-08-10 Thread Gabor Grothendieck
Try this:

positions <- order(ranks)
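
Worked through on the data from the question, with ranks taken to be the
weighted combined rank (Res1 counted twice), a sketch would be:

```r
Res1 <- c(1, 5, 4, -0.5, 3)
Res2 <- c(-1, 8, 2, 0, 3)
MyFrame <- data.frame(Res1, Res2, row.names = c("G1", "G2", "G3", "G4", "G5"))

CombR <- 2 * rank(MyFrame$Res1) + rank(MyFrame$Res2)  # Res1 weighted double
positions <- order(CombR)
positions             # 4 1 5 3 2, the vector asked for in the question
MyFrame[positions, ]
```

order() can be applied to the combined score directly, so there is no
need to convert it to ranks first.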

On 8/10/07, Tom.O [EMAIL PROTECTED] wrote:

 Hi

 I have run into a problem and I wonder if anyone has a smart way of doing
 this.

 For example I have this data frame for 5 different test groups:

 Res1 <- c(1,5,4,-0.5,3)
 Res2 <- c(-1,8,2,0,3)
 Mean <- c(0.5,1,1.5,-.5,2)
 MyFrame <- data.frame(Res1,Res2,Mean,row.names=c("G1","G2","G3","G4","G5"))

 where the first two columns are the results of two different tests, the
 third column is the mean of the group.

 I want to order this data.frame by the combined rank of Res1 and Res2, but
 where weights are assigned to the importance of each column. Let's assume
 that Res1 is twice as important and lower values rank better.

 MyRanks <- data.frame(Rank1=rank(MyFrame[,"Res1"]),
                       Rank2=rank(MyFrame[,"Res2"]),
                       CombR=2*rank(MyFrame[,"Res1"])+rank(MyFrame[,"Res2"]),
                       row.names=c("G1","G2","G3","G4","G5"))

    Rank1 Rank2 CombR
 G1     2     1     5
 G2     5     5    15
 G3     4     3    11
 G4     1     2     4
 G5     3     4    10


 and the rank of the combined is 2,5,4,1,3, but to be able to sort MyFrame
 in that order I need to enter this vector of positions c(4,1,5,3,2). Does
 anyone have a smart way of converting ranks to positions?

 Tom




Re: [R] [Fwd: Re: How to apply functions over rows of multiple matrices]

2007-08-10 Thread Gabor Grothendieck
1. matrices are stored columnwise so R is better at column-wise operations
than row-wise.

2. Here is one way to do it (although I am not sure it's better than the
index approach):

   row.apply <- function(f, a, b)
      t(mapply(f, as.data.frame(t(a)), as.data.frame(t(b))))

3. The code for the example in this post could be simplified to:

first.1 <- apply(cbind(goldstandard, 1), 1, which.max)
ifelse(col(newtest) > first.1, NA, newtest)

4. given that both examples did not inherently need row by row operations
   I wonder if that is the wrong generalization in the first place?
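
To see that points 2 and 3 agree, here is a small sketch on simulated
data of the same shape as in the quoted post.  The helper na_after_first
is a simplified stand-in for the quoted fillafter function, written only
for this check.

```r
set.seed(1)
numtest <- 6; numsubj <- 20
newtest      <- matrix(rbinom(numtest * numsubj, 1, 0.5), numsubj, numtest)
goldstandard <- matrix(rbinom(numtest * numsubj, 1, 0.5), numsubj, numtest)

# Hypothetical per-row function: NA out everything after the first 1 in g
na_after_first <- function(v, g) {
  hit <- which(g == 1)
  if (length(hit) && hit[1] < length(v)) v[(hit[1] + 1):length(v)] <- NA
  v
}

# Point 2: apply a two-argument function over matching rows of a and b
row.apply <- function(f, a, b)
  t(mapply(f, as.data.frame(t(a)), as.data.frame(t(b))))
by_row <- row.apply(na_after_first, newtest, goldstandard)

# Point 3: the same result without any row-wise looping
first.1    <- apply(cbind(goldstandard, 1), 1, which.max)
vectorized <- ifelse(col(newtest) > first.1, NA, newtest)

all(is.na(by_row) == is.na(vectorized))
all(by_row == vectorized, na.rm = TRUE)
```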


On 8/10/07, Johannes Hüsing [EMAIL PROTECTED] wrote:
 [Apologies to Gabor, who I sent a personal copy of the reply
 erroneously instead of posting to List directly]

 [...]
  Perhaps what you really intend is to
  take the average over those elements in each row of the first matrix
  which correspond to 1's in the second in the corresponding
  row of the second.  In that case it's just:

  rowSums(newtest * goldstandard) / rowSums(goldstandard)
 

 Thank you for clearing my thoughts about the particular example.
 My question was a bit more general though, as I have different
 functions which are applied row-wise to multiple matrices. An
 example that sets all values of a row of matrix A to NA after the
 first occurrence of TRUE in matrix B.

 fillfrom <- function(applvec, testvec=NULL) {
   if (is.null(testvec)) testvec <- applvec
   if (length(testvec) != length(applvec)) {
     stop("applvec and testvec have to be of same length!")
   } else if(any(testvec, na.rm=TRUE)) {
     applvec[min(which(testvec)) : length(applvec)] <- NA
   }
   applvec
 }

 fillafter <- function(applvec, testvec=NULL) {
   if (is.null(testvec)) testvec <- applvec
   fillfrom(applvec, c(FALSE, testvec[-length(testvec)]))
 }

 numtest <- 6
 numsubj <- 20

 newtest <- array(rbinom(numtest*numsubj, 1, .5),
                  dim=c(numsubj, numtest))
 goldstandard <- array(rbinom(numtest*numsubj, 1, .5),
                       dim=c(numsubj, numtest))

 newtest.NA <- t(sapply(1:nrow(newtest), function(i) {
   fillafter(newtest[i,], goldstandard[i,]==1)}))

 My general question is if R provides some syntactic sugar
 for the awkward sapply(1:nrow(A)) expression. Maybe in this
 case there is also a way to bypass the apply mechanism and
 my way of thinking about the problem has to be adapted. But
 as the *apply calls are galore in R, I feel this is a standard
 way of dealing with vectors and matrices.







Re: [R] Countvariable for id by date

2007-08-09 Thread Gabor Grothendieck
Try this:

Lines <- "id;dg1;dg2;date;
1;F28;;1997-11-04;
1;F20;F702;1998-11-09;
1;F20;;1997-12-03;
1;F208;;2001-03-18;
2;F32;;1999-03-07;
2;F29;F32;2000-01-06;
2;F32;;2003-07-05;
2;F323;F2800;2000-02-05;
"

# replace textConnection(Lines) with actual file name
DF <- read.csv2(textConnection(Lines), as.is = TRUE,
 colClasses = list("numeric", "character", "character", "Date", "NULL"))

# rank the dates of the rows where dg1 or dg2 matches pat; NA elsewhere
rk <- function(x, pat) {
  z <- regexpr(pat, x$dg1) > 0 | regexpr(pat, x$dg2) > 0
  rank(ifelse(z, x$date, NA), na.last = "keep")
}

DF$countF20 <- unlist(by(DF, DF$id, rk, pat = "^F20"))
DF$countF2129 <- unlist(by(DF, DF$id, rk, pat = "^F2[1-9]"))
DF
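
Since the question mentions ave(), here is a sketch of the same idea written
with ave() instead of by(). It is an alternative, not the code above: the
helper name count_by_date is made up for illustration, the data frame is typed
in directly rather than read from a file, and unlike unlist(by(...)) it does
not rely on the rows being grouped by id.

```r
# Example data from the question, entered directly.
DF <- data.frame(
  id   = c(1, 1, 1, 1, 2, 2, 2, 2),
  dg1  = c("F28", "F20", "F20", "F208", "F32", "F29", "F32", "F323"),
  dg2  = c("", "F702", "", "", "", "F32", "", "F2800"),
  date = as.Date(c("1997-11-04", "1998-11-09", "1997-12-03", "2001-03-18",
                   "1999-03-07", "2000-01-06", "2003-07-05", "2000-02-05")),
  stringsAsFactors = FALSE)

# Hypothetical helper: within each id, number the matching rows by date.
count_by_date <- function(DF, pat) {
  hit <- grepl(pat, DF$dg1) | grepl(pat, DF$dg2)
  out <- rep(NA_real_, nrow(DF))
  if (any(hit)) {
    # rank the dates of the matching rows separately for each id
    out[hit] <- ave(as.numeric(DF$date[hit]), DF$id[hit], FUN = rank)
  }
  out
}

DF$countF20   <- count_by_date(DF, "^F20")
DF$countF2129 <- count_by_date(DF, "^F2[1-9]")
DF
```

Rows that do not match the pattern stay NA, which corresponds to the empty
fields in the desired output quoted below.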




On 8/9/07, David Gyllenberg [EMAIL PROTECTED] wrote:
Best R-users,

  Here's a  newbie question. I have tried to find an answer to this via 
 help and the ave(x,factor(),FUN=function(y)  rank (z,tie='first')-function, 
 but without success.

  I have a dataframe (~8000 observations, register data) with four
 columns of interest: id, dg1, dg2 and date (YYYY-MM-DD):

  id;dg1;dg2;date;
  1;F28;;1997-11-04;
  1;F20;F702;1998-11-09;
  1;F20;;1997-12-03;
  1;F208;;2001-03-18;
  2;F32;;1999-03-07;
  2;F29;F32;2000-01-06;
  2;F32;;2003-07-05;
  2;F323;F2800;2000-02-05;
  ...

  I would like to have two additional columns:
  1. countF20:  a countvariable that shows which in order (by date) 
 the id has if it fulfils  the following logical expression: dg1 = F20* OR dg2 
 = F20*,
  where *  means F201,F202... F2001,F2002...F20001,F20002...
  2. countF2129:  another countvariable that shows which in order (by 
 date) the id has if it fulfils  the following logical expression: dg1 = 
 F21*-F29* OR dg2 = F21*-F29*,
  where F21*-F29*  means F21*, F22*...F29* and
  where *  means F211,F212... F2101,F2102...F21001,F21002...

  ... so the  dataframe would look like this, where 1 is the first 
 observation for the id with  the right condition, 2 is the second etc.:

  id;dg1;dg2;date;countF20;countF2129;
  1;F28;;1997-11-04;;1;
  1;F20;F702;1998-11-09;2;;
  1;F20;;1997-12-03;1;;
  1;F208;;2001-03-18;3;;
  2;F32;;1999-03-07;;;
  2;F29;F32;2000-01-06;;1;
  2;F32;;2003-07-05;;;
  2;F323;F2800;2000-02-05;;2;
  ...

  Do you know  a convenient way to create these kind of countvariables? 
 Thank you in  advance!

  / David (david.gyllenberg at yahoo.com)



