### Re: [R] plot legend: combining filled boxes and lines

```
Check out:
http://tolstoy.newcastle.edu.au/R/e2/help/07/05/16777.html

On 9/10/07, Lauri Nikkinen [EMAIL PROTECTED] wrote:
Hello,

I have difficulties combining boxes and lines in plot legend. I
searched previous R-posts and found this (with no solution):
http://tolstoy.newcastle.edu.au/R/help/06/07/30248.html. Is there a
way to avoid boxes behind the line legends?

x1 <- rnorm(100)
x2 <- rnorm(100, 2)
hist(x1, main = "", col = "orange", ylab = "density", xlab = "x",
     freq = F, density = 55, xlim = c(-2, 5), ylim = c(0, 0.5))
par(new = T)
hist(x2, main = "", col = "green", ylab = "", xlab = "", axes = F,
     xlim = c(-2, 5), ylim = c(0, 0.5), density = 45, freq = F)

abline(v = mean(x1), col = "orange", lty = 2, lwd = 2.5)
abline(v = mean(x2), col = "green", lty = 2, lwd = 2.5)
legend(3, 0.45, legend = c("x1", "x2", "mean(x1)", "mean(x2)"),
       col = c("orange", "green"), fill = c("orange", "green", 0, 0),
       lty = c(0, 0, 2, 2), merge = T)

Thanks
Lauri

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
and provide commented, minimal, self-contained, reproducible code.

```
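
One possible way to avoid the boxes behind the line entries in current versions of R is to pass `NA` as the fill and border for those entries, and `NA` as the line type for the box entries. This is a minimal sketch, assuming an R version whose `legend()` accepts a `border` argument (the 2007 release may not have had it); the seed, the `"topright"` position and the name `lg` are illustrative choices, not part of the original post.

```r
set.seed(1)                      # illustrative seed, not in the original post
x1 <- rnorm(100)
x2 <- rnorm(100, 2)

hist(x1, main = "", col = "orange", ylab = "density", xlab = "x",
     freq = FALSE, density = 55, xlim = c(-2, 5), ylim = c(0, 0.5))
# add = TRUE overlays the second histogram without par(new = TRUE)
hist(x2, col = "green", density = 45, freq = FALSE, add = TRUE)

abline(v = mean(x1), col = "orange", lty = 2, lwd = 2.5)
abline(v = mean(x2), col = "green",  lty = 2, lwd = 2.5)

# Entries 1-2 get a filled box and no line; entries 3-4 get a line and no box
lg <- legend("topright",
             legend = c("x1", "x2", "mean(x1)", "mean(x2)"),
             fill   = c("orange", "green", NA, NA),
             border = c("black", "black", NA, NA),
             lty    = c(NA, NA, 2, 2),
             lwd    = c(NA, NA, 2.5, 2.5),
             col    = c(NA, NA, "orange", "green"))
```

`legend()` returns its layout invisibly, so `lg` can be inspected if the placement needs checking.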

### Re: [R] off-topic: better OS for statistical computing

```
You want whatever all the people you are working with are using
to make it as easy as possible to work together with them.

On 9/10/07, Wensui Liu [EMAIL PROTECTED] wrote:
Good morning, everyone,
I am sorry for this off-topic post but think I can get great answer
from this list.
My question is what is the best OS on PC (laptop) for statistical
computing and why.
Have a nice day.

```

### Re: [R] off-topic: better OS for statistical computing

```
My sense is that R users are evenly split between UNIX and Windows
users, so either will do in terms of the larger community.

Some R packages may not be available on every platform, or will
be available on one platform before another, or there will be
certain platform-specific issues.  So in the end it's easiest to
have the same thing everyone else that you work with does.

Also, if you run into problems you can ask others, whereas if you
are the lone person with something different you have no one to
turn to.

Also associated software may be, for example, Microsoft Office in
a Microsoft environment and LaTeX in a UNIX environment. And
networking will be simplified in a consistent environment too.
Certainly there is Open Office, Samba and putty but the easiest
is just not to have to worry about getting everything to work
together by just having the same thing in the first place.

Neither Linux nor Windows is superior to the other.  People
making such representations generally know one much better
than the other, and it's more a reflection of their own experience
than anything else.  I personally have used both UNIX and
Windows since their inception and find that I tend to have a
slight preference for whatever I used last.  Technical merits of
one vs. the other are basically irrelevant for most purposes.

On 9/10/07, Patrick Connolly [EMAIL PROTECTED] wrote:
On Mon, 10-Sep-2007 at 12:26PM -0400, Gabor Grothendieck wrote:

| You want whatever all the people you are working with are using
| to make it as easy as possible to work together with them.

Assuming you're using R, there is negligible difficulty using a
different OS from what your colleagues use (apart from the
inconsistencies you get between different versions of Windows, but
even that has little effect on R).  The standard .RData binary files
work with Windows and Linux (and probably OS X).

The only issue I come across is that Linux can't create WMF files as
readily as Windows can, and that is more than made up for by the
greater flexibility that Linux offers.  It's easier in Linux to
produce Excel files from dataframes and matrices using a perl script
posted to this list by Marc Schwartz.  Thanks again Marc.

Best

Patrick

|
| On 9/10/07, Wensui Liu [EMAIL PROTECTED] wrote:
|  Good morning, everyone,
|  I am sorry for this off-topic post but think I can get great answer
|  from this list.
|  My question is what is the best OS on PC (laptop) for statistical
|  computing and why.
|  I really appreciate your insight.
|  Have a nice day.

--
~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.
___Patrick Connolly
{~._.~} Great minds discuss ideas
_( Y )_Middle minds discuss events
(:_~*~_:)Small minds discuss people
(_)-(_)   . Anon

~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.

```

### Re: [R] finding the minimum positive value of some data

```
Here are some solutions, each of which
1. is a single line,
2. mentions x only once, so you can plug in a complex expression for it,
3. leaves no temporary variables behind.

min(sapply(x, function(z) if (z > 0) z else Inf))

(function(z) min(ifelse(z > 0, z, Inf))) (x)

with(list(z = x), min(z[z > 0]))

local({ z <- x; min(z[z > 0]) })

On 9/10/07, dxc13 [EMAIL PROTECTED] wrote:

useRs,

I am looking to find the minimum positive value of some data I have.
Currently, I am able to find the minimum of data after I apply some other
functions to it:

> x
[1]  1  0  1  2  3  3  4  5  5  5  6  7  8  8  9  9 10 10

> sort(x)
[1]  0  1  1  2  3  3  4  5  5  5  6  7  8  8  9  9 10 10

> diff(sort(x))
[1] 1 0 1 1 0 1 1 0 0 1 1 1 0 1 0 1 0

> min(diff(sort(x)))
[1] 0

The minimum is given as zero, which is clearly true, but I am interested in
only the positive minimum, which is 1.  Can I find this by using only 1 line
of code, like I have above? Thanks!

dxc13
--
View this message in context:
http://www.nabble.com/finding-the-minimum-positive-value-of-some-data-tf4417250.html#a12599319
Sent from the R help mailing list archive at Nabble.com.

```
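
To make the reply concrete, here is a small sketch that applies each one-liner to the poster's actual expression, `diff(sort(x))`. The vector is copied from the question; the names `d`, `min_pos` and `m1` to `m4` are illustrative only.

```r
x <- c(1, 0, 1, 2, 3, 3, 4, 5, 5, 5, 6, 7, 8, 8, 9, 9, 10, 10)

# Differences between sorted neighbours; zeros mark tied values
d <- diff(sort(x))

# Plain version with a temporary variable
min_pos <- min(d[d > 0])

# One-line versions from the reply, with the expression plugged in for x
m1 <- min(sapply(diff(sort(x)), function(z) if (z > 0) z else Inf))
m2 <- (function(z) min(ifelse(z > 0, z, Inf)))(diff(sort(x)))
m3 <- with(list(z = diff(sort(x))), min(z[z > 0]))
m4 <- local({ z <- diff(sort(x)); min(z[z > 0]) })

print(c(min_pos, m1, m2, m3, m4))
```

All five agree for this data. They differ when there are no positive values at all: the first two return `Inf` quietly, while the subsetting forms call `min()` on an empty vector, which also returns `Inf` but with a warning.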

### Re: [R] what am I missing

```
It's a FAQ:

http://hermes.sdu.dk/Rdoc/faq.html#Why%20does%20outer()%20behave%20strangely%20with%20my%20function%3f

On 9/10/07, Jan de Leeuw [EMAIL PROTECTED] wrote:
x <- seq(-1,1,length=10)
y <- seq(-1,1,length=10)
a <- matrix(c(1,2,2,1),2,2)
b <- matrix(c(2,1,1,2),2,2)

fv <- function(x,y) {
  m <- x*a+y*b
  t <- m[1,1]+m[2,2]; d <- m[1,1]*m[2,2]-m[1,2]^2
  return((t-sqrt(t^2-4*d))/2)
}

gv <- function(x,y) {
  t <- x*(a[1,1]+a[2,2])+y*(b[1,1]+b[2,2])
  d <- (x*a[1,1]+y*b[1,1])*(x*a[2,2]+y*b[2,2])-(x*a[1,2]+y*b[1,2])^2
  return((t-sqrt(t^2-4*d))/2)
}

now outer(x,y,gv) works as expected, outer(x,y,fv) bombs. But

z <- matrix(0,10,10); for (i in 1:10) for (j in 1:10) z[i,j] <- fv(x[i],y[j])

works fine. Must be something in outer().

==
Jan de Leeuw, 11667 Steinhoff Rd, Frazier Park, CA 93225, 661-245-1725
.mac: jdeleeuw ++  aim: deleeuwjan ++ skype: j_deleeuw
homepages: http://www.cuddyvalley.org and http://gifi.stat.ucla.edu
==
A bath when you're born,
a bath when you die,
how stupid.  (Issa 1763-1827)

```
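
The point of the FAQ entry is that `outer()` calls the function once with two long vectors holding every (x, y) pair and expects a vector of the same length back. `gv` uses only element-wise arithmetic, so it copes; `fv` builds a single 2 x 2 matrix and therefore only works for scalar arguments. A minimal sketch of one possible fix, assuming a version of R that provides `Vectorize()` in base (this may postdate the 2007 thread); the local names `tr` and `dt` replace the original `t` and `d` purely for readability.

```r
x <- seq(-1, 1, length.out = 10)
y <- seq(-1, 1, length.out = 10)
a <- matrix(c(1, 2, 2, 1), 2, 2)
b <- matrix(c(2, 1, 1, 2), 2, 2)

# Scalar-only: builds one 2 x 2 matrix, so x and y must have length 1
fv <- function(x, y) {
  m  <- x * a + y * b
  tr <- m[1, 1] + m[2, 2]
  dt <- m[1, 1] * m[2, 2] - m[1, 2]^2
  (tr - sqrt(tr^2 - 4 * dt)) / 2
}

# Element-wise arithmetic only, so it already works on whole vectors
gv <- function(x, y) {
  tr <- x * (a[1, 1] + a[2, 2]) + y * (b[1, 1] + b[2, 2])
  dt <- (x * a[1, 1] + y * b[1, 1]) * (x * a[2, 2] + y * b[2, 2]) -
        (x * a[1, 2] + y * b[1, 2])^2
  (tr - sqrt(tr^2 - 4 * dt)) / 2
}

# Vectorize() wraps fv so that it is applied to each (x, y) pair in turn
z_fv <- outer(x, y, Vectorize(fv))
z_gv <- outer(x, y, gv)
print(all.equal(z_fv, z_gv))
```

The wrapped version is no faster than the explicit double loop in the question; it only lets `outer()` do the bookkeeping.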

### Re: [R] SQL like function?

```
Great.  Regarding the web, note that there are actually quite a few R
web projects as well:

http://www.lmbe.seu.edu.cn/CRAN/doc/FAQ/R-FAQ.html#R-Web-Interfaces

I have used rpad (www.rpad.org) which has an integrated web server right
in the R package making setup a non-issue.

On 9/8/07, Takatsugu Kobayashi [EMAIL PROTECTED] wrote:
Hi Gabor,

Wow, this is awesome although I eventually should learn MySQL for
integrating it on web-based DB management using PHP or Perl, this is a

Thank you very much

Gabor Grothendieck wrote:
SQL, you can use SQL to manipulate R data frames using the sqldf package
which provides an interface to lower level RSQLite (and RMySQL in the
future)
routines.  The following examples use SQLite underneath:

DF <- data.frame(observation = c(1,2,3,4,5))
ID <- data.frame(ID = c(1, 3, 4))

library(sqldf)
sqldf("select observation, observation in (select * from ID) `ID?` from DF")

# or

sqldf("select observation, observation in (1, 3, 4) `ID?` from DF")

On 9/7/07, Takatsugu Kobayashi [EMAIL PROTECTED] wrote:

Hi RUsers,

I am wondering if I can search for observations whose IDs match any of the
values in another vector, as in MySQL. While I am learning MySQL for
future database management, I would appreciate it if anyone could give me a hint.

Suppose I have one 5*1 vector containing observation IDs and
frequencies, and one 3*1 vector containing observation IDs.

observation <- c(1,2,3,4,5)
ID <- c(1,3,4)

Then, I would like to program a code that returns a results showing
matched observations like

result: TRUE FALSE TRUE TRUE FALSE

I am reading S programming, but I cannot find a way to do this.

Thank you very much.

Taka

```

### Re: [R] Lisp-like primitives in R

```
On 9/8/07, Peter Dalgaard [EMAIL PROTECTED] wrote:
François Pinard wrote:
[Roland Rau]

[François Pinard]

I wonder what happened, for R to hide the underlying Scheme so fully,
at least at the level of the surface language (despite there are
hints).

To further foster portability, we chose to write R in ANSI C

Yes, of course.  Scheme is also (often) implemented in C.  I meant that
R might have implemented a Scheme engine (or part of a Scheme engine,
extended with appropriate data types) with a surface language (nearly
the S language) which is purposely not Scheme, but could have been.

If the gap is not extreme, one could dare dreaming that the Scheme
engine in R be completed, and Scheme offered as an alternate extension
language.  If you allow me to continue dreaming awake -- they told me
they will let me free as long as I do not get dangerous! :-) -- part
of the interest lies in the fact there are excellent Scheme compilers.
If we could only find or devise some kind of marriage between a mature
Scheme and R, so to speed up the non-vectorisable parts of R scripts...

Well, depending on what you want, this is either trivial or
impossible... The internal storage of R is still pretty much equivalent
to scheme. E.g. try this:

r2scheme <- function(e) if (!is.recursive(e))
  deparse(e) else c("(", unlist(lapply(as.list(e), r2scheme)), ")")
paste(r2scheme(quote(for(i in 1:4)print(i))), collapse=" ")
[1] "( for i ( : 1 4 ) ( print i ) )"

Also see showTree in codetools:

library(codetools)
showTree(quote(for(i in 1:4)print(i)))
(for i (: 1 4) (print i))

```

### Re: [R] SQL like function?

```
Others have already pointed out %in% but regarding your comment about
SQL, you can use SQL to manipulate R data frames using the sqldf package
which provides an interface to lower level RSQLite (and RMySQL in the future)
routines.  The following examples use SQLite underneath:

DF <- data.frame(observation = c(1,2,3,4,5))
ID <- data.frame(ID = c(1, 3, 4))

library(sqldf)
sqldf("select observation, observation in (select * from ID) `ID?` from DF")

# or

sqldf("select observation, observation in (1, 3, 4) `ID?` from DF")

On 9/7/07, Takatsugu Kobayashi [EMAIL PROTECTED] wrote:
Hi RUsers,

I am wondering if I can search for observations whose IDs match any of the
values in another vector, as in MySQL. While I am learning MySQL for
future database management, I would appreciate it if anyone could give me a hint.

Suppose I have one 5*1 vector containing observation IDs and
frequencies, and one 3*1 vector containing observation IDs.

observation <- c(1,2,3,4,5)
ID <- c(1,3,4)

Then, I would like to program a code that returns a results showing
matched observations like

result: TRUE FALSE TRUE TRUE FALSE

I am reading S programming, but I cannot find a way to do this.

Thank you very much.

Taka

```
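
The `%in%` approach mentioned at the top of the reply needs no extra package. A short sketch using the two vectors from the question; the name `matched` is illustrative.

```r
observation <- c(1, 2, 3, 4, 5)
ID <- c(1, 3, 4)

# TRUE where an observation ID appears anywhere in ID (like SQL's IN)
matched <- observation %in% ID
print(matched)

# Keep only the matching observations
print(observation[matched])

# match() gives the position in ID instead, with NA for no match
print(match(observation, ID))
```

`%in%` answers "is it there?", while `match()` answers "where is it?", which is the more useful form when the second vector indexes a lookup table.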

### Re: [R] help on replacing values

```
Your columns are factors, not character strings.  Use as.is = TRUE as
an argument to read.table.  Also, it's a bit dangerous to use T, although
not wrong.  It's safer to use TRUE, since T is an ordinary variable that
can be reassigned.

On 9/7/07, [EMAIL PROTECTED] [EMAIL PROTECTED] wrote:
Dear List,

I have a newbie question. I have read in a data.frame as follows:

data
X1 X2 X3 X4
A AB AC AB AC
B AB AC AA AB
C AA AB AA AB
D AA AB AB AC
E AB AA AA AB
F AB AA AB AC
B AB AC AB AA

I would like to replace AA values by BB in column X2. I have tried
using replace() with no success, although I am not sure this is the
right function. This is the code I have used:

data$X2 <- replace(data$X2, data$X2 == "AA", "BB")
Warning message:
invalid factor level, NAs generated in: `[<-.factor`(`*tmp*`, list,
value = "BB")

What is wrong with the code? How can I get this done? how about
changing AA values by BB in all 4 columns simultaneously? Actually
this is a small example dataframe, the real one would have about 1000
columns.

Extending this, I found a similar thread dated July 2006 that used
replace() on the iris dataset, but when I tried to reproduce it I got the
same warning message:

iris$Species <- replace(iris$Species, iris$Species
== "setosa", "NewName")
Warning message:
invalid factor level, NAs generated in: `[<-.factor`(`*tmp*`, list,
value = "NewName")

David

```
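
A sketch of how the advice above might play out, covering both the single-column and the all-columns case from the question. The first six rows of the poster's table are supplied inline; the object names `dat` and `f` are made up for illustration, and the factor example at the end is one option for data that has already been read as factors.

```r
# First six rows of the question's table; as.is = TRUE keeps strings as character
dat <- read.table(text = "X1 X2 X3 X4
A AB AC AB AC
B AB AC AA AB
C AA AB AA AB
D AA AB AB AC
E AB AA AA AB
F AB AA AB AC", header = TRUE, as.is = TRUE)

# One column
dat$X2 <- replace(dat$X2, dat$X2 == "AA", "BB")

# All columns at once: comparing a data frame gives a logical matrix
dat[dat == "AA"] <- "BB"
print(dat)

# If a column is already a factor, rename the level instead
f <- factor(c("AA", "AB", "AA"))
levels(f)[levels(f) == "AA"] <- "BB"
print(f)
```

The matrix-style assignment touches every column in one step, which is what matters with roughly 1000 columns.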

### Re: [R] variable format

```
A matrix is for situations where every element is of the same class
but your columns have different classes so use a data frame:

DF <- data.frame(a = 11:15, b = letters[1:5], stringsAsFactors = FALSE)
subset(DF, a %in% 11:13)
subset(DF, a %in% c(0, 11:13)) # same

Suggest you review the Introduction to R manual and look at ?data.frame,
?subset and ?"%in%"

On 9/4/07, Cory Nissen [EMAIL PROTECTED] wrote:
Okay, I want to do something similar to SAS proc format.

I usually do this...

a <- NULL
a$divisionOld <- c(1,2,3,4,5)
divisionTable <- matrix(c(1, "New England",
                          2, "Middle Atlantic",
                          3, "East North Central",
                          4, "West North Central",
                          5, "South Atlantic"),
                        ncol=2, byrow=T)
a$divisionNew[match(a$divisionOld, divisionTable[,1])] <- divisionTable[,2]

But how do I handle the case where...
a$divisionOld <- c(0,1,2,3,4,5)   #no format available for 0, this throws an error.
OR
divisionTable <- matrix(c(1, "New England",
                          2, "Middle Atlantic",
                          3, "East North Central",
                          4, "West North Central",
                          5, "South Atlantic",
                          6, "East South Central",
                          7, "West South Central",
                          8, "Mountain",
                          9, "Pacific"),
                        ncol=2, byrow=T)
There are extra formats available... this throws a warning.

Thanks

Cory

```
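
Applied to the division codes in the question, the data-frame idea could look like the sketch below. It puts `match()` on the right-hand side so that codes without a format become `NA` and unused formats are simply ignored. The column names `code` and `name` are assumptions made for this example.

```r
# Lookup table as a data frame: numeric codes, character labels
divisionTable <- data.frame(
  code = 1:9,
  name = c("New England", "Middle Atlantic", "East North Central",
           "West North Central", "South Atlantic", "East South Central",
           "West South Central", "Mountain", "Pacific"),
  stringsAsFactors = FALSE
)

a <- data.frame(divisionOld = c(0, 1, 2, 3, 4, 5))

# match() returns NA for codes with no format, and unused formats are ignored
a$divisionNew <- divisionTable$name[match(a$divisionOld, divisionTable$code)]
print(a)
```

This handles both awkward cases from the question at once: the unformatted code 0 and the four formats that no observation uses.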

### Re: [R] ploting missing data

```
Try this:

library(zoo)
plot(na.approx(zoo(as.matrix(data[-1]), data[,1])), plot.type = "single")

See ?na.approx, ?plot.zoo, ?xyplot.zoo and vignette("zoo")

On 9/7/07, Markus Schmidberger [EMAIL PROTECTED] wrote:
Hello,

I have this kind of dataframe and have to plot it.

data <- data.frame(sw= c(1,2,3,4,5,6,7,8,9,10,11,12,15),
zehn =
c(33.44,20.67,18.20,18.19,17.89,19.65,20.05,19.87,20.55,22.53,NA,NA,NA),
zwanzig =
c(61.42,NA,26.60,23.28,NA,24.90,24.47,24.53,26.41,28.26,NA,29.80,35.49),
fuenfzig =
c(162.51,66.08,49.55,43.40,NA,37.77,35.53,36.46,37.25,37.66,NA,42.29,47.80)
)

The plot should have lines:
lines(fuenfzig~sw, data=data)
lines(zwanzig~sw, data=data)

But now I have holes in my lines for the missing values (NA). How to
plot the lines without the holes?
The missing values should be interpolated or the left and right point
directly connected. The function approx interpolates the whole dataset.
That's not my goal!
Is there no plotting function to do this directly?

Best
Markus

--
Dipl.-Tech. Math. Markus Schmidberger

Ludwig-Maximilians-Universität München
IBE - Institut für medizinische Informationsverarbeitung,
Biometrie und Epidemiologie
Marchioninistr. 15, D-81377 Muenchen
URL: http://ibe.web.med.uni-muenchen.de
Mail: Markus.Schmidberger [at] ibe.med.uni-muenchen.de

```
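
If zoo is not available, roughly the same effect can be sketched in base R by interpolating each column only at its own missing points. This is not the zoo method from the reply; `fill_gaps` and `filled` are hypothetical names introduced for the example.

```r
data <- data.frame(
  sw = c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 15),
  zehn = c(33.44, 20.67, 18.20, 18.19, 17.89, 19.65, 20.05, 19.87, 20.55,
           22.53, NA, NA, NA),
  zwanzig = c(61.42, NA, 26.60, 23.28, NA, 24.90, 24.47, 24.53, 26.41,
              28.26, NA, 29.80, 35.49),
  fuenfzig = c(162.51, 66.08, 49.55, 43.40, NA, 37.77, 35.53, 36.46, 37.25,
               37.66, NA, 42.29, 47.80)
)

# Hypothetical helper: linear interpolation over interior NAs only;
# trailing NAs stay NA because approx() does not extrapolate by default
fill_gaps <- function(x, y) {
  ok <- !is.na(y)
  approx(x[ok], y[ok], xout = x)$y
}

filled <- data
filled[-1] <- lapply(data[-1], fill_gaps, x = data$sw)

# One line per column, drawn without holes
matplot(filled$sw, as.matrix(filled[-1]), type = "l", lty = 1,
        xlab = "sw", ylab = "value")
```

Because the interpolation is linear, the drawn line is the same as joining the neighbouring observed points directly, which is the other option the question mentions.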

### Re: [R] Delete query in sqldf?

```
Yes, but delete does not return anything, so it's not useful on its own.  In the devel
version of sqldf you can pass multiple commands, so try this using the
builtin data frame BOD, noting that the record with demand = 8.3 was
removed:

library(sqldf)
# overwrite with devel version of the sqldf.R file
sqldf(c("delete from BOD where demand = 8.3", "select * from BOD"))
  Time__1 demand
1       2   10.3
2       3   19.0
3       4   16.0
4       5   15.6
5       7   19.8

On 9/7/07, Paul Smith [EMAIL PROTECTED] wrote:
Dear All,

Is sqldf equipped with delete queries? I have tried delete queries but
with no success.

Paul

```

### Re: [R] Automatic detachment of dependent packages

```
If it's good enough just to get rid of all packages attached since startup,
you could just do repeated detaches like this, making use of the fact that
search() has 9 components on startup:

replicate(length(search()) - 9, detach())

On 9/7/07, Paul Smith [EMAIL PROTECTED] wrote:
Dear All,

When one loads certain packages, some other dependent packages are
loaded as well. Is there some way of detaching them automatically when
one detaches the first package loaded? For instance,

library(sqldf)

but

detach(package:sqldf)

search()
[1] ".GlobalEnv"        "package:gsubfn"    "package:proto"
[4] "package:RSQLite"   "package:DBI"       "package:stats"
[7] "package:graphics"  "package:grDevices" "package:utils"
[13] "package:base"

The packages

RSQLite
DBI
gsubfn
proto

were not detached.

Paul

```
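
The count of 9 holds for a default startup; a variation that does not depend on it is to record the search path first and detach whatever has been added since. A sketch under that assumption: `baseline` and `detach_since` are hypothetical names, and `splines` stands in for any package because it ships with R.

```r
# Record the search path before loading anything extra
baseline <- search()

# Hypothetical helper: detach everything attached since baseline was taken
detach_since <- function(baseline) {
  extra <- setdiff(search(), baseline)
  for (item in extra) detach(item, character.only = TRUE)
  invisible(extra)
}

library(splines)          # stand-in for any package; ships with R
removed <- detach_since(baseline)
print(removed)
print(identical(search(), baseline))
```

`search()` lists the most recently attached entries first, so a package is detached before the packages that were attached on its behalf.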

### Re: [R] Delete query in sqldf?

```
All sqldf does is pass the command to sqlite and retrieve whatever it
sends back, translating in both directions to and from R.  sqldf
does not change the meaning of any sql statements.  Perhaps the
meaning you expect is desirable, but it's not how sqlite works.  If
sqlite were changed to adopt that meaning then sqldf would
automatically get it too.

Here is an example which does not involve R at all and which
illustrates that delete returns nothing.

C:\> sqlite3
SQLite version 3.4.0
Enter ".help" for instructions
sqlite>
sqlite> create table t1(a,b);
sqlite> insert into T1 values(1,2);
sqlite> insert into T1 values(1,3);
sqlite> insert into T1 values(2,4);
sqlite> delete from t1 where b = 2;
sqlite> select * from t1;
1|3
2|4

On 9/7/07, Paul Smith [EMAIL PROTECTED] wrote:
On 9/7/07, Gabor Grothendieck [EMAIL PROTECTED] wrote:
Yes, but delete does not return anything, so it's not useful on its own.  In the devel
version of sqldf you can pass multiple commands, so try this using the
builtin data frame BOD, noting that the record with demand = 8.3 was
removed:

library(sqldf)
# overwrite with devel version of the sqldf.R file
sqldf(c("delete from BOD where demand = 8.3", "select * from BOD"))
  Time__1 demand
1       2   10.3
2       3   19.0
3       4   16.0
4       5   15.6
5       7   19.8

I see, Gabor, but I would expect as more natural to have

sqldf("delete from BOD where demand = 8.3")

working, with no second command.

Paul

On 9/7/07, Paul Smith [EMAIL PROTECTED] wrote:
Dear All,

Is sqldf equipped with delete queries? I have tried delete queries but
with no success.

Paul

```
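
For comparison, here is a sketch of the same row removal done directly on the data frame in base R, where the subsetting expression itself returns the remaining rows. This is not sqldf code, and `BOD2` is an illustrative name.

```r
# BOD is a data frame that ships with R (datasets package)
print(nrow(BOD))

# Base R counterpart of "delete from BOD where demand = 8.3":
# keep the rows that do NOT match and assign the result to a new name
BOD2 <- BOD[BOD$demand != 8.3, ]
print(BOD2)
```

Assigning to a new name leaves the built-in `BOD` untouched, so the example can be rerun.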

### Re: [R] R first.id last.id function error

```
A slightly easier way to construct first and last if the vector x is
sorted (as is assumed in SAS) is:

first <- !duplicated(x)
last <- !duplicated(x, fromLast = TRUE)

where the fromLast= argument is added in R 2.6.0.

On 9/7/07, Gerard Smits [EMAIL PROTECTED] wrote:
Hi R users,

I have a test dataframe (file1, shown below) for which I am trying
to create a flag for the first and last ID record (equivalent to SAS
first.id and last.id variables).

Dump of file1:

file1
   id rx week dv1
1   1  1    1   1
2   1  1    2   1
3   1  1    3   2
4   2  1    1   3
5   2  1    2   4
6   2  1    3   1
7   3  1    1   2
8   3  1    2   3
9   3  1    3   4
10  4  1    1   2
11  4  1    2   6
12  4  1    3   5
13  5  2    1   7
14  5  2    2   8
15  5  2    3   5
16  6  2    1   2
17  6  2    2   4
18  6  2    3   6
19  7  2    1   7
20  7  2    2   8
21  8  2    1   9
22  9  2    1   4
23  9  2    2   5

I have written code that correctly assigns the first.id and last.id variables:

require(Hmisc)  #for Lags
#ascending order to define first dot
file1 <- file1[order(file1$id, file1$week),]
file1$first.id <- (Lag(file1$id) != file1$id)
file1$first.id[1] <- TRUE  #force NA to TRUE

#descending order to define last dot
file1 <- file1[order(-file1$id,-file1$week),]
file1$last.id  <- (Lag(file1$id) != file1$id)
file1$last.id[1] <- TRUE   #force NA to TRUE

#resort to original order
file1 <- file1[order(file1$id,file1$week),]

I am now trying to get the above code to work as a function, and am
clearly doing something wrong:

> first.last <- function (df, idvar, sortvars1, sortvars2)
+   {
+   #sort in ascending order to define first dot
+   df <- df[order(sortvars1),]
+   df$first.idvar <- (Lag(df$idvar) != df$idvar)
+   #force first record NA to TRUE
+   df$first.idvar[1] <- TRUE
+
+   #sort in descending order to define last dot
+   df <- df[order(-sortvars2),]
+   df$last.idvar  <- (Lag(df$idvar) != df$idvar)
+   #force last record NA to TRUE
+   df$last.idvar[1] <- TRUE
+
+   #resort to original order
+   df <- df[order(sortvars1),]
+   }

Function call:

first.last(df=file1, idvar=file1$id,
sortvars1=c(file1$id,file1$week), sortvars2=c(-file1$id,-file1$week))

R Error:

Error in as.vector(x, mode) : invalid argument 'mode'

I am not sure about the passing of the sort strings.  Perhaps this is
where things are off.  Any help greatly appreciated.

Thanks,

Gerard

```
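
A small sketch of the `duplicated()` approach on the id column from the question. It assumes R 2.6.0 or later for the `fromLast` argument, as the reply notes; the vector is typed in from the listing of `file1`.

```r
# id column from the question's file1 (already sorted by id)
id <- c(1, 1, 1, 2, 2, 2, 3, 3, 3, 4, 4, 4, 5, 5, 5, 6, 6, 6, 7, 7, 8, 9, 9)

first.id <- !duplicated(id)                   # TRUE on the first row of each id
last.id  <- !duplicated(id, fromLast = TRUE)  # TRUE on the last row of each id

print(data.frame(id, first.id, last.id))
```

An id with a single record, such as id 8 in row 21, is flagged as both first and last, matching the SAS convention.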

### Re: [R] creat list

```
Try this:

do.call(cbind, lista)

On 9/6/07, livia [EMAIL PROTECTED] wrote:

Hi,

I have a list named lista, which has 50 vectors, each of length
about 1200. I would like to create a matrix out of lista. What I do
now is cbind(lista[[1]],lista[[2]],...,lista[[50]]). I guess there is
an easier way of doing this. Could anyone give me some advice?
--
View this message in context:
http://www.nabble.com/creat-list-tf4391162.html#a12519637
Sent from the R help mailing list archive at Nabble.com.

```
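
To see what the one-liner does, here is a sketch with a stand-in list of the same shape as the poster's. The random data and the seed are assumptions made only so the example runs.

```r
# Stand-in for the poster's list: 50 numeric vectors of length 1200
set.seed(42)
lista <- replicate(50, rnorm(1200), simplify = FALSE)

# do.call() passes every list element as a separate argument to cbind()
m <- do.call(cbind, lista)
print(dim(m))
```

Each list element becomes one column. If the vectors really differ in length ("about 1200"), `cbind()` recycles the shorter ones, so it may be worth checking the lengths first.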

### Re: [R] Excel

```
On my version of Excel (Excel 2007 under Vista) using
File | Open on a file, a.txt such as:

a b
sep7 10
sep10 11

causes it to enter a wizard where it asks you for the delimiters and
column types so you can change it from what it offers as the default.
In particular, if you leave it at General it will guess Date but you can
specify Text or you can specify Date to cause it to select a
particular type.

On 9/6/07, [EMAIL PROTECTED] [EMAIL PROTECTED] wrote:
Quoting Robert A LaBudde [EMAIL PROTECTED]:

If you format the column as Text, you won't have this problem. By
leaving the cells as General, you leave it up to Excel to guess at
the correct interpretation.

You will note that the conversion to a date occurs immediately in
Excel when you enter the value. There are many formats to enter dates.

Either pre-format the column as Text, or prefix the individual entry
with an ' to indicate text.

But the conversion is done as soon as the file is opened, _before_ you
have the chance to format the column as text!!!
Once the conversion is done... it's done.
I had gene names such as SEP7 converted by Excel into a 5 digit
number representing a date. From that number I didn't find a way to
reconstruct SEP7. Sept-7 is not the same.

It seems like a problem with an easy solution. But it isn't. There are
too many variations.

A similar problem occurs in R's read.table() function when a factor
has levels that can be interpreted as numbers.

at least with read.table you can specify the classes of each column

R developers are better behaved than MS Excel ones ;-)

Jose

At 10:11 PM 8/27/2007, David wrote:

A common process when data is obtained in an Excel spreadsheet is to save
the spreadsheet as a .csv file then read it into R. Experienced users
might have learned to be wary of dates (as I have) but possibly have not
experienced what just happened to me. I thought I might just share it with
r-help as a cautionary tale.

I received an Excel file giving patient details. Each patient had an ID
code in the form of three letters followed by four digits. (Actually a New
Zealand National Health Identification.) I saved the .xls file as .csv.
Then I opened up the .csv (with Excel) to look at it. In the column of ID
codes I saw: Aug-99. Clicking on that entry it showed 1/08/2699.

In a column of character data, Excel had interpreted AUG2699 as a date.

The .csv did not actually have a date in that cell, but if I had saved the
.csv file it would have.

David Scott

Robert A. LaBudde, PhD, PAS, Dpl. ACAFS  e-mail: [EMAIL PROTECTED]
Least Cost Formulations, Ltd.            URL: http://lcfltd.com/
824 Timberlake Drive                     Tel: 757-467-0954
Virginia Beach, VA 23464-3239            Fax: 757-467-2947

Vere scire est per causas scire

--
Dr. Jose I. de las Heras                    Email: [EMAIL PROTECTED]
The Wellcome Trust Centre for Cell Biology  Phone: +44 (0)131 6513374
Institute for Cell & Molecular Biology      Fax:   +44 (0)131 6507360
University of Edinburgh
Edinburgh EH9 3JR
UK

--
The University of Edinburgh is a charitable body, registered in
Scotland, with registration number SC005336.

```
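
On the R side, the remark in the thread that `read.table()` lets you specify the class of each column can be sketched as follows. The two-line file is the one shown in the reply, supplied inline through the `text` argument (assumed available in the reader's R version) instead of being read from `a.txt`.

```r
# The two-column file from the reply, supplied inline instead of as a.txt
txt <- "a b
sep7 10
sep10 11"

# colClasses fixes each column's type, so nothing is guessed
d <- read.table(text = txt, header = TRUE,
                colClasses = c("character", "integer"))
str(d)
```

With the classes stated up front, identifiers such as `sep7` stay as the strings they are.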

### Re: [R] Excel

```
That is not what happens in Excel 2007 when I tried it just now. I tried
saving the same file I displayed in my prior message as an .xls file and
as an .xlsx file and in both cases the first column came back as text,
as I had specified to the Wizard on the initial import.  I guess they fixed
the behavior in Excel 2007.

On 9/6/07, [EMAIL PROTECTED] [EMAIL PROTECTED] wrote:

Yes, and then you save it, you open it again... same behaviour.
The only way I found around it was to insert a character at the
beginning of every element in such columns. An apostrophe works, but
it looks ugly. Yes, when loading the data in R you could easily clean
it up automatically... doable.
You can add a space. Then it will not show, but you have to remember
that if you ever use the data for labels etc. You shouldn't need to do
that in the first place...

Jose

Quoting Erich Neuwirth [EMAIL PROTECTED]:

There is a hack to get around the problem.
It is definitely not a good solution, just a hack.

Open the .csv file in a text editor and select everything.
Paste it into an empty Excel sheet.
Then use Data -> Text to Columns

The third dialog box (at least it is the third one in Excel 2003)
allows you to format each column of the data. This is the place where
you can switch off the date interpretation of your ID column.

AUG1838 probably is not interpreted as a date because Excel dates only
start at 1/1/1900.

Duncan Murdoch wrote:
On 8/28/2007 3:16 AM, J Dougherty wrote:
On Monday 27 August 2007 22:21, David Scott wrote:
On Tue, 28 Aug 2007, Robert A LaBudde wrote:
If you format the column as Text, you won't have this problem. By
leaving the cells as General, you leave it up to Excel to guess at
the correct interpretation.
Not true actually. I had converted the column to Text because I saw the
interpretation as a date in the .xls file. I saved the .csv file *after*
the column had been converted to Text. Looking at the .csv file in a text
editor, the entry is correct.

I have just rechecked this.

On reopening the .csv using Excel, the entry AUG2699 had been interpreted
as a date, and was showing as Aug-99. Most bizarre is that the NHI value
of AUG1838 has *not* been interpreted as a date.

--
Erich Neuwirth, University of Vienna
Faculty of Computer Science
Computer Supported Didactics Working Group
Phone: +43-1-4277-39464 Fax: +43-1-4277-39459

--
Dr. Jose I. de las Heras                    Email: [EMAIL PROTECTED]
The Wellcome Trust Centre for Cell Biology  Phone: +44 (0)131 6513374
Institute for Cell & Molecular Biology      Fax:   +44 (0)131 6507360
University of Edinburgh
Edinburgh EH9 3JR
UK

--
The University of Edinburgh is a charitable body, registered in
Scotland, with registration number SC005336.


```
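
If the goal is to get the identifiers into R rather than back into Excel, reading the .csv directly avoids Excel's date guessing altogether. A minimal sketch, assuming an ID column named NHI; the file contents below are made up for illustration:

```r
# Write a small illustrative file (hypothetical data, not from the thread).
csv_path <- tempfile(fileext = ".csv")
writeLines(c("NHI,score", "AUG2699,1", "AUG1838,2"), csv_path)

# Force the ID column to be read as plain text; other columns are guessed.
ids <- read.csv(csv_path, colClasses = c(NHI = "character"))
ids$NHI
```

Naming the column in colClasses leaves the remaining columns to read.csv's usual type detection.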

### Re: [R] Lisp-like primitives in R

```Reduce, Filter and Map are part of R 2.6.0.  Try ?Reduce

On 9/6/07, Chris Elsaesser [EMAIL PROTECTED] wrote:
I mainly program in Common Lisp and use R for statistical analysis.

While in R I miss the power and ease of use of Lisp, especially its many
primitives such as find, member, cond, and (perhaps a bridge too far)
loop.

Has anyone created a package that includes R analogs to a subset of Lisp
functions?

Chris Elsaesser, PhD
Principal Scientist, Machine Learning
7921 Jones Branch Dr. Suite 600
McLean, VA 22102

703.371.7301 (m)
703.637.9421 (o)


```
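
As a rough illustration of the functions mentioned in the reply, here is a short sketch on a made-up vector. Find, Position and %in% are also in base R in current versions and are included here as loose analogs of Lisp's find and member; that pairing is an assumption of this example, not something stated in the thread.

```r
xs <- c(3, 8, 1, 12, 7)

total     <- Reduce(`+`, xs)                          # fold the vector with +
evens     <- Filter(function(v) v %% 2 == 0, xs)      # keep elements passing a test
squares   <- Map(function(v) v^2, xs)                 # returns a list, one entry per element
first_big <- Find(function(v) v > 5, xs)              # first element passing a test
where_big <- Position(function(v) v > 5, xs)          # index of that element
is_member <- 12 %in% xs                               # membership test
```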

### Re: [R] problems in read.table

```See ?count.fields to get a vector of how many fields are on each line.
Also fill = TRUE on read.table() can be used to fill out short lines if
that is appropriate.

On 9/6/07, [EMAIL PROTECTED] [EMAIL PROTECTED] wrote:
Dear R-users,

I have encountered the following problem every now and then. But I was
dealing with a very small dataset before, so it wasn't a problem (I
just edited the dataset in an OpenOffice spreadsheet). This time I have to
deal with many large datasets containing commuting flow data. I
appreciate if anyone could give me a hint or clue to get out of this
problem.

I have a .dat file called 1081.dat: 1001 means Birmingham, AL.

I imported this .dat file using read.table like

Then I got this error message:
Error in scan(file, what, nmax, sep, dec, quote, skip, nlines, na.strings,  :
line 9499 did not have 209 elements

Since I got an error message saying other rows did not have 209
elements, I added skip=c(205,9499,9294)) in hoping that R would take
care of this problem. But I got a similar error message:
Error in scan(file, what, nmax, sep, dec, quote, skip, nlines, na.strings,  :
line 9294 did not have 209 elements
the condition has length > 1 and only the first element will be used
in: if (skip > 0) readLines(file, skip)

Is there any way to let a R code to automatically skip problematic
rows? Thank you very much!

Taka


```
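
The two suggestions in the reply can be sketched as follows. The file here is a small made-up stand-in for 1081.dat, written to a temporary path so the example is self-contained.

```r
# A tiny whitespace-separated file with one short line (illustrative only).
path <- tempfile(fileext = ".dat")
writeLines(c("a b c", "1 2 3", "4 5", "6 7 8"), path)

# count.fields() reports the number of fields on every line,
# which locates the problematic rows before reading anything.
n_fields  <- count.fields(path)
bad_lines <- which(n_fields != max(n_fields))

# fill = TRUE pads short lines with NA instead of stopping with an error.
dat <- read.table(path, header = TRUE, fill = TRUE)
```

Whether padding with NA is appropriate depends on why the lines are short; inspecting bad_lines first is usually worthwhile.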

### Re: [R] 'singular gradient matrix' when using nls() and how to make the program skip nls() and run on

```In case 1, graph your function and then use optimize() rather than nls().

In case 2, a and b may have the same effect on f as c does, whereas in
case 1 they do not vary, so it does not matter.  For example, consider
minimizing f <- function(a, b) (a + b)^2.  If a is fixed at zero then
the minimum occurs at b = 0, but if a is not fixed then increasing a
and decreasing b by the same amount causes no change in the
result, so the gradient in that direction is zero.

On 9/5/07, Yuchen Luo [EMAIL PROTECTED] wrote:
Dear friends.

I use nls() and encounter the following puzzling problem:

I have a function f(a,b,c,x), I have a data vector of x and a vector y of
realized values of f.

Case1

I tried to estimate  c with (a=0.3, b=0.5) fixed:

nls(y~f(a,b,c,x), control=list(maxiter = 10, minFactor=0.5
^2048),start=list(c=0.5)).

The error message is: number of iterations exceeded maximum of 10

Case2

I then think maybe the value of a and be are not reasonable. So, I let nls()
estimate (a,b,c) altogether:

nls(y~f(a,b,c,x), control=list(maxiter = 10, minFactor=0.5
^2048),start=list(a=0.3,b=0.5,c=0.5)).

The error message is:

singular gradient matrix at initial parameter estimates.

This is what puzzles me, if the initial parameter of (a=0.3,b=0.5,c=0.5) can
matrix' appear in Case1?

I have tried to change the initial value of (a,b,c) around but the problem
persists. I am wondering if there is a way out.

My other question is, I need to run 220 calls of nls() in my program with
different y and x. When one of the nls() calls encounters a problem, the whole
program stops.  In my case, the 3rd nls() runs into a problem.  I would
still need the program to run the remaining 217 nls() calls! Is there a way to
make the program skip the problematic nls() and complete the remaining
nls() calls?

Your help will be highly appreciated!

Yuchen Luo


```
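
The reply does not address the second question, about carrying on after one nls() call fails. One common approach is to wrap each call in tryCatch(). The sketch below is an assumption-laden illustration: the data, the two model formulas and the names good and bad are invented, and the bad model is deliberately built with two parameters that only enter through their sum, mirroring the (a + b)^2 example above.

```r
set.seed(42)
d <- data.frame(x = seq(1, 10, length.out = 30))
d$y <- 2 * exp(-0.3 * d$x) + rnorm(30, sd = 0.01)

# "bad" has a and b entering only as a + b, so its gradient matrix is singular.
formulas <- list(good = y ~ a * exp(-k * x),
                 bad  = y ~ (a + b) * exp(-k * x))
starts   <- list(good = list(a = 1.5, k = 0.25),
                 bad  = list(a = 1, b = 1, k = 0.25))

# Fit each model; a failing fit yields NULL instead of stopping the run.
fits <- Map(function(fm, st) {
  tryCatch(nls(fm, data = d, start = st), error = function(e) NULL)
}, formulas, starts)

failed <- names(fits)[vapply(fits, is.null, logical(1))]
```

The same wrapper works inside a for loop over 220 data sets; failed then records which fits need a closer look.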

### Re: [R] Table and ftable

```Try this, which gives an object of the required shape and of
class c("xtabs", "table"):

xx <- xtabs(area ~ sic + level, DF)

You can optionally do it like this to make it class matrix:

xx <- xtabs(area ~ sic + level, DF)[]

and if you don't want the call attribute:

attr(xx, "call") <- NULL

On 9/4/07, Giulia Bennati [EMAIL PROTECTED] wrote:
Dear listmembers,
I have a little question: I have my data organized as follow

sic  level  area
a    211    2.4
b    311    2.3
b    322    0.2
b    322    0.5
c    100    3.0
c    100    1.5
c    242    1.5
d    222    0.2

where levels and sics are factors. I'm trying to obtain a matrix like this:

     level
sic  211  311  322  100  242  222
a    2.4  0    0    0    0    0
b    0    2.3  0.7  0    0    0
c    0    0    0    4.5  1.5  0
d    0    0    0    0    0    0.2

I tried the table function as
table(sic, level) but I obtained only a contingency table.
Do you have any suggestions?
Thank you very much,
Giulia


```
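
For readers who want to run the suggestion, here is a self-contained sketch that rebuilds the posted data as a data frame named DF (the name used in the reply) and cross-tabulates the summed area. Using unclass() to get a plain matrix is this example's choice, not part of the reply.

```r
DF <- data.frame(
  sic   = c("a", "b", "b", "b", "c", "c", "c", "d"),
  level = factor(c(211, 311, 322, 322, 100, 100, 242, 222)),
  area  = c(2.4, 2.3, 0.2, 0.5, 3.0, 1.5, 1.5, 0.2)
)

# Sum area within each sic x level cell; empty cells become 0.
xx <- xtabs(area ~ sic + level, data = DF)

# A plain numeric matrix without the class and call attributes.
m <- unclass(xx)
attr(m, "call") <- NULL
```

Note that the columns come out in factor-level order (100, 211, 222, ...), not in the order shown in the question.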

### Re: [R] Variable scope in a function

```environment(test_func) <- baseenv()

will allow it to access the base environment so it can still find exists
but will not find kat.  If you issue the command

search()

then each attached package has the next as its parent and base is the
last one.

Regarding your second question, try rm().

f <- function() { x <- 1; rm(x); exists("x", environment()) }
f() # FALSE

On 9/4/07, [EMAIL PROTECTED] [EMAIL PROTECTED] wrote:
Hello,

I apologise in advance for this question; I'm sure it is answered in
the documentation or this mailing list many times, however the answer
has eluded me.

I'm trying to write a function where I don't want external variables
to be scoped in from the parent environment. Given this function:

test_func = function() {

  if (exists("kat") == FALSE) {
    print("kat is undefined")
  } else {
    print(kat)
  }
}

If I did this:

kat = 12
test_func()

I'd like the result to be the error, but now it's 12 (which is of
course correct according to the documentation).

So there are two questions:

1) How can I disregard all variables from the parent environment
within a function? (Although from what I've read on the mailing lists
this isn't really what I want.)

Apparently

environment(test_func) = NULL

is defunct, and what I thought was its replacement

environment(test_func) = emptyenv()

doesn't seem to be.

2) How can I undefine a variable, perhaps just within the context
of my function. I'm hoping to find some line that I can put at the
start of my function above so that the result would be:

kat = 12
test_func()
[1] "kat is undefined"
kat
[1] 12

Thanks in advance for any help!

Cheers,

Demitri


```
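
Both answers can be tried with the sketch below. It is a lightly adapted version of the poster's function that returns a value instead of printing, so the before and after results can be compared; the variable names before and after are this example's own.

```r
kat <- 12

test_func <- function() {
  if (!exists("kat")) "kat is undefined" else kat
}

before <- test_func()                  # finds kat through the enclosing environment

# Re-parent the function so that lookup goes straight to base and stops there.
environment(test_func) <- baseenv()
after <- test_func()                   # kat is no longer visible

# Second question: rm() removes a local variable inside a function.
f <- function() { x <- 1; rm(x); exists("x", inherits = FALSE) }
removed <- !f()
```

Here inherits = FALSE restricts the check to the function's own frame, which avoids a false positive if an x happens to exist further up the search path.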

### Re: [R] using temporary arrays in R

```You can do it in local(), in a function, or by explicitly removing it.  Also,
if you never assign it to a variable then it will be garbage collected as well.

# 1
local({
  print(gc())
  x <- matrix(NA, 1000, 1000)
  print(gc())
})
gc()

# 2
f <- function() {
  print(gc())
  x <- matrix(NA, 1000, 1000)
  print(gc())
}
f()
gc()

# 3
gc()
x <- matrix(NA, 1000, 1000)
gc()
rm(x)
gc()

# 4
gc()
sum(matrix(1, 1000, 1000))
gc()

On 9/3/07, dxc13 [EMAIL PROTECTED] wrote:

useR's,

Is there a way to create a temporary array (or matrix) in R to hold values,
then drop or delete that temporary array from memory once I do not need it
anymore?

I am working with multidimensional arrays/matrices and I frequently perform
multiple operations on the same matrix and rename it to be another object.
I want to be able to delete the older versions of the array/matrix to free
up space.

Thank you.

```
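
A compact way to see the first approach at work is to keep only the result of a computation and check that the temporary itself is gone afterwards. This is a sketch; the name scratch_matrix is invented for the example.

```r
result <- local({
  scratch_matrix <- matrix(1, 1000, 1000)  # exists only inside local()
  sum(scratch_matrix)                      # only this value is returned
})

# The temporary was never created in the calling environment.
scratch_survived <- exists("scratch_matrix", inherits = FALSE)
```

Once no variable refers to the matrix, its memory is eligible for garbage collection; an explicit gc() is only needed to observe that, not to make it happen.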

### Re: [R] Derivative of a Function Expression

```The Ryacas package can do that (but the function must be one line
and it can't have brace brackets).  The first yacas call below registers f with
yacas, then we set up a function to act as a template to hold the
derivative and then we set its body calling yacas again to take the
derivative.

library(Ryacas)
f <- function(x)  2*cos(x)^2 + 3*sin(x) +  0.5
yacas(f) # register f with yacas
Df <- f
body(Df) <- yacas(expression(deriv(f(x))))[[1]]
Df

Here is the output:

> library(Ryacas)
> f <- function(x)  2*cos(x)^2 + 3*sin(x) +  0.5
> yacas(f)
[1] "Starting Yacas!"
expression(TRUE)
> Df <- f
> body(Df) <- yacas(expression(deriv(f(x))))[[1]]
> Df
function (x)
2 * (-2 * sin(x) * cos(x)) + 3 * cos(x)

Also see:

demo("Ryacas-Function")

On 9/3/07, Rory Winston [EMAIL PROTECTED] wrote:
Hi

I am currently (for pedagogical purposes) writing a simple numerical
analysis library in R. I have come unstuck when writing a simple
Newton-Raphson implementation, that looks like this:

f <- function(x) { 2*cos(x)^2 + 3*sin(x) +  0.5  }

root <- newton(f, tol=0.0001, N=20, a=1)

My issue is calculating the symbolic derivative of f() inside the newton()
function. I can't seem to get R to do this... I can of course calculate the
derivative by calling D() with an expression object containing the inner
function definition, but I would like to just define the function once and
then compute the derivative of the existing function. I have tried using
deriv() and as.call(), but I am evidently misusing them, as they don't do
what I want. Does anyone know how I can define a function, say foo, which
manipulates one or more arguments, and then refer to that function later in
my code in order to calculate a (partial) derivative?

Thanks
Rory


```

### Re: [R] Derivative of a Function Expression

```Actually, thinking about this, it's pretty easy to do it without Ryacas too:

Df <- f
body(Df) <- deriv(body(f), "x")
Df


```

### Re: [R] Derivative of a Function Expression

```The problem is that brace brackets are not in the derivatives table.
Make sure you don't have any.

On 9/3/07, Alberto Vieira Ferreira Monteiro [EMAIL PROTECTED] wrote:
Gabor Grothendieck wrote:

too:

Df <- f
body(Df) <- deriv(body(f), "x")
Df

This is weird.

f <- function(x) { x^2 + 2*x+1 }
Df <- f
body(Df) <- deriv(body(f), "x") # error

Also:

f <- function(x) x^2 + 2 * x + 1
Df <- f
body(Df) <- deriv(body(f), "x") # ok
D2f <- f
body(D2f) <- deriv(body(Df), "x") # error

Alberto Monteiro


```
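
The second error in Alberto's example has the same cause: deriv() returns an expression wrapped in braces with intermediate assignments, so its result cannot be fed back into deriv(). One way around this, offered as a sketch rather than as the thread's own solution, is D(), which returns a plain call that can be differentiated again.

```r
f <- function(x) x^2 + 2 * x + 1

# First derivative: D() returns an unevaluated call such as 2 * x + 2.
Df <- f
body(Df) <- D(body(f), "x")

# Second derivative: the body of Df is still a plain call, so D() works again.
D2f <- f
body(D2f) <- D(body(Df), "x")

first_at_3  <- Df(3)
second_at_3 <- D2f(3)
```

The trade-off is that D() handles one variable at a time and does not return the function value alongside the gradient, as deriv() does.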

### Re: [R] Derivative of a Function Expression

```One improvement.  This returns a function directly without having
to create a template and fill in its body:

deriv(body(f), "x", func = TRUE)


```

### Re: [R] Derivative of a Function Expression

```And if f has brace brackets surrounding the body then do this:

f <- function(x) { x*x }
deriv(body(f)[[2]], "x", func = TRUE)

If you are writing a general function you can do this:

e <- if (identical(body(f)[[1]], as.name("{"))) body(f)[[2]] else body(f)
deriv(e, "x", func = TRUE)


```
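
Putting the pieces of this thread together, a small helper might look like the sketch below. The name derivative_of is hypothetical, and the helper only strips a single pair of braces around a one-expression body, which is an assumption about how f is written. Note that func = TRUE in the reply partially matches deriv()'s function.arg argument, spelled out in full here.

```r
derivative_of <- function(f, var = "x") {
  b <- body(f)
  # Unwrap { expr } when the body is exactly one braced expression.
  if (is.call(b) && identical(b[[1]], as.name("{")) && length(b) == 2) {
    b <- b[[2]]
  }
  # function.arg = TRUE makes deriv() return a function of `var`.
  deriv(b, var, function.arg = TRUE)
}

f  <- function(x) { 2 * cos(x)^2 + 3 * sin(x) + 0.5 }
Df <- derivative_of(f)

# The returned function gives f(x) with the derivative in a "gradient" attribute.
slope_at_1 <- attr(Df(1), "gradient")[1, "x"]
```

Inside a Newton-Raphson loop, the function value and the slope are then both available from a single call to Df.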

### Re: [R] Comparing transform to with

```Try this version of transform.  In the first test we show
that it works on your example, although we have used the head of the built-in
anscombe data set.  The second and third show that
it is necessarily incompatible with transform because transform
always looks up variables in DF first whereas my.transform looks
up the computed ones first.

my.transform <- function(DF, ...) {
  f <- function(){}
  formals(f) <- eval(substitute(as.pairlist(c(alist(...), DF))))
  body(f) <- substitute(modifyList(DF, data.frame(...)))
  f()
}

# test
a <- head(anscombe)  # assumed definition of a, per the description above
# 1
my.transform(a, sum1 = x1+x2+x3+x4, sum2 = y1+y2+y3+y4, total = sum1+sum2)
# 2
my.transform(a, y2 = y1, y3 = y2)
# 3
transform(a, y2 = y1, y3 = y2) # different

On 9/1/07, Muenchen, Robert A (Bob) [EMAIL PROTECTED] wrote:
Hi All,

I've been successfully using the with function for analyses and the
transform function for multiple transformations. Then I thought, why not
use with for both? I ran into problems & couldn't figure them out from
help files or books. So I created a simplified version of what I'm
doing:

rm( list=ls() )
x1<-c(1,3,3)
x2<-c(3,2,1)
x3<-c(2,5,2)
x4<-c(5,6,9)
myDF<-data.frame(x1,x2,x3,x4)
rm(x1,x2,x3,x4)
ls()
myDF

This creates two new variables just fine

transform(myDF,
sum1=x1+x2,
sum2=x3+x4
)

This next code does not see sum1, so it appears that transform cannot
see the variables that it creates. Would I need to transform new
variables in a second pass?

transform(myDF,
sum1=x1+x2,
sum2=x3+x4,
total=sum1+sum2
)

Next I'm trying the same thing using with. It does not work but
also does not generate error messages, giving me the impression that I'm
doing something truly idiotic:

with(myDF, {
sum1<-x1+x2
sum2<-x3+x4
total <- sum1+sum2
} )
myDF
ls()

Then I thought, perhaps one of the advantages of transform is that it
works on the left side of the equation without using a longer name like
myDF$sum1. with probably doesn't do that, so I use the longer form
below. It also does not work and generates no error messages.

# Try it again, writing vars to myDF explicitly.
with(myDF, {
myDF$sum1<-x1+x2
myDF$sum2<-x3+x4
myDF$total <- myDF$sum1+myDF$sum2
} )
myDF
ls()

I would appreciate some advice about the relative roles of these two
functions & why my attempts with with have failed.

Thanks!
Bob

=
Bob Muenchen (pronounced Min'-chen), Manager
Statistical Consulting Center
U of TN Office of Information Technology
200 Stokely Management Center, Knoxville, TN 37996-0520
Voice: (865) 974-5230
FAX: (865) 974-4810
Email: [EMAIL PROTECTED]
Web: http://oit.utk.edu/scc,
News: http://listserv.utk.edu/archives/statnews.html


```
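
Two base R alternatives to a custom transform are sketched below on Bob's data: a second transform() pass, and within(), which evaluates its statements in order and returns the modified data frame. Neither is the approach proposed in the reply; they are shown only for comparison, and the names two_pass and one_block are this example's own.

```r
myDF <- data.frame(x1 = c(1, 3, 3), x2 = c(3, 2, 1),
                   x3 = c(2, 5, 2), x4 = c(5, 6, 9))

# Option 1: a second pass, so total can see sum1 and sum2.
two_pass <- transform(transform(myDF, sum1 = x1 + x2, sum2 = x3 + x4),
                      total = sum1 + sum2)

# Option 2: within() runs the block top to bottom and keeps the new variables.
one_block <- within(myDF, {
  sum1  <- x1 + x2
  sum2  <- x3 + x4
  total <- sum1 + sum2
})
```

In both cases the result has to be assigned; with() only returns the value of its last expression and never modifies myDF, which is why the attempts in the question appeared to do nothing.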

### Re: [R] Function modification: how to calculate values for every combination?

```Just to add to this be sure you do have names if you want them
and read about vectorization in ?outer in case fun was just an
example and your actual fun is more complex:

x <- c(1,2,3)
names(x) <- x
y <- c(4,5,6)
names(y) <- y

outer(x, y, fun) # as in previous answer

# or
outer(-log(15) * x, log(10) * y, "+")

On 9/2/07, Erich Neuwirth [EMAIL PROTECTED] wrote:
outer(x,y,fun)

Lauri Nikkinen wrote:
Hello,

I have a function like this:

fun <- function (x, y) {
  a <- log(10)*y
  b <- log(15)*x
  extr <- a-b
  extr
}

fun(2,3)
[1] 1.491655

x <- c(1,2,3)
y <- c(4,5,6)
fun(x, y)
[1] 6.502290 6.096825 5.691360

How do I have to modify my function that I can calculate results using
every combination of x and y? I would like to produce a matrix which
includes the calculated values in every cell and names(x) and names(y)
as row and column headers respectively. Is the outer-function a way to
solution?

Best regards,


```
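
Here is the whole suggestion as one runnable sketch, using the function from the question. The Vectorize() line is an addition for the case the reply warns about, where the real function is not vectorized; it is not needed for this particular fun.

```r
fun <- function(x, y) log(10) * y - log(15) * x

x <- c(1, 2, 3); names(x) <- x
y <- c(4, 5, 6); names(y) <- y

# Rows are labelled by names(x), columns by names(y).
m <- outer(x, y, fun)

# Equivalent formulation from the reply.
m2 <- outer(-log(15) * x, log(10) * y, "+")

# If fun only handled scalars, wrapping it would make outer() usable.
m3 <- outer(x, y, Vectorize(fun))
```

outer() calls the function once with two long vectors, not once per cell, which is why a scalar-only function needs the Vectorize() wrapper.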

### Re: [R] Synchronzing workspaces

```You could try saving prior to quitting in the future if you want to try
those arguments.

On 9/3/07, Paul August [EMAIL PROTECTED] wrote:
Thanks for sharing your experience. In my case, the involved machines are
Windows Vista, XP and 2000. Not sure whether it contributes to my problem or
not. I will look into this further.

I just noticed the two arguments ascii and compress for save. However, my
.RData file was created by q() with "yes". The manual says that q() is
equivalent to save(list = ls(all=TRUE), file = ".RData"). There seems to be
no way to set ascii or compression of save through q function, unless the q
function is replaced explicitly with save(list = ls(all=TRUE), file =
".RData", ascii = T).

Paul.

- Original Message
From: Gabor Grothendieck [EMAIL PROTECTED]
To: Paul August [EMAIL PROTECTED]
Cc: r-help@stat.math.ethz.ch
Sent: Thursday, August 30, 2007 11:24:31 PM
Subject: Re: [R] Synchronzing workspaces

I haven't had similar experience but note that save has ascii=
and compress= arguments.  You could check if varying those
parameter values makes a difference.

On 8/30/07, Paul August [EMAIL PROTECTED] wrote:
I used to work on several computers and to use a flash drive to synchronize
the workspace on each machine before starting to work on it. I found that
.RData always caused some trouble: Often it is corrupted even though there
is no error in copying process. Does anybody have the similar experience?

Paul.

- Original Message
From: Barry Rowlingson [EMAIL PROTECTED]
To: Eric Turkheimer [EMAIL PROTECTED]
Cc: r-help@stat.math.ethz.ch
Sent: Wednesday, August 22, 2007 9:43:57 AM
Subject: Re: [R] Synchronzing workspaces

Eric Turkheimer wrote:
How do people go about synchronizing multiple workspaces on different
workstations?  I tend to wind up with projects spread around the various
machines I work on.  I find that placing the directories on a server and
reading them remotely tends to slow things down.

If R were to store all its workspace data objects in individual files
instead of one big .RData file, then you could use a revision control
system like SVN.  Check out the data, work on it, check it in, then on
another machine just update to get the changes.

However SVN doesn't work too well for binary files - conflicts being
hard to resolve without someone backing down - so maybe its not such a
good idea anyway...

On unix boxes and derivatives, you can keep things in sync efficiently
with the 'rsync' command.  I think there are GUI addons for it, and
Windows ports.

Barry


```
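
Saving explicitly before quitting, with the arguments discussed above, could look like the sketch below. The file location and object name are made up for the example, and whether an ASCII save actually avoids the corruption Paul saw is untested here.

```r
workspace_file <- file.path(tempdir(), "workspace_demo.RData")
demo_values <- c(1.5, 2.5, 3.5)

# Write a plain-text, uncompressed image of the chosen objects.
save(demo_values, file = workspace_file, ascii = TRUE, compress = FALSE)

# Load into a fresh environment to check that the file round-trips.
restored <- new.env()
load(workspace_file, envir = restored)
```

To save the whole workspace instead of one object, pass list = ls(all.names = TRUE) in place of the object name.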

### Re: [R] by group problem

```See the examples labelled head in the examples section near the bottom of:

These show how to do it using order as well as using SQL via sqldf.

On 8/31/07, Cory Nissen [EMAIL PROTECTED] wrote:
I am working with census data.  My columns of interest are...

PercentOld - the percentage of people in each county that are over 65
County - the county in each state
State - the state in the US

There are about 3100 rows, with each row corresponding to a county within a
state.

I want to return the top five PercentOld by state.  But I want the County
and the Value.

I tried this...

topN <- function(column, n=5)
{
  column <- sort(column, decreasing=T)
  return(column[1:n])
}
top5PerState <- tapply(data$percentOld, data$STATE, topN)

But this only returns the value for percentOld per state, I also want the
corresponding County.

I think I'm close, but I just can't get it...

Thanks

cn


```
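
The order-based approach mentioned in the reply can be sketched as follows. The data frame is a tiny invented stand-in for the census data, and top_n_per_state is a hypothetical helper name.

```r
census <- data.frame(
  State      = c("A", "A", "A", "B", "B", "B"),
  County     = c("a1", "a2", "a3", "b1", "b2", "b3"),
  PercentOld = c(10, 30, 20, 5, 15, 25),
  stringsAsFactors = FALSE
)

top_n_per_state <- function(df, n = 5) {
  # Sort by state, then by PercentOld from highest to lowest.
  ord <- df[order(df$State, -df$PercentOld), ]
  # Rank rows within each state and keep the first n.
  rank_in_state <- ave(seq_len(nrow(ord)), ord$State, FUN = seq_along)
  ord[rank_in_state <= n, ]
}

top2 <- top_n_per_state(census, n = 2)
```

Because whole rows are kept, County and PercentOld stay together, which is what the tapply() attempt in the question loses.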

### Re: [R] data frame row manipulation

```Try this:

evaluation$maxVol <- ave(evaluation$vol, evaluation$name, FUN = max)

or using SQL via sqldf like this:

library(sqldf)
sqldf("select * from evaluation join
(select name, max(vol) from evaluation group by name) using (name)")

On 8/31/07, Calle [EMAIL PROTECTED] wrote:
Hello,

struggling with the very basic needs... :( any help appreciated.

#using the package doBy
#who drinks how much beer per day and therefore cannot calculate row-wise
#maxvals
evaluation=data.frame(date=c(1,2,3,4,5,6,7,8,9),
name=c("Michael","Steve","Bob",
"Michael","Steve","Bob","Michael","Steve","Bob"), vol=c(3,5,4,2,4,5,7,6,7))
evaluation #

maxval=summaryBy(vol ~ name,data=evaluation,FUN = function(x) { c(ma=max(x))
} )
maxval # over all days per person

#function
getMaxVal=function(x) { maxval$vol.ma[maxval$name==x] }
getMaxVal("Steve") # testing the function for one name is ok

#we want to add a column that shows the daily drinking volume in relation to
#the person's max vol.
evaluation[,"relDrink"]= evaluation$vol/getMaxVal(evaluation$name)
#
# this brings the error:
#
#Warning message:
# Corrupt data frame: columns will be truncated or padded with NAs
# in: format.data.frame(x, digits = digits, na.encode = FALSE)

errortest= evaluation$vol/getMaxVal(evaluation$name)
errortest
# this brings:
# numeric(0)

#target was the following:
#show in each line the daily consumed beer per person and in the next column
#the all-time max consumed beer for this person (or divided by daily vol):
#
#  date    name vol relDrink
#1    1 Michael   3        7
#2    2   Steve   5        6
#3    3     Bob   4        7
#4    4 Michael   2        7
#5    5   Steve   4        6
#6    6     Bob   5        7
#7    7 Michael   7        7
#8    8   Steve   6        6
#9    9     Bob   7        7

# who can help???


```
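
The ave() suggestion, run on the poster's data, looks like this. The relDrink column is added here as the ratio the poster describes; computing it this way is this example's reading of the question.

```r
evaluation <- data.frame(
  date = 1:9,
  name = rep(c("Michael", "Steve", "Bob"), times = 3),
  vol  = c(3, 5, 4, 2, 4, 5, 7, 6, 7)
)

# ave() returns a vector as long as vol, holding each person's maximum.
evaluation$maxVol <- ave(evaluation$vol, evaluation$name, FUN = max)

# Daily volume relative to that person's maximum.
evaluation$relDrink <- evaluation$vol / evaluation$maxVol
```

The original getMaxVal() failed because maxval$name == x compared a three-row column against a nine-element vector; ave() avoids the lookup entirely by working group-wise.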

### Re: [R] size limitations in R

```SAS was developed many years ago when computers were far
less powerful, so its heritage is that it is very efficient and it's unlikely
that R or other modern software will match SAS in that respect.

The development version of the sqldf R package provides an interface
which simplifies the use of the R package RSQLite which in turn is an
interface to the sqlite database.  The development version of
sqldf supports RSQLite's ability to read a file directly to sqlite without
going through R and then reading it from there or reading a subset of it
from there into R.  See example 6 on the sqldf home page:

On 8/31/07, Fabiano Vergari [EMAIL PROTECTED] wrote:
I am a SAS user currently evaluating R as a possible addition or even
replacement for SAS. The difficulty I have come across straight away is R's
apparent difficulty in handling relatively large data files. Whilst I would
not expect it to handle datasets with millions of records, I still really
need to be able to work with datasets with 100,000+ records and 100+
variables. Yet, when reading a .csv file with 180,000 records and about 200
variables, the software virtually ground to a halt (I stopped it after 1
hour). Are there guidelines or maybe a limitations document anywhere that
helps me assess the size of file that R, generally, or specific routines will
handle? Also, mindful of the fact that I am an R novice, are there
guidelines to make efficient use of R in terms of data handling?

Regards,
Fabiano Vergari
[EMAIL PROTECTED]


```

### Re: [R] R and Web Applications

```The R packages and projects for the web and R are listed here:

http://www.lmbe.seu.edu.cn/CRAN/doc/FAQ/R-FAQ.html#R-Web-Interfaces

On 8/30/07, Chris Parkin [EMAIL PROTECTED] wrote:
Hello,

I'm curious to know how people are calling R from web applications (I've
been looking for Perl but I'm open to other languages).  After doing a
search, I came across the R package RSPerl, but I'm having difficulties
getting it installed (on Mac OSX).  I believe the problem probably has to do
with changes in R since the package release.  Below you will see where the
installation process comes to an end.  Does anyone have any suggestions, or
perhaps a direction to point me in?

Chris

* Installing to library '/Library/Frameworks/R.framework/Resources/library'
* Installing *source* package 'RSPerl' ...
checking for perl... /usr/bin/perl
No support for any of the Perl modules from calling Perl from R.
*

Set PERL5LIB to
/Library/Frameworks/R.framework/Versions/2.5/Resources/library/RSPerl/perl

*
Testing: -F/Library/Frameworks/R.framework/.. -framework R
Using '/usr/bin/perl' as the perl executable
Perl modules (no):
Adding R package to list of Perl modules to enable callbacks to R from Perl
Creating the C code for dynamically loading modules with native code for
Perl:  R
checking for gcc... gcc
checking for C compiler default output file name... a.out
checking whether the C compiler works... yes
checking whether we are cross compiling... no
checking for suffix of executables...
checking for suffix of object files... o
checking whether we are using the GNU C compiler... yes
checking whether gcc accepts -g... yes
checking for gcc option to accept ISO C89... none needed
Support R in Perl: yes
configure: creating ./config.status
config.status: creating src/Makevars
config.status: creating inst/scripts/RSPerl.csh
config.status: creating inst/scripts/RSPerl.bsh
config.status: creating src/RinPerlMakefile
config.status: creating src/Makefile.PL
config.status: creating cleanup
config.status: creating src/R.pm
config.status: creating R/perl5lib.R
making target all in RinPerlMakefile
RinPerlMakefile:5: /Library/Frameworks/R.framework/Resources/etc/Makeconf:
No such file or directory
make: *** No rule to make target
`/Library/Frameworks/R.framework/Resources/etc/Makeconf'.  Stop.


```
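
The log above stops because make cannot find `Makeconf` at the path RSPerl expects. As a first diagnostic only (not a fix, and not part of the original post), the following R sketch lists where the running R installation actually keeps its `Makeconf`. If the file turns up in a subdirectory of `etc`, that mismatch is a plausible cause.

```r
# Directory where this R installation keeps its configuration files
etc_dir <- R.home("etc")

# Look for Makeconf directly under etc/ and in any subdirectory of it
makeconf_paths <- list.files(etc_dir, pattern = "^Makeconf$",
                             recursive = TRUE, full.names = TRUE)

print(etc_dir)
print(makeconf_paths)
```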

### Re: [R] Month end calculations

```The zoo package includes the yearmon class to facilitate such
manipulations.  Here are a few solutions assuming you store
your series in a zoo variable:

# test data
library(zoo)
z <- zoo(1001:1100, as.Date(101:200))[-(45:55)]

# Solution 1.  tapply produces indexes of last of month
tt <- time(z)
z[ c(tapply(seq_along(tt), as.yearmon(tt), tail, 1)) ]

# If we want to create a last variable which corresponds
# to last in sas then do it this slightly longer way:

# Solution 2
tt <- time(z)
last <- seq_along(tt) %in% tapply(seq_along(tt), as.yearmon(tt), tail, 1)
z[last]

# Solution 3. another solution with a last variable.  f(x) is
# vector same length as x with all 0's except last element is 1.
tt <- time(z)
f <- function(x) replace(0*x, length(x), 1)
last <- ave(seq_along(tt), as.yearmon(tt), FUN = f)
z[last]

In all these solutions the last point in the series is always
included.

We have not assumed that every day is necessarily included in your
series but if every day is included then even simpler solutions
are possible.

On 8/29/07, Shubha Vishwanath Karanth [EMAIL PROTECTED] wrote:
Hi R users,

Is there a function in R, which does some calculation only for the month
end in a daily data?... In other words, is there a command in R,
equivalent to last. function in SAS?

BR, Shubha

```
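
For readers without zoo installed, the same month-end selection can be sketched in base R. This is an illustrative alternative, not code from the thread; the names `dates`, `values`, `ym` and `is_last` are made up here, and the data mirror the test series above (days 101 to 200 after 1970-01-01 with a gap).

```r
# Daily dates with a gap, mirroring the test data in the message above
dates  <- as.Date("1970-01-01") + (101:200)[-(45:55)]
values <- (1001:1100)[-(45:55)]

# Label each observation with its year and month
ym <- format(dates, "%Y-%m")

# TRUE for the last observation present in each month
is_last <- !duplicated(ym, fromLast = TRUE)

month_end <- data.frame(date = dates[is_last], value = values[is_last])
print(month_end)
```

Because the flag is computed from the dates that are actually present, a month whose final calendar days are missing still contributes its last available observation.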

### Re: [R] Month end calculations

```The last line is wrong (see below for correction):

On 8/30/07, Gabor Grothendieck [EMAIL PROTECTED] wrote:
The zoo package includes the yearmon class to facilitate such
manipulations.  Here are a few solutions assuming you store
your series in a zoo variable:

# test data
library(zoo)
z <- zoo(1001:1100, as.Date(101:200))[-(45:55)]

# Solution 1.  tapply produces indexes of last of month
tt <- time(z)
z[ c(tapply(seq_along(tt), as.yearmon(tt), tail, 1)) ]

# If we want to create a last variable which corresponds
# to last in sas then do it this slightly longer way:

# Solution 2
tt <- time(z)
last <- seq_along(tt) %in% tapply(seq_along(tt), as.yearmon(tt), tail, 1)
z[last]

# Solution 3. another solution with a last variable.  f(x) is
# vector same length as x with all 0's except last element is 1.
tt <- time(z)
f <- function(x) replace(0*x, length(x), 1)
last <- ave(seq_along(tt), as.yearmon(tt), FUN = f)
z[last]

This last line should be:

z[last == 1]

In all these solutions the last point in the series is always
included.

We have not assumed that every day is necessarily included in your
series but if every day is included then even simpler solutions
are possible.

On 8/29/07, Shubha Vishwanath Karanth [EMAIL PROTECTED] wrote:
Hi R users,

Is there a function in R, which does some calculation only for the month
end in a daily data?... In other words, is there a command in R,
equivalent to last. function in SAS?

BR, Shubha

```
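
The correction matters because a 0/1 numeric vector is treated as a set of positions, not as a mask. A small base-R illustration with made-up values (`x` and `flag` are hypothetical names, not from the thread):

```r
x    <- c(10, 20, 30, 40)
flag <- c(0, 1, 0, 1)      # numeric 0/1 indicator, like the result of ave() above

# Numeric indexing: zeros are dropped and each 1 selects the first element
by_number <- x[flag]

# Logical indexing: keeps exactly the flagged positions
by_logical <- x[flag == 1]

print(by_number)
print(by_logical)
```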

### Re: [R] Month end calculations

```And one more yearmon solution.  Here z is a zoo series as before:

tt <- time(z)
aggregate(z, ave(tt, as.yearmon(tt), FUN = max), tail, 1)

```

### Re: [R] Synchronzing workspaces

```I haven't had similar experience but note that save has ascii=
and compress= arguments.  You could check if varying those
parameter values makes a difference.

On 8/30/07, Paul August [EMAIL PROTECTED] wrote:
I used to work on several computers and to use a flash drive to synchronize
the workspace on each machine before starting to work on it. I found that
.RData always caused some trouble: often it was corrupted even though there was
no error in the copying process. Has anybody had a similar experience?

Paul.

- Original Message
From: Barry Rowlingson [EMAIL PROTECTED]
To: Eric Turkheimer [EMAIL PROTECTED]
Cc: r-help@stat.math.ethz.ch
Sent: Wednesday, August 22, 2007 9:43:57 AM
Subject: Re: [R] Synchronzing workspaces

Eric Turkheimer wrote:
How do people go about synchronizing multiple workspaces on different
workstations?  I tend to wind up with projects spread around the various
machines I work on.  I find that placing the directories on a server and
reading them remotely tends to slow things down.

If R were to store all its workspace data objects in individual files
instead of one big .RData file, then you could use a revision control
system like SVN.  Check out the data, work on it, check it in, then on
another machine just update to get the changes.

However SVN doesn't work too well for binary files - conflicts being
hard to resolve without someone backing down - so maybe it's not such a
good idea anyway...

On unix boxes and derivatives, you can keep things in sync efficiently
with the 'rsync' command.  I think there are GUI addons for it, and
Windows ports.

Barry

```
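
A minimal sketch of the suggestion to vary `ascii=` and `compress=`, with a checksum that could be compared on both machines to detect a damaged copy. The object `x` and the temporary file names are illustrative assumptions, and using `tools::md5sum` is an addition not mentioned in the thread.

```r
x <- data.frame(id = 1:3, score = c(0.5, 1.5, 2.5))

f_bin <- tempfile(fileext = ".RData")
f_txt <- tempfile(fileext = ".RData")

# Default-style binary, compressed workspace file
save(x, file = f_bin, compress = TRUE)

# Plain-text, uncompressed variant of the same data
save(x, file = f_txt, ascii = TRUE, compress = FALSE)

# Record a checksum before copying; recompute it after copying and compare
checksum <- tools::md5sum(f_txt)

# Reload into a separate environment and confirm the object survived
restored <- new.env()
load(f_txt, envir = restored)
print(all.equal(restored$x, x))
```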

### Re: [R] sql query over local tables

```I assume that by local tables you mean data frames in R.

You can use the merge function in the base of R, as others have already
mentioned, or if you want to use SQL syntax you can use the sqldf package.

On 8/28/07, Jorge Cornejo Donoso [EMAIL PROTECTED] wrote:
Hi, I have two tables with IDs in each one.

I want to make a join (as in SQL) by the ID. Is there any way to use the RODBC
package (or other) on local tables (not Access, MySQL, SQL, etc.) and

```
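
As a concrete sketch of the base-R route, here is a join by ID with `merge`. The two data frames and their columns are invented for illustration.

```r
patients <- data.frame(ID = c(1, 2, 3), age = c(34, 51, 27))
visits   <- data.frame(ID = c(2, 3, 4), visits = c(5, 1, 2))

# Inner join: only IDs present in both tables
inner <- merge(patients, visits, by = "ID")

# Full outer join: every ID from either table, NA where there is no match
full <- merge(patients, visits, by = "ID", all = TRUE)

print(inner)
print(full)
```

`all.x = TRUE` or `all.y = TRUE` give the left and right outer joins respectively.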

### Re: [R] Strage result with an append/strptime combination

```Try chron:

library(chron)
namefile <- "070707050642.dat"  # day-month-year-hour-minute-second.dat
x <- chron(substr(namefile, 1, 6), substr(namefile, 7, 12),
    format = c("dmy", "hms"), out.format = c("m/d/y", "h:m:s"))
c(x, x)
[1] (07/07/07 05:06:42) (07/07/07 05:06:42)

See R News 4/1 Help Desk article for more.

On 8/29/07, Ptit_Bleu [EMAIL PROTECTED] wrote:

Hi,

I keep on trying to write some small scripts in order to learn R but even
with basic scripts I have problems ...

I start with the name of a file which is in fact the time the file has been
generated (I cannot change the format). Then I convert namefile with
strptime. The problem occurs when I add another time from another file with
append. It displays some information I don't want.

(http://www.nabble.com/Error-with-strptime-tf3607942.html#a10081942) but I
don't understand the solution. I tested as.POSIXct or as.POSIXlt but it has
no effect.

Do you have some ideas to solve this problem ?
Ptit Bleu.

---

namefile <- "070707050642.dat"  # day-month-year-hour-minute-second.dat
jourheure <- strptime(namefile, "%d%m%y%H%M%S")

jourheure
[1] 2007-07-07 05:06:42

jourheure <- append(jourheure, jourheure)
jourheure
[1] 2007-07-07 05:06:42 Paris, Madrid (heure d'été) 2007-07-07 05:06:42

```
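
A base-R alternative to the chron answer, offered as a sketch rather than as the thread's solution. `strptime` returns a POSIXlt value, which is stored as a list of fields; that is a plausible reason `append` misbehaved in the R version used in the thread. Converting to POSIXct (a plain number of seconds) before combining avoids the issue. The `tz = "UTC"` setting is an assumption made here so the result does not depend on the local time zone.

```r
namefile <- "070707050642.dat"  # day-month-year-hour-minute-second.dat

# Parse the first 12 characters; the result is a POSIXlt (list-based) value
stamp_lt <- strptime(substr(namefile, 1, 12), "%d%m%y%H%M%S", tz = "UTC")

# Convert to POSIXct before combining several time stamps
stamp_ct <- as.POSIXct(stamp_lt)
both <- c(stamp_ct, stamp_ct)

formatted <- format(both, "%Y-%m-%d %H:%M:%S", tz = "UTC")
print(formatted)
```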

### Re: [R] Strage result with an append/strptime combination

```Try

fmt <- function(x) with(month.day.year(x),
    sprintf("%02d/%02d/%02d %02d:%02d:%02d", month, day, year,
    hours(x), minutes(x), seconds(x)))
fmt(x)

On 8/29/07, Ptit_Bleu [EMAIL PROTECTED] wrote:

Thanks Gabor !

It works.
Just one more thing : is there a possibility to remove ( and ) before I
copy the data to a MySQL database.

Again thank you for the tip.
Ptit Bleu.

```
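
If the time stamps are held as POSIXct rather than chron objects, `format` produces plain strings without parentheses. This is a base-R sketch under that assumption; the `"%Y-%m-%d %H:%M:%S"` layout is the usual text form for a MySQL DATETIME column, and `tz = "UTC"` is chosen here only to make the example reproducible.

```r
stamp <- as.POSIXct("2007-07-07 05:06:42", tz = "UTC")

# Same layout as the chron output, minus the parentheses
us_style <- format(stamp, "%m/%d/%y %H:%M:%S", tz = "UTC")

# Layout commonly used when writing to a MySQL DATETIME column
mysql_style <- format(stamp, "%Y-%m-%d %H:%M:%S", tz = "UTC")

print(us_style)
print(mysql_style)
```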

### Re: [R] Excel

```You would still need the interactive GUI to get to the point where it's
at all comparable to Excel.  Using rpad you could construct such
an interface although it's a bit of work.  Here is an example using

On 8/29/07, Bert Gunter [EMAIL PROTECTED] wrote:
Erich:

This is not a comment either for or against the use of Excel. I only wish to
point out that AFAICS, Hadley Wickham's reshape package offers all the pivot
table functionality and more.

Bert Gunter
Genentech Nonclinical Statistics

-Original Message-
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] On Behalf Of Erich Neuwirth
Sent: Wednesday, August 29, 2007 11:43 AM
To: r-help
Subject: Re: [R] Excel

Excel bashing can be fun but also can be dangerous because
you are making your life harder than necessary.
Statisticians meanwhile know that the numerics of Excel's statistical
computations can be quite bad, and that one should therefore not use them.
But using our (we = Thomas Baier + Erich Neuwirth) RExcel addin either
with the R(D)COM server or with rcom (package on CRAN) allows you to use
all the nice features of Excel (yes, there are quite a few) and use R
as the computational engine within Excel. The formula
=RApply("var",A1:A1000) in an Excel cell for example will use R to
compute the variance of the data in column A in Excel. If you change any
of the values in the range A1:A1000, Excel will automatically recompute the
variance.

There is one feature in Excel which is extremely convenient, Pivot
tables. Anybody doing any work as statistical consultant really ought to
know about Pivot tables, and I am still surprised how many statisticians
do not know about it. Neither Gnumeric nor OpenOffice Calc offer
comparably convenient ways working with multidimensional tables.

I think the answer to the question
"Excel or R" is of course "Excel and R".

--
Erich Neuwirth, University of Vienna
Faculty of Computer Science
Computer Supported Didactics Working Group
Phone: +43-1-4277-39464 Fax: +43-1-4277-39459

```
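
To make the pivot-table comparison concrete, here is a small base-R sketch of the kind of cross-tabulated summary a pivot table gives. It uses `xtabs` and `tapply` rather than the reshape package mentioned above, and the `sales` data are invented.

```r
sales <- data.frame(
  region  = c("North", "North", "South", "South", "South"),
  product = c("A", "B", "A", "A", "B"),
  amount  = c(10, 20, 5, 15, 30)
)

# Sum of amount for every region/product combination
pivot_sum <- xtabs(amount ~ region + product, data = sales)

# Mean of amount for the same layout
pivot_mean <- with(sales, tapply(amount, list(region, product), mean))

print(pivot_sum)
print(pivot_mean)
```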

### Re: [R] Efficient way to parse string and construct data.frame

```Try this:

s <- c("1 ,2 ,3", "4 ,5 ,6")
read.table(textConnection(s), sep = ",")
  V1 V2 V3
1  1  2  3
2  4  5  6

On 8/28/07, yoo [EMAIL PROTECTED] wrote:

Hi all,

I have this list of strings
[1] "1 ,2 ,3" "4 ,5 ,6"

Is there an efficient way to convert it to data.frame:
  V1  V2  V3
1   1   2   3
2   4   5   6

Like I can use strsplit to get to a list of split strings.. and then use say
a = strsplit(mylist, ",")
data.frame(V1 = lapply(a, function(x){x[1]}), V2 = lapply(a,
function(x){x[2]}), ...)

but I'm looping through that list so many times.. so I'm hesitant to use
that..

Thanks a lot for your great help before and this time as well!!
- boy
```
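
If the strsplit route is preferred, the list only needs to be traversed once: convert each piece to numbers and bind the rows together. A sketch with illustrative variable names (`parts`, `m`, `df`):

```r
s <- c("1 ,2 ,3", "4 ,5 ,6")

# Split every string on commas in a single pass
parts <- strsplit(s, ",", fixed = TRUE)

# Trim the stray spaces, convert to numbers, and stack the rows into a matrix
m <- do.call(rbind, lapply(parts, function(p) as.numeric(trimws(p))))

# An unnamed matrix becomes a data frame with columns V1, V2, V3
df <- as.data.frame(m)
print(df)
```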

### Re: [R] Factor levels

```You can create your own class and pass that to read.table.  In
the example below Fld2 is read in with factor levels C, A, B
in that order.

library(methods)
setClass("my.levels")
setAs("character", "my.levels",
    function(from) factor(from, levels = c("C", "A", "B")))

### test ###

Input <- "Fld1 Fld2
10 A
20 B
30 C
40 A
"
DF <- read.table(textConnection(Input), header = TRUE,
    colClasses = c("numeric", "my.levels"))
str(DF)
# or
DF <- read.table(textConnection(Input), header = TRUE,
    colClasses = list(Fld2 = "my.levels"))
str(DF)

On 8/28/07, Sébastien [EMAIL PROTECTED] wrote:
Dear R-users,

I have found this not-so-recent post in the archives  -
http://tolstoy.newcastle.edu.au/R/devel/00a/0291.html - while I was
looking for a particular way to reorder factor levels. The question
addressed by the author was to know if the read.table function could be
modified to order the levels of newly created factors according to the
order that they appear in the data file. Exactly what I am looking for.
As there was no reply to this post, I wonder if any move have been made
towards the implementation of this suggestion. A quick look at
?read.table tells me that if this option was implemented, it was not in

Sebastien

PS: I am sorry to post so many messages on the list, but I am learning R
(basically by trial and error ;-) ) and no one around me has even a

```
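
A simpler alternative, shown here only as a sketch and not as the approach proposed in the message: read the column as plain text first, then set the level order afterwards. It uses the `text=` argument of `read.table`, which is available in current R versions.

```r
Input <- "Fld1 Fld2
10 A
20 B
30 C
40 A
"

# Read Fld2 as character so no alphabetical factor is created on the way in
DF <- read.table(text = Input, header = TRUE, stringsAsFactors = FALSE)

# Impose the desired level order after reading
DF$Fld2 <- factor(DF$Fld2, levels = c("C", "A", "B"))

str(DF)
```

The custom-class route does the conversion inside `read.table`; this version does it in a separate step, which is easier to adapt when the level order is only known after looking at the data.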

### Re: [R] Factor levels

```It's not clear from your description what you want.
Could you be a bit more specific, including an example?

On 8/28/07, Sébastien [EMAIL PROTECTED] wrote:
Thanks Gabor, I have two questions:

1- Is there any difference between your code and the following one, with
regard to Fld2 ?
### test ###

Input <- "Fld1 Fld2
10 A
20 B
30 C
40 A
"
DF <- read.table(textConnection(Input), header = TRUE)

DF$Fld2 <- factor(DF$Fld2, levels = c("C", "A", "B"))

2- Do you see any way to bring flexibility to your method ? Because it
looks to me as if, at this stage, I have to i) know the order of my levels
before I read the table and ii) create one class per factor.
My problem is that I am not really working on a specific dataset. My goal is
to develop R scripts capable of handling datasets which have various
contents but similar structures. So, I really need to minimize the quantity of
user-specific code.

Sebastien


```
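
One way to get the flexibility asked about in question 2, sketched with a hypothetical helper `reorder_by_appearance` (the name and the `cols` argument are inventions for this example): after reading the data, rebuild each factor or character column so that its levels follow the order of first appearance. No level order has to be known in advance.

```r
# Rebuild the chosen columns as factors whose levels follow first appearance
reorder_by_appearance <- function(df, cols = names(df)) {
  for (nm in cols) {
    if (is.factor(df[[nm]]) || is.character(df[[nm]])) {
      values <- as.character(df[[nm]])
      df[[nm]] <- factor(values, levels = unique(values))
    }
  }
  df
}

mydata <- data.frame(a = 1:6,
                     c = c("w", "k", "r", "s", "k", "p"),
                     stringsAsFactors = TRUE)

out <- reorder_by_appearance(mydata, "c")
print(levels(mydata$c))  # alphabetical, as created by data.frame
print(levels(out$c))     # order of first appearance
```

Numeric columns are skipped by the helper, so it can be called on a whole data frame without naming columns.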

### Re: [R] Factor levels

```It's the same principle.  Just change the function to be suitable.  This one
arranges the levels according to the input:

library(methods)
setClass("my.factor")
setAs("character", "my.factor",
    function(from) factor(from, levels = unique(from)))

Input <- "a b c
1   1 176 w
2   2 141 k
3   3 172 r
4   4 182 s
5   5 123 k
6   6 153 p
7   7 176 l
8   8 170 u
9   9 140 z
10 10 194 s
11 11 164 j
12 12 100 j
13 13 127 x
14 14 137 r
15 15 198 d
16 16 173 j
17 17 113 x
18 18 144 w
19 19 198 q
20 20 122 f
"
DF <- read.table(textConnection(Input), header = TRUE,
    colClasses = list(c = "my.factor"))
str(DF)

On 8/28/07, Sébastien [EMAIL PROTECTED] wrote:
Ok, I cannot send you one of my datasets since they are confidential. But
I can produce a dummy mini dataset to illustrate my question. Let's say I
have a csv file with 3 columns and 20 rows whose content is reproduced by
the following line.

mydata <- data.frame(a=1:20,
b=sample(100:200,20,replace=T),c=sample(letters[1:26], 20,
replace = T))
mydata
a   b c
1   1 176 w
2   2 141 k
3   3 172 r
4   4 182 s
5   5 123 k
6   6 153 p
7   7 176 l
8   8 170 u
9   9 140 z
10 10 194 s
11 11 164 j
12 12 100 j
13 13 127 x
14 14 137 r
15 15 198 d
16 16 173 j
17 17 113 x
18 18 144 w
19 19 198 q
20 20 122 f

If I had to read the csv file, I would use something like:

Now, if you look at mydata$c, the levels are alphabetically ordered.
mydata$c
[1] w k r s k p l u z s j j x r d j x w q f
Levels: d f j k l p q r s u w x z

What I am trying to do is to reorder the levels so as to have them in the order
they appear in the table, i.e.
Levels: w k r s p l u z j x d q f

Again, keep in mind that my script should be used on datasets whose contents
are unknown to me. In my example, I have used letters for mydata$c, but my
code may have to handle factors of numeric or character values (I need to
transform specific columns of my dataset into factors for plotting
purposes). My goal is to let the code scan the content of each factor of my
data.frame during or after the read.table step and reorder their levels
automatically without having to ask the user to hard-code the level order.

In a way, my problem is more related to the way the factor levels are
ordered than to the read.table function, although I guess there is a link...

```

### Re: [R] Nodes edges with similarity matrix

```Try this:

# test data
mat <- structure(c(1, 0.325141612, 0.002109751, 0.250153137, 0.0223676,
1, 0.342654, 0.1987485, 0.9723831, 0.9644216, 1, 0.7391222, 0.394331,
0.5460461, 0.7080224, 1), .Dim = c(4L, 4L), .Dimnames = list(
c("a", "b", "c", "d"), c("a", "b", "c", "d")))

library(sna)

# draw edges according to value
gplot(mat, edge.lwd = mat, label = rownames(mat))

# thresholding at 0.5
gplot(mat > .5, label = rownames(mat))

On 8/28/07, H. Paul Benton [EMAIL PROTECTED] wrote:
Hello,

I apologise if someone has already answered this but I searched and

I have a matrix which gives me the similarity of each item to each
other. I would like to turn this matrix into something like what they
have in the graph package with the nodes and edges.
http://cran.r-project.org/doc/packages/graph.pdf . However I cannot find
a method to convert my matrix to an object that graph can use.

my similarity matrix looks like:
sim[1:4,]
a  b  c  d
[a]  1.0  0.0223676  0.9723831  0.3943310
[b]  0.325141612  1.000  0.9644216  0.5460461
[c]  0.002109751  0.3426540  1.000  0.7080224
[d]  0.250153137  0.1987485  0.7391222  1.000

Please don't get caught up with the numbers; I simply made them up for illustration.
I have not yet written the code to produce my similarity matrix.

Does anyone know a method to do this or do I have to write something? :(
If I do, any starter code? :D jj. If I've read something wrong or
misunderstood, my apologies.

cheers,

Paul

--
Research Technician
Mass Spectrometry
o The
/
o Scripps
\
o Research
/
o Institute

```
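
Independently of any graph package, the thresholding step can be sketched in base R by turning the similarity matrix into an edge list. The 0.5 cutoff matches the example above; the names `sim`, `idx` and `edges` are illustrative, and the resulting data frame is only one possible input format for graph tools.

```r
nodes <- c("a", "b", "c", "d")
sim <- matrix(c(1, 0.325141612, 0.002109751, 0.250153137,
                0.0223676, 1, 0.342654, 0.1987485,
                0.9723831, 0.9644216, 1, 0.7391222,
                0.394331, 0.5460461, 0.7080224, 1),
              nrow = 4, dimnames = list(nodes, nodes))

# Positions above the threshold, ignoring the diagonal (self-similarity)
idx <- which(sim > 0.5 & row(sim) != col(sim), arr.ind = TRUE)

# One row per directed edge: source node, target node, similarity as weight
edges <- data.frame(from   = rownames(sim)[idx[, "row"]],
                    to     = colnames(sim)[idx[, "col"]],
                    weight = sim[idx],
                    row.names = NULL)
print(edges)
```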

### Re: [R] How to provide argument when opening RGui from an external application

```There are also some batch files that can be used with Rscript on XP and info

On 8/26/07, Sébastien [EMAIL PROTECTED] wrote:
When you say look into Rscript.exe, do you have a specific document in
mind ? I tried to google it but could not find much... I forgot to
mention in my first email that I am working under the Windows XP
environment.

Prof Brian Ripley a écrit :
Look into Rscript.exe (on Windows), which is a flexible way to run
scripts.  Neither using a GUI nor using source() are recommended.

On Fri, 24 Aug 2007, Sébastien wrote:

Dear R-users,

I have written a small application (in Visual Basic) that automatically
generates some R scripts. I would like to execute these scripts when my
application is being closed.
My problem is that I don't know how to pass the
'source("c:/.../myscript.r")' instruction when I programmatically start
RGui. Tinn-R is capable of doing such things, so I guess there must be a
way to pass arguments to RGui.

Sebastien

```
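
A sketch of what a generated script might look like when it is run through Rscript instead of RGui. The file name `myscript.R` and the argument `input.csv` are placeholders, not taken from the thread; an external program would start it with a command line of the form shown in the comment.

```r
# Hypothetical generated script, e.g. saved as myscript.R and launched with:
#   Rscript myscript.R input.csv

# Arguments that follow the script name on the command line
args <- commandArgs(trailingOnly = TRUE)

# Use the first argument as the input file if one was supplied
input_file <- if (length(args) >= 1) args[1] else NA_character_

if (is.na(input_file)) {
  message("No input file supplied; nothing to do.")
} else {
  message("Would process: ", input_file)
}
```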

### Re: [R] How to make an array of data.frames?

```Is this what you want:

DF1 <- DF2 <- DF3 <- df1 <- df2 <- df3 <- head(iris)
list(a = list(DF1, DF2, DF3), b = list(df1, df2, df3))

or

x <- list()
x$a <- list(DF1, DF2, DF3)
x$b <- list(df1, df2, df3)

On 8/26/07, Werner Wernersen [EMAIL PROTECTED] wrote:
Hi,

I am still struggling with the data structures in R. I
know how it works in C++ but how can I get such a
structure in R?

Here is what I want:
x[a]$dataframe1
x[a]$dataframe2
x[a]$dataframe3
x[b]$dataframe1
x[b]$dataframe2
x[b]$dataframe3
x[c]$dataframe1
x[c]$dataframe2
x[c]$dataframe3

And it would be nice if I could fill in objects a,
b, c one at a time successively.

What is the easiest way to get such a data structure?
It would be great if someone could give me some help
with this.

Many thanks and kind regards,
Werner

```
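
To get the `x[a]$dataframe1` style of access asked for, the inner lists can be given names, and the outer list can be filled one entry at a time. A sketch using built-in datasets as stand-ins for the real data frames:

```r
x <- list()

# Fill in "a" in one go, with named data frames inside
x[["a"]] <- list(dataframe1 = head(iris),
                 dataframe2 = head(mtcars),
                 dataframe3 = head(cars))

# Fill in "b" later, one data frame at a time
x[["b"]] <- list(dataframe1 = tail(iris))
x[["b"]]$dataframe2 <- tail(mtcars)

print(names(x))
print(names(x[["b"]]))
print(nrow(x[["a"]]$dataframe1))
```

Note the double brackets: `x[["a"]]` returns the inner list itself, while `x["a"]` returns a one-element list containing it.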

### Re: [R] How to make an array of data.frames?

```That gives you a list of data frames. An array is a vector with a dim
attribute, so to make it into an array add the appropriate dim attribute.

If x is the list we created before then:

dim(x) <- 2

gives us an array of length 2, each element of which is a list of 3 elements
or

dim(x) <- 1:2

gives a 1x2 array

or

y <- list(DF1, DF2, DF3, df1, df2, df3)
dim(y) <- 3:2

gives a 3x2 array so you can write y[[1,2]] for example.
etc.

```
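
The 3x2 list array can also carry dimnames, which gives named access along both dimensions. A self-contained sketch; the built-in datasets and the dimension names are placeholders chosen for this example.

```r
dfs <- list(head(iris), head(mtcars), head(cars),
            tail(iris), tail(mtcars), tail(cars))

# Turn the plain list into a 3 x 2 array of data frames (filled column by column)
dim(dfs) <- c(3, 2)
dimnames(dfs) <- list(c("dataframe1", "dataframe2", "dataframe3"),
                      c("a", "b"))

# Access a single data frame by position or by name
by_position <- dfs[[1, 2]]
by_name     <- dfs[["dataframe1", "b"]]

print(dim(dfs))
print(identical(by_position, by_name))
```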

### Re: [R] Program of matrix of seasonal dummy variable(Econometrics)

```Try this:

kronecker(rep(1, 3), diag(4))
      [,1] [,2] [,3] [,4]
 [1,]    1    0    0    0
 [2,]    0    1    0    0
 [3,]    0    0    1    0
 [4,]    0    0    0    1
 [5,]    1    0    0    0
 [6,]    0    1    0    0
 [7,]    0    0    1    0
 [8,]    0    0    0    1
 [9,]    1    0    0    0
[10,]    0    1    0    0
[11,]    0    0    1    0
[12,]    0    0    0    1

On 8/26/07, [EMAIL PROTECTED] [EMAIL PROTECTED] wrote:
Dear R users,
I would like to construct a matrix of seasonal dummy variables, such matrix
can be written as follows(i.e format(T,4))
1   0   0   0
0   1   0   0
0   0   1   0
0   0   0   1
1   0   0   0
0   1   0   0
0   0   1   0
0   0   0   1
1   0   0   0
0   1   0   0
0   0   1   0
0   0   0   1
.. ..
.. . .
etc
I have written the following small program:
T=100
br<-matrix(0,T,4)
for (i in 1:T)
{
+ for (j in 1:4)
{
+ if i=j
{+ br[i,j]=1
+ }
+ if else (abs(i-j)%%4==0)
{+ br[i,j]=1
+}
+ else
{+ br[i,j]=0
+}
+}
+}
I have obtained the following message from the R console:
T=100
br<-matrix(0,T,4)
for (i in 1:T)
+  {
+ + for (j in 1:4)
+ {
++ if i=j
Error: syntax error, unexpected SYMBOL, expecting '(' in:

{+ br[i,j]=1
+ + }
Error: syntax error, unexpected '}' in {+ br[i,j]=1

+ if else (abs(i-j)%%4==0)
Error: syntax error, unexpected ELSE, expecting '(' in + if else
{+ br[i,j]=1
+ +}
Error: syntax error, unexpected '}' in {+ br[i,j]=1
+ else
Error: syntax error, unexpected ELSE in + else
{+ br[i,j]=0
+ +}
Error: syntax error, unexpected '}' in {+ br[i,j]=0
+}
Error: syntax error, unexpected '}' in +}
+}
Error: syntax error, unexpected '}' in +}

I would be grateful if you could correct my program so that it produces this matrix
of seasonal dummies. Many thanks in advance.
[[alternative HTML version deleted]]

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
and provide commented, minimal, self-contained, reproducible code.

```
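
For comparison, here is a sketch of the poster's loop with the syntax repaired (a parenthesised condition, `==` for comparison, and no pasted `+` prompts), checked against the `kronecker()` answer. The size `n_obs = 12` is chosen here only to keep the output small, and the row-indexing variant is an extra illustration rather than something from the thread.

```r
n_obs <- 12                      # number of observations (the question used 100)

# Loop version: entry (i, j) is 1 when row i falls in season j.
br <- matrix(0, n_obs, 4)
for (i in 1:n_obs) {
  for (j in 1:4) {
    if ((i - j) %% 4 == 0) {
      br[i, j] <- 1
    }
  }
}

# The kronecker() answer: stack n_obs / 4 copies of the 4 x 4 identity matrix.
k <- kronecker(rep(1, n_obs / 4), diag(4))

# Row-indexing variant; also works when n_obs is not a multiple of 4.
br2 <- diag(4)[rep(1:4, length.out = n_obs), ]
```

All three objects hold the same 12 x 4 matrix; only the loop and the row-indexing variant handle a sample size that is not a multiple of 4 without adjustment.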

### Re: [R] subset using noncontiguous variables by name (not index)

```Using builtin data frame anscombe try this. First we set up a data frame
anscombe.seq which has one row containing 1, 2, 3, ... .  Then select
out from that data frame and unlist it to get the desired index vector.

anscombe.seq <- replace(anscombe[1,], TRUE, seq_along(anscombe))
idx <- unlist(subset(anscombe.seq, select = c(x1, x3:x4, y2)))
anscombe[idx]
   x1 x3 x4   y2
1  10 10  8 9.14
2   8  8  8 8.14
3  13 13  8 8.74
4   9  9  8 8.77
5  11 11  8 9.26
6  14 14  8 8.10
7   6  6  8 6.13
8   4  4 19 3.10
9  12 12  8 9.13
10  7  7  8 7.26
11  5  5  8 4.74

On 8/26/07, Muenchen, Robert A (Bob) [EMAIL PROTECTED] wrote:
Hi All,

I'm using the subset function to select a list of variables, some of
which are contiguous in the data frame, and others of which are not. It
works fine when I use the form:

subset(mydata,select=c(x1,x3:x5,x7) )

In reality, my list is far more complex. So I would like to store it in
a variable to substitute in for c(x1,x3:x5,x7) but cannot get it to
work. That use of the c function seems to violate R rules, so I'm not
sure how it works at all. A small simulation of the problem is below.

If the variable names and orders were really this simple, I could use
indices like

summary( mydata[ ,c(1,3:5,7) ] )

but alas, they are not.

How does the c function work this way in the first place, and how can I
make this substitution?

Thanks,
Bob

mydata <- data.frame(
x1=c(1,2,3,4,5),
x2=c(1,2,3,4,5),
x3=c(1,2,3,4,5),
x4=c(1,2,3,4,5),
x5=c(1,2,3,4,5),
x6=c(1,2,3,4,5),
x7=c(1,2,3,4,5)
)
mydata

# This does what I want.
summary(
subset(mydata,select=c(x1,x3:x5,x7) )
)

# Can I substitute myVars?
attach(mydata)
myVars1 <- c(x1,x3:x5,x7)

# Not looking good!
myVars1

# This doesn't do the right thing.
summary(
subset(mydata,select=myVars1 )
)

# Total desperation on this attempt:
myVars2 <- "x1,x3:x5,x7"
myVars2

# This doesn't work either.
summary(
subset(mydata,select=myVars2 )
)

=
Bob Muenchen (pronounced Min'-chen), Manager
Statistical Consulting Center
U of TN Office of Information Technology
200 Stokely Management Center, Knoxville, TN 37996-0520
Voice: (865) 974-5230
FAX: (865) 974-4810
Email: [EMAIL PROTECTED]
Web: http://oit.utk.edu/scc,
News: http://listserv.utk.edu/archives/statnews.html

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
and provide commented, minimal, self-contained, reproducible code.

```
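
As a rough illustration of why `c(x1, x3:x4, y2)` works inside `subset()`: the expression is evaluated in a list where each column name stands for its position, so `x3:x4` becomes `3:4`. The sketch below imitates that idea by hand on the built-in `anscombe` data; the names `nl` and `sel` are made up here, and this is a simplification rather than a copy of the `subset()` source.

```r
# Map each column name to its position: x1 = 1, x2 = 2, ..., y4 = 8.
nl <- as.list(seq_along(anscombe))
names(nl) <- names(anscombe)

# Store the selection unevaluated, so the names are not looked up yet.
sel <- quote(c(x1, x3:x4, y2))

# Evaluate it with the column names bound to their positions.
idx <- eval(sel, nl)

picked <- anscombe[idx]
```

Storing the selection with `quote()` is what makes it reusable; a plain `c(x1, x3:x5, x7)` at top level is evaluated immediately against whatever `x1`, `x3`, ... happen to be visible.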

### Re: [R] subset using noncontiguous variables by name (not index)

```Try this:

"%:%" <- function(x, y) {
   prex <- gsub("[0-9]", "", x); postx <- gsub("[^0-9]", "", x)
   prey <- gsub("[0-9]", "", y); posty <- gsub("[^0-9]", "", y)
   stopifnot(prex == prey)
   paste(prex, seq(from = as.numeric(postx), to = as.numeric(posty)), sep = "")
}
"x2" %:% "x4"
[1] "x2" "x3" "x4"

On 8/26/07, Muenchen, Robert A (Bob) [EMAIL PROTECTED] wrote:
Thanks, Bert and Gabor, for two very interesting solutions!

It would be very handy in R if string1:stringN generated
string1,string2...stringN; it would make selections like this much
more obvious. I know it's easy to do with the colon operator and paste
function but that's quite a step up in complexity compared to SAS' x1
x3-x4 y2 or SPSS' x1,x3 to x4, y2. And it's complexity that beginners
face early in learning R.

While on the subject of the colon operator, why doesn't anscombe[[1:4]]
select the x variables in list form as anscombe[,1:4] or anscombe[1:4]
do in data frame form?

Thanks,

Bob

=
Bob Muenchen (pronounced Min'-chen), Manager
Statistical Consulting Center
U of TN Office of Information Technology
200 Stokely Management Center, Knoxville, TN 37996-0520
Voice: (865) 974-5230
FAX: (865) 974-4810
Email: [EMAIL PROTECTED]
Web: http://oit.utk.edu/scc,
News: http://listserv.utk.edu/archives/statnews.html
=

-Original Message-
From: Bert Gunter [mailto:[EMAIL PROTECTED]
Sent: Sunday, August 26, 2007 6:50 PM
To: 'Gabor Grothendieck'; Muenchen, Robert A (Bob)
Cc: r-help@stat.math.ethz.ch
Subject: RE: [R] subset using noncontiguous variables by name (not
index)

The problem is that x3:x5 does not mean what you think it means. The only
reason it does the right thing in subset() is because a clever trick is used
there (read the code -- it's not hard to understand) to ensure that it does.
Gabor has essentially mimicked that trick in his solution.

However, it is not necessary to do this. You can construct the call directly as
you tried to do. Using the anscombe example, here's how:

chooz <- "c(x1,x3:x4,y2)"  ## enclose the desired expression in quotes
do.call (subset, list( x = anscombe, select = parse(text = chooz)))

-- Bert Gunter
Genentech Non-Clinical Statistics
South San Francisco, CA

The business of the statistician is to catalyze the scientific learning
process.  - George E. P. Box

```
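
A short usage sketch for the `%:%` operator defined in this reply, applied to the `anscombe` selection discussed in the thread. The variable name `cols` is introduced here for illustration; the operator body is the one from the message with its quotes restored.

```r
# Expand "x3" %:% "x4" into c("x3", "x4"): same prefix, consecutive numbers.
"%:%" <- function(x, y) {
  prex <- gsub("[0-9]", "", x);  postx <- gsub("[^0-9]", "", x)
  prey <- gsub("[0-9]", "", y);  posty <- gsub("[^0-9]", "", y)
  stopifnot(prex == prey)
  paste(prex, seq(from = as.numeric(postx), to = as.numeric(posty)), sep = "")
}

# The selection is now an ordinary character vector that can be stored and reused.
cols <- c("x1", "x3" %:% "x4", "y2")
picked <- anscombe[cols]
```

Because `cols` is plain data, it can be built up programmatically, which is what the original question was after.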

### Re: [R] Extracting a range of elements from a vector

```See ?embed

On 8/25/07, Otis Laws [EMAIL PROTECTED] wrote:
Dear R users

I am an R newbie creating a function that implements the poker test to test
pseudo random bit generators.
I am reading the bits from a text file (1 bit per line), which causes
each bit to be stored in an element of a numeric vector.

What I am trying to do is to extract a block of bits of arbitrary size
from the original vector into a smaller numeric vector and then count
this binary number
(and keep repeating this until the end of the vector, so that I get a
vector containing the number of times each binary number has occurred) e.g.

original vector:
0, 1,1,0,0,1,0,1,1

using a block size of 3 bits the first smaller vector becomes:
0, 1, 1

At the moment I do this by iterating through the original vector and setting
the ith element of the smaller vector.
I have looked at using the subset() function but it seems to operate on
a vector's content rather than index.

This causes the following two main questions:
1. Is there a way to specify a range of vector elements?
2. Is this the most efficient method, since this could be extremely time
consuming when used to test millions of bits?

Otis Laws

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
and provide commented, minimal, self-contained, reproducible code.

```
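
A sketch of how the block counting might look without an explicit loop, using the example vector from the question. It assumes non-overlapping blocks and a vector length that is a multiple of the block size; the names `bits`, `m`, `blocks` and `values` are made up here. Note that `embed()` produces overlapping windows with the columns in reverse order, so it needs a little rearranging for this purpose.

```r
bits <- c(0, 1, 1, 0, 0, 1, 0, 1, 1)
m <- 3                                   # block size in bits

# 1. A range of elements is selected with the colon operator.
first_block <- bits[1:m]

# 2. Non-overlapping blocks: fold the vector into a matrix, one block per column.
blocks <- matrix(bits, nrow = m)

# Decimal value of each block (most significant bit first).
values <- as.vector(2^((m - 1):0) %*% blocks)

# Count how often each of the 2^m possible values occurs.
counts <- table(factor(values, levels = 0:(2^m - 1)))

# The same blocks via embed(): reverse the columns, then keep every m-th window.
win <- embed(bits, m)[, m:1]
blocks_from_embed <- win[seq(1, nrow(win), by = m), ]
```

Both routes are vectorised, so they should scale far better than element-by-element copying, though that has not been timed here.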

### Re: [R] Character position command

```See ?regexpr to get the position; however, using sub we could remove
the dot and everything after it in one go.  See ?regexp and ?sub .
Also there are some links to info on regular expressions in the Links

n <- regexpr(".", "apples.pears", fixed = TRUE)
substr("apples.pears", 1, n-1)
[1] "apples"

sub("[.].*", "", "apples.pears")
[1] "apples"

On 8/25/07, Mitchell Hoffman [EMAIL PROTECTED] wrote:
This is a very simple question, so I apologize I couldn't find it online:

I want to shorten the string 'apples.pears' to 'apples'.

string='apples.pears'
string1=substr(string,0,x)

For x above, I would like to have a command like charAt(string,"."), i.e.
the position of the period in the word, but I can't seem to find a charAt
command in R.

Thank you.

[[alternative HTML version deleted]]

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
and provide commented, minimal, self-contained, reproducible code.

```
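
The two approaches behave differently when a string contains no dot, which may matter on real data. A small sketch on a made-up vector `s`: `regexpr()` returns -1 for no match, so the `substr()` route needs a guard, while `sub()` simply leaves such strings unchanged.

```r
s <- c("apples.pears", "plums", "a.b.c")

# Position of the first literal dot, or -1 if there is none.
pos <- as.integer(regexpr(".", s, fixed = TRUE))

# substr() route, guarding against the no-match case.
via_substr <- ifelse(pos > 0, substr(s, 1, pos - 1), s)

# sub() route: drop the first dot and everything after it.
via_sub <- sub("[.].*", "", s)
```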

### Re: [R] How to shade vertical bands in a graph?

```There is an example using classic graphics here:

http://www.mayin.org/ajayshah/KB/R/html/g5.html

and one using lattice graphics here:

library(zoo)
?xyplot.zoo

On 8/23/07, del pes [EMAIL PROTECTED] wrote:

Hello,

I would like to draw vertical yellow bands in my graph, but could not find
how to do that in the documentation.

I set up a page to show what I would like to achieve:
http://rstudent.blogg.de/eintrag.php?id=1 (the first picture was manually
colored with the Gimp).

Any help would be welcome...

All the best,

Delfina
_
[[replacing trailing spam]]

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
and provide commented, minimal, self-contained, reproducible code.

```
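
For readers who cannot reach the linked example, here is one way it might be done in classic graphics, under the assumption that the bands are known x-intervals: set up an empty plot, draw the bands with `rect()` over the full height of the plot region, then draw the data on top. The data and band limits below are invented.

```r
set.seed(1)
x <- 1:100
y <- cumsum(rnorm(100))

# Hypothetical band limits on the x axis.
band_from <- c(20, 60)
band_to   <- c(30, 75)

plot(x, y, type = "n")                   # axes only, no data yet
usr <- par("usr")                        # c(x1, x2, y1, y2) of the plot region
rect(xleft = band_from, ybottom = usr[3],
     xright = band_to,  ytop = usr[4],
     col = "yellow", border = NA)
lines(x, y)                              # data drawn over the bands
box()                                    # redraw the frame covered by the bands
```

Drawing the bands before the data keeps the line visible where the two overlap.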

### Re: [R] It is possible to use a Shell command inside a R script?

```What OS was that on?

On 8/24/07, Alberto Monteiro [EMAIL PROTECTED] wrote:
Ronaldo Reis Junior wrote:

Is it possible to use a shell command inside an R script?

I'm writing an R script and would like to put some shell commands inside
it. Something like: convert fig01.png fig01.xpm or sed ..., etc.

Is it possible? How?

?system

BTW, I found that using things directly in R is _much_
slower than creating a batch file and then running it.

For example, I had a directory with misnamed mp3 files,
and I wanted to use R to rename and copy them
to another directory. I tried to use file.copy, but it
took too much time. Writing a batch file and then running
it was much faster.

Alberto Monteiro

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
and provide commented, minimal, self-contained, reproducible code.

```
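
A minimal sketch of `system()` as suggested above, assuming a Unix-like system where commands are run through a shell. On Windows, `shell()` would be the closer equivalent for shell built-ins such as `echo`. The `convert` call is shown only as a comment because it depends on ImageMagick and on a file that may not exist.

```r
# Capture a command's standard output as a character vector.
out <- system("echo hello from the shell", intern = TRUE)

# Run a command for its side effect; the return value is the exit status.
status <- system("exit 0")

# The poster's example would be run the same way:
# system("convert fig01.png fig01.xpm")
```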

### Re: [R] It is possible to use a Shell command inside a R script?

```On 8/24/07, Dirk Eddelbuettel [EMAIL PROTECTED] wrote:
On Fri, Aug 24, 2007 at 08:32:00AM -0400, Duncan Murdoch wrote:
On 8/24/2007 6:58 AM, Ronaldo Reis Junior wrote:
Hi,

Is it possible to use a shell command inside an R script?

I'm writing an R script and would like to put some shell commands inside it.
Something like: convert fig01.png fig01.xpm or sed ..., etc.

The details and available functions depend on the platform, but you want
to look at ?system, ?shell, and/or ?shell.exec.  (These all exist in
Windows; on Unix-alikes, you probably won't have the latter two.)

Don't forget pipes.

R's ability to consistently work on connections that may be local
files, remote files, program output, ... is a true treasure (and
thanks and credits to, I believe, Brian Ripley for making it so).

Eg you can do this

awk '/tar.gz/ {print $3, $4}'), header=FALSE, col.names=c("file", "date"))

to get files and dates of files on CRAN.

As I recall, this also works on that other operating system, provided
you do all the legwork of installing other tools, setting PATHs etc
to provide what works out of the box on the supposedly unfriendlier OS.

Or commonly we can just do it entirely within R.  In the example discussed
we read in the lines, grep out the tar.gz lines, split each line into
fields and
select the desired columns, delete the junk and reformat it all into a
data frame:

tar.gz.Lines <- grep("tar.gz", Lines, value = TRUE)
raw.fields <- do.call(rbind, strsplit(tar.gz.Lines, "</td>"))[, 2:3]
mat <- apply(raw.fields, 2, gsub, pattern = "</a>|.*\">| *$", replacement = "")
DF <- data.frame(file = mat[,1],
                 date = strptime(mat[,2], "%d-%b-%Y %H:%M"),
                 stringsAsFactors = FALSE)
                             file                date
2                  AIS_1.0.tar.gz 2007-07-31 16:38:00
3             AMORE_0.2-10.tar.gz 2007-04-11 10:17:00
4               ARES_1.2-2.tar.gz 2007-03-19 20:53:00
5 AcceptanceSampling_0.1-1.tar.gz 2007-07-07 20:46:00

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
and provide commented, minimal, self-contained, reproducible code.

```
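
To illustrate the `pipe()` idea without depending on a network connection or a text browser, the sketch below reads the output of a local command as if it were a file. It assumes a Unix-like shell with `printf`; the two lines of data are copied from the listing in the message, and the command itself is a stand-in for the real pipeline.

```r
# A pipe connection: read.table() consumes the command's output like a file.
con <- pipe("printf 'AIS_1.0.tar.gz 31-Jul-2007\\nAMORE_0.2-10.tar.gz 11-Apr-2007\\n'")
cran <- read.table(con, header = FALSE,
                   col.names = c("file", "date"),
                   stringsAsFactors = FALSE)
```

`read.table()` opens and closes the connection itself, so no explicit `close()` is needed here.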

### Re: [R] It is possible to use a Shell command inside a R script?

```On 8/24/07, Dirk Eddelbuettel [EMAIL PROTECTED] wrote:
On Fri, Aug 24, 2007 at 10:57:46AM -0400, Duncan Murdoch wrote:
On 8/24/2007 10:33 AM, Dirk Eddelbuettel wrote:
On Fri, Aug 24, 2007 at 08:32:00AM -0400, Duncan Murdoch wrote:
On 8/24/2007 6:58 AM, Ronaldo Reis Junior wrote:
Hi,
Is it possible to use a shell command inside an R script?
I'm writing an R script and would like to put some shell commands inside
it.  Something like: convert fig01.png fig01.xpm or sed ..., etc.
The details and available functions depend on the platform, but you want
to look at ?system, ?shell, and/or ?shell.exec.  (These all exist in
Windows; on Unix-alikes, you probably won't have the latter two.)
Don't forget pipes. R's ability to consistently work on connections that
may be local
files, remotes files, program output, ... is a true treasure (and
thanks and credits to, I believe, Brian Ripley to make it so).
http://cran.r-project.org/src/contrib/ | awk '/tar.gz/ {print $3, $4}'),
to get files and dates of files on CRAN.   As I recall, this also works on
that other operating system, provided
you do all the legwork of installing other tools, setting PATHs etc
to provide what works out of the box on the supposedly unfriendlier OS.

The pipe command you list doesn't work in Windows.  I'd guess this is
because the pipe syntax | within the command is unsupported:  it tries to
execute links, with the rest of the line passed as arguments.  But I
haven't traced through to check on this.

Hm, wishful thinking must have gotten the better of me then. Sorry for

This works for me on Windows:

http://cran.r-project.org/src/contrib/ | findstr tar.gz), as.is = TRUE)
                               V3          V4    V5
2                  AIS_1.0.tar.gz 31-Jul-2007 16:38
3             AMORE_0.2-10.tar.gz 11-Apr-2007 10:17
4               ARES_1.2-2.tar.gz 19-Mar-2007 20:53
5 AcceptanceSampling_0.1-1.tar.gz 07-Jul-2007 20:46

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
and provide commented, minimal, self-contained, reproducible code.

```

### Re: [R] It is possible to use a Shell command inside a R script?

```On 8/24/07, Duncan Murdoch [EMAIL PROTECTED] wrote:
On 8/24/2007 1:05 PM, Gabor Grothendieck wrote:
On 8/24/07, Dirk Eddelbuettel [EMAIL PROTECTED] wrote:
On Fri, Aug 24, 2007 at 10:57:46AM -0400, Duncan Murdoch wrote:
On 8/24/2007 10:33 AM, Dirk Eddelbuettel wrote:
On Fri, Aug 24, 2007 at 08:32:00AM -0400, Duncan Murdoch wrote:
On 8/24/2007 6:58 AM, Ronaldo Reis Junior wrote:
Hi,
Is it possible to use a shell command inside an R script?
I'm writing an R script and would like to put some shell commands inside
it.  Something like: convert fig01.png fig01.xpm or sed ..., etc.
The details and available functions depend on the platform, but you
want
to look at ?system, ?shell, and/or ?shell.exec.  (These all exist in
Windows; on Unix-alikes, you probably won't have the latter two.)
Don't forget pipes. R's ability to consistently work on connections that
may be local
files, remotes files, program output, ... is a true treasure (and
thanks and credits to, I believe, Brian Ripley to make it so).
http://cran.r-project.org/src/contrib/ | awk '/tar.gz/ {print $3,
$4}'),
to get files and dates of files on CRAN.   As I recall, this also works
on
that other operating system, provided
you do all the legwork of installing other tools, setting PATHs etc
to provide what works out of the box on the supposedly unfriendlier OS.

The pipe command you list doesn't work in Windows.  I'd guess this is
because the pipe syntax | within the command is unsupported:  it tries
to
execute links, with the rest of the line passed as arguments.  But I
haven't traced through to check on this.

Hm, wishful thinking must have gotten the better of me then. Sorry for

This works for me on Windows:

http://cran.r-project.org/src/contrib/ | findstr tar.gz), as.is = TRUE)

Which R version is that?  It doesn't work for me in Rgui, though it does
in Rterm, both R-devel versions.

I am using Rgui

R.version.string
[1] R version 2.5.1 (2007-06-27)

on Windows XP.  lynx --version gives:

Lynx Version 2.8.5rel.1 (04 Feb 2004)
libwww-FM 2.14FM, SSL-MM 1.4.1, OpenSSL 0.9.7d-dev
Compiled by Borland C++ (Feb  5 2004 17:35:58).

Copyrights held by the University of Kansas, CERN, and other contributors.

See http://www.moxienet.com/lynx/ for information about SSL for Lynx.
See http://www.openssl.org/ for information about OpenSSL.

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
and provide commented, minimal, self-contained, reproducible code.

```

### Re: [R] Turning a logical vector into its indices without losing its length

```On 8/24/07, Gabor Grothendieck [EMAIL PROTECTED] wrote:
Here are two solutions:

logvec <- c(TRUE,FALSE,FALSE,TRUE,FALSE,FALSE,TRUE,FALSE)

ifelse(logvec, seq_along(logvec), 0)
[1] 1 0 0 4 0 0 7 0

replace(logvec * 0, logvec, which(logvec))
[1] 1 0 0 4 0 0 7 0

Actually the * 0 is not needed.  The last one could simply be:

replace(logvec, logvec, which(logvec))

On 8/24/07, Leeds, Mark (IED) [EMAIL PROTECTED] wrote:
I have the code below which gives me what I want for temp based on
logvec but I was wondering if there was a shorter way ( i.e :
a one liner ) without having to initialize temp to zeros.  This is
purely for learning purposes. Thanks.

logvec <- c(TRUE,FALSE,FALSE,TRUE,FALSE,FALSE,TRUE,FALSE)

temp <- numeric(length(logvec))
temp[logvec] <- which(logvec)
temp

[1] 1 0 0 4 0 0 7 0

obviously, the code below doesn't work.

temp <- which(logvec)
temp
[1] 1 4 7

This is not an offer (or solicitation of an offer) to buy/se...{{dropped}}

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
and provide commented, minimal, self-contained, reproducible code.

```

### Re: [R] Turning a logical vector into its indices without losing its length

```Here are two solutions:

logvec <- c(TRUE,FALSE,FALSE,TRUE,FALSE,FALSE,TRUE,FALSE)

ifelse(logvec, seq_along(logvec), 0)
[1] 1 0 0 4 0 0 7 0

replace(logvec * 0, logvec, which(logvec))
[1] 1 0 0 4 0 0 7 0

On 8/24/07, Leeds, Mark (IED) [EMAIL PROTECTED] wrote:
I have the code below which gives me what I want for temp based on
logvec but I was wondering if there was a shorter way ( i.e :
a one liner ) without having to initialize temp to zeros.  This is
purely for learning purposes. Thanks.

logvec <- c(TRUE,FALSE,FALSE,TRUE,FALSE,FALSE,TRUE,FALSE)

temp <- numeric(length(logvec))
temp[logvec] <- which(logvec)
temp

[1] 1 0 0 4 0 0 7 0

obviously, the code below doesn't work.

temp <- which(logvec)
temp
[1] 1 4 7

This is not an offer (or solicitation of an offer) to buy/se...{{dropped}}

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
and provide commented, minimal, self-contained, reproducible code.

```

### Re: [R] Turning a logical vector into its indices without losing its length

```On 8/24/07, Gabor Grothendieck [EMAIL PROTECTED] wrote:
On 8/24/07, Gabor Grothendieck [EMAIL PROTECTED] wrote:
Here are two solutions:

logvec <- c(TRUE,FALSE,FALSE,TRUE,FALSE,FALSE,TRUE,FALSE)

ifelse(logvec, seq_along(logvec), 0)
[1] 1 0 0 4 0 0 7 0

replace(logvec * 0, logvec, which(logvec))
[1] 1 0 0 4 0 0 7 0

Actually the * 0 is not needed.  The last one could simply be:

replace(logvec, logvec, which(logvec))

If logvec can have NAs then this solution would not work but could
be modified to be done like this:

replace(logvec, which(logvec), which(logvec))

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
and provide commented, minimal, self-contained, reproducible code.

```
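
The variants from this thread side by side, including the NA case mentioned last. The vector `with_na` is made up for illustration. The first three give the same values, although `ifelse()` and the `* 0` form return doubles while the plain `replace()` form returns integers.

```r
logvec <- c(TRUE, FALSE, FALSE, TRUE, FALSE, FALSE, TRUE, FALSE)

v1 <- ifelse(logvec, seq_along(logvec), 0)
v2 <- replace(logvec * 0, logvec, which(logvec))
v3 <- replace(logvec, logvec, which(logvec))   # logical coerced to integer

# With NAs, a logical index fails in the assignment, so index by position.
with_na <- c(TRUE, NA, FALSE, TRUE)
v4 <- replace(with_na, which(with_na), which(with_na))
```

In `v4` the NA stays NA, FALSE becomes 0, and each TRUE is replaced by its position.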

### Re: [R] Splitting strings

```This applies the indicated perl-style regular expression where the
first backreference (\\D+) is the non-digits and the second
backreference (\\d+) is the digits.

The two backreferences, but not the entire matched pattern itself,
are passed as arguments x and y to the function whose body is the
right hand side of the formula in the third argument.

That is then simplified using rbind to give the result.

library(gsubfn)
strapply(surgery, "(\\D+)(\\d+)", ~ list(lets = x, nums = as.numeric(y)),
backref = -2, perl = TRUE, simplify = rbind)

More on gsubfn at

On 8/23/07, Gary Collins [EMAIL PROTECTED] wrote:
I'm having a Thursday morning mental block, any suggestions on the following
would be most appreciated...

I have (as an example)

surgery = c("d48",  "d67",  "dnc37",  "a75",  "d10",  "a78",  "d31",
"d55",  "d1")

before each number part the possibilities are c("a", "d", "dnc"), I'm trying
to split each element in surgery so that I have,

status time
d      48
d      67
dnc    37
a      75
d      10
a      78
d      31
d      55
d      1

I've tried various strsplit approaches but nothing has done what I need.

Gary

[[alternative HTML version deleted]]

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
and provide commented, minimal, self-contained, reproducible code.

```
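
If installing gsubfn is not an option, the same split can be sketched in base R with two `sub()` calls, assuming every element is letters followed by digits as in the example. This is an alternative to the `strapply()` answer, not a restatement of it; the name `res` is made up here.

```r
surgery <- c("d48", "d67", "dnc37", "a75", "d10", "a78", "d31", "d55", "d1")

res <- data.frame(
  status = sub("[0-9]+$", "", surgery),               # strip the trailing digits
  time   = as.numeric(sub("^[^0-9]+", "", surgery)),  # strip the leading letters
  stringsAsFactors = FALSE
)
```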

### Re: [R] FAQ 7.x when 7 does not exist. Useability question

```Note that googling

R FAQ 7.10

will get it on the first hit.

On 8/23/07, John Kane [EMAIL PROTECTED] wrote:
The FAQ Section 7 is a very useful place for new users
to find out any number of R idiosyncrasies.  However
there is no numbering on the FAQ Table of Content or
on the Sections Tables of Contents.

a question about converting a factor to numeric is  a
bit cryptic. The only time 7.10 appears is after the
searcher has found the entry.

Would it be a good idea to actually number the entries
Contents for the Sections?

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
and provide commented, minimal, self-contained, reproducible code.

```

### Re: [R] extracting duplicated elements

```Try:

lapply(as.data.frame(t(DF)), function(x) unique(x[duplicated(x) & x > 0]))

On 8/23/07, dxc13 [EMAIL PROTECTED] wrote:

Can anyone help me solve this problem...thanks!

Consider a data frame, namely v, as such:
v
   X1 X2 X3 X4 X5 X1 X2 X3 X4 X5
x1  1  2 -1 -1 -1  1  2 -1 -1 -1
y1  1  2 -1 -1 -1  1  2  3 -1 -1

What I would like to do is to create an array or data frame with only the
elements that appear in the data frame more than once and are >= 0.

I try this...
v[v>=0]
[1] 1 1 2 2 1 1 2 2 3

which returns all >= 0 elements, but they are not in their respective rows
from the original data frame.  I have tried using the duplicated()
function and can't seem to get it to work correctly.

Essentially, the outcome I am trying to get is a df or array looking like:

step 1...achieve this out of original df
[1] 1 2 1 2
[2] 1 2 1 2 3

(the blank element in row 1, position 5 can be just be NA)

step 2...take the above and get this...only the duplicated elements
[1] 1 2
[2] 1 2

--
View this message in context:
http://www.nabble.com/extracting-duplicated-elements-tf4318034.html#a12295213
Sent from the R help mailing list archive at Nabble.com.

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
and provide commented, minimal, self-contained, reproducible code.

```
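
A runnable version of the one-liner on the poster's data, for checking. The data frame is rebuilt here with `rbind()`, so the columns are named X1 to X10 rather than repeating X1 to X5. Note that the answer tests `x > 0` while the question asks for `>= 0`; the two agree on this data because it contains no zeros.

```r
v <- data.frame(rbind(
  x1 = c(1, 2, -1, -1, -1, 1, 2, -1, -1, -1),
  y1 = c(1, 2, -1, -1, -1, 1, 2,  3, -1, -1)
))

# Transpose so that each original row becomes a column, then keep the
# positive values that occur more than once within that row.
res <- lapply(as.data.frame(t(v)),
              function(x) unique(x[duplicated(x) & x > 0]))
```

The result is a list with one vector per original row, which also copes with rows that have different numbers of duplicated values.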

### Re: [R] read big text file into R

```Another option is to read it into a database and from there into R.
RSQLite has the capability of reading certain text files directly into
an SQLite database without going through R and from there one
can read it into R.   You can use RSQLite to do that.  Alternately this
post describes how the devel version of the sqldf package can do it:

http://www.nabble.com/Re%3A-Memory-Experimentation%3A-Rule-of-Thumb-%3D-10-15-Times-the-Memory-p12078165.html

On 8/23/07, Yupu Liang [EMAIL PROTECTED] wrote:
Dear Rs:

Hi, I am trying to read a big text file (nrows=243440, ncols=144). It
seems the computational time of all the read methods
want to read in: things became really slow once I tried to read in
10 lines compare to 1 lines).

If I am reading the profiling result right, I guess scan wouldn't
help either.

My questions are :
1) Is this a memory issue?
2) How to get around this?: I can't just sit around for 15 mins.
Would writing a C function help?

Thanks!

Here is the profiling I did:

Rprof()
Rprof(NULL)
summaryRprof()
$by.self
             self.time self.pct total.time total.pct
scan              3.56     85.2       3.56      85.2
type.convert      0.48     11.5       0.48      11.5
make.names        0.02      0.5       0.02       0.5
options           0.02      0.5       0.02       0.5
file              0.00      0.0       0.02       0.5
getOption         0.00      0.0       0.02       0.5

$by.total
             total.time total.pct self.time self.pct
scan               3.56      85.2      3.56     85.2
type.convert       0.48      11.5      0.48     11.5
make.names         0.02       0.5      0.02      0.5
options            0.02       0.5      0.02      0.5
file               0.02       0.5      0.00      0.0
getOption          0.02       0.5      0.00      0.0

$sampling.time
[1] 4.18

?Rprof()
Rprof()
Rprof(NULL)
summaryRprof()
$by.self
               self.time self.pct total.time total.pct
scan              143.12     92.7     143.12      92.7
type.convert        9.52      6.2       9.52       6.2
paste               0.02      0.0       0.08       0.1
textConnection      0.02      0.0       0.04       0.0
.deparseOpts        0.02      0.0       0.02       0.0
file                0.02      0.0       0.02       0.0
make.names          0.02      0.0       0.02       0.0
print.default       0.02      0.0       0.02       0.0
doTryCatch          0.00      0.0       0.08       0.1
gsub                0.00      0.0       0.08       0.1
try                 0.00      0.0       0.08       0.1
tryCatch            0.00      0.0       0.08       0.1
tryCatchList        0.00      0.0       0.08       0.1
tryCatchOne         0.00      0.0       0.08       0.1
capture.output      0.00      0.0       0.06       0.0
deparse             0.00      0.0       0.02       0.0
eval.with.vis       0.00      0.0       0.02       0.0
evalVis             0.00      0.0       0.02       0.0
print               0.00      0.0       0.02       0.0

$by.total
               total.time total.pct self.time self.pct
scan               143.12      92.7    143.12     92.7
type.convert         9.52       6.2      9.52      6.2
paste                0.08       0.1      0.02      0.0
doTryCatch           0.08       0.1      0.00      0.0
gsub                 0.08       0.1      0.00      0.0
try                  0.08       0.1      0.00      0.0
tryCatch             0.08       0.1      0.00      0.0
tryCatchList         0.08       0.1      0.00      0.0
tryCatchOne          0.08       0.1      0.00      0.0
capture.output       0.06       0.0      0.00      0.0
textConnection       0.04       0.0      0.02      0.0
.deparseOpts         0.02       0.0      0.02      0.0
file                 0.02       0.0      0.02      0.0
make.names           0.02       0.0      0.02      0.0
print.default        0.02       0.0      0.02      0.0
deparse              0.02       0.0      0.00      0.0
eval.with.vis        0.02       0.0      0.00      0.0
evalVis              0.02       0.0      0.00      0.0
print                0.02       0.0      0.00      0.0

$sampling.time
[1] 154.36

I am using R 2.5.1 for mac on a Dual 2 ```
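
Separately from the database route suggested in the reply, `read.table()` itself can often be helped by telling it the column types and row count up front, since the profile shows most of the time going to `scan` and `type.convert`. The sketch below is not from the thread and is not benchmarked; it only shows the arguments on a small generated file, which stands in for the poster's 243440 x 144 data.

```r
# Write a small all-numeric test file (a stand-in for the real data).
tmp <- tempfile(fileext = ".txt")
set.seed(1)
write.table(matrix(round(rnorm(50 * 4), 3), 50, 4), tmp,
            row.names = FALSE, col.names = TRUE, quote = FALSE)

# Declare types and size so read.table() does not have to guess them.
dat <- read.table(tmp, header = TRUE,
                  colClasses = "numeric",  # recycled over all columns
                  nrows = 50,              # known (or slightly overestimated) row count
                  comment.char = "")       # skip scanning for comments

unlink(tmp)
```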

### Re: [R] uneven list to matrix

```Here are two solutions.  The first repeatedly uses merge and the
second creates a zoo object from each alph component whose time
index consists of the row labels and uses zoo's multiway merge to
merge them.

# test data
m <- matrix(1:5, 5, dimnames = list(LETTERS[1:5], NULL))
alph <- list(m[1:4,,drop=FALSE], m[c(1,3,4),,drop=FALSE], m[c(1,4,5),,drop=FALSE])
alph

# solution 1
out <- alph[[1]]
for(i in 2:length(alph)) {
   out <- merge(out, alph[[i]], by = 0, all = TRUE)
   row.names(out) <- out[[1]]
   out <- out[-1]
}
matrix(as.matrix(out), nrow(out), dimnames=list(rownames(out),NULL))

# solution 2
library(zoo)
z <- do.call(merge, lapply(alph, function(x) zoo(c(x), rownames(x))))
matrix(coredata(z), nrow(z), dimnames=list(time(z),NULL))

On 8/23/07, Christopher Marcum [EMAIL PROTECTED] wrote:
Hello,

I am sure I am not the only person with this problem.

I have a list with n elements, each consisting of a single column matrix
with different row lengths. Each row has a name ranging from A to E. Here
is an example:

alph[[1]]
A 1
B 2
C 3
D 4

alph[[2]]
A 1
C 3
D 4

alph[[3]]
A 1
D 4
E 5

I would like to create a matrix from the elements in the list with n
columns such that the row names are preserved and NAs are inserted into
the cells where the uneven lists do not match up based on their row names.
Here is an example of the desired output:

newmatrix
  [,1] [,2] [,3]
A    1    1    1
B    2   NA   NA
C    3    3   NA
D    4    4    4
E   NA   NA    5

Any suggestions?
I have tried
do.call(cbind,list)
I also thought I was on the right track when I tried converting each
element into a vector and then running this loop (which ultimately
failed):

newmat<-matrix(NA,ncol=3,nrow=5)
colnames(newmatrix)<-c(A:E)
for(j in 1:3){
for(i in 1:5){
for(k in 1:length(list[[i]])){
if(is.na(match(colnames(newmatrix),names(alph[[i]])))[j]==TRUE){
newmatrix[i,j]<-NA}
else newmatrix[i,j]<-alph[[i]][k]}}}

Thanks,
Chris
UCI Sociology

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
and provide commented, minimal, self-contained, reproducible code.

```
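For comparison, here is a minimal base R sketch of the same alignment done by row-name lookup instead of `merge`. It is not the method from the reply above, just one possible alternative; the name `all_rows` is made up for illustration.

```r
# Same test data as in the reply above
m <- matrix(1:5, 5, dimnames = list(LETTERS[1:5], NULL))
alph <- list(m[1:4, , drop = FALSE],
             m[c(1, 3, 4), , drop = FALSE],
             m[c(1, 4, 5), , drop = FALSE])

# Union of all row names, sorted (all_rows is an illustrative name)
all_rows <- sort(unique(unlist(lapply(alph, rownames))))

# For each list element, look up every row name; unmatched names give NA
out <- sapply(alph, function(x) unname(x[match(all_rows, rownames(x)), 1]))
rownames(out) <- all_rows
out
```

Because every column is built against the same `all_rows` vector, the row order is fixed up front and does not depend on how `merge` sorts its result.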

### Re: [R] uneven list to matrix

```On 8/24/07, Christopher Marcum [EMAIL PROTECTED] wrote:
Hi Gabor,

Thank you. The native solution works just fine, though there is an
interesting side effect, namely, that with very large lists the rows of
the output become scrambled though the corresponding columns are correctly
sorted. The zoo package solution does not work on large lists: there is an
error:

Error in order(na.last, decreasing, ...) :
argument 1 is not a vector

They both work on the example data.  Please provide reproducible

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

```

### Re: [R] uneven list to matrix

```OK.  One other thought. The R merge command has a sort= argument
that you can try out.  See ?merge

On 8/24/07, Christopher Marcum [EMAIL PROTECTED] wrote:
Hi Gabor,

My apologies. Both solutions work just fine on large lists (n=1000,
n[[i]]=500). A memory problem on my machine caused the error and
fail-to-sort. Thank you!

PS - The zoo method is slightly faster.

Best,
Chris

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

```

### Re: [R] Optimization problem

```Try this.

1. following Ben remove the Randalstown point and reset the levels of the
Location factor.

2. then replace solve with ginv so it uses the generalized inverse to calculate
the hessian:

alan2 <- subset(alan, subset = Location != "Randalstown")
alan2$Location <- factor(as.character(alan2$Location))

library(MASS)
solve <- ginv

zinb.zc <- zicounts(resp=Scars~.,x =~Location + Lar + Mass + Lar:Mass
+ Location:Mass,z =~Location + Lar + Mass + Lar:Mass + Location:Mass,
data = alan2)

rm(solve)

On 8/21/07, Ben Bolker [EMAIL PROTECTED] wrote:

(Hope this gets threaded properly.  Sorry if it doesn't.)

Gabor: Lac and Lacfac being the same is irrelevant, wouldn't
produce NAs (but would produce something like a singular Hessian
and maybe other problems) -- but they're not even specified in this
model.

The bottom line is that you have a location with a single
observation, so the GLM that zicounts runs to get the initial
parameter values has an unestimable location:mass interaction
for one location, so it gives an NA, so optim complains.

In gruesome detail:

## set up  data
library(zicounts)
## try to run model
zinb.zc <- zicounts(resp=Scars~.,
x =~Location + Lar + Mass + Lar:Mass + Location:Mass,
z =~Location + Lar + Mass + Lar:Mass + Location:Mass,
data=scardat)
## tried to debug this by dumping zicounts.R to a file, modifying
## it to put a trace argument in that would print out the parameters
## and log-likelihood for every call to the log-likelihood function.
dump("zicounts",file="zicounts.R")
source("zicounts.R")
zinb.zc <- zicounts(resp=Scars~.,
x =~Location + Lar + Mass + Lar:Mass + Location:Mass,
z =~Location + Lar + Mass + Lar:Mass + Location:Mass,
data=scardat,trace=TRUE)
## this actually didn't do any good because the negative log-likelihood
## function never gets called -- as it turns out optim() barfs when it
## gets its initial values, before it ever gets to evaluating the
## log-likelihood

## check the glm -- this is the equivalent of what zicounts does to
## get the initial values of the x parameters
p1 <- glm(Scars~Location + Lar + Mass + Lar:Mass + Location:Mass,
data=scardat,family=poisson)
which(is.na(coef(p1)))

## find out what the deal is
table(scardat$Location)

scar2 = subset(scardat,Location!="Randalstown")
## first step to removing the bad point from the data set -- but ...
table(scar2$Location)
## it leaves the Location factor with the same levels, so
##  now we have ZERO counts for one location:
## redefine the factor to drop unused levels
scar2$Location <- factor(scar2$Location)
## OK, looks fine now
table(scar2$Location)

zinb.zc <- zicounts(resp=Scars~.,
x =~Location + Lar + Mass + Lar:Mass + Location:Mass,
z =~Location + Lar + Mass + Lar:Mass + Location:Mass,
data=scar2)
## now we get another error (system is computationally singular when
## trying to compute Hessian -- overparameterized?)   Not in any
## trivial way that I can see.  It would be nice to get into the guts
## of zicounts and stop it from trying to invert the Hessian, which is
## I think where this happens.

but you started it ...)

Looking at the data in a few different ways:

library(lattice)
xyplot(Scars~Mass,groups=Location,data=scar2,jitter=TRUE,
auto.key=list(columns=3))
xyplot(Scars~Mass|Location,data=scar2,jitter=TRUE)

xyplot(Scars~Lar,groups=Location,data=scar2,
auto.key=list(columns=3))
xyplot(Scars~Mass|Lar,data=scar2)
xyplot(Scars~Lar|Location,data=scar2)

Some thoughts: (1) I'm not at all sure that
zero-inflation is necessary (see Warton 2005, Environmetrics).
This is a fairly small, noisy data set without huge numbers
of zeros -- a plain old negative binomial might be fine.

I don't actually see a lot of signal here, period (although there may
be some) ... there's not a huge range in Lar (whatever it is -- the rest
of the covariates I think I can interpret).  It would be tempting to try
to fit location as a random effect, because fitting all those extra
degrees of freedom is going to kill you.
On the other hand, GLMMs are a bit hairy.

cheers
Ben

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
and provide commented, minimal, self-contained, reproducible code.

```
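The factor-level point in the message above is easy to miss, so here is a small base R sketch of it. The data frame below is toy data invented for illustration (only the `Location` column matters); it is not the poster's data set.

```r
# Toy data (hypothetical values); one location has a single observation
scardat <- data.frame(
  Location = factor(c("Caledon", "Caledon", "Somerset", "Randalstown")),
  Scars    = c(0, 4, 5, 0)
)

# Dropping the rows does not drop the factor level
scar2 <- subset(scardat, Location != "Randalstown")
levels_before <- levels(scar2$Location)

# Re-creating the factor keeps only the levels that still occur
scar2$Location <- factor(scar2$Location)
levels_after <- levels(scar2$Location)

table(scar2$Location)
```

A model formula with `Location` would otherwise still build a column for the empty level, which is one way to end up with NA coefficients.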

### Re: [R] Evaluating f(x(2,3)) on a function f <- function(a,b){a+b}

```Try this:

do.call(f, as.list(x))

On 8/22/07, Søren Højsgaard [EMAIL PROTECTED] wrote:
Dear list
I have a function and a vector, say
f <- function(a,b){a+b}
x <- c(2,3)
I want to evaluate f on x in the sense of computing f(x[1],x[2]). I would
like it to be so that I can write f(x). (I know I can write a wrapper
function g <- function(x){f(x[1],x[2])}, but this is not really what I am
looking for). Is there a general way doing this (programmatically)? (E.g. by
unpacking the elements of x and putting them in the right places when
calling f...)
I've looked under formals, alist etc. but so far without luck.

Regards
Søren

[[alternative HTML version deleted]]

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
and provide commented, minimal, self-contained, reproducible code.

```
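A short sketch of the `do.call` suggestion applied to the poster's own `f` and `x`. The named-vector variant at the end is an extra illustration, not something from the thread.

```r
f <- function(a, b) { a + b }
x <- c(2, 3)

# as.list(x) turns the vector into list(2, 3); do.call passes the
# elements as separate positional arguments, i.e. f(2, 3)
res <- do.call(f, as.list(x))
res

# If the vector has names, the elements are matched to arguments by name
res_named <- do.call(f, as.list(c(b = 10, a = 1)))
res_named
```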

### Re: [R] Subsetting zoo object with a vector of time values.

```See ?window.zoo
e.g.

library(zoo)

# create test data
tt <- c(-50, -49.996, -49.995, -49.96, -49.956, -49.955, -49.92, -49.916,
-49.915, -49.88)
z <- zoo(seq_along(tt), tt)

window(z, c(-50, -49.96, -49.92, -49.88))

On 8/21/07, Todd Remund [EMAIL PROTECTED] wrote:
I have a zoo object for which I would like to subset using a vector of time
values.  For example, I have the following time values represented in my zoo
object.

-50.000 -49.996 -49.995 -49.960 -49.956 -49.955 -49.920 -49.916 -49.915
-49.880

and would like to get observations corresponding to times

-50 -49.96 -49.92 -49.88.

What can I do without using the lapply or which functions?

Thank you.

Todd Remund

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
and provide commented, minimal, self-contained, reproducible code.

```

### Re: [R] extracting month from date in numeric form

```On 8/21/07, Gonçalo Ferraz [EMAIL PROTECTED] wrote:
Hi,
Anyone knows what would be a short way of extracting a month from a date in
numeric or integer format?

months(as.Date("1979-12-20"))
returns
"December" in character format.

How could I get 12 in numeric or integer format?

Here are a few solutions:

format(as.Date("1979-12-20"), "%m")

as.POSIXlt(as.Date("1979-12-20"))$mon + 1

as.numeric(substring("1979-12-20", 6, 7))

as.numeric(factor(months(as.Date("1979-12-20"), abbreviate = TRUE), levels
= month.abb))

See R News 4/1 Help Desk article for more on dates.

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
and provide commented, minimal, self-contained, reproducible code.

```
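Two of the solutions above, written out as a small sketch that ends in an integer. Note that `format()` on its own returns the character string "12", so it is wrapped in `as.integer()` here; the variable names are only for illustration.

```r
d <- as.Date("1979-12-20")

# format() gives a character string; convert it to get a number
m_format <- as.integer(format(d, "%m"))

# POSIXlt stores months as 0-11, hence the + 1
m_posix <- as.POSIXlt(d)$mon + 1L

m_format
m_posix
```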

### Re: [R] Optimization problem

```Lac and Lacfac are the same.

On 8/21/07, Alan Harrison [EMAIL PROTECTED] wrote:
Hello Folks,

Very new to R so bear with me, running 5.2 on XP.  Trying to do a
zero-inflated negative binomial regression on placental scar data as
dependent.  Lactation, location, number of tick larvae present and mass of
mouse are independents.  Dataframe and attributes below:

Location Lac Scars Lar Mass Lacfac
1   Tullychurry   0 0  15 13.87  0
2  Somerset   0 0   0 15.60  0
3 Tollymore   0 0   3 16.43  0
4 Tollymore   0 0   0 16.55  0
5   Caledon   0 0   0 17.47  0
6  Hillsborough   1 5   0 18.18  1
7   Caledon   0 0   1 19.06  0
8   Portglenone   0 4   0 19.10  0
9   Portglenone   0 5   0 19.13  0
10Tollymore   0 5   3 19.50  0
11 Hillsborough   1 5   0 19.58  1
12  Portglenone   0 4   0 19.76  0
13  Caledon   0 8   0 19.97  0
14 Hillsborough   1 4   0 20.02  1
15  Tullychurry   0 3   3 20.13  0
16 Hillsborough   1 5   0 20.18  1
17   LoughNavar   1 5   0 20.20  1
18Tollymore   0 0   1 20.24  0
19 Hillsborough   1 5   0 20.48  1
20  Caledon   0 4   1 20.56  0
21  Caledon   0 3   2 20.58  0
22Tollymore   0 4   3 20.58  0
23Tollymore   0 0   2 20.88  0
24 Hillsborough   1 0   0 21.01  1
25  Portglenone   0 5   0 21.08  0
26  Tullychurry   0 2   5 21.28  0
27 Ballysallagh   1 4   0 21.59  1
28  Caledon   0 0   1 21.68  0
29 Hillsborough   1 5   0 22.09  1
30  Tullychurry   0 5   5 22.28  0
31  Tullychurry   1 6  75 22.43  1
32 Ballysallagh   1 5   0 22.57  1
33 Ballysallagh   1 4   0 22.67  1
34   LoughNavar   1 5   3 22.71  1
35 Hillsborough   1 4   0 23.01  1
36  Caledon   0 0   3 23.08  0
37   LoughNavar   1 5   0 23.53  1
38 Ballysallagh   1 4   0 23.55  1
39  Portglenone   1 6   0 23.61  1
40   Mt.Stewart   0 3   0 23.70  0
41 Somerset   0 5   0 23.83  0
42 Ballysallagh   1 5   0 23.93  1
43 Ballysallagh   1 5   0 24.01  1
44  Caledon   0 0   3 24.14  0
45   LoughNavar   0 6   0 24.30  0
46   LoughNavar   1 5   0 24.34  1
47 Hillsborough   1 4   0 24.45  1
48  Caledon   0 3   2 24.55  0
49  Tullychurry   0 5  44 24.83  0
50 Hillsborough   1 5   0 24.86  1
51 Ballysallagh   1 5   0 25.02  1
52  Tullychurry   0 0   9 25.27  0
53   Mt.Stewart   0 5   0 25.31  0
54   LoughNavar   1 4   8 25.43  1
55 Somerset   1 0   0 25.58  1
56 Hillsborough   1 5   0 25.82  1
57  Portglenone   1 2   0 26.02  1
58 Ballysallagh   1 5   0 26.19  1
59   Mt.Stewart   1 0   0 26.66  1
60  Randalstown   1 0   1 26.70  1
61 Somerset   0 4   0 27.01  0
62   Mt.Stewart   0 4   0 27.05  0
63 Somerset   0 3   0 27.10  0
64 Somerset   0 6   0 27.34  0
65 Somerset   0 0   0 27.87  0
66   LoughNavar   1 5   1 28.01  1
67  Tullychurry   1 6  42 28.55  1
68 Hillsborough   1 5   0 28.84  1
69  Portglenone   1 4   0 29.00  1
70 Somerset   1 4   0 31.87  1
71 Ballysallagh   1 5   0 33.06  1
72   LoughNavar   1 4   0 33.24  1
73 Somerset   1 4   0 33.36  1

alan : 'data.frame':	73 obs. of  6 variables:
$ Location: Factor w/ 10 levels "Ballysallagh",..: 10 8 9 9 2 3 2 6 6 9 ...
$ Lac     : int  0 0 0 0 0 1 0 0 0 0 ...
$ Scars   : int  0 0 0 0 0 5 0 4 5 5 ...
$ Lar     : int  15 0 3 0 0 0 1 0 0 3 ...
$ Mass    : num  13.9 15.6 16.4 16.6 17.5 ...
$ Lacfac  : Factor w/ 2 levels "0","1": 1 1 1 1 1 2 1 1 1 1 ...

The syntax I used to create the model is:

zinb.zc <- zicounts(resp=Scars~.,x =~Location + Lar + Mass + Lar:Mass +
Location:Mass,z =~Location + Lar + Mass + Lar:Mass + Location:Mass, data=alan)

The error given is:

Error in optim(par = parm, fn = neg.like, gr = neg.grad, hessian = TRUE,  :
non-finite value supplied by optim
fitted probabilities numerically 0 or 1 occurred in: glm.fit(zz, 1 - pmin(y,
1), family = binomial())

I understand this is a problem with the model I specified, could anyone help
out??

Many thanks

Alan Harrison

Quercus
Queen's University Belfast
Belfast

BT9 7BL

T: 02890 972219
M: 07798615682

[[alternative HTML version deleted]]

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
and provide commented, minimal, ```

### Re: [R] tackle memory insufficiency for large dataset using save() load()?

```See ?save .  The ... arguments are the ***names*** of the objects, not
the objects
so you want save("d", ...whatever...) not save(d, ...whatever...) .
Also don't use attach and detach and read this about factors which applies
if your factor has many levels but can be ignored if not:
http://www.mail-archive.com/r-help@stat.math.ethz.ch/msg92970.html

On 8/21/07, Jessica Z [EMAIL PROTECTED] wrote:
Hello List, i have been agonizing over this for days, any reply would be
greatly appreciated!

Situation:___
My original dataset is a .csv dataset (w/ 2M records) with 4 variables:
job_id (Primary key, won't be used for analysis, just used for join tables),
sector_id (categorical variable, for 19 industry sectors),
sqft (con't variable for square footage),
building_type (categorical, for 2 building types)
some values of sqft were inputted wrong, so i'd like to set sqft<1 to NA
and then use aregImpute() to impute those NAs.

Problem: the original dataset(.csv format) is too large. though i could read
that dataset into R, i could not get aregImpute() run even i set the memory
limit to 3G ! (yes, i did the switch in windows to reach 3G rather than 2G)

Goal: try to find a way to slim down my dataset so as to get aregImpute()
running.

What i did:
i searched in the archive, and found someone said, as R tends to inflate
memory, it is a good idea to first read the original dataset into R-- then
save it as a more compact binary file using save() -- and then reload the
compact binary file back into R using load(). this way would reduce the
memory allocation.

HOWEVER, after i saved my original dataset into a compact binary file using
format into R, I could not figure out how to retrieve all my variables!!! R
shows the new dataset is not a list, nor a matrix, or a dataframe, but just a
character with length 1 !!! and there is no way i could do attach().

i generated a 1K-row subset out of my original dataset to illustrate my
problem (does anyone know how to get my four variables back from this
compact binary new dataset? what did i do wrong?):

summary(data)
    job_id         sector_id           sqft        building_type
Min.   :   1.0   Min.   : 6.000   Min.   :  0.00   Min.   :1.000
1st Qu.: 250.8   1st Qu.: 6.000   1st Qu.:  3.00   1st Qu.:2.000
Median : 500.5   Median :11.000   Median :  4.00   Median :2.000
Mean   : 500.5   Mean   : 9.455   Mean   : 12.49   Mean   :1.996
3rd Qu.: 750.3   3rd Qu.:11.000   3rd Qu.:  4.00   3rd Qu.:2.000
Max.   :1000.0   Max.   :12.000   Max.   :192.00   Max.   :2.000

attach(data)
sqft[sqft<1] <- NA
sector.f <- as.factor(sector_id)
building_type.f <- as.factor (building_type)
d <- data.frame(job_id,sector.f,sqft, building_type.f)
summary (d)
    job_id        sector.f       sqft        building_type.f
Min.   :   1.0   6 :340   Min.   :  3.00   1:  4
1st Qu.: 250.8   11:505   1st Qu.:  4.00   2:996
Median : 500.5   12:155   Median :  4.00
Mean   : 500.5            Mean   : 14.16
3rd Qu.: 750.3            3rd Qu.: 17.00
Max.   :1000.0            Max.   :192.00
                          NA's   :118.00
save (d, file="compact_d.Rdata", ascii=FALSE)

summary(newdata)
Length Class  Mode
1 character character
attach(newdata)
is.data.frame (newdata)
[1] FALSE
is.list (newdata)
[1] FALSE
is.matrix (newdata)
[1] FALSE

_
btw, i also tried to just save (into compact binary) and reload (the new
compact binary data format) (as i could do the NA stuff in sql anyhow).
however, i still got stucked at the same spot:
summary(data)
    job_id         sector_id           sqft        building_type
Min.   :   1.0   Min.   : 6.000   Min.   :  0.00   Min.   :1.000
1st Qu.: 250.8   1st Qu.: 6.000   1st Qu.:  3.00   1st Qu.:2.000
Median : 500.5   Median :11.000   Median :  4.00   Median :2.000
Mean   : 500.5   Mean   : 9.455   Mean   : 12.49   Mean   :1.996
3rd Qu.: 750.3   3rd Qu.:11.000   3rd Qu.:  4.00   3rd Qu.:2.000
Max.   :1000.0   Max.   :12.000   Max.   :192.00   Max.   :2.000
save (data, file="compact_data.Rdata", ascii=FALSE)
summary(newdata)
Length Class  Mode
1 character character
attach(newdata)
Error: restore file may be empty -- no data loaded
file 'data' has magic number ''
Use of save versions prior to 2 is deprecated
is.data.frame (newdata)
[1] FALSE
is.list (newdata)
[1] FALSE
is.matrix (newdata)
[1] FALSE

```
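The "character with length 1" symptom in the question matches what `load()` returns: it restores the saved objects under their original names and returns only a character vector of those names. Below is a minimal sketch with a tiny made-up data frame and temporary files (the paths and values are illustrative, not the poster's). The `saveRDS()`/`readRDS()` pair at the end is a base R alternative that was not mentioned in the thread.

```r
# Tiny stand-in for the poster's data frame (hypothetical values)
d <- data.frame(job_id = 1:3, sqft = c(NA, 4, 17))

path <- tempfile(fileext = ".RData")
save(d, file = path)
rm(d)

# load() puts `d` back into the workspace; its return value is the NAMES
newdata <- load(path)
newdata            # a character vector, not the data
is.data.frame(d)   # the data frame is available again as `d`

# Alternative: saveRDS/readRDS return the object itself
rds <- tempfile(fileext = ".rds")
saveRDS(d, rds)
d2 <- readRDS(rds)
```

So after `load()` the variables are reached through the original object name (`d` here), not through whatever the result of `load()` was assigned to.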

### Re: [R] tackle memory insufficiency for large dataset using save() load()?

```?save says it's the names (not the objects) although I just
tried it and both save("iris", file = "/iris.Rdata") and
save(iris, file = "/iris.Rdata") seemed to work so you are
right that it seems to work with the objects, not just the names,
although it's not documented to do so.

Usage
save(..., list = character(0),
file = stop("'file' must be specified"),
ascii = FALSE, version = NULL, envir = parent.frame(),
compress = !ascii, eval.promises = TRUE)

save.image(file = ".RData", version = NULL, ascii = FALSE,
compress = !ascii, safe = TRUE)

Arguments
... the names of the objects to be saved.
list A character vector containing the names of objects to be saved.

On 8/21/07, Rolf Turner [EMAIL PROTECTED] wrote:

On 22/08/2007, at 1:48 PM, Gabor Grothendieck wrote:

See ?save .  The ... arguments are the ***names*** of the objects, not
the objects
so you want save("d", ...whatever...) not save(d, ...whatever...) .

I think this is wrong.  You want the objects not their names.

If you want to make use of object names, use the list argument.

I.e.

save(melvin,clyde,file="irving")

and

save(list=c("melvin","clyde"),file="irving")

accomplish the same thing.

cheers,

Rolf Turner

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
and provide commented, minimal, self-contained, reproducible code.

```
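A quick way to check the claim that the two call forms are equivalent is to save both ways and load each file into its own environment. This is only a sketch; the object values and temporary file names are invented for illustration.

```r
melvin <- 1:3
clyde  <- letters[1:2]

f1 <- tempfile(fileext = ".RData")
f2 <- tempfile(fileext = ".RData")

save(melvin, clyde, file = f1)                  # objects given via ...
save(list = c("melvin", "clyde"), file = f2)    # names given via list=

# Load each file into a separate environment and compare the contents
e1 <- new.env()
e2 <- new.env()
n1 <- load(f1, envir = e1)
n2 <- load(f2, envir = e2)

sort(n1)
sort(n2)
```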

### Re: [R] Q: combine 2 data frames with missing values

```Try this:

Lines <- "case var1 var2 var3 var4
1   9   9   13  11
2   15  9   15  13
3   na  na  12  9
4   8   6   na  na
5   14  10  na  na
6   20  15  17  15
"
DF <- read.table(textConnection(Lines), header = TRUE, na.strings = "na")
# replace with DF <- read.table("myfile.dat", header = TRUE, na.strings = "na")

DF1 <- DF[-1]
kor <- cor(DF1, use = "pairwise")
kor

lm(var1 ~ var2, DF) # a sample regression

# mycoef calculates kth coefficient in regression of
# ith variable on jth variable
mycoef <- function(i, j, k) coef(lm(DF1[c(i, j)]))[k]

idx <- 1:ncol(DF1)
names(idx) <- names(DF1)

intercepts <- outer(idx, idx, Vectorize(mycoef), 1)
names(dimnames(intercepts)) <- c("y", "x")
intercepts

slopes <- outer(idx, idx, Vectorize(mycoef), 2)
names(dimnames(slopes)) <- c("y", "x")
slopes

# another approach to the above
# mycoef1 is like mycoef but has only one argument
# and outputs all coefs, not just a specified one
mycoef1 <- function(idx) coef(lm(DF1[idx]))
grid <- expand.grid(y = idx, x = idx)
# bind the (y, x) index pairs to the coefficients so there are four columns
out <- cbind(grid, t(apply(grid, 1, mycoef1)))
colnames(out) <- c("y", "x", "intercept", "slope")
out

# To perform SQL operations on data frames
# and also ?sqldf for many examples
library(sqldf)
sqldf("select avg(var1), avg(var2), avg(var3), avg(var4) from DF1")
colMeans(DF1, na.rm = TRUE)  # same

On 8/20/07, Tom Willems [EMAIL PROTECTED] wrote:
hello R users,

i have the same problem with my data,
for all the different variables, i have the same number of cases, but they
are often out of detection limits so they produce na's .
so the data looks like this:

case var1 var2 var3 var4 ...
1   9   9   13  11
2   15  9   15  13
3   na  na  12  9
4   8   6   na  na
5   14  10  na  na
6   20  15  17  15  ..
..

What i would like to do for data exploration, is to compare each possible
pair of variables, get their correlation coefficient, the intercept and
the slope of regression line. yet for every variable the measurements are
linked through their case. it is the same sample just a different test.

Now  i select subsets of variables out of the original dataset, and use:
value_x1 = subset(dataset_1,select=lg_value)
value_y1 =subset(dataset_2,select=lg_value)

Then i try to build an lm model, in order to get estimates for the slope and
intercept
model_1 <- lm (value_y1[,1]~ value_x1[,1]  )

This is what R tells me:
Error in model.frame(formula, rownames,
variables, varnames, extras, extranames,  :
variable lengths differ (found for
'value_x1[, 1]')

Is there perhaps a way of binding the selected subsets together, still
linked to their case, so that the na's can be discarded by R automatically?
I have been trying to use SQLiteDF and the other sql func's of R, but i
don't really understand them.
If someone out there knows how to use sql, in R, i'd be delighted if he or
she could explain it to me, more understandably than the manuals i find on
the web.
Here is what i would want sql to do .

My data is in columns, one column holds all the case numbers, one the
measured values, one all the test types and one the time period and then one
column for the labs that performed the test. it is stored in a txt file.
So it is a long 5 column data table.
Now is it possible to make a cross table holding the case nr's, and
time period in 2 columns, and then have a different column for every test?
so if there are 4 tests and 4 labs, it would give 16 columns.
I've tried it in access, but it gave me endless loops of repeated values.
and creating new data files is dangerous, 'little mistakes made while
copying' or 'manipulations made to one file and not to the other'.

kind regards,
Tom

[[alternative HTML version deleted]]

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
and provide commented, minimal, self-contained, reproducible code.

```
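The "variable lengths differ" error in the question comes from fitting on two separately subsetted objects of different lengths. As a sketch of the point made in the reply, keeping the variables side by side in one data frame lets `lm()` drop the incomplete cases itself (its default `na.action` is `na.omit`). The data below are the six cases from the post, typed in directly.

```r
DF <- data.frame(
  case = 1:6,
  var1 = c(9, 15, NA, 8, 14, 20),
  var2 = c(9, 9, NA, 6, 10, 15),
  var3 = c(13, 15, 12, NA, NA, 17),
  var4 = c(11, 13, 9, NA, NA, 15)
)

# Rows with an NA in either variable of a pair are dropped automatically
fit12 <- lm(var1 ~ var2, data = DF)
fit34 <- lm(var3 ~ var4, data = DF)

coef(fit12)          # intercept and slope for var1 ~ var2
n12 <- nobs(fit12)   # number of cases actually used
n34 <- nobs(fit34)
```

Each pair uses its own set of complete cases, so the two fits here are based on different numbers of observations.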

### Re: [R] Any parser generator / code assistance for R?

```On 8/17/07, Ali - [EMAIL PROTECTED] wrote:
Hi,

Is there any parser generator like www.antlr.org?  Moreover, how does simple

Given the response, it looks like no one has come up with an antlr
parser for R but there are some facilities within R itself.

showTree() in the R codetools package which can generate a
Lisp style expression for any R expression:

library(codetools)
showTree(quote(for(i in myvec[1:3]) print(i+88*2+3*4)))
(for i ([ myvec (: 1 3)) (print (+ (+ i (* 88 2)) (* 3 4))))

Looking at the source of showTree would show you how to walk
an R parse tree.

The Ryacas R package has a recursive descent R parser that is used to
process R code translating it to yacas and it also can translate OpenMath
XML code generated by yacas to R.  See:

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
and provide commented, minimal, self-contained, reproducible code.

```
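To give an idea of what walking an R parse tree involves, here is a minimal base R sketch that produces a Lisp-style string similar in spirit to the `showTree()` output above. `to_sexpr` is a made-up helper, not part of codetools, and it does not handle every case (for example, calls with empty arguments such as `x[1, ]`).

```r
# to_sexpr is a hypothetical helper written for this illustration
to_sexpr <- function(e) {
  if (is.call(e)) {
    # A call is a list: the function name followed by its arguments
    parts <- vapply(as.list(e), to_sexpr, character(1))
    paste0("(", paste(parts, collapse = " "), ")")
  } else {
    # Symbols and constants are the leaves of the tree
    paste(deparse(e), collapse = "")
  }
}

res <- to_sexpr(quote(print(i + 88 * 2)))
res
```

The recursion relies only on `is.call()` and `as.list()`, which is the same basic structure any code walker over quoted R expressions needs.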

### Re: [R] recommended combo of apps for new user?

```Regarding RODBC vs. DBI-based packages (RSQLite, RMySQL, etc.) it's
my perception, possibly mistaken, that apart from any consideration of
the R packages themselves, ODBC (which originated in the Windows world)
is more widely used on Windows than UNIX.  Also ODBC has the problem
that one must configure it which puts an extra step into the process.  Clear
documentation on how to do such ODBC configuration may be difficult to find.

On the other hand the RODBC package itself seems to be maintained
very well and is typically available for new versions of R before the
DBI-based packages.

On 8/19/07, Prof Brian Ripley [EMAIL PROTECTED] wrote:

(a) SPSS is not a DBMS, so it is not clear that you need this. But if you
do and are storing valuable data in a DBMS a lot of further questions come
into play, like how you are going to do backups.  I'd say PostgreSQL was
MySQL for most people.  We do also run PostgreSQL and they find it a lot
trickier to maintain.

'dozens of columns and thousands of rows' is not big.  A data frame with
50 columns and 5000 rows would only take 2Mb to store, and R will easily
handle 100x with 4GB of RAM (and if you have less, get 4GB).  So storing
data in .rda (R's save() format) is most likely viable.  R's indexing etc
operations make it good at data manipulation, and using a DBMS will
involve learning SQL, a non-trivial cost.

(b) You have a choice of interfaces to a DBMS, RODBC and the DBI+ family,
e.g. DBI+RMySQL and DBI+RSQLite.  I'm biased, but I find RODBC more
intuitive, and many people have reported it to be faster.  If all you want
is non-permanent storage for manipulation of large data sets, consider
also SQLiteDF.

On Sat, 18 Aug 2007, Duncan Murdoch wrote:

Martin Brown wrote:
[i sent this message earlier but apparently should have sent it plain
text, as follows..]

Hi there,

software that I need to complement R.  I've rooted around in the FAQ's
and done a few searches on this mailing list but haven't quite found
the perspective I need.

I am an experienced data analyst in my field (forest ecology and
ecological monitoring) but new to R. I am a long time user of SPSS and
have gotten pretty handy with it.  However, I am frustrated with SPSS
for several reasons:  There's the cost (I'm a freelancer; I pay for my
software myself);  the Windows dependence (I use Kubuntu as my usual
OS now, and switching back and forth is a pain); the horrible
inefficiency when I do certain types of file manipulations; and the
inability to do the kind of publication-quality graphs I want... I've
usually ended up using a commercial graphing program (another source
of expense and limitation).

I'd like to switch to using R on Kubuntu, for all those reasons.  In
addition I think the mathematical formality that R encourages might be
good for me.

However, reviewing the FAQ's on the R project web site makes me
realize that I've been using SPSS as three kinds of software really:
a DBMS; a statistical analysis package; and a graphing package.  It
looks like moving to R might involve learning three kinds of software,
not just one.  I wonder:

1) What open-source DBMS works most seamlessly with R?  I have seen
MySQL recommended but wonder if there are alternatives.  I sometimes
need to handle big data files.  In fact a lot of my work involves
exploratory and descriptive analyses of rather large and messy
databases from ecological monitoring, rather than statistical tests
per se.  In SPSS the data files I have been generating have dozens of
columns and thousands of rows, often with value and variable labels

See above.

I think you won't find much difference in the R interface between MySQL,
PostgreSQL, or SQLite.  The choice should be made based on the qualities
of the database (and I don't know enough about the differences to give a
recommendation).
2) For the purpose of creating publication-quality graphs, do R users
typically need to go outside of the R system? If so, what open-source
programs would you all recommend?

R is great for this, but you might need to go outside for some
specialized stuff (e.g. medical imaging).

3) Any other software I need to learn that would make my work in R
more productive? (for example, a code editor).

A lot of people are happy with ESS mode in Emacs.

Duncan Murdoch

--
Brian D. Ripley,  [EMAIL PROTECTED]
Professor of Applied
```

### Re: [R] Creating a data set within a function

```Check out ?embed

On 8/19/07, Anup Nandialath [EMAIL PROTECTED] wrote:
Dear Friends,

I'm trying to find if there is a way to automate creation of the design
matrix. Suppose we are interested in say running an autoregressive model. The
user inputs the following data

myfunAR <- function(y, order)
{.
..
}

now here y is the data series and order represents the level of the process.
In other words if order=2 then we have an AR (2) process. Now it is easy to
to create the y vector within the function, but I'm not clear on how to
create the design matrix.

For instance if order=2 then

y <- as.matrix(rnorm(100))
ynew <- as.matrix(y[3:nrow(y), 1])
x <- as.matrix(cbind(rep(1, nrow(y) - 2), y[2:(nrow(y) - 1), 1],
                     y[1:(nrow(y) - 2), 1]))

ynew and x give me the response vector and design matrix respectively.
However, I'm trying to write a general function which will accommodate any
order. Hence, given that the user inputs y and the order, is there a way to
program the creation of the x matrix automatically?

The long way would be

if (order == 1)
{...}

if (order == 2)
{...}

but this will force me to stop at some fixed order. Is there an alternative
way to program this?

Regards

Anup

```
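
The `?embed` hint above can be sketched as follows. This is only an illustration, not the poster's function: `ar_design` is a hypothetical name, and the layout (intercept column first, then lags 1 to `order`) is assumed from the order = 2 example in the question.

```r
# Hypothetical helper: response vector and design matrix for an AR(order) fit
ar_design <- function(y, order) {
  y <- as.numeric(y)
  stopifnot(order >= 1, length(y) > order)
  # each row of embed() is (y[t], y[t-1], ..., y[t-order]) for t = order+1, ..., n
  e <- embed(y, order + 1)
  list(ynew = e[, 1, drop = FALSE],          # response y[t]
       x = cbind(1, e[, -1, drop = FALSE]))  # intercept plus the lagged values
}

set.seed(1)
y <- rnorm(100)
d <- ar_design(y, 2)
dim(d$x)   # 98 rows: intercept, lag 1, lag 2
```

Because `embed()` does the lagging, the same function covers any order without a chain of `if` statements.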

### Re: [R] How to parse a string into the symbol for a data frame object

```You might want to store the data frames in a list to eliminate this
problem and make it more convenient to iterate over them:

L <- list(df1 = df1, df2 = df2)
rm(df1, df2)

# reduce each data frame to its first few rows
for(nm in names(L)) L[[nm]] <- head(L[[nm]])

or if you don't need to modify them or know their names:

# print first few lines of each

On 8/19/07, Darren Weber [EMAIL PROTECTED] wrote:
I have several data frames, eg:

df1 <- data.frame(x=seq(0,10), y=seq(10,20))
df2 <- data.frame(a=seq(0,10), b=seq(10,20))

It is common to create loops in R like this:

for(df in list(df1, df2)){ #etc. }

This works fine when you know the name of the objects to put into the
list.  I assume that the order of the objects in the list is respected
through the loop.  Inside the loop, the objects of the list are
'dereferenced' using 'df' but, to my knowledge, there is no way to
tell whether 'df' is a current representation of 'df1' or 'df2'.

In addition, I really want to use 'paste' within the loop to create a
new string value that will have the symbol name of a data frame to be
dereferenced, e.g.:

for(n in c(1, 2)){ dfString <- paste('df', n, sep="");
print(eval(dfString)) }

[1] "df1"
[1] "df2"

This is not what I want.  I have read through the documentation on
eval and similar commands like substitute and quote.  I program
regularly, but I do not understand these constructs in R.  I do not
understand the R framework for parsing and evaluation and I don't have
a lot of time right now to get lost in this detail.  I could really
use some help to get the string values in my loop to be parsed into
symbols that refer to the data frame objects df1 and df2.  How is this
done?

Best, Darren

```
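
For the literal question of turning a string into the object it names, base R's `get()` and `mget()` are the usual tools. The sketch below reuses the poster's `df1` and `df2`; it is an illustration and not part of the original reply.

```r
df1 <- data.frame(x = seq(0, 10), y = seq(10, 20))
df2 <- data.frame(a = seq(0, 10), b = seq(10, 20))

# get() returns the object whose name is stored in a character string
for (n in c(1, 2)) {
  dfString <- paste("df", n, sep = "")
  print(head(get(dfString), 3))
}

# mget() fetches several objects at once and returns a named list,
# so the name of each data frame stays available inside a loop
L <- mget(c("df1", "df2"), envir = environment())
for (nm in names(L)) cat(nm, "has columns:", names(L[[nm]]), "\n")
```

The named list is usually the more convenient of the two, since it avoids building object names with `paste` at all.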

### Re: [R] how to collapse a list of 1 column matrix to a matrix?

```Try this:

L <- list(`1` = matrix(1:4, 4), `2` = matrix(5:8, 4))
sapply(L, c)

Note that the list component names are kept as column names in the result.

On 8/19/07, [EMAIL PROTECTED] [EMAIL PROTECTED] wrote:
Hi,

I have encountered a situation where I have a list whose elements are
one-column matrices. Say,

$`1`
     [,1]
[1,]    1
[2,]    2
[3,]    3

$`2`
     [,1]
[1,]    4
[2,]    5
[3,]    6

Is there a fast way to collapse the list into a matrix, like a cbind operation
in this case? Meaning, the result should be a matrix that looks like:

     [,1] [,2]
[1,]    1    4
[2,]    2    5
[3,]    3    6

I can loop through all elements and do cbind manually. But I think there must
be a simpler way that I don't know. Thank you.

```
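
A small sketch comparing the `sapply(L, c)` answer above with `do.call(cbind, L)`, which is the "cbind over the whole list" the poster asked about. The three-row list mirrors the question; the variable names `m1` and `m2` are only for illustration.

```r
L <- list(`1` = matrix(1:3, 3), `2` = matrix(4:6, 3))

m1 <- sapply(L, c)        # c() drops each matrix to a vector; sapply binds the vectors as columns
m2 <- do.call(cbind, L)   # calls cbind() with every list element as an argument

m1
m2
```

Both give the same 3 x 2 values. `sapply` carries the list names over as column names, whereas `cbind` takes column names from the matrices themselves.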

### Re: [R] recommended combo of apps for new user?

```On 8/18/07, Martin Brown [EMAIL PROTECTED] wrote:
Hi there,

that I need to complement R.  I've rooted around in the FAQ's and done a few
searches on this mailing list but haven't quite found the perspective I
need.

I am an experienced data analyst in my field (forest ecology and ecological
monitoring) but new to R. I am a long time user of SPSS and have gotten
pretty handy with it.  However, I am frustrated with SPSS for several
reasons:  There's the cost (I'm a freelancer; I pay for my software
myself);  the Windows dependence (I use Kubuntu as my usual OS now, and
switching back and forth is a pain); the horrible inefficiency when I do
certain types of file manipulations; and the inability to do the kind of
publication-quality graphs I want... I've usually ended up using a
commercial graphing program (another source of expense and limitation).

I'd like to switch to using R on Kubuntu, for all those reasons.  In
addition I think the mathematical formality that R encourages might be good
for me.

From a strictly language perspective, mathematical formality is pretty
far from R.  It's actually quite loose.  Underneath there are some Lisp/Scheme
ideas but you are not very close to that as a user.

However, reviewing the FAQ's on the R project web site makes me realize that
I've been using SPSS as three kinds of software really:  a DBMS; a
statistical analysis package; and a graphing package.  It looks like moving
to R might involve learning three kinds of software, not just one.  I
wonder:

1) What open-source DBMS works most seamlessly with R?  I have seen MySQL
recommended but wonder if there are alternatives.  I sometimes need to
handle big data files.  In fact a lot of my work involves exploratory and
descriptive analyses of rather large and messy databases from ecological
monitoring, rather than statistical tests per se.  In SPSS the data files I
have been generating have dozens of columns and thousands of rows, often
with value and variable labels helpful for documenting my work.

Databases. SQLite is the easiest to install since it's embedded rather
than client/server, so I would use that unless your application requires
client/server or other features of MySQL.  MySQL is probably the most
popular of the free databases so that would be the next one to go with.
If you intend to create a commercial application you might want to
consider Postgres instead of MySQL, as the latter charges for
commercial implementations but Postgres does not.  Some heavy
Postgres users might feel that it should be considered after SQLite
rather than MySQL and there is a certain amount of arbitrariness here.
See the R packages RSQLite, RMySQL and DBI.  The R packages sqldf and
SQLiteDF are beginning to blur the boundary between R and the database.

2) For the purpose of creating publication-quality graphs, do R users
typically need to go outside of the R system? If so, what open-source
programs would you all recommend?

Graphics.  R should be ok.  Check out:
http://cran.r-project.org/src/contrib/Views/Graphics.html
R Graphics Gallery

3) Any other software I need to learn that would make my work in R more
productive? (for example, a code editor).

Other.  You need to know a text editor.  I use vim but there are
many good choices here with ESS being one that is often mentioned.

http://www.sciviews.org/_rgui/projects/Editors.html
http://ess.r-project.org/

If you intend to write C routines to run with R then, of course, you
need to know C.
For certain R packages that interface with outside software (tcltk, Rgraphviz,
Ryacas, XML, etc.) you will need to know something about the interfaced-to
software if you intend to use those packages.

For package development you will need to know latex and possibly subversion,
i.e. svn, the UNIX screen program, tar and various other UNIX commands.
Certain auxiliary programs that come with and are used with R are written
in perl, although it's unlikely you will need to know it.

```
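
As a rough illustration of the SQLite route mentioned above, here is a minimal sketch using the DBI and RSQLite packages. The table name `plots` and its columns are made up for the example, and the code is guarded so that it does nothing when the packages are not installed.

```r
# Made-up monitoring data; site and dbh_cm are illustrative column names
plots <- data.frame(site = c("A", "A", "B"), dbh_cm = c(12.5, 30.1, 22.0))
mean_dbh <- NULL

if (requireNamespace("DBI", quietly = TRUE) &&
    requireNamespace("RSQLite", quietly = TRUE)) {
  con <- DBI::dbConnect(RSQLite::SQLite(), ":memory:")  # in-memory database, no server
  DBI::dbWriteTable(con, "plots", plots)                # copy the data frame into a table
  mean_dbh <- DBI::dbGetQuery(con,
    "SELECT site, AVG(dbh_cm) AS mean_dbh FROM plots GROUP BY site ORDER BY site")
  DBI::dbDisconnect(con)
}
mean_dbh
```

Replacing `":memory:"` with a file path should keep the database on disk, so only query results need to be held in R.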

### Re: [R] names not inherited in functions

```Within a function, deparse(substitute(x)) will give the name of x as a
character string.  Search the archives for
deparse substitute
to find many examples.

On 8/17/07, david dav [EMAIL PROTECTED] wrote:
Dear R list,
After a huge delay, I come back to this question. Using names of
variables inside a function is a problem I run into quite often.
Maybe this little example should help to get my point:
Suppose I want to make a function llabel to get the labels of the
variables from a data frame.
If no label is defined, llabel should return the name of the variable.

library(Hmisc)
v1 <- c(1,2)
v2 <- c(1,2)
v3 <- c(1,3)
tablo <- data.frame(v1,v2,v3)
rm(v1,v2,v3)

label(tablo$v1) <- "var1"
attach(tablo)

# This does the trick on one variable.
if (label(v1) != "") label(v1)   else names(data.frame(v1))
if (label(v2) != "") label(v2)   else names(data.frame(v2))

But if I call this statement in a llabel function,

llabel <- function(var) {
  if (label(var) != "")
    res <- label(var)
  else res <- names(data.frame(var))
  return (res) }

I just get "var" instead of the names when no label is defined:

llabel(v1) # works
llabel(v2) # gives "var" instead of "v2"

David

2007/6/7, Uwe Ligges [EMAIL PROTECTED]:
Not sure what you are going to get. Can you shorten your functions and
specify some example data? Then please tell us what your expected result is.

Best,
Uwe Ligges

david dav wrote:
Dear all,

I'd like to keep the names of variables when calling them in a function.
An example might help to explain my problem:

The following function puts counts and percentages from a data.frame
(passed as tablo) into a new data frame.
The step  nom.chiffr[1] <- names(vari)  is useless, as names from the
original data.frame aren't kept in the function environment.

Hoping I use appropriate R vocabulary, I thank you for your help

David

descriptif <- function (tablo) {
  descriptifvar <- function (vari) {
    table(vari)
    length(vari[!is.na(vari)])
    chiffr <-
      cbind(table(vari),100*table(vari)/(length(vari[!is.na(vari)])))
    nom.chiffr <- rep(NA, dim(table(vari)))
    if (is.null(names(vari))) nom.chiffr[1] <- paste(i,"") else
      nom.chiffr[1] <- names(vari)
    chiffr <- data.frame (  names(table(vari)),chiffr)
    rownames(chiffr) <- NULL
    chiffr <- data.frame (nom.chiffr, chiffr)
    return(chiffr)
  }

  res <- rep(NA, 4)
  for (i in 1 : ncol(tablo))
    res <- rbind(res,descriptifvar(tablo[,i]))
  colnames(res) <- c("variable", "niveau", "effectif", "pourcentage")
  return(res[-1,])
}
# NB I used this function on a data.frame with only factors in

```
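
The `deparse(substitute(x))` advice above can be sketched without Hmisc. In this illustration the label is read from a plain `"label"` attribute, which is assumed here to stand in for what `label()` returns; `llabel` follows the poster's name but the body is only a sketch.

```r
# Sketch: return the label if one is set, otherwise the name of the argument
llabel <- function(var) {
  nm <- deparse(substitute(var))   # the expression the caller typed, as a string
  lab <- attr(var, "label")        # assumption: the label lives in this attribute
  if (!is.null(lab) && nzchar(lab)) lab else nm
}

v1 <- c(1, 2)
attr(v1, "label") <- "var1"
v2 <- c(1, 2)

llabel(v1)   # the label
llabel(v2)   # falls back to the variable's own name
```

`names(data.frame(var))` returned "var" in the original because the data frame is built inside the function, where the argument is only known by its formal name.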

### Re: [R] an easy way to construct this special matirx

```Here are two solutions.  In the first, lo has TRUE on the lower triangle
and diagonal. Then we compute the exponents, multiplying by lo to zero
out the upper triangle.  In the second, rn is a matrix of row numbers
and rn >= t(rn) is the same as lo in the first solution.

r <- 2; n <- 5 # test data

lo <- lower.tri(diag(n), diag = TRUE)
lo * r ^ (row(lo) - col(lo) + 1)

Here is another one:

rn <- row(diag(n))
(rn >= t(rn)) * r ^ (rn - t(rn) + 1)

On 8/15/07, [EMAIL PROTECTED] [EMAIL PROTECTED] wrote:
Hi,
Sorry if this is a repost. I searched but found no results.
I am wondering if it is an easy way to construct the following matrix:

r    1    0    0   0
r^2  r    1    0   0
r^3  r^2  r    1   0
r^4  r^3  r^2  r   1

where r could be any number. Thanks.
Wen

```

### Re: [R] time series with quality codes

```In addition, we could create a function to.df which converts a zoo
object to a data frame assuming that any column that only contains
1:nlevels is a factor with the indicated level names.  Use to.df just
before plotting:

library(zoo)
set.seed(1)
f <- zoo(factor(sample(3, 10, replace = TRUE)))
x <- zoo(rnorm(10))
y <- zoo(rnorm(10))
z <- merge(x, y, f)

to.df <- function(z, levels = letters[1:3], time = FALSE) {
  zz <- as.data.frame(z)
  # convert every column that holds only the codes 1:nlevels to a factor
  for(i in seq_len(ncol(zz)))
    if (all(zz[,i] %in% seq_along(levels)))
      zz[,i] <- factor(levels[zz[,i]])
  if (time) cbind(index = index(z), zz) else zz
}

library(lattice)
xyplot(y ~ x | f, data = to.df(z))

On 8/16/07, Achim Zeileis [EMAIL PROTECTED] wrote:
On Thu, 16 Aug 2007, Felix Andrews wrote:

list(...),

I am working with environmental time series (eg rainfall, stream flow)
that have attached quality codes for each data point. The quality
codes have just a few factor levels, like good, suspect, poor,
imputed. I use the quality codes in plots and summaries. They are
carried through when a time series is aggregated to a longer
time-step, according to rules like worst, median or mode.

I need to support time steps of anything from hours to years. I can
assume the data are regular time series -- they might be irregular
initially but could be 'regularized'. But I would want to plot
irregular time series along with regular ones.

So far I have been using a data frame with a POSIXct column, a numeric
column and a factor column. However I would like to use zoo instead,
because of its many utility functions and easy conversion to ts. Is
there any prospect of zoo handling such numeric + factor data? Other
suggestions on elegant ways to do it are also welcome.

There is some limited support for this in zoo. You can do
z <- zoo(myfactor, myindex)
and work with it like a zoo series and then
coredata(z)
will recover a factor. However, you cannot bind this to other series
without losing the factor structure. At least not in a plain zoo series.
But you can do
df <- merge(z, Z, retclass = "data.frame")
where every column of the resulting data.frame is a univariate zoo series.

The final option would be to just have a data.frame as usual and put your
data/index into one column. But then it's more difficult to leverage zoo's
functionality.

I would like to have more support for things like this, but currently this
is what we have.

Best,
Z

Felix

--
Felix Andrews
PhD candidate
Integrated Catchment Assessment and Management Centre
The Fenner School of Environment and Society
The Australian National University (Building 48A), ACT 0200
Beijing Bag, Locked Bag 40, Kingston ACT 2604
http://www.neurofractal.org/felix/
xmpp:[EMAIL PROTECTED]
3358 543D AAC6 22C2 D336  80D9 360B 72DD 3E4C F5D8

```
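
The "worst code wins" aggregation rule described in the question can be sketched in base R with an ordered factor. The severity order of the levels, the six sample values and the two aggregation periods are all assumptions made for the illustration.

```r
# Assumed severity order, from best to worst
lev <- c("good", "suspect", "poor", "imputed")
quality <- factor(c("good", "suspect", "good", "poor", "good", "good"),
                  levels = lev, ordered = TRUE)
value  <- c(1.2, 0.0, 3.4, 5.1, 0.3, 0.8)   # e.g. rainfall per time step
period <- rep(1:2, each = 3)                # two longer time steps of three points each

agg <- data.frame(
  period  = sort(unique(period)),
  total   = as.numeric(tapply(value, period, sum)),
  # the largest integer code within a period is the worst quality code
  quality = factor(lev[tapply(as.integer(quality), period, max)],
                   levels = lev, ordered = TRUE)
)
agg
```

The same integer codes are what end up in a merged zoo object, which is why the `to.df` helper above converts them back to factors before plotting.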

### Re: [R] an easy way to construct this special matirx

```It was pointed out that the required matrix may not be square and
the superdiagonal was missing in my prior post.  Here is a revision:

r <- 2; nr <- 4; nc <- 5 # test data

x <- matrix(nrow = nr, ncol = nc)
x <- row(x) - col(x) + 1
(x >= 0) * r ^ x
```
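
The revised construction can be checked on the 4 x 5 case from the question. This sketch builds the exponent matrix with `outer()` instead of `row()` and `col()`, which is an equivalent formulation chosen only for illustration.

```r
r <- 2; nr <- 4; nc <- 5

# exponent of r in cell (i, j) is i - j + 1; negative exponents mark the zero cells
e <- outer(seq_len(nr), seq_len(nc), "-") + 1
m <- (e >= 0) * r^e

m[1, ]   # r, 1, 0, 0, 0
m[4, ]   # r^4, r^3, r^2, r, 1
```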

### Re: [R] Linear models over large datasets

```It's actually only a few lines of code to do this from first principles.
The coefficients depend only on the cross products X'X and X'y and you
can build them up easily by extending this example to read files or
a database holding x and y instead of getting them from the args.
Here we process incr rows of the builtin matrix state.x77 at a time,
building up the two cross products, xtx and xty, regressing
Income (variable 2) on the other variables:

mylm <- function(x, y, incr = 25) {
  start <- xtx <- xty <- 0
  while(start < nrow(x)) {
    idx <- seq(start + 1, min(start + incr, nrow(x)))
    x1 <- cbind(1, x[idx,])
    xtx <- xtx + crossprod(x1)
    xty <- xty + crossprod(x1, y[idx])
    start <- start + incr
  }
  solve(xtx, xty)
}

mylm(state.x77[,-2], state.x77[,2])

On 8/16/07, Alp ATICI [EMAIL PROTECTED] wrote:
I'd like to fit linear models on very large datasets. My data frames
are about 200 rows x 200 columns of doubles and I am using an 64
R Data Import/Export guide. My primary issue is although my data
represented in ascii form is 4Gb in size (therefore much smaller
considered in binary), R consumes about 12Gb of virtual memory.

What exactly are my options to improve this? I looked into the biglm
package but the problem with it is it uses update() function and is
therefore not transparent (I am using a sophisticated script which is
hard to modify). I really liked the concept behind the  LM package
here: http://www.econ.uiuc.edu/~roger/research/rq/RMySQL.html
But it is no longer available. How could one fit linear models to very
large datasets without loading the entire set into memory but from a
file/database (possibly through a connection) using a relatively
simple modification of standard lm()? Alternatively how could one
improve the memory usage of R given a large dataset (by changing some
default parameters of R or even using on-the-fly compression)? I don't
mind much higher levels of CPU time required.

```
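
A quick check that accumulating X'X and X'y chunk by chunk reproduces `lm()`. The simulated data, the chunk size of 25 and the variable names are assumptions for the sketch; in practice each chunk would be read from a file or database rather than sliced from a matrix in memory.

```r
set.seed(1)
n <- 200
x <- matrix(rnorm(n * 3), n, 3)
y <- drop(1 + x %*% c(0.5, -1, 2) + rnorm(n))

xtx <- 0; xty <- 0
chunks <- split(seq_len(n), ceiling(seq_len(n) / 25))   # row indices, 25 at a time
for (idx in chunks) {
  x1 <- cbind(1, x[idx, , drop = FALSE])   # add the intercept column
  xtx <- xtx + crossprod(x1)               # running X'X
  xty <- xty + crossprod(x1, y[idx])       # running X'y
}

beta_chunked <- drop(solve(xtx, xty))
beta_lm <- coef(lm(y ~ x))
cbind(beta_chunked, beta_lm)
```

Solving the normal equations directly is less numerically stable than the QR decomposition used by `lm()`, so for badly scaled predictors the two results can differ in the trailing digits.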

### Re: [R] Combine matrix

```Try this.  We convert to data frame placing the row names in column 1, do
the merge, remove column 1 and convert back to matrix:

# test input
a <- matrix(1:25, nrow = 5,
  dimnames = list(letters[1:5], rep("A", 5)))
b <- matrix(1:40, nrow = 8,
  dimnames = list(rep(letters[1:2], each = 4), rep("B", 5)))

# 1. process
to.DF <- function(x) data.frame(rn = row.names(x), x, row.names = 1:nrow(x))
out <- as.matrix(merge(to.DF(a), to.DF(b), by = 1)[,-1])
colnames(out) <- c(colnames(a), colnames(b))
out

# 2. same but merge is done using sqldf
# assume same a, b and to.DF as before

library(sqldf)
DFa <- to.DF(a)
DFb <- to.DF(b)
out <- as.matrix(sqldf("select * from DFa join DFb using(rn)")[-1])
colnames(out) <- c(colnames(a), colnames(b))
out

# 3. same but uses sqldf and proto (which sqldf automatically loads)
# assume same a, b and to.DF as before

library(sqldf)
out <- as.matrix(sqldf("select * from a join b using(rn)",
  envir = proto(a = to.DF(a), b = to.DF(b)))[-1])
colnames(out) <- c(colnames(a), colnames(b))
out

On 8/16/07, Gianni Burgin [EMAIL PROTECTED] wrote:
Let's say something like this:

a=matrix(1:25, nrow=5)

rownames(a)=letters[1:5]
colnames(a)=rep("A", 5)

a
A  A  A  A  A
a 1  6 11 16 21
b 2  7 12 17 22
c 3  8 13 18 23
d 4  9 14 19 24
e 5 10 15 20 25

b=matrix(1:40, nrow=8)
rownames(b)=c(rep("a",4),rep("b",4))
colnames(b)=rep("B", 5)

b
B  B  B  B  B
a 1  9 17 25 33
a 2 10 18 26 34
a 3 11 19 27 35
a 4 12 20 28 36
b 5 13 21 29 37
b 6 14 22 30 38
b 7 15 23 31 39
b 8 16 24 32 40

As a result I would like something like

A  A  A  A  A  B  B  B  B  B
a 1  6 11 16 21  1  9 17 25 33
a 1  6 11 16 21  2 10 18 26 34
a 1  6 11 16 21  3 11 19 27 35
a 1  6 11 16 21  4 12 20 28 36
b 2  7 12 17 22  5 13 21 29 37
b 2  7 12 17 22  6 14 22 30 38
b 2  7 12 17 22  7 15 23 31 39
b 2  7 12 17 22  8 16 24 32 40

Is that clear? Is there a function that automates this operation?

thank you very much!

On 8/16/07, jim holtman [EMAIL PROTECTED] wrote:

Can you provide an example of what you mean; e.g., the two input
matrices and the desired output.

On 8/16/07, Gianni Burgin [EMAIL PROTECTED] wrote:
Hi R user,

I am new to R, and I have a very simple question for you. I have two
matrices, A and B, with internally redundant rownames (but the variables
are different). Some, but not all, of the rownames are shared between the
two matrices. I want to create a larger matrix that combines the previous
two and has all the possible combinations of rows with matching rownames
in matrices A and B.

Looking for a solution I came across merge, but it works on data frames,
and in a data frame there can be no redundancy in row names.

Can you help me?


--
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem you are trying to solve?

```
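
The same join can be sketched by merging row positions instead of the data, so the matrices never have to pass through a data frame. This is an illustration on the poster's test matrices; `pairs`, `ia` and `ib` are names invented for the sketch.

```r
a <- matrix(1:25, nrow = 5, dimnames = list(letters[1:5], rep("A", 5)))
b <- matrix(1:40, nrow = 8,
            dimnames = list(rep(letters[1:2], each = 4), rep("B", 5)))

# one row per matching (row of a, row of b) pair; duplicated names give all combinations
pairs <- merge(data.frame(rn = rownames(a), ia = seq_len(nrow(a))),
               data.frame(rn = rownames(b), ib = seq_len(nrow(b))),
               by = "rn")

out <- cbind(a[pairs$ia, , drop = FALSE], b[pairs$ib, , drop = FALSE])
out
```

Rows of `a` with no partner in `b` (here c, d and e) are dropped, as in an inner join.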

### Re: [R] function to find coodinates in an array

```Get the indices using expand.grid and then reorder them:

set.seed(1); X <- array(rnorm(24), 2:4) # input
X # look at X

# lapply (rather than sapply) always returns a list, even if all dims are equal
do.call(expand.grid, lapply(dim(X), seq_len))[order(X),]

On 8/16/07, Ana Conesa [EMAIL PROTECTED] wrote:
Dear list,

I am looking for a function/way to get the array coordinates of given
elements in an array. What I mean is the following:
- Let X be a 3D array
- I find the ordering of the elements of X by ord <- order(X) (this
returns me a vector)
- I now want to find the x,y,z coordinates of each element of ord

Can anyone help me?

Thanks!

Ana

```
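
More recent versions of R also provide `arrayInd()`, which converts linear positions into array subscripts directly. The sketch below applies it to the same test array and compares it with the `expand.grid` approach above.

```r
set.seed(1)
X <- array(rnorm(24), 2:4)

ord <- order(X)                   # linear positions, smallest value first
coords <- arrayInd(ord, dim(X))   # one row of (i, j, k) subscripts per element
head(coords)

# indexing X with the subscript matrix returns the values in sorted order
X[coords][1:3]
```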

### Re: [R] Formula in lm inside lapply

```It can't find x since the environment of formula1 and of formula2 is the Global
Environment and x is not there -- it's local to the function.

Try this:

#generating data
set.seed(1)
# factor() is needed in R >= 4.0, where strings are no longer converted automatically
DF <- data.frame(y = rnorm(100, 1), x1 = rnorm(100, 1), x2 = rnorm(100, 1),
  group = factor(rep(c("A", "B"), c(40, 60))))

formula1 <- as.formula(y ~ x1)
lapply(levels(DF$group), function(x) {
  environment(formula1) <- environment()
  lm(formula1, DF, subset = group == x)
})

formula2 <- as.formula(y ~ x1 + x2)
lapply(levels(DF$group), function(x) {
  environment(formula2) <- environment()
  lm(formula2, DF, subset = group == x)
})

On 8/15/07, Li, Yan (IED) [EMAIL PROTECTED] wrote:
I am trying to run separate regressions for different groups of
observations using the lapply function. It works fine when I write the
formula inside the lm() function. But I would like to pass formulae into
lm(), so I can do multiple models more easily. I got an error message
when I tried to do that. Here is my sample code:

#generating data
x1 <- rnorm(100,1)
x2 <- rnorm(100,1)
y  <- rnorm(100,1)
group <- rep(c("A","B"),c(40,60))
group <- factor(group)
df <- data.frame(y,x1,x2,group)

#write formula inside lm--works fine
res1 <- lapply(levels(df$group), function(x) lm(y~x1,df, subset = group
==x))
res1
res2 <- lapply(levels(df$group),function(x) lm(y~x1+x2,df, subset =
group ==x))
res2

#try to pass formula into lm()--does not work
formula1 <- as.formula(y~x1)
formula2 <- as.formula(y~x1+x2)
resf1 <- lapply(levels(df$group),function(x) lm(formula1,df, subset =
group ==x))
resf1
resf2 <- lapply(levels(df$group),function(x) lm(formula2,df, subset =
group ==x))
resf2

The error message is

Any help is greatly appreciated!

Yan

```
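
Another way around the environment problem is to avoid `subset=` altogether and split the data first, so that `lm()` never has to look up `x`. This is a sketch of that alternative on the same simulated data, not the approach given in the reply; the list names `m1` and `m2` are made up.

```r
set.seed(1)
DF <- data.frame(y = rnorm(100, 1), x1 = rnorm(100, 1), x2 = rnorm(100, 1),
                 group = factor(rep(c("A", "B"), c(40, 60))))

formulas <- list(m1 = y ~ x1, m2 = y ~ x1 + x2)

# one fit per formula and per group; split() hands each group's rows to lm() as data
fits <- lapply(formulas, function(f)
  lapply(split(DF, DF$group), function(d) lm(f, data = d)))

coef(fits$m1$A)
coef(fits$m2$B)
```

The result is a nested list indexed first by model and then by group level.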

### Re: [R] Formula in lm inside lapply

```Here is another solution that gets around the non-standard
way that subset= is handled in lm.  It has the advantage that, unlike
the previous solution where formula1 and group == x appear literally
in the output, in this one the formula appears written out and
group == "A" and group == "B" appear:

lapply(levels(DF$group), function(x) do.call("lm",
  list(formula1, quote(DF), subset = bquote(group == .(x)))))
[[1]]

Call:
lm(formula = y ~ x1, data = DF, subset = group == "A")

Coefficients:
(Intercept)           x1
    1.04855      0.04585

[[2]]

Call:
lm(formula = y ~ x1, data = DF, subset = group == "B")

Coefficients:
(Intercept)           x1
    1.13593     -0.01627
```
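
What `bquote()` contributes in the call above can be seen in isolation: it builds an unevaluated expression and substitutes the current value of anything wrapped in `.()`. A minimal sketch, with a made-up `group` vector for the evaluation step:

```r
x <- "A"
e <- bquote(group == .(x))   # .(x) is replaced by the value of x; group stays a symbol

deparse(e)                   # the expression as text
eval(e, list(group = c("A", "B", "A")))   # evaluated against a data-like list
```

Because the value is spliced into the expression, the stored call in each fitted model shows the actual group level rather than the loop variable.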

### Re: [R] shell and shell.exec on Windows

```The system() function has an invisible= argument.  The Ryacas package
uses system() to run yacas.  See the runYacas() and
yacasInvokeString() functions in yacas.R for examples.

On 8/11/07, Erich Neuwirth [EMAIL PROTECTED] wrote:
I have an Excel workbook MyWorkbook.xls containing an Auto_Open macro
which I want to be run from R.

shell.exec("MyWorkbook.xls")
does that.

shell("start MyWorkbook.xls")
also runs it.

In both cases, the Excel window is visible on screen when Excel is started.
Is there a way of opening the sheet with a hidden Excel window?
start has some parameters (e.g. /MIN), which should allow this, but
shell("start /MIN MyWorkbook.xls")
also starts Excel visibly.

--
Erich Neuwirth, University of Vienna
Faculty of Computer Science
Computer Supported Didactics Working Group
Phone: +43-1-4277-39464 Fax: +43-1-4277-39459

```

### Re: [R] help with counting how many times each value occur in each column

```Try this where we have constructed the example to illustrate that
it does handle the case where not all values are in each column:

mat <- matrix(rep(1:6, each = 4), 6)

table(col(mat), mat)

On 8/10/07, Tom Cohen [EMAIL PROTECTED] wrote:
Dear list,
I have the following dataset and want to know how many times each value
occur in each column.
data
[,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
[1,] -100 -100 -100    0    0    0    0    0    0  -100
[2,] -100 -100 -100 -100 -100 -100 -100 -100 -100  -100
[3,] -100 -100 -100 -100 -100 -100 -100 -100 -100  -100
[4,] -100 -100 -100 -100 -100 -100 -100 -100 -100  -100
[5,] -100 -100 -100 -100 -100 -100 -100 -100 -100   -50
[6,] -100 -100 -100 -100 -100 -100 -100 -100 -100  -100
[7,] -100 -100 -100 -100 -100 -100 -100 -100 -100  -100
[8,] -100 -100 -100 -100 -100 -100 -100 -100 -100  -100
[9,] -100 -100 -100 -100 -100 -100 -100 -100 -100  -100
[10,] -100 -100 -100  -50 -100 -100 -100 -100 -100  -100
[11,] -100 -100 -100 -100 -100 -100 -100 -100 -100  -100
[12,] -100 -100 -100 -100 -100 -100 -100 -100 -100  -100
[13,] -100 -100 -100 -100 -100 -100 -100 -100 -100  -100
[14,] -100 -100 -100 -100 -100 -100 -100 -100 -100  -100
[15,] -100 -100 -100 -100 -100 -100 -100 -100 -100  -100
[16,] -100 -100 -100 -100 -100 -100 -100 -100 -100  -100
[17,] -100 -100 -100 -100 -100 -100 -100 -100 -100  -100
[18,] -100 -100 -100 -100 -100 -100 -100 -100 -100  -100
[19,] -100 -100 -100000000  -100
[20,] -100 -100 -100 -100 -100 -100 -100 -100 -100  -100
The result matrix should look like
-100 0 -50
[1]   20
[2]   20
[3]   20
[4]   17
[5]   18
[6]   18
[7]   18  and so on
[8]
[9]
[10]

How can I do this in R ?
Tom

```
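
A minimal sketch of the `table(col(mat), mat)` idea applied to the data in the question. The matrix is rebuilt here by hand, assuming that the collapsed rows 1 and 19 held zeros in columns 4 to 9 (which agrees with the expected counts of 17 and 18); the names `dat` and `counts` are illustrative only.

```r
# Rebuild the 20 x 10 matrix from the question: all -100, with a few exceptions.
# Assumption: rows 1 and 19 hold zeros in columns 4 to 9.
dat <- matrix(-100, nrow = 20, ncol = 10)
dat[c(1, 19), 4:9] <- 0
dat[10, 4] <- -50
dat[5, 10] <- -50

# col(dat) labels every cell with its column number, so cross-tabulating it
# against the cell values gives one row per column and one column per value.
counts <- table(column = col(dat), value = dat)
counts
```

Values that never occur in a given column still get a cell (with count 0), because the set of values is taken from the whole matrix rather than column by column.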

### Re: [R] ordering a data.frame by average rank of multiple columns

```Try this:

positions <- order(ranks)

On 8/10/07, Tom.O [EMAIL PROTECTED] wrote:

Hi

I have run into a problem and I wonder whether anyone has a smart way of
doing this.

For example, I have this data frame for 5 different test groups:

Res1 <- c(1,5,4,-0.5,3)
Res2 <- c(-1,8,2,0,3)
Mean <- c(0.5,1,1.5,-.5,2)
MyFrame <- data.frame(Res1,Res2,Mean,row.names=c("G1","G2","G3","G4","G5"))

where the first two columns are the results of two different tests, the
third column is the mean of the group.

I want to order this data.frame by the combined rank of Res1 and Res2, but
with weights assigned according to the importance of each column. Let's assume
that Res1 is twice as important and lower values rank better.

MyRanks <- data.frame(Rank1 = rank(MyFrame[, "Res1"]),
                      Rank2 = rank(MyFrame[, "Res2"]),
                      CombR = 2 * rank(MyFrame[, "Res1"]) + rank(MyFrame[, "Res2"]),
                      row.names = c("G1", "G2", "G3", "G4", "G5"))

   Rank1 Rank2 CombR
G1     2     1     5
G2     5     5    15
G3     4     3    11
G4     1     2     4
G5     3     4    10

The rank of the combined score is 2,5,4,1,3, but to sort MyFrame
in that order I need to enter this vector of positions: c(4,1,5,3,2). Does
anyone have a smart way of converting ranks to positions?

Tom

```
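
The `order()` suggestion can be sketched end to end on the data from the question. This is only an illustration under the weighting described there (Res1 counts twice, lower values rank better); the names `CombR`, `positions` and `sorted` are chosen here for the example.

```r
Res1 <- c(1, 5, 4, -0.5, 3)
Res2 <- c(-1, 8, 2, 0, 3)
Mean <- c(0.5, 1, 1.5, -0.5, 2)
MyFrame <- data.frame(Res1, Res2, Mean,
                      row.names = c("G1", "G2", "G3", "G4", "G5"))

# Weighted combined rank: Res1 counts twice, lower values rank better.
CombR <- 2 * rank(MyFrame$Res1) + rank(MyFrame$Res2)

# order() returns the row positions that sort CombR in ascending order,
# which is exactly the "ranks to positions" conversion asked for.
positions <- order(CombR)
sorted <- MyFrame[positions, ]
sorted
```

Since `order()` works on the combined score directly, there is no need to compute the rank of `CombR` first.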

### Re: [R] [Fwd: Re: How to apply functions over rows of multiple matrices]

```1. Matrices are stored column-wise, so R is better at column-wise operations
than row-wise ones.

2. Here is one way to do it (although I am not sure it's better than the
index approach):

row.apply <- function(f, a, b)
  t(mapply(f, as.data.frame(t(a)), as.data.frame(t(b))))

3. The code for the example in this post could be simplified to:

first.1 <- apply(cbind(goldstandard, 1), 1, which.max)
ifelse(col(newtest) > first.1, NA, newtest)

4. Given that neither example inherently needed row-by-row operations,
I wonder whether that is the wrong generalization in the first place.

On 8/10/07, Johannes Hüsing [EMAIL PROTECTED] wrote:
[Apologies to Gabor, who I sent a personal copy of the reply
erroneously instead of posting to List directly]

[...]
Perhaps what you really intend is to
take the average over those elements in each row of the first matrix
which correspond to 1's in the corresponding row of the second.
In that case it's just:

rowSums(newtest * goldstandard) / rowSums(goldstandard)

Thank you for clearing up my thoughts about the particular example.
My question was a bit more general though, as I have different
functions which are applied row-wise to multiple matrices. One
example sets all values of a row of matrix A to NA after the
first occurrence of TRUE in matrix B.

fillfrom <- function(applvec, testvec=NULL) {
  if (is.null(testvec)) testvec <- applvec
  if (length(testvec) != length(applvec)) {
    stop("applvec and testvec have to be of same length!")
  } else if(any(testvec, na.rm=TRUE)) {
    applvec[min(which(testvec)) : length(applvec)] <- NA
  }
  applvec
}

fillafter <- function(applvec, testvec=NULL) {
  if (is.null(testvec)) testvec <- applvec
  fillfrom(applvec, c(FALSE, testvec[-length(testvec)]))
}

numtest <- 6
numsubj <- 20

newtest <- array(rbinom(numtest*numsubj, 1, .5),
                 dim=c(numsubj, numtest))
goldstandard <- array(rbinom(numtest*numsubj, 1, .5),
                      dim=c(numsubj, numtest))

newtest.NA <- t(sapply(1:nrow(newtest), function(i) {
  fillafter(newtest[i,], goldstandard[i,]==1)}))

My general question is whether R provides some syntactic sugar
for the awkward sapply(1:nrow(A)) expression. Maybe in this
case there is also a way to bypass the apply mechanism, and
my way of thinking about the problem has to be adapted. But
as *apply calls abound in R, I feel this is a standard
way of dealing with vectors and matrices.

```
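
To see how the three approaches in this thread relate, here is a hedged sketch that runs them side by side on simulated data. The helper `fill_after_row` and the seed are assumptions made for this illustration (a compact stand-in for the poster's `fillafter`), not code from the thread.

```r
set.seed(1)  # arbitrary seed, only so the sketch is reproducible
numtest <- 6
numsubj <- 20
newtest <- array(rbinom(numtest * numsubj, 1, 0.5), dim = c(numsubj, numtest))
goldstandard <- array(rbinom(numtest * numsubj, 1, 0.5), dim = c(numsubj, numtest))

# Hypothetical helper: set everything after the first TRUE in `test` to NA.
fill_after_row <- function(x, test) {
  first <- which(test)[1]  # NA when the row contains no TRUE
  if (!is.na(first) && first < length(x)) x[(first + 1):length(x)] <- NA
  x
}

# (a) Explicit loop over row indices, as in the question.
by_row <- t(sapply(seq_len(nrow(newtest)), function(i)
  fill_after_row(newtest[i, ], goldstandard[i, ] == 1)))

# (b) The mapply-based row.apply from the reply; unname() drops the
#     V1, V2, ... labels that as.data.frame() introduces.
row.apply <- function(f, a, b)
  t(mapply(f, as.data.frame(t(a)), as.data.frame(t(b))))
via_mapply <- unname(row.apply(fill_after_row, newtest, goldstandard == 1))

# (c) Vectorised version: the appended column of 1s guarantees that
#     which.max() finds a 1 in every row, so rows without a 1 stay untouched.
first.1 <- apply(cbind(goldstandard, 1), 1, which.max)
vectorised <- ifelse(col(newtest) > first.1, NA, newtest)

all.equal(by_row, vectorised)
all.equal(via_mapply, vectorised)
```

Variant (c) avoids calling an R function once per row, which is the point of the column-wise remark in the reply.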

### Re: [R] Countvariable for id by date

```Try this:

Lines <- "id;dg1;dg2;date;
1;F28;;1997-11-04;
1;F20;F702;1998-11-09;
1;F20;;1997-12-03;
1;F208;;2001-03-18;
2;F32;;1999-03-07;
2;F29;F32;2000-01-06;
2;F32;;2003-07-05;
2;F323;F2800;2000-02-05;"

# replace textConnection(Lines) with actual file name
# the trailing ";" on each line creates an empty fifth column, dropped via "NULL"
DF <- read.csv2(textConnection(Lines), as.is = TRUE,
  colClasses = c("numeric", "character", "character", "Date", "NULL"))

# rank the matching rows of one id by date; non-matching rows stay NA
rk <- function(x, pat) {
  z <- regexpr(pat, x$dg1) > 0 | regexpr(pat, x$dg2) > 0
  rank(ifelse(z, x$date, NA), na.last = "keep")
}

DF$countF20 <- unlist(by(DF, DF$id, rk, pat = "^F20"))
DF$countF2129 <- unlist(by(DF, DF$id, rk, pat = "^F2[1-9]"))
DF

On 8/9/07, David Gyllenberg [EMAIL PROTECTED] wrote:
Best R-users,

Here's a newbie question. I have tried to find an answer to this via
help and the ave(x,factor(),FUN=function(y)  rank (z,tie='first')-function,
but without success.

I have a dataframe (~8000 observations, register data) with four
columns of interest: id, dg1, dg2 and date (YYYY-MM-DD):

id;dg1;dg2;date;
1;F28;;1997-11-04;
1;F20;F702;1998-11-09;
1;F20;;1997-12-03;
1;F208;;2001-03-18;
2;F32;;1999-03-07;
2;F29;F32;2000-01-06;
2;F32;;2003-07-05;
2;F323;F2800;2000-02-05;
...

I would like to have two additional columns:
1. countF20: a count variable giving the observation's position in date
order within the id, if it fulfils the following logical expression:
dg1 = F20* OR dg2 = F20*,
where * means F201,F202... F2001,F2002...F20001,F20002...
2. countF2129: another count variable giving the observation's position in
date order within the id, if it fulfils the following logical expression:
dg1 = F21*-F29* OR dg2 = F21*-F29*,
where F21*-F29* means F21*, F22*...F29* and
where * means F211,F212... F2101,F2102...F21001,F21002...

... so the dataframe would look like this, where 1 is the first
observation for the id with the right condition, 2 is the second etc.:

id;dg1;dg2;date;countF20;countF2129;
1;F28;;1997-11-04;;1;
1;F20;F702;1998-11-09;2;;
1;F20;;1997-12-03;1;;
1;F208;;2001-03-18;3;;
2;F32;;1999-03-07;;;
2;F29;F32;2000-01-06;;1;
2;F32;;2003-07-05;;;
2;F323;F2800;2000-02-05;;2;
...

Do you know a convenient way to create this kind of count variable?

/ David (david.gyllenberg at yahoo.com)
