### [R] calculating dissimilarities in R

```Dear All,
Ive got a statistical question on calculating
dissimilarities in R.
I want to calculate the different types of dissimilarities
on the flower dataset found in the package
cluster. Flower is a data frame with 18 observations
on 8 variables. Variable 1 and 2 are binary, variable 3 is
asymmetric binary, variable 4 is nominal, variable 5 and 6
are ordered and variable 7 and 8 are interval scaled.

Commands to load the dataset in R.
library(cluster)
data(flower)
flower

What are the different types of dissimilarities that can be
calculated on such a dataset?
Do I need to group the variables by type first, i.e. put all the
binary variables together and then run the calculation?  Should I use
a dissimilarity index such as Jaccard, or a function such as daisy?

Many thanks,

Elvina Payet (MSc)
University of La Reunion

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
and provide commented, minimal, self-contained, reproducible code.

```
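
A possible starting point for this question, offered as a minimal sketch rather than a definitive answer: `daisy()` from the cluster package accepts mixed variable types directly, so the columns do not have to be grouped by hand. For mixed data it uses Gower's coefficient, and the `type` argument can flag column 3 as asymmetric binary. How the ordered columns should be treated is not stated in the message, so the defaults are kept here as an assumption.

```r
library(cluster)
data(flower)

# Mixed-type columns make daisy() use Gower's coefficient;
# type = list(asymm = 3) treats column 3 (V3) as asymmetric binary.
d <- daisy(flower, type = list(asymm = 3))

summary(d)                        # overview of the pairwise dissimilarities
round(as.matrix(d)[1:4, 1:4], 3)  # top-left corner of the 18 x 18 matrix
```

The resulting object can be passed on to functions such as `pam()` or `agnes()`, or converted with `as.matrix()` for inspection.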

### Re: [R] Sort problem with merge (again)

```On Mon, 25 Sep 2006, Bruce LaZerte wrote:

# R version 2.3.1 (2006-06-01) Debian Linux testing

# Is the following behaviour a bug, feature or just a lack of
# understanding on my part? I see that this was discussed here
# last March with no apparent resolution.

Reference?  It is the third alternative.  A factor is sorted by its codes:
consider

x <- factor(1:3, levels=as.character(3:1))
x
[1] 1 2 3
Levels: 3 2 1
sort(x)
[1] 3 2 1
Levels: 3 2 1

and that is what is happening here: for your example the levels of df$Date
are

levels(df$Date)
[1] "1970-04-04" "1970-08-11" "1970-10-18" "1970-06-04" "1970-08-18"

so the result is sorted correctly.

If you want to sort a character column in lexicographic order, don't make
it into a factor. Similarly for a date column: use class Date.

d <- as.factor(c("1970-04-04","1970-08-11","1970-10-18"))
x <- c(9,10,11)
ch <- data.frame(Date=d,X=x)

d <- as.factor(c("1970-06-04","1970-08-11","1970-08-18"))
y <- c(109,110,111)
sp <- data.frame(Date=d,Y=y)

df <- merge(ch,sp,all=TRUE,by="Date")
# the rows with dates missing all ch vars are tacked on the end.
# the rows with dates missing all sp vars are sorted in with
# the row with a date with vars from both ch and sp
# is.ordered(df$Date) returns FALSE

# The rows of df are not sorted as they should be as sort=TRUE
# is the default. Adding sort=TRUE does nothing.
# So try this:
# dd <- df[order(df$Date),]
# But that doesn't work.
# Nor does sort(df$Date)
# But sort(as.vector(df$Date)) does work.
# As does order(as.vector(df$Date)), so this works:
dd <- df[order(as.vector(df$Date)),]
# ?

--
Brian D. Ripley,  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel:  +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UKFax:  +44 1865 272595

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
and provide commented, minimal, self-contained, reproducible code.

```
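
The advice to use class Date can be sketched as follows. This is only an illustration built on the example data from the thread: the Date column is converted after merging and the rows are then ordered explicitly, so the result does not depend on how the factor levels happen to be arranged.

```r
d1 <- factor(c("1970-04-04", "1970-08-11", "1970-10-18"))
ch <- data.frame(Date = d1, X = c(9, 10, 11))

d2 <- factor(c("1970-06-04", "1970-08-11", "1970-08-18"))
sp <- data.frame(Date = d2, Y = c(109, 110, 111))

df <- merge(ch, sp, all = TRUE, by = "Date")

# Convert the factor to class Date, then order chronologically
df$Date <- as.Date(as.character(df$Date))
dd <- df[order(df$Date), ]
dd
```

Creating the Date columns with `as.Date()` before calling `merge()` would avoid the factor altogether.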

### [R] warning message in nlm

```Dear R-users,

I am trying to find the MLEs for a loglikelihood function (loglikcs39) and
tried using both optim and nlm.

fredcs39 <- function(b1,b2,x){return(exp(b1+b2*x))}
loglikcs39 <- function(theta,len){
sum(mcs39[1:len]*fredcs39(theta[1],theta[2],c(8:(7+len))) - pcs39[1:len] *
log(fredcs39(theta[1],theta[2],c(8:(7+len)))))
}
theta.start <- c(0.1,0.1)

1. The output from using optim is as follows
--

optcs39 <- optim(theta.start,loglikcs39,len=120,method="BFGS")
optcs39
$par
[1] -1.27795226 -0.03626846

$value
[1] 7470.551

$counts
function gradient
     133       23

$convergence
[1] 0

$message
NULL

2. The output from using nlm is as follows
---

outcs39 <- nlm(loglikcs39,theta.start,len=120)
Warning messages:
1: NA/Inf replaced by maximum positive value
2: NA/Inf replaced by maximum positive value
3: NA/Inf replaced by maximum positive value
4: NA/Inf replaced by maximum positive value
5: NA/Inf replaced by maximum positive value
6: NA/Inf replaced by maximum positive value
7: NA/Inf replaced by maximum positive value
outcs39
$minimum
[1] 7470.551

$estimate
[1] -1.27817854 -0.03626027

$gradient
[1] -8.933577e-06 -1.460512e-04

$code
[1] 1

$iterations
[1] 40

As you can see, the values obtained from the two functions are very
similar.  What puzzles me is the warning message that I got from
nlm. Could anyone please shed some light on how this warning message comes
about and whether it is a cause for concern?

singyee

```
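
One plausible reading of the warning, given as a sketch and not as a diagnosis of the poster's run: during its search nlm may try parameter values at which the objective is not finite, for example when `exp()` overflows. The data below (`m`, `p`) are hypothetical stand-ins for `mcs39` and `pcs39`, which are not given in the message.

```r
fred <- function(b1, b2, x) exp(b1 + b2 * x)

x <- 8:127                      # same covariate range as c(8:(7+len)) with len = 120
m <- rep(1, length(x))          # hypothetical stand-in for mcs39
p <- rep(1, length(x))          # hypothetical stand-in for pcs39

negll <- function(theta) {
  mu <- fred(theta[1], theta[2], x)
  sum(m * mu - p * log(mu))
}

at_start <- negll(c(0.1, 0.1))  # finite at the starting values
far_away <- negll(c(10, 10))    # exp() overflows, giving Inf - Inf

is.finite(at_start)
is.finite(far_away)
```

If the final `code` and gradient look reasonable and optim agrees, as in the output above, such warnings from intermediate steps are often harmless, but that is a judgment to make for each fit.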

### [R] lapply, plot and additional arguments

```Dear all

Hopefully somebody will know the answer.

I have some list

x <- data.frame(a = 1:9, beta = exp(-4:4), logic = rep(c(TRUE,FALSE),
c(5,4)))
x.l <- split(x, x$logic)
plot(x$a, x$beta)

and I want to plot lines colour coded according to the logic variable

lapply(x.l, function(x, ...) lines(x$a, x$beta, col=1:2))
lapply(x.l, function(x,...) lines(x$a,x$beta), col=1:2)
lapply(x.l, function(x,...) lines(x$a,x$beta, ...), col=1:2)

Well, lapply seems to ignore my best attempts to persuade it to use
different colours for each part of x.l list.

Does anybody know how to code different colours when using lapply for
such plotting?

At present time I use a loop but maybe lapply could do it too.

Best regards.
Petr

Petr Pikal
[EMAIL PROTECTED]

```
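
In the attempts above, `col=1:2` is handed over whole to every call, so each piece receives both colours instead of one. A minimal sketch of one way around this is to iterate over the list and a colour vector in parallel with `mapply()`. The colour names are an arbitrary choice for illustration.

```r
x <- data.frame(a = 1:9, beta = exp(-4:4),
                logic = rep(c(TRUE, FALSE), c(5, 4)))
x.l <- split(x, x$logic)
cols <- c("red", "blue")   # one colour per list element (arbitrary)

pdf(NULL)                  # null device so the sketch runs anywhere; drop this line to see the plot
plot(x$a, x$beta, type = "n")

# mapply() pairs the i-th data frame with the i-th colour
invisible(mapply(function(d, col) lines(d$a, d$beta, col = col), x.l, cols))
invisible(dev.off())
```

An equivalent alternative is `lapply(seq_along(x.l), function(i) lines(x.l[[i]]$a, x.l[[i]]$beta, col = cols[i]))`.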

### [R] printing a variable name in a for loop

```Hello,

How do you print a variable name in a for loop?

I'm trying to construct a csv file that looks like this:

Hello, variable1, value_of_variable1, World,
Hello, variable2, value_of_variable2, World,
Hello, variable3, value_of_variable3, World,

Using this:

for (variable in list(variable1, variable2, variable3)){

cat("Hello,", ???variable???, variable, ", World,")
}

This works fine if I'm trying to print the VALUE of variable, but I want to
print the NAME of variable as well.

Thanks,
Suzi

```

### [R] [R-pkgs] the IPSUR package

```Dear useRs,

We are pleased to announce the preliminary release of the IPSUR package.

The primary audience was originally envisioned to be upper division
undergraduate mathematics/statistics/engineering majors, but other useRs may
find this material useful.

In a nutshell, this package slightly modifies and adds selected
functionality to the R Commander by John Fox.  The changes were meant to
customize Rcmdr for our Statistics classes, populated for the most part by
the audience above.  Some clever functions written by John Verzani were
translated to IPSUR from UsingR.

Downloads for the package (while the CRAN submission is pending) are at

http://www.cc.ysu.edu/~gjkerns/IPSUR/package/index.htm

Check out the Features page to see what the package offers.

http://www.cc.ysu.edu/~gjkerns/IPSUR/package/features.htm

Full credit must be given to John Fox, together with his diverse team of
dedicated contributors.  Indeed, without all of their countless hours of
effort the IPSUR package would not be possible.  Kudos to them for providing
excellent software to the R community.

Cheers,
Jay

***
G. Jay Kerns, Ph.D.
Department of Mathematics & Statistics
Youngstown State University
Youngstown, OH 44555-0002 USA
Office: 1035 Cushwa Hall
Phone: (330) 941-3310 Office (voice mail)
-3302 Department
-3170 FAX
E-mail: [EMAIL PROTECTED]
http://www.cc.ysu.edu/~gjkerns/

___
R-packages mailing list
R-packages@stat.math.ethz.ch
https://stat.ethz.ch/mailman/listinfo/r-packages

```

### [R] Different results in agnes and hclust

```Hello to everybody,

I have a question regarding the results obtained from the hclust and
agnes functions using the Ward algorithm, because they seem to differ from
each other. I also ran a cluster analysis using the Ward algorithm in Matlab
and obtained the same results as from agnes.
I'm using the pvclust package in order to confirm the clustering results,
which internally uses the hclust function. Therefore I'm not too sure what
to do with the results. This problem doesn't appear when using the average
algorithm.

Regards
Robert Rein

```
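
One possible explanation, stated with some caution: the help page for `hclust` in current R versions notes that the historical Ward option (now called "ward.D") only implements Ward's criterion when given squared distances, and that `agnes(*, method = "ward")` corresponds to `hclust(*, "ward.D2")`. The "ward.D2" option did not exist when this message was posted, so the sketch below assumes a recent R, and the simulated data are purely illustrative.

```r
library(cluster)

set.seed(42)
X <- matrix(rnorm(40), ncol = 2)   # 20 simulated points, illustrative only
d <- dist(X)

h_d2 <- hclust(d, method = "ward.D2")   # Ward's criterion on unsquared distances
ag   <- agnes(d, method = "ward")

# Cross-tabulate the 3-cluster partitions of the two trees
table(hclust = cutree(h_d2, k = 3),
      agnes  = cutree(as.hclust(ag), k = 3))
```

A cross-table with a single non-zero entry in each row and column indicates that the two partitions coincide.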

### [R] rpart

```Dear r-help-list:

If I use the rpart method like

cfit <- rpart(y~.,data=data,...),

what kind of tree is stored in cfit?
Is it right that this tree is not pruned at all, that it is the full tree?

If so, it's up to me to choose a subtree by using the printcp method.
In the technical report from Atkinson and Therneau, "An Introduction to
recursive partitioning using the rpart routines" from 2000, one can see the
following table on page 15:

     CP  nsplit  rel error  xerror   xstd
1  0.105      0    1.0      1.       0.108
2  0.056      3    0.68519  1.1852   0.111
3  0.028      4    0.62963  1.0556   0.109
4  0.574      6    0.57407  1.0556   0.109
5  0.100      7    0.6      1.0556   0.109

Some lines below it says "We see that the best tree has 5 terminal nodes (4
splits)". Why is that, if the xerror is lowest for the tree consisting only
of the root?

Thank you very much for your help

Henri
```

### [R] About the display of matrix

```For a matrix A, I don't want to display the zero elements in it. How can I
do that?

```
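
A minimal sketch of one way to do this, assuming a small numeric matrix `A` invented for illustration: format the matrix as character, blank out the positions where the original is zero, and print without quotes.

```r
A <- matrix(c(1, 0, 0, 2, 0, 3), nrow = 2)

shown <- format(A)          # character matrix with the same dimensions
shown[A == 0] <- ""         # blank the zero entries
print(shown, quote = FALSE)
```

The matrix `A` itself is left unchanged; only its printed representation is altered.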

### [R] Vuong test implementation in R

```Dear All,
I would like to know if the Vuong test (Vuong; Econometrica, 1989) to compare
two non-nested regression models has been implemented in R.
mirko
```

### Re: [R] Creating Movies with R

```J.R. Lockwood wrote:
An alternative that I've used a few times is the jpeg() function to
create the sequence of images, and then converting these to an mpeg
movie using mencoder distributed with mplayer.  This works on both
windows and linux.  I have a pretty self-contained example file
written up that I can send to anyone who is interested.  Oddly, the
most challenging part was creating a sequence of file names that would
be correctly ordered - for this I use:

lex <- function(N){
## produce vector of N lexicographically ordered strings
ndig <- nchar(N)
substr(formatC((1:N)/10^ndig,digits=ndig,format="f"),3,1000)
}

Hi,

Or you could have asked the `filename` argument of `jpeg` to do the job
for you, i.e.:
filename = "something%04d", as documented in ?jpeg

jpeg(filename = "something%04d.jpg", onefile = FALSE)
for(i in 1:10){
plot(i)
}
dev.off()  # close the device so the last frame is written

Cheers,

Romain

PS : For those who have ideas of movies, I once started a website R
Movies Gallery as a little sister of R Graph(ics) Gallery ... you may
want to send me code to populate the website, or populate the wiki with
such examples. The idea is not to produce pretty science fiction type
movies with R, but use R abilities to create some useful animation that
could highlight some statistical concepts such as the CLT, ...
Plus, there's a place where grid could show its full power.

On Fri, 22 Sep 2006, Jeffrey Horner wrote:

Date: Fri, 22 Sep 2006 13:46:52 -0500
From: Jeffrey Horner [EMAIL PROTECTED]
To: Lorenzo Isella [EMAIL PROTECTED], r-help@stat.math.ethz.ch
Subject: Re: [R] Creating Movies with R

If you run R on Linux, then you can run the ImageMagick command called
convert. I place this in an R function to use a sequence of PNG plots as
movie frames:

make.mov.plotcol3d <- function(){
system("convert -delay 10 plotcol3d*.png plotcol3d.mpg")
}

Examples can be seen here:

http://biostat.mc.vanderbilt.edu/JrhRgbColorSpace

Cheers,

Jeff

Lorenzo Isella wrote:

Dear All,

I'd like to know if it is possible to create animations with R.
To be specific, I attach a code I am using for my research to plot
some analytical results in 3D using the lattice package. It is not
necessary to go through the code.
Simply, it plots some 3D density profiles at two different times
selected by the user.
I wonder if it is possible to use the data generated for different
times to create something like an .avi file.

Here is the script:

rm(list=ls())
library(lattice)

# I start defining the analytical functions needed to get the density
# as a function of time

expect_position <- function(t,lam1,lam2,pos_ini,vel_ini)
{1/(lam1-lam2)*(lam1*exp(lam2*t)-lam2*exp(lam1*t))*pos_ini+
1/(lam1-lam2)*(exp(lam1*t)-exp(lam2*t))*vel_ini
}

sigma_pos <- function(t,q,lam1,lam2)
{
q/(lam1-lam2)^2*(
(exp(2*lam1*t)-1)/(2*lam1)-2/(lam1+lam2)*(exp(lam1*t+lam2*t)-1) +
(exp(2*lam2*t)-1)/(2*lam2) )
}

rho_x <- function(x,expect_position,sigma_pos)
{
1/sqrt(2*pi*sigma_pos)*exp(-1/2*(x-expect_position)^2/sigma_pos)
}

# Now the physical parameters
tau <- 0.1
beta <- 1/tau
St <- tau ### since I am in dimensionless units and tau is already in
### units of 1/|alpha|
D=2e-2
q <- 2*beta^2*D
### Now the grid in space and time
time <- 5  # time extent
tsteps <- 501 # time steps
newtime <- seq(0,time,len=tsteps)
# Now the things specific for the dynamics along x
lam1 <- -beta/2*(1+sqrt(1+4*St))
lam2 <- -beta/2*(1-sqrt(1+4*St))
xmin <- -0.5
xmax <- 0.5
x0 <- 0.1
vx0 <- x0
nx <- 101 ## grid intervals along x
newx <- seq(xmin,xmax,len=nx) # grid along x

# M1 <- do.call(g, c(list(x = newx), mypar))

mypar <- c(q,lam1,lam2)
sig_xx <- do.call(sigma_pos,c(list(t=newtime),mypar))
mypar <- c(lam1,lam2,x0,vx0)
exp_x <- do.call(expect_position,c(list(t=newtime),mypar))

#rho_x <- function(x,expect_position,sigma_pos)

#NB: at t=0, the density blows up, since I have a delta as the initial
# state!
# At any t>0, instead, the result is finite.
# For this reason I now redefine time by getting rid of the instant t=0
# to work out
# the density

rho_x_t <- matrix(ncol=nx,nrow=tsteps-1)
for (i in 2:tsteps)
{mypar <- c(exp_x[i],sig_xx[i])
myrho_x <- do.call(rho_x,c(list(x=newx),mypar))
rho_x_t[ i-1, ] <- myrho_x
}

### Now I also define a scaled density

rho_x_t_scaled <- matrix(ncol=nx,nrow=tsteps-1)
for (i in 2:tsteps)
{mypar <- c(exp_x[i],sig_xx[i])
myrho_x <- do.call(rho_x,c(list(x=newx),mypar))
rho_x_t_scaled[ i-1, ] <- myrho_x/max(myrho_x)
}

### Now I deal with the dynamics along y

lam1 <- -beta/2*(1+sqrt(1-4*St))
lam2 <- -beta/2*(1-sqrt(1-4*St))
ymin <- 0
ymax <- 1
y0 <- ymax
vy0 <- -y0

mypar <- c(q,lam1,lam2)
sig_yy <- do.call(sigma_pos,c(list(t=newtime),mypar))
mypar <- c(lam1,lam2,y0,vy0)
exp_y <- do.call(expect_position,c(list(t=newtime),mypar))

# now I introduce the function giving the density along y: this has to
# include the BC of zero
# density at wall

rho_y <- function(y,expect_position,sigma_pos)
{
```
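
The file-naming problem mentioned earlier in this thread (frames must sort in the right order before being handed to an encoder) can also be handled with zero-padded counters. The helper below is a hypothetical illustration, not part of any package; its name and arguments are assumptions.

```r
# Hypothetical helper: N zero-padded file names that sort in frame order
frame_names <- function(N, prefix = "frame", ext = "png") {
  width <- nchar(as.character(N))                      # digits needed for the largest index
  idx <- formatC(seq_len(N), width = width, flag = "0")
  paste0(prefix, idx, ".", ext)
}

nm <- frame_names(12)
head(nm, 3)
```

The same effect is obtained by giving a graphics device a pattern such as "frame%03d.png" as its file name, as suggested above for `jpeg`.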

### [R] package usage statistics.

```Dear useRs,

Is it possible to get the R package usage statistics?
That is, does R contain any tools to estimate which packages were
used and how often?

I am going to temporarily change workplaces and am packing the data
and their processing scripts from my computer in order to continue my
projects.

During my time at the current workplace I have periodically installed
new R packages, investigated them, and then either used them in my work
or not, depending on their functionality.

Now I am thinking about writing an R script which will automatically
So, I need a list of packages I have used in R.

The first solution in my head is to scan all disks for R
scripts and .Rhistory files, extract calls for library from them
and save names of loaded packages.

I would appreciate other variants.

---
Best regards,

```
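
The scanning idea described in the message could be sketched roughly as follows. The function name `find_used_packages` is hypothetical, and the regular expression only catches plain `library(pkg)` and `require(pkg)` calls, so this is an approximation rather than a complete usage count.

```r
# Hypothetical helper: count library()/require() calls in a set of files
find_used_packages <- function(files) {
  txt <- unlist(lapply(files, readLines, warn = FALSE))
  pat <- "(library|require)\\([[:space:]]*[\"']?[A-Za-z][A-Za-z0-9.]*"
  calls <- unlist(regmatches(txt, gregexpr(pat, txt)))
  pkgs <- sub("^(library|require)\\([[:space:]]*[\"']?", "", calls)
  sort(table(pkgs), decreasing = TRUE)
}

# Small demonstration on a temporary script
f <- tempfile(fileext = ".R")
writeLines(c("library(cluster)", "x <- 1", "require(\"MASS\")", "library(cluster)"), f)
res <- find_used_packages(f)
res
```

In practice `files` could come from `list.files(path, pattern = "\\.(R|Rhistory)$", recursive = TRUE, full.names = TRUE, all.files = TRUE)`.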

### Re: [R] venn diagram with more than three vectors

```On Tue, 2006-09-26 at 10:02 +0200, Oosting, J. (PATH) wrote:
I am not aware of existing functions to draw venn diagrams with more
than 3 sets, but you could have a look at
http://en.wikipedia.org/wiki/Venn_diagram to see how these can be
constructed.

Jan Oosting

Package vegan has a function (varpart) and plot method that will draw
venn diagrams with up to 4 sets. It works on results from redundancy

HTH

G

-Original Message-
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] On Behalf Of Pan Zheng
Sent: dinsdag 26 september 2006 2:09
To: r-help@stat.math.ethz.ch
Subject: [R] venn diagram with more than three vectors

Hi,

I am using the venn diagram function in AMDA to plot Venn diagrams. But
it seems this function can only plot 3 or fewer vectors. Is there
a way to plot a Venn diagram with more than 3 vectors?

Thanks.

Z

--
%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%
*Note new Address and Fax and Telephone numbers from 10th April 2006*
%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%
Gavin Simpson [t] +44 (0)20 7679 0522
ECRC  [f] +44 (0)20 7679 0565
UCL Department of Geography
Pearson Building  [e] gavin.simpsonATNOSPAMucl.ac.uk
Gower Street
London, UK[w] http://www.ucl.ac.uk/~ucfagls/cv/
WC1E 6BT  [w] http://www.ucl.ac.uk/~ucfagls/
%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%

```
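
Whatever drawing function is used, the numbers a four-set Venn diagram displays are the counts of elements in each combination of sets. A minimal base-R sketch of that bookkeeping, with made-up sets, might look like this:

```r
# Made-up example sets, for illustration only
sets <- list(A = c("g1", "g2", "g3"),
             B = c("g2", "g3", "g4"),
             C = c("g3", "g5"),
             D = c("g1", "g3", "g6"))

universe <- sort(unique(unlist(sets)))

# Logical membership matrix: one row per element, one column per set
membership <- sapply(sets, function(s) universe %in% s)
rownames(membership) <- universe

# Label each element by the combination of sets it belongs to, then count
pattern <- apply(membership, 1, function(r) paste(names(sets)[r], collapse = "&"))
region_counts <- table(pattern)
region_counts
```

These counts can then be placed on a diagram drawn by hand or by a plotting function of one's choice.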

### Re: [R] rpart

```On Mon, 25 Sep 2006, [EMAIL PROTECTED] wrote:

Dear r-help-list:

If I use the rpart method like

cfit <- rpart(y~.,data=data,...),

what kind of tree is stored in cfit?
Is it right that this tree is not pruned at all, that it is the full tree?

It is an rpart object.  This contains both the tree and the instructions
for pruning it at all values of cp: note that cp is also used in deciding
how large a tree to grow.

If so, it's up to me to choose a subtree by using the printcp method.

Or the plotcp method.

In the technical report from Atkinson and Therneau, "An Introduction to
recursive partitioning using the rpart routines" from 2000, one can see
the following table on page 15:

     CP  nsplit  rel error  xerror   xstd
1  0.105      0    1.0      1.       0.108
2  0.056      3    0.68519  1.1852   0.111
3  0.028      4    0.62963  1.0556   0.109
4  0.574      6    0.57407  1.0556   0.109
5  0.100      7    0.6      1.0556   0.109

Some lines below it says "We see that the best tree has 5 terminal nodes
(4 splits)". Why is that, if the xerror is lowest for the tree consisting
only of the root?

There are *two* reports with that name: this seems to be from minitech.ps.
The choice is explained in the rest of that para (the 1-SE rule was used).
My guess is that the authors excluded the root as not being a tree, but

--
Brian D. Ripley,  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel:  +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UKFax:  +44 1865 272595

```
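
The 1-SE rule mentioned in the reply can be sketched as follows: take the smallest tree whose cross-validated error is within one standard error of the minimum. The example uses the `kyphosis` data shipped with rpart purely for illustration, so the numbers differ from the report's table.

```r
library(rpart)

set.seed(1)   # cross-validation is random
fit <- rpart(Kyphosis ~ Age + Number + Start, data = kyphosis)
cp_tab <- fit$cptable

best   <- which.min(cp_tab[, "xerror"])
cutoff <- cp_tab[best, "xerror"] + cp_tab[best, "xstd"]
pick   <- min(which(cp_tab[, "xerror"] <= cutoff))   # smallest tree within 1 SE

pruned <- prune(fit, cp = cp_tab[pick, "CP"])
printcp(pruned)
```

`plotcp(fit)` shows the same information graphically, with a horizontal line at the 1-SE cutoff.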

### [R] Accessing C- source code of R

```Dear list,

I'm looking for the source code of parts of R (e.g. spline).
Does anyone know where I can access it ?

Gunther

```

### Re: [R] venn diagram with more than three vectors

```I am not aware of existing functions to draw venn diagrams with more
than 3 sets, but you could have a look at
http://en.wikipedia.org/wiki/Venn_diagram to see how these can be
constructed.

Jan Oosting

-Original Message-
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] On Behalf Of Pan Zheng
Sent: Tuesday 26 September 2006 2:09
To: r-help@stat.math.ethz.ch
Subject: [R] venn diagram with more than three vectors

Hi,

I am using the venn diagram function in AMDA to plot Venn diagrams. But
it seems this function can only plot 3 or fewer vectors. Is there
a way to plot a Venn diagram with more than 3 vectors?

Thanks.

Z

```

### Re: [R] glmmPQL in 2.3.1

```On Mon, 25 Sep 2006, Justin Rhodes wrote:

Dear R-help,

I recently tried implementing glmmPQL in 2.3.1,

I thought *I* had implemented it: are you talking about my function in
package MASS or your own implementation?

and I discovered a few differences as compared to 2.2.1.

You appear to be talking about contributed packages (MASS, and glmmPQL
also depends on nlme) without giving their version numbers.

I am fitting a regression with fixed and random effects with Gamma error
structure.  First, 2.3.1 gives different estimates than 2.2.1, and
2.3.1, takes more iterations to converge.

We have no idea, given the lack of reproducible example.  glmmPQL does
give the same answers as before for the book examples for which it is
support software.  This may well be due to an underlying change in nlme.

Second, when I try using the anova function it says, 'anova' is not
available for PQL fits, why?  Any help would be greatly appreciated.

Because anova implies you are using an optimization criterion, such as
least squares or maximum likelihood, and so there is something like a
deviance to partition.  It was not used in the book which glmmPQL supports,
but it seems some people were using glmmPQL without reference to that book
so I made a number of their misuses explicit errors.  This *is* in the
NEWS and WHATS.NEWS files for MASS and VR:

- There are anova() and logLik() methods for class glmmPQL to stop
misuse.

--
Brian D. Ripley,  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel:  +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UKFax:  +44 1865 272595

```

### Re: [R] printing a variable name in a for loop

```This would do it:

v1 <- 5
v2 <- 6
v3 <- 7

vns <- paste("v",1:3,sep="")
for (i in 1:length(vns)) cat("Hello", vns[i], get(vns[i]), "World\n", sep=",")

Hello,v1,5,World
Hello,v2,6,World
Hello,v3,7,World

On 24/09/06, Suzi Fei [EMAIL PROTECTED] wrote:
Hello,

How do you print a variable name in a for loop?

I'm trying to construct a csv file that looks like this:

Hello, variable1, value_of_variable1, World,
Hello, variable2, value_of_variable2, World,
Hello, variable3, value_of_variable3, World,

Using this:

for (variable in list(variable1, variable2, variable3)){

cat("Hello,", ???variable???, variable, ", World,")
}

This works fine if I'm trying to print the VALUE of variable, but I want to
print the NAME of variable as well.

Thanks,
Suzi

--
=
David Barron
University of Oxford
Park End Street
Oxford OX1 1HP

```
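
An alternative to `get()`, sketched here under the assumption that the values can be collected in one place: keep them in a named list, so the names travel with the values. The variable names and values are invented for illustration.

```r
vars <- list(variable1 = 5, variable2 = 6, variable3 = 7)

# names(vars) supplies the names, unlist(vars) the values
lines_out <- sprintf("Hello, %s, %s, World,", names(vars), unlist(vars))
cat(lines_out, sep = "\n")
```

The same character vector can be written to a file with `writeLines(lines_out, "out.csv")`.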

### [R] Statistical data and Map-package

```Dear helpeRs,

I'm working with the map-package and came upon a problem which I
couldn't solve. I hope one of you can. If not, this can be seen as a
suggestion for new versions of the package.

I'm trying to create a map of some European countries, filled with
colors corresponding to some values. Let's say I have the following
countries and I assign the following colors (fictional):

country2001 <- c("Austria", "Belgium", "Switzerland",
"Czechoslovakia", "Germany", "Denmark", "Spain", "Finland", "France",
"UK", "Greece", "Hungary", "Ireland", "Israel", "Italy",
"Luxembourg", "Netherlands", "Norway", "Poland", "Portugal",
"Sweden", "Slovenia")
color2001 <- c("green", "yellow", "red", "red", "red", "red", "red",
"red", "green", "red", "red", "red", "red", "red", "red", "red",
"red", "blue", "red", "red", "red", "orange")

I then let the colors and the values correspond using 'match.map',
like this:

match <- match.map("world",country2001)
color <- color2001[match]

And finally I plot the map. It works perfectly fine.

map(database="world", fill=TRUE, col=color)

But as I mentioned, I want to create a map of Europe. So, I use xlim
and ylim to let some parts of the world fall off the map. The syntax
becomes like this:

map(database="world", fill=TRUE, col=color, xlim=c(-25,70), ylim=c(35,71))

Now, a problem arises. The regions on the map are colored by the
vector 'color'. It needs therefore to correspond to the order in
which the polygons are drawn. Since some of the full world-map isn't
drawn this time, the color-vector doesn't correspond anymore. This
results in the coloring of the wrong countries.

Does anybody know of a way to solve this?

Rense Nieuwenhuis
```

### Re: [R] printing a variable name in a for loop

```Suzi Fei wrote:
Hello,

How do you print a variable name in a for loop?

I'm trying to construct a csv file that looks like this:

Hello, variable1, value_of_variable1, World,
Hello, variable2, value_of_variable2, World,
Hello, variable3, value_of_variable3, World,

Using this:

for (variable in list(variable1, variable2, variable3)){

cat("Hello,", ???variable???, variable, ", World,")
}

This works fine if I'm trying to print the VALUE of variable, but I want to
print the NAME of variable as well.

This is a teetering heap of assumptions, but is this what you wanted?

Suzi <- 1
HiYa <- function(x) {
cat("Hello",deparse(substitute(x)),x,"World\n",sep=", ")
}
HiYa(Suzi)

Jim

```

### Re: [R] Statistical data and Map-package

```On Tue, 26 Sep 2006, Rense Nieuwenhuis wrote:

Dear helpeRs,

I'm working with the map-package and came upon a problem which I
couldn't solve. I hope one of you can. If not, this can be seen as a
suggestion for new versions of the package.

I'm trying to create a map of some European countries, filled with
colors corresponding to some values. Let's say I have the following
countries and I assign the following colors (fictional):

country2001 <- c("Austria", "Belgium", "Switzerland",
"Czechoslovakia", "Germany", "Denmark", "Spain", "Finland", "France",
"UK", "Greece", "Hungary", "Ireland", "Israel", "Italy",
"Luxembourg", "Netherlands", "Norway", "Poland", "Portugal",
"Sweden", "Slovenia")
color2001 <- c("green", "yellow", "red", "red", "red", "red", "red",
"red", "green", "red", "red", "red", "red", "red", "red", "red",
"red", "blue", "red", "red", "red", "orange")

I then let the colors and the values correspond using 'match.map',
like this:

match <- match.map("world",country2001)
color <- color2001[match]

And finally I plot the map. It works perfectly fine.

map(database="world", fill=TRUE, col=color)

But as I mentioned, I want to create a map of Europe. So, I use xlim
and ylim to let some parts of the world fall off the map. The syntax
becomes like this:

map(database="world", fill=TRUE, col=color, xlim=c(-25,70), ylim=c(35,71))

Now, a problem arises. The regions on the map are colored by the
vector 'color'. It needs therefore to correspond to the order in
which the polygons are drawn. Since some of the full world-map isn't
drawn this time, the color-vector doesn't correspond anymore. This
results in the coloring of the wrong countries.

Does anybody know of a way to solve this?

Within the maps package:

europe <- map(database="world", fill=TRUE, plot=FALSE,
xlim=c(-25,70),ylim=c(35,71))
match <- match.map(europe,country2001)
color <- color2001[match]
map(database="world", fill=TRUE, col=color, xlim=c(-25,70),ylim=c(35,71))

but I'm afraid the world database precedes the dissolution of the Soviet
Union, Czechoslovakia, and Yugoslavia, and doesn't code Sicily or Sardinia
in Italy, so the result is perhaps not yet what you need:

europe$names[grep("Sicily", europe$names)] <- "Italy:Sicily"
europe$names[grep("Sardinia", europe$names)] <- "Italy:Sardinia"
match <- match.map(europe,country2001)
color <- color2001[match]
map(database="world", fill=TRUE, col=color, xlim=c(-25,70),ylim=c(35,71))

deals with Italy, but you won't get Slovenia. There was a discussion about
this on the R-sig-geo list in March this year starting here:

http://finzi.psych.upenn.edu/R/Rhelp02a/archive/78303.html

or equivalently:

http://article.gmane.org/gmane.comp.lang.r.geo/299

Rense Nieuwenhuis

--
Roger Bivand
Economic Geography Section, Department of Economics, Norwegian School of
Norway. voice: +47 55 95 93 55; fax +47 55 95 95 43
e-mail: [EMAIL PROTECTED]

```
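
The essential step in the fix above is that colours are looked up against the regions that are actually drawn. The base-R sketch below imitates that lookup on a made-up vector of region names; it is only similar in spirit to `match.map`, which works on a real map object.

```r
# Made-up stand-in for the names of the polygons that get drawn
region_names <- c("Norway", "Italy", "Italy:Sicily", "France", "Russia")

country <- c("France", "Italy", "Norway")
colour  <- c("green", "red", "blue")

# Strip any ":subregion" suffix, then look up each drawn region's country
prefix <- sub(":.*$", "", region_names)
region_col <- colour[match(prefix, country)]
region_col
```

Regions without a matching country get `NA`, which `map()` leaves unfilled.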

### Re: [R] Need help with boxplots

```To prevent confusion you might want to use a red dot rather than
a line:

points(1:2, c(mean(a), mean(b)), col = "red")

and perhaps label it since it's non-standard:

text(1:2, c(mean(a), mean(b)), "Mean", pos = 4)

On 9/26/06, laba diena [EMAIL PROTECTED] wrote:
How to add a mean line in the boxplot keeping the median line ?
For example in this:

set.seed(1)
a <- rnorm(10)
b <- rnorm(10)
boxplot(a, b)

```
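
If a line is preferred over a dot, as the original question asked, `segments()` can draw a short horizontal line at each mean. This is a minimal sketch: the half-width of 0.4 is an assumption chosen to roughly match the default box width, and the dashed red style is arbitrary.

```r
set.seed(1)
a <- rnorm(10)
b <- rnorm(10)
means <- c(mean(a), mean(b))

pdf(NULL)   # null device so the sketch runs anywhere; drop this line to see the plot
boxplot(a, b)
# One dashed red segment per box, centred on positions 1 and 2
segments(x0 = 1:2 - 0.4, y0 = means, x1 = 1:2 + 0.4, y1 = means,
         col = "red", lty = 2)
invisible(dev.off())
```

The median line drawn by `boxplot()` is left untouched, so both statistics remain visible.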

### [R] Need help with boxplots

```How to add a mean line in the boxplot keeping the median line ?
For example in this:

set.seed(1)
a <- rnorm(10)
b <- rnorm(10)
boxplot(a, b)

```

### [R] Vectorise a for loop?

```
Hi R guru coders

I wrote a bit of code to add a new column to a topTable data frame,
that is, a list of genes processed using the limma package. I used a for
loop, but I kept feeling there was a better, more vector-oriented
approach. I looked at several commands such as apply and by
but could not find a good way to do it. I have the feeling there is
a command or technique eluding me. (Is there an expr ? value1 : value2
construction in R?)

Can anybody suggest an elegant solution?

Details:

So, the topTable looks like this:

topa1[1:5,c(1,2,3,4)]
          ID    Name GB_accession         M
11195 245828 SIGKEC9     AX135029 -7.670197
10966    107    FHL1       B14446 -5.089926
6287   25744     M90     LL137340 -4.531744
777     2288   VSNL1     LF039555 -4.035472
11310 272294     M98     LL031650  3.866422

I want to add a fold column so it will look like this:

topa1[1:5,c(1,2,3,4,10)]
          ID    Name GB_accession         M      fold
11195 245828 SIGKEC9     AX135029 -7.670197 203.68521
10966    107    FHL1       B14446 -5.089926  34.05810
6287   25744     M90     LL137340 -4.531744  23.13082
777     2288   VSNL1     LF039555 -4.035472  16.39828
11310 272294     M98     LL031650  3.866422  14.58508

The fold value is calculated from the M column, which is a log2 value.
The calculation differs depending on whether the M value is
negative or positive: if the gene is down-regulated, the
reciprocal has to be used to calculate the fold value.

Here is my clunky, non-vectorised code:

# Function to add a fold column to the toptable
ttfold <- function(tt) {
  fold <- NULL
  for (i in 1:length(tt$M)) {
    if (tt$M[i] < 0) {
      # down-regulated: use the reciprocal
      fold[i] <- 1/(2^tt$M[i])
    } else {
      fold[i] <- 2^tt$M[i]
    }
  }
  tt <- cbind(tt, fold)
}

# Add fold column to top tables
topa1 <- ttfold(topa1)

Regards

J

---

John Seers
Institute of Food Research
Norwich Research Park
Colney
Norwich
NR4 7UA

tel +44 (0)1603 251497
fax +44 (0)1603 507723
e-mail [EMAIL PROTECTED] mailto:[EMAIL PROTECTED]

e-disclaimer at http://www.ifr.ac.uk/edisclaimer/
http://www.ifr.ac.uk/edisclaimer/

Web sites:

www.ifr.ac.uk http://www.ifr.ac.uk/
www.foodandhealthnetwork.com http://www.foodandhealthnetwork.com/

```

### Re: [R] package usage statistics. (UPDATE)

```Here is the perl script with some comments

#!/bin/perl -w

use File::Find;
# We use the standard Perl module File::Find.
# Its find() procedure scans the directory tree and puts all package names into the hash.

%pkgs=(base=>-1,      # won't print packages installed by default
       datasets=>-1,
       grDevices=>-1,
       graphics=>-1,
       grid=>-1,
       methods=>-1,
       splines=>-1,
       stats=>-1,
       stats4=>-1,
       tcltk=>-1,
       tools=>-1,
       utils=>-1,
       MASS=>-1
      );

sub wanted {   # this subroutine is used by the File::Find procedure
               # it adds package names to the hash above
  # do nothing if this file doesn't contain R commands
  return if($_!~/\.[Rr]$/ && $_!~/\.[Rr]history$/);

  open IN, "<".$File::Find::name or die("cannot open file $!");

  while(<IN>){
    if(/library\((.*)\)/){   # looking for library(...) calls
      $pkgname=$1;
      # don't do anything if the package directory doesn't exist
      # (simple protection against typos)
      next if(! -d "C:\\Program Files\\R\\library\\$pkgname");
      if(exists $pkgs{$pkgname}) {
        $pkgs{$pkgname}=$pkgs{$pkgname}+1;   # seen before: increment the count
      }else{
        $pkgs{$pkgname}=1;                   # first occurrence
      }
    }
  }
  close(IN);
}

sub getdepends {   # this subroutine resolves the package dependencies
  $pkgname=$_[0];  # its argument is a package name. It finds the packages
                   # the current one depends on and adds them to the hash above
  open IN, "<C:\\Program Files\\R\\library\\$pkgname\\DESCRIPTION" or return;
  #do {print ("cannot open file C:\\Program Files\\R\\library\\$pkgname\\DESCRIPTION\n $!");
  while(<IN>){
    if($_=~/^Imports: (.*)/ || $_=~/^Depends: (.*)/) {
      @deplist=split(/,/,$1);
      for(@deplist) {
        next if(/R \(.*\)/); # exclude dependencies on R version
        s/\s//g;
        if(/(.*)\(.*\)/) {
          $pkgname=$1;
        }else{
          $pkgname=$_;
        }

        if(exists $pkgs{$pkgname}) {
          # leave the basic packages (marked -1) untouched
          $pkgs{$pkgname}=$pkgs{$pkgname}+1 if($pkgs{$pkgname}>0);
        }else{
          $pkgs{$pkgname}=1;
        }
      }
    }
  }
  close(IN);
}

# now the main loop. hope, it is self-describing

print "Searching for R commands...";
find({ wanted => \&wanted, no_chdir => 1 }, '.');
print "done!\n";

print "Now resolving dependencies...";
for $p (keys %pkgs) {
  #print "$p\n";
  getdepends($p);
}
print "done!\n";

open OUT, ">install.pkgs.r" or die("cannot create file install.pkgs.r");

print OUT "install.packages(\n";
foreach(keys %pkgs){
  print OUT "  $_,\n" if($pkgs{$_}>0);
}

close(OUT);

```

### Re: [R] Voung test implementation in R

```
Yes, the pscl package contains that function.

library(pscl)
?vuong

Description

Compares two models fit to the same data that do not nest via Vuong's
non-nested test.

Usage

vuong(m1, m2, digits = getOption("digits"))

On 9/26/06, mirko sanpietrucci [EMAIL PROTECTED] wrote:

Dear All,
I would like to know if the Vuong test (Vuong; Econometrica, 1989) to compare
two non-nested regression models has been implemented in R.
mirko

--

Department of Sociology
Fudan University

```

### Re: [R] putting stuff into bins...

```Federico Calboli [EMAIL PROTECTED] writes:

Hi All,

I have a vector of data and a vector of bin breakpoints, and I want to put my
data in the bins and then extract fanciful information like the mean value of
each bin.

I know I can write my own function, but I would have thought that R should
have somewhere a function that takes as arguments something like (data, breaks,
what to do with the data in the bins). I surely could not find it trawling the
R-help archives though.

If such a function exists I'd be grateful to anyone pointing it out to me.

cut, split+lapply, aggregate, by

--
O__   Peter Dalgaard Øster Farimagsgade 5, Entr.B
c/ /'_ --- Dept. of Biostatistics PO Box 2099, 1014 Cph. K
(*) \(*) -- University of Copenhagen   Denmark  Ph:  (+45) 35327918
~~ - ([EMAIL PROTECTED])  FAX: (+45) 35327907

```
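The four suggestions above (cut, split + lapply, aggregate, by) can be sketched as follows. This is only a minimal illustration under stated assumptions: the data `x` and the breakpoints `breaks` are made up for the example and do not come from the thread.

```r
# Hypothetical example data: 100 values and breakpoints at 0, 2, ..., 10
set.seed(42)
x <- runif(100, min = 0, max = 10)
breaks <- seq(0, 10, by = 2)

# cut() turns each value into a factor level naming the bin it falls into
bins <- cut(x, breaks)

# split + sapply: a list of per-bin vectors, then one summary per bin
m_split <- sapply(split(x, bins), mean)

# aggregate: a data frame with one row per non-empty bin
m_aggr <- aggregate(x, by = list(bin = bins), FUN = mean)

# by: applies a function to the rows of a data frame belonging to each bin
m_by <- by(data.frame(x = x), bins, function(d) mean(d$x))

print(m_split)
```

All three routes give the same per-bin means; they differ mainly in the shape of the result (named vector, data frame, or `by` object).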

### [R] putting stuff into bins...

```Hi All,

I have a vector of data and a vector of bin breakpoints, and I want to put my data
in the bins and then extract fanciful information like the mean value of each
bin.

I know I can write my own function, but I would have thought that R should have
somewhere a function that takes as arguments something like (data, breaks, what
to do with the data in the bins). I surely could not find it trawling the R-help
archives though.

If such a function exists I'd be grateful to anyone pointing it out to me.

Cheers,

Fede

--
Federico C. F. Calboli
Department of Epidemiology and Public Health
Imperial College, St Mary's Campus
Norfolk Place, London W2 1PG

Tel  +44 (0)20 7594 1602 Fax (+44) 020 7594 3193

f.calboli [.a.t] imperial.ac.uk
f.calboli [.a.t] gmail.com

```

### Re: [R] putting stuff into bins...

```Federico Calboli wrote:
Hi All,

I have a vector of data and a vector of bin breakpoints, and I want to put my
data in the bins and then extract fanciful information like the mean value of
each bin.

I know I can write my own function, but I would have thought that R should
have somewhere a function that takes as arguments something like (data, breaks,
what to do with the data in the bins). I surely could not find it trawling the
R-help archives though.

If such a function exists I'd be grateful to anyone pointing it out to me.

Cheers,

Fede

The following should be of help:

bd384 <- c(2.968, 2.097, 1.611, 3.038, 7.921, 5.476, 9.858,
1.397, 0.155, 1.301, 9.054, 1.958, 4.058, 3.918, 2.019, 3.689,
3.081, 4.229, 4.669, 2.274, 1.971, 10.379, 3.391, 2.093,
6.053, 4.196, 2.788, 4.511, 7.3, 5.856, 0.86, 2.093, 0.703,
1.182, 4.114, 2.075, 2.834, 3.698, 6.48, 2.36, 5.249, 5.1,
4.131, 0.02, 1.071, 4.455, 3.676, 2.666, 5.457, 1.046, 1.908,
3.064, 5.392, 8.393, 0.916, 9.665, 5.564, 3.599, 2.723, 2.87,
1.582, 5.453, 4.091, 3.716, 6.156, 2.039)
cut(bd384,0:11)
split(bd384,cut(bd384,0:11))
sapply(split(bd384,cut(bd384,0:11)),mean)

D.Trenkler

--
Dietrich Trenkler c/o Universitaet Osnabrueck
Rolandstr. 8; D-49069 Osnabrueck, Germany
email: [EMAIL PROTECTED]

```

### Re: [R] rpart

```On Tue, 26 Sep 2006, [EMAIL PROTECTED] wrote:

Original Message
Date: Tue, 26 Sep 2006 09:56:53 +0100 (BST)
From: Prof Brian Ripley [EMAIL PROTECTED]
To: [EMAIL PROTECTED]
Subject: Re: [R] rpart

On Mon, 25 Sep 2006, [EMAIL PROTECTED] wrote:

Dear r-help-list:

If I use the rpart method like

cfit <- rpart(y ~ ., data = data, ...),

what kind of tree is stored in cfit?
Is it right that this tree is not pruned at all, that it is the full
tree?

It is an rpart object.  This contains both the tree and the instructions
for pruning it at all values of cp: note that cp is also used in deciding
how large a tree to grow.

Ok, I have to explain my problem a little bit more in detail, I'm sorry for
being so vague:
I used the method in the following way:
cfit <- rpart(y ~ ., method = "class", minsplit = 1, cp = 0)
I got a tree with a lot of terminal nodes that contained more than 100
observations. On the other hand, the printcp method showed subtrees that were better.
So, are the trees a little bit pruned?

Yes, as you asked for cp=0.  Look up what that does in ?rpart.control.

If so, it's up to me to choose a subtree by using the printcp method.

Or the plotcp method.

In the technical report by Atkinson and Therneau, "An Introduction to
Recursive Partitioning Using the RPART Routines" (2000), one can see
the following table on page 15:

     CP     nsplit  relerror  xerror   xstd
1    0.105  0       1.0       1.       0.108
2    0.056  3       0.68519   1.1852   0.111
3    0.028  4       0.62963   1.0556   0.109
4    0.574  6       0.57407   1.0556   0.109
5    0.100  7       0.6       1.0556   0.109

A few lines below it says "We see that the best tree has 5 terminal nodes
(4 splits)". Why is that, if the xerror is lowest for the tree consisting
only of the root?

There are *two* reports with that name: this seems to be from minitech.ps.
The choice is explained in the rest of that para (the 1-SE rule was used).
My guess is that the authors excluded the root as not being a tree, but

Are both reports from 2000? But you're right, I'm talking about the one from
minitech.ps.
The 1-SE-rule only explains why they didn't choose the tree with 6 or 7
splits, but not why they didn't choose the tree without a split.
The exclusion of the root as not being a tree was my first explanation, too.
But if the tree only consisting of the root is still better than any other
tree, why would I choose a tree with 4 splits then?

Henri

--
Brian D. Ripley,  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel:  +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UKFax:  +44 1865 272595

```

### Re: [R] package usage statistics. (UPDATE)

```On Tue, 26 Sep 2006, Vladimir Eremeev wrote:

Here is the perl script with some comments

??

t1 <- installed.packages()
t2 <- is.na(t1[, "Priority"])
t3 <- names(t2)[t2]
t4 <- sapply(t3, function(n) file.info(system.file("R", package = n)[1])$atime)
class(t4) <- "POSIXct"
sort(t4)

though the R directory may not be the best place to look for the atime?
(This isn't the same, but you get the idea, there is a well-known
fortune ...)

Roger

--
Roger Bivand
Economic Geography Section, Department of Economics, Norwegian School of
Norway. voice: +47 55 95 93 55; fax +47 55 95 95 43
e-mail: [EMAIL PROTECTED]

```

### Re: [R] putting stuff into bins...

```I don't know about such a function, but

tapply(data,cut(data,breaks),what to do)

should give you what you need.

HIH

Ciao,
Stefano

On Tue, Sep 26, 2006 at 12:44:35PM +0100, Federico Calboli wrote:
Federico> Hi All,
Federico>
Federico> I have a vector of data and a vector of bin breakpoints, and I want to
Federico> put my data in the bins and then extract fanciful information like the
Federico> mean value of each bin.
Federico>
Federico> I know I can write my own function, but I would have thought that R
Federico> should have somewhere a function that takes as arguments something like
Federico> (data, breaks, what to do with the data in the bins). I surely could not
Federico> find it trawling the R-help archives though.
Federico>
Federico> If such a function exists I'd be grateful to anyone pointing it out
Federico> to me.
Federico>
Federico> Cheers,
Federico>
Federico> Fede

```
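The `tapply` idiom above can be wrapped into a function with the (data, breaks, what to do) signature that the original question asked for. The helper name `bin_apply` is hypothetical, it is not a base R function, and the example values are invented for illustration.

```r
# bin_apply() is a hypothetical helper, not part of base R:
# it bins `data` at `breaks` with cut() and applies FUN to each bin
bin_apply <- function(data, breaks, FUN, ...) {
  tapply(data, cut(data, breaks), FUN, ...)
}

# Made-up example values and unit-width bins (0,1], (1,2], ..., (4,5]
x <- c(0.5, 1.5, 1.7, 2.2, 3.9, 4.4)
res <- bin_apply(x, breaks = 0:5, FUN = mean)
print(res)
```

Bins that receive no data come back as `NA`, which is `tapply`'s default behaviour for empty factor levels.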

### [R] about the determinant of a symmetric compound matrix

```Dear R users,
even if this question is not related to an issue about R, probably some of you
will be able to help me.

I have a square matrix of dimension k by k with alpha on the diagonal and beta
everywhere else.
This symmetric matrix is called a symmetric compound matrix and has the form
a(I + cJ),
where
I is the k by k identity matrix,
J is the k by k matrix of all ones,
a = alpha - beta,
c = beta/a.

I need to evaluate the determinant of this matrix. Is there an algebraic
formula for that?

Stefano

```

### Re: [R] rpart

```
Original Message
Date: Tue, 26 Sep 2006 09:56:53 +0100 (BST)
From: Prof Brian Ripley [EMAIL PROTECTED]
To: [EMAIL PROTECTED]
Subject: Re: [R] rpart

On Mon, 25 Sep 2006, [EMAIL PROTECTED] wrote:

Dear r-help-list:

If I use the rpart method like

cfit <- rpart(y ~ ., data = data, ...),

what kind of tree is stored in cfit?
Is it right that this tree is not pruned at all, that it is the full
tree?

It is an rpart object.  This contains both the tree and the instructions
for pruning it at all values of cp: note that cp is also used in deciding
how large a tree to grow.

Ok, I have to explain my problem a little bit more in detail, I'm sorry for
being so vague:
I used the method in the following way:
cfit <- rpart(y ~ ., method = "class", minsplit = 1, cp = 0)
I got a tree with a lot of terminal nodes that contained more than 100
observations. On the other hand, the printcp method showed subtrees that were better.
So, are the trees a little bit pruned?

If so, it's up to me to choose a subtree by using the printcp method.

Or the plotcp method.

In the technical report by Atkinson and Therneau, "An Introduction to
Recursive Partitioning Using the RPART Routines" (2000), one can see
the following table on page 15:

     CP     nsplit  relerror  xerror   xstd
1    0.105  0       1.0       1.       0.108
2    0.056  3       0.68519   1.1852   0.111
3    0.028  4       0.62963   1.0556   0.109
4    0.574  6       0.57407   1.0556   0.109
5    0.100  7       0.6       1.0556   0.109

A few lines below it says "We see that the best tree has 5 terminal nodes
(4 splits)". Why is that, if the xerror is lowest for the tree consisting
only of the root?

There are *two* reports with that name: this seems to be from minitech.ps.
The choice is explained in the rest of that para (the 1-SE rule was used).
My guess is that the authors excluded the root as not being a tree, but

Are both reports from 2000? But you're right, I'm talking about the one from
minitech.ps.
The 1-SE-rule only explains why they didn't choose the tree with 6 or 7 splits,
but not why they didn't choose the tree without a split.
The exclusion of the root as not being a tree was my first explanation, too.
But if the tree only consisting of the root is still better than any other
tree, why would I choose a tree with 4 splits then?

Henri

--

```

### Re: [R] putting stuff into bins...

```This would work.  The point is to make a factor from the breakpoints
using cut, then use this to calculate the statistics on the binned
data.

x <- rnorm(500)
f <- cut(x, 10)
aggregate(x, list(f), mean)
           Group.1          x
1    (-2.71,-2.09] -2.3668991
2    (-2.09,-1.46] -1.7332011
3   (-1.46,-0.834] -1.1156487
4  (-0.834,-0.208] -0.5117649
5   (-0.208,0.418]  0.1277991
6     (0.418,1.04]  0.7092500
7      (1.04,1.67]  1.2859184
8       (1.67,2.3]  1.9347327
9       (2.3,2.92]  2.5518835
10     (2.92,3.55]  3.2873698

On 26/09/06, Federico Calboli [EMAIL PROTECTED] wrote:
Hi All,

I have a vector of data and a vector of bin breakpoints, and I want to put my
data in the bins and then extract fanciful information like the mean value of
each bin.

I know I can write my own function, but I would have thought that R should
have somewhere a function that takes as arguments something like (data, breaks,
what to do with the data in the bins). I surely could not find it trawling the
R-help archives though.

If such a function exists I'd be grateful to anyone pointing it out to me.

Cheers,

Fede

--
=
David Barron
University of Oxford
Park End Street
Oxford OX1 1HP

```

### [R] crostab qry - Too many crosstab column headers

```Hello

I guess not, but is there a way to split up an MS Access crosstab
query that returns more than 256 columns, by using sqlGetResults in RODBC,
to produce, e.g., several data frames of fewer than 256 columns each (that is,
without changing the query itself)?
---
[1] [RODBC] ERROR: Could not SQLExecDirect

[2] S1001 -1040 [Microsoft][ODBC Microsoft Access Driver] Too many crosstab column headers

R - 2.3
WinXp with Ms access 2002

Best Regards
Anders

```

### Re: [R] Vectorise a for loop?

```tt$fold <- ifelse(tt$M < 0, 1/(2^tt$M), 2^tt$M)
---
Jacques VESLOT

CNRS UMR 8090
I.B.L (2ème étage)
1 rue du Professeur Calmette
B.P. 245
59019 Lille Cedex

Tel : 33 (0)3.20.87.10.44
Fax : 33 (0)3.20.87.10.31

http://www-good.ibl.fr
---

john seers (IFR) wrote:

Hi R guru coders

I wrote a bit of code to add a new column to a topTable data frame,
that is, a list of genes processed using the limma package. I used a for
loop, but I kept feeling there was a better, more vector-oriented
approach. I looked at several commands such as apply and by
but could not find a good way to do it. I have the feeling there is
a command or technique eluding me. (Is there an expr ? value1 : value2
construction in R?)

Can anybody suggest an elegant solution?

Details:

So, the topTable looks like this:

topa1[1:5,c(1,2,3,4)]

          ID    Name GB_accession         M
11195 245828 SIGKEC9     AX135029 -7.670197
10966    107    FHL1       B14446 -5.089926
6287   25744     M90     LL137340 -4.531744
777     2288   VSNL1     LF039555 -4.035472
11310 272294     M98     LL031650  3.866422

I want to add a fold column so it will look like this:

topa1[1:5,c(1,2,3,4,10)]

          ID    Name GB_accession         M      fold
11195 245828 SIGKEC9     AX135029 -7.670197 203.68521
10966    107    FHL1       B14446 -5.089926  34.05810
6287   25744     M90     LL137340 -4.531744  23.13082
777     2288   VSNL1     LF039555 -4.035472  16.39828
11310 272294     M98     LL031650  3.866422  14.58508

The fold value is calculated from the M column, which is a log2 value.
The calculation differs depending on whether the M value is
negative or positive: if the gene is down-regulated, the
reciprocal has to be used to calculate the fold value.

Here is my clunky, non-vectorised code:

# Function to add a fold column to the toptable
ttfold <- function(tt) {
  fold <- NULL
  for (i in 1:length(tt$M)) {
    if (tt$M[i] < 0) {
      # down-regulated: use the reciprocal
      fold[i] <- 1/(2^tt$M[i])
    } else {
      fold[i] <- 2^tt$M[i]
    }
  }
  tt <- cbind(tt, fold)
}

# Add fold column to top tables
topa1 <- ttfold(topa1)

Regards

J

---

John Seers
Institute of Food Research
Norwich Research Park
Colney
Norwich
NR4 7UA

tel +44 (0)1603 251497
fax +44 (0)1603 507723
e-mail [EMAIL PROTECTED] mailto:[EMAIL PROTECTED]

e-disclaimer at http://www.ifr.ac.uk/edisclaimer/
http://www.ifr.ac.uk/edisclaimer/

Web sites:

www.ifr.ac.uk http://www.ifr.ac.uk/
www.foodandhealthnetwork.com http://www.foodandhealthnetwork.com/

```
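A small self-contained sketch of the `ifelse` answer above, using the five M values quoted in the question. The `fold2` column is an addition for illustration only: it relies on the identity 1/(2^M) = 2^(-M), so for log2 values the fold is simply 2^|M|.

```r
# M values (log2 ratios) copied from the table in the question
tt <- data.frame(M = c(-7.670197, -5.089926, -4.531744, -4.035472, 3.866422))

# ifelse() evaluates the condition for every element at once
tt$fold <- ifelse(tt$M < 0, 1/(2^tt$M), 2^tt$M)

# Equivalent form without branching: 1/(2^M) equals 2^(-M) when M < 0
tt$fold2 <- 2^abs(tt$M)

print(tt)
```

Both columns agree, and the first value reproduces the 203.685 shown in the desired output of the question.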

### Re: [R] rpart

```
Original Message
Date: Tue, 26 Sep 2006 12:54:22 +0100 (BST)
From: Prof Brian Ripley [EMAIL PROTECTED]
To: [EMAIL PROTECTED]
Subject: Re: [R] rpart

On Tue, 26 Sep 2006, [EMAIL PROTECTED] wrote:

Original Message
Date: Tue, 26 Sep 2006 09:56:53 +0100 (BST)
From: Prof Brian Ripley [EMAIL PROTECTED]
To: [EMAIL PROTECTED]
Subject: Re: [R] rpart

On Mon, 25 Sep 2006, [EMAIL PROTECTED] wrote:

Dear r-help-list:

If I use the rpart method like

cfit <- rpart(y ~ ., data = data, ...),

what kind of tree is stored in cfit?
Is it right that this tree is not pruned at all, that it is the full
tree?

It is an rpart object.  This contains both the tree and the
instructions
for pruning it at all values of cp: note that cp is also used in
deciding
how large a tree to grow.

Ok, I have to explain my problem a little bit more in detail, I'm sorry
for being so vague:
I used the method in the following way:
cfit <- rpart(y ~ ., method = "class", minsplit = 1, cp = 0)
I got a tree with a lot of terminal nodes that contained more than 100
observations. On the other hand, the printcp method showed subtrees that were
better.
So, are the trees a little bit pruned?

Yes, as you asked for cp=0.  Look up what that does in ?rpart.control.

I thought I would get a full tree by choosing cp=0 - and it was one.
The nodes with more than 100 observations were not split further because there
was no sequence of splits which made the class label change for any subset. (A
bad explanation, but you probably know what I mean.) I realized that when I
chose cp=-1. Thank you very much for your help!

If so, it's up to me to choose a subtree by using the printcp method.

Or the plotcp method.

In the technical report by Atkinson and Therneau, "An Introduction to
Recursive Partitioning Using the RPART Routines" (2000), one can see
the following table on page 15:

     CP     nsplit  relerror  xerror   xstd
1    0.105  0       1.0       1.       0.108
2    0.056  3       0.68519   1.1852   0.111
3    0.028  4       0.62963   1.0556   0.109
4    0.574  6       0.57407   1.0556   0.109
5    0.100  7       0.6       1.0556   0.109

A few lines below it says "We see that the best tree has 5 terminal nodes
(4 splits)". Why is that, if the xerror is lowest for the tree consisting
only of the root?

There are *two* reports with that name: this seems to be from
minitech.ps.
The choice is explained in the rest of that para (the 1-SE rule was
used).
My guess is that the authors excluded the root as not being a tree, but

Are both reports from 2000? But you're right, I'm talking about the one
from minitech.ps.
The 1-SE-rule only explains why they didn't choose the tree with 6 or 7
splits, but not why they didn't choose the tree without a split.
The exclusion of the root as not being a tree was my first explanation,
too. But if the tree only consisting of the root is still better than any
other tree, why would I choose a tree with 4 splits then?

Henri

--

```
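The workflow discussed in this thread (grow with `cp = 0`, inspect the CP table, pick a subtree with the 1-SE rule, prune) can be sketched as below. This is an illustration under assumptions, not the report's own procedure: it uses the `kyphosis` data shipped with rpart rather than the poster's data, and it applies the 1-SE rule literally, so it may well select the root-only tree (row 1), which is exactly the case the question is about.

```r
library(rpart)  # recommended package shipped with R

set.seed(1)  # cross-validation is random, so xerror varies between runs
fit <- rpart(Kyphosis ~ Age + Number + Start, data = kyphosis,
             method = "class", minsplit = 2, cp = 0)

printcp(fit)  # CP table with columns CP, nsplit, rel error, xerror, xstd

cpt   <- fit$cptable
best  <- which.min(cpt[, "xerror"])               # row with the smallest xerror
limit <- cpt[best, "xerror"] + cpt[best, "xstd"]  # one standard error above it
pick  <- min(which(cpt[, "xerror"] <= limit))     # smallest tree within the limit

# Prune back to the subtree belonging to the chosen row
pruned <- prune(fit, cp = cpt[pick, "CP"])
```

`plotcp(fit)` shows the same information graphically, with a horizontal line at the 1-SE limit.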

### [R] Need help with boxplots

```How can I add a mean *line* to the boxplot while keeping the median line?
Maybe it is possible to do this using the *segments* function?
For example, in this:

set.seed(1)

a <- rnorm(10)

b <- rnorm(10)

boxplot(a, b)

```
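One possible way to draw the mean as a line with `segments`, as the question suggests, is sketched below. It assumes the default box width: with `boxwex = 0.8` and more than one group, each box extends 0.4 on either side of its centre at x = 1, 2, .... The colour and line type are arbitrary choices to keep the mean visually distinct from the median.

```r
set.seed(1)
a <- rnorm(10)
b <- rnorm(10)

boxplot(a, b)

# Group means; the boxes are centred at x = 1 and x = 2
means <- c(mean(a), mean(b))

# Assumed half-width of a box under the default boxwex = 0.8
half <- 0.4

# One horizontal segment per box, spanning the box width at the mean
segments(x0 = 1:2 - half, y0 = means, x1 = 1:2 + half, y1 = means,
         col = "red", lty = 2)
```

If `boxwex` or `width` is changed in the `boxplot` call, `half` has to be adjusted accordingly.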

### Re: [R] Vectorise a for loop?

```

Hi Jacques

Yes, that looks a whole lot better. That ifelse is exactly what I was
searching for.

Merci.

J

---

John Seers
Institute of Food Research
Norwich Research Park
Colney
Norwich
NR4 7UA

tel +44 (0)1603 251497
fax +44 (0)1603 507723
e-mail [EMAIL PROTECTED]
e-disclaimer at http://www.ifr.ac.uk/edisclaimer/

Web sites:

www.ifr.ac.uk
www.foodandhealthnetwork.com

-Original Message-
From: Jacques VESLOT [mailto:[EMAIL PROTECTED]
Sent: 26 September 2006 14:02
To: john seers (IFR)
Cc: R-help
Subject: Re: [R] Vectorise a for loop?

tt$fold <- ifelse(tt$M < 0, 1/(2^tt$M), 2^tt$M)
---
Jacques VESLOT

CNRS UMR 8090
I.B.L (2ème étage)
1 rue du Professeur Calmette
B.P. 245
59019 Lille Cedex

Tel : 33 (0)3.20.87.10.44
Fax : 33 (0)3.20.87.10.31

http://www-good.ibl.fr

```

### [R] creation of new variables

```Hello All,

I have 8 variables named

a b c d e f g h

I need to create four variables from these 8 variables in R.

The new variables are ab, cd, ef, gh.

Can anyone please help me?

thanks,
Pratap

```

### Re: [R] about the determinant of a symmetric compound matrix

```Stefano Sofia [EMAIL PROTECTED] writes:

Dear R users,
even if this question is not related to an issue about R, probably some of
you will be able to help me.

I have a square matrix of dimension k by k with alpha on the diagonal and
beta everywhere else.
This symmetric matrix is called a symmetric compound matrix and has the form
a(I + cJ),
where
I is the k by k identity matrix,
J is the k by k matrix of all ones,
a = alpha - beta,
c = beta/a.

I need to evaluate the determinant of this matrix. Is there an algebraic
formula for that?

Yes. Unusually, this is not from the famous Rao p.33, but from p.32... [1]:

det(A+XX') = det(A)(1+X'A^{-1}X) provided det(A) != 0

now put X = sqrt(c) times a vector of ones and get det(I+cJ) = 1+ck.
Multiply by a^k for the general case.

Quick sanity check:

m <- matrix(.1,7,7)
diag(m) <- .9
det(m)
[1] 0.393216
.8^7 * (1 + .1/.8 * 7)
[1] 0.393216

Alternatively, you can do it via eigenvalues: The off-diagonal part
(beta*J) corresponds to a single direction along the unit vector
c(1,1,...,1)/sqrt(7). The diagonal part corresponds to adding (alpha -
beta)*I, which has total sphericity so you can arrange that one
eigenvector of it points in the same direction and you end up with

(alpha - beta)^(k-1) * (alpha - beta + k*beta)

(.9-.1)^6*((.9-.1)+ 7*.1)
[1] 0.393216

(Getting this right on the first try is almost impossible...)

[1] CR Rao, Linear Statistical Inference and Its Applications, 2nd ed.
Wiley 1973.

--
O__   Peter Dalgaard Øster Farimagsgade 5, Entr.B
c/ /'_ --- Dept. of Biostatistics PO Box 2099, 1014 Cph. K
(*) \(*) -- University of Copenhagen   Denmark  Ph:  (+45) 35327918
~~ - ([EMAIL PROTECTED])  FAX: (+45) 35327907

```
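The closed form derived above, (alpha - beta)^(k-1) * (alpha - beta + k*beta), can be wrapped in a small function and checked against `det()` for other values. The function names `det_cs` and `check_cs` are hypothetical helpers introduced only for this sketch.

```r
# Closed form for a k x k matrix with alpha on the diagonal and beta elsewhere
# (det_cs is a hypothetical helper name)
det_cs <- function(alpha, beta, k) {
  (alpha - beta)^(k - 1) * (alpha - beta + k * beta)
}

# Build the matrix explicitly and compare with the numerical determinant
check_cs <- function(alpha, beta, k) {
  m <- matrix(beta, k, k)
  diag(m) <- alpha
  c(closed_form = det_cs(alpha, beta, k), numeric = det(m))
}

r <- check_cs(0.9, 0.1, 7)
print(r)
```

For alpha = 0.9, beta = 0.1 and k = 7 both entries reproduce the 0.393216 from the sanity check in the reply.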

### [R] treatment effect at specific time point within mixed effects model

```
All,

The code below is for a pseudo dataset of repeated measures on patients
where there is also a treatment factor called drug.  Time is treated
as categorical.

What code is necessary to test for a treatment effect at a single time
point,
e.g., time = 3?   Does the answer matter if the design is a crossover
design,
i.e, each patient received drug and placebo?

Finally, what would be a good response to someone that suggests to do a
simple t-test (paired in crossover case) instead of the test above
within a mixed model?

thanks!
dave

library(nlme)  # provides groupedData() and lme()

z = rnorm(24, mean=0, sd=1)
time = rep(1:6, 4)
Patient = rep(1:4, each = 6)
## P = placebo, I = Ibuprofen
drug = factor(rep(c("I", "P"), each = 6, times = 2))
dat.new = data.frame(time, drug, z, Patient)
data.grp = groupedData(z ~ time | Patient, data = dat.new)
fm1 = lme(z ~ factor(time) + drug + factor(time):drug, data = data.grp,
random = list(Patient = ~ 1) )

```
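
One possible way to get at the first question (not necessarily the only or best one) is to refit the same model with time 3 as the reference level of the time factor. Under R's default treatment contrasts, the `drug` main-effect coefficient is then the drug difference at that time point. The sketch below assumes the pseudo data from the post; the seed and the column name `time3` are arbitrary choices made for illustration.

```r
library(nlme)

set.seed(42)  # arbitrary seed so the pseudo data are reproducible
dat <- data.frame(
  z       = rnorm(24),
  time    = factor(rep(1:6, 4)),
  Patient = factor(rep(1:4, each = 6)),
  drug    = factor(rep(c("I", "P"), each = 6, times = 2))
)

# Make time 3 the reference level: with treatment contrasts the "drugP"
# coefficient is then the P minus I difference at time 3.
dat$time3 <- relevel(dat$time, ref = "3")
fm3 <- lme(z ~ time3 * drug, random = ~ 1 | Patient, data = dat)

tt <- summary(fm3)$tTable
tt["drugP", ]   # estimate, standard error, DF, t-value and p-value
```

The fitted model is the same as `fm1` up to reparameterisation; only the meaning of the individual coefficients changes.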

### Re: [R] package usage statistics. (UPDATE)

```Dear Roger,

Tuesday, September 26, 2006, 4:16:38 PM, you wrote:

RB On Tue, 26 Sep 2006, Vladimir Eremeev wrote:

Here is the perl script with some comments

RB ??

Sorry, forgot to mention, this script is designed to run from the root
of the working directory tree.
It scans all R session histories and scripts and analyzes them.
This is performed through the call

find({ wanted => \&wanted, no_chdir => 1 }, '.');

The second parameter to find is a list of directories.
This makes it possible, for example, to build a histogram of package usage.

---
Best regards,

```
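
The Perl script itself is not reproduced in this message. As a rough illustration of the idea it describes (walk a directory tree, scan scripts and session histories, count package usage), here is a sketch in R. The function name `count_package_usage`, the file-name pattern, and the regular expression are all assumptions made for this example and are not taken from the original script.

```r
# Hypothetical helper: tabulate library()/require() calls found in R
# scripts (*.R, *.r) and .Rhistory files below `root`.
count_package_usage <- function(root = ".") {
  files <- list.files(root, pattern = "(\\.[Rr]|\\.Rhistory)$",
                      recursive = TRUE, full.names = TRUE, all.files = TRUE)
  pkgs <- unlist(lapply(files, function(f) {
    lines <- readLines(f, warn = FALSE)
    hits <- regmatches(lines,
                       gregexpr("(library|require)\\(['\"]?[A-Za-z0-9.]+", lines))
    # Strip the call prefix so only the package name remains.
    sub("^(library|require)\\(['\"]?", "", unlist(hits))
  }))
  sort(table(as.character(pkgs)), decreasing = TRUE)
}

# Small self-contained demonstration in a temporary directory.
demo_dir <- tempfile("usage")
dir.create(demo_dir)
writeLines(c("library(cluster)", "x <- 1", "require(nlme)"),
           file.path(demo_dir, "a.R"))
writeLines("library(cluster)", file.path(demo_dir, "b.R"))

usage <- count_package_usage(demo_dir)
usage
```

The resulting table can be passed to `barplot()` to draw the histogram of package usage mentioned in the message.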

### [R] October R/Splus course in Washington DC, San Francisco, Seattle *** R/Splus Fundamentals and Programming Techniques

```XLSolutions Corporation (www.xlsolutions-corp.com) is proud to
announce our 2-day October 2006 R/S-plus Fundamentals and Programming
Techniques : www.xlsolutions-corp.com/Rfund.htm

*** Washington DC / October 12-13, 2006
*** Seattle Wa  / October 19-20
*** San Francisco / October 26-27

Reserve your seat now at the early bird rates! Payment due AFTER
the class

Course Description:

This two-day beginner to intermediate R/S-plus course focuses on a
broad spectrum of topics, from reading raw data to a comparison of R
and S. We will learn the essentials of data manipulation, graphical
visualization and R/S-plus programming. We will explore statistical
data analysis tools, including graphics with data sets. How to enhance
ODBC, etc.
We will perform some statistical modeling and fit linear regression
models. Participants are encouraged to bring data for interactive
sessions

With the following outline:

- An Overview of R and S
- Data Manipulation and Graphics
- Using Lattice Graphics
- A Comparison of R and S-Plus
- How can R Complement SAS?
- Writing Functions
- Avoiding Loops
- Vectorization
- Statistical Modeling
- Project Management
- Techniques for Effective use of R and S
- Enhancing Plots
- Using High-level Plotting Functions
- Building and Distributing Packages (libraries)
- Connecting; ODBC, Rweb, Orca via sockets and via Rjava

Email us for group discounts.
Email Sue Turner: [EMAIL PROTECTED]
Phone: 206-686-1578
Visit us: www.xlsolutions-corp.com/training.htm
Please let us know if you and your colleagues are interested in this
class to take advantage of the group discount. Register now to secure your
seat!

Interested in R/Splus Advanced course? email us.

Cheers,
Elvis Miller, PhD
Manager Training.
XLSolutions Corporation
206 686 1578
www.xlsolutions-corp.com
[EMAIL PROTECTED]

```

### Re: [R] printing a variable name in a for loop

```Example:

lst <- list(variable1=variable1, variable2=variable2, variable3=variable3)
for (kk in seq(along=lst)) {
name <- names(lst)[kk];
value <- lst[[kk]];
cat("Hello, ", name, ", ", value, ", World,\n", sep="")
}

/Henrik

On 9/26/06, Jim Lemon [EMAIL PROTECTED] wrote:
Suzi Fei wrote:
Hello,

How do you print a variable name in a for loop?

I'm trying to construct a csv file that looks like this:

Hello, variable1, value_of_variable1, World,
Hello, variable2, value_of_variable2, World,
Hello, variable3, value_of_variable3, World,

Using this:

for (variable in list(variable1, variable2, variable3)){

cat("Hello,", ???variable???, variable, ", World,")
}

This works fine if I'm trying to print the VALUE of variable, but I want to
print the NAME of variable as well.

This is a teetering heap of assumptions, but is this what you wanted?

Suzi<-1
HiYa<-function(x) {
cat("Hello",deparse(substitute(x)),x,"World\n",sep=", ")
}
HiYa(Suzi)

Jim

```
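
When the variable names are already known as strings, another option is to look the values up by name, so the names never have to be recovered from the values. A minimal sketch, assuming three numeric variables with made-up values:

```r
variable1 <- 10; variable2 <- 20; variable3 <- 30   # illustrative values

# mget() returns a list of values named after the variables.
vars <- mget(c("variable1", "variable2", "variable3"))

# One CSV-style line per variable: name first, then value.
csv_lines <- sprintf("Hello, %s, %s, World,", names(vars), unlist(vars))
cat(csv_lines, sep = "\n")
```

`writeLines(csv_lines, "out.csv")` would write the same lines to a file; the file name here is only a placeholder.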

### Re: [R] putting stuff into bins...

```probably a combination of cut() and tapply() could be of help in this
case, e.g.,

x <- rnorm(100)
tapply(x, cut(x, -4:4), mean)

Best,
Dimitris

Dimitris Rizopoulos
Ph.D. Student
Biostatistical Centre
School of Public Health
Catholic University of Leuven

Tel: +32/(0)16/336899
Fax: +32/(0)16/337015
Web: http://med.kuleuven.be/biostat/
http://www.student.kuleuven.be/~m0390867/dimitris.htm

- Original Message -
From: Federico Calboli [EMAIL PROTECTED]
To: r-help r-help@stat.math.ethz.ch
Sent: Tuesday, September 26, 2006 1:44 PM
Subject: [R] putting stuff into bins...

Hi All,

I have a vector of data, a vector of bin breakpoints and I want to
put my data
in the bins and then extract fanciful informations like the mean
value of each bin.

I know I can write my own function, but I would have thought that R
should have
somewhere a function that took as arguments something like (data,
breaks, what
to do with the data in the bins). I surely could not find it trawling
the R-help
archives though.

If such a function exists I'd be grateful to anyone pointing it out
to me.

Cheers,

Fede

--
Federico C. F. Calboli
Department of Epidemiology and Public Health
Imperial College, St Mary's Campus
Norfolk Place, London W2 1PG

Tel  +44 (0)20 7594 1602 Fax (+44) 020 7594 3193

f.calboli [.a.t] imperial.ac.uk
f.calboli [.a.t] gmail.com

```
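
The `cut()` plus `tapply()` combination can be wrapped into a function with the (data, breaks, what to do) signature the question asks for. This is a sketch only; the name `bin_apply` is invented for illustration.

```r
# Hypothetical wrapper: apply `fun` to the data falling into each bin.
bin_apply <- function(data, breaks, fun, ...) {
  bins <- cut(data, breaks = breaks)   # default intervals are (lower, upper]
  tapply(data, bins, fun, ...)
}

x <- c(0.5, 1.5, 1.7, 2.2, 2.8, 3.9)
res <- bin_apply(x, breaks = 0:4, fun = mean)
res
```

Bins that receive no data come back as `NA`, and values outside the outermost breaks are dropped, because `cut()` assigns them `NA`.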

### Re: [R] Accessing C- source code of R

```

Gunther Höning wrote:
Dear list,

I'm looking for the source code of parts of R, (e.g. spline).
Does anyone know where I can access it ?

I plan to write a corresponding R Help Desk article on Accessing the
source. A draft is available from:
http://www.statistik.uni-dortmund.de/~ligges/R_Help_Desk_preview.pdf

Can you please tell me if this description is sufficient?

Thanks,
Uwe Ligges

Gunther

```

### [R] venn diagram with more than three vectors

```Hi,

I am using venn diagram function in AMDA to plot the venn diagram. But it
seems in this function, it can only plot 3 or less vectors. Is there a way to
plot the venn diagram with more than 3 vectors?

Thanks.

Z

```

### Re: [R] About the display of matrix

``` SQW == S Q WEN [EMAIL PROTECTED]
on Mon, 25 Sep 2006 23:12:10 -0700 writes:

SQW For a matrix A, i don't want to display the zero
SQW elements in it , How to do with that?

Either use as.table() and use the print() method for table
explicitly, or,
if you are really working with sparse matrices, use the 'Matrix'
package:

set.seed(1); m <- matrix(rpois(80, lambda=.8), 8,10); m
     [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
[1,]    0    1    1    0    1    2    1    0    1     0
[2,]    0    0    4    0    0    1    1    1    0     0
[3,]    1    0    0    0    2    1    1    1    1     1
[4,]    2    0    1    0    1    1    2    0    1     2
[5,]    0    1    2    2    1    1    0    2    0     2
[6,]    2    0    0    0    0    1    0    0    2     0
[7,]    2    1    1    1    1    0    0    1    0     1
[8,]    1    1    0    1    0    1    0    0    2     3
print(as.table(m), zero.print = ".")
A B C D E F G H I J
A . 1 1 . 1 2 1 . 1 .
B . . 4 . . 1 1 1 . .
C 1 . . . 2 1 1 1 1 1
D 2 . 1 . 1 1 2 . 1 2
E . 1 2 2 1 1 . 2 . 2
F 2 . . . . 1 . . 2 .
G 2 1 1 1 1 . . 1 . 1
H 1 1 . 1 . 1 . . 2 3

library(Matrix)
M <- Matrix(m, sparse = TRUE)
M
8 x 10 sparse Matrix of class dgCMatrix

[1,] . 1 1 . 1 2 1 . 1 .
[2,] . . 4 . . 1 1 1 . .
[3,] 1 . . . 2 1 1 1 1 1
[4,] 2 . 1 . 1 1 2 . 1 2
[5,] . 1 2 2 1 1 . 2 . 2
[6,] 2 . . . . 1 . . 2 .
[7,] 2 1 1 1 1 . . 1 . 1
[8,] 1 1 . 1 . 1 . . 2 3

---

Martin Maechler, ETH Zurich

```

### Re: [R] Need help with boxplots

```The problem with a line, I think, would be that the width of the boxes
can vary depending on the number of boxes in the plot, etc.  No doubt
it could be done, but you'd probably have to look into the bxp
function to see how the widths are calculated.

On 26/09/06, laba diena [EMAIL PROTECTED] wrote:
How to add a mean *line* in the boxplot keeping the median line ?
Maybe it is possible to do using the *segments* function ?
For example in this:

set.seed(1)

a <- rnorm(10)

b <- rnorm(10)

boxplot(a, b)

--
=
David Barron
University of Oxford
Park End Street
Oxford OX1 1HP

```

### Re: [R] calculating dissimilarities in R

```Hi Elvina,

Elvina == Elvina Payet [EMAIL PROTECTED]
on Tue, 26 Sep 2006 05:48:01 GMT writes:

```

### Re: [R] Statistical data and Map-package

```You may also want to look at the maptools (and sp) package, it can read
in and plot shapefiles from external sources.

Some sources of maps that maptools can plot include:

http://www.vdstech.com/map_data.htm
http://openmap.bbn.com/data/shape/timezone/
back

Hope this helps,

--
Gregory (Greg) L. Snow Ph.D.
Statistical Data Center
Intermountain Healthcare
[EMAIL PROTECTED]
(801) 408-8111

-Original Message-
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] On Behalf Of Rense Nieuwenhuis
Sent: Tuesday, September 26, 2006 2:21 AM
To: r-help@stat.math.ethz.ch
Subject: [R] Statistical data and Map-package

Dear helpeRs,

I'm working with the map-package and came upon a problem which I
couldn't solve. I hope one of you can. If not, this can be seen as a
suggestion for new versions of the package.

I'm trying to create a map of some European countries, filled with
colors corresponding to some values. Let's say I have the following
countries and I assign the following colors (fictional):

country2001 <- c("Austria", "Belgium", "Switzerland", "Czechoslovakia",
"Germany", "Denmark", "Spain", "Finland", "France", "UK", "Greece",
"Hungary", "Ireland", "Israel", "Italy", "Luxembourg", "Netherlands",
"Norway", "Poland", "Portugal", "Sweden", "Slovenia")
color2001 <- c("green", "yellow", "red", "red", "red", "red", "red",
"red", "green", "red", "red", "red", "red", "red", "red", "red", "red",
"blue", "red", "red", "red", "orange")

I then let the colors and the values correspond using 'match.map', like
this:

match <- match.map("world", country2001)
color <- color2001[match]

And finally I plot the map. It works perfectly fine.

map(database="world", fill=TRUE, col=color)

But as I mentioned, I want to create a map of Europe. So, I use xlim and
ylim to let some parts of the world fall off the map. The syntax becomes
like this:

map(database="world", fill=TRUE, col=color, xlim=c(-25,70), ylim=c(35,71))

Now, a problem arises. The regions on the map are colored by the vector
'color'. It needs therefore to correspond to the order in which the
polygons are drawn. Since some of the full world-map isn't drawn this
time, the color-vector doesn't correspond anymore. This results in the
coloring of the wrong countries.

Does anybody know of a way to solve this?

Rense Nieuwenhuis
```

### Re: [R] Need help with boxplots

```And if you really do want a line segment try this:

M <- c(mean(a), mean(b))
segments(1:2-0.4, M, 1:2+0.4, M, col = "red")

On 9/26/06, Gabor Grothendieck [EMAIL PROTECTED] wrote:
To prevent confusion you might want to use a red dot rather than
a line:

points(1:2, c(mean(a), mean(b)), col = "red")

and perhaps label it since it's non-standard:

text(1:2, c(mean(a), mean(b)), "Mean", pos = 4)

On 9/26/06, laba diena [EMAIL PROTECTED] wrote:
How to add a mean line in the boxplot keeping the median line ?
For example in this:

set.seed(1)

a <- rnorm(10)

b <- rnorm(10)

boxplot(a, b)

```
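
The offset of 0.4 in the `segments()` call matches half of the default box width factor (`boxwex = 0.8`) when two or more boxes are drawn at their default positions 1, 2, and so on. The sketch below makes that dependence explicit for any number of groups; the third group and the variable names `groups`, `boxwex`, `means` and `at` are additions for illustration.

```r
set.seed(1)
groups <- list(a = rnorm(10), b = rnorm(10), c = rnorm(10))

boxwex <- 0.8                    # same as the default; set explicitly here
boxplot(groups, boxwex = boxwex)

means <- sapply(groups, mean)
at <- seq_along(groups)          # boxes are drawn at x = 1, 2, 3

# One horizontal mean line per box, spanning the full box width.
segments(at - boxwex / 2, means, at + boxwex / 2, means, col = "red")
```

If `varwidth = TRUE` or a custom `width` is used, the boxes no longer share one width and the half-width above would have to be computed per box.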

### Re: [R] lapply, plot and additional arguments

```maybe something like this could help:

x <- data.frame(a = 1:9, beta = exp(-4:4),
logic = rep(c(TRUE, FALSE), c(5, 4)))
x.l <- split(x, x$logic)

plot(x$a, x$beta)
mapply(function(x, y) lines(x$a, x$beta, col = y), x.l, 1:2)

Best,
Dimitris

Dimitris Rizopoulos
Ph.D. Student
Biostatistical Centre
School of Public Health
Catholic University of Leuven

Tel: +32/(0)16/336899
Fax: +32/(0)16/337015
Web: http://med.kuleuven.be/biostat/
http://www.student.kuleuven.be/~m0390867/dimitris.htm

- Original Message -
From: Petr Pikal [EMAIL PROTECTED]
To: r-help@stat.math.ethz.ch
Sent: Tuesday, September 26, 2006 5:40 PM
Subject: [R] lapply, plot and additional arguments

Dear all

Hopefully somebody will know the answer.

I have some list

x <- data.frame(a = 1:9, beta = exp(-4:4), logic =
rep(c(TRUE,FALSE),
c(5,4)))
x.l <- split(x, x$logic)
plot(x.l$a, x.l$beta)

and I want to plot lines color coded according to logic variable

lapply(x.l, function(x, ...) lines(x$a, x$beta, col=1:2))
lapply(x.l, function(x,...) lines(x$a,x$beta), col=1:2)
lapply(x.l, function(x,...) lines(x$a,x$beta, ...), col=1:2)

Well, lapply seems to ignore my best attempts to persuade it to use
different colours for each part of x.l list.

Anybody knows how to code different colours when using lapply for
such plotting?

At present time I use a loop but maybe lapply could do it too.

Best regards.
Petr

Petr Pikal
[EMAIL PROTECTED]

```
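
The `lapply()` attempts in the question hand the whole vector `1:2` to every call instead of one colour per list element. If `lapply()` is preferred over `mapply()`, one way around this is to iterate over the positions of the list and index both the data and the colours. A minimal sketch; the colour vector `cols` is an arbitrary choice for illustration.

```r
x <- data.frame(a = 1:9, beta = exp(-4:4),
                logic = rep(c(TRUE, FALSE), c(5, 4)))
x.l <- split(x, x$logic)
cols <- c("blue", "red")         # one colour per list element

plot(x$a, x$beta)
# Loop over indices so that piece i is drawn with colour i.
invisible(lapply(seq_along(x.l), function(i) {
  lines(x.l[[i]]$a, x.l[[i]]$beta, col = cols[i])
}))
```

`invisible()` only suppresses printing of the list of `NULL`s that `lapply()` returns.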

### [R] set off error messages

```Hello there!

I'm creating a loop for(i in 1:n){...} within which I build an nls model at each
iteration. For some values of i, the algorithm in the nls function
doesn't converge or cannot find a solution; consequently an error message is
produced and my loop is interrupted. The errors don't really matter to me, as
all the other values might still be useful, so I want to ignore the
errors: the result for the models for which no solution is found
should just be NA, so that I get a value for every i. How can I turn off
the error message and make it return NA values instead?

Fabian

```

### Re: [R] set off error messages

```On Tue, 26 Sep 2006, Mollet, Fabian wrote:

Hello there!

I'm creating a loop for(i in 1:n){...} within which I build an nls model
at each iteration. For some values of i, the algorithm in the nls
function doesn't converge or cannot find a solution; consequently an
error message is produced and my loop is interrupted. The errors
don't really matter to me, as all the other values might still be useful,
so I want to ignore the errors: the result for
the models for which no solution is found should just be NA, so
that I get a value for every i. How can I turn off the error message and

This is a FAQ (7.32)

-thomas

```

### [R] Extention of Pie Chart in R (was Re: Adding percentage to Pie Charts)

```Jim Lemon jim at bitwrit.com.au writes:

I admit to interpreting this pretty loosely, but I would like to know
what people think of a fan plot.

Hi all, I tried the fan.plots that Jim has been very nice to provide. It made me
think if there was something like, clock.plots in R? Something like the
following, anything that comes close?

The idea is an extension, in yet another way, of pie charts, building on the fan.plots
provided by Jim.
* A value will be depicted on a clock.plot using 1 or 2 hands of an analog
clock on a circle calibrated from 0 to 100 (same as 0).
* For values between 0 and 99 use the position of only one hand of the clock
(needle).
* For values of 100, use the second hand (needle), and move it to 1.
* Some way to identify needles, and to distinguish two overlapping needles.
* Use color coding or line-types to differentiate variables.

This is basically a clock calibrated on a scale of 100, rather than 60. It can
visually depict values between 1 and 1.

Do we have something like this in R?

Anupam.

```

### Re: [R] set off error messages

```Try ?try

On 26/09/06, Mollet, Fabian [EMAIL PROTECTED] wrote:
Hello there!

I'm creating a loop for(i in 1:n){...} within which I build an nls model at
each iteration. For some values of i, the algorithm in the nls function
doesn't converge or cannot find a solution; consequently an error message
is produced and my loop is interrupted. The errors don't really matter to
me, as all the other values might still be useful, so I want to
ignore the errors: the result for the models for which no
solution is found should just be NA, so that I get a value for every
i. How can I turn off the error message and make it return NA values instead?

Fabian

--
=
David Barron
University of Oxford
Park End Street
Oxford OX1 1HP

```
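
To make the `?try` pointer concrete, here is a sketch of a loop that records `NA` whenever `nls()` fails and carries on. It uses `tryCatch()`, a close relative of `try()`. The data are simulated, and the failure is forced artificially by giving the second data set no response column; the names `good`, `bad`, `datasets` and `est`, the model, and the starting values are all assumptions made for this example.

```r
set.seed(1)
dose <- 1:10
good <- data.frame(dose = dose,
                   resp = 2 * exp(0.3 * dose) + rnorm(10, sd = 0.1))
bad  <- data.frame(dose = dose)   # no `resp` column, so nls() must fail
datasets <- list(good, bad, good)

est <- rep(NA_real_, length(datasets))
for (i in seq_along(datasets)) {
  est[i] <- tryCatch({
    fit <- nls(resp ~ a * exp(b * dose), data = datasets[[i]],
               start = list(a = 1.5, b = 0.25))
    coef(fit)[["b"]]
  }, error = function(e) NA_real_)   # on error, store NA and keep looping
}
est
```

With `try(..., silent = TRUE)` the same effect is obtained by checking `inherits(result, "try-error")` after each fit.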

### Re: [R] Not all functions work in RSPerl package?

```Hi, Prof Duncan

I am sorry to report to a wrong place. But I am lucky to meet you by
chance, right? Thanks first ^^

1. The variable y1 is an array get from Perl, each element is from a
database (the type should be numeric). Here is the code for that.

$query = qq{
select exonCount, count(hsEnsGene) as geneCount from countTop1000ks
group by exonCount order by exonCount;
};
$sql = $orthologDB->prepare($query);
$sql->execute() or die "Could not execute '$query' ...";

my @x1; my @y1;
while (my ($exonCount, $geneCount) = $sql->fetchrow_array())
{
push(@x1, $exonCount);
push(@y1, $geneCount);
}

Here is the result if I print out the value in @y1:

print "y1---", join(" ", @y1), "---end\n";

% y1---101 44 33 26 8 15 18 13 3 5 4 2 1 4 1 1---end

But when I call

R::callWithNames(barplot, {'',[EMAIL PROTECTED], 'main', 'Barplot
the Gene number per exon with top1000 low Ks', 'xlab', Exon(low ks) number in
the gene,'ylab', 'Numbers of gene'});

It always says non-numeric argument:

% Error in -0.01 * height : non-numeric argument to binary operator

If I assign the value to another array, like

my @x=(101, 44, 33, 26, 8, 15, 18, 13, 3, 5, 4, 2, 1, 4, 1, 1);
R::callWithNames(barplot, {'',[EMAIL PROTECTED], 'main', 'Barplot
the Gene number per exon with top1000 low Ks', 'xlab', Exon(low ks) number in
the gene,'ylab', 'Numbers of gene'});

then it works.

I don't know why and what the difference is.

I also wondered whether it is because of the different data types in
Perl and R, because in Perl, 3 and "3" can sometimes be treated as the same. So I
call

R::callWithNames(as.numeric,{'',[EMAIL PROTECTED]);

before I call

R::callWithNames(barplot, {'',[EMAIL PROTECTED]);

Same error!

Same case if I change to use the R::boxplot([EMAIL PROTECTED]) as you said.

I am not sure I explain clear this time.

Regards,

-Xianjun

On Thu, 2006-09-21 at 07:33 -0700, Duncan Temple Lang wrote:

-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1

Hi Xianjun

[Important: Please don't send mail about an R package to r-bugs. That is
for reporting bugs in R itself. Add on packages are different
and it is only a coincidence that I am one of the R-core developers
and package author.  In general, all bug reports about
a package should be sent to the author  and questions should go to the
author and the r-devel or r-help list as appropriate.]

Is the problem you report a bug? Well not necessarily in RSPerl,
but in your code.  Unfortunately, you haven't told me what
the variable y1 contains so it is hard to figure out what
is going into the computations.

A couple of things:
a) Your example is calling boxplot in the first call and barplot
in the second.

b) in the first example, you are passing @y1 and in the second
you are passing [EMAIL PROTECTED]

I would guess that [EMAIL PROTECTED] is more appropriate and you might
try that
in the first case.

c) the first case doesn't have any named arguments (just '') so why
use callWithNames.  Just R::boxplot([EMAIL PROTECTED])

You are calling the R functions, but you are getting an error during
the invocation. The error message is coming from R.  So the
problem is that you are passing inputs to the functions that it cannot
handle.  This can happen directly in R and so also in RSPerl.  My guess
is that you don't have the correct type of data in @y1 or that you are
not passing it in the call as a reference.

Xianjun Dong wrote:
Hi,

It looks that not all function in R could be implemented by RSPerl. For
example,  when I call

R::callWithNames(boxplot, {'',@y1});
or
R::barplot([EMAIL PROTECTED]);

There would be error:

Error in -0.01 * height : non-numeric argument to binary operator
Caught error in R::call()

The same happened when calling barplot, but it's ok to call plot.

Is it a bug?

- --
Duncan Temple Lang[EMAIL PROTECTED]
Department of Statistics  work:  (530) 752-4782
4210 Mathematical Sciences Building   fax:   (530) 752-7099
One Shields Ave.
University of California at Davis
Davis,
CA 95616,
USA
-BEGIN PGP SIGNATURE-
Version: GnuPG v1.4.3 (Darwin)

iD8DBQFFEqKy9p/Jzwa2QP4RAoVcAJ4rK3CKGBCxlgdlJYke59l/Rm4rAQCffS1x
nhSyWBrhQre0UXvv3DKD0KI=
=EVsZ
-END PGP SIGNATURE-

```

```Hi All, is there a way of writing directly to a disk file the dataframe or list
of dataframes that results from the read.xport function? This function converts SAS
export files to R dataframes. I would like to convert a SAS transport file to R,
but the resulting R dataframes do not fit in the memory of my computer. Is there a
way to write the output of this function to disk, perhaps using some pipe or
connection facility? Something like,

filexpt.lst <- lookup.xport("file.xpt")
# works very well and returns a list with all kind of information about variable
# name, format, labels, etc.

# from what I can tell, this will not work.

? Is there a way to use a pipe or connection to write filexpt.df to disk as it
is being created?
? Is there a way to use a connection to an R dataframe on disk, so I can get
subsets (rows or columns) from the dataframe on disk, without having to read it
into memory?

I will be thankful for your help and suggestions.

Anupam.

```

### [R] New project: littler for GNU R

```
What ?
==

littler - Provides hash-bang (#!) capability for R (www.r-project.org)

Why ?
=

GNU R, a language and environment for statistical computing and
graphics, provides a wonderful system for 'programming with data'
as well as interactive exploratory analysis, often involving graphs.

Sometimes, however, simple scripts are desired. While GNU R can
be used in batch mode, and while so-called 'here' documents can be
crafted, a long-standing need for a scripting front-end has often
been expressed by the R Community.

littler (pronounced 'little R' and written 'r') aims to fill
this need.

It can be used directly on the command-line just like, say, bc(1):

$ echo 'cat(pi^2,"\n")' | r
9.869604

Equivalently, commands that are to be evaluated can be given on
the command-line

$ r -e 'cat(pi^2, "\n")'
9.869604

But unlike bc(1), GNU R has a vast number of statistical
functions. For example, we can quickly compute a summary() and show
a stem-and-leaf plot for file sizes in a given directory via

$ ls -l /boot | awk '!/^total/ {print $5}' | \
r -e 'fsizes <- as.integer(readLines()); print(summary(fsizes)); stem(fsizes)'
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
     13     512  110100  486900  768400 4735000

The decimal point is 6 digit(s) to the right of the |

0 | 002223
0 | 5557778899
1 | 112233
1 | 5
2 |
2 |
3 |
3 |
4 |
4 | 7

And, last but not least, this (somewhat unwieldy) expression can
be stored in a helper script:

$ cat examples/fsizes.r
#!/usr/bin/env r

fsizes <- as.integer(readLines())
print(summary(fsizes))
stem(fsizes)

(where calling /usr/bin/env is a trick from Python which allows one
to forget whether r is installed in /usr/bin/r, /usr/local/bin/r,
~/bin/r, ...)

A few examples are provided in the source directories examples/
and tests/.

Where ?
===

http://biostat.mc.vanderbilt.edu/LittleR

accessed by anonymous SVN:

or (soon !) be gotten from Debian mirrors via

$ apt-get install littler

littler is known to build and run on Linux and OS X.

Who ?
=

Copyright (C) 2006 Jeffrey Horner and Dirk Eddelbuettel

littler is free software; you can redistribute it and/or modify it
under the terms of the GNU General Public License as published by
the Free Software Foundation; either version 2 of the License, or
(at your option) any later version.

This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
General Public License for more details.

You should have received a copy of the GNU General Public
License along with this program; if not, write to the Free
Software Foundation, Inc., 59 Temple Place, Suite 330, Boston,
MA  02111-1307  USA

Comments are welcome, as are suggestions, bug fixes, or patches.

- Jeffrey Horner [EMAIL PROTECTED]
- Dirk Eddelbuettel [EMAIL PROTECTED]

```

### Re: [R] New project: littler for GNU R

```Any plans for Windows?

On 9/26/06, Jeffrey Horner [EMAIL PROTECTED] wrote:

```

### Re: [R] dotplot, dropping unused levels of 'y'

```Deepayan Sarkar wrote:

On 9/15/06, Benjamin Tyner [EMAIL PROTECTED] wrote:

In dotplot, what's the best way to suppress the unused levels of 'y' on
a per-panel basis? This is useful for the case that 'y' is a factor
taking perhaps thousands of levels, but for a given panel, only a
handful of these levels is ever present.

It's a bit problematic. Basically, you can use
relation="free"/"sliced", but y behaves as as.numeric(y) would. So, if
the small subset in each panel are always more or less contiguous (in
terms of the levels being close to each other) then you would be fine.
Otherwise you would not. In that case, you can still write your own
prepanel and panel functions, e.g.:
-

library(lattice)

y <- factor(sample(1:100), levels = 1:100)
x <- 1:100
a <- gl(9, 1, 100)

dotplot(y ~ x | a)

p <-
    dotplot(y ~ x | a,
            scales = list(y = list(relation = "free", rot = 0)),

            prepanel = function(x, y, ...) {
                yy <- y[, drop = TRUE]
                list(ylim = levels(yy),
                     yat = sort(unique(as.numeric(yy))))
            },

            panel = function(x, y, ...) {
                yy <- y[, drop = TRUE]
                panel.dotplot(x, yy, ...)
            })

--

Hope that gives you what you want.

Deepayan

I've been trying to extend this to allow groups, but am running into a
bit of trouble. For example, the following doesn't quite work: (some of
the unused factor levels are suppressed per panel, but not all):

set.seed(47905)
temp3 <- data.frame(s_port=factor(rpois(100,10)),
                    POSIXtime=structure(1:100,class=c("POSIXt","POSIXct")),
                    l_ipn=factor(rpois(100,10)),
                    duration=runif(100),
                    locality=sample(1:4,replace=TRUE,size=100),
                    l_role=sample(c(-1,1),replace=TRUE,size=100))

plot <- dotplot(s_port~POSIXtime|l_ipn,
                data=temp3,
                layout=c(1,1),
                pch="|",
                col=1:8,
                duration=temp3$duration,
                auto.key=list(col=1:8,points=FALSE),
                groups=locality*l_role,
                prepanel = function(x, y, ...) {
                    yy <- y[, drop = TRUE]
                    list(ylim = levels(yy),
                         yat = sort(unique(as.numeric(yy))))
                },
                panel = panel.superpose,
                panel.groups = function(x, y, subscripts, duration, col,
                                        ...) {
                    yy <- y[, drop = TRUE]
                    yy.n <- as.numeric(yy)
                    panel.abline(h=yy.n,col="lightgray")
                    panel.xyplot(x=x,y=yy.n,subscripts=subscripts,col=col,...)
                    panel.segments(x,
                                   yy.n,
                                   x+duration[subscripts],
                                   yy.n,
                                   col = col)
                },
                scales=list(y=list(relation="free"),
                            x=list(rot=45)),
                xlab="time",
                ylab="source port")

Thanks,
Ben

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
and provide commented, minimal, self-contained, reproducible code.

```

### Re: [R] New project: littler for GNU R

```
On 26 September 2006 at 13:14, Gabor Grothendieck wrote:
| Any plans for Windows?

Someone with deeper knowledge of the Windows build process would need to help
us. Interested?

Dirk

--
Hell, there are no rules here - we're trying to accomplish something.
-- Thomas A. Edison

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
and provide commented, minimal, self-contained, reproducible code.

```

### Re: [R] creation of new variables

```You may not have told us quite enough to be able to help you.  It may
be worth your while investing some time in describing the problem you
are trying to solve a little bit more comprehensively.

The posting guide http://www.R-project.org/posting-guide.html can be
useful in helping you  frame a question that stands a better chance of
receiving help.

Regards,

Mike

On 9/26/06, nalluri pratap [EMAIL PROTECTED] wrote:
Hello All,

I have 8 variables named

a b c d e f g h

I need to create four variables from these 8 variables in R.

the new variables are ab,cd,ef,gh.

Can anyone please help me

thanks,
Pratap

-

[[alternative HTML version deleted]]

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
and provide commented, minimal, self-contained, reproducible code.

--
Regards,

Mike Nielsen

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
and provide commented, minimal, self-contained, reproducible code.

```

### Re: [R] New project: littler for GNU R

```Wow, looks neat.

OS X users will be unhappy with your naming choice as the default
filesystem there is not case-sensitive :-(

IOW, r and R do the same thing.  I would expect it to otherwise work
on OS X so a change of some sort might be worthwhile.

+ seth

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
and provide commented, minimal, self-contained, reproducible code.

```

### [R] How to Pack a matrix

```Hello,
Suppose I have a matrix a where

a=  sp1 sp2 sp3 sp4 sp5 sp6
site1   1   0   1   1   0   1
site2   1   0   1   1   0   1
site3   1   1   1   1   1   1
site4   0   1   1   1   0   1
site5   0   0   1   0   0   1
site6   0   0   1   0   1   0

And I want to pack that matrix so that the upper left corner contains
most of the ones and the bottom right corner contains most of the zeros
so that matrix b is

b=  sp3 sp6 sp4 sp1 sp2 sp5
site1   1   1   1   1   0   0
site2   1   1   1   1   0   0
site3   1   1   1   1   1   1
site4   1   1   1   0   1   0
site5   1   1   0   0   0   0
site6   1   0   0   0   0   1

Can any of you help me with some code to accomplish this?  I have tried
different forms of order and can't seem to figure it out.  Basically I
want to order the matrix by both the rows and columns.

Cam

Cameron Guenther, Ph.D.
Associate Research Scientist
FWC/FWRI, Marine Fisheries Research
100 8th Avenue S.E.
St. Petersburg, FL 33701
(727)896-8626 Ext. 4305
[EMAIL PROTECTED]

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
and provide commented, minimal, self-contained, reproducible code.

```

### Re: [R] New project: littler for GNU R

```Seth Falcon wrote:
Wow, looks neat.

OS X users will be unhappy with your naming choice as the default
filesystem there is not case-sensitive :-(

IOW, r and R do the same thing.  I would expect it to otherwise work
on OS X so a change of some sort might be worthwhile.

(I'm always amazed at how I can miss the simplest details. I probably
knew at some point that OS X shipped with a case-insensitive file system,
which you can change somehow, but forgot. Thank goodness for peer review.)

littler will install into /usr/local/bin by default, so I don't think
there's a clash with the Mac binary provided by CRAN, right?

Jeff
--
http://biostat.mc.vanderbilt.edu/JeffreyHorner

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
and provide commented, minimal, self-contained, reproducible code.

```

### Re: [R] dotplot, dropping unused levels of 'y'

```On 9/26/06, Benjamin Tyner [EMAIL PROTECTED] wrote:
Deepayan Sarkar wrote:

On 9/15/06, Benjamin Tyner [EMAIL PROTECTED] wrote:

In dotplot, what's the best way to suppress the unused levels of 'y' on
a per-panel basis? This is useful for the case that 'y' is a factor
taking perhaps thousands of levels, but for a given panel, only a
handful of these levels is ever present.

It's a bit problematic. Basically, you can use
relation="free"/"sliced", but y behaves as as.numeric(y) would. So, if
the small subset in each panel are always more or less contiguous (in
terms of the levels being close to each other) then you would be fine.
Otherwise you would not. In that case, you can still write your own
prepanel and panel functions, e.g.:
-

library(lattice)

y <- factor(sample(1:100), levels = 1:100)
x <- 1:100
a <- gl(9, 1, 100)

dotplot(y ~ x | a)

p <-
    dotplot(y ~ x | a,
            scales = list(y = list(relation = "free", rot = 0)),

            prepanel = function(x, y, ...) {
                yy <- y[, drop = TRUE]
                list(ylim = levels(yy),
                     yat = sort(unique(as.numeric(yy))))
            },

            panel = function(x, y, ...) {
                yy <- y[, drop = TRUE]
                panel.dotplot(x, yy, ...)
            })

--

Hope that gives you what you want.

Deepayan

I've been trying to extend this to allow groups, but am running into a
bit of trouble. For example, the following doesn't quite work: (some of
the unused factor levels are suppressed per panel, but not all):

I don't think panel = panel.superpose is enough. Try

panel = function(x, y, ...) {
    yy <- y[, drop = TRUE]
    yy.n <- as.numeric(yy)
    panel.superpose(x, yy.n, ...)
},
panel.groups =
function(x, y, subscripts, duration, col, ...) {
    panel.abline(h = y, col = "lightgray")
    panel.xyplot(x, y, col = col, ...)
    panel.segments(x, y,
                   x + duration[subscripts], y,
                   col = col)
},

-Deepayan

set.seed(47905)
temp3 <- data.frame(s_port=factor(rpois(100,10)),
                    POSIXtime=structure(1:100,class=c("POSIXt","POSIXct")),
                    l_ipn=factor(rpois(100,10)),
                    duration=runif(100),
                    locality=sample(1:4,replace=TRUE,size=100),
                    l_role=sample(c(-1,1),replace=TRUE,size=100))

plot <- dotplot(s_port~POSIXtime|l_ipn,
                data=temp3,
                layout=c(1,1),
                pch="|",
                col=1:8,
                duration=temp3$duration,
                auto.key=list(col=1:8,points=FALSE),
                groups=locality*l_role,
                prepanel = function(x, y, ...) {
                    yy <- y[, drop = TRUE]
                    list(ylim = levels(yy),
                         yat = sort(unique(as.numeric(yy))))
                },
                panel = panel.superpose,
                panel.groups = function(x, y, subscripts, duration, col,
                                        ...) {
                    yy <- y[, drop = TRUE]
                    yy.n <- as.numeric(yy)
                    panel.abline(h=yy.n,col="lightgray")
                    panel.xyplot(x=x,y=yy.n,subscripts=subscripts,col=col,...)
                    panel.segments(x,
                                   yy.n,
                                   x+duration[subscripts],
                                   yy.n,
                                   col = col)
                },
                scales=list(y=list(relation="free"),
                            x=list(rot=45)),
                xlab="time",
                ylab="source port")

Thanks,
Ben

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
and provide commented, minimal, self-contained, reproducible code.

```
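
The suggestion in this reply is to drop unused levels once per panel and hand the numeric codes to `panel.superpose`, rather than dropping levels again inside `panel.groups`. A small sketch of why that matters, using made-up data (the factor `y` and grouping vector `g` are illustrative, not from the thread): dropping levels separately within each group yields codes that no longer refer to the same positions.

```r
# Illustrative data: four observations in one panel, two groups.
y <- factor(c("p10", "p20", "p30", "p20"), levels = paste0("p", 1:50))
g <- c(1, 1, 2, 2)

# Dropping unused levels once for the whole panel: one consistent coding.
panel_codes <- as.numeric(y[, drop = TRUE])

# Dropping unused levels separately inside each group: the codes disagree,
# e.g. "p20" is 2 in group 1 but 1 in group 2.
g1 <- as.numeric(y[g == 1, drop = TRUE])
g2 <- as.numeric(y[g == 2, drop = TRUE])

print(panel_codes)
print(g1)
print(g2)
```

With the per-panel coding, every group plots against the same y positions, which is what the axis labels computed in the prepanel function assume.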

### Re: [R] How to Pack a matrix

It looks like your example only reorders the columns but your
discussion refers to ordering rows too.  I have only addressed
the columns part but it is hopefully clear how to extend this
or use other objective functions.  We generate every permutation
of the columns and define an objective function f which is smaller for
more desirable column permutations and then use brute force to find
the minimizer:

library(combinat)

mat <- structure(c(1, 1, 1, 0, 0, 0, 0, 0, 1, 1, 0, 0, 1, 1, 1, 1, 1,
1, 1, 1, 1, 1, 0, 0, 0, 0, 1, 0, 0, 1, 1, 1, 1, 1, 1, 0), .Dim = c(6,
6), .Dimnames = list(c("site1", "site2", "site3", "site4", "site5",
"site6"), c("sp1", "sp2", "sp3", "sp4", "sp5", "sp6")))

# ones far from the upper left corner are penalised more heavily
f <- function(p) sum(mat[,p] * (row(mat) + col(mat)))
perms <- permn(ncol(mat))
mat[,perms[[which.min(sapply(perms, f))]]]

On 9/26/06, Guenther, Cameron [EMAIL PROTECTED] wrote:
Hello,
Suppose I have a matrix a where

a=  sp1 sp2 sp3 sp4 sp5 sp6
site1   1   0   1   1   0   1
site2   1   0   1   1   0   1
site3   1   1   1   1   1   1
site4   0   1   1   1   0   1
site5   0   0   1   0   0   1
site6   0   0   1   0   1   0

And I want to pack that matrix so that the upper left corner contains
most of the ones and the bottom right corner contains most of the zeros
so that matrix b is

b=  sp3 sp6 sp4 sp1 sp2 sp5
site1   1   1   1   1   0   0
site2   1   1   1   1   0   0
site3   1   1   1   1   1   1
site4   1   1   1   0   1   0
site5   1   1   0   0   0   0
site6   1   0   0   0   0   1

Can any of you help me with some code to accomplish this?  I have tried
different forms of order and can't seem to figure it out.  Basically I
want to order the matrix by both the rows and columns.

Cam

Cameron Guenther, Ph.D.
Associate Research Scientist
FWC/FWRI, Marine Fisheries Research
100 8th Avenue S.E.
St. Petersburg, FL 33701
(727)896-8626 Ext. 4305
[EMAIL PROTECTED]

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
and provide commented, minimal, self-contained, reproducible code.

```
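
The brute-force search above examines all 720 column permutations. For comparison, here is a sketch of a simpler heuristic that is not the method from the reply: sort columns (and optionally rows) by decreasing marginal sums. It assumes that "packing" can be approximated by putting the fullest columns and rows first; for the example matrix the column ordering happens to reproduce the requested matrix `b`.

```r
a <- matrix(c(1, 0, 1, 1, 0, 1,
              1, 0, 1, 1, 0, 1,
              1, 1, 1, 1, 1, 1,
              0, 1, 1, 1, 0, 1,
              0, 0, 1, 0, 0, 1,
              0, 0, 1, 0, 1, 0),
            nrow = 6, byrow = TRUE,
            dimnames = list(paste0("site", 1:6), paste0("sp", 1:6)))

# Columns with the most ones first; order() keeps ties in original order.
col_ord <- order(-colSums(a))
b_cols <- a[, col_ord]

# Optionally reorder the rows the same way.
row_ord <- order(-rowSums(a))
b_both <- a[row_ord, col_ord]

print(b_cols)
print(b_both)
```

Unlike the exhaustive search, this does not minimise any explicit objective, so it may give a different arrangement on other matrices.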

### [R] colClasses: supressed 'NA'

```Hi,

The colClasses seem to be suppressing 'NA' values. How do I fix this?

R script and first 5 lines of output is below.

File test2.dat has blanks that are read as NA when I do not use
'colClasses', but as blanks when I use 'colClasses'.

col.names=c("psu","losewt","maintain","fewcal","phyact","age","income","weight",
"wtdesire","gender"),
colClasses=c("factor","factor","factor","factor","factor","numeric","factor",
"numeric","numeric","factor"),
nrows=27, comment.char="")

temp.df
psu losewt maintain fewcal phyact age income weight wtdesire gender
1   2003009323  2252 05220  220  1
2   2003005181  21  2  2  58 08165  145  2
3   2003015942  21  4  1  76 05142  130  2
4   2003011406  21  3  1  43 03110  110  2
5   2003006786  1   4  1  49 06178  145  2

? why am I not getting missing values when I use 'colClasses'?

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
and provide commented, minimal, self-contained, reproducible code.

```

### [R] Building R for Windows with ATLAS

```I think this is not a R-devel question. Sorry to all if I'm wrong,

I managed to build R successfully with the default BLAS, but when I
change MkRules to use the ATLAS BLAS and set the path to the ATLAS
libraries, I get the following error message (I'm posting only the
final part; there was a lot of compilation output before this):

cp R.dll ../../bin/
Building ../../bin/Rblas.dll
gcc  -shared -s -o ../../bin/Rblas.dll blas00.o dllversion.o Rblas.def \
-L../../bin -lR  -LC:/WinNT_ATHLONSSE2 -lf77blas -latlas
C:/WinNT_ATHLONSSE2/libf77blas.a(xerbla.o):xerbla.f:(.text+0xb): undefined reference to `s_wsfe'
C:/WinNT_ATHLONSSE2/libf77blas.a(xerbla.o):xerbla.f:(.text+0x27): undefined reference to `do_fio'
C:/WinNT_ATHLONSSE2/libf77blas.a(xerbla.o):xerbla.f:(.text+0x43): undefined reference to `do_fio'
C:/WinNT_ATHLONSSE2/libf77blas.a(xerbla.o):xerbla.f:(.text+0x48): undefined reference to `e_wsfe'
C:/WinNT_ATHLONSSE2/libf77blas.a(xerbla.o):xerbla.f:(.text+0x5c): undefined reference to `s_stop'
collect2: ld returned 1 exit status
make[2]: *** [../../bin/Rblas.dll] Error 1
make[1]: *** [rbuild] Error 2
make: *** [all] Error 2

The ATLAS BLAS was built using Cygwin. AFTER building ATLAS BLAS I
changed the Path variable, putting C:\Rtools\tools\bin;C:\MinGW\bin
before everything else.
To build R I followed the R Installation and Administration manual and
Duncan Murdoch's guide at http://www.murdoch-sutherland.com/Rtools/,
including the version of MinGW.

At ATLAS web page (http://math-atlas.sourceforge.net/errata.html) I
found the following:

Q: I'm linking with C, and getting missing symbols (such as w_wsfe,
do_fio, w_esfe or s_stop).
R: These kinds of symbols are Fortran library calls. The problem is
that the C linker does not automatically find the Fortran libraries.
rewrite your code so that Fortran routines are not called. If you know
where they are, you can also choose to link in the Fortran libraries
explicitly

Well, I can understand that there is a huge probability that this is
my problem. Unfortunately I know nothing of C or Fortran. Even if I
knew that I have these Fortran libraries I wouldn't know how to link
them. I tried to look at MinGW web page but found nothing.
Any help would be most welcome, please.
Giuseppe Antonaci

Sorry for English errors and lack of knowledge. I hope I made myself
understandable.

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
and provide commented, minimal, self-contained, reproducible code.

```

### Re: [R] creation of new variables

Depends on what these variables are. Are these vectors?
If so, a simple
a*b etc. should work.
If they are columns of a data frame DF,
then DF$a*DF$b.

If these variables are part of a function then a*b should also work.

On 9/26/06, nalluri pratap [EMAIL PROTECTED] wrote:

Hello All,

I have 8 variables named

a b c d e f g h

I need to create four variables from these 8 variables in R.

the new variables are ab,cd,ef,gh.

Can anyone please help me

thanks,
Pratap

-

[[alternative HTML version deleted]]

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

--
Ritwik Sinha
Epidemiology and Biostatistics
Case Western Reserve University

http://darwin.cwru.edu/~rsinha

[[alternative HTML version deleted]]

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
and provide commented, minimal, self-contained, reproducible code.

```
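
The reply above reads "ab" as the product of `a` and `b`; the original question does not say so, so that reading is an assumption. Under it, a minimal sketch for columns of a data frame, with made-up values and a hypothetical lookup list `var_pairs`:

```r
# Made-up data with the eight variables as columns.
DF <- data.frame(a = 1:3, b = 4:6, c = 1:3, d = 2:4,
                 e = 0:2, f = 5:7, g = 3:1, h = 1:3)

# Which two columns feed each new variable (assumed pairing).
var_pairs <- list(ab = c("a", "b"), cd = c("c", "d"),
                  ef = c("e", "f"), gh = c("g", "h"))

# Create each new column as the elementwise product of its pair.
for (nm in names(var_pairs)) {
  DF[[nm]] <- DF[[var_pairs[[nm]][1]]] * DF[[var_pairs[[nm]][2]]]
}

print(DF[, names(var_pairs)])
```

If "ab" was meant as something else (a sum, a pasted label, an interaction of factors), only the expression inside the loop would change.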

### Re: [R] colClasses: supressed 'NA'

```Because by default blank fields aren't considered to be missing in factors
but they are in integer vectors.

f1 <- factor(c(1,2,"",3,4))
f1
[1] 1 2   3 4
Levels:  1 2 3 4

I think you can fix this by specifying na.strings=c("NA","")

On 26/09/06, Anupam Tyagi [EMAIL PROTECTED] wrote:

Hi,

The colClasses seem to be suppressing 'NA' values. How do I fix this?

R script and first 5 lines of output is below.

File test2.dat has blanks that are read as NA when I do not use
'colClasses', but as blanks when I use 'colClasses'.

col.names=c("psu","losewt","maintain","fewcal","phyact","age","income","weight",
"wtdesire","gender"),
colClasses=c("factor","factor","factor","factor","factor","numeric","factor",
"numeric","numeric","factor"),
nrows=27, comment.char="")

temp.df
psu losewt maintain fewcal phyact age income weight wtdesire
gender
1   2003009323  2252
05220  220  1
2   2003005181  21  2  2  58
08165  145  2
3   2003015942  21  4  1  76
05142  130  2
4   2003011406  21  3  1  43
03110  110  2
5   2003006786  1   4  1  49
06178  145  2

? why am I not getting missing values when I use 'colClasses'?

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

--
=
David Barron
University of Oxford
Park End Street
Oxford OX1 1HP

[[alternative HTML version deleted]]

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
and provide commented, minimal, self-contained, reproducible code.

```
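
The behaviour described in this reply can be reproduced on a tiny inline data set. The sketch below uses `read.csv()` with made-up text rather than the poster's fixed-width file, so the field layout is an assumption; the point is only the effect of `na.strings` on factor columns.

```r
# Made-up data: the second row has an empty 'grp', the third an empty 'id'.
txt <- "id,grp\n1,a\n2,\n,b\n"

# Default na.strings: empty fields in factor columns become the level "".
d_default <- read.csv(text = txt, colClasses = c("factor", "factor"))
print(levels(d_default$grp))

# Treating "" as missing as well turns those fields into NA.
d_fixed <- read.csv(text = txt, colClasses = c("factor", "factor"),
                    na.strings = c("NA", ""))
print(d_fixed)
```

For fields that contain blanks rather than nothing at all, the matching `na.strings` entry would have to account for how the reader strips white space.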

### Re: [R] New project: littler for GNU R

```On 9/26/2006 1:04 PM, Jeffrey Horner wrote:
What ?
==

littler - Provides hash-bang (#!) capability for R (www.r-project.org)

Why ?
=

GNU R, a language and environment for statistical computing and
graphics, provides a wonderful system for 'programming with data'
as well as interactive exploratory analysis, often involving graphs.

Sometimes, however, simple scripts are desired. While GNU R can
be used in batch mode, and while so-called 'here' documents can be
crafted, a long-standing need for a scripting front-end has often
been expressed by the R Community.

littler (pronounced 'little R' and written 'r') aims to fill
this need.

It can be used directly on the command-line just like, say, bc(1):

$ echo 'cat(pi^2,"\n")' | r
9.869604

Is there a technical reason that this couldn't work by modifying the
script that invokes R?  That would avoid the r/R clash on MacOSX and
Windows.  In Windows R is R.exe, not a script, so some adjustment would
be needed there, but that shouldn't be difficult.

Duncan Murdoch

Equivalently, commands that are to be evaluated can be given on
the command-line

$ r -e 'cat(pi^2, "\n")'
9.869604

But unlike bc(1), GNU R has a vast number of statistical
functions. For example, we can quickly compute a summary() and show
a stem-and-leaf plot for file sizes in a given directory via

\$ ls -l /boot | awk '!/^total/ {print \$5}' | \
print(summary(fsizes)); stem(fsizes)'
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
     13     512  110100  486900  768400 4735000

The decimal point is 6 digit(s) to the right of the |

0 | 002223
0 | 5557778899
1 | 112233
1 | 5
2 |
2 |
3 |
3 |
4 |
4 | 7

And, last but not least, this (somewhat unwieldy) expression can
be stored in a helper script:

\$ cat examples/fsizes.r
#!/usr/bin/env r

print(summary(fsizes))
stem(fsizes)

(where calling /usr/bin/env is a trick from Python which allows one
to forget whether r is installed in /usr/bin/r, /usr/local/bin/r,
~/bin/r, ...)

A few examples are provided in the source directories examples/
and tests/.

Where ?
===

http://biostat.mc.vanderbilt.edu/LittleR

accessed by anonymous SVN:

or (soon !) be gotten from Debian mirrors via

$ apt-get install littler

littler is known to build and run on Linux and OS X.

Who ?
=

Copyright (C) 2006 Jeffrey Horner and Dirk Eddelbuettel

littler is free software; you can redistribute it and/or modify it
under the terms of the GNU General Public License as published by
the Free Software Foundation; either version 2 of the License, or
(at your option) any later version.

This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
General Public License for more details.

You should have received a copy of the GNU General Public
License along with this program; if not, write to the Free
Software Foundation, Inc., 59 Temple Place, Suite 330, Boston,
MA  02111-1307  USA

Comments are welcome, as are suggestions, bug fixes, or patches.

- Jeffrey Horner [EMAIL PROTECTED]
- Dirk Eddelbuettel [EMAIL PROTECTED]

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
and provide commented, minimal, self-contained, reproducible code.

```

### Re: [R] colClasses: supressed 'NA'

```

Anupam Tyagi wrote:
Hi,

The colClasses seem to be suppressing 'NA' values. How do I fix this?

R script and first 5 lines of output is below.

File test2.dat has blanks that are read as NA when I do not use
'colClasses', but as blanks when I use 'colClasses'.

Well, you say it should be a factor, hence " " is taken as a level.
Otherwise you have to specify na.strings = " ".

Uwe Ligges

col.names=c("psu","losewt","maintain","fewcal","phyact","age","income","weight",
"wtdesire","gender"),
colClasses=c("factor","factor","factor","factor","factor","numeric","factor",
"numeric","numeric","factor"),
nrows=27, comment.char="")

temp.df
psu losewt maintain fewcal phyact age income weight wtdesire gender
1   2003009323  2252 05220  220  1
2   2003005181  21  2  2  58 08165  145  2
3   2003015942  21  4  1  76 05142  130  2
4   2003011406  21  3  1  43 03110  110  2
5   2003006786  1   4  1  49 06178  145  2

? why am I not getting missing values when I use 'colClasses'?

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
and provide commented, minimal, self-contained, reproducible code.

```

### Re: [R] New project: littler for GNU R

```
On 26 September 2006 at 15:48, Duncan Murdoch wrote:
| On 9/26/2006 1:04 PM, Jeffrey Horner wrote:
| It can be used directly on the command-line just like, say, bc(1):
|
|
|   $ echo 'cat(pi^2,"\n")' | r
|   9.869604
|
| Is there a technical reason that this couldn't work by modifying the
| script that invokes R?  That would avoid the r/R clash on MacOSX and
| Windows.  In Windows R is R.exe, not a script, so some adjustment would
| be needed there, but that shouldn't be difficult.

Quite possible. We would surely encourage it.

We'd be happy to retire littler to the dustbin when `the real R' can do this
too.  Until then, littler appears to serve one of us rather well (as R still
can't do shebang-style scripts), and may hence be of interest to others too.

Regards, Dirk

--
Hell, there are no rules here - we're trying to accomplish something.
-- Thomas A. Edison

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
and provide commented, minimal, self-contained, reproducible code.

```

### Re: [R] New project: littler for GNU R

```On 9/26/06, Seth Falcon [EMAIL PROTECTED] wrote:
Wow, looks neat.

OS X users will be unhappy with your naming choice as the default
filesystem there is not case-sensitive :-(

IOW, r and R do the same thing.  I would expect it to otherwise work
on OS X so a change of some sort might be worthwhile.

Installing as 'littler' on OS X might be a reasonable solution.

Then again, adapting /usr/bin/R to have a python-style -c switch might
be the best long-term solution for R 2.5+.

Chris, waiting for apt-get install littler to work :-)

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
and provide commented, minimal, self-contained, reproducible code.

```

### Re: [R] New project: littler for GNU R

```I like this plan and have now played with the concept.  I did the following
on Windows in cygwin.  It would also work in Unix, and I think could be tickled
to work on the standard MS cmd line in Windows.  It would certainly work
on Windows with a Windows-native port of the basic unix utilities.

echo 'options(echo=FALSE);cat(pi^2,"\n")' | Rterm --no-save

This produces an output file, that normally shows up in the *shell*
buffer, but could be redirected.   The obvious place to redirect it to is
awk with a script to filter out everything above the echo of the options()
line.

The only change to R needed to remove the need for an awk script
is to suppress the display of the copyright message and startup
information.  I suppose that could be done with a new
--suppress-startup-info argument to Rterm.

The other optimizations that Jeffrey and Dirk have, such as
would also need to be done.

Very good work and concept.

Rich

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
and provide commented, minimal, self-contained, reproducible code.

```

### Re: [R] New project: littler for GNU R

```The way it should work IMHO is that one can write any of these
(in analogy to awk/perl/etc.):

R -f myprog.R mydata.dat
R -f myprog.R < mydata.dat
cat mydata.dat | R -f myprog.R # or analogously on Windows
R -e "...some.R.code..." mydata.dat
R -e "...some.R.code..." < mydata.dat

and there should be a simple way for myprog.R to read the
input data that does not require that it know whether it
was specified on the command line or redirected.

On 9/26/06, Richard M. Heiberger [EMAIL PROTECTED] wrote:
I like this plan and have now played with the concept.  I did the following
on Windows in cygwin.  It would also work in Unix, and I think could be
tickled
to work on the standard MS cmd line in Windows.  It would certainly work
on Windows with a Windows-native port of the basic unix utilities.

echo 'options(echo=FALSE);cat(pi^2,"\n")' | Rterm --no-save

This produces an output file, that normally shows up in the *shell*
buffer, but could be redirected.   The obvious place to redirect it to is
awk with a script to filter out everything above the echo of the options()
line.

The only change to R needed to remove the need for an awk script
is to suppress the display of the copyright message and startup
information.  I suppose that could be done with a new
--suppress-startup-info argument to Rterm.

The other optimizations that Jeffrey and Dirk have, such as
would also need to be done.

Very good work and concept.

Rich

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
and provide commented, minimal, self-contained, reproducible code.

```
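
The wish expressed in this message is that a script should not need to know whether its data came from a file named on the command line or from redirected input. A minimal sketch of how a script might hide that choice; the helper `open_input` is hypothetical, and it assumes the file name, if any, is the first trailing command-line argument:

```r
# Hypothetical helper: open the named file if one was given, else stdin.
open_input <- function(args = commandArgs(trailingOnly = TRUE)) {
  if (length(args) >= 1) file(args[1], open = "r") else file("stdin", open = "r")
}

# Demonstration with a temporary file standing in for mydata.dat.
tmp <- tempfile(fileext = ".dat")
writeLines(c("1", "2", "3"), tmp)

con <- open_input(tmp)
x <- as.numeric(readLines(con))
close(con)
unlink(tmp)

print(sum(x))
```

The rest of the script then works only with the connection and never inspects where the data came from.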

### [R] 5 binary_class models vs one 5-class model

```Hi,

I apologize that this question is not very R-related, but I believe many
people using R are experts in, or interested to know the answer to,
the following question.

I am having a problem in classification. In bioinformatics studies, we
always end up with a limited number of samples. Also, some
specific algorithms cannot handle modeling problems with more than 2
classes. For the time being, not considering those limitations, I just
have a general question like this:

suppose I have a problem for classification, which involves 5 classes.
I am wondering if there is a general research comparison on which
approach is more accurate: building 5 binary_class models or building
one 5-class model (suppose cost (penalty) is same when accuracy is
estimated).

An extended or more practical question, in bioinformatics, if you do
not have many samples but you are having such problem, what approach
will you take?

thanks,

--
Weiwei Shi, Ph.D
Research Scientist
GeneGO, Inc.

Did you always know?
No, I did not. But I believed...
---Matrix III

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
and provide commented, minimal, self-contained, reproducible code.

```
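
To make the "5 binary models" option concrete, here is a minimal sketch of one common reading of it, one-vs-rest: fit one binary model per class and assign each sample to the class with the highest fitted probability. Everything in it is an assumption for illustration: the simulated two-predictor data, the choice of logistic regression via `glm()`, and evaluation on the training data itself. It does not answer which approach is more accurate.

```r
set.seed(1)
n_per <- 30
classes <- paste0("class", 1:5)
centers <- cbind(x1 = c(0, 4, 0, 4, 2), x2 = c(0, 0, 4, 4, 8))

# Simulated data: five Gaussian clusters, one per class.
dat <- data.frame(
  x1 = rnorm(5 * n_per, mean = rep(centers[, "x1"], each = n_per)),
  x2 = rnorm(5 * n_per, mean = rep(centers[, "x2"], each = n_per)),
  y  = factor(rep(classes, each = n_per), levels = classes)
)

# One binary logistic regression per class ("this class" vs "the rest").
# Warnings about near-perfect separation are suppressed in this sketch.
scores <- sapply(classes, function(k) {
  fit <- suppressWarnings(
    glm(as.integer(y == k) ~ x1 + x2, data = dat, family = binomial))
  suppressWarnings(predict(fit, type = "response"))
})

# Predict the class whose binary model gives the highest probability.
pred <- factor(classes[max.col(scores, ties.method = "first")], levels = classes)
acc <- mean(pred == dat$y)
print(table(truth = dat$y, predicted = pred))
print(acc)
```

A single multi-class model could be fitted to the same data and compared in the same way; with few samples, the comparison would need cross-validation rather than training accuracy.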

### Re: [R] New project: littler for GNU R

```On Tue, 26 Sep 2006, Richard M. Heiberger wrote:

I like this plan and have now played with the concept.  I did the following
on Windows in cygwin.  It would also work in Unix, and I think could be
tickled
to work on the standard MS cmd line in Windows.  It would certainly work
on Windows with a Windows-native port of the basic unix utilities.

echo 'options(echo=FALSE);cat(pi^2,"\n")' | Rterm --no-save

This produces an output file, that normally shows up in the *shell*
buffer, but could be redirected.   The obvious place to redirect it to is
awk with a script to filter out everything above the echo of the options()
line.

The only change to R needed to remove the need for an awk script
is to suppress the display of the copyright message and startup
information.  I suppose that could be done with a new
--suppress-startup-info argument to Rterm.

It is called --slave.

The other optimizations that Jeffrey and Dirk have, such as
would also need to be done.

Rterm --slave R_DEFAULT_PACKAGES=NULL

and variables is already widely used in the R build process.

Very good work and concept.

Rich

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
and provide commented, minimal, self-contained, reproducible code.

--
Brian D. Ripley,  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel:  +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UKFax:  +44 1865 272595

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
and provide commented, minimal, self-contained, reproducible code.

```

### Re: [R] New project: littler for GNU R

```On 9/26/06, Richard M. Heiberger [EMAIL PROTECTED] wrote:
I like this plan and have now played with the concept.  I did the following
on Windows in cygwin.  It would also work in Unix, and I think could be
tickled
to work on the standard MS cmd line in Windows.  It would certainly work
on Windows with a Windows-native port of the basic unix utilities.

echo 'options(echo=FALSE);cat(pi^2,"\n")' | Rterm --no-save

This produces an output file, that normally shows up in the *shell*
buffer, but could be redirected.   The obvious place to redirect it to is
awk with a script to filter out everything above the echo of the options()
line.

It seems to me that a big difference between this and littler is how
stdin is treated. How would you implement the fsizes.r example using
this concept?

The only change to R needed to remove the need for an awk script
is to suppress the display of the copyright message and startup
information.  I suppose that could be done with a new
--suppress-startup-info argument to Rterm.

I typically use

--vanilla --slave

(which I assume would work on Windows too).

The other optimizations that Jeffrey and Dirk have, such as
would also need to be done.

Very good work and concept.

Rich

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
and provide commented, minimal, self-contained, reproducible code.

```

### Re: [R] colClasses: supressed 'NA'

```Uwe Ligges ligges at statistik.uni-dortmund.de writes:

Well, you say it should be a factor, hence " " is taken as a level.

And why not " " a level? Thanks for drawing my attention to it. It is a common
mistake that easily escapes attention. Thanks a lot. Anupam.

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
and provide commented, minimal, self-contained, reproducible code.

```

### [R] Lattice strip labels for two factors

```Dear All:

In the following code, which I modified from a previous question, in addition
to showing the fact1 level names (y, b, r) in the strips, I also want to have a
color bar to indicate the state of every panel (in this example, y
corresponds to 1, and b, r correspond to 0). Does anyone have a quick
solution?

Thanks

df <- expand.grid(fact1=c("y","b","r"),
fact2=c("far","por","lis","set"), year=1991:2000, value=NA)
df[,"value"] <- sample(1:50, 120, replace=TRUE)
df$state <- 0
df$state[df$fact1=="y"] <- 1

require(lattice)
xyplot( value ~ year | fact1, data=df, type="b", subset= fact2=="far",
strip = strip.custom(bg=gray.colors(1,0.95),
factor.levels=c("yellow", "black", "red")), layout=c(1,3))

```

### Re: [R] Lattice strip labels for two factors

```On 9/26/06, Joe Moore [EMAIL PROTECTED] wrote:
> Dear All:
>
> In the following code which I modified from previous question,

Perhaps you should also have checked if it runs after the modification.

> to show the fact1 level names (y, b, r) in strips, I also want to have a
> color bar to indicate the state of every panel (in this example, y
> corresponds to 1, and b, r correspond to 0). Does anyone have a quick
> solution?

No, but this might give you a hint (you need to write a suitable panel
function):

xyplot(value ~ year | fact1:factor(state),
data=df, type="b",
subset= fact2=="far",
layout=c(1,3))

Deepayan

> Thanks
>
> df <- expand.grid(fact1=c("y","b","r"),
> fact2=c("far","por","lis","set"), year=1991:2000, value=NA)
> df[,"value"] <- sample(1:50, 120, replace=TRUE)
> df$state <- 0
> df$state[df$fact1=="y"] <- 1
>
> require(lattice)
> xyplot( value ~ year | fact1, data=df, type="b", subset= fact2=="far",
> strip = strip.custom(bg=gray.colors(1,0.95),
> factor.levels=c("yellow", "black", "red")), layout=c(1,3))

```
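One way the state could be shown per panel is a custom strip function that picks the strip background from the state of the level it displays. This is only a sketch, not the solution from the thread: it colours the whole strip rather than drawing a separate bar, and the names `state_by_level`, `state_cols` and `state_strip` and the colours are hypothetical.

```r
library(lattice)

# Data as in the question (seed added so the sketch is reproducible).
set.seed(1)
df <- expand.grid(fact1 = c("y", "b", "r"),
                  fact2 = c("far", "por", "lis", "set"),
                  year = 1991:2000)
df$value <- sample(1:50, nrow(df), replace = TRUE)
df$state <- ifelse(df$fact1 == "y", 1, 0)

# One state per fact1 level, and a hypothetical colour for each state.
state_by_level <- tapply(df$state, df$fact1, max)
state_cols <- c("0" = "grey90", "1" = "gold")

# Custom strip: look up the level shown in this strip, then delegate to
# strip.default() with a background chosen from that level's state.
state_strip <- function(which.given, which.panel, factor.levels, ..., bg) {
  lev <- factor.levels[which.panel[which.given]]
  strip.default(which.given = which.given, which.panel = which.panel,
                factor.levels = factor.levels, ...,
                bg = unname(state_cols[as.character(state_by_level[lev])]))
}

p <- xyplot(value ~ year | fact1, data = df, type = "b",
            subset = fact2 == "far", strip = state_strip, layout = c(1, 3))
# print(p) draws the plot on the active graphics device.
```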

### [R] Bug in formals<-

```I think this is new since a previous version of R:

h <- function(x, trantab) trantab[x]
w <- 6:4
names(w) <- c('cat','dog','giraffe')
w
    cat     dog giraffe
      6       5       4

formals(h) <- list(x=numeric(0), trantab=w)
h
function (x = numeric(0), trantab = c(6, 5, 4))
trantab[x]

You can see that the names have been dropped from trantab's default
values.  I don't see a workaround but it seems to need fixing.

Version 2.3.1 (2006-06-01)
i486-pc-linux-gnu

attached base packages:
[1] grid  methods   stats graphics  grDevices utils
[7] datasets  base

other attached packages:
lattice   acepack Hmisc
0.13-10 1.3-2.2  3.0-12

--
Frank E Harrell Jr   Professor and Chair   School of Medicine
Department of Biostatistics   Vanderbilt University

```

### Re: [R] Bug in formals<-

```On 9/26/06, Frank E Harrell Jr [EMAIL PROTECTED] wrote:
> I think this is new since a previous version of R:
>
> h <- function(x, trantab) trantab[x]
> w <- 6:4
> names(w) <- c('cat','dog','giraffe')
> w
>     cat     dog giraffe
>       6       5       4
>
> formals(h) <- list(x=numeric(0), trantab=w)
> h
> function (x = numeric(0), trantab = c(6, 5, 4))
> trantab[x]
>
> You can see that the names have been dropped from trantab's default
> values.

Are you sure? I get

formals(h)
$x
numeric(0)

$trantab
    cat     dog giraffe
      6       5       4

h(1)
cat
  6

R version 2.4.0 beta (2006-09-21 r39463)
x86_64-unknown-linux-gnu

-Deepayan

> Version 2.3.1 (2006-06-01)
> i486-pc-linux-gnu
>
> attached base packages:
> [1] grid  methods   stats graphics  grDevices utils
> [7] datasets  base
>
> other attached packages:
> lattice   acepack Hmisc
> 0.13-10 1.3-2.2  3.0-12
>
> --
> Frank E Harrell Jr   Professor and Chair   School of Medicine
> Department of Biostatistics   Vanderbilt University

```

### Re: [R] Bug in formals<-

```This seems to be related to using c to define transtab.
If we use list in place of c then it displays ok:

h <- function(x, transtab) transtab[x]
formals(h) <- list(x = numeric(0), transtab = c(cat = 6, dog = 5))
h # display drops the names
function (x = numeric(0), transtab = c(6, 5))
transtab[x]
h("cat") # runs ok
cat
  6
formals(h) <- list(x = numeric(0), transtab = list(cat = 6, dog = 5))
print(h) # now display is ok
function (x = numeric(0), transtab = list(cat = 6, dog = 5))
transtab[x]
h("cat") # runs ok
$cat
[1] 6

On 9/26/06, Frank E Harrell Jr [EMAIL PROTECTED] wrote:
> Deepayan Sarkar wrote:
>> On 9/26/06, Frank E Harrell Jr [EMAIL PROTECTED] wrote:
>>> I think this is new since a previous version of R:
>>>
>>> h <- function(x, trantab) trantab[x]
>>> w <- 6:4
>>> names(w) <- c('cat','dog','giraffe')
>>> w
>>>     cat     dog giraffe
>>>       6       5       4
>>>
>>> formals(h) <- list(x=numeric(0), trantab=w)
>>> h
>>> function (x = numeric(0), trantab = c(6, 5, 4))
>>> trantab[x]
>>>
>>> You can see that the names have been dropped from trantab's default
>>> values.
>>
>> Are you sure? I get
>>
>> formals(h)
>> $x
>> numeric(0)
>>
>> $trantab
>>     cat     dog giraffe
>>       6       5       4
>>
>> h(1)
>> cat
>>   6
>>
>> R version 2.4.0 beta (2006-09-21 r39463)
>> x86_64-unknown-linux-gnu
>>
>> -Deepayan
>
> Deepayan -
>
> You are correct.  h('cat') is 6 as intended.  I just looked at the
> function definition - the names attribute doesn't show for some reason.
> I was expecting function(..., trantab=c(cat=6, ..).
>
> Thanks
>
> Frank
>
>>> Version 2.3.1 (2006-06-01)
>>> i486-pc-linux-gnu
>>>
>>> attached base packages:
>>> [1] grid  methods   stats graphics  grDevices utils
>>> [7] datasets  base
>>>
>>> other attached packages:
>>> lattice   acepack Hmisc
>>> 0.13-10 1.3-2.2  3.0-12

```
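The conclusion of this thread, that the names are kept in the stored default even when the printed header omits them, can be checked directly. The sketch below also shows one possible way to make the names visible in the function definition: storing the default as an unevaluated call with `alist()`. This is an illustration under that assumption, not something proposed in the thread, and `h2` is a hypothetical name.

```r
# Setting a default to an evaluated named vector keeps the names in the
# stored value, whatever the printed function header looks like.
h <- function(x, trantab) trantab[x]
w <- c(cat = 6L, dog = 5L, giraffe = 4L)
formals(h) <- list(x = numeric(0), trantab = w)
names(formals(h)$trantab)
h("cat")

# Storing the default as an unevaluated call (via alist) makes the
# names part of the expression itself, so they appear when deparsed.
h2 <- function(x, trantab) trantab[x]
formals(h2) <- alist(x = numeric(0),
                     trantab = c(cat = 6, dog = 5, giraffe = 4))
deparse(formals(h2)$trantab)
h2("cat")
```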

### Re: [R] New project: littler for GNU R

```Duncan Murdoch wrote:
> On 9/26/2006 1:04 PM, Jeffrey Horner wrote:
>
> [...]
>
>> It can be used directly on the command-line just like, say, bc(1):
>>
>> $ echo 'cat(pi^2,"\n")' | r
>> 9.869604
>
> Is there a technical reason that this couldn't work by modifying the
> script that invokes R?  That would avoid the r/R clash on MacOSX and
> Windows.  In Windows R is R.exe, not a script, so some adjustment would
> be needed there, but that shouldn't be difficult.

In fact, it does work:

$ echo 'cat(pi^2,"\n")' | R --vanilla --slave
9.869604

but what's more compelling is the ability to utilize the UNIX hash-bang
mechanism.

Jeff
--
http://biostat.mc.vanderbilt.edu/JeffreyHorner

```

### Re: [R] New project: littler for GNU R

```Seth Falcon wrote:
> Jeffrey Horner [EMAIL PROTECTED] writes:
>
> [...]
>
>> littler will install into /usr/local/bin by default, so I don't think
>> there's a clash with the Mac binary provided by CRAN, right?
>
> It depends what you mean by clash :-)
>
> If both are on the PATH, then you get the first one, I suspect, when
> running either 'R' or 'r'.  I haven't tested this bit yet, but on my
> OS X laptop I can invoke a new R session using either 'R' or 'r'
> (using an R built from source, not the R GUI app thingie).

Good point, but the executable path can be named absolutely in hash-bang
scripts. Relative paths work as well with the use of '/usr/bin/env
program' as is described in the littler announcement, but then you don't
get to pass arguments to 'program', just to the hash-bang script.

> So IMO, a different name or an integration into the R script in some
> way would be a big improvement.

But I'd like to know why there's an R script in the first place. Why not
just an executable as on windows?

> 'r' is cute, but going down the road of tools with the same name
> except for caps leads to confusion (for me).  For example, R CMD
> build/INSTALL still catches me up after a number of years.

That's a different problem than case-sensitivity. The word 'build' must
have had a different semantic than INSTALL, and I'm not sure why one was
all caps and the other isn't.

Jeff
--
http://biostat.mc.vanderbilt.edu/JeffreyHorner

```

### Re: [R] New project: littler for GNU R

```The real problem is that one wants to pipe the data in, not the
R source.  The idea is that one successively transforms the
data in successive elements of the pipeline.

For example one might want to write cut, grep, etc. in R rather than
in C.

This has been on my year-end wishlist for some time.

On 9/26/06, Duncan Murdoch [EMAIL PROTECTED] wrote:
On 9/26/2006 1:04 PM, Jeffrey Horner wrote:
What ?
==

littler - Provides hash-bang (#!) capability for R (www.r-project.org)

Why ?
=

GNU R, a language and environment for statistical computing and
graphics, provides a wonderful system for 'programming with data'
as well as interactive exploratory analysis, often involving graphs.

Sometimes, however, simple scripts are desired. While GNU R can
be used in batch mode, and while so-called 'here' documents can be
crafted, a long-standing need for a scripting front-end has often
been expressed by the R Community.

littler (pronounced 'little R' and written 'r') aims to fill
this need.

It can be used directly on the command-line just like, say, bc(1):

$ echo 'cat(pi^2,"\n")' | r
9.869604

Is there a technical reason that this couldn't work by modifying the
script that invokes R?  That would avoid the r/R clash on MacOSX and
Windows.  In Windows R is R.exe, not a script, so some adjustment would
be needed there, but that shouldn't be difficult.

Duncan Murdoch

Equivalently, commands that are to be evaluated can be given on
the command-line

$ r -e 'cat(pi^2, "\n")'
9.869604

But unlike bc(1), GNU R has a vast number of statistical
functions. For example, we can quickly compute a summary() and show
a stem-and-leaf plot for file sizes in a given directory via

$ ls -l /boot | awk '!/^total/ {print $5}' | \
r -e 'fsizes <- as.integer(readLines());
print(summary(fsizes)); stem(fsizes)'
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
     13     512  110100  486900  768400 4735000

The decimal point is 6 digit(s) to the right of the |

0 | 002223
0 | 5557778899
1 | 112233
1 | 5
2 |
2 |
3 |
3 |
4 |
4 | 7

And, last but not least, this (somewhat unwieldy) expression can
be stored in a helper script:

$ cat examples/fsizes.r
#!/usr/bin/env r

fsizes <- as.integer(readLines())
print(summary(fsizes))
stem(fsizes)

(where calling /usr/bin/env is a trick from Python which allows one
to forget whether r is installed in /usr/bin/r, /usr/local/bin/r,
~/bin/r, ...)

A few examples are provided in the source directories examples/
and tests/.

Where ?
===

http://biostat.mc.vanderbilt.edu/LittleR

accessed by anonymous SVN:

or (soon !) be gotten from Debian mirrors via

$ apt-get install littler

littler is known to build and run on Linux and OS X.

Who ?
=

Copyright (C) 2006 Jeffrey Horner and Dirk Eddelbuettel

littler is free software; you can redistribute it and/or modify it
under the terms of the GNU General Public License as published by
the Free Software Foundation; either version 2 of the License, or
(at your option) any later version.

This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
General Public License for more details.

You should have received a copy of the GNU General Public
License along with this program; if not, write to the Free
Software Foundation, Inc., 59 Temple Place, Suite 330, Boston,
MA  02111-1307  USA

Comments are welcome, as are suggestions, bug fixes, or patches.

- Jeffrey Horner [EMAIL PROTECTED]
- Dirk Eddelbuettel [EMAIL PROTECTED]

```

### [R] matrix with additional upper, botton, left and right cells

```Dear R Gurus,

I have a 1000x1000 matrix and I need to create a second matrix of dimension
1002x1002, with my first matrix inserted starting at row 2, column 2. Please
see an example below:

0050055050
555000
5000505005
5005000500
000555

and I need

300500550503
35550003
350005050053
350050005003
30005553

Thanks a lot,

miltinho

```

### Re: [R] New project: littler for GNU R

```
On 26 September 2006 at 22:17, Gabor Grothendieck wrote:
| The real problem is that one wants to pipe the data in, not the
| R source.  The idea is that one successively transforms the
| data in successive elements of the pipeline.

But that is what our filesize example does::

| On 9/26/06, Duncan Murdoch [EMAIL PROTECTED] wrote:
|  On 9/26/2006 1:04 PM, Jeffrey Horner wrote:
[...]
|  But unlike bc(1), GNU R has a vast number of statistical
|  functions. For example, we can quickly compute a summary() and show
|  a stem-and-leaf plot for file sizes in a given directory via
|
|  $ ls -l /boot | awk '!/^total/ {print $5}' | \
|    r -e 'fsizes <- as.integer(readLines());
|      print(summary(fsizes)); stem(fsizes)'
|     Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
|       13     512  110100  486900  768400 4735000
|
|  The decimal point is 6 digit(s) to the right of the |
|
|  0 | 002223
|  0 | 5557778899
|  1 | 112233
|  1 | 5
|  2 |
|  2 |
|  3 |
|  3 |
|  4 |
|  4 | 7

Data to be processed on stdin, command via -e 'some long expression'.

To make it simpler, here is a somewhat useless example of r piping into r

$  r -e 'set.seed(42); sapply(rnorm(5),function(x) cat(x,"\n"))' |  \
3.335916

Isn't that something where, to quote you, one wants to pipe the data in, not
the R source ?

Dirk

--
Hell, there are no rules here - we're trying to accomplish something.
-- Thomas A. Edison

```
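The pattern described in this message, data arriving on standard input while the R code is supplied separately, can be sketched in plain R as follows. The function name `summarise_sizes` is hypothetical, and a `textConnection` stands in for piped input so the example runs anywhere. In a script fed through a pipe, `file("stdin")` is the usual way to refer to the process's standard input.

```r
# Read one number per line from a connection and summarise it.
# In a script run through a pipe, 'con' could be file("stdin");
# a textConnection stands in for piped input here.
summarise_sizes <- function(con) {
  fsizes <- as.integer(readLines(con))
  fsizes <- fsizes[!is.na(fsizes)]   # drop blank lines, which parse as NA
  summary(fsizes)
}

con <- textConnection(c("13", "512", "110100", "4735000"))
s <- summarise_sizes(con)
close(con)
print(s)
```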

### Re: [R] matrix with additional upper, botton, left and right cells

```How about something like this:
x <- matrix(1:100,10)
x.1 <- array(-3, dim=c(12,12))
x.1[2:11, 2:11] <- x
x.1
      [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10] [,11] [,12]
 [1,]   -3   -3   -3   -3   -3   -3   -3   -3   -3    -3    -3    -3
 [2,]   -3    1   11   21   31   41   51   61   71    81    91    -3
 [3,]   -3    2   12   22   32   42   52   62   72    82    92    -3
 [4,]   -3    3   13   23   33   43   53   63   73    83    93    -3
 [5,]   -3    4   14   24   34   44   54   64   74    84    94    -3
 [6,]   -3    5   15   25   35   45   55   65   75    85    95    -3
 [7,]   -3    6   16   26   36   46   56   66   76    86    96    -3
 [8,]   -3    7   17   27   37   47   57   67   77    87    97    -3
 [9,]   -3    8   18   28   38   48   58   68   78    88    98    -3
[10,]   -3    9   19   29   39   49   59   69   79    89    99    -3
[11,]   -3   10   20   30   40   50   60   70   80    90   100    -3
[12,]   -3   -3   -3   -3   -3   -3   -3   -3   -3    -3    -3    -3

On 9/26/06, Milton Cezar [EMAIL PROTECTED] wrote:
Dear R Gurus,

I have a matrix dim(1000x1000) and I need create a second matrix with
dim(1002x1002) and insert my first matrix at position col=2,line=2. Please,
see an example below:

0050055050
555000
5000505005
5005000500
000555

and I need

300500550503
35550003
350005050053
350050005003
30005553

Thanks a lot,

miltinho

--
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem you are trying to solve?

```
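The same idea can be wrapped in a small helper that works for a matrix of any size, which would cover the 1000x1000 case from the question. The name `pad_matrix` and its `fill` argument are hypothetical, and the small input matrix is made up for illustration.

```r
# Surround a matrix with a one-cell border filled with 'fill'.
pad_matrix <- function(m, fill = 3) {
  out <- matrix(fill, nrow = nrow(m) + 2, ncol = ncol(m) + 2)
  # Drop the original matrix into the interior, starting at row 2, column 2.
  out[2:(nrow(m) + 1), 2:(ncol(m) + 1)] <- m
  out
}

m <- matrix(c(0, 5, 5, 0, 0, 5), nrow = 2)
p <- pad_matrix(m)
dim(p)
```

Allocating the padded matrix once and assigning the interior avoids the repeated copying that building the border with `rbind()` and `cbind()` would involve.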

### [R] histogram colors in lattice

```I have code that constructs a plot using the lattice package that looks
something like the following toy example:

library(lattice)
Start <- factor(rbinom(100,1,.5))

breaks=c(1, 1.4 ,1.6,2),
scales=list(x=list(at=c(1.2,1.8),labels=c("Yes","No"))),
xlab="",ylab="")

I would like to have different colors for the bars in the left and right
panel (say red and green) but I can't find a way to do this. Can anyone give
me any advice on how to achieve this?

Thanks,
Jamie Jarabek

```
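A minimal sketch of per-panel bar colours, assuming a current version of lattice where the panel index is obtained by calling `panel.number()` inside the panel function. The data and the names `start`, `answer` and `panel_cols` are made up for illustration, since part of the original call is missing from the message.

```r
library(lattice)

# Made-up data standing in for the toy example.
set.seed(1)
start <- factor(rbinom(100, 1, 0.5), levels = 0:1, labels = c("No", "Yes"))
answer <- rnorm(100)

panel_cols <- c("red", "green")  # one colour per panel, in panel order

p <- histogram(~ answer | start,
               panel = function(x, ..., col) {
                 # 'col' is listed only to absorb any colour passed in;
                 # panel.number() gives the index of the panel being drawn.
                 panel.histogram(x, ..., col = panel_cols[panel.number()])
               })
# print(p) draws the two panels.
```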

### Re: [R] New project: littler for GNU R

```I think this is quoted out of context. I was referring to Duncan's post
which shows an example of piping R code.

On 9/26/06, Dirk Eddelbuettel [EMAIL PROTECTED] wrote:

On 26 September 2006 at 22:17, Gabor Grothendieck wrote:
| The real problem is that one wants to pipe the data in, not the
| R source.  The idea is that one successively transforms the
| data in successive elements of the pipeline.

But that is what our filesize example does::

| On 9/26/06, Duncan Murdoch [EMAIL PROTECTED] wrote:
|  On 9/26/2006 1:04 PM, Jeffrey Horner wrote:
[...]
|  But unlike bc(1), GNU R has a vast number of statistical
|  functions. For example, we can quickly compute a summary() and show
|  a stem-and-leaf plot for file sizes in a given directory via
|
|  $ ls -l /boot | awk '!/^total/ {print $5}' | \
|    r -e 'fsizes <- as.integer(readLines());
|      print(summary(fsizes)); stem(fsizes)'
|     Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
|       13     512  110100  486900  768400 4735000
|
|  The decimal point is 6 digit(s) to the right of the |
|
|  0 | 002223
|  0 | 5557778899
|  1 | 112233
|  1 | 5
|  2 |
|  2 |
|  3 |
|  3 |
|  4 |
|  4 | 7

Data to be processed on stdin, command via -e 'some long expression'.

To make it simpler, here is a somewhat useless example of r piping into r

$  r -e 'set.seed(42); sapply(rnorm(5),function(x) cat(x,"\n"))' |  \
3.335916

Isn't that something where, to quote you, one wants to pipe the data in, not
the R source ?

Dirk

--
Hell, there are no rules here - we're trying to accomplish something.
-- Thomas A. Edison

```

### Re: [R] histogram colors in lattice

```Try this:

library(lattice)
Start <- factor(rbinom(100,1,.5))

breaks=c(1, 1.4 ,1.6,2),
scales=list(x=list(at=c(1.2,1.8),labels=c("Yes","No"))),
panel = function(x, ..., panel.number, col) {  ## added this panel function
panel.histogram(x, ..., col = panel.number+1)
},
xlab="",ylab="")

On 9/26/06, Jamie Jarabek [EMAIL PROTECTED] wrote:
I have code that constructs a plot using the lattice package that looks
something like the following toy example:

library(lattice)
Start <- factor(rbinom(100,1,.5))

breaks=c(1, 1.4 ,1.6,2),
scales=list(x=list(at=c(1.2,1.8),labels=c("Yes","No"))),
xlab="",ylab="")

I would like to have different colors for the bars in the left and right
panel (say red and green) but I can't find a way to do this. Can anyone give
me any advice on how to achieve this?

Thanks,
Jamie Jarabek

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help