date:20090515

Re: [R] Graphical output format

2009-05-15 Thread Dieter Menne

Stats Wolf stats.wolf at gmail.com writes:

 Postscript, however, does not have to be what I need for two reasons.
 First, it does not accept some special characters from foreign
 languages (exactly like PDF). 

You should give an example of that in PDF. I always had the impression
that PDF is the most comprehensive in foreign character support.

Dieter

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] [OT?]R Reference Manual review/recommend

2009-05-15 Thread AG

John Kane wrote:
 I don't know the book but I doubt that it is a good way to learn R.  

 I'd suggest having a look at some of the documentation available on the R 
 site.  Click on Other (in left column of page) have a look there  and then 
 select the  contributed documentation link to get more documentation.  Have 
 a look at some of these offerings before buying any books.  

 The on-line Introduction to R  (Click on Manuals) is also very useful 
 although I found that it was more useful after I had a basic understanding of 
 R than as an intro for a complete novice who is not a statistician.
 Oh yes, it's also much easier to use in PDF form than in the HTML format. 

 --- On Wed, 5/13/09, AG computing.acco...@googlemail.com wrote:

   
 From: AG computing.acco...@googlemail.com
 Subject: [R] [OT?]R Reference Manual review/recommend
 To: R-help@r-project.org
 Received: Wednesday, May 13, 2009, 4:55 PM
 Hello  all

 I am looking to learn R and was thumbing through volume 1
 of R reference manual - Base Package.  I'm sorry if
 this is ludicrously silly to ask, but is this book worth the
 investment as a good way to learn how to use R?

 AG

Dear all

Thanks for all of the suggestions.  I'm glad I asked before I bought the 
book.

Sounds like there's loads of alternatives, so will pursue those leads.

Many thanks

AG


[R] transposing/rotating XY in a 3D array

2009-05-15 Thread andzsin

Dear list,

We have a number of files containing similarly structured data:
file1:
A B C
1 2 3
4 5 6

file2:
A B C
7 8 9
10 11 12
... etc


My part of R receives all these data as a flat vector: 1,2,3... 12, together
with info about the dimensions (row,col,fileN).

Converting the data into 3D cannot simply be done by:
   array(x, c(2,3,2))
because that breaks the structure (e.g. 1,3,5 is a type mismatch)

> array(1:12, c(2,3,2))
, , 1

     [,1] [,2] [,3]
[1,]    1    3    5
[2,]    2    4    6
...

Of course, following R's indexing order (the row index varies fastest)
retains the structure, but the X and Y dimensions are transposed (note:
c(2,3,2) becomes c(3,2,2)).

> array(1:12, c(3,2,2))
, , 1

     [,1] [,2]
[1,]    1    4
[2,]    2    5
[3,]    3    6

It's like converting into Japanese vertical reading.
It is not that I cannot get used to it, but I guess it is less error
prone if I have the same order as in the data files.

Now I am using an ad-hoc function (see below) to transpose the
rotated YX back into an XYZ array, but I'd rather go with built-ins, e.g.
byrow=T, or t()  -- and also without duplicating my data.
Thanks for the advice in advance.

Gabor

<code>
transposeXYinXYZ <- function(x) {
  # Swap the first two dimensions of a 3D array, one slice at a time.
  y <- array(NA, c(dim(x)[2], dim(x)[1], dim(x)[3]))

  for (i in 1:dim(x)[3]) {
    y[,,i] <- t(x[,,i])
  }
  return(y)
}
xyz <- array(1:24, c(4,3,2))
yxz <- transposeXYinXYZ(xyz)

xyz
yxz
</code>


Re: [R] Graphical output format

2009-05-15 Thread Prof Brian Ripley


On Fri, 15 May 2009, Dieter Menne wrote:


Stats Wolf stats.wolf at gmail.com writes:


Postscript, however, does not have to be what I need for two reasons.
First, it does not accept some special characters from foreign
languages (exactly like PDF).


'foreign' is a relative term which is imprecise (and somewhat 
impolite) when writing to an international community: I don't suppose 
Dieter Menne regards German characters as 'foreign' but Ei-ji Nakama 
does, unlike Japanese ones.



You should give an example of that in PDF. I always had the impression
that PDF is the most comprehensive in foreign character support.


Not really true, but since you can embed bitmaps in both PostScript 
and PDF, there are workarounds.


PostScript and PDF use 8-bit encodings for character strings except 
for some predefined encodings for CJK languages, so in principle this 
is far less comprehensive than windows() and X11() which use Unicode. 
However, in practice the limitations are the glyphs available in the 
specified fonts, and in all the cases I am aware of an available font 
can be encoded in one or two 8-bit encodings (and hence in one or two 
R font families).  You can't mix (say) Russian and Polish characters 
in a single text() call for pdf() (you can for windows()), but you can 
have them in separate calls for the same plot.


There are (on suitable R platforms) cairo_pdf() and cairo_ps() 
devices.  They are (on suitably rich OSes) able to cover a very wide 
range of characters, which they do by embedding the font glyphs into 
the output (often as bitmaps): the quality of the effect often depends 
on the output device used, which is why the traditional approach in 
PS/PDF is to render fonts in the output device.
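
As a rough illustration of the cairo-based route, the sketch below writes 
Cyrillic and Polish text into one PDF.  The file name and the sample strings 
are made up for the example, and whether the glyphs actually appear depends 
on the fonts installed on the system.

```r
# Sketch: mix Cyrillic and Polish text in one PDF via cairo_pdf().
# Assumes an R build with cairo support; 'out' and the strings are illustrative.
out <- file.path(tempdir(), "mixed-scripts.pdf")
russian <- "\u0416\u0443\u043a"    # Cyrillic letters
polish  <- "\u017c\u00f3\u0142w"   # Latin letters with Polish diacritics

if (capabilities("cairo")) {
  cairo_pdf(out, width = 5, height = 4)
  plot(1:10, main = paste(russian, polish))  # both scripts in a single string
  text(5, 8, russian)
  text(5, 3, polish)
  dev.off()
} else {
  message("cairo is not available in this build of R")
}
```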


--
Brian D. Ripley,                  rip...@stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595


Re: [R] Graphical output format

2009-05-15 Thread baptiste auguie



On 15 May 2009, at 10:01, Prof Brian Ripley wrote:


On Fri, 15 May 2009, Dieter Menne wrote:


Stats Wolf stats.wolf at gmail.com writes:

Postscript, however, does not have to be what I need for two reasons.
First, it does not accept some special characters from foreign
languages (exactly like PDF).


'foreign' is a relative term which is imprecise (and somewhat
impolite) when writing to an international community: I don't suppose
Dieter Menne regards German characters as 'foreign' but Ei-ji Nakama
does, unlike Japanese ones.

You should give an example of that in PDF. I always had the impression
that PDF is the most comprehensive in foreign character support.


Just a thought:

There was recently a discussion here on the pgfSweave [1] driver ---  
it should be possible to use it in conjunction with XeTeX [2] to  
process the pgf output. Presumably there will be issues of alignment  
and spacing but at least arbitrary characters of most languages could  
be employed in a fairly straight-forward manner.


[1]: http://r-forge.r-project.org/R/?group_id=331
[2]: http://www.tug.org/xetex/

Regards,

baptiste

_


Baptiste Auguié

School of Physics
University of Exeter
Stocker Road,
Exeter, Devon,
EX4 4QL, UK

Phone: +44 1392 264187

http://newton.ex.ac.uk/research/emag


Re: [R] Simulation

2009-05-15 Thread Kon Knafelman


Hey guys, I've been following this discussion about the simulation, and being a 
beginner myself, I'm really unsure of the best method.

I have the same problem as the initial one, except I need 1000 samples of size 
15, and my distribution is Exp(1). I've adjusted some of the loop formulas for 
my n=15, but I'm unsure how to proceed in the quickest way.

Can someone please help?

Much appreciated :)
 
 From: r.tur...@auckland.ac.nz
 Date: Thu, 14 May 2009 10:26:38 +1200
 To: c...@witthoft.com
 CC: r-help@r-project.org
 Subject: Re: [R] Simulation
 
 
 On 14/05/2009, at 10:04 AM, Carl Witthoft wrote:
 
  So far nobody seems to have warned the OP about seeding.
 
  Presumably Debbie wants 1000 different sets of samples, but as we all
  know there are ways to get the same sequence (initial seed) every time.
  If there's a starting seed for one of the "generate a single giant
  matrix" methods proposed, the whole matrix will be the same for a given
  seed.
  If rnorm is called 1000 times (hopefully w/ different random (oops)
  seeds), the entire set of samples will be different.
 
  and so on.
 
 I really don't get this. The OP wanted 1000 independent samples,
 each of size 100. Whether she does
 
 set.seed(42)
 M <- matrix(rnorm(100*1000), nrow=1000) # Each row is a sample.
 
 or
 
 L <- list()
 set.seed(42)
 for(i in 1:1000) L[[i]] <- rnorm(100) # Each list entry is a sample.
 
 she gets this, i.e. the desired result. Setting a seed serves to make
 the results reproducible. This works via either approach. Making results
 reproducible in this manner is advisable, but seed-setting is nothing
 that the OP needs to be *warned* about.
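
For the Exp(1) variant asked about in this thread (1000 samples of size 15),
the matrix approach might look like the sketch below. The object names are
illustrative only, and the seed is there purely for reproducibility.

```r
# Sketch: 1000 independent samples of size 15 from Exp(1).
n_samples <- 1000   # number of samples
n <- 15             # size of each sample

set.seed(42)        # only needed to make the run reproducible
M <- matrix(rexp(n * n_samples, rate = 1), nrow = n_samples)  # each row is a sample

# Per-sample summaries without an explicit loop
sample_means <- rowMeans(M)

# Re-running with the same seed reproduces the same matrix
set.seed(42)
M2 <- matrix(rexp(n * n_samples, rate = 1), nrow = n_samples)
identical(M, M2)
```

Each row of M is one sample, so apply(M, 1, some_statistic) works for any
statistic that has no ready-made row-wise function.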
 
 cheers,
 
 Rolf Turner
 

Re: [R] help on Nan error

2009-05-15 Thread Richard . Cotton

 When i want to do ANOSIM i get an NaN error message. What is wrong?
  (lots of other code)
  iwithin=rep(0,(N*(N-1)/2) )
  r.w=sum(r*iwithin)/sum(iwithin)

iwithin is a vector of zeroes and so is its sum.  r*iwithin is also a 
vector of zeroes, and so is its sum.  Thus r.w=sum(r*iwithin)/sum(iwithin) 
is zero divided by zero, which is not defined.
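
A minimal sketch of what is happening, with a made-up N and a stand-in for r 
(neither is taken from the original code), plus one possible guard:

```r
# Sketch: a weight vector of zeros makes the ratio 0/0, which R reports as NaN.
N <- 5                                  # illustrative value
r <- seq_len(N * (N - 1) / 2)           # stand-in for the ranks in the original code
iwithin <- rep(0, N * (N - 1) / 2)      # all zeros, as in the post

r.w <- sum(r * iwithin) / sum(iwithin)  # 0 / 0
is.nan(r.w)

# Guarding against an empty "within" group avoids the NaN
r.w.safe <- if (sum(iwithin) > 0) sum(r * iwithin) / sum(iwithin) else NA_real_
```

The real fix is presumably to set the relevant entries of iwithin to 1 before 
computing r.w, which the elided code may have been meant to do.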

Regards,
Richie.

Mathematical Sciences Unit
HSL





Re: [R] transposing/rotating XY in a 3D array

2009-05-15 Thread Kushantha Perera


Try this:

> x <- array(1:12, c(3,2,2))
> x
, , 1

     [,1] [,2]
[1,]    1    4
[2,]    2    5
[3,]    3    6

, , 2

     [,1] [,2]
[1,]    7   10
[2,]    8   11
[3,]    9   12

> xt <- aperm(x, c(2,1,3))
> xt
, , 1

     [,1] [,2] [,3]
[1,]    1    2    3
[2,]    4    5    6

, , 2

     [,1] [,2] [,3]
[1,]    7    8    9
[2,]   10   11   12
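
Starting from the flat vector described in the question, the whole conversion
might be sketched as follows. The variable names are illustrative, and note
that aperm() returns a new array, so the data are copied once.

```r
# Sketch: rebuild the file layout from a flat vector stored in file order
# (row by row within each file). The dims are those from the original question.
x <- 1:12
n_row <- 2; n_col <- 3; n_file <- 2

# Fill with rows and columns swapped, then swap the first two dimensions back
xyz <- aperm(array(x, c(n_col, n_row, n_file)), c(2, 1, 3))

xyz[, , 1]   # first file: rows 1 2 3 and 4 5 6
```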

Good day!

Kushantha Perera | Amba Research

Ph +94 11 235 6281 | Mob +94 77 222 4373

Bangalore * Colombo * London * New York * San José * Singapore * 
www.ambaresearch.com


Re: [R] Simulation

2009-05-15 Thread Stefan Grosse

On Fri, 15 May 2009 19:17:37 +1000 Kon Knafelman konk2...@hotmail.com
wrote:

KK I hve the same problem as the initial one, except i need 1000
KK samples of size 15, and my distribution is Exp(1). I've adjusted
KK some of the loop formulas for my n=15, but im unsure how to proceed
KK in the quickest way. 
KK Can someone please help?

What exactly do you want? Please be more specific about what you did
and what does not work. 

Stefan


Re: [R] How to do a pretty panel plot?

2009-05-15 Thread Jakson Alves de Aquino

Ajay Shah wrote:
 Here's my best version of your code:
 
 ## Data
 M <- structure(list(date = structure(c(13634, 13665, 13695, 13726,
   13757, 13787, 13818, 13848, 13879, 13910, 13939, 13970, 14000,
   14031, 14061, 14092, 14123, 14153, 14184, 14214, 14245, 14276,
   14304, 14335), class = "Date"),
 cospi = c(1987.31, 2033.37, 2140.13,
   2120.66, 2427.09, 2917.7, 2915.28, 3262.06, 2616.26, 2617.75,
   2277.69, 2538.13, 2374.09, 1911.22, 2063.73, 2081.28, 1813.58,
   1304.96, 1219.73, 1361.74, 1299.2, 1242.74, 1339.18, 1557.29),
 cospi.PE = c(19.2, 19.69, 20.13, 24.08, 27.61, 30.9, 30.69,
   34.92, 26.95, 27.63, 23.86, 26.14, 23.72, 19.5, 23.43, 23.73,
   20.69, 16.4, 16.12, 18.04, 18.46, 18.86, 20.24, 23.53)),
 .Names = c("date", "cospi", "cospi.PE"),
 row.names = 209:232, class = "data.frame")
 
 ## Set up par's to make 2 panel chart
 par(bty="l"); par(ps=10)
 par(mfrow=c(2,1))   # try to get two plots, one above the other
 par(mar=c(0,4,0,1)) ## Set par(mar) to eliminate X axis gap
 par(oma=c(2,2,2,2))
 
 ## Make Plot 1
 plot(M$date, M$cospi, type="l", log="y",
      xaxs="i", yaxs="i", axes=FALSE, lwd=2,
      ylab="Cospi level")
 axis(1, col="grey", at=NULL, labels=FALSE)
 axis(2, col="black", labels=TRUE)
 axis(3, col="grey", labels=TRUE)
 grid(col="lightgrey", lty=1)
 box(col="grey")
 
 ## Adjust par(mar) for 2nd plot
 par(mar=c(2,4,0,1))
 
 ## Second plot
 plot(M$date, M$cospi.PE, type="l", col="black", log="y",
      xaxs="i", yaxs="i", axes=FALSE, lwd=2,
      ylab="Cospi P/E")
 axis(2, col="black", at=NULL, labels=TRUE)
 axis(1, col="lightgrey", at=NULL, labels=TRUE)
 grid(col="lightgrey", lty=1)
 box(col="grey")
 

I think it's better if the lines are above the grid:

## Data
M <- structure(list(date = structure(c(13634, 13665, 13695, 13726,
  13757, 13787, 13818, 13848, 13879, 13910, 13939, 13970, 14000,
  14031, 14061, 14092, 14123, 14153, 14184, 14214, 14245, 14276,
  14304, 14335), class = "Date"),
cospi = c(1987.31, 2033.37, 2140.13,
  2120.66, 2427.09, 2917.7, 2915.28, 3262.06, 2616.26, 2617.75,
  2277.69, 2538.13, 2374.09, 1911.22, 2063.73, 2081.28, 1813.58,
  1304.96, 1219.73, 1361.74, 1299.2, 1242.74, 1339.18, 1557.29),
cospi.PE = c(19.2, 19.69, 20.13, 24.08, 27.61, 30.9, 30.69,
  34.92, 26.95, 27.63, 23.86, 26.14, 23.72, 19.5, 23.43, 23.73,
  20.69, 16.4, 16.12, 18.04, 18.46, 18.86, 20.24, 23.53)),
  .Names = c("date", "cospi", "cospi.PE"),
  row.names = 209:232, class = "data.frame")

## Set up par's to make 2 panel chart
par(bty="l")
par(ps=10)
par(mfrow=c(2,1))   # try to get two plots, one above the other
par(mar=c(0,4,0,1)) ## Set par(mar) to eliminate X axis gap
par(oma=c(2,2,2,2))

## Make Plot 1
plot(M$date, M$cospi, type="l", log="y", xaxs="i", yaxs="i", axes=FALSE,
  lwd=0, ylab="Cospi level")
grid(col="lightgrey", lty=1)
lines(M$date, M$cospi, type="l", lwd=2)
axis(1, col="grey", at=NULL, labels=FALSE)
axis(2, col="black", labels=TRUE)
axis(3, col="grey", labels=TRUE)
box(col="grey")

## Adjust par(mar) for 2nd plot
par(mar=c(2,4,0,1))

## Second plot
plot(M$date, M$cospi.PE, type="l", col="black", log="y",
  xaxs="i", yaxs="i", axes=FALSE, lwd=0, ylab="Cospi P/E")
grid(col="lightgrey", lty=1)
lines(M$date, M$cospi.PE, col="black", lwd=2)
axis(2, col="black", at=NULL, labels=TRUE)
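
Another way to get the lines above the grid, sketched here under the
assumption that plot.default() ends up handling the call, is the panel.first
argument: it draws the grid after the axes are set up but before the data.
The helper name and the data are made up for the example.

```r
# Sketch: draw the grid first, then the line, in one plot() call.
# panel.first is evaluated after the axes are set up but before the data are drawn.
draw_panel <- function(x, y, ylab) {
  plot(x, y, type = "l", log = "y", lwd = 2, ylab = ylab,
       panel.first = grid(col = "lightgrey", lty = 1))
  invisible(TRUE)
}

# Illustrative data, not the series from this thread
x <- 1:24
y <- 2000 * exp(cumsum(rep(c(0.05, -0.04), 12)))
draw_panel(x, y, ylab = "Level")
```

This avoids plotting the series twice, at the cost of relying on lazy
evaluation of panel.first, which may not work when plot() is reached through
some other plotting method.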


Re: [R] SQL Queries from Multiple Servers

2009-05-15 Thread Mark Wardle

Hi.

Depending on your requirements, one option would be to do the join in
R using merge()

If you wish to run SQL joins across multiple databases, then it is not
an R problem but a database problem. For a quick solution, I would
write scripts that bring all your data together into one database
(could be written in any scripting language, and of course R) and then
process from there.
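
A minimal sketch of the merge() route: the two data frames below stand in for
what would come back from one query per database (for example via sqlQuery()
on two separate RODBC channels), and all column names are hypothetical.

```r
# Sketch: join two result sets in R after fetching each from its own database.
# The data frames stand in for the results of two separate queries.
students <- data.frame(id = c(1, 2, 3, 4),
                       name = c("A", "B", "C", "D"))   # e.g. from data_base_01
scores   <- data.frame(id = c(2, 3, 5),
                       score = c(88, 92, 75))          # e.g. from data_base_02

inner <- merge(students, scores, by = "id")                # like an SQL inner join
left  <- merge(students, scores, by = "id", all.x = TRUE)  # like a left outer join
```

This needs no write access to either server, but both tables have to fit in
memory on the R side.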

Bw

Mark

2009/5/13 Tom Schenk Jr tomschen...@gmail.com:
 I use RODBC as my conduit from R to SQL. It works well when the tables are
 stored on one channel, e.g.,

 channel <- odbcConnect("data_base_01", uid="", dsn="")

 However, I often need to match tables across multiple databases, e.g.,
 data_base_01 and data_base_02. However, odbcConnect() appears limited
 insofar as you may only query from tables within a single channel, e.g.,
 database. I do not have access to write and create new tables on the SQL
 servers, which is a possible solution (e.g., copy all tables into a single
 database).

 Is there any way, in RODBC or another R-friendly SQL package, to perform SQL
 operations across multiple databases?

 Warm regards.

 --
 Tom Schenk Jr.
 tomschen...@gmail.com

        [[alternative HTML version deleted]]






-- 
Dr. Mark Wardle
Specialist registrar, Neurology
Cardiff, UK


[R] web interface for R script??

2009-05-15 Thread Martial Sankar


Dear All,

I would like your feedback on web interfaces for R scripts.
I have already tested RGG (but it is not web-based),

and two from the CRAN list: Rserve and Rpad.

However, Rpad requires some knowledge of JavaScript, PHP, etc.,
and with Rserve I would have to build the web interface entirely myself.

Rwui, from the CRAN list, seems attractive.

Has anyone tested this one?
Other suggestions are welcome too ^^

Thanks,

- Martial


[R] Ego net and merge networkss

2009-05-15 Thread Martin Klaus

Dear R-Help Members,

I am working on an analysis of a social network of web pages, for which
I use the statnet package, a very good SNA package. Thank you
for developing it!

I have now reached a point where my R skills are not good enough for what
I want to do, so I am asking for your help.

The problem:

I have a network object and have calculated the (Freeman) degree centrality
for all vertices. Now I select the 5 vertices with the highest
degree centrality to take a closer look at their ego networks. For the
ego network analysis I tried for 3 days with ego.extract,
sapply and gapply, but I couldn't do it.

My question: I want to look at how the number of alters reached directly
develops as the number of egos increases. Let's say that with one ego (the
vertex with the highest degree centrality) I reach 15 alters in my network.
Now I want to combine the ego networks of the vertices with the highest and
the second-highest degree centrality and look at how many alters these two
egos combined can reach. Then I want to look at the three vertices with the
highest degree centralities, and so on.

In the end I want to build a data.frame which lists in the first
column the number of egos (increasing from 1...N), in the second column
the number of alters reached by the egos, in the third column the
number of edges, and in the fourth column the density of the subnetwork.

Then I want to decide which combined ego network seems to be the
best and gplot it.

Unfortunately my R skills are limited, so I could not program this. I
would really appreciate it if you could help me!

Thank you in advance. Sincerely yours

Martin Klaus (University of Kassel)

Example Data & Code:

m <- matrix(c(0, 1, 1, 0, 0, 0, 1, 0, 0,
              1, 0, 1, 0, 0, 0, 0, 1, 0,
              1, 1, 0, 0, 0, 0, 0, 0, 0,
              0, 0, 1, 0, 1, 1, 1, 0, 0,
              0, 0, 0, 1, 0, 1, 1, 0, 0,
              0, 0, 0, 1, 1, 0, 1, 0, 1,
              0, 0, 0, 1, 1, 1, 0, 1, 0,
              0, 0, 0, 0, 0, 0, 0, 0, 0,
              0, 0, 0, 0, 0, 0, 0, 0, 0), ncol=9)

diag(m) <- 0
g <- network(m, vertex.attrnames=c("a","a","a","a","a"))
summary(g)
g
degree(g)
sort(degree(g))
#-> vertex 7 has the highest degree

eg <- ego.extract(g)
#-> extract and visualize the ego network of vertex 7
eg$`7`
gplot(eg$`7`)
str(eg$`7`)

#-> Now I need to count the number of alters and edges and save the
#   results in the first row of a data.frame

#-> Then I need to combine the ego networks of vertex 7 and vertex 6 and
#   look at how many alters these two egos reach together

#-> The results need to be saved in the second row of the data.frame ...

EXAMPLE data.frame:

#Egos  /  #Alters   /   #edges   /  ego.net.density
1  /   8   /  ...
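
One possible way to build such a table, sketched in base R on the adjacency
matrix rather than with sna functions. The function name is made up, and it
assumes that an "alter" is any distinct vertex tied to at least one ego in
either direction, which may differ from the counting intended above.

```r
# Sketch in base R (no sna calls): cumulative reach of the top-k egos.
# Assumption: an "alter" is any vertex tied to an ego in either direction.
ego_reach_table <- function(m) {
  adj <- (m + t(m)) > 0                 # TRUE where two vertices share a tie
  deg <- rowSums(m) + colSums(m)        # Freeman degree (in + out)
  ranked <- order(-deg)                 # vertices from highest to lowest degree
  rows <- lapply(seq_along(ranked), function(k) {
    egos <- ranked[1:k]
    reached <- which(colSums(adj[egos, , drop = FALSE]) > 0)
    alters <- setdiff(reached, egos)
    sub <- c(egos, alters)              # vertices of the combined ego network
    n <- length(sub)
    edges <- sum(m[sub, sub, drop = FALSE])
    data.frame(egos = k, alters = length(alters), edges = edges,
               density = if (n > 1) edges / (n * (n - 1)) else NA_real_)
  })
  do.call(rbind, rows)
}

# Small illustrative directed network (not the data from the post)
m_demo <- matrix(0, 4, 4)
m_demo[1, 2] <- m_demo[2, 1] <- m_demo[1, 3] <- m_demo[4, 1] <- 1
res <- ego_reach_table(m_demo)
res
```

The same function can be called on the 9 x 9 matrix m from the post. Ties in
degree are broken by vertex order, which is an arbitrary choice.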



[R] Using sample to create Training and Test sets

2009-05-15 Thread Chris Arthur

Forgive the newbie question. I want to select random rows from my 
data.frame to create a test set (which I can do), but then I want to 
create a training set using what's left over.


Example code:
acc <- read.table("accOUT.txt", header=TRUE, sep=",", row.names=1)
# select 400 random rows in data
training <- acc[sample(1:nrow(acc), 400, replace=TRUE),]

# try to get what's left of acc not in training
testset <- acc[-training, ]

This fails with the following error:
Error: invalid subscript type
In addition: Warning message:
- not meaningful for factors in: Ops.factor(left)

I then try:
testset <- acc[!training, ]
which gives me the warning message
! not meaningful for factors in: Ops.factor(left)
And if I look at testset, it is 400 rows of NAs, which clearly isn't 
right.


Can anyone tell me what I'm doing wrong?

Thanks in advance

Chris
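
The errors above arise because training is a data frame, not a vector of row
numbers, so it cannot be used as a subscript. One common pattern, sketched
here on made-up data standing in for accOUT.txt, is to keep the sampled row
numbers and reuse them with a minus sign:

```r
# Sketch: split a data frame into training and test rows by index.
# 'acc' here is made-up data standing in for the file read in the post.
set.seed(1)
acc <- data.frame(x = rnorm(1000),
                  grp = factor(sample(c("a", "b"), 1000, replace = TRUE)))

idx <- sample(nrow(acc), 400)   # 400 distinct row numbers (no replacement)
training <- acc[idx, ]          # the sampled rows
testset  <- acc[-idx, ]         # everything that was not sampled
```

Sampling without replacement matters here: with replace=TRUE some rows would
appear several times in one set, and the two sets would no longer add up to
the original data.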


Re: [R] Help with kalman-filterd betas using the dlm package

2009-05-15 Thread tom81


I have studied both the vignette and other material I've been able to get my
hands on, and I'm starting to get a better understanding. And I'm definitely
going to buy Petris, Petrone, and Campagnoli (2009) Dynamic Linear Models
with R. But that's not published yet, so I'm not getting much help there.

This is the set-up I am using:
y[t] = a[t] + b[t]*x[t] + V[t]
a[t] = a[t-1] + W[t,a]
b[t] = b[t-1] + W[t,b]

V[t] ~ N(0,V)
W[t] ~ N(0,W)
W = blockdiag(W[a],W[b])


V could be estimated from the data with a non-diagonal variance matrix of
the returns.
W would be estimated in the same way, but with the effect of past
betas in the transition taken into account. But how do I estimate that
matrix? Is that done with MLE, SUR or some other statistical technique?

I'm also assuming in this example that a[t] is time invariant, which gives
W[a] = 0.

I'd appreciate any guidance.
Regards Tom


spencerg wrote:
 
   Have you worked through vignette('dlm')?  Vignettes are nice 
 because they provide an Adobe Acrobat Portable Document Format (pdf) 
 file with a companion R script file, which you can get as follows: 
 
 
 (dlm. <- vignette('dlm'))
 Stangle(dlm.$file) 
 
 
   The first of these two lines opens the pdf file.  The second 
 creates a file dlm.R in the working directory (getwd()) containing the 
 R commands discussed in the pdf file. 
 
 
   If I remember correctly, your question is answered in this vignette. 
 
 
   You may also be interested in a book that is soon to appear about 
 this package:  Petris, Petrone, and Campagnoli (2009) Dynamic Linear 
 Models with R (Springer;  
 http://www.amazon.com/Dynamic-Linear-Models-R-Use/dp/0387772375/ref=sr_1_4?ie=UTF8s=booksqid=1242162708sr=1-4),
  
 scheduled to ship in late June.  If you have long-term interest in this 
 subject, as I suspect you may, you might find this book interesting and 
 useful. 
 
 
   Hope this helps.
   Spencer Graves
 
 tom81 wrote:
 Hi all R gurus out there,
 I'm a newbie to Kalman filters; after some research I have found that
 the dlm package is the easiest to start with. So be patient if some of my
 questions are too basic.

 I would like to set up a beta estimation between an asset and a market
 index using a Kalman filter. Much of the literature says it gives superior
 estimates compared to OLS estimates. So I would like to learn and to use
 the filter.

 I would like to run two types of Kalman filters, one using a
 random-walk (RW) model and one with a stationary model; in other words,
 the transition equation follows either a RW or an AR(1) model.

 This is how I think it would be set up:

 I will have my time series Y, X, where Y is the response variable.

 This setup should give me a RW process, if I have understood the example
 correctly:
 mydlmModel = dlmModReg(X)  + dlmModPoly(order=1)

 and then run the filter on the dlm model:
 dlmFilter(Y,mydlmModel )

 But setting up an AR(1) process is unclear: should I use dlmModPoly or
 dlmModARMA to set up the model?

 And last but not least, how do I set up a proper build function to
 use with dlmMLE to optimize the starting values?

 Regards Tom

 

Re: [R] Problems intalling on Suse 10.3 x86_64 OS

2009-05-15 Thread Stefan Grosse

On Thu, 14 May 2009 12:32:18 -0700 (PDT) PDXRugger j_r...@hotmail.com
wrote:

P 
P Alright, i am unsure of the posting rules for these types of
P questions but i will be as help ful as possible.  My windows based
P system cant handle a model i am running so i am trying to install R

Why? Too much data?

P on our Linux based machine but i have encountered the following and
P i dont know linux much but my intuition is that i need to install
P some other files first.  Any thoughts?

P There are no installable providers of libtcl8.5.so()(64bit) for

It seems that you need a more recent tcl/tk Installation. Suse 10.3 is
a little bit old. Maybe 8.5 was not included. So you could see if you
find a newer version. Sometimes you can take the repository of the more
recent opensuse version and install newer versions from there. 

Unfortunately yast is very sensitive on dependencies. From my
experience the smart package manager is faster and less touchy than
Yast/zypper on the older Opensuse systems.

Alternatively you could upgrade you distro, install another one or run
a live-System. You could run a live Usb-Ubuntu to run your programs.
Depends on what time you have and how flexible your admins are...

hth
Stefan


Re: [R] Help with kalman-filterd betas using the dlm package

2009-05-15 Thread spencerg




 1.  Might you look again at section 2, "Maximum likelihood
estimation", of the dlm vignette?  It describes how to estimate
parameters.



 2.  Have you started with the code on those 2 pages, confirming
that you can make that work and understand what it does?  If yes, then
try to build code for your problem as a series of small modifications to
that example.  With luck, this will bring enlightenment.  If not, try to
express your question in terms of commented, minimal, self-contained
code that others can copy into R to replicate what you see and then modify
to get it to work, as suggested in the posting guide,
www.R-project.org/posting-guide.html.  If someone reading this list
can do this in a few seconds, it will increase the chances that you
will get a useful reply.



 Hope this helps. 
 Spencer Graves   


tom81 wrote:

I have studied both the vignette and other material I've been able to get my
hands on and I'm starting to get a better understanding. And I'm definitely
going to buy Petris, Petrone, and Campagnoli (2009) Dynamic Linear Models
with R. But that's not published yet, so I'm not getting much help there.

This is the set-up I am using:
y[t] = a[t] + b[t]*x[t] + V[t],
a[t] = a[t-1] + W[t,a]
b[t] = b[t-1] + W[t,b]


V[t] ~ N(0,V)
W[t] ~ N(0,W)
W = blockdiag(W[a],W[b])


V could be estimated from the data with a non-diagonal variance matrix of
the returns.
W would be estimated in the same way, but with the effect of past
betas in the transition taken into account. But how do I estimate that
matrix? Is that done with MLE, SUR or some other statistical technique?

I'm also assuming in this example that a[t] is time invariant, which gives
W[a] = 0.

I appreciate any guidance.
Regards Tom


spencerg wrote:
  
  Have you worked through vignette('dlm')?  Vignettes are nice 
because they provide an Adobe Acrobat Portable Document Format (pdf) 
file with a companion R script file, which you can get as follows: 



(dlm. <- vignette('dlm'))
Stangle(dlm.$file) 



  The first of these two lines opens the pdf file.  The second 
creates a file dlm.R in the working directory (getwd()) containing the 
R commands discussed in the pdf file. 



  If I remember correctly, your question is answered in this vignette. 



  You may also be interested in a book that is soon to appear about 
this package:  Petris, Petrone, and Campagnoli (2009) Dynamic Linear 
Models with R (Springer;  
http://www.amazon.com/Dynamic-Linear-Models-R-Use/dp/0387772375/ref=sr_1_4?ie=UTF8&s=books&qid=1242162708&sr=1-4),
scheduled to ship in late June.  If you have long-term interest in this 
subject, as I suspect you may, you might find this book interesting and 
useful. 



  Hope this helps.
  Spencer Graves

tom81 wrote:

Hi all R gurus out there,
I'm kind of a newbie to Kalman filters. After some research I have found
that the dlm package is the easiest to start with, so be patient if some of my
questions are too basic.

I would like to set up a beta estimation between an asset and a market
index using a Kalman filter. Much literature says it gives superior estimates
compared to OLS estimates, so I would like to learn and use the
filter.

I would like to run two types of Kalman filters, one using a
random-walk model (RW) and one with a stationary model; in other words,
the transition equation follows either a RW or an AR(1) model.

This is how I think it would be set up:

I will have my time series Y, X, where Y is the response variable.

This setup should give me a RW process if I have understood the example
correctly:
mydlmModel = dlmModReg(X)  + dlmModPoly(order=1)

and then run on the dlm model
dlmFilter(Y,mydlmModel )

But setting up an AR(1) process is unclear: should I use dlmModPoly or
dlmModARMA to set up the model?

And last but not least, how do I set up a proper build function to
use with dlmMLE to optimize the starting values?

Regards Tom

  


[R] Help with loops

2009-05-15 Thread Amit Patel


Hi
I am trying to create a loop which averages replicates in my data.
The original data has many rows and consists of 40 columns, zz[,2:41], plus row
headings in zz[,1].
I am trying to average each pair of values (i.e. zz[1,2:3] averaged and placed
in average_value[1,2], and so on).
Below is my script, but it seems to be stuck in an endless loop.
Any suggestions?

for (i in 1:length(average_value[,1])) {
average_value[i] <- i^100; print(average_value[i])

#calculates Means
#Sample A
average_value[i,2] <- rowMeans(zz[i,2:3])
average_value[i,3] <- rowMeans(zz[i,4:5])
average_value[i,4] <- rowMeans(zz[i,6:7])
average_value[i,5] <- rowMeans(zz[i,8:9])
average_value[i,6] <- rowMeans(zz[i,10:11])

#Sample B
average_value[i,7] <- rowMeans(zz[i,12:13])
average_value[i,8] <- rowMeans(zz[i,14:15])
average_value[i,9] <- rowMeans(zz[i,16:17])
average_value[i,10] <- rowMeans(zz[i,18:19])
average_value[i,11] <- rowMeans(zz[i,20:21])

#Sample C
average_value[i,12] <- rowMeans(zz[i,22:23])
average_value[i,13] <- rowMeans(zz[i,24:25])
average_value[i,14] <- rowMeans(zz[i,26:27])
average_value[i,15] <- rowMeans(zz[i,28:29])
average_value[i,16] <- rowMeans(zz[i,30:31])

#Sample D
average_value[i,17] <- rowMeans(zz[i,32:33])
average_value[i,18] <- rowMeans(zz[i,34:35])
average_value[i,19] <- rowMeans(zz[i,36:37])
average_value[i,20] <- rowMeans(zz[i,38:39])
average_value[i,21] <- rowMeans(zz[i,40:41])
  }


thanks





[R] help with as.numeric

2009-05-15 Thread deanj2k


Hi everyone, wondering if you could help me with a novice problem.  I have a
data frame called subjects with a height and weight variable and want to
calculate a bmi variable from the two.  I have tried:

attach(subjects)
bmi <- (weight)/((height/100)^2)

but it comes up with the error:
Warning messages:
1: In Ops.factor(height, 100) : / not meaningful for factors
2: In Ops.factor((weight), ((height/100)^2)) :
  / not meaningful for factors

I presume that this means the vectors height and weight are not in numeric
form (confirmed by is.numeric) so I changed the code to:

bmi <- (as.numeric(weight))/((as.numeric(height)/100)^2)

but this just comes up with a result which doesn't make sense, i.e. numbers
such as 4 within the bmi vector.  I've looked at
as.numeric(height)/as.numeric(weight) and these numbers just aren't the same
as height/weight, which is the reason for the incorrect bmi.  Can anyone
tell me where I am going wrong?  It's quite frustrating because I can't
understand why a function claiming to convert to numeric would come up with
such a bizarre result.

[R] need help

2009-05-15 Thread H Z

Dear all,
Please, I need to write a function in R to estimate the parameters of a negative
binomial distribution and then calculate the log-likelihood for given
data. Is there anyone who can help me?
Thank you very much for any help.
Best regards



  

Re: [R] fitdistr for t distribution

2009-05-15 Thread lagreene


Thanks Jorge,

but I still don't understand where they come from.  When I use
fitdistr(mydata, "t", df = 9) I get values for m and s; should the variance
of my data then be df/s?

I just want to be able to confirm how m and s are calculated:

mydt <- function(x, m, s, df) dt((x-m)/s, df)/s
fitdistr(x2, mydt, list(m = 0, s = 1), df = 9, lower = c(-Inf, 0))

Thanks anyway for the help!
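
A minimal sketch of what m and s are, assuming simulated data rather than lagreene's own: they are the maximum-likelihood location and scale of the density dt((x-m)/s, df)/s with df held fixed, not the sample mean and standard deviation. The sketch uses base R's optim() instead of fitdistr() so that the likelihood being maximised is visible; the names negloglik, m_hat and s_hat are illustrative only. For df > 2 the standard deviation implied by the fit is s * sqrt(df/(df - 2)), which is why s itself is smaller than sd(x).

```r
set.seed(1)
df <- 9
x2 <- 2 + 3 * rt(5000, df = df)   # simulated data: location 2, scale 3

# Negative log-likelihood of the location-scale t density dt((x - m)/s, df)/s
negloglik <- function(par, x, df) {
  m <- par[1]
  s <- par[2]
  -sum(dt((x - m) / s, df = df, log = TRUE) - log(s))
}

# Maximise the likelihood over m and s, keeping s positive
fit <- optim(c(m = median(x2), s = mad(x2)), negloglik, x = x2, df = df,
             method = "L-BFGS-B", lower = c(-Inf, 1e-8))
m_hat <- unname(fit$par[1])
s_hat <- unname(fit$par[2])

c(m = m_hat, s = s_hat)
# The sample sd is close to s * sqrt(df / (df - 2)), not to s itself
c(sd_data = sd(x2), sd_implied = s_hat * sqrt(df / (df - 2)))
```

With real data, fitdistr(x, "t", df = 9) should land on essentially the same m and s, since it maximises the same likelihood.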




Jorge Ivan Velez wrote:
 
  Dear lagreene,
 See the second example in
 
 require(MASS)
 ?fitdistr
 
 HTH,
 
 Jorge
 
 
 On Thu, May 14, 2009 at 7:15 PM, lagreene lagreene...@gmail.com wrote:
 

 Hi,
 I was wondering if anyone could tell me how m and s are calculated for a
 t distribution?

 I thought m was the sample mean and s the standard deviation, but
 obviously I'm wrong as this doesn't give the same answer.

 Thank you

Re: [R] transposing/rotating XY in a 3D array

2009-05-15 Thread andzsin

Dear Kushantha,

Thank you very much.
Very nice, indeed.


Gabor


Re: [R] Using sample to create Training and Test sets

2009-05-15 Thread Frank E Harrell Jr

Note that the single split sample technique is not competitive with 
other approaches unless the sample size exceeds around 20,000.


Frank


Chris Arthur wrote:
Forgive the newbie question, I want to select random rows from my
data.frame to create a test set (which I can do) but then I want to
create a training set using what's left over.


Example code:
acc <- read.table("accOUT.txt", header=T, sep = ",", row.names=1)
#select 400 random rows in data
training <- acc[sample(1:nrow(acc), 400, replace=TRUE),]

#try to get what's left of acc not in training
testset <- acc[-training, ]
Fails with the following error
Error: invalid subscript type
In addition: Warning message:
- not meaningful for factors in: Ops.factor(left)

I then try.
testset <- acc[!training, ]
Which gives me the warning message
! not meaningful for factors in: Ops.factor(left)
And if I look at testset it is 400 rows of NA's ... which clearly isn't
right.


Can anyone tell me what I'm doing wrong?

Thanks in advance

Chris



--
Frank E Harrell Jr   Professor and Chair   School of Medicine
 Department of Biostatistics   Vanderbilt University


Re: [R] help with as.numeric

2009-05-15 Thread Nutter, Benjamin

as.numeric() doesn't convert factors to the explicit value, nor should
it.  Under what you're expecting, if you have a factor where the levels
are "Female" and "Male", using as.numeric() wouldn't produce anything
meaningful.

However, as.numeric() does something much smarter.  It converts "Female"
to 1, and "Male" to 2.  More generally, if you have n levels, it will
produce a vector of values between 1 and n.  This is referred to as the
'internal coding.'

If you want to convert your height and weight variables to their numeric
values, you need to do

 as.numeric(as.character(height))

This will get you around the internal coding.
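
The difference between the internal coding and the labels can be seen in a small sketch. The data frame below is hypothetical (the poster's real subjects data is not shown in the thread), and the variable names codes, height, weight and bmi are chosen only for illustration.

```r
# Hypothetical data; heights in cm and weights in kg, both stored as factors
subjects <- data.frame(
  height = factor(c("172", "160.5", "181")),
  weight = factor(c("70", "55.2", "90"))
)

codes  <- as.numeric(subjects$height)                 # internal level codes
height <- as.numeric(as.character(subjects$height))   # values from the labels
weight <- as.numeric(as.character(subjects$weight))
bmi    <- weight / (height / 100)^2

codes           # 2 1 3: positions in the sorted levels, not heights
round(bmi, 1)
```

Because the levels are sorted as text ("160.5", "172", "181"), the codes only say which level each entry has; going through as.character() first recovers the numbers that were typed.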


[R] displaying results

2009-05-15 Thread deanj2k


Hi everyone, can anyone tell me how I can change how I display mean(age)? I
want it to say "The mean age of patients within the sample is" followed by the
value of mean(age).
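
One way to get that kind of output, sketched under the assumption that age is a numeric vector (the values below are made up), is to build the sentence with sprintf() and print it with cat(). The name msg is illustrative.

```r
age <- c(34, 51, 47, 62, 29)   # hypothetical ages

# %.1f formats the mean with one decimal place
msg <- sprintf("The mean age of patients within the sample is %.1f", mean(age))
cat(msg, "\n")
```

paste("The mean age of patients within the sample is", mean(age)) would also work when no control over the number of decimals is needed.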

Re: [R] Simulation

2009-05-15 Thread Ben Bolker



On Fri, 15 May 2009 19:17:37 +1000 Kon Knafelman konk2...@hotmail.com
wrote:

KK> I have the same problem as the initial one, except I need 1000
KK> samples of size 15, and my distribution is Exp(1). I've adjusted
KK> some of the loop formulas for my n=15, but I'm unsure how to proceed
KK> in the quickest way.
KK> Can someone please help?


  Taking a guess:

matrix(rexp(15000,1),ncol=15)

?
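
The one-liner above can be unpacked as follows. This is only a sketch of one reading of the question: each row of the matrix is treated as one sample of size 15, and the sample mean is used as an example statistic (the original post does not say which statistic is wanted). The names n_samples, n, samples and sample_means are illustrative.

```r
set.seed(123)
n_samples <- 1000
n <- 15

# 15000 Exp(1) draws arranged so that each row is one sample of size 15
samples <- matrix(rexp(n_samples * n, rate = 1), ncol = n)

# One statistic per sample, computed without an explicit loop
sample_means <- rowMeans(samples)

dim(samples)
summary(sample_means)
```

For a statistic without a dedicated row function, apply(samples, 1, median) and similar calls follow the same pattern.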


Re: [R] How to do a pretty panel plot?

2009-05-15 Thread stephen sefick

M <- (structure(list(date = structure(c(13634, 13665, 13695, 13726,
13757, 13787, 13818, 13848, 13879, 13910, 13939, 13970, 14000,
14031, 14061, 14092, 14123, 14153, 14184, 14214, 14245, 14276,
14304, 14335), class = "Date"), cospi = c(1987.31, 2033.37, 2140.13,
2120.66, 2427.09, 2917.7, 2915.28, 3262.06, 2616.26, 2617.75,
2277.69, 2538.13, 2374.09, 1911.22, 2063.73, 2081.28, 1813.58,
1304.96, 1219.73, 1361.74, 1299.2, 1242.74, 1339.18, 1557.29),
cospi.PE = c(19.2, 19.69, 20.13, 24.08, 27.61, 30.9, 30.69,
34.92, 26.95, 27.63, 23.86, 26.14, 23.72, 19.5, 23.43, 23.73,
20.69, 16.4, 16.12, 18.04, 18.46, 18.86, 20.24, 23.53)), .Names = c("date",
"cospi", "cospi.PE"), row.names = 209:232, class = "data.frame"))
library(ggplot2)
a <- melt.data.frame(M, id.var="date")
qplot(date, value, data=a, geom="line")+facet_wrap(~variable, ncol=1,
scales="free")

How about this? It is much simpler code.  If you add the theme_bw
argument it looks more similar to your plot, but there are some bugs.

Stephen Sefick

On Fri, May 15, 2009 at 6:12 AM, Jakson Alves de Aquino
jaksonaqu...@gmail.com wrote:
 Ajay Shah wrote:
 Here's my best version of your code:

 ## Data
 M <- structure(list(date = structure(c(13634, 13665, 13695, 13726,
                       13757, 13787, 13818, 13848, 13879, 13910, 13939,
                       13970, 14000, 14031, 14061, 14092, 14123, 14153,
                       14184, 14214, 14245, 14276, 14304, 14335),
                       class = "Date"),
                     cospi = c(1987.31, 2033.37, 2140.13,
                       2120.66, 2427.09, 2917.7, 2915.28, 3262.06, 2616.26,
                       2617.75, 2277.69, 2538.13, 2374.09, 1911.22, 2063.73,
                       2081.28, 1813.58, 1304.96, 1219.73, 1361.74, 1299.2,
                       1242.74, 1339.18, 1557.29),
                     cospi.PE = c(19.2, 19.69, 20.13, 24.08, 27.61, 30.9,
                       30.69, 34.92, 26.95, 27.63, 23.86, 26.14, 23.72, 19.5,
                       23.43, 23.73, 20.69, 16.4, 16.12, 18.04, 18.46, 18.86,
                       20.24, 23.53)),
                .Names = c("date", "cospi", "cospi.PE"),
                row.names = 209:232, class = "data.frame")

 ## Set up par's to make 2 panel chart
 par(bty="l"); par(ps=10)
 par(mfrow=c(2,1))           # try to get two plots, one above the other
 par(mar=c(0,4,0,1))         ## Set par(mar) to eliminate X axis gap
 par(oma=c(2,2,2,2))

 ## Make Plot 1
 plot(M$date, M$cospi, type="l", log="y",
      xaxs="i", yaxs="i", axes=F, lwd=2,
      ylab="Cospi level")
 axis(1, col="grey", at=NULL, labels=FALSE)
 axis(2, col="black", labels=TRUE)
 axis(3, col="grey", labels=TRUE)
 grid(col = "lightgrey", lty=1)
 box(col = "grey")

 ## Adjust par(mar) for 2nd plot
 par(mar=c(2,4,0,1))

 ## Second plot
 plot(M$date, M$cospi.PE, type="l", col="black", log="y",
      xaxs="i", yaxs="i", axes=F, lwd=2,
      ylab="Cospi P/E")
 axis(2, col="black", at=NULL, labels=T)
 axis(1, col="lightgrey", at=NULL, labels=T)
 grid(col = "lightgrey", lty=1)
 box(col = "grey")


 I think it's better if the lines are above the grid:

 ## Data
 M <- structure(list(date = structure(c(13634, 13665, 13695, 13726,
        13757, 13787, 13818, 13848, 13879, 13910, 13939, 13970, 14000,
        14031, 14061, 14092, 14123, 14153, 14184, 14214, 14245, 14276,
        14304, 14335), class = "Date"),
    cospi = c(1987.31, 2033.37, 2140.13,
      2120.66, 2427.09, 2917.7, 2915.28, 3262.06, 2616.26, 2617.75,
      2277.69, 2538.13, 2374.09, 1911.22, 2063.73, 2081.28, 1813.58,
      1304.96, 1219.73, 1361.74, 1299.2, 1242.74, 1339.18, 1557.29),
    cospi.PE = c(19.2, 19.69, 20.13, 24.08, 27.61, 30.9, 30.69,
      34.92, 26.95, 27.63, 23.86, 26.14, 23.72, 19.5, 23.43, 23.73,
      20.69, 16.4, 16.12, 18.04, 18.46, 18.86, 20.24, 23.53)),
  .Names = c("date", "cospi", "cospi.PE"),
  row.names = 209:232, class = "data.frame")

 ## Set up par's to make 2 panel chart
 par(bty="l")
 par(ps=10)
 par(mfrow=c(2,1))       # try to get two plots, one above the other
 par(mar=c(0,4,0,1))     ## Set par(mar) to eliminate X axis gap
 par(oma=c(2,2,2,2))

 ## Make Plot 1
 plot(M$date, M$cospi, type="l", log="y", xaxs="i", yaxs="i", axes=F,
  lwd=0, ylab="Cospi level")
 grid(col = "lightgrey", lty=1)
 lines(M$date, M$cospi, type="l", lwd=2)
 axis(1, col="grey", at=NULL, labels=FALSE)
 axis(2, col="black", labels=TRUE)
 axis(3, col="grey", labels=TRUE)
 box(col = "grey")

 ## Adjust par(mar) for 2nd plot
 par(mar=c(2,4,0,1))

 ## Second plot
 plot(M$date, M$cospi.PE, type="l", col="black", log="y",
  xaxs="i", yaxs="i", axes=F, lwd=0, ylab="Cospi P/E")
 grid(col = "lightgrey", lty=1)
 lines(M$date, M$cospi.PE, col="black", lwd=2)
 axis(2, col="black", at=NULL, labels=T)





-- 
Stephen Sefick

Let's not spend our time and resources thinking about things that are
so little or so large that all they really do for

Re: [R] displaying results

2009-05-15 Thread stephen sefick

Read the posting guide please.  Self-contained, minimal, reproducible code.

On Fri, May 15, 2009 at 8:33 AM, deanj2k dl...@le.ac.uk wrote:

 Hi everyone, can anyone tell me how I can change how I display mean(age)? I
 want it to say "The mean age of patients within the sample is" followed by the
 value of mean(age).




-- 
Stephen Sefick

Let's not spend our time and resources thinking about things that are
so little or so large that all they really do for us is puff us up and
make us feel like gods.  We are mammals, and have not exhausted the
annoying little problems of being mammals.

-K. Mullis


Re: [R] need help

2009-05-15 Thread Simon Pickett


Read about glm by typing

?glm

There are tons of books and pdfs out there to show you the basics.

http://cran.r-project.org/other-docs.html

HTH, Si.


- Original Message - 
From: H Z zamani_...@yahoo.com

To: r-help@r-project.org
Sent: Friday, May 15, 2009 12:26 PM
Subject: [R] need help



Dear all,
Please, I need to write a function in R to estimate the parameters of a
negative binomial distribution and then calculate the log-likelihood
for given data. Is there anyone who can help me?

thank you very much for any help
Best regards





Re: [R] Using sample to create Training and Test sets

2009-05-15 Thread Liaw, Andy

Here's one possibility:

idx <- sample(nrow(acc))
training <- acc[idx[1:400], ]
testset <- acc[-idx[1:400], ]

Andy
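
The same idea as a self-contained sketch, using a made-up data frame in place of the poster's acc (the column names id, x and grp are hypothetical). It also shows why the original attempt failed: acc[-training, ] applies the minus sign to a whole data frame containing factors, whereas negative subscripts must be row numbers, and sample(..., replace=TRUE) can pick the same row more than once.

```r
set.seed(42)
# Hypothetical stand-in for the data read from accOUT.txt
acc <- data.frame(
  id  = 1:1000,
  x   = rnorm(1000),
  grp = factor(sample(letters[1:3], 1000, replace = TRUE))
)

train_idx <- sample(nrow(acc), 400)   # 400 distinct row numbers
training  <- acc[train_idx, ]
testset   <- acc[-train_idx, ]        # negate the row numbers, not the data

nrow(training)                               # 400
nrow(testset)                                # 600
length(intersect(training$id, testset$id))   # 0: no row is in both sets
```

Keeping the index vector around, rather than only the sampled rows, is what makes the complement easy to take.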


Re: [R] selecting points on 3D scatterplots

2009-05-15 Thread John Fox

Dear list members,

I was out of town when this message arrived and so didn't respond at the
time. I did respond to a private email from the poster.

Yes, the scatter3d() function in the Rcmdr package can identify points in 3D
scatterplots drawn with rgl via the identify3d() function in the same
package. Points are identified by right-clicking and dragging. The nice()
function is in the car package, one of the suggested packages for the
Rcmdr package.

John

-- original message --

It looks like Rcmdr may be able to select points on 3D scatterplots;
however, when I try to use its 3D scatterplot function I get the error
message:  could not find function "nice"

If I copy the code:

scatter3d(data$X, data$Z, data$Y, surface=FALSE, residuals=TRUE, bg="white",
+ axis.scales=TRUE, grid=TRUE, ellipsoid=FALSE, xlab="X", ylab="Z",
zlab="Y")

into the R console I get the same error message. Sorry I'm new - does
anyone know where this missing "nice" function can be found?


I tried using scatterplot3d but it doesn't rotate or zoom - which I
need to be able to do to select the data... but thanks for the
suggestion!

. . .

--
John Fox, Professor
Department of Sociology
McMaster University
Hamilton, Ontario, Canada
web: socserv.mcmaster.ca/jfox


Re: [R] displaying results

2009-05-15 Thread Simon Pickett


Absolutely no idea what you mean,

Try reconstructing your question in concise English with reproducible code.

Simon.


- Original Message - 
From: deanj2k dl...@le.ac.uk

To: r-help@r-project.org
Sent: Friday, May 15, 2009 1:33 PM
Subject: [R] displaying results




Hi everyone, can anyone tell me how I can change how I display mean(age)? I
want it to say "The mean age of patients within the sample is" followed by the
value of mean(age).

Re: [R] need help

2009-05-15 Thread Jorge Ivan Velez

Dear H Z,
Take a look at the examples in

require(MASS)
?glm.nb

This might be useful as well

summary(glm.nb(yourvariable ~ 1, data = yourdata))

HTH,

Jorge
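
For readers who want to see the likelihood written out rather than call glm.nb(), here is a minimal base-R sketch on simulated counts. It assumes the (size, mu) parameterisation of dnbinom(), where size plays the role that glm.nb() reports as theta; the names nb_negloglik, est and loglik are illustrative, and the data are simulated, not H Z's.

```r
set.seed(2009)
y <- rnbinom(2000, size = 2, mu = 5)   # simulated counts; replace with real data

# Negative log-likelihood in the (size, mu) parameterisation of dnbinom().
# Parameters are optimised on the log scale so they stay positive.
nb_negloglik <- function(logpar, y) {
  -sum(dnbinom(y, size = exp(logpar[1]), mu = exp(logpar[2]), log = TRUE))
}

start <- c(log(1), log(mean(y)))
fit   <- optim(start, nb_negloglik, y = y, method = "BFGS")

est    <- c(size = exp(fit$par[1]), mu = exp(fit$par[2]))   # ML estimates
loglik <- -fit$value                                        # maximised log-likelihood

est
loglik
```

The maximised log-likelihood should agree closely with logLik() of the intercept-only glm.nb() fit shown above.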


On Fri, May 15, 2009 at 7:26 AM, H Z zamani_...@yahoo.com wrote:

 Dear all,
 Please, I need to write a function in R to estimate the parameters of a
 negative binomial distribution and then calculate the log-likelihood
 for given data. Is there anyone who can help me?
 Thank you very much for any help.
 Best regards





[R] Fw: Help with loops(corrected question)

2009-05-15 Thread Amit Patel

--- On Fri, 15/5/09, Amit Patel amitrh...@yahoo.co.uk wrote:

 From: Amit Patel amitrh...@yahoo.co.uk
 Subject: Help with loops
 To: r-help@r-project.org
 Date: Friday, 15 May, 2009, 12:17 PM
 Hi
 I am trying to create a loop which averages replicates in
 my data.
 The original data has many rows and consists of 40 columns,
 zz[,2:41], plus row headings in zz[,1].
 I am trying to average each pair of values (i.e. zz[1,2:3]
 averaged and placed in average_value[1,2], and so on).
 Below is my script, but it seems to be stuck in an endless
 loop.
 Any suggestions?

 for (i in 1:length(zz[,1])) {

 #calculates Means
 #Sample A
 average_value[i,2] <- rowMeans(zz[i,2:3])
 average_value[i,3] <- rowMeans(zz[i,4:5])
 average_value[i,4] <- rowMeans(zz[i,6:7])
 average_value[i,5] <- rowMeans(zz[i,8:9])
 average_value[i,6] <- rowMeans(zz[i,10:11])

 #Sample B
 average_value[i,7] <- rowMeans(zz[i,12:13])
 average_value[i,8] <- rowMeans(zz[i,14:15])
 average_value[i,9] <- rowMeans(zz[i,16:17])
 average_value[i,10] <- rowMeans(zz[i,18:19])
 average_value[i,11] <- rowMeans(zz[i,20:21])

 #Sample C
 average_value[i,12] <- rowMeans(zz[i,22:23])
 average_value[i,13] <- rowMeans(zz[i,24:25])
 average_value[i,14] <- rowMeans(zz[i,26:27])
 average_value[i,15] <- rowMeans(zz[i,28:29])
 average_value[i,16] <- rowMeans(zz[i,30:31])

 #Sample D
 average_value[i,17] <- rowMeans(zz[i,32:33])
 average_value[i,18] <- rowMeans(zz[i,34:35])
 average_value[i,19] <- rowMeans(zz[i,36:37])
 average_value[i,20] <- rowMeans(zz[i,38:39])
 average_value[i,21] <- rowMeans(zz[i,40:41])
   }

 thanks
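
A loop-free sketch of the same averaging, assuming the layout described in the post (row labels in column 1, replicate pairs in columns 2:3, 4:5, ..., 40:41). The data below are random stand-ins, and first, second and avg are illustrative names. A row-by-row loop over a large data frame may simply be very slow rather than truly endless; working on whole columns avoids that.

```r
set.seed(1)
# Hypothetical stand-in for zz: labels in column 1, then 40 numeric columns
zz <- data.frame(id = paste0("row", 1:5), matrix(runif(5 * 40), nrow = 5))

first  <- seq(2, 40, by = 2)   # columns 2, 4, ..., 40
second <- first + 1            # columns 3, 5, ..., 41

# Average each pair of adjacent columns for all rows at once
avg <- (as.matrix(zz[, first]) + as.matrix(zz[, second])) / 2
average_value <- data.frame(id = zz[, 1], avg)

dim(average_value)   # 5 rows; 1 label column plus 20 averaged columns
```

Column k + 1 of average_value then holds the mean of columns 2k and 2k + 1 of zz, matching the assignments in the loop.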


Re: [R] help with as.numeric

2009-05-15 Thread Marc Schwartz


On May 15, 2009, at 6:57 AM, deanj2k wrote:



> Hi everyone, wondering if you could help me with a novice problem.
> I have a data frame called subjects with a height and weight variable
> and want to calculate a bmi variable from the two.  I have tried:
>
> attach(subjects)
> bmi <- (weight)/((height/100)^2)
>
> but it comes up with the error:
> Warning messages:
> 1: In Ops.factor(height, 100) : / not meaningful for factors
> 2: In Ops.factor((weight), ((height/100)^2)) :
>  / not meaningful for factors
>
> I presume that this means the vectors height and weight are not in
> numeric form (confirmed by is.numeric) so I changed the code to:
>
> bmi <- (as.numeric(weight))/((as.numeric(height)/100)^2)
>
> but this just comes up with a result which doesn't make sense, i.e.
> numbers such as 4 within the bmi vector.  I've looked at
> as.numeric(height)/as.numeric(weight) and these numbers just aren't
> the same as height/weight, which is the reason for the incorrect bmi.
> Can anyone tell me where I am going wrong?  It's quite frustrating
> because I can't understand why a function claiming to convert to
> numeric would come up with such a bizarre result.



That 'height' is a factor suggests that you imported the data using  
one of the read.table() family of functions and that there are non- 
numeric characters in at least one of the entries in that column.


Since 'height' is a factor, if you use as.numeric(), you will get  
numeric values returned that are the factor level numeric codes and  
not the expected numeric values. That is why you are getting bad  
values for BMI.


See:

  
http://cran.r-project.org/doc/FAQ/R-FAQ.html#How-do-I-convert-factors-to-numeric_003f


If you use something like:

  grep("[^0-9\\.]", height, value = TRUE)

that should show you where you have non-numeric values in the 'height'
column. That is, entries for 'height' that contain characters other
than digits or a decimal point. For example:


> height <- factor(c(seq(0, 1, 0.1), "1,10", letters[1:5]))

> height
 [1] 0    0.1  0.2  0.3  0.4  0.5  0.6  0.7  0.8  0.9  1    1,10 a    b    c    d    e
Levels: 0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1 1,10 a b c d e

> as.numeric(height)
 [1]  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17

> grep("[^0-9\\.]", height, value = TRUE)
[1] "1,10" "a"    "b"    "c"    "d"    "e"


I would also check the 'weight' column for the same reasons, to be  
sure that you don't have bad data there. Another approach would be to  
use:


  str(subjects)

which will give you a sense of the data types for each column in your  
data frame. Review each column and take note of any columns that  
should be numeric, but are factors.


See ?str, ?grep and ?regex for more information. You might also want  
to look at ?type.convert, which is the function used by the  
read.table() family of functions to determine the data types for each  
column during import.
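
To make the factor pitfall concrete, here is a minimal sketch of the conversion described in the R FAQ entry linked above. The height values are invented for illustration, and the variable names codes, values, values2 and bad are not from the original post.

```r
# Hypothetical data: a height column read in as a factor because of one bad entry.
height <- factor(c("172", "165.5", "180", "1,80", "158"))

# as.numeric() on a factor returns the internal level codes, not the values.
codes <- as.numeric(height)

# Convert via character instead; entries that are not valid numbers become NA.
values <- suppressWarnings(as.numeric(as.character(height)))

# Equivalent form from the R FAQ: convert the levels once, then index by the factor.
values2 <- suppressWarnings(as.numeric(levels(height))[height])

# Positions that need cleaning in the source file.
bad <- which(is.na(values))
```

Once the offending entries (here the one with a comma) are corrected in the source file, the column should import as numeric and no conversion is needed.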


HTH,

Marc Schwartz


Re: [R] can you tell what .Random.seed was?

2009-05-15 Thread Warren Young


Duncan Murdoch wrote:

1) can you tell me what my original set.seed() value was?  (I wouldn't
be able to figure it out, but maybe someone can)


The only way I know is to test all 2^32 possible values of the seed.  I 
think cryptographers would know faster ways.


Well, I'm not a cryptographer, but I know a faster way: rainbow tables.

http://en.wikipedia.org/wiki/Rainbow_table

Given that the algorithm to generate these images is known and that each 
seed always gives the same image as output, you can simply precompute 
all possible images, hash them using your favorite algorithm -- say, 
SHA-256 -- and record the seed-to-hash correspondence on disk.  Then 
given an output image, you can hash it and use that to look up the seed.


It can take a long time to generate all the images, but then you have 
database-like lookup speeds for image-to-seed correspondence.


This is not just a theoretical idea.  There are underground sites where 
you can put in, say, an MD5 password hash and get out the likely 
password that was actually used.  This allows a black hat to break into 
one site, grab its password hash database, reverse engineer the 
passwords, and then try them at the front door of other sites that 
users of the compromised site also use.  There are defenses 
against this: salting the passwords and using passwords too long to 
appear in rainbow tables are the easiest.


Now, if the seed was removed just before the values were generated, the 
seed would be generated from the system clock.  If you knew the time 
that this occurred approximately, the search could be a lot faster.


This also helps with the rainbow table approach.

Given that the seed for the generation algorithm is always the current 
wall time, you can restrict the needed rainbow table size greatly.  You 
simply have to know when the algorithm was first put into use, then 
start your rainbow table with that time's value as the first seed, and 
only compute up to now plus whatever you need for future operations.


For instance, you can cover about 3 years worth of image production in 
about 1/45 the time as it takes to cover all 2^32 possible images.  Say 
it takes a month to generate a rainbow table covering those 3 years. 
Full coverage would then take nearly 4 years.
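
The time-window idea can be sketched as follows. This is a plain brute-force scan over a small range of candidate seeds, not a precomputed rainbow table, and it assumes the seed was given explicitly to set.seed() and is an integer clock value inside a known window. The names first_draws, observed and candidates, and the numbers used, are made up for illustration; R's own default seeding, when no seed is set, may depend on more than the clock.

```r
# Return the first few uniform draws produced by a given seed.
first_draws <- function(seed, n = 5) {
  set.seed(seed)
  runif(n)
}

# Pretend this output is all we know; the seed that produced it is "forgotten".
observed <- first_draws(1242398765)

# Assumed window: integer clock values around the suspected moment.
candidates <- 1242398000:1242399000

found <- NA
for (s in candidates) {
  if (identical(first_draws(s), observed)) {
    found <- s
    break
  }
}
found
```

A lookup table keyed on a hash of the output would trade this repeated scanning for one expensive precomputation, which is the point of the rainbow-table suggestion.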



[R] creating and then executing command strings

2009-05-15 Thread Philipp Schmidt

Hi:

I very recently started experimenting with R and am occasionally
running into very basic problems that I can't seem to solve. If there
is an R-newbies forum that is more appropriate for these kinds of
questions, please direct me to it.

I'd like to automatically add vectors to a dataframe. I am able to
build command strings that would do what I want, but R is not
executing them.

A simplified example:

# Add three vectors called avg_col1, avg_col2, avg_col3 to dataframe df
for(colname in c("col1", "col2", "col3")){
print(paste("df$avg_", colname, " <- 0;", sep='')) # Just using this to make sure the command is correct
paste("avg_", colname, " <- 0;", sep='') # Does nothing
}

Output:

[1] "df$avg_col1 <- 0;"
[1] "df$avg_col2 <- 0;"
[1] "df$avg_col3 <- 0;"

Thanks for your help!

Best - P


Re: [R] displaying results

2009-05-15 Thread Duncan Murdoch


On 5/15/2009 8:33 AM, deanj2k wrote:

Hi everyone, can anyone tell me how i can change how i display mean(age), i
want it to say The mean age of patients within the sample is mean(age)


I think you want something like this:

cat(sprintf("The mean age of patients within the sample is %.1f.\n", 
mean(age)))


Play with the %.1f format for more decimal places, etc.
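
A short sketch of how the format string changes the result. The age vector is invented for illustration, and the second and third messages are variations that do not appear in the original post.

```r
age <- c(34, 41, 29, 57, 46)   # made-up ages, for illustration only

# One decimal place, as suggested above.
msg1 <- sprintf("The mean age of patients within the sample is %.1f.", mean(age))

# Three decimal places.
msg2 <- sprintf("The mean age of patients within the sample is %.3f.", mean(age))

# Minimum field width of 5 plus an integer placeholder.
msg3 <- sprintf("Mean age: %5.1f years (n = %d)", mean(age), length(age))

cat(msg1, msg2, msg3, sep = "\n")
```

sprintf() only builds the string; cat() is what prints it without the [1] index and the surrounding quotes.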

Duncan Murdoch


Re: [R] creating and then executing command strings

2009-05-15 Thread Romain Francois


Hi,

You can either parse and eval the string you are making, as in:

eval( parse( text = paste("avg_", colname, " <- 0;", sep='') ) )


Or you can do something like this:

df[[ paste( "avg_", colname, sep = "" ) ]] <- 0
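
A minimal, self-contained sketch of the second approach. The data frame contents are invented, and filling each new column with the mean of the matching source column is an assumption made only to give the example something to compute.

```r
# Hypothetical data frame; the column names follow the question.
df <- data.frame(col1 = c(1, 2, 3), col2 = c(4, 5, 6), col3 = c(7, 8, 9))

for (colname in c("col1", "col2", "col3")) {
  new_name <- paste("avg_", colname, sep = "")
  # Assigning through [[ ]] with a computed name adds (or overwrites) a column.
  # The single mean value is recycled to fill every row.
  df[[new_name]] <- mean(df[[colname]])
}

names(df)
```

This avoids building and evaluating command strings, which tends to be harder to debug than indexing by name.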

Romain

--
Romain Francois
Independent R Consultant
+33(0) 6 28 91 30 30
http://romainfrancois.blog.free.fr


Re: [R] displaying results

2009-05-15 Thread Wacek Kusnierczyk

Duncan Murdoch wrote:
 On 5/15/2009 8:33 AM, deanj2k wrote:
 Hi everyone, can anyone tell me how i can change how i display
 mean(age), i
 want it to say The mean age of patients within the sample is mean(age)

 I think you want something like this:

 cat(sprintf("The mean age of patients within the sample is %.1f.\n",
 mean(age)))


or maybe

cat(sprintf("The mean age of patients within the sample is %.1f.\n",
round(mean(age), 1)))


 Play with the %.1f format for more decimal places, etc.

... and be aware of excel bugs.

vQ


[R] Issue displaying legend for large data

2009-05-15 Thread anisha_sinnarkar

Hi,

We are working on an R project with the latest version, 2.9.0. We are using the matplot 
and barplot functions to draw different graphs. An end user may generate graphs 
for a large amount of data. Also, each point to be plotted may have a long name 
(around 170 characters). These names (Y axis points) need to be displayed in 
the legend for the graph. However, it is not possible to fit these long names in 
a legend in an R window when a large number of points is selected for trending.

We tried setting the font and window size for the graphs using the graphical 
parameters. However, it did not help for large number of points having long 
names. Further, we tried using R packages tcltk and tkrplot to display graphs 
and legend in a Tk widget instead of R window. We are able to display the full 
description of plotting points on click on the corresponding point style. 
However, we are not able to save/export this graph(widget) in some format. 

Currently, we are displaying the legend for the points in a separate R window. 
But, it does not seem to be associated with graphs generated. We need to have 
the actual graph and legend associated with it on a same window with all the 
plotted points and point styles. 

Is there any other way to solve this display issue of large number of data used 
for plotting?

Thanks in advance.

Regards,
Anisha Sinnarkar



[R] Additional points to scatter plot show up at wrong place

2009-05-15 Thread Peter Menzel

Hi everyone,

I have a problem with adding points to scatter plots.
The plot is drawn with the scatterplot() function from the car library:

scatterplot(data[,2] ~ data[,1],
data=data,smooth=F,reg.line=F,xlim=c(0.5,1),ylim=c(0.5,1),ylab="ML",xlab="Freq",cex.lab=1.9,cex.axis=1.8)

after that, I draw one line with abline(0,1,col="gray20") which works
perfectly fine.

now I want to add, say the point (0.6,0.6) to the plot with
points(c(0.6),c(0.6)).
The point is plotted, but not exactly at the proper coordinates, but
at something like (0.55,0.55).

When I use the plot() function to make the scatter plots, this problem
does not occur, but I want to have those nice box plots next to the X
and Y-axes that are drawn by scatterplot()..

Anybody has an idea how to get the points at the right place in the plot?

cheers, Peter


[R] warning message while installing a package

2009-05-15 Thread meenusahi

Dear all
I was trying to install the package ISwR and got the following message. I was 
connected to the internet.

Warning: unable to access index for repository
http://cms.unipune.ernet.in/computing/cran/bin/windows/contrib/2.8


Please help.
regards
M.


Re: [R] warning message while installing a package

2009-05-15 Thread Duncan Murdoch


On 5/15/2009 11:04 AM, meenus...@gmail.com wrote:

Dear all
I was trying to install the package ISwR and got the following message. I was 
connected to the internet.

Warning: unable to access index for repository
http://cms.unipune.ernet.in/computing/cran/bin/windows/contrib/2.8


That's a problem connecting to the mirror.  Try a different one.  (You 
can do this from the menus in the GUI, or from the console using 
chooseCRANmirror().)


Duncan Murdoch


Re: [R] warning message while installing a package

2009-05-15 Thread Marc Schwartz


On May 15, 2009, at 9:04 AM, Duncan Murdoch wrote:


On 5/15/2009 11:04 AM, meenus...@gmail.com wrote:

Dear all
I was trying to install the package ISwR and got the following  
message. I was connected to the internet.

Warning: unable to access index for repository
http://cms.unipune.ernet.in/computing/cran/bin/windows/contrib/2.8


That's a problem connecting to the mirror.  Try a different one.   
(You can do this from the menus in the GUI, or from the console  
using chooseCRANmirror().)




It's not a problem connecting, but either a permissions issue or there  
is just nothing there:


  http://cms.unipune.ernet.in/computing/cran

That URL is 'Not Found'.

I don't see that mirror (or any mirrors in India) on the 'official'  
mirror list, but if legit, might be worthwhile contacting the mirror  
Admin to see what's up.


That being said, definitely use a different mirror in the meantime.
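
Besides chooseCRANmirror(), the mirror can be set without any menus. The sketch below assumes the mirror URL shown, which is only an example and not one mentioned in the thread; the install.packages() calls are left commented out because they need network access.

```r
# Set the CRAN mirror for this session (example URL, an assumption).
options(repos = c(CRAN = "https://cloud.r-project.org"))

# Check which repository will be used.
getOption("repos")

# install.packages("ISwR")                                          # uses the option set above
# install.packages("ISwR", repos = "https://cloud.r-project.org")   # or name the mirror per call
```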

HTH,

Marc Schwartz


Re: [R] Help with loops

2009-05-15 Thread David Freedman


I'm not quite sure what you want to do, but this might help:

d=data.frame(replicate(40, rnorm(20)))
d$sample=rep(c('a','b','c','d'),each=5)
library(doBy)
summaryBy(.~sample,data=d)
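
If the goal is what the original post describes, namely averaging each adjacent pair of columns within every row, a base-R sketch without any loop over rows could look like this. It assumes the replicates sit in adjacent columns (2:3, 4:5, and so on); the simulated zz and the names vals, odd and pair_means are illustrative only.

```r
set.seed(1)
# Hypothetical stand-in for zz: an ID column followed by 40 numeric columns.
zz <- data.frame(id = paste("row", 1:6, sep = ""), matrix(rnorm(6 * 40), nrow = 6))

vals <- as.matrix(zz[, 2:41])
odd  <- seq(1, ncol(vals), by = 2)   # first column of each replicate pair

# Average each adjacent pair of columns for all rows at once.
pair_means <- (vals[, odd] + vals[, odd + 1]) / 2

# Row headings in column 1, the 20 averages in columns 2 to 21.
average_value <- data.frame(id = zz$id, pair_means)
dim(average_value)
```

Because the arithmetic is vectorised over whole columns, the run time stays small even for data with many rows.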

David Freedman

Amit Patel-7 wrote:
 
 
 Hi
 I am trying to create a loop which averages replicates in my data.
 The original data has many rows. and consists of 40 column zz[,2:41] plus
 row headings in zz[,1]
 I am trying to average each set of values (i.e. zz[1,2:3] averaged and
 placed in average_value[1,2] and so on.
 below is my script but it seems to be stuck in an endless loop
 Any suggestions??
 
 for (i in 1:length(average_value[,1])) {
 average_value[i] <- i^100; print(average_value[i])
 
 #calculates Meanss
 #Sample A
 average_value[i,2] <- rowMeans(zz[i,2:3])
 average_value[i,3] <- rowMeans(zz[i,4:5])
 average_value[i,4] <- rowMeans(zz[i,6:7])
 average_value[i,5] <- rowMeans(zz[i,8:9])
 average_value[i,6] <- rowMeans(zz[i,10:11])
 
 #Sample B
 average_value[i,7] <- rowMeans(zz[i,12:13])
 average_value[i,8] <- rowMeans(zz[i,14:15])
 average_value[i,9] <- rowMeans(zz[i,16:17])
 average_value[i,10] <- rowMeans(zz[i,18:19])
 average_value[i,11] <- rowMeans(zz[i,20:21])
 
 #Sample C
 average_value[i,12] <- rowMeans(zz[i,22:23])
 average_value[i,13] <- rowMeans(zz[i,24:25])
 average_value[i,14] <- rowMeans(zz[i,26:27])
 average_value[i,15] <- rowMeans(zz[i,28:29])
 average_value[i,16] <- rowMeans(zz[i,30:31])
 
 #Sample D
 average_value[i,17] <- rowMeans(zz[i,32:33])
 average_value[i,18] <- rowMeans(zz[i,34:35])
 average_value[i,19] <- rowMeans(zz[i,36:37])
 average_value[i,20] <- rowMeans(zz[i,38:39])
 average_value[i,21] <- rowMeans(zz[i,40:41])
   }
 
 
 thanks
 
 
 
 
 
 


Re: [R] Additional points to scatter plot show up at wrong place

2009-05-15 Thread Stefan Grosse

On Fri, 15 May 2009 15:43:33 +0200 Peter Menzel
pmen...@googlemail.com wrote:

PM scatterplot(data[,2] ~ data[,1],
PM 
data=data,smooth=F,reg.line=F,xlim=c(0.5,1),ylim=c(0.5,1),ylab="ML",xlab="Freq",cex.lab=1.9,cex.axis=1.8)

Side remark: you don't need to write data[,2] if you have specified data=data
as you did. A formula using the column names, such as var1~var2, would be enough.

PM after that, I draw one line with abline(0,1,col="gray20") which
PM works perfectly fine.

abline for me also does not work in the expected way, see below.
 
PM now I want to add, say the point (0.6,0.6) to the plot with
PM points(c(0.6),c(0.6)).

The c() is not necessary; points(0.6,0.6) is enough.

PM The point is plotted, but not exactly at the proper coordinates, but
PM at something like (0.55,0.55).

That seems to be a bug. The axes do not seem to be drawn exactly.
To replicate, see:
library(car)
data <- data.frame(x1=rnorm(100),x2=rnorm(100,.25))
scatterplot(x1~x2,data=data,ylab="ML",xlab="Freq")
points(0.5,0.5,col="blue")
abline(h=0.5,lty=2) # check whether point is at the correct location.
abline(v=0.5,lty=2)
abline(h=1) # line is not at 1 at the y-axis!

So maybe one can contact the package owner?

Btw. creating such a plot by yourself is easy, have a look
at ?layout ?axis 
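
A rough sketch of the do-it-yourself route with layout(), using base graphics only. The data, panel proportions and margins are assumptions chosen for illustration, and the marginal box plots line up with the scatter panel only approximately. The key point is that the scatter panel is drawn first and the extra point is added before moving on to the next panel, so it is placed in the scatter plot's own coordinates.

```r
set.seed(42)
x <- runif(100, 0.5, 1)   # made-up data in the range used in the thread
y <- runif(100, 0.5, 1)

# 2 x 2 layout: y box plot at top left, scatter plot at top right,
# x box plot at bottom right; the bottom-left cell (0) stays empty.
lay <- layout(matrix(c(2, 1,
                       0, 3), nrow = 2, byrow = TRUE),
              widths = c(1, 5), heights = c(5, 1))

# Panel 1: the scatter plot, with the line and the extra point added right away.
par(mar = c(4, 4, 1, 1))
plot(x, y, xlim = c(0.5, 1), ylim = c(0.5, 1), xlab = "Freq", ylab = "ML")
abline(0, 1, col = "gray20")
points(0.6, 0.6, col = "blue", pch = 19)

# Panel 2: marginal box plot for y.
par(mar = c(4, 1, 1, 1))
boxplot(y, ylim = c(0.5, 1), axes = FALSE)

# Panel 3: marginal box plot for x.
par(mar = c(1, 4, 1, 1))
boxplot(x, ylim = c(0.5, 1), horizontal = TRUE, axes = FALSE)
```

Once a later panel has been started, low-level calls such as points() apply to that panel, which is consistent with the misplaced point reported above.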

Stefan


Re: [R] Using sample to create Training and Test sets

2009-05-15 Thread Max Kuhn

 Forgive the newbie question, I want to select random rows from my
 data.frame to create a test set (which I can do) but then I want to
 create a training set using whats left over.


The caret package has a function, createDataPartition, that does the
split taking into account the distribution of the outcome. This might
be good in classification cases where one or more classes have low
percentages in the data set.

There is more detail in the pdf:

 http://cran.r-project.org/web/packages/caret/vignettes/caretMisc.pdf

and examples in this pdf

  http://cran.r-project.org/web/packages/caret/vignettes/caretTrain.pdf
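
For the simple case in the question (a random test set plus whatever is left over), a base-R sketch with sample() and negative indexing may be enough. The data frame dat and the split size are invented for illustration; unlike createDataPartition, this does not balance the outcome distribution.

```r
set.seed(123)
# Hypothetical data frame standing in for the poster's data.
dat <- data.frame(id = 1:20, x = rnorm(20))

# Row numbers for the test set, drawn without replacement.
test_idx <- sample(nrow(dat), size = 5)

test  <- dat[test_idx, ]
train <- dat[-test_idx, ]   # negative indexing keeps every row not in the test set
```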

Max


Re: [R] Problem with viewports, print.trellis and more/newpage

2009-05-15 Thread Sebastien . Bihorel

Hi Deepayan,

Thank you very much for the tip. After removing the 'more' argument and
another couple of hours, I finally found something that works for my
multi-page multi-graph plots. For documentation, the script is:


library(lattice)
library(grid)

foo <- data.frame(x=1:10,y=1:10)

# Defines some viewports
fulldevice <- viewport(x=0, y=0, width=1, height=1, just=c(0,0),
name="fulldevice")
plotvw <- viewport(x=0, y=0, width=1, height=0.95, just=c(0,0),
name="plotvw")
titlevw <- viewport(x=0, y=0.95, width=1, height=0.05, just=c(0,0),
name="titlevw")
tree <- vpTree(fulldevice,vpList(plotvw,titlevw))

for (i in 1:4) {
  plots <- xyplot((i*y)~x,data=foo)

  grid.newpage()
  pushViewport(tree)

  seekViewport("plotvw")

  print(plots, split=c(1,1,2,4), newpage=FALSE)
  print(plots, split=c(2,1,2,4), newpage=FALSE)
  print(plots, split=c(1,2,2,4), newpage=FALSE)
  print(plots, split=c(2,2,2,4), newpage=FALSE)
  print(plots, split=c(1,3,2,4), newpage=FALSE)
  print(plots, split=c(2,3,2,4), newpage=FALSE)
  print(plots, split=c(1,4,2,4), newpage=FALSE)
  print(plots, split=c(2,4,2,4), newpage=FALSE)


  seekViewport("titlevw")

  grid.text(label = "test",
just = c("centre","centre"),
gp = gpar(fontsize = 10, font = 2))
}



 On Thu, May 14, 2009 at 1:58 PM, Sebastien Bihorel
 sebastien.biho...@cognigencorp.com wrote:
 Dear R-users,

 I have got the following problem. I need to create 4x2 arrays of
 xyplot's on
 several pages. The plots are created within a loop and plotted using the
 print function. It seems that I cannot find the proper grid syntax with
 my
 viewports, and the more/newpage arguments.

 The following script is a simplification but hopefully will suffice to
 illustrate my problem. Any suggestion from the list would be greatly
 appreciated.

 Without looking at it in detail, here's one bit of advice that might
 help: if you are using pushViewport(), don't use 'more', use only
 'newpage', and preferably don't use 'split' either. In particular, if
 you are using 'more', the first print.trellis() call will always start
 a new page, and your viewport will be lost.

 -Deepayan


 Sebastien

 #

 library(lattice)

 foo <- data.frame(x=1:10,y=1:10)

 for (i in 1:4) {
   #isnewpage <- FALSE

   plots <- xyplot(y~x,data=foo)

   pushViewport(viewport(x=0,
                         y=0,
                         width=1,
                         height=0.95,
                         just=c(0,0)))

   print(plots, split=c(1,1,2,4), more=T)#, newpage=isnewpage)
   print(plots, split=c(2,1,2,4), more=T)#, newpage=isnewpage)
   print(plots, split=c(1,2,2,4), more=T)#, newpage=isnewpage)
   print(plots, split=c(2,2,2,4), more=T)#, newpage=isnewpage)
   print(plots, split=c(1,3,2,4), more=T)#, newpage=isnewpage)
   print(plots, split=c(2,3,2,4), more=T)#, newpage=isnewpage)
   print(plots, split=c(1,4,2,4), more=T)#, newpage=isnewpage)
   print(plots, split=c(2,4,2,4), more=F)#, newpage=isnewpage)
   popViewport()
   pushViewport(viewport(x=0,
                         y=0.95,
                         width=1,
                         height=0.05,
                         just=c(0,0)))
   grid.text(label = i,
             just = c("centre","centre"),
             gp = gpar(fontsize = 10, font = 2))
   popViewport()
   # Updates isnewpage
   # isnewpage <- TRUE
 }
 --
 *Sebastien Bihorel, PharmD, PhD*
 PKPD Scientist
 Cognigen Corp
 Email: sebastien.biho...@cognigencorp.com
 mailto:sebastien.biho...@cognigencorp.com
 Phone: (716) 633-3463 ext. 323


[R] replace % with \%

2009-05-15 Thread Liviu Andronic

Dear all,
I'm trying to gsub() "%" with "\%" with no obvious success.
> temp1 <- c("mean", "sd", "0%", "25%", "50%", "75%", "100%")
> temp1
[1] "mean" "sd"   "0%"   "25%"  "50%"  "75%"  "100%"
> gsub("%", "\%", temp1, fixed=TRUE)
[1] "mean" "sd"   "0%"   "25%"  "50%"  "75%"  "100%"
Warning messages:
1: '\%' is an unrecognized escape in a character string
2: unrecognized escape removed from "\%"

I am not quite sure how to deal with this warning. I tried
the following:
> gsub("%", "\\%", temp1, fixed=TRUE)
[1] "mean"   "sd"     "0\\%"   "25\\%"  "50\\%"  "75\\%"  "100\\%"

Could anyone suggest how to obtain output similar to:
[1] "mean"   "sd"     "0\%"   "25\%"  "50\%"  "75\%"  "100\%"

Thank you,
Liviu



-- 
Do you know how to read?
http://www.alienetworks.com/srtest.cfm
Do you know how to write?
http://garbl.home.comcast.net/~garbl/stylemanual/e.htm#e-mail


[R] analysis of circular data with mixed models???

2009-05-15 Thread Steven Van Wilgenburg

Hi.

I am trying to model data on movements (direction) of birds and the 
response variables are compass directions (0 to 360).

I have found two packages CircStats and Circular that can implement 
linear models for a circular response, which will do what I need for 
the data set I am currently working on (modeling movements for only 1 
species).

However, in the near future, I would like to extend my modeling by 
including multiple species, and treating each species as a random 
effect. It appears that analysis of circular data using a mixed model 
approach is possible (see the text Statistical Analysis of Circular 
Data, Fisher 1996); however, does anyone know of a package in R that 
implements mixed models for circular data?


Cheers

-Steve


Re: [R] replace % with \%

2009-05-15 Thread Marc Schwartz



On May 15, 2009, at 9:46 AM, Liviu Andronic wrote:


Dear all,
I'm trying to gsub() "%" with "\%" with no obvious success.

> temp1 <- c("mean", "sd", "0%", "25%", "50%", "75%", "100%")
> temp1
[1] "mean" "sd"   "0%"   "25%"  "50%"  "75%"  "100%"

> gsub("%", "\%", temp1, fixed=TRUE)
[1] "mean" "sd"   "0%"   "25%"  "50%"  "75%"  "100%"
Warning messages:
1: '\%' is an unrecognized escape in a character string
2: unrecognized escape removed from "\%"

I am not quite sure on how to deal with this error message. I tried
the following

> gsub("%", "\\%", temp1, fixed=TRUE)
[1] "mean"   "sd"     "0\\%"   "25\\%"  "50\\%"  "75\\%"  "100\\%"

Could anyone suggest how to obtain output similar to:
[1] "mean"   "sd"     "0\%"   "25\%"  "50\%"  "75\%"  "100\%"

Thank you,
Liviu


Presuming that you might want to output the results to a TeX file for  
subsequent processing, where the '%' would otherwise be a comment  
character, the key is not to get a single '\', but a double '\\', so  
that you then get a single '\' on output:


> temp1 <- c("mean", "sd", "0%", "25%", "50%", "75%", "100%")

> temp2 <- gsub("%", "\\\\%", temp1)

> temp2
[1] "mean"   "sd"     "0\\%"   "25\\%"  "50\\%"  "75\\%"  "100\\%"

> cat(temp2)
mean sd 0\% 25\% 50\% 75\% 100\%


Remember that the single '\' is an escape character, which needs to be  
doubled.
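
The difference between what print() shows and what the string actually contains can be checked directly. This is a small sketch using the same vector as in the thread; the names a and b are illustrative.

```r
temp1 <- c("mean", "sd", "0%", "25%", "50%", "75%", "100%")

# fixed = TRUE: the replacement is taken literally, so "\\%" (one backslash, then %) is enough.
a <- gsub("%", "\\%", temp1, fixed = TRUE)

# Regular-expression mode: backslash is also special in the replacement, so it is doubled again.
b <- gsub("%", "\\\\%", temp1)

identical(a, b)   # both results hold a single backslash before each %
nchar(a[3])       # three characters: 0, backslash, %
writeLines(a)     # shows the strings as they would be written to a file
```

print() displays the escaped form with two backslashes, while cat() and writeLines() show the single backslash that is really stored.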


HTH,

Marc Schwartz


Re: [R] replacing default axis labels on a plot - SOLUTION

2009-05-15 Thread Graves, Gregory



The original problem posed was:

On 14/05/2009 7:31 AM, Graves, Gregory wrote:
 I have 3 columns:  flow, month, and monthname, where month is 1-12,
and
 monthname is name of month.  I can't get the plot to replace the 1-12
 with monthname using ticks.lab.  What am I doing wrong?
 
 plot(flow~factor(month),xlab="Month",ylab="Total Flow per Month",
 ylim=c(0,55000), ticks.lab=monthname)

Here is the solution to this:

# make a boxplot but suppress default labels on x axis with xaxt="n"

plot(flow~factor(month),xlab="Month",ylab="Total Flow per Month",
ylim=c(0,55000), xaxt="n")  #NOTE xaxt

# create a vector containing month abbrevs with [[1]] suffix as follows

month.name <- list(c("Jan", "Feb", "Mar", "Apr", "May", "Jun", "Jul",
"Aug", "Sep", "Oct", "Nov", "Dec"))[[1]]

# place the 12 months on axis 1 (the x axis) as follows:
axis(1, at=1:12, labels=month.name)
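
A variation on the same solution, sketched with simulated data: R ships with the constant month.abb, so the abbreviations do not have to be typed in and the built-in month.name does not need to be overwritten. The flow values are made up for illustration.

```r
set.seed(7)
# Made-up monthly flows, 10 values per month.
month <- rep(1:12, each = 10)
flow  <- runif(120, 0, 55000)

# A factor on the right-hand side gives box plots; xaxt = "n" suppresses the x axis.
plot(flow ~ factor(month), xlab = "Month", ylab = "Total Flow per Month",
     ylim = c(0, 55000), xaxt = "n")

# month.abb is a built-in character vector: "Jan", "Feb", ..., "Dec".
axis(1, at = 1:12, labels = month.abb)
```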

Gregory A. Graves
Lead Scientist
REstoration COoordination and VERification (RECOVER) 
Watershed Division
South Florida Water Management District
Phones:  DESK: 561 / 682 - 2429 
 CELL:  561 / 719 - 8157
 


Re: [R] creating and then executing command strings

2009-05-15 Thread Philipp Schmidt

On Fri, May 15, 2009 at 3:38 PM, Romain Francois
romain.franc...@dbmail.com wrote:
 Hi,

 You can either parse and eval the string you are making, as in:

 eval( parse( text = paste("avg_", colname, " <- 0;", sep='') ) )


 Or you can do something like this:

 df[[ paste( "avg_", colname, sep = "" ) ]] <- 0


Thank you so much! I used the first version and it worked.

What puzzles me is that I am not able to use <- instead of = (my R
book says the two can be exchanged) or break the command into
different parts and execute them one after another.

I get various error messages when I try:

eval( parse( text <- paste("avg_", colname, " <- 0;", sep='') ) )

or

text = paste("avg_", colname, " <- 0;", sep='')
parse(text)
eval(parse(text))

Anyway, thanks a lot - you greatly improved the likelihood of me not
working on the weekend!
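
A likely explanation for the puzzle above, offered as a sketch rather than a reply from the thread: inside a function call, = names an argument while <- performs an assignment, and the first positional argument of parse() is file, not text. The names cmd, expr and res are illustrative.

```r
colname <- "col1"
cmd <- paste("avg_", colname, " <- 0;", sep = "")

# `text = cmd` names the argument, so the string is parsed as R code.
expr <- parse(text = cmd)
eval(expr)
avg_col1

# `parse(text <- cmd)` assigns cmd to a variable called text and then passes the
# value positionally as `file`, so parse() tries to open a file named
# "avg_col1 <- 0;" and fails, just as parse(cmd) would.
res <- suppressWarnings(try(parse(text <- cmd), silent = TRUE))
inherits(res, "try-error")
```

So the two operators are interchangeable for top-level assignment, but not inside the argument list of a call.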

Best - P


Re: [R] replace % with \%

2009-05-15 Thread Liviu Andronic

Thanks all for the prompt responses. Now Hmisc::latex() no longer
generates errors on Rcmdr::numSummary() objects (with `tempa' below
being such an object).
> colnames(tempa$table) <- gsub("%", "\\%", colnames(tempa$table), fixed=TRUE)
> latex(tempa$table, cdec=3)

Best regards,
Liviu


On Fri, May 15, 2009 at 5:13 PM, Patrick Burns pbu...@pburns.seanet.com wrote:
 See 'The R Inferno' page 46.



 Patrick Burns
 patr...@burns-stat.com
 +44 (0)20 8525 0696
 http://www.burns-stat.com
 (home of The R Inferno and A Guide for the Unwilling S User)

-- 
Do you know how to read?
http://www.alienetworks.com/srtest.cfm
Do you know how to write?
http://garbl.home.comcast.net/~garbl/stylemanual/e.htm#e-mail


Re: [R] Additional points to scatter plot show up at wrong place

2009-05-15 Thread Peter Dalgaard

Peter Menzel wrote:
 Hi everyone,
 
 I have a problem with adding points to scatter plots.
 The plot is drawn with the scatterplot() function from the car library:
 
 scatterplot(data[,2] ~ data[,1],
 data=data,smooth=F,reg.line=F,xlim=c(0.5,1),ylim=c(0.5,1),ylab="ML",xlab="Freq",cex.lab=1.9,cex.axis=1.8)
 
 after that, I draw one line with abline(0,1,col="gray20") which works
 perfectly fine.
 
 now I want to add, say the point (0.6,0.6) to the plot with
 points(c(0.6),c(0.6)).
 The point is plotted, but not exactly at the proper coordinates, but
 at something like (0.55,0.55).

scatterplot() is using layout() internally, so you can't expect this to
work. I don't think there's a nice way of going back to a previous
subregion.
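
One possible workaround, sketched here with base graphics only rather than
car::scatterplot(): build the marginal box plots with layout() yourself and
draw the scatter panel last, so that later calls to abline() and points()
are made in its coordinate system. The data, panel sizes and margins below
are made up for illustration and are not taken from the original post.

```r
# Hypothetical data standing in for data[,1] and data[,2]
set.seed(1)
freq <- runif(50, 0.5, 1)
ml   <- runif(50, 0.5, 1)

pdf(NULL)                         # null device; drop this line to draw on screen
# Panels: 1 = box plot of y (left), 2 = box plot of x (bottom), 3 = scatter.
# The scatter panel has the highest number, so it is drawn last and stays current.
layout(matrix(c(1, 3,
                0, 2), nrow = 2, byrow = TRUE),
       widths = c(1, 4), heights = c(4, 1))

par(mar = c(4, 1, 1, 0.5))
boxplot(ml, ylim = c(0.5, 1), axes = FALSE)
par(mar = c(0.5, 4, 0.5, 1))
boxplot(freq, ylim = c(0.5, 1), horizontal = TRUE, axes = FALSE)

par(mar = c(4, 4, 1, 1))
plot(freq, ml, xlim = c(0.5, 1), ylim = c(0.5, 1), xlab = "Freq", ylab = "ML")
abline(0, 1, col = "gray20")
points(0.6, 0.6, pch = 19, col = "red")   # uses the scatter panel's coordinates

usr <- par("usr")                 # user coordinates of the current (scatter) panel
dev.off()
```

Because nothing is drawn after the scatter panel, the current plot region is
still the one the point is meant for.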

 When I use the plot() function to make the scatter plots, this problem
 does not occur, but I want to have those nice box plots next to the X
 and Y-axes that are drawn by scatterplot()..
 
 Anybody has an idea how to get the points at the right place in the plot?
 
 cheers, Peter
 


-- 
   O__   Peter Dalgaard Øster Farimagsgade 5, Entr.B
  c/ /'_ --- Dept. of Biostatistics PO Box 2099, 1014 Cph. K
 (*) \(*) -- University of Copenhagen   Denmark  Ph:  (+45) 35327918
~~ - (p.dalga...@biostat.ku.dk)  FAX: (+45) 35327907

Re: [R] Help with loops

2009-05-15 Thread Jorge Ivan Velez

Dear Amit,
The following should get you started. What I'm doing is creating an
identifier (g) for the columns you want to group and then using a
combination of apply() and tapply() to get the mean for each row within
the levels of g. In your case, you have more columns than I have in my
example, but with slight modifications you can adapt the code below to
your needs.

See ?apply and ?rep for more information.

HTH,

Jorge


# Some data
set.seed(123)
X <- matrix(rnorm(100), ncol=10)
colnames(X) <- paste('x',1:10,sep="")
rownames(X) <- paste('sample_',1:10,sep="")

# Defining the groups using rep()
g <- rep(1:(ncol(X)/2), each = 2 )

# Calculating the means
res <- t( apply(X, 1, tapply, g, mean) )
res

# res[1,1] is the mean for X[1, 1:2]
mean(X[1,1:2])
# [1] 0.2408457
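
For the layout described in the question (row labels in column 1, replicate
pairs in columns 2:41), the same idea can be sketched without an explicit
loop. Here zz is simulated stand-in data, and the names vals, pair and
pair_means are assumptions made for this illustration only.

```r
# Simulated stand-in for zz: labels in column 1, 40 data columns in 2:41
set.seed(123)
zz <- data.frame(id = paste("row", 1:5, sep = ""),
                 matrix(rnorm(5 * 40), nrow = 5))

vals <- as.matrix(zz[, 2:41])
pair <- rep(1:20, each = 2)            # columns (2,3) -> 1, (4,5) -> 2, ...

# Sum each row within a pair, then divide by the pair size (2)
pair_means <- t(rowsum(t(vals), pair)) / 2
average_value <- data.frame(id = zz[, 1], pair_means)

# Spot check against the direct calculation for the first row, first pair
all.equal(average_value[1, 2], mean(unlist(zz[1, 2:3])))
```

rowsum() does the grouping in one call, so the number of replicate pairs only
appears in the definition of pair.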


On Fri, May 15, 2009 at 8:17 AM, Amit Patel amitrh...@yahoo.co.uk wrote:


 Hi
 I am trying to create a loop which averages replicates in my data.
 The original data has many rows. and consists of 40 column zz[,2:41] plus
 row headings in zz[,1]
 I am trying to average each set of values (i.e. zz[1,2:3] averaged and
 placed in average_value[1,2] and so on.
 below is my script but it seems to be stuck in an endless loop
 Any suggestions??

 for (i in 1:length(average_value[,1])) {
 average_value[i] <- i^100; print(average_value[i])

 #calculates Means
 #Sample A
 average_value[i,2] <- rowMeans(zz[i,2:3])
 average_value[i,3] <- rowMeans(zz[i,4:5])
 average_value[i,4] <- rowMeans(zz[i,6:7])
 average_value[i,5] <- rowMeans(zz[i,8:9])
 average_value[i,6] <- rowMeans(zz[i,10:11])

 #Sample B
 average_value[i,7] <- rowMeans(zz[i,12:13])
 average_value[i,8] <- rowMeans(zz[i,14:15])
 average_value[i,9] <- rowMeans(zz[i,16:17])
 average_value[i,10] <- rowMeans(zz[i,18:19])
 average_value[i,11] <- rowMeans(zz[i,20:21])

 #Sample C
 average_value[i,12] <- rowMeans(zz[i,22:23])
 average_value[i,13] <- rowMeans(zz[i,24:25])
 average_value[i,14] <- rowMeans(zz[i,26:27])
 average_value[i,15] <- rowMeans(zz[i,28:29])
 average_value[i,16] <- rowMeans(zz[i,30:31])

 #Sample D
 average_value[i,17] <- rowMeans(zz[i,32:33])
 average_value[i,18] <- rowMeans(zz[i,34:35])
 average_value[i,19] <- rowMeans(zz[i,36:37])
 average_value[i,20] <- rowMeans(zz[i,38:39])
 average_value[i,21] <- rowMeans(zz[i,40:41])
  }


 thanks






Re: [R] Optimization algorithm to be applied to S4 classes - specifically sparse matrices

2009-05-15 Thread Douglas Bates

On Wed, May 13, 2009 at 5:21 PM,  avraham.ad...@guycarp.com wrote:

 Hello.

 I am trying to optimize a set of parameters using /optim/ in which the
 actual function to be minimized contains matrix multiplication and is of
 the form:

 SUM ((A%*%X - B)^2)

 where A is a matrix and X and B are vectors, with X as parameter vector.

As Spencer Graves pointed out, what you are describing here is a
linear least squares problem, which has a direct (i.e. non-iterative)
solution.  A comparison of the speed of various ways of solving such a
system is given in one of the vignettes in the Matrix package.

 This has worked well so far. Recently, I was given a data set A of size
 360440 x 1173, which could not be handled as a normal matrix. I brought it
 into 'R' as a sparse matrix (dgCMatrix - using sparseMatrix from the Matrix
 package), and the formulæ and gradient work, but /optim/ returns an error
 of the form "no method for coercing this S4 class to a vector".

If you just want the least squares solution X then

X <- solve(crossprod(A), crossprod(A, B))

will likely be the fastest method where A is the sparse matrix.

I do feel obligated to point out that the least squares solution for
such large systems is rarely a sensible solution to the underlying
problem.  If you have over 1000 columns in A and it is very sparse
then likely at least parts of A are based on indicator columns for a
categorical variable.  In such situations a model with random effects
for the category is often preferable to the fixed-effects model you
are fitting.
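
A minimal sketch of the direct solution on a small simulated problem, assuming
the Matrix package is installed. The dimensions, the random sparse design and
the names n, p and x_true are invented for illustration; only the solve()
line comes from the message above.

```r
library(Matrix)

set.seed(42)
n <- 2000; p <- 30
# Hypothetical sparse design: one guaranteed entry per column plus random entries
i <- c(1:p, sample(n, 3000, replace = TRUE))
j <- c(1:p, sample(p, 3000, replace = TRUE))
A <- sparseMatrix(i = i, j = j, x = rnorm(length(i)), dims = c(n, p))

x_true <- rnorm(p)
B <- as.vector(A %*% x_true) + rnorm(n, sd = 0.01)

# Direct solution of the normal equations (A'A) X = A'B
X <- solve(crossprod(A), crossprod(A, B))

# The objective from the original post, evaluated at the solution
sum((A %*% X - B)^2)
```

No call to optim() is needed here, so the coercion error does not arise; A
stays sparse throughout.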


 After briefly looking into methods and classes, I realize I am in way over
 my head. Is there any way I could use /optim/ or another optimization
 algorithm, on sparse matrices?

 Thank you very much,

 --Avraham Adler


[R] data summary and some automated t.tests.

2009-05-15 Thread stephen sefick

I would like to perform a t.test on each of the measured variables
(sand.silt etc.), with a mean and sd for each of the treatments (up or
down), and output this as a table.  I am having a hard time
starting; maybe it is too close to lunch.  Any suggestions would be
greatly appreciated.

Stephen Sefick
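
One way to get started, sketched on made-up data rather than on the data set
below: loop over the variable names, compute the group means and SDs, run a
two-sample t-test by site, and bind the results into a table. The data frame
d, its values and the column names of the result are assumptions for this
example.

```r
# Made-up data with the same kind of columns as in the post
set.seed(1)
d <- data.frame(site      = factor(rep(c("dn", "up"), each = 20)),
                sand.silt = c(rnorm(20, 45, 20), rnorm(20, 65, 20)),
                gravel    = c(rnorm(20, 25, 15), rnorm(20, 15, 15)))
vars <- c("sand.silt", "gravel")

# One row per variable: group means, group SDs and the Welch t-test result
res <- t(sapply(vars, function(v) {
  m  <- sapply(split(d[[v]], d$site), mean)
  s  <- sapply(split(d[[v]], d$site), sd)
  tt <- t.test(d[[v]] ~ d$site)
  c(mean.dn = m[["dn"]], mean.up = m[["up"]],
    sd.dn   = s[["dn"]], sd.up   = s[["up"]],
    t       = unname(tt$statistic), p.value = tt$p.value)
}))
round(res, 3)
```

With the real data, vars would list the measured columns (sand.silt, gravel,
cobble and so on) and d$site would be the up/down factor.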

x <- (structure(list(sample. = structure(c(1L, 7L, 8L, 9L, 10L, 11L,
12L, 13L, 14L, 2L, 3L, 4L, 5L, 6L, 1L, 7L, 8L, 9L, 10L, 11L,
12L, 13L, 14L, 2L, 3L, 4L, 5L, 6L, 25L, 28L, 29L, 30L, 31L, 32L,
33L, 34L, 35L, 26L, 25L, 28L, 29L, 30L, 31L, 32L, 33L, 34L, 35L,
26L, 27L, 25L, 28L, 29L, 30L, 31L, 32L, 33L, 34L, 35L, 26L, 15L,
17L, 18L, 19L, 20L, 21L, 22L, 23L, 24L, 16L, 15L, 17L, 18L, 19L,
20L, 21L, 22L, 23L, 24L, 16L, 36L, 39L, 40L, 41L, 42L, 43L, 44L,
45L, 46L, 37L, 36L, 39L, 40L, 41L, 42L, 43L, 44L, 45L, 46L, 37L,
38L), .Label = c("0805-r1", "0805-r10", "0805-r11", "0805-r12",
"0805-r13", "0805-r14", "0805-r2", "0805-r3", "0805-r4", "0805-r5",
"0805-r6", "0805-r7", "0805-r8", "0805-r9", "0805-u1", "0805-u10",
"0805-u2", "0805-u3", "0805-u4", "0805-u5", "0805-u6", "0805-u7",
"0805-u8", "0805-u9", "1005-r1", "1005-r10", "1005-r11", "1005-r2",
"1005-r3", "1005-r4", "1005-r5", "1005-r6", "1005-r7", "1005-r8",
"1005-r9", "1005-u1", "1005-u10", "1005-u11", "1005-u2", "1005-u3",
"1005-u4", "1005-u5", "1005-u6", "1005-u7", "1005-u8", "1005-u9"
), class = "factor"), date = structure(c(2L, 2L, 2L, 2L, 2L,
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L,
2L, 2L, 2L, 2L, 2L, 2L, 2L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L,
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L), .Label =
c("10/1/05",
"8/29/05"), class = "factor"), Replicate = c(1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L,
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 3L, 3L, 3L,
3L, 3L, 3L, 3L, 3L, 3L, 3L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L
), site = structure(c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L,
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L,
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L), .Label = c("dn", "up"
), class = "factor"), sand.silt = c(20L, 45L, 90L, 21L, 80L,
77L, 30L, 80L, 36L, 9L, 62L, 71L, 20L, 65L, 10L, 70L, 50L, 80L,
90L, 97L, 94L, 82L, 30L, 10L, 65L, 80L, 90L, 70L, 10L, 50L, 60L,
40L, 10L, 45L, 10L, 10L, 15L, 10L, 8L, 35L, 10L, 40L, 10L, 10L,
28L, 5L, 45L, 35L, 2L, 10L, 40L, 2L, 70L, 40L, 20L, 30L, 50L,
60L, 10L, 100L, 98L, 98L, 90L, 87L, 87L, 40L, 97L, 92L, 70L,
50L, 81L, 35L, 70L, 89L, 28L, 28L, 82L, 81L, 33L, 80L, 40L, 40L,
60L, 30L, 5L, 50L, 70L, 75L, 85L, 95L, 93L, 80L, 80L, 60L, 82L,
60L, 5L, 70L, 80L, 40L), gravel = c(8L, 45L, 7L, 5L, 10L, 5L,
35L, 7L, 45L, 60L, 0L, 0L, 5L, 8L, 25L, 0L, 45L, 15L, 0L, 1L,
2L, 5L, 6L, 15L, 10L, 5L, 3L, 10L, 20L, 0L, 20L, 31L, 20L, 35L,
70L, 30L, 60L, 60L, 70L, 50L, 70L, 40L, 50L, 30L, 48L, 85L, 20L,
30L, 20L, 60L, 30L, 8L, 10L, 30L, 30L, 10L, 0L, 0L, 10L, 0L,
0L, 0L, 2L, 8L, 8L, 30L, 0L, 3L, 15L, 29L, 11L, 60L, 15L, 8L,
60L, 25L, 8L, 9L, 42L, 1L, 50L, 40L, 10L, 60L, 60L, 30L, 10L,
10L, 0L, 0L, 0L, 2L, 2L, 0L, 1L, 25L, 10L, 10L, 10L, 50L), cobble = c(5L,
2L, 1L, 5L, 0L, 3L, 10L, 2L, 4L, 3L, 1L, 0L, 3L, 14L, 50L, 0L,
1L, 1L, 0L, 0L, 0L, 2L, 0L, 5L, 0L, 0L, 2L, 5L, 3L, 0L, 0L, 0L,
0L, 0L, 0L, 30L, 5L, 2L, 1L, 0L, 0L, 0L, 5L, 35L, 3L, 0L, 0L,
0L, 40L, 0L, 0L, 5L, 0L, 0L, 10L, 5L, 0L, 0L, 10L, 0L, 0L, 0L,
0L, 1L, 1L, 30L, 0L, 0L, 0L, 10L, 4L, 3L, 2L, 0L, 2L, 0L, 0L,
0L, 20L, 0L, 0L, 0L, 0L, 0L, 20L, 0L, 10L, 0L, 0L, 0L, 0L, 0L,
0L, 0L, 0L, 0L, 10L, 0L, 0L, 0L), boulder.bedrock = c(60L, 0L,
0L, 45L, 0L, 0L, 0L, 0L, 0L, 8L, 10L, 0L, 35L, 5L, 8L, 0L, 0L,
0L, 0L, 0L, 0L, 10L, 60L, 70L, 0L, 0L, 0L, 5L, 55L, 0L, 0L, 0L,
40L, 0L, 0L, 0L, 0L, 15L, 0L, 0L, 10L, 0L, 20L, 10L, 0L, 0L,
0L, 0L, 20L, 0L, 0L, 60L, 0L, 0L, 20L, 0L, 10L, 0L, 50L, 0L,
0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 10L, 0L, 0L, 0L, 0L, 0L, 4L,
0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 5L, 0L, 0L, 5L, 0L, 0L, 0L,
0L, 0L, 0L, 0L, 0L, 75L, 10L, 0L, 0L), fine.root = c(5L, 7L,
0L, 10L, 2L, 6L, 5L, 4L, 3L, 7L, 0L, 0L, 7L, 4L, 6L, 1L, 4L,
2L, 2L, 2L, 3L, 1L, 0L, 1L, 20L, 5L, 3L, 5L, 10L, 2L, 0L, 6L,
10L, 10L, 15L, 0L, 0L, 5L, 15L, 0L, 10L, 10L, 0L, 5L, 8L, 5L,
0L, 20L, 0L, 8L, 0L, 0L, 7L, 0L, 0L, 15L, 0L, 0L, 0L, 0L, 2L,
0L, 2L, 0L, 2L, 0L, 3L, 3L, 4L, 5L, 0L, 0L, 8L, 2L, 2L, 3L, 0L,
1L, 0L, 10L, 0L, 0L, 0L, 0L, 0L, 12L, 0L, 0L, 10L, 0L, 0L, 5L,
12L, 0L, 0L, 0L, 0L, 10L, 5L, 5L), course.root = c(0L, 0L, 0L,
0L, 0L, 0L, 0L, 3L, 2L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L,

[R] Function Surv and interpretation

2009-05-15 Thread K F Pearce



Dear everyone,

My question involves the use of the survival object.

We can have 

Surv(time,time2,event, type=, origin = 0)   
(1)

As detailed on p.65 of:

http://cran.r-project.org/web/packages/survival/survival.pdf


My data (used in my study) is 'right censored'  i.e. my variable corresponding 
to 'event' indicates whether a person is alive (0) or dead (1) at date last 
seen  and my 'time'  indicates time from transplant to date of last contact 
(where this is time from transplant to death if person has died or time from 
transplant to date last seen if person is still alive).

Now I am using function, rcorr.cens

http://lib.stat.cmu.edu/S/Harrell/help/Hmisc/html/rcorr.cens.html

This function involves use of Surv.

Now here is a section of my syntax:


 time <- data$ovsrecod
 x1 <- data$RMY.GROUPS
 death <- data$death
 rcorr.cens(x1,Surv(time,death),outx=FALSE)
 (2)

As you can see, I have entered Surv(time,death)...this works (and complies with 
the example given in R for rcorr.cens) and all seems to be well...however, 
bearing in mind that in (1) we have:

Surv(time,time2,event, type=, origin = 0)

...how does R know that 'death' in *my* syntax (2) is the 'event'...i.e. how 
does it know that time2 is skipped in my analysis?  I am a bit perplexed!  The 
R documentation for Surv says that  Surv(time,event) is a 'typical usage' as is 
Surv(time,time2,event, type=, origin = 0)...but how does it know when we are 
using the former and not the latter?  I have tried entering:  
rcorr.cens(x1,Surv(time,event=death),outx=FALSE) but it does not like it saying 
that Error in Surv(time, event = death) : argument time2 is missing, with no 
default

I hope that this makes sense!

Thank you so much for your advice on this ...it's much appreciated, Kim
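
A small sketch of what seems to be going on, assuming a current release of
the survival package. With only two arguments, the second positional
argument is matched to time2, and Surv() then appears to treat it as the
status indicator because event is missing and the type defaults to right
censoring. This is a reading of the help page and of the observed behaviour,
not a statement from the package authors. Newer releases also seem to accept
the named form, unlike the version that produced the error quoted above. The
vectors below are made up.

```r
library(survival)

# Made-up follow-up times and death indicators (1 = dead, 0 = censored)
time  <- c(5, 8, 12, 3, 9)
death <- c(1, 0, 1, 1, 0)

s_pos   <- Surv(time, death)          # second positional argument
s_named <- Surv(time, event = death)  # named form; may fail on old versions

print(s_pos)                          # censored times are shown with a "+"
all(unclass(s_pos) == unclass(s_named))
```

If both forms give the same object, the positional call Surv(time, death) in
(2) is already the right-censored analysis that was intended.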

Re: [R] replace % with \%

2009-05-15 Thread Ted Harding

On 15-May-09 14:46:27, Liviu Andronic wrote:
 Dear all,
 I'm trying to gsub() % with \% with no obvious success.
 temp1 <- c("mean", "sd",   "0%",   "25%",  "50%",  "75%",  "100%")
 temp1
 [1] "mean" "sd"   "0%"   "25%"  "50%"  "75%"  "100%"
 gsub("%", "\%", temp1, fixed=TRUE)
 [1] "mean" "sd"   "0%"   "25%"  "50%"  "75%"  "100%"
 Warning messages:
 1: '\%' is an unrecognized escape in a character string
 2: unrecognized escape removed from "\%"
 
 I am not quite sure on how to deal with this error message. I tried
 the following
 gsub("%", "\\%", temp1, fixed=TRUE)
 [1] "mean"   "sd" "0\\%"   "25\\%"  "50\\%"  "75\\%"  "100\\%"
 
 Could anyone suggest how to obtain output similar to:
 [1] "mean"   "sd" "0\%"   "25\%"  "50\%"  "75\%"  "100\%"
 
 Thank you,
 Liviu

1: The double escape "\\" is the correct way to do it. If you
   give "\%" to gsub, R's parser will try to interpret "\%" as a
   special character (like "\n" for newline), and there is none such
   (as it tells you). On the other hand, "\\" tells R to
   interpret "\" (normally used as the Escape character) in a
   special way (namely as a literal "\").

2: The output
   "mean"   "sd" "0\\%"   "25\\%"  "50\\%"  "75\\%"  "100\\%"
   from gsub("%", "\\%", temp1, fixed=TRUE) is one of those cases
   where R displays something different from what is really there!
   In other words, "0\\%" for example is the character string you
   would have to enter in order for R to store 0\%. You can see what
   is really there using cat:

 cat(gsub("%", "\\%", temp1, fixed=TRUE))
 # mean sd 0\% 25\% 50\% 75\% 100\%

   which, of course, is what you wanted. You can see in other ways
   that what is stored is what you wanted -- for instance:

 temp2 <- gsub("%", "\\%", temp1, fixed=TRUE)
 write.csv(temp2,"gsub.csv")

   and then, if you look into gsub.csv outside of R, you will see:

 "","x"
 "1","mean"
 "2","sd"
 "3","0\%"
 "4","25\%"
 "5","50\%"
 "6","75\%"
 "7","100\%"

   which, again, is what you wanted.
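
The difference between what is stored and what print() displays can also be
checked by counting characters. This is a short sketch using the same temp1
vector; writeLines() is used here as an alternative to cat().

```r
temp1 <- c("mean", "sd", "0%", "25%", "50%", "75%", "100%")
temp2 <- gsub("%", "\\%", temp1, fixed = TRUE)

nchar("\\%")        # 2: one backslash plus the percent sign
nchar(temp2[3])     # 3: the characters 0, \ and %
print(temp2[3])     # print() shows the escaped form, with a doubled backslash
writeLines(temp2)   # writeLines() shows the stored characters, one per line
```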

Hoping this helps,
Ted.


E-Mail: (Ted Harding) ted.hard...@manchester.ac.uk
Fax-to-email: +44 (0)870 094 0861
Date: 15-May-09   Time: 16:32:13
-- XFMail --

[R] Printing to screen a matrix or data.frame in one chunk (not splitting columns)

2009-05-15 Thread Adrián Cortés

Hello,

I saw this nice trick I want to replicate but I lost the source and I hope
one of you can point me to the solution.  My problem is that I don't know
the correct words to query this.

When I print to screen a matrix or data.frame the columns are split and
printed below the previous ones; even though I have plenty of screen left.

E.g.,

 my_matrix = matrix(runif(30),nrow=3,ncol=10)
 my_matrix
          [,1]      [,2]      [,3]       [,4]      [,5]      [,6]        [,7]
[1,] 0.4979305 0.1155717 0.4484069 0.29986049 0.5427566 0.4324351 0.269171456
[2,] 0.8405987 0.3605237 0.6615507 0.75305248 0.8569482 0.3401004 0.192526423
[3,] 0.5608779 0.3953941 0.9995035 0.03141064 0.7985053 0.4903582 0.000490054
          [,8]      [,9]      [,10]
[1,] 0.1402751 0.2852381 0.98816751
[2,] 0.8337806 0.7322920 0.17505541
[3,] 0.5414113 0.4668012 0.04420137

So there is a way to resize the space for printing so that everything is
printed in one chunk.

Thanks in advance,
Adrian
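
The setting that controls this is most likely options(width = ...), which
sets how many characters R assumes per output line. A minimal sketch, with
the widths 60 and 200 chosen arbitrarily for the comparison:

```r
set.seed(1)
my_matrix <- matrix(runif(30), nrow = 3, ncol = 10)

old <- options(width = 60)            # narrow: columns wrap into several chunks
narrow <- capture.output(print(my_matrix))

options(width = 200)                  # wide: all ten columns fit on one line
wide <- capture.output(print(my_matrix))
print(my_matrix)

options(old)                          # restore the previous setting
c(narrow = length(narrow), wide = length(wide))
```

The width should not exceed what the terminal or console window can actually
show, otherwise the terminal itself will wrap the long lines.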

[R] error writing to connection

2009-05-15 Thread Stefo Ratino

Hello,
 
I am using: save(data,file="D:/mayData.RData"), and I get the following error:
 
Error in save(data, file = "D:/mayData.RData") : error writing to connection
 
Thank you very much in advance,
Stefo


  
[R] Plotting question re. cuminc

2009-05-15 Thread K F Pearce

Hello everyone,

(This is my second question posted today on the R list).

I am carrying out a competing risks analysis using the cuminc function...this 
takes the form:

cuminc(ftime,fstatus,group)

In my study, fstatus has 3 different causes of failure (1,2,3) there are also 
censored cases (0).  group has two levels (0 and 1).

I therefore have 6 different cumulative incidence curves:

cause 1, group=0; cause 1 group=1
cause 2, group=0; cause 2 group=1
cause 3, group=0; cause 3 group=1

If I type the following commands:

 xx <- cuminc(ftime,fstatus,group)
 plot(xx,lty=1,color=1:6)

I end up with the 6 curves plotted on the same graph.

Is there a way that I can plot a selection of these curves? (say only curves 
for cause 1, group=0 and cause 1 group=1).

Thank you so much,
Kind Regards,
Kim


Dr Kim Pearce CStat
Industrial Statistics Research Unit (ISRU)
School of Mathematics and Statistics
Herschel Building
University of Newcastle
Newcastle upon Tyne
United Kingdom
NE1 7RU

Tel.   0044 (0)191 222 6244 (direct)
Fax.   0044 (0)191 222 8020
Email: k.f.pea...@ncl.ac.uk

Re: [R] can you tell what .Random.seed was?

2009-05-15 Thread Stavros Macrakis

On Thu, May 14, 2009 at 3:36 PM, G. Jay Kerns gke...@ysu.edu wrote:
 set.seed(something)
 x <- rnorm(100)
 y <- runif(500)
 # bunch of other stuff
...
 Now, I give you a copy of my script.R (with the set.seed statement
 removed, of course) together with the .RData file that was generated
 by the save.image() command.
...
 1) can you tell me what my original set.seed() value was?...
 2) is it possible *in principle* to figure out what set.seed was,
 given the above?

Set.seed takes an integer argument, that is, 2^32-1 distinct values
(cf NA_integer_), so the very simplest approach, brute-force search,
has a hope of working:

whatseed <- function (v)  {
   i <- as.integer(-2^31+1); max <- as.integer(2^31-1)
   while (i<max) { set.seed(i); if (runif(1)==v) return(i); i <- i+1 }
}

 (OK, being able to figure it out in 2*10^68 years
 doesn't count, but within a couple months is acceptable.)

set.seed(-2^31+10^5)
system.time(whatseed(runif(1)))
   user  system elapsed
   1.53    0.00    1.53

2^32*(1.53/10^5)/3600
= 18.25
18 hours

 3) does the answer change if there is a
 remove(.Random.seed)
 command right before the save.image() command?

Depending on which RNG algorithm (RNGkind) you use, there may be
cryptographic techniques that are more efficient than brute-force
search, especially if the full internal state (.Random.seed) is
preserved.

This all assumes that the seed is set *only* with set.seed.  If
.Random.seed is modified directly, there are many more possibilities
for most of the RNGs.
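
A bounded variant of the brute-force idea, as a sketch only: it searches a
small window of seeds instead of the full integer range. The function name
whatseed_range and the window 0..100 are invented for the example, and it
assumes that v was the first runif() draw after set.seed() with the default
generator.

```r
# Search seeds lo..hi for the one whose first runif() draw equals v.
whatseed_range <- function(v, lo, hi) {
  for (i in seq.int(lo, hi)) {
    set.seed(i)
    if (runif(1) == v) return(i)
  }
  NA_integer_                          # no seed in the range reproduces v
}

set.seed(42)
v <- runif(1)
whatseed_range(v, 0L, 100L)
```

Splitting the full range into such windows would also let the search run in
parallel, which shortens the wall-clock time accordingly.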

 -s

[R] Using column length in plot gives error

2009-05-15 Thread MikSmith


Hi

I'm trying to write a generic script for processing some data which finishes
off with some plots. Given I'm never sure how many columns will be in my
dataframe I wanted to use the following 

plot(spectra.wavelength, cormat, type = "l", ylim=c(-1,1),
xlab="Wavelength (nm)", ylab="Correlation")

however even if I specify type="l" it appears to plot as points (right hand
plot). If I specify a range such as 

plot(650:700, cormat, type = "l", ylim=c(-1,1), xlab="Wavelength (nm)",
ylab="Correlation")

it looks good (left hand plot). If I try something like:

plot(spectra.wavelength[1]:spectra.wavelength[length(spectra.wavelength)],
cormat, type = "l", ylim=c(-1,1), xlab="Wavelength (nm)",
ylab="Correlation")

it fails with "variable lengths differ" and when I look at
spectra.wavelength[1] it gives me the value but then states there are 53
levels.

What does this mean and how can I get the result I want??!
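
The mention of "levels" suggests that spectra.wavelength is a factor rather
than a numeric vector, which typically happens when the column was read in as
text. A minimal sketch of the usual conversion, with made-up wavelengths and
a made-up cormat standing in for the real data:

```r
# Made-up wavelengths stored as text, hence a factor
spectra.wavelength <- factor(as.character(650:700))
cormat <- sin(seq(0, pi, length.out = length(spectra.wavelength)))

is.factor(spectra.wavelength)          # TRUE: this is why "levels" are reported

# Convert via character; as.numeric() alone would return the level codes
wl <- as.numeric(as.character(spectra.wavelength))

pdf(NULL)                              # null device; drop this line to draw on screen
plot(wl, cormat, type = "l", ylim = c(-1, 1),
     xlab = "Wavelength (nm)", ylab = "Correlation")
dev.off()
```

With a factor on the x axis, plot() draws one box per level and ignores
type, which would explain the point-like appearance. The ":" operator applied
to two factors builds their interaction instead of a numeric range, which
would explain the length mismatch.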

many thanks

mike http://www.nabble.com/file/p23562717/1.pdf 1.pdf 
-- 
View this message in context: 
http://www.nabble.com/Using-column-length-in-plot-gives-error-tp23562717p23562717.html
Sent from the R help mailing list archive at Nabble.com.

Re: [R] Simulation

2009-05-15 Thread Greg Snow

Another possibility (maybe more readable, gives the option of a list, probably 
not faster):

Replicate(1000, rexp(15,1) )

-- 
Gregory (Greg) L. Snow Ph.D.
Statistical Data Center
Intermountain Healthcare
greg.s...@imail.org
801.408.8111


 -Original Message-
 From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-
 project.org] On Behalf Of Ben Bolker
 Sent: Friday, May 15, 2009 6:37 AM
 To: r-help@r-project.org
 Subject: Re: [R] Simulation
 
 
 
 On Fri, 15 May 2009 19:17:37 +1000 Kon Knafelman konk2...@hotmail.com
 wrote:
 
 KK I have the same problem as the initial one, except I need 1000
 KK samples of size 15, and my distribution is Exp(1). I've adjusted
 KK some of the loop formulas for my n=15, but I'm unsure how to proceed
 KK in the quickest way.
 KK Can someone please help?
 
 
   Taking a guess:
 
 matrix(rexp(15000,1),ncol=15)
 
 ?
 
Re: [R] replace % with \%

2009-05-15 Thread Wacek Kusnierczyk

Marc Schwartz wrote:

 On May 15, 2009, at 9:46 AM, Liviu Andronic wrote:

 Dear all,
 I'm trying to gsub() % with \% with no obvious success.
 temp1 <- c("mean", "sd",   "0%",   "25%",  "50%",  "75%",  "100%")
 temp1
 [1] "mean" "sd"   "0%"   "25%"  "50%"  "75%"  "100%"
 gsub("%", "\%", temp1, fixed=TRUE)
 [1] "mean" "sd"   "0%"   "25%"  "50%"  "75%"  "100%"
 Warning messages:
 1: '\%' is an unrecognized escape in a character string
 2: unrecognized escape removed from "\%"

 I am not quite sure on how to deal with this error message. I tried
 the following
 gsub("%", "\\%", temp1, fixed=TRUE)
 [1] "mean"   "sd" "0\\%"   "25\\%"  "50\\%"  "75\\%"  "100\\%"

 Could anyone suggest how to obtain output similar to:
 [1] "mean"   "sd" "0\%"   "25\%"  "50\%"  "75\%"  "100\%"

 Thank you,
 Liviu

 Presuming that you might want to output the results to a TeX file for
 subsequent processing, where the '%' would otherwise be a comment
 character, the key is not to get a single '\', but a double '\\', so
 that you then get a single '\' on output:

 temp1 <- c("mean", "sd",   "0%",   "25%",  "50%",  "75%",  "100%")

 temp2 <- gsub("%", "\\\\%", temp1)

  temp2
 [1] "mean"   "sd" "0\\%"   "25\\%"  "50\\%"  "75\\%"  "100\\%"

  cat(temp2)
 mean sd 0\% 25\% 50\% 75\% 100\%


 Remember that the single '\' is an escape character, which needs to be
 doubled.


this confusing backslash each backslashing backslash scheme is
idiosyncratic to r;  in many cases where one'd otherwise use a single
backslash in a regex or a replacement string in another programming
language, in r you have to double it.

and actually, in this case you don't need four backslashes.  the
original poster has actually had a valid solution, but he wasn't aware
that the string "\\%", returned (not printed) by gsub includes two, not
three characters --  thus only one backslash, not two:

cat(
gsub(
pattern='%',
replacement='\\%',
x='foo % bar',
fixed=TRUE))
# foo \% bar

of course, if the pattern cannot be fixed, i.e., fixed=TRUE is less than
helpful, you'd need four backslashes in the replacement -- a cute,
though somewhat disturbing, weirdo.
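
The two variants can be compared directly. In this sketch the first call
takes the replacement literally (fixed=TRUE), while the second treats it as
a regex replacement, where a backslash is itself an escape character and so
has to be doubled once more. The variable names a and b are just for the
example.

```r
x <- "foo % bar"

a <- gsub("%", "\\%",   x, fixed = TRUE)  # literal replacement: the characters \ and %
b <- gsub("%", "\\\\%", x)                # regex replacement: \\ stands for one backslash

identical(a, b)
writeLines(a)
```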

vQ

Re: [R] Optimization algorithm to be applied to S4 classes - specifically sparse matrices

2009-05-15 Thread spencerg


Dear Avraham:


  For problems with many parameters to estimate, I highly recommend
Pinheiro and Bates (2000) Mixed-Effects Models in S and S-Plus
(Springer).  This book includes numerous examples showing how to use the
nlme package.  The value of this book is greatly enhanced by the
availability of script files named "ch01.R", "ch02.R", ..., "ch08.R"
showing how to work virtually all the examples in the book.  These
script files are available in your local installation of R.  To find
them, enter the following at a command prompt in R:



   system.file('scripts', package='nlme')


  Hope this helps.
  Spencer Graves


##
Dear Doug, et al.:


 What would you recommend for analyzing a longitudinal abundance
survey of 22 species, when the species were not selected at random?  A
prominent scientist tried to tell me that mixed-effects modeling is
inappropriate in that case because the species were selected
purposefully not at random.


 My response is that even in that case, one should still use
mixed-effects modeling, because it will tend to produce more appropriate
estimates for the deviations of individual species from the average of
all species -- potentially much lower variance with slight bias -- than
naive ordinary least squares.  The estimated variance components will
not represent the between-species variance for the actual population of
all hypothetical species of the particular type, but will represent the
between-species variability in a hypothetical population from which the
selected species might be considered a random sample.


 Best Wishes,
 Spencer Graves
p.s.  I appreciate very much Doug's comment on this.  I thought about
adding something like that to my reply but didn't feel I could afford
the time then.


Douglas Bates wrote:

On Wed, May 13, 2009 at 5:21 PM,  avraham.ad...@guycarp.com wrote:
  

Hello.

I am trying to optimize a set of parameters using /optim/ in which the
actual function to be minimized contains matrix multiplication and is of
the form:

SUM ((A%*%X - B)^2)

where A is a matrix and X and B are vectors, with X as parameter vector.



As Spencer Graves pointed out, what you are describing here is a
linear least squares problem, which has a direct (i.e. non-iterative)
solution.  A comparison of the speed of various ways of solving such a
system is given in one of the vignettes in the Matrix package.

  

This has worked well so far. Recently, I was given a data set A of size
360440 x 1173, which could not be handled as a normal matrix. I brought it
into 'R' as a sparse matrix (dgCMatrix - using sparseMatrix from the Matrix
package), and the formulæ and gradient work, but /optim/ returns an error
of the form "no method for coercing this S4 class to a vector".



If you just want the least squares solution X then

X <- solve(crossprod(A), crossprod(A, B))

will likely be the fastest method where A is the sparse matrix.

I do feel obligated to point out that the least squares solution for
such large systems is rarely a sensible solution to the underlying
problem.  If you have over 1000 columns in A and it is very sparse
then likely at least parts of A are based on indicator columns for a
categorical variable.  In such situations a model with random effects
for the category is often preferable to the fixed-effects model you
are fitting.


  

After briefly looking into methods and classes, I realize I am in way over
my head. Is there any way I could use /optim/ or another optimization
algorithm, on sparse matrices?

Thank you very much,

--Avraham Adler

Re: [R] Simulation

2009-05-15 Thread Ben Bolker

Greg Snow wrote:
 Another possibility (maybe more readable, gives the option of a list, 
 probably not faster):
 
 Replicate(1000, rexp(15,1) )
 

  I think that should be replicate

  The matrix form is quite a bit faster, but don't know if that will
matter -- times below are for doing this task (1000 x 15 replicates)
1000 times ...

 system.time(replicate(1000,replicate(1000,rexp(15,1))))
   user  system elapsed
 12.689   0.220  12.985
 system.time(replicate(1000,matrix(rexp(15000,1),ncol=15)))
   user  system elapsed
  2.512   0.452   2.976
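
A small sketch of what the two forms timed above return, since their shapes differ. This is illustrative only; the object names r1 and r2 are hypothetical.

```r
set.seed(1)

# replicate() simplifies to a matrix with one column per replicate: 15 x 1000.
r1 <- replicate(1000, rexp(15, 1))

# The single-call form draws all 15000 values at once: 1000 x 15 as written.
r2 <- matrix(rexp(15000, 1), ncol = 15)

dim(r1)
dim(r2)

# One mean per simulated sample: columns of r1, rows of r2.
length(colMeans(r1))
length(rowMeans(r2))
```

So downstream code has to work on columns in the first case and on rows in the second.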


-- 
Ben Bolker
Associate professor, Biology Dep't, Univ. of Florida
bol...@ufl.edu / www.zoology.ufl.edu/bolker
GPG key: www.zoology.ufl.edu/bolker/benbolker-publickey.asc




Re: [R] Optimization algorithm to be applied to S4 classes - specifically sparse matrices

2009-05-15 Thread spencerg

Dear Doug, et al.: 



 What would you recommend for analyzing a longitudinal abundance 
survey of 22 species, when the species were not selected at random?  A 
prominent scientist tried to tell me that mixed-effects modeling is 
inappropriate in that case because the species were selected 
purposefully not at random. 



 My response is that even in that case, one should still use 
mixed-effects modeling, because it will tend to produce more appropriate 
estimates for the deviations of individual species from the average of 
all species -- potentially much lower variance with slight bias -- than 
naive ordinary least squares.  The estimated variance components will 
not represent the between-species variance for the actual population of 
all hypothetical species of the particular type, but will represent the 
between-species variability in a hypothetical population from which the 
selected species might be considered a random sample. 



 Best Wishes,
 Spencer Graves
p.s.  I appreciate very much Doug's comment on this.  I thought about 
adding something like that to my reply but didn't feel I could afford 
the time then. 



Douglas Bates wrote:

On Wed, May 13, 2009 at 5:21 PM,  avraham.ad...@guycarp.com wrote:
  

Hello.

I am trying to optimize a set of parameters using /optim/ in which the
actual function to be minimized contains matrix multiplication and is of
the form:

SUM ((A%*%X - B)^2)

where A is a matrix and X and B are vectors, with X as parameter vector.



As Spencer Graves pointed out, what you are describing here is a
linear least squares problem, which has a direct (i.e. non-iterative)
solution.  A comparison of the speed of various ways of solving such a
system is given in one of the vignettes in the Matrix package.

  

This has worked well so far. Recently, I was given a data set A of size
360440 x 1173, which could not be handled as a normal matrix. I brought it
into 'R' as a sparse matrix (dgCMatrix - using sparseMatrix from the Matrix
package), and the formulæ and gradient work, but /optim/ returns an error
of the form "no method for coercing this S4 class to a vector".



If you just want the least squares solution X then

X <- solve(crossprod(A), crossprod(A, B))

will likely be the fastest method where A is the sparse matrix.

I do feel obligated to point out that the least squares solution for
such large systems is rarely a sensible solution to the underlying
problem.  If you have over 1000 columns in A and it is very sparse
then likely at least parts of A are based on indicator columns for a
categorical variable.  In such situations a model with random effects
for the category is often preferable to the fixed-effects model you
are fitting.


  

After briefly looking into methods and classes, I realize I am in way over
my head. Is there any way I could use /optim/ or another optimization
algorithm, on sparse matrices?

Thank you very much,

--Avraham Adler

Re: [R] Printing to screen a matrix or data.frame in one chunk (not splitting columns)

2009-05-15 Thread Martin Maechler

 AC == Adrián Cortés adrc...@gmail.com
 on Fri, 15 May 2009 08:58:04 -0700 writes:

AC Hello,
AC I saw this nice trick I want to replicate but I lost the source and I hope
AC one of you can point me to the solution.  My problem is that I don't know
AC the correct words to query this.

AC When I print to screen a matrix or data.frame the columns are split and
AC printed below the previous ones; even though I have plenty of screen left.

AC E.g.,

 my_matrix = matrix(runif(30),nrow=3,ncol=10)
 my_matrix
AC           [,1]      [,2]      [,3]       [,4]      [,5]      [,6]        [,7]
AC [1,] 0.4979305 0.1155717 0.4484069 0.29986049 0.5427566 0.4324351 0.269171456
AC [2,] 0.8405987 0.3605237 0.6615507 0.75305248 0.8569482 0.3401004 0.192526423
AC [3,] 0.5608779 0.3953941 0.9995035 0.03141064 0.7985053 0.4903582 0.000490054
AC           [,8]      [,9]      [,10]
AC [1,] 0.1402751 0.2852381 0.98816751
AC [2,] 0.8337806 0.7322920 0.17505541
AC [3,] 0.5414113 0.4668012 0.04420137

AC So there is a way to resize the space for printing so that everything is
AC printed in one chunk.

options(width = 100) # or whatever.

---

For ESS users, 
this option is set to the correct value, when R is started.
If later, the emacs window is resized,
you can automatically set the width to the current buffer
(window) size, by

   M-x ess-execute-screen-options

or, for everyone here who has

  (add-hook 'ess-mode-hook 'ess-add-MM-keys)
  (add-hook 'inferior-ess-mode-hook 'ess-add-MM-keys)

in their ~/.emacs equivalent, it's a simple  
C-c w ('w' for 'width') to adapt the R option to the emacs
window size.

Martin Maechler, 
ETH Zurich


Re: [R] Printing to screen a matrix or data.frame in one chunk (not splitting columns)

2009-05-15 Thread Marc Schwartz


On May 15, 2009, at 10:58 AM, Adrián Cortés wrote:


Hello,

I saw this nice trick I want to replicate but I lost the source and I hope
one of you can point me to the solution.  My problem is that I don't know
the correct words to query this.

When I print to screen a matrix or data.frame the columns are split and
printed below the previous ones; even though I have plenty of screen left.

E.g.,

my_matrix = matrix(runif(30),nrow=3,ncol=10)
my_matrix
          [,1]      [,2]      [,3]       [,4]      [,5]      [,6]        [,7]
[1,] 0.4979305 0.1155717 0.4484069 0.29986049 0.5427566 0.4324351 0.269171456
[2,] 0.8405987 0.3605237 0.6615507 0.75305248 0.8569482 0.3401004 0.192526423
[3,] 0.5608779 0.3953941 0.9995035 0.03141064 0.7985053 0.4903582 0.000490054
          [,8]      [,9]      [,10]
[1,] 0.1402751 0.2852381 0.98816751
[2,] 0.8337806 0.7322920 0.17505541
[3,] 0.5414113 0.4668012 0.04420137

So there is a way to resize the space for printing so that everything is
printed in one chunk.

Thanks in advance,
Adrian




See ?options and take note of 'width' which defaults to 80. Increase  
that value to a number that suits your requirements.
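
A short sketch of the 'width' option in use. The value 200 is an arbitrary choice for illustration, and the names m, old and wide are hypothetical.

```r
set.seed(1)
m <- matrix(runif(30), nrow = 3, ncol = 10)

old <- options(width = 200)          # options() returns the previous settings
wide <- capture.output(print(m))     # header line plus 3 rows, no wrapping
options(old)                         # restore the previous width

length(wide)
```

Saving the return value of options() makes it easy to go back to the earlier width afterwards.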


HTH,

Marc Schwartz


Re: [R] Optimization algorithm to be applied to S4 classes - specifically sparse matrices

2009-05-15 Thread Avraham . Adler


Thank you both very much for your replies. What makes this a little less
straightforward, at least to me, is that there needs to be constraints on
the solved parameters. They most certainly need to be positive and there
may be an upper limit as well. The true best linear fit would have negative
entries for some of the parameters.


Originally, I was using the L-BFGS-B method of optim which both allows for
box constraints and has the limited memory advantage useful when dealing
with large matrices. Having the analytic gradient, I thought of using BFGS
and having a statement in the function returning Inf for any parameters
outside the allowable constraints.


I do /not/ know how to apply parameter constraints when using linear
models. I looked around at the various manuals and help features, and
outside of package glmc I did not find anything I could use. Perhaps I
overlooked something. If there is something I missed, please let me know.


If there truly is no standard optimization routine that works on sparse
matrices, my next step may be to use the normal equations to shrink the
size of the matrix, recast it as a dense matrix (it would only be 1173x1173
then) and then hand it off to optim.
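
A rough sketch of how that last plan might look, under stated assumptions: the data are simulated, the bounds 0 and 10 are placeholders, and the names AtA, AtB, BtB, obj and grad are hypothetical. It relies on the expansion sum((A X - B)^2) = X'(A'A)X - 2 X'(A'B) + B'B, so only the small p x p matrix A'A is ever made dense.

```r
library(Matrix)

set.seed(42)
n <- 500; p <- 8; nnz <- 1000

# Hypothetical sparse matrix and response (not the poster's data).
A <- sparseMatrix(i = sample(n, nnz, replace = TRUE),
                  j = sample(p, nnz, replace = TRUE),
                  x = rnorm(nnz), dims = c(n, p))
x_true <- c(0, 0.5, 1, 0, 2, 0.1, 0, 1.5)
B <- drop(as.matrix(A %*% x_true)) + rnorm(n, sd = 0.05)

# Reduce once to p x p and p x 1 quantities; these are small and dense.
AtA <- as.matrix(crossprod(A))
AtB <- drop(as.matrix(crossprod(A, B)))
BtB <- sum(B^2)

# Residual sum of squares and its analytic gradient in terms of AtA, AtB.
obj  <- function(x) drop(crossprod(x, AtA %*% x)) - 2 * sum(x * AtB) + BtB
grad <- function(x) 2 * drop(AtA %*% x - AtB)

# Box constraints via L-BFGS-B (bounds here are assumed for illustration).
fit <- optim(par = rep(1, p), fn = obj, gr = grad,
             method = "L-BFGS-B", lower = 0, upper = 10)
round(fit$par, 3)
```

Forming A'A squares the condition number of the problem, so this is a convenience for moderately well-conditioned data rather than a general recommendation.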


Any further suggestions or corrections would be very much appreciated.


Thank you,


--Avraham Adler


   
 From:    Douglas Bates ba...@stat.wisc.edu (sent by dmba...@gmail.com)
 To:      avraham.ad...@guycarp.com
 cc:      r-help@r-project.org
 Date:    05/15/2009 11:57 AM
 Subject: Re: [R] Optimization algorithm to be applied to S4 classes -
          specifically sparse matrices




On Wed, May 13, 2009 at 5:21 PM,  avraham.ad...@guycarp.com wrote:

 Hello.

 I am trying to optimize a set of parameters using /optim/ in which the
 actual function to be minimized contains matrix multiplication and is of
 the form:

 SUM ((A%*%X - B)^2)

 where A is a matrix and X and B are vectors, with X as parameter vector.

As Spencer Graves pointed out, what you are describing here is a
linear least squares problem, which has a direct (i.e. non-iterative)
solution.  A comparison of the speed of various ways of solving such a
system is given in one of the vignettes in the Matrix package.

 This has worked well so far. Recently, I was given a data set A of size
 360440 x 1173, which could not be handled as a normal matrix. I brought it
 into 'R' as a sparse matrix (dgCMatrix - using sparseMatrix from the Matrix
 package), and the formulæ and gradient work, but /optim/ returns an error
 of the form "no method for coercing this S4 class to a vector".

If you just want the least squares solution X then

X <- solve(crossprod(A), crossprod(A, B))

will likely be the fastest method where A is the sparse matrix.

I do feel obligated to point out that the least squares solution for
such large systems is rarely a sensible solution to the underlying
problem.  If you have over 1000 columns in A and it is very sparse
then likely at least parts of A are based on indicator columns for a
categorical variable.  In such situations a model with random effects
for the category is often preferable to the fixed-effects model you
are fitting.


 After briefly looking into methods and classes, I realize I am in way over
 my head. Is there any way I could use /optim/ or another optimization
 algorithm, on sparse matrices?

 Thank you very much,

 --Avraham Adler

Re: [R] error writing to connection

2009-05-15 Thread Marc Schwartz


On May 15, 2009, at 8:22 AM, Stefo Ratino wrote:


Hello,

I am using: save(data,file="D:/mayData.RData"), and I have the following error:

Error in save(data, file = "D:/mayData.RData") : error writing to connection


Thank you very much in advance,
Stefo


Presuming that drive 'D' exists and that you have permission to write  
to it, it is possible that there is insufficient room on that drive to  
save 'data'.


Check on the above.
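
A minimal sketch of checking these things from within R. The target path under tempdir() is a stand-in for the real location, and the names target, dir_writable and saved are hypothetical; file.access() can be unreliable on some Windows file systems, so the actual save attempt is the more trustworthy test.

```r
target <- file.path(tempdir(), "mayData.RData")   # hypothetical target path

# file.access() returns 0 when the requested access (mode = 2: write) is allowed.
dir_writable <- unname(file.access(dirname(target), mode = 2)) == 0

data <- data.frame(x = 1:3)

# Attempt the save and report the underlying message instead of stopping.
saved <- tryCatch({
  save(data, file = target)
  TRUE
}, error = function(e) {
  message("save() failed: ", conditionMessage(e))
  FALSE
})

dir_writable
saved
```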

HTH,

Marc Schwartz


Re: [R] error writing to connection

2009-05-15 Thread Duncan Murdoch


On 5/15/2009 9:22 AM, Stefo Ratino wrote:

Hello,
 
I am using: save(data,file="D:/mayData.RData"), and I have the following error:
 
Error in save(data, file = "D:/mayData.RData") : error writing to connection


Do you have permission to create a file there?  Try it from outside R.

Duncan Murdoch

 
Thank you very much in advance,

Stefo


  

Re: [R] memory usage grows too fast

2009-05-15 Thread Ping-Hsun Hsieh

Thanks to Peter, William, and Hadley for their help.
Your code is much more concise than mine.  :P
 
William's and Hadley's suggestions are the same. Here is their code:

f <- function(dataMatrix) rowMeans(dataMatrix=="02")

And Peter's code is the following:

apply(yourMatrix, 1, function(x) length(x[x==yourPattern]))/ncol(yourMatrix)


In terms of running time, the first one ran faster than the latter on my
dataset (2.5 mins vs. 6.4 mins).
The memory consumption of the first one, however, is much higher than that of
the latter (8G vs. ~3G).

Any thoughts? My guess is that rowMeans created extra copies to perform its
calculation, but I am not sure.
I am also interested in understanding ways to handle memory issues. I hope
someone can shed light on this for me. :)

Best,
Mike

-Original Message-
From: Peter Alspach [mailto:palsp...@hortresearch.co.nz] 
Sent: Thursday, May 14, 2009 4:47 PM
To: Ping-Hsun Hsieh
Subject: RE: [R] memory usage grows too fast

Tena koe Mike

If I understand you correctly, you should be able to use something like:

apply(yourMatrix, 1, function(x)
length(x[x==yourPattern]))/ncol(yourMatrix)

I see you've divided by nrow(yourMatrix) so perhaps I am missing
something.

HTH ...

Peter Alspach

 

 -Original Message-
 From: r-help-boun...@r-project.org 
 [mailto:r-help-boun...@r-project.org] On Behalf Of Ping-Hsun Hsieh
 Sent: Friday, 15 May 2009 11:22 a.m.
 To: r-help@r-project.org
 Subject: [R] memory usage grows too fast
 
 Hi All,
 
 I have a 1000x100 matrix. 
 The calculation I would like to do is actually very simple: 
 for each row, calculate the frequency of a given pattern. For 
 example, a toy dataset is as follows.
 
 Col1  Col2  Col3  Col4
 01    02    02    00    = Freq of 02 is 0.5
 02    02    02    01    = Freq of 02 is 0.75
 00    02    01    01    ...
 
 My code is quite simple as the following to find the pattern 02.
 
 OccurrenceRate_Fun<-function(dataMatrix)
 {
   tmp<-NULL
   tmpMatrix<-apply(dataMatrix,1,match,"02")
   for ( i in 1: ncol(tmpMatrix))
   {
     tmpRate<-table(tmpMatrix[,i])[[1]]/ nrow(tmpMatrix)
     tmp<-c(tmp,tmpRate)
   }
   rm(tmpMatrix)
   rm(tmpRate)
   return(tmp)
   gc()
 }
 
 The problem is the memory usage grows very fast and hard to 
 be handled on machines with less RAM.
 Could anyone please give me some comments on how to reduce 
 the space complexity in this calculation?
 
 Thanks,
 Mike

Re: [R] Simulation

2009-05-15 Thread Greg Snow

I wrote replicate but the darn e-mail program fixed it for me.  I expected 
replicate to be a bit slower, but not by that amount.  I just wanted to include 
replicate as a more readable version of lapply while still improving over the 
loop approach.

-- 
Gregory (Greg) L. Snow Ph.D.
Statistical Data Center
Intermountain Healthcare
greg.s...@imail.org
801.408.8111


 -Original Message-
 From: Ben Bolker [mailto:bol...@ufl.edu]
 Sent: Friday, May 15, 2009 10:19 AM
 To: Greg Snow
 Cc: r-help@r-project.org
 Subject: Re: [R] Simulation
 
 Greg Snow wrote:
  Another possibility (maybe more readable, gives the option of a list,
 probably not faster):
 
  Replicate(1000, rexp(15,1) )
 
 
   I think that should be replicate
 
   The matrix form is quite a bit faster, but don't know if that will
 matter -- times below are for doing this task (1000 x 15 replicates)
 1000 times ...
 
  system.time(replicate(1000,replicate(1000,rexp(15,1))))
user  system elapsed
  12.689   0.220  12.985
  system.time(replicate(1000,matrix(rexp(15000,1),ncol=15)))
user  system elapsed
   2.512   0.452   2.976
 
 
 --
 Ben Bolker
 Associate professor, Biology Dep't, Univ. of Florida
 bol...@ufl.edu / www.zoology.ufl.edu/bolker
 GPG key: www.zoology.ufl.edu/bolker/benbolker-publickey.asc


Re: [R] drawing arrows

2009-05-15 Thread Duncan Murdoch


On 5/15/2009 12:43 PM, christophe dutang wrote:

Hi,

I would like to draw arrows in a classic 2D plot. Which package should I
use? is there R base functions that do job?

On google, I could not find any useful discussion about this topic, except a
link to the function 'grid.arrows' of the grid package.

My problem is I would like to draw arrows at the edge of circles drawn by
the 'symbols' function. Maybe there is already a dedicated function for
this?

Any help is appreciated.



See ?arrows.
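
A minimal sketch of arrows() used together with symbols(), for an arrow that runs from the edge of one circle to the edge of another. The centres, radii and variable names are made up for illustration; asp = 1 is assumed so that the circles are drawn round and the edge points computed below actually lie on them.

```r
cx <- c(2, 7); cy <- c(3, 6)      # hypothetical circle centres
r  <- c(1, 1.5)                   # hypothetical radii, in user coordinates

plot(NULL, xlim = c(0, 10), ylim = c(0, 10), asp = 1, xlab = "x", ylab = "y")
symbols(cx, cy, circles = r, inches = FALSE, add = TRUE)

# Direction from the first centre to the second.
theta <- atan2(cy[2] - cy[1], cx[2] - cx[1])

# Start on the edge of circle 1, end on the edge of circle 2.
x0 <- cx[1] + r[1] * cos(theta); y0 <- cy[1] + r[1] * sin(theta)
x1 <- cx[2] - r[2] * cos(theta); y1 <- cy[2] - r[2] * sin(theta)

arrows(x0, y0, x1, y1, length = 0.1)
```

With inches = FALSE the radii are taken in x-axis units, which is what makes the edge computation line up with the drawn circles.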

Duncan Murdoch


[R] drawing arrows

2009-05-15 Thread christophe dutang

Hi,

I would like to draw arrows in a classic 2D plot. Which package should I
use? is there R base functions that do job?

On google, I could not find any useful discussion about this topic, except a
link to the function 'grid.arrows' of the grid package.

My problem is I would like to draw arrows at the edge of circles drawn by
the 'symbols' function. Maybe there is already a dedicated function for
this?

Any help is appreciated.

Christophe



-- 
Christophe DUTANG
Ph. D. student at ISFA


Re: [R] can you tell what .Random.seed was?

2009-05-15 Thread G. Jay Kerns


 Set.seed takes an integer argument, that is, 2^32-1 distinct values
 (cf NA_integer_), so the very simplest approach, brute-force search,
 has a hope of working:

 whatseed <- function (v)  {
   i <- as.integer(-2^31+1); max <- as.integer(2^31-1)
   while (i<max) { set.seed(i); if (runif(1)==v) return(i); i<-i+1 }
 }

 (OK, being able to figure it out in 2*10^68 years
 doesn't count, but within a couple months is acceptable.)

 set.seed(-2^31+10)
 system.time(whatseed(runif(1)))
   user  system elapsed
   1.53    0.00    1.53

 2^32*(1.53/10)/3600
    = 18.25
 18 hours

 3) does the answer change if there is a
 remove(.Random.seed)
 command right before the save.image() command?

 Depending on which RNG algorithm (RNGkind) you use, there may be
 cryptographic techniques that are more efficient than brute-force
 search, especially if the full internal state (.Random.seed) is
 preserved.

 This all assumes that the seed is set *only* with set.seed.  If
 .Random.seed is modified directly, there are many more possibilities
 for most of the RNGs.

             -s




Thanks very much to Warren and Stavros for their additional insight.
Putting all of this together, I think I am now ready to formulate my
question intelligently:

Using Sweave, I want to distribute randomly generated problems AND
answers to both teacher AND student.

More precisely, I want to distribute:
1) the .Rnw file
2) the .RData file saved near the end of the Sweave process.

I want it to be *easy* for the Instructor to change my seed and
generate new problems.

I want it to be *difficult* for students to figure out the seed and
automatically generate solutions on their own.


Of course, difficult is a relative term, since what is difficult
for them may well be easy for me, and what is difficult for me will
be trivial to cryptographers and some people on this list.  The
audience would be, say, upper division undergraduate students at a
public university.


What is clear so far: a brute force search of set.seed() is really
pretty easy and fast... even for students at this level.

However, relating to Duncan's second remark:  what if the Instructor
inserted an *unknown* very large number of calls to the RNG near the
beginning of the .Rnw (but after the set.seed)...  and did not
distribute this information to the students...  that would make it
much harder, yes?
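
A small sketch of this idea, under assumptions: the seed 1234, the burn-in length 5000, the candidate range 1:2000 and the helper find_seed() are all hypothetical, chosen only so the example runs quickly.

```r
# Brute-force search over a (small, hypothetical) range of candidate seeds,
# comparing only the first uniform draw after set.seed().
find_seed <- function(v, candidates) {
  for (s in candidates) {
    set.seed(s)
    if (runif(1) == v) return(s)
  }
  NA_integer_
}

set.seed(1234)
v_plain <- runif(1)            # first draw right after set.seed()

set.seed(1234)
invisible(runif(5000))         # undisclosed number of "burn-in" draws
v_burned <- runif(1)           # first draw the students would get to see

found        <- find_seed(v_plain, 1:2000)
found_burned <- find_seed(v_burned, 1:2000)
```

The first search returns a seed that reproduces v_plain, while the same naive search on v_burned should almost certainly come back empty. A student who guesses the scheme can extend the search over burn-in lengths as well, so this multiplies the work by the number of plausible lengths rather than giving cryptographic protection.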

Any ideas that are even better than this?

Conceivably, some of my students will be searching these archives in
the future;  please feel free to respond off-list if appropriate.

Jay


















-- 

***
G. Jay Kerns, Ph.D.
Associate Professor
Department of Mathematics & Statistics
Youngstown State University
Youngstown, OH 44555-0002 USA
Office: 1035 Cushwa Hall
Phone: (330) 941-3310 Office (voice mail)
-3302 Department
-3170 FAX
E-mail: gke...@ysu.edu
http://www.cc.ysu.edu/~gjkerns/


[R] Any R workshops on BUGS or resampling or other...?

2009-05-15 Thread Kevin W

I would like to know about any workshops/meetings on the topics of (1) using
some version of BUGS with R (2) resampling methods (3) other advanced
courses.

Thanks for any ideas.

Kevin Wright


Re: [R] memory usage grows too fast

2009-05-15 Thread Ping-Hsun Hsieh

Hi William,

Thanks for the comments and explanation.
It is really good to know the details of rowMeans.
I did modify Peter's code from length(x[x=="02"]) to sum(x=="02"), though it
improved the running time by only a few seconds. :)

Best,
Mike

-Original Message-
From: William Dunlap [mailto:wdun...@tibco.com] 
Sent: Friday, May 15, 2009 10:09 AM
To: Ping-Hsun Hsieh
Subject: RE: [R] memory usage grows too fast

rowMeans(dataMatrix=="02") must
  (a) make a logical matrix the dimensions of dataMatrix in which to put
      the result of dataMatrix=="02" (4 bytes/logical element)
  (b) make a double precision matrix (8 bytes/element) the size of that
      logical matrix because rowMeans uses some C code that only works on
      doubles
apply(dataMatrix,1,function(x)length(x[x=="02"])/ncol(dataMatrix))
never has to make any copies of the entire matrix.  It extracts a row
at a time and when it is done with the row, the memory used for
working on the row is available for other uses.  Note that it would
probably be a tad faster if it were changed to
   apply(dataMatrix,1,function(x)sum(x=="02")) / ncol(dataMatrix)
as sum(logicalVector) is the same as length(x[logicalVector]) and there
is no need to compute ncol(dataMatrix) more than once.
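
A tiny sketch of the two approaches side by side, using the toy data from the original question. The object names m, freq_rowmeans and freq_apply are hypothetical.

```r
# Toy data from the original question, as a character matrix.
m <- matrix(c("01", "02", "02", "00",
              "02", "02", "02", "01",
              "00", "02", "01", "01"),
            nrow = 3, byrow = TRUE)

# Whole-matrix version: builds a full logical matrix, then averages by row.
freq_rowmeans <- rowMeans(m == "02")

# Row-at-a-time version: only one row's comparison is held at any moment.
freq_apply <- apply(m, 1, function(x) sum(x == "02")) / ncol(m)

freq_rowmeans
freq_apply
```

Both give the same frequencies; the difference is only in how much temporary memory is needed along the way.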

Bill Dunlap
TIBCO Software Inc - Spotfire Division
wdunlap tibco.com  

 -Original Message-
 From: Ping-Hsun Hsieh [mailto:hsi...@ohsu.edu] 
 Sent: Friday, May 15, 2009 9:58 AM
 To: Peter Alspach; William Dunlap; hadley wickham
 Cc: r-help@r-project.org
 Subject: RE: [R] memory usage grows too fast
 
 Thanks for Peter, William, and Hadley's helps.
 Your codes are much more concise than mine.  :P
  
 Both William and Hadley's comments are the same. Here are their codes.
 
   f <- function(dataMatrix) rowMeans(dataMatrix=="02")
 
 And Peter's codes are the following.
 
   apply(yourMatrix, 1, function(x) 
 length(x[x==yourPattern]))/ncol(yourMatrix)
 
 
 In terms of the running time, the first one ran faster than 
 the later one on my dataset (2.5 mins vs. 6.4 mins)
 The memory consumption, however, of the first one is much 
 higher than the later.  ( 8G vs. ~3G )
 
 Any thoughts? My guess is the rowMeans created extra copies 
 to perform its calculation, but not so sure.
 And I am also interested in understanding ways to handle 
 memory issues. Help someone could shed light on this for me. :)
 
 Best,
 Mike
 
 -Original Message-
 From: Peter Alspach [mailto:palsp...@hortresearch.co.nz] 
 Sent: Thursday, May 14, 2009 4:47 PM
 To: Ping-Hsun Hsieh
 Subject: RE: [R] memory usage grows too fast
 
 Tena koe Mike
 
 If I understand you correctly, you should be able to use 
 something like:
 
 apply(yourMatrix, 1, function(x)
 length(x[x==yourPattern]))/ncol(yourMatrix)
 
 I see you've divided by nrow(yourMatrix) so perhaps I am missing
 something.
 
 HTH ...
 
 Peter Alspach
 
  
 
  -Original Message-
  From: r-help-boun...@r-project.org 
  [mailto:r-help-boun...@r-project.org] On Behalf Of Ping-Hsun Hsieh
  Sent: Friday, 15 May 2009 11:22 a.m.
  To: r-help@r-project.org
  Subject: [R] memory usage grows too fast
  
  Hi All,
  
  I have a 1000x100 matrix. 
  The calculation I would like to do is actually very simple: 
  for each row, calculate the frequency of a given pattern. For 
  example, a toy dataset is as follows.
  
  Col1  Col2  Col3  Col4
  01    02    02    00    = Freq of 02 is 0.5
  02    02    02    01    = Freq of 02 is 0.75
  00    02    01    01    ...
  
  My code is quite simple as the following to find the pattern 02.
  
  OccurrenceRate_Fun<-function(dataMatrix)
  {
    tmp<-NULL
    tmpMatrix<-apply(dataMatrix,1,match,"02")
    for ( i in 1: ncol(tmpMatrix))
    {
      tmpRate<-table(tmpMatrix[,i])[[1]]/ nrow(tmpMatrix)
      tmp<-c(tmp,tmpRate)
    }
    rm(tmpMatrix)
    rm(tmpRate)
    return(tmp)
    gc()
  }
  
  The problem is the memory usage grows very fast and hard to 
  be handled on machines with less RAM.
  Could anyone please give me some comments on how to reduce 
  the space complexity in this calculation?
  
  Thanks,
  Mike

[R] Rotating x-axis categorical labels

2009-05-15 Thread Bill Hudspeth

Hello,

I am using barplot to generate a histogram of population by county. I
need to plot the bars for about 35 counties, and would like to rotate
the county name labels on the x-axis to a vertical orientation so that I
can fit them all. An example of my syntax is below:

r.barplot(x,main=main,
xlab=xlab,ylab=ylab,names_arg=counties,axis_lty=1,col="lavender",ylim=r.c(0,100),cex_axis=0.7,cex_names=0.7,offset=0,las=1)

Using las rotates the y-axis labels --- how do I rotate the X-axis
labels...?

Thanks, William


Re: [R] Rotating x-axis categorical labels

2009-05-15 Thread Marc Schwartz



On May 15, 2009, at 12:18 PM, Bill Hudspeth wrote:


Hello,

I am using barplot to generate a histogram of population by county. I
need to plot the bars for about 35 counties, and would like to rotate
the county name labels on the x-axis to a vertical orientation so that I
can fit them all. An example of my syntax is below:

r.barplot(x,main=main,
xlab=xlab,ylab=ylab,names_arg=counties,axis_lty=1,col="lavender",ylim=r.c(0,100),cex_axis=0.7,cex_names=0.7,offset=0,las=1)


Using las rotates the y-axis labels --- how do I rotate the X-axis
labels...?

Thanks, William



par("las") takes 4 values 0:3. See ?par

Try:

  # Rotate both x and y
  barplot(1:5, names.arg = paste("Bar", 1:5), las = 2)

  # Rotate just x
  barplot(1:5, names.arg = paste("Bar", 1:5), las = 3)


If you want something other than a 90 degree rotation, see:

  
http://cran.r-project.org/doc/FAQ/R-FAQ.html#How-can-I-create-rotated-axis-labels_003f
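
A minimal sketch along the lines of that FAQ entry, for labels rotated by 45 degrees. The counts, the label text and the vertical offset of 2 units are made up for illustration.

```r
counts <- c(12, 30, 45, 22, 60)            # made-up values
labs   <- paste("County", 1:5)             # made-up labels

# Draw the bars without axis labels; barplot() returns the bar midpoints.
mids <- barplot(counts, axisnames = FALSE, ylim = c(0, 100), col = "lavender")

# Place the labels just below the axis, rotated 45 degrees.
# xpd = TRUE allows text to be drawn outside the plot region.
text(x = mids, y = par("usr")[3] - 2, labels = labs,
     srt = 45, adj = 1, xpd = TRUE, cex = 0.7)
```

For long labels the bottom margin may need enlarging first, for example with par(mar = ...).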

HTH,

Marc Schwartz


Re: [R] creating and then executing command strings

2009-05-15 Thread Greg Snow

The arrow "<-" is used to assign a value to a variable; the equals sign "=" is
used to specify the value for a function argument.  Recent versions of R allow
"=" to be used for "<-" at the top level and in certain circumstances, which some
people find more convenient, but this can also lead to confusion (purists always
keep them separate).

The code:

 parse( text <- paste( ... 

Will take the results of paste, save them in a variable named text, then pass a 
copy to the first argument of parse, which is file, not text, so parse will 
just get confused (looking for a file named what your code is).

The code:

 parse( text = paste( ...

Will take the results of paste and pass them to the parse function as the text 
argument.

But having said that, you should refer to fortune(106) (type that after loading 
the fortunes package) and possibly fortune(181).

There are probably better ways to do what you want, Romain's second example is 
one way.
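
A short sketch contrasting the variants discussed above. The data frame df, the column name "height" and the variable tmp are hypothetical.

```r
colname <- "height"
df <- data.frame(height = c(1.6, 1.8))

# Usually preferable: build the name as a string and index with [[ ]].
df[[paste("avg_", colname, sep = "")]] <- 0

# parse()/eval() version: 'text =' names the argument of parse().
eval(parse(text = paste("avg_", colname, " <- 0", sep = "")))

# '<-' inside a call assigns first, then passes the value positionally.
n <- nchar(tmp <- "abc")   # creates 'tmp' as a side effect; n is 3

names(df)
avg_height
tmp
```

The [[ ]] form keeps the new value inside the data frame, while the eval/parse form creates a free-standing variable named avg_height.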
-- 
Gregory (Greg) L. Snow Ph.D.
Statistical Data Center
Intermountain Healthcare
greg.s...@imail.org
801.408.8111


 -Original Message-
 From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-
 project.org] On Behalf Of Philipp Schmidt
 Sent: Friday, May 15, 2009 8:35 AM
 To: Romain Francois
 Cc: r-help@r-project.org
 Subject: Re: [R] creating and then executing command strings
 
 On Fri, May 15, 2009 at 3:38 PM, Romain Francois
 romain.franc...@dbmail.com wrote:
  Hi,
 
  You can either parse and eval the string you are making, as in:
 
  eval( parse( text = paste("avg_", colname, " <- 0;", sep='') ) )
 
 
  Or you can do something like this:
 
  df[[ paste( "avg_", colname, sep = "" ) ]] <- 0
 
 
 Thank you so much! I used the first version and it worked.
 
 What puzzles me is that I am not able to use <- instead of = (my R
 book says the two can be exchanged) or break the command into
 different parts and execute them one after another.
 
 I get various error messages when I try:
 
 eval( parse( text <- paste("avg_", colname, " <- 0;", sep='') ) )
 
 or
 
 text = paste("avg_", colname, " <- 0;", sep='')
 parse(text)
 eval(parse(text))
 
 Anyway, thanks a lot - you greatly improved the likelihood of me not
 working on the weekend!
 
 Best - P
 

Re: [R] Simulation

2009-05-15 Thread Wacek Kusnierczyk

Greg Snow wrote:
 Another possibility (maybe more readable, gives the option of a list, 
 probably not faster):

 replicate(1000, rexp(15,1) )

   

provided that simplify=FALSE:

is(replicate(10, rexp(15, 1)))
# matrix ...

is(replicate(10, rexp(15, 1), simplify=FALSE))
# list ...
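
A slightly fuller sketch of the same point, showing the shape of each
result. The seed and the variable names m and l are arbitrary choices for
the example.

```r
set.seed(1)                                   # arbitrary seed

# Default (simplify = TRUE): the 10 samples become columns of a matrix.
m <- replicate(10, rexp(15, 1))
dim(m)

# simplify = FALSE: a list with one numeric vector per replication.
l <- replicate(10, rexp(15, 1), simplify = FALSE)
length(l)
length(l[[1]])
```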

vQ


Re: [R] drawing arrows

2009-05-15 Thread Greg Snow

Duncan mentioned the arrows function, which may do everything you want.  But, 
also look at the my.symbols function in the TeachingDemos package for another 
way to draw arrows, or to draw your circles and arrows in 1 step.
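
As a rough sketch of the base-graphics route (not the my.symbols approach),
one could compute the points on the circle edges and pass them to arrows().
The centres, radii and axis limits below are invented for the example, and
asp = 1 is assumed so that the circles are round in user coordinates.

```r
x <- c(2, 6); y <- c(3, 5); r <- c(1, 1.5)    # hypothetical centres and radii

plot(0, 0, type = "n", xlim = c(0, 9), ylim = c(0, 8), asp = 1,
     xlab = "x", ylab = "y")
symbols(x, y, circles = r, inches = FALSE, add = TRUE)

# Direction from the first centre to the second.
theta <- atan2(y[2] - y[1], x[2] - x[1])

# Start on the edge of circle 1, end on the edge of circle 2.
x0 <- x[1] + r[1] * cos(theta); y0 <- y[1] + r[1] * sin(theta)
x1 <- x[2] - r[2] * cos(theta); y1 <- y[2] - r[2] * sin(theta)
arrows(x0, y0, x1, y1, length = 0.1)
```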

-- 
Gregory (Greg) L. Snow Ph.D.
Statistical Data Center
Intermountain Healthcare
greg.s...@imail.org
801.408.8111


 -Original Message-
 From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-
 project.org] On Behalf Of christophe dutang
 Sent: Friday, May 15, 2009 10:44 AM
 To: r-help@r-project.org
 Subject: [R] drawing arrows
 
 Hi,
 
 I would like to draw arrows in a classic 2D plot. Which package should I
 use? Are there base R functions that do the job?
 
 On Google, I could not find any useful discussion about this topic, except a
 link to the function 'grid.arrows' of the grid package.
 
 My problem is that I would like to draw arrows at the edge of circles drawn by
 the 'symbols' function. Maybe there is already a dedicated function for this?
 
 Any help is appreciated.
 
 Christophe
 
 
 
 --
 Christophe DUTANG
 Ph. D. student at ISFA
 

Re: [R] Optimization algorithm to be applied to S4 classes - specifically sparse matrices

2009-05-15 Thread spencerg

 I suggest you try to translate your constrained problem into an 
unconstrained problem using logarithms, then do nonlinear 
mixed effects modeling as described in chapters 6-8 of Pinheiro and 
Bates (2000).  To do this, I would first start with the simpler linear 
estimation problem to get starting values for the nonlinear estimation.  
You should be able to do this using the nlme function in the nlme 
package.  If you have trouble with this, you might consider the nlmer 
function in the lme4 package.  The latter is newer and better in many 
ways but not as well documented. 
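
To illustrate only the logarithm idea (not the mixed-effects modeling), here
is a minimal sketch on a small dense problem: writing X = exp(theta) keeps
every parameter positive while optim works on an unconstrained theta. The
data, dimensions and names are invented for the example.

```r
set.seed(42)                                  # arbitrary seed
A <- matrix(runif(60), 20, 3)                 # hypothetical small design
x_true <- c(0.5, 2, 1)
B <- drop(A %*% x_true)

# Objective and gradient in terms of theta, where X = exp(theta) > 0.
obj  <- function(theta) sum((A %*% exp(theta) - B)^2)
grad <- function(theta) {
  x <- exp(theta)
  drop(2 * crossprod(A, A %*% x - B)) * x     # chain rule: dX/dtheta = X
}

fit <- optim(rep(0, 3), obj, grad, method = "BFGS")
x_hat <- exp(fit$par)                         # back-transform to X
x_hat
```

Note that this handles positivity only; an upper limit would need a
different transformation (a logistic one, for instance).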



 Hope this helps. 
 Spencer Graves


avraham.ad...@guycarp.com wrote:

Thank you both very much for your replies. What makes this a little less
straightforward, at least to me, is that there needs to be constraints on
the solved parameters. They most certainly need to be positive and there
may be an upper limit as well. The true best linear fit would have negative
entries for some of the parameters.


Originally, I was using the L-BFGS-B method of optim which both allows for
box constraints and has the limited memory advantage useful when dealing
with large matrices. Having the analytic gradient, I thought of using BFGS
and having a statement in the function returning Inf for any parameters
outside the allowable constraints.


I do /not/ know how to apply parameter constraints when using linear
models. I looked around at the various manuals and help features, and
outside of package glmc I did not find anything I could use. Perhaps I
overlooked something. If there is something I missed, please let me know.


If there truly is no standard optimization routine that works on sparse
matrices, my next step may be to use the normal equations to shrink the
size of the matrix, recast it as a dense matrix (it would only be 1173x1173
then) and then hand it off to optim.
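
A minimal sketch of that plan on a small dense stand-in: form the normal
equations once, then minimize the equivalent quadratic with L-BFGS-B under
box constraints. The data, the bounds 0 and 1 and all names are assumptions
made for the example.

```r
set.seed(7)                                   # arbitrary seed
A <- matrix(rnorm(200 * 4), 200, 4)           # hypothetical stand-in for A
B <- drop(A %*% c(0.5, -1, 2, 0.1)) + rnorm(200, sd = 0.1)

# Normal equations: only p x p and p x 1 objects are needed afterwards.
AtA <- crossprod(A)
AtB <- drop(crossprod(A, B))

# sum((A x - B)^2) equals x'AtA x - 2 x'AtB plus a constant.
obj  <- function(x) drop(crossprod(x, AtA %*% x)) - 2 * sum(x * AtB)
grad <- function(x) 2 * drop(AtA %*% x - AtB)

fit <- optim(rep(0.5, 4), obj, grad, method = "L-BFGS-B",
             lower = 0, upper = 1)
fit$par
```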


Any further suggestions or corrections would be very much appreciated.


Thank you,


--Avraham Adler


   
 From:    Douglas Bates ba...@stat.wisc.edu
 Sent by: dmba...@gmail.com
 Date:    05/15/2009 11:57 AM
 To:      avraham.ad...@guycarp.com
 cc:      r-help@r-project.org
 Subject: Re: [R] Optimization algorithm to be applied to S4 classes - specifically sparse matrices





On Wed, May 13, 2009 at 5:21 PM,  avraham.ad...@guycarp.com wrote:
  

Hello.

I am trying to optimize a set of parameters using /optim/ in which the
actual function to be minimized contains matrix multiplication and is of
the form:

SUM ((A%*%X - B)^2)

where A is a matrix and X and B are vectors, with X as parameter vector.



As Spencer Graves pointed out, what you are describing here is a
linear least squares problem, which has a direct (i.e. non-iterative)
solution.  A comparison of the speed of various ways of solving such a
system is given in one of the vignettes in the Matrix package.

  

This has worked well so far. Recently, I was given a data set A of size
360440 x 1173, which could not be handled as a normal matrix. I brought it
into 'R' as a sparse matrix (dgCMatrix - using sparseMatrix from the Matrix
package), and the formulæ and gradient work, but /optim/ returns an error
of the form "no method for coercing this S4 class to a vector".



If you just want the least squares solution X then

X <- solve(crossprod(A), crossprod(A, B))

will likely be the fastest method where A is the sparse matrix.
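
A small self-contained sketch of that call on a made-up sparse matrix. The
dimensions, the sparsity pattern and x_true are invented for the example.

```r
library(Matrix)

set.seed(1)                                   # arbitrary seed
n <- 200; p <- 5
i <- rep(1:n, each = 2)                       # two entries per row
j <- sample(p, 2 * n, replace = TRUE)
A <- sparseMatrix(i = i, j = j, x = rnorm(2 * n), dims = c(n, p))

x_true <- c(1, 2, 3, 4, 5)
B <- A %*% x_true                             # exact right-hand side

# Least squares solution via the (small, p x p) normal equations.
X <- solve(crossprod(A), crossprod(A, B))
as.vector(X)
```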

I do feel obligated to point out that the least squares solution for
such large systems is rarely a sensible solution to the underlying
problem.  If you have over 1000 columns in A and it is very sparse
then likely at least parts of A are based on indicator columns for a
categorical variable.  In such situations a model with random effects
for the category is often preferable to the fixed-effects model you
are fitting.


  

After briefly looking into methods and classes, I realize I am in way over
my head. Is there any way I could use /optim/

Re: [R] drawing arrows

2009-05-15 Thread Christophe Dutang

Thanks, I'll take a look.

Christophe

Le 15 mai 09 à 20:11, Greg Snow a écrit :

 Duncan mentioned the arrows function, which may do everything you  
 want.  But, also look at the my.symbols function in the  
 TeachingDemos package for another way to draw arrows, or to draw  
 your circles and arrows in 1 step.

 -- 
 Gregory (Greg) L. Snow Ph.D.
 Statistical Data Center
 Intermountain Healthcare
 greg.s...@imail.org
 801.408.8111


 -Original Message-
 From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-
 project.org] On Behalf Of christophe dutang
 Sent: Friday, May 15, 2009 10:44 AM
 To: r-help@r-project.org
 Subject: [R] drawing arrows

 Hi,

 I would like to draw arrows in a classic 2D plot. Which package should I
 use? Are there base R functions that do the job?

 On Google, I could not find any useful discussion about this topic, except a
 link to the function 'grid.arrows' of the grid package.

 My problem is that I would like to draw arrows at the edge of circles drawn by
 the 'symbols' function. Maybe there is already a dedicated function for this?

 Any help is appreciated.

 Christophe



 --
 Christophe DUTANG
 Ph. D. student at ISFA


Christophe Dutang
Ph. D. student at ISFA, Lyon, France
website: http://dutangc.free.fr





Re: [R] can you tell what .Random.seed was?

2009-05-15 Thread Warren Young


G. Jay Kerns wrote:


I want it to be *difficult* for students to figure out the seed and
automatically generate solutions on their own.


Hmmm Would it really be a bad thing if someone reverse engineered 
this to generate answers given the problem set?  If it's hard enough to 
do that, it'd be more worth solving than the given problem set.  I call 
that extra credit.



a brute force search of set.seed() is really
pretty easy and fast... even for students at this level.


Either you're misunderstanding Stavros' benchmark results, or I am. 
Could easily be the latter...I'm an R newbie.


As far as I can tell, the inner part of the loop does very little.  If 
that's right, Stavros is saying it will take 18 hours to try every 
possible seed when the algorithm based on that seed takes almost no time 
to run.  But, if generating each problem set takes, say, a minute, it 
will take roughly 8,000 years to generate a complete rainbow table when 
there are 2^32 possible seeds (2^32 minutes is about 8,166 years).



what if the Instructor
inserted an *unknown* very large number of calls to the RNG near the
beginning of the .Rnw (but after the set.seed)...  and did not
distribute this information to the students...  that would make it
much harder, yes?


There are better ways.

As above, one key to making rainbow tables impractical is making the 
per-iteration time long enough.  Even if it only takes a second to 
generate each possible problem set, that's enough when multiplied by 
high enough powers of 2.


The other key is using big enough powers of 2.

I hadn't looked into R's random number generation before, but it appears 
quite robust.  Seeding it with the current wall clock time (a 32-bit 
integer on most systems) is an insult to its capability.


The default pseudo-random number generator (PRNG) in my copy of R is the 
Mersenne Twister, a truly awesome algorithm.  It's capable of very high 
quality results, as long as you give it a good seed.  It will take a 
vector of *many* integers as a seed, not just one.  It's not clear to me 
from the R docs if you can pass an arbitrary array of integers with any 
value, or if it needs something special.
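
A quick check of how seeding behaves in practice, as far as the documented
interface goes: set.seed() takes a single integer, while the full Mersenne
Twister state is kept in .Random.seed. The seed value and the names a and b
are arbitrary.

```r
RNGkind("Mersenne-Twister")

set.seed(20090515)            # a single integer seed
a <- runif(3)

set.seed(20090515)            # same seed again ...
b <- runif(3)

identical(a, b)               # ... reproduces the same draws
length(.Random.seed)          # full generator state: kind, position, 624 words
```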


Assuming you can give it any old passel of randomness as a seed, you 
just have to find a good source of randomness to create that seed.  On a 
Linux box, you could concatenate several dozen bytes read from 
/dev/random, the current wall clock time in microseconds, the inode of 
the R script being run, the process ID of the R interpreter, and the 
current mouse cursor position into a single string.  Feed all that into 
a hash algorithm, and break off pieces of that 4 bytes long, cast them 
to integers, and send that array of ints to set.seed().


If you use SHA-256 as the hash algorithm, that scheme should give you 
enough input randomness to get any of the possible 2^256 hash outputs, 
making that the amount of possible problem sets.  That's more than a 
rainbow table buster...there aren't enough atoms in the visible universe 
to construct a computer big enough to cope with 2^256 possible outputs.


That said, the quality of the PRNG just *allows* you to avoid screwing 
up.  It doesn't make it impossible to make a weak algorithm.



Re: [R] readBin: read from defined offset TO defined offset?

2009-05-15 Thread Johannes Graumann

Thanks guys!

Duncan's hints regarding character (which I was naturally using ;0) and 
the double readBin solved my problem - I'm extracting an index from a 
REALLY big XML file to get fast direct access to subsections, so that I only 
have to parse them rather than the whole thing (only SAX-style passing would 
be possible, since there's no way the thing will fit into memory).
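
A minimal sketch of reading a fixed byte range with seek and readBin, using
raw bytes rather than character. The file contents and the offsets start and
end are made up for the example (start is a 0-based offset, end is
exclusive).

```r
# Write a tiny stand-in for the big XML file.
path <- tempfile()
writeBin(charToRaw("<a><b>first</b><b>second</b></a>"), path)

start <- 3                    # hypothetical offsets taken from an index
end   <- 15

con <- file(path, "rb")
invisible(seek(con, where = start))           # jump to the start offset
chunk <- readBin(con, what = "raw", n = end - start)
close(con)

rawToChar(chunk)              # only this subsection needs to be parsed
```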

Thanks again, Joh

Johannes Graumann wrote:

 Hello,
 
 With the help of seek I can start readBin from any byte offset within
 my file that I deem appropriate.
 What I would like to do is to be able to define the endpoint of that read
 as well. Is there any solution to that already out there?
 
 Thanks for any hints, Joh


Re: [R] Sweave: Howto write real TeX formula in plot

2009-05-15 Thread cameron.bracken



cls59 wrote:
 
 install.packages('pgfSweave',repos='http://www.rforge.net')
 

For others who are not on Linux, I would suggest using the r-forge site for
binary installation. The binaries on rforge.net are not completely up to
date in some cases. 

install.packages("pgfSweave", repos="http://R-Forge.R-project.org")

Otherwise, if you are not on a Linux system, use:

install.packages('pgfSweave',repos='http://www.rforge.net',type='source')

-Cameron 

-- 
View this message in context: 
http://www.nabble.com/Sweave%3A-Howto-write-real-TeX-formula-in-plot-tp23127536p23565286.html
Sent from the R help mailing list archive at Nabble.com.


Re: [R] Optimization algorithm to be applied to S4 classes - specifically sparse matrices

2009-05-15 Thread Ravi Varadhan

Hi,

I think quadratic programming is the way to go.  Look at solve.QP (in the
quadprog package) or the limSolve package.

Here is a toy example that I had worked out some time back for a linear
least squares problem with simple box constraints:
# Problem:  minimize ||Ax - y||, subject to low <= x <= upp

require(limSolve)

nc <- 7  # 7 unknown parameters
nr <- 20  # 20 equations

# Bounds on the parameters: 0 <= x <= 1,  for all x

set.seed(123)

A <- matrix(rnorm(nr*nc), nr, nc)
x <- c(runif(nc-1), 1.5) # Note: the last component is out of bounds!
y <- A %*% x + rnorm(nr, sd=0.1)

qr.solve(A, y)  # unconstrained least-squares

low <- rep(0, nc)  # lower bounds
upp <- rep(1, nc)  # upper bounds

# Implementing the bounds (there is probably a simpler way to do this)
# lsei() expects inequality constraints in the form G %*% x >= H
c1 <- matrix(0, nc, nc)
diag(c1) <- 1     # rows for x >= low
c2 <- matrix(0, nc, nc)
diag(c2) <- -1    # rows for -x >= -upp

# Interleave the lower-bound and upper-bound rows
vec <- rep(0, 2*nc)
vec[seq(1, 2*nc, by=2)] <- 1:nc
vec[seq(2, 2*nc, by=2)] <- (nc+1):(2*nc)

Cmat <- rbind(c1, c2)[vec, ]  # Constraint matrix G
b0 <- c(low, -upp)[vec]

ans <- lsei(A = A, B = y, G = Cmat, H = b0)
ans 


Hope this helps,
Ravi.


---

Ravi Varadhan, Ph.D.

Assistant Professor, The Center on Aging and Health

Division of Geriatric Medicine and Gerontology 

Johns Hopkins University

Ph: (410) 502-2619

Fax: (410) 614-9625

Email: rvarad...@jhmi.edu

Webpage:  http://www.jhsph.edu/agingandhealth/People/Faculty/Varadhan.html







-Original Message-
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On
Behalf Of spencerg
Sent: Friday, May 15, 2009 2:22 PM
To: avraham.ad...@guycarp.com
Cc: r-help@r-project.org; Douglas Bates
Subject: Re: [R] Optimization algorithm to be applied to S4 classes -
specifically sparse matrices

  I suggest you try to translate your constrained problem into an unconstrained
problem using logarithms, then do nonlinear mixed effects
modeling as described in chapters 6-8 of Pinheiro and Bates (2000).  To do
this, I would first start with the simpler linear estimation problem to get
starting values for the nonlinear estimation.  
You should be able to do this using the nlme function in the nlme 
package.  If you have trouble with this, you might consider the nlmer 
function in the lme4 package.  The latter is newer and better in many ways
but not as well documented. 


  Hope this helps. 
  Spencer Graves

avraham.ad...@guycarp.com wrote:
 Thank you both very much for your replies. What makes this a little 
 less straightforward, at least to me, is that there needs to be 
 constraints on the solved parameters. They most certainly need to be 
 positive and there may be an upper limit as well. The true best linear 
 fit would have negative entries for some of the parameters.


 Originally, I was using the L-BFGS-B method of optim which both allows 
 for box constraints and has the limited memory advantage useful when 
 dealing with large matrices. Having the analytic gradient, I thought 
 of using BFGS and having a statement in the function returning Inf 
 for any parameters outside the allowable constraints.


 I do /not/ know how to apply parameter constraints when using linear 
 models. I looked around at the various manuals and help features, and 
 outside of package glmc I did not find anything I could use. Perhaps 
 I overlooked something. If there is something I missed, please let me
know.


 If there truly is no standard optimization routine that works on 
 sparse matrices, my next step may be to use the normal equations to 
 shrink the size of the matrix, recast it as a dense matrix (it would 
 only be 1173x1173
 then) and then hand it off to optim.


 Any further suggestions or corrections would be very much appreciated.


 Thank you,


 --Avraham Adler




  From:    Douglas Bates ba...@stat.wisc.edu
  Sent by: dmba...@gmail.com
  Date:    05/15/2009 11:57 AM
  To:      avraham.ad...@guycarp.com
  cc:      r-help@r-project.org
  Subject: Re: [R] Optimization algorithm to be applied to S4 classes - specifically sparse matrices

















 On Wed, May 13, 2009 at 5:21 PM,  avraham.ad...@guycarp.com wrote:
   
 Hello.

 I am trying to optimize a set of parameters using /optim/ in which 
 the actual function to be minimized contains matrix multiplication 
 and is of the form:

 SUM ((A%*%X - B)^2)

 where A is a matrix and X and B are vectors, with X as parameter vector.
 

 As Spencer Graves pointed out, what you are describing here is a 
 linear least squares problem, which

Re: [R] can you tell what .Random.seed was?

2009-05-15 Thread Stavros Macrakis

On Fri, May 15, 2009 at 12:07 PM, Stavros Macrakis
macra...@alum.mit.edu wrote:
 system.time(whatseed(runif(1)))

Sorry, though I got lucky and my overall result is roughly correct,
this is an incorrect time measure.  It should be

r <- runif(1); system.time(whatseed(r))

because R's call-by-need semantics don't evaluate the runif before it
starts running whatseed.  The correct time (on my machine) is then 28
hours, not 18.

Better to avoid side-effect functions as arguments
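
A tiny sketch of the lazy-evaluation behaviour behind this: an argument
expression is not evaluated until the function first uses it. The function
f and the flag evaluated are made up for the demonstration.

```r
evaluated <- FALSE

f <- function(x) {
  before <- evaluated         # the argument has not been touched yet
  x                           # first use forces evaluation of the argument
  c(before = before, after = evaluated)
}

# The argument expression sets the flag as a side effect when it is evaluated.
res <- f({ evaluated <- TRUE; runif(1) })
res
```

This is why timing whatseed(runif(1)) also counts the time spent inside the
function before the argument is first needed, along with the runif call
itself.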

 -s


Re: [R] can you tell what .Random.seed was?

2009-05-15 Thread Dirk Eddelbuettel


On 15 May 2009 at 13:08, G. Jay Kerns wrote:
| Thanks very much to Warren and Stavros for their additional insight.
| Putting all of this together, I think I am now ready to formulate my
| question intelligently:
| 
| Using Sweave, I want to distribute randomly generated problems AND
| answers to both teacher AND student.
| 
| More precisely, I want to distribute:
| 1) the .Rnw file
| 2) the .RData file saved near the end of the Sweave process.
| 
| I want it to be *easy* for the Instructor to change my seed and
| generate new problems.
| 
| I want it to be *difficult* for students to figure out the seed and
| automatically generate solutions on their own.
|
| Of course, difficult is a relative term, since what is difficult
| for them may well be easy for me, and what is difficult for me will
| be trivial to cryptographers and some people on this list.  The
| audience would be, say, upper division undergraduate students at a
| public university.
| 
| 
| What is clear so far: a brute force search of set.seed() is really
| pretty easy and fast... even for students at this level.
| 
| However, relating to Duncan's second remark:  what if the Instructor
| inserted an *unknown* very large number of calls to the RNG near the
| beginning of the .Rnw (but after the set.seed)...  and did not
| distribute this information to the students...  that would make it
| much harder, yes?
| 
| Any ideas that are even better than this?

You could use (one or more) seeds from a hardware RNGs.

The website http://random.org by Mads Haahr distributes such numbers (and my
CRAN package 'random' gets them for you in a convenient fashion).  Have a
look at the docs at random.org, and the two vignettes in the random package:

   RANDOM.ORG offers true random numbers to anyone on the Internet. The
   randomness comes from atmospheric noise, which for many purposes is better
   than the pseudo-random number algorithms typically used in computer
   programs. People use RANDOM.ORG for holding drawings, lotteries and
   sweepstakes, to drive games and gambling sites, for scientific
   applications and for art and music.  

Hth, Dirk

-- 
Three out of two people have difficulties with fractions.
