date:20120130

[R] Data generation

2012-01-30 Thread Partha Sinha

I want to generate a 20 x 30 data matrix of normally distributed values
with mean 3 and standard deviation 1.
Please help.
Partha

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] FW: repeated measures MANOVA with interaction

2012-01-30 Thread Gregory McCullagh (x2010rcw)

If I have a matrix x:

x
slug surgery swat  prey  predator
1  122  2   91
2  240  8   115
3  348  3   110

slug = individual; each slug is tested in each of the swat, prey and predator odour treatments
surgery = surgical treatment applied to the slug
values in the swat, prey and predator columns = average heading of the slug

There are two factors: surgery and odour (swat, prey, predator).

How do I test (with appropriate code) a repeated-measures MANOVA with an
interaction between surgery and odour (swat, prey, predator)?

Do I need to rearrange the matrix?

the solution might be:
y <- cbind(swat,prey,predator)
fit <- manova(y ~ surgery * swat,prey,predator+ Error(slug), data = x)
???
Greg
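
A possible starting point, offered as a sketch rather than a definitive answer: instead of manova(), the code below swaps in a univariate repeated-measures (split-plot) ANOVA via aov() with an Error() term, which needs the data in long format. The data are simulated and the surgery labels ("sham", "cut") are hypothetical; only the column names come from the post.

```r
set.seed(1)
# Simulated wide-format data: one row per slug, one column per odour.
wide <- data.frame(
  slug     = factor(1:12),
  surgery  = factor(rep(c("sham", "cut"), each = 6)),  # hypothetical labels
  swat     = rnorm(12, mean = 30, sd = 10),
  prey     = rnorm(12, mean = 5, sd = 2),
  predator = rnorm(12, mean = 100, sd = 10)
)

# Rearrange to long format: one row per slug-by-odour measurement.
long <- reshape(wide, direction = "long",
                varying = c("swat", "prey", "predator"),
                v.names = "heading", timevar = "odour",
                times = c("swat", "prey", "predator"), idvar = "slug")
long$odour <- factor(long$odour)

# surgery varies between slugs, odour within slugs; the surgery:odour term
# is the interaction asked about.
fit <- aov(heading ~ surgery * odour + Error(slug/odour), data = long)
summary(fit)
```

So the answer to "do I need to rearrange the matrix?" is yes for this approach: aov() expects one measurement per row.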


Re: [R] repeat function for entire list of matrices

2012-01-30 Thread pabears

That didn't quite seem to work.

I tried different subsetting:

lapply(nestedseasonlower, nested(nestedseason,.)

Are there any functions that can repeat a function while counting each
iteration of the repeated function (n=1, n=2, n=3)?

thanks
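
One way to get an iteration counter, sketched with made-up data: iterate over the indices of the list rather than over its elements. The list `mats` and the function `describe` are hypothetical stand-ins for the poster's objects.

```r
# Hypothetical list of matrices standing in for the poster's data.
mats <- list(a = matrix(1:4, 2), b = matrix(1:6, 2), c = matrix(1:9, 3))

# Hypothetical per-matrix function that also receives the iteration number n.
describe <- function(m, n) sprintf("n=%d: %d x %d", n, nrow(m), ncol(m))

# seq_along() supplies the counter 1, 2, 3, ... alongside each element.
res <- lapply(seq_along(mats), function(n) describe(mats[[n]], n))

# Equivalent with Map(), which walks the list and the counter in parallel.
res2 <- Map(describe, mats, seq_along(mats))
```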

--
View this message in context: 
http://r.789695.n4.nabble.com/repeat-function-for-entire-list-of-matrices-tp4334587p4339299.html
Sent from the R help mailing list archive at Nabble.com.


[R] height of plots

2012-01-30 Thread 1Rnwb

Hello R gurus,

I have to create 12 plots. I have been using the following script, which
leaves a large white space between adjacent plots. I would appreciate it if
someone could suggest an alternative that reduces the white space.
par(mar=c(3,3,.5,.5))
split.screen(c(6,2))# split display into a 6 x 2 grid of 12 screens

for (i in 1:12)
{

if (i < 11)
{
screen(i)
plot(1:10,xaxt='n', xlab='', ylab='')  
box()
}else{
screen(i)
plot(1:10, xlab='', ylab='', cex=0.75)  
box()
}
}

Thanks
Sharad

--
View this message in context: 
http://r.789695.n4.nabble.com/height-of-plots-tp4339152p4339152.html
Sent from the R help mailing list archive at Nabble.com.


[R] need some help with model.matrix

2012-01-30 Thread Daniel Negusse

hello folks, 

i am learning R and microarray analysis from scratch using different sites. 

today i am doing an exercise from 
http://manuals.bioinformatics.ucr.edu/home/R_BioCondManual#R_functions

the section i am at is 2. Affymetrix data analysis. 

I understand the syntax given in this section up until: 

design <- model.matrix(~ -1+factor(c(1,1,2,2,3,3))) # Creates appropriate 
# design matrix. Alternatively, such a design matrix can be created in any 
# spreadsheet program and then imported into R.

i am stuck at this point. i believe the model.matrix is creating a design 
matrix that the data will be put in later. the data in the example is: 

Name            FileName                          Target
Shoot12h.1      COLD_CONTROL_12H_SHOOT_REP1.cel   c12h
Shoot12h.2      COLD_CONTROL_12H_SHOOT_REP2.cel   c12h
ColdShoot6h.1   COLD_6H_SHOOT_REP1.cel            t6h
ColdShoot6h.2   COLD_6H_SHOOT_REP2.cel            t6h
ColdShoot12h.1  COLD_12H_SHOOT_REP1.cel           t12h
ColdShoot12h.2  COLD_12H_SHOOT_REP2.cel           t12h

Three experimental samples (duplicates of each giving a total of 6 arrays). 

now back to where i got stuck: 

design <- model.matrix(~ -1+factor(c(1,1,2,2,3,3))) # Creates appropriate 
# design matrix. Alternatively, such a design matrix can be created in any 
# spreadsheet program and then imported into R.

what is model.matrix exactly doing? 

my real data that i will analyze after figuring this out has 49 arrays 
(columns): 3, 6, 9, 12 month samples with 9 replicates each and then 23 month 
samples with 13 replicates == total 49. 

how should i create an appropriate design matrix?? PLEASE help? 

thanks, 

daniel 
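
As a rough illustration of what model.matrix does here: it does not hold the expression data. It builds an indicator matrix with one row per array and one column per group. The sketch below first reproduces the six-array example and then builds the 49-array layout described above; the labels "m3" ... "m23" are hypothetical names, and it assumes the arrays are ordered by age in the same order as the columns of the expression data.

```r
# Group labels for the six arrays in the example: two replicates per target.
group <- factor(c(1, 1, 2, 2, 3, 3))
design <- model.matrix(~ -1 + group)
design
# Each row is one array, each column one group; a 1 marks group membership.

# Hypothetical labels for the 49-array layout: 9 replicates each at
# 3, 6, 9 and 12 months, then 13 replicates at 23 months.
ages <- c("m3", "m6", "m9", "m12", "m23")
age <- factor(rep(ages, times = c(9, 9, 9, 9, 13)), levels = ages)
design49 <- model.matrix(~ -1 + age)
colnames(design49) <- levels(age)
dim(design49)       # 49 rows (arrays) by 5 columns (age groups)
colSums(design49)   # number of arrays in each group
```

The `-1` drops the intercept, so every group gets its own column instead of being expressed relative to a baseline group.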

[R] how to calculate length of each triangulated face in deldir

2012-01-30 Thread uday

Hi, 
I have some data 

library(sp)      # spDistsN1
library(deldir)  # deldir

data=read.table("SCI.was", header=TRUE)

sci_lat=data[,7] # latitude
temp_lon=data[,8] # longitude
# the longitude data is in 360 degree format need to convert to -180 to 180
sci_lon= ((temp_lon+180) %% 360 ) -180

m <- cbind(sci_lon,sci_lat)
dist <- spDistsN1(m, m[1,], longlat=TRUE)
hist(dist[2:9997])
try <- deldir(sci_lon,sci_lat)
try_p <- deldir(sci_lon,sci_lat,plot=TRUE,wl='tr') 

Now I would like to calculate the length of each triangulated face , could
somebody please tell me how to calculate it ?


Cheers 
Uday 
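
A minimal sketch of one way to get edge lengths, assuming "length of each triangulated face" means the length of each edge of the triangulation. The object returned by deldir() has a `delsgs` component with the end points of each Delaunay edge in columns x1, y1, x2, y2. To stay self-contained, the code uses a hypothetical data frame with that column layout instead of calling deldir; `segment_lengths` is an illustrative helper, not part of the package.

```r
# Euclidean length of each segment in a data frame with columns
# x1, y1, x2, y2 (the layout of the delsgs component of a deldir result).
segment_lengths <- function(segs) {
  sqrt((segs$x1 - segs$x2)^2 + (segs$y1 - segs$y2)^2)
}

# Hypothetical stand-in for deldir(sci_lon, sci_lat)$delsgs:
# the three edges of a 3-4-5 right triangle.
segs <- data.frame(x1 = c(0, 3, 0), y1 = c(0, 0, 4),
                   x2 = c(3, 0, 0), y2 = c(0, 4, 0))
edge_len <- segment_lengths(segs)
edge_len
```

With longitude/latitude input these lengths are in degrees; distances in kilometres would need a great-circle calculation on the same end points.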
  

 

--
View this message in context: 
http://r.789695.n4.nabble.com/how-to-calculate-length-of-each-triangulated-face-in-deldir-tp4339205p4339205.html
Sent from the R help mailing list archive at Nabble.com.


[R] how to select columns

2012-01-30 Thread David Studer

Hello,
I have the following question:

When creating a data.frame
 a1 <- c(1,2,3)
 a2 <- c(1,2,3)
 c <- data.frame(a1,a2)
I can select columns using an index like:
c[,1:2]
Is this also possible using column names? (Something like c(,a1:a2),
which doesn't work.)

Alternative question: Is there a function to get the index of a variable by
name or can I
select certain columns using a loop? (a_1, a_2, ..., a_n)

Thank you very much!
David


[R] Modifying whiskers in boxplots?

2012-01-30 Thread J. Willacker

Hello,

I know this has been covered on here before, but as a complete novice, I
need a little more guidance.  I would like to produce boxplots with the
whiskers extending to the 10 and 90th percentiles.  I found this code:

myboxplot.stats <- function (x, coef = NULL, do.conf = TRUE, do.out =
TRUE)
{
  nna <- !is.na(x)
  n <- sum(nna)
  stats <- quantile(x, c(.1,.25,.5,.75,.9), na.rm = TRUE)
  iqr <- diff(stats[c(2, 4)])
  out <- x < stats[1] | x > stats[5]
  conf <- if (do.conf)
    stats[3] + c(-1.58, 1.58) * diff(stats[c(2, 4)])/sqrt(n)
  list(stats = stats, n = n, conf = conf, out = x[out & nna])
}

posted by Mr. Jim Bowers, and additional posts discussing how to make
it work.  The issue I am having is that all the posts say to edit
boxplot.default, but I have no idea how to actually do that.  I've
tried fix(boxplot.default) and fixInNamespace(...), but what do I
actually change?  I tried including an argument stats=myboxplot.stats
but that did not change anything.  Once it is changed, can I just use
the same boxplot(...) code as normal or do I need to use myboxplot?
I know this should be simple, but it is generating a lot of
frustration.  Any help would be greatly appreciated.

Thanks, James


[R] PLEASE HELP creating a matrix

2012-01-30 Thread hagereseb

hello folks, 

i am learning R and microarray analysis from scratch using different sites. 

today i am doing an exercise from
http://manuals.bioinformatics.ucr.edu/home/R_BioCondManual#R_functions

the section i am at is 2. Affymetrix data analysis. 

I understand the syntax given in this section up until: 

design <- model.matrix(~ -1+factor(c(1,1,2,2,3,3))) # Creates appropriate
# design matrix. Alternatively, such a design matrix can be created in any
# spreadsheet program and then imported into R.

i am stuck at this point. i believe the model.matrix is creating a design
matrix that the data will be put in later. the data in the example is: 

Name            FileName                          Target
Shoot12h.1      COLD_CONTROL_12H_SHOOT_REP1.cel   c12h
Shoot12h.2      COLD_CONTROL_12H_SHOOT_REP2.cel   c12h
ColdShoot6h.1   COLD_6H_SHOOT_REP1.cel            t6h
ColdShoot6h.2   COLD_6H_SHOOT_REP2.cel            t6h
ColdShoot12h.1  COLD_12H_SHOOT_REP1.cel           t12h
ColdShoot12h.2  COLD_12H_SHOOT_REP2.cel           t12h

Three experimental samples (duplicates of each giving a total of 6 arrays). 

now back to where i got stuck: 

design <- model.matrix(~ -1+factor(c(1,1,2,2,3,3))) # Creates appropriate
# design matrix. Alternatively, such a design matrix can be created in any
# spreadsheet program and then imported into R.

what is model.matrix exactly doing? 

my real data that i will analyze after figuring this out has 49 arrays
(columns): 3, 6, 9, 12 month samples with 9 replicates each and then 23
month samples with 13 replicates == total 49. 

how should i create an appropriate design matrix?? PLEASE help? 

thanks, 

daniel 

--
View this message in context: 
http://r.789695.n4.nabble.com/PLEASE-HELP-creating-a-matrix-tp4340263p4340263.html
Sent from the R help mailing list archive at Nabble.com.


[R] about undefined columns selected

2012-01-30 Thread xiaocong zuo

Hi,all,

When I run the code below, an error occurs. Could you please tell
me how to fix it?
> pdf('covariate.pdf')
> par(mfrow=c(1,1))
> pairs(data2[,c("ID","TYPE","AGE","GNDR","HT")],
+ panel=function(x,y) { points(x,y); lines(lowess(x,y))})
Error in `[.data.frame`(data2, , c("ID", "TYPE", "AGE", "GNDR", "HT")) :
  undefined columns selected
> dev.off()
RStudioGD
        2
Thank you very much!


[R] Using influence plots and obtaining id numbers

2012-01-30 Thread Pam

I am a novice R user, and I am having difficulty understanding R's influence
plots. 

I am trying to remove outliers from a particular variable, sib. I am able
to generate influence plots and further outlier information such as below
(which is a shortened example). For my analyses, I end up excluding the
points R refers to, 7, 18, 26, and 105. However, my question is, how can I
understand which ID numbers these points (7,18,26, and 105) are referring
to? These numbers, 7,18, 26. and 105, are definitely not my study ID
numbers.


> Myoutput <- aov(sib ~ newgroup1, data=Study1)
> influencePlot(Myoutput)
[1]   7  18  26 105
> influence.measures(Myoutput)
Influence measures of
 aov(formula = sib ~ newgroup1, data = Study1) :

      dfb.1_  dfb.nw12  dfb.nw13  dfb.nw14  dfb.nw15     dffit cov.r   cook.d    hat inf
33  1.70e-01 -1.33e-01 -1.53e-01 -1.56e-01 -1.52e-01  0.170405 1.124 5.83e-03 0.0909   *
34  7.79e-02 -6.07e-02 -7.00e-02 -7.14e-02 -6.94e-02  0.077934 1.131 1.22e-03 0.0909   *
35  1.47e-01 -1.15e-01 -1.32e-01 -1.35e-01 -1.31e-01  0.147268 1.126 4.36e-03 0.0909   *
36  6.64e-02 -5.17e-02 -5.96e-02 -6.08e-02 -5.91e-02  0.066386 1.132 8.86e-04 0.0909   *
37 -3.15e-01  2.46e-01  2.83e-01  2.89e-01  2.81e-01 -0.315448 1.100 1.99e-02 0.0909   *
38  1.47e-01 -1.15e-01 -1.32e-01 -1.35e-01 -1.31e-01  0.147268 1.126 4.36e-03 0.0909   *
39 -9.26e-01  7.22e-01  8.32e-01  8.48e-01  8.24e-01 -0.926059 0.882 1.64e-01 0.0909   *


--
View this message in context: 
http://r.789695.n4.nabble.com/Using-influence-plots-and-obtaining-id-numbers-tp4339144p4339144.html
Sent from the R help mailing list archive at Nabble.com.
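
A hedged sketch for the question above: the numbers reported by such diagnostics are most likely the row names of the data frame (which equal the row positions unless rows were dropped or renamed), so they can be looked up directly. The data frame below is a simulated stand-in for Study1, and the column name `id` is an assumption.

```r
set.seed(1)
# Hypothetical stand-in for Study1: `id` holds study IDs that differ from the
# default row names "1", "2", ..., "120".
Study1 <- data.frame(
  id        = sprintf("S%04d", seq(1001, by = 7, length.out = 120)),
  newgroup1 = factor(rep(1:5, each = 24)),
  sib       = rnorm(120),
  stringsAsFactors = FALSE
)

flagged <- c(7, 18, 26, 105)            # labels reported in the post
# Look the labels up as row names (character), not as row positions.
flagged_ids <- Study1[as.character(flagged), "id"]
flagged_ids

# Alternatively, make the study IDs the row names before fitting, so that
# residuals and diagnostics are labelled with them directly.
rownames(Study1) <- Study1$id
fit <- aov(sib ~ newgroup1, data = Study1)
head(names(residuals(fit)))
```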


Re: [R] about undefined columns selected

2012-01-30 Thread Milan Bouchet-Valat

On Sunday 29 January 2012 at 21:50 -0500, xiaocong zuo wrote:
> Hi,all,
>
> when I run the below code,there is an error occured. could you please tell
> me how to treat it?
> > pdf('covariate.pdf')
> > par(mfrow=c(1,1))
> > pairs(data2[,c("ID","TYPE","AGE","GNDR","HT")],
> + panel=function(x,y) { points(x,y); lines(lowess(x,y))})
> Error in `[.data.frame`(data2, , c("ID", "TYPE", "AGE", "GNDR", "HT")) :
>   undefined columns selected
> > dev.off()
> RStudioGD
>         2
> Thank you very much!
This simply means that one of the columns you tried to select doesn't
exist in data2. You can see what columns are present using:
colnames(data2)

or since data2 is a data frame:
names(data2)
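
To pinpoint which requested name is the culprit, the requested names can be compared with the existing ones. This is a small sketch on a made-up data frame in which one column is deliberately spelled differently; the real data2 is not shown in the thread.

```r
# Hypothetical data frame in which one wanted column is spelled differently.
data2 <- data.frame(ID = 1:3, TYPE = c("a", "b", "a"), AGE = c(30, 41, 25),
                    Gndr = c("F", "M", "F"), HT = c(160, 175, 168))
wanted <- c("ID", "TYPE", "AGE", "GNDR", "HT")

missing_cols <- setdiff(wanted, names(data2))   # requested but absent
missing_cols

# Select only the columns that do exist.
present <- intersect(wanted, names(data2))
head(data2[, present])
```

Column names are case-sensitive, so "GNDR" and "Gndr" count as different columns.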


But you could probably have figured this out by yourself... ;-)

Hope this helps


Re: [R] Data generation

2012-01-30 Thread MK


Assuming you want the whole data matrix coming from a single distribution.

matrix(rnorm(20 *30, 3, 1), 20, 30)
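
A slightly expanded sketch of the same idea. Values drawn this way come from a normal distribution with mean 3 and standard deviation 1, so the sample mean and sd will only be close to those values. If the matrix itself must have exactly mean 3 and sd 1 (an assumption about what was wanted), the draws can be standardised first. The seed is arbitrary.

```r
set.seed(42)                       # arbitrary seed, for reproducibility only
m <- matrix(rnorm(20 * 30, mean = 3, sd = 1), nrow = 20, ncol = 30)
dim(m)
c(mean = mean(m), sd = sd(as.vector(m)))   # near 3 and 1, but not exact

# If the sample itself must have mean 3 and sd 1 exactly, standardise first.
z <- rnorm(20 * 30)
z <- (z - mean(z)) / sd(z)
m_exact <- matrix(3 + 1 * z, nrow = 20, ncol = 30)
c(mean = mean(m_exact), sd = sd(as.vector(m_exact)))
```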


On 30/01/12 06:33, Partha Sinha wrote:

I want to generate a data matrix (20*30) having mean 3 and std
deviation 1 (normal dist).
pl help
Partha


Re: [R] how to select columns

2012-01-30 Thread Milan Bouchet-Valat

On Monday 30 January 2012 at 08:30 +0100, David Studer wrote:
> Hello,
> I have the following question:
>
> when creating a data.frame
>  a1 <- c(1,2,3)
>  a2 <- c(1,2,3)
>  c <- data.frame(a1,a2)
> I can select columns using an index like:
> c[,1:2]
> Is this possible too when using column-names? (something like c(,a1:a2),
> which doesn't work)
Read the R intro, or any tutorial on R. You can just do:
c[,c("a1", "a2")]

(And I think you don't understand what : does; read the manual. At
least, it doesn't work the way your attempt c(,a1:a2) would imply.)

> Alternative question: Is there a function to get the index of a variable by
> name or can I
> select certain columns using a loop? (a_1, a_2, ..., a_n)
No need for a loop:
which(colnames(c) == "a1")


Cheers
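
A few further ways to select columns by name, sketched on a small made-up data frame (called `df` here to avoid masking the function c()); the column `b` is added only to show that it gets excluded.

```r
df <- data.frame(a1 = 1:3, a2 = 4:6, b = 7:9)

sel <- df[, c("a1", "a2")]                   # select by a vector of names
idx <- match(c("a1", "a2"), names(df))       # positions of the named columns

# A name-based range: subset() allows first:last on column names.
rng <- subset(df, select = a1:a2)

# All columns following the pattern a1, a2, ..., an, without a loop.
pat <- df[, grep("^a[0-9]+$", names(df)), drop = FALSE]
```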


Re: [R] height of plots

2012-01-30 Thread Jim Lemon


On 01/30/2012 07:24 AM, 1Rnwb wrote:

Hello R gurus,

  I have to create 12 plots, I have been using the following script, which
leaves a large white space between two plot. I would appreciate if someone
can suggest an alternative to reduce the white space.
par(mar=c(3,3,.5,.5))
split.screen(c(6,2))# split display into two screens

for (i in 1:12)
{

if (i < 11)
{
screen(i)
plot(1:10,xaxt='n', xlab='', ylab='')
box()
}else{
screen(i)
plot(1:10, xlab='', ylab='', cex=0.75)
box()
}
}


Hi Sharad,
Specify your margins like this:

split.screen(c(6,2))
for (i in 1:12) {
 if (i < 11) {
  screen(i)
  par(mar=c(0.5,3,0,0.5))
  plot(1:10,xaxt='n', xlab='', ylab='')
  box()
 }
 else {
  screen(i)
  par(mar=c(3,3,0,0.5))
  plot(1:10, xlab='', ylab='', cex=0.75)
 }
}
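
An alternative sketch using par(mfrow) instead of split.screen: thin inner margins for every panel plus an outer margin that leaves room for the bottom axis. The margin values are illustrative guesses and will likely need tuning for a given device size.

```r
n_plots <- 12
# Only the two panels in the bottom row of the 6 x 2 grid get an x axis.
axis_type <- ifelse(seq_len(n_plots) > 10, "s", "n")

pdf(NULL)   # null device so the sketch runs anywhere; omit (together with
            # dev.off() below) to draw on screen
op <- par(mfrow = c(6, 2),             # 6 rows x 2 columns, filled row by row
          mar = c(0.3, 3, 0.3, 0.5),   # thin inner margins for every panel
          oma = c(3, 0, 0.5, 0))       # outer margin leaves room for the axis
for (i in seq_len(n_plots)) {
  plot(1:10, xaxt = axis_type[i], xlab = "", ylab = "", cex = 0.75)
}
par(op)
dev.off()
```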

Jim


[R] handling a lot of data

2012-01-30 Thread Petr Kurtin

Hi,

I have got a lot of SPSS data for years 1993-2010. I load all data into
lists so I can easily index the values over the years. Unfortunately loaded
data occupy quite a lot of memory (10Gb) - so my question is, what's the
best approach to work with big data files? Can R get a value from the file
data without full loading into memory? How can a slower computer with not
enough memory work with such data?

I use the following commands:

data1993 = vector("list", 4);
data1993[[1]] = read.spss(...)  # first trimester
data1993[[2]] = read.spss(...)  # second trimester
...
data_all = vector("list", 17);
data_all[[1993]] = data1993;
...

and indexing, e.g.: data_all[[1993]][[1]]$DISTRICT, etc.

Thanks,
Petr Kurtin


Re: [R] handling a lot of data

2012-01-30 Thread Milan Bouchet-Valat

On Monday 30 January 2012 at 09:54 +0100, Petr Kurtin wrote:
> Hi,
>
> I have got a lot of SPSS data for years 1993-2010. I load all data into
> lists so I can easily index the values over the years. Unfortunately loaded
> data occupy quite a lot of memory (10Gb) - so my question is, what's the
> best approach to work with big data files? Can R get a value from the file
> data without full loading into memory? How can a slower computer with not
> enough memory work with such data?
>
> I use the following commands:
>
> data1993 = vector("list", 4);
> data1993[[1]] = read.spss(...)  # first trimester
> data1993[[2]] = read.spss(...)  # second trimester
> ...
> data_all = vector("list", 17);
> data_all[[1993]] = data1993;
> ...
>
> and indexing, e.g.: data_all[[1993]][[1]]$DISTRICT, etc.
Have a look at the "Large memory and out-of-memory data" section of the
High Performance Computing task view [1]. In particular, you may want to use
the ff package and its ffdf object, which allows backing a data frame
with a file so that RAM can be freed when needed.

Another piece of advice: convert the data from SPSS format
to .RData once, and always use the latter afterwards. In my experience,
importing often creates memory fragmentation, in addition to being
very slow (don't hesitate to save, quit and restart R to reduce this
problem).

What use do you make of the different years? If you need, e.g., to run a
model on all of them at the same time, then you'll need to concatenate
all the data frames from the data_all list, and I guess that's where
RAM will be the problem: you'll have two copies of the data at the
same time. Once you've succeeded in doing this, loading the full data set
will use less RAM, and so may work on lower-end computers.

A general solution is also to load only the variables you really need.
The saves package allows you to save the whole data set into an
archive of several .RData files, and to load only what you want from it.
It all depends on your needs, constraints, and failed attempts. ;-)
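
The "convert once, then load only what you need" advice can be sketched in base R with saveRDS()/readRDS(), one file per year and quarter. This is an illustration under assumptions, not the method of any package named above: `make_quarter` is a hypothetical stand-in for the SPSS import, and the file naming scheme is made up.

```r
# Hypothetical stand-in for one imported quarter; in practice this would be
# the data frame produced by the SPSS import.
make_quarter <- function(year, q) {
  data.frame(YEAR = year, QUARTER = q, DISTRICT = c("A", "B", "C"),
             VALUE = c(1, 2, 3) * q)
}

cache_dir <- file.path(tempdir(), "spss_cache")
dir.create(cache_dir, showWarnings = FALSE)

# Convert once: store each year/quarter in its own .rds file.
for (year in 1993:1994) {
  for (q in 1:4) {
    saveRDS(make_quarter(year, q),
            file.path(cache_dir, sprintf("data_%d_q%d.rds", year, q)))
  }
}

# Later: read a single file, then keep only the requested columns.
load_quarter <- function(year, q, columns = NULL) {
  d <- readRDS(file.path(cache_dir, sprintf("data_%d_q%d.rds", year, q)))
  if (is.null(columns)) d else d[, columns, drop = FALSE]
}

districts_1993_q1 <- load_quarter(1993, 1, columns = "DISTRICT")
```

Only one quarter is in memory at a time, which keeps the peak footprint far below that of a list holding every year.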


Regards


[1] http://cran.r-project.org/web/views/HighPerformanceComputing.html


[R] Odp: Modifying whiskers in boxplots?

2012-01-30 Thread Petr PIKAL

Hi

 
> Hello,
>
> I know this has been covered on here before, but as a complete novice, I
> need a little more guidance.  I would like to produce boxplots with the
> whiskers extending to the 10 and 90th percentiles.  I found this code:
>
> myboxplot.stats <- function (x, coef = NULL, do.conf = TRUE, do.out =
> TRUE)
> {
>   nna <- !is.na(x)
>   n <- sum(nna)
>   stats <- quantile(x, c(.1,.25,.5,.75,.9), na.rm = TRUE)
>   iqr <- diff(stats[c(2, 4)])
>   out <- x < stats[1] | x > stats[5]
>   conf <- if (do.conf)
>     stats[3] + c(-1.58, 1.58) * diff(stats[c(2, 4)])/sqrt(n)
>   list(stats = stats, n = n, conf = conf, out = x[out & nna])
> }
>
> posted by Mr. Jim Bowers, and additional posts discussing how to make
> it work.  The issue I am having is that all the posts say to edit
> boxplot.default, but I have no idea how to actually do that.  I've
> tried fix(boxplot.default)

I do not see any need for modifying boxplot.default. You could change the
$stats part of the list produced by the boxplot call:

set.seed(111)
x <- rnorm(100)
bb <- boxplot(x, plot=F)

quantile(x,c(.1,.9))
      10%       90% 
-1.315051  1.400721 

bb$stats
           [,1]
[1,] -2.3023457
[2,] -0.7581696
[3,]  0.1315965
[4,]  0.6211842
[5,]  2.4856616

bb$stats[c(1,5),] <- quantile(x,c(.1,.9))
bb$stats
           [,1]
[1,] -1.3150509
[2,] -0.7581696
[3,]  0.1315965
[4,]  0.6211842
[5,]  1.4007212

 bxp(bb, add=T, col=2)
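
The same idea can be wrapped in a small helper that handles several groups at once. This is a sketch: `p10_p90_boxplot` is a hypothetical name, and the choice to redraw every value outside the 10th-90th percentile range as an individual point is an assumption about the desired display.

```r
# Draw boxplots whose whiskers end at the 10th and 90th percentiles.
p10_p90_boxplot <- function(groups, ...) {
  bb <- boxplot(groups, plot = FALSE)      # compute the statistics only
  out <- vector("list", length(groups))
  for (j in seq_along(groups)) {
    x <- groups[[j]]
    x <- x[!is.na(x)]
    q <- quantile(x, c(0.1, 0.9), names = FALSE)
    bb$stats[c(1, 5), j] <- q              # replace the whisker ends
    out[[j]] <- x[x < q[1] | x > q[2]]     # points beyond the new whiskers
  }
  bb$out   <- unlist(out)
  bb$group <- rep(seq_along(groups), lengths(out))
  bxp(bb, ...)                             # draw from the modified statistics
  invisible(bb)
}

set.seed(111)
groups <- list(a = rnorm(100), b = rnorm(100, mean = 1))
pdf(NULL)   # null device; omit (together with dev.off()) to draw on screen
bb <- p10_p90_boxplot(groups)
dev.off()
```

Calling bxp() without add = TRUE starts a new plot, so no prior boxplot() drawing is needed.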

Regards
Petr


 
> and fixInNamespace(...), but what do I actually change?  I tried
> including an argument stats=myboxplot.stats but that did not change
> anything.  Once it is changed, can I just use the same boxplot(...)
> code as normal or
>
> do I need to use myboxplot?  I know this should be simple, but it is
> generating a lot of frustration.  Any help would be greatly
> appreciated.
>
> Thanks, James
 

Re: [R] ColorBrewer question

2012-01-30 Thread Mario Giesel

It works! Thanks a lot for your explanations, Michael.
 
Good luck,
 Mario



From: R. Michael Weylandt michael.weyla...@gmail.com

Cc: r-help@r-project.org r-help@r-project.org 
Sent: 5:22 Monday, 30 January 2012
Subject: Re: [R] ColorBrewer question

I believe you need to use the scale_fill_brewer since fill is the
color of the bars while color is the outside of the bars in
ggplot2-speak:

E.g., with built-in data (it's polite to provide yours so that your
minimal working example is working):

library(ggplot2)
data(diamonds)
ggplot(diamonds, aes(clarity)) + geom_bar(aes(fill = clarity, color = clarity))

# Note the borders are now changed but the fill is the same
ggplot(diamonds, aes(clarity)) + geom_bar(aes(fill = clarity, color =
clarity)) + scale_color_brewer(palette = "Blues")

# Now the fill is changed, but you probably want to drop the border
# coloring since it's hideous against the blues
ggplot(diamonds, aes(clarity)) + geom_bar(aes(fill = clarity, color =
clarity)) + scale_fill_brewer(palette = "Blues")

# So lovely
ggplot(diamonds, aes(clarity)) + geom_bar(aes(fill = clarity)) +
scale_fill_brewer(palette = "Blues")

Michael


> Hello, R friends,
>
> I'm trying to change colors of my horizontal bars so that they show a
> sequence.
> I chose the ColorBrewer palette "Blues". However the resulting plot doesn't
> show any changes to the default.
> I tried several places of "+ scale_colour_brewer(type="seq", pal = "Blues")"
> with no effect.
> This is my code:
>
> p <- ggplot(data, aes(x = gender))  +
> scale_y_continuous("",formatter="percent") + xlab("Gender") + coord_flip() +
>    scale_colour_brewer(type="seq", pal = "Blues")
> p+geom_bar(aes(fill=pet),colour='black',position='fill')
>
>
> Any ideas welcome.
> Thanks,
>  Mario

Re: [R] multiple column comparison

2012-01-30 Thread Petr PIKAL

Hi
 
I did not see any response, and I cannot offer a ready-made
solution either. For such problems there could be various solutions, from
loops to *apply, reshape or plyr options.

However, for anybody to get started it would help to have a clearer
description together with a small, readily available toy data set
(preferably produced by dput) and the desired result.

Regards
Petr
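
For the coder-agreement question quoted below, one possible base-R sketch follows. The toy data mimic the layout in the question but are made up, `pairwise_agreement` is a hypothetical helper, and it assumes each coder has exactly one row per ID.

```r
# Hypothetical toy data in the layout described in the quoted question.
codes <- data.frame(
  ID    = rep(c("ID1", "ID2"), times = 3),
  CODER = rep(c("C1", "C2", "C3"), each = 2),
  V1    = c("y", "y", "n", "n", "n", "n"),
  V2    = c("int", "ext", "int", "int", "int", "int"),
  stringsAsFactors = FALSE
)

# For every variable and every pair of coders, record 1 where the two
# coders agree on an ID and 0 where they differ.
pairwise_agreement <- function(d, id = "ID", coder = "CODER") {
  vars   <- setdiff(names(d), c(id, coder))
  coders <- sort(unique(d[[coder]]))
  ids    <- sort(unique(d[[id]]))
  res    <- data.frame(ID = ids, stringsAsFactors = FALSE)
  pairs  <- combn(coders, 2, simplify = FALSE)
  for (v in vars) {
    for (p in pairs) {
      a <- d[d[[coder]] == p[1], ]
      b <- d[d[[coder]] == p[2], ]
      # Align both coders' rows on ID before comparing.
      va <- a[[v]][match(ids, a[[id]])]
      vb <- b[[v]][match(ids, b[[id]])]
      res[[paste(p[1], p[2], v, sep = ".")]] <- as.integer(va == vb)
    }
  }
  res
}

agreement <- pairwise_agreement(codes)
agreement
colMeans(agreement[-1])   # proportion agreement per coder pair and variable
```

Because the variables and coder pairs are derived from the data, the same function works for any number of variables and coders.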


> Hello,
> I have a very large content analysis project, which I've just begun to
> collect training data on. I have three coders, who are entering data on up
> to 95 measurements. Traditionally, I've used Excel to check coder agreement
> (e.g., percentage agreement), by lining up each coder's measurements
> side-by-side, creating a new column with the results using if statements.
> That is, if (a=b, 1, 0). With this many variables, I am clearly interested
> in something that I don't have to create manually every time I check
> percentage agreement for coders.
>
> The data are set up like this:
>
> ID   CODER  V1  V2   V3   V4 ... V95
> ID1  C1     y   int  doc  y
> ID2  C1     y   ext  doc  y
> ID1  C2     n   int  doc  y
> ID2  C2     n   int  doc  y
> ID1  C3     n   int  doc  y
> ID2  C3     n   int  doc  y
>
> I would like to write a script to do the following:
> For each variable compare each pair of coders using if statements (e.g., if
> C1.V1.==C1.V2, 1, 0)
>
> ID   C1.V1  C2.V1  C3.V1
> ID1  y      y      y
> ID2  y      y      y
>
> For each coding pair, enter the resulting 1s and 0s into a new column.
>
> The new column name would reflect the results of the comparison, such as
> C1.C2.V1
>
> I'd ideally like to create this so that it can handle any number of
> variables and any number of coders.
>
> I appreciate any thoughts, help, and pointers on this.
>
> Thanks in advance.
>
> Best,
> Ryan Fuller
> Doctoral Candidate, Communication
> Graduate Student Researcher, Carsey-Wolf Center
> http://carseywolf.ucsb.edu
> University of California, Santa Barbara
 
 
 --
> View this message in context:
> http://r.789695.n4.nabble.com/multiple-column-comparison-tp4332604p4332604.html
 Sent from the R help mailing list archive at Nabble.com.
 

[R] Consultant to program R-code dealing with social networks

2012-01-30 Thread Michael Haenlein

Dear all,

I am looking for a consultant/ programmer to program a relatively simple R
code for me.

Specifically, I have about 50 social networks. These networks have between
5,000 and 5 million nodes and between 30,000 and 70 million edges. The code
should (a) read one network into R, (b) draw a snowball sample of size x
out of the network (e.g., a snowball sample of 1,000 nodes), (c) determine
some basic network statistics for that sample and (d) save the sample and
network statistics into two files for further use.

Let me know by email in case you are interested so that we can speak about
the remaining details.

Thanks,

Michael




Michael Haenlein
Associate Professor of Marketing
ESCP Europe
Paris, France


Re: [R] nice report generator?

2012-01-30 Thread Tal Galili

Hello dear Duncan, Gabor, Michael and others,

After taking some time, I wrote a bridge function between a cast_df
object from the {reshape} package into a table in Duncan's new {tables}
package.

The motivation was to make cast_df table prettier in the R terminal, as
well as allow us to export a pretty version of the table to latex (using
Hmisc::latex, on the output of tabular.cast_df)

The code is now available on:
http://www.r-statistics.com/2012/01/printing-nested-tables-in-r-bridging-between-the-reshape-and-tables-packages/

I would be happy for any input/revisions/suggestions from you.

With regards,
Tal



Contact
Details:---
Contact me: tal.gal...@gmail.com |  972-52-7275845
Read me: www.talgalili.com (Hebrew) | www.biostatistics.co.il (Hebrew) |
www.r-statistics.com (English)
--




On Thu, Dec 8, 2011 at 8:37 PM, Tal Galili tal.gal...@gmail.com wrote:

 reasonably


[R] Installing Rcompression package

2012-01-30 Thread Jeremy MAZET

Dear all

I'm trying to install the Rcompression package under R-2.14.0 on a Windows 
platform. 
I need it in order to use the Ropenoffice package.

Because there is no binary available, I'm trying to install it from source, 
but I always get error messages. 
I have installed zlib and bzip2, defined the LIB_ZLIB and LIB_BZIP2 
variables, and I have no spaces in my R home directories... but nothing 
works!

Is there anyone who could help me?

Jérémy Mazet


Re: [R] Installing Rcompression package

2012-01-30 Thread Prof Brian Ripley


On 30/01/2012 12:24, Jeremy MAZET wrote:

Dear all

I'm trying to install the Rcompression package under R-2.14.0 on a Windows
plateform.
I need it to use the Ropenoffice package

Because there is no binary available, I'm trying to install it from source
but I have always some error messages.
I have installed zlib and Bzip2 softwares, defined  LIB_ZLIB and LIB_BZIP2
variables and I have no space in my R home directories... but nothing tod
do!

Is there anyone who could help me?


Well, you have not even told us where you got either of those packages 
from.  But as the posting guide told you, your first port of call is the 
maintainer.


AFAIK these are Omegahat packages, and Omegahat often provides Windows 
binaries for its packages.  So if they have chosen not to do so for 
Rcompression, the maintainer may have a good reason.




Jérémy Mazet




--
Brian D. Ripley,  rip...@stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel:  +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595


[R] Variable selection based on both training and testing data

2012-01-30 Thread Jin Minming

Dear all,

The variable selection in regression is usually determined by the training data 
using AIC or F value, such as stepAIC. Is there some R package that can 
consider both the training and test dataset? For example, I have two separate 
training data and test data. Firstly, a regression model is obtained by using 
training data, and then this model is tested by using test data. This process 
continues in order to find some possible optimal models in terms of RMSE or R2 
for both training and test data. 

Thanks,

Jim


[R] Getting htmlParse to work with Hebrew? (on windows)

2012-01-30 Thread Tal Galili

Hello dear R-help mailing list.



I would like htmlParse to work well with Hebrew, but it keeps
scrambling the Hebrew text in pages I feed into it.

For example:

# why can't I parse the Hebrew correctly?

library(RCurl)
library(XML)
u = "http://humus101.com/?p=2737"
a = getURL(u)
a # Here - the Hebrew is fine.
a2 <- htmlParse(a)
a2 # Here it is a mess...

None of these seem to fix it:

htmlParse(a, encoding = "utf-8")

htmlParse(a, encoding = "iso8859-8")

This is my locale:

> Sys.getlocale()

[1] "LC_COLLATE=Hebrew_Israel.1255;LC_CTYPE=Hebrew_Israel.1255;LC_MONETARY=Hebrew_Israel.1255;LC_NUMERIC=C;LC_TIME=Hebrew_Israel.1255"


Any suggestions?


Thanks up front,
Tal



Contact Details:
Contact me: tal.gal...@gmail.com |  972-52-7275845
Read me: www.talgalili.com (Hebrew) | www.biostatistics.co.il (Hebrew) |
www.r-statistics.com (English)
--


Re: [R] Variable selection based on both training and testing data

2012-01-30 Thread Liaw, Andy

Variable selection is part of the training process: it chooses the model.  By 
definition, test data is used only for testing (evaluating the chosen model).

If you find a package or function that does variable selection on test data, 
run from it!
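
A minimal sketch of the workflow described above, assuming simulated data and base R's `step()` as a stand-in for `stepAIC`: selection sees only the training rows, and the held-out rows are used once, to score the chosen model. The data, the 70/30 split and all variable names are hypothetical.

```r
set.seed(42)
n <- 200
dat <- data.frame(x1 = rnorm(n), x2 = rnorm(n), x3 = rnorm(n), x4 = rnorm(n))
dat$y <- 1 + 2 * dat$x1 - 1.5 * dat$x2 + rnorm(n)   # x3 and x4 are pure noise

# Hypothetical 70/30 split into training and test rows
train_idx <- sample(n, size = 0.7 * n)
train <- dat[train_idx, ]
test  <- dat[-train_idx, ]

# Variable selection uses the training data only (AIC-based backward search)
full_fit <- lm(y ~ ., data = train)
sel_fit  <- step(full_fit, trace = 0)

# The test data only evaluates the model that was chosen
rmse <- function(obs, pred) sqrt(mean((obs - pred)^2))
rmse_train <- rmse(train$y, predict(sel_fit, newdata = train))
rmse_test  <- rmse(test$y,  predict(sel_fit, newdata = test))
c(train = rmse_train, test = rmse_test)
```

If several candidate models have to be compared, cross-validation within the training data (or a separate validation split) keeps the test set untouched until the end.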

Best,
Andy 

 -Original Message-
 From: r-help-boun...@r-project.org 
 [mailto:r-help-boun...@r-project.org] On Behalf Of Jin Minming
 Sent: Monday, January 30, 2012 8:14 AM
 To: r-help@r-project.org
 Subject: [R] Variable selection based on both training and 
 testing data
 
 Dear all,
 
 The variable selection in regression is usually determined by 
 the training data using AIC or F value, such as stepAIC. Is 
 there some R package that can consider both the training and 
 test dataset? For example, I have two separate training data 
 and test data. Firstly, a regression model is obtained by 
 using training data, and then this model is tested by using 
 test data. This process continues in order to find some 
 possible optimal models in terms of RMSE or R2 for both 
 training and test data. 
 
 Thanks,
 
 Jim
 

[R] Checking for invalid dates: Code works but needs improvement

2012-01-30 Thread Paul Miller

Hi Rui, Marc, and Gabor,

Thanks for your replies to my question. All were helpful and it was interesting 
to see how different people approach various aspects of the same problem.

Spent some time this weekend looking at Rui's solution, which is certainly much 
clearer than my own. Managed to figure out pretty much all the details of how 
it works. Also managed to tweak it slightly in order to make it do exactly what 
I wanted. (See revised code below.)

Still have a couple of questions though. The first concerns the insertion of 
the code "Y > 2012" to set year values beyond 2012 to NA (on line 10 of the 
function below).  When I add this (or use it in place of "nchar(Y) > 4"), the 
code successfully finds the problem date 05/16/2015. After that though, it 
produces the following error message:

Error in if (any(is.na(x) & M != "un" & Y != "un")) cat("Warning: Invalid date 
values in",  :  missing value where TRUE/FALSE needed

Why is this happening? If the code correctly handles the date 
06/20/1840 without producing an error, why can't it do likewise with 
05/16/2015?

The second question is why it's necessary to put "x" on line 15 following 
cat("Warning ..."). I know that I don't get any date columns if I don't 
include this but am not sure why.

The third question is whether it's possible to change the class of the date 
variables without using a for loop. I played around with this a little but 
didn't find a vectorized alternative. It may be that this is not really 
important. It's just that I've read in several places that for loops should be 
avoided wherever possible.

Thanks,

Paul 


##########################################
#### Code for detecting invalid dates ####
##########################################

#### Test Data ####

connection <- textConnection("
1 11/23/21931 05/23/2009 un/17/2011
2 06/20/1840  02/30/2010 03/17/2011
3 06/17/1935  12/20/2008 07/un/2011
4 05/31/1937  01/18/2007 04/30/2011
5 06/31/1933  05/16/2015 11/20/un
")

TestDates <- data.frame(scan(connection, 
    list(Patient=0, birthDT="", diagnosisDT="", metastaticDT="")))

close(connection)

#### Input Data ####

TDSaved <- TestDates

#### List of Date Variables ####

DateNames <- c("birthDT", "diagnosisDT", "metastaticDT")

#### Date Function ####

fun <- function(Dat){
    f <- function(jj, DF){
        x <- as.character(DF[, jj])
        x <- unlist(strsplit(x, "/"))
        n <- length(x)
        M <- x[seq(1, n, 3)]
        D <- x[seq(2, n, 3)]
        Y <- x[seq(3, n, 3)]
        D[D == "un"] <- "15"
        Y <- ifelse(nchar(Y) > 4 | Y > 2012 | Y < 1900, NA, Y)
        x <- as.Date(paste(Y, M, D, sep="-"), format="%Y-%m-%d")
        if(any(is.na(x) & M != "un" & Y != "un"))
            cat("Warning: Invalid date values in", jj, "\n",
                as.character(DF[is.na(x), jj]), "\n")
        x
    }
    Dat <- data.frame(sapply(names(Dat), function(j) f(j, Dat)))
    for(i in names(Dat)) class(Dat[[i]]) <- "Date"
    Dat
}

#### Output Data ####

TD <- TDSaved

#### Read Dates ####

TD[, DateNames] <- fun(TD[, DateNames])
TD
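
Two of the questions above can be illustrated with a small base R sketch on hypothetical data (the column names and dates are made up). `any()` returns NA when no element is TRUE and at least one is NA, and `if()` refuses an NA condition; this is one plausible source of the "missing value where TRUE/FALSE needed" message. Separately, assigning `lapply()` output into `Dat[]` keeps the Date class that `sapply()` drops, so no loop is needed.

```r
# any() yields NA when nothing is TRUE but something is NA
chk <- c(FALSE, FALSE, NA)
any(chk)                  # NA, so if (any(chk)) would stop with an error
any(chk, na.rm = TRUE)    # FALSE, safe to use inside if()

# Hypothetical character date columns; 2010-02-30 is not a valid date
Dat <- data.frame(d1 = c("2009-05-23", "2010-02-30"),
                  d2 = c("2011-03-17", "2011-04-30"),
                  stringsAsFactors = FALSE)

# sapply() simplifies to a matrix and loses the Date class;
# lapply() assigned into Dat[] keeps every column as a Date
Dat[] <- lapply(Dat, as.Date, format = "%Y-%m-%d")
sapply(Dat, class)
Dat
```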


[R] useR! 2012: Earlybird Registration for International R Users Conference, Nashville TN 12-15 2012

2012-01-30 Thread Frank Harrell

The 8th international R users conference useR! 2012 will be in Nashville TN
USA June 12-15 with a special all-day pre-conference course from Bill
Venables on June 11.   We have a terrific lineup of half-day tutorials on
June 12 and will have invited and contributed presentations of interest to a
wide variety of R users.  Details may be found at
http://biostat.mc.vanderbilt.edu/UseR-2012 and online early bird
registration is now available.  Abstract submissions for contributed talks
and posters are welcomed.  There are major entertainment events in and
around Nashville before and during the conference that you may also want to
take advantage of.


-
Frank Harrell
Department of Biostatistics, Vanderbilt University

Re: [R] nice report generator?

2012-01-30 Thread Duncan Murdoch


On 30/01/2012 6:59 AM, Tal Galili wrote:

Hello dear Duncan, Gabor, Michael and others,

After taking some time, I wrote a bridge function between a cast_df
object from the {reshape} package into a table in Duncan's new {tables}
package.

The motivation was to make cast_df table prettier in the R terminal, as
well as allow us to export a pretty version of the table to latex (using
Hmisc::latex, on the output of tabular.cast_df)

The code is now available on:
http://www.r-statistics.com/2012/01/printing-nested-tables-in-r-bridging-between-the-reshape-and-tables-packages/

I would be happy for any input/revisions/suggestions from you.


Seems like a nice idea.  Two comments:

1. I did add a Factor() function as described in the message you quote 
from me, so you might be able to use that and simplify things a little.


2. It's more flexible to construct the language object as a language 
object, rather than pasting something together and parsing it.  For one 
thing, that allows non-syntactic variable names; I think it's also 
easier to read.  So your code


txt <- paste("tabular(value*v*", LEFT , "~" ,RIGHT ,", data = m_xx, suppressLabels  = 2,...)", sep = "")
eval(parse(text = txt ))

could be rewritten as

formula <- substitute( value*v*LEFT ~ RIGHT, list(LEFT=LEFT, RIGHT=RIGHT))
tabular(formula, data = m_xx, suppressLabels = 2, ...)
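
As a small illustration of the point about language objects, the following base R sketch builds a formula with `substitute()` and shows the paste-and-parse route failing on the same input. The variable names, including the non-syntactic `my var`, are hypothetical and are not taken from the tables or reshape packages.

```r
LEFT  <- quote(a + b)          # an unevaluated expression
RIGHT <- as.name("my var")     # a symbol that is not a syntactic name

# Plug the pieces into a template without any string manipulation
f_call <- substitute(y * LEFT ~ RIGHT, list(LEFT = LEFT, RIGHT = RIGHT))
f <- eval(f_call)              # evaluating the call to `~` gives a formula

class(f)
all.vars(f)

# The paste-and-parse route fails on the same name
res <- try(parse(text = paste("y ~", "my var")), silent = TRUE)
inherits(res, "try-error")
```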

It might make sense to put something like this into the tables package, but I 
don't want to have a dependency on reshape.

Duncan Murdoch


Re: [R] percentage from density()

2012-01-30 Thread Duke

Great suggestions and comments, Bill, Greg and Rolf. You provided me 
some valuable ways to deal with the data I am working with. Thank you 
all so much!


Bests,

D.

On 1/29/12 4:03 PM, William Dunlap wrote:

If v is your original data,
v <- c(-20, rep(0,98), 20)
why not use
mean( -20 < v & v < 2)
as your estimate of the probability that v is in (-20,2)?

Estimating a density is like taking the derivative
of a smooth of the empirical distribution function,
so why not eliminate the middleman instead of integrating
the estimated density?  Any difference between the two
methods tells more about the smoothing used than about
the data involved.  (Not that I am any sort of expert
in this matter.)
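
The direct estimate suggested above is a one-liner in base R, and `ecdf()` gives the same kind of answer for any cut point. A brief sketch using the data from the thread:

```r
v <- c(-20, rep(0, 98), 20)

# Proportion of observations strictly inside (-20, 2)
p_open <- mean(-20 < v & v < 2)

# The empirical CDF answers "proportion <= q" for any q
Fn <- ecdf(v)
p_le2 <- Fn(2)                 # proportion of observations <= 2

p_open
p_le2
```

The two numbers differ only because the observation at -20 is excluded by the strict inequality but counted by the empirical CDF.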

Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com

-Original Message-
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On 
Behalf Of Greg Snow
Sent: Saturday, January 28, 2012 8:12 PM
To: Duke; r-help@r-project.org
Subject: Re: [R] percentage from density()

If you use logspline estimation (logspline package) instead of kernel density 
estimation then this is
simple as there are cumulative area functions for logspline fits.

If you need to do this with kernel density estimates then you can just find the 
area over your region
for the kernel centered at each data point and average those values together to 
get the area under the
entire density estimate.
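
The averaging approach for kernel estimates can be sketched as follows, assuming the default Gaussian kernel of `density()` and its default bandwidth. For that kernel, the area contributed by one data point is a difference of two `pnorm()` values.

```r
v <- c(-20, rep(0, 98), 20)
d <- density(v)                # Gaussian kernel by default
bw <- d$bw                     # bandwidth chosen by density()

# Area over (-20, 2) under each point's kernel, averaged over all points
p_kde <- mean(pnorm(2, mean = v, sd = bw) - pnorm(-20, mean = v, sd = bw))
p_kde
```

The result depends on the bandwidth, which is the smoothing effect mentioned elsewhere in this thread.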

-Original Message-
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On 
Behalf Of Duke
Sent: Friday, January 27, 2012 3:45 PM
To: r-help@r-project.org
Subject: [R] percentage from density()

Hi folks,

I know that the density function gives an estimated density for a given
dataset. From that I want a percentage estimate for a
certain range. For example:

y = density(c(-20,rep(0,98),20))
plot(y, xlim=c(-4,4))

Now suppose I want to know the percentage of data lying in (-20,2). Basically
it should be the area under the curve from -20 to 2. Does anybody know a simple
function to do it?

Thanks,

D.


[R] ROC curve

2012-01-30 Thread Josiane NJIWA



Hello all,

I am very new to R and I am facing two problems. First, I have not succeeded in 
changing the console language to English, even after trying the command set 
language='en'.
Second, I would like to plot ROC curves. I have a series of 10 threshold tests 
that I run for 10 patients. The prediction for each patient is always the same, 
but the status can change depending on the threshold considered.
I have 11 columns of 10 rows: the first column contains the predicted status of 
each patient (0=cured, 1=non cured), followed by 10 columns (one per threshold) 
containing the status found using that threshold.
Does anyone know how I can use those values in R to plot ROC curves?

I thank you for your understanding,

Josiane.
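
One way to get ROC points from a layout like the one described, sketched in base R with simulated data. It assumes the first column holds the reference status and each of the other ten columns holds the 0/1 result at one threshold; the column names, the scores and the thresholds are all hypothetical.

```r
set.seed(1)
status <- rep(c(0, 1), each = 5)                 # 0 = cured, 1 = non cured
score  <- status + rnorm(10)                     # hypothetical test score
thresholds <- seq(-1, 2, length.out = 10)

# One 0/1 column per threshold: 1 when the score exceeds the threshold
calls <- sapply(thresholds, function(th) as.integer(score > th))
dat <- data.frame(status, calls)
names(dat) <- c("status", paste0("thr", 1:10))

# Sensitivity and 1 - specificity for each threshold column
tpr <- sapply(dat[-1], function(p) mean(p[dat$status == 1] == 1))
fpr <- sapply(dat[-1], function(p) mean(p[dat$status == 0] == 1))

# Add the trivial end points and order by false positive rate
roc <- unique(rbind(c(0, 0), cbind(fpr, tpr), c(1, 1)))
roc <- roc[order(roc[, 1], roc[, 2]), , drop = FALSE]

plot(roc, type = "b", xlab = "1 - specificity", ylab = "sensitivity",
     xlim = c(0, 1), ylim = c(0, 1))
abline(0, 1, lty = 2)                            # chance line
```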


Everything should be made as simple as possible, but not simpler. Albert 
Einstein.

[R] RCurl format

2012-01-30 Thread KTD Services

I am having trouble with the postForm function in RCurl.

I want to send the command DELETE https://somewebsite.com.json

but I can't seem to find out how.  I could try:

postForm(url, _method="DELETE", .opts = list("username:password") )

but I get the error:

Error: unexpected input in "postForm(url4, _"

this error seems to be due to the underscore _ before method

Any ideas how I can do a DELETE command another way in RCurl?

Thanks.
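
On the syntax error itself: in R, an argument name that starts with an underscore is not a syntactic name, so it has to be quoted or wrapped in backticks. The sketch below shows this with a hypothetical stand-in function `show_args()` rather than a real request. RCurl also documents an `httpDELETE()` helper; whether it suits this particular service is an assumption to verify.

```r
# Hypothetical stand-in that just reports the argument names it received
show_args <- function(...) names(list(...))

# Backticks or quotes make the name acceptable to the parser
show_args(`_method` = "DELETE", id = 1)
show_args("_method" = "DELETE")

# The unquoted form fails at parse time, before any function is called
res <- try(parse(text = 'show_args(_method = "DELETE")'), silent = TRUE)
inherits(res, "try-error")
```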




[R] parameter estimate

2012-01-30 Thread Christopher Kelvin

I need help.
The code below estimates the Weibull parameters with complete failure data. My 
question is how do I change the state to include
some censoring (maybe right, Type-I or Type-II) when generating the data and 
estimating the parameters?
Thank you

x=rweibull(10,2,2)
library(survival)
d<-data.frame(ob=c(x),state=1)
s <- Surv(d$ob,d$state)
sr <- survreg(s~1,dist="weibull")
print(paste("beta =",1/sr$scale))
print(paste("eta =",exp(sr$coefficients[1])))

or


library(MASS)
set.seed(123)
m <- replicate(1000, coef(fitdistr(rweibull(50, 0.8, 2), "weibull")))
summary(t(m))
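
A possible way to introduce censoring, sketched under assumptions: observations are generated from a Weibull with shape 2 and scale 2 as in the first snippet, Type-I right censoring is applied at a fixed, hypothetical time `c_time`, and `state` is set to 0 for censored units. The sample size and the censoring time are illustrative choices only.

```r
library(survival)

set.seed(1)
n <- 200
t_true <- rweibull(n, shape = 2, scale = 2)   # latent failure times

# Type-I censoring: the study stops at a fixed time (assumed value)
c_time <- 2.5
# Type-II alternative (assumption): stop at the r-th failure instead
# r <- 150; c_time <- sort(t_true)[r]

ob    <- pmin(t_true, c_time)                 # observed time
state <- as.numeric(t_true <= c_time)         # 1 = failure, 0 = censored

fit <- survreg(Surv(ob, state) ~ 1, dist = "weibull")
beta_hat <- 1 / fit$scale                     # Weibull shape
eta_hat  <- unname(exp(coef(fit)[1]))         # Weibull scale
c(beta = beta_hat, eta = eta_hat, censored = sum(state == 0))
```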


[R] ANOVA factors

2012-01-30 Thread Wolfgang Polasek

Hi all

How can I use the stack command to turn an n x m matrix into 2 categorical
factors in R, the row factor and the column factor?

Is there a function for nice graphical output in ANOVA?

Thanks
Wolfgang
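
A base R sketch of one way to do this, using a small hypothetical matrix. `row()` and `col()` give the index of every cell, so the matrix can be unrolled into a response plus two factors; `stack()` on a data frame would supply the column factor only.

```r
set.seed(10)
m <- matrix(rnorm(12, mean = 5), nrow = 3, ncol = 4)   # hypothetical 3 x 4 data

# Unroll the matrix column by column, keeping the row and column index
d <- data.frame(y    = as.vector(m),
                rowf = factor(as.vector(row(m))),
                colf = factor(as.vector(col(m))))

# Two-way ANOVA without replication (no interaction term can be fitted)
fit <- aov(y ~ rowf + colf, data = d)
summary(fit)

# Simple graphical summaries from base R
plot.design(y ~ rowf + colf, data = d)
with(d, interaction.plot(colf, rowf, y))
```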


[R] [R-pkgs] New package geotools

2012-01-30 Thread Antoine Lucas

Dear All,

I have uploaded a new package, geotools, whose main purpose is to
provide functions to get the distance between cities, by city name or
postal code (usage: shipment).

For now, it contains only the French cities dataset.

An example:
Return all postal codes within 7 km of Paris:
codesNearToCode(zipCode("Paris"),7)


Regards,

Antoine Lucas.

___
R-packages mailing list
r-packa...@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-packages


Re: [R] merge multiple data frames

2012-01-30 Thread Massimo Bressan


hi don

I followed your advice about using the sqldf package, but the problem of 
labelling the fields persists;

for some reason I cannot properly handle the SQL 'as' statement

a_b<-sqldf("select a.*, b.* from a left join b on a.date=b.date")
a_b_c<-sqldf("select a_b.*, c.* from a_b left join c on a_b.date=c.date")

bye

max





- Original Message - 
From: MacQueen, Don macque...@llnl.gov

To: maxbre mbres...@arpa.veneto.it; r-help@r-project.org
Sent: Saturday, January 28, 2012 12:24 AM
Subject: Re: [R] merge multiple data frames


Not tested, but this might be a case for the sqldf package.

-Don

--
Don MacQueen

Lawrence Livermore National Laboratory
7000 East Ave., L-627
Livermore, CA 94550
925-423-1062





On 1/26/12 9:29 AM, maxbre mbres...@arpa.veneto.it wrote:


This is my reproducible example (three data frames: a, b, c)

a<-structure(list(date = structure(1:6, .Label = c("2012-01-03",
"2012-01-04", "2012-01-05", "2012-01-06", "2012-01-07", "2012-01-08",
"2012-01-09", "2012-01-10", "2012-01-11", "2012-01-12", "2012-01-13",
"2012-01-14", "2012-01-15", "2012-01-16", "2012-01-17", "2012-01-18",
"2012-01-19", "2012-01-20", "2012-01-21", "2012-01-22", "2012-01-23"
), class = "factor"), so2 = c(0.799401398190476, 0, 0,
0.0100453950434783,
0.200154920565217, 0.473866969181818), nox = c(111.716109973913,
178.077239330435, 191.257829021739, 50.6799951473913, 115.284643540435,
110.425185027727), no = c(48.8543691516522, 88.7197448817391,
93.9931932472609, 13.9759949817391, 43.1395266865217, 41.7280296016364
), no2 = c(36.8673432865217, 42.37150668, 47.53311701, 29.3026882474783,
49.2986070321739, 46.5978461731818), co = c(0.618856168125,
0.99659347508,
0.66698741608, 0.38343731117, 0.281604928875, 0.155383408913043
), o3 = c(12.1393100029167, 12.3522739816522, 10.9908791203043,
26.9122200013043, 13.8421695947826, 12.3788847045455), ipa =
c(167.541954974667,
252.7196257875, 231.802370709167, 83.4850259595833, 174.394613581667,
173.868599272609), ws = c(1.47191016429167, 0.765781205208333,
0.937053086791667, 1.581022406625, 0.909756802125, 0.959252831695652
), wd = c(45.2650019737732, 28.2493544114369, 171.049080544214,
319.753674830936, 33.8713897347193, 228.368119533759), temp =
c(7.9197282588,
3.79434291520833, 2.1287644735, 6.733854600625, 3.136579722,
3.09864120704348), umr = c(86.11566638875, 94.5034087491667,
94.14451249375, 53.1016709004167, 65.63420423, 74.955669236087
)), .Names = c("date", "so2", "nox", "no", "no2", "co", "o3",
"ipa", "ws", "wd", "temp", "umr"), row.names = c(NA, 6L), class =
"data.frame")


b<-structure(list(date = structure(1:6, .Label = c("2012-01-03",
"2012-01-04", "2012-01-05", "2012-01-06", "2012-01-07", "2012-01-08",
"2012-01-09", "2012-01-10", "2012-01-11", "2012-01-12", "2012-01-13",
"2012-01-14", "2012-01-15", "2012-01-16", "2012-01-17", "2012-01-18",
"2012-01-19", "2012-01-20", "2012-01-21", "2012-01-22", "2012-01-23"
), class = "factor"), so2 = c(0, 0, 0, 0, 0, 0), nox = c(13.74758511,
105.8060582, 61.22720599, 11.45280354, 56.86804174, 39.17917222
), no = c(0.882593766, 48.97037506, 9.732937217, 1.794549972,
16.32300019, 8.883637786), no2 = c(11.80447753, 25.35235381,
28.72990261, 8.590004034, 31.9003796, 25.50512403), co = c(0.113954917,
0.305985964, 0.064001839, 0, 1.86e-05, 0), o3 = c(5.570499897,
9.802379608, 5.729360104, 11.91304016, 12.13407993, 10.00961971
), ipa = c(6.065110207, 116.9079971, 93.21240234, 10.5777998,
66.40740204, 34.47359848), ws = c(0.122115001, 0.367668003, 0.494913995,
0.627124012, 0.473895013, 0.593913019), wd = c(238.485119317031,
221.645073036776, 220.372076815032, 237.868340917096, 209.532933617465,
215.752030286564), temp = c(4.044159889, 1.176810026, 0.142934993,
0.184606999, -0.935989976, -2.015399933), umr = c(72.29229736,
88.69879913, 87.49530029, 24.00079918, 44.8852005, 49.47729874
)), .Names = c("date", "so2", "nox", "no", "no2", "co", "o3",
"ipa", "ws", "wd", "temp", "umr"), row.names = c(NA, 6L), class =
"data.frame")


c<-structure(list(date = structure(1:6, .Label = c("2012-01-03",
"2012-01-04", "2012-01-05", "2012-01-06", "2012-01-07", "2012-01-08",
"2012-01-09", "2012-01-10", "2012-01-11", "2012-01-12", "2012-01-13",
"2012-01-14", "2012-01-15", "2012-01-16", "2012-01-17", "2012-01-18",
"2012-01-19", "2012-01-20", "2012-01-21", "2012-01-22", "2012-01-23"
), class = "factor"), so2 = c(2.617839247, 0, 0, 0.231044086,
0.944608887, 2.12400444), nox = c(308.9046313, 275.6778849, 390.0824142,
178.7429364, 238.655832, 251.892601), no = c(156.0262489, 151.4412498,
221.0725021, 65.96049786, 106.541748, 119.3471241), no2 = c(74.80145447,
59.29991481, 66.5897975, 77.84267978, 75.68422569, 85.43044816
), co = c(1.628431197, 1.716231492, 1.264678366, 1.693460745,
0.780637084, 0.892724398), o3 = c(26.1473999, 15.91584015, 22.46199989,
37.39400101, 15.63426018, 17.51494026), ipa = c(538.414978, 406.4620056,
432.6459961, 275.2820129, 435.7909851, 436.8039856), ws = c(4.995530128,
1.355309963, 1.708899975, 3.131690025, 1.546270013, 1.571320057
), wd = c(58.15639877, 64.5657153143848, 39.9754269501381,
24.0739884380921,
55.9453098437477, 56.7648829092446), temp = c(10.24740028, 7.052690029,
4.33258009,

[R] r-help; parameter estimate

2012-01-30 Thread Christopher Kelvin

I need help.
The code below estimates the Weibull parameters with complete failure data. My 
question is how do I change the state to include
some censoring (maybe right, Type-I or Type-II) when generating the data and 
estimating the parameters?
Thank you

x=rweibull(10,2,2)
library(survival)
d<-data.frame(ob=c(x),state=1)
s <- Surv(d$ob,d$state)
sr <- survreg(s~1,dist="weibull")
print(paste("beta =",1/sr$scale))
print(paste("eta =",exp(sr$coefficients[1])))

or


library(MASS)
set.seed(123)
m <- replicate(1000, coef(fitdistr(rweibull(50, 0.8, 2), "weibull")))
summary(t(m))

Re: [R] merge multiple data frames

2012-01-30 Thread Massimo Bressan


thanks michael

it's working like a charm: that's exactly what I was looking for

bye

max

- Original Message - 
From: R. Michael Weylandt michael.weyla...@gmail.com

To: Massimo Bressan mbres...@arpa.veneto.it
Cc: r-help@r-project.org
Sent: Friday, January 27, 2012 4:16 PM
Subject: Re: [R] merge multiple data frames


Oh, sorry -- I assumed that was intentional since my code passed the
identical() test with what you said you wanted.

Perhaps this gets what you meant you wanted instead (though the
treatment of the names is far from elegant)

mergeAll <- function(..., by = "date", all = TRUE) {
  dotArgs <- list(...)
  dotNames <- lapply(dotArgs, names)
  repNames <- Reduce(intersect, dotNames)
  repNames <- repNames[repNames != by]
  for(i in seq_along(dotArgs)){
    wn <- which( (names(dotArgs[[i]]) %in% repNames) &
                 (names(dotArgs[[i]]) != by))
    names(dotArgs[[i]])[wn] <- paste(names(dotArgs[[i]])[wn],
                                     names(dotArgs)[[i]], sep = ".")
  }
  Reduce(function(x, y) merge(x, y, by = by, all = all), dotArgs)
}

print(str(mergeAll(a=a,b=b,c=c)))

Is that what you were going for?

Michael
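
The renaming idea in `mergeAll()` can be tried on a toy case. The sketch below is self-contained and uses three tiny hypothetical data frames that share a `date` key and one measurement column; it suffixes every non-key column with its frame's name and then folds `merge()` over the list.

```r
# Toy data frames sharing a key column and one measurement column
a  <- data.frame(date = c("2012-01-03", "2012-01-04"), so2 = c(0.8, 0.0))
b  <- data.frame(date = c("2012-01-03", "2012-01-04"), so2 = c(0.0, 0.1))
c3 <- data.frame(date = c("2012-01-04", "2012-01-05"), so2 = c(2.6, 0.2))

frames <- list(a = a, b = b, c = c3)

# Suffix every non-key column with the name of its data frame
renamed <- Map(function(df, nm) {
  keep <- names(df) != "date"
  names(df)[keep] <- paste(names(df)[keep], nm, sep = ".")
  df
}, frames, names(frames))

# Fold merge() over the list, keeping all dates (full outer join)
merged <- Reduce(function(x, y) merge(x, y, by = "date", all = TRUE), renamed)
names(merged)
merged
```

Renaming before merging sidesteps `merge()`'s own `suffixes` argument, which only ever applies to the two frames being joined at that step.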

On Fri, Jan 27, 2012 at 3:19 AM, Massimo Bressan
mbres...@arpa.veneto.it wrote:

I tested your code: it's OK but there is still the problem of the suffixes
for the last dataframe
thank you for the support


- Original Message - From: R. Michael Weylandt
michael.weyla...@gmail.com
To: maxbre mbres...@arpa.veneto.it
Cc: r-help@r-project.org
Sent: Thursday, January 26, 2012 8:19 PM
Subject: Re: [R] merge multiple data frames


I might do something like this:

mergeAll <- function(..., by = "date", all = TRUE) {
  dotArgs <- list(...)
  Reduce(function(x, y)
    merge(x, y, by = by, all = all, suffixes=paste(".", names(dotArgs),
          sep = "")),
    dotArgs)}

mergeAll(a = a, b = b, c = c)

str(.Last.value)

You also might be able to set it up to capture names without you
having to put a = a etc. using substitute.

On Thu, Jan 26, 2012 at 12:29 PM, maxbre mbres...@arpa.veneto.it wrote:


This is my reproducible example (three data frames: a, b, c)

a<-structure(list(date = structure(1:6, .Label = c("2012-01-03",
"2012-01-04", "2012-01-05", "2012-01-06", "2012-01-07", "2012-01-08",
"2012-01-09", "2012-01-10", "2012-01-11", "2012-01-12", "2012-01-13",
"2012-01-14", "2012-01-15", "2012-01-16", "2012-01-17", "2012-01-18",
"2012-01-19", "2012-01-20", "2012-01-21", "2012-01-22", "2012-01-23"
), class = "factor"), so2 = c(0.799401398190476, 0, 0, 
0.0100453950434783,

0.200154920565217, 0.473866969181818), nox = c(111.716109973913,
178.077239330435, 191.257829021739, 50.6799951473913, 115.284643540435,
110.425185027727), no = c(48.8543691516522, 88.7197448817391,
93.9931932472609, 13.9759949817391, 43.1395266865217, 41.7280296016364
), no2 = c(36.8673432865217, 42.37150668, 47.53311701, 29.3026882474783,
49.2986070321739, 46.5978461731818), co = c(0.618856168125,
0.99659347508,
0.66698741608, 0.38343731117, 0.281604928875, 0.155383408913043
), o3 = c(12.1393100029167, 12.3522739816522, 10.9908791203043,
26.9122200013043, 13.8421695947826, 12.3788847045455), ipa =
c(167.541954974667,
252.7196257875, 231.802370709167, 83.4850259595833, 174.394613581667,
173.868599272609), ws = c(1.47191016429167, 0.765781205208333,
0.937053086791667, 1.581022406625, 0.909756802125, 0.959252831695652
), wd = c(45.2650019737732, 28.2493544114369, 171.049080544214,
319.753674830936, 33.8713897347193, 228.368119533759), temp =
c(7.9197282588,
3.79434291520833, 2.1287644735, 6.733854600625, 3.136579722,
3.09864120704348), umr = c(86.11566638875, 94.5034087491667,
94.14451249375, 53.1016709004167, 65.63420423, 74.955669236087
)), .Names = c("date", "so2", "nox", "no", "no2", "co", "o3",
"ipa", "ws", "wd", "temp", "umr"), row.names = c(NA, 6L), class =
"data.frame")


b<-structure(list(date = structure(1:6, .Label = c("2012-01-03",
"2012-01-04", "2012-01-05", "2012-01-06", "2012-01-07", "2012-01-08",
"2012-01-09", "2012-01-10", "2012-01-11", "2012-01-12", "2012-01-13",
"2012-01-14", "2012-01-15", "2012-01-16", "2012-01-17", "2012-01-18",
"2012-01-19", "2012-01-20", "2012-01-21", "2012-01-22", "2012-01-23"
), class = "factor"), so2 = c(0, 0, 0, 0, 0, 0), nox = c(13.74758511,
105.8060582, 61.22720599, 11.45280354, 56.86804174, 39.17917222
), no = c(0.882593766, 48.97037506, 9.732937217, 1.794549972,
16.32300019, 8.883637786), no2 = c(11.80447753, 25.35235381,
28.72990261, 8.590004034, 31.9003796, 25.50512403), co = c(0.113954917,
0.305985964, 0.064001839, 0, 1.86e-05, 0), o3 = c(5.570499897,
9.802379608, 5.729360104, 11.91304016, 12.13407993, 10.00961971
), ipa = c(6.065110207, 116.9079971, 93.21240234, 10.5777998,
66.40740204, 34.47359848), ws = c(0.122115001, 0.367668003, 0.494913995,
0.627124012, 0.473895013, 0.593913019), wd = c(238.485119317031,
221.645073036776, 220.372076815032, 237.868340917096, 209.532933617465,
215.752030286564), temp = c(4.044159889, 1.176810026, 0.142934993,
0.184606999, -0.935989976, -2.015399933), umr = c(72.29229736,
88.69879913, 87.49530029, 24.00079918, 44.8852005, 49.47729874
)), .Names = c("date", "so2", "nox", "no",

[R] Problem in Fitting model equation in nls function

2012-01-30 Thread ram basnet

Dear R users,
 
I am struggling to fit an expo-linear equation to my data using the nls 
function. I always get the error messages shown below: 
 
### The expo-linear equation which I want to fit to my data:   
response_variable =  (c/r)*log(1+exp(r*(Day-tt))), where Day is the 
time variable
 
## my response variable
 
rl <- c(2,1.5,1.8,2,2,2.5,2.6,1.5,2.4,1.7,2.3,2.4,2.2,2.6,
        2.8,2,2.5,1.8,2.4,2.4,2.3,2.6,3,2,2.6,1.8,2.5,2.5,
        2.3,2.7,3,2.2,2.6,1.8,2.5,2.5,2.3,2.7,3,2.2)
myday <- rep(c(3,5,7,9,10), each = 8) # creating my predictor time-variable
mydata <- data.frame(rl,myday) # data object

### fitting the model equation with the nls function
 
CASE-I: 
When I assigned initial value for tt = 0.6:
 
> mytest <- nls(rl ~ (c/r)*log(1+exp(r*(myday-tt))), data = mydata,
+ na.action = na.omit, 
+ start = list(c = 2.0, r = 0.05, tt = 0.6),algorithm = "plinear")
Error in numericDeriv(form[[3L]], names(ind), env) : 
  Missing value or an infinity produced when evaluating the model
 
CASE-II:
When I assigned initial value for tt = 1: 
 
> mytest <- nls(rl ~ (c/r)*log(1+exp(r*(myday-tt))), data = mydata,
+ na.action = na.omit, 
+ start = list(c = 2.0, r = 0.5, tt = 1),algorithm = "plinear")
Error in nls(rl ~ (c/r) * log(1 + exp(r * (myday - tt))), data = mydata,  : 
  singular gradient
 
I get the error messages shown above. To be honest, I do not have much 
experience with fitting a specific model equation in R.
I have the following queries: 
 
1. Can anyone explain to me what is going wrong here? 
 
2. Importantly, how should I write the above equation in the nls function? 
 
I would be very thankful if anyone could help me.
 
Thanks
 
 
Regards,
Ram Kumar Basnet
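
A hedged sketch of how the expo-linear model can be written for `algorithm = "plinear"`, using simulated data rather than the data above. With that algorithm the linear coefficient (here `c`) is left out of the formula and of `start`, and nls reports it as `.lin`; keeping `c` in the formula as well would make the two indistinguishable, which is one plausible source of a singular gradient. The true values, the noise level and the starting values are assumptions for illustration, and nothing here guarantees convergence on the original data.

```r
set.seed(123)
day <- rep(1:30, each = 2)
c_true <- 2; r_true <- 0.5; tt_true <- 10       # assumed "true" parameters

# Simulated response from the expo-linear curve plus small Gaussian noise
y <- (c_true / r_true) * log(1 + exp(r_true * (day - tt_true))) +
     rnorm(length(day), sd = 0.1)
sim <- data.frame(y, day)

# The right-hand side holds only the nonlinear part; c is estimated as .lin
fit <- nls(y ~ log(1 + exp(r * (day - tt))) / r, data = sim,
           start = list(r = 0.3, tt = 8), algorithm = "plinear")
coef(fit)
```

With real data that covers only a short, nearly flat stretch of the curve, the three parameters may not be separately estimable whatever the algorithm.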

[R] about changing line type and line width in Taylor Diagram

2012-01-30 Thread Roopashree Shrivastava

Dear all,
I am new to plotting Taylor diagrams using the plotrix package in R, hence
this post. I have written a script which plots a Taylor diagram with one
reference and 7 model values. However, the font size, line width and line
type are not clear when saving the diagram as a jpeg file. I tried the
arguments lty, lwd and font, but there was no apparent change. I am attaching the
script here. Any help would be greatly appreciated. The script is

# my first taylor diagram
ref<-c(0.00640091,0.00533091,0.00381636,0.00275519,0.00277649,0.00280806,0.00267945,0.00237123,0.000970663,0.000986191,0.00100226,0.00086391,0.000622819,0.000485319,0.000362976,0.000246112,0.000165615,0.8184,0.4)

m1<-c(0.0124827,0.011662,0.0102956,0.0091183,0.00813907,0.007192,0.00662517,0.00433745,0.00184044,0.000649477,0.00024642,5.43E-05,0.97696,0.000194817,0.000182709,0.000134398,0.000106024,8.92E-05,6.28E-05)
taylor.diagram(ref,m1,pos.cor=FALSE,ngamma=3,pcex=1,grad.corr.lines=c(-0.99,-0.95,-0.9,-0.8,-0.6,-0.4,-0.2,0,0.2,0.4,0.6,0.8,0.9,0.95,0.99),lty=1,lwd=10,font=5)

m2<-c(0.0101348,0.00920886,0.0086196,0.00785134,0.00723838,0.00675833,0.00579093,0.00540478,0.00226489,0.000809049,0.00019625,3.95E-05,8.89E-05,0.000195028,0.000185004,0.000131202,0.000109852,9.98E-05,6.80E-05)
taylor.diagram(ref,m2,add=TRUE,pch=19,col="blue",lty="solid",lwd=3)

m3<-c(0.0123251,0.0120384,0.00871793,0.00678519,0.00628331,0.00532673,0.00486861,0.0048328,0.0038655,0.00143683,0.00022057,8.61E-06,7.79E-05,0.000184976,0.000185927,0.000133771,0.000104613,9.26E-05,6.38E-05)
taylor.diagram(ref,m3,add=TRUE,pch=19,col="orange",lty="solid",lwd=3)

m4<-c(0.0134251,0.0126776,0.012559,0.0121933,0.0099911,0.00727952,0.00475407,0.00227909,0.00130748,0.000705607,0.000304828,5.70E-05,0.000109972,0.000187504,0.0002016,0.000133706,0.000109697,9.54E-05,6.35E-05)
taylor.diagram(ref,m4,add=TRUE,pch=19,col="pink",lty="solid",lwd=3)

m5<-c(0.0124275,0.0112242,0.00886243,0.00793019,0.0067846,0.00603205,0.00566561,0.00530552,0.00318331,0.000961854,0.000218234,3.66E-05,7.99E-05,0.000182724,0.000196627,0.000136862,0.000104907,0.94622,6.20E-05)
taylor.diagram(ref,m5,add=TRUE,pch=19,col="purple",lty="solid",lwd=3)

m6<-c(0.0142817,0.0134474,0.0129694,0.0113914,0.0102208,0.00920309,0.00555206,0.00289796,0.00143831,0.000706277,0.000277201,5.60E-05,0.000114714,0.000186412,0.000198743,0.000134991,0.000108689,9.43E-05,6.16E-05)
taylor.diagram(ref,m6,add=TRUE,pch=19,col="brown",lty="solid",lwd=3)

m7<-c(0.0120621,0.0117936,0.00854782,0.00734006,0.00669576,0.00629334,0.00595018,0.00564455,0.00396859,0.000991006,0.000171742,9.68E-06,8.10E-05,0.000186982,0.00018854,0.000136548,0.000104581,9.18E-05,6.20E-05)
taylor.diagram(ref,m7,add=TRUE,pch=19,col="cyan",lty="solid",lwd=3)
lpos<-3.5*sd(ref)
legend(.75*lpos,1.5*lpos,legend=c("","1171","1211","2121","2221","4141","5251"),pch=19,col=c("red","blue","orange","pink","purple","brown","cyan"))

Thanking you,
Warm Regards
Roopa

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] And Statement for two if functions

2012-01-30 Thread kerry1912

I want to perform two if functions at the same time:

if(home team  away team  home team = away team + 7) in R but i am
struggling to work out how to write this correctly. 

Thanks for any help. 


Re: [R] package does not have a NAMESPACE

2012-01-30 Thread Reinker, Stefan

Hello Ondrej,

I experienced the same problem and circumvented it by installing R 2.13.2, where 
the package runs fine. I also tried contacting the authors, with no reply so 
far, but if you manage to solve the NAMESPACE problem in 2.14 I would be 
interested. The source does not seem to be available.

Stefan


[R] Fw: Variable selection based on both training and testing data

2012-01-30 Thread SR Millis

From: SR Millis srmil...@yahoo.com
To: Jin Minming jminm...@yahoo.com 
Sent: Monday, January 30, 2012 9:25 AM
Subject: Re: [R] Variable selection based on both training and testing data

Jim,

First, stepwise methods for variable selection should be avoided.  Frank 
Harrell (in Regression Modeling Strategies) discusses this at length.

Second, splitting a dataset into training and validation sets is generally not 
a good idea unless you have a really large sample, e.g., > 20,000.  As Harrell 
has discussed, split-sample validation does not provide external validation, is 
terribly inefficient, and is arbitrary.  It's better to specify your model a 
priori and use the bootstrap to obtain an estimate of your model's 
over-optimism.  Bootstrapping can be implemented with Harrell's rms package in 
R.
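
To make the bootstrap idea above concrete, here is a minimal base-R sketch of an optimism-corrected R-squared. It does not use the rms interface; the simulated data frame `dat`, the helper `rsq` and the number of resamples `B` are hypothetical choices made only for illustration.

```r
set.seed(1)
n <- 60
# Hypothetical data: y depends on x1 only; x2 is pure noise.
dat <- data.frame(x1 = rnorm(n), x2 = rnorm(n))
dat$y <- 1 + 0.5 * dat$x1 + rnorm(n)

# R-squared of a fitted model evaluated on an arbitrary data set.
rsq <- function(fit, data) {
  pred <- predict(fit, newdata = data)
  1 - sum((data$y - pred)^2) / sum((data$y - mean(data$y))^2)
}

form <- y ~ x1 + x2                       # model specified in advance
apparent <- rsq(lm(form, data = dat), dat)

B <- 200
optimism <- replicate(B, {
  boot <- dat[sample(n, replace = TRUE), ]
  fit_b <- lm(form, data = boot)
  # Performance on the bootstrap sample minus performance on the original data.
  rsq(fit_b, boot) - rsq(fit_b, dat)
})

corrected <- apparent - mean(optimism)
```

Here `apparent` is the usual in-sample R-squared and `corrected` subtracts the average optimism, so no observations are set aside for a separate test set.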

Scott

~~~
Scott R Millis, PhD, ABPP, CStat, PStat®
Professor
Wayne State University School of Medicine
Email:  aa3...@wayne.edu
Email:  srmil...@yahoo.com
Tel: 313-993-8085

To: r-help@r-project.org 
Sent: Monday, January 30, 2012 8:14 AM
Subject: [R] Variable selection based on both training and testing data

Dear all,

The variable selection in regression is usually determined by the training data 
using AIC or F value, such as stepAIC. Is there some R package that can 
consider both the training and test dataset? For example, I have two separate 
training data and test data. Firstly, a regression model is obtained by using 
training data, and then this model is tested by using test data. This process 
continues in order to find some possible optimal models in terms of RMSE or R2 
for both training and test data. 

Thanks,

Jim


[R] Displaying percentages within bars

2012-01-30 Thread Mario Giesel

Hello, R friends,
 
I've got this graph:

p <- ggplot(diamonds, aes(x = color)) + scale_fill_brewer(type="seq", pal = "Blues") +
  scale_y_continuous("", formatter="percent") + coord_flip()
p + geom_bar(aes(fill=cut), colour='black', position='fill')
 
Is it possible to place percentages within each field of the bars?
So, for instance, the dark blue field of color = D would contain a number of 
about 42.0%.
A second question is how to change the size of this number.
 
Any comments are welcome!
Mario

Re: [R] merge multiple data frames

2012-01-30 Thread MacQueen, Don

Does this example help? It doesn't handle the problem of common field
names, but see below for another example.

df1 <- data.frame(jn=1:4, a1=letters[1:4], a2=LETTERS[1:4])
df2 <- data.frame(jn=2:6, b1=month.abb[2:6])
df3 <- data.frame(jn=3:7, x=rnorm(5), y=13:17)

dfn <- sqldf('select * from df1 left join df2 using (jn) left join df3
using (jn)')

In this example, you automatically get all fields from all three data
frames, without having to name them in the SQL statement -- but you should
not have common names.


To deal with common names, I myself would probably rename the variables in
the data frames before trying to merge.

A general method would be something like:
  nms1 <- names(df1)
  nms1[nms1 != 'date'] <- paste(nms1[nms1 != 'date'],'.1',sep='')
  names(df1) <- nms1
Of course it has to be done for every data frame, but this can be put in a
loop, if necessary.
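
As a rough sketch of that loop, using base R only: the three small data frames below are made up for illustration (the name `c3` and the suffix scheme `.1`, `.2`, `.3` are my assumptions), and `merge()` stands in for the SQL left join.

```r
# Three small hypothetical data frames sharing the key `date`
# and a common measurement column `so2`.
a  <- data.frame(date = c("2012-01-03", "2012-01-04"), so2 = c(0.8, 0.0))
b  <- data.frame(date = c("2012-01-03", "2012-01-04"), so2 = c(0.0, 0.1))
c3 <- data.frame(date = c("2012-01-04", "2012-01-05"), so2 = c(0.3, 0.4))

dfs <- list(a, b, c3)

# Append ".1", ".2", ... to every column name except the key.
for (i in seq_along(dfs)) {
  nms <- names(dfs[[i]])
  nms[nms != "date"] <- paste(nms[nms != "date"], i, sep = ".")
  names(dfs[[i]]) <- nms
}

# Left-join the frames one after another on `date`.
merged <- Reduce(function(x, y) merge(x, y, by = "date", all.x = TRUE), dfs)
```

Because the renaming happens before the join, no two frames share a non-key column name, and `merged` keeps every row of the first frame.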


However, here is an example where I have changed df1 and df2; they both
have a field named 'aa', in addition to the matching field.

df1 <- data.frame(jn=1:4, aa=letters[1:4], a2=LETTERS[1:4])
df2 <- data.frame(jn=2:6, aa=month.abb[2:6])
df3 <- data.frame(jn=3:7, x=rnorm(5), y=13:17)

dfn <- sqldf('select jn, df1.aa aa1, df2.aa aa2,
  a2, x, y
   from df1 left join df2 using (jn) left join df3 using (jn)')

By the way, you can still select *, even with common names:

  dfx <- sqldf('select * from df1 left join df2 using (jn) left join df3
using (jn)')

but you might not like the result. Try it and see!




It's my understanding that in the current SQL definition 'as' is no longer
required when changing field names (though it is also still allowed in the
databases I work with, Oracle and MySQL). Perhaps sqldf does not allow it.
I don't know.

Hope this helps.

-Don



-- 
Don MacQueen

Lawrence Livermore National Laboratory
7000 East Ave., L-627
Livermore, CA 94550
925-423-1062





On 1/30/12 4:40 AM, Massimo Bressan mbres...@arpa.veneto.it wrote:

hi don

I followed your advice about using sqldf package but the problem of
labelling the fields persists;
for some reasons I can not properly handle the sql 'as' statement

a_b <- sqldf("select a.*, b.* from a left join b on a.date=b.date")
a_b_c <- sqldf("select a_b.*, c.* from a_b left join c on a_b.date=c.date")

bye

max





- Original Message -
From: MacQueen, Don macque...@llnl.gov
To: maxbre mbres...@arpa.veneto.it; r-help@r-project.org
Sent: Saturday, January 28, 2012 12:24 AM
Subject: Re: [R] merge multiple data frames


Not tested, but this might be a case for the sqldf package.

-Don

-- 
Don MacQueen

Lawrence Livermore National Laboratory
7000 East Ave., L-627
Livermore, CA 94550
925-423-1062





On 1/26/12 9:29 AM, maxbre mbres...@arpa.veneto.it wrote:

This is my reproducible example (three data frames: a, b, c)

a-structure(list(date = structure(1:6, .Label = c(2012-01-03,
2012-01-04, 2012-01-05, 2012-01-06, 2012-01-07, 2012-01-08,
2012-01-09, 2012-01-10, 2012-01-11, 2012-01-12, 2012-01-13,
2012-01-14, 2012-01-15, 2012-01-16, 2012-01-17, 2012-01-18,
2012-01-19, 2012-01-20, 2012-01-21, 2012-01-22, 2012-01-23
), class = factor), so2 = c(0.799401398190476, 0, 0,
0.0100453950434783,
0.200154920565217, 0.473866969181818), nox = c(111.716109973913,
178.077239330435, 191.257829021739, 50.6799951473913, 115.284643540435,
110.425185027727), no = c(48.8543691516522, 88.7197448817391,
93.9931932472609, 13.9759949817391, 43.1395266865217, 41.7280296016364
), no2 = c(36.8673432865217, 42.37150668, 47.53311701, 29.3026882474783,
49.2986070321739, 46.5978461731818), co = c(0.618856168125,
0.99659347508,
0.66698741608, 0.38343731117, 0.281604928875, 0.155383408913043
), o3 = c(12.1393100029167, 12.3522739816522, 10.9908791203043,
26.9122200013043, 13.8421695947826, 12.3788847045455), ipa =
c(167.541954974667,
252.7196257875, 231.802370709167, 83.4850259595833, 174.394613581667,
173.868599272609), ws = c(1.47191016429167, 0.765781205208333,
0.937053086791667, 1.581022406625, 0.909756802125, 0.959252831695652
), wd = c(45.2650019737732, 28.2493544114369, 171.049080544214,
319.753674830936, 33.8713897347193, 228.368119533759), temp =
c(7.9197282588,
3.79434291520833, 2.1287644735, 6.733854600625, 3.136579722,
3.09864120704348), umr = c(86.11566638875, 94.5034087491667,
94.14451249375, 53.1016709004167, 65.63420423, 74.955669236087
)), .Names = c(date, so2, nox, no, no2, co, o3,
ipa, ws, wd, temp, umr), row.names = c(NA, 6L), class =
data.frame)


b-structure(list(date = structure(1:6, .Label = c(2012-01-03,
2012-01-04, 2012-01-05, 2012-01-06, 2012-01-07, 2012-01-08,
2012-01-09, 2012-01-10, 2012-01-11, 2012-01-12, 2012-01-13,
2012-01-14, 2012-01-15, 2012-01-16, 2012-01-17, 2012-01-18,
2012-01-19, 2012-01-20, 2012-01-21, 2012-01-22, 2012-01-23
), class = factor), so2 = c(0, 0, 0, 0, 0, 0), nox = c(13.74758511,
105.8060582, 61.22720599, 11.45280354, 56.86804174, 39.17917222
), no = c(0.882593766, 48.97037506, 9.732937217, 1.794549972,
16.32300019, 8.883637786), no2 =

Re: [R] And Statement for two if functions

2012-01-30 Thread Jorge I Velez

Hi kerry1912,

And what exactly would you like to do after the if(...) statement?  How did
you read your data in?  What's the output of str(yourdata)?  Please see
http://www.R-project.org/posting-guide.html and
help us to help you.

Regards,
Jorge


On Mon, Jan 30, 2012 at 9:52 AM, kerry1912  wrote:

 I want to perform two if functions at the same time:

 if(home team  away team  home team = away team + 7) in R but i am
 struggling to work out how to write this correctly.

 Thanks for any help.


Re: [R] ROC curve

2012-01-30 Thread Corey Dow-Hygelund

Hi Josiane,

Concerning ROC curves, the package ROCR should do what you want.  Use
install.packages to add it to your library.

Once your data are in a text file, use read.delim to read them into a
data frame.  With a data frame in hand, you can use the methods in ROCR to
analyze the data.

Best,

Corey

On Mon, Jan 30, 2012 at 1:52 AM, Josiane NJIWA joa...@yahoo.com wrote:



 Hello all,

 I am very new to R and i am facing two problems. First i didn't succeed
 changing the konsole language in english even after trying the line command
 set language='en'.
 I would like to plot ROC curves. I have a serie of 10 threshold tests that
 i do for 10 patients. The prediction for the patients is always the same
 but the status can change given to the considered threshold.
 I have 11 columns of 10 rows, the first colums containing the10 lines of
 the predicted status of the patients (0=cured, 1=non cured). Then follow 10
 columns (10 thresholds) containing the found status using the threshold.
 Please do someone know how i can use those values with R to plot ROC
 curves?

 I thank you for your understanding,

 Josiane.


 Everything should be made as simple as possible, but not simpler.
  Albert Einstein.




-- 
*The mark of a successful man is one that has spent an entire day on the
bank of a river without feeling guilty about it.*


Re: [R] package does not have a NAMESPACE

2012-01-30 Thread Petr Savicky

On Mon, Jan 30, 2012 at 02:35:29PM +, Reinker, Stefan wrote:
 Hello Ondrej,
 
 I experienced the same problem and circumvented it by installing R 2.13.2 
 where the package runs fine. I also tried contacting the authors with not 
 reply so far, but if you manage to solve the NAMESPACE problem in 2.14 I 
 would be interested. The source does not seem to be available.

Hello:

The source code kopls_1.1.1.tar.gz is available at

  http://kopls.sourceforge.net/download.shtml

Hope this helps.

Petr Savicky.


Re: [R] handling a lot of data

2012-01-30 Thread R. Michael Weylandt

This won't help with large memory issues, but just a pointer:

When you start to construct data_all with these commands

data_all = vector("list", 17);
data_all[[1993]] = data1993;

The first pre-allocates a list of length 17, but the second adds the
data to the 1993rd slot, requiring a complete reallocation. Look at
length(data_all). You'd be better off in general with something like
this:

data_all <- vector("list", 18)
names(data_all) <- 1993:2010
data_all[["1993"]] <- data1993
etc.

which creates a list of length 18 (1993 through 2010 inclusive) with
components named after the years.

If you want to automate that last bit over each year, this would work:

for (yr in 1993:2010) {
    data_all[[as.character(yr)]] <- get(paste("data", yr, sep = ""))
}

It's also been pointed out to me that the Oarray package allows one to
start indexing at an arbitrary point (e.g., 1993 for the first slot)
which might be helpful for managing your data_all object.
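
The difference between indexing by number and by name can be checked directly. A small sketch with placeholder strings standing in for the real yearly data (the object names `by_index` and `by_name` are made up here):

```r
years <- 1993:2010                        # 18 years, inclusive

# Numeric index: assigning to slot 1993 silently grows the list.
by_index <- vector("list", length(years))
by_index[[1993]] <- "placeholder"
n_index <- length(by_index)

# Character index: the year is looked up among the names instead.
by_name <- vector("list", length(years))
names(by_name) <- years
by_name[["1993"]] <- "placeholder"
n_name <- length(by_name)
```

After running this, `n_index` is 1993 while `n_name` stays at 18, which is the reallocation issue described above.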

Michael

On Mon, Jan 30, 2012 at 3:54 AM, Petr Kurtin kur...@avast.com wrote:
 Hi,

 I have got a lot of SPSS data for years 1993-2010. I load all data into
 lists so I can easily index the values over the years. Unfortunately loaded
 data occupy quite a lot of memory (10Gb) - so my question is, what's the
 best approach to work with big data files? Can R get a value from the file
 data without full loading into memory? How can a slower computer with not
 enough memory work with such data?

 I use the following commands:

 data1993 = vector("list", 4);
 data1993[[1]] = read.spss(...)  # first trimester
 data1993[[2]] = read.spss(...)  # second trimester
 ...
 data_all = vector("list", 17);
 data_all[[1993]] = data1993;
 ...

 and indexing, e.g.: data_all[[1993]][[1]]$DISTRICT, etc.

 Thanks,
 Petr Kurtin


Re: [R] repeat function for entire list of matrices

2012-01-30 Thread R. Michael Weylandt

lapply() takes a function as its second argument, but that is not what
you passed it. Also, there's no such construct in R as ".".

What happens with the code I gave you?
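
In case it helps, here is a small sketch of the pattern with made-up matrices; the list `mats` and the summaries computed are hypothetical and are not the poster's nested() function. It shows an anonymous function as the second argument, and two ways to have a counter n = 1, 2, 3 available in each call.

```r
mats <- list(matrix(1:4, 2), matrix(1:6, 2), matrix(1:9, 3))

# The second argument is a function; here an anonymous one.
sums <- lapply(mats, function(m) sum(m))

# Iterating over the indices gives a counter n inside the function.
labelled <- lapply(seq_along(mats), function(n) {
  list(n = n, ncol = ncol(mats[[n]]))
})

# Map() hands the element and its counter to the function together.
scaled <- Map(function(m, n) m * n, mats, seq_along(mats))
```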

Michael

On Sun, Jan 29, 2012 at 4:28 PM, pabears danss...@gmail.com wrote:
 didn't seem to quite work:

 i tried different subsetting.

 lapply(nestedseasonlower, nested(nestedseason,.)

 are there any functions that can repeat a function while counting each
 iteration of the repeated function? (n=1, n=2, n=3)

 thanks


Re: [R] question on model.matrix

2012-01-30 Thread Paul Johnson

Greetings

On Sat, Jan 28, 2012 at 2:43 PM, Daniel Negusse
daniel.negu...@my.mcphs.edu wrote:



 while reading some tutorials, i came across this and i am stuck. i want to 
 understand it and would appreciate if anyone can tell me.

 design <- model.matrix(~ -1+factor(c(1,1,2,2,3,3)))

 can someone break down this code and explain to me what the ~, and the 
 -1+factor are doing?

A formula would be y ~ x, so when you don't include y, it means you
only want the right hand side variables.  The term "design matrix"
generally means the numeric coding that is fitted in a statistical
procedure.

The -1 in the formula means "do not insert an intercept for me."  It
affects the way the factor variable is converted to numeric contrasts
in the design matrix.   If there is an intercept, then the contrasts
have to be adjusted to prevent perfect multicollinearity.

If you run a few examples, you will see. This uses lm, but the formula
and design matrix ideas are same. Note, with an intercept, I get 3
dummy variables from x2, but with no intercept, I get 4 dummies:

> x1 <- rnorm(16)
> x2 <- gl(4, 4, labels=c("none","some","more","lots"))
> y <- rnorm(16)
> m1 <- lm(y ~ x1 + x2)
> model.matrix(m1)
   (Intercept)          x1 x2some x2more x2lots
1            1     -0.2567      0      0      0
2            1  0.94963659      0      0      0
3            1  0.06915561      0      0      0
4            1  0.89971204      0      0      0
5            1  0.73817482      1      0      0
6            1  2.92451195      1      0      0
7            1 -0.80682449      1      0      0
8            1  1.07472998      1      0      0
9            1  1.34949123      0      1      0
10           1 -0.42203984      0      1      0
11           1 -1.66316740      0      1      0
12           1 -2.83232063      0      1      0
13           1  1.26177313      0      0      1
14           1  0.10359857      0      0      1
15           1 -1.85671242      0      0      1
16           1 -0.25140729      0      0      1
attr(,"assign")
[1] 0 1 2 2 2
attr(,"contrasts")
attr(,"contrasts")$x2
[1] "contr.treatment"

> m2 <- lm(y ~ -1 + x1 + x2)
> model.matrix(m2)
            x1 x2none x2some x2more x2lots
1      -0.2567      1      0      0      0
2   0.94963659      1      0      0      0
3   0.06915561      1      0      0      0
4   0.89971204      1      0      0      0
5   0.73817482      0      1      0      0
6   2.92451195      0      1      0      0
7  -0.80682449      0      1      0      0
8   1.07472998      0      1      0      0
9   1.34949123      0      0      1      0
10 -0.42203984      0      0      1      0
11 -1.66316740      0      0      1      0
12 -2.83232063      0      0      1      0
13  1.26177313      0      0      0      1
14  0.10359857      0      0      0      1
15 -1.85671242      0      0      0      1
16 -0.25140729      0      0      0      1
attr(,"assign")
[1] 1 2 2 2 2
attr(,"contrasts")
attr(,"contrasts")$x2
[1] "contr.treatment"

I think you'll need to mess about with R basics like plot and lm
before you go off using the formulas that you really care about.
Otherwise, well, you'll always be lost about stuff like ~ and -1.
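
Applied to the factor from the original question, the same comparison can be run in a few lines. This is only a sketch; the object names `f`, `with_int` and `no_int` are mine.

```r
f <- factor(c(1, 1, 2, 2, 3, 3))

with_int <- model.matrix(~ f)        # intercept plus 2 treatment dummies
no_int   <- model.matrix(~ -1 + f)   # 3 indicator columns, one per level

colnames(with_int)
colnames(no_int)
```

With the intercept dropped, every row of `no_int` has exactly one 1, marking the level that observation belongs to.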

I've started posting all my lecture notes (source code, R code, pdf
output) http://pj.freefaculty.org/guides.  That might be a quick start
for you.

-- 
Paul E. Johnson
Professor, Political Science
1541 Lilac Lane, Room 504
University of Kansas


Re: [R] handling a lot of data

2012-01-30 Thread Paul Bivand

If you do not need all the variables in the SPSS files, use package 'memisc'.
spss.system.file() and its subset() method allow you to load just the
variables needed.

You will need to transform into data.frame as the memisc data.set
includes the SPSS attributes, user-missings etc.

Paul Bivand
Centre for Economic and Social Inclusion
London


[R] discrete simulated annealing

2012-01-30 Thread yan jiao


Dear All,

I need to use simulated annealing for optimization
is there a way to limit the search place to only discrete values? And 
also exclude certain solutions, e.g. exclude the solutions when all the 
variables are the same?


many thanks

Yan


Re: [R] RCurl format

2012-01-30 Thread Duncan Temple Lang


Hi KTD Services (!)

 I assume by DELETE, you mean the HTTP method
and not the value of a parameter  named _method
that is processed by the URL script.

 If that is the case, then you want to use the
 customRequest option for the libcurl operation
 and you don't need or want to use postForm().


 Either

curlPerform(url = url, customrequest = "DELETE",
  userpwd = "user:password")

 or with a recent version of the RCurl package


httpDELETE(url, userpwd = "user:password")


  The parameter _method you are using is being passed on to the form
script.  It is not recognized by postForm() as being something controlling
the request, but just part of the form submission.

  D.



On 1/30/12 2:55 AM, KTD Services wrote:
 I am having trouble with the postForm function in RCurl.
 
 I want to send a the command DELETE https://somewebsite.com.json
 
 but I can't seem to find it.  I could try:
 
 postForm(url, _method=DELETE, .opts = list(username:password) )
 
 but I get the error:
 
 Error: unexpected input in postForm(url4, _
 
 this error seems to be due to the underscore _ before method
 
 Any ideas how I can do a DELETE command another way in RCurl?
 
 Thanks.
 
 
 
   [[alternative HTML version deleted]]
 

Re: [R] ROC curve

2012-01-30 Thread David Winsemius



On Jan 30, 2012, at 4:52 AM, Josiane NJIWA wrote:




Hello all,

I am very new to R and i am facing two problems. First i didn't  
succeed changing the konsole language in english even after trying  
the line command set language='en'.


R is a functional language, so it shouldn't surprise you that issuing  
a command does not do what you apparently expected based on your  
experience with macro languages. You should read:


?locales
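
For what it is worth, one commonly used way to get English messages from within a running session is to set the LANGUAGE environment variable. This is my assumption about what is wanted here, not something confirmed in the thread, and whether it takes effect can depend on the platform and on how R was built.

```r
# Ask R to emit its messages in English for the rest of this session.
Sys.setenv(LANGUAGE = "en")

# Confirm what the session now sees.
lang <- Sys.getenv("LANGUAGE")
```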


I would like to plot ROC curves. I have a serie of 10 threshold  
tests that i do for 10 patients. The prediction for the patients is  
always the same but the status can change given to the considered  
threshold.
I have 11 columns of 10 rows, the first colums containing the10  
lines of the predicted status of the patients (0=cured, 1=non  
cured). Then follow 10 columns (10 thresholds) containing the found  
status using the threshold.
Please do someone know how i can use those values with R to plot ROC  
curves?


I thank you for your understanding,

Josiane.


Everything should be made as simple as possible, but not  
simpler.Albert Einstein.



David Winsemius, MD
Heritage Laboratories
West Hartford, CT


Re: [R] And Statement for two if functions

2012-01-30 Thread David Winsemius



On Jan 30, 2012, at 9:52 AM, kerry1912 wrote:


I want to perform two if functions at the same time:

if(home team  away team  home team = away team + 7) in R but i am
struggling to work out how to write this correctly.


Generally newcomers to the R language find that the ifelse function
does what they expect. The if function is quite different and seems
less likely to be what you want:


?Control
?ifelse
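
A short sketch of both forms follows. The comparison operators in the original post did not survive, so the condition used here (home score greater than away score, but by no more than 7) is a guess, and the vectors `home` and `away` are invented.

```r
# Hypothetical scores for four matches.
home <- c(3, 10, 1, 5)
away <- c(1, 2, 4, 5)

# A single test inside if(): combine the two conditions with &&.
first_match <- "other"
if (home[1] > away[1] && home[1] <= away[1] + 7) {
  first_match <- "narrow home win"
}

# A test over whole columns: combine with & and let ifelse() vectorize it.
result <- ifelse(home > away & home <= away + 7, "narrow home win", "other")
```

The && form looks only at single values, which is what if() needs; the & form compares element by element, which is what ifelse() needs.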


--
David Winsemius, MD
Heritage Laboratories
West Hartford, CT


Re: [R] discrete simulated annealing

2012-01-30 Thread Petr Savicky

On Mon, Jan 30, 2012 at 04:57:36PM +, yan jiao wrote:
 Dear All,
 
 I need to use simulated annealing for optimization
 is there a way to limit the search place to only discrete values? And 
 also exclude certain solutions, e.g. exclude the solutions when all the 
 variables are the same?

Dear Yan:

The page ?optim says

  If a function to generate a new candidate point is given,
  method ‘SANN’ can also be used to solve combinatorial
  optimization problems.

If you have your specific function to generate a new point,
this function may apply the required restrictions.
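
A minimal sketch of that idea, under made-up assumptions: the objective `fn`, the target vector, the 0..10 integer range and the +/-1 move in `gr` are all hypothetical. With method "SANN", the `gr` argument of optim() is used as the candidate generator, so it can keep the search on integer points. All-equal solutions are handled here by a large penalty in `fn`, which is one possible way to exclude them, not the only one.

```r
set.seed(42)
target <- c(2, 5, 7)

# Objective to minimise; an all-equal vector gets a large penalty
# so it is never preferred over an admissible point.
fn <- function(x) {
  if (length(unique(x)) == 1) return(1e6)
  sum((x - target)^2)
}

# Candidate generator: move one coordinate by +/-1, staying within 0..10.
gr <- function(x) {
  i <- sample(length(x), 1)
  x[i] <- min(10, max(0, x[i] + sample(c(-1, 1), 1)))
  x
}

res <- optim(par = c(0, 1, 2), fn = fn, gr = gr, method = "SANN",
             control = list(maxit = 5000))
res$par
```

Since every candidate comes from `gr`, all visited points are integer vectors, and the starting point itself is admissible, so the returned best point cannot be one of the penalised all-equal vectors.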

Hope this helps.

Petr Savicky.


Re: [R] Checking for invalid dates: Code works but needs improvement

2012-01-30 Thread David Winsemius



On Jan 30, 2012, at 8:44 AM, Paul Miller wrote:


Hi Rui, Marc, and Gabor,

Thanks for your replies to my question. All were helpful and it was  
interesting to see how different people approach various aspects of  
the same problem.


Spent some time this weekend looking at Rui's solution, which is  
certainly much clearer than my own. Managed to figure out pretty  
much all the details of how it works. Also managed to tweak it  
slightly in order to make it do exactly what I wanted. (See revised  
code below.)


Still have a couple of questions though. The first concerns the
insertion of the code "Y > 2012" to set year values beyond 2012 to
NA (on line 10 of the function below).  When I add this (or use it
in place of "nchar(Y) > 4"), the code successfully finds the problem
date "05/16/2015". After that though, it produces the following
error message:


Error in if (any(is.na(x) & M != "un" & Y != "un")) cat("Warning:
Invalid date values in",  :  missing value where TRUE/FALSE needed


It's a bit dangerous to use comparison operators on mixed data types.
In your case you are comparing a character value to a numeric value
and may not realize that "2015" is not the same as 2015. Try "123" <
1000 if you want a quick counter-example. You may want to coerce the Y
value to numeric mode to be safe.


Also 'any' does not expect the logical connectives. You probably want:

any(is.na(x) , M != "un" , Y != "un")
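
Both points can be tried at the console. The sketch below is only an illustration: the vector `flags` is made up, and reading the reported error as an NA reaching if() is my interpretation rather than something established in the thread.

```r
# Mixed types: the number is converted to a string and the strings are compared.
chr_cmp <- "123" < 1000
num_cmp <- as.numeric("123") < 1000

# A missing value among otherwise FALSE flags makes any() return NA,
# and if (NA) stops with "missing value where TRUE/FALSE needed".
flags <- c(FALSE, FALSE, NA)
raw_any  <- any(flags)
safe_any <- any(flags, na.rm = TRUE)
```

Here `chr_cmp` is FALSE while `num_cmp` is TRUE, and `raw_any` is NA while `safe_any` is FALSE, so wrapping the test in isTRUE() or passing na.rm = TRUE keeps if() from failing.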



Why is this happening? If the code correctly correctly handles the  
date 06/20/1840 without producing an error, why can't it do  
likelwise with 05/16/2015?


The second question is why it's necessary to put x on line 15  
following cat(Warning ...). I know that I don't get any date  
columns if I don't include this but am not sure why.


The third question is whether it's possible to change the class of  
the date variables without using a for loop. I played around with  
this a little but didn't find a vectorized alternative. It may be  
that this is not really important. It's just that I've read in  
several places that for loops should be avoided wherever possible.
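
On the third question, one possible loop-free variant, sketched on a
made-up two-column data frame (the values are illustrative only).
sapply() simplifies Date vectors to plain numbers, which is why the
class has to be restored afterwards; lapply() keeps each column as a
Date, and assigning into dates[] keeps the data frame structure.

```r
# Hypothetical character dates already in year-month-day form.
chr <- data.frame(birthDT     = c("1935-06-17", "1937-05-31"),
                  diagnosisDT = c("2008-12-20", "2007-01-18"),
                  stringsAsFactors = FALSE)

# Convert every column; each list element stays a Date vector.
dates <- chr
dates[] <- lapply(chr, as.Date, format = "%Y-%m-%d")

col_classes <- vapply(dates, function(col) class(col)[1], character(1))
```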


Thanks,

Paul


##########################################
#### Code for detecting invalid dates ####
##########################################

#### Test Data ####

connection <- textConnection("
1 11/23/21931 05/23/2009 un/17/2011
2 06/20/1840  02/30/2010 03/17/2011
3 06/17/1935  12/20/2008 07/un/2011
4 05/31/1937  01/18/2007 04/30/2011
5 06/31/1933  05/16/2015 11/20/un
")

TestDates <- data.frame(scan(connection,
 list(Patient=0, birthDT="", diagnosisDT="", metastaticDT="")))

close(connection)

#### Input Data ####

TDSaved <- TestDates

#### List of Date Variables ####

DateNames <- c("birthDT", "diagnosisDT", "metastaticDT")

#### Date Function ####

fun <- function(Dat){
   f <- function(jj, DF){
   x <- as.character(DF[, jj])
   x <- unlist(strsplit(x, "/"))
   n <- length(x)
   M <- x[seq(1, n, 3)]
   D <- x[seq(2, n, 3)]
   Y <- x[seq(3, n, 3)]
   D[D == "un"] <- "15"
   Y <- ifelse(nchar(Y) > 4 | Y > 2012 | Y < 1900, NA, Y)
   x <- as.Date(paste(Y, M, D, sep="-"), format="%Y-%m-%d")
   if(any(is.na(x) & M != "un" & Y != "un"))
   cat("Warning: Invalid date values in", jj, "\n",
   as.character(DF[is.na(x), jj]), "\n")
   x
   }
   Dat <- data.frame(sapply(names(Dat), function(j) f(j, Dat)))
   for(i in names(Dat)) class(Dat[[i]]) <- "Date"
   Dat
}

#### Output Data ####

TD <- TDSaved

#### Read Dates ####

TD[, DateNames] <- fun(TD[, DateNames])
TD



David Winsemius, MD
Heritage Laboratories
West Hartford, CT


Re: [R] Variable selection based on both training and testing data

2012-01-30 Thread Jin Minming

I do not have enough test data for regression analysis, although I know there
are some statistical regression methods that can be used for small datasets.
That is why I need to build a model first using the training dataset.

Thanks,

Jim
 

--- On Mon, 30/1/12, Liaw, Andy andy_l...@merck.com wrote:

 From: Liaw, Andy andy_l...@merck.com
 Subject: RE: [R] Variable selection based on both training and testing data
 To: 'Jin Minming' jminm...@yahoo.com, r-help@r-project.org 
 r-help@r-project.org
 Date: Monday, 30 January, 2012, 13:39
 Variable selection is part of the training process -- it chooses the
 model.  By definition, test data is used only for testing (evaluating
 the chosen model).
 
 If you find a package or function that does variable selection on
 test data, run from it!
 
 Best,
 Andy 
 
  -----Original Message-----
  From: r-help-boun...@r-project.org
  [mailto:r-help-boun...@r-project.org] On Behalf Of Jin Minming
  Sent: Monday, January 30, 2012 8:14 AM
  To: r-help@r-project.org
  Subject: [R] Variable selection based on both training and testing data
  
  Dear all,
  
  The variable selection in regression is usually determined by the
  training data using AIC or F value, such as stepAIC. Is there some R
  package that can consider both the training and test dataset? For
  example, I have two separate training data and test data. Firstly, a
  regression model is obtained by using training data, and then this
  model is tested by using test data. This process continues in order
  to find some possible optimal models in terms of RMSE or R2 for both
  training and test data.
  
  Thanks,
  
  Jim
  



[R] replacing characters in matrix. substitute, delayedAssign, huh?

2012-01-30 Thread Paul Johnson

A user question today has me stumped.  Can you advise me, please?

User wants a matrix that has some numbers, some variables, possibly
even some function names.  So that has to be a character matrix.
Consider:

> BM <- matrix(0.1, 5, 5)

Use data.entry(BM) or similar to set some to more abstract values.

> BM[3,1] <- "a"
> BM[4,2] <- "b"
> BM[5,2] <- "b"
> BM[5,3] <- "d"
> BM
     var1  var2  var3  var4  var5
[1,] "0.1" "0.1" "0.1" "0.1" "0.1"
[2,] "0.1" "0.1" "0.1" "0.1" "0.1"
[3,] "a"   "0.1" "0.1" "0.1" "0.1"
[4,] "0.1" "b"   "0.1" "0.1" "0.1"
[5,] "0.1" "b"   "d"   "0.1" "0.1"

Later on, user code will set values, e.g.,

a <- rnorm(1)
b <- 17
d <- 4

Now, push those into BM, convert whole thing to numeric

newBM <- apply(BM, c(1,2), as.numeric)

and use newBM for some big calculation.

Then re-set new values for a, b, d, do the same over again.

I've been trying lots of variations on parse, substitute, and eval.

The most interesting function I learned about this morning was delayedAssign.
If I had only to work with one scalar, it does what I want

> delayedAssign("a", whatA)
> whatA <- 91
> a
[1] 91

I can't see how to make that work in the matrix context, though.

Got ideas?

pj

> sessionInfo()
R version 2.14.1 (2011-12-22)
Platform: x86_64-pc-linux-gnu (64-bit)

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C
 [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8
 [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8
 [7] LC_PAPER=C                 LC_NAME=C
 [9] LC_ADDRESS=C               LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] stats graphics  grDevices utils datasets  methods   base

loaded via a namespace (and not attached):
[1] tools_2.14.1

-- 
Paul E. Johnson
Professor, Political Science
1541 Lilac Lane, Room 504
University of Kansas


[R] plot with ylim with regural interval

2012-01-30 Thread gianni lavaredo

Dear Researchers,

sorry for the easy question but Is it possible to plot with an interval of
1 or .5 in a plot using ylim?

Thanks
gianni

x = 0:10;
y = 0:10;

plot(x~y,ylim=c(0,10),las=1)


Re: [R] replacing characters in matrix. substitute, delayedAssign, huh?

2012-01-30 Thread Richard M. Heiberger

Are you sure this isn't a dataframe?  Some minor rethinking of the
structure might get it there.

Rich


Re: [R] how to select columns

2012-01-30 Thread David Winsemius



On Jan 30, 2012, at 2:30 AM, David Studer wrote:


Hello,
I have the following question:

when creating a data.frame
a1 <- c(1,2,3)
a2 <- c(1,2,3)
c <- data.frame(a1,a2)
I can select columns using an index like:
c[,1:2]
Is this possible too when using column-names? (something like
c(,a1:a2), which doesn't work):


Generally you need to use grep to convert column names to numbers for
use within "[" operations:


df[ , grep("^a1$", names(df)):grep("^a2$", names(df)) ]

--
Another David



Alternative question: Is there a function to get the index of a
variable by name


That's what grep will do.
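
A short sketch of the name-to-index idea on a made-up data frame (the
column names a1, a2, a3 and b are illustrative). grep() with an anchored
pattern gives the position of one column, match() does the same for
exact names without regular expressions, and a character vector of
names can also be used directly inside "[".

```r
# Hypothetical data frame with consecutively named columns.
df <- data.frame(a1 = 1:3, a2 = 4:6, a3 = 7:9, b = 10:12)

# Positions of the first and last wanted column, then a numeric range.
first <- grep("^a1$", names(df))
last  <- grep("^a3$", names(df))
by_range <- df[, first:last]

# match() returns the index of each exact name.
idx <- match(c("a1", "a3"), names(df))

# Building the names a_1 ... a_n with paste0() avoids an explicit loop.
by_name <- df[, paste0("a", 1:3)]
```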


or can I
select certain columns using a loop? (a_1, a_2, ..., a_n)

Thank you very much!
David


David Winsemius, MD
Heritage Laboratories
West Hartford, CT


[R] ode() tries to allocate an absurd amount of memory

2012-01-30 Thread Thomas Brown

Hi there R-helpers:

I'm having problems with the function ode() found in the package deSolve.
It seems that when my state variables are too numerous (33000 elements),
the function throws the following error:

Error in vode(y, times, func, parms, ...) :
  cannot allocate memory block of size 137438953456.0 Gb
In addition: Warning message:
In vode(y, times, func, parms, ...) : NAs introduced by coercion

This appears to be the case regardless of the computer I use; that is, whether
it's a laptop or server with 24Gb of RAM. Why is ode() trying to allocate
137 billion gigabytes of memory?! (I receive exactly the same error message
whether I have, for example, 34000 or 8 state variables: the amount of
memory trying to be allocated is exactly the same.) I have included a
trivial example below that uses a function that returns a rate of change of
zero for all state variables.

> require(deSolve)
Loading required package: deSolve
> C<-rep(0,34000)
> TestFunc<-function(t,C,para){
+ return(list(rep(0,length(C))))
+ }
> soln<-ode(y=C,times=seq(0,1,0.1),func=TestFunc,parms=c(0),method="vode")
Error in vode(y, times, func, parms, ...) :
  cannot allocate memory block of size 137438953456.0 Gb
In addition: Warning message:
In vode(y, times, func, parms, ...) : NAs introduced by coercion


Am I making a foolish mistake somewhere or is this simply a limitation of
the function?
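
One possible explanation, offered as a guess rather than a diagnosis:
with a full (dense) Jacobian the solver needs a work array that grows
with the square of the number of state variables. The formula below is
the work-array length given in the VODE documentation for a full
Jacobian; that it is what deSolve computes here is an assumption. If it
is, the length no longer fits in a 32-bit integer a little below 33000
states, which would match the "NAs introduced by coercion" warning and
the meaningless allocation size. If the Jacobian is actually sparse or
banded, the banded Jacobian options described in the deSolve
documentation may avoid the dense array.

```r
# Assumed length of the VODE real work array with a full Jacobian
# (from the VODE documentation, not checked against deSolve's source).
vode_lrw <- function(n) 22 + 9 * n + 2 * n^2

n <- c(30000, 32000, 33000, 34000)
sizes <- data.frame(n = n,
                    lrw = vode_lrw(n),
                    fits_integer = vode_lrw(n) <= .Machine$integer.max)
sizes
```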

Thanks in advance!


[R] Euler identity with complex exp

2012-01-30 Thread Joseph Park

Hi,

Am i doing something silly here in expecting Euler's
formula to be handled by exp? exp( ix ) = cos x + i sin x.
The first example below follows this, the others not.

Thanks for the education!

> exp( complex(real = 0, imag = 2*pi) )
[1] 1-0i
> exp( complex(real = pi, imag = 2*pi) )
[1] 23.14069-0i
> exp( complex(real = pi/2, imag = 0) )
[1] 4.810477+0i
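
A small check, as a sketch: Euler's formula covers only the imaginary
part, and a nonzero real part contributes the factor exp(Re(z)), that
is, exp(a + ib) = exp(a) * (cos b + i sin b). The variable names below
are made up for the example.

```r
# Same number as in the second call above: real part pi, imaginary 2*pi.
z <- complex(real = pi, imaginary = 2 * pi)

lhs <- exp(z)
rhs <- exp(Re(z)) * complex(real = cos(Im(z)), imaginary = sin(Im(z)))
same <- isTRUE(all.equal(lhs, rhs))

# Pure Euler case: exp(i*pi) + 1 is zero up to rounding error.
euler_residual <- Mod(exp(1i * pi) + 1)
```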



[R] Fw: Variable selection based on both training and testing data

2012-01-30 Thread SR Millis

From: SR Millis srmil...@yahoo.com

To: Jin Minming jminm...@yahoo.com 
Sent: Monday, January 30, 2012 9:25 AM
Subject: Re: [R] Variable selection based on both training and testing data

Jim,

First, stepwise methods for variable selection should be avoided.  Frank 
Harrell (in Regression Modeling Strategies) discusses this at length.

Second, splitting a dataset into training and validation sets is generally not
a good idea unless you have a really large sample, e.g., > 20,000.  As Harrell
has discussed, split-sample validation does not provide external validation, is
terribly inefficient, and is arbitrary.  It's better to specify your model a
priori and use the bootstrap to obtain an estimate of your model's
over-optimism.  Bootstrapping can be implemented with Harrell's rms package in
R.
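
To make the bootstrap idea concrete, here is a hand-rolled base-R sketch
of an optimism estimate for R-squared. It is not the rms implementation;
the simulated data, the model formula and the helper r2() are
assumptions made for the example.

```r
set.seed(42)
n <- 100
dat <- data.frame(x1 = rnorm(n), x2 = rnorm(n))
dat$y <- 1 + 0.5 * dat$x1 + rnorm(n)

# R-squared of a fitted model evaluated on an arbitrary data set.
r2 <- function(fit, data) {
  pred <- predict(fit, newdata = data)
  1 - sum((data$y - pred)^2) / sum((data$y - mean(data$y))^2)
}

# Apparent performance: model fitted and evaluated on the same data.
fit <- lm(y ~ x1 + x2, data = dat)
apparent <- r2(fit, dat)

# Optimism: refit on each bootstrap sample, then compare performance on
# that sample with performance on the original data.
B <- 200
optimism <- replicate(B, {
  boot <- dat[sample(n, replace = TRUE), ]
  bfit <- lm(y ~ x1 + x2, data = boot)
  r2(bfit, boot) - r2(bfit, dat)
})

corrected <- apparent - mean(optimism)
```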

Scott

~~~
Scott R Millis, PhD, ABPP, CStat, PStat®
Professor
Wayne State University School of Medicine
Email:  aa3...@wayne.edu
Email:  srmil...@yahoo.com
Tel: 313-993-8085

To: r-help@r-project.org 
Sent: Monday, January 30, 2012 8:14 AM
Subject: [R] Variable selection based on both training and testing data

Dear all,

The variable selection in regression is usually determined by the training data 
using AIC or F value, such as stepAIC. Is there some R package that can 
consider both the training and test dataset? For example, I have two separate 
training data and test data. Firstly, a regression model is obtained by using 
training data, and then this model is tested by using test data. This process 
continues in order to find some possible optimal models in terms of RMSE or R2 
for both training and test data. 

Thanks,

Jim


[R] Reg : Hello all.. help needed regarding heatmaps

2012-01-30 Thread koushik gangavaram

Hello all ,

I am a beginner and new to the R world.  I have heard much about R and
have started working with it.

I have some data on 20 business applications (y-axis) by month (x-axis),
with each application's score for every month as the values.  I tried to
generate a heatmap with this data and got some good results.  Can someone
help me generate a legend next to the heatmap, please?


Can someone send me some sample code?
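
One way this could be done in base graphics, as a rough sketch: draw the
heatmap with image() in one panel and a colour key with a second image()
in a narrow panel beside it. The function name, the random scores and
the application labels are all made up for the example.

```r
# Hypothetical helper: heatmap on the left, colour key on the right.
draw_heatmap_with_key <- function(m, cols = heat.colors(12)) {
  op <- par(no.readonly = TRUE)
  on.exit(par(op))
  layout(matrix(1:2, nrow = 1), widths = c(4, 1))

  # Heatmap panel: columns of m on the x-axis, rows on the y-axis.
  par(mar = c(5, 8, 2, 1))
  image(x = seq_len(ncol(m)), y = seq_len(nrow(m)), z = t(m),
        col = cols, axes = FALSE, xlab = "Month", ylab = "")
  axis(1, at = seq_len(ncol(m)), labels = colnames(m), las = 2)
  axis(2, at = seq_len(nrow(m)), labels = rownames(m), las = 1)

  # Key panel: a single column of colours spanning the data range.
  par(mar = c(5, 1, 2, 4))
  key <- seq(min(m), max(m), length.out = length(cols))
  image(x = 1, y = key, z = matrix(key, nrow = 1),
        col = cols, axes = FALSE, xlab = "", ylab = "")
  axis(4, las = 1)

  invisible(range(m))
}

# Made-up scores: 20 applications by 12 months.
set.seed(1)
scores <- matrix(runif(20 * 12, 0, 100), nrow = 20,
                 dimnames = list(paste("App", 1:20), month.abb))
rng <- draw_heatmap_with_key(scores)
```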


Some of the useful link that i found on web :
http://www.oga-lab.net/RGM2/func.php?rd_id=gplots:heatmap.2


Thanks in advance.


Re: [R] Calculate a function repeatedly over sections of a ts object

2012-01-30 Thread Jorge Molinos


Thank you very much Mike. The script is working now.

Jorge





From: R. Michael Weylandt [michael.weyla...@gmail.com]
Sent: 30 January 2012 04:29
To: Jorge Molinos; r-help
Subject: Re: [R] Calculate a function repeatedly over sections of a ts object

Sorry, that last line should read:

FUN=function(z){
  lz <- length(z)
  SDF(z,method="lag window",
window=taper(type="parzen",n.sample=lz,cutoff= 2*sqrt(lz)), npad=2*lz)
}
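
A base-R stand-in for the rolling idea, as a sketch only (roll_by is a
made-up helper, not the zoo function): it applies FUN to left-aligned
windows of a given width, moving by a fixed step. It also shows why a
FUN that ignores its argument returns the same value for every window.

```r
# Hypothetical helper: apply FUN to windows x[s:(s + width - 1)].
roll_by <- function(x, width, by, FUN) {
  starts <- seq(1, length(x) - width + 1, by = by)
  lapply(starts, function(s) FUN(x[s:(s + width - 1)]))
}

x <- 1:10

# Uses the window z: a different result for each window.
uses_z <- unlist(roll_by(x, width = 4, by = 2, FUN = function(z) sum(z)))

# Ignores z and uses the whole series: identical results everywhere.
ignores_z <- unlist(roll_by(x, width = 4, by = 2, FUN = function(z) sum(x)))
```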

On Sun, Jan 29, 2012 at 11:29 PM, R. Michael Weylandt
michael.weyla...@gmail.com wrote:
 It's customary to keep the list cc'd.

 I can't run your code without the data, but it does seem to me that
 your problem is in the FUN argument, as you guess.

 You have:

 FUN=function(z) SDF(adezoo,method="lag window",
 window=taper(type="parzen",n.sample=n.d,cutoff=(2*sqrt(n.d))),
 npad=2*n.d)

 But this function doesn't actually act on its argument: you tell it
 to accept something called z but then it never gets told to do
 anything to z. Perhaps you meant

 FUN=function(z) SDF(z,method="lag window",
 window=taper(type="parzen",n.sample=n.d,cutoff=(2*sqrt(n.d))),
 npad=2*n.d)

 I also worry about your use of n.d; are you sure you don't want to
 use the length of the rolling window? Something more like:

 FUN=function(z){
   lz <- length(z)
   SDF(z,method="lag window",
 window=taper(type="parzen",n.sample=lz,cutoff= 2*sqrt(lz)),
 npad=2*nlz)
 }

 Does that fix it?

 Michael

 On Fri, Jan 27, 2012 at 1:06 PM, Jorge Molinos jgarc...@tcd.ie wrote:
 Hi Michael,

 Sorry, I've been trying to use rollapply with my function but it seems I 
 can't get it to work properly. The function seems to be dividing the time 
 series accordingly (every 1) and using the correct length for the time 
 window (10 years) but when I look at the results all of them are the same 
 for all the subseries which doesn't make sense. The problem has to be within 
 the FUN argument though I cannot figure out what it is. Would you mind 
 checking on the code to see if you can spot where is the problem?

 adets <- ts(adeery$DA,c(adeery$Year[1],adeery$Day[1]),frequency=365)

 adezoo <- as.zoo(adets)

 n.d <- length(adets)

 especlist <- rollapply(adezoo, width=3650, FUN=function(z)
 SDF(adezoo,method="lag window",
 window=taper(type="parzen",n.sample=n.d,cutoff=(2*sqrt(n.d))),
 npad=2*n.d), by = 365, align="left")


 And these are, for example, the SDF values at the last day for each 10-y 
 subseries (all the same though they should be different as I have it verify 
 by doing the SDF step by step using the same values for the arguments within 
 the function):

 especlist1.7048
 1978(20)1.998068e-06
 1979(20)1.998068e-06
 1980(20)1.998068e-06
 1981(20)1.998068e-06
 1982(20)1.998068e-06
 1983(20)1.998068e-06
 1984(20)1.998068e-06
 1985(20)1.998068e-06
 1986(20)1.998068e-06
 1987(20)1.998068e-06

 Thanks a lot.

 Jorge


 
 From: R. Michael Weylandt [michael.weyla...@gmail.com]
 Sent: 26 January 2012 21:00
 To: Jorge Molinos
 Cc: r-help@R-project.org
 Subject: Re: [R] Calculate a function repeatedly over sections of a ts object

 I'm not sure if it's easily doable with a ts class, but the rollapply
 function in the zoo package will do this easily. (Also, I find zoo to
 be a much more natural time-series workflow than ts so it might make
 the rest of your life easier as well)

 Michael

 On Thu, Jan 26, 2012 at 2:24 PM, Jorge Molinos jgarc...@tcd.ie wrote:

 Hi,

 I want to apply a function (in my case SDF; package “sapa”) repeatedly over 
 discrete sections of a daily time series object by sliding a time window of 
 constant length (e.g. 10 consecutive years or 1825 days) over the entire ts 
 at increments of 1 time unit (e.g. 1 year or 365 days). So for example, the 
 first SDF would be calculated for the daily values of my variable recorded 
 between years 1 to 5, SDF2 to those for years 2 to 6 and so on until the 
 total length of the series is covered. How can I implement this into a R 
 script? Any help is much appreciated.

 Jorge

[R] how to sum multiple data entries for the same sampling event?

2012-01-30 Thread karengrace84

I'm having trouble with some catch per unit effort data (CPUE, fisheries
data). Some of the samples were retained and some unretained, and they
are entered as 2 separate entries for the same sampling event (Date and
time). I want to calculate the total CPUE (so sum the retained and
unretained number for each sampling event) and am having trouble doing so.
Here's a sample of what my data.frame looks like now:

Date                  lmb.cpue     Disposition.of.Catch
1999-07-10 12:10:00   0.6667   Unretained
1999-07-10 12:10:00   0.1667 Retained
1999-07-14 11:22:00   0.8333   Unretained
1999-07-14 11:22:00   0.5556 Retained
1999-07-14 11:48:00   0.1667   Unretained
1999-07-14 11:48:00   0.5833 Retained
1999-07-14 13:56:00   0.57142857 Retained
1999-07-15 10:23:00   0. Retained
1999-07-22 12:03:00   0. Retained
1999-07-25 11:26:00   0.4000   Unretained
1999-07-25 11:26:00   1. Retained


And I would like to end up with:

Date   lmb.cpue  
1999-07-10 12:10:00   0.8333
1999-07-14 11:22:00   1.3889
1999-07-14 11:48:00   0.7500
1999-07-14 13:56:00   0.57142857
1999-07-15 10:23:00   0.
1999-07-22 12:03:00   0.
1999-07-25 11:26:00   1.4000  
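
One way to get such a table, sketched with the first few rows of the
data above typed in by hand (the object names catch and total are made
up): aggregate() sums lmb.cpue within each Date value, so events with a
single entry are carried through unchanged.

```r
# First seven rows of the posted data, with Date kept as text.
catch <- data.frame(
  Date = c("1999-07-10 12:10:00", "1999-07-10 12:10:00",
           "1999-07-14 11:22:00", "1999-07-14 11:22:00",
           "1999-07-14 11:48:00", "1999-07-14 11:48:00",
           "1999-07-14 13:56:00"),
  lmb.cpue = c(0.6667, 0.1667, 0.8333, 0.5556, 0.1667, 0.5833, 0.57142857),
  Disposition.of.Catch = c("Unretained", "Retained", "Unretained",
                           "Retained", "Unretained", "Retained", "Retained")
)

# Sum retained and unretained catch within each sampling event.
total <- aggregate(lmb.cpue ~ Date, data = catch, FUN = sum)
total
```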

Thanks for any help you have to offer!

--
View this message in context: 
http://r.789695.n4.nabble.com/how-to-sum-multiple-data-entries-for-the-same-sampling-event-tp4341670p4341670.html
Sent from the R help mailing list archive at Nabble.com.


Re: [R] plot with ylim with regural interval

2012-01-30 Thread Jorge I Velez

Hi Gianni,

Yes, take a look at

x - y - 1:10
plot(x, y, ylim=c(0,10),las=1, yaxt = 'n')
axis(2, seq(0, 10, by = .5), seq(0, 10, by = .5), las = 2)

plot(x, y, ylim=c(0,10),las=1, yaxt = 'n')
axis(2, seq(0, 10, by = 1), seq(0, 10, by = 1), las = 2)

Also, check ?plot and ?par for more details.

HTH,
Jorge.-


On Mon, Jan 30, 2012 at 1:28 PM, gianni lavaredo  wrote:

 Dear Researchers,

 sorry for the easy question but Is it possible to plot with an interval of
 1 or .5 in a plot using ylim?

 Thanks
 gianni

 x = 0:10;
 y = 0:10;

 plot(x~y,ylim=c(0,10),las=1)


Re: [R] Fw: Variable selection based on both training and testing data

2012-01-30 Thread Jin Minming

Dear Scott,

I am so sorry that I think I just sent an empty email to you.
Thanks a lot for your advice.

The problem is that we do not have sufficient prior knowledge of the
regression form, or even of the appropriate inputs. We need to find some
candidate regression equations and then add our explanation to them, so we
need to explore a lot of options.  The two input datasets are very different
in nature and come from two locations.  Hence, the second can be used for
testing, although it may turn out that no single regression is appropriate
because of the intrinsic difference between the two datasets.

In fact, if I can extract all the models visited by the stepAIC function
(not only the final model), then it will be easy to add some simple scripts
to calculate R2 or RMSE for both datasets.
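
A possible sketch of that idea using the keep argument of base R's
step(), which is called once for every model on the selection path
(stepAIC in MASS documents a keep argument as well). The simulated data,
the helper rmse() and the function keep_fun() are assumptions for the
example, and the cautions raised elsewhere in this thread about stepwise
selection and about choosing models on test data still apply.

```r
set.seed(1)
make_data <- function(n) {
  d <- data.frame(x1 = rnorm(n), x2 = rnorm(n), x3 = rnorm(n))
  d$y <- 1 + 2 * d$x1 + rnorm(n)
  d
}
train <- make_data(50)
test  <- make_data(50)

# Root mean squared error of a fitted model on a given data set.
rmse <- function(fit, data) {
  sqrt(mean((data$y - predict(fit, newdata = data))^2))
}

# Called by step() for each model visited; the returned list is stored.
keep_fun <- function(fit, aic) {
  list(formula    = paste(deparse(formula(fit)), collapse = " "),
       aic        = aic,
       rmse_train = rmse(fit, train),
       rmse_test  = rmse(fit, test))
}

full <- lm(y ~ x1 + x2 + x3, data = train)
sel  <- step(full, direction = "backward", keep = keep_fun, trace = 0)

# sel$keep has one column per visited model and one row per list entry.
path_formulas  <- unlist(sel$keep["formula", ])
path_rmse_test <- unlist(sel$keep["rmse_test", ])
```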

Thanks,

Jim



Re: [R] how to select columns

2012-01-30 Thread Marc Schwartz


On Jan 30, 2012, at 12:33 PM, David Winsemius wrote:

 
 On Jan 30, 2012, at 2:30 AM, David Studer wrote:
 
 Hello,
 I have the following question:
 
 when creating a data.frame
 a1 <- c(1,2,3)
 a2 <- c(1,2,3)
 c <- data.frame(a1,a2)
 I can select columns using an index like:
 c[,1:2]
 Is this possible too when using column-names? (something like c(,a1:a2),
 which doesn't work):
 
 Generally you need to use grep to convert column names to numbers for use 
 within "[" operations:
 
 df[ , grep("^a1$", names(df)):grep("^a2$", names(df)) ]
 
 -- 
 Another David


Just to throw out another option here, the ?subset function has a 'select' 
argument, which supports a start:end syntax to extract sequential columns from 
a data frame. Thus:

  subset(DF, select = StartColumnName:EndColumnName)

gets you that ability. The column names are NOT quoted, so in your case:

  subset(DF, select = a1:a2)

You can even select sequential and non-sequential columns by using c() along 
with the start:end syntax:

  subset(DF, select = c(ColA, ColF:ColH, ColK, ColN:ColW, ColZ))
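
A quick illustration on a made-up data frame (the column names ColA to
ColE are placeholders chosen for the example):

```r
# Hypothetical data frame with five columns.
DF <- data.frame(ColA = 1:2, ColB = 3:4, ColC = 5:6, ColD = 7:8, ColE = 9:10)

# A run of adjacent columns, given by unquoted start and end names.
run <- subset(DF, select = ColB:ColD)

# A single column combined with a run, using c().
mix <- subset(DF, select = c(ColA, ColC:ColD))
```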

HTH,

Marc Schwartz

 
 
 Alternative question: Is there a function to get the index of a variable by
 name
 
 That's what grep will do.
 
 or can I
 select certain columns using a loop? (a_1, a_2, ..., a_n)
 
 Thank you very much!
 David
 
 David Winsemius, MD
 Heritage Laboratories
 West Hartford, CT

Re: [R] repeat function for entire list of matrices

2012-01-30 Thread pabears

michael,

i don't know what happened, i was reading up on ?lapply(), i was up really
late, and somehow it didn't seem to take, but i tried it again this morning
and it worked like a charm.(sorry about the ellipses, i was just
being lazy/unclear).

that's great, thanks, this is a great help...

--
View this message in context: 
http://r.789695.n4.nabble.com/repeat-function-for-entire-list-of-matrices-tp4334587p4341629.html
Sent from the R help mailing list archive at Nabble.com.


[R] Linear Mixed Model set-up

2012-01-30 Thread Maggie Neff

Hello,

I have some data covering contaminant concentrations in fish over a time
period of ~35 years.  Each year, multiple samples of fish were taken (with
varying sample sizes each year).  Ultimately, I want an estimation of the
variance between years, and the variance within years + random effects.  I
used a linear mixed model to estimate these variances, but after reading a
number of different references and examples, I am still unclear as to
whether I have set up the model correctly to obtain these values.

I've used the *lme* function as follows - the example here is on an
abbreviated version of my data set:

> fish<-read.csv("data.csv",header=TRUE)
> fish
   SPECIES YEAR CONTAMINANT
1  Walleye 1970        2.83
2  Walleye 1970        2.56
3  Walleye 1970        2.83
4  Walleye 1970        2.56
5  Walleye 1970        2.77
6  Walleye 1970        2.56
7  Walleye 1970        2.64
8  Walleye 1970        2.22
9  Walleye 1970        2.56
10 Walleye 1970        2.40
11 Walleye 1975        1.59
12 Walleye 1975        1.53
13 Walleye 1975        2.16
14 Walleye 1975        1.60
15 Walleye 1975        2.16
16 Walleye 1976        2.03
17 Walleye 1976        1.97
18 Walleye 1976        1.95
19 Walleye 1976        2.36
20 Walleye 1976        1.82
21 Walleye 1976        1.99
22 Walleye 1977        1.06
23 Walleye 1977        2.00
24 Walleye 1977        1.97
25 Walleye 1977        2.00
26 Walleye 1977        1.99
27 Walleye 1977        1.95
28 Walleye 1977        2.10
29 Walleye 1977        2.29
30 Walleye 1977        2.20
31 Walleye 1979        1.90
32 Walleye 1979        1.98
33 Walleye 1979        2.00
34 Walleye 1979        2.11
35 Walleye 1980        1.92
36 Walleye 1980        2.00
37 Walleye 1980        1.98
38 Walleye 1980        2.25
39 Walleye 1981        1.22
40 Walleye 1981        1.36
41 Walleye 1981        1.48
42 Walleye 1981        1.86
43 Walleye 1981        1.41
44 Walleye 1982        1.25
45 Walleye 1982        1.10
46 Walleye 1982        1.28
47 Walleye 1982        1.28
48 Walleye 1982        1.77
49 Walleye 1982        1.59
50 Walleye 1982        1.61
51 Walleye 1982        1.55
52 Walleye 1984        1.25
53 Walleye 1984        1.41
54 Walleye 1984        1.50
55 Walleye 1984        1.39
> contaminant<-fish$CONTAMINANT
> year<-fish$YEAR
> mod<-lme(contaminant~year,random=~1|year,data=data)
> varcomp(mod,cum=FALSE)
      year     Within
0.02695566 0.05758531
attr(,"class")
[1] "varcomp"

Thanks in advance for your help - I am very new to formula-building in R.
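
For comparison, a sketch of an intercept-only random-effects model on
simulated data (the data frame fish_sim and all its values are made up).
With year treated as a grouping factor and no fixed year term, the two
variance components reported by VarCorr() are the between-year variance
and the residual within-year variance. Whether this or a model with a
fixed time trend suits the real data is a separate modelling question.

```r
library(nlme)

set.seed(1)
n_per_year  <- c(10, 5, 6, 9, 4, 4)
year_labels <- c(1970, 1975, 1976, 1977, 1979, 1980)
year_effect <- rnorm(length(year_labels), sd = 0.3)

# Simulated concentrations: overall mean + year effect + fish-level noise.
fish_sim <- data.frame(
  YEAR = factor(rep(year_labels, times = n_per_year)),
  CONTAMINANT = 2 + rep(year_effect, times = n_per_year) +
    rnorm(sum(n_per_year), sd = 0.2)
)

# Random intercept for each year, no fixed effect of year.
mod_sim <- lme(CONTAMINANT ~ 1, random = ~ 1 | YEAR, data = fish_sim)

# Rows are "(Intercept)" (between years) and "Residual" (within years).
vc <- VarCorr(mod_sim)
variances <- as.numeric(vc[, "Variance"])
```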


Re: [R] replacing characters in matrix. substitute, delayedAssign, huh?

2012-01-30 Thread Henrik Bengtsson

The quick solution:

parseAndEval <- function(x, ...) eval(parse(text=x))
apply(BM, MARGIN=c(1,2), FUN=parseAndEval)
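
A small self-contained run of the same idea, as a sketch: the 3x3 matrix and the values of a and b are made up for illustration, and the entries are assumed to be strings that parse as valid R expressions.

```r
# Character matrix mixing numeric literals and variable names
BM <- matrix("0.1", 3, 3)
BM[2, 1] <- "a"
BM[3, 2] <- "b"

a <- 2
b <- 17

# Parse each cell as R code and evaluate it
parseAndEval <- function(x, ...) eval(parse(text = x))
newBM <- apply(BM, MARGIN = c(1, 2), FUN = parseAndEval)

# Changing a and re-running picks up the new value
a <- 5
newBM2 <- apply(BM, MARGIN = c(1, 2), FUN = parseAndEval)
```

Because every cell is parsed and evaluated, this should only be used on matrices whose contents you trust.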

My $.02

/Henrik

On Mon, Jan 30, 2012 at 10:26 AM, Paul Johnson pauljoh...@gmail.com wrote:
 A user question today has me stumped.  Can you advise me, please?

 User wants a matrix that has some numbers, some variables, possibly
 even some function names.  So that has to be a character matrix.
 Consider:

 BM <- matrix(0.1, 5, 5)

 Use data.entry(BM) or similar to set some to more abstract values.

 BM[3,1] <- "a"
 BM[4,2] <- "b"
 BM[5,2] <- "b"
 BM[5,3] <- "d"
 BM
      var1  var2  var3  var4  var5
 [1,] "0.1" "0.1" "0.1" "0.1" "0.1"
 [2,] "0.1" "0.1" "0.1" "0.1" "0.1"
 [3,] "a"   "0.1" "0.1" "0.1" "0.1"
 [4,] "0.1" "b"   "0.1" "0.1" "0.1"
 [5,] "0.1" "b"   "d"   "0.1" "0.1"

 Later on, user code will set values, e.g.,

 a <- rnorm(1)
 b <- 17
 d <- 4

 Now, push those into BM, convert whole thing to numeric

 newBM <- apply(BM, c(1,2), as.numeric)

 and use newBM for some big calculation.

 Then re-set new values for a, b, d, do the same over again.

 I've been trying lots of variations on parse, substitute, and eval.

 The most interesting function I learned about this morning was delayedAssign.
 If I had only to work with one scalar, it does what I want

 delayedAssign("a", whatA)
 whatA <- 91
 a
 [1] 91

 I can't see how to make that work in the matrix context, though.

 Got ideas?

 pj

 sessionInfo()
 R version 2.14.1 (2011-12-22)
 Platform: x86_64-pc-linux-gnu (64-bit)

 locale:
  [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C
  [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8
  [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8
  [7] LC_PAPER=C                 LC_NAME=C
  [9] LC_ADDRESS=C               LC_TELEPHONE=C
 [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C

 attached base packages:
 [1] stats     graphics  grDevices utils     datasets  methods   base

 loaded via a namespace (and not attached):
 [1] tools_2.14.1

 --
 Paul E. Johnson
 Professor, Political Science
 1541 Lilac Lane, Room 504
 University of Kansas


Re: [R] Euler identity with complex exp

2012-01-30 Thread R. Michael Weylandt

Seems fine to me:

exp(pi + 2*pi*i) = exp(pi) * exp(2*pi*i)
                 = exp(pi) * (cos(2*pi) + i*sin(2*pi))
                 = exp(pi) * (1 + 0i) = exp(pi) ~ 23.14
exp(pi/2) ~ 4.81

What would you expect?
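
The same check can be done numerically. This is only a sketch; the names z, lhs and rhs are made up for the illustration.

```r
# exp(a + ib) should equal exp(a) * (cos(b) + i*sin(b))
z   <- complex(real = pi, imaginary = 2 * pi)
lhs <- exp(z)
rhs <- exp(Re(z)) * (cos(Im(z)) + 1i * sin(Im(z)))

same <- isTRUE(all.equal(lhs, rhs))  # compares up to floating-point tolerance
```

The "-0i" in the printed results is just floating-point rounding of sin(2*pi), which is not exactly zero in double precision.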

Michael

On Mon, Jan 30, 2012 at 10:37 AM, Joseph Park josephp...@ieee.org wrote:
 Hi,

 Am i doing something silly here in expecting Euler's
 formula to be handled by exp? exp( ix ) = cos x + i sin x.
 The first example below follows this, the others not.

 Thanks for the education!

   exp( complex(real = 0, imag = 2*pi) )
 [1] 1-0i
   exp( complex(real = pi, imag = 2*pi) )
 [1] 23.14069-0i
   exp( complex(real = pi/2, imag = 0) )
 [1] 4.810477+0i



Re: [R] Reg : Hello all.. help needed regarding heatmaps

2012-01-30 Thread R. Michael Weylandt

If you don't mind using an external (but very popular) graphics
package known as ggplot2 it's super easy:

https://learnr.wordpress.com/2010/01/26/ggplot2-quick-heatmap-plotting/

I'm sure it can be done in base graphics as well, but I'll leave that
to someone else.

It's also well implemented in gplots::heatmap.2 (that is, the
heatmap.2 function from the gplots package which, name
notwithstanding, is unrelated to ggplot2). Run

if(!require(gplots)) {install.packages("gplots"); library(gplots)}
example(heatmap.2)

for some examples.
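
For the base graphics route, here is a rough sketch of a heatmap with a legend next to it. Everything in it is an assumption for illustration: the scores are random numbers standing in for 20 applications by 12 months, the five colour bins are arbitrary, and pdf(NULL) is used only so the example runs without opening a window (drop it to draw on screen).

```r
set.seed(42)
# Hypothetical scores: 20 applications (rows) by 12 months (columns)
scores <- matrix(runif(20 * 12, 0, 100), nrow = 20,
                 dimnames = list(paste0("App", 1:20), month.abb))

cols   <- heat.colors(5)
breaks <- seq(0, 100, length.out = 6)  # one more break than colours

pdf(NULL)                              # null device; remove to plot on screen
layout(matrix(1:2, nrow = 1), widths = c(4, 1))

# Left panel: the heatmap itself (image() wants x along rows of z)
image(x = 1:12, y = 1:20, z = t(scores), col = cols, breaks = breaks,
      axes = FALSE, xlab = "Month", ylab = "")
axis(1, at = 1:12, labels = month.abb, cex.axis = 0.7)
axis(2, at = 1:20, labels = rownames(scores), las = 1, cex.axis = 0.7)

# Right panel: an empty plot that only holds the legend
par(mar = c(5, 0, 4, 1))
plot.new()
legend("center", title = "Score", fill = cols, cex = 0.8,
       legend = sprintf("%g-%g", head(breaks, -1), tail(breaks, -1)))
dev.off()
```

heatmap.2 draws its own colour key, so this layout trick is only needed when staying with image().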

Michael

On Mon, Jan 30, 2012 at 10:47 AM, koushik gangavaram
kgangava...@gmail.com wrote:
 Hello all ,

 I am beginner and new to this -R world.  I have heard much about R and
 started working on it.

 I have some data of 20 business applications( y -axis) and Months( x-axis)
 and values as their score for every month .  I tried to generate a heatmap
 with this data and got some good results.  Can some one help me on how to
 generate the legend next to heatmap please...


 can some one send me a sample code ..?


 Some of the useful link that i found on web :
 http://www.oga-lab.net/RGM2/func.php?rd_id=gplots:heatmap.2


 Thanks in advance.


Re: [R] replacing characters in matrix. substitute, delayedAssign, huh?

2012-01-30 Thread Gabor Grothendieck

On Mon, Jan 30, 2012 at 1:26 PM, Paul Johnson pauljoh...@gmail.com wrote:
 I can't see how to make that work in the matrix context, though.


You can do this:

> m <- list("a", 1L, 2.5, function(x) x^2)
> dim(m) <- c(2, 2)
> m
     [,1] [,2]
[1,] "a"  2.5
[2,] 1    ?

> # Run the function in 2,2 passing it argument in 1,2
> m[[2,2]]( m[[1, 2]] )
[1] 6.25

-- 
Statistics & Software Consulting
GKX Group, GKX Associates Inc.
tel: 1-877-GKX-GROUP
email: ggrothendieck at gmail.com


Re: [R] Euler identity with complex exp

2012-01-30 Thread Peter Langfelder

Not sure why you think the formula does not hold... but am guessing
you think that sin(x) and cos(x) have values in [-1, 1]? Well, that
only holds for real x. If you have a complex x, sin(x) and cos(x) are
unbounded - indeed, if you can write x=iy and y is real, you can show
cos(x) = cosh(y) and sin(x) = i*sinh(y) simply by expressing (from the
formula you wrote) cos(x) and sin(x) as

cos(x) = ( exp(ix) + exp(-ix) )/2
and sin(x) = ( exp(ix) - exp(-ix) )/(2i)

In any case, plug any complex number into
exp( ix )
and
cos x + i sin x

in R and you will get the exact same answers.
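
A quick numerical sketch of those identities, with an arbitrary value y = 1.5 and an arbitrary complex number w chosen for the illustration:

```r
y <- 1.5
x <- complex(real = 0, imaginary = y)  # x = i*y

# For purely imaginary x the trig functions turn into hyperbolic ones
cos_ok <- isTRUE(all.equal(cos(x), complex(real = cosh(y), imaginary = 0)))
sin_ok <- isTRUE(all.equal(sin(x), complex(real = 0, imaginary = sinh(y))))

# Euler's formula with a general complex argument
w <- complex(real = 0.7, imaginary = -2.3)
euler_ok <- isTRUE(all.equal(exp(1i * w), cos(w) + 1i * sin(w)))
```

Note that cosh(1.5) is about 2.35, so cos(x) is already outside [-1, 1] here.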

HTH,

Peter

On Mon, Jan 30, 2012 at 7:37 AM, Joseph Park josephp...@ieee.org wrote:
 Hi,

 Am i doing something silly here in expecting Euler's
 formula to be handled by exp? exp( ix ) = cos x + i sin x.
 The first example below follows this, the others not.

 Thanks for the education!

   exp( complex(real = 0, imag = 2*pi) )
 [1] 1-0i
   exp( complex(real = pi, imag = 2*pi) )
 [1] 23.14069-0i
   exp( complex(real = pi/2, imag = 0) )
 [1] 4.810477+0i



Re: [R] plot with ylim with regural interval

2012-01-30 Thread David Winsemius



On Jan 30, 2012, at 1:28 PM, gianni lavaredo wrote:


 Dear Researchers,

 sorry for the easy question but Is it possible to plot with an interval of
 1 or .5 in a plot using ylim?

 Thanks
 gianni

 x = 0:10;
 y = 0:10;

 plot(x~y,ylim=c(0,10),las=1)


plot(x~y, ylim=c(0,10), xaxt="n")
axis(1, at=seq(0, 10, by=0.5), labels=seq(0, 10, by=0.5), cex.axis=0.75)

Unless you make the cex.axis number small enough, axis() won't put
them all in.
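
Since the question mentions ylim, the same approach applied to the y axis might look like the sketch below. The tick spacing of 0.5 and the cex.axis value are arbitrary choices, and pdf(NULL) is there only so the snippet runs without opening a window.

```r
x <- 0:10
y <- 0:10
ticks <- seq(0, 10, by = 0.5)  # desired tick positions on the y axis

pdf(NULL)                      # null device; remove to plot on screen
plot(x, y, ylim = c(0, 10), yaxt = "n")  # suppress the default y axis
axis(2, at = ticks, labels = ticks, las = 1, cex.axis = 0.6)
dev.off()
```

ylim only sets the range of the axis; the tick spacing comes from axis() (or from par settings such as yaxp).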




David Winsemius, MD
Heritage Laboratories
West Hartford, CT


Re: [R] how to sum multiple data entries for the same sampling event?

2012-01-30 Thread R. Michael Weylandt

Perhaps something like

# Untested
library(plyr)
ddply(DATA, "Date", function(d) sum(d$lmb.cpue))

For example, on some fake data

DATA <- data.frame(class = rep(letters[1:5], each = 2), type =
rep(c("good", "bad"), 5), value = rnorm(10))
ddply(DATA, "class", function(d) sum(d$value))

If you want to send example data, it's best to send it with the
plaintext output of dput().
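
If you would rather stay in base R, aggregate() can do the same grouping. This is a sketch on made-up numbers: the column names follow the question, but the catch data frame and its values are hypothetical.

```r
# Hypothetical data in the same shape as the question
catch <- data.frame(
  Date = rep(c("1999-07-10 12:10:00",
               "1999-07-14 11:22:00",
               "1999-07-14 13:56:00"), times = c(2, 2, 1)),
  lmb.cpue = c(0.5, 0.25, 0.75, 0.5, 0.125),
  Disposition.of.Catch = c("Unretained", "Retained",
                           "Unretained", "Retained", "Retained"),
  stringsAsFactors = FALSE
)

# One row per sampling event, summing retained and unretained CPUE
total <- aggregate(lmb.cpue ~ Date, data = catch, FUN = sum)
```

The result has one row per distinct Date, with events that had a single entry passed through unchanged.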

Michael

On Mon, Jan 30, 2012 at 12:16 PM, karengrace84 kgfis...@alumni.unc.edu wrote:
 I'm having trouble with some catch per unit effort data (CPUE, fisheries
 data). Some of the samples were retained and some unretained, and they
 are entered as 2 separate entries for the same sampling event (Date and
 time). I want to calculate the total CPUE (so sum the retained and
 unretained number for each sampling event) and am having trouble doing so.
 Here's a sample of what my data.frame looks like now:

 Date                  lmb.cpue     Disposition.of.Catch
 1999-07-10 12:10:00   0.6667           Unretained
 1999-07-10 12:10:00   0.1667             Retained
 1999-07-14 11:22:00   0.8333           Unretained
 1999-07-14 11:22:00   0.5556             Retained
 1999-07-14 11:48:00   0.1667           Unretained
 1999-07-14 11:48:00   0.5833             Retained
 1999-07-14 13:56:00   0.57142857             Retained
 1999-07-15 10:23:00   0.             Retained
 1999-07-22 12:03:00   0.             Retained
 1999-07-25 11:26:00   0.4000           Unretained
 1999-07-25 11:26:00   1.             Retained


 And I would like to end up with:

 Date                               lmb.cpue
 1999-07-10 12:10:00   0.8333
 1999-07-14 11:22:00   1.3889
 1999-07-14 11:48:00   0.7500
 1999-07-14 13:56:00   0.57142857
 1999-07-15 10:23:00   0.
 1999-07-22 12:03:00   0.
 1999-07-25 11:26:00   1.4000

 Thanks for any help you have to offer!

 --
 View this message in context: 
 http://r.789695.n4.nabble.com/how-to-sum-multiple-data-entries-for-the-same-sampling-event-tp4341670p4341670.html
 Sent from the R help mailing list archive at Nabble.com.


Re: [R] replacing characters in matrix. substitute, delayedAssign, huh?

2012-01-30 Thread Duncan Murdoch


On 30/01/2012 1:26 PM, Paul Johnson wrote:

A user question today has me stumped.  Can you advise me, please?

User wants a matrix that has some numbers, some variables, possibly
even some function names.  So that has to be a character matrix.


It might make more sense for it to be a list-mode matrix.  Lists are 
vectors, and if they have dimension, they are matrices, but the entries 
need not be the same types.



Consider:

  BM <- matrix(0.1, 5, 5)

Use data.entry(BM) or similar to set some to more abstract values.

  BM[3,1] <- "a"
  BM[4,2] <- "b"
  BM[5,2] <- "b"
  BM[5,3] <- "d"
  BM
     var1  var2  var3  var4  var5
[1,] "0.1" "0.1" "0.1" "0.1" "0.1"
[2,] "0.1" "0.1" "0.1" "0.1" "0.1"
[3,] "a"   "0.1" "0.1" "0.1" "0.1"
[4,] "0.1" "b"   "0.1" "0.1" "0.1"
[5,] "0.1" "b"   "d"   "0.1" "0.1"

Later on, user code will set values, e.g.,

a <- rnorm(1)
b <- 17
d <- 4

Now, push those into BM, convert whole thing to numeric

newBM <- apply(BM, c(1,2), as.numeric)

and use newBM for some big calculation.

Then re-set new values for a, b, d, do the same over again.

I've been trying lots of variations on parse, substitute, and eval.

The most interesting function I learned about this morning was delayedAssign.
If I had only to work with one scalar, it does what I want

  delayedAssign("a", whatA)
  whatA <- 91
  a
[1] 91

I can't see how to make that work in the matrix context, though.

Got ideas?


I don't think delayedAssign is what you want:  it creates promises, 
and promises can only be evaluated once.  You want language entries in 
your matrix, and you want to use eval() to evaluate them.   (Or 
character entries, and use Henrik's parseAndEval.)
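
A minimal sketch of the language-entries idea, assuming a list-mode matrix whose cells are either numbers or quoted expressions. The helper evalBM and the 3x3 size are made up for illustration.

```r
# List-mode matrix: every cell starts as the number 0.1
BM <- matrix(as.list(rep(0.1, 9)), 3, 3)

# Put unevaluated language objects into some cells
BM[[2, 1]] <- quote(a)
BM[[3, 2]] <- quote(b * 2)

# Hypothetical helper: evaluate every cell in the caller's environment
evalBM <- function(m, env = parent.frame()) {
  force(env)  # fix the environment before the nested calls below
  out <- vapply(m, function(cell) eval(cell, env), numeric(1))
  dim(out) <- dim(m)
  out
}

a <- 1
b <- 17
first <- evalBM(BM)

a <- 5                 # re-set a value and evaluate again
second <- evalBM(BM)
```

Unlike a promise, a quoted expression can be evaluated as many times as needed, so the matrix can be re-evaluated after each change to a, b or d.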


Duncan Murdoch


[R] timeseries highlighting

2012-01-30 Thread Alexy Khrabrov

I'd like to plot a given time series in a primary color but highlight
a segment of it in a different color.  Is there an elegant way to do
it?

A+


Re: [R] timeseries highlighting

2012-01-30 Thread R. Michael Weylandt

library(zoo)
demo("zoo-overplot")
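
If you want to avoid extra packages, a base R sketch of the same idea is to draw the whole series and then overplot the segment. The series here is simulated and the highlighted window (2006 to 2007) is arbitrary; pdf(NULL) is only there so the example runs without opening a window.

```r
set.seed(1)
# Hypothetical monthly series, 10 years long
y <- ts(cumsum(rnorm(120)), start = c(2002, 1), frequency = 12)

# The segment to highlight
seg <- window(y, start = c(2006, 1), end = c(2007, 12))

pdf(NULL)                        # null device; remove to plot on screen
plot(y, col = "blue")            # whole series in the primary colour
lines(seg, col = "red", lwd = 2) # highlighted segment drawn on top
dev.off()
```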

Michael

On Mon, Jan 30, 2012 at 2:05 PM, Alexy Khrabrov delivera...@gmail.com wrote:
 I'd like to plot a given time series in a primary color but highlight
 a segment of it in a different color.  Is there an elegant way to do
 it?

 A+


Re: [R] timeseries highlighting

2012-01-30 Thread Gabor Grothendieck

On Mon, Jan 30, 2012 at 2:12 PM, R. Michael Weylandt
michael.weyla...@gmail.com wrote:
 library(zoo)
 demo("zoo-overplot")


Also:

library(zoo)
example(xblocks)

-- 
Statistics & Software Consulting
GKX Group, GKX Associates Inc.
tel: 1-877-GKX-GROUP
email: ggrothendieck at gmail.com


Re: [R] repeat function for entire list of matrices

2012-01-30 Thread R. Michael Weylandt

No problem. Glad it worked for you.

Michael

On Mon, Jan 30, 2012 at 12:05 PM, pabears danss...@gmail.com wrote:
 michael,

 i don't know what happened, i was reading up on ?lapply(), i was up really
 late, and somehow it didn't seem to take, but i tried it again this morning
 and it worked like a charm.(sorry about the ellipses, i was just
 being lazy/unclear).

 that's great, thanks, this is a great help...

 --
 View this message in context: 
 http://r.789695.n4.nabble.com/repeat-function-for-entire-list-of-matrices-tp4334587p4341629.html
 Sent from the R help mailing list archive at Nabble.com.


Re: [R] And Statement for two if functions

2012-01-30 Thread kerry1912

Sorry that post was written in a bit of a rush.

I am writing a function in which I am trying to create a league table from a
data frame of rugby matches with the columns as follows: home team, away
team, home score and away score.

In rugby you can get an extra bonus point if you are the losing team and
lose by less than 7 points. So therefore in my function I am writing if the
away team loses AND loses by less than or equal to 7 points then the away
team will get an extra point, 

So ideally want to write:

if(games[i,3] > games[i,4] AND games[i,3] <= games[i,4] + 7) {
T[which(teams == games[i,2]),"Points"] <-
T[which(teams == games[i,2]),"Points"] + 1}

Which is inset into a function in R where the input of the function is
'games' which will be the list of the 132 matches of rugby being analysed
and where teams is the list of 12 teams in the league. 

I wasn't sure if it was possible to write an 'if' function embedded in
another 'if' function or which method would be best to achieve this. 
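
For what it's worth, R's scalar "and" inside if() is &&, so no nesting is needed. The sketch below uses three made-up matches and a made-up table called tab; only the column order (home team, away team, home score, away score) follows the description above.

```r
# Hypothetical fixtures: home team, away team, home score, away score
games <- data.frame(home = c("A", "B", "C"),
                    away = c("B", "C", "A"),
                    home_score = c(20, 30, 10),
                    away_score = c(15, 10, 12),
                    stringsAsFactors = FALSE)
teams <- c("A", "B", "C")
tab <- data.frame(Team = teams, Points = 0)

for (i in seq_len(nrow(games))) {
  margin <- games[i, 3] - games[i, 4]  # home score minus away score
  # Away team lost AND lost by 7 points or fewer: losing bonus point
  if (margin > 0 && margin <= 7) {
    row <- which(teams == games[i, 2])
    tab[row, "Points"] <- tab[row, "Points"] + 1
  }
}
```

Here only team B picks up a bonus point (it lost 20 to 15 away from home). Naming the table something other than T also avoids masking the built-in shorthand for TRUE.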

Thank you. 

 


--
View this message in context: 
http://r.789695.n4.nabble.com/And-Statement-for-two-if-functions-tp4341179p4342098.html
Sent from the R help mailing list archive at Nabble.com.


[R] Problem in fitting model equation in nls function

2012-01-30 Thread ram basnet

Dear R users,

I am struggling to fit an expo-linear equation to my data using the nls
function. I always get the error messages shown below.

### The expo-linear equation which I am interested in fitting to my data:
### response_variable = (c/r)*log(1+exp(r*(Day-tt))), where Day is the
### time variable

## my response variable

rl <- c(2,1.5,1.8,2,2,2.5,2.6,1.5,2.4,1.7,2.3,2.4,2.2,2.6,
 2.8,2,2.5,1.8,2.4,2.4,2.3,2.6,3,2,2.6,1.8,2.5,2.5,
 2.3,2.7,3,2.2,2.6,1.8,2.5,2.5,2.3,2.7,3,2.2)
myday <- rep(c(3,5,7,9,10), each = 8) # creating my predictor time-variable
mydata <- data.frame(rl,myday) # data object

### fitting model equation in nls function

CASE - I:
When I assigned initial value for tt = 0.6:

> mytest <- nls(rl ~ (c/r)*log(1+exp(r*(myday-tt))), data = mydata,
+ na.action = na.omit,
+ start = list(c = 2.0, r = 0.05, tt = 0.6), algorithm = "plinear")
Error in numericDeriv(form[[3L]], names(ind), env) :
  Missing value or an infinity produced when evaluating the model

CASE - II:
When I assigned initial value for tt = 1:

> mytest <- nls(rl ~ (c/r)*log(1+exp(r*(myday-tt))), data = mydata,
+ na.action = na.omit,
+ start = list(c = 2.0, r = 0.5, tt = 1), algorithm = "plinear")
Error in nls(rl ~ (c/r) * log(1 + exp(r * (myday - tt))), data = mydata,  :
  singular gradient

I get the error messages shown above. Truly speaking, I do not have much
experience with fitting a specific model equation in R.
I have the following queries:

1. Can anyone explain to me what is going wrong here?

2. Importantly, how can I write the above equation in the nls function?

I would be very thankful if anyone could help me.
I look forward to your cooperation.
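
To show the shape of a call that does fit this equation, here is a sketch on simulated data, not on the rl data above. The true values (c = 0.5, r = 1.2, tt = 4), the noise level and the starting values are all invented for the illustration, and the default Gauss-Newton algorithm is used. As far as I know, algorithm = "plinear" expects the linear parameter to be left out of the formula and out of start, which may be part of the trouble in the calls above.

```r
set.seed(123)
# Hypothetical data generated from the expo-linear curve
day <- rep(1:12, times = 4)
true_c <- 0.5; true_r <- 1.2; true_tt <- 4
y <- (true_c / true_r) * log(1 + exp(true_r * (day - true_tt))) +
  rnorm(length(day), sd = 0.05)
sim <- data.frame(day = day, y = y)

# All three parameters appear in the formula and in start
fit <- nls(y ~ (c / r) * log(1 + exp(r * (day - tt))),
           data = sim,
           start = list(c = 0.4, r = 1, tt = 3.5))

est <- coef(fit)  # named vector with elements c, r and tt
```

With real data that are nearly flat over the observed days, r and tt can be poorly determined, and a singular gradient may simply mean the data do not identify all three parameters.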

Thanks


Regards,
Ram Kumar Basnet

Re: [R] Fw: Variable selection based on both training and testing data

2012-01-30 Thread SR Millis

Jim,
With regard to variable and model selection, you might consider using Bayesian
model averaging (the BMA package) or some sort of shrinkage (the lars or
lasso2 packages).

Scott Millis





 From: Jin Minming jminm...@yahoo.com
To: r-help@r-project.org r-help@r-project.org; SR Millis aa3...@wayne.edu 
Sent: Monday, January 30, 2012 11:30 AM
Subject: Re: [R] Fw: Variable selection based on both training and testing data

Dear Scott,

I am so sorry that I think I just sent an empty email to you.
Thanks a lot for your advice.

The problem is that we do not have sufficient prior knowledge of the
regression form or even of the appropriate inputs. We need to try to find some
possible regression equations and then add our explanation to them, so we need
to explore a lot of options.  The two input datasets are very different in
nature and come from two locations.  Hence, one of them can be used for
testing, although it may turn out that no regression is appropriate because of
the intrinsic difference between the two datasets.

In fact, if I can extract the models used (not only the final model) in the
stepAIC function, then it will be easier to add some simple scripts to
calculate R2 or RMSE for both datasets.
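
The kind of simple script I have in mind could look like the sketch below. It does not pull models out of stepAIC; it just takes a hand-written list of candidate formulas and scores each one on both datasets. The data are simulated and all names (make_data, candidates, rmse, scores) are made up for illustration.

```r
set.seed(7)
# Hypothetical training and test sets with the same variables
make_data <- function(n) {
  x1 <- rnorm(n)
  x2 <- rnorm(n)
  data.frame(x1 = x1, x2 = x2, y = 1 + 2 * x1 + rnorm(n))
}
train <- make_data(100)
test  <- make_data(50)

# Candidate models to compare
candidates <- list(y ~ x1, y ~ x2, y ~ x1 + x2)

# Root mean squared error of a fitted model on a given dataset
rmse <- function(fit, data) {
  sqrt(mean((data$y - predict(fit, newdata = data))^2))
}

# Fit each candidate on the training data, score it on both sets
scores <- do.call(rbind, lapply(candidates, function(f) {
  fit <- lm(f, data = train)
  data.frame(model = deparse(f),
             rmse_train = rmse(fit, train),
             rmse_test = rmse(fit, test))
}))
```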

Thanks,

Jim


--- On Mon, 30/1/12, SR Millis aa3...@wayne.edu wrote:

 From: SR Millis aa3...@wayne.edu
 Subject: [R] Fw: Variable selection based on both training and testing data
 To: r-help@r-project.org r-help@r-project.org
 Date: Monday, 30 January, 2012, 14:57
 
 
 From: SR Millis srmil...@yahoo.com
 To: Jin Minming jminm...@yahoo.com
 
 Sent: Monday, January 30, 2012 9:25 AM
 Subject: Re: [R] Variable selection based on both training
 and testing data
  
 
 Jim,
 
 First, stepwise methods for variable selection should be
 avoided.  Frank Harrell (in Regression Modeling Strategies)
 discusses this at length.
 
 Second, splitting a dataset into training and validation
 sets is generally not a good idea unless you have a really
 large sample, e.g., > 20,000.  As Harrell has discussed,
 split-sample validation does not provide external
 validation, is terribly inefficient, and is arbitrary.
 It's better to specify your model a priori and use the
 bootstrap to obtain an estimate of your model's
 over-optimism.  Bootstrapping can be implemented with
 Harrell's rms package in R.
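
 To give a rough idea of what the optimism bootstrap does, here is a base R
 sketch. It is not the rms implementation; the data are simulated, the model
 y ~ x1 + x2 and the 200 resamples are arbitrary, and the helper r2 is made up
 for the illustration.

```r
set.seed(11)
n <- 80
# Hypothetical data: x2 is pure noise
dat <- data.frame(x1 = rnorm(n), x2 = rnorm(n))
dat$y <- 1 + 0.8 * dat$x1 + rnorm(n)

# R-squared of a fitted model evaluated on an arbitrary dataset
r2 <- function(fit, data) {
  pred <- predict(fit, newdata = data)
  1 - sum((data$y - pred)^2) / sum((data$y - mean(data$y))^2)
}

full <- lm(y ~ x1 + x2, data = dat)
apparent <- r2(full, dat)  # performance on the data used for fitting

# Optimism: performance on the bootstrap sample minus performance
# of the same bootstrap fit on the original data
optimism <- replicate(200, {
  boot <- dat[sample(n, replace = TRUE), ]
  fit <- lm(y ~ x1 + x2, data = boot)
  r2(fit, boot) - r2(fit, dat)
})

corrected <- apparent - mean(optimism)
```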
 
 Scott
  
 ~~~
 Scott R Millis, PhD, ABPP, CStat, PStat®
 Professor
 Wayne State University School of Medicine
 Email:  aa3...@wayne.edu
 Email:  srmil...@yahoo.com
 Tel: 313-993-8085
 
 
 
 
 To: r-help@r-project.org
 
 Sent: Monday, January 30, 2012 8:14 AM
 Subject: [R] Variable selection based on both training and
 testing data
 
 Dear all,
 
 The variable selection in regression is usually determined
 by the training data using AIC or F value, such as stepAIC.
 Is there some R package that can consider both the training
 and test dataset? For example, I have two separate training
 data and test data. Firstly, a regression model is obtained
 by using training data, and then this model is tested by
 using test data. This process continues in order to find
 some possible optimal models in terms of RMSE or R2 for both
 training and test data. 
 
 Thanks,
 
 Jim
 

[R] rpart usersplits

2012-01-30 Thread jcress410

I'm inspecting tests/usersplits.R in rpart, trying to get my head around how
to pass data to the split function.

I'm trying to instantiate a number of goodness measures which compare
treatment vs control within splits.  

A simple example is difference-in-difference estimate of a candidate split, 

(Y_t - Y_c)_L - (Y_t - Y_c)_R
(difference between treatment and control for the left minus difference
between treatment and control for the right)

I need to know whether each Y value is a treatment Y or a control.
the documentation in usersplits.R says that Y is provided in sort order of
X, so I'm not sure how to pass a vector indicating treatment vs control that
will split Y appropriately. 

rpart.poisson takes a two column matrix as an input, and I've been trying to
mimic that, but there's no documentation in rpart.poisson (i can't figure
out where it creates the 'vector of goodness') 
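
To make the goodness measure concrete, here is a sketch of what I am trying to compute for a continuous x. It assumes the outcome and a 0/1 treatment flag are passed together as a two-column y, already in sort order of x. The function did_goodness is my own illustration, not part of rpart, and it takes the absolute value of the difference so that larger always means a better split.

```r
# y: two-column matrix; column 1 = outcome, column 2 = 1 for treatment,
# 0 for control; rows are assumed to be in sort order of x
did_goodness <- function(y) {
  out <- y[, 1]
  trt <- y[, 2]
  n <- nrow(y)

  # Treatment minus control mean within a set of rows
  effect <- function(idx) {
    mean(out[idx][trt[idx] == 1]) - mean(out[idx][trt[idx] == 0])
  }

  # One value per candidate split between row k and row k + 1
  vapply(seq_len(n - 1), function(k) {
    d <- abs(effect(seq_len(k)) - effect((k + 1):n))
    if (is.finite(d)) d else 0  # a side lacking treated or controls scores 0
  }, numeric(1))
}

# Toy data: no treatment effect in the first four rows, effect of 2 after
toy <- cbind(outcome = c(1, 1, 1, 1, 3, 1, 3, 1),
             trt     = c(1, 0, 1, 0, 1, 0, 1, 0))
g <- did_goodness(toy)
```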

I'd appreciate any advice,

J. Cress



--
View this message in context: 
http://r.789695.n4.nabble.com/rpart-usersplits-tp4342156p4342156.html
Sent from the R help mailing list archive at Nabble.com.


[R] User Interface Equivalent Code

2012-01-30 Thread Ajay Askoolum

When I plot, the plot's user interface offers me a choice:

File | Copy to the Clipboard | as a Bitmap.

What is the equivalent code for achieving this but without the plot interface 
becoming visible?

Thanks.


Re: [R] Checking for invalid dates: Code works but needs improvement

2012-01-30 Thread Marc Schwartz


On Jan 30, 2012, at 12:15 PM, David Winsemius wrote:

 
 On Jan 30, 2012, at 8:44 AM, Paul Miller wrote:
 
 Hi Rui, Marc, and Gabor,
 
 Thanks for your replies to my question. All were helpful and it was 
 interesting to see how different people approach various aspects of the same 
 problem.
 
 Spent some time this weekend looking at Rui's solution, which is certainly 
 much clearer than my own. Managed to figure out pretty much all the details 
 of how it works. Also managed to tweak it slightly in order to make it do 
 exactly what I wanted. (See revised code below.)
 
 Still have a couple of questions though. The first concerns the insertion of
 the code Y > 2012 to set year values beyond 2012 to NA (on line 10 of the
 function below).  When I add this (or use it in place of nchar(Y) > 4),
 the code successfully finds the problem date 05/16/2015. After that though,
 it produces the following error message:
 
 Error in if (any(is.na(x) & M != "un" & Y != "un")) cat("Warning: Invalid
 date values in",  :  missing value where TRUE/FALSE needed
 
 It's a bit dangerous to use comparison operators on mixed data types. In your
 case you are comparing a character value to a numeric value and may not
 realize that "2015" is not the same as 2015. Try "123" > 1000 if you want a
 quick counter-example. You may want to coerce the Y value to numeric mode
 to be safe.
 
 Also 'any' does not expect the logical connectives. You probably want:
 
 any(is.na(x) , M != "un" , Y != "un")
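
 A short sketch of both points, using made-up variable names: the first pair
 shows the string-versus-number comparison, the second shows how an NA inside
 any() leads to the "missing value where TRUE/FALSE needed" error in if().

```r
# Comparing a character string with a number coerces the number to
# character, so the comparison is done on strings
as_string <- "123" > 1000              # compares "123" with "1000"
as_number <- as.numeric("123") > 1000  # compares 123 with 1000

# any() returns NA when nothing is TRUE and at least one value is NA;
# if (NA) then stops with "missing value where TRUE/FALSE needed"
flag_na <- any(c(NA, FALSE))
flag_ok <- any(c(NA, FALSE), na.rm = TRUE)
```

 Here as_string is TRUE while as_number is FALSE, and flag_na is NA while
 flag_ok is FALSE.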


Perhaps I am missing something relevant here, but I am still confused by what I 
see as an over engineering of the code being implemented. If the primary 
requirements are:

1. Impute the 15th of month if it is 'un'
2. Reject dates prior to 1900 or after 2011
3. Reject dates with an unknown ('un') month or year
4. Reject years with > 4 digits, also presuming that the value passed should
always be 10 characters in length

If that is the basic functionality required, then a modest modification of my 
prior code should work:

checkDate <- function(x) {

  # Replace unknown day with 15
  tmp <- gsub("/un/", "/15/", x)

  tmp2 <- as.Date(tmp, format = "%m/%d/%Y")

  as.character(x[is.na(tmp2) |
                 tmp2 < as.Date("1900/01/01") |
                 tmp2 > as.Date("2012/01/01") |
                 nchar(as.character(x)) > 10])
}


> TestDates
  Patient     birthDT diagnosisDT metastaticDT
1       1 11/23/21931  05/23/2009   un/17/2011
2       2  06/20/1840  02/30/2010   03/17/2011
3       3  06/17/1935  12/20/2008   07/un/2011
4       4  05/31/1937  01/18/2007   04/30/2011
5       5  06/31/1933  05/16/2015     11/20/un


> lapply(TestDates[, -1], checkDate)
$birthDT
[1] "11/23/21931" "06/20/1840"  "06/31/1933"

$diagnosisDT
[1] "02/30/2010" "05/16/2015"

$metastaticDT
[1] "un/17/2011" "11/20/un"


Does that not do what you require Paul?

Marc

 
 
 Why is this happening? If the code correctly handles the date
 06/20/1840 without producing an error, why can't it do likewise with
 05/16/2015?
 
 The second question is why it's necessary to put x on line 15 following
 cat("Warning ..."). I know that I don't get any date columns if I don't
 include this but am not sure why.
 
 The third question is whether it's possible to change the class of the date 
 variables without using a for loop. I played around with this a little but 
 didn't find a vectorized alternative. It may be that this is not really 
 important. It's just that I've read in several places that for loops should 
 be avoided wherever possible.
 
 Thanks,
 
 Paul

snip prior content


Re: [R] Checking for invalid dates: Code works but needs improvement

2012-01-30 Thread Marc Schwartz


On Jan 30, 2012, at 1:30 PM, Marc Schwartz wrote:

 
 On Jan 30, 2012, at 12:15 PM, David Winsemius wrote:
 
 
 On Jan 30, 2012, at 8:44 AM, Paul Miller wrote:
 
 Hi Rui, Marc, and Gabor,
 
 Thanks for your replies to my question. All were helpful and it was 
 interesting to see how different people approach various aspects of the 
 same problem.
 
 Spent some time this weekend looking at Rui's solution, which is certainly 
 much clearer than my own. Managed to figure out pretty much all the details 
 of how it works. Also managed to tweak it slightly in order to make it do 
 exactly what I wanted. (See revised code below.)
 
 Still have a couple of questions though. The first concerns the insertion 
 of the code Y > 2012 to set year values beyond 2012 to NA (on line 10 of 
 the function below).  When I add this (or use it in place of nchar(Y) > 
 4), the code successfully finds the problem date 05/16/2015. After that 
 though, it produces the following error message:
 
 Error in if (any(is.na(x) & M != "un" & Y != "un")) cat("Warning: Invalid 
 date values in",  :  missing value where TRUE/FALSE needed
 
 It's a bit dangerous to use comparison operators on mixed data types. In 
 your case you are comparing a character value to a numeric value and may not 
 realize that "2015" is not the same as 2015. Try "123" > 1000 if you want a 
 quick counter-example. You may want to coerce the Y value to numeric mode 
 to be safe.
 
 Also 'any' does not expect the logical connectives. You probably want:
 
 any(is.na(x), M != "un", Y != "un")
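
The coercion issue can be seen in a few lines. This is a minimal sketch; Y and Y_un are illustrative names, not taken from Paul's function, and isTRUE() is just one possible way to guard the test.

```r
# Comparing a character value with a number coerces the number to character,
# so the comparison is done on strings, not on magnitudes.
Y <- "2015"
"123" > 1000             # compares the strings "123" and "1000"
as.numeric(Y) > 2012     # numeric comparison after explicit coercion

# as.numeric() gives NA (with a warning) for non-numeric text such as "un",
# and if() cannot branch on NA, which is where
# "missing value where TRUE/FALSE needed" comes from.
Y_un <- suppressWarnings(as.numeric("un"))
isTRUE(Y_un > 2012)      # FALSE rather than NA, so safe inside if()
```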
 
 
 Perhaps I am missing something relevant here, but I am still confused by what 
 I see as an over-engineering of the code being implemented. If the primary 
 requirements are:
 
 1. Impute the 15th of month if it is 'un'
 2. Reject dates prior to 1900 or after 2011
 3. Reject dates with an unknown ('un') month or year
 4. Reject years with > 4 digits, also presuming that the value passed should 
 always be 10 characters in length
 
 If that is the basic functionality required, then a modest modification of my 
 prior code should work:


Ack...typo in my code for the upper end of the date range. Should be:

checkDate <- function(x) {

  # Replace unknown day with 15
  tmp <- gsub("/un/", "/15/", x)

  tmp2 <- as.Date(tmp, format = "%m/%d/%Y")

  as.character(x[is.na(tmp2) | 
                 tmp2 < as.Date("1900/01/01") |
                 tmp2 > as.Date("2011/12/31") |
                 nchar(as.character(x)) > 10])
}
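
For anyone wanting to try this, here is a self-contained sketch that applies the function to the test data shown earlier in the thread. The data.frame() call is a reconstruction from the printed table (the original construction code is not shown), and the name bad is illustrative.

```r
checkDate <- function(x) {
  tmp  <- gsub("/un/", "/15/", x)                 # impute unknown day as the 15th
  tmp2 <- as.Date(tmp, format = "%m/%d/%Y")       # NA for unparseable dates
  as.character(x[is.na(tmp2) |
                 tmp2 < as.Date("1900/01/01") |
                 tmp2 > as.Date("2011/12/31") |
                 nchar(as.character(x)) > 10])
}

# Reconstructed from the printed TestDates table (assumed character columns)
TestDates <- data.frame(
  Patient      = 1:5,
  birthDT      = c("11/23/21931", "06/20/1840", "06/17/1935",
                   "05/31/1937", "06/31/1933"),
  diagnosisDT  = c("05/23/2009", "02/30/2010", "12/20/2008",
                   "01/18/2007", "05/16/2015"),
  metastaticDT = c("un/17/2011", "03/17/2011", "07/un/2011",
                   "04/30/2011", "11/20/un"),
  stringsAsFactors = FALSE
)

# One character vector of rejected values per date column
bad <- lapply(TestDates[, -1], checkDate)
bad
```

Note that 07/un/2011 is accepted because only the day is unknown, while un/17/2011 and 11/20/un are rejected because the month or year is unknown.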


Marc


Re: [R] Euler identity with complex exp

2012-01-30 Thread Peter Langfelder

On Mon, Jan 30, 2012 at 11:43 AM, Joseph Park josephp...@ieee.org wrote:
 Thanks Michael & Peter.

 Michael's expansion makes sense.

 This is what I expected:

 a = pi + 0i
 complex( real = cos(Re(a)), imaginary = sin(Im(a)) )
 [1] -1+0i

As they say, the error is between the keyboard and the chair. You
cannot drop parts of a - in your formula above you dropped the
imaginary part of a in cos and the real part of a in sin. In this case
it doesn't make a difference but in general it will.


 Not this:
 exp(a)

you need exp(ia), not exp(a):

i = complex(real = 0, imaginary = 1)
exp(i*a)
[1] -1+0i


 [1] 23.14069+0i

 Is this not an implementation of Euler's formula:
 complex( real = cos(2*pi), imaginary = sin(2*pi) )
 [1] 1-0i

 And that is a result Michael depends on in his
 expansion, yet if we pass this argument to exp:
 exp( (complex( real = 2*pi, imaginary = 2*pi) ) )
 [1] 535.4917-0i

Again, you are not using the formula correctly. Remember x = 2*pi, so you need
exp( i * 2 * pi)

and you get the same result as complex( real = cos(2*pi), imaginary =
sin(2*pi) )
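
A quick numerical check of this, as a minimal sketch. It uses R's built-in complex literal 1i in place of the i defined above; lhs, rhs and z are illustrative names.

```r
x <- 2 * pi
lhs <- exp(1i * x)                                   # exp(ix)
rhs <- complex(real = cos(x), imaginary = sin(x))    # cos x + i sin x
all.equal(lhs, rhs)

# The identity also holds for a complex argument, because cos() and sin()
# accept complex input in R.
z <- complex(real = 2 * pi, imaginary = 2 * pi)
all.equal(exp(1i * z), cos(z) + 1i * sin(z))
```

all.equal() is used instead of == because the two sides differ by floating-point rounding in the last digits.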

Peter


Re: [R] Euler identity with complex exp

2012-01-30 Thread R. Michael Weylandt

This is off-topic for R-help, but we might as well finish what's been started:

Take a closer look at exp(i*x). If x is real, i*x is a pure imaginary
number, not a complex number so the formula you are using doesn't hold
in general.** The general Euler result for complex (= mixed real and
imaginary) numbers looks like this:

exp(x + iy) = exp(x)*(cos(y) + i sin(y))

That is, the real part gives the modulus and the imaginary part goes
solely to the argument. What's often surprising about this is that

exp(2 + 2*pi*i) = exp(2) = exp(2+4*pi*i) = exp(2 - 2*pi*i)

because the trig functions which get applied to the imaginary part are
periodic.
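
Both statements can be checked directly in R. The following is only a sketch with arbitrarily chosen values x = 2 and y = 0.7; the variable names are illustrative.

```r
x <- 2
y <- 0.7
z <- complex(real = x, imaginary = y)

# exp(x + iy) = exp(x) * (cos(y) + i sin(y))
all.equal(exp(z), exp(x) * complex(real = cos(y), imaginary = sin(y)))

all.equal(Mod(exp(z)), exp(x))   # the real part sets the modulus
all.equal(Arg(exp(z)), y)        # the imaginary part sets the argument
                                 # (for y between -pi and pi)

# Periodicity: adding multiples of 2*pi*i does not change the result
vals <- exp(2 + c(0, 2, 4, -2) * pi * 1i)
all.equal(vals, rep(exp(2) + 0i, 4))
```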

Take a closer look at what you wrote:

complex( real = cos(2*pi), imaginary = sin(2*pi) )
exp( (complex( real = 2*pi, imaginary = 2*pi) ) )

The number in the first line is not what gets exponentiated in the
second! You'll get the expected (by you) behavior if you actually use
the same number for both calculations:

complex( real = cos(2*pi), imaginary = sin(2*pi) )
exp(complex( real = cos(2*pi), imaginary = sin(2*pi) ))

or

complex(real = 2*pi, imaginary = 2*pi)
exp(complex(real = 2*pi, imaginary = 2*pi))

If you work out the second like I did for exp(pi + 2*pi*i) in my first
email, you'll get the correct answer.

All in all, R is definitely correct in its interpretation of
Euler's formula. There's only one way to parse this relationship that
gives mathematical consistency and it's what Peter and I have set out
for you.

Michael

** Not actually true, if x is complex, it of course works out
correctly as well, but you wind up having to use the more general
expression I give to get there.

On Mon, Jan 30, 2012 at 2:43 PM, Joseph Park josephp...@ieee.org wrote:
 Thanks Michael & Peter.

 Michael's expansion makes sense.

 This is what I expected:

 a = pi + 0i
 complex( real = cos(Re(a)), imaginary = sin(Im(a)) )
 [1] -1+0i

 Not this:
 exp(a)
 [1] 23.14069+0i

 Is this not an implementation of Euler's formula:
 complex( real = cos(2*pi), imaginary = sin(2*pi) )
 [1] 1-0i

 And that is a result Michael depends on in his
 expansion, yet if we pass this argument to exp:
 exp( (complex( real = 2*pi, imaginary = 2*pi) ) )
 [1] 535.4917-0i

 That would not work in Michael's expansion, the answer must
 be 1 + 0i.

 Which seems to suggest that exp( ix ) and cos x + i sin x (as
 written above) are different interpretations.


 On 01/30/2012 12:47 PM, Peter Langfelder wrote:

 Not sure why you think the formula does not hold... but am guessing
 you think that sin(x) and cos(x) have values in [-1, 1]? Well that
 only holds for real x. If you have a complex x, sin(x) and cos(x) are
 unbounded - indeed, if you can write x=iy and y is real, you can show
 (up to my own ignorance of possible signs) cos(x) = cosh(y), and
 sin(x) = i*sinh(y) simply by expressing (from the formula you wrote)
 cos(x) and sin(x) as

 cos(x) = ( exp(ix) + exp(-ix) )/2
 and sin(x) = ( exp(ix) - exp(-ix) )/(2i)

 In any case, plug any complex number into
 exp( ix )
 and
 cos x + i sin x

 in R and you will get the exact same answers.

 HTH,

 Peter

 On Mon, Jan 30, 2012 at 7:37 AM, Joseph Park josephp...@ieee.org wrote:

 Hi,

 Am i doing something silly here in expecting Euler's
 formula to be handled by exp? exp( ix ) = cos x + i sin x.
 The first example below follows this, the others not.

 Thanks for the education!

   exp( complex(real = 0, imag = 2*pi) )
 [1] 1-0i
   exp( complex(real = pi, imag = 2*pi) )
 [1] 23.14069-0i
   exp( complex(real = pi/2, imag = 0) )
 [1] 4.810477+0i



Re: [R] And Statement for two if functions

2012-01-30 Thread R. Michael Weylandt

Nested if's are fine in R, but as David said you probably want
ifelse(). This sounds sufficiently homework-y that I'm hesitant to
give example code but it's all over the archives.

Just to head off a problem I see in your pseudo-code; you're going to
want to use ifelse() to construct the points vector and then assign
it: it's terribly dangerous to do assignment within ifelse() as if it
were a simple if().
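
To illustrate the "construct, then assign" pattern without working through the whole league table, here is a minimal sketch on made-up scores. The column names home_score and away_score and the vectors away_bonus and bonus_by_team are hypothetical; the 7-point rule is taken from the question below.

```r
# Made-up results: home team, away team, home score, away score
games <- data.frame(
  home       = c("A", "B", "C"),
  away       = c("B", "C", "A"),
  home_score = c(20, 10, 30),
  away_score = c(15, 10, 3),
  stringsAsFactors = FALSE
)

# ifelse() is vectorized, so both conditions are combined with & (not &&)
# and the result is a plain vector, built first and used afterwards.
away_bonus <- ifelse(games$home_score > games$away_score &
                     games$home_score <= games$away_score + 7, 1, 0)
away_bonus

# Total bonus points per away team
bonus_by_team <- tapply(away_bonus, games$away, sum)
bonus_by_team
```

Inside a plain if() on a single row, the scalar operator && would be the one to use instead.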

Michael


On Mon, Jan 30, 2012 at 1:55 PM, kerry1912 kerry1...@hotmail.com wrote:
 Sorry that post was written in a bit of a rush.

 I am writing a function in which I am trying to create a league table from a
 data frame of rugby matches with the columns as follows: home team, away
 team, home score and away score.

 In rugby you can get an extra bonus point if you are the losing team and
 lose by less than 7 points. So therefore in my function I am writing if the
 away team loses AND loses by less than or equal to 7 points then the away
 team will get an extra point,

 So ideally want to write:

 if(games[i,3] > games[i,4] AND games[i,3] <= games[i,4] + 7) {
                                T[which(teams == games[i,2]),"Points"] <-
                                        T[which(teams == games[i,2]),"Points"] + 1}

 Which is inset into a function in R where the input of the function is
 'games' which will be the list of the 132 matches of rugby being analysed
 and where teams is the list of 12 teams in the league.

 I wasn't sure if it was possible to write an 'if' function embedded in
 another 'if' function or which method would be best to achieve this.

 Thank you.






Re: [R] replacing characters in matrix. substitute, delayedAssign, huh?

2012-01-30 Thread Paul Johnson

Henrik's proposal works well, so far.  Thanks very much. I could not
have figured that out (without much more suffering).

Here's the working example
in case future googlers find their way to this thread.


## Paul Johnson paulj...@ku.edu
## 2012-01-30

## Special thanks to r-help email list contributors,
## especially Henrik Bengtsson


BM <- matrix(0.1, 5, 5)

BM[2,1] <- "a"
BM[3,2] <- "b"

BM

parseAndEval <- function(x, ...) eval(parse(text=x))

a <- 0.5
b <- 0.4

realBM <- apply(BM, MARGIN=c(1,2), FUN=parseAndEval)

BM[4,5] <- "rnorm(1, m=7, sd=1)"

BM

realBM <- apply(BM, MARGIN=c(1,2), FUN=parseAndEval)

realBM

## Now, what about gui interaction with that table?
## The best nice looking options are not practical at the moment.

## Try this instead

data.entry(BM)

## That will work on all platforms, so far as I know, without
## any special effort from us. Run that, make some changes, then
## make sure you insert new R variables to match in your environment.

## Suppose you inserted the letter z in there somewhere

## set z out here

z <- rpois(1, lambda=10)

realBM <- apply(BM, MARGIN=c(1,2), FUN=parseAndEval)


-- 
Paul E. Johnson
Professor, Political Science
1541 Lilac Lane, Room 504
University of Kansas


[R] Different type of legend?

2012-01-30 Thread rkevinburton



How would I create a legend that looks like the attached image? 
Basically all of the color boxes are right next to each other and the 
text is below. This kind of arrangement allows for many more items in 
the legend. Using the legend() method seems to top out at about 14 items 
(that will fit in the horizontal plot).


Suggestions?

Thank you.

Kevin

Re: [R] Different type of legend?

2012-01-30 Thread R. Michael Weylandt

Server stripped the attachment. Can you post a link somewhere?

Michael

On Mon, Jan 30, 2012 at 4:25 PM,  rkevinbur...@charter.net wrote:

 How would I create a legend that looks like the attached image? Basically
 all of the color boxes are right next to each other and the text is below.
 This kind of arrangement allows for many more items in the legend. Using the
 legend() method seems to top out at about 14 items (that will fit in the
 horizontal plot).

 Suggestions?

 Thank you.

 Kevin
