Re: [R] vectorisation

2013-02-03 Thread Berend Hasselman

On 02-02-2013, at 17:38, Brett Robinson brett.robin...@7dials.com wrote:

> Hi
> I'm trying to set up a simulation problem without resorting to (m)any loops.
> I want to set entries in a data frame of zeros ('starts' in the code below)
> to 1 at certain points, and the points have been randomly generated and stored
> in a separate data.frame ('sl'), which has the same number of columns.
>
> An example of the procedure is as follows:
> ml <- data.frame(matrix(sample(1:50, 80, replace=TRUE), 20, 4))
> mm <- apply(ml, 2, cumsum)
> starts <- data.frame(matrix(0, 600, 4))
>
> I can achieve the result I want with a loop:
> for (i in 1:4){
>     starts[,i][mm[,i]] <- 1
> }
>
> But as I want to use a large number of columns I would like to do away with
> the loop.
>
> Can anyone suggest how this might be done?

Another way is this

f2 <- function(starts, mm) {
    mn <- cbind(as.vector(mm), rep(1:ncol(mm), each=nrow(mm)))
    x <- as.matrix(starts)
    x[mn] <- 1
    as.data.frame(x)
}

starts2 <- f2(starts, mm)
# > identical(starts2, starts1)
# [1] TRUE
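
The key step in f2 is indexing a matrix with a two-column (row, column)
matrix. A minimal sketch on toy data follows; the small 'starts' and 'mm'
here are made-up stand-ins, not the poster's data.

```r
# Toy stand-ins for 'starts' and 'mm' (values are illustrative only)
starts <- data.frame(matrix(0, 6, 2))
mm <- cbind(c(1, 4), c(2, 6))   # rows to flag in column 1 and in column 2

# Two-column index matrix: first column = row, second column = column
mn <- cbind(as.vector(mm), rep(seq_len(ncol(mm)), each = nrow(mm)))

x <- as.matrix(starts)
x[mn] <- 1                      # one vectorised assignment, no loop
result <- as.data.frame(x)
result
```

Each row of mn names exactly one cell, so the assignment touches only the
cells (1,1), (4,1), (2,2) and (6,2).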

I collected all the options presented so far into functions, used the
compiler package to see if that helps, and ran some speed tests with
Arun's parameters.

# Brett
f1 <- function(starts, mm) {
    for (i in 1:ncol(mm)){
        starts[,i][mm[,i]] <- 1
    }
    starts
}

# Berend
f2 <- function(starts, mm) {
    mn <- cbind(as.vector(mm), rep(1:ncol(mm), each=nrow(mm)))
    x <- as.matrix(starts)
    x[mn] <- 1
    as.data.frame(x)
}

# Rui
f3 <- function(s2, mm) {
    s2[] <- lapply(seq_len(ncol(mm)), function(i) {s2[,i][mm[,i]] <- 1; s2[,i]})
    s2
}

# Arun
f4 <- function(starts, mm) {
    starts2 <- as.data.frame(do.call(cbind, lapply(1:ncol(mm), function(i)
        {starts[,i][mm[,i]] <- 1; starts[,i]})))
    colnames(starts2) <- colnames(starts)
    starts2
}

library(compiler)
f1c <- cmpfun(f1)
f2c <- cmpfun(f2)
f3c <- cmpfun(f3)
f4c <- cmpfun(f4)

library(rbenchmark)

# Arun's test
set.seed(11)
starts <- data.frame(matrix(0,1e6,4))
ml <- data.frame(matrix(sample(1:1e4,1e3, replace=TRUE),100,4))
mm <- apply(ml, 2, cumsum)

z1 <- f1(starts,mm)
z2 <- f2(starts,mm)
z3 <- f3(starts,mm)
z4 <- f4(starts,mm)
z1c <- f1c(starts,mm)
z2c <- f2c(starts,mm)
z3c <- f3c(starts,mm)
z4c <- f4c(starts,mm)

identical(z2,z1)
identical(z3,z1)
identical(z4,z1)
identical(z1c,z1)
identical(z2c,z1)
identical(z3c,z1)
identical(z4c,z1)

benchmark( f1(starts,mm) , f2(starts,mm),
           f1c(starts,mm), f2c(starts,mm),
           f3(starts,mm) , f4(starts,mm),
           f3c(starts,mm), f4c(starts,mm),
           replications=1, order="relative",
           columns=c("test","relative","elapsed","replications"))

Result:

# > identical(z2,z1)
# [1] TRUE
# > identical(z3,z1)
# [1] TRUE
# > identical(z4,z1)
# [1] TRUE
# > identical(z1c,z1)
# [1] TRUE
# > identical(z2c,z1)
# [1] TRUE
# > identical(z3c,z1)
# [1] TRUE
# > identical(z4c,z1)
# [1] TRUE
# >
# > benchmark( f1(starts,mm) , f2(starts,mm),
# +            f1c(starts,mm), f2c(starts,mm),
# +            f3(starts,mm) , f4(starts,mm),
# +            f3c(starts,mm), f4c(starts,mm),
# +            replications=1, order="relative",
# +            columns=c("test","relative","elapsed","replications"))
#              test relative elapsed replications
# 2  f2(starts, mm)    1.000   0.195            1
# 4 f2c(starts, mm)    1.005   0.196            1
# 1  f1(starts, mm)    2.990   0.583            1
# 3 f1c(starts, mm)    3.082   0.601            1
# 7 f3c(starts, mm)    3.903   0.761            1
# 5  f3(starts, mm)    3.949   0.770            1
# 8 f4c(starts, mm)    4.436   0.865            1
# 6  f4(starts, mm)    4.462   0.870            1

Compiling doesn't deliver significant speed gains in this case.
Function f2 is the quickest.

Berend
__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Split xts data set into weeks

2013-02-03 Thread Seimizu Joukan
Hi, Jeff

Thank you for your advice.

> Your example of the problem is not reproducible [1]. This behavior could
> arise due to small discrepancies in the index values, or from specifying
> frequency instead of f as the second argument, or perhaps you have found
> a bug that only your data triggers. Any verification of what your problem is
> will require a reproducible example.
> [1]
> http://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example

I tried to make a piece of reproducible code.
Would you please paste the following code into the R console and confirm?

#Codes start from here

library(quantmod)
tmp <- structure(c(112.34, 112.89, 112.75, 113.5, 115.16, 115.21, 114.84,
114.93, 115.05, 114.46, 113.34, 113.71, 113.56, 115.08, 115.97,
115.26, 115.22, 115.24, 115.24, 114.98, 111.96, 112.75, 112.5,
113.1, 114.85, 114.55, 114.55, 114.75, 114.2, 112.92, 112.87,
112.8, 113.54, 115.05, 115.06, 114.85, 114.93, 115.09, 114.28,
113.92), class = c("xts", "zoo"), .indexCLASS = "Date", tclass =
"Date", .indexTZ = "", tzone = "", index = structure(c(1298818800,
1298905200, 1298991600, 1299078000, 1299164400, 1299423600, 129951,
1299596400, 1299682800, 1299769200), tzone = "", tclass = "Date"),
.Dim = c(10L,
4L), .Dimnames = list(NULL, c("Open", "High", "Low", "Close")))
class(tmp)
(res1 <- split(tmp, f="weeks"))
(res2 <- split(tmp, frequency="weeks"))

#Codes end here

The original data is saved in tmp and the split() results are saved
in res1 and res2.
res1 is the result of using f and res2 is the result of using frequency;
both break up the week starting 2011-02-28, and the result with frequency
is even worse.

BTW, my R version is as follows:

> version
               _
platform       i686-pc-linux-gnu
arch           i686
os             linux-gnu
system         i686, linux-gnu
status
major          2
minor          15.2
year           2012
month          10
day            26
svn rev        61015
language       R
version.string R version 2.15.2 (2012-10-26)
nickname       Trick or Treat

Thank you.

Seimizu Joukan



Re: [R] Relative Risk in logistic regression

2013-02-03 Thread Michael Dewey

At 10:49 30/01/2013, aminreza Aamini wrote:

> Hi all,
> I am very grateful to all those who write to me
> 1) how I can obtain relative risk (risk ratio) in logistic regression in R.


@TECHREPORT{lumley06,
  author = {Lumley, T and Kronmal, R and Ma, S},
  year = 2006,
  title = {Relative risk regression in medical research: models, contrasts,
  estimators, and algorithms},
  number = 293,
  institution = {{UW} Biostatistics Working Paper Series},
  keywords = {glm, Poisson},
  url = {http://www.bepress.com/uwbiostat/paper293}
}



> 2) how to obtain the predicted risk for a certain individual using
> fitted regression model in R.
>
> Many thanks, in advance, for your help.
>
> Amin.


Michael Dewey
i...@aghmed.fsnet.co.uk
http://www.aghmed.fsnet.co.uk/home.html



Re: [R] Split xts data set into weeks

2013-02-03 Thread R. Michael Weylandt
On Sun, Feb 3, 2013 at 6:57 AM, Seimizu Joukan saim...@gmail.com wrote:
> Would you please paste the following code into the R console and
> confirm?


Indeed, well done and much appreciated.

> #Codes start from here
>
> library(quantmod)
> tmp <- structure(c(112.34, 112.89, 112.75, 113.5, 115.16, 115.21, 114.84,
> 114.93, 115.05, 114.46, 113.34, 113.71, 113.56, 115.08, 115.97,
> 115.26, 115.22, 115.24, 115.24, 114.98, 111.96, 112.75, 112.5,
> 113.1, 114.85, 114.55, 114.55, 114.75, 114.2, 112.92, 112.87,
> 112.8, 113.54, 115.05, 115.06, 114.85, 114.93, 115.09, 114.28,
> 113.92), class = c("xts", "zoo"), .indexCLASS = "Date", tclass =
> "Date", .indexTZ = "", tzone = "", index = structure(c(1298818800,
> 1298905200, 1298991600, 1299078000, 1299164400, 1299423600, 129951,
> 1299596400, 1299682800, 1299769200), tzone = "", tclass = "Date"),
> .Dim = c(10L,
> 4L), .Dimnames = list(NULL, c("Open", "High", "Low", "Close")))
> class(tmp)
> (res1 <- split(tmp, f="weeks"))
> (res2 <- split(tmp, frequency="weeks"))

Looking at args(split.xts), I think you actually do want split(..., f = )
here, not split(..., frequency = ); the latter would be ignored, leaving f
at its default of "months".


I get the following for res1, running R-Devel on OS X 10.6.8:

> res1
[[1]]
             Open   High    Low  Close
2011-02-27 112.34 113.34 111.96 112.87

[[2]]
             Open   High    Low  Close
2011-02-28 112.89 113.71 112.75 112.80
2011-03-01 112.75 113.56 112.50 113.54
2011-03-02 113.50 115.08 113.10 115.05
2011-03-03 115.16 115.97 114.85 115.06
2011-03-06 115.21 115.26 114.55 114.85

[[3]]
             Open   High    Low  Close
2011-03-07 114.84 115.22 114.55 114.93
2011-03-08 114.93 115.24 114.75 115.09
2011-03-09 115.05 115.24 114.20 114.28
2011-03-10 114.46 114.98 112.92 113.92

so I think it's likely a timezone issue. Try setting

indexTZ(tmp) <- "GMT"

or something similar and giving it another shot.
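
To see why the time zone can move the week boundary, here is a small
base-R sketch that takes the first index value from the dput() output in
this thread and prints its calendar date in two zones. The zone name
"Asia/Tokyo" is an assumption on my part, chosen because the poster later
mentions JST.

```r
# First index value from the thread's dput() output (seconds since 1970-01-01 UTC)
t0 <- as.POSIXct(1298818800, origin = "1970-01-01", tz = "UTC")

format(t0, tz = "UTC",        usetz = TRUE)
format(t0, tz = "Asia/Tokyo", usetz = TRUE)

# The same instant falls on different calendar days in the two zones
d_utc <- as.Date(t0, tz = "UTC")
d_jst <- as.Date(t0, tz = "Asia/Tokyo")

# ISO weekday number: 1 = Monday, 7 = Sunday
format(d_utc, "%u")
format(d_jst, "%u")
```

The same instant is a Sunday in UTC but a Monday in Tokyo, so a split by
weeks can put that row in different groups depending on the zone used.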

You might also want to move to the R-SIG-Finance list, where the
authors of xts are more frequently seen.

It might also help to report Sys.timezone() in addition to your
specific linux distro.

Cheers,

MW



Re: [R] Split xts data set into weeks

2013-02-03 Thread Seimizu Joukan
Hi, Michael

Thank you very much!

> Looking at args(split.xts) I think you actually do want split(..., f =
> ) here, not split(..., frequency = ), which would ignore and default
> to months.

Yes, split(..., f=) is what I want.

> so I think it's likely a timezone issue. Try setting
> indexTZ(tmp) <- "GMT"

Yes, I think you are right.
When I set indexTZ(tmp) to GMT or JST (Japan),
I got the same result as yours.

When I call Sys.timezone(), I get "". I am afraid that R failed to read
my Ubuntu system's environment variable. Perhaps because I am using Ubuntu
on VMware, there are some problems with the timezone. Not sure! :(

I know little about timezones, so I will go on to investigate and
learn something about them. Thank you for your help!

Best regards!

Seimizu Joukan



Re: [R] Question: write an R script with help information available to the user

2013-02-03 Thread Gabor Grothendieck
On Sun, Feb 3, 2013 at 1:50 AM, Bert Gunter gunter.ber...@gene.com wrote:
> A related approach which, if memory serves, was originally in S eons
> ago, is to define a "doc" attribute of any function (or object, for
> that matter) that you wish to document that contains text for
> documentation and a doc() function of the form:
>
> doc <- function(obj) cat(attr(obj,"doc"))
>
> used as:
>
> f <- function(x) NULL
> attr(f,"doc") <- "Some text\n\n"
> doc(f)
> > doc(f)
> Some text
>
> This is pretty primitive, but I suppose you could instead have the
> attribute point to something like an HTML file and the doc() function
> open it in a web browser, which is basically what R's built-in package
> document system does anyway. Except you wouldn't have to build a
> package and don't have to learn or follow R's procedures. Which means
> you don't get R's standardization and organization and no one but a
> private bunch of users will be able to use your function. But maybe
> that's sufficient for your needs.


To further build on this, try the above idea using the comment function:

> f <- function() NULL
> comment(f) <- "Help goes here"
>
> comment(f)
[1] "Help goes here"

or combine it with the redefinition of ? like this:

`?` <- function(...) if (!is.null(doc <-
comment(get(match.call()[[2]])))) cat(doc, "\n") else help(...)
?f     # displays:  Help goes here
?dim   # normal help
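
As a small sketch of the comment() idea without redefining `?`, the helper
below returns the attached text or a fallback. The name describe() and the
fallback message are hypothetical, not part of base R or of this thread.

```r
# Attach free-text documentation to a function via its "comment" attribute
f <- function() NULL
comment(f) <- "Help goes here"

# Hypothetical helper: return the attached text, or a fallback string
describe <- function(obj) {
  txt <- comment(obj)
  if (is.null(txt)) "(no documentation attached)" else paste(txt, collapse = "\n")
}

describe(f)      # the attached text
describe(sum)    # the fallback, since nothing is attached to sum
```

Keeping the lookup in an ordinary function leaves the built-in help
operator untouched, at the cost of a less convenient call.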

--
Statistics & Software Consulting
GKX Group, GKX Associates Inc.
tel: 1-877-GKX-GROUP
email: ggrothendieck at gmail.com



Re: [R] Relative Risk in logistic regression

2013-02-03 Thread aminreza Aamini
Dear Colleagues,
As my friend John mentioned, *the measure of association from a logistic
regression is the odds ratio, not the relative risk*. But the point is that
in follow-up studies it is commonly preferred to estimate a risk ratio rather
than an odds ratio. That's why I'm looking for RR in logistic models.
By the way, thank you all for your consideration.
Amin




Re: [R] Relative Risk in logistic regression

2013-02-03 Thread John Sorkin
Amin,
It is incorrect to use the relative risk as a measure of association in a 
logistic regression.  The measure of association in a logistic regression is 
the odds ratio. The odds ratio is an approximation of the relative risk. The 
approximation becomes progressively better as the disease becomes progressively 
rarer. Regardless of whether the disease is rare or not, inferences drawn from 
a logistic regression are valid. Please do not report a logistic regression 
using relative risk. It is not correct to do so. 
John  

 
John David Sorkin M.D., Ph.D.
Chief, Biostatistics and Informatics
University of Maryland School of Medicine Division of Gerontology
Baltimore VA Medical Center
10 North Greene Street
GRECC (BT/18/GR)
Baltimore, MD 21201-1524
(Phone) 410-605-7119
(Fax) 410-605-7913 (Please call phone number above prior to faxing)


Re: [R] Relative Risk in logistic regression

2013-02-03 Thread David Winsemius


On Feb 3, 2013, at 8:15 AM, aminreza Aamini wrote:

> Dear Colleagues,
> As my friend John mentioned, *the measure of association from a logistic
> regression is the odds ratio, not the relative risk*. But the point is that
> in follow-up studies it is commonly preferred to estimate a risk ratio
> rather than an odds ratio. That's why I'm looking for RR in logistic models.
> By the way, thank you all for your consideration.
> Amin


I agree that the relative risk is generally preferred in presenting
the results of follow-up studies. The question should be: why do you
want to use a logistic link? The technical report out of the
University of Washington Biostatistics Department explains a variety
of approaches, including using a log-binomial model and Poisson
regression. Either of those can be done in R with glm. The Poisson
regression model is particularly simple to develop. If you want a more
specific answer, you should explain a) what sort of data you have, in
greater detail, and b) your reasons for using the logistic link when
arguably better alternatives are available.
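
A minimal sketch of the two glm approaches mentioned above, on a made-up
cohort (the counts and the variable names 'exposed' and 'event' are
hypothetical). The start values for the log-binomial fit are my own choice
to help convergence. The last line also illustrates the original second
question, a predicted risk for one individual.

```r
# Hypothetical cohort: 100 unexposed (20 events), 100 exposed (40 events)
d <- data.frame(
  exposed = rep(c(0, 1), each = 100),
  event   = c(rep(c(1, 0), c(20, 80)), rep(c(1, 0), c(40, 60)))
)

# Log-binomial model: exp(coefficient) is a risk ratio
fit_lb <- glm(event ~ exposed, family = binomial(link = "log"), data = d,
              start = c(log(mean(d$event)), 0))

# Poisson model with log link fitted to the same 0/1 outcome
fit_po <- glm(event ~ exposed, family = poisson(link = "log"), data = d)

rr_lb <- unname(exp(coef(fit_lb)["exposed"]))
rr_po <- unname(exp(coef(fit_po)["exposed"]))

# Predicted risk for one (exposed) individual
risk_exposed <- unname(predict(fit_lb, newdata = data.frame(exposed = 1),
                               type = "response"))
c(rr_lb = rr_lb, rr_po = rr_po, risk_exposed = risk_exposed)
```

Both fits recover the ratio of the observed risks here. For a binary
outcome the model-based standard errors from the Poisson fit tend to be
too wide, so a robust variance estimator is commonly paired with it.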


--
David





David Winsemius, MD
Alameda, CA, USA



[R] Empty cluster / segfault using vanilla kmeans with version 2.15.2

2013-02-03 Thread Luca Nanetti
Dear experts,
I am encountering a version-dependent issue.

My laptop runs Ubuntu 12.04 LTS 64-bit, R 2.14.1; the issue explained below
never occurred with this version of R
My desktop runs Ubuntu 11.10 64-bit, R 2.13.2; what follows applies to this
setup.

The data I'm clustering consists of the rows of a 320 x 6 matrix
containing integers ranging from 1 to 7, with no missing data.
I applied kmeans() to this matrix, literally, 256 x 10⁶ times using R
version 2.13.2 or 2.14.1, without ever experiencing the slightest problem.
My usual setup is k=5, nstart=256, iter.max=50.

After upgrading to R 2.15.2, I experienced either a warning message ('Empty
cluster. Choose a better set of initial centers') or a catastrophic
segfault. The only way I can get any solution at all is to set nstart to
its default value, i.e. 1. However, when I simply repeat the clustering, the
same issue still happens. Moreover, this is vastly suboptimal because of the
risk of local minima.

Something similar was reported many years ago, see
https://stat.ethz.ch/pipermail/r-help/2003-November/041784.html. It was
then suggested that R's behaviour was correct. I'm not familiar with such
an early R version, but the up-to-date documentation of kmeans clearly
states that "Except for the Lloyd-Forgy method, k clusters will always be
returned if a number is specified."
I am using the default Hartigan-Wong method, and I specify an exact number k:
thus, k clusters should be returned. They aren't, and the empty cluster is
then more likely the symptom of a bug than the outcome of a 'true'
local minimum.

Using synaptic, I managed to downgrade R to version 2.13.2. The problem
disappeared, i.e. the previous message/segfault didn't occur anymore.

Summarizing: given the same dataset, either an unreasonable message or a
segfault regularly happens in version 2.15.2 when invoking kmeans() on an
Ubuntu 11.10 64-bit machine. This does not happen at all in previous
versions of R, on the same machine and operating system.

I respectfully suggest that the behaviour shown in the aforementioned
versions 2.13.2 and 2.14.1 should be considered 'normal', and that version
2.15.2 should revert to that.
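
For anyone trying to reproduce this, a self-contained sketch of the
described setup might look like the following. The matrix is randomly
generated stand-in data (the real data set is not in the post) and the
seed is arbitrary, so it is not guaranteed to trigger the reported
behaviour.

```r
set.seed(1)   # arbitrary seed, for repeatability only

# Hypothetical stand-in for the real data: 320 x 6 integers in 1..7
x <- matrix(sample(1:7, 320 * 6, replace = TRUE), nrow = 320, ncol = 6)

# The setup described in the post: default Hartigan-Wong algorithm
fit <- kmeans(x, centers = 5, nstart = 256, iter.max = 50)

fit$size            # cluster sizes; an empty cluster would show up as 0
R.version.string    # worth quoting in a bug report
```

Running the same script under each R version and comparing fit$size would
show whether the difference depends on the version or on the data.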

Kind regards,
Luca Nanetti.



[R] Fractional logit in GLM?

2013-02-03 Thread Rachael Garrett
Hi,

Does anyone know of a function in R that can handle a fractional variable as
the dependent variable? The catch is that the function has to allow values
of exactly 0 and 1, which betareg() does not.

It seems like glm() might be able to handle the fractional logit model, but I
can't figure it out. How do you set up glm() to do so?
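
No answer is recorded in this digest. One commonly used approach, given
here only as a hedged sketch, is glm() with the quasibinomial family and
a logit link, which accepts a response anywhere in [0, 1] including the
end points. The data frame and its values are made up for illustration.

```r
# Hypothetical data: a fraction in [0, 1], including exact 0 and 1
d <- data.frame(
  x    = c(0, 1, 2, 3, 4, 5, 6, 7),
  frac = c(0, 0.05, 0.20, 0.35, 0.60, 0.80, 0.95, 1)
)

# Quasi-likelihood fit with a logit link (a "fractional logit" style model)
fit <- glm(frac ~ x, family = quasibinomial(link = "logit"), data = d)

fitted_frac <- fitted(fit)   # fitted fractions on the response scale
round(fitted_frac, 3)
coef(fit)
```

Using quasibinomial rather than binomial avoids the non-integer-success
warning and estimates a dispersion parameter for the standard errors.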

Best,

Rachael





[R] Looping through rows of all elements of a list that has variable length

2013-02-03 Thread Dimitri Liakhovitski
Dear R-ers,
I have a list of data frames whose length is unknown in advance (it could
be 1 or 2 or more). Each element of the list contains a data frame.
I need to loop through all rows of list element 1 AND (if applicable)
of list element 2, etc., and do something at each iteration.
I am trying to figure out how to write code that is generic, i.e., that
loops through the rows of all elements of my list even if the total number
of list elements is unknown in advance.
Below is an example.

a=expand.grid(1:2,1:2)
b=expand.grid(1:2,1:2,1:2)
#
# My list that can have 1 element, e.g.:
l.short <- vector("list", 1)
l.short[[1]] <- a
# I need to loop through rows of l.short[[1]] and do something (it's
# unimportant what exactly) with them, e.g.:
out <- vector("list", nrow(l.short[[1]]))
for(i in 1:nrow(l.short[[1]])){  # i <- 1
  out[[i]] <- sum(l.short[[1]][i,])
}
(out)

#
# Or my list could have >1 elements, e.g., 2 like below (or 3 or more).
# The total length of my list varies.

l.long <- list(a, b)
# I need to loop through rows of l.long[[1]] AND of l.long[[2]]
# simultaneously and do something with both, - see example below.
# Below, I am doing it manually by using expand.grid to create all
# combinations of rows of 2 elements of 'l.long':
mygrid <- expand.grid(1:nrow(l.long[[1]]), 1:nrow(l.long[[2]]))
out <- vector("list", nrow(mygrid))
for(gridrow in 1:nrow(mygrid)){  # gridrow <- 1
  row.a <- mygrid[gridrow,1]
  row.b <- mygrid[gridrow,2]
  out[[gridrow]] <- sum(l.long[[1]][row.a,]) + sum(l.long[[2]][row.b,])
}
Thank you very much for any suggestions!
-- 
Dimitri Liakhovitski
gfk.com http://marketfusionanalytics.com/
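
No reply is included in this digest. One possible generalisation of the
manual expand.grid step, offered only as a sketch, is to build the grid of
row indices with do.call() so that it works for a list of any length. The
function name combine_rows() is hypothetical.

```r
a <- expand.grid(1:2, 1:2)
b <- expand.grid(1:2, 1:2, 1:2)

# Hypothetical helper: visit every combination of rows across all elements
combine_rows <- function(lst) {
  # One index vector (1..nrow) per list element
  idx <- lapply(lst, function(df) seq_len(nrow(df)))
  # All combinations of row indices, one column per list element
  grid <- do.call(expand.grid, idx)
  out <- vector("list", nrow(grid))
  for (g in seq_len(nrow(grid))) {
    rows <- unlist(grid[g, ])
    # Sum the selected row of each element, then add the sums together
    out[[g]] <- sum(mapply(function(df, r) sum(df[r, ]), lst, rows))
  }
  out
}

res_short <- combine_rows(list(a))      # list of length 1
res_long  <- combine_rows(list(a, b))   # list of length 2
```

The per-combination work (here a sum) sits in one place, so it can be
swapped for whatever needs doing at each iteration.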



[R] Compare each element of a list to a vector

2013-02-03 Thread mtb954
Hello R-helpers,

I have a vector

x <- c(1,2,3)

and a list that contains vectors

datalist <- list(c(1,2,3),c(2,3,4),c(3,4,5),c(4,5,6))

and I would like to identify those list elements that are identical to x.

I tried

> datalist %in% x
[1] FALSE FALSE FALSE FALSE

but I am obviously using %in% incorrectly. I also tried messing around with
lapply but I can't figure out how to specify the function within lapply.

I would appreciate any suggestions you may have.

Many thanks!

Mark Na



Re: [R] Compare each element of a list to a vector

2013-02-03 Thread jim holtman
try this:

> x <- c(1,2,3)
> datalist <- list(c(1,2,3),c(2,3,4),c(3,4,5),c(4,5,6))
>
> result <- sapply(datalist, function(.vec){
+     all(.vec == x)
+ })
>
> result
[1]  TRUE FALSE FALSE FALSE






-- 
Jim Holtman
Data Munger Guru

What is the problem that you are trying to solve?
Tell me what you want to do, not how you want to do it.



Re: [R] Compare each element of a list to a vector

2013-02-03 Thread William Dunlap
Try
   > datalist %in% list(x)
   [1]  TRUE FALSE FALSE FALSE
Both arguments, e1 and e2, of e1 %in% e2 should be of the same type:
e1 %in% e2 is comparing e1[i] and e2[j].
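
A short sketch contrasting the two calls, using the data from the original
question. Per ?match, lists are converted to character vectors before
matching, so this relies on the elements deparsing identically.

```r
x <- c(1, 2, 3)
datalist <- list(c(1, 2, 3), c(2, 3, 4), c(3, 4, 5), c(4, 5, 6))

# Table is a numeric vector: each list element is looked up among 1, 2, 3
wrong <- datalist %in% x

# Table is a list holding x: each list element is compared with x as a whole
right <- datalist %in% list(x)

wrong
right
which(right)   # positions of the elements that match x
```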

Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com




Re: [R] Compare each element of a list to a vector

2013-02-03 Thread Patrick Burns

My attempt similar to Jim's is:

which(sapply(datalist, function(z) all(z == x)))


However, a safer approach is:

which(sapply(datalist, function(z) isTRUE(all.equal(z, x))))

This latter approach avoids Circle 1 of 'The R Inferno'.
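
Two small cases, made up for illustration, show where all(z == x) and
isTRUE(all.equal(z, x)) disagree: floating-point rounding and silent
recycling of a shorter vector.

```r
# Case 1: floating-point rounding
x1 <- c(0.1 + 0.2, 1)
z1 <- c(0.3, 1)
exact1  <- all(z1 == x1)              # exact comparison of doubles
approx1 <- isTRUE(all.equal(z1, x1))  # comparison up to a small tolerance

# Case 2: different lengths, where == recycles the shorter vector
x2 <- c(1, 2, 3)
z2 <- c(1, 2, 3, 1, 2, 3)
exact2  <- all(z2 == x2)              # recycling hides the length mismatch
approx2 <- isTRUE(all.equal(z2, x2))  # all.equal reports the length difference

c(exact1 = exact1, approx1 = approx1, exact2 = exact2, approx2 = approx2)
```

In the first case the exact test misses a match that differs only by
rounding; in the second it reports a match that is not one.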

http://www.burns-stat.com/documents/books/the-r-inferno/

Pat








--
Patrick Burns
pbu...@pburns.seanet.com
twitter: @burnsstat @portfolioprobe
http://www.portfolioprobe.com/blog
http://www.burns-stat.com
(home of:
 'Impatient R'
 'The R Inferno'
 'Tao Te Programming')



Re: [R] Compare each element of a list to a vector

2013-02-03 Thread mtb954
Thanks Jim, William and Patrick for your ideas. I appreciate your help.
Avoiding a circle of the R Inferno sounds good, so I'm going to use
Patrick's 2nd suggestion for now but I learned something from the others
too.

Cheers, Mark





On Sun, Feb 3, 2013 at 12:33 PM, Patrick Burns pbu...@pburns.seanet.comwrote:

 My attempt similar to Jim's is:

 which(sapply(datalist, function(z) all(z == x)))


 However, a safer approach is:

 which(sapply(datalist, function(z) isTRUE(all.equal(z, x

 This latter approach avoids Circle 1 of 'The R Inferno'.

 http://www.burns-stat.com/**documents/books/the-r-inferno/http://www.burns-stat.com/documents/books/the-r-inferno/

 Pat



 On 03/02/2013 18:24, jim holtman wrote:

 try this:

  x-c(1,2,3)
 datalist-list(c(1,2,3),c(2,3,**4),c(3,4,5),c(4,5,6))

 result - sapply(datalist, function(.vec){

 + all(.vec == x)
 + })


 result

 [1]  TRUE FALSE FALSE FALSE




 On Sun, Feb 3, 2013 at 1:15 PM,  mtb...@gmail.com wrote:

 Hello R-helpers,

 I have a vector

 x-c(1,2,3)

 and a list that contains vectors

 datalist-list(c(1,2,3),c(2,3,**4),c(3,4,5),c(4,5,6))

 and I would like to identify those list elements that are identical to x.

 I tried

  datalist %in% x

 [1] FALSE FALSE FALSE FALSE

 but I am obviously using %in% incorrectly. I also tried messing around
 with
 lapply but I can't figure out how to specify the function within lapply.

 I would appreciate any suggestions you may have.

 Many thanks!

 Mark Na



[R] RandomForest, Party and Memory Management

2013-02-03 Thread Lorenzo Isella

Dear All,
For a data mining project, I am relying heavily on the RandomForest and  
Party packages.
Due to the large size of the data set, I often have memory problems (in  
particular with the Party package; RandomForest seems to use less memory).  
I really have two questions at this point
1) Please see how I am using the Party and RandomForest packages. Any  
comment is welcome and useful.




myparty <- cforest(SalePrice ~ ModelID +
                     ProductGroup +
                     ProductGroupDesc + MfgYear + saledate3 + saleday +
                     salemonth,
                   data = trainRF,
                   control = cforest_unbiased(mtry = 3, ntree = 300, trace = TRUE))




rf_model <- randomForest(SalePrice ~ ModelID +
                           ProductGroup +
                           ProductGroupDesc + MfgYear + saledate3 + saleday +
                           salemonth,
                         data = trainRF, na.action = na.omit,
                         importance = TRUE, do.trace = 100, mtry = 3, ntree = 300)

2) I have another question: sometimes R crashes after telling me that it  
is unable to allocate e.g. an array of 1.5 Gb.
However, I have 4Gb of ram on my box, so...technically the memory is  
there, but is there a way to enable R to use more of it?


Many thanks

Lorenzo



Re: [R] RandomForest, Party and Memory Management

2013-02-03 Thread Jeff Newmiller
Neither of your questions meets the Posting Guidelines (see footer of any 
email).
1) Not reproducible. [1]
2) Very operating-system specific and a FAQ. You have not indicated what your 
OS is (via sessionInfo), nor what reading you have done to address memory 
problems already (use a search engine... or begin with the FAQs in R help or on 
CRAN).

[1] 
http://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example
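
For anyone landing on this thread with the same allocation error, here is a minimal sketch of the information that is usually asked for first. It uses only base R; the vector `big` is a stand-in invented for this example and has nothing to do with Lorenzo's data.

```r
# Is this a 32-bit or a 64-bit build of R? (8-byte pointers mean 64-bit)
r_bits <- .Machine$sizeof.pointer * 8
r_bits

# Platform string, as also reported by sessionInfo()
R.version$platform

# Size of a single object, in megabytes
big <- numeric(1e6)          # hypothetical stand-in for a large object
size_mb <- as.numeric(object.size(big)) / 1024^2
size_mb

# Drop the object, then run the garbage collector
rm(big)
invisible(gc())
```

A 32-bit R process is limited by its address space no matter how much RAM is installed, so the first value is worth reporting along with sessionInfo().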
---
Jeff NewmillerThe .   .  Go Live...
DCN:jdnew...@dcn.davis.ca.usBasics: ##.#.   ##.#.  Live Go...
  Live:   OO#.. Dead: OO#..  Playing
Research Engineer (Solar/BatteriesO.O#.   #.O#.  with
/Software/Embedded Controllers)   .OO#.   .OO#.  rocks...1k
--- 
Sent from my phone. Please excuse my brevity.



[R] Using relaimpo or relimp with PLM

2013-02-03 Thread Richard Asturia
Dears,

Unfortunately, the packages relaimpo and relimp do not seem to work with
the plm function (plm package). Does anyone know of a workaround for these
incompatibilities, or at least have any ideas on that?

Thanks in advance!
Richard A.



Re: [R] Fractional logit in GLM?

2013-02-03 Thread Wensui Liu
glm() will handle fractional logit with some tweaks. Below is copied from
a Python example on my blog; however, you should be able to see the R code
in it.

In [12]: # Address the same type of model with R by Pyper

In [13]: import pyper as pr

In [14]: r = pr.R(use_pandas = True)

In [15]: r.r_data = data

In [16]: # Indirect Estimation of Discrete Dependent Variable Models

In [17]: r('data <- rbind(cbind(r_data, y = 1, wt = r_data$LEV_LT3),
cbind(r_data, y = 0, wt = 1 - r_data$LEV_LT3))')
Out[17]: 'try({data <- rbind(cbind(r_data, y = 1, wt =
r_data$LEV_LT3), cbind(r_data, y = 0, wt = 1 - r_data$LEV_LT3))})\n'

In [18]: r('mod <- glm(y ~ COLLAT1 + SIZE1 + PROF2 + LIQ + IND3A,
weights = wt, subset = (wt > 0), data = data, family = binomial)')
Out[18]: 'try({mod <- glm(y ~ COLLAT1 + SIZE1 + PROF2 + LIQ + IND3A,
weights = wt, subset = (wt > 0), data = data, family =
binomial)})\nWarning message:\nIn eval(expr, envir, enclos) :
non-integer #successes in a binomial glm!\n'

In [19]: print r('summary(mod)')
try({summary(mod)})

Call:
glm(formula = y ~ COLLAT1 + SIZE1 + PROF2 + LIQ + IND3A, family = binomial,
    data = data, weights = wt, subset = (wt > 0))

Deviance Residuals:
    Min       1Q   Median       3Q      Max
-1.0129  -0.4483  -0.3173  -0.1535   2.5379

Coefficients:
            Estimate Std. Error z value Pr(>|z|)
(Intercept) -7.24979    0.56734 -12.779  < 2e-16 ***
COLLAT1      1.23715    0.26012   4.756 1.97e-06 ***
SIZE1        0.35901    0.03746   9.584  < 2e-16 ***
PROF2       -3.14313    0.73895  -4.254 2.10e-05 ***
LIQ         -1.38249    0.35749  -3.867  0.00011 ***
IND3A        0.54658    0.14136   3.867  0.00011 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

(Dispersion parameter for binomial family taken to be 1)

Null deviance: 2692.0  on 5536  degrees of freedom
Residual deviance: 2456.4  on 5531  degrees of freedom
AIC: 1995.4

Number of Fisher Scoring iterations: 6
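
The same idea written directly in R, as a sketch on simulated data (the variables `frac` and `x` are invented for this example and stand in for LEV_LT3 and the regressors above). It stacks the data with weights as in the pyper calls, and compares the result with glm() called on the fraction itself using family = quasibinomial, which accepts a response in [0, 1] including exact 0 and 1.

```r
set.seed(1)
n <- 200
x <- rnorm(n)

# Simulated fractional response in [0, 1], with some exact 0s and 1s
frac <- plogis(-0.5 + 0.8 * x + rnorm(n, sd = 0.5))
frac[1:5] <- 0
frac[6:10] <- 1
dat <- data.frame(frac = frac, x = x)

# Stacked-data trick: each row appears once as a "success" with weight frac
# and once as a "failure" with weight 1 - frac
stacked <- rbind(cbind(dat, y = 1, wt = dat$frac),
                 cbind(dat, y = 0, wt = 1 - dat$frac))
mod_stacked <- suppressWarnings(   # hides the "non-integer #successes" warning
  glm(y ~ x, weights = wt, subset = wt > 0, data = stacked, family = binomial)
)

# Direct alternative: model the fraction with a quasi-binomial family
mod_direct <- glm(frac ~ x, data = dat, family = quasibinomial)

# The point estimates of the two fits should agree
all.equal(coef(mod_stacked), coef(mod_direct), tolerance = 1e-4)
```

The standard errors reported by the stacked binomial fit are not the robust ones usually quoted for fractional logit, so treat them with caution.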



On Sun, Feb 3, 2013 at 11:17 AM, Rachael Garrett
rachaeldgarr...@gmail.com wrote:

 Hi,

 Does anyone know of a function in R that can handle a fractional variable
 as the dependent variable?  The catch is that the function has to be
 inclusive of 0 and 1, which betareg() does not.

 It seems like GLM might be able to handle the fractional logit model, but
 I can't figure it out.  How do you format GLM to do so?

 Best,

 Rachael







-- 
==
WenSui Liu
Credit Risk Manager, 53 Bancorp
wensui@53.com
513-295-4370
==



[R] Adding complex new columns to data frame depending on existing column

2013-02-03 Thread Tom Oates
Hello

I have a data frame as below
V1   V2     V3     V4  V5 V6
chr1 18884  C      C   2  0
chr1 135419 TATACA T   2  0
chr1 332045 T      TTG 0  2
chr1 453838 T      TAC 2  0
chr1 567652 T      TG  1  0
chr1 602541 TTTA   T   2  0

on which I want to perform complex rearrangement such that:

if V3 is a string of length >1 (i.e. line 2) then I generate 2 new columns where
first new column = V2-1 & second new column = V2+(length of string in V3)+1

therefore, for line 2 output would look like:
chr1 135419 TATACA T   2  0 135418 135426

if length of string in V3 = 1 and V4 = string of length >=1 (i.e. line 1) then
first new column = V2 & second new column = V2+2

output for line 1 would be:
chr1 18884  C      C   2  0 18884  18886

I am not sure:
a) how to use R to substitute the length of the string in V3 with the
number representing this length
b) whether apply would be best to use here
Thanks



[R] package installation error in Mac OS X

2013-02-03 Thread londonphd
Hi, I installed R on Mac OS X and am trying to install a package. R is not
allowing me to install the meboot package. Below is the exact message I got
from R:

  installation of package ‘meboot’ had non-zero exit status
trying URL 'http://cran.ma.imperial.ac.uk/src/contrib/meboot_1.1-5.tar.gz'
Content type 'application/x-gzip' length 411681 bytes (402 Kb)
opened URL
==
downloaded 402 Kb

* installing *source* package ‘meboot’ ...
** package ‘meboot’ successfully unpacked and MD5 sums checked
** libs
*** arch - i386
sh: make: command not found
ERROR: compilation failed for package ‘meboot’
* removing ‘/Users/ravshonbek/Library/R/2.15/library/meboot’

The downloaded source packages are in

‘/private/var/folders/c7/jrjv78_x6f53l3sw_w715gk0gn/T/RtmphnbWao/downloaded_packages’

Can anyone please shed some light on it?
Thank you



--
View this message in context: 
http://r.789695.n4.nabble.com/package-installation-error-in-Mac-OS-X-tp4657422.html
Sent from the R help mailing list archive at Nabble.com.



Re: [R] cumulative sum by group and under some criteria

2013-02-03 Thread arun
Hi,

If you need to extract only the columns `m1` and `n1` which satisfy the 
condition.

 res2[,1:2][res2$cterm1_P1L<0.01 & res2$cterm1_P1L!=0,] 
#   m1 n1
#20  3  2
#21  3  2

#  If you wanted structure() as shown below for `d`, use dput(res2)
A.K.
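
To make the dput() remark concrete, here is a short sketch on a tiny made-up data frame (`df` is invented for this example, it is not res2). dput() prints R code that recreates the object, and evaluating that code gives the object back.

```r
# Tiny made-up data frame
df <- data.frame(m1 = c(2, 3), n1 = c(2, 2))

# dput() writes the structure(...) call that rebuilds the object
txt <- paste(capture.output(dput(df)), collapse = "\n")
cat(txt, "\n")

# Evaluating the printed code recreates the data frame
df2 <- eval(parse(text = txt))
all.equal(df, df2)
```

Pasting the dput() output into a message lets others rebuild the data exactly, which is what the structure() call for `d` does.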

- Original Message -
From: zjoanna2...@gmail.com zjoanna2...@gmail.com
To: smartpink...@yahoo.com
Cc: 
Sent: Sunday, February 3, 2013 3:58 PM
Subject: Re: cumulative sum by group and under some criteria

Hi,
Let me restate my questions. I need to get the m1 and n1 that satisfy some 
criteria, for example in this case, within each group, the maximum cterm1_P1L 
(the last row in this group) <0.01. I need to extract m1=3, n1=2; I only need 
m1, n1 in the row.

Also, how do I create the structure from the data.frame? I am new to R, and I 
need to change maxN and run the loop on different data. 
Thanks very much for your help!

quote author='arun kirshna'
HI,

I think this should be more correct:
maxN <- 9
c11 <- 0.2
c12 <- 0.2
p0L <- 0.05
p0H <- 0.05
p1L <- 0.20
p1H <- 0.20

d <- structure(list(m1 = c(2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2,
2, 2, 2, 2, 2, 2, 2, 2, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3),
    n1 = c(2, 2, 2, 2, 2, 2, 2, 2, 2, 3, 3, 3, 3, 3, 3, 3, 3,
    3, 3, 3, 3, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2), x1 = c(0,
    0, 0, 1, 1, 1, 2, 2, 2, 0, 0, 0, 0, 1, 1, 1, 1, 2, 2, 2,
    2, 0, 0, 0, 1, 1, 1, 2, 2, 2, 3, 3, 3), y1 = c(0, 1, 2, 0,
    1, 2, 0, 1, 2, 0, 1, 2, 3, 0, 1, 2, 3, 0, 1, 2, 3, 0, 1,
    2, 0, 1, 2, 0, 1, 2, 0, 1, 2), Fmm = c(0, 0, 0, 0.7, 0.59,
    0.64, 1, 1, 1, 0, 0, 0, 0, 0.63, 0.7, 0.74, 0.68, 1, 1, 1,
    1, 0, 0, 0, 0.62, 0.63, 0.6, 0.63, 0.6, 0.68, 1, 1, 1), Fnn = c(0,
    0.64, 1, 0, 0.51, 1, 0, 0.67, 1, 0, 0.62, 0.69, 1, 0, 0.54,
    0.62, 1, 0, 0.63, 0.73, 1, 0, 0.63, 1, 0, 0.7, 1, 0, 0.7,
    1, 0, 0.58, 1), Qm = c(1, 1, 1, 0.65, 0.45, 0.36, 0.5, 0.165,
    0, 1, 1, 1, 1, 0.685, 0.38, 0.32, 0.32, 0.5, 0.185, 0.135,
    0, 1, 1, 1, 0.69, 0.37, 0.4, 0.685, 0.4, 0.32, 0.5, 0.21,
    0), Qn = c(1, 0.36, 0, 0.65, 0.45, 0, 0.5, 0.165, 0, 1, 0.38,
    0.31, 0, 0.685, 0.38, 0.32, 0, 0.5, 0.185, 0.135, 0, 1, 0.37,
    0, 0.69, 0.3, 0, 0.685, 0.3, 0, 0.5, 0.21, 0), term1_p0 = c(0.81450625,
    0.0857375, 0.00225625, 0.0857375, 0.009025, 0.0002375, 0.00225625,
    0.0002375, 6.25e-06, 0.7737809375, 0.1221759375, 0.0064303124999,
    0.0001128125, 0.081450625, 0.012860625, 0.000676875, 1.1875e-05,
    0.0021434375, 0.0003384375, 1.78125e-05, 3.125e-07, 0.7737809375,
    0.081450625, 0.0021434375, 0.1221759375, 0.012860625, 0.0003384375,
    0.0064303124999, 0.000676875, 1.78125e-05, 0.0001128125,
    1.1875e-05, 3.125e-07), term1_p1 = c(0.4096, 0.2048, 0.0256,
    0.2048, 0.1024, 0.0128, 0.0256, 0.0128, 0.0016, 0.32768,
    0.24576, 0.06144, 0.00512, 0.16384, 0.12288, 0.03072, 0.00256,
    0.02048, 0.01536, 0.00384, 0.00032, 0.32768, 0.16384, 0.02048,
    0.24576, 0.12288, 0.01536, 0.06144, 0.03072, 0.00384, 0.00512,
    0.00256, 0.00032)), .Names = c("m1", "n1", "x1", "y1", "Fmm",
"Fnn", "Qm", "Qn", "term1_p0", "term1_p1"), row.names = c(NA,
33L), class = "data.frame")

library(zoo)
lst1 <- split(d,list(d$m1,d$n1))
res2 <- do.call(rbind,lapply(lst1[lapply(lst1,nrow)!=0],function(x){
x[,11:14] <- NA;
x[,11:12][x$Qm<=c11,] <- cumsum(x[,9:10][x$Qm<=c11,]);
x[,13:14][x$Qn<=c12,] <- cumsum(x[,9:10][x$Qn<=c12,]);
colnames(x)[11:14] <- c("cterm1_P0L","cterm1_P1L","cterm1_P0H","cterm1_P1H");
x1 <- na.locf(x);
x1[,11:14][is.na(x1[,11:14])] <- 0;
x1}))
row.names(res2) <- 1:nrow(res2)

 res2
 #  m1 n1 x1 y1  Fmm  Fnn    Qm    Qn     term1_p0 term1_p1   cterm1_P0L
cterm1_P1L   cterm1_P0H cterm1_P1H

#1   2  2  0  0 0.00 0.00 1.000 1.000 0.8145062500  0.40960 0.00  
 0.0 0.00    0.0
#2   2  2  0  1 0.00 0.64 1.000 0.360 0.0857375000  0.20480 0.00  
 0.0 0.00    0.0
#3   2  2  0  2 0.00 1.00 1.000 0.000 0.0022562500  0.02560 0.00  
 0.0 0.0022562500    0.02560
#4   2  2  1  0 0.70 0.00 0.650 0.650 0.0857375000  0.20480 0.00  
 0.0 0.0022562500    0.02560
#5   2  2  1  1 0.59 0.51 0.450 0.450 0.009025  0.10240 0.00  
 0.0 0.0022562500    0.02560
#6   2  2  1  2 0.64 1.00 0.360 0.000 0.0002375000  0.01280 0.00  
 0.0 0.0024937500    0.03840
#7   2  2  2  0 1.00 0.00 0.500 0.500 0.0022562500  0.02560 0.00  
 0.0 0.0024937500    0.03840
#8   2  2  2  1 1.00 0.67 0.165 0.165 0.0002375000  0.01280 0.0002375000  
 0.01280 0.0027312500    0.05120
#9   2  2  2  2 1.00 1.00 0.000 0.000 0.062500  0.00160 0.0002437500  
 0.01440 0.0027375000    0.05280
#10  3  2  0  0 0.00 0.00 1.000 1.000 0.7737809375  0.32768 0.00  
 0.0 0.00    0.0
#11  3  2  0  1 0.00 0.63 1.000 0.370 0.0814506250  0.16384 0.00  
 0.0 0.00    0.0
#12  3  2  0  2 0.00 1.00 1.000 0.000 0.0021434375  0.02048 0.00  
 0.0 0.0021434375    0.02048
#13  3  2  1  0 0.62 0.00 0.690 0.690 0.1221759375  0.24576 0.00  
 0.0 0.0021434375    0.02048

[R] ggplot2 plotting errorbars.

2013-02-03 Thread Pieter Coussement

Hi,
I'm using these lines of code:

dodge <- position_dodge(width=0.9)

ggplot(dfm,aes(x = X,y = value)) +
  geom_bar(aes(fill = variable), position=dodge, stat="identity") +
  geom_errorbar(aes(ymin=value-er, ymax=value+er),width=0.25,
                position=dodge,stat="identity")


to plot this data frame
  X variable  value    er
1 A       X4  58.74  9.44
2 B       X4  52.41 10.01
3 C       X4  95.52  4.88
4 A       X1  75.51  8.54
5 B       X1   0.73 23.20
6 C       X1  96.66  1.18
7 A       X5  76.70  9.60
8 B       X5   0.56 34.50
9 C       X5 100.58 10.87

result:

As you see the error bars are still very much wrongly positioned.
How do i solve this?

thanks for the help!


Re: [R] Fortan to R

2013-02-03 Thread Ben Bolker
eliza botto eliza_botto at hotmail.com writes:

 
 
 Dear UseRs,
 How can I connect my FTN95 Fortran compiler with R in Windows 7?

  Take a look at the R extensions manual
 http://cran.r-project.org/doc/manuals/R-exts.html
section 5 ...



Re: [R] package installation error in Mac OS X

2013-02-03 Thread Pascal Oettli

Hi,

Did you install the Xcode Developer Tools on your machine?

HTH,
Pascal
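
The line "sh: make: command not found" in the log means the build tools are not on the PATH, which is what the Xcode question is getting at. A quick way to check this from within R, as a sketch using only base R (nothing here is specific to meboot):

```r
# Look for the 'make' executable on the PATH; an empty string means "not found"
make_path <- unname(Sys.which("make"))
has_make <- nzchar(make_path)

if (has_make) {
  message("make found at: ", make_path)
} else {
  message("make not found; source packages that need compilation will fail to build")
}
```

If make is missing, installing the developer tools (or a binary build of the package, where one exists) is the usual way forward.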




Re: [R] ggplot2 plotting errorbars.

2013-02-03 Thread Rafael Robledo
Hi, it seems to be a problem with using aes() both in ggplot() and in geom_bar().

You could specify the fill property for your geom_bar in the ggplot
initialization, in order to avoid this issue
(you could also do the same thing for the ymin and ymax properties of
the errorbar :P), i.e.:

dodge <- position_dodge(width=0.9)

ggplot(dfm, aes(x=X, y=value, fill=variable, ymin=value-er, ymax=value+er)) +
  geom_bar(position=dodge, stat="identity") +
  geom_errorbar(position=dodge, width=0.25)

Hope it helps.
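
For completeness, a self-contained version of the suggestion above, with the data frame from the question typed in by hand. As far as I understand it, the error bars are misplaced because the errorbar layer does not see the fill grouping when fill is only set inside geom_bar(), so position_dodge() has nothing to dodge by. The requireNamespace() guard is an addition for this sketch so that it still runs where ggplot2 is not installed.

```r
# Data frame from the question, rebuilt by hand
dfm <- data.frame(
  X        = rep(c("A", "B", "C"), times = 3),
  variable = rep(c("X4", "X1", "X5"), each = 3),
  value    = c(58.74, 52.41, 95.52, 75.51, 0.73, 96.66, 76.70, 0.56, 100.58),
  er       = c(9.44, 10.01, 4.88, 8.54, 23.20, 1.18, 9.60, 34.50, 10.87)
)

if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  dodge <- position_dodge(width = 0.9)
  # fill is mapped in ggplot(), so both layers share the same grouping
  p <- ggplot(dfm, aes(x = X, y = value, fill = variable)) +
    geom_bar(stat = "identity", position = dodge) +
    geom_errorbar(aes(ymin = value - er, ymax = value + er),
                  width = 0.25, position = dodge)
  # print(p) draws the plot
}
```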

-- 
Rafael R.



Re: [R] Adding complex new columns to data frame depending on existing column

2013-02-03 Thread arun


Hi,

May be this helps:
df1 <- read.table(text="
V1   V2     V3     V4  V5 V6
chr1 18884  C      C   2  0
chr1 135419 TATACA T   2  0
chr1 332045 T      TTG 0  2
chr1 453838 T      TAC 2  0
chr1 567652 T      TG  1  0
chr1 602541 TTTA   T   2  0
",header=TRUE,sep="",stringsAsFactors=FALSE)
df1$newCol1 <- ifelse(nchar(df1$V3)>1, df1$V2-1,
                      ifelse(nchar(df1$V3)==1 & nchar(df1$V4)>=1, df1$V2, NA))
df1$newCol2 <- ifelse(nchar(df1$V3)>1, df1$V2+nchar(df1$V3)+1,
                      ifelse(nchar(df1$V3)==1 & nchar(df1$V4)>=1, df1$V2+2, NA))


df1
#    V1     V2     V3    V4 V5 V6 newCol1 newCol2
#1 chr1  18884      C C  2  0   18884   18886
#2 chr1 135419 TATACA     T  2  0  135418  135426
#3 chr1 332045      T   TTG  0  2  332045  332047
#4 chr1 453838      T   TAC  2  0  453838  453840
#5 chr1 567652      T    TG  1  0  567652  567654
#6 chr1 602541   TTTA     T  2  0  602540  602546
A.K.


[R] rJava works with 32-bit but not 64

2013-02-03 Thread Spencer Graves

Hello:


  rJava works for me under 32-bit but not under 64-bit R; see below.


  Suggestions?
  Thanks,
  Spencer


> library(rJava)
Error : .onLoad failed in loadNamespace() for 'rJava', details:
  call: stop("No CurrentVersion entry in '", key, "'! Try re-installing 
Java and make sure R and Java have matching architectures.")
  error: object 'key' not found
Error: package/namespace load failed for 'rJava'
> sessionInfo()
R version 2.15.2 (2012-10-26)
Platform: x86_64-w64-mingw32/x64 (64-bit)

locale:
[1] LC_COLLATE=English_United States.1252
[2] LC_CTYPE=English_United States.1252
[3] LC_MONETARY=English_United States.1252
[4] LC_NUMERIC=C
[5] LC_TIME=English_United States.1252

attached base packages:
[1] stats graphics  grDevices utils datasets  methods base


##


> library(rJava)
> sessionInfo()
R version 2.15.2 (2012-10-26)
Platform: i386-w64-mingw32/i386 (32-bit)

locale:
[1] LC_COLLATE=English_United States.1252
[2] LC_CTYPE=English_United States.1252
[3] LC_MONETARY=English_United States.1252
[4] LC_NUMERIC=C
[5] LC_TIME=English_United States.1252

attached base packages:
[1] stats graphics  grDevices utils datasets  methods base

other attached packages:
[1] rJava_0.9-3


--
Spencer Graves, PE, PhD
President and Chief Technology Officer
Structure Inspection and Monitoring, Inc.
751 Emerson Ct.
San José, CA 95126
ph:  408-655-4567
web:  www.structuremonitoring.com



[R] Wide character in print?

2013-02-03 Thread Spencer Graves

Hello:


  I get "Wide character in print" from trying 
read.xls("22_data.xls") in the gdata package, with 22_data.xls 
downloaded from Varieties_Country_A-E.xls at 
http://www.reinhartandrogoff.com/data/browse-by-topic/topics/7/:



> library(gdata)
> read.xls("22_data.xls")
Wide character in print at 
C:/Users/sgraves/pgms/R/R-2.15.2/library/gdata/perl/xls2csv.pl line 270.

> sessionInfo()
R version 2.15.2 (2012-10-26)
Platform: x86_64-w64-mingw32/x64 (64-bit)

locale:
[1] LC_COLLATE=English_United States.1252
[2] LC_CTYPE=English_United States.1252
[3] LC_MONETARY=English_United States.1252
[4] LC_NUMERIC=C
[5] LC_TIME=English_United States.1252

attached base packages:
[1] stats graphics  grDevices utils datasets  methods base

other attached packages:
[1] gdata_2.12.0

loaded via a namespace (and not attached):
[1] gtools_2.7.0


  I get the same message from xls2sep("22_data.xls").


  It's only a comment, so I suppose I could ignore it.  However, 
it's generated by a function I'm adding to the Ecdat package, and I'd 
rather find a way to avoid it.  (I suppose I could dump it to sink, but 
that's pretty extreme and could mask other problems.)



  Thanks,
  Spencer


--
Spencer Graves, PE, PhD
President and Chief Technology Officer
Structure Inspection and Monitoring, Inc.
751 Emerson Ct.
San José, CA 95126
ph:  408-655-4567
web:  www.structuremonitoring.com



Re: [R] rJava works with 32-bit but not 64

2013-02-03 Thread Pascal Oettli

Hello,

Do you have a 64-bit version of Java?

rJava says to you:
call: stop("No CurrentVersion entry in '", key, "'! Try re-installing 
Java and make sure R and Java have matching architectures.")


Regards,
Pascal
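
A short sketch of how to see which architecture the running R session has, so it can be compared with the installed Java. It uses only base R. Looking at JAVA_HOME is an assumption about how Java is advertised on the machine; the snippet does not query the Windows registry that the error message refers to.

```r
# Architecture of the running R session
r_arch <- R.version$arch
r_bits <- .Machine$sizeof.pointer * 8   # 32 or 64

# JAVA_HOME, if it is set at all (NA otherwise)
java_home <- Sys.getenv("JAVA_HOME", unset = NA)

cat("R architecture:", r_arch, "/", r_bits, "bit\n")
cat("JAVA_HOME:", if (is.na(java_home)) "<not set>" else java_home, "\n")
```

A 64-bit R session needs a 64-bit Java installation, and a 32-bit session a 32-bit one.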




[R] gettext weirdness

2013-02-03 Thread Florent Angly

Hi,

I am trying to use the gettext() function to translate some text. I have 
never used this function before, so, it's entirely possible that I am 
doing something wrong. The issue that I am encountering is that 
gettext() properly translates some text, but not some other.


Natural language support was compiled into my R (installed from the Debian 
repositories):

$ R
R version 2.15.1 (2012-06-22) -- "Roasted Marshmallows"
[...]
  Natural language support but running in an English locale
[...]
> q()

Here is some text that has some translation in the file ./po/fr.po:
#: src/main/errors.c:290
msgid "invalid option \"warning.expression\""
msgstr "option incorrecte \"warning.expression\""
[...]
#: src/main/errors.c:582
msgid "Error in "
msgstr "Erreur dans "

Start R in French and see if I can get something translated to French:
$ LANG=fr_FR.UTF8  R
> stop('This is an error')
Erreur : This is an error

> bindtextdomain("R") # does not seem necessary, but just to be safe...
[1] "/usr/share/R/share/locale"

> gettext("Error in ", domain="R")
[1] "Error in "

> "invalid option \"warning.expression\"" -> msg; gettext(msg, domain="R")
[1] "option incorrecte \"warning.expression\""


So, the stop() function successfully translates. I can also manually 
translate some entries, but why does it not work for 
gettext("Error in ", domain="R")?

Any idea?
Thanks

Florent
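
One thing that may be worth checking (an assumption, not something confirmed in this thread): the help page for gettext() says that leading and trailing whitespace is ignored when looking for the translation. If that applies here, "Error in " with its trailing space would be looked up as "Error in", which has no entry of its own in the catalogue, and gettext() would fall back to returning the input. The sketch below only shows those two pieces of behaviour in base R; the string `unknown` is made up.

```r
# What the message id looks like once surrounding whitespace is dropped
msg <- "Error in "
looked_up <- trimws(msg)
identical(looked_up, "Error in")

# gettext() hands back its input unchanged when no translation is found
unknown <- "a message id that is in no catalogue"   # made-up string
out <- gettext(unknown, domain = "R")
identical(out, unknown)
```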
