Re: [R] Discrepancies in the estimates of Partial least square (PLS) in SAS and R

2012-05-03 Thread Bjørn-Helge Mevik
rakeshnb rakeshn...@gmail.com writes:

 I have been using R and SAS from past 6 months and i found a interesting
 thing while doing PLS in R and SAS is that when we use NO SCALE option in
 SAS and scale=FALSE in R , we see the estimates are matching but if we use
 scaling option in SAS and R the estimates differ to greater extent , you can
 try with any data set we will get very different estimates while using the
 scaling option. can any one help me in this issue ?

My guess is that they use different scalings, which of course will give
different results.  However, since you don't say anything about which R
package you use for PLSR (and since I don't have access to SAS), I can
only guess. :)

-- 
Regards,
Bjørn-Helge Mevik

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Validation of logistic models in R 2.12

2012-05-03 Thread Dwaipayan Dasgupta
Hi everyone,
I am trying to validate a logistic model built in R. Not my version of R is 
2.12 and  I cannot install ROCR.
I have gone to a point where I have the predicted values using the code:

pred1 = predict(trainlogit1,testdata_1, type = response)

How do I proceed from here? Is there another way in which I can plot lift 
charts?
My model output is:
Call:
glm(formula = Attrition_ind ~ Time.in.com + UV_LTIA_Base + 
as.factor(new_hire_ind) +
as.factor(promotion_ind) + as.factor(Time.in.comp...5.years) +
as.factor(Change.in.Job.Code) + Positioning_num + 
as.factor(Below.Guideline) +
as.factor(Above.Guideline) + as.factor(Drop.in.Rating.in.2010) +
as.factor(Increase.in.Rating.in.2010) + as.factor(job_band),
family = binomial(link = logit), data = traindata_1)

Deviance Residuals:
Min   1Q   Median   3Q  Max
-0.9604  -0.4888  -0.4221  -0.3514   2.9813

Coefficients:
Estimate Std. Error z value Pr(|z|)
(Intercept)-1.425494   0.115476 -12.345   2e-16
Time.in.comp-0.022690   0.007753  -2.927 0.003425
UV_LTIA_Base   -0.960578   0.210247  -4.569 4.91e-06
as.factor(new_hire_ind)Y   -0.446913   0.145942  -3.062 0.002197
as.factor(promotion_ind)Y  -0.739080   0.195584  -3.779 0.000158
as.factor(Time.in.comp...5.years)Y  -0.248776   0.118014  -2.108 0.035029
as.factor(Change.in.Job.Code)Y -0.244962   0.118475  -2.068 0.038675
Positioning_num-1.987576   0.529700  -3.752 0.000175
as.factor(Below.Guideline)Y-0.622856   0.171635  -3.629 0.000285
as.factor(Above.Guideline)Y-0.446067   0.252602  -1.766 0.077414
as.factor(Drop.in.Rating.in.2010)Y  0.292605   0.174543   1.676 0.093659
as.factor(Increase.in.Rating.in.2010)Y -0.315004   0.101213  -3.112 0.001856
as.factor(job_band)40   0.391219   0.108038   3.621 0.000293
as.factor(job_band)45   1.228778   0.213261   5.762 8.32e-09
as.factor(job_band)50   1.603452   0.434487   3.690 0.000224
as.factor(job_band)60   2.578905   0.559087   4.613 3.97e-06

(Intercept)***
Time.in.comp   **
UV_LTIA_Base   ***
as.factor(new_hire_ind)Y   **
as.factor(promotion_ind)Y  ***
as.factor(Time.in.comp...5.years)Y  *
as.factor(Change.in.Job.Code)Y *
Positioning_num***
as.factor(Below.Guideline)Y***
as.factor(Above.Guideline)Y.
as.factor(Drop.in.Rating.in.2010)Y .
as.factor(Increase.in.Rating.in.2010)Y **
as.factor(job_band)40  ***
as.factor(job_band)45  ***
as.factor(job_band)50  ***
as.factor(job_band)60  ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

(Dispersion parameter for binomial family taken to be 1)

Null deviance: 4737.8  on 7473  degrees of freedom
Residual deviance: 4607.0  on 7458  degrees of freedom
  (250 observations deleted due to missingness)
AIC: 4639

Number of Fisher Scoring iterations: 5

Could you please help? How can I understand  if my model is a good fit or not?

Regards,
Doy


American Express made the following annotations on Thu May 03 2012 01:13:02 

** 

This message and any attachments are solely for the intended recipient and may 
contain confidential or privileged information. If you are not the intended 
recipient, any disclosure, copying, use, or distribution of the information 
included in this message and any attachments is prohibited. If you have 
received this communication in error, please notify us by reply e-mail and 
immediately and permanently delete this message and any attachments. Thank 
you. 

American Express a ajouté le commentaire suivant le Thu May 03 2012 01:13:02 

Ce courrier et toute pièce jointe qu'il contient sont réservés au seul 
destinataire indiqué et peuvent renfermer des renseignements confidentiels et 
privilégiés. Si vous n'êtes pas le destinataire prévu, toute divulgation, 
duplication, utilisation ou distribution du courrier ou de toute pièce jointe 
est interdite. Si vous avez reçu cette communication par erreur, veuillez nous 
en aviser par courrier et détruire immédiatement le courrier et les pièces 
jointes. Merci. 

** 
---


[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] take data from a file to another according to their correlation coefficient

2012-05-03 Thread jeff6868
Hi Rui it's me again.
I would have another question in the function process.all you explained
me. But as you already helped me a lot, and as I promised I won't disturb
you again, I want to ask you first if you accept to help me one more time
before telling you more precisely my problem (about adding an automatic
linear regression in order to have more realistic filling data in the gaps). 
I wrote you a personal message (don't know if you got it), because I would
like to send you a present from the Alps to thank you for all the help you
gave me, and maybe the new help (and so to have your home or work postal
address).
If you agree, let me know and send me your address by mail. I'll explain in
a new post what my boss wants me know to add in your function (this function
is so tricky for me to understand with my small knowledge + Google + R
help).

--
View this message in context: 
http://r.789695.n4.nabble.com/take-data-from-a-file-to-another-according-to-their-correlation-coefficient-tp4580054p4605385.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] warning with glm.predict, wrong number of data rows

2012-05-03 Thread carol white
Hi,
I split a data set into two partitions (80 and 42), use the first as the 
training set in glm and the second as testing set in glm predict. But when I 
call glm.predict, I get the warning message: 

Warning message:
'newdata' had 42 rows but variable(s) found have 80 rows 
-
 s = sample(1:122)
glm.my.data=glm(my.data.class[s[1:80]]~t(my.data)[s[1:80],1:60],family=binomial)
pred.my.data = 
predict(glm.gse13355,as.data.frame(t(my.data)[s[81:122],1:60]),type=response)
Warning message:
'newdata' had 42 rows but variable(s) found have 80 rows 

length(pred.my.data)
[1] 80

Thanks

Carol
[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] deparse(substitute(x)) on an object with S3 class

2012-05-03 Thread Remko Duursma
Dear list,


can someone explain to me why deparse(substitute(x)) does not seem to work
when x is of a user-defined S3 class?
In my actual problem, my print method is part of a package, and the method
is registered in the NAMESPACE, if that should make a difference.

 print.testclass - function(x,...){
  xname - deparse(substitute(x))
  cat(Your object name is,xname,\n)
}
 
testlist - list()
testlist[[1]] - 1:10
class(testlist) - testclass

# This does not work as expected: 
 testlist
Your object name is structure(list(1:10), class = testclass) 
 
# But this does :
 testlist2 - unclass(testlist)
 print.testclass(testlist2)
Your object name is testlist2



thanks,
Remko Duursma


--
View this message in context: 
http://r.789695.n4.nabble.com/deparse-substitute-x-on-an-object-with-S3-class-tp4605592.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] [R-pkgs] New version of the knitr package (0.5)

2012-05-03 Thread Yihui Xie
The knitr package version 0.5 is on CRAN now. It has gone through
extensive development in the past few months, and about 200 issues
were solved (https://github.com/yihui/knitr/issues) thanks to the
feedback of users, which greatly improved the quality and usefulness
of this package. For a complete list of changes, see
https://github.com/yihui/knitr/blob/master/NEWS

Most notable new features are:

- chunk options can be arbitrary valid R code, e.g. echo=!TRUE,
results=ifelse(x, 'asis', 'markup')=; this makes a document really
programmable, and the syntax is also consistent with normal R code;
http://yihui.name/knitr/demo/sweave/
- the listings package is supported via render_listings(), and the
styles are based on Sweavel.sty (courtesy of Frank Harrell);
http://yihui.name/knitr/demo/listings/
- for HTML/markdown documents, R plots can be automatically uploaded
to Imgur to make sure the output is self-contained (no need to copy
images when publishing the output);
http://yihui.name/knitr/demo/upload/
- arbitrary recursive references of code chunks with label (e.g.
chunk A can call chunk B which in turn calls C);
http://yihui.name/knitr/demo/reference/
- for cached chunks, their dependencies can be automatically figured
out by analyzing the R code for global and local variables (see the
'autodep' option)
- new chunk options fig.cap and fig.scap to create figure environments
with captions in LaTeX;
- Sweave concordance was implemented and integrated with RStudio so
that error navigation and PDF/Rnw sync become easy
- more comprehensive support to markdown and reStructuredText;
markdown is also well supported in RStudio (preview version); see
http://yihui.name/en/2012/05/how-to-make-html5-slides-with-knitr/
- other languages like Python and Awk can also be supported, e.g.
https://github.com/yihui/knitr/blob/master/inst/examples/knitr-lang.Rmd

Suggestions and questions are welcome; please join the mailing list
https://groups.google.com/group/knitr or file issues to Github.
Thanks!

Regards,
Yihui
--
Yihui Xie xieyi...@gmail.com
Phone: 515-294-2465 Web: http://yihui.name
Department of Statistics, Iowa State University
2215 Snedecor Hall, Ames, IA

___
R-packages mailing list
r-packa...@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-packages

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Discrepancies in the estimates of Partial least square (PLS) in SAS and R

2012-05-03 Thread rakeshnb
I am using pls package but how is scaling done in R?


--
View this message in context: 
http://r.789695.n4.nabble.com/Discrepancies-in-the-estimates-of-Partial-least-square-PLS-in-SAS-and-R-tp4605165p4605332.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Simple plot loop

2012-05-03 Thread Ben Neal
Trying to plot multiple lines from a simple matrix with headers (eight 
observations per column). I will be doing a number of these, with varying 
numbers of columns, and do not want to enter the header names for each one (I 
got frustrated and just wrote them out, which did work). 

Data reads fine, first plot is fine, but when i use the code at the bottom for 
a for i loop it tells me that x and y do not match. . . 

One other issue is that I would prefer not to specify the first column either, 
but when I enter Plot(MONTH, Data2[2] . . . it also does not plot. Should this 
not call the second column? 

Thank you very much for any comments. I know this is simple, but I appreciate 
any assistance. Cheers, Ben 

#

# LOAD DATA FROM CSV
library(zoo)
setwd(/Users/benjaminneal/Documents/All Panama/1110_Panama/CNAT_Segmenting)
Data = read.csv(120503_CNAT_Summary.csv, header=T)

#fill in missing data from last observation
Data2 - na.locf(Data)
attach(Data2)

# PLOT ALL ON ONE CHART
plot(MONTH,T102, type=o, ann=False, ylim=c(1, 100), pch=22, lty=2, 
col=red)
title(main=5m and 10 m Colpophylia natans colonies over time, ylab=% live 
coral / colony, 
  xlab=Months, col.main=black, font.main=4)

lines(MONTH,T162, type=o, pch=22, lty=2, col=red)
lines(MONTH,T231, type=o, pch=22, lty=2, col=green)
lines(MONTH,T250, type=o, pch=22, lty=2, col=green)

##(many other similar lines here, with entered column headers . . . up to 75)

lines(MONTH,T373, type=o, pch=22, lty=2, col=blue)
lines(MONTH,T374, type=o, pch=22, lty=2, col=blue)
lines(MONTH,T377, type=o, pch=22, lty=2, col=blue)

# Tried to add lines another way with for i loop, but this is the part not 
working
for (i in 2:length(Data2)) {
  lines(MONTH, i, type=o, pch=22, lty=2, col=blue)) 
} 
#

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] MLE for estimating the parameters of the TVECM in R

2012-05-03 Thread rizal yudha tama
Dear Mr. Matthieu Stigler

i so excited for your package 'tsDyn'.
firstly introduce myself, i student at Gadjah Mada University,Indonesia.
i'am new user of R and applying it for solving Bi-Variate ( interest rate
and inflation ) with threshold vector error correction model.
now, i writing my final examination about threshold vector error correction
model and i use refference from paper testing for two regime threshold
cointegration in vector error correction model
by hansen and seo (2002) to estimate parameter.
i have tried to reduce MLE , and it's succes. now i have A1(hat), A2(hat)
with MLE and gamma(hat), beta(hat) with grid search from MLE.
My problem, when i using function HanSeo_TVECM() in R, this function can't
running, only to estimate linier cointegration (VECM). and if i using
packade tsDyn version 0-8.1, function HanSeo_TVECM() not avaliable.
however there are function TVECM() but this function using CLS for estimate
parameter.
whether the MLE and CLS estimation would result in same relative values​​?
can u help me sir? for function HanSen_TVECM()?
thanks a lot


output R from function HanSeo_TVECM
 HanSeo_TVECM(data,lag=2,trim=
0.05,gn=300,bn=300)


###Linear VECM estimates (by Johansen MLE)


Cointegrating vector 1 -8.434287
Standard error of the cointegrating value 2.616888
Parameters
  ECMInterceptbi -1 inf -1  bi -2
Equation bi  -0.008627598 -0.006055985 0.758366 0.02512693 -0.1240975
Equation inf  0.131374112 -0.355994435 3.971587 0.20726565 -2.7661240
   inf -2
Equation bi  -0.007603223
Equation inf -0.224955971

 Standard errors with  Eicker-White covariance matrix estimator
 ECM  Intercept  bi -1 inf -1  bi -2
inf -2
Equation bi  0.004081985 0.01663758 0.08914615 0.02456267 0.08221155
0.01799844
Equation inf 0.034760850 0.08915086 1.66241171 0.19362012 0.93278372
0.06298056

Negative LL  -183.1256
AIC  -159.1256
BIC  -160.4877Error in solve.default(t(zzj) %*% zzj) :
  system is computationally singular: reciprocal condition number =
1.15007e-020

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] [R-pkgs] bcrm package update

2012-05-03 Thread Michael Sweeting

Dear all,

Version 0.3 of the bcrm package is now available on CRAN. The package 
allows users to fit Bayesian Continuous Reassessment Method Phase I 
trial designs. This version has the following new developments from 
version 0.1:


* Stopping rules have been added, allowing stopping to be based on a 
maximum sample size, the maximum number to be treated at the final MTD 
estimate, the precision of the MTD estimate, and a minumum sample size.
* Implementation of escalation based on posterior toxicity intervals 
using loss functions.
* Posterior summaries after each recruited cohort can now be plotted 
using the each argument of plot.bcrm.
* When simulating, operating characteristics are also now presented by 
true regions of toxicity risk.
* Simulations now run faster, as they use information from identical 
previous simulations to choose next dose. This is only implemented if 
nsims=1000, otherwise the computation time to search previous 
simulations becomes unmanageable.
* Plot and print commands now refer to actual dose labels when they have 
been given by the user
* Output from simulations can now be plotted as histograms using the 
function plot.bcrm.sim


Best wishes
Mike

--
---
Dr Michael Sweeting
MRC Biostatistics Unit
Institute of Public Health
Robinson Way
Cambridge
CB2 0SR

Tel: 01223 768257
Fax: 01223 760729

___
R-packages mailing list
r-packa...@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-packages

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] `mapply(function(x) function() x, c(a, b))$a()' returns `b'

2012-05-03 Thread Casper Ti. Vector
As the title says, I want to apply a function (which itself returns
a function) to a list (or vector), and get a list (or vector) of
generated functions as the return value, but get unexpected result.

Anyone with an idea about the reason of this phenomenon and a correct
way to implement the requirements? Thanks very much :)

-- 
Using GPG/PGP? Please get my current public key (ID: 0xAEF6A134, valid
from 2010 to 2013) from a key server.



signature.asc
Description: Digital signature
__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] uneven vector length issue with read.zoo?

2012-05-03 Thread Gabor Grothendieck
On Wed, May 2, 2012 at 3:55 PM, knavero knav...@gmail.com wrote:
 I truncated and simplified my code and the read in data that I'm working with
 to isolate the issue. Here is the read in data and R script respectively:

 http://r.789695.n4.nabble.com/file/n4604287/test.csv test.csv

 http://pastebin.com/rCdaDqPm

 Here is the terminal/R shell output that I hope the above replicates on your
 screen:
 source(elecLoad.r, echo = TRUE)

 #Load packages
 library(zoo)

 library(chron)

 #Initial assignments for format (fmt), timezone (TZ), and user
 #defined chron function (chr)
 fmt = %m/%d/%y %I:%M %p

 TZ = PDT

 chr = function(x) as.chron(x, fmt)

 #Read in data as zoo object using relevant arguments in read.zoo()
 #for details of arguments, see Kevin Navero or see ?read.zoo
 #and ?read.table  [TRUNCATED]
 Error in read.zoo(http://dl.dropbox.com/u/41922443/test.csv;, skip = 1,  :
  index has bad entries at data rows: 14 15 16 17 18 19 20 21 22 23 24 25 26
 27 28

 I was hoping that the NULL in colClasses() would've taken care of this
 uneven vector length issue, however, that was not the case. Any ideas?
 Thanks in advance. Sorry if my post didn't follow the forum rules exactly. I
 tried to make small scale reproducible code and what not. I'm still a bit of
 a noob here and there.


Try this using the same library statements, fmt and chr from ijn post:

URL - http://dl.dropbox.com/u/41922443/test.csv;
DF1 - read.table(URL, skip = 1, header = TRUE, sep = ,, fill = TRUE,
  as.is = TRUE)
DF2 - na.omit(DF1[1:2])
z - read.zoo(DF2, FUN = chr)



-- 
Statistics  Software Consulting
GKX Group, GKX Associates Inc.
tel: 1-877-GKX-GROUP
email: ggrothendieck at gmail.com

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Simple plot loop

2012-05-03 Thread Jim Lemon

On 05/03/2012 05:50 PM, Ben Neal wrote:

Trying to plot multiple lines from a simple matrix with headers (eight 
observations per column). I will be doing a number of these, with varying 
numbers of columns, and do not want to enter the header names for each one (I 
got frustrated and just wrote them out, which did work).

Data reads fine, first plot is fine, but when i use the code at the bottom for 
a for i loop it tells me that x and y do not match. . .

One other issue is that I would prefer not to specify the first column either, 
but when I enter Plot(MONTH, Data2[2] . . . it also does not plot. Should this 
not call the second column?

Thank you very much for any comments. I know this is simple, but I appreciate 
any assistance. Cheers, Ben

#

# LOAD DATA FROM CSV
library(zoo)
setwd(/Users/benjaminneal/Documents/All Panama/1110_Panama/CNAT_Segmenting)
Data = read.csv(120503_CNAT_Summary.csv, header=T)

#fill in missing data from last observation
Data2- na.locf(Data)
attach(Data2)

# PLOT ALL ON ONE CHART
plot(MONTH,T102, type=o, ann=False, ylim=c(1, 100), pch=22, lty=2, 
col=red)
title(main=5m and 10 m Colpophylia natans colonies over time, ylab=% live coral / 
colony,
   xlab=Months, col.main=black, font.main=4)

lines(MONTH,T162, type=o, pch=22, lty=2, col=red)
lines(MONTH,T231, type=o, pch=22, lty=2, col=green)
lines(MONTH,T250, type=o, pch=22, lty=2, col=green)

##(many other similar lines here, with entered column headers . . . up to 75)

lines(MONTH,T373, type=o, pch=22, lty=2, col=blue)
lines(MONTH,T374, type=o, pch=22, lty=2, col=blue)
lines(MONTH,T377, type=o, pch=22, lty=2, col=blue)

# Tried to add lines another way with for i loop, but this is the part not 
working
for (i in 2:length(Data2)) {
   lines(MONTH, i, type=o, pch=22, lty=2, col=blue))
}
#


Hi Ben,
I think what you may want in your loop is this:

for(column in names(Data2)[2:length(Data2)])
 lines(MONTH,column,type=o,pch=22,lty=2,col=blue)

But, if you want the first two lines to be green, you'll probably have 
to get a vector of colors:


colorvec-rep(blue,length(Data2))
colorvec[1]-red
colorvec[2:3]-green

and change the above to:

columnnames-names(Data2)
for(column in 2:length(Data2))
 lines(MONTH,columnnames[column],type=o,pch=22,lty=2,col=colvec[column])

Jim

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] select month data in ts objects

2012-05-03 Thread Gabor Grothendieck
On Wed, May 2, 2012 at 4:27 PM, S. Georgakarakos strat...@aegean.gr wrote:
 In a time series ts object, like the z1.ts below:

 z1 = array(1:235)

 z1.ts = ts(z1, frequency =12)

 I would like to select only a certain month, for instance the February
 data

 If I transform the data to a matrix, I have the problem that 235 is not
 a multiple of 12

 I do not like to cut or add data, or program a loop to pick out the
 correct data.

 I am wondering if exist an easier way to select month data in a ts object.


Use cycle:

tt - ts(1:235, frequency =12)
ts(tt[cycle(tt) == 2])


-- 
Statistics  Software Consulting
GKX Group, GKX Associates Inc.
tel: 1-877-GKX-GROUP
email: ggrothendieck at gmail.com

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Help with getting values from string

2012-05-03 Thread renu.s7
Hi All,

I have a doubt. I used macros and i try to pass a value to a macro by
concatenating a bunch of strings. But it does not seem to work. Please help.
I have written down my code and the error message please tell me how to pass
the value that a string points to. Thanks in advance

#macro defined
machist_occ_kgfs-defmacro(a,qnu_occ,b,qnl_occ,expr={with(subset(an_ind_data_fin,income_source==a
 region_id==b  normalised_incomeqnl_occ 
normalised_incomeqnu_occ),hist(normalised_income,main=paste(a,b,sep=
)))})
#macro called
machist_occ_kgfs(occ,paste(qnu,ri,occ,collapse=,sep=),ri,paste(qnl,ri,occ,collapse=,sep=))

Error in hist.default(normalised_income, main = paste(occ, ri, sep =  ), 
: 
  hist.default: pretty() error, breaks=
In addition: Warning messages:
1: In min(x) : no non-missing arguments to min; returning Inf
2: In max(x) : no non-missing arguments to max; returning -Inf

The thing is paste(qnu,ri,occ,collapse=,sep=) returns the value
qnu1Business__Others but the variable - qnu1Business__Others contains an
integer value which i need to be passed on to the macro. Hope i have made
myself clear. Thanks in advance for your help

--
View this message in context: 
http://r.789695.n4.nabble.com/Help-with-getting-values-from-string-tp4605632.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] braking a label in two lines when using expression()

2012-05-03 Thread Beatriz De Francisco
I making an xyplot and the y label is too long and needs to be in two rows, but 
when I brake it there is a huge gap between the last text string and the 
expression, and I can't get rid of it. Any ideas?

Data:
structure(list(Temp = c(8L, 8L, 8L, 8L, 8L, 8L, 12L, 12L, 12L,
12L, 12L, 12L), CO2 = c(380L, 380L, 380L, 750L, 750L, 750L, 380L,
380L, 380L, 750L, 750L, 750L), Treat = structure(c(3L, 3L, 3L,
4L, 4L, 4L, 1L, 1L, 1L, 2L, 2L, 2L), .Label = c(12-380, 12-750,
8-380, 8-750), class = factor), Week = c(1L, 3L, 8L, 1L,
3L, 8L, 1L, 3L, 8L, 1L, 3L, 8L), Mean.Rate = c(2.125909389, 1.905870003,
1.417687602, 3.110439984, 2.31043989, 1.849232493, 2.546747098,
3.290235064, 3.000717599, 2.694901409, 3.852590547, 2.964084249
), lower = c(1.846641409, 1.44072624, 1.185304427, 2.56408099,
2.02644683, 1.606374443, 2.253928482, 2.759177284, 2.49014747,
2.168437604, 3.075977559, 2.438453415), upper = c(2.405177369,
2.371013766, 1.650070777, 3.656798978, 2.59443295, 2.092090543,
2.839565714, 3.821292844, 3.511287728, 3.221365214, 4.629203535,
3.489715083), fTemp = structure(c(1L, 1L, 1L, 1L, 1L, 1L, 2L,
2L, 2L, 2L, 2L, 2L), .Label = c(8, 12), class = factor),
fCO2 = structure(c(1L, 1L, 1L, 2L, 2L, 2L, 1L, 1L, 1L, 2L,
2L, 2L), .Label = c(380, 750), class = factor), fTreat = 
structure(c(1L,
1L, 1L, 2L, 2L, 2L, 3L, 3L, 3L, 4L, 4L, 4L), .Label = c(8-380,
8-750, 12-380, 12-750), class = c(ordered, factor
)), fWeek = structure(c(1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L,
1L, 2L, 3L), .Label = c(1, 3, 8), class = factor)), .Names = 
c(Temp,
CO2, Treat, Week, Mean.Rate, lower, upper, fTemp,
fCO2, fTreat, fWeek), row.names = c(NA, -12L), class = data.frame)

xyplot(cbind(Mean.Rate,lower,upper)~fWeek|fTreat,
   resp.week.mean.rate,
   as.table=TRUE,
   xlab=Week,
   ylab=expression(Mean Reapisration Rate
(umol.*L^-1*.g (AFDM)^-1*)),
 scales=list(alternating=FALSE,
 tick.number=10,
 tck=c(-1,0)),
   layout=c(4,1),
   ylim=1:5,
   auto.key=list(title=Treatment,
 lines=TRUE,
 cex.title=1,
 columns=2),
   panel=function(x, y,...){
 panel.errbars(x,y,make.grid=none,ewidth=0.2,type=p,...)
 
panel.loess(x[resp.week.mean.rate$Treat==8-380],y[resp.week.mean.rate$Treat==8-380],span
 = 5, degree = 1,lwd=2,...)
 
panel.loess(x[resp.week.mean.rate$Treat==8-750],y[resp.week.mean.rate$Treat==8-750],span
 = 5, degree = 1,lwd=2,...);
 
panel.loess(x[resp.week.mean.rate$Treat==12-380],y[resp.week.mean.rate$Treat==12-380],span
 = 5, degree = 1,lwd=2,...);
 
panel.loess(x[resp.week.mean.rate$Treat==12-750],y[resp.week.mean.rate$Treat==12-750],span
 = 5, degree = 1,lwd=2,...)
   }
   )

Beatriz de Francisco Mora
PhD Student
The Scottish Association for Marine Science
Scottish Marine Institute
Oban
PA37 1QA
Tel: 06131 559000 (switchboard)
Fax: 01631559001
E. beatriz.defranci...@sams.ac.ukmailto:beatriz.defranci...@sams.ac.uk
http://www.smi.ac.uk/beatriz-de-franciso

The Scottish Association for Marine Science (SAMS) is registered in Scotland as 
a Company Limited by Guarantee (SC009292) and is a registered charity (9206). 
SAMS has an actively trading wholly owned subsidiary company: SAMS Research 
Services Ltd a Limited Company (SC224404). All Companies in the group are 
registered in Scotland and share a registered office at Scottish Marine 
Institute, Oban Argyll PA37 1QA. The content of this message may contain 
personal views which are not the views of SAMS unless specifically stated. 
Please note that all email traffic is monitored for purposes of security and 
spam filtering. As such individual emails may be examined in more detail.

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] [R-sig-hpc] Quickest way to make a large empty file on disk?

2012-05-03 Thread Jens Oehlschlägel

   Jonathan,
   On some filesystems (e.g. NTFS, see below) it is possible to create 'sparse'
   memory-mapped files, i.e. reserving the space without the cost of actually
   writing initial values.
   Package 'ff' does this automatically and also allows to access the file in
   parallel.  Check  the  example  below and see how big file creation is
   immediate.
   Jens Oehlschlägel
library(ff)
library(snowfall)
ncpus - 2
n - 1e8
system.time(
   + x - ff(vmode=double, length=n, filename=c:/Temp/x.ff)
   + )
  User  System verstrichen
  0.010.000.02
# check finalizer, with an explicit filename we should have a 'close'
   finalizer
finalizer(x)
   [1] close
# if not, set it to 'close' inorder to not let slaves delete x on slave
   shutdown
finalizer(x) - close
sfInit(parallel=TRUE, cpus=ncpus, type=SOCK)
   R Version:  R version 2.15.0 (2012-03-30)
   snowfall 1.84 initialized (using snow 0.3-9): parallel execution on 2 CPUs.
sfLibrary(ff)
   Library ff loaded.
   Library ff loaded in cluster.
   Warnmeldung:
   In library(package = ff, character.only = TRUE, pos = 2, warn.conflicts =
   TRUE,  :
 'keep.source' is deprecated and will be ignored
sfExport(x) # note: do not export the same ff multiple times
# explicitely opening avoids a gc problem
sfClusterEval(open(x, caching=mmeachflush)) # opening with 'mmeachflush'
   inststead of 'mmnoflush' is a bit slower but prevents OS write storms when
   the file is larger than RAM
   [[1]]
   [1] TRUE
   [[2]]
   [1] TRUE
system.time(
   + sfLapply( chunk(x, length=ncpus), function(i){
   +   x[i] - runif(sum(i))
   +   invisible()
   + })
   + )
  User  System verstrichen
  0.000.00   30.78
system.time(
   + s - sfLapply( chunk(x, length=ncpus), function(i) quantile(x[i], c(0.05,
   0.95)) )
   + )
  User  System verstrichen
  0.000.004.38
# for completeness
sfClusterEval(close(x))
   [[1]]
   [1] TRUE
   [[2]]
   [1] TRUE
csummary(s)
5%  95%
   Min.0.04998 0.95
   1st Qu. 0.04999 0.95
   Median  0.05001 0.95
   Mean0.05001 0.95
   3rd Qu. 0.05002 0.95
   Max.0.05003 0.95
# stop slaves
sfStop()
   Stopping cluster
 # with the close finalizer we are responsible for deleting the file
   explicitely (unless we want to keep it)
delete(x)
   [1] TRUE
# remove r-side metadata
rm(x)
# truly free memory
gc()
   Gesendet: Donnerstag, 03. Mai 2012 um 00:23 Uhr
   Von: Jonathan Greenberg j...@illinois.edu
   An: r-help r-help@r-project.org, r-sig-...@r-project.org
   Betreff: [R-sig-hpc] Quickest way to make a large empty file on disk?
   R-helpers:
   What would be the absolute fastest way to make a large empty file (e.g.
   filled with all zeroes) on disk, given a byte size and a given number
   number of empty values. I know I can use writeBin, but the object in
   this case may be far too large to store in main memory. I'm asking because
   I'm going to use this file in conjunction with mmap to do parallel writes
   to this file. Say, I want to create a blank file of 10,000 floating point
   numbers.
   Thanks!
   --j
   --
   Jonathan A. Greenberg, PhD
   Assistant Professor
   Department of Geography and Geographic Information Science
   University of Illinois at Urbana-Champaign
   607 South Mathews Avenue, MC 150
   Urbana, IL 61801
   Phone: 415-763-5476
   AIM: jgrn307, MSN: jgrn...@hotmail.com, Gchat: jgrn307, Skype: jgrn3007
   [1]http://www.geog.illinois.edu/people/JonathanGreenberg.html
   [[alternative HTML version deleted]]
   ___
   R-sig-hpc mailing list
   r-sig-...@r-project.org
   [2]https://stat.ethz.ch/mailman/listinfo/r-sig-hpc

References

   1. http://www.geog.illinois.edu/people/JonathanGreenberg.html
   2. https://stat.ethz.ch/mailman/listinfo/r-sig-hpc
__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] R help!

2012-05-03 Thread Nutter, Benjamin
So long as x is a character vector, I tend to use the following for this 
problem.

 x - c(12/31/11 23:45, 1/1/12 2:15)
 
 x.split - strsplit(x,  )
 
 x.date - sapply(x.split, function(y) return(y[1]))
 x.time - sapply(x.split, function(y) if (length(y)  1) return(y[2]) else NA)
 
 x.date
[1] 12/31/11 1/1/12  
 x.time
[1] 23:45 2:15 


  Benjamin Nutter |  Biostatistician     |  Quantitative Health Sciences
  Cleveland Clinic    |  9500 Euclid Ave.  |  Cleveland, OH 44195  | (216) 
445-1365


-Original Message-
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On 
Behalf Of Alex Roth
Sent: Wednesday, May 02, 2012 4:01 PM
To: r-help@r-project.org
Subject: [R] R help!

Hello there, I was wondering if you could help me with a quick R issue.

I have a data set where one of the columns has both date and time in it, e.g. 
12/31/11 23:45 in one cell. I want to use R to split this column into two new 
columns: date and time.

One of the problems with splitting here is that when the dates go into single 
digits there are no 0's in front of months January-September (e.g., January is 
represented by 1 as opposed to 01), so every entry is a different length. 
Therefore, splitting by the space is the only option, I think.

Here's the coding I've developed thus far:

z$dt - z$Date#time and date is all under z$Date
foo - strsplit( , z$dt) #attempted split based on the space

And then if that were to work, I would proceed use the coding:

foo2 - matrix(unlist(foo), ncol = 2, byrow=TRUE) z$Date - foo[ ,1] z$Time - 
foo[ ,2]

However, foo - strsplit( , z$dt) isn't working. Do you know what the problem 
is? If you could respond soon, that would be greatly appreciated!

Thanks so much!
Alex

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


===

 Please consider the environment before printing this e-mail

Cleveland Clinic is ranked one of the top hospitals
in America by U.S.News  World Report (2010).  
Visit us online at http://www.clevelandclinic.org for
a complete listing of our services, staff and
locations.


Confidentiality Note:  This message is intended for use\...{{dropped:13}}

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Problem with 'nls' fitting logistic model (5PL)

2012-05-03 Thread Michal Figurski

Bert,

Thank you for your thoughts.

I can assure you I have plotted the data back and forth many times, and 
that overfitting has nothing to do with it. This is not a _statistical_ 
problem, but a _technical_ problem. Something that works well in ANY 
reliable statistical software doesn't work in R with 'nls'.


This just makes me think that 'nls' is a dysfunctional piece of junk 
that can't handle even a very simple problem. Not to mention that 
'predict()' for 'nls' doesn't give you any sort of confidence interval.


I wonder if there's another package in R that could be used to fit 
5P-logistic model. Any idea?


In the end I may attempt to write a fitting function myself, but that 
would be the last resort.


--
Michal J. Figurski, PhD
HUP, Pathology  Laboratory Medicine
Biomarker Research Laboratory
3400 Spruce St. 7 Maloney S
Philadelphia, PA 19104
tel. (215) 662-3413


On 5/2/2012 3:47 PM, Bert Gunter wrote:

Plot the data. You're clearly overfitting.

(If you don't know what this means or why it causes the problems you
see, try a statistical help list or consult your local statistician).

-- Bert

On Wed, May 2, 2012 at 12:32 PM, Michal Figurski
figur...@mail.med.upenn.edu  wrote:

Dear R-Helpers,

I'm working with immunoassay data and 5PL logistic model. I wanted to
experiment with different forms of weighting and parameter selection, which
is not possible in instrument software, so I turned to R.

I am using R 2.14.2 under Win7 64bit, and the 'nls' library to fit the model
- I started with the same model and weighting type (1/y) as in the
instrument to see if I'll get similar results. However, in some instances I
don't get any results - just errors.

Here is an example calibration data, representative of my experiment.
Instrument soft had no problem fitting it:
x- structure(list(SPL = structure(c(1L, 1L, 2L, 2L, 3L, 3L, 4L,
4L, 5L, 5L, 6L, 6L, 7L, 7L), .Label = c(St1, St2, St3,
St4, St5, St6, St7), class = factor), MFI = c(10755.5,
9839, 5142.5, 4857, 1510.5, 1505, 502.5, 451, 215, 195.5, 58,
57, 15, 15), nom = c(206, 206, 125, 125, 68, 68, 38, 38, 24,
24, 13, 13, 6.5, 6.5), weights = c(0.0013946353028683, 0.00152454517735542,
0.00291686922702965, 0.00308832612723904, 0.0099304865938431,
0.00996677740863787, 0.0298507462686567, 0.0332594235033259,
0.0697674418604651, 0.0767263427109974, 0.258620689655172,
0.263157894736842,
1, 1)), .Names = c(SPL, MFI, nom, weights), row.names = c(NA,
-14L), class = data.frame)

And here is the nls fit:
fit- nls(MFI~a + b/((1+(nom/c)^d)^f), data=x, weights=x$weights,
start=c(a=100, b=1, c=100, d=-1, f=1))

I've tried every possible combination of starting values, including the
values fitted by the instrument soft - to no avail. I've probably seen all
possible error messages from 'nls' trying to fit this.

If anyone has an idea why it's not working - let me know.

Best regards,

--
Michal J. Figurski, PhD
HUP, Pathology  Laboratory Medicine
Biomarker Research Laboratory
3400 Spruce St. 7 Maloney S
Philadelphia, PA 19104
tel. (215) 662-3413

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.






__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Two ecdf with log-scales

2012-05-03 Thread Steven Wolf
I've done it this way before:

eX - ecdf(distribution 1)
eY - ecdf(distribution 2)
par(mar=c(5,5,2,1),xlog=TRUE)
plot(eX, do.points=FALSE, verticals=TRUE, col=black, xlab=xlabel,
xlim=c(1,10), ylab=ylabel, 
lty=1, cex.lab=1.5, cex.axis=1.5, main=,
lwd=3,log=x)
plot(eY, do.points=FALSE, verticals=TRUE, col=blue, add=TRUE,
xlim=c(1,10), main=)

Warning:  It makes a stair-step that may be difficult to see unless you use
color.  I had to change how the ecdf was plotted when I made b/w figures for
my publication so that different dashed lines were distinct.

HTH,
-Steve

-Original Message-
From: David Winsemius [mailto:dwinsem...@comcast.net] 
Sent: Wednesday, May 02, 2012 10:17 AM
To: Johannes Radinger
Cc: R-help@r-project.org
Subject: Re: [R] Two ecdf with log-scales


On May 2, 2012, at 6:14 AM, Johannes Radinger wrote:

 Hi,

 i want to plot empirical cumulative density functions for two 
 variables in one plot. For better visualizing the differences in the 
 two cumulative curves I'd like to log-scale the axis.

 So far I found 3 possible functions to plot ecdf:

 1) ecdf() from the package 'stats'. I don't know how to successfully 
 set the log.scales? Combining two plots is not a problem:

 plot(ecdf(x1))
 lines(ecdf(x2),col.h=red)

 2) gx.ecdf() from package 'rgr'. It is easily possible to plot log- 
 scales, but I don't know how to plot two densities?

 gx.ecdf(x1,log=TRUE,ifqs = TRUE)

 3) Ecdf() from package 'Hmisc'. No log-option directly available and 
 here I also don't know how to 'stack' two plots...

 Ecdf(x1,what=F)


 Probably there are many more solutions (e.g. ggplot etc.)...
 ...Has anyone faced a similar task and found a simple solution? Any 
 suggestions are welcome!

Have you searched the Archives? I seem to remember that the log(0) was a
barrier to persons attempting this in the past. (ISTR a posting in the last
few weeks.)  Maybe you could also provide a test data object that has the
same range as your x1 and x 2 variables.

 and provide commented, minimal, self-contained, reproducible code.

-- 

David Winsemius, MD
West Hartford, CT

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] R help!

2012-05-03 Thread peter dalgaard

On May 3, 2012, at 14:05 , Nutter, Benjamin wrote:

 So long as x is a character vector, I tend to use the following for this 
 problem.
 
 x - c(12/31/11 23:45, 1/1/12 2:15)
 
 x.split - strsplit(x,  )
 
 x.date - sapply(x.split, function(y) return(y[1]))
 x.time - sapply(x.split, function(y) if (length(y)  1) return(y[2]) else 
 NA)
 
 x.date
 [1] 12/31/11 1/1/12  
 x.time
 [1] 23:45 2:15 
 
 
   Benjamin Nutter |  Biostatistician |  Quantitative Health Sciences
   Cleveland Clinic|  9500 Euclid Ave.  |  Cleveland, OH 44195  | (216) 
 445-1365
 


I think this can be simplified to

 sapply(x.split,`[`,1)
[1] 12/31/11 1/1/12  
 sapply(x.split,`[`,2)
[1] 23:45 2:15 

It's a bit inefficient, though. Other ideas:

 sub( .*$, , x)
[1] 12/31/11 1/1/12  
 sub(^.* , , x)
[1] 23:45 2:15 
 read.table(text=x)
V1V2
1 12/31/11 23:45
2   1/1/12  2:15


-- 
Peter Dalgaard, Professor
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Email: pd@cbs.dk  Priv: pda...@gmail.com

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Creating a survival object with and without an event indicator

2012-05-03 Thread wwreith
When reading about surv(). I saw the following statement Although unusual,
the event indicator can be omitted, in which case all subjects are assumed
to have an event. 

So I tried the following

1. survobj-surv(mydata$Time) vs.  2. survobj-surv(mydata$Time,
mydata$Event) 

where mydata$Event is a column of all ones. I did not get the same answer
when I ran survreg(survobj~shift, data=mydata) on each. The second case
actually failed to converge. I then tried a column of all zeros just to make
sure I wasn't misreading something but still got the warning message failed
to converge. 

Can anyone explain what exactly R is doing  for case 1?

--
View this message in context: 
http://r.789695.n4.nabble.com/Creating-a-survival-object-with-and-without-an-event-indicator-tp4605945.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] estimation problem

2012-05-03 Thread Kehl Dániel

Dear List-members,

I have a problem where I have to estimate a mean, or a sum of a 
population but for some reason it contains a huge amount of zeros.

I cannot give real data but I constructed a toy example as follows

N1 - 10
N2 - 3000
x1 - rep(0,N1)
x2 - rnorm(N2,300,100)
x - c(x1,x2)

n - 1000

x_sample - sample(x,n,replace=FALSE)

I want to estimate the sum of x based on x_sample (not knowing N1 and N2 
but their sum (N) only).
The sample mean has a huge standard deviation I am looking for a better 
estimator.
I was thinking about trimmed (or left trimmed as my numbers are all 
positive) means or something similar,

but if I calculate trimmed mean I do not know N2 to multiply with.

Do you have any idea or could you give me some insight?

Thanks a lot:
Daniel

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] DEADLINE FAST APPROACHING for useR! 2012

2012-05-03 Thread Frank Harrell

DEADLINE FAST APPROACHING – 8th Annual International R User Conference
useR! 2012, Nashville, Tennessee USA

Registration Deadlines:
Early Registration: Passed
Regular Registration: Mar 1- May 12
Late Registration: May 13 – June 4
On-Site Registration: June 12 – June 15

Please note: Nashville is offering several large entertainment events 
the month of June, and hotels are quickly selling out.  It's imperative 
that you make your hotel accommodations for the conference as soon as 
possible.


Please join us at the 8th Annual International R User Conference useR! 
2012 in Nashville, Tennessee.  A draft program schedule, and other 
conference details are available. Please visit

http://biostat.mc.vanderbilt.edu/UseR-2012

--
Frank E Harrell Jr Professor and Chairman  School of Medicine
   Department of Biostatistics Vanderbilt University

___
r-annou...@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-announce

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] R help!

2012-05-03 Thread Petr PIKAL
Hi

I would convert it to propper date format and then you can extract 
anything.

dat-strptime(12/31/11 23:45, format=%m/%d/%y %H:%M)
as.Date(dat)
[1] 2011-12-31
format(dat, %H:%M)
[1] 23:45

Regards
Petr
 

 
 Hello there, I was wondering if you could help me with a quick R issue.
 
 I have a data set where one of the columns has both date and time in
 it, e.g. 12/31/11 23:45 in one cell. I want to use R to split this
 column into two new columns: date and time.
 
 One of the problems with splitting here is that when the dates go into
 single digits there are no 0's in front of months January-September
 (e.g., January is represented by 1 as opposed to 01), so every entry
 is a different length. Therefore, splitting by the space is the only
 option, I think.
 
 Here's the coding I've developed thus far:
 
 z$dt - z$Date#time and date is all under z$Date
 foo - strsplit( , z$dt) #attempted split based on the space
 
 And then if that were to work, I would proceed use the coding:
 
 foo2 - matrix(unlist(foo), ncol = 2, byrow=TRUE)
 z$Date - foo[ ,1]
 z$Time - foo[ ,2]
 
 However, foo - strsplit( , z$dt) isn't working. Do you know what
 the problem is? If you could respond soon, that would be greatly
 appreciated!
 
 Thanks so much!
 Alex
 
 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide 
http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Searching for values

2012-05-03 Thread barb
Hey YaoPau,

what you need are basically three things. If, for and matrix-operations. 

So a quick overview, how you could do it. (With focus on the idea, not 
how the columns and rows are arranged. 


# 1) Building the Vectors with c()

column1-c(p103,p186,p145,p187,p197)
column2-c(p666,p77,p198,p674,p302)

# 1) Building the vector with numbers and letters 
# you can use paste here (cat is also a useful function
# for similar purposes). Seq is obv a sequence. 
# Sep= tells R to not split the letter and numbers
# try sep=  for the difference

longvector-(paste(p,seq(1:700),sep=))

## rep can be used to create an empty vetor, not very
## elegant though

emptyvector-rep(0,length(longvector))

## cbind for creating a matrix out of two vetors
matrix-cbind(longvector,emptyvector)


 Now the interesting part. More difficult to explain
## Focus on the middle term. If term of column 1 equals
## term in the matrix function you want to insert 1.
### Same thing for a-1 below.
### no you want to perform this for all 700 rows = [length(longvector)]
### and for all values in column1 = [length(column1)]
## you will use for here. 


for (i in 1:length(longvector)){
  for (j in 1: length(column1)){
if (column1[j]==matrix[,1][i]){
  matrix[,2][i]=1
}  
if (column2[j]==matrix[,1][i]){
  matrix[,2][i]=a-1
}
  }  
}

--
View this message in context: 
http://r.789695.n4.nabble.com/Searching-for-values-tp4605213p4605722.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] How to read ANOVA output

2012-05-03 Thread kydaviddoyle
Hi,

Take a look at http://www.khanacademy.org/
Look near the bottom of the page and there is a whole section on statistics
and 3 videos on ANOVA.  It is a very good introduction and I find it very
useful to know how the test works so that I'm not using a Black Box

Also I wrote up a little section on ANOVA that you can see at:
https://sites.google.com/site/davidsstatistics/using-r/anova-nonparameteric

Take Care
David

-
Take Care

David Doyle
--
View this message in context: 
http://r.789695.n4.nabble.com/How-to-read-ANOVA-output-tp2329457p4605879.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] problem with running probit

2012-05-03 Thread pie'
Hi,

I am having problems with running a probit regression and don't understand
where the problem comes from since with the original data set I was able to
get correct estimates. To that data set I have added extra variables and
upon running the regression I get now multiple estimates of the same
predictor:

Deviance Residuals: 
 Min1QMedian3Q   Max  
-1.17741  -0.1   0.0   0.1   1.17741  

Coefficients:
  Estimate Std. Error z value Pr(|z|)
(Intercept) -6.400e+00  1.070e+04  -0.0011.000
shares3m103,493 -6.220e-07  1.513e+04   0.0001.000
shares3m105,188 -6.220e-07  1.513e+04   0.0001.000
shares3m106,743 -6.220e-07  1.513e+04   0.0001.000
shares3m107,809 -6.220e-07  1.513e+04   0.0001.000
shares3m107,844 -6.220e-07  1.513e+04   0.0001.000
shares3m110,902 -6.220e-07  1.513e+04   0.0001.000
shares3m111,112 -6.220e-07  1.513e+04   0.0001.000
shares3m113,173 -6.220e-07  1.513e+04   0.0001.000
shares3m114,798 -6.220e-07  1.513e+04   0.0001.000
shares3m117,454 -6.220e-07  1.513e+04   0.0001.000
shares3m119,446 -6.220e-07  1.513e+04   0.0001.000
shares3m123,500 -6.220e-07  1.513e+04   0.0001.000
shares3m126,366 -6.220e-07  1.513e+04   0.0001.000
shares3m126,383 -6.216e-07  1.513e+04   0.0001.000
shares3m126,436 -6.220e-07  1.513e+04   0.0001.000
shares3m128,043 -6.220e-07  1.513e+04   0.0001.000
shares3m131,975 -6.220e-07  1.513e+04   0.0001.000
shares3m132,254 -6.220e-07  1.513e+04   0.0001.000
shares3m133,984 -6.220e-07  1.513e+04   0.0001.000

.
..

Anyone knows what the problem is?



--
View this message in context: 
http://r.789695.n4.nabble.com/problem-with-running-probit-tp4605742.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] problems with lme function

2012-05-03 Thread gaiarrido
I try to fit a repeated measures model, in which:
FA: The difference in the number of scales between the left and right sides
for a given trait
trait: the each of the measured features are 6 for each individual
Condition: is the state of condition of each individual
Individual: Each of the specimens studied

By adjusting the model happens to me the following

 lm.FA-lme(FA~rasgo*condicion,random=~1|individuo)
Error en na.fail.default(list(FA = c(-5, -5, -4, -3, -3, -3, -3, -3, -3,  : 
  missing values in object

I thought it might be because there are values ​​of FA are negative, and
introduced into the model the absolute value of FA, but still the same

 absFA-abs(FA)
 lm.FA-lme(absFA~rasgo*condicion,random=~1|individuo)
Error en na.fail.default(list(absFA = c(5, 5, 4, 3, 3, 3, 3, 3, 3, 3,  : 
  missing values in object

What I'm doing wrong?

Thank you very much in advance for your attention


-
Mario Garrido Escudero
PhD student
Dpto. de Biología Animal, Ecología, Parasitología, Edafología y Qca. Agrícola
Universidad de Salamanca
--
View this message in context: 
http://r.789695.n4.nabble.com/problems-with-lme-function-tp4605998.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Very small random effect estimation in lmer but not in stata xtmixed

2012-05-03 Thread Mohammed Mohammed
Hi folks

I am using the lmer function (in the lme4 library) to analyse some data where 
individuals are clustered into sets (using the SetID variable) with a single 
fixed effect (cc - 0 or 1). The lmer model and output is shown below.
Whilst the fixed effects are consistent with stata (using xtmixed, see below), 
the std dev of the random effect for SetID is very very small 
(3.5803e-05)compared to stata's (see below 1.002). Any ideas why this should be 
happening please?


LMER MODEL

summary(lmer(AnxietyScore ~ cc + (1|SetID), data=mydf))
Linear mixed model fit by REML
Formula: AnxietyScore ~ cc + (1 | SetID)
   Data: mydf
   AIC   BIC logLik deviance REMLdev
493.4 503.4 -242.7486.6   485.4
Random effects:
Groups   NameVariance   Std.Dev.
 SetID(Intercept) 1.2819e-09 3.5803e-05
Residual 1.3352e+01 3.6540e+00
Number of obs: 90, groups: SetID, 33

Fixed effects:
Estimate Std. Error t value
(Intercept)   3.1064 0.5330   5.828
cc2.3122 0.7711   2.999

Correlation of Fixed Effects:
   (Intr)
cc -0.691


STATA XTMIXED

xtmixed anxietyscore cc || setid:, reml

Mixed-effects REML regression   Number of obs  =90
Group variable: setid   Number of groups   =33
Log restricted-likelihood = -242.48259  Prob  chi2=0.0023
--
anxietyscore |  Coef.   Std. Err.  zP|z| [95% Conf. Interval]
-+
  cc |   2.289007   .7492766 3.05   0.002 .82045193.757562
   _cons |   3.116074   .5464282 5.70   0.000 2.0450944.187053
--
--
  Random-effects Parameters  |   Estimate   Std. Err. [95% Conf. Interval]
-+
setid: Identity  |
   sd(_cons) |   1.002484.797775  .21071374.769382
-+
sd(Residual) |   3.515888   .3281988  2.928045 4.22175
--


with thanks  best wishes
M

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] conducting GAM-GEE within gamm4?

2012-05-03 Thread Nathan Furey
Dear R-help users,

I am trying to analyze some visual transect data of organisms to generate a
habitat distribution model. Once organisms are sighted, they are followed
as point data is collected at a given time interval.  Because of the
autocorrelation among these follows, I wish to utilize a GAM-GEE approach
similar to that of Pirotta et al. 2011, using packages 'yags' and 'splines'
(http://www.int-res.com/abstracts/meps/v436/p257-272/).  Their R scripts
are shown here (
http://www.int-res.com/articles/suppl/m436p257_supp/m436p257_supp1-code.r).
I have used this code with limited success and multiple issues of models
failing to converge.


Below is the structure of my data:

 str(dat2)

'data.frame':   10792 obs. of  4 variables:
 $ dist_slag   : num  26475 26340 25886 25400 24934 ...
 $ Depth   : num  -10.1 -10.5 -16.6 -22.2 -29.7 ...
 $ dolphin_presence: int  0 0 0 0 0 0 0 0 0 0 ...


 $ block   : int  1 1 1 1 1 1 1 1 1 1 ...

 head(dat2)

  dist_slagDepth dolphin_presence block
1  26475.47 -10.09340 1
2  26340.47 -10.48700 1
3  25886.33 -16.57520 1
4  25399.88 -22.24740 1


5  24934.29 -29.67970 1
6  24519.90 -26.23700 1


Here is the summary of my block variable (indicating the number of
groups for which autocorrelation exists within each block


 summary(dat2$block)
   Min. 1st Qu.  MedianMean 3rd Qu.Max.
   1.00   39.00   76.00   73.52  111.00  148.00


However, I would like to use the package 'gamm4', as I am more familiar
with Professor Simon Wood's packages and functions, and it appears gamm4
might be the most appropriate. It is important to note that the models have
a binary response (organism presence of absence along a transect), and thus
why I think gamm4 is more appropriate than gamm.  In the gamm help, it
provides the following example for autocorrelation within factors:

## more complicated autocorrelation example - AR errors
## only within groups defined by `fac'
e - rnorm(n,0,sig)
for (i in 2:n) e[i] - 0.6*e[i-1]*(fac[i-1]==fac[i]) + e[i]
y - f + e
b - gamm(y~s(x,k=20),correlation=corAR1(form=~1|fac))

Following this example, the following is the code I used for my dataset

b - gamm4(dolphin_presence~s(dist_slag)+s(Depth),random=(form=~1|block),
family=binomial(),data=dat)


However, by examining the output (summary(b$gam)) and specifically
summary(b$mer)), I am either unsure of how to interpret the results,
or do not believe that the autocorrelation within the group is being
taken into consideration.


 summary(b$gam)

Family: binomial
Link function: logit

Formula:
dolphin_presence ~ s(dist_slag) + s(Depth)

Parametric coefficients:

Estimate Std. Error z value Pr(|z|)
(Intercept)  -13.968  5.145  -2.715  0.00663 **
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Approximate significance of smooth terms:

   edf Ref.df Chi.sq  p-value
s(dist_slag) 4.943  4.943  70.67 6.85e-14 ***
s(Depth) 6.869  6.869 115.59   2e-16 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

R-sq.(adj) =  0.317glmer.ML score =  10504  Scale est. = 1 n = 10792


 summary(b$mer)
Generalized linear mixed model fit by the Laplace approximation

   AIC   BIC logLik deviance
 10514 10551  -525210504
Random effects:
 Groups Name Variance Std.Dev.
 Xr s(dist_slag) 1611344  1269.39
 Xr.0   s(Depth)   98622   314.04
Number of obs: 10792, groups: Xr, 8; Xr.0, 8

Fixed effects:
 Estimate Std. Error z value Pr(|z|)
X(Intercept)  -13.968  5.145  -2.715  0.00663 **
Xs(dist_slag)Fx1  -35.871 33.944  -1.057  0.29063
Xs(Depth)Fx13.971  3.740   1.062  0.28823

---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Correlation of Fixed Effects:
X(Int) X(_)F1
Xs(dst_s)F1  0.654
Xs(Dpth)Fx1 -0.030  0.000



How do I ensure that autocorrelation is indeed being accounted for
within each unique value of the block variable? What is the simplest
way to interpret the output for summary(b$mer)?

The results do differ from a normal gam (package mgcv) using the same
variables and parameters without the correlation=... term,
indicating that *something *different is occurring.

However, when I use a different variable for the correlation term
(season), I get the SAME output:

 dat2 - data.frame(dist_slag = dat$dist_slag, Depth = dat$Depth, 
 dolphin_presence = dat$dolphin_presence,
+ block = dat$block, season=dat$season)
 head(dat2)
  dist_slagDepth dolphin_presence block season
1  26475.47 -10.09340 1  F
2  26340.47 -10.48700 1  F
3  25886.33 -16.57520 1  F
4  25399.88 -22.24740 1  F
5  24934.29 -29.67970 1  F
6  24519.90 -26.23700 1  F

 summary(dat2$season)
   FS
3224 7568


 

Re: [R] `mapply(function(x) function() x, c(a, b))$a()' returns `b'

2012-05-03 Thread Jessica Streicher
As i see it you will save the actual text of the function - and when you call 
it later on it takes the last value of x it has encountered as the value. I 
guess you want the x not to be saved as x, but as a or b, so, as its value.

I am not sure how to do that however as of yet.


Am 03.05.2012 um 12:31 schrieb Casper Ti. Vector:

 As the title says, I want to apply a function (which itself returns
 a function) to a list (or vector), and get a list (or vector) of
 generated functions as the return value, but get unexpected result.
 
 Anyone with an idea about the reason of this phenomenon and a correct
 way to implement the requirements? Thanks very much :)
 
 -- 
 Using GPG/PGP? Please get my current public key (ID: 0xAEF6A134, valid
 from 2010 to 2013) from a key server.
 
 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] interactive loop

2012-05-03 Thread Ondřej Mikula
Dear Michael and Sarah,
the superfluous points arose as an error (e.g. double-click) in the
measurement process. Thus, looking on the image I recognize them
easily and everything what I need is to write their numbers.
readline() serves well for that purpose. Thanks a lot!
Ondřej



On 2 May 2012 18:32, Sarah Goslee sarah.gos...@gmail.com wrote:
 And now we have two entirely different interpretations of the question.

 I think Ondřej needs to provide a more detailed explanation of the
 problem and intended result.

 Sarah

 On Wed, May 2, 2012 at 12:23 PM, R. Michael Weylandt
 michael.weyla...@gmail.com michael.weyla...@gmail.com wrote:
 I think readline() will do what you want. It can display a message and take 
 user input, assigning it to a character value so you might need as.numeric()

 Michael

 On May 2, 2012, at 12:08 PM, Ondřej Mikula onmik...@gmail.com wrote:

 Dear R-helpers,
 I have a number of point configurations representing skull shapes, but
 some of them contain superfluous points. I want to write a loop in
 which each configuration is plotted and I am asked to write the
 numbers of points that are superfluous. However, I don't know how to
 introduce this interactive element.
 Would you give me an advice?
 Best regards
 Ondřej Mikula




 --
 Sarah Goslee
 http://www.functionaldiversity.org



-- 
Ondřej Mikula

Institute of Animal Physiology and Genetics
Academy of Sciences of the Czech Republic
Veveri 97, 60200 Brno, Czech Republic

Institute of Vertebrate Biology
Academy of Sciences of the Czech Republic
Studenec 122, 67502 Konesin, Czech Republic

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] add an automatized linear regression in a function

2012-05-03 Thread jeff6868
Dear R users,

For the moment, I have a script and a function which calculates correlation
matrices between all my data files. Then, it chooses the best correlation
for each data and take it in order to fill missing data in the analysed file
(so the data from the best correlation file is put automatically into the
missing data gaps of the first file (because my files are containing missing
values (NAs))). If the best correlated file doesn't contain data , it takes
the data from the second best correlated file. 
The problem is that for the moment, it takes raw data from the best
correlated file. 

So I need to adapt this raw data to the file that is going to be filled. As
a consequence, I'd like to automatize the calculation of a linear regression
(after the selection of the best or the second best correlated data file)
between the two files.
Instead of taking the raw data from the best correlated file to fill the
first one, it should take the estimated data from the regression to fill it
(in order to have more precise filled data). 
The idea is so to do an lm() between these two files, to extract the
coefficients of the straight line (from the regression) and to calculate the
estimated data for all my file (NA included), and finally to fill the gaps
with this estimated data. Hope you've understand my problem.
Here's the function:

process.all - function(df.list, mat){
f - function(station)
 na.fill(df.list[[ station ]], df.list[[ max.cor[station] ]])
 
g - function(station){
x - df.list[[station]]
if(any(is.na(x$data))){
mat[row(mat) == col(mat)] - -Inf
nas - which(is.na(x$data))
ord - order(mat[station, ], decreasing = TRUE)[-c(1,
ncol(mat))]
for(i in nas){
for(y in ord){
if(!is.na(df.list[[y]]$data[i])){
x$data[i] - df.list[[y]]$data[i]
break
}
}
}
}
x
}

n - length(df.list)
nms - names(df.list)
max.cor - sapply(seq.int(n), get.max.cor, corhiver2008capt1)
df.list - lapply(seq.int(n), f)
df.list - lapply(seq.int(n), g) 
names(df.list) - nms
df.list
}

I succeded for a small data.frame I've created, but I don't know how to do
it in this particular case.
Thanks a lot for your help!


--
View this message in context: 
http://r.789695.n4.nabble.com/add-an-automatized-linear-regression-in-a-function-tp4606047.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] error fitting coxph model

2012-05-03 Thread David Winsemius


On May 2, 2012, at 3:02 PM, Jessica Myers wrote:


Hi,

I am using coxph from the survival package to fit a large model  
(100,000 observations, ~35 covariates) using both ridge regression  
(on binary covariates) and penalized splines (for continuous  
covariates).


In fitting, I get a strange error:

Error in if (abs((y[nx] - target)/(y[nx - 1] - target))  0.6)  
doing.well - FALSE else doing.well - TRUE :

 missing value where TRUE/FALSE needed


You appear to have missing values in 'y', 'nx', or `target`. This  
could be a case of the newbie error of using if(){}else{} when  
ifelse() should have been used. Without at least the code we probably  
cannot resolve those two possibilities.


Unfortunately, I can't reproduce this error without handing over my  
entire dataset,


Why not? If the removal of missing values is unsuccessful and you were  
not committing the error I mentioned, then run it on two halves. Pick  
the one with the error. Rinse, lather, repeat.


but I thought it would be worth checking if anyone had any insight.   
I should note that the outcome that I'm using has almost everyone  
having an event (~98,000 events out of 100,000).  I have fit other  
models like this with no problem, but on one particular dataset it  
fails.


Thanks!

Jessica Myers
Instructor in Medicine


David Winsemius, MD
West Hartford, CT

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] `mapply(function(x) function() x, c(a, b))$a()' returns `b'

2012-05-03 Thread Jessica Streicher
Now.. i just tried around and this might be a bit strange way to do things..

createFunc-function(v){
v_out-NULL
for(i in v){
v_out[[i]]-substitute(function(){x},list(x=i))
}
return(v_out)
}

 y-createFunc(c(a,b))
 y
$a
function() {
a
}

$b
function() {
b
}

 eval(y$a)()
[1] a
 eval(y$b)()
[1] b

Am 03.05.2012 um 12:31 schrieb Casper Ti. Vector:

 As the title says, I want to apply a function (which itself returns
 a function) to a list (or vector), and get a list (or vector) of
 generated functions as the return value, but get unexpected result.
 
 Anyone with an idea about the reason of this phenomenon and a correct
 way to implement the requirements? Thanks very much :)
 
 -- 
 Using GPG/PGP? Please get my current public key (ID: 0xAEF6A134, valid
 from 2010 to 2013) from a key server.
 
 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.


[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] deparse(substitute(x)) on an object with S3 class

2012-05-03 Thread R. Michael Weylandt
Note that print.testclass(testlist) also works as expected so it might
be there's a forced evaluation somewhere inside S3 dispatch -- I was
playing around with this the other day, having gotten myself confused
by it and stumbled across the internal logic (or at least something
that made sense to me), though I seem to have forgotten it again.
Hopefully someone else can clarify.

Michael

On Thu, May 3, 2012 at 6:15 AM, Remko Duursma remkoduur...@gmail.com wrote:
 Dear list,


 can someone explain to me why deparse(substitute(x)) does not seem to work
 when x is of a user-defined S3 class?
 In my actual problem, my print method is part of a package, and the method
 is registered in the NAMESPACE, if that should make a difference.

 print.testclass - function(x,...){
  xname - deparse(substitute(x))
  cat(Your object name is,xname,\n)
 }

 testlist - list()
 testlist[[1]] - 1:10
 class(testlist) - testclass

 # This does not work as expected:
 testlist
 Your object name is structure(list(1:10), class = testclass)

 # But this does :
 testlist2 - unclass(testlist)
 print.testclass(testlist2)
 Your object name is testlist2



 thanks,
 Remko Duursma


 --
 View this message in context: 
 http://r.789695.n4.nabble.com/deparse-substitute-x-on-an-object-with-S3-class-tp4605592.html
 Sent from the R help mailing list archive at Nabble.com.

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] `mapply(function(x) function() x, c(a, b))$a()' returns `b'

2012-05-03 Thread Jessica Streicher
So, to get back to mapply:

eval(mapply(function(x) substitute(function() z,list(z=x)), c(a, b))$a)()
 or like this:
mapply(function(x) eval(substitute(function(i) z*i,list(z=x))), c(2,3))[[1]](2)



Am 03.05.2012 um 16:02 schrieb Jessica Streicher:

 Now.. i just tried around and this might be a bit strange way to do things..
 
 createFunc-function(v){
   v_out-NULL
   for(i in v){
   v_out[[i]]-substitute(function(){x},list(x=i))
   }
   return(v_out)
 }
 
 y-createFunc(c(a,b))
 y
 $a
 function() {
a
 }
 
 $b
 function() {
b
 }
 
 eval(y$a)()
 [1] a
 eval(y$b)()
 [1] b
 
 Am 03.05.2012 um 12:31 schrieb Casper Ti. Vector:
 
 As the title says, I want to apply a function (which itself returns
 a function) to a list (or vector), and get a list (or vector) of
 generated functions as the return value, but get unexpected result.
 
 Anyone with an idea about the reason of this phenomenon and a correct
 way to implement the requirements? Thanks very much :)
 
 -- 
 Using GPG/PGP? Please get my current public key (ID: 0xAEF6A134, valid
 from 2010 to 2013) from a key server.
 
 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.
 
 
   [[alternative HTML version deleted]]
 
 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.


[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] error in La.svd Lapack routine 'dgesdd'

2012-05-03 Thread Millo Giovanni

Dear Philipp,

this is just a tentative answer because debugging is really not possible
without a reproducible example (or, at a very bare minimum, the output
from traceback()).
Anyway, thank you for reporting this interesting numerical issue; I'll
try to replicate some similar behaviour on a similarly dimensioned
artificial dataset when I have some time (which might not be soon). As
for now, please see below my remarks with '##', I hope they are useful
anyway. Bottom line: time fixed effects might be out of place here.

Best wishes,
Giovanni

Giovanni Millo, PhD
Research Dept.,
Assicurazioni Generali SpA
Via Machiavelli 4,
34132 Trieste (Italy)
tel. +39 040 671184
fax  +39 040 671160

- original message 

Message: 8
Date: Wed, 2 May 2012 05:45:47 -0700 (PDT)
From: Philipp Grueber philipp.grue...@ebs.edu
To: r-help@r-project.org
Subject: Re: [R] error in La.svd Lapack routine 'dgesdd'
Message-ID: 1335962747113-4603097.p...@n4.nabble.com
Content-Type: text/plain; charset=UTF-8

Dear R Users,

I have an unbalanced panel with (on average) approx. 100 individuals
over
1370 time intervals (with individual time series of different lengths,
varying between 60 and 1370 time intervals). I use the following model:

res1-plm(x~c+d+e,data=pdata_frame, effect=twoways,
model=within,
na.action=na.omit))

## I have difficulty in understanding why you would want to introduce
ca. 1470 incidental parameters... I'd rather go with a more parsimonious
specification: a trend, AR(n) or else...

I repeatedly get the following error (which has been discussed in the
past):

Error in La.svd(x, nu, nv) : error code 1 from Lapack routine
?dgesdd?

I found it hard to create a reproducible example. As noted by Douglas
Bates,
the error might be related to the scaling of the matrix. 

## Too difficult for me to tell without output, references etc.,
although of course I trust D.B.'s opinion. 

For variables x,c,d,and e in object pdata_frame, I find that all sd()
are
reasonably similar both among the cross-sections as well as among the
variables. However, I find that extracting the demeaned data from plm(),
variables demXt$d and demXt$e (i.e. the demeaned variables) have sd()s
that
are very small compared to those of dem_yt and demXt$c (approx. by
factor
1e-15). I extract the demeaned data as follows:

dem_yt-pmodel.response(res) 
demXt-model.matrix(res)

How is this possible? What is it that plm() does with my data so that
the
standard deviations change? 

## it demeans them... (although the scale of the reduction is
impressive, yet you're estimating out 1500 constants!)

I suspect effect=twoways to play a central role because plm() works
fine
for effect=individual. 

## sure, also because individual 'just' introduces 100 more
parameters.

I thought about the idea that maybe, time-effects
simply do not apply here. 

## You know your model. Yet time effects on T=1300 seems hazardous to
me.

However: In order to test my regression for
time-effects (which I detect for subsamples (by time) and for equation
x~e
at high levels of significance), I need both the model with and and the
model without time effects (as otherwise, I can't compare the two models
in
an F-test), right? Any alternative tests? 

## please see ?plmtest

Another thought was that the impact of d and e changes over time (as in
the
subsamples I do see such a change).

Any help is appreciated!

## HTH, G.

Best wishes,
Philipp Grueber

-

EBS Universitaet fuer Wirtschaft und Recht
FARE Department
Wiesbaden/ Germany
http://www.ebs.edu/index.php?id=finaccL=0
--
View this message in context:
http://r.789695.n4.nabble.com/error-in-La-svd-Lapack-routine-dgesdd-tp21
25052p4603097.html
Sent from the R help mailing list archive at Nabble.com.


 
Ai sensi del D.Lgs. 196/2003 si precisa che le informazi...{{dropped:12}}

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Modified Cholesky decomposition for sparse matrices

2012-05-03 Thread Michael Braun
I am trying to estimate a covariance matrix from the Hessian of a posterior 
mode.  However, this Hessian is indefinite (possibly because of 
numerical/roundoff issues), and thus, the Cholesky decomposition does not 
exist.  So, I want to use a modified Cholesky algorithm to estimate a Cholesky 
of a pseudovariance that is reasonably close to the original matrix.  I know 
that there are R packages that contain code for Gill-Murray and Schnabel-Eskow 
algorithms for standard, dense, base-R matrices.  But my Matrix is large 
(k=3), and sparse (block-arrow structure, stored as a dsCMatrix class from 
the Matrix package).  

Is anyone aware of existing code (or perhaps an algorithm that is easy to 
adapt) that would perform a modified Cholesky decomposition on a large, sparse 
indefinite matrix, preferably working on sparseMatrix classes?  Alternatively, 
is there a way I could compute a sparse LDL' decomposition from an existing R 
function, and quickly modify the output? 

Thanks,

Michael
 


---
Michael Braun
Associate Professor of Management Science
MIT Sloan School of Management
100 Main St.., E62-535
Cambridge, MA 02139
bra...@mit.edu
617-253-3436




__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] bwplot: using a numeric variable to position boxplots

2012-05-03 Thread Michael Friendly

[Env: R 2.14.2 / Win Xp]

In the examples below, I'm using lattice::bwplot to plot boxplots of 4 
variables, grouped by a factor 'epoch'
which also corresponds to a numeric year.  I'd like to modify the plots 
to position the boxplots according to

the numeric value of year, but I can't figure out how to do this.

Also, I'd to modify the strip labels that give the variable names to use 
longer variable labels I define below as

vlab.

 install.packages(heplots, repos=http://R-Forge.R-project.org;)
 # requires heplots 0.9-12 for Skulls data
 library(heplots)
 data(Skulls)

 # make shorter labels for epochs
 Skulls$epoch - factor(Skulls$epoch, 
labels=sub(c,,levels(Skulls$epoch)))

 # create year variable
 Skulls$year - rep(c(-4000, -3300, -1850, -200, 150), each=30)
 str(Skulls)
'data.frame':   150 obs. of  6 variables:
 $ epoch: Ord.factor w/ 5 levels 4000BC3300BC..: 1 1 1 1 1 1 1 1 
1 1 ...

 $ mb   : num  131 125 131 119 136 138 139 125 131 134 ...
 $ bh   : num  138 131 132 132 143 137 130 136 134 134 ...
 $ bl   : num  89 92 99 96 100 89 108 93 102 99 ...
 $ nh   : num  49 48 50 44 54 56 48 48 51 51 ...
 $ year : num  -4000 -4000 -4000 -4000 -4000 -4000 -4000 -4000 -4000 
-4000 ...

 table(Skulls$epoch)

4000BC 3300BC 1850BC  200BC  AD150
30 30 30 30 30

This is what I've tried.  I reshape the data to a long format and can 
plot value ~ epoch or value ~ as.factor(year),
but I really want to space the boxplots according to the  numeric value 
of year.


# better variable labels -- how to incorporate these in the strip labels?
vlab - c(maxBreadth, basibHeight, basialLength, nasalHeight)

library(lattice)
library(reshape2)
Skulls$year - rep(c(-4000, -3300, -1850, -200, 150), each=30)
sklong - melt(Skulls, id=c(epoch, year))

bwplot(value ~ epoch | variable, data=sklong, scales=free,
ylab=Variable value,
xlab=Epoch,
panel = function(x,y, ...) {
panel.bwplot(x, y, ...)
panel.linejoin(x,y, col=red, ...)
}
)

bwplot(value ~ as.factor(year) | variable, data=sklong, scales=free,
ylab=Variable value,
xlab=Year,
panel = function(x,y, ...) {
panel.bwplot(x, y, ...)
panel.linejoin(x,y, col=red, ...)
}
)



--
Michael Friendly Email: friendly AT yorku DOT ca
Professor, Psychology Dept.
York University  Voice: 416 736-5115 x66249 Fax: 416 736-5814
4700 Keele StreetWeb:   http://www.datavis.ca
Toronto, ONT  M3J 1P3 CANADA

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] estimation problem

2012-05-03 Thread Jeff Newmiller
Although you have provided R code to illustrate your problem, it is 
fundamentally a statistics theory question, and belongs somewhere else like 
stats.stackexchange.net.

When you post there, I recommend that you spend more effort to identify why the 
zeros are present. If they are indicators of unknown values, that will be very 
different than if zeros are valid members of the population.
---
Jeff NewmillerThe .   .  Go Live...
DCN:jdnew...@dcn.davis.ca.usBasics: ##.#.   ##.#.  Live Go...
  Live:   OO#.. Dead: OO#..  Playing
Research Engineer (Solar/BatteriesO.O#.   #.O#.  with
/Software/Embedded Controllers)   .OO#.   .OO#.  rocks...1k
--- 
Sent from my phone. Please excuse my brevity.



Kehl Dániel ke...@ktk.pte.hu wrote:

Dear List-members,

I have a problem where I have to estimate a mean, or a sum of a 
population but for some reason it contains a huge amount of zeros.
I cannot give real data but I constructed a toy example as follows

N1 - 10
N2 - 3000
x1 - rep(0,N1)
x2 - rnorm(N2,300,100)
x - c(x1,x2)

n - 1000

x_sample - sample(x,n,replace=FALSE)

I want to estimate the sum of x based on x_sample (not knowing N1 and
N2 
but their sum (N) only).
The sample mean has a huge standard deviation I am looking for a better

estimator.
I was thinking about trimmed (or left trimmed as my numbers are all 
positive) means or something similar,
but if I calculate trimmed mean I do not know N2 to multiply with.

Do you have any idea or could you give me some insight?

Thanks a lot:
Daniel

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] NA's when subset in a dataframe

2012-05-03 Thread agent dunham
Dear community, 

I'm having this silly problem.

I've a linear model. After fixing it, I wanted to know which data had
studentized residuals larger than 3, so i tried this: 

d1 - cooks.distance(lmmodel)
r - sqrt(abs(rstandard(lmmodel)))
rstu - abs(rstudent(lmmodel))

a - cbind( mydata, d1, r,rstu) 

alargerthan3 -  a[rstu 3, ]

And suddenly  a[rstu 3, ]  has 17 rows, 7 of them are new rows, where all
the entries are NA's, even its rownames. 

Because of this I'm not sure of the dimension ofa[rstu 3, ]  (Do I only
have 8 entries?)

Has this happened to anybody before? If so, why this extra NA rows? what's
the problem? Is there any other way to know which data have studentized
residuals larger than   3?


 if it's needed  to upload my data, just tell me.

Thanks in advance,show crossp...@hotmail.com as u...@host.com




--
View this message in context: 
http://r.789695.n4.nabble.com/NA-s-when-subset-in-a-dataframe-tp4606172.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Model Averaging Help

2012-05-03 Thread Tom Finch

Dear All,

I'm using AIC-weighted model averaging to calculate model averaged 
parameter estimates and partial r-squares of each variable in a 
10-variable linear regression.


I've been using the MuMIn package to calculate parameter estimates, but 
am struggling with partial r-squares. There doesn't seem to be any 
function in the MuMIn package dealing with partial r-square 
(r.squaredLR() looks promising, but I think there is a problem updating 
MuMIn so I can't use this (new?) function). I know you can calculate 
partial r-squares using plot(anova(ols()), what = 'partial R2') but I 
don't know how best to do this for all 1024 models and then calculate 
the AIC-weighted model average.


Are there any packages out there that I'm missing? Or any ideas for 
combining MuMIn with anova.Design and Hmisc?


Thanks in advance, and apologies if my problem is badly worded.

Tom

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Identifying the particular X or Y in a sorted list

2012-05-03 Thread Rui Barradas
Hello,


Shankar Lanke wrote
 
 Dear All,
 
 I have a data sets as shown below A (Patient ID ), B and C are the
 Concentration of drug in blood on day 1 and day 4, D is the difference in
 conc. To do this in R I have written a code as follows, identified the
 number of patients who have more concentration on day 4 . Here I want to
 identify specifically the patient ID (is he patient 1 or 2 or 5 and 7),
 whose concentration is more.
 How to write a code to get the list of A (patient ID whose difference is
 more on day 4).
 
 Data-(myDf$B-myDf$C)
 sum(Data0)
 
   A B CD (B-C)  1 14 10 4  2 12 7 5  3 11 15 -4  4 8 3 5  5 1 8 -7
 
 I appreciate your help, thank you very much in advance.
 
 Regards
 
   [[alternative HTML version deleted]]
 
 __
 R-help@ mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.
 

If I understand it correctly, this should do it.

x - read.table(text=
A B C D
1 14 10 4
2 12 7 5
3 11 15 -4
4 8 3 5
5 1 8 -7
, header=TRUE)

which(x$D == max(x$D))
x$A[ which(x$D == max(x$D)) ]

Or you can save the values of the which() function and use them when needed.

Hope this helps,
Rui Barradas



--
View this message in context: 
http://r.789695.n4.nabble.com/Identifying-the-particular-X-or-Y-in-a-sorted-list-tp4605124p4606062.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Simple plot loop

2012-05-03 Thread Ben Neal
Jim, thanks for the reply. I tried what you recommend, but I still get an error 
when running it, just as I did with the similar loops I was trying. Here is the 
error: 

Error in xy.coords(x, y) : 'x' and 'y' lengths differ

That comes from this code: 

#Get file
library(zoo)
setwd(/Users/benjaminneal/Documents/All Panama/1110_Panama/CNAT_Segmenting)
Data = read.csv(120503_CNAT_Summary.csv, header=T)

#fill in missing data from last observation
Data2 - na.locf(Data)
attach(Data2)

#Plot line, and make it loop
for(column in names(Data2)[2:length(Data2)])
  lines(MONTH,column,type=o,pch=22,lty=2,col=blue)

The problem perhaps is in my data. My columns are observations over time, 
column headers are individual names, and the first column is the time series in 
months. An example is here: 
MONTH   Tag101
0 234
2 236
4 239
8 300
10   320

This then repeats for different individuals . . . I think my problem must be 
that my length of MONTH is different than my length of observations of each 
column . . . except that it is not, as far as I can tell! Thank you for 
assisting me with this simple but frustrating problem - I have the greatest 
respect for those of you who provide these answers that so many of us read and 
utilize. Thank you. Ben Neal


-Original Message-
From: r-help-boun...@r-project.org on behalf of Jim Lemon
Sent: Thu 5/3/2012 3:40 AM
To: Ben Neal
Cc: r-help@r-project.org
Subject: Re: [R] Simple plot loop

On 05/03/2012 05:50 PM, Ben Neal wrote:
 Trying to plot multiple lines from a simple matrix with headers (eight 
 observations per column). I will be doing a number of these, with varying 
 numbers of columns, and do not want to enter the header names for each one (I 
 got frustrated and just wrote them out, which did work).

 Data reads fine, first plot is fine, but when i use the code at the bottom 
 for a for i loop it tells me that x and y do not match. . .

 One other issue is that I would prefer not to specify the first column 
 either, but when I enter Plot(MONTH, Data2[2] . . . it also does not plot. 
 Should this not call the second column?

 Thank you very much for any comments. I know this is simple, but I appreciate 
 any assistance. Cheers, Ben

 #

 # LOAD DATA FROM CSV
 library(zoo)
 setwd(/Users/benjaminneal/Documents/All Panama/1110_Panama/CNAT_Segmenting)
 Data = read.csv(120503_CNAT_Summary.csv, header=T)

 #fill in missing data from last observation
 Data2- na.locf(Data)
 attach(Data2)

 # PLOT ALL ON ONE CHART
 plot(MONTH,T102, type=o, ann=False, ylim=c(1, 100), pch=22, lty=2, 
 col=red)
 title(main=5m and 10 m Colpophylia natans colonies over time, ylab=% live 
 coral / colony,
xlab=Months, col.main=black, font.main=4)

 lines(MONTH,T162, type=o, pch=22, lty=2, col=red)
 lines(MONTH,T231, type=o, pch=22, lty=2, col=green)
 lines(MONTH,T250, type=o, pch=22, lty=2, col=green)

 ##(many other similar lines here, with entered column headers . . . up to 75)

 lines(MONTH,T373, type=o, pch=22, lty=2, col=blue)
 lines(MONTH,T374, type=o, pch=22, lty=2, col=blue)
 lines(MONTH,T377, type=o, pch=22, lty=2, col=blue)

 # Tried to add lines another way with for i loop, but this is the part not 
 working
 for (i in 2:length(Data2)) {
lines(MONTH, i, type=o, pch=22, lty=2, col=blue))
 }
 #

Hi Ben,
I think what you may want in your loop is this:

for(column in names(Data2)[2:length(Data2)])
  lines(MONTH,column,type=o,pch=22,lty=2,col=blue)

But, if you want the first two lines to be green, you'll probably have 
to get a vector of colors:

colorvec-rep(blue,length(Data2))
colorvec[1]-red
colorvec[2:3]-green

and change the above to:

columnnames-names(Data2)
for(column in 2:length(Data2))
  lines(MONTH,columnnames[column],type=o,pch=22,lty=2,col=colvec[column])

Jim

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Identifying the particular X or Y in a sorted list

2012-05-03 Thread Shankar Lanke
Dear All,

I have a data sets as shown below A (Patient ID ), B and C are the
Concentration of drug in blood on day 1 and day 4, D is the difference in
conc. To do this in R I have written a code as follows, identified the
number of patients who have more concentration on day 4 . Here I want to
identify specifically the patient ID (is he patient 1 or 2 or 5 and 7),
whose concentration is more.
How to write a code to get the list of A (patient ID whose difference is
more on day 4).

Data-(myDf$B-myDf$C)
sum(Data0)

ABCD (B-C)1141042127531115-44835518-7

I appreciate your help, thank you very much in advance.

Regards

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] error fitting coxph model

2012-05-03 Thread Jessica Myers

Hi David,

Thanks for your input.  My first thought was to look for missing  
values, but I can tell you there are no missing values in the input.   
The error is occurring somewhere deep inside coxpenal.fit, so I can't  
identify how any NAs might be created.  Also, the if/else syntax is  
from coxpenal.fit, so I'm assuming they've done it right.


But I like your idea about searching for the error in data halves.   
It's also worth mentioning that when I changed the df parameter on my  
pspline terms to df = 3, it worked fine, but the error below occurred  
with both df = 2 and df = 4.


Thanks,
Jessica

On May 3, 2012, at 10:00 AM, David Winsemius wrote:



On May 2, 2012, at 3:02 PM, Jessica Myers wrote:


Hi,

I am using coxph from the survival package to fit a large model  
(100,000 observations, ~35 covariates) using both ridge regression  
(on binary covariates) and penalized splines (for continuous  
covariates).


In fitting, I get a strange error:

Error in if (abs((y[nx] - target)/(y[nx - 1] - target))  0.6)  
doing.well - FALSE else doing.well - TRUE :

missing value where TRUE/FALSE needed


You appear to have missing values in 'y', 'nx', or `target`. This  
could be a case of the newbie error of using if(){}else{} when  
ifelse() should have been used. Without at least the code we  
probably cannot resolve those two possibilities.


Unfortunately, I can't reproduce this error without handing over my  
entire dataset,


Why not? If the removal of missing values is unsuccessful and you  
were not committing the error I mentioned, then run it on two  
halves. Pick the one with the error. Rinse, lather, repeat.


but I thought it would be worth checking if anyone had any  
insight.  I should note that the outcome that I'm using has almost  
everyone having an event (~98,000 events out of 100,000).  I have  
fit other models like this with no problem, but on one particular  
dataset it fails.


Thanks!

Jessica Myers
Instructor in Medicine


David Winsemius, MD
West Hartford, CT





The information in this e-mail is intended only for the ...{{dropped:7}}

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] pdf, pairs and subfloats

2012-05-03 Thread Jessica Streicher
Hi there!

I have found a trange problem with getting pairs()-plots to show properly in 
latex \subfloat environments.

If i generate images of these plots with pdf() and include them in subfloats, 
they will either show up in grayscale, or sometimes the datapoints of the 
pairplots are missing. Mind you, the PDFs themselves are properly colored and 
look perfectly fine, both as pdf-image in acrobat and in the pdf if included 
them without subfloat. 

If i build a png() instead of pdf() it will be properly colored, but doesn't 
look as nice when reading the pdf (and most people that will read this read it 
on their PC, not printed)

Any ideas?

I am using eclipse with the StatEt Plugin and Sweave.
I generate the document-pdf with texlipse and pdflatex. 

--

If requested i will try to generate the images from R console and try building 
with another Latex tool. That's work though, so i wanted to see first if 
someone knows whats going on.
__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] bwplot: using a numeric variable to position boxplots

2012-05-03 Thread Richard M. Heiberger
Michael,

I normally do this with the panel.bwplot.intermediate.hh in the HH package.
This function works by plotting each box with its own call to the
underlying panel.bwplot function.

This example from ?HH::position uses the positioned class to
determine where to
place the box.

 library(HH)
 ?position
starting httpd help server ... done
 ## boxplots coded by week
 tmp - data.frame(Y=rnorm(40, rep(c(20,25,15,22), 10), 5),
+   week=ordered(rep(1:4, 10)))
 position(tmp$week) - c(1, 2, 4, 8)

bwplot(Y ~ week, horizontal=FALSE,
+   scales=list(x=list(limits=c(0,9),
+   at=position(tmp$week),
+   labels=position(tmp$week))),
+   data=tmp, panel=panel.bwplot.intermediate.hh)



Rich

On 5/3/12, Michael Friendly frien...@yorku.ca wrote:
 [Env: R 2.14.2 / Win Xp]

 In the examples below, I'm using lattice::bwplot to plot boxplots of 4
 variables, grouped by a factor 'epoch'
 which also corresponds to a numeric year.  I'd like to modify the plots
 to position the boxplots according to
 the numeric value of year, but I can't figure out how to do this.

 Also, I'd to modify the strip labels that give the variable names to use
 longer variable labels I define below as
 vlab.

   install.packages(heplots, repos=http://R-Forge.R-project.org;)
   # requires heplots 0.9-12 for Skulls data
   library(heplots)
   data(Skulls)
  
   # make shorter labels for epochs
   Skulls$epoch - factor(Skulls$epoch,
 labels=sub(c,,levels(Skulls$epoch)))
   # create year variable
   Skulls$year - rep(c(-4000, -3300, -1850, -200, 150), each=30)
   str(Skulls)
 'data.frame':   150 obs. of  6 variables:
   $ epoch: Ord.factor w/ 5 levels 4000BC3300BC..: 1 1 1 1 1 1 1 1
 1 1 ...
   $ mb   : num  131 125 131 119 136 138 139 125 131 134 ...
   $ bh   : num  138 131 132 132 143 137 130 136 134 134 ...
   $ bl   : num  89 92 99 96 100 89 108 93 102 99 ...
   $ nh   : num  49 48 50 44 54 56 48 48 51 51 ...
   $ year : num  -4000 -4000 -4000 -4000 -4000 -4000 -4000 -4000 -4000
 -4000 ...
   table(Skulls$epoch)

 4000BC 3300BC 1850BC  200BC  AD150
  30 30 30 30 30

 This is what I've tried.  I reshape the data to a long format and can
 plot value ~ epoch or value ~ as.factor(year),
 but I really want to space the boxplots according to the  numeric value
 of year.

 # better variable labels -- how to incorporate these in the strip labels?
 vlab - c(maxBreadth, basibHeight, basialLength, nasalHeight)

 library(lattice)
 library(reshape2)
 Skulls$year - rep(c(-4000, -3300, -1850, -200, 150), each=30)
 sklong - melt(Skulls, id=c(epoch, year))

 bwplot(value ~ epoch | variable, data=sklong, scales=free,
  ylab=Variable value,
  xlab=Epoch,
  panel = function(x,y, ...) {
  panel.bwplot(x, y, ...)
  panel.linejoin(x,y, col=red, ...)
  }
  )

 bwplot(value ~ as.factor(year) | variable, data=sklong, scales=free,
  ylab=Variable value,
  xlab=Year,
  panel = function(x,y, ...) {
  panel.bwplot(x, y, ...)
  panel.linejoin(x,y, col=red, ...)
  }
  )



 --
 Michael Friendly Email: friendly AT yorku DOT ca
 Professor, Psychology Dept.
 York University  Voice: 416 736-5115 x66249 Fax: 416 736-5814
 4700 Keele StreetWeb:   http://www.datavis.ca
 Toronto, ONT  M3J 1P3 CANADA

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] estimation problem

2012-05-03 Thread Kehl Dániel

Dear Jeff,

thank you for the response.
Of course I know this is a theory question still I hope to get some 
comments on it
(if somebody already dealt with alike problems might suggest a package 
and it would not take longer than saying this is a theoretical question)
The values are counts, so 0 means those cases do not have this item, 
they have 0, as such it means a real zero, they are valid members.


thanks,
daniel

2012.05.03. 16:42 keltezéssel, Jeff Newmiller írta:

Although you have provided R code to illustrate your problem, it is 
fundamentally a statistics theory question, and belongs somewhere else like 
stats.stackexchange.net.

When you post there, I recommend that you spend more effort to identify why the 
zeros are present. If they are indicators of unknown values, that will be very 
different than if zeros are valid members of the population.
---
Jeff NewmillerThe .   .  Go Live...
DCN:jdnew...@dcn.davis.ca.us Basics: ##.#.   ##.#.  Live Go...
   Live:   OO#.. Dead: OO#..  Playing
Research Engineer (Solar/BatteriesO.O#.   #.O#.  with
/Software/Embedded Controllers)   .OO#.   .OO#.  rocks...1k
---
Sent from my phone. Please excuse my brevity.



Kehl Dánielke...@ktk.pte.hu  wrote:


Dear List-members,

I have a problem where I have to estimate a mean, or a sum of a
population but for some reason it contains a huge amount of zeros.
I cannot give real data but I constructed a toy example as follows

N1- 10
N2- 3000
x1- rep(0,N1)
x2- rnorm(N2,300,100)
x- c(x1,x2)

n- 1000

x_sample- sample(x,n,replace=FALSE)

I want to estimate the sum of x based on x_sample (not knowing N1 and
N2
but their sum (N) only).
The sample mean has a huge standard deviation I am looking for a better

estimator.
I was thinking about trimmed (or left trimmed as my numbers are all
positive) means or something similar,
but if I calculate trimmed mean I do not know N2 to multiply with.

Do you have any idea or could you give me some insight?

Thanks a lot:
Daniel

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.





__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] [R-sig-hpc] Quickest way to make a large empty file on disk?

2012-05-03 Thread Jonathan Greenberg
Thanks, all!  I'll try these out.  I'm trying to work up something that is
platform independent (if possible) for use with mmap.  I'll do some tests
on these suggestions and see which works best. I'll try to report back in a
few days.  Cheers!

--j



2012/5/3 Jens Oehlschlägel jens.oehlschlae...@truecluster.com

 Jonathan,

 On some filesystems (e.g. NTFS, see below) it is possible to create
 'sparse' memory-mapped files, i.e. reserving the space without the cost of
 actually writing initial values.
 Package 'ff' does this automatically and also allows to access the file in
 parallel. Check the example below and see how big file creation is
 immediate.

 Jens Oehlschlägel


  library(ff)
  library(snowfall)
  ncpus - 2
  n - 1e8
  system.time(
 + x - ff(vmode=double, length=n, filename=c:/Temp/x.ff)
 + )
User  System verstrichen
0.010.000.02
  # check finalizer, with an explicit filename we should have a 'close'
 finalizer
  finalizer(x)
 [1] close
  # if not, set it to 'close' inorder to not let slaves delete x on slave
 shutdown
  finalizer(x) - close
  sfInit(parallel=TRUE, cpus=ncpus, type=SOCK)
 R Version:  R version 2.15.0 (2012-03-30)

 snowfall 1.84 initialized (using snow 0.3-9): parallel execution on 2 CPUs.

  sfLibrary(ff)
 Library ff loaded.
 Library ff loaded in cluster.

 Warnmeldung:
 In library(package = ff, character.only = TRUE, pos = 2, warn.conflicts
 = TRUE,  :
   'keep.source' is deprecated and will be ignored
  sfExport(x) # note: do not export the same ff multiple times
  # explicitely opening avoids a gc problem
  sfClusterEval(open(x, caching=mmeachflush)) # opening with
 'mmeachflush' inststead of 'mmnoflush' is a bit slower but prevents OS
 write storms when the file is larger than RAM
 [[1]]
 [1] TRUE

 [[2]]
 [1] TRUE

  system.time(
 + sfLapply( chunk(x, length=ncpus), function(i){
 +   x[i] - runif(sum(i))
 +   invisible()
 + })
 + )
User  System verstrichen
0.000.00   30.78
  system.time(
 + s - sfLapply( chunk(x, length=ncpus), function(i) quantile(x[i],
 c(0.05, 0.95)) )
 + )
User  System verstrichen
0.000.004.38
  # for completeness
  sfClusterEval(close(x))
 [[1]]
 [1] TRUE

 [[2]]
 [1] TRUE

  csummary(s)
  5%  95%
 Min.0.04998 0.95
 1st Qu. 0.04999 0.95
 Median  0.05001 0.95
 Mean0.05001 0.95
 3rd Qu. 0.05002 0.95
 Max.0.05003 0.95
  # stop slaves
  sfStop()

 Stopping cluster

  # with the close finalizer we are responsible for deleting the file
 explicitely (unless we want to keep it)
  delete(x)
 [1] TRUE
  # remove r-side metadata
  rm(x)
  # truly free memory
  gc()



  *Gesendet:* Donnerstag, 03. Mai 2012 um 00:23 Uhr
 *Von:* Jonathan Greenberg j...@illinois.edu
 *An:* r-help r-help@r-project.org, r-sig-...@r-project.org
 *Betreff:* [R-sig-hpc] Quickest way to make a large empty file on disk?
  R-helpers:

 What would be the absolute fastest way to make a large empty file (e.g.
 filled with all zeroes) on disk, given a byte size and a given number
 number of empty values. I know I can use writeBin, but the object in
 this case may be far too large to store in main memory. I'm asking because
 I'm going to use this file in conjunction with mmap to do parallel writes
 to this file. Say, I want to create a blank file of 10,000 floating point
 numbers.

 Thanks!

 --j

 --
 Jonathan A. Greenberg, PhD
 Assistant Professor
 Department of Geography and Geographic Information Science
 University of Illinois at Urbana-Champaign
 607 South Mathews Avenue, MC 150
 Urbana, IL 61801
 Phone: 415-763-5476
 AIM: jgrn307, MSN: jgrn...@hotmail.com, Gchat: jgrn307, Skype: jgrn3007
 http://www.geog.illinois.edu/people/JonathanGreenberg.html

 [[alternative HTML version deleted]]

 ___
 R-sig-hpc mailing list
 r-sig-...@r-project.org
 https://stat.ethz.ch/mailman/listinfo/r-sig-hpc





-- 
Jonathan A. Greenberg, PhD
Assistant Professor
Department of Geography and Geographic Information Science
University of Illinois at Urbana-Champaign
607 South Mathews Avenue, MC 150
Urbana, IL 61801
Phone: 415-763-5476
AIM: jgrn307, MSN: jgrn...@hotmail.com, Gchat: jgrn307, Skype: jgrn3007
http://www.geog.illinois.edu/people/JonathanGreenberg.html

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] estimation problem

2012-05-03 Thread Petr Savicky
On Thu, May 03, 2012 at 03:08:00PM +0200, Kehl Dániel wrote:
 Dear List-members,
 
 I have a problem where I have to estimate a mean, or a sum of a 
 population but for some reason it contains a huge amount of zeros.
 I cannot give real data but I constructed a toy example as follows
 
 N1 - 10
 N2 - 3000
 x1 - rep(0,N1)
 x2 - rnorm(N2,300,100)
 x - c(x1,x2)
 
 n - 1000
 
 x_sample - sample(x,n,replace=FALSE)
 
 I want to estimate the sum of x based on x_sample (not knowing N1 and N2 
 but their sum (N) only).
 The sample mean has a huge standard deviation I am looking for a better 
 estimator.

Hi.

I do not know the exact answer, but let me formulate the following observation.
If the question is redefined to estimate the mean of nonzero numbers, then
an estimate is mean(x_sample[x_sample != 0]). Its standard deviation in your
situation may be estimated as

  res - rep(NA, times=1000)
  for (i in seq.int(along=res)) {
  x_sample - sample(x,n,replace=FALSE)
  res[i] - mean(x_sample[x_sample != 0])
  }
  sd(res)

  [1] 18.72677 # this varies with the seed a bit

The observation is that this cannot be improved much, since the estimate
is based on a very small sample. The average size of the sample of nonzero
values is N2/(N1+N2)*n = 29.1. So, the standard deviation should be
something close to 100/sqrt(29.1) = 18.5376.

Petr Savicky.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] `mapply(function(x) function() x, c(a, b))$a()' returns `b'

2012-05-03 Thread Casper Ti. Vector
Thank you very much, though I still don't quite undertdand the
explanation :)

Nevertheless, I just found a seemingly simple (at least quiker to type)
solution after try-and-error:
eval(mapply(function(x) {x; function() x}, c(a, b)))
Wish it may help future readers.

On Thu, May 03, 2012 at 04:18:38PM +0200, Jessica Streicher wrote:
 So, to get back to mapply:
 
 eval(mapply(function(x) substitute(function() z,list(z=x)), c(a, b))$a)()
  or like this:
 mapply(function(x) eval(substitute(function(i) z*i,list(z=x))), 
 c(2,3))[[1]](2)

-- 
Using GPG/PGP? Please get my current public key (ID: 0xAEF6A134, valid
from 2010 to 2013) from a key server.



signature.asc
Description: Digital signature
__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] LME4 to MCMCglmm

2012-05-03 Thread NightlordTW
Hi all,

I am trying to run an lme4 model (logistic regression with mixed effects) in
MCMCglmm but am unsure how to implement it properly.

Currently, my lme4 model formula looks as follows: outcome ~ (1 + var1 +
var2 | study) + var1 + var2

In English, this means that I am fitting a random effects model, where the
intercept, var1 and var2 are jointly distributed according to study.

My question is now how I would translate this formula to the fixed and
random terms in MCMCglmm.

For the fixed part, I figured that I should make a variable
nooutcome=abs(1-outcome) because it can then be modeled with a multinomial2
family as there is no binomial(logit) option available.  Then, the fixed
part would look as follows:

cbind(outcome,nooutcome)~1+var1+var2

However, I am unsure how to specify the random effects over the intercept,
var1 and var2 jointly. So far, I was able to generate the following:

random=~us(var1):study+us(var2):study+us(1):study

which I think corresponds to  outcome ~ (1  | study) + (var1  | study)  +
(var2  | study) +  var1 + var2 instead of outcome ~ (1 + var1 + var2 |
study) + var1 + var2

I would appreciate any help.

Thomas




--
View this message in context: 
http://r.789695.n4.nabble.com/LME4-to-MCMCglmm-tp4606423.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Identifying the particular X or Y in a sorted list

2012-05-03 Thread John Kane
You do not seem to have suppied either code nor data.  Please supply both.

John Kane
Kingston ON Canada


 -Original Message-
 From: shankarla...@gmail.com
 Sent: Wed, 2 May 2012 22:06:54 -0400
 To: r-help@r-project.org
 Subject: [R] Identifying the particular X or Y in a sorted list
 
 Dear All,
 
 I have a data sets as shown below A (Patient ID ), B and C are the
 Concentration of drug in blood on day 1 and day 4, D is the difference in
 conc. To do this in R I have written a code as follows, identified the
 number of patients who have more concentration on day 4 . Here I want to
 identify specifically the patient ID (is he patient 1 or 2 or 5 and 7),
 whose concentration is more.
 How to write a code to get the list of A (patient ID whose difference is
 more on day 4).
 
 Data-(myDf$B-myDf$C)
 sum(Data0)
 
   A B CD (B-C)  1 14 10 4  2 12 7 5  3 11 15 -4  4 8 3 5  5 1 8 -7
 
 I appreciate your help, thank you very much in advance.
 
 Regards
 
   [[alternative HTML version deleted]]
 
 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.


GET FREE SMILEYS FOR YOUR IM  EMAIL - Learn more at 
http://www.inbox.com/smileys
Works with AIM®, MSN® Messenger, Yahoo!® Messenger, ICQ®, Google Talk™ and most 
webmails

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] warning with glm.predict, wrong number of data rows

2012-05-03 Thread Charles Berry
carol white wht_crl at yahoo.com writes:

 
 Hi,
 I split a data set into two partitions (80 and 42), use the first as the
training set in glm and the second as
 testing set in glm predict. But when I call glm.predict, I get the warning
message: 
 
 Warning message:
 'newdata' had 42 rows but variable(s) found have 80 rows 
 -

[snip]

The warning correctly diagnoses the problem.

The posting guide asks for a 'reproducible example', but you did not give us 
one.

There is one below. 

Note what happens when predict() tries to reconstruct the variable 'x[1:4]'
as dictated by the formula.

How many elements can 'x[1:4]' have when newdata has (say) nrowsNew?

Use the subset argument to select a subset of observations.


 y - sample(factor(1:2),80,repl=T)
 y - sample(factor(1:2),5,repl=T)
 x - 1:4
 fit - glm( y[1:4] ~ x[1:4], family = binomial)
 fit

Call:  glm(formula = y[1:4] ~ x[1:4], family = binomial)

Coefficients:
(Intercept)   x[1:4]  
 -1.110e-160.000e+00  

Degrees of Freedom: 3 Total (i.e. Null);  2 Residual
Null Deviance:  5.545 
Residual Deviance: 5.545AIC: 9.545 
 predict(fit,newdata=data.frame(x=1:2))
1 2 3 4 
-1.110223e-16 -1.110223e-16NANA 
Warning message:
'newdata' had 2 rows but variable(s) found have 4 rows 
 predict(fit,newdata=data.frame(x=1:5))
1 2 3 4 
-1.110223e-16 -1.110223e-16 -1.110223e-16 -1.110223e-16 
Warning message:
'newdata' had 5 rows but variable(s) found have 4 rows 



HTH,

Chuck

[rest deleted]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Cannot read or write to file in Linux Ubuntu

2012-05-03 Thread John Kane
I am the proud owner of a new laptop since my old one died the other day.
Currently I have a dual-boot Windows 7 Home and Ubuntu 12.04 .  I'll leave the 
Windows problems for another post.

 I know practically nothing about Linux so I am probably doing something stupid 
but ... at the moment I cannot seem read or write  files in Ubuntu.   I am not 
having any problem saving other documents to the hard drive and R , from my few 
simple tests, seems to be working okay otherwise.

At the moment I am trying :

mydata - read.csv(DATA/media/DATA/rdata/tt1.csv,  header = TRUE)
  or 
mydata - read.csv(DATA/rdata/tt1.csv,  header = TRUE)


where tt1.csv is a text file on what, from my reading of the path listed  in 
gedit is 
DATA/media/DATA/rdata

The csv data is simply:
aa, bb
2, 3
4, 5

What happens:
---
1 mydata - read.csv(DATA/rdata/tt1.csv,  header = TRUE)
Error in file(file, rt) : cannot open the connection
In addition: Warning message:
In file(file, rt) :
  cannot open file 'DATA/rdata/tt1.csv': No such file or directory

Am I totally screwing up the path?  Or doing something else equally stupid?

BTW I realise that 2.15 is out but Ubuntu as of yesterday did not have it in 
the repositories and I have yet to figure out how to install it from a CRAN 
site.

1 sessionInfo()
R version 2.14.1 (2011-12-22)
Platform: i686-pc-linux-gnu (32-bit)

locale:
 [1] LC_CTYPE=en_US.UTF-8   LC_NUMERIC=C  
 [3] LC_TIME=en_CA.UTF-8LC_COLLATE=en_US.UTF-8
 [5] LC_MONETARY=en_CA.UTF-8LC_MESSAGES=en_US.UTF-8   
 [7] LC_PAPER=C LC_NAME=C 
 [9] LC_ADDRESS=C   LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_CA.UTF-8 LC_IDENTIFICATION=C   

attached base packages:
[1] stats graphics  grDevices utils datasets  methods   base 

John Kane
Kingston ON Canada


FREE 3D EARTH SCREENSAVER - Watch the Earth right on your desktop!

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] pdf, pairs and subfloats

2012-05-03 Thread Jessica Streicher
I was bored and have tried doing it with console and texworks (also uses 
pdflatex)

texworks preview shows it properly colored, but in acrobat reader it is black 
and white again. Still scratching my head..

Am 03.05.2012 um 17:14 schrieb Jessica Streicher:

 Hi there!
 
 I have found a trange problem with getting pairs()-plots to show properly in 
 latex \subfloat environments.
 
 If i generate images of these plots with pdf() and include them in subfloats, 
 they will either show up in grayscale, or sometimes the datapoints of the 
 pairplots are missing. Mind you, the PDFs themselves are properly colored and 
 look perfectly fine, both as pdf-image in acrobat and in the pdf if included 
 them without subfloat. 
 
 If i build a png() instead of pdf() it will be properly colored, but doesn't 
 look as nice when reading the pdf (and most people that will read this read 
 it on their PC, not printed)
 
 Any ideas?
 
 I am using eclipse with the StatEt Plugin and Sweave.
 I generate the document-pdf with texlipse and pdflatex. 
 
 --
 
 If requested i will try to generate the images from R console and try 
 building with another Latex tool. That's work though, so i wanted to see 
 first if someone knows whats going on.
 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Cannot read or write to file in Linux Ubuntu

2012-05-03 Thread Sarah Goslee
Hi John,

You're probably messing up the path, just as you suspect.

If you use a relative path, like you are doing, then R looks for that
location starting at R's current working directory, visible with
getwd(). For linux, that's the location at which you started R if you
started it from a terminal.

The safest solution is to use an absolute path, which will likely be
something resembling /home/john/DATA/... etc - note that it will
always start with a / and go from there.

If you know how to start a terminal window and cd to where your file
is, pwd at the command prompt will give you the absolute path to that
location, which is what you should be using until you get more
comfortable with the file system.

The error message means that R can't find the directory you're telling
it to use.

Sarah

On Thu, May 3, 2012 at 12:21 PM, John Kane jrkrid...@inbox.com wrote:
 I am the proud owner of a new laptop since my old one died the other day.
 Currently I have a dual-boot Windows 7 Home and Ubuntu 12.04 .  I'll leave 
 the Windows problems for another post.

  I know practically nothing about Linux so I am probably doing something 
 stupid but ... at the moment I cannot seem read or write  files in Ubuntu.   
 I am not having any problem saving other documents to the hard drive and R , 
 from my few simple tests, seems to be working okay otherwise.

 At the moment I am trying :

 mydata - read.csv(DATA/media/DATA/rdata/tt1.csv,  header = TRUE)
  or
 mydata - read.csv(DATA/rdata/tt1.csv,  header = TRUE)


 where tt1.csv is a text file on what, from my reading of the path listed  in 
 gedit is
 DATA/media/DATA/rdata

 The csv data is simply:
 aa, bb
 2, 3
 4, 5

 What happens:
 ---
 1 mydata - read.csv(DATA/rdata/tt1.csv,  header = TRUE)
 Error in file(file, rt) : cannot open the connection
 In addition: Warning message:
 In file(file, rt) :
  cannot open file 'DATA/rdata/tt1.csv': No such file or directory

 Am I totally screwing up the path?  Or doing something else equally stupid?

 BTW I realise that 2.15 is out but Ubuntu as of yesterday did not have it in 
 the repositories and I have yet to figure out how to install it from a CRAN 
 site.

 1 sessionInfo()
 R version 2.14.1 (2011-12-22)
 Platform: i686-pc-linux-gnu (32-bit)

 locale:
  [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C
  [3] LC_TIME=en_CA.UTF-8        LC_COLLATE=en_US.UTF-8
  [5] LC_MONETARY=en_CA.UTF-8    LC_MESSAGES=en_US.UTF-8
  [7] LC_PAPER=C                 LC_NAME=C
  [9] LC_ADDRESS=C               LC_TELEPHONE=C
 [11] LC_MEASUREMENT=en_CA.UTF-8 LC_IDENTIFICATION=C

 attached base packages:
 [1] stats     graphics  grDevices utils     datasets  methods   base

-- 
Sarah Goslee
http://www.functionaldiversity.org

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Cannot read or write to file in Linux Ubuntu

2012-05-03 Thread Jeff Newmiller
All of your tests are with relative paths. Use getwd() identify your starting 
directory, and if it isn't you can use setwd() to start in the right place.
---
Jeff NewmillerThe .   .  Go Live...
DCN:jdnew...@dcn.davis.ca.usBasics: ##.#.   ##.#.  Live Go...
  Live:   OO#.. Dead: OO#..  Playing
Research Engineer (Solar/BatteriesO.O#.   #.O#.  with
/Software/Embedded Controllers)   .OO#.   .OO#.  rocks...1k
--- 
Sent from my phone. Please excuse my brevity.

John Kane jrkrid...@inbox.com wrote:

I am the proud owner of a new laptop since my old one died the other
day.
Currently I have a dual-boot Windows 7 Home and Ubuntu 12.04 .  I'll
leave the Windows problems for another post.

I know practically nothing about Linux so I am probably doing something
stupid but ... at the moment I cannot seem read or write  files in
Ubuntu.   I am not having any problem saving other documents to the
hard drive and R , from my few simple tests, seems to be working okay
otherwise.

At the moment I am trying :

mydata - read.csv(DATA/media/DATA/rdata/tt1.csv,  header = TRUE)
  or 
mydata - read.csv(DATA/rdata/tt1.csv,  header = TRUE)


where tt1.csv is a text file on what, from my reading of the path
listed  in gedit is 
DATA/media/DATA/rdata

The csv data is simply:
aa, bb
2, 3
4, 5

What happens:
---
1 mydata - read.csv(DATA/rdata/tt1.csv,  header = TRUE)
Error in file(file, rt) : cannot open the connection
In addition: Warning message:
In file(file, rt) :
  cannot open file 'DATA/rdata/tt1.csv': No such file or directory

Am I totally screwing up the path?  Or doing something else equally
stupid?

BTW I realise that 2.15 is out but Ubuntu as of yesterday did not have
it in the repositories and I have yet to figure out how to install it
from a CRAN site.

1 sessionInfo()
R version 2.14.1 (2011-12-22)
Platform: i686-pc-linux-gnu (32-bit)

locale:
 [1] LC_CTYPE=en_US.UTF-8   LC_NUMERIC=C  
 [3] LC_TIME=en_CA.UTF-8LC_COLLATE=en_US.UTF-8
 [5] LC_MONETARY=en_CA.UTF-8LC_MESSAGES=en_US.UTF-8   
 [7] LC_PAPER=C LC_NAME=C 
 [9] LC_ADDRESS=C   LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_CA.UTF-8 LC_IDENTIFICATION=C   

attached base packages:
[1] stats graphics  grDevices utils datasets  methods   base   
 

John Kane
Kingston ON Canada


FREE 3D EARTH SCREENSAVER - Watch the Earth right on your desktop!

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Two ecdf with log-scales

2012-05-03 Thread Johannes Radinger
Thank you Steve,

thats the thing I was looking for

/Johannes

 Original-Nachricht 
 Datum: Thu, 3 May 2012 08:20:51 -0400
 Von: Steven Wolf wolfs...@msu.edu
 An: \'David Winsemius\' dwinsem...@comcast.net, \'Johannes Radinger\' 
 jradin...@gmx.at
 CC: R-help@r-project.org
 Betreff: RE: [R] Two ecdf with log-scales

 I've done it this way before:
 
 eX - ecdf(distribution 1)
 eY - ecdf(distribution 2)
 par(mar=c(5,5,2,1),xlog=TRUE)
 plot(eX, do.points=FALSE, verticals=TRUE, col=black, xlab=xlabel,
 xlim=c(1,10), ylab=ylabel, 
   lty=1, cex.lab=1.5, cex.axis=1.5, main=,
 lwd=3,log=x)
 plot(eY, do.points=FALSE, verticals=TRUE, col=blue, add=TRUE,
 xlim=c(1,10), main=)
 
 Warning:  It makes a stair-step that may be difficult to see unless you
 use
 color.  I had to change how the ecdf was plotted when I made b/w figures
 for
 my publication so that different dashed lines were distinct.
 
 HTH,
 -Steve
 
 -Original Message-
 From: David Winsemius [mailto:dwinsem...@comcast.net] 
 Sent: Wednesday, May 02, 2012 10:17 AM
 To: Johannes Radinger
 Cc: R-help@r-project.org
 Subject: Re: [R] Two ecdf with log-scales
 
 
 On May 2, 2012, at 6:14 AM, Johannes Radinger wrote:
 
  Hi,
 
  i want to plot empirical cumulative density functions for two 
  variables in one plot. For better visualizing the differences in the 
  two cumulative curves I'd like to log-scale the axis.
 
  So far I found 3 possible functions to plot ecdf:
 
  1) ecdf() from the package 'stats'. I don't know how to successfully 
  set the log.scales? Combining two plots is not a problem:
 
  plot(ecdf(x1))
  lines(ecdf(x2),col.h=red)
 
  2) gx.ecdf() from package 'rgr'. It is easily possible to plot log- 
  scales, but I don't know how to plot two densities?
 
  gx.ecdf(x1,log=TRUE,ifqs = TRUE)
 
  3) Ecdf() from package 'Hmisc'. No log-option directly available and 
  here I also don't know how to 'stack' two plots...
 
  Ecdf(x1,what=F)
 
 
  Probably there are many more solutions (e.g. ggplot etc.)...
  ...Has anyone faced a similar task and found a simple solution? Any 
  suggestions are welcome!
 
 Have you searched the Archives? I seem to remember that the log(0) was a
 barrier to persons attempting this in the past. (ISTR a posting in the
 last
 few weeks.)  Maybe you could also provide a test data object that has the
 same range as your x1 and x 2 variables.
 
  and provide commented, minimal, self-contained, reproducible code.
 
 -- 
 
 David Winsemius, MD
 West Hartford, CT
 
 
 

-- 

Jetzt informieren: http://mobile.1und1.de/?ac=OM.PW.PW003K20328T7073a

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Problem with 'nls' fitting logistic model (5PL)

2012-05-03 Thread Keith Jewell
 ?nls.control
  fit- nls(MFI~a + b/((1+(nom/c)^d)^f), data=x, weights=x$weights,
+  start=c(a=100, b=1, c=100, d=-1, f=1), 
control=nls.control(warnOnly=TRUE))
Warning message:
In nls(MFI ~ a + b/((1 + (nom/c)^d)^f), data = x, weights = x$weights,  :
  step factor 0.000488281 reduced below 'minFactor' of 0.000976562
 fit
Nonlinear regression model
  model:  MFI ~ a + b/((1 + (nom/c)^d)^f)
   data:  x
 a  b  c  d  f
  165.4360 23866.124738.6157-0.4454 3.2210
 weighted residual sum-of-squares: 2627977

Number of iterations till stop: 23
Achieved convergence tolerance: 48.23
Reason stopped: step factor 0.000488281 reduced below 'minFactor' of 
0.000976563
 summary(fit)

Formula: MFI ~ a + b/((1 + (nom/c)^d)^f)

Parameters:
Estimate Std. Error t value Pr(|t|)
a  1.654e+02  2.742e+04   0.0060.995
b  2.387e+04  3.027e+06   0.0080.994
c  3.862e+01  3.030e+04   0.0010.999
d -4.454e-01  5.754e+01  -0.0080.994
f  3.221e+00  1.008e+03   0.0030.998

Residual standard error: 540.4 on 9 degrees of freedom

Number of iterations till stop: 23
Achieved convergence tolerance: 48.23
Reason stopped: step factor 0.000488281 reduced below 'minFactor' of 
0.000976563


Perhaps nls is a little more stringent than ANY  reliable statistical 
software in what, by default, it considers a fit worth reporting?

Keith J

Michal Figurski figur...@mail.med.upenn.edu wrote in message 
news:4fa2759c.3060...@mail.med.upenn.edu...
 Bert,

 Thank you for your thoughts.

 I can assure you I have plotted the data back and forth many times, and 
 that overfitting has nothing to do with it. This is not a _statistical_ 
 problem, but a _technical_ problem. Something that works well in ANY 
 reliable statistical software doesn't work in R with 'nls'.

 This just makes me think that 'nls' is a dysfunctional piece of junk that 
 can't handle even a very simple problem. Not to mention that 'predict()' 
 for 'nls' doesn't give you any sort of confidence interval.

 I wonder if there's another package in R that could be used to fit 
 5P-logistic model. Any idea?

 In the end I may attempt to write a fitting function myself, but that 
 would be the last resort.

 --
 Michal J. Figurski, PhD
 HUP, Pathology  Laboratory Medicine
 Biomarker Research Laboratory
 3400 Spruce St. 7 Maloney S
 Philadelphia, PA 19104
 tel. (215) 662-3413


 On 5/2/2012 3:47 PM, Bert Gunter wrote:
 Plot the data. You're clearly overfitting.

 (If you don't know what this means or why it causes the problems you
 see, try a statistical help list or consult your local statistician).

 -- Bert

 On Wed, May 2, 2012 at 12:32 PM, Michal Figurski
 figur...@mail.med.upenn.edu  wrote:
 Dear R-Helpers,

 I'm working with immunoassay data and 5PL logistic model. I wanted to
 experiment with different forms of weighting and parameter selection, 
 which
 is not possible in instrument software, so I turned to R.

 I am using R 2.14.2 under Win7 64bit, and the 'nls' library to fit the 
 model
 - I started with the same model and weighting type (1/y) as in the
 instrument to see if I'll get similar results. However, in some 
 instances I
 don't get any results - just errors.

 Here is an example calibration data, representative of my experiment.
 Instrument soft had no problem fitting it:
 x- structure(list(SPL = structure(c(1L, 1L, 2L, 2L, 3L, 3L, 4L,
 4L, 5L, 5L, 6L, 6L, 7L, 7L), .Label = c(St1, St2, St3,
 St4, St5, St6, St7), class = factor), MFI = c(10755.5,
 9839, 5142.5, 4857, 1510.5, 1505, 502.5, 451, 215, 195.5, 58,
 57, 15, 15), nom = c(206, 206, 125, 125, 68, 68, 38, 38, 24,
 24, 13, 13, 6.5, 6.5), weights = c(0.0013946353028683, 
 0.00152454517735542,
 0.00291686922702965, 0.00308832612723904, 0.0099304865938431,
 0.00996677740863787, 0.0298507462686567, 0.0332594235033259,
 0.0697674418604651, 0.0767263427109974, 0.258620689655172,
 0.263157894736842,
 1, 1)), .Names = c(SPL, MFI, nom, weights), row.names = c(NA,
 -14L), class = data.frame)

 And here is the nls fit:
 fit- nls(MFI~a + b/((1+(nom/c)^d)^f), data=x, weights=x$weights,
 start=c(a=100, b=1, c=100, d=-1, f=1))

 I've tried every possible combination of starting values, including the
 values fitted by the instrument soft - to no avail. I've probably seen 
 all
 possible error messages from 'nls' trying to fit this.

 If anyone has an idea why it's not working - let me know.

 Best regards,

 --
 Michal J. Figurski, PhD
 HUP, Pathology  Laboratory Medicine
 Biomarker Research Laboratory
 3400 Spruce St. 7 Maloney S
 Philadelphia, PA 19104
 tel. (215) 662-3413

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide 
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.





__
R-help@r-project.org mailing list

[R] Help with readBin

2012-05-03 Thread kapo coulibaly
I'm trying to read a binary file created by a fortran code using readBin
and readChar. Everything reads fine (integers and strings) except for
double precision numbers, they are read as huge or very small number
(1E-250,...). I tried various endianness, swap, But nothing has worked so
far.
I also tried on R 64 bit for linux and windows (R 2.14) and R 2.11 on
windows XP 32 bit.
Any help would be appreciated.

Thanks

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Problem with 'nls' fitting logistic model (5PL)

2012-05-03 Thread Bert Gunter
Thanks, Keith.

I failed to cc the following reply to John Nash to the list. Your
email persuaded me that it might be useful to do so.

None of this changes the fact that the model is overfitted. You may be
able to get convergence to some set of parameter estates, but they
won't have much meaning since the confidence region will be huge
(better word: essentially degenerate -- since it would have
essentially 0 volume in 5d). Predictions should be ok, but if that is
all that is wanted, a more parsimonious model would be better.
Extrapolation is nuts of course.

In my experience(only), scientists tend to overfit nonlinear models.
Failure to converge seems to me to be more appropriate in these
circumstances than providing meaningless parameters.

And,yes I know I'm a curmudgeon. But facilitating bad practice should
always be a concern.

Best,
Bert

On Thu, May 3, 2012 at 9:38 AM, Keith Jewell k.jew...@campden.co.uk wrote:
 ?nls.control
  fit- nls(MFI~a + b/((1+(nom/c)^d)^f), data=x, weights=x$weights,
 +  start=c(a=100, b=1, c=100, d=-1, f=1),
 control=nls.control(warnOnly=TRUE))
 Warning message:
 In nls(MFI ~ a + b/((1 + (nom/c)^d)^f), data = x, weights = x$weights,  :
  step factor 0.000488281 reduced below 'minFactor' of 0.000976562
 fit
 Nonlinear regression model
  model:  MFI ~ a + b/((1 + (nom/c)^d)^f)
   data:  x
         a          b          c          d          f
  165.4360 23866.1247    38.6157    -0.4454     3.2210
  weighted residual sum-of-squares: 2627977

 Number of iterations till stop: 23
 Achieved convergence tolerance: 48.23
 Reason stopped: step factor 0.000488281 reduced below 'minFactor' of
 0.000976563
 summary(fit)

 Formula: MFI ~ a + b/((1 + (nom/c)^d)^f)

 Parameters:
    Estimate Std. Error t value Pr(|t|)
 a  1.654e+02  2.742e+04   0.006    0.995
 b  2.387e+04  3.027e+06   0.008    0.994
 c  3.862e+01  3.030e+04   0.001    0.999
 d -4.454e-01  5.754e+01  -0.008    0.994
 f  3.221e+00  1.008e+03   0.003    0.998

 Residual standard error: 540.4 on 9 degrees of freedom

 Number of iterations till stop: 23
 Achieved convergence tolerance: 48.23
 Reason stopped: step factor 0.000488281 reduced below 'minFactor' of
 0.000976563


 Perhaps nls is a little more stringent than ANY  reliable statistical
 software in what, by default, it considers a fit worth reporting?

 Keith J

 Michal Figurski figur...@mail.med.upenn.edu wrote in message
 news:4fa2759c.3060...@mail.med.upenn.edu...
 Bert,

 Thank you for your thoughts.

 I can assure you I have plotted the data back and forth many times, and
 that overfitting has nothing to do with it. This is not a _statistical_
 problem, but a _technical_ problem. Something that works well in ANY
 reliable statistical software doesn't work in R with 'nls'.

 This just makes me think that 'nls' is a dysfunctional piece of junk that
 can't handle even a very simple problem. Not to mention that 'predict()'
 for 'nls' doesn't give you any sort of confidence interval.

 I wonder if there's another package in R that could be used to fit
 5P-logistic model. Any idea?

 In the end I may attempt to write a fitting function myself, but that
 would be the last resort.

 --
 Michal J. Figurski, PhD
 HUP, Pathology  Laboratory Medicine
 Biomarker Research Laboratory
 3400 Spruce St. 7 Maloney S
 Philadelphia, PA 19104
 tel. (215) 662-3413


 On 5/2/2012 3:47 PM, Bert Gunter wrote:
 Plot the data. You're clearly overfitting.

 (If you don't know what this means or why it causes the problems you
 see, try a statistical help list or consult your local statistician).

 -- Bert

 On Wed, May 2, 2012 at 12:32 PM, Michal Figurski
 figur...@mail.med.upenn.edu  wrote:
 Dear R-Helpers,

 I'm working with immunoassay data and 5PL logistic model. I wanted to
 experiment with different forms of weighting and parameter selection,
 which
 is not possible in instrument software, so I turned to R.

 I am using R 2.14.2 under Win7 64bit, and the 'nls' library to fit the
 model
 - I started with the same model and weighting type (1/y) as in the
 instrument to see if I'll get similar results. However, in some
 instances I
 don't get any results - just errors.

 Here is an example calibration data, representative of my experiment.
 Instrument soft had no problem fitting it:
 x- structure(list(SPL = structure(c(1L, 1L, 2L, 2L, 3L, 3L, 4L,
 4L, 5L, 5L, 6L, 6L, 7L, 7L), .Label = c(St1, St2, St3,
 St4, St5, St6, St7), class = factor), MFI = c(10755.5,
 9839, 5142.5, 4857, 1510.5, 1505, 502.5, 451, 215, 195.5, 58,
 57, 15, 15), nom = c(206, 206, 125, 125, 68, 68, 38, 38, 24,
 24, 13, 13, 6.5, 6.5), weights = c(0.0013946353028683,
 0.00152454517735542,
 0.00291686922702965, 0.00308832612723904, 0.0099304865938431,
 0.00996677740863787, 0.0298507462686567, 0.0332594235033259,
 0.0697674418604651, 0.0767263427109974, 0.258620689655172,
 0.263157894736842,
 1, 1)), .Names = c(SPL, MFI, nom, weights), row.names = c(NA,
 -14L), class = data.frame)

 And here is the nls fit:
 fit- 

Re: [R] Cannot read or write to file in Linux Ubuntu

2012-05-03 Thread John Kane
Thanks. 
 I had not realsed there were relative paths until Sarah mentioned them.

It's working now: see my post to Sarah.  

John Kane
Kingston ON Canada


 -Original Message-
 From: jdnew...@dcn.davis.ca.us
 Sent: Thu, 03 May 2012 09:30:10 -0700
 To: jrkrid...@inbox.com, r-help@r-project.org
 Subject: Re: [R] Cannot read or write to file in Linux Ubuntu
 
 All of your tests are with relative paths. Use getwd() identify your
 starting directory, and if it isn't you can use setwd() to start in the
 right place.
 ---
 Jeff NewmillerThe .   .  Go
 Live...
 DCN:jdnew...@dcn.davis.ca.usBasics: ##.#.   ##.#.  Live
 Go...
   Live:   OO#.. Dead: OO#..  Playing
 Research Engineer (Solar/BatteriesO.O#.   #.O#.  with
 /Software/Embedded Controllers)   .OO#.   .OO#.
 rocks...1k
 ---
 Sent from my phone. Please excuse my brevity.
 
 John Kane jrkrid...@inbox.com wrote:
 
 I am the proud owner of a new laptop since my old one died the other
 day.
 Currently I have a dual-boot Windows 7 Home and Ubuntu 12.04 .  I'll
 leave the Windows problems for another post.
 
 I know practically nothing about Linux so I am probably doing something
 stupid but ... at the moment I cannot seem read or write  files in
 Ubuntu.   I am not having any problem saving other documents to the
 hard drive and R , from my few simple tests, seems to be working okay
 otherwise.
 
 At the moment I am trying :
 
 mydata - read.csv(DATA/media/DATA/rdata/tt1.csv,  header = TRUE)
  or
 mydata - read.csv(DATA/rdata/tt1.csv,  header = TRUE)
 
 
 where tt1.csv is a text file on what, from my reading of the path
 listed  in gedit is
 DATA/media/DATA/rdata
 
 The csv data is simply:
 aa, bb
 2, 3
 4, 5
 
 What happens:
 ---
 1 mydata - read.csv(DATA/rdata/tt1.csv,  header = TRUE)
 Error in file(file, rt) : cannot open the connection
 In addition: Warning message:
 In file(file, rt) :
  cannot open file 'DATA/rdata/tt1.csv': No such file or directory
 
 Am I totally screwing up the path?  Or doing something else equally
 stupid?
 
 BTW I realise that 2.15 is out but Ubuntu as of yesterday did not have
 it in the repositories and I have yet to figure out how to install it
 from a CRAN site.
 
 1 sessionInfo()
 R version 2.14.1 (2011-12-22)
 Platform: i686-pc-linux-gnu (32-bit)
 
 locale:
 [1] LC_CTYPE=en_US.UTF-8   LC_NUMERIC=C
 [3] LC_TIME=en_CA.UTF-8LC_COLLATE=en_US.UTF-8
 [5] LC_MONETARY=en_CA.UTF-8LC_MESSAGES=en_US.UTF-8
 [7] LC_PAPER=C LC_NAME=C
 [9] LC_ADDRESS=C   LC_TELEPHONE=C
 [11] LC_MEASUREMENT=en_CA.UTF-8 LC_IDENTIFICATION=C
 
 attached base packages:
 [1] stats graphics  grDevices utils datasets  methods   base
 
 
 John Kane
 Kingston ON Canada
 
 
 FREE 3D EARTH SCREENSAVER - Watch the Earth right on your desktop!
 
 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.


FREE 3D MARINE AQUARIUM SCREENSAVER - Watch dolphins, sharks  orcas on your 
desktop!

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Cannot read or write to file in Linux Ubuntu

2012-05-03 Thread John Kane
Thanks Sarah,

I suspected something like that but am still gropping around in Linux.  I 
vaguely remember how to cd to someplace.  Shades of DOS 3.2! Of was that Unixor 
both!

Also I think I was trying to be a bit too smart-alecky in where I was placing 
my data folder so I moved it to my home folder to simplify figuring out the 
path. Still thinking in Windows terms.

After a bit of trial and error:

jjohn@john-K53U:~$ cd /home/john/rdata
john@john-K53U:~/rdata$ dir
tti.csv
john@john-K53U:~/rdata$ pwd
/home/john/rdata

so 
mydata - read.csv(/home/john/rdata/tti.csv, header = TRUE)
 works just fine.  I like the idea of staying with absolute paths.

I am most appreciative.

John Kane
Kingston ON Canada


 -Original Message-
 From: sarah.gos...@gmail.com
 Sent: Thu, 3 May 2012 12:29:14 -0400
 To: jrkrid...@inbox.com
 Subject: Re: [R] Cannot read or write to file in Linux Ubuntu
 
 Hi John,
 
 You're probably messing up the path, just as you suspect.
 
 If you use a relative path, like you are doing, then R looks for that
 location starting at R's current working directory, visible with
 getwd(). For linux, that's the location at which you started R if you
 started it from a terminal.
 
 The safest solution is to use an absolute path, which will likely be
 something resembling /home/john/DATA/... etc - note that it will
 always start with a / and go from there.
 
 If you know how to start a terminal window and cd to where your file
 is, pwd at the command prompt will give you the absolute path to that
 location, which is what you should be using until you get more
 comfortable with the file system.
 
 The error message means that R can't find the directory you're telling
 it to use.
 
 Sarah
 
 On Thu, May 3, 2012 at 12:21 PM, John Kane jrkrid...@inbox.com wrote:
 I am the proud owner of a new laptop since my old one died the other
 day.
 Currently I have a dual-boot Windows 7 Home and Ubuntu 12.04 .  I'll
 leave the Windows problems for another post.
 
  I know practically nothing about Linux so I am probably doing something
 stupid but ... at the moment I cannot seem read or write  files in
 Ubuntu.   I am not having any problem saving other documents to the hard
 drive and R , from my few simple tests, seems to be working okay
 otherwise.
 
 At the moment I am trying :
 
 mydata - read.csv(DATA/media/DATA/rdata/tt1.csv,  header = TRUE)
  or
 mydata - read.csv(DATA/rdata/tt1.csv,  header = TRUE)
 
 
 where tt1.csv is a text file on what, from my reading of the path listed
 in gedit is
 DATA/media/DATA/rdata
 
 The csv data is simply:
 aa, bb
 2, 3
 4, 5
 
 What happens:
 ---
 1 mydata - read.csv(DATA/rdata/tt1.csv,  header = TRUE)
 Error in file(file, rt) : cannot open the connection
 In addition: Warning message:
 In file(file, rt) :
  cannot open file 'DATA/rdata/tt1.csv': No such file or directory
 
 Am I totally screwing up the path?  Or doing something else equally
 stupid?
 
 BTW I realise that 2.15 is out but Ubuntu as of yesterday did not have
 it in the repositories and I have yet to figure out how to install it
 from a CRAN site.
 
 1 sessionInfo()
 R version 2.14.1 (2011-12-22)
 Platform: i686-pc-linux-gnu (32-bit)
 
 locale:
  [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C
  [3] LC_TIME=en_CA.UTF-8        LC_COLLATE=en_US.UTF-8
  [5] LC_MONETARY=en_CA.UTF-8    LC_MESSAGES=en_US.UTF-8
  [7] LC_PAPER=C                 LC_NAME=C
  [9] LC_ADDRESS=C               LC_TELEPHONE=C
 [11] LC_MEASUREMENT=en_CA.UTF-8 LC_IDENTIFICATION=C
 
 attached base packages:
 [1] stats     graphics  grDevices utils     datasets  methods   base
 
 --
 Sarah Goslee
 http://www.functionaldiversity.org


FREE 3D EARTH SCREENSAVER - Watch the Earth right on your desktop!

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Help with readBin

2012-05-03 Thread Duncan Murdoch

On 03/05/2012 12:41 PM, kapo coulibaly wrote:

I'm trying to read a binary file created by a fortran code using readBin
and readChar. Everything reads fine (integers and strings) except for
double precision numbers, they are read as huge or very small number
(1E-250,...). I tried various endianness, swap, But nothing has worked so
far.
I also tried on R 64 bit for linux and windows (R 2.14) and R 2.11 on
windows XP 32 bit.
Any help would be appreciated.


As I wrote to someone else with a similar problem a couple of weeks ago:

You need to see what's in the file.  The hexView package can dump it in
various formats; see example(viewFormat) for a couple.

Duncan Murdoch

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] How to replace NA with zero (0)

2012-05-03 Thread Christopher Kelvin
Hello,
 When i generate data with the code below there appear NA as part of the 
generated data, i prefer to have zero (0) instead of NA on my data.
Is there a command i can issue to replace the NA with zero (0) even if it is 
after generating the data? 
Thank you
library(survival)

p1-0.8;b-1.5;rr-1000
for(i in 1:rr){
r-runif(45,min=0,max=1)
t-rweibull(45,p1,b)
w=Surv(r,t,type=interval2)

x[1:45]-(w[,1])
u-x[1:45]

y[1:45]-(w[,2])
v-y[1:45]
}
w
u
v

Chris G
Researcher
Institute for Mathematical Research
UPM

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Permute/bootstrap F-statistics

2012-05-03 Thread DF
Hi guys,

I really like the package adegenet for population structure analysis. I used
the function Fst from the pegas package (wrapped within adegenet) in order
to generate F-statistic estimates for my data set. However, this does not
provide me with confidence intervals or p-values. I was wondering if there
is a way to generate bootstrapped/permuted p values or confidence intervals?

Thanks,
DF

--
View this message in context: 
http://r.789695.n4.nabble.com/Permute-bootstrap-F-statistics-tp4606264.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Cannot read or write to file in Linux Ubuntu

2012-05-03 Thread Jeff Newmiller
I like the idea of staying with absolute paths.

Before you write too much R code that builds in absolute paths, please consider 
how difficult it will be to adjust all of those paths if you need to run on a 
different computer or you need to reorganize your overall directory structure. 
If you keep related R files in the same project directory, you can collapse all 
of those paths down to short relative paths, and do one setwd at the beginning, 
or learn to manually set your base working directory as a matter of habit 
before each working session. (This habit is useful in more areas than just R 
programming.)
---
Jeff NewmillerThe .   .  Go Live...
DCN:jdnew...@dcn.davis.ca.usBasics: ##.#.   ##.#.  Live Go...
  Live:   OO#.. Dead: OO#..  Playing
Research Engineer (Solar/BatteriesO.O#.   #.O#.  with
/Software/Embedded Controllers)   .OO#.   .OO#.  rocks...1k
--- 
Sent from my phone. Please excuse my brevity.

John Kane jrkrid...@inbox.com wrote:

Thanks Sarah,

I suspected something like that but am still gropping around in Linux. 
I vaguely remember how to cd to someplace.  Shades of DOS 3.2! Of was
that Unixor both!

Also I think I was trying to be a bit too smart-alecky in where I was
placing my data folder so I moved it to my home folder to simplify
figuring out the path. Still thinking in Windows terms.

After a bit of trial and error:

jjohn@john-K53U:~$ cd /home/john/rdata
john@john-K53U:~/rdata$ dir
tti.csv
john@john-K53U:~/rdata$ pwd
/home/john/rdata

so 
mydata - read.csv(/home/john/rdata/tti.csv, header = TRUE)
 works just fine.  I like the idea of staying with absolute paths.

I am most appreciative.

John Kane
Kingston ON Canada


 -Original Message-
 From: sarah.gos...@gmail.com
 Sent: Thu, 3 May 2012 12:29:14 -0400
 To: jrkrid...@inbox.com
 Subject: Re: [R] Cannot read or write to file in Linux Ubuntu
 
 Hi John,
 
 You're probably messing up the path, just as you suspect.
 
 If you use a relative path, like you are doing, then R looks for that
 location starting at R's current working directory, visible with
 getwd(). For linux, that's the location at which you started R if you
 started it from a terminal.
 
 The safest solution is to use an absolute path, which will likely be
 something resembling /home/john/DATA/... etc - note that it will
 always start with a / and go from there.
 
 If you know how to start a terminal window and cd to where your file
 is, pwd at the command prompt will give you the absolute path to that
 location, which is what you should be using until you get more
 comfortable with the file system.
 
 The error message means that R can't find the directory you're
telling
 it to use.
 
 Sarah
 
 On Thu, May 3, 2012 at 12:21 PM, John Kane jrkrid...@inbox.com
wrote:
 I am the proud owner of a new laptop since my old one died the other
 day.
 Currently I have a dual-boot Windows 7 Home and Ubuntu 12.04 .  I'll
 leave the Windows problems for another post.
 
  I know practically nothing about Linux so I am probably doing
something
 stupid but ... at the moment I cannot seem read or write  files in
 Ubuntu.   I am not having any problem saving other documents to the
hard
 drive and R , from my few simple tests, seems to be working okay
 otherwise.
 
 At the moment I am trying :
 
 mydata - read.csv(DATA/media/DATA/rdata/tt1.csv,  header = TRUE)
  or
 mydata - read.csv(DATA/rdata/tt1.csv,  header = TRUE)
 
 
 where tt1.csv is a text file on what, from my reading of the path
listed
 in gedit is
 DATA/media/DATA/rdata
 
 The csv data is simply:
 aa, bb
 2, 3
 4, 5
 
 What happens:

---
 1 mydata - read.csv(DATA/rdata/tt1.csv,  header = TRUE)
 Error in file(file, rt) : cannot open the connection
 In addition: Warning message:
 In file(file, rt) :
  cannot open file 'DATA/rdata/tt1.csv': No such file or directory
 
 Am I totally screwing up the path?  Or doing something else equally
 stupid?
 
 BTW I realise that 2.15 is out but Ubuntu as of yesterday did not
have
 it in the repositories and I have yet to figure out how to install
it
 from a CRAN site.
 
 1 sessionInfo()
 R version 2.14.1 (2011-12-22)
 Platform: i686-pc-linux-gnu (32-bit)
 
 locale:
  [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C
  [3] LC_TIME=en_CA.UTF-8        LC_COLLATE=en_US.UTF-8
  [5] LC_MONETARY=en_CA.UTF-8    LC_MESSAGES=en_US.UTF-8
  [7] LC_PAPER=C                 LC_NAME=C
  [9] LC_ADDRESS=C               LC_TELEPHONE=C
 [11] LC_MEASUREMENT=en_CA.UTF-8 LC_IDENTIFICATION=C
 
 attached base packages:
 [1] stats     graphics  grDevices utils     datasets  methods   base
 
 --
 Sarah Goslee
 http://www.functionaldiversity.org


[R] Runtime column name creation

2012-05-03 Thread pvshankar
Hello all, 

I have a data frame with column names s1, s2, s3s11 

I have a function that gets two parameters, one is used as a subscript for
the column names  and another is used as an index into the chosen column. 

For example: 

my_func - function(subscr, index) 
{ 
  if (subscr == 1) 
  { 
df$s1[index] - some value 
  } 
} 

The problem is, I do not want to create a bunch of if statements (one for
each 1:11 column names)). 
Instead, I want to create the column name in run time based on subscr
value. 

I tried eval(as.name(paste(df$s,subscr,sep=)))[index] - some value 

and it complains that object df$s1 is not found. 

Could someone please help me with this? 
(Needless to say, I have just started programing in R) 

Thanks, 
Shankar

--
View this message in context: 
http://r.789695.n4.nabble.com/Runtime-column-name-creation-tp4606497.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] braking a label in two lines when using expression()

2012-05-03 Thread pannigh
Your code did not run on my computer, however the atop function should do
what you are looking for I guess. This is more or less your axis, only
changed a bit in your formula, think it looks better this way:

e.g.

par(mar=c(5,7,.5,.5), las=1, adj=.5, cex.lab=1.5)
plot(1, type=n
, xlab=Week
, ylab=expression(atop(Mean Respisration Rate,
umol%.%L^-1%.%g~(AFDM)^-1) ))

--
View this message in context: 
http://r.789695.n4.nabble.com/braking-a-label-in-two-lines-when-using-expression-tp4605716p4606414.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] overlapping confidence bands for predicted probabilities from a logistic model

2012-05-03 Thread Malcolm Fairbrother
Dear list,

I'm a bit perplexed why the 95% confidence bands for the predicted 
probabilities for units where x=0 and x=1 overlap in the following instance.

I've simulated binary data to which I've then fitted a simple logistic 
regression model, with one covariate, and the coefficient on x is statistically 
significant at the 0.05 level. I've then used two different methods to generate 
95% confidence bands for predicted probabilities, for each of two possible 
values of x. Given the result of the model, I expected the bands not to 
overlap… but they do.

Can anyone please explain where I've gone wrong with the following code, OR why 
it makes sense for the bands to overlap, despite the statistically significant 
beta coefficient? There may be a good statistical reason for this, but I'm not 
aware of it.

Many thanks,
Malcolm Fairbrother


n - 120
set.seed(030512)
x - rbinom(n, 1, 0.5)
dat - within(data.frame(x), ybe - rbinom(n, 1, plogis(-0.5 + x)))
mod1 - glm(ybe ~ x, dat, family=binomial)
summary(mod1) # coefficient on x is statistically significant at the 0.05 
level… almost at the 0.01 level

pred - predict(mod1, newdata=data.frame(x=c(0,1)), se.fit=T)
with(pred, cbind(low = plogis(fit - 1.96*se.fit), est = plogis(fit), up = 
plogis(fit + 1.96*se.fit)))  # confidence bands based on SEs

# simulation-based confidence bands:
sims - t(replicate(200, coef(glm(simulate(mod1)$sim_1 ~ x, data=dat, 
family=binomial
pred0 - plogis(quantile(sims%*%c(1,0), c(0.025, 0.5, 0.975)))
pred1 - plogis(quantile(sims%*%c(1,1), c(0.025, 0.5, 0.975)))
rbind(pred0, pred1)

# the upper bound of the prediction for x=0 is greater than the lower bound of 
the prediction for x=1, using both methods

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Creating a point pattern with cartesian co-ordinates

2012-05-03 Thread AMFTom
I have the following data from an image analysis program, in which the x and
y co-ordinates are locations of the centroids of shapes on a 2 dimensional
plot. The Y co-ordinates were positive, but I changed them to negative as
the resulting scatterplot was upside down (the image analysis program reads
from the top of the image to the bottom, so it seems) I now need these as
point pattern data, as I want to subsequently add marks to them (area of
shape):

Can anybody assist?

XY
159.163 -165.203
162.469 -233
167.77  -213.503
169.988 -174.764
172.113 -246.258
172.494 -232.865
177.855 -213.164
178.568 -250.038
179.486 -186.033
180.208 -244.185
184.28  -170.754
185.447 -181.114
185.597 -193.815
188.779 -204.507
192.978 -220.553
198.208 -224.535
199.346 -185.853
200.3   -229.917
204.028 -190.552
205.1   -217.909
208.686 -242.057
212.575 -243.511
212.942 -223.68
212.994 -259.469
215.643 -200.613
217.51  -219.069
218.193 -225.037
220.049 -172.226
220.181 -251.714
221.946 -185.973
222.738 -195.468
223.08  -177.071
225.058 -262.459
225.265 -220.209
228.767 -199.047
229.196 -188.481
230.268 -245.925
233.604 -207.356
234.708 -247.057
234.874 -213.096
239.822 -251.908
240.771 -178.647
242.287 -193.157
242.468 -249.385
243.207 -233.702
247.955 -196.673
248.178 -252.502
250.82  -147.687
252.076 -209.894
253.883 -199.777
255.635 -251.216
257.03  -78.085
257.108 -229.982
258.46  -119.812
258.641 -218.726
258.953 -166.529
259.216 -150.816
260.544 -146.563
260.557 -84.336
260.59  -67.364
260.742 -190.558
262.247 -79.629
262.316 -86.505
262.436 -227.381
263.622 -127.775
263.814 -85.251
263.874 -94.635
264.945 -75.656
265.038 -223.406
265.313 -204.995
269.082 -183.29
269.428 -209.686
269.984 -246.628
270.371 -84.1
271.263 -72.811
271.886 -195.877
272.59  -250.311
273.572 -97.019
274.086 -199.21
274.329 -235.096
276.979 -77.21
277.313 -246.859
278.096 -188.909
278.223 -68.026
278.35  -139.616
280.89  -195.992
281.413 -78.605
282.904 -92.113
284.499 -231.532
285.708 -240.821
286.862 -221.241
287.679 -135.925
290.702 -68.571
290.744 -76.063
291.361 -239.134
292.471 -89.893
292.915 -229.442
293.193 -99.059
294.381 -81.307
295.605 -141.156
298.543 -100.885
298.689 -239.704
299.711 -137.435
299.925 -221.066
300.618 -130.093
302.588 -93.782
304.273 -203.95
304.766 -194.382
305.091 -110.146
306.04  -88.153
306.761 -186.272
307.554 -173.64
307.778 -80.67
310.273 -87.523
310.368 -103.525
310.621 -71.728
311.168 -193.219
311.29  -198.237
311.338 -168.259
311.925 -115.413
312.178 -203.015
312.577 -98.959
314.495 -81.542
315.035 -185.075
315.256 -92.839
316.889 -196.098
317.982 -95.894
318.453 -181.028
320.048 -67.681
320.888 -88.357
321.327 -75.035
322.997 -104.616
323.373 -95.794
324.414 -193.768
325.789 -84.277
327.046 -89.805
328.459 -78.971
329.2   -251.549
329.913 -72.674
332.794 -263.695
333.451 -200.654
333.756 -124.516
334.524 -174.331
335.301 -213.95
335.759 -166.324
335.848 -210.629
336.07  -75.13
337.105 -87.3
337.78  -178.097
338.997 -65.673
340.501 -184.499
341.346 -193.458
341.55  -230.801
342.054 -80.684
343.082 -202.06
344.873 -69.301
345.798 -232.187
347.177 -201.136
347.53  -103.792
348.168 -87.125
349.156 -236.411
349.531 -252.794
350.503 -92.074
350.79  -116.177
350.953 -177.364
354.308 -123.398
354.417 -237.45
355.424 -197.235
358.092 -164.202
361.405 -240.484
363.009 -213.804
364.806 -243.099


--
View this message in context: 
http://r.789695.n4.nabble.com/Creating-a-point-pattern-with-cartesian-co-ordinates-tp4606343.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Help with readBin

2012-05-03 Thread kapo coulibaly
 I believe here is the structure of the file I'm trying to read:
record marker (4 bytes), 2 integers (4 bytes each), 2 doubles (8 bytes
each), one string (16 bytes or 16 characters), 3 integers (4 bytes each), 1
record marker (4 bytes) and a big array of doubles (8 bytes each).
Everything in the file is read correctly except for the doubles.
If any indication, I've read similar file before with readBin the only
difference is this one was created with a code compiled with gfortran in
linux 64 bit. I was able to read the same output binary file when the
fortran source code was compiled in windows xp 32 bit. The values I'm
expecting should be between 0 and about 32.




The code I used is:



# Loading Required libraries
library(tcltk)

# Tk inputbox function
 inputBox-function() {
  tt-tktoplevel()
  Zmin-tclVar(0)
  Zmax-tclVar(0)
  dZ-tclVar(0)
  entry.Zmin-tkentry(tt,width=20,textvariable=Zmin)
  entry.Zmax-tkentry(tt,width=20,textvariable=Zmax)
  entry.dZ-tkentry(tt,width=20,textvariable=dZ)
  lbl.Zmin-tklabel(tt,text=Number of layers)
  lbl.Zmax-tklabel(tt,text=Number of Stress Periods)
  lbl.dZ-tklabel(tt,text=dZ)
  tkgrid(lbl.Zmin,entry.Zmin)
  tkgrid(entry.Zmin)
  tkgrid(lbl.Zmax,entry.Zmax)
  tkgrid(entry.Zmax)
  #tkgrid(lbl.dZ,entry.dZ)
  #tkgrid(entry.dZ)

  OnOK - function()
  {
  # NameVal - c(tclvalue(Zmin),tclvalue(Zmax),tclvalue(dZ))
tkdestroy(tt)
}
  OK.but -tkbutton(tt,text=   OK   ,command=OnOK)
  # tkbind(entry.Name, Return,OnOK)
  tkgrid(OK.but,columnspan=3)
  tkfocus(tt)
  tkwait.window(tt)
  res-as.numeric(c(tclvalue(Zmin),tclvalue(Zmax)))#,tclvalue(dZ)))
  return(res)
}


# Main program


# Model Parameters input (number of layers and stress periods)
param-inputBox()

# Select and open Modflow Binary file for reading
fich-tclvalue(tkgetOpenFile(title=Modflow Binary File,filetypes={{hds
binary Files} {.hds}} {{All files} *}))
zz - file(fich, rb)

# Cycling thru time steps and layers
for (k in 1:param[2]) {
  for (i in 1:param[1]) {
readBin(zz,what=numeric,n=1,size=4) # record marker typical of
fortran access=sequential in gfortran
readBin(zz,what=integer,n=2,size=4)-N1
readBin(zz,what=double,n=2,size=8)-N2
readChar(zz,16)-txt1
print(txt1)
readBin(zz,what=integer,n=3,size=4)-N3
tnber-N3[1]*N3[2]
readBin(zz,what=integer,n=1,size=4) # record marker typical of
fortran access=sequential in gfortran
readBin(zz,what=real(),n=tnber,size=4)-N4
readBin(zz,what=integer,n=2,size=4) # record marker typical of
fortran access=sequential in gfortran
print(N4[1:10])


}

}

close(zz)

On Thu, May 3, 2012 at 1:26 PM, Duncan Murdoch murdoch.dun...@gmail.comwrote:

 On 03/05/2012 12:41 PM, kapo coulibaly wrote:

 I'm trying to read a binary file created by a fortran code using readBin
 and readChar. Everything reads fine (integers and strings) except for
 double precision numbers, they are read as huge or very small number
 (1E-250,...). I tried various endianness, swap, But nothing has worked so
 far.
 I also tried on R 64 bit for linux and windows (R 2.14) and R 2.11 on
 windows XP 32 bit.
 Any help would be appreciated.


 As I wrote to someone else with a similar problem a couple of weeks ago:

 You need to see what's in the file.  The hexView package can dump it in
 various formats; see example(viewFormat) for a couple.

 Duncan Murdoch


[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Error with the 'segmented' package for R

2012-05-03 Thread Szymon Biskup
Hi everyone,


I have encountered this problem while using 'segmented' plugin for R i386 
2.15.0 (for Windows 32bit OS) and I just cannot find neither explanation nor 
solution for it.

I am trying to run this data

gpp temp
1.661   5
5.028   10
9.772   15
8.692   20
5.693   25
6.293   30
7.757   5
4.604   10
8.763   15
8.134   20
4.616   25
8.417   30
3.483   5
5.046   10
8.306   15
9.142   20
4.686   25
7.301   30


and with the 'segmented' I wanted to find the brakepoint for the curve, however 
I get this error mesasge:

Error in 1:ncol(U) : argument of length 0
In addition: Warning message:
In o0$boot.restart - ris : Coercing LHS to a list



I was using the following code:

library(segmented)
curva-read.table(gppdata.txt, header=T)
attach(curva)
fit.glm-glm(gpp~temp, weight=NULL, family=gaussian)
fit.seg-segmented(fit.glm, seg.Z=~temp, psi=15)
summary(fit.seg)



Is anyone able to help me and direct me for the source of the problem?

Thanks for your help.


Regards,

Szymon

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Runtime column name creation

2012-05-03 Thread Rui Barradas
Hello,

Don't cross post, please. 
You could even have saved you some time: the answer is already given in
R-devel.

Rui Barradas



--
View this message in context: 
http://r.789695.n4.nabble.com/Runtime-column-name-creation-tp4606497p4606561.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] How to replace NA with zero (0)

2012-05-03 Thread J Toll
On Thu, May 3, 2012 at 10:43 AM, Christopher Kelvin
chris_kelvin2...@yahoo.com wrote:

 Is there a command i can issue to replace the NA with zero (0) even if it is 
 after generating the data?

Chris,

I didn't try your example code, so this suggestion is far more
general, but you might try something along the lines of:

x[which(is.na(x))] - 0

Best,

James

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Cannot read or write to file in Linux Ubuntu

2012-05-03 Thread Sarah Goslee
On Thu, May 3, 2012 at 1:53 PM, Jeff Newmiller jdnew...@dcn.davis.ca.us wrote:
 I like the idea of staying with absolute paths.

 Before you write too much R code that builds in absolute paths, please 
 consider how difficult it will be to adjust all of those paths if you need to 
 run on a different computer or you need to reorganize your overall directory 
 structure. If you keep related R files in the same project directory, you can 
 collapse all of those paths down to short relative paths, and do one setwd at 
 the beginning, or learn to manually set your base working directory as a 
 matter of habit before each working session. (This habit is useful in more 
 areas than just R programming.)

I agree with this, which is why I suggested you use absolute paths
*until you get more comfortable with the file system.* It's a good way
to diagnose your problem and figure out how to deal with paths on
Linux, but not a good long-term strategy unless you expect that you
will never ever move anything or change to a new computer.

An intermediate solution that I use a lot is to put something like
this at the beginning of R script file:
basepath = /home/sarahg/whatever
and then load files using something like
read.table(paste(basepath, plantdata.csv, sep=/))

This eases portability between computers: I exchange a lot of analyses
with postdocs, students and techs, and somehow they've not all become
convinced that my way of organizing directories is the best one. Using
an object with the absolute path for the data means that we can pass
things back and forth with only changing one value within the script
rather than all input/output commands.

Sarah

-- 
Sarah Goslee
http://www.functionaldiversity.org

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] overlapping confidence bands for predicted probabilities from a logistic model

2012-05-03 Thread Bert Gunter
Your post suggests statistical confusion on several levels. But as
this concerns statistics, not R, consult your local statistician or
post on a statistical help list (e.g. stats.stackexchange.com).

Incidentally, the short answer to your question about the overlap is:
why shouldn't they? You will hopefully receive a more informative
longer answer from the above suggested (or similar) resources.

-- Bert


On Thu, May 3, 2012 at 9:37 AM, Malcolm Fairbrother
m.fairbrot...@bristol.ac.uk wrote:
 Dear list,

 I'm a bit perplexed why the 95% confidence bands for the predicted 
 probabilities for units where x=0 and x=1 overlap in the following instance.

 I've simulated binary data to which I've then fitted a simple logistic 
 regression model, with one covariate, and the coefficient on x is 
 statistically significant at the 0.05 level. I've then used two different 
 methods to generate 95% confidence bands for predicted probabilities, for 
 each of two possible values of x. Given the result of the model, I expected 
 the bands not to overlap… but they do.

 Can anyone please explain where I've gone wrong with the following code, OR 
 why it makes sense for the bands to overlap, despite the statistically 
 significant beta coefficient? There may be a good statistical reason for 
 this, but I'm not aware of it.

 Many thanks,
 Malcolm Fairbrother


 n - 120
 set.seed(030512)
 x - rbinom(n, 1, 0.5)
 dat - within(data.frame(x), ybe - rbinom(n, 1, plogis(-0.5 + x)))
 mod1 - glm(ybe ~ x, dat, family=binomial)
 summary(mod1) # coefficient on x is statistically significant at the 0.05 
 level… almost at the 0.01 level

 pred - predict(mod1, newdata=data.frame(x=c(0,1)), se.fit=T)
 with(pred, cbind(low = plogis(fit - 1.96*se.fit), est = plogis(fit), up = 
 plogis(fit + 1.96*se.fit)))  # confidence bands based on SEs

 # simulation-based confidence bands:
 sims - t(replicate(200, coef(glm(simulate(mod1)$sim_1 ~ x, data=dat, 
 family=binomial
 pred0 - plogis(quantile(sims%*%c(1,0), c(0.025, 0.5, 0.975)))
 pred1 - plogis(quantile(sims%*%c(1,1), c(0.025, 0.5, 0.975)))
 rbind(pred0, pred1)

 # the upper bound of the prediction for x=0 is greater than the lower bound 
 of the prediction for x=1, using both methods

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.



-- 

Bert Gunter
Genentech Nonclinical Biostatistics

Internal Contact Info:
Phone: 467-7374
Website:
http://pharmadevelopment.roche.com/index/pdb/pdb-functional-groups/pdb-biostatistics/pdb-ncb-home.htm

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Identifying the particular X or Y in a sorted list

2012-05-03 Thread Shankar Lanke
Dear All,
Thank you very much in advance.

I have a data sets as shown below A (Patient ID ), B and C are the
Concentration of drug in blood on day 1 and day 4, D is the difference in
conc. To do this in R I have written a code as follows, identified the
number of patients who have more concentration on day 4 . Here I want to
identify specifically the patient ID (is he patient 1 or 2 or 5 and 7),
whose concentration is more.
How to write a code to get the list of A (patient ID whose difference is
more on day 4).


A 1 2 3 4 5
B 7 2 3 6 9
C 4 6 9 2 5
(B-C) 3 -4 -6 4 4

DF1-list(A,B,C)
DF1

DF2-(DF1$C-DF1$B)
length(DF2)
sum(DF20)

#I want to subtract B from C to see and identify how many patients have
greater concentrations and who are these patients (A).


On Thu, May 3, 2012 at 12:15 PM, John Kane jrkrid...@inbox.com wrote:

 You do not seem to have suppied either code nor data.  Please supply both.

 John Kane
 Kingston ON Canada


  -Original Message-
  From: shankarla...@gmail.com
  Sent: Wed, 2 May 2012 22:06:54 -0400
  To: r-help@r-project.org
  Subject: [R] Identifying the particular X or Y in a sorted list
 
  Dear All,
 
  I have a data sets as shown below A (Patient ID ), B and C are the
  Concentration of drug in blood on day 1 and day 4, D is the difference in
  conc. To do this in R I have written a code as follows, identified the
  number of patients who have more concentration on day 4 . Here I want to
  identify specifically the patient ID (is he patient 1 or 2 or 5 and 7),
  whose concentration is more.
  How to write a code to get the list of A (patient ID whose difference is
  more on day 4).
 
  Data-(myDf$B-myDf$C)
  sum(Data0)
 
A B CD (B-C)  1 14 10 4  2 12 7 5  3 11 15 -4  4 8 3 5  5 1 8 -7
 
  I appreciate your help, thank you very much in advance.
 
  Regards
 
[[alternative HTML version deleted]]
 
  __
  R-help@r-project.org mailing list
  https://stat.ethz.ch/mailman/listinfo/r-help
  PLEASE do read the posting guide
  http://www.R-project.org/posting-guide.html
  and provide commented, minimal, self-contained, reproducible code.

 
 GET FREE SMILEYS FOR YOUR IM  EMAIL - Learn more at
 http://www.inbox.com/smileys
 Works with AIM®, MSN® Messenger, Yahoo!® Messenger, ICQ®, Google Talk™ and
 most webmails





-- 
Regards,
Shankar Lanke Ph.D.
Assistant Professor
Department of Pharmaceutical Sciences
College of Pharmacy
The University of Findlay
(C) 678-232-3567
(O) 419-434-5448
Fax# 419-434-4390

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] How to Export an R outcome to an Excel Spreadsheet

2012-05-03 Thread MacQueen, Don
The others have made suggestions for csv output.

For xlsx output go to CRAN, click on the packages link at the left, then
select packages listed by name. Then use your browser to search for the
string 'xls. There are several packages that offer this capability.

-Don

-- 
Don MacQueen

Lawrence Livermore National Laboratory
7000 East Ave., L-627
Livermore, CA 94550
925-423-1062





On 5/1/12 1:41 PM, Paul Bernal paulberna...@gmail.com wrote:

Hello R community,

I basically created a normal distribution with mean 2500 and standard
deviation = 450 with a sample of size 50 and assigned that to a variable
named genvar2 with the following command:

genvar2-rnorm(mean=2500, sd=450, n=50)

Now, the output of genvar2 generates de following:

[1] 2478.126 2671.259 2163.879 2440.796 2702.234 1871.514 2525.127
2830.688
 [9] 2704.148 3464.478 2609.795 3368.288 2661.613 2731.901 2535.846
2165.461
[17] 1870.069 3513.533 2053.342 2447.887 2605.913 2188.192 2514.004
2965.374
[25] 3550.454 1783.323 2568.323 2324.673 2528.994 2433.895 2751.111
2727.282
[33] 1837.081 1896.721 3445.993 1357.462 2348.177 2368.423 2029.738
2500.372
[41] 2000.008 3088.112 3003.325 2763.740 2475.636 1860.988 2292.909
2134.172
[49] 2291.116 2851.066
I want to export all of these numbers into a column in an excel file (in
an
xlsx and .csv format).

How can I do this?

Any help will be greatly appreciated,

Best regards,

Paul

   [[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Problem with 'nls' fitting logistic model (5PL)

2012-05-03 Thread Gabor Grothendieck
On Wed, May 2, 2012 at 3:32 PM, Michal Figurski
figur...@mail.med.upenn.edu wrote:
 Dear R-Helpers,

 I'm working with immunoassay data and 5PL logistic model. I wanted to
 experiment with different forms of weighting and parameter selection, which
 is not possible in instrument software, so I turned to R.

 I am using R 2.14.2 under Win7 64bit, and the 'nls' library to fit the model
 - I started with the same model and weighting type (1/y) as in the
 instrument to see if I'll get similar results. However, in some instances I
 don't get any results - just errors.

 Here is an example calibration data, representative of my experiment.
 Instrument soft had no problem fitting it:
 x - structure(list(SPL = structure(c(1L, 1L, 2L, 2L, 3L, 3L, 4L,
 4L, 5L, 5L, 6L, 6L, 7L, 7L), .Label = c(St1, St2, St3,
 St4, St5, St6, St7), class = factor), MFI = c(10755.5,
 9839, 5142.5, 4857, 1510.5, 1505, 502.5, 451, 215, 195.5, 58,
 57, 15, 15), nom = c(206, 206, 125, 125, 68, 68, 38, 38, 24,
 24, 13, 13, 6.5, 6.5), weights = c(0.0013946353028683, 0.00152454517735542,
 0.00291686922702965, 0.00308832612723904, 0.0099304865938431,
 0.00996677740863787, 0.0298507462686567, 0.0332594235033259,
 0.0697674418604651, 0.0767263427109974, 0.258620689655172,
 0.263157894736842,
 1, 1)), .Names = c(SPL, MFI, nom, weights), row.names = c(NA,
 -14L), class = data.frame)

 And here is the nls fit:
 fit - nls(MFI~a + b/((1+(nom/c)^d)^f), data=x, weights=x$weights,
 start=c(a=100, b=1, c=100, d=-1, f=1))


This looks quite linear to me:

fo - log(MFI) ~ log(nom)
plot(fo, x)
abline(lm(fo, x))

Try using that as the basis for an alternate model.

-- 
Statistics  Software Consulting
GKX Group, GKX Associates Inc.
tel: 1-877-GKX-GROUP
email: ggrothendieck at gmail.com

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Help with readBin

2012-05-03 Thread Duncan Murdoch

On 03/05/2012 1:57 PM, kapo coulibaly wrote:

  I believe here is the structure of the file I'm trying to read:
record marker (4 bytes), 2 integers (4 bytes each), 2 doubles (8 bytes
each), one string (16 bytes or 16 characters), 3 integers (4 bytes each), 1
record marker (4 bytes) and a big array of doubles (8 bytes each).
Everything in the file is read correctly except for the doubles.
If any indication, I've read similar file before with readBin the only
difference is this one was created with a code compiled with gfortran in
linux 64 bit. I was able to read the same output binary file when the
fortran source code was compiled in windows xp 32 bit. The values I'm
expecting should be between 0 and about 32.


The first two doubles read properly as 1, and the next one as 33.674, 
when I follow your description above on the dump you sent me 
privately.   I'm on a system with endian=little; you might want to 
specify that explicitly in your readBin calls.


And please, in future, spend some time making your questions easier to 
answer:  put the dump in the public message.  Don't post irrelevant 
code, extract just the bit that's doing the reading.


Duncan Murdoch






The code I used is:



# Loading Required libraries
library(tcltk)

# Tk inputbox function
  inputBox-function() {
   tt-tktoplevel()
   Zmin-tclVar(0)
   Zmax-tclVar(0)
   dZ-tclVar(0)
   entry.Zmin-tkentry(tt,width=20,textvariable=Zmin)
   entry.Zmax-tkentry(tt,width=20,textvariable=Zmax)
   entry.dZ-tkentry(tt,width=20,textvariable=dZ)
   lbl.Zmin-tklabel(tt,text=Number of layers)
   lbl.Zmax-tklabel(tt,text=Number of Stress Periods)
   lbl.dZ-tklabel(tt,text=dZ)
   tkgrid(lbl.Zmin,entry.Zmin)
   tkgrid(entry.Zmin)
   tkgrid(lbl.Zmax,entry.Zmax)
   tkgrid(entry.Zmax)
   #tkgrid(lbl.dZ,entry.dZ)
   #tkgrid(entry.dZ)

   OnOK- function()
   {
   # NameVal- c(tclvalue(Zmin),tclvalue(Zmax),tclvalue(dZ))
 tkdestroy(tt)
 }
   OK.but-tkbutton(tt,text=   OK   ,command=OnOK)
   # tkbind(entry.Name, Return,OnOK)
   tkgrid(OK.but,columnspan=3)
   tkfocus(tt)
   tkwait.window(tt)
   res-as.numeric(c(tclvalue(Zmin),tclvalue(Zmax)))#,tclvalue(dZ)))
   return(res)
}


# Main program


# Model Parameters input (number of layers and stress periods)
param-inputBox()

# Select and open Modflow Binary file for reading
fich-tclvalue(tkgetOpenFile(title=Modflow Binary File,filetypes={{hds
binary Files} {.hds}} {{All files} *}))
zz- file(fich, rb)

# Cycling thru time steps and layers
for (k in 1:param[2]) {
   for (i in 1:param[1]) {
 readBin(zz,what=numeric,n=1,size=4) # record marker typical of
fortran access=sequential in gfortran
 readBin(zz,what=integer,n=2,size=4)-N1
 readBin(zz,what=double,n=2,size=8)-N2
 readChar(zz,16)-txt1
 print(txt1)
 readBin(zz,what=integer,n=3,size=4)-N3
 tnber-N3[1]*N3[2]
 readBin(zz,what=integer,n=1,size=4) # record marker typical of
fortran access=sequential in gfortran
 readBin(zz,what=real(),n=tnber,size=4)-N4
 readBin(zz,what=integer,n=2,size=4) # record marker typical of
fortran access=sequential in gfortran
 print(N4[1:10])


 }

}

close(zz)

On Thu, May 3, 2012 at 1:26 PM, Duncan Murdochmurdoch.dun...@gmail.comwrote:

  On 03/05/2012 12:41 PM, kapo coulibaly wrote:

  I'm trying to read a binary file created by a fortran code using readBin
  and readChar. Everything reads fine (integers and strings) except for
  double precision numbers, they are read as huge or very small number
  (1E-250,...). I tried various endianness, swap, But nothing has worked so
  far.
  I also tried on R 64 bit for linux and windows (R 2.14) and R 2.11 on
  windows XP 32 bit.
  Any help would be appreciated.


  As I wrote to someone else with a similar problem a couple of weeks ago:

  You need to see what's in the file.  The hexView package can dump it in
  various formats; see example(viewFormat) for a couple.

  Duncan Murdoch


[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Identifying the particular X or Y in a sorted list

2012-05-03 Thread Bert Gunter
Have you read An Introduction to R? If I understand correctly, your
question is very basic and suggests that you have not yet made even a
minimal effort on your own to learn R before asking for help from this
list.

-- Bert

answer: A[CB]

On Thu, May 3, 2012 at 11:14 AM, Shankar Lanke shankarla...@gmail.com wrote:
 Dear All,
 Thank you very much in advance.

 I have a data sets as shown below A (Patient ID ), B and C are the
 Concentration of drug in blood on day 1 and day 4, D is the difference in
 conc. To do this in R I have written a code as follows, identified the
 number of patients who have more concentration on day 4 . Here I want to
 identify specifically the patient ID (is he patient 1 or 2 or 5 and 7),
 whose concentration is more.
 How to write a code to get the list of A (patient ID whose difference is
 more on day 4).


 A 1 2 3 4 5
 B 7 2 3 6 9
 C 4 6 9 2 5
 (B-C) 3 -4 -6 4 4

 DF1-list(A,B,C)
 DF1

 DF2-(DF1$C-DF1$B)
 length(DF2)
 sum(DF20)

 #I want to subtract B from C to see and identify how many patients have
 greater concentrations and who are these patients (A).


 On Thu, May 3, 2012 at 12:15 PM, John Kane jrkrid...@inbox.com wrote:

 You do not seem to have suppied either code nor data.  Please supply both.

 John Kane
 Kingston ON Canada


  -Original Message-
  From: shankarla...@gmail.com
  Sent: Wed, 2 May 2012 22:06:54 -0400
  To: r-help@r-project.org
  Subject: [R] Identifying the particular X or Y in a sorted list
 
  Dear All,
 
  I have a data sets as shown below A (Patient ID ), B and C are the
  Concentration of drug in blood on day 1 and day 4, D is the difference in
  conc. To do this in R I have written a code as follows, identified the
  number of patients who have more concentration on day 4 . Here I want to
  identify specifically the patient ID (is he patient 1 or 2 or 5 and 7),
  whose concentration is more.
  How to write a code to get the list of A (patient ID whose difference is
  more on day 4).
 
  Data-(myDf$B-myDf$C)
  sum(Data0)
 
    A B CD (B-C)  1 14 10 4  2 12 7 5  3 11 15 -4  4 8 3 5  5 1 8 -7
 
  I appreciate your help, thank you very much in advance.
 
  Regards
 
        [[alternative HTML version deleted]]
 
  __
  R-help@r-project.org mailing list
  https://stat.ethz.ch/mailman/listinfo/r-help
  PLEASE do read the posting guide
  http://www.R-project.org/posting-guide.html
  and provide commented, minimal, self-contained, reproducible code.

 
 GET FREE SMILEYS FOR YOUR IM  EMAIL - Learn more at
 http://www.inbox.com/smileys
 Works with AIM®, MSN® Messenger, Yahoo!® Messenger, ICQ®, Google Talk™ and
 most webmails





 --
 Regards,
 Shankar Lanke Ph.D.
 Assistant Professor
 Department of Pharmaceutical Sciences
 College of Pharmacy
 The University of Findlay
 (C) 678-232-3567
 (O) 419-434-5448
 Fax# 419-434-4390

        [[alternative HTML version deleted]]


 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.




-- 

Bert Gunter
Genentech Nonclinical Biostatistics

Internal Contact Info:
Phone: 467-7374
Website:
http://pharmadevelopment.roche.com/index/pdb/pdb-functional-groups/pdb-biostatistics/pdb-ncb-home.htm

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Identifying the particular X or Y in a sorted list

2012-05-03 Thread John Kane

   I'm sorry, it's still not clear what you are doing but perhaps this is
   close?
   mydata  - data.frame( a = c(1, 2, 3, 4 , 5),
   b  =  c(7, 2, 3, 6, 9),
   c  = c(4, 6, 9, 2, 5))
   mydata$d  - mydata$b - mydata$c
   mydata
   subset(mydata, mydata$d ==max(mydata$d))



   John Kane
   Kingston ON Canada

   -Original Message-
   From: shankarla...@gmail.com
   Sent: Thu, 3 May 2012 14:14:17 -0400
   To: jrkrid...@inbox.com
   Subject: Re: [R] Identifying the particular X or Y in a sorted list

   Dear All,
   Thank you very much in advance.
   I  have  a  data  sets as shown below A (Patient ID ), B and C are the
   Concentration of drug in blood on day 1 and day 4, D is the difference in
   conc. To do this in R I have written a code as follows, identified the
   number of patients who have more concentration on day 4 . Here I want to
   identify specifically the patient ID (is he patient 1 or 2 or 5 and 7),
   whose concentration is more.
   How to write a code to get the list of A (patient ID whose difference is
   more on day 4).

   A 1 2 3 4 5
   B 7 2 3 6 9
   C 4 6 9 2 5
   (B-C) 3 -4 -6 4 4
   DF1-list(A,B,C)
   DF1
   DF2-(DF1$C-DF1$B)
   length(DF2)
   sum(DF20)
   #I want to subtract B from C to see and identify how many patients have
   greater concentrations and who are these patients (A).
   On Thu, May 3, 2012 at 12:15 PM, John Kane [1]jrkrid...@inbox.com wrote:

 You do not seem to have suppied either code nor data.  Please supply both.
 John Kane
 Kingston ON Canada

-Original Message-
From: [2]shankarla...@gmail.com
Sent: Wed, 2 May 2012 22:06:54 -0400
To: [3]r-help@r-project.org
Subject: [R] Identifying the particular X or Y in a sorted list
   
Dear All,
   
I have a data sets as shown below A (Patient ID ), B and C are the
Concentration of drug in blood on day 1 and day 4, D is the difference in
conc. To do this in R I have written a code as follows, identified the
number of patients who have more concentration on day 4 . Here I want to
identify specifically the patient ID (is he patient 1 or 2 or 5 and 7),
whose concentration is more.
How to write a code to get the list of A (patient ID whose difference is
more on day 4).
   
Data-(myDf$B-myDf$C)
sum(Data0)
   

A B CD (B-C)  1 14 10 4  2 12 7 5  3 11 15 -4  4 8 3 5  5 1 8 -7

   
I appreciate your help, thank you very much in advance.
   
Regards
   

[[alternative HTML version deleted]]
 
  __
  [4]R-help@r-project.org mailing list
  [5]https://stat.ethz.ch/mailman/listinfo/r-help
  PLEASE do read the posting guide
  [6]http://www.R-project.org/posting-guide.html
  and provide commented, minimal, self-contained, reproducible code.
 
 GET   FREE   SMILEYS   FOR   YOUR  IMEMAIL  -  Learn  more  at
 [7]http://www.inbox.com/smileys
 Works with AIM®, MSN® Messenger, Yahoo!® Messenger, ICQ®, Google Talk™ and
 most webmails

   --
   Regards,
   Shankar Lanke Ph.D.
   Assistant Professor
   Department of Pharmaceutical Sciences
   College of Pharmacy
   The University of Findlay
   (C) 678-232-3567
   (O) 419-434-5448
   Fax# 419-434-4390
 _

   Free Online Photosharing - Share your photos online with your friends and
   family!
   Visit [8]http://www.inbox.com/photosharing to find out more!

References

   1. mailto:jrkrid...@inbox.com
   2. mailto:shankarla...@gmail.com
   3. mailto:r-help@r-project.org
   4. mailto:R-help@r-project.org
   5. https://stat.ethz.ch/mailman/listinfo/r-help
   6. http://www.R-project.org/posting-guide.html
   7. http://www.inbox.com/smileys
   8. http://www.inbox.com/photosharing
__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Cannot read or write to file in Linux Ubuntu

2012-05-03 Thread John Kane
Thanks Jeff and Sarah.  

I was thinking mainly of using the base path and paste routine which is 
something I do in Windows  

It will take me a while to figrue out relative paths.  

John Kane
Kingston ON Canada


 -Original Message-
 From: sarah.gos...@gmail.com
 Sent: Thu, 3 May 2012 14:07:12 -0400
 To: jdnew...@dcn.davis.ca.us
 Subject: Re: [R] Cannot read or write to file in Linux Ubuntu
 
 On Thu, May 3, 2012 at 1:53 PM, Jeff Newmiller jdnew...@dcn.davis.ca.us
 wrote:
 I like the idea of staying with absolute paths.
 
 Before you write too much R code that builds in absolute paths, please
 consider how difficult it will be to adjust all of those paths if you
 need to run on a different computer or you need to reorganize your
 overall directory structure. If you keep related R files in the same
 project directory, you can collapse all of those paths down to short
 relative paths, and do one setwd at the beginning, or learn to manually
 set your base working directory as a matter of habit before each working
 session. (This habit is useful in more areas than just R programming.)
 
 I agree with this, which is why I suggested you use absolute paths
 *until you get more comfortable with the file system.* It's a good way
 to diagnose your problem and figure out how to deal with paths on
 Linux, but not a good long-term strategy unless you expect that you
 will never ever move anything or change to a new computer.
 
 An intermediate solution that I use a lot is to put something like
 this at the beginning of R script file:
 basepath = /home/sarahg/whatever
 and then load files using something like
 read.table(paste(basepath, plantdata.csv, sep=/))
 
 This eases portability between computers: I exchange a lot of analyses
 with postdocs, students and techs, and somehow they've not all become
 convinced that my way of organizing directories is the best one. Using
 an object with the absolute path for the data means that we can pass
 things back and forth with only changing one value within the script
 rather than all input/output commands.
 
 Sarah
 
 --
 Sarah Goslee
 http://www.functionaldiversity.org



TRY FREE IM TOOLPACK at http://www.imtoolpack.com/default.aspx?rc=if5
Capture screenshots, upload images, edit and send them to your friends
through IMs, post on Twitter®, Facebook®, MySpace™, LinkedIn® – FAST!

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Help with readBin

2012-05-03 Thread kapo coulibaly
I did mention in my initial email that I tried little, big, swap and
.Platform$endian without any success, I keep getting the same very small
numbers.
Thanks

On Thu, May 3, 2012 at 2:36 PM, Duncan Murdoch murdoch.dun...@gmail.comwrote:

 On 03/05/2012 1:57 PM, kapo coulibaly wrote:

  I believe here is the structure of the file I'm trying to read:
 record marker (4 bytes), 2 integers (4 bytes each), 2 doubles (8 bytes
 each), one string (16 bytes or 16 characters), 3 integers (4 bytes each),
 1
 record marker (4 bytes) and a big array of doubles (8 bytes each).
 Everything in the file is read correctly except for the doubles.
 If any indication, I've read similar file before with readBin the only
 difference is this one was created with a code compiled with gfortran in
 linux 64 bit. I was able to read the same output binary file when the
 fortran source code was compiled in windows xp 32 bit. The values I'm
 expecting should be between 0 and about 32.


 The first two doubles read properly as 1, and the next one as 33.674, when
 I follow your description above on the dump you sent me privately.   I'm on
 a system with endian=little; you might want to specify that explicitly in
 your readBin calls.

 And please, in future, spend some time making your questions easier to
 answer:  put the dump in the public message.  Don't post irrelevant code,
 extract just the bit that's doing the reading.

 Duncan Murdoch





 The code I used is:



 # Loading Required libraries
 library(tcltk)

 # Tk inputbox function
  inputBox-function() {
   tt-tktoplevel()
   Zmin-tclVar(0)
   Zmax-tclVar(0)
   dZ-tclVar(0)
   entry.Zmin-tkentry(tt,width=**20,textvariable=Zmin)
   entry.Zmax-tkentry(tt,width=**20,textvariable=Zmax)
   entry.dZ-tkentry(tt,width=**20,textvariable=dZ)
   lbl.Zmin-tklabel(tt,text=**Number of layers)
   lbl.Zmax-tklabel(tt,text=**Number of Stress Periods)
   lbl.dZ-tklabel(tt,text=dZ)
   tkgrid(lbl.Zmin,entry.Zmin)
   tkgrid(entry.Zmin)
   tkgrid(lbl.Zmax,entry.Zmax)
   tkgrid(entry.Zmax)
   #tkgrid(lbl.dZ,entry.dZ)
   #tkgrid(entry.dZ)

   OnOK- function()
   {
   # NameVal- c(tclvalue(Zmin),tclvalue(**Zmax),tclvalue(dZ))
 tkdestroy(tt)
 }
   OK.but-tkbutton(tt,text=   OK   ,command=OnOK)
   # tkbind(entry.Name, Return,OnOK)
   tkgrid(OK.but,columnspan=3)
   tkfocus(tt)
   tkwait.window(tt)
   res-as.numeric(c(tclvalue(**Zmin),tclvalue(Zmax)))#,**tclvalue(dZ)))
   return(res)
 }

 ##**##**
 
 # Main program
 ##**##**
 

 # Model Parameters input (number of layers and stress periods)
 param-inputBox()

 # Select and open Modflow Binary file for reading
 fich-tclvalue(tkgetOpenFile(**title=Modflow Binary
 File,filetypes={{hds
 binary Files} {.hds}} {{All files} *}))
 zz- file(fich, rb)

 # Cycling thru time steps and layers
 for (k in 1:param[2]) {
   for (i in 1:param[1]) {
 readBin(zz,what=numeric,n=1,**size=4) # record marker typical of
 fortran access=sequential in gfortran
 readBin(zz,what=integer,n=2,**size=4)-N1
 readBin(zz,what=double,n=2,**size=8)-N2
 readChar(zz,16)-txt1
 print(txt1)
 readBin(zz,what=integer,n=3,**size=4)-N3
 tnber-N3[1]*N3[2]
 readBin(zz,what=integer,n=1,**size=4) # record marker typical of
 fortran access=sequential in gfortran
 readBin(zz,what=real(),n=**tnber,size=4)-N4
 readBin(zz,what=integer,n=2,**size=4) # record marker typical of
 fortran access=sequential in gfortran
 print(N4[1:10])


 }

 }

 close(zz)

 On Thu, May 3, 2012 at 1:26 PM, Duncan 
 Murdochmurdoch.duncan@gmail.**commurdoch.dun...@gmail.com
 wrote:

   On 03/05/2012 12:41 PM, kapo coulibaly wrote:
 
   I'm trying to read a binary file created by a fortran code using
 readBin
   and readChar. Everything reads fine (integers and strings) except for
   double precision numbers, they are read as huge or very small number
   (1E-250,...). I tried various endianness, swap, But nothing has
 worked so
   far.
   I also tried on R 64 bit for linux and windows (R 2.14) and R 2.11 on
   windows XP 32 bit.
   Any help would be appreciated.
 
 
   As I wrote to someone else with a similar problem a couple of weeks
 ago:
 
   You need to see what's in the file.  The hexView package can dump it in
   various formats; see example(viewFormat) for a couple.
 
   Duncan Murdoch
 

[[alternative HTML version deleted]]

 __**
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/**listinfo/r-helphttps://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/**
 posting-guide.html http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.




[[alternative HTML version deleted]]

__
R-help@r-project.org 

Re: [R] deparse(substitute(x)) on an object with S3 class

2012-05-03 Thread Thomas Lumley
On Fri, May 4, 2012 at 2:02 AM, R. Michael Weylandt
michael.weyla...@gmail.com wrote:
 Note that print.testclass(testlist) also works as expected so it might
 be there's a forced evaluation somewhere inside S3 dispatch

Indeed. Without evaluating the argument, the S3 method dispatch can't
work out which method to dispatch to. In principle it could go back
and get a copy of the unevaluated promise, but it doesn't (and that
would only move the problem somewhere else, since the evaluation could
have side effects that would then happen multiple times)

Section 5.4 of the R Language Definition says, in part,

When the method is invoked it is called with arguments that are the
same in number and have the same names as in the call to the generic.
They are matched to the arguments of the method according to the
standard R rules for argument matching. However the object, i.e. the
first argument has been evaluated.


   -thomas

-- 
Thomas Lumley
Professor of Biostatistics
University of Auckland

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Very small random effect estimation in lmer but not in stata xtmixed

2012-05-03 Thread Thomas Lumley
On Thu, May 3, 2012 at 11:50 PM, Mohammed Mohammed
m.a.moham...@bham.ac.uk wrote:
 Hi folks

 I am using the lmer function (in the lme4 library) to analyse some data where 
 individuals are clustered into sets (using the SetID variable) with a single 
 fixed effect (cc - 0 or 1). The lmer model and output is shown below.
 Whilst the fixed effects are consistent with stata (using xtmixed, see 
 below), the std dev of the random effect for SetID is very very small 
 (3.5803e-05)compared to stata's (see below 1.002). Any ideas why this should 
 be happening please?


You're not fitting the same model.  Like SAS, Stata by default assumes
that random effects are independent of each other, so your Stata model
has correlation between the random effects forced to zero.  The R
model estimates the correlation, and finds it to be far from zero
(-0.69).

   -thomas

-- 
Thomas Lumley
Professor of Biostatistics
University of Auckland

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Identifying the particular X or Y in a sorted list

2012-05-03 Thread Shankar Lanke
Dear John,

Thank you very much for your response, I appreciate your input.
I am able to subtract the two columns, (B - C) , the subset information I
need is how many As and who are the A. For example P,Q,R,S,T, persons
earned $  7, 2, 3, 6, 9  in 1 st month and  $ 4, 6, 9, 2, 5 in 2nd month. I
want to identify who earned more   on 1st month + the difference (only if
it is positive). In this case P,S,T earned $3,4,4, in 2nd month.



mydata  - data.frame( a = c(1, 2, 3, 4 , 5), b  =  c(7, 2, 3, 6, 9), c  =
c(4, 6, 9, 2, 5))
mydata$d  - mydata$b - mydata$c
mydata
subset(mydata, mydata$d ==max(mydata$d))

On Thu, May 3, 2012 at 2:47 PM, John Kane jrkrid...@inbox.com wrote:

 **
 I'm sorry, it's still not clear what you are doing but perhaps this is
 close?

 mydata  - data.frame( a = c(1, 2, 3, 4 , 5), b  =  c(7, 2, 3, 6, 9), c  =
 c(4, 6, 9, 2, 5))
 mydata$d  - mydata$b - mydata$c
 mydata
 subset(mydata, mydata$d ==max(mydata$d))



 John Kane
 Kingston ON Canada


 -Original Message-
 *From:* shankarla...@gmail.com
 *Sent:* Thu, 3 May 2012 14:14:17 -0400
 *To:* jrkrid...@inbox.com
 *Subject:* Re: [R] Identifying the particular X or Y in a sorted list

 Dear All,
 Thank you very much in advance.

 I have a data sets as shown below A (Patient ID ), B and C are the
 Concentration of drug in blood on day 1 and day 4, D is the difference in
 conc. To do this in R I have written a code as follows, identified the
 number of patients who have more concentration on day 4 . Here I want to
 identify specifically the patient ID (is he patient 1 or 2 or 5 and 7),
 whose concentration is more.
 How to write a code to get the list of A (patient ID whose difference is
 more on day 4).


 A 1 2 3 4 5
 B 7 2 3 6 9
 C 4 6 9 2 5
 (B-C) 3 -4 -6 4 4

 DF1-list(A,B,C)
 DF1

 DF2-(DF1$C-DF1$B)
 length(DF2)
 sum(DF20)

 #I want to subtract B from C to see and identify how many patients have
 greater concentrations and who are these patients (A).


 On Thu, May 3, 2012 at 12:15 PM, John Kane jrkrid...@inbox.com wrote:

 You do not seem to have suppied either code nor data.  Please supply both.

 John Kane
 Kingston ON Canada


  -Original Message-
  From: shankarla...@gmail.com
  Sent: Wed, 2 May 2012 22:06:54 -0400
  To: r-help@r-project.org
  Subject: [R] Identifying the particular X or Y in a sorted list
 
  Dear All,
 
  I have a data sets as shown below A (Patient ID ), B and C are the
  Concentration of drug in blood on day 1 and day 4, D is the difference in
  conc. To do this in R I have written a code as follows, identified the
  number of patients who have more concentration on day 4 . Here I want to
  identify specifically the patient ID (is he patient 1 or 2 or 5 and 7),
  whose concentration is more.
  How to write a code to get the list of A (patient ID whose difference is
  more on day 4).
 
  Data-(myDf$B-myDf$C)
  sum(Data0)
 
A B CD (B-C)  1 14 10 4  2 12 7 5  3 11 15 -4  4 8 3 5  5 1 8 -7
 
  I appreciate your help, thank you very much in advance.
 
  Regards
 
[[alternative HTML version deleted]]
 
  __
  R-help@r-project.org mailing list
  https://stat.ethz.ch/mailman/listinfo/r-help
  PLEASE do read the posting guide
  http://www.R-project.org/posting-guide.html
  and provide commented, minimal, self-contained, reproducible code.

 
 GET FREE SMILEYS FOR YOUR IM  EMAIL - Learn more at
 http://www.inbox.com/smileys
 Works with AIM®, MSN® Messenger, Yahoo!® Messenger, ICQ®, Google Talk™ and
 most webmails





 --
 Regards,
 Shankar Lanke Ph.D.
 Assistant Professor
 Department of Pharmaceutical Sciences
 College of Pharmacy
 The University of Findlay
 (C) 678-232-3567
 (O) 419-434-5448
 Fax# 419-434-4390

 --
  Free Online Photosharing - Share your photos online with your friends
 and family!
 Visit http://www.inbox.com/photosharing to find out more!




-- 
Regards,
Shankar Lanke Ph.D.
Assistant Professor
Department of Pharmaceutical Sciences
College of Pharmacy
The University of Findlay
(C) 678-232-3567
(O) 419-434-5448
Fax# 419-434-4390

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Help with readBin

2012-05-03 Thread William Dunlap
You can do the following to allow others to recreate your problem.

  yourFileBytes - readBin(yourFile, what=integer, size=1, n=300) # is 300 
bytes enough to see the problem?
  dput(yourFileBytes)

Put the output of dput(yourFileBytes) in your mail.  Someone can (and you 
should)
recreate the problem with
  bytes - ... copy 'n paste the printout of dput(bytes) here ...
  tf - tempfile()
  stopifnot(is.integer(bytes)  all(abs(bytes)=128)) # to make sure bytes was 
copied correctly
  writeBin(bytes, con=tf, size=1)

Then show just the commands needed to read a couple of rows of your file, along 
with
the expected output, as precisely and you can.  E.g.,
  con - file(tf, rb)
  readBin(con, what=integer, size=4, n=2) # expect 3 then something less than 
10
  readBin(con, what=numeric, size=8, n=3) # expect 2 numbers in range (0, 32] 
then 2.57
  ...

Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com


 -Original Message-
 From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On 
 Behalf
 Of kapo coulibaly
 Sent: Thursday, May 03, 2012 10:57 AM
 To: r-help@r-project.org
 Subject: Re: [R] Help with readBin
 
  I believe here is the structure of the file I'm trying to read:
 record marker (4 bytes), 2 integers (4 bytes each), 2 doubles (8 bytes
 each), one string (16 bytes or 16 characters), 3 integers (4 bytes each), 1
 record marker (4 bytes) and a big array of doubles (8 bytes each).
 Everything in the file is read correctly except for the doubles.
 If any indication, I've read similar file before with readBin the only
 difference is this one was created with a code compiled with gfortran in
 linux 64 bit. I was able to read the same output binary file when the
 fortran source code was compiled in windows xp 32 bit. The values I'm
 expecting should be between 0 and about 32.
 
 
 
 
 The code I used is:
 
 
 
 # Loading Required libraries
 library(tcltk)
 
 # Tk inputbox function
  inputBox-function() {
   tt-tktoplevel()
   Zmin-tclVar(0)
   Zmax-tclVar(0)
   dZ-tclVar(0)
   entry.Zmin-tkentry(tt,width=20,textvariable=Zmin)
   entry.Zmax-tkentry(tt,width=20,textvariable=Zmax)
   entry.dZ-tkentry(tt,width=20,textvariable=dZ)
   lbl.Zmin-tklabel(tt,text=Number of layers)
   lbl.Zmax-tklabel(tt,text=Number of Stress Periods)
   lbl.dZ-tklabel(tt,text=dZ)
   tkgrid(lbl.Zmin,entry.Zmin)
   tkgrid(entry.Zmin)
   tkgrid(lbl.Zmax,entry.Zmax)
   tkgrid(entry.Zmax)
   #tkgrid(lbl.dZ,entry.dZ)
   #tkgrid(entry.dZ)
 
   OnOK - function()
   {
   # NameVal - c(tclvalue(Zmin),tclvalue(Zmax),tclvalue(dZ))
 tkdestroy(tt)
 }
   OK.but -tkbutton(tt,text=   OK   ,command=OnOK)
   # tkbind(entry.Name, Return,OnOK)
   tkgrid(OK.but,columnspan=3)
   tkfocus(tt)
   tkwait.window(tt)
   res-as.numeric(c(tclvalue(Zmin),tclvalue(Zmax)))#,tclvalue(dZ)))
   return(res)
 }
 
 
 
 # Main program
 
 
 
 # Model Parameters input (number of layers and stress periods)
 param-inputBox()
 
 # Select and open Modflow Binary file for reading
 fich-tclvalue(tkgetOpenFile(title=Modflow Binary File,filetypes={{hds
 binary Files} {.hds}} {{All files} *}))
 zz - file(fich, rb)
 
 # Cycling thru time steps and layers
 for (k in 1:param[2]) {
   for (i in 1:param[1]) {
 readBin(zz,what=numeric,n=1,size=4) # record marker typical of
 fortran access=sequential in gfortran
 readBin(zz,what=integer,n=2,size=4)-N1
 readBin(zz,what=double,n=2,size=8)-N2
 readChar(zz,16)-txt1
 print(txt1)
 readBin(zz,what=integer,n=3,size=4)-N3
 tnber-N3[1]*N3[2]
 readBin(zz,what=integer,n=1,size=4) # record marker typical of
 fortran access=sequential in gfortran
 readBin(zz,what=real(),n=tnber,size=4)-N4
 readBin(zz,what=integer,n=2,size=4) # record marker typical of
 fortran access=sequential in gfortran
 print(N4[1:10])
 
 
 }
 
 }
 
 close(zz)
 
 On Thu, May 3, 2012 at 1:26 PM, Duncan Murdoch
 murdoch.dun...@gmail.comwrote:
 
  On 03/05/2012 12:41 PM, kapo coulibaly wrote:
 
  I'm trying to read a binary file created by a fortran code using readBin
  and readChar. Everything reads fine (integers and strings) except for
  double precision numbers, they are read as huge or very small number
  (1E-250,...). I tried various endianness, swap, But nothing has worked so
  far.
  I also tried on R 64 bit for linux and windows (R 2.14) and R 2.11 on
  windows XP 32 bit.
  Any help would be appreciated.
 
 
  As I wrote to someone else with a similar problem a couple of weeks ago:
 
  You need to see what's in the file.  The hexView package can dump it in
  various formats; see example(viewFormat) for a couple.
 
  Duncan Murdoch
 
 
   [[alternative HTML version deleted]]
 
 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the 

[R] Identifying case by groups in a data frame

2012-05-03 Thread Jose Bustos Melo
Hi everyone,

I would like to identify the case by groups that is just bigger that 
avg plus sd. For example, using species as group and petal.wid as my 
variable in the iris data.
What's the better way to doit? creating a function?

So,the question is to identify the single element of each species that is just 
larger than a cut-off point (i.e. larger than mean + sd)
I made this, but I can not make a relation to to orginal data for identifiying 
the cases.If you have some lights please share it.

data(iris)

library(plyr)
petal.wid.avg -ddply(iris,.(Species),function(df)
  return(c(petal.wid.avg=mean(df$Petal.Width),petal.wid.sd=sd(df$Petal.Width)))
)
petal.wid.avg$avgsd -petal.wid.avg$petal.wid.avg +petal.wid.avg$petal.wid.sd
petal.wid.avg


Thanks in advance:
José
[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] NA's when subset in a dataframe

2012-05-03 Thread Milan Bouchet-Valat
Le jeudi 03 mai 2012 à 07:37 -0700, agent dunham a écrit :
 Dear community, 
 
 I'm having this silly problem.
 
 I've a linear model. After fixing it, I wanted to know which data had
 studentized residuals larger than 3, so i tried this: 
 
 d1 - cooks.distance(lmmodel)
 r - sqrt(abs(rstandard(lmmodel)))
 rstu - abs(rstudent(lmmodel))
 
 a - cbind( mydata, d1, r,rstu) 
 
 alargerthan3 -  a[rstu 3, ]
 
 And suddenly  a[rstu 3, ]  has 17 rows, 7 of them are new rows, where all
 the entries are NA's, even its rownames. 
 
 Because of this I'm not sure of the dimension ofa[rstu 3, ]  (Do I only
 have 8 entries?)
 
 Has this happened to anybody before? If so, why this extra NA rows? what's
 the problem? Is there any other way to know which data have studentized
 residuals larger than   3?
 
 
  if it's needed  to upload my data, just tell me.
A small reproducible example would have been better. Anyway, see page 88
of The R Inferno.

In your case, the simplest solutions are to do:
alargerthan3 - a[which(rstu  3),]
or
alargerthan3 - subset(a, rstu  3)


Cheers

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] May-July 2012 ***R/S-PLUS Courses***by XLSolutions Corp at 9 USA locations

2012-05-03 Thread s...@xlsolutions-corp.com
 Seattle  May 17-18 *

XLSolutions May-July 2012 R/S-PLUS courses schedule is now
available online at 9 USA cities for with 13 new courses: *** Suggest a
future
course date/city

(1) R-PLUS: A Point-and-Click Approach to R
(2) S-PLUS / R : Programming Essentials.
(3) R/S+ Fundamentals and Programming Techniques
(4) R/S-PLUS Functions by Example.
(5) S/R-PLUS Programming 3: Advanced Techniques and Efficiencies.
(6) R/S+ System: Advanced Programming.
(7) R/S-PLUS Graphics: Essentials.
(8) R/S-PLUS Graphics for SAS Users
(9) R/S-PLUS Graphical Techniques for Marketing Research.
(10) Multivariate Statistical Methods in R/S-PLUS: Practical Research
Applications
(11) Introduction to Applied Econometrics with R/S-PLUS
(12) Exploratory Analysis for Large and Complex Problems in R/S-PLUS
(13) Determining Power and Sample Size Using R/S-PLUS.
(14) R/S-PLUS: Data Preparation for Data Mining
(15) Data Cleaning Techniques in R/S-PLUS
(16) R/S-PLUS: Applied Clustering Techniques


More on website

http://www.xlsolutions-corp.com/courselistlisting.aspx 

Ask for group discount and reserve your seat Now - Earlybird Rates.
Payment due after the class! Email Sue Turner: sue at
xlsolutions-corp.com

Phone: 206-686-1578


Please let us know if you and your colleagues are interested in this
class to take advantage of group discount. Register now to secure your
seat.

Cheers,
Elvis Miller, PhD
Manager Training.
XLSolutions Corporation
206 686 1578
www.xlsolutions-corp.com
elvis at xlsolutions-corp.com


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Identifying case by groups in a data frame

2012-05-03 Thread Jorge I Velez
Dear José,

Here is one way:

# aux. function
foo - function(x, ...){
m - mean(x, ...)
S - sd(x, ...)
x  m + S
}

# result
iris$rule - with(iris, ave(Petal.Width, list(Species), FUN = foo))
head(iris)

HTH,
Jorge.-


On Thu, May 3, 2012 at 5:19 PM, Jose Bustos Melo  wrote:

 Hi everyone,

 I would like to identify the case by groups that is just bigger that
 avg plus sd. For example, using species as group and petal.wid as my
 variable in the iris data.
 What's the better way to doit? creating a function?

 So,the question is to identify the single element of each species that is
 just larger than a cut-off point (i.e. larger than mean + sd)
 I made this, but I can not make a relation to to orginal data for
 identifiying the cases.If you have some lights please share it.

 data(iris)

 library(plyr)
 petal.wid.avg -ddply(iris,.(Species),function(df)
   return(c(petal.wid.avg=mean(df$Petal.Width),petal.wid.sd
 =sd(df$Petal.Width)))
 )
 petal.wid.avg$avgsd -petal.wid.avg$petal.wid.avg +petal.wid.avg$
 petal.wid.sd
 petal.wid.avg


 Thanks in advance:
 José
[[alternative HTML version deleted]]


 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.



[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] GAM, how to set qr=TRUE

2012-05-03 Thread Ben quant
Hello,

I don't understand what went wrong or how to fix this. How do I set qr=TRUE
for gam?

When I produce a fit using gam like this:

fit = gam(y~s(x),data=as.data.frame(l_yx),family=family,control =
list(keepData=T))

...then try to use predict:
(see #1 below in the traceback() )

 traceback()
6: stop(lm object does not have a proper 'qr' component.\n Rank zero or
should not have used lm(.., qr=FALSE).) at #81
5: qr.lm(object) at #81
4: summary.glm(object, dispersion = dispersion) at #81
3: summary(object, dispersion = dispersion) at #81
2: predict.glm(fit, data.frame(x = xx), type = response, se.fit = T,
   col = prediction_col, lty = prediction_ln) at #81
1: predict(fit, data.frame(x = xx), type = response, se.fit = T,
   col = prediction_col, lty = prediction_ln) at #81

...I get this error:

Error in qr.lm(object) : lm object does not have a proper 'qr' component.
 Rank zero or should not have used lm(.., qr=FALSE).

I read this post: http://tolstoy.newcastle.edu.au/R/devel/06/04/5133.html

So I tried adding qr=T to the gam call but it didn't make any difference.
This is how I did it:

fit = gam(y~s(x),data=as.data.frame(l_yx),family=family,control =
list(keepData=T),qr=T)

Its all very strange because I've produced fits with this data may times
before with no issues (and never having to do anything with the qr
parameter. I don't understand why this is coming up or how to fix it.

PS - I don't think this matters, but I am calling a script called
FunctionGamFit.r like this:
err = system(paste('C:\\Program Files\\R\\R-2.14.1\\bin\\R.exe', 'CMD
BATCH FunctionGamFit.r'), wait = T)
...to produce the fit.

Thanks for any help!

ben

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Simple plot loop

2012-05-03 Thread Peter Ehlers

Ben,

I think that your original for-loop would work if you just
replaced the 'i' in the lines() call with 'Data2[,i]':

   for (i in 2:length(Data2)) {
  lines(MONTH, Data2[, i], type=o, pch=22, lty=2, col=blue)
   }

Peter Ehlers

On 2012-05-03 07:04, Ben Neal wrote:

Jim, thanks for the reply. I tried what you recommend, but I still get an error 
when running it, just as I did with the similar loops I was trying. Here is the 
error:

Error in xy.coords(x, y) : 'x' and 'y' lengths differ

That comes from this code:

#Get file
library(zoo)
setwd(/Users/benjaminneal/Documents/All Panama/1110_Panama/CNAT_Segmenting)
Data = read.csv(120503_CNAT_Summary.csv, header=T)

#fill in missing data from last observation
Data2- na.locf(Data)
attach(Data2)

#Plot line, and make it loop
for(column in names(Data2)[2:length(Data2)])
   lines(MONTH,column,type=o,pch=22,lty=2,col=blue)

The problem perhaps is in my data. My columns are observations over time, 
column headers are individual names, and the first column is the time series in 
months. An example is here:
MONTH   Tag101
0 234
2 236
4 239
8 300
10   320

This then repeats for different individuals . . . I think my problem must be 
that my length of MONTH is different than my length of observations of each 
column . . . except that it is not, as far as I can tell! Thank you for 
assisting me with this simple but frustrating problem - I have the greatest 
respect for those of you who provide these answers that so many of us read and 
utilize. Thank you. Ben Neal


-Original Message-
From: r-help-boun...@r-project.org on behalf of Jim Lemon
Sent: Thu 5/3/2012 3:40 AM
To: Ben Neal
Cc: r-help@r-project.org
Subject: Re: [R] Simple plot loop

On 05/03/2012 05:50 PM, Ben Neal wrote:

Trying to plot multiple lines from a simple matrix with headers (eight 
observations per column). I will be doing a number of these, with varying 
numbers of columns, and do not want to enter the header names for each one (I 
got frustrated and just wrote them out, which did work).

Data reads fine, first plot is fine, but when i use the code at the bottom for 
a for i loop it tells me that x and y do not match. . .

One other issue is that I would prefer not to specify the first column either, 
but when I enter Plot(MONTH, Data2[2] . . . it also does not plot. Should this 
not call the second column?

Thank you very much for any comments. I know this is simple, but I appreciate 
any assistance. Cheers, Ben

#

# LOAD DATA FROM CSV
library(zoo)
setwd(/Users/benjaminneal/Documents/All Panama/1110_Panama/CNAT_Segmenting)
Data = read.csv(120503_CNAT_Summary.csv, header=T)

#fill in missing data from last observation
Data2- na.locf(Data)
attach(Data2)

# PLOT ALL ON ONE CHART
plot(MONTH,T102, type=o, ann=False, ylim=c(1, 100), pch=22, lty=2, 
col=red)
title(main=5m and 10 m Colpophylia natans colonies over time, ylab=% live coral / 
colony,
xlab=Months, col.main=black, font.main=4)

lines(MONTH,T162, type=o, pch=22, lty=2, col=red)
lines(MONTH,T231, type=o, pch=22, lty=2, col=green)
lines(MONTH,T250, type=o, pch=22, lty=2, col=green)

##(many other similar lines here, with entered column headers . . . up to 75)

lines(MONTH,T373, type=o, pch=22, lty=2, col=blue)
lines(MONTH,T374, type=o, pch=22, lty=2, col=blue)
lines(MONTH,T377, type=o, pch=22, lty=2, col=blue)

# Tried to add lines another way with for i loop, but this is the part not 
working
for (i in 2:length(Data2)) {
lines(MONTH, i, type=o, pch=22, lty=2, col=blue))
}
#


Hi Ben,
I think what you may want in your loop is this:

for(column in names(Data2)[2:length(Data2)])
   lines(MONTH,column,type=o,pch=22,lty=2,col=blue)

But, if you want the first two lines to be green, you'll probably have
to get a vector of colors:

colorvec-rep(blue,length(Data2))
colorvec[1]-red
colorvec[2:3]-green

and change the above to:

columnnames-names(Data2)
for(column in 2:length(Data2))
   lines(MONTH,columnnames[column],type=o,pch=22,lty=2,col=colvec[column])

Jim

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, 

[R] Difference between 10 and 10L

2012-05-03 Thread brwin338

Good Evening
We have been searching through the R documentation manuals without success on 
this one.
What is the purpose or result of the L in the following?

n=10
and 
n=10L

or
c(5,10)
versus
c(5L,10L)

Thanks
Joe



Thanks
Joe


[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


  1   2   >