[R] R command to open a file browser on Windows and Mac?

2015-08-03 Thread Jonathan Greenberg
Folks:

Is there an easy function to open a finder window (on mac) or windows
explorer window (on windows) given an input folder?  A lot of times I want
to be able to see via a file browser my working directory.  Is there a good
R hack to do this?

--j


__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] R command to open a file browser on Windows and Mac?

2015-08-03 Thread Duncan Murdoch
On 03/08/2015 11:19 AM, Jonathan Greenberg wrote:
 Folks:
 
 Is there an easy function to open a finder window (on mac) or windows
 explorer window (on windows) given an input folder?  A lot of times I want
 to be able to see via a file browser my working directory.  Is there a good
 R hack to do this?

On Windows, shell.exec(dir) will open Explorer at that directory.
(It'll do something else if dir isn't a directory name, or has spaces in
it without quotes, so you need to be a little careful.)

On OSX, system2("open", dir) should do the same.

Duncan Murdoch
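[Editor's note: the platform-specific answers in this thread can be combined into one small helper. A sketch, hedged since path quoting differs across shells — the normalizePath()/shQuote() calls below are defensive assumptions, not requirements:]

```r
# Open 'dir' in the native file browser (Explorer, Finder, or the
# Linux desktop's default file manager).
open_dir <- function(dir = getwd()) {
  dir <- normalizePath(dir)             # resolve to an absolute path
  if (.Platform$OS.type == "windows") {
    shell.exec(dir)                     # Explorer; be careful with spaces
  } else if (Sys.info()[["sysname"]] == "Darwin") {
    system2("open", shQuote(dir))       # Finder
  } else {
    system2("xdg-open", shQuote(dir))   # common on Linux desktops
  }
  invisible(dir)
}
```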



[R] Households per Census block

2015-08-03 Thread Keith S Weintraub
Folks,

I am using the UScensus2010 package and I am trying to figure out the number of 
households per census block.

There are a number of possible data downloads in the package but apparently I 
am not smart enough to figure out which data-set is appropriate and what 
functions to use.

Any help or pointers or links would be greatly appreciated.

Thanks for your time,
Best,
KW



Re: [R] R command to open a file browser on Windows and Mac?

2015-08-03 Thread Mark Sharp
Set your path with setwd("my_path") and then use file.choose().

You could have gotten this information sooner with a simple online search.

Mark
R. Mark Sharp, Ph.D.
Director of Primate Records Database
Southwest National Primate Research Center
Texas Biomedical Research Institute
P.O. Box 760549
San Antonio, TX 78245-0549
Telephone: (210)258-9476
e-mail: msh...@txbiomed.org







 On Aug 3, 2015, at 10:19 AM, Jonathan Greenberg j...@illinois.edu wrote:
 
 Folks:
 
 Is there an easy function to open a finder window (on mac) or windows
 explorer window (on windows) given an input folder?  A lot of times I want
 to be able to see via a file browser my working directory.  Is there a good
 R hack to do this?
 
 --j
 
 


Re: [R] Using R to fit a curve to a dataset using a specific equation

2015-08-03 Thread David L Carlson
Use Reply-All to keep the discussion on the list.

I suggested reading about nls (not just how to do it in R) because you 
requested R2. It was not clear that you were aware that there are strong 
reasons to suspect that R2 is misleading when applied to nls results. That is 
why nls() does not provide it automatically.

But R2 is easily computed from the model results:

GossSS <- sum((dta$Gossypol - mean(dta$Gossypol))^2)
R2 <- deviance(dta.nls)/GossSS
R2
[1] 0.6318866

As for ggplot, just add the line we created before to the points plot:

library(ggplot2)
xval <- seq(0, 10, length.out=200)
yval <- predict(dta.nls, data.frame(Damage_cm=xval))
ggplot() + geom_point(data=dta, aes(x=Damage_cm, y=Gossypol)) + 
geom_line(aes(x=xval, y=yval))


David Carlson

From: Michael Eisenring [mailto:michael.eisenr...@gmx.ch] 
Sent: Saturday, August 1, 2015 5:33 PM
To: David L Carlson
Subject: Aw: RE: [R] Using R to fit a curve to a dataset using a specific 
equation

Hello and thank you very much for your help!
I just started to read up on non-linear least squares in The R Book. (I am 
totally new to the topic, so I didn't even know where to look in the book.)
I have three last questions:
 
In The R Book they describe how to report a model. In my case it would be 
something like:
'The model y ~ y0 + a * (1 - b^x)
had y0= 1303.45 ( 386.15 standard error), a= and b=
The model explained ??% of the total variation in y'
 
My question is: where do I find the percentage of total variation the model 
explains? The book does not say.
Is there something similar to an R^2 value or a p-value?
 
My last question: is it possible to use ggplot2 for plotting the whole model?
 
Thanks a lot.
Mike
 
  
Gesendet: Samstag, 01. August 2015 um 13:49 Uhr
Von: David L Carlson dcarl...@tamu.edu
An: Michael Eisenring michael.eisenr...@gmx.ch, r-help@r-project.org 
r-help@r-project.org
Betreff: RE: [R] Using R to fit a curve to a dataset using a specific equation
I can get you started, but you should really read up on non-linear least 
squares. Calling your data frame dta (since data is a function):

plot(Gossypol~Damage_cm, dta)
# Looking at the plot, 0 is a plausible estimate for y0:
# a+y0 is the asymptote, so estimate about 4000;
# b is between 0 and 1, so estimate .5
dta.nls <- nls(Gossypol~y0+a*(1-b^Damage_cm), dta,
start=list(y0=0, a=4000, b=.5))
xval - seq(0, 10, length.out=200)
lines(xval, predict(dta.nls, data.frame(Damage_cm=xval)))
profile(dta.nls, alpha= .05)
===
Number of iterations to convergence: 3
Achieved convergence tolerance: 1.750586e-06
attr(,"summary")

Formula: Gossypol ~ y0 + a * (1 - b^Damage_cm)

Parameters:
Estimate Std. Error t value Pr(>|t|)
y0 1303.4529432 386.1515684 3.37550 0.0013853 **
a 2796.0464520 530.4140959 5.27144 2.5359e-06 ***
b 0.4939111 0.1809687 2.72926 0.0085950 **
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 1394.375 on 53 degrees of freedom

Number of iterations to convergence: 3
Achieved convergence tolerance: 1.750586e-06


David Carlson
Dept of Anthropology
Texas A&M University
College Station, TX 77843

From: R-help [r-help-boun...@r-project.org] on behalf of Michael Eisenring 
[michael.eisenr...@gmx.ch]
Sent: Saturday, August 01, 2015 10:17 AM
To: r-help@r-project.org
Subject: [R] Using R to fit a curve to a dataset using a specific equation

Hi there




I would like to use a specific equation to fit a curve to one of my data
sets (attached)

> dput(data)

structure(list(Gossypol = c(1036.331811, 4171.427741, 6039.995102,
5909.068158, 4140.242559, 4854.985845, 6982.035521, 6132.876396,
948.2418407, 3618.448997, 3130.376482, 5113.942098, 1180.171957,
1500.863038, 4576.787021, 5629.979049, 3378.151945, 3589.187889,
2508.417927, 1989.576826, 5972.926124, 2867.610671, 450.7205451, 1120.955,
3470.09352, 3575.043632, 2952.931863, 349.0864019, 1013.807628, 910.8879471,
3743.331903, 3350.203452, 592.3403778, 1517.045807, 1504.491931,
3736.144027, 2818.419785, 723.885643, 1782.864308, 1414.161257, 3723.629772,
3747.076592, 2005.919344, 4198.569251, 2228.522959, 3322.115942,
4274.324792, 720.9785449, 2874.651764, 2287.228752, 5654.858696,
1247.806111, 1247.806111, 2547.326207, 2608.716056, 1079.846532), Treatment
= structure(c(2L, 3L, 4L, 5L, 2L, 3L, 4L, 5L, 1L, 2L, 3L, 5L, 1L, 2L, 3L,
4L, 5L, 1L, 2L, 3L, 4L, 5L, 1L, 2L, 3L, 4L, 5L, 1L, 2L, 3L, 4L, 5L, 1L, 2L,
3L, 4L, 5L, 1L, 2L, 3L, 4L, 5L, 1L, 2L, 3L, 4L, 5L, 1L, 2L, 3L, 4L, 5L, 1L,
2L, 3L, 1L), .Label = c("C", "1c_2d", "3c_2d", "9c_2d", "1c_7d"), class =
"factor"), Damage_cm = c(0.4955, 1.516, 4.409, 3.2665, 0.491, 2.3035, 3.51,
1.8115, 0, 0.4435, 1.573, 1.8595, 0, 0.142, 2.171, 4.023, 4.9835, 0, 0.6925,
1.989, 5.683, 3.547, 0, 0.756, 2.129, 9.437, 3.211, 0, 0.578, 2.966, 4.7245,
1.8185, 0, 1.0475, 1.62, 5.568, 9.7455, 0, 0.8295, 2.411, 7.272, 4.516, 0,
0.4035, 2.974, 8.043, 4.809, 0, 0.6965, 1.313, 5.681, 3.474, 0, 0.5895,
2.559, 0)), .Names = 

Re: [R] R command to open a file browser on Windows and Mac?

2015-08-03 Thread Barry Rowlingson
And for completeness, on linux:

system(paste0("xdg-open ", getwd()))

there's a function in a package somewhere that hides the system
dependencies of opening things with the appropriate application, and
if you pass a folder/directory to it I reckon it will open it in the
Explorer/Finder/Nautilus//xfm//This Month's Linux File Browser// as
appropriate.

But I can't remember the name of the function or the package.

Barry
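[Editor's note: utils::browseURL() is one candidate for the function Barry half-remembers — it hands a URL (including a file:// one) to the platform's default opener, though whether a bare directory opens in the file manager is platform-dependent:]

```r
# May open the working directory in the system file browser on some
# platforms; on others it may land in a web browser instead.
browseURL(getwd())
```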



On Mon, Aug 3, 2015 at 4:19 PM, Jonathan Greenberg j...@illinois.edu wrote:
 Folks:

 Is there an easy function to open a finder window (on mac) or windows
 explorer window (on windows) given an input folder?  A lot of times I want
 to be able to see via a file browser my working directory.  Is there a good
 R hack to do this?

 --j





Re: [R] Using R to fit a curve to a dataset using a specific equation

2015-08-03 Thread David L Carlson
Your question is more statistics than R and I’m not qualified to offer an 
opinion. You should be able to find someone locally to help you. The Cross 
Validated website is also a useful resource.

David

From: Michael Eisenring [mailto:michael.eisenr...@gmx.ch]
Sent: Monday, August 3, 2015 10:37 AM
To: David L Carlson
Cc: r-help
Subject: Aw: RE: RE: [R] Using R to fit a curve to a dataset using a specific 
equation

Hi David,
thank you for your help.
It makes sense to me that R2 is very misleading in a non-linear 
regression; the same is true for the p-values.
My question then is: how can I present the results of my curve and quantify 
its goodness of fit if R2 and p-values are misleading?
Thanks a lot,
Mike




[R-es] Menor que 1000

2015-08-03 Thread Manuel Máquez
Dear colleagues:
I am trying to produce some plots, and when I run the last line,
R tells me:
geom_smooth: method="auto" and size of largest group is <1000, so using
loess. Use 'method = x' to change the smoothing method.

The line in question is:
ggplot(data = dat, aes(x=srt, y=d_t, col=Detec)) +
geom_point(aes(shape=Detec)) +
geom_smooth(span=0.65, aes(group=1))+
scale_colour_manual(values=c("black", "red")) +
ggtitle("Curva Suavizada con Intervalo de Confianza +
Detectados y No Detectados")
The data are 348 observations of 6 variables.

*Can anyone help me by telling me what I should change so that this notice
no longer appears? I checked the ggplot help on RStudio support but did not
find what I was looking for.*
*Thanks in advance.*

*MANOLO MÁRQUEZ P.*
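[Editor's note: the notice is informational, not an error — with fewer than 1000 points geom_smooth() defaults to loess. Naming the method explicitly silences it. A sketch, assuming `dat` has the columns used above:]

```r
library(ggplot2)

# Stating method = "loess" explicitly suppresses the informational
# message about the automatic method choice.
ggplot(data = dat, aes(x = srt, y = d_t, col = Detec)) +
  geom_point(aes(shape = Detec)) +
  geom_smooth(method = "loess", span = 0.65, aes(group = 1)) +
  scale_colour_manual(values = c("black", "red")) +
  ggtitle("Curva Suavizada con Intervalo de Confianza +
Detectados y No Detectados")
```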


___
R-help-es mailing list
R-help-es@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-help-es


Re: [R-es] ayuda con análisis de supervivencia

2015-08-03 Thread Griera
Hello:

What do you mean by "the bmi up to the event"?

As for it not being a survival time, you are not the only one. This article 
does not use a time either:

http://www.ncbi.nlm.nih.gov/pubmed/8970394

Regards.


On Sun, 2 Aug 2015 19:19:45 +0200
JM ARBONES marbo...@unizar.es wrote:

 Hello everyone,
 - I am studying the effect of two genotypes (~treatments) on the onset of 
 metabolic syndrome (MetS), with longitudinal data collected at years 
 0, 7, 10, 15, 20 and 25.
 
 - I have built a data frame with the following variables:
 MetS: metabolic syndrome (yes=1, no=0)
 bmi: body mass index (BMI) at the time of conversion to MetS+. For those who 
 remain MetS-, this variable is the BMI at censoring (drop-out, or end of the 
 study at year 25).
 bmi0: BMI at the start of the study (categorical, levels=normal/overweight/obese)
 apoE4: genotype of interest (E4, no-E4)
 
 - My hypothesis is that the genotype~MetS interaction depends on BMI at the 
 start of the study. Specifically, individuals who are 'overweight' at the 
 start of the study and carry the E4 genotype convert to MetS+ at lower BMI 
 values than those with the no-E4 genotype. This would not happen in the 
 'normal' and 'obese' groups.
 
 - I have created Surv objects, but instead of using the time to the event 
 (MetS+) I am using the BMI up to the event. The plots that result from the 
 survival analysis would seem to confirm my hypothesis, but I do not know 
 whether what I am doing is a statistical aberration. Nor do I know whether 
 the Cox regression coefficients make sense when the time variable is not 
 used.
 
 Could someone 1) tell me whether what I am doing makes sense and 2) how to 
 interpret the results (Cox regression and plots)?
 If anyone is willing to answer, here is a link to the data (.Rdata) and 
 the script I used in the analysis.
 
 
 https://www.dropbox.com/s/d96itird8ms42yx/dataframe.Rdata?dl=0 
 https://www.dropbox.com/s/d96itird8ms42yx/dataframe.Rdata?dl=0
 
 sapply(levels(df0$bmi0),function (x){ #SURVIVAL CURVE
   dfx=filter(df0,bmi0==x)
   
   surv2=Surv(dfx$bmi,dfx$MetS)
   km2=survfit(surv2~dfx$apoe4)##start.time=20,type='kaplan')
   plot(km2,lty=2:1,xlim=c(20,41),xlab='BMI at onset',main=x,mark.time = F)
   legend('bottomleft',c('E4','no-E4'),lty=2:1)
   cox=list(coxph(surv2~relevel(dfx$apoe4,ref='no-E4')))
 })
 
 sapply(levels(df0$bmi0),function (x){ #CUMULATIVE HAZARDs
   dfx=filter(df0,bmi0==x)
   
   surv2=Surv(dfx$bmi,dfx$MetS)
   km2=survfit(surv2~dfx$apoe4)
   plot(km2,lty=2:1,xlim=c(20,41),xlab='BMI at onset',main=x,mark.time = 
 F,fun='cumhaz')
   legend('topleft',c('E4','no-E4'),lty=2:1)
   
 })
 
 Many thanks and best regards,
 
 Jose Miguel
 
 ---
 
 Jose Miguel Arbones-Mainar, PhD
 Unidad de Investigación Traslacional 
 Instituto Aragones de Ciencias de la Salud 
 Hospital Universitario Miguel Servet 
 Pº Isabel la Católica, 1-3 
 50009 Zaragoza (Spain) 
 Tel: +34 976 769 565
 Fax: +34 976 769 566 
 www.adipofat.com http://www.adipofat.com/
 
 
 
 
 ---
 Jose Miguel Arbones-Mainar
 www.adipofat.com http://www.adipofat.com/
 
 
 
 
 
 
 
 



[R] Faster text search in document database than with grep?

2015-08-03 Thread Witold E Wolski
I have a database of text documents (letter sequences): several thousand
documents with approx. 1000-2000 letters each.

I need to find exact matches of short 3-15 letter sequences in those
documents.

Without any regexp patterns, searching for one 3-15 letter word takes on
the order of 1 s.

So for a database with several thousand documents it's on the order of
hours.
The naive approach would be to use mcmapply, but then on standard
hardware I am still in the same order of magnitude, and since R is an
interactive programming environment this isn't a solution I would go for.

But aren't there faster algorithmic solutions? Can anyone please point me
to an implementation available in R?

Thank you
Witold




-- 
Witold Eryk Wolski
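[Editor's note: one straightforward speedup for literal (non-regex) queries is `fixed = TRUE` in grepl()/gregexpr(), which bypasses the regex engine and does a plain substring search. A minimal self-contained sketch, with synthetic documents standing in for the real database:]

```r
# Synthetic stand-in for the document database: 2000 "documents"
# of 1500 random letters each.
set.seed(1)
docs <- replicate(2000, paste(sample(LETTERS, 1500, replace = TRUE),
                              collapse = ""))

word <- "QKX"   # an example 3-letter query; real queries are 3-15 letters

# fixed = TRUE requests literal matching, which is typically much
# faster than going through the regex engine for exact-match queries.
hits <- which(grepl(word, docs, fixed = TRUE))
```

For many queries at once, an index-based approach (e.g. the Aho-Corasick-style matching in Bioconductor's Biostrings package) may be faster still, but that is a heavier dependency.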




Re: [R] Splitting lines in R script

2015-08-03 Thread Jim Lemon
Hi Steven,
In general, the command line must be incomplete (in your case, a
trailing hyphen) for the interpreter to take the next line as a
continuation.

Jim


 On Sun, Aug 2, 2015 at 9:05 PM, Steven Yen sye...@gmail.com wrote:
 I have a line containing summation of four components.

 # This works OK:
   p<-pbivnorm(bb,dd,tau)+pbivnorm(aa,cc,tau)-
 -pbivnorm(aa,dd,tau)-pbivnorm(bb,cc,tau)

 # This produces unpredicted results without warning:
   p<-pbivnorm(bb,dd,tau)+pbivnorm(aa,cc,tau)
 -pbivnorm(aa,dd,tau)-pbivnorm(bb,cc,tau)

 Is there a general rule of thumb for line breaks? Thank you.
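[Editor's note: Jim's rule can be made concrete — keep each line syntactically incomplete (e.g. end it with an operator) so the parser keeps reading. A minimal sketch:]

```r
a <- 1; b <- 2; c <- 3; d <- 4

# Continued correctly: the first line ends with an operator,
# so the parser knows the expression is incomplete.
s1 <- a + b -
      c - d          # s1 is 1 + 2 - 3 - 4 = -4

# Split incorrectly: 'a + b' is already a complete expression,
# so the next line is evaluated separately as '-c - d'.
s2 <- a + b
-c - d               # s2 is 3; this line's value (-7) is discarded
```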




Re: [R] Natural Smoothing B-splines

2015-08-03 Thread Marc Lamblin
That is a more effective search than browsing or digging manually
through the documentation. Almost surely I'll find what I need!

Thanks very much

mggl



[R] NCDF_arrays

2015-08-03 Thread Sibylle Stöckli
Dear R-users

I am working with ncdf data using the variables time (1-365), lon (longitude), 
lat (latitude) and a daily temperature variable. After setting the 
parameters for the model, I am able to calculate the output for each lon-lat 
grid point. The model works well with one ncdf file (TabsD29) (1). Now I 
want to include more than one ncdf file (2). All the ncdf files have the same 
variables, with the exception of the temperature variable. However, the output 
has the wrong arrays. I think my approach to calculating the mean is wrong?

Many thanks
Sibylle


 nlon <- length(lon)
 nlat <- length(lat)
 nday <- length(TabsD29[1,1,])

 Tmin <- 10.
 GDDmax   <- 145
 DOYstart <- 1
 
 (1)
 Teffs   <- pmax(TabsD29 - Tmin, array(0., dim=c(nlon, nlat, nday))) 

(2)

 
 Teff   <- 
pmax(mean(as.numeric(TabsD80+TabsD81+TabsD82+TabsD83+TabsD84+TabsD85+TabsD86+TabsD87+TabsD88+TabsD89+TabsD90+TabsD91+TabsD92+TabsD93+TabsD94+TabsD95+TabsD96+TabsD97+TabsD98+TabsD99+TabsD20+TabsD21+TabsD22+TabsD23+TabsD24+TabsD25+TabsD26+TabsD27+TabsD28+TabsD29))
 - Tmin, array(0., dim=c(nlon, nlat, nday)), na.rm=TRUE) 
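[Editor's note: the likely bug in (2) is that mean(as.numeric(...)) collapses the summed arrays to a single scalar, so pmax() compares every grid cell against one number. A hedged sketch, assuming all TabsD* arrays share the same lon x lat x day dimensions, that averages elementwise and keeps the array shape (dummy arrays stand in for the real ncdf data):]

```r
# Dummy stand-ins for the real ncdf temperature arrays (same dims)
nlon <- 3; nlat <- 2; nday <- 4
TabsD80 <- array(runif(nlon * nlat * nday, 0, 30), dim = c(nlon, nlat, nday))
TabsD81 <- array(runif(nlon * nlat * nday, 0, 30), dim = c(nlon, nlat, nday))
Tmin <- 10

# Collect the arrays in a list; extend with the remaining files.
tabs <- list(TabsD80, TabsD81)

# Elementwise mean: Reduce() sums cell by cell, so the result
# keeps the c(nlon, nlat, nday) dimensions instead of collapsing.
Tmean <- Reduce(`+`, tabs) / length(tabs)

Teff <- pmax(Tmean - Tmin, array(0, dim = c(nlon, nlat, nday)))
```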



Re: [R] vectorized sub, gsub, grep, etc.

2015-08-03 Thread Adam Erickson
Interesting. I know of no practical use for such a function. If the first
position were 'abb,' sub() would return 'aBb,' failing to replace the
second 'b.' I find it hard to believe that's the desired functionality.
Writing a looped regex function in Rcpp makes the most sense for speed.
Using Boost C++ library regex (link
http://gallery.rcpp.org/articles/boost-regular-expressions/) or a C++
wrapper for PCRE (link https://gist.github.com/abicky/58ea79b01d9e394d5076)
are two solutions, but pure Rcpp would be ideal to avoid external software
dependencies.

Cheers,

Adam

On Sun, Aug 2, 2015 at 9:42 PM, John Thaden jjtha...@flash.net wrote:

 Adam,

 The original posting gave a function sub2 whose aim differs both from your
 functions' aim and from the intent of mgsub() in the qdap package:

  Here is code to apply a different
  pattern and replacement for every target.

 #Example
 X <- c("ab", "cd", "ef")
 patt <- c("b", "cd", "a")
 repl <- c("B", "CD", "A")

 The first pattern ('b') and the first replacement ('B') therefore apply
 only to the first target ('ab'), the second to the second, etc. The
 function achieves its aim, giving the correct answer 'aB', 'CD', 'ef'.

 mgsub() satisfies a different need, testing all targets for matches with
 any pattern in the vector of patterns and, if a match is found, replacing
 the matched target with the replacement value corresponding to the matched
 pattern. It, too, achieves its aim, giving a different (but also correct)
 answer 'AB', 'CD', 'ef'.

 Regards,
 -John
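[Editor's note: for completeness, the elementwise pattern/replacement pairing John describes is available in base R, assuming equal-length vectors:]

```r
X    <- c("ab", "cd", "ef")
patt <- c("b", "cd", "a")
repl <- c("B", "CD", "A")

# Apply the i-th pattern and replacement only to the i-th target
mapply(sub, patt, repl, X, USE.NAMES = FALSE)
# "aB" "CD" "ef"
```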







Re: [R] R error

2015-08-03 Thread peter dalgaard

 On 03 Aug 2015, at 18:00 , Hood, Kyle (CDC/OCOO/OCIO/ITSO) (CTR) 
 y...@cdc.gov wrote:
 
 Good afternoon,
 
 I recently received a ticket from a customer to upgrade from 3.1.1. to 3.2.1. 
  After the upgrade, when he tries to install a package he receives the error 
 below.  Could you please advise as to what is wrong?  Thank you.

It's not too easy to tell given the number of ways large networked installs can 
be configured, but the logic is that if the R installation directory is write 
protected (which is usually a good thing), the packages go into a subdirectory 
of the user's home dir. The output suggests that R believes that this is 
\\cdc.gov\private\M328\ygv7, but apparently that doesn't exist since it tries 
to create \\cdc.gov\private which it can't.

Apart from that, try digging around in 

https://cran.r-project.org/bin/windows/base/rw-FAQ.html

-pd
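[Editor's note: one common workaround, offered as a sketch since the right fix depends on the site setup, is to point R_LIBS_USER at a directory the user can actually write to. The path below is hypothetical:]

```r
# Hypothetical local library path; any writable directory works
lib <- "C:/Users/ygv7/R/win-library/3.2"   # assumption: local drive is writable
dir.create(lib, recursive = TRUE, showWarnings = FALSE)

# Use it for this session...
.libPaths(c(lib, .libPaths()))

# ...and persist it via the R_LIBS_USER environment variable,
# e.g. by adding this line to the user's .Renviron file:
# R_LIBS_USER=C:/Users/ygv7/R/win-library/3.2
```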

 
 Kyle
 
 --- Please select a CRAN mirror for use in this session ---
 Warning in install.packages(NULL, .libPaths()[1L], dependencies = NA, type = 
 type) :
  'lib = C:/Program Files/R/R-3.2.1/library' is not writable
 Error in install.packages(NULL, .libPaths()[1L], dependencies = NA, type = 
 type) :
  unable to create '\\cdc.gov\private\M328\ygv7/R/win-library/3.2'
 In addition: Warning message:
 In dir.create(userdir, recursive = TRUE) :
  cannot create dir '\\cdc.gov\private', reason 'Permission denied'
 
 

-- 
Peter Dalgaard, Professor,
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Email: pd@cbs.dk  Priv: pda...@gmail.com



Re: [R] Environmental Data Connector v1.3

2015-08-03 Thread Robert in SA
Hi Dan, thanks for your response. 

The setwd is coded somewhere in the EDC.get function. I guess I could try
alter the code but I assume this package should work as is.



--
View this message in context: 
http://r.789695.n4.nabble.com/Environmental-Data-Connector-v1-3-tp4710686p4710701.html
Sent from the R help mailing list archive at Nabble.com.



[R] scaling variables consecutively and independently

2015-08-03 Thread Ram09
Hello Everyone,

So I am very new to R and I'm having some trouble.  I basically have around
110 datasets, each made up of around 100 variables.  I am trying to
z-score the scores in each column, but independently of each other (each
column independent of the others).  The problem is that there are just too
many variables in each dataset to compute them individually.  I figured there
should be some type of loop that would scale the scores in each column and
then move on to the next, but I haven't been able to find anything about
this.  Can anyone help?  Thanks so much!

-RW
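[Editor's note: base R's scale() already z-scores column by column, so no explicit loop over variables is needed. A minimal sketch, assuming each dataset is a data frame of numeric columns:]

```r
# Example data: one "dataset" with a few numeric columns
df <- data.frame(a = rnorm(10, mean = 5), b = runif(10), c = rpois(10, 3))

# scale() centers each column on its own mean and divides by its own SD,
# i.e. every column is z-scored independently of the others.
z <- as.data.frame(scale(df))

# For ~110 datasets held in a list, apply the same step to each:
# datasets <- lapply(datasets, function(d) as.data.frame(scale(d)))

round(colMeans(z), 10)   # each column now has mean ~0
```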






Re: [R] Households per Census block

2015-08-03 Thread Zack Almquist
Hi Anthony and Keith Weintraub,

Here is a way to do what you are asking using the UScensus2010 packages:

## latest version of the package, not yet on CRAN
install.packages("UScensus2010", repos="http://R-Forge.R-project.org")
library(UScensus2010)
install.blk()
library(UScensus2010blk)
### You will want the H0010001 variable (see help(alabama.blk10))
### Other variables are also available
### You can use the new api function in UScensus2010 to get arbitrary
variables from SF1 and acs

data(states.names)
head(states.names)
state.blk.housing <- vector("list", length(states.names))
## notice this could be greatly spead up using the library(parallel)
## with mclapply
## This will be somewhat slow b/c of so much spatial data
for(i in 1:length(states.names)){
data(list=paste(states.names[i],"blk10",sep="."))
temp <- get(paste(states.names[i],"blk10",sep="."))
#unique b/c more shapefiles than fips
state.blk.housing[[i]] <- unique(temp@data[,c("fips","H0010001")])
print(i)
rm(list=paste(states.names[i],"blk10",sep="."))
}

###
# alternatively Using the US Census API function in the new UScensus2010
package
###

## Get all states fips code
data(countyfips)
state.fips <- unique(substr(countyfips$fips,1,2))
head(state.fips)
length(state.fips) ## will be 51=50 (states)+ 1(DC)
## You will need a census key
key <- "YOUR KEY HERE"
housing <- CensusAPI2010(c("H0010001"), state.fips=state.fips, level =
c("block"), key, summaryfile = c("sf1"))

Best,

-- Zack
-
Zack W.  Almquist
Assistant Professor
Department of Sociology and School of Statistics
Affiliate, Minnesota Population Center
University of Minnesota


On Mon, Aug 3, 2015 at 12:43 PM, Anthony Damico ajdam...@gmail.com wrote:

 hi, ccing the package maintainer.  one alternative is to pull the HU100
 variable directly from the census bureau's summary files: that variable
 starts at position 328 and ends at 336.  just modify this loop and you'll
 get a table with one-record-per-census-block in every state.


 https://github.com/davidbrae/swmap/blob/master/how%20to%20map%20the%20consumer%20expenditure%20survey.R#L104

 (1) line 134 change the very last -9 to 9
 (2) line 137 between pop100 and intptlat add an hu100


 summary file docs-

 http://www.census.gov/prod/cen2010/doc/sf1.pdf#page=18



 On Mon, Aug 3, 2015 at 11:55 AM, Keith S Weintraub kw1...@gmail.com
 wrote:

 Folks,

 I am using the UScensus2010 package and I am trying to figure out the
 number of households per census block.

 There are a number of possible data downloads in the package but
 apparently I am not smart enough to figure out which data-set is
 appropriate and what functions to use.

 Any help or pointers or links would be greatly appreciated.

 Thanks for your time,
 Best,
 KW







Re: [R] Environmental Data Connector v1.3

2015-08-03 Thread Roy Mendelssohn - NOAA Federal
Hi Robert:

I didn’t see this until Dan sent me something offline.  I apologize for the 
problem. Yes the function should work as is, and a lot of the EDC is Java.  I 
have forwarded your email to the people who did the coding.  But in the 
meantime can you do two things for me to help us in the debugging:

1.  Can you send me  the result of  the command:

 Sys.getenv("EDC_HOME")

2.  Can you send the result of

ls -l

on that directory?

I suggest that at this point do it offline, as I doubt these details would be 
of interest to the list as a whole.  If we find a solution I will post that.  I 
will add that we don’t have the resources to test on all versions of all OS, 
and that at times changes in R have required us to change our code, and this 
can also be one such instance.  But yes, the error message suggests that the 
code can’t write the necessary temp files to whatever directory it is trying.

Thanks,

-Roy



 On Aug 3, 2015, at 11:36 AM, Robert in SA ri.william...@outlook.com wrote:
 
 Hi Dan, thanks for your response. 
 
 The setwd is coded somewhere in the EDC.get function. I guess I could try
 alter the code but I assume this package should work as is.
 
 
 
 --
 View this message in context: 
 http://r.789695.n4.nabble.com/Environmental-Data-Connector-v1-3-tp4710686p4710701.html
 Sent from the R help mailing list archive at Nabble.com.
 

**
The contents of this message do not reflect any position of the U.S. 
Government or NOAA.
**
Roy Mendelssohn
Supervisory Operations Research Analyst
NOAA/NMFS
Environmental Research Division
Southwest Fisheries Science Center
***Note new address and phone***
110 Shaffer Road
Santa Cruz, CA 95060
Phone: (831)-420-3666
Fax: (831) 420-3980
e-mail: roy.mendelss...@noaa.gov www: http://www.pfeg.noaa.gov/

Old age and treachery will overcome youth and skill.
From those who have been given much, much will be expected 
the arc of the moral universe is long, but it bends toward justice -MLK Jr.


Re: [R] Environmental Data Connector v1.3

2015-08-03 Thread William Dunlap
 During installation EDC_HOME was set to /home/robert/EDC
 and the directory definitely exists.

Are you sure that EDC_HOME is set now?  What do you get from the following
command?
   Sys.getenv("EDC_HOME")
If that is set to something other than "", what do you get from
   getwd()
   setwd(Sys.getenv("EDC_HOME"))
   getwd()
If it was not set, do things work better if you first do
   Sys.setenv(EDC_HOME="/home/robert/EDC")
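A hedged sketch of a friendlier check than the bare setwd() call inside EDC.get -- this is not EDC code, just the same idea with readable failure messages. For the demo it points EDC_HOME at tempdir(); the installer should have set it to something like /home/robert/EDC.

```r
## Demo fallback only -- in real use EDC_HOME comes from the installer.
if (!nzchar(Sys.getenv("EDC_HOME"))) Sys.setenv(EDC_HOME = tempdir())

edc_home <- Sys.getenv("EDC_HOME")
if (!nzchar(edc_home))
  stop("EDC_HOME is not set; try Sys.setenv(EDC_HOME = \"/path/to/EDC\")")
if (!dir.exists(edc_home))                # dir.exists() needs R >= 3.2.0
  stop("EDC_HOME does not exist: ", edc_home)
setwd(edc_home)
```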



Bill Dunlap
TIBCO Software
wdunlap tibco.com

On Mon, Aug 3, 2015 at 8:12 AM, Robert in SA ri.william...@outlook.com
wrote:

 Hello. I have successfully installed EDC v1.3 on linux ubuntu 14.04. I am
 running a 64bit machine with R 3.2.1 via Rstudio. I have tried  example1
 <- EDC.get(1) after loading the ncdf and EDCR libraries, from both the R
 terminal and Rstudio and get the following result:

 Error in setwd(paste(Sys.getenv("EDC_HOME"), sep = "")) :
   cannot change working directory

 During installation EDC_HOME was set to /home/robert/EDC and the directory
 definitely exists. Does anyone have any suggestions?




 --
 View this message in context:
 http://r.789695.n4.nabble.com/Environmental-Data-Connector-v1-3-tp4710686.html
 Sent from the R help mailing list archive at Nabble.com.





Re: [R] vectorized sub, gsub, grep, etc.

2015-08-03 Thread John Thaden
sub() has practical uses though gsub() may have more. This function was what I 
needed at the time. Of course the gsub() version is also possible.

Sent from Yahoo Mail on Android




[R] R error

2015-08-03 Thread Hood, Kyle (CDC/OCOO/OCIO/ITSO) (CTR)
Good afternoon,

I recently received a ticket from a customer to upgrade from 3.1.1. to 3.2.1.  
After the upgrade, when he tries to install a package he receives the error 
below.  Could you please advise as to what is wrong?  Thank you.

Kyle

--- Please select a CRAN mirror for use in this session ---
Warning in install.packages(NULL, .libPaths()[1L], dependencies = NA, type = 
type) :
  'lib = C:/Program Files/R/R-3.2.1/library' is not writable
Error in install.packages(NULL, .libPaths()[1L], dependencies = NA, type = 
type) :
  unable to create '\\cdc.gov\private\M328\ygv7/R/win-library/3.2'
In addition: Warning message:
In dir.create(userdir, recursive = TRUE) :
  cannot create dir '\\cdc.gov\private', reason 'Permission denied'



Re: [R] Environmental Data Connector v1.3

2015-08-03 Thread Nordlund, Dan (DSHS/RDA)
Why are you using paste() ?  Why not just


setwd(Sys.getenv("EDC_HOME"))


Dan

Daniel Nordlund, PhD
Research and Data Analysis Division
Services & Enterprise Support Administration
Washington State Department of Social and Health Services


-Original Message-
From: R-help [mailto:r-help-boun...@r-project.org] On Behalf Of Robert in SA
Sent: Monday, August 03, 2015 8:12 AM
To: r-help@r-project.org
Subject: [R] Environmental Data Connector v1.3

Hello. I have successfully installed EDC v1.3 on linux ubuntu 14.04. I am 
running a 64bit machine with R 3.2.1 via Rstudio. I have tried  example1
<- EDC.get(1) after loading the ncdf and EDCR libraries, from both the R 
terminal and Rstudio and get the following result:

Error in setwd(paste(Sys.getenv("EDC_HOME"), sep = "")) : 
  cannot change working directory

During installation EDC_HOME was set to /home/robert/EDC and the directory 
definitely exists. Does anyone have any suggestions?




--
View this message in context: 
http://r.789695.n4.nabble.com/Environmental-Data-Connector-v1-3-tp4710686.html
Sent from the R help mailing list archive at Nabble.com.



[R] Matching posterior probabilities from poLCA

2015-08-03 Thread Rob de Vries
Hi all,

I'm a newbie to R with a question about poLCA. When you run a latent class
analysis in poLCA it generates a value for each respondent giving their
posterior probability of 'belonging' to each latent class. These are stored
as a matrix in the element 'posterior'.

I would like to create a dataframe which contains each respondent's unique
ID number (which is stored as a variable in the dataframe used for poLCA)
and their *matched* posterior probability from the 'posterior' matrix. I
would then like to write this dataframe to a csv file for use in another
program.

I know this is possible, but I just can't seem to get it right (blame my
incompetence with R). Any help would be very warmly appreciated.

Best wishes.
Robert de Vries
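Without your data, only a hedged sketch is possible. The names `lc`, `mydata`, and the `id` column are placeholders for your own objects, and it assumes poLCA was run with na.rm = FALSE (or on data with no missing values) so that rows of `posterior` line up one-to-one with rows of the input data frame. Toy stand-ins are included so the lines run on their own.

```r
## Toy stand-ins; replace with your poLCA fit and your data frame.
lc <- list(posterior = matrix(c(0.9, 0.1,
                                0.2, 0.8), nrow = 2, byrow = TRUE))
mydata <- data.frame(id = c("R001", "R002"))

## Pair each respondent's ID with their posterior class probabilities
## (one column per latent class) and write the result to CSV.
post <- as.data.frame(lc$posterior)
names(post) <- paste0("class", seq_len(ncol(post)))
out <- data.frame(id = mydata$id, post)
write.csv(out, "posterior_probs.csv", row.names = FALSE)
```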



[R] Environmental Data Connector v1.3

2015-08-03 Thread Robert in SA
Hello. I have successfully installed EDC v1.3 on linux ubuntu 14.04. I am
running a 64bit machine with R 3.2.1 via Rstudio. I have tried  example1
<- EDC.get(1) after loading the ncdf and EDCR libraries, from both the R
terminal and Rstudio and get the following result:

Error in setwd(paste(Sys.getenv("EDC_HOME"), sep = "")) : 
  cannot change working directory

During installation EDC_HOME was set to /home/robert/EDC and the directory
definitely exists. Does anyone have any suggestions?




--
View this message in context: 
http://r.789695.n4.nabble.com/Environmental-Data-Connector-v1-3-tp4710686.html
Sent from the R help mailing list archive at Nabble.com.



Re: [R] Households per Census block

2015-08-03 Thread Anthony Damico
hi, ccing the package maintainer.  one alternative is to pull the HU100
variable directly from the census bureau's summary files: that variable
starts at position 328 and ends at 336.  just modify this loop and you'll
get a table with one-record-per-census-block in every state.

https://github.com/davidbrae/swmap/blob/master/how%20to%20map%20the%20consumer%20expenditure%20survey.R#L104

(1) line 134 change the very last -9 to 9
(2) line 137 between pop100 and intptlat add an hu100


summary file docs-

http://www.census.gov/prod/cen2010/doc/sf1.pdf#page=18
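In R terms the fixed-width extraction described above is just substr() on each geographic header record. A hedged sketch with a synthetic record (the 328-336 positions are the ones quoted above; real records come from the SF1 geo files, and the planted value here is purely illustrative):

```r
## Build a synthetic 500-character geo header record and plant a
## housing-unit count in the HU100 field (columns 328-336).
geo_line <- paste(rep(" ", 500), collapse = "")
substr(geo_line, 328, 336) <- "000000042"

## Extract HU100 the same way you would from a real record.
hu100 <- as.integer(substr(geo_line, 328, 336))
```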



On Mon, Aug 3, 2015 at 11:55 AM, Keith S Weintraub kw1...@gmail.com wrote:

 Folks,

 I am using the UScensus2010 package and I am trying to figure out the
 number of households per census block.

 There are a number of possible data downloads in the package but
 apparently I am not smart enough to figure out which data-set is
 appropriate and what functions to use.

 Any help or pointers or links would be greatly appreciated.

 Thanks for your time,
 Best,
 KW





Re: [R] About nls.

2015-08-03 Thread PIKAL Petr
Hi

Please keep the conversation on the list, somebody may have a better idea.
Other comments are in line.


 -Original Message-
 From: Jianling Fan [mailto:fanjianl...@gmail.com]
 Sent: Friday, July 31, 2015 4:46 PM
 To: PIKAL Petr
 Subject: Re: [R] About nls.

 Hello, Petr,

 Thanks for your help.
 That works but it changes my model. And I think that's not the main
 problem.
 From my data, (den1/R1+den2+den3+den4+den5/R5) is always < 1, which makes
 (1/(den1/R1+den2+den3+den4+den5/R5)-1) > 0.

It is not true unless I have different data from yours.

 with(dat,(den1/0.9+den2+den3+den4+den5/23))
 [1]  0.466  0.747  0.976  1.073  1.110  0.380
 [7]  0.480  0.850  0.880  1.000  0.480  0.890
[13]  0.980  0.990  1.000  0.200  0.390  0.690
[19]  0.990  1.000  6.0652174  9.9065217 13.3434783 16.5782609
[25] 18.8021739 19.8130435

 with(dat, (1/(den1/.9+den2+den3+den4+den5/23)-1) > 0)
 [1]  TRUE  TRUE  TRUE FALSE FALSE  TRUE  TRUE  TRUE  TRUE FALSE  TRUE  TRUE
[13]  TRUE  TRUE FALSE  TRUE  TRUE  TRUE  TRUE FALSE FALSE FALSE FALSE FALSE
[25] FALSE FALSE
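That is the heart of the nls failure: wherever the density sum exceeds 1, the argument of log() in the formula goes negative. A minimal illustration (the two sums are arbitrary, one on each side of 1):

```r
## sum < 1: 1/sum - 1 is positive, log() is finite
lhs_ok  <- log(1/0.8 - 1)

## sum > 1: 1/sum - 1 is negative, log() returns NaN (with a warning)
lhs_bad <- suppressWarnings(log(1/1.2 - 1))
```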


 str(dat)
'data.frame':   26 obs. of  7 variables:
 $ Depth: int  20 40 60 80 100 15 30 45 60 120 ...
 $ lnd  : num  3 3.69 4.09 4.38 4.61 ...
 $ den1 : num  0.419 0.672 0.878 0.966 0.999 ...
 $ den2 : num  0 0 0 0 0 0.38 0.48 0.85 0.88 1 ...
 $ den3 : num  0 0 0 0 0 0 0 0 0 0 ...
 $ den4 : num  0 0 0 0 0 0 0 0 0 0 ...
 $ den5 : num  0 0 0 0 0 0 0 0 0 0 ...


 dput(dat)
structure(list(Depth = c(20L, 40L, 60L, 80L, 100L, 15L, 30L,
45L, 60L, 120L, 15L, 30L, 45L, 60L, 120L, 15L, 30L, 45L, 60L,
120L, 10L, 30L, 50L, 70L, 90L, 110L), lnd = c(2.995732, 3.688879,
4.094345, 4.382027, 4.60517, 2.70805, 3.401197, 3.806662, 4.094345,
4.787492, 2.70805, 3.401197, 3.806662, 4.094345, 4.787492, 2.70805,
3.401197, 3.806662, 4.094345, 4.787492, 2.302585, 3.401197, 3.912023,
4.248495, 4.49981, 4.70048), den1 = c(0.419, 0.6725, 0.878, 0.966,
0.999, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0), den2 = c(0, 0, 0, 0, 0, 0.38, 0.48, 0.85, 0.88, 1,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0), den3 = c(0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0.48, 0.89, 0.98, 0.99, 1, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0), den4 = c(0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0.2, 0.39, 0.69, 0.99, 1, 0, 0, 0, 0, 0, 0),
den5 = c(0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 139.5, 227.85, 306.9, 381.3, 432.45, 455.7)), .Names = 
c("Depth",
"lnd", "den1", "den2", "den3", "den4", "den5"), class = "data.frame",
row.names = c("1", "2", "3", "4", "5", "6", "7", "8", "9", "10",
"11", "12", "13", "14", "15", "16", "17", "18", "19", "20", "21",
"22", "23", "24", "25", "26"))


You can check if this is the same as you have. That is why it is preferable to 
use dput for sending data.

Cheers
Petr


 So, the nls should work in this case. But I don't know why it does not.

 Thanks!

  Regards,

 Julian



 On 31 July 2015 at 00:54, PIKAL Petr petr.pi...@precheza.cz wrote:
  Hi
 
  I am not an expert but the problem seems to me that
 
  (den1/R1+den2+den3+den4+den5/R5)-1
 
  gives you sometimes value 0 and sometimes negative. In these cases
 the value of log(1/result) is NA or Inf and nls can not handle this.
 
  I did not search where nls2 is from, so I used nls and removed the -1
 from your formula, which resulted in some final values.
 
  fit1 <- nls(lnd~log(1/(den1/R1+den2+den3+den4+den5/R5))/c+log(d50),
  + start=c(R1=0.9, R5=23, c=-1.1, d50=10), data=test)
 
  coef(fit)
   A  B
   6.9720965 -0.0272203
  coef(fit1)
   R1  R5   c d50
0.9622249 416.1272498  -0.7178156  73.6017161
 
  Cheers
  Petr
 
  -Original Message-
  From: R-help [mailto:r-help-boun...@r-project.org] On Behalf Of
  Jianling Fan
  Sent: Thursday, July 30, 2015 9:51 PM
  To: r-help@r-project.org
  Subject: [R] About nls.
 
  Hello,
 
  I am trying to do a nls regression with R.  but I always get a error
  as Error in numericDeriv(form[[3L]], names(ind), env) :  Missing
  value or an infinity produced when evaluating the model.  I googled
  it and found someone said it is because of the improper start value.
  I tried many times but can not solve it. Does anyone can help me?
 
  thanks a lot !
 
  my code is:
 
   fit1 <- nls2(lnd~log(1/(den1/R1+den2+den3+den4+den5/R5)-1)/c+log(d50),
     start=c(R1=0.9, R5=23, c=-1.1, d50=10), data=SWrt)
 
  data (SWrt) is:
 
Depth  lnd   den1 den2 den3 den4   den5
  1 20 2.995732 0.4190 0.00 0.00 0.00   0.00
  2 40 3.688879 0.6725 0.00 0.00 0.00   0.00
  3 60 4.094345 0.8780 0.00 0.00 0.00   0.00
  4 80 4.382027 0.9660 0.00 0.00 0.00   0.00
  5    100 4.605170 0.9990 0.00 0.00 0.00   0.00
  6     15 2.708050 0.0000 0.38 0.00 0.00   0.00
  7     30 3.401197 0.0000 0.48 0.00 0.00   0.00
  8     45 3.806662 0.0000 0.85 0.00 0.00   0.00
  9     60 4.094345 0.0000 0.88 0.00 0.00   0.00
  10   120 4.787492 0.0000 1.00 0.00 0.00   0.00
  11    15 2.708050 0.0000 0.00 0.48 0.00   0.00
  12    30 3.401197 0.0000 0.00 

Re: [R] Faster text search in document database than with grep?

2015-08-03 Thread Witold E Wolski
Dear Duncan,

This is a model of the data I work with.

database <- replicate(5000, paste(sample(letters, rexp(1, 1/500), rep=TRUE),
   collapse=""))

words <- replicate(10000, paste(sample(letters, rexp(1, 1/70), rep=TRUE),
   collapse=""))

NumberOfWords <- 10
system.time(lapply(words[1:NumberOfWords], grep, database))
   user  system elapsed
  5.002   0.003   5.005

 The model reproduces the running times I have to cope with.

To use grep in this context is rather naive, and I am wondering if there are
better solutions available in R.
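Since the patterns are exact letter sequences, one cheap improvement worth trying before anything algorithmic is fixed = TRUE, which bypasses the regex engine entirely. A self-contained, scaled-down version of the benchmark above (sizes are small so it runs quickly anywhere; the gain on real data is machine-dependent, so treat it as something to measure, not a promise):

```r
set.seed(42)
## Small stand-ins for the document database and the query words.
database <- replicate(1000, paste(sample(letters, 500, replace = TRUE),
                                  collapse = ""))
words <- replicate(10, paste(sample(letters, 8, replace = TRUE),
                             collapse = ""))
database[1] <- paste0(database[1], words[1])   # plant one known match

## Exact-string search without the regex engine.
hits <- lapply(words, grep, database, fixed = TRUE)
system.time(lapply(words, grep, database, fixed = TRUE))
```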



On 3 August 2015 at 15:13, Duncan Murdoch murdoch.dun...@gmail.com wrote:

 On 03/08/2015 5:25 AM, Witold E Wolski wrote:
  I have a database of text documents (letter sequences). Several thousands
  of documents with approx. 1000-2000 letters each.
 
  I need to find exact matches of short 3-15 letters sequences in those
  documents.
 
  Without any regexp patterns the search for one 3-15 letter word takes
  on the order of 1s.
 
  So for a database with several thousand documents it's on the order of
  hours.
  The naive approach would be to use mcmapply, but then on standard
  hardware I am still in the same order, and since R is an interactive
  programming environment this isn't a solution I would go for.
 
  But aren't there faster algorithmic solutions? Can anyone point me please
  to an implementation  available in R.

 You haven't shown us what you did, but it sounds far slower than I'd
 expect.  I just used the code below to set up a database of 10000
 documents of 2000 letters each, and searching those documents for "abc"
 takes about 70 milliseconds:

 database <- replicate(10000, paste(sample(letters, 2000, rep=TRUE),
 collapse=""))

 grep("abc", database, fixed=TRUE)

 Duncan Murdoch




-- 
Witold Eryk Wolski



Re: [R] Faster text search in document database than with grep?

2015-08-03 Thread Duncan Murdoch
On 03/08/2015 5:25 AM, Witold E Wolski wrote:
 I have a database of text documents (letter sequences). Several thousands
 of documents with approx. 1000-2000 letters each.
 
 I need to find exact matches of short 3-15 letters sequences in those
 documents.
 
 Without any regexp patterns the search for one 3-15 letter word takes on
 the order of 1s.
 
 So for a database with several thousand documents it's on the order of
 hours.
 The naive approach would be to use mcmapply, but then on standard
 hardware I am still in the same order, and since R is an interactive
 programming environment this isn't a solution I would go for.
 
 But aren't there faster algorithmic solutions? Can anyone point me please
 to an implementation  available in R.

You haven't shown us what you did, but it sounds far slower than I'd
expect.  I just used the code below to set up a database of 10000
documents of 2000 letters each, and searching those documents for "abc"
takes about 70 milliseconds:

database <- replicate(10000, paste(sample(letters, 2000, rep=TRUE),
collapse=""))

grep("abc", database, fixed=TRUE)

Duncan Murdoch



Re: [R] scaling variables consecutively and independently

2015-08-03 Thread David Winsemius

On Aug 3, 2015, at 1:06 PM, Ram09 wrote:

 Yes, I've been using the scale function but I don't know how to write a line
 of code that will scale the scores in each variable independently of each
 other instead of as a whole.

You do not appear to be reading the help page for `scale`.

  In other words, how can I get the scale
 function to standardize all the scores in one variable (column) then move on
 to the the next, so on and so forth for the whole dataset without having to
 tediously type out the same line of code for each variable? 

This is the first Line of text in the help page:

'scale' is generic function whose default method centers and/or scales the 
columns of a numeric matrix.

If you do not want the result as a matrix then you can use lapply on a 
dataframe.
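Putting that suggestion into code -- a hedged sketch for one data frame and for a list of the ~110 data frames from the question (it assumes all columns are numeric; subset the numeric columns first if not):

```r
## z-score every column of a data frame, one column at a time
df  <- data.frame(a = c(1, 2, 3, 4), b = c(10, 30, 50, 70))
dfz <- as.data.frame(lapply(df, function(x) as.numeric(scale(x))))

## the same over a whole list of data frames
datasets   <- list(df, df)
datasets_z <- lapply(datasets, function(d)
  as.data.frame(lapply(d, function(x) as.numeric(scale(x)))))
```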

-- 
David.


-- 

David Winsemius
Alameda, CA, USA



Re: [R] How to simulate informative censoring in a Cox PH model?

2015-08-03 Thread Daniel Meddings
Hi Greg

The copulas concept seems a nicely simple way of simulating event times
that are subject to informative censoring (in contrast to the double cox
model approach I use). The correlation between the marginal uniform random
variables you speak of reminded me that my approach should also induce this
correlation, just in a different way. Similarly I should also observe zero
correlation between my event times from my outcome model and the censoring
times. Unfortunately this was not the case - to cut a long story short I
was inadvertently generating my independent censoring times from a model
that depended on covariates in the outcome model. This now explains the
mixed results I rather laboriously attempted to describe previously.

Re-running some scenarios with my new error-free code I can now clearly
observe the points you have been making, that is informative censoring only
leads to bias if the covariates in the censoring model are not in the
outcome model. Indeed I can choose the common (to both models) treatment
effect to be vastly different (with all other effects the same) and have no
bias, yet small differences in the censoring Z effect (not in the outcome
model) effect lead to moderate biases.

I am still somewhat confused by the other approach to this problem, where
in various journal articles I have seen authors assume an outcome model for
the censored subjects - i.e. an outcome model for the unobserved event
times. Under this approach the definition of informative censoring appears
to be that the observed and un-observed outcome models differ. This
approach also makes sense to me - censoring merely loses precision of the
parameter estimators due to reduced events, but does not lead to bias.
However the concept of correlated event and censoring times does not even
present itself here?

Thanks

Dan



On Fri, Jul 31, 2015 at 5:06 PM, Greg Snow 538...@gmail.com wrote:

 Daniel,

 Basically just responding to your last paragraph (the others are
 interesting, but I think that you are learning as much as anyone and I
 don't currently have any other suggestions).

 I am not an expert on copulas, so this is a basic understanding, you
 should learn more about them if you choose to use them.  The main idea
 of a copula is that it is a bivariate or multivariate distribution
 where all the variables have uniform marginal distributions but the
 variables are not independent from each other.  How I would suggest
 using them is to choose a copula and generate random points from a
 bivariate copula, then put those (uniform) values into the inverse pdf
 function for the Weibull (or other distribution), one of which is the
 event time, the other the censoring time.  This will give you times
 that (marginally) come from the distributions of interest, but are not
 independent (so would be considered informative censoring).  Repeat
 this with different levels of relationship in the copula to see how
 much difference it makes in your simulations.
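That recipe can be sketched in a few lines. Hedged: a Gaussian copula is chosen here purely for convenience, the Weibull shapes and scales are illustrative, and note that it is the quantile function, qweibull(), that the uniforms go into:

```r
set.seed(1)
n   <- 1000
rho <- 0.5                                    # strength of dependence
z1  <- rnorm(n)
z2  <- rho * z1 + sqrt(1 - rho^2) * rnorm(n)  # correlated normals
u1  <- pnorm(z1); u2 <- pnorm(z2)             # dependent uniform marginals

## Weibull marginals via the quantile (inverse CDF) function
t_event  <- qweibull(u1, shape = 1.5, scale = 10)
t_censor <- qweibull(u2, shape = 1.2, scale = 12)

obs_time <- pmin(t_event, t_censor)           # what is actually observed
status   <- as.integer(t_event <= t_censor)   # 1 = event, 0 = censored
```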

 On Thu, Jul 30, 2015 at 2:02 PM, Daniel Meddings dpmeddi...@gmail.com
 wrote:
  Thanks Greg once more for taking the time to reply. I certainly agree
 that
  this is not a simple set-up, although it is realistic I think. In short
 you
  are correct about model mis-specification being the key to producing more
  biased estimates under informative than under non-informative censoring.
  After looking again at my code and trying various things I realize that
 the
  key factor that leads to the informative and non-informative censoring
 data
  giving rise to the same biased estimates is how I generate my Z_i
 variable,
  and also the magnitude of the Z_i coefficient in both of the event and
  informative censoring models.
 
  In the example I gave I generated Z_i (I think of this as a poor
 prognosis
  variable) from a beta distribution so that it ranged from 0-1. The biased
  estimates for beta_t_1 (I think of this as the effect of a treatment on
  survival) were approximately 1.56 when the true value was -1. What I
 forgot
  to mention was that estimating a cox model with 1,000,000 subjects to the
  full data (i.e. no censoring at all) arguably gives the best treatment
  effect estimate possible given that the effects of Z_i and Z_i*Treat_i
 are
  not in the model. This best possible estimate turns out to be 1.55 -
 i.e.
  the example I gave just so happens to be such that even with 25-27%
  censoring, the estimates obtained are almost the best that can be
 attained.
 
  My guess is that the informative censoring does not bias the estimate
 more
  than non-informative censoring because the only variable not accounted
 for
  in the model is Z_i which does not have a large enough effect beta_t_2,
  and/or beta_c_2, or perhaps because Z_i only has a narrow range which
 does
  not permit the current beta_t_2 value to do any damage?
 
  To investigate the beta_t_2, and/or beta_c_2 issue I changed
 beta_c_2
  from 2 to 7 and beta_c_0 from 0.2 to -1.2, and beta_d_0 from 

Re: [R] scaling variables consecutively and independently

2015-08-03 Thread David Winsemius

On Aug 3, 2015, at 11:42 AM, Ram09 wrote:

 Hello Everyone,
 
 So I am very new to R and I'm having some trouble.  I basically have around
 110 datasets each one made up of around 100 variables.  I am trying to
 z-score the scores in each column but independently of each other ( each
 column independent of the other).  The problem is that there are just too
 many variables in each dataset to compute individually.  I figured there
 should be some type of loop that I can do in which it would scale the scores
 in each column and then move on to the next but I haven't been able to find
 anything about this.  Can anyone help?  Thanks so much!

Perhaps you are looking for the 'scale'-function?

deleted Nabble link


-- 
David Winsemius
Alameda, CA, USA



Re: [R] scaling variables consecutively and independently

2015-08-03 Thread Ram09
Yes, I've been using the scale function but I don't know how to write a line
of code that will scale the scores in each variable independently of each
other instead of as a whole.  In other words, how can I get the scale
function to standardize all the scores in one variable (column) then move on
to the the next, so on and so forth for the whole dataset without having to
tediously type out the same line of code for each variable? 

-RW



--
View this message in context: 
http://r.789695.n4.nabble.com/scaling-variables-consecutively-and-independently-tp4710702p4710709.html
Sent from the R help mailing list archive at Nabble.com.



Re: [R] Environmental Data Connector v1.3

2015-08-03 Thread Roy Mendelssohn - NOAA Federal

 On Aug 3, 2015, at 8:12 AM, Robert in SA ri.william...@outlook.com wrote:
 
 Hello. I have successfully installed EDC v1.3 on linux ubuntu 14.04. I am
 running a 64bit machine with R 3.2.1 via Rstudio. I have tried  example1
 - EDC.get(1) after loading the ncdf and EDCR libraries, from both the R
 terminal and Rstudio and get the following result:
 
 Error in setwd(paste(Sys.getenv(EDC_HOME), sep = )) : 
  cannot change working directory
 
 During installation EDC_HOME was set to /home/robert/EDC and the directory
 definitely exists. Does anyone have any suggestions?
 
 

Bringing this back on list so that the solution is on record: the problem is 
that EDC_HOME was not set when the install was done.  We are looking into 
whether this is a problem with the installer.

In the meantime, if this problem pops-up again, use:

 Sys.setenv(EDC_HOME = "/your/EDC/home/directory")

Many thanks to Bill Dunlap of TIBCO for help in solving this.  He also notes:

 Yes.  You can set EDC_HOME on your .Rprofile or Renviron (sp?) files.
 See help(Startup) for details.
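For the record, the Renviron route is a single line in ~/.Renviron, read at R startup (the path below is the one from this thread -- substitute your own):

```
EDC_HOME=/home/robert/EDC
```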
 


-Roy


**
The contents of this message do not reflect any position of the U.S. 
Government or NOAA.
**
Roy Mendelssohn
Supervisory Operations Research Analyst
NOAA/NMFS
Environmental Research Division
Southwest Fisheries Science Center
***Note new address and phone***
110 Shaffer Road
Santa Cruz, CA 95060
Phone: (831)-420-3666
Fax: (831) 420-3980
e-mail: roy.mendelss...@noaa.gov www: http://www.pfeg.noaa.gov/

Old age and treachery will overcome youth and skill.
From those who have been given much, much will be expected 
the arc of the moral universe is long, but it bends toward justice -MLK Jr.
