[R] R help- fit distribution "fitdistr"

2016-05-25 Thread Jessica Wang
Hello, I have just started using R. I want to use "fitdistr" to fit a distribution to 
the data. How can I then verify whether the data really fit the distribution? Thanks 
[data is attached]
 
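library(MASS)  # provides fitdistr()

res <- fitdistr(data$Report.delay, "Poisson")
h <- hist(data$Report.delay)
xfit <- 0:250  # the Poisson distribution lives on the integers
yfit <- dpois(xfit, res$estimate["lambda"])
# scale the probabilities to the histogram's count axis: bin width * number of observations
yfit <- yfit * diff(h$mids[1:2]) * length(data$Report.delay)
lines(xfit, yfit, col = "blue", lwd = 2)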
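
One possible way to check the fit numerically is a chi-squared goodness-of-fit test on binned counts. The sketch below is only an illustration under assumptions: the attached data are not available here, so simulated counts (`delay`) stand in for data$Report.delay, and the choice of bins (0 to 8, then "9 or more") is arbitrary.

```r
library(MASS)  # fitdistr()

set.seed(1)
delay <- rpois(500, lambda = 4)   # hypothetical stand-in for data$Report.delay

fit <- fitdistr(delay, "Poisson")
lambda_hat <- unname(fit$estimate["lambda"])

# Bin the counts as 0, 1, ..., 8 and "9 or more"
k_max <- 9
observed <- table(factor(pmin(delay, k_max), levels = 0:k_max))
expected_p <- c(dpois(0:(k_max - 1), lambda_hat),
                ppois(k_max - 1, lambda_hat, lower.tail = FALSE))

gof <- chisq.test(as.vector(observed), p = expected_p)

# lambda was estimated from the data, so one more degree of freedom is lost
# than chisq.test() assumes: df = (number of bins - 1) - 1
df_adj <- length(observed) - 2
p_adj <- pchisq(unname(gof$statistic), df = df_adj, lower.tail = FALSE)
p_adj
```

A small p-value would suggest the Poisson model does not describe the data well. For count data, comparing the sample mean and variance is also a quick check, since a Poisson distribution has both equal to lambda.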
__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Computing means of multiple variables based on a condition

2016-05-25 Thread KMNanus
These will be overlapping subgroups from the same data frame.  For example, 
d>=2 will have length=9, d>=4 will have length=7, etc.


Ken
kmna...@gmail.com
914-450-0816 (tel)
347-730-4813 (fax)



> On May 25, 2016, at 9:06 PM, William Dunlap  wrote:
> 
> Just to be clear, do you really want your 'condition' groups to be subsets
> of one another?  Most (all?) of the *ply functions assume you want
> non-overlapping groups so they do a split-summarize-combine sequence.
> You would have to replace the split part of that.
> 
> Bill Dunlap
> TIBCO Software
> wdunlap tibco.com 
> On Wed, May 25, 2016 at 3:37 PM, KMNanus  > wrote:
> I have a large dataset, a sample of which is:
> 
> a <- c("A", "B", "A", "B", "A", "B", "A", "B", "A", "B")
> b <-c(15, 35, 20,  99, 75, 64, 33, 78, 45, 20)
> c<- c( 111, 234, 456, 876, 246, 662, 345, 480, 512, 179)
> d<- c(1.1, 3.2, 14.2, 8.7, 12.5, 5.9, 8.3, 6.0, 2.9, 9.3)
> 
> df <- data.frame(a,b,c,d)
> 
> I’m trying to construct a data frame that shows the means of c & b based on 
> the condition of d and grouped by a.
> 
> I want to create the data frame below, then use ggplot2 to create a line plot 
> of b at various conditions of d.
> 
> I can compute the grouped means (d>=2, d>=4, etc.) one at a time using dplyr 
> but haven’t figured out how to put them all together or put them in one data 
> frame.
> 
> I'd rather not use a loop and am relatively new to R.  Is there a way I can 
> use tapply and set it to the conditions above so that I can create the df 
> below?
> 
> 
> a   condition   mean(b)   mean(c)
> A   d>=2        _         _
> B   d>=2        _         _
> A   d>=4        _         _
> B   d>=4        _         _
> A   d>=6        _         _
> B   d>=6        _         _
> 
> 
> 
> Ken
> kmna...@gmail.com 
> 914-450-0816  (tel)
> 347-730-4813  (fax)
> 
> 
> 


Re: [R] strange error

2016-05-25 Thread John Dougherty
On Wed, 25 May 2016 18:56:47 +0200
alicekalk...@freenet.de wrote:

Alice,

Have you tried running the code in R in a terminal?  If the error
persists, then this may be the right place to ask for help.  If it is
specific to R Studio, then you need to ask them.
-- 

John



[R-es] Error in subset selection

2016-05-25 Thread Elkin Tabares
Hello everyone,

I want to perform subset selection using the leaps package. My
explanatory variables include lags of themselves, so they contain
NA values. However, when I try to run cross-validation I am asked for
the data to be in data.frame format, which throws an error when I
run the algorithm. My question is how I can force a data.frame that
contains lagged variables to treat them as lags and not as a size
error. Many thanks.

Regards,
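
One common way to handle this, sketched here with made-up data, is to build the lags as ordinary columns and then drop the leading rows that the lags leave incomplete, so every column has the same usable length. The helper `lag_k` and the variable names are hypothetical, and whether dropping rows is acceptable depends on the analysis.

```r
set.seed(42)
n <- 30
dat <- data.frame(y = rnorm(n), x = rnorm(n))

# Hypothetical helper: shift a vector by k positions, padding the start with NA
lag_k <- function(v, k) c(rep(NA, k), head(v, -k))

dat$x_lag1 <- lag_k(dat$x, 1)
dat$x_lag2 <- lag_k(dat$x, 2)

# Keep only complete rows; the first two rows are lost to the longest lag
dat_cc <- na.omit(dat)
nrow(dat_cc)

# dat_cc is a plain data.frame and could then be passed on, for example to
# leaps::regsubsets(y ~ ., data = dat_cc)
```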


___
R-help-es mailing list
R-help-es@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-help-es


Re: [R] Computing means of multiple variables based on a condition

2016-05-25 Thread William Dunlap via R-help
Just to be clear, do you really want your 'condition' groups to be subsets
of one another?  Most (all?) of the *ply functions assume you want
non-overlapping groups so they do a split-summarize-combine sequence.
You would have to replace the split part of that.

Bill Dunlap
TIBCO Software
wdunlap tibco.com

On Wed, May 25, 2016 at 3:37 PM, KMNanus  wrote:

> I have a large dataset, a sample of which is:
>
> a <- c("A", "B", "A", "B", "A", "B", "A", "B", "A", "B")
> b <-c(15, 35, 20,  99, 75, 64, 33, 78, 45, 20)
> c<- c( 111, 234, 456, 876, 246, 662, 345, 480, 512, 179)
> d<- c(1.1, 3.2, 14.2, 8.7, 12.5, 5.9, 8.3, 6.0, 2.9, 9.3)
>
> df <- data.frame(a,b,c,d)
>
> I’m trying to construct a data frame that shows the means of c & b based
> on the condition of d and grouped by a.
>
> I want to create the data frame below, then use ggplot2 to create a line
> plot of b at various conditions of d.
>
> I can compute the grouped means (d>=2, d>=4, etc.) one at a time using
> dplyr but haven’t figured out how to put them all together or put them in
> one data frame.
>
> I'd rather not use a loop and am relatively new to R.  Is there a way I
> can use tapply and set it to the conditions above so that I can create the
> df below?
>
>
> a   condition   mean(b)   mean(c)
> A   d>=2        _         _
> B   d>=2        _         _
> A   d>=4        _         _
> B   d>=4        _         _
> A   d>=6        _         _
> B   d>=6        _         _
>
>
>
> Ken
> kmna...@gmail.com
> 914-450-0816 (tel)
> 347-730-4813 (fax)
>
>
>

[R] Computing means of multiple variables based on a condition

2016-05-25 Thread KMNanus
I have a large dataset, a sample of which is:

a <- c("A", "B", "A", "B", "A", "B", "A", "B", "A", "B")
b <-c(15, 35, 20,  99, 75, 64, 33, 78, 45, 20)
c<- c( 111, 234, 456, 876, 246, 662, 345, 480, 512, 179)
d<- c(1.1, 3.2, 14.2, 8.7, 12.5, 5.9, 8.3, 6.0, 2.9, 9.3) 

df <- data.frame(a,b,c,d)

I’m trying to construct a data frame that shows the means of c & b based on the 
condition of d and grouped by a.

I want to create the data frame below, then use ggplot2 to create a line plot 
of b at various conditions of d.

I can compute the grouped means (d>=2, d>=4, etc.) one at a time using dplyr 
but haven’t figured out how to put them all together or put them in one data 
frame.

I'd rather not use a loop and am relatively new to R.  Is there a way I can use 
tapply and set it to the conditions above so that I can create the df below?


a   condition   mean(b)   mean(c)
A   d>=2        _         _
B   d>=2        _         _
A   d>=4        _         _
B   d>=4        _         _
A   d>=6        _         _
B   d>=6        _         _
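
A minimal base-R sketch of one way to build such a table: because the d>=2, d>=4 and d>=6 groups overlap, each threshold is filtered separately and the per-threshold summaries are stacked afterwards. The names `thresholds`, `pieces`, `mean_b` and `mean_c` are illustrative choices, not something from the post.

```r
df <- data.frame(
  a = rep(c("A", "B"), 5),
  b = c(15, 35, 20, 99, 75, 64, 33, 78, 45, 20),
  c = c(111, 234, 456, 876, 246, 662, 345, 480, 512, 179),
  d = c(1.1, 3.2, 14.2, 8.7, 12.5, 5.9, 8.3, 6.0, 2.9, 9.3)
)

thresholds <- c(2, 4, 6)

# For each threshold: keep rows with d >= threshold, then average b and c within each level of a
pieces <- lapply(thresholds, function(th) {
  sub <- df[df$d >= th, ]
  out <- aggregate(cbind(b, c) ~ a, data = sub, FUN = mean)
  names(out)[2:3] <- c("mean_b", "mean_c")
  cbind(out["a"], condition = paste0("d>=", th), out[c("mean_b", "mean_c")])
})

# Stack the per-threshold summaries into one data frame
result <- do.call(rbind, pieces)
result
```

The resulting data frame has one row per combination of `a` and `condition`, which is the long layout that ggplot2 line plots usually expect.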



Ken
kmna...@gmail.com
914-450-0816 (tel)
347-730-4813 (fax)




Re: [R] mixed models

2016-05-25 Thread Jeff Newmiller
Please keep the mailing list in the loop by using reply-all.

I don't think there is a requirement that the number of levels is equal, but 
there may be problems if you don't have the minimum number of records 
corresponding to each combination of levels specified in your model. 

You can change the csv extension to txt and attach for the mailing list. Or, 
better yet, you can use the dput function to embed the data directly in your 
sample code. 
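
As a small illustration of the dput suggestion (the data frame below is a made-up stand-in, not the poster's data): dput() prints R code that rebuilds an object, so the printed text can be pasted into an email and evaluated by the reader.

```r
# Hypothetical small data frame standing in for the real data
Eboni2_small <- data.frame(
  treeNo = c(60L, 72L, 78L),
  Irrigation = c("N", "N", "N"),
  preDawn = c(1.4, 1.2, 1.1),
  stringsAsFactors = FALSE
)

# Prints a structure(...) call that can be pasted into a message
dput(Eboni2_small)

# Round trip: capture that text and evaluate it to rebuild the object
txt <- paste(capture.output(dput(Eboni2_small)), collapse = "\n")
rebuilt <- eval(parse(text = txt))
all.equal(rebuilt, Eboni2_small)
```

For a large data frame, dput(head(x, 20)) keeps the message short while still giving a reproducible sample.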

Also,  please learn to post plain text email to avoid corruption of R code by 
the HTML formatting. 
-- 
Sent from my phone. Please excuse my brevity.

On May 25, 2016 2:26:54 PM PDT, James Henson  wrote:
>Good afternoon Jeff,
>The sample sizes for levels of the factor "Irrigation" are not equal. If
>'nlme' requires equal sample sizes this may be the problem. The same data
>frame runs in 'lme4' without a problem.
>
>Best regards,
>James
>
>On Wed, May 25, 2016 at 3:41 PM, James Henson wrote:
>
>> Good afternoon Jeff,
>>
>> When working with this data frame, I just open the .csv file in R Studio.
>> But, we should not send .csv file to R_help.  What should I send?
>>
>> Best regards,
>> James
>>
>> On Wed, May 25, 2016 at 2:52 PM, Jeff Newmiller wrote:
>>
>>> You forgot to show the commands to us that you used to read the data in
>>> with (your example is not "reproducible"). This step can make all the
>>> difference in the world as to whether your analysis commands will work or
>>> not.
>>> --
>>> Sent from my phone. Please excuse my brevity.
>>>
>>> On May 25, 2016 11:59:06 AM PDT, James Henson wrote:
>>>
>>>> Greetings R community,
>>>>
>>>> My aim is to analyze a mixed-effects model with temporal pseudo-replication
>>>> (repeated measures on the same experimental unit) using 'nlme'. However,
>>>> my code returns the error message "Error in na.fail.default", even though
>>>> the data frame does not contain missing values. My code is below, and the
>>>> data file is attached as 'Eboni2.txt'.
>>>>
>>>> library("nlme")
>>>> str(Eboni2)
>>>> head(Eboni2)
>>>> model1 <- lme(preDawn ~ Irrigation, random=~season_order|treeNo,
>>>> data=Eboni2)
>>>>
>>>> I am genuinely confused.  Hope someone can help.
>>>>
>>>> Best regards,
>>>> James F. Henson


[R] svymean in multistage design

2016-05-25 Thread Diego Cuellar
Thanks for your attention. I have been using your R library, survey. I
made an example for two-stage sampling, SI-SI, and estimated the total and
the mean (the point estimate and the SE).

I also redid the estimation by hand.

For the total, the point estimate and the variance estimate are exactly the
same. But for the mean, I get a different estimated variance.

I have been using formula 8.6.6 of Särndal, -Model Assisted Survey
Sampling-, which for simple random sampling without replacement
is equal to the Taylor approximation for a two-stage sample, Example 5.6.3
of Särndal, -Model Assisted Survey Sampling-.

Can you please help me by telling me which formula the functions
svymean or svyratio are using?

I copy the code.


library(survey)  # svydesign(), svytotal(), svymean(), svyratio()

mydata <- read.table(text =
"id UPS str_UPS USS str_USS hou85 ue91 lab91 clu_1 clu_2 uno
 3  1 1 1a 1 9230  1623  13727 62 95  1
 4  1 1 2a 1 4896  760   5919  62 95  1
 5  1 1 3a 1 4264  767   5823  62 95  1
 6  1 1 4a 1 3119  568   4011  62 95  1
 7  1 1 5a 2 1946  331   2543  62 95  1
 8  1 1 6a 2 1463  187   1448  62 95  1
 9  1 1 7a 2 675   129   927   62 95  1
 1  2 1 1b 1 26881 4123  33786 62 22  1
 2  2 1 2b 1 26881 4123  33786 62 22  1
 10 2 1 3b 1 18494 823   18649 62 22  1
 11 2 1 4b 2 18196 1543  21004 62 22  1
 12 2 1 5b 2 26814 2735  18796 62 22  1
 13 2 1 6b 2 26510 2638  920   62 22  1
 14 3 2 1c 1 25694 1792  15625 62 256 1
 15 3 2 2c 1 14676 494   26122 62 256 1
 16 3 2 3c 1 5742  520   4007  62 256 1
 17 3 2 4c 1 26024 3827  16850 62 256 1
 18 3 2 5c 2 9534  1458273   62 256 1
 19 3 2 6c 2 18236 204   25497 62 256 1
 20 3 2 7c 2 22311 1743  13217 62 256 1
 21 4 2 1d 1 1689  1797  13383 62 15  1
 22 4 2 2d 1 12632 2136  5206  62 15  1
 23 4 2 3d 1 11685 534   13461 62 15  1
 24 4 2 4d 1 14404 2851  3202  62 15  1
 25 4 2 5d 2 3072  223   12818 62 15  1
 26 4 2 6d 2 24476 989   414   62 15  1
 27 4 2 7d 2 1562  476   20837 62 15  1",
  header = TRUE)

View(mydata)

# Two-stage design: UPS identifies the primary units, USS the secondary units;
# clu_1 and clu_2 supply the finite population corrections for each stage
Dis_mues_2_stage <- svydesign(id = ~UPS + USS, fpc = ~clu_1 + clu_2, data = mydata)

svytotal(~ue91, Dis_mues_2_stage)
svymean(~ue91, design = Dis_mues_2_stage)
svyratio(~ue91, ~uno, design = Dis_mues_2_stage, separate = FALSE, na.rm = FALSE)
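
As a pointer for the comparison, and not a confirmed answer from the package author: as far as I understand the survey package, svymean treats the mean as a ratio of two estimated totals (with the population size estimated by the sum of the weights) and applies Taylor linearization. Under that assumption, and writing w_k for the sampling weights, the quantities involved would be:

```latex
\hat{\bar{y}} = \frac{\sum_{k \in s} w_k y_k}{\sum_{k \in s} w_k},
\qquad
\hat{N} = \sum_{k \in s} w_k,
\qquad
u_k = \frac{y_k - \hat{\bar{y}}}{\hat{N}},
\qquad
\hat{V}\bigl(\hat{\bar{y}}\bigr) \approx \hat{V}\Bigl(\sum_{k \in s} w_k u_k\Bigr)
```

Here the last variance is the usual design-based variance estimator of a total, applied to the linearized variable u_k. If the hand calculation instead divides the variance of the estimated total by a known N squared, the two results would generally differ, which may explain the discrepancy. With uno equal to 1 for every unit, svyratio(~ue91, ~uno, ...) estimates the same ratio.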


Re: [R] strange error

2016-05-25 Thread Tom Wright
It may not be the problem, but with RStudio this error pops up when
the area reserved for plotting is too small. Typically this area is in
the right-hand column; if you have it minimised (perhaps to maximise
space for typing) you will hit this problem. Try making it bigger.

Edit: Just ran your code, the problem is with the line
par(oma = c(0, 0, 3, 0))

This sets the plot outer margins, my guess is that you have used this
command then managed to save the settings to your default environment.
Changing to:

par(oma = c(1, 4, 3, 4))


which I think is the default, fixes the code on my system
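
A small sketch of how to inspect and restore the outer margins, in case settings from an earlier call are lingering. It draws to a temporary PDF device of fixed size (an arbitrary choice made here so the example does not depend on the size of the RStudio Plots pane).

```r
# Off-screen device of known size
tmp <- tempfile(fileext = ".pdf")
pdf(tmp, width = 7, height = 7)

default_oma <- par("oma")         # outer margins on a freshly opened device
old <- par(oma = c(0, 0, 3, 0))   # par() returns the previous values invisibly
plot(1:10, main = "inner title")
mtext("outer title", outer = TRUE, side = 3, cex = 1.5)
par(old)                          # put back whatever was set before
restored_oma <- par("oma")

dev.off()
unlink(tmp)

default_oma
```

On a freshly opened device par("oma") reports zeros for all four sides. Closing the device with dev.off() also discards any par() changes made on it.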

On Wed, May 25, 2016 at 12:56 PM,   wrote:
> Hello everyone,
> almost every time I try to plot something, R gives me the following error:
> Error in plot.new() : figure margins too large
> One example is when I tried to run a function that somebody published to
> create a Lorenz attractor:
>
> parameters <- c(s = 10, r = 28, b = 8/3)
> state <- c(X = 0, Y = 1, Z = 1)
> Lorenz <- function(t, state, parameters) {
>   with(as.list(c(state, parameters)), {
>     dX <- s * (Y - X)
>     dY <- X * (r - Z) - Y
>     dZ <- X * Y - b * Z
>     list(c(dX, dY, dZ))
>   })
> }
> times <- seq(0, 50, by = 0.01)
> library(deSolve)
> out <- ode(y = state, times = times, func = Lorenz, parms = parameters)
> par(oma = c(0, 0, 3, 0))
> plot(out, xlab = "time", ylab = "-")
> plot(out[, "Y"], out[, "Z"], pch = ".", type = "l")
> mtext(outer = TRUE, side = 3, "Lorenz model", cex = 1.5)
>
> This is really problematic, because there are barely any functions I
> can plot.
> I am running R version 3.2.3 (2015-12-10) -- "Wooden Christmas Tree" in
> RStudio, and my computer uses Windows 8.1.
> Would it be possible to avoid the problem by using Windows 10?
> Or is there anything else I can do?
>
> Thank you in advance,
> Alice de Sampaio Kalkuhl

Re: [R] strange error

2016-05-25 Thread Duncan Murdoch

On 25/05/2016 12:56 PM, alicekalk...@freenet.de wrote:

> Hello everyone,
> almost every time I try to plot something, R gives me the following error:
> Error in plot.new() : figure margins too large
> One example is when I tried to run a function that somebody published to
> create a Lorenz attractor:
>
> parameters <- c(s = 10, r = 28, b = 8/3)
> state <- c(X = 0, Y = 1, Z = 1)
> Lorenz <- function(t, state, parameters) {
>   with(as.list(c(state, parameters)), {
>     dX <- s * (Y - X)
>     dY <- X * (r - Z) - Y
>     dZ <- X * Y - b * Z
>     list(c(dX, dY, dZ))
>   })
> }
> times <- seq(0, 50, by = 0.01)
> library(deSolve)
> out <- ode(y = state, times = times, func = Lorenz, parms = parameters)
> par(oma = c(0, 0, 3, 0))
> plot(out, xlab = "time", ylab = "-")
> plot(out[, "Y"], out[, "Z"], pch = ".", type = "l")
> mtext(outer = TRUE, side = 3, "Lorenz model", cex = 1.5)
>
> This is really problematic, because there are barely any functions I
> can plot.
> I am running R version 3.2.3 (2015-12-10) -- "Wooden Christmas Tree" in
> RStudio, and my computer uses Windows 8.1.
> Would it be possible to avoid the problem by using Windows 10?
> Or is there anything else I can do?


This is an RStudio problem, not an R problem.  One solution is to make 
the "Plots" pane bigger.  There may be others -- you'll have to contact 
RStudio for help with it.


Duncan Murdoch



[R] strange error

2016-05-25 Thread alicekalkuhl
Hello everyone,
almost every time I try to plot something, R gives me the following error:
Error in plot.new() : figure margins too large
One example is when I tried to run a function that somebody published to
create a Lorenz attractor:

parameters <- c(s = 10, r = 28, b = 8/3)
state <- c(X = 0, Y = 1, Z = 1)
Lorenz <- function(t, state, parameters) {
  with(as.list(c(state, parameters)), {
    dX <- s * (Y - X)
    dY <- X * (r - Z) - Y
    dZ <- X * Y - b * Z
    list(c(dX, dY, dZ))
  })
}
times <- seq(0, 50, by = 0.01)
library(deSolve)
out <- ode(y = state, times = times, func = Lorenz, parms = parameters)
par(oma = c(0, 0, 3, 0))
plot(out, xlab = "time", ylab = "-")
plot(out[, "Y"], out[, "Z"], pch = ".", type = "l")
mtext(outer = TRUE, side = 3, "Lorenz model", cex = 1.5)

This is really problematic, because there are barely any functions I
can plot.
I am running R version 3.2.3 (2015-12-10) -- "Wooden Christmas Tree" in
RStudio, and my computer uses Windows 8.1.
Would it be possible to avoid the problem by using Windows 10?
Or is there anything else I can do?
 
Thank you in advance,
Alice de Sampaio Kalkuhl

 


Re: [R] mixed models

2016-05-25 Thread Jeff Newmiller
You forgot to show the commands to us that you used to read the data in with 
(your example is not "reproducible"). This step can make all the difference in 
the world as to whether your analysis commands will work or not. 
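
To illustrate why the reading step matters, here is a sketch with a made-up three-row, tab-separated excerpt (the column names are borrowed from the post, the values and the object name `Eboni_demo` are only for illustration). An empty numeric field is read as NA, so a file can contain missing values even when none are obvious by eye.

```r
# Hypothetical tab-separated excerpt; the d15N field of the first data row is empty
txt <- "treeNo\tpreDawn\tIrrigation\td15N\n60\t1.4\tN\t\n78\t1.1\tN\t3\n79\t1.1\tN\t3.61\n"

Eboni_demo <- read.delim(text = txt, stringsAsFactors = FALSE)

str(Eboni_demo)

# Count missing values per column after reading
na_counts <- colSums(is.na(Eboni_demo))
na_counts
```

Running the same colSums(is.na(...)) check on the real data frame, restricted to the variables used in the model, would show whether any missing values reach the model fit.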
-- 
Sent from my phone. Please excuse my brevity.

On May 25, 2016 11:59:06 AM PDT, James Henson  wrote:
>Greetings R community,
>
>My aim is to analyze a mixed-effects model with temporal pseudo-replication
>(repeated measures on the same experimental unit) using 'nlme'. However,
>my code returns the error message "Error in na.fail.default", even though
>the data frame does not contain missing values. My code is below, and the
>data file is attached as 'Eboni2.txt'.
>
>library("nlme")
>
>str(Eboni2)
>
>head(Eboni2)
>
>model1 <- lme(preDawn ~ Irrigation, random=~season_order|treeNo,
>data=Eboni2)
>
>I am genuinely confused.  Hope someone can help.
>
>Best regards,
>
>James F. Henson
>
>
>
>

[R] mixed models

2016-05-25 Thread James Henson
Greetings R community,

My aim is to analyze a mixed-effects model with temporal pseudo-replication
(repeated measures on the same experimental unit) using 'nlme'.  However,
my code returns the error message "Error in na.fail.default", even though
the data frame does not contain missing values. My code is below, and the
data file is attached as 'Eboni2.txt'.

library("nlme")

str(Eboni2)

head(Eboni2)

model1 <- lme(preDawn ~ Irrigation, random=~season_order|treeNo,
data=Eboni2)

I am genuinely confused.  Hope someone can help.

Best regards,

James F. Henson

number  Location  Season    season_order  Month  treeID    treeNo  preDawn  midday  Irrigation  Pnet  Gs           E            WUE          d15N  d13C    Nper  Cper   include2
1       UCC       November  5             Nov    UCCLO 1   60      1.4      1.3     N           9     0.290700373  3.766207481  2.38967185                              no
2       UCC       November  5             Nov    UCCLO 2   72      1.2      1.3     N           11    0.326258186  3.120573618  3.524992949                             no
3       UCC       November  5             Nov    UCCLO 3   78      1.1      1.2     N           8     0.287095701  1.693820753  4.723049937  3     -27.44  2.12  52.12  yes
4       UCC       November  5             Nov    UCCLO 4   79      1.1      2.1     N           10    0.247517983  1.83934285   5.436724317  3.61  -29.5   1.42  51.97  yes
5       UCC       November  5             Nov    UCCLO 5   80      1.4      1.3     N           13    0.300922817  3.082277827  4.217660032                             no
6       UCC       November  5             Nov    UCCLO 6   81      0.6      1.8     N           17    0.348733689  2.534550345  6.70730413   2.79  -30.5   1.49  49.94  yes
7       UCC       November  5             Nov    UCCLO 7   82      0.9      1.2     N           12    0.272690759  1.809851748  6.630377328  2.43  -29.4   1.55  53.12  yes
8       UCC       November  5             Nov    UCCLO 8   83      1.4      1.1     N           11    0.269862804  1.919849835  5.729614785  2.85  -28.37  1.78  53.52  yes
9       UCC       November  5             Nov    UCCLO 9   84      0.8      1       N           16    0.32333      2.394825767  6.68107059   2.43  -30.1   1.54  52.88  yes
10      UCC       November  5             Nov    UCCLO 10  62      0.9      1.2     N           17    0.29488545   1.429058721  11.89594224  1.51  -31.96  1.61  52.94  yes
11      UCC       November  5             Nov    UCCLO 11  63      1.3      2       N           14    0.241601092  3.29815495   4.244797535                             no
12      UCC       November  5             Nov    UCCLO 12  64      1.2      1.3     N           11    0.261040739  1.610353496  6.83079835   2.62  -28.94  1.46  51.9   yes
13      UCC       November  5             Nov    UCCLO 13  65      1.2      1.3     N           13    0.238863129  2.221057396  5.853068012  1.13  -28.81  2.08  51.43  yes
14      UCC       November  5             Nov    UCCLO 14  66      1.1      1.5     N           9     0.309603194  2.859756011  3.14712163                              no
15      UCC       November  5             Nov    UCCLO 15  67      1.1      1.3     N           18    0.383441504  2.949059627  6.103640576  1.51  -30.14  1.65  52.04  yes
16      UCC       November  5             Nov    UCCLO 16  68      1.3      2.7     N           13    0.269711187  1.430856375  9.085468137  1.99  -29.09  1.87  51.21  yes
17      UCC       November  5             Nov    UCCLO 17  69      0.8      1.4     N           13    0.245685997  3.808576972  3.41334837                              no
18      UCC       November  5             Nov    UCCLO 18  70      1.6      1.8     N           11    0.271419599  2.305398713  4.771408926  2.94  -29.22  1.93  51.46  yes
19      UCC       November  5             Nov    UCCLO 19  71      1        1.5     N           18    0.338103566  2.303586185  7.813903435  2.93  -30.27  1.92  51.51  yes
20      UCC       November  5             Nov    UCCLO 20  73      1.1      1.2     N           11    0.27096196   3.230604699  3.404935307                             no
21      UCC       November  5             Nov    UCCLO 21  74      1.2      1.5     N           10    0.294348983  3.319661403  3.012355414                             no
22      UCC       November  5             Nov    UCCLO 22  75      1.5      1.9     N           12    0.230438536  2.935053987  4.088510825  1.74  -29.49  1.64  50.28  yes
23      UCC       November  5             Nov    UCCLO 23  76      1.1      1.4     N           14    0.264963659  3.689609021  3.794439985

Re: [R] What are some toy models I can use in R?

2016-05-25 Thread C W
On Wed, May 25, 2016 at 1:13 PM, S Ellison  wrote:

> > -Original Message-
> > My data come from statistical model N(5, 2), with n=100, call this
> model_1
> > Then, I add bias to that data with N(3, 1), with n=100, call this model_2
> Do you mean you have data from N(5,2) that has had data from N(3,1) added
> to it, or that you have two different sets of data?
> Or do you mean that you want to know how to generate such data?
>

I generate toy data from N(5,2). X input will be the same for model_1 and
model_2, say, seq(-3, 3, by=0.01).


> > Ultimately, I want to see model_1+ model_2 gives good prediction
> If you generate random data correctly following a model, the model will
> indeed predict the data pretty well. But under those circumstances it seems
> redundant to ask the question. Were you thinking of fitting a (possibly
> different) model to the data at some point ?  If so, what model would you
> want to fit? And what would you want to predict from it?
>

I only use model_1, model_2 for data generation.  I am using machine
learning methods for estimation and prediction to see if the method is good
and robust.  Then, check performance by comparing results with the known
model_1 and model_2.
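
A minimal sketch of that kind of check, under assumptions that are mine rather than the thread's: the second argument of N(., .) is read as a standard deviation, the two parts are independent, and simple moment estimates stand in for whatever estimation method is actually being tested.

```r
set.seed(123)
n <- 100

# Assumption: N(5, 2) and N(3, 1) mean (mean, standard deviation)
model_1 <- rnorm(n, mean = 5, sd = 2)   # signal
model_2 <- rnorm(n, mean = 3, sd = 1)   # additive bias
y <- model_1 + model_2

# With independent parts, y follows N(5 + 3, sqrt(2^2 + 1^2))
true_mean <- 5 + 3
true_sd <- sqrt(2^2 + 1^2)

# Estimate the parameters from the simulated data and compare with the known truth
est <- c(mean = mean(y), sd = sd(y))
ci <- t.test(y)$conf.int   # 95% confidence interval for the mean
rbind(truth = c(true_mean, true_sd), estimate = est)
```

Repeating the simulation many times and recording how often the interval covers the true mean is one way to judge whether an estimation method behaves as expected.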


> > or perhaps parameter estimation.
> What parameters do you want to estimate?








>
> > I think this is a pretty standard statistical analysis problem?
> Unclear on that at present. See above.
>
>


Re: [R] What are some toy models I can use in R?

2016-05-25 Thread S Ellison
> -Original Message-
> My data come from statistical model N(5, 2), with n=100, call this model_1
> Then, I add bias to that data with N(3, 1), with n=100, call this model_2
Do you mean you have data from N(5,2) that has had data from N(3,1) added to 
it, or that you have two different sets of data?
Or do you mean that you want to know how to generate such data?

> Ultimately, I want to see model_1+ model_2 gives good prediction 
If you generate random data correctly following a model, the model will indeed 
predict the data pretty well. But under those circumstances it seems redundant 
to ask the question. Were you thinking of fitting a (possibly different) model 
to the data at some point ?  If so, what model would you want to fit? And what 
would you want to predict from it?

> or perhaps parameter estimation.
What parameters do you want to estimate?


> I think this is a pretty standard statistical analysis problem?
Unclear on that at present. See above.




Re: [R] Reduce does not work with data.table?

2016-05-25 Thread Jeff Newmiller
This is a design feature of data.table objects, which don't conform to the 
normal functional programming paradigm that R is usually designed to adhere to 
and which Reduce expects. Specifically, they normally modify in-place rather 
than leaving the original object alone. 

In short, don't do that. Read more about how data.tables work. Their benefits 
come with distinct disadvantages that you need to be very clear about or you 
will get into trouble like this regularly. 
-- 
Sent from my phone. Please excuse my brevity.

On May 25, 2016 8:37:10 AM PDT, James Hirschorn  
wrote:
>Reduce is failing when applied to a list of elements of class 
>data.table. Perhaps this is a bug?
>
>Example:
>
>library(data.table)
>
>dt1 <- data.table(x = 1:3, y = 4:6)
>dt2 <- data.table(x = 4:6, y = 1:3)
>dt3 <- data.table(x = 0:-2, y = 0:-2)
>
># This works fine
>dt1 + dt2 + dt3
>#    x y
># 1: 5 5
># 2: 6 6
># 3: 7 7
>
># But:
>dt_list <- list(dt1, dt2, dt3)
>Reduce("+", dt_list)
># Error in f(init, x[[i]]) : non-numeric argument to binary operator
># In addition: Warning message:
># In Reduce("+", dt_list) :
>#   Incompatible methods ("Ops.data.frame", "Ops.data.table") for "+"
>
>If I use data.frame instead of data.table, Reduce works properly.
>


[R-es] Error in "optim", APARCH sstd model

2016-05-25 Thread Mª Ángeles Navarro



Hello everyone!
I am fitting an APARCH(1,1) model with a skewed Student-t distribution 
(sstd) to a series of financial returns.
My problem is that I have run the model, with the same code, on different 
series (of the same sample size) and obtained sensible results, but for one 
series in particular I get the following error:

Error in optim(theta, negloglik, hessian = TRUE, ..., tmp = excess) :
  non-finite value supplied by optim
In addition: Warning messages:
1: In sqrt(diag(fit$cvar)) : NaNs produced
2: In sqrt(diag(fit$cvar)) : NaNs produced
3: In sqrt(diag(fit$cvar)) : NaNs produced
4: In sqrt(diag(fit$cvar)) : NaNs produced
5: In sqrt(diag(fit$cvar)) : NaNs produced
6: In sqrt(diag(fit$cvar)) : NaNs produced

Observations: The "sigmas" (volatility) matrix of the model consists of 
infinite values.
Notes: All the series have the same sample size (1762). This model was already 
run on the same series with a smaller sample size (1572), and results were 
obtained without problems.
Hypothesis about the problem, based on information I found: it appears to be 
a false convergence. "Garchfit" uses "optim", which searches for the parameters 
that minimize the function -log(likelihood) (the function appears with a 
negative sign so that the maximum is obtained).
The matrix of second partial derivatives (Hessian) of -log(likelihood) 
provides an approximation to the covariance matrix of the estimated 
parameters. It seems that proper convergence is not reached, so negative 
elements appear on the diagonal of the covariance matrix, giving rise to 
warnings 1 to 6.
Why does this happen? How can I fix it?

Many thanks in advance.

Best regards.
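
The link between the Hessian and the standard errors described above can be sketched on a much simpler model. This is not the APARCH fit: a normal likelihood and simulated data are used purely to show the mechanics, and the checks at the end are suggestions rather than a diagnosis of the failing series.

```r
set.seed(1)
x <- rnorm(500, mean = 0.5, sd = 2)   # hypothetical stand-in for a return series

# Negative log-likelihood of a normal model; log(sd) is optimized so sd stays positive
negloglik <- function(theta, x) {
  -sum(dnorm(x, mean = theta[1], sd = exp(theta[2]), log = TRUE))
}

fit <- optim(c(0, 0), negloglik, x = x, hessian = TRUE, method = "BFGS")

# The inverse Hessian of -log(likelihood) approximates the covariance of the estimates
cvar <- solve(fit$hessian)
se <- sqrt(diag(cvar))

# Checks worth making before trusting the standard errors
fit$convergence == 0       # 0 means optim reported convergence
all(is.finite(fit$par))    # no runaway parameter values
all(diag(cvar) > 0)        # a negative entry here would give NaN in sqrt()
```

When such checks fail on real data, commonly tried remedies include different starting values, rescaling the series (for example working in percent returns) and inspecting the series for extreme observations.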


  

[R] What are some toy models I can use in R?

2016-05-25 Thread C W
Hi everyone,

I am searching for some toy models in R.  My goal is to do model checking.

For example,

My data come from statistical model N(5, 2), with n=100, call this model_1
Then, I add bias to that data with N(3, 1), with n=100, call this model_2

Ultimately, I want to see model_1+ model_2 gives good prediction or perhaps
parameter estimation.

I think this is a pretty standard statistical analysis problem?

How do people on this list deal with it?  Any suggestions?

Thank you,

Mike



[R] Reduce does not work with data.table?

2016-05-25 Thread James Hirschorn
Reduce is failing when applied to a list of elements of class 
data.table. Perhaps this is a bug?


Example:

library(data.table)

dt1 <- data.table(x = 1:3, y = 4:6)
dt2 <- data.table(x = 4:6, y = 1:3)
dt3 <- data.table(x = 0:-2, y = 0:-2)

# This works fine
dt1 + dt2 + dt3
#    x y
# 1: 5 5
# 2: 6 6
# 3: 7 7

# But:
dt_list <- list(dt1, dt2, dt3)
Reduce("+", dt_list)
# Error in f(init, x[[i]]) : non-numeric argument to binary operator
# In addition: Warning message:
# In Reduce("+", dt_list) :
#   Incompatible methods ("Ops.data.frame", "Ops.data.table") for "+"

If I use data.frame instead of data.table, Reduce works properly.
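
For comparison, here is a minimal sketch of the data.frame case mentioned above, using the same values as dt1, dt2 and dt3. It uses base R only and does not address the data.table error itself; the names df1, df2, df3 and res are illustrative.

```r
# Same values as in the data.table example, stored as plain data frames.
df1 <- data.frame(x = 1:3, y = 4:6)
df2 <- data.frame(x = 4:6, y = 1:3)
df3 <- data.frame(x = 0:-2, y = 0:-2)

# Reduce() folds "+" over the list: (df1 + df2) + df3
res <- Reduce(`+`, list(df1, df2, df3))
res
```

Both columns of res come out as 5, 6, 7, matching the direct sum shown for the data.table objects.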



[R] [R-pkgs] New package: sparsevar

2016-05-25 Thread Simone Vazzoler

Dear R users,

I would like to announce a new package called "sparsevar" version 0.0.3:

https://cran.r-project.org/web/packages/sparsevar/

The package should be useful for estimating sparse VAR/VECM models.
The development version can be found on GitHub:

https://github.com/svazzole/sparsevar

Best,
Simon

___
R-packages mailing list
r-packa...@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-packages



[R] Connecting to Hive in Kerberos enabled hadoop cluster from R

2016-05-25 Thread Kumar, Anoop (GE Corporate, consultant)
Hi All,

Request your help.

We are trying to connect to Hive from R using RStudio. It's a Kerberos-secured 
cluster. The code snippet is below.

==

library(rJava)
library(RJDBC)

cp <- c("/usr/hdp/2.3.2.0-2950/hive/lib/hive-jdbc.jar",
        "/usr/hdp/2.3.2.0-2950/hadoop/lib/hadoop-common-2.7.1.2.3.2.0-2950.jar")
.jinit(classpath = cp)

drv <- JDBC("org.apache.hive.jdbc.HiveDriver",
            classPath = list.files("/usr/hdp/2.3.2.0-2950/hadoop/lib",
                                   pattern = "jar$", full.names = TRUE,
                                   recursive = TRUE),
            identifier.quote = "`")

conn <- dbConnect(drv,
                  "jdbc:hive2://host.node1.com:1/default;principal=hive/shost.node1@node1.com",
                  "", "")

show_databases <- dbGetQuery(conn, "show databases")

show_databases

==

But we are getting the below error

Error in .jcall(drv@jdrv, "Ljava/sql/Connection;", "connect", 
as.character(url)[1],  :
  java.lang.NoClassDefFoundError: Could not initialize class 
org.apache.hadoop.security.UserGroupInformation

What are we missing here? A Kerberos ticket is in place. Should we use a 
Kerberos keytab inside the R code? What is the function for it? Also, which 
Hadoop libraries should we import for R and Hive to interact?



Thanks & Regards,

Anoop Kumar K M



Re: [R] Factor Variable frequency

2016-05-25 Thread S Ellison


> ruipbarra...@sapo.pt
> Maybe the following (untested).
> 
> table(df$Protocol[df$Speed == "SLOW"])

Could also use which.max to get the particular item: ...
tprot <- table(df$Protocol[df$Speed == "SLOW"])
tprot[which.max(tprot)]

S Ellison
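
The two lines above can be tried on a small made-up data frame. This is only a sketch: the values are hypothetical, and only the column names Protocol and Speed come from the thread.

```r
# Hypothetical data; column names follow the thread.
df <- data.frame(
  Protocol = c("TCP", "UDP", "TCP", "ICMP", "TCP", "UDP"),
  Speed    = c("SLOW", "SLOW", "FAST", "SLOW", "SLOW", "FAST"),
  stringsAsFactors = TRUE
)

# Frequency of each protocol among the SLOW rows
tprot <- table(df$Protocol[df$Speed == "SLOW"])

# which.max() gives the position of the largest count; indexing the table
# with it keeps the protocol name alongside the count.
top <- tprot[which.max(tprot)]
top
names(top)
```

Note that which.max() returns only the first maximum, so ties between protocols are not reported.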




[R] Antwort: Re: Creating a data frame from scratch (SOLVED)

2016-05-25 Thread G . Maubach
Hi Dan,
Hi All,

many thanks for your help.

Please find enclosed my little function for your use:

-- cut --

#---
# Module: t_count_na.R
# Author: Georg Maubach
# Date  : 2016-05-24
# Update: 2016-05-25
# Description   : Count NA's
# Source System : R 3.2.2 (64 Bit)
# Target System : R 3.2.2 (64 Bit)
# License   : CC-BY-SA-NC
#1-2-3-4-5-6-7-8

test <- FALSE

t_count_na <- function(dataset,
                       variables = "all") {
  # Counts the number of NA within a given set of variables
  #
  # Args:
  #   dataset  : Object with dimnames, e.g. data frame, data table.
  #   variables: Character vector with variable names.
  #
  # Operation:
  #   Adds the variable "na_count" to the given dataset containing the
  #   count of NA's within the given variables
  #
  # Returns:
  #   Original dataset with variable "na_count" added.
  #
  # Error handling:
  #   None.
  #
  # Credits:
  #   http://stackoverflow.com/questions/4862178/remove-rows-with-nas-in-data-frame
  #   http://r.789695.n4.nabble.com/Creating-variables-on-the-fly-td4720034.html

  version <- "2016-05-25"

  if (identical(variables, "all")) {
    variable_list <- names(dataset)
  } else {
    variable_list <- variables
  }
  # drop = FALSE keeps a data frame even when only one variable is selected
  dataset[["na_count"]] <- apply(dataset[, variable_list, drop = FALSE],
                                 1,
                                 function(x) sum(is.na(x)))

  return(dataset)

}

#---

# Named differently from the logical flag "test" above so that defining
# the function does not overwrite the flag.
test_t_count_na <- function(do_test = FALSE) {

  # Run the demonstration only when explicitly requested
  if (!isTRUE(do_test)) {
    return(invisible(NULL))
  }

  cat("\n", "\n", "Test function t_count_na()", "\n", "\n")

  # Example dataset
  gene <- c("ENSG0208234", "ENSG0199674", "ENSG0221622", "ENSG0207604",
            "ENSG0207431", "ENSG0221312", "ENSG00134940305", "ENSG00394039490",
            "ENSG09943004048")
  hsap <- c(0, 0, 0, 0, 0, 0, 1, 1, 1)
  mmul <- c(NA, 2, 3, NA, 2, 1, NA, 2, NA)
  mmus <- c(NA, 2, NA, NA, NA, 2, NA, 3, 1)
  rnor <- c(NA, 2, NA, 1, NA, 3, NA, NA, 2)
  cfam <- c(NA, 2, NA, 2, 1, 2, 2, NA, NA)
  ds_example <- data.frame(gene, hsap, mmul, mmus, rnor, cfam)
  ds_example$gene <- as.character(ds_example$gene)

  cat("\n", "\n", "Example dataset before function call", "\n", "\n")
  print(ds_example)

  cat("\n", "\n", "Function call", "\n", "\n")
  ds_example <- t_count_na(dataset = ds_example,
                           variables = c("mmul", "mmus"))

  cat("\n", "\n", "Example dataset after function call", "\n", "\n")
  print(ds_example)
}

test_t_count_na(do_test = test)

# EOF .

-- cut --

Kind regards

Georg Maubach




From:    "Nordlund, Dan (DSHS/RDA)"
To:      "r-help@r-project.org"
Date:    24.05.2016 21:41
Subject: Re: [R] Creating a data frame from scratch
Sent by: "R-help"




I would probably write the function something like this:


t_count_na <- function(dataset,
                       variables = "all") {
  if (identical(variables, "all")) {
    variable_list <- names(dataset)
  } else {
    variable_list <- variables
  }
  apply(dataset[, variable_list], 1, function(x) sum(is.na(x)))
}
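
A vectorised variant of the same idea is sketched below, assuming the input is a plain data frame. The function name count_na_rows and the toy data are hypothetical and not part of the thread. It relies on is.na() returning a logical matrix for a data frame, so rowSums() can count the NA values per row without apply().

```r
# Hypothetical helper: count NA values per row over selected columns.
count_na_rows <- function(dataset, variables = names(dataset)) {
  # drop = FALSE keeps a data frame even if a single column is selected
  rowSums(is.na(dataset[, variables, drop = FALSE]))
}

# Made-up example data
toy <- data.frame(a = c(1, NA, 3),
                  b = c(NA, NA, 6),
                  c = c("x", "y", NA))
toy$na_count <- count_na_rows(toy, c("a", "b"))
toy
```

Using the column names as the default for variables avoids the special "all" value, at the cost of a slightly different interface from the function above.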


Hope this is helpful,

Dan

Daniel Nordlund, PhD
Research and Data Analysis Division
Services & Enterprise Support Administration
Washington State Department of Social and Health Services


> -Original Message-
> From: R-help [mailto:r-help-boun...@r-project.org] On Behalf Of
> g.maub...@gmx.de
> Sent: Tuesday, May 24, 2016 11:55 AM
> To: r-help@r-project.org
> Subject: [R] Creating a data frame from scratch
> 
> Hi All,
> 
> I need to create a data frame from scratch and fill variables created on
> the fly with values. What I have so far:
> 
> -- snip --
> 
> # Example dataset
> gene <-
> c("ENSG0208234","ENSG0199674","ENSG0221622","ENSG0207604",
>   "ENSG0207431","ENSG0221312","ENSG00134940305","ENSG00394039490",
>   "ENSG09943004048")
> hsap <- c(0,0,0, 0, 0, 0, 1,1, 1)
> mmul <- c(NA,2 ,3, NA, 2, 1 , NA,2, NA)
> mmus <- c(NA,2 ,NA, NA, NA, 2 , NA,3, 1)
> rnor <- c(NA,2 ,NA, 1 , NA, 3 , NA,NA, 2)
> cfam <- c(NA,2,NA, 2, 1, 2, 2,NA, NA)
> 
> ds_example <- data.frame(gene, hsap, mmul, mmus, rnor, cfam)
> ds_example$gene <- as.character(ds_example$gene)
> 
> t_count_na <- function(dataset,
>variables = "all")
>   # credit: http://stackoverflow.com/questions/4862178/remove-rows-with-nas-in-data-frame
>   {
>   ds_na <- data.frame()
>   # if variables = "all" create character vector of variable names
>   if (variables == "all") {
> variable_list <- dimnames(dataset)[[ 2 ]]
>   }
>   # if a character vector with variable names is given
>   # to run the function on a defined set of selected variables
>   else {
> variable_list <- variables
>   }
> 
>   for (var in variable_list) {
> new_name <-