Re: [R] Multibyte strings

2015-09-26 Thread Dennis Fisher
Peter

Thanks for the explanation.  One further comment — you wrote:
> I don't think the FDA "requests" XPT files 

In fact, they do make such a request.  Here is the actual language received 
this week (and repeatedly in the past):
> Program/script files should be submitted using text files (*.TXT) and the 
> data should be submitted using SAS transport files (*.XPT).

Dennis

Dennis Fisher MD
P < (The "P Less Than" Company)
Phone: 1-866-PLessThan (1-866-753-7784)
Fax: 1-866-PLessThan (1-866-753-7784)
www.PLessThan.com



> On Sep 26, 2015, at 5:52 AM, peter dalgaard  wrote:
> 
> Dennis,
> 
> The invalid multibyte issue is almost certainly a symptom of being in a UTF-8 
> locale and trying to handle strings that aren't in UTF-8. (UTF-8 uses 
> particular 8-bit patterns to say that the following k bytes contain a Unicode 
> value outside ASCII; other "8-bit ASCII" encodings, like Latin-1, just use 
> the extra 128 character codes for special characters.) Treating the latter as 
> the former causes errors; the other way around just looks weird.
> 
> So perhaps you should try diddling your locale settings and/or look for 
> encoding arguments for the functions that you use. Then again, the XPT format 
> may not be happy with non-ASCII characters, whatever the encoding, in which 
> case you may need to massage the input data sets and change variable names 
> and factor labels (iconv() should be your friend).
> 
> By the way, I don't think the FDA "requests" XPT files. As far as I recall, 
> they say somewhere that they _accept_ them (possibly defending themselves 
> against the platform-specific SAS files that once abounded), but I think even 
> Excel files pass for submissions - the important thing is that they can get at 
> the actual data reasonably easily. I can see the attraction of taking the 
> well-trodden path, though.
> 
> -pd
> 
>> On 25 Sep 2015, at 23:23 , Dennis Fisher  wrote:
>> 
>> R 3.2.0
>> OS X
>> 
>> Colleagues,
>> 
>> Earlier today, I initiated a series of emails regarding SASxport (which was 
>> removed from CRAN).  David Winsemius proposed downloading the source code 
>> and installing with the following command:
>>  install.packages('~/Downloads/SASxport_1.5.0.tar.gz', repos = NULL, 
>> type = "source")
>> 
>> That works and I am grateful to David for his recommendation.  However, the 
>> package fails on some of the many objects that I attempted to write with:
>>  write.xport
>> 
>> The error message was:
>>  Error in nchar(var) : invalid multibyte string 3157
>> 
>> One work-around would be to edit out multibyte strings.  Is there a simple 
>> way to find and replace them?  Or is there some other clever approach that 
>> bypasses the problem?
>> 
>> Dennis
>> 
>> Dennis Fisher MD
>> P < (The "P Less Than" Company)
>> Phone: 1-866-PLessThan (1-866-753-7784)
>> Fax: 1-866-PLessThan (1-866-753-7784)
>> www.PLessThan.com
>> 
>> __
>> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
> 
> -- 
> Peter Dalgaard, Professor,
> Center for Statistics, Copenhagen Business School
> Solbjerg Plads 3, 2000 Frederiksberg, Denmark
> Phone: (+45)38153501
> Email: pd@cbs.dk  Priv: pda...@gmail.com

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] How to get significance codes after Kruskal Wallis test

2015-09-26 Thread Kristina Wolf
Perhaps look into the function friedman.test.with.post.hoc()
There is more information here:
http://www.r-statistics.com/wp-content/uploads/2010/02/Friedman-Test-with-Post-Hoc.r.txt

Note that this does not handle NAs, though, and technically it is for blocked
designs, but maybe it will lead you somewhere useful or could be adapted.


~ Kristina

Kristina Wolf
Ph.D. Candidate, Graduate Group in Ecology
M.S. Soil Science, B.S. Animal Science
KristinaMWolf.com
Restoration Ecology Lab
Department of Plant Sciences
University of California, Davis
(530) 750-9771

"We have to remember that what we observe is not nature herself, but nature
exposed to our method of questioning." ~ Werner Heisenberg


On Fri, Sep 25, 2015 at 10:01 PM, Michael Eisenring <
michael.eisenr...@gmx.ch> wrote:

> Is there a way to get significance codes after pairwise comparisons following
> a Kruskal-Wallis test? By significance codes I mean letter codes (a, b, c)
> that are assigned to treatments to indicate where differences are
> significant.
>
> With a traditional ANOVA, such a test can be performed using HSD.test() from
> the agricolae library, but for non-parametric counterparts of ANOVA I have
> not been able to find anything.
>
> Can anyone help me?
>
> Thanks mike
>
>
>
> I added two example codes.
>
> The first code represents an ANOVA and an HSD.test() call giving me
> significance codes.
>
>  #FIRST CODE USING ANOVA
>
>
>
>
> library(agricolae)
> an.dta<-aov(Gossypol~Treatment,data=dta)
> summary(an.dta)
>
> HSD.test(an.dta,"Treatment")
> # The default alpha level is 0.05.
> outT<-HSD.test(an.dta,"Treatment", group=T)
> outT
>
> #I receive significance codes.
>
>
>
>
>
> #SECOND CODE USING KRUSKAL-WALLIS
>
> library(agricolae)
> an.dta2<-kruskal.test(Heliocide~Treatment,dta)
> summary(an.dta2)
>
> HSD.test(an.dta2,"Treatment")
>
> #ERROR MESSAGE no significance codes, why??
>
>
>
> #DATA FOR CODES
>
>
> structure(list(Treatment = structure(c(1L, 3L, 4L, 2L, 1L, 3L,
> 4L, 2L, 5L, 1L, 3L, 2L, 5L, 1L, 3L, 4L, 2L, 5L, 1L, 3L, 4L, 2L,
> 5L, 1L, 3L, 4L, 2L, 5L, 1L, 3L, 4L, 2L, 5L, 1L, 3L, 4L, 2L, 5L,
> 1L, 3L, 4L, 2L, 5L, 1L, 3L, 4L, 2L, 5L, 1L, 3L, 4L, 2L, 5L, 1L,
> 3L, 5L), .Label = c("1_2d", "1_7d", "3_2d", "9_2d", "C"), class =
> "factor"),
>
> Code = structure(c(1L, 2L, 3L, 4L, 18L, 19L, 20L, 21L, 22L,
> 23L, 24L, 25L, 26L, 27L, 28L, 29L, 30L, 31L, 32L, 33L, 34L,
> 35L, 36L, 37L, 38L, 39L, 40L, 41L, 42L, 43L, 44L, 45L, 46L,
> 47L, 48L, 49L, 50L, 51L, 52L, 53L, 54L, 55L, 56L, 5L, 6L,
> 7L, 8L, 9L, 10L, 11L, 12L, 13L, 14L, 15L, 16L, 17L), .Label =
> c("1_2d_1c",
> "1_2d_3c", "1_2d_9c", "1_7d_1c", "10_2d_1c", "10_2d_3c",
> "10_2d_9c", "10_7d_1c", "10_C", "11_2d_1c", "11_2d_3c", "11_2d_9c",
> "11_7d_1c", "11_C", "12_2d_1c", "12_2d_3c", "12_C", "2_2d_1c",
> "2_2d_3c", "2_2d_9c", "2_7d_1c", "2_C", "3_2d_1c", "3_2d_3c",
> "3_7d_1c", "3_C", "4_2d_1c", "4_2d_3c", "4_2d_9c", "4_7d_1c",
> "4_C", "5_2d_1c", "5_2d_3c", "5_2d_9c", "5_7d_1c", "5_C",
> "6_2d_1c", "6_2d_3c", "6_2d_9c", "6_7d_1c", "6_C", "7_2d_1c",
> "7_2d_3c", "7_2d_9c", "7_7d_1c", "7_C", "8_2d_1c", "8_2d_3c",
> "8_2d_9c", "8_7d_1c", "8_C", "9_2d_1c", "9_2d_3c", "9_2d_9c",
> "9_7d_1c", "9_C"), class = "factor"), Glands = c(165, 289.333,
> 319.333, 472, 334.667, 259, 373.333, 525.667,
> 275.333, 230.667, 346.333, 377.667, 255.333,
> 217.667, 266, 300.333, 354.333, 225.333,
> 294, 359, 359, 222.667, 103, 246.667, 324.667,
> 277, 460, 163.667, 226.333, 228, 357.667, 505,
> 142.667, 324, 278.667, 317.333, 335.667,
> 193.667, 188, 255, 252, 393.333, 248.333, 353,
> 320.667, 228.333, 497, 165.667, 209.333,
> 162.333, 280, 337, 169.667, 231.667, 257.667,
> 218.667), Tannin = c(0.334252451, 1.376077586, 0.896849593,
> 0.888621795, 0.464285714, 0.830236486, 0.870881783, 0.768489583,
> 0.647727273, 0.81372549, 0.51380814, 0.859923246, 0.495265152,
> 0.699932796, 1.09375, 0.785037879, 0.892650463, 0.518963675,
> 1.05859375, 0.447916667, 1.269097222, 1.147522523, 0.391276042,
> 0.883400538, 1.523989899, 0.907930108, 0.749155405, 0.450126263,
> 0.562239583, 0.911151961, 0.6, 1.610677083, 0.446428571,
> 0.601151316, 1.073635057, 1.359923246, 1.00154321, 0.90933642,
> 0.012054398, 1.10208, 1.01736, 1.052372685, 0.958607456,
> 1.224702381, 0.982291667, 1.045138889, 1.611607143, 0.662574405,
> 1.385416667, 0.464518229, 0.99444, 1.23958, 0.877514368,
> 0.74453125, 0.804315476, 1.024066092), H.polone = c(6754.067177,
> 22380.26652, 23622.79158, 23733.77678, 13099.20833, 23564.74907,
> 2725.016387, 18751.03986, 4283.098494, 23008.35336, 10205.56354,
> 19787.63361, 4302.050374, 7400.640798, 22442.86044, 34315.09631,
> 16498.66728, 14170.13252, 9509.1073, 6265.29637, 

Re: [R] How to calculate standard error of estimate (S) for my non-linear regression model?

2015-09-26 Thread peter dalgaard
This is one area in which terminology in (computational) statistics has gone a 
bit crazy. The thing some call "standard error of estimate" is actually the 
residual standard deviation in the regression model, not to be confused with 
the standard errors that are associated with parameter estimates. In 
summary(nls(...)) (and summary(lm()) for that matter), you'll find it as 
"residual standard error",  and even that is a bit of a misnomer.

-pd 
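
As a cross-check of the terminology, the quantity in question can be recomputed by hand from a fitted model; a minimal sketch on R's built-in cars data (not the poster's dataset):

```r
# The "residual standard error" printed by summary() is just
# sqrt(RSS / df), i.e. the residual standard deviation.
fit <- lm(dist ~ speed, data = cars)

rss <- sum(residuals(fit)^2)        # residual sum of squares
s   <- sqrt(rss / df.residual(fit)) # divide by residual degrees of freedom

all.equal(s, summary(fit)$sigma)    # matches the number summary() reports
```

The same comparison works for an nls() fit, since summary() stores the value in the sigma component there as well.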

> On 26 Sep 2015, at 07:08 , Michael Eisenring  wrote:
> 
> Hi all,
> 
> I am looking for something that indicates the goodness of fit for my
> non-linear regression model (since R2 is not very reliable).
> 
> I read that the standard error of estimate (also known as standard error of
> the regression) is a good alternative.
> 
> 
> 
> The standard error of estimate is described on this page (including the
> formula): http://onlinestatbook.com/2/regression/accuracy.html
> 
> Unfortunately, however, I have no clue how to program it in R. Does anyone
> know and could help me?
> 
> Thank you very much.
> 
> 
> 
> I added an example of my model and a dput() of my data
> 
> #CODE
> 
> dta<-read.csv("Regression_exp2.csv",header=T, sep = ",")
> attach(dta)  # tells R to do the following analyses on this dataset
> head(dta)
> 
> 
> 
> # load package for non-linear regression models
> library(nls2)
> 
> #Aim: fit equation to data: y ~ y0 + a*(1 - b^x): two-parameter exponential,
> # single rise to the maximum
> # y = Gossypol (from my data set), x = Damage_cm (from my data set)
> #The other 3 parameters are unknown: y0 = intercept, a = asymptote and b = slope
> 
> plot(Gossypol~Damage_cm, dta)
> # Looking at the plot, 0 is a plausible estimate for y0:
> # a+y0 is the asymptote, so estimate about 4000;
> # b is between 0 and 1, so estimate .5
> dta.nls <- nls(Gossypol~y0+a*(1-b^Damage_cm), dta,
>   start=list(y0=0, a=4000, b=.5))
> 
> xval <- seq(0, 10, 0.1)
> lines(xval, predict(dta.nls, data.frame(Damage_cm=xval)))
> profile(dta.nls, alpha= .05)
> 
> 
> summary(dta.nls)
> 
> 
> 
> 
> 
> 
> 
> #INPUT
> 
> structure(list(Gossypol = c(948.2418407, 1180.171957, 3589.187889, 
> 450.7205451, 349.0864019, 592.3403778, 723.885643, 2005.919344, 
> 720.9785449, 1247.806111, 1079.846532, 1500.863038, 4198.569251, 
> 3618.448997, 4140.242559, 1036.331811, 1013.807628, 2547.326207, 
> 2508.417927, 2874.651764, 1120.955, 1782.864308, 1517.045807, 
> 2287.228752, 4171.427741, 3130.376482, 1504.491931, 6132.876396, 
> 3350.203452, 5113.942098, 1989.576826, 3470.09352, 4576.787021, 
> 4854.985845, 1414.161257, 2608.716056, 910.8879471, 2228.522959, 
> 2952.931863, 5909.068158, 1247.806111, 6982.035521, 2867.610671, 
> 5629.979049, 6039.995102, 3747.076592, 3743.331903, 4274.324792, 
> 3378.151945, 3736.144027, 5654.858696, 5972.926124, 3723.629772, 
> 3322.115942, 3575.043632, 2818.419785), Treatment = structure(c(5L, 
> 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 1L, 1L, 1L, 1L, 1L, 1L, 
> 1L, 1L, 1L, 1L, 1L, 1L, 3L, 3L, 3L, 3L, 2L, 2L, 2L, 3L, 3L, 3L, 
> 3L, 3L, 3L, 3L, 3L, 2L, 2L, 2L, 4L, 2L, 4L, 4L, 2L, 4L, 2L, 2L, 
> 4L, 4L, 4L, 4L, 4L, 4L, 2L), .Label = c("1c_2d", "1c_7d", "3c_2d", 
> "9c_2d", "C"), class = "factor"), Damage_cm = c(0, 0, 0, 0, 0, 
> 0, 0, 0, 0, 0, 0, 0.142, 0.4035, 0.4435, 0.491, 0.4955, 0.578, 
> 0.5895, 0.6925, 0.6965, 0.756, 0.8295, 1.0475, 1.313, 1.516, 
> 1.573, 1.62, 1.8115, 1.8185, 1.8595, 1.989, 2.129, 2.171, 2.3035, 
> 2.411, 2.559, 2.966, 2.974, 3.211, 3.2665, 3.474, 3.51, 3.547, 
> 4.023, 4.409, 4.516, 4.7245, 4.809, 4.9835, 5.568, 5.681, 5.683, 
> 7.272, 8.043, 9.437, 9.7455), Damage_groups = c(0.278, 1.616, 
> 2.501, 3.401, 4.577, 5.644, 7.272, 8.043, 9.591, NA, NA, NA, 
> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 
> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 
> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA), Gossypol_Averaged =
> c(1783.211, 
> 3244.129, 2866.307, 3991.809, 4468.809, 5121.309, 3723.629772, 
> 3322.115942, 3196.731, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 
> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 
> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 
> NA, NA, NA, NA, NA), Groups = c(42006L, 42038L, 42067L, 42099L, 
> 42130L, 42162L, 42193L, 42225L, 42257L, NA, NA, NA, NA, NA, NA, 
> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 
> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 
> NA, NA, NA, NA, NA, NA, NA, NA, NA)), .Names = c("Gossypol", 
> "Treatment", "Damage_cm", "Damage_groups", "Gossypol_Averaged", 
> "Groups"), class = "data.frame", row.names = c(NA, -56L))
> 
> 
> 
> 
> 
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> 

[R] flexmix - concomitant model and significance of variables

2015-09-26 Thread Damien Jourdain
Dear All, 

I am new to this forum and to flexmix. 
I am using flexmix to perform a model-based cluster analysis. The data for
the clustering are all continuous (although all between 0 and 1). I also
want to see the relationship between the resulting clusters and some
socio-economic variables, so I am using a concomitant model.

The formulation is:
Conc<- FLXmultinom(~factor(Area)+factor(income)+Gender+factor(Education))

f2c <- flexmix(cbind(ECO, SOC, ENV, CULT)~1, k=5, 
  model=FLXMCmvnorm(), concomitant=Conc, data=data)

To recover the influence of socio-economic variables from the output I am
using:
f2c@concomitant@coef

However, how do I know which variables are significant?

Any help is welcomed!

Best

Damien
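
One route that may help, hedged because it has not been run against this model: flexmix provides refit(), and summary() of the refitted object reports coefficient standard errors and Wald tests; depending on the flexmix version, a which argument may select the concomitant part:

```r
library(flexmix)  # assumed installed; f2c is the fitted flexmix object above

# Refit the converged solution to obtain standard errors for the
# coefficients, then inspect the Wald tests in the summary:
f2c_re <- refit(f2c)
summary(f2c_re)                         # component coefficient tables
summary(f2c_re, which = "concomitant")  # concomitant model, if supported
```

Check ?refit in your installed flexmix version before relying on the which argument.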





--
View this message in context: 
http://r.789695.n4.nabble.com/flexmix-concomitant-model-and-significance-of-variables-tp4712809.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] R studio and git

2015-09-26 Thread David Winsemius
This is not the RStudio support list. RStudio support is offered on a
web-hosted forum of its own.

-- 
David

On Sep 25, 2015, at 6:19 PM, Glenn Schultz wrote:

> Hello all,
> 
> Maybe not the right place.  I have a project that is under Git version 
> control in RStudio.  I just updated to a recent OS X build and now the Git 
> tab in RStudio is not available.  It seems I can still use version control 
> with SourceTree, but it is strange that I lost Git integration on the OS X 
> update. Any ideas?
> 
> Glenn
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

David Winsemius
Alameda, CA, USA

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Analysis of causal relations between rare (categorical) events

2015-09-26 Thread Giorgio Garziano
Hi,

I may suggest the following book, which introduces event history analysis with R
and includes some datasets to work with:

https://www.crcpress.com/Event-History-Analysis-with-R/Brostrm/9781439831649

I am not sure it can answer all your questions about your specific problem
(rare events), but it may help.

Giorgio Garziano






__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Multibyte strings

2015-09-26 Thread peter dalgaard
Dennis,

The invalid multibyte issue is almost certainly a symptom of being in a UTF-8 
locale and trying to handle strings that aren't in UTF-8. (UTF-8 uses particular 
8-bit patterns to say that the following k bytes contain a Unicode value 
outside ASCII; other "8-bit ASCII" encodings, like Latin-1, just use the extra 
128 character codes for special characters.) Treating the latter as the former 
causes errors; the other way around just looks weird.

So perhaps you should try diddling your locale settings and/or look for 
encoding arguments for the functions that you use. Then again, the XPT format 
may not be happy with non-ASCII characters, whatever the encoding, in which 
case you may need to massage the input data sets and change variable names and 
factor labels (iconv() should be your friend).
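
A minimal sketch of that iconv() route (the example vector, and the assumption that the true source encoding is Latin-1, are hypothetical; adapt to the actual data):

```r
# Locate strings that are not valid UTF-8, then re-encode them.
x <- c("alpha", "beta\xb5", "gamma")  # "\xb5" is a Latin-1 micro sign

# iconv() returns NA for elements whose bytes are invalid in the
# declared source encoding, which flags the offenders:
bad <- is.na(iconv(x, from = "UTF-8", to = "UTF-8"))
which(bad)  # -> 2

# If the data are really Latin-1, reinterpret them as UTF-8:
x_utf8 <- iconv(x, from = "latin1", to = "UTF-8")

# For XPT, forcing plain ASCII may be safest; sub = "" drops any
# bytes that cannot be converted:
x_ascii <- iconv(x, from = "latin1", to = "ASCII", sub = "")
```

Applying the same test column-by-column over a data frame would locate which variable triggers the nchar() error.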

By the way, I don't think the FDA "requests" XPT files. As far as I recall, 
they say somewhere that they _accept_ them (possibly defending themselves 
against the platform-specific SAS files that once abounded), but I think even 
Excel files pass for submissions - the important thing is that they can get at 
the actual data reasonably easily. I can see the attraction of taking the 
well-trodden path, though.

-pd

> On 25 Sep 2015, at 23:23 , Dennis Fisher  wrote:
> 
> R 3.2.0
> OS X
> 
> Colleagues,
> 
> Earlier today, I initiated a series of emails regarding SASxport (which was 
> removed from CRAN).  David Winsemius proposed downloading the source code and 
> installing with the following command:
>   install.packages('~/Downloads/SASxport_1.5.0.tar.gz', repos = NULL, 
> type = "source")
> 
> That works and I am grateful to David for his recommendation.  However, the 
> package fails on some of the many objects that I attempted to write with:
>   write.xport
> 
> The error message was:
>   Error in nchar(var) : invalid multibyte string 3157
> 
> One work-around would be to edit out multibyte strings.  Is there a simple 
> way to find and replace them?  Or is there some other clever approach that 
> bypasses the problem?
> 
> Dennis
> 
> Dennis Fisher MD
> P < (The "P Less Than" Company)
> Phone: 1-866-PLessThan (1-866-753-7784)
> Fax: 1-866-PLessThan (1-866-753-7784)
> www.PLessThan.com
> 
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

-- 
Peter Dalgaard, Professor,
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Email: pd@cbs.dk  Priv: pda...@gmail.com

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Analysis of causal relations between rare (categorical) events

2015-09-26 Thread Jim Lemon
Hi Mark,
You might find the eventInterval package of use.

Jim


On Sat, Sep 26, 2015 at 8:33 AM, markshanks 
wrote:

> Hi,
>
> I have only a relatively basic background in statistics (e.g., anova,
> regression), and the books on R I have read so far have focused on
> relatively common statistical analyses (e.g., outlier analysis, trend
> forecasting) and haven't helped me with the data problem I am facing.
>
> In short, imagine if the data is date stamped and we are interested in
> predicting the occurrence of relatively-rare, categorical events through
> the
> occurrence of other, relatively-rare categorical events. Both the outcomes
> and predictors can be a huge number of different types, although it is
> possible to group them as well.
>
> One way to go would seem to be to take the difference in time between each
> set of predictors and outcomes and see if there is more consistency for
> some
> measures than others?? However, I haven't found a good book or package that
> is directly aimed at this type of dataset, although I'm guessing there must
> be some...
>
> Thanks,
>
> Mark
>
>
>
>
>
> --
> View this message in context:
> http://r.789695.n4.nabble.com/Analysis-of-causal-relations-between-rare-categorical-events-tp4712801.html
> Sent from the R help mailing list archive at Nabble.com.
>
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>


__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] How to calculate standard error of estimate (S) for my non-linear regression model?

2015-09-26 Thread Bert Gunter
Michael:

You appear to be laboring under the illusion that a single numeric
summary (**any summary**) is a useful measure of model adequacy. It is
not; for details about why not, consult any applied statistics text
(e.g. on regression) and/or post on a statistics site, like
stats.stackexchange.com. Better yet, consult a local statistician.

Incidentally, this is even more the case for NON-linear models. Again,
consult appropriate statistical resources. Even googling on "R^2
inadequate for nonlinear models" brought up some interesting
resources, among them:

http://www.ncbi.nlm.nih.gov/pmc/articles/PMC2892436/

Cheers,
Bert

Bert Gunter

"Data is not information. Information is not knowledge. And knowledge
is certainly not wisdom."
   -- Clifford Stoll


On Sat, Sep 26, 2015 at 8:56 AM, peter dalgaard  wrote:
>
>> On 26 Sep 2015, at 16:46 , Michael Eisenring  
>> wrote:
>>
>> Dear Peter,
>> Thank you for your answer.
>> If I look at my summary I see there a Residual standard error: 1394 on 53
>> degrees of freedom.
>> This number is very high (the fit of the curve is pretty bad I know but
>> still...). Are you sure the residual standard error given in the summary is
>> the same as the one described on this page:
>> http://onlinestatbook.com/2/regression/accuracy.html
>
> Sure I'm sure (& I did check!)... But notice that unlike R^2, Residual SE is 
> not dimensionless. Switch from millimeters to meters in your response measure 
> and the Residual SE instantly turns into 1.394. It's basically saying that 
> your model claims to be able to predict Y within +/-2800 units. How good or 
> bad that is, is your judgment.
>
>> I am basically just looking for a value that describes the goodness of fit
>> for my non-linear regression model.
>>
>>
>> This is probably a pretty obvious question, but I am not a statistician and
>> as you said the terminology is sometimes pretty confusing.
>> Thanks mike
>>
>> -----Original Message-----
>> From: peter dalgaard [mailto:pda...@gmail.com]
>> Sent: Saturday, 26 September 2015 01:43
>> To: Michael Eisenring 
>> Cc: r-help@r-project.org
>> Subject: Re: [R] How to calculate standard error of estimate (S) for my
>> non-linear regression model?
>>
>> This is one area in which terminology in (computational) statistics has gone
>> a bit crazy. The thing some call "standard error of estimate" is actually
>> the residual standard deviation in the regression model, not to be confused
>> with the standard errors that are associated with parameter estimates. In
>> summary(nls(...)) (and summary(lm()) for that matter), you'll find it as
>> "residual standard error",  and even that is a bit of a misnomer.
>>
>> -pd
>>
>>> On 26 Sep 2015, at 07:08 , Michael Eisenring 
>> wrote:
>>>
>>> Hi all,
>>>
>>> I am looking for something that indicates the goodness of fit for my
>>> non-linear regression model (since R2 is not very reliable).
>>>
>>> I read that the standard error of estimate (also known as standard
>>> error of the regression) is a good alternative.
>>>
>>>
>>>
>>> The standard error of estimate is described on this page (including the
>>> formula): http://onlinestatbook.com/2/regression/accuracy.html
>>>
>>> Unfortunately, however, I have no clue how to program it in R. Does
>>> anyone know and could help me?
>>>
>>> Thank you very much.
>>>
>>>
>>>
>>> I added an example of my model and a dput() of my data
>>>
>>> #CODE
>>>
>>> dta<-read.csv("Regression_exp2.csv",header=T, sep = ",")
>>> attach(dta)  # tells R to do the following analyses on this dataset
>>> head(dta)
>>>
>>>
>>>
>>> # load package for non-linear regression models
>>> library(nls2)
>>>
>>> #Aim: fit equation to data: y ~ y0 + a*(1 - b^x): two-parameter exponential,
>>> # single rise to the maximum
>>> # y = Gossypol (from my data set), x = Damage_cm (from my data set)
>>> #The other 3 parameters are unknown: y0 = intercept, a = asymptote and b = slope
>>>
>>> plot(Gossypol~Damage_cm, dta)
>>> # Looking at the plot, 0 is a plausible estimate for y0:
>>> # a+y0 is the asymptote, so estimate about 4000;
>>> # b is between 0 and 1, so estimate .5
>>> dta.nls <- nls(Gossypol~y0+a*(1-b^Damage_cm), dta,
>>>   start=list(y0=0, a=4000, b=.5))
>>>
>>> xval <- seq(0, 10, 0.1)
>>> lines(xval, predict(dta.nls, data.frame(Damage_cm=xval)))
>>> profile(dta.nls, alpha= .05)
>>>
>>>
>>> summary(dta.nls)
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>> #INPUT
>>>
>>> structure(list(Gossypol = c(948.2418407, 1180.171957, 3589.187889,
>>> 450.7205451, 349.0864019, 592.3403778, 723.885643, 2005.919344,
>>> 720.9785449, 1247.806111, 1079.846532, 1500.863038, 4198.569251,
>>> 3618.448997, 4140.242559, 1036.331811, 1013.807628, 2547.326207,
>>> 2508.417927, 2874.651764, 1120.955, 1782.864308, 1517.045807,
>>> 

Re: [R] How to calculate standard error of estimate (S) for my non-linear regression model?

2015-09-26 Thread Michael Eisenring
Dear Peter,
Thank you for your answer.
If I look at my summary, I see a residual standard error of 1394 on 53
degrees of freedom.
This number is very high (the fit of the curve is pretty bad I know but
still...). Are you sure the residual standard error given in the summary is
the same as the one described on this page:
http://onlinestatbook.com/2/regression/accuracy.html
I am basically just looking for a value that describes the goodness of fit
for my non-linear regression model.


This is probably a pretty obvious question, but I am not a statistician and
as you said the terminology is sometimes pretty confusing.
Thanks mike

-----Original Message-----
From: peter dalgaard [mailto:pda...@gmail.com]
Sent: Saturday, 26 September 2015 01:43
To: Michael Eisenring 
Cc: r-help@r-project.org
Subject: Re: [R] How to calculate standard error of estimate (S) for my
non-linear regression model?

This is one area in which terminology in (computational) statistics has gone
a bit crazy. The thing some call "standard error of estimate" is actually
the residual standard deviation in the regression model, not to be confused
with the standard errors that are associated with parameter estimates. In
summary(nls(...)) (and summary(lm()) for that matter), you'll find it as
"residual standard error",  and even that is a bit of a misnomer.

-pd 

> On 26 Sep 2015, at 07:08 , Michael Eisenring 
wrote:
> 
> Hi all,
> 
> I am looking for something that indicates the goodness of fit for my
> non-linear regression model (since R2 is not very reliable).
> 
> I read that the standard error of estimate (also known as standard 
> error of the regression) is a good alternative.
> 
> 
> 
> The standard error of estimate is described on this page (including the
> formula): http://onlinestatbook.com/2/regression/accuracy.html
> 
> Unfortunately, however, I have no clue how to program it in R. Does
> anyone know and could help me?
> 
> Thank you very much.
> 
> 
> 
> I added an example of my model and a dput() of my data
> 
> #CODE
> 
> dta<-read.csv("Regression_exp2.csv",header=T, sep = ",")
> attach(dta)  # tells R to do the following analyses on this dataset
> head(dta)
> 
> 
> 
> # load package for non-linear regression models
> library(nls2)
> 
> #Aim: fit equation to data: y ~ y0 + a*(1 - b^x): two-parameter exponential,
> # single rise to the maximum
> # y = Gossypol (from my data set), x = Damage_cm (from my data set)
> #The other 3 parameters are unknown: y0 = intercept, a = asymptote and b = slope
> 
> plot(Gossypol~Damage_cm, dta)
> # Looking at the plot, 0 is a plausible estimate for y0:
> # a+y0 is the asymptote, so estimate about 4000;
> # b is between 0 and 1, so estimate .5
> dta.nls <- nls(Gossypol~y0+a*(1-b^Damage_cm), dta,
>   start=list(y0=0, a=4000, b=.5))
> 
> xval <- seq(0, 10, 0.1)
> lines(xval, predict(dta.nls, data.frame(Damage_cm=xval))) 
> profile(dta.nls, alpha= .05)
> 
> 
> summary(dta.nls)
> 
> 
> 
> 
> 
> 
> 
> #INPUT
> 
> structure(list(Gossypol = c(948.2418407, 1180.171957, 3589.187889, 
> 450.7205451, 349.0864019, 592.3403778, 723.885643, 2005.919344, 
> 720.9785449, 1247.806111, 1079.846532, 1500.863038, 4198.569251, 
> 3618.448997, 4140.242559, 1036.331811, 1013.807628, 2547.326207, 
> 2508.417927, 2874.651764, 1120.955, 1782.864308, 1517.045807, 
> 2287.228752, 4171.427741, 3130.376482, 1504.491931, 6132.876396, 
> 3350.203452, 5113.942098, 1989.576826, 3470.09352, 4576.787021, 
> 4854.985845, 1414.161257, 2608.716056, 910.8879471, 2228.522959, 
> 2952.931863, 5909.068158, 1247.806111, 6982.035521, 2867.610671, 
> 5629.979049, 6039.995102, 3747.076592, 3743.331903, 4274.324792, 
> 3378.151945, 3736.144027, 5654.858696, 5972.926124, 3723.629772, 
> 3322.115942, 3575.043632, 2818.419785), Treatment = structure(c(5L, 
> 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 
> 1L, 1L, 1L, 1L, 1L, 3L, 3L, 3L, 3L, 2L, 2L, 2L, 3L, 3L, 3L, 3L, 3L, 
> 3L, 3L, 3L, 2L, 2L, 2L, 4L, 2L, 4L, 4L, 2L, 4L, 2L, 2L, 4L, 4L, 4L, 
> 4L, 4L, 4L, 2L), .Label = c("1c_2d", "1c_7d", "3c_2d", "9c_2d", "C"), 
> class = "factor"), Damage_cm = c(0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 
> 0.142, 0.4035, 0.4435, 0.491, 0.4955, 0.578, 0.5895, 0.6925, 0.6965, 
> 0.756, 0.8295, 1.0475, 1.313, 1.516, 1.573, 1.62, 1.8115, 1.8185, 
> 1.8595, 1.989, 2.129, 2.171, 2.3035, 2.411, 2.559, 2.966, 2.974, 
> 3.211, 3.2665, 3.474, 3.51, 3.547, 4.023, 4.409, 4.516, 4.7245, 4.809, 
> 4.9835, 5.568, 5.681, 5.683, 7.272, 8.043, 9.437, 9.7455), 
> Damage_groups = c(0.278, 1.616, 2.501, 3.401, 4.577, 5.644, 7.272, 
> 8.043, 9.591, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 
> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 
> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA), 
> Gossypol_Averaged = 

[R] Analysis of causal relations between rare (categorical) events

2015-09-26 Thread markshanks
Hi,

I have only a relatively basic background in statistics (e.g., anova,
regression), and the books on R I have read so far have focused on
relatively common statistical analyses (e.g., outlier analysis, trend
forecasting) and haven't helped me with the data problem I am facing.

In short, imagine that the data are date-stamped and we are interested in
predicting the occurrence of relatively rare categorical events from the
occurrence of other relatively rare categorical events. Both the outcomes
and predictors can be of a huge number of different types, although it is
possible to group them as well.

One way to go would seem to be to take the difference in time between each
set of predictors and outcomes and see whether there is more consistency for
some measures than for others. However, I haven't found a good book or
package that is aimed directly at this type of dataset, although I'm
guessing there must be some...
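To make the time-difference idea concrete, here is a hedged sketch in R with entirely made-up dates and event types ("predictorA", "predictorB", "outcome" are invented labels): for each outcome event it computes the lag since the most recent event of one predictor type.

```r
# Made-up event log: one row per date-stamped categorical event
set.seed(42)
events <- data.frame(
  date = as.Date("2015-01-01") + sort(sample(0:364, 30)),
  type = sample(c("predictorA", "predictorB", "outcome"), 30, replace = TRUE)
)

out.dates  <- events$date[events$type == "outcome"]
pred.dates <- events$date[events$type == "predictorA"]

# findInterval() returns, for each outcome, the index of the latest
# predictorA event on or before it (0 if none has happened yet)
idx <- findInterval(out.dates, pred.dates)
lag.days <- ifelse(idx > 0,
                   as.numeric(out.dates - pred.dates[pmax(idx, 1)]),
                   NA)
lag.days
```

Summaries such as sd(lag.days, na.rm = TRUE), computed per predictor type, would then quantify the "consistency" asked about above.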

Thanks,

Mark






__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] R studio and git

2015-09-26 Thread Glenn Schultz

Hello all,

Maybe this is not the right place. I have a project that is under git version
control in RStudio. I just updated to a recent OS X build and now the git tab in
RStudio is not available. It seems I can still use version control with
SourceTree, but it is strange that I lost git integration on the OS X update. Any ideas?

Glenn

Re: [R] How to get significance codes after Kruskal Wallis test

2015-09-26 Thread Michael Eisenring
Thank you very much Kristina,

Unfortunately that’s not what I am looking for.

I would just be very surprised if there were no way to get the
significance codes for Kruskal-Wallis (I would have thought that this is a
pretty common test).

 

I found another option called kruskal() which does pairwise comparison, but 
without significance codes.

Maybe another R-list member knows more.

 

Thank you,

Mike

 

From: Kristina Wolf [mailto:kmw...@ucdavis.edu]
Sent: Friday, 25 September 2015 23:26
To: Michael Eisenring 
Cc: r-help 
Subject: Re: [R] How to get significance codes after Kruskal Wallis test

 

Perhaps look into the function friedman.test.with.post.hoc()

There is more information here: 
http://www.r-statistics.com/wp-content/uploads/2010/02/Friedman-Test-with-Post-Hoc.r.txt

 

Note, this does not handle NA's though, and technically it is for blocked 
designs, but maybe it will lead you somewhere useful or could be adapted? 

 




~ Kristina

Kristina Wolf
Ph.D. Candidate, Graduate Group in Ecology
M.S. Soil Science, B.S. Animal Science
KristinaMWolf.com
Restoration Ecology Lab
Department of Plant Sciences
University of California, Davis
(530) 750-9771

 

"We have to remember that what we observe is not nature herself, but nature 
exposed to our method of questioning." ~ Werner Heisenberg

 

 

On Fri, Sep 25, 2015 at 10:01 PM, Michael Eisenring wrote:

Is there a way to get significance codes after pairwise comparisons for a
Kruskal-Wallis test? By significance codes I mean letter codes (a, b, c)
that are assigned to treatments to indicate where differences are
significant.

With a traditional ANOVA such a test can be performed using HSD.test from
the agricolae package, but for non-parametric counterparts of ANOVA I have
not been able to find anything.

Can anyone help me?

Thanks mike



I added two example codes.

First code  represents an ANOVA and a HSD.test() giving me significant codes

 #FIRST CODE USING ANOVA




library(agricolae)
an.dta<-aov(Gossypol~Treatment,data=dta)
summary(an.dta)

HSD.test(an.dta,"Treatment")
# The level by alpha default is 0.05.
outT<-HSD.test(an.dta,"Treatment", group=T)
outT

#I receive significant codes.





#SECOND CODE USING KRUSKAL WALLIs

library(agricolae)
an.dta2<-kruskal.test(Heliocide~Treatment,dta)
summary(an.dta2)

HSD.test(an.dta2,"Treatment")

#ERROR MESSAGE no significance codes, why??



#DATA FOR CODES


structure(list(Treatment = structure(c(1L, 3L, 4L, 2L, 1L, 3L,
4L, 2L, 5L, 1L, 3L, 2L, 5L, 1L, 3L, 4L, 2L, 5L, 1L, 3L, 4L, 2L,
5L, 1L, 3L, 4L, 2L, 5L, 1L, 3L, 4L, 2L, 5L, 1L, 3L, 4L, 2L, 5L,
1L, 3L, 4L, 2L, 5L, 1L, 3L, 4L, 2L, 5L, 1L, 3L, 4L, 2L, 5L, 1L,
3L, 5L), .Label = c("1_2d", "1_7d", "3_2d", "9_2d", "C"), class = "factor"),

Code = structure(c(1L, 2L, 3L, 4L, 18L, 19L, 20L, 21L, 22L,
23L, 24L, 25L, 26L, 27L, 28L, 29L, 30L, 31L, 32L, 33L, 34L,
35L, 36L, 37L, 38L, 39L, 40L, 41L, 42L, 43L, 44L, 45L, 46L,
47L, 48L, 49L, 50L, 51L, 52L, 53L, 54L, 55L, 56L, 5L, 6L,
7L, 8L, 9L, 10L, 11L, 12L, 13L, 14L, 15L, 16L, 17L), .Label =
c("1_2d_1c",
"1_2d_3c", "1_2d_9c", "1_7d_1c", "10_2d_1c", "10_2d_3c",
"10_2d_9c", "10_7d_1c", "10_C", "11_2d_1c", "11_2d_3c", "11_2d_9c",
"11_7d_1c", "11_C", "12_2d_1c", "12_2d_3c", "12_C", "2_2d_1c",
"2_2d_3c", "2_2d_9c", "2_7d_1c", "2_C", "3_2d_1c", "3_2d_3c",
"3_7d_1c", "3_C", "4_2d_1c", "4_2d_3c", "4_2d_9c", "4_7d_1c",
"4_C", "5_2d_1c", "5_2d_3c", "5_2d_9c", "5_7d_1c", "5_C",
"6_2d_1c", "6_2d_3c", "6_2d_9c", "6_7d_1c", "6_C", "7_2d_1c",
"7_2d_3c", "7_2d_9c", "7_7d_1c", "7_C", "8_2d_1c", "8_2d_3c",
"8_2d_9c", "8_7d_1c", "8_C", "9_2d_1c", "9_2d_3c", "9_2d_9c",
"9_7d_1c", "9_C"), class = "factor"), Glands = c(165, 289.333,
319.333, 472, 334.667, 259, 373.333, 525.667,
275.333, 230.667, 346.333, 377.667, 255.333,
217.667, 266, 300.333, 354.333, 225.333,
294, 359, 359, 222.667, 103, 246.667, 324.667,
277, 460, 163.667, 226.333, 228, 357.667, 505,
142.667, 324, 278.667, 317.333, 335.667,
193.667, 188, 255, 252, 393.333, 248.333, 353,
320.667, 228.333, 497, 165.667, 209.333,
162.333, 280, 337, 169.667, 231.667, 257.667,
218.667), Tannin = c(0.334252451, 1.376077586, 0.896849593,
0.888621795, 0.464285714, 0.830236486, 0.870881783, 0.768489583,
0.647727273, 0.81372549, 0.51380814, 0.859923246, 0.495265152,
0.699932796, 1.09375, 0.785037879, 0.892650463, 0.518963675,
1.05859375, 0.447916667, 1.269097222, 1.147522523, 0.391276042,
0.883400538, 1.523989899, 0.907930108, 0.749155405, 0.450126263,
0.562239583, 0.911151961, 0.6, 1.610677083, 0.446428571,
0.601151316, 1.073635057, 

Re: [R] How to calculate standard error of estimate (S) for my non-linear regression model?

2015-09-26 Thread peter dalgaard

> On 26 Sep 2015, at 16:46 , Michael Eisenring  wrote:
> 
> Dear Peter,
> Thank you for your answer.
> If I look at my summary I see there a Residual standard error: 1394 on 53
> degrees of freedom.
> This number is very high (the fit of the curve is pretty bad I know but
> still...). Are you sure the residual standard error given in the summary is
> the same as the one described on this page:
> http://onlinestatbook.com/2/regression/accuracy.html

Sure I'm sure (& I did check!)... But notice that unlike R^2, Residual SE is 
not dimensionless. Switch from millimeters to meters in your response measure 
and the Residual SE instantly turns into 1.394. It's basically saying that your 
model claims to be able to predict Y within +/-2800 units. How good or bad that 
is, is your judgment.
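A minimal sketch of the unit dependence described above (made-up data and an ordinary lm() rather than nls(), but the principle is the same):

```r
# The residual standard error is in the units of the response, so
# rescaling the response rescales the residual SE by the same factor.
set.seed(1)
d <- data.frame(x = 1:20)
d$y.mm <- 3 + 5 * d$x + rnorm(20, sd = 100)  # response "in millimeters"
d$y.m  <- d$y.mm / 1000                      # same response "in meters"

summary(lm(y.mm ~ x, data = d))$sigma  # residual SE in mm
summary(lm(y.m  ~ x, data = d))$sigma  # exactly 1/1000 of the above
```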

> I am basically just looking for a value that describes the goodness of fit
> for my non-linear regression model.
> 
> 
> This is probably a pretty obvious question, but I am not a statistician and
> as you said the terminology is sometimes pretty confusing.
> Thanks mike
> 
> -----Original Message-----
> From: peter dalgaard [mailto:pda...@gmail.com]
> Sent: Saturday, 26 September 2015 01:43
> To: Michael Eisenring 
> Cc: r-help@r-project.org
> Subject: Re: [R] How to calculate standard error of estimate (S) for my
> non-linear regression model?
> 
> This is one area in which terminology in (computational) statistics has gone
> a bit crazy. The thing some call "standard error of estimate" is actually
> the residual standard deviation in the regression model, not to be confused
> with the standard errors that are associated with parameter estimates. In
> summary(nls(...)) (and summary(lm()) for that matter), you'll find it as
> "residual standard error",  and even that is a bit of a misnomer.
> 
> -pd 
> 
>> On 26 Sep 2015, at 07:08 , Michael Eisenring 
> wrote:
>> 
>> Hi all,
>> 
>> I am looking for something that indicates the goodness of fit for my 
>> non linear regression model (since R2 is not very reliable).
>> 
>> I read that the standard error of estimate (also known as standard 
>> error of the regression) is a good alternative.
>> 
>> 
>> 
>> The standard error of estimate is described on this page (including 
>> the
>> formula) http://onlinestatbook.com/2/regression/accuracy.html
>> 
>> Unfortunately however, I have no clue how to programm it in R. Does 
>> anyone know and could help me?
>> 
>> Thank you very much.
>> 
>> 
>> 
>> I added an example of my model and a dput() of my data
>> 
>> #CODE
>> 
>> dta<-read.csv("Regression_exp2.csv",header=T, sep = ",")
>> attach(dta)  # tells R to do the following analyses on this dataset
>> head(dta)
>> 
>> 
>> 
>> # loading packages: analysis of mixed effect models 
>> library(nls2)#model
>> 
>> #Aim: fit equation to data: y~yo+a*(1-b^x) : Two parameter exp. single 
>> rise to the maximum # y =Gossypol (from my data set) x= Damage_cm 
>> (from my data set) #The other 3 parameters are unknown: yo=Intercept, 
>> a= assymptote ans b=slope
>> 
>> plot(Gossypol~Damage_cm, dta)
>> # Looking at the plot, 0 is a plausible estimate for y0:
>> # a+y0 is the asymptote, so estimate about 4000
>> # b is between 0 and 1, so estimate .5
>> dta.nls <- nls(Gossypol~y0+a*(1-b^Damage_cm), dta,
>>  start=list(y0=0, a=4000, b=.5))
>> 
>> xval <- seq(0, 10, 0.1)
>> lines(xval, predict(dta.nls, data.frame(Damage_cm=xval))) 
>> profile(dta.nls, alpha= .05)
>> 
>> 
>> summary(dta.nls)
>> 
>> 
>> 
>> 
>> 
>> 
>> 
>> #INPUT
>> 
>> structure(list(Gossypol = c(948.2418407, 1180.171957, 3589.187889, 
>> 450.7205451, 349.0864019, 592.3403778, 723.885643, 2005.919344, 
>> 720.9785449, 1247.806111, 1079.846532, 1500.863038, 4198.569251, 
>> 3618.448997, 4140.242559, 1036.331811, 1013.807628, 2547.326207, 
>> 2508.417927, 2874.651764, 1120.955, 1782.864308, 1517.045807, 
>> 2287.228752, 4171.427741, 3130.376482, 1504.491931, 6132.876396, 
>> 3350.203452, 5113.942098, 1989.576826, 3470.09352, 4576.787021, 
>> 4854.985845, 1414.161257, 2608.716056, 910.8879471, 2228.522959, 
>> 2952.931863, 5909.068158, 1247.806111, 6982.035521, 2867.610671, 
>> 5629.979049, 6039.995102, 3747.076592, 3743.331903, 4274.324792, 
>> 3378.151945, 3736.144027, 5654.858696, 5972.926124, 3723.629772, 
>> 3322.115942, 3575.043632, 2818.419785), Treatment = structure(c(5L, 
>> 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 
>> 1L, 1L, 1L, 1L, 1L, 3L, 3L, 3L, 3L, 2L, 2L, 2L, 3L, 3L, 3L, 3L, 3L, 
>> 3L, 3L, 3L, 2L, 2L, 2L, 4L, 2L, 4L, 4L, 2L, 4L, 2L, 2L, 4L, 4L, 4L, 
>> 4L, 4L, 4L, 2L), .Label = c("1c_2d", "1c_7d", "3c_2d", "9c_2d", "C"), 
>> class = "factor"), Damage_cm = c(0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 
>> 0.142, 0.4035, 0.4435, 0.491, 0.4955, 0.578, 0.5895, 0.6925, 0.6965, 
>> 

Re: [R] How to get significance codes after Kruskal Wallis test

2015-09-26 Thread David Winsemius

On Sep 26, 2015, at 6:48 AM, Michael Eisenring wrote:

> Thank you very much Kristina,
> 
> Unfortunately that’s not what I am looking for.
> 
> I am just very surprised if there would be no possibility to get the 
> significance codes for Kruskal Wallis (I would have suggested that this is a 
> pretty common test.)

Well, it is a modestly common test, but it's really a global test, not a
pairwise one.

> I found another option called kruskal() which does pairwise comparison, but 
> without significance codes.
> 
> Maybe another R-list member knows more.
> 

I'm not sure I "know more", and it's very possible I "know less". In particular 
I don't really know what the term "significance code" actually means. (I'm 
hoping it's not a request for "significance stars", a feature which is roundly 
deprecated by more knowledgeable R users.) 

However, looking at the agricolae::kruskal function's help page and submitting 
the data in the non-formula manner it expects, I get output very similar 
in form to the HSD.test output that it appeared you considered 
satisfactory:

(an.dta3<-kruskal(dta$Heliocide, dta$Treatment))

#
$statistics
 Chisq  p.chisq
  30.25246 4.348055e-06

$parameters
  Df ntr  t.value
   4   5 2.007584

$means
     dta$Heliocide       std  r        Min      Max
1_2d 1992.7707 1747.1879 12  334.53973 4929.372
1_7d 2368.8057 1187.9285 11  767.22881 4624.945
3_2d 2640.1286 2659.5800 12  615.91181 8559.142
9_2d 5338.6711 1579.4428 10 3328.89713 8014.897
C 397.9086  443.6019 11   75.73956 1588.431

$rankMeans
  dta$Treatment dta$Heliocide  r
1  1_2d 26.00 12
2  1_7d 32.045455 11
3  3_2d 29.83 12
4  9_2d 47.50 10
5 C  8.954545 11

$comparison
NULL

$groups
   trt means M
1 9_2d 47.50 a
2 1_7d 32.045455 b
3 3_2d 29.83 b
4 1_2d 26.00 b
5 C 8.954545 c

From context I am guessing that the "significance codes" you ask for are the 
items in the M column of the "groups" element of the list output.

-- 
David.

> 
> 
> Thank you,
> 
> Mike
> 
> 
> 
> From: Kristina Wolf [mailto:kmw...@ucdavis.edu]
> Sent: Friday, 25 September 2015 23:26
> To: Michael Eisenring 
> Cc: r-help 
> Subject: Re: [R] How to get significance codes after Kruskal Wallis test
> 
> 
> 
> Perhaps look into the function friedman.test.with.post.hoc()
> 
> There is more information here: 
> http://www.r-statistics.com/wp-content/uploads/2010/02/Friedman-Test-with-Post-Hoc.r.txt
> 
> 
> 
> Note, this does not handle NA's though, and technically it is for blocked 
> designs, but maybe it will lead you somewhere useful or could be adapted? 
> 
> 
> ~ Kristina
> 
> Kristina Wolf
> Ph.D. Candidate, Graduate Group in Ecology
> M.S. Soil Science
> 
> 
> 
On Fri, Sep 25, 2015 at 10:01 PM, Michael Eisenring wrote:
> 
> Is there a way to get significance codes after a pairwise comparisons to a
> Kruskall wallis test? With significance codes I mean letter codes (a, b,c)
> that are assigned to treatments to indicate where differences are
> significant.
> 
> With a traditional anova such a test can be performed using HSD.test from
> the agricolae library but for non parametric counterparts of anova I have
> not been able to find anything.
> 
> Can anyone help me?
> 
> Thanks mike
> 
> 
> 
> I added two example codes.
> 
> First code  represents an ANOVA and a HSD.test() giving me significant codes
> 
> #FIRST CODE USING ANOVA
> 
> library(agricolae)
> an.dta<-aov(Gossypol~Treatment,data=dta)
> summary(an.dta)
> 
> HSD.test(an.dta,"Treatment")
> # The level by alpha default is 0.05.
> outT<-HSD.test(an.dta,"Treatment", group=T)
> outT
> 
> #I receive significant codes.
> 
> 
> #SECOND CODE USING KRUSKAL WALLIs
> 
> library(agricolae)
> an.dta2<-kruskal.test(Heliocide~Treatment,dta)
> summary(an.dta2)
> 
> HSD.test(an.dta2,"Treatment")
> 
> #ERROR MESSAGE no significance codes, why??
> 
> 
> 
> #DATA FOR CODES
> 
> 
> structure(list(Treatment = structure(c(1L, 3L, 4L, 2L, 1L, 3L,
> 4L, 2L, 5L, 1L, 3L, 2L, 5L, 1L, 3L, 4L, 2L, 5L, 1L, 3L, 4L, 2L,
> 5L, 1L, 3L, 4L, 2L, 5L, 1L, 3L, 4L, 2L, 5L, 1L, 3L, 4L, 2L, 5L,
> 1L, 3L, 4L, 2L, 5L, 1L, 3L, 4L, 2L, 5L, 1L, 3L, 4L, 2L, 5L, 1L,
> 3L, 5L), .Label = c("1_2d", "1_7d", "3_2d", "9_2d", "C"), class = "factor"),
> 
>Code = structure(c(1L, 2L, 3L, 4L, 18L, 19L, 20L, 21L, 22L,
>23L, 24L, 25L, 26L, 27L, 28L, 29L, 30L, 31L, 32L, 33L, 34L,
>35L, 36L, 37L, 38L, 39L, 40L, 41L, 42L, 43L, 44L, 45L, 46L,
>47L, 48L, 49L, 50L, 51L, 52L, 53L, 54L, 55L, 56L, 5L, 6L,
>7L, 8L, 9L, 10L, 11L, 12L, 13L, 14L, 15L, 16L, 17L), .Label =
> c("1_2d_1c",
>"1_2d_3c", "1_2d_9c", "1_7d_1c", "10_2d_1c", "10_2d_3c",
>"10_2d_9c", "10_7d_1c", "10_C", "11_2d_1c", "11_2d_3c", "11_2d_9c",
>"11_7d_1c", "11_C", "12_2d_1c", "12_2d_3c", 

Re: [R] Analysis of causal relations between rare (categorical) events

2015-09-26 Thread giorgio.garzi...@tin.it
Also:

https://cran.r-project.org/web/views/Survival.html
https://www.youtube.com/watch?v=1Mt7EuVJf1A

http://gking.harvard.edu/files/gking/files/0s.pdf?m=1360039053
http://biostat.mc.vanderbilt.edu/wiki/pub/Main/RmS/rms.pdf

---
Giorgio Garziano



[R] Line in hist count

2015-09-26 Thread bgnumis bgnum
Hi all,

Some time ago I used to work with R; now I'm returning to study and work,
and searching through old files I see that I used this code:


gfhist<-hist(gf,plot=FALSE)

par(mar=c(6,0,6,6))

barplot(gfhist$counts,axes=FALSE, space=0,horiz=TRUE,col="lightgray")

grid()

title("Marginal Distribution Lagged",font=4)

The thing is, I would like to plot a bar (a horizontal, thin bar) placed at
the position of the last gf value, but on the barplot.

Do you think it is possible? gf is a matrix.
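A hedged sketch of one way to do it, with made-up gf data (the position arithmetic assumes space = 0, so bar i spans the interval [i-1, i] on the bar axis):

```r
# Made-up data: mark the histogram bin holding the last gf value
# with a thin line across the horizontal barplot.
set.seed(1)
gf <- matrix(rnorm(200), ncol = 2)
gfhist <- hist(gf, plot = FALSE)

par(mar = c(6, 0, 6, 6))
barplot(gfhist$counts, axes = FALSE, space = 0, horiz = TRUE,
        col = "lightgray")
grid()
title("Marginal Distribution Lagged", font = 4)

# With space = 0, bar i spans [i - 1, i] on the vertical axis, so map
# the last observation to the midpoint of its histogram bin.
last.val <- gf[length(gf)]
pos <- findInterval(last.val, gfhist$breaks)
abline(h = pos - 0.5, lwd = 3, col = "red")
```

Using segments() instead of abline() would restrict the marker to the width of that one bar rather than spanning the whole plot.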



[R] Rattle installation

2015-09-26 Thread Haida
I received the messages below during the installation of Rattle. Please help,
thank you.

Warning in install.packages :
  error 1 in extracting from zip file
Warning in install.packages :
  cannot open compressed file 'rJava/DESCRIPTION', probable reason 'No such
file or directory'
Error in install.packages : cannot open the connection




Re: [R] How to calculate standard error of estimate (S) for my non-linear regression model?

2015-09-26 Thread Jeff Newmiller
You may not be a statistician, but you should at least learn about the 
calculations you are making. You cannot expect to convince others that 
your calculations are right just because "Peter on the internet said they 
were right".


To give you a gentle push in this direction, I have reproduced the 
calculations on that reference web page using R, so you can get a head 
start on understanding how to perform them on your data. (Hint: about all 
you should need to do is give your dta.nls to predict and use your 
column names instead of X and Y.) Once you have convinced yourself that
the pre-defined functions are doing what you expect, then you can omit the 
do-it-yourself option with confidence in the results.


Please note that no use of attach is included here... that usually ends in 
unhappiness at some point, so prefer to use the data argument instead.


###
dta <- data.frame( X=c(1,2,3,4,5), Y=c(1,2,1.3,3.75,2.25) )
nrow( dta )
# linear regression
y.lm <- lm( Y~X, data=dta )
# compute predictions from original data
## Be aware that you can also "predict" using a nonlinear regression fit
## Also be aware that you can compute estimates using different data if you
## specify the "newdata" argument... see the help for predict.lm
## ?predict.lm
dta$Yprime <- predict( y.lm )
# ?with
dta$YlessYprime <- with( dta, Y - Yprime )
dta$YlessYprime2 <- with( dta, YlessYprime^2 )

# confirm sums
# ?sum
sum( dta$X )
sum( dta$Y )
sum( dta$Yprime )
sum( dta$YlessYprime )
sum( dta$YlessYprime2 )

# standard error of the estimate for population data
sigma.est <- sqrt( sum( dta$YlessYprime2 ) / nrow( dta ) )
sigma.est
# sd function assumes sample standard deviation, can correct the result if 
# you want

# ?sd
# ?all.equal
all.equal( sigma.est, sd( dta$YlessYprime ) * sqrt( ( nrow( dta ) - 1 ) / 
nrow( dta ) ) )


# alternate formulation
SSY <- sum( ( dta$Y - mean( dta$Y ) )^2 )
rho <- with( dta, cor( Y, X ) )
all.equal( sigma.est, sqrt( (1-rho^2)*SSY/nrow(dta) ) )

# when working with a sample...
s.est <- sqrt( sum( dta$YlessYprime2 ) / ( nrow( dta ) - 2 ) )
s.est



dta <- data.frame( X=c(1,2,3,4,5), Y=c(1,2,1.3,3.75,2.25) )
nrow( dta )

[1] 5

# linear regression
y.lm <- lm( Y~X, data=dta )
# compute predictions from original data
## Be aware that you can also "predict" using a nonlinear regression fit
## Also be aware that you can compute estimates using different data if you
## specify the "newdata" argument... see the help for predict.lm
## ?predict.lm
dta$Yprime <- predict( y.lm )
# ?with
dta$YlessYprime <- with( dta, Y - Yprime )
dta$YlessYprime2 <- with( dta, YlessYprime^2 )

# confirm sums
# ?sum
sum( dta$X )

[1] 15

sum( dta$Y )

[1] 10.3

sum( dta$Yprime )

[1] 10.3

sum( dta$YlessYprime )

[1] 2.220446e-16

sum( dta$YlessYprime2 )

[1] 2.79075


# standard error of the estimate for population data
sigma.est <- sqrt( sum( dta$YlessYprime2 ) / nrow( dta ) )
sigma.est

[1] 0.7470944
# sd function assumes sample standard deviation, can correct the result 
# if you want

# ?sd
# ?all.equal
all.equal( sigma.est, sd( dta$YlessYprime ) * sqrt( ( nrow( dta ) - 1 ) / nrow( dta ) ) )

[1] TRUE


# alternate formulation
SSY <- sum( ( dta$Y - mean( dta$Y ) )^2 )
rho <- with( dta, cor( Y, X ) )
all.equal( sigma.est, sqrt( (1-rho^2)*SSY/nrow(dta) ) )

[1] TRUE


# when working with a sample...
s.est <- sqrt( sum( dta$YlessYprime2 ) / ( nrow( dta ) - 2 ) )
s.est

[1] 0.9644947





On Sat, 26 Sep 2015, Michael Eisenring wrote:


Dear Peter,
Thank you for your answer.
If I look at my summary I see there a Residual standard error: 1394 on 53
degrees of freedom.
This number is very high (the fit of the curve is pretty bad I know but
still...). Are you sure the residual standard error given in the summary is
the same as the one described on this page:
http://onlinestatbook.com/2/regression/accuracy.html
I am basically just looking for a value that describes the goodness of fit
for my non-linear regression model.


This is probably a pretty obvious question, but I am not a statistician and
as you said the terminology is sometimes pretty confusing.
Thanks mike

-----Original Message-----
From: peter dalgaard [mailto:pda...@gmail.com]
Sent: Saturday, 26 September 2015 01:43
To: Michael Eisenring 
Cc: r-help@r-project.org
Subject: Re: [R] How to calculate standard error of estimate (S) for my
non-linear regression model?

This is one area in which terminology in (computational) statistics has gone
a bit crazy. The thing some call "standard error of estimate" is actually
the residual standard deviation in the regression model, not to be confused
with the standard errors that are associated with parameter estimates. In
summary(nls(...)) (and summary(lm()) for that matter), you'll find it as
"residual standard error",  and even that is a bit of a misnomer.

-pd


On 26 Sep 2015, at 07:08 , Michael Eisenring 

wrote:


Hi all,

I am looking for something that indicates 

Re: [R] Rattle installation

2015-09-26 Thread David Winsemius

On Sep 26, 2015, at 9:59 AM, Haida wrote:

> I received below messages during the installation Rattle. please help,
> thank you
> 
> Warning in install.packages :
>  error 1 in extracting from zip file
> Warning in install.packages :
>  cannot open compressed file 'rJava/DESCRIPTION', probable reason 'No such
> file or directory'
> Error in install.packages : cannot open the connection

Off-hand it would appear you have not yet installed rJava.
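A hedged sketch of one way forward (assuming the failed zip extraction means rJava never got installed, or its download was corrupted): install and load rJava on its own first, then retry rattle.

```r
# Install the missing dependency separately, confirm it loads,
# then retry the package that needed it
install.packages("rJava")
library(rJava)             # should load without error before continuing
install.packages("rattle")
library(rattle)
```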


> 
> 

You should read the Posting guide.


David Winsemius
Alameda, CA, USA
