Re: [R] Precision in R

2018-02-26 Thread Jeff Newmiller
On Sun, 25 Feb 2018, Iuri Gavronski wrote: Hi, Why does sum() on a 10-item vector produce a different value than its counterpart on a 2-item vector? I understand the problems related to the arithmetic precision in storing decimal numbers in binary format, but shouldn't the errors be equal

Re: [R] glm package - Negative binomial regression model - Error

2018-02-26 Thread Thierry Onkelinx
Dear Paula, There are probably missing observations in your data set. Read the na.action part of the glm help file. na.exclude is most likely what you are looking for. Best regards, ir. Thierry Onkelinx Statisticus / Statistician Vlaamse Overheid / Government of Flanders INSTITUUT VOOR
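
A minimal sketch of what the na.action setting changes, assuming a Poisson glm() on made-up data (the data frame `d` and its columns are hypothetical, and the negative binomial model from the thread is not reproduced here). With na.exclude, residuals are padded with NA so they line up with the original rows:

```r
# Hypothetical data: the last row has a missing predictor.
set.seed(1)
d <- data.frame(x = c(1:9, NA), y = rpois(10, 3))

# na.omit drops the incomplete row everywhere.
fit_omit <- glm(y ~ x, family = poisson, data = d, na.action = na.omit)

# na.exclude also drops it for fitting, but pads residuals and
# fitted values with NA so they match the rows of `d`.
fit_excl <- glm(y ~ x, family = poisson, data = d, na.action = na.exclude)

length(residuals(fit_omit))
length(residuals(fit_excl))
```

The padded form is convenient when residuals have to be attached back to the original data frame.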

Re: [R] Precision in R

2018-02-26 Thread Thierry Onkelinx
This is described in R FAQ 7.31.
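
The issue behind FAQ 7.31 can be seen in a few lines of base R. This is an illustrative sketch, not part of the original reply: exact comparison of doubles can fail where a tolerance-based comparison succeeds.

```r
a <- sqrt(2)

# Exact comparison: a * a is not exactly 2 in double precision.
exact <- a * a == 2

# Comparison up to a small numerical tolerance.
close <- isTRUE(all.equal(a * a, 2))

exact  # FALSE
close  # TRUE
```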

[R] Gam with mrf smoother (mgcv)

2018-02-26 Thread Giulia Carella
Hello, I want to use gam from the mgcv package with an mrf smoother. This is my data set (`x`):

              y id
    1 0.6684496  1
    2 0.6684496  2
    3 0.6684496  3
    4 0.6684496  4
    5 0.6684496  5
    6 0.6684496  6
    7 0.6684496  7
    8 0.5879492  8
    9 0.5879492  9

[R] How to model repeated measures negative binomial data with GEE or GLMM

2018-02-26 Thread B Hansen
Goal: use GEE or GLMM to analyze repeated measures data in R
GEE problem: can’t find a way to do GEE with negative binomial family in R
GLMM problem: not sure if I’m specifying random effect correctly
Study question: Does the interaction of director and recipient group affect rates of a

Re: [R] Parallel assignments and goto

2018-02-26 Thread Thomas Mailund
Following up on this attempt of implementing the tail-recursion optimisation — now that I’ve finally had the chance to look at it again — I find that non-local return implemented with callCC doesn’t actually incur much overhead once I do it more sensibly. I haven’t found a good way to handle
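
For readers unfamiliar with callCC(), here is a minimal sketch of a non-local return in base R. It is not the tail-recursion code discussed in the thread; `first_match` and its arguments are hypothetical names chosen for illustration.

```r
# Return the first element of xs that satisfies pred, or NULL.
first_match <- function(xs, pred) {
  callCC(function(exit) {
    for (x in xs) {
      if (pred(x)) exit(x)  # leave callCC() immediately with this value
    }
    NULL                    # reached only if nothing matched
  })
}

first_match(1:10, function(x) x %% 4 == 0)
```

Calling `exit(x)` abandons the loop and makes `x` the value of the whole callCC() call.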

Re: [R] Random Seed Location

2018-02-26 Thread Jeff Newmiller
I am willing to go out on that limb and say the answer to the OP question is yes, the RN sequence in R should be reproducible. I agree though that it doesn't look like he is actually taking care not to run code that would disturb the generator.

[R] Random Seed Location

2018-02-26 Thread Gary Black
Hi all, For some odd reason when running naïve bayes, k-NN, etc., I get slightly different results (e.g., error rates, classification probabilities) from run to run even though I am using the same random seed. Nothing else (input-wise) is changing, but my results are somewhat different from run

Re: [R] Random Seed Location

2018-02-26 Thread William Dunlap via R-help
If your computations involve the parallel package then set.seed(seed) may not produce repeatable results. E.g.,

    > cl <- parallel::makeCluster(3) # Create cluster with 3 nodes on local host
    > set.seed(100); runif(2)
    [1] 0.3077661 0.2576725
    > parallel::parSapply(cl, 101:103, function(i)runif(2,
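
One way to get repeatable random numbers on the workers is parallel::clusterSetRNGStream(). The following is a minimal sketch, assuming a two-node local cluster; the seed and worker count are arbitrary, and the code is not taken from the truncated message above.

```r
library(parallel)

cl <- makeCluster(2)  # two worker processes on the local host

# Seed the workers' RNG streams, then draw on the workers.
clusterSetRNGStream(cl, iseed = 100)
a <- parSapply(cl, 1:2, function(i) runif(2))

# Re-seed with the same value and repeat the same call.
clusterSetRNGStream(cl, iseed = 100)
b <- parSapply(cl, 1:2, function(i) runif(2))

stopCluster(cl)

identical(a, b)
```

Calling set.seed() in the master session alone does not seed the worker processes, which is why the workers need to be seeded explicitly.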

Re: [R] Random Seed Location

2018-02-26 Thread Bert Gunter
In case you don't get an answer from someone more knowledgeable:
1. I don't know.
2. But it is possible that other packages that are loaded after set.seed() fool with the RNG.
3. So I would call set.seed just before you invoke each random number generation to be safe.
Cheers, Bert Bert
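
A small base R sketch of why the position of set.seed() matters (illustrative only; the seed value is arbitrary). Any RNG use between set.seed() and the call of interest shifts the sequence:

```r
# Seeding immediately before the draw gives identical results.
set.seed(123); a <- runif(3)
set.seed(123); b <- runif(3)
identical(a, b)    # TRUE

# An intervening draw consumes random numbers, so the later
# draw no longer matches.
set.seed(123); invisible(runif(1)); shifted <- runif(3)
identical(a, shifted)  # FALSE
```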

Re: [R] alternative for multiple if_else statements

2018-02-26 Thread S Ellison
That many ifelse statements is obviously rather a pain. Would you not have got what you want with ... paste("survey", year, sep="_") ? If that is not what you're looking for (e.g., because 'year' is the observation year and not the study start year), perhaps something that picks the minimum
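
The paste() suggestion is vectorised, so a single call labels every row. A minimal sketch, assuming a numeric `year` vector (the values are made up):

```r
year <- c(2014, 2015, 2016)

# One label per element, with no chain of ifelse() calls.
labels <- paste("survey", year, sep = "_")
labels
# "survey_2014" "survey_2015" "survey_2016"
```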

[R] questions about performing Robust multiple regression using bootstrap

2018-02-26 Thread faiz rasool
Dear list, I am slightly confused about how I can do the following in R. I want to perform robust multiple regression. I’ve used the Boot function in the car package to find confidence intervals and standard errors. In addition to these, I want to find the robust estimates for the F test and

[R] R 3.4.4 scheduled for March 15

2018-02-26 Thread Peter Dalgaard via R-help
Full schedule available on developer.r-project.org (pending auto-update from SVN) -- Peter Dalgaard, Professor, Center for Statistics, Copenhagen Business School Solbjerg Plads 3, 2000 Frederiksberg, Denmark Phone: (+45)38153501 Office: A 4.23 Email: pd@cbs.dk Priv: pda...@gmail.com

Re: [R] glm package - Negative binomial regression model - Error

2018-02-26 Thread Paula Couto
Thank you so much, Thierry!! I will try that now and see if that solves the issue Bests, Paula On Feb 26, 2018 03:02, "Thierry Onkelinx" wrote: Dear Paula, There are probably missing observations in your data set. Read the na.action part of the glm help file.

Re: [R] questions about performing Robust multiple regression using bootstrap

2018-02-26 Thread Bert Gunter
Although this is superficially a question about R code, it heavily depends on exactly what you mean by "robust" and "robust tests," which are statistical issues, not R coding issues. As such, it is off topic here. So I would suggest that you post on a statistical site like stats.stackexchange.com

Re: [R] questions about performing Robust multiple regression using bootstrap

2018-02-26 Thread Fox, John
Dear Faiz, Bootstrapping R^2 using Boot() is straightforward: Simply write a function that returns R^2, possibly in a vector with the regression coefficients, and use it as the f argument to Boot(). That will get you, e.g., bootstrapped confidence intervals for R^2. (Why you want that is
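
The approach described above might look roughly like this. It is a sketch under assumptions: the car package is installed, and the model on the built-in mtcars data, the helper name `r2_and_coefs`, the seed, and the number of replicates are all made up for illustration. The percentile interval is computed by hand from the bootstrap replicates.

```r
library(car)

set.seed(2018)
fit <- lm(mpg ~ wt + hp, data = mtcars)

# Statistic function: regression coefficients plus R^2 in one vector.
r2_and_coefs <- function(m) c(coef(m), r.squared = summary(m)$r.squared)

# Case bootstrap (Boot()'s default method) with a small number of replicates.
b <- Boot(fit, f = r2_and_coefs, R = 199)

# Percentile interval for R^2, taken from the last column of the replicates.
r2_col <- length(b$t0)
quantile(b$t[, r2_col], c(0.025, 0.975), na.rm = TRUE)
```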

Re: [R] Precision in R

2018-02-26 Thread William Dunlap via R-help
In the R expression x[1] + x[2] the result must be stored as a double precision number, because that is what R "numerics" are. sum() does not have to keep its intermediate results as doubles, but can use quad precision or Kahan's summation algorithm (both methods involve more than a simple
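
A short sketch for comparing sum() with explicit double-precision accumulation (illustrative; the vector is arbitrary, and whether the two results are bit-for-bit equal depends on the platform, so no output is claimed for the exact comparison):

```r
x <- rep(0.1, 10)

# Built-in sum(): may accumulate in extended precision internally.
s_builtin <- sum(x)

# Explicit loop: every intermediate result is rounded to a double.
s_loop <- 0
for (xi in x) s_loop <- s_loop + xi

s_builtin == s_loop                   # platform dependent
isTRUE(all.equal(s_builtin, s_loop))  # equal up to tolerance
print(c(s_builtin, s_loop), digits = 17)
```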

[R] [R-pkgs] SympluR - Analyze Healthcare Social Media Data from the Symplur API

2018-02-26 Thread Audun Utengen
Hi all, Just launched a new R package - SympluR! It allows you to analyze data from the Healthcare Social Graph via access to the Symplur API. - The Healthcare Social Graph contains billions of healthcare social media data points. Hundreds of published journal articles have leveraged data

Re: [R] alternative for multiple if_else statements

2018-02-26 Thread Kevin Wamae
Dear Ellison, thank you for the feedback, we replaced dplyr::if_else with dplyr::case_when and it seems to do the trick. Still, we have to write several statements to match all the respective years but it's working. Let me see how we can implement your suggestion. Regards --
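
For reference, a dplyr::case_when() version of such a mapping might look like the sketch below. The years and labels are hypothetical, not the poster's actual data, and the dplyr package is assumed to be installed.

```r
library(dplyr)

year <- c(2014, 2015, 2016, 2019)

# Conditions are checked in order; the final TRUE acts as a catch-all.
survey <- case_when(
  year == 2014 ~ "survey_2014",
  year == 2015 ~ "survey_2015",
  year == 2016 ~ "survey_2016",
  TRUE         ~ NA_character_
)
survey
```

Each year still needs its own condition, which matches the poster's remark that several statements remain necessary.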