Re: [R] Data cleaning & Data preparation, what do R users want?

2017-11-30 Thread Dominik Schneider
I would agree that getting data into R from various sources is the biggest pain point. Even if there is an API, the results are not always consistent and you have to do lots of dimension checking to get it right. Or there isn't an open API at all and you have to hack it by web scraping or

Re: [R] How to create separate legend for each plot in the function of facet_wrap in ggplot2?

2017-11-10 Thread Dominik Schneider
That's not the point of facet_wrap so check out the cowplot package for combining multiple ggplot objects (with legends) into one figure. On Fri, Nov 10, 2017 at 10:21 AM, Marna Wagley wrote: > Hi R users, > I need to create more than 20 figures (one for each group) in
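
A minimal sketch of the cowplot approach mentioned above, assuming ggplot2 and cowplot are installed. The data frame `dat` and its columns are made up for illustration; the original poster's data is not shown in this excerpt.

```r
library(ggplot2)
library(cowplot)

# Hypothetical data: three groups, each with its own set of categories
dat <- data.frame(
  group    = rep(c("A", "B", "C"), each = 4),
  x        = rep(1:4, times = 3),
  y        = c(1, 3, 2, 5, 2, 2, 4, 6, 3, 1, 5, 4),
  category = c("a1", "a2", "a1", "a2",
               "b1", "b2", "b1", "b2",
               "c1", "c2", "c1", "c2")
)

# Build one complete ggplot per group; each plot keeps its own legend
plots <- unname(lapply(split(dat, dat$group), function(d) {
  ggplot(d, aes(x = x, y = y, color = category)) +
    geom_point() +
    ggtitle(unique(d$group))
}))

# Arrange the independent plots (legends included) into one figure
combined <- plot_grid(plotlist = plots, ncol = 2)
```

Because each panel is a separate ggplot object, the legends are independent, which is what facet_wrap by design does not provide.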

Re: [R] h5r package: cannot find hdf5

2016-12-18 Thread Dominik Schneider
Sorry I'm not sure what that error means. Dominik On Sun, Dec 18, 2016 at 17:21 David Winsemius wrote: > > > > On Dec 17, 2016, at 5:09 PM, Pedro Montenegro wrote: > > > > > > I'm new to R and Linux, and I have an issue I didn't see solved on the > > >

Re: [R] h5r package: cannot find hdf5

2016-12-18 Thread Dominik Schneider
Pedro, I've only worked with netcdf4 but I imagine your issue is similar to ones I've had. I think you can: 1. add your hdf5 lib directory to the LD_LIBRARY_PATH http://grokbase.com/t/r/r-help/10at4wcjfq/r-ncdf4-package-installation-in-r 2. you can specify the direct path to lib and include

Re: [R] Question about using ggplot

2016-11-13 Thread Dominik Schneider
Past versions of ggplot2 used to accept family as an argument directly, but the latest ggplot2 (perhaps starting with v2?) requires method.args=list(). So the online sources you found using family directly were for an older version of ggplot2. On Sun, Nov 13, 2016 at 8:18 AM,
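
For illustration, a short sketch of the `method.args` form, assuming the question was about a GLM smoother in `geom_smooth()` (the excerpt does not show the original call). The simulated data and the choice of `family = "binomial"` are assumptions.

```r
library(ggplot2)

# Simulated binary outcome, purely for illustration
set.seed(1)
dat <- data.frame(x = rnorm(100))
dat$y <- rbinom(100, size = 1, prob = plogis(dat$x))

# Arguments for the fitting function go inside method.args,
# not directly in geom_smooth()
p <- ggplot(dat, aes(x = x, y = y)) +
  geom_point() +
  geom_smooth(method = "glm", method.args = list(family = "binomial"))
```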

Re: [R] Output formatting in PDF

2016-10-11 Thread Dominik Schneider
You may be able to do everything you need with the cowplot package. On Tue, Oct 11, 2016 at 4:26 AM, Preetam Pal wrote: > Hey Enrico, > LaTex is not possible actually. > > On Tue, Oct 11, 2016 at 2:29 PM, Enrico Schumann > wrote: > > > On Tue, 11

Re: [R] To submit R jobs via SLURM

2016-10-03 Thread Dominik Schneider
I typically call Rscript inside an sbatch file.

*batch_r.sh*

    #!/bin/bash
    cd /home//aso_regression_project  # make sure you change to your correct working directory
    Rscript scripts/run_splitsample-modeling.R

and on the command line of the login node:

    sbatch batch_r.sh

There are lots of slurm

Re: [R] Faster Subsetting

2016-09-28 Thread Dominik Schneider
I regularly crunch through this amount of data with tidyverse. You can also try the data.table package. They are optimized for speed, as long as you have the memory. Dominik On Wed, Sep 28, 2016 at 10:09 AM, Doran, Harold wrote: > I have an extremely large data frame (~13
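
As a rough sketch of what the data.table suggestion could look like, assuming the task is repeatedly pulling the rows for one ID out of a large table. The table `dt`, its columns, and the ID value are invented here; the original data frame is not shown in this excerpt.

```r
library(data.table)

# Hypothetical stand-in for a large table with a repeated ID column
set.seed(42)
dt <- data.table(
  id    = sample(1:1000, size = 1e5, replace = TRUE),
  value = rnorm(1e5)
)

# Sorting by a key lets data.table use binary search for subsetting
setkey(dt, id)

# Keyed subset: all rows with id == 42; nomatch = NULL drops non-matches
sub <- dt[.(42L), nomatch = NULL]
```

The keyed lookup avoids scanning the whole column for each ID, which is where most of the time goes when the same subset operation is repeated many times.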

Re: [R] ggplot2: geom_segment does not produce the color I desire?

2016-09-17 Thread Dominik Schneider
ggplot will assign, or map if you will, the color based on the default color scale when color is specified with the mapping argument such as mapping = aes(color=...). You have two options: 1. if you want the color of your arrow to be based on a column in your data, then manually scale the color
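
A small sketch of the distinction, with made-up segment data. The first plot maps color to a column and sets the scale manually, as in option 1; the second sets a constant color outside `aes()`, which is the usual alternative and is assumed here because the excerpt is cut off before option 2.

```r
library(ggplot2)

# Hypothetical segments; `kind` is an invented grouping column
segs <- data.frame(
  x = c(0, 1), y = c(0, 1),
  xend = c(1, 2), yend = c(1, 0),
  kind = c("up", "down")
)

# Color mapped from data: choose the colors with a manual scale
p_mapped <- ggplot(segs) +
  geom_segment(aes(x = x, y = y, xend = xend, yend = yend, color = kind),
               arrow = arrow()) +
  scale_color_manual(values = c(up = "blue", down = "red"))

# Constant color: set it outside aes() so no scale is involved
p_fixed <- ggplot(segs) +
  geom_segment(aes(x = x, y = y, xend = xend, yend = yend),
               color = "blue", arrow = arrow())
```

Writing `aes(color = "blue")` instead would treat "blue" as a data value and hand it to the default color scale, which is the behavior described above.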

Re: [R] glmnet vignette question

2016-09-17 Thread Dominik Schneider
> Is there a way to extract MSE for a lambda, e.g. lambda.1se? Never mind this specific question; it's now obvious. However, my overall question stands. On Fri, Sep 16, 2016 at 10:10 AM, Dominik Schneider < dominik.schnei...@colorado.edu> wrote: > I'm doing some linear modelin
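
For readers with the same question, one way to read the cross-validated MSE at lambda.1se off a `cv.glmnet` fit is sketched below. This is not necessarily what the poster had in mind; the simulated `x` and `y` (with p = 15 as in the original question) and the ridge setting `alpha = 0` are assumptions.

```r
library(glmnet)

# Simulated data with N >> p, p = 15
set.seed(1)
n <- 200
p <- 15
x <- matrix(rnorm(n * p), nrow = n, ncol = p)
y <- as.numeric(x %*% rnorm(p) + rnorm(n))

# Ridge regression (alpha = 0), cross-validated on mean squared error
cvfit <- cv.glmnet(x, y, alpha = 0, type.measure = "mse")

# cvm holds the mean CV error for each value in cvfit$lambda;
# lambda.1se is one of those values, so it can be matched directly
mse_1se <- cvfit$cvm[cvfit$lambda == cvfit$lambda.1se]
mse_min <- min(cvfit$cvm)
```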

[R] glmnet vignette question

2016-09-17 Thread Dominik Schneider
I'm doing some linear modeling and am new to the ridge/lasso/elasticnet procedures. In my case I have N>>p (p=15 based on variables used in past literature and some physical reasoning) so my understanding is that I should be interested in ridge regression to avoid the issue of multicollinearity of

Re: [R] physical constraint with gam

2016-05-16 Thread Dominik Schneider
Thanks for the clarification! On Sat, May 14, 2016 at 1:24 AM, Simon Wood <simon.w...@bath.edu> wrote: > On 12/05/16 02:29, Dominik Schneider wrote: > > Hi again, > I'm looking for some clarification on 2 things. > 1. On that last note, I realize that s(x1,x2) would

Re: [R] physical constraint with gam

2016-05-11 Thread Dominik Schneider
y relate to then you can do, by setting the >> model up as a varying coefficient model, using the `by' argument to 's'... >> >> gam(snowdepth~s(fsca,by=fsca),data=dat) >> >> >> this model is `snowdepth_i = f(fsca_i) * fsca_i + e_i' . s(fsca,by=fsca) >> is not conf

Re: [R] physical constraint with gam

2016-05-11 Thread Dominik Schneider
sca_i) * fsca_i + e_i' . s(fsca,by=fsca) > is not confounded with the intercept, so no constraint is needed or > applied, and you can now interpret the smooth like a local GLM coefficient. > > best, > Simon > > > > > On 11/05/16 01:30, Dominik Schneider wrote: > >> Hi,
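
A minimal sketch of fitting the varying coefficient model quoted above with mgcv, on simulated data. The variable names `snowdepth` and `fsca` follow the thread; the simulated relationship is invented. Note that an R formula includes an intercept unless `-1` is added, and the excerpt does not settle whether it should be dropped.

```r
library(mgcv)

# Simulated data: snow depth as a smooth coefficient times fsca
set.seed(1)
dat <- data.frame(fsca = runif(200))
dat$snowdepth <- (1 + sin(2 * pi * dat$fsca)) * dat$fsca +
  rnorm(200, sd = 0.1)

# Varying coefficient model: the smooth s(fsca) is multiplied by fsca
fit <- gam(snowdepth ~ s(fsca, by = fsca), data = dat)

# Fitted values over a grid, e.g. for plotting against the observations
grid <- data.frame(fsca = seq(0, 1, length.out = 50))
grid$pred <- as.numeric(predict(fit, newdata = grid))
```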

[R] physical constraint with gam

2016-05-11 Thread Dominik Schneider
Hi, Just getting into using GAM using the mgcv package. I've generated some models and extracted the splines for each of the variables and started visualizing them. I'm noticing that one of my variables is physically unrealistic. In the example below, my interpretation of the following plot is