On Mon, 8 Aug 2016, Ellis, Alicia M wrote:
I have a large dataset with ~500,000 columns and 1264 rows. Each column
represents the percent methylation at a given location in the genome.
I need to run 500,000 linear models for each of 4 predictors of interest
in the form of:
Don't run 500K separate models. Use the limma package to fit one model that
can learn the variance parameters jointly. Run it on your laptop. And don't
use %methylation as your Y variable, use logit(percent), i.e. the Beta
value.
-Aaron
On Mon, Aug 8, 2016 at 2:49 PM, Ellis, Alicia M
I have a large dataset with ~500,000 columns and 1264 rows. Each column
represents the percent methylation at a given location in the genome. I need
to run 500,000 linear models for each of 4 predictors of interest in the form
of:
Methylation.stie1 ~ predictor1 + covariate1+ covariate2 + ...
3 matches
Mail list logo