On Jun 13, 2013, at 2:21 PM, Bert Gunter wrote:

> Lorenzo:
>
> 1. This is a statistics question, not an R question.
>
> 2. Your statistical background appears inadequate -- it looks like
> Poisson regression, which would fall under "generalized linear
> models". But it depends on how "discrete" discrete is (on some level,
> all measurements are discrete, discretized to the resolution of the
> measurement process).
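
To make point 2 concrete, here is a minimal sketch of a Poisson regression next to an ordinary lm() fit. It assumes simulated counts and made-up column names (x1, x2, y), not the poster's testData.csv, so treat it as an illustration only. It also shows what I believe is the source of the list() complaint in the snippet quoted further down: a data frame placed on the right-hand side of a formula.

```r
# Simulated stand-in for the poster's data (hypothetical column names).
set.seed(1)
n <- 200
d <- data.frame(x1 = rpois(n, lambda = 3),
                x2 = rpois(n, lambda = 5))
# The response is a count whose log-mean is linear in the predictors.
d$y <- rpois(n, lambda = exp(0.2 + 0.15 * d$x1 - 0.05 * d$x2))

# Keep the predictors inside the data frame and pass it via `data=`;
# `y ~ .` means "y against every other column of d".
fit_lm  <- lm(y ~ ., data = d)                     # ordinary least squares
fit_glm <- glm(y ~ ., data = d, family = poisson)  # Poisson regression, log link

print(summary(fit_glm))

# A data frame used directly as a formula term is a list, which
# model.frame() rejects; try() captures the error instead of stopping.
z <- d[, c("x1", "x2")]
res <- try(lm(d$y ~ z), silent = TRUE)
print(inherits(res, "try-error"))
```

Whether a Poisson model suits the real data is a separate question; overdispersed or zero-heavy counts may call for a different family.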

There is an excellent R vignette on handling count data by Achim Zeileis, Christian Kleiber, and Simon Jackman. It is easy to find with a Google search. There is also a somewhat older but possibly useful resource: a set of worked S/R examples by Laura Thompson to accompany Agresti's text on categorical data. It is also easy to find on Google.

-- David.

>
> 3. So I would advise seeking local statistical help. Getting
> statistical advice remotely over the internet (even on a proper forum
> for statistical advice, which this is not) is fraught with hazard and
> the risk of bad science (not due to incompetence or maliciousness;
> just due to the possibilities of misunderstanding and confusion) --
> imho only, of course.
>
> Of course, feel free to reject this and proceed at your own risk.
>
> Cheers,
> Bert
>
> On Thu, Jun 13, 2013 at 1:49 PM, Lorenzo Isella
> <lorenzo.ise...@gmail.com> wrote:
>> Dear All,
>> I am struggling with a linear model and an allegedly trivial data set.
>> The data set does not consist of categorical variables, but rather of
>> numerical discrete variables (essentially, they count the number of times
>> that something happened).
>> Can I still use a standard linear regression, i.e. something like lm(y~x)?
>> I attach a small snippet that illustrates the difficulties that I am
>> experiencing (I do not understand why R complains about a list()).
>> Any suggestion is appreciated.
>> The data file can be downloaded from
>>
>> http://db.tt/hEKv1wH2
>>
>> Cheers
>>
>> Lorenzo
>>
>> #####################################
>>
>> data <- read.csv("testData.csv", header=TRUE)
>>
>> data <- subset(data,select= -c (X100, X182))
>>
>> y <- data$X358
>>
>> z <- subset(data, select=-c(X358))
>>
>> myLM <- lm(y~z)
>>
>> #####################
>>
>> ______________________________________________
>> R-help@r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>
> --
>
> Bert Gunter
> Genentech Nonclinical Biostatistics
>
> Internal Contact Info:
> Phone: 467-7374
> Website:
> http://pharmadevelopment.roche.com/index/pdb/pdb-functional-groups/pdb-biostatistics/pdb-ncb-home.htm

David Winsemius
Alameda, CA, USA