Hi,
I have a large binary-coded data set with no covariates: positions along a
genomic sequence, where 0 means the position matches the reference sequence
and 1 means it has changed. There are 99 positions and 2000 sequences to
analyze. I want to run a univariate analysis to identify the positions where
most of the changes occur across all sequences, then run a multivariate
analysis to assess the significance of these positions as a whole. I would
like to do this in R but don't know how. Please help.
e.g.
1 2 3 4 5 6 7 8 (positions)
0 0 0 0 1 0 0 0 (sequence1)
1 0 0 0 0 1 0 0 (sequence2)
1 0 0 0 1 0 0 0 (sequence3)
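
To make the layout concrete, here is a small sketch of how the example above could be held in R as a 0/1 matrix, with rows as sequences and columns as positions. The object names (`seqs`, `n_changed`, `prop_changed`) are placeholders chosen for illustration, and the per-position counts are only a descriptive first look at where changes occur, not the univariate or multivariate test itself.

```r
# Toy version of the example: rows are sequences, columns are positions.
# `seqs` is a placeholder name; the real data would be 2000 rows by 99 columns.
seqs <- rbind(
  sequence1 = c(0, 0, 0, 0, 1, 0, 0, 0),
  sequence2 = c(1, 0, 0, 0, 0, 1, 0, 0),
  sequence3 = c(1, 0, 0, 0, 1, 0, 0, 0)
)
colnames(seqs) <- 1:8

# Number and proportion of sequences that show a change at each position
n_changed <- colSums(seqs)
prop_changed <- colMeans(seqs)

# Positions ordered from most to least frequently changed
sort(prop_changed, decreasing = TRUE)
```

With the real data, the same `colSums()` and `colMeans()` calls would give one count and one proportion for each of the 99 positions.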
Thanks in advance...
______________________________________________
[email protected] mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.