Re: [R] data manipulation docs

2006-05-05 Thread Frank E Harrell Jr
Federico Calboli wrote:
 Hi All,
 
 Is there some document/manual about data manipulation within R that I 
 could use as a reference (obviously, aside from the R manuals)?
 
 The reason I am asking is that I have a number of data frames/matrices 
 containing genetic data. The data is in character form, as in:
 
    V1 V2 V3 V4 V5
  1 AA AG AA GG AG
  2 AC AA AA GG AG
  3 AA AG AA GG AG
  4 AA AA AA GG AG
  5 AA AA AA GG AA
 
 I need to chop, subset, and variously manipulate this kind of data, 
 sometimes keeping the data in its character format, sometimes converting 
 it to numeric form (i.e. substituting each data point with the equivalent 
 factor value). Since the data is often quite big, I have to keep things 
 memory efficient.
 
 This whole game is getting exceedingly time consuming and frustrating, 
 because I end up with random pieces of code that I save, each patching a 
 particular problem but difficult to 'abstract' for a new task, so 
 I get back close to square one annoyingly often.
 
 Cheers,
 
 Federico Calboli
 
 

There is a large data manipulation section in the Alzola and Harrell 
document available on CRAN under contributed docs, or a slightly more 
up-to-date version at biostat.mc.vanderbilt.edu.
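
For the specific conversion described in the question (character genotypes 
to factor codes and back), a minimal sketch in base R follows. It is not 
taken from the Alzola and Harrell document; the object names geno, lev, 
codes and back are hypothetical, and the data simply reproduce the five 
example rows quoted above. Storing integer codes in a matrix is typically 
more compact than a character data frame, though that is an assumption 
worth checking with object.size() on real data.

```r
# Hypothetical data frame reproducing the example rows from the question
geno <- data.frame(
  V1 = c("AA", "AC", "AA", "AA", "AA"),
  V2 = c("AG", "AA", "AG", "AA", "AA"),
  V3 = c("AA", "AA", "AA", "AA", "AA"),
  V4 = c("GG", "GG", "GG", "GG", "GG"),
  V5 = c("AG", "AG", "AG", "AG", "AA"),
  stringsAsFactors = FALSE
)

# One shared set of levels, so a genotype gets the same code in every column
lev <- sort(unique(unlist(geno)))

# Convert each column to integer factor codes; the result is an integer matrix
codes <- sapply(geno, function(x) as.integer(factor(x, levels = lev)))

# Map the codes back to character form, keeping the matrix shape and names
back <- matrix(lev[codes], nrow = nrow(codes), dimnames = dimnames(codes))

# Subset rows by genotype and columns by name, staying in character form
sub <- geno[geno$V1 == "AA", c("V1", "V2")]
```

Keeping lev alongside codes is what makes the conversion reversible, so the 
same three lines can be wrapped in a function and reused across data sets.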

-- 
Frank E Harrell Jr   Professor and Chair   School of Medicine
  Department of Biostatistics   Vanderbilt University

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] data manipulation docs

2006-05-04 Thread Larry Howe
On Thursday May 4 2006 10:20, Federico Calboli wrote:
 The reason I am asking is that I have a number of data frames/matrices
 containing genetic data. The data is in character form, as in:

Take a look at the Bioconductor project: "Bioconductor is an open source and 
open development software project for the analysis and comprehension of 
genomic data." http://www.bioconductor.org/

 This whole game is getting exceedingly time consuming and frustrating,
 because I end up with random pieces of code that I save, each patching a
 particular problem but difficult to 'abstract' for a new task, so
 I get back close to square one annoyingly often.

This sounds like a software engineering problem, not an R problem. Does 
Imperial have a computer science dept.? Maybe they could advise on software 
engineering techniques.

Larry Howe
