On Wed, 8 Jul 2015, Robert Citek wrote:

> Do you know in advance which fields are text, integer, or floats? Or
> can a given field be of mixed data-type?

Robert,

Each field must be one type of data. It's when spreadsheet preparers put
text symbols such as '<' in a numeric column that things get FUBAR.

R doesn't care about the data type in each column. When the data are read
into a data.frame with read.table(), read.csv(), or read.delim(), R
automagically recognizes numeric types as integer or float; all others are
classified as factors unless 'stringsAsFactors = FALSE' is specified as an
argument to the function. In R a data.frame is a list of columns, and each
column can be of a different type. It is necessary, for example, to coerce
the sampdate column from factor to date using the as.Date() function.

Germane to cleaning the raw data prior to reading it into R, the fields
that need to be modified are integer or floating point (text, per se, is
not an issue), and I believe that a correctly formulated regex can identify
a field as integer or floating point.

Does this answer your question?

Rich
_______________________________________________
PLUG mailing list
[email protected]
http://lists.pdxlinux.org/mailman/listinfo/plug
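
As a rough illustration of the regex idea mentioned above, here is a minimal
sketch in Python. It is not the approach used in this thread: the names
INT_RE, FLOAT_RE, and classify_field are hypothetical, and the definition of
"float" (optional sign, a decimal point and/or an exponent) is an assumption
that may need adjusting for real data.

```python
import re

# Assumed definitions: an integer is an optionally signed run of digits;
# a float has a decimal point and/or an exponent.
INT_RE = re.compile(r"[+-]?[0-9]+")
FLOAT_RE = re.compile(
    r"[+-]?(?:"
    r"(?:[0-9]+\.[0-9]*|\.[0-9]+)(?:[eE][+-]?[0-9]+)?"  # 3.14, 5., .5, 1.2e3
    r"|[0-9]+[eE][+-]?[0-9]+"                           # 1e-3
    r")"
)


def classify_field(field):
    """Return 'integer', 'float', or 'other' for one raw field string."""
    s = field.strip()  # ignore padding around the value
    if INT_RE.fullmatch(s):
        return "integer"
    if FLOAT_RE.fullmatch(s):
        return "float"
    # Anything else, e.g. '<0.05' or plain text, is not a clean number.
    return "other"


if __name__ == "__main__":
    for raw in ["42", "-3.14", "<0.05", "1e-3", "pH"]:
        print(repr(raw), classify_field(raw))
```

A value such as '<0.05' falls into "other", which is the kind of entry that
would need attention before a numeric column is read into R.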
