Reproducible example, please. This doesn't make a whole lot of sense otherwise.
On Fri, Jan 20, 2012 at 1:52 PM, Sam Steingold <[email protected]> wrote:
> Hi,
> I have a data frame with one column containing string of the form
> "ABC...|XYZ..."
> where ABC etc are fields of 6 alphanumeric characters each
> and XYZ etc are fields of 8 alphanumeric characters each;
> "|" is a mandatory separator;
> I do not know in advance how many fields of each kind will each row contain.
> I need to extract these fields from the string.

This is already a data frame, so you don't need to import it into R, just process it?

> === How do I do that?
>
> first I need to split the string in 2 on '|' - how?

strsplit()

> then I need to split the two strings by 6/8 characters -- how?

substring() perhaps

> then I need to convert each 6/8 character string into an integer base 36
> or 64 (depending on the field) - how?

base 36? Really? How are you representing that? Somehow I think you mean something other than what you said. Either way, please clarify.

> === What do I do with them once I extract them?

I don't know. Save them as a list, most likely.

> First thing I want to do is to have a count table of them.
> Then I thought of adding an extra column for each field value and
> putting 0/1 there, e.g., frame
> 1,AB
> 2,BCD

I thought we had integers at this point?

> will turn into
> 1,1,1,0,0
> 2,0,1,1,1
> however this would work only if the number of different field values is
> manageable.

But we have no idea, because you haven't told us.

> What do people do?
> Can I have a columns of "sets" in data frame?
> Does R support the "set" data type?

factor() seems to be what you're looking for.

> PS. thanks to Sarah Goslee who answered my previous question in so much
> detail!

You're welcome, but you'd be even more welcome if you'd listened to the parts of my reply about reproducible examples, clear problem statements, and reading the posting guide.
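For what it's worth, the steps discussed above (split on "|", cut each half into fixed-width fields, convert, tabulate) might be sketched roughly as follows. This is only a sketch under assumptions: the data frame `df`, its column `s`, and the helper `split_fixed()` are made up for illustration, since no example data was posted. It also assumes every string has a non-empty part on each side of "|", and it only shows base 36 for the 6-character fields, because strtoi() handles bases 2 to 36 only and the base-64 alphabet was never specified.

```r
# Hypothetical example data: 6-character fields, "|", then 8-character fields.
df <- data.frame(id = 1:2,
                 s = c("00000A00000B|0000000Z", "00000B|0000001000000011"),
                 stringsAsFactors = FALSE)

# Hypothetical helper: cut one string into consecutive pieces of `width` characters.
split_fixed <- function(x, width) {
  if (is.na(x) || !nzchar(x)) return(character(0))
  starts <- seq(1, nchar(x), by = width)
  substring(x, starts, starts + width - 1)
}

# Step 1: split each string in two on the literal "|".
# fixed = TRUE matters, because "|" is a regex metacharacter.
parts <- strsplit(df$s, "|", fixed = TRUE)

# Step 2: cut the left half into 6-character and the right half into
# 8-character fields. The results are lists, since field counts vary by row.
left  <- lapply(parts, function(p) split_fixed(p[1], 6))
right <- lapply(parts, function(p) split_fixed(p[2], 8))

# Step 3: interpret the 6-character fields as base-36 integers.
# strtoi() returns NA for values that do not fit in a 32-bit integer,
# which can happen for large 6-character base-36 values.
left_int <- lapply(left, strtoi, base = 36L)

# Step 4: count table of all 6-character field values across rows.
counts <- table(unlist(left))

# Step 5: 0/1 indicator matrix, one column per distinct field value.
vals <- sort(unique(unlist(left)))
ind <- do.call(rbind, lapply(left, function(f) as.integer(vals %in% f)))
colnames(ind) <- vals
```

Keeping the fields in lists (or as list columns, e.g. `df$left <- left`) avoids the indicator matrix entirely when the number of distinct values is large; the matrix is only practical when `length(vals)` is manageable.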
Sarah

-- 
Sarah Goslee
http://www.functionaldiversity.org

______________________________________________
[email protected] mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

