Hi, I have a data frame with one column containing strings of the form "ABC...|XYZ...", where ABC etc. are fields of 6 alphanumeric characters each and XYZ etc. are fields of 8 alphanumeric characters each; "|" is a mandatory separator. I do not know in advance how many fields of each kind each row will contain. I need to extract these fields from the string.
=== How do I do that?

First, I need to split the string in two on '|'. How?
Then I need to split each of the two strings into chunks of 6 or 8 characters. How?
Then I need to convert each 6- or 8-character string into an integer, base 36 or 64 (depending on the field). How?

=== What do I do with them once I extract them?

The first thing I want is a count table of them. Then I thought of adding an extra column for each field value and putting 0/1 there, e.g., the frame

1,AB
2,BCD

will turn into

1,1,1,0,0
2,0,1,1,1

However, this would work only if the number of different field values is manageable. What do people do? Can I have a column of "sets" in a data frame? Does R support a "set" data type?

Thanks!

PS. Thanks to Sarah Goslee, who answered my previous question in so much detail!

--
Sam Steingold (http://sds.podval.org/)
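
For concreteness, here is a minimal sketch of the steps asked about above, using base R only. Everything named here is an assumption made for illustration: the sample data, the column name `s`, the helpers `chop` and `decode`, and above all the two digit alphabets. The post does not say which characters the base-64 fields use, so the 64-character alphabet below is hypothetical and would need to be replaced by the real one. The decoder returns doubles rather than integers because 36^6 is already larger than R's maximum integer, which is probably why `strtoi(x, base = 36L)` would not be safe for every 6-character field.

```r
# Hypothetical sample data; the column name `s` is an assumption.
df <- data.frame(id = 1:2,
                 s = c("00000A00000B|0000000C", "00000B|0000000C0000000D"),
                 stringsAsFactors = FALSE)

# Step 1: split each string in two on the literal "|".
# fixed = TRUE is needed because "|" is a regex metacharacter.
parts <- strsplit(df$s, "|", fixed = TRUE)

# Step 2: cut one string into consecutive fields of `width` characters.
chop <- function(x, width) {
  if (is.na(x) || !nzchar(x)) return(character(0))
  starts <- seq(1, nchar(x), by = width)
  substring(x, starts, starts + width - 1)
}

# Step 3: convert fields to numbers given an ordered digit alphabet.
# Returns doubles, which hold values up to 2^53 exactly.
decode <- function(fields, alphabet) {
  digits <- strsplit(alphabet, "")[[1]]
  base <- length(digits)
  vapply(fields, function(f) {
    d <- match(strsplit(f, "")[[1]], digits) - 1
    sum(d * base^(rev(seq_along(d)) - 1))
  }, numeric(1), USE.NAMES = FALSE)
}

# Assumed alphabets: 0-9 then A-Z for base 36; the base-64 one is a guess.
alphabet36 <- paste(c(0:9, LETTERS), collapse = "")
alphabet64 <- paste(c(0:9, LETTERS, letters, "+", "/"), collapse = "")

left  <- lapply(parts, function(p) decode(chop(p[1], 6), alphabet36))
right <- lapply(parts, function(p) decode(chop(p[2], 8), alphabet64))

# A list column can hold a variable number of values per row.
df$left  <- left
df$right <- right

# Count table over all 6-character field values.
counts <- table(unlist(left))

# 0/1 indicator matrix: one row per data-frame row, one column per value.
values <- sort(unique(unlist(left)))
ind <- do.call(rbind, lapply(left, function(v) as.integer(values %in% v)))
colnames(ind) <- values
```

On the "set" question: as far as I know, base R has no dedicated set type, but `union`, `intersect`, `setdiff` and `%in%` treat ordinary vectors as sets, so a list column of vectors as above may be enough. The indicator matrix is only practical while `length(values)` stays small, which matches the concern raised in the post.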