Reproducible example, please. This doesn't make a whole lot of sense
otherwise.

On Fri, Jan 20, 2012 at 1:52 PM, Sam Steingold <[email protected]> wrote:
> Hi,
> I have a data frame with one column containing string of the form 
> "ABC...|XYZ..."
> where ABC etc are fields of 6 alphanumeric characters each
> and XYZ etc are fields of 8 alphanumeric characters each;
> "|" is a mandatory separator;
> I do not know in advance how many fields of each kind will each row contain.
> I need to extract these fields from the string.

This is already a data frame, so you don't need to import it into R,
just process it?

> === How do I do that?
>
> first I need to split the string in 2 on '|' - how?

strsplit()
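
Something along these lines, as a minimal sketch. The data frame `df`, its
column `s`, and the sample strings are made up, since no example data was
posted. Note that "|" is a regular-expression metacharacter, so it has to be
matched literally with fixed = TRUE.

```r
# Hypothetical example data; the real column name and contents are unknown.
df <- data.frame(id = 1:2,
                 s = c("ABC123DEF456|XYZ12345", "GHI789|UVW12345RST67890"),
                 stringsAsFactors = FALSE)

# fixed = TRUE treats "|" as a literal character, not as regex alternation.
parts <- strsplit(df$s, "|", fixed = TRUE)

# Left-hand piece (6-character fields) and right-hand piece (8-character fields).
left  <- vapply(parts, `[`, character(1), 1)
right <- vapply(parts, `[`, character(1), 2)
```

If a row has nothing after the "|", strsplit() drops the trailing empty
piece, so `right` will be NA for that row.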

> then I need to split the two strings by 6/8 characters -- how?

substring() perhaps
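
A rough sketch of the fixed-width split, assuming each piece really is a
whole number of fields. The helper name split_fixed() is invented for
illustration and is not an existing R function.

```r
# Hypothetical helper: cut a single string into consecutive chunks of `width`
# characters using substring() with vectors of start and stop positions.
split_fixed <- function(x, width) {
  if (is.na(x) || nchar(x) == 0) return(character(0))
  starts <- seq(1, nchar(x), by = width)
  substring(x, starts, starts + width - 1)
}

six   <- split_fixed("ABC123DEF456", 6)   # two 6-character fields
eight <- split_fixed("XYZ12345", 8)       # one 8-character field
```

To run it over a whole column, lapply(left, split_fixed, width = 6) would
give a list with one character vector per row.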


> then I need to convert each 6/8 character string into an integer base 36
> or 64 (depending on the field) - how?

base 36? Really? How are you representing that? Somehow I think you
mean something other than what you said. Either way, please clarify.
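
If it really is base 36 with digits 0-9 followed by A-Z, strtoi() handles
bases up to 36, but a 6-character base-36 value can exceed R's integer range
and then comes back as NA. The sketch below therefore also accumulates into a
double. The helper to_number() and the alphabet are assumptions; for base 64
you would have to supply whatever 64-character alphabet your data actually
uses, which hasn't been specified.

```r
# Assumed alphabet for base 36: digits then uppercase letters.
alphabet36 <- c(0:9, LETTERS)

# Hypothetical helper: convert one string to a number, given the digit alphabet.
# Returns a double, so values beyond the 32-bit integer range are kept.
to_number <- function(x, alphabet) {
  base   <- length(alphabet)
  digits <- match(strsplit(x, "", fixed = TRUE)[[1]], alphabet) - 1
  if (anyNA(digits)) return(NA_real_)   # character not in the alphabet
  Reduce(function(acc, d) acc * base + d, digits, 0)
}

small <- strtoi("10", base = 36L)         # fits in an integer
big   <- strtoi("ZZZZZZ", base = 36L)     # overflows, so NA
big2  <- to_number("ZZZZZZ", alphabet36)  # 36^6 - 1, held as a double
```

Eight base-64 digits stay below 2^53, so a double still represents them
exactly.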

> === What do I do with them once I extract them?

I don't know. Save them as a list, most likely.

> First thing I want to do is to have a count table of them.
> Then I thought of adding an extra column for each field value and
> putting 0/1 there, e.g., frame
> 1,AB
> 2,BCD

I thought we had integers at this point?

> will turn into
> 1,1,1,0,0
> 2,0,1,1,1
> however this would work only if the number of different field values is
> manageable.

But we have no idea, because you haven't told us.

> What do people do?
> Can I have a columns of "sets" in data frame?
> Does R support the "set" data type?

factor() seems to be what you're looking for.
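
For the count table and the 0/1 columns, table() does most of the work. A
small sketch using the toy frame from the question (row 1 has fields A and B,
row 2 has B, C and D); the object names are made up.

```r
# Toy data mirroring the example in the question: one vector of fields per row.
fields <- list(`1` = c("A", "B"), `2` = c("B", "C", "D"))

# Overall count of each field value.
counts <- table(unlist(fields))

# Long format: one line per (row id, field) pair.
long <- data.frame(id    = rep(names(fields), lengths(fields)),
                   field = unlist(fields, use.names = FALSE))

# Cross-tabulate to get the 0/1 indicator layout, one column per field value.
indicator <- table(long$id, long$field)
```

Here `indicator` has rows 1 and 2 and columns A through D, matching the
1,1,0,0 and 0,1,1,1 layout in the question. If a field can repeat within a
row, the cells are counts rather than 0/1.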

> PS. thanks to Sarah Goslee who answered my previous question in so much 
> detail!

You're welcome, but you'd be even more welcome if you'd listened to
the parts of my reply about reproducible examples, clear problem
statements, and reading the posting guide.

Sarah

-- 
Sarah Goslee
http://www.functionaldiversity.org

______________________________________________
[email protected] mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
