Hi all,

I've Googled far and wide, but I don't think I know the correct terms to search for.

I have a massive dataset in which one of the factors contains both individual items and comma-separated lists of items (for example, "cat" and "cat, dog, bird"). I would like to recode this factor to keep only the first element of each list, so that every list starting with "cat," and every observation that was already just "cat" would all be set to "cat". Ideally I could do this in some simple way that does not require writing hundreds of separate recoding statements, since the lists probably start with 300+ different items. Is this possible, or is it extremely complicated?
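
To make the request concrete, here is a toy version of the data and one way the recoding might be sketched. The name `pets` and its values are made up for illustration, and the sketch assumes that items within a list are always separated by commas.

```r
# Toy data; `pets` and its values are hypothetical stand-ins for the real factor
pets <- factor(c("cat", "cat, dog, bird", "dog", "dog, cat", "bird, cat"))

# Drop everything from the first comma onward, then trim stray whitespace
first_item <- trimws(sub(",.*$", "", as.character(pets)))

# Rebuild the factor so its levels are just the distinct first items
pets_first <- factor(first_item)
table(pets_first)
```

Because the pattern works on the whole vector at once, the same three lines would apply however many distinct first items there are.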

Also, I am sure this is much simpler, but I cannot seem to get rid of levels of a factor that have no observations. I have tried setting the levels of the factor to only the ones with observations that I am interested in, but every time I summarize the variable there are still 100+ labels all with "0" as their count. This hasn't happened to me before; is there an explanation for it?
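
A small example of the second problem, again with made-up data: a factor keeps its full set of levels even when some of them no longer have any observations (for instance after subsetting), so the empty levels keep appearing in summaries. The sketch below assumes an R version that provides `droplevels()`.

```r
# Hypothetical factor with levels that have no observations
f <- factor(c("cat", "dog", "cat"), levels = c("bird", "cat", "dog", "fish"))
table(f)  # "bird" and "fish" are still listed, each with a count of 0

# droplevels() discards the levels that are not used
f_clean <- droplevels(f)
levels(f_clean)

# Re-creating the factor from its current values has the same effect
f_clean2 <- factor(f)
levels(f_clean2)
```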

Thanks very much,
Jen

---
Jennifer Walsh
Graduate Student, Developmental Psychology
University of Michigan
2020 East Hall, 530 Church St.
Ann Arbor, MI 48109-1043

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.