Hi, I have a data frame that contains pedigree information, that is, individual, sire and dam identities as separate columns. It also has a date of birth column.
These identifiers are not necessarily numeric or sequential. An identifier can appear in one or two columns, depending on whether it was a parent or not. These should be consistent. Not all identifiers appear in the individual column: a parent may have no record of its own if its parents were not known. Missing parental (sire and/or dam) identifiers can also occur.

I need to export the data for use in another program that requires the pedigree to be coded as integers, increasing with date of birth (so that sire and dam always have lower identifiers than their offspring), with missing values coded as 0. How would I go about doing this?

A second, simpler, related question: if I have a column with n different values (strings or non-sequential integers) identifying levels, possibly with repeated occurrences, how can I recode them to be sequential from 1 to n?

I can solve both problems in Fortran, so I could use loops to do it in R, but I feel there should be a quicker, more elegant, "more R" solution.

Thanks for your help.

Ron.

______________________________________________
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
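
To make the two questions concrete, here is a minimal sketch of one possible vectorised approach using only base R (match, unique, order, setdiff). The data and the column names id, sire, dam and dob are made up for illustration. It assumes that every recorded individual has a unique identifier and a non-missing date of birth, that recorded parents are born before their offspring, and that parents without a record of their own can simply be numbered first, since they have no date of birth to sort on.

```r
# Hypothetical example data; the column names are assumptions, not from the post.
ped <- data.frame(
  id   = c("C7", "A1", "B3", "D9"),
  sire = c("A1", NA,   "X5", "A1"),
  dam  = c("B3", NA,   NA,   "C7"),
  dob  = as.Date(c("2003-04-01", "2000-01-15", "2001-06-30", "2005-09-12")),
  stringsAsFactors = FALSE
)

# Second question: recode arbitrary labels to 1..n, numbered in order of
# first appearance. (as.integer(factor(x)) would number them in sorted
# label order instead.)
x    <- c("b", "a", "b", 17, "a")   # mixed values are coerced to character
code <- match(x, unique(x))

# First question.
# Parents that never appear in the id column have no date of birth, so they
# are placed first; recorded individuals follow in order of date of birth.
parents  <- unique(c(ped$sire, ped$dam))
founders <- setdiff(parents[!is.na(parents)], ped$id)
ordered  <- c(founders, ped$id[order(ped$dob)])

# An identifier's new code is its position in 'ordered'; missing becomes 0.
recode <- function(v) {
  k <- match(v, ordered)
  k[is.na(k)] <- 0L
  k
}

out <- data.frame(id   = recode(ped$id),
                  sire = recode(ped$sire),
                  dam  = recode(ped$dam))
out <- out[order(out$id), ]   # sort the export by the new integer id
print(out)
```

With this ordering, X5 (a sire with no record of its own) gets code 1 and the recorded animals get 2 to 5 by date of birth, so every sire and dam code is lower than the offspring code. Whether unrecorded parents should need their own rows in the export depends on the target program, which the sketch does not address.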