Ron Crump wrote:
> Hi,
>
> I have a dataframe that contains pedigree information;
> that is individual, sire and dam identities as separate
> columns. It also has date of birth.
>
> These identifiers are not numeric, or not sequential.
>
> Obviously, an identifier can appear in one or two columns,
> depending on whether it was a parent or not. These should
> be consistent.
>
> Not all identifiers appear in the individual column - it
> is possible for a parent not to have its own record if its
> parents were not known.
>
> Missing parental (sire and/or dam) identifiers can occur.
>
> I need to export the data for use in another program that
> requires the pedigree to be coded as integers, increasing
> with date of birth (therefore sire and dam always have
> lower identifiers than their offspring) and with missing
> values coded as 0.
>
> How would I go about doing this?
>
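For reference, the recoding asked for above could be sketched in base R roughly as follows. This is only a minimal sketch under assumptions that are not in the original post: the column names id, sire, dam and dob, the toy data, and the function name recode_pedigree() are all hypothetical; every recorded animal is assumed to have a date of birth later than that of its parents; and parents without their own record are treated as founders and numbered first.

```r
# Toy data; column names and values are made up for illustration.
ped <- data.frame(
  id   = c("C7", "A1", "B2", "D9"),
  sire = c("A1", NA,   "X0", "B2"),
  dam  = c("Z5", NA,   NA,   "C7"),
  dob  = as.Date(c("2001-03-01", "1999-05-10", "2000-01-15", "2003-07-20")),
  stringsAsFactors = FALSE
)

# Hypothetical helper: recode identifiers as integers increasing with dob.
recode_pedigree <- function(ped) {
  # Parents that never appear in the id column have no date of birth;
  # treat them as founders and give them the lowest codes.
  parents  <- unique(c(ped$sire, ped$dam))
  parents  <- parents[!is.na(parents)]
  founders <- sort(setdiff(parents, ped$id))

  # Recorded individuals follow, in order of date of birth
  # (assumes dob is never missing).
  key <- c(founders, ped$id[order(ped$dob)])

  # Position in the key is the new integer code; missing parents become 0.
  code <- function(x) {
    out <- match(x, key)
    out[is.na(out)] <- 0L
    out
  }

  coded <- data.frame(id = code(ped$id), sire = code(ped$sire),
                      dam = code(ped$dam), dob = ped$dob)
  coded[order(coded$id), ]
}

result <- recode_pedigree(ped)
print(result)
```

With the toy data, the two parents without records receive codes 1 and 2, and the recorded animals receive 3 to 6 in birth order. The sketch does not check that the dates of birth are consistent with the pedigree, so a check such as all(result$sire < result$id) is worth running before export.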
You might look at http://www.qimr.edu.au/davidD/sib-pair.R, specifically
the read.pedigree() and wrlink() functions. The former is not very
impressive speedwise -- I usually perform these tasks in my Sib-pair
(Fortran) program, which is on the same webpage. Sib-pair will order the
pedigree by generational position, so a DOB is not required to do the sort.
Terry Therneau's kinship package does that ordering too, but doesn't
include output routines for the Linkage format.

David Duffy.

| David Duffy (MBBS PhD)
| email: [EMAIL PROTECTED]  ph: INT+61+7+3362-0217  fax: -0101
| Epidemiology Unit, Queensland Institute of Medical Research
| 300 Herston Rd, Brisbane, Queensland 4029, Australia  GPG 4D0B994A
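To illustrate what ordering by generational position means, here is a minimal base R sketch. It is not taken from Sib-pair or the kinship package and may well differ from what they do; the function name generation_order() and the toy data are hypothetical. The idea is that anyone with no known parents is in generation 0, and everyone else is one generation below their deepest parent.

```r
# Toy data; identifiers are made up for illustration.
id   <- c("C7", "A1", "B2", "D9")
sire <- c("A1", NA,   "X0", "B2")
dam  <- c("Z5", NA,   NA,   "C7")

# Hypothetical helper: return all identifiers, parents before offspring.
generation_order <- function(id, sire, dam) {
  # Include parents that have no record of their own.
  all_ids <- unique(c(id, sire, dam))
  all_ids <- all_ids[!is.na(all_ids)]

  # Parents of every identifier; NA when unknown or unrecorded.
  s <- sire[match(all_ids, id)]
  d <- dam[match(all_ids, id)]

  gen <- rep(0L, length(all_ids))
  converged <- FALSE
  # A pedigree without loops settles within length(all_ids) passes.
  for (i in seq_len(length(all_ids) + 1L)) {
    gs <- gen[match(s, all_ids)]
    gd <- gen[match(d, all_ids)]
    gs[is.na(gs)] <- -1L   # unknown parent counts as "above generation 0"
    gd[is.na(gd)] <- -1L
    new_gen <- pmax(gs, gd) + 1L
    if (identical(new_gen, gen)) {
      converged <- TRUE
      break
    }
    gen <- new_gen
  }
  if (!converged) stop("pedigree contains a loop")

  # order() is stable, so ties keep their original relative order.
  all_ids[order(gen)]
}

ord <- generation_order(id, sire, dam)
# Integer codes follow from position in the ordering; missing parents are 0.
sire_code <- match(sire, ord)
sire_code[is.na(sire_code)] <- 0L
print(ord)
```

Because the generation number of an offspring is always greater than that of either parent, the position of each identifier in ord can serve as its integer code, with parents guaranteed lower codes than their offspring even when no date of birth is available.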