I would like to read the following hierarchical data set. There is a family
record followed by one or more personal records.
If col. 7 is 1 it is a family record. If it is 2 it is a personal
record.
The family record is formatted as follows:
col. 1-5 family id
col. 7 1
col. 9
Will this do it for you:
input <- readLines(textConnection("06470 1 1
+ 1 232 0
+ 2 230 1
+ 07470 1 0
+ 1 240 1
+ 08470 1 0
+ 1 227 0
+ 09470 1 0
+ 1 213 1
+ 2 222 0
+ 3 224 1
+ 10470 1 1
+ 1 220 0
+ 2 211 1
+ 11470 1 0
+ 1 217 0
+ 2 210 1
+ 3 226
Try this. It uses input defined in Jim's post and defines the rectype
of each row (1 or 2). It then reads the rectype 1 records into
DF1 using read.fwf and the rectype 2 records into DF2 also using
read.fwf. ix is defined to have one component per personal record
giving the row number in DF1 of
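
A minimal sketch of the split-and-link idea described above, not the code from the original post. The sample lines are made up, and the personal-record layout (a value in cols 9-11, a flag in col 13) and all column names are assumptions for illustration only. Negative widths in read.fwf skip columns, and ix is built with cumsum so that each personal record points at the most recent family record before it.

```r
# Hypothetical sample: cols 1-5 family id, col 7 record type (1 or 2).
# Everything after col 7 is an assumed layout, not the poster's real one.
input <- c("06470 1 1",
           "      2 232 0",
           "      2 230 1",
           "07470 1 0",
           "      2 240 1")

# record type (1 = family, 2 = personal)
rectype <- substr(input, 7, 7)

# family records: keep cols 1-5, col 7 and col 9; negative widths are skipped
DF1 <- read.fwf(textConnection(input[rectype == "1"]),
                widths = c(5, -1, 1, -1, 1),
                col.names = c("famid", "rectype", "famvar"),
                colClasses = "character")

# personal records: skip cols 1-6, keep col 7, cols 9-11 and col 13
DF2 <- read.fwf(textConnection(input[rectype == "2"]),
                widths = c(-6, 1, -1, 3, -1, 1),
                col.names = c("rectype", "value", "flag"),
                colClasses = "character")

# one entry per personal record: the row of DF1 holding its family
ix <- cumsum(rectype == "1")[rectype == "2"]

# carry the family id down onto each personal record
DF2$famid <- DF1$famid[ix]
```

Reading everything as character keeps the leading zero in the family id; convert individual columns afterwards if numbers are needed.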
Here is a further simplification. We use the colClasses= argument
with NULL for the columns we do not want so we do not have to later
remove those columns.
# record type (1 or 2)
rectype <- substr(input, 7, 7)
# read in record type 1
input1 <- input[rectype == 1]
DF1 <-
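
Since the excerpt is cut off here, the following is only a sketch of what the colClasses= approach might look like, not the original code. The widths, column names, and the choice of which columns to drop are assumptions. The string "NULL" in colClasses tells read.table (which read.fwf calls internally) to discard that column as it is read.

```r
# Hypothetical family records: cols 1-5 family id, col 7 record type,
# col 9 a one-digit family variable (assumed layout).
input1 <- c("06470 1 1",
            "07470 1 0")

# One width per column, including the blank separators and the record type;
# "NULL" drops a column at read time so nothing has to be removed later.
DF1 <- read.fwf(textConnection(input1),
                widths = c(5, 1, 1, 1, 1),
                col.names = c("famid", "gap1", "rectype", "gap2", "famvar"),
                colClasses = c("character", "NULL", "NULL", "NULL", "integer"))

names(DF1)  # only famid and famvar remain
```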