Hello,

I have a data set with many individuals, each with multiple timed
observations, and I would like to subset the data to exclude the later
observations.
However, I need to exclude a different amount of data for each
individual. Each individual has two types of data: DV and dose. What I
would like to do is exclude the later records once one of the two types
of data is no longer being recorded.

The data follow an approximately 28-day cycle. Each individual has a
baseline DV and receives their first dose on day 1. Around day 28, they
have their first post-dose DV observed. This means that an individual
"should" have one fewer dose record than DV records.

What I would like is to take the following:

ID  TIME  DV  DOSE  TYPE
1   0     0   NA    2
1   1     NA  100   1
1   27    0   NA    2
1   29    NA  100   1
1   54    2   NA    2
1   84    3   NA    2
1   100   3   NA    2
1   127   3   NA    2

2   0     0   NA    2
2   1     NA  120   1
2   28    4   NA    2
2   29    NA  120   1
2   56    8   NA    2
2   57    NA  100   1

3   0     2   NA    2
3   1     NA  80    1
3   28    5   NA    2
3   56    2   NA    1
3   84    1   NA    2

4   0     0   NA    2
4   1     NA  100   1
4   29    NA  100   1
4   57    NA  100   1
4   85    NA  100   1
...


And turn it into:

ID  TIME  DV  DOSE  TYPE
1   0     0   NA    2
1   1     NA  100   1
1   27    0   NA    2
1   29    NA  100   1
1   54    2   NA    2

2   0     0   NA    2
2   1     NA  120   1
2   28    4   NA    2
2   29    NA  120   1
2   56    8   NA    2

3   0     2   NA    2
3   1     NA  80    1
3   28    5   NA    2

4   0     0   NA    2
...


My thought for how to do this was to:

(1) Subset the data at the "maximum" time at which an individual had an
observed DV (type=2). However, this will be a different time for every
patient, and I am unsure how to do this kind of per-individual subsetting.

(2) After that, take the new subsetted data and determine the "maximum"
time at which an individual had a dose, then count the total rows of data
the patient has up to that last dose time. I could then subset the data by
taking the first "n+1" rows for each individual, where n = "total rows of
data a patient had up to their last dose time." I hope I could work this
step out from knowing how to do step (1), if I can use "table" and "max"
interchangeably.
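
To make the two steps concrete, here is a minimal base R sketch of what I
have in mind, run on the example rows above. It is only an illustration
under some assumptions: `trim_one` is a name I made up, the data frame is
called `dat`, and DV and dose rows are identified by non-missing DV and
DOSE values rather than by TYPE (the row for ID 3 at TIME 56 has a DV but
TYPE 1, so TYPE alone would not reproduce the output I want). An individual
with no DV at all is assumed to be dropped entirely.

```r
# Example data from the post (separator lines and "..." left out)
dat <- read.table(header = TRUE, text = "
ID TIME DV DOSE TYPE
1 0 0 NA 2
1 1 NA 100 1
1 27 0 NA 2
1 29 NA 100 1
1 54 2 NA 2
1 84 3 NA 2
1 100 3 NA 2
1 127 3 NA 2
2 0 0 NA 2
2 1 NA 120 1
2 28 4 NA 2
2 29 NA 120 1
2 56 8 NA 2
2 57 NA 100 1
3 0 2 NA 2
3 1 NA 80 1
3 28 5 NA 2
3 56 2 NA 1
3 84 1 NA 2
4 0 0 NA 2
4 1 NA 100 1
4 29 NA 100 1
4 57 NA 100 1
4 85 NA 100 1
")

# Hypothetical helper: trims the rows of a single individual
trim_one <- function(d) {
  d <- d[order(d$TIME), ]
  dv_times <- d$TIME[!is.na(d$DV)]
  # Assumption: an individual without any DV contributes no rows
  if (length(dv_times) == 0) return(d[0, ])
  # Step (1): drop rows after this individual's last observed DV
  d <- d[d$TIME <= max(dv_times), ]
  # Step (2): n = number of rows up to and including the last remaining dose
  dose_rows <- which(!is.na(d$DOSE))
  n <- if (length(dose_rows) > 0) max(dose_rows) else 0
  # Keep the first n + 1 rows (the DV that follows the last dose)
  d[seq_len(min(n + 1, nrow(d))), ]
}

# Apply the helper to each individual and stack the pieces back together
trimmed <- do.call(rbind, lapply(split(dat, dat$ID), trim_one))
rownames(trimmed) <- NULL
print(trimmed)
```

The split/lapply/rbind pattern is what lets the cutoff differ for every
patient; I assume something like `ave()` or `tapply()` could do the same.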

Any help would be appreciated!


______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
