The obesity REDCap project will have a survey and a data form for study tracking:

 * Tracker (contains invitation mail-out dates, non-automated response
   status info, etc.)
     o data can/will be uploaded in batch (Excel spreadsheet), e.g. to
       assign a mail out date to wave X participants
     o records can also be edited in REDCap by the study coordinator
 * Survey
     o contains survey responses
     o data entered by study participant or coordinator (if respondent
       uses phone/snail mail)

I know that earlier we reported problems tracking respondents who have not yet filled out a survey. This turned out to be due to having auto-numbering turned on. If you run into this problem at your site and want more details, we can send those in a separate email. Moving right along...

The tracker will be the first form in the project, and it will be batch uploaded from Excel/CSV with the study-assigned ID in the first column, so that this ID becomes the record linking field. We will then use the REDCap participant list to generate unique survey URLs to be printed on snail-mailed invitations. This will be done by sending dummy invitation emails in bulk to an email address we control, which will also be a field in the tracker form. The list of participants, along with their unique URLs, can then be exported for printing the snail-mail survey invitations (presumably via some mail-merge feature in the researcher's word processor).
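
To make the upload format concrete, here is a minimal sketch in Python of writing a tracker CSV with the study ID in the first column. Only "study ID first" comes from the plan above; the other column names, the example IDs, and the email address are placeholders made up for illustration.

```python
import csv
import io

# Hypothetical tracker columns; only "study ID first" comes from the plan.
TRACKER_FIELDS = ["study_id", "invite_email", "mailout_date", "response_status"]

def tracker_csv(rows):
    """Return CSV text with study_id as the first column of every row."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=TRACKER_FIELDS, lineterminator="\n")
    writer.writeheader()
    for row in rows:
        # Missing values are written as empty cells.
        writer.writerow({name: row.get(name, "") for name in TRACKER_FIELDS})
    return buf.getvalue()

example = tracker_csv([
    {"study_id": "SA-A3-0001", "invite_email": "survey-inbox@example.org"},
    {"study_id": "SA-B1-0002", "invite_email": "survey-inbox@example.org"},
])
print(example)
```

The same dummy address is used for every row on purpose, since the invitations only exist to make REDCap generate the unique URLs.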

We would also like to do the following, but have not tested anything yet:

 * Export return codes for unique survey URLs (to help users who lost
   theirs)
     o Been done before per UMN Redcap FAQ, but requires plugin
       specific to UMN
 * Generate QR codes to be printed along with survey URLs
     o Been done before per U Iowa survey tricks

When exporting data from Oracle into the tracker form, we use patient information (BMI percentile, age, sex) to assign the study number. The study number includes an encoded age-BMI bin, plus extra digits or characters that guarantee that no two respondents at different sites will have the same ID.
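
As a rough illustration of what such an ID could look like, here is a sketch in Python. Everything specific in it is an assumption: the cut points, the letter/digit encoding, the site prefix, and the serial number layout are invented for the example and are not the actual scheme.

```python
import bisect

# All cut points and the ID layout below are made up for illustration.
AGE_CUTS = [6, 12]        # years: bins are <6, 6-11, >=12
BMI_PCT_CUTS = [85, 95]   # percentile: bins are <85, 85-94, >=95

def age_bmi_bin(age_years, bmi_percentile):
    """Encode the age bin as a letter and the BMI-percentile bin as a digit."""
    age_bin = bisect.bisect_right(AGE_CUTS, age_years)
    bmi_bin = bisect.bisect_right(BMI_PCT_CUTS, bmi_percentile)
    return "ABC"[age_bin] + str(bmi_bin)

def study_id(site_code, age_years, bmi_percentile, serial):
    """A per-site prefix keeps IDs from different sites from colliding."""
    return f"{site_code}-{age_bmi_bin(age_years, bmi_percentile)}-{serial:04d}"

print(study_id("SA", 7, 96.5, 12))
```

As long as each site uses its own prefix and never reuses a serial number, two sites cannot produce the same ID without having to coordinate.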

We use the emergency contact information to determine who/where to mail the survey. Multiple emergency contacts exist for patients. For example, UTHSCSA patients may have up to five separate sets of emergency contact information in Epic. For choosing a contact, if there is no guardian, the tentative plan is to fall back on mother, then father, then emergency contact 1, and finally emergency contact 2. We are working on ways to empirically decide which is the best order of precedence and also awaiting guidance from some of our local experienced study coordinators. If any of y'all have insights into this, please speak up.

Update: it appears that the contact address info in the PATIENT table itself is the most used, but we are still investigating where to pull the salutation from: PROXY_NAME or GUARDIAN_NAME (or some other field). Does anybody have any thoughts?
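
In case it helps the discussion, the fallback order from the tentative plan (guardian, then mother, father, emergency contact 1, emergency contact 2) could be written down roughly as follows. This is only a sketch: the dictionary keys are hypothetical names rather than actual Epic columns, and "usable" here just means a non-blank address.

```python
# Fallback order from the tentative plan; the keys are hypothetical names,
# not actual Epic column names.
PRECEDENCE = ["guardian", "mother", "father",
              "emergency_contact_1", "emergency_contact_2"]

def pick_contact(contacts):
    """Return (role, contact) for the first role with a non-blank address."""
    for role in PRECEDENCE:
        contact = contacts.get(role)
        if contact and (contact.get("address") or "").strip():
            return role, contact
    return None  # nobody usable; leave for a human to resolve

patient_contacts = {
    "guardian": {"name": "", "address": " "},            # blank, so skipped
    "father": {"name": "Pat Doe", "address": "1 Main St"},
}
print(pick_contact(patient_contacts))
```

Changing the order of precedence then only means reordering one list, which should make it easy to try alternatives once we hear from the coordinators.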

We also use contact information to reduce the number of duplicate mailings. To that end, we are interested in any tools for formatting addresses into USPS standard, comparing for duplicates, and validating addresses. Our current plan is:

 * Sort the entries in a random order.
 * Randomly remove all but one record from each household, where a
   household is defined as follows:
     o Treat all case-normalized duplicate email addresses as
       representing the same household.
     o Remove all non-numeric characters and leading 1s from phone
       numbers then treat all identical matches that result as
       representing the same household.
     o Concatenate ADDRESS1 and ADDRESS2 fields, replace all runs of
       whitespace with a single whitespace character, convert
       everything to lowercase, use certain USPS conversion rules
       (converting to standard abbreviations except where this would
       cause ambiguity), and then treat all identical matches that
       result as representing the same household.
 * After the high-confidence duplicates are removed, there will remain
   clusters of similar addresses that may or may not be duplicates. We
   will not auto-cull them, but we will flag them in a way that will
   hopefully make them easier for a human to spot. The tentative plan
   is to:
     o Take each address (normalized for case, spaces, and
       abbreviations in the previous step) and calculate the Levenshtein
       distance to every other such address.
     o All addresses with a distance lower than our threshold will be
       assigned the same randomly generated ID in the DUPLICATE_ID column.
     o We then skip to the next address that doesn't yet have a
       DUPLICATE_ID and repeat the process until we run out of addresses.
     o Addresses that have no other addresses below the similarity
       threshold will all be assigned a DUPLICATE_ID of 0.
 * The normalized addresses will not be part of the final output to be
   uploaded into REDCap, but the DUPLICATE_ID field will remain.
 * In REDCap we will create a report that pulls only entries where
   DUPLICATE_ID != 0 and sorts those entries by DUPLICATE_ID. A study
   coordinator would then glance through these clusters and if they are
   actually different addresses (e.g. adjacent houses, or apartments
   within the same building, and probably more exotic variants), change
   their DUPLICATE_ID to 0. The remaining ones would be deleted except
   for the first entry.
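
The steps above can be sketched in Python roughly as follows. Please read it as a sketch under assumptions, not the final script: the record field names, the abbreviation table (a tiny subset, not the real USPS list), the threshold of 3, and the ID range are all placeholders. It also goes slightly beyond the description in one place: an address that is close to a member of an existing cluster joins that cluster instead of starting a new one.

```python
import random
import re

# Tiny illustrative subset of USPS abbreviations, not the full table.
USPS_ABBREV = {"street": "st", "avenue": "ave", "road": "rd",
               "drive": "dr", "apartment": "apt", "suite": "ste"}

def norm_email(email):
    return (email or "").strip().lower()

def norm_phone(phone):
    digits = re.sub(r"\D", "", phone or "")  # keep digits only
    return digits.lstrip("1")                # drop leading 1s

def norm_address(address1, address2=""):
    text = re.sub(r"\s+", " ", f"{address1 or ''} {address2 or ''}").strip().lower()
    words = [w.strip(".,") for w in text.split(" ")]
    return " ".join(USPS_ABBREV.get(w, w) for w in words)

def drop_exact_duplicates(records, seed=None):
    """Shuffle, then keep one record per email/phone/address match."""
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)
    seen, kept = set(), []
    for rec in shuffled:
        keys = {("email", norm_email(rec.get("email"))),
                ("phone", norm_phone(rec.get("phone"))),
                ("addr", norm_address(rec.get("address1"), rec.get("address2")))}
        keys = {k for k in keys if k[1]}   # blank fields never match anything
        if not keys & seen:
            kept.append(rec)
        seen |= keys                       # remember keys of dropped records too
    return kept

def levenshtein(a, b):
    """Edit distance by dynamic programming, one row at a time."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1, cur[j - 1] + 1, prev[j - 1] + (ca != cb)))
        prev = cur
    return prev[-1]

def assign_duplicate_ids(addresses, threshold=3, seed=None):
    """Return one DUPLICATE_ID per normalized address; 0 means no near match."""
    rng = random.Random(seed)
    ids = [None] * len(addresses)
    used = set()
    for i, addr in enumerate(addresses):
        if ids[i] is not None:
            continue
        near = [j for j in range(len(addresses))
                if j != i and levenshtein(addr, addresses[j]) < threshold]
        if not near:
            ids[i] = 0
            continue
        existing = [ids[j] for j in near if ids[j]]
        if existing:
            cluster_id = existing[0]       # join a cluster a neighbour is in
        else:
            cluster_id = rng.randint(1, 10**6)
            while cluster_id in used:
                cluster_id = rng.randint(1, 10**6)
            used.add(cluster_id)
        for j in [i] + near:
            if ids[j] is None:
                ids[j] = cluster_id
    return ids
```

Note that the pairwise comparison is quadratic in the number of addresses, which should be fine for a mailing list of a few thousand households but would need blocking (e.g. by ZIP code) for much larger lists.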

This will not remove all duplicates, only diminish them. The primary goal is not avoiding siblings. In fact, siblings living in separate households will most likely slip through. This is just a limitation of this study design we have to live with. The real reason removing duplicates matters is minimizing how many households we irritate with repeat mailings.

i2b2 does not contain these emergency contact fields. Therefore, even if a site has an identified i2b2 instance, it will not be useful for extracting contact information. We see no practical alternative at this time to pulling these fields from the Epic source and then assigning the study IDs and doing the duplicate detection within a Python script. This script will either output a CSV file ready to upload into REDCap to create the tracker, or directly create the tracker via the REDCap API. We will send this script out to the study sites. Does anybody have any better suggestions?
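
For the API route, a minimal sketch of what the upload step might look like, using only the Python standard library. The URL, token, and CSV contents are placeholders, and the parameter names are the ones we understand the REDCap "Import Records" API method to take; please check them against the API documentation on your own REDCap instance before relying on this.

```python
import urllib.parse
import urllib.request

def build_import_request(api_url, token, csv_text):
    """Build (but do not send) a REDCap record-import request."""
    payload = {
        "token": token,                  # per-user, per-project API token
        "content": "record",
        "format": "csv",
        "type": "flat",                  # one row per record
        "overwriteBehavior": "normal",   # blank cells do not erase stored values
        "data": csv_text,
        "returnContent": "count",
        "returnFormat": "json",
    }
    body = urllib.parse.urlencode(payload).encode("utf-8")
    return urllib.request.Request(api_url, data=body, method="POST")

# Placeholder URL, token, and data, for illustration only.
req = build_import_request(
    "https://redcap.example.org/api/",
    "PLACEHOLDER_TOKEN",
    "study_id,mailout_date\nSA-B2-0012,\n",
)

# Sending is left commented out so the sketch has no side effects:
# with urllib.request.urlopen(req) as resp:
#     print(resp.read().decode("utf-8"))
```

Writing the CSV to disk first and uploading it by hand remains the simpler option for sites that do not have API tokens enabled.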
_______________________________________________
Gpc-dev mailing list
[email protected]
http://listserv.kumc.edu/mailman/listinfo/gpc-dev
