RE: Current survey management plan for Obesity

Bhargav Adagarla Tue, 03 Mar 2015 14:23:42 -0800

A few things I noticed that could be implementation risks:

- The other survey plans (ALS and breast cancer) seem technically simpler. For 
example, the ALS survey (which is closest to the Obesity survey in terms of 
implementation) might not have any scripts to run in its plan. Each honest 
broker will get a list of patients that they will then share with the study 
coordinator who will manually go through the list and make sure the list does 
not have duplicates or deceased patients (and so on). Then they might place the 
data into a REDCap data import template and upload it via REDCap data import 
tool. But I understand that the Obesity survey cohort will be very large 
compared to ALS and might need something automated.


- We might have to accommodate differences in REDCap versions among the 
sites. For example, we were trying to work with folks at one of the GPC sites 
on a non-GPC project when we realized they use the REDCap LTS (long-term 
support) version, which might not have the same features as regular REDCap.

Thanks.

Regards,
Adagarla, Bhargav Srinivas
________________________________
From: Dan Connolly
Sent: Tuesday, March 03, 2015 9:17 AM
To: Teresa Bosler; Supreet Kathpalia
Cc: Bhargav Adagarla; [email protected]; Alex Bokov
Subject: RE: Current survey management plan for Obesity
Phillip/ Teresa / UTSW, Supreet / UMN, what do you think of this plan? Recall 
you agreed to review in our 17 Feb call.

Review by others is, of course, more than welcome as well.

--
Dan
________________________________
From: [email protected] [[email protected]] on 
behalf of Alex Bokov [[email protected]]
Sent: Tuesday, February 17, 2015 11:11 AM
To: [email protected]
Subject: Current survey management plan for Obesity
The obesity REDCap project will have a survey and a data form for study tracking:


  *   Tracker (contains invitation mail-out dates, non-automated response 
status info, etc.)

     *   data can/will be uploaded in batch (Excel spreadsheet), e.g. to assign 
a mail out date to wave X participants
     *   records can also be edited in REDCap by the study coordinator

  *   Survey

     *   contains survey responses
     *   data entered by study participant or coordinator (if respondent uses 
phone/snail mail)
I know that earlier we reported problems tracking respondents who have not yet 
filled out a survey. This turned out to be due to having auto-numbering turned 
on. If you run into this problem at your site and want more details, we can 
send those in a separate email. Moving right along...

The tracker will be the first form in the project, and it will be batch 
uploaded from Excel/csv, with the study assigned ID in the first column, so 
this becomes the record linking field. Then we will use the REDCap participant 
list to generate unique survey URLs to be printed on snail-mailed invitations. 
This will be done by sending dummy invitation emails in bulk to an email 
address we control, which will also be a field in the tracker form. The list of 
participants along with their unique URLs can then be exported, for printing 
the snail-mail survey invitations (presumably via some mail-merge feature in 
the researcher's word processor).

We would also like to do the following, but have not tested anything yet:

  *   Export return codes for unique survey URLs (to help users who lost theirs)

     *   Been done before per UMN Redcap FAQ, but requires plugin specific to 
UMN

  *   Generate QR codes to be printed along with survey URLs

     *   Been done before per U Iowa survey tricks
When exporting data from Oracle into the tracker form, we use patient 
information (BMI percentile, age, sex) to assign the study number, which 
includes an encoded age-BMI-bin, and extra digits or characters that will 
guarantee that no two respondents at different sites will have the same ID.
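
As a rough illustration of the ID scheme described above, here is a minimal Python sketch. Everything specific in it is an assumption made for the example: the bin cut points, the site-code prefix, the zero-padded serial, and the function names are all hypothetical, not the study's actual encoding.

```python
def age_bmi_bin(age_years, bmi_percentile):
    """Return a two-character bin code. Cut points are placeholders."""
    age_cuts = (5, 10, 15)   # hypothetical age boundaries, in years
    bmi_cuts = (85, 95)      # hypothetical BMI percentile boundaries
    age_bin = sum(age_years >= cut for cut in age_cuts)       # 0..3
    bmi_bin = sum(bmi_percentile >= cut for cut in bmi_cuts)  # 0..2
    return f"{age_bin}{bmi_bin}"


def make_study_id(site_code, age_years, bmi_percentile, serial):
    """Build an ID; the site prefix keeps IDs from colliding across sites."""
    return f"{site_code}-{age_bmi_bin(age_years, bmi_percentile)}-{serial:05d}"


if __name__ == "__main__":
    print(make_study_id("UTHSCSA", 12, 96.0, 7))
```

Prefixing with a site code is only one way to get cross-site uniqueness; a per-site number range would serve the same purpose.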

We use the emergency contact information to determine who/where to mail the 
survey. Multiple emergency contacts exist for patients. For example, UTHSCSA 
patients may have up to five separate sets of emergency contact information in 
Epic. For choosing a contact, if there is no guardian, the tentative plan is to 
fall back on mother, then father, then emergency contact 1, and finally 
emergency contact 2. We are working on ways to empirically decide which is the 
best order of precedence and also awaiting guidance from some of our local 
experienced study coordinators. If any of y'all have insights into this, please 
speak up.
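
The tentative fallback order above could be expressed as a simple ordered lookup. This is a sketch only: the role keys and the dict-of-contacts shape are hypothetical and do not reflect Epic's actual column names.

```python
# Tentative precedence from the plan: guardian first, then the fallbacks.
PRECEDENCE = (
    "guardian",
    "mother",
    "father",
    "emergency_contact_1",
    "emergency_contact_2",
)


def choose_contact(contacts):
    """Return (role, record) for the first populated role, or None.

    `contacts` maps a role name to a contact record; missing or empty
    entries are skipped.
    """
    for role in PRECEDENCE:
        record = contacts.get(role)
        if record:
            return role, record
    return None
```

Keeping the order in one tuple means it can be changed in a single place once the empirical comparison or the study coordinators settle on a better precedence.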

Update: it appears that the contact address info in the PATIENT table itself is 
the most used, but still investigating where to pull the salutation: PROXY_NAME 
or GUARDIAN_NAME (or some other field). Anybody have any thoughts?

We also use contact information to reduce the number of duplicate mailings. To 
that end, we are interested in any tools for formatting addresses into USPS 
standard, comparing for duplicates, and validating addresses. Our current plan 
is:

  *   Sort the entries in a random order.
  *   Randomly remove all but one record from each household, with a household 
defined as follows:

     *   Treat all case-normalized duplicate email addresses as representing 
the same household.
     *   Remove all non-numeric characters and leading 1s from phone numbers 
then treat all identical matches that result as representing the same household.
     *   Concatenate ADDRESS1 and ADDRESS2 fields, replace all runs of 
whitespace with a single whitespace character, convert everything to lowercase, 
use certain USPS conversion rules (converting to standard abbreviations except 
where this would cause ambiguity), and then treat all identical matches that 
result as representing the same household.

  *   After the high-confidence duplicates are removed, there will remain 
clusters of similar addresses that may or may not be duplicates. We will not 
auto-cull them, but we will flag them in a way that will hopefully make them 
easier for a human to spot. The tentative plan is to:

     *   Take each address (normalized for case, spaces, and abbreviations in 
the previous step) and calculate the Levenshtein distance to each other such 
address.
     *   All addresses with a distance lower than our threshold will be 
assigned the same randomly generated ID in the DUPLICATE_ID column.
     *   We then skip to the next address that doesn't yet have a DUPLICATE_ID 
and repeat the process until we run out of addresses.
     *   Addresses that have no other addresses below the similarity threshold 
will all be assigned a DUPLICATE_ID of 0.

  *   The normalized addresses will not be part of the final output to be 
uploaded into REDCap, but the DUPLICATE_ID field will remain.
  *   In REDCap we will create a report that pulls only entries where 
DUPLICATE_ID != 0 and sorts those entries by DUPLICATE_ID. A study coordinator 
would then glance through these clusters and if they are actually different 
addresses (e.g. adjacent houses, or apartments within the same building, and 
probably more exotic variants), change their DUPLICATE_ID to 0. The remaining 
ones would be deleted except for the first entry.
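
The high-confidence household matching from the first part of the list could look roughly like the sketch below. Assumptions: the EMAIL and PHONE field names are hypothetical (only ADDRESS1 and ADDRESS2 are named in the plan), the abbreviation table is a tiny illustrative subset of the USPS rules, and a record is dropped as soon as any one of its keys has been seen before.

```python
import random
import re

# Illustrative subset of USPS standard abbreviations, not the full rule set.
USPS_ABBREV = {
    "street": "st", "avenue": "ave", "road": "rd", "drive": "dr",
    "boulevard": "blvd", "lane": "ln", "apartment": "apt", "suite": "ste",
}


def email_key(email):
    """Case-normalized email, or None when empty."""
    return (email or "").strip().lower() or None


def phone_key(phone):
    """Digits only, with leading 1s removed, or None when empty."""
    digits = re.sub(r"\D", "", phone or "")
    return digits.lstrip("1") or None


def address_key(address1, address2):
    """Concatenate, collapse whitespace, lowercase, abbreviate."""
    text = f"{address1 or ''} {address2 or ''}"
    text = re.sub(r"\s+", " ", text).strip().lower()
    words = [w.rstrip(".,") for w in text.split()]
    return " ".join(USPS_ABBREV.get(w, w) for w in words) or None


def dedupe_households(records, seed=None):
    """Shuffle, then keep the first record seen for each household key."""
    rng = random.Random(seed)
    shuffled = list(records)
    rng.shuffle(shuffled)
    seen, kept = set(), []
    for rec in shuffled:
        keys = {
            ("email", email_key(rec.get("EMAIL"))),
            ("phone", phone_key(rec.get("PHONE"))),
            ("address", address_key(rec.get("ADDRESS1"), rec.get("ADDRESS2"))),
        }
        keys = {k for k in keys if k[1] is not None}
        is_duplicate = bool(keys & seen)
        seen |= keys  # remember keys of dropped records as well
        if not is_duplicate:
            kept.append(rec)
    return kept
```

Because the records are shuffled first, which member of a household survives is random, as the plan intends. Note that this single pass does not guarantee full transitive grouping (A matches B by phone, B matches C by address); a union-find pass would be needed for that.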
This will not remove all duplicates, only diminish them. The primary goal is 
not avoiding siblings. In fact, siblings living in separate households will 
most likely slip through. This is just a limitation of this study design we 
have to live with. The real reason removing duplicates matters is minimizing 
how many households we irritate with repeat mailings.
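
For the lower-confidence clusters, a pure-Python sketch of the DUPLICATE_ID assignment might look like this. The threshold value and the ID range are placeholders, and one detail goes beyond the plan as written: an address that is close to an already-flagged address joins that existing cluster rather than starting a new one.

```python
import random


def levenshtein(a, b):
    """Edit distance between two strings (row-by-row dynamic programming)."""
    if len(a) < len(b):
        a, b = b, a
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


def assign_duplicate_ids(addresses, threshold=3, seed=None):
    """Return one DUPLICATE_ID per address; 0 means no near neighbours."""
    rng = random.Random(seed)
    ids = [None] * len(addresses)
    used = set()
    for i, addr in enumerate(addresses):
        if ids[i] is not None:
            continue  # already placed in a cluster
        near = [j for j in range(len(addresses))
                if j != i and levenshtein(addr, addresses[j]) < threshold]
        if not near:
            ids[i] = 0
            continue
        existing = [ids[j] for j in near if ids[j]]
        if existing:
            cluster_id = existing[0]  # join a neighbour's cluster
        else:
            cluster_id = rng.randrange(1, 10**9)
            while cluster_id in used:
                cluster_id = rng.randrange(1, 10**9)
            used.add(cluster_id)
        ids[i] = cluster_id
        for j in near:
            if ids[j] is None:
                ids[j] = cluster_id
    return ids
```

This compares every address against every other, so the cost grows quadratically with cohort size; for a very large cohort it may be worth comparing only within the same ZIP code or a similar block.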

I2b2 does not contain these emergency contact fields. Therefore, even if a site 
has an identified i2b2 instance, it will not be useful for extracting contact 
information. We see no practical alternative at this time to pulling these 
fields from the Epic source, then doing the study IDs and duplicate detection 
within a Python script. This script will either output a CSV file ready to 
upload into REDCap to create the tracker or directly create the tracker via the 
REDCap API. We will send this script out to the study sites. Unless anybody has 
any better suggestions?
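
For the CSV route, the output step could be as small as the sketch below. The column names are hypothetical; in practice they would have to match the variable names in the tracker form's data dictionary, with the study-assigned ID as the first column.

```python
import csv
import io


def tracker_csv(rows, fieldnames):
    """Serialize tracker rows to CSV text, keeping only `fieldnames`.

    Keys not listed in `fieldnames` (for example a normalized address used
    only for duplicate detection) are silently dropped from the output.
    """
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fieldnames,
                            extrasaction="ignore", lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


if __name__ == "__main__":
    rows = [{"study_id": "UTHSCSA-22-00007", "duplicate_id": 0,
             "normalized_address": "123 main st apt 4"}]
    print(tracker_csv(rows, ["study_id", "duplicate_id"]))
```

Listing the output columns explicitly keeps the working fields out of the upload while DUPLICATE_ID stays in.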

_______________________________________________
Gpc-dev mailing list
[email protected]
http://listserv.kumc.edu/mailman/listinfo/gpc-dev
