Hi, << I worked out a design and got it pretty much working. attachment:ssdmf_sharing_spec.py is supposed to explain how it works on your end. Would you please see if it makes sense to you? >>
SUMMARY

It makes sense to me. I believe this requires agreement on a canonical format for the values used in the 'one-way-hash-value comparison' of SSN, date of birth, etc.

I believe the canonical forms would be:
(a) SSN as an exactly 9-digit numeric string (i.e. no dashes)
(b) Date of birth as MMDDYYYY <== Yes? The date format used before the hash is of great importance.
(c) Names in uppercase (it appears the dmf_hash function enforces this)

DETAILS

Check my understanding ==> The distributed table would be a transformation of the death index file in which the SSN, date of birth, first name, and last name are de-identified versions. The de-identification is a one-way hash that cannot be converted back to the original data. Yes?

This implies that consumers of this data would have to compute the same one-way-hash value of the SSN or other values that they wish to compare to the de-identified death index. For example, if the hash value of the SSN at your site matches the hash value in the de-identified table, then this is the same original SSN. Yes?

My take ==> This works as long as we use canonical forms of the values we wish to compare BEFORE hashing. Specifically:
(a) SSN as an exactly 9-digit numeric string (i.e. no dashes)
(b) Date of birth as MMDDYYYY <== Yes? The date format used before the hash is of great importance.
(c) Names in uppercase (it appears the dmf_hash function enforces this)

Jay Pedersen, M.A.
Department of Pathology/Microbiology
University of Nebraska Medical Center
985900 Nebraska Medical Center
Omaha NE 68198-5900
402-559-9487 (office)
402-739-3496 (mobile)
________________________________
From: GPC Informatics <d...@madmode.com>
Sent: Thursday, February 14, 2019 1:55 PM
To: dconno...@kumc.edu; Narayana, Yeshwanth R; Campbell, James R
Cc: dbud...@kumc.edu; rwait...@kumc.edu; Pedersen, Jay G
Subject: Re: [gpc-informatics] #688: National Death data feed unavailable due to new site certification requirements

#688: National Death data feed unavailable due to new site certification requirements
--------------------------+----------------------------
 Reporter:  dconnolly     |       Owner:  ynarayana
     Type:  problem       |      Status:  assigned
 Priority:  major         |   Milestone:  pcornet-cdm-6
Component:  data-sharing  |  Resolution:
 Keywords:                |  Blocked By:
 Blocking:  377           |
--------------------------+----------------------------
Changes (by dconnolly):

 * cc: tmcmahon (removed)
 * cc: ynarayana, jay.pedersen (added)
 * owner:  dconnolly => ynarayana

Comment:

 UNMC folks,

 I worked out a design and got it pretty much working.
 attachment:ssdmf_sharing_spec.py is supposed to explain how it works on
 your end. Would you please see if it makes sense to you?

--
Ticket URL: <http://informatics.gpcnetwork.org/trac/Project/ticket/688#comment:9>
gpc-informatics <http://informatics.gpcnetwork.org/>
Greater Plains Network - Informatics

The information in this e-mail may be privileged and confidential, intended only for the use of the addressee(s) above. Any unauthorized use or disclosure of this information is prohibited. If you have received this e-mail by mistake, please delete it and immediately contact the sender.
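
P.S. To make the canonical-form question in the reply above concrete, here is a minimal sketch of what a consuming site might do before comparing hashes. Everything in it is an assumption for illustration: the helper names, the use of HMAC-SHA256, and SHARED_KEY are hypothetical and are not taken from ssdmf_sharing_spec.py or dmf_hash, and the MMDDYYYY date format is the proposed form, not a confirmed one.

```python
import hashlib
import hmac
import re
from datetime import date

# Hypothetical shared secret. The real scheme may use a different key,
# a salt, or a different hash function entirely.
SHARED_KEY = b"example-key-agreed-out-of-band"


def canonical_ssn(raw: str) -> str:
    """(a) Strip every non-digit and require exactly 9 digits."""
    digits = re.sub(r"\D", "", raw)
    if len(digits) != 9:
        raise ValueError("SSN must contain exactly 9 digits")
    return digits


def canonical_dob(d: date) -> str:
    """(b) Format as MMDDYYYY (proposed format, pending confirmation)."""
    return f"{d.month:02d}{d.day:02d}{d.year:04d}"


def canonical_name(raw: str) -> str:
    """(c) Trim surrounding whitespace and uppercase."""
    return raw.strip().upper()


def one_way_hash(canonical: str) -> str:
    """Keyed one-way hash of an already-canonical value (assumed HMAC-SHA256)."""
    return hmac.new(SHARED_KEY, canonical.encode("utf-8"), hashlib.sha256).hexdigest()


# Two differently formatted inputs for the same SSN hash to the same value
# only because both are canonicalized first.
site_hash = one_way_hash(canonical_ssn("123-45-6789"))
index_hash = one_way_hash(canonical_ssn("123 45 6789"))
same_person_candidate = site_hash == index_hash
```

The point of the sketch is only that both sides must apply the identical canonicalization before hashing; if one side hashed "02/14/1950" and the other "02141950", the hashes would never match even for the same person.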