Hi all, We have an big issue with several of Rwanda's event reporting systems that have been set up using Tracker/Individual records. For some reason, users are able to create duplicate records in the system through normal data entry. It seems to happen when they enter a value for an attribute and then a couple of seconds or milliseconds later they change that value. The system then stores both records with last updated several milliseconds apart.
This is a big issue because analytics fails to run when it finds these duplicates. Here is the query that finds the duplicates: select programstageinstanceid, dataelementname, count(*) as numrecs from ( SELECT program.name AS programname, program.description, programinstance.dateofincident, programinstance.trackedentityinstanceid, programinstance.status, programstageinstance.programstageinstanceid, programstageinstance.programinstanceid, programstageinstance.programstageid, trackedentitydatavalue.value, dataelement.name AS dataelementname FROM public.programstageinstance, public.programinstance, public.program, public.trackedentitydatavalue, public.dataelement WHERE programstageinstance.programinstanceid = programinstance.programinstanceid AND program.programid = programinstance.programid AND trackedentitydatavalue.programstageinstanceid = programstageinstance.programstageinstanceid AND dataelement.dataelementid = trackedentitydatavalue.dataelementid ) as foo group by programstageinstanceid, dataelementname having count(*) >1 order by numrecs desc; Here is the typical output of the query when there are duplicate records: Programstageinstanceid, dataelementname,numrecs 777921;"TB Sample type_morning Sputum";2 776886;"TB Date of collection_sample";2 773132;"TB History of contact with TPB+ case";2 Here is an example of the duplicates we see when we select the duplicated records from the trackedentitydatavalue table. prograstageinstanceid, dataelementid,value,lastupdated,providedelsewhere,user 773132;35105;"true";"2015-10-02 09:32:48.334";FALSE;"abibarimana" 773132;35105;"true";"2015-10-02 09:32:48.508";FALSE;"abibarimana" 776886;31879;"2015-07-17";"2015-10-07 16:25:26.341";FALSE;"mugeni" 776886;31879;"2015-07-14";"2015-10-07 16:25:26.546";FALSE;"mugeni" 777921;35106;"true";"2015-10-08 09:55:36.133";FALSE;"emufasha" 777921;35106;"true";"2015-10-08 09:55:36.278";FALSE;"emufasha" In order to clean up the trackedentitydatavalue table, I need to write these records to a temporary table, delete them from trackedentitydatavalue, then decide which of the duplicate pairs to delete before copying the correct ones back into trackedentitydatavalues. This error seems to happen almost daily when the systems are in full use. Is there a way we can put a constraint on the trackedentitydatavalue table so that it will not accept duplicate values for programstageinstanceid and dataelementid? I also think there must be an error in the java code that should be updating the first saved record with the one entered by the user later rather than inserting a new record. This is a major bug because it causes analytics to fail and when it fails we cannot run any event reports to get data out of the system. -- *Randy Wilson* *Team Leader: **Knowledge Management, Data Use and Research* Rwanda Health System Strengthening Activity Management Sciences for Health Rwanda-Kigali Direct: +250 788308835 E-mail: [email protected] Skype: wilsonrandy_us <http://www.msh.org/> Stronger health systems. Greater health impact. <https://www.facebook.com/ManagementSciencesForHealth> <https://twitter.com/MSHHealthImpact> <https://www.youtube.com/user/MSHHealthImpact> www.msh.org -- *This message and its attachments are confidential and solely for the intended recipients. If received in error, please delete them and notify the sender via reply e-mail immediately.*
_______________________________________________ Mailing list: https://launchpad.net/~dhis2-users Post to : [email protected] Unsubscribe : https://launchpad.net/~dhis2-users More help : https://help.launchpad.net/ListHelp

