Hi Jason, 

This is VERY helpful. It’s clear you’ve thought a lot about these issues. 
Thanks for sharing these ideas, including the link to your validation example.

Our exchange has very quickly helped me to get a sense of how to get started 
and where the trouble spots might be. Thanks for all of your help.

Eric

On November 24, 2015 at 3:45:13 AM, Jason Pickering 
(jason.p.picker...@gmail.com) wrote:

Hi Eric,

Indicators in DHIS2 are defined through metadata, so there is no standard 
reference for how they are constructed. If you are going to aggregate these 
yourself, then yes, you would need to pull out all of the component data 
elements, reconstruct the indicator in R, and then perform the aggregation. 
You can see an example of an indicator here:

https://play.dhis2.org/demo/api/indicators/ReUHfIn0pTQ

The numerators and denominators are described by the following snippet of 
metadata...

<denominator>#{fbfJHSPpUQD.pq2XI5kz2BY}+#{fbfJHSPpUQD.PT59n8BQbqM}</denominator>
<numerator>#{fbfJHSPpUQD.pq2XI5kz2BY}+#{fbfJHSPpUQD.PT59n8BQbqM}-#{Jtf34kNZhzP.pq2XI5kz2BY}-#{Jtf34kNZhzP.PT59n8BQbqM}</numerator>

Within each #{...} pair, the first UID identifies the data element and the 
second identifies the particular disaggregation (category option combination 
in DHIS2-ese).
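
As a rough illustration of the UID pairs described above, here is a minimal 
sketch in base R that pulls the operands out of an expression string. The 
function name parse_operands is hypothetical, and the sketch only handles 
operands of the #{dataElement.categoryOptionCombo} form shown in this snippet.

```r
# Hypothetical helper: list the operands of a DHIS2 indicator expression.
parse_operands <- function(expr) {
  # Find every #{dataElement.categoryOptionCombo} token in the expression.
  pattern <- "#\\{[A-Za-z0-9]+\\.[A-Za-z0-9]+\\}"
  tokens <- regmatches(expr, gregexpr(pattern, expr))[[1]]
  # Drop the leading "#{" and the trailing "}", then split on the dot.
  inner <- substr(tokens, 3, nchar(tokens) - 1)
  parts <- strsplit(inner, ".", fixed = TRUE)
  data.frame(
    dataElement = vapply(parts, function(p) p[1], character(1)),
    categoryOptionCombo = vapply(parts, function(p) p[2], character(1)),
    stringsAsFactors = FALSE
  )
}

denominator <- "#{fbfJHSPpUQD.pq2XI5kz2BY}+#{fbfJHSPpUQD.PT59n8BQbqM}"
ops <- parse_operands(denominator)
print(ops)
```

The resulting table has one row per operand, which could then be used to 
decide which data element values to request.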

Other metadata components also go into calculating the indicator, such as the 
annualization factor. Reconstructing the aggregation engine of DHIS2 would 
probably not be trivial, but I describe some approaches here which could 
probably also be applied to indicators. In the case shown there, I take the 
metadata from DHIS2 and use it to evaluate validation rules outside of the 
system, in R. Since indicators and validation rules share the same expression 
syntax, it would seem feasible (if non-trivial) to do the same with 
indicators.

In terms of weighting, the important thing to keep in mind about indicators in 
DHIS2 is that the numerators and denominators are each aggregated first, and 
then divided. Thus, you effectively end up with a weighted average. The other 
approach would be to calculate the ratio for each numerator/denominator pair 
separately, and then take the (unweighted) mean of those ratios.
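
To make the difference concrete, here is a small sketch with made-up numbers 
for three hypothetical facilities. It ignores the annualization factor and 
any other indicator metadata, so it only illustrates the order of operations, 
not the full DHIS2 calculation.

```r
# Made-up numerator/denominator pairs for three hypothetical facilities.
num <- c(5, 40, 90)
den <- c(10, 100, 100)

# Aggregate first, then divide (the approach described for DHIS2).
weighted <- sum(num) / sum(den)

# Divide first, then take the plain mean of the ratios.
unweighted <- mean(num / den)

# The first result equals a mean of the ratios weighted by the denominators.
check <- weighted.mean(num / den, w = den)

print(c(weighted = weighted, unweighted = unweighted, check = check))
```

The two results differ whenever the denominators are unequal, which is why 
aggregating already-computed percentages will generally not reproduce the 
DHIS2 figure.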

Regarding the comment on line 60: the way you have the code at the moment, 
there is no guarantee that "indID <- indicators$id[indicators$name==ind[i]]" 
will return anything. An NA could result if there is no match. More generally, 
depending on your API call, NAs/NULLs are possible. The analytics resources 
should not return any NULLs/NAs, but they could return blank values. Best to 
check, just to be sure.
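
One way such a check could look is sketched below. The indicators table here 
is a made-up stand-in (the names "Indicator A" and "Indicator B" and the 
second id are invented for illustration), and lookup_indicator_id is a 
hypothetical helper, not part of the original script.

```r
# Made-up stand-in for the indicator metadata table.
indicators <- data.frame(
  id = c("ReUHfIn0pTQ", "hypotheticl"),
  name = c("Indicator A", "Indicator B"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: return exactly one id, or stop with a clear message.
lookup_indicator_id <- function(indicators, wanted) {
  keep <- !is.na(indicators$name) & indicators$name == wanted
  hits <- indicators$id[keep]
  if (length(hits) == 0) stop("No indicator named '", wanted, "'")
  if (length(hits) > 1) stop("Duplicate indicators named '", wanted, "'")
  if (is.na(hits)) stop("Indicator '", wanted, "' has a missing id")
  hits
}

ind_id <- lookup_indicator_id(indicators, "Indicator A")
url_part <- paste0("dimension=dx:", ind_id)
print(url_part)
```

Failing early like this avoids building a URL segment from an empty, missing, 
or ambiguous id.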

Best regards,
Jason




On Tue, Nov 24, 2015 at 3:44 AM, Eric Green <epgr...@gmail.com> wrote:
Hi Alex and Jason, 

Thanks for sharing these ideas. I was able to get the reference table I wanted. 
Much appreciated.

Jason, your points about server stress are good. In my use case queries will be 
small in scope and infrequent, but it’s a good point to remember.

I was not aware of the weighting issue (new to dhis2 and APIs!), but it makes 
sense. I would need to switch to data elements, right? Could anyone point me to 
good resources for finding out how specific indicators are constructed (and 
weighted)? Is there a standard reference?

Jason, in your revised code (thanks!), could you clarify what you mean by 
"#Needs to be checked against NAs and duplicates" in line 60? This step is just 
creating the segment of the URL that specifies the indicator, e.g., 
"dimension=dx:ReUHfIn0pTQ". Are you saying more generally that resulting 
datasets for indicators need to be checked for NAs and duplicates? I think I’m 
missing something here.

Thanks again.

Eric

On November 23, 2015 at 10:36:26 AM, Jason Pickering 
(jason.p.picker...@gmail.com) wrote:

Hi Eric,

Nice to see someone else looking to use R and DHIS2. :)

Another way of getting the orgunit hierarchy is with something like this:

https://play.dhis2.org/demo/api/organisationUnits?fields=uid,parent[id],name,level,path

Once you have the parent ID you can generate the entire tree structure. The 
"path" field also gives the full chain of ancestors for a given orgunit. Once 
you have either of these, it should be possible to generate the hierarchical 
structure pretty easily in R, I think (although I have not written the code to 
do it!).
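
A rough sketch of what this could look like using the "path" field is below. 
It assumes that each path is a string of ancestor ids separated by "/" with a 
leading "/", and it uses a made-up three-row listing rather than a real API 
response; the ids and names are placeholders.

```r
# Made-up org unit listing; ids, names and the path layout are assumptions.
ou <- data.frame(
  id = c("country01", "county002", "facility5"),
  name = c("Country 1", "County 2", "Facility 5"),
  path = c("/country01", "/country01/county002",
           "/country01/county002/facility5"),
  stringsAsFactors = FALSE
)

# Split each path into ancestor ids, dropping the empty piece before "/".
ancestors <- lapply(strsplit(ou$path, "/", fixed = TRUE),
                    function(p) p[p != ""])

# Pad shorter paths with NA so every row has one column per level.
max_level <- max(lengths(ancestors))
pad <- function(p) c(p, rep(NA_character_, max_level - length(p)))
levels_mat <- t(vapply(ancestors, pad, character(max_level)))
colnames(levels_mat) <- paste0("level", seq_len(max_level))

hierarchy <- cbind(ou[c("id", "name")], levels_mat, stringsAsFactors = FALSE)
print(hierarchy)
```

Each row then carries the id of its ancestor at every level, so selecting all 
units under a given county becomes a simple filter on the level2 column.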

I think your approach will work, but in general, the API can aggregate the data 
for you (depending on how you would like to aggregate it). Otherwise, if you 
loop over many requests to the server, it could be a lot of data, and could 
potentially put the server under stress (depending on the level of usage, of 
course). In general, I think it makes sense to ask only for what you need, if 
that is possible and supported by the API. This will run a lot quicker (on the 
server and in R). All of this depends, of course, on the scale of what you are 
asking for, and on whether you need to perform some type of filtering (such as 
outliers, bad data, etc.) prior to aggregation, which the server may not 
perform.

 
Also, be aware that when getting indicators from DHIS2, you do not get the 
data values which compose the indicators. Thus, any aggregation which you 
perform yourself would likely differ significantly from the DHIS2 result, 
because DHIS2 aggregates with a weighted average, whereas you could only 
compute an unweighted average (since you are getting the percentages here 
rather than both the numerator and denominator).

I hacked your example a bit to make it a bit quicker. You can test the output 
on RFiddle here.

Hope this helps to get you started.

Regards,
Jason





On Mon, Nov 23, 2015 at 3:46 PM, Alex Tumwesigye <atumwesi...@gmail.com> wrote:
Dear Eric,

Something like this should help to generate the metadata:
http://YOUR_URL/api/organisationUnits.json?paging=false&fields=id,name,parent[id,name,parent[id,name,parent[id,name]]]&filter=level:EQ:5


The above will generate the orgunit hierarchy from level 5 (lowest level) up 
to level 2. Note how I nest parent[id,name].

Alex

On Mon, Nov 23, 2015 at 5:35 PM, Eric Green <epgr...@gmail.com> wrote:
I had a side conversation with Jason Pickering about using R to access the web 
API, and I’m moving the conversation to the mailing list to document it for 
others.

I asked Jason for guidance on modifying the API url to import data into R. 
Prior to contacting Jason, I reviewed this documentation and his presentation 
on R/DHIS2 integration (great stuff!). Jason was nice enough to create this 
example that showed me how to use the pivot table app, copy the API url using 
Firefox/Chrome developer tools, and use the pre-filled URL in R as a template.

I wanted to do more with organization units, so I modified Jason’s example 
here: https://gist.github.com/ericpgreen/bb7fcb55efd8c93d3451.

I might not be approaching the problem the right way, but my general approach 
is to define a set of periods (monthly) and organizational units and then loop 
over a set of indicators to create a data frame for each indicator that has 
values by unit (row) and period (column). Then in R (not shown), I will 
transform each data frame from wide to long and then combine the data frames 
for each indicator into a larger data frame for analysis. 
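
The wide-to-long step described above could be sketched in base R roughly as 
follows. The helper name wide_to_long, the indicator labels, and all values 
are made up for illustration; the real data frames come from the gist.

```r
# Hypothetical helper: turn one unit-by-period data frame into long format.
wide_to_long <- function(wide, indicator) {
  data.frame(
    indicator = indicator,
    unit = rep(rownames(wide), times = ncol(wide)),
    period = rep(colnames(wide), each = nrow(wide)),
    value = as.vector(as.matrix(wide)),
    stringsAsFactors = FALSE
  )
}

# Made-up values: two indicators, two units, two monthly periods.
units <- c("facility1", "facility2")
ind_a <- data.frame(`201501` = c(10, 20), `201502` = c(11, 21),
                    row.names = units, check.names = FALSE)
ind_b <- data.frame(`201501` = c(0.5, 0.7), `201502` = c(0.6, 0.8),
                    row.names = units, check.names = FALSE)

# Convert each data frame and stack the results into one long data frame.
wide_list <- list(indicatorA = ind_a, indicatorB = ind_b)
long <- do.call(rbind, Map(wide_to_long, wide_list, names(wide_list)))
rownames(long) <- NULL
print(long)
```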

I would like to have the data at the lowest level possible so I can later 
aggregate at higher organization unit levels (e.g., counties) and periods 
(e.g., years) as needed. I know I could just request these aggregations via the 
API, but I am accustomed to working with datasets at the lowest level and doing 
manipulations in my code so I can follow the process more closely (I’m new to 
APIs).

My current question is how to obtain the metadata that indicates the 
organizational hierarchy of units. When I define urlD in my code, I’d like to 
automatically grab all facility OUs where county==2, for instance. I know I 
could do this if I had something like the following table. Right now I specify 
each OU manually. Having this table would allow me to build the API url 
programmatically. 

Also, in the data frame that is created, I only know that an observation is 
linked to facility 5, for instance, but I don’t have the metadata to show that 
facility 5 is in sub county 3 which is in county 2 of country 1. So having this 
table would let me aggregate on my end later.
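
For illustration, aggregating with such a lookup table might look like the 
sketch below. The table contents and values are made up, and summing is only 
appropriate for raw counts, not for percentages or other ratio indicators.

```r
# Made-up lookup table linking each facility to its county.
lookup <- data.frame(
  facility = c("facility1", "facility2", "facility3"),
  county = c("county1", "county1", "county2"),
  stringsAsFactors = FALSE
)

# Made-up long-format values at the facility level.
values <- data.frame(
  facility = c("facility1", "facility2", "facility3", "facility1"),
  period = c("201501", "201501", "201501", "201502"),
  value = c(10, 20, 5, 12),
  stringsAsFactors = FALSE
)

# Attach the county to each observation, then sum by county and period.
merged <- merge(values, lookup, by = "facility")
by_county <- aggregate(value ~ county + period, data = merged, FUN = sum)
print(by_county)
```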

Of course suggestions on improving my general approach are also welcome!!







_______________________________________________
Mailing list: https://launchpad.net/~dhis2-users
Post to     : dhis2-users@lists.launchpad.net
Unsubscribe : https://launchpad.net/~dhis2-users
More help   : https://help.launchpad.net/ListHelp




--
Alex Tumwesigye

Technical Advisor - DHIS2 (Consultant),
Ministry of Health/AFENET
Kampala
Uganda

IT Consultant - BarefootPower Uganda Ltd, SmartSolar, Kenya

IT Specialist (Servers, Networks and Security, Health Information Systems - 
DHIS2 ) & Solar Consultant

+256 774149 775, + 256 759 800161

"I don't want to be anything other than what I have been - one tree hill "





--
Jason P. Pickering
email: jason.p.picker...@gmail.com
tel:+46764147049


