http://208.76.222.114/confluence/display/RandD/General+multi-dimensional+model
On Fri, Sep 25, 2009 at 11:55 AM, Abyot Gizaw <[email protected]> wrote: > The one-to-one relationship mentioned between dataelement and categorycombo > is not correct ! > > The realtionship is one-to-many. A categorycombo can be assigned for many > dataelements. But a dataelement can have only one categorycombo. > > Thank you > Abyot. > > > On Fri, Sep 25, 2009 at 11:44 AM, Jason Pickering < > [email protected]> wrote: > >> Hi there. >> >> My basic issue with the category/category combo is that it appears to be a >> one-to-one relationship with data elements. If I look at the data model, >> there is a one-to-one relationship between dataelement and categorycomboid. >> For a given category combo, you can have multiple options. So, you can >> establish a relationship for a given data element and a group of category >> options. >> >> Let me try and describe the issue. We have a set of data elements related >> to malaria for this example. We would like to be able to pivot the data on >> other dimensions dimensions (Data element, age, disease, patient status). >> Obviously there are other dimensions that are pivotable (orgunit, period, >> dataset) >> >> The data elements look like this. I have put the dimensions in square >> brackets, and the dimensional elements into curly brackets. >> >> [Data element, Age, Disease, Patient status] >> Deaths Confirmed Malaria total (composed of) {All ages, Malaria Cases, >> Deaths} >> Deaths Confirmed Malaria 1 to Under 5 Years {1-5, Malaria Cases, Deaths} >> Deaths Confirmed Malaria Over 5 Years {Over 5, Malaria Cases, Deaths} >> Deaths Confirmed Malaria Under 1 Year {Under 1, Malaria Cases, Deaths} >> IP Discharge Confirmed Malaria total (composed of) {All ages, Malaria >> Cases, IP} >> IP Discharge Confirmed Malaria 1 to Under 5 Years {1-5, Malaria Cases, IP} >> IP Discharge Confirmed Malaria Over 5 Years {Over 5, Malaria Cases, >> Deaths} >> IP Discharge Confirmed Malaria Under 1 Year {Under 1, Malaria Cases, >> Deaths} >> OPD 1st Attendance Confirmed Malaria total (composed of) {All ages, >> Malaria Cases, OPD} >> OPD 1st Attendance Confirmed Malaria 1 to Under 5 Years {1-5, Malaria >> Cases, OPD} >> OPD 1st Attendance Confirmed Malaria Over 5 Years {Over 5, Malaria Cases, >> OPD} >> OPD 1st Attendance Confirmed Malaria Under 1 Year {Under 1, Malaria Cases, >> OPD} >> >> OK, I hope this is pretty clear. Obviously, there are more data elements >> (Typhoid, Yellow fever, etc). I might want to know how many Under 1 deaths I >> have had for all diseases, or how many OPD cases I have had for each >> disease. How can I do this with the existing data model? It is not obvious >> to me because there is no relationship between dimensional elements >> (categoryoptions) to each other. Category options can be related through a >> cateogry combination, but since data elements can only be assigned a single >> category option, the dimensionality is broken once it gets time to pull the >> data into a pivot table. >> >> In the incomplete example that I gave yesterday, I established a >> one-to-many relationship between a data element and a dimension. If I >> understand the current data model, I would have to create a separate >> categorycombo for each of these data elements, and assign this >> categorycombo to the data element. Now, I might be able to unfold the >> dimensions using the categories and categorycombos. I it is not apparent how >> the dimensional elements correspond themselves to a particular dimension, as >> there is no relation for this in the database as I can see it. >> >> As (Johan pointed out a few mails ago, if I understand him correctly) is >> different categorycombo's can be created for individual data elements, and >> assigned to these elements. However, this seems to be 1) incredibly >> inefficient and 2) does not establish any relationship between dimensional >> elements and dimensions. Perhaps it is there, and maybe it has been done in >> SL, but the SQL is not apparent to me at all. >> >> It would appear to me, looking from an SQL perspective, that a >> one-to-many relationship between a data element, a dimension (category) and >> dimensional element (category combo) would be much more effieicnet, and >> highly usable from an SQL perspective. As I mentioned in my mail, I am not >> sure how easy this would be to implement in a procedural language like Java, >> but I assume it should be possible to either do it this way, or rewrite my >> Postgres proprietary query in standard SQL (which there are ways to do with >> ANSI SQL). This would require modification to the data model (similar to >> the table I provided yesterday) and modification to the UI to allow users to >> 1) select a dimension (category) 2) Select a dimensional element for the >> given dimension. This would populate the table with a dataelementid, a >> dimensionid (categoryid) and a dimensional element (cateogryoptionid). >> >> My gut feeling this is exactly the same functionality as has currently >> been implemented for organizational units. Users can define a hierarchy for >> organizational units, and then assign them to >> categories/dimension/organizational group sets, decide whether the groups >> are compulsory and exlusive, and then assign a particular organizational >> unit to a particular group (which is analogous to a dimensional element). >> Organizational group sets define the dimension, and one-to-one assignment of >> an organizational unit to a particular organizational group defines which >> dimensional element the organizational unit is a member of. These dimensions >> can then be used in PivotTable analyses, where the orgunitgroupsets become >> dimensions, and orgunitgroups become dimensional elements.. >> >> I beleive that data elements are no different than organizational units. >> They should be able to be grouped into some sort of hierarchy and pivoted on >> any dimension. Data elements groups establish a one-to-many relationship >> between data elements and a data element group, but there is no concept of >> how data element groups relate to each other. >> I think this is perhaps the same concept you mention, ReportSet. >> >> I suspect we would need to potentially rethink the entire concept of >> multidimensionality if we really wanted to get it right. It would see to me >> that the DHIS datamodel and associated aggregation methods have been >> hardwired into aggregation across time (period) and geography (orgunit). >> What we can do with PivotTables and (and OLAP) is to aggregate across any >> possible dimension, slicing as you mention ,on any dimension . I am not sure >> this will be so simple to implement but I think there is a way to do it, >> without major modifications. >> >> I am not sure it solves the SDMX issue. There are potential issues related >> to "ragged" dimensions and how these get handled. Some data elements might >> have three dimensions, while others may have more. I have not thought about >> this in detail, but know it is an issue with cross-tab queries in SQL. You >> normally have to know how many dimensions you are working with in order to >> perform a cross-tab, but there are dynamic solutions. Perhaps this could be >> dealt with somehow in SDMX. >> >> Anyway ,I am rambling. Hope this mail helps though to push my point >> further. Once I get the SQL from SL, I will see if perhaps it has been done >> already, and that I am just writing long emails for nothing. :) >> >> Regards, >> Jason >> >> >> >> >> >> On Fri, Sep 25, 2009 at 10:44 AM, Bob Jolliffe <[email protected]> >> wrote: >> > Hi Jason and Johan >> > >> > I'm really pleased to see you having this discussion as I have been >> > grappling with a similar issue which involves unravelling categories, >> > category options and combos into something more familiar. I have >> reached >> > similar conclusions regarding nomenclature: >> > >> > category = dimension >> > categorycombo - I have been calling a dimension set (it bears a strong, >> and >> > useful, resemblance to xslt:attribute-set) >> > category option - I like your suggestion of DimensionalElement. I am >> going >> > to start calling it that too. >> > >> > In my case I need to export (and import data) into a standard format >> called >> > sdmx. So whereas in the DHIS2 native DXF we export datavalues with >> > effectively three dimensions (source, period, categorycombooption) the >> last >> > dimension is a sort of uber-dimension. Like a peppercorn or a cardamon >> > seed, when you break it open it explodes its rich complexity of >> dimensions. >> > >> > In sdmx we need the dimensions exploded. So data values look like: >> > >> > <dataset> >> > <datavalue name="TB test given" uid="44344 ...44" gender="Male" >> age="0-5" >> > value="32" /> >> > <datavalue name="TB test given" uid="44344 ...44" gender="Female" >> > age="0-5" value="38" /> >> > .. >> > </dataset> >> > >> > My approach to unpicking the dimensions from the dxf file is to >> transform it >> > with an xslt transformation which is still incomplete but seems to work >> > well. >> > >> > One other nomenclatures issue which has surfaced as a result is what we >> call >> > a "dataset". In DHIS2, if I understand correctly, a dataset corresponds >> > roughly to all the dataelements which might occur on a datacollection >> form. >> > If we view all dataelements as having just the three "dimensions" then >> all >> > is well, but if we explode the actual dimensions then we have an issue. >> In >> > the sdmx model a dataset consists only of dataelements with the same >> > dimensionset. After discussing this with Ola we have reached the >> conclusion >> > that we need another level of grouping, primarily for the UI - eg >> FormSet or >> > ReportSet which allows us to group related datasets. But that is an >> aside >> > from what you are talking about. >> > >> > I know that you guys can do magic with sql, but it seems that we should >> try >> > to capture some of this and place it down in the datamodel API. It >> occurs >> > to me that for a multidimensional dataelement we might benefit from some >> > utility methods to retrieve slices and dices which might assist in >> > constructing the pivot tables around dimensions. Does this sound like >> the >> > right thing to do. >> > >> > Regards >> > Bob >> > >> > 2009/9/24 <[email protected]> >> >> >> >> Jason, >> >> I will leave to others to comment the code, but I have a few >> comments... >> >> >> >> > I have done a bit more thinking on this, and would like to offer some >> >> > more examples up for discussion. >> >> > >> >> > Basically, we have a lot of data elements that are somehow related to >> >> > each other, similar to my kooky example in my original mail. I assume >> >> > this is fairly common throughout other HMIS systems. Here, malaria >> >> > attendance is broken down into various dimensions/category by patient >> >> > type (outpatient, inpatient, and deaths) and by age (under 1 ,1-5 and >> >> > over 5). But say you want to be able to pivot to look at outpatient, >> >> > inpatient and deaths totals (i.e. summed up by age). Well, you could >> >> > create a separate data element for this, but it sure would be nice to >> >> > be able to Pivot the data somehow. >> >> >> >> In the Sierra Leone db, Edem and Romain set up views that pulled the >> >> categories through into a "Category" pivot field, which you can then >> use >> >> to get what you want. Simply tick the categories (see below) you want >> to >> >> see, and group them together in excel. Maybe Edem and Romain can help >> >> further here. >> >> >> >> >> >> > Dimension ? Category >> >> > Dimensional element ? Category option ? Category combo ( I think) >> >> >> >> The right symbol disappeared from my reply-mail here, but some >> >> clarification: >> >> >> >> Crosstab Dimension (age AND gender) = Category combo >> >> Dimension (age, gender) = Category >> >> Dimensional element (inpatient, outpatient, death, under1, 1-5, and >> over >> >> 5) = Category option >> >> >> >> So by assigning a DE the category combo of "gender_age", you get 9 >> >> dimensional elements, 3 category options (in category age) by 3 >> category >> >> options (in category gender) >> >> >> >> Johan >> >> >> >> >> >> >> >> >> >> > Anyway, here is the helper table I created. >> >> > >> >> > CREATE TABLE test_dataelementcategorycombo >> >> > ( >> >> > test_dataelementid integer NOT NULL, >> >> > test_dataelementcategoryid integer NOT NULL, >> >> > test_dataelementcategorycomboid integer NOT NULL, >> >> > CONSTRAINT pk_testdataelementcategory PRIMARY KEY >> >> > (test_dataelementid, test_dataelementcategoryid, >> >> > test_dataelementcategorycomboid) >> >> > ) >> >> > WITH (OIDS=FALSE); >> >> > >> >> > So this is a real simple table which references a data element, a >> >> > data element category, and a data element combo. The reference to a >> >> > data element category may be redundant, but anyway, lets leave it in >> >> > for now. >> >> > >> >> > I populated the table with some data, which will be used to assign >> >> > dimensions to data elements. It looks like this in my DB, which looks >> >> > like this. >> >> > >> >> > 309;25250;25251 >> >> > 309;25257;25255 >> >> > 348;25250;25252 >> >> > 348;25257;25255 >> >> > 455;25250;25253 >> >> > 455;25257;25255 >> >> > >> >> > but of course this is meaningless to you. What do these values >> >> > correspond >> >> > to? >> >> > >> >> > "OPD 1st Attendance Clinical Case of Malaria Under 1 >> Year";"Age";"Under >> >> > 1" >> >> > "OPD 1st Attendance Clinical Case of Malaria 1 to Under 5 >> >> > Years";"Age";"Age 1-5" >> >> > "OPD 1st Attendance Clinical Case of Malaria Over 5 >> Years";"Age";"Over >> >> > 5" >> >> > "OPD 1st Attendance Clinical Case of Malaria Under 1 Year";"Patient >> >> > status";"OPD" >> >> > "OPD 1st Attendance Clinical Case of Malaria 1 to Under 5 >> >> > Years";"Patient status";"OPD" >> >> > "OPD 1st Attendance Clinical Case of Malaria Over 5 Years";"Patient >> >> > status";"OPD" >> >> > >> >> > which can be produced by the following view. >> >> > >> >> > CREATE OR REPLACE VIEW vw_dataelements_dimensions AS >> >> > SELECT dataelement.name, dataelementcategory.name AS dimension, >> >> > dataelementcategoryoption.name AS dimension_element >> >> > FROM dataelement >> >> > JOIN test_dataelementcategorycombo ON >> >> > test_dataelementcategorycombo.test_dataelementid = >> >> > dataelement.dataelementid >> >> > JOIN dataelementcategory ON dataelementcategory.categoryid = >> >> > test_dataelementcategorycombo.test_dataelementcategoryid >> >> > JOIN dataelementcategoryoption ON >> >> > test_dataelementcategorycombo.test_dataelementcategorycomboid = >> >> > dataelementcategoryoption.categoryoptionid; >> >> > >> >> > So, that view just provides a human readable view of those integers >> >> > that I populated in the the test_dataelementcategorycombo table I >> >> > created above. This table just assigns particular data elements to >> >> > different category options (dimensional elements). >> >> > >> >> > OK, so far so good, but the problem now is, how to use this with the >> >> > aggregatedatavalue table? If we try and join this table directly, we >> >> > will have issues with duplicates in the pivot table, so we need to >> >> > transform the data slightly. >> >> > >> >> > This should do the trick. >> >> > >> >> > SELECT * FROM crosstab >> >> > ( >> >> > 'SELECT name, dimension, dimension_element FROM >> >> > vw_dataelements_dimensions ORDER BY 1,2,3', >> >> > 'SELECT DISTINCT dimension from vw_dataelements_dimensions ORDER BY 1 >> >> > ASC' >> >> > ) >> >> > as >> >> > ( >> >> > name character varying(230), >> >> > age character varying(160), >> >> > status character varying(160) >> >> > ); >> >> > >> >> > >> >> > which returns this record set >> >> > >> >> > "OPD 1st Attendance Clinical Case of Malaria 1 to Under 5 Years";"Age >> >> > 1-5";"OPD" >> >> > "OPD 1st Attendance Clinical Case of Malaria Over 5 Years";"Over >> >> > 5";"OPD" >> >> > "OPD 1st Attendance Clinical Case of Malaria Under 1 Year";"Under >> >> > 1";"OPD" >> >> > >> >> > >> >> > OK, admittedly, I cheated a bit and used the crosstab function of >> >> > Postgresql, but I assume that this query could be rewritten with a >> few >> >> > more lines of code in standard SQL or some procedural language like >> >> > Java. Now, this record set looks like something that I can almost use >> >> > with the aggregateddatavalue table simply by joining up the table on >> >> > the appropriate dataelementid and pulling everything into a pivot >> >> > table. I would not have any duplicated values and would have columns >> >> > like data element name, period, orgunit, age, patient status and of >> >> > course the value of the data element. I hope that part is pretty >> >> > clear. Just join up that table to the aggregateddata table, and you >> >> > have pretty much what is needed to pull the data directly into a >> >> > PivotTable for further analysis. >> >> > >> >> > This is not a complete example, but it is very close to what I need >> >> > here ,and I think this type of functionality would be much more >> useful >> >> > than the current data element categories functionality. Basically, >> all >> >> > that would be required, at least initially, would be another user >> >> > interface screen to allow the definition of which category(ies) and >> >> > category options a data element is a member of. The rest could ,in >> the >> >> > first instance be executed with custom SQL (obviously, I am partial >> to >> >> > this language and hobbled by the fact that I do not know Java), but >> >> > eventually this would need to be implemented somehow in Java. >> >> > >> >> > I am not sure if this really solves all of the issues surrounding >> >> > multidimensional analysis of data elements, but it seems to solve the >> >> > issues that I am having by trying to assign some sort of dimensional >> >> > hierarchy to data elements (similar to the exclusive/compulsory >> >> > functionality of orgunits). Any thoughts on this? >> >> > >> >> > Best regards, >> >> > Jason >> >> > >> >> > >> >> > >> >> > >> >> > On Wed, Sep 16, 2009 at 10:28 PM, Jason Pickering >> >> > <[email protected]> wrote: >> >> >> >> >> >> >> >> >> On Wed, Sep 16, 2009 at 10:13 PM, <[email protected]> wrote: >> >> >>> >> >> >>> >> However, there does seem to be the ability to assign dimensions, >> >> >>> there >> >> >>> >> does >> >> >>> >> not seem to be the ability to assign particular elements within >> >> >>> those >> >> >>> >> dimensions to a particular DHIS data element. >> >> >>> >> >> >>> >> >> >>> Just some more clarification here: you can make category combos >> which >> >> >>> you >> >> >>> assign to data elements. However, it is not possible to assign just >> >> >>> specific parts of a category combo (only some of the category >> options) >> >> >>> to >> >> >>> a data element. >> >> >> >> >> >> Yes, this was exactly what I wanted. Assigning different categories >> >> >> would >> >> >> seem to break the dimensionality. >> >> >> >> >> >>> >> >> >>> Then you must make a specific category (as the only one in >> >> >>> or part of a new category combo) with just those options. It can be >> >> >>> hell; >> >> >>> in Tajikistan there were way over 20 categories I think, at least >> 10 >> >> >>> just >> >> >>> on various age groups. >> >> >>> >> >> >>> Johan >> >> >>> >> >> >> >> >> >> This was my fear. >> >> >> >> >> >> I will need to do some testing and see. I still fear it is not >> exactly >> >> >> the >> >> >> intended functionality. >> >> >> >> >> >> Basically, I think I need something akin to the exclusive/compulsory >> >> >> groups >> >> >> that are in place for organizational units, but instead, for >> arbitrary >> >> >> dimensions. I will give a try and see what happens. >> >> >> >> >> >> Thanks, >> >> >> Jason >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> > >> >> >> >> >> >> >> >> _______________________________________________ >> >> Mailing list: >> >> https://launchpad.net/~dhis2-devs<https://launchpad.net/%7Edhis2-devs> >> >> Post to : [email protected] >> >> Unsubscribe : >> >> https://launchpad.net/~dhis2-devs<https://launchpad.net/%7Edhis2-devs> >> >> More help : https://help.launchpad.net/ListHelp >> > >> > >> >> >> _______________________________________________ >> Mailing list: >> https://launchpad.net/~dhis2-devs<https://launchpad.net/%7Edhis2-devs> >> Post to : [email protected] >> Unsubscribe : >> https://launchpad.net/~dhis2-devs<https://launchpad.net/%7Edhis2-devs> >> More help : https://help.launchpad.net/ListHelp >> >> > > _______________________________________________ > Mailing list: https://launchpad.net/~dhis2-devs > Post to : [email protected] > Unsubscribe : https://launchpad.net/~dhis2-devs > More help : https://help.launchpad.net/ListHelp > > -- Cheers, Knut Staring
_______________________________________________ Mailing list: https://launchpad.net/~dhis2-devs Post to : [email protected] Unsubscribe : https://launchpad.net/~dhis2-devs More help : https://help.launchpad.net/ListHelp

