The one-to-one relationship mentioned between dataelement and categorycombo is not correct !
The realtionship is one-to-many. A categorycombo can be assigned for many dataelements. But a dataelement can have only one categorycombo. Thank you Abyot. On Fri, Sep 25, 2009 at 11:44 AM, Jason Pickering < [email protected]> wrote: > Hi there. > > My basic issue with the category/category combo is that it appears to be a > one-to-one relationship with data elements. If I look at the data model, > there is a one-to-one relationship between dataelement and categorycomboid. > For a given category combo, you can have multiple options. So, you can > establish a relationship for a given data element and a group of category > options. > > Let me try and describe the issue. We have a set of data elements related > to malaria for this example. We would like to be able to pivot the data on > other dimensions dimensions (Data element, age, disease, patient status). > Obviously there are other dimensions that are pivotable (orgunit, period, > dataset) > > The data elements look like this. I have put the dimensions in square > brackets, and the dimensional elements into curly brackets. > > [Data element, Age, Disease, Patient status] > Deaths Confirmed Malaria total (composed of) {All ages, Malaria Cases, > Deaths} > Deaths Confirmed Malaria 1 to Under 5 Years {1-5, Malaria Cases, Deaths} > Deaths Confirmed Malaria Over 5 Years {Over 5, Malaria Cases, Deaths} > Deaths Confirmed Malaria Under 1 Year {Under 1, Malaria Cases, Deaths} > IP Discharge Confirmed Malaria total (composed of) {All ages, Malaria > Cases, IP} > IP Discharge Confirmed Malaria 1 to Under 5 Years {1-5, Malaria Cases, IP} > IP Discharge Confirmed Malaria Over 5 Years {Over 5, Malaria Cases, Deaths} > IP Discharge Confirmed Malaria Under 1 Year {Under 1, Malaria Cases, > Deaths} > OPD 1st Attendance Confirmed Malaria total (composed of) {All ages, Malaria > Cases, OPD} > OPD 1st Attendance Confirmed Malaria 1 to Under 5 Years {1-5, Malaria > Cases, OPD} > OPD 1st Attendance Confirmed Malaria Over 5 Years {Over 5, Malaria Cases, > OPD} > OPD 1st Attendance Confirmed Malaria Under 1 Year {Under 1, Malaria Cases, > OPD} > > OK, I hope this is pretty clear. Obviously, there are more data elements > (Typhoid, Yellow fever, etc). I might want to know how many Under 1 deaths I > have had for all diseases, or how many OPD cases I have had for each > disease. How can I do this with the existing data model? It is not obvious > to me because there is no relationship between dimensional elements > (categoryoptions) to each other. Category options can be related through a > cateogry combination, but since data elements can only be assigned a single > category option, the dimensionality is broken once it gets time to pull the > data into a pivot table. > > In the incomplete example that I gave yesterday, I established a > one-to-many relationship between a data element and a dimension. If I > understand the current data model, I would have to create a separate > categorycombo for each of these data elements, and assign this > categorycombo to the data element. Now, I might be able to unfold the > dimensions using the categories and categorycombos. I it is not apparent how > the dimensional elements correspond themselves to a particular dimension, as > there is no relation for this in the database as I can see it. > > As (Johan pointed out a few mails ago, if I understand him correctly) is > different categorycombo's can be created for individual data elements, and > assigned to these elements. However, this seems to be 1) incredibly > inefficient and 2) does not establish any relationship between dimensional > elements and dimensions. Perhaps it is there, and maybe it has been done in > SL, but the SQL is not apparent to me at all. > > It would appear to me, looking from an SQL perspective, that a one-to-many > relationship between a data element, a dimension (category) and dimensional > element (category combo) would be much more effieicnet, and highly usable > from an SQL perspective. As I mentioned in my mail, I am not sure how easy > this would be to implement in a procedural language like Java, but I assume > it should be possible to either do it this way, or rewrite my Postgres > proprietary query in standard SQL (which there are ways to do with ANSI > SQL). This would require modification to the data model (similar to the > table I provided yesterday) and modification to the UI to allow users to 1) > select a dimension (category) 2) Select a dimensional element for the given > dimension. This would populate the table with a dataelementid, a > dimensionid (categoryid) and a dimensional element (cateogryoptionid). > > My gut feeling this is exactly the same functionality as has currently been > implemented for organizational units. Users can define a hierarchy for > organizational units, and then assign them to > categories/dimension/organizational group sets, decide whether the groups > are compulsory and exlusive, and then assign a particular organizational > unit to a particular group (which is analogous to a dimensional element). > Organizational group sets define the dimension, and one-to-one assignment of > an organizational unit to a particular organizational group defines which > dimensional element the organizational unit is a member of. These dimensions > can then be used in PivotTable analyses, where the orgunitgroupsets become > dimensions, and orgunitgroups become dimensional elements.. > > I beleive that data elements are no different than organizational units. > They should be able to be grouped into some sort of hierarchy and pivoted on > any dimension. Data elements groups establish a one-to-many relationship > between data elements and a data element group, but there is no concept of > how data element groups relate to each other. > I think this is perhaps the same concept you mention, ReportSet. > > I suspect we would need to potentially rethink the entire concept of > multidimensionality if we really wanted to get it right. It would see to me > that the DHIS datamodel and associated aggregation methods have been > hardwired into aggregation across time (period) and geography (orgunit). > What we can do with PivotTables and (and OLAP) is to aggregate across any > possible dimension, slicing as you mention ,on any dimension . I am not sure > this will be so simple to implement but I think there is a way to do it, > without major modifications. > > I am not sure it solves the SDMX issue. There are potential issues related > to "ragged" dimensions and how these get handled. Some data elements might > have three dimensions, while others may have more. I have not thought about > this in detail, but know it is an issue with cross-tab queries in SQL. You > normally have to know how many dimensions you are working with in order to > perform a cross-tab, but there are dynamic solutions. Perhaps this could be > dealt with somehow in SDMX. > > Anyway ,I am rambling. Hope this mail helps though to push my point > further. Once I get the SQL from SL, I will see if perhaps it has been done > already, and that I am just writing long emails for nothing. :) > > Regards, > Jason > > > > > > On Fri, Sep 25, 2009 at 10:44 AM, Bob Jolliffe <[email protected]> > wrote: > > Hi Jason and Johan > > > > I'm really pleased to see you having this discussion as I have been > > grappling with a similar issue which involves unravelling categories, > > category options and combos into something more familiar. I have reached > > similar conclusions regarding nomenclature: > > > > category = dimension > > categorycombo - I have been calling a dimension set (it bears a strong, > and > > useful, resemblance to xslt:attribute-set) > > category option - I like your suggestion of DimensionalElement. I am > going > > to start calling it that too. > > > > In my case I need to export (and import data) into a standard format > called > > sdmx. So whereas in the DHIS2 native DXF we export datavalues with > > effectively three dimensions (source, period, categorycombooption) the > last > > dimension is a sort of uber-dimension. Like a peppercorn or a cardamon > > seed, when you break it open it explodes its rich complexity of > dimensions. > > > > In sdmx we need the dimensions exploded. So data values look like: > > > > <dataset> > > <datavalue name="TB test given" uid="44344 ...44" gender="Male" > age="0-5" > > value="32" /> > > <datavalue name="TB test given" uid="44344 ...44" gender="Female" > > age="0-5" value="38" /> > > .. > > </dataset> > > > > My approach to unpicking the dimensions from the dxf file is to transform > it > > with an xslt transformation which is still incomplete but seems to work > > well. > > > > One other nomenclatures issue which has surfaced as a result is what we > call > > a "dataset". In DHIS2, if I understand correctly, a dataset corresponds > > roughly to all the dataelements which might occur on a datacollection > form. > > If we view all dataelements as having just the three "dimensions" then > all > > is well, but if we explode the actual dimensions then we have an issue. > In > > the sdmx model a dataset consists only of dataelements with the same > > dimensionset. After discussing this with Ola we have reached the > conclusion > > that we need another level of grouping, primarily for the UI - eg FormSet > or > > ReportSet which allows us to group related datasets. But that is an > aside > > from what you are talking about. > > > > I know that you guys can do magic with sql, but it seems that we should > try > > to capture some of this and place it down in the datamodel API. It > occurs > > to me that for a multidimensional dataelement we might benefit from some > > utility methods to retrieve slices and dices which might assist in > > constructing the pivot tables around dimensions. Does this sound like > the > > right thing to do. > > > > Regards > > Bob > > > > 2009/9/24 <[email protected]> > >> > >> Jason, > >> I will leave to others to comment the code, but I have a few comments... > >> > >> > I have done a bit more thinking on this, and would like to offer some > >> > more examples up for discussion. > >> > > >> > Basically, we have a lot of data elements that are somehow related to > >> > each other, similar to my kooky example in my original mail. I assume > >> > this is fairly common throughout other HMIS systems. Here, malaria > >> > attendance is broken down into various dimensions/category by patient > >> > type (outpatient, inpatient, and deaths) and by age (under 1 ,1-5 and > >> > over 5). But say you want to be able to pivot to look at outpatient, > >> > inpatient and deaths totals (i.e. summed up by age). Well, you could > >> > create a separate data element for this, but it sure would be nice to > >> > be able to Pivot the data somehow. > >> > >> In the Sierra Leone db, Edem and Romain set up views that pulled the > >> categories through into a "Category" pivot field, which you can then use > >> to get what you want. Simply tick the categories (see below) you want to > >> see, and group them together in excel. Maybe Edem and Romain can help > >> further here. > >> > >> > >> > Dimension ? Category > >> > Dimensional element ? Category option ? Category combo ( I think) > >> > >> The right symbol disappeared from my reply-mail here, but some > >> clarification: > >> > >> Crosstab Dimension (age AND gender) = Category combo > >> Dimension (age, gender) = Category > >> Dimensional element (inpatient, outpatient, death, under1, 1-5, and over > >> 5) = Category option > >> > >> So by assigning a DE the category combo of "gender_age", you get 9 > >> dimensional elements, 3 category options (in category age) by 3 category > >> options (in category gender) > >> > >> Johan > >> > >> > >> > >> > >> > Anyway, here is the helper table I created. > >> > > >> > CREATE TABLE test_dataelementcategorycombo > >> > ( > >> > test_dataelementid integer NOT NULL, > >> > test_dataelementcategoryid integer NOT NULL, > >> > test_dataelementcategorycomboid integer NOT NULL, > >> > CONSTRAINT pk_testdataelementcategory PRIMARY KEY > >> > (test_dataelementid, test_dataelementcategoryid, > >> > test_dataelementcategorycomboid) > >> > ) > >> > WITH (OIDS=FALSE); > >> > > >> > So this is a real simple table which references a data element, a > >> > data element category, and a data element combo. The reference to a > >> > data element category may be redundant, but anyway, lets leave it in > >> > for now. > >> > > >> > I populated the table with some data, which will be used to assign > >> > dimensions to data elements. It looks like this in my DB, which looks > >> > like this. > >> > > >> > 309;25250;25251 > >> > 309;25257;25255 > >> > 348;25250;25252 > >> > 348;25257;25255 > >> > 455;25250;25253 > >> > 455;25257;25255 > >> > > >> > but of course this is meaningless to you. What do these values > >> > correspond > >> > to? > >> > > >> > "OPD 1st Attendance Clinical Case of Malaria Under 1 > Year";"Age";"Under > >> > 1" > >> > "OPD 1st Attendance Clinical Case of Malaria 1 to Under 5 > >> > Years";"Age";"Age 1-5" > >> > "OPD 1st Attendance Clinical Case of Malaria Over 5 Years";"Age";"Over > >> > 5" > >> > "OPD 1st Attendance Clinical Case of Malaria Under 1 Year";"Patient > >> > status";"OPD" > >> > "OPD 1st Attendance Clinical Case of Malaria 1 to Under 5 > >> > Years";"Patient status";"OPD" > >> > "OPD 1st Attendance Clinical Case of Malaria Over 5 Years";"Patient > >> > status";"OPD" > >> > > >> > which can be produced by the following view. > >> > > >> > CREATE OR REPLACE VIEW vw_dataelements_dimensions AS > >> > SELECT dataelement.name, dataelementcategory.name AS dimension, > >> > dataelementcategoryoption.name AS dimension_element > >> > FROM dataelement > >> > JOIN test_dataelementcategorycombo ON > >> > test_dataelementcategorycombo.test_dataelementid = > >> > dataelement.dataelementid > >> > JOIN dataelementcategory ON dataelementcategory.categoryid = > >> > test_dataelementcategorycombo.test_dataelementcategoryid > >> > JOIN dataelementcategoryoption ON > >> > test_dataelementcategorycombo.test_dataelementcategorycomboid = > >> > dataelementcategoryoption.categoryoptionid; > >> > > >> > So, that view just provides a human readable view of those integers > >> > that I populated in the the test_dataelementcategorycombo table I > >> > created above. This table just assigns particular data elements to > >> > different category options (dimensional elements). > >> > > >> > OK, so far so good, but the problem now is, how to use this with the > >> > aggregatedatavalue table? If we try and join this table directly, we > >> > will have issues with duplicates in the pivot table, so we need to > >> > transform the data slightly. > >> > > >> > This should do the trick. > >> > > >> > SELECT * FROM crosstab > >> > ( > >> > 'SELECT name, dimension, dimension_element FROM > >> > vw_dataelements_dimensions ORDER BY 1,2,3', > >> > 'SELECT DISTINCT dimension from vw_dataelements_dimensions ORDER BY 1 > >> > ASC' > >> > ) > >> > as > >> > ( > >> > name character varying(230), > >> > age character varying(160), > >> > status character varying(160) > >> > ); > >> > > >> > > >> > which returns this record set > >> > > >> > "OPD 1st Attendance Clinical Case of Malaria 1 to Under 5 Years";"Age > >> > 1-5";"OPD" > >> > "OPD 1st Attendance Clinical Case of Malaria Over 5 Years";"Over > >> > 5";"OPD" > >> > "OPD 1st Attendance Clinical Case of Malaria Under 1 Year";"Under > >> > 1";"OPD" > >> > > >> > > >> > OK, admittedly, I cheated a bit and used the crosstab function of > >> > Postgresql, but I assume that this query could be rewritten with a few > >> > more lines of code in standard SQL or some procedural language like > >> > Java. Now, this record set looks like something that I can almost use > >> > with the aggregateddatavalue table simply by joining up the table on > >> > the appropriate dataelementid and pulling everything into a pivot > >> > table. I would not have any duplicated values and would have columns > >> > like data element name, period, orgunit, age, patient status and of > >> > course the value of the data element. I hope that part is pretty > >> > clear. Just join up that table to the aggregateddata table, and you > >> > have pretty much what is needed to pull the data directly into a > >> > PivotTable for further analysis. > >> > > >> > This is not a complete example, but it is very close to what I need > >> > here ,and I think this type of functionality would be much more useful > >> > than the current data element categories functionality. Basically, all > >> > that would be required, at least initially, would be another user > >> > interface screen to allow the definition of which category(ies) and > >> > category options a data element is a member of. The rest could ,in the > >> > first instance be executed with custom SQL (obviously, I am partial to > >> > this language and hobbled by the fact that I do not know Java), but > >> > eventually this would need to be implemented somehow in Java. > >> > > >> > I am not sure if this really solves all of the issues surrounding > >> > multidimensional analysis of data elements, but it seems to solve the > >> > issues that I am having by trying to assign some sort of dimensional > >> > hierarchy to data elements (similar to the exclusive/compulsory > >> > functionality of orgunits). Any thoughts on this? > >> > > >> > Best regards, > >> > Jason > >> > > >> > > >> > > >> > > >> > On Wed, Sep 16, 2009 at 10:28 PM, Jason Pickering > >> > <[email protected]> wrote: > >> >> > >> >> > >> >> On Wed, Sep 16, 2009 at 10:13 PM, <[email protected]> wrote: > >> >>> > >> >>> >> However, there does seem to be the ability to assign dimensions, > >> >>> there > >> >>> >> does > >> >>> >> not seem to be the ability to assign particular elements within > >> >>> those > >> >>> >> dimensions to a particular DHIS data element. > >> >>> > >> >>> > >> >>> Just some more clarification here: you can make category combos > which > >> >>> you > >> >>> assign to data elements. However, it is not possible to assign just > >> >>> specific parts of a category combo (only some of the category > options) > >> >>> to > >> >>> a data element. > >> >> > >> >> Yes, this was exactly what I wanted. Assigning different categories > >> >> would > >> >> seem to break the dimensionality. > >> >> > >> >>> > >> >>> Then you must make a specific category (as the only one in > >> >>> or part of a new category combo) with just those options. It can be > >> >>> hell; > >> >>> in Tajikistan there were way over 20 categories I think, at least 10 > >> >>> just > >> >>> on various age groups. > >> >>> > >> >>> Johan > >> >>> > >> >> > >> >> This was my fear. > >> >> > >> >> I will need to do some testing and see. I still fear it is not > exactly > >> >> the > >> >> intended functionality. > >> >> > >> >> Basically, I think I need something akin to the exclusive/compulsory > >> >> groups > >> >> that are in place for organizational units, but instead, for > arbitrary > >> >> dimensions. I will give a try and see what happens. > >> >> > >> >> Thanks, > >> >> Jason > >> >> > >> >> > >> >> > >> >> > >> >> > >> > > >> > >> > >> > >> _______________________________________________ > >> Mailing list: > >> https://launchpad.net/~dhis2-devs<https://launchpad.net/%7Edhis2-devs> > >> Post to : [email protected] > >> Unsubscribe : > >> https://launchpad.net/~dhis2-devs<https://launchpad.net/%7Edhis2-devs> > >> More help : https://help.launchpad.net/ListHelp > > > > > > > _______________________________________________ > Mailing list: > https://launchpad.net/~dhis2-devs<https://launchpad.net/%7Edhis2-devs> > Post to : [email protected] > Unsubscribe : > https://launchpad.net/~dhis2-devs<https://launchpad.net/%7Edhis2-devs> > More help : https://help.launchpad.net/ListHelp > >
_______________________________________________ Mailing list: https://launchpad.net/~dhis2-devs Post to : [email protected] Unsubscribe : https://launchpad.net/~dhis2-devs More help : https://help.launchpad.net/ListHelp

