Karsten Hilbert wrote: > On Wed, Jun 27, 2007 at 07:02:27AM +1000, Tim Churches wrote: > >>>> Easier extensibility of the form definitions/metadata, >>> I can see that. Put IOW: it seems safer to update data in >>> deployed databases rather than requiring the databases >>> themselves to be modified, eg. when one needs a new "field" >>> in the form metadata definition. >>> >>> It *can* be done via ALTER TABLE but I agree it sounds safer >>> to rather do UPDATE ... WHERE ... ; >> What if the update to a form which has already been used to collect data >> involves the removal of a column? If you use ALTER TABLE, all the data >> collected in that form field/column is now gone... where's the undo button? > > Well, here's one fundamental difference in what we consider > a form to be: > > In NetEpi forms are used to *collect* data. > > In GNUmed the sort of form we are currently concerned about > is to *use existing* data to fill in fields on and (mostly) > print fields out onto paper forms. > > IOW, the disappearing column scenario doesn't exist. > > I tend to believe, even in your scenario, that a form is but > a puff of smoke and should never endanger existing data. IOW, > > - never *remove* a form definition but rather > retire/obsolete/supercede it by (at least a new version > of) a form
Yes, that's exactly what we do. Each form version has its own table, and obsolete versions are not deleted. There is no user interface to retrieve data from a field/column dropped in a subsequent version of a form, but the syadmin/dbadmin can easily get the data. We should add a user interface to it. > - roll forward "old" form content only where there's a need, > eg to print out repeat prescriptions when the prescription > form has changed inbetween first and repeat request > > I do appreciate this doesn't fit with the > forms-for-data-capture approach where you may need to be > able to process "old" and "new" form content as if it was > captured the same way. Yes, epidemiological data, especially disease outbreak data, is ephemeral and only has a limited currency and is for a specific purpose. So although we never throw data away, we are really only interested in the current data values for the current set of data items. That is a difference from a clinical medical record. >>> (COPY these >>> days is able to dump the result of a SELECT statement :-) >> That would work. XML is not always best, but if you tell people that >> your form definitions are shareable as CSV files, they will look at you >> very strangely, whereas if you say "shareable as XML" then they won't be >> surprised at all. > A good social engineering argument. > > I suppose we'd wrap the CSV in minimalist XML to please them > peoples: > > <xml> > <fancy_message caption="this is the new fancy form you asked > for"></fancy_message> > <csv> > the,csv,for,copy,to,insert > </csv> > </xml> > > et voila, shareable via XML where GNUmed displays > $fancy_message and instructs psql-whatnot to go its merry way > with the CSV data. Yes. >>>> We tested your approach, which is a form of the well-known >>>> EAV (entity-attribute-value) approach, >>> Correct. >>>> i.e. about 250 million rows in the values tables. >>> Let's see: >>> >>> Or 8 000 000 form field values over 10 years. Let's double >>> that up for increase of bureaucracy: 16 million. Double >>> again just for good measure: 32 million. >> Sure - at a clinic or individual institution level, the EAV approach is >> fine performance-wise. > We are looking at the individual clinic level with GNUmed. > > So the gist of the this exchange seems to be that a) > implementation-wise (I did look at the cocklebur code by > now) we have an excellent reference source for how to do it > but we need not be afraid - performance-wise - to employ an > EAV style schema where this seems appropriate. For systems aimed at a single clinic or institution, yes. If you want to scale up beyond that, do some performance simulations with lots of dummy data. But I think you'll be fine. Of course, if you consider the way PostgreSQL (but not other databases) works internally, it uses a form of EAV anyway. Well, close to it. But then it wraps SQL DDL around it. So our approach of a table per form version with a "fixed" schema is really closer to the EAV formulation when done with PostgreSQL than you might think. > Directly reusing the cocklebur forms framework doesn't seem > appropriate :-( as the goals are different, that is > arbitrary data-capture vs. predefined-form-filling with the > existing code being geared towards the former. I agree. Cocklebur is a custom object-relational mapper for NetEpi Collection, it was not designed to be general. > We may end up using it anyways. Happy to help if we can. Tim C _______________________________________________ Gnumed-devel mailing list [email protected] http://lists.gnu.org/mailman/listinfo/gnumed-devel
