Paul wrote:
> --- In [email protected], Tim Churches <[EMAIL PROTECTED]> wrote:
>> Paul wrote:
>>> We would love to hear from folks, in particular who have the following
>>> skill sets:
>>>
>>> 1)  database performance optimization
>>> 2)  OLAP / reporting database design
>>> 3)  Hibernate ORM
>> With respect to 2), you might be interested in NetEpi Analysis. Be
>> warned, we haven't worked on it for over 12 months, so its dependencies
>> are all a bit old and thus it is tricky to install, but we hope to get
>> back to it very shortly and make it easier to install. It doesn't do
>> everything you need, but it does do OLAP sort of things, but without the
>> restrictions of OLAP.
>>
>> There'll be an online and offline/downloadable "screencast" demo of
>> NetEpi Analysis available in March (I'm working on the script for it now)
>> - I'll announce it on this list.
> 
> We'll definitely take a look at this.  Thanks for the tip.  One of the
> technical deficiencies of our team is in the OLAP/reporting space. 
> Our long term vision at this point is to have a repository that's
> optimized for incoming data/HL7 and for clinical care purposes
> (optimized for patient-level queries), and another repository which
> shadows the primary for reporting/analysis purposes (higher
> abstraction-level queries by encounter, location, population, program,
> etc.)  So far, we've been able to get away with doing both from a
> single repository as reporting and data analyses can be done as
> non-mission critical tasks in low clinical utilization time periods. 
> I don't believe this will be sustainable over the long run however. 
> For example, our collaborators at PIH have just informed us that their
> OpenMRS instantiation is going "country-wide", and these queries will
> certainly take dedicated horsepower which will likely be continuous.

NetEpi Analysis was designed to deal with the types of data and analyses
which you mention - for example, apart from supporting complex
cross-tabs (with good support for proportions), basic contingency table
analysis and standard statistical charts (again with confidence limits),
it also does directly and indirectly age-standardised population-based
rates (with confidence intervals), and we hope to shortly add support
for log-linear (Poisson and negative-binomial) models for counts/rates,
as well as possibly Bayesian smoothing of rates. And it works quickly
enough for interactive, real-time analysis even with many millions of
records, and the underlying architecture deals well with the sort of
multi-level data you describe, e.g. it supports multi-valued records. And
it has full support for missing data handling. Oh, and it also does
Google-style indexed searching of free text fields. However, a major
downside is that it only accepts batch loads of data at the moment,
without incremental updates (although many data sources can be loaded
into one dataset, but they all have to be loaded in one go). That's
something we also plan to fix as soon as possible, but unless you have
many tens of millions of records, periodic batch loading works OK (and is
simpler to set up and maintain than incremental updates, which can be
surprisingly tricky and fragile). But it does support dataset
versioning, so that the latest version of source data can be loaded into
a new dataset in the background while users continue to use an existing
dataset, and when the new load has completed it is transparently used
for all new analysis sessions. Of course, there are still plenty of
rough edges and gaps, but it might be a good start, or a partial
solution. It would be fairly easy to write hard-coded interfaces to OpenMRS, I
suspect, either in PHP or Python or both; a generic interface
would be harder, but possible - maybe a few weeks' work. It can certainly
load data directly from a database back-end, although some queries to
flatten the data appropriately might be needed. Further processing and
data transformation can be done in Python as the data are loaded. Oh,
you can use it via the Web front-end, no programming, or via its own
Python API, which is simple enough for interactive command-line analysis
as well as for calling from other things.
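
For anyone unfamiliar with direct age-standardisation, here is a minimal
sketch of the calculation in plain Python. It is not NetEpi Analysis code and
does not use its API: the function name and arguments are made up for
illustration, and the normal-approximation confidence interval with a Poisson
variance is an assumption chosen for brevity, not necessarily the method
NetEpi Analysis uses.

```python
import math


def direct_standardised_rate(events, person_years, standard_pop, z=1.96):
    """Directly age-standardised rate with an approximate confidence interval.

    events, person_years and standard_pop are equal-length sequences with one
    entry per age group. Hypothetical helper, for illustration only.
    """
    if not (len(events) == len(person_years) == len(standard_pop)):
        raise ValueError("all inputs must have one entry per age group")

    total_standard = sum(standard_pop)
    rate = 0.0
    variance = 0.0
    for d, n, s in zip(events, person_years, standard_pop):
        w = s / total_standard           # weight of this age group in the standard
        rate += w * d / n                # weighted age-specific rate
        variance += w * w * d / (n * n)  # Poisson variance of the weighted rate

    se = math.sqrt(variance)
    return rate, max(rate - z * se, 0.0), rate + z * se


if __name__ == "__main__":
    # Two age groups, equal weights in the standard population.
    rate, lower, upper = direct_standardised_rate(
        events=[10, 20],
        person_years=[1000, 1000],
        standard_pop=[1, 1],
    )
    print(f"rate per 1000: {rate * 1000:.2f} "
          f"({lower * 1000:.2f} to {upper * 1000:.2f})")
```

The weights come from a fixed standard population, so rates from different
places or periods are compared as if they shared the same age structure.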

>>> Additionally, new Java programmers are always welcome to join us.
>> Sorry, we prefer Python, and only use Java if absolutely forced.
>> Speaking of Python, you might be interested in the GNUmed project, which
>> also targets primary care settings, and the GNUmed people are very
>> interested in all sorts of architectural and design issues. GNUmed uses
>> a cross-platform GUI (wxWindows), rather than a Web interface, though.
>> But the design issues are similar.
> 
> Hey, Python is good stuff.  Whatever gets the job done is OK with me.
> :)  Messaging allows one the opportunity for something like NetEpi to
> play nicely with OpenMRS.

Yes. Actually, we don't support messaging in NetEpi, yet, but no reason
why it can't. No, that's wrong: one of our applications, for real-time
ED-based public health surveillance, does use HL7 messaging and the data
is fed into NetEpi Analysis. We do have some very insistent calls for
interoperability, but they are for interoperability with paper-based data
collection forms: people want to be able to scan in and optical mark
read or OCR hand-written paper forms. I think OpenMRS uses SharePoint or
perhaps Adobe Acrobat forms for this? Do they support scanning? That's
what our users want. Naturally we are not very enthusiastic about
providing such facilities, not the least because there are few open
source optical character recognition packages which can be readily used
(yes, IBM released one a while ago, but it takes a lot to use it), and
also, form scanning just doesn't work as well in practice as people
think it should, I've found in the past. Perhaps it works better these days.

Tim C
