Hi Roger,

Valid concerns, but I would assume that in the typical use case an R user would only be 1) looking at a view that has been prepared for them or 2) given read-only access to selected tables. I would not expect locks to be a problem in this case.
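To make that use case a bit more concrete, here is a rough sketch of how an R user might pull one organisation unit from a prepared report table or view and release the connection straight away. The function names (build_report_query, fetch_report) and the "r_readonly" account are made up for illustration and are not part of DHIS2. The table and column names are the ones from my earlier example. I have not tested this against a production database.

```r
# Hypothetical helper: build the SELECT for one organisation unit.
# The view name is checked against a simple identifier pattern and the id is
# coerced to an integer, so neither can carry arbitrary SQL into the query.
build_report_query <- function(view, orgunit_id) {
  stopifnot(grepl("^[A-Za-z_][A-Za-z0-9_]*$", view))
  sprintf("SELECT * FROM %s WHERE organisationunitid = %d",
          view, as.integer(orgunit_id))
}

# Hypothetical helper: connect, fetch the whole result set in one call,
# and always disconnect, even if the query fails.
# Requires the DBI and RPostgreSQL packages to be installed.
fetch_report <- function(dbname, user, password, view, orgunit_id,
                         host = "localhost") {
  con <- DBI::dbConnect(RPostgreSQL::PostgreSQL(), dbname = dbname,
                        user = user, password = password, host = host)
  on.exit(DBI::dbDisconnect(con), add = TRUE)
  DBI::dbGetQuery(con, build_report_query(view, orgunit_id))
}

# Example call, assuming a read-only account named "r_readonly" exists:
# data <- fetch_report("dhis2_zm_prod2", "r_readonly", "secret",
#                      "_report_malaria_indicators_district", 3904)
```

Using dbGetQuery() rather than dbSendQuery() plus fetch() means the result set is retrieved and cleared in a single call, so nothing is left open on the server while the user works on the data frame.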
However, your point is well taken. That is why I have been tinkering with LucidDB, a column-oriented database that may be more appropriate for analysis. Which database should be used is probably a separate discussion, but the point is that a separation between the "transactional" database used for data entry and the "analysis" database is a good idea. I would regard this as probably outside the scope of what DHIS is really intended to do. If people need to use tools like R, they will likely be capable of coming up with their own solution. However, this does not exclude building certain simple examples into DHIS.

Obviously performance is a concern, but of course it depends on what you are trying to do. R is incredibly powerful when it comes to producing graphics, as I am sure you are aware, and light-years ahead of the other components we are using (jPlot I think). So I would think that the typical use case would be to leverage R, possibly as an extension to DHIS2 for those who need it, to generate analysis tables and graphics that are beyond the scope of the "basic" package, which is really limited to aggregation.

Anyway, just a few more thoughts.

Regards,
Jason

On Thu, May 27, 2010 at 3:39 PM, Friedman, Roger (CDC/OID/NCHHSTP) (CTR) <[email protected]> wrote:
> My concern would be DB performance; there's no telling what kind of locks R
> or any other product using ODBC/JDBC is going to use. I'm already worried
> about simultaneous transactional and reporting use. Have there been any
> large-volume performance tests? Has any thought been given to splitting
> reporting and data entry between different DB servers? I know everyone has
> been focused on getting the distributed DB aspects right, but assuming
> universal availability of internet, how would DHIS2 perform on a single
> (possibly clustered) national DB server?
>
> -----Original Message-----
> From: [email protected]
> [mailto:[email protected]] On Behalf Of
> Jason Pickering
> Sent: Thursday, May 27, 2010 7:03 AM
> To: Bob Jolliffe
> Cc: [email protected]; dhis2-devs
> Subject: Re: [Dhis2-users] [Dhis2-devs] DHIS2 with R
>
> Yeah, this I guess comes back time and time again, with my somewhat
> uncomfortable relationship with Hibernate and Java. Clearly, we need
> to think about how to balance making certain procedures cross-platform
> compatible (cross-platform in the sense of working between Postgres/MySQL
> and other DBs) with the need to offer advanced analysis capabilities with
> acceptable performance.
>
> There could be multiple ways of doing it, but in the absence of having
> R integrated into DHIS2, I think the most likely short-term use case
> would be just some documentation on how to use the R client with the
> DHIS2 database. Perhaps those users that use R over time with DHIS2
> could contribute their procedures, which should be able to be
> generalized either with PL/R.
>
> Of course the difference with using Postgres is that R procedures can
> be embedded as a new language inside the DB. I am not really sure this
> is possible with MySQL. This of course reduces the internal overhead
> of getting the data out of Postgres, through Java, and into the R
> interpreter, but I am not sure really what the impact of this might be
> without testing it.
>
> On Thu, May 27, 2010 at 12:26 PM, Bob Jolliffe <[email protected]> wrote:
>> On 27 May 2010 11:15, Jason Pickering <[email protected]> wrote:
>>> Hi Bob,
>>>
>>> Yes, I suspect that most R users would probably want to do things
>>> their own way. It has a rather steep learning curve. :)
>>>
>>> As for canned R scripts, the best way would probably be with PL/R, a
>>> procedural PostgreSQL language which utilizes R.
>>>
>>> http://www.joeconway.com/plr/doc/index.html
>>>
>>> I have done some very basic testing and it seems to work just fine on
>>> the server side.
>>
>> Swings and roundabouts to a certain extent. The main thing is that
>> the R scripts are evaluated using the R C library. If they were
>> invoked from within Java/DHIS then I guess data access would be slower
>> than from PL/R (we'd need to have a way to get the data to the R
>> interpreter), but number crunching would be similar and would also
>> work with MySQL and friends. Not sure which of these are bigger
>> problems in typical/possible scenarios.
>>
>>>
>>> I think they are two separate problems really, but I totally agree, C
>>> is likely going to be faster than Java for big operations. However, I
>>> do think (as all of you know) that certain functions (like
>>> aggregation and heavy cross-tab operations) would be much better
>>> executed on the database server as native stored procedures (with
>>> the wrapper facade type of approach).
>>>
>>> Regards,
>>> Jason
>>>
>>> On Thu, May 27, 2010 at 11:45 AM, Bob Jolliffe <[email protected]>
>>> wrote:
>>>> We've talked before about integrating a scripting engine (such as R)
>>>> into DHIS: http://www.rforge.net/rscript/
>>>>
>>>> But my guess is that most R users are going to be of a level of
>>>> sophistication that they would be most comfortable doing the kind of
>>>> thing you describe - connecting directly to the DB with the R client
>>>> and doing their stuff.
>>>>
>>>> OTOH if there were sufficiently useful "canned" DHIS R scripts which
>>>> could take some number crunching load off the JVM and produce canned
>>>> useful analysis then that would be different.
>>>>
>>>> Sadly I don't know enough about R to know. But I sense it ...
>>>>
>>>> Regards
>>>> Bob
>>>>
>>>> On 27 May 2010 10:08, Jason Pickering <[email protected]> wrote:
>>>>> Hi everyone. I have had a recent question from a user about how DHIS2
>>>>> can be used with R. I am including a trivial example here about how to
>>>>> use R as a client to access data in DHIS2 and produce a graph.
>>>>>
>>>>> Just get a copy of R and install the DBI and RPostgreSQL packages with
>>>>>
>>>>> > install.packages(c("DBI", "RPostgreSQL"))
>>>>>
>>>>> After that, just connect to the DB, retrieve your data (in this case
>>>>> from a report table) and produce a graph.
>>>>>
>>>>> > library(DBI)
>>>>> > library(RPostgreSQL)
>>>>> > drv <- dbDriver("PostgreSQL")
>>>>> > con <- dbConnect(drv, dbname="dhis2_zm_prod2", user="postgres", password="postgres")
>>>>> > rs <- dbSendQuery(con, "SELECT * FROM _report_malaria_indicators_district WHERE organisationunitid = 3904")
>>>>> > data <- fetch(rs, n=-1)
>>>>> > barplot(data$malaria_confirm_incidence, names.arg=as.character(data$periodname), main=as.character(data$organisationunitname[1]), las=2)
>>>>> > dev.print(png, file="/home/jason/test.png")
>>>>>
>>>>> Regards,
>>>>> Jason
>>>>>
>>>>> ---
>>>>> Jason P. Pickering
>>>>> email: [email protected]
>>>>> tel:+260968395190
>>>>>
>>>>> _______________________________________________
>>>>> Mailing list: https://launchpad.net/~dhis2-devs
>>>>> Post to : [email protected]
>>>>> Unsubscribe : https://launchpad.net/~dhis2-devs
>>>>> More help : https://help.launchpad.net/ListHelp
>
> _______________________________________________
> Mailing list: https://launchpad.net/~dhis2-users
> Post to : [email protected]
> Unsubscribe : https://launchpad.net/~dhis2-users
> More help : https://help.launchpad.net/ListHelp

--
Jason P. Pickering
email: [email protected]
tel:+260968395190

