[oops... Phillip gave me an OK to copy gpc-dev, but then I fat-fingered it...]
This is a good question. We've discussed it with IOWA and UNMC, and I think Russ got another inquiry more recently. (I'm copying Russ, Mani, and Steve, who are also involved in hardware and budget discussions.)

I can speak to:
 1. HERON hardware, where I have hard data, though the requirements go beyond what I expect we'll need for GPC
 2. Plans for GPC, which I'm just starting to get my head around
 3. Recent tinkering with i2b2 on AWS

Regarding HERON, recall the HERON architecture<https://informatics.kumc.edu/work/wiki/HERON#Architecture>:

[HERON arch from Russ's msg of 07/12/2010 09:48:12 AM]<https://informatics.kumc.edu/work/attachment/wiki/HERON/heron-arch.jpg>

We have two hardware servers (id, deid) and a virtual app server (jboss, web UI). The app server is a relatively unremarkable VM: 8GB RAM, 20GB disk. You could perhaps get by with less if you knew more than I do about keeping jboss/the JVM from eating crazy amounts of RAM. We seem to be using ~80% of that disk space, though I'm not sure how. Log files, maybe. (Did I mention we have open positions for a DBA and a systems engineer?)

The HP DL180 was our 1st-generation hardware. We did a sizing review in 2013; I just tweaked the summary spreadsheet to make sense to this audience:

 * HERON Sizing 2013<https://docs.google.com/spreadsheet/ccc?key=0Ak2nuw10QdWQdDM5THo4WDQzTUJpLVJXUmppcUFCNnc&usp=sharing>

As shown there, the Gen2 hardware servers are $55K each. The Gen1 servers were originally more like $20K. That was sort of OK for one user, but it was pretty sluggish if anybody else was also using it. So we added RAM and solid-state storage, which made performance acceptable for our user base.

This is where the requirements for HERON go far beyond what I can see for GPC. For GPC, strictly speaking, I think we need to run 3 queries in the first six months, one for each cohort we're characterizing. Of course, there are countless iterations to get there, so there's a trade-off between development time and hardware cost.
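To make the hardware-cost side of that trade-off concrete, here's a back-of-envelope comparison using the figures above (Gen1 originally ~$20K per server, before the RAM/SSD upgrades; Gen2 $55K per server; two hardware servers either way). Just a sketch of the arithmetic, not a quote:

```python
# Back-of-envelope hardware cost, using figures from the sizing review above.
SERVERS = 2  # id + deid

gen1_per_server = 20_000  # original Gen1 price, before RAM/SSD upgrades
gen2_per_server = 55_000

gen1_total = SERVERS * gen1_per_server
gen2_total = SERVERS * gen2_per_server

print(f"Gen1: ${gen1_total:,}")  # $40,000
print(f"Gen2: ${gen2_total:,}")  # $110,000
```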
But it's not like HERON, where we aim to support hundreds of queries by dozens of researchers every month...

Queries by Month

 Year-Month  Queries  Users
 2014-01         396     30
 2013-12         405     33
 2013-11         621     43
 2013-10        1164     42
 2013-09        1008     35
 2013-08        1157     52
 2013-07         641     36
 2013-06         299     21

... with response times of 5 seconds to 5 minutes:

 Created                Status      Name          User  Groups  Terms  Elapsed
 January 24 10:27:43am  INCOMPLETE  ...                      1     32  0:00:50   ******
 January 23 04:41:10pm  INCOMPLETE  Patient list             1      1  17:47:23  ****************
 January 23 04:41:10pm  INCOMPLETE                           1      1  17:47:23  ****************
 January 24 10:18:09am  COMPLETED                            3      5  0:01:31   *******
 January 24 10:12:59am  COMPLETED                            2      4  0:01:24   *******
 January 24 10:12:03am  COMPLETED                            1      3  0:00:11   ****
 January 24 08:58:39am  COMPLETED   Patient list             2      7  0:00:15   *****
 January 24 08:58:39am  COMPLETED                            2      7  0:00:15   *****
 January 24 04:53:00am  COMPLETED                            1      1  0:00:01   **
 January 23 10:52:51pm  COMPLETED                            1      1  0:00:01   **
 January 23 05:31:30pm  COMPLETED   Patient list             2     21  0:00:12   ****
 January 23 05:31:30pm  COMPLETED                            2     21  0:00:12   ****
 January 23 04:52:41pm  COMPLETED                            1      1  0:00:01   **

A separate app server makes sense, yes... though at UNMC, I believe they're planning to use the same hardware for the deid DB and the app server, perhaps virtualizing the app server.

I heard from Russ that while popmednet is required for exchange between CDRNs, it's not required within CDRNs, and other CDRNs have alternative plans. In other words, I expect we'll need one popmednet node for the GPC CDRN, not one for each GPC site.

As for tinkering on AWS, I don't think my experience so far sheds much light, so I'll leave that to a future discussion.

I'm adding this to the hackathon agenda<http://informatics.gpcnetwork.org/trac/Project/wiki/HackathonOne#Agenda> (but anyone who has input at this point, please share it here and don't wait until then).
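For a rough sense of the HERON load those tables represent, the monthly figures average out as follows (a quick sketch; the numbers are copied straight from the Queries by Month table):

```python
# Monthly HERON query volume, copied from the table above:
# (year-month, queries, users)
months = [
    ("2014-01", 396, 30), ("2013-12", 405, 33), ("2013-11", 621, 43),
    ("2013-10", 1164, 42), ("2013-09", 1008, 35), ("2013-08", 1157, 52),
    ("2013-07", 641, 36), ("2013-06", 299, 21),
]

total_queries = sum(q for _, q, _ in months)
avg_queries = total_queries / len(months)
avg_users = sum(u for _, _, u in months) / len(months)

print(f"avg queries/month: {avg_queries:.1f}")  # 711.4
print(f"avg users/month:   {avg_users:.1f}")    # 36.5
```

So HERON sustains roughly 700 queries from 35-ish users in a typical month, versus the 3 characterization queries (plus iterations) I expect for GPC.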
--
Dan

________________________________
From: Phillip Reeder [[email protected]]
Sent: Friday, January 24, 2014 8:39 AM
To: Dan Connolly
Subject: Hardware Reference for GPC

Do we have a reference as to the hardware/VMs that we expect to need for the GPC? We have our existing DB server, i2b2 app server, and web client, but I'm assuming we will probably need to put this on its own app server and i2b2 web client. Also, will we need a popmednet server for each site? Etc.

I'm getting some questions about budget and justifying the VMs, so I was wondering if this might be something that the GPC level could help with. Given that it's not really well defined at the moment, I don't want to have them re-budget the money to something else and then in 6 months really need one more VM.

Have you thought much about what servers each GPC site will need?

Phillip

________________________________
UT Southwestern Medical Center
The future of medicine, today.
