Here is some example code:
CREATE TABLE observation_fact (  … ) PARTITION BY RANGE (concept_cd);
CREATE TABLE observation_fact_icd9 PARTITION OF observation_fact FOR VALUES 
FROM ('ICD9:') TO ('ICDA');
CREATE TABLE observation_fact_icd10 PARTITION OF observation_fact FOR VALUES 
FROM ('ICD10:') TO ('ICD11');

It took me a while to come up with valid ranges since Postgres requires them to 
be non-overlapping. I have only made 3 partitions so far, but I am working on 
making more. We are making one partition per concept code prefix, at least that 
is the plan.

If you do partitioning this way Postgres automatically loads your data into the 
correct partition and fails when it comes across a row of data that it cannot 
assign to a partition. In theory you can create all the partitions and then 
load the data, but for now I am creating a partition at a time and using a LIKE 
query to load that subset of the observation facts from our Oracle DB to the 
PostgreSQL DB.

Hope this helps,

Andrew Vallejos, MS
Programmer Analyst, Biomedical Informatics
Clinical & Translational Science Institute
Medical College of Wisconsin
[email protected]



From: Gpc-dev <[email protected]> on behalf of "Pedersen, Jay 
G" <[email protected]>
Date: Thursday, April 26, 2018 at 10:19 AM
To: "[email protected]" <[email protected]>
Subject: Re: [gpc-informatics] #676: Postgres SNOW SHRINE nodes (UNMC , MCW) 
time out

ATTENTION: This email originated from a sender outside of MCW. Use caution when 
clicking on links or opening attachments.
________________________________

Hi,



I am very much interested in the how the partitioning of observation_fact by 
concept_cd

was done.



(a) Can you share the syntax you used to create the partitions?

(b) How many partitions did you make?

    Presumably the most extreme level would be one per concept-code.



How did that affect how you loaded the data?  Do you have to load the

partitions separately, or does postgres determine which partition to use

based on the value of the concept_cd field?



Jay Pedersen, M.A.
Department of Pathology/Microbiology
University of Nebraska Medical Center
985900 Nebraska Medical Center
Omaha NE  68198-5900
402-559-9487<tel:402-559-9593> (office)
402-739-3496<tel:402-350-7851> (mobile)
________________________________
From: GPC Informatics <[email protected]>
Sent: Thursday, April 26, 2018 10:10:04 AM
To: Narayana, Yeshwanth R; [email protected]; [email protected]; McClay, James C; 
[email protected]
Cc: Campbell, James R; Pedersen, Jay G
Subject: Re: [gpc-informatics] #676: Postgres SNOW SHRINE nodes (UNMC , MCW) 
time out

Non-UNMC email




#676: Postgres SNOW SHRINE nodes (UNMC , MCW) time out
--------------------------+----------------------------
 Reporter:  ynarayana     |       Owner:  mish
     Type:  problem       |      Status:  assigned
 Priority:  minor         |   Milestone:  snow-shrine-2
Component:  data-sharing  |  Resolution:
 Keywords:                |  Blocked By:
 Blocking:                |
--------------------------+----------------------------

Comment (by avallejos):

 I have improved the performance of our PostreSQL node. So we should be
 returning results in a timely manner.

 In case anyone is curious... this was done by adding a GIN index to the
 Concept_Dimension and partitioning the observation_fact table based on
 concept_cd. Since we are running PostreSQL 10 we are able to use range
 partitioning on the string prefix on the concept_cd value.

--
Ticket URL: 
<http://informatics.gpcnetwork.org/trac/Project/ticket/676#comment:6<https://urldefense.proofpoint.com/v2/url?u=http-3A__informatics.gpcnetwork.org_trac_Project_ticket_676-23comment-3A6&d=DwMFAg&c=aFamLAsxMIDYjNglYHTMV0iqFn3z4pVFYPQkjgspw4Y&r=bgeYa4UiIgk0KM0TCf1dJw&m=KSeAgqpV-dc0pOP__lQOesBipj2ltqrYY3Ox3zPzd1E&s=gfGqHUYaeRsZ_Xobg0rBoVtl_yAgZIccesWWPgy2j4o&e=>>
gpc-informatics 
<http://informatics.gpcnetwork.org/<https://urldefense.proofpoint.com/v2/url?u=http-3A__informatics.gpcnetwork.org_&d=DwMFAg&c=aFamLAsxMIDYjNglYHTMV0iqFn3z4pVFYPQkjgspw4Y&r=bgeYa4UiIgk0KM0TCf1dJw&m=KSeAgqpV-dc0pOP__lQOesBipj2ltqrYY3Ox3zPzd1E&s=CM2i0B-QG71s32IMUipshpshgIlDo9huDEY0FJIOA2U&e=>>
Greater Plains Network - Informatics

The information in this e-mail may be privileged and confidential, intended 
only for the use of the addressee(s) above. Any unauthorized use or disclosure 
of this information is prohibited. If you have received this e-mail by mistake, 
please delete it and immediately contact the sender.
_______________________________________________
Gpc-dev mailing list
[email protected]
http://listserv.kumc.edu/mailman/listinfo/gpc-dev

Reply via email to