Re: [GENERAL] partitioning question
The comparison would be a if then else end if .. about 8 of them 2013-> and a static insert into v's making a dynamic string and using execute, my presumption would be the execute would be expensive verses a INSERT command A On 1 August 2017 at 07:04, Scott Marlowewrote: > On Sun, Jul 30, 2017 at 7:13 PM, Alex Samad wrote: > > How expensive is dynamic over static. I'm looking at storing yearly > now, so > > I figure if my if then clause has the latest year at the top it should be > > very quick. > > Assuming you're not doing anything particularly crazy it's minimal. > But what is good performance for one application may not be acceptable > for others. Generally the cost of inserting is MUCH higher than the > cost of dynamically setting the target, esp if you stick to plpgsql > and don't try to use rules to accomplish it. >
Re: [GENERAL] partitioning question
On Sun, Jul 30, 2017 at 7:13 PM, Alex Samadwrote: > How expensive is dynamic over static. I'm looking at storing yearly now, so > I figure if my if then clause has the latest year at the top it should be > very quick. Assuming you're not doing anything particularly crazy it's minimal. But what is good performance for one application may not be acceptable for others. Generally the cost of inserting is MUCH higher than the cost of dynamically setting the target, esp if you stick to plpgsql and don't try to use rules to accomplish it. -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] partitioning question
How expensive is dynamic over static. I'm looking at storing yearly now, so I figure if my if then clause has the latest year at the top it should be very quick. On 31 July 2017 at 11:07, Justin Pryzbywrote: > On Mon, Jul 31, 2017 at 10:25:54AM +1000, Alex Samad wrote: > > I note that you link to P10 and I am currently looking at 9.6. The > changes > > do look nice for partitioning for p10. > > Yes sorry, pg10 is beta - avoid using it except for testing purposes. > > > I will add currently we don't delete anything, we will keep adding to it. > > > > Also I am thinking my insert trigger becomes a lot smaller and easier if > I > > leave it at yearly. > Note: the trigger function can either be a static function updated monthly > (to > handle the next month), preferably with the most recent months tested > first (so > a typical newly-inserted rows only goes through one if/case test). > > Alternately, the trigger function can dynamically compute the table into > which > to insert using plpgsql "format()" similar to here: > https://www.postgresql.org/docs/9.1/static/plpgsql- > statements.html#PLPGSQL-QUOTE-LITERAL-EXAMPLE > > Justin >
Re: [GENERAL] partitioning question
On Mon, Jul 31, 2017 at 10:25:54AM +1000, Alex Samad wrote: > I note that you link to P10 and I am currently looking at 9.6. The changes > do look nice for partitioning for p10. Yes sorry, pg10 is beta - avoid using it except for testing purposes. > I will add currently we don't delete anything, we will keep adding to it. > > Also I am thinking my insert trigger becomes a lot smaller and easier if I > leave it at yearly. Note: the trigger function can either be a static function updated monthly (to handle the next month), preferably with the most recent months tested first (so a typical newly-inserted rows only goes through one if/case test). Alternately, the trigger function can dynamically compute the table into which to insert using plpgsql "format()" similar to here: https://www.postgresql.org/docs/9.1/static/plpgsql-statements.html#PLPGSQL-QUOTE-LITERAL-EXAMPLE Justin -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] partitioning question
Hi I note that you link to P10 and I am currently looking at 9.6. The changes do look nice for partitioning for p10. Interesting your suggest that the MM parition isn't that bad. I will add currently we don't delete anything, we will keep adding to it. Also I am thinking my insert trigger becomes a lot smaller and easier if I leave it at yearly. Also thinking if P10 was the current recommended version right now I would probably look at MM because it looks like it makes partitioning easier Alex On 31 July 2017 at 09:54, Justin Pryzbywrote: > On Mon, Jul 31, 2017 at 09:15:29AM +1000, Alex Samad wrote: > > Hi > > > > I was about to partition a large (?) approx 3T of data 2B rows into > > partition tables but broken up into MM ... > > > > Now I have been reading about limiting the number of partitions otherwise > > it could slow down the parser. > > > > My reasoning for limiting to MM was that most of the request would be > > monthly based. > > > > Should I be making the partitioning based on instead and have lots > > more indexs. > > > > If I have an index on the timestamp field will it help limiting to > MM ? > > The major advantages of partitions are enumerated here: > https://www.postgresql.org/docs/10/static/ddl-partitioning.html#ddl- > partitioning-overview > > For your case, it might be that seq scans of an entire "monthly" partition > turn > out to be very advantageous, compared with index scan (or seq scan of > entire > 3TB data). > > Also DROPing the oldest partition every month is commonly very much more > efficient than DELETEing it.. > > There are warnings like these: > > |All constraints on all partitions of the master table are examined during > |constraint exclusion, so large numbers of partitions are likely to > increase > |query planning time considerably. Partitioning using these techniques > will work > |well with up to perhaps a hundred partitions; don't try to use many > thousands > |of partitions. > > Unless you have 100s of years of data I don't think it would be a problem. > > For us, having hundreds of partitions hasn't been an issue (planning time > is > insignificant for our analytic report queries). But there's an overhead to > partitions and at some point the cost becomes significant. (Actually, I > think > one cost which *did* hit us, while experimenting with *daily* partition > granularity of every table, was probably due to very large pg_statistics > and > pg_attributes tables, which no longer fit in buffer cache). > > Justin > > > -- > Sent via pgsql-general mailing list (pgsql-general@postgresql.org) > To make changes to your subscription: > http://www.postgresql.org/mailpref/pgsql-general >
Re: [GENERAL] partitioning question
On Mon, Jul 31, 2017 at 09:15:29AM +1000, Alex Samad wrote: > Hi > > I was about to partition a large (?) approx 3T of data 2B rows into > partition tables but broken up into MM ... > > Now I have been reading about limiting the number of partitions otherwise > it could slow down the parser. > > My reasoning for limiting to MM was that most of the request would be > monthly based. > > Should I be making the partitioning based on instead and have lots > more indexs. > > If I have an index on the timestamp field will it help limiting to MM ? The major advantages of partitions are enumerated here: https://www.postgresql.org/docs/10/static/ddl-partitioning.html#ddl-partitioning-overview For your case, it might be that seq scans of an entire "monthly" partition turn out to be very advantageous, compared with index scan (or seq scan of entire 3TB data). Also DROPing the oldest partition every month is commonly very much more efficient than DELETEing it.. There are warnings like these: |All constraints on all partitions of the master table are examined during |constraint exclusion, so large numbers of partitions are likely to increase |query planning time considerably. Partitioning using these techniques will work |well with up to perhaps a hundred partitions; don't try to use many thousands |of partitions. Unless you have 100s of years of data I don't think it would be a problem. For us, having hundreds of partitions hasn't been an issue (planning time is insignificant for our analytic report queries). But there's an overhead to partitions and at some point the cost becomes significant. (Actually, I think one cost which *did* hit us, while experimenting with *daily* partition granularity of every table, was probably due to very large pg_statistics and pg_attributes tables, which no longer fit in buffer cache). Justin -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] partitioning question -- how to guarantee uniqueness across partitions
On Jun 28, 2009, at 11:45 AM, Whit Armstrong wrote: Thanks, Tom. Let me give a little more detail on my actual data rather than the simple example I sent. I have a 60GB table of loan balances, which I've partitioned into 26 tables. The loan id's are a sequence of 6 characters, so the partitioning rule I've used is the first character of the loan id, which yields roughly equal sized partitions of 2.8 GB or so. Each loan can only have one balance per month, so the primary key on each partition is set to be loan_id and asofdate. However, this data is meant to be available via a rails application, hence, the need for a surrogate key of integers which is unique across the entire set of partitions. Creation of new rows in the partitioned tables should not be an issue under normal circumstances because I see that all of the child tables use the same sequence for generating new id's. However, what makes me nervous is that there is no explicit constraint in the database that prevents duplicate id's from being created, and I'm not sure how the rails app would react if for whatever reason duplicate id keys wound up in the table. As long as your inserts always use the default value, nextval('sequence_name'), for the id values then that can never happen unless you at some point use setval('sequence_name', X) where X = the max value already present in your partitioned table, which you should never be doing anyway. Erik Jones, Database Administrator Engine Yard Support, Scalability, Reliability 866.518.9273 x 260 Location: US/Pacific IRC: mage2k -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] partitioning question -- how to guarantee uniqueness across partitions
Whit Armstrong armstrong.w...@gmail.com writes: I have a simple example copied from the 8.3 manual on partitioning (http://www.postgresql.org/docs/8.3/interactive/ddl-partitioning.html). My question is, if you create a serial type in the parent table which is meant to be the primary key across all the partitions, how does one guarantee uniqueness of that key? One doesn't. That is not an appropriate way to set up a partitioned table. You need a primary key that can actually be used as a meaningful partitioning key. In this example, the id is completely useless and what you should be looking at is making the data_value be the primary key. regards, tom lane -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] partitioning question -- how to guarantee uniqueness across partitions
Thanks, Tom. Let me give a little more detail on my actual data rather than the simple example I sent. I have a 60GB table of loan balances, which I've partitioned into 26 tables. The loan id's are a sequence of 6 characters, so the partitioning rule I've used is the first character of the loan id, which yields roughly equal sized partitions of 2.8 GB or so. Each loan can only have one balance per month, so the primary key on each partition is set to be loan_id and asofdate. However, this data is meant to be available via a rails application, hence, the need for a surrogate key of integers which is unique across the entire set of partitions. Creation of new rows in the partitioned tables should not be an issue under normal circumstances because I see that all of the child tables use the same sequence for generating new id's. However, what makes me nervous is that there is no explicit constraint in the database that prevents duplicate id's from being created, and I'm not sure how the rails app would react if for whatever reason duplicate id keys wound up in the table. Any suggestions? Thanks, Whit On Sun, Jun 28, 2009 at 1:27 PM, Tom Lanet...@sss.pgh.pa.us wrote: Whit Armstrong armstrong.w...@gmail.com writes: I have a simple example copied from the 8.3 manual on partitioning (http://www.postgresql.org/docs/8.3/interactive/ddl-partitioning.html). My question is, if you create a serial type in the parent table which is meant to be the primary key across all the partitions, how does one guarantee uniqueness of that key? One doesn't. That is not an appropriate way to set up a partitioned table. You need a primary key that can actually be used as a meaningful partitioning key. In this example, the id is completely useless and what you should be looking at is making the data_value be the primary key. regards, tom lane -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general