Re: [GENERAL] partitioning question

2017-07-31 Thread Alex Samad
The comparison would be

a if then else end if .. about 8 of them 2013-> and a static insert into

v's

making a dynamic string and using execute, my presumption would be the
execute would be expensive verses a INSERT command

A

On 1 August 2017 at 07:04, Scott Marlowe  wrote:

> On Sun, Jul 30, 2017 at 7:13 PM, Alex Samad  wrote:
> > How expensive is dynamic over static.  I'm looking at storing yearly
> now, so
> > I figure if my if then clause has the latest year at the top it should be
> > very quick.
>
> Assuming you're not doing anything particularly crazy it's minimal.
> But what is good performance for one application may not be acceptable
> for others. Generally the cost of inserting is MUCH higher than the
> cost of dynamically setting the target, esp if you stick to plpgsql
> and don't try to use rules to accomplish it.
>


Re: [GENERAL] partitioning question

2017-07-31 Thread Scott Marlowe
On Sun, Jul 30, 2017 at 7:13 PM, Alex Samad  wrote:
> How expensive is dynamic over static.  I'm looking at storing yearly now, so
> I figure if my if then clause has the latest year at the top it should be
> very quick.

Assuming you're not doing anything particularly crazy it's minimal.
But what is good performance for one application may not be acceptable
for others. Generally the cost of inserting is MUCH higher than the
cost of dynamically setting the target, esp if you stick to plpgsql
and don't try to use rules to accomplish it.


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] partitioning question

2017-07-30 Thread Alex Samad
How expensive is dynamic over static.  I'm looking at storing yearly now,
so I figure if my if then clause has the latest year at the top it should
be very quick.




On 31 July 2017 at 11:07, Justin Pryzby  wrote:

> On Mon, Jul 31, 2017 at 10:25:54AM +1000, Alex Samad wrote:
> > I note that you link to P10 and I am currently looking at 9.6.  The
> changes
> > do look nice for partitioning for p10.
>
> Yes sorry, pg10 is beta - avoid using it except for testing purposes.
>
> > I will add currently we don't delete anything, we will keep adding to it.
> >
> > Also I am thinking my insert trigger becomes a lot smaller and easier if
> I
> > leave it at yearly.
> Note: the trigger function can either be a static function updated monthly
> (to
> handle the next month), preferably with the most recent months tested
> first (so
> a typical newly-inserted rows only goes through one if/case test).
>
> Alternately, the trigger function can dynamically compute the table into
> which
> to insert using plpgsql "format()" similar to here:
> https://www.postgresql.org/docs/9.1/static/plpgsql-
> statements.html#PLPGSQL-QUOTE-LITERAL-EXAMPLE
>
> Justin
>


Re: [GENERAL] partitioning question

2017-07-30 Thread Justin Pryzby
On Mon, Jul 31, 2017 at 10:25:54AM +1000, Alex Samad wrote:
> I note that you link to P10 and I am currently looking at 9.6.  The changes
> do look nice for partitioning for p10.

Yes sorry, pg10 is beta - avoid using it except for testing purposes.

> I will add currently we don't delete anything, we will keep adding to it.
> 
> Also I am thinking my insert trigger becomes a lot smaller and easier if I
> leave it at yearly.
Note: the trigger function can either be a static function updated monthly (to
handle the next month), preferably with the most recent months tested first (so
a typical newly-inserted rows only goes through one if/case test).

Alternately, the trigger function can dynamically compute the table into which
to insert using plpgsql "format()" similar to here:
https://www.postgresql.org/docs/9.1/static/plpgsql-statements.html#PLPGSQL-QUOTE-LITERAL-EXAMPLE

Justin


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] partitioning question

2017-07-30 Thread Alex Samad
Hi

I note that you link to P10 and I am currently looking at 9.6.  The changes
do look nice for partitioning for p10.


Interesting your suggest that the MM parition isn't that bad.

I will add currently we don't delete anything, we will keep adding to it.

Also I am thinking my insert trigger becomes a lot smaller and easier if I
leave it at yearly.

Also thinking if P10 was the current recommended version right now I would
probably look at MM because it looks like it makes partitioning easier

Alex



On 31 July 2017 at 09:54, Justin Pryzby  wrote:

> On Mon, Jul 31, 2017 at 09:15:29AM +1000, Alex Samad wrote:
> > Hi
> >
> > I was about to partition a large (?) approx 3T of data 2B rows into
> > partition tables but broken up into MM ...
> >
> > Now I have been reading about limiting the number of partitions otherwise
> > it could slow down the parser.
> >
> > My reasoning for limiting to MM was that most of the request would be
> > monthly based.
> >
> > Should I be making the partitioning based on  instead and have lots
> > more indexs.
> >
> > If I have an index on the timestamp field will it help limiting to
> MM ?
>
> The major advantages of partitions are enumerated here:
> https://www.postgresql.org/docs/10/static/ddl-partitioning.html#ddl-
> partitioning-overview
>
> For your case, it might be that seq scans of an entire "monthly" partition
> turn
> out to be very advantageous, compared with index scan (or seq scan of
> entire
> 3TB data).
>
> Also DROPing the oldest partition every month is commonly very much more
> efficient than DELETEing it..
>
> There are warnings like these:
>
> |All constraints on all partitions of the master table are examined during
> |constraint exclusion, so large numbers of partitions are likely to
> increase
> |query planning time considerably. Partitioning using these techniques
> will work
> |well with up to perhaps a hundred partitions; don't try to use many
> thousands
> |of partitions.
>
> Unless you have 100s of years of data I don't think it would be a problem.
>
> For us, having hundreds of partitions hasn't been an issue (planning time
> is
> insignificant for our analytic report queries).  But there's an overhead to
> partitions and at some point the cost becomes significant.  (Actually, I
> think
> one cost which *did* hit us, while experimenting with *daily* partition
> granularity of every table, was probably due to very large pg_statistics
> and
> pg_attributes tables, which no longer fit in buffer cache).
>
> Justin
>
>
> --
> Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-general
>


Re: [GENERAL] partitioning question

2017-07-30 Thread Justin Pryzby
On Mon, Jul 31, 2017 at 09:15:29AM +1000, Alex Samad wrote:
> Hi
> 
> I was about to partition a large (?) approx 3T of data 2B rows into
> partition tables but broken up into MM ...
> 
> Now I have been reading about limiting the number of partitions otherwise
> it could slow down the parser.
> 
> My reasoning for limiting to MM was that most of the request would be
> monthly based.
> 
> Should I be making the partitioning based on  instead and have lots
> more indexs.
> 
> If I have an index on the timestamp field will it help limiting to MM ?

The major advantages of partitions are enumerated here:
https://www.postgresql.org/docs/10/static/ddl-partitioning.html#ddl-partitioning-overview

For your case, it might be that seq scans of an entire "monthly" partition turn
out to be very advantageous, compared with index scan (or seq scan of entire
3TB data).

Also DROPing the oldest partition every month is commonly very much more
efficient than DELETEing it..

There are warnings like these:

|All constraints on all partitions of the master table are examined during
|constraint exclusion, so large numbers of partitions are likely to increase
|query planning time considerably. Partitioning using these techniques will work
|well with up to perhaps a hundred partitions; don't try to use many thousands
|of partitions.

Unless you have 100s of years of data I don't think it would be a problem.

For us, having hundreds of partitions hasn't been an issue (planning time is
insignificant for our analytic report queries).  But there's an overhead to
partitions and at some point the cost becomes significant.  (Actually, I think
one cost which *did* hit us, while experimenting with *daily* partition
granularity of every table, was probably due to very large pg_statistics and
pg_attributes tables, which no longer fit in buffer cache).

Justin


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


[GENERAL] partitioning question

2017-07-30 Thread Alex Samad
Hi

I was about to partition a large (?) approx 3T of data 2B rows into
partition tables but broken up into MM ...

Now I have been reading about limiting the number of partitions otherwise
it could slow down the parser.

My reasoning for limiting to MM was that most of the request would be
monthly based.

Should I be making the partitioning based on  instead and have lots
more indexs.

If I have an index on the timestamp field will it help limiting to MM ?


A


Re: [GENERAL] Partitioning

2017-07-26 Thread George Neuner
On Tue, 25 Jul 2017 18:21:43 +0530, Krithika Venkatesh
 wrote:

>I have a table that is partitioned on a numeric column (ID).
>
>Partitioning works when I query the table with no joins.
>
>SELECT * FROM TABLE A a WHERE ID IN (SELECT ID FROM TABLE B b WHERE
>CREATED_TS = CURRENT_TIMESTAMP)
>
>Partitioning doesn't work when I do join.
>
>SELECT A.* FROM TABLE A a INNER JOIN TABLE B b ON  a.ID = b.ID.
>
>Is there any other option that would work.
>
>Thanks in Advance..

The subselect is constraining the set of ID value(s) to be matched in
A, which (at least potentially) permits identifying the relevant
partition(s).

The join must include all partitions of A because the set of ID values
to be matched with B are not constrained.

Also, the join query is not equivalent because it does not include the
timestamp constraint on B.  I don't think that will make any
difference to the query plan ... AFAICS, it still needs to consider
all partitions of A ... but it may improve performance.

George



-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Partitioning (constraint exclusion involving joins)

2017-07-25 Thread Krithika Venkatesh
We understand the constraints exclusion will work only on constant values.
But in our case we will never pass a constant value to the partitioning key
when we query the partition tables. Will the partition be beneficial in
this case. If yes, can you please explain.

Thanks

On 25-Jul-2017 6:46 PM, "Justin Pryzby"  wrote:

On Tue, Jul 25, 2017 at 06:21:43PM +0530, Krithika Venkatesh wrote:
> I have a table that is partitioned on a numeric column (ID).
>
> Partitioning works when I query the table with no joins.
>
> SELECT * FROM TABLE A a WHERE ID IN (SELECT ID FROM TABLE B b WHERE
> CREATED_TS = CURRENT_TIMESTAMP)
>
> Partitioning doesn't work when I do join.
>
> SELECT A.* FROM TABLE A a INNER JOIN TABLE B b ON  a.ID = b.ID.

I think you mean "constraint exclusion doesn't work when yo do a join",

which is because it only works on simple values compiled before the planner
gets to see them:

main=# explain SELECT COUNT(1) FROM eric_enodeb_metrics WHERE
start_time>now(); -- -'999 minutes'::interval;
   QUERY PLAN

-
 Aggregate  (cost=62.44..62.45 rows=1 width=8)
   ->  Append  (cost=0.00..62.40 rows=14 width=0)
 ->  Seq Scan on eric_enodeb_metrics  (cost=0.00..0.00 rows=1
width=0)
   Filter: (start_time > now())
 ->  Index Only Scan using eric_enodeb_201607_idx on
eric_enodeb_201607  (cost=0.42..4.44 rows=1 width=0)
   Index Cond: (start_time > now())
 ->  Index Only Scan using eric_enodeb_201608_idx on
eric_enodeb_201608  (cost=0.42..4.44 rows=1 width=0)
   Index Cond: (start_time > now())
 ->  Index Only Scan using eric_enodeb_201609_idx on
eric_enodeb_201609  (cost=0.42..4.44 rows=1 width=0)
   Index Cond: (start_time > now())

https://www.postgresql.org/docs/current/static/ddl-partitioning.html
|The following caveats apply to constraint exclusion:
| Constraint exclusion only works when the query's WHERE clause contains
|constants (or externally supplied parameters). For example, a comparison
|against a non-immutable function such as CURRENT_TIMESTAMP cannot be
optimized,
|since the planner cannot know which partition the function value might fall
|into at run time.
[..]


.. and see an early mail on its implementation, here:
https://www.postgresql.org/message-id/1121251997.3970.237.camel@localhost.
localdomain

Justin


Re: [GENERAL] Partitioning (constraint exclusion involving joins)

2017-07-25 Thread Justin Pryzby
On Tue, Jul 25, 2017 at 06:21:43PM +0530, Krithika Venkatesh wrote:
> I have a table that is partitioned on a numeric column (ID).
> 
> Partitioning works when I query the table with no joins.
> 
> SELECT * FROM TABLE A a WHERE ID IN (SELECT ID FROM TABLE B b WHERE
> CREATED_TS = CURRENT_TIMESTAMP)
> 
> Partitioning doesn't work when I do join.
> 
> SELECT A.* FROM TABLE A a INNER JOIN TABLE B b ON  a.ID = b.ID.

I think you mean "constraint exclusion doesn't work when yo do a join",

which is because it only works on simple values compiled before the planner
gets to see them:

main=# explain SELECT COUNT(1) FROM eric_enodeb_metrics WHERE start_time>now(); 
-- -'999 minutes'::interval;
   QUERY PLAN   
 
-
 Aggregate  (cost=62.44..62.45 rows=1 width=8)
   ->  Append  (cost=0.00..62.40 rows=14 width=0)
 ->  Seq Scan on eric_enodeb_metrics  (cost=0.00..0.00 rows=1 width=0)
   Filter: (start_time > now())
 ->  Index Only Scan using eric_enodeb_201607_idx on eric_enodeb_201607 
 (cost=0.42..4.44 rows=1 width=0)
   Index Cond: (start_time > now())
 ->  Index Only Scan using eric_enodeb_201608_idx on eric_enodeb_201608 
 (cost=0.42..4.44 rows=1 width=0)
   Index Cond: (start_time > now())
 ->  Index Only Scan using eric_enodeb_201609_idx on eric_enodeb_201609 
 (cost=0.42..4.44 rows=1 width=0)
   Index Cond: (start_time > now())

https://www.postgresql.org/docs/current/static/ddl-partitioning.html
|The following caveats apply to constraint exclusion:
| Constraint exclusion only works when the query's WHERE clause contains
|constants (or externally supplied parameters). For example, a comparison
|against a non-immutable function such as CURRENT_TIMESTAMP cannot be optimized,
|since the planner cannot know which partition the function value might fall
|into at run time.
[..]


.. and see an early mail on its implementation, here:
https://www.postgresql.org/message-id/1121251997.3970.237.camel@localhost.localdomain

Justin


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


[GENERAL] Partitioning

2017-07-25 Thread Krithika Venkatesh
Hi,

I have a table that is partitioned on a numeric column (ID).

Partitioning works when I query the table with no joins.

SELECT * FROM TABLE A a WHERE ID IN (SELECT ID FROM TABLE B b WHERE
CREATED_TS = CURRENT_TIMESTAMP)

Partitioning doesn't work when I do join.

SELECT A.* FROM TABLE A a INNER JOIN TABLE B b ON  a.ID = b.ID.

Is there any other option that would work.

Thanks in Advance..


Re: [GENERAL] Partitioning and Table Inheritance

2017-05-12 Thread Andrew Staller
Hi Paul,

How much of your data is time-series in nature? Put another way, is there a
timestamp coupled with the inserted data?

Andrew

On Fri, May 12, 2017 at 4:38 PM, Ivan E. Panchenko <
i.panche...@postgrespro.ru> wrote:

> Hi
>
>
> 12.05.2017 23:22, Justin Pryzby пишет:
>
>> On Mon, May 08, 2017 at 10:12:18AM -0700, Paul A Jungwirth wrote:
>>
>>> I'm working on a problem where partitioning seems to be the right
>>> approach, but we would need a lot of partitions (say 10k or 100k).
>>> Everywhere I read that after ~100 child tables you experience
>>> problems. I have a few questions about that:
>>>
>> We use partitioning, previously one child per month (with history of 1-6
>> years); I tried using one child per day, and caused issues.
>>
>> For us, planning time is pretty unimportant (~1sec would be acceptable
>> 99% of
>> the time) but I recall seeing even more than that.  I changed to using
>> daily
>> granularity for only our largest tables, which seems to be working fine
>> for the
>> last ~9months.  So the issue isn't just "number of children" but "total
>> number
>> of tables".  I believe the problem may have been due to large
>> pg_statistic/pg_attribute and similar tables taking more than a few
>> 100MBs, and
>> potentially no longer fitting in buffer cache.
>>
>> 3. Is it true that query planning time should scale linearly as I add
>>> more child tables?
>>>
>> I believe it's understood to be super-linear:
>> https://www.postgresql.org/message-id/26761.1483989025%40sss.pgh.pa.us
>> https://www.postgresql.org/message-id/4188.1298960419%40sss.pgh.pa.us
>>
> There is an extension called pg_pathman which seriously optimizes the
> table partitioning, it might help in your case:
>
> https://github.com/postgrespro/pg_pathman
>
> See also:
>
> https://www.postgresql.org/message-id/4188.1298960419%40sss.pgh.pa.us and
>
> https://postgrespro.com/blog/pgsql/pg_pathman_e
>
>>
>> Justin
>>
>>
>> Ivan
>
>
>
> --
> Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-general
>



-- 
TimescaleDB* | *Growth & Developer Evangelism
c: 908.581.9509

335 Madison Ave.
New York, NY 10017
http://www.timescale.com/
https://github.com/timescale/timescaledb


Re: [GENERAL] Partitioning and Table Inheritance

2017-05-12 Thread Ivan E. Panchenko

Hi


12.05.2017 23:22, Justin Pryzby пишет:

On Mon, May 08, 2017 at 10:12:18AM -0700, Paul A Jungwirth wrote:

I'm working on a problem where partitioning seems to be the right
approach, but we would need a lot of partitions (say 10k or 100k).
Everywhere I read that after ~100 child tables you experience
problems. I have a few questions about that:

We use partitioning, previously one child per month (with history of 1-6
years); I tried using one child per day, and caused issues.

For us, planning time is pretty unimportant (~1sec would be acceptable 99% of
the time) but I recall seeing even more than that.  I changed to using daily
granularity for only our largest tables, which seems to be working fine for the
last ~9months.  So the issue isn't just "number of children" but "total number
of tables".  I believe the problem may have been due to large
pg_statistic/pg_attribute and similar tables taking more than a few 100MBs, and
potentially no longer fitting in buffer cache.


3. Is it true that query planning time should scale linearly as I add
more child tables?

I believe it's understood to be super-linear:
https://www.postgresql.org/message-id/26761.1483989025%40sss.pgh.pa.us
https://www.postgresql.org/message-id/4188.1298960419%40sss.pgh.pa.us
There is an extension called pg_pathman which seriously optimizes the 
table partitioning, it might help in your case:


https://github.com/postgrespro/pg_pathman

See also:

https://www.postgresql.org/message-id/4188.1298960419%40sss.pgh.pa.us and

https://postgrespro.com/blog/pgsql/pg_pathman_e


Justin



Ivan


--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Partitioning and Table Inheritance

2017-05-12 Thread Justin Pryzby
On Mon, May 08, 2017 at 10:12:18AM -0700, Paul A Jungwirth wrote:
> I'm working on a problem where partitioning seems to be the right
> approach, but we would need a lot of partitions (say 10k or 100k).
> Everywhere I read that after ~100 child tables you experience
> problems. I have a few questions about that:

We use partitioning, previously one child per month (with history of 1-6
years); I tried using one child per day, and caused issues.

For us, planning time is pretty unimportant (~1sec would be acceptable 99% of
the time) but I recall seeing even more than that.  I changed to using daily
granularity for only our largest tables, which seems to be working fine for the
last ~9months.  So the issue isn't just "number of children" but "total number
of tables".  I believe the problem may have been due to large
pg_statistic/pg_attribute and similar tables taking more than a few 100MBs, and
potentially no longer fitting in buffer cache.

> 3. Is it true that query planning time should scale linearly as I add
> more child tables?

I believe it's understood to be super-linear:
https://www.postgresql.org/message-id/26761.1483989025%40sss.pgh.pa.us
https://www.postgresql.org/message-id/4188.1298960419%40sss.pgh.pa.us

Justin


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


[GENERAL] Partitioning and Table Inheritance

2017-05-08 Thread Paul A Jungwirth
I'm working on a problem where partitioning seems to be the right
approach, but we would need a lot of partitions (say 10k or 100k).
Everywhere I read that after ~100 child tables you experience
problems. I have a few questions about that:

1. Is it true that the only disadvantage to 10k children is query
planning time when hitting the parent table? Is there anything else to
worry about? (Unlike ext3, ext4 seems to have no problem with lots of
files in one directory (e.g. 1,000,000). It doesn't break and it even
performs fine.) I know a lot of tables increases memory use. There is
clutter too in \dt but I think I can use schemas to cut down on that.
Anything else?

2. Is there any reason I shouldn't bypass the query planning cost by
having the app just SELECT from the correct child table (which is easy
for us 99% of the time)? Similarly I could bypass trigger time by
INSERTing directly into the child, etc.

3. Is it true that query planning time should scale linearly as I add
more child tables? I'm seeing ~16ms planning time from EXPLAIN ANALYZE
with 1000 tables. I haven't tried 10k tables yet, but so far 16ms or
even 160ms seems tolerable if it's only for the 1% of queries that
can't SELECT directly from a known child table.

4. I tried a scheme where instead of one parent table with 1000 direct
children, I have 1 parent with 10 children, and they each have 10
children, and they each have 10 children. I thought by using
increasingly-specific constraints I could maybe make query planning
scale by O(log n) instead of O(n), but instead it went up! Upon
investigating it looks like maybe the planner flattens the inheritance
hierarchy before doing constraint_exclusion work. Is that true? If so,
is there any interest in changing this in future versions? (I might be
willing to take a stab at a patch for it.)

Btw this is on 9.4 but it could be 9.5 or 9.6 if that would help any.

Thanks,
Paul


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Partitioning and ORM tools

2016-03-29 Thread Brian Fehrle

Here is a working example of trigger based partitioning with a view and
'do instead' that works with ORM tools using the affected rows return
(example attached).

The key things that make it work are:

1. RETURN NEW; (in the function after inserting into the partition)

2. INSTEAD OF INSERT (in the trigger)


example:
insert into data_log_view (date, thingy) values ('2015-01-02', 'test');
INSERT 0 1



On 3/24/16 8:28 AM, CS DBA wrote:



On 03/23/2016 02:48 AM, Chris Travers wrote:



On Wed, Mar 23, 2016 at 9:39 AM, Chris Travers
> wrote:

Use a view with a DO INSTEAD trigger. That will allow you to
return the tuple properly.

On Tue, Mar 22, 2016 at 7:40 PM, CS DBA
 wrote:

Hi All;

we setup partitioning for a large table but had to back off
because the return status (i.e: "INSERT 0 1") returns "INSERT
0 0" when inserting into the partitioned table which causes
the ORM tool to assume the insert inserted 0 rows.  Is there
a standard / best practices work around for this?


Apologies for the top post above.

Just noting additionally that the view with DO INSTEAD approach was
suggested to me by Matt Trout (major contributor to the DBIx::Class
ORM in Perl.

I have used it.  It works well.  I think it is the best practice there.


Thanks in advance




--
Sent via pgsql-general mailing list
(pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general




--
Best Wishes,
Chris Travers

Efficito:  Hosted Accounting and ERP. Robust and Flexible.  No
vendor lock-in.
http://www.efficito.com/learn_more




--
Best Wishes,
Chris Travers

Efficito:  Hosted Accounting and ERP.  Robust and Flexible.  No
vendor lock-in.
http://www.efficito.com/learn_more


All;

Thanks for the great Ideas, I'll let you know where we end up.





Brian FehrleDatabase Administrator II | comScore, Inc. (NASDAQ:SCOR)
bfeh...@comscore.com
|  | CO

Rentrak and comScore are now one, creating the new model for a dynamic 
cross-platform world. To learn more, visit: www.comscore.com
--
-- PostgreSQL database dump
--

SET statement_timeout = 0;
SET lock_timeout = 0;
SET client_encoding = 'UTF8';
SET standard_conforming_strings = on;
SET check_function_bodies = false;
SET client_min_messages = warning;

--
-- Name: part; Type: SCHEMA; Schema: -; Owner: postgres
--

CREATE SCHEMA part;


ALTER SCHEMA part OWNER TO postgres;

--
-- Name: plpgsql; Type: EXTENSION; Schema: -; Owner: 
--

CREATE EXTENSION IF NOT EXISTS plpgsql WITH SCHEMA pg_catalog;


--
-- Name: EXTENSION plpgsql; Type: COMMENT; Schema: -; Owner: 
--

COMMENT ON EXTENSION plpgsql IS 'PL/pgSQL procedural language';


SET search_path = public, pg_catalog;

--
-- Name: insert_trigger(); Type: FUNCTION; Schema: public; Owner: postgres
--

CREATE FUNCTION insert_trigger() RETURNS trigger
LANGUAGE plpgsql
AS $$
BEGIN
INSERT INTO part.data_log_2015 VALUES (NEW.*);
RETURN NEW;
END;
$$;


ALTER FUNCTION public.insert_trigger() OWNER TO postgres;

SET default_tablespace = '';

SET default_with_oids = false;

--
-- Name: data_log; Type: TABLE; Schema: public; Owner: postgres; Tablespace: 
--

CREATE TABLE data_log (
data_log_sid integer NOT NULL,
date timestamp without time zone NOT NULL,
thingy character varying
);


ALTER TABLE data_log OWNER TO postgres;

--
-- Name: data_log_data_log_sid_seq; Type: SEQUENCE; Schema: public; Owner: 
postgres
--

CREATE SEQUENCE data_log_data_log_sid_seq
START WITH 1
INCREMENT BY 1
NO MINVALUE
NO MAXVALUE
CACHE 1;


ALTER TABLE data_log_data_log_sid_seq OWNER TO postgres;

--
-- Name: data_log_data_log_sid_seq; Type: SEQUENCE OWNED BY; Schema: public; 
Owner: postgres
--

ALTER SEQUENCE data_log_data_log_sid_seq OWNED BY data_log.data_log_sid;


SET search_path = part, pg_catalog;

--
-- Name: data_log_2015; Type: TABLE; Schema: part; Owner: postgres; Tablespace: 
--

CREATE TABLE data_log_2015 (
data_log_sid integer DEFAULT 
nextval('public.data_log_data_log_sid_seq'::regclass),
date timestamp without time zone,
thingy character varying
)
INHERITS (public.data_log);


ALTER TABLE data_log_2015 OWNER TO postgres;

SET search_path = public, pg_catalog;

--
-- Name: data_log_view; Type: VIEW; Schema: public; Owner: postgres
--

CREATE VIEW data_log_view AS
 SELECT data_log.data_log_sid,
data_log.date,
data_log.thingy
   FROM data_log;


ALTER TABLE data_log_view OWNER TO postgres;

--
-- Name: data_log_sid; Type: DEFAULT; Schema: public; Owner: postgres
--

ALTER TABLE ONLY data_log ALTER COLUMN 

Re: [GENERAL] Partitioning and ORM tools

2016-03-24 Thread CS DBA



On 03/23/2016 02:48 AM, Chris Travers wrote:



On Wed, Mar 23, 2016 at 9:39 AM, Chris Travers 
> wrote:


Use a view with a DO INSTEAD trigger. That will allow you to
return the tuple properly.

On Tue, Mar 22, 2016 at 7:40 PM, CS DBA
>
wrote:

Hi All;

we setup partitioning for a large table but had to back off
because the return status (i.e: "INSERT 0 1") returns "INSERT
0 0" when inserting into the partitioned table which causes
the ORM tool to assume the insert inserted 0 rows.  Is there a
standard / best practices work around for this?


Apologies for the top post above.

Just noting additionally that the view with DO INSTEAD approach was 
suggested to me by Matt Trout (major contributor to the DBIx::Class 
ORM in Perl.


I have used it.  It works well.  I think it is the best practice there.


Thanks in advance




-- 
Sent via pgsql-general mailing list

(pgsql-general@postgresql.org
)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general




-- 
Best Wishes,

Chris Travers

Efficito:  Hosted Accounting and ERP. Robust and Flexible.  No
vendor lock-in.
http://www.efficito.com/learn_more




--
Best Wishes,
Chris Travers

Efficito:  Hosted Accounting and ERP.  Robust and Flexible.  No vendor 
lock-in.

http://www.efficito.com/learn_more


All;

Thanks for the great Ideas, I'll let you know where we end up.




Re: [GENERAL] Partitioning and ORM tools

2016-03-23 Thread Chris Travers
On Wed, Mar 23, 2016 at 9:39 AM, Chris Travers 
wrote:

> Use a view with a DO INSTEAD trigger.   That will allow you to return the
> tuple properly.
>
> On Tue, Mar 22, 2016 at 7:40 PM, CS DBA 
> wrote:
>
>> Hi All;
>>
>> we setup partitioning for a large table but had to back off because the
>> return status (i.e: "INSERT 0 1") returns "INSERT 0 0" when inserting into
>> the partitioned table which causes the ORM tool to assume the insert
>> inserted 0 rows.  Is there a standard / best practices work around for this?
>>
>
Apologies for the top post above.

Just noting additionally that the view with DO INSTEAD approach was
suggested to me by Matt Trout (major contributor to the DBIx::Class ORM in
Perl.

I have used it.  It works well.  I think it is the best practice there.

>
>> Thanks in advance
>>
>>
>>
>>
>> --
>> Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
>> To make changes to your subscription:
>> http://www.postgresql.org/mailpref/pgsql-general
>>
>
>
>
> --
> Best Wishes,
> Chris Travers
>
> Efficito:  Hosted Accounting and ERP.  Robust and Flexible.  No vendor
> lock-in.
> http://www.efficito.com/learn_more
>



-- 
Best Wishes,
Chris Travers

Efficito:  Hosted Accounting and ERP.  Robust and Flexible.  No vendor
lock-in.
http://www.efficito.com/learn_more


Re: [GENERAL] Partitioning and ORM tools

2016-03-23 Thread Chris Travers
Use a view with a DO INSTEAD trigger.   That will allow you to return the
tuple properly.

On Tue, Mar 22, 2016 at 7:40 PM, CS DBA  wrote:

> Hi All;
>
> we setup partitioning for a large table but had to back off because the
> return status (i.e: "INSERT 0 1") returns "INSERT 0 0" when inserting into
> the partitioned table which causes the ORM tool to assume the insert
> inserted 0 rows.  Is there a standard / best practices work around for this?
>
> Thanks in advance
>
>
>
>
> --
> Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-general
>



-- 
Best Wishes,
Chris Travers

Efficito:  Hosted Accounting and ERP.  Robust and Flexible.  No vendor
lock-in.
http://www.efficito.com/learn_more


Re: [GENERAL] Partitioning and ORM tools

2016-03-22 Thread John R Pierce

On 3/22/2016 2:20 PM, CS DBA wrote:
I would think the ORM (as yet undefined) would want to think in terms 
of the parent table and not know about the physical schema details.
Can  the client not be written to check only for errors vs checking 
for non-zero inserts?





That was our first suggestion, they don;t want to make any app changes


so they want someone else to make major architectural changes. great.



--
john r pierce, recycling bits in santa cruz



--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Partitioning and ORM tools

2016-03-22 Thread Manuel Kniep
> So the ORM is parsing the INSERT return value, correct?
> 
> Would something like this(borrowing from docs example) freak it out?:
> 
> CREATE OR REPLACE FUNCTION measurement_insert_trigger()
> RETURNS TRIGGER AS $$
> DECLARE
>_ct int;
> BEGIN
>INSERT INTO measurement_y2016m03 VALUES (NEW.*);
>SELECT INTO  _ct count(NEW.*);
>RAISE NOTICE 'INSERT 0 %', _ct;
>RETURN NULL;
> END;
> $$
> LANGUAGE plpgsql;
> 
> test=# insert into measurement values(1, '03/21/2016', 50, 87);
> NOTICE:  INSERT 0 1
> INSERT 0 0
> 
> 


we had a similar problem using ruby and ActiveRecord and solved it with

 RETURN NEW;

at the end of  the insert trigger 

which would result in inserting the row into the master table as well
that is then deleted right away in an AFTER INSERT trigger

CREATE OR REPLACE FUNCTION delete_master_trigger()
 DECLARE
 r master%rowtype;
 BEGIN  
 DELETE FROM ONLY master WHERE id = NEW.id returning * into r;
 RETURN r;  
 END;
$$
LANGUAGE plpgsql;  

Returning the inserted row here also solves the problem that ORM often need 
auto increment values back.


regards

Manuel Kniep






Re: [GENERAL] Partitioning and ORM tools

2016-03-22 Thread Manuel Kniep
> So the ORM is parsing the INSERT return value, correct?
> 
> Would something like this(borrowing from docs example) freak it out?:
> 
> CREATE OR REPLACE FUNCTION measurement_insert_trigger()
> RETURNS TRIGGER AS $$
> DECLARE
>_ct int;
> BEGIN
>INSERT INTO measurement_y2016m03 VALUES (NEW.*);
>SELECT INTO  _ct count(NEW.*);
>RAISE NOTICE 'INSERT 0 %', _ct;
>RETURN NULL;
> END;
> $$
> LANGUAGE plpgsql;
> 
> test=# insert into measurement values(1, '03/21/2016', 50, 87);
> NOTICE:  INSERT 0 1
> INSERT 0 0
> 
> 


we had a similar problem using ruby and ActiveRecord and solved it with

 RETURN NEW;

at the end of  the insert trigger 

which would result in inserting the row into the master table as well
that is then deleted right away in an AFTER INSERT trigger

CREATE OR REPLACE FUNCTION delete_master_trigger()
 DECLARE
 r master%rowtype;
 BEGIN  
 DELETE FROM ONLY master WHERE id = NEW.id returning * into r;
 RETURN r;  
 END;
$$
LANGUAGE plpgsql;  

Returning the inserted row here also solves the problem that ORM often need 
auto increment values back.


regards

Manuel Kniep






Re: [GENERAL] Partitioning and ORM tools

2016-03-22 Thread Adrian Klaver

On 03/22/2016 02:20 PM, CS DBA wrote:



On 03/22/2016 03:18 PM, Rob Sargent wrote:



On 03/22/2016 03:00 PM, Joshua D. Drake wrote:

On 03/22/2016 01:50 PM, CS DBA wrote:


Understood, was just wondering if there is a way to cause the child
table insert results to be returned to the ORM/Application instead of
the master/base table insert


Insert into the child table directly based on the partition rules.

JD



I would think the ORM (as yet undefined) would want to think in terms
of the parent table and not know about the physical schema details.
Can  the client not be written to check only for errors vs checking
for non-zero inserts?




That was our first suggestion, they don;t want to make any app changes


So the ORM is parsing the INSERT return value, correct?

Would something like this(borrowing from docs example) freak it out?:

CREATE OR REPLACE FUNCTION measurement_insert_trigger()
RETURNS TRIGGER AS $$
DECLARE
_ct int;
BEGIN
INSERT INTO measurement_y2016m03 VALUES (NEW.*);
SELECT INTO  _ct count(NEW.*);
RAISE NOTICE 'INSERT 0 %', _ct;
RETURN NULL;
END;
$$
LANGUAGE plpgsql;

test=# insert into measurement values(1, '03/21/2016', 50, 87);
NOTICE:  INSERT 0 1
INSERT 0 0











--
Adrian Klaver
adrian.kla...@aklaver.com


--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Partitioning and ORM tools

2016-03-22 Thread Melvin Davidson
On Tue, Mar 22, 2016 at 5:20 PM, CS DBA  wrote:

>
>
> On 03/22/2016 03:18 PM, Rob Sargent wrote:
>
>>
>>
>> On 03/22/2016 03:00 PM, Joshua D. Drake wrote:
>>
>>> On 03/22/2016 01:50 PM, CS DBA wrote:
>>>
>>> Understood, was just wondering if there is a way to cause the child
 table insert results to be returned to the ORM/Application instead of
 the master/base table insert

>>>
>>> Insert into the child table directly based on the partition rules.
>>>
>>> JD
>>>
>>>
>>> I would think the ORM (as yet undefined) would want to think in terms of
>> the parent table and not know about the physical schema details.
>> Can  the client not be written to check only for errors vs checking for
>> non-zero inserts?
>>
>>
>>
>> That was our first suggestion, they don;t want to make any app changes
>
>
>
>
>
> --
> Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-general
>

I highly suspect this is a problem with trigger function design, constraint
conflict or a bad insert statement, but since no details have been
provided, it cannot be resolved.
IOW, all they said was "It don't work", but they have not provided proof.

-- 
*Melvin Davidson*
I reserve the right to fantasize.  Whether or not you
wish to share my fantasy is entirely up to you.


Re: [GENERAL] Partitioning and ORM tools

2016-03-22 Thread CS DBA



On 03/22/2016 03:18 PM, Rob Sargent wrote:



On 03/22/2016 03:00 PM, Joshua D. Drake wrote:

On 03/22/2016 01:50 PM, CS DBA wrote:


Understood, was just wondering if there is a way to cause the child
table insert results to be returned to the ORM/Application instead of
the master/base table insert


Insert into the child table directly based on the partition rules.

JD


I would think the ORM (as yet undefined) would want to think in terms 
of the parent table and not know about the physical schema details.
Can  the client not be written to check only for errors vs checking 
for non-zero inserts?





That was our first suggestion, they don;t want to make any app changes




--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Partitioning and ORM tools

2016-03-22 Thread Rob Sargent



On 03/22/2016 03:00 PM, Joshua D. Drake wrote:

On 03/22/2016 01:50 PM, CS DBA wrote:


Understood, was just wondering if there is a way to cause the child
table insert results to be returned to the ORM/Application instead of
the master/base table insert


Insert into the child table directly based on the partition rules.

JD


I would think the ORM (as yet undefined) would want to think in terms of 
the parent table and not know about the physical schema details.
Can  the client not be written to check only for errors vs checking for 
non-zero inserts?




--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Partitioning and ORM tools

2016-03-22 Thread Joshua D. Drake

On 03/22/2016 01:50 PM, CS DBA wrote:


Understood, was just wondering if there is a way to cause the child
table insert results to be returned to the ORM/Application instead of
the master/base table insert


Insert into the child table directly based on the partition rules.

JD


--
Command Prompt, Inc.  http://the.postgres.company/
 +1-503-667-4564
PostgreSQL Centered full stack support, consulting and development.


--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Partitioning and ORM tools

2016-03-22 Thread CS DBA



On 03/22/2016 02:43 PM, Joshua D. Drake wrote:

On 03/22/2016 01:35 PM, CS DBA wrote:



On 03/22/2016 02:23 PM, Joshua D. Drake wrote:

On 03/22/2016 11:40 AM, CS DBA wrote:

Hi All;

we setup partitioning for a large table but had to back off because 
the

return status (i.e: "INSERT 0 1") returns "INSERT 0 0" when inserting
into the partitioned table which causes the ORM tool to assume the
insert inserted 0 rows.  Is there a standard / best practices work
around for this?

Thanks in advance


Are you inserting via trigger from an insert into a parent table? That
would do it.



Yes.. that's exactly the issue.


Because the parent table doesn't actually receive the insert, it is 
returning correctly (if obnoxiously considering the circumstances). It 
is known, expected behaviour.


Sincerely,

JD









Understood, was just wondering if there is a way to cause the child 
table insert results to be returned to the ORM/Application instead of 
the master/base table insert





--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Partitioning and ORM tools

2016-03-22 Thread Joshua D. Drake

On 03/22/2016 01:35 PM, CS DBA wrote:



On 03/22/2016 02:23 PM, Joshua D. Drake wrote:

On 03/22/2016 11:40 AM, CS DBA wrote:

Hi All;

we setup partitioning for a large table but had to back off because the
return status (i.e: "INSERT 0 1") returns "INSERT 0 0" when inserting
into the partitioned table which causes the ORM tool to assume the
insert inserted 0 rows.  Is there a standard / best practices work
around for this?

Thanks in advance


Are you inserting via trigger from an insert into a parent table? That
would do it.



Yes.. that's exactly the issue.


Because the parent table doesn't actually receive the insert, it is 
returning correctly (if obnoxiously considering the circumstances). It 
is known, expected behaviour.


Sincerely,

JD









--
Command Prompt, Inc.  http://the.postgres.company/
 +1-503-667-4564
PostgreSQL Centered full stack support, consulting and development.


--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Partitioning and ORM tools

2016-03-22 Thread CS DBA



On 03/22/2016 02:23 PM, Joshua D. Drake wrote:

On 03/22/2016 11:40 AM, CS DBA wrote:

Hi All;

we setup partitioning for a large table but had to back off because the
return status (i.e: "INSERT 0 1") returns "INSERT 0 0" when inserting
into the partitioned table which causes the ORM tool to assume the
insert inserted 0 rows.  Is there a standard / best practices work
around for this?

Thanks in advance


Are you inserting via trigger from an insert into a parent table? That 
would do it.




Yes.. that's exactly the issue.




--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Partitioning and ORM tools

2016-03-22 Thread Melvin Davidson
On Tue, Mar 22, 2016 at 4:23 PM, Joshua D. Drake 
wrote:

> On 03/22/2016 11:40 AM, CS DBA wrote:
>
>> Hi All;
>>
>> we setup partitioning for a large table but had to back off because the
>> return status (i.e: "INSERT 0 1") returns "INSERT 0 0" when inserting
>> into the partitioned table which causes the ORM tool to assume the
>> insert inserted 0 rows.  Is there a standard / best practices work
>> around for this?
>>
>> Thanks in advance
>>
>
> Are you inserting via trigger from an insert into a parent table? That
> would do it.
>
>
> --
> Command Prompt, Inc.  http://the.postgres.company/
>  +1-503-667-4564
> PostgreSQL Centered full stack support, consulting and development.
>
>
>
> --
> Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-general
>


>it's the fact that the number of rows inserted is returned as 0 - due to
>the fact that the rows are not in fact inserted in the parent table when
>configured as per the example in the docs.

Yes, you have declared a problem, but without providing actual detailed
information (as previously requested), there is no way we can duplicate
your problem to debug it and provide a solution.
To the best of my knowledge, telepathy and crystal ball options are not yet
available and are not planned for future versions of PostgreSQL/
-- 
*Melvin Davidson*
I reserve the right to fantasize.  Whether or not you
wish to share my fantasy is entirely up to you.


Re: [GENERAL] Partitioning and ORM tools

2016-03-22 Thread Joshua D. Drake

On 03/22/2016 11:40 AM, CS DBA wrote:

Hi All;

we setup partitioning for a large table but had to back off because the
return status (i.e: "INSERT 0 1") returns "INSERT 0 0" when inserting
into the partitioned table which causes the ORM tool to assume the
insert inserted 0 rows.  Is there a standard / best practices work
around for this?

Thanks in advance


Are you inserting via trigger from an insert into a parent table? That 
would do it.



--
Command Prompt, Inc.  http://the.postgres.company/
 +1-503-667-4564
PostgreSQL Centered full stack support, consulting and development.


--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Partitioning and ORM tools

2016-03-22 Thread CS DBA



On 03/22/2016 01:10 PM, Rob Sargent wrote:



On 03/22/2016 12:55 PM, Melvin Davidson wrote:
Your problem seems strange as it has never been previously reported 
for anyone else that has _successfully_ set up partioning.
Perhaps is you provide just a little bit more detail we might be able 
to help you.

Useful and needed information would be:
1. Version of PostgreSQL
2. Operating System
3. Table structure for partitioned table
4. Trigger function and trigger used for insert
5. The actual insert statement.

On Tue, Mar 22, 2016 at 2:40 PM, CS DBA > wrote:


Hi All;

we setup partitioning for a large table but had to back off
because the return status (i.e: "INSERT 0 1") returns "INSERT 0
0" when inserting into the partitioned table which causes the ORM
tool to assume the insert inserted 0 rows.  Is there a standard /
best practices work around for this?

Thanks in advance




-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)

To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general




And which ORM are you using.

I take it the problem is that the ORM is saying "zero rows inserted" 
and that's um, er, upsetting the client ( which might decide to retry 
and then generates an error for non-unique key or some such noise)


rjs



--
*Melvin Davidson*
I reserve the right to fantasize.  Whether or not you
wish to share my fantasy is entirely up to you.


I think they are using Ruby, turns out the application is checking this 
and throwing an error (and rolling back) when it detects no rows inserted




Re: [GENERAL] Partitioning and ORM tools

2016-03-22 Thread Kris Deugau
Melvin Davidson wrote:
> Your problem seems strange as it has never been previously reported for
> anyone else that has _successfully_ set up partioning.

At least as of when I asked a very similar question
(http://www.postgresql.org/message-id/flat/547f7e88.7080...@vianet.ca#547f7e88.7080...@vianet.ca),
the answer amounted to "deal with it", "don't use partitioning",
"copy-and-delete rather than redirecting the INSERT" (causes undesirable
VACUUM side effects) or "insert directly in the child table".

The problem is not data going astray, or not getting inserted at all,
it's the fact that the number of rows inserted is returned as 0 - due to
the fact that the rows are not in fact inserted in the parent table when
configured as per the example in the docs.

-kgd


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Partitioning and ORM tools

2016-03-22 Thread Rob Sargent



On 03/22/2016 12:55 PM, Melvin Davidson wrote:
Your problem seems strange as it has never been previously reported 
for anyone else that has _successfully_ set up partioning.
Perhaps is you provide just a little bit more detail we might be able 
to help you.

Useful and needed information would be:
1. Version of PostgreSQL
2. Operating System
3. Table structure for partitioned table
4. Trigger function and trigger used for insert
5. The actual insert statement.

On Tue, Mar 22, 2016 at 2:40 PM, CS DBA > wrote:


Hi All;

we setup partitioning for a large table but had to back off
because the return status (i.e: "INSERT 0 1") returns "INSERT 0 0"
when inserting into the partitioned table which causes the ORM
tool to assume the insert inserted 0 rows. Is there a standard /
best practices work around for this?

Thanks in advance




-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org

)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general




And which ORM are you using.

I take it the problem is that the ORM is saying "zero rows inserted" and 
that's um, er, upsetting the client ( which might decide to retry and 
then generates an error for non-unique key or some such noise)


rjs



--
*Melvin Davidson*
I reserve the right to fantasize.  Whether or not you
wish to share my fantasy is entirely up to you.




Re: [GENERAL] Partitioning and ORM tools

2016-03-22 Thread Melvin Davidson
Your problem seems strange as it has never been previously reported for
anyone else that has _successfully_ set up partioning.
Perhaps is you provide just a little bit more detail we might be able to
help you.
Useful and needed information would be:
1. Version of PostgreSQL
2. Operating System
3. Table structure for partitioned table
4. Trigger function and trigger used for insert
5. The actual insert statement.

On Tue, Mar 22, 2016 at 2:40 PM, CS DBA  wrote:

> Hi All;
>
> we setup partitioning for a large table but had to back off because the
> return status (i.e: "INSERT 0 1") returns "INSERT 0 0" when inserting into
> the partitioned table which causes the ORM tool to assume the insert
> inserted 0 rows.  Is there a standard / best practices work around for this?
>
> Thanks in advance
>
>
>
>
> --
> Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-general
>



-- 
*Melvin Davidson*
I reserve the right to fantasize.  Whether or not you
wish to share my fantasy is entirely up to you.


[GENERAL] Partitioning and ORM tools

2016-03-22 Thread CS DBA

Hi All;

we setup partitioning for a large table but had to back off because the 
return status (i.e: "INSERT 0 1") returns "INSERT 0 0" when inserting 
into the partitioned table which causes the ORM tool to assume the 
insert inserted 0 rows.  Is there a standard / best practices work 
around for this?


Thanks in advance




--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


[GENERAL] Partitioning table by multiple columns

2015-10-28 Thread mephysto
Hi there,
I need to partition a table by value of two columns: id_1 and id_2. They are
both bigint.


Anyone have experience in this? Anyone know how can I do this partitioning?

Thanks in advance.

Meph



--
View this message in context: 
http://postgresql.nabble.com/Partitioning-table-by-multiple-columns-tp5871764.html
Sent from the PostgreSQL - general mailing list archive at Nabble.com.


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Partitioning table by multiple columns

2015-10-28 Thread Jim Nasby

On 10/28/15 6:57 AM, mephysto wrote:

Hi there,
I need to partition a table by value of two columns: id_1 and id_2. They are
both bigint.


Anyone have experience in this? Anyone know how can I do this partitioning?


It will work just like regular table partitioning. You just need to 
account for 2 columns instead of just one. The Fine Manual has the 
details: http://www.postgresql.org/docs/9.4/static/ddl-partitioning.html

--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com


--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Partitioning and constraint exclusion

2015-09-08 Thread Jayadevan M
>
>
>
> ​I am not sure but am doubting it is intelligent enough to recognize the
> functional expression even if all of the values are present.  "simple
> equality" (
> http://www.postgresql.org/docs/9.4/interactive/ddl-partitioning.html)
> this is not.
>

Looks like the tables with about 100+ values in the check list gets pulled
in, even with constraint exclusion on. I created a simple test case. One
parent table with just one column, and 3 child tables with one column.
test=# \d+ parent
Table "public.parent"
 Column |  Type   | Modifiers | Storage | Stats target | Description
+-+---+-+--+-
 id | integer |   | plain   |  |
Child tables: child1,
  child2,
  child3

test=# \d+ child1
Table "public.child1"
 Column |  Type   | Modifiers | Storage | Stats target | Description
+-+---+-+--+-
 id | integer |   | plain   |  |
Check constraints:
"c" CHECK (id = ANY (ARRAY[1, 2]))
Inherits: parent

test=# \d+ child2
Table "public.child2"
 Column |  Type   | Modifiers | Storage | Stats target | Description
+-+---+-+--+-
 id | integer |   | plain   |  |
Check constraints:
"c" CHECK (id = ANY (ARRAY[3, 4]))
Inherits: parent

test=# \d+ child3
Table "public.child3"
 Column |  Type   | Modifiers | Storage | Stats target | Description
+-+---+-+--+-
 id | integer |   | plain   |  |
Check constraints:
"c3" CHECK (id = ANY (ARRAY[5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16,
17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35,
36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54,
55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73,
74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92,
93, 94, 95, 96, 97, 98, 99, 100]))
Inherits: parent

test=# explain analyze select * from parent where id = 1;
   QUERY PLAN


 Append  (cost=0.00..40.00 rows=13 width=4) (actual time=0.002..0.002
rows=0 loops=1)
   ->  Seq Scan on parent  (cost=0.00..0.00 rows=1 width=4) (actual
time=0.001..0.001 rows=0 loops=1)
 Filter: (id = 1)
*   ->  Seq Scan on child1  (cost=0.00..40.00 rows=12 width=4) (actual
time=0.000..0.000 rows=0 loops=1)*
* Filter: (id = 1)*
 Total runtime: 0.029 ms

If I increase the number of values a bit more.
with t as (select generate_series(*5,110*) x ) select  'alter table child3
add constraint c3 check  ( id in   ( ' || string_agg(x::text,',')  || ' ))
; ' from t;

test=# explain analyze select * from parent where id = 1;
   QUERY PLAN


 Append  (cost=0.00..80.00 rows=25 width=4) (actual time=0.003..0.003
rows=0 loops=1)
   ->  Seq Scan on parent  (cost=0.00..0.00 rows=1 width=4) (actual
time=0.001..0.001 rows=0 loops=1)
 Filter: (id = 1)
   ->  Seq Scan on child1  (cost=0.00..40.00 rows=12 width=4) (actual
time=0.000..0.000 rows=0 loops=1)
 Filter: (id = 1)
*   ->  Seq Scan on child3  (cost=0.00..40.00 rows=12 width=4) (actual
time=0.000..0.000 rows=0 loops=1)*
* Filter: (id = 1)*


Thanks,
Jayadevan

David J.
>
>


[GENERAL] Partitioning and constraint exclusion

2015-09-07 Thread Jayadevan M
Hello ,

I have a parent table and 6 child tables (partitions). The child tables
have check constraints defined in the form
CHECK (myuid in (123,456,..)).
myuid is bigint, the constraints for the 6 child tables are definitely
mutually exclusive. The number of values in the list ranges from 2-10 for 5
of the child tables. For the 6th child table, the list is 2500+ elements.
When I try explain/explain analyze for even a simple query like

select * from parent where myuid in (123,456,789)

the child table with 2500+ elements gets always scanned. I have an index on
the column and that does get used. But why doesn't the planner just use
constraint exclusion and not go for the index scan? Anyone faced a similar
issue?

Thanks,
Jayadevan


Re: [GENERAL] Partitioning and constraint exclusion

2015-09-07 Thread Melvin Davidson
First, what is the PostgresSQL version ??

Next, in postgresql.conf, what is the value of constraint_exclusion ?

On Mon, Sep 7, 2015 at 8:55 AM, Jayadevan M 
wrote:

> Hello ,
>
> I have a parent table and 6 child tables (partitions). The child tables
> have check constraints defined in the form
> CHECK (myuid in (123,456,..)).
> myuid is bigint, the constraints for the 6 child tables are definitely
> mutually exclusive. The number of values in the list ranges from 2-10 for 5
> of the child tables. For the 6th child table, the list is 2500+ elements.
> When I try explain/explain analyze for even a simple query like
>
> select * from parent where myuid in (123,456,789)
>
> the child table with 2500+ elements gets always scanned. I have an index
> on the column and that does get used. But why doesn't the planner just use
> constraint exclusion and not go for the index scan? Anyone faced a similar
> issue?
>
> Thanks,
> Jayadevan
>



-- 
*Melvin Davidson*
I reserve the right to fantasize.  Whether or not you
wish to share my fantasy is entirely up to you.


Re: [GENERAL] Partitioning and constraint exclusion

2015-09-07 Thread Vick Khera
On Mon, Sep 7, 2015 at 4:48 PM, David G. Johnston <
david.g.johns...@gmail.com> wrote:

> IIRC ​The planner doesn't understand​
>
> ​overlaps so having a definition of:
>
> IN (1,2,3,4,5); or nearly equivalently = ANY(ARRAY[1,2,3,4,5]))​
>
> and a request for:
>
> IN (1,3,5) / = ANY(ARRAY[1,3,5]) is going to get you nowhere with the
> planner.
>

The partition code does not execute your CHECK condition; it only tests to
see if the query includes it explicitly. For example, if you have a split
on "id % 100 = 59" for a table, then looking for id = 13059 does you no
good; you have to search for "id = 13059 AND id % 100 = 59" to invoke the
table exclusions.


Re: [GENERAL] Partitioning and constraint exclusion

2015-09-07 Thread Jayadevan M
On Mon, Sep 7, 2015 at 7:12 PM, Melvin Davidson 
wrote:

> First, what is the PostgresSQL version ??
>

9.3.6

>
> Next, in postgresql.conf, what is the value of constraint_exclusion ?
>

partition

In response to the other possible issues pointed out - the planner is
indeed *skipping the rest of the child tables* (the ones with just a few
values in the check constraint). That is why I feel the number of elements
in the check constraint on this particular child table is causing it to be
scanned. The query ends up scanning the table where the data will be found
and the table with 2500+ values in the check constraint. I may be missing
something?
I tried changing the filter from myuid in (1,2,3) to myuid = 1 or myuid = 2
or
It did not improve the plan. One Index Cond became 3 Index Cond .


Thanks,
Jayadevan


Re: [GENERAL] Partitioning and constraint exclusion

2015-09-07 Thread David G. Johnston
On Mon, Sep 7, 2015 at 8:55 AM, Jayadevan M 
wrote:

> Hello ,
>
> I have a parent table and 6 child tables (partitions). The child tables
> have check constraints defined in the form
> CHECK (myuid in (123,456,..)).
> myuid is bigint, the constraints for the 6 child tables are definitely
> mutually exclusive. The number of values in the list ranges from 2-10 for 5
> of the child tables. For the 6th child table, the list is 2500+ elements.
> When I try explain/explain analyze for even a simple query like
>
> select * from parent where myuid in (123,456,789)
>
> the child table with 2500+ elements gets always scanned. I have an index
> on the column and that does get used. But why doesn't the planner just use
> constraint exclusion and not go for the index scan? Anyone faced a similar
> issue?
>
>
IIRC ​The planner doesn't understand​

​overlaps so having a definition of:

IN (1,2,3,4,5); or nearly equivalently = ANY(ARRAY[1,2,3,4,5]))​

and a request for:

IN (1,3,5) / = ANY(ARRAY[1,3,5]) is going to get you nowhere with the
planner.

​I am not sure but am doubting it is intelligent enough to recognize the
functional expression even if all of the values are present.  "simple
equality" (
http://www.postgresql.org/docs/9.4/interactive/ddl-partitioning.html) this
is not.

David J.


Re: [GENERAL] Partitioning and performance

2015-05-28 Thread Ravi Krishna
 By and large, though, this doesn't really matter, since an empty
 parent table won't cost anything much to scan.  If it's significant
 relative to the child table access time then you probably didn't
 need partitioning in the first place.

Is there a rule of thumb as to at what size does the partitioning
start performing better than non partitioned table.


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Partitioning and performance

2015-05-28 Thread Tom Lane
Ravi Krishna sravikrish...@gmail.com writes:
 So cost wise they both  look same, still when i run the sql in a loop
 in large numbers, it takes rougly 1.8 to 2 times more than non
 partitioned table.

If you're testing cases that only involve fetching a single row,
the discrepancy could well be down to extra planning time.  Proving
that the other partitions don't need to be scanned is far from free.

It's also worth realizing that for queries that fetch just one or
a few rows, it's very unlikely that partitioning can beat an unpartitioned
table, period.  Basically, partitioning replaces a runtime search of the
top levels of a large index with a plan-time proof that other partitions
need not be visited.  That is not going to be cheaper and could well be a
lot more expensive.

The situations where partitioning is useful boil down to:

1. You have repetitive, stylized requirements such as every month,
delete all data older than X months that can be mapped to drop
the oldest partition instead of doing an expensive table scan.

2. You can arrange things so that certain partitions are accessed
far more often than others, thus directing most disk traffic to
specific child tables that will remain in RAM cache most of the time.
(In principle, you could get similar cache-friendly behavior from a
clustered unpartitioned table, but it's usually too hard to ensure
that such a table stays clustered.)

It does not sound like your test case is exercising either of those
win scenarios, and all you're measuring is the overhead of partitioning,
which as I said is substantial.

regards, tom lane


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Partitioning and performance

2015-05-28 Thread Ravi Krishna
On Thu, May 28, 2015 at 12:50 PM, Tom Lane t...@sss.pgh.pa.us wrote:

 Sure, because you don't have a constraint forbidding the parent from
 having a matching row, no?

As suggested by you, I included a bogus condition in the parent table
which will prevent any row addition in the parent table and made the
constraint NO INHERIT.

i run this

SET constraint_exclusion = on;
explain select * from tstesting.account where account_row_inst = 1001 ;



Append  (cost=0.14..8.16 rows=1 width=832)
   -  Index Scan using account_part1_pkey on account_part1
(cost=0.14..8.16 rows=1 width=832)
 Index Cond: (account_row_inst = 1001)
(3 rows)

The planner shows this for the non partitioned table

 Index Scan using account_pkey on account  (cost=0.14..8.16 rows=1 width=832)
   Index Cond: (account_row_inst = 1001)
(2 rows)

So cost wise they both  look same, still when i run the sql in a loop
in large numbers, it takes rougly 1.8 to 2 times more than non
partitioned table.


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Partitioning and performance

2015-05-28 Thread Tom Lane
Ravi Krishna sravikrish...@gmail.com writes:
 Is there a rule of thumb as to at what size does the partitioning
 start performing better than non partitioned table.

Personally I'd not worry about partitioning until I had a table
approaching maybe a billion (1e9) rows.  You could argue that
an order of magnitude either way, but it's just not worth the
extra complexity for data volumes very much less than that.

regards, tom lane


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


[GENERAL] Partitioning and performance

2015-05-28 Thread Ravi Krishna
I am testing partitioning of a large table. I am INHERITING child tables.
It is using a range
partitioning based on a sequence col, which also acts as the primary
key. For inserts I am using a trigger which will redirect insert to
the right table based on the value of the primary key.

Based on my testing, I see that the insert speed is less than 10%
different than a non partitioned table. I am using  SET
constraint_exclusion = on and I checked that via ANALYZE that the
planner does not consider non qualifying child tables.

yet, selects and updates based on the primary key show anywhere from
40 to 200% slowness as compared to non partition. One thing I notice
is that, even with partition pruning, the planner scans the base table
and the table matching the condition. Is that the additional overhead.

I am attaching below the output of analyze.

===
On a non partitioned table

explain select count(*) from tstesting.account where account_row_inst = 101 ;
Aggregate (cost=8.16..8.17 rows=1 width=0)
- Index Only Scan using account_pkey on account (cost=0.14..8.16
rows=1 width=0)
Index Cond: (account_row_inst = 101)
(3 rows)


With partition pruning:

Aggregate (cost=8.45..8.46 rows=1 width=0)
- Append (cost=0.00..8.44 rows=2 width=0)
- Seq Scan on account (cost=0.00..0.00 rows=1 width=0)
Filter: (account_row_inst = 101)
- Index Only Scan using account_part1_pkey on account_part1
(cost=0.42..8.44 rows=1 width=0)
Index Cond: (account_row_inst = 101)
(6 rows)

On a partitioned table, with no partition pruning.

explain analyze select count(*) from tstesting.account where
account_row_inst = 101 ;
Aggregate (cost=29.77..29.78 rows=1 width=0) (actual time=0.032..0.032
rows=1 loops=1)
- Append (cost=0.00..29.76 rows=5 width=0) (actual time=0.029..0.029
rows=0 loops=1)
- Seq Scan on account (cost=0.00..0.00 rows=1 width=0) (actual
time=0.000..0.000 rows=0 loops=1)
Filter: (account_row_inst = 101)
- Index Only Scan using account_part1_pkey on account_part1
(cost=0.42..4.44 rows=1 width=0) (actual time=0.008..0.008 rows=0
loops=1)
Index Cond: (account_row_inst = 101)
Heap Fetches: 0
- Index Only Scan using account_part2_pkey on account_part2
(cost=0.42..8.44 rows=1 width=0) (actual time=0.007..0.007 rows=0
loops=1)
Index Cond: (account_row_inst = 101)
Heap Fetches: 0
- Index Only Scan using account_part3_pkey on account_part3
(cost=0.42..8.44 rows=1 width=0) (actual time=0.007..0.007 rows=0
loops=1)
Index Cond: (account_row_inst = 101)
Heap Fetches: 0
- Index Only Scan using account_part4_pkey on account_part4
(cost=0.42..8.44 rows=1 width=0) (actual time=0.006..0.006 rows=0
loops=1)
Index Cond: (account_row_inst = 101)
Heap Fetches: 0
Planning time: 0.635 ms
Execution time: 0.137 ms
(18 rows)


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Partitioning and performance

2015-05-28 Thread Ravi Krishna
 Have you set up constraints on the partitions? The planner needs to know
 what is in the child tables so it can avoid scanning them.

Yes. each child table is defined as follows

CREATE TABLE TSTESTING.ACCOUNT_PART1

 ( CHECK (ACCOUNT_ROW_INST BETWEEN 1001 and 271660))

 INHERITS (TSTESTING.ACCOUNT);

ALTER TABLE TSTESTING.ACCOUNT_PART1 ADD CONSTRAINT ACCOUNT_PART1_PKEY
PRIMARY KEY (ACCOUNT_ROW_INST);

Perhaps I was not clear. The planner is excluding partitions which can
not contain the rows looked up in the WHERE clause. However it is
still scanning the parent table.

Aggregate (cost=8.45..8.46 rows=1 width=0)
- Append (cost=0.00..8.44 rows=2 width=0)
- Seq Scan on account (cost=0.00..0.00 rows=1 width=0)
Filter: (account_row_inst = 101)
- Index Only Scan using account_part1_pkey on account_part1
(cost=0.42..8.44 rows=1 width=0)
Index Cond: (account_row_inst = 101)
(6 rows)


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Partitioning and performance

2015-05-28 Thread Jan Lentfer


Am 28. Mai 2015 17:15:22 MESZ, schrieb Ravi Krishna sravikrish...@gmail.com:
I am testing partitioning of a large table. I am INHERITING child
tables.
It is using a range
partitioning based on a sequence col, which also acts as the primary
key. For inserts I am using a trigger which will redirect insert to
the right table based on the value of the primary key.

Based on my testing, I see that the insert speed is less than 10%
different than a non partitioned table. I am using  SET
constraint_exclusion = on and I checked that via ANALYZE that the
planner does not consider non qualifying child tables.

yet, selects and updates based on the primary key show anywhere from
40 to 200% slowness as compared to non partition. One thing I notice
is that, even with partition pruning, the planner scans the base table
and the table matching the condition. Is that the additional overhead.

I am attaching below the output of analyze.

===
On a non partitioned table

explain select count(*) from tstesting.account where account_row_inst =
101 ;
Aggregate (cost=8.16..8.17 rows=1 width=0)
- Index Only Scan using account_pkey on account (cost=0.14..8.16
rows=1 width=0)
Index Cond: (account_row_inst = 101)
(3 rows)


With partition pruning:

Aggregate (cost=8.45..8.46 rows=1 width=0)
- Append (cost=0.00..8.44 rows=2 width=0)
- Seq Scan on account (cost=0.00..0.00 rows=1 width=0)
Filter: (account_row_inst = 101)
- Index Only Scan using account_part1_pkey on account_part1
(cost=0.42..8.44 rows=1 width=0)
Index Cond: (account_row_inst = 101)
(6 rows)

On a partitioned table, with no partition pruning.

explain analyze select count(*) from tstesting.account where
account_row_inst = 101 ;
Aggregate (cost=29.77..29.78 rows=1 width=0) (actual time=0.032..0.032
rows=1 loops=1)
- Append (cost=0.00..29.76 rows=5 width=0) (actual time=0.029..0.029
rows=0 loops=1)
- Seq Scan on account (cost=0.00..0.00 rows=1 width=0) (actual
time=0.000..0.000 rows=0 loops=1)
Filter: (account_row_inst = 101)
- Index Only Scan using account_part1_pkey on account_part1
(cost=0.42..4.44 rows=1 width=0) (actual time=0.008..0.008 rows=0
loops=1)
Index Cond: (account_row_inst = 101)
Heap Fetches: 0
- Index Only Scan using account_part2_pkey on account_part2
(cost=0.42..8.44 rows=1 width=0) (actual time=0.007..0.007 rows=0
loops=1)
Index Cond: (account_row_inst = 101)
Heap Fetches: 0
- Index Only Scan using account_part3_pkey on account_part3
(cost=0.42..8.44 rows=1 width=0) (actual time=0.007..0.007 rows=0
loops=1)
Index Cond: (account_row_inst = 101)
Heap Fetches: 0
- Index Only Scan using account_part4_pkey on account_part4
(cost=0.42..8.44 rows=1 width=0) (actual time=0.006..0.006 rows=0
loops=1)
Index Cond: (account_row_inst = 101)
Heap Fetches: 0
Planning time: 0.635 ms
Execution time: 0.137 ms
(18 rows)


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

Have you set up constraints on the partitions? The planner needs to know what 
is in the child tables so it can avoid scanning them.

Jan


Diese Nachricht wurde von meinem Android-Mobiltelefon mit K-9 Mail gesendet.

Re: [GENERAL] Partitioning and performance

2015-05-28 Thread Melvin Davidson
Generally, when you partition, data should only be in child tables, and the
parent table should be empty, otherwise you defeat the purpose of
parttioning.`

On Thu, May 28, 2015 at 12:25 PM, Ravi Krishna sravikrish...@gmail.com
wrote:

  Have you set up constraints on the partitions? The planner needs to know
  what is in the child tables so it can avoid scanning them.

 Yes. each child table is defined as follows

 CREATE TABLE TSTESTING.ACCOUNT_PART1

  ( CHECK (ACCOUNT_ROW_INST BETWEEN 1001 and 271660))

  INHERITS (TSTESTING.ACCOUNT);

 ALTER TABLE TSTESTING.ACCOUNT_PART1 ADD CONSTRAINT ACCOUNT_PART1_PKEY
 PRIMARY KEY (ACCOUNT_ROW_INST);

 Perhaps I was not clear. The planner is excluding partitions which can
 not contain the rows looked up in the WHERE clause. However it is
 still scanning the parent table.

 Aggregate (cost=8.45..8.46 rows=1 width=0)
 - Append (cost=0.00..8.44 rows=2 width=0)
 - Seq Scan on account (cost=0.00..0.00 rows=1 width=0)
 Filter: (account_row_inst = 101)
 - Index Only Scan using account_part1_pkey on account_part1
 (cost=0.42..8.44 rows=1 width=0)
 Index Cond: (account_row_inst = 101)
 (6 rows)


 --
 Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
 To make changes to your subscription:
 http://www.postgresql.org/mailpref/pgsql-general




-- 
*Melvin Davidson*
I reserve the right to fantasize.  Whether or not you
wish to share my fantasy is entirely up to you.


Re: [GENERAL] Partitioning and performance

2015-05-28 Thread Jan Lentfer


Am 28. Mai 2015 18:25:42 MESZ, schrieb Ravi Krishna sravikrish...@gmail.com:
 Have you set up constraints on the partitions? The planner needs to
know
 what is in the child tables so it can avoid scanning them.

Yes. each child table is defined as follows

CREATE TABLE TSTESTING.ACCOUNT_PART1

 ( CHECK (ACCOUNT_ROW_INST BETWEEN 1001 and 271660))

 INHERITS (TSTESTING.ACCOUNT);

ALTER TABLE TSTESTING.ACCOUNT_PART1 ADD CONSTRAINT ACCOUNT_PART1_PKEY
PRIMARY KEY (ACCOUNT_ROW_INST);

Perhaps I was not clear. The planner is excluding partitions which can
not contain the rows looked up in the WHERE clause. However it is
still scanning the parent table.

Aggregate (cost=8.45..8.46 rows=1 width=0)
- Append (cost=0.00..8.44 rows=2 width=0)
- Seq Scan on account (cost=0.00..0.00 rows=1 width=0)
Filter: (account_row_inst = 101)
- Index Only Scan using account_part1_pkey on account_part1
(cost=0.42..8.44 rows=1 width=0)
Index Cond: (account_row_inst = 101)
(6 rows)

You can have a look at pg_partman. It makes setting up partitioning quite easy 
and provides a tool to easily move existing data from parent to child tables.

Jan

Re: [GENERAL] Partitioning and performance

2015-05-28 Thread Tom Lane
Ravi Krishna sravikrish...@gmail.com writes:
 Perhaps I was not clear. The planner is excluding partitions which can
 not contain the rows looked up in the WHERE clause. However it is
 still scanning the parent table.

Sure, because you don't have a constraint forbidding the parent from
having a matching row, no?

In older versions of PG there wasn't any way around this, but recent
versions allow you to mark a constraint as NO INHERIT, which would
let you attach such a constraint to the parent only.

By and large, though, this doesn't really matter, since an empty
parent table won't cost anything much to scan.  If it's significant
relative to the child table access time then you probably didn't
need partitioning in the first place.

regards, tom lane


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Partitioning and performance

2015-05-28 Thread Ravi Krishna
On Thu, May 28, 2015 at 12:42 PM, Melvin Davidson melvin6...@gmail.com wrote:

 Generally, when you partition, data should only be in child tables, and the 
 parent table should be empty, otherwise you defeat the purpose of 
 parttioning.`

yes of course the parent table is empty. The trigger on insert is
redirecting it to the proper child table.

select count(*) from only tstesting.account ;
 count
---
 0
(1 row)

select count(*) from only tstesting.account_part1 ;
 count
---
 83659
(1 row)

select count(*) from only tstesting.account_part5 ;
 count
---
 83659
(1 row)


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Partitioning and constraint exclusion

2015-03-05 Thread Stephen Frost
Sam,

* Samuel Smith (pg...@net153.net) wrote:
 Does anyone know if there is a wishlist item for improving this in
 postgresql or is this as good as it gets for now?

It's absolutely on the todo list for PG to support declarative
partitioning and handle these cases better.  There has been a good sized
thread very recently discussing how to make that happen over on the
hackers mailing list recently and while it almost certainly won't be in
the next release (expected to be 9.5, released around September or
October), it may very well be in the release the following year.

Thanks!

Stephen


signature.asc
Description: Digital signature


Re: [GENERAL] Partitioning and constraint exclusion

2015-03-05 Thread Samuel Smith

On 03/04/2015 12:11 AM, David G Johnston wrote:


In short - since the planner determines exclusion constraints and the
executor, which strictly follows the planner in the query execution process,
would be the one to determine what the value of your date is - there is no
way for a single query to provide data that would then be used to determine
exclusion constraints.

Now, that said, I don't believe you should be actual full table scans during
processing if you have proper indexes setup.  An index scan should be usable
and quickly determine which tables lack data to contribute to the query
results.

In terms of separating out the date query and partition query:

PREPARE/EXECUTE in pure SQL (can, must?, be direct)
EXECUTE/USING in pl/pgsql (via a function)

You may have other reasonable options on the client side...

You should consider providing EXPLAIN ANALYZE results and maybe a test case
so others can give pointers.

David J.




Just to correct, the constraint column is also an indexed column. What I 
meant by scanning all partitions was that the index of all the 
partitions was being checked. Not the biggest issue, but still unwanted.


I spent most of yesterday comparing how DB2 handles table partitions and 
running similar explain queries. It indeed can do partition elimination 
when the values of the constraint column are coming from a sub query or 
other none constant value.


Does anyone know if there is a wishlist item for improving this in 
postgresql or is this as good as it gets for now?


I can work around the issues, but it just kind of caught me off guard 
since there are so many nice benchmarks on the web with partitioning in 
postgresql (and now that I go back an look at them, they all have 
constants in the where clause).



--Sam


--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


[GENERAL] Partitioning with the index on the master table.

2015-03-04 Thread Semyon Reyfman
I have a list-based partition table and I want to prevent scanning of all
children tables in a JOIN, which is using the partition column.   I know
that CHECK constraints are not used in this case so I hope that creating an
index on the partition column would help.  But I am not sure if I should
create this index on the master table or on each of children tables.  It
seems kind of strange to create an index on a children table where all
values of the partition column are the same.  So will an index on the master
table be sufficient in this case?

Thanks,

Simon



-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Partitioning with the index on the master table.

2015-03-04 Thread Jim Nasby

On 3/4/15 9:25 AM, Semyon Reyfman wrote:

I have a list-based partition table and I want to prevent scanning of all
children tables in a JOIN, which is using the partition column.   I know
that CHECK constraints are not used in this case


Why do you say that? If properly written and with constraint_exclusion 
 'off' and with good queries they will be.



so I hope that creating an
index on the partition column would help.  But I am not sure if I should
create this index on the master table or on each of children tables.  It
seems kind of strange to create an index on a children table where all
values of the partition column are the same.  So will an index on the master
table be sufficient in this case?


It won't help unless you actually have data in the master table. Which 
normally you wouldn't.


Basically, any indexing you do on a partitioned table must be done *on 
the partitions*, not on the master table.

--
Jim Nasby, Data Architect, Blue Treble Consulting
Data in Trouble? Get it in Treble! http://BlueTreble.com


--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Partitioning and constraint exclusion

2015-03-04 Thread Stephen Frost
Samuel,

* Samuel Smith (pg...@net153.net) wrote:
 I noticed that I could get very nice partition elimination using
 constant values in the where clause.
 
 Ex:
 select * from table where constraint_col between '2015-01-01'
 and '2015-02-15'
 
 However, I could not get any partition elimination for queries that
 did not have constant values in the where clause.
 
 Ex:
 select * from table where constraint_col = (select max(date)
 from other_table)

That's correct.

 Unfortunately all of our queries on the analytics team need to be
 dynamic like this and summarize data based around certain recorded
 events and dates from other tables. I saw the note in the docs about
 not being able to use current_timestamp in the where clause but I
 really need to be able to use a sub select or CTE in the where
 clause for the needed dates.

Not sure if this will help, but the planner is smart enough to implement
one-time filters for certain cases.  Instead of using inheiritance-based
partitioning, you can use a view like so:

CREATE VIEW v AS 
  SELECT * FROM table1 WHERE column1 = 5
UNION ALL
  SELECT * FROM table2 WHERE column1 = 6
;

Then for cases where we can prove that no results will be returned from
the individual union-all branch, we'll skip it:

SELECT * FROM v WHERE column1 = (select max(column1) from table3);

Unfortunately, we don't appear to support that for an inequality as you
show above.  I'm not sure offhand why not but it didn't work in my
testing.

Another approach to dealing with this is to use plpgsql functions and
'return execute' which essentially compute the constant and then build a
dyanmic SQL query using the constant and return the results.  It's a bit
awkward compared to just writing the query, but it does work.

Thanks!

Stephen


signature.asc
Description: Digital signature


Re: [GENERAL] Partitioning and constraint exclusion

2015-03-03 Thread David G Johnston
Samuel Smith wrote
 I noticed that I could get very nice partition elimination using 
 constant values in the where clause.
 
 Ex:
 select * from 

  where 
 constraint_col
  between '2015-01-01' and 
 '2015-02-15'
 
 However, I could not get any partition elimination for queries that did 
 not have constant values in the where clause.
 
 Ex:
 select * from 

  where 
 constraint_col
  = (select max(date) from 
 other_table
 )
 
 Unfortunately all of our queries on the analytics team need to be 
 dynamic like this and summarize data based around certain recorded 
 events and dates from other tables. I saw the note in the docs about not 
 being able to use current_timestamp in the where clause but I really 
 need to be able to use a sub select or CTE in the where clause for the 
 needed dates.
 
 I tried about 10 different ways (on both 9.1 and 9.4) to dynamically get 
 the data (sub selects, cte, joins) for my constraint column but all of 
 them resulted in a full scan of all partitions.
 
 I am going to try a few other ways tomorrow, I am hoping I am doing 
 something wrong, or is this just typical?

In short - since the planner determines exclusion constraints and the
executor, which strictly follows the planner in the query execution process,
would be the one to determine what the value of your date is - there is no
way for a single query to provide data that would then be used to determine
exclusion constraints.

Now, that said, I don't believe you should be actual full table scans during
processing if you have proper indexes setup.  An index scan should be usable
and quickly determine which tables lack data to contribute to the query
results.

In terms of separating out the date query and partition query:

PREPARE/EXECUTE in pure SQL (can, must?, be direct)
EXECUTE/USING in pl/pgsql (via a function)

You may have other reasonable options on the client side...

You should consider providing EXPLAIN ANALYZE results and maybe a test case
so others can give pointers.

David J.




--
View this message in context: 
http://postgresql.nabble.com/Partitioning-and-constraint-exclusion-tp5840353p5840356.html
Sent from the PostgreSQL - general mailing list archive at Nabble.com.


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


[GENERAL] Partitioning and constraint exclusion

2015-03-03 Thread Samuel Smith

Howdy,
I spent a majority of today playing around with pg_partman (awesome tool 
btw!). I am mainly using the time-static method with an interval of one 
month.
I wanted to see what performance improvements I could get with some 
common queries that are used by our analytics team. A lot of these 
queries summarize data by day or by month. Our largest database gets 10+ 
million rows a day to several different tables (each). I played around 
with a subset of the data spread across about 6 months (about 10 million 
rows total).


I noticed that I could get very nice partition elimination using 
constant values in the where clause.


Ex:
select * from table where constraint_col between '2015-01-01' and 
'2015-02-15'


However, I could not get any partition elimination for queries that did 
not have constant values in the where clause.


Ex:
select * from table where constraint_col = (select max(date) from 
other_table)


Unfortunately all of our queries on the analytics team need to be 
dynamic like this and summarize data based around certain recorded 
events and dates from other tables. I saw the note in the docs about not 
being able to use current_timestamp in the where clause but I really 
need to be able to use a sub select or CTE in the where clause for the 
needed dates.


I tried about 10 different ways (on both 9.1 and 9.4) to dynamically get 
the data (sub selects, cte, joins) for my constraint column but all of 
them resulted in a full scan of all partitions.


I am kind of bummed out by this as dropping in partitioning in this 
method will just hurt performance and not improve it. The only good 
thing I see is the ability to delete (drop) older data from the table, 
but this is not a functionality we need right now.


I am going to try a few other ways tomorrow, I am hoping I am doing 
something wrong, or is this just typical?


Thanks,
Sam


--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] partitioning query planner almost always scans all tables

2015-01-23 Thread Spiros Ioannou
Thank you for your input.

When doing an ORDER BY measurement_time DESC I had hoped for the query
planner to firstly query the most recent child table, i.e. the table
holding current month's data, and then move-on to the oldest table, since
it knows the partition is based on measurement_time. Instead it always
queries the oldest table first and then moves on to the current.

We did the partitioning to be able to drop old data saving disk space, as
you said. Not for performance reasons, although we hoped than having
smaller indexes would also help.

Best regards,
-Spiros









*Spiros Ioannou IT Manager, inAccesswww.inaccess.com
http://www.inaccess.comM: +30 6973-903808T: +30 210-6802-358*

On 22 January 2015 at 17:37, Tom Lane t...@sss.pgh.pa.us wrote:

 Spiros Ioannou siv...@inaccess.com writes:
  It is ORDER BY measurement_time, not measurement_id, and measurement_time
  is used to create the partition. So the planner should know the correct
  order, but instead it seems to query tables in the wrong order.

 The planner does not know that, and even if it attempted to figure it out
 by comparing the child tables' constraints, it could not generate a plan
 that considered only one child table as you incorrectly imagine.  What if
 the latest table turned out to be empty at runtime?

 The obtained plan with a Merge Append atop Index Scan Backwards nodes
 seems perfectly reasonable to me.  This will result in fetching only the
 latest row within each partition, so that the work involved is O(number of
 partitions) not O(total number of rows).

 If you're not happy with that, reconsider how many partitions you really
 need.  Newbies almost invariably create far more partitions than is a good
 idea for performance.  In my view, if you've got more than a couple dozen,
 you're doing it wrong.  Partitioning is, in general, not a benefit for
 query performance (except in a few very narrow, specialized cases); and
 the more partitions you have the worse the penalty.  Partitioning only
 helps for data management, in particular being able to drop old data in
 bulk rather than through expensive DELETE WHERE queries.  How often do
 you do that, and do you really need to be able to do it at a small
 granularity?

 regards, tom lane



Re: [GENERAL] partitioning query planner almost always scans all tables

2015-01-22 Thread Spiros Ioannou


  EXPLAIN ANALYZE  SELECT * FROM measurement_events WHERE
  measurement_source_id='df86917e-8df0-11e1-8f8f-525400e76ceb' ORDER BY
  measurement_time DESC LIMIT 1;
 
  This seems to fail, scanning all tables. Do you think this can be
 improved
  at all ? The query plan of the above query is as follows:

 The combination of sorting by measurement_source_id and limit
 hinders constraint exclusion because the order of the column
 across whole the inheritance is not known to planner. And the
 below plan also dosn't show whether constraint exclusion worked
 or not, by the same reason. But I suppose it worked.


It is ORDER BY measurement_time, not measurement_id, and measurement_time
is used to create the partition. So the planner should know the correct
order, but instead it seems to query tables in the wrong order.


Re: [GENERAL] partitioning query planner almost always scans all tables

2015-01-22 Thread Kyotaro HORIGUCHI
Hi, 

 @Kyotaro HORIGUCHI
 thanks for your reply and time Kyotaro,

Not at all.

 Using the following query
 EXPLAIN ANALYZE  SELECT * FROM measurement_events WHERE
 measurement_source_id='df86917e-8df0-11e1-8f8f-525400e76ceb' AND
 measurement_time = '2015-01-01 00:00:00+0' LIMIT 1;
 
 produces this plan:
 
  Limit  (cost=0.00..4.02 rows=1 width=67) (actual time=49.125..49.125
 rows=1 loops=1)
-  Append  (cost=0.00..3644.05 rows=907 width=67) (actual
 time=49.122..49.122 rows=1 loops=1)
  -  Seq Scan on measurement_events  (cost=0.00..0.00 rows=1
 width=966) (actual time=0.003..0.003 rows=0 loops=1)
Filter: ((measurement_time = '2015-01-01
 02:00:00+02'::timestamp with time zone) AND (measurement_source_id =
 'df86917e-8df0-11e1-8f8f-525400e76ceb'::uuid))
  -  Bitmap Heap Scan on *measurement_events_p2015_01*
  (cost=41.73..3546.10 rows=894 width=54) (actual time=49.119..49.119 rows=1
 loops=1)
Recheck Cond: ((measurement_source_id =
 'df86917e-8df0-11e1-8f8f-525400e76ceb'::uuid) AND (measurement_time =
 '2015-01-01 02:00:00+02'::timestamp with time zone))
-  Bitmap Index Scan on *measurement_events_p2015_01_pkey*
  (cost=0.00..41.51 rows=894 width=0) (actual time=41.836..41.836 rows=997
 loops=1)
  Index Cond: ((measurement_source_id =
 'df86917e-8df0-11e1-8f8f-525400e76ceb'::uuid) AND (measurement_time =
 '2015-01-01 02:00:00+02'::timestamp with time zone))
  -  Index Scan using *measurement_events_p2015_02_pkey* on
 *measurement_events_p2015_02*  (cost=0.14..8.16 rows=1 width=966) (never
 executed)
Index Cond: ((measurement_source_id =
 'df86917e-8df0-11e1-8f8f-525400e76ceb'::uuid) AND (measurement_time =
 '2015-01-01 02:00:00+02'::timestamp with time zone))
  -  Index Scan using *measurement_events_p2015_03_pkey* on
 *measurement_events_p2015_03*  (cost=0.14..8.16 rows=1 width=966) (never
 executed)
Index Cond: ((measurement_source_id =
 'df86917e-8df0-11e1-8f8f-525400e76ceb'::uuid) AND (measurement_time =
 '2015-01-01 02:00:00+02'::timestamp with time zone))
 ... (cut for brevity)
 
 1) Do you know if this means that the query will stop on 1st find (since it
 is limit 1), or will it search all tables regardless results?

It saids that only the first table was scanned because 1 row had
been acquired. But it is unclear whether constraint exclusion
worked. All of the table shown above seems to have the data after
2015/1/1 and match the condition of your query. It's okay if
p_2014_12 and the earlier is not seen in the explain result and
the order in which the tables appear seems to me suggesting it is
okay. Please examine it on that ponit.

 2) To improve on the above, do you (or anyone else) have any input on this:
 
 to get the latest value from all tables, we were using the following query
 (before partitioning):
 
 EXPLAIN ANALYZE  SELECT * FROM measurement_events WHERE
 measurement_source_id='df86917e-8df0-11e1-8f8f-525400e76ceb' ORDER BY
 measurement_time DESC LIMIT 1;
 
 This seems to fail, scanning all tables. Do you think this can be improved
 at all ? The query plan of the above query is as follows:

The combination of sorting by measurement_source_id and limit
hinders constraint exclusion because the order of the column
across whole the inheritance is not known to planner. And the
below plan also dosn't show whether constraint exclusion worked
or not, by the same reason. But I suppose it worked.

Since constraint exclusion worked, it seems enough optmized. What
kind of optimizaition do you expect?


 --
  Limit  (cost=5.57..9.99 rows=1 width=921) (actual time=5.361..5.361 rows=1
 loops=1)
-  Merge Append  (cost=5.57..451374.16 rows=102155 width=921) (actual
 time=5.359..5.359 rows=1 loops=1)
  Sort Key: measurement_events.measurement_time
  -  Index Scan Backward using measurement_events_pkey on
 *measurement_events*  (cost=0.12..8.14 rows=1 width=966) (actual
 time=0.004..0.004 rows=0 loops=1)
Index Cond: (measurement_source_id =
 'df86917e-8df0-11e1-8f8f-525400e76ceb'::uuid)
  -  Index Scan Backward using *measurement_events_p2014_01_pkey*
 on *measurement_events_p2014_01*  (cost=0.14..8.16 rows=1 width=966)
 (actual time=0.002..0.002 rows=0 loops=1)
Index Cond: (measurement_source_id =
 'df86917e-8df0-11e1-8f8f-525400e76ceb'::uuid)
  -  Index Scan Backward using *measurement_events_p2014_02_pkey*
 on *measurement_events_p2014_02*  (cost=0.14..8.16 rows=1 width=966)
 (actual time=0.001..0.001 rows=0 loops=1)
Index Cond: (measurement_source_id =
 'df86917e-8df0-11e1-8f8f-525400e76ceb'::uuid)
  -  Index Scan Backward using *measurement_events_p2014_03_pkey*
 on 

Re: [GENERAL] partitioning query planner almost always scans all tables

2015-01-22 Thread Tom Lane
Spiros Ioannou siv...@inaccess.com writes:
 It is ORDER BY measurement_time, not measurement_id, and measurement_time
 is used to create the partition. So the planner should know the correct
 order, but instead it seems to query tables in the wrong order.

The planner does not know that, and even if it attempted to figure it out
by comparing the child tables' constraints, it could not generate a plan
that considered only one child table as you incorrectly imagine.  What if
the latest table turned out to be empty at runtime?

The obtained plan with a Merge Append atop Index Scan Backwards nodes
seems perfectly reasonable to me.  This will result in fetching only the
latest row within each partition, so that the work involved is O(number of
partitions) not O(total number of rows).

If you're not happy with that, reconsider how many partitions you really
need.  Newbies almost invariably create far more partitions than is a good
idea for performance.  In my view, if you've got more than a couple dozen,
you're doing it wrong.  Partitioning is, in general, not a benefit for
query performance (except in a few very narrow, specialized cases); and
the more partitions you have the worse the penalty.  Partitioning only
helps for data management, in particular being able to drop old data in
bulk rather than through expensive DELETE WHERE queries.  How often do
you do that, and do you really need to be able to do it at a small
granularity?

regards, tom lane


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] partitioning query planner almost always scans all tables

2015-01-21 Thread Spiros Ioannou
@Rob Sargent: sorry Rob, not sure what you are asking.

@Kyotaro HORIGUCHI
thanks for your reply and time Kyotaro,

Using the following query
EXPLAIN ANALYZE  SELECT * FROM measurement_events WHERE
measurement_source_id='df86917e-8df0-11e1-8f8f-525400e76ceb' AND
measurement_time = '2015-01-01 00:00:00+0' LIMIT 1;

produces this plan:

 Limit  (cost=0.00..4.02 rows=1 width=67) (actual time=49.125..49.125
rows=1 loops=1)
   -  Append  (cost=0.00..3644.05 rows=907 width=67) (actual
time=49.122..49.122 rows=1 loops=1)
 -  Seq Scan on measurement_events  (cost=0.00..0.00 rows=1
width=966) (actual time=0.003..0.003 rows=0 loops=1)
   Filter: ((measurement_time = '2015-01-01
02:00:00+02'::timestamp with time zone) AND (measurement_source_id =
'df86917e-8df0-11e1-8f8f-525400e76ceb'::uuid))
 -  Bitmap Heap Scan on *measurement_events_p2015_01*
 (cost=41.73..3546.10 rows=894 width=54) (actual time=49.119..49.119 rows=1
loops=1)
   Recheck Cond: ((measurement_source_id =
'df86917e-8df0-11e1-8f8f-525400e76ceb'::uuid) AND (measurement_time =
'2015-01-01 02:00:00+02'::timestamp with time zone))
   -  Bitmap Index Scan on *measurement_events_p2015_01_pkey*
 (cost=0.00..41.51 rows=894 width=0) (actual time=41.836..41.836 rows=997
loops=1)
 Index Cond: ((measurement_source_id =
'df86917e-8df0-11e1-8f8f-525400e76ceb'::uuid) AND (measurement_time =
'2015-01-01 02:00:00+02'::timestamp with time zone))
 -  Index Scan using *measurement_events_p2015_02_pkey* on
*measurement_events_p2015_02*  (cost=0.14..8.16 rows=1 width=966) (never
executed)
   Index Cond: ((measurement_source_id =
'df86917e-8df0-11e1-8f8f-525400e76ceb'::uuid) AND (measurement_time =
'2015-01-01 02:00:00+02'::timestamp with time zone))
 -  Index Scan using *measurement_events_p2015_03_pkey* on
*measurement_events_p2015_03*  (cost=0.14..8.16 rows=1 width=966) (never
executed)
   Index Cond: ((measurement_source_id =
'df86917e-8df0-11e1-8f8f-525400e76ceb'::uuid) AND (measurement_time =
'2015-01-01 02:00:00+02'::timestamp with time zone))
... (cut for brevity)

1) Do you know if this means that the query will stop on 1st find (since it
is limit 1), or will it search all tables regardless results?

2) To improve on the above, do you (or anyone else) have any input on this:

to get the latest value from all tables, we were using the following query
(before partitioning):

EXPLAIN ANALYZE  SELECT * FROM measurement_events WHERE
measurement_source_id='df86917e-8df0-11e1-8f8f-525400e76ceb' ORDER BY
measurement_time DESC LIMIT 1;

This seems to fail, scanning all tables. Do you think this can be improved
at all ? The query plan of the above query is as follows:

--
 Limit  (cost=5.57..9.99 rows=1 width=921) (actual time=5.361..5.361 rows=1
loops=1)
   -  Merge Append  (cost=5.57..451374.16 rows=102155 width=921) (actual
time=5.359..5.359 rows=1 loops=1)
 Sort Key: measurement_events.measurement_time
 -  Index Scan Backward using measurement_events_pkey on
*measurement_events*  (cost=0.12..8.14 rows=1 width=966) (actual
time=0.004..0.004 rows=0 loops=1)
   Index Cond: (measurement_source_id =
'df86917e-8df0-11e1-8f8f-525400e76ceb'::uuid)
 -  Index Scan Backward using *measurement_events_p2014_01_pkey*
on *measurement_events_p2014_01*  (cost=0.14..8.16 rows=1 width=966)
(actual time=0.002..0.002 rows=0 loops=1)
   Index Cond: (measurement_source_id =
'df86917e-8df0-11e1-8f8f-525400e76ceb'::uuid)
 -  Index Scan Backward using *measurement_events_p2014_02_pkey*
on *measurement_events_p2014_02*  (cost=0.14..8.16 rows=1 width=966)
(actual time=0.001..0.001 rows=0 loops=1)
   Index Cond: (measurement_source_id =
'df86917e-8df0-11e1-8f8f-525400e76ceb'::uuid)
 -  Index Scan Backward using *measurement_events_p2014_03_pkey*
on *measurement_events_p2014_03*  (cost=0.14..8.16 rows=1 width=966)
(actual time=0.001..0.001 rows=0 loops=1)
   Index Cond: (measurement_source_id =
'df86917e-8df0-11e1-8f8f-525400e76ceb'::uuid)

. (cut for brevity)


Re: [GENERAL] partitioning query planner almost always scans all tables

2015-01-19 Thread Kyotaro HORIGUCHI
Hello,

Your constraint column is of 'timestamp with time zone' and the
query uses 'date'. The comparison between them is an operator
based on a static, non-immutable function so constraint exclusion
doesn't work.

SELECT o.oprname, o.oprcode, p.provolatile
 FROM pg_operator o join pg_proc p on (p.oid = o.oprcode)
 WHERE oprname = '' AND
   oprleft = 'timestamp with time zone'::regtype AND
   oprright = 'date'::regtype;

 oprname |   oprcode   | provolatile 
-+-+-
| timestamptz_lt_date | s

# '' is a random selection. Any comparison ops will do.

The following query instead will do what you wanted. Only the
operand of '=' is changed.

EXPLAIN ANALYZE  SELECT * FROM measurement_events WHERE 
measurement_source_id='df86917e-8df0-11e1-8f8f-525400e76ceb' AND 
measurement_time = '2015-01-01 00:00:00+0' LIMIT 1;

The function in the expression used for exclusion is immutable.

SELECT o.oprname, o.oprcode, p.provolatile
 FROM pg_operator o join pg_proc p on (p.oid = o.oprcode)
 WHERE oprname = '' AND
   oprleft = 'timestamp with time zone'::regtype AND
   oprright = 'timestamp with time zone::regtype;

 oprname |oprcode | provolatile 
-++-
| timestamptz_lt | i


The details about this in the following page,

http://www.postgresql.org/docs/9.3/static/ddl-partitioning.html

 5.9.6 Cavert

regards,

-- 
Kyotaro Horiguchi
NTT Open Source Software Center


Jan 2015 06:42:53 -0700, Rob Sargent robjsarg...@gmail.com wrote in 
31fed87e-d31b-4cf9-93f8-cc0f131cb...@gmail.com
 I don't understand having both UUID and time stamp in your PK? The first is 
 by defn. unique and the second might be. 
 
 Sent from my iPhone
 
  On Jan 19, 2015, at 6:12 AM, Spiros Ioannou siv...@inaccess.com wrote:
  
  Hello group,
  we have a timeseries table, and we tried to partition it by month (with 
  pg_partman). It seems the query planner always reads all tables regardless 
  of WHERE, except when WHERE is equality.
  
  the parent table:
  
  ifms_db=# \dS measurement_events
Table public.measurement_events
  Column |   Type   | Modifiers
  ---+--+---
   measurement_source_id | uuid | not null
   measurement_time  | timestamp with time zone | not null
   event_reception_time  | timestamp with time zone | not null
   measurement_value | character varying(200)   | not null
   quality   | character varying(500)   | not null
  Indexes:
  measurement_events_pkey PRIMARY KEY, btree (measurement_source_id, 
  measurement_time)
  Triggers:
  measurement_events_part_trig BEFORE INSERT ON measurement_events FOR 
  EACH ROW EXECUTE PROCEDURE measurement_events_part_trig_func()
  Number of child tables: 25 (Use \d+ to list them.)
  
  
  One of the children tables (2014_3)
  
  ifms_db=# \dS measurement_events_p2014_03
Table public.measurement_events_p2014_03
  Column |   Type   | Modifiers
  ---+--+---
   measurement_source_id | uuid | not null
   measurement_time  | timestamp with time zone | not null
   event_reception_time  | timestamp with time zone | not null
   measurement_value | character varying(200)   | not null
   quality   | character varying(500)   | not null
  Indexes:
  measurement_events_p2014_03_pkey PRIMARY KEY, btree 
  (measurement_source_id, measurement_time)
  Check constraints:
  measurement_events_p2014_03_partition_check CHECK (measurement_time 
  = '2014-03-01 00:00:00+02'::timestamp with time zone AND measurement_time 
   '2014-04-01 00:00:00+03'::timestamp with time zone)
  Inherits: measurement_events
  
  
  
  The query:
  # explain analyze  select * from measurement_events where 
  measurement_source_id='df86917e-8df0-11e1-8f8f-525400e76ceb' AND 
  measurement_time = DATE '2015-01-01' limit 1;
  
  
   Limit  (cost=0.00..4.12 rows=1 width=87) (actual time=0.377..0.377 rows=1 
  loops=1)
 -  Append  (cost=0.00..2696.08 rows=655 width=87) (actual 
  time=0.376..0.376 rows=1 loops=1)
   -  Seq Scan on measurement_events  (cost=0.00..0.00 rows=1 
  width=966) (actual time=0.001..0.001 rows=0 loops=1)
 Filter: ((measurement_time = '2015-01-01'::date) AND 
  (measurement_source_id = 'df86917e-8df0-11e1-8f8f-525400e76ceb'::uuid))
   -  Index Scan using measurement_events_p2014_01_pkey on 
  measurement_events_p2014_01  (cost=0.14..8.16 rows=1 width=966) (actual 
  time=0.005..0.005 rows=0 loops=1)
 Index Cond: ((measurement_source_id = 
  'df86917e-8df0-11e1-8f8f-525400e76ceb'::uuid) AND 

Re: [GENERAL] Partitioning

2015-01-19 Thread Kyotaro HORIGUCHI
Hi, 

19 Jan 2015 06:58:21 -0500, François Beausoleil franc...@teksol.info wrote in 
cc2fd572-320a-4225-b98c-48f209925...@teksol.info
 
  Le 2015-01-18 à 20:58, James Sewell james.sew...@lisasoft.com a écrit :
  
  Hello,
  
  I am using partitioning with around 100 sub-tables. Each sub-table is 
  around 11GB and partitioned on the 'id' column. I have an index on the id 
  column on each sub-table.
  
  Is it possible to get a query like the following working using constraint 
  exclusion, or am I doomed to do index/sequential scans of every sub-table?
  
  I want to select all rows which have an id which is in another query, so 
  something like:
  
WITH idlist as (SELECT id from othertable) 
  SELECT id from mastertable WHERE id = idlist.id);
  
  I am guessing that I am not getting constraint exclusion to work as the 
  planner doesn't know the outcome of my subquery at plan time?
  
 How many rows in idlist? Can you do two queries? Fetch the ids, then call the 
 2nd query with those values hard-coded in the query.
 
 If necessary, and you know where each partition starts, tou could sort in the 
 app and query the correct ranges, in multiple queries.

The strategy's effectiveness is depends mainly on how many ids
come from the othertable. It wins if fewer than certain number or
converged in a few partitions, however, straight joins will win
elsewise.

The result of EXPLAIN ANALYZE might draw more precise advices.

regards,

-- 
Kyotaro Horiguchi
NTT Open Source Software Center



-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Partitioning

2015-01-19 Thread James Sewell
Yeah definitely not staying resident once read, although the machine does
gave 256GB of memory so some will persist in the OS cache.

Actually this brings up another question, if I have an unique integer index
of 2.gb what percentage would I expect to read for a value that was higher
or lower than all index values?

Cheers,
James





On Monday, 19 January 2015, John R Pierce pie...@hogranch.com wrote:

 On 1/18/2015 11:13 PM, James Sewell wrote:


 Each index is about 2.5GB, I suspect I am trying to read a these into
 memory in entirety.


 an 11GB table with a (presumably integer) primary key requires an 2.5GB
 index ?  100 of these would need 250GB of shared_buffers to stay
 resident, not likely.





 --
 john r pierce  37N 122W
 somewhere on the middle of the left coast



 --
 Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
 To make changes to your subscription:
 http://www.postgresql.org/mailpref/pgsql-general



-- 

James Sewell,
PostgreSQL Team Lead / Solutions Architect
__


 Level 2, 50 Queen St, Melbourne VIC 3000

*P *(+61) 3 8370 8000  *W* www.lisasoft.com  *F *(+61) 3 8370 8099

-- 


--
The contents of this email are confidential and may be subject to legal or 
professional privilege and copyright. No representation is made that this 
email is free of viruses or other defects. If you have received this 
communication in error, you may not copy or distribute any part of it or 
otherwise disclose its contents to anyone. Please advise the sender of your 
incorrect receipt of this correspondence.


Re: [GENERAL] Partitioning

2015-01-19 Thread Kyotaro HORIGUCHI
Hi,

At Mon, 19 Jan 2015 00:14:55 -0800, John R Pierce pie...@hogranch.com wrote 
in 54bcbcff.5040...@hogranch.com
 On 1/18/2015 11:59 PM, James Sewell wrote:
 
  Actually this brings up another question, if I have an unique integer
  index of 2.gb http://2.gb what percentage would I expect to read for
  a value that was higher or lower than all index values?
 
 a couple 8k blocks.  its a b-tree.

Yeah, scanning children reading highest/lowest values for each of
them would make it faster than the first single query.

regards,

-- 
Kyotaro Horiguchi
NTT Open Source Software Center



-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Partitioning

2015-01-19 Thread John R Pierce

On 1/18/2015 11:59 PM, James Sewell wrote:


Actually this brings up another question, if I have an unique integer 
index of 2.gb http://2.gb what percentage would I expect to read for 
a value that was higher or lower than all index values?


a couple 8k blocks.  its a b-tree.

--
john r pierce  37N 122W
somewhere on the middle of the left coast



Re: [GENERAL] partitioning query planner almost always scans all tables

2015-01-19 Thread Rob Sargent
I don't understand having both UUID and time stamp in your PK? The first is by 
defn. unique and the second might be. 

Sent from my iPhone

 On Jan 19, 2015, at 6:12 AM, Spiros Ioannou siv...@inaccess.com wrote:
 
 Hello group,
 we have a timeseries table, and we tried to partition it by month (with 
 pg_partman). It seems the query planner always reads all tables regardless of 
 WHERE, except when WHERE is equality.
 
 the parent table:
 
 ifms_db=# \dS measurement_events
   Table public.measurement_events
 Column |   Type   | Modifiers
 ---+--+---
  measurement_source_id | uuid | not null
  measurement_time  | timestamp with time zone | not null
  event_reception_time  | timestamp with time zone | not null
  measurement_value | character varying(200)   | not null
  quality   | character varying(500)   | not null
 Indexes:
 measurement_events_pkey PRIMARY KEY, btree (measurement_source_id, 
 measurement_time)
 Triggers:
 measurement_events_part_trig BEFORE INSERT ON measurement_events FOR EACH 
 ROW EXECUTE PROCEDURE measurement_events_part_trig_func()
 Number of child tables: 25 (Use \d+ to list them.)
 
 
 One of the children tables (2014_3)
 
 ifms_db=# \dS measurement_events_p2014_03
   Table public.measurement_events_p2014_03
 Column |   Type   | Modifiers
 ---+--+---
  measurement_source_id | uuid | not null
  measurement_time  | timestamp with time zone | not null
  event_reception_time  | timestamp with time zone | not null
  measurement_value | character varying(200)   | not null
  quality   | character varying(500)   | not null
 Indexes:
 measurement_events_p2014_03_pkey PRIMARY KEY, btree 
 (measurement_source_id, measurement_time)
 Check constraints:
 measurement_events_p2014_03_partition_check CHECK (measurement_time = 
 '2014-03-01 00:00:00+02'::timestamp with time zone AND measurement_time  
 '2014-04-01 00:00:00+03'::timestamp with time zone)
 Inherits: measurement_events
 
 
 
 The query:
 # explain analyze  select * from measurement_events where 
 measurement_source_id='df86917e-8df0-11e1-8f8f-525400e76ceb' AND 
 measurement_time = DATE '2015-01-01' limit 1;
 
 
  Limit  (cost=0.00..4.12 rows=1 width=87) (actual time=0.377..0.377 rows=1 
 loops=1)
-  Append  (cost=0.00..2696.08 rows=655 width=87) (actual 
 time=0.376..0.376 rows=1 loops=1)
  -  Seq Scan on measurement_events  (cost=0.00..0.00 rows=1 
 width=966) (actual time=0.001..0.001 rows=0 loops=1)
Filter: ((measurement_time = '2015-01-01'::date) AND 
 (measurement_source_id = 'df86917e-8df0-11e1-8f8f-525400e76ceb'::uuid))
  -  Index Scan using measurement_events_p2014_01_pkey on 
 measurement_events_p2014_01  (cost=0.14..8.16 rows=1 width=966) (actual 
 time=0.005..0.005 rows=0 loops=1)
Index Cond: ((measurement_source_id = 
 'df86917e-8df0-11e1-8f8f-525400e76ceb'::uuid) AND (measurement_time = 
 '2015-01-01'::date))
  -  Index Scan using measurement_events_p2014_02_pkey on 
 measurement_events_p2014_02  (cost=0.14..8.16 rows=1 width=966) (actual 
 time=0.002..0.002 rows=0 loops=1)
Index Cond: ((measurement_source_id = 
 'df86917e-8df0-11e1-8f8f-525400e76ceb'::uuid) AND (measurement_time = 
 '2015-01-01'::date))
  -  Index Scan using measurement_events_p2014_03_pkey on 
 measurement_events_p2014_03  (cost=0.14..8.16 rows=1 width=966) (actual 
 time=0.002..0.002 rows=0 loops=1)
Index Cond: ((measurement_source_id = 
 'df86917e-8df0-11e1-8f8f-525400e76ceb'::uuid) AND (measurement_time = 
 '2015-01-01'::date))
  -  Index Scan using measurement_events_p2014_04_pkey on 
 measurement_events_p2014_04  (cost=0.14..8.16 rows=1 width=966) (actual 
 time=0.001..0.001 rows=0 loops=1)
Index Cond: ((measurement_source_id = 
 'df86917e-8df0-11e1-8f8f-525400e76ceb'::uuid) AND (measurement_time = 
 '2015-01-01'::date))
  -  Index Scan using measurement_events_p2014_05_pkey on 
 measurement_events_p2014_05  (cost=0.14..8.16 rows=1 width=966) (actual 
 time=0.001..0.001 rows=0 loops=1)
Index Cond: ((measurement_source_id = 
 'df86917e-8df0-11e1-8f8f-525400e76ceb'::uuid) AND (measurement_time = 
 '2015-01-01'::date))
  -  Index Scan using measurement_events_p2014_06_pkey on 
 measurement_events_p2014_06  (cost=0.14..8.16 rows=1 width=966) (actual 
 time=0.002..0.002 rows=0 loops=1)
Index Cond: ((measurement_source_id = 
 'df86917e-8df0-11e1-8f8f-525400e76ceb'::uuid) AND (measurement_time = 
 '2015-01-01'::date))
  -  Index 

[GENERAL] partitioning query planner almost always scans all tables

2015-01-19 Thread Spiros Ioannou
Hello group,
we have a timeseries table, and we tried to partition it by month (with
pg_partman). It seems the query planner always reads all tables regardless
of WHERE, except when WHERE is equality.

*the parent table:*

ifms_db=# \dS measurement_events
  Table public.measurement_events
Column |   Type   | Modifiers
---+--+---
 measurement_source_id | uuid | not null
 measurement_time  | timestamp with time zone | not null
 event_reception_time  | timestamp with time zone | not null
 measurement_value | character varying(200)   | not null
 quality   | character varying(500)   | not null
Indexes:
measurement_events_pkey PRIMARY KEY, btree (measurement_source_id,
measurement_time)
Triggers:
measurement_events_part_trig BEFORE INSERT ON measurement_events FOR
EACH ROW EXECUTE PROCEDURE measurement_events_part_trig_func()
Number of child tables: 25 (Use \d+ to list them.)


One of the children tables (2014_3)

ifms_db=# \dS measurement_events_p2014_03
  Table public.measurement_events_p2014_03
Column |   Type   | Modifiers
---+--+---
 measurement_source_id | uuid | not null
 measurement_time  | timestamp with time zone | not null
 event_reception_time  | timestamp with time zone | not null
 measurement_value | character varying(200)   | not null
 quality   | character varying(500)   | not null
Indexes:
measurement_events_p2014_03_pkey PRIMARY KEY, btree
(measurement_source_id, measurement_time)
Check constraints:
measurement_events_p2014_03_partition_check CHECK (measurement_time
= '2014-03-01 00:00:00+02'::timestamp with time zone AND measurement_time
 '2014-04-01 00:00:00+03'::timestamp with time zone)
Inherits: measurement_events



The query:
# explain analyze  select * from measurement_events where
measurement_source_id='df86917e-8df0-11e1-8f8f-525400e76ceb' AND
measurement_time = DATE '2015-01-01' limit 1;


 Limit  (cost=0.00..4.12 rows=1 width=87) (actual time=0.377..0.377 rows=1
loops=1)
   -  Append  (cost=0.00..2696.08 rows=655 width=87) (actual
time=0.376..0.376 rows=1 loops=1)
 -  Seq Scan on measurement_events  (cost=0.00..0.00 rows=1
width=966) (actual time=0.001..0.001 rows=0 loops=1)
   Filter: ((measurement_time = '2015-01-01'::date) AND
(measurement_source_id = 'df86917e-8df0-11e1-8f8f-525400e76ceb'::uuid))
 -  Index Scan using measurement_events_p2014_01_pkey on
measurement_events_p2014_01  (cost=0.14..8.16 rows=1 width=966) (actual
time=0.005..0.005 rows=0 loops=1)
   Index Cond: ((measurement_source_id =
'df86917e-8df0-11e1-8f8f-525400e76ceb'::uuid) AND (measurement_time =
'2015-01-01'::date))
 -  Index Scan using measurement_events_p2014_02_pkey on
measurement_events_p2014_02  (cost=0.14..8.16 rows=1 width=966) (actual
time=0.002..0.002 rows=0 loops=1)
   Index Cond: ((measurement_source_id =
'df86917e-8df0-11e1-8f8f-525400e76ceb'::uuid) AND (measurement_time =
'2015-01-01'::date))
 -  Index Scan using measurement_events_p2014_03_pkey on
measurement_events_p2014_03  (cost=0.14..8.16 rows=1 width=966) (actual
time=0.002..0.002 rows=0 loops=1)
   Index Cond: ((measurement_source_id =
'df86917e-8df0-11e1-8f8f-525400e76ceb'::uuid) AND (measurement_time =
'2015-01-01'::date))
 -  Index Scan using measurement_events_p2014_04_pkey on
measurement_events_p2014_04  (cost=0.14..8.16 rows=1 width=966) (actual
time=0.001..0.001 rows=0 loops=1)
   Index Cond: ((measurement_source_id =
'df86917e-8df0-11e1-8f8f-525400e76ceb'::uuid) AND (measurement_time =
'2015-01-01'::date))
 -  Index Scan using measurement_events_p2014_05_pkey on
measurement_events_p2014_05  (cost=0.14..8.16 rows=1 width=966) (actual
time=0.001..0.001 rows=0 loops=1)
   Index Cond: ((measurement_source_id =
'df86917e-8df0-11e1-8f8f-525400e76ceb'::uuid) AND (measurement_time =
'2015-01-01'::date))
 -  Index Scan using measurement_events_p2014_06_pkey on
measurement_events_p2014_06  (cost=0.14..8.16 rows=1 width=966) (actual
time=0.002..0.002 rows=0 loops=1)
   Index Cond: ((measurement_source_id =
'df86917e-8df0-11e1-8f8f-525400e76ceb'::uuid) AND (measurement_time =
'2015-01-01'::date))
 -  Index Scan using measurement_events_p2014_07_pkey on
measurement_events_p2014_07  (cost=0.14..8.16 rows=1 width=966) (actual
time=0.001..0.001 rows=0 loops=1)
   Index Cond: ((measurement_source_id =
'df86917e-8df0-11e1-8f8f-525400e76ceb'::uuid) AND (measurement_time =
'2015-01-01'::date))
 -  Index Scan using 

Re: [GENERAL] Partitioning

2015-01-19 Thread Kyotaro HORIGUCHI
Sorry,

 Yeah, scanning children reading highest/lowest values for each of them
would make it faster than the first single query.

Mmm..no. It has nothing different from querieng on the parent table. Please
Ignore  the silly thing.

-- 
Kyotaro Horiguchi


Re: [GENERAL] Partitioning

2015-01-19 Thread François Beausoleil

 Le 2015-01-18 à 20:58, James Sewell james.sew...@lisasoft.com a écrit :
 
 Hello,
 
 I am using partitioning with around 100 sub-tables. Each sub-table is around 
 11GB and partitioned on the 'id' column. I have an index on the id column on 
 each sub-table.
 
 Is it possible to get a query like the following working using constraint 
 exclusion, or am I doomed to do index/sequential scans of every sub-table?
 
 I want to select all rows which have an id which is in another query, so 
 something like:
 
   WITH idlist as (SELECT id from othertable) 
 SELECT id from mastertable WHERE id = idlist.id);
 
 I am guessing that I am not getting constraint exclusion to work as the 
 planner doesn't know the outcome of my subquery at plan time?
 
How many rows in idlist? Can you do two queries? Fetch the ids, then call the 
2nd query with those values hard-coded in the query.

If necessary, and you know where each partition starts, tou could sort in the 
app and query the correct ranges, in multiple queries.

Hope that helps!
François Beausoleil

Re: [GENERAL] Partitioning

2015-01-18 Thread James Sewell
Hello,

Thanks for the reply. I can write queries which get index scans, but they
are still slow.

Each index is about 2.5GB, I suspect I am trying to read a these into
memory in entirety.

Perhaps there is no way to tune this?

Cheers, james



On Monday, 19 January 2015, Kyotaro HORIGUCHI 
horiguchi.kyot...@lab.ntt.co.jp wrote:

 Hello,

 an 2015 14:13:37 +1100, James Sewell james.sew...@lisasoft.com
 javascript:; wrote in 
 cankgpbs8gypq3tqgkdjtd+n-w1rkq5uo97h3tuhg5ewakr6...@mail.gmail.com
 javascript:;
  Sadly not ... I still hit all the tables.

 | 5.9.4. Partitioning and Constraint Exclusion

 http://www.postgresql.org/docs/9.4/static/ddl-partitioning.html

 Constraint exclusion is a mechanism to omit tables that are known
 to have no hit by the query *beforehand* execution. So the
 criteria cannot rely on out of the query itself (and CHECK
 constraints, of course).

 Your query uses the result of the WITH-clause-query in the WHERE
 clause which is unknown to the planner so constraint exclusion
 does not work. JOINs don't change the situation.


  On Mon, Jan 19, 2015 at 1:54 PM, John R Pierce pie...@hogranch.com
 javascript:; wrote:
 
On 1/18/2015 5:58 PM, James Sewell wrote:
  
 WITH idlist as (SELECT id from othertable)
   SELECT id from mastertable WHERE id = idlist.id);
  
  
  
   select mt.id, ...   from mastertable mt join othertable ot on
 mt.id=
   ot.id;
  
   might optimize better.

 As the result, the query inevitably scans all the tables, but not
 necessariry in sequqntial scans or simple index scans. The
 suggestion above seeems showing the notation which the planner
 can find the better plans on that premise.

 For example, if you have an index on id of one of the two tables,
 (and some other conditions match, of course) index only scan will
 be selected for it and the suggested query will give you a
 seemingly better plan than your query.

 regards,

 --
 Kyotaro Horiguchi
 NTT Open Source Software Center



-- 

James Sewell,
PostgreSQL Team Lead / Solutions Architect
__


 Level 2, 50 Queen St, Melbourne VIC 3000

*P *(+61) 3 8370 8000  *W* www.lisasoft.com  *F *(+61) 3 8370 8099

-- 


--
The contents of this email are confidential and may be subject to legal or 
professional privilege and copyright. No representation is made that this 
email is free of viruses or other defects. If you have received this 
communication in error, you may not copy or distribute any part of it or 
otherwise disclose its contents to anyone. Please advise the sender of your 
incorrect receipt of this correspondence.


Re: [GENERAL] Partitioning

2015-01-18 Thread James Sewell
Sadly not ... I still hit all the tables.



Cheers,


James Sewell,
PostgreSQL Team Lead / Solutions Architect
__


 Level 2, 50 Queen St, Melbourne VIC 3000

*P *(+61) 3 8370 8000  *W* www.lisasoft.com  *F *(+61) 3 8370 8099


On Mon, Jan 19, 2015 at 1:54 PM, John R Pierce pie...@hogranch.com wrote:

  On 1/18/2015 5:58 PM, James Sewell wrote:

   WITH idlist as (SELECT id from othertable)
 SELECT id from mastertable WHERE id = idlist.id);



 select mt.id, ...   from mastertable mt join othertable ot on mt.id=
 ot.id;

 might optimize better.



 --
 john r pierce  37N 122W
 somewhere on the middle of the left coast



-- 


--
The contents of this email are confidential and may be subject to legal or 
professional privilege and copyright. No representation is made that this 
email is free of viruses or other defects. If you have received this 
communication in error, you may not copy or distribute any part of it or 
otherwise disclose its contents to anyone. Please advise the sender of your 
incorrect receipt of this correspondence.


Re: [GENERAL] Partitioning

2015-01-18 Thread Kyotaro HORIGUCHI
Hello,

an 2015 14:13:37 +1100, James Sewell james.sew...@lisasoft.com wrote in 
cankgpbs8gypq3tqgkdjtd+n-w1rkq5uo97h3tuhg5ewakr6...@mail.gmail.com
 Sadly not ... I still hit all the tables.

| 5.9.4. Partitioning and Constraint Exclusion

http://www.postgresql.org/docs/9.4/static/ddl-partitioning.html

Constraint exclusion is a mechanism to omit tables that are known
to have no hit by the query *beforehand* execution. So the
criteria cannot rely on out of the query itself (and CHECK
constraints, of course).

Your query uses the result of the WITH-clause-query in the WHERE
clause which is unknown to the planner so constraint exclusion
does not work. JOINs don't change the situation.


 On Mon, Jan 19, 2015 at 1:54 PM, John R Pierce pie...@hogranch.com wrote:
 
   On 1/18/2015 5:58 PM, James Sewell wrote:
 
WITH idlist as (SELECT id from othertable)
  SELECT id from mastertable WHERE id = idlist.id);
 
 
 
  select mt.id, ...   from mastertable mt join othertable ot on mt.id=
  ot.id;
 
  might optimize better.

As the result, the query inevitably scans all the tables, but not
necessariry in sequqntial scans or simple index scans. The
suggestion above seeems showing the notation which the planner
can find the better plans on that premise.

For example, if you have an index on id of one of the two tables,
(and some other conditions match, of course) index only scan will
be selected for it and the suggested query will give you a
seemingly better plan than your query.

regards,

-- 
Kyotaro Horiguchi
NTT Open Source Software Center



-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Partitioning

2015-01-18 Thread John R Pierce

On 1/18/2015 5:58 PM, James Sewell wrote:

WITH idlist as (SELECT id from othertable)
SELECT id from mastertable WHERE id = idlist.id http://idlist.id);



select mt.id, ...   from mastertable mt join othertable ot on 
mt.id=ot.id;


might optimize better.



--
john r pierce  37N 122W
somewhere on the middle of the left coast



[GENERAL] Partitioning

2015-01-18 Thread James Sewell
Hello,

I am using partitioning with around 100 sub-tables. Each sub-table is
around 11GB and partitioned on the 'id' column. I have an index on the id
column on each sub-table.

Is it possible to get a query like the following working using constraint
exclusion, or am I doomed to do index/sequential scans of every sub-table?

I want to select all rows which have an id which is in another query, so
something like:

  WITH idlist as (SELECT id from othertable)
SELECT id from mastertable WHERE id = idlist.id);

I am guessing that I am not getting constraint exclusion to work as the
planner doesn't know the outcome of my subquery at plan time?

Any tricks I am overlooking?

James Sewell,
PostgreSQL Team Lead / Solutions Architect
__


 Level 2, 50 Queen St, Melbourne VIC 3000

*P *(+61) 3 8370 8000  *W* www.lisasoft.com  *F *(+61) 3 8370 8099

-- 


--
The contents of this email are confidential and may be subject to legal or 
professional privilege and copyright. No representation is made that this 
email is free of viruses or other defects. If you have received this 
communication in error, you may not copy or distribute any part of it or 
otherwise disclose its contents to anyone. Please advise the sender of your 
incorrect receipt of this correspondence.


Re: [GENERAL] Partitioning

2015-01-18 Thread John R Pierce

On 1/18/2015 11:13 PM, James Sewell wrote:


Each index is about 2.5GB, I suspect I am trying to read a these into 
memory in entirety.


an 11GB table with a (presumably integer) primary key requires an 2.5GB 
index ?  100 of these would need 250GB of shared_buffers to stay 
resident, not likely.






--
john r pierce  37N 122W
somewhere on the middle of the left coast



--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Partitioning of a dependent table not based on date

2014-12-02 Thread Herouth Maoz

On 01/12/2014, at 19:26, Andy Colson wrote:

 On 12/1/2014 11:14 AM, Herouth Maoz wrote:
 I am currently in the process of creating a huge archive database that
 contains data from all of our systems, going back for almost a decade.
 
 Most of the tables fall into one of two categories:
 
 1. Static tables, which are rarely updated, such as lookup tables or
 user lists. I don't intend to partition these, I'll just refresh them
 periodically from production.
 2. Transaction tables, that have a timestamp field, for which I have the
 data archived in COPY format by month. Of course a monolithic table over
 a decade is not feasible, so I am partitioning these by month.
 
 (I don't mean transaction in the database sense, but in the sense that
 the data represents historical activity, e.g. message sent, file
 downloaded etc.)
 
 I have one table, though, that doesn't fall into this pattern. It's a
 many-to-one table relating to one of the transaction tables. So on one
 hand, it doesn't have a time stamp field, and on the other hand, it has
 accumulated lots of data over the last decade so I can't keep it
 unpartitioned.
 
 
 Lets stop here.  One big table with lots of rows (and a good index) isn't a 
 problem.  As long as you are not table scanning everything, there isn't a 
 reason to partition the table.
 
 Lots and lots of rows isnt a problem except for a few usage patterns:
 1) delete from bigtable where (some huge percent of the rows)
 2) select * from bigtable where (lots and lots of table scanning and cant 
 really index)
 
 If your index is selective enough, you'll be fine.


Hmm. I suppose you're right. I planned the whole partition thing in the first 
place because most of my transaction tables are still alive so I'll need to 
continue bulk-inserting data every month, and inserting into a fresh partition 
is better than into a huge table.

But in this case, since we have stopped working on this application in January, 
there will be no fresh inserts so it's not as important. We just need the 
archive for legal purposes.

One thing, though: I noticed on my other system (a reports system, that holds a 
year's worth of data) that after I have partitioned the largest tables, backup 
time dropped. I suppose pg_dump of a single huge table takes is not as fast as 
pg_dump of multiple smaller ones.

Herouth

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


[GENERAL] Partitioning such that key field of inherited tables no longer retains any selectivity

2014-05-11 Thread Tim Kane
The subject line may not actually describe what I want to illustrate…

Basically, let’s say we have a nicely partitioned data-set. Performance is a
net win and I’m happy with it.
The partitioning scheme is equality based, rather than range based.

That is, each partition contains a subset of the data where partition_key =
{some_value}, and of course we let constraint exclusion enable the optimiser
to do its thing.

As such, all of the data contained in a given partition has the same value
for partition_key. That field, within the scope of its partition – isn’t
terribly useful anymore, and in my mind is wasting bytes – it’s only purpose
really is to allow the CHECK constraint to verify the data is what it should
be.


Wouldn’t it be nice if we could somehow create a child table where we could
define a const field value, that did not need to be stored on disk at the
tuple level?
This would allow the check constraint to supply the optimiser with the
information it needs, while removing the need to consume disk to record a
field whose value is always the same.


Extending this idea..
Postgresql could possibly look at any equality based check constraint for a
table and instead of storing each field value verbatim, we could implicitly
optimise away the need to write those field values to disk, on the
understanding that those values can never change (unless the constraint is
removed/altered).

I’m sure there are all kinds of worms in this canister, but I thought it
might be an interesting discussion.


Cheers,

Tim




[GENERAL] Partitioning and triggers

2013-11-17 Thread Edson Richter

Dear community,

In documentation, when partitioning tables, it is said that Optionally, 
define a trigger or rule to redirect data inserted into the master table 
to the appropriate partition.
Is the trigger creation optional? I mean, partitioning will not work as 
expected if we don't have the trigger, right? Or will PostgreSQL decide 
automatically which table to use?
If the trigger is not there, when I add a row to measurement table 
(like in docs example), it will be phisically inserted into 
measurement table, not in the childs tables.

Am I right in this understanding?

Thanks,

Edson



--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Partitioning and triggers

2013-11-17 Thread Jeff Janes
On Sun, Nov 17, 2013 at 8:46 AM, Edson Richter edsonrich...@hotmail.comwrote:

 Dear community,

 In documentation, when partitioning tables, it is said that Optionally,
 define a trigger or rule to redirect data inserted into the master table to
 the appropriate partition.
 Is the trigger creation optional? I mean, partitioning will not work as
 expected if we don't have the trigger, right?


That depends on what you expect :)


 Or will PostgreSQL decide automatically which table to use?
 If the trigger is not there, when I add a row to measurement table (like
 in docs example), it will be phisically inserted into measurement table,
 not in the childs tables.
 Am I right in this understanding?


Yes.  And if there is a constraint that blocks it from going into the
parent table, you will get an error.  If you have the constraint but not
the trigger, that means it is the application's job to insert into the
correct partition.  This can make sense as triggers have a lot of overhead,
whereas the application can usually do this with less overhead.  Or you can
have neither the trigger nor the constraint, and have it go into the parent
table by design.  Whether that is a good idea depends on why you are
partitioning.

Cheers,

Jeff


Re: [GENERAL] Partitioning and triggers

2013-11-17 Thread Edson Richter

Em 17/11/2013 18:45, Jeff Janes escreveu:
On Sun, Nov 17, 2013 at 8:46 AM, Edson Richter 
edsonrich...@hotmail.com mailto:edsonrich...@hotmail.com wrote:


Dear community,

In documentation, when partitioning tables, it is said that
Optionally, define a trigger or rule to redirect data inserted
into the master table to the appropriate partition.
Is the trigger creation optional? I mean, partitioning will not
work as expected if we don't have the trigger, right? 



That depends on what you expect :)

Or will PostgreSQL decide automatically which table to use?
If the trigger is not there, when I add a row to measurement
table (like in docs example), it will be phisically inserted into
measurement table, not in the childs tables.
Am I right in this understanding?


Yes.  And if there is a constraint that blocks it from going into the 
parent table, you will get an error.  If you have the constraint but 
not the trigger, that means it is the application's job to insert into 
the correct partition.  This can make sense as triggers have a lot of 
overhead, whereas the application can usually do this with less 
overhead.  Or you can have neither the trigger nor the constraint, and 
have it go into the parent table by design.  Whether that is a good 
idea depends on why you are partitioning.


Cheers,

Jeff

Thanks, Jeff.

Your comments togheter with comments I've received from others in this 
mail list, I'll make my recommendation for architecture changes in 
persistence layer.


Regards,

Edson


[GENERAL] partitioning for speed, but query planner ignores

2013-10-02 Thread David Rysdam
We have a by-our-standards large table (about 40e6 rows). Since it is
the bottleneck in some places, I thought I'd experiment with
partitioning. I'm following the instructions here:

http://www.postgresql.org/docs/current/static/ddl-partitioning.html

The table holds data about certain objects, each of which has an object
number and some number of historical entries (like account activity at a
bank, say). The typical usage pattern is: relatively rare inserts that
happen in the background via an automated process (meaning I don't care
if they take a little longer) and frequent querying, including some
where a human is sitting in front of it (i.e. I'd like it to be a lot
faster).

Our most frequent queries either select all history for object N or
most recent item for some subset of objects.

Because object number figure so prominently, I thought I'd partition on
that. To me, it makes the most sense from a load-balancing perspective
to partition on the mod of the object number (for this test, evens vs
odds, but planning to go up to mod 10 or even mod 100). Lower numbers
are going to be queried much less often than higher numbers. This scheme
also means I never have to add partitions in the future.

I set up my check constraints ((objnum % 2) = 0 and (objnum % 2) = 1 on
the relevant tables) and turned constraint_exclusion to 'partition' in
postgresql.conf. I also turned it to 'on' in my psql interface.

However, when I run an explain or an explain analyze, I still seeing it
checking both partitions. Is this because the query planner doesn't want
to do a mod? Should I go with simple ranges, even though this adds a
maintenance task? 


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] partitioning for speed, but query planner ignores

2013-10-02 Thread David Rysdam
On Wed, 2 Oct 2013 08:34:44 -0400, David Rysdam drys...@ll.mit.edu wrote:
 However, when I run an explain or an explain analyze, I still seeing it
 checking both partitions. Is this because the query planner doesn't want
 to do a mod? Should I go with simple ranges, even though this adds a
 maintenance task? 

I guess I should give some administrivia as well: Server is 9.2.1
running Linux. The configuration is otherwise pretty vanilla with only
minor, and poorly-understood, conf changes.


smime.p7s
Description: S/MIME cryptographic signature


Re: [GENERAL] partitioning for speed, but query planner ignores

2013-10-02 Thread Bill Moran
On Wed, 2 Oct 2013 08:34:44 -0400
David Rysdam drys...@ll.mit.edu wrote:

 We have a by-our-standards large table (about 40e6 rows). Since it is
 the bottleneck in some places, I thought I'd experiment with
 partitioning. I'm following the instructions here:
 
 http://www.postgresql.org/docs/current/static/ddl-partitioning.html
 
 The table holds data about certain objects, each of which has an object
 number and some number of historical entries (like account activity at a
 bank, say). The typical usage pattern is: relatively rare inserts that
 happen in the background via an automated process (meaning I don't care
 if they take a little longer) and frequent querying, including some
 where a human is sitting in front of it (i.e. I'd like it to be a lot
 faster).
 
 Our most frequent queries either select all history for object N or
 most recent item for some subset of objects.
 
 Because object number figure so prominently, I thought I'd partition on
 that. To me, it makes the most sense from a load-balancing perspective
 to partition on the mod of the object number (for this test, evens vs
 odds, but planning to go up to mod 10 or even mod 100). Lower numbers
 are going to be queried much less often than higher numbers. This scheme
 also means I never have to add partitions in the future.
 
 I set up my check constraints ((objnum % 2) = 0 and (objnum % 2) = 1 on
 the relevant tables) and turned constraint_exclusion to 'partition' in
 postgresql.conf. I also turned it to 'on' in my psql interface.
 
 However, when I run an explain or an explain analyze, I still seeing it
 checking both partitions. Is this because the query planner doesn't want
 to do a mod? Should I go with simple ranges, even though this adds a
 maintenance task? 

Last I looked, the partitioning mechanism isn't _quite_ as smart as could
be desired.  For example:
SELECT * FROM table WHERE objnum = 5; -- will not take advantage of partition
You have to give the planner a little more hint as to the fact that it can
take advantage of the partition:
SELECT * FROM table WHERE (objnum % 2) = 1 AND objnum = 5;
As silly as it seems, this is enough information for the planner to know
that it only needs to scan one partition.

If this doesn't answer your question, you should probably provide some
more details (actual query and actual explain output, for example) to
help people better help you.

-- 
Bill Moran wmo...@potentialtech.com


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] partitioning for speed, but query planner ignores

2013-10-02 Thread David Rysdam
On Wed, 2 Oct 2013 09:12:02 -0400, Bill Moran wmo...@potentialtech.com wrote:
 Last I looked, the partitioning mechanism isn't _quite_ as smart as could
 be desired.  For example:
 SELECT * FROM table WHERE objnum = 5; -- will not take advantage of partition
 You have to give the planner a little more hint as to the fact that it can
 take advantage of the partition:
 SELECT * FROM table WHERE (objnum % 2) = 1 AND objnum = 5;
 As silly as it seems, this is enough information for the planner to know
 that it only needs to scan one partition.

This seemed ridiculously silly until I thought about it. I guess it has
no way of unwrapping my constraint and figuring out what to do. Would
this also apply if I did ranges or is that a common enough constraint
that it *can* figure it out without me having to modify all my queries?


smime.p7s
Description: S/MIME cryptographic signature


Re: [GENERAL] partitioning for speed, but query planner ignores

2013-10-02 Thread Kevin Grittner
David Rysdam drys...@ll.mit.edu wrote:

 Would the planner be smart enough to figure out ranges without me
 having to hint my queries?

Yes, it handles ranges well.

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] partitioning for speed, but query planner ignores

2013-10-02 Thread bricklen
On Wed, Oct 2, 2013 at 9:01 AM, David Rysdam drys...@ll.mit.edu wrote:

  I had some vague notion of tables doing work but really if it can load
 one
 partition into RAM and get most of my hits from there, it'd be a big
 win.


The same concept applies to the frequently-used indexes on that partition.


Re: [GENERAL] Partitioning table with billion row

2013-09-24 Thread sachin kotwal
1. You have to remove foreign key reference from table searchcache.
2. take backup of data from searchcache.
3. create partition of table product
4. add constraints on table searchcache.(if necessary delete and create
searchcache after taking backup.)






-
Thanks and Regards,

Sachin Kotwal
NTT-DATA-OSS Center (Pune)
--
View this message in context: 
http://postgresql.1045698.n5.nabble.com/Partitioning-table-with-billion-row-tp5771582p5772155.html
Sent from the PostgreSQL - general mailing list archive at Nabble.com.


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Partitioning V schema

2013-09-21 Thread Luca Ferrari
On Fri, Sep 20, 2013 at 4:38 AM, Julian temp...@internode.on.net wrote:
 However, I tend to go with partitions when required to be generated on
 demand dynamically and automatically (which probably isn't the case
 here). SCHEMAs have other uses, provide a level of security (GRANT) and
 useful in design when partitioning off blocks of related datasets
 completely.

I would do a partition in this case for the same reason: schemas are
much more logical divisions of data due to different grantings, search
paths, users, and so on. Partition is more a data-level split so it
does make sense when you want the data to stay in a single point but
for performance reason split it across multiple tables.

Luca


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Partitioning V schema

2013-09-20 Thread Jeff Janes
On Thu, Sep 19, 2013 at 12:02 AM, Dave Potts dave.po...@pinan.co.uk wrote:

 Hi List

 I am looking for some general advice about the best was of splitting  a
 large data table,I have  2 different choices, partitioning or different
 schemas.



I don't think there is much of a choice there.  If you put them in
different schemas, then you are inherently partitioning the data.  It just
a question of how you name your partitions, which is more of a naming issue
than a performance issue.



 The data table refers to the number of houses that can be include in a
 city, as such there are large number of records.


 I am wondering if decided to partition the table if the update
 speed/access might be faster that just declaring a different schema per
 city.


If you partition based on city, then there should be no meaningful
difference.  If you partition based on something else, you would have to
describe what it is partitioned on, and what your access patterns are like.

Cheers,

Jeff


Re: [GENERAL] Partitioning V schema

2013-09-20 Thread Gregory Haase
I would look towards how PostGis handles the Tiger census data for
guidance. It's a similar, massive data set.

Greg Haase
On Sep 20, 2013 9:47 AM, Jeff Janes jeff.ja...@gmail.com wrote:

 On Thu, Sep 19, 2013 at 12:02 AM, Dave Potts dave.po...@pinan.co.ukwrote:

 Hi List

 I am looking for some general advice about the best was of splitting  a
 large data table,I have  2 different choices, partitioning or different
 schemas.



 I don't think there is much of a choice there.  If you put them in
 different schemas, then you are inherently partitioning the data.  It just
 a question of how you name your partitions, which is more of a naming issue
 than a performance issue.



 The data table refers to the number of houses that can be include in a
 city, as such there are large number of records.


 I am wondering if decided to partition the table if the update
 speed/access might be faster that just declaring a different schema per
 city.


 If you partition based on city, then there should be no meaningful
 difference.  If you partition based on something else, you would have to
 describe what it is partitioned on, and what your access patterns are like.

 Cheers,

 Jeff



Re: [GENERAL] Partitioning V schema

2013-09-20 Thread Julian

On 21/09/13 02:51, Gregory Haase wrote:


I would look towards how PostGis handles the Tiger census data for 
guidance. It's a similar, massive data set.


Greg Haase


I'm not sure why it wouldn't handle it fine?
The question is at what point would third party imported datasets, 
required for local lookup require their own dedicated solution (database 
and/or server)?
Especially when we are talking about a large amounts of data that come 
with their own unique global identifiers. Something like ISO 3166, even 
small, would do well with its own database - but not large enough to 
require a dedicated server.


This is the question I put to the OP, I would be interested to know.

Regards,
Julian


--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


[GENERAL] Partitioning V schema

2013-09-19 Thread Dave Potts

Hi List

I am looking for some general advice about the best was of splitting  a 
large data table,I have  2 different choices, partitioning or different 
schemas.


The data table refers to the number of houses that can be include in a 
city, as such there are large number of records.



I am wondering if decided to partition the table if the update 
speed/access might be faster that just declaring a different schema per 
city.


Under the partition the data table would appear to be smaller, so I 
should get an increase in speed, but the database still have to do some 
sort of indexing.


If I used different schemas, it resolves data protection issues, but 
doing a backup might become a nightmare


In general which is the fast access method?

regards


Dave.





--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


  1   2   3   >